  1. Fault Tolerant Control Systems

    Bøgh, S.A.

    and isolation, remedial action decision, and reconfiguration. The integration of these modules in software was considered. The general methodology covered the analysis, design, and implementation of fault tolerant control systems on an overall level. Two detailed studies were presented, one on fault detection ... as for example a variable being zero, low or high. Examples were given that illustrate how such models can be established by simple means, and yet provide important information when combined into a complete system. A special achievement was a method to determine how control loops behave in case of faults ... This thesis considered the development of fault tolerant control systems. The focus was on the category of automated processes that do not necessarily comprise a high number of identical sensors and actuators to maintain safe operation, but still have a potential for improving immunity to component...

  2. Fault tolerant control for switched linear systems

    Du, Dongsheng; Shi, Peng


    This book presents up-to-date research and novel methodologies on fault diagnosis and fault tolerant control for switched linear systems. It provides a unified yet neat framework of filtering, fault detection, fault diagnosis and fault tolerant control of switched systems. It can therefore serve as a useful textbook for senior and/or graduate students who are interested in knowing the state-of-the-art of filtering, fault detection, fault diagnosis and fault tolerant control areas, as well as recent advances in switched linear systems.  

  3. Fault tolerant control for uncertain systems with parametric faults

    Niemann, Hans Henrik; Poulsen, Niels Kjølstad


    A fault tolerant control (FTC) architecture based on active fault diagnosis (AFD) and the YJBK (Youla, Jabr, Bongiorno and Kucera) parameterization is applied in this paper. Based on the FTC architecture, fault tolerant control of uncertain systems with slowly varying parametric faults...

  4. Synthesis of Fault-Tolerant Embedded Systems

    Eles, Petru; Izosimov, Viacheslav; Pop, Paul


    This work addresses the issue of design optimization for fault-tolerant hard real-time systems. In particular, our focus is on the handling of transient faults using both checkpointing with rollback recovery and active replication. Fault tolerant schedules are generated based on a conditional pr...

  5. Architecting fault-tolerant software systems

    Sözer, Hasan


    The increasing size and complexity of software systems make it hard to prevent or remove all possible faults. Faults that remain in the system can eventually lead to a system failure. Fault tolerance techniques are introduced for enabling systems to recover and continue operation when they are subj

  6. Fault tolerant control design for hybrid systems

    Yang, Hao; Jiang, Bin [Nanjing University of Aeronautics and Astronautics, Nanjing (China)]; Cocquempot, Vincent [Universite des Sciences et Technologies de Lille, Villeneuve d'Ascq (France)]


    This book aims to give readers a good understanding of how to achieve fault tolerant control of hybrid systems. It can be used as a reference for academic research on fault tolerant control and hybrid systems, or in Ph.D. study of control theory and engineering. The background assumed for this monograph is undergraduate and graduate coursework in fault diagnosis and fault tolerant control theory, linear system theory, nonlinear system theory, hybrid systems theory and discrete event system theory.

  7. Software fault tolerance in computer operating systems

    Iyer, Ravishankar K.; Lee, Inhwan


    This chapter provides data and analysis of the dependability and fault tolerance for three operating systems: the Tandem/GUARDIAN fault-tolerant system, the VAX/VMS distributed system, and the IBM/MVS system. Based on measurements from these systems, basic software error characteristics are investigated. Fault tolerance in operating systems resulting from the use of process pairs and recovery routines is evaluated. Two levels of models are developed to analyze error and recovery processes inside an operating system and interactions among multiple instances of an operating system running in a distributed environment. The measurements show that the use of process pairs in Tandem systems, which was originally intended for tolerating hardware faults, allows the system to tolerate about 70% of defects in system software that result in processor failures. The loose coupling between processors which results in the backup execution (the processor state and the sequence of events occurring) being different from the original execution is a major reason for the measured software fault tolerance. The IBM/MVS system fault tolerance almost doubles when recovery routines are provided, in comparison to the case in which no recovery routines are available. However, even when recovery routines are provided, there is almost a 50% chance of system failure when critical system jobs are involved.

  8. Energy-efficient fault-tolerant systems

    Mathew, Jimson; Pradhan, Dhiraj K


    This book describes the state-of-the-art in energy efficient, fault-tolerant embedded systems. It covers the entire product lifecycle of electronic systems design, analysis and testing and includes discussion of both circuit and system-level approaches. Readers will be enabled to meet the conflicting design objectives of energy efficiency and fault-tolerance for reliability, given the up-to-date techniques presented.

  9. Fault tolerant architecture for artificial olfactory system

    Lotfivand, Nasser; Nizar Hamidon, Mohd; Abdolzadeh, Vida


    In this paper, to cover and mask the faults that occur in the sensing unit of an artificial olfactory system, a novel architecture is offered. The proposed architecture is able to tolerate failures in the sensors of the array and the faults that occur are masked. The proposed architecture for extracting the correct results from the output of the sensors can provide the quality of service for generated data from the sensor array. The results of various evaluations and analysis proved that the proposed architecture has acceptable performance in comparison with the classic form of the sensor array in gas identification. According to the results, achieving a high odor discrimination based on the suggested architecture is possible.

  10. Method and system for environmentally adaptive fault tolerant computing

    Copenhaver, Jason L. (Inventor); Jeremy, Ramos (Inventor); Wolfe, Jeffrey M. (Inventor); Brenner, Dean (Inventor)


    A method and system for adapting fault tolerant computing. The method includes the steps of measuring an environmental condition representative of an environment. An on-board processing system's sensitivity to the measured environmental condition is measured. It is determined whether to reconfigure a fault tolerance of the on-board processing system based in part on the measured environmental condition. The fault tolerance of the on-board processing system may be reconfigured based in part on the measured environmental condition.

  11. Fault detection and fault-tolerant control for nonlinear systems

    Li, Linlin


    Linlin Li addresses the analysis and design issues of observer-based FD and FTC for nonlinear systems. The author analyses the existence conditions for the nonlinear observer-based FD systems to gain a deeper insight into the construction of FD systems. Aided by the T-S fuzzy technique, she recommends different design schemes, among them the L_inf/L_2 type of FD systems. The derived FD and FTC approaches are verified by two benchmark processes. Contents: Overview of FD and FTC Technology; Configuration of Nonlinear Observer-Based FD Systems; Design of L2 Nonlinear Observer-Based FD Systems; Design of Weighted Fuzzy Observer-Based FD Systems; FTC Configurations for Nonlinear Systems; Application to Benchmark Processes. Target Groups: Researchers and students in the field of engineering with a focus on fault diagnosis and fault-tolerant control fields. The Author: Dr. Linlin Li completed her dissertation under the supervision of Prof. Steven X. Ding at the Faculty of Engineering, University of Duisburg-Essen, Germany...

  12. Fault-tolerant Actuator System for Electrical Steering of Vehicles

    Sørensen, Jesper Sandberg; Blanke, Mogens


    Being critical to the safety of vehicles, the steering system is required to maintain the vehicle's ability to steer until it is brought to a halt, should a fault occur. With electrical steering becoming a cost-effective candidate for electrically powered vehicles, a fault-tolerant architecture is needed that meets this requirement. This paper studies the fault-tolerance properties of an electrical steering system. It presents a fault-tolerant architecture where a dedicated AC motor design used in conjunction with cheap voltage measurements can ensure detection of all relevant faults in the steering system. The paper shows how active control reconfiguration can accommodate all critical faults. The fault-tolerant abilities of the steering system are demonstrated on the hardware of a warehouse truck.

  13. Fault Tolerant Controllers for Sampled-data Systems

    Niemann, H.; Stoustrup, Jakob


    A general compensator architecture for fault tolerant control (FTC) for sampled-data systems is proposed. The architecture is based on the YJBK parameterization of all stabilizing controllers, and uses the dual YJBK parameterization to quantify the performance of the fault tolerant system. The FTC...

  14. Fault tolerant controllers for sampled-data systems

    Niemann, Hans Henrik; Stoustrup, Jakob


    A general compensator architecture for fault tolerant control (FTC) for sampled-data systems is proposed. The architecture is based on the YJBK parameterization of all stabilizing controllers, and uses the dual YJBK parameterization to quantify the performance of the fault tolerant system. The FTC...




    For the improvement of the APR1400 Diverse Protection System (DPS) design, the Advanced DPS (ADPS) has recently been developed to enhance the fault tolerance capability of the system. Major fault masking features of the ADPS compared with the APR1400 DPS are the changes to the channel configuration and reactor trip actuation equipment. To minimize the fault occurrences within the ADPS, and to mitigate the consequences of common-cause failures (CCF) within the safety I&C systems, several fault avoidance design features have been applied in the ADPS. The fault avoidance design features include the changes to the system software classification, communication methods, equipment platform, MMI equipment, etc. In addition, the fault detection, location, containment, and recovery processes have been incorporated in the ADPS design. Therefore, it is expected that the ADPS can provide an enhanced fault tolerance capability against the possible faults within the system and its input/output equipment, and the CCF of safety systems.

  16. Mine-Hoist Active Fault Tolerant Control System and Strategy

    WANG Zhi-jie; WANG Yao-cai; MENG Jiang; ZHAO Peng-cheng; CHANG Yan-wei


    Based on fault diagnosis and fault tolerant technologies, the mine-hoist active fault-tolerant control system (MAFCS) is presented with corresponding strategies. It includes the fault diagnosis module (FDM), the dynamic library (DL) and the fault-tolerant control module (FCM). When the FDM judges that a fault has occurred in some sensor, the FCM reconfigures the state of the MAFCS by calling the parameters from all sub-libraries in the DL, in order to ensure the reliability and safety of the mine hoist. The simulation results show that the MAFCS has a degree of intelligence: it can adopt the corresponding control strategies according to different fault modes, even when there are considerable differences between the real data and the prior fault modes.

  17. From fault classification to fault tolerance for multi-agent systems

    Potiron, Katia; Taillibert, Patrick


    Faults are a concern for Multi-Agent Systems (MAS) designers, especially if the MAS are built for industrial or military use because there must be some guarantee of dependability. Some fault classification exists for classical systems, and is used to define faults. When dependability is at stake, such fault classification may be used from the beginning of the system's conception to define fault classes and specify which types of faults are expected. Thus, one may want to use fault classification for MAS; however, From Fault Classification to Fault Tolerance for Multi-Agent Systems argues that

  18. Fault Tolerance in Distributed Systems using Fused State Machines

    Balasubramanian, Bharath; Garg, Vijay K.


    Replication is a standard technique for fault tolerance in distributed systems modeled as deterministic finite state machines (DFSMs or machines). To correct f crash or f/2 Byzantine faults among n different machines, replication requires nf additional backup machines. We present a solution called fusion that requires just f additional backup machines. First, we build a framework for fault tolerance in DFSMs based on the notion of Hamming distances. We introduce the concept of an (f,m)-fusion...
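
    The backup counts quoted in this abstract can be compared with a few lines of arithmetic. The sketch below only restates the two figures given above for crash faults (nf backups for replication, f for fusion); the function names are hypothetical, and it does not implement the fusion construction itself, which the excerpt does not describe.

```python
def backups_with_replication(n: int, f: int) -> int:
    """Backups needed to tolerate f crash faults by keeping f copies of each of n machines."""
    return n * f


def backups_with_fusion(n: int, f: int) -> int:
    """Backups quoted in the abstract for fusion: f, independent of n."""
    return f


# Example: 3 machines, tolerate 2 crash faults.
n_machines, crash_faults = 3, 2
replication_cost = backups_with_replication(n_machines, crash_faults)
fusion_cost = backups_with_fusion(n_machines, crash_faults)
```

    With these figures the saving grows linearly with the number of original machines, since the replication count scales with n while the fusion count does not.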

  19. A fault-tolerant software strategy for digital systems

    Hitt, E. F.; Webb, J. J.


    Techniques developed for producing fault-tolerant software are described. Tolerance is required because of the impossibility of defining fault-free software. Faults are caused by humans and can appear anywhere in the software life cycle. Tolerance is effected through error detection, damage assessment, recovery, and fault treatment, followed by return of the system to service. Multiversion software comprises two or more versions of the software yielding solutions which are examined by a decision algorithm. Errors can also be detected by extrapolation from previous results or by the acceptability of results. Violations of timing specifications can reveal errors, or the system can roll back to an error-free state when a defect is detected. The software, when used in flight control systems, must not impinge on time-critical responses. Efforts are still needed to reduce the costs of developing the fault-tolerant systems.
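
    The rollback and acceptability checks mentioned in this abstract can be illustrated with a minimal recovery-block sketch. It is an assumption-laden illustration, not the authors' method: the function `run_with_recovery`, the two toy versions and the altitude acceptance test are all hypothetical names invented for the example.

```python
import copy


def run_with_recovery(state, versions, acceptable):
    """Try each version in turn on a fresh copy of the checkpointed state.

    Returns (new_state, index_of_version_used). Raises RuntimeError if no
    version produces a result that passes the acceptance test.
    """
    checkpoint = copy.deepcopy(state)           # error-free state to roll back to
    for index, version in enumerate(versions):
        candidate = copy.deepcopy(checkpoint)   # roll back before every attempt
        try:
            version(candidate)
        except Exception:
            continue                            # a crash counts as a detected error
        if acceptable(candidate):
            return candidate, index
    raise RuntimeError("all versions failed the acceptance test")


# Hypothetical versions of the same computation (illustration only).
def primary(state):
    state["altitude"] = -50.0                   # deliberately faulty result


def alternate(state):
    state["altitude"] = state["altitude"] + 10.0


def altitude_is_plausible(state):
    return 0.0 <= state["altitude"] <= 20000.0


result, used = run_with_recovery(
    {"altitude": 1000.0}, [primary, alternate], altitude_is_plausible
)
```

    Here the primary version fails the acceptance test, so the state is rolled back to the checkpoint and the alternate version supplies the result.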

  20. Fault tolerant system design for uninterruptible power supplies

    B. Y. Volochiy


    The problem of design for reliability of a fault tolerant system for uninterruptible power supplies is considered. The configuration of the fault tolerant system determines the structure of the uninterruptible power supply: a power supply built from modules of the same type, a sliding stand-by reserve for them, a twofold total reserve of the power supply with two accumulator batteries, and the control and diagnostic means. A tool developed for the automated generation of analytical models of fault tolerant systems is presented, and its capabilities are illustrated by determining the requirements for the repair service and the accumulator batteries.

  1. Fault-Tolerant Systems with Concurrent Error-Locating Capability

    JIANG JianHui(江建慧); MIN YingHua(闵应骅); PENG ChengLian(彭澄廉)


    Fault-tolerant systems have found wide applications in military, industrial and commercial areas. Most of these systems are constructed by multiple-modular redundancy or error control coding techniques. They need some fault-tolerant specific components (such as voter, switcher, encoder, or decoder) to implement error-detecting or error-correcting functions. However, the problem of error detection, location or correction for fault-tolerance specific components themselves has not been solved properly so far. Thus, the dependability of a whole fault-tolerant system will be greatly affected. This paper presents a theory of robust fault-masking digital circuits for characterizing fault-tolerant systems with the ability of concurrent error location and a new scheme of dual-modular redundant systems with partially robust fault-masking property. A basic robust fault-masking circuit is composed of a basic functional circuit and an error-locating corrector. Such a circuit not only has the ability of concurrent error correction, but also has the ability of concurrent error location. According to this circuit model, for a partially robust fault-masking dual-modular redundant system, two redundant modules based on alternating-complementary logic consist of the basic functional circuit. An error-correction specific circuit named as alternating-complementary corrector is used as the error-locating corrector. The performance (such as hardware complexity, time delay) of the scheme is analyzed.

  2. Sensor Fault Tolerant Generic Model Control for Nonlinear Systems


    A modified Strong Tracking Filter (STF) is used to develop a new approach to sensor fault tolerant control. Generic Model Control (GMC) is used to control the nonlinear process while the process runs normally because of its robust control performance. If a fault occurs in the sensor, a sensor bias vector is then introduced to the output equation of the process model. The sensor bias vector is estimated on-line during every control period using the STF. The estimated sensor bias vector is used to develop a fault detection mechanism to supervise the sensors. When a sensor fault occurs, the conventional GMC is switched to a fault tolerant control scheme, which is, in essence, a state estimation and output prediction based GMC. The laboratory experimental results on a three-tank system demonstrate the effectiveness of the proposed Sensor Fault Tolerant Generic Model Control (SFTGMC) approach.
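
    The detection step described above, estimating a sensor bias online and supervising it against a limit, can be sketched as follows. This is a simplified illustration under stated assumptions: a plain exponential smoother stands in for the Strong Tracking Filter, which is not reproduced here, and the function name, the threshold and the smoothing factor are hypothetical.

```python
def detect_sensor_fault(measured, predicted, threshold, smoothing=0.5):
    """Return the first sample index at which the smoothed bias estimate
    exceeds the threshold in magnitude, or None if it never does.

    measured  -- sensor readings, one per control period
    predicted -- model-predicted outputs for the same periods
    """
    bias = 0.0
    for k, (y, y_hat) in enumerate(zip(measured, predicted)):
        residual = y - y_hat
        # Exponential smoothing of the residual (stand-in for the STF estimate).
        bias = smoothing * bias + (1.0 - smoothing) * residual
        if abs(bias) > threshold:
            return k
    return None


# A constant bias of 1.0 appears on the sensor from sample 3 onward.
model_output = [1.0] * 6
sensor_output = [1.0, 1.0, 1.0, 2.0, 2.0, 2.0]
fault_index = detect_sensor_fault(sensor_output, model_output, threshold=0.6)
```

    In a scheme like the one in the abstract, the returned index would mark the control period at which the controller switches from the conventional law to the fault tolerant one; the smoothing delays detection by one sample here in exchange for fewer false alarms.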

  3. H infinity Integrated Fault Estimation and Fault Tolerant Control of Discrete-time Piecewise Linear Systems

    Tabatabaeipour, Seyed Mojtaba; Bak, Thomas


    In this paper we consider the problem of fault estimation and accommodation for discrete time piecewise linear systems. A robust fault estimator is designed to estimate the fault such that the estimation error converges to zero and H∞ performance of the fault estimation is minimized. Then, the estimate of fault is used to compensate for the effect of the fault. Hence, using the estimate of fault, a fault tolerant controller using a piecewise linear static output feedback is designed such that it stabilizes the system and provides an upper bound on the H∞ performance of the faulty system. Sufficient conditions for the existence of robust fault estimator and fault tolerant controller are derived in terms of linear matrix inequalities. Upper bounds on the H∞ performance can be minimized by solving convex optimization problems with linear matrix inequality constraints. The efficiency...

  4. Guaranteed Cost Fault-Tolerant Control for Networked Control Systems with Sensor Faults

    Qixin Zhu; Kaihong Lu; Guangming Xie; Yonghong Zhu


    For the large scale and complicated structure of networked control systems, time-varying sensor faults could inevitably occur when the system works in a poor environment. Guaranteed cost fault-tolerant controller for the new networked control systems with time-varying sensor faults is designed in this paper. Based on time delay of the network transmission environment, the networked control systems with sensor faults are modeled as a discrete-time system with uncertain parameters. And the mode...

  5. A Fault-tolerant Development Methodology for Industrial Control Systems

    Izadi-Zamanabadi, Roozbeh; Thybo, C.


    and logically sound manner. This paper presents the employed fault-tolerant development methodology and highlights the steps that have been essential for achieving complete and consistent monitoring capabilities. Fault diagnosis for a commercial refrigeration system is treated as a case study.

  6. On the design of fault-tolerant robotic manipulator systems

    Tesar, Delbert


    Robotic systems are finding increasing use in space applications. Many of these devices are going to be operational on board the Space Station Freedom. Fault tolerance has been deemed necessary because of the criticality of the tasks and the inaccessibility of the systems to maintenance and repair. Design for fault tolerance in manipulator systems is an area within robotics that is without precedent in the literature. In this paper, we will attempt to lay down the foundations for such a technology. Design for fault tolerance demands new and special approaches to design, often at considerable variance from established design practices. These design aspects, together with reliability evaluation and modeling tools, are presented. Mechanical architectures that employ protective redundancies at many levels and have a modular architecture are then studied in detail. Once a mechanical architecture for fault tolerance has been derived, the chronological stages of operational fault tolerance are investigated. Failure detection, isolation, and estimation methods are surveyed, and such methods for robot sensors and actuators are derived. Failure recovery methods are also presented for each of the protective layers of redundancy. Failure recovery tactics often span all of the layers of a control hierarchy. Thus, a unified framework for decision-making and control, which orchestrates both the nominal redundancy management tasks and the failure management tasks, has been derived. The well-developed field of fault-tolerant computers is studied next, and some design principles relevant to the design of fault-tolerant robot controllers are abstracted. Conclusions are drawn, and a road map for the design of fault-tolerant manipulator systems is laid out with recommendations for a 10 DOF arm with dual actuators at each joint.

  7. Fault tolerant hypercube computer system architecture

    Madan, Herb S. (Inventor); Chow, Edward (Inventor)


    A fault-tolerant multiprocessor computer system of the hypercube type comprising a hierarchy of computers of like kind which can be functionally substituted for one another as necessary is disclosed. Communication between the working nodes is via one communications network while communications between the working nodes and watch dog nodes and load balancing nodes higher in the structure is via another communications network separate from the first. A typical branch of the hierarchy reporting to a master node or host computer comprises, a plurality of first computing nodes; a first network of message conducting paths for interconnecting the first computing nodes as a hypercube. The first network provides a path for message transfer between the first computing nodes; a first watch dog node; and a second network of message connecting paths for connecting the first computing nodes to the first watch dog node independent from the first network, the second network provides an independent path for test message and reconfiguration affecting transfers between the first computing nodes and the first switch watch dog node. There is additionally, a plurality of second computing nodes; a third network of message conducting paths for interconnecting the second computing nodes as a hypercube. The third network provides a path for message transfer between the second computing nodes; a fourth network of message conducting paths for connecting the second computing nodes to the first watch dog node independent from the third network. The fourth network provides an independent path for test message and reconfiguration affecting transfers between the second computing nodes and the first watch dog node; and a first multiplexer disposed between the first watch dog node and the second and fourth networks for allowing the first watch dog node to selectively communicate with individual ones of the computing nodes through the second and fourth networks; as well as, a second watch dog node

  8. Data-driven design of fault diagnosis and fault-tolerant control systems

    Ding, Steven X


    Data-driven Design of Fault Diagnosis and Fault-tolerant Control Systems presents basic statistical process monitoring, fault diagnosis, and control methods, and introduces advanced data-driven schemes for the design of fault diagnosis and fault-tolerant control systems catering to the needs of dynamic industrial processes. With ever increasing demands for reliability, availability and safety in technical processes and assets, process monitoring and fault-tolerance have become important issues surrounding the design of automatic control systems. This text shows the reader how, thanks to the rapid development of information technology, key techniques of data-driven and statistical process monitoring and control can now become widely used in industrial practice to address these issues. To allow for self-contained study and facilitate implementation in real applications, important mathematical and control theoretical knowledge and tools are included in this book. Major schemes are presented in algorithm form and...

  9. Fault tolerant highly reliable inertial navigation system

    Jeerage, Mahesh; Boettcher, Kevin

    This paper describes the development of failure detection and isolation (FDI) strategies for highly reliable inertial navigation systems. FDI strategies are developed based on the generalized likelihood ratio test (GLRT). A relationship between detection threshold and false alarm rate is developed in terms of the sensor parameters. A new method for correct isolation of failed sensors is presented. An evaluation of FDI performance parameters, such as false alarm rate, wrong isolation probability, and correct isolation probability, is presented. Finally, a fault recovery scheme capable of correcting false isolation of good sensors is presented.

  10. Fault-tolerant Control Systems-An Introductory Overview

    Jin Jiang


    This paper presents an introductory overview on the development of fault-tolerant control systems. For this reason, the paper is written in a tutorial fashion to summarize some of the important results in this subject area deliberately without going into details in any of them. However, key references are provided from which interested readers can obtain more detailed information on a particular subject. It is necessary to mention that, throughout this paper, no efforts were made to provide an exhaustive coverage on the subject matter. In fact, it is far from it. The paper merely represents the view and experience of its author. It can very well be that some important issues or topics were left out unintentionally. If that is the case, the author sincerely apologizes in advance. After a brief account of fault-tolerant control systems, particularly on the original motivations, and the concept of redundancies, the paper reviews the development of fault-tolerant control systems with highlights to several important issues from a historical perspective. The general approaches to fault-tolerant control have been divided into passive, active, and hybrid approaches. The analysis techniques for active fault-tolerant control systems are also discussed. Practical applications of fault-tolerant control are highlighted from a practical and industrial perspective. Finally, some critical issues in this area are discussed as open problems for future research/development in this emerging field.

  11. Guaranteed Cost Fault-Tolerant Control for Networked Control Systems with Sensor Faults

    Qixin Zhu


    For the large scale and complicated structure of networked control systems, time-varying sensor faults could inevitably occur when the system works in a poor environment. Guaranteed cost fault-tolerant controller for the new networked control systems with time-varying sensor faults is designed in this paper. Based on time delay of the network transmission environment, the networked control systems with sensor faults are modeled as a discrete-time system with uncertain parameters. And the model of networked control systems is related to the boundary values of the sensor faults. Moreover, using Lyapunov stability theory and the linear matrix inequalities (LMI) approach, the guaranteed cost fault-tolerant controller is verified to render such networked control systems asymptotically stable. Finally, simulations are included to demonstrate the theoretical results.

  12. SIFT - Design and analysis of a fault-tolerant computer for aircraft control. [Software Implemented Fault Tolerant systems]

    Wensley, J. H.; Lamport, L.; Goldberg, J.; Green, M. W.; Levitt, K. N.; Melliar-Smith, P. M.; Shostak, R. E.; Weinstock, C. B.


    SIFT (Software Implemented Fault Tolerance) is an ultrareliable computer for critical aircraft control applications that achieves fault tolerance by the replication of tasks among processing units. The main processing units are off-the-shelf minicomputers, with standard microcomputers serving as the interface to the I/O system. Fault isolation is achieved by using a specially designed redundant bus system to interconnect the processing units. Error detection and analysis and system reconfiguration are performed by software. Iterative tasks are redundantly executed, and the results of each iteration are voted upon before being used. Thus, any single failure in a processing unit or bus can be tolerated with triplication of tasks, and subsequent failures can be tolerated after reconfiguration. Independent execution by separate processors means that the processors need only be loosely synchronized, and a novel fault-tolerant synchronization method is described.
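
    The voting step described in this abstract, where the results of a redundantly executed iteration are compared before being used, can be sketched as a strict-majority voter. This is a generic illustration rather than the SIFT implementation; the function names are hypothetical and exact equality of replica results is assumed.

```python
from collections import Counter


def vote(results):
    """Return the value reported by a strict majority of replicas.

    Raises ValueError when no value has more than half of the votes.
    """
    value, count = Counter(results).most_common(1)[0]
    if count * 2 <= len(results):
        raise ValueError("no majority among replica results")
    return value


def disagreeing_replicas(results, agreed):
    """Indices of replicas whose result differs from the voted value."""
    return [i for i, r in enumerate(results) if r != agreed]


# Three processing units execute the same task; the third returns a bad value.
replica_results = [42, 42, 7]
agreed_value = vote(replica_results)
suspects = disagreeing_replicas(replica_results, agreed_value)
```

    With triplication a single faulty replica is outvoted, and the list of disagreeing replicas gives the information a reconfiguration step would need to exclude the suspect unit.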

  13. Design of fault tolerant control system for steam generator using

    Kim, Myung Ki; Seo, Mi Ro [Korea Electric Power Research Institute, Taejon (Korea, Republic of)


    A controller and sensor fault tolerant system for a steam generator is designed with fuzzy logic. The proposed fault tolerant redundant system is composed of a supervisor and two fuzzy weighting modulators. The supervisor alternately checks the controller-induced and sensor-induced performance to identify which part, a controller or a sensor, is faulty. To analyze controller-induced performance, both the error and the change in error of the system output are chosen as fuzzy variables. The fuzzy logic for sensor-induced performance uses two variables: the deviation between two sensor outputs and its frequency. The fuzzy weighting modulator generates an output signal that compensates for the faulty input signal. Simulations show that the proposed fault tolerant control scheme for a steam generator regulates the water level well by suppressing the fault effect of either controllers or sensors. Therefore, by duplicating sensors and controllers with the proposed fault tolerant scheme, the reliability of both the steam generator control and sensor system and the power plant is increased. 2 refs., 9 figs., 1 tab. (Author)

  14. GEMS: A Fault Tolerant Grid Job Management System

    Tadepalli, Sriram Satish


    Grid environments are inherently unstable. Resources join and leave the environment without any prior notification. Application fault detection, checkpointing and restart are of foremost importance in Grid environments. The need for fault tolerance is especially acute for large parallel applications since the failure rate grows with the number of processors and the duration of the computation. A Grid job management system hides the heterogeneity of the Grid and the complexity of the ...

  15. A Survey on Fault Tolerant Multi Agent System

    Yasir Arfat


    A multi-agent system (MAS) is formed by a number of agents connected together to achieve the desired goals specified by the design. Usually in a multi-agent system, agents work on behalf of a user to accomplish given goals. In MAS, co-ordination, co-operation, negotiation and communication are important aspects of achieving fault tolerance. A multi-agent system is likely to fail in a distributed environment, and as a result the resources for the MAS may not be available due to the failure of an agent, machine crashes, process failure, software failure, communication failure and/or hardware failure. Therefore, many researchers have proposed fault tolerance approaches to overcome failure in MAS. We survey these approaches in this paper, and our contribution is threefold. Firstly, we provide a taxonomy of faults and techniques in MAS. Secondly, we provide a qualitative comparison of existing fault tolerance approaches. Thirdly, we provide an evaluation of existing fault tolerance techniques. Results show that most of the existing schemes are not very efficient, for reasons such as high computation costs, costly replication and large communication overheads.

  16. Fault-Tolerant Control for Networked Control Systems with Limited Information in Case of Actuator Fault

    Wang Yan-feng; Wang Pei-liang; Li Zu-xin; Chen Hui-ying


    This paper is concerned with the problem of designing a fault-tolerant controller for uncertain discrete-time networked control systems against possible actuator faults. The step difference between the running step k and the time stamp of the used plant state is modeled as a finite state Markov chain for which information on the transition probability matrix is limited. By introducing an actuator fault indicator matrix, the closed-loop system model is obtained by means of state augmentation techniq...

  17. Fault Tolerance Middleware for a Multi-Core System

    Some, Raphael R.; Springer, Paul L.; Zima, Hans P.; James, Mark; Wagner, David A.


    Fault Tolerance Middleware (FTM) provides a framework to run on a dedicated core of a multi-core system and handles detection of single-event upsets (SEUs), and the responses to those SEUs, occurring in an application running on multiple cores of the processor. This software was written expressly for a multi-core system and can support different kinds of fault strategies, such as introspection, algorithm-based fault tolerance (ABFT), and triple modular redundancy (TMR). It focuses on providing fault tolerance for the application code, and represents the first step in a plan to eventually include fault tolerance in message passing and the FTM itself. In the multi-core system, the FTM resides on a single, dedicated core, separate from the cores used by the application. This is done in order to isolate the FTM from application faults and to allow it to swap out any application core for a substitute. The structure of the FTM consists of an interface to a fault tolerant strategy module, a responder module, a fault manager module, an error factory, and an error mapper that determines the severity of the error. In the present reference implementation, the only fault tolerant strategy implemented is introspection. The introspection code waits for an application node to send an error notification to it. It then uses the error factory to create an error object, and at this time, a severity level is assigned to the error. The introspection code uses its built-in knowledge base to generate a recommended response to the error. Responses might include ignoring the error, logging it, rolling back the application to a previously saved checkpoint, swapping in a new node to replace a bad one, or restarting the application. The original error and recommended response are passed to the top-level fault manager module, which invokes the response. The responder module also notifies the introspection module of the generated response. This provides additional information to the

  18. A Ship Propulsion System Model for Fault-tolerant Control

    Izadi-Zamanabadi, Roozbeh; Blanke, M.

    This report presents a propulsion system model for a low speed marine vehicle, which can be used as a test benchmark for Fault-Tolerant Control purposes. The benchmark serves the purpose of offering realistic and challenging problems relevant in both FDI and (autonomous) supervisory control area....

  19. Distributed Fault-Tolerant Control of Networked Uncertain Euler-Lagrange Systems Under Actuator Faults.

    Chen, Gang; Song, Yongduan; Lewis, Frank L


    This paper investigates the distributed fault-tolerant control problem of networked Euler-Lagrange systems with actuator and communication link faults. An adaptive fault-tolerant cooperative control scheme is proposed to achieve the coordinated tracking control of networked uncertain Lagrange systems on a general directed communication topology, which contains a spanning tree with the root node being the active target system. The proposed algorithm is capable of compensating for the actuator bias fault, the partial loss of effectiveness actuation fault, the communication link fault, the model uncertainty, and the external disturbance simultaneously. The control scheme does not use any fault detection and isolation mechanism to detect, separate, and identify the actuator faults online, which largely reduces the online computation and expedites the responsiveness of the controller. To validate the effectiveness of the proposed method, a test-bed of a multiple robot-arm cooperative control system is developed for real-time verification. Experiments on the networked robot-arms are conducted and the results confirm the benefits and the effectiveness of the proposed distributed fault-tolerant control algorithms.

  20. Design a Fault Tolerance for Real Time Distributed System

    Ban M. Khammas


    This paper presents a fault tolerance design for a soft real-time distributed system (FTRTDS). The system is designed to be independent of the specific mechanisms and facilities of the underlying real-time distributed system. It is designed to be distributed on all the computers in the distributed system and controlled by a central unit. Besides gathering information about a target program spontaneously, it provides information about the target operating system and the target hardware in order to diagno...

  1. Fault tolerance control for proton exchange membrane fuel cell systems

    Wu, Xiaojuan; Zhou, Boyang


    Fault diagnosis and controller design are two important aspects of improving proton exchange membrane fuel cell (PEMFC) system durability. However, the two tasks are often performed separately. For example, many pressure and voltage controllers have been successfully built, but these controllers are designed based on the normal operation of the PEMFC. When the PEMFC faces problems such as flooding or membrane drying, a controller with a specific design must be used. This paper proposes a unique scheme that simultaneously performs fault diagnosis and tolerance control for the PEMFC system. The proposed control strategy consists of fault diagnosis, a reconfiguration mechanism and adjustable controllers. Using a back-propagation neural network, a model-based fault detection method is employed to detect the present PEMFC fault type (flooding, membrane drying or normal). According to the diagnosis results, the reconfiguration mechanism determines which backup controller is selected. Three nonlinear controllers based on feedback linearization approaches are built to adjust the voltage and pressure difference under normal, membrane drying and flooding conditions, respectively. The simulation results illustrate that the proposed fault tolerance control strategy can track the voltage and keep the pressure difference at desired levels in faulty conditions.
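
    The reconfiguration step in this abstract can be pictured as a lookup from the diagnosed condition to a controller. The following is a minimal sketch only: the three condition labels follow the abstract, but the `Controller` type, the `make_proportional` helper, and all gains are hypothetical stand-ins for the paper's neural-network diagnosis and feedback-linearization controllers, which are not reproduced here.

```python
from typing import Callable, Dict

# A controller maps (voltage error, pressure-difference error) to a command.
Controller = Callable[[float, float], float]


def make_proportional(gain_v: float, gain_p: float) -> Controller:
    """Build a placeholder proportional law (hypothetical, not the paper's)."""
    def control(voltage_error: float, pressure_error: float) -> float:
        return gain_v * voltage_error + gain_p * pressure_error
    return control


# One controller per diagnosed condition; the gains are made-up numbers.
CONTROLLERS: Dict[str, Controller] = {
    "normal": make_proportional(1.0, 0.5),
    "flooding": make_proportional(0.6, 1.2),
    "membrane_drying": make_proportional(1.4, 0.3),
}


def reconfigure(diagnosis: str) -> Controller:
    """Select the controller that matches the diagnosed condition."""
    try:
        return CONTROLLERS[diagnosis]
    except KeyError:
        raise ValueError(f"unknown diagnosis: {diagnosis!r}") from None


if __name__ == "__main__":
    controller = reconfigure("flooding")
    print(controller(0.1, -0.2))
```

    Rejecting an unknown label, rather than silently falling back to the normal-operation controller, is a design choice of this sketch: a diagnosis outside the three trained classes is treated as an error to be handled explicitly.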

  2. Fault-tolerant clock synchronization validation methodology. [in computer systems]

    Butler, Ricky W.; Palumbo, Daniel L.; Johnson, Sally C.


    A validation method for the synchronization subsystem of a fault-tolerant computer system is presented. The high reliability requirement of flight-crucial systems precludes the use of most traditional validation methods. The method presented utilizes formal design proof to uncover design and coding errors and experimentation to validate the assumptions of the design proof. The experimental method is described and illustrated by validating the clock synchronization system of the Software Implemented Fault Tolerance computer. The design proof of the algorithm includes a theorem that defines the maximum skew between any two nonfaulty clocks in the system in terms of specific system parameters. Most of these parameters are deterministic. One crucial parameter is the upper bound on the clock read error, which is stochastic. The probability that this upper bound is exceeded is calculated from data obtained by the measurement of system parameters. This probability is then included in a detailed reliability analysis of the system.

  3. Data-based fault-tolerant control for affine nonlinear systems with actuator faults.

    Xie, Chun-Hua; Yang, Guang-Hong


    This paper investigates the fault-tolerant control (FTC) problem for unknown nonlinear systems with actuator faults including stuck, outage, bias and loss of effectiveness. The upper bounds of stuck faults, bias faults and loss of effectiveness faults are unknown. A new data-based FTC scheme is proposed. It consists of online estimates of the bounds and a state-dependent function. The estimates are adjusted online to compensate automatically for the actuator faults. The state-dependent function, solved by using real system data, helps to stabilize the system. Furthermore, all signals in the resulting closed-loop system are uniformly bounded and the states converge asymptotically to zero. Compared with the existing results, the proposed approach is data-based. Finally, two simulation examples are provided to show the effectiveness of the proposed approach.

  4. Active Fault Tolerant Control of Livestock Stable Ventilation System

    Gholami, Mehdi


    Modern stables and greenhouses are equipped with different components for providing a comfortable climate for animals and plants. A component malfunction may result in loss of production. Therefore, it is desirable to design a control system, which is stable, and is able to provide an acceptable......). In the FTC part of the thesis, first a passive fault tolerant controller (PFTC) based on state feedback is proposed for discrete-time PWA systems. Only actuator faults are considered. By dissipativity theory and H∞ analysis, the problem is cast as a set of linear matrix inequalities (LMIs). In the next...

  5. Diagnosis and Fault-Tolerant Control for Thruster-Assisted Position Mooring System

    Nguyen, Trong Dong; Blanke, Mogens; Sørensen, Asgeir


    Development of fault-tolerant control systems is crucial to maintain safe operation of offshore installations. The objective of this paper is to develop a fault-tolerant control for thruster-assisted position mooring (PM) system with faults occurring in the mooring lines. Faults in line...

  6. Characterization of the faulted behavior of digital computers and fault tolerant systems

    Bavuso, Salvatore J.; Miner, Paul S.


    A development status evaluation is presented for efforts conducted at NASA-Langley since 1977, toward the characterization of the latent fault in digital fault-tolerant systems. Attention is given to the practical, high speed, generalized gate-level logic system simulator developed, as well as to the validation methodology used for the simulator, on the basis of faultable software and hardware simulations employing a prototype MIL-STD-1750A processor. After validation, latency tests will be performed.

  7. Fault-Tolerant Control for Networked Control Systems with Limited Information in Case of Actuator Fault

    Wang Yan-feng


    This paper is concerned with the problem of designing a fault-tolerant controller for uncertain discrete-time networked control systems against possible actuator faults. The step difference between the running step k and the time stamp of the used plant state is modeled as a finite state Markov chain for which information on the transition probability matrix is limited. By introducing an actuator fault indicator matrix, the closed-loop system model is obtained by means of the state augmentation technique. Sufficient conditions for the stochastic stability of the closed-loop system are given and the fault-tolerant controller is designed by solving a linear matrix inequality. A numerical example is presented to illustrate the effectiveness of the proposed method.

  8. Summarize of Electric Vehicle Electric System Fault and Fault-tolerant Technology

    Zhang Liwei


    The electric vehicle drive system is a multi-variable system whose running environment is complex and changeable, so its failure modes are complicated. In this paper, a vehicle fault table is established according to the location where each fault occurs, and the possible consequences and causes of each failure are analyzed. Considering hardware limitations and the requirement to preserve system performance as far as possible, a passive software redundancy fault-tolerant strategy is put forward, and an example is given to analyze the pros and cons of this method.

  9. Satisfactory fault-tolerant controller design for uncertain systems subject to actuator faults

    Zhang Dengfeng; Su Hongye; Wang Zhiquan


    Based on the satisfactory control strategy, a new method for robust passive fault tolerant controller design is proposed for a class of uncertain discrete-time systems subject to actuator faults. The state-feedback gain matrix is calculated by the linear matrix inequality (LMI) technique. The designed controller guarantees that the closed-loop system meets the pre-specified consistent constraints on the circular pole index and the steady-state variance index simultaneously for the normal case and the possible actuator fault case. The consistency of the performance indices is discussed. Furthermore, under the mentioned index constraints, a solution is obtained by a convex optimization technique for the robust satisfactory fault-tolerant controller with optimal control cost.

  10. Fault tolerance techniques for embedded telemetry system: case study

    Krosman, Kazimierz; Sosnowski, Janusz


    This paper presents software methods of improving fault tolerance in embedded systems. These methods have been adapted to a telemetry system dedicated to tracking vehicles for logistics purposes. The developed telemetry system allows us to monitor vehicle position and some technical parameters via GSM communication. It includes the capability of remote software reconfiguration. To evaluate the dependability of the system we use a fault injection technique based on simulating bit-flip errors within memory cells. For this purpose an original testbed has been developed. It provides not only the capability of disturbing the internal state of the tested system (via a JTAG interface) but also the possibility of controlling system input states and observing its behavior (in particular output signals) according to specified test scenarios. The whole test process is automated. The paper presents a case study related to a commercial product, but the described methodology and techniques can be extended to other embedded systems.
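
    Bit-flip fault injection of the kind mentioned in this abstract can be illustrated purely in software. The sketch below is an assumption-laden toy, not the authors' JTAG-based testbed: the function names, the byte-array "memory", and the additive checksum used as the detector are all hypothetical choices made for illustration.

```python
import random
from typing import Tuple


def flip_bit(memory: bytearray, byte_index: int, bit_index: int) -> None:
    """Invert one bit in place, emulating a bit-flip in a memory cell."""
    if not 0 <= bit_index < 8:
        raise ValueError("bit_index must be in 0..7")
    memory[byte_index] ^= 1 << bit_index


def inject_random_fault(memory: bytearray, rng: random.Random) -> Tuple[int, int]:
    """Flip one uniformly chosen bit and return its (byte, bit) location."""
    byte_index = rng.randrange(len(memory))
    bit_index = rng.randrange(8)
    flip_bit(memory, byte_index, bit_index)
    return byte_index, bit_index


def checksum(memory: bytes) -> int:
    """Toy 8-bit additive checksum standing in for an error detector."""
    return sum(memory) & 0xFF


if __name__ == "__main__":
    state = bytearray(b"telemetry frame")
    reference = checksum(state)
    location = inject_random_fault(state, random.Random(1))
    # A single flipped bit changes the sum by a power of two below 256,
    # so this particular detector always notices it.
    print(location, checksum(state) != reference)
```

    A campaign would repeat the injection many times and record, for each run, whether the system under test detected, masked, or mishandled the fault.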

  11. Fault-tolerant design

    Dubrova, Elena


    This textbook serves as an introduction to fault-tolerance, intended for upper-division undergraduate students, graduate-level students and practicing engineers in need of an overview of the field. Readers will develop skills in modeling and evaluating fault-tolerant architectures in terms of reliability, availability and safety. They will gain a thorough understanding of fault tolerant computers, including both the theory of how to design and evaluate them and the practical knowledge of achieving fault-tolerance in electronic, communication and software systems. Coverage includes fault-tolerance techniques through hardware, software, information and time redundancy. The content is designed to be highly accessible, including numerous examples and exercises. Solutions and PowerPoint slides are available for instructors. · Provides textbook coverage of the fundamental concepts of fault-tolerance; · Describes a variety of basic techniques for achieving fault-toleran...

  12. Fault tolerant aggregation for power system services

    Kosek, Anna Magdalena; Gehrke, Oliver; Kullmann, Daniel


    Exploiting the flexibility in distributed energy resources (DER) is seen as an important contribution to allow high penetrations of renewable generation in electrical power systems. However, the present control infrastructure in power systems is not well suited for the integration of a very large...... number of small units. A common approach is to aggregate a portfolio of such units together and expose them to the power system as a single large virtual unit. In order to realize the vision of a Smart Grid, concepts for flexible, resilient and reliable aggregation infrastructures are required...

  13. Ship Propulsion System as a Benchmark for Fault-Tolerant Control

    Izadi-Zamanabadi, Roozbeh; Blanke, M.


    Fault-tolerant control combines fault detection and isolation techniques with supervisory control to achieve autonomous accommodation of faults before they develop into failures. While fault detection and isolation (FDI) methods have matured during the past decade the extension to fault-tolerant control is a fairly new area. The paper presents a ship propulsion system as a benchmark that should be useful as a platform for development of new ideas and comparison of methods. The benchmark has t...

  14. Advanced information processing system: The Army fault tolerant architecture conceptual study. Volume 2: Army fault tolerant architecture design and analysis

    Harper, R. E.; Alger, L. S.; Babikyan, C. A.; Butler, B. P.; Friend, S. A.; Ganska, R. J.; Lala, J. H.; Masotto, T. K.; Meyer, A. J.; Morton, D. P.


    Described here is the Army Fault Tolerant Architecture (AFTA) hardware architecture and components and the operating system. The architectural and operational theory of the AFTA Fault Tolerant Data Bus is discussed. The test and maintenance strategy developed for use in fielded AFTA installations is presented. An approach to be used in reducing the probability of AFTA failure due to common mode faults is described. Analytical models for AFTA performance, reliability, availability, life cycle cost, weight, power, and volume are developed. An approach is presented for using VHSIC Hardware Description Language (VHDL) to describe and design AFTA's developmental hardware. A plan is described for verifying and validating key AFTA concepts during the Dem/Val phase. Analytical models and partial mission requirements are used to generate AFTA configurations for the TF/TA/NOE and Ground Vehicle missions.

  15. Fault Tolerant Feedback Control

    Stoustrup, Jakob; Niemann, H.


    An architecture for fault tolerant feedback controllers based on the Youla parameterization is suggested. It is shown that the Youla parameterization will give a residual vector directly in connection with the fault diagnosis part of the fault tolerant feedback controller. It turns out...... that there is a separation between the feedback controller and the fault tolerant part. The closed loop feedback properties are handled by the nominal feedback controller and the fault tolerant part is handled by the design of the Youla parameter. The design of the fault tolerant part will not affect the design...... of the nominal feedback controller....

  16. Distributed Evaluation Functions for Fault Tolerant Multi-Rover Systems

    Agogino, Adrian; Turner, Kagan


    The ability to evolve fault tolerant control strategies for large collections of agents is critical to the successful application of evolutionary strategies to domains where failures are common. Furthermore, while evolutionary algorithms have been highly successful in discovering single-agent control strategies, extending such algorithms to multiagent domains has proven to be difficult. In this paper we present a method for shaping evaluation functions for agents that provide control strategies that both are tolerant to different types of failures and lead to coordinated behavior in a multi-agent setting. This method relies neither on a centralized strategy (susceptible to single points of failure) nor on a distributed strategy where each agent uses a system-wide evaluation function (severe credit assignment problem). In a multi-rover problem, we show that agents using our agent-specific evaluation perform up to 500% better than agents using the system evaluation. In addition we show that agents are still able to maintain a high level of performance when up to 60% of the agents fail due to actuator, communication or controller faults.

  17. Industrial Cost-Benefit Assessment for Fault-tolerant Control Systems

    Thybo, C.; Blanke, M.


    Economic aspects are decisive for industrial acceptance of research concepts including the promising ideas in fault tolerant control. Fault tolerance is the ability of a system to detect, isolate and accommodate a fault, such that simple faults in a sub-system do not develop into failures...... at a system level. In a design phase for an industrial system, possibilities span from fail safe design where any single point failure is accommodated by hardware, over fault-tolerant design where selected faults are handled without extra hardware, to fault-ignorant design where no extra precaution is taken....... The objective of this paper is to help, in the early product development stage, to find the economically most suitable scheme. A salient result is that with increased customer awareness of total cost of ownership, new products can benefit significantly from applying fault tolerant control principles....

  18. Piecewise Sliding Mode Decoupling Fault Tolerant Control System

    Rafi Youssef


    Problem statement: The method proposed in the present study deals with fault tolerant control systems by using decentralized control theory with decoupled sliding mode control, dealing with subsystems instead of the whole system; to the knowledge of the author, there is no known computational algorithm for the decentralized case. Approach: In this study we present a decoupling strategy based on the selection of the sliding surface, which should be partitioned into piecewise sliding surfaces in order to apply the PwLTool, whose purpose in our case is to delimit the regions where sliding mode occurs. Results: We obtain a simple linearized model, selected in those regions, that can depict the complex system. Conclusion: With the three-tank water level system as an example, we implement this new design scenario, and since we are interested in networked control systems, we believe that this kind of controller implementation will not be affected by network delays.

  19. Fault-tolerance in Two-dimensional Topological Systems

    Anderson, Jonas T.

    This thesis is a collection of ideas with the general goal of building, at least in the abstract, a local fault-tolerant quantum computer. The connection between quantum information and topology has proven to be an active area of research in several fields. The introduction of the toric code by Alexei Kitaev demonstrated the usefulness of topology for quantum memory and quantum computation. Many quantum codes used for quantum memory are modeled by spin systems on a lattice, with operators that extract syndrome information placed on vertices or faces of the lattice. It is natural to wonder whether the useful codes in such systems can be classified. This thesis presents work that leverages ideas from topology and graph theory to explore the space of such codes. Homological stabilizer codes are introduced and it is shown that, under a set of reasonable assumptions, any qubit homological stabilizer code is equivalent to either a toric code or a color code. Additionally, the toric code and the color code correspond to distinct classes of graphs. Many systems have been proposed as candidate quantum computers. It is very desirable to design quantum computing architectures with two-dimensional layouts and low complexity in parity-checking circuitry. Kitaev's surface codes provided the first example of codes satisfying this property. They provided a new route to fault tolerance with more modest overheads and thresholds approaching 1%. The recently discovered color codes share many properties with the surface codes, such as the ability to perform syndrome extraction locally in two dimensions. Some families of color codes admit a transversal implementation of the entire Clifford group. This work investigates color codes on the 4.8.8 lattice known as triangular codes. I develop a fault-tolerant error-correction strategy for these codes in which repeated syndrome measurements on this lattice generate a three-dimensional space-time combinatorial structure. I then develop an

  20. Fault Diagnosis and Fault-tolerant Control of Modular Multi-level Converter High-voltage DC System: A Review

    Liu, Hui; Ma, Ke; Wang, Chao;


    The Modular Multilevel Converter based High Voltage Direct Current (MMC-HVDC) configuration is a promising solution for efficient grid integration and bulk power transmission over long distances. However, the large number of series-connected identical modules in the MMC may increase the probability...... strategies of MMC-HVDC systems for the most common faults that occur in MMC-HVDC systems, covering MMC faults, DC side faults as well as AC side faults. An important part of this paper is devoted to a discussion of the vulnerable spots as well as the failure mechanisms of the MMC-HVDC system, covering switching...... device faults, DC line faults as well as AC grid faults. Special attention is given to the comparison of the corresponding fault diagnosis and fault-tolerant control approaches. Further, focus is dedicated to control/protection strategies and topologies with fault ride-through capability for MMC...

  1. Adaptive fault-tolerant control of linear systems with actuator saturation and L2-disturbances

    Wei GUAN; Guanghong YANG


    This paper studies the problem of designing adaptive fault-tolerant H-infinity controllers for linear time-invariant systems with actuator saturation. The disturbance tolerance ability of the closed-loop system is measured by an optimal index. The notion of an adaptive H-infinity performance index is proposed to describe the disturbance attenuation performances of closed-loop systems. New methods for designing indirect adaptive fault-tolerant controllers via state feedback are presented for actuator fault compensations. Based on the on-line estimation of eventual faults, the adaptive fault-tolerant controller parameters are updated automatically to compensate for the fault effects on systems. The designs are developed in the framework of the linear matrix inequality (LMI) approach, which can guarantee the disturbance tolerance ability and adaptive H-infinity performances of closed-loop systems in the cases of actuator saturation and actuator failures. An example is given to illustrate the efficiency of the design method.

  2. Modeling the Fault Tolerant Capability of a Flight Control System: An Exercise in SCR Specification

    Alexander, Chris; Cortellessa, Vittorio; DelGobbo, Diego; Mili, Ali; Napolitano, Marcello


    In life-critical and mission-critical applications, it is important to make provisions for a wide range of contingencies, by providing means for fault tolerance. In this paper, we discuss the specification of a flight control system that is fault tolerant with respect to sensor faults. Redundancy is provided by analytical relations that hold between sensor readings; depending on the conditions, this redundancy can be used to detect, identify and accommodate sensor faults.
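
    Analytical redundancy of the kind described in this abstract is often illustrated with structured residuals: each residual checks one relation between sensor readings, and the pattern of violated relations points to the faulty sensor. The sketch below is a generic illustration, not the flight-control specification from the paper. The three sensors `x`, `y`, `z`, the relations `y = 2x` and `z = x + y`, and the threshold are all hypothetical.

```python
from typing import Optional, Tuple

THRESHOLD = 0.1  # hypothetical tolerance on each residual

# Which residuals fire when exactly one sensor is faulty (fault signatures).
SIGNATURES = {
    (True, True, True): "x",
    (True, True, False): "y",
    (False, True, True): "z",
}


def residuals(x: float, y: float, z: float) -> Tuple[float, float, float]:
    """Residuals of the assumed relations y = 2x, z = x + y, z = 3x."""
    return (y - 2.0 * x, z - x - y, z - 3.0 * x)


def diagnose(x: float, y: float, z: float) -> Optional[str]:
    """Detect and identify: None if healthy, else a sensor name or 'unknown'."""
    fired = tuple(abs(r) > THRESHOLD for r in residuals(x, y, z))
    if not any(fired):
        return None
    return SIGNATURES.get(fired, "unknown")


def accommodate(x: float, y: float, z: float) -> Tuple[float, float, float]:
    """Replace the reading of an identified faulty sensor by an estimate."""
    faulty = diagnose(x, y, z)
    if faulty == "x":
        x = y / 2.0
    elif faulty == "y":
        y = 2.0 * x
    elif faulty == "z":
        z = x + y
    return x, y, z


if __name__ == "__main__":
    print(diagnose(1.0, 2.0, 3.0))      # healthy readings
    print(accommodate(1.0, 2.5, 3.0))   # y is biased; it is re-estimated from x
```

    Identification works here only because each sensor produces a distinct signature; with fewer independent relations a fault could be detected but not attributed to a single sensor.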

  3. A fault tolerant superheat control strategy for supermarket refrigeration systems

    Vinther, Kasper; Izadi-Zamanabadi, Roozbeh; Rasmussen, Henrik;


    in a plug & play fashion. The strategy is outlined by means of procedural steps as well as a flow chart that also illustrates the process of automatic tuning of the maximum slope-seeking controller. Test results are furthermore presented for a display case in a full scale CO2 supermarket refrigeration......In this paper, a fault tolerant control (FTC) strategy is proposed for evaporator superheat control in supermarket refrigeration systems. Conventional control uses a pressure and temperature sensor for this purpose, however, the pressure sensor can fail to function. A contingency control strategy......, based on a maximum slope-seeking control method and only a single temperature sensor, is developed to drive the evaporator outlet temperature to a level that gives a suitable superheat of the refrigerant. The FTC strategy requires no a priori system knowledge or additional hardware and functions...

  4. Fault diagnosis and fault-tolerant control strategies for non-linear systems analytical and soft computing approaches

    Witczak, Marcin


    This book presents selected fault diagnosis and fault-tolerant control strategies for non-linear systems in a unified framework. In particular, starting from advanced state estimation strategies up to modern soft computing, the discrete-time description of the system is employed. Part I of the book presents original research results regarding state estimation and neural networks for robust fault diagnosis. Part II is devoted to the presentation of integrated fault diagnosis and fault-tolerant systems. It starts with a general fault-tolerant control framework, which is then extended by introducing robustness with respect to various uncertainties. Finally, it is shown how to implement the proposed framework for fuzzy systems described by the well-known Takagi–Sugeno models. This research monograph is intended for researchers, engineers, and advanced postgraduate students in control and electrical engineering, computer science, as well as mechanical and chemical engineering.

  5. Fault-tolerant for Electric Vehicles Drive System Sensor Failure

    Zhang Liwei


    When an EV failure happens, a fault-tolerant method is needed to ensure people's safety. When the current sensor and speed sensor fail, a software fault-tolerant control algorithm switching strategy can be used. This paper presents a theoretical analysis of switching from the rotor field-oriented vector control algorithm to the open-loop constant V/F control algorithm; the phase angle compensation method is used to reduce the current and torque shock, and simulation is carried out in MATLAB/Simulink.

  6. Robust and Active Fault-tolerant Control for a Class of Nonlinear Uncertain Systems

    You-Qing Wang; Dong-Hua Zhou; Li-Heng Liu


    A novel integrated design strategy for robust fault diagnosis and fault-tolerant control (FTC) of a class of nonlinear uncertain systems is proposed. The uncertainties considered in this paper are more general than those in other existing works, and faults are described in a new formulation. It is proven that the states of a closed-loop system converge asymptotically to zero even if there are uncertainties and faults in a system. Simulation results on a simple pendulum are presented for illustration.

  7. Passive Fault Tolerant Control of Piecewise Affine Systems Based on H Infinity Synthesis

    Gholami, Mehdi; Cocquempot, Vincent; Schiøler, Henrik


    In this paper we design a passive fault tolerant controller against actuator faults for discrete-time piecewise affine (PWA) systems. By using dissipativity theory and H∞ analysis, fault tolerant state feedback controller design is expressed as a set of Linear Matrix Inequalities (LMIs). In the cur......). In the current paper, the PWA system switches not only due to the state but also due to the control input. The method is applied on a large scale livestock ventilation model....

  8. Fault-Tolerant Design and Testing of USB2.0 Peripheral Devices IP Core System

    BAI Xiaoping; WEI Yuanfeng


    Universal serial bus 2.0 (USB2.0) is a mainstream interface technology. Traditional USB development addresses only USB peripheral devices. For the USB2.0 peripheral device IP core system, which has broad application prospects, some interference inevitably exists in signal transmission. Fault-tolerant design and test methods must therefore be adopted in order to transmit and receive data correctly. Drawing on a project, this paper describes in detail the measures, hardware implementation, and test methods for fault-tolerant design of the USB2.0 peripheral device IP core system. Fault-tolerant design measures, noise reduction measures for signal processing, fault-tolerant methods for data encoding and decoding, packet identification (ID) field fault-tolerant methods, and cyclic redundancy check fault-tolerant methods are discussed. The paper also presents hardware implementation methods for the fault-tolerant design of data decoding and test methods for the fault-tolerant design of the USB2.0 IP core system. These methods can serve as a reference for the development of USB2.0 systems in all kinds of electronic instrumentation.
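
    One of the checks listed in this abstract, protection of the packet identifier field, can be shown in a few lines. In USB 2.0 the PID byte carries a 4-bit packet type in its low nibble and the bitwise complement of that type in its high nibble, so a receiver can reject a PID whose two halves disagree. The sketch below illustrates only that standard check; the function names are hypothetical and nothing here reflects the hardware implementation described in the paper.

```python
def make_pid_byte(pid: int) -> int:
    """Pack a 4-bit PID with its bitwise complement in the upper nibble."""
    if not 0 <= pid <= 0xF:
        raise ValueError("PID must fit in 4 bits")
    return ((~pid & 0xF) << 4) | pid


def pid_is_valid(byte: int) -> bool:
    """Accept a PID byte only if the high nibble complements the low nibble."""
    low = byte & 0xF
    high = (byte >> 4) & 0xF
    return high == (~low & 0xF)


if __name__ == "__main__":
    ack = make_pid_byte(0b0010)       # ACK handshake packet type
    print(hex(ack), pid_is_valid(ack))
    corrupted = ack ^ 0x01            # one bit disturbed in transit
    print(hex(corrupted), pid_is_valid(corrupted))
```

    Because every bit of the type is mirrored by its complement, any single-bit error in the PID byte breaks the relation and is detected; payload integrity is covered separately by the CRC fields.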

  9. Decentralized Fault Tolerant Control for a Class of Interconnected Nonlinear Systems.

    Shao, Shuai; Yang, Hao; Jiang, Bin; Cheng, Shuyao


    This paper proposes a decentralized fault tolerant methodology for a class of interconnected nonlinear systems. The key novelty of our proposed method is that fault tolerant control can be achieved without necessarily exchanging the state information between the subsystems and the couplings' effect can be dealt with utilizing the cyclic-small-gain methodology. Simulation results demonstrate effectively the validity of our proposed approach.

  10. Synthesis of Fault-Tolerant Embedded Systems with Checkpointing and Replication

    Izosimov, Viacheslav; Pop, Paul; Eles, Petru;


    We present an approach to the synthesis of fault-tolerant hard real-time systems for safety-critical applications. We use checkpointing with rollback recovery and active replication for tolerating transient faults. Processes are statically scheduled and communications are performed using the time...
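
    The listing truncates this abstract, so the authors' scheme is not reproduced here. The following is only a minimal sketch of checkpointing with rollback recovery in general, assuming a transient fault is detected before the affected step commits its result. All names (`run_with_checkpoints`, `faulty_executions`, `max_faults`) are hypothetical.

```python
import copy
from typing import Any, Callable, List, Set, Tuple


def run_with_checkpoints(
    steps: List[Callable[[Any], Any]],
    state: Any,
    checkpoint_every: int,
    faulty_executions: Set[int],
    max_faults: int,
) -> Tuple[Any, int]:
    """Run steps in order, rolling back to the last checkpoint on a detected fault.

    faulty_executions holds 0-based execution counters at which a transient
    fault is assumed to hit (a stand-in for a real error detector).
    Returns the final state and the total number of step executions attempted.
    """
    checkpoint_state = copy.deepcopy(state)
    checkpoint_index = 0
    executions = 0
    faults_seen = 0
    i = 0
    while i < len(steps):
        if i % checkpoint_every == 0:
            # Save a checkpoint at every segment boundary.
            checkpoint_state, checkpoint_index = copy.deepcopy(state), i
        faulty = executions in faulty_executions
        executions += 1
        if faulty:
            faults_seen += 1
            if faults_seen > max_faults:
                raise RuntimeError("more faults than the tolerated maximum")
            # Discard the faulty execution and restart from the checkpoint.
            state = copy.deepcopy(checkpoint_state)
            i = checkpoint_index
            continue
        state = steps[i](state)
        i += 1
    return state, executions


steps = [lambda s, k=k: s + k for k in (1, 2, 3, 4)]
print(run_with_checkpoints(steps, 0, 2, set(), 1))
print(run_with_checkpoints(steps, 0, 2, {1}, 1))
```

    The second call re-executes one segment after the injected fault, which shows the time overhead that a static schedule for such a system has to budget for.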

  11. Energy-Aware Synthesis of Fault-Tolerant Schedules for Real-Time Distributed Embedded Systems

    Poulsen, Kåre Harbo; Pop, Paul; Izosimov, Viacheslav


    This paper presents a design optimisation tool for distributed embedded real-time systems that 1) decides mapping, fault-tolerance policy and generates a fault-tolerant schedule, 2) is targeted for hard real-time, 3) has hard reliability goal, 4) generates static schedule for processes and messages...

  12. Sensor Fault and Delay Tolerant Control for Networked Control Systems Subject to External Disturbances.

    Han, Shi-Yuan; Chen, Yue-Hui; Tang, Gong-You


    In this paper, the sensor fault and delay tolerant control problem for a class of networked control systems under external disturbances is investigated. More precisely, the dynamic characteristics of the external disturbance and sensor fault are first described as the output of exogenous systems. The original sensor fault and delay tolerant control problem is reformulated as an equivalent problem with designed available system output and reformed performance index. The feedforward and feedback sensor fault tolerant controller (FFSFTC) can be obtained by utilizing the solutions of Riccati matrix equation and Stein matrix equation. Based on the designed fault diagnoser, the proposed FFSFTC is further reconstructed to compensate for the sensor fault and delayed measurement effects. Finally, numerical examples are provided to illustrate the effectiveness of our proposed FFSFTC in different cases with various types of sensor faults, measurement delays and external disturbances.

  13. Towards fault-tolerant decision support systems for ship operator guidance

    Nielsen, Ulrik Dam; Lajic, Zoran; Jensen, Jørgen Juncher


    Fault detection and isolation are very important elements in the design of fault-tolerant decision support systems for ship operator guidance. This study outlines remedies that can be applied for fault diagnosis, when the ship responses are assumed to be linear in the wave excitation. A novel...

  14. An Algebra of Fault Tolerance

    Rao, Shrisha


    Every system of any significant size is created by composition from smaller sub-systems or components. It is thus fruitful to analyze the fault-tolerance of a system as a function of its composition. In this paper, two basic types of system composition are described, and an algebra to describe fault tolerance of composed systems is derived. The set of systems forms monoids under the two composition operators, and a semiring when both are concerned. A partial ordering relation between systems is used to compare their fault-tolerance behaviors.

  15. Fault-Tolerant Control of the Road Wheel Subsystem in a Steer-By-Wire System

    Bing Zheng


    This paper describes a fault-tolerant steer-by-wire road wheel control system. With dual motor and dual microcontroller architecture, this system has the capability to tolerate single-point failures without degrading the control system performance. The arbitration bus, mechanical arrangement of motors, and the developed control algorithm allow the system to reconfigure itself automatically in the event of a single-point fault, and assure a smooth reconfiguration process. Both simulation and experimental results illustrate the effectiveness of the proposed fault-tolerant control system.

  16. Efficient Fault Tree Analysis of Complex Fault Tolerant Multiple-Phased Systems

    MO Yuchang; LIU Hongwei; YANG Xiaozong


    Fault tolerant multiple phased systems (FTMPS) are systems whose critical components are independently replicated and whose operational life can be partitioned into a set of disjoint periods, called "phases". Because of their deployment in critical applications, their reliability analysis is a task of primary relevance to validate the designs. Fault tree analysis based on binary decision diagrams (BDD) is one of the most commonly used techniques for FTMPS reliability analysis. To utilize the technique, the fault tree structure of the FTMPS needs to be converted into the corresponding BDD format. Our research work shows that the system BDD generation algorithms presented in the literature are too inefficient to be used for complex industrial FTMPS because of problems such as variable ordering and the combination of large BDDs. This paper presents a more efficient approach consisting of a flattening pre-processing technique, a proven efficient ordering heuristic and a bottom-up generation algorithm. The approach first combines share-variable BDDs using the complex combination operation and then combines no-share-variable BDDs using the simple combination operation, thus avoiding the intensive computations caused by large BDD combination operations. An example FTMPS is analyzed to illustrate the advantages of our approach.

  17. Analysis and optimization of fault-tolerant embedded systems with hardened processors

    Izosimov, Viacheslav; Polian, Ilia; Pop, Paul


    In this paper we propose an approach to the design optimization of fault-tolerant hard real-time embedded systems, which combines hardware and software fault tolerance techniques. We trade-off between selective hardening in hardware and process reexecution in software to provide the required levels of fault tolerance against transient faults with the lowest-possible system costs. We propose a system failure probability (SFP) analysis that connects the hardening level with the maximum number of reexecutions in software. We present design optimization heuristics, to select the fault-tolerant architecture and decide process mapping such that the system cost is minimized, deadlines are satisfied, and the reliability requirements are fulfilled....

  18. Software engineering for fault-tolerant systems. Final technical report, Jan 89-Aug 90

    Goel, A.L.; Mansour, N.


    The objectives of this study are to (1) assess the current state of the art of fault tolerant software schemes, (2) evaluate the status of various software engineering issues in this context, (3) identify critical gaps in the currently available technology and, (4) provide recommendations for research and development efforts to enhance the technological base of fault tolerant software engineering. Towards these objectives, the authors have discussed several software fault tolerance schemes, studied the available experimental and analytical evidence about their usefulness and assessed the current status of fault tolerant software engineering for sequential and parallel computers. Based on the studies reported here, they feel that the current state-of-the-art of fault tolerant software is mature enough to tolerate design faults in specific circumstances with appropriate provisions of redundancy and allied supporting mechanisms. However, no known fault tolerance technique can guarantee failure-free system operation. Further, it is questionable whether the current approaches are cost-effective in achieving the desired gain in operational software reliability. They feel that what is needed is a systematic, cost effective approach to software development which explicitly addresses the fault tolerance issues throughout the development life-cycle.

  19. Fault Tolerant Control System Design Using Automated Methods from Risk Analysis

    Blanke, M.

    Fault tolerant controls have the ability to be resilient to simple faults in control loop components....

  20. Fault Tolerant Emergency Control to Preserve Power System Stability

    Pedersen, Andreas Søndergaard; Richter, Jan H.; Tabatabaeipour, Mojtaba;


    This paper introduces a method for fault-masking and system reconfiguration in power transmission systems. The paper demonstrates how faults are handled by reconfiguring remaining controls through utilisation of wide-area measurement in real time. It is shown how reconfiguration can be obtained using a virtual actuator concept, which covers Lur'e-type systems. The paper shows the steps needed to calculate a virtual actuator, which relies on the solution of a linear matrix inequality. The solution is shown to work with existing controls by adding a compensation signal. Simulation results of a benchmark system show ability of the reconfiguration to maintain stability...

  1. Fault-Tolerant Design of Spaceborne Mass Memory System

    张宇宁; 常亮; 杨根庆; 李华旺


    A fault-tolerant spaceborne mass memory architecture is presented based on entirely commercial-off-the-shelf components. The highly modularized and scalable memory kernel supports the hierarchical design and is well suited to redundancy structure. Error correcting code (ECC) and periodical scrubbing are used to deal with bit errors induced by single event upset. For 8-bit wide devices, the parallel Reed-Solomon (10, 8) can perform coder/decoder calculations in one clock cycle, achieving a data rate of several Gb/...
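
    To illustrate how ECC and periodic scrubbing work together, the sketch below swaps the paper's Reed-Solomon (10, 8) code for a much simpler Hamming(7,4) code, which corrects one flipped bit per word. It is an assumption-laden toy, not the paper's design: memory is a Python list of 7-bit codewords, and `scrub` is a hypothetical name.

```python
from typing import List


def encode(nibble: int) -> int:
    """Encode 4 data bits into a 7-bit Hamming codeword (parity at positions 1, 2, 4)."""
    bits = [0] * 8  # index 1..7 used, index 0 unused
    bits[3], bits[5], bits[6], bits[7] = [(nibble >> k) & 1 for k in range(4)]
    bits[1] = bits[3] ^ bits[5] ^ bits[7]
    bits[2] = bits[3] ^ bits[6] ^ bits[7]
    bits[4] = bits[5] ^ bits[6] ^ bits[7]
    return sum(bits[p] << (p - 1) for p in range(1, 8))


def syndrome(word: int) -> int:
    """XOR of the 1-based positions of all set bits; 0 for a valid codeword."""
    s = 0
    for p in range(1, 8):
        if (word >> (p - 1)) & 1:
            s ^= p
    return s


def correct(word: int) -> int:
    """Flip the bit named by the syndrome, which repairs any single-bit error."""
    s = syndrome(word)
    return word ^ (1 << (s - 1)) if s else word


def decode(word: int) -> int:
    """Correct the word, then extract the data bits from positions 3, 5, 6, 7."""
    w = correct(word)
    return ((w >> 2) & 1) | (((w >> 4) & 1) << 1) | (((w >> 5) & 1) << 2) | (((w >> 6) & 1) << 3)


def scrub(memory: List[int]) -> int:
    """Walk the memory, rewrite every correctable word, and count the repairs."""
    repaired = 0
    for addr, word in enumerate(memory):
        fixed = correct(word)
        if fixed != word:
            memory[addr] = fixed
            repaired += 1
    return repaired


memory = [encode(n) for n in range(16)]
memory[3] ^= 1 << 4   # simulated single event upset
memory[9] ^= 1 << 0   # another upset in a different word
print(scrub(memory), [decode(w) for w in memory] == list(range(16)))
```

    Scrubbing periodically matters because a code like this only corrects one error per word: repairing upsets before a second one lands in the same word keeps the errors within the code's correction capability.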

  2. An Integrated Fault Tolerant Robotic Controller System for High Reliability and Safety

    Marzwell, Neville I.; Tso, Kam S.; Hecht, Myron


    This paper describes the concepts and features of a fault-tolerant intelligent robotic control system being developed for applications that require high dependability (reliability, availability, and safety). The system consists of two major elements: a fault-tolerant controller and an operator workstation. The fault-tolerant controller uses a strategy which allows for detection and recovery of hardware, operating system, and application software failures. The fault-tolerant controller can be used by itself in a wide variety of applications in industry, process control, and communications. The controller in combination with the operator workstation can be applied to robotic applications such as spaceborne extravehicular activities, hazardous materials handling, inspection and maintenance of high value items (e.g., space vehicles, reactor internals, or aircraft), medicine, and other tasks where a robot system failure poses a significant risk to life or property.

  3. Award ER25750: Coordinated Infrastructure for Fault Tolerance Systems Indiana University Final Report

    Lumsdaine, Andrew


    The main purpose of the Coordinated Infrastructure for Fault Tolerance in Systems initiative has been to conduct research with a goal of providing end-to-end fault tolerance on a systemwide basis for applications and other system software. While fault tolerance has been an integral part of most high-performance computing (HPC) system software developed over the past decade, it has been treated mostly as a collection of isolated stovepipes. Visibility and response to faults have typically been limited to the particular hardware and software subsystems in which they are initially observed. Little fault information is shared across subsystems, allowing little flexibility or control on a system-wide basis, making it practically impossible to provide cohesive end-to-end fault tolerance in support of scientific applications. As an example, consider faults such as communication link failures that can be seen by a network library but are not directly visible to the job scheduler, or consider faults related to node failures that can be detected by system monitoring software but are not inherently visible to the resource manager. If information about such faults could be shared by the network libraries or monitoring software, then other system software, such as a resource manager or job scheduler, could ensure that failed nodes or failed network links were excluded from further job allocations and that further diagnosis could be performed. As a founding member and one of the lead developers of the Open MPI project, our efforts over the course of this project have been focused on making Open MPI more robust to failures by supporting various fault tolerance techniques, and using fault information exchange and coordination between MPI and the HPC system software stack from the application, numeric libraries, and programming language runtime to other common system components such as job schedulers, resource managers, and monitoring tools.

  4. Fault-diagnosis applications. Model-based condition monitoring. Actuators, drives, machinery, plants, sensors, and fault-tolerant systems

    Isermann, Rolf [Technische Univ. Darmstadt (DE). Inst. fuer Automatisierungstechnik (IAT)]


    Supervision, condition-monitoring, fault detection, fault diagnosis and fault management play an increasing role for technical processes and vehicles in order to improve reliability, availability, maintenance and lifetime. For safety-related processes fault-tolerant systems with redundancy are required in order to reach comprehensive system integrity. This book is a sequel to the book "Fault-Diagnosis Systems" published in 2006, where the basic methods were described. After a short introduction to fault-detection and fault-diagnosis methods the book shows how these methods can be applied to a selection of 20 real technical components and processes as examples, such as: electrical drives (DC, AC); electrical actuators; fluidic actuators (hydraulic, pneumatic); centrifugal and reciprocating pumps; pipelines (leak detection); industrial robots; machine tools (main and feed drive, drilling, milling, grinding); and heat exchangers. Realized fault-tolerant systems for electrical drives, actuators and sensors are also presented. The book describes why and how the various signal-model-based and process-model-based methods were applied and which experimental results could be achieved. In several cases a combination of different methods was most successful. The book is intended for graduate students of electrical, mechanical, and chemical engineering and computer science, and for engineers.

  5. Mapping of Fault-Tolerant Applications with Transparency on Distributed Embedded Systems

    Izosimov, Viacheslav; Pop, Paul; Eles, Petru


    In this paper we present an approach for the mapping optimization of fault-tolerant embedded systems for safety-critical applications. Processes and messages are statically scheduled. Process re-execution is used for recovering from multiple transient faults. We call process recovery transparent...

  6. Fault tolerant software modules for SIFT

    Hecht, M.; Hecht, H.


    The implementation of software fault tolerance is investigated for critical modules of the Software Implemented Fault Tolerance (SIFT) operating system to support the computational and reliability requirements of advanced fly-by-wire transport aircraft. Fault tolerant designs generated for the error reporter and global executive are examined. A description of the alternate routines, implementation requirements, and software validation is included.

  7. Diagnosis and fault-tolerant control

    Blanke, Mogens; Lunze, Jan; Staroswiecki, Marcel


    Fault-tolerant control aims at a gradual shutdown response in automated systems when faults occur. It satisfies the industrial demand for enhanced availability and safety, in contrast to traditional reactions to faults, which bring about sudden shutdowns and loss of availability. The book presents effective model-based analysis and design methods for fault diagnosis and fault-tolerant control. Architectural and structural models are used to analyse the propagation of the fault through the process, to test the fault detectability and to find the redundancies in the process that can be used to ensure fault tolerance. It also introduces design methods suitable for diagnostic systems and fault-tolerant controllers for continuous processes that are described by analytical models of discrete-event systems represented by automata. The book is suitable for engineering students, engineers in industry and researchers who wish to get an overview of the variety of approaches to process diagnosis and fault-tolerant contro...

  8. Fault-Tolerant Consensus of Multi-Agent System With Distributed Adaptive Protocol.

    Chen, Shun; Ho, Daniel W C; Li, Lulu; Liu, Ming


    In this paper, fault-tolerant consensus in multi-agent system using distributed adaptive protocol is investigated. Firstly, distributed adaptive online updating strategies for some parameters are proposed based on local information of the network structure. Then, under the online updating parameters, a distributed adaptive protocol is developed to compensate the fault effects and the uncertainty effects in the leaderless multi-agent system. Based on the local state information of neighboring agents, a distributed updating protocol gain is developed which leads to a fully distributed continuous adaptive fault-tolerant consensus protocol design for the leaderless multi-agent system. Furthermore, a distributed fault-tolerant leader-follower consensus protocol for multi-agent system is constructed by the proposed adaptive method. Finally, a simulation example is given to illustrate the effectiveness of the theoretical analysis.

  9. Design Optimization of Time- and Cost-Constrained Fault-Tolerant Distributed Embedded Systems

    Izosimov, Viacheslav; Pop, Paul; Eles, Petru;


    In this paper we present an approach to the design optimization of fault-tolerant embedded systems for safety-critical applications. Processes are statically scheduled and communications are performed using the time-triggered protocol. We use process re-execution and replication for tolerating...

  10. Adaptive sensor-fault tolerant control for a class of multivariable uncertain nonlinear systems.

    Khebbache, Hicham; Tadjine, Mohamed; Labiod, Salim; Boulkroune, Abdesselem


    This paper deals with the active fault tolerant control (AFTC) problem for a class of multiple-input multiple-output (MIMO) uncertain nonlinear systems subject to sensor faults and external disturbances. The proposed AFTC method can tolerate three additive (bias, drift and loss of accuracy) and one multiplicative (loss of effectiveness) sensor faults. By employing backstepping technique, a novel adaptive backstepping-based AFTC scheme is developed using the fact that sensor faults and system uncertainties (including external disturbances and unexpected nonlinear functions caused by sensor faults) can be on-line estimated and compensated via robust adaptive schemes. The stability analysis of the closed-loop system is rigorously proven using a Lyapunov approach. The effectiveness of the proposed controller is illustrated by two simulation examples.

  11. Novel active fault-tolerant control scheme and its application to a double inverted pendulum system


    On the basis of the gain-scheduled H∞ design strategy, a novel active fault-tolerant control scheme is proposed. Under the assumption that the effects of faults on the state-space matrices of systems can be of affine parameter dependence, a reconfigurable robust H∞ linear parameter varying controller is developed. The designed controller is a function of the fault effect factors that can be derived online by using a well-trained neural network. To demonstrate the effectiveness of the proposed method, a double inverted pendulum system, with a fault in the motor tachometer loop, is considered.

  12. Fault-tolerant Supervisory Control

    Izadi-Zamanabadi, Roozbeh

    The main purpose of this work has been to achieve active fault-tolerance in control systems, defined as a methodology where fault detection and isolation techniques are combined with supervisory control to achieve autonomous accommodation of faults before they develop into failures. The aim... control algorithms. The drawback is, however, that these control systems have become more vulnerable to even simple faults in instrumentation. On the other hand, due to cost-optimality requirements, an extensive use of hardware redundancy has been prohibited. Nevertheless, the dependency and availability could be increased through enhancing control systems' ability to perform on-line fault detection and reconfiguration when a fault occurs and before a safety system shuts down the entire process. The main contributions of this research effort are development and experimentation with methodologies...

  13. Validation Methods Research for Fault-Tolerant Avionics and Control Systems: Working Group Meeting, 2

    Gault, J. W. (Editor); Trivedi, K. S. (Editor); Clary, J. B. (Editor)


    The validation process comprises the activities required to ensure the agreement of system realization with system specification. A preliminary validation methodology for fault tolerant systems is documented. A general framework for a validation methodology is presented along with a set of specific tasks intended for the validation of two specimen systems, SIFT and FTMP. Two major areas of research are identified: first, those activities required to support the ongoing development of the validation process itself, and second, those activities required to support the design, development, and understanding of fault tolerant systems.

  14. A Piecewise Affine Hybrid Systems Approach to Fault Tolerant Satellite Formation Control

    Grunnet, Jacob Deleuran; Larsen, Jesper Abildgaard; Bak, Thomas


    In this paper a procedure for modelling satellite formations including failure dynamics as a piecewise-affine hybrid system is shown. The formulation enables recently developed methods and tools for control and analysis of piecewise-affine systems to be applied, leading to synthesis of fault tolerant controllers and analysis of the system behaviour given possible faults. The method is illustrated using a simple example involving two satellites trying to reach a specific formation despite actuator faults occurring....

  15. Architecture for Intrusion Detection System with Fault Tolerance Using Mobile Agent

    Chintan Bhatt


    This paper is a survey of work done on making an IDS fault tolerant. An IDS architecture that uses mobile agents provides higher scalability. The mobile agent uses a platform for detecting intrusions using a filter agent, a correlator agent, an interpreter agent, and a rule database. When the server (IDS monitor) goes down, other hosts take ownership based on priority. This architecture uses decentralized collection and analysis for identifying intrusions. Rule sets are fed based on user behaviour or application behaviour. This paper suggests that an intrusion detection system (IDS) must be fault tolerant; otherwise, the intruder may first subvert the IDS and then attack the target system at will.

  16. Application of Joint Parameter Identification and State Estimation to a Fault-Tolerant Robot System

    Sun, Zhen; Yang, Zhenyu


    The joint parameter identification and state estimation technique is applied to develop a fault-tolerant space robot system. The potential faults in the considered system are abrupt parametric faults, which indicate that some system parameters will immediately deviate from their nominal values if a fault happens. The concerned system parameters consist of deterministic parts as well as those describing the stochastic features in the system. Due to the purpose for design of reconfigurable control, these deviated system parameters need to be identified as precisely and quickly as possible. Meanwhile, it would further simplify the reconfigurable design task and possibly speed up the system recovery, if the system state information under the new operating circumstance can be available along with faulty parameter information. The joint parameter identification and state estimation using the combined...

  17. A real-time fault-tolerant scheduling algorithm with low dependability cost in on-board computer system

    WANG Pei-dong; WEI Zhen-hua


    To make the on-board computer system in a satellite more dependable and real-time, a fault-tolerant scheduling algorithm with high priority recovery for the on-board computer system is proposed in this paper. This algorithm can schedule the on-board fault-tolerant tasks in real time. Due to the use of dependability cost, the overhead of scheduling the fault-tolerant tasks can be reduced. The mechanism of high priority recovery improves the response to recovery tasks. The fault-tolerant scheduling model is presented, and simulation results validate the correctness and feasibility of the proposed algorithm.

  18. An architecture for fault tolerant controllers

    Niemann, Hans Henrik; Stoustrup, Jakob


    A general architecture for fault tolerant control is proposed. The architecture is based on the (primary) YJBK parameterization of all stabilizing compensators and uses the dual YJBK parameterization to quantify the performance of the fault tolerant system. The approach suggested can be applied for additive faults, parametric faults, and for system structural changes. The modeling for each of these fault classes is described. The method allows design for passive as well as for active fault handling. Also, the related design method can be fitted either to guarantee stability or to achieve graceful degradation in the sense of guaranteed degraded performance. A number of fault diagnosis problems, fault tolerant control problems, and feedback control with fault rejection problems are formulated/considered, mainly from a fault modeling point of view. The method is illustrated on a servo example including...

  19. Scheduling and Optimization of Fault-Tolerant Embedded Systems with Transparency/Performance Trade-Offs

    Izosimov, Viacheslav; Pop, Paul; Eles, Petru;


    In this article, we propose a strategy for the synthesis of fault-tolerant schedules and for the mapping of fault-tolerant applications. Our techniques handle transparency/performance trade-offs and use the fault-occurrence information to reduce the overhead due to fault tolerance. Processes and m...

  20. H∞ robust fault-tolerant controller design for an autonomous underwater vehicle's navigation control system

    Cheng, Xiang-Qin; Qu, Jing-Yuan; Yan, Zhe-Ping; Bian, Xin-Qian


    In order to improve the security and reliability for autonomous underwater vehicle (AUV) navigation, an H∞ robust fault-tolerant controller was designed after analyzing variations in state-feedback gain. Operating conditions and the design method were then analyzed so that the control problem could be expressed as a mathematical optimization problem. This permitted the use of linear matrix inequalities (LMI) to solve for the H∞ controller for the system. When considering different actuator failures, these conditions were then also mathematically expressed, allowing the H∞ robust controller to solve for these events and thus be fault-tolerant. Finally, simulation results showed that the H∞ robust fault-tolerant controller could provide precise AUV navigation control with strong robustness.

  1. Robust fault tolerant control of uncertain time-delay linear systems


    Robust fault tolerant control for a class of time-delay linear systems with parameter uncertainties is studied, and a time-delay related state feedback control is proposed. On the basis of the Lyapunov method, we prove that the proposed control law has integrity against sensor and/or actuator failures if the corresponding sufficient condition is satisfied. A heuristic algorithm is also provided to facilitate the realization of the fault tolerant control. Finally, a simulation example is presented to show the effectiveness of the proposed approach.

  2. Enhanced fault-tolerant quantum computing in d-level systems.

    Campbell, Earl T


    Error-correcting codes protect quantum information and form the basis of fault-tolerant quantum computing. Leading proposals for fault-tolerant quantum computation require codes with an exceedingly rare property, a transversal non-Clifford gate. Codes with the desired property are presented for d-level qudit systems with prime d. The codes use n=d-1 qudits and can detect up to ∼d/3 errors. We quantify the performance of these codes for one approach to quantum computation known as magic-state distillation. Unlike prior work, we find performance is always enhanced by increasing d.

  3. Fault Injection and Monitoring Capability for a Fault-Tolerant Distributed Computation System

    Torres-Pomales, Wilfredo; Yates, Amy M.; Malekpour, Mahyar R.


    The Configurable Fault-Injection and Monitoring System (CFIMS) is intended for the experimental characterization of effects caused by a variety of adverse conditions on a distributed computation system running flight control applications. A product of research collaboration between NASA Langley Research Center and Old Dominion University, the CFIMS is the main research tool for generating actual fault response data with which to develop and validate analytical performance models and design methodologies for the mitigation of fault effects in distributed flight control systems. Rather than a fixed design solution, the CFIMS is a flexible system that enables the systematic exploration of the problem space and can be adapted to meet the evolving needs of the research. The CFIMS has the capabilities of system-under-test (SUT) functional stimulus generation, fault injection and state monitoring, all of which are supported by a configuration capability for setting up the system as desired for a particular experiment. This report summarizes the work accomplished so far in the development of the CFIMS concept and documents the first design realization.

  4. Fault Tolerant Computer Architecture

    Sorin, Daniel


    For many years, most computer architects have pursued one primary goal: performance. Architects have translated the ever-increasing abundance of ever-faster transistors provided by Moore's law into remarkable increases in performance. Recently, however, the bounty provided by Moore's law has been accompanied by several challenges that have arisen as devices have become smaller, including a decrease in dependability due to physical faults. In this book, we focus on the dependability challenge and the fault tolerance solutions that architects are developing to overcome it. The two main purposes

  5. Energy/Reliability Trade-offs in Fault-Tolerant Event-Triggered Distributed Embedded Systems

    Gan, Junhe; Gruian, Flavius; Pop, Paul;


    This paper presents an approach to the synthesis of low-power fault-tolerant hard real-time applications mapped on distributed heterogeneous embedded systems. Our synthesis approach decides the mapping of tasks to processing elements, as well as the voltage and frequency levels for executing each...

  6. Robust Fault Tolerant Control for a Class of Time-Delay Systems with Multiple Disturbances

    Songyin Cao


    A robust fault tolerant control (FTC) approach is addressed for a class of nonlinear systems with time delay, actuator faults, and multiple disturbances. The first part of the multiple disturbances is supposed to be an uncertain modeled disturbance and the second one represents a norm-bounded variable. First, a composite observer is designed to estimate the uncertain modeled disturbance and actuator fault simultaneously. Then, an FTC strategy consisting of disturbance observer based control (DOBC), fault accommodation, and a mixed H2/H∞ controller is constructed to reconfigure the considered systems with disturbance rejection and attenuation performance. Finally, simulations for a flight control system are given to show the efficiency of the proposed approach.

  7. Distributed Fault-Tolerant Avionic Systems - A Real-Time Perspective

    Burke, Michael


    This paper examines the problem of introducing advanced forms of fault-tolerance via reconfiguration into safety-critical avionic systems. This is required to enable increased availability after fault occurrence in distributed integrated avionic systems (compared to static federated systems). The approach taken is to identify a migration path from current architectures to those that incorporate reconfiguration to a lesser or greater degree. Other challenges identified include change of the development process; incremental and flexible timing and safety analyses; and configurable kernels applicable for safety-critical systems.

  8. Diagnosis and Tolerant Strategy of an Open-Switch Fault for T-type Three-Level Inverter Systems

    Choi, Uimin; Lee, Kyo Beum; Blaabjerg, Frede


    This paper proposes a new diagnosis method of an open-switch fault and fault-tolerant control strategy for T-type three-level inverter systems. The location of the faulty switch can be identified by the average of normalized phase current and the change of the neutral-point voltage. The proposed fault-tolerant strategy is explained by dividing into two cases: the faulty condition of half-bridge switches and the neutral-point switches. The performance of the T-type inverter system improves considerably by the proposed fault tolerant algorithm when a switch fails. The proposed method does not require additional...

  9. An Efficient Fault Tolerance System Design for Cmos/Nanodevice Digital Memories

    D. Kavitha


    Targeting future fault-prone hybrid CMOS/nanodevice digital memories, this paper presents two fault-tolerance design approaches that integrally address the tolerance of defects and transient faults. These two approaches share several key features, including the use of a group of Bose-Chaudhuri-Hocquenghem (BCH) codes for both defect tolerance and transient fault tolerance, and integration of BCH code selection and dynamic logical-to-physical address mapping. Thus, a new model of BCH decoder is proposed to reduce the area and simplify the computational scheduling of both the syndrome and Chien search blocks without parallelism, leading to high throughput. The goal of fault tolerant computing is to improve the dependability of systems, where dependability can be defined as the ability of a system to deliver service at an acceptable level of confidence in either the presence or absence of faults. Finally, the results of the simulation and implementation using Xilinx ISE software and the LCD screen on the FPGA board are shown.

  10. Design Optimization of Time- and Cost-Constrained Fault-Tolerant Embedded Systems with Checkpointing and Replication

    Pop, Paul; Izosimov, Viacheslav; Eles, Petru;


    We present an approach to the synthesis of fault-tolerant hard real-time systems for safety-critical applications. We use checkpointing with rollback recovery and active replication for tolerating transient faults. Processes and communications are statically scheduled. Our synthesis approach deci...
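
    As a rough illustration of checkpointing with rollback recovery, the sketch below saves the state before each segment of a process and re-executes the segment from that checkpoint when a transient fault is signalled. It is not the authors' synthesis approach: static scheduling and timing are ignored, and the names `TransientFault`, `run_with_rollback`, and `make_flaky` are hypothetical.

```python
import copy


class TransientFault(Exception):
    """Hypothetical signal that a segment was hit by a transient fault."""


def run_with_rollback(segments, state, max_retries=2):
    """Run segments in order, checkpointing the state before each one."""
    for segment in segments:
        checkpoint = copy.deepcopy(state)  # save state before the segment
        for _ in range(max_retries + 1):
            try:
                segment(state)
                break  # segment finished without a fault
            except TransientFault:
                # Roll back: discard partial updates, restore the checkpoint.
                state.clear()
                state.update(copy.deepcopy(checkpoint))
        else:
            raise RuntimeError("fault persisted beyond the retry budget")
    return state


def make_flaky(failures):
    """Build a segment that faults `failures` times before succeeding."""
    remaining = [failures]

    def segment(state):
        state["x"] += 10  # partial update that a rollback must undo
        if remaining[0] > 0:
            remaining[0] -= 1
            raise TransientFault()

    return segment


def double(state):
    state["x"] *= 2


result = run_with_rollback([make_flaky(1), double], {"x": 1})
```

    Here the first segment faults once, is rolled back to `x = 1`, and succeeds on re-execution, so the partial update from the failed attempt does not leak into the final state.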

  11. Fault-Tolerant Heat Exchanger

    Izenson, Michael G.; Crowley, Christopher J.


    A compact, lightweight heat exchanger has been designed to be fault-tolerant in the sense that a single-point leak would not cause mixing of heat-transfer fluids. This particular heat exchanger is intended to be part of the temperature-regulation system for habitable modules of the International Space Station and to function with water and ammonia as the heat-transfer fluids. The basic fault-tolerant design is adaptable to other heat-transfer fluids and heat exchangers for applications in which mixing of heat-transfer fluids would pose toxic, explosive, or other hazards: Examples could include fuel/air heat exchangers for thermal management on aircraft, process heat exchangers in the cryogenic industry, and heat exchangers used in chemical processing. The reason this heat exchanger can tolerate a single-point leak is that the heat-transfer fluids are everywhere separated by a vented volume and at least two seals. The combination of fault tolerance, compactness, and light weight is implemented in a unique heat-exchanger core configuration: Each fluid passage is entirely surrounded by a vented region bridged by solid structures through which heat is conducted between the fluids. Precise, proprietary fabrication techniques make it possible to manufacture the vented regions and heat-conducting structures with very small dimensions to obtain a very large coefficient of heat transfer between the two fluids. A large heat-transfer coefficient favors compact design by making it possible to use a relatively small core for a given heat-transfer rate. Calculations and experiments have shown that in most respects, the fault-tolerant heat exchanger can be expected to equal or exceed the performance of the non-fault-tolerant heat exchanger that it is intended to supplant (see table). The only significant disadvantages are a slight weight penalty and a small decrease in the mass-specific heat transfer.

  12. A Systematic Approach to Sensitivity Analysis of Fault Tolerant Systems in NMR Architecture

    Kourosh Aslansefat


    A fault tree illustrates the ways in which a system can fail. It states the different ways in which combinations of faulty components result in an undesired event in the system. Fault trees are used in phases such as the design and operation of industrial systems, where they enable designers to evaluate dependability attributes such as reliability, MTTF and sensitivity. In addition, the fault tree is a systematic method for finding a system's bottlenecks and weak points. In spite of its extensive use in evaluating the reliability of systems, the fault tree is rarely used in calculating sensitivity. In the last decade, little research has been conducted in this field, and the existing methods are neither systematic nor applicable to large-scale systems. This paper provides a systematic method for evaluating system sensitivity through a fault tree. It then introduces the sensitivity of the NMR architecture, one of the common fault tolerance structures used for enhancing systems' reliability, safety and availability in industry. The article presents a comprehensive and parameterized formula for the sensitivity of the NMR structure. The presented method can help engineers who design and operate reliable systems to calculate sensitivity systematically and quickly by means of a fault tree.
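
    The abstract does not reproduce the paper's formula. As a hedged illustration only, the sketch below uses the textbook majority-voting model of N-modular redundancy, assuming N is odd, the modules are identical and independent with reliability r, and the voter is perfect. Sensitivity is taken here to mean the derivative of system reliability with respect to r, which may differ from the paper's definition; the function names are hypothetical.

```python
from math import comb


def nmr_reliability(n: int, r: float) -> float:
    """Probability that a majority of n identical modules work."""
    k = n // 2 + 1  # smallest majority, assuming n is odd
    return sum(comb(n, i) * r**i * (1 - r) ** (n - i) for i in range(k, n + 1))


def nmr_sensitivity(n: int, r: float) -> float:
    """Derivative of nmr_reliability with respect to r (closed form)."""
    k = n // 2 + 1
    return n * comb(n - 1, k - 1) * r ** (k - 1) * (1 - r) ** (n - k)


tmr_r = nmr_reliability(3, 0.9)  # equals 3r^2 - 2r^3 for TMR
tmr_s = nmr_sensitivity(3, 0.9)  # equals 6r(1 - r) for TMR
```

    The closed-form derivative follows from the telescoping sum obtained when the binomial terms are differentiated one by one; for TMR it reduces to 6r(1 - r), which peaks at r = 0.5.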

  13. Active fault tolerant control for vertical tail damaged aircraft with dissimilar redundant actuation system

    Wang Jun; Wang Shaoping; Wang Xingjian; Shi Cun; Mileta M. Tomovic


    This paper proposes an active fault-tolerant control strategy for an aircraft with dissimilar redundant actuation system (DRAS) that has suffered from vertical tail damage. A damage degree coefficient based on the effective vertical tail area is introduced to parameterize the damaged flight dynamic model. The nonlinear relationship between the damage degree coefficient and the corresponding stability derivatives is considered. Furthermore, the performance degradation of new input channel with electro-hydrostatic actuator (EHA) is also taken into account in the damaged flight dynamic model. Based on the accurate damaged flight dynamic model, a composite method of linear quadratic regulator (LQR) integrating model reference adaptive control (MRAC) is proposed to reconfigure the fault-tolerant control law. The numerical simulation results validate the effectiveness of the proposed fault-tolerant control strategy with accurate flight dynamic model. The results also indicate that aircraft with DRAS has better fault-tolerant control ability than the traditional ones when the vertical tail suffers from serious damage.

  14. Fault Tolerance Mobile Agent System Using Witness Agent in 2-Dimensional Mesh Network

    Ahmad Rostami


    Mobile agents are computer programs that act autonomously on behalf of a user or owner and travel through a network of heterogeneous machines. Fault tolerance is important along their itinerary. In this paper, existing methods of fault tolerance in mobile agents, which are considered in a linear network topology, are described. In these methods, three agents cooperate with each other to detect and recover from server and agent failures. The three types of agents are: the actual agent, which performs programs for its owner; the witness agent, which monitors the actual agent and the witness agent after itself; and the probe, which is sent on the witness agent's side to recover the actual agent or the witness agent. The communication mechanism in these methods is message passing between the agents. We introduce our witness agent approach for fault-tolerant mobile agent systems in a Two Dimensional Mesh (2D-Mesh) network. Our approach minimizes witness dependency in this network, and we present its algorithm.

  15. Synthesis of Fault-Tolerant Schedules with Transparency/Performance Trade-offs for Distributed Embedded Systems

    Izosimov, Viacheslav; Pop, Paul; Eles, Petru;


    In this paper we present an approach to the scheduling of fault-tolerant embedded systems for safety-critical applications. Processes and messages are statically scheduled, and we use process re-execution for recovering from multiple transient faults. If process recovery is performed such that the operation of other processes is not affected, we call it transparent recovery. Although transparent recovery has the advantages of fault containment, improved debuggability and less memory needed to store the fault-tolerant schedules, it will introduce delays that can violate the timing constraints... process graph, where the fault occurrence information is represented as conditional edges and the transparent recovery is captured using synchronization nodes.

  16. A Method and Tool for Analyzing Fault-Tolerance in Systems


    on automated analysis of fault-tolerant systems, partly because the protocols of interest are more typical of software than hardware, and of the state space of interesting software systems is often infeasible. 1.2 Overview of this Work This thesis explores a specialized approach to...Transformation Notation In their work on applying HAZOP and FMECA to computer-based systems, McDermid et al. [FM93,FMNP94,MNPF95] have developed an

  17. Fault-tolerant control of linear uncertain systems using H∞ robust predictive control

    Chen Xueqin; Geng Yunhai; Zhang Yingchun; Wang Feng


    The robust fault-tolerant control problem of linear uncertain systems is studied. It is shown that a solution for this problem can be obtained from an H∞ robust predictive controller (RMPC) by the method of linear matrix inequality (LMI). This approach has the advantages of both H∞ control and MPC: robustness and the ability to handle constraints explicitly. The robust closed-loop stability of the linear uncertain system with input and output constraints is proven under actuator and sensor fault conditions. Finally, satisfactory results of simulation experiments verify the validity of this algorithm.

  18. An adaptive fuzzy design for fault-tolerant control of MIMO nonlinear uncertain systems


    This paper presents a novel control method for accommodating actuator faults in a class of multiple-input multiple-output (MIMO) nonlinear uncertain systems. The designed control scheme can tolerate both the time-varying lock-in-place and loss of effectiveness actuator faults. In each subsystem of the considered MIMO system, the controller is obtained from a backstepping procedure; an adaptive fuzzy approximator with minimal learning parameterization is employed to approximate the package of unknown nonlinear f...

  19. Observer-based fault-tolerant control for a class of nonlinear networked control systems

    Mahmoud, M. S.; Memon, A. M.; Shi, Peng


    This paper presents a fault-tolerant control (FTC) scheme for nonlinear systems which are connected in a networked control system. The nonlinear system is first transformed into two subsystems such that the unobservable part is affected by a fault and the observable part is unaffected. An observer is then designed which gives state estimates using a Luenberger observer and also estimates an unknown parameter of the system; this helps in fault estimation. The FTC is applied in the presence of sampling caused by the network in the loop. The controller gain is obtained using the linear-quadratic regulator technique. The methodology is applied to a mechatronic system and the results show satisfactory performance.

  20. Efficient and Low-Cost Fault Tolerance for Web-Scale Systems

    Serafini, Marco


    Online Web-scale services are being increasingly used to handle critical personal information. The trend towards storing and managing such information on the “cloud” is extending the need for dependable services to a growing range of Web applications, from emailing, to calendars, storage of photos, or finance. This motivates the increased adoption of fault-tolerant replication algorithms in Web-scale systems, ranging from classic, strongly-consistent replication in systems such as Chubby [Bur0...

  1. Adaptive fault-tolerant control of linear time-invariant systems in the presence of actuator saturation

    Wei GUAN; Guanghong YANG


    This paper studies the problem of designing adaptive fault-tolerant controllers for linear time-invariant systems with actuator saturation. New methods for designing indirect adaptive fault-tolerant controllers via state feedback are presented for actuator fault compensations. Based on the on-line estimation of eventual faults, the adaptive fault-tolerant controller parameters are updated automatically to compensate for the fault effects on systems. The designs are developed in the framework of the linear matrix inequality (LMI) approach, which can enlarge the domain of attraction of closed-loop systems in the cases of actuator saturation and actuator failures. Two examples are given to illustrate the effectiveness of the design method.

  2. Robust H-infinity fault-tolerant control for uncertain descriptor systems by dynamical compensators

    Bing LIANG; Guangren DUAN


    The problem of robust H-infinity fault-tolerant control against sensor failures for a class of uncertain descriptor systems via dynamical compensators is considered. Based on H-infinity theory in descriptor systems, a sufficient condition for the existence of dynamical compensators with H-infinity fault-tolerant function is derived and expressions for the gain matrices in the compensators are presented. The dynamical compensator guarantees that the resultant closed-loop system is admissible; furthermore, it maintains certain H-infinity norm performance in the normal condition as well as in the event of sensor failures and parameter uncertainties. A numerical example shows the effect of the proposed method.

  3. Feasibility analysis and design of a fault tolerant computing system: a TMR microprocessor system design of 64-Bit COTS microprocessors

    Eken, Huseyin Baha


    The purpose of this thesis is to analyze and determine the feasibility of implementing a fault tolerant computing system that is able to function in the presence of radiation induced Single Event Upsets (SEU) by using the Triple Modular Redundancy (TMR) technique with 64-bit Commercial-Off-The-Shelf (COTS) microprocessors. Due to the radiation environment in space, electronic devices must be designed to tolerate the radiation effects. While there are radiation-hardened devices that can toler...
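
    The core of TMR is a 2-of-3 majority vote over the outputs of three redundant modules. The abstract does not describe the thesis's voter, so the following is only a generic software sketch of a bitwise vote over 64-bit words; `tmr_vote` and `disagreeing_modules` are hypothetical names.

```python
MASK64 = (1 << 64) - 1  # keep results within a 64-bit word


def tmr_vote(a: int, b: int, c: int) -> int:
    """Bitwise 2-of-3 majority of three redundant 64-bit words."""
    return ((a & b) | (b & c) | (a & c)) & MASK64


def disagreeing_modules(a: int, b: int, c: int) -> list[int]:
    """Indices of modules whose word differs from the voted word."""
    voted = tmr_vote(a, b, c)
    return [i for i, word in enumerate((a, b, c)) if word != voted]


good = 0x0123456789ABCDEF
upset = good ^ (1 << 17)  # simulate a single-bit upset in module 1
voted = tmr_vote(good, upset, good)
suspects = disagreeing_modules(good, upset, good)
```

    Because the vote is taken per bit, a single upset in one module is masked as long as the other two modules agree on that bit; the list of disagreeing modules could then be used to flag a module for resynchronization.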

  4. Fault Tolerant Frequent Pattern Mining

    Shohdy, Sameh; Vishnu, Abhinav; Agrawal, Gagan


    FP-Growth algorithm is a Frequent Pattern Mining (FPM) algorithm that has been extensively used to study correlations and patterns in large scale datasets. While several researchers have designed distributed memory FP-Growth algorithms, it is pivotal to consider fault tolerant FP-Growth, which can address the increasing fault rates in large scale systems. In this work, we propose a novel parallel, algorithm-level fault-tolerant FP-Growth algorithm. We leverage algorithmic properties and MPI advanced features to guarantee an O(1) space complexity, achieved by using the dataset memory space itself for checkpointing. We also propose a recovery algorithm that can use in-memory and disk-based checkpointing, though in many cases the recovery can be completed without any disk access, and incurring no memory overhead for checkpointing. We evaluate our FT algorithm on a large scale InfiniBand cluster with several large datasets using up to 2K cores. Our evaluation demonstrates excellent efficiency for checkpointing and recovery in comparison to the disk-based approach. We have also observed 20x average speed-up in comparison to Spark, establishing that a well designed algorithm can easily outperform a solution based on a general fault-tolerant programming model.

  5. Advanced information processing system: The Army Fault-Tolerant Architecture detailed design overview

    Harper, Richard E.; Babikyan, Carol A.; Butler, Bryan P.; Clasen, Robert J.; Harris, Chris H.; Lala, Jaynarayan H.; Masotto, Thomas K.; Nagle, Gail A.; Prizant, Mark J.; Treadwell, Steven


    The Army Avionics Research and Development Activity (AVRADA) is pursuing programs that would enable effective and efficient management of the large amounts of situational data that occur during tactical rotorcraft missions. The Computer Aided Low Altitude Night Helicopter Flight Program has identified automated Terrain Following/Terrain Avoidance, Nap of the Earth (TF/TA, NOE) operation as a key enabling technology for advanced tactical rotorcraft to enhance mission survivability and mission effectiveness. The processing of critical information at low altitudes with short reaction times is life-critical and mission-critical, necessitating an ultra-reliable, high-throughput computing platform that provides dependable service for flight control, fusion of sensor data, route planning, near-field/far-field navigation, and obstacle avoidance operations. To address these needs, the Army Fault Tolerant Architecture (AFTA) is being designed and developed. This computer system is based upon the Fault Tolerant Parallel Processor (FTPP) developed by Charles Stark Draper Labs (CSDL). AFTA is a hard real-time, Byzantine fault-tolerant parallel processor which is programmed in the Ada language. This document describes the results of the Detailed Design (Phase 2 and 3 of a 3-year project) of the AFTA development. It contains detailed descriptions of the program objectives, the TF/TA NOE application requirements, architecture, hardware design, operating systems design, systems performance measurements and analytical models.

  6. Service for fault tolerance in the Ad Hoc Networks based on Multi Agent Systems

    Ghalem Belalem


    Ad hoc networks are distributed, self-organized networks that do not require infrastructure. In such networks, mobile infrastructures are subject to disconnections, which may be voluntary or involuntary and are caused by the high mobility in the ad hoc network. This work contributes to solving these problems and to ensuring continuous service by proposing a fault tolerance service based on Multi Agent Systems (MAS), which predicts problems and makes decisions in relation to critical nodes. Our work studies the prediction of voluntary and involuntary disconnections in the ad hoc network; we therefore propose a fault tolerance service that allows effective distribution of information in the network by selecting some objects of the network to hold duplicates of the information.

  7. Fault Tolerant Software: a Multi Agent System Solution

    Caponetti, Fabio; Bergantino, Nicola; Longhi, Sauro


    Development of highly dependable systems remains a labour intensive task. This paper explores recent advances in the adaptation of the software agent architecture for control applications with regard to dependability issues. Multiple agent systems theory will be reviewed giving methods to supervise...

  8. Fault-Tolerant Onboard Monitoring and Decision Support Systems

    Lajic, Zoran

    The purpose of this research project is to improve current onboard decision support systems. Special focus is on the onboard prediction of the instantaneous sea state. In this project a new approach to increasing the overall reliability of a monitoring and decision support system has been...

  9. Robust Fault-Tolerant Tracking Control for Nonlinear Networked Control System: Asynchronous Switched Polytopic Approach

    Chaoyang Dong


    This paper is concerned with the robust fault-tolerant tracking control problem for networked control systems (NCS). Firstly, considering the locally overlapped switching law that widely exists in engineering applications, the NCS is modeled as a locally overlapped switched polytopic system to reduce design conservatism and solving complexity. Then, switched parameter dependent fault-tolerant tracking controllers are constructed to deal with the asynchronous switching phenomenon caused by the updating delays of the switching signals and weighted coefficients. Additionally, the global uniform asymptotic stability in the mean (GUAS-M) and desired weighted l2 performance are guaranteed by combining the switched parameter dependent Lyapunov functional method with the average dwell time (ADT) method, and the feasible conditions for the fault-tolerant tracking controllers are obtained in the form of linear matrix inequalities (LMIs). Finally, the performance of the proposed approach is verified on a highly maneuverable technology (HiMAT) vehicle's tracking control problem. Simulation results show the effectiveness of the proposed method.

  10. Fault-Tolerant Scheduling for Real-Time Embedded Control Systems

    Chun-Hua Yang; Geert Deconinck; Wei-Hua Gui


    With the increasing complexity of industrial application, an embedded control system (ECS) requires processing a number of hard real-time tasks and needs fault-tolerance to assure high reliability. Considering the characteristics of real-time tasks in ECS, an integrated algorithm is proposed to schedule real-time tasks and to guarantee that all real-time tasks are completed before their deadlines even in the presence of faults. Based on the nonpreemptive critical-section protocol (NCSP), this paper analyzes the blocking time introduced by resource conflicts of relevancy tasks in fault-tolerant multiprocessor systems. An extended schedulability condition is presented to check the assignment feasibility of a given task to a processor. A primary/backup approach and on-line replacement of failed processors are used to tolerate processor failures. The analysis reveals that the integrated algorithm bounds the blocking time, requires limited overhead on the number of processors, and still assures good processor utilization. This is also demonstrated by simulation results. Both analysis and simulation show the effectiveness of the proposed algorithm in ECS.

  11. Plan for the Characterization of HIRF Effects on a Fault-Tolerant Computer Communication System

    Torres-Pomales, Wilfredo; Malekpour, Mahyar R.; Miner, Paul S.; Koppen, Sandra V.


    This report presents the plan for the characterization of the effects of high intensity radiated fields on a prototype implementation of a fault-tolerant data communication system. Various configurations of the communication system will be tested. The prototype system is implemented using off-the-shelf devices. The system will be tested in a closed-loop configuration with extensive real-time monitoring. This test is intended to generate data suitable for the design of avionics health management systems, as well as redundancy management mechanisms and policies for robust distributed processing architectures.

  12. Local rollback for fault-tolerance in parallel computing systems

    Blumrich, Matthias A [Yorktown Heights, NY; Chen, Dong [Yorktown Heights, NY; Gara, Alan [Yorktown Heights, NY; Giampapa, Mark E [Yorktown Heights, NY; Heidelberger, Philip [Yorktown Heights, NY; Ohmacht, Martin [Yorktown Heights, NY; Steinmacher-Burow, Burkhard [Boeblingen, DE; Sugavanam, Krishnan [Yorktown Heights, NY


    A control logic device performs a local rollback in a parallel super computing system. The super computing system includes at least one cache memory device. The control logic device determines a local rollback interval. The control logic device runs at least one instruction in the local rollback interval. The control logic device evaluates whether an unrecoverable condition occurs while running the at least one instruction during the local rollback interval. The control logic device checks whether an error occurs during the local rollback. The control logic device restarts the local rollback interval if the error occurs and the unrecoverable condition does not occur during the local rollback interval.

  13. Fault-tolerant Agreement in Synchronous Message-passing Systems

    Raynal, Michel


    The present book focuses on the way to cope with the uncertainty created by process failures (crash, omission failures and Byzantine behavior) in synchronous message-passing systems (i.e., systems whose progress is governed by the passage of time). To that end, the book considers fundamental problems that distributed synchronous processes have to solve. These fundamental problems concern agreement among processes (if processes are unable to agree in one way or another in presence of failures, no non-trivial problem can be solved). They are consensus, interactive consistency, k-set agreement an

  14. Fault-tolerant quantum computation -- a dynamical systems approach

    Fern, Jesse; Kempe, Julia; Simic, Slobodan; Sastry, Shankar


    We apply a dynamical systems approach to concatenation of quantum error correcting codes, extending and generalizing the results of Rahn et al. [8] to both diagonal and nondiagonal channels. Our point of view is global: instead of focusing on particular types of noise channels, we study the geometry of the coding map as a discrete-time dynamical system on the entire space of noise channels. In the case of diagonal channels, we show that any code with distance at least three corrects (in the infinite concatenation limit) an open set of errors. For CSS codes, we give a more precise characterization of that set. We show how to incorporate noise in the gates, thus completing the framework. We derive some general bounds for noise channels, which allow us to analyze several codes in detail.

  15. Fault tolerant computer control for a Maglev transportation system

    Lala, Jaynarayan H.; Nagle, Gail A.; Anagnostopoulos, George


    Magnetically levitated (Maglev) vehicles operating on dedicated guideways at speeds of 500 km/hr are an emerging transportation alternative to short-haul air and high-speed rail. They have the potential to offer a service significantly more dependable than air and with less operating cost than both air and high-speed rail. Maglev transportation derives these benefits by using magnetic forces to suspend a vehicle 8 to 200 mm above the guideway. Magnetic forces are also used for propulsion and guidance. The combination of high speed, short headways, stringent ride quality requirements, and a distributed offboard propulsion system necessitates high levels of automation for the Maglev control and operation. Very high levels of safety and availability will be required for the Maglev control system. This paper describes the mission scenario, functional requirements, and dependability and performance requirements of the Maglev command, control, and communications system. A distributed hierarchical architecture consisting of vehicle on-board computers, wayside zone computers, a central computer facility, and communication links between these entities was synthesized to meet the functional and dependability requirements on the maglev. Two variations of the basic architecture are described: the Smart Vehicle Architecture (SVA) and the Zone Control Architecture (ZCA). Preliminary dependability modeling results are also presented.

  16. Robust fault tolerant control based on sliding mode method for uncertain linear systems with quantization.

    Hao, Li-Ying; Yang, Guang-Hong


    This paper is concerned with the robust fault-tolerant compensation control problem for uncertain linear systems subject to both state and input signal quantization. By successfully incorporating a novel matrix full-rank factorization technique into the sliding surface design, the total failure of certain actuators can be coped with under a special actuator redundancy assumption. In order to compensate for quantization errors, an adjustment range of quantization sensitivity for a dynamic uniform quantizer is given through the flexible choices of design parameters. Compared with the existing results, the derived inequality condition leads to stronger fault tolerance ability and a much wider scope of applicability. With a static adjustment policy of quantization sensitivity, an adaptive sliding mode controller is then designed to maintain the sliding mode, where the gain of the nonlinear unit vector term is updated automatically to compensate for the effects of actuator faults, quantization errors, exogenous disturbances and parameter uncertainties without the need for a fault detection and isolation (FDI) mechanism. Finally, the effectiveness of the proposed design method is illustrated via a structural-acoustic model of a rocket fairing.

  17. Fault Tolerant Neural Network for ECG Signal Classification Systems

    MERAH, M.


    The aim of this paper is to apply a new robust hardware Artificial Neural Network (ANN) for ECG classification systems. This ANN includes a penalization criterion which improves its performance in terms of robustness. Specifically, in this method, the ANN weights are normalized using the auto-prune method. Simulations performed on the MIT-BIH ECG signals have shown that significant robustness improvements are obtained regarding potential hardware artificial neuron failures. Moreover, we show that the proposed design achieves better generalization performance compared to the standard back-propagation algorithm.

  18. Fault Tolerance, Reliability and Testability for Distributed Systems.


    [Figure 2.1: scanned text unrecoverable] ... Journal, vol. 18, no. 2, p. 244, 1979 DAVI81 E. A. Davis and P. K. Giloth, "No. 4 ESS: performance objectives and service experience". Bell System...Technical Journal, vol. 60, no. 6, pp. 1203-1224, August, 1981 DICI79 V. DiCiccio, C.A. Sunshine, J.A. Field, and E.G. Manning, "Alternatives for

  19. Research on robust fault-tolerant control for networked control system with packet dropout

    Huo Zhihong; Fang Huajing


    A kind of networked control system with network-induced delay and packet dropout, modeled as an asynchronous dynamical system, was tested, and the integrity design of the networked control system with sensor and actuator failures was analyzed using hybrid systems techniques based on robust fault-tolerant control theory. The parametric expression of the controller is given based on the feasible solution of a linear matrix inequality. The simulation results, provided on the basis of detailed theoretical analysis, further demonstrate the validity of the proposed scheme.

  20. Research on fault-tolerant control of networked control systems based on information scheduling

    Huo Zhihong; Zhang Zhixue; Fang Huajing


    A networked control system is studied; the networked control system with noise disturbance is modeled based on information scheduling and control co-design. An augmented state matrix analysis method is introduced, and the robust fault-tolerant control problem of networked control systems with noise disturbance under actuator failures is studied. The parametric expression of the controller under actuator failures is given. Furthermore, the result is analyzed by simulation tests, which show that it not only ensures the stability of the networked control system but also decreases the amount of data in the network channel and makes full use of the network resources.

  1. (m,n)-Semirings and a Generalized Fault-Tolerance Algebra of Systems

    Syed Eqbal Alam


    We propose a new class of mathematical structures called (m,n)-semirings (which generalize the usual semirings) and describe their basic properties. We define partial ordering and generalize the concepts of congruence, homomorphism, and so forth, for (m,n)-semirings. Following earlier work by Rao (2008), we consider systems made up of several components whose failures may cause them to fail and represent the set of such systems algebraically as an (m,n)-semiring. Based on the characteristics of these components, we present a formalism to compare the fault-tolerance behavior of two systems using our framework of a partially ordered (m,n)-semiring.

  2. Energy-Efficient Deterministic Fault-Tolerant Scheduling for Embedded Real-Time Systems

    LI Guo-hui; HU Fang-xiao; DU Xiao-kun; TANG Xiang-hong


    By combining fault-tolerance with power management, this paper developed a new method for aperiodic task set for the problem of task scheduling and voltage allocation in embedded real-time systems. The schedulability of the system was analyzed through checkpointing and the energy saving was considered via dynamic voltage and frequency scaling. Simulation results showed that the proposed algorithm had better performance compared with the existing voltage allocation techniques. The proposed technique saves 51.5% energy over FT-Only and 19.9% over FT+EC on average. Therefore, the proposed method was more appropriate for aperiodic tasks in embedded real-time systems.



    Fault tolerance in microprocessor systems has become a popular topic of architecture research. Much work has been done at different levels to achieve reliability against soft errors, and some fault tolerance architectures have been proposed. However, little attention has been paid to thread-level superscalar fault tolerance. This letter introduces the microthread concept into the superscalar processor fault tolerance domain and puts forward a novel fault tolerance architecture, namely the MicroThread Based (MTB) coarse-grained transient fault tolerance superscalar processor architecture, and then discusses some detailed implementations.

  4. Task Migration for Fault-Tolerance in Mixed-Criticality Embedded Systems

    Saraswat, Prabhat Kumar; Pop, Paul; Madsen, Jan


    In this paper we are interested in mixed-criticality embedded applications implemented on distributed architectures. Depending on their time-criticality, tasks can be hard or soft real-time and regarding safety-criticality, tasks can be fault-tolerant to transient faults, permanent faults, or have...... no dependability requirements. We use Earliest Deadline First (EDF) scheduling for the hard tasks and the Constant Bandwidth Server (CBS) for the soft tasks. The CBS parameters determine the quality of service (QoS) of soft tasks. Transient faults are tolerated using checkpointing with rollback recovery....... For tolerating permanent faults in processors, we use task migration, i.e., restarting the safety-critical tasks on other processors. We propose a Greedy-based online heuristic for the migration of safety-critical tasks, in response to permanent faults, and the adjustment of CBS parameters on the target...

  5. Design and RAMS Analysis of a Fault-Tolerant Computer Control System

    WANG Shuai; JI Yindong; DONG Wei; YANG Shiyuan


    This paper presents a fault-tolerant computer system. It is designed as a double 2-out-of-2 architecture based on component redundancy techniques. A quantitative probabilistic model is also presented for evaluating the reliability, availability, maintainability and safety (RAMS) of this architecture. Hierarchical modeling and Markov modeling methods are used in the RAMS analysis to evaluate the system characteristics. The double 2-out-of-2 system is compared with two other systems, the all voting triple modular redundancy (AVTMR) system and the dual-duplex system. According to the results, the double 2-out-of-2 system has the highest dependability. In particular, the system can satisfy safety integrity level (SIL) 4, which means that the system's probability of catastrophic failure is less than or equal to 10^-8 per hour; it can therefore be applied to life-critical systems such as high-speed railway systems.

  6. Fault Tolerant Approach for Data Encryption and Digital Signature Based on ECC System

    ZENG Yong; MA Jian-feng


    An integrated fault tolerant approach for data encryption and digital signature based on elliptic curve cryptography is proposed. This approach allows the receiver to verify the sender's identity and can simultaneously deal with error detection and data correction. Up to three errors can be detected and corrected in our approach. This approach has at least the same security as that based on the RSA system, but uses smaller keys to achieve the same level of security. Our approach is more efficient than the known ones and better suited to limited environments like personal digital assistants (PDAs), mobile phones and smart cards without RSA coprocessors.

  7. Fault tolerant synchronization of chaotic heavy symmetric gyroscope systems versus external disturbances via Lyapunov rule-based fuzzy control.

    Farivar, Faezeh; Shoorehdeli, Mahdi Aliyari


    In this paper, fault tolerant synchronization of chaotic gyroscope systems versus external disturbances via Lyapunov rule-based fuzzy control is investigated. Taking the general nature of faults in the slave system into account, a new synchronization scheme, namely, fault tolerant synchronization, is proposed, by which the synchronization can be achieved no matter whether the faults and disturbances occur or not. By making use of a slave observer and a Lyapunov rule-based fuzzy control, fault tolerant synchronization can be achieved. Two techniques are considered as control methods: classic Lyapunov-based control and Lyapunov rule-based fuzzy control. On the basis of Lyapunov stability theory and fuzzy rules, the nonlinear controller and some generic sufficient conditions for global asymptotic synchronization are obtained. The fuzzy rules are directly constructed subject to a common Lyapunov function such that the error dynamics of two identical chaotic motions of symmetric gyros satisfy stability in the Lyapunov sense. Two proposed methods are compared. The Lyapunov rule-based fuzzy control can compensate for the actuator faults and disturbances occurring in the slave system. Numerical simulation results demonstrate the validity and feasibility of the proposed method for fault tolerant synchronization.

  8. Fault Tolerant Wind Farm Control

    Odgaard, Peter Fogh; Stoustrup, Jakob


    with best at a wind turbine control level. However, some faults are better dealt with at the wind farm control level, if the wind turbine is located in a wind farm. In this paper a benchmark model for fault detection and isolation, and fault tolerant control of wind turbines implemented at the wind farm...... control level is presented. The benchmark model includes a small wind farm of nine wind turbines, based on simple models of the wind turbines as well as the wind and interactions between wind turbines in the wind farm. The model includes wind and power references scenarios as well as three relevant fault...... scenarios. This benchmark model is used in an international competition dealing with Wind Farm fault detection and isolation and fault tolerant control....

  9. Filtering and fault tolerant control of parameter-varying time-delay systems and applications

    Mohammadpour Velni, Javad

    This dissertation addresses some open problems in control systems theory. The problems considered include the dynamic controller and filter design for Linear Parameter Varying (LPV) time-delay systems, the reconfigurable control design in Fault Tolerant Control Systems (FTCS) and fault diagnostics in diesel engines. In the first part of this thesis, we investigate the problem of designing parameter-dependent filters for output estimation of LPV time-delay systems. The filters are designed such that the filtering error system guarantees an optimum level of H2 or H∞ performance. A state-delay term is included in the filter dynamics to reduce the design conservatism and improve the performance. The Linear Matrix Inequality (LMI)-based synthesis conditions developed for the filter design purposes are categorized into the rate-dependent and delay-dependent conditions which could handle the time-varying state-delay and bounded small delay cases, respectively. Among these two, the latter one is shown to provide a significant reduction in the conservativeness in the filter design. The second part of the thesis examines the analysis and synthesis of Fault Tolerant Control (FTC) systems in an LPV framework. For reconfigurable control design purposes, the information from the Fault Detection and Isolation (FDI) module, which provides an estimate of the fault parameters, is utilized to schedule the controller matrices. We will also present a formulation that incorporates the factor of detection delay in the FTC supervisory system. It is shown that including this delay in the synthesis conditions leads to improved performance and reduced control effort. For analysis of the FTC systems including time-delay, where the fault parameters might be identified inaccurately, we first introduce the notion of brief instability for LPV time-delay systems. In these systems it is possible that the output trajectory converges to zero even though there are parameter trajectories for which

  10. A multi-layer robust adaptive fault tolerant control system for high performance aircraft

    Huo, Ying

    Modern high-performance aircraft demand advanced fault-tolerant flight control strategies. Not only control effector failures but also aerodynamic failures such as wing-body damage often result in substantially deteriorated performance because of low available redundancy. As a result, the remaining control actuators may yield substantially lower maneuvering capabilities, which do not permit the accomplishment of the aircraft's originally specified mission. The problem is to solve the control reconfiguration on the available control redundancies when a mission modification is needed to save the aircraft. The proposed robust adaptive fault-tolerant control (RAFTC) system consists of a multi-layer reconfigurable flight controller architecture. It contains three layers accounting for different types and levels of failures, including sensor, actuator, and fuselage damage. In the case of nominal operation with possible minor failure(s), a standard adaptive controller achieves the control allocation. This is referred to as the first layer, the controller layer. The performance adjustment is accounted for in the second layer, the reference layer, whose role is to adjust the reference model in the controller design with a degraded transient performance. The uppermost adjustment, that of the mission, takes place in the third layer, the mission layer, when the original mission is not feasible with greatly restricted control capabilities. The modified mission is achieved through the optimization of the command signal, which guarantees the boundedness of the closed-loop signals. The main distinguishing feature of this layer is the mission decision property based on the currently available resources. The contribution of the research is the multi-layer fault-tolerant architecture that can address the complete failure scenarios and their accommodation in practice. Moreover, the emphasis is on the mission design capabilities which may guarantee the stability of the aircraft with restricted post

  11. Fault-tolerant Control of Discrete-time LPV systems using Virtual Actuators and Sensors

    Tabatabaeipour, Mojtaba; Stoustrup, Jakob; Bak, Thomas


    This paper proposes a new fault-tolerant control (FTC) method for discrete-time linear parameter varying (LPV) systems using a reconfiguration block. The basic idea of the method is to achieve the FTC goal without re-designing the nominal controller by inserting a reconfiguration block between......, it transforms the output of the controller for the faulty system such that the stability and performance goals are preserved. Input-to-state stabilizing LPV gains of the virtual actuator and sensor are obtained by solving linear matrix inequalities (LMIs). We show that separate design of these gains guarantees...... Finally, the effectiveness of the method is demonstrated via a numerical example and stator current control of an induction motor.

  12. Fault diagnosis and fault-tolerant control based on adaptive control approach

    Shen, Qikun; Shi, Peng


    This book presents recent theoretical developments in and practical applications of fault diagnosis and fault tolerant control for complex dynamical systems, including uncertain systems and linear and nonlinear systems. Combining adaptive control techniques with other control methodologies, it investigates the problems of fault diagnosis and fault tolerant control for uncertain dynamic systems with or without time delay. As such, the book gives readers a solid understanding of fault diagnosis and fault tolerant control based on adaptive control technology. Given its depth and breadth, it is well suited for undergraduate and graduate courses on linear system theory, nonlinear system theory, fault diagnosis and fault tolerant control techniques. Further, it can be used as a reference source for academic research on fault diagnosis and fault tolerant control, and for postgraduates in the field of control theory and engineering.

  13. Fault Diagnosis and Fault-tolerant Control of Modular Multi-level Converter High-voltage DC System

    Liu, Hui; Ma, Ke; Wang, Chao


    The Modular Multilevel Converter based High Voltage Direct Current (MMC-HVDC) configuration is a promising solution for efficient grid integration and bulk power transmission over long distances. However, the large number of series-connected identical modules in an MMC may increase the probability...... strategies of MMC-HVDC systems for the most common faults occurring in MMC-HVDC systems, covering MMC faults, DC side faults as well as AC side faults. An important part of this paper is devoted to a discussion of the vulnerable spots as well as the failure mechanisms of the MMC-HVDC system covering switching...

  14. Multi-Core Technology for Fault Tolerant High-Performance Spacecraft Computer Systems

    Behr, Peter M.; Haulsen, Ivo; Van Kampenhout, J. Reinier; Pletner, Samuel


    The current architectural trends in the field of multi-core processors can provide an enormous increase in processing power by exploiting the parallelism available in many applications. In particular because of their high energy efficiency, it is obvious that multi-core processor-based systems will also be used in future space missions. In this paper we present the system architecture of a powerful optical sensor system based on the eight core multi-core processor P4080 from Freescale. The fault tolerant structure and the highly effective FDIR concepts implemented on different hardware and software levels of the system are described in detail. The space application scenario and thus the main requirements for the sensor system have been defined by a complex tracking sensor application for autonomous landing or docking manoeuvres.

  15. Performance analysis of fault-tolerant systems in parallel execution of conversations

    Kim, K. H.; Heu, Shin; Yang, Seung M.


    The execution overhead inherent in the conversation scheme, which is a scheme for realizing fault-tolerant cooperating processes free of the domino effect, is analyzed. Multiprocessor/multicomputer systems capable of parallel execution of conversation components are considered and a queuing network model of such systems is adopted. Based on the queuing model, various performance indicators, including system throughput, average number of processors idling inside a conversation due to the synchronization required, and average time spent in the conversation, have been evaluated numerically for several application environments. The numeric results are discussed and several essential performance characteristics of the conversation scheme are derived. For example, when the number of participant processes is not large, say less than six, the system performance is highly affected by the synchronization required on the processes in a conversation, and not so much by the probability of acceptance-test failure.

  16. Fault-Tolerant Grid Architecture and Practice

    JIN Hai(金海); ZOU DeQing(邹德清); CHEN HanHua(陈汉华); SUN JianHua(孙建华); WU Song(吴松)


    Grid computing has emerged as an effective technology to couple geographically distributed resources and solve large-scale computational problems in wide area networks. Fault tolerance is a significant and complex issue in grid computing systems. Various techniques have been investigated to detect and correct faults in distributed computing systems. Unreliable fault detection is one of the most effective techniques. Globus, as a grid middleware, manages resources in a wide area network. The Globus fault detection service uses well-known techniques based on unreliable fault detectors to detect and report component failures. However, more powerful techniques are required to detect and correct both system-level and application-level faults in a grid system, and a convenient toolkit is also needed to maintain consistency in the grid. A fault-tolerant grid platform (FTGP) based on an unreliable fault detector and the Globus fault detection service is presented in this paper. The platform offers effective strategies in three areas: grid key components, user tasks, and high-level applications.
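
    The idea of an unreliable fault detector can be sketched with a generic heartbeat timeout. This is only an illustration under stated assumptions, not the FTGP or Globus implementation: the class name, method names, component names, and the 3-second timeout are all hypothetical.

```python
class HeartbeatDetector:
    """Timeout-based failure detector; its suspicions can be wrong (hypothetical sketch)."""

    def __init__(self, timeout):
        self.timeout = timeout   # seconds of silence before a component is suspected
        self.last_seen = {}      # component name -> time of its last heartbeat

    def heartbeat(self, name, now):
        """Record a heartbeat from `name` received at time `now`."""
        self.last_seen[name] = now

    def suspects(self, now):
        """Return the sorted names whose last heartbeat is older than the timeout."""
        return sorted(n for n, t in self.last_seen.items()
                      if now - t > self.timeout)


detector = HeartbeatDetector(timeout=3.0)
detector.heartbeat("scheduler", now=0.0)
detector.heartbeat("storage", now=0.0)
detector.heartbeat("scheduler", now=2.5)
suspected_at_4 = detector.suspects(now=4.0)   # "storage" has been silent for 4.0 s
detector.heartbeat("storage", now=4.5)        # a late heartbeat arrives
suspected_at_5 = detector.suspects(now=5.0)   # the earlier suspicion is withdrawn
```

    The detector is "unreliable" in the sense that a slow but healthy component can be suspected and later cleared, which is why the abstract pairs detection with further recovery strategies.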

  17. Model-Based Fault Tolerant Control for Hybrid Dynamic Systems with Sensor Faults

    杨浩; 冒泽慧; 姜斌


    A model-based fault tolerant control approach for hybrid linear dynamic systems is proposed in this paper. The proposed method, taking advantage of reliable control, can maintain the performance of the faulty system during the time delay of fault detection and diagnosis (FDD) and fault accommodation (FA), which can be regarded as the first line of defence against sensor faults. Simulation results of a three-tank system with a sensor fault are given to show the efficiency of the method.

  18. A Fault-Tolerant Emergency-Aware Access Control Scheme for Cyber-Physical Systems

    Wu, Guowei; Xia, Feng; Yao, Lin


    Access control is an issue of paramount importance in cyber-physical systems (CPS). In this paper, an access control scheme, namely FEAC, is presented for CPS. FEAC can not only control access to data in normal situations, but also adaptively assign emergency roles and permissions to specific subjects and inform subjects without explicit access requests so that emergency situations are handled in a proactive manner. In FEAC, emergency-group and emergency-dependency are introduced. Emergencies are processed in sequence within a group and in parallel among groups. A priority and dependency model called PD-AGM is used to select the optimal response-action execution path, aiming to eliminate all emergencies that have occurred within the system. Fault-tolerant access control policies are used to address failures in emergency management. A case study of a hospital medical care application shows the effectiveness of FEAC.

  19. Backstepping Design of Adaptive Neural Fault-Tolerant Control for MIMO Nonlinear Systems.

    Gao, Hui; Song, Yongduan; Wen, Changyun


    In this paper, an adaptive controller is developed for a class of multi-input and multioutput nonlinear systems with neural networks (NNs) used as a modeling tool. It is shown that all the signals in the closed-loop system with the proposed adaptive neural controller are globally uniformly bounded for any external input in L[₀,∞]. In our control design, the upper bound of the NN modeling error and the gains of external disturbance are characterized by unknown upper bounds, which is more rational to establish the stability in the adaptive NN control. Filter-based modification terms are used in the update laws of unknown parameters to improve the transient performance. Finally, fault-tolerant control is developed to accommodate actuator failure. An illustrative example applying the adaptive controller to control a rigid robot arm shows the validation of the proposed controller.

  20. Fault Tolerant External Memory Algorithms

    Jørgensen, Allan Grønlund; Brodal, Gerth Stølting; Mølhave, Thomas


    Algorithms dealing with massive data sets are usually designed for I/O-efficiency, often captured by the I/O model by Aggarwal and Vitter. Another aspect of dealing with massive data is how to deal with memory faults, e.g. captured by the adversary based faulty memory RAM by Finocchi and Italiano....... However, current fault tolerant algorithms do not scale beyond the internal memory. In this paper we investigate for the first time the connection between I/O-efficiency in the I/O model and fault tolerance in the faulty memory RAM, and we assume that both memory and disk are unreliable. We show a lower...... bound on the number of I/Os required for any deterministic dictionary that is resilient to memory faults. We design a static and a dynamic deterministic dictionary with optimal query performance as well as an optimal sorting algorithm and an optimal priority queue. Finally, we consider scenarios where...

  1. Robot Position Sensor Fault Tolerance

    Aldridge, Hal A.


    Robot systems in critical applications, such as those in space and nuclear environments, must be able to operate during component failure to complete important tasks. One failure mode that has received little attention is the failure of joint position sensors. Current fault tolerant designs require the addition of directly redundant position sensors which can affect joint design. A new method is proposed that utilizes analytical redundancy to allow for continued operation during joint position sensor failure. Joint torque sensors are used with a virtual passive torque controller to make the robot joint stable without position feedback and improve position tracking performance in the presence of unknown link dynamics and end-effector loading. Two Cartesian accelerometer based methods are proposed to determine the position of the joint. The joint specific position determination method utilizes two triaxial accelerometers attached to the link driven by the joint with the failed position sensor. The joint specific method is not computationally complex and the position error is bounded. The system wide position determination method utilizes accelerometers distributed on different robot links and the end-effector to determine the position of sets of multiple joints. The system wide method requires fewer accelerometers than the joint specific method to make all joint position sensors fault tolerant but is more computationally complex and has lower convergence properties. Experiments were conducted on a laboratory manipulator. Both position determination methods were shown to track the actual position satisfactorily. A controller using the position determination methods and the virtual passive torque controller was able to servo the joints to a desired position during position sensor failure.

  2. Fault tolerant operation of switched reluctance machine

    Wang, Wei

    The energy crisis and environmental challenges have driven industry towards more energy efficient solutions. With nearly 60% of electricity consumed by various electric machines in the industry sector, advancement in the efficiency of the electric drive system is of vital importance. An adjustable speed drive system (ASDS) provides excellent speed regulation and dynamic performance as well as dramatically improved system efficiency compared with conventional motors without electronic drives. Industry has witnessed tremendous growth in ASDS applications not only as a driving force but also as an electric auxiliary system for replacing bulky and low efficiency auxiliary hydraulic and mechanical systems. With the vast penetration of ASDS, its fault tolerant operation capability is more widely recognized as an important feature of drive performance, especially for aerospace and automotive applications and other industrial drive applications demanding high reliability. The Switched Reluctance Machine (SRM), a low cost, highly reliable electric machine with fault tolerant operation capability, has drawn substantial attention in the past three decades. Nevertheless, the SRM is not free of faults. Certain faults such as converter faults, sensor faults, winding shorts, eccentricity and position sensor faults are commonly shared among all ASDS. In this dissertation, a thorough understanding of various faults and their influence on the transient and steady state performance of the SRM is developed via simulation and experimental study, providing the necessary knowledge for fault detection and post-fault management. Lumped parameter models are established for fast real time simulation and drive control. Based on the behavior of the faults, a fault detection scheme is developed for the purpose of fast and reliable fault diagnosis. In order to improve the SRM power and torque capacity under faults, the maximum torque per ampere excitation is conceptualized and validated through theoretical analysis and

  3. Methods and apparatuses for self-generating fault-tolerant keys in spread-spectrum systems

    Moradi, Hussein; Farhang, Behrouz; Subramanian, Vijayarangam


    Self-generating fault-tolerant keys for use in spread-spectrum systems are disclosed. At a communication device, beacon signals are received from another communication device and impulse responses are determined from the beacon signals. The impulse responses are circularly shifted to place a largest sample at a predefined position. The impulse responses are converted to a set of frequency responses in a frequency domain. The frequency responses are shuffled with a predetermined shuffle scheme to develop a set of shuffled frequency responses. A set of phase differences is determined as a difference between an angle of the frequency response and an angle of the shuffled frequency response at each element of the corresponding sets. Each phase difference is quantized to develop a set of secret-key quantized phases and a set of spreading codes is developed wherein each spreading code includes a corresponding phase of the set of secret-key quantized phases.
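
    The processing chain described above (circular shift, frequency transform, shuffle, phase difference, quantization) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the patented method: the function names, the four-sample responses, the shuffle permutation, and the four quantization levels are all hypothetical, and the final step of building spreading codes from the quantized phases is not shown.

```python
import cmath
import math


def circular_shift_peak(h, position=0):
    """Rotate the list h so its largest-magnitude sample lands at `position`."""
    peak = max(range(len(h)), key=lambda i: abs(h[i]))
    k = (peak - position) % len(h)
    return h[k:] + h[:k]


def dft(x):
    """Naive discrete Fourier transform (adequate for short responses)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * f * t / n) for t in range(n))
            for f in range(n)]


def quantized_phase_key(h, permutation, levels=4):
    """Map an impulse response to a list of quantized phase-difference indices."""
    spectrum = dft(circular_shift_peak(h))
    shuffled = [spectrum[p] for p in permutation]   # predetermined shuffle
    step = 2 * math.pi / levels
    key = []
    for a, b in zip(spectrum, shuffled):
        diff = (cmath.phase(a) - cmath.phase(b)) % (2 * math.pi)
        key.append(int(round(diff / step)) % levels)
    return key


shuffle = [1, 2, 3, 0]            # hypothetical shuffle scheme
h_a = [0.1, 0.2, 1.0, 0.3]        # response estimated at one device
h_b = [0.3, 0.1, 0.2, 1.0]        # same response observed with a timing offset
key_a = quantized_phase_key(h_a, shuffle)
key_b = quantized_phase_key(h_b, shuffle)
```

    In this toy case the circular shift removes the timing offset, so both devices derive the same key from their own estimates without exchanging it.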

  4. Design of fault tolerant control system for individual blade control helicopters

    Tamayo, Sergio

    This dissertation presents the development of a fault tolerant control scheme for helicopters fitted with individually controlled blades. This novel approach attempts to improve fault tolerant capabilities of helicopter control system by increasing control redundancy using additional actuators for individual blade input and software re-mixing to obtain nominal or close to nominal conditions under failure. An advanced interactive simulation environment has been developed including modeling of sensor failure, swashplate actuator failure, individual blade actuator failure, and blade delamination to support the design, testing, and evaluation of the control laws. This simulation environment is based on the blade element theory for the calculation of forces and moments generated by the main rotor. This discretized model allows for individual blade analysis, which in turn allows measuring the consequences of a stuck blade, or loss of the surface area of the blade itself, with respect to the dynamics of the whole helicopter. The control laws are based on non-linear dynamic inversion and artificial neural network augmentation, which is a mix of linear and nonlinear methods that compensates for model inaccuracies due to linearization or failure. A stability analysis based on the Lyapunov function approach has shown that bounded tracking error is guaranteed, and under specific circumstances, global stability is guaranteed as well. An analysis over the degrees of freedom of the mechanical system and its impact over the helicopter handling qualities is also performed to measure the degree of redundancy achieved with the addition of individual blade actuators as compared to a classic swashplate helicopter configuration. Mathematical analysis and numerical simulation, using reconfiguration of the individual blade control under failure have shown that this control architecture can potentially improve the survivability of the aircraft and reduce pilot workload under failure

  5. A Fault-tolerance Estimating Method for Ionosphere Corrections in Satellite Navigation System

    GAO Shuliang; LI Rui; HUANG Zhigang


    To obtain reliable estimates of the ionosphere differential corrections for a satellite navigation system in the presence of ionosphere anomalies, a fault-tolerance estimating method based on distributed Kalman filtering is proposed. The method uses parallel sub-filters to estimate the ionosphere differential corrections. In addition, an infinite norm (IN) method is proposed for detecting ionosphere irregularities during filter processing. Once an anomaly is detected, the sub-filter contaminated by the anomalous measurements is excluded to ensure the reliability of the estimates. A simulation is conducted to validate the method, and the results indicate that the anomaly can be found in a timely manner thanks to the novel fault detection method based on the infinite norm. Because of the parallel sub-filter architecture, the measurements are classified by spatial distribution so that the ionosphere anomaly can be located and excluded more easily. Thus, the method can provide robust and accurate ionosphere differential corrections.
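
    The detect-and-exclude step can be illustrated with a small sketch. It assumes that each sub-filter exposes a residual vector and a scalar correction estimate, that the test compares the infinity norm of the residual with a fixed threshold, and that the surviving estimates are simply averaged; the paper's distributed Kalman filter fusion is not reproduced here, and all names and numbers are hypothetical.

```python
def infinity_norm(vector):
    """Largest absolute component of a vector."""
    return max(abs(v) for v in vector)


def fuse_healthy(estimates, residuals, threshold):
    """Average the estimates of sub-filters whose residual passes the norm test.

    estimates: one correction estimate per sub-filter
    residuals: one residual vector per sub-filter
    Returns (fused_estimate, indices_of_excluded_sub_filters).
    """
    healthy, excluded = [], []
    for i, (est, res) in enumerate(zip(estimates, residuals)):
        if infinity_norm(res) > threshold:
            excluded.append(i)       # contaminated sub-filter is left out
        else:
            healthy.append(est)
    if not healthy:
        raise ValueError("all sub-filters flagged; no estimate available")
    return sum(healthy) / len(healthy), excluded


estimates = [2.0, 2.2, 9.0]                        # hypothetical corrections
residuals = [[0.1, -0.2], [0.3, 0.1], [0.2, -4.0]]
fused, excluded = fuse_healthy(estimates, residuals, threshold=1.0)
```

    Here the third sub-filter has one residual component of magnitude 4.0, so it is excluded and the fused value is formed from the first two only.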

  6. Neural-Network-Based Adaptive Decentralized Fault-Tolerant Control for a Class of Interconnected Nonlinear Systems.

    Li, Xiao-Jian; Yang, Guang-Hong


    This paper is concerned with the adaptive decentralized fault-tolerant tracking control problem for a class of uncertain interconnected nonlinear systems with unknown strong interconnections. An algebraic graph theory result is introduced to address the considered interconnections. In addition, to achieve the desirable tracking performance, a neural-network-based robust adaptive decentralized fault-tolerant control (FTC) scheme is given to compensate the actuator faults and system uncertainties. Furthermore, via the Lyapunov analysis method, it is proven that all the signals of the resulting closed-loop system are semiglobally bounded, and the tracking errors of each subsystem exponentially converge to a compact set, whose radius is adjustable by choosing different controller design parameters. Finally, the effectiveness and advantages of the proposed FTC approach are illustrated with two simulated examples.

  7. A Constraint Logic Programming Framework for the Synthesis of Fault-Tolerant Schedules for Distributed Embedded Systems

    Poulsen, Kåre Harbo; Pop, Paul; Izosimov, Viacheslav


    We present a constraint logic programming (CLP) approach for synthesis of fault-tolerant hard real-time applications on distributed heterogeneous architectures. We address time-triggered systems, where processes and messages are statically scheduled based on schedule tables. We use process re...

  8. Task Mapping and Bandwidth Reservation for Mixed Hard/Soft Fault-Tolerant Embedded Systems

    In this paper we are interested in mixed hard/soft real-time fault-tolerant applications mapped on distributed heterogeneous architectures. We use the Earliest Deadline First (EDF) scheduling for the hard real-time tasks and the Constant Bandwidth Server (CBS) for the soft tasks. The bandwidth...... reserved for the servers determines the quality of service (QoS) for soft tasks. CBS enforces temporal isolation, such that soft task overruns do not affect the timing guarantees of hard tasks. Transient faults in hard tasks are tolerated using checkpointing with rollback recovery. We have proposed a Tabu...... Search-based approach for task mapping and CBS bandwidth reservation, such that the deadlines for the hard tasks are satisfied, even in the case of transient faults, and the QoS for the soft tasks is maximized. Researchers have used fixed execution time models, such as the worst-case execution times...
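
    To make the role of the reserved bandwidth concrete, the sketch below applies the standard single-processor utilization bound for EDF combined with CBS: implicit-deadline hard tasks and servers are schedulable when the hard-task utilization plus the total server bandwidth does not exceed 1. This is only an admission check under simplifying assumptions (one processor, no checkpointing or recovery overhead). It is not the Tabu Search mapping approach of the paper, and the task and server parameters are invented for illustration.

```python
from fractions import Fraction


def total_utilization(hard_tasks, servers):
    """Sum of hard-task utilizations C/T and server bandwidths Q/P.

    hard_tasks: list of (wcet, period) pairs for EDF-scheduled hard tasks.
    servers:    list of (budget, period) pairs for the CBS servers of soft tasks.
    """
    u_hard = sum(Fraction(c, t) for c, t in hard_tasks)
    u_soft = sum(Fraction(q, p) for q, p in servers)
    return u_hard + u_soft


def admissible(hard_tasks, servers):
    """True when the utilization bound for EDF with CBS is respected."""
    return total_utilization(hard_tasks, servers) <= 1


# Hypothetical workload: two hard tasks and two CBS servers.
hard_tasks = [(2, 10), (3, 15)]
servers = [(1, 4), (1, 5)]

total = total_utilization(hard_tasks, servers)
ok = admissible(hard_tasks, servers)
overloaded = admissible(hard_tasks, servers + [(1, 5)])
```

    Exact fractions are used so that a workload sitting right at the bound is not misclassified by floating-point rounding.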

  9. Coordinated Fault Tolerance for High-Performance Computing

    Our work to meet our goal of end-to-end fault tolerance has focused on two areas: (1) improving fault tolerance in various software currently available and widely used throughout the HEC domain and (2) using fault information exchange and coordination to achieve holistic, systemwide fault tolerance and understanding how to design and implement interfaces for integrating fault tolerance features for multiple layers of the software stack—from the application, math libraries, and programming language runtime to other common system software such as jobs schedulers, resource managers, and monitoring tools.

  10. Modular, Fault-Tolerant Electronics Supporting Space Exploration Project

    National Aeronautics and Space Administration — Modern electronic systems tolerate only as many point failures as there are redundant system copies, using mere macro-scale redundancy. Fault Tolerant Electronics...

  11. Fault Tolerant Control: A Simultaneous Stabilization Result

    This paper discusses the problem of designing fault tolerant compensators that stabilize a given system both in the nominal situation, as well as in the situation where one of the sensors or one of the actuators has failed. It is shown that such compensators always exist, provided that the system...

  12. Fault tolerant control schemes using integral sliding modes

    The key attribute of a Fault Tolerant Control (FTC) system is its ability to maintain overall system stability and acceptable performance in the face of faults and failures within the feedback system. In this book Integral Sliding Mode (ISM) Control Allocation (CA) schemes for FTC are described, which have the potential to maintain close to nominal fault-free performance (for the entire system response), in the face of actuator faults and even complete failures of certain actuators. Broadly, the first approach designs an ISM controller around a model of the plant, with the aim of creating a nonlinear fault tolerant feedback controller whose closed-loop performance is established during the design process. The second approach involves retro-fitting an ISM scheme to an existing feedback controller to introduce fault tolerance. This may be advantageous from an industrial perspective, because fault tolerance can be introduced without changing the existing control loops. A high fidelity benchmark model of a large transport aircraft is u...

  13. A Direct Design from Input/Output Data of Fault-Tolerant Control System Based on GIMC Structure

    This paper deals with a design method for fault-tolerant control systems based on the Generalized Internal Model Control (GIMC) structure, which consists of a standard outer loop feedback controller and an extra inner loop controller. The distinguishing feature of the GIMC structure is that the controller design for performance and for robustness may be done separately. The outer loop controller is designed for nominal performance using some controller synthesis to meet the (nominal) control specification, while the inner loop controller is designed to make a trade-off between robustness and performance. This feature is suitable for fault-tolerant control: the outer loop controller is designed for the fault-free case, and the inner loop controller for the faulty case. In conventional methods, the inner loop controller is designed to maximize the robust stability margin without information on the fault. Therefore, the performance in the faulty case tends to be conservative. In this paper, the inner loop controller is designed directly from experimental data collected from the faulty system. Since the collected data contain information on the fault, the conservativeness of the conventional methods is reduced. The inner loop controller is designed by Virtual Reference Feedback Tuning (VRFT), a direct design method that uses input-output data without identifying any models. Since the complexity of the controller can be specified by the designer, no complexity reduction is required, which is advantageous for implementation. The effectiveness of the proposed design method is confirmed by an experiment.

  14. Validation Methods Research for Fault-Tolerant Avionics and Control Systems Sub-Working Group Meeting. CARE 3 peer review

    A computer aided reliability estimation procedure (CARE 3), developed to model the behavior of ultrareliable systems required by flight-critical avionics and control systems, is evaluated. The mathematical models, numerical method, and fault-tolerant architecture modeling requirements are examined, and the testing and characterization procedures are discussed. Recommendations aimed at enhancing CARE 3 are presented; in particular, the need for a better exposition of the method and the user interface is emphasized.

  15. Noise Threshold and Resource Cost of Fault-Tolerant Quantum Computing with Majorana Fermions in Hybrid Systems

    Fault-tolerant quantum computing in systems composed of both Majorana fermions and topologically unprotected quantum systems, e.g., superconducting circuits or quantum dots, is studied in this Letter. Errors caused by topologically unprotected quantum systems need to be corrected with error-correction schemes, for instance, the surface code. We find that the error-correction performance of such a hybrid topological quantum computer is not superior to a normal quantum computer unless the topological charge of Majorana fermions is insusceptible to noise. If errors changing the topological charge are rare, the fault-tolerance threshold is much higher than the threshold of a normal quantum computer and a surface-code logical qubit could be encoded in only tens of topological qubits instead of about 1,000 normal qubits.

  16. Integrated Design of Fault Tolerant Control System Based on Fault Tolerant Observer

    Fault Detection and Diagnosis (FDD) is the first step in fault tolerant control system design. The feature of our design is the integration of FDD with the fault tolerant controller, which was rarely done in previous designs. Our integration differs in that, under sensor failures, we can asymptotically estimate the real states of the system. Asymptotic estimation is discussed in considerable detail in section 2. The integrated design method is observer-based. For a class of linear control systems with sensor faults, our fault tolerant observer can do two things: (1) asymptotic estimation of the real states of the system, (2) detection of sensor failures. Our observer, unlike the generally used one, is able to compensate for the fault signal, thus making asymptotic estimation possible. Simulation results for sensor faults in the heading control system of an autonomous underwater vehicle (AUV), given in Figs. 2 through 5, show preliminarily the effectiveness of our design.

  17. Active fault tolerant control of piecewise affine systems with reference tracking and input constraints

    performance of the faulty system are held. The design of the supervisory scheme is not considered here. The set of controllers is composed of a normal controller for the fault-free case, an active fault detection and isolation controller for isolation and identification of the faults, and a set of passive...... the reference signal while the control inputs are bounded. The PFTC problem is transformed into a feasibility problem of a set of LMIs. The method is applied on a large-scale live-stock ventilation model....

  18. System Wide Joint Position Sensor Fault Tolerance in Robot Systems Using Cartesian Accelerometers

    Joint position sensors are necessary for most robot control systems. A single position sensor failure in a normal robot system can greatly degrade performance. This paper presents a method to obtain position information from Cartesian accelerometers without integration. Depending on the number and location of the accelerometers, the proposed system can tolerate the loss of multiple position sensors. A solution technique suitable for real-time implementation is presented. Simulations were conducted using 5 triaxial accelerometers to recover from the loss of up to 4 joint position sensors on a 7 degree of freedom robot moving in general three dimensional space. The simulations show good estimation performance using non-ideal accelerometer measurements.

  19. Fault Tolerant Consensus of Multi-Agent Systems with Linear Dynamics

    This paper deals with the consensus problem of linear multi-agent systems with actuator faults. A fault estimator based consensus protocol is provided, together with a convergence analysis. It is shown that the consensus errors of all agents will converge to a small set around the origin, if parameters in the consensus protocol are properly chosen. A numerical example is given to illustrate the effectiveness of the proposed protocol.

  20. Fault diagnosis and fault-tolerant control of photovoltaic micro-inverter

    An observer-based fault diagnosis method and a fault tolerant control for open-switch faults and current sensor faults are proposed for the interleaved flyback converters of a micro-inverter system. First, based on the topology of a grid-connected micro-inverter, a mathematical model of the flyback converters is established. Second, a state observer is applied to estimate the currents online and generate corresponding residuals. The fault is diagnosed by comparing the residuals with the thresholds. Finally, a fault-tolerant control that consists of a fault-tolerant topology for the faulty switch and a simple software redundancy control for the faulty current sensor is proposed to achieve fault-tolerant operation. The feasibility and effectiveness of the proposed method have been verified by simulation and experimental results.
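
    The residual-versus-threshold comparison mentioned in this abstract can be sketched as follows. The sketch assumes the observer's current estimates are already available as a list. The persistence counter, which requires several consecutive violations before a fault is declared, is an added assumption to suppress isolated spikes and is not described in the source. All names and numbers are hypothetical.

```python
def first_fault_index(measured, estimated, threshold, persistence=3):
    """Index at which a fault is declared, or None if no fault is declared.

    The residual is |measured - estimated|. A fault is declared once the
    residual has exceeded the threshold for `persistence` consecutive samples.
    """
    consecutive = 0
    for k, (y, y_hat) in enumerate(zip(measured, estimated)):
        if abs(y - y_hat) > threshold:
            consecutive += 1
            if consecutive >= persistence:
                return k
        else:
            consecutive = 0  # an isolated spike resets the counter
    return None


# Hypothetical current samples: one noise spike at k=2, sensor stuck at zero from k=4.
estimated = [1.0, 1.1, 1.2, 1.3, 1.4, 1.5, 1.6]
measured = [1.02, 1.08, 1.9, 1.31, 0.0, 0.0, 0.0]

fault_at = first_fault_index(measured, estimated, threshold=0.5)
no_fault = first_fault_index(estimated, estimated, threshold=0.5)
```

    With these values the spike at k=2 is ignored and the stuck sensor is flagged at the third consecutive violation.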

  1. Thermoelectric-Driven Sustainable Sensing and Actuation Systems for Fault-Tolerant Nuclear Incidents

    safety systems, etc. Such an approach is intrinsically fault tolerant: in the event that system temperatures increase, the amount of available energy will increase, which will make more power available for applications. The system can also be used during normal conditions to provide enhanced monitoring of key system components.

  2. Robust Fault-Tolerant Control for Uncertain Networked Control Systems with State-Delay and Random Data Packet Dropout

    A robust fault-tolerant controller design problem for a networked control system (NCS) with random packet dropout in both the sensor-to-controller link and the controller-to-actuator link is investigated. A novel stochastic NCS model with state-delay, model uncertainty, disturbance, probabilistic sensor failure, and actuator failure is proposed. The random packet dropout, sensor failures, and actuator failures are characterized by a binary random variable. The sufficient condition for asymptotical mean-square stability of the NCS is derived and the closed-loop NCS satisfies H∞ performance constraints caused by the random packet dropout and disturbance. The fault-tolerant controller is designed by solving a linear matrix inequality. A numerical example is presented to illustrate the effectiveness of the proposed method.

  3. Optimal Heater Control with Technology of Fault Tolerance for Compensating Thermoforming Preheating System

    The adjustment of heater power is very important because the distribution of thickness strongly depends on the distribution of sheet temperature. In this paper, the steady-state optimum distribution of heater power is found by numerical optimization in order to obtain a uniform sheet temperature. In the following step, the optimal heater power distribution with a damaged heater is found using fault tolerance techniques, which can be used to reduce the repair time when some heaters are damaged. The merit of this work is that the design variable is the power of each heater, which can be used directly in the preheating process of thermoforming.

  4. Low-Cost Fault Tolerant Methodology for Real Time MPSoC Based Embedded System

    We propose a design methodology for a fault tolerant homogeneous MPSoC with additional design objectives that include low hardware overhead and performance. We have implemented three different FT methodologies on MPSoCs and compared them against the defined constraints. The comparison of these FT methodologies is carried out by modelling their architectures in VHDL-RTL, on a Spartan 3 FPGA. The results obtained through simulations helped us to identify the most relevant scheme in terms of the given design constraints.

  5. Industrial Cost-Benefit Assessment for Fault-tolerant Control Systems

    against failure. The paper describes the assessments needed to find the right path for new industrial designs. The economic decisions in the design phase are discussed: cost of different failures, profits associated with available benefits, investments needed for development and life-time support....... The objective of this paper is to help, in the early product development stage, to find the economically most suitable scheme. A salient result is that with increased customer awareness of total cost of ownership, new products can benefit significantly from applying fault tolerant control principles....

  6. Fault-Tolerant Process Control Methods and Applications

    Fault-Tolerant Process Control focuses on the development of general, yet practical, methods for the design of advanced fault-tolerant control systems; these ensure an efficient fault detection and a timely response to enhance fault recovery, prevent faults from propagating or developing into total failures, and reduce the risk of safety hazards. To this end, methods are presented for the design of advanced fault-tolerant control systems for chemical processes which explicitly deal with actuator/controller failures and sensor faults and data losses. Specifically, the book puts forward: · a framework for detection, isolation and diagnosis of actuator and sensor faults for nonlinear systems; · controller reconfiguration and safe-parking-based fault-handling methodologies; · integrated-data- and model-based fault-detection and isolation and fault-tolerant control methods; · methods for handling sensor faults and data losses; and · ...

  7. Fault tolerant control based on active fault diagnosis

    An active fault diagnosis (AFD) method will be considered in this paper in connection with a Fault Tolerant Control (FTC) architecture based on the YJBK parameterization of all stabilizing controllers. The architecture consists of a fault diagnosis (FD) part and a controller reconfiguration (CR...

  8. Wind turbine fault detection and fault tolerant control

    In this updated edition of a previous wind turbine fault detection and fault tolerant control challenge, we present a more sophisticated wind turbine model and updated fault scenarios to enhance the realism of the challenge and therefore the value of the solutions. This paper describes the challe...

  9. Fault Tolerant Ethernet Based Network for Time Sensitive Applications in Electrical Power Distribution Systems

    The paper analyses and experimentally verifies the deployment of Ethernet-based network technology to enable fault-tolerant and timely exchange of data among a number of high voltage protective relays. The relays use a proprietary serial communication line to exchange real-time data on the state of their high voltage circuitry, facilitating fast protection switching in case of critical failures. The digital serial signal is first fed into a PCM multiplexer, where it is mapped to the corresponding E1 (2 Mbit/s) time division multiplexed signal. The resulting E1 frames are then packetized and sent through the Ethernet control LAN to the opposite PCM demultiplexer, where the same processing is done in reverse, finally sending the signal into the opposite protective relay. The challenge of this setup is to assure very timely delivery of the control information between protective relays even in the case of failures of the Ethernet network itself. The tolerance of the Ethernet network to faults is assured using the widespread per-VLAN Rapid Spanning Tree Protocol, potentially extended by 1+1 PCM protection as a valuable option.

  10. Evaluation of Simple Causal Message Logging for Large-Scale Fault Tolerant HPC Systems

    The era of petascale computing brought machines with hundreds of thousands of processors. The next generation of exascale supercomputers will make available clusters with millions of processors. In those machines, the mean time between failures will range from a few minutes to a few tens of minutes, making the crash of a processor the common case instead of a rarity. Parallel applications running on those large machines will need to simultaneously survive crashes and maintain high productivity. To achieve that, fault tolerance techniques will have to go beyond checkpoint/restart, which requires all processors to roll back in case of a failure. Incorporating some form of message logging will provide a framework where only a subset of processors are rolled back after a crash. In this paper, we discuss why a simple causal message logging protocol seems a promising alternative to provide fault tolerance in large supercomputers. As opposed to pessimistic message logging, it has low latency overhead, especially in collective communication operations. Besides, it saves messages when more than one thread is running per processor. Finally, we demonstrate that a simple causal message logging protocol has a faster recovery and a low performance penalty when compared to checkpoint/restart. Running NAS Parallel Benchmarks (CG, MG and BT) on 1024 processors, simple causal message logging has a latency overhead below 5%.

  11. Parallel and distributed computation for fault-tolerant object recognition

    The distributed associative memory (DAM) model is suggested for distributed and fault-tolerant computation as it relates to object recognition tasks. The fault-tolerance is with respect to geometrical distortions (scale and rotation), noisy inputs, occlusion/overlap, and memory faults. An experimental system was developed for fault-tolerant structure recognition which shows the feasibility of such an approach. The approach is further extended to the problem of multisensory data integration and applied successfully to the recognition of colored polyhedral objects.

  12. Development of an interface for an ultrareliable fault-tolerant control system and an electronic servo-control unit

    The NASA Ames Research Center sponsors a research program for the investigation of Intelligent Flight Control Actuation systems. The use of artificial intelligence techniques in conjunction with algorithmic techniques for autonomous, decentralized fault management of flight-control actuation systems is explored under this program. The design, development, and operation of the interface for laboratory investigation of this program is documented. The interface, architecturally based on the Intel 8751 microcontroller, is an interrupt-driven system designed to receive a digital message from an ultrareliable fault-tolerant control system (UFTCS). The interface links the UFTCS to an electronic servo-control unit, which controls a set of hydraulic actuators. It was necessary to build a UFTCS emulator (also based on the Intel 8751) to provide signal sources for testing the equipment.

  13. A Dynamic Slack Management Technique for Real-Time Distributed Embedded System with Enhanced Fault Tolerance and Resource Constraints

    This project work aims to develop a dynamic slack management technique for real-time distributed embedded systems to reduce the total energy consumption in addition to satisfying timing, precedence and resource constraints. The proposed Slack Distribution Technique considers a modified Feedback Control Scheduling (FCS) algorithm. This algorithm schedules dependent tasks effectively with precedence and resource constraints. It further minimizes the schedule length and utilizes the available slack to increase the energy efficiency. A fault-tolerant mechanism using a deferred-active-backup scheme increases the schedulability and provides reliability to the system.

  14. Fault-tolerant quantum computation

    The discovery of quantum error correction has greatly improved the long-term prospects for quantum computing technology. Encoded quantum information can be protected from errors that arise due to uncontrolled interactions with the environment, or due to imperfect implementations of quantum logical operations. Recovery from errors can work effectively even if occasional mistakes occur during the recovery procedure. Furthermore, encoded quantum information can be processed without serious propagation of errors. In principle, an arbitrarily long quantum computation can be performed reliably, provided that the average probability of error per gate is less than a certain critical value, the accuracy threshold. It may be possible to incorporate intrinsic fault tolerance into the design of quantum computing hardware, perhaps by invoking topological Aharonov-Bohm interactions to process quantum information.

  15. Massive Sensor Array Fault Tolerance: Tolerance Mechanism and Fault Injection for Validation

    As today's machines become increasingly complex in order to handle intricate tasks, the number of sensors must increase for intelligent operations. Given the large number of sensors, detecting, isolating, and then tolerating faulty sensors is especially important. In this paper, we propose a fault tolerance architecture suitable for a massive sensor array often found in highly advanced systems such as autonomous robots. One example is the sensitive skin, a type of massive sensor array. The objective of the sensitive skin is autonomous guidance of machines in unknown environments, requiring extended operations at a remote site. The entirety of such a system needs to be able to work remotely without human attendance for an extended period of time. To that end, we propose a fault-tolerant architecture whereby component and analytical redundancies are integrated cohesively for effective failure tolerance of a massive array type sensor or sensor system. In addition, we discuss the evaluation results of the proposed tolerance scheme by means of fault injection and validation analysis as a measure of system reliability and performance.

  16. Guaranteed Cost Active Fault-tolerant Control of Networked Control System with Packet Dropout and Transmission Delay

    The problem of guaranteed cost active fault-tolerant controller (AFTC) design for networked control systems (NCSs) with both packet dropout and transmission delay is studied in this paper. Considering the packet dropout and transmission delay, a piecewise constant controller is adopted. With a guaranteed cost function, optimal controllers whose number is equal to the number of actuators are designed, and the design process is formulated as a convex optimization problem that can be solved by existing software. The control strategy is proposed as follows: when actuator failures appear, the fault detection and isolation unit sends the information to the controller choosing strategy, and then the optimal stabilizing controller with the smallest guaranteed cost value is chosen. Two illustrative examples are given to demonstrate the effectiveness of the proposed approach. A comparison with existing methods shows that our method performs better.

  17. A Dynamic Effective Fault Tolerance System in Robotic Manipulator using a Hybrid Neural Network based Controller

    Robot manipulators play an important role in the automobile industry, where they are mainly used in gas welding and in the manufacturing and assembly of motor parts. In complex trajectories, the speed of the robot manipulator is affected at each joint. For that reason, it is necessary to analyze the noise and vibration of the robot's joints in order to predict faults and to improve the control precision of the robotic manipulator. In this study we propose a new fault detection system for robot manipulators. The proposed hybrid fault detection system is based on a fuzzy support vector machine (SVM) and Artificial Neural Networks (ANNs). In this system the decoupled joints are identified and corrected using the fuzzy SVM, with non-linear signals used for the complete process and treatment; the ANNs are used to detect free-swinging and locked joints of the robot, and two types of neural predictors are also employed in the proposed adaptive neural network structure. The simulation results of the hybrid controller demonstrate the feasibility and performance of the methodology.

  18. Microcontroller-Based Fault Tolerant Data Acquisition System For Air Quality Monitoring And Control Of Environmental Pollution

    The design applied passive fault tolerance to a microcontroller-based data acquisition system to achieve the stated considerations, where redundant sensors and microcontrollers with associated circuitry were designed and implemented to enable measurement of pollutant concentration information from chimney vents in two industries. Microsoft Visual Basic was used to develop a data mining tool which implemented an underlying artificial neural network (ANN) model for forecasting pollutant concentrations for future time periods. The feed-forward backpropagation method was used to train the ANN model with a training data set, while a decision tree algorithm was used to select an optimal output result for the model from its two output neurons.

  19. Scheduling and Voltage Scaling for Energy/Reliability Trade-offs in Fault-Tolerant Time-Triggered Embedded Systems

    In this paper we present an approach to the scheduling and voltage scaling of low-power fault-tolerant hard real-time applications mapped on distributed heterogeneous embedded systems. Processes and messages are statically scheduled, and we use process re-execution for recovering from multiple tr...... are satisfied and the energy is minimized. We present a constraint logic programming-based approach which is able to find reliable and schedulable implementations within limited energy and hardware resources. The developed algorithms have been evaluated using extensive experiments....

  20. Study on inverter fault-tolerant operation of PMSM DTC


    This paper presents an investigation of inverter fault-tolerant operation for a permanent magnet synchronous motor (PMSM) direct torque control (DTC) system under various inverter faults. The performance of a faulty standard 6-switch inverter driven PMSM DTC system is analyzed. To avoid the loss or even disaster caused by the inverter faults, a topology-modified inverter with fault-tolerant capability is introduced, which is reconfigured as a 3-phase 4-switch inverter. The modeling of the 4-switch inverter is then analyzed and a novel DTC strategy with a unique nonlinear perpendicular flux observer and feedback compensation scheme is proposed for obtaining a continuous, disturbance-free drive system. The simulation and experimental results demonstrate that the proposed inverter fault-tolerant PMSM DTC system is able to operate stably and continuously with acceptable static and pretty good dynamic performance.

  1. Ship Propulsion System as a Benchmark for Fault-Tolerant Control

    Izadi-Zamanabadi, Roozbeh; Blanke, M.


    -tolerant control is a fairly new area. The paper presents a ship propulsion system as a benchmark that should be useful as a platform for development of new ideas and comparison of methods. The benchmark has two main elements. One is development of efficient FDI algorithms, the other is analysis and implementation...

  2. Fault tolerance and reliability in integrated ship control

    Nielsen, Jens Frederik Dalsgaard; Izadi-Zamanabadi, Roozbeh; Schiøler, Henrik


    Various strategies for achieving fault tolerance in large scale control systems are discussed. The positive and negative impacts of distribution through network communication are presented. The ATOMOS framework for standardized reliable marine automation is presented along with the corresponding...

  3. Fault-Tolerant Precision Formation Guidance for Interferometry Project

    National Aeronautics and Space Administration — A methodology is to be developed that will allow the development and implementation of fault-tolerant control system for distributed collaborative spacecraft. The...

  4. Distributed consensus and fault tolerance - Lecture 1

    In a world where clusters with thousands of nodes are becoming commonplace, we are often faced with the task of having them coordinate and share state. As the number of machines goes up, so does the probability that something goes wrong: a node could temporarily lose connectivity, crash because of some race condition, or have its hard drive fail. What are the challenges when designing fault-tolerant distributed systems, where a cluster is able to survive the loss of individual nodes? In this lecture, we will discuss some basics on this topic (consistency models, CAP theorem, failure modes, Byzantine faults), detail the Raft consensus algorithm, and showcase an interesting example of a highly resilient distributed system, Bitcoin.
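
    As a small illustration of how a cluster survives the loss of individual nodes, the sketch below computes the majority quorum that Raft relies on: a log entry is committed once a majority of the cluster has stored it, so a cluster of n nodes keeps making progress with up to (n - 1) // 2 crashed nodes. The function names are hypothetical, and the sketch covers only the quorum arithmetic, not leader election or log replication.

```python
def majority(cluster_size):
    """Smallest number of nodes that forms a majority of the cluster."""
    return cluster_size // 2 + 1


def tolerated_crashes(cluster_size):
    """Largest number of crashed nodes that still leaves a majority alive."""
    return (cluster_size - 1) // 2


def entry_committed(cluster_size, acks):
    """True when enough nodes (leader included) have stored the entry."""
    return acks >= majority(cluster_size)


# A five-node cluster needs three copies of an entry and survives two crashes.
quorum = majority(5)
crashes = tolerated_crashes(5)
committed = entry_committed(5, acks=3)
```

    Going from four to five nodes raises the number of tolerated crashes from one to two, while going from five to six does not raise it further, which is why odd cluster sizes are the usual choice.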

  5. Distributed consensus and fault tolerance - Lecture 2

    In a world where clusters with thousands of nodes are becoming commonplace, we are often faced with the task of having them coordinate and share state. As the number of machines goes up, so does the probability that something goes wrong: a node could temporarily lose connectivity, crash because of some race condition, or have its hard drive fail. What are the challenges when designing fault-tolerant distributed systems, where a cluster is able to survive the loss of individual nodes? In this lecture, we will discuss some basics on this topic (consistency models, CAP theorem, failure modes, Byzantine faults), detail the Raft consensus algorithm, and showcase an interesting example of a highly resilient distributed system, Bitcoin.

  6. Diagnosis and Fault-tolerant Control

    The book presents effective model-based analysis and design methods for fault diagnosis and fault-tolerant control. Architectural and structural models are used to analyse the propagation of the fault through the process, to test the fault detectability and to find the redundancies in the process...... the applicability of the presented methods. The theoretical results are illustrated by two running examples which are used throughout the book. The book addresses engineering students, engineers in industry and researchers who wish to get a survey over the variety of approaches to process diagnosis and fault...

  7. Fault-tolerant control for current sensors of doubly fed induction generators based on an improved fault detection method

    Fault-tolerant control of current sensors is studied in this paper to improve the reliability of a doubly fed induction generator (DFIG). A fault-tolerant control system of current sensors is presented for the DFIG, which consists of a new current observer and an improved current sensor fault...... detection algorithm, and fault-tolerant control system are investigated by simulation. The results indicate that the outputs of the observer and the sensor are highly coherent. The fault detection algorithm can efficiently detect both soft and hard faults in current sensors, and the fault-tolerant control...... system can effectively tolerate both types of faults. © 2013 Published by Elsevier Ltd. All rights reserved....

  8. Design and Verification of Fault-Tolerant Components

    We present a systematic approach to design and verification of fault-tolerant components with real-time properties as found in embedded systems. A state machine model of the correct component is augmented with internal transitions that represent hypothesized faults. Also, constraints...

  9. Electrical Steering of Vehicles - Fault-tolerant Analysis and Design

    The topic of this paper is systems that need to be designed such that no single fault can cause failure at the overall level. A methodology is presented for analysis and design of fault-tolerant architectures, where diagnosis and autonomous reconfiguration can replace high-cost triple redundancy sol...

  10. Fault tolerant programmable digital attitude control electronics study

    The attitude control electronics mechanization study to develop a fault tolerant autonomous concept for a three axis system is reported. Programmable digital electronics are compared to general purpose digital computers. The requirements, constraints, and tradeoffs are discussed. It is concluded that: (1) general fault tolerance can be achieved relatively economically, (2) recovery times of less than one second can be obtained, (3) the number of faulty behavior patterns must be limited, and (4) adjoined processes are the best indicators of faulty operation.

  11. Object Replication and CORBA Fault-Tolerant Object Service


    CORBA (Common Object Request Broker Architecture) provides 16 Common Object Services for distributed application development, but none of them are fault-tolerance related services. In this paper, we propose a replicated-object-based Fault-Tolerant Object Service (FTOS) for the CORBA environment. Two fault-tolerant mechanisms are provided in FTOS: a dynamic voting mechanism and an object replication mechanism. The dynamic voting mechanism uses a majority-voting strategy to ensure object state consistency in failure situations. The object replication mechanism helps system administrators replicate and start up objects easily. Our implementation provides a library in the style of COSS. With this library, programmers can develop distributed applications with fault-tolerance capability very easily.
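
    The majority-voting idea in this abstract can be illustrated with a minimal sketch. It is not the FTOS implementation and does not use any CORBA API. It assumes each reachable replica reports its object state as a hashable value, and that a state is accepted only when a strict majority of the whole replica group agrees on it. The name `majority_state` and its parameters are hypothetical.

```python
from collections import Counter
from typing import Hashable, Optional, Sequence


def majority_state(replies: Sequence[Hashable], group_size: int) -> Optional[Hashable]:
    """Return the state reported by a strict majority of the replica group.

    `replies` holds the states from replicas that answered; replicas that
    failed or are unreachable are simply absent. Returns None when no state
    reaches a strict majority of `group_size`.
    """
    if not replies:
        return None
    state, votes = Counter(replies).most_common(1)[0]
    return state if votes > group_size // 2 else None


if __name__ == "__main__":
    # Three replicas, one of them stale: the majority value wins.
    print(majority_state(["v2", "v2", "v1"], group_size=3))
    # Only two replicas answered and they disagree: no consistent state.
    print(majority_state(["v2", "v1"], group_size=3))
```

    A dynamic scheme would also adjust `group_size` as replicas join or leave; the sketch keeps it fixed for simplicity.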

  12. The New Fault Tolerant Onboard Computer for Microsatellite Missions


    This paper describes an onboard computer with dual processing modules. Each processing module is composed of a 32-bit ARM reduced instruction set computer processor and other commercial-off-the-shelf devices. A set of fault handling mechanisms is implemented in the computer system, which enables the system to tolerate a single fault. The onboard software is organized around a set of processes that communicate with each other through a routing process. Meeting an extremely tight set of constraints that include mass, volume, power consumption and space environmental conditions, the fault-tolerant onboard computer has excellent data processing capability that can meet the requirements of micro-satellite missions.

  13. Byzantine-fault tolerant self-stabilizing protocol for distributed clock synchronization systems

    Malekpour, Mahyar R. (Inventor)


    A rapid Byzantine self-stabilizing clock synchronization protocol that self-stabilizes from any state, tolerates bursts of transient failures, and deterministically converges within a linear convergence time with respect to the self-stabilization period. Upon self-stabilization, all good clocks proceed synchronously. The Byzantine self-stabilizing clock synchronization protocol does not rely on any assumptions about the initial state of the clocks. Furthermore, there is neither a central clock nor an externally generated pulse system. The protocol converges deterministically, is scalable, and self-stabilizes in a short amount of time. The convergence time is linear with respect to the self-stabilization period.


    This study addresses the design of fault-tolerant control systems built from standard personal computers: the ports were investigated, different structure versions were designed, and a method for choosing an optimal structure is suggested. First, the ÇİFTYAK system is defined and its working principle determined. Then the data transmission ports of standard personal computers are classified and analyzed. Next, the structure versions are designed and evaluated according to the data transmission methods used, the number of ports, and the criteria of reliability, performance, accuracy, control and cost. Finally, a method for choosing the optimal structure version is suggested.

  15. Fault Tolerant Control Using Gaussian Processes and Model Predictive Control

    Essential ingredients for fault-tolerant control are the ability to represent system behaviour following the occurrence of a fault, and the ability to exploit this representation for deciding control actions. Gaussian processes seem to be very promising candidates for the first of these, and model predictive control has a proven capability for the second. We therefore propose to use the two together to obtain fault-tolerant control functionality. Our proposal is illustrated by several reasonably realistic examples drawn from flight control.

  16. Enhancement of Fault Tolerance in Cloud Computing

    In recent years, researchers have been moving scientific applications to the cloud to reduce infrastructure cost, broaden the reach of teams, and ultimately foster more innovative application ideas. However, the cloud is still not as reliable or controllable as the grid, so the evolving cloud computing environment needs fault tolerance mechanisms that let a system work effectively even in the presence of failures. Moreover, large organizations are opting for hybrid clouds instead of private clouds. In this paper we therefore propose a new framework that makes the cloud usable for scientific applications and makes the public cloud a trustworthy platform. A progressive approach is introduced to achieve high fault tolerance in clouds by enabling a new workflow planning method that balances performance, reliability and cost for critical scientific applications, focusing mainly on the use of distributed resources for serial and concurrent workflow execution.

  17. Simulation modeling based method for choosing an effective set of fault tolerance mechanisms for real-time avionics systems

    In this paper, the reliability allocation problem (RAP) for real-time avionics systems (RTAS) is considered. The proposed method for solving this problem consists of two steps: (i) creation of an RTAS simulation model at the necessary level of abstraction and (ii) application of a metaheuristic algorithm to find an optimal solution (i.e., to choose an optimal set of fault tolerance techniques). When the execution time of some software components must be measured during the algorithm execution, simulation modeling is applied. The simulation modeling procedure itself consists of two steps: automatic construction of a simulation model of the RTAS configuration, and running this model in a simulation environment to measure the required time. This method was implemented as an experimental software tool. The tool works in cooperation with the DYANA simulation environment. The results of experiments with the implemented method are presented. Finally, future plans for development of the presented method and tool are briefly described.

  18. SABRE: a bio-inspired fault-tolerant electronic architecture.

    As electronic devices become increasingly complex, ensuring their reliable, fault-free operation is becoming correspondingly more challenging. It can be observed that, in spite of their complexity, biological systems are highly reliable and fault tolerant. Hence, we are motivated to take inspiration from biological systems in the design of electronic ones. In SABRE (self-healing cellular architectures for biologically inspired highly reliable electronic systems), we have designed a bio-inspired fault-tolerant hierarchical architecture for this purpose. As in biology, the foundation for the whole system is cellular in nature, with each cell able to detect faults in its operation and trigger intra-cellular or extra-cellular repair as required. At the next level in the hierarchy, arrays of cells are configured and controlled as function units in a transport triggered architecture (TTA), which is able to perform partial-dynamic reconfiguration to rectify problems that cannot be solved at the cellular level. Each TTA is, in turn, part of a larger multi-processor system which employs coarser grain reconfiguration to tolerate faults that cause a processor to fail. In this paper, we describe the details of operation of each layer of the SABRE hierarchy, and how these layers interact to provide a high systemic level of fault tolerance.

  19. A Fault-Tolerant Architecture for Parlay Application Server

    As the value-added service providing system in the Next-Generation Networks (NGN), Application Servers (AS) are required to provide the carrier-class reliability. To increase the reliability of AS, the fault-tolerant technology is often adopted. This paper proposes a fault-tolerant architecture for AS against single-point faults. The result of analysis shows that the architecture has a good reliability and is easily extendable. Such an advantage is attributed to a kind of special fault-tolerant design, which is different from others in that two Service Logic Program (SLP) instances do not only provide backups to each other, but also share them in the service traffic.

  20. Computer aided reliability, availability, and safety modeling for fault-tolerant computer systems with commentary on the HARP program

    Many of the most challenging reliability problems of our present decade involve complex distributed systems such as interconnected telephone switching computers, air traffic control centers, aircraft and space vehicles, and local area and wide area computer networks. In addition to the challenge of complexity, modern fault-tolerant computer systems require very high levels of reliability, e.g., avionic computers with MTTF goals of one billion hours. Most analysts find that it is too difficult to model such complex systems without computer aided design programs. In response to this need, NASA has developed a suite of computer aided reliability modeling programs beginning with CARE 3 and including a group of new programs such as: HARP, HARP-PC, Reliability Analysts Workbench (Combination of model solvers SURE, STEM, PAWS, and common front-end model ASSIST), and the Fault Tree Compiler. The HARP program is studied and how well the user can model systems using this program is investigated. One of the important objectives will be to study how user friendly this program is, e.g., how easy it is to model the system, provide the input information, and interpret the results. The experiences of the author and his graduate students who used HARP in two graduate courses are described. Some brief comparisons were made with the ARIES program which the students also used. Theoretical studies of the modeling techniques used in HARP are also included. Of course no answer can be any more accurate than the fidelity of the model, thus an Appendix is included which discusses modeling accuracy. A broad viewpoint is taken and all problems which occurred in the use of HARP are discussed. Such problems include: computer system problems, installation manual problems, user manual problems, program inconsistencies, program limitations, confusing notation, long run times, accuracy problems, etc.

  1. A Primer on Architectural Level Fault Tolerance

    This paper introduces the fundamental concepts of fault tolerant computing. Key topics covered are voting, fault detection, clock synchronization, Byzantine Agreement, diagnosis, and reliability analysis. Low level mechanisms such as Hamming codes or low level communications protocols are not covered. The paper is tutorial in nature and does not cover any topic in detail. The focus is on rationale and approach rather than detailed exposition.

  2. Fault Tolerant Parallel Filters Based On BCH Codes

    Digital filters are used in signal processing and communication systems. In some cases, the reliability of those systems is critical, and fault tolerant filter implementations are needed. Over the years, many techniques that exploit the filters’ structure and properties to achieve fault tolerance have been proposed. As technology scales, it enables more complex systems that incorporate many filters. In those complex systems, it is common that some of the filters operate in parallel, for example, by applying the same filter to different input signals. Recently, a simple technique that exploits the presence of parallel filters to achieve multiple fault tolerance has been presented. In this brief, that idea is generalized to show that parallel filters can be protected using Bose–Chaudhuri–Hocquenghem (BCH) codes, in which each filter is the equivalent of a bit in a traditional ECC. This new scheme allows more efficient protection when the number of parallel filters is large.
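
    The property that lets a filter play the role of a code bit is linearity: filtering the sum of several inputs gives the sum of the individual filter outputs. The sketch below illustrates only this underlying check with a single redundant filter, which detects a faulty output. It is not the BCH scheme of the paper, which adds several check filters so that faults can also be located and corrected. The helper names `fir` and `outputs_consistent` are hypothetical.

```python
from typing import List, Sequence


def fir(coeffs: Sequence[float], signal: Sequence[float]) -> List[float]:
    """Direct-form FIR filter: y[n] = sum over k of coeffs[k] * signal[n - k]."""
    out = []
    for n in range(len(signal)):
        acc = 0.0
        for k, h in enumerate(coeffs):
            if n - k >= 0:
                acc += h * signal[n - k]
        out.append(acc)
    return out


def outputs_consistent(coeffs, inputs, outputs, tol=1e-9) -> bool:
    """Check parallel filter outputs against one redundant check filter.

    The check filter processes the sample-wise sum of all inputs. By
    linearity its output must equal the sample-wise sum of all outputs.
    """
    summed_input = [sum(samples) for samples in zip(*inputs)]
    expected = fir(coeffs, summed_input)
    actual = [sum(samples) for samples in zip(*outputs)]
    return all(abs(e - a) <= tol for e, a in zip(expected, actual))


if __name__ == "__main__":
    coeffs = [1.0, 2.0, 1.0]
    inputs = [[1.0, 0.0, 2.0, 1.0], [0.0, 3.0, 1.0, 1.0], [2.0, 2.0, 0.0, 1.0]]
    outputs = [fir(coeffs, x) for x in inputs]
    print(outputs_consistent(coeffs, inputs, outputs))
    outputs[1][2] += 5.0  # inject a fault into one filter output
    print(outputs_consistent(coeffs, inputs, outputs))
```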

  3. Fault-Tolerant Mechanism of the Distributed Cluster Computers

    Distributed systems with high performance and stability are commonly adopted in large-scale scientific and engineering computing. In this paper, we discuss a fault-tolerant mechanism under the Linux environment to improve the fault-tolerant ability of the system, namely a scheme and framework for forming a stable computing platform. In terms of the structure and function of the distributed system, active list and file invocation strategies are employed in the task management. Multilevel system fault tolerance is achieved by repeating processes within a single node and by migrating tasks across nodes. The manager node agent introduced in this paper administers the nodes using the list and dispatches tasks according to the nodes' performance, and hence makes full use of the cluster resources. An evaluation method is proposed to appraise the performance. The analysis results show the usefulness of the proposed scheme, apart from some additional memory overhead.

  4. Scheduling of Fault-Tolerant Embedded Systems with Soft and Hard Timing Constraints

    fails or completes, incurs an unacceptable overhead. Thus, we use a quasi-static scheduling strategy, where a set of schedules is synthesized off-line and, at run time, the scheduler will select the right schedule based on the occurrence of faults and the actual execution times of processes...

  5. An Active Fault-Tolerant PWM Tracker for Unknown Nonlinear Stochastic Hybrid Systems: NARMAX Model and OKID-Based State-Space Self-Tuning Control

    An active fault-tolerant pulse-width-modulated tracker using the nonlinear autoregressive moving average with exogenous inputs model-based state-space self-tuning control is proposed for continuous-time multivariable nonlinear stochastic systems with unknown system parameters, plant noises, measurement noises, and inaccessible system states. Through observer/Kalman filter identification method, a good initial guess of the unknown parameters of the chosen model is obtained so as to reduce the identification process time and enhance the system performances. Besides, by modifying the conventional self-tuning control, a fault-tolerant control scheme is also developed. For the detection of fault occurrence, a quantitative criterion is exploited by comparing the innovation process errors estimated by the Kalman filter estimation algorithm. In addition, the weighting matrix resetting technique is presented by adjusting and resetting the covariance matrix of parameter estimates to improve the parameter estimation for faulty system recovery. The technique can effectively cope with partially abrupt and/or gradual system faults and/or input failures with fault detection.

  6. Research and application of Fault-Tolerance techniques in electric locomotive system

    Fault tolerance is one of the key means of ensuring the safety and reliability of system operation. This paper presents the fault-tolerance techniques commonly used in electric locomotive systems and their basic principles and, taking the Hexie (Harmony) series locomotives as an example, describes in detail how fault tolerance is realized in each subsystem of an electric locomotive.

  7. Electronic Power Switch for Fault-Tolerant Networks

    Power field-effect transistors reduce energy waste and simplify interconnections. Current switch containing power field-effect transistor (PFET) placed in series with each load in fault-tolerant power-distribution system. If system includes several loads and supplies, switches placed in series with adjacent loads and supplies. System of switches protects against overloads and losses of individual power sources.

  8. A continuous-time semi-Markov Bayesian belief network model for availability measure estimation of fault tolerant systems

    This work proposes a model for assessing the availability of fault tolerant systems based on the integration of continuous-time semi-Markov processes and Bayesian belief networks. This integration results in a hybrid stochastic model that is able to represent the dynamic characteristics of a system as well as to deal with cause-effect relationships among external factors such as environmental and operational conditions. The hybrid model also allows for uncertainty propagation on the system availability. A numerical procedure is also proposed for solving the state probability equations of semi-Markov processes described in terms of transition rates. The numerical procedure is based on the application of Laplace transforms that are inverted by the Gauss quadrature method known as Gauss-Legendre. The hybrid model and numerical procedure are illustrated by means of an example of application in the context of fault tolerant systems.

  9. SIFT - Multiprocessor architecture for Software Implemented Fault Tolerance flight control and avionics computers

    A brief description of a SIFT (Software Implemented Fault Tolerance) Flight Control Computer with emphasis on implementation is presented. A multiprocessor system that relies on software-implemented fault detection and reconfiguration algorithms is described. A high level reliability and fault tolerance is achieved by the replication of computing tasks among processing units.

  10. A droplet routing technique for fault-tolerant digital microfluidic devices

    Efficient droplet routing is one of the key approaches for realizing fault-tolerant microfluidic biochips. It requires that run-time diagnosis and fault recovery can be made possible in such systems. This paper describes a droplet routing technique for a fault-tolerant digital microfluidic

  11. Analysis of a hardware and software fault tolerant processor for critical applications

    Computer systems for critical applications must be designed to tolerate software faults as well as hardware faults. A unified approach to tolerating hardware and software faults is characterized by classifying faults in terms of duration (transient or permanent) rather than source (hardware or software). Errors arising from transient faults can be handled through masking or voting, but errors arising from permanent faults require system reconfiguration to bypass the failed component. Most errors which are caused by software faults can be considered transient, in that they are input-dependent. Software faults are triggered by a particular set of inputs. Quantitative dependability analysis of systems which exhibit a unified approach to fault tolerance can be performed by a hierarchical combination of fault tree and Markov models. A methodology for analyzing hardware and software fault tolerant systems is applied to the analysis of a hypothetical system, loosely based on the Fault Tolerant Parallel Processor. The models consider both transient and permanent faults, hardware and software faults, independent and related software faults, automatic recovery, and reconfiguration.

  12. Reconfigurable Fault Tolerance for FPGAs

    The invention allows a field-programmable gate array (FPGA) or similar device to be efficiently reconfigured in whole or in part to provide higher capacity, non-redundant operation. The redundant device consists of functional units such as adders or multipliers, configuration memory for the functional units, a programmable routing method, configuration memory for the routing method, and various other features such as block RAM (random access memory), I/O (input/output) capability, dedicated carry logic, etc. The redundant device has three identical sets of functional units and routing resources and majority voters that correct errors. The configuration memory may or may not be redundant, depending on need. For example, SRAM-based FPGAs will need some type of radiation-tolerant configuration memory, or they will need triple-redundant configuration memory. Flash or anti-fuse devices will generally not need redundant configuration memory. Some means of loading and verifying the configuration memory is also required. These are all components of the pre-existing redundant FPGA. This innovation modifies the voter to accept a MODE input, which specifies whether ordinary voting is to occur, or if redundancy is to be split. Generally, additional routing resources will also be required to pass data between sections of the device created by splitting the redundancy. In redundancy mode, the voters produce an output corresponding to the two inputs that agree, in the usual fashion. In the split mode, the voters select just one input and convey this to the output, ignoring the other inputs. In a dual-redundant system (as opposed to triple-redundant), instead of a voter, there is some means to latch or gate a state update only when both inputs agree. In this case, the invention would require modification of the latch or gate so that it would operate normally in redundant mode, and would separately latch or gate the inputs in non-redundant mode.
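
    The behaviour of the mode-controlled voter described in this abstract can be modelled in software. The sketch is a behavioural illustration only, not the circuit of the invention. It assumes the three redundant copies deliver integer bit-vectors, uses the standard bitwise majority function for redundancy mode, and forwards one selected input in split mode. The names `vote`, `REDUNDANT`, `SPLIT` and `select` are hypothetical.

```python
REDUNDANT = "redundant"  # ordinary triple-redundant voting
SPLIT = "split"          # redundancy is split; one input is passed through


def vote(a: int, b: int, c: int, mode: str = REDUNDANT, select: int = 0) -> int:
    """Behavioural model of a voter with a MODE input.

    In redundancy mode each output bit takes the value on which at least
    two of the three inputs agree. In split mode the voter forwards the
    input chosen by `select` (0, 1 or 2) and ignores the other two.
    """
    if mode == REDUNDANT:
        return (a & b) | (a & c) | (b & c)
    if mode == SPLIT:
        return (a, b, c)[select]
    raise ValueError(f"unknown mode: {mode!r}")


if __name__ == "__main__":
    # One copy is corrupted; the majority masks the error.
    print(bin(vote(0b1010, 0b1010, 0b0110)))
    # Split mode: the third input is passed through unchanged.
    print(bin(vote(0b1010, 0b1010, 0b0110, mode=SPLIT, select=2)))
```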

  13. Learning Fault-tolerant Speech Parsing with SCREEN

    This paper describes a new approach and a system SCREEN for fault-tolerant speech parsing. SCREEN stands for Symbolic Connectionist Robust EnterprisE for Natural language. Speech parsing describes the syntactic and semantic analysis of spontaneous spoken language. The general approach is based on incremental immediate flat analysis, learning of syntactic and semantic speech parsing, parallel integration of current hypotheses, and the consideration of various forms of speech related errors. The goal for this approach is to explore the parallel interactions between various knowledge sources for learning incremental fault-tolerant speech parsing. This approach is examined in a system SCREEN using various hybrid connectionist techniques. Hybrid connectionist techniques are examined because of their promising properties of inherent fault tolerance, learning, gradedness and parallel constraint integration. The input for SCREEN is hypotheses about recognized words of a spoken utterance potentially analyzed by a spe...

  14. Interactive animation of fault-tolerant parallel algorithms

    Animation of algorithms makes understanding them intuitively easier. This paper describes the software tool Raft (Robust Animator of Fault Tolerant Algorithms). The Raft system allows the user to animate a number of parallel algorithms which achieve fault tolerant execution. In particular, we use it to illustrate the key Write-All problem. It has an extensive user-interface which allows a choice of the number of processors, the number of elements in the Write-All array, and the adversary to control the processor failures. The novelty of the system is that the interface allows the user to create new on-line adversaries as the algorithm executes.

  15. Fault Tolerant Control for Civil Structures Based on LMI Approach

    The control system may lose its ability to suppress structural vibration due to faults in sensors or actuators. This paper designs a filter to perform fault detection and isolation (FDI) and then reforms the control strategy to achieve fault tolerant control (FTC). The dynamic equation of the structure with an active mass damper (AMD) is first formulated. Then, an estimated system is built to transform the FDI filter design problem into a static gain optimization problem. The gain is designed to minimize the gap between the estimated system and the practical system, and can be calculated by the linear matrix inequality (LMI) approach. The FDI filter is finally used to isolate the sensor faults and reform the FTC strategy. The efficiency of FDI and FTC is validated by numerical simulation of a three-story structure with an AMD system, taking sensor faults into consideration. The results show that the proposed FDI filter can detect the sensor faults and that the FTC controller can effectively tolerate the faults and suppress the structural vibration.

  16. An Active Fault-Tolerant PWM Tracker for Unknown Nonlinear Stochastic Hybrid Systems: NARMAX Model and OKID-Based State-Space Self-Tuning Control

    An active fault-tolerant pulse-width-modulated tracker using the nonlinear autoregressive moving average with exogenous inputs model-based state-space self-tuning control is proposed for continuous-time multivariable nonlinear stochastic systems with unknown system parameters, plant noises, measurement noises, and inaccessible system states. Through observer/Kalman filter identification method, a good initial guess of the unknown parameters of the chosen model is obtained so as to reduce the ...

  17. Design methods for fault-tolerant finite state machines

    VLSI electronic circuits are increasingly being used in space-borne applications where high levels of radiation may induce faults, known as single event upsets. In this paper we review the classical methods of designing fault tolerant digital systems, with an emphasis on those methods which are particularly suitable for VLSI-implementation of finite state machines. Four methods are presented and will be compared in terms of design complexity, circuit size, and estimated circuit delay.

  18. Design Approach for Fault Tolerance in FPGA Architecture

    Failures of nano-metric technologies owing to defects and shrinking process tolerances give rise to significant challenges for IC testing. In recent years the application space of reconfigurable devices has grown to include many platforms with a strong need for fault tolerance. While these systems frequently contain hardware redundancy to allow for continued operation in the presence of operational faults, the need to recover faulty hardware and return it to full functionality quickly and efficiently is great. In addition to providing functional density, FPGAs provide a level of fault tolerance generally not found in mask-programmable devices by including the capability to reconfigure around operational faults in the field. Reliability and process variability are serious issues for FPGAs in the future. With advancement in process technology, the feature size is decreasing, which leads to higher defect densities, and more sophisticated techniques at increased costs are required to avoid defects. If nanotechnology fabrication is applied, the yield may go down to zero, as avoiding defects during fabrication will not be a feasible option. Hence, future architectures have to be defect tolerant. In regular structures like FPGAs, redundancy is commonly used for fault tolerance. In this work we present a solution in which the configuration bit-stream of the FPGA is modified by a hardware controller that is present on the chip itself. The technique uses a redundant device to replace a faulty device and increases the yield.

  19. A modified NARMAX model-based self-tuner with fault tolerance for unknown nonlinear stochastic hybrid systems with an input-output direct feed-through term.

    A modified nonlinear autoregressive moving average with exogenous inputs (NARMAX) model-based state-space self-tuner with fault tolerance is proposed in this paper for the unknown nonlinear stochastic hybrid system with a direct transmission matrix from input to output. Through the off-line observer/Kalman filter identification method, one has a good initial guess of the modified NARMAX model to reduce the on-line system identification process time. Then, based on the modified NARMAX-based system identification, a corresponding adaptive digital control scheme is presented for the unknown continuous-time nonlinear system, with an input-output direct transmission term, which also has measurement and system noises and inaccessible system states. Besides, an effective state-space self-tuner with a fault tolerance scheme is presented for the unknown multivariable stochastic system. A quantitative criterion is suggested by comparing the innovation process error estimated by the Kalman filter estimation algorithm, so that a weighting matrix resetting technique, which adjusts and resets the covariance matrices of the parameter estimates obtained by the Kalman filter estimation algorithm, is utilized to achieve the parameter estimation for faulty system recovery. Consequently, the proposed method can effectively cope with partially abrupt and/or gradual system faults and input failures by the fault detection.

  20. Fault-tolerant search algorithms: reliable computation with unreliable information

    Why a book on fault-tolerant search algorithms? Searching is one of the fundamental problems in computer science. Time and again algorithmic and combinatorial issues originally studied in the context of search find application in the most diverse areas of computer science and discrete mathematics. On the other hand, fault-tolerance is a necessary ingredient of computing. Due to their inherent complexity, information systems are naturally prone to errors, which may appear at any level - as imprecisions in the data, bugs in the software, or transient or permanent hardware failures. This book pr

  1. Fault Tolerant Control of Wind Turbines

    This paper presents a test benchmark model for the evaluation of fault detection and accommodation schemes. This benchmark model deals with the wind turbine on a system level, and it includes sensor, actuator, and system faults, namely faults in the pitch system, the drive train, the generator, and the converter system. Since it is a system-level model, converter and pitch system models are simplified because these are controlled by internal controllers working at higher frequencies than the system model. The model represents a three-bladed pitch-controlled variable-speed wind turbine with a nominal power...

  2. Passive Fault tolerant Control of an Inverted Double Pendulum

    A passive fault tolerant control scheme is suggested, in which a nominal controller is augmented with an additional block, which guarantees stability and performance after the occurrence of a fault. The method is based on the Youla parameterization, which requires the nominal controller to be implemented in observer-based form. The proposed method is applied to a double inverted pendulum system, for which an H∞ controller has been designed and verified in a lab setup. In this case study, the fault is a degradation of the tacho loop.

  3. Fault-tolerant and Diagnostic Methods for Navigation

    Precise and reliable navigation is crucial, and for reasons of safety, essential navigation instruments are often duplicated. Hardware redundancy is mostly used to manually switch between instruments should faults occur. In contrast, diagnostic methods are available that can use analytic redundancy to diagnose faults and autonomously provide valid navigation data, disregarding any faulty sensor data and using sensor fusion to obtain a best estimate for users. This paper discusses how diagnostic and fault-tolerant methods are applicable in marine systems. An example chosen is sensor fusion for navigation...

  4. Verification of the FtCayuga fault-tolerant microprocessor system. Volume 1: A case study in theorem prover-based verification

    The design and formal verification of a hardware system for a task that is an important component of a fault tolerant computer architecture for flight control systems is presented. The hardware system implements an algorithm for obtaining interactive consistency (Byzantine agreement) among four microprocessors as a special instruction on the processors. The property verified ensures that an execution of the special instruction by the processors correctly accomplishes interactive consistency, provided certain preconditions hold. An assumption is made that the processors execute synchronously. For verification, the authors used a computer-aided hardware design verification tool, Spectool, and the theorem prover, Clio. A major contribution of the work is the demonstration of a significant fault tolerant hardware design that is mechanically verified by a theorem prover.
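
    As an illustration of what agreement among four processors with one faulty member involves, the following is a minimal software sketch of the textbook one-round oral-messages exchange, OM(1). It is not the verified hardware instruction described in the abstract; the function names and the fault model (a faulty sender distorts each message differently per receiver) are assumptions made for this example.

```python
from collections import Counter


def majority(values, default=0):
    """Return the strict-majority value, or `default` if none exists."""
    value, count = Counter(values).most_common(1)[0]
    return value if count > len(values) / 2 else default


def om1(commander_value, faulty=None, n=4):
    """Simulate OM(1): processor 0 is the commander, 1..n-1 are lieutenants.

    `faulty` is the index of the single faulty processor (or None).
    Returns the value each lieutenant decides on.
    """
    def send(sender, receiver, value):
        # Hypothetical fault model: a faulty sender tells every
        # receiver a different wrong value.
        return value + receiver + 1 if sender == faulty else value

    lieutenants = range(1, n)
    # Round 1: the commander sends its value to every lieutenant.
    direct = {i: send(0, i, commander_value) for i in lieutenants}
    # Round 2: each lieutenant relays what it received to the others,
    # then takes the majority of its direct and relayed values.
    decisions = {}
    for i in lieutenants:
        relayed = [send(j, i, direct[j]) for j in lieutenants if j != i]
        decisions[i] = majority([direct[i]] + relayed)
    return decisions
```

    With a faulty lieutenant, the two healthy lieutenants still decide on the commander's value; with a faulty commander, all lieutenants fall back to the same default, so agreement is preserved in both cases.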

  5. Research on Fault Tolerant Scheduling Algorithms of Web Cluster Based on Probability

    Aiming at the soft real-time fault-tolerance demands of critical web applications such as e-commerce, a new probability-based fault tolerant scheduling algorithm is proposed. To achieve fault tolerant scheduling, the primary/slave backup technique is applied on the basis of the tasks' self-similar access characteristics: when the primary task completes successfully, the resources allocated to the slave task are reclaimed, thus improving system efficiency. Experimental results demonstrate that, while satisfying a given fault-tolerance probability for the system, the tasks' schedulability probability is improved; in particular, the higher the tasks' degree of self-similarity, the more markedly the utilization of system resources is enhanced.

  6. Fault Detection for Shipboard Monitoring and Decision Support Systems

    In this paper a basic idea of a fault-tolerant monitoring and decision support system will be explained. Fault detection is an important part of the fault-tolerant design for in-service monitoring and decision support systems for ships. In the paper, a virtual example of fault detection will be p...

  7. Design of Test Articles and Monitoring System for the Characterization of HIRF Effects on a Fault-Tolerant Computer Communication System

    This report describes the design of the test articles and monitoring systems developed to characterize the response of a fault-tolerant computer communication system when stressed beyond the theoretical limits for guaranteed correct performance. A high-intensity radiated electromagnetic field (HIRF) environment was selected as the means of injecting faults, as such environments are known to have the potential to cause arbitrary and coincident common-mode fault manifestations that can overwhelm redundancy management mechanisms. The monitors generate stimuli for the systems-under-test (SUTs) and collect data in real-time on the internal state and the response at the external interfaces. A real-time health assessment capability was developed to support the automation of the test. A detailed description of the nature and structure of the collected data is included. The goal of the report is to provide insight into the design and operation of these systems, and to serve as a reference document for use in post-test analyses.

  8. A Formal Method for Developing Provably Correct Fault-Tolerant Systems Using Partial Refinement and Composition


    system requirements. See [12, 11] for a review of the SCR tabular notation, the state machine model which defines the SCR semantics, and the SCR tools. 1...Specify NAT and REQ. To represent the system's normal behavior, a state machine model of the system requirements is formulated in terms of two sets of...Formulate the System Properties. In this step, the critical system properties are formulated as properties of the state machine model. If possible

  9. Steps toward fault-tolerant quantum chemistry.

    Developing quantum chemistry programs on the coming generation of exascale computers will be a difficult task. The programs will need to be fault-tolerant and minimize the use of global operations. This work explores the use of a task-based model that uses a data-centric approach to allocate work to different processes as it applies to quantum chemistry. After introducing the key problems that appear when trying to parallelize a complicated quantum chemistry method such as coupled-cluster theory, we discuss the implications of that model as it pertains to the computational kernel of a coupled-cluster program: matrix multiplication. We also discuss the extensions that would be required to build a full coupled-cluster program using the task-based model. Current programming models for high-performance computing are fault-intolerant and use global operations. Those properties are unsustainable as computers scale to millions of CPUs; instead one must recognize that these systems will be hierarchical in structure and prone to constant faults, and that global operations will be infeasible. The FAST-OS HARE project is introducing a scale-free computing model to address these issues. This model is hierarchical and fault-tolerant by design, allows for the clean overlap of computation and communication, reduces the network load, does not require checkpointing, and avoids the complexity of many HPC runtimes. Development of an algorithm within this model requires a change in focus from imperative programming to a data-centric approach. Quantum chemistry (QC) algorithms, in particular electronic structure methods, are an ideal test bed for this computing model. These methods describe the distribution of electrons in a molecule, which determines the properties of the molecule. The computational cost of these methods is high, scaling quartically or higher in the size of the molecule, which is why QC applications are major users of HPC resources. The complexity of these algorithms means that

  10. Research on Evolvable Repairing Ability of Bio-inspired Fault-tolerance System

    Based on the evolvable hardware (EHW) technique, a bio-inspired fault-tolerance system is designed and constructed, and its evolvable repairing ability is studied by injecting faults of different modes and quantities. The relationship between fault conditions and evolvable repairing ability is obtained by analyzing the measured data, with two main results: (1) as the number of faults increases, the main factor influencing the evolvable repairing ability shifts gradually from the efficiency of the evolutionary algorithm to the probability of "avoiding" faults during evolutionary repair; (2) the evolvable repairing ability follows an exponential decay law with respect to the number of faults.

  11. Nonlinear, Adaptive and Fault-tolerant Control for Electro-hydraulic Servo Systems

    Fluid power systems have been in use since 1795 with the first hydraulic press patented by Joseph Bramah and today form the basis of many industries. Electro-hydraulic servo systems are fluid power systems controlled in closed loop. They transform reference input signals into a set of movements in hydraulic actuators (cylinders or motors) by means of hydraulic fluid under pressure. With the development of computing power and control techniques during the last few decades, they are used increasingly in many industrial fields which require high actuation forces within limited space. However, despite numerous attractive properties, hydraulic systems are always subject to potential leakages in their components, friction variation in their hydraulic actuators and deficiency in their sensors. These violations of normal behaviour reduce the system performances and can lead to system failure...


    P. V. Melyushin


  13. Sliding mode fault detection and fault-tolerant control of smart dampers in semi-active control of building structures

    Recent decades have witnessed much interest in the application of active and semi-active control strategies for seismic protection of civil infrastructures. However, the reliability of these systems is still in doubt as there remains the possibility of malfunctioning of their critical components (i.e. actuators and sensors) during an earthquake. This paper focuses on the application of the sliding mode method due to the inherent robustness of its fault detection observer and fault-tolerant control. The robust sliding mode observer estimates the state of the system and reconstructs the actuators’ faults which are used for calculating a fault distribution matrix. Then the fault-tolerant sliding mode controller reconfigures itself by the fault distribution matrix and accommodates the fault effect on the system. Numerical simulation of a three-story structure with magneto-rheological dampers demonstrates the effectiveness of the proposed fault-tolerant control system. It was shown that the fault-tolerant control system maintains the performance of the structure at an acceptable level in the post-fault case.

  14. Internal Leakage Fault Detection and Tolerant Control of Single-Rod Hydraulic Actuators

    The integration of internal leakage fault detection and tolerant control for single-rod hydraulic actuators is presented in this paper. Fault detection is a potential technique to provide efficient condition monitoring and/or preventive maintenance, and fault tolerant control is a critical method to improve the safety and reliability of hydraulic servo systems. Based on quadratic Lyapunov functions, a performance-oriented fault detection method is proposed, which has a simple structure and is easy to implement in practice. The main feature is that, when a prescribed performance index is satisfied (even if a slight fault has occurred), no fault is alarmed; otherwise (i.e., a severe fault has occurred), the fault is detected and a fault tolerant controller is activated. The proposed tolerant controller, which is based on the parameter adaptive methodology, is also easy to realize, and the learning mechanism is simple since only the internal leakage is considered in parameter adaptation and thus the persistent excitation (PE) condition is easily satisfied. After the activation of the fault tolerant controller, the control performance is gradually recovered. Simulation results on a hydraulic servo system with both abrupt and incipient internal leakage faults demonstrate the effectiveness of the proposed fault detection and tolerant control method.

  15. System-Level Development of Fault-Tolerant Distributed Aero-Engine Control Architecture Project

    National Aeronautics and Space Administration — NASA's vision for an "intelligent engine" will be realized with the development of a truly distributed control system and reliable smart transducer node components;...

  16. Cooperative Fault Tolerant Distributed Computing

    HARNESS was proposed as a system that combined the best of emerging technologies found in current distributed computing research and commercial products into a very flexible, dynamically adaptable framework that could be used by applications to allow them to evolve and better handle their execution environment. The HARNESS system was designed using the considerable experience from previous projects such as PVM, MPI, IceT and Cumulvs. As such, the system was designed to avoid the common problems found in these systems: it has no single point of failure and can survive machine, node and software failures. Additional features included improved inter-component connectivity, with full support for dynamic downloading of additional components at run-time, thus reducing the burden on application developers to build in all the libraries they need in advance.

  17. A Distributed Fault Tolerance Global Coordinator Election Algorithm in Unreliable High Traffic Distributed Systems

    Distributed systems consist of several management sites which have different resource sharing levels. Resources can be shared among inner-site and outer-site processes at the first and second level, respectively. A global coordinator should exist in order to coordinate access to resources shared across multiple sites. Moreover, other coordinators should manage access to resources shared within each site, so applying appropriate coordinator election algorithms at each level is crucial to achieving the most efficient system. In this paper a hierarchical distributed election algorithm is proposed which eliminates the single point of failure of the election launcher. Meanwhile, traffic is applied to the network at different times and the number of election messages is greatly decreased as well, which improves efficiency, especially in high-traffic networks. A standby system between coordinators and their first alternative is considered in order to impose less wait time on processes which want to communicate with the coordinator.

  18. Self Fault-Tolerance of Protocols: A Case Study


    The prerequisite for the correctness of existing protocols is that they operate under normal conditions; they are not designed to deal with abnormal conditions. In other words, these protocols cannot provide fault tolerance when some fault occurs. This paper discusses the self fault-tolerance of protocols. It describes some concepts and methods for achieving self fault-tolerance of protocols. It also provides a case study that investigates a typical protocol which does not satisfy self fault-tolerance, and gives a redesigned version of this existing protocol using the proposed approach.

  19. A Multi-Step Simulation Approach Toward Secure Fault Tolerant System Evaluation


    its current state of development. When the system architecture is not established, researchers generally use CTMC (Continuous time Markov chains ...stochastic va bandwidth, etc.) that are statistically derived experimentations. It can also be defined analysis, like the use of queuing theory in case The

    Zahiripour, Seyed Ali; Jalali, Ali Akbar


    A novel switching function based on an optimization strategy for the sliding mode control (SMC) method is provided for uncertain stochastic systems subject to actuator degradation, such that the closed-loop system is globally asymptotically stable with probability one. In previous research, the focus for the sliding surface has been on proportional or proportional-integral functions of the states. In this research, a degree of freedom that depends on the designer's choice is used to meet certain objectives. In the design of the switching function, there is a parameter which the designer can regulate for specified objectives. A sliding-mode controller is synthesized to ensure the reachability of the specified switching surface, despite actuator degradation and uncertainties. Finally, the simulation results demonstrate the effectiveness of the proposed method.

  1. Event-Triggered Faults Tolerant Control for Stochastic Systems with Time Delays

    This paper is concerned with the state-feedback controller design for stochastic networked control systems (NCSs) with random actuator failures and transmission delays. Firstly, an event-triggered scheme is introduced to optimize the performance of the stochastic NCSs. Secondly, stochastic NCSs under the event-triggered scheme are modeled as stochastic time-delay systems. Thirdly, some less conservative delay-dependent stability criteria in terms of linear matrix inequalities for the codesign of both the controller gain and the trigger parameters are obtained by using a delay-decomposition technique and a convex combination approach. Finally, a numerical example is provided to show the reduced sampled-data transmission and reduced conservatism of the proposed theory.

  2. Stable Fault-tolerance Control for a Class of Networked Control Systems

    In this paper, we use the matrix measure technique to study stable fault-tolerance control of networked control systems. State feedback networked control systems with network-induced delay, parameter uncertainties, sensor failures and actuator failures are considered. The state feedback gain K is designed for any invariant delay τ, and some theorems and sufficient conditions for stable fault-tolerance control are given. An example is presented to illustrate the effectiveness of these theorems.

  3. On Fault Tolerance of 3-Dimensional Mesh Networks

    In this paper, the concept of a k-submesh and a k-submesh connectivity fault tolerance model are proposed, and the fault tolerance of 3-D mesh networks is studied under a more realistic model in which each network node has an independent failure probability. It is first observed that if the node failure probability is fixed, then the connectivity probability of 3-D mesh networks can be arbitrarily small when the network size is sufficiently large. Thus, it is practically important for multicomputer system manufacturers to determine the upper bound for the node failure probability when the probability of network connectivity and the network size are given. A novel technique is developed to formally derive lower bounds on the connectivity probability for 3-D mesh networks. The study shows that 3-D mesh networks of practical size can tolerate a large number of faulty nodes and thus are reliable enough for multicomputer systems. A number of advantages of 3-D mesh networks over other popular network topologies are given. Compared to 2-D mesh networks, 3-D mesh networks are much stronger in tolerating faulty nodes, while for practical network sizes, the fault tolerance of 3-D mesh networks is comparable with that of hypercube networks but with much lower node degree.
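
    The independent node-failure model can be explored numerically. The sketch below is a plain Monte Carlo estimate of the probability that the surviving nodes of an n x n x n mesh remain connected; it is not the paper's k-submesh technique or its analytical lower bound. The function names, the default trial count, and the convention that a mesh with no surviving nodes counts as disconnected are all assumptions made for this example.

```python
import random
from collections import deque

# The six axis-aligned neighbour offsets of a node in a 3-D mesh.
NEIGHBOURS = ((1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1))


def mesh_connected(n, p_fail, rng):
    """Sample one failure pattern; True if all surviving nodes are connected."""
    alive = {
        (x, y, z)
        for x in range(n) for y in range(n) for z in range(n)
        if rng.random() >= p_fail  # each node fails independently
    }
    if not alive:
        return False  # assumed convention: an empty mesh is not connected
    start = next(iter(alive))
    seen = {start}
    queue = deque([start])
    while queue:  # breadth-first search over surviving nodes
        x, y, z = queue.popleft()
        for dx, dy, dz in NEIGHBOURS:
            nb = (x + dx, y + dy, z + dz)
            if nb in alive and nb not in seen:
                seen.add(nb)
                queue.append(nb)
    return len(seen) == len(alive)


def estimate_connectivity(n, p_fail, trials=200, seed=0):
    """Fraction of sampled failure patterns that leave the mesh connected."""
    rng = random.Random(seed)
    return sum(mesh_connected(n, p_fail, rng) for _ in range(trials)) / trials
```

    Sweeping `n` upward at a fixed `p_fail` gives an empirical view of the observation that the connectivity probability shrinks as the network grows.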

  4. Communication and Agreement Abstractions for Fault-Tolerant Asynchronous Distributed Systems

    Understanding distributed computing is not an easy task. This is due to the many facets of uncertainty one has to cope with and master in order to produce correct distributed software. Considering the uncertainty created by asynchrony and process crash failures in the context of message-passing systems, the book focuses on the main abstractions that one has to understand and master in order to be able to produce software with guaranteed properties. These fundamental abstractions are communication abstractions that allow the processes to communicate consistently (namely the register abstraction

  5. Error Mitigation of Point-to-Point Communication for Fault-Tolerant Computing

    Fault tolerant systems require the ability to detect and recover from physical damage caused by the hardware's environment, faulty connectors, and system degradation over time. This ability applies to military, space, and industrial computing applications. The integrity of Point-to-Point (P2P) communication, between two microcontrollers for example, is an essential part of fault tolerant computing systems. In this paper, different methods of fault detection and recovery are presented and analyzed.
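
    The abstract does not name the detection and recovery methods it analyzes. As one common illustration of the idea, the sketch below appends a CRC-32 checksum to each frame and retransmits when the check fails. The frame layout, the function names and the `channel` callable are hypothetical and chosen only for this example.

```python
import struct
import zlib


def frame(payload: bytes) -> bytes:
    """Append a big-endian CRC-32 of the payload."""
    return payload + struct.pack(">I", zlib.crc32(payload))


def unframe(data: bytes):
    """Return the payload if the CRC matches, otherwise None."""
    if len(data) < 4:
        return None
    payload = data[:-4]
    (crc,) = struct.unpack(">I", data[-4:])
    return payload if zlib.crc32(payload) == crc else None


def send_with_retry(payload: bytes, channel, max_attempts: int = 3):
    """Send over `channel` (a callable bytes -> bytes), retrying on CRC failure.

    Returns (received_payload, attempts_used), or (None, max_attempts)
    if every attempt arrived corrupted.
    """
    for attempt in range(1, max_attempts + 1):
        received = unframe(channel(frame(payload)))
        if received is not None:
            return received, attempt
    return None, max_attempts
```

    A real microcontroller link would also need timeouts and acknowledgements; this sketch covers only the detect-and-retransmit step.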

  6. Superior model for fault tolerance computation in designing nano-sized circuit systems

    As CMOS technology scales to nanometric dimensions, reliability becomes a decisive issue in the design methodology of nano-sized circuit systems. As a result, several computational approaches have been developed to compute and evaluate the reliability of nano-electronic circuits. Computing reliability becomes very troublesome and time consuming as the computational complexity builds up with circuit size. Therefore, being able to measure reliability quickly and accurately is fast becoming necessary in designing modern logic integrated circuits. For this purpose, the paper first looks into the development of an automated reliability evaluation tool based on the generalization of the Probabilistic Gate Model (PGM) and Boolean Difference-based Error Calculator (BDEC) models. The Matlab-based tool allows users to significantly speed up the task of reliability analysis for a very large number of nano-electronic circuits. Second, using the developed tool, the paper presents a comparative study of reliability computation and evaluation by the PGM and BDEC models for different implementations of circuits with the same functionality. Based on the reliability analysis, BDEC gives exact and transparent reliability measures, but as the complexity of the same-functionality circuits with respect to gate error increases, the reliability measure by BDEC tends to be lower than that by PGM. The lower reliability measure by BDEC is explained in this paper using the distribution of different signal input patterns over time for same-functionality circuits. Simulation results show that the reliability measure by BDEC depends not only on faulty gates but also on circuit topology, the probability of input signals being one or zero, and the probability of error on signal lines.

  7. Fault handling schemes in electronic systems with specific application to radiation tolerance and VLSI design

    Naturally occurring space radiation particles can produce transient and permanent changes in the electrical properties of electronic devices and systems. In this work, the transient radiation effects on DRAM and CMOS SRAM were considered. In addition, the effect of total ionizing dose radiation on the switching times of CMOS logic gates was investigated. The effects of transient radiation on the column and cell of a MOS dynamic memory were simulated using SPICE. It was found that the critical charge of the bitline was higher than that of the cell. In addition, the critical charge of the combined cell-bitline was found to be dependent on the gate voltage of the access transistor. The results of this work indicate that the rise time of CMOS logic gates increases, while the fall time decreases, with an increase in total ionizing dose radiation. Also, by increasing the size of the P-channel transistor with respect to that of the N-channel transistor, the propagation delay of a CMOS logic gate can be made to decrease with, or be independent of, an increase in total ionizing dose radiation. Furthermore, a method was developed for replacing the polysilicon feedback resistance of SRAMs with a switched capacitor network. A switched capacitor SRAM was implemented using MOS technology. The switched capacitor SRAM has a very large critical charge. The results of this work indicate that switched capacitor SRAM is a viable alternative to SRAM with polysilicon feedback resistance.

  8. Diagnosis and Fault-tolerant Control, 3rd Edition

    The book presents effective model-based analysis and design methods for fault diagnosis and fault-tolerant control. Architectural and structural models are used to analyse the propagation of the fault through the process, to test the fault detectability and to find the redundancies in the process...

  9. Algorithm-dependent fault tolerance for distributed computing

    Large-scale distributed systems assembled from commodity parts, like CPlant, have become common tools in the distributed computing world. Because of their size and diversity of parts, these systems are prone to failures. Applications that are being run on these systems have not been equipped to efficiently deal with failures, nor is there vendor support for fault tolerance. Thus, when a failure occurs, the application crashes. While most programmers make use of checkpoints to allow for restarting of their applications, this is cumbersome and incurs substantial overhead. In many cases, there are more efficient and more elegant ways in which to address failures. The goal of this project is to develop a software architecture for the detection of and recovery from faults in a cluster computing environment. The detection phase relies on the latest techniques developed in the fault tolerance community. Recovery is being addressed in an application-dependent manner, thus allowing the programmer to take advantage of algorithmic characteristics to reduce the overhead of fault tolerance. This architecture will allow large-scale applications to be more robust in high-performance computing environments that are comprised of clusters of commodity computers such as CPlant and SMP clusters.

  10. Distributed adaptive fault-tolerant control against actuator faults and lossy interconnection links

    This paper presents an adaptive method to solve the robust fault-tolerant control (FTC) problem for a class of large scale systems against actuator failures and lossy interconnection links. In terms of the special distributed architectures, the adaptation laws are proposed to estimate the unknown eventual faults of actuators and interconnections, constant external disturbances, and controller parameters on-line. Then a class of distributed state feedback controllers are constructed for automatically compensating the fault and disturbance effects on systems based on the information from adaptive schemes. On the basis of Lyapunov stability theory, it shows that the resulting adaptive closed-loop large-scale system can be guaranteed to be asymptotically stable in the presence of uncertain faults of actuators and interconnections, and constant disturbances. The proposed design technique is finally evaluated in the light of a simulation example.

  11. Fault tolerant wind speed estimator used in wind turbine controllers

    Advanced control schemes can be used to optimize energy production and cost of energy in modern wind turbines. These control schemes most often rely on wind speed estimations. These designs of wind speed estimators are, however, not designed to be fault tolerant towards faults in the used sensors. In this paper a fault tolerant wind speed estimator is designed based on a set of unknown input observers, each designed for a different set of non-faulty sensors. Faults in the rotor, generator and wind speed sensors are considered. The designed wind speed estimator is passive tolerant towards faults in the wind speed sensors, and faults in the generator and rotor speed sensors are accommodated by an active fault tolerant observer scheme in which the faults are detected and identified, and the observer corresponding to the non-faulty sensors is used. The potential of the scheme is shown by applying...

  12. Synthesis of Fault Tolerant Reversible Logic Circuits

    Reversible logic is emerging as an important research area with applications in diverse fields such as low power CMOS design, digital signal processing, cryptography, quantum computing and optical information processing. This paper presents a new 4*4 universal reversible logic gate, IG. It is a parity preserving reversible logic gate, that is, the parity of the inputs matches the parity of the outputs. The proposed parity preserving reversible gate can be used to synthesize any arbitrary Boolean function. It allows any fault that affects no more than a single signal to be readily detected at the circuit's primary outputs. Finally, it is shown how a fault tolerant reversible full adder circuit can be realized using only two IGs. It has also been demonstrated that the proposed design offers less hardware complexity and is more efficient in terms of gate count, garbage outputs and constant inputs than the existing counterparts.
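
    The abstract does not give the truth table of the IG gate, so the sketch below uses the well-known 3*3 Fredkin (controlled-swap) gate as a stand-in to show what "parity preserving" means and why a single-signal fault becomes visible as a parity mismatch. The Toffoli gate is included as a reversible gate that does not preserve parity. The helper names are assumptions made for this example.

```python
from itertools import product


def fredkin(c, a, b):
    """Controlled swap: swap a and b when the control bit c is 1."""
    return (c, b, a) if c else (c, a, b)


def toffoli(a, b, c):
    """Controlled-controlled NOT: flip c when both a and b are 1."""
    return (a, b, c ^ (a & b))


def parity(bits):
    """Parity (XOR) of a sequence of bits."""
    return sum(bits) % 2


def is_reversible(gate, width):
    """A gate is reversible if it maps inputs to outputs one-to-one."""
    outputs = {gate(*bits) for bits in product((0, 1), repeat=width)}
    return len(outputs) == 2 ** width


def is_parity_preserving(gate, width):
    """Check input parity == output parity over the full truth table."""
    return all(
        parity(bits) == parity(gate(*bits))
        for bits in product((0, 1), repeat=width)
    )


def single_fault_detected(gate, bits, flip_index):
    """Flip one output signal and test whether the parity check exposes it."""
    out = list(gate(*bits))
    out[flip_index] ^= 1
    return parity(bits) != parity(out)
```

    Because a parity-preserving gate keeps input and output parity equal, flipping any one output bit always breaks the equality, which is the detection property the abstract describes.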

  13. Universal Fault-Tolerant Computation on Decoherence-Free Subspaces

    A general scheme to perform universal quantum computation fault-tolerantly within decoherence-free subspaces (DFSs) of a system's Hilbert space is derived. This scheme leads to the first fault-tolerant realization of universal quantum computation on DFSs with the properties that (i) only one- and two-qubit interactions are required, and (ii) the system remains within the DFS throughout the entire implementation of a quantum gate. We show explicitly how to perform universal computation on clusters of the four-qubit DFS encoding one logical qubit each under "collective decoherence" (qubit-permutation-invariant system-bath coupling). Our results have immediate relevance to a number of proposed quantum computer implementations, in particular those in which the internal system Hamiltonian is of the Heisenberg type, such as spin-spin coupled quantum dots.


    When a redundant robot performs a fault-tolerant operation for locked joint failures, its fault tolerant properties should include dexterity and sudden change of joint velocity at the moment of locking failed joints and the dexterity during the post-failure. Firstly three fault-tolerant indexes, reduced condition number, sudden change of relative joint velocity and centrality are proposed, which can comprehensively evaluate the kinematical performance of a redundant robot during its entire fault-tolerant operations. Then, the influence of the initial postures of robot's end-effector on these fault-tolerant indexes is analyzed with a planar robot and a spatial robot. Simulation results show that for a given task the joint trajectory with the best comprehensive effect of fault tolerance can be determined by optimizing the initial posture of a robot.

  15. Design study of Software-Implemented Fault-Tolerance (SIFT) computer

    Wensley, J. H.; Goldberg, J.; Green, M. W.; Kutz, W. H.; Levitt, K. N.; Mills, M. E.; Shostak, R. E.; Whiting-Okeefe, P. M.; Zeidler, H. M.


    Software-implemented fault tolerant (SIFT) computer design for commercial aviation is reported. A SIFT design concept is addressed. Alternate strategies for physical implementation are considered. Hardware and software design correctness is addressed. System modeling and effectiveness evaluation are considered from a fault-tolerant point of view.

  16. Employment of Reduced Precision Redundancy for Fault Tolerant FPGA Applications


    2009 17th IEEE Symposium on Field Programmable Custom Computing Machines. This research explores the employment of Reduced Precision Redundancy (RPR) as a power-saving alternative to traditional Triple Modular Redundancy (TMR). This paper focuses on the details of RPR implementation and the effect of RPR fault tolerance on the performance of spacecraft systems. RPR-protected system performance is evaluated using a signal-to-noise ratio analogy developed with MATLAB an...
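
    To make the contrast between the two schemes concrete, here is a minimal sketch of one commonly described RPR decision rule next to a bitwise TMR voter. The abstract is truncated and does not spell out its decision logic, so the rule below (one full-precision result checked against two reduced-precision copies and a threshold) is an assumption, and all names and parameters are hypothetical.

```python
def reduce_precision(x: int, dropped_bits: int) -> int:
    """Emulate a narrower datapath by truncating the low-order bits."""
    return (x >> dropped_bits) << dropped_bits

def rpr_vote(fp: int, rp1: int, rp2: int, threshold: int) -> int:
    """Assumed RPR rule: trust the full-precision result unless the two
    reduced-precision copies agree with each other and disagree with it
    by more than the threshold."""
    if rp1 == rp2 and abs(fp - rp1) > threshold:
        return rp1  # degraded but bounded-error output
    return fp

def tmr_vote(a: int, b: int, c: int) -> int:
    """Classic bitwise majority vote over three full-precision copies."""
    return (a & b) | (a & c) | (b & c)

value = 1000
dropped = 4
threshold = (1 << dropped) - 1  # largest error truncation alone can cause
rp = reduce_precision(value, dropped)

# Fault-free case: the full-precision result is passed through.
assert rpr_vote(value, rp, rp, threshold) == value

# An upset in bit 9 of the full-precision module is masked by the RP copies.
corrupted = value ^ (1 << 9)
assert rpr_vote(corrupted, rp, rp, threshold) == rp

# TMR masks the same upset exactly, at the cost of three full-width modules.
assert tmr_vote(value, value, corrupted) == value
```

    The trade-off shown is the one the abstract points to: RPR replaces two of the three full-width modules with cheaper ones and accepts a bounded loss of precision when a fault is masked.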

  17. Fault tolerant quantum computation with nondeterministic gates.

    In certain approaches to quantum computing the operations between qubits are nondeterministic and likely to fail. For example, a distributed quantum processor would achieve scalability by networking together many small components; operations between components should be assumed to be failure prone. In the ultimate limit of this architecture each component contains only one qubit. Here we derive thresholds for fault-tolerant quantum computation under this extreme paradigm. We find that computation is supported for remarkably high failure rates (exceeding 90%) provided that failures are heralded; meanwhile the rate of unknown errors should not exceed 2 in 10^4 operations.

  18. Fault Diagnosis and Fault-Tolerant Control of Uncertain Robot Manipulators Using High-Order Sliding Mode

    A robust fault diagnosis and fault-tolerant control (FTC) system for uncertain robot manipulators without joint velocity measurement is presented. Actuator faults and robot manipulator component faults are considered. The proposed scheme is designed via an active fault-tolerant control strategy by combining a fault diagnosis scheme based on a super-twisting third-order sliding mode (STW-TOSM) observer with a robust super-twisting second-order sliding mode (STW-SOSM) controller. Compared to the existing FTC methods, the proposed FTC method can accommodate not only faults but also uncertainties, and it does not require a velocity measurement. In addition, because the proposed scheme is designed based on the high-order sliding mode (HOSM) observer/controller strategy, it exhibits fast convergence, high accuracy, and less chattering. Finally, computer simulation results for a PUMA560 robot are obtained to verify the effectiveness of the proposed strategy.

  19. A Blueprint for a Topologically Fault-tolerant Quantum Computer

    The advancement of information processing into the realm of quantum mechanics promises a transcendence in computational power that will enable problems to be solved which are completely beyond the known abilities of any "classical" computer, including any potential non-quantum technologies the future may bring. However, the fragility of quantum states poses a challenging obstacle for realization of a fault-tolerant quantum computer. The topological approach to quantum computation proposes to surmount this obstacle by using special physical systems -- non-Abelian topologically ordered phases of matter -- that would provide intrinsic fault-tolerance at the hardware level. The so-called "Ising-type" non-Abelian topological order is likely to be physically realized in a number of systems, but it can only provide a universal gate set (a requisite for quantum computation) if one has the ability to perform certain dynamical topology-changing operations on the system. Until now, practical methods of implementing thes...

  20. Fault Tolerant Robust Control Applied for Induction Motor (LMI approach)

    This paper addresses fault tolerant robust control of uncertain dynamic linear systems in the state space representation. Industrial systems are increasingly complex, and the diagnosis process has become indispensable to guarantee their operational safety and availability, which is why a fault tolerant control law is imperative. In this paper, we address the problem of mixed H2/H∞ state feedback with regional pole placement for linear continuous uncertain systems. Sufficient conditions for feasibility are derived for a general class of convex regions of the complex plane. The conditions are presented as a collection of linear matrix inequalities (LMIs). The efficiency and performance of this approach are then tested on the robust control of a three-phase induction motor drive whose parameters fluctuate during operation.

  1. Fault tolerance in real-time and multitask parallel computing system

    Fault tolerance is a key difficulty that must be resolved in the design of real-time multitask parallel computing systems. To meet the requirements of high reliability and efficiency in such systems, the basic concepts, methods and ideas of computer system reliability and fault tolerance are introduced. Building on checkpointing and rollback recovery, a multi-level, multi-aspect design for fault-tolerant parallel computing systems is proposed, together with schemes for handling in-transit messages and orphan messages. The corresponding model and technical evaluation are given, and simulation experiments demonstrate the validity of the model.
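
    As a rough illustration of the checkpointing and rollback recovery on which this record builds, the sketch below saves a task's state and restores it after a simulated fault. It covers a single process only; the handling of in-transit and orphan messages between processes is not modelled, and the class and method names are hypothetical.

```python
import copy

class CheckpointedTask:
    """A task whose state can be saved and later restored after a fault."""

    def __init__(self, state: dict):
        self.state = state
        self._saved = copy.deepcopy(state)  # initial checkpoint

    def checkpoint(self) -> None:
        """Record a consistent copy of the current state."""
        self._saved = copy.deepcopy(self.state)

    def rollback(self) -> None:
        """Discard work done since the last checkpoint."""
        self.state = copy.deepcopy(self._saved)

task = CheckpointedTask({"step": 0, "partial_sum": 0})

# Do some work, then checkpoint.
for i in range(1, 4):
    task.state["step"] = i
    task.state["partial_sum"] += i
task.checkpoint()

# Further work is corrupted by a simulated fault and must be undone.
task.state["step"] = 4
task.state["partial_sum"] = -999
task.rollback()

assert task.state == {"step": 3, "partial_sum": 6}
```

    The deep copy matters: a shallow reference to the live state would be corrupted along with it and could not serve as a recovery point.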

  2. Fault-tolerant distributed mass storage for LHC computing

    Wiebalck, A; Lindenstruth, V; Stinbeck, T M


    In this paper we present the concept and first prototyping results of a modular fault-tolerant distributed mass storage architecture for large Linux PC clusters as they are deployed by the upcoming particle physics experiments. The device masquerading technique using an Enhanced Network Block Device (ENBD) enables local RAID over remote disks as the key concept of the ClusterRAID system. The block level interface to remote files, partitions or disks provided by the ENBD makes it possible to use the standard Linux software RAID to add fault-tolerance to the system. Preliminary performance measurements indicate that the latency is comparable to a local hard drive. With four disks, throughput rates of up to 55 MB/s were achieved with first prototypes for a RAID0 setup, and about 40 MB/s for a RAID5 setup. (29 refs).

  3. Design of Reliable Adaptive Filter with Fault Tolerance Using DSP

    The LMS algorithm has been used for plant identification and noise cancellation, and has been studied for improving filtering performance. The design and development of reliable systems has become a key issue in industry, because reliability is an important factor in a system performing its function successfully. Computing with reliability and fault tolerance is an important factor in aviation, communication systems, and nuclear plants. This paper presents the design of a reliable adaptive filter with fault tolerance. Generally, redundancy is used for reliability, which requires computation or circuitry for a voting mechanism, fault detection, or switching. The presented filter needs no computation for a voting mechanism or fault detection. It is therefore computationally simple and practical to apply. The reliability of the adaptive filter is also analyzed. The effectiveness of the proposed adaptive filter is demonstrated in case studies of plant identification and noise cancellation using a DSP. (author). 9 refs., 18 figs.

  4. Robust Adaptive Fault-Tolerant Tracking Control of Three-Phase Induction Motor

    This paper deals with the problem of induction motor tracking control against actuator faults and external disturbances using the linear matrix inequalities (LMIs) method and the adaptive method. A direct adaptive fault-tolerant tracking controller design method is developed based on Lyapunov stability theory and a constructive algorithm based on linear matrix inequalities for online tuning of adaptive and state feedback gains to stabilize the closed-loop system in order to reduce the fault effect with disturbance attenuation. Simulation results reveal the merits of the proposed robust adaptive fault-tolerant tracking control scheme on an induction motor subjected to actuator faults.

  5. Fault tolerant control with torque limitation based on fault mode for ten-phase permanent magnet synchronous motor

    Guo Hong; Xu Jinquan


    This paper proposes a novel fault tolerant control with torque limitation based on the fault mode for the ten-phase permanent magnet synchronous motor (PMSM) under various open-circuit and short-circuit fault conditions, which includes the optimal torque control and the torque limitation control based on the fault mode. The optimal torque control is adopted to guarantee ripple-free electromagnetic torque operation for the ten-phase motor system under the post-fault condition. Furthermore, we systematically analyze the load capacity of the ten-phase motor system under different fault modes. A torque limitation control approach based on the fault mode is also proposed, which was not available earlier. This approach is able to ensure safe operation of the faulted motor system over long operating times without causing an overheating fault. The simulation result confirms that the proposed fault tolerant control for the ten-phase motor system is able to guarantee ripple-free electromagnetic torque and safe operation over long operating times under both normal and fault conditions.

  7. Fault-Tolerant Design Techniques in a CMP Architecture

    YAO Wen-bin; WANG Dong-sheng


    Single-chip multiprocessor (CMP) combined with the fault-tolerant (FT) techniques offers an ideal architecture to achieve high availability on the basis of sustaining high computing performance. FT design of a single-chip multiprocessor is described, including the techniques from hardware redundancy to software support and firmware strategy. The design aims at masking the influences of errors and automatically correcting the system states.

  8. Design of Six Channel ABS System with Fault Tolerant Technology

    苗晓锋; 王胜


    Using fault tolerant technology, a six-channel ABS system with fault-tolerant capability is designed around a message-response mechanism over CAN communication between two MCUs. Both MCUs simultaneously receive and process the wheel speed signals and fault diagnosis signals from the wheel speed sensors. While the main control MCU is working normally, the other MCU stays in standby and monitors it by communicating with it over the CAN bus in real time. If the main control MCU fails, the standby MCU immediately takes over its work, which improves the reliability of the six-channel ABS system.
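
    The message-response supervision between the two MCUs can be pictured with the following sketch of the standby side. The record gives no timing values or message formats, so the timeout, the class name and the method names are all assumptions made for illustration, and time is passed in explicitly rather than read from a CAN driver.

```python
from dataclasses import dataclass

@dataclass
class StandbyMcu:
    """Standby controller that takes over when the main MCU stops replying."""
    timeout_ms: int          # assumed maximum silence tolerated on the bus
    last_reply_ms: int = 0   # time of the most recent reply from the main MCU
    active: bool = False     # True once the standby has taken over

    def on_reply(self, now_ms: int) -> None:
        """Called when a reply from the main MCU arrives over CAN."""
        if not self.active:
            self.last_reply_ms = now_ms

    def tick(self, now_ms: int) -> bool:
        """Periodic check; returns True if this MCU is (now) in control."""
        if not self.active and now_ms - self.last_reply_ms > self.timeout_ms:
            self.active = True  # main MCU presumed faulty: take over
        return self.active

standby = StandbyMcu(timeout_ms=20)
standby.on_reply(now_ms=10)
assert standby.tick(now_ms=25) is False   # main MCU replied recently
assert standby.tick(now_ms=31) is True    # silence exceeded the timeout
assert standby.tick(now_ms=32) is True    # takeover is latched
```

    Latching the takeover keeps the two MCUs from fighting over control if the main MCU later resumes sending replies; whether the described system does this is not stated in the record.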

  9. Evaluation of Reliability of a Fault-tolerance Computer System by Markov Status Graph



    In this paper, the reliability of an actual hardware-based, repairable fault-tolerant computer system is evaluated using the Markov state graph method. Two fault-tolerance schemes, the majority voting method and the single store method, are each evaluated for reliability and usability. Based on the computed data, their advantages, disadvantages and best ranges of application are analyzed and compared.
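
    A Markov state graph evaluation of a majority-voting scheme can be sketched as below for a generic triple modular redundant (TMR) system. The state graph, the failure rate `lam` and the repair rate `mu` are illustrative assumptions, not the system or data of the paper. The states are "3 modules good", "2 modules good" and "failed"; with `mu = 0` the result can be checked against the textbook closed form R(t) = 3e^(-2λt) - 2e^(-3λt).

```python
import math

def tmr_reliability(lam: float, mu: float, t_end: float, steps: int = 100_000) -> float:
    """Integrate the Markov state equations with forward Euler.

    p3: probability that all three modules work
    p2: probability that exactly two modules work (system still votes correctly)
    The failed state is absorbing, so reliability is p3 + p2.
    """
    dt = t_end / steps
    p3, p2 = 1.0, 0.0
    for _ in range(steps):
        d3 = -3 * lam * p3 + mu * p2              # one of 3 fails / repair returns
        d2 = 3 * lam * p3 - (2 * lam + mu) * p2   # second failure is fatal
        p3 += d3 * dt
        p2 += d2 * dt
    return p3 + p2

def tmr_closed_form(lam: float, t: float) -> float:
    """Reliability of non-repairable TMR with a perfect voter."""
    return 3 * math.exp(-2 * lam * t) - 2 * math.exp(-3 * lam * t)

lam, t_end = 1e-3, 1000.0
numeric = tmr_reliability(lam, mu=0.0, t_end=t_end)
assert abs(numeric - tmr_closed_form(lam, t_end)) < 1e-4

# Adding repair of a single failed module raises the reliability.
assert tmr_reliability(lam, mu=1e-2, t_end=t_end) > numeric
```

    The same pattern (write one balance equation per state of the graph, then integrate) extends to other schemes by changing the states and transition rates.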

  10. Runtime Instrumentation of SystemC/TLM2 Interfaces for Fault Tolerance Requirements Verification in Software Cosimulation

    Antonio da Silva


  11. Fault-tolerance techniques for SRAM-based FPGAs

    Fault-tolerance in integrated circuits is no longer the exclusive concern of space designers or highly-reliable applications engineers. Today, designers of many next-generation products must cope with reduced noise margins. The continuous evolution of fabrication technology of semiconductor components – shrinking transistor geometry, power supply, speed, and logic density – has significantly reduced the reliability of very deep submicron integrated circuits, in the face of various internal and external sources of noise. Field Programmable Gate Arrays (FPGAs), customizable by SRAM cells, are the latest advance in the integrated circuit evolution: millions of memory cells to implement the logic, embedded memories, routing, and embedded microprocessor cores. These re-programmable systems-on-chip platforms must be fault-tolerant to cope with current requirements.

  12. Fault Tolerance in ZigBee Wireless Sensor Networks

    Wireless sensor networks (WSN) based on the IEEE 802.15.4 Personal Area Network standard are finding increasing use in the home automation and emerging smart energy markets. The network and application layers, based on the ZigBee 2007 PRO Standard, provide a convenient framework for component-based software that supports customer solutions from multiple vendors. This technology is supported by System-on-a-Chip solutions, resulting in extremely small and low-power nodes. The Wireless Connections in Space Project addresses the aerospace flight domain for both flight-critical and non-critical avionics. WSNs provide the inherent fault tolerance required for aerospace applications utilizing such technology. The team from Ames Research Center has developed techniques for assessing the fault tolerance of ZigBee WSNs challenged by radio frequency (RF) interference or WSN node failure.

  13. Fault Detection and Isolation and Fault Tolerant Control of Wind Turbines Using Set-Valued Observers

    Research on wind turbine Operations & Maintenance (O&M) procedures is critical to the expansion of Wind Energy Conversion systems (WEC). In order to reduce O&M costs and increase the lifespan of the turbine, we study the application of Set-Valued Observers (SVO) to the problem of Fault Detection and Isolation (FDI) and Fault Tolerant Control (FTC) of wind turbines, by taking advantage of the recent advances in SVO theory for model invalidation. A simple wind turbine model is presented along with possible faulty scenarios. The FDI algorithm is built on top of the described model, taking into account...

  14. Redundant finite rings for fault-tolerant signal processors

    Jullien, Graham A.; Bizzan, S. S.; Wigley, Neil M.; Miller, W. C.


  15. Control switching in high performance and fault tolerant control

    The problem of reliability in high performance control and in fault tolerant control is considered in this paper. A feedback controller architecture for high performance and fault tolerance is considered. The architecture is based on the Youla-Jabr-Bongiorno-Kucera (YJBK) parameterization. By usi...

  16. Data Driven Fault Tolerant Control: A Subspace Approach

    The main stream research on fault detection and fault tolerant control has been focused on model based methods. As far as a model is concerned, changes therein due to faults have to be extracted from measured data. Generally speaking, existing approaches process measured inputs and outputs either by

  17. Concepts and Methods in Fault-tolerant Control

    Faults in automated processes will often cause undesired reactions and shut-down of a controlled plant, and the consequences could be damage to technical parts of the plant, to personnel or the environment. Fault-tolerant control combines diagnosis with control methods to handle faults in an intelligent way. The aim is to prevent simple faults from developing into serious failure and hence increase plant availability and reduce the risk of safety hazards. Fault-tolerant control merges several disciplines into a common framework to achieve these goals. The desired features are obtained through on...

  18. Robust fault-tolerant control for wing flutter under actuator failure

    Many control laws, such as optimal controllers and classical controllers, have been applied to suppress the aeroelastic vibrations of aeroelastic systems. However, those control laws may not work effectively if the aeroelastic system involves actuator faults. In current studies of wing flutter of reentry vehicles, the effect of actuator faults on the wing flutter system is rarely considered and few of the fault-tolerant control problems are taken into account. In this paper, we use the radial basis function neural network and the finite-time H∞ adaptive fault-tolerant control technique to deal with the flutter problem of wings, which is affected by actuator faults, actuator saturation, parameter uncertainties and external disturbances. The theory of this article includes the modeling of wing flutter and fault-tolerant controller design. The stability of the finite-time adaptive fault-tolerant controller is theoretically proved. Simulation results indicate that the designed fault-tolerant flutter controller can effectively deal with the faults in the flutter system and can promptly suppress the wing flutter as well.

  20. On the Practicality of 'Practical' Byzantine Fault Tolerance

    Byzantine Fault Tolerant (BFT) systems are considered by the systems research community to be state of the art with regard to providing reliability in distributed systems. BFT systems provide safety and liveness guarantees under reasonable assumptions, among a set of nodes where at most f nodes display arbitrarily incorrect behaviors, known as Byzantine faults. Despite this, BFT systems are still rarely used in practice. In this paper we describe our experience, from an application developer's perspective, trying to leverage the publicly available and highly-tuned PBFT middleware (by Castro and Liskov), to provide provable reliability guarantees for an electronic voting application with high security and robustness needs. We describe several obstacles we encountered and drawbacks we identified in the PBFT approach. These include some that we tackled, such as lack of support for dynamic client management and leaving state management completely up to the application. Others still remaining include the lack of...
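
    The bound "at most f nodes" translates into concrete cluster sizes through the standard PBFT relations: tolerating f Byzantine replicas requires n >= 3f + 1 replicas, and progress is made on quorums of 2f + 1 matching messages. The small sketch below only restates that arithmetic; the function names are hypothetical and are not part of the PBFT middleware's interface.

```python
def max_byzantine_faults(n: int) -> int:
    """Largest f such that n >= 3f + 1."""
    if n < 1:
        raise ValueError("a cluster needs at least one replica")
    return (n - 1) // 3

def min_replicas(f: int) -> int:
    """Smallest cluster that tolerates f Byzantine replicas."""
    return 3 * f + 1

def quorum_size(f: int) -> int:
    """Matching messages needed so that any two quorums share a correct replica."""
    return 2 * f + 1

# Four replicas tolerate one Byzantine fault and use quorums of three.
assert max_byzantine_faults(4) == 1
assert quorum_size(1) == 3

# Growing from 4 to 6 replicas buys no extra tolerance; 7 are needed for f = 2.
assert max_byzantine_faults(6) == 1
assert min_replicas(2) == 7

# Any two quorums of size 2f + 1 out of 3f + 1 overlap in at least f + 1
# replicas, so at least one replica in the overlap is correct.
for f in range(1, 6):
    n = min_replicas(f)
    assert 2 * quorum_size(f) - n == f + 1
```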

  1. Fault diagnosis and fault-tolerant control and guidance for aerospace vehicles from theory to application

    Fault Diagnosis and Fault-Tolerant Control and Guidance for Aerospace demonstrates the attractive potential of recent developments in control for resolving such issues as improved flight performance, self-protection and extended life of structures. Importantly, the text deals with a number of practically significant considerations: tuning, complexity of design, real-time capability, evaluation of worst-case performance, robustness in harsh environments, and extensibility when development or adaptation is required. Coverage of such issues helps to draw the advanced concepts arising from academic research back towards the technological concerns of industry. Initial coverage of basic definitions and ideas and a literature review gives way to a treatment of important electrical flight control system failures: the oscillatory failure case, runaway, and jamming. Advanced fault detection and diagnosis for linear and nonlinear systems are described. Lastly recovery strategies appropriate to remaining actuator/sensor/c...

  2. Guaranteed Cost Fault-tolerant Control of Networked Control Systems with Short Output Delay and Short Control Delay Based on State Observer

    Supposing that the sensor and controller nodes were time-driven and the actuator node was event-driven, the problem of integrity against sensor failures for networked control systems with short output delay and short control delay was discussed based on an observer. The state observer of the system was designed according to the time-delay compensation strategy. Then, considering possible sensor failures, an augmented mathematical model for the observer-based networked control systems was developed. In terms of the given quadratic performance index function, the integrity condition of the system was given, and the designs for the guaranteed cost fault-tolerant controller and observer were presented by using the cooperative design approach of the controller and observer and the approach of bilinear matrix inequalities. Finally, a numerical simulation example demonstrated that the conclusions are feasible and effective. The proposed control method meets the requirements of industrial networked control systems.

  3. Fault-Tolerant Postselected Quantum Computation: Threshold Analysis

    The schemes for fault-tolerant postselected quantum computation given in [Knill, Fault-Tolerant Postselected Quantum Computation: Schemes,] are analyzed to determine their error-tolerance. The analysis is based on computer-assisted heuristics. It indicates that if classical and quantum communication delays are negligible, then scalable qubit-based quantum computation is possible with errors above 1% per elementary quantum gate.

  4. Survey and future directions of fault-tolerant distributed computing on board spacecraft

    Current and future space missions demand highly reliable on-board computing systems, which are capable of carrying out high-performance data processing. At present, no single computing scheme satisfies both the highly reliable operation requirement and the high-performance computing requirement. The aim of this paper is to review existing systems and offer a new approach to addressing the problem. In the first part of the paper, a detailed survey of fault-tolerant distributed computing systems for space applications is presented. Fault types and assessment criteria for fault-tolerant systems are introduced. Redundancy schemes for distributed systems are analyzed. A review of the state-of-the-art on fault-tolerant distributed systems is presented and limitations of current approaches are discussed. In the second part of the paper, a new fault-tolerant distributed computing platform with wireless links among the computing nodes is proposed. Novel algorithms, enabling important aspects of the architecture, such as time slot priority adaptive fault-tolerant channel access and fault-tolerant distributed computing using task migration are introduced.

  5. Architecting Fault Tolerance with Exception Handling: Verification and Validation

    When building dependable systems by integrating untrusted software components that were not originally designed to interact with each other, architectural mismatches related to assumptions about their failure behaviour are likely to occur. These mismatches, if not prevented during system design, have to be tolerated during runtime. This paper presents an architectural abstraction based on exception handling for structuring fault-tolerant software systems. This abstraction comprises several components and connectors that promote an existing untrusted software element into an idealised fault-tolerant architectural element. Moreover, it is considered in the context of a rigorous software development approach based on formal methods for representing the structure and behaviour of the software architecture. The proposed approach relies on a formal specification and verification for analysing exception propagation, and verifying important dependability properties, such as deadlock freedom, and scenarios of architectural reconfiguration. The formal models are automatically generated using model transformation from UML diagrams: component diagram representing the system structure, and sequence diagrams representing the system behaviour. Finally, the formal models are also used for generating unit and integration test cases that are used for assessing the correctness of the source code. The feasibility of the proposed architectural approach was evaluated on an embedded critical case study.

  6. Fault-Tolerant Approach for Modular Multilevel Converters under Submodule Faults

    The modular multilevel converter (MMC) is attractive for medium- or high-power applications because of the advantages of its high modularity, availability, and high power quality. The fault-tolerant operation is one of the important issues for the MMC. This paper proposed a fault-tolerant approac...

  7. Fusion of Built in Test (BIT) Technologies with Embeddable Fault Tolerant Techniques for Power System and Drives in Space Exploration Project

    National Aeronautics and Space Administration — Impact Technologies has proposed development of an effective prognostic and fault accommodation system for critical DC power systems including PV systems. Overall...

  8. The development of basic technology for instrumentation and control - A study on the development of fault-tolerant digital control systems

    Lee, Sang Jeong; Seong, Se Jin; Yoon, Sang Joon; Park, Sang Hyun; Jeong, Il Young; Kwon, Hyun Jin [Chungnam National University, Taejon (Korea, Republic of)]; Kim, Sang Woo [Pohang University of Science and Technology, Pohang (Korea, Republic of)]


    This project developed a fault-tolerant control algorithm with bumpless transfer applicable to the steam generator level control system in the low power operating range. A bumpless transfer scheme is introduced to remove the bump in the control signal when control failures occur. In particular, the bumpless scheme can be used efficiently with the proposed PID and AGPC control algorithms, and it showed good performance in the simulation study. This project also analyzed the V and V system structure and summarized PID tuning methods for porting the proposed algorithms to the V and V system. Finally, this project developed a test bed in which the PID and AGPC algorithms are implemented and the derived steam generator model is included as the plant. 51 refs., 17 tabs., 88 figs. (author)
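
    One common way to obtain bumpless transfer, shown here only as a generic sketch and not as the project's algorithm, is to keep the standby controller's integrator tracking the active control signal, so that the output is continuous at the moment of switching. A PI controller is used for brevity; the gains, class name and method names are assumptions.

```python
class PIController:
    """Discrete PI controller with an integrator that can be re-initialised."""

    def __init__(self, kp: float, ki: float, dt: float):
        self.kp, self.ki, self.dt = kp, ki, dt
        self.integral = 0.0  # accumulated integral contribution to the output

    def output(self, error: float) -> float:
        """Control signal for the current error, without advancing the state."""
        return self.kp * error + self.integral

    def step(self, error: float) -> float:
        """Advance the integrator by one sample and return the control signal."""
        self.integral += self.ki * error * self.dt
        return self.output(error)

    def track(self, error: float, u_active: float) -> None:
        """Set the integrator so that output(error) equals the active signal."""
        self.integral = u_active - self.kp * error

# The active controller has been running and currently outputs u_active.
u_active, error = 3.7, 1.0

# Without tracking, switching to a freshly started standby causes a bump.
naive = PIController(kp=2.0, ki=0.5, dt=0.1)
assert abs(naive.output(error) - u_active) > 1.0

# With tracking, the standby starts exactly at the active control signal.
standby = PIController(kp=2.0, ki=0.5, dt=0.1)
standby.track(error, u_active)
assert abs(standby.output(error) - u_active) < 1e-12
```

    After the switch the standby controller simply continues with `step`, so the control signal evolves smoothly from the value it inherited.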

  9. A Framework-Based Approach for Fault-Tolerant Service Robots

    Heejune Ahn


  10. Fault Tolerant Control for Networked Control Systems with Access Constraints

    In this paper, the problem of fault tolerant control (FTC) considering actuator faults for networked control systems (NCSs) with access constraints is addressed. A static scheduling method, periodic communication sequence (PCS), is applied to allocate network resources and schedule access to the network. The novelty of this work lies in that the NCS with PCS and actuator faults is modeled as a periodic switching system, and the schedule-dependent Lyapunov function method is used to design the fault tolerant controller. For the data packets dropped by the scheduling strategy at each sampling time, two recovery strategies are considered: replacing them with 0 or with the value from the previous sampling time. Additionally, the problem of robust FTC for a controlled plant with external energy-bounded disturbance is also discussed under each of these two situations. Numerical examples are given to illustrate the effectiveness of the proposed design methods.
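
    The two ways of recovering packets that the schedule did not transmit, substituting 0 or holding the value from the previous sampling time, can be illustrated as follows. This is a sketch of the signal reconstruction only; the function name, the `strategy` labels and the use of `None` to mark a dropped packet are assumptions made for the example.

```python
from typing import List, Optional

def reconstruct(packets: List[Optional[float]], strategy: str) -> List[float]:
    """Fill in dropped packets (None) using the chosen recovery strategy.

    strategy = "zero": a missing sample is replaced by 0.
    strategy = "hold": a missing sample repeats the previously used value.
    """
    if strategy not in ("zero", "hold"):
        raise ValueError("strategy must be 'zero' or 'hold'")
    recovered = []
    previous = 0.0  # value assumed before the first packet arrives
    for packet in packets:
        if packet is None:
            value = 0.0 if strategy == "zero" else previous
        else:
            value = packet
        recovered.append(value)
        previous = value
    return recovered

received = [1.0, None, 2.0, None, None]
assert reconstruct(received, "zero") == [1.0, 0.0, 2.0, 0.0, 0.0]
assert reconstruct(received, "hold") == [1.0, 1.0, 2.0, 2.0, 2.0]
```

    The two choices lead to different closed-loop models, which is why the record treats them as separate situations in the controller design.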

  11. Software Implemented Fault-Tolerant (SIFT) user's guide

    Program development for a Software Implemented Fault Tolerant (SIFT) computer system is accomplished in the NASA LaRC AIRLAB facility using a DEC VAX-11 to interface with eight Bendix BDX 930 flight control processors. The interface software which provides this SIFT program development capability was developed by AIRLAB personnel. This technical memorandum describes the application and design of this software in detail, and is intended to assist both the user in performance of SIFT research and the systems programmer responsible for maintaining and/or upgrading the SIFT programming environment.

  12. Compilation and Synthesis for Fault-Tolerant Digital Microfluidic Biochips

    Microfluidic-based biochips are replacing the conventional biochemical analyzers, by integrating all the necessary functions for biochemical analysis using microfluidics. The digital microfluidic biochips (DMBs) manipulate discrete amounts of fluids of nanoliter volume, named droplets, on an array ... the introduction of the redundancy required for fault-tolerance. We consider both time redundancy, i.e., re-executing erroneous operations, and space redundancy, i.e., creating redundant droplets for fault-tolerance. Error recovery is performed such that the number of transient faults tolerated is maximized...

  13. Fault-Tolerant Software Design for the Distributed Supervisory Control Systems

    Fault-tolerant techniques are important methods for improving the reliability of computer control systems. Taking some practical examples as background material, this paper discusses fault-tolerant software techniques for distributed computer supervisory control systems and the application of artificial intelligence to fault-tolerant design.

  14. Fault-Tolerant Control Strategy Research for the Fuel System of a Certain Type of Aircraft

    This paper introduces the basic concept of fault-tolerant control, analyzes fault-tolerant control methods on that basis, and presents a fault-tolerant control strategy for the fuel system of a certain type of aircraft, which improves the reliability of the fuel system and the flight quality of the aircraft.

  15. Correct-by-Construction Attack-Tolerant Systems


    for rendering systems Byzantine fault tolerant and to ideas for monitoring distributed system behavior and responding to unusual events . We believe...attack-tolerance, fault tolerant systems, correct-by-construction protocols, formal methods, event logic, functional distributed processes, cyber...nation’s ability to use advanced computer science and substantial computing power to enhance the ability of our systems to detect attacks and

  16. A novel fault tolerant permanent magnet synchronous motor with improved optimal torque control for aerospace application

    Guo Hong


  17. Fault Diagnosis and Fault Tolerant Control with Application on a Wind Turbine Low Speed Shaft Encoder

    . This sensor has to be correct as blade pitch actions should be different at different azimuth angle as the wind speed varies within the rotor field due to different phenomena. A scheme detecting faults in this sensor has previously been designed for the application of a high end fault diagnosis and fault...... tolerant control of wind turbines using a benchmark model. In this paper, the fault diagnosis scheme is improved and integrated with a fault accommodation scheme which enables and disables the individual pitch algorithm based on the fault detection. In this way, the blade and tower loads are not increased...

  18. Exact Regenerating Codes for Byzantine Fault Tolerance in Distributed Storage

    Due to the use of commodity software and hardware, crash-stop and Byzantine failures are likely to be more prevalent in today's large-scale distributed storage systems. Regenerating codes have been shown to be a more efficient way to disperse information across multiple nodes and recover crash-stop failures in the literature. In this paper, we present the design of regeneration codes in conjunction with integrity check that allows exact regeneration of failed nodes and data reconstruction in presence of Byzantine failures. A progressive decoding mechanism is incorporated in both procedures to leverage computation performed thus far. The fault-tolerance and security properties of the schemes are also analyzed.

  19. Design of passive fault-tolerant controllers of a quadrotor based on sliding mode theory

    In this paper, sliding mode control is used to develop two passive fault tolerant controllers for an AscTec Pelican UAV quadrotor. In the first approach, a regular sliding mode controller (SMC) augmented with an integrator uses the robustness property of variable structure control to tolerate partial actuator faults. The second approach is a cascaded sliding mode controller with inner and outer SMC loops. In this configuration, faults are tolerated in the fast inner loop controlling the velocity system. The controllers are tuned to find the optimal values of the sliding mode controller gains using the ecological systems algorithm (ESA), a biologically inspired stochastic search algorithm based on the natural equilibrium of animal species. The controllers are tested using SIMULINK in the presence of two different types of actuator faults: partial loss of motor power affecting all the motors at once, and partial loss of motor speed. Results of the quadrotor following a continuous path demonstrate the effectiveness of the controllers, which are able to tolerate a significant number of actuator faults despite the lack of hardware redundancy in the quadrotor system. Tuning the controller using a faulty system further improves its ability to tolerate more severe faults. Simulation results show that passive schemes retain their important role in fault tolerant control and are complementary to active techniques.

  20. Fault-tolerant Sensor Fusion for Marine Navigation

    where essential navigation information is provided even with multiple faults in instrumentation. The paper proposes a provably correct implementation through auto-generated state-event logics in a supervisory part of the algorithms. Test results from naval vessels document the performance and shows...... events where the fault-tolerant sensor fusion provided uninterrupted navigation data despite temporal instrument defects...

  1. Development and analysis of the Software Implemented Fault-Tolerance (SIFT) computer

    SIFT (Software Implemented Fault Tolerance) is an experimental, fault-tolerant computer system designed to meet the extreme reliability requirements for safety-critical functions in advanced aircraft. Errors are masked by performing a majority voting operation over the results of identical computations, and faulty processors are removed from service by reassigning computations to the nonfaulty processors. This scheme has been implemented in a special architecture using a set of standard Bendix BDX930 processors, augmented by a special asynchronous-broadcast communication interface that provides direct, processor to processor communication among all processors. Fault isolation is accomplished in hardware; all other fault-tolerance functions, together with scheduling and synchronization are implemented exclusively by executive system software. The system reliability is predicted by a Markov model. Mathematical consistency of the system software with respect to the reliability model has been partially verified, using recently developed tools for machine-aided proof of program correctness.
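
    The error-masking step described in this abstract, a majority vote over the results of identical computations, can be sketched as follows. This is a generic illustration in Python, not SIFT's executive software; the function name `majority_vote` and the choice to raise an error when no strict majority exists are assumptions.

```python
from collections import Counter
from typing import Hashable, List, Sequence, Tuple


def majority_vote(results: Sequence[Hashable]) -> Tuple[Hashable, List[int]]:
    """Vote over replicated results.

    Returns the value reported by a strict majority of replicas, together
    with the indices of the replicas that disagreed (candidates for removal
    from service). Raises RuntimeError if no strict majority exists.
    """
    if not results:
        raise ValueError("at least one result is required")

    value, count = Counter(results).most_common(1)[0]
    if 2 * count <= len(results):
        raise RuntimeError("no strict majority among replicas")

    suspects = [i for i, r in enumerate(results) if r != value]
    return value, suspects
```

    For example, `majority_vote([42, 42, 7])` yields the voted value `42` and flags replica index `2`; in a scheme like the one described, work assigned to a flagged processor would then be reassigned to nonfaulty ones.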

  2. Fault Tolerant and Optimal Control of Wind Turbines with Distributed High-Speed Generators

    In this paper, the control scheme of a distributed high-speed generator system with a total of 12 generators and a nominal generator speed of 7000 min⁻¹ is studied. Specifically, a fault tolerant control (FTC) scheme is proposed to keep the turbine in operation in the presence of up to four simultaneous generator faults. The proposed controller structure consists of two layers: The upper layer is the baseline controller, which is separated into a partial load region with the generator torque as an actuating signal and the full-load operation region with the collective pitch angle as the other actuating signal. In addition, the lower layer is responsible for the fault diagnosis and FTC characteristics of the distributed generator drive train. The fault reconstruction and fault tolerant control strategy are tested in simulations with several actuator faults of different types.

  3. Design of passive fault-tolerant flight controller against actuator failures

    The problem of designing a passive fault-tolerant flight controller is addressed when the normal and faulty cases are prescribed. First, the considered fault and fault-free cases are represented by polytopes. Considering that the safety of a post-fault system is directly related to the maximum values of physical variables in the system, peak-to-peak gain is selected to represent the relationships among the amplitudes of actuator outputs, system outputs, and reference commands. Based on the parameter-dependent Lyapunov and slack methods, passive fault-tolerant flight controllers for actuator failure cases are designed, both in the absence and in the presence of system uncertainty. Case studies of an airplane under actuator failures are carried out to validate the effectiveness of the proposed approach.

  4. A New Approach to Robust and Fault Tolerant Control

    In this paper, we shall summarize a new approach to robust and fault tolerant control proposed recently by the author. This approach is based on a variation of all controller parametrization. This robust and fault-tolerant control design consists of two parts: a nominal performance controller and a robustness controller, and works in such a way that when a component (sensor, actuator, etc.) failure is detected, the controller structure is reconfigured by adding a robustness loop to compensate for the fault. We shall illustrate how this strategy works under various situations.

  5. Dynamic Output Feedback Based Active Decentralized Fault-Tolerant Control for Reconfigurable Manipulator with Concurrent Failures

    The goal of this paper is to describe an active decentralized fault-tolerant control (ADFTC) strategy based on dynamic output feedback for reconfigurable manipulators with concurrent actuator and sensor failures. Each joint module of the reconfigurable manipulator is considered as a subsystem, and the fault is treated as the unknown input of the subsystem. Firstly, by virtue of the linear matrix inequality (LMI) technique, a decentralized proportional-integral observer (DPIO) is designed to estimate and compensate the sensor fault online, from which the compensated system model is derived. Then, the actuator fault is estimated similarly by another DPIO, also using LMI, and a sufficient condition for the existence of an H∞ fault-tolerant controller in dynamic output feedback form is presented for the compensated system model. Furthermore, the dynamic output feedback controller is presented based on the estimation of the actuator fault to realize active fault-tolerant control. Finally, two 3-DOF reconfigurable manipulators with different configurations are employed to verify the effectiveness of the proposed scheme in simulation. The main advantages of the proposed scheme are that it can handle concurrent faults acting on the actuator and sensor of the same joint module and that it requires no fault detection and isolation process; moreover, it is well suited to the modularity of the reconfigurable manipulator.

  6. Fault detection and isolation in systems with parametric faults

    The problem of fault detection and isolation of parametric faults is considered in this paper. A fault detection problem based on parametric faults is associated with internal parameter variations in the dynamical system. A fault detection and isolation method for parametric faults is formulated...

  7. Coordinated Fault-Tolerance for High-Performance Computing Final Project Report

    With the Coordinated Infrastructure for Fault Tolerance Systems (CIFTS, as the original project came to be called) project, our aim has been to understand and tackle the following broad research questions, the answers to which will help the HEC community analyze and shape the direction of research in the field of fault tolerance and resiliency on future high-end leadership systems. Will availability of global fault information, obtained by fault information exchange between the different HEC software on a system, allow individual system software to better detect, diagnose, and adaptively respond to faults? If fault-awareness is raised throughout the system through fault information exchange, is it possible to get all system software working together to provide a more comprehensive end-to-end fault management on the system? What are the missing fault-tolerance features that widely used HEC system software lacks today that would inhibit such software from taking advantage of systemwide global fault information? What are the practical limitations of a systemwide approach for end-to-end fault management based on fault awareness and coordination? What mechanisms, tools, and technologies are needed to bring about fault awareness and coordination of responses on a leadership-class system? What standards, outreach, and community interaction are needed for adoption of the concept of fault awareness and coordination for fault management on future systems? Keeping our overall objectives in mind, the CIFTS team has taken a parallel fourfold approach. Our central goal was to design and implement a light-weight, scalable infrastructure with a simple, standardized interface to allow communication of fault-related information through the system and facilitate coordinated responses. 
    This work led to the development of the Fault Tolerance Backplane (FTB) publish-subscribe API specification, together with a reference implementation and several experimental implementations on top of

  8. Using Relocatable Bitstreams For Fault Tolerance


    is called a fault [AL81]. Faults can be classified by their duration, nature and extent [Nel90]. A fault’s duration is transient, intermittent or...Jimeno, E. de la Torre, and T. Riesgo . Straight Method for Reallocation of Complex Cores by Dynamic Reconfiguration in Virtex II FPGAs. The 16th IEEE

  9. Formal and Informal Modeling of Fault Tolerant Noc Architectures

    The proposed approach, based on B-event formal techniques, captures aspects and constraints related to the reliability of a NoC (Network-on-Chip) and the overhead of fault tolerance solutions: it designs a fault-tolerant NoC for a SoC (System-on-Chip) containing configurable FPGA (Field Programmable Gate Array) technology by extracting the properties of the NoC architecture. We illustrate our methodology by developing several refinements that produce a QNoC (Quality of Service Network-on-Chip) switch architecture from specification to test. We show how the B-event formalism can follow the life cycle of NoC design and test: for example, VHDL (VHSIC Hardware Description Language) simulation code written for a certain kind of architecture can help us optimize that architecture and produce a new one, and we can inject the new properties of the new QNoC architecture into the formal B-event specification. B-event is associated with the Rodin tool environment. As a case study, the last stage of refinement uses a wireless network to generate a complete test environment for the studied application.

  10. Adaptive Fault Tolerance for Many-Core Based Space-Borne Computing

    This paper describes an approach to providing software fault tolerance for future deep-space robotic NASA missions, which will require a high degree of autonomy supported by an enhanced on-board computational capability. Such systems have become possible as a result of the emerging many-core technology, which is expected to offer 1024-core chips by 2015. We discuss the challenges and opportunities of this new technology, focusing on introspection-based adaptive fault tolerance that takes into account the specific requirements of applications, guided by a fault model. Introspection supports runtime monitoring of the program execution with the goal of identifying, locating, and analyzing errors. Fault tolerance assertions for the introspection system can be provided by the user, domain-specific knowledge, or via the results of static or dynamic program analysis. This work is part of an on-going project at the Jet Propulsion Laboratory in Pasadena, California.

  11. Scalability, performance, and fault tolerance of PACS architectures

    Blume, Hartwig R.; Prior, Fred W.; di Pierro, Milan C.; Goble, John C.; Lodgberg, Jonas; Kenney, Robert S.; Goeringer, Fred


    Three data-base architectures may be distinguished among Picture Archiving and Communication Systems (PACSs): (1) configurations with a logically and physically centralized data-base and file server, (2) systems with physically distributed file servers and a logically centralized data-base, and (3) installations with logically and physically distributed data-bases and file servers. A brief overview of these architectures and their scalability, performance, and fault-tolerance is given. A PACS for an existing large university hospital is designed for the first as well as the second architecture using given image production data and workflow. We evaluate the fault-tolerance of the two architectures. By modeling the workflow and employing queuing theory, solutions with practically realizable data transfer requirements are found for both architectures. With today's performance and cost of computers, storage, and information management technologies, the second and third architectures are preferably implemented, depending on the size of the installation. The architectures offer almost unlimited scalability, very high fault-tolerance, and optimized workflow. We describe a modern commercial PACS that adheres to the open-systems concept and consists of software application programs that run, independent of specific computer and network components, on off-the-shelf hardware and under standard multi-platform operating systems and utilize commercial data-base management systems and network managers. The system is based on the second architecture with multiple islands of functionality, each with servers and archive modules and a physically distributed data-base. Our PACS architecture supports browser technology: Workstations use the data-base to determine the location of needed information and then, through the image browser, mount the appropriate file server for access. The architecture supports a concept similar to domain name server (DNS) directory services on the

  12. Realization of User Level Fault Tolerant Policy Management through a Holistic Approach for Fault Correlation

    Many modern scientific applications, which are designed to utilize high performance parallel computers, occupy hundreds of thousands of computational cores running for days or even weeks. Since many scientists compete for resources, most supercomputing centers practice strict scheduling policies and perform meticulous accounting on their usage. Thus computing resources and time assigned to a user are considered invaluable. However, most applications are not well prepared for unforeseeable faults, still relying on primitive fault tolerance techniques. Considering that ever-plunging mean time to interrupt (MTTI) is making scientific applications more vulnerable to faults, it is increasingly important to provide users not only an improved fault tolerant environment, but also a framework to support their own fault tolerance policies so that their allocation times can be best utilized. This paper addresses a user level fault tolerance policy management based on a holistic approach to digest and correlate fault related information. It introduces simple semantics with which users express their policies on faults, and illustrates how event correlation techniques can be applied to manage and determine the most preferable user policies. The paper also discusses an implementation of the framework using open source software, and demonstrates, as an example, how a molecular dynamics simulation application running on the institutional cluster at Oak Ridge National Laboratory benefits from it.

  13. A New Fault-tolerant Switched Reluctance Motor with reliable fault detection capability

    while no extra search coil is actually needed. The motor itself is able to continue to work under any faulted conditions, providing fault-tolerant features. The working principle and performance evaluation of this motor are demonstrated in this paper, and Finite Element Analysis results are provided....

  14. Robust and Fault-Tolerant Linear Parameter-Varying Control of Wind Turbines

    High performance and reliability are required for wind turbines to be competitive within the energy market. To capture their nonlinear behavior, wind turbines are often modeled using parameter-varying models. In this paper we design and compare multiple linear parameter-varying (LPV) controllers......, designed using a proposed method that allows the inclusion of both faults and uncertainties in the LPV controller design. We specifically consider a 4.8 MW, variable-speed, variable-pitch wind turbine model with a fault in the pitch system. We propose the design of a nominal controller (NC), handling...... the parameter variations along the nominal operating trajectory caused by nonlinear aerodynamics. To accommodate the fault in the pitch system, an active fault-tolerant controller (AFTC) and a passive fault-tolerant controller (PFTC) are designed. In addition to the nominal LPV controller, we also propose...

  15. Reconfigurable Control of Input Affine Nonlinear Systems under Actuator Fault

    This paper proposes a fault tolerant control method for input-affine nonlinear systems using a nonlinear reconfiguration block (RB). The basic idea of the method is to insert the RB between the plant and the nominal controller such that fault tolerance is achieved without re-designing the nominal...

  16. An Aspect-Oriented Approach to Assessing Fault Tolerance


    misconfiguration, and so forth. The Hadoop File System [8] includes a fault injection framework built using AspectJ similar to that which we describe in...this paper. The main differences between our framework and Hadoop fault injectors is that the Hadoop fault injector only supports probabilistic...Transformation and Net-Centric Systems Conference, Orlando, Florida, April 2009. [8] “ Hadoop fault injection,

  17. Online Reconfigurable Self-Timed Links for Fault Tolerant NoC

    of the links. The fault tolerance properties are analyzed using a fault model containing temporary, intermittent, and permanent faults that occur both as bursts and as single faults. The results show a considerable enhancement in the fault tolerance at the cost of performance and area, and with only a slight increase in power consumption.

  18. Fault Tolerance Assistant (FTA): An Exception Handling Programming Model for MPI Applications

    Future high-performance computing systems may face frequent failures with their rapid increase in scale and complexity. Resilience to faults has become a major challenge for large-scale applications running on supercomputers, which demands fault tolerance support for prevalent MPI applications. Among failure scenarios, process failures are one of the most severe issues as they usually lead to termination of applications. However, the widely used MPI implementations do not provide mechanisms for fault tolerance. We propose FTA-MPI (Fault Tolerance Assistant MPI), a programming model that provides support for failure detection, failure notification and recovery. Specifically, FTA-MPI exploits a try/catch model that enables failure localization and transparent recovery of process failures in MPI applications. We demonstrate FTA-MPI with synthetic applications and a molecular dynamics code CoMD, and show that FTA-MPI provides high programmability for users and enables convenient and flexible recovery of process failures.

  19. Fault tolerant control of multivariable processes using auto-tuning PID controller.

    Fault tolerant control of dynamic processes is investigated in this paper using an auto-tuning PID controller. A fault tolerant control scheme is proposed, comprising an auto-tuning PID controller based on an adaptive neural network model. The model is trained online using the extended Kalman filter (EKF) algorithm to learn the post-fault dynamics of the system. Based on this model, the PID controller adjusts its parameters to compensate for the effects of the faults, so that the control performance is recovered from degradation. The auto-tuning algorithm for the PID controller is derived with the Lyapunov method, and therefore the model-predicted tracking error is guaranteed to converge asymptotically. The method is applied to a simulated two-input two-output continuous stirred tank reactor (CSTR) with various faults, which demonstrates the applicability of the developed scheme to industrial processes.

  20. Fault Tolerant Controller Design for a Faulty UAV Using Fuzzy Modeling Approach

    We address a fault tolerant control (FTC) problem for an unmanned aerial vehicle (UAV) under possible simultaneous actuator saturation and fault occurrence. Firstly, Takagi-Sugeno fuzzy models representing the nonlinear flight control system (FCS) of a UAV with unknown disturbances and actuator saturation are established. Then, a normal H-infinity tracking controller is presented using an online estimator, which is introduced to weaken the saturation effect. Based on the normal tracking controller, we propose an adaptive fault tolerant tracking controller (FTTC) to solve the actuator loss of effectiveness (LOE) fault problem. Compared with previous work, the approach developed in our research does not rely on any fault diagnosis unit and is easily applied in engineering. Finally, simulation results indicate the efficiency of the presented FTC scheme.

  1. A benchmark for fault tolerant flight control evaluation

    A large transport aircraft simulation benchmark (REconfigurable COntrol for Vehicle Emergency Return - RECOVER) has been developed within the GARTEUR (Group for Aeronautical Research and Technology in Europe) Flight Mechanics Action Group 16 (FM-AG(16)) on Fault Tolerant Control (2004-2008) for the integrated evaluation of fault detection and identification (FDI) and reconfigurable flight control strategies. The benchmark includes a suitable set of assessment criteria and failure cases, based on reconstructed accident scenarios, to assess the potential of new adaptive control strategies to improve aircraft survivability. The application of reconstruction and modeling techniques, based on accident flight data, has resulted in high-fidelity nonlinear aircraft and fault models to evaluate new Fault Tolerant Flight Control (FTFC) concepts and their real-time performance to accommodate in-flight failures.

  2. Fault tolerant vector control of induction motor drive

    For electric drives used in technical facilities of hazardous industries, such as nuclear, military, and chemical, increasing resiliency and survivability is an urgent task. The construction principle of a fault-tolerant vector control system for an asynchronous electric drive is presented. The recovery of the operability of a three-phase induction motor drive in emergency mode using a two-phase vector control system is shown. The formation of a simulation model of the unbalanced asynchronous electric drive in emergency mode is described. The modeling uses a coordinate transformation that provides for emergency operation of the unbalanced electric drive. Results of modeling the transient caused by the loss of a motor stator phase are given. During a phase power failure, the induction motor cannot maintain a circular rotating field in the air gap of the motor or ensure the restoration of its operability at rated torque and speed.

  3. Fault tolerance in space-based digital signal processing and switching systems: Protecting up-link processing resources, demultiplexer, demodulator, and decoder

    Fault tolerance features in the first three major subsystems appearing in the next generation of communications satellites are described. These satellites will contain extensive but efficient high-speed processing and switching capabilities to support the low signal strengths associated with very small aperture terminals. The terminals' numerous data channels are combined through frequency division multiplexing (FDM) on the up-links and are protected individually by forward error-correcting (FEC) binary convolutional codes. The front-end processing resources, demultiplexer, demodulators, and FEC decoders extract all data channels which are then switched individually, multiplexed, and remodulated before retransmission to earth terminals through narrow beam spot antennas. Algorithm based fault tolerance (ABFT) techniques, which relate real number parity values with data flows and operations, are used to protect the data processing operations. The additional checking features utilize resources that can be substituted for normal processing elements when resource reconfiguration is required to replace a failed unit.
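
    The idea behind algorithm based fault tolerance (ABFT), attaching real-number parity values to a linear operation and checking them afterwards, can be illustrated on a matrix-vector product. The sketch below is a generic textbook-style example in Python, not the satellite design from this report; the helper names, the `fault` injection argument, and the tolerance value are assumptions made for illustration.

```python
from typing import List, Optional, Sequence, Tuple

Matrix = Sequence[Sequence[float]]


def matvec(a: Matrix, x: Sequence[float]) -> List[float]:
    """Plain matrix-vector product y = A x."""
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]


def checksum_row(a: Matrix) -> List[float]:
    """Column sums of A: the real-number parity row."""
    return [sum(col) for col in zip(*a)]


def checked_matvec(
    a: Matrix,
    x: Sequence[float],
    tol: float = 1e-9,
    fault: Optional[Tuple[int, float]] = None,
) -> Tuple[List[float], bool]:
    """Compute y = A x and verify it against the parity row.

    Because sum(A x) equals (column sums of A) . x, a mismatch between the
    two indicates a computation error. `fault=(i, delta)` is a hypothetical
    hook that corrupts y[i] to demonstrate detection.
    """
    y = matvec(a, x)
    if fault is not None:
        index, delta = fault
        y[index] += delta

    expected = sum(c * xj for c, xj in zip(checksum_row(a), x))
    ok = abs(sum(y) - expected) <= tol * max(1.0, abs(expected))
    return y, ok
```

    A single sum check of this kind detects an error but does not locate it; locating or correcting errors requires additional weighted checksums.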

  4. Production of Reliable Flight Crucial Software: Validation Methods Research for Fault Tolerant Avionics and Control Systems Sub-Working Group Meeting

    The state of the art in the production of crucial software for flight control applications was addressed. The association between reliability metrics and software is considered. Thirteen software development projects are discussed. A short term need for research in the areas of tool development and software fault tolerance was indicated. For the long term, research in formal verification or proof methods was recommended. Formal specification and software reliability modeling were recommended as topics for both short and long term research.

  5. A Cost Effective Fault-Tolerant Scheme for RAIDs

    The rapid progress in mass storage technology has made it possible for designers to implement large data storage systems for a variety of applications. One of the efficient ways to build large storage systems is to use RAIDs as basic storage modules. In general, the data can be recovered in RAIDs only when one error occurs. But in large RAID systems, the fault probability will increase when the number of disks increases, and the use of disks with big storage capacity will prolong the recovery time, thus the probability of a second disk's fault will increase. Therefore, it is necessary to develop methods to recover data when two or more errors have occurred. In this paper, a fault tolerant scheme is proposed based on an extended Reed-Solomon code, a recovery procedure is designed to correct up to two errors, which is implemented by software and hardware together, and the scheme is verified by computer simulation. In this scheme, only two redundant disks are used to recover from up to two disk faults. The encoding and decoding methods, and the implementation based on software and hardware, are described. The application of the scheme in software RAIDs that are built in cluster computers is also described. Compared with existing methods such as EVENODD and DH, the proposed scheme has distinct improvements in implementation and redundancy.

  6. Study on Software Fault Injection Based on Onboard System

    Fault injection techniques are effective methods to evaluate the dependability and validate the fault tolerance mechanisms of computer systems. Among the different fault injection techniques, software implemented fault injection is regarded as one of the most promising techniques for evaluating the dependability of computer systems. In this paper, combining the advantages of software fault injection with the particularities of onboard systems, a new software fault injection model is put forward, which can be used to evaluate the dependability and validate the fault tolerance mechanisms of an onboard system. To evaluate the dependability of the onboard system effectively, an application algorithm describing how to use the model is presented. The experimental results show that, using the fault injection model and algorithm put forward in this paper, not only most low-level faults, such as processor register faults and memory faults, but also some high-level faults, such as code faults and branch faults, can be injected, which can be used to evaluate the dependability of onboard systems.

  7. A Fault Tolerant Resource Allocation Architecture for Mobile Grid

    Problem statement: To achieve a high level of reliability and availability, the grid infrastructure should be fault tolerant. Since the failure of resources affects job execution fatally, a fault tolerance service is essential to satisfy QoS requirements in grid computing with respect to mobile nodes. Approach: We propose a fault tolerant technique for improving reliability in a mobile grid environment that takes node mobility into account. The cluster head and monitoring agent are designed to address both resource and network failures, and recovery techniques for overcoming the faults are presented. Results: The proposed model achieves an identifiable performance when compared to the previous model (HRAA). Using simulation results, we analyze the effect of node and link failures on parameters such as delivery ratio, throughput, and delay against the rate of success. Conclusion: The proposed fault tolerant approach checks for availability of the nodes with the least workload for transferring the executed job to the cluster head, providing an alternate path in case of failure and thereby enhancing the reliability of the grid environment.

  8. Stochastic Fault-Tolerant Control of Networked Control Systems

    Considering random failures of the controller and actuator, a stochastic networked control model is established by introducing a switching matrix described by an independent Bernoulli random sequence. For this model, the problem of stochastic fault-tolerant control is investigated in the presence of random sensor failures, random controller failures, and both occurring simultaneously. Combining linear matrix inequality techniques with Lyapunov stability theory, criteria for exponential stability in mean square are obtained. Based on these criteria, stochastic fault-tolerant controllers are designed. A numerical example of a controller with feedback control and fault-tolerant control is given, and the simulation shows that the design method is valid.
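
    The Bernoulli switching idea in the abstract above can be illustrated with a scalar toy loop. This is only a sketch under stated assumptions: the plant x+ = a*x + alpha*b*k*x, the gains, and the function names are hypothetical and are not taken from the paper, which works with matrix models and LMI conditions. For the scalar case, mean-square stability reduces to the one-step factor of E[x^2] being below 1.

```python
import random


def mean_square_factor(a: float, b: float, k: float, p: float) -> float:
    """One-step growth factor of E[x^2] for x+ = (a + alpha*b*k) * x.

    alpha is Bernoulli: 1 (control applied) with probability p,
    0 (controller/actuator failed) with probability 1 - p.
    """
    return p * (a + b * k) ** 2 + (1.0 - p) * a ** 2


def simulate(a, b, k, p, x0=1.0, steps=50, seed=0):
    """Simulate one sample path of the scalar loop with random dropouts."""
    rng = random.Random(seed)
    x = x0
    path = [x]
    for _ in range(steps):
        alpha = 1 if rng.random() < p else 0  # Bernoulli switching variable
        x = (a + alpha * b * k) * x
        path.append(x)
    return path


if __name__ == "__main__":
    # Unstable open loop (a = 1.2), stabilizing gain k = -0.9.
    for p in (0.9, 0.2):
        factor = mean_square_factor(1.2, 1.0, -0.9, p)
        verdict = "mean-square stable" if factor < 1 else "not mean-square stable"
        print(f"p={p}: factor={factor:.3f} ({verdict})")
```

    In this toy case the loop tolerates occasional failures (p = 0.9) but not frequent ones (p = 0.2), which is the kind of trade-off the stability criteria in the paper are meant to capture.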

  9. A Novel Nanometric Fault Tolerant Reversible Subtractor Circuit

    Reversibility plays an important role when energy-efficient computations are considered. Reversible logic circuits have received significant attention in quantum computing, low-power CMOS design, optical information processing and nanotechnology in recent years. This study proposes a new fault tolerant reversible half-subtractor and a new fault tolerant reversible full-subtractor circuit at nanometric scale. We also demonstrate how the well-known Peres gate and TR gate can be synthesized from parity-preserving reversible gates. All the designs have nanometric scales.
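
    The notion of a parity-preserving reversible gate can be checked by brute force over the truth table. The sketch below is an illustration, not the paper's design: it uses the textbook Fredkin, Feynman double (F2G) and Peres gate definitions, and the helper names are hypothetical. It shows why a Peres gate has to be synthesized from parity-preserving gates in a fault tolerant design: on its own it is reversible but does not preserve parity.

```python
from itertools import product


def fredkin(a, b, c):
    """Fredkin gate: swaps b and c when the control a is 1."""
    return (a, c, b) if a else (a, b, c)


def feynman_double(a, b, c):
    """Feynman double gate (F2G): (a, a XOR b, a XOR c)."""
    return (a, a ^ b, a ^ c)


def peres(a, b, c):
    """Peres gate: (a, a XOR b, (a AND b) XOR c)."""
    return (a, a ^ b, (a & b) ^ c)


def is_reversible(gate, n=3):
    """A gate is reversible if its truth table is a bijection."""
    outputs = {gate(*bits) for bits in product((0, 1), repeat=n)}
    return len(outputs) == 2 ** n


def is_parity_preserving(gate, n=3):
    """Input parity must equal output parity for every input pattern."""
    return all(
        sum(bits) % 2 == sum(gate(*bits)) % 2
        for bits in product((0, 1), repeat=n)
    )


if __name__ == "__main__":
    for gate in (fredkin, feynman_double, peres):
        print(gate.__name__, is_reversible(gate), is_parity_preserving(gate))
```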

  10. Fault-Tolerant Control Research for Networked Control System under Communication Constraints

    Implementing a control system over a communication network induces inevitable time delays that may degrade performance and even cause instability. One of the most effective ways to reduce the negative effect of delays on the performance of a networked control system (NCS) is to reduce network traffic. In this paper, adjustable deadbands are explored as a way to reduce network traffic in an NCS. A method of fault-tolerant control for NCS is presented, which takes into account system response as well as network traffic. The integrity design for a class of NCS with sensor failures and actuator failures is analyzed based on robust fault-tolerant control theory and information scheduling. After detailed theoretical analysis, the paper also provides simulation results, which further validate the proposed scheme.
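
    A deadband reduces traffic by transmitting a sample only when it differs from the last transmitted value by more than a threshold. The following is a minimal sketch of that idea, assuming a fixed threshold `delta` and a hypothetical function name; the paper's adjustable deadband and its fault-tolerant design are not reproduced here.

```python
def deadband_filter(samples, delta):
    """Return the (index, value) pairs that would actually be transmitted.

    The first sample is always sent. A later sample is sent only if it
    differs from the last transmitted value by more than `delta`.
    """
    sent = []
    last = None
    for i, x in enumerate(samples):
        if last is None or abs(x - last) > delta:
            sent.append((i, x))
            last = x  # the receiver now holds this value
    return sent


if __name__ == "__main__":
    samples = [0.0, 0.05, 0.2, 0.25, 0.5]
    sent = deadband_filter(samples, delta=0.1)
    print(f"sent {len(sent)} of {len(samples)} samples: {sent}")
```

    A larger `delta` saves more traffic at the cost of a coarser signal at the receiver, which is why the abstract stresses weighing system response against network load.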

  11. Combining Artificial Intelligence and Robust Techniques with MRAC in Fault Tolerant Control

    This thesis presents different approaches for Fault Tolerant Control based on Model Reference Adaptive Control, Artificial Neural Networks, PID controller optimized by a Genetic Algorithm, Nonlinear, Robust and Linear Parameter Varying (LPV) control for Linear Time Invariant (LTI), LPV and nonlinear systems. All of the above techniques are integrated in different controller structures to prove their ability to accommodate a fault. Modern systems and their challenging op...

  12. Fault Tolerant, Radiation Hard DSP Project

    National Aeronautics and Space Administration — We propose to develop a radiation tolerant/hardened signal processing node, which effectively utilizes state-of-the-art commercial semiconductors plus our innovative...

  13. A Bypass-Ring Scheme for a Fault Tolerant Multicast

    Full Text Available We present a fault tolerant scheme for recovery from single or multiple node failures in multi-directional multicast trees. The scheme is based on cyclic structures providing alternative paths to eliminate faulty nodes and reroute the traffic. Our scheme is independent of message source and direction in the tree, provides a basis for on-the-fly repair and can be used as a platform for various strategies for reconnecting tree partitions. It only requires an underlying infrastructure to provide a reliable routing service. Although it is described in the context of a message multicast, the scheme can be used universally in all systems using tree-based overlay networks for communication among components.

  14. The SIFT computer and its development. [Software Implemented Fault Tolerance for aircraft control]

    Software Implemented Fault Tolerance (SIFT) is an aircraft control computer designed to achieve a failure probability of less than 10^-10 per hour. The system is based on advanced fault-tolerance computing and validation methodology. Since confirmation of reliability by observation is essentially impossible, system reliability is estimated by a Markov model. A mathematical proof is used to justify the validity of the Markov model. System design is represented by a hierarchy of abstract models, and the design proof comprises mathematical proofs that each model is, in fact, an elaboration of the next more abstract model.

  15. Robust Fault-Tolerant Control for Satellite Attitude Stabilization Based on Active Disturbance Rejection Approach with Artificial Bee Colony Algorithm

    This paper proposes a robust fault-tolerant control algorithm for satellite attitude stabilization based on the active disturbance rejection approach with an artificial bee colony algorithm. The actuating mechanism of the attitude control system consists of three working reaction flywheels and one spare reaction flywheel. The speed measurement of the reaction flywheels is used for fault detection. If a reaction flywheel fault is detected, the faulty flywheel is isolated and the spare reaction flywhe...

  16. Active fault-tolerant control strategy of large civil aircraft under elevator failures

    Aircraft longitudinal control is the most important actuation system, and its failure can lead to catastrophic accidents. This paper proposes an active fault-tolerant control (AFTC) strategy for civil aircraft with different numbers of faulty elevators. To improve the performance of the fault-tolerant flight control system and make effective use of the control surfaces, the trimmable horizontal stabilizer (THS) is used to generate extra pitch moment. A switching mechanism with a performance improvement coefficient is proposed to determine when it is worthwhile to use the THS. The AFTC strategy is then detailed using a model-following technique and the proposed THS switching mechanism. The basic fault-tolerant controller is designed to guarantee longitudinal control system stability and acceptable performance degradation under partial elevator failure. The proposed AFTC is applied to a Boeing 747-200 numerical model, and simulation results validate its effectiveness.

  18. Real-time fault tolerant full adder design for critical applications

    In complex computing systems, processing units are built from ever smaller devices, which are sensitive to transient faults. Transient faults in a circuit are caused by electromagnetic noise, cosmic rays, crosstalk and power supply noise, and they are very difficult to detect during offline testing. Hence an area-efficient fault tolerant full adder is proposed for testing and repairing transient and permanent faults occurring in single and multiple nets. The design incurs much lower hardware overhead than the traditional hardware architecture, and it provides higher error detection and correction efficiency than existing designs.

  19. Active Fault Isolation in MIMO Systems

    Active fault isolation of parametric faults in closed-loop MIMO systems is considered in this paper. The fault isolation consists of two steps. The first step is group-wise fault isolation. Here, a group of faults is isolated from other possible faults in the system. The group-wise fault iso...

  20. Particle Filter Based Fault-tolerant ROV Navigation using Hydro-acoustic Position and Doppler Velocity Measurements

    This paper presents a fault tolerant navigation system for a remotely operated vehicle (ROV). The navigation system uses hydro-acoustic position reference (HPR) and Doppler velocity log (DVL) measurements to achieve an integrated navigation. The fault tolerant functionality is based on a modified...... particle filter. This particle filter is able to run in an asynchronous manner to accommodate the measurement drop out problem, and it overcomes the measurement outliers by switching observation models. Simulations with experimental data show that this fault tolerant navigation system can accurately estimate...

  1. A New and Efficient Algorithm-Based Fault Tolerance Scheme for A Million Way Parallelism

    Fault tolerance overhead of high performance computing (HPC) applications is becoming critical to the efficient utilization of HPC systems at large scale. HPC applications typically tolerate fail-stop failures by checkpointing. Another promising method works at the algorithm level and is called algorithmic recovery. Both methods can achieve high efficiency when the system scale is not very large, but both lose their effectiveness as systems approach exaflop scale, where the number of processors in a system is expected to reach one million. This paper develops a new and efficient algorithm-based fault tolerance scheme for HPC applications. When a failure occurs during execution, we do not stop to wait for the recovery of the corrupted data, but replace it with the corresponding redundant data and continue execution. A background accelerated recovery method is also proposed to rebuild redundancy so that multiple failures during execution can be tolerated. To demonstrate the feasibility ...
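
    For readers unfamiliar with algorithm-based fault tolerance, the classic checksum encoding gives the flavour of how redundant data lets a computation survive a fail-stop loss without rolling back to a checkpoint. The sketch below is that textbook encoding, not the scheme developed in the paper; the function names and the block layout (one list per process, at least two data blocks) are assumptions made for illustration.

```python
def encode(blocks):
    """Return the data blocks plus one checksum block (element-wise sum)."""
    checksum = [sum(column) for column in zip(*blocks)]
    return blocks + [checksum]


def recover(encoded, lost):
    """Rebuild data block `lost` from the surviving blocks and the checksum.

    Assumes exactly one data block was lost and the checksum survived.
    """
    *data, checksum = encoded
    survivors = [blk for i, blk in enumerate(data) if i != lost]
    return [c - sum(column) for c, column in zip(checksum, zip(*survivors))]


if __name__ == "__main__":
    encoded = encode([[1, 2], [3, 4], [5, 6]])  # three "processes"
    print("checksum block:", encoded[-1])
    print("rebuilt block 1:", recover(encoded, lost=1))
```

    Recovery here costs one pass over the surviving blocks; the abstract's point is that at very large scale even this pause matters, which motivates continuing with redundant data and rebuilding redundancy in the background.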

  2. A Fault Tolerance Mechanism for On-Road Sensor Networks

    On-Road Sensor Networks (ORSNs) play an important role in capturing traffic flow data for predicting short-term traffic patterns, driving assistance and self-driving vehicles. However, this kind of network is prone to large-scale communication failure if a few sensors physically fail. To keep the network working normally, this paper proposes an effective fault-tolerance mechanism for ORSNs that consists mainly of backup on-road sensor deployment, redundant cluster head deployment, and an adaptive failure detection and recovery method. First, based on the N − x principle and the sensors’ failure rate, the backup sensor deployment problem is formulated as a two-objective optimization, which explains the trade-off between cost and fault resumption. To improve network resilience further, a redundant cluster head deployment model is introduced according to the coverage constraint. A common solving method combining integer-continuing and sequential quadratic programming is then explored to determine the optimal locations for these two deployment problems. Moreover, an Adaptive Detection and Resume (ADR) protocol is designed to recover system communication through route and cluster adjustment if there is a backup on-road sensor mismatch. The final experiments show that the proposed mechanism can achieve an average 90% recovery rate and reduce the average number of failed sensors by at most 35.7%. PMID:27918483

  3. Active leave behavior of members in a fault-tolerant group

    Active replication is an effective means of enhancing fault tolerance in distributed systems. A fault-tolerant group is composed of replicas of key components in a system. This paper analyzes three types of leave semantics for group members and describes the activities in which a group member is involved. It then derives the requirements for a group member to leave safely. For quick-leave semantics, the paper proposes a solution and discusses the non-empty protocol and the relay protocol in detail. It further proves the correctness and termination properties of the protocols. The solution is a building block for a practical and operational group membership module.

  4. Fault Tolerant Message Efficient Coordinator Election Algorithm in High Traffic Bidirectional Ring Network

    The use of distributed systems such as the internet and cloud computing is growing dramatically. A coordinator is crucial in these systems because processes must be coordinated and kept consistent. However, this growth makes election algorithms more complicated. Many algorithms have been proposed in this area, but the two best known are Bully and Ring. In this paper we propose a fault tolerant coordinator election algorithm for a typical bidirectional ring topology that is twice as fast as the Ring algorithm while passing far fewer messages during the election. A fault tolerance technique is applied that reduces the waiting time for the election to zero.

  5. Fault-tolerant Concave Facility Location Problem with Uniform Requirements

    In this paper, we consider the fault-tolerant concave facility location problem (FTCFL) with uniform requirements. By investigating the structure of the FTCFL, we obtain a modified dual-fitting bifactor approximation algorithm. Combining the scaling and greedy argumentation technique, the approximation factor is proved to be 1.52.

  6. Fault-tolerant quantum computing with color codes

    We present and analyze protocols for fault-tolerant quantum computing using color codes. We present circuit-level schemes for extracting the error syndrome of these codes fault-tolerantly. We further present an integer-program-based decoding algorithm for identifying the most likely error given the syndrome. We simulated our syndrome extraction and decoding algorithms against three physically-motivated noise models using Monte Carlo methods, and used the simulations to estimate the corresponding accuracy thresholds for fault-tolerant quantum error correction. We also used a self-avoiding walk analysis to lower-bound the accuracy threshold for two of these noise models. We present and analyze two architectures for fault-tolerantly computing with these codes: one in which 2D arrays of qubits are stacked atop each other and one in a single 2D substrate. Our analysis demonstrates that color codes perform slightly better than Kitaev's surface codes when circuit details are ignored. When these details are considered, w...

  7. Nonlinear and fault-tolerant flight control using multivariate splines

    This paper presents a study on fault tolerant flight control of a high performance aircraft using multivariate splines. The controller is implemented by making use of spline model based adaptive nonlinear dynamic inversion (NDI). This method, indicated as SANDI, combines NDI control with nonlinear c

  8. Modular Multilevel Converter Control Strategy with Fault Tolerance

    The Modular Multilevel Converter (MMC) technology has recently emerged in VSC-HVDC applications where it demonstrated higher efficiency and fault tolerance compared to the classical 2-level topology. Due to the ability of MMC to connect to HV levels, MMC can be also used in transformerless STATCO...

  10. Final Project Report. Scalable fault tolerance runtime technology for petascale computers

    With the massive number of components comprising the forthcoming petascale computer systems, hardware failures will be routinely encountered during execution of large-scale applications. Due to the multidisciplinary, multiresolution, and multiscale nature of scientific problems that drive the demand for high end systems, applications place increasingly differing demands on the system resources: disk, network, memory, and CPU. In addition to MPI, future applications are expected to use advanced programming models such as those developed under the DARPA HPCS program as well as existing global address space programming models such as Global Arrays, UPC, and Co-Array Fortran. While there has been a considerable amount of work on fault tolerant MPI, with a number of strategies and extensions for fault tolerance proposed, virtually none of the advanced models proposed for emerging petascale systems is currently fault aware. To achieve fault tolerance, underlying runtime and OS technologies that can scale to the petascale level need to be developed. This project has evaluated a range of runtime techniques for fault tolerance for advanced programming models.

  11. Reversible Logic Synthesis of Fault Tolerant Carry Skip BCD Adder

    Reversible logic is emerging as an important research area with applications in diverse fields such as low power CMOS design, digital signal processing, cryptography, quantum computing and optical information processing. This paper presents a new 4×4 parity preserving reversible logic gate, IG. The proposed parity preserving reversible gate can be used to synthesize any arbitrary Boolean function. It makes any fault that affects no more than a single signal readily detectable at the circuit's primary outputs. It is shown that a fault tolerant reversible full adder circuit can be realized using only two IGs. The proposed fault tolerant full adder (FTFA) is used as the fundamental building block for designing other arithmetic logic circuits. It is also demonstrated that the proposed design has lower hardware complexity and is more efficient in terms of gate count, garbage outputs and constant inputs than its existing counterparts.

  12. Design and analysis of linear fault-tolerant permanent-magnet vernier machines.

    This paper proposes a new linear fault-tolerant permanent-magnet (PM) vernier (LFTPMV) machine, which can offer high thrust by using the magnetic gear effect. Both PMs and windings of the proposed machine are on short mover, while the long stator is only manufactured from iron. Hence, the proposed machine is very suitable for long stroke system applications. The key of this machine is that the magnetizer splits the two movers with modular and complementary structures. Hence, the proposed machine offers improved symmetrical and sinusoidal back electromotive force waveform and reduced detent force. Furthermore, owing to the complementary structure, the proposed machine possesses favorable fault-tolerant capability, namely, independent phases. In particular, differing from the existing fault-tolerant machines, the proposed machine offers fault tolerance without sacrificing thrust density. This is because neither fault-tolerant teeth nor the flux-barriers are adopted. The electromagnetic characteristics of the proposed machine are analyzed using the time-stepping finite-element method, which verifies the effectiveness of the theoretical analysis.

  13. Empirical Study of FFANN Tolerance to Weight Stuck at Max/Min Fault

    The fault tolerance property of artificial neural networks has been investigated with reference to the hardware model of artificial neural networks. A weight fault is important because it breaks the link between two nodes. In this paper three types of weight faults are explained. Experiments were performed to demonstrate the fault tolerance behavior of feedforward artificial neural networks (FFANNs) under the weight-stuck-MAX/MIN fault, and the effect of this fault on a trained network is analyzed. The results suggest that the networks are not fault tolerant to this type of fault.

  14. Fault-tolerant Algorithms for Tick-Generation in Asynchronous Logic: Robust Pulse Generation

    Today's hardware technology presents a new challenge in designing robust systems. Deep submicron VLSI technology introduced transient and permanent faults that were never considered in low-level system designs in the past. Still, robustness of that part of the system is crucial and needs to be guaranteed for any successful product. Distributed systems, on the other hand, have been dealing with similar issues for decades. However, neither the basic abstractions nor the complexity of contemporary fault-tolerant distributed algorithms match the peculiarities of hardware implementations. This paper is part of an attempt to close this gap between theory and practice for the clock synchronization problem. Solving this task sufficiently well will make it possible to build a very robust high-precision clocking system for hardware designs like systems-on-chips in critical applications. As our first building block, we describe and prove correct a novel Byzantine fault-tolerant self-stabilizing pulse syn...

  15. Passive fault tolerant control of a double inverted pendulum - a case study

    A passive fault tolerant control scheme is suggested, in which a nominal controller is augmented with an additional block, which guarantees stability and performance after the occurrence of a fault. The method is based on the YJBK parameterization, which requires the nominal controller to be implemented in observer based form. The proposed method is applied to a double inverted pendulum system, for which an H_inf controller has been designed and verified in a lab setup. In this case study, the fault is a degradation of the tacho loop.

  16. Design of an intelligent fault-tolerant method for a passive underwater integrated navigation system

    To improve the navigation accuracy of autonomous underwater vehicles (AUVs), the navigation methods commonly used by AUVs are compared and a passive underwater integrated navigation system is proposed that combines a strapdown inertial navigation system (SINS) with geophysical navigation, namely terrain-aided navigation (TAN) and gravity-aided navigation (GAN). A fault-tolerant federated Kalman filter is used to fuse the information from the subsystems, detect system faults and reconstruct the navigation system. Because the traditional chi-square test can only determine whether measurement information is valid and cannot determine the specific cause of a fault, a fault detection algorithm in which a neural network assists the chi-square test is adopted. Simulation of the underwater integrated navigation algorithm shows that it can rapidly and accurately detect and identify the source of faults in the system. Through fault isolation and system reconstruction, the system keeps working normally under fault conditions.

  17. The reliability model of the fault-tolerant computing system with triple-modular redundancy based on the independent nodes

    Rahman, P. A.; Bobkova, E. Yu


    This paper deals with a reliability model of a restorable non-stop computing system with triple-modular redundancy based on independent computing nodes, taking into consideration the finite time for node activation and the different node failure rates in the active and passive states. The generalized reliability model obtained by the authors is also discussed, together with calculation formulas for the reliability indices of a system based on identical and independent computing nodes with a given threshold on the number of active nodes at which the system is considered operable. Finally, the application of the generalized model to the particular case of the non-stop restorable computing system with triple-modular redundancy based on independent nodes is provided, along with calculation examples for the reliability indices.
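
    As a point of reference for the model above, the textbook reliability of a non-repairable triple-modular redundant system with a perfect voter is R_TMR = 3R^2 - 2R^3, where R is the reliability of one node. The sketch below evaluates that classical formula only. It does not include the repair, activation time, or differing active/passive failure rates treated in the paper, and the function names and example rate are assumptions.

```python
import math


def node_reliability(rate: float, t: float) -> float:
    """Reliability of one node with constant failure rate: exp(-rate * t)."""
    return math.exp(-rate * t)


def tmr_reliability(r: float) -> float:
    """Probability that at least 2 of 3 independent nodes work.

    3 * r^2 * (1 - r) for exactly two working, plus r^3 for all three,
    which simplifies to 3r^2 - 2r^3. The voter is assumed perfect.
    """
    return 3 * r ** 2 - 2 * r ** 3


if __name__ == "__main__":
    rate = 1e-4  # hypothetical failures per hour
    for hours in (100, 1000, 10000):
        r = node_reliability(rate, hours)
        print(f"t={hours:>5} h  single={r:.4f}  TMR={tmr_reliability(r):.4f}")
```

    Note that TMR only helps while R > 0.5: below that point the redundant system is less reliable than a single node, which is one reason restoration of failed nodes matters in models like the one in the paper.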

  18. A Benchmark Evaluation of Fault Tolerant Wind Turbine Control Concepts

    Odgaard, Peter Fogh; Stoustrup, Jakob


    As the world’s power supply to a larger and larger degree depends on wind turbines, it is consequently and increasingly important that these are as reliable and available as possible. Modern fault tolerant control (FTC) could play a substantial part in increasing reliability of modern wind turbin...... accommodation is handled in software sensor and actuator blocks. This means that the wind turbine controller can continue operation as in the fault free case. The other two evaluated solutions show some potential but probably need improvements before industrial applications....

  19. A fault-tolerant voltage measurement method for series connected battery packs

    Xia, Bing; Mi, Chris


    This paper proposes a fault-tolerant voltage measurement method for battery management systems. Instead of measuring the voltage of individual cells, the proposed method measures the voltage sum of multiple battery cells without additional voltage sensors. A matrix interpretation is developed to demonstrate the viability of the proposed sensor topology to distinguish between sensor faults and cell faults. A methodology is introduced to isolate sensor and cell faults by locating abnormal signals. A measurement electronic circuit is proposed to implement the design concept. Simulation and experiment results support the mathematical analysis and validate the feasibility and robustness of the proposed method. In addition, the measurement problem is generalized and the condition for a valid sensor topology is derived. The tuning of design parameters is analyzed based on fault detection reliability and noise levels.
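
    The reason a voltage-sum topology can separate sensor faults from cell faults can be shown with a small example. The sketch below assumes one possible topology, in which sensor i measures the sum of cell i and its cyclic neighbour; the topology, nominal voltage, tolerance and function names are all illustrative assumptions and may differ from the paper's design. A faulty cell then disturbs two readings, while a faulty sensor disturbs only one.

```python
def sensor_readings(cells):
    """Sensor i measures the sum of cell i and its cyclic neighbour i + 1."""
    n = len(cells)
    return [cells[i] + cells[(i + 1) % n] for i in range(n)]


def classify(readings, nominal_cell=3.7, tol=0.1):
    """Classify by counting abnormal readings.

    One abnormal reading suggests a sensor fault; two suggest a cell
    fault, because every cell contributes to exactly two sensors.
    """
    expected = 2 * nominal_cell
    abnormal = [i for i, s in enumerate(readings) if abs(s - expected) > tol]
    if not abnormal:
        return "healthy", abnormal
    if len(abnormal) == 1:
        return "sensor fault", abnormal
    if len(abnormal) == 2:
        return "cell fault", abnormal
    return "multiple faults", abnormal


if __name__ == "__main__":
    print(classify(sensor_readings([3.7, 3.7, 3.7])))  # all cells nominal
    print(classify(sensor_readings([3.7, 3.0, 3.7])))  # cell 1 sagging
    print(classify([7.4, 9.0, 7.4]))                   # sensor 1 misreading
```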

  20. Fault Tolerant Control Using Proportional-Integral-Derivative Controller Tuned by Genetic Algorithm

    Problem statement: The growing demand for reliability, maintainability and survivability in industrial processes has drawn significant research in the fault detection and fault tolerant control domain. A fault is usually defined as an unexpected change in a system, such as a component malfunction or a variation in operating conditions, which tends to degrade overall system performance. The purpose of fault detection is to detect these malfunctions so that proper action can be taken to prevent faults from developing into a total system failure. Approach: In this study an effective integrated fault detection and fault tolerant control scheme was developed for a class of LTI systems. The scheme was based on a Kalman filter for simultaneous state and fault parameter estimation, statistical decisions for fault detection, and activation of controller reconfiguration. Proportional-Integral-Derivative (PID) control schemes continue to provide the simplest yet effective solutions to most control engineering applications today. Tuning the PID parameters remains important because these parameters have a great influence on the stability and performance of the control system. In this study a genetic algorithm (GA) was proposed to tune the PID controller. Results: The results show that the proposed scheme improves the performance of the process in terms of time domain specifications, robustness to parametric changes and optimum stability. A comparison with the conventional Ziegler-Nichols method also shows the superiority of the GA-based system. Conclusion: This study demonstrates the effectiveness of the genetic algorithm in tuning a PID controller with optimum parameters. The approach is, moreover, shown to be robust to variations in plant dynamic characteristics and to disturbances, assuring parameter-insensitive operation of the process.

  1. Fault-Tolerant Software-Defined Radio on Manycore

    Software-defined radio (SDR) platforms generally rely on field-programmable gate arrays (FPGAs) and digital signal processors (DSPs), but such architectures require significant software development. In addition, application demands for radiation mitigation and fault tolerance exacerbate programming challenges. MaXentric Technologies, LLC, has developed a manycore-based SDR technology that provides 100 times the throughput of conventional radiation-hardened general purpose processors. Manycore systems (30-100 cores and beyond) have the potential to provide high processing performance at error rates that are equivalent to current space-deployed uniprocessor systems. MaXentric's innovation is a highly flexible radio, providing over-the-air reconfiguration; adaptability; and uninterrupted, real-time, multimode operation. The technology is also compliant with NASA's Space Telecommunications Radio System (STRS) architecture. In addition to its many uses within NASA communications, the SDR can also serve as a highly programmable research-stage prototyping device for new waveforms and other communications technologies. It can also support noncommunication codes on its multicore processor, collocated with the communications workload, reducing the size, weight, and power of the overall system by aggregating processing jobs to a single board computer.

  2. Lightweight storage and overlay networks for fault tolerance.

    The next generation of capability-class, massively parallel processing (MPP) systems is expected to have hundreds of thousands to millions of processors. In such environments, it is critical to have fault-tolerance mechanisms, including checkpoint/restart, that scale with the size of applications and the percentage of the system on which the applications execute. For application-driven, periodic checkpoint operations, the state-of-the-art does not provide a scalable solution. For example, on today's massive-scale systems that execute applications which consume most of the memory of the employed compute nodes, checkpoint operations generate I/O that consumes nearly 80% of the total I/O usage. Motivated by this observation, this project aims to improve I/O performance for application-directed checkpoints through the use of lightweight storage architectures and overlay networks. Lightweight storage provides direct access to underlying storage devices. Overlay networks provide caching and processing capabilities in the compute-node fabric. The combination has potential to significantly reduce I/O overhead for large-scale applications. This report describes our combined efforts to model and understand overheads for application-directed checkpoints, as well as implementation and performance analysis of a checkpoint service that uses available compute nodes as a network cache for checkpoint operations.

  3. BFTDT: Byzantine Fault Tolerance tryout for Dependable Transactions in Cloud

    Cloud Web Services (CWS) is a technology used for business collaboration and integration among web users. Web Services Atomic Transactions (WS-AT) have been used for trusted distributed transaction processing over the web. In a distributed setting WS-AT is subject to Byzantine faults, and Byzantine Faults Techniques (BFT) are used to overcome them. The reliable coordinator provides Coordination, Activation, Registration and Completion services, which make the transaction effective and reliable. In the trusted environment, to avoid congestion of resources, a fair-share bandwidth allocation scheme is used to allocate separate bandwidth to each web user, and the transaction is processed by the Coordinator server and the Transaction Processing Monitor (TPM). The analysis of WS-AT for business applications shows a high degree of dependability, security, trust, fault tolerance and fairness of the resources in the trusted environment.

  4. TRSTR: A Fault- Tolerant Microprocessor Architecture Based on SMT

    Based on Simultaneous Multithreading (SMT), we propose a fault-tolerant scheme called Tri-modular Redundantly and Simultaneously Threaded processor with Recovery (TRSTR). TRSTR has the following features. First, we introduce an arbitrator context into the conventional SRT (Simultaneous and Redundantly Threaded) design, which acts as an arbitrator when results from the other two contexts disagree and otherwise acts as an ordinary thread, thus making full use of SMT's parallelism. Second, we add a reconfigurable feature to the sphere of replication in SRT, making it more flexible for changing demands and situations. Third, TRSTR has two working modes, Tri-Simultaneous with Voting (TSV) and Dual-Simultaneous with Arbitrator (DSA), and can switch between them at will. Finally, in addition to transient-fault coverage, TRSTR has on-line self-checking and self-recovering abilities, so that it can shield some permanent faults and reconfigure itself without stopping the crucial job, improving its reliability and availability.




    Computational grids have the potential for solving large-scale scientific applications using heterogeneous and geographically distributed resources. In addition to the challenges of managing and scheduling these applications, reliability challenges arise because of the unreliable nature of grid infrastructure. Two major problems that are critical to the effective utilization of computational resources are efficient scheduling of jobs and providing fault tolerance in a reliable manner. This paper addresses these problems by combining a checkpoint-replication-based fault tolerance mechanism with the Minimum Total Time to Release (MTTR) job scheduling algorithm. TTR includes the service time of the job, the waiting time in the queue, and the transfer of input and output data to and from the resource. The MTTR algorithm minimizes the TTR by selecting a computational resource based on job requirements, job characteristics and hardware features of the resources. The fault tolerance mechanism used here sets the job checkpoints based on the resource failure rate. If a resource failure occurs, the job is restarted from its last successful state using a checkpoint file from another grid resource. A critical aspect of automatic recovery is the availability of checkpoint files. A strategy to increase the availability of checkpoints is replication. A Replica Resource Selection Algorithm (RRSA) is proposed to provide a Checkpoint Replication Service (CRS). Globus Tool Kit is used as the grid middleware to set up a grid environment and evaluate the performance of the proposed approach. The monitoring tools Ganglia and NWS (Network Weather Service) are used to gather hardware and network details respectively. The experimental results demonstrate that the proposed approach effectively schedules grid jobs in a fault-tolerant way and thereby reduces the TTR of the jobs submitted to the grid. It also increases the percentage of jobs completed within the specified deadline, making the grid...

  6. Unconstrained and Constrained Fault-Tolerant Resource Allocation

    First, we study the Unconstrained Fault-Tolerant Resource Allocation (UFTRA) problem (a.k.a. FTFA problem in \cite{shihongftfa}). In the problem, we are given a set of sites equipped with an unconstrained number of facilities as resources, and a set of clients with set $\mathcal{R}$ as corresponding connection requirements, where every facility belonging to the same site has an identical opening (operating) cost and every client-facility pair has a connection cost. The objective is to allocate facilities from sites to satisfy $\mathcal{R}$ at a minimum total cost. Next, we introduce the Constrained Fault-Tolerant Resource Allocation (CFTRA) problem. It differs from UFTRA in that the number of resources available at each site $i$ is limited by $R_{i}$. Both problems are practical extensions of the classical Fault-Tolerant Facility Location (FTFL) problem \cite{Jain00FTFL}. For instance, their solutions provide optimal resource allocation (w.r.t. enterprises) and leasing (w.r.t. clients) strategies for the cont...

  7. Resource requirements for a fault-tolerant quantum Fourier transform

    The quantum Fourier transform (QFT) is a basic subroutine for most quantum algorithms providing an exponential speedup over classical ones. We investigate resource requirements for a fault-tolerant QFT. To implement single-qubit rotations for a QFT in a fault-tolerant manner, we examine three types of approaches: ancilla-free gate synthesis, ancilla-assisted gate synthesis, and state distillation. While gate synthesis approximates single-qubit rotations with basic quantum operations, state distillation makes it possible to perform the specific single-qubit rotations required for the QFT exactly. It is unknown, however, which approach is better for the QFT. We estimated the resource requirement for a QFT in each case, where the resource is measured by the total number of π/8 gates (denoted by T), which is called the T count. Contrary to the initial expectation, the total T count for state distillation is considerably larger than those for ancilla-free and ancilla-assisted gate synthesis. Thus, we conclude that ancilla-assisted gate synthesis is the best approach for a fault-tolerant QFT so far.
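
    To make the rotations at issue concrete, the following sketch builds the QFT matrix for a few qubits and lists the phase angles 2π/2^k of the controlled rotations in the standard textbook QFT circuit. It is a plain classical calculation for illustration, not the resource estimate of the paper; the helper names are hypothetical. The phase for k = 3 is π/4, which corresponds to the T gate, while smaller angles (k > 3) are the ones that need gate synthesis or state distillation.

```python
import cmath
import math


def qft_matrix(n_qubits):
    """Return the 2^n x 2^n QFT matrix as nested lists of complex numbers."""
    dim = 2 ** n_qubits
    omega = cmath.exp(2j * math.pi / dim)  # primitive dim-th root of unity
    scale = 1 / math.sqrt(dim)
    return [[scale * omega ** (j * k) for k in range(dim)] for j in range(dim)]


def rotation_phases(n_qubits):
    """Phases 2*pi/2^k of the controlled rotations R_k, k = 2..n, in the textbook circuit."""
    return {k: 2 * math.pi / 2 ** k for k in range(2, n_qubits + 1)}


def is_unitary(m, tol=1e-9):
    """Check that the rows of m are orthonormal, i.e. m times its adjoint is the identity."""
    dim = len(m)
    for r in range(dim):
        for c in range(dim):
            dot = sum(m[r][k] * m[c][k].conjugate() for k in range(dim))
            if abs(dot - (1 if r == c else 0)) > tol:
                return False
    return True
```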

  8. Adaptive Fault-Tolerant Routing in 2D Mesh with Cracky Rectangular Model

    This paper focuses on routing in two-dimensional mesh networks. We propose a novel faulty block model, the cracky rectangular block, for fault-tolerant adaptive routing. All faulty nodes and faulty links are enclosed in this type of block, which is a convex structure, in order to avoid routing livelock. Additionally, the model constructs an interior spanning forest for each block in order to keep in touch with the nodes inside the block. The block construction procedure is dynamic and fully distributed. The construction algorithm is simple and easy to implement. The block is fully adaptive: it dynamically adjusts its scale in accordance with the state of the network, on either fault emergence or fault recovery, without shutting down the system. Based on this model, we also develop a distributed fault-tolerant routing algorithm. We then give a formal proof that, under this algorithm, messages will always reach their destinations if and only if the destination nodes remain connected to the mesh network. The new model and routing algorithm therefore maximize the availability of the nodes in the network. This is a noticeable overall improvement in the fault tolerance of the system.

  9. Fault-Tolerant, Radiation-Hard DSP

    Commercial digital signal processors (DSPs) for use in high-speed satellite computers are challenged by the damaging effects of space radiation, mainly single event upsets (SEUs) and single event functional interrupts (SEFIs). Innovations have been developed for mitigating the effects of SEUs and SEFIs, enabling the use of very-high-speed commercial DSPs with improved SEU tolerances. Time-triple modular redundancy (TTMR) is a method of applying traditional triple modular redundancy on a single processor, exploiting the VLIW (very long instruction word) class of parallel processors. TTMR improves SEU rates substantially. SEFIs are solved by a SEFI-hardened core circuit, external to the microprocessor. It monitors the health of the processor, and if a SEFI occurs, forces the processor to return to performance through a series of escalating events. TTMR and hardened-core solutions were developed for both DSPs and reconfigurable field-programmable gate arrays (FPGAs). This includes advancement of TTMR algorithms for DSPs and reconfigurable FPGAs, plus a rad-hard, hardened-core integrated circuit that services both the DSP and FPGA. Additionally, a combined DSP and FPGA board architecture was fully developed into a rad-hard engineering product. This technology enables use of commercial off-the-shelf (COTS) DSPs in computers for satellite and other space applications, allowing rapid deployment at a much lower cost. Traditional rad-hard space computers are very expensive and typically have long lead times. These computers are either based on traditional rad-hard processors, which have extremely low computational performance, or triple modular redundant (TMR) FPGA arrays, which suffer from power and complexity issues. Even more frustrating is that the TMR arrays of FPGAs require a fixed, external rad-hard voting element, which causes them to lose much of their reconfiguration capability and, in some cases, to suffer a significant speed reduction. The benefits of COTS high

  10. A Modular and Fault-Tolerant Data Transport Framework

    The High Level Trigger (HLT) of the future ALICE heavy-ion experiment has to reduce its input data rate of up to 25 GB/s to at most 1.25 GB/s for output before the data is written to permanent storage. To cope with these data rates a large PC cluster system is being designed to scale to several thousand nodes, connected by a fast network. For the software that will run on these nodes a flexible data transport and distribution software framework, described in this thesis, has been developed. The framework consists of a set of separate components that can be connected via a common interface. This allows different configurations to be constructed for the HLT, which can even be changed at runtime. To ensure fault-tolerant operation of the HLT, the framework includes a basic fail-over mechanism that allows whole nodes to be replaced after a failure. The mechanism will be further expanded in the future, utilizing the runtime reconnection feature of the framework's component interface. To connect cluster nodes a communication ...

  11. Experimental Robot Position Sensor Fault Tolerance Using Accelerometers and Joint Torque Sensors

    Robot systems in critical applications, such as those in space and nuclear environments, must be able to operate during component failure to complete important tasks. One failure mode that has received little attention is the failure of joint position sensors. Current fault tolerant designs require the addition of directly redundant position sensors which can affect joint design. The proposed method uses joint torque sensors found in most existing advanced robot designs along with easily locatable, lightweight accelerometers to provide a joint position sensor fault recovery mode. This mode uses the torque sensors along with a virtual passive control law for stability and accelerometers for joint position information. Two methods for conversion from Cartesian acceleration to joint position based on robot kinematics, not integration, are presented. The fault tolerant control method was tested on several joints of a laboratory robot. The controllers performed well with noisy, biased data and a model with uncertain parameters.

  12. Optimal Configuration of Fault-Tolerance Parameters for Distributed Server Access

    Server replication is a common fault-tolerance strategy to improve transaction dependability for services in communications networks. In distributed architectures, fault-diagnosis and recovery are implemented via the interaction of the server replicas with the clients and other entities such as e...... in replicated server architectures. In order to obtain insight into the system behaviour, a set of relevant environment parameters and controllable fault-tolerance parameters are chosen and the dependability/performance trade-off is evaluated....... such as enhanced name servers. Such architectures provide an increased number of redundancy configuration choices. The influence of a (wide area) network connection can be quite significant and induce trade-offs between dependability and user-perceived performance. This paper develops a quantitative stochastic...

  13. Fault Tolerant Real-Time Networks


    systems literature in the context of motivation for understanding data consistency models that are weaker than atomicity. Keyword: Theory Customer: Ben...Mostafa Ammar, " GOTHIC : A Group Access Control Architecture for Secure Multicast and Anycast," Proceedings of INFOCOM 2002, June 2002. 85. Li Zou, Mostafa

  14. Fault Tolerance Design and Redundancy Management Techniques.


    relations between the angles of the aerodynamic and earth-fixed axis systems...the F-8 digital fly-by-wire aircraft and the space shuttle orbiter. Management of a multicomputer/sensor/actuator/power system is markedly different...for a flight in which the F-8 DFBW aircraft was simulating final approaches to Edwards Dry Lake by the shuttle orbiter. These approaches were

    Hu, Chaofang; Gao, Zhifei; Ren, Yanli; Liu, Yunbing


    In this paper, a reusable launch vehicle (RLV) attitude control problem with actuator faults is addressed via robust adaptive nonlinear fault-tolerant control (FTC) with norm estimation. Firstly, the accurate tracking task of attitude angles in the presence of parameter uncertainties and external disturbances is considered. A fault-free controller is proposed using dynamic surface control (DSC) combined with a fuzzy adaptive approach. Furthermore, the minimal learning parameter strategy via a norm estimation technique is introduced to reduce the multi-parameter adaptive computation burden of the fuzzy approximation of the lumped uncertainties. Secondly, a compensation controller is designed to handle partial loss of actuator effectiveness. The unknown maximum eigenvalue of the actuator efficiency loss factors is estimated online. Moreover, stability analysis guarantees that all signals of the closed-loop control system are semi-globally uniformly ultimately bounded. Finally, illustrative simulations show the effectiveness of the proposed method.

  16. Review of fault diagnosis and fault-tolerant control for modular multilevel converter of HVDC

    Liu, Hui; Loh, Poh Chiang; Blaabjerg, Frede


    This review focuses on faults in Modular Multilevel Converter (MMC) for use in high voltage direct current (HVDC) systems by analyzing the vulnerable spots and failure mechanism from device to system and illustrating the control & protection methods under failure condition. At the beginning......, several typical topologies of MMC-HVDC systems are presented. Then fault types such as capacitor voltage unbalance, unbalance between upper and lower arm voltage are analyzed and the corresponding fault detection and diagnosis approaches are explained. In addition, more attention is dedicated to control...

  17. Fault-Tolerant Tree-Based Multicasting in Mesh Multicomputers

    WU Jie; CHEN Xiao


    We propose a fault-tolerant tree-based multicast algorithm for 2-dimensional (2-D) meshes based on the concept of the extended safety level, which is a vector associated with each node to capture fault information in the neighborhood. In this approach each destination is reached through a minimum number of hops. In order to minimize the total number of traffic steps, three heuristic strategies are proposed. This approach can be easily implemented by pipelined circuit switching (PCS). A simulation study is conducted to measure the total number of traffic steps under different strategies. Our approach is the first attempt to address the fault-tolerant tree-based multicast problem in 2-D meshes based on limited global information with a simple model and succinct information.

  18. Structural Fault Tolerance of Scale-Free Networks

    HAO Jingbo; YIN Jianping; ZHANG Boyun


    The fault tolerance of scale-free networks is examined in this paper. By simulating the changes in average path length and network fragmentation of the Barabasi-Albert model when faults happen, it can be observed that generic scale-free networks are quite robust to random failures but, at the same time, very vulnerable to targeted attacks. Therefore, an existing optimization strategy for the robustness of scale-free networks to failures and attacks is also introduced. A simulation similar to the one above showed that the so-called (1, 0) network has interconnectedness closer to that of a scale-free network and robustness to targeted attacks closer to that of an exponential network. Furthermore, its resistance to random failures is better than that of either of them.
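
    The contrast between random failures and targeted attacks can be reproduced in a few lines. The sketch below is an illustrative experiment, not the simulation of the paper: it grows a Barabasi-Albert graph by preferential attachment and compares the largest connected component after removing the same number of nodes either at random or in order of decreasing degree. The graph size, removal fraction and seed are arbitrary assumptions, and the paper's average-path-length measurement is not reproduced.

```python
import random


def barabasi_albert(n, m, rng):
    """Grow a graph by preferential attachment; returns a dict of adjacency sets."""
    adj = {i: set() for i in range(n)}
    # Start from a complete graph on m + 1 nodes.
    for i in range(m + 1):
        for j in range(i + 1, m + 1):
            adj[i].add(j)
            adj[j].add(i)
    # Each node appears in `stubs` once per incident edge, so a uniform draw
    # from it picks a node with probability proportional to its degree.
    stubs = [v for v in range(m + 1) for _ in range(m)]
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:
            targets.add(rng.choice(stubs))
        for t in targets:
            adj[new].add(t)
            adj[t].add(new)
            stubs.extend((new, t))
    return adj


def largest_component(adj, removed):
    """Size of the largest connected component after deleting the nodes in `removed`."""
    seen = set(removed)
    best = 0
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        stack, size = [start], 0
        while stack:
            v = stack.pop()
            size += 1
            for w in adj[v]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        best = max(best, size)
    return best


def compare(n=500, m=2, fraction=0.2, seed=1):
    """Return (giant size after random failures, giant size after targeted attack)."""
    rng = random.Random(seed)
    adj = barabasi_albert(n, m, rng)
    k = int(n * fraction)
    random_fail = rng.sample(sorted(adj), k)
    targeted = sorted(adj, key=lambda v: len(adj[v]), reverse=True)[:k]
    return largest_component(adj, random_fail), largest_component(adj, targeted)
```

    Calling `compare()` returns the two component sizes; the targeted value is expected to be the smaller one, in line with the qualitative result described in the abstract.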

  19. Robust and Fault Tolerant Control of CD-players

    the parameter estimation is performed in closed loop, probably because open loop estimation has been stated for being very difficult or even impossible. A novel method, which requires an additional current measurement, is presented in this work where parameter estimation is accomplished in open loop in a simple...... is to be found in the fault-diagnosis and fault-tolerant control fields. One of the main challenges in the positioning control of the focus point in CD-players is to handle two types of disturbances with conflicting requirements in an effective way. While a high bandwidth is desired to better suppress shocks......-ROMs). DVDs can be two-sided with multiple layers, allowing read, write and rewrite operations. Most significantly in this context, the new media typically have much higher physical data densities. This constitutes a significant challenge in terms of playability (the ability to reproduce the information from...

  20. The optimization of global fault tolerant trajectory for redundant manipulator based on self-motion

    The redundancy of manipulators makes fault tolerant trajectory planning possible. Aiming at the completion of a specific task, an algorithm for global fault tolerant trajectory optimization of a redundant manipulator based on self-motion is proposed in this paper. Firstly, the inverse kinematics equation of a single-redundancy manipulator, based on the self-motion variable and the null-space velocity array of the Jacobian, is analyzed. Secondly, a mathematical description of the fault tolerance criteria for the manipulator configuration is established, and the group of fault tolerant configurations is obtained by iterative traversal under these criteria. Then, taking the joint limits into account and using minimum energy consumption as the optimization target, the global fault tolerant joint trajectory is obtained. Finally, a simulation of a 7 degree of freedom (DOF) manipulator is performed, which validates the effectiveness of the algorithm.

  1. Fault Tolerant Control Design for the Longitudinal Aircraft Dynamics using Quantitative Feedback Theory

    Flight control laws of modern aircraft are scheduled with respect to flight point parameters. The loss of the air data measurement system implies inevitably the loss of relevant scheduling information. A strategy to design a fault tolerant longitudinal flight control system is proposed which can accommodate the total loss of the angle of attack and the calibrated airspeed measurements. In this scenario the described robust longitudinal control law is employed ensuring a control performance ...

  2. Data center networks topologies, architectures and fault-tolerance characteristics

    This SpringerBrief presents a survey of data center network designs and topologies and compares several properties in order to highlight their advantages and disadvantages. The brief also explores several routing protocols designed for these topologies and compares the basic algorithms to establish connections, the techniques used to gain better performance, and the mechanisms for fault-tolerance. Readers will be equipped to understand how current research on data center networks enables the design of future architectures that can improve performance and dependability of data centers. This con

  3. Fully fault tolerant quantum computation with non-deterministic gates

    In certain approaches to quantum computing the operations between qubits are non-deterministic and likely to fail. For example, a distributed quantum processor would achieve scalability by networking together many small components; operations between components should be assumed to be failure prone. In the logical limit of this architecture each component contains only one qubit. Here we derive thresholds for fault tolerant quantum computation under such extreme paradigms. We find that computation is supported for remarkably high failure rates (exceeding 90%) provided that failures are heralded, while the rate of unknown errors should not exceed 2 in 10^4 operations.

  4. Row fault detection system

    An apparatus, program product and method checks for nodal faults in a row of nodes by causing each node in the row to concurrently communicate with its adjacent neighbor nodes in the row. The communications are analyzed to determine a presence of a faulty node or connection.
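
    The analysis step described above can be pictured with a simple heuristic. The sketch below is a hypothetical model, not the method of the cited apparatus: it assumes each pair of adjacent nodes in a row reports whether a test exchange succeeded, suspects a node when the links on both of its sides failed, and reports any other failed link as a suspect connection. The function name `diagnose_row` and the classification rule are assumptions made for illustration.

```python
def diagnose_row(link_ok):
    """Classify faults in a row from neighbor-to-neighbor test results.

    link_ok[i] is True when nodes i and i+1 exchanged a test message
    successfully. Returns (suspect_nodes, suspect_links).
    """
    n_links = len(link_ok)
    # An interior node is suspected when both of its links failed.
    suspect_nodes = [i for i in range(1, n_links)
                     if not link_ok[i - 1] and not link_ok[i]]
    covered = set()
    for node in suspect_nodes:
        covered.update((node - 1, node))
    # Failed links not explained by a suspect node are reported as
    # suspect connections between their two endpoints.
    suspect_links = [(i, i + 1) for i in range(n_links)
                     if not link_ok[i] and i not in covered]
    return suspect_nodes, suspect_links
```

    For example, `diagnose_row([True, False, False, True])` points at node 2, while `diagnose_row([True, False, True])` points at the connection between nodes 1 and 2. A failed link at the end of a row cannot be attributed unambiguously with this rule and is reported as a suspect connection.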

  5. A hybrid robust fault tolerant control based on adaptive joint unscented Kalman filter.

    In this paper, a new hybrid robust fault tolerant control scheme is proposed. A robust H∞ control law is used in non-faulty situation, while a Non-Singular Terminal Sliding Mode (NTSM) controller is activated as soon as an actuator fault is detected. Since a linear robust controller is designed, the system is first linearized through the feedback linearization method. To switch from one controller to the other, a fuzzy based switching system is used. An Adaptive Joint Unscented Kalman Filter (AJUKF) is used for fault detection and diagnosis. The proposed method is based on the simultaneous estimation of the system states and parameters. In order to show the efficiency of the proposed scheme, a simulated 3-DOF robotic manipulator is used.

  6. Prognostics-Enhanced Fault-Tolerant Control with an Application to a Hovercraft Project

    National Aeronautics and Space Administration — Fault-Tolerant Control (FTC) is an emerging area of engineering and scientific research that integrates prognostics, health management concepts and intelligent...

  7. DFTSNA: A Distributed Fault-Tolerant Shipboard System

    This paper describes the architecture, fundamental principle and implementation of a distributed fault-tolerant system, DFTSNA. Its objective is to combine extreme reliability with high availability in a shipboard environment. Multi-level fault tolerance is considered and several special-purpose hardware subsystems (F-T clusters) are developed. The physical and functional distribution of the system is emphasized to meet the stringent shipboard requirements. A number of algorithms are produced to support fault-tolerant operation.

  8. Robust Fault-Tolerant Control of Uncertain Linear Systems With Time-Delay

    Fault-tolerant control aims to give the designed control system a degree of tolerance to failures that may occur. This problem bears directly on the reliability and safety of the control system. Integrity is an important aspect of fault-tolerant control research: when one or more components fail, the system is not reconfigured but uses the remaining components to stay stable and maintain the prescribed performance. Because parameter perturbations are widespread and actuator and sensor faults are unavoidable, studying robust fault-tolerant control of time-delay systems with parameter uncertainty is of considerable practical importance. For the case of sensor failure, using state feedback control with delay, the fault-tolerant control problem for linear time-delay systems is considered first, and a sufficient condition for a linear time-delay system to possess integrity against sensor failure is given. The robust fault-tolerant control problem for linear time-delay systems with parameter perturbations is then considered, a design method and procedure for robust fault-tolerant control of time-delay systems are given, and a design example with simulation results confirms the effectiveness of the method.

  9. High Speed Fault Injection Tool Implemented With Verilog HDL on FPGA for Testing Fault Tolerance Designs

    This paper presents an FPGA-based fault injection tool, called FITO, that supports several synthesizable fault models for dependability analysis of digital systems modeled in Verilog HDL. Using FITO, experiments can be performed in real time with good controllability and observability. As a case study, an OpenRISC 1200 microprocessor was evaluated using an FPGA circuit. About 4000 permanent, transient, and SEU faults were injected into this microprocessor. The results show that the FITO tool is more than 79 times faster than pure simulation-based fault injection, with only 2.5% FPGA area overhead.
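
    The three fault classes mentioned here can be modeled in software for intuition. The sketch below is a Python illustration of generic stuck-at, transient and bit-flip fault models acting on integer signal values; it is not FITO's Verilog implementation, and all function names are hypothetical.

```python
def stuck_at(value, bit, level):
    """Permanent fault model: force one bit of a signal to 0 or 1."""
    mask = 1 << bit
    return value | mask if level else value & ~mask


def bit_flip(value, bit):
    """SEU-style fault model: invert one stored bit."""
    return value ^ (1 << bit)


def transient(samples, bit, start, duration):
    """Transient fault model: invert one bit for a limited number of cycles.

    `samples` holds the fault-free signal value at each clock cycle.
    """
    return [bit_flip(v, bit) if start <= cycle < start + duration else v
            for cycle, v in enumerate(samples)]
```

    A fault injection campaign in this style would run the design once without faults and once per injected fault, then compare the outputs to classify each fault as masked or as causing a failure.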

  10. Fault tolerant wind turbine production operation and shutdown (Sustainable Control)

    Van Engelen, T.; Schuurmans, J.; Kanev, S.; Dong, J.; Verhaegen, M.H.G.; Hayashi, Y.


    Extreme environmental conditions as well as system failure are real-life phenomena. Especially offshore, extreme environmental conditions and system faults are to be dealt with in an effective way. The project Sustainable Control, a new approach to operate wind turbines (Agentschap NL, grant EOSLT02

  11. 2009 fault tolerance for extreme-scale computing workshop, Albuquerque, NM - March 19-20, 2009.

    Katz, D. S.; Daly, J.; DeBardeleben, N.; Elnozahy, M.; Kramer, B.; Lathrop, S.; Nystrom, N.; Milfeld, K.; Sanielevici, S.; Scott, S.; Votta, L.; Louisiana State Univ.; Center for Exceptional Computing; LANL; IBM; Univ. of Illinois; Shodor Foundation; Pittsburgh Supercomputer Center; Texas Advanced Computing Center; ORNL; Sun Microsystems


    This is a report on the third in a series of petascale workshops co-sponsored by Blue Waters and TeraGrid to address challenges and opportunities for making effective use of emerging extreme-scale computing. This workshop was held to discuss fault tolerance on large systems for running large, possibly long-running applications. The main point of the workshop was to have systems people, middleware people (including fault-tolerance experts), and applications people talk about the issues and figure out what needs to be done, mostly at the middleware and application levels, to run such applications on the emerging petascale systems without having faults cause large numbers of application failures. The workshop found that there is considerable interest in fault tolerance, resilience, and reliability of high-performance computing (HPC) systems in general, at all levels of HPC. The only way to recover from faults is through the use of some redundancy, either in space or in time. Redundancy in time, in the form of writing checkpoints to disk and restarting at the most recent checkpoint after a fault that causes an application to crash or halt, is the most common tool used in applications today, but there are questions about how long this can continue to be a good solution as systems and memories grow faster than I/O bandwidth to disk. There is interest both in modifications to this, such as checkpoints to memory, partial checkpoints, and message logging, and in alternative ideas, such as in-memory recovery using residues. We believe that systematic exploration of these ideas holds the most promise for the scientific applications community. Fault tolerance has been an issue of discussion in the HPC community for at least the past 10 years; but much like other issues, the community has managed to put off addressing it during this period. There is a growing recognition that as systems continue to grow to petascale and beyond, the field is approaching the point where we don

  12. Diagnosis and Fault-tolerant Control for Ship Station Keeping

    Blanke, Mogens


    design for systems of high complexity, and also analyse the cases of cascaded or multiple faults. The paper takes as example a ship with two CP propellers, rudders and a bow thruster as actuators, and instrumentation with a suite of global position sensors, inertial navigation units and conventional gyro...... units to provide ship motion information. A salient feature of the design method is the ability to analyse cases where faults have occurred and easily determine where in the faulty system diagnosability and controllability are retained....

  13. Fault tolerant, radiation hard, high performance digital signal processor

    Holmann, Edgar; Linscott, Ivan R.; Maurer, Michael J.; Tyler, G. L.; Libby, Vibeke


    An architecture has been developed for a high-performance VLSI digital signal processor that is highly reliable, fault-tolerant, and radiation-hard. The signal processor, part of a spacecraft receiver designed to support uplink radio science experiments at the outer planets, organizes the connections between redundant arithmetic resources, register files, and memory through a shuffle exchange communication network. The configuration of the network and the state of the processor resources are all under microprogram control, which both maps the resources according to algorithmic needs and reconfigures the processing should a failure occur. In addition, the microprogram is reloadable through the uplink to accommodate changes in the science objectives throughout the course of the mission. The processor will be implemented with silicon compiler tools, and its design will be verified through silicon compilation simulation at all levels from the resources to full functionality. By blending reconfiguration with redundancy the processor implementation is fault-tolerant and reliable, and possesses the long expected lifetime needed for a spacecraft mission to the outer planets.

  14. Checkpoint and Replication Oriented Fault Tolerant Mechanism for MapReduce Framework

    Yang Liu


    MapReduce is an emerging programming paradigm and an associated implementation for processing and generating big data, and it has been widely applied in data-intensive systems. In cloud environments, node and task failure is no longer accidental but a common feature of large-scale systems. In the MapReduce framework, although the rescheduling-based fault-tolerant method is simple to implement, it fails to fully consider the location of distributed data and the computation and storage overhead. Thus, a single node failure increases the completion time dramatically. In this paper, a Checkpoint and Replication Oriented Fault Tolerant scheduling algorithm (CROFT) is proposed, which takes both task and node failure into consideration. Preliminary experiments show that, with less storage and network overhead, CROFT significantly reduces the completion time when failures occur, and the overall performance of MapReduce can be improved by over 30% compared with the original mechanism in Hadoop.

  15. Fault-Tolerant Vision for Vehicle Guidance in Agriculture

    dropout of 3D vision, faults in classification, or other defects, redundant information should be utilized. Such information can be used to diagnose faulty behavior and to temporarily continue operation with a reduced set of sensors when faults or artifacts occur. Additional sensors include GPS receivers......, and aiding sensors such as GPS provide means to detect and isolate single faults in the system. In addition, learning is employed to adapt the system to variations in the natural environment. 3D vision is enhanced by learning texture and color information. Intensity gradients on small neighborhoods...... of pixels are shown to provide a better approach to modeling texture information than other methods. Stochastic automata using optimally quantized data are demonstrated as a strong approach for offline learning. It is considered how 3D vision provides labeling of training data that subsequently can be fed...

  16. A Simple Fault-Tolerant Adaptive and Minimal Routing Approach in 3-D Meshes

    WU Jie(吴杰)


    In this paper we propose a sufficient condition for minimal routing in 3-dimensional (3-D) meshes with faulty nodes. It is based on earlier work of the author on minimal routing in 2-dimensional (2-D) meshes. Unlike many traditional models that assume all the nodes know the global fault distribution or just adjacent fault information, our approach is based on the concept of limited global fault information. First, we propose a fault model called faulty cube in which all faulty nodes in the system are contained in a set of faulty cubes. Fault information is then distributed to a limited number of nodes while remaining sufficient to support minimal routing. The limited fault information collected at each node is represented by a vector called the extended safety level. The extended safety level associated with a node can be used to determine the existence of a minimal path from this node to a given destination. Specifically, we study the existence of minimal paths at a given source node, limited distribution of fault information, minimal routing, and deadlock-free and livelock-free routing. Our results show that any minimal routing that is partially adaptive can be applied in our model as long as the destination node meets a certain condition. We also propose a dynamic planar-adaptive routing scheme that offers better fault tolerance and adaptivity than the planar-adaptive routing scheme in 3-D meshes. Our approach is the first attempt to address adaptive and minimal routing in 3-D meshes with faulty nodes using limited fault information.

  17. DSP-Based Sensor Fault Detection and Post Fault-Tolerant Control of an Induction Motor-Based Electric Vehicle

    This paper deals with sensor fault detection within a reconfigurable direct torque control of an induction motor-based electric vehicle. The proposed strategy concerns current, voltage, and speed sensor faults that are detected and followed by post fault-tolerant control to allow the vehicle continuous operation. The proposed approach is validated through experiments on an induction motor drive and simulations on an electric vehicle using a European urban and extraurban driving cycle.

  18. Fault Detection for Nonlinear Systems

    Stoustrup, Jakob; Niemann, H.H.


    The paper describes a general method for designing fault detection and isolation (FDI) systems for nonlinear processes. For a rich class of nonlinear systems, a nonlinear FDI system can be designed using convex optimization procedures. The proposed method is a natural extension of methods based...

    Cen Zhaohui


    A systematic fault tolerant control (FTC) scheme based on fault estimation for a quadrotor actuator, which integrates normal control, active and passive FTC, and fault parking, is proposed in this paper. Firstly, an adaptive Thau observer (ATO) is presented to estimate the quadrotor rotor fault magnitudes, and then faults with different magnitudes and time-varying natures are rated into corresponding fault severity levels based on the pre-defined fault-tolerant boundaries. Secondly, a systematic FTC strategy which can coordinate various FTC methods is designed to compensate for failures depending on the fault types and severity levels. Unlike former stand-alone passive FTC or active FTC, our proposed FTC scheme can compensate for faults in the manner of condition-based maintenance (CBM), and in particular considers the fatal failures that traditional FTC techniques cannot accommodate, so as to avoid the crashing of UAVs. Finally, various simulations are carried out to show the performance and effectiveness of the proposed method.



    Multimedia services have drawn much attention from both industrial and academic researchers due to the emerging consumer market, and how to provide high-availability service is one of the most important issues to take into account. In this paper, a dynamic fault tolerant algorithm is presented for highly available distributed multimedia service. By introducing server load balancing (SLB) into fault tolerance and switching servers in different ways according to their functions, the proposed scheme can preserve the reliability and real-time behavior of the system. The analysis and experiments indicate that recovery from server faults by this method is smooth and transparent to the client. The proposed algorithm effectively improves the reliability of the multimedia service.

  1. Plugging Braking of Two-PMSM Drive in Subway Applications with Fault-Tolerant Operation

    The Permanent Magnet Synchronous Motor (PMSM) is commonly used as a traction motor in electric traction applications such as subway trains. The subway train is a favorable transport vehicle owing to its safety, economy, health benefits, and environmental friendliness. Braking is defined as removal of the kinetic energy stored in the moving parts of a machine. Plugging braking offers the shortest time to stop. The subway train is a heavy machine with a very high moment of inertia, requiring a high braking torque to stop, so plugging braking is an effective method to bring the train to a fast stop. In this paper, a plugging braking system for the PMSMs used in a subway train is developed for normal and fault-tolerant operation. The model of the PMSM, a three-phase Voltage Source Inverter (VSI) controlled using the Space Vector Pulse Width Modulation (SVPWM) technique, the Field Oriented Control (FOC) method for independent control of two identical PMSMs, and fault-tolerant operation are presented. A model of the plugging braking system of the PMSM in normal and fault-tolerant operation is proposed using Matlab/Simulink software. Simulation results for different cases are given.

  2. A Semantics-Based Approach for Achieving Self Fault-Tolerance of Protocols

    李腊元; 李春林


    The cooperation of different processes may be lost by mistake when a protocol is executed. The protocol cannot operate normally under this condition. In this paper, the self fault-tolerance of protocols is discussed, and a semantics-based approach for achieving self fault-tolerance of protocols is presented. Some main characteristics of self fault-tolerance of protocols concerning liveness, nontermination and infinity are also presented. Meanwhile, the sufficient and necessary conditions for achieving self fault-tolerance of protocols are given. Finally, a typical protocol that does not satisfy self fault-tolerance is investigated, and a redesigned version of this existing protocol using the proposed approach is given.

  3. Design of Parity Preserving Logic Based Fault Tolerant Reversible Arithmetic Logic Unit

    Reversible logic is gaining significant consideration as a potential logic design style for implementation in modern nanotechnology and quantum computing with minimal impact on physical entropy. Fault tolerant reversible logic is one class of reversible logic that maintains the parity of the inputs and the outputs. Significant contributions have been made in the literature towards the design of fault tolerant reversible logic gate structures and arithmetic units; however, few efforts have been directed towards the design of fault tolerant reversible ALUs. The Arithmetic Logic Unit (ALU) is the prime performing unit in any computing device and it has to be made fault tolerant. In this paper we aim to design one such fault tolerant reversible ALU that is constructed using parity preserving reversible logic gates. The designed ALU can generate up to seven arithmetic operations and four logical operations.
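
    The parity-preserving property can be checked exhaustively for small gates. The sketch below tests three standard 3-bit reversible gates; the abstract does not say which gates the proposed ALU uses, so the choice of the Fredkin and double Feynman gates (with the Toffoli gate as a counter-example) and all function names are assumptions made for illustration.

```python
from itertools import product

def fredkin(a, b, c):
    """Controlled swap: b and c are exchanged when a == 1."""
    return (a, c, b) if a else (a, b, c)

def double_feynman(a, b, c):
    """Double Feynman gate (F2G): a is XORed onto both b and c."""
    return (a, a ^ b, a ^ c)

def toffoli(a, b, c):
    """Controlled-controlled-NOT: c is flipped when a == b == 1."""
    return (a, b, c ^ (a & b))

def parity(bits):
    """XOR of all bits in the tuple."""
    p = 0
    for bit in bits:
        p ^= bit
    return p

def is_reversible(gate, width=3):
    """A gate is reversible when distinct inputs map to distinct outputs."""
    inputs = list(product((0, 1), repeat=width))
    return len({gate(*x) for x in inputs}) == len(inputs)

def is_parity_preserving(gate, width=3):
    """Input parity equals output parity for every input pattern."""
    return all(parity(x) == parity(gate(*x))
               for x in product((0, 1), repeat=width))
```

    All three gates are reversible, but only the first two preserve parity: the Toffoli gate maps (1, 1, 0) to (1, 1, 1), which changes the parity. In a parity-preserving circuit, a single-bit fault shows up as a parity mismatch between inputs and outputs.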

  4. Certifying qubit operations below the fault tolerance threshold

    Blume-Kohout, Robin; Nielsen, Erik; Rudinger, Kenneth; Mizrahi, Jonathan; Fortier, Kevin; Maunz, Peter


    Quantum information processors promise fast algorithms for problems inaccessible to classical computers. But since qubits are noisy and error-prone, they will depend on fault-tolerant quantum error correction (FTQEC) to compute reliably. Quantum error correction can protect against general noise if -- and only if -- the error in each physical qubit operation is smaller than a certain threshold. The threshold for general errors is quantified by their diamond norm. Until now, qubits have been assessed primarily by randomized benchmarking (RB), which reports a different "error rate" that is not sensitive to all errors, cannot be compared directly to diamond norm thresholds, and cannot efficiently certify a qubit for FTQEC. We use gate set tomography (GST) to completely characterize the performance of a trapped-Yb$^+$-ion qubit and certify it rigorously as suitable for FTQEC by establishing that its diamond norm error rate is less than $6.7\times10^{-4}$ with $95\%$ confidence.

  5. Incorporating Fault Tolerance in LEACH Protocol for Wireless Sensor Networks

    Routing protocols have been a challenging issue in wireless sensor networks (WSNs). WSNs are a focused area of research because of their multi-aspect applications. These networks are self-organized using clustering algorithms to conserve energy. The LEACH (Low-Energy Adaptive Clustering Hierarchy) protocol [1] is one of the significant protocols for routing in WSNs. In LEACH, sensor nodes are organized in several small clusters, each with a cluster head (CH). These CHs gather data from their local clusters, aggregate them, and send them to the base station. Many new schemes based on LEACH have been proposed to enhance aspects such as its efficiency and security. In this paper the fault tolerance issue is incorporated.

  6. Fault Tolerance Mechanism in Chip Many-Core Processors

    As semiconductor technology advances, there will be billions of transistors on a single chip. Chip many-core processors are emerging to take advantage of these greater transistor densities to deliver greater performance. Effective fault tolerance techniques are essential to improve the yield of such complex chips. In this paper, a core-level redundancy scheme called N+M is proposed to improve the yield of N-core processors by providing M spare cores. In such an architecture, topology is an important factor because it greatly affects the processors' performance. The concept of logical topology and a topology reconfiguration problem are introduced; the goal is to transparently provide the target topology with the lowest performance degradation in the presence of faulty cores on-chip. A row rippling and column stealing (RRCS) algorithm is also proposed. Results show that RRCS can give solutions with an average 13.8% degradation in negligible computing time.
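
    The yield benefit of spare cores can be illustrated with a simple binomial model: the chip is usable when at least N of its N+M cores are fault-free. The sketch below assumes independent core defects with an identical per-core yield and ignores interconnect defects and the topology constraints the paper addresses; the function and parameter names are hypothetical.

```python
from math import comb

def chip_yield(n, m, core_yield):
    """Probability that at least n of the n + m cores are fault-free.

    Assumption (not from the abstract): cores fail independently, each
    being good with probability core_yield; topology is ignored.
    """
    total = n + m
    return sum(
        comb(total, good) * core_yield**good * (1 - core_yield)**(total - good)
        for good in range(n, total + 1)
    )
```

    Under this model, a 4-core chip with a per-core yield of 0.9 has a chip yield of 0.9^4 = 0.6561 without spares, and adding one spare core raises it to about 0.9185.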

  7. Experimental magic state distillation for fault-tolerant quantum computing.

    Souza, Alexandre M; Zhang, Jingfu; Ryan, Colm A; Laflamme, Raymond


    Any physical quantum device for quantum information processing (QIP) is subject to errors in implementation. In order to be reliable and efficient, quantum computers will need error-correcting or error-avoiding methods. Fault-tolerance achieved through quantum error correction will be an integral part of quantum computers. Of the many methods that have been discovered to implement it, a highly successful approach has been to use transversal gates and specific initial states. A critical element for its implementation is the availability of high-fidelity initial states, such as |0〉 and the 'magic state'. Here, we report an experiment, performed in a nuclear magnetic resonance (NMR) quantum processor, showing sufficient quantum control to improve the fidelity of imperfect initial magic states by distilling five of them into one with higher fidelity.

  8. Experimental magic state distillation for fault-tolerant quantum computing

    Any physical quantum device for quantum information processing is subject to errors in implementation. In order to be reliable and efficient, quantum computers will need error correcting or error avoiding methods. Fault-tolerance achieved through quantum error correction will be an integral part of quantum computers. Of the many methods that have been discovered to implement it, a highly successful approach has been to use transversal gates and specific initial states. A critical element for its implementation is the availability of high-fidelity initial states such as |0> and the Magic State. Here we report an experiment, performed in a nuclear magnetic resonance (NMR) quantum processor, showing sufficient quantum control to improve the fidelity of imperfect initial magic states by distilling five of them into one with higher fidelity.

  9. Data Structures: Sequence Problems, Range Queries, and Fault Tolerance

    for a range of sequence analysis problems that have risen from applications in pattern matching, bioinformatics, and data mining. On a high level, each problem is defined by a function and some constraints and the job at hand is to locate subsequences that score high with this function and are not invalidated...... a certain function on the elements in a given query subsequence. There are many types of functions that have been considered in connection with input from different sources. The input could be ip-data sorted by ip-address, real estate prices sorted by zip code, advertising cost sorted by time etc. We consider...... data structures for two classic statistics functions, namely median and mode. Finally, Part III investigates fault tolerant algorithms and data structures. This deals with the trend of avoiding elaborate error checking and correction circuitry that would impose non-negligible costs in terms of hardware...

  10. Fault Tolerant Distributed and Fixed Hierarchical Mobile IP

    For several mobility management protocols proposed for IP-based mobile networks, the fault tolerance of mobility agents is a primary requirement to sustain continuous service availability to the mobile hosts. For a localized or micro-mobility management solution, the local mobility agent, i.e. the gateway, is a single point of failure because it is responsible for forwarding the signaling and data packets in its domain. Such failures may severely disrupt the communications among the failure-affected users. The problem becomes even more severe for mobility agents in a distributed mobility management scheme with overlapping registration areas. This paper proposes a fault tolerance scheme for Distributed and Fixed Hierarchical Mobile IP (DFHMIP) and evaluates its performance in terms of data transmission cost and blocking probability.

  11. Fault tolerant quantum random number generator certified by Majorana fermions

    Deng, Dong-Ling; Duan, Lu-Ming


    Braiding of Majorana fermions gives accurate topological quantum operations that are intrinsically robust to noise and imperfection, providing a natural method to realize fault-tolerant quantum information processing. Unfortunately, it is known that braiding of Majorana fermions is not sufficient for implementation of universal quantum computation. Here we show that topological manipulation of Majorana fermions provides the full set of operations required to generate random numbers by way of quantum mechanics and to certify its genuine randomness through violation of a multipartite Bell inequality. The result opens a new perspective to apply Majorana fermions for robust generation of certified random numbers, which has important applications in cryptography and other related areas. This work was supported by the NBRPC (973 Program) 2011CBA00300 (2011CBA00302), the IARPA MUSIQC program, the ARO and the AFOSR MURI program.

  12. Fault-tolerant digital microfluidic biochips compilation and synthesis

    Pop, Paul; Stuart, Elena; Madsen, Jan


    This book describes for researchers in the fields of compiler technology, design and test, and electronic design automation the new area of digital microfluidic biochips (DMBs), and thus offers a new application area for their methods. The authors present a routing-based model of operation execution, along with several associated compilation approaches, which progressively relax the assumption that operations execute inside fixed rectangular modules. Since operations can experience transient faults during the execution of a bioassay, the authors show how to use both offline (design time) and online (runtime) recovery strategies. The book also presents methods for the synthesis of fault-tolerant application-specific DMB architectures. · Presents the current models used for the research on compilation and synthesis techniques of DMBs in a tutorial fashion; · Includes a set of “benchmarks”, which are presented in great detail and includes the source code of most of the t...

  13. Novel approach to fault-tolerant logic and yield enhancement

    Takefuji, Y.; Adachi, Y.; Aiso, H.


    A design technique for improving the functional reliability of a gate is proposed, in which a plurality of conventional logic circuits (gates) are used so as to give redundancy to the logic circuit itself. The gate with redundancy designed on the basis of the proposed technique is called the fault-tolerant gate (FTG) in this paper. The FTG has a recovery function with respect to a wider variety of faults, and it is much more powerful than that offered by TMR (triple modular redundancy) circuits. Therefore, highly reliable logic circuits can be realized, and when the concept of FTGs is applied to VLSI chips the production yield should be enhanced. This paper is divided into three parts. In the first part, concrete methods to realize FTGs are described. The second part proves that the reliability of the gates can be improved by employing the concept of FTGs. In the last part, it is shown that the FTG contributes to the yield enhancement of VLSI chips. 13 references.

  14. Comparing fault susceptibility of multiple ISAs and operating systems

    This paper presents research that compares the effects of faults on different configurations of computer systems. The study compares the fault susceptibility of the x86, AMD64, ARM, PowerPC, and MIPS architectures and the Linux, FreeBSD, and Minix operating systems. An emulation-based software-implemented fault injection technique was used to perform the experiments. The problem of choosing an adequate number of tests is discussed, followed by a report of the collected results, in which multiple aspects of the test runs were analyzed: correctness of the computation result, availability of the system under test, and error messages. The research makes it possible to determine the fault-susceptibility characteristics of each platform and is a first step towards designing new fault tolerance solutions and assessing their effectiveness.

  15. Relaxed fault-tolerant hardware implementation of neural networks in the presence of multiple transient errors.

    Reliability should be identified as the most important challenge in future nano-scale very large scale integration (VLSI) implementation technologies for the development of complex integrated systems. Normally, fault tolerance (FT) in a conventional system is achieved by increasing its redundancy, which also implies higher implementation costs and lower performance that sometimes makes it even infeasible. In contrast to custom approaches, a new class of applications is categorized in this paper, which is inherently capable of absorbing some degrees of vulnerability and providing FT based on their natural properties. Neural networks are good indicators of imprecision-tolerant applications. We have also proposed a new class of FT techniques called relaxed fault-tolerant (RFT) techniques which are developed for VLSI implementation of imprecision-tolerant applications. The main advantage of RFT techniques with respect to traditional FT solutions is that they exploit inherent FT of different applications to reduce their implementation costs while improving their performance. To show the applicability as well as the efficiency of the RFT method, the experimental results for implementation of a face-recognition computationally intensive neural network and its corresponding RFT realization are presented in this paper. The results demonstrate promising higher performance of artificial neural network VLSI solutions for complex applications in faulty nano-scale implementation environments.

  16. Transient Faults in Computer Systems

    Masson, Gerald M.


    A powerful technique particularly appropriate for the detection of errors caused by transient faults in computer systems was developed. The technique can be implemented in either software or hardware; the research conducted thus far primarily considered software implementations. The error detection technique developed has the distinct advantage of having provably complete coverage of all errors caused by transient faults that affect the output produced by the execution of a program. In other words, the technique does not have to be tuned to a particular error model to enhance error coverage. Also, the correctness of the technique can be formally verified. The technique uses time and software redundancy. The foundation for an effective, low-overhead, software-based certification trail approach to real-time error detection resulting from transient fault phenomena was developed.

  17. SIFT - A preliminary evaluation. [Software Implemented Fault Tolerant computer for aircraft control]

    Palumbo, D. L.; Butler, R. W.


    This paper presents the results of a performance evaluation of the SIFT computer system conducted in the NASA AIRLAB facility. The essential system functions are described and compared to both earlier design proposals and subsequent design improvements. The functions supporting fault tolerance are found to consume significant computing resources. With SIFT's specimen task load, scheduled at a 30-Hz rate, the executive tasks, such as reconfiguration, clock synchronization and interactive consistency, require 55 percent of the available task slots. Other system overhead (e.g., voting and scheduling) uses an average of 50 percent of each remaining task slot.
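
    Taken together, the two figures imply how little capacity is left for application work. The calculation below is only a back-of-the-envelope reading of the abstract; it assumes that the two overheads compose multiplicatively (the 50 percent applies to the time within the 45 percent of slots not taken by executive tasks), and the symbol $f_{\text{app}}$ is introduced here for illustration.

```latex
\[
  f_{\text{app}} = (1 - 0.55) \times (1 - 0.50) = 0.45 \times 0.50 = 0.225
\]
```

    Under that assumption, roughly 22.5 percent of the total task-slot capacity remains for application computation.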

  18. Imprecise Computation Based Real-time Fault Tolerant Implementation for Model Predictive Control


    Model predictive control (MPC) cannot readily be deployed in real-time control systems because its computation time is not well defined. A real-time fault tolerant implementation algorithm based on imprecise computation is proposed for MPC, following the solution process of the quadratic programming (QP) problem. In this algorithm, system stability is guaranteed even when computation resources are insufficient to complete the optimization. Through this kind of graceful degradation, the behavior of real-time control systems remains predictable and deterministic. The algorithm is demonstrated by experiments on a servomotor, and the simulation results show its effectiveness.
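
    The general idea of imprecise computation, returning a usable intermediate result when the deadline cuts the optimization short, can be sketched with an iterative QP solver that is stopped after a fixed iteration budget. This is not the paper's algorithm and says nothing about its stability guarantee; the projected-gradient method, the box constraints, the iteration budget standing in for the deadline, and all names are assumptions made for illustration.

```python
def solve_qp_anytime(H, g, lower, upper, x0, max_iters, step):
    """Projected-gradient iterations on  min 0.5*x'Hx + g'x  s.t. lower <= x <= upper.

    Stops after max_iters iterations (a stand-in for the real-time deadline)
    and returns the current iterate, which is always feasible.
    """
    n = len(x0)
    # Start from a feasible point by clipping x0 into the box.
    x = [min(max(v, lo), hi) for v, lo, hi in zip(x0, lower, upper)]
    for _ in range(max_iters):
        grad = [sum(H[i][j] * x[j] for j in range(n)) + g[i] for i in range(n)]
        # Gradient step followed by projection back onto the box.
        x = [min(max(x[i] - step * grad[i], lower[i]), upper[i]) for i in range(n)]
    return x

def cost(H, g, x):
    """Evaluate the QP objective 0.5*x'Hx + g'x."""
    n = len(x)
    quad = sum(x[i] * H[i][j] * x[j] for i in range(n) for j in range(n))
    return 0.5 * quad + sum(gi * xi for gi, xi in zip(g, x))
```

    For a convex QP and a step no larger than the reciprocal of the largest eigenvalue of H, each extra iteration leaves the cost unchanged or lowers it, so a larger budget never yields a worse control move. For example, with H = [[2, 0], [0, 4]], g = [-2, -8], bounds [0, 1.5] on both variables, and step 0.25, the iterates approach (1, 1.5).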

  19. Fault Tolerant Three-Phase AC Motor Drive Topologies: A Comparison of Features, Cost, and Limitations (To Continue)


    C. Phase-Redundant Topology. The ability to isolate a faulty phase-leg opens the possibility of introducing a spare inverter leg for improved fault tolerance, as shown in Fig. 8. The configuration will be referred to as the phase-redundant topology. This circuit topology incorporates the fault isolating SCRs and fuses in only the three active legs of the inverter. A spare fourth leg of the inverter is connected in place of the faulty phase-leg after the fault isolating devices have removed that leg from the system.

  20. Reliable H∞ control of discrete-time systems against random intermittent faults

    Tao, Yuan; Shen, Dong; Fang, Mengqi; Wang, Youqing


    A passive fault-tolerant control strategy is proposed for systems subject to a novel kind of intermittent fault, which is described by a Bernoulli distributed random variable. Three cases of fault location are considered, namely, sensor fault, actuator fault, and both sensor and actuator faults. The dynamic feedback controllers are designed not only to stabilise the fault-free system, but also to guarantee an acceptable performance of the faulty system. The robust H∞ performance index is used to evaluate the effectiveness of the proposed control scheme. In terms of linear matrix inequality, the sufficient conditions of the existence of controllers are given. An illustrative example indicates the effectiveness of the proposed fault-tolerant control method.

  1. Hybrid fault diagnosis of nonlinear systems using neural parameter estimators.

    Sobhani-Tehrani, E; Talebi, H A; Khorasani, K


  2. Fault-tolerant scheduling algorithm with the load factor in Cyber-Physical Systems heterogeneous distributed real-time systems

    This paper introduces Cyber-Physical Systems (CPS) and their basic concepts. It then gives a heterogeneous distributed real-time task system model for Cyber-Physical Systems. Based on this model and the primary-backup technique, the paper proposes two heuristic fault-tolerant scheduling algorithms suited to the heterogeneous distributed real-time environment of Cyber-Physical Systems: HDLMA (Heterogeneous Distributed Min Loading Algorithm) and HDLFA (Heterogeneous Distributed Loading Factor Algorithm). Finally, the paper analyzes their schedulability and load balancing, the influence of task granularity on load balancing, and how the scheduling threshold affects schedulability.

  3. Fault-Tolerant Region-Based Control of an Underwater Vehicle with Kinematically Redundant Thrusters

    Zool H. Ismail


  4. Fault Tolerant Flight Control Using Sliding Modes and Subspace Identification-Based Predictive Control

    Siddiqui, Bilal A.


  5. Fault tolerant motion planning based on joint torque limit for redundant manipulators


    First, two fault tolerant planning algorithms with avoidance of joint static torque limit or joint dynamic torque limit are proposed respectively. The former is suitable for the low-speed manipulators, and the latter is suitable for the high-speed manipulators. These algorithms not only can insure manipulation tasks to lie within the fault tolerant workspace but also can avoid joint torque limit, and hence can insure a redundant manipulator to be fault tolerant in both kinematical sense and dynamic sense. Then, the simulation examples for a planar 3R manipulator demonstrate the validity of these algorithms.

  6. Fault-tolerant measurement-based quantum computing with continuous-variable cluster states.

    Menicucci, Nicolas C


    A long-standing open question about Gaussian continuous-variable cluster states is whether they enable fault-tolerant measurement-based quantum computation. The answer is yes. Initial squeezing in the cluster above a threshold value of 20.5 dB ensures that errors from finite squeezing acting on encoded qubits are below the fault-tolerance threshold of known qubit-based error-correcting codes. By concatenating with one of these codes and using ancilla-based error correction, fault-tolerant measurement-based quantum computation of theoretically indefinite length is possible with finitely squeezed cluster states.

    李琪林; 陈宇; 周明天


    Presently, distributed object technology such as CORBA has increasingly become mature. More and more distributed application systems are implemented using the standard services and protocols provided by CORBA. New-generation distributed systems such as real-time systems, online payment systems and stock exchange systems demand assurance of dependability. Fault tolerance is a main way of assuring system reliability. It therefore requires the low-level CORBA infrastructure to provide fault-tolerance mechanisms to ensure dependability and availability. This paper firstly discusses the implementation strategy and system model of fault-tolerant CORBA object systems. Secondly, it describes the main challenges and solutions in the design of fault-tolerant CORBA systems. Thirdly, it introduces a fault-tolerant CORBA prototype system, TBAFTS, built on top of a CORBA-compliant object middleware, TongBroker, developed by us independently. Finally we give our conclusion.

    滕青芳; 孙金龙; 范多旺


    The problem of robust H∞ fault-tolerant control for nonlinear structural vibration systems with multiple constraints is investigated. According to structural dynamics theory, a state-space model containing multiple constraints, such as input time-varying delay, actuator failure, nonlinear parameter perturbation, and disturbance, is established. Based on state feedback and Lyapunov stability theory, a sufficient condition for the existence of a robust H∞ fault-tolerant controller is derived and then transformed into the corresponding Linear Matrix Inequality (LMI). During the derivation, the matrix inequality is enlarged only twice and the result depends on the input delay, so as to reduce the conservatism of the controller design as far as possible. The resultant controller enables nonlinear structural vibration systems to retain robust stability and disturbance attenuation as well as to tolerate actuator failure. A building model with four degrees of freedom subjected to the El Centro earthquake wave is simulated to examine the effectiveness of the algorithm, and the results show that the proposed method is feasible.


    Gossiping is a popular technique for probabilistic reliable multicast (or broadcast). However, it is often difficult to understand the behavior of gossiping algorithms in an analytic fashion. Indeed, existing analyses of gossip algorithms are either based on simulation or based on ideas borrowed from epidemic models while inheriting some features that do not seem to be appropriate for the setting of gossiping. On one hand, in epidemic spreading, an infected node typically intends to spread the infection an unbounded number of times (or rounds); whereas in gossiping, an infected node (i.e., a node having received the message in question) may prefer to gossip the message a bounded number of times. On the other hand, the often assumed homogeneity in epidemic spreading models (especially that every node has equal contact to everyone else in the population) has been silently inherited in the gossiping literature, meaning that an expensive membership protocol is often needed for maintaining nodes' views. Motivated by these observations, the authors present a characterization of a popular class of fault-tolerant gossip schemes (known as "push-based gossiping") based on a novel probabilistic model, while taking the aforementioned factors into consideration.
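
    The difference between bounded gossiping and unbounded epidemic spreading is easy to see in a small simulation. The sketch below is a simulation-style illustration, not the authors' probabilistic model: each informed node pushes to a fixed number of uniformly chosen peers per round and stops after a bounded number of rounds. The uniform peer choice is exactly the homogeneity assumption the abstract questions, and all names and parameters are hypothetical.

```python
import random

def push_gossip(n, fanout, max_rounds, seed=0):
    """Simulate push-based gossip among n nodes; return how many got the message.

    Each informed node pushes to `fanout` uniformly chosen peers per round and
    stops after it has gossiped for `max_rounds` rounds (bounded gossiping).
    """
    rng = random.Random(seed)
    rounds_left = {0: max_rounds}  # node 0 is the initial sender
    informed = {0}
    while any(left > 0 for left in rounds_left.values()):
        newly = set()
        for node, left in list(rounds_left.items()):
            if left == 0:
                continue  # this node has used up its gossip budget
            rounds_left[node] = left - 1
            peers = [p for p in range(n) if p != node]
            for target in rng.sample(peers, min(fanout, len(peers))):
                if target not in informed:
                    newly.add(target)
        # Nodes informed in this round start gossiping in the next one.
        for node in newly:
            informed.add(node)
            rounds_left[node] = max_rounds
    return len(informed)
```

    Running it for several seeds with a small `max_rounds` shows that bounded gossiping may leave some nodes uninformed, which is why the coverage of such schemes has to be analyzed probabilistically.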

  10. ALLIANCE: An architecture for fault tolerant multi-robot cooperation

    ALLIANCE is a software architecture that facilitates the fault tolerant cooperative control of teams of heterogeneous mobile robots performing missions composed of loosely coupled, largely independent subtasks. ALLIANCE allows teams of robots, each of which possesses a variety of high-level functions that it can perform during a mission, to individually select appropriate actions throughout the mission based on the requirements of the mission, the activities of other robots, the current environmental conditions, and the robot's own internal states. ALLIANCE is a fully distributed, behavior-based architecture that incorporates the use of mathematically modeled motivations (such as impatience and acquiescence) within each robot to achieve adaptive action selection. Since cooperative robotic teams usually work in dynamic and unpredictable environments, this software architecture allows the robot team members to respond robustly, reliably, flexibly, and coherently to unexpected environmental changes and modifications in the robot team that may occur due to mechanical failure, the learning of new skills, or the addition or removal of robots from the team by human intervention. The feasibility of this architecture is demonstrated in an implementation on a team of mobile robots performing a laboratory version of hazardous waste cleanup.

  11. Fault-tolerant quantum blind signature protocols against collective noise

    This work proposes two fault-tolerant quantum blind signature protocols based on the entanglement swapping of logical Bell states, which are robust against two kinds of collective noise: collective-dephasing noise and collective-rotation noise, respectively. Both of the quantum blind signature protocols are constructed from four-qubit decoherence-free (DF) states, i.e., logical Bell qubits. The initial message is encoded on the logical Bell qubits with logical unitary operations, which do not destroy the anti-noise property of the logical Bell qubits. Based on the fundamental property of quantum entanglement swapping, the receiver simply performs two Bell-state measurements (rather than four-qubit joint measurements) on the logical Bell qubits to verify the signature, which makes the protocols more convenient in practical applications. Unlike existing quantum signature protocols, our protocols can offer high-fidelity quantum communication through the use of logical qubits. Moreover, we prove the security of the protocols against some individual eavesdropping attacks, and we show that our protocols have the characteristics of unforgeability, undeniability and blindness.

  12. Fault-tolerant error correction with the gauge color code

    The constituent parts of a quantum computer are inherently vulnerable to errors. To this end, we have developed quantum error-correcting codes to protect quantum information from noise. However, discovering codes that are capable of a universal set of computational operations with the minimal cost in quantum resources remains an important and ongoing challenge. One proposal of significant recent interest is the gauge color code. Notably, this code may offer a reduced resource cost over other well-studied fault-tolerant architectures by using a new method, known as gauge fixing, for performing the non-Clifford operations that are essential for universal quantum computation. Here we examine the gauge color code when it is subject to noise. Specifically, we make use of single-shot error correction to develop a simple decoding algorithm for the gauge color code, and we numerically analyse its performance. Remarkably, we find threshold error rates comparable to those of other leading proposals. Our results thus provide the first steps of a comparative study between the gauge color code and other promising computational architectures.

  13. Probabilistic analysis on fault tolerance of 3-Dimensional mesh networks

    王高才; 陈建二; 王国军; 陈松乔


    A probability model is used to analyze the fault tolerance of mesh networks. To simplify the analysis, it is assumed that the failure probability of each node is independent. A 3-D mesh is partitioned into smaller submeshes, and then the probability with which each submesh satisfies the defined condition is computed. If each submesh satisfies the condition, then the whole mesh is connected. Consequently, the probability that a 3-D mesh is connected is computed assuming each node has a failure probability. Mathematical methods are used to derive a relationship between network node failure probability and network connectivity probability. The calculated results show that 3-D mesh networks can remain connected with very high probability in practice. It is formally proved that when the network node failure probability is bounded by 0.45%, 3-D mesh networks of more than three hundred thousand nodes remain connected with probability larger than 99%. The theoretical results show that the method is a powerful technique to calculate the lower bound of the connectivity probability of mesh networks.
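
    The connectivity question studied in this abstract can be explored numerically on a small mesh. The sketch below is a plain Monte Carlo estimate, not the submesh-partition analysis of the paper; the function names, the cubic n x n x n shape, and the convention that a mesh with no surviving nodes counts as connected are all assumptions made for illustration.

```python
import random
from collections import deque

# The six axis-aligned neighbor offsets of a node in a 3-D mesh.
NEIGHBORS = ((1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1))


def mesh_connected(n, failed):
    """Return True if all non-failed nodes of an n*n*n mesh form one component."""
    alive = [(x, y, z)
             for x in range(n) for y in range(n) for z in range(n)
             if (x, y, z) not in failed]
    if not alive:
        return True  # assumed convention: nothing left to disconnect
    seen = {alive[0]}
    queue = deque([alive[0]])
    while queue:  # breadth-first search over surviving nodes
        x, y, z = queue.popleft()
        for dx, dy, dz in NEIGHBORS:
            nb = (x + dx, y + dy, z + dz)
            inside = all(0 <= c < n for c in nb)
            if inside and nb not in failed and nb not in seen:
                seen.add(nb)
                queue.append(nb)
    return len(seen) == len(alive)


def estimate_connectivity(n, p, trials, seed=0):
    """Estimate P(mesh stays connected) when each node fails independently with probability p."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        failed = {(x, y, z)
                  for x in range(n) for y in range(n) for z in range(n)
                  if rng.random() < p}
        hits += mesh_connected(n, failed)
    return hits / trials


if __name__ == "__main__":
    print(estimate_connectivity(n=6, p=0.0045, trials=200))
```

    A simulation of this kind only samples the connectivity probability for small meshes; the formal lower bound reported in the abstract comes from the analytic method, which this sketch does not reproduce.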

  14. Proposal of fault-tolerant tomographic image reconstruction

    Kudo, Hiroyuki; Yamazaki, Fukashi; Nemoto, Takuya


    This paper deals with tomographic image reconstruction under the situation where some of projection data bins are contaminated with abnormal data. Such situations occur in various instances of tomography. We propose a new reconstruction algorithm called the Fault-Tolerant reconstruction outlined as follows. The least-squares (L2-norm) error function ||Ax-b||_2^2 used in ordinary iterative reconstructions is sensitive to the existence of abnormal data. The proposed algorithm utilizes the L1-norm error function ||Ax-b||_1^1 instead of the L2-norm, and we develop a row-action-type iterative algorithm using the proximal splitting framework in convex optimization fields. We also propose an improved version of the L1-norm reconstruction called the L1-TV reconstruction, in which a weak Total Variation (TV) penalty is added to the cost function. Simulation results demonstrate that reconstructed images with the L2-norm were severely damaged by the effect of abnormal bins, whereas images with the L1-norm and L1-TV reco...
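
    The sensitivity difference between the two error functions can be seen in a deliberately tiny case. The sketch below assumes A is a single column of ones, so every bin b_i observes the same unknown scalar x; under that assumption the L2-norm minimizer is the mean of b and an L1-norm minimizer is the median of b. This is only an illustration of the norms, not the row-action proximal splitting algorithm proposed by the authors, and the data values are made up.

```python
from statistics import mean, median


def l2_estimate(b):
    """Minimizer of sum((x - b_i)**2) over scalar x: the arithmetic mean."""
    return mean(b)


def l1_estimate(b):
    """A minimizer of sum(abs(x - b_i)) over scalar x: the median."""
    return median(b)


# Four normal bins and one abnormal bin (hypothetical values).
clean = [10.0, 10.1, 9.9, 10.0, 10.0]
contaminated = clean[:-1] + [1000.0]

if __name__ == "__main__":
    print("L2 estimate with abnormal bin:", l2_estimate(contaminated))
    print("L1 estimate with abnormal bin:", l1_estimate(contaminated))
```

    In this toy case the single abnormal bin drags the L2 estimate far from 10, while the L1 estimate stays at the value of the normal bins.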

  15. Fault-tolerant error correction with the gauge color code

    Brown, Benjamin J; Nickerson, Naomi H; Browne, Dan E


    The constituent parts of a quantum computer are inherently vulnerable to errors. To this end, we have developed quantum error-correcting codes to protect quantum information from noise. However, discovering codes that are capable of a universal set of computational operations with the minimal cost in quantum resources remains an important and ongoing challenge. One proposal of significant recent interest is the gauge color code. Notably, this code may offer a reduced resource cost over other well-studied fault-tolerant architectures by using a new method, known as gauge fixing, for performing the non-Clifford operations that are essential for universal quantum computation. Here we examine the gauge color code when it is subject to noise. Specifically, we make use of single-shot error correction to develop a simple decoding algorithm for the gauge color code, and we numerically analyse its performance. Remarkably, we find threshold error rates comparable to those of other leading proposals. Our results thus provide the first steps of a comparative study between the gauge color code and other promising computational architectures.

  16. Fault Tolerant Architecture For A Fly-By-Light Flight Control Computer

    The next generation of flight control computers will utilize fiber optic technology to produce a fly-by-light flight control system. Optical transducers and optical fibers will take the place of electrical position transducers and wires, torsion bars, bell cranks, and cables. Applications for this fly-by-light technology include space launch vehicles, upper stages, spacecraft, and commercial/military aircraft. Optical fibers are lighter than mechanical transmission media and, unlike conventional wire transmissions, are not susceptible to electromagnetic interference (EMI) and high energy emission sources. This paper will give an overview of a fault tolerant In-Line Monitored optical flight control system being developed at Boeing Aerospace & Electronics in Seattle, Washington. This system uses passive transducers with fiber optic interconnections, which hold promise to virtually eliminate EMI threats to flight control system performance and flight safety and also provide significant weight savings. The main emphasis of this paper will be the In-Line Monitored architecture of the optical transducer system required for use in a fault tolerant flight control system.

  17. Physiological hemostasis based intelligent integrated cooperative controller for precise fault-tolerant control of redundant parallel manipulator

    This paper focuses on precise fault-tolerant control for an actual redundant parallel manipulator. Based on kinematic redundancy, some unnoticed influences such as mechanical clearance have been considered to design a more precise and intelligent fault-tolerant plan for actual plants. According to regulation principles in the human hemostasis system, a bio-inspired intelligent integrated cooperative controller (BIICC) is developed, including its system structure, algorithm and parameter tuning steps. The proposed BIICC optimises the partial error signal and improves control performance in each sub-channel. Moreover, the new controller transfers and disposes cooperative control signals among different sub-channels to achieve an intelligent integrated fault-tolerant system. The proposed BIICC is applied to an actual 2-DOF (degrees of freedom) redundant parallel manipulator, where the feasibility of the new controller is demonstrated. The BIICC is beneficial to the control precision and fault-tolerant capability of the redundant plant. The improvements are more obvious in cases where extra actuators of the redundant manipulator are broken.

  18. Critical Gates Identification for Fault-Tolerant Design in Math Circuits

    Hardware redundancy at different levels of design is a common fault mitigation technique, which is well known for its efficiency to the detriment of area overhead. In order to reduce this drawback, several fault-tolerant techniques have been proposed in literature to find a good trade-off. In this paper, critical constituent gates in math circuits are detected and graded based on the impact of an error in the output of a circuit. These critical gates should be hardened first under the area constraint of design criteria. Indeed, output bits considered crucial to a system receive higher priorities to be protected, reducing the occurrence of critical errors. The 74283 fast adder is used as an example to illustrate the feasibility and efficiency of the proposed approach.

  19. Robust Fault-Tolerant Control for Satellite Attitude Stabilization Based on Active Disturbance Rejection Approach with Artificial Bee Colony Algorithm

    This paper proposed a robust fault-tolerant control algorithm for satellite stabilization based on active disturbance rejection approach with artificial bee colony algorithm. The actuating mechanism of attitude control system consists of three working reaction flywheels and one spare reaction flywheel. The speed measurement of reaction flywheel is adopted for fault detection. If any reaction flywheel fault is detected, the corresponding fault flywheel is isolated and the spare reaction flywheel is activated to counteract the fault effect and ensure that the satellite is working safely and reliably. The active disturbance rejection approach is employed to design the controller, which handles input information with tracking differentiator, estimates system uncertainties with extended state observer, and generates control variables by state feedback and compensation. The designed active disturbance rejection controller is robust to both internal dynamics and external disturbances. The bandwidth parameter of extended state observer is optimized by the artificial bee colony algorithm so as to improve the performance of attitude control system. A series of simulation experiment results demonstrate the performance superiorities of the proposed robust fault-tolerant control algorithm.

  20. Fault tolerant multi-sensor fusion based on the information gain

    In the last decade, multi-robot systems have been used in several applications, for example the military, intervention in areas that present a danger to human life, the management of natural disasters, environmental monitoring, exploration and agriculture. The integrity of the robots' localization must be ensured so that they can achieve their mission in the best conditions. Robots are equipped with proprioceptive sensors (encoders, gyroscope) and exteroceptive sensors (Kinect). However, these sensors can be affected by various types of faults that can be assimilated to erroneous measurements, bias, outliers, drifts, and so on. In the absence of a sensor fault diagnosis step, the integrity and the continuity of the localization are affected. In this work, we present a multi-sensor fusion approach with Fault Detection and Exclusion (FDE) based on information theory. In this context, we are interested in the information gain given by an observation, which may be relevant when dealing with the fault tolerance aspect. Moreover, threshold optimization based on the quantity of information given by a decision on the true hypothesis is highlighted.

  1. Fault-tolerant Control of Inverter-fed Induction Motor Drives

    The main purpose of this work was to investigate how fault-tolerant control (FTC) could be included in the control scheme of frequency converter fed induction motor applications. This was approached by identifying the potential failure modes for which fault tolerant control should be applied...... a current sensor fault, by switching to a closed loop scalar controller, was analysed. The main contributions of this work are · An investigation of the potential failure modes of inverter fed induction motor drives. · An extension of the FTC development cycle, to include economical cost-benefit analysis....... A description of the different frequency converter components, including models of the inverter, sensors and controllers was given, followed by a fault mode and effect analysis, which points out the potential fault modes of the design. Among the listed fault modes, two were found to be of particular practical...

  2. Fault-tolerant quantum computation with asymmetric Bacon-Shor codes

    We develop a scheme for fault-tolerant quantum computation based on asymmetric Bacon-Shor codes, which works effectively against highly biased noise dominated by dephasing. We find the optimal Bacon-Shor block size as a function of the noise strength and the noise bias, and estimate the logical error rate and overhead cost achieved by this optimal code. Our fault-tolerant gadgets, based on gate teleportation, are well suited for hardware platforms with geometrically local gates in two dimensions.

  3. Fault-Tolerant Wormhole Routing with 2 Virtual Channels in Meshes

    In wormhole meshes, a reliable routing is supposed to be deadlock-free and fault-tolerant. Many routing algorithms are able to tolerate a large number of faults enclosed by rectangular blocks or special convex regions; none of them, however, is capable of handling two convex fault regions with distance two by using only two virtual networks. In this paper, a fault-tolerant wormhole routing algorithm is presented to tolerate disjoint convex faulty regions with a distance of at least two, which do not contain any nonfaulty nodes and do not prohibit any routing as long as nodes outside the faulty regions are connected in the mesh network. The processors' overlapping along the boundaries of different fault regions is allowed. The proposed algorithm, which routes the messages by the X-Y routing algorithm in the fault-free region, can tolerate convex fault-connected regions with only two virtual channels per physical channel, and is deadlock- and livelock-free. The proposed algorithm can be easily extended to adaptive routing.
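
    The X-Y routing used in the fault-free region is the standard dimension-order scheme: a message first travels along the X dimension until its column matches the destination, then along Y. The sketch below shows only that baseline on a 2-D mesh; the detours around faulty regions and the virtual channel assignment of the proposed algorithm are not modeled, and the function name and coordinate convention are assumptions.

```python
def xy_route(src, dst):
    """Return the list of (x, y) nodes visited from src to dst, X dimension first."""
    x, y = src
    path = [(x, y)]
    step_x = 1 if dst[0] > x else -1
    while x != dst[0]:  # correct the X coordinate first
        x += step_x
        path.append((x, y))
    step_y = 1 if dst[1] > y else -1
    while y != dst[1]:  # then correct the Y coordinate
        y += step_y
        path.append((x, y))
    return path


if __name__ == "__main__":
    print(xy_route((0, 0), (2, 1)))  # [(0, 0), (1, 0), (2, 0), (2, 1)]
```

    Because every message finishes its X hops before taking any Y hop, the path length equals the Manhattan distance between source and destination.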

  4. Implementation of fault tolerant control for modular multilevel converter using EtherCAT communication

    . This communication platform has to ensure a perfect synchronization between the modules, and it should be also fault tolerant. The analysis of a MMC based on EtherCAT is presented in this paper from implementation and module fault point of view. The experimental tests show that the MMC operates after communication...

  5. Sensor and Actuator Fault-Hiding Reconfigurable Control Design for a Four-Tank System Benchmark

    Fault detection and compensation plays a key role to fulfill high demands for performance and security in today's technological systems. In this paper, a fault-hiding (i.e., tolerant) control scheme that detects and compensates for actuator and sensor faults in a four-tank system benchmark...... Invariant (LTI) system where virtual sensors and virtual actuators are used to correct faulty performance through the use of a pre-fault performance. Simulation results showed that the developed approach can handle different types of faults and able to completely and instantly recover the original system...

  6. Active fault detection in MIMO systems

    The focus in this paper is on active fault detection (AFD) for MIMO systems with parametric faults. The problem of design of auxiliary inputs with respect to detection of parametric faults is investigated. An analysis of the design of auxiliary inputs is given based on analytic transfer functions...




    Traditionally, memory cells were the only circuitry susceptible to transient faults. The supporting circuitries around the memory were assumed to be fault-free. Memory cells have been protected from soft errors for more than a decade; due to the increase in soft error rate in logic circuits, the encoder and decoder circuitry around the memory blocks have become susceptible to soft errors as well and must also be protected. This paper presents a new approach to designing fault-secure encoder and decoder circuitry for memory designs. The key novel contribution of this paper is identifying and defining a new class of error-correcting codes whose redundancy makes the design of fault-secure detectors (FSD) particularly simple. We further quantify the importance of protecting encoder and decoder circuitry against transient errors, illustrating a scenario where the system failure rate (FIT) is dominated by the failure rate of the encoder and decoder. We prove that Euclidean Geometry Low-Density Parity-Check (EG-LDPC) codes have the fault-secure detector capability.

  8. Fault-tolerant Control of Spacecraft Dynamic System Based on Control Network

    罗小元; 尚美杰; 陈彩莲; 关新平


    The fault-tolerant control problem of spacecraft lateral dynamics with time-varying state delay based on networked control is studied. Combining the characteristics of the networked control system (NCS), a controller that stabilizes the system under actuator failure is proposed based on the robust integrity method, using Lyapunov stability theory together with the LMI technique. The spacecraft dynamic system is first modeled as a linearized plant with time-varying state delay. Then, considering the NCS-induced delay and data dropout, a piecewise constant controller is adopted to design the controller against actuator failure. The simulation results show that good system performance is maintained under the proposed fault-tolerant controller when an actuator failure occurs, which verifies the effectiveness of the proposed design method.

  9. Flight Tests of Autopilot Integrated with Fault-Tolerant Control of a Small Fixed-Wing UAV

    A fault-tolerant control scheme for the autopilot of a small fixed-wing UAV is designed and tested in actual flight experiments. The small fixed-wing UAV, called Xiang Fei, was developed independently by Nanjing University of Aeronautics and Astronautics. The flight control system is designed based on an open-source autopilot (Pixhawk). Real-time kinematic (RTK) GPS is introduced due to its high accuracy. Some modifications to the longitudinal and lateral guidance laws are made to improve the flight control performance. Moreover, a data-fusion-based fault-tolerant control scheme is integrated into altitude control and speed control to handle altitude sensor failure and airspeed sensor failure, which are common problems for small fixed-wing UAVs. Finally, real flight experiments are carried out to test the fault-tolerant-control-based autopilot of the UAV. Real flight test results are given and analyzed in detail; they show that the fixed-wing UAV can track the desired altitude and speed commands during the whole flight process, including takeoff, climbing, cruising, gliding, landing, and wave-off, using the fault-tolerant-control-based autopilot.

  10. Architecture Synthesis for Cost-Constrained Fault-Tolerant Flow-based Biochips

    Eskesen, Morten Chabert; Pop, Paul; Potluri, Seetal


    In this paper, we are interested in the synthesis of fault-tolerant architectures for flow-based microfluidic biochips, which use microvalves and channels to run biochemical applications. The growth rate of device integration in flow-based microfluidic biochips is scaling faster than Moore's law......) for the synthesis of fault-tolerant biochip architectures. Our approach optimizes the introduction of redundancy within a given unit cost budget, such that, the biochemical application can successfully complete its execution within its deadline, even in the presence of faults, and the yield is maximized...

    Wu, Zhao; Xiong, Naixue; Huang, Yannong; Xu, Degang; Hu, Chunyang


    The services composition technology provides flexible methods for building service composition applications (SCAs) in wireless sensor networks (WSNs). The high reliability and high performance of SCAs help services composition technology promote the practical application of WSNs. The optimization methods for reliability and performance used for traditional software systems are mostly based on the instantiations of software components, which are inapplicable and inefficient in the ever-changing SCAs in WSNs. In this paper, we consider the SCAs with fault tolerance in WSNs. Based on a Universal Generating Function (UGF) we propose a reliability and performance model of SCAs in WSNs, which generalizes a redundancy optimization problem to a multi-state system. Based on this model, an efficient optimization algorithm for reliability and performance of SCAs in WSNs is developed based on a Genetic Algorithm (GA) to find the optimal structure of SCAs with fault-tolerance in WSNs. In order to examine the feasibility of our algorithm, we have evaluated the performance. Furthermore, the interrelationships between the reliability, performance and cost are investigated. In addition, a distinct approach to determine the most suitable parameters in the suggested algorithm is proposed.

  12. A multiobjective scatter search algorithm for fault-tolerant NoC mapping optimisation

    Le, Qianqi; Yang, Guowu; Hung, William N. N.; Zhang, Xinpeng; Fan, Fuyou


  13. Performance analysis of a dependable scheduling strategy based on a fault-tolerant grid model

    WANG Yuanzhuo; LIN Chuang; YANG Yang; SHAN Zhiguang


    The grid provides an integrated computer platform composed of differentiated and distributed systems. These resources are dynamic and heterogeneous. In this paper, a novel fault-tolerant grid-scheduling model is presented based on Stochastic Petri Nets (SPN) to assure the heterogeneity and dynamism of the grid system. Also, a new grid-scheduling strategy, the dependable strategy for the shortest expected accomplishing time (DSEAT), is put forward, in which the dependability factor is introduced in the task-dispatching strategy. In the end, the performance of the scheduling strategy based on the fault-tolerant grid-scheduling model is analyzed by a software package named SPNP. The numerical results show that dynamic resources will increase the response time for all classes of tasks in differing degrees. Compared with the shortest expected accomplishing time (SEAT) strategy, the DSEAT strategy can reduce the negative effects of dynamic and autonomic resources to some extent so as to guarantee a high quality of service (QoS).

    Sania Bhatti


    Over the last few years, the deployment of WSNs (Wireless Sensor Networks) has been fostered in diverse applications. WSNs have great potential for a variety of domains ranging from scientific experiments to commercial applications. Because WSNs are deployed in dynamic and unpredictable environments, they must be able to cope with a variety of faults. This paper proposes an energy-aware fault-tolerant clustering protocol for target tracking applications, termed the FTTT (Fault Tolerant Target Tracking) protocol. The identification of RNs (Redundant Nodes) makes SN (Sensor Node) fault tolerance plausible, and the clustering supports recovery of sensors supervised by a faulty CH (Cluster Head). The FTTT protocol reduces energy consumption in two steps: first, by identifying RNs in the network; second, by restricting the number of SNs sending data to the CH. Simulations validate the scalability and low power consumption of the FTTT protocol in comparison with the LEACH protocol.

    Rakshith Saligram


    Reversible Logic is gaining significant consideration as the potential logic design style for implementation in modern nanotechnology and quantum computing with minimal impact on physical entropy. Fault Tolerant reversible logic is one class of reversible logic that maintains the parity of the inputs and the outputs. Significant contributions have been made in the literature towards the design of fault tolerant reversible logic gate structures and arithmetic units; however, there are not many efforts directed towards the design of fault tolerant reversible ALUs. The Arithmetic Logic Unit (ALU) is the prime performing unit in any computing device and it has to be made fault tolerant. In this paper we aim to design one such fault tolerant reversible ALU that is constructed using parity preserving reversible logic gates. The designed ALU can generate up to seven arithmetic operations and four logical operations.
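
    The two properties named in this abstract, reversibility and parity preservation, can be checked exhaustively for a small gate. The sketch below uses the Fredkin (controlled swap) gate as an example of a parity-preserving reversible gate and the Toffoli gate as a reversible gate that does not preserve parity. The abstract does not say which gates the authors use, so the choice of gates and the helper names are assumptions for illustration.

```python
from itertools import product


def fredkin(a, b, c):
    """Controlled swap: when a is 1, the outputs b and c are exchanged."""
    return (a, c, b) if a else (a, b, c)


def toffoli(a, b, c):
    """Controlled-controlled NOT: c is flipped when both a and b are 1."""
    return (a, b, c ^ (a & b))


def is_reversible(gate, n):
    """A gate on n bits is reversible if it maps the 2**n inputs to 2**n distinct outputs."""
    outputs = {gate(*bits) for bits in product((0, 1), repeat=n)}
    return len(outputs) == 2 ** n


def is_parity_preserving(gate, n):
    """Check that the XOR of the inputs equals the XOR of the outputs for every input."""
    return all(sum(bits) % 2 == sum(gate(*bits)) % 2
               for bits in product((0, 1), repeat=n))


if __name__ == "__main__":
    print(is_reversible(fredkin, 3), is_parity_preserving(fredkin, 3))  # True True
    print(is_reversible(toffoli, 3), is_parity_preserving(toffoli, 3))  # True False
```

    With a parity-preserving gate, a single-bit fault changes the output parity relative to the input parity, which is what makes a parity comparison usable as a fault check.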

  16. Adaptive Vibration Control System for MR Damper Faults

    Several methods have been proposed to estimate the force of a semiactive damper, particularly of a magnetorheological damper because of its importance in automotive and civil engineering. Usually, all models have been proposed assuming experimental data in nominal operating conditions and some of them are estimated for control purposes. Because dampers are prone to fail, fault estimation is useful to design adaptive vibration controllers to accommodate the malfunction in the suspension system. This paper deals with the diagnosis and estimation of faults in an automotive magnetorheological damper. A robust LPV observer is proposed to estimate the lack of force caused by a damper leakage in a vehicle corner. Once the faulty damper is isolated in the vehicle and the fault is estimated, an Adaptive Vibration Control System is proposed to reduce the fault effect using compensation forces from the remaining healthy dampers. To fulfill the semiactive damper constraints in the fault adaptation, an LPV controller is designed for vehicle comfort and road holding. Simulation results show that the fault observer has good performance with robustness to noise and road disturbances and the proposed AVCS improves the comfort up to 24% with respect to a controlled suspension without fault tolerance features.

  17. A fault-tolerant addressable spin qubit in a natural silicon quantum dot.

    Takeda, Kenta; Kamioka, Jun; Otsuka, Tomohiro; Yoneda, Jun; Nakajima, Takashi; Delbecq, Matthieu R; Amaha, Shinichi; Allison, Giles; Kodera, Tetsuo; Oda, Shunri; Tarucha, Seigo


    Fault-tolerant quantum computing requires high-fidelity qubits. This has been achieved in various solid-state systems, including isotopically purified silicon, but is yet to be accomplished in industry-standard natural (unpurified) silicon, mainly as a result of the dephasing caused by residual nuclear spins. This high fidelity can be achieved by speeding up the qubit operation and/or prolonging the dephasing time, that is, increasing the Rabi oscillation quality factor Q (the Rabi oscillation decay time divided by the π rotation time). In isotopically purified silicon quantum dots, only the second approach has been used, leaving the qubit operation slow. We apply the first approach to demonstrate an addressable fault-tolerant qubit using a natural silicon double quantum dot with a micromagnet that is optimally designed for fast spin control. This optimized design allows access to Rabi frequencies up to 35 MHz, which is two orders of magnitude greater than that achieved in previous studies. We find the optimum Q = 140 in such high-frequency range at a Rabi frequency of 10 MHz. This leads to a qubit fidelity of 99.6% measured via randomized benchmarking, which is the highest reported for natural silicon qubits and comparable to that obtained in isotopically purified silicon quantum dot-based qubits. This result can inspire contributions to quantum computing from industrial communities.
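
    The quality factor defined in this abstract (Rabi oscillation decay time divided by the π rotation time) can be turned into a decay time with one line of arithmetic. The sketch below assumes that the quoted Rabi frequency counts full 2π rotations per second, so a π rotation takes 1/(2f); that convention and the function names are assumptions, and the resulting number is derived here rather than quoted from the abstract.

```python
def pi_rotation_time(rabi_frequency_hz):
    """Time for a pi rotation, assuming the Rabi frequency counts full 2*pi cycles per second."""
    return 1.0 / (2.0 * rabi_frequency_hz)


def rabi_decay_time(q, rabi_frequency_hz):
    """Invert Q = decay_time / pi_rotation_time to get the decay time in seconds."""
    return q * pi_rotation_time(rabi_frequency_hz)


if __name__ == "__main__":
    t_pi = pi_rotation_time(10e6)         # about 50 ns at 10 MHz
    t_decay = rabi_decay_time(140, 10e6)  # about 7 microseconds for Q = 140
    print(t_pi, t_decay)
```

    Under this convention, Q = 140 at a 10 MHz Rabi frequency corresponds to a Rabi decay time of roughly 7 microseconds.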

    Parmeet Kaur Jaggi


    Mobile ad hoc networks (MANETs) have significantly enhanced wireless networks by eliminating the need for any fixed infrastructure. Hence, these are increasingly being used for expanding the computing capacity of existing networks or for implementation of autonomous mobile computing Grids. However, the fragile nature of MANETs makes the constituent nodes susceptible to failures and the computing potential of these networks can be utilized only if they are fault tolerant. The technique of checkpointing based rollback recovery has been used effectively for fault tolerance in static and cellular mobile systems; yet, the implementation of existing protocols for MANETs is not straightforward. The paper presents a novel rollback recovery protocol for handling the failures of mobile nodes in a MANET using checkpointing and sender based message logging. The proposed protocol utilizes the routing protocol existing in the network for implementing a low overhead recovery mechanism. The presented recovery procedure at a node is completely domino-free and asynchronous. The protocol is resilient to the dynamic characteristics of the MANET, allowing a distributed application to be executed independently without access to any wired Grid or cellular network access points. We also present an algorithm to record a consistent global snapshot of the MANET.

  19. A Multiconstrained Grid Scheduling Algorithm with Load Balancing and Fault Tolerance.

    Grid environment consists of millions of dynamic and heterogeneous resources. A grid environment which deals with computing resources is computational grid and is meant for applications that involve larger computations. A scheduling algorithm is said to be efficient if and only if it performs better resource allocation even in case of resource failure. Allocation of resources is a tedious issue since it has to consider several requirements such as system load, processing cost and time, user's deadline, and resource failure. This work attempts to design a resource allocation algorithm which is budget constrained and also targets load balancing, fault tolerance, and user satisfaction by considering the above requirements. The proposed Multiconstrained Load Balancing Fault Tolerant algorithm (MLFT) reduces the schedule makespan, schedule cost, and task failure rate and improves resource utilization. The proposed MLFT algorithm is evaluated using Gridsim toolkit and the results are compared with the recent algorithms which separately concentrate on all these factors. The comparison results ensure that the proposed algorithm works better than its counterparts.

  20. A Multiconstrained Grid Scheduling Algorithm with Load Balancing and Fault Tolerance

    Grid environment consists of millions of dynamic and heterogeneous resources. A grid environment which deals with computing resources is computational grid and is meant for applications that involve larger computations. A scheduling algorithm is said to be efficient if and only if it performs better resource allocation even in case of resource failure. Allocation of resources is a tedious issue since it has to consider several requirements such as system load, processing cost and time, user’s deadline, and resource failure. This work attempts to design a resource allocation algorithm which is budget constrained and also targets load balancing, fault tolerance, and user satisfaction by considering the above requirements. The proposed Multiconstrained Load Balancing Fault Tolerant algorithm (MLFT) reduces the schedule makespan, schedule cost, and task failure rate and improves resource utilization. The proposed MLFT algorithm is evaluated using Gridsim toolkit and the results are compared with the recent algorithms which separately concentrate on all these factors. The comparison results ensure that the proposed algorithm works better than its counterparts.

  1. A fault-tolerant addressable spin qubit in a natural silicon quantum dot

    Fault-tolerant quantum computing requires high-fidelity qubits. This has been achieved in various solid-state systems, including isotopically purified silicon, but is yet to be accomplished in industry-standard natural (unpurified) silicon, mainly as a result of the dephasing caused by residual nuclear spins. This high fidelity can be achieved by speeding up the qubit operation and/or prolonging the dephasing time, that is, increasing the Rabi oscillation quality factor Q (the Rabi oscillation decay time divided by the π rotation time). In isotopically purified silicon quantum dots, only the second approach has been used, leaving the qubit operation slow. We apply the first approach to demonstrate an addressable fault-tolerant qubit using a natural silicon double quantum dot with a micromagnet that is optimally designed for fast spin control. This optimized design allows access to Rabi frequencies up to 35 MHz, which is two orders of magnitude greater than that achieved in previous studies. We find the optimum Q = 140 in such high-frequency range at a Rabi frequency of 10 MHz. This leads to a qubit fidelity of 99.6% measured via randomized benchmarking, which is the highest reported for natural silicon qubits and comparable to that obtained in isotopically purified silicon quantum dot–based qubits. This result can inspire contributions to quantum computing from industrial communities. PMID:27536725

  2. Fuzzy fault diagnosis system of MCFC

    A fault diagnosis system for a molten carbonate fuel cell (MCFC) stack is proposed in this paper. It is composed of a fuzzy neural network (FNN) and a fault diagnosis element. The FNN can efficiently handle both expert knowledge and experimental data, and it can approximate any smooth system. The FNN is used to identify the fault diagnosis model of the MCFC stack. The fuzzy fault decision element can diagnose the state of the MCFC generating system, normal or faulty, and can decide the type of the fault based on the outputs of the FNN model and the MCFC system. Some simulation results are presented in this paper.

  3. Arc burst pattern analysis fault detection system

    Russell, B. Don (Inventor); Aucoin, B. Michael (Inventor); Benner, Carl L. (Inventor)


  4. Fault-Tolerant and Elastic Streaming MapReduce with Decentralized Coordination

    The MapReduce programming model, due to its simplicity and scalability, has become an essential tool for processing large data volumes in distributed environments. Recent Stream Processing Systems (SPS) extend this model to provide low-latency analysis of high-velocity continuous data streams. However, integrating MapReduce with streaming poses challenges: first, runtime variations in data characteristics such as data rates and key distribution cause resource overload, which in turn leads to fluctuations in the Quality of Service (QoS); and second, stateful reducers, whose state depends on the complete tuple history, necessitate efficient fault-recovery mechanisms to maintain the desired QoS in the presence of resource failures. We propose an integrated streaming MapReduce architecture leveraging the concept of consistent hashing to support runtime elasticity along with locality-aware data and state replication to provide efficient load-balancing with low-overhead fault-tolerance and parallel fault-recovery from multiple simultaneous failures. Our evaluation on a private cloud shows up to 2.8× improvement in peak throughput compared to Apache Storm SPS, and a low recovery latency of 700-1500 ms from multiple failures.
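
    The role consistent hashing plays in this kind of design can be illustrated with a small sketch. The code below is not the paper's implementation: the class `HashRing`, the reducer names, and the choice of 64 virtual nodes are all hypothetical. It shows only the general idea that keys are mapped to points on a ring, so that removing one reducer moves only that reducer's keys, and that a replica list read clockwise from the key gives a natural failover target.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Map a string to a ring position (stable across runs, unlike built-in hash())."""
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Toy consistent-hash ring with virtual nodes (all names are hypothetical)."""

    def __init__(self, vnodes: int = 64):
        self.vnodes = vnodes
        self._points = []   # sorted ring positions
        self._owner = {}    # ring position -> reducer name

    def add(self, node: str) -> None:
        # Each reducer owns several points so load spreads more evenly.
        for i in range(self.vnodes):
            p = _hash(f"{node}#{i}")
            self._owner[p] = node
            bisect.insort(self._points, p)

    def remove(self, node: str) -> None:
        self._points = [p for p in self._points if self._owner[p] != node]
        self._owner = {p: n for p, n in self._owner.items() if n != node}

    def lookup(self, key: str, replicas: int = 1) -> list:
        """Return up to `replicas` distinct nodes, walking clockwise from the key."""
        found = []
        start = bisect.bisect(self._points, _hash(key))
        for i in range(len(self._points)):
            node = self._owner[self._points[(start + i) % len(self._points)]]
            if node not in found:
                found.append(node)
                if len(found) == replicas:
                    break
        return found


ring = HashRing()
for name in ("reducer-a", "reducer-b", "reducer-c"):
    ring.add(name)

keys = [f"key-{i}" for i in range(1000)]
before = {k: ring.lookup(k, replicas=2) for k in keys}   # [primary, replica]

ring.remove("reducer-b")                                  # simulated failure
after = {k: ring.lookup(k)[0] for k in keys}
```

    In this sketch, keys whose primary was not the failed reducer keep their owner, and keys that lived on the failed reducer land on the node that previously held their replica, which is why state replicated to the next node on the ring can be recovered locally.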

  5. Combining Artificial Intelligence and Advanced Techniques in Fault-Tolerant Control

    We present the integration of artificial intelligence, robust, nonlinear and model reference adaptive control (MRAC) methods for fault-tolerant control (FTC). We combine MRAC schemes with classical PID controllers, artificial neural networks (ANNs), genetic algorithms (GAs), H∞ controls and sliding mode controls. Six different schemes are proposed: the first one is an MRAC with an artificial neural network and a PID controller whose parameters were tuned by a GA using Pattern Search Optimization. The second scheme is an MRAC controller with an H∞ control (H∞). The third scheme is an MRAC controller with a sliding mode controller (SMC). The fourth scheme is an MRAC controller with an ANN. The fifth scheme is an MRAC controller with a PID controller optimized by a GA. Finally, the last scheme is a classical MRAC control system. The objective of this research is to generate more powerful FTC methods and compare the performance of the above schemes under different fault conditions in sensors and actuators. An industrial heat exchanger process was the test bed for these approaches. Simulation results showed that the use of Pattern Search Optimization and ANNs improved the performance of the FTC scheme because it makes the control system more robust against sensor and actuator faults.

  6. Snapple: A distributed, fault-tolerant, in-memory key-value store using Conflict-Free Replicated Data Types

    As services grow and receive more traffic, data resilience through replication becomes increasingly important. Modern large-scale Internet services such as Facebook, Google and Twitter serve millions of users concurrently. Replication is a vital component of distributed systems. Eventual consistency and Conflict-Free Replicated Data Types (CRDTs) are suggested as an alternative to strong consistency systems. This thesis implements and evaluates Snapple, a distributed, fault-tolerant, in-memor...

  7. Research of Fault-tolerant Control Technology of Multiphase Motor Speed-regulation System

    Taking a five-phase induction motor as the object of study, this section examines system operation when one or more stator winding phases are open-circuited. Based on the condition for disturbance-free operation of the motor, the magnetomotive force and currents before and after the fault are analyzed. A decoupled model of the motor under open-phase conditions is derived, and on this basis the fault-tolerant control of the motor during faulted operation is analyzed.

  8. Automatic Fault-Tolerance Support in Resource Management System Based on Job Checkpoint/Restart

    An automatic fault-tolerance method based on job checkpoint/restart in resource management systems is proposed. The key technologies are presented, including the separation of job checkpoint and task checkpoint, management of checkpoint image files, and automatic job restart. Automatic job checkpoint/restart with BLCR is implemented in SLURM and the challenges are discussed. Analysis and experiments show that checkpoint and restart work correctly, and that the time to complete large-scale jobs is reduced effectively.
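
    The checkpoint/restart idea described in this abstract can be sketched at application level. The example below does not use BLCR or SLURM and does not reflect their interfaces: the functions `save_checkpoint`, `load_checkpoint`, `run_job` and `run_with_restart`, the JSON image format, and the simulated failure are all assumptions made for illustration. It shows a job that periodically saves its state and, after a failure, is restarted automatically from the most recent image instead of from the beginning.

```python
import json
import os
import tempfile


def save_checkpoint(path: str, state: dict) -> None:
    """Write the image atomically so a crash never leaves a half-written file."""
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump(state, f)
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp, path)


def load_checkpoint(path: str) -> dict:
    """Return the last saved state, or a fresh state if no image exists yet."""
    if os.path.exists(path):
        with open(path) as f:
            return json.load(f)
    return {"next_step": 0, "total": 0}


def run_job(path: str, steps: int, every: int, fail_at=None) -> int:
    """Sum 0..steps-1, checkpointing every `every` steps; optionally crash once."""
    state = load_checkpoint(path)
    for step in range(state["next_step"], steps):
        if fail_at is not None and step == fail_at:
            raise RuntimeError(f"simulated node failure at step {step}")
        state["total"] += step
        state["next_step"] = step + 1
        if state["next_step"] % every == 0:
            save_checkpoint(path, state)
    return state["total"]


def run_with_restart(path: str, steps: int, every: int, fail_at: int) -> int:
    """Automatic restart: rerun the job from its latest checkpoint after a failure."""
    try:
        return run_job(path, steps, every, fail_at=fail_at)
    except RuntimeError:
        return run_job(path, steps, every)


with tempfile.TemporaryDirectory() as d:
    ckpt = os.path.join(d, "job.ckpt")
    result = run_with_restart(ckpt, steps=100, every=10, fail_at=57)
    last_checkpoint_step = load_checkpoint(ckpt)["next_step"]
```

    Here the failure at step 57 costs only the work done since the checkpoint at step 50. The gap between checkpoints is the trade-off a real system has to tune: shorter intervals lose less work on failure but spend more time writing images.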

  9. Fault Detection, Isolation, and Accommodation for LTI Systems Based on GIMC Structure

    D. U. Campos-Delgado


  10. Reconfigurable fault-tolerant controller synthesis for a steer-by-wire vehicle using independently driven wheels

    Wada, Nobutaka; Fujii, Kosuke; Saeki, Masami


    In this paper, a synthesis method for a reconfigurable fault-tolerant control system for use in a steer-by-wire vehicle is proposed. The vehicle considered in this paper is also assumed to have independently driven wheels. The control objective in this work is to enable the vehicle yaw rate to track the reference signal even when the steering actuator breaks down. Since the vehicle yaw rate can be controlled with either the front wheel turn angle or the yaw moment generated by the independently driven wheels, this system has actuator redundancy. We attempt to design a control system that manages this actuator redundancy so that the performance degradation due to the actuator failure is minimised. We utilise a control allocator based on on-line optimisation for managing the actuator redundancy. The fault-tolerant control system with a control allocator has several excellent properties. For example, the method can handle various failure situations. Also, since the control allocation problem is reduced to a convex quadratic programming problem, the on-line computational effort is relatively little. However, so far, it has been unclear whether the stability of the control system with the control allocator is guaranteed when the actuator failure occurs. Therefore, we propose a design method of a fault-tolerant controller based on on-line optimisation that guarantees the stability of the overall system. The effectiveness of the method is established through numerical examples.
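
    The actuator-redundancy argument in this abstract can be made concrete with a deliberately simplified allocation problem. The sketch below is not the authors' controller: it drops actuator limits, so the quadratic programme reduces to an equality-constrained least-squares problem with a closed-form solution, and it says nothing about closed-loop stability. The function `allocate`, the effectiveness values and the weights are hypothetical.

```python
def allocate(v, effectiveness, weights):
    """Minimise sum(w_i * u_i**2) subject to sum(b_i * u_i) == v.

    Setting the gradient of the Lagrangian to zero gives
    u_i = (b_i / w_i) * v / sum_j(b_j**2 / w_j).
    """
    gain = sum(b * b / w for b, w in zip(effectiveness, weights))
    if gain == 0.0:
        raise ValueError("no healthy actuator can produce the demanded effort")
    return [(b / w) * v / gain for b, w in zip(effectiveness, weights)]


# Hypothetical numbers: actuator 0 = front steering, actuator 1 = yaw moment
# from the independently driven wheels.
b_nominal = [2.0, 1.0]   # yaw effort produced per unit of each actuator command
w = [1.0, 4.0]           # penalty on using each actuator
demand = 3.0             # yaw effort requested by the outer tracking controller

u_nominal = allocate(demand, b_nominal, w)

# Steering actuator failure: its effectiveness drops to zero, so the allocator
# shifts the whole demand onto the independently driven wheels.
b_failed = [0.0, 1.0]
u_failed = allocate(demand, b_failed, w)
```

    In the nominal case the cheaper steering actuator carries most of the demand; after the simulated failure the same routine returns a zero steering command and assigns the full demand to the wheel-torque channel, without any change to the outer controller.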

  11. Soft-Fault Detection Technologies Developed for Electrical Power Systems

    The NASA Glenn Research Center, partner universities, and defense contractors are working to develop intelligent power management and distribution (PMAD) technologies for future spacecraft and launch vehicles. The goals are to provide higher performance (efficiency, transient response, and stability), higher fault tolerance, and higher reliability through the application of digital control and communication technologies. It is also expected that these technologies will eventually reduce the design, development, manufacturing, and integration costs for large, electrical power systems for space vehicles. The main focus of this research has been to incorporate digital control, communications, and intelligent algorithms into power electronic devices such as direct-current to direct-current (dc-dc) converters and protective switchgear. These technologies, in turn, will enable revolutionary changes in the way electrical power systems are designed, developed, configured, and integrated in aerospace vehicles and satellites. Initial successes in integrating modern, digital controllers have proven that transient response performance can be improved using advanced nonlinear control algorithms. One technology being developed includes the detection of "soft faults," those not typically covered by current systems in use today. Soft faults include arcing faults, corona discharge faults, and undetected leakage currents. Using digital control and advanced signal analysis algorithms, we have shown that it is possible to reliably detect arcing faults in high-voltage dc power distribution systems (see the preceding photograph). Another research effort has shown that low-level leakage faults and cable degradation can be detected by analyzing power system parameters over time. This additional fault detection capability will result in higher reliability for long-lived power systems such as reusable launch vehicles and space exploration missions.

  12. Error-detection-based quantum fault tolerance against discrete Pauli noise

    A quantum computer -- i.e., a computer capable of manipulating data in quantum superposition -- would find applications including factoring, quantum simulation and tests of basic quantum theory. Since quantum superpositions are fragile, the major hurdle in building such a computer is overcoming noise. Developed over the last couple of years, new schemes for achieving fault tolerance based on error detection, rather than error correction, appear to tolerate as much as 3-6% noise per gate -- an order of magnitude better than previous procedures. But proof techniques could not show that these promising fault-tolerance schemes tolerated any noise at all. With an analysis based on decomposing complicated probability distributions into mixtures of simpler ones, we rigorously prove the existence of constant tolerable noise rates ("noise thresholds") for error-detection-based schemes. Numerical calculations indicate that the actual noise threshold this method yields is lower-bounded by 0.1% noise per gate.

  13. Fault system polarity: A matter of chance?

    Many normal fault systems and, on a smaller scale, fracture boudinage exhibit asymmetry so that one fault dip direction dominates. The fraction of throw (or heave) accommodated by faults with the same dip direction in relation to the total fault system throw (or heave) is a quantitative measure of fault system asymmetry and termed 'polarity'. It is a common belief that the formation of domino and shear band boudinage with a monoclinic symmetry requires a component of layer parallel shearing, whereas torn boudins reflect coaxial flow. Moreover, domains of parallel faults are frequently used to infer the presence of a common décollement. Here we show, using Distinct Element Method (DEM) models in which rock is represented by an assemblage of bonded circular particles, that asymmetric fault systems can emerge under symmetric boundary conditions. The pre-requisite for the development of domains of parallel faults is however that the medium surrounding the brittle layer has a very low strength. We demonstrate that, if the 'competence' contrast between the brittle layer and the surrounding material ('jacket', or 'matrix') is high, the fault dip directions and hence fault system polarity can be explained using a random process. The results imply that domains of parallel faults are, for the conditions and properties used in our models, in fact a matter of chance. Our models suggest that domino and shear band boudinage can be an unreliable shear-sense indicator. Moreover, the presence of a décollement should not be inferred on the basis of a domain of parallel faults only.
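
    The polarity measure defined in this abstract, and the claim that parallel-fault domains can arise by chance, can be explored with a short sketch. This is not the DEM model used in the study: it assumes equal throws, independent coin-flip dip directions, a signed-throw convention, and that polarity is reported for the dominant dip direction. The function names are hypothetical.

```python
import random


def polarity(throws):
    """Fraction of total throw carried by the dominant dip direction.

    `throws` holds signed throws: positive for faults dipping one way,
    negative for faults dipping the other way (a convention assumed here).
    """
    one_way = sum(t for t in throws if t > 0)
    other_way = -sum(t for t in throws if t < 0)
    total = one_way + other_way
    if total == 0:
        raise ValueError("fault system has no throw")
    return max(one_way, other_way) / total


def random_polarity(n_faults, rng):
    """Polarity of n equal-throw faults whose dip directions are coin flips."""
    return polarity([rng.choice((-1.0, 1.0)) for _ in range(n_faults)])


rng = random.Random(0)
samples = [random_polarity(5, rng) for _ in range(10_000)]
fully_parallel = sum(1 for p in samples if p == 1.0) / len(samples)
```

    Under these assumptions, five faults all dip the same way with probability 2 × (1/2)^5 = 1/16, so `fully_parallel` should come out near 0.06. A fully polarised small domain is therefore not rare under a purely random process.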

  14. A fault tolerant model for multi-sensor measurement

    Multi-sensor systems are very powerful in complex environments. Cointegration theory and the vector error correction model, statistical methods widely applied in economic analysis, are used to create a fitting model for homogeneous sensor measurements. An algorithm is applied to implement the model for error correction, in which the signal of any sensor can be estimated from those of the others. The model divides a signal series into two parts, the training part and the estimated part. By comparing the estimated part with the actual one, the proposed method can identify a sensor with possible faults and repair its signal. With a small amount of training data, the algorithm can find the right model parameters in real time. When applied to data analysis for aero engine testing, the model works well. Therefore, it is not only an effective method to detect sensor failure or abnormality, but also a useful approach to correct possible errors.
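
    The train, estimate, compare and repair loop described in this abstract can be illustrated with a much simpler stand-in model. The sketch below uses ordinary least squares between two sensors rather than cointegration or a vector error correction model, so it shows only the workflow, not the paper's method. The functions `fit_line` and `detect_and_repair`, the threshold, and the data are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least-squares fit y ≈ slope * x + intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def detect_and_repair(reference, suspect, n_train, threshold):
    """Fit on the training part, then flag and replace outliers in the rest."""
    slope, intercept = fit_line(reference[:n_train], suspect[:n_train])
    repaired, faulty = list(suspect), []
    for i in range(n_train, len(suspect)):
        estimate = slope * reference[i] + intercept
        if abs(suspect[i] - estimate) > threshold:
            faulty.append(i)          # sample disagrees with the other sensor
            repaired[i] = estimate    # repair it with the model's estimate
    return faulty, repaired


reference = [float(i) for i in range(10)]
suspect = [2.0 * r + 1.0 for r in reference]
suspect[7] = 40.0                     # injected sensor glitch

faulty, repaired = detect_and_repair(reference, suspect, n_train=5, threshold=0.5)
```

    The first five samples serve as the training part; the remaining samples are estimated from the reference sensor, and the injected glitch at index 7 is flagged and replaced by the estimate.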


    Workflow brokers of existing grid scheduling systems lack a cooperation mechanism, which causes inefficient scheduling of applications across distributed resources and worsens the utilization of resources including network bandwidth and computational cycles. Furthermore, according to the literature, these existing brokering systems have primarily evolved around centralized hierarchical or client/server models. In such models, vital responsibilities such as resource discovery are delegated to centralized server machines, so they suffer from well-known disadvantages: a single point of failure, limited scalability, and network congestion at the links leading to the server. To overcome these issues, we implement a new approach for decentralized cooperative workflow scheduling in a dynamically distributed resource-sharing grid environment. The various actors in the system, namely users who belong to multiple control domains, workflow brokers, and resources, work together to form a single cooperative resource-sharing environment. However, this approach ignores the fact that each grid site may have its own fault-tolerance strategy, because each site is itself an autonomous domain. For instance, if a grid site uses job checkpointing, each computational node must be able to periodically transmit the transient state of the job execution to the server. When a job fails, it migrates to another computational node and resumes from the last stored checkpoint. Glowworm Swarm Optimization (GSO) for job scheduling is used to address the heterogeneity of fault tolerance in the computational grid, but the comparison analysis shows that Weighted GSO, which overcomes the position-update imperfections of general GSO, does so more efficiently. This system supports four kinds of fault-tolerance mechanisms, including the job migration, job retry, check-pointing and

  16. A Secure and Fault-tolerant framework for Mobile IPv6 based networks

    Mobile IPv6 will be an integral part of the next-generation Internet protocol, and the importance of mobility in the Internet keeps increasing. The current specification of Mobile IPv6 does not provide proper support for reliability in the mobile network, and there are other problems associated with it. In this paper, we propose the “Virtual Private Network (VPN) based Home Agent Reliability Protocol (VHAHA)” as a complete system architecture and extension to Mobile IPv6 that supports reliability and offers solutions to the security problems found in the Mobile IP registration part. The key features of this protocol over other protocols are: better survivability, transparent failure detection and recovery, reduced system complexity and workload, secure data transfer, and improved overall performance. Keywords: Mobility Agents; VPN; VHAHA; Fault-tolerance; Reliability; Self-certified keys; Confidentiality; Authentication; Attack prevention

  17. Fault-Tolerant Technique in the Cluster Computation of the Digital Watershed Model

    This paper describes a parallel computing platform for the digital watershed model that uses existing facilities. A distributed multi-layered structure is applied to the computer cluster system, and MPI-2 is adopted as a mature parallel programming standard. An agent is introduced that makes multi-level fault tolerance possible in software development. The communication protocol, based on a checkpointing and rollback recovery mechanism, can realize transaction reprocessing. Compared with the conventional platform, the new system makes better use of the computing resources. Experimental results show that the speedup ratio of the platform is almost 4 times that of the conventional one, which demonstrates the high efficiency and good performance of the new approach.

  18. Design of Fault-Tolerant and Dynamically-Reconfigurable Microfluidic Biochips

    Microfluidics-based biochips are soon expected to revolutionize clinical diagnosis, DNA sequencing, and other laboratory procedures involving molecular biology. Most microfluidic biochips are based on the principle of continuous fluid flow and they rely on permanently-etched microchannels, micropumps, and microvalves. We focus here on the automated design of "digital" droplet-based microfluidic biochips. In contrast to continuous-flow systems, digital microfluidics offers dynamic reconfigurability; groups of cells in a microfluidics array can be reconfigured to change their functionality during the concurrent execution of a set of bioassays. We present a simulated annealing-based technique for module placement in such biochips. The placement procedure not only addresses chip area, but it also considers fault tolerance, which allows a microfluidic module to be relocated elsewhere in the system when a single cell is detected to be faulty. Simulation results are presented for a case study involving the polymeras...

  19. Byzantine Fault Tolerance In The Distributed Environment Using Markov Chain Technique

    This paper aims to tolerate Byzantine faults by providing predefined constraints for the nodes in a distributed environment. The nodes in the distributed environment automatically generate their constraints using a Markov chain. The predefined constraints of the distributed environment and of its member nodes can be updated periodically. If, after such an update, a member node's predefined constraints do not match those of the distributed system, the membership service uses a breadth-first search technique to discard that node's service from the distributed environment. When a new node with its own constraints wants to communicate with the distributed environment, its constraints are compared with the distributed system's constraints using a probability-of-random-matching technique.

  20. EEBFTC: Extended Energy Balanced with Fault Tolerance Capability Protocol for WSN

    This paper proposes a new framework for wireless sensor networks (WSN) by combining two routing protocol algorithms. The proposed framework combines the energy balanced clustering (EBC) protocol in WSN with fault tolerance capabilities. The organizer is automatically selected by the base station (BS), and it then selects the cluster head (CH). The mechanism for selecting the organizer node and the CH is based on power, efficacy, and energy-balanced load. In addition, the organizer is responsible for selecting a new CH in case of failure, and vice versa. The energy balanced clustering and fault tolerance operations thus prolong node lifetime, making the network more efficient in data transmission and more reliable. The resulting framework is named the Extended Energy Balanced with Fault Tolerance Capability (EEBFTC) protocol.