WorldWideScience

Sample records for fault tolerant systems

  1. Fault tolerant computing systems

    International Nuclear Information System (INIS)

    Fault tolerance involves the provision of strategies for error detection, damage assessment, fault treatment and error recovery. A survey is given of the different sorts of strategies used in highly reliable computing systems, together with an outline of recent research on the problems of providing fault tolerance in parallel and distributed computing systems. (orig.)

  2. Fault-tolerant computing systems

    International Nuclear Information System (INIS)

    Tests, Diagnosis and Fault Treatment were chosen as the guiding themes of the conference. However, the scope of the conference included reliability, availability, safety and security issues in software and hardware systems as well. The following sessions were organized for the conference, which was completed by an industrial presentation: Keynote Address, Reconfiguration and Recovery, System Level Diagnosis, Voting and Agreement, Testing, Fault-Tolerant Circuits, Array Testing, Modelling, Applied Fault Tolerance, Fault-Tolerant Arrays and Systems, Interconnection Networks, Fault-Tolerant Software. One paper has been indexed separately in the database. (orig./HP)

  3. Fault tolerant control for switched linear systems

    CERN Document Server

    Du, Dongsheng; Shi, Peng

    2015-01-01

    This book presents up-to-date research and novel methodologies on fault diagnosis and fault tolerant control for switched linear systems. It provides a unified yet neat framework of filtering, fault detection, fault diagnosis and fault tolerant control of switched systems. It can therefore serve as a useful textbook for senior and/or graduate students who are interested in knowing the state-of-the-art of filtering, fault detection, fault diagnosis and fault tolerant control areas, as well as recent advances in switched linear systems.  

  4. Fault tolerant control for uncertain systems with parametric faults

    DEFF Research Database (Denmark)

    Niemann, Hans Henrik; Poulsen, Niels Kjølstad

    2006-01-01

    A fault tolerant control (FTC) architecture based on active fault diagnosis (AFD) and the YJBK (Youla, Jabr, Bongiorno and Kucera) parameterization is applied in this paper. Based on the FTC architecture, fault tolerant control of uncertain systems with slowly varying parametric faults is investigated. Conditions are given for closed-loop stability in case of false alarms or missing fault detection/isolation.

  5. Fault-Tolerant UAV Flight Control System

    OpenAIRE

    Dybsjord, Kerrin Andre

    2013-01-01

    The main focus of this master's thesis is fault-tolerant control systems (FTCSs) for unmanned aerial vehicles (UAVs). The goals are to develop an automatic-flight control system (AFCS) with fault detection and isolation (FDI) and a reconfiguration mechanism for accommodation of faults. The literature study reviews methods for fault-tolerant control and also discusses important faults and failures related to UAVs. The FTCS is implemented in MATLAB Simulink with a nonlinear model of the Ces...

  6. Synthesis of Fault-Tolerant Embedded Systems

    DEFF Research Database (Denmark)

    Eles, Petru; Izosimov, Viacheslav

    2008-01-01

    This work addresses the issue of design optimization for fault-tolerant hard real-time systems. In particular, our focus is on the handling of transient faults using both checkpointing with rollback recovery and active replication. Fault tolerant schedules are generated based on a conditional process graph representation. The formulated system synthesis approaches decide the assignment of fault-tolerance policies to processes, the optimal placement of checkpoints and the mapping of processes to processors, such that multiple transient faults are tolerated, transparency requirements are considered, and the timing constraints of the application are satisfied.
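
    The idea of checkpointing with rollback recovery mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' synthesis approach: it takes a checkpoint before every step instead of optimizing checkpoint placement, and the names `TransientFault`, `run_with_checkpoints` and `make_flaky_step` are hypothetical, introduced only for this illustration.

```python
import copy


class TransientFault(Exception):
    """Raised by a step when a (simulated) transient fault is detected."""


def run_with_checkpoints(steps, state, max_retries=3):
    """Run each step in order; on a fault, roll back and re-execute the step.

    `steps` is a list of functions that mutate the `state` dict in place.
    A checkpoint (deep copy of the state) is taken before every step.
    """
    for step in steps:
        checkpoint = copy.deepcopy(state)
        for _attempt in range(max_retries + 1):
            try:
                step(state)
                break  # step completed without a detected fault
            except TransientFault:
                # Rollback: discard partial updates, restore the checkpoint.
                state.clear()
                state.update(copy.deepcopy(checkpoint))
        else:
            # The loop never hit `break`: the fault outlived all retries.
            raise RuntimeError("fault persisted after all retries")
    return state


def make_flaky_step(failures):
    """Build a step that increments state['x'] but faults `failures` times."""
    remaining = {"n": failures}

    def step(state):
        state["x"] = state.get("x", 0) + 1  # partial update before the fault
        if remaining["n"] > 0:
            remaining["n"] -= 1
            raise TransientFault()

    return step


final = run_with_checkpoints([make_flaky_step(2), make_flaky_step(0)], {"x": 0})
print(final)  # prints {'x': 2}
```

    Each faulty attempt leaves a partial update behind, which the rollback discards, so the final state reflects exactly one successful execution of each step.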

  7. Systems approach to software fault tolerance

    Science.gov (United States)

    Caglayan, A. K.; Eckhardt, D. E., Jr.

    1985-01-01

    Computing systems are employed for aerospace applications with high reliability requirements. In order to provide the needed reliability, it was necessary to make use of computing systems with fault-tolerance characteristics. Traditionally, fault tolerance is achieved through the use of hardware redundancy. However, fault-tolerant techniques based on suitable software design considerations have also been developed. The present paper is concerned with the major issues arising in the context of an application of fault-tolerant software techniques to dynamic systems. Attention is given to fault-tolerant flight software, software component stability, system stability with fault-tolerant software, the preservation of functional performance, N-version vs. recovery blocks in flight software, systems-based software, static and dynamic models, static and dynamic consistency tests, and recovery block initialization.

  8. Fault tolerant control design for hybrid systems

    Energy Technology Data Exchange (ETDEWEB)

    Yang, Hao; Jiang, Bin [Nanjing University of Aeronautics and Astronautics, Nanjing (China)]; Cocquempot, Vincent [Universite des Sciences et Technologies de Lille, Villeneuve d'Ascq (France)]

    2010-07-01

    This book intends to give readers a good understanding of how to achieve the goal of Fault Tolerant Control of Hybrid Systems. The book can be used as a reference for academic research on Fault Tolerant Control and Hybrid Systems, or in Ph.D. study of control theory and engineering. The background knowledge assumed for this monograph is some undergraduate and graduate courses on Fault Diagnosis and Fault Tolerant Control theory, linear system theory, nonlinear system theory, Hybrid Systems theory and Discrete Event System theory. (orig.)

  9. Ultrareliable fault-tolerant control systems

    Science.gov (United States)

    Webster, L. D.; Slykhouse, R. A.; Booth, L. A., Jr.; Carson, T. M.; Davis, G. J.; Howard, J. C.

    1984-01-01

    It is demonstrated that fault-tolerant computer systems, such as on the Shuttles, based on redundant, independent operation are a viable alternative in fault tolerant system designs. The ultrareliable fault-tolerant control system (UFTCS) was developed and tested in laboratory simulations of a UH-1H helicopter. UFTCS includes asymptotically stable independent control elements in a parallel, cross-linked system environment. Static redundancy provides the fault tolerance. A polling is performed among the computers, with results allowing for time-delay channel variations with tight bounds. When compared with the laboratory and actual flight data for the helicopter, the probability of a fault was, for the first 10 hr of flight given a quintuple computer redundancy, found to be 1 in 290 billion. Two weeks of untended Space Station operations would experience a fault probability of 1 in 24 million. Techniques for avoiding channel divergence problems are identified.
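
    Polling among redundant computers, as described in this abstract, is commonly realized as some form of voting. The sketch below shows one generic variant, a tolerance-based majority vote around the median; it is an assumption made for illustration and not the UFTCS algorithm, and the function name `vote` and its parameters are hypothetical.

```python
from statistics import median


def vote(outputs, tolerance):
    """Majority vote over redundant channel outputs.

    Channels whose output lies within `tolerance` of the median are treated
    as agreeing. Returns (voted_value, indices_of_disagreeing_channels).
    Raises ValueError when no strict majority of channels agrees.
    """
    mid = median(outputs)
    agreeing = [v for v in outputs if abs(v - mid) <= tolerance]
    if 2 * len(agreeing) <= len(outputs):
        raise ValueError("no majority agreement among channels")
    suspects = [i for i, v in enumerate(outputs) if abs(v - mid) > tolerance]
    return median(agreeing), suspects


# Five redundant channels, one of which (index 3) has diverged.
value, suspects = vote([1.00, 1.01, 0.99, 5.0, 1.02], tolerance=0.05)
print(value, suspects)
```

    With five channels, this scheme masks up to two diverging channels, since three agreeing outputs still form a strict majority.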

  10. Software fault tolerance in computer operating systems

    Science.gov (United States)

    Iyer, Ravishankar K.; Lee, Inhwan

    1994-01-01

    This chapter provides data and analysis of the dependability and fault tolerance for three operating systems: the Tandem/GUARDIAN fault-tolerant system, the VAX/VMS distributed system, and the IBM/MVS system. Based on measurements from these systems, basic software error characteristics are investigated. Fault tolerance in operating systems resulting from the use of process pairs and recovery routines is evaluated. Two levels of models are developed to analyze error and recovery processes inside an operating system and interactions among multiple instances of an operating system running in a distributed environment. The measurements show that the use of process pairs in Tandem systems, which was originally intended for tolerating hardware faults, allows the system to tolerate about 70% of defects in system software that result in processor failures. The loose coupling between processors which results in the backup execution (the processor state and the sequence of events occurring) being different from the original execution is a major reason for the measured software fault tolerance. The IBM/MVS system fault tolerance almost doubles when recovery routines are provided, in comparison to the case in which no recovery routines are available. However, even when recovery routines are provided, there is almost a 50% chance of system failure when critical system jobs are involved.

  11. Energy-efficient fault-tolerant systems

    CERN Document Server

    Mathew, Jimson; Pradhan, Dhiraj K

    2013-01-01

    This book describes the state-of-the-art in energy efficient, fault-tolerant embedded systems. It covers the entire product lifecycle of electronic systems design, analysis and testing and includes discussion of both circuit and system-level approaches. Readers will be enabled to meet the conflicting design objectives of energy efficiency and fault-tolerance for reliability, given the up-to-date techniques presented.

  12. Approaches differ for fault-tolerant systems

    Energy Technology Data Exchange (ETDEWEB)

    Aseo, J.

    1983-09-01

    Efforts to provide fault-tolerant computer systems focus on two primary architectures: redundant hardware executing different tasks and parallel processors operating on the same set of data and instructions. Parallel processing is the approach favored by August Systems (Tigard, Oregon), Hewlett-Packard (Palo Alto, California), Parallel Computers (Santa Cruz, California), Stratus Computers (Natick, Massachusetts) and Tandem Computers (Cupertino, California). Multiple redundant system elements can be found in implementations from Auragen Systems (Fort Lee, New Jersey), and Tolerant Systems (Milpitas, California). Critical differences between the two approaches are the ability to recover from errors in real time as well as the degree of fault tolerance implemented in hardware and software.

  13. Fault-tolerant software - Experiment with the SIFT operating system. [Software Implemented Fault Tolerance computer]

    Science.gov (United States)

    Brunelle, J. E.; Eckhardt, D. E., Jr.

    1985-01-01

    Results are presented of an experiment conducted in the NASA Avionics Integrated Research Laboratory (AIRLAB) to investigate the implementation of fault-tolerant software techniques on fault-tolerant computer architectures, in particular the Software Implemented Fault Tolerance (SIFT) computer. The N-version programming and recovery block techniques were implemented on a portion of the SIFT operating system. The results indicate that, to effectively implement fault-tolerant software design techniques, system requirements will be impacted and suggest that retrofitting fault-tolerant software on existing designs will be inefficient and may require system modification.

  14. Middleware Fault Tolerance Support for the BOSS Embedded Operating System

    OpenAIRE

    Afonso, Francisco; Silva, Carlos A.; Montenegro, Sérgio; Tavares, Adriano

    2006-01-01

    Critical embedded systems need a dependable operating system and application. Despite all efforts to prevent and remove faults in system development, residual software faults usually persist. Therefore, critical systems need some sort of fault tolerance to deal with these faults and also with hardware faults at operation time. This work proposes fault-tolerant support mechanisms for the BOSS embedded operating system, based on the application of proven fault tolerance strategies by middlew...

  15. Fault tolerant control of systems with saturations

    DEFF Research Database (Denmark)

    Niemann, Hans Henrik

    2013-01-01

    This paper presents a framework for fault tolerant controllers (FTC) that includes input saturation. The controller architecture known from FTC, which is based on the Youla-Jabr-Bongiorno-Kucera (YJBK) parameterization, is extended to handle input saturation. Applying this controller architecture in connection with faulty systems including input saturation gives an additional YJBK transfer function related to the input saturation. In the fault free case, this additional YJBK transfer function can be applied directly for optimizing the feedback loop around the input saturation. In the faulty case, the design problem is a mixed design problem involving both parametric faults and input saturation.

  16. Software engineering of fault tolerant systems

    CERN Document Server

    Pelliccione, P; Muccini, Henry

    2007-01-01

    In architecting dependable systems, what is required to improve the overall system robustness is fault tolerance. Many methods have been proposed to this end, but the solutions are usually considered late during the design and implementation phases of the software life-cycle (e.g., Java and Windows NT exception handling), thus reducing the effectiveness of error and fault handling. Since the system design typically models only normal behaviour of the system while ignoring exceptional ones, the implementation of the system is unable to handle abnormal events. Consequently, the system may fail in unexp

  17. A Fault-tolerant Development Methodology for Industrial Control Systems

    DEFF Research Database (Denmark)

    Izadi-Zamanabadi, Roozbeh; Thybo, C.

    2004-01-01

    Developing advanced detection schemes is not the only factor in obtaining successful fault diagnosis performance. Achieving significant results in applying fault tolerance in industrial development requires that fault diagnosis and recovery schemes are developed in a consistent and logically sound manner. This paper presents the employed fault-tolerant development methodology and highlights steps which have been essential for achieving complete and consistent monitoring capabilities. Fault diagnosis for a commercial refrigeration system is treated as a case study.

  18. Method and system for environmentally adaptive fault tolerant computing

    Science.gov (United States)

    Copenhaver, Jason L. (Inventor); Jeremy, Ramos (Inventor); Wolfe, Jeffrey M. (Inventor); Brenner, Dean (Inventor)

    2010-01-01

    A method and system for adapting fault tolerant computing. The method includes the steps of measuring an environmental condition representative of an environment. An on-board processing system's sensitivity to the measured environmental condition is measured. It is determined whether to reconfigure a fault tolerance of the on-board processing system based in part on the measured environmental condition. The fault tolerance of the on-board processing system may be reconfigured based in part on the measured environmental condition.

  19. Fault tolerant architecture for artificial olfactory system

    Science.gov (United States)

    Lotfivand, Nasser; Nizar Hamidon, Mohd; Abdolzadeh, Vida

    2015-05-01

    In this paper, to cover and mask the faults that occur in the sensing unit of an artificial olfactory system, a novel architecture is offered. The proposed architecture is able to tolerate failures in the sensors of the array and the faults that occur are masked. The proposed architecture for extracting the correct results from the output of the sensors can provide the quality of service for generated data from the sensor array. The results of various evaluations and analysis proved that the proposed architecture has acceptable performance in comparison with the classic form of the sensor array in gas identification. According to the results, achieving a high odor discrimination based on the suggested architecture is possible.

  20. Fault tolerance versus performance metrics for robot systems

    International Nuclear Information System (INIS)

    The incorporation of fault tolerance techniques into robot systems improves the reliability, but also increases the hardware and computational requirements in the overall system. It is not always clear how to evaluate the merit, or 'effectiveness', of different fault tolerance approaches for a given application. In this paper, we present a new set of performance criteria designed to measure and compare the effectiveness of robot fault tolerance strategies. The measures, which are designed to evaluate fault tolerance/performance/cost tradeoffs, can also be used to evaluate pure performance or pure fault tolerance strategies. We show their usefulness using a variety of proposed fault tolerance approaches in the literature, focusing on multiprocessor control architectures.

  1. Synthesizing Fault Tolerant Safety Critical Systems

    Directory of Open Access Journals (Sweden)

    Seemanta Saha

    2014-08-01

    To keep pace with today's nanotechnology, safety critical embedded systems are becoming less tolerant to errors. Research into techniques to cope with errors in these systems has mostly focused on the transformational approach, replication of hardware devices, parallel program design, component based design and/or information redundancy. It would be better to tackle the issue early in the design process so that a safety critical system never fails to satisfy its strict dependability requirements. A novel method is outlined in this paper that proposes an efficient approach to synthesize safety critical systems. The proposed method outperforms dominant existing work by introducing the technique of run time detection and completion of proper execution of the system in the presence of faults.

  2. Design and validation of fault-tolerant flight systems

    Science.gov (United States)

    Finelli, George B.; Palumbo, Daniel L.

    1987-01-01

    NASA has undertaken the development of a methodology for the design of easily validated fault-tolerant systems which emphasizes validation processes that can be directly incorporated into the design process. Attention is presently given to the statistical issues arising in the validation of highly reliable fault-tolerant systems. Structured specification and design methodologies, mathematical proof techniques, analytical modeling, simulation/emulation, and physical testing, are all discussed. Important design factors associated with fault-tolerance are noted; synchronization and 'Byzantine resilience' must accompany fault tolerance.

  3. Fault-tolerant actuator system for electrical steering of vehicles

    DEFF Research Database (Denmark)

    Thomsen, Jesper Sandberg; Blanke, Mogens

    2006-01-01

    Being critical to the safety of vehicles, the steering system is required to maintain the vehicle's ability to steer until it is brought to a halt, should a fault occur. With electrical steering becoming a cost-effective candidate for electrically powered vehicles, a fault-tolerant architecture is needed that meets this requirement. This paper studies the fault-tolerance properties of an electrical steering system. It presents a fault-tolerant architecture where a dedicated AC motor design used in conjunction with cheap voltage measurements can ensure detection of all relevant faults in the steering system. The paper shows how active control reconfiguration can accommodate all critical faults. The fault-tolerant abilities of the steering system are demonstrated on the hardware of a warehouse truck.

  4. Fault tolerant aggregation for power system services

    DEFF Research Database (Denmark)

    Kosek, Anna Magdalena; Gehrke, Oliver

    2013-01-01

    Exploiting the flexibility in distributed energy resources (DER) is seen as an important contribution to allow high penetrations of renewable generation in electrical power systems. However, the present control infrastructure in power systems is not well suited for the integration of a very large number of small units. A common approach is to aggregate a portfolio of such units together and expose them to the power system as a single large virtual unit. In order to realize the vision of a Smart Grid, concepts for flexible, resilient and reliable aggregation infrastructures are required. This paper presents such a concept while focusing on the aspect of resilience and fault tolerance. The proposed concept makes use of a multi-level election algorithm to transparently manage the addition, removal, failure and reorganization of units. It has been implemented and tested as a proof-of-concept on the distributed smart grid test bed SYSLAB at the Technical University of Denmark.

  5. Comparing Distributed Online Stream Processing Systems Considering Fault Tolerance Issues

    Directory of Open Access Journals (Sweden)

    André Leon Sampaio Gradvohl

    2014-05-01

    This paper presents an analysis of four online stream processing systems (MillWheel, S4, Spark Streaming and Storm) regarding the strategies they use for fault tolerance. We use this sort of system for processing of data streams that can come from different sources such as web sites, sensors, mobile phones or any set of devices that provide real-time high-speed data. Typically, these systems are concerned more with the throughput in data processing than with fault tolerance. However, depending on the type of application, we should consider fault tolerance as an important feature. The work describes some of the main strategies for fault tolerance – replication components, upstream backup, checkpoint and recovery – and shows how each of the four systems uses these strategies. In the end, the paper discusses the advantages and disadvantages of the combination of the strategies for fault tolerance in these systems.

  6. Parameter Transient Behavior Analysis on Fault Tolerant Control System

    Science.gov (United States)

    Belcastro, Christine (Technical Monitor); Shin, Jong-Yeob

    2003-01-01

    In a fault tolerant control (FTC) system, a parameter varying FTC law is reconfigured based on fault parameters estimated by fault detection and isolation (FDI) modules. FDI modules require some time to detect fault occurrences in aero-vehicle dynamics. This paper illustrates the analysis of an FTC system based on estimated fault parameter transient behavior, which may include false fault detections during a short time interval. Using Lyapunov function analysis, the upper bound of an induced-L2 norm of the FTC system performance is calculated as a function of the fault detection time and the exponential decay rate of the Lyapunov function.

  7. Fault-tolerance - The survival attribute of digital systems

    Science.gov (United States)

    Avizienis, A.

    1978-01-01

    Fault-tolerance is the architectural attribute of a digital system that keeps the logic machine doing its specified tasks when its host, the physical system, suffers various kinds of failures of its components. A more general concept of fault-tolerance also includes human mistakes committed during software and hardware implementation and during man/machine interaction among the causes of faults that are to be tolerated by the logic machine. This paper discusses the concept of fault-tolerance, the reasons for its inclusion in digital system architecture, and the methods of its implementation. A chronological view of the evolution of fault-tolerant systems and an outline of some goals for its further development conclude the presentation.

  8. Scheduling and Optimization of Fault-Tolerant Embedded Systems

    OpenAIRE

    Izosimov, Viacheslav

    2006-01-01

    Safety-critical applications have to function correctly even in presence of faults. This thesis deals with techniques for tolerating effects of transient and intermittent faults. Reexecution, software replication, and rollback recovery with checkpointing are used to provide the required level of fault tolerance. These techniques are considered in the context of distributed real-time systems with non-preemptive static cyclic scheduling. Safety-critical applications have strict time and cost co...

  9. Fault Tolerance in Distributed Systems using Fused State Machines

    OpenAIRE

    Balasubramanian, Bharath; Garg, Vijay K

    2013-01-01

    Replication is a standard technique for fault tolerance in distributed systems modeled as deterministic finite state machines (DFSMs or machines). To correct f crash or f/2 Byzantine faults among n different machines, replication requires nf additional backup machines. We present a solution called fusion that requires just f additional backup machines. First, we build a framework for fault tolerance in DFSMs based on the notion of Hamming distances. We introduce the concept ...
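
    The backup counts quoted in this abstract can be made concrete with a small worked example. The function name `backups_needed` is hypothetical; the formulas are taken directly from the abstract (replication: n·f additional machines, fusion: f additional machines), and the sketch says nothing about how fused backups are actually constructed.

```python
def backups_needed(n, f):
    """Additional backup machines needed to correct f crash faults
    among n machines, using the counts stated in the abstract."""
    return {"replication": n * f, "fusion": f}


# Ten machines, tolerating two crash faults.
print(backups_needed(10, 2))  # prints {'replication': 20, 'fusion': 2}
```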

  10. From fault classification to fault tolerance for multi-agent systems

    CERN Document Server

    Potiron, Katia; Taillibert, Patrick

    2013-01-01

    Faults are a concern for Multi-Agent Systems (MAS) designers, especially if the MAS are built for industrial or military use because there must be some guarantee of dependability. Some fault classification exists for classical systems, and is used to define faults. When dependability is at stake, such fault classification may be used from the beginning of the system's conception to define fault classes and specify which types of faults are expected. Thus, one may want to use fault classification for MAS; however, From Fault Classification to Fault Tolerance for Multi-Agent Systems argues that

  11. Design of fault tolerant control for nonlinear systems subject to time varying faults

    OpenAIRE

    Bouarar, Tahar; Marx, Benoît; Maquin, Didier; Ragot, José

    2011-01-01

    In this paper, a Fault Tolerant Control (FTC) problem for discrete time nonlinear systems represented by Takagi-Sugeno (T-S) models is investigated. The goal is to design a fault tolerant controller taking into account the faults affecting the overall system behavior in order to ensure the system stability. The principal idea is to introduce a Proportional Integral (PI) observer to detect and to estimate an eventual fault occurring in the system. Based on Lyapunov theory, two new approaches ...

  12. Fault-Tolerant Onboard Monitoring and Decision Support Systems

    DEFF Research Database (Denmark)

    Lajic, Zoran

    2010-01-01

    The purpose of this research project is to improve current onboard decision support systems. Special focus is on the onboard prediction of the instantaneous sea state. In this project a new approach to increasing the overall reliability of a monitoring and decision support system has been established. The basic idea is to convert the given system into a fault-tolerant system and to improve multi-sensor data fusion for the particular system. The background of the project is the SeaSense system, which has been installed on several container ships and navy vessels. The SeaSense system provides a crude and simple estimation of the actual sea state (Hs and Tz), information about the longitudinal hull girder loading, seakeeping performance of the ship, and decision support on how to operate the ship within acceptable limits. The system is able to identify critical forthcoming events and to give advice regarding speed and course changes to decrease the wave-induced loads. The SeaSense system is based on the combined use of a mathematical model and measurements from a set of sensors. The overall dependability of a shipboard monitoring and decision support system such as the SeaSense system can be improved using fault-tolerant techniques (Fault Diagnosis and System Re-design) and a Sensor Fusion Quality (SFQ) test. Fault diagnosis means to detect the presence of faults in the system. When sea state estimation is conducted by a ship-wave buoy analogy, the best solution is achieved when a set of three different ship responses is used. Faulty signals should be discarded from the procedure for sea state estimation if possible; if not, the fault should be estimated. The fault diagnosis can be divided into three steps: fault detection, fault isolation and fault estimation. Fault detection means to decide whether or not a fault has occurred. This step determines the time at which the system is subjected to the given fault. Fault isolation finds the component in which a fault has occurred. This step determines the location of the fault. Fault estimation provides an estimate of the magnitude of a fault. A supervisory function determines the severity of the fault once its origin has been isolated and its magnitude estimated. Fault-tolerant Sensor Fusion means that the monitoring and decision support system can accommodate faults so that the overall system continues to satisfy its goal; on the other hand, in the absence of a fault, the system should be able to provide the most accurate information using the SFQ test.
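
    The three diagnosis steps named in this abstract (detection, isolation, estimation) can be sketched for the simple case of three redundant estimates of the same quantity. This is a generic median-residual scheme chosen for illustration, not the method of the thesis; the function `diagnose`, the threshold and the response names in the usage example are all hypothetical, and the sketch assumes at most one faulty source.

```python
from statistics import median


def diagnose(estimates, threshold):
    """Detect, isolate and estimate a single fault among redundant estimates.

    `estimates` maps a source name to its estimate of the same quantity
    (for example, a wave height derived from one ship response).
    """
    reference = median(estimates.values())
    residuals = {name: value - reference for name, value in estimates.items()}
    # Isolation candidate: the source that deviates most from the reference.
    worst = max(residuals, key=lambda name: abs(residuals[name]))
    if abs(residuals[worst]) <= threshold:
        # Detection: no residual exceeds the threshold, so no fault is flagged.
        return {"fault_detected": False, "faulty_source": None, "fault_estimate": 0.0}
    # Estimation: the residual approximates the magnitude of the fault.
    return {"fault_detected": True, "faulty_source": worst,
            "fault_estimate": residuals[worst]}


# Hypothetical wave height estimates (metres) from three ship responses.
result = diagnose({"heave": 2.0, "pitch": 2.1, "vertical_accel": 3.5}, threshold=0.5)
print(result["faulty_source"])  # prints vertical_accel
```

    Once a source is isolated, it can be discarded from the estimation procedure, or its estimated fault magnitude can be used to correct it, matching the two options described in the abstract.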

  13. Software fault tolerance for real-time avionics systems

    Science.gov (United States)

    Anderson, T.; Knight, J. C.

    1983-01-01

    Avionics systems have very high reliability requirements and are therefore prime candidates for the inclusion of fault tolerance techniques. In order to provide tolerance to software faults, some form of state restoration is usually advocated as a means of recovery. State restoration can be very expensive for systems which utilize concurrent processes. The concurrency present in most avionics systems and the further difficulties introduced by timing constraints imply that providing tolerance for software faults may be inordinately expensive or complex. A straightforward pragmatic approach to software fault tolerance which is believed to be applicable to many real-time avionics systems is proposed. A classification system for software errors is presented together with approaches to recovery and continued service for each error type.

  14. Active Fault Tolerant Control of Livestock Stable Ventilation System

    DEFF Research Database (Denmark)

    Gholami, Mehdi

    2011-01-01

    Modern stables and greenhouses are equipped with different components for providing a comfortable climate for animals and plants. A component malfunction may result in loss of production. Therefore, it is desirable to design a control system which is stable and is able to provide an acceptable degraded performance even in the faulty case. In this thesis, we have designed such controllers for climate control systems of livestock buildings in three steps: (1) deriving a model for the climate control system of a pig stable; (2) designing an active fault diagnosis (AFD) algorithm for different kinds of fault; (3) designing a fault tolerant control scheme for the climate control system. In the first step, a conceptual multi-zone model for climate control of a livestock building is derived. In the next step, two methods for active fault diagnosis are proposed. The AFD methods excite the system by injecting a so-called excitation input. Two different algorithms, the EKF and a new adaptive filter, are used to detect the faults. The fault tolerant controller (FTC) is based on a switching scheme between a set of predefined passive fault tolerant controllers (PFTCs). In the FTC part of the thesis, first a passive fault tolerant controller (PFTC) based on state feedback is proposed for discrete-time piecewise affine (PWA) systems. Only actuator faults are considered. Then the PFTC problem is reformulated as a feasibility problem for a set of linear matrix inequalities (LMIs).

  15. A fault-tolerant intelligent robotic control system

    Science.gov (United States)

    Marzwell, Neville I.; Tso, Kam Sing

    1993-01-01

    This paper describes the concept, design, and features of a fault-tolerant intelligent robotic control system being developed for space and commercial applications that require high dependability. The comprehensive strategy integrates system level hardware/software fault tolerance with task level handling of uncertainties and unexpected events for robotic control. The underlying architecture for system level fault tolerance is the distributed recovery block which protects against application software, system software, hardware, and network failures. Task level fault tolerance provisions are implemented in a knowledge-based system which utilizes advanced automation techniques such as rule-based and model-based reasoning to monitor, diagnose, and recover from unexpected events. The two level design provides tolerance of two or more faults occurring serially at any level of command, control, sensing, or actuation. The potential benefits of such a fault tolerant robotic control system include: (1) a minimized potential for damage to humans, the work site, and the robot itself; (2) continuous operation with a minimum of uncommanded motion in the presence of failures; and (3) more reliable autonomous operation providing increased efficiency in the execution of robotic tasks and decreased demand on human operators for controlling and monitoring the robotic servicing routines.

  16. Fault tolerant tracking control for continuous Takagi-Sugeno systems with time varying faults

    OpenAIRE

    Bouarar, Tahar; Marx, Benoît; Maquin, Didier; Ragot, José

    2011-01-01

    This paper deals with Fault Tolerant Control design for continuous nonlinear Takagi-Sugeno faulty systems. The goal is to ensure both state and fault estimation and the state reference tracking even if faults occur. In this study, the faults affecting the system behavior are considered as time varying functions modeled by exponential functions or first order polynomials. Based on the descriptor redundancy property, solutions are proposed for both cases, exponential and polynomial faults, in ter...

  17. H infinity Integrated Fault Estimation and Fault Tolerant Control of Discrete-time Piecewise Linear Systems

    DEFF Research Database (Denmark)

    Tabatabaeipour, Seyed Mojtaba; Bak, Thomas

    2012-01-01

    In this paper we consider the problem of fault estimation and accommodation for discrete time piecewise linear systems. A robust fault estimator is designed to estimate the fault such that the estimation error converges to zero and the H∞ performance of the fault estimation is minimized. Then, the estimate of fault is used to compensate for the effect of the fault. Hence, using the estimate of fault, a fault tolerant controller using a piecewise linear static output feedback is designed such that it stabilizes the system and provides an upper bound on the H∞ performance of the faulty system. Sufficient conditions for the existence of robust fault estimator and fault tolerant controller are derived in terms of linear matrix inequalities. Upper bounds on the H∞ performance can be minimized by solving convex optimization problems with linear matrix inequality constraints. The efficiency of the method is demonstrated by means of a numerical example.

  18. Fault-tolerant design

    CERN Document Server

    Dubrova, Elena

    2013-01-01

    This textbook serves as an introduction to fault-tolerance, intended for upper-division undergraduate students, graduate-level students and practicing engineers in need of an overview of the field. Readers will develop skills in modeling and evaluating fault-tolerant architectures in terms of reliability, availability and safety. They will gain a thorough understanding of fault tolerant computers, including both the theory of how to design and evaluate them and the practical knowledge of achieving fault-tolerance in electronic, communication and software systems. Coverage includes fault-tolerance techniques through hardware, software, information and time redundancy. The content is designed to be highly accessible, including numerous examples and exercises. Solutions and PowerPoint slides are available for instructors. · Provides textbook coverage of the fundamental concepts of fault-tolerance; · Describes a variety of basic techniques for achieving fault-toleran...

  19. Measurement and analysis of operating system fault tolerance

    Science.gov (United States)

    Lee, I.; Tang, D.; Iyer, R. K.

    1992-01-01

    This paper demonstrates a methodology to model and evaluate the fault tolerance characteristics of operational software. The methodology is illustrated through case studies on three different operating systems: the Tandem GUARDIAN fault-tolerant system, the VAX/VMS distributed system, and the IBM/MVS system. Measurements are made on these systems for substantial periods to collect software error and recovery data. In addition to investigating basic dependability characteristics such as major software problems and error distributions, we develop two levels of models to describe error and recovery processes inside an operating system and on multiple instances of an operating system running in a distributed environment. Based on the models, reward analysis is conducted to evaluate the loss of service due to software errors and the effect of the fault-tolerance techniques implemented in the systems. Software error correlation in multicomputer systems is also investigated.

  20. Fault tolerant hypercube computer system architecture

    Science.gov (United States)

    Madan, Herb S. (inventor); Chow, Edward (inventor)

    1989-01-01

    A fault-tolerant multiprocessor computer system of the hypercube type comprising a hierarchy of computers of like kind which can be functionally substituted for one another as necessary is disclosed. Communication between the working nodes is via one communications network while communications between the working nodes and watch dog nodes and load balancing nodes higher in the structure is via another communications network separate from the first. A typical branch of the hierarchy reporting to a master node or host computer comprises, a plurality of first computing nodes; a first network of message conducting paths for interconnecting the first computing nodes as a hypercube. The first network provides a path for message transfer between the first computing nodes; a first watch dog node; and a second network of message connecting paths for connecting the first computing nodes to the first watch dog node independent from the first network, the second network provides an independent path for test message and reconfiguration affecting transfers between the first computing nodes and the first switch watch dog node. There is additionally, a plurality of second computing nodes; a third network of message conducting paths for interconnecting the second computing nodes as a hypercube. The third network provides a path for message transfer between the second computing nodes; a fourth network of message conducting paths for connecting the second computing nodes to the first watch dog node independent from the third network. 
The fourth network provides an independent path for test message and reconfiguration affecting transfers between the second computing nodes and the first watch dog node; and a first multiplexer disposed between the first watch dog node and the second and fourth networks for allowing the first watch dog node to selectively communicate with individual ones of the computing nodes through the second and fourth networks; as well as, a second watch dog node operably connected to the first multiplexer whereby the second watch dog node can selectively communicate with individual ones of the computing nodes through the second and fourth networks. The branch is completed by a first load balancing node; and a second multiplexer connected between the first load balancing node and the first and second watch dog nodes, allowing the first load balancing node to selectively communicate with the first and second watch dog nodes.

  1. Passive Fault-tolerant Control of Discrete-time Piecewise Affine Systems against Actuator Faults

    DEFF Research Database (Denmark)

    Tabatabaeipour, Seyed Mojtaba; Izadi-Zamanabadi, Roozbeh

    2012-01-01

    In this paper, we propose a new method for passive fault-tolerant control of discrete time piecewise affine systems. Actuator faults are considered. A reliable piecewise linear quadratic regulator (LQR) state feedback is designed such that it can tolerate actuator faults. A sufficient condition for the existence of a passive fault-tolerant controller is derived and formulated as the feasibility of a set of linear matrix inequalities (LMIs). The upper bound on the performance cost can be minimized using a convex optimization problem with LMI constraints which can be solved efficiently. The approach is illustrated on a numerical example and a two degree of freedom helicopter.
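
What "passive" means here can be illustrated with a toy check: one fixed feedback gain must keep the closed loop stable for every actuator fault considered, without any fault detection or switching. The sketch below assumes a scalar plant x[t+1] = a·x[t] + b·α·u[t] with loss-of-effectiveness factor α and feedback u[t] = -k·x[t]. The plant, the numbers, and the function names are assumptions for illustration only; the paper itself works with piecewise affine models and LMI-based synthesis, which this sketch does not reproduce.

```python
def closed_loop_pole(a, b, k, alpha):
    """Pole of x[t+1] = a*x[t] + b*alpha*u[t] under the feedback u[t] = -k*x[t]."""
    return a - b * alpha * k


def tolerates(a, b, k, effectiveness_levels):
    """True if the single fixed gain k keeps every listed fault case stable.

    A discrete-time scalar loop is stable when the pole magnitude is below 1.
    """
    return all(abs(closed_loop_pole(a, b, k, alpha)) < 1.0
               for alpha in effectiveness_levels)


# Hypothetical unstable plant (a > 1) with a fixed gain k.
a, b, k = 1.2, 1.0, 0.9
ok_half_loss = tolerates(a, b, k, [1.0, 0.5])    # nominal and 50% effectiveness
ok_severe_loss = tolerates(a, b, k, [1.0, 0.2])  # nominal and 20% effectiveness
```

With these assumed numbers the gain tolerates a 50% loss of actuator effectiveness but not an 80% loss, which shows why the set of admissible faults has to be fixed before a passive controller is designed.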

  2. Trajectory tracking fault tolerant controller design for Takagi-Sugeno systems subject to actuator faults

    OpenAIRE

    Bouarar, Tahar; Marx, Benoît; Maquin, Didier; Ragot, José

    2011-01-01

    This paper investigates the problem of fault tolerant control (FTC) design for nonlinear Takagi-Sugeno (T-S) models with measurable premise variables. The idea is to synthesize a fault tolerant controller ensuring state trajectory tracking. Based on Lyapunov theory, new, less conservative approaches are proposed in terms of Linear Matrix Inequalities (LMIs). A PI observer is needed to estimate simultaneously the faults and the faulty system states in order to reconfigure the FTC law. A numerical e...

  3. Fault Tolerant Services for Safe In-Car Embedded Systems

    OpenAIRE

    Navet, Nicolas; Simonot-lion, Françoise

    2005-01-01

    Due to the increasing criticality of the functions in terms of safety, embedded automotive systems must now respect stringent dependability constraints despite the faults that may occur in a very harsh environment. In a context where critical functions are distributed over the network, the communication system plays a major role. First, we discuss the main services and functionalities that a communication system should offer for easing the design of fault-tolerant applications in the automot...

  4. Synthesizing Fault Tolerant Safety Critical Systems

    OpenAIRE

    Seemanta Saha; Muhammad Sheikh Sadi

    2014-01-01

    To keep pace with today’s nanotechnology, safety critical embedded systems are becoming less tolerant to errors. Research into techniques to cope with errors in these systems has mostly focused on transformational approach, replication of hardware devices, parallel program design, component based design and/or information redundancy. It would be better to tackle the issue early in the design process that a safety critical system never fails to satisfy its strict dependability requiremen...

  5. Data-driven design of fault diagnosis and fault-tolerant control systems

    CERN Document Server

    Ding, Steven X

    2014-01-01

    Data-driven Design of Fault Diagnosis and Fault-tolerant Control Systems presents basic statistical process monitoring, fault diagnosis, and control methods, and introduces advanced data-driven schemes for the design of fault diagnosis and fault-tolerant control systems catering to the needs of dynamic industrial processes. With ever increasing demands for reliability, availability and safety in technical processes and assets, process monitoring and fault-tolerance have become important issues surrounding the design of automatic control systems. This text shows the reader how, thanks to the rapid development of information technology, key techniques of data-driven and statistical process monitoring and control can now become widely used in industrial practice to address these issues. To allow for self-contained study and facilitate implementation in real applications, important mathematical and control theoretical knowledge and tools are included in this book. Major schemes are presented in algorithm form and...

  6. Comparing Distributed Online Stream Processing Systems Considering Fault Tolerance Issues

    OpenAIRE

    André Leon Sampaio Gradvohl; Hermes Senger; Luciana Arantes; Pierre Sens

    2014-01-01

    This paper presents an analysis of four online stream processing systems (MillWheel, S4, Spark Streaming and Storm) regarding the strategies they use for fault tolerance. We use this sort of system for processing of data streams that can come from different sources such as web sites, sensors, mobile phones or any set of devices that provide real-time high-speed data. Typically, these systems are concerned more with the throughput in data processing than on fault tolerance. However, depending ...

  7. Nonlinear observer based sensor fault tolerant control for nonlinear systems

    OpenAIRE

    Ichalal, Dalil; Marx, Benoît; Maquin, Didier; Ragot, José

    2012-01-01

    This paper deals with the problem of sensor fault tolerant control for Takagi-Sugeno nonlinear systems. Firstly, a residual generator is designed in order to detect and isolate sensor faults. Secondly, a nonlinear observer based controller, adopting the so-called parallel distributed compensation (PDC) structure is designed. The PDC controller is based on a weighted blending of the estimated states provided by different observers. Each observer is constructed to estimate the system state from...

  8. Fault Tolerant Controllers for Sampled-data Systems

    DEFF Research Database (Denmark)

    Niemann, H.; Stoustrup, Jakob

    2004-01-01

    A general compensator architecture for fault tolerant control (FTC) for sampled-data systems is proposed. The architecture is based on the YJBK parameterization of all stabilizing controllers, and uses the dual YJBK parameterization to quantify the performance of the fault tolerant system. The FTC architecture is based on a discrete-time nominal feedback controller, with the FTC part also in discrete time. Further, a number of problems for the design of the controller reconfiguration part in the FTC architecture are considered. It is shown how these design problems can be transformed into standard design problems for feedback controllers.

  9. Design of fault tolerant control system for steam generator using

    Energy Technology Data Exchange (ETDEWEB)

    Kim, Myung Ki; Seo, Mi Ro [Korea Electric Power Research Institute, Taejon (Korea, Republic of)

    1998-12-31

    A controller and sensor fault tolerant system for a steam generator is designed with fuzzy logic. The proposed fault tolerant redundant system is composed of a supervisor and two fuzzy weighting modulators. The supervisor alternately checks the controller-induced and sensor-induced performance to identify which part, a controller or a sensor, is faulty. To analyze controller-induced performance, both the error and the change in error of the system output are chosen as fuzzy variables. The fuzzy logic for sensor-induced performance uses two variables: the deviation between two sensor outputs and its frequency. The fuzzy weighting modulator generates an output signal that compensates for a faulty input signal. Simulations show that the proposed fault tolerant control scheme for a steam generator regulates the water level well by suppressing the fault effects of either controllers or sensors. Therefore, by duplicating sensors and controllers with the proposed fault tolerant scheme, the reliability of both the steam generator control and sensor system and the power plant increases. 2 refs., 9 figs., 1 tab. (Author)

  10. Stability guaranteed active fault tolerant control of networked control systems

    OpenAIRE

    Shanbin Li; Dominique Sauter; Christophe Aubrun; Joseph Yamé

    2007-01-01

    Stability-guaranteed active fault-tolerant control against actuator failures and plant uncertainties in networked control systems (NCSs) is addressed. A detailed design procedure is formulated as a convex optimization problem which can be efficiently solved by existing software. An illustrative example is given to show the efficiency of the proposed method for network-based control for uncertain systems.

  11. Summarize of Electric Vehicle Electric System Fault and Fault-tolerant Technology

    OpenAIRE

    Zhang Liwei; Huang Xianjin; Yang Yannan; Xu Chen; Liu Jie

    2013-01-01

    The electric vehicle drive system is a multi-variable system that operates in a complex and changeable environment, so its failure modes are complicated. In this paper, a vehicle fault table is established according to where the faults occur, and the possible consequences and the causes of each failure are analyzed. Taking into account hardware limitations and the requirement to preserve as much system performance as possible, a passive software redundancy fault-tolerant strategy is put forward, give an e...

  12. Enhanced Fault-Tolerant Quantum Computing in d-Level Systems

    Science.gov (United States)

    Campbell, Earl T.

    2014-12-01

    Error-correcting codes protect quantum information and form the basis of fault-tolerant quantum computing. Leading proposals for fault-tolerant quantum computation require codes with an exceedingly rare property, a transversal non-Clifford gate. Codes with the desired property are presented for d-level qudit systems with prime d. The codes use n = d - 1 qudits and can detect up to ~d/3 errors. We quantify the performance of these codes for one approach to quantum computation known as magic-state distillation. Unlike prior work, we find performance is always enhanced by increasing d.

  13. Fault Tolerance Middleware for a Multi-Core System

    Science.gov (United States)

    Some, Raphael R.; Springer, Paul L.; Zima, Hans P.; James, Mark; Wagner, David A.

    2012-01-01

    Fault Tolerance Middleware (FTM) provides a framework to run on a dedicated core of a multi-core system and handles detection of single-event upsets (SEUs), and the responses to those SEUs, occurring in an application running on multiple cores of the processor. This software was written expressly for a multi-core system and can support different kinds of fault strategies, such as introspection, algorithm-based fault tolerance (ABFT), and triple modular redundancy (TMR). It focuses on providing fault tolerance for the application code, and represents the first step in a plan to eventually include fault tolerance in message passing and the FTM itself. In the multi-core system, the FTM resides on a single, dedicated core, separate from the cores used by the application. This is done in order to isolate the FTM from application faults and to allow it to swap out any application core for a substitute. The structure of the FTM consists of an interface to a fault tolerant strategy module, a responder module, a fault manager module, an error factory, and an error mapper that determines the severity of the error. In the present reference implementation, the only fault tolerant strategy implemented is introspection. The introspection code waits for an application node to send an error notification to it. It then uses the error factory to create an error object, and at this time, a severity level is assigned to the error. The introspection code uses its built-in knowledge base to generate a recommended response to the error. Responses might include ignoring the error, logging it, rolling back the application to a previously saved checkpoint, swapping in a new node to replace a bad one, or restarting the application. The original error and recommended response are passed to the top-level fault manager module, which invokes the response. The responder module also notifies the introspection module of the generated response. 
This provides additional information to the introspection module that it can use in generating its next response. For example, if the responder triggers an application rollback and errors are still occurring, the introspection module may decide to recommend an application restart.
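
The feedback loop described above, in which the introspection module learns which response was invoked and escalates from rollback to restart if errors persist, might be sketched as follows. The class, the severity levels, and the response table are hypothetical stand-ins for the knowledge base mentioned in the abstract; none of them are taken from the actual FTM code.

```python
from dataclasses import dataclass, field
from enum import IntEnum


class Severity(IntEnum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3


# Hypothetical knowledge base: default response for each severity level.
DEFAULT_RESPONSE = {
    Severity.LOW: "log",
    Severity.MEDIUM: "rollback",
    Severity.HIGH: "swap_node",
}


@dataclass
class Introspection:
    # Last response reported back by the responder, per application node.
    last_response: dict = field(default_factory=dict)

    def recommend(self, node, severity):
        """Recommend a response, escalating if a rollback did not help."""
        response = DEFAULT_RESPONSE[severity]
        if response == "rollback" and self.last_response.get(node) == "rollback":
            response = "restart"  # errors persisted after the rollback
        return response

    def notify(self, node, response):
        """Called by the responder once a response has been invoked."""
        self.last_response[node] = response


ftm = Introspection()
first = ftm.recommend("core-1", Severity.MEDIUM)
ftm.notify("core-1", first)
second = ftm.recommend("core-1", Severity.MEDIUM)
```

Keeping the escalation state per node means that a persistent problem on one core does not change the recommendation for errors reported by another core.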

  14. Development and application of diagnostic systems to achieve fault tolerance

    International Nuclear Information System (INIS)

    Much work is currently being done to develop and apply diagnostic systems that are tolerant to faulted conditions in the process being monitored and in the sensors that measure the critical parameters associated with the process. A fault-tolerant diagnostic system based on state-determination, pattern-recognition techniques is currently undergoing testing and evaluation in certain applications at the EBR-II reactor. Testing and operational experience with the system to date has shown a high degree of tolerance to sensor failures, while being sensitive to very slight changes in the plant operational state. This paper briefly mentions related work being done by others, and describes in more detail the pattern-recognition system and the results of the testing and operational experience with the system at EBR-II. 9 refs., 10 figs

  15. Development and application of diagnostic systems to achieve fault tolerance

    International Nuclear Information System (INIS)

    Much work is currently being done to develop and apply diagnostic systems that are tolerant to faulted conditions in the process being monitored and in the sensors that measure the critical parameters associated with the process. A fault-tolerant diagnostic system based on state-determination, pattern-recognition techniques is currently undergoing testing and evaluation in certain applications at the EBR-II reactor. Testing and operational experience with the system to date has shown a high degree of tolerance to sensor failures, while being sensitive to very slight changes in the plant operational state. This paper briefly mentions related work being done by others, and describes in more detail the pattern-recognition system and the results of the testing and operational experience with the system at EBR-II

  16. A fault tolerant process control system for PHWRs

    International Nuclear Information System (INIS)

    Many computer control applications have stringent requirements for continued correct operation of the control system in the presence of internal faults. There is undoubtedly a great need in the power industry, particularly in the nuclear segment, for reliable control and monitoring systems. One must guarantee safe control and avoid false alarms that can spuriously cause plant shutdown when it is not necessary. This paper discusses the issues involved in the design of fault tolerant systems and describes the features of a fault-tolerant process control system being developed for use in future PHWRs. The processes selected for this computer control application include primary coolant and steam generator pressures. (author). 8 refs., 1 fig

  17. 14 CFR Special Federal Aviation... - Fuel Tank System Fault Tolerance Evaluation Requirements

    Science.gov (United States)

    2010-01-01

    ...Special Federal Aviation Regulation No. 88 - Fuel Tank System Fault Tolerance Evaluation Requirements 1....

  18. Design a Fault Tolerance for Real Time Distributed System

    OpenAIRE

    Khammas, Ban M.

    2012-01-01

    This paper designs a fault tolerance system for soft real time distributed systems (FTRTDS). The system is designed to be independent of the specific mechanisms and facilities of the underlying real time distributed system. It is designed to be distributed across all the computers in the distributed system and controlled by a central unit. Besides gathering information about a target program spontaneously, it provides information about the target operating system and the target hardware in order to diagno...

  19. Fault tolerance of the NIF power conditioning system

    International Nuclear Information System (INIS)

    The tolerance of the circuit topology proposed for the National Ignition Facility (NIF) power conditioning system to specific fault conditions is investigated. A new pulsed power circuit is proposed for the NIF which is simpler and less expensive than previous ICF systems. The inherent fault modes of the new circuit are different from the conventional approach, and must be understood to ensure adequate NIF system reliability. A test-bed which simulates the NIF capacitor module design was constructed to study the circuit design. Measurements from test-bed experiments with induced faults are compared with results from a detailed circuit model. The model is validated by the measurements and used to predict the behavior of the actual NIF module during faults. The model can be used to optimize fault tolerance of the NIF module through an appropriate distribution of circuit inductance and resistance. The experimental and modeling results are presented, and fault performance is compared with the ratings of pulsed power components. Areas are identified which require additional investigation

  20. Fault Tolerance by Replication in Parallel System

    Directory of Open Access Journals (Sweden)

    Madhavi Vaidya

    2011-04-01

    In this paper the author concentrates on the architecture of cluster computers and how they work in the context of parallel paradigms. The author's interest lies in guaranteeing that a node works efficiently and that the data on it is available at any time so that tasks can run in parallel. Running applications may face resource faults during execution. The application must dynamically prepare for, and recover from, the expected failure. Typically, checkpointing is used to minimize the loss of computation. Checkpointing is a purely local strategy, but it can be very costly. Most checkpointing techniques, however, require central storage for storing checkpoints. This results in a bottleneck and severely limits the scalability of checkpointing, while also proving too expensive for dedicated checkpointing networks and storage systems. The author suggests the technique of replication. Replication has been studied for parallel databases in general. The author has worked on the parallel execution of a task on a node; if the node fails, a self-protecting feature should be turned on. Self-protecting in this context means that computer clusters should detect and handle failures automatically with the help of replication.
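
A minimal sketch of the replication idea: if the data a task needs is stored on more than one node, the task can be run on any surviving replica holder without consulting a central checkpoint store. The node names, the in-memory `nodes` dictionary, and the function `run_on_replica` are assumptions made for illustration and do not come from the paper.

```python
def run_on_replica(task, key, nodes, failed):
    """Run `task` on the first live node that holds a replica of `key`.

    `nodes` maps node name -> {key: data}; `failed` is the set of node names
    currently detected as down. Returns (node_name, result).
    """
    for name, store in nodes.items():
        if name in failed or key not in store:
            continue
        return name, task(store[key])
    raise RuntimeError(f"no live replica holds {key!r}")


# Hypothetical two-node cluster, both nodes holding a replica of block "a".
nodes = {
    "n1": {"a": [1, 2, 3]},
    "n2": {"a": [1, 2, 3]},
}
healthy_run = run_on_replica(sum, "a", nodes, failed=set())
failover_run = run_on_replica(sum, "a", nodes, failed={"n1"})
```

The cost of this approach is the extra storage and the work of keeping replicas consistent, which the sketch leaves out.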

  1. Fault-Tolerant Control of a Distributed Database System

    OpenAIRE

    Eva Wu, N.; Ruschmann, Matthew C.; Linderman, Mark H.

    2008-01-01

    Optimal state information-based control policy for a distributed database system subject to server failures is considered. Fault-tolerance is made possible by the partitioned architecture of the system and data redundancy therein. Control actions include restoration of lost data sets in a single server using redundant data sets in the remaining servers, routing of queries to intact servers, or overhaul of the entire system for renewal. Control policies are determined by solving Markov decisio...

  2. A Fault Tolerant System for an Integrated Avionics Sensor Configuration

    Science.gov (United States)

    Caglayan, A. K.; Lancraft, R. E.

    1984-01-01

    An aircraft sensor fault tolerant system methodology for the Transport Systems Research Vehicle in a Microwave Landing System (MLS) environment is described. The fault tolerant system provides reliable estimates in the presence of possible failures both in ground-based navigation aids, and in on-board flight control and inertial sensors. Sensor failures are identified by utilizing the analytic relationships between the various sensors arising from the aircraft point mass equations of motion. The estimation and failure detection performance of the software implementation (called FINDS) of the developed system was analyzed on a nonlinear digital simulation of the research aircraft. Simulation results showing the detection performance of FINDS, using a dual redundant sensor complement, are presented for bias, hardover, null, ramp, increased noise and scale factor failures. In general, the results show that FINDS can distinguish between normal operating sensor errors and failures while providing an excellent detection speed for bias failures in the MLS, indicated airspeed, attitude and radar altimeter sensors.
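
Detection through analytic relationships between sensors can be illustrated with a simple residual test: a measurement is compared with the value predicted from other sensors, and a failure is declared when the averaged difference exceeds a threshold. The abstract does not describe the internals of FINDS, so the moving-average test, the threshold, and the signal values below are assumptions made for illustration.

```python
from collections import deque


def residual_monitor(measured, predicted, threshold, window=5):
    """Return the first sample index at which a sensor failure is declared.

    The residual is measured minus predicted. Averaging over `window` samples
    lets ordinary noise cancel out while a sustained bias accumulates.
    Returns None if no failure is declared.
    """
    recent = deque(maxlen=window)
    for k, (y, y_hat) in enumerate(zip(measured, predicted)):
        recent.append(y - y_hat)
        if len(recent) == window and abs(sum(recent) / window) > threshold:
            return k
    return None


# Hypothetical altitude signal: a bias of 3.0 appears at sample 10.
predicted = [100.0] * 20
measured = [100.0] * 10 + [103.0] * 10
detected_at = residual_monitor(measured, predicted, threshold=2.0)
```

The window length trades detection speed against false alarms: a longer window smooths noise better but delays the declaration of a bias failure.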

  3. Reliable, fault tolerant control systems for nuclear generating stations

    International Nuclear Information System (INIS)

    Two operational features of CANDU Nuclear Power Stations provide for high plant availability. First, the plant re-fuels on-line, thereby eliminating the need for periodic and lengthy refuelling 'outages'. Second, all plants are controlled by real-time computer systems. Later plants are also protected using real-time computer systems. In the past twenty years, the control systems now operating in 21 plants have achieved an availability of 99.8%, making significant contributions to high CANDU plant capacity factors. This paper describes some of the features that ensure the high degree of system fault tolerance and hence high plant availability. The emphasis will be placed on the fault tolerant features of the computer systems included in the latest reactor design - the CANDU 3 (450MWe). (author)

  4. OPTIMAL CHOICE WITHIN A FAULT TOLERANT FLIGHT CONTROL SYSTEM

    Directory of Open Access Journals (Sweden)

    Vasily Kazak

    2013-04-01

    Safety of aircraft during flight is one of the most important problems concerning all of aviation. Failures or faults in the main elements of the automatic control system and damage to the external contour of the aircraft by foreign objects always lead to a change in the characteristics of the aircraft, to direct and indirect economic costs, and sometimes to injury or death of passengers and crew. A real-time active fault tolerant control system makes it possible to warn of or prevent emergency situations and thus improve safety.

  5. Design Optimization of Time-and Cost-Constrained Fault-Tolerant Distributed Embedded Systems

    OpenAIRE

    Izosimov, Viacheslav; Pop, Paul; Eles, Petru; Peng, Zebo

    2005-01-01

    In this paper we present an approach to the design optimization of fault-tolerant embedded systems for safety-critical applications. Processes are statically scheduled and communications are performed using the time-triggered protocol. We use process re-execution and replication for tolerating transient faults. Our design optimization approach decides the mapping of processes to processors and the assignment of fault-tolerant policies to processes such that transient faults are tolerated and ...

  6. Design Optimization of Time- and Cost-Constrained Fault-Tolerant Distributed Embedded Systems

    OpenAIRE

    Izosimov, Viacheslav; Pop, Paul; Eles, Petru; Peng, Zebo

    2005-01-01

    In this paper we present an approach to the design optimization of fault-tolerant embedded systems for safety-critical applications. Processes are statically scheduled and communications are performed using the time-triggered protocol. We use process re-execution and replication for tolerating transient faults. Our design optimization approach decides the mapping of processes to processors and the assignment of fault-tolerant policies to processes such that transient faults are tolerated and ...

  7. Analysis and optimization of fault-tolerant embedded systems with hardened processors

    OpenAIRE

    Izosimov, Viacheslav; Polian, Ilia; Pop, Paul

    2009-01-01

    In this paper we propose an approach to the design optimization of fault-tolerant hard real-time embedded systems, which combines hardware and software fault tolerance techniques. We trade-off between selective hardening in hardware and process reexecution in software to provide the required levels of fault tolerance against transient faults with the lowest-possible system costs. We propose a system failure probability (SFP) analysis that connects the hardening level with the maximum number o...

  8. Fault tolerant control for nonlinear systems subject to different types of sensor faults

    OpenAIRE

    Ichalal, Dalil; Marx, Benoît; Maquin, Didier; Ragot, José

    2011-01-01

    This paper deals with the problem of fault tolerant control of nonlinear systems represented by Takagi-Sugeno models subject to sensor faults. Observer based controllers are designed for each faulty situation (mode). The classical switching law is replaced by a new mechanism which avoids the switching phenomenon. The purpose is to be able to study the stability of the global closed-loop system. This new mechanism uses the residual signals obtained by a residual generator. A bank of observers i...

  9. Fault-tolerant Supervisory Control : System Analysis and Logic Design

    DEFF Research Database (Denmark)

    Izadi-Zamanabadi, Roozbeh

    1999-01-01

    The main purpose of this work has been to achieve active fault-tolerance in control systems, defined as a methodology where fault detection and isolation techniques are combined with supervisory control to achieve autonomous accommodation of faults before they develop into failures. The aim of this work has been to develop and employ concepts and methods that are suitable for use in different automation processes, with applicability in various industrial fields. The requirements for high productivity and quality have resulted in the use of additional instrumentation and more sophisticated control algorithms. The drawback is, however, that these control systems have become more vulnerable to even simple faults in instrumentation. On the other hand, due to cost-optimality requirements, extensive use of hardware redundancy has been prohibited. Nevertheless, dependability and availability could be increased by enhancing the control systems' ability to perform fault detection and reconfiguration on-line when a fault occurs and before a safety system shuts down the entire process. The main contributions of this research effort are development and experimentation with methodologies for systematic analysis of reconfiguration and design of supervisor logic. In addition, useful experience is obtained through implementation of a fault-tolerant control scheme against a simulated ship and its propulsion system. A development methodology, which was suggested in the Control Engineering Department, is extended to cope with the important reconfiguration problem. In order to enable a designer to acquire knowledge about reconfiguration possibilities, the structural analysis method is added as an extension to the existing methodology. This extension builds upon the earlier method where fault propagation and severity analysis are the essential parts.
Structural analysis (SA) enables the designer to distinguish between the parts of the system with no redundant information and the parts with possible redundant information. This method hence provides the designer with information that is necessary during the selection of remedial actions. Furthermore, it is shown how sensor information fusion is obtained by using the SA method. The construction of the supervisor's decision logic is essential for the active form of fault-tolerant control. In this regard, two approaches have been presented. The first aims at constructing the decision logic in the form of a "language". This language is obtained as a direct result of the component-based approach presented in this thesis. This approach is based on the definition of a functional component, component placement in a control system hierarchy and the definition of system level hierarchy. The supervisor language includes all valid strings, representing the combinations of valid components, that keep the system functional. This approach is simple and can be automated. In the second approach, implementation of supervisor functionality is realized on the basis of an extension to the traditional state-event machines. Due to parallelism (inherent modularity) the supervisor logic is more easily modified, updated, maintained, and tested. A salient feature is that a change in one task only necessitates redesign of essentially one corresponding state-event machine (SEM). A heuristic guideline is provided for designing the logic in the form of SEMs. A ship propulsion system benchmark has been designed and used as a case study. This includes experimentation with the above methodologies and implementation of a fault-tolerant control against the simulation. Four generic faults have been considered. It has been shown how the SA method is easily employed to generate analytical redundancy relations, which in turn are then used for FDI purposes.
Three different methods are used to generate residuals. These methods are: simple numerical calculation, a non-linear observer, and a Neuro-Fuzzy method. Employment of each method follows the assumption about the available system information. The results show that it is p

  10. Summarize of Electric Vehicle Electric System Fault and Fault-tolerant Technology

    Directory of Open Access Journals (Sweden)

    Zhang Liwei

    2013-09-01

    The electric vehicle drive system is a multi-variable system that operates in a complex and changeable environment, so its failure modes are complicated. In this paper, a vehicle fault table is established according to where each fault occurs, and the possible consequences and causes of each failure are analyzed. Taking hardware limitations into account while preserving as much system performance as possible, a passive software-redundancy fault-tolerant strategy is put forward, and an example is given to analyze the pros and cons of this method.

  11. Design a Fault Tolerance for Real Time Distributed System

    Directory of Open Access Journals (Sweden)

    Ban M. Khammas

    2012-01-01

    This paper designs a fault tolerance system for soft real-time distributed systems (FTRTDS). The system is designed to be independent of the specific mechanisms and facilities of the underlying real-time distributed system. It is distributed across all the computers in the distributed system and controlled by a central unit. Besides gathering information about a target program spontaneously, it provides information about the target operating system and the target hardware in order to diagnose a fault before it occurs, so that the situation can be handled in advance. It also provides the distributed system with the reactive capability of reconfiguring and reinitializing after the occurrence of a failure.

  12. Scheduling and Optimization of Fault-Tolerant Distributed Embedded Systems

    OpenAIRE

    Izosimov, Viacheslav

    2009-01-01

    Safety-critical applications have to function correctly and deliver a high level of quality-of-service even in the presence of faults. This thesis deals with techniques for tolerating effects of transient and intermittent faults. Re-execution, software replication, and rollback recovery with checkpointing are used to provide the required level of fault tolerance at the software level. Hardening is used to increase the reliability of hardware components. These techniques are considered in the con...
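
The rollback recovery with checkpointing mentioned in this abstract can be illustrated with a minimal sketch. It is not taken from the thesis: the names `TransientFault` and `run_with_checkpoints`, the per-segment checkpoint placement, and the fault budget `max_faults` are assumptions chosen for illustration only.

```python
import copy


class TransientFault(Exception):
    """Raised by a segment when a transient fault is detected (hypothetical)."""


def run_with_checkpoints(state, segments, max_faults):
    """Run segments in order, rolling back to the last checkpoint on a fault.

    Returns (final_state, faults_seen). Re-raises TransientFault once more
    than max_faults faults have been observed.
    """
    faults_seen = 0
    for segment in segments:
        checkpoint = copy.deepcopy(state)  # checkpoint taken before the segment
        while True:
            try:
                # Work on a copy so a partial update is discarded on rollback.
                state = segment(copy.deepcopy(checkpoint))
                break
            except TransientFault:
                faults_seen += 1
                if faults_seen > max_faults:
                    raise
    return state, faults_seen
```

Each segment is re-executed from its own checkpoint rather than from the start of the process, which is the usual motivation for checkpointing over plain re-execution: less work is repeated per fault, at the cost of saving state.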

  13. Quantitative evaluation of the fault tolerance of systems important to the safety of atomic power plants

    International Nuclear Information System (INIS)

    Fault tolerance is the property of a system to preserve its performance upon failures of its components. Thus, in nuclear-reactor technology one has only a qualitative evaluation of fault tolerance - the single-failure criterion, which does not enable one to compare and perform goal-directed design of fault-tolerant systems, and in the field of computer technology there are no generally accepted evaluations of fault tolerance that could be applied effectively to reactor systems. This paper considers alternative evaluations of fault tolerance and a method of comprehensive automated calculation of the reliability and fault tolerance of complex systems. The authors presented quantitative estimates of fault tolerance that develop the single-failure criterion. They have limiting processes that allow simple and graphical standardization. They worked out a method and a program for comprehensive calculation of the reliability and fault tolerance of systems of complex structure that are important to the safety of atomic power plants. The quantitative evaluation of the fault tolerance of these systems exhibits a degree of insensitivity to failures and shows to what extent their reliability is determined by a rigorously defined structure, and to what extent by the probabilistic reliability characteristics of the components. To increase safety, one must increase the fault tolerance of the most important systems of atomic power plants

  14. Implementation of FMFRS (Fault Tolerant Most fitting Resource Scheduling algorithm) in Real time system

    Directory of Open Access Journals (Sweden)

    Harkiran Kaur

    2013-08-01

    In a computational grid, fault tolerance is an imperative issue to be considered during job scheduling. Due to the widespread use of resources, systems are highly prone to errors and failures. Hence fault tolerance plays a key role in the grid in avoiding the problem of unreliability. The two main techniques for implementing fault tolerance in a grid environment are checkpointing and replication. This paper proposes a real-time approach to a replication technique named FMFRS (Fault Tolerant Most Fitting Resource Scheduling algorithm) to improve the fault tolerance of the fittest resource scheduling algorithm. The proposed method improves fault tolerance by scheduling the job in coordination with job replication when the resource has low reliability, checking parameters such as fault tolerance capacity and node reliability. Based on the reliability index of the resource, the resource is identified as critical.
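
The idea of replicating a job when the best-fitting resource is unreliable can be sketched as follows. This is only an illustration and not the FMFRS algorithm itself: the `Resource` fields, the "smallest spare capacity" notion of fit, the single replica, and the threshold of 0.9 are all assumptions that the abstract does not specify.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    name: str
    capacity: float      # hypothetical measure of what the resource can run
    reliability: float   # hypothetical reliability index in [0, 1]


def schedule(job_demand, resources, reliability_threshold=0.9):
    """Return the names of the resources chosen to run the job.

    The best-fitting resource (least spare capacity) is the primary. If its
    reliability index is below the threshold it is treated as critical and
    the job is replicated on the next best-fitting resource.
    """
    candidates = sorted(
        (r for r in resources if r.capacity >= job_demand),
        key=lambda r: r.capacity - job_demand,
    )
    if not candidates:
        return []
    chosen = [candidates[0]]
    if candidates[0].reliability < reliability_threshold and len(candidates) > 1:
        chosen.append(candidates[1])  # replica on the next-fittest resource
    return [r.name for r in chosen]
```

Replicating only when the primary is critical keeps the extra load on the grid limited to the jobs that are most likely to fail.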

  15. Sliding mode based fault detection, reconstruction and fault tolerant control scheme for motor systems.

    Science.gov (United States)

    Mekki, Hemza; Benzineb, Omar; Boukhetala, Djamel; Tadjine, Mohamed; Benbouzid, Mohamed

    2015-07-01

    The fault-tolerant control problem belongs to the domain of complex control systems in which inter-control-disciplinary information and expertise are required. This paper proposes an improved fault detection, reconstruction and fault-tolerant control (FTC) scheme for motor systems (MS) with typical faults. For this purpose, a sliding mode controller (SMC) with an integral sliding surface is adopted. This controller can make the output of the system track the desired position reference signal in finite time and obtain a better dynamic response and anti-disturbance performance. However, this controller cannot deal directly with total system failures. The adopted SMC is therefore combined with a sliding mode observer (SMO); the latter is designed to detect and reconstruct the faults on-line and also to give a sensorless control strategy which can achieve tolerance to a wide class of total additive failures. The closed-loop stability is proved using Lyapunov stability theory. Simulation results in healthy and faulty conditions confirm the reliability of the suggested framework. PMID:25747198

  16. Fault-tolerant reactor protection system

    International Nuclear Information System (INIS)

    A reactor protection system is disclosed having four divisions, with quad redundant sensors for each scram parameter providing input to four independent microprocessor-based electronic chassis. Each electronic chassis acquires the scram parameter data from its own sensor, digitizes the information, and then transmits the sensor reading to the other three electronic chassis via optical fibers. To increase system availability and reduce false scrams, the reactor protection system employs two levels of voting on a need for reactor scram. The electronic chassis perform software divisional data processing, vote 2/3 with spare based upon information from all four sensors, and send the divisional scram signals to the hardware logic panel, which performs a 2/4 division vote on whether or not to initiate a reactor scram. Each chassis makes a divisional scram decision based on data from all sensors. Each division performs independently of the others (asynchronous operation). All communications between the divisions are asynchronous. Each chassis substitutes its own spare sensor reading in the 2/3 vote if a sensor reading from one of the other chassis is faulty or missing. Therefore the presence of at least two valid sensor readings in excess of a set point is required before terminating the output to the hardware logic of a scram inhibition signal even when one of the four sensors is faulty or when one of the divisions is out of service. 16 figs.

  17. Fault-tolerant reactor protection system

    Science.gov (United States)

    Gaubatz, Donald C. (Cupertino, CA)

    1997-01-01

    A reactor protection system having four divisions, with quad redundant sensors for each scram parameter providing input to four independent microprocessor-based electronic chassis. Each electronic chassis acquires the scram parameter data from its own sensor, digitizes the information, and then transmits the sensor reading to the other three electronic chassis via optical fibers. To increase system availability and reduce false scrams, the reactor protection system employs two levels of voting on a need for reactor scram. The electronic chassis perform software divisional data processing, vote 2/3 with spare based upon information from all four sensors, and send the divisional scram signals to the hardware logic panel, which performs a 2/4 division vote on whether or not to initiate a reactor scram. Each chassis makes a divisional scram decision based on data from all sensors. Each division performs independently of the others (asynchronous operation). All communications between the divisions are asynchronous. Each chassis substitutes its own spare sensor reading in the 2/3 vote if a sensor reading from one of the other chassis is faulty or missing. Therefore the presence of at least two valid sensor readings in excess of a set point is required before terminating the output to the hardware logic of a scram inhibition signal even when one of the four sensors is faulty or when one of the divisions is out of service.
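
The two levels of voting described in this abstract can be sketched in a few lines. This is a simplified illustration and not the patented implementation: it assumes each chassis votes on the three readings received from the other chassis, that a faulty or missing reading is represented as `None`, and that a "trip" means a reading above the set point. The function names are hypothetical.

```python
from typing import Optional, Sequence


def divisional_vote(own: Optional[float],
                    others: Sequence[Optional[float]],
                    setpoint: float) -> bool:
    """2-out-of-3 vote with spare, as performed in software by one chassis.

    `others` holds the three readings received from the other chassis.
    The chassis' own reading replaces the first faulty or missing one.
    """
    readings = list(others)
    if len(readings) != 3:
        raise ValueError("expected readings from exactly three other chassis")
    for i, reading in enumerate(readings):
        if reading is None:
            readings[i] = own  # substitute the spare (own) sensor reading
            break
    valid = [r for r in readings if r is not None]
    return sum(r > setpoint for r in valid) >= 2


def hardware_vote(division_trips: Sequence[bool]) -> bool:
    """2-out-of-4 vote of the hardware logic panel on the divisional signals."""
    return sum(division_trips) >= 2
```

With one sensor missing, a chassis still needs two valid readings above the set point before it signals a scram, which matches the requirement stated at the end of the abstract.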

  18. A Ship Propulsion System Model for Fault-tolerant Control

    DEFF Research Database (Denmark)

    Izadi-Zamanabadi, Roozbeh; Blanke, M.

    1998-01-01

    This report presents a propulsion system model for a low speed marine vehicle, which can be used as a test benchmark for Fault-Tolerant Control purposes. The benchmark serves the purpose of offering realistic and challenging problems relevant in both FDI and (autonomous) supervisory control area. The propulsion system model is presented in two versions: the first one consists of one engine and one propeller, and the other one consists of two engines and their corresponding propellers placed in parallel in the ship. The corresponding programs are developed and are available.

  19. Fault Tolerant Software: a Multi Agent System Solution

    DEFF Research Database (Denmark)

    Caponetti, Fabio; Bergantino, Nicola

    2009-01-01

    Development of highly dependable systems remains a labour-intensive task. This paper explores recent advances in the adaptation of the software agent architecture for control applications while looking at dependability issues. Multiple agent systems theory is reviewed, giving methods to supervise such systems. Software ageing is shown to be the most common problem and rejuvenation its countermeasure. The paper shows how an agent population can be monitored, and faulty agents isolated and reloaded in a healthy state, hence rejuvenated. The aim is to propose an architecture as a basis for the design of control software able to tolerate faults and residual bugs without the need for maintenance stops.

  20. Fault Tolerant Operation in Aero Engine Using Distributed Computation System

    Directory of Open Access Journals (Sweden)

    Neela A G

    2014-04-01

    The paper presents fault-tolerant operation in an aero engine based on real-time systems, which are built for a very small set of mission-critical applications such as spacecraft, avionics and other distributed control systems. Modern software deals with external interfaces and has to consider various timing implications. The platform is based on C and developed using the Keil MDK tool, with a targeted deadline of 100 milliseconds at a baud rate of 500 kbps. The CAN interface performs the role of transportation and communication: an interface cable is used for serial communication between the Digital Electronic Control Unit (DECU) and the host to transfer data to the pilot Online Monitoring System, which is based on Laboratory Virtual Instrument Engineering Workbench (LabVIEW 7.1). Fault diagnosis typically assumes a sufficiently large fault signature and enough time for a reliable decision to be reached. However, for a class of safety-critical faults on commercial aircraft engines, prompt detection within a millisecond range is paramount to allow accommodation to avert undesired engine behavior. At the same time, false positives must be avoided to prevent inappropriate control action.

  1. Advanced information processing system: The Army fault tolerant architecture conceptual study. Volume 2: Army fault tolerant architecture design and analysis

    Science.gov (United States)

    Harper, R. E.; Alger, L. S.; Babikyan, C. A.; Butler, B. P.; Friend, S. A.; Ganska, R. J.; Lala, J. H.; Masotto, T. K.; Meyer, A. J.; Morton, D. P.

    1992-01-01

    Described here is the Army Fault Tolerant Architecture (AFTA) hardware architecture and components and the operating system. The architectural and operational theory of the AFTA Fault Tolerant Data Bus is discussed. The test and maintenance strategy developed for use in fielded AFTA installations is presented. An approach to be used in reducing the probability of AFTA failure due to common mode faults is described. Analytical models for AFTA performance, reliability, availability, life cycle cost, weight, power, and volume are developed. An approach is presented for using VHSIC Hardware Description Language (VHDL) to describe and design AFTA's developmental hardware. A plan is described for verifying and validating key AFTA concepts during the Dem/Val phase. Analytical models and partial mission requirements are used to generate AFTA configurations for the TF/TA/NOE and Ground Vehicle missions.

  2. Fault-Tolerant Relative Navigation System (RNS) for Docking Project

    National Aeronautics and Space Administration — A method is proposed to develop a sensor fusion process for blending GPS/IMU/EO data for fault tolerant rendezvous and docking of spacecraft. The methodology takes...

  3. Application-Transparent Fault Tolerance in Distributed Systems

    OpenAIRE

    Becker, Thomas

    1999-01-01

    We present a new software architecture in which all concepts necessary to achieve fault tolerance can be added to an application automatically without any source code changes. As a case study, we consider the problem of providing a reliable service despite node failures by executing a group of replicated servers. Replica creation and management as well as failure detection and recovery are performed automatically by a separate fault tolerance layer (ft-layer) which is inserted between...

  4. Fault-diagnosis applications model-based condition monitoring actuators, drives, machinery, plants, sensors, and fault-tolerant systems

    CERN Document Server

    Isermann, Rolf

    2011-01-01

    Supervision, condition-monitoring, fault detection, fault diagnosis and fault management play an increasing role for technical processes and vehicles in order to improve reliability, availability, maintenance and lifetime. For safety-related processes fault-tolerant systems with redundancy are required in order to reach comprehensive system integrity.   This book is a sequel of the book "Fault-Diagnosis Systems" published in 2006, where the basic methods were described. After a short introduction into fault-detection and fault-diagnosis methods the book shows how these methods can be applie

  5. Aspect-oriented fault tolerance for real-time embedded systems

    OpenAIRE

    Afonso, Francisco; Silva, Carlos A.; Brito, Nuno; Montenegro, Sérgio; Tavares, Adriano

    2008-01-01

    Real-time embedded systems for safety-critical applications have to introduce fault tolerance mechanisms in order to cope with hardware and software errors. Fault tolerance is usually applied by means of redundancy and diversity. Redundant hardware implies the establishment of a distributed system executing a set of fault tolerance strategies by software, and may also employ some form of diversity, by using different variants or versions for the same processing. This paper describes our ap...

  6. On the description of fault-tolerant systems

    International Nuclear Information System (INIS)

    Various demands arising from increasing complexity, together with the availability of new technologies such as the one-chip microcomputer and fiber optics, lead to control systems that are built as decentralized, distributed multi-microcomputer systems. They not only realize new control functions but also open up possibilities to increase availability through fault tolerance. The design, or the selection and layout, of such systems requires a quantitative description of these systems. This is possible on the basis of the set of hardware and software modules of the system by the use of queuing models, reliability nets and diagnostic graphs. This is shown by an example of a practically applied Really Distributed Computer Control System (RDC-System). Computer-aided methods for these system descriptions are emphasized. (orig.)

  7. Design and analysis of reliable and fault-tolerant computer systems

    CERN Document Server

    Abd-El-Barr, Mostafa

    2006-01-01

    Covering both the theoretical and practical aspects of fault-tolerant mobile systems, and fault tolerance and analysis, this book tackles the current issues of reliability-based optimization of computer networks, fault-tolerant mobile systems, and fault tolerance and reliability of high speed and hierarchical networks.The book is divided into six parts to facilitate coverage of the material by course instructors and computer systems professionals. The sequence of chapters in each part ensures the gradual coverage of issues from the basics to the most recent developments. A useful set of refere

  8. Diagnostic software and fault tolerant microprocessor based system architectures

    International Nuclear Information System (INIS)

    In numerous industrial applications including power generation, the availability of electronic systems to perform the tasks assigned has become a major issue. At the same time, the functional complexity of these systems has increased enormously. Fortunately, the arrival of cost effective microprocessor based hardware has given the system designer a cadre of techniques to ensure the desired degree of system integrity and availability. These include: dynamic redundancy, isolation, functional diversity, built-in self-tests, embedded test subsystems, communications, error checking and error correcting codes, etc. The choice among the available techniques is generally heuristic and depends greatly on the structure of major components and systems external to the electronic system itself as well as the postulated faults and their relative frequency. Indiscriminate use of these techniques will inevitably increase cost and reduce maintainability while actually reducing system availability and reliability. The issues and the application of these techniques are discussed by describing recent examples of fault tolerant microprocessor based system architectures which include the Plant Safety Monitoring System, the EAGLE-21 Process Protection System and the Advanced Rod Position Indication System for pressurized water reactors. Each of these systems utilize unique internal architectures that address the reliability, availability, and the communications issues while improving maintainability and man-machine interfaces

  9. The X-38 Spacecraft Fault-Tolerant Avionics System

    Science.gov (United States)

    Kouba, Coy; Buscher, Deborah; Busa, Joseph

    2003-01-01

    In 1995 NASA began an experimental program to develop a reusable crew return vehicle (CRV) for the International Space Station. The purpose of the CRV was threefold: (i) to bring home an injured or ill crewmember; (ii) to bring home the entire crew if the Shuttle fleet was grounded; and (iii) to evacuate the crew in the case of an imminent Station threat (e.g., fire, decompression, etc.). Two approach and landing prototypes and one spacecraft demonstrator (called V201) were built at the Johnson Space Center. A series of increasingly complex ground subsystem tests were completed, and eight successful high-altitude drop tests were achieved to prove the design concept. In this program, an unprecedented amount of commercial-off-the-shelf technology was utilized in the first crewed spacecraft NASA has built since the Shuttle program. Unfortunately, in 2002 the program was canceled due to changing Agency priorities. The vehicle was 80% complete and the program was shut down in such a manner as to preserve design, development, test and engineering data. This paper describes the X-38 V201 fault-tolerant avionics system. Based on Draper Laboratory's Byzantine-resilient fault-tolerant parallel processing system and their "network element" hardware, each flight computer exchanges information on a strict timescale to process input data, compare results, and issue voted vehicle output commands. Major accomplishments achieved in this development include: (i) a space qualified two-fault tolerant design using mostly COTS (hardware and operating system); (ii) a single event upset tolerant network element board; (iii) on-the-fly recovery of a failed processor; (iv) use of synched cache; (v) realignment of memory to bring back a failed channel; (vi) flight code automatically generated from the master measurement list; and (vii) built in-house by a team of civil servants and support contractors.
This paper will present an overview of the avionics system and the hardware implementation, as well as the system software and vehicle command & telemetry functions. Potential improvements and lessons learned on this program are also discussed.

  10. Synthesis of Fault-Tolerant Embedded Systems with Checkpointing and Replication

    OpenAIRE

    Izosimov, Viacheslav; Pop, Paul; Eles, Petru; Peng, Zebo

    2006-01-01

    We present an approach to the synthesis of fault-tolerant hard real-time systems for safety-critical applications. We use checkpointing with rollback recovery and active replication for tolerating transient faults. Processes are statically scheduled and communications are performed using the time-triggered protocol. Our synthesis approach decides the assignment of fault-tolerance policies to processes, the optimal placement of checkpoints and the mapping of processes to processors such that t...

  11. Distributed Adaptive Fault-Tolerant Control of Uncertain Multi-Agent Systems

    OpenAIRE

    Khalili, Mohsen; Zhang, Xiaodong; Polycarpou, Marios M.; Parisini, Thomas; Cao, Yongcan

    2015-01-01

    This paper presents an adaptive fault-tolerant control (FTC) scheme for a class of nonlinear uncertain multi-agent systems. A local FTC scheme is designed for each agent using local measurements and suitable information exchanged between neighboring agents. Each local FTC scheme consists of a fault diagnosis module and a reconfigurable controller module comprised of a baseline controller and two adaptive fault-tolerant controllers activated after fault detection and after fa...

  12. Mapping of Fault-Tolerant Applications with Transparency on Distributed Embedded Systems

    OpenAIRE

    Izosimov, Viacheslav; Pop, Paul; Eles, Petru; Peng, Zebo

    2006-01-01

    In this paper we present an approach for the mapping optimization of fault-tolerant embedded systems for safety-critical applications. Processes and messages are statically scheduled. Process re-execution is used for recovering from multiple transient faults. We call process recovery transparent if it does not affect operation of other processes. Transparent recovery has the advantage of fault containment, improved debuggability and less memory needed to store the fault-tolerant schedules. How...

  13. Modeling the Fault Tolerant Capability of a Flight Control System: An Exercise in SCR Specification

    Science.gov (United States)

    Alexander, Chris; Cortellessa, Vittorio; DelGobbo, Diego; Mili, Ali; Napolitano, Marcello

    2000-01-01

    In life-critical and mission-critical applications, it is important to make provisions for a wide range of contingencies, by providing means for fault tolerance. In this paper, we discuss the specification of a flight control system that is fault tolerant with respect to sensor faults. Redundancy is provided by analytical relations that hold between sensor readings; depending on the conditions, this redundancy can be used to detect, identify and accommodate sensor faults.
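
The detect, identify and accommodate steps mentioned in this abstract can be sketched with the simplest possible analytical relation. The example below is an assumption made for illustration, not the flight control specification from the paper: it supposes three hypothetical sensors that should report the same quantity, so each pairwise difference acts as a residual. The names `diagnose`, `accommodate` and the tolerance `tol` are likewise hypothetical.

```python
from itertools import combinations
from statistics import mean


def diagnose(readings, tol):
    """Return (fault_detected, faulty_sensor_or_None).

    A residual is the absolute difference between two readings. A fault is
    detected if any residual exceeds tol; the faulty sensor is identified
    only if exactly one sensor takes part in every violated residual.
    """
    names = sorted(readings)
    violated = [
        (a, b) for a, b in combinations(names, 2)
        if abs(readings[a] - readings[b]) > tol
    ]
    if not violated:
        return False, None
    suspects = set(names)
    for pair in violated:
        suspects &= set(pair)
    faulty = suspects.pop() if len(suspects) == 1 else None
    return True, faulty


def accommodate(readings, faulty):
    """Replace the faulty reading with the mean of the remaining readings."""
    healthy = [v for k, v in readings.items() if k != faulty]
    repaired = dict(readings)
    repaired[faulty] = mean(healthy)
    return repaired
```

Whether a detected fault can also be identified depends on how many relations each sensor takes part in, which corresponds to the abstract's remark that the use of redundancy depends on the conditions.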

  14. Investigation of an advanced fault tolerant integrated avionics system

    Science.gov (United States)

    Dunn, W. R.; Cottrell, D.; Flanders, J.; Javornik, A.; Rusovick, M.

    1986-01-01

    Presented is an advanced, fault-tolerant multiprocessor avionics architecture as could be employed in an advanced rotorcraft such as LHX. The processor structure is designed to interface with existing digital avionics systems and concepts including the Army Digital Avionics System (ADAS) cockpit/display system, navaid and communications suites, integrated sensing suite, and the Advanced Digital Optical Control System (ADOCS). The report defines mission, maintenance and safety-of-flight reliability goals as might be expected for an operational LHX aircraft. Based on use of a modular, compact (16-bit) microprocessor card family, results of a preliminary study examining simplex, dual and standby-sparing architectures are presented. Given the stated constraints, it is shown that the dual architecture is best suited to meet reliability goals with minimum hardware and software overhead. The report presents hardware and software design considerations for realizing the architecture including redundancy management requirements and techniques as well as verification and validation needs and methods.

  15. Fault-tolerance in Two-dimensional Topological Systems

    Science.gov (United States)

    Anderson, Jonas T.

    This thesis is a collection of ideas with the general goal of building, at least in the abstract, a local fault-tolerant quantum computer. The connection between quantum information and topology has proven to be an active area of research in several fields. The introduction of the toric code by Alexei Kitaev demonstrated the usefulness of topology for quantum memory and quantum computation. Many quantum codes used for quantum memory are modeled by spin systems on a lattice, with operators that extract syndrome information placed on vertices or faces of the lattice. It is natural to wonder whether the useful codes in such systems can be classified. This thesis presents work that leverages ideas from topology and graph theory to explore the space of such codes. Homological stabilizer codes are introduced and it is shown that, under a set of reasonable assumptions, any qubit homological stabilizer code is equivalent to either a toric code or a color code. Additionally, the toric code and the color code correspond to distinct classes of graphs. Many systems have been proposed as candidate quantum computers. It is very desirable to design quantum computing architectures with two-dimensional layouts and low complexity in parity-checking circuitry. Kitaev's surface codes provided the first example of codes satisfying this property. They provided a new route to fault tolerance with more modest overheads and thresholds approaching 1%. The recently discovered color codes share many properties with the surface codes, such as the ability to perform syndrome extraction locally in two dimensions. Some families of color codes admit a transversal implementation of the entire Clifford group. This work investigates color codes on the 4.8.8 lattice known as triangular codes. I develop a fault-tolerant error-correction strategy for these codes in which repeated syndrome measurements on this lattice generate a three-dimensional space-time combinatorial structure. 
I then develop an integer program that analyzes this structure and determines the most likely set of errors consistent with the observed syndrome values. I implement this integer program to find the threshold for depolarizing noise on small versions of these triangular codes. Because the threshold for magic-state distillation is likely to be higher than this value and because logical CNOT gates can be performed by code deformation in a single block instead of between pairs of blocks, the threshold for fault-tolerant quantum memory for these codes is also the threshold for fault-tolerant quantum computation with them. Since the advent of a threshold theorem for quantum computers much has been improved upon. Thresholds have increased, architectures have become more local, and gate sets have been simplified. The overhead for magic-state distillation has been studied, but not nearly to the extent of the aforementioned topics. A method for greatly reducing this overhead, known as reusable magic states, is studied here. While examples of reusable magic states exist for Clifford gates, I give strong reasons to believe they do not exist for non-Clifford gates.

  16. Reliability modeling of digital component in plant protection system with various fault-tolerant techniques

    Energy Technology Data Exchange (ETDEWEB)

    Kim, Bo Gyung, E-mail: bogyungkim@kaist.ac.kr [Department of Nuclear and Quantum Engineering, Korea Advanced Institute of Science and Technology, 291 Daehak-ro, Yuseong-gu, Daejeon 305-701 (Korea, Republic of); Kang, Hyun Gook [Department of Nuclear and Quantum Engineering, Korea Advanced Institute of Science and Technology, 291 Daehak-ro, Yuseong-gu, Daejeon 305-701 (Korea, Republic of); Department of Nuclear Engineering, Khalifa University of Science, Technology and Research, Abu Dhabi (United Arab Emirates); Kim, Hee Eun [Department of Nuclear and Quantum Engineering, Korea Advanced Institute of Science and Technology, 291 Daehak-ro, Yuseong-gu, Daejeon 305-701 (Korea, Republic of); Lee, Seung Jun [Integrated Safety Assessment Team, Korea Atomic Energy Research Institute, 1045, Daedeok-daero, Daejeon 305-353 (Korea, Republic of); Seong, Poong Hyun [Department of Nuclear and Quantum Engineering, Korea Advanced Institute of Science and Technology, 291 Daehak-ro, Yuseong-gu, Daejeon 305-701 (Korea, Republic of)

    2013-12-15

    Highlights: • Integrated fault coverage is introduced to reflect the characteristics of fault-tolerant techniques in the reliability model of digital protection systems in nuclear power plants (NPPs). • The integrated fault coverage considers the process of fault-tolerant techniques from detection to fail-safe generation. • With integrated fault coverage, the unavailability of a repairable component of a DPS can be estimated. • The newly developed reliability model can reveal the effects of fault-tolerant techniques explicitly for risk analysis. • The reliability model makes it possible to confirm changes in unavailability as diverse factors vary. - Abstract: With the improvement of digital technologies, digital protection systems (DPSs) incorporate multiple sophisticated fault-tolerant techniques (FTTs) in order to increase fault detection and to help the system safely perform the required functions in spite of the possible presence of faults. Fault detection coverage is a vital factor of FTTs in reliability. However, fault detection coverage alone is insufficient to reflect the effects of various FTTs in the reliability model. To reflect the characteristics of FTTs in the reliability model, integrated fault coverage is introduced. The integrated fault coverage considers the process of an FTT from detection to fail-safe generation. A model has been developed to estimate the unavailability of a repairable component of a DPS using the integrated fault coverage. The newly developed model can quantify unavailability under a variety of conditions. Sensitivity studies are performed to identify the important variables that affect the integrated fault coverage and unavailability.

  17. Reliability modeling of digital component in plant protection system with various fault-tolerant techniques

    International Nuclear Information System (INIS)

    Highlights: • Integrated fault coverage is introduced to reflect the characteristics of fault-tolerant techniques in the reliability model of digital protection systems in nuclear power plants (NPPs). • The integrated fault coverage considers the process of fault-tolerant techniques from detection to fail-safe generation. • With integrated fault coverage, the unavailability of a repairable component of a DPS can be estimated. • The newly developed reliability model can reveal the effects of fault-tolerant techniques explicitly for risk analysis. • The reliability model makes it possible to confirm changes in unavailability as diverse factors vary. - Abstract: With the improvement of digital technologies, digital protection systems (DPSs) incorporate multiple sophisticated fault-tolerant techniques (FTTs) in order to increase fault detection and to help the system safely perform the required functions in spite of the possible presence of faults. Fault detection coverage is a vital factor of FTTs in reliability. However, fault detection coverage alone is insufficient to reflect the effects of various FTTs in the reliability model. To reflect the characteristics of FTTs in the reliability model, integrated fault coverage is introduced. The integrated fault coverage considers the process of an FTT from detection to fail-safe generation. A model has been developed to estimate the unavailability of a repairable component of a DPS using the integrated fault coverage. The newly developed model can quantify unavailability under a variety of conditions. Sensitivity studies are performed to identify the important variables that affect the integrated fault coverage and unavailability.

  18. FTAPE: A fault injection tool to measure fault tolerance

    Science.gov (United States)

    Tsai, Timothy K.; Iyer, Ravishankar K.

    1995-01-01

    The paper introduces FTAPE (Fault Tolerance And Performance Evaluator), a tool that can be used to compare fault-tolerant computers. The tool combines system-wide fault injection with a controllable workload. A workload generator is used to create high stress conditions for the machine. Faults are injected based on this workload activity in order to ensure a high level of fault propagation. The errors/fault ratio and performance degradation are presented as measures of fault tolerance.

  19. Energy-Aware Synthesis of Fault-Tolerant Schedules for Real-Time Distributed Embedded Systems

    DEFF Research Database (Denmark)

    Poulsen, Kåre Harbo; Pop, Paul

    2007-01-01

    This paper presents a design optimisation tool for distributed embedded real-time systems that 1) decides mapping, fault-tolerance policy and generates a fault-tolerant schedule, 2) is targeted for hard real-time, 3) has hard reliability goal, 4) generates static schedule for processes and messages, 5) provides fault-tolerance for k transient/soft faults, 6) optimises for minimal energy consumption, while considering impact of lowering voltages on the probability of faults, 7) uses constraint logic programming (CLP) based implementation.

  20. A fault tolerant superheat control strategy for supermarket refrigeration systems

    DEFF Research Database (Denmark)

    Vinther, Kasper; Izadi-Zamanabadi, Roozbeh

    2013-01-01

    In this paper, a fault tolerant control (FTC) strategy is proposed for evaporator superheat control in supermarket refrigeration systems. Conventional control uses a pressure sensor and a temperature sensor for this purpose; however, the pressure sensor can fail. A contingency control strategy, based on a maximum slope-seeking control method and only a single temperature sensor, is developed to drive the evaporator outlet temperature to a level that gives a suitable superheat of the refrigerant. The FTC strategy requires no a priori system knowledge or additional hardware and functions in a plug & play fashion. The strategy is outlined by means of procedural steps as well as a flow chart that also illustrates the process of automatic tuning of the maximum slope-seeking controller. Test results are furthermore presented for a display case in a full scale CO2 supermarket refrigeration system.

  1. Reactive system verification case study: Fault-tolerant transputer communication

    Science.gov (United States)

    Crane, D. Francis; Hamory, Philip J.

    1993-01-01

    A reactive program is one which engages in an ongoing interaction with its environment. A system which is controlled by an embedded reactive program is called a reactive system. Examples of reactive systems are aircraft flight management systems, bank automatic teller machine (ATM) networks, airline reservation systems, and computer operating systems. Reactive systems are often naturally modeled (for logical design purposes) as a composition of autonomous processes which progress concurrently and which communicate to share information and/or to coordinate activities. Formal (i.e., mathematical) frameworks for system verification are tools used to increase the users' confidence that a system design satisfies its specification. A framework for reactive system verification includes formal languages for system modeling and for behavior specification and decision procedures and/or proof-systems for verifying that the system model satisfies the system specifications. Using the Ostroff framework for reactive system verification, an approach to achieving fault-tolerant communication between transputers was shown to be effective. The key components of the design, the decoupler processes, may be viewed as discrete-event-controllers introduced to constrain system behavior such that system specifications are satisfied. The Ostroff framework was also effective. The expressiveness of the modeling language permitted construction of a faithful model of the transputer network. The relevant specifications were readily expressed in the specification language. The set of decision procedures provided was adequate to verify the specifications of interest. The need for improved support for system behavior visualization is emphasized.

  2. Evaluation of digital fault-tolerant architectures for nuclear power plant control systems

    International Nuclear Information System (INIS)

    Four fault tolerant architectures were evaluated for their potential reliability in service as control systems of nuclear power plants. The reliability analyses showed that human- and software-related common cause failures and single points of failure in the output modules are dominant contributors to system unreliability. The four architectures are triple-modular-redundant (TMR), both synchronous and asynchronous, and also dual synchronous and asynchronous. The evaluation includes a review of design features, an analysis of the importance of coverage, and reliability analyses of fault tolerant systems. An advantage of fault-tolerant controllers over those that are not fault tolerant is that fault-tolerant controllers continue to function after the occurrence of most single hardware faults. However, most fault-tolerant controllers have single hardware components that will cause system failure, almost all controllers have single points of failure in software, and all are subject to common cause failures. Reliability analyses based on data from several industries that have fault-tolerant controllers were used to estimate the mean-time-between-failures of fault-tolerant controllers and to predict those failure modes that may be important in nuclear power plants. 7 refs., 4 tabs
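
    As a rough illustration of the triple-modular-redundant (TMR) idea named in this record, the sketch below shows a 2-of-3 majority voter and the textbook reliability expression 3R^2 - 2R^3. It assumes three independent modules of equal reliability R and a perfect voter; the function names are hypothetical and nothing here is taken from the report itself. These assumptions ignore the common cause failures and single points of failure that the abstract identifies as dominant.

```python
from itertools import product


def majority(a: int, b: int, c: int) -> int:
    """Bitwise 2-of-3 majority vote over three module outputs."""
    return (a & b) | (a & c) | (b & c)


def tmr_reliability(r: float) -> float:
    """Closed form: probability that at least 2 of 3 independent modules work.

    Assumes identical module reliability r and a perfect voter.
    """
    return 3 * r**2 - 2 * r**3


def tmr_reliability_enumerated(r: float) -> float:
    """Same quantity, computed by enumerating all 8 working/failed states."""
    total = 0.0
    for states in product([0, 1], repeat=3):
        if sum(states) >= 2:  # system survives with two or three good modules
            p = 1.0
            for s in states:
                p *= r if s else (1 - r)
            total += p
    return total


if __name__ == "__main__":
    # One faulty module (the third) is outvoted by the other two.
    print(majority(1, 1, 0))
    print(tmr_reliability(0.9), tmr_reliability_enumerated(0.9))
```

    Under these idealized assumptions TMR only helps when R exceeds 0.5; below that, three unreliable modules are worse than one.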

  3. Fault diagnosis and fault-tolerant control strategies for non-linear systems analytical and soft computing approaches

    CERN Document Server

    Witczak, Marcin

    2014-01-01

    This book presents selected fault diagnosis and fault-tolerant control strategies for non-linear systems in a unified framework. In particular, starting from advanced state estimation strategies up to modern soft computing, the discrete-time description of the system is employed. Part I of the book presents original research results regarding state estimation and neural networks for robust fault diagnosis. Part II is devoted to the presentation of integrated fault diagnosis and fault-tolerant systems. It starts with a general fault-tolerant control framework, which is then extended by introducing robustness with respect to various uncertainties. Finally, it is shown how to implement the proposed framework for fuzzy systems described by the well-known Takagi–Sugeno models. This research monograph is intended for researchers, engineers, and advanced postgraduate students in control and electrical engineering, computer science, as well as mechanical and chemical engineering.

  4. Observer based actuator fault tolerant control for nonlinear Takagi-Sugeno systems : an LMI approach

    OpenAIRE

    Ichalal, Dalil; Marx, Benoît; Ragot, José; Maquin, Didier

    2010-01-01

    A new actuator fault tolerant control strategy is proposed in this paper for nonlinear Takagi-Sugeno (T-S) systems. The control law aims to compensate for the actuator faults and allows the system states to track reference states corresponding to the output of the system in the fault-free situation. The design of such a control law requires knowledge of the faults; this task is achieved with a proportional integral observer (PIO). The robust stability of the system with the fault tolerant c...

  5. Fault-tolerant for Electric Vehicles Drive System Sensor Failure

    Directory of Open Access Journals (Sweden)

    Zhang Liwei

    2013-10-01

    When an EV fault occurs, a fault-tolerant method is needed to ensure people's safety. When the current sensor and speed sensor fail, a software fault-tolerant control strategy based on algorithm switching can be used. This paper presents a theoretical analysis of switching from the rotor field-oriented vector control algorithm to the open-loop constant V/F control algorithm; a phase angle compensation method is used to reduce the current and torque shock, and simulation is carried out in MATLAB/Simulink.

  6. Boolean Logic with Fault Tolerant Coding

    OpenAIRE

    Alagoz, B. Baykant

    2009-01-01

    Error-detecting and error-correcting coding in Hamming space was investigated to discover possible fault tolerant coding constellations that can implement Boolean logic with a fault tolerant property. Basic logic operators of Boolean algebra were developed to apply fault tolerant coding in logic circuits. It was shown that the application of three-bit fault tolerant codes gave the digital system an auto-recovery capability without the need for designing additional-fault ...
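
    The truncated abstract does not give the paper's actual code constellation, so the following is only a minimal sketch of the general idea, assuming the simplest three-bit code (each logical bit repeated three times) and nearest-codeword decoding by Hamming distance. The names `encode`, `decode` and `ft_and` are hypothetical.

```python
def encode(bit: int) -> tuple:
    """Map a logical bit to its three-bit codeword (000 or 111)."""
    return (bit, bit, bit)


def hamming(u: tuple, v: tuple) -> int:
    """Number of positions in which two words differ."""
    return sum(x != y for x, y in zip(u, v))


def decode(word: tuple) -> int:
    """Return the logical bit whose codeword is nearest in Hamming distance."""
    return min((0, 1), key=lambda b: hamming(word, encode(b)))


def ft_and(u: tuple, v: tuple) -> tuple:
    """AND on coded operands: correct each input, compute, re-encode."""
    return encode(decode(u) & decode(v))


if __name__ == "__main__":
    corrupted = (1, 0, 1)  # codeword 111 with its middle bit flipped
    print(decode(corrupted))
    print(ft_and(corrupted, encode(1)))
```

    With this code any single flipped bit per word is corrected before the logic operation is applied; two flips in the same word would be decoded to the wrong bit.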

  7. Design and Assessment of a Multiple Sensor Fault Tolerant Robust Control System

    OpenAIRE

    Yang, S. S.; Chen, J.

    2008-01-01

    This paper presents an enhanced robust control design structure to realise fault tolerance towards sensor faults suitable for multi-input-multi-output (MIMO) systems implementation. The proposed design permits fault detection and controller elements to be designed with considerations to stability and robustness towards uncertainties besides multiple faults environment on a common mathematical platform. This framework can also cater to systems requiring fast responses. A design example is illu...

  8. Analysis and optimization of fault-tolerant embedded systems with hardened processors

    DEFF Research Database (Denmark)

    Izosimov, Viacheslav; Polian, Ilia

    2009-01-01

    In this paper we propose an approach to the design optimization of fault-tolerant hard real-time embedded systems, which combines hardware and software fault tolerance techniques. We trade-off between selective hardening in hardware and process reexecution in software to provide the required levels of fault tolerance against transient faults with the lowest-possible system costs. We propose a system failure probability (SFP) analysis that connects the hardening level with the maximum number of reexecutions in software. We present design optimization heuristics, to select the fault-tolerant architecture and decide process mapping such that the system cost is minimized, deadlines are satisfied, and the reliability requirements are fulfilled.

  9. Synthesis of Fault-Tolerant Embedded Systems with Checkpointing and Replication

    DEFF Research Database (Denmark)

    Izosimov, Viacheslav; Pop, Paul

    2006-01-01

    We present an approach to the synthesis of fault-tolerant hard real-time systems for safety-critical applications. We use checkpointing with rollback recovery and active replication for tolerating transient faults. Processes are statically scheduled and communications are performed using the time-triggered protocol. Our synthesis approach decides the assignment of fault-tolerance policies to processes, the optimal placement of checkpoints and the mapping of processes to processors such that transient faults are tolerated and the timing constraints of the application are satisfied. We present several synthesis algorithms which are able to find fault-tolerant implementations given a limited amount of resources. The developed algorithms are evaluated using extensive experiments, including a real-life example.
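
    To make the checkpoint-placement trade-off concrete, here is a minimal sketch under a simplified, textbook-style model that is an assumption of this example and not the synthesis algorithm of the paper: a process with worst-case execution time `c` is split into `n` equal segments, each checkpoint costs `chi`, each recovery costs `mu`, and up to `k` transient faults each force one segment to be re-executed. All parameter names and values are hypothetical.

```python
def worst_case_time(c: float, n: int, chi: float, mu: float, k: int) -> float:
    """Worst-case completion time with n checkpoints and k faults.

    Fault-free part: c + n * chi.
    Each fault adds a recovery overhead mu plus one re-executed segment c / n.
    """
    return c + n * chi + k * (c / n + mu)


def best_checkpoints(c: float, chi: float, mu: float, k: int, max_n: int = 50):
    """Brute-force search for the checkpoint count with the smallest worst case."""
    best_n = min(range(1, max_n + 1),
                 key=lambda n: worst_case_time(c, n, chi, mu, k))
    return best_n, worst_case_time(c, best_n, chi, mu, k)


if __name__ == "__main__":
    # Hypothetical numbers: 60 ms process, 5 ms per checkpoint and per recovery,
    # tolerating k = 2 transient faults.
    print(worst_case_time(60, 1, 5, 5, 2))  # a single checkpoint
    print(best_checkpoints(60, 5, 5, 2))
```

    More checkpoints shorten the segment that must be re-executed after a fault but add overhead to every run, which is why the worst case has a minimum at an intermediate count.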

  10. Design Optimization of Time- and Cost-Constrained Fault-Tolerant Distributed Embedded Systems

    DEFF Research Database (Denmark)

    Izosimov, Viacheslav; Pop, Paul

    2005-01-01

    In this paper we present an approach to the design optimization of fault-tolerant embedded systems for safety-critical applications. Processes are statically scheduled and communications are performed using the time-triggered protocol. We use process re-execution and replication for tolerating transient faults. Our design optimization approach decides the mapping of processes to processors and the assignment of fault-tolerant policies to processes such that transient faults are tolerated and the timing constraints of the application are satisfied. We present several heuristics which are able to find fault-tolerant implementations given a limited amount of resources. The developed algorithms are evaluated using extensive experiments, including a real-life example.

  11. Design of a fault-tolerant decision-making system for biomedical applications.

    Science.gov (United States)

    Faust, Oliver; Acharya, U Rajendra; Sputh, Bernhard H C; Tamura, Toshiyo

    2013-01-01

    This paper describes the design of a fault-tolerant classification system for medical applications. The design process follows the systems engineering methodology: in the agreement phase, we make the case for fault tolerance in diagnosis systems for biomedical applications. The argument extends the idea that machine diagnosis systems mimic the functionality of human decision-making, but in many cases they do not achieve the fault tolerance of the human brain. After making the case for fault tolerance, both requirements and specification for the fault-tolerant system are introduced before the implementation is discussed. The system is tested with fault and use cases to build up trust in the implemented system. This structured approach aided in the realisation of the fault-tolerant classification system. During the specification phase, we produced a formal model that enabled us to discuss what fault tolerance, reliability and safety mean for this particular classification system. Furthermore, such a formal basis for discussion is extremely useful during the initial stages of the design, because it helps to avoid big mistakes caused by a lack of overview later on in the project. During the implementation, we practiced component reuse by incorporating a reliable classification block, which was developed during a previous project, into the current design. Using a well-structured approach and practicing component reuse we follow best practice for both research and industry projects, which enabled us to realise the fault-tolerant classification system on time and within budget. This system can serve in a wide range of future health care systems. PMID:22288838

  12. Software-Implemented Fault Tolerance in Communications Systems

    Science.gov (United States)

    Gantenbein, Rex E.

    1994-01-01

    Software-implemented fault tolerance (SIFT) is used in many computer-based command, control, and communications (C(3)) systems to provide the nearly continuous availability that they require. In the communications subsystem of Space Station Alpha, SIFT algorithms are used to detect and recover from failures in the data and command link between the Station and its ground support. The paper presents a review of these algorithms and discusses how such techniques can be applied to similar systems found in applications such as manufacturing control, military communications, and programmable devices such as pacemakers. With support from the Tracking and Communication Division of NASA's Johnson Space Center, researchers at the University of Wyoming are developing a testbed for evaluating the effectiveness of these algorithms prior to their deployment. This testbed will be capable of simulating a variety of C(3) system failures and recording the response of the Space Station SIFT algorithms to these failures. The design of this testbed and the applicability of the approach in other environments is described.

  13. Design Optimization of Time- and Cost-Constrained Fault-Tolerant Embedded Systems with Checkpointing and Replication

    OpenAIRE

    Pop, Paul; Izosimov, Viacheslav; Eles, Petru; Peng, Zebo

    2009-01-01

    We present an approach to the synthesis of fault-tolerant hard real-time systems for safety-critical applications. We use checkpointing with rollback recovery and active replication for tolerating transient faults. Processes and communications are statically scheduled. Our synthesis approach decides the assignment of fault-tolerance policies to processes, the optimal placement of checkpoints and the mapping of processes to processors such that multiple transient faults are tolerated and the t...

  14. Design and Assessment of a Multiple Sensor Fault Tolerant Robust Control System

    Directory of Open Access Journals (Sweden)

    J. Chen

    2008-03-01

    This paper presents an enhanced robust control design structure to realise fault tolerance towards sensor faults suitable for multi-input-multi-output (MIMO) systems implementation. The proposed design permits fault detection and controller elements to be designed with considerations to stability and robustness towards uncertainties besides multiple faults environment on a common mathematical platform. This framework can also cater to systems requiring fast responses. A design example is illustrated with a fast, multivariable and unstable system, that is, the double inverted pendulum system. Results indicate the potential of this design framework to handle fast systems with multiple sensor faults.

  15. Fault-tolerance memory system architecture for radiation induced errors

    Science.gov (United States)

    White, J. B., Jr.

    1982-01-01

    A fault-tolerant memory (FTM) architecture is presented which can be used to overcome soft memory errors induced by alpha particles, cosmic radiation, or other random sources. The characteristics of the FTM are presented, a mathematical model is developed, and numerical examples are considered to illustrate the effectiveness of the approach. The FTM architecture has been incorporated in the NASA Standard Spacecraft Computer (NSSC-II) which will be employed in a variety of future space payloads and experiments.

  16. Fault tolerance of artificial neural networks with applications in critical systems

    Science.gov (United States)

    Protzel, Peter W.; Palumbo, Daniel L.; Arras, Michael K.

    1992-01-01

    This paper investigates the fault tolerance characteristics of time continuous recurrent artificial neural networks (ANN) that can be used to solve optimization problems. The principle of operations and performance of these networks are first illustrated by using well-known model problems like the traveling salesman problem and the assignment problem. The ANNs are then subjected to 13 simultaneous 'stuck at 1' or 'stuck at 0' faults for network sizes of up to 900 'neurons'. The effects of these faults are demonstrated and the cause for the observed fault tolerance is discussed. An application is presented in which a network performs a critical task for a real-time distributed processing system by generating new task allocations during the reconfiguration of the system. The performance degradation of the ANN under the presence of faults is investigated by large-scale simulations, and the potential benefits of delegating a critical task to a fault tolerant network are discussed.

  17. A Novel Fault Tolerant Reversible Gate For Nanotechnology Based Systems

    Directory of Open Access Journals (Sweden)

    Majid Haghparast

    2008-01-01

    This paper proposes a novel reversible logic gate, NFT. It is a parity-preserving reversible logic gate, that is, the parity of the outputs matches that of the inputs. We demonstrate that the NFT gate can implement all Boolean functions. It renders a wide class of circuit faults readily detectable at the circuit's outputs. The proposed parity-preserving reversible gate allows any fault that affects no more than a single signal to be detected at the circuit's primary outputs. The NFT gate can be used to build fault tolerant reversible logic circuits. We demonstrate how the well-known and very useful Toffoli gate can be synthesized from only two parity-preserving reversible gates. We show that our proposed parity-preserving Toffoli gate is much better than its existing counterpart in terms of the number of reversible gates, the number of garbage outputs and hardware complexity.
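
    The abstract does not define the NFT gate's truth table, so the sketch below does not attempt to reproduce it. As a stand-in, it checks the two properties the abstract names (reversibility and parity preservation) on the well-known Fredkin (controlled-swap) gate, and shows that the ordinary three-input Toffoli gate is reversible but not parity preserving. The helper names are hypothetical.

```python
from itertools import product


def fredkin(a: int, b: int, c: int) -> tuple:
    """Controlled swap: if a is 1, swap b and c; a passes through unchanged."""
    return (a, c, b) if a else (a, b, c)


def toffoli(a: int, b: int, c: int) -> tuple:
    """Controlled-controlled-NOT: flip c when both a and b are 1."""
    return (a, b, c ^ (a & b))


def is_reversible(gate) -> bool:
    """A 3-bit gate is reversible if it maps the 8 inputs to 8 distinct outputs."""
    outputs = {gate(*bits) for bits in product((0, 1), repeat=3)}
    return len(outputs) == 8


def is_parity_preserving(gate) -> bool:
    """XOR of the outputs must equal XOR of the inputs for every input."""
    for bits in product((0, 1), repeat=3):
        out = gate(*bits)
        if sum(bits) % 2 != sum(out) % 2:
            return False
    return True


if __name__ == "__main__":
    for gate in (fredkin, toffoli):
        print(gate.__name__, is_reversible(gate), is_parity_preserving(gate))
```

    In a circuit built only from parity-preserving gates, a fault that flips a single signal changes the overall parity, so comparing input and output parity exposes it.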

  18. Fault tolerant control for nonlinear systems described by Takagi-Sugeno models

    OpenAIRE

    Kheder, Atef; Ben Othman, Kamel; Mohamed BENREJEB; Maquin, Didier

    2010-01-01

    In this paper the problem of active fault tolerant control (FTC) in noisy systems is studied. The proposed FTC strategy is based on knowledge of the fault estimate and of the error between the faulty system state and a reference system state. A proportional integral observer is used in order to estimate the state and the actuator faults. The obtained results are then extended to nonlinear systems described by nonlinear Takagi-Sugeno models. The problem of conception of the proportional integral ...

  19. A Piecewise Affine Hybrid Systems Approach to Fault Tolerant Satellite Formation Control

    DEFF Research Database (Denmark)

    Grunnet, Jacob Deleuran; Larsen, Jesper Abildgaard

    2008-01-01

    In this paper a procedure for modelling satellite formations including failure dynamics as a piecewise-affine hybrid system is shown. The formulation enables recently developed methods and tools for control and analysis of piecewise-affine systems to be applied, leading to synthesis of fault tolerant controllers and analysis of the system behaviour given possible faults. The method is illustrated using a simple example involving two satellites trying to reach a specific formation despite actuator faults occurring.

  20. Evaluation of digital fault-tolerant architectures for nuclear power plant control systems

    International Nuclear Information System (INIS)

    This paper reports on four fault-tolerant architectures that were evaluated for their potential reliability in service as control systems of nuclear power plants. The reliability analyses showed that human- and software-related common cause failures and single points of failure in the output modules are dominant contributors to system unreliability. The four architectures are triple-modular-redundant, both synchronous and asynchronous, and also dual synchronous and asynchronous. The evaluation includes a review of design features, an analysis of the importance of coverage, and reliability analyses of fault-tolerant systems. Reliability analyses based on data from several industries that have fault-tolerant controllers were used to estimate the mean-time-between-failures of fault-tolerant controllers and to predict those failure modes that may be important in nuclear power plants

  1. Award ER25750: Coordinated Infrastructure for Fault Tolerance Systems Indiana University Final Report

    Energy Technology Data Exchange (ETDEWEB)

    Lumsdaine, Andrew

    2013-03-08

    The main purpose of the Coordinated Infrastructure for Fault Tolerance in Systems initiative has been to conduct research with a goal of providing end-to-end fault tolerance on a systemwide basis for applications and other system software. While fault tolerance has been an integral part of most high-performance computing (HPC) system software developed over the past decade, it has been treated mostly as a collection of isolated stovepipes. Visibility and response to faults have typically been limited to the particular hardware and software subsystems in which they are initially observed. Little fault information is shared across subsystems, allowing little flexibility or control on a system-wide basis, making it practically impossible to provide cohesive end-to-end fault tolerance in support of scientific applications. As an example, consider faults such as communication link failures that can be seen by a network library but are not directly visible to the job scheduler, or consider faults related to node failures that can be detected by system monitoring software but are not inherently visible to the resource manager. If information about such faults could be shared by the network libraries or monitoring software, then other system software, such as a resource manager or job scheduler, could ensure that failed nodes or failed network links were excluded from further job allocations and that further diagnosis could be performed. As a founding member and one of the lead developers of the Open MPI project, our efforts over the course of this project have been focused on making Open MPI more robust to failures by supporting various fault tolerance techniques, and using fault information exchange and coordination between MPI and the HPC system software stack, from the application, numeric libraries, and programming language runtime to other common system components such as job schedulers, resource managers, and monitoring tools.

  2. Fault tolerant control design of nonlinear systems using LMI gain synthesis

    OpenAIRE

    Rodrigues, Mickael; Theilliol, Didier; Sauter, Dominique

    2005-01-01

    In this paper, an active Fault Tolerant Control (FTC) strategy is developed for nonlinear systems described by multiple linear models, in order to prevent system deterioration through the synthesis of adapted controllers. Assuming that fault detection and isolation (FDI) and fault estimation have been carried out, an appropriate combination of predesigned gains is synthesized. The main contribution concerns the design of state feedback gains through LMI both in fault-free and faulty cases in order to p...

  3. New fault tolerant control strategy for nonlinear systems with multiple model approach

    OpenAIRE

    Ichalal, Dalil; Marx, Benoît; Maquin, Didier; Ragot, José

    2010-01-01

    This paper addresses a new methodology to construct a fault tolerant control (FTC) in order to compensate actuator faults in nonlinear systems. This approach is based on the representation of the nonlinear model with a multiple model under Takagi-Sugeno's form. The proposed control requires a simultaneous estimation of the system states and of the occurring actuator faults. The performance of the control depends on the quality of the estimations; indeed, it is important to estimate accurate...

  4. Fault Tolerance in a Multi-Layered DRE System: A Case Study

    OpenAIRE

    Paul Rubel; Joseph Loyall; Richard Schantz; Matthew Gillen

    2006-01-01

    Dynamic resource management is a crucial part of the infrastructure for emerging distributed real-time embedded systems, responsible for keeping mission-critical applications operating and allocating the resources necessary for them to meet their requirements. Because of this, the resource manager must be fault-tolerant, with nearly continuous operation. This paper describes our efforts to develop a fault-tolerant multi-layer dynamic resource management capability and the challenges we encoun...

  5. Measuring fault tolerance with the FTAPE fault injection tool

    Science.gov (United States)

    Tsai, Timothy K.; Iyer, Ravishankar K.

    1995-01-01

    This paper describes FTAPE (Fault Tolerance And Performance Evaluator), a tool that can be used to compare fault-tolerant computers. The major parts of the tool include a system-wide fault-injector, a workload generator, and a workload activity measurement tool. The workload creates high stress conditions on the machine. Using stress-based injection, the fault injector is able to utilize knowledge of the workload activity to ensure a high level of fault propagation. The errors/fault ratio, performance degradation, and number of system crashes are presented as measures of fault tolerance.

  6. Industrial Cost-Benefit Assessment for Fault-tolerant Control Systems

    DEFF Research Database (Denmark)

    Thybo, C.; Blanke, M.

    1998-01-01

    Economic aspects are decisive for industrial acceptance of research concepts including the promising ideas in fault tolerant control. Fault tolerance is the ability of a system to detect, isolate and accommodate a fault, such that simple faults in a sub-system do not develop into failures at a system level. In a design phase for an industrial system, possibilities span from fail safe design where any single point failure is accommodated by hardware, over fault-tolerant design where selected faults are handled without extra hardware, to fault-ignorant design where no extra precaution is taken against failure. The paper describes the assessments needed to find the right path for new industrial designs. The economic decisions in the design phase are discussed: cost of different failures, profits associated with available benefits, investments needed for development and life-time support. The objective of this paper is to help, in the early product development state, to find the economically most suitable scheme. A salient result is that with increased customer awareness of total cost of ownership, new products can benefit significantly from applying fault tolerant control principles.

  7. Towards fault-tolerant decision support systems for ship operator guidance

    DEFF Research Database (Denmark)

    Nielsen, Ulrik Dam; Lajic, Zoran

    2012-01-01

    Fault detection and isolation are very important elements in the design of fault-tolerant decision support systems for ship operator guidance. This study outlines remedies that can be applied for fault diagnosis, when the ship responses are assumed to be linear in the wave excitation. A novel numerical procedure is described for the calculation of residuals using the ship's transfer functions which correlate the wave excitation and the ship responses. As tests, multiplicative faults have artificially been imposed on full-scale motion measurements and it is shown that the developed model is able to detect and isolate all faults.

  8. Towards fault-tolerant decision support systems for ship operator guidance

    International Nuclear Information System (INIS)

    Fault detection and isolation are very important elements in the design of fault-tolerant decision support systems for ship operator guidance. This study outlines remedies that can be applied for fault diagnosis, when the ship responses are assumed to be linear in the wave excitation. A novel numerical procedure is described for the calculation of residuals using the ship's transfer functions which correlate the wave excitation and the ship responses. As tests, multiplicative faults have artificially been imposed on full-scale motion measurements and it is shown that the developed model is able to detect and isolate all faults.

  9. Mapping of Fault-Tolerant Applications with Transparency on Distributed Embedded Systems

    DEFF Research Database (Denmark)

    Izosimov, Viacheslav; Pop, Paul

    2006-01-01

    In this paper we present an approach for the mapping optimization of fault-tolerant embedded systems for safety-critical applications. Processes and messages are statically scheduled. Process re-execution is used for recovering from multiple transient faults. We call process recovery transparent if it does not affect operation of other processes. Transparent recovery has the advantage of fault containment, improved debuggability and less memory needed to store the fault-tolerant schedules. However, it will introduce additional delays that can lead to violations of the timing constraints of the application. We propose an algorithm for the mapping of fault-tolerant applications with transparency. The algorithm decides a mapping of processes on computation nodes such that the application is schedulable and the transparency properties imposed by the designer are satisfied. The mapping algorithm is driven by a heuristic that is able to estimate the worst-case schedule length and indicate whether a certain mapping alternative is schedulable.

  10. FTOS-Verify: Analysis and Verification of Non-Functional Properties for Fault-Tolerant Systems

    OpenAIRE

    Cheng, Chih-Hong; Buckl, Christian; Esparza, Javier; Knoll, Alois

    2009-01-01

    The focus of the tool FTOS is to alleviate designers' burden by offering code generation for non-functional aspects including fault-tolerance mechanisms. One crucial aspect in this context is to ensure that user-selected mechanisms for the system model are sufficient to resist faults as specified in the underlying fault hypothesis. In this paper, formal approaches in verification are proposed to assist the claim. We first raise the precision of FTOS into pure mathematical co...

  11. Fault tolerant distributed real time computer systems for I and C of prototype fast breeder reactor

    International Nuclear Information System (INIS)

    Highlights: • Architecture of distributed real time computer system (DRTCS) used in I and C of PFBR is explained. • Fault tolerant (hot standby) architecture, fault detection and switch over are detailed. • Scaled down model was used to study functional and performance requirements of DRTCS. • Quality of service parameters for scaled down model were critically studied. - Abstract: Prototype fast breeder reactor (PFBR) is in an advanced stage of construction at Kalpakkam, India. Three-tier architecture is adopted for instrumentation and control (I and C) of PFBR wherein the bottom tier consists of real time computer (RTC) systems, the middle tier consists of process computers and the top tier consists of display stations. These RTC systems are geographically distributed and networked together with process computers and display stations. Hot standby architecture comprising dual redundant RTC systems with a switch over logic system is deployed in order to achieve fault tolerance. Fault tolerant dual redundant network connectivity is provided in each RTC system and the TCP/IP protocol is selected for network communication. In order to assess the performance of distributed RTC systems, a scaled down model was developed with 9 representative systems and nearly 15% of I and C signals of PFBR were connected and monitored. Functional and performance testing were carried out for each RTC system and the fault tolerant characteristics were studied by injecting various faults into the system and observing the performance. Various quality of service parameters like connection establishment delay, priority parameter, transit delay, throughput, residual error ratio, etc., are critically studied for the network.

  12. Fault tolerant distributed real time computer systems for I and C of prototype fast breeder reactor

    Energy Technology Data Exchange (ETDEWEB)

    Manimaran, M., E-mail: maran@igcar.gov.in; Shanmugam, A.; Parimalam, P.; Murali, N.; Satya Murty, S.A.V.

    2014-03-15

    Highlights: • Architecture of distributed real time computer system (DRTCS) used in I and C of PFBR is explained. • Fault tolerant (hot standby) architecture, fault detection and switch over are detailed. • Scaled down model was used to study functional and performance requirements of DRTCS. • Quality of service parameters for scaled down model were critically studied. - Abstract: Prototype fast breeder reactor (PFBR) is in an advanced stage of construction at Kalpakkam, India. Three-tier architecture is adopted for instrumentation and control (I and C) of PFBR wherein the bottom tier consists of real time computer (RTC) systems, the middle tier consists of process computers and the top tier consists of display stations. These RTC systems are geographically distributed and networked together with process computers and display stations. Hot standby architecture comprising dual redundant RTC systems with a switch over logic system is deployed in order to achieve fault tolerance. Fault tolerant dual redundant network connectivity is provided in each RTC system and the TCP/IP protocol is selected for network communication. In order to assess the performance of distributed RTC systems, a scaled down model was developed with 9 representative systems and nearly 15% of I and C signals of PFBR were connected and monitored. Functional and performance testing were carried out for each RTC system and the fault tolerant characteristics were studied by injecting various faults into the system and observing the performance. Various quality of service parameters like connection establishment delay, priority parameter, transit delay, throughput, residual error ratio, etc., are critically studied for the network.

  13. Fault Tolerant Control Systems : a Development Method and Real-Life Case Study

    DEFF Research Database (Denmark)

    Bøgh, S.A.

    1997-01-01

    This thesis considered the development of fault tolerant control systems. The focus was on the category of automated processes that do not necessarily comprise a high number of identical sensors and actuators to maintain safe operation, but still have a potential for improving immunity to component failures. It is often feasible to increase availability for these control loops by designing the control system to perform on-line detection and reconfiguration in case of faults before the safety system makes a close-down of the process. A general development methodology is given in the thesis that carried the control system designer through the steps necessary to consider fault handling in an early design phase. It was shown how an existing control loop with interface to the plant wide control system could be extended with three additional modules to obtain fault tolerance: Fault detection and isolation, remedial action decision, and reconfiguration. The integration of these modules in software were considered. The general methodology covered the analysis, design, and implementation of fault tolerant control systems on an overall level. Two detailed studies were presented, one on fault detection and isolation design and one on design of the decision logic. Two application case studies were used to emphasize practical aspects of both the development methodology and the detailed studies. One was an electro-mechanical actuator in a position control loop for a diesel engine speed governor where the purpose was to avoid a total close-down in case of the most likely faults. The second was a fault tolerant attitude control system for a micro satellite where the operation of the system is mission critical. The purpose was to avoid hazardous effects from faults and maintain operation if possible. 
    A method was introduced that, after a systematic examination of possible component failures, enables analysis of the relationship between failures and their consequences for the system's operation. This fault propagation analysis is based on coarse models of the subsystems describing the reaction to faults, as for example a variable being zero, low or high. Examples were given that illustrate how such models can be established by simple means, and yet provide important information when combined into a complete system. A special achievement was a method to determine how control loops behave in case of faults. This is not straightforward as the system behaviour depends on the character of the feedback. One of the detailed studies was the design of the decision logic in fault handling, realized as state-event machines. Guidelines for the design were provided, based on experience from the two case studies. Methods for verifying correct operation of the decision logic were described, where a completeness check against the fault propagation analysis is able to guarantee coverage of all considered faults. The usage of software tools to support the development process was illustrated with an off-the-shelf product for constraint logic solving and state-event machine analysis. The coarse system models and the decision logic were analyzed with the toolbox and it was shown how an easy analysis could be performed to verify correctness and completeness of the fault handling design. Experience from this study highlights requirements for a dedicated software environment for fault tolerant control systems design. The second detailed study addressed the detection of a fault event and determination of the failed component. A variety of algorithms were compared, based on two fault scenarios in the speed governor actuator setup. One was a position sensor fault and the second was an actuator current fault.
    The sensor fault detection was trivial, whereas the actuator fault was more challenging. The study demonstrated that many existing methods have a potential to detect and isolate the two faults, but also that the research field still misses a systematic approach to handle realistic problems such as low sampling rate and nonlinear characteristics of the system.

  14. Adaptive sensor-fault tolerant control for a class of multivariable uncertain nonlinear systems.

    Science.gov (United States)

    Khebbache, Hicham; Tadjine, Mohamed; Labiod, Salim; Boulkroune, Abdesselem

    2015-03-01

    This paper deals with the active fault tolerant control (AFTC) problem for a class of multiple-input multiple-output (MIMO) uncertain nonlinear systems subject to sensor faults and external disturbances. The proposed AFTC method can tolerate three additive (bias, drift and loss of accuracy) and one multiplicative (loss of effectiveness) sensor faults. By employing backstepping technique, a novel adaptive backstepping-based AFTC scheme is developed using the fact that sensor faults and system uncertainties (including external disturbances and unexpected nonlinear functions caused by sensor faults) can be on-line estimated and compensated via robust adaptive schemes. The stability analysis of the closed-loop system is rigorously proven using a Lyapunov approach. The effectiveness of the proposed controller is illustrated by two simulation examples. PMID:25701191

  15. Fault-Tolerant Control of a 2 DOF Helicopter (TRMS System) Based on H_infinity

    OpenAIRE

    Bouguerra, Abderrahmen; Saigaa, Djamel; Kara, Kamel; Zeghlache, Samir; Loukal, Keltoum

    2013-01-01

    In this paper, fault-tolerant control of a 2 DOF helicopter (TRMS system) based on H-infinity is presented. In particular, the introductory part of the paper presents Fault-Tolerant Control (FTC), the first part describes the mathematical model of the TRMS, and in the last part a polytopic Unknown Input Observer (UIO) is synthesized using equalities and LMIs. This UIO is used to observe the faults and then compensate them, ...

  16. Reliability model of the fault-tolerant multicore system with software recovery

    Directory of Open Access Journals (Sweden)

    B. Y. Volochiy

    2013-09-01

    Introduction. The researched problem and the features of the designed fault-tolerant hardware/software systems are analyzed, and the approach to developing reliability models of hardware/software systems is described. This chapter outlines a reliability macromodel of a fault-tolerant multicore system with software recovery. Model of the fault-tolerant multicore system. The model features include two multicore systems (main and reserve), some processors for sliding redundancy, and one control and diagnostics processor which controls the hardware and software features. The model also includes automatic software restart after a hardware failure. Application example of the proposed model. A weighted calculation of MTTF (mean time to failure), considering a given probability of failure-free operation, was performed for the suggested and existing models. Conclusion. The proposed model is designed to estimate the reliability of hardware/software systems.
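
    As a rough illustration of how an MTTF figure follows from a reliability function, the sketch below integrates R(t) numerically for three textbook redundancy schemes with exponentially distributed failures. It is not the macromodel of the paper (which covers sliding redundancy and software restart); the function name `mttf`, the failure rate `lam` and the three reliability functions are assumptions chosen for the example.

```python
import math


def mttf(reliability, t_max, n=200_000):
    """Approximate MTTF as the trapezoidal integral of R(t) over [0, t_max]."""
    h = t_max / n
    total = 0.5 * (reliability(0.0) + reliability(t_max))
    for k in range(1, n):
        total += reliability(k * h)
    return total * h


lam = 1e-3            # assumed constant failure rate per hour (illustrative)
t_max = 50.0 / lam    # far enough out that R(t) is negligible


def r_single(t):
    # One unit with exponential lifetime.
    return math.exp(-lam * t)


def r_parallel(t):
    # Two active units; the system works while at least one works.
    return 2.0 * math.exp(-lam * t) - math.exp(-2.0 * lam * t)


def r_standby(t):
    # One main unit plus one cold reserve with ideal fault detection and switching.
    return math.exp(-lam * t) * (1.0 + lam * t)


mttf_single = mttf(r_single, t_max)      # closed form: 1 / lam
mttf_parallel = mttf(r_parallel, t_max)  # closed form: 3 / (2 * lam)
mttf_standby = mttf(r_standby, t_max)    # closed form: 2 / lam
```

    Comparing the three values shows why a main/reserve arrangement with reliable switch-over is attractive: under these idealized assumptions the cold reserve doubles the MTTF, whereas two always-active units gain only 50%.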

  17. Fault-tolerant interconnection network and image-processing applications for the PASM parallel processing system

    International Nuclear Information System (INIS)

    The demand for very high speed data processing coupled with falling hardware costs has made large-scale parallel and distributed computer systems both desirable and feasible. Two modes of parallel processing are single instruction stream-multiple data stream (SIMD) and multiple instruction stream-multiple data stream (MIMD). PASM, a partitionable SIMD/MIMD system, is a reconfigurable multimicroprocessor system being designed for image processing and pattern recognition. An important component of these systems is the interconnection network, the mechanism for communication among the computation nodes and memories. Assuring high reliability for such complex systems is a significant task. Thus, a crucial practical aspect of an interconnection network is fault tolerance. In answer to this need, the Extra Stage Cube (ESC), a fault-tolerant, multistage cube-type interconnection network, is defined. The fault tolerance of the ESC is explored for both single and multiple faults, routing tags are defined, and consideration is given to permuting data and partitioning the ESC in the presence of faults. The ESC is compared with other fault-tolerant multistage networks. Finally, reliability of the ESC and an enhanced version of it are investigated.

  18. Fault Tolerant Control for Takagi-Sugeno systems with unmeasurable premise variables by trajectory tracking

    OpenAIRE

    Ichalal, Dalil; Marx, Benoît; Ragot, José; Maquin, Didier

    2010-01-01

    This paper presents a new method for fault tolerant control of nonlinear systems described by Takagi-Sugeno fuzzy systems with unmeasurable premise variables. The idea is to use a reference model and design a new control law to minimize the state deviation between a healthy reference model and the possibly faulty actual model. This scheme requires the knowledge of the system states and of the occurring faults. These signals are estimated from a Proportional-Integral Observer (PIO) or Propo...

  19. A novel mathematical setup for fault tolerant control systems with state-dependent failure process

    Science.gov (United States)

    Chitraganti, S.; Aberkane, S.; Aubrun, C.

    2014-12-01

    In this paper, we consider a fault tolerant control system (FTCS) with state-dependent failures and provide a tractable mathematical model to handle the state-dependent failures. By assuming abrupt changes in system parameters, we use jump process models of the failure process and the fault detection and isolation (FDI) process. In particular, we assume that the failure rates of the failure process vary according to which set the state of the system belongs to.

  20. Distributed Fault-Tolerant Avionic Systems - A Real-Time Perspective

    OpenAIRE

    Burke, Michael; Audsley, Neil

    2010-01-01

    This paper examines the problem of introducing advanced forms of fault-tolerance via reconfiguration into safety-critical avionic systems. This is required to enable increased availability after fault occurrence in distributed integrated avionic systems (compared to static federated systems). The approach taken is to identify a migration path from current architectures to those that incorporate re-configuration to a lesser or greater degree. Other challenges identified includ...

  1. Advanced information processing system: The Army fault tolerant architecture conceptual study. Volume 1: Army fault tolerant architecture overview

    Science.gov (United States)

    Harper, R. E.; Alger, L. S.; Babikyan, C. A.; Butler, B. P.; Friend, S. A.; Ganska, R. J.; Lala, J. H.; Masotto, T. K.; Meyer, A. J.; Morton, D. P.

    1992-01-01

    Digital computing systems needed for Army programs such as the Computer-Aided Low Altitude Helicopter Flight Program and the Armored Systems Modernization (ASM) vehicles may be characterized by high computational throughput and input/output bandwidth, hard real-time response, high reliability and availability, and maintainability, testability, and producibility requirements. In addition, such a system should be affordable to produce, procure, maintain, and upgrade. To address these needs, the Army Fault Tolerant Architecture (AFTA) is being designed and constructed under a three-year program comprised of a conceptual study, detailed design and fabrication, and demonstration and validation phases. Described here are the results of the conceptual study phase of the AFTA development. Given here is an introduction to the AFTA program, its objectives, and key elements of its technical approach. A format is designed for representing mission requirements in a manner suitable for first order AFTA sizing and analysis, followed by a discussion of the current state of mission requirements acquisition for the targeted Army missions. An overview is given of AFTA's architectural theory of operation.

  2. Checkpointing Based Fault Tolerant Job Scheduling System for Computational Grid

    OpenAIRE

    Mangesh Ramesh Balpande

    2014-01-01

    A computational grid environment, due to its heterogeneous, autonomous and dynamic nature, is prone to different kinds of faults which may delay completion of a job or even force re-execution of the job from its starting point. Checkpointing mechanisms play a vital role in making the grid more reliable, cost effective and efficient. In this paper, we have proposed schemes based on system checkpointing and application checkpointing. Their performance comparison is done based on the empirical study. The AB...

  3. Active Fault Tolerant Control-FTC-Design for Takagi-Sugeno Fuzzy Systems with Weighting Functions Depending on the FTC

    Directory of Open Access Journals (Sweden)

    Atef Khedher

    2011-05-01

    In this paper the problem of active fault tolerant control design for noisy systems described by Takagi-Sugeno fuzzy models is studied. The proposed control strategy is based on knowledge of the estimated faults and of the error between the faulty system state and a reference system state. The considered systems are affected by actuator and sensor faults and have weighting functions that depend on the fault tolerant control. A mathematical transformation is used to build an augmented system in which all the faults affecting the initial system appear as actuator faults. Then, an adaptive proportional integral observer is used in order to estimate the state and the faults. The design of the proportional integral observer and of the fault tolerant control strategy is formulated in terms of linear matrix inequalities, which can be solved easily. To illustrate the proposed method, it is applied to the three-tank system.

  4. Preface of the special issue on Advances in Control and Fault-Tolerant Systems

    OpenAIRE

    Korbicz, Jozef; Maquin, Didier; Theilliol, Didier

    2012-01-01

    Today's automatic control systems are of high degrees of integration, complexity, embedding and networking of heterogeneous entities. This trend is driven by the industrial needs for achieving new technical performance and meeting additional performance demands. A most critical and important issue surrounding the design and operation of complex automatic systems is the application of Fault Detection and Isolation and Fault-Tolerant Control (FDI/FTC) technology, aiming at guaranteeing high sys...

  5. Fault tolerant design of a servo manipulator system for hot cell operation

    International Nuclear Information System (INIS)

    In this paper, fault tolerant mechanisms are presented for a servo manipulator system designed to operate in a hot cell. A hot cell is a sealed and shielded room for handling radioactive materials, and it is dangerous for people to work in the hot cell. So, remote operations are necessary to handle the radioactive materials in the hot cell. KAERI has developed a servo manipulator system to perform such remote operations. However, since electric components such as servo motors are weakened by radiation, fault tolerant mechanisms have to be considered. For fault tolerance of the servo manipulator system, hardware and software redundancy has been considered. In the case of hardware, radiation-resistant electric components such as cables and connectors have been adopted and the motors driving a transport have been duplicated. In the case of software, a reconfiguration algorithm accommodating one motor's failure has been developed. The algorithm uses redundant axes to recover the end effector's motion in spite of one motor's failure.

  6. Fault tolerant linear actuator

    Science.gov (United States)

    Tesar, Delbert

    2004-09-14

    In varying embodiments, the fault tolerant linear actuator of the present invention is a new and improved linear actuator with fault tolerance and positional control that may incorporate velocity summing, force summing, or a combination of the two. In one embodiment, the invention offers a velocity summing arrangement with a differential gear between two prime movers driving a cage, which then drives a linear spindle screw transmission. Other embodiments feature two prime movers driving separate linear spindle screw transmissions, one internal and one external, in a totally concentric and compact integrated module.

  7. Checkpointing Based Fault Tolerant Job Scheduling System for Computational Grid

    Directory of Open Access Journals (Sweden)

    Mangesh Ramesh Balpande

    2014-09-01

    A computational grid environment, due to its heterogeneous, autonomous and dynamic nature, is prone to different kinds of faults which may delay completion of a job or even force re-execution of the job from its starting point. Checkpointing mechanisms play a vital role in making the grid more reliable, cost effective and efficient. In this paper, we have proposed schemes based on system checkpointing and application checkpointing. Their performance comparison is done based on the empirical study. The ABSC scheme is suitable for applications where computations are not intense. For computationally intense applications where reliability is more important, the ABAC scheme is more suitable. However, this scheme may produce slight overheads in fault-free situations, while being very reliable in faulty situations.
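
    The general idea of checkpoint-based recovery can be sketched as follows. This is a minimal single-process illustration, not the ABSC or ABAC scheme of the paper: the names `run_job`, `ckpt_every` and `fail_at`, the JSON checkpoint file and the toy workload (summing integers) are all assumptions made for the example.

```python
import json
import os
import tempfile


def run_job(n_steps, ckpt_path, ckpt_every=10, fail_at=None):
    """Sum the integers 0..n_steps-1, saving progress every `ckpt_every` steps.

    `fail_at` is a hypothetical hook that raises a simulated fault when the
    job reaches that step. Returns (result, steps executed in this run).
    """
    # Resume from the last checkpoint if one exists; otherwise start fresh.
    if os.path.exists(ckpt_path):
        with open(ckpt_path) as f:
            state = json.load(f)
    else:
        state = {"step": 0, "total": 0}

    step, total = state["step"], state["total"]
    while step < n_steps:
        if fail_at is not None and step == fail_at:
            raise RuntimeError(f"simulated fault at step {step}")
        total += step
        step += 1
        if step % ckpt_every == 0:
            # Write to a temporary file and rename it, so that a crash
            # cannot leave a half-written checkpoint behind.
            tmp = ckpt_path + ".tmp"
            with open(tmp, "w") as f:
                json.dump({"step": step, "total": total}, f)
            os.replace(tmp, ckpt_path)
    return total, step - state["step"]


def demo():
    """Run the job once with an injected fault, then restart it."""
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "job.ckpt")
        try:
            run_job(100, path, fail_at=57)
        except RuntimeError:
            pass  # the first attempt dies after the checkpoint at step 50
        return run_job(100, path)
```

    In this sketch the restarted job repeats only the work done since the last checkpoint instead of starting over. The checkpoint interval trades overhead in fault-free runs against the amount of work lost per fault, which is the same trade-off the abstract describes.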

  8. Software Fault Tolerance: A Tutorial

    Science.gov (United States)

    Torres-Pomales, Wilfredo

    2000-01-01

    Because of our present inability to produce error-free software, software fault tolerance is and will continue to be an important consideration in software systems. The root cause of software design errors is the complexity of the systems. Compounding the problems in building correct software is the difficulty in assessing the correctness of software for highly complex systems. After a brief overview of the software development processes, we note how hard-to-detect design faults are likely to be introduced during development and how software faults tend to be state-dependent and activated by particular input sequences. Although component reliability is an important quality measure for system level analysis, software reliability is hard to characterize and the use of post-verification reliability estimates remains a controversial issue. For some applications software safety is more important than reliability, and fault tolerance techniques used in those applications are aimed at preventing catastrophes. Single version software fault tolerance techniques discussed include system structuring and closure, atomic actions, inline fault detection, exception handling, and others. Multiversion techniques are based on the assumption that software built differently should fail differently and thus, if one of the redundant versions fails, it is expected that at least one of the other versions will provide an acceptable output. Recovery blocks, N-version programming, and other multiversion techniques are reviewed.
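
    The two multiversion techniques named above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not code from the tutorial: the integer-square-root variants, the deliberately faulty `isqrt_buggy` and the acceptance test `accept` are hypothetical, and the recovery block omits the state restoration that a real implementation performs before trying the next alternate.

```python
from collections import Counter


def recovery_block(variants, acceptance_test, x):
    """Try each variant in order; return the first result that passes the test."""
    for variant in variants:
        try:
            result = variant(x)
        except Exception:
            continue  # a crash counts as a failed variant
        if acceptance_test(x, result):
            return result
    raise RuntimeError("all variants failed the acceptance test")


def n_version_vote(versions, x):
    """Run every version and return the output agreed on by a strict majority."""
    outputs = []
    for version in versions:
        try:
            outputs.append(version(x))
        except Exception:
            pass  # a crashed version simply casts no vote
    if outputs:
        value, count = Counter(outputs).most_common(1)[0]
        if count > len(versions) // 2:
            return value
    raise RuntimeError("no majority among versions")


def isqrt_newton(n):
    # Integer Newton iteration, started from above the root.
    x = n
    while x * x > n:
        x = (x + n // x) // 2
    return x


def isqrt_linear(n):
    # Independent design: count upwards until the next square exceeds n.
    r = 0
    while (r + 1) * (r + 1) <= n:
        r += 1
    return r


def isqrt_buggy(n):
    # Deliberately wrong version, standing in for a design fault.
    return n // 2


def accept(n, r):
    # Acceptance test: r is the floor of the square root of n.
    return r >= 0 and r * r <= n < (r + 1) * (r + 1)


rb_result = recovery_block([isqrt_buggy, isqrt_newton], accept, 30)
nv_result = n_version_vote([isqrt_newton, isqrt_buggy, isqrt_linear], 30)
```

    The recovery block depends on the quality of its acceptance test, while N-version voting depends on the versions failing independently, which is the assumption the abstract flags.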

  9. Fault detection and fault tolerant control of a smart base isolation system with magneto-rheological damper

    International Nuclear Information System (INIS)

    Fault detection and isolation (FDI) in real-time systems can provide early warnings for faulty sensors and actuator signals to prevent events that lead to catastrophic failures. The main objective of this paper is to develop FDI and fault tolerant control techniques for base isolation systems with magneto-rheological (MR) dampers. Thus, this paper presents a fixed-order FDI filter design procedure based on linear matrix inequalities (LMI). The necessary and sufficient conditions for the existence of a solution for detecting and isolating faults using the H∞ formulation are provided in the proposed filter design. Furthermore, an FDI-filter-based fuzzy fault tolerant controller (FFTC) for a base isolation structure model was designed to preserve the pre-specified performance of the system in the presence of various unknown faults. Simulation and experimental results demonstrated that the designed filter can successfully detect and isolate faults from displacement sensors and accelerometers while maintaining excellent performance of the base isolation technology under faulty conditions.

  10. Active fault tolerant control of piecewise affine systems with reference tracking and input constraints

    DEFF Research Database (Denmark)

    Gholami, M.; Cocquempot, V.

    2014-01-01

    An active fault tolerant control (AFTC) method is proposed for discrete-time piecewise affine (PWA) systems. Only actuator faults are considered. The AFTC framework contains a supervisory scheme, which selects a suitable controller in a set of controllers such that the stability and an acceptable performance of the faulty system are held. The design of the supervisory scheme is not considered here. The set of controllers is composed of a normal controller for the fault-free case, an active fault detection and isolation controller for isolation and identification of the faults, and a set of passive fault tolerant controller (PFTC) modules designed to be robust against a set of actuator faults. In this research, the piecewise nonlinear model is approximated by a PWA system. The PFTCs are state feedback laws. Each one is robust against a fixed set of actuator faults and is able to track the reference signal while the control inputs are bounded. The PFTC problem is transformed into a feasibility problem of a set of LMIs. The method is applied to a large-scale livestock ventilation model.

  11. An evaluation method of fault-tolerance for digital plant protection system in nuclear power plants

    International Nuclear Information System (INIS)

    In recent years, analog based nuclear power plant (NPP) safety related instrumentation and control (I and C) systems have been replaced by modern digital based I and C systems. NPP safety related I and C systems require very high design reliability compared to conventional digital systems, so reliability assessment is very important. In the reliability assessment of the digital system, fault tolerance evaluation is one of the crucial factors. However, the evaluation is very difficult because the digital system in an NPP is very complex. In this paper, a simulation based fault injection technique on a simplified processor is used to evaluate the fault-tolerance of the digital plant protection system (DPPS) with high efficiency and low cost.

  12. Closed-loop fault-tolerant control for uncertain nonlinear systems

    OpenAIRE

    Fliess, Michel; Join, Cédric; Sira-Ramirez, Hebertt

    2005-01-01

    We are designing, perhaps for the first time, closed-loop fault-tolerant control for uncertain nonlinear systems. Our solution is based on a new algebraic estimation technique of the derivatives of a time signal, which • yields good estimates of the unknown parameters and of the residuals, i.e., of the fault indicators, • is easily implementable in real time, • is robust with respect to a large variety of noises, without any necessity of knowing their statistical properties. Convincing ...

  13. Diagnosis and fault-tolerant control

    CERN Document Server

    Blanke, Mogens; Lunze, Jan; Staroswiecki, Marcel; Schröder, J

    2006-01-01

    Fault-tolerant control aims at a graceful degradation of the behaviour of automated systems in case of faults. It satisfies the industrial demand for enhanced availability and safety, in contrast to traditional reactions to faults that bring about sudden shutdowns and loss of availability. The book presents effective model-based analysis and design methods for fault diagnosis and fault-tolerant control. Architectural and structural models are used to analyse the propagation of the fault through the process, to test the fault detectability and to find the redundancies in the process that can be

  14. Parallel fault-tolerant robot control

    Science.gov (United States)

    Hamilton, Deirdre L.; Bennett, John K.; Walker, Ian D.

    1992-11-01

    Most robot controllers today employ a single processor architecture. As robot control requirements become more complex, these serial controllers have difficulty providing the desired response time. Additionally, with robots being used in environments that are hazardous or inaccessible to humans, fault-tolerant robotic systems are particularly desirable. A uniprocessor control architecture cannot offer tolerance of processor faults. Use of multiple processors for robot control offers two advantages over single processor systems. Parallel control provides a faster response, which in turn allows a finer granularity of control. Processor fault tolerance is also made possible by the existence of multiple processors. There is a trade-off between performance and the level of fault tolerance provided. This paper describes a shared memory multiprocessor robot controller that is capable of providing high performance and processor fault tolerance. We evaluate the performance of this controller, and demonstrate how performance and processor fault tolerance can be balanced in a cost-effective manner.

  15. Fault tolerance control of phase current in permanent magnet synchronous motor control system

    Science.gov (United States)

    Chen, Kele; Chen, Ke; Chen, Xinglong; Li, Jinying

    2014-08-01

    As photoelectric tracking systems move from earth-based platforms to moving platforms of all kinds (aircraft, ships, cars, satellites and missiles), fault-tolerant control of the phase current sensors is studied in order to detect and handle phase current sensor failures on a moving platform. Using a DC-link current sensor and the switching state of the corresponding SVPWM inverter, failure detection and fault-tolerant control of the three phase current sensors are achieved. Fault tolerance is maintained with one, two or three failed sensors. The cause of the error between the current used under fault-tolerant control and the actual phase current is analyzed, and a way to reduce this error is provided. An experiment on a permanent magnet synchronous motor system shows that the method detects phase current sensor failures effectively and precisely while providing fault-tolerant control. With this method, even if all three phase current sensors malfunction, the moving platform can still work by reconstructing the phase current of the motor.
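
    The reconstruction idea in this abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes an ideal two-level inverter, a balanced load (ia + ib + ic = 0), and two DC-link samples taken under different active switching vectors within one PWM period. The names dc_link_current, VECTOR_TO_PHASE and reconstruct are hypothetical.

```python
# Sketch only: rebuild three phase currents from a single DC-link sensor.
# Assumptions: ideal two-level inverter, balanced load (ia + ib + ic = 0).

def dc_link_current(state, currents):
    """Ideal DC-link current for switching state (Sa, Sb, Sc); 1 = upper switch on."""
    return sum(s * i for s, i in zip(state, currents))

# For each active vector: (index of the phase it exposes, sign of i_dc).
VECTOR_TO_PHASE = {
    (1, 0, 0): (0, +1),  # i_dc = +ia
    (1, 1, 0): (2, -1),  # i_dc = ia + ib = -ic
    (0, 1, 0): (1, +1),  # i_dc = +ib
    (0, 1, 1): (0, -1),  # i_dc = ib + ic = -ia
    (0, 0, 1): (2, +1),  # i_dc = +ic
    (1, 0, 1): (1, -1),  # i_dc = ia + ic = -ib
}

def reconstruct(samples):
    """samples: iterable of (state, i_dc) pairs from different active vectors."""
    known = {}
    for state, i_dc in samples:
        phase, sign = VECTOR_TO_PHASE[state]
        known[phase] = sign * i_dc
    if len(known) < 2:
        raise ValueError("need samples that expose two different phases")
    # The remaining phase follows from ia + ib + ic = 0.
    for phase in {0, 1, 2} - set(known):
        known[phase] = -sum(known.values())
    return [known[0], known[1], known[2]]

true_currents = [3.0, -1.0, -2.0]
samples = [(s, dc_link_current(s, true_currents)) for s in [(1, 0, 0), (1, 1, 0)]]
estimate = reconstruct(samples)
```

    Real drives must also deal with sampling windows that are too short near sector boundaries and at low modulation index, which this sketch ignores.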

  16. Stability Guaranteed Active Fault-Tolerant Control of Networked Control Systems

    Directory of Open Access Journals (Sweden)

    Shanbin Li

    2008-03-01

    Full Text Available Stability-guaranteed active fault-tolerant control against actuator failures and plant uncertainties in networked control systems (NCSs) is addressed. A detailed design procedure is formulated as a convex optimization problem which can be efficiently solved by existing software. An illustrative example is given to show the efficiency of the proposed method for network-based control of uncertain systems.

  17. Fault-tolerant Stabilization for Linear System with Time Delay

    Directory of Open Access Journals (Sweden)

    Shaohua Wang

    2013-03-01

    Full Text Available In this note, the FTC problem of time-delay systems with a special sensor failure model is investigated. Firstly, based on the Lyapunov stability theorem, the stability condition of the closed-loop system is obtained by constructing a proper LKF and using an integral inequality. Secondly, by using a nonlinear transformation and the cone complementarity linearization algorithm, the controller existence condition for the time-delay system is obtained in terms of LMIs, which guarantees the asymptotic stability of the closed-loop system even if sensor faults occur, and the controller parameters are also given. Finally, an example is given to show the effectiveness of the proposed methods.

  18. Energy/Reliability Trade-offs in Fault-Tolerant Event-Triggered Distributed Embedded Systems

    DEFF Research Database (Denmark)

    Gan, Junhe; Gruian, Flavius

    2011-01-01

    This paper presents an approach to the synthesis of low-power fault-tolerant hard real-time applications mapped on distributed heterogeneous embedded systems. Our synthesis approach decides the mapping of tasks to processing elements, as well as the voltage and frequency levels for executing each task, such that transient faults are tolerated, the timing constraints of the application are satisfied, and the energy consumed is minimized. Tasks are scheduled using fixed-priority preemptive scheduling, while replication is used for recovery from multiple transient faults. Addressing energy and reliability simultaneously is especially challenging, since lowering the voltage to reduce the energy consumption has been shown to increase the transient fault rate. We presented a Tabu Search-based approach which uses an energy/reliability trade-off model to find reliable and schedulable implementations with limited energy and hardware resources. We evaluated the proposed algorithm using several synthetic and real-life benchmarks.

  19. Enhanced fault-tolerant quantum computing in d-level systems.

    Science.gov (United States)

    Campbell, Earl T

    2014-12-01

    Error-correcting codes protect quantum information and form the basis of fault-tolerant quantum computing. Leading proposals for fault-tolerant quantum computation require codes with an exceedingly rare property, a transversal non-Clifford gate. Codes with the desired property are presented for d-level qudit systems with prime d. The codes use n=d-1 qudits and can detect up to ∼d/3 errors. We quantify the performance of these codes for one approach to quantum computation known as magic-state distillation. Unlike prior work, we find performance is always enhanced by increasing d. PMID:25526106

  20. Fault-tolerant system considerations for a redundant strapdown inertial measurement unit

    Science.gov (United States)

    Motyka, P.; Ornedo, R.; Mangoubi, R.

    1984-01-01

    The development and evaluation of a fault-tolerant system for the Redundant Strapdown Inertial Measurement Unit (RSDIMU) being developed and evaluated by the NASA Langley Research Center was continued. The RSDIMU consists of four two-degree-of-freedom gyros and accelerometers mounted on the faces of a semi-octahedron which can be separated into two halves for damage protection. Compensated and uncompensated fault-tolerant system failure decision algorithms were compared. An algorithm to compensate for sensor noise effects in the fault-tolerant system thresholds was evaluated via simulation. The effects of sensor location and magnitude of the vehicle structural modes on system performance were assessed. A threshold generation algorithm, which incorporates noise compensation and filtered parity equation residuals for structural mode compensation, was evaluated. The effects of the fault-tolerant system on navigational accuracy were also considered. A sensor error parametric study was performed in an attempt to improve the soft failure detection capability without obtaining false alarms. Also examined was an FDI system strategy based on the pairwise comparison of sensor measurements. This strategy has the specific advantage of, in many instances, successfully detecting and isolating up to two simultaneously occurring failures.

  1. Fault Tolerant Computer Architecture

    CERN Document Server

    Sorin, Daniel

    2009-01-01

    For many years, most computer architects have pursued one primary goal: performance. Architects have translated the ever-increasing abundance of ever-faster transistors provided by Moore's law into remarkable increases in performance. Recently, however, the bounty provided by Moore's law has been accompanied by several challenges that have arisen as devices have become smaller, including a decrease in dependability due to physical faults. In this book, we focus on the dependability challenge and the fault tolerance solutions that architects are developing to overcome it. The two main purposes

  2. Parametric Modeling and Fault Tolerant Control

    Science.gov (United States)

    Wu, N. Eva; Ju, Jianhong

    2000-01-01

    Fault tolerant control is considered for a nonlinear aircraft model expressed as a linear parameter-varying system. By proper parameterization of foreseeable faults, the linear parameter-varying system can include fault effects as additional varying parameters. A recently developed technique in fault effect parameter estimation allows us to assume that estimates of the fault effect parameters are available on-line. Reconfigurability is calculated for this model with respect to the loss of control effectiveness to assess the potentiality of the model to tolerate such losses prior to control design. The control design is carried out by applying a polytopic method to the aircraft model. An error bound on fault effect parameter estimation is provided, within which the Lyapunov stability of the closed-loop system is robust. Our simulation results show that as long as the fault parameter estimates are sufficiently accurate, the polytopic controller can provide satisfactory fault-tolerance.

  3. Diagnosis and Fault-tolerant Control

    DEFF Research Database (Denmark)

    Blanke, Mogens; Kinnaert, Michel

    2003-01-01

    The book presents effective model-based analysis and design methods for fault diagnosis and fault-tolerant control. Architectural and structural models are used to analyse the propagation of the fault through the process, to test the fault detectability and to find the redundancies in the process that can be used to ensure fault tolerance. Design methods for diagnostic systems and fault-tolerant controllers are presented for processes that are described by analytical models, by discrete-event models or that can be dealt with as quantised systems. Four case studies on pilot processes show the applicability of the presented methods. The theoretical results are illustrated by two running examples which are used throughout the book. The book addresses engineering students, engineers in industry and researchers who wish to get a survey over the variety of approaches to process diagnosis and fault-tolerant control.

  4. A Hybrid Real-time Fault-tolerant Scheduling Algorithm for Partial Reconfigurable System

    Directory of Open Access Journals (Sweden)

    Jinyong Yin

    2012-11-01

    Full Text Available A partially reconfigurable system is an architecture consisting of general-purpose processors and FPGAs, in which the FPGA can be reconfigured at run time. On this architecture, software tasks and hardware tasks, executed on the processor and the FPGA respectively, co-exist. In this paper, a real-time fault-tolerant scheduling algorithm is proposed to schedule software/hardware hybrid tasks. In the algorithm, the sufficient condition for schedulable hybrid tasks is derived by analyzing system operating conditions when the first deadline is missed, and rollback/recovery and TMR approaches are used to schedule software subtasks and hardware subtasks, respectively, for fault tolerance. The experimental results demonstrate that all deadlines of accepted hybrid tasks are met and that the processor's utilization ratio is increased greatly compared with that of existing approaches when multiple faults occur.

  5. A Fault Tolerant Colored Petri Net Model for Flexible Manufacturing Systems

    Scientific Electronic Library Online (English)

    Tomaz C., Barros; Jorge C.A. de, Figueiredo; Angelo, Perkusich.

    1997-11-01

    Full Text Available This paper introduces an approach based on Colored Petri Nets (CPN) to systematically introduce fault-tolerance in the design of a supervisor for a Flexible Manufacturing System (FMS). The system is modeled by means of Place/Transition nets and then is structurally reduced, resulting in a CPN that is independent of a specific production route. The introduction of fault tolerance in the design of such a supervisor considers both forward recovery and backward recovery. For forward recovery we anticipate faults in resources in a production route and reschedule the production routes for production orders before the faulty resource is reached. The backward recovery is considered at the level of a resource in such a way that when a faulty resource is fixed, the operation restarts on the last consistent operation executed

  6. Fault-Tolerant Heat Exchanger

    Science.gov (United States)

    Izenson, Michael G.; Crowley, Christopher J.

    2005-01-01

    A compact, lightweight heat exchanger has been designed to be fault-tolerant in the sense that a single-point leak would not cause mixing of heat-transfer fluids. This particular heat exchanger is intended to be part of the temperature-regulation system for habitable modules of the International Space Station and to function with water and ammonia as the heat-transfer fluids. The basic fault-tolerant design is adaptable to other heat-transfer fluids and heat exchangers for applications in which mixing of heat-transfer fluids would pose toxic, explosive, or other hazards: Examples could include fuel/air heat exchangers for thermal management on aircraft, process heat exchangers in the cryogenic industry, and heat exchangers used in chemical processing. The reason this heat exchanger can tolerate a single-point leak is that the heat-transfer fluids are everywhere separated by a vented volume and at least two seals. The combination of fault tolerance, compactness, and light weight is implemented in a unique heat-exchanger core configuration: Each fluid passage is entirely surrounded by a vented region bridged by solid structures through which heat is conducted between the fluids. Precise, proprietary fabrication techniques make it possible to manufacture the vented regions and heat-conducting structures with very small dimensions to obtain a very large coefficient of heat transfer between the two fluids. A large heat-transfer coefficient favors compact design by making it possible to use a relatively small core for a given heat-transfer rate. Calculations and experiments have shown that in most respects, the fault-tolerant heat exchanger can be expected to equal or exceed the performance of the non-fault-tolerant heat exchanger that it is intended to supplant (see table). The only significant disadvantages are a slight weight penalty and a small decrease in the mass-specific heat transfer.

  7. An Efficient Fault Tolerance System Design for Cmos/Nanodevice Digital Memories

    Directory of Open Access Journals (Sweden)

    D. Kavitha

    2014-11-01

    Full Text Available Targeting future fault-prone hybrid CMOS/nanodevice digital memories, this paper presents two fault-tolerance design approaches that integrally address tolerance of defects and transient faults. These two approaches share several key features, including the use of a group of Bose-Chaudhuri-Hocquenghem (BCH) codes for both defect tolerance and transient fault tolerance, and the integration of BCH code selection and dynamic logical-to-physical address mapping. A new model of BCH decoder is proposed to reduce the area and simplify the computational scheduling of both the syndrome and Chien search blocks without parallelism, leading to high throughput. The goal of fault-tolerant computing is to improve the dependability of systems, where dependability can be defined as the ability of a system to deliver service at an acceptable level of confidence in either the presence or absence of faults. The results of the simulation and implementation using Xilinx ISE software and the LCD screen on the FPGA board are shown at the end.

  8. Diagnosis and Tolerant Strategy of an Open-Switch Fault for T-type Three-Level Inverter Systems

    DEFF Research Database (Denmark)

    Choi, Uimin; Lee, Kyo Beum

    2014-01-01

    This paper proposes a new diagnosis method for open-switch faults and a fault-tolerant control strategy for T-type three-level inverter systems. The location of the faulty switch can be identified from the average of the normalized phase current and the change in the neutral-point voltage. The proposed fault-tolerant strategy is explained for two cases: faults in the half-bridge switches and faults in the neutral-point switches. The performance of the T-type inverter system improves considerably with the proposed fault-tolerant algorithm when a switch fails. The proposed method does not require additional components or complex calculations. Simulation and experimental results verify the feasibility of the proposed fault diagnosis and fault-tolerant control strategy.

  9. Design of an active fault tolerant control for nonlinear systems described by a multi-model representation

    OpenAIRE

    Rodrigues, Mickael; Theilliol, Didier; Sauter, Dominique

    2005-01-01

    In this paper, a new active Fault Tolerant Control (FTC) strategy is developed for nonlinear systems described by multiple linear models, to prevent system deterioration through the synthesis of adapted controllers. When a fault is detected by the fault detection and diagnosis scheme, the reconfigurable controller is designed automatically using a robust gain scheduling strategy. The main contribution concerns the design of state feedback gains through LMI both in fault-free and faulty cases in...

  10. A Novel Fault Tolerant Reversible Gate For Nanotechnology Based Systems

    OpenAIRE

    Majid Haghparast; Keivan Navi

    2008-01-01

    This paper proposes a novel reversible logic gate, NFT. It is a parity preserving reversible logic gate, that is, the parity of the outputs matches that of the inputs. We demonstrate that the NFT gate can implement all Boolean functions. It renders a wide class of circuit faults readily detectable at the circuit's outputs. The proposed parity preserving reversible gate, allows any fault that affects no more than a single signal to be detectable at the circuit's primary outputs. The NFT gate c...
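
    The parity-preserving property described in this abstract can be checked mechanically. The sketch below is illustrative only: the NFT gate's truth table is not given in this record, so the well-known Fredkin (controlled-swap) gate is used as a stand-in parity-preserving reversible gate, with the Toffoli gate as a reversible counterexample. All function names are hypothetical.

```python
from itertools import product

def fredkin(a, b, c):
    """Controlled swap (stand-in for a parity-preserving gate, not the NFT gate)."""
    return (a, c, b) if a else (a, b, c)

def toffoli(a, b, c):
    """Reversible, but not parity preserving: c is flipped when a = b = 1."""
    return (a, b, c ^ (a & b))

def is_reversible(gate, n):
    """A gate is reversible if all 2**n inputs map to distinct outputs."""
    outputs = {gate(*bits) for bits in product((0, 1), repeat=n)}
    return len(outputs) == 2 ** n

def is_parity_preserving(gate, n):
    """Output parity must equal input parity for every input pattern."""
    return all(sum(bits) % 2 == sum(gate(*bits)) % 2
               for bits in product((0, 1), repeat=n))

def single_fault_detected(gate, bits, flip_index):
    """Flip one output signal and report whether a parity check exposes it."""
    out = list(gate(*bits))
    out[flip_index] ^= 1
    return sum(out) % 2 != sum(bits) % 2

all_detected = all(single_fault_detected(fredkin, bits, k)
                   for bits in product((0, 1), repeat=3) for k in range(3))
```

    For a parity-preserving gate, any fault that flips exactly one output signal changes the output parity, which is why a simple parity comparison at the outputs is enough to detect it.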

  11. Fault Tolerance Mobile Agent System Using Witness Agent in 2-Dimensional Mesh Network

    Directory of Open Access Journals (Sweden)

    Ahmad Rostami

    2010-09-01

    Full Text Available Mobile agents are computer programs that act autonomously on behalf of a user or owner and travel through a network of heterogeneous machines. Fault tolerance is important along their itinerary. In this paper, existing methods of fault tolerance for mobile agents, which are designed for a linear network topology, are described. In these methods, three agents cooperate to detect and recover from server and agent failures: the actual agent, which performs programs for its owner; the witness agent, which monitors the actual agent and the witness agent after itself; and the probe, which is sent on the witness agent's side to recover the actual agent or the witness agent. The communication mechanism in these methods is message passing between the agents. We introduce our witness agent approach for fault-tolerant mobile agent systems in a Two-Dimensional Mesh (2D-Mesh) network. Our approach minimizes witness dependency in this network, and we then present its algorithm.

  12. Modeling and Design of Fault-Tolerant and Self-Adaptive Reconfigurable Networked Embedded Systems

    Directory of Open Access Journals (Sweden)

    Streichert Thilo

    2006-01-01

    Full Text Available Automotive, avionic, or body-area networks are systems that consist of several communicating control units specialized for certain purposes. Typically, different constraints regarding fault tolerance, availability and also flexibility are imposed on these systems. In this article, we will present a novel framework for increasing fault tolerance and flexibility by solving the problem of hardware/software codesign online. Based on field-programmable gate arrays (FPGAs) in combination with CPUs, we allow migrating tasks implemented in hardware or software from one node to another. Moreover, if not enough hardware/software resources are available, the migration of functionality from hardware to software or vice versa is provided. Supporting such flexibility through services integrated in a distributed operating system for networked embedded systems is a substantial step towards self-adaptive systems. Besides the formal definition of methods and concepts, we describe in detail a first implementation of a reconfigurable networked embedded system running automotive applications.

  13. Modeling and Design of Fault-Tolerant and Self-Adaptive Reconfigurable Networked Embedded Systems

    Directory of Open Access Journals (Sweden)

    Jürgen Teich

    2006-06-01

    Full Text Available Automotive, avionic, or body-area networks are systems that consist of several communicating control units specialized for certain purposes. Typically, different constraints regarding fault tolerance, availability and also flexibility are imposed on these systems. In this article, we will present a novel framework for increasing fault tolerance and flexibility by solving the problem of hardware/software codesign online. Based on field-programmable gate arrays (FPGAs) in combination with CPUs, we allow migrating tasks implemented in hardware or software from one node to another. Moreover, if not enough hardware/software resources are available, the migration of functionality from hardware to software or vice versa is provided. Supporting such flexibility through services integrated in a distributed operating system for networked embedded systems is a substantial step towards self-adaptive systems. Besides the formal definition of methods and concepts, we describe in detail a first implementation of a reconfigurable networked embedded system running automotive applications.

  14. Architectures for fault-tolerant spacecraft computers

    Science.gov (United States)

    Rennels, D. A.

    1978-01-01

    This paper summarizes the results of a long-term research program in fault-tolerant computing for spacecraft on-board processing. In response to changing device technology this program has progressed from the design of a fault-tolerant uniprocessor to the development of fault-tolerant distributed computer systems. The unusual requirements of spacecraft computing are described along with the resulting real-time computer architectures. The following aspects of these designs are discussed: (1) architectural features to minimize complexity in the distributed computer system, (2) fault-detection and recovery, (3) techniques to enhance reliability and testability, and (4) design approaches for LSI implementation.

  15. Managing Fault Tolerance Transparently using CORBA Services

    OpenAIRE

    Meier, Rene

    1999-01-01

    Fault tolerance problems arise in large-scale distributed systems because application components may eventually fail due to hardware problems, operator mistakes or design faults. Fault tolerance mechanisms must be employed to reduce the susceptibility of a given system to failure. In this paper, we describe the design of an architecture to overcome potential application component failures, using CORBA, a distributed object middleware specified by the OMG. Of primary importan...

  16. A Systematic Approach to Sensitivity Analysis of Fault Tolerant Systems in NMR Architecture

    Directory of Open Access Journals (Sweden)

    Kourosh Aslansefat

    2015-01-01

    Full Text Available A fault tree illustrates the ways in which a system fails. It states the different ways in which combinations of faulty components result in an undesired event in the system. It is used in phases such as the design and operation of industrial systems, and it enables designers to evaluate dependability attributes such as reliability, MTTF and sensitivity. In addition, the fault tree is a systematic method for finding a system's bottlenecks and weak points. In spite of its extensive use in evaluating the reliability of systems, the fault tree is rarely used in calculating sensitivity. In the last decade, few studies have been conducted in this field; moreover, these methods are not applicable to large-scale systems and are not systematic. This paper provides a systematic method for evaluating system sensitivity through a fault tree. Then, it introduces the sensitivity of the NMR architecture, one of the common fault-tolerance structures used for enhancing systems' reliability, safety and availability in industry. This article presents a comprehensive and parameterized formula for the NMR structure's sensitivity. The presented method can be a great help to engineers who design and operate reliable systems in the systematic and instant calculation of sensitivity by means of a fault tree.
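
    As a rough illustration of what a sensitivity figure for NMR looks like, the sketch below uses the textbook majority-voting model rather than the paper's own formula, which is not reproduced in this record. It assumes N identical, independent modules with reliability r, an odd N, and a perfect voter; the function names are hypothetical.

```python
from math import comb

def nmr_reliability(r, n):
    """Probability that a majority of n identical, independent modules work.

    Assumes n is odd and the voter never fails (textbook model).
    """
    if n % 2 == 0:
        raise ValueError("n must be odd for simple majority voting")
    k = n // 2 + 1  # smallest majority
    return sum(comb(n, i) * r**i * (1 - r)**(n - i) for i in range(k, n + 1))

def nmr_sensitivity(r, n):
    """Derivative of nmr_reliability with respect to module reliability r.

    Uses the closed form n * C(n-1, k-1) * r**(k-1) * (1-r)**(n-k).
    """
    k = n // 2 + 1
    return n * comb(n - 1, k - 1) * r**(k - 1) * (1 - r)**(n - k)

# TMR (n = 3): R = 3r^2 - 2r^3 and dR/dr = 6r(1 - r).
tmr_r = nmr_reliability(0.9, 3)
tmr_s = nmr_sensitivity(0.9, 3)
```

    Under this model the sensitivity vanishes as r approaches 0 or 1 and peaks at r = 0.5, so adding modules pays off most when individual modules are already fairly reliable but not near-perfect.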

  17. To err is robotic, to tolerate immunological: fault detection in multirobot systems.

    Science.gov (United States)

    Tarapore, Danesh; Lima, Pedro U; Carneiro, Jorge; Christensen, Anders Lyhne

    2015-01-01

    Fault detection and fault tolerance represent two of the most important and largely unsolved issues in the field of multirobot systems (MRS). Efficient, long-term operation requires an accurate, timely detection, and accommodation of abnormally behaving robots. Most existing approaches to fault-tolerance prescribe a characterization of normal robot behaviours, and train a model to recognize these behaviours. Behaviours unrecognized by the model are consequently labelled abnormal or faulty. MRS employing these models do not transition well to scenarios involving temporal variations in behaviour (e.g., online learning of new behaviours, or in response to environment perturbations). The vertebrate immune system is a complex distributed system capable of learning to tolerate the organism's tissues even when they change during puberty or metamorphosis, and to mount specific responses to invading pathogens, all without the need of a genetically hardwired characterization of normality. We present a generic abnormality detection approach based on a model of the adaptive immune system, and evaluate the approach in a swarm of robots. Our results reveal the robust detection of abnormal robots simulating common electro-mechanical and software faults, irrespective of temporal changes in swarm behaviour. Abnormality detection is shown to be scalable in terms of the number of robots in the swarm, and in terms of the size of the behaviour classification space. PMID:25642825

  18. Design Optimization of Time- and Cost-Constrained Fault-Tolerant Embedded Systems with Checkpointing and Replication

    DEFF Research Database (Denmark)

    Pop, Paul; Izosimov, Viacheslav

    2009-01-01

    We present an approach to the synthesis of fault-tolerant hard real-time systems for safety-critical applications. We use checkpointing with rollback recovery and active replication for tolerating transient faults. Processes and communications are statically scheduled. Our synthesis approach decides the assignment of fault-tolerance policies to processes, the optimal placement of checkpoints and the mapping of processes to processors such that multiple transient faults are tolerated and the timing constraints of the application are satisfied. We present several design optimization approaches which are able to find fault-tolerant implementations given a limited amount of resources. The developed algorithms are evaluated using extensive experiments, including a real-life example.

  19. Analysis of fault tolerance and reliability in distributed real-time system architectures

    International Nuclear Information System (INIS)

    Safety critical real-time systems are becoming ubiquitous in many areas of our everyday life. Failures of such systems potentially have catastrophic consequences on different scales, in the worst case even the loss of human life. Therefore, safety critical systems have to meet maximum fault tolerance and reliability requirements. As the design of such systems is far from being trivial, this article focuses on concepts to specifically support the early architectural design. In detail, a simulation based approach for the analysis of fault tolerance and reliability in distributed real-time system architectures is presented. With this approach, safety related features can be evaluated in the early development stages and thus prevent costly redesigns in later ones.

  20. Reliability Monitoring of Fault Tolerant Control Systems with Demonstration on an Aircraft Model

    Directory of Open Access Journals (Sweden)

    Hongbin Li

    2007-12-01

    Full Text Available This paper proposes a reliability monitoring scheme for active fault tolerant control systems using a stochastic modeling method. The reliability index is defined based on system dynamical responses and a safety region; the plant and controller are assumed to have a multiple regime model structure, and a semi-Markov model is built for reliability evaluation based on the safety behavior of each regime model estimated by using Monte Carlo simulation. Moreover, the history data of fault detection and isolation decisions is used to update its transition characteristics and reliability model. This method provides an up-to-date reliability index as demonstrated on an aircraft model.

  1. Reliability of voting in fault-tolerant software systems for small output spaces

    Science.gov (United States)

    Mcallister, David F.; Sun, Chien-En; Vouk, Mladen A.

    1987-01-01

    Under a voting strategy in a fault-tolerant software system there is a difference between correctness and agreement. An independent N-version programming reliability model is proposed for treating small output spaces which distinguishes between correctness and agreement. System reliability is investigated using analytical relationships and simulation. A consensus majority voting strategy is proposed and its performance is analyzed and compared with other voting strategies. Consensus majority strategy automatically adapts the voting to different component reliability and output space cardinality characteristics. It is shown that absolute majority voting strategy provides a lower bound on the reliability provided by the consensus majority, and 2-of-n voting strategy an upper bound. If r is the cardinality of the output space, it is proved that 1/r is a lower bound on the average reliability of fault-tolerant system components below which the system reliability begins to deteriorate as more versions are added.
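
    The difference between the voting strategies named in this abstract can be seen in a small sketch. This is a simplified reading, not the authors' definition: the consensus voter here picks the output with the largest agreement group and reports a tie as "no decision" (None), which is an assumption made for determinism. Function names are hypothetical, and the list of version outputs is assumed to be non-empty.

```python
from collections import Counter

def absolute_majority(outputs):
    """Return the value produced by more than half of the versions, else None."""
    value, count = Counter(outputs).most_common(1)[0]
    return value if count > len(outputs) / 2 else None

def consensus(outputs):
    """Return the value with the largest agreement group; None on a tie.

    Assumption: ties yield no decision instead of being broken at random.
    """
    ranked = Counter(outputs).most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None
    return ranked[0][0]

# Five versions; two agree on 0 and the rest fail in different ways.
votes = [0, 0, 1, 2, 3]
majority_result = absolute_majority(votes)   # no group exceeds half
consensus_result = consensus(votes)          # the largest group still wins
```

    In this example the consensus voter still delivers an answer where the absolute majority voter abstains, which matches the abstract's statement that absolute majority voting bounds consensus voting from below.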

  2. New fault tolerant matrix converter

    Energy Technology Data Exchange (ETDEWEB)

    Ibarra, Edorta; Andreu, Jon; Kortabarria, Inigo; Ormaetxea, Enekoitz; Alegria, Inigo Martinez de; Martin, Jose Luis [Department of Electronics and Telecommunications, University of the Basque Country, Alameda de Urquijo s/n, E-48013 Bilbao (Spain); Ibanez, Pedro [TECNALIA, Energy Unit, Parque Tecnologico de Zamudio, E-48170 Bizkaia (Spain)

    2011-02-15

    The matrix converter (MC) presents a promising topology that will have to overcome certain barriers (protection systems, durability, the development of converters for real applications, etc.) in order to gain a foothold in the industry. In some applications, where continuous operation must be ensured in the case of a system failure, improved reliability of the converter is of particular importance. In this sense, this article focuses on the study of a fault tolerant MC. The fault tolerance of a converter is characterized by its total or partial response in the case of a breakage of any of its components. Taking into consideration that virtually no work has been done on fault tolerant MCs, this paper describes the most important studies in this area. Moreover, a new method is proposed for detecting the breakage of MC semiconductors. Likewise, a new variation of SVM modulation with failure tolerance capacity is presented. This guarantees the continuous operation of the converter and the pseudo-optimum control of a PMSM. This paper also proposes a novel MC topology, which allows the flexible reconfiguration of this converter, when one or several of its semiconductors are damaged. In this way, the MC can continue operating at 100% of its performance without having to double its resources. In this way, it can be said that the solution described in this article represents a step forward towards the development of reliable matrix converters for real applications. (author)

  3. Online Adaptive Fault Tolerant based Feedback Control Scheduling Algorithm for Multiprocessor Embedded Systems

    OpenAIRE

    Naseer, Oumair; Khan, Rana Atif Ali

    2012-01-01

    In recent years, the use of the Feedback Control Scheduling Algorithm (FCSA) in the control scheduling co-design of multiprocessor embedded systems has increased. FCSA provides Quality of Service (QoS) in terms of overall system performance and resource allocation in open and unpredictable environments. FCSA uses a quality control feedback loop to keep CPU utilization under the desired utilization bound, avoiding overload and deadline misses. Integrated Fault tolerance (FT) ba...

  4. Survey On Fault Tolerance In Grid Computing

    OpenAIRE

    Latchoumy, P.; Sheik Abdul Khader, P.

    2011-01-01

    Grid computing is defined as a hardware and software infrastructure that enables coordinated resource sharing within dynamic organizations. In grid computing, the probability of a failure is much greater than in traditional parallel computing. Therefore, the fault tolerance is an important property in order to achieve reliability, availability and QOS. In this paper, we give a survey on various fault tolerance techniques, fault management in different systems and related issues. A fault tolerance...

  5. Fault Tolerance Mobile Agent System Using Witness Agent in 2-Dimensional Mesh Network

    OpenAIRE

    Ahmad Rostami; Hassan Rashidi; Majidreza Shams Zahraie

    2010-01-01

    Mobile agents are computer programs that act autonomously on behalf of a user or owner and travel through a network of heterogeneous machines. Fault tolerance is important along their itinerary. In this paper, existing methods of fault tolerance for mobile agents, which were designed for a linear network topology, are described. In these methods, three agents cooperate with each other to detect and recover from server and agent failures. Three types of agents ar...

  6. Physical fault tolerance of nanoelectronics.

    Science.gov (United States)

    Szkopek, Thomas; Roychowdhury, Vwani P; Antoniadis, Dimitri A; Damoulakis, John N

    2011-04-29

    The error rate in complementary transistor circuits is suppressed exponentially in electron number, arising from an intrinsic physical implementation of fault-tolerant error correction. Contrariwise, explicit assembly of gates into the most efficient known fault-tolerant architecture is characterized by a subexponential suppression of error rate with electron number, and incurs significant overhead in wiring and complexity. We conclude that it is more efficient to prevent logical errors with physical fault tolerance than to correct logical errors with fault-tolerant architecture. PMID:21635055

  7. Fault tolerance in a distributed control system for combined cycle power plants

    Energy Technology Data Exchange (ETDEWEB)

    Ramirez, C.E.; Delgadillo, M.A. [Instituto de Investigaciones Electricas, Temixco (Mexico). Dept. de Instrumentacion y Control

    1996-12-31

    This paper presents how a Fault Tolerant Scheme (FTS) for the controllers of a distributed control system is selected. A dual-redundant configuration was chosen based on a dependability analysis. The defined FTS is described in terms of the four phases of fault-tolerance. A combination of stand-by and a synchronous scheme is considered. The FTS resulted in a cost-effective solution for increasing the control system reliability for two main reasons: the hardware configuration does not require special elements, and the FTS takes advantage of the manual tracking algorithm to keep the FTS software simple. The FTS was implemented and is operating in various controllers of a distributed control system in a combined cycle power plant in Mexico. (author)

  8. Fault-Tolerant Identification in Wireless Sensor Networks for Maximizing System Lifetime

    Directory of Open Access Journals (Sweden)

    Middela Shailaja

    2012-09-01

    Full Text Available Wireless Sensor Networks (WSN) are used by many applications such as security, command and control, and surveillance monitoring. In all such applications, the main purpose of the WSN is sensing and retrieval of data. Many WSN systems are query based. They give responses in a stipulated time based on the user’s query word. However, a WSN is not reliable: sensor faults are possible, and as a result the network energy level goes down. This reduces the lifetime of the network. To overcome this, fault tolerance mechanisms can be used to improve reliability by finding failed nodes, which are then recovered by cluster heads. This paper presents an algorithm that can effectively increase the lifetime of a WSN while satisfying the QoS requirements of the application. The algorithm is adaptive and fault tolerant. It uses path and source redundancy and is based on hop-by-hop data delivery. Empirical simulation results revealed that the proposed system is feasible. The system also proposes the authentication of all kinds of identified faults and provides services in a quality manner. It increases the data flow and reduces the faults

  9. Fault tolerant synchronization of chaotic systems based on T–S fuzzy model with fuzzy sampled-data controller

    International Nuclear Information System (INIS)

    In this paper the fault tolerant synchronization of two chaotic systems based on a fuzzy model and sampled data is investigated. The problem of fault tolerant synchronization is formulated to study the global asymptotical stability of the error system with the fuzzy sampled-data controller which contains a state feedback controller and a fault compensator. The synchronization can be achieved no matter whether the fault occurs or not. To investigate the stability of the error system and facilitate the design of the fuzzy sampled-data controller, a Takagi–Sugeno (T–S) fuzzy model is employed to represent the chaotic system dynamics. To acquire good performance and produce a less conservative analysis result, a new parameter-dependent Lyapunov–Krasovskii functional and a relaxed stabilization technique are considered. The stability conditions based on linear matrix inequality are obtained to achieve the fault tolerant synchronization of the chaotic systems. Finally, a numerical simulation is shown to verify the results. (general)

  10. FTOS-Verify: Analysis and Verification of Non-Functional Properties for Fault-Tolerant Systems

    CERN Document Server

    Cheng, Chih-Hong; Esparza, Javier; Knoll, Alois

    2009-01-01

    The focus of the tool FTOS is to alleviate designers' burden by offering code generation for non-functional aspects including fault-tolerance mechanisms. One crucial aspect in this context is to ensure that user-selected mechanisms for the system model are sufficient to resist faults as specified in the underlying fault hypothesis. In this paper, formal approaches in verification are proposed to assist the claim. We first raise the precision of FTOS into pure mathematical constructs, and formulate the deterministic assumption, which is necessary as an extension of Giotto-like systems (e.g., FTOS) to equip with fault-tolerance abilities. We show that local properties of a system with the deterministic assumption will be preserved in a modified synchronous system used as the verification model. This enables the use of techniques known from hardware verification. As for implementation, we develop a prototype tool called FTOS-Verify, deploy it as an Eclipse add-on for FTOS, and conduct several case studies.

  11. Simulation of steady-state availability models of fault-tolerant systems with deferred repair

    OpenAIRE

    Carrasco, Juan A.

    2006-01-01

    This paper targets the simulation of continuous-time Markov chain models of fault-tolerant systems with deferred repair. We start by stating sufficient conditions for a given importance sampling scheme to satisfy the bounded relative error property. Using those sufficient conditions, it is noted that many previously proposed importance sampling schemes such as failure biasing and balanced failure biasing satisfy that property. Then, we adapt the importance sampling schemes failure transition ...

  12. Advanced information processing system: The Army Fault-Tolerant Architecture detailed design overview

    Science.gov (United States)

    Harper, Richard E.; Babikyan, Carol A.; Butler, Bryan P.; Clasen, Robert J.; Harris, Chris H.; Lala, Jaynarayan H.; Masotto, Thomas K.; Nagle, Gail A.; Prizant, Mark J.; Treadwell, Steven

    1994-01-01

    The Army Avionics Research and Development Activity (AVRADA) is pursuing programs that would enable effective and efficient management of the large amounts of situational data that occur during tactical rotorcraft missions. The Computer Aided Low Altitude Night Helicopter Flight Program has identified automated Terrain Following/Terrain Avoidance, Nap of the Earth (TF/TA, NOE) operation as a key enabling technology for advanced tactical rotorcraft to enhance mission survivability and mission effectiveness. The processing of critical information at low altitudes with short reaction times is life-critical and mission-critical, necessitating an ultra-reliable/high throughput computing platform for dependable service for flight control, fusion of sensor data, route planning, near-field/far-field navigation, and obstacle avoidance operations. To address these needs the Army Fault Tolerant Architecture (AFTA) is being designed and developed. This computer system is based upon the Fault Tolerant Parallel Processor (FTPP) developed by Charles Stark Draper Labs (CSDL). AFTA is a hard real-time, Byzantine fault-tolerant parallel processor which is programmed in the Ada language. This document describes the results of the Detailed Design (Phase 2 and 3 of a 3-year project) of the AFTA development. This document contains detailed descriptions of the program objectives, the TF/TA NOE application requirements, architecture, hardware design, operating systems design, systems performance measurements and analytical models.

  13. A Study on Fault-Tolerant Software Architecture for COTS-Based Dependable System

    International Nuclear Information System (INIS)

    Recently, with the rapid development of digital computers and information processing technologies, nuclear instrument and control (I and C) systems which need safety-critical functions have adopted digital technologies. Also, the use of commercial off-the-shelf (COTS) software in safety-critical systems has increased for several reasons, such as economic efficiency and technical problems. But it requires a considerable integration effort and brings about software quality and safety issues. COTS software is usually provided as a black box that cannot be modified. The biggest problem when we integrate such a product into dependable systems is the reliability of COTS software. There is no guarantee that the software will perform its function correctly. It may have bugs or unidentified components. Recently, the method of software verification and validation (V and V) has been accepted as a way to assure the dependability of newly developed safety-critical nuclear I and C software. But, because of the limitations of COTS software, software V and V cannot be applied as rigorously as to newly developed software. Considerable attention has been paid to describing software architecture with respect to its dependability properties. In this paper, we present a fault-tolerant software architecture using the C2 architectural style. The remainder of the paper is organized as follows: Section 2 discusses background work on the COTS software in nuclear I and C, software fault tolerance and C2 architectural style. Section 3 describes the architecture for fault-tolerant COTS-based software. Finally, we discuss the conclusion and future work

  14. Safety Verification of a Fault Tolerant Reconfigurable Autonomous Goal-Based Robotic Control System

    Science.gov (United States)

    Braman, Julia M. B.; Murray, Richard M; Wagner, David A.

    2007-01-01

    Fault tolerance and safety verification of control systems are essential for the success of autonomous robotic systems. A control architecture called Mission Data System (MDS), developed at the Jet Propulsion Laboratory, takes a goal-based control approach. In this paper, a method for converting goal network control programs into linear hybrid systems is developed. The linear hybrid system can then be verified for safety in the presence of failures using existing symbolic model checkers. An example task is simulated in MDS and successfully verified using HyTech, a symbolic model checking software for linear hybrid systems.

  15. Application of Joint Parameter Identification and State Estimation to a Fault-Tolerant Robot System

    DEFF Research Database (Denmark)

    Sun, Zhen; Yang, Zhenyu

    2011-01-01

    The joint parameter identification and state estimation technique is applied to develop a fault-tolerant space robot system. The potential faults in the considered system are abrupt parametric faults, which indicate that some system parameters will immediately deviate from their nominal values if a fault happens. The concerned system parameters consist of deterministic parts as well as those describing the stochastic features in the system. Due to the purpose for design of reconfigurable control, these deviated system parameters need to be identified as precisely and quickly as possible. Meanwhile, it would further simplify the reconfigurable design task and possibly speed up the system recovery, if the system state information under the new operating circumstance can be available along with faulty parameter information. The joint parameter identification and state estimation using the combined Kalman Filter and Maximum Likelihood (KF-ML) techniques is discussed and applied in this study. The simulation results on a space robot system showed that the proposed method is quite promising in providing both faulty parameter information and state estimation in a quick, accurate and robust manner.

  16. Survey On Fault Tolerance In Grid Computing

    Directory of Open Access Journals (Sweden)

    P. Latchoumy

    2011-12-01

    Full Text Available Grid computing is defined as a hardware and software infrastructure that enables coordinated resource sharing within dynamic organizations. In grid computing, the probability of a failure is much greater than in traditional parallel computing. Therefore, fault tolerance is an important property in order to achieve reliability, availability and QOS. In this paper, we give a survey on various fault tolerance techniques, fault management in different systems and related issues. A fault tolerance service deals with various types of resource failures, which include process failure, processor failure and network failures. This survey provides the related research results about fault tolerance in distinct functional areas of grid infrastructure and also gives future directions for fault tolerance techniques, and it is a good reference for researchers.

  17. Fault-tolerant embedded system design and optimization considering reliability estimation uncertainty

    International Nuclear Information System (INIS)

    In this paper, we model embedded system design and optimization, considering component redundancy and uncertainty in the component reliability estimates. The systems being studied consist of software embedded in associated hardware components. Very often, component reliability values are not known exactly. Therefore, for reliability analysis studies and system optimization, it is meaningful to consider component reliability estimates as random variables with associated estimation uncertainty. In this new research, the system design process is formulated as a multiple-objective optimization problem to maximize an estimate of system reliability, and also, to minimize the variance of the reliability estimate. The two objectives are combined by penalizing the variance for prospective solutions. The two most common fault-tolerant embedded system architectures, N-Version Programming and Recovery Block, are considered as strategies to improve system reliability by providing system redundancy. Four distinct models are presented to demonstrate the proposed optimization techniques with or without redundancy. For many design problems, multiple functionally equivalent software versions have failure correlation even if they have been independently developed. The failure correlation may result from faults in the software specification, faults from a voting algorithm, and/or related faults from any two software versions. Our approach considers this correlation in formulating practical optimization models. Genetic algorithms with a dynamic penalty function are applied in solving this optimization problem, and reasonable and interesting results are obtained and discussed
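
    The two architectures named in this abstract, N-Version Programming and Recovery Block, can be illustrated with a minimal sketch. It is not taken from the paper: the function names, the majority-vote rule and the toy "square" versions are assumptions chosen only to show the control flow of each scheme.

```python
from collections import Counter


def n_version(versions, x):
    """N-Version Programming: run every version and return the strict-majority output."""
    outputs = []
    for version in versions:
        try:
            outputs.append(version(x))
        except Exception:
            pass  # a crashed version simply casts no vote
    if outputs:
        value, votes = Counter(outputs).most_common(1)[0]
        if votes > len(versions) // 2:
            return value
    raise RuntimeError("no majority among versions")


def recovery_block(alternates, acceptance_test, x):
    """Recovery Block: try alternates in order until one passes the acceptance test."""
    for alternate in alternates:
        try:
            result = alternate(x)
        except Exception:
            continue  # treat a crash like a failed acceptance test
        if acceptance_test(x, result):
            return result
    raise RuntimeError("all alternates failed the acceptance test")


# Three hypothetical, independently written versions of "square"; one is faulty.
def square_a(x):
    return x * x


def square_b(x):
    return x * x + 1 if x == 3 else x * x  # injected fault at x == 3


def square_c(x):
    return x ** 2


print(n_version([square_a, square_b, square_c], 3))
print(recovery_block([square_b, square_a], lambda x, r: r == x * x, 3))
```

    In this sketch the voter masks the faulty version only because the other two fail independently; the failure correlation discussed in the abstract is exactly the case in which two versions return the same wrong value and outvote the correct one. A real recovery block would also restore saved state before trying the next alternate, which pure functions do not need.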

  18. Robust fault tolerant control based on sliding mode method for uncertain linear systems with quantization.

    Science.gov (United States)

    Hao, Li-Ying; Yang, Guang-Hong

    2013-09-01

    This paper is concerned with the robust fault-tolerant compensation control problem for uncertain linear systems subject to both state and input signal quantization. By successfully incorporating a novel matrix full-rank factorization technique into the sliding surface design, the total failure of certain actuators can be coped with under a special actuator redundancy assumption. In order to compensate for quantization errors, an adjustment range of quantization sensitivity for a dynamic uniform quantizer is given through flexible choices of design parameters. Compared with existing results, the derived inequality condition leads to stronger fault tolerance ability and a much wider scope of applicability. With a static adjustment policy of quantization sensitivity, an adaptive sliding mode controller is then designed to maintain the sliding mode, where the gain of the nonlinear unit vector term is updated automatically to compensate for the effects of actuator faults, quantization errors, exogenous disturbances and parameter uncertainties without the need for a fault detection and isolation (FDI) mechanism. Finally, the effectiveness of the proposed design method is illustrated via a structural-acoustic model of a rocket fairing. PMID:23701895

  19. Fault tolerant, multiplexed control rod position detection and indication system for nuclear power plants

    International Nuclear Information System (INIS)

    The majority of Westinghouse nuclear plants placed in service thus far have incorporated a Rod Position Indication system based upon an analog design philosophy. This system, while meeting all functional and accuracy requirements, has proven somewhat cumbersome, particularly in the area of initial field calibration and maintenance. This paper describes a new Digital Rod Position Indication system (DRPI) developed for use with pressurized water reactors. The system is based upon a digital design philosophy and meets all previous design constraints and environmental requirements. Further, fault tolerance, improved accuracy, interference from adjacent rods and the elimination of adjustments and calibration has been provided

  20. Reliability of computer systems and networks fault tolerance, analysis, and design

    CERN Document Server

    Shooman, Martin L

    2002-01-01

    With computers becoming embedded as controllers in everything from network servers to the routing of subway schedules to NASA missions, there is a critical need to ensure that systems continue to function even when a component fails. In this book, bestselling author Martin Shooman draws on his expertise in reliability engineering and software engineering to provide a complete and authoritative look at fault tolerant computing. He clearly explains all fundamentals, including how to use redundant elements in system design to ensure the reliability of computer systems and networks. Market: Systems

  1. Nonlinear, Adaptive and Fault-tolerant Control for Electro-hydraulic Servo Systems

    DEFF Research Database (Denmark)

    Choux, Martin

    2011-01-01

    Fluid power systems have been in use since 1795 with the first hydraulic press patented by Joseph Bramah and today form the basis of many industries. Electro hydraulic servo systems are fluid power systems controlled in closed-loop. They transform reference input signals into a set of movements in hydraulic actuators (cylinders or motors) by the means of hydraulic fluid under pressure. With the development of computing power and control techniques during the last few decades, they are used increasingly in many industrial fields which require high actuation forces within limited space. However, despite numerous attractive properties, hydraulic systems are always subject to potential leakages in their components, friction variation in their hydraulic actuators and deficiency in their sensors. These violations of normal behaviour reduce the system performances and can lead to system failure if they are not detected early and handled. Moreover, the task of controlling electro hydraulic systems for high performance operations is challenging due to the highly nonlinear behaviour of such systems and the large amount of uncertainties present in their models. This thesis focuses on nonlinear adaptive fault-tolerant control for a representative electro hydraulic servo controlled motion system. The thesis extends existing models of hydraulic systems by considering more detailed dynamics in the servo valve and in the friction inside the hydraulic cylinder. It identifies the model parameters using experimental data from a test bed by analysing both the time response to standard input signals and the variation of the outputs with different excitation frequencies. The thesis also presents a model that accurately describes the static and dynamic normal behaviour of the system.
    Further, in this thesis, a fault detector is designed and implemented on the test bed that successfully diagnoses internal or external leakages, friction variations in the actuator or fault related to pressure sensors. The presented algorithm uses the position and pressure measurements to detect and isolate faults, avoiding missed detection and false alarm. The thesis also develops a high performance adaptive nonlinear controller for the hydraulic system which outperforms comparable linear controllers widely used in the industry. Because of the controller adaptivity, uncertainties in the model parameters can be handled. Moreover, a special attention is given to reduce the complexity of the controller in order to demonstrate its real-time implementation. Finally the thesis combines the techniques developed in fault detection and nonlinear control in order to develop an active fault-tolerant controller for electro hydraulic servo systems. In order to maintain overall service and performances as high as possible when a potential fault occurs, the fault-tolerant controlled system prognoses the fault and changes its controller parameters or structure. The consequences of an unexpected fault are avoided, high availability is ensured and the overall safety in electro hydraulic servo systems is increased.

  2. Fault-Tolerant Process Control Methods and Applications

    CERN Document Server

    Mhaskar, Prashant; Christofides, Panagiotis D

    2013-01-01

    Fault-Tolerant Process Control focuses on the development of general, yet practical, methods for the design of advanced fault-tolerant control systems; these ensure an efficient fault detection and a timely response to enhance fault recovery, prevent faults from propagating or developing into total failures, and reduce the risk of safety hazards. To this end, methods are presented for the design of advanced fault-tolerant control systems for chemical processes which explicitly deal with actuator/controller failures and sensor faults and data losses. Specifically, the book puts forward: · a framework for detection, isolation and diagnosis of actuator and sensor faults for nonlinear systems; · controller reconfiguration and safe-parking-based fault-handling methodologies; · integrated-data- and model-based fault-detection and isolation and fault-tolerant control methods; · methods for handling sensor faults and data losses; and · ...

  3. Algorithmic Based Fault Tolerance Applied to High Performance Computing

    OpenAIRE

    Bosilca, George; Delmas, Remi; Dongarra, Jack; Langou, Julien

    2008-01-01

    We present a new approach to fault tolerance for High Performance Computing systems. Our approach is based on a careful adaptation of the Algorithmic Based Fault Tolerance technique (Huang and Abraham, 1984) to the needs of parallel distributed computation. We obtain a strongly scalable mechanism for fault tolerance. We can also detect and correct errors (bit-flip) on the fly of a computation. To assess the viability of our approach, we have developed a fault tolerant matrix-m...
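
    The checksum idea behind Algorithmic Based Fault Tolerance can be sketched for a small matrix product. This is a serial, integer-only illustration of the classic row/column checksum encoding, not the parallel implementation the abstract refers to; all function names are hypothetical, and errors that hit the checksum entries themselves are deliberately not handled.

```python
def matmul(a, b):
    """Plain dense matrix product on lists of lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def with_column_checksum(a):
    """Append one row that holds the sum of each column of a."""
    return a + [[sum(col) for col in zip(*a)]]


def with_row_checksum(b):
    """Append one column that holds the sum of each row of b."""
    return [row + [sum(row)] for row in b]


def locate_and_correct(c):
    """Find and repair a single corrupted data entry of a full-checksum matrix.

    Returns the (row, column) that was repaired, or None if c is consistent.
    """
    n, m = len(c) - 1, len(c[0]) - 1  # size of the data part
    bad_rows = [i for i in range(n) if sum(c[i][:m]) != c[i][m]]
    bad_cols = [j for j in range(m)
                if sum(c[i][j] for i in range(n)) != c[n][j]]
    if not bad_rows and not bad_cols:
        return None
    if len(bad_rows) == 1 and len(bad_cols) == 1:
        i, j = bad_rows[0], bad_cols[0]
        # Rebuild the entry from the row checksum and the other row entries.
        c[i][j] = c[i][m] - sum(c[i][k] for k in range(m) if k != j)
        return (i, j)
    raise ValueError("error pattern not correctable by this sketch")


a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]
c = matmul(with_column_checksum(a), with_row_checksum(b))
c[0][1] ^= 4                      # inject a single bit flip into one data entry
print(locate_and_correct(c))      # -> (0, 1)
print(c[0][1])                    # -> 22
```

    The product of a column-checksum matrix and a row-checksum matrix is itself a full-checksum matrix, so the check costs one extra row and column rather than a second multiplication. With floating-point data the equality tests would need a tolerance, which this sketch avoids by using integers.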

  4. A Fault Tolerant Mobile Agent Information Retrieval System

    OpenAIRE

    R. Punithavathi; Duraiswamy, K.

    2010-01-01

    Problem statement: Most of the information retrieval systems used only client-server architectures. The client-server model though powerful, had some limitations. In mobile computing environment which has both wired network and wireless networks with limited communication capabilities, the performance of the system was very low. Approach: Mobile agents are considered a suitable technology to develop applications such as information retrieval system for mobile computing environment. Mobile age...

  5. Flatness-based fault tolerant control of a nonlinear MIMO system using algebraic derivative estimation

    OpenAIRE

    Mai, Philipp; Join, Cédric; Reger, Johan

    2007-01-01

    A flatness-based approach to fault tolerant control is proposed. The approach uses the recently published algebraic derivative estimation method for the estimation of those output derivatives that are necessary for determining intermittent actuator faults. The rapid performance of the estimation allows for an accommodation of the control to the fault. Additionally, taking into account the control saturations a novel classification scheme for actuator faults is introduced that exhibits a compr...

  6. Performance Evaluation of SDS Algorithm with Fault Tolerance for Distributed System

    Directory of Open Access Journals (Sweden)

    K.Sathiya Bharathi,

    2012-07-01

    Full Text Available In the recent past, security-sensitive applications, such as electronic transaction processing systems and stock quote update systems, which require high quality of security to guarantee authentication, integrity, and confidentiality of information, have adopted Heterogeneous Distributed Systems (HDS) as their platforms. We systematically design a security-driven scheduling architecture that can dynamically measure the trust level of each node in the system by using differential equations, and introduce SRank to estimate the security overhead of critical tasks using the SDS algorithm. Furthermore, we can achieve high quality of security for applications by using a security-driven scheduling algorithm for DAGs in terms of minimizing the makespan, risk probability, and speedup. In addition, fault tolerance is included using the Security Driven Fault Tolerant Scheduling Algorithm (SDFT) to tolerate N processor failures at one time, and a new global scheduler is introduced to improve the efficiency of the scheduling process. Moreover, SDFT supports a flexible security policy applied to real-time tasks according to their security requirements and considers the effect of security overhead during scheduling. We also observe that the improvement obtained by our algorithm increases as the security-sensitive data of applications increases.

  7. Fault tolerance improvement for queuing systems under stress load

    International Nuclear Information System (INIS)

    Various kinds of queuing information systems (exchange auction systems, web servers, SCADA) face unpredictable situations during operation, when the information flow that must be analyzed and processed rises sharply. Such stress load situations often require human (dispatcher's or administrator's) intervention, which is why the time of the first denial of service is extremely important. A common queuing system architecture is described. Existing approaches to computing resource management are considered. A new late-first-denial-of-service resource management approach is proposed

  8. Transparent reliability model for fault-tolerant safety systems

    International Nuclear Information System (INIS)

    A reliability model is presented which may serve as a tool for identification of cost-effective configurations and operating philosophies of computer-based process safety systems. The main merit of the model is the explicit relationship in the mathematical formulas between failure cause and the means used to improve system reliability such as self-test, redundancy, preventive maintenance and corrective maintenance. A component failure taxonomy has been developed which allows the analyst to treat hardware failures, human failures, and software failures of automatic systems in an integrated manner. Furthermore, the taxonomy distinguishes between failures due to excessive environmental stresses and failures initiated by humans during engineering and operation. Attention has been given to develop a transparent model which provides predictions which are in good agreement with observed system performance, and which is applicable for non-experts in the field of reliability

  9. Active Fault Tolerant Control of Livestock Stable Ventilation System

    OpenAIRE

    Gholami, Mehdi

    2012-01-01

    Modern stables and greenhouses are equipped with different components for providing a comfortable climate for animals and plants. A component malfunction may result in loss of production. Therefore, it is desirable to design a control system which is stable and is able to provide an acceptable degraded performance even in the faulty case. In this thesis, we have designed such controllers for climate control systems for livestock buildings in three steps: Deriving a model for the climate cont...

  10. The Isis project: Fault-tolerance in large distributed systems

    Science.gov (United States)

    Birman, Kenneth P.; Marzullo, Keith

    1993-01-01

    This final status report covers activities of the Isis project during the first half of 1992. During the report period, the Isis effort has achieved a major milestone in its effort to redesign and reimplement the Isis system using Mach and Chorus as target operating system environments. In addition, we completed a number of publications that address issues raised in our prior work; some of these have recently appeared in print, while others are now being considered for publication in a variety of journals and conferences.

  11. Local rollback for fault-tolerance in parallel computing systems

    Science.gov (United States)

    Blumrich, Matthias A. (Yorktown Heights, NY); Chen, Dong (Yorktown Heights, NY); Gara, Alan (Yorktown Heights, NY); Giampapa, Mark E. (Yorktown Heights, NY); Heidelberger, Philip (Yorktown Heights, NY); Ohmacht, Martin (Yorktown Heights, NY); Steinmacher-Burow, Burkhard (Boeblingen, DE); Sugavanam, Krishnan (Yorktown Heights, NY)

    2012-01-24

    A control logic device performs a local rollback in a parallel super computing system. The super computing system includes at least one cache memory device. The control logic device determines a local rollback interval. The control logic device runs at least one instruction in the local rollback interval. The control logic device evaluates whether an unrecoverable condition occurs while running the at least one instruction during the local rollback interval. The control logic device checks whether an error occurs during the local rollback. The control logic device restarts the local rollback interval if the error occurs and the unrecoverable condition does not occur during the local rollback interval.
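
    The decision rule in this abstract (restart the interval if an error occurs and no unrecoverable condition occurred) can be modelled in software as follows. This is a loose sketch only: the patent describes hardware control logic working with cache memory, whereas here a deep copy stands in for the rollback state, and `run_with_local_rollback`, `error_detected` and `max_retries` are hypothetical names introduced for illustration.

```python
import copy


def run_with_local_rollback(state, interval, error_detected, max_retries=3):
    """Run one rollback interval, restarting it after a recoverable error.

    state          -- dict holding the committed state before the interval
    interval       -- list of functions that mutate a working copy of the state;
                      each returns True if its effect cannot be undone locally
    error_detected -- function(attempt) -> bool, reporting a detected error
    Returns (new_state, restarts). The retry budget is an assumption of this
    sketch and is not part of the abstract.
    """
    for attempt in range(max_retries + 1):
        working = copy.deepcopy(state)  # committed state stays untouched
        unrecoverable = False
        for instruction in interval:
            if instruction(working):
                unrecoverable = True
        if not error_detected(attempt):
            return working, attempt     # commit the interval
        if unrecoverable:
            raise RuntimeError("error in an interval that cannot be rolled back locally")
        # Otherwise discard the working copy and restart the interval.
    raise RuntimeError("retry budget exhausted")


def increment(working):
    working["x"] += 1
    return False  # this instruction is safe to replay


committed = {"x": 0}
result, restarts = run_with_local_rollback(
    committed, [increment, increment], lambda attempt: attempt == 0)
print(result, restarts)
```

    Here the simulated error appears only on the first attempt, so the interval is replayed once from the untouched committed state. An instruction that returned True, for example one modelling irreversible I/O, would turn the same error into a failure that has to be handled by some other, non-local recovery mechanism.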

  12. Fault-tolerant Agreement in Synchronous Message-passing Systems

    CERN Document Server

    Raynal, Michel

    2010-01-01

    The present book focuses on the way to cope with the uncertainty created by process failures (crash, omission failures and Byzantine behavior) in synchronous message-passing systems (i.e., systems whose progress is governed by the passage of time). To that end, the book considers fundamental problems that distributed synchronous processes have to solve. These fundamental problems concern agreement among processes (if processes are unable to agree in one way or another in presence of failures, no non-trivial problem can be solved). They are consensus, interactive consistency, k-set agreement an
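
As an illustration of consensus under crash failures in a synchronous system, the sketch below simulates the classic flooding approach: every process broadcasts the set of values it knows for f + 1 rounds and then decides the minimum. It is not taken from the book; the function name `floodset` and the crash schedule format are assumptions made for this example, and omission and Byzantine failures are not modeled.

```python
def floodset(initial, crashes, f):
    """Simulate flooding consensus with at most f crash failures.

    initial: dict mapping process id -> proposed value
    crashes: dict mapping process id -> (round, recipients reached before crashing)
    Returns the decision of every process that did not crash.
    """
    known = {p: {v} for p, v in initial.items()}
    alive = set(initial)
    for rnd in range(1, f + 2):  # rounds 1 .. f + 1
        inbox = {p: set() for p in alive}
        crashed_now = set()
        for sender in alive:
            if sender in crashes and crashes[sender][0] == rnd:
                recipients = crashes[sender][1]  # partial broadcast, then crash
                crashed_now.add(sender)
            else:
                recipients = alive
            for r in recipients:
                if r in inbox:
                    inbox[r] |= known[sender]
        alive -= crashed_now
        for p in alive:
            known[p] |= inbox[p]
    return {p: min(known[p]) for p in alive}


# Process 3 holds the smallest value and crashes in round 1 after reaching only process 0.
proposals = {0: 5, 1: 3, 2: 8, 3: 1}
decisions = floodset(proposals, crashes={3: (1, {0})}, f=1)
```

After round 1 only process 0 knows the value 1; the second round lets it relay that value, which is why f + 1 rounds are used when up to f processes may crash.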

  13. Fault-tolerant quantum computation -- a dynamical systems approach

    CERN Document Server

    Fern, J; Simic, S; Sastry, S; Fern, Jesse; Kempe, Julia; Simic, Slobodan; Sastry, Shankar

    2004-01-01

    We apply a dynamical systems approach to concatenation of quantum error correcting codes, extending and generalizing the results of Rahn et al. [8] to both diagonal and nondiagonal channels. Our point of view is global: instead of focusing on particular types of noise channels, we study the geometry of the coding map as a discrete-time dynamical system on the entire space of noise channels. In the case of diagonal channels, we show that any code with distance at least three corrects (in the infinite concatenation limit) an open set of errors. For CSS codes, we give a more precise characterization of that set. We show how to incorporate noise in the gates, thus completing the framework. We derive some general bounds for noise channels, which allows us to analyze several codes in detail.

  14. Fault tolerant computer control for a Maglev transportation system

    Science.gov (United States)

    Lala, Jaynarayan H.; Nagle, Gail A.; Anagnostopoulos, George

    1994-01-01

    Magnetically levitated (Maglev) vehicles operating on dedicated guideways at speeds of 500 km/hr are an emerging transportation alternative to short-haul air and high-speed rail. They have the potential to offer a service significantly more dependable than air and with less operating cost than both air and high-speed rail. Maglev transportation derives these benefits by using magnetic forces to suspend a vehicle 8 to 200 mm above the guideway. Magnetic forces are also used for propulsion and guidance. The combination of high speed, short headways, stringent ride quality requirements, and a distributed offboard propulsion system necessitates high levels of automation for the Maglev control and operation. Very high levels of safety and availability will be required for the Maglev control system. This paper describes the mission scenario, functional requirements, and dependability and performance requirements of the Maglev command, control, and communications system. A distributed hierarchical architecture consisting of vehicle on-board computers, wayside zone computers, a central computer facility, and communication links between these entities was synthesized to meet the functional and dependability requirements on the maglev. Two variations of the basic architecture are described: the Smart Vehicle Architecture (SVA) and the Zone Control Architecture (ZCA). Preliminary dependability modeling results are also presented.

  15. The NILE system architecture: fault-tolerant, wide-area access to computing and data resources

    International Nuclear Information System (INIS)

    NILE is a multi-disciplinary project building a distributed computing environment for HEP. It provides wide-area, fault-tolerant, integrated access to processing and data resources for collaborators of the CLEO experiment, though the goals and principles are applicable to many domains. NILE has three main objectives: a realistic distributed system architecture design, the design of a robust data model, and a Fast-Track implementation providing a prototype design environment which will also be used by CLEO physicists. This paper focuses on the software and wide-area system architecture design and the computing issues involved in making NILE services highly-available. (author)

  16. On reliability modeling and analysis of ultrareliable fault-tolerant digital systems.

    Science.gov (United States)

    Mathur, F. P.

    1971-01-01

    The processes of protective redundancy, namely, standby replacement (SR) redundancy and hybrid redundancy (a combination of SR and multiple-line voting redundancy), find application in the architecture of fault-tolerant digital computers and enable them to be ultrareliable and self-repairing. The claims to ultrareliability lead to the challenge of quantitatively evaluating and assigning a value to the probability of survival as a function of the mission durations intended. This note presents various mathematical models, and derives and displays quantitative evaluations of system reliability as a function of various mission parameters of interest to the system designer.
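
To make the idea of reliability as a function of mission duration concrete, the sketch below evaluates three textbook configurations. These are standard formulas, not the models derived in the note, and they rest on assumptions stated here: independent modules with a constant failure rate `lam`, a perfect voter, perfect switching, and a spare that cannot fail while dormant. The function names are illustrative.

```python
import math


def simplex(lam, t):
    """Reliability of a single module with constant failure rate lam."""
    return math.exp(-lam * t)


def tmr(lam, t):
    """Triple modular redundancy with a perfect voter: at least 2 of 3 modules work."""
    r = simplex(lam, t)
    return 3 * r**2 - 2 * r**3


def standby_one_spare(lam, t):
    """One active module plus one dormant spare, with perfect detection and switching."""
    return math.exp(-lam * t) * (1 + lam * t)


short_mission = (simplex(1.0, 0.1), tmr(1.0, 0.1), standby_one_spare(1.0, 0.1))
long_mission = (simplex(1.0, 1.0), tmr(1.0, 1.0), standby_one_spare(1.0, 1.0))
```

Under these assumptions voting redundancy helps only while the module reliability stays above 0.5, so for the longer mission the TMR value falls below that of a single module, while the standby configuration remains ahead of both.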

  17. Fault-Tolerant Method in P2P Information Management Systems

    Directory of Open Access Journals (Sweden)

    2012-03-01

    FissionE is a Kautz graph-based infrastructure for P2P information management systems. It has the optimal network diameter for node degree d = 2. To address the degraded routing performance caused by node failures, this paper proposes a fault-tolerant routing algorithm for the FissionE system. The basic idea is to bypass a failed node or link so that FissionE can achieve better routing performance.

  18. Fault-tolerant Cooperative Tasking for Multi-agent Systems

    OpenAIRE

    Karimadini, Mohammad; Lin, Hai

    2011-01-01

    A natural way for cooperative tasking in multi-agent systems is through a top-down design by decomposing a global task into sub-tasks for each individual agent such that the accomplishments of these sub-tasks will guarantee the achievement of the global task. In our previous works [1], [2] we presented necessary and sufficient conditions on the decomposability of a global task automaton between cooperative agents. As a follow-up work, this paper deals with the robustness iss...

  19. Fault Tolerant Neural Network for ECG Signal Classification Systems

    Directory of Open Access Journals (Sweden)

    MERAH, M.

    2011-08-01

    The aim of this paper is to apply a new robust hardware Artificial Neural Network (ANN) to ECG classification systems. This ANN includes a penalization criterion that improves its robustness. Specifically, in this method, the ANN weights are normalized using the auto-prune method. Simulations performed on the MIT-BIH ECG signals have shown that significant robustness improvements are obtained with respect to potential hardware artificial neuron failures. Moreover, we show that the proposed design achieves better generalization performance than the standard back-propagation algorithm.

  20. The cost of software fault tolerance

    Science.gov (United States)

    Migneault, G. E.

    1982-01-01

    The proposed use of software fault tolerance techniques as a means of reducing software costs in avionics and as a means of addressing the issue of system unreliability due to faults in software is examined. A model is developed to provide a view of the relationships among cost, redundancy, and reliability which suggests strategies for software development and maintenance which are not conventional.

  1. Synthesis of Fault-Tolerant Schedules with Transparency/Performance Trade-offs for Distributed Embedded Systems

    DEFF Research Database (Denmark)

    Izosimov, Viacheslav; Pop, Paul

    2006-01-01

    In this paper we present an approach to the scheduling of fault-tolerant embedded systems for safety-critical applications. Processes and messages are statically scheduled, and we use process re-execution for recovering from multiple transient faults. If process recovery is performed such that the operation of other processes is not affected, we call it transparent recovery. Although transparent recovery has the advantages of fault containment, improved debugability and less memory needed to store the fault-tolerant schedules, it will introduce delays that can violate the timing constraints of the application. We propose a novel algorithm for the synthesis of fault-tolerant schedules that can handle the transparency/performance trade-offs imposed by the designer, and makes use of the fault-occurrence information to reduce the overhead due to fault tolerance. We model the application as a conditional process graph, where the fault occurrence information is represented as conditional edges and the transparent recovery is captured using synchronization nodes.

  2. Evaluation of error detection coverage and fault-tolerance of digital plant protection system in nuclear power plants

    International Nuclear Information System (INIS)

    Recently, traditional analog-based safety-related instrumentation and control (I&C) systems in nuclear power plants (NPPs) have been replaced with modern digital-based systems. Due to the digitalization of nuclear I&C systems, safety assessment has become a major issue, as it is crucial to the system's reliability. In the safety assessment of a digitalized system, error detection coverage and fault tolerance are critical factors. For the evaluation, we use a C++ based hardware description instead of a board with integrated circuit components. We select the digital plant protection system (DPPS) in NPPs as the target system. Permanent faults are used as the fault model, and several error detection methods are used to detect errors. The experiment confirmed that the proposed approach can evaluate the error detection coverage and the fault tolerance of the DPPS in NPPs.

  3. Fault tolerant synchronization of chaotic heavy symmetric gyroscope systems versus external disturbances via Lyapunov rule-based fuzzy control.

    Science.gov (United States)

    Farivar, Faezeh; Shoorehdeli, Mahdi Aliyari

    2012-01-01

    In this paper, fault tolerant synchronization of chaotic gyroscope systems versus external disturbances via Lyapunov rule-based fuzzy control is investigated. Taking the general nature of faults in the slave system into account, a new synchronization scheme, namely, fault tolerant synchronization, is proposed, by which the synchronization can be achieved no matter whether the faults and disturbances occur or not. By making use of a slave observer and a Lyapunov rule-based fuzzy control, fault tolerant synchronization can be achieved. Two techniques are considered as control methods: classic Lyapunov-based control and Lyapunov rule-based fuzzy control. On the basis of Lyapunov stability theory and fuzzy rules, the nonlinear controller and some generic sufficient conditions for global asymptotic synchronization are obtained. The fuzzy rules are directly constructed subject to a common Lyapunov function such that the error dynamics of two identical chaotic motions of symmetric gyros satisfy stability in the Lyapunov sense. Two proposed methods are compared. The Lyapunov rule-based fuzzy control can compensate for the actuator faults and disturbances occurring in the slave system. Numerical simulation results demonstrate the validity and feasibility of the proposed method for fault tolerant synchronization. PMID:21868010

  4. Performance, reliability, and queueing analysis of fault-tolerant computer systems

    Energy Technology Data Exchange (ETDEWEB)

    Nicola, V.F.

    1986-01-01

    Fault-tolerant computer systems are characterized by their ability to continue functioning in the presence of a component (or a subsystem) failure, possibly with a degraded performance. Models and techniques are developed for the evaluation of separate and combined measures of performance and reliability in such systems. Changes in the structure of the system due to failures, repair, and/or degradation are modeled as a continuous time Markov or semi-Markov process, referred to as the structure state process. Associated with each structure state is a service rate (or a performance measure) and a discipline for service preemption interaction upon entering that state. The execution of a task on a fault tolerant system is considered. Transform solutions are derived, and techniques are developed to obtain the distribution of the task completion time which yields job-oriented measures. When all structure states are of the preemptive-resume type, i.e., work conserving system behavior, then the cumulative service (or cumulative performance measure) until a given time and the completion time of a given task are dual measures, so that the distribution of any one of them can be obtained from the other. The cumulative performance measure can be specialized to obtain other system-oriented measures.
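
The duality between cumulative service and task completion time in the preemptive-resume case can be seen on a single sample path. The sketch below is an illustration only: it takes one fixed trajectory of the structure state process as a list of (duration, service rate) segments rather than a Markov or semi-Markov model, and the function names are assumptions made for this example.

```python
def cumulative_service(segments, t):
    """Work delivered up to time t along a piecewise-constant service-rate trajectory."""
    done = 0.0
    for duration, rate in segments:
        step = min(duration, t)
        done += step * rate
        t -= step
        if t <= 0:
            break
    return done


def completion_time(segments, work):
    """Time at which `work` units are finished (preemptive-resume: no work is lost)."""
    elapsed = 0.0
    for duration, rate in segments:
        capacity = duration * rate
        if rate > 0 and capacity >= work:
            return elapsed + work / rate
        work -= capacity
        elapsed += duration
    return None  # the task does not finish within this trajectory


# Full speed for 2 time units, down for 1, then degraded at half speed for 4.
trajectory = [(2.0, 1.0), (1.0, 0.0), (4.0, 0.5)]
finish = completion_time(trajectory, 3.0)
served = cumulative_service(trajectory, finish)
```

On this path the task finishes by time t exactly when the cumulative service up to t has reached the task's work requirement, which is the relationship that lets one distribution be obtained from the other.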

  5. The BTeV DAQ and trigger system - some throughput, usability and fault tolerance aspects

    International Nuclear Information System (INIS)

    As presented at the last CHEP conference, the BTeV triggering and data collection pose a significant challenge in construction and operation, generating 1.5 Terabytes/second of raw data from over 30 million detector channels. The authors report on facets of the DAQ and trigger farms. The authors report on the current design of the DAQ, especially its partitioning features to support commissioning of the detector. The authors are exploring collaborations with computer science groups experienced in fault tolerant and dynamic real-time and embedded systems to develop a system to provide the extreme flexibility and high availability required of the heterogeneous trigger farm (about ten thousand DSPs and commodity processors). The authors describe directions in the following areas: system modeling and analysis using the Model Integrated Computing approach to assist in the creation of domain-specific modeling, analysis, and program synthesis environments for building complex, large-scale computer-based systems; System Configuration Management to include compilable design specifications for configurable hardware components, schedules, and communication maps; Runtime Environment and Hierarchical Fault Detection/Management, a system-wide infrastructure for rapidly detecting, isolating, filtering, and reporting faults which will be encapsulated in intelligent active entities (agents) to run on DSPs, L2/3 processors, and other supporting processors throughout the system.

  6. The BTeV DAQ and Trigger System - Some throughput, usability and fault tolerance aspects

    International Nuclear Information System (INIS)

    As presented at the last CHEP conference, the BTeV triggering and data collection pose a significant challenge in construction and operation, generating 1.5 Terabytes/second of raw data from over 30 million detector channels. We report on facets of the DAQ and trigger farms. We report on the current design of the DAQ, especially its partitioning features to support commissioning of the detector. We are exploring collaborations with computer science groups experienced in fault tolerant and dynamic real-time and embedded systems to develop a system to provide the extreme flexibility and high availability required of the heterogeneous trigger farm (about ten thousand DSPs and commodity processors). We describe directions in the following areas: system modeling and analysis using the Model Integrated Computing approach to assist in the creation of domain-specific modeling, analysis, and program synthesis environments for building complex, large-scale computer-based systems; System Configuration Management to include compilable design specifications for configurable hardware components, schedules, and communication maps; Runtime Environment and Hierarchical Fault Detection/Management, a system-wide infrastructure for rapidly detecting, isolating, filtering, and reporting faults which will be encapsulated in intelligent active entities (agents) to run on DSPs, L2/3 processors, and other supporting processors throughout the system.

  7. Fault-tolerant rotary actuator

    Science.gov (United States)

    Tesar, Delbert

    2006-10-17

    A fault-tolerant actuator module, in a single containment shell, containing two actuator subsystems that are either asymmetrically or symmetrically laid out is provided. Fault tolerance in the actuators of the present invention is achieved by the employment of dual sets of equal resources. Dual resources are integrated into single modules, with each having the external appearance and functionality of a single set of resources.

  8. A unified method for analyzing mission reliability for fault tolerant computer systems.

    Science.gov (United States)

    Bricker, J. L.

    1973-01-01

    For fault-tolerant computer systems consisting of multiple classes of modules, a unified method for analyzing mission reliability is proposed and evaluated. The analysis proceeds by generalizing the notions of standby and N modular redundancy into a concept called hybrid-degraded redundancy. The probabilistic evaluation of the unified redundancy concept is then developed to yield, for a given modular class, the joint distribution of success and the number of nonfailed modules from that class, at special times. With this information, a Markov chain analysis gives the reliability of an entire sequence of phases (mission profile).

  9. Over-constrained rigid multibody systems: differential kinematics and fault tolerance

    Science.gov (United States)

    Yi, Yong; McInroy, John E.; Chen, Yixin

    2002-07-01

    Over-constrained parallel manipulators can be used for fault tolerance. This paper derives the differential kinematics and static force model for a general over-constrained rigid multibody system. The result shows that the redundant constraints result in constrained active joints and redundant internal force. By incorporating these constraints, general methods for overcoming stuck legs or even the complete loss of legs are derived. The Stewart platform special case is studied as an example, and the relationship between its forward Jacobian and its inverse Jacobian is also found.

  10. Fault-tolerant control of discrete-time LPV systems using virtual actuators and sensors

    DEFF Research Database (Denmark)

    Tabatabaeipour, S. Mojtaba; Stoustrup, Jakob

    2015-01-01

    This paper introduces a new fault-tolerant control (FTC) method for discrete-time linear parameter varying (LPV) systems. Fault tolerance is achieved without redesigning the nominal controller by inserting a reconfiguration block between the plant and the nominal controller. The reconfiguration block is realized by a virtual actuator and a virtual sensor. The signals from the faulty system are transformed such that its behavior is similar to that of the nominal system from the viewpoint of the controller. The block likewise transforms the controller output for the faulty system so that stability and performance are preserved. Input-to-state stabilizing LPV gains of the virtual actuator and sensor are obtained by solving LMIs. The gains guarantee the input-to-state stability (ISS) of the closed-loop reconfigured system. Moreover, we obtain performance measures in terms of the ISS gains for the virtual actuator, the virtual sensor, and their interconnection. Minimizing these measures is formulated as a convex optimization problem subject to LMI constraints. The effectiveness of the method is demonstrated via a numerical example and stator current control of an induction motor.

  11. Spacecraft fault tolerance: The Magellan experience

    Science.gov (United States)

    Kasuda, Rick; Packard, Donna Sexton

    1993-01-01

    Interplanetary and earth orbiting missions are now imposing unique fault tolerant requirements upon spacecraft design. Mission success is the prime motivator for building spacecraft with fault tolerant systems. The Magellan spacecraft had many such requirements imposed upon its design. Magellan met these requirements by building redundancy into all the major subsystem components and designing the onboard hardware and software with the capability to detect a fault, isolate it to a component, and issue commands to achieve a back-up configuration. This discussion is limited to fault protection, which is the autonomous capability to respond to a fault. The Magellan fault protection design is discussed, as well as the developmental and flight experiences and a summary of the lessons learned.

  12. State of the art on fault-tolerant real time distributed systems

    International Nuclear Information System (INIS)

    The integration of new computerized functions into the control and instrumentation (C&I) systems of power plants, and especially nuclear power plants, imposes increasingly stringent requirements on communication system reliability. While an item of equipment, or even a computer program, can be validated and qualified, no formal qualification procedure is presently imposed on communication networks. This is certainly due to the relative immaturity of these networks, but also to their complexity. For this reason, in the context of preparation for the future PWR 2000 standardized nuclear plants, it seems appropriate to examine fault-tolerant communication systems. Since C&I applications (in the control room) are divided among several computers and must contend with extremely severe time constraints, EDF has undertaken an investigation of fault-tolerant, real-time distributed systems. This paper summarizes the state of the art in the field as it emerged from discussions with computer manufacturers, academics and research workers on related projects. The results obtained were then used to determine trends in "promising" solutions. The paper concludes with recommended study programs for the PCC department of EDF/R&DD for the next few years. (author), 9 figs., 10 refs., 2 annexes

  13. Filtering and fault tolerant control of parameter-varying time-delay systems and applications

    Science.gov (United States)

    Mohammadpour Velni, Javad

    This dissertation addresses some open problems in control systems theory. The problems considered include the dynamic controller and filter design for Linear Parameter Varying (LPV) time-delay systems, the reconfigurable control design in Fault Tolerant Control Systems (FTCS) and fault diagnostics in Diesel engines. In the first part of this thesis, we investigate the problem of designing parameter-dependent filters for output estimation of LPV time-delay systems. The filters are designed such that the filtering error system guarantees an optimum level of H2 or H∞ performance. A state-delay term is included in the filter dynamics to reduce the design conservatism and improve the performance. The Linear Matrix Inequality (LMI)-based synthesis conditions developed for the filter design purposes are categorized into the rate-dependent and delay-dependent conditions which could handle the time-varying state-delay and bounded small delay cases, respectively. Among these two, the latter one is shown to provide a significant reduction in the conservativeness in the filter design. The second part of the thesis examines the analysis and synthesis of Fault Tolerant Control (FTC) systems in an LPV framework. For reconfigurable control design purposes, the information from the Fault Detection and Isolation (FDI) module, which provides an estimate of the fault parameters, is utilized to schedule the controller matrices. We will also present a formulation that incorporates the factor of detection delay in the FTC supervisory system. It is shown that including this delay in the synthesis conditions leads to improved performance and reduced control effort. For analysis of the FTC systems including time-delay, where the fault parameters might be identified inaccurately, we first introduce the notion of brief instability for LPV time-delay systems. In these systems it is possible that the output trajectory converges to zero even though there are parameter trajectories for which the system is locally unstable for a short period of time. Using the analysis conditions for LPV time-delay systems including brief instability, we develop analysis conditions that lead to an explicit formula that indicates how the FTC closed-loop system performance is degraded under the false identification of the fault parameters. The results are validated on a model of a Highly Maneuverable Aircraft Technology (HiMAT) vehicle. The last part of this thesis presents a model-based diagnostic algorithm for the detection and estimation of the internal leak and restriction in the Exhaust Gas Recirculation (EGR) system of Diesel engines. The initial step in the proposed method is the identification of two parameters in a static relationship. As soon as a fault occurs, the identification algorithm provides a change in the coefficients of the static equation. The results of the experimental validation of the diagnostic algorithm are illustrated on data collected from a test cell and using different trucks during the transient cycle. A statistical analysis is also performed to determine the thresholds that capture the normal variability of the healthy system.

  14. Task Migration for Fault-Tolerance in Mixed-Criticality Embedded Systems

    OpenAIRE

    Saraswat, Prabhat Kumar; Pop, Paul; Madsen, Jan

    2009-01-01

    In this paper we are interested in mixed-criticality embedded applications implemented on distributed architectures. Depending on their time-criticality, tasks can be hard or soft real-time and regarding safety-criticality, tasks can be fault-tolerant to transient faults, permanent faults, or have no dependability requirements. We use Earliest Deadline First (EDF) scheduling for the hard tasks and the Constant Bandwidth Server (CBS) for the soft tasks. The CBS parameters determine the quality...

  15. Modeling and Verification for Timing Satisfaction of Fault-Tolerant Systems with Finiteness

    CERN Document Server

    Cheng, Chih-Hong; Esparza, Javier; Knoll, Alois

    2009-01-01

    The increasing use of model-based tools enables further use of formal verification techniques in the context of distributed real-time systems. To avoid state explosion, it is necessary to construct a verification model that focuses on the aspects under consideration. In this paper, we discuss how we construct a verification model for timing analysis in distributed real-time systems. We (1) give observations concerning restrictions of timed automata in modeling these systems, (2) formulate mathematical representations of how to perform model-to-model transformation to derive verification models from system models, and (3) propose some theoretical criteria for reducing the model size. The latter is particularly important because, for the verification of complex systems, an efficient model reflecting the properties of the system under consideration is as important as the verification algorithm itself. Finally, we present an extension of the model-based development tool FTOS, designed to develop fault-tolerant system...

  16. Novel neural networks-based fault tolerant control scheme with fault alarm.

    Science.gov (United States)

    Shen, Qikun; Jiang, Bin; Shi, Peng; Lim, Cheng-Chew

    2014-11-01

    In this paper, the problem of adaptive active fault-tolerant control for a class of nonlinear systems with unknown actuator fault is investigated. The actuator fault is assumed not to take the traditional affine form in the system state variables and control input. The useful property of the basis function of the radial basis function neural network (NN), which will be used in the design of the fault tolerant controller, is explored. Based on the analysis of the design of normal and passive fault tolerant controllers, by using the implicit function theorem, a novel NN-based active fault-tolerant control scheme with fault alarm is proposed. Compared with results in the literature, the fault-tolerant control scheme can minimize the time delay between fault occurrence and accommodation, which is called the time delay due to fault diagnosis, and reduce the adverse effect on system performance. In addition, the FTC scheme has the advantages of a passive fault-tolerant control scheme as well as the traditional active fault-tolerant control scheme's properties. Furthermore, the fault-tolerant control scheme requires no additional fault detection and isolation model, which is necessary in the traditional active fault-tolerant control scheme. Finally, simulation results are presented to demonstrate the efficiency of the developed techniques. PMID:25014982

  17. Scheduling and Optimization of Fault-Tolerant Embedded Systems with Transparency/Performance Trade-Offs

    DEFF Research Database (Denmark)

    Izosimov, Viacheslav; Pop, Paul

    2012-01-01

    In this article, we propose a strategy for the synthesis of fault-tolerant schedules and for the mapping of fault-tolerant applications. Our techniques handle transparency/performance trade-offs and use the fault-occurrence information to reduce the overhead due to fault tolerance. Processes and messages are statically scheduled, and we use process re-execution for recovering from multiple transient faults. We propose a fine-grained transparent recovery, where the property of transparency can be selectively applied to processes and messages. Transparency hides the recovery actions in a selected part of the application so that they do not affect the schedule of other processes and messages. While leading to longer schedules, transparent recovery has the advantage of both improved debuggability and less memory needed to store the fault-tolerant schedules.

  18. Task Migration for Fault-Tolerance in Mixed-Criticality Embedded Systems

    DEFF Research Database (Denmark)

    Saraswat, Prabhat Kumar; Pop, Paul

    2009-01-01

    In this paper we are interested in mixed-criticality embedded applications implemented on distributed architectures. Depending on their time-criticality, tasks can be hard or soft real-time and regarding safety-criticality, tasks can be fault-tolerant to transient faults, permanent faults, or have no dependability requirements. We use Earliest Deadline First (EDF) scheduling for the hard tasks and the Constant Bandwidth Server (CBS) for the soft tasks. The CBS parameters determine the quality of service (QoS) of soft tasks. Transient faults are tolerated using checkpointing with rollback recovery. For tolerating permanent faults in processors, we use task migration, i.e., restarting the safety-critical tasks on other processors. We propose a greedy online heuristic for the migration of safety-critical tasks, in response to permanent faults, and the adjustment of CBS parameters on the target processors, such that the faults are tolerated, the deadlines for the hard real-time tasks are satisfied and the QoS for soft tasks is maximized. The proposed online adaptive approach has been evaluated using several synthetic benchmarks and a real-life case study.
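
A greedy migration step of the kind described above might look like the following. This is a rough sketch and not the authors' heuristic: it places each task from the failed processor on the least loaded surviving processor and uses the uniprocessor EDF utilization bound (total utilization at most 1, for tasks whose deadlines equal their periods) as the admission test. Adjustment of CBS parameters is left out, and the task names, processor names and `migrate_tasks` are hypothetical.

```python
def migrate_tasks(failed_tasks, processors):
    """Greedily re-place tasks from a failed processor.

    failed_tasks: list of (task name, utilization) that must be restarted elsewhere
    processors:   dict mapping surviving processor -> current total utilization
    Returns (placement, resulting load); raises if a task fits nowhere.
    """
    load = dict(processors)
    placement = {}
    # Largest tasks first, each onto the currently least loaded processor.
    for name, utilization in sorted(failed_tasks, key=lambda task: -task[1]):
        target = min(load, key=load.get)
        if load[target] + utilization > 1.0:  # would violate the EDF utilization bound
            raise RuntimeError(f"no surviving processor can accept task {name!r}")
        load[target] += utilization
        placement[name] = target
    return placement, load


placement, load = migrate_tasks(
    failed_tasks=[("brake", 0.25), ("steer", 0.5)],
    processors={"P1": 0.5, "P2": 0.25},
)
```

In a mixed-criticality setting the bandwidth reserved for soft tasks would also count toward each processor's utilization, which is where shrinking the CBS parameters could free capacity for a migrated hard task.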

  19. Fault-tolerant adaptive control for load-following in static space nuclear power systems

    International Nuclear Information System (INIS)

    In this paper the possible use of a dual-loop, model-based adaptive control system for load-following in static space nuclear power systems is investigated. The objective of the fault-tolerant, autonomous control system is to deliver the demanded electric power at the desired voltage level, by appropriately manipulating the neutron power through the control drums. As a result, sufficient thermal power is produced to meet the required demand in the presence of dynamically changing system operating conditions and potential sensor failures. The designed controller is proposed for use in combination with the currently considered shunt regulators, or as a back-up controller when other means of power system control, including some of the sensors, fail.

  20. Analysis and design of algorithm-based fault-tolerant systems

    Science.gov (United States)

    Nair, V. S. Sukumaran

    1990-01-01

    An important consideration in the design of high performance multiprocessor systems is to ensure the correctness of the results computed in the presence of transient and intermittent failures. Concurrent error detection and correction have been applied to such systems in order to achieve reliability. Algorithm Based Fault Tolerance (ABFT) was suggested as a cost-effective concurrent error detection scheme. The research was motivated by the complexity involved in the analysis and design of ABFT systems. To that end, a matrix-based model was developed and, based on that, algorithms for both the design and analysis of ABFT systems are formulated. These algorithms are less complex than the existing ones. In order to reduce the complexity further, a hierarchical approach is developed for the analysis of large systems.
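
    As background for the ABFT idea mentioned above, the following is a minimal sketch of the classic checksum-matrix scheme for matrix multiplication, in which a column-checksum matrix times a row-checksum matrix yields a product whose row and column sums can be verified. It is not the matrix-based design and analysis model developed in this work; the function names are hypothetical, and integer entries are assumed so that checksums compare exactly (floating-point data would need a tolerance).

```python
def with_column_checksum(a):
    """Append a row holding the sum of each column of a."""
    return a + [[sum(col) for col in zip(*a)]]


def with_row_checksum(b):
    """Append to every row of b the sum of that row."""
    return [row + [sum(row)] for row in b]


def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def locate_error(c_full):
    """Return (row, col) of a single corrupted data element, or None if consistent."""
    n, m = len(c_full) - 1, len(c_full[0]) - 1
    bad_rows = [i for i in range(n) if sum(c_full[i][:m]) != c_full[i][m]]
    bad_cols = [j for j in range(m)
                if sum(c_full[i][j] for i in range(n)) != c_full[n][j]]
    if not bad_rows and not bad_cols:
        return None
    if len(bad_rows) == 1 and len(bad_cols) == 1:
        return bad_rows[0], bad_cols[0]
    raise ValueError("checksum mismatch is not a single data-element fault")


def correct(c_full, pos):
    """Rebuild the element at pos from its row checksum (modifies c_full in place)."""
    i, j = pos
    m = len(c_full[0]) - 1
    others = sum(c_full[i][k] for k in range(m) if k != j)
    c_full[i][j] = c_full[i][m] - others
```

    The intersection of the one inconsistent row and the one inconsistent column pinpoints the faulty element, which is why a single error can be corrected as well as detected.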

  1. A Decentralized Service Based Architecture for Fault Tolerant Control

    OpenAIRE

    Li, Rui

    2012-01-01

    Fault Tolerant Control Systems (FTCSs) are control systems including fault tolerant control. These systems are famous for enabling reliability, maintainability and survival ability in safe vehicle design. In some SCANIA Electronic Control Units (ECUs), the ECU's FTCS is based on a centralized fault detector to detect faults and a centralized reconfigurator to reconfigure the system with degraded performance rather than, for example, completely shutting down the engine. However, with the size in...

  2. Fault detection and fault tolerance in robotics

    Science.gov (United States)

    Visinsky, Monica; Walker, Ian D.; Cavallaro, Joseph R.

    1992-01-01

    Robots are used in inaccessible or hazardous environments in order to alleviate some of the time, cost and risk involved in preparing men to endure these conditions. In order to perform their expected tasks, the robots are often quite complex, thus increasing their potential for failures. If men must be sent into these environments to repair each component failure in the robot, the advantages of using the robot are quickly lost. Fault tolerant robots are needed which can effectively cope with failures and continue their tasks until repairs can be realistically scheduled. Before fault tolerant capabilities can be created, methods of detecting and pinpointing failures must be perfected. This paper develops a basic fault tree analysis of a robot in order to obtain a better understanding of where failures can occur and how they contribute to other failures in the robot. The resulting failure flow chart can also be used to analyze the resiliency of the robot in the presence of specific faults. By simulating robot failures and fault detection schemes, the problems involved in detecting failures for robots are explored in more depth.

  3. Multi-agent Platform and Toolbox for Fault Tolerant Networked Control Systems

    Directory of Open Access Journals (Sweden)

    Mário J. G. C. Mendes

    2009-04-01

    Industrial distributed networked control systems use different communication networks to exchange different critical levels of information. Real-time control, fault diagnosis (FDI) and Fault Tolerant Networked Control (FTNC) systems place some of the most stringent demands on data exchange in the communication networks of these networked control systems (NCS). When dealing with large-scale complex NCS, designing FTNC systems is a very difficult task due to the large number of sensors and actuators that are spatially distributed and network connected. To solve this issue, an FTNC platform and toolbox are presented in this paper using simple and verifiable principles coming mainly from a decentralized design based on causal modelling partitioning of the NCS and distributed computing using the multi-agent systems paradigm, allowing the use of agents with well established FTC methodologies or new ones developed taking into account the NCS specificities. The multi-agent platform and toolbox for FTNC systems have been built in the Matlab/Simulink environment, which is nowadays the scientific benchmark for this kind of research. Although the tests have been performed with a simple case, the results are promising and this approach is expected to succeed with more complex processes.

  4. AN ALGORITHM AND SOME NUMERICAL EXPERIMENTS FOR THE SCHEDULING OF TASKS WITH FAULT-TOLERANCY CONSTRAINTS ON HETEROGENEOUS SYSTEMS

    OpenAIRE

    Nakechbandi, Moustafa; Colin, Jean-Yves

    2008-01-01

    In this paper, we propose an efficient scheduling algorithm for problems in which tasks with precedence constraints and communication delays have to be scheduled on a heterogeneous distributed system under a one-fault hypothesis. Based on an extension of the Critical-Path Method CPM/PERT, our algorithm combines an optimal schedule with some additional task duplication to provide fault tolerance. Backup copies are not established for tasks that already have more than one original copy. The ...

  5. Modular, Fault-Tolerant Electronics Supporting Space Exploration Project

    National Aeronautics and Space Administration — Modern electronic systems tolerate only as many point failures as there are redundant system copies, using mere macro-scale redundancy. Fault Tolerant Electronics...

  6. A Constraint Logic Programming Framework for the Synthesis of Fault-Tolerant Schedules for Distributed Embedded Systems

    DEFF Research Database (Denmark)

    Poulsen, Kåre Harbo; Pop, Paul

    2007-01-01

    We present a constraint logic programming (CLP) approach for synthesis of fault-tolerant hard real-time applications on distributed heterogeneous architectures. We address time-triggered systems, where processes and messages are statically scheduled based on schedule tables. We use process re-execution for recovering from multiple transient faults. We propose three scheduling approaches, which each present a trade-off between schedule simplicity and performance, (i) full transparency, (ii) slack sharing and (iii) conditional, and provide various degrees of transparency. We have developed a CLP framework that produces the fault-tolerant schedules, guaranteeing schedulability in the presence of transient faults. We show how the framework can be used to tackle design optimization problems. The proposed approach has been evaluated using extensive experiments.

  7. Coordinated Fault Tolerance for High-Performance Computing

    Energy Technology Data Exchange (ETDEWEB)

    Dongarra, Jack; Bosilca, George; et al.

    2013-04-08

    Our work to meet our goal of end-to-end fault tolerance has focused on two areas: (1) improving fault tolerance in various software currently available and widely used throughout the HEC domain and (2) using fault information exchange and coordination to achieve holistic, systemwide fault tolerance and understanding how to design and implement interfaces for integrating fault tolerance features for multiple layers of the software stack—from the application, math libraries, and programming language runtime to other common system software such as jobs schedulers, resource managers, and monitoring tools.

  8. Optimised sensor selection for control and fault tolerance of electromagnetic suspension systems: a robust loop shaping approach.

    Science.gov (United States)

    Michail, Konstantinos; Zolotas, Argyrios C; Goodall, Roger M

    2014-01-01

    This paper presents a systematic design framework for selecting the sensors in an optimised manner, simultaneously satisfying a set of given complex system control requirements, i.e. optimum and robust performance as well as fault tolerant control for high integrity systems. It is worth noting that optimum sensor selection in control system design is often a non-trivial task. Among all candidate sensor sets, the algorithm explores and separately optimises system performance with all the feasible sensor sets in order to identify fallback options under single or multiple sensor faults. The proposed approach combines modern robust control design, fault tolerant control, multiobjective optimisation and Monte Carlo techniques. Without loss of generality, its efficacy is tested on an electromagnetic suspension system via appropriate realistic simulations. PMID:24041402

  9. A Fault-Tolerant Emergency-Aware Access Control Scheme for Cyber-Physical Systems

    CERN Document Server

    Wu, Guowei; Xia, Feng; Yao, Lin

    2012-01-01

    Access control is an issue of paramount importance in cyber-physical systems (CPS). In this paper, an access control scheme, namely FEAC, is presented for CPS. FEAC can not only provide the ability to control access to data in normal situations, but also adaptively assign emergency-role and permissions to specific subjects and inform subjects without explicit access requests to handle emergency situations in a proactive manner. In FEAC, emergency-group and emergency-dependency are introduced. Emergencies are processed in sequence within the group and in parallel among groups. A priority and dependency model called PD-AGM is used to select the optimal response-action execution path, aiming to eliminate all emergencies that occurred within the system. Fault-tolerant access control policies are used to address failure in emergency management. A case study of the hospital medical care application shows the effectiveness of FEAC.

  10. Fault tolerant operation of switched reluctance machine

    Science.gov (United States)

    Wang, Wei

    The energy crisis and environmental challenges have driven industry towards more energy efficient solutions. With nearly 60% of electricity consumed by various electric machines in the industry sector, advancement in the efficiency of the electric drive system is of vital importance. Adjustable speed drive system (ASDS) provides excellent speed regulation and dynamic performance as well as dramatically improved system efficiency compared with conventional motors without electronic drives. Industry has witnessed tremendous growth in ASDS applications not only as a driving force but also as an electric auxiliary system for replacing bulky and low efficiency auxiliary hydraulic and mechanical systems. With the vast penetration of ASDS, its fault tolerant operation capability is more widely recognized as an important feature of drive performance especially for aerospace, automotive applications and other industrial drive applications demanding high reliability. The Switched Reluctance Machine (SRM), a low cost, highly reliable electric machine with fault tolerant operation capability, has drawn substantial attention in the past three decades. Nevertheless, SRM is not free of fault. Certain faults such as converter faults, sensor faults, winding shorts, eccentricity and position sensor faults are commonly shared among all ASDS. In this dissertation, a thorough understanding of various faults and their influence on transient and steady state performance of SRM is developed via simulation and experimental study, providing necessary knowledge for fault detection and post fault management. Lumped parameter models are established for fast real time simulation and drive control. Based on the behavior of the faults, a fault detection scheme is developed for the purpose of fast and reliable fault diagnosis.
    In order to improve the SRM power and torque capacity under faults, the maximum torque per ampere excitation is conceptualized and validated through theoretical analysis and experiments. With the proposed optimal waveform, torque production is greatly improved under the same Root Mean Square (RMS) current constraint. Additionally, position sensorless operation methods under phase faults are investigated to account for the combination of physical position sensor and phase winding faults. A comprehensive solution for position sensorless operation under single and multiple phase faults is proposed and validated through experiments. Continuous position sensorless operation with seamless transition between various numbers of phase faults is achieved.

  11. Fault Tolerance with Real-Time Java

    OpenAIRE

    Masson, Damien; Midonnet, Serge

    2006-01-01

    After having drawn up a state of the art on the theoretical feasibility of a system of periodic tasks scheduled by a preemptive algorithm at fixed priorities, we show in this article that temporal faults can occur all the same within a theoretically feasible system, that these faults can lead to a failure of the system and that we can use the data calculated during control of admission to install detectors of faults and to define a factor of tolerance. We show then the results obtained on a...

  12. Robot Position Sensor Fault Tolerance

    Science.gov (United States)

    Aldridge, Hal A.

    1997-01-01

    Robot systems in critical applications, such as those in space and nuclear environments, must be able to operate during component failure to complete important tasks. One failure mode that has received little attention is the failure of joint position sensors. Current fault tolerant designs require the addition of directly redundant position sensors which can affect joint design. A new method is proposed that utilizes analytical redundancy to allow for continued operation during joint position sensor failure. Joint torque sensors are used with a virtual passive torque controller to make the robot joint stable without position feedback and improve position tracking performance in the presence of unknown link dynamics and end-effector loading. Two Cartesian accelerometer based methods are proposed to determine the position of the joint. The joint specific position determination method utilizes two triaxial accelerometers attached to the link driven by the joint with the failed position sensor. The joint specific method is not computationally complex and the position error is bounded. The system wide position determination method utilizes accelerometers distributed on different robot links and the end-effector to determine the position of sets of multiple joints. The system wide method requires fewer accelerometers than the joint specific method to make all joint position sensors fault tolerant but is more computationally complex and has lower convergence properties. Experiments were conducted on a laboratory manipulator. Both position determination methods were shown to track the actual position satisfactorily. A controller using the position determination methods and the virtual passive torque controller was able to servo the joints to a desired position during position sensor failure.

  13. Fault Injection Campaign for a Fault Tolerant Duplex Framework

    Science.gov (United States)

    Sacco, Gian Franco; Ferraro, Robert D.; von llmen, Paul; Rennels, Dave A.

    2007-01-01

    Fault tolerance is an efficient approach adopted to avoid or reduce the damage of a system failure. In this work we present the results of a fault injection campaign we conducted on the Duplex Framework (DF). The DF is software developed by the UCLA group [1, 2] that uses a fault tolerant approach and allows two replicas of the same process to run on two different nodes of a commercial off-the-shelf (COTS) computer cluster. A third process, running on a different node, constantly monitors the results computed by the two replicas and restarts the two replica processes if an inconsistency in their computation is detected. This approach is very cost efficient and can be adopted to control processes on spacecraft where the fault rate produced by cosmic rays is not very high.
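
    The compare-and-restart pattern described above can be sketched as follows. This is only an illustration under stated assumptions: the replicas are modelled as plain callables rather than processes on cluster nodes, and `run_duplex`, `make_flaky` and `max_restarts` are hypothetical names that do not come from the Duplex Framework itself.

```python
def run_duplex(task_input, replicas, max_restarts=3):
    """Run the same task on two replicas and rerun both while their results differ.

    replicas is a pair of callables standing in for the two replica processes.
    Returns (agreed_result, number_of_restarts_needed).
    """
    first, second = replicas
    for attempt in range(max_restarts + 1):
        a, b = first(task_input), second(task_input)
        if a == b:
            # The monitor accepts the result only when both replicas agree.
            return a, attempt
    raise RuntimeError("replicas still disagree after the allowed restarts")


def make_flaky(fn, faulty_calls):
    """Wrap fn so that the listed call numbers (0-based) return a corrupted result.

    Used here as a crude stand-in for fault injection.
    """
    state = {"calls": 0}

    def wrapped(x):
        n = state["calls"]
        state["calls"] += 1
        result = fn(x)
        return result + 1 if n in faulty_calls else result

    return wrapped
```

    With only two replicas a mismatch reveals that a fault occurred but not which replica is wrong, which is why both are restarted rather than one being outvoted.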

  14. A Byzantine-Fault Tolerant Self-Stabilizing Protocol for Distributed Clock Synchronization Systems

    Science.gov (United States)

    Malekpour, Mahyar R.

    2006-01-01

    Embedded distributed systems have become an integral part of safety-critical computing applications, necessitating system designs that incorporate fault tolerant clock synchronization in order to achieve ultra-reliable assurance levels. Many efficient clock synchronization protocols do not, however, address Byzantine failures, and most protocols that do tolerate Byzantine failures do not self-stabilize. The Byzantine self-stabilizing clock synchronization algorithms that exist in the literature are based either on unjustifiably strong assumptions about initial synchrony of the nodes or on the existence of a common pulse at the nodes. The Byzantine self-stabilizing clock synchronization protocol presented here does not rely on any assumptions about the initial state of the clocks. Furthermore, there is neither a central clock nor an externally generated pulse system. The proposed protocol converges deterministically, is scalable, and self-stabilizes in a short amount of time. The convergence time is linear with respect to the self-stabilization period. Proofs of the correctness of the protocol as well as the results of formal verification efforts are reported.

  15. Design of fault tolerant control system for individual blade control helicopters

    Science.gov (United States)

    Tamayo, Sergio

    This dissertation presents the development of a fault tolerant control scheme for helicopters fitted with individually controlled blades. This novel approach attempts to improve fault tolerant capabilities of helicopter control system by increasing control redundancy using additional actuators for individual blade input and software re-mixing to obtain nominal or close to nominal conditions under failure. An advanced interactive simulation environment has been developed including modeling of sensor failure, swashplate actuator failure, individual blade actuator failure, and blade delamination to support the design, testing, and evaluation of the control laws. This simulation environment is based on the blade element theory for the calculation of forces and moments generated by the main rotor. This discretized model allows for individual blade analysis, which in turn allows measuring the consequences of a stuck blade, or loss of the surface area of the blade itself, with respect to the dynamics of the whole helicopter. The control laws are based on non-linear dynamic inversion and artificial neural network augmentation, which is a mix of linear and nonlinear methods that compensates for model inaccuracies due to linearization or failure. A stability analysis based on the Lyapunov function approach has shown that bounded tracking error is guaranteed, and under specific circumstances, global stability is guaranteed as well. An analysis over the degrees of freedom of the mechanical system and its impact over the helicopter handling qualities is also performed to measure the degree of redundancy achieved with the addition of individual blade actuators as compared to a classic swashplate helicopter configuration. 
    Mathematical analysis and numerical simulation, using reconfiguration of the individual blade control under failure, have shown that this control architecture can potentially improve the survivability of the aircraft and reduce pilot workload under failure conditions.

  16. A Fault-Tolerant Modulation Method to Counteract the Double Open-Switch Fault in Matrix Converter Drive Systems without Redundant Power Devices

    DEFF Research Database (Denmark)

    Chen, Der-Fa; Nguyen-Duy, Khiem

    2012-01-01

    This paper studies the double open-switch fault issue occurring within the conventional matrix converter driving a three-phase permanent-magnet synchronous motor system and proposes a fault-tolerant solution by introducing a revised modulation strategy. In this switching strategy, the rectifier-stage modulation is adjusted based on the knowledge of the switching logics of the inverter-stage and the operating input voltage sectors. However, the proposed fault-tolerant method does not rely on the assistance of any redundant power devices or any reconfiguration of the matrix converter circuit by means of redundant physical connections. It is shown that different locations of the double open switch affect the availability of the revised modulation. The steady state absolute speed error achieved with the proposed method is 4% of the nominal speed. Experimental results are presented to demonstrate the efficacy of the proposed methods.

  17. Fault tolerance using self-checking building-block computers

    Science.gov (United States)

    Rennels, D. A.

    1978-01-01

    The paper attempts to define and characterize a set of VLSI (very large scale integration) building-block circuits which can be used to combine existing microprocessors and memories into a wide variety of fault-tolerant computing systems. Such VLSI circuits would transform fault-tolerant computing into an off-the-shelf technology and enable its routine use for new applications. The self-checking computer module (SCCM) is the basic component out of which fault-tolerant computer systems are constructed. Several fault-tolerant configurations of SCCM are discussed, including the standby redundant uniprocessor, the voted/hybrid uniprocessor, and the distributed computer network.

  18. Fault Tolerant Environment in web crawler Using Hardware Failure Detection

    OpenAIRE

    Anup Garje, Prof Bhavesh Patel

    2012-01-01

    Fault Tolerant Environment is a complete programming environment for the reliable execution of distributed application programs. Fault Tolerant Distributed Environment encompasses all aspects of modern fault-tolerant distributed computing. The built-in user-transparent error detection mechanism covers processor node crashes and hardware transient failures. The mechanism also integrates user-assisted error checks into the system failure model. The nucleus non-blocking checkpointing mechanism c...

  19. Scheduling and Voltage Scaling for Energy/Reliability Trade-offs in Fault-Tolerant Time-Triggered Embedded Systems

    DEFF Research Database (Denmark)

    Pop, Paul; Poulsen, Kåre Harbo

    2007-01-01

    In this paper we present an approach to the scheduling and voltage scaling of low-power fault-tolerant hard real-time applications mapped on distributed heterogeneous embedded systems. Processes and messages are statically scheduled, and we use process re-execution for recovering from multiple transient faults. Addressing simultaneously energy and reliability is especially challenging because lowering the voltage to reduce the energy consumption has been shown to exponentially increase the number of transient faults. In addition, time-redundancy based fault-tolerance techniques such as re-execution and dynamic voltage scaling-based low-power techniques are competing for the slack in the schedules. Our approach decides the voltage levels and start times of processes and the transmission times of messages, such that the transient faults are tolerated, the timing constraints of the application are satisfied and the energy is minimized. We present a constraint logic programming-based approach which is able to find reliable and schedulable implementations within limited energy and hardware resources. The developed algorithms have been evaluated using extensive experiments.

  20. Simulation Framework for Evaluation of Fault Tolerant Large Dynamic Distributed System

    Directory of Open Access Journals (Sweden)

    Sanjay Bansal

    2012-08-01

    The use of Java-based simulators in the design and development of distributed systems, for evaluating the dependability of algorithms, is valuable because of their efficiency and scalability, and it allows realistic simulation scenarios to be designed. In this work, we propose Saturn, a multithreaded, process-oriented simulation framework designed for modeling large-scale distributed systems. It provides realistic simulation of a wide range of distributed system technologies and is an innovative solution to the problem of evaluating the dependability characteristics of distributed systems. Our solution is based on several proposed extensions to the simulation model of the MONARC simulation framework. These extensions concern fault tolerance and system orchestration mechanisms and are used to assess the reliability and availability of distributed systems. The extended simulation model includes the components needed to describe various actual failure situations and provides mechanisms to evaluate different strategies for replication and redundancy procedures as well as security enforcement mechanisms. The simulator also evaluates the major QoS metrics of the heartbeat-based adaptive failure detection mechanism.
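
    To make the notion of a heartbeat-based adaptive failure detector concrete, here is a minimal sketch in which the timeout adapts to the recently observed heartbeat spacing. It is an assumption-laden illustration, not the mechanism implemented in the simulator: the class name, the sliding-window mean estimator and the fixed safety margin are all hypothetical choices.

```python
from collections import deque


class AdaptiveHeartbeatDetector:
    """Suspect a node when its next heartbeat is later than an adaptive deadline."""

    def __init__(self, window=10, margin=0.5):
        self.arrivals = deque(maxlen=window)  # most recent arrival times
        self.margin = margin                  # safety margin added to the estimate

    def heartbeat(self, now):
        """Record the arrival time of a heartbeat."""
        self.arrivals.append(now)

    def deadline(self):
        """Latest time at which the next heartbeat is still considered on time.

        Returns None until at least two heartbeats have been seen.
        """
        if len(self.arrivals) < 2:
            return None
        times = list(self.arrivals)
        gaps = [later - earlier for earlier, later in zip(times, times[1:])]
        mean_gap = sum(gaps) / len(gaps)
        return times[-1] + mean_gap + self.margin

    def suspects(self, now):
        """True if the monitored node should currently be suspected of failure."""
        limit = self.deadline()
        return limit is not None and now > limit
```

    A larger margin lowers the rate of false suspicions at the price of slower detection, which is the kind of QoS trade-off such a detector is usually evaluated on.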

  1. Fault-tolerant Control of Discrete-time LPV systems using Virtual Actuators and Sensors

    DEFF Research Database (Denmark)

    Tabatabaeipour, Mojtaba; Stoustrup, Jakob

    2015-01-01

    This paper proposes a new fault-tolerant control (FTC) method for discrete-time linear parameter varying (LPV) systems using a reconfiguration block. The basic idea of the method is to achieve the FTC goal without re-designing the nominal controller by inserting a reconfiguration block between the plant and the nominal controller. The reconfiguration block is realized by an LPV virtual actuator and an LPV virtual sensor. Its goal is to transform the signals from the faulty system such that its behavior is similar to that of the nominal system from the viewpoint of the controller. Furthermore, it transforms the output of the controller for the faulty system such that the stability and performance goals are preserved. Input-to-state stabilizing LPV gains of the virtual actuator and sensor are obtained by solving linear matrix inequalities (LMIs). We show that separate design of these gains guarantees the input-to-state stability (ISS) of the closed-loop reconfigured system. Moreover, we obtain performances in terms of the ISS gains for the virtual actuator, the virtual sensor and their interconnection. Minimizing these performances is formulated as convex optimization problems subject to LMI constraints. Finally, the effectiveness of the method is demonstrated via a numerical example and stator current control of an induction motor.

  2. High Speed, High Temperature, Fault Tolerant Operation of a Combination Magnetic-Hydrostatic Bearing Rotor Support System for Turbomachinery

    Science.gov (United States)

    Jansen, Mark; Montague, Gerald; Provenza, Andrew; Palazzolo, Alan

    2004-01-01

    Closed loop operation of a single, high temperature magnetic radial bearing to 30,000 RPM (2.25 million DN) and 540 C (1000 F) is discussed. Also, high temperature, fault tolerant operation for the three axis system is examined. A novel, hydrostatic backup bearing system was employed to attain high speed, high temperature, lubrication free support of the entire rotor system. The hydrostatic bearings were made of a high lubricity material and acted as journal-type backup bearings. New, high temperature displacement sensors were successfully employed to monitor shaft position throughout the entire temperature range and are described in this paper. Control of the system was accomplished through a stand alone, high speed computer controller and it was used to run both the fault-tolerant PID and active vibration control algorithms.

  3. Fault-tolerant integrated system for flight vehicle navigation and control

    Science.gov (United States)

    Mitroshin, E. I.; Glinskii, V. A.; Gorbatenko, V. V.; Moiseenko, V. E.; Nikitin, A. I.

    1989-10-01

    A method is proposed for the synthesis of a fault-tolerant control system for flight vehicle (FV) descent and prelanding maneuvering, based on checking running and forecasted values of measurement misclosures and the functional redundancy of the autonomous information and measurement channels. The descent is divided into two parts: one with an autonomous standard loop, and a second with an integrated system to correct errors of the vertical channel of the inertial navigation system. An algorithm for navigation and guidance takes advantage of the onboard analytical and informational redundancy and isolates gradual and abrupt sensor failures. Structures of standby loops are proposed. In synthesizing standard loops, the necessity to meet contradictory requirements is taken into account. For the autonomous part, the problem of state and parameter estimation is solved by lower-order identificator theory synthesis methods. The controller and observer coefficients for both parts are determined on the basis of direct and inverse solution of control and estimation problems. This approach makes it possible to coordinate dynamics of processes and to ensure process stability as well as desired precision of guidance.

  4. Fault-tolerance experiments with the JPL STAR computer.

    Science.gov (United States)

    Avizienis, A.; Rennels, D. A.

    1972-01-01

    Results of fault-tolerance experiments performed using an experimental computer with dynamic (standby) redundancy, including replaceable subsystems and a 'program rollback' provision to eliminate transient-caused errors. After a brief review of the specification of fault-tolerance with respect to transient faults, including a description of the method of injection of transient faults in software and system tests, fault-tolerance experiments carried out with this computer with regard to the determination of fault classes, software verification, system verification, and recovery stability are summarized. A test and repair processor is described which constitutes a special monitor unit of the computer and is used to obtain information for fault detection in the other subsystems of the computer and to ensure that proper recovery occurs when a fault is detected.

  5. Aspect-Oriented Approach for the Improvement of the Reliability and Time Performance of a Fault-Tolerant System

    OpenAIRE

    Khalid Bouragba; Hicham Belhadaoui; Mohammed Ouzzif; Mounir Rifi

    2011-01-01

    The principle of separation of concerns is a basic element of software engineering and allows properties to be divided into progressively smaller parts, so as to master their complexity from the design phase to the implementation phase. This paper proposes the probabilistic assessment of critical fault-tolerant programmed systems to improve the reliability and availability of an embedded system. In addition, to improve their response time, we use separation of concerns approach, functional (behavi...

  6. A Problem-Specific Fault-Tolerance Mechanism for Asynchronous, Distributed Systems

    CERN Document Server

    Iamnitchi, A; Foster, Ian

    2000-01-01

    The idle computers on a local area, campus area, or even wide area network represent a significant computational resource---one that is, however, also unreliable, heterogeneous, and opportunistic. This type of resource has been used effectively for embarrassingly parallel problems but not for more tightly coupled problems. We describe an algorithm that allows branch-and-bound problems to be solved in such environments. In designing this algorithm, we faced two challenges: (1) scalability, to effectively exploit the variably sized pools of resources available, and (2) fault tolerance, to ensure the reliability of services. We achieve scalability through a fully decentralized algorithm, by using a membership protocol for managing dynamically available resources. However, this fully decentralized design makes achieving reliability even more challenging. We guarantee fault tolerance in the sense that the loss of up to all but one resource will not affect the quality of the solution. For propagating information ef...

  7. Research on fault diagnose and fault tolerant control of steam generator based on strong tracking filter

    International Nuclear Information System (INIS)

    In order to further improve the safety of nuclear power plants, based on the nonlinear system with stochastic noise, the strong tracking filter is used to evaluate the sensor fault bias of steam generator control system and reconstruct the sensors output to implement the fault tolerant control. The simulation results demonstrate that this method can evaluate the time-varying sensor fault bias effectively and has great fault tolerant ability, and the methodology employing the strong tracking filter for steam generator fault tolerant control design is effective. (authors)

  8. Software-implemented hardware fault tolerance

    CERN Document Server

    Goloubeva, O; Sonza Reorda, M

    2006-01-01

    Addresses the topic of software-implemented hardware fault tolerance (SIHFT), that is, how to deal with faults affecting the hardware by acting only (or mainly) on the software. This book presents the theory behind software-implemented hardware fault tolerance, as well as the practical aspects of putting it to work on real examples.

  9. Fault Tolerant External Memory Algorithms

    DEFF Research Database (Denmark)

    Jørgensen, Allan Grønlund; Brodal, Gerth Stølting

    2009-01-01

    Algorithms dealing with massive data sets are usually designed for I/O-efficiency, often captured by the I/O model by Aggarwal and Vitter. Another aspect of dealing with massive data is how to deal with memory faults, e.g. captured by the adversary based faulty memory RAM by Finocchi and Italiano. However, current fault tolerant algorithms do not scale beyond the internal memory. In this paper we investigate for the first time the connection between I/O-efficiency in the I/O model and fault tolerance in the faulty memory RAM, and we assume that both memory and disk are unreliable. We show a lower bound on the number of I/Os required for any deterministic dictionary that is resilient to memory faults. We design a static and a dynamic deterministic dictionary with optimal query performance as well as an optimal sorting algorithm and an optimal priority queue. Finally, we consider scenarios where only cells in memory or only cells on disk are corruptible and separate randomized and deterministic dictionaries in the latter.

  10. Mission reliability analysis of fault-tolerant multiple-phased systems

    Energy Technology Data Exchange (ETDEWEB)

    Mo Yuchang [Harbin Institute of Technology, Harbin, Heilongjiang 150001 (China)], E-mail: myc@ftcl.hit.edu.cn; Siewiorek, Daniel [Department of Computer Science and Electrical and Computer Engineering, Carnegie Mellon University, Pittsburgh, PA 15213 (United States); Yang Xiaozong [Harbin Institute of Technology, Harbin, Heilongjiang 150001 (China)

    2008-07-15

    Fault-tolerant multiple-phased systems (FTMPS) are defined as systems whose critical components are independently replicated and whose operational life can be partitioned into a set of disjoint periods, called 'phases'. Because of their deployment in critical applications, their mission reliability analysis is a task of primary relevance to validate the designs. This paper is focused on the reliability analysis of FTMPS with random phase durations, non-exponentially distributed repair activities and different repair policies. For self-repairable FTMPS with a component-level reconfiguration architecture, we derive several efficient formulations from the underlying structure characteristics for their intraphase behavior analysis. We also present a uniform solution framework of the mission reliability for FTMPS with generally distributed phase durations. Compared with existing methods based on deterministic and stochastic Petri nets or Markov regenerative stochastic Petri nets, our approach is simpler in concept and more powerful in computation. Two examples of FTMPS are analyzed to illustrate the advantages of our approach.

  11. Advanced Information Processing System (AIPS)-based fault tolerant avionics architecture for launch vehicles

    Science.gov (United States)

    Lala, Jaynarayan H.; Harper, Richard E.; Jaskowiak, Kenneth R.; Rosch, Gene; Alger, Linda S.; Schor, Andrei L.

    1990-01-01

    An avionics architecture for the advanced launch system (ALS) that uses validated hardware and software building blocks developed under the advanced information processing system program is presented. The AIPS for ALS architecture defined is preliminary, and reliability requirements can be met by the AIPS hardware and software building blocks that are built using the state-of-the-art technology available in the 1992-93 time frame. The level of detail in the architecture definition reflects the level of detail available in the ALS requirements. As the avionics requirements are refined, the architecture can also be refined and defined in greater detail with the help of analysis and simulation tools. A useful methodology is demonstrated for investigating the impact of the avionics suite to the recurring cost of the ALS. It is shown that allowing the vehicle to launch with selected detected failures can potentially reduce the recurring launch costs. A comparative analysis shows that validated fault-tolerant avionics built out of Class B parts can result in lower life-cycle-cost in comparison to simplex avionics built out of Class S parts or other redundant architectures.

  12. A verified model of fault-tolerance

    Science.gov (United States)

    Rushby, John

    1990-01-01

    The main objectives are: a model of a replicated system with exact-match voting; a fault model that includes transients; a theorem that establishes the conditions under which the system provides fault tolerance; a formal specification of the model; and a mechanically checked verification of the theorem that is consonant with the journal-level presentation. Formal specification and verification revealed typos in the original report, exposed an omission in the original proof, led to a stronger theorem and a more elegant proof, and confirmed that the Enhanced Hierarchical Development Methodology (EHDM) has the capability to specify interesting and useful properties in a direct, natural, and readable manner.
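
    As an illustration of the exact-match voting idea mentioned in this abstract, the following minimal sketch returns a result only when a strict majority of replicas produce identical outputs. The function name and the strict-majority rule are assumptions made for illustration; the abstract does not specify the voting rule used in the verified model.

```python
from collections import Counter
from typing import Hashable, Optional, Sequence


def exact_match_vote(outputs: Sequence[Hashable]) -> Optional[Hashable]:
    """Return the value produced by a strict majority of replicas, or None.

    Hypothetical helper: values are compared for exact equality, so two
    replicas that differ in any way count as disagreeing.
    """
    if not outputs:
        return None
    value, count = Counter(outputs).most_common(1)[0]
    # A strict majority is required; ties or scattered values yield no result.
    return value if count > len(outputs) // 2 else None
```

    With three replicas, one faulty output is masked: exact_match_vote([7, 7, 9]) yields 7, while three mutually different outputs yield None.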

  13. Fault-tolerant control for current sensors of doubly fed induction generators based on an improved fault detection method

    DEFF Research Database (Denmark)

    Li, Hui; Yang, Chao

    2014-01-01

    Fault-tolerant control of current sensors is studied in this paper to improve the reliability of a doubly fed induction generator (DFIG). A fault-tolerant control system of current sensors is presented for the DFIG, which consists of a new current observer and an improved current sensor fault detection algorithm. The current observer is constructed by using only voltage signals as inputs. The fault detection algorithm is based on the current observer, in which an adaptive threshold and different fault duration times are considered. The performance of the proposed observer, improved fault detection algorithm, and fault-tolerant control system are investigated by simulation. The results indicate that the outputs of the observer and the sensor are highly coherent. The fault detection algorithm can efficiently detect both soft and hard faults in current sensors, and the fault-tolerant control system can effectively tolerate both types of faults. © 2013 Published by Elsevier Ltd. All rights reserved.
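
    The detection scheme described here (an observer-based residual, an adaptive threshold and a fault duration time) can be sketched roughly as follows. The threshold rule, the parameter names and their default values are assumptions chosen for illustration and are not taken from the paper.

```python
from typing import Iterable, List


def detect_sensor_fault(
    measured: Iterable[float],
    estimated: Iterable[float],
    base_threshold: float = 0.05,
    gain: float = 0.1,
    min_duration: int = 3,
) -> List[bool]:
    """Flag a fault once the residual exceeds an adaptive threshold
    for `min_duration` consecutive samples (all parameters hypothetical)."""
    flags: List[bool] = []
    run_length = 0
    for y, y_hat in zip(measured, estimated):
        residual = abs(y - y_hat)
        # Assumed adaptation rule: the threshold grows with the magnitude of
        # the observer estimate, so large currents do not cause false alarms.
        threshold = base_threshold + gain * abs(y_hat)
        # Count consecutive violations; any in-threshold sample resets the count.
        run_length = run_length + 1 if residual > threshold else 0
        flags.append(run_length >= min_duration)
    return flags
```

    Requiring several consecutive violations trades detection delay for robustness: a single noisy sample never raises the flag, whereas a persistent offset does.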

  14. Comment on "Fault Tolerant analysis for stochastic systems using switching diffusion processes" by Yang, Jiang and Cocquempot

    DEFF Research Database (Denmark)

    Schiøler, Henrik; Leth, John-Josef

    2011-01-01

    Results are given in Yang, Jiang and Cocquempot (Yang, H., Jiang, B., and Cocquempot, V. (2009), ‘Fault Tolerance Analysis for Stochastic Systems using Switching Diffusion Processes’, International Journal of Control, 82, 1516–1525) regarding the overall stability of switched diffusion processes based on stability properties of separate processes combined through stochastic switching. This article argues two main results to be empty, in that the presented hypotheses are logically inconsistent.

  15. Adapted importance sampling schemes for the simulation of dependability models of Fault-tolerant systems with deferred repair

    OpenAIRE

    Carrasco, Juan A.

    2006-01-01

    This paper targets the simulation of continuous-time Markov chain models of fault-tolerant systems with deferred repair. We start by stating sufficient conditions for a given importance sampling scheme to satisfy the bounded relative error property. Using those sufficient conditions, it is noted that many previously proposed importance sampling techniques such as failure biasing and balanced failure biasing satisfy that property. Then, we adapt the importance sampling schemes failure transiti...

  16. Efficient Fault-Tolerant Strategy Selection Algorithm in Cloud Computing

    Directory of Open Access Journals (Sweden)

    P.Priyanka

    2014-02-01

    Cloud computing is becoming a mainstream feature of information technology, and enterprises increasingly deploy their software systems in the cloud environment. Cloud applications are usually large in scale and contain many distributed cloud components. Building highly reliable cloud applications is a challenging and critical research issue. The growing reliance on information processing systems has increased the significance of their correct and continuous operation even in the presence of faulty components. To address this issue, this paper proposes a framework for building fault-tolerant cloud applications. We first propose fault detection algorithms to identify significant components among the huge number of cloud components. Then, we present an efficient fault-tolerance strategy selection algorithm to determine the most suitable fault-tolerance strategy for each significant component. Software fault tolerance is widely adopted to increase the overall system reliability in critical applications. System reliability can be enhanced by employing functionally equivalent components to tolerate component failures. Three well-known fault-tolerance strategies are introduced, together with formulas for calculating the failure probabilities of the fault-tolerant modules. Our work is mainly directed toward implementing the framework to measure the strength of the fault tolerance service and toward an in-depth analysis of the cost benefits among all the stakeholders. An algorithm is proposed to automatically determine an efficient fault-tolerance strategy for the significant cloud components. Using real failure traces and model, we evaluate the proposed resource provisioning policies to determine their performance, cost and cost efficiency. The experimental results show that by tolerating faults of a small part of the most important components, the reliability of cloud applications can be highly improved.
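
    The abstract mentions formulas for the failure probability of a fault-tolerant module but does not reproduce them. As a hedged illustration, the sketch below gives two textbook formulas under the assumption of independent component failures: a module that fails only when every redundant component fails, and a majority-voted module with an odd number of versions. The function names are hypothetical and the formulas are not claimed to be those used in the paper.

```python
from itertools import combinations
from math import prod
from typing import Sequence


def all_fail_probability(f: Sequence[float]) -> float:
    """Module fails only if every redundant component fails.

    f[i] is the failure probability of component i; failures are
    assumed independent.
    """
    return prod(f)


def majority_fail_probability(f: Sequence[float]) -> float:
    """Majority-voted module (odd number of versions assumed).

    The module fails when more than half of the versions fail.
    """
    n = len(f)
    total = 0.0
    for k in range(n // 2 + 1, n + 1):
        for failed in combinations(range(n), k):
            failed_set = set(failed)
            # Probability that exactly this subset fails and the rest succeed.
            total += prod(
                f[i] if i in failed_set else 1.0 - f[i] for i in range(n)
            )
    return total
```

    For three versions that each fail with probability 0.1, the first formula gives about 0.001 and the second about 0.028, which shows why the choice of strategy per component matters.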

  17. Task Mapping and Bandwidth Reservation for Mixed Hard/Soft Fault-Tolerant Embedded Systems

    DEFF Research Database (Denmark)

    Saraswat, Prabhat Kumar; Pop, Paul

    2010-01-01

    In this paper we are interested in mixed hard/soft real-time fault-tolerant applications mapped on distributed heterogeneous architectures. We use the Earliest Deadline First (EDF) scheduling for the hard real-time tasks and the Constant Bandwidth Server (CBS) for the soft tasks. The bandwidth reserved for the servers determines the quality of service (QoS) for soft tasks. CBS enforces temporal isolation, such that soft task overruns do not affect the timing guarantees of hard tasks. Transient faults in hard tasks are tolerated using checkpointing with rollback recovery. We have proposed a Tabu Search-based approach for task mapping and CBS bandwidth reservation, such that the deadlines for the hard tasks are satisfied, even in the case of transient faults, and the QoS for the soft tasks is maximized. Researchers have used fixed execution time models, such as the worst-case execution times for hard tasks and average execution times for soft tasks. However, we show that by using stochastic execution times for soft tasks, significant improvements can be obtained. The proposed strategy has been evaluated using an extensive set of benchmarks.
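
    To make the role of the reserved bandwidth concrete, the sketch below applies the classical single-processor utilization test for EDF combined with Constant Bandwidth Servers: hard-task utilization plus total server bandwidth must not exceed 1. It assumes implicit deadlines and ignores the checkpointing and recovery overhead that the paper accounts for; the function name and the tuple layout are hypothetical.

```python
from typing import Sequence, Tuple


def edf_cbs_feasible(
    hard_tasks: Sequence[Tuple[float, float]],
    servers: Sequence[Tuple[float, float]],
) -> bool:
    """Utilization test for one processor (illustrative assumption).

    hard_tasks: (worst-case execution time, period) pairs.
    servers:    (budget, server period) pairs; budget/period is the bandwidth.
    """
    hard_utilization = sum(c / t for c, t in hard_tasks)
    server_bandwidth = sum(q / p for q, p in servers)
    # Temporal isolation holds as long as the total does not exceed 1.
    return hard_utilization + server_bandwidth <= 1.0
```

    With hard tasks (1, 4) and (2, 8), half of the processor remains, so a server with bandwidth 0.25 fits while one with bandwidth 0.75 does not.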

  18. Enhancement of Fault Tolerance in Cloud Computing

    OpenAIRE

    Pushpanjali Gupta; Rasmi Ranjan Patra

    2014-01-01

    In recent years researchers have been trying to run scientific applications in the cloud, so that infrastructure cost decreases, teams can span more widely and, ultimately, more innovative ideas for applications emerge. But the cloud is still not as reliable or controllable as the grid. So in the evolving cloud computing environment there is a great need for a fault tolerance mechanism that lets the system work effectively even in the presence of failure. Moreover Big Organizations ar...

  19. A Concept for fault tolerant controllers

    DEFF Research Database (Denmark)

    Niemann, Hans Henrik; Poulsen, Niels Kjølstad

    2009-01-01

    This paper describes a concept for fault tolerant controllers (FTC) based on the YJBK (after Youla, Jabr, Bongiorno and Kucera) parameterization. This controller architecture allows the controller to be changed on-line in the case of faults in the system. In the described FTC concept, a safe mode controller is applied as the basic feedback controller. A controller for normal operation with high performance is obtained by including certain YJBK parameters (transfer functions) in the controller. This allows a fast switch from normal operation to safe mode operation in case of critical faults in the system. The described FTC architecture allows the different feedback controllers to use different sets of sensors and actuators.

  20. Actuator fault diagnosis and fault-tolerant control: Application to the quadruple-tank process

    Science.gov (United States)

    Buciakowski, Mariusz; de Rozprza-Faygel, Michał; Ochałek, Joanna; Witczak, Marcin

    2014-12-01

    The paper focuses on an important problem in modern control systems: robust fault-tolerant control. In particular, the problem is oriented towards a practical application to the quadruple-tank process. The proposed approach starts with a general description of the system and the fault-tolerant strategy, which is composed of a suitably integrated fault estimator and robust controller. The subsequent part of the paper is concerned with the design of the robust controller as well as the proposed fault-tolerant control scheme. To confirm the effectiveness of the proposed approach, the final part of the paper presents experimental results for the considered quadruple-tank process.

  1. SURVNET: A Fault Tolerant Local Area Network

    Science.gov (United States)

    Katz, J. L.; Metcalf, B. D.

    1987-01-01

    In response to the Department of Defense's need to enhance the survivability of command and control systems, The MITRE Corporation developed SURVNET, a survivable fiber optic local area network. The network supports data communications with a fault-tolerant, distributed architecture capable of continued communication despite media failure and node outages. SURVNET is configured as a modified fiber-optic broadcast bus. The physical and data link layers are implemented with a combination of IEEE 802.3 (Ethernet) and an augmented version of IEEE 802.4 token passing bus protocols. Special nodes in the network, incorporating fault-tolerant software, are doubly connected to the fiber bus. Periodically, these nodes broadcast a self-addressed test message to determine if continuity exists on the network segment between the node's two physically separate connections. If a discontinuity is detected, the node utilizes its two connections to bridge between the isolated bus segments.

  2. Fault tolerant process computer for BWR nuclear power plant

    International Nuclear Information System (INIS)

    The realization of a highly fault tolerant process computer system for monitoring and control of nuclear power plants is described. A general methodology for hardware, system, and software design to achieve fault tolerance is first presented; then a detailed description of its implementation in an actual system is given, taking the PODIA system (Plant Operation by Displayed Information and Automation) as an example. Effective Verification and Validation (V and V) methods for software production are also discussed in detail.

  3. Fault Tolerance in Control Architectures for Mobile Robots: Fantasy or Reality?

    OpenAIRE

    Crestani, Didier; Godary-dejean, Karen

    2012-01-01

    Due to the future development of robotic autonomous systems in human environment, the fault tolerance paradigm will be a central issue in robotics. This article presents a survey of fault tolerance concepts, means and implementations in robotic architectures.

  4. Fault Tolerant Ethernet Based Network for Time Sensitive Applications in Electrical Power Distribution Systems

    Directory of Open Access Journals (Sweden)

    Leos Bohac

    2013-01-01

    The paper analyses and experimentally verifies the deployment of Ethernet based network technology to enable fault tolerant and timely exchange of data among a number of high voltage protective relays. These relays use a proprietary serial communication line to exchange real-time data on the state of their high voltage circuitry, facilitating fast protection switching in case of critical failures. The digital serial signal is first fed into a PCM multiplexer, where it is mapped to the corresponding E1 (2 Mbit/s) time division multiplexed signal. The resulting E1 frames are then packetized and sent through the Ethernet control LAN to the opposite PCM demultiplexer, where the same processing is done in reverse, finally sending the signal into the opposite protective relay. The challenge of this setup is to assure very timely delivery of the control information between protective relays even in the case of potential failures of the Ethernet network itself. The tolerance of the Ethernet network to faults is assured using the widespread per-VLAN Rapid Spanning Tree Protocol, potentially extended by 1+1 PCM protection as a valuable option.

  5. Evaluation of Simple Causal Message Logging for Large-Scale Fault Tolerant HPC Systems

    Energy Technology Data Exchange (ETDEWEB)

    Bronevetsky, G; Meneses, E; Kale, L V

    2011-02-25

    The era of petascale computing brought machines with hundreds of thousands of processors. The next generation of exascale supercomputers will make available clusters with millions of processors. In those machines, mean time between failures will range from a few minutes to few tens of minutes, making the crash of a processor the common case, instead of a rarity. Parallel applications running on those large machines will need to simultaneously survive crashes and maintain high productivity. To achieve that, fault tolerance techniques will have to go beyond checkpoint/restart, which requires all processors to roll back in case of a failure. Incorporating some form of message logging will provide a framework where only a subset of processors are rolled back after a crash. In this paper, we discuss why a simple causal message logging protocol seems a promising alternative to provide fault tolerance in large supercomputers. As opposed to pessimistic message logging, it has low latency overhead, especially in collective communication operations. Besides, it saves messages when more than one thread is running per processor. Finally, we demonstrate that a simple causal message logging protocol has a faster recovery and a low performance penalty when compared to checkpoint/restart. Running NAS Parallel Benchmarks (CG, MG and BT) on 1024 processors, simple causal message logging has a latency overhead below 5%.
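
    The contrast drawn here between checkpoint/restart and message logging can be illustrated with a toy model in which only the crashed process rolls back and replays its logged messages. This is plain message logging, not the causal protocol evaluated in the paper; the class and method names are hypothetical, and the log is assumed to survive the crash (for example because it is held by other processes).

```python
from typing import List


class LoggedProcess:
    """Toy process whose state is the running sum of received messages."""

    def __init__(self) -> None:
        self.state = 0
        self.checkpoint = 0
        # Messages received since the last checkpoint; assumed crash-safe.
        self.log: List[int] = []

    def receive(self, message: int) -> None:
        self.log.append(message)  # log first, so the message can be replayed
        self.state += message

    def take_checkpoint(self) -> None:
        self.checkpoint = self.state
        self.log.clear()  # older messages are no longer needed for recovery

    def crash_and_recover(self) -> None:
        # Only this process rolls back; its peers keep running.
        self.state = self.checkpoint
        for message in self.log:
            self.state += message
```

    After a checkpoint at state 3 and two further messages (4 and 5), recovery restores the checkpoint and replays the log, arriving back at state 12 without involving any other process.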

  6. Recovery in fault-tolerant distributed microcontrollers

    Science.gov (United States)

    Hwang, Riki I.-Ming

    A critical problem facing both the government and commercial space programs is the need for lower cost, higher performance and lower power consumption for on-board processing. Special radiation hardened processors have been developed to operate in the space radiation environment, but they are typically one to two orders of magnitude behind the performance of commercial devices, and they consume much more power. Yet there is a need for much greater processing performance in most future space missions. The use of commercial (designated COTS, Commercial Off-the-Shelf) processors in space has been prevented by the fact that the space radiation environment causes an unacceptably high transient error rate---derailing their computations every few hours [MESS 92]. However, protective redundancy can be employed along with the technology of fault-tolerant computing to automatically recover from such errors and thus enable their use. This thesis focuses on one aspect of this problem: the embedded microcontroller, a highly integrated computer system on a single chip that, not unlike those used in modern automobiles, controls the various subsystems that make up a spacecraft. This thesis examines tradeoffs and experiments with design techniques required to implement fault-tolerant distributed networks using embedded microcontroller processing nodes. A new fault-tolerant node architecture was developed that allows differing amounts of redundancy to be employed with minimal design change. This includes a special isolated wire-or output system that allows modules to be powered down to recover from some potentially destructive radiation events (latchup). A novel recovery approach was developed that uses comparison voting for error detection and recovery but also employs a "stable" set of recovery actions to allow recovery if multiple errors or Byzantine behaviors occur. Finally, a redundant intercommunication architecture between embedded processing nodes was developed that provides fault tolerance in communications between them. A testbed has been constructed, a real-time executive has been developed, and a supporting test environment has also been implemented to allow fault-insertion testing of the experimental architecture. Our initial results strongly support the viability of the fault-tolerance approaches we have developed.

  7. Electrical Steering of Vehicles - Fault-tolerant Analysis and Design

    DEFF Research Database (Denmark)

    Blanke, Mogens; Thomsen, Jesper Sandberg

    2006-01-01

    The topic of this paper is systems that need to be designed such that no single fault can cause failure at the overall level. A methodology is presented for analysis and design of fault-tolerant architectures, where diagnosis and autonomous reconfiguration can replace high cost triple redundancy solutions and still meet strict requirements for functional safety. The paper applies graph-based analysis of functional system structure to find a novel fault-tolerant architecture for an electrical steering system, where a dedicated AC-motor design and cheap voltage measurements ensure the ability to detect all relevant faults. The paper shows how active control reconfiguration can accommodate all critical faults, and the fault-tolerant abilities are demonstrated on warehouse truck hardware.

  8. MULTILEVEL CONVERTER STATCOM FAULT TOLERANCE CAPABILITY

    Directory of Open Access Journals (Sweden)

    K.Varalakshmi

    2014-11-01

    This paper addresses the fault tolerance capability of multilevel converters in the STATCOM (static synchronous compensator), which has been utilized as a power system controller for reactive power compensation and voltage regulation improvement. The advantages of the multilevel structure for the STATCOM are: (1) elimination of bulky transformers; (2) reduction of the output harmonic levels by synthesizing a sinusoidal voltage; (3) lower switching losses. The structure has one disadvantage: a higher likelihood of switch failure, due to the increased number of switches. A single switch failure, however, does not necessarily force a (2n + 1)-level STATCOM offline. Even with a reduced number of switches, a STATCOM can still provide a significant range of control by removing the module of the faulted switch and continuing with (2n − 1) levels. This paper introduces an approach to identify the existence of the faulted switch and reconfigure the STATCOM. This approach is illustrated on a 13-level converter STATCOM, and total harmonic distortion is analyzed using MATLAB.

  9. Designing fault-tolerant real-time computer systems with diversified bus architecture for nuclear power plants

    International Nuclear Information System (INIS)

    Fault-tolerant real-time computer (FT-RTC) systems are widely used to perform safe operation of nuclear power plants (NPP) and safe shutdown in the event of any untoward situation. Design requirements for such systems need high reliability, availability, computational ability for measurement via sensors, control action via actuators, data communication and human interface via keyboard or display. All these attributes of FT-RTC systems are required to be implemented using best known methods such as redundant system design using diversified bus architecture to avoid common cause failure, fail-safe design to avoid unsafe failure and diagnostic features to validate system operation. In this context, the system designer must select efficient as well as highly reliable diversified bus architecture in order to realize fault-tolerant system design. This paper presents a comparative study between CompactPCI bus and Versa Module Eurocard (VME) bus architecture for designing FT-RTC systems with switch over logic system (SOLS) for NPP. (author)

  10. A fault fuzzy-ontology for large scale fault-tolerant wireless sensor networks

    OpenAIRE

    Benazzouz, Yazid; Aktouf, Oum-el-kheir; Parissis, Ioannis

    2014-01-01

    Fault tolerance is a key research area for many applications, such as those based on sensor network technologies. In a large scale wireless sensor network (WSN), it becomes important to find new methods for fault tolerance that can meet new application requirements such as the Internet of Things, urban intelligence and observation systems. The challenge is beyond the limit of a single wireless sensor network and concerns multiple widely interconnected sub networks. The domain of fault grows consi...

  11. Fault-tolerant architectures for superconducting qubits

    International Nuclear Information System (INIS)

    In this short review, I draw attention to new developments in the theory of fault tolerance in quantum computation that may give concrete direction to future work in the development of superconducting qubit systems. The basics of quantum error-correction codes, which I will briefly review, have not significantly changed since their introduction 15 years ago. But an interesting picture has emerged of an efficient use of these codes that may put fault-tolerant operation within reach. It is now understood that two-dimensional surface codes, close relatives of the original toric code of Kitaev, can be adapted as shown by Raussendorf and Harrington to effectively perform logical gate operations in a very simple planar architecture, with error thresholds for fault-tolerant operation simulated to be 0.75%. This architecture uses topological ideas in its functioning, but it is not 'topological quantum computation'-there are no non-abelian anyons in sight. I offer some speculations on the crucial pieces of superconducting hardware that could be demonstrated in the next couple of years that would be clear stepping stones towards this surface-code architecture.

  12. A Dynamic Slack Management Technique for Real-Time Distributed Embedded System with Enhanced Fault Tolerance and Resource Constraints

    Directory of Open Access Journals (Sweden)

    Santhi Baskaran

    2011-01-01

    This project work aims to develop a dynamic slack management technique for real-time distributed embedded systems that reduces the total energy consumption while meeting timing, precedence and resource constraints. The proposed Slack Distribution Technique considers a modified Feedback Control Scheduling (FCS) algorithm. This algorithm schedules dependent tasks effectively with precedence and resource constraints. It further minimizes the schedule length and utilizes the available slack to increase the energy efficiency. A fault tolerant mechanism using a deferred-active-backup scheme increases the schedulability and provides reliability to the system.

  13. A tutorial on the CARE III approach to reliability modeling. [of fault tolerant avionics and control systems

    Science.gov (United States)

    Trivedi, K. S.; Geist, R. M.

    1981-01-01

    The CARE III reliability model for aircraft avionics and control systems is described by utilizing a number of examples which frequently use state-of-the-art mathematical modeling techniques as a basis for their exposition. Behavioral decomposition followed by aggregation was used in an attempt to deal with reliability models with a large number of states. A comprehensive set of models of the fault-handling processes in a typical fault-tolerant system was used. These models were semi-Markov in nature, thus removing the usual restrictions of exponential holding times within the coverage model. The aggregate model is a non-homogeneous Markov chain, thus allowing the times to failure to possess Weibull-like distributions. Because of the departures from traditional models, the solution method employed is that of Kolmogorov integral equations, which are evaluated numerically.

  14. Diagnosis and Fault-tolerant Control, 2nd edition.

    DEFF Research Database (Denmark)

    Blanke, Mogens; Kinnaert, Michel

    2006-01-01

    Fault-tolerant control aims at a graceful degradation of the behaviour of automated systems in case of faults. It satisfies the industrial demand for enhanced availability and safety, in contrast to traditional reactions to faults that bring about sudden shutdowns and loss of availability. The book presents effective model-based analysis and design methods for fault diagnosis and fault-tolerant control. Architectural and structural models are used to analyse the propagation of the fault through the process, to test the fault detectability and to find the redundancies in the process that can be used to ensure fault tolerance. Design methods for diagnostic systems and fault-tolerant controllers are presented for processes that are described by analytical models, by discrete-event models or that can be dealt with as quantised systems. Five case studies on pilot processes show the applicability of the presented methods. The theoretical results are illustrated by two running examples used throughout the book. The second edition includes new material about reconfigurable control, diagnosis of nonlinear systems, and remote diagnosis. The application examples are extended by a steering-by-wire system and the air path of a diesel engine, both of which include experimental results. The bibliographical notes at the end of all chapters have been updated. The chapters end with exercises to be used in lectures.

  15. Fault-Tolerant Precision Formation Guidance for Interferometry Project

    National Aeronautics and Space Administration — A methodology is to be developed that will allow the development and implementation of fault-tolerant control system for distributed collaborative spacecraft. The...

  16. Fault tolerant control - a residual based set-up

    DEFF Research Database (Denmark)

    Niemann, Hans Henrik; Poulsen, Niels Kjølstad

    2009-01-01

    A new set-up for fault tolerant control (FTC) for stable systems is presented in this paper. The new set-up is based on a simple implementation of the Youla-Jabr-Bongiorno-Kucera (YJBK) parameterization. This implementation of the YJBK parameterization will allow a direct and simple reconfiguration of the feedback controller. Another central part of fault tolerant control is fault diagnosis. The controller implementation can be applied directly in connection with both passive diagnosis (PFD) as well as with active fault diagnosis (AFD). The presented FTC set-up is investigated with respect to sensor reconfiguration. Actuator reconfiguration can be dealt with in a similar way.

  17. Fault Tolerant Environment in web crawler Using Hardware Failure Detection

    Directory of Open Access Journals (Sweden)

    Anup Garje, Prof. Bhavesh Patel, Dr. B. B. Mesharm

    2012-06-01

    Full Text Available Fault Tolerant Environment is a complete programming environment for the reliable execution of distributed application programs. Fault Tolerant Distributed Environment encompasses all aspects of modern fault-tolerant distributed computing. The built-in user-transparent error detection mechanism covers processor node crashes and hardware transient failures. The mechanism also integrates user-assisted error checks into the system failure model. The nucleus non-blocking checkpointing mechanism combined with a novel low overhead roll forward recovery scheme delivers an efficient, low-overload backup and recovery mechanism for distributed processes. Fault Tolerant Distributed Environment also provides a means of remote automatic process allocation on distributed system nodes. If recovery is not possible, a new microrebooting approach can be used to restore the system to a stable state.

  18. Enhanced Maritime Safety through Diagnosis and Fault Tolerant Control

    DEFF Research Database (Denmark)

    Blanke, Mogens

    2001-01-01

    Faults in steering, navigation instruments or propulsion machinery are serious on a marine vessel since the consequence could be loss of maneuvering ability, with a risk of damage to the vessel, personnel or environment. Early diagnosis and accommodation of faults could enhance safety. Fault-tolerant control is a methodology to help prevent faults from developing into failures. The means include on-line fault diagnosis, automatic condition assessment and calculation of remedial action to avoid hazards. This paper gives an overview of methods to obtain fault-tolerance: fault diagnosis; analysis of properties of a faulty system; means to determine remedial actions. The paper illustrates the techniques by two marine examples, sensor fusion for automatic steering and control of the main engine.

  19. Fault-tolerant dynamic task graph scheduling

    Energy Technology Data Exchange (ETDEWEB)

    Kurt, Mehmet C.; Krishnamoorthy, Sriram; Agrawal, Kunal; Agrawal, Gagan

    2014-11-16

    In this paper, we present an approach to fault tolerant execution of dynamic task graphs scheduled using work stealing. In particular, we focus on selective and localized recovery of tasks in the presence of soft faults. We elicit from the user the basic task graph structure in terms of successor and predecessor relationships. The work stealing-based algorithm to schedule such a task graph is augmented to enable recovery when the data and meta-data associated with a task get corrupted. We use this redundancy, and the knowledge of the task graph structure, to selectively recover from faults with low space and time overheads. We show that the fault tolerant design retains the essential properties of the underlying work stealing-based task scheduling algorithm, and that the fault tolerant execution is asymptotically optimal when task re-execution is taken into account. Experimental evaluation demonstrates the low cost of recovery under various fault scenarios.

  20. Fault-Tolerance Strategies and Probabilistic Guarantees for Real-Time Systems

    OpenAIRE

    Aysan, Hüseyin

    2012-01-01

    Ubiquitous deployment of embedded systems is having a substantial impact on our society, since they interact with our lives in many critical real-time applications. Typically, embedded systems used in safety or mission critical applications (e.g., aerospace, avionics, automotive or nuclear domains) work in harsh environments where they are exposed to frequent transient faults such as power supply jitter, network noise and radiation. They are also susceptible to errors originating from design ...

  1. Fault-tolerant logics for FPGA linux

    International Nuclear Information System (INIS)

    The increasing use of SRAM-based reconfigurable architectures in important areas of research and development (like particle accelerators and space applications) brings with it new effects that are currently only partially addressed. An already well known, but nevertheless important problem of such systems is their susceptibility to radiation, which increases with particle flux and energy. According to current knowledge, errors induced by Single Event Upsets (SEU) and Single Event Transients (SET) are handled exclusively in hardware by the use of spatial and temporal redundancy features. Our field of research is to extend conventional fault tolerance to multiple layers of embedded computer systems, starting with the FPGA bit layer and ending up in the software application layer, to get a maximum of radiation tolerance in systems running FPGA Linux in radiation susceptible environments. Only a collaboration of all these layers is able to create an adequate amount of data security and process integrity.

  2. Fault-tolerance of functionally adaptive and robust manipulator

    International Nuclear Information System (INIS)

    Robots are required to have the ability to adapt their function according to the tasks to be carried out in an unexpected environment, and to execute tasks even if a part of the system malfunctions. Fault tolerance is a significant factor of functional adaptability. In this paper, a fault-tolerant control method with a proxy control strategy for a distributed manipulator is proposed. A Byzantine fault model is assumed in the method, wherein the behavior of the faulty part cannot be predicted. The method focuses on malfunction of the CPU (central processing unit), which is the controller of the manipulator. The method consists of procedures for fault detection, localization, containment, system reconfiguration and error recovery. The fault detection procedure is based on communication using shared memory. A voting algorithm for fault location is proposed. The fault-tolerant control method is implemented in a distributed manipulator with modular architecture, called Fun-ARM (functionally adaptive and robust manipulator). A reaching motion experiment with a CPU pseudo fault is shown, and the proposed fault-tolerant control method is verified. (author)

  3. Robustness and fault tolerance make brains harder to study.

    Science.gov (United States)

    Srinivasan, Shyam; Stevens, Charles F

    2011-01-01

    Brains increase the survival value of organisms by being robust and fault tolerant. That is, brain circuits continue to operate as the organism needs, even when the circuit properties are significantly perturbed. Kispersky and colleagues, in a recent paper in Neural Systems & Circuits, have found that Granger Causality analysis, an important method used to infer circuit connections from the behavior of neurons within the circuit, is defeated by the mechanisms that give rise to this robustness and fault tolerance. PMID:21714944

  4. Optimal Management of Redundant Control Authority for Fault Tolerance

    Science.gov (United States)

    Wu, N. Eva; Ju, Jianhong

    2000-01-01

    This paper is intended to demonstrate the feasibility of a solution to a fault tolerant control problem. It explains, through a numerical example, the design and the operation of a novel scheme for fault tolerant control. The fundamental principle of the scheme was formalized in [5] based on the notion of normalized nonspecificity. The novelty lies with the use of a reliability criterion for redundancy management, and therefore leads to a high overall system reliability.

  5. Extensions to the Parallel Real-Time Artificial Intelligence System (PRAIS) for fault-tolerant heterogeneous cycle-stealing reasoning

    Science.gov (United States)

    Goldstein, David

    1991-01-01

    Extensions to an architecture for real-time, distributed (parallel) knowledge-based systems called the Parallel Real-time Artificial Intelligence System (PRAIS) are discussed. PRAIS strives for transparently parallelizing production (rule-based) systems, even under real-time constraints. PRAIS accomplished these goals (presented at the first annual C Language Integrated Production System (CLIPS) conference) by incorporating a dynamic task scheduler, operating system extensions for fact handling, and message-passing among multiple copies of CLIPS executing on a virtual blackboard. This distributed knowledge-based system tool uses the portability of CLIPS and common message-passing protocols to operate over a heterogeneous network of processors. Results using the original PRAIS architecture over a network of Sun 3's, Sun 4's and VAX's are presented. Mechanisms using the producer-consumer model to extend the architecture for fault-tolerance and distributed truth maintenance initiation are also discussed.

  6. Fault-Tolerant Identification in Wireless Sensor Networks for Maximizing System Lifetime

    OpenAIRE

    Middela Shailaja; AnandaRaj S.P; Poornima S

    2012-01-01

    Wireless Sensor Network (WSN) is used by many applications such as security, command and control and surveillance monitoring. In all such applications, the main application of WSN is sensing data and retrieval of data. There are many WSN systems that are query based. They give responses in a stipulated time based on the user’s query word. However, the WSN has possible sensor faults for it is not reliable and thus the network energy level goes down. It results in reduction of lifetime of network. To ...

  7. A Dynamic Effective Fault Tolerance System in Robotic Manipulator using a Hybrid Neural Network based Controller

    Directory of Open Access Journals (Sweden)

    G. Jiji

    2014-04-01

    Full Text Available Robot manipulators play an important role in the automobile industry, where they are mainly used in gas welding and in the manufacturing and assembly of motor parts. On complex trajectories, the speed of the manipulator is affected at each joint. For that reason, it is necessary to analyze the noise and vibration of the robot's joints in order to predict faults and to improve the control precision of the manipulator. In this study we propose a new fault detection system for robot manipulators. The proposed hybrid fault detection system is based on a fuzzy support vector machine (SVM) and Artificial Neural Networks (ANNs). In this system the decoupled joints are identified and corrected using the fuzzy SVM, with non-linear signals used throughout the processing; the ANNs are used to detect free-swinging and locked joints of the robot, and two types of neural predictors are also employed in the proposed adaptive neural network structure. The simulation results of the hybrid controller demonstrate the feasibility and performance of the methodology.

  8. SEU fault tolerance in artificial neural networks

    International Nuclear Information System (INIS)

    In this paper the authors investigate the robustness of Artificial Neural Networks when encountering transient modification of information bits related to the network operation. These kinds of faults are likely to occur as a consequence of interaction with radiation. Results of tests performed to evaluate the fault tolerance properties of two different digital neural circuits are presented.

  9. Sensorless Fault Tolerant Control for Induction Motors

    OpenAIRE

    Djeghali, Nadia; Ghanes, Malek; Djennoune, Said; Barbot, Jean-pierre

    2013-01-01

    In this paper, a sensorless fault tolerant controller for induction motors is developed. In the proposed approach, a robust controller based on backstepping strategy is designed in order to compensate both the load torque disturbance and the rotor resistance variations caused by the broken rotor bars faults. The proposed approach needs neither fault detection and isolation schemes nor controller reconfiguration. Moreover, to avoid the use of speed and flux sensors, a second order sliding mode ob...

  10. Closed-Loop Evaluation of an Integrated Failure Identification and Fault Tolerant Control System for a Transport Aircraft

    Science.gov (United States)

    Shin, Jong-Yeob; Belcastro, Christine; Khong, Thuan

    2006-01-01

    Formal robustness analysis of aircraft control upset prevention and recovery systems could play an important role in their validation and ultimate certification. Such systems developed for failure detection, identification, and reconfiguration, as well as upset recovery, need to be evaluated over broad regions of the flight envelope or under extreme flight conditions, and should include various sources of uncertainty. To apply formal robustness analysis, formulation of linear fractional transformation (LFT) models of complex parameter-dependent systems is required, which represent system uncertainty due to parameter uncertainty and actuator faults. This paper describes a detailed LFT model formulation procedure from the nonlinear model of a transport aircraft by using a preliminary LFT modeling software tool developed at the NASA Langley Research Center, which utilizes a matrix-based computational approach. The closed-loop system is evaluated over the entire flight envelope based on the generated LFT model which can cover nonlinear dynamics. The robustness analysis results of the closed-loop fault tolerant control system of a transport aircraft are presented. A reliable flight envelope (safe flight regime) is also calculated from the robust performance analysis results, over which the closed-loop system can achieve the desired performance of command tracking and failure detection.

  11. Software reliability through fault-avoidance and fault-tolerance

    Science.gov (United States)

    Vouk, Mladen A.; Mcallister, David F.

    1993-01-01

    Strategies and tools for the testing, risk assessment and risk control of dependable software-based systems were developed. Part of this project consists of studies to enable the transfer of technology to industry, for example the risk management techniques for safety-conscious systems. Theoretical investigations of Boolean and Relational Operator (BRO) testing strategy were conducted for condition-based testing. The Basic Graph Generation and Analysis tool (BGG) was extended to fully incorporate several variants of the BRO metric. Single- and multi-phase risk, coverage and time-based models are being developed to provide additional theoretical and empirical basis for estimation of the reliability and availability of large, highly dependable software. A model for software process and risk management was developed. The use of cause-effect graphing for software specification and validation was investigated. Lastly, advanced software fault-tolerance models were studied to provide alternatives and improvements in situations where simple software fault-tolerance strategies break down.

  12. Formal and Fault Tolerant Design

    OpenAIRE

    Aljer, Ammar; Devienne, Philippe

    2012-01-01

    Software quality and reliability were verified for a long time at the post-implementation level (test, fault scenario ...). The design of embedded systems and digital circuits is more and more complex because of integration density and heterogeneity. Now almost ¾ of the digital circuits contain at least one processor, that is, can execute software code. In other words, co-design is the most usual case and traditional verification by simulation is no longer practical. Moreover, the increase in ...

  13. Fault tolerant control using Gaussian processes and model predictive control

    Directory of Open Access Journals (Sweden)

    Yang Xiaoke

    2015-03-01

    Full Text Available Essential ingredients for fault-tolerant control are the ability to represent system behaviour following the occurrence of a fault, and the ability to exploit this representation for deciding control actions. Gaussian processes seem to be very promising candidates for the first of these, and model predictive control has a proven capability for the second. We therefore propose to use the two together to obtain fault-tolerant control functionality. Our proposal is illustrated by several reasonably realistic examples drawn from flight control.

  14. Algorithm-Based Fault Tolerance Integrated with Replication

    Science.gov (United States)

    Some, Raphael; Rennels, David

    2008-01-01

    In a proposed approach to programming and utilization of commercial off-the-shelf computing equipment, a combination of algorithm-based fault tolerance (ABFT) and replication would be utilized to obtain high degrees of fault tolerance without incurring excessive costs. The basic idea of the proposed approach is to integrate ABFT with replication such that the algorithmic portions of computations would be protected by ABFT, and the logical portions by replication. ABFT is an extremely efficient, inexpensive, high-coverage technique for detecting and mitigating faults in computer systems used for algorithmic computations, but does not protect against errors in logical operations surrounding algorithms.
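
    As a rough illustration of the ABFT idea described above, the sketch below protects a matrix product with row and column checksums, a classic checksum-matrix scheme from the ABFT literature. It is an assumption that this resembles what the record has in mind; the abstract does not specify the scheme, and all function names here are hypothetical. Integer entries are used so that checksum comparisons are exact.

```python
def matmul(a, b):
    """Plain matrix product for lists of lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def with_column_checksum(a):
    """Append a row holding the sum of each column of a."""
    return a + [[sum(col) for col in zip(*a)]]


def with_row_checksum(b):
    """Append to each row of b the sum of that row."""
    return [row + [sum(row)] for row in b]


def find_mismatches(c_full):
    """Return (bad_rows, bad_cols) of the data part whose checksums disagree."""
    n, m = len(c_full) - 1, len(c_full[0]) - 1
    bad_rows = [i for i in range(n) if sum(c_full[i][:m]) != c_full[i][m]]
    bad_cols = [j for j in range(m)
                if sum(c_full[i][j] for i in range(n)) != c_full[n][j]]
    return bad_rows, bad_cols


def correct_single_error(c_full):
    """Repair one corrupted data element in place; return its (row, col) or None."""
    bad_rows, bad_cols = find_mismatches(c_full)
    if not bad_rows and not bad_cols:
        return None
    if len(bad_rows) != 1 or len(bad_cols) != 1:
        raise ValueError("error pattern is not a single data element")
    i, j = bad_rows[0], bad_cols[0]
    m = len(c_full[0]) - 1
    # Rebuild the element from the row checksum and the other row entries.
    c_full[i][j] = c_full[i][m] - sum(c_full[i][k] for k in range(m) if k != j)
    return i, j
```

    Multiplying `with_column_checksum(a)` by `with_row_checksum(b)` yields the product together with its own checksums, so a single corrupted element shows up as exactly one failing row and one failing column. Control flow around the multiplication is not covered by the checksums, which is the gap the record proposes to close with replication.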

  15. Survey of the design and analysis of fault-tolerant computers

    International Nuclear Information System (INIS)

    The art of designing and analyzing fault-tolerant computers is surveyed with special emphasis on problems of analyzing the behavior of computers that have autonomous repair capability. The survey covers the following topics: (1) general issues in computer reliability, (2) fault-tolerance state relations and requirements, (3) computational hierarchy, (4) fault characteristics, (5) fault diagnosis, (6) fault-tolerance schemes for logic networks and machines, (7) fault coverage effects, and (8) fault-tree analysis of coverage. The survey does not include techniques for verifying nonredundant hardware or system software designs or for verifying the correctness of application programs.

  16. Performance evaluation based fault tolerant control with actuator saturation avoidance

    OpenAIRE

    Boussaid, Boumedyen; Aubrun, Christophe; Abdelkrim, Naceur; Ben Gayed, Mohamed

    2011-01-01

    In this paper, a new approach regarding a reconfigured system is proposed to improve the performance of an active fault tolerant control system. The system performance is evaluated with an intelligent index of performance. The reconfiguration mechanism is based on a model predictive controller and reference trajectory management techniques. When an actuator fault occurs in the system, a new degraded reference trajectory is generated and the controller calculates new admissible controls. A con...

  17. SABRE: a bio-inspired fault-tolerant electronic architecture.

    Science.gov (United States)

    Bremner, P; Liu, Y; Samie, M; Dragffy, G; Pipe, A G; Tempesti, G; Timmis, J; Tyrrell, A M

    2013-03-01

    As electronic devices become increasingly complex, ensuring their reliable, fault-free operation is becoming correspondingly more challenging. It can be observed that, in spite of their complexity, biological systems are highly reliable and fault tolerant. Hence, we are motivated to take inspiration from biological systems in the design of electronic ones. In SABRE (self-healing cellular architectures for biologically inspired highly reliable electronic systems), we have designed a bio-inspired fault-tolerant hierarchical architecture for this purpose. As in biology, the foundation for the whole system is cellular in nature, with each cell able to detect faults in its operation and trigger intra-cellular or extra-cellular repair as required. At the next level in the hierarchy, arrays of cells are configured and controlled as function units in a transport triggered architecture (TTA), which is able to perform partial-dynamic reconfiguration to rectify problems that cannot be solved at the cellular level. Each TTA is, in turn, part of a larger multi-processor system which employs coarser grain reconfiguration to tolerate faults that cause a processor to fail. In this paper, we describe the details of operation of each layer of the SABRE hierarchy, and how these layers interact to provide a high systemic level of fault tolerance. PMID:23302298

  18. SABRE: a bio-inspired fault-tolerant electronic architecture

    International Nuclear Information System (INIS)

    As electronic devices become increasingly complex, ensuring their reliable, fault-free operation is becoming correspondingly more challenging. It can be observed that, in spite of their complexity, biological systems are highly reliable and fault tolerant. Hence, we are motivated to take inspiration from biological systems in the design of electronic ones. In SABRE (self-healing cellular architectures for biologically inspired highly reliable electronic systems), we have designed a bio-inspired fault-tolerant hierarchical architecture for this purpose. As in biology, the foundation for the whole system is cellular in nature, with each cell able to detect faults in its operation and trigger intra-cellular or extra-cellular repair as required. At the next level in the hierarchy, arrays of cells are configured and controlled as function units in a transport triggered architecture (TTA), which is able to perform partial-dynamic reconfiguration to rectify problems that cannot be solved at the cellular level. Each TTA is, in turn, part of a larger multi-processor system which employs coarser grain reconfiguration to tolerate faults that cause a processor to fail. In this paper, we describe the details of operation of each layer of the SABRE hierarchy, and how these layers interact to provide a high systemic level of fault tolerance. (paper)

  19. USAGE OF STANDARD PERSONAL COMPUTER PORTS FOR DESIGNING OF THE DOUBLE REDUNDANT FAULT-TOLERANT COMPUTER CONTROL SYSTEMS

    Directory of Open Access Journals (Sweden)

    Rafig SAMEDOV

    2005-01-01

    Full Text Available In this study, the ports of standard personal computers are investigated for the design of fault-tolerant control systems, different structure versions are designed, and a method for choosing an optimal structure is suggested. First, the ÇİFTYAK system is defined and its working principle is determined. Then, the data transmission ports of standard personal computers are classified and analyzed. After that, the structure versions are designed and evaluated according to the data transmission methods used, the number of ports, and the criteria of reliability, performance, truth, control and cost. Finally, a method for choosing the optimal structure version is suggested.

  20. A continuous-time semi-markov bayesian belief network model for availability measure estimation of fault tolerant systems

    Scientific Electronic Library Online (English)

    Moura, Márcio das Chagas; Droguett, Enrique López

    2008-08-01

    Full Text Available In this work, a model is proposed for assessing the availability of fault tolerant systems, based on the integration of continuous-time semi-Markov processes and Bayesian belief networks. This integration results in a hybrid stochastic model that is able to represent the dynamic characteristics of a system as well as to deal with cause-effect relationships among external factors such as environmental and operational conditions. The hybrid model also allows uncertainty propagation on the system availability to be assessed. A numerical procedure is also proposed for solving the state probability equations of semi-Markov processes described in terms of transition rates. The numerical procedure is based on the application of Laplace transforms, which are inverted by the Gaussian quadrature method known as Gauss-Legendre. The hybrid model and numerical procedure are illustrated by means of an application example in the context of fault tolerant systems.

  1. Enhancement of Fault Tolerance in Cloud Computing

    Directory of Open Access Journals (Sweden)

    Pushpanjali Gupta

    2014-08-01

    Full Text Available In recent years, researchers have been trying to run scientific applications in the cloud in order to decrease infrastructure cost, widen the reach of teams and ultimately foster innovative ideas for applications. However, the cloud is still not as reliable or controllable as the grid. In the evolving cloud computing environment there is therefore a great need for fault tolerance mechanisms that let the system work effectively even in the presence of failures. Moreover, big organizations are also opting for hybrid clouds instead of private clouds. In this paper we therefore propose a new framework that makes the cloud usable for scientific applications and makes the public cloud a trustworthy platform. A progressive approach is introduced to achieve high fault tolerance in clouds by enabling a new workflow planning method that balances performance, reliability and cost for critical scientific applications, focusing mainly on the use of distributed resources for serial and concurrent workflow execution.

  2. Type-Directed Compilation for Fault-Tolerant Non-Interference

    OpenAIRE

    Del Tedesco, Filippo; Sands, David; Russo, Alejandro

    2014-01-01

    Environmental noise (e.g. heat, ionized particles, etc.) causes transient faults in hardware, which lead to corruption of stored values. Mission-critical devices require such faults to be mitigated by fault-tolerance, a combination of techniques that aim at preserving the functional behaviour of a system despite the disruptive effects of transient faults. Fault-tolerance typically has a high deployment cost (special hardware might be required to implement it) and provi...

  3. Analysis of an inherently fault tolerant program

    International Nuclear Information System (INIS)

    Software for process-control systems, such as nuclear power plant safety control systems and robots, can be very complex because of the large number of cases which have to be considered. The approach proposed here uses decentralized control concepts and is based on Dijkstra's "relaxation" problem and self-stabilizing systems. The resulting program is inherently fault tolerant of partial hardware failures. Further, often the software is simplified, so that its correctness can be verified more easily. The authors present an overview of the model using a simple control program for a simulated robot as an example. Then they analyze this control program in terms of the degree to which it is decentralized, its partial correctness proof, its convergence proof and its performance. They also discuss some modifications to the basic algorithm.

  4. Fault Tolerant Parallel Filters Based On BCH Codes

    Directory of Open Access Journals (Sweden)

    K. Mohana Krishna

    2015-04-01

    Full Text Available Digital filters are used in signal processing and communication systems. In some cases, the reliability of those systems is critical, and fault tolerant filter implementations are needed. Over the years, many techniques that exploit the filters’ structure and properties to achieve fault tolerance have been proposed. As technology scales, it enables more complex systems that incorporate many filters. In those complex systems, it is common that some of the filters operate in parallel, for example, by applying the same filter to different input signals. Recently, a simple technique that exploits the presence of parallel filters to achieve multiple fault tolerance has been presented. In this brief, that idea is generalized to show that parallel filters can be protected using Bose–Chaudhuri–Hocquenghem (BCH) codes, in which each filter is the equivalent of a bit in a traditional ECC. This new scheme allows more efficient protection when the number of parallel filters is large.
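
    The "each filter is a bit" idea can be sketched with a smaller code than BCH. The example below is an illustrative assumption, not the construction from the record: four identical FIR filters are protected by three redundant filters arranged like the parity checks of a Hamming(7,4) code. Because the filters are linear, a redundant filter fed with the sum of some inputs must output the sum of the corresponding outputs, and the pattern of failing checks points to the faulty filter. All names are hypothetical, at most one faulty filter is assumed, and integer samples keep the comparisons exact.

```python
def fir(coeffs, signal):
    """Direct-form FIR filter: y[n] = sum_k coeffs[k] * signal[n - k]."""
    return [sum(c * signal[n - k] for k, c in enumerate(coeffs) if n - k >= 0)
            for n in range(len(signal))]


# Data filters feeding each of the three check filters (Hamming(7,4) pattern).
CHECK_SETS = [(0, 1, 2), (0, 1, 3), (0, 2, 3)]


def check_inputs(inputs):
    """Inputs of the redundant filters: sample-wise sums of selected data inputs."""
    return [[sum(inputs[i][n] for i in group) for n in range(len(inputs[0]))]
            for group in CHECK_SETS]


def locate_fault(outputs, check_outputs):
    """Return 0-3 for a faulty data filter, 4-6 for a faulty check filter, else None."""
    syndrome = tuple(
        any(sum(outputs[i][n] for i in group) != check_outputs[c][n]
            for n in range(len(check_outputs[c])))
        for c, group in enumerate(CHECK_SETS))
    if not any(syndrome):
        return None
    for i in range(4):
        if syndrome == tuple(i in group for group in CHECK_SETS):
            return i
    # Exactly one check fails: the redundant filter itself is the faulty one.
    return 4 + syndrome.index(True)
```

    Once the faulty data filter is known, its output can be recovered by subtracting the other outputs in a check set from the matching redundant output. A BCH-based arrangement would follow the same principle with a parity-check pattern able to handle more than one fault.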

  5. SIFT - Multiprocessor architecture for Software Implemented Fault Tolerance flight control and avionics computers

    Science.gov (United States)

    Forman, P.; Moses, K.

    1979-01-01

    A brief description of a SIFT (Software Implemented Fault Tolerance) Flight Control Computer with emphasis on implementation is presented. A multiprocessor system that relies on software-implemented fault detection and reconfiguration algorithms is described. A high level of reliability and fault tolerance is achieved by the replication of computing tasks among processing units.

  6. Automating the Addition of Fault Tolerance with Discrete Controller Synthesis

    OpenAIRE

    Girault, Alain; Rutten, Éric

    2009-01-01

    Discrete controller synthesis (DCS) is a formal approach, based on the same state-space exploration algorithms as model-checking. Its interest lies in the ability to obtain automatically systems satisfying by construction formal properties specified a priori. In this paper, our aim is to demonstrate the feasibility of this approach for fault tolerance. We start with a fault intolerant program, modeled as the synchronous parallel composition of finite labeled transition systems; we specify for...

  7. Design and Analysis of a Fault Tolerant Microprocessor Based on Triple Modular Redundancy Using VHDL

    OpenAIRE

    Deepti Shinghal; Dinesh Chandra

    2011-01-01

    There are numerous real-time and operation-critical systems in which failure of the system is unacceptable at any stage of processing. Examples of such systems include ATMs, satellites, spacecraft etc. In this paper a fault tolerant microprocessor is developed by using checker units with a fault secure ALU; to develop the fault secure ALU, parity prediction logic and the two-rail checker method were used. Finally triple modular redundancy is applied to develop a fault tolerant p...
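
    The record describes a hardware design in VHDL; purely to illustrate the voting principle behind triple modular redundancy, here is a minimal software sketch. The function names are hypothetical, and replica outputs are assumed to be hashable values that can be compared for equality.

```python
from collections import Counter


def majority_vote(outputs):
    """Return the value reported by a strict majority of replicas, else raise."""
    value, count = Counter(outputs).most_common(1)[0]
    if 2 * count <= len(outputs):
        raise RuntimeError("no majority among replica outputs")
    return value


def tmr(replicas, *args):
    """Run three (possibly faulty) replicas of one computation and vote."""
    return majority_vote([replica(*args) for replica in replicas])
```

    With three replicas, a single faulty result is outvoted by the two correct ones; if all three disagree the voter reports the failure instead of guessing. The voter itself remains a single point of failure in this sketch.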

  8. Application of a fault-tolerant microprocessor-based core-surveillance system in a German fast breeder reactor

    International Nuclear Information System (INIS)

    For the fast breeder reactor KNK II at Karlsruhe, Germany, a microprocessor-based safety shut-down system is built. Analogous to the triple modular instrumentation, it consists of TMR hardware. Functionally it is split into four blocks which operate in cascade-like fashion. The main functions are mean value calculation, current limit control, trend control, and final evaluation. In order to secure correctness, several constructive and analytical methods are applied for fault avoidance, such as formal specification languages, programming guidelines, software quality assurance plan, validation, verification, and testing. Since additional means for correct and safe operation are still necessary, fault-tolerance and error-detection techniques are applied. These include self-checking programs, plausibility checks, control data, information exchange and control between the redundancies, and especially diversity. This diversity refers to different teams for the different development phases as well as to different tools and environments, like different programming languages for the application software. Three separate but functionally identical programs will be implemented in Iftran, Pascal and PL/M. These will not only be used during the extensive testing period, but also during final operation.

  9. Electronic Power Switch for Fault-Tolerant Networks

    Science.gov (United States)

    Volp, J.

    1987-01-01

    Power field-effect transistors reduce energy waste and simplify interconnections. Current switch containing power field-effect transistor (PFET) placed in series with each load in fault-tolerant power-distribution system. If system includes several loads and supplies, switches placed in series with adjacent loads and supplies. System of switches protects against overloads and losses of individual power sources.

  10. A Primer on Architectural Level Fault Tolerance

    Science.gov (United States)

    Butler, Ricky W.

    2008-01-01

    This paper introduces the fundamental concepts of fault tolerant computing. Key topics covered are voting, fault detection, clock synchronization, Byzantine Agreement, diagnosis, and reliability analysis. Low level mechanisms such as Hamming codes or low level communications protocols are not covered. The paper is tutorial in nature and does not cover any topic in detail. The focus is on rationale and approach rather than detailed exposition.

  11. Measures of Fault Tolerance in Distributed Simulated Annealing

    OpenAIRE

    Prakash, Aaditya

    2012-01-01

    In this paper, we examine the different measures of Fault Tolerance in a Distributed Simulated Annealing process. Optimization by Simulated Annealing on a distributed system is prone to various sources of failure. We analyse simulated annealing algorithm, its architecture in distributed platform and potential sources of failures. We examine the behaviour of tolerant distributed system for optimization task. We present possible methods to overcome the failures and achieve fau...

  12. Learning Fault-tolerant Speech Parsing with SCREEN

    CERN Document Server

    Wermter, S; Wermter, Stefan; Weber, Volker

    1994-01-01

    This paper describes a new approach and a system SCREEN for fault-tolerant speech parsing. SCREEN stands for Symbolic Connectionist Robust EnterprisE for Natural language. Speech parsing describes the syntactic and semantic analysis of spontaneous spoken language. The general approach is based on incremental immediate flat analysis, learning of syntactic and semantic speech parsing, parallel integration of current hypotheses, and the consideration of various forms of speech related errors. The goal for this approach is to explore the parallel interactions between various knowledge sources for learning incremental fault-tolerant speech parsing. This approach is examined in a system SCREEN using various hybrid connectionist techniques. Hybrid connectionist techniques are examined because of their promising properties of inherent fault tolerance, learning, gradedness and parallel constraint integration. The input for SCREEN is hypotheses about recognized words of a spoken utterance potentially analyzed by a spe...

  13. Design of Fault Tolerant Reversible Multiplier

    Directory of Open Access Journals (Sweden)

    H. P. Sinha

    2012-01-01

    In recent years, reversible logic has emerged as a promising technology having its applications in low power CMOS, quantum computing, nanotechnology, and optical computing. The classical set of gates such as AND, OR, and EXOR are not reversible. This paper proposes a novel 4x4 bit reversible fault tolerant multiplier circuit which can multiply two 4-bit numbers. It is faster and has lower hardware complexity compared to the existing designs. In addition, the proposed reversible multiplier is better than the existing counterparts in terms of delay and power. It is based on two concepts. The partial products can be generated in parallel using Fredkin gates and thereafter the addition is done by using reversible parallel adder designed from IG gates. Thus, this paper provides the initial threshold for building more complex systems which can execute more complicated operations using reversible logic.

  14. Method and apparatus for fault tolerance

    Science.gov (United States)

    Masson, Gerald M. (Inventor); Sullivan, Gregory F. (Inventor)

    1993-01-01

    A method and apparatus for achieving fault tolerance in a computer system having at least a first central processing unit and a second central processing unit. The method comprises the steps of first executing a first algorithm in the first central processing unit on input which produces a first output as well as a certification trail. Next, executing a second algorithm in the second central processing unit on the input and on at least a portion of the certification trail which produces a second output. The second algorithm has a faster execution time than the first algorithm for a given input. Then, comparing the first and second outputs such that an error result is produced if the first and second outputs are not the same. The step of executing a first algorithm and the step of executing a second algorithm preferably takes place over essentially the same time period.
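
    The certification-trail idea in this record can be illustrated with sorting. The sketch below is an assumption-laden illustration, not the patented method: the function names and the choice of trail (the sorted order of input indices) are hypothetical, and the two algorithms run one after the other here rather than over the same time period on two processing units.

```python
from typing import List, Sequence, Tuple


def sort_with_trail(data: Sequence[int]) -> Tuple[List[int], List[int]]:
    """First algorithm: sort and also emit a certification trail.

    The trail is the list of input indices in sorted order
    (an illustrative choice made for this sketch).
    """
    trail = sorted(range(len(data)), key=lambda i: data[i])
    return [data[i] for i in trail], trail


def sort_using_trail(data: Sequence[int], trail: Sequence[int]) -> List[int]:
    """Second algorithm: rebuild the output in linear time from the trail.

    The trail is not trusted. It must be a permutation of the indices and
    must yield a non-decreasing sequence, otherwise an error is raised.
    """
    n = len(data)
    if len(trail) != n:
        raise ValueError("trail has wrong length")
    seen = [False] * n
    out: List[int] = []
    for i in trail:
        if not (0 <= i < n) or seen[i]:
            raise ValueError("trail is not a permutation")
        seen[i] = True
        if out and data[i] < out[-1]:
            raise ValueError("trail does not describe a sorted order")
        out.append(data[i])
    return out


def fault_tolerant_sort(data: Sequence[int]) -> List[int]:
    """Run both algorithms and compare their outputs."""
    first, trail = sort_with_trail(data)
    second = sort_using_trail(data, trail)
    if first != second:
        raise RuntimeError("outputs disagree: error detected")
    return second
```

    The second pass is cheaper than a full sort because it only walks the trail once, yet it rejects a corrupted trail instead of silently propagating it.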

  15. An Active Fault-Tolerant PWM Tracker for Unknown Nonlinear Stochastic Hybrid Systems: NARMAX Model and OKID-Based State-Space Self-Tuning Control

    Directory of Open Access Journals (Sweden)

    Shu-Mei Guo

    2010-01-01

    An active fault-tolerant pulse-width-modulated tracker using the nonlinear autoregressive moving average with exogenous inputs model-based state-space self-tuning control is proposed for continuous-time multivariable nonlinear stochastic systems with unknown system parameters, plant noises, measurement noises, and inaccessible system states. Through observer/Kalman filter identification method, a good initial guess of the unknown parameters of the chosen model is obtained so as to reduce the identification process time and enhance the system performances. Besides, by modifying the conventional self-tuning control, a fault-tolerant control scheme is also developed. For the detection of fault occurrence, a quantitative criterion is exploited by comparing the innovation process errors estimated by the Kalman filter estimation algorithm. In addition, the weighting matrix resetting technique is presented by adjusting and resetting the covariance matrix of parameter estimates to improve the parameter estimation for faulty system recovery. The technique can effectively cope with partially abrupt and/or gradual system faults and/or input failures with fault detection.

  16. Fault tolerant control for steam generators in nuclear power plant

    International Nuclear Information System (INIS)

    Based on the nonlinear system with stochastic noise, a bank of extended Kalman filters is used to estimate the state of sensors. It can real-time detect and isolate the single sensor fault, and reconstruct the sensor output to keep steam generator water level stable. The simulation results show that the methodology of employing a bank of extended Kalman filters for steam generator fault tolerant control design is feasible. (authors)

  17. An Active Fault-Tolerant PWM Tracker for Unknown Nonlinear Stochastic Hybrid Systems: NARMAX Model and OKID-Based State-Space Self-Tuning Control

    OpenAIRE

    Shu-Mei Guo; You Lin; Chia-Wei Chen; Jason S. H. Tsai; Chu-Tong Wang; Leang-San Shieh

    2010-01-01

    An active fault-tolerant pulse-width-modulated tracker using the nonlinear autoregressive moving average with exogenous inputs model-based state-space self-tuning control is proposed for continuous-time multivariable nonlinear stochastic systems with unknown system parameters, plant noises, measurement noises, and inaccessible system states. Through observer/Kalman filter identification method, a good initial guess of the unknown parameters of the chosen model is obtained so as to reduce the ...

  18. False alarms in fault-tolerant dominating sets in graphs

    Directory of Open Access Journals (Sweden)

    Mateusz Nikodem

    2012-01-01

    We develop the problem of fault-tolerant dominating sets (liar's dominating sets) in graphs. Namely, we consider a new kind of fault - a false alarm. Characterization of such fault-tolerant dominating sets in three different cases (dependent on the classification of the types of the faults) are presented.

  19. False alarms in fault-tolerant dominating sets in graphs

    OpenAIRE

    Mateusz Nikodem

    2012-01-01

    We develop the problem of fault-tolerant dominating sets (liar's dominating sets) in graphs. Namely, we consider a new kind of fault - a false alarm. Characterization of such fault-tolerant dominating sets in three different cases (dependent on the classification of the types of the faults) are presented.

  20. MAFT: The Multicomputer Architecture for Fault-Tolerance

    Science.gov (United States)

    Kieckhafer, Roger M.

    1990-01-01

    Multicomputer Architecture for Fault-Tolerance (MAFT) is a loosely coupled multiprocessor system designed to achieve an unreliability of less than 10^-10 per hour in flight-critical real time applications. The MAFT design objectives and architecture are presented. The fault-tolerance implementation of major functions in MAFT is also presented, including communication; task scheduling; reconfiguration; clock synchronization; and data handling and voting. The need for Byzantine agreement or approximate agreement in various functions is discussed. Different methods were selected to achieve agreement in various subsystems. These methods are illustrated by a more detailed description of the task scheduling and error handling subsystems.

  1. Issues in the Imprecise Computation Approach to Fault Tolerance

    Science.gov (United States)

    Liu, Jane W. S.; Lin, Kwei-Jay; Liu, C. L.

    1991-01-01

    The imprecise computation technique can be used in a natural way to enhance fault tolerance. By providing a usable, approximate result whenever a failure or overload prevents the system from producing the desired, precise result, we can increase the availability of data and services, reduce the need for error-recovery operations, and minimize the costs in replication. This paper describes the domain-specific fault tolerance mechanisms that are needed to support the provision and correct usage of imprecise results for several representative application domains. The elements of an application-domain-independent architecture that can effectively integrate these domain-specific mechanisms are also described.
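
    As a rough illustration of returning a usable approximate result when the precise one cannot be produced, the sketch below refines a square-root estimate step by step and keeps the last good value if the step budget runs out (standing in for overload) or a refinement step fails. The names `Result`, `sqrt_refinements`, and `run_imprecise`, and the use of Newton's method, are assumptions made for this example and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import Iterator


@dataclass
class Result:
    value: float
    precise: bool  # True only when the refinement is known to have finished


def sqrt_refinements(x: float, tol: float = 1e-12) -> Iterator[float]:
    """Yield successively better estimates of sqrt(x) for x > 0."""
    est = max(x, 1.0)  # mandatory part: a usable first estimate
    yield est
    while abs(est * est - x) > tol * x:
        est = 0.5 * (est + x / est)  # optional part: one Newton step
        yield est


def run_imprecise(steps: Iterator[float], budget: int) -> Result:
    """Consume at most `budget` refinement steps and keep the last good value."""
    best = next(steps)  # the mandatory part must complete
    for _ in range(budget):
        try:
            best = next(steps)
        except StopIteration:
            return Result(best, precise=True)
        except Exception:
            # A failure in the optional part: fall back to the last estimate.
            return Result(best, precise=False)
    # Budget exhausted before the refinement reported completion.
    return Result(best, precise=False)
```

    A caller can inspect `precise` to decide whether the approximate value is acceptable for its application domain.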

  2. Fault-tolerant search algorithms reliable computation with unreliable information

    CERN Document Server

    Cicalese, Ferdinando

    2013-01-01

    Why a book on fault-tolerant search algorithms? Searching is one of the fundamental problems in computer science. Time and again algorithmic and combinatorial issues originally studied in the context of search find application in the most diverse areas of computer science and discrete mathematics. On the other hand, fault-tolerance is a necessary ingredient of computing. Due to their inherent complexity, information systems are naturally prone to errors, which may appear at any level - as imprecisions in the data, bugs in the software, or transient or permanent hardware failures. This book pr

  3. A modified NARMAX model-based self-tuner with fault tolerance for unknown nonlinear stochastic hybrid systems with an input-output direct feed-through term.

    Science.gov (United States)

    Tsai, Jason S-H; Hsu, Wen-Teng; Lin, Long-Guei; Guo, Shu-Mei; Tann, Joseph W

    2014-01-01

    A modified nonlinear autoregressive moving average with exogenous inputs (NARMAX) model-based state-space self-tuner with fault tolerance is proposed in this paper for the unknown nonlinear stochastic hybrid system with a direct transmission matrix from input to output. Through the off-line observer/Kalman filter identification method, one has a good initial guess of modified NARMAX model to reduce the on-line system identification process time. Then, based on the modified NARMAX-based system identification, a corresponding adaptive digital control scheme is presented for the unknown continuous-time nonlinear system, with an input-output direct transmission term, which also has measurement and system noises and inaccessible system states. Besides, an effective state space self-tuner with fault tolerance scheme is presented for the unknown multivariable stochastic system. A quantitative criterion is suggested by comparing the innovation process error estimated by the Kalman filter estimation algorithm, so that a weighting matrix resetting technique by adjusting and resetting the covariance matrices of parameter estimate obtained by the Kalman filter estimation algorithm is utilized to achieve the parameter estimation for faulty system recovery. Consequently, the proposed method can effectively cope with partially abrupt and/or gradual system faults and input failures by the fault detection. PMID:24012389

  4. Concepts and Methods in Fault-tolerant Control

    DEFF Research Database (Denmark)

    Blanke, Mogens; Staroswiecki, M.

    2001-01-01

    Faults in automated processes will often cause undesired reactions and shut-down of a controlled plant, and the consequences could be damage to technical parts of the plant, to personnel or the environment. Fault-tolerant control combines diagnosis with control methods to handle faults in an intelligent way. The aim is to prevent simple faults from developing into serious failures and hence to increase plant availability and reduce the risk of safety hazards. Fault-tolerant control merges several disciplines into a common framework to achieve these goals. The desired features are obtained through on-line fault diagnosis, automatic condition assessment and calculation of appropriate remedial actions to avoid certain consequences of a fault. The envelope of the possible remedial actions is very wide. Sometimes a simple remedy suffices, such as replacing a measurement from a faulty sensor by an estimate. In other situations, complex reconfiguration or on-line controller redesign is required. This paper gives an overview of recent tools to analyze and explore structure and other fundamental properties of an automated system such that any inherent redundancy in the controlled process can be fully utilized to maintain availability, even though faults may occur.

  5. Fault Tolerant Control for Kori Unit 1 Steam Generator

    International Nuclear Information System (INIS)

    In order to implement more reliable control systems, failures of a controller, a sensor and an actuator should be taken into consideration in the process of control system design. Traditionally there have been two approaches for dealing with the fault-tolerant control problem: active redundancy and passive redundancy. Active redundancy has no reconfiguration part to take an action such as diagnosing and selecting the intact controller when a controller failure occurs; that is, one controller guarantees the system stability and performance under failure of the other controller. Meanwhile, passive redundancy has reconfiguration parts which supervise the system, reject the faulty controller, and select the sound controller which performs the mission. This paper focuses on the active redundancy structure for fault-tolerant control and proposes design methods for fault-tolerant state feedback control and fault-tolerant output feedback control, which make the control system reliable while guaranteeing stability and performance in the sense of the H∞ norm in the face of controller failures in the dual-controller configuration. The proposed method is applied to the Kori Unit 1 steam generator level control system. The results show that the steam generator water level is well controlled in the situation of one controller failure.

  6. A Robust Byzantine Fault-Tolerant Replication Technique for Peer-to-Peer Content Distribution

    OpenAIRE

    Ayyasamy Sellappan; Sivanandam Natarajan

    2011-01-01

    Problem statement: In peer-to-peer networks, Byzantine fault tolerance refers to the capability of a system to tolerate Byzantine faults. It can be achieved by replicating the server and by ensuring all server replicas reach an agreement on the input despite Byzantine faulty replicas and clients. Since malicious attacks and software errors can cause faulty nodes to exhibit Byzantine behavior, Byzantine-fault-tolerant algorithms are increasingly important. Approach: In the ...

  7. SMaRtLight: A Practical Fault-Tolerant SDN Controller

    OpenAIRE

    Botelho, Fábio; Bessani, Alysson; Ramos, Fernando M. V.; Ferreira, Paulo

    2014-01-01

    The increase in the number of SDN-based deployments in production networks is triggering the need to consider fault-tolerant designs of controller architectures. Commercial SDN controller solutions incorporate fault tolerance, but there has been little discussion in the SDN literature on the design of such systems and the tradeoffs involved. To fill this gap, we present a by-construction design of a fault-tolerant controller, and materialize it by proposing and formalizing a...

  8. Reconfigurable Fault Tolerance for FPGAs

    Science.gov (United States)

    Shuler, Robert, Jr.

    2010-01-01

    The invention allows a field-programmable gate array (FPGA) or similar device to be efficiently reconfigured in whole or in part to provide higher capacity, non-redundant operation. The redundant device consists of functional units such as adders or multipliers, configuration memory for the functional units, a programmable routing method, configuration memory for the routing method, and various other features such as block RAM, I/O (random access memory, input/output) capability, dedicated carry logic, etc. The redundant device has three identical sets of functional units and routing resources and majority voters that correct errors. The configuration memory may or may not be redundant, depending on need. For example, SRAM-based FPGAs will need some type of radiation-tolerant configuration memory, or they will need triple-redundant configuration memory. Flash or anti-fuse devices will generally not need redundant configuration memory. Some means of loading and verifying the configuration memory is also required. These are all components of the pre-existing redundant FPGA. This innovation modifies the voter to accept a MODE input, which specifies whether ordinary voting is to occur, or if redundancy is to be split. Generally, additional routing resources will also be required to pass data between sections of the device created by splitting the redundancy. In redundancy mode, the voters produce an output corresponding to the two inputs that agree, in the usual fashion. In the split mode, the voters select just one input and convey this to the output, ignoring the other inputs. In a dual-redundant system (as opposed to triple-redundant), instead of a voter, there is some means to latch or gate a state update only when both inputs agree. In this case, the invention would require modification of the latch or gate so that it would operate normally in redundant mode, and would separately latch or gate the inputs in non-redundant mode.
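
    The voter behaviour described in this record (majority voting in redundancy mode, pass-through of a single input in split mode) can be sketched in software as follows. This is only a behavioural model under stated assumptions: the `Mode` enum, the `vote` function, and the `select` parameter are hypothetical names introduced for illustration, and the actual device implements the voter in hardware.

```python
from enum import Enum
from typing import Sequence


class Mode(Enum):
    REDUNDANT = "redundant"  # ordinary majority voting
    SPLIT = "split"          # redundancy split: pass one input through


def vote(inputs: Sequence[int], mode: Mode, select: int = 0) -> int:
    """Bitwise voter over three replica words with a MODE input.

    `select` (hypothetical) names which replica is forwarded in split mode.
    """
    if len(inputs) != 3:
        raise ValueError("expected three replica inputs")
    a, b, c = inputs
    if mode is Mode.REDUNDANT:
        # Each output bit follows the inputs that agree on that bit.
        return (a & b) | (a & c) | (b & c)
    # Split mode: ignore the other two inputs entirely.
    return inputs[select]
```

    In redundancy mode a single corrupted replica is outvoted bit by bit; in split mode the three replicas can carry independent data, which is what provides the higher-capacity, non-redundant operation.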

  9. Implementation of Fault Tolerant Method Using BCH Code on FPGA

    OpenAIRE

    Mahadevaswamy V P; Sunitha S.L.; Shobha, B. N.

    2012-01-01

    Fault tolerance, or graceful degradation, is the property that enables a system (often computer-based) to continue operating properly in the event of the failure of (or one or more faults within) some of its components. This work designs a new 32-bit Arithmetic Logic Unit (ALU) that is secure against many attacks or faults and able to correct any 5-bit fault in any position of its 32-bit input register. Radiation effects on electronic circuits may invert data bits of registers or m...

  10. A distributed fault tolerant architecture for nuclear reactor control and safety functions

    International Nuclear Information System (INIS)

    This paper reports on a fault-tolerant architecture under development that provides tolerance to a broad scope of hardware, software, and communications faults. This architecture relies on widely commercially available operating systems, local area networks, and software standards. Thus, development time is significantly shortened, and modularity allows for continuous and inexpensive system enhancement throughout the expected 20-year life. The fault containment and parallel processing capabilities of computer networks are being exploited to provide a high performance, high availability network capable of tolerating a broad scope of hardware, software, and operating system faults. The system can tolerate all but one known (and avoidable) single fault and two known and avoidable dual faults, and will detect all higher order fault sequences and provide diagnostics to allow for rapid manual recovery.

  11. Implementations of a four-level mechanical architecture for fault-tolerant robots

    International Nuclear Information System (INIS)

    This paper describes a fault tolerant mechanical architecture with four levels devised and implemented in concert with NASA (Tesar, D. and Sreevijayan, D., Four-level fault tolerance in manipulator design for space operations. In First Int. Symp. Measurement and Control in Robotics (ISMCR '90), Houston, Texas, 20-22 June 1990.) Subsequent work has clarified and revised the architecture. The four levels proceed from fault tolerance at the actuator level, to fault tolerance via in-parallel chains, to fault tolerance using serial kinematic redundancy, and finally to the fault tolerance multiple arm systems provide. This is a subsumptive architecture because each successive layer can incorporate the fault tolerance provided by all layers beneath. For instance a serially-redundant robot can incorporate dual fault-tolerant actuators. Redundant systems provide the fault tolerance, but the guiding principle of this architecture is that functional redundancies actively increase the performance of the system. Redundancies do not simply remain dormant until needed. This paper includes specific examples of hardware and/or software implementation at all four levels

  12. Control switching in high performance and fault tolerant control

    DEFF Research Database (Denmark)

    Niemann, Hans Henrik; Poulsen, Niels Kjølstad

    2010-01-01

    The problem of reliability in high performance control and in fault tolerant control is considered in this paper. A feedback controller architecture for high performance and fault tolerance is considered. The architecture is based on the Youla-Jabr-Bongiorno-Kucera (YJBK) parameterization. By using the nominal controller in the architecture as a simple and robust controller, it is possible to use the YJBK transfer function for optimization of the closed-loop performance. This can be done both in connection with normal operation of the system and in connection with faults in the system. The architecture will also allow changing the applied sensors and/or actuators when switching between different controllers. This switching becomes particularly simple for open-loop stable systems.

  13. An Approach to Build Software Based on Fault Tolerance Computing Using Uncertainty Factor

    Directory of Open Access Journals (Sweden)

    Mrityunjay Brahma

    2013-12-01

    In this work, we start with an overview of fault tolerance based systems. In the case of design diversity based software fault tolerance systems, we observed that uncertainty remains an important factor. Keeping this factor in mind, we discuss implementing Bayes’ theorem and a probabilistic mathematical model to handle the uncertainty factor. We assume that, once developed, the complete model will give us better efficiency. The rest of this paper deals with other types of fault tolerance systems and their approaches. This part is a kind of literature review, which includes fault tolerant computing schemes that rely on the single-design as well as on the multiple-design. Further, in single-design, we discuss the recovery block, N-version programming, and N self-checking programming schemes. Lastly, focusing on multiple-design, we discuss software engineering aspects, error detection mechanisms and fault tolerance by fault injection. The paper ends with a general conclusion.

  14. Development and evaluation of a fault-tolerant multiprocessor (FTMP) computer. Volume 1: FTMP principles of operation

    Science.gov (United States)

    Smith, T. B., Jr.; Lala, J. H.

    1983-01-01

    The basic organization of the fault tolerant multiprocessor, (FTMP) is that of a general purpose homogeneous multiprocessor. Three processors operate on a shared system (memory and I/O) bus. Replication and tight synchronization of all elements and hardware voting is employed to detect and correct any single fault. Reconfiguration is then employed to repair a fault. Multiple faults may be tolerated as a sequence of single faults with repair between fault occurrences.

  15. A Unified Fault-Tolerance Protocol

    Science.gov (United States)

    Miner, Paul; Geser, Alfons; Pike, Lee; Maddalon, Jeffrey

    2004-01-01

    Davies and Wakerly show that Byzantine fault tolerance can be achieved by a cascade of broadcasts and middle value select functions. We present an extension of the Davies and Wakerly protocol, the unified protocol, and its proof of correctness. We prove that it satisfies validity and agreement properties for communication of exact values. We then introduce bounded communication error into the model. Inexact communication is inherent for clock synchronization protocols. We prove that validity and agreement properties hold for inexact communication, and that exact communication is a special case. As a running example, we illustrate the unified protocol using the SPIDER family of fault-tolerant architectures. In particular we demonstrate that the SPIDER interactive consistency, distributed diagnosis, and clock synchronization protocols are instances of the unified protocol.
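
    The middle value select function mentioned in this record can be sketched as below. This shows a single selection stage only, not the cascade of broadcasts that the protocol uses to obtain agreement; the function names and the scenario (two good sources plus one faulty source that sends a different value to each receiver) are assumptions made for illustration.

```python
from typing import List, Sequence


def middle_value_select(values: Sequence[float]) -> float:
    """Return the median of an odd number of received values."""
    if len(values) % 2 == 0:
        raise ValueError("expected an odd number of values")
    return sorted(values)[len(values) // 2]


def receiver_views(good: Sequence[float],
                   faulty_by_receiver: Sequence[float]) -> List[float]:
    """Each receiver sees the good values plus its own copy of the faulty one."""
    return [middle_value_select(list(good) + [bad]) for bad in faulty_by_receiver]
```

    With two good sources and one arbitrary faulty value, every selected value stays within the range spanned by the good values, which is the kind of validity property the abstract refers to for inexact communication. When the good sources send exactly the same value, every receiver selects that value.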

  16. Design of Test Articles and Monitoring System for the Characterization of HIRF Effects on a Fault-Tolerant Computer Communication System

    Science.gov (United States)

    Torres-Pomales, Wilfredo; Malekpour, Mahyar R.; Miner, Paul S.; Koppen, Sandra V.

    2008-01-01

    This report describes the design of the test articles and monitoring systems developed to characterize the response of a fault-tolerant computer communication system when stressed beyond the theoretical limits for guaranteed correct performance. A high-intensity radiated electromagnetic field (HIRF) environment was selected as the means of injecting faults, as such environments are known to have the potential to cause arbitrary and coincident common-mode fault manifestations that can overwhelm redundancy management mechanisms. The monitors generate stimuli for the systems-under-test (SUTs) and collect data in real-time on the internal state and the response at the external interfaces. A real-time health assessment capability was developed to support the automation of the test. A detailed description of the nature and structure of the collected data is included. The goal of the report is to provide insight into the design and operation of these systems, and to serve as a reference document for use in post-test analyses.

  17. Implementation of Fault Tolerant Method Using BCH Code on FPGA

    Directory of Open Access Journals (Sweden)

    Mahadevaswamy V P

    2012-09-01

    Fault tolerance, or graceful degradation, is the property that enables a system (often computer-based) to continue operating properly in the event of the failure of (or one or more faults within) some of its components. This work designs a new 32-bit Arithmetic Logic Unit (ALU) that is secure against many attacks or faults and able to correct any 5-bit fault in any position of its 32-bit input register. Radiation effects on electronic circuits may invert data bits of registers or memories. If one bit of the main storage system is changed, the mission of the system could be completely different. The motivation for choosing BCH (Bose, Chaudhuri, and Hocquenghem) codes is that they can correct multiple errors and form a class of powerful random-error-correcting cyclic codes. In terms of area, a 32-bit fault tolerant ALU using BCH code is a better choice than Triple Modular Redundancy (TMR) and Residue code. The fault tolerant method for a 32-bit ALU using TMR needs a single or triplicated voter and two extra 32-bit ALUs, which increases the hardware overhead by 202% and 208% respectively. The Residue code requires a hardware overhead of 148.9%. In comparison, BCH code needs a hardware overhead of 70 to 75%, which reduces the overall cost and power consumption. Thus the proposed fault tolerant method has lower hardware overhead and corrects multiple errors when compared to the other techniques.

  18. Design of Fault Tolerant Reversible Multiplier

    OpenAIRE

    Sinha, H. P.; Nidhi Syal

    2012-01-01

    In the recent years, reversible logic has emerged as a promising technology having its applications in low power CMOS, quantum computing, nanotechnology, and optical computing. The classical set of gates such as AND, OR, and EXOR are not reversible. This paper proposes a novel 4x4 bit reversible fault tolerant multiplier circuit which can multiply two 4-bit numbers. It is faster and has lower hardware complexity compared to the existing designs. In addition, the proposed reversible multiplier...

  19. Fault-tolerant composite Householder reflection

    Science.gov (United States)

    Torosov, Boyan T.; Kyoseva, Elica; Vitanov, Nikolay V.

    2015-07-01

    We propose a fault-tolerant implementation of the quantum Householder reflection, which is a key operation in various quantum algorithms, quantum-state engineering, generation of arbitrary unitaries, and entanglement characterization. We construct this operation using the modular approach of composite pulses and a relation between the Householder reflection and the quantum phase gate. The proposed implementation is highly insensitive to variations in the experimental parameters, which makes it suitable for high-fidelity quantum information processing.

  20. Steps toward fault-tolerant quantum chemistry.

    Energy Technology Data Exchange (ETDEWEB)

    Taube, Andrew Garvin

    2010-05-01

    Developing quantum chemistry programs on the coming generation of exascale computers will be a difficult task. The programs will need to be fault-tolerant and minimize the use of global operations. This work explores the use a task-based model that uses a data-centric approach to allocate work to different processes as it applies to quantum chemistry. After introducing the key problems that appear when trying to parallelize a complicated quantum chemistry method such as coupled-cluster theory, we discuss the implications of that model as it pertains to the computational kernel of a coupled-cluster program - matrix multiplication. Also, we discuss the extensions that would required to build a full coupled-cluster program using the task-based model. Current programming models for high-performance computing are fault-intolerant and use global operations. Those properties are unsustainable as computers scale to millions of CPUs; instead one must recognize that these systems will be hierarchical in structure, prone to constant faults, and global operations will be infeasible. The FAST-OS HARE project is introducing a scale-free computing model to address these issues. This model is hierarchical and fault-tolerant by design, allows for the clean overlap of computation and communication, reducing the network load, does not require checkpointing, and avoids the complexity of many HPC runtimes. Development of an algorithm within this model requires a change in focus from imperative programming to a data-centric approach. Quantum chemistry (QC) algorithms, in particular electronic structure methods, are an ideal test bed for this computing model. These methods describe the distribution of electrons in a molecule, which determine the properties of the molecule. The computational cost of these methods is high, scaling quartically or higher in the size of the molecule, which is why QC applications are major users of HPC resources. 
The complexity of these algorithms means that MPI alone is insufficient to achieve parallel scaling; QC developers have been forced to use alternative approaches to achieve scalability and would be receptive to radical shifts in the programming paradigm. Initial work in adapting the simplest QC method, Hartree-Fock, to this new programming model indicates that the approach is beneficial for QC applications. However, the advantages of being able to scale to exascale computers are greatest for the computationally most expensive algorithms; within QC these are the high-accuracy coupled-cluster (CC) methods. Parallel coupled-cluster programs are available; however, they are based on the conventional MPI paradigm. Much of the effort is spent handling the complicated data dependencies between the various processors, especially as the size of the problem becomes large. The current paradigm will not survive the move to exascale computers. Here we discuss the initial steps toward designing and implementing a CC method within this model. First, we introduce the general concepts behind a CC method, focusing on the aspects that make these methods difficult to parallelize with conventional techniques. Then we outline the computational core of the CC method - a matrix multiply - within the task-based approach that the FAST-OS project is designed to take advantage of. Finally, we outline the general setup to implement the simplest CC method in this model, linearized CC doubles (LinCC).

  1. An Approach to Build Software Based on Fault Tolerance Computing Using Uncertainty Factor

    OpenAIRE

    Mrityunjay Brahma

    2013-01-01

    In this work, we start with an overview of fault-tolerance-based systems. For software fault tolerance systems based on design diversity, we observed that uncertainty remains an important factor. With this factor in mind, we discuss implementing Bayes' theorem and a probabilistic mathematical model to handle the uncertainty. We assume that, once developed, the complete model will give better efficiency. The rest of this paper deals with other types of fault toleranc...

  2. Algorithm-dependent fault tolerance for distributed computing

    Energy Technology Data Exchange (ETDEWEB)

    P. D. Hough; M. E. Goldsby; E. J. Walsh

    2000-02-01

    Large-scale distributed systems assembled from commodity parts, like CPlant, have become common tools in the distributed computing world. Because of their size and diversity of parts, these systems are prone to failures. Applications that are being run on these systems have not been equipped to efficiently deal with failures, nor is there vendor support for fault tolerance. Thus, when a failure occurs, the application crashes. While most programmers make use of checkpoints to allow for restarting of their applications, this is cumbersome and incurs substantial overhead. In many cases, there are more efficient and more elegant ways in which to address failures. The goal of this project is to develop a software architecture for the detection of and recovery from faults in a cluster computing environment. The detection phase relies on the latest techniques developed in the fault tolerance community. Recovery is being addressed in an application-dependent manner, thus allowing the programmer to take advantage of algorithmic characteristics to reduce the overhead of fault tolerance. This architecture will allow large-scale applications to be more robust in high-performance computing environments that are comprised of clusters of commodity computers such as CPlant and SMP clusters.

  3. Towards the design of fault-tolerant distributed real-time systems

    OpenAIRE

    Klobedanz, Kay

    2014-01-01

    The number and complexity of embedded systems are rapidly increasing. Especially for large distributed real-time systems, the determination of a feasible deployment is a complex task. In this thesis we present a design approach for distributed real-time systems that supports the designer in determining an appropriate deployment. Distributed systems with hard real-time constraints are so-called safety-critical systems. In safety-critical systems, missing a hard deadline may cause catastrophic cons...

  4. Highly Reliable Fault Tolerant Technique for Safety Critical Applications

    Directory of Open Access Journals (Sweden)

    Nanditha S

    2014-05-01

    This paper presents a highly reliable fault-tolerant technique for safety-critical applications using the Five Modular Redundancy method. In high-radiation environments such as spacecraft and nuclear thermal plants, single event upsets (SEUs) are likely to degrade system operation. They cause single bit flips in the sequential elements of electronic components in the system. If these systems are not provided with fault tolerance, there is a high chance of obtaining a false response. To avoid this problem, the system is made redundant and a roll-forward recovery mechanism is used to increase the overall reliability. A scan cell design is employed to shift out the internal states of all the flip-flops during the comparison and recovery process. The proposed method is designed using Verilog HDL on the XILINX ISE simulator.
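
    The voting step behind a five-modular-redundancy scheme can be sketched as follows. This is only an illustration of majority voting over five replica outputs, not the paper's Verilog design; the names `majority_vote` and `flip_bit` and the 8-bit example word are assumptions, and the roll-forward recovery and scan-cell logic are not modelled.

```python
from collections import Counter


def majority_vote(outputs):
    """Return the value produced by a strict majority of replicas, or raise."""
    value, count = Counter(outputs).most_common(1)[0]
    if count * 2 <= len(outputs):
        raise ValueError("no strict majority among replica outputs")
    return value


def flip_bit(word, bit):
    """Model a single event upset as a single bit flip (illustrative)."""
    return word ^ (1 << bit)


golden = 0b10110010           # hypothetical fault-free register value
replicas = [golden] * 5       # five redundant copies
replicas[1] = flip_bit(golden, 3)   # upset in replica 1
replicas[4] = flip_bit(golden, 6)   # upset in replica 4

voted = majority_vote(replicas)     # three healthy replicas still outvote two
```

    With five replicas, up to two faulty outputs are masked; a third disagreeing replica leaves no strict majority and the voter raises instead of guessing.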

  5. Improving Fault Tolerance in Ad-Hoc Networks by Using Residue Number System

    Directory of Open Access Journals (Sweden)

    A. Barati

    2008-01-01

    In this study, we present a method for distributed data storage using the residue number system for mobile systems and wireless networks based on the peer-to-peer paradigm. In general, a redundant residue number system is capable of error detection and correction. In the proposed method, we built a new system by combining the Redundant Residue Number System (RRNS), the Multi Level Residue Number System (ML RNS) and the Multiple Valued Logic Residue Number System (MVL RNS). The result is well suited to parallel, carry-free, high-speed arithmetic, and the system supports secure data communication. In addition, it has the ability to detect and correct errors. In comparison to other number systems, it offers many improvements in data security, error detection and correction, and speed of storage and retrieval.
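
    The error-detection property of a redundant residue number system can be illustrated with a minimal sketch. It covers plain RRNS only, not the combined RRNS/ML RNS/MVL RNS system of the paper; the moduli (3, 5, 7) with redundant moduli (11, 13) and all function names are assumptions chosen for the example.

```python
from math import prod

INFO_MODULI = (3, 5, 7)          # information moduli (assumed for the example)
REDUNDANT_MODULI = (11, 13)      # redundant moduli, larger than the others
MODULI = INFO_MODULI + REDUNDANT_MODULI
LEGIT_RANGE = prod(INFO_MODULI)  # valid data lie in [0, 105)


def encode(x):
    """Represent x by its residues with respect to all five moduli."""
    if not 0 <= x < LEGIT_RANGE:
        raise ValueError("value outside the legitimate range")
    return [x % m for m in MODULI]


def crt(residues, moduli):
    """Chinese remainder theorem reconstruction for pairwise coprime moduli."""
    total = prod(moduli)
    x = 0
    for r, m in zip(residues, moduli):
        partial = total // m
        x += r * partial * pow(partial, -1, m)  # modular inverse, Python 3.8+
    return x % total


def is_consistent(residues):
    """A word is accepted only if it decodes into the legitimate range."""
    return crt(residues, MODULI) < LEGIT_RANGE


word = encode(42)
corrupted = list(word)
corrupted[2] = (corrupted[2] + 1) % MODULI[2]   # corrupt a single residue
```

    A corrupted residue shifts the reconstructed value by a nonzero multiple of the product of the other four moduli, which always lands outside the legitimate range, so `is_consistent(word)` holds while `is_consistent(corrupted)` does not.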

  6. Fault Tolerant Weighted Voting Algorithms

    Directory of Open Access Journals (Sweden)

    Azad Azadmanesh

    2008-09-01

    Computer networks are now necessities of modern organisations and network security has become a major concern for them. In this paper we propose a holistic approach to network security with a hybrid model that includes an Intrusion Detection System (IDS) to detect network attacks and a survivability model to assess the impacts of undetected attacks. A neural network-based IDS is proposed, where the learning mechanism for the neural network is evolved using a genetic algorithm. Then the case where an attack evades the IDS and takes the system into a compromised state is discussed. We propose a stochastic model which enables us to do a cost/benefit analysis for systems security. This integrated approach allows systems managers to make more informed decisions regarding both intrusion detection and system protection.

  7. Visual Programming of Fault-Tolerant Distributed Applications

    OpenAIRE

    Muganga, B.; Pacull, F.; Mazouni, K. R.; Wolff, A. -d

    1995-01-01

    The design of fault-tolerant distributed applications is a complex task. In addition to application functionalities, the programmer must consider issues related to both replication and distribution for every application component concerned with fault-tolerance. This paper describes an approach which combines two environments (Specs and Garf) so as to: (1) graphically design applications using high level Petri nets and (2) discharge the programmer of fault-tolerance issues.

  8. Fault tolerance in Hadoop MapReduce implementation

    OpenAIRE

    Cogorno, Matías; Rey, Javier; Nesmachnow, Sergio

    2013-01-01

    This document reports the advances on exploring and understanding the fault tolerance mechanisms in Hadoop MapReduce. A description of the current fault tolerance features existing in Hadoop is provided, along with a review of related works on the topic. Finally, the document describes some relevant proposals about fault tolerance worth considering to implement in Hadoop within the PERMARE project in order to provide support for pervasive computing environments.

  9. Diagnosis and Fault-tolerant Control for Ship Station Keeping

    DEFF Research Database (Denmark)

    Blanke, Mogens

    2005-01-01

    This paper addresses the design process of diagnosis and fault-tolerant control when a system should operate despite multiple failures in sensors or actuators. Graph-theory-based analysis of system structure is demonstrated to be a unique design methodology that can cope with the diagnosis design for systems of high complexity, and can also analyse the cases of cascaded or multiple faults. The paper takes as an example a ship with two CP propellers, rudders and a bow thruster as actuators, and instrumentation with a suite of global position sensors, inertial navigation units and conventional gyro units to provide ship motion information. A salient feature of the design method is the ability to analyse cases where faults have occurred and easily determine where in the faulty system diagnosability and controllability are retained.

  10. Modeling and Verification for Timing Satisfaction of Fault-Tolerant Systems with Finiteness

    OpenAIRE

    Cheng, Chih-Hong; Buckl, Christian; Esparza, Javier; Knoll, Alois

    2009-01-01

    The increasing use of model-based tools enables further use of formal verification techniques in the context of distributed real-time systems. To avoid state explosion, it is necessary to construct verification models that focus on the aspects under consideration. In this paper, we discuss how we construct a verification model for timing analysis in distributed real-time systems. We (1) give observations concerning restrictions of timed automata to model these systems, (2)...

  11. Nonlinear, Adaptive and Fault-tolerant Control for Electro-hydraulic Servo Systems

    OpenAIRE

    Choux, Martin

    2012-01-01

    Fluid power systems have been in use since 1795 with the first hydraulic press patented by Joseph Bramah and today form the basis of many industries. Electro-hydraulic servo systems are fluid power systems controlled in closed loop. They transform reference input signals into a set of movements in hydraulic actuators (cylinders or motors) by means of hydraulic fluid under pressure. With the development of computing power and control techniques during the last few decad...

  12. Design and Verification of Fault-Tolerant Components

    DEFF Research Database (Denmark)

    Zhang, Miaomiao; Liu, Zhiming

    2009-01-01

    We present a systematic approach to the design and verification of fault-tolerant components with real-time properties as found in embedded systems. A state machine model of the correct component is augmented with internal transitions that represent hypothesized faults. Also, constraints on the occurrence or timing of faults are included in this model. This model of a faulty component is then extended with fault detection and recovery mechanisms, again in the form of state machines. Desired properties of the component are model checked for each of the successive models. The models can be made relatively detailed such that they can serve directly as blueprints for engineering, and yet be amenable to exhaustive verification. The approach is illustrated with a design of a triple modular fault-tolerant system that is a real case we received from our collaborators in the aerospace field. We use UPPAAL to model and check this design. Model checking uses concrete parameters, so we extend the result with parametric analysis using abstractions of the automata in a rigorous verification.

  13. Fault Tolerant Control in a Semi-active Suspension

    OpenAIRE

    Tudón-Martínez, Juan C.; Morales-Menéndez, Rubén; Ramirez-Mendoza, Ricardo; Sename, Olivier; Dugard, Luc

    2012-01-01

    A Fault Tolerant Control System (FTCS) in a Quarter of Vehicle (QoV) model is proposed. The control law is time-varying, using a Linear Parameter-Varying (LPV) based controller which includes two scheduling parameters: one for monitoring the nonlinear behavior of the damper, and another for fault accommodation using a reference model obtained by a state observer of the normal operating regime. The QoV model represents a semi-active suspension, including an experimental magneto-rhe...

  14. System-Level Development of Fault-Tolerant Distributed Aero-Engine Control Architecture Project

    National Aeronautics and Space Administration — NASA's vision for an "intelligent engine" will be realized with the development of a truly distributed control system and reliable smart transducer node components;...

  15. ACID Support and Fault-Tolerant Database Systems on Cloud: A Review

    OpenAIRE

    Pratiyush Guleria

    2012-01-01

    Cloud computing represents a different way to architect and remotely manage computing resources. One has only to establish an account with Microsoft or Amazon or Google to begin building and deploying application systems into a cloud. These systems can be, but certainly are not restricted to being, simplistic. Some applications require HTTP services; some require a relational database or might require web service infrastructure and message queues. With clouds, IT-related applications can be pr...

  16. Design of fault-tolerant inductive position sensor

    International Nuclear Information System (INIS)

    The position sensors used in a magnetic bearing system should provide some degree of fault tolerance, as the rotor position is necessary for the feedback control to overcome the open-loop instability. In this paper, we propose an inductive position sensor that can cope with a partial fault in the sensor. The sensor has multiple poles which can be combined to sense the in-plane motion of the rotor. When a high-frequency voltage signal drives each pole of the sensor, the resulting current in the sensor coil contains information regarding the rotor position. The signal processing circuit of the sensor extracts this position information. In this paper, we used the magnetic circuit model of the sensor that shows the analytical relationship between the sensor output and the rotor motion. The multi-polar structure of the sensor makes it possible to introduce redundancy which can be exploited for fault-tolerant operation. The proposed sensor is applied to a magnetically levitated turbo-molecular vacuum pump. Experimental results validate the fault-tolerance algorithm.

  17. Fault Detection for Shipboard Monitoring and Decision Support Systems

    DEFF Research Database (Denmark)

    Lajic, Zoran; Nielsen, Ulrik Dam

    2009-01-01

    In this paper a basic idea of a fault-tolerant monitoring and decision support system will be explained. Fault detection is an important part of the fault-tolerant design for in-service monitoring and decision support systems for ships. In the paper, a virtual example of fault detection will be presented for a containership with a real decision support system onboard. All possible faults can be simulated and detected using residuals and the generalized likelihood ratio (GLR) algorithm.
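
    A generalized likelihood ratio test on a residual can be sketched as below. The sketch assumes a white Gaussian residual with known standard deviation `sigma` and nominal mean `mu0`, and a fault that appears as a jump of unknown size in the mean; the function name, the sample data and the threshold are illustrative assumptions, not the authors' implementation.

```python
def glr_statistic(residuals, sigma, mu0=0.0):
    """GLR statistic for a mean change of unknown size and unknown onset.

    For every candidate onset j the statistic is
    (sum_{i=j..k} (r_i - mu0))**2 / (2 * sigma**2 * (k - j + 1)),
    and the maximum over j is returned.
    """
    best = 0.0
    tail_sum = 0.0
    # Walk backwards so tail_sum always covers the most recent `length` samples.
    for length, r in enumerate(reversed(residuals), start=1):
        tail_sum += r - mu0
        best = max(best, tail_sum ** 2 / (2 * sigma ** 2 * length))
    return best


healthy = [0.1, -0.1] * 10              # residual fluctuating around zero
faulty = healthy + [1.1, 0.9] * 5       # an offset of about 1.0 appears
THRESHOLD = 5.0                         # assumed decision threshold

alarm_healthy = glr_statistic(healthy, sigma=0.5) > THRESHOLD
alarm_faulty = glr_statistic(faulty, sigma=0.5) > THRESHOLD
```

    In practice the threshold trades false alarms against detection delay, and the statistic is usually evaluated over a sliding window of recent residuals rather than the whole history.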

  18. Synthesis of Fault Tolerant Reversible Logic Circuits

    CERN Document Server

    Islam, Md Saiful; Begum, Zerina; Hafiz, Mohd Zulfiquar; Mahmud, Abdullah Al; 10.1109/CAS-ICTD.2009.4960883

    2010-01-01

    Reversible logic is emerging as an important research area with applications in diverse fields such as low-power CMOS design, digital signal processing, cryptography, quantum computing and optical information processing. This paper presents a new 4*4 universal reversible logic gate, IG. It is a parity-preserving reversible logic gate; that is, the parity of the inputs matches the parity of the outputs. The proposed parity-preserving reversible gate can be used to synthesize any arbitrary Boolean function. It makes any fault that affects no more than a single signal readily detectable at the circuit's primary outputs. Finally, it is shown how a fault-tolerant reversible full adder circuit can be realized using only two IGs. It has also been demonstrated that the proposed design offers less hardware complexity and is more efficient in terms of gate count, garbage outputs and constant inputs than the existing counterparts.
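
    The parity-preserving property can be checked exhaustively for a small gate. Because the abstract does not give the truth table of the IG gate, the sketch below uses the well-known 3*3 Fredkin (controlled-swap) gate as a stand-in; the helper names are assumptions.

```python
from itertools import product


def fredkin(a, b, c):
    """Controlled swap: when a is 1, outputs b and c are exchanged."""
    return (a, c, b) if a else (a, b, c)


def parity(bits):
    """Return 1 if the number of ones is odd, else 0."""
    return sum(bits) % 2


inputs = list(product((0, 1), repeat=3))
outputs = [fredkin(*bits) for bits in inputs]

# Reversible: the mapping from inputs to outputs is one-to-one.
is_reversible = len(set(outputs)) == len(inputs)
# Parity preserving: input parity equals output parity for every row.
is_parity_preserving = all(parity(i) == parity(o) for i, o in zip(inputs, outputs))


def detects_single_fault(bits, faulty_line):
    """A single flipped output line breaks the input/output parity match."""
    out = list(fredkin(*bits))
    out[faulty_line] ^= 1
    return parity(out) != parity(bits)
```

    Comparing input parity with output parity therefore exposes any fault confined to one signal, which is the detection argument the abstract makes for circuits built from parity-preserving gates.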

  19. A novel adaptive switching function on fault tolerable sliding mode control for uncertain stochastic systems.

    Science.gov (United States)

    Zahiripour, Seyed Ali; Jalali, Ali Akbar

    2014-09-01

    A novel switching function based on an optimization strategy for the sliding mode control (SMC) method is provided for uncertain stochastic systems subject to actuator degradation, such that the closed-loop system is globally asymptotically stable with probability one. Previous research on the sliding surface has focused on proportional or proportional-integral functions of the states. In this research, a degree of freedom that depends on the designer's choice is used to meet certain objectives. In the design of the switching function, there is a parameter which the designer can regulate for specified objectives. A sliding-mode controller is synthesized to ensure the reachability of the specified switching surface, despite actuator degradation and uncertainties. Finally, the simulation results demonstrate the effectiveness of the proposed method. PMID:24954808

  20. Design and Analysis of a Fault Tolerant Microprocessor Based on Triple Modular Redundancy Using VHDL

    Directory of Open Access Journals (Sweden)

    Deepti Shinghal

    2011-03-01

    There are numerous real-time and operation-critical systems in which failure of the system is unacceptable at any stage of processing. Examples of such systems include ATMs, satellites and spacecraft. In this paper a fault-tolerant microprocessor is developed by using checker units with a fault-secure ALU; the fault-secure ALU is built using parity prediction logic and the two-rail checker method. Finally, triple modular redundancy is applied to develop a fault-tolerant processor. The proposed method was validated using the VHDL test environment, and the results showed that the reliability of the system increased with little area overhead.
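
    Parity prediction for an adder can be sketched in a few lines. Each sum bit is the XOR of the two operand bits and the carry into that position, so the parity of the sum equals the XOR of the operand parities and the parity of the internal carries. The 8-bit width and the function names are assumptions; the paper's VHDL design and its two-rail checker are not reproduced here.

```python
WIDTH = 8
MASK = (1 << WIDTH) - 1


def parity(x):
    """Return 1 if x has an odd number of set bits, else 0."""
    return bin(x).count("1") % 2


def carries_in(a, b, width=WIDTH):
    """Return the word whose bit i is the carry into bit position i."""
    carry = 0
    word = 0
    for i in range(width):
        word |= carry << i
        ai, bi = (a >> i) & 1, (b >> i) & 1
        carry = (ai & bi) | (ai & carry) | (bi & carry)
    return word


def predicted_parity(a, b):
    """Parity the sum must have, computed without using the sum itself."""
    return parity(a) ^ parity(b) ^ parity(carries_in(a, b))


def check_adder(a, b, result):
    """Compare the parity of the adder output with the predicted parity."""
    return parity(result & MASK) == predicted_parity(a, b)


ok = check_adder(0x5A, 0x3C, (0x5A + 0x3C) & MASK)              # fault-free
caught = not check_adder(0x5A, 0x3C, ((0x5A + 0x3C) & MASK) ^ 0x10)  # bit flip
```

    A single flipped bit in the result changes its parity and is caught; a fault inside the carry chain needs separate treatment, since the prediction in this sketch reuses the carries.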

  1. Software fault tolerance using data diversity

    Science.gov (United States)

    Knight, John C.

    1991-01-01

    Research on data diversity is discussed. Data diversity relies on a different form of redundancy from existing approaches to software fault tolerance and is substantially less expensive to implement. Data diversity can also be applied to software testing and greatly facilitates the automation of testing. Up to now it has been explored both theoretically and in a pilot study, and has been shown to be a promising technique. The effectiveness of data diversity as an error detection mechanism and the application of data diversity to differential equation solvers are discussed.
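
    The idea of data diversity can be illustrated with a retry arrangement: the same program is rerun on logically equivalent, re-expressed inputs until an acceptance test passes. Everything below is a hypothetical sketch; `fragile_sine` stands in for a program with an input-dependent bug, and the re-expressions and acceptance test are chosen only for this example.

```python
import math


def fragile_sine(x):
    """Stand-in program that fails for one particular input (hypothetical)."""
    if abs(x - 1.0) < 1e-9:
        return 2.0          # wrong answer inside the failure region
    return math.sin(x)


# Re-expressions: different inputs that should give the same answer.
RE_EXPRESSIONS = (
    lambda x: x,                 # original input
    lambda x: x + 2 * math.pi,   # sin is periodic
    lambda x: math.pi - x,       # sin(pi - x) == sin(x)
)


def acceptance_test(y):
    """Cheap plausibility check on the output."""
    return -1.0 <= y <= 1.0


def retry_block(x):
    """Run the same program on re-expressed inputs until one is accepted."""
    for re_express in RE_EXPRESSIONS:
        y = fragile_sine(re_express(x))
        if acceptance_test(y):
            return y
    raise RuntimeError("all re-expressed inputs failed the acceptance test")


recovered = retry_block(1.0)     # first attempt fails, second succeeds
```

    Only one version of the program is needed, which is why this form of redundancy is cheaper than design diversity; its effectiveness depends on re-expressions that move the input out of the failure region.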

  2. Checkpoint-based Intelligent Fault tolerance For Cloud Service Providers

    Directory of Open Access Journals (Sweden)

    Rejin Paul

    2012-12-01

    With the increasing demand for and benefits of cloud computing infrastructure, real-time computing can be performed on cloud infrastructure. A real-time system can take advantage of the intensive computing capabilities and scalable virtualized environment of cloud computing to execute real-time tasks. In most real-time cloud applications, processing is done on remote cloud computing nodes, so there is a greater chance of errors due to the undetermined latency and loose control over the computing node. On the other hand, most real-time systems are also safety critical and should be highly reliable. There is therefore an increased requirement for fault tolerance to achieve reliability for real-time computing on cloud infrastructure. This paper proposes a smart checkpoint infrastructure for virtualized service providers and a fault tolerance model for real-time cloud computing. The checkpoints are stored in a Hadoop Distributed File System. This allows a task execution to resume faster after a node crash and increases the fault tolerance of the system, since checkpoints are distributed and replicated in all the nodes of the provider. This paper presents a running implementation of this infrastructure and its evaluation, demonstrating that it is an effective way to make faster checkpoints with low interference on task execution and efficient task recovery after a node failure. One advantage of cloud computing is the dynamicity of resource provisioning. Our architecture makes use of this advantage by enabling dynamic run-time modifications of replication groups.
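
    Checkpoint-based recovery can be sketched with a task that periodically saves its progress and resumes from the last checkpoint after a crash. This is a minimal single-machine illustration: a local JSON file stands in for the replicated HDFS storage described in the abstract, and the function names, checkpoint interval and simulated crash are all assumptions.

```python
import json
import os
import tempfile


def save_checkpoint(path, state):
    """Write the state atomically so a crash never leaves a torn file."""
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump(state, f)
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp, path)


def load_checkpoint(path):
    """Return the last saved state, or a fresh one if none exists."""
    try:
        with open(path) as f:
            return json.load(f)
    except FileNotFoundError:
        return {"next_item": 0, "total": 0}


def run_task(path, items, crash_at=None, every=3):
    """Sum the items, checkpointing after every `every` processed items."""
    state = load_checkpoint(path)
    for i in range(state["next_item"], len(items)):
        if i == crash_at:
            raise RuntimeError("simulated node crash")
        state["total"] += items[i]
        state["next_item"] = i + 1
        if state["next_item"] % every == 0:
            save_checkpoint(path, state)
    save_checkpoint(path, state)
    return state["total"]


items = list(range(10))
with tempfile.TemporaryDirectory() as workdir:
    ckpt = os.path.join(workdir, "task.ckpt")
    try:
        run_task(ckpt, items, crash_at=7)   # first attempt dies part-way
    except RuntimeError:
        pass
    result = run_task(ckpt, items)          # restart resumes from item 6
```

    Work done after the last checkpoint is repeated on restart, so the checkpoint interval trades checkpoint overhead against the amount of recomputation after a failure.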

  3. ACID Support and Fault-Tolerant Database Systems on Cloud: A Review

    Directory of Open Access Journals (Sweden)

    Pratiyush Guleria

    2012-10-01

    Cloud computing represents a different way to architect and remotely manage computing resources. One has only to establish an account with Microsoft or Amazon or Google to begin building and deploying application systems into a cloud. These systems can be, but certainly are not restricted to being, simplistic. Some applications require HTTP services; some require a relational database or might require web service infrastructure and message queues. With clouds, IT-related applications can be provided as a service, which can be accessed through the internet. There are cloud platforms which provide scalability and high availability for web applications, but they also raise problems of data consistency, and server failures become a major problem in applications related to payment services. Data needs to be properly managed in a cloud environment, and to achieve proper transaction processing and consistency, RDBMS techniques such as ACID transactions should be used. Web services in Azure ensure application availability by replicating stored data at least three times and offer optional geolocation of replicas in separate Microsoft data centres to provide disaster recovery services. Azure storage services provide scalable persistent storage of structured tables, blobs and queues.

  4. The software-implemented fault tolerance /SIFT/ approach to fault tolerant computing

    Science.gov (United States)

    Goldberg, J.

    1982-01-01

    SIFT is an experimental computer designed for highly reliable flight-control service in advanced air transports. Its development was intended to integrate and demonstrate the latest techniques in fault-tolerant computing. During its development, several new problems of some generality were uncovered and solved. The technology developed for the validation of its design is seen as being perhaps as important as the design itself. The SIFT design is described, as is the way in which the design and its validation were shaped by the requirements of its intended application. Attention is also given to reliability and fault tolerance. The most significant feature of the hardware design is the absence of elements that can generate multiple faults, such as shared clocks or data buses. It is noted that the software is realized in only 800 lines of code, of which 80% are in a high-level language.

  5. Learning Fault-tolerant Speech Parsing with SCREEN

    OpenAIRE

    Wermter, Stefan; Weber, Volker

    1994-01-01

    This paper describes a new approach and a system, SCREEN, for fault-tolerant speech parsing. SCREEN stands for Symbolic Connectionist Robust EnterprisE for Natural language. Speech parsing describes the syntactic and semantic analysis of spontaneous spoken language. The general approach is based on incremental immediate flat analysis, learning of syntactic and semantic speech parsing, parallel integration of current hypotheses, and the consideration of various forms of speech...

  6. Superior model for fault tolerance computation in designing nano-sized circuit systems

    Energy Technology Data Exchange (ETDEWEB)

    Singh, N. S. S., E-mail: narinderjit@petronas.com.my; Muthuvalu, M. S., E-mail: msmuthuvalu@gmail.com [Fundamental and Applied Sciences Department, Universiti Teknologi PETRONAS, Bandar Seri Iskandar, Perak (Malaysia); Asirvadam, V. S., E-mail: vijanth-sagayan@petronas.com.my [Electrical and Electronics Engineering Department, Universiti Teknologi PETRONAS, Bandar Seri Iskandar, Perak (Malaysia)

    2014-10-24

    As CMOS technology scales nano-metrically, reliability turns out to be a decisive subject in the design methodology of nano-sized circuit systems. As a result, several computational approaches have been developed to compute and evaluate the reliability of desired nano-electronic circuits. The process of computing reliability becomes very troublesome and time consuming as the computational complexity builds up with the desired circuit size. Therefore, being able to measure reliability quickly and accurately is fast becoming necessary in designing modern logic integrated circuits. For this purpose, the paper first looks into the development of an automated reliability evaluation tool based on the generalization of the Probabilistic Gate Model (PGM) and Boolean Difference-based Error Calculator (BDEC) models. The Matlab-based tool allows users to significantly speed up the task of reliability analysis for a very large number of nano-electronic circuits. Second, by using the developed automated tool, the paper undertakes a comparative study involving reliability computation and evaluation by the PGM and BDEC models for different implementations of same-functionality circuits. Based on the reliability analysis, BDEC gives exact and transparent reliability measures, but as the complexity of the same-functionality circuits with respect to gate error increases, the reliability measure by BDEC tends to be lower than the reliability measure by PGM. The lower reliability measure by BDEC is explained in this paper using the distribution of different signal input patterns over time for same-functionality circuits. Simulation results conclude that the reliability measure by BDEC depends not only on faulty gates but also on circuit topology, the probability of input signals being one or zero, and the probability of error on signal lines.
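
    The core of a probabilistic gate model can be sketched as follows. The sketch assumes the commonly used formulation in which a gate inverts its fault-free output with probability `eps` and its inputs are statistically independent; the three-NAND example circuit and all names are assumptions, and this is not the Matlab tool described in the abstract.

```python
from itertools import product


def noisy_gate(p_ideal_one, eps):
    """Probability that a gate outputs 1 when it flips its output w.p. eps."""
    return (1 - eps) * p_ideal_one + eps * (1 - p_ideal_one)


def nand(pa, pb, eps):
    """Signal probability at the output of a faulty NAND gate."""
    return noisy_gate(1 - pa * pb, eps)


def circuit_p_one(a, b, c, d, eps):
    """out = NAND(NAND(a, b), NAND(c, d)), a tree without reconvergent fanout."""
    return nand(nand(a, b, eps), nand(c, d, eps), eps)


def reliability(a, b, c, d, eps):
    """Probability that the output equals its fault-free value."""
    p_one = circuit_p_one(a, b, c, d, eps)
    fault_free = circuit_p_one(a, b, c, d, 0.0)   # exactly 0.0 or 1.0
    return p_one if fault_free == 1.0 else 1 - p_one


# Average reliability over all input vectors for a 10% gate error rate.
vectors = list(product((0, 1), repeat=4))
average = sum(reliability(*v, 0.1) for v in vectors) / len(vectors)
```

    The independence assumption is exact for this tree-shaped circuit; with reconvergent fanout the signals become correlated and the simple product `pa * pb` is only an approximation.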

  7. Superior model for fault tolerance computation in designing nano-sized circuit systems

    Science.gov (United States)

    Singh, N. S. S.; Asirvadam, V. S.; Muthuvalu, M. S.

    2014-10-01

    As CMOS technology scales nano-metrically, reliability turns out to be a decisive subject in the design methodology of nano-sized circuit systems. As a result, several computational approaches have been developed to compute and evaluate the reliability of desired nano-electronic circuits. The process of computing reliability becomes very troublesome and time consuming as the computational complexity builds up with the desired circuit size. Therefore, being able to measure reliability quickly and accurately is fast becoming necessary in designing modern logic integrated circuits. For this purpose, the paper first looks into the development of an automated reliability evaluation tool based on the generalization of the Probabilistic Gate Model (PGM) and Boolean Difference-based Error Calculator (BDEC) models. The Matlab-based tool allows users to significantly speed up the task of reliability analysis for a very large number of nano-electronic circuits. Second, by using the developed automated tool, the paper undertakes a comparative study involving reliability computation and evaluation by the PGM and BDEC models for different implementations of same-functionality circuits. Based on the reliability analysis, BDEC gives exact and transparent reliability measures, but as the complexity of the same-functionality circuits with respect to gate error increases, the reliability measure by BDEC tends to be lower than the reliability measure by PGM. The lower reliability measure by BDEC is explained in this paper using the distribution of different signal input patterns over time for same-functionality circuits. Simulation results conclude that the reliability measure by BDEC depends not only on faulty gates but also on circuit topology, the probability of input signals being one or zero, and the probability of error on signal lines.

  8. Superior model for fault tolerance computation in designing nano-sized circuit systems

    International Nuclear Information System (INIS)

    As CMOS technology scales nano-metrically, reliability turns out to be a decisive subject in the design methodology of nano-sized circuit systems. As a result, several computational approaches have been developed to compute and evaluate the reliability of desired nano-electronic circuits. The process of computing reliability becomes very troublesome and time consuming as the computational complexity builds up with the desired circuit size. Therefore, being able to measure reliability quickly and accurately is fast becoming necessary in designing modern logic integrated circuits. For this purpose, the paper first looks into the development of an automated reliability evaluation tool based on the generalization of the Probabilistic Gate Model (PGM) and Boolean Difference-based Error Calculator (BDEC) models. The Matlab-based tool allows users to significantly speed up the task of reliability analysis for a very large number of nano-electronic circuits. Second, by using the developed automated tool, the paper undertakes a comparative study involving reliability computation and evaluation by the PGM and BDEC models for different implementations of same-functionality circuits. Based on the reliability analysis, BDEC gives exact and transparent reliability measures, but as the complexity of the same-functionality circuits with respect to gate error increases, the reliability measure by BDEC tends to be lower than the reliability measure by PGM. The lower reliability measure by BDEC is explained in this paper using the distribution of different signal input patterns over time for same-functionality circuits. Simulation results conclude that the reliability measure by BDEC depends not only on faulty gates but also on circuit topology, the probability of input signals being one or zero, and the probability of error on signal lines.

  9. Data Driven Fault Tolerant Control: A Subspace Approach:

    OpenAIRE

    Dong, J.

    2009-01-01

    Mainstream research on fault detection and fault tolerant control has focused on model-based methods. As far as a model is concerned, changes therein due to faults have to be extracted from measured data. Generally speaking, existing approaches process measured inputs and outputs either by a filter designed based on a known model (e.g. for additive faults), or by an identification scheme to estimate the changed model parameters (e.g. due to multiplicative faults). Since the classica...

  10. A Framework-Based Approach for Fault-Tolerant Service Robots

    Directory of Open Access Journals (Sweden)

    Heejune Ahn

    2012-11-01

    Recently the component-based approach has become a major trend in intelligent service robot development due to its reusability and productivity. The framework in a component-based system should provide essential services for application components. However, to our knowledge the existing robot frameworks do not yet support a fault tolerance service. Moreover, it is often believed that faults can be handled only at the application level. In this paper, by extending the robot framework with a fault tolerance function, we argue that the framework-based fault tolerance approach is feasible and even has many benefits, including that: (1) system integrators can build fault-tolerant applications from non-fault-aware components; (2) the constraints of the components and the operating environment, which cannot be anticipated easily at the time of component development, can be considered at the time of integration; (3) consistency in system reliability can be obtained in spite of diverse application component sources. In the proposed construction, we build XML rule files defining the rules for probing and determining the fault conditions of each component, contamination cases from a faulty component, and the possible recovery and safety methods. The rule files are established by a system integrator, and the fault manager in the framework controls the fault tolerance process according to the rules. We demonstrate that the fault-tolerant framework can incorporate widely accepted fault tolerance techniques. The effectiveness and real-time performance of the framework-based approach and its techniques are examined by testing an autonomous mobile robot in typical fault scenarios.

  11. A Fault-Tolerant Multiprocessor for Real-Time Control Applications

    Science.gov (United States)

    Roberts, Thomas E.; Johnson, Barry W.

    1987-10-01

    This paper presents the design, analysis, and experimental evaluation of a fault-tolerant multiprocessor for use in systems requiring real-time, microprocessor-based control. Example applications of the fault-tolerant system are found in robotics, process control, manufacturing, and factory automation. The architecture for the multiprocessor is presented and analyzed for reliability, availability, and safety. A prototype of the fault-tolerant multiprocessor has been constructed, using Intel 8088 processors, and experimentally evaluated in the laboratory. Both hardware and software descriptions of the system are provided, and an example application to the control of an electric wheelchair is presented.

  12. Documentation of the current fault detection, isolation and reconfiguration software of the AIPS fault-tolerant processor

    Science.gov (United States)

    Lanning, David T.; Shepard, Allen W.; Johnson, Sally C.

    1987-01-01

    Documentation is presented of the December 1986 version of the Ada code for the fault detection, isolation, and reconfiguration (FDIR) functions of the Advanced Information Processing System (AIPS) Fault-Tolerant Processor (FTP). Because the FTP is still under development and the software is constantly undergoing changes, this should not be considered final documentation of the FDIR software of the FTP.

  13. Rule-based fault diagnosis of hall sensors and fault-tolerant control of PMSM

    Science.gov (United States)

    Song, Ziyou; Li, Jianqiu; Ouyang, Minggao; Gu, Jing; Feng, Xuning; Lu, Dongbin

    2013-07-01

    Hall sensors are widely used to estimate the rotor phase of permanent magnet synchronous motors (PMSM). Because rotor position is an essential parameter of the PMSM control algorithm, Hall sensor faults are very dangerous, yet there is scarcely any research on fault diagnosis and fault-tolerant control of Hall sensors used in PMSM. From this standpoint, the Hall sensor faults which may occur during PMSM operation are theoretically analyzed. According to the analysis results, a fault diagnosis algorithm for the Hall sensors, based on three rules, is proposed to classify the fault phenomena accurately. Rotor phase estimation algorithms based on one or two Hall sensor(s) are used to build the fault-tolerant control algorithm. The fault diagnosis algorithm can detect 60 Hall fault phenomena in total, and all detections can be completed within 1/138 of a rotor rotation period. The fault-tolerant control algorithm achieves smooth torque production, i.e. the same control effect as the normal control mode (with three Hall sensors). Finally, a PMSM bench test verifies the accuracy and rapidity of the fault diagnosis and fault-tolerant control strategies. The fault diagnosis algorithm detects all Hall sensor faults promptly, and the fault-tolerant control algorithm allows the PMSM to withstand failure of one or two Hall sensor(s). In addition, the transitions between healthy control and fault-tolerant control are smooth, without additional noise or harshness. The proposed algorithms can deal with Hall sensor faults of PMSM in real applications and can be used to realize fault diagnosis and fault-tolerant control of PMSM.
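
    The abstract does not state the three diagnosis rules, so the following is only a generic sketch of rule-based plausibility checking for three Hall sensors, not the authors' algorithm. It assumes sensors spaced 120 electrical degrees apart, for which the codes 000 and 111 never occur and consecutive valid codes differ in exactly one bit. The bit ordering in VALID_SEQUENCE and the function name check_hall_sample are hypothetical and depend on the actual wiring.

```python
# Hypothetical sketch: generic plausibility rules for three Hall sensors
# spaced 120 electrical degrees apart. These are NOT the three rules of the
# paper; the bit ordering below is an assumption that depends on wiring.

# Six valid codes in rotation order; neighbours differ in exactly one bit.
VALID_SEQUENCE = [0b001, 0b011, 0b010, 0b110, 0b100, 0b101]


def check_hall_sample(prev_code, code):
    """Classify a new 3-bit Hall code given the previous one.

    Returns "ok", "invalid_code" (000 or 111, e.g. a stuck sensor),
    or "invalid_transition" (a jump that skips a sector).
    """
    if code not in VALID_SEQUENCE:
        return "invalid_code"
    if prev_code is None or prev_code not in VALID_SEQUENCE:
        # No trustworthy history to compare against.
        return "ok"
    step = (VALID_SEQUENCE.index(code) - VALID_SEQUENCE.index(prev_code)) % 6
    # 0: same sector, 1: one sector forward, 5: one sector backward.
    if step in (0, 1, 5):
        return "ok"
    return "invalid_transition"
```

    A real diagnosis scheme would additionally use timing between edges to decide which sensor has failed; this sketch only flags that something is inconsistent.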

  14. Fault-tolerance techniques for SRAM-based FPGAs

    CERN Document Server

    Kastensmidt, Fernanda Lima; Reis, Ricardo

    2006-01-01

    Fault-tolerance in integrated circuits is no longer the exclusive concern of space designers or highly-reliable applications engineers. Today, designers of many next-generation products must cope with reduced noise margins. The continuous evolution of fabrication technology of semiconductor components – shrinking transistor geometry, power supply, speed, and logic density – has significantly reduced the reliability of very deep submicron integrated circuits in the face of various internal and external sources of noise. Field Programmable Gate Arrays (FPGAs), customizable by SRAM cells, are the latest advance in the integrated circuit evolution: millions of memory cells to implement the logic, embedded memories, routing, and embedded microprocessor cores. These re-programmable systems-on-chip platforms must be fault-tolerant to cope with current requirements.

  15. Fault Tolerance in ZigBee Wireless Sensor Networks

    Science.gov (United States)

    Alena, Richard; Gilstrap, Ray; Baldwin, Jarren; Stone, Thom; Wilson, Pete

    2011-01-01

    Wireless sensor networks (WSN) based on the IEEE 802.15.4 Personal Area Network standard are finding increasing use in the home automation and emerging smart energy markets. The network and application layers, based on the ZigBee 2007 PRO Standard, provide a convenient framework for component-based software that supports customer solutions from multiple vendors. This technology is supported by System-on-a-Chip solutions, resulting in extremely small and low-power nodes. The Wireless Connections in Space Project addresses the aerospace flight domain for both flight-critical and non-critical avionics. WSNs provide the inherent fault tolerance required for aerospace applications utilizing such technology. The team from Ames Research Center has developed techniques for assessing the fault tolerance of ZigBee WSNs challenged by radio frequency (RF) interference or WSN node failure.

  16. Specifying and Verifying Ultra-reliability and Fault-tolerance Properties

    Science.gov (United States)

    Schwartz, R. L.; Melliar-Smith, P. M.

    1983-01-01

    A methodology to rigorously verify ultrareliability and fault tolerance system properties is described. The methodology utilizes a hierarchy of formal mathematical specifications of system design and incremental design proof to prove the system has the desired properties. A small example of the approach is given, and the application of the methodology to the large scale proof of SIFT, a fault tolerant flight control operating system, is discussed.

  17. An algorithm for automatically obtaining distributed and fault-tolerant static schedules

    OpenAIRE

    Girault, Alain; Kalla, Hamoudi; Sighireanu, Mihaela; Sorel, Yves

    2003-01-01

    Embedded systems account for a major part of critical applications (space, aeronautics, nuclear...) as well as public domain applications (automotive, consumer electronics...). Their main features are: [...] Our goal is to automatically obtain a distributed and fault-tolerant embedded system: distributed because the system must run on a distributed architecture; fault-tolerant because the system is critical. Our starting point is a source algorithm, a target distributed architecture, some d...

  18. Fault Detection and Isolation and Fault Tolerant Control of Wind Turbines Using Set-Valued Observers

    DEFF Research Database (Denmark)

    Casau, Pedro; Rosa, Paulo Andre Nobre

    2012-01-01

    Research on wind turbine Operations & Maintenance (O&M) procedures is critical to the expansion of Wind Energy Conversion systems (WEC). In order to reduce O&M costs and increase the lifespan of the turbine, we study the application of Set-Valued Observers (SVO) to the problem of Fault Detection and Isolation (FDI) and Fault Tolerant Control (FTC) of wind turbines, by taking advantage of the recent advances in SVO theory for model invalidation. A simple wind turbine model is presented along with possible faulty scenarios. The FDI algorithm is built on top of the described model, taking into account process disturbances, uncertainty and sensor noise. The FTC strategy takes advantage of the proposed FDI algorithm, enabling the controller reconfiguration shortly after fault events. Additionally, a robust controller is designed so as to increase the wind turbine's performance during low severity faults. Finally, the FDI algorithm is assessed within a publicly available benchmark model, using Monte-Carlo simulation runs.

  19. Active Fault Tolerant Control for Ultrasonic Piezoelectric Motor

    Science.gov (United States)

    Boukhnifer, Moussa

    2012-07-01

    Ultrasonic piezoelectric motor technology is an important system component in integrated mechatronic devices working under extreme operating conditions. Due to these constraints, robustness and performance of the control interfaces should be taken into account in the motor design. In this paper, we apply a new architecture for fault tolerant control using Youla parameterization to an ultrasonic piezoelectric motor. The distinguishing feature of the proposed controller architecture is that it shows structurally how the controller design for performance and robustness may be done separately, which has the potential to overcome the conflict between performance and robustness in the traditional feedback framework. The fault tolerant control architecture includes two parts: one part for performance and the other part for robustness. The controller design works in such a way that the feedback control system is controlled solely by the proportional plus double-integral (PI²) performance controller for a nominal model without disturbances, and the H∞ robustification controller is activated only in the presence of uncertainties or external disturbances. The simulation results demonstrate the effectiveness of the proposed fault tolerant control architecture.

  20. Beam dynamics calculations for fault-tolerance

    International Nuclear Information System (INIS)

    The European Transmutation Demonstration requires a high-power proton accelerator operating in CW mode. This accelerator is also expected to have a very limited number of unexpected beam interruptions per year. To reach such an ambitious goal, it is clear that reliability-oriented design practices need to be followed from the early stage of component design and fault-tolerance capabilities have to be introduced to the maximum extent. The goal of this document is to investigate in more detail the fault-tolerance capability of the XT-ADS linac. From previous analysis, it appears that if nothing is done, a cavity failure leads in nearly all cases to a complete beam loss, due to the non-relativistic varying velocity of the particles. To avoid such a total beam loss, some kind of retuning has to be performed to compensate for the lack of acceleration due to the faulty cavity. We have to identify and develop fast failure recovery scenarios to ensure that such retuning can be performed in less than 1 second. Two ways are investigated. The first is to stop the beam to achieve the retuning (Scenario 1). The other is to try to perform the retuning without stopping the beam (Scenario 2). The present analysis demonstrates, from the beam dynamics point of view, that a fast retuning procedure can be envisaged without stopping the beam (Scenario 2). Nevertheless, Scenario 2 implies stringent specifications, especially on: - the fault detection time, which has to be extremely short (order of magnitude: 100 µs) and - the margins required on the accelerating field and RF power, which are higher than in Scenario 1.

  1. Using certification trails to achieve software fault tolerance

    Science.gov (United States)

    Sullivan, Gregory F.; Masson, Gerald M.

    1993-01-01

    A conceptually novel and powerful technique to achieve fault tolerance in hardware and software systems is introduced. When used for software fault tolerance, this new technique uses time and software redundancy and can be outlined as follows. In the initial phase, a program is run to solve a problem and store the result. In addition, this program leaves behind a trail of data called a certification trail. In the second phase, another program is run which solves the original problem again. This program, however, has access to the certification trail left by the first program. Because of the availability of the certification trail, the second phase can be performed by a less complex program and can execute more quickly. In the final phase, the two results are compared and, if they agree, they are accepted as correct; otherwise an error is indicated. An essential aspect of this approach is that the second program must always generate either an error indication or a correct output even when the certification trail it receives from the first program is incorrect. The certification trail approach to fault tolerance was formalized and illustrated by applying it to the fundamental problem of finding a minimum spanning tree. Cases in which the second phase can be run concurrently with the first and act as a monitor are discussed. The certification trail approach was compared to other approaches to fault tolerance. Because of space limitations we have omitted examples of our technique applied to the Huffman tree and convex hull problems. These can be found in the full version of this paper.
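
    The three phases outlined above can be sketched on a simpler problem than those in the paper. The example below uses sorting, which is a swapped-in illustration rather than the minimum spanning tree case treated by the authors; the function names and the choice of trail (the sorting permutation) are assumptions made for this sketch. The key property is preserved: the second phase either returns a correct result or raises an error, even if the trail is corrupted.

```python
# Minimal sketch of the certification-trail idea, using sorting as a
# stand-in problem (an assumption; the paper illustrates the technique
# with minimum spanning trees). All names here are hypothetical.


class CertificationError(Exception):
    """Raised when the trail or the two results cannot be trusted."""


def first_phase(data):
    """Solve the problem and leave a trail: the sorting permutation."""
    trail = sorted(range(len(data)), key=lambda i: data[i])
    return [data[i] for i in trail], trail


def second_phase(data, trail):
    """Re-solve in linear time using the trail, validating it fully.

    Either returns a correctly sorted copy of data or raises
    CertificationError, even if the trail is arbitrary garbage.
    """
    n = len(data)
    if len(trail) != n:
        raise CertificationError("trail has wrong length")
    seen = [False] * n
    result = []
    for idx in trail:
        if not isinstance(idx, int) or not 0 <= idx < n or seen[idx]:
            raise CertificationError("trail is not a permutation")
        seen[idx] = True
        if result and data[idx] < result[-1]:
            raise CertificationError("trail does not yield sorted output")
        result.append(data[idx])
    return result


def certified_sort(data):
    """Final phase: accept only if both phases agree."""
    first_result, trail = first_phase(data)
    second_result = second_phase(data, trail)
    if first_result != second_result:
        raise CertificationError("results disagree")
    return first_result
```

    The second phase is simpler than a full sort because it only checks that the trail is a permutation and that following it yields non-decreasing values, which is the sense in which the trail makes the second program less complex.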

  2. Fault Detection and Isolation and Fault Tolerant Control of Wind Turbines Using Set-Valued Observers

    OpenAIRE

    Casau, Pedro; Rosa, Paulo Andre Nobre; Tabatabaeipour, Seyed Mojtaba; Silvestre, Carlos

    2012-01-01

    Research on wind turbine Operations & Maintenance (O&M) procedures is critical to the expansion of Wind Energy Conversion systems (WEC). In order to reduce O&M costs and increase the lifespan of the turbine, we study the application of Set-Valued Observers (SVO) to the problem of Fault Detection and Isolation (FDI) and Fault Tolerant Control (FTC) of wind turbines, by taking advantage of the recent advances in SVO theory for model invalidation. A simple wind turbine model is presented along w...

  3. On the Practicality of `Practical' Byzantine Fault Tolerance

    CERN Document Server

    Chondros, Nikos; Roussopoulos, Mema

    2011-01-01

    Byzantine Fault Tolerant (BFT) systems are considered by the systems research community to be state of the art with regards to providing reliability in distributed systems. BFT systems provide safety and liveness guarantees with reasonable assumptions, amongst a set of nodes where at most f nodes display arbitrarily incorrect behaviors, known as Byzantine faults. Despite this, BFT systems are still rarely used in practice. In this paper we describe our experience, from an application developer's perspective, trying to leverage the publicly available and highly-tuned PBFT middleware (by Castro and Liskov), to provide provable reliability guarantees for an electronic voting application with high security and robustness needs. We describe several obstacles we encountered and drawbacks we identified in the PBFT approach. These include some that we tackled, such as lack of support for dynamic client management and leaving state management completely up to the application. Others still remaining include the lack of...
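
    As background for the bound "at most f nodes" mentioned above: PBFT-style protocols need at least 3f + 1 replicas to tolerate f Byzantine replicas, and use quorums of 2f + 1. The short sketch below only works through that arithmetic; the function names are hypothetical and it is not part of the PBFT middleware's API.

```python
# Arithmetic sketch of the standard PBFT sizing rule (n >= 3f + 1).
# Function names are hypothetical; this is not the PBFT library's API.


def min_replicas(f):
    """Smallest number of replicas that tolerates f Byzantine faults."""
    if f < 0:
        raise ValueError("f must be non-negative")
    return 3 * f + 1


def max_faults(n):
    """Largest f such that n replicas satisfy n >= 3f + 1."""
    if n < 1:
        raise ValueError("n must be positive")
    return (n - 1) // 3


def quorum_size(f):
    """Quorum size used with n = 3f + 1 replicas."""
    return 2 * f + 1


def quorum_overlap(f):
    """Minimum number of replicas shared by any two quorums (n = 3f + 1)."""
    return 2 * quorum_size(f) - min_replicas(f)
```

    With n = 3f + 1, any two quorums share at least f + 1 replicas, so at least one replica in the intersection is non-faulty.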

  4. Fault-Tolerant Postselected Quantum Computation: Threshold Analysis

    OpenAIRE

    Knill, E

    2004-01-01

    The schemes for fault-tolerant postselected quantum computation given in [Knill, Fault-Tolerant Postselected Quantum Computation: Schemes, http://arxiv.org/abs/quant-ph/0402171] are analyzed to determine their error-tolerance. The analysis is based on computer-assisted heuristics. It indicates that if classical and quantum communication delays are negligible, then scalable qubit-based quantum computation is possible with errors above 1% per elementary quantum gate.

  5. A Reflective Object-Oriented Architecture for Developing Fault-Tolerant Software

    Scientific Electronic Library Online (English)

    Buzato, Luiz E.; Rubira, Cecília M. F.; Lisboa, Maria Lúcia B.

    1997-11-01

    Full Text Available This paper proposes a reflective object-oriented architecture for developing fault-tolerant software. Reflective object-oriented programming promotes a modular structuring of systems by means of a new dimension of modularization: the separation between base-level objects and meta-level objects. This property allows the creation of metaobjects responsible for managing tasks of application objects located at the base level. In the context of this work, computational reflection is applied to implement various strategies of fault tolerance at the meta-level in a transparent manner for the application programmer, that is, without interfering with the original structure of application objects that require fault tolerance facilities. The use of the proposed architecture has the following advantages: (i) separation of concerns, that is, it separates the concerns related to the application domain from those related to the implementation of fault-tolerant mechanisms; (ii) it promotes code reuse of fault-tolerance mechanisms; (iii) it allows application programmers to use the most adequate fault-tolerance strategy for their implementation; and (iv) it provides a design that is more adaptable, flexible and easier to extend than traditional designs for developing fault-tolerant software. Our reflective architecture is composed of three levels, and is based on the abstraction of object groups.

  6. Optimized Nanometric Fault Tolerant Reversible BCD Adder

    Directory of Open Access Journals (Sweden)

    Majid Haghparast

    2012-01-01

    Full Text Available In this study a novel nanometric fault tolerant quantum and reversible binary coded decimal (BCD) adder is proposed. Reversible logic has attracted growing attention in optical information processing, quantum computing, nanotechnology and low power design. A BCD adder is a combinational circuit that can be used for the addition of two numbers in BCD arithmetic. The proposed reversible BCD adder also has the parity-preserving property. The proposed circuit is optimized and is compared with the existing circuits in terms of number of constant inputs, number of garbage outputs, quantum cost and hardware complexity. It is reported to improve on all existing counterparts in each of these parameters. It is to be noted that all the circuits have nanometric scales.

  7. A Service-Based Decentralized Architecture for ECU Fault Tolerant Control

    OpenAIRE

    Xia ZHOU

    2012-01-01

    The purpose of this master thesis is to contribute a service-based decentralized architecture for Electronic Control Unit (ECU) with fault tolerant control. As ECU systems are becoming large-scaled, centralized-architecture fault tolerant control is facing challenges in performance, complexity and engineering, for its dependencies, non-modular, non-scalable and so on. In Scania’s ECUs, the architecture is applied by a centralized diagnose system. In this thesis, an alternative solution – ...

  8. Actuator fault-tolerant control based on set separation

    OpenAIRE

    Ocampo-Martinez, Carlos; Doná, J. A.; Serón, María

    2010-01-01

    In this paper, an actuator fault-tolerant control (FTC) strategy based on set separation is presented. The proposed scheme employs a standard configuration consisting of a bank of observers which match the different fault situations that can occur in the plant. Each of these observers has an associated estimation error with a distinctive behaviour when an estimator matches the current fault situation of the plant. With this information from each observer, a fault diagnosis and isolation (FDI) ...

  9. Fault diagnosis and fault-tolerant control and guidance for aerospace vehicles from theory to application

    CERN Document Server

    Zolghadri, Ali; Cieslak, Jerome; Efimov, Denis; Goupil, Philippe

    2014-01-01

    Fault Diagnosis and Fault-Tolerant Control and Guidance for Aerospace demonstrates the attractive potential of recent developments in control for resolving such issues as improved flight performance, self-protection and extended life of structures. Importantly, the text deals with a number of practically significant considerations: tuning, complexity of design, real-time capability, evaluation of worst-case performance, robustness in harsh environments, and extensibility when development or adaptation is required. Coverage of such issues helps to draw the advanced concepts arising from academic research back towards the technological concerns of industry. Initial coverage of basic definitions and ideas and a literature review gives way to a treatment of important electrical flight control system failures: the oscillatory failure case, runaway, and jamming. Advanced fault detection and diagnosis for linear and nonlinear systems are described. Lastly recovery strategies appropriate to remaining actuator/sensor/c...

  10. Fault tolerant wind speed estimator used in wind turbine controllers

    DEFF Research Database (Denmark)

    Odgaard, Peter Fogh; Stoustrup, Jakob

    2012-01-01

    Advanced control schemes can be used to optimize energy production and cost of energy in modern wind turbines. These control schemes most often rely on wind speed estimates. Existing wind speed estimators are, however, not designed to be tolerant of faults in the sensors they use. In this paper a fault tolerant wind speed estimator is designed based on a set of unknown input observers, each designed for a different set of non-faulty sensors. Faults in the rotor, generator and wind speed sensors are considered. The designed wind speed estimator is passively tolerant of faults in the wind speed sensors, and faults in the generator and rotor speed sensors are accommodated by an active fault tolerant observer scheme in which the faults are detected and identified, and the observer corresponding to the non-faulty sensors is used. The potential of the scheme is shown by applying the proposed wind speed estimator to a simulation model of a wind turbine. Note that since the faults are accommodated in the observer scheme, the actual controller does not need to be adjusted or reconfigured to accommodate the sensor faults.

  11. Fault Tolerant Circuit Design Using Evolutionary Algorithms

    Directory of Open Access Journals (Sweden)

    Hui-Cong Wu

    2014-01-01

    Full Text Available With the rapid development of semiconductor technology and the increasing proliferation of emission sources, digital circuits are frequently used in harsh electromagnetic environments. Electrostatic Discharge (ESD) interference is gradually gaining prominence, resulting in performance degradation, malfunctions and disturbances in component- or system-level applications. Conventional solutions to such problems are shielding, filtering and grounding. This paper presents an evolvable hardware (EHW) platform for the automated design and adaptation of a motor control circuit. The platform uses EHW to automate the configuration of an FPGA dedicated to the implementation of the motor control circuit. The ability of the platform to adapt to a certain number of faults was investigated by introducing single logic unit faults and multiple logic unit faults. Results show that the functionality of the circuit can be recovered through evolution. They also show that the placement of faults affects the ability of the GA to evolve a correct circuit, and that the evolutionary recovery ability of the circuit decreases as the number of faulty units increases.

  12. Formal validation of fault-tolerance mechanisms inside GUARDS

    International Nuclear Information System (INIS)

    In this paper we report the experiments carried out during the specification and validation of the fault-tolerance mechanisms developed in the European project Generic Upgradable Architecture for Real-time Dependable Systems (GUARDS). These mechanisms are the components of an architecture developed for embedded safety-critical systems. The validation approach is based on model-checking techniques and exploits the verification methodology supported by the Just Another Concurrency Kit (JACK) environment. The properties that guarantee the desired behaviour of the mechanisms are specified as temporal logic formulae; the JACK model-checker is then used to verify that the behaviour of the mechanisms satisfies such properties also in the presence of faults.

  13. SEIF: Secure and Efficient Intrusion Fault tolerant protocol for Wireless Sensor Networks

    OpenAIRE

    Ouadjaout, Abdelraouf; Challal, Yacine; Lasla, Noureddine; Bagaa, Mouloud

    2008-01-01

    In wireless sensor networks, reliability represents a design goal of a primary concern. To build a comprehensive reliable system, it is essential to consider node failures and intruder attacks as unavoidable phenomena. In this paper, we present a new intrusion-fault tolerant routing scheme offering a high level of reliability through a secure multi-path communication topology. Unlike existing intrusion-fault tolerant solutions, our protocol is based on a distributed and in-network verificatio...

  14. Secure and efficient disjoint multipath construction for fault tolerant routing in wireless sensor networks

    OpenAIRE

    Challal, Yacine; Ouadjaout, Abdelraouf; Lasla, Noureddine; Bagaa, Mouloud; Hadjidj, Abdelkrim

    2011-01-01

    In wireless sensor networks, reliability is a design goal of a primary concern. To build a comprehensive reliable system, it is essential to consider node failures and intruder attacks as unavoidable phenomena. In this paper, we present a new intrusion-fault tolerant routing scheme offering a high level of reliability through a secure multi-path routing construction. Unlike existing intrusion-fault tolerant solutions, our protocol is based on a distributed and in-network verification scheme, ...

  15. Hybrid fault tolerance techniques to detect transient faults in embedded processors

    CERN Document Server

    Azambuja, José Rodrigo; Becker, Jürgen

    2014-01-01

    This book describes fault tolerance techniques based on software and hardware to create hybrid techniques. They are able to reduce overall performance degradation and increase error detection when associated with applications implemented in embedded processors. Coverage begins with an extensive discussion of the current state-of-the-art in fault tolerance techniques. The authors then discuss the best trade-off between software-based and hardware-based techniques and introduce novel hybrid techniques. Proposed techniques increase existing fault detection rates up to 100%, while maintaining low performance overheads in area and application execution time. • Discusses the effects of radiation on modern integrated circuits; • Provides a comprehensive overview of state-of-the art fault tolerance techniques based on software, hardware, and hybrid techniques; • Introduces novel hybrid fault tolerance techniques for reconfigurable FPGAs and ASICs; • Performs fault injection campaigns by simulation, bitstream ...

  16. A New Fault-tolerant Switched Reluctance Motor with reliable fault detection capability

    DEFF Research Database (Denmark)

    Lu, Kaiyuan

    2014-01-01

    For reliable fault detection, often, search coils are used in many fault-tolerant drives. The search coils occupy extra slot space. They are normally open-circuited and are not used for torque production. This degrades the motor performance, increases the cost and manufacture complexity. A new Fault-Tolerant Switched Reluctance (FTSR) motor is proposed in this paper. A unique feature of this special design is that it allows use of the unexcited phase coils as search coils for fault detection. Therefore this new motor has all the advantages of using search coils for reliable fault detection while no extra search coil is actually needed. The motor itself is able to continue to work under any faulted conditions, providing fault-tolerant features. The working principle, performance evaluation of this motor will be demonstrated in this paper and Finite Element Analysis results are provided.

  17. A validation methodology for fault-tolerant clock synchronization

    Science.gov (United States)

    Johnson, S. C.; Butler, R. W.

    1984-01-01

    A validation method for the synchronization subsystem of a fault-tolerant computer system is presented. The high reliability requirement of flight crucial systems precludes the use of most traditional validation methods. The method presented utilizes formal design proof to uncover design and coding errors and experimentation to validate the assumptions of the design proof. The experimental method is described and illustrated by validating an experimental implementation of the Software Implemented Fault Tolerance (SIFT) clock synchronization algorithm. The design proof of the algorithm defines the maximum skew between any two nonfaulty clocks in the system in terms of theoretical upper bounds on certain system parameters. The quantile to which each parameter must be estimated is determined by a combinatorial analysis of the system reliability. The parameters are measured by direct and indirect means, and upper bounds are estimated. A nonparametric method based on an asymptotic property of the tail of a distribution is used to estimate the upper bound of a critical system parameter. Although the proof process is very costly, it is extremely valuable when validating the crucial synchronization subsystem.
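
    The abstract does not spell out the synchronization algorithm itself. As a hedged illustration only, the sketch below shows one round of an interactive-convergence style correction, the kind of averaging scheme commonly described for SIFT-like systems: readings that differ from the local clock by more than a threshold are treated as zero difference, and the clock is adjusted by the mean. That SIFT's implementation matches this exactly is an assumption, and the names convergence_correction, skews and delta are hypothetical.

```python
# Hedged sketch of one round of an interactive-convergence style clock
# correction. Treating this as SIFT's exact algorithm is an assumption;
# all names and the parameter delta are hypothetical.


def convergence_correction(own_index, skews, delta):
    """Return the correction to apply to the local clock.

    skews[j] is the perceived difference (clock j minus local clock).
    Readings whose magnitude exceeds delta are presumed faulty and are
    replaced by 0, i.e. by the local clock's own value.
    """
    if not skews:
        raise ValueError("at least one clock reading is required")
    adjusted = []
    for j, skew in enumerate(skews):
        if j == own_index or abs(skew) > delta:
            adjusted.append(0.0)
        else:
            adjusted.append(skew)
    return sum(adjusted) / len(adjusted)
```

    For example, with four clocks, readings [0.0, 2.0, -1.0, 50.0] and delta = 5.0, the outlier 50.0 is discarded and the correction is (0 + 2 - 1 + 0) / 4 = 0.25.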

  18. [Advanced Development for Space Robotics With Emphasis on Fault Tolerance Technology]

    Science.gov (United States)

    Tesar, Delbert

    1997-01-01

    This report describes work developing fault tolerant redundant robotic architectures and adaptive control strategies for robotic manipulator systems which can dynamically accommodate drastic robot manipulator mechanism, sensor or control failures and maintain stable end-point trajectory control with minimum disturbance. Kinematic designs of redundant, modular, reconfigurable arms for fault tolerance were pursued at a fundamental level. The approach developed robotic testbeds to evaluate disturbance responses of fault tolerant concepts in robotic mechanisms and controllers. The development was implemented in various fault tolerant mechanism testbeds including duality in the joint servo motor modules, parallel and serial structural architectures, and dual arms. All have real-time adaptive controller technologies to react to mechanism or controller disturbances (failures) to perform real-time reconfiguration to continue the task operations. The developments fall into three main areas: hardware, software, and theoretical.

  19. Fusion of Built in Test (BIT) Technologies with Embeddable Fault Tolerant Techniques for Power System and Drives in Space Exploration Project

    National Aeronautics and Space Administration — Impact Technologies has proposed development of an effective prognostic and fault accommodation system for critical DC power systems including PV systems. Overall...

  20. Exact Regenerating Codes for Byzantine Fault Tolerance in Distributed Storage

    CERN Document Server

    Han, Yunghsiang S; Mow, Wai Ho

    2011-01-01

    Due to the use of commodity software and hardware, crash-stop and Byzantine failures are likely to be more prevalent in today's large-scale distributed storage systems. Regenerating codes have been shown to be a more efficient way to disperse information across multiple nodes and recover crash-stop failures in the literature. In this paper, we present the design of regeneration codes in conjunction with integrity check that allows exact regeneration of failed nodes and data reconstruction in presence of Byzantine failures. A progressive decoding mechanism is incorporated in both procedures to leverage computation performed thus far. The fault-tolerance and security properties of the schemes are also analyzed.

  1. Fault Tolerant Neuro-Robust Position Control of DC Motors

    Directory of Open Access Journals (Sweden)

    Ran Zhang

    2011-10-01

    Full Text Available DC motors are widely used in industries such as mechanics, robotics, and aerospace engineering. In this paper, we present a high performance control method for position control of DC motors. Fault-tolerant control models are also addressed and combined with the neuro-robust control approach. It is shown that with the proposed control algorithms, external disturbances and coupled dynamics inherent in the system are effectively compensated using a neural network unit, for which no analytical estimate of the upper bound of the reconstruction error and uncertainties is needed. Simulations on various flight conditions also confirm the effectiveness of the proposed methods.

  2. Fault-tolerant control under controller-driven sampling using virtual actuator strategy

    OpenAIRE

    Osella, Esteban N.; Haimovich, Hernan; Seron, María M.

    2013-01-01

    We present a new output feedback fault tolerant control strategy for continuous-time linear systems. The strategy combines a digital nominal controller under controller-driven (varying) sampling with virtual-actuator (VA)-based controller reconfiguration to compensate for actuator faults. In the proposed scheme, the controller controls both the plant and the sampling period, and performs controller reconfiguration by engaging in the loop the VA adapted to the diagnosed fault...

  3. A Tool for Assessing Fault Tolerance Mechanisms applied to Web Service applications

    OpenAIRE

    Farj, Khaled; Speirs, Neil

    2009-01-01

    Testing Fault Tolerance Mechanisms (FTMs) is crucial for the development of today's Web Service applications. In this work, we propose a methodology for assessing the efficacy of FTMs applied to Web Service applications distributed over the Internet. We present a tool that uses application-level fault injection techniques to inject communication faults by using a network Emulator Service. The emulator also generates additional workload on the tested system in order to produce more realistic...

  4. A Novel Nanometric Fault Tolerant Reversible Subtractor Circuit

    OpenAIRE

    Mozhgan Shiri; Majid Haghparast; Vahid Shahbazi

    2012-01-01

    Reversibility plays an important role when energy-efficient computations are considered. Reversible logic circuits have received significant attention in quantum computing, low-power CMOS design, optical information processing and nanotechnology in recent years. This study proposes a new fault tolerant reversible half-subtractor and a new fault tolerant reversible full-subtractor circuit with nanometric scales. Also in this paper we demonstrate how the well-known and important, PERES gate...

  5. Fault tolerant task execution through global trajectory planning

    International Nuclear Information System (INIS)

    Whether a task can be completed after a failure of one of the degrees-of-freedom of a redundant manipulator depends on the joint angle at which the failure takes place. It is possible to achieve fault tolerance by globally planning a trajectory that avoids unfavourable joint positions before a failure occurs. In this article, we present a trajectory planning algorithm that guarantees fault tolerance while simultaneously satisfying joint limit and obstacle avoidance requirements

  6. Advanced development for space robotics with emphasis on fault tolerance

    Science.gov (United States)

    Tesar, D.; Chladek, J.; Hooper, R.; Sreevijayan, D.; Kapoor, C.; Geisinger, J.; Meaney, M.; Browning, G.; Rackers, K.

    1995-01-01

    This paper describes the ongoing work in fault tolerance at the University of Texas at Austin. The paper describes the technical goals the group is striving to achieve and includes a brief description of the individual projects focusing on fault tolerance. The ultimate goal is to develop and test technology applicable to all future missions of NASA (lunar base, Mars exploration, planetary surveillance, space station, etc.).

  7. Fault Tolerance Structure of Radix 2 Signed Digital Adders

    Directory of Open Access Journals (Sweden)

    Jishun Kuang

    2012-01-01

    In this study, a fault-tolerant adder structure based on the Radix 2 Signed Digital (SD) representation is proposed. The “carry-free” property of the SD adder, which limits the impact of a fault to a few digits, can be used for fault detection based on parity checking under a single-fault assumption. An encoding scheme yields the parity value of the digits involved in the computation, and these parity values can be exploited to check the circuit. An error information register stores the checking results, and the bits of the register indicate whether the corresponding units are faulty. Depending on the fault type, recomputation or reconfiguration is used for error correction. The hardware overhead of adding fault tolerance is about 120%, and the maximum combinational path delay of the proposed adder remains constant as operand size increases.
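
    The "carry-free" property mentioned in this abstract can be illustrated with a small sketch. The code below is not the authors' circuit or parity scheme; it is the textbook two-step radix-2 signed-digit addition rule, with the function names `sd_add` and `sd_value` chosen here purely for illustration. Digits are stored least significant first, and each sum digit depends only on its own position and the two positions below it, so under these assumptions a single faulty input digit can disturb at most three output digits.

```python
def sd_value(digits):
    """Integer value of a radix-2 signed-digit number (LSB first)."""
    return sum(d * 2**i for i, d in enumerate(digits))


def sd_add(x, y):
    """Carry-free addition of two signed-digit numbers.

    x and y are equal-length lists of digits in {-1, 0, 1}, least
    significant digit first. Returns len(x) + 1 digits in {-1, 0, 1}.
    """
    n = len(x)
    assert len(y) == n
    transfer = [0] * (n + 1)  # transfer[i] is the transfer into position i
    interim = [0] * n         # interim sum digit at each position
    for i in range(n):
        p = x[i] + y[i]
        # Only the next-lower digit pair is inspected; there is no carry chain.
        lower_nonneg = i == 0 or (x[i - 1] >= 0 and y[i - 1] >= 0)
        if p == 2:
            t, w = 1, 0
        elif p == 1:
            t, w = (1, -1) if lower_nonneg else (0, 1)
        elif p == 0:
            t, w = 0, 0
        elif p == -1:
            t, w = (0, -1) if lower_nonneg else (-1, 1)
        else:  # p == -2
            t, w = -1, 0
        transfer[i + 1] = t
        interim[i] = w
    # Each final digit is one interim digit plus one incoming transfer.
    return [interim[i] + transfer[i] for i in range(n)] + [transfer[n]]


x = [1, 1, 0, 1]   # 1 + 2 + 8 = 11
y = [1, 0, 1, 0]   # 1 + 4 = 5
total = sd_add(x, y)
print(sd_value(total))  # 16
```

    Because the delay of each digit position is independent of the word length in this rule, the sketch is consistent with the abstract's statement that the combinational path delay does not grow with the operands.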

  8. A universal, fault-tolerant, non-linear analytic network for modeling and fault detection

    International Nuclear Information System (INIS)

    The similarities and differences of a universal network to normal neural networks are outlined. The description and application of a universal network is discussed by showing how a simple linear system is modeled by normal techniques and by universal network techniques. A full implementation of the universal network as universal process modeling software on a dedicated computer system at EBR-II is described and example results are presented. It is concluded that the universal network provides different feature recognition capabilities than a neural network and that the universal network can provide extremely fast, accurate, and fault-tolerant estimation, validation, and replacement of signals in a real system

  9. A universal, fault-tolerant, non-linear analytic network for modeling and fault detection

    Energy Technology Data Exchange (ETDEWEB)

    Mott, J.E. [Advanced Modeling Techniques Corp., Idaho Falls, ID (United States); King, R.W.; Monson, L.R.; Olson, D.L.; Staffon, J.D. [Argonne National Lab., Idaho Falls, ID (United States)

    1992-03-06

    The similarities and differences of a universal network to normal neural networks are outlined. The description and application of a universal network is discussed by showing how a simple linear system is modeled by normal techniques and by universal network techniques. A full implementation of the universal network as universal process modeling software on a dedicated computer system at EBR-II is described and example results are presented. It is concluded that the universal network provides different feature recognition capabilities than a neural network and that the universal network can provide extremely fast, accurate, and fault-tolerant estimation, validation, and replacement of signals in a real system.

  10. Wind turbine fault detection and fault tolerant control : An enhanced benchmark challenge

    DEFF Research Database (Denmark)

    Odgaard, Peter Fogh; Johnson, Kathryn

    2013-01-01

    In this updated edition of a previous wind turbine fault detection and fault tolerant control challenge, we present a more sophisticated wind turbine model and updated fault scenarios to enhance the realism of the challenge and therefore the value of the solutions. This paper describes the challenge model and the requirements for challenge participants. In addition, it motivates many of the faults by citing publications that give field data from wind turbine control tests.

  11. Local fault-tolerant quantum computation

    International Nuclear Information System (INIS)

    We analyze and study the effects of locality on the fault-tolerance threshold for quantum computation. We analytically estimate how the threshold will depend on a scale parameter r which characterizes the scale-up in the size of the circuit due to encoding. We carry out a detailed seminumerical threshold analysis for concatenated coding using the seven-qubit CSS code in the local and the 'nonlocal' setting. First, we find that the threshold in the local model for the [7,1,3] code has a 1/r dependence, which is in correspondence with our analytical estimate. Second, the threshold, beyond the 1/r dependence, does not depend too strongly on the noise levels for transporting qubits. Beyond these results, we find that it is important to look at more than one level of concatenation in order to estimate the threshold and that it may be beneficial in certain places, like in the transportation of qubits, to do error correction only infrequently

  12. Architectural concepts and redundancy techniques in fault-tolerant computers

    Science.gov (United States)

    Rennels, D. A.

    1974-01-01

    This paper presents a description of redundancy techniques employed in the design of fault-tolerant computers, and a discussion of the effects of functional requirements, technology constraints, and cost considerations which enter into the choice of these techniques. The STAR computer, developed at the Jet Propulsion Laboratory for long-duration planetary spacecraft missions, is discussed along with several later fault-tolerant computer designs. The class of computers described in this paper employs dynamic redundancy, i.e., the machine is divided into a set of submodules, each with standby spares; a special hard core monitor unit detects and diagnoses faults, and effects automated recovery by replacing failed parts.

  13. Adaptive Fault Tolerance for Many-Core Based Space-Borne Computing

    Science.gov (United States)

    James, Mark; Springer, Paul; Zima, Hans

    2010-01-01

    This paper describes an approach to providing software fault tolerance for future deep-space robotic NASA missions, which will require a high degree of autonomy supported by an enhanced on-board computational capability. Such systems have become possible as a result of the emerging many-core technology, which is expected to offer 1024-core chips by 2015. We discuss the challenges and opportunities of this new technology, focusing on introspection-based adaptive fault tolerance that takes into account the specific requirements of applications, guided by a fault model. Introspection supports runtime monitoring of the program execution with the goal of identifying, locating, and analyzing errors. Fault tolerance assertions for the introspection system can be provided by the user, domain-specific knowledge, or via the results of static or dynamic program analysis. This work is part of an on-going project at the Jet Propulsion Laboratory in Pasadena, California.

  14. Coordinated Fault-Tolerance for High-Performance Computing Final Project Report

    Energy Technology Data Exchange (ETDEWEB)

    Panda, Dhabaleswar Kumar [The Ohio State University; Beckman, Pete

    2011-07-28

    With the Coordinated Infrastructure for Fault Tolerance Systems (CIFTS, as the original project came to be called) project, our aim has been to understand and tackle the following broad research questions, the answers to which will help the HEC community analyze and shape the direction of research in the field of fault tolerance and resiliency on future high-end leadership systems. Will availability of global fault information, obtained by fault information exchange between the different HEC software on a system, allow individual system software to better detect, diagnose, and adaptively respond to faults? If fault-awareness is raised throughout the system through fault information exchange, is it possible to get all system software working together to provide a more comprehensive end-to-end fault management on the system? What are the missing fault-tolerance features that widely used HEC system software lacks today that would inhibit such software from taking advantage of systemwide global fault information? What are the practical limitations of a systemwide approach for end-to-end fault management based on fault awareness and coordination? What mechanisms, tools, and technologies are needed to bring about fault awareness and coordination of responses on a leadership-class system? What standards, outreach, and community interaction are needed for adoption of the concept of fault awareness and coordination for fault management on future systems? Keeping our overall objectives in mind, the CIFTS team has taken a parallel fourfold approach. Our central goal was to design and implement a light-weight, scalable infrastructure with a simple, standardized interface to allow communication of fault-related information through the system and facilitate coordinated responses. 
    This work led to the development of the Fault Tolerance Backplane (FTB) publish-subscribe API specification, together with a reference implementation and several experimental implementations on top of existing publish-subscribe tools. We enhanced the intrinsic fault tolerance capabilities of representative implementations of a variety of key HPC software subsystems and integrated them with the FTB. Targeted software subsystems included MPI communication libraries, checkpoint/restart libraries, resource managers and job schedulers, and system monitoring tools. Leveraging the aforementioned infrastructure, as well as developing and utilizing additional tools, we have examined issues associated with expanded, end-to-end fault response from both system and application viewpoints. From the standpoint of system operations, we have investigated log and root cause analysis, anomaly detection and fault prediction, and generalized notification mechanisms. Our applications work has included libraries for fault-tolerant linear algebra, application frameworks for coupled multiphysics applications, and external frameworks to support the monitoring and response for general applications. Our final goal was to engage the high-end computing community to increase awareness of tools and issues around coordinated end-to-end fault management.

  15. Fault Analysis on VSI Fed Induction Motor Drive with Fault Tolerant Strategy

    OpenAIRE

    Nagarajan, S.; S. Rama REDDY

    2014-01-01

    The aim of this study is to design and implement a fault tolerant inverter for induction motor drive. The operations of the induction motor drives are so crucial in some applications that any fault in the drive could result in serious loss to the industry in terms of capital, process and materials, not to mention the wastage due to idle labor time. Hence, it is essential that an induction motor drive should basically be fault tolerant. This study investigates some of the possible faults in th...

  16. Reliable Energy Efficient Fault Tolerant Clustering in Wireless Sensor Network

    Directory of Open Access Journals (Sweden)

    L. Venkatesan

    2014-01-01

    This study proposes a Reliable, Energy Efficient, Fault Tolerant (REEFT) clustering algorithm for aggregating sensor measurements in a Wireless Sensor Network (WSN). It is a hierarchical algorithm in which energy efficiency is achieved by constructing static clusters with reliable cluster heads based on distance. The lifetime of the WSN is improved by addressing important issues in WSNs: the distribution of clusters, the optimal number of clusters, the number of nodes in a cluster, and the optimal duration of the clustering cycle. The algorithm also includes a fault tolerance feature to tolerate Cluster Head (CH) failure and improve the packet delivery ratio. The algorithm was tested using simulations and its performance improvements were analyzed.

  17. Reconfigurable tree architectures using subtree oriented fault tolerance

    Science.gov (United States)

    Lowrie, Matthew B.; Fuchs, W. Kent

    1987-01-01

    An approach to the design of reconfigurable tree architectures is presented in which spare processors are allocated at the leaves. The approach is unique in that spares are associated with subtrees, and sharing of spares between these subtrees can occur. The subtree-oriented fault-tolerance approach is more reliable than previous approaches capable of tolerating link and switch failures for both single-chip and multichip tree implementations while reducing redundancy in terms of both spare processors and links. VLSI layout is O(n) for binary trees and is directly extensible to N-ary trees and fault tolerance through performance degradation.

  18. A Benchmark Evaluation of Fault Tolerant Wind Turbine Control Concepts

    Energy Technology Data Exchange (ETDEWEB)

    Odgaard, Peter F.; Stoustrup, Jakob

    2015-05-01

    As the world’s power supply depends to a larger and larger degree on wind turbines, it is increasingly important that these are as reliable and available as possible. Modern fault tolerant control could play a substantial part in increasing the reliability of modern wind turbines. A benchmark model for wind turbine fault detection and isolation and fault tolerant control has previously been proposed. Based on this benchmark, an international competition on wind turbine fault tolerant control was announced. In this article the top three solutions from that competition are presented and evaluated. The analysis shows that the winner of the competition in particular shows potential for wind turbine fault tolerant control. In addition to showing good performance, the approach is based on a method that is relevant for industrial usage. It is based on a virtual sensor and actuator strategy, in which the fault accommodation is handled in software sensor and actuator blocks. This means that the wind turbine controller can continue operation as in the fault-free case. The other two evaluated solutions show some potential but clearly need improvements.

  19. An approach to the verification of a fault-tolerant, computer-based reactor safety system: A case study using automated reasoning: Volume 1: Interim report

    International Nuclear Information System (INIS)

    The purpose of this project is to explore the feasibility of automating the verification process for computer systems. The intent is to demonstrate that both the software and hardware that comprise the system meet specified availability and reliability criteria, that is, total design analysis. The approach to automation is based upon the use of Automated Reasoning Software developed at Argonne National Laboratory. This approach is herein referred to as formal analysis and is based on previous work on the formal verification of digital hardware designs. Formal analysis represents a rigorous evaluation which is appropriate for system acceptance in critical applications, such as a Reactor Safety System (RSS). This report describes a formal analysis technique in the context of a case study, that is, demonstrates the feasibility of applying formal analysis via application. The case study described is based on the Reactor Safety System (RSS) for the Experimental Breeder Reactor-II (EBR-II). This is a system where high reliability and availability are tantamount to safety. The conceptual design for this case study incorporates a Fault-Tolerant Processor (FTP) for the computer environment. An FTP is a computer which has the ability to produce correct results even in the presence of any single fault. This technology was selected as it provides a computer-based equivalent to the traditional analog based RSSs. This provides a more conservative design constraint than that imposed by the IEEE Standard, Criteria For Protection Systems For Nuclear Power Generating Stations (ANSI N42.7-1972)

  20. Development of software fault-tolerance techniques

    Science.gov (United States)

    Melliar-Smith, P. M.

    1983-01-01

    As computers become more widely used, and in particular as they become used in more safety critical applications, the reliability of the computer system and its software becomes more important. There is also an increasing need for high levels of reliability in applications involving very large numbers of inexpensive units where recall of the units would be disproportionately expensive. The nature of faults and the assumptions made by different approaches to correct operation are considered. The recovery block approach is described and a probabilistic analysis of its effectiveness, with and without correlated design errors is provided. Mechanisms for generating acceptance tests from specifications, and for providing recovery in the presence of asynchrony, are described. An analysis of, and design for, the provision of recovery blocks in the microprogram of the Bendix BDX930 processor is provided. An example of the use of recovery blocks in a simple operating system is also provided.
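
    The recovery block approach named in this abstract can be sketched in a few lines. The following is a minimal illustration under stated assumptions, not the implementation described in the report: the function name `recovery_block`, the use of `copy.deepcopy` as the recovery point, and the sorting example are all hypothetical choices made here. The idea shown is the general one: save the state, run the primary routine, check its result with an acceptance test, and on failure roll back and try the next alternate.

```python
import copy


def recovery_block(state, alternates, acceptance_test):
    """Run alternates in order until one passes the acceptance test.

    Each alternate receives a fresh copy of the saved state, so a failed
    attempt cannot corrupt the input seen by the next attempt.
    """
    checkpoint = copy.deepcopy(state)          # establish the recovery point
    for alternate in alternates:
        candidate = copy.deepcopy(checkpoint)  # roll back before each try
        try:
            result = alternate(candidate)
        except Exception:
            continue                           # a crash counts as a failed try
        if acceptance_test(result):
            return result
    raise RuntimeError("all alternates failed the acceptance test")


data = [3, 1, 2]


def faulty_sort(values):
    """Primary routine with a deliberate design error: drops one element."""
    values.sort()
    return values[:-1]


def simple_sort(values):
    """Independent, simpler alternate."""
    return sorted(values)


def acceptable(result):
    """Acceptance test: same length as the input and in ascending order."""
    in_order = all(a <= b for a, b in zip(result, result[1:]))
    return len(result) == len(data) and in_order


result = recovery_block(data, [faulty_sort, simple_sort], acceptable)
print(result)  # [1, 2, 3]
```

    The acceptance test here is deliberately weaker than a full specification check (it does not verify that the output is a permutation of the input), which reflects the usual trade-off between test cost and coverage in this technique.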

  1. Active and Passive Fault-Tolerant LPV Control of Wind Turbines

    DEFF Research Database (Denmark)

    Sloth, Christoffer; Esbensen, Thomas

    2010-01-01

    This paper addresses the design and comparison of active and passive fault-tolerant linear parameter-varying (LPV) controllers for wind turbines. The considered wind turbine plant model is characterized by parameter variations along the nominal operating trajectory and includes a model of an incipient fault in the pitch system. We propose the design of an active fault-tolerant controller (AFTC) based on an existing LPV controller design method and extend this method to apply for the design of a passive fault-tolerant controller (PFTC). Both controllers are based on output feedback and are scheduled on the varying parameter to manage the parameter-varying nature of the model. The PFTC only relies on measured system variables and an estimated wind speed, while the AFTC also relies on information from a fault diagnosis system. Consequently, the optimization problem involved in designing the PFTC is more difficult to solve, as it involves solving bilinear matrix inequalities (BMIs) instead of linear matrix inequalities (LMIs). Simulation results show the performance of the active fault-tolerant control system to be slightly superior to that of the passive fault-tolerant control system.

  2. Solar system fault detection

    Science.gov (United States)

    Farrington, R.B.; Pruett, J.C. Jr.

    1984-05-14

    A fault detecting apparatus and method are provided for use with an active solar system. The apparatus provides an indication as to whether one or more predetermined faults have occurred in the solar system. The apparatus includes a plurality of sensors, each sensor being used in determining whether a predetermined condition is present. The outputs of the sensors are combined in a pre-established manner in accordance with the kind of predetermined faults to be detected. Indicators communicate with the outputs generated by combining the sensor outputs to give the user of the solar system and the apparatus an indication as to whether a predetermined fault has occurred. Upon detection and indication of any predetermined fault, the user can take appropriate corrective action so that the overall reliability and efficiency of the active solar system are increased.

  3. A Remote Characterization System and a fault-tolerant tracking system for subsurface mapping of buried waste sites

    International Nuclear Information System (INIS)

    This paper describes two closely related projects that will provide new technology for characterizing hazardous waste burial sites. The first project, a collaborative effort by five of the national laboratories, involves the development and demonstration of a remotely controlled site characterization system. The Remote Characterization System (RCS) includes a unique low-signature survey vehicle, a base station, radio telemetry data links, satellite-based vehicle tracking, stereo vision, and sensors for noninvasive inspection of the surface and subsurface. The second project, conducted by the Idaho National Engineering Laboratory (INEL), involves the development of a position sensing system that can track a survey vehicle or instrument in the field. This system can coordinate updates at a rate of 200/s with an accuracy better than 0.1% of the distance separating the target and the sensor. It can employ acoustic or electromagnetic signals in a wide range of frequencies and can be operated as a passive or active device

  4. Superconducting generator field winding design for high fault tolerance

    International Nuclear Information System (INIS)

    Development of rotating electrical machines with superconducting field windings is proceeding at numerous sites worldwide. The primary emphasis is on large turbine generators for application to power systems. The EPRI/Westinghouse 300 MVA superconducting generator program is directed towards demonstration of the technology in an actual utility environment for a long period of time. The concept of stability, in the case of superconducting generators, includes the traditional concepts of stability with respect to the electromechanical interactions and oscillations of the machine with the power system as well as the thermohydraulic stability of the cryogenic rotor and its helium supply system. Power system disturbances, such as faults, produce flow and pressure transients in the rotor cooling system. Depending upon the severity and time history of the disturbances, these transients may occasion normalization of the superconductor and destabilize the generator output through loss of field excitation. This paper addresses the question of designing the superconducting winding and its cryogenic cooling system for stability in the presence of large disturbances, a capability which has been called high fault tolerance

  5. Fault tolerant control of multivariable processes using auto-tuning PID controller.

    Science.gov (United States)

    Yu, Ding-Li; Chang, T K; Yu, Ding-Wen

    2005-02-01

    Fault tolerant control of dynamic processes is investigated in this paper using an auto-tuning PID controller. A fault tolerant control scheme is proposed comprising an auto-tuning PID controller based on an adaptive neural network model. The model is trained online using the extended Kalman filter (EKF) algorithm to learn system post-fault dynamics. Based on this model, the PID controller adjusts its parameters to compensate for the effects of the faults, so that the control performance is recovered from degradation. The auto-tuning algorithm for the PID controller is derived with the Lyapunov method and therefore, the model predicted tracking error is guaranteed to converge asymptotically. The method is applied to a simulated two-input two-output continuous stirred tank reactor (CSTR) with various faults, which demonstrates the applicability of the developed scheme to industrial processes. PMID:15719931

  6. MCNP load balancing and fault tolerance with PVM

    International Nuclear Information System (INIS)

    Version 4A of the Monte Carlo neutron, photon, and electron transport code MCNP, developed by LANL (Los Alamos National Laboratory), supports distributed-memory multiprocessing through the software package PVM (Parallel Virtual Machine, version 3.1.4). Using PVM for interprocessor communication, MCNP can simultaneously execute a single problem on a cluster of UNIX-based workstations. This capability provided system efficiencies that exceeded 80% on dedicated workstation clusters; however, on heterogeneous or multiuser systems, the performance was limited by the slowest processor (i.e., equal work was assigned to each processor). The next public release of MCNP will provide multiprocessing enhancements that include load balancing and fault tolerance, which are shown to dramatically increase multiuser system efficiency and reliability

  7. Ensuring fault tolerance of phase-locked clocks

    Science.gov (United States)

    Krishna, C. M.; Shin, K. G.; Butler, R. W.

    1985-01-01

    Processors within a real-time multiprocessor system must be synchronized with as little overhead as possible. Although synchronization can be achieved via both software (e.g., interactive convergence and interactive consistency algorithms) and hardware (e.g., multistage synchronizers and phase-locked clocks), phase-locked clocks are most attractive due to their small overheads. Despite the fact that synchronization of the multiprocessor system with phase-locked clocks is totally different in nature from the interactive consistency algorithm, it is presently proven that it must satisfy the same condition, N equal to or greater than 3m + 1, where N is the total number of clocks in the multiprocessor system and m is the maximum number of faults tolerable. Also presented are results showing how to design phase-locked clocks so as to be impervious up to a given arbitrary number of malicious failures.
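
    The condition stated in this abstract, N equal to or greater than 3m + 1, can be turned into a quick sizing calculation. The helper names below are hypothetical and the code only restates the inequality from the abstract; it says nothing about how the clocks themselves are designed.

```python
def min_clocks(m_faults):
    """Smallest number of clocks N that satisfies N >= 3m + 1."""
    if m_faults < 0:
        raise ValueError("number of faults must be non-negative")
    return 3 * m_faults + 1


def max_tolerable_faults(n_clocks):
    """Largest m with 3m + 1 <= N, i.e. floor((N - 1) / 3)."""
    if n_clocks < 1:
        raise ValueError("at least one clock is required")
    return (n_clocks - 1) // 3


for n in (3, 4, 6, 7):
    print(n, "clocks ->", max_tolerable_faults(n), "tolerable faults")
```

    For example, three clocks cannot mask even one malicious failure under this bound, while four clocks can mask one and seven can mask two.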

  8. MCNP load balancing and fault tolerance with PVM

    International Nuclear Information System (INIS)

    Version 4A of the Monte Carlo neutron, photon, and electron transport code MCNP developed by Los Alamos National Laboratory supports distributed-memory multiprocessing through the parallel virtual machine (PVM) software package, version 3.1.4. Using PVM for interprocessor communication, MCNP can simultaneously execute a single problem on a cluster of UNIX-based workstations. This capability provided system efficiencies that exceed 80% on dedicated workstation clusters; however, on heterogeneous or multiuser systems, the performance was limited by the slowest processor (i.e., equal work was assigned to each processor). The next public release of MCNP will provide multiprocessing enhancements that include load balancing and fault tolerance, which are shown to dramatically increase multiuser system efficiency and reliability

  9. Fault-tolerant computer architecture based on INMOS transputer processor

    Science.gov (United States)

    Ortiz, Jorge L.

    1987-01-01

    Redundant processing has been used for several years in mission flight systems. In these systems, more than one processor performs the same task at the same time, but only one processor is actually in real use. A fault-tolerant computer architecture based on the features provided by INMOS Transputers is presented. The Transputer architecture provides several communication links that allow data and command communication with other Transputers without the use of a bus. Additionally, the Transputer allows the use of parallel processing to increase the system speed considerably. The processor architecture consists of three processors working in parallel, keeping all the processors at the same operational level, but only one processor is in real control of the process. The design allows each Transputer to test the other two Transputers and report the operating condition of the neighboring processors. A graphic display was developed to facilitate the identification of any problem by the user.

  10. A New and Efficient Algorithm-Based Fault Tolerance Scheme for A Million Way Parallelism

    OpenAIRE

    Yao, Erlin; Chen, Mingyu; Wang, Rui; Zhang, Wenli; Tan, Guangming

    2011-01-01

    Fault tolerance overhead of high performance computing (HPC) applications is becoming critical to the efficient utilization of HPC systems at large scale. HPC applications typically tolerate fail-stop failures by checkpointing. Another promising method is in the algorithm level, called algorithmic recovery. These two methods can achieve high efficiency when the system scale is not very large, but will both lose their effectiveness when systems approach the scale of Exaflops,...

  11. Particle Filter Based Fault-tolerant ROV Navigation using Hydro-acoustic Position and Doppler Velocity Measurements

    DEFF Research Database (Denmark)

    Zhao, Bo; Blanke, Mogens

    2012-01-01

    This paper presents a fault tolerant navigation system for a remotely operated vehicle (ROV). The navigation system uses hydro-acoustic position reference (HPR) and Doppler velocity log (DVL) measurements to achieve an integrated navigation. The fault tolerant functionality is based on a modified particle filter. This particle filter is able to run in an asynchronous manner to accommodate the measurement drop out problem, and it overcomes the measurement outliers by switching observation models. Simulations with experimental data show that this fault tolerant navigation system can accurately estimate the ROV kinematic states, even when sensor failures appear frequently.

  12. Fault tolerant strategies for automated operation of nuclear reactors

    International Nuclear Information System (INIS)

    This paper introduces an automatic control system incorporating a number of verification, validation, and command generation tasks within a fault-tolerant architecture. The integrated system utilizes recent methods of artificial intelligence such as neural networks and fuzzy logic control. Furthermore, advanced signal processing and nonlinear control methods are also included in the design. The primary goal is to create an on-line capability to validate signals, analyze plant performance, and verify the consistency of commands before control decisions are finalized. The application of this approach to the automated startup of the Experimental Breeder Reactor-II (EBR-II) is performed using a validated nonlinear model. The simulation results show that the advanced concepts have the potential to improve plant availability and safety

  13. Buffered coscheduling for parallel programming and enhanced fault tolerance

    Science.gov (United States)

    Petrini, Fabrizio (Los Alamos, NM); Feng, Wu-chun (Los Alamos, NM)

    2006-01-31

    A computer implemented method schedules processor jobs on a network of parallel machine processors or distributed system processors. Control information communications generated by each process performed by each processor during a defined time interval is accumulated in buffers, where adjacent time intervals are separated by strobe intervals for a global exchange of control information. A global exchange of the control information communications at the end of each defined time interval is performed during an intervening strobe interval so that each processor is informed by all of the other processors of the number of incoming jobs to be received by each processor in a subsequent time interval. The buffered coscheduling method of this invention also enhances the fault tolerance of a network of parallel machine processors or distributed system processors

  14. Fault tolerant vector control of induction motor drive

    Science.gov (United States)

    Odnokopylov, G.; Bragin, A.

    2014-10-01

    For electric drives in technical facilities of hazardous industries (nuclear, military, chemical, etc.), increasing resiliency and survivability is an urgent task. The construction principle of a fault-tolerant vector control system for an asynchronous electric drive is presented. The recovery of efficiency of a three-phase induction motor drive in emergency mode using a two-phase vector control system is demonstrated. The formation of a simulation model of the unbalanced asynchronous electric drive in emergency mode is described. The modeling uses a coordinate transformation that provides emergency operation of the unbalanced drive. Results of modeling the transient caused by the loss of a motor stator phase are given. During a power failure phase induction motor cannot save circular rotating field in the air gap of the motor and ensure the restoration of its efficiency at rated torque and speed.

  15. Hybrid routing technique for a fault-tolerant, integrated information network

    Science.gov (United States)

    Meredith, B. D.

    1986-01-01

    The evolutionary growth of the space station and the diverse activities onboard are expected to require a hierarchy of integrated, local area networks capable of supporting data, voice, and video communications. In addition, fault-tolerant network operation is necessary to protect communications between critical systems attached to the net and to relieve the valuable human resources onboard the space station of time-critical data system repair tasks. A key issue for the design of the fault-tolerant, integrated network is the development of a robust routing algorithm which dynamically selects the optimum communication paths through the net. A routing technique is described that adapts to topological changes in the network to support fault-tolerant operation and system evolvability.

  16. Ethernet Implementation of Fault Tolerant Train Network for Entertainment and Mixed Control Traffic

    Directory of Open Access Journals (Sweden)

    Tarek K. Refaat

    2013-01-01

    This paper studies the integration of the control system and entertainment on board train wagons. Both the control and entertainment loads are implemented on top of Gigabit Ethernet, each with a dedicated controller/server. The control load has mixed sampling periods. It is proven that this system can tolerate the failure of one controller in one wagon. In a two wagon scenario, fault tolerance at the controller level is studied, and simulation results show that the system can tolerate the failure of 3 controllers. The system is successful in meeting the packet end-to-end delay with zero packet loss in all OPNET simulated scenarios. The maximum permissible entertainment load is determined for the fault tolerant scenarios.

  17. A New and Efficient Algorithm-Based Fault Tolerance Scheme for A Million Way Parallelism

    CERN Document Server

    Yao, Erlin; Wang, Rui; Zhang, Wenli; Tan, Guangming

    2011-01-01

    Fault tolerance overhead of high performance computing (HPC) applications is becoming critical to the efficient utilization of HPC systems at large scale. HPC applications typically tolerate fail-stop failures by checkpointing. Another promising method works at the algorithm level and is called algorithmic recovery. These two methods can achieve high efficiency when the system scale is not very large, but will both lose their effectiveness when systems approach the scale of Exaflops, where the number of processors included in the system is expected to reach one million. This paper develops a new and efficient algorithm-based fault tolerance scheme for HPC applications. When a failure occurs during the execution, we do not stop to wait for the recovery of corrupted data, but replace them with the corresponding redundant data and continue the execution. A background accelerated recovery method is also proposed to rebuild redundancy so that multiple failures can be tolerated during the execution. To demonstrate the feasibility ...
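
    The idea of replacing lost data with redundant data can be illustrated with a generic checksum encoding. The sketch below is only an assumption-laden illustration: the abstract does not give the paper's encoding, and the names `encode` and `recover` are hypothetical. It keeps one redundant block equal to the element-wise sum of the data blocks and rebuilds a single lost block from the survivors.

```python
# Illustrative checksum-style redundancy (hypothetical names; not the paper's scheme).

def encode(blocks):
    """Return one redundant block: the element-wise sum of all data blocks."""
    return [sum(column) for column in zip(*blocks)]


def recover(blocks, checksum, lost):
    """Rebuild the block at index `lost` from the checksum and the surviving blocks."""
    survivors = [block for i, block in enumerate(blocks) if i != lost]
    return [c - sum(column) for c, column in zip(checksum, zip(*survivors))]


blocks = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]    # data held by three processes
checksum = encode(blocks)                      # held by an extra, redundant process
rebuilt = recover(blocks, checksum, lost=1)    # process 1 fails; its block is rebuilt
```

    A single sum block tolerates one lost block at a time; tolerating repeated failures would require rebuilding the redundancy afterwards, which is the role the abstract assigns to its background recovery method.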

  18. A benchmark for fault tolerant flight control evaluation:

    OpenAIRE

    Smaili, H.; Breeman, J.; Lombaerts, T.; Stroosma, O.

    2013-01-01

    A large transport aircraft simulation benchmark (REconfigurable COntrol for Vehicle Emergency Return - RECOVER) has been developed within the GARTEUR (Group for Aeronautical Research and Technology in Europe) Flight Mechanics Action Group 16 (FM-AG(16)) on Fault Tolerant Control (2004-2008) for the integrated evaluation of fault detection and identification (FDI) and reconfigurable flight control strategies. The benchmark includes a suitable set of assessment criteria and failure cases, bas...

  19. Improvement of Matrix Converter Drive Reliability by Online Fault Detection and a Fault-Tolerant Switching Strategy.

    DEFF Research Database (Denmark)

    Nguyen-Duy, Khiem; Liu, Tian-Hua

    2011-01-01

    The matrix converter system is becoming a very promising candidate to replace the conventional two-stage ac/dc/ac converter, but system reliability remains an open issue. The most common reliability problem is that a bidirectional switch has an open-switch fault during operation. In this paper, a matrix converter driving a speed-controlled permanent-magnet synchronous motor is examined under a single open-switch fault. First, a new fault-detection method is proposed using only the motor currents. Second, a novel fault-tolerant switching strategy is presented. By treating the matrix converter as a two-stage rectifier/inverter, existing modulation techniques for the inverter stage can be reused, whereas the rectifier stage is modified by control to counteract the fault. Moreover, the proposed techniques require no additional hardware devices or circuit modifications to the matrix converter. Experimental results show that the proposed method can maintain the motor speed with a maximum ripple of 2%, a fivefold improvement over the uncompensated system. The proposed method therefore offers a very economical and effective solution for the matrix converter fault tolerance problem.

  20. A Fault Tolerant Resource Allocation Architecture for Mobile Grid

    Directory of Open Access Journals (Sweden)

    P. T. Vanathi

    2012-01-01

    Problem statement: In order to achieve a high level of reliability and availability, the grid infrastructure should be fault tolerant. Since the failure of resources affects job execution fatally, a fault tolerance service is essential to satisfy QoS requirements in grid computing with respect to mobile nodes. Approach: We propose a fault tolerant technique for improving reliability in a mobile grid environment considering the node mobility. The cluster head and monitoring agent were designed in such a way that they address both resource and network failures and present recovery techniques for overcoming the faults. Results: The proposed model achieves an identifiable performance when compared to the previous model (HRAA). By simulation results, we analyze the node and link failures on parameters such as delivery ratio, throughput and delay against the rate of success. Conclusion: The proposed fault tolerant approach checks for availability of the nodes with least work load for transferring the executed job to the cluster head, providing an alternate path in case of failure and thereby enhancing the reliability of the grid environment.

  1. An upper bound on quantum fault tolerant thresholds

    CERN Document Server

    Fern, Jesse

    2008-01-01

    In this paper we calculate upper bounds on fault tolerance without restrictions on the overhead involved. Optimally adaptive recovery operators are used, and the Shannon entropy is used to estimate the thresholds. By allowing for unrealistically high levels of overhead, we find a quantum fault tolerant threshold of 6.88% for the depolarizing noise used by Knill, which compares to "above 3%" evidenced by Knill. We conjecture that the optimal threshold is 6.90%. We also perform threshold calculations for types of noise other than that discussed by Knill.

  2. A Novel Nanometric Fault Tolerant Reversible Subtractor Circuit

    Directory of Open Access Journals (Sweden)

    Mozhgan Shiri

    2012-11-01

    Reversibility plays an important role when energy efficient computations are considered. Reversible logic circuits have received significant attention in quantum computing, low power CMOS design, optical information processing and nanotechnology in recent years. This study proposes a new fault tolerant reversible half-subtractor and a new fault tolerant reversible full-subtractor circuit with nanometric scales. In this paper we also demonstrate how the well-known and important PERES gate and TR gate can be synthesized from parity preserving reversible gates. All the designs have nanometric scales.

  3. On fault-tolerant structure, distributed fault-diagnosis, reconfiguration, and recovery of the array processors

    Energy Technology Data Exchange (ETDEWEB)

    Hosseini, S.H.

    1989-07-01

    The increasing need for the design of high-performance computers has led to the design of special purpose computers such as array processors. This paper studies the design of fault-tolerant array processors. First, it is shown how hardware redundancy can be employed in the existing structures in order to make them capable of withstanding the failure of some of the array links and processors. Then distributed fault-tolerance schemes are introduced for the diagnosis of the faulty elements, reconfiguration, and recovery of the array. Fault tolerance is maintained by the cooperation of processors in a decentralized form of control without the participation of any type of hardcore or fault-free central controller such as a host computer.

  4. Fault tolerance in space-based digital signal processing and switching systems: Protecting up-link processing resources, demultiplexer, demodulator, and decoder

    Science.gov (United States)

    Redinbo, Robert

    1994-01-01

    Fault tolerance features in the first three major subsystems appearing in the next generation of communications satellites are described. These satellites will contain extensive but efficient high-speed processing and switching capabilities to support the low signal strengths associated with very small aperture terminals. The terminals' numerous data channels are combined through frequency division multiplexing (FDM) on the up-links and are protected individually by forward error-correcting (FEC) binary convolutional codes. The front-end processing resources, demultiplexer, demodulators, and FEC decoders extract all data channels which are then switched individually, multiplexed, and remodulated before retransmission to earth terminals through narrow beam spot antennas. Algorithm based fault tolerance (ABFT) techniques, which relate real number parity values with data flows and operations, are used to protect the data processing operations. The additional checking features utilize resources that can be substituted for normal processing elements when resource reconfiguration is required to replace a failed unit.

  5. Fault tolerant coverage and connectivity in presence of channel randomness.

    Science.gov (United States)

    Sagar, Anil Kumar; Lobiyal, D K

    2014-01-01

    Some applications of wireless sensor networks require K-coverage and K-connectivity to ensure that the system is fault tolerant and to make it more reliable. This makes coverage and connectivity an important issue in wireless sensor networks. In this paper, we propose K-coverage and K-connectivity models for wireless sensor networks. In both models, nodes are distributed according to a Poisson distribution in the sensor field. To make the proposed model more realistic, we used the log-normal shadowing path loss model to capture the radio irregularities and studied its impact on K-coverage and K-connectivity. The value of K can be different for different types of applications. Further, we also analyzed the problem of node failure for the K-coverage model. In the simulation section, results clearly show that coverage and connectivity of a wireless sensor network depend on the node density and on shadowing parameters such as the path loss exponent and standard deviation. PMID:24574922
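
    A rough numerical illustration of K-coverage under Poisson deployment and log-normal shadowing follows. It is a sketch under stated assumptions, not the paper's model: each sensor's effective range is assumed to be r0 * 10^(X / (10 n)) with X normally distributed (zero mean, standard deviation sigma in dB) and independent across sensors, so the number of sensors covering a point is Poisson with mean density * pi * E[R^2]. The function and parameter names are hypothetical.

```python
import math


def mean_covering_sensors(density, r0, path_loss_exp, sigma_db):
    """Mean number of sensors covering a point (assumed model, see lead-in).

    E[R^2] = r0^2 * exp(2 * (sigma * ln(10) / (10 * n))^2) for a log-normal range R.
    """
    b = math.log(10) / (10.0 * path_loss_exp)
    return density * math.pi * r0 ** 2 * math.exp(2.0 * (b * sigma_db) ** 2)


def prob_k_covered(k, density, r0, path_loss_exp, sigma_db):
    """Probability that a point is covered by at least k sensors (Poisson tail)."""
    mu = mean_covering_sensors(density, r0, path_loss_exp, sigma_db)
    below_k = sum(math.exp(-mu) * mu ** i / math.factorial(i) for i in range(k))
    return 1.0 - below_k


# Example: 0.01 sensors per square metre, 10 m nominal range, path loss exponent 3.
p_disc = prob_k_covered(2, 0.01, 10.0, 3.0, 0.0)      # no shadowing (disc model)
p_shadow = prob_k_covered(2, 0.01, 10.0, 3.0, 4.0)    # 4 dB shadowing
```

    Under these assumptions the coverage probability is a function of node density, path loss exponent and shadowing standard deviation, the same parameters the abstract identifies as decisive.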

  6. Energy Bounds for Fault-Tolerant Nanoscale Designs

    CERN Document Server

    Marculescu, Diana

    2011-01-01

    The problem of determining lower bounds for the energy cost of a given nanoscale design is addressed via a complexity theory-based approach. This paper provides a theoretical framework that is able to assess the trade-offs existing in nanoscale designs between the amount of redundancy needed for a given level of resilience to errors and the associated energy cost. Circuit size, logic depth and error resilience are analyzed and brought together in a theoretical framework that can be seamlessly integrated with automated synthesis tools and can guide the design process of nanoscale systems comprised of failure prone devices. The impact of redundancy addition on the switching energy and its relationship with leakage energy is modeled in detail. Results show that 99% error resilience is possible for fault-tolerant designs, but at the expense of at least 40% more energy if individual gates fail independently with probability of 1%.

  7. Fault-tolerant Sensor Fusion for Marine Navigation

    DEFF Research Database (Denmark)

    Blanke, Mogens

    2006-01-01

    Reliability of navigation data is critical for steering and manoeuvring control, particularly at high speed or in critical phases of a mission. Should faults occur, faulty instruments need to be autonomously isolated and faulty information discarded. This paper designs a navigation solution where essential navigation information is provided even with multiple faults in instrumentation. The paper proposes a provably correct implementation through auto-generated state-event logics in a supervisory part of the algorithms. Test results from naval vessels document the performance and show events where the fault-tolerant sensor fusion provided uninterrupted navigation data despite temporary instrument defects

  8. Final Project Report: Scalable fault tolerance runtime technology for petascale computers

    Energy Technology Data Exchange (ETDEWEB)

    Krishnamoorthy, Sriram; Sadayappan, P

    2015-06-16

    With the massive number of components comprising the forthcoming petascale computer systems, hardware failures will be routinely encountered during execution of large-scale applications. Due to the multidisciplinary, multiresolution, and multiscale nature of scientific problems that drive the demand for high end systems, applications place increasingly differing demands on the system resources: disk, network, memory, and CPU. In addition to MPI, future applications are expected to use advanced programming models such as those developed under the DARPA HPCS program as well as existing global address space programming models such as Global Arrays, UPC, and Co-Array Fortran. While there has been a considerable amount of work on fault tolerant MPI, with a number of strategies and extensions for fault tolerance proposed, virtually none of the advanced models proposed for emerging petascale systems is currently fault aware. To achieve fault tolerance, development of underlying runtime and OS technologies able to scale to the petascale level is needed. This project has evaluated a range of runtime techniques for fault tolerance for advanced programming models.

  9. Design and analysis of linear fault-tolerant permanent-magnet vernier machines.

    Science.gov (United States)

    Xu, Liang; Ji, Jinghua; Liu, Guohai; Du, Yi; Liu, Hu

    2014-01-01

    This paper proposes a new linear fault-tolerant permanent-magnet (PM) vernier (LFTPMV) machine, which can offer high thrust by using the magnetic gear effect. Both the PMs and the windings of the proposed machine are on the short mover, while the long stator is manufactured only from iron. Hence, the proposed machine is very suitable for long stroke system applications. The key of this machine is that the magnetizer splits the two movers with modular and complementary structures. Hence, the proposed machine offers an improved symmetrical and sinusoidal back electromotive force waveform and reduced detent force. Furthermore, owing to the complementary structure, the proposed machine possesses favorable fault-tolerant capability, namely, independent phases. In particular, differing from the existing fault-tolerant machines, the proposed machine offers fault tolerance without sacrificing thrust density. This is because neither fault-tolerant teeth nor flux-barriers are adopted. The electromagnetic characteristics of the proposed machine are analyzed using the time-stepping finite-element method, which verifies the effectiveness of the theoretical analysis. PMID:24982959

  10. Design of Fault Tolerant Network Interfaces for NoCs

    DEFF Research Database (Denmark)

    Fiorin, Leandro; Micconi, Laura

    2011-01-01

    Networks-on-Chip (NoCs) appeared as a strategy to deal with the communication requirements of complex IP-based System-on-Chips. As the complexity of designs increases and the technology scales down into the deep-submicron domain, the probability of malfunctions and failures in the NoC components increases. This paper focuses on the study and evaluation of techniques for increasing reliability and resilience of Network Interfaces (NIs). NIs act as interfaces between IP cores and the communication infrastructure; a faulty behavior in them could therefore affect the overall system. In this work, we propose a functional fault model for the NI components, and we present a two-level fault tolerant solution that can be employed for mitigating the effects of both single-event upset soft errors and hard errors on the NI. Experiments show that with a limited overhead we can obtain a significant reliability of the NI, while saving up to 83% in area with respect to a standard Triple Modular Redundancy implementation, as well as a significant energy reduction.
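
    For reference, the Triple Modular Redundancy baseline mentioned above can be sketched as a bitwise two-of-three majority voter. This is a generic illustration of TMR only, not the paper's two-level solution; the function name and the sample words are hypothetical.

```python
def majority(a, b, c):
    """Bitwise 2-of-3 majority vote over three redundant copies of a word."""
    return (a & b) | (a & c) | (b & c)


golden = 0b10110010                 # value every replica should hold
upset = golden ^ 0b00001000         # one replica suffers a single-bit upset
voted = majority(golden, upset, golden)
```

    The voter masks any error confined to one replica, at the cost of tripling the protected logic, which is the area overhead the abstract's comparison refers to.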

  11. Fault-Tolerant Operation of an Open-End Winding Five-Phase PMSM Drive with Inverter Faults

    OpenAIRE

    Meinguet, Fabien; Nguyen, Ngac-ky; Sandulescu, Paul; Kestelyn, Xavier; Semail, Eric

    2013-01-01

    Multi-phase machines are known for their fault-tolerant capability. However, star-connected machines have no fault tolerance to inverter switch short-circuit fault. This paper investigates the fault-tolerant operation of an open-end five-phase drive, i.e. a multi-phase machine fed with a dual-inverter supply. Open-circuit faults and inverter switch short-circuit faults are considered and handled with various degrees of reconfiguration. Theoretical developments and experimental results validat...

  12. Analysis of GPS Abnormal Conditions within Fault Tolerant Control Laws

    Science.gov (United States)

    Al-Sinbol, Gahssan

    The Global Positioning System (GPS) is a critical element for the functionality of autonomous flying vehicles. The GPS operation at normal and abnormal conditions directly impacts the trajectory tracking performance of the controllers of autonomous Unmanned Aerial Vehicles (UAVs). The effects of GPS parameter variation must be well understood and user-friendly computational tools must be developed to facilitate the design and evaluation of fault tolerant control laws. This thesis presents the development of a simplified GPS error model in Matlab/Simulink and its use in performing a sensitivity analysis of the effect of GPS parameters, under normal and abnormal system operation, on different UAV trajectory tracking controllers. The model statistically generates position and velocity errors, simulates the effect of GPS satellite configuration on the position and velocity measurement accuracy, and implements a set of failures to the GPS readings. The model and its graphical user interface were integrated within the WVU UAV simulation environment as a masked Simulink block. The effects on the controllers' trajectory tracking performance of the following GPS parameters were investigated within normal operation ranges and outside: time delay, update rate, error standard deviation, bias, and major position and velocity failures. Several sets of control laws with fixed and adaptive parameters and of different levels of complexity have been used in this investigation. A complex performance index formulated in terms of tracking errors and control activity was used for control laws performance evaluation. The composition of various metrics within the performance index was performed using fixed and variable weights depending on the local characteristics of the commanded trajectory. This study has revealed that GPS error parameters have a significant impact on control laws performance. The proposed GPS model has proved to be a valuable, flexible tool for testing and evaluation of the fault tolerant capabilities of autonomous flight control laws.

  13. Fault tolerance and reliability in integrated ship control : the ATOMOS concept

    DEFF Research Database (Denmark)

    Nielsen, Jens Frederik Dalsgaard; Izadi-Zamanabadi, Roozbeh

    2002-01-01

    Various strategies for achieving fault tolerance in large scale control systems are discussed. The positive and negative impacts of distribution through network communication are presented. The ATOMOS framework for standardized reliable marine automation is presented along with the corresponding reliability issues. A generic framework for simulation of network traffic under fault conditions is suggested and the first practical experiences from a prototype implementation are reported.

  14. A Framework-Based Approach for Fault-Tolerant Service Robots

    OpenAIRE

    Heejune Ahn; Woong-Kee Loh; Woon-Young Yeo

    2012-01-01

    Recently the component-based approach has become a major trend in intelligent service robot development due to its reusability and productivity. The framework in a component-based system should provide essential services for application components. However, to our knowledge the existing robot frameworks do not yet support fault tolerance service. Moreover, it is often believed that faults can be handled only at the application level. In this paper, by extending the robot framework with th...

  15. Robust fault tolerant control framework using uncertain Takagi-Sugeno fuzzy models

    OpenAIRE

    Rotondo, Damiano; Nejjari Akhi-elarab, Fatiha; Puig Cayuela, Vicenç

    2014-01-01

    This chapter is concerned with the introduction of a fault tolerant control (FTC) framework using uncertain Takagi-Sugeno (TS) fuzzy models. Depending on how much information is available about the fault, the framework gives rise to passive FTC, active FTC without controller reconfiguration and active FTC with controller reconfiguration. The design is performed using a Linear Matrix Inequality (LMI)-based synthesis that directly takes into account the TS description of the system and...

  16. Fault Tolerant, Radiation Hard DSP Project

    National Aeronautics and Space Administration — We propose to develop a radiation tolerant/hardened signal processing node, which effectively utilizes state-of-the-art commercial semiconductors plus our...

  17. Beam Dynamics Studies for the Fault Tolerance Assessment of the PDS-XADS Linac Design

    International Nuclear Information System (INIS)

    In order to meet the high availability/reliability required by the PDS-XADS design, the accelerator needs to implement to the maximum possible extent a fault tolerance strategy that would allow beam operation in the presence of most of the envisaged faults that could occur in its beam line components. In this work, we report the results of beam dynamics simulations performed to characterize the effects of the faults of the main linac components (cavities and focusing magnets) on the beam parameters. The outcome of this activity is the definition of the possible corrective actions that could be conceived (and implemented in the system) in order to guarantee the fault tolerance characteristics of the accelerator. This work has been supported by the PDS-XADS program, funded by the EU 5th Framework Program under contract FIKW-CT-2001-00179

  18. Algorithms for testing fault-tolerance of sequenced jobs.

    Czech Academy of Sciences Publication Activity Database

    Chrobak, M.; Hurand, M.; Sgall, Jiří

    2009-01-01

    Vol. 12, No. 5 (2009), pp. 501-515. ISSN 1094-6136 R&D Projects: GA MŠk(CZ) 1M0545; GA AV ČR IAA100190902; GA AV ČR IAA1019401 Keywords: sequencing algorithms * fault-tolerance * dynamic programming Subject RIV: IN - Informatics, Computer Science Impact factor: 1.265, year: 2009

  19. Critique of Fault-Tolerant Quantum Information Processing

    OpenAIRE

    Alicki, Robert

    2013-01-01

    This is a chapter in the book Quantum Error Correction, edited by D. A. Lidar and T. A. Brun and published by Cambridge University Press (2013) (http://www.cambridge.org/us/academic/subjects/physics/quantum-physics-quantum-information-and-quantum-computation/quantum-error-correction), presenting the author's view on the feasibility of fault-tolerant quantum information processing.

  20. Fault tolerant computer for nuclear power plant applications

    International Nuclear Information System (INIS)

    A quadruply redundant synchronous fault tolerant processor (FTP) is now under fabrication at the C.S. Draper Laboratory to be used initially as a trip monitor for the Experimental Breeder Reactor EBR-II operated by the Argonne National Laboratory in Idaho Falls, Idaho. The hardware architecture of this processor is described and certain issues unique to quadruply redundant computers are discussed

  1. Fault-tolerant quantum computing with color codes

    CERN Document Server

    Landahl, Andrew J; Rice, Patrick R

    2011-01-01

    We present and analyze protocols for fault-tolerant quantum computing using color codes. We present circuit-level schemes for extracting the error syndrome of these codes fault-tolerantly. We further present an integer-program-based decoding algorithm for identifying the most likely error given the syndrome. We simulated our syndrome extraction and decoding algorithms against three physically-motivated noise models using Monte Carlo methods, and used the simulations to estimate the corresponding accuracy thresholds for fault-tolerant quantum error correction. We also used a self-avoiding walk analysis to lower-bound the accuracy threshold for two of these noise models. We present and analyze two architectures for fault-tolerantly computing with these codes: one in which 2D arrays of qubits are stacked atop each other and one in a single 2D substrate. Our analysis demonstrates that color codes perform slightly better than Kitaev's surface codes when circuit details are ignored. When these details are considered, w...

  2. Refinement for fault-tolerance: An aircraft hand-off protocol

    Science.gov (United States)

    Marzullo, Keith; Schneider, Fred B.; Dehn, Jon

    1994-01-01

    Part of the Advanced Automation System (AAS) for air-traffic control is a protocol to permit flight hand-off from one air-traffic controller to another. The protocol must be fault-tolerant and, therefore, is subtle -- an ideal candidate for the application of formal methods. This paper describes a formal method for deriving fault-tolerant protocols that is based on refinement and proof outlines. The AAS hand-off protocol was actually derived using this method; that derivation is given.

  3. Fault-tolerant computer study. [logic designs for building block circuits

    Science.gov (United States)

    Rennels, D. A.; Avizienis, A. A.; Ercegovac, M. D.

    1981-01-01

    A set of building block circuits is described which can be used with commercially available microprocessors and memories to implement fault tolerant distributed computer systems. Each building block circuit is intended for VLSI implementation as a single chip. Several building blocks and associated processor and memory chips form a self checking computer module with self contained input output and interfaces to redundant communications buses. Fault tolerance is achieved by connecting self checking computer modules into a redundant network in which backup buses and computer modules are provided to circumvent failures. The requirements and design methodology which led to the definition of the building block circuits are discussed.

  4. Multi-version software reliability through fault-avoidance and fault-tolerance

    Science.gov (United States)

    Vouk, Mladen A.; Mcallister, David F.

    1989-01-01

    A number of experimental and theoretical issues associated with the practical use of multi-version software to provide run-time tolerance to software faults were investigated. A specialized tool was developed and evaluated for measuring testing coverage for a variety of metrics. The tool was used to collect information on the relationships between software faults and coverage provided by the testing process as measured by different metrics (including data flow metrics). Considerable correlation was found between coverage provided by some higher metrics and the elimination of faults in the code. Work on back-to-back testing was continued as an efficient mechanism for removal of un-correlated faults and common-cause faults of variable span. Work was also continued on software reliability estimation methods based on non-random sampling, and on the relationship between software reliability and code coverage provided through testing. New fault tolerance models were formulated. Simulation studies of the Acceptance Voting and Multi-stage Voting algorithms were finished and it was found that these two schemes for software fault tolerance are superior in many respects to some commonly used schemes. Particularly encouraging are the safety properties of the Acceptance testing scheme.

  5. Reversible Logic Synthesis of Fault Tolerant Carry Skip BCD Adder

    CERN Document Server

    Islam, Md Saiful (DOI: 10.3329/jbas.v32i2.2431)

    2010-01-01

    Reversible logic is emerging as an important research area with applications in diverse fields such as low power CMOS design, digital signal processing, cryptography, quantum computing and optical information processing. This paper presents a new 4*4 parity preserving reversible logic gate, IG. The proposed parity preserving reversible gate can be used to synthesize any arbitrary Boolean function. It makes any fault that affects no more than a single signal readily detectable at the circuit's primary outputs. It is shown that a fault tolerant reversible full adder circuit can be realized using only two IGs. The proposed fault tolerant full adder (FTFA) is used as the fundamental building block for designing other arithmetic logic circuits. It has also been demonstrated that the proposed design has lower hardware complexity and is more efficient in terms of gate count, garbage outputs and constant inputs than its existing counterparts.
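
    The two properties claimed for the gate can be checked by brute force over a truth table. The abstract does not give the IG gate's truth table, so the sketch below uses the well-known 3*3 Fredkin gate (parity preserving) and Toffoli gate (not parity preserving) as stand-ins; the helper names are hypothetical.

```python
from itertools import product
from typing import Callable, Tuple

Bits = Tuple[int, ...]
Gate = Callable[[Bits], Bits]


def fredkin(bits: Bits) -> Bits:
    """Controlled swap: when a is 1, b and c are exchanged."""
    a, b, c = bits
    return (a, c, b) if a else (a, b, c)


def toffoli(bits: Bits) -> Bits:
    """Controlled-controlled NOT: c is flipped when a and b are both 1."""
    a, b, c = bits
    return (a, b, c ^ (a & b))


def is_reversible(gate: Gate, width: int) -> bool:
    """A gate is reversible when distinct inputs give distinct outputs."""
    inputs = list(product((0, 1), repeat=width))
    return len({gate(x) for x in inputs}) == len(inputs)


def is_parity_preserving(gate: Gate, width: int) -> bool:
    """Input parity (XOR of all bits) must equal output parity for every input."""
    return all(
        sum(x) % 2 == sum(gate(x)) % 2
        for x in product((0, 1), repeat=width)
    )


print(is_reversible(fredkin, 3), is_parity_preserving(fredkin, 3))  # True True
print(is_reversible(toffoli, 3), is_parity_preserving(toffoli, 3))  # True False
```

    In a circuit built only from parity-preserving gates, a fault that flips a single signal changes the output parity, which is why such a fault shows up at the primary outputs.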

  6. Fault Detection Coverage Quantification of Automatic Test Functions of Digital I and C System in NPPs

    International Nuclear Information System (INIS)

    Recently, analog instrument and control (I and C) systems in nuclear power plants (NPPs) have been replaced with digital systems for safer and more efficient operations. Digital I and C systems have adopted various fault-tolerant techniques that help the system correctly and safely perform the specific required functions in spite of the presence of faults. Each fault-tolerant technique has a different inspection period from real-time monitoring to monthly testing. The range covered by each fault-tolerant technique is also different. The digital I and C system, therefore, adopts multiple barriers consisting of various fault-tolerant techniques to increase total fault detection coverage. Even though these fault-tolerant techniques are adopted to ensure and improve the safety of a system, their effects have not been properly considered yet in most PSA models. Therefore, it is necessary to develop an evaluation method that can describe these features of a digital I and C system. Several issues must be considered in the fault coverage estimation of a digital I and C system, and two of them were handled in this work. The first is to quantify the fault coverage of each fault-tolerant technique implemented in the system, and the second is to exclude the duplicated effect of fault-tolerant techniques implemented simultaneously at each level of the system's hierarchy, as a fault occurring in a system might be detected by one or more fault-tolerant techniques. For this work, a fault injection experiment was used to obtain the exact relations between faults and multiple barriers of fault-tolerant techniques. This experiment was applied to a bistable processor (BP) of a reactor protection system.

  7. CPN based fault-tolerance performance evaluation of fieldbus for KNGR NPCS network

    International Nuclear Information System (INIS)

    In contrast with conventional Fieldbus research, which focuses on real-time performance while ignoring fault-tolerant mechanisms, the aim of this work is real-time performance evaluation of the system including faults. Because the communication network will be applied to the Next Generation NPP, maintaining performance in the presence of recoverable faults is important. To guarantee this in the NPP Control Network, we should investigate the time characteristics of the target system in case of a recoverable fault. If the time characteristics meet the requirements of the system, the faults will be recovered by Fieldbus recovery mechanisms and the system will be safe. But if the time characteristics cannot meet the requirements, the faults in the Fieldbus can propagate to system failure. For this purpose, we classified the recoverable faults, derived the formula which represents delays including recovery mechanisms, and built a simulation model. We applied the simulation model to the KNGR NPCS with some assumptions. The outcome of the simulation is realistic delays for the fault cases which have been classified. From the outcome of the simulation and the system requirements, we can calculate the failure propagation probability from the Fieldbus to the outer system.

  8. Reconfigurable fault-tolerant multielectrode array for dependable monitoring of the human brain.

    Science.gov (United States)

    Acharya, Ipsita; Joshi, Bharat; Lanning, Bruce; Zaveri, Hitten

    2011-01-01

    We introduce a fault-tolerant strategy to improve the dependability of a multi-electrode array (MEA), an issue of considerable concern. We propose an interstitial redundancy approach with local reconfiguration. Here spare modules are placed at interstitial sites and can replace neighboring primary modules when they develop faults. We evaluate the performance of such a system under different faults to characterize MEA dependability as a function of redundancy. The results demonstrate that a considerable improvement in MEA dependability can be achieved with a well designed increase in redundancy. PMID:22254393

  9. Passive fault tolerant control of a double inverted pendulum - a case study

    DEFF Research Database (Denmark)

    Niemann, Hans Henrik; Stoustrup, Jakob

    2005-01-01

    A passive fault tolerant control scheme is suggested, in which a nominal controller is augmented with an additional block, which guarantees stability and performance after the occurrence of a fault. The method is based on the YJBK parameterization, which requires the nominal controller to be implemented in observer based form. The proposed method is applied to a double inverted pendulum system, for which an H_inf controller has been designed and verified in a lab setup. In this case study, the fault is a degradation of the tacho loop.

  10. Passive Fault tolerant Control of an Inverted Double Pendulum : A Case Study Example

    DEFF Research Database (Denmark)

    Niemann, H.; Stoustrup, Jakob

    2003-01-01

    A passive fault tolerant control scheme is suggested, in which a nominal controller is augmented with an additional block, which guarantees stability and performance after the occurrence of a fault. The method is based on the Youla parameterization, which requires the nominal controller to be implemented in the observer based form. The proposed method is applied to a double inverted pendulum system, for which an H_inf controller has been designed and verified in a lab setup. In this case study, the fault is a degradation of the tacho loop.

  11. Fault detection coverage quantification of automatic test functions of digital I and C system in NPPs

    International Nuclear Information System (INIS)

    Analog instrument and control systems in nuclear power plants have recently been replaced with digital systems for safer and more efficient operation. Digital instrument and control systems have adopted various fault-tolerant techniques that help the system correctly and safely perform the specific required functions regardless of the presence of faults. Each fault-tolerant technique has a different inspection period, from real-time monitoring to monthly testing. The range covered by each fault-tolerant technique is also different. The digital instrument and control system, therefore, adopts multiple barriers consisting of various fault-tolerant techniques to increase the total fault detection coverage. Even though these fault-tolerant techniques are adopted to ensure and improve the safety of a system, their effects on the system safety have not yet been properly considered in most probabilistic safety analysis models. Therefore, it is necessary to develop an evaluation method that can describe these features of digital instrument and control systems. Several issues must be considered in the fault coverage estimation of a digital instrument and control system, and two of these are addressed in this work. The first is to quantify the fault coverage of each fault-tolerant technique implemented in the system, and the second is to exclude the duplicated effect of fault-tolerant techniques implemented simultaneously at each level of the system's hierarchy, as a fault occurring in a system might be detected by one or more fault-tolerant techniques. For this work, a fault injection experiment was used to obtain the exact relations between faults and multiple barriers of fault-tolerant techniques. This experiment was applied to a bistable processor of a reactor protection system.
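
    The two issues named in the abstract (per-technique coverage, and excluding the duplicated effect of overlapping techniques) can be illustrated with set arithmetic on fault-injection results. The sketch below is an assumption about how such data might be tallied, not the authors' method; the technique names and fault identifiers are invented for the example.

```python
from typing import Dict, Set


def technique_coverage(injected: Set[str], detected: Set[str]) -> float:
    """Fraction of injected faults detected by one technique."""
    if not injected:
        raise ValueError("at least one injected fault is required")
    return len(detected & injected) / len(injected)


def total_coverage(injected: Set[str], detected_by: Dict[str, Set[str]]) -> float:
    """Fraction of injected faults detected by at least one technique.

    Taking the union counts a fault once even if several techniques catch it,
    which is what excludes the duplicated effect.
    """
    if not injected:
        raise ValueError("at least one injected fault is required")
    detected = set().union(*detected_by.values()) & injected
    return len(detected) / len(injected)


# Hypothetical fault-injection results for five injected faults.
injected = {"f1", "f2", "f3", "f4", "f5"}
detected_by = {
    "watchdog": {"f1", "f2"},
    "checksum": {"f2", "f3"},
    "periodic_test": {"f3"},
}

for name, faults in detected_by.items():
    print(name, technique_coverage(injected, faults))
print("total", total_coverage(injected, detected_by))  # total 0.6
```

    Adding the three per-technique coverages would count f2 and f3 twice; the union counts each detected fault once and gives 3 of 5.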

  12. Research on Fault Analysis and Fault-Tolerant Control of EV/HEV Powertrain

    OpenAIRE

    Tabbache, Bekheira; Kheloui, Abdelaziz; Benbouzid, Mohamed; Mamoune, Abdeslam; Diallo, Demba

    2014-01-01

    This paper presents research works in the topics of fault analysis and fault tolerant control of an electric vehicle powered by an inverter-fed induction motor drive and the usual sensors. The considered failures are mainly measurement error due to faulty sensors and power inverter malfunctions. When sensor failure occurs, both software and hardware redundancies have been investigated. Software redundancy has been evaluated in case of speed sensor failure. Hardware redundancy has been used in...

  13. High Performance Modeling of Intelligent Pattern Recognition with Enhanced Fault-Tolerance in Real Time

    Directory of Open Access Journals (Sweden)

    Renukaradhya P.C

    2014-03-01

    This work designs an ANN that can recognize learned patterns even when the applied test patterns vary from them. A mechanism has been developed that provides this recognition facility intelligently. Recognition of patterns can be broadly categorized into two classes. When the precision of recognition is not defined, the process is termed “Forced recognition”. When the precision of recognition is properly defined, the process is termed “Custom recognition”. The fault-tolerance property of a feed-forward architecture trained with the back-propagation method is analyzed. As part of this, the effect of the initially selected random weights is analyzed, along with what the nature of those weights should be in order to maximize the fault-tolerance capability of the system. The analysis is done with two different distributions, namely the Gaussian distribution and the uniform distribution. The effect of faults at the output is also a function of the fault position in the ANN system, such as hidden-layer weights, output-layer weights, and processing elements at the hidden layer. The capability of the back-propagation algorithm itself to tolerate faults through the learning process is analyzed. A test mechanism for checking faulty systems is developed, since ANN systems will in future be implemented in hardware, i.e. on VLSI chips; once the architecture is implemented, a mechanism is required to check its functioning. The analysis of the internal parameters of an ANN is a research topic in its own right; the behavior of these parameters reveals the factors responsible for the success of an ANN.

  14. Fault Tolerant Control Using Proportional-Integral-Derivative Controller Tuned by Genetic Algorithm

    Directory of Open Access Journals (Sweden)

    S. Kanthalakshmi

    2011-01-01

    Problem statement: The growing demand for reliability, maintainability and survivability in industrial processes has drawn significant research in the fault detection and fault tolerant control domain. A fault is usually defined as an unexpected change in a system, such as a component malfunction or a variation in operating conditions, which tends to degrade the overall system performance. The purpose of fault detection is to detect these malfunctions so that proper action can be taken to prevent faults from developing into a total system failure. Approach: In this study an effective integrated fault detection and fault tolerant control scheme was developed for a class of LTI systems. The scheme was based on a Kalman filter for simultaneous state and fault parameter estimation, statistical decisions for fault detection and activation of controller reconfiguration. Proportional-Integral-Derivative (PID) control schemes continue to provide the simplest and yet effective solutions to most of the control engineering applications today. Determination or tuning of the PID parameters continues to be important as these parameters have a great influence on the stability and performance of the control system. In this study GA was proposed to tune the PID controller. Results: The results show that the proposed scheme improves the performance of the process in terms of time domain specifications, robustness to parametric changes and optimum stability. Also, a comparison with the conventional Ziegler-Nichols method proves the superiority of the GA based system. Conclusion: This study demonstrates the effectiveness of the genetic algorithm in tuning a PID controller with optimum parameters. It is, moreover, proved to be robust to variations in plant dynamic characteristics and disturbances, assuring a parameter-insensitive operation of the process.
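
    A genetic algorithm can tune PID gains by treating each gain triple as an individual and a closed-loop performance measure as its fitness. The sketch below is illustrative only: the first-order plant dy/dt = -y + u, the integral-of-absolute-error cost, the gain bounds and the GA operators (elitism, uniform crossover, Gaussian mutation) are all assumptions made for the example and are not taken from the study. The Kalman-filter fault detection stage is not covered.

```python
import random
from typing import List, Tuple

Gains = Tuple[float, ...]  # (Kp, Ki, Kd)

# Assumed search bounds; Kd is kept small so the Euler simulation stays stable.
BOUNDS = ((0.0, 10.0), (0.0, 10.0), (0.0, 0.5))


def step_cost(gains: Gains, dt: float = 0.01, steps: int = 1000) -> float:
    """Integral of absolute error for a unit step on the plant dy/dt = -y + u."""
    kp, ki, kd = gains
    y = integral = cost = 0.0
    prev_error = 1.0
    for _ in range(steps):
        error = 1.0 - y
        integral += error * dt
        derivative = (error - prev_error) / dt
        u = kp * error + ki * integral + kd * derivative
        y += (-y + u) * dt  # forward-Euler plant update
        prev_error = error
        cost += abs(error) * dt
    return cost


def random_gains(rng: random.Random) -> Gains:
    return tuple(rng.uniform(lo, hi) for lo, hi in BOUNDS)


def crossover(a: Gains, b: Gains, rng: random.Random) -> Gains:
    """Uniform crossover: each gain is copied from one parent at random."""
    return tuple(rng.choice(pair) for pair in zip(a, b))


def mutate(gains: Gains, rng: random.Random, scale: float = 0.1) -> Gains:
    """Gaussian perturbation, clipped back into the search bounds."""
    return tuple(
        min(hi, max(lo, g + rng.gauss(0.0, scale * (hi - lo))))
        for g, (lo, hi) in zip(gains, BOUNDS)
    )


def tune_pid(generations: int = 20, pop_size: int = 20, seed: int = 0) -> Gains:
    rng = random.Random(seed)
    population: List[Gains] = [random_gains(rng) for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=step_cost)
        elite = population[: pop_size // 4]  # best quarter survives unchanged
        children = [
            mutate(crossover(rng.choice(elite), rng.choice(elite), rng), rng)
            for _ in range(pop_size - len(elite))
        ]
        population = elite + children
    return min(population, key=step_cost)


best = tune_pid()
print("gains:", best, "cost:", step_cost(best))
```

    Because the elite individuals are carried over unchanged, the best cost never gets worse from one generation to the next. A comparison against Ziegler-Nichols gains, as reported in the abstract, would need a plant with oscillatory behaviour and is outside this sketch.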

  15. Direct Fault Tolerant RLV Altitude Control: A Singular Perturbation Approach

    Science.gov (United States)

    Zhu, J. J.; Lawrence, D. A.; Fisher, J.; Shtessel, Y. B.; Hodel, A. S.; Lu, P.; Jackson, Scott (Technical Monitor)

    2002-01-01

    In this paper, we present a direct fault tolerant control (DFTC) technique, where by "direct" we mean that no explicit fault identification is used. The technique will be presented for the attitude controller (autopilot) for a reusable launch vehicle (RLV), although in principle it can be applied to many other applications. Any partial or complete failure of control actuators and effectors will be inferred from saturation of one or more commanded control signals generated by the controller. The saturation causes a reduction in the effective gain, or bandwidth of the feedback loop, which can be modeled as an increase in singular perturbation in the loop. In order to maintain stability, the bandwidth of the nominal (reduced-order) system will be reduced proportionally according to the singular perturbation theory. The presented DFTC technique automatically handles momentary saturations and integrator windup caused by excessive disturbances, guidance command or dispersions under normal vehicle conditions. For multi-input, multi-output (MIMO) systems with redundant control effectors, such as the RLV attitude control system, an algorithm is presented for determining the direction of bandwidth cutback using the method of minimum-time optimal control with constrained control in order to maintain the best performance that is possible with the reduced control authority. Other bandwidth cutback logic, such as one that preserves the commanded direction of the bandwidth or favors a preferred direction when the commanded direction cannot be achieved, is also discussed. In this extended abstract, a simplistic example is provided to demonstrate the idea. In the final paper, test results on the high fidelity 6-DOF X-33 model with severe dispersions will be presented.

  16. Fault Tolerant Control of Wind Turbines : A benchmark model

    DEFF Research Database (Denmark)

    Odgaard, Peter Fogh; Stoustrup, Jakob

    2013-01-01

    This paper presents a test benchmark model for the evaluation of fault detection and accommodation schemes. This benchmark model deals with the wind turbine on a system level, and it includes sensor, actuator, and system faults, namely faults in the pitch system, the drive train, the generator, and the converter system. Since it is a system-level model, converter and pitch system models are simplified because these are controlled by internal controllers working at higher frequencies than the system model. The model represents a three-bladed pitch-controlled variable-speed wind turbine with a nominal power of 4.8 MW. The fault detection and isolation (FDI) problem was addressed by several teams, and five of the solutions are compared in the second part of this paper. This comparison relies on additional test data in which the faults occur in different operating conditions than in the test data used for the FDI design.

  17. Fault Tolerant Wind Farm Control : a Benchmark Model

    DEFF Research Database (Denmark)

    Odgaard, Peter Fogh; Stoustrup, Jakob

    2013-01-01

    This paper presents a test benchmark model for the evaluation of fault detection and accommodation schemes. This benchmark model deals with the wind turbine on a system level, and it includes sensor, actuator, and system faults, namely faults in the pitch system, the drive train, the generator, and the converter system. Since it is a system-level model, converter and pitch system models are simplified because these are controlled by internal controllers working at higher frequencies than the system model. The model represents a three-bladed pitch-controlled variable-speed wind turbine with a nominal power of 4.8 MW. The fault detection and isolation (FDI) problem was addressed by several teams, and five of the solutions are compared in the second part of this paper. This comparison relies on additional test data in which the faults occur in different operating conditions than in the test data used for the FDI design.

  18. Fault-tolerance performance evaluation of fieldbus for NPCS network of KNGR

    International Nuclear Information System (INIS)

    In contrast with conventional fieldbus research, which focuses merely on real-time performance, this study aims to evaluate the real-time performance of the communication system including fault-tolerant mechanisms. Maintaining performance in the presence of recoverable faults is very important because the communication network will be applied to next generation nuclear power plants (NPPs). In order to guarantee the performance of the NPP communication network, the time characteristics of the target system in the presence of recoverable faults should be investigated. If the time characteristics meet the requirements of the system, the faults will be recovered by fieldbus recovery mechanisms and the system will be safe. If the time characteristics cannot meet the requirements, the faults in the fieldbus can propagate to system failure. In this study, for the purpose of investigating the time characteristics of fieldbus, the recoverable faults are classified and then the formulas which represent delays including recovery mechanisms and the simulation model are developed. In order to validate the proposed approach, the simulation model is applied to the Korea Next Generation Reactor (KNGR) NSSS Process Control System (NPCS). The results of the simulation provide reasonable delay characteristics of the fault cases with recovery mechanisms. Using the outcome of the simulation and the system requirements, we can also calculate the failure propagation probability from the fieldbus to the outer system.

  19. Improving the Navigability of a Hexapod Robot using a Fault-Tolerant Adaptive Gait

    OpenAIRE

    Umar Asif

    2012-01-01

    This paper presents a study on the development of a walking gait for fault tolerant locomotion in unstructured environments. The fault tolerant gait for adaptive locomotion fulfills stability conditions in the presence of a fault event (locked joints or sensor failure) that would otherwise prevent a robot from achieving stable locomotion over uneven terrain. To accomplish this, a fault tolerant gait based on force-position control is proposed in this paper for a hexapod robot to enable stable walking with...

  20. Review of fault diagnosis and fault-tolerant control for modular multilevel converter of HVDC

    DEFF Research Database (Denmark)

    Liu, Hui; Loh, Poh Chiang

    2013-01-01

    This review focuses on faults in the Modular Multilevel Converter (MMC) for use in high voltage direct current (HVDC) systems by analyzing the vulnerable spots and failure mechanisms from device to system and illustrating the control and protection methods under failure conditions. First, several typical topologies of MMC-HVDC systems are presented. Then fault types such as capacitor voltage unbalance and unbalance between upper and lower arm voltages are analyzed, and the corresponding fault detection and diagnosis approaches are explained. In addition, more attention is dedicated to control strategies for operation under MMC faults or grid faults. The paper concludes with a discussion of other opportunities for future development.