WorldWideScience

Sample records for fault tolerant systems

  1. Fault tolerant computing systems

    International Nuclear Information System (INIS)

    Fault tolerance involves the provision of strategies for error detection, damage assessment, fault treatment and error recovery. A survey is given of the different sorts of strategies used in highly reliable computing systems, together with an outline of recent research on the problems of providing fault tolerance in parallel and distributed computing systems. (orig.)

  2. Fault Tolerant Control Systems

    DEFF Research Database (Denmark)

    Bøgh, S.A.

    This thesis considered the development of fault tolerant control systems. The focus was on the category of automated processes that do not necessarily comprise a high number of identical sensors and actuators to maintain safe operation, but still have a potential for improving immunity to component...... carried the control system designer through the steps necessary to consider fault handling in an early design phase. It was shown how an existing control loop with interface to the plant wide control system could be extended with three additional modules to obtain fault tolerance: Fault detection and...... isolation, remedial action decision, and reconfiguration. The integration of these modules in software was considered. The general methodology covered the analysis, design, and implementation of fault tolerant control systems on an overall level. Two detailed studies were presented, one on fault detection...

  3. Fault-tolerant system optimization

    Science.gov (United States)

    Rose, J.

    1980-01-01

    The paper describes the decisions to be made in the design of fault-tolerant systems and provides details of a comprehensive model developed to cost-optimize such systems. Economical use of replication is making fault-tolerant systems possible, and more applications for safety-crucial systems such as active flight controls can be expected. In turn, the use of massive redundancy, fault tolerance, and reconfigurable systems is stimulating the development of new analytical tools for establishing the safety and cost effectiveness of the levels of replication. Closed-form analytical solutions for the reliability and maintainability analysis of fault-tolerant systems are complex, and Monte-Carlo simulation appears to be a more desirable method of establishing the reliability and maintainability of such systems.
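
    As an illustration of the Monte-Carlo approach mentioned above, the following minimal sketch estimates the reliability of a 2-out-of-3 replicated system and compares it with the closed-form value, which is easy to write down in this deliberately simple case. It is not the model from the paper; the function names, the module reliability of 0.9, and the assumption of independent module failures are all illustrative.

```python
import random


def closed_form_2oo3(r: float) -> float:
    """Probability that at least 2 of 3 independent modules work,
    given each module works with probability r."""
    return 3 * r**2 - 2 * r**3


def monte_carlo_2oo3(r: float, trials: int, seed: int = 0) -> float:
    """Estimate the same probability by random sampling."""
    rng = random.Random(seed)  # seeded so the run is repeatable
    successes = 0
    for _ in range(trials):
        # Draw the state (working or failed) of each of the three modules.
        working = sum(rng.random() < r for _ in range(3))
        if working >= 2:
            successes += 1
    return successes / trials


if __name__ == "__main__":
    r = 0.9  # assumed per-module reliability
    print("closed form:", closed_form_2oo3(r))
    print("monte carlo:", monte_carlo_2oo3(r, trials=100_000))
```

    The simulation becomes the more attractive option once repair, reconfiguration, or correlated failures are added, since the sampling loop can absorb those rules without a new derivation.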

  4. Soft Computing Approaches To Fault Tolerant Systems

    Directory of Open Access Journals (Sweden)

    Neeraj Prakash Srivastava

    2014-05-01

    This paper presents an introduction to soft computing techniques for fault tolerant systems, together with the terminology and the different ways of achieving fault tolerance. The paper focuses on the problem of fault tolerance using soft computing techniques. The fundamentals of soft computing approaches and their types are discussed, along with an introduction to fault tolerance. The main objective is to show how to implement soft computing approaches for fault detection, isolation and identification. The paper contains details about soft computing applications, with a wireless sensor network as an example of a fault tolerant system.

  5. Fault-Tolerant UAV Flight Control System

    OpenAIRE

    Dybsjord, Kerrin Andre

    2013-01-01

    The main focus of this master’s thesis is fault-tolerant control systems (FTCSs) for unmanned aerial vehicles (UAVs). The goals are to develop an automatic-flight control system (AFCS) with fault detection and isolation (FDI) and a reconfiguration mechanism for accommodation of faults. The literature study reviews methods for fault-tolerant control and also discusses important faults and failures related to UAVs. The FTCS is implemented in MATLAB Simulink with a nonlinear model of the Ces...

  6. Fault Tolerance in Real Time Distributed System

    Directory of Open Access Journals (Sweden)

    Arvind Kumar

    2011-02-01

    In this paper we investigate the different techniques of fault tolerance which are used in many real time distributed systems. The main focus is on the types of fault occurring in the system, fault detection techniques and the recovery techniques used. A fault, whether caused by link failure, resource failure or any other reason, must be tolerated for the system to work smoothly and accurately. These faults can be detected and recovered from by many techniques, applied accordingly. An appropriate fault detector can avoid loss due to a system crash, and a reliable fault tolerance technique can save the system from failure. This paper describes how these methods are applied to detect and tolerate faults in various real time distributed systems.

  7. Reconfigurable fault tolerant avionics system

    Science.gov (United States)

    Ibrahim, M. M.; Asami, K.; Cho, Mengu

    This paper presents the design of a reconfigurable avionics system based on modern Static Random Access Memory (SRAM)-based Field Programmable Gate Array (FPGA) to be used in future generations of nano satellites. A major concern in satellite systems and especially nano satellites is to build robust systems with low-power consumption profiles. The system is designed to be flexible by providing the capability of reconfiguring itself based on its orbital position. As Single Event Upsets (SEU) do not have the same severity and intensity in all orbital locations, having the maximum at the South Atlantic Anomaly (SAA) and the polar cusps, the system does not have to be fully protected all the time in its orbit. An acceptable level of protection against high-energy cosmic rays and charged particles roaming in space is provided within the majority of the orbit through software fault tolerance. Checkpointing and rollback, together with control flow assertions, are used for that level of protection. In the minority part of the orbit where severe SEUs are expected to exist, a reconfiguration for the system FPGA is initiated where the processor systems are triplicated and protection through Triple Modular Redundancy (TMR) with feedback is provided. This technique of reconfiguring the system as per the level of the threat expected from SEU-induced faults helps in reducing the average dynamic power consumption of the system to one-third of its maximum. This technique can be viewed as a smart protection through system reconfiguration. The system is built on the commercial version of the (XC5VLX50) Xilinx Virtex5 FPGA on bulk silicon with 324 IO. Simulations of orbit SEU rates were carried out using the SPENVIS web-based software package.
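
    The core of Triple Modular Redundancy is a majority voter over three replica outputs. The sketch below shows a bitwise voter in software purely to illustrate the idea; the system described above implements TMR in FPGA hardware with feedback, which this sketch does not model. The function names and the simulated bit flip are illustrative assumptions.

```python
from typing import Callable, Sequence


def majority_vote(a: int, b: int, c: int) -> int:
    """Bitwise 2-out-of-3 majority: each output bit takes the value
    held by at least two of the three inputs."""
    return (a & b) | (a & c) | (b & c)


def tmr_execute(replicas: Sequence[Callable[[int], int]], x: int) -> int:
    """Run three replicas on the same input and vote on their results."""
    if len(replicas) != 3:
        raise ValueError("TMR needs exactly three replicas")
    results = [replica(x) for replica in replicas]
    return majority_vote(*results)


if __name__ == "__main__":
    healthy = lambda x: x + 1
    # Hypothetical replica whose result suffers a single bit flip.
    upset = lambda x: (x + 1) ^ 0b100
    print(tmr_execute([healthy, upset, healthy], 5))
```

    A single upset replica is outvoted by the other two, so the voted result equals the healthy result.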

  8. Software fault tolerance in computer operating systems

    Science.gov (United States)

    Iyer, Ravishankar K.; Lee, Inhwan

    1994-01-01

    This chapter provides data and analysis of the dependability and fault tolerance for three operating systems: the Tandem/GUARDIAN fault-tolerant system, the VAX/VMS distributed system, and the IBM/MVS system. Based on measurements from these systems, basic software error characteristics are investigated. Fault tolerance in operating systems resulting from the use of process pairs and recovery routines is evaluated. Two levels of models are developed to analyze error and recovery processes inside an operating system and interactions among multiple instances of an operating system running in a distributed environment. The measurements show that the use of process pairs in Tandem systems, which was originally intended for tolerating hardware faults, allows the system to tolerate about 70% of defects in system software that result in processor failures. The loose coupling between processors which results in the backup execution (the processor state and the sequence of events occurring) being different from the original execution is a major reason for the measured software fault tolerance. The IBM/MVS system fault tolerance almost doubles when recovery routines are provided, in comparison to the case in which no recovery routines are available. However, even when recovery routines are provided, there is almost a 50% chance of system failure when critical system jobs are involved.

  9. Energy-efficient fault-tolerant systems

    CERN Document Server

    Mathew, Jimson; Pradhan, Dhiraj K

    2013-01-01

    This book describes the state-of-the-art in energy efficient, fault-tolerant embedded systems. It covers the entire product lifecycle of electronic systems design, analysis and testing and includes discussion of both circuit and system-level approaches. Readers will be enabled to meet the conflicting design objectives of energy efficiency and fault-tolerance for reliability, given the up-to-date techniques presented.

  10. Fault Tolerant Quantum Filtering and Fault Detection for Quantum Systems

    OpenAIRE

    Gao, Qing; Dong, Daoyi; Petersen, Ian R.

    2015-01-01

    This paper aims to determine the fault tolerant quantum filter and fault detection equation for a class of open quantum systems coupled to a laser field that is subject to stochastic faults. In order to analyze this class of open quantum systems, we propose a quantum-classical Bayesian inference method based on the definition of a so-called quantum-classical conditional expectation. It is shown that the proposed Bayesian inference approach provides a convenient tool to simultaneously derive t...

  11. Fault tolerant control of systems with saturations

    DEFF Research Database (Denmark)

    Niemann, Hans Henrik

    2013-01-01

    This paper presents a framework for fault tolerant controllers (FTC) that includes input saturation. The controller architecture known from FTC, which is based on the Youla-Jabr-Bongiorno-Kucera (YJBK) parameterization, is extended to handle input saturation. Applying this controller architecture in...... connection with faulty systems including input saturation gives an additional YJBK transfer function related to the input saturation. In the fault free case, this additional YJBK transfer function can be applied directly for optimizing the feedback loop around the input saturation. In the faulty case, the...... design problem is a mixed design problem involving both parametric faults and input saturation....

  12. Software engineering of fault tolerant systems

    CERN Document Server

    Pelliccione, P; Muccini, Henry

    2007-01-01

    In architecting dependable systems, what is required to improve the overall system robustness is fault tolerance. Although many methods have been proposed to this end, the solutions are usually considered late during the design and implementation phases of the software life-cycle (e.g., Java and Windows NT exception handling), thus reducing the effectiveness of error and fault handling. Since the system design typically models only normal behaviour of the system while ignoring exceptional ones, the implementation of the system is unable to handle abnormal events. Consequently, the system may fail in unexp

  13. Fault tolerant architecture for artificial olfactory system

    Science.gov (United States)

    Lotfivand, Nasser; Nizar Hamidon, Mohd; Abdolzadeh, Vida

    2015-05-01

    In this paper, to cover and mask the faults that occur in the sensing unit of an artificial olfactory system, a novel architecture is offered. The proposed architecture is able to tolerate failures in the sensors of the array and the faults that occur are masked. The proposed architecture for extracting the correct results from the output of the sensors can provide the quality of service for generated data from the sensor array. The results of various evaluations and analysis proved that the proposed architecture has acceptable performance in comparison with the classic form of the sensor array in gas identification. According to the results, achieving a high odor discrimination based on the suggested architecture is possible.

  14. A Fault-tolerant Development Methodology for Industrial Control Systems

    DEFF Research Database (Denmark)

    Izadi-Zamanabadi, Roozbeh; Thybo, C.

    2004-01-01

    Developing advanced detection schemes is not the lone factor for obtaining a successful fault diagnosis performance. Acquiring significant achievements in applying fault-tolerance in industrial development requires that fault diagnosis and recovery schemes are developed in a consistent and...... logically sound manner. This paper presents the employed fault-tolerant development methodology and highlights steps which have been essential for achieving complete and consistent monitoring capabilities. Fault diagnosis for a commercial refrigeration system is treated as a case-study....

  15. Method and system for environmentally adaptive fault tolerant computing

    Science.gov (United States)

    Copenhaver, Jason L. (Inventor); Jeremy, Ramos (Inventor); Wolfe, Jeffrey M. (Inventor); Brenner, Dean (Inventor)

    2010-01-01

    A method and system for adapting fault tolerant computing. The method includes the steps of measuring an environmental condition representative of an environment. An on-board processing system's sensitivity to the measured environmental condition is measured. It is determined whether to reconfigure a fault tolerance of the on-board processing system based in part on the measured environmental condition. The fault tolerance of the on-board processing system may be reconfigured based in part on the measured environmental condition.
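
    The decision step described above (measure the environment, account for the processing system's sensitivity, then decide whether to reconfigure) can be sketched as follows. This is a hedged illustration only: the two protection levels, the product of flux and sensitivity as an upset-rate estimate, and the threshold are assumptions made for the example and are not taken from the patent.

```python
from enum import Enum


class Protection(Enum):
    LOW = "low"    # hypothetical lightweight fault tolerance level
    HIGH = "high"  # hypothetical heavyweight fault tolerance level


def select_protection(measured_flux: float, sensitivity: float,
                      threshold: float) -> Protection:
    """Pick a protection level from the measured environmental condition
    and the system's sensitivity to it (both in assumed, arbitrary units)."""
    expected_upset_rate = measured_flux * sensitivity
    return Protection.HIGH if expected_upset_rate > threshold else Protection.LOW


def reconfigure_if_needed(current: Protection, measured_flux: float,
                          sensitivity: float, threshold: float):
    """Return the target protection level and whether a reconfiguration
    is required to reach it."""
    target = select_protection(measured_flux, sensitivity, threshold)
    return target, target is not current


if __name__ == "__main__":
    print(reconfigure_if_needed(Protection.LOW, measured_flux=10.0,
                                sensitivity=0.5, threshold=2.0))
```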

  16. Fault detection and fault-tolerant control for nonlinear systems

    CERN Document Server

    Li, Linlin

    2016-01-01

    Linlin Li addresses the analysis and design issues of observer-based FD and FTC for nonlinear systems. The author analyses the existence conditions for the nonlinear observer-based FD systems to gain a deeper insight into the construction of FD systems. Aided by the T-S fuzzy technique, she recommends different design schemes, among them the L_inf/L_2 type of FD systems. The derived FD and FTC approaches are verified by two benchmark processes. Contents: Overview of FD and FTC Technology; Configuration of Nonlinear Observer-Based FD Systems; Design of L2 nonlinear Observer-Based FD Systems; Design of Weighted Fuzzy Observer-Based FD Systems; FTC Configurations for Nonlinear Systems; Application to Benchmark Processes. Target Groups: Researchers and students in the field of engineering with a focus on fault diagnosis and fault-tolerant control fields. The Author: Dr. Linlin Li completed her dissertation under the supervision of Prof. Steven X. Ding at the Faculty of Engineering, University of Duisburg-Essen, Germany...

  17. Fault-tolerant Actuator System for Electrical Steering of Vehicles

    DEFF Research Database (Denmark)

    Sørensen, Jesper Sandberg; Blanke, Mogens

    Being critical to the safety of vehicles, the steering system is required to maintain the vehicle's ability to steer until it is brought to a halt, should a fault occur. With electrical steering becoming a cost-effective candidate for electrical powered vehicles, a fault-tolerant architecture is...... needed that meets this requirement. This paper studies the fault-tolerance properties of an electrical steering system. It presents a fault-tolerant architecture where a dedicated AC motor design used in conjunction with cheap voltage measurements can ensure detection of all relevant faults in the...... steering system. The paper shows how active control reconfiguration can accommodate all critical faults. The fault-tolerant abilities of the steering system are demonstrated on the hardware of a warehouse truck....

  18. Abstractions for Fault-Tolerant Distributed System Verification

    Science.gov (United States)

    Pike, Lee S.; Maddalon, Jeffrey M.; Miner, Paul S.; Geser, Alfons

    2004-01-01

    Four kinds of abstraction for the design and analysis of fault tolerant distributed systems are discussed. These abstractions concern system messages, faults, fault masking voting, and communication. The abstractions are formalized in higher order logic, and are intended to facilitate specifying and verifying such systems in higher order theorem provers.

  19. Comparing Distributed Online Stream Processing Systems Considering Fault Tolerance Issues

    Directory of Open Access Journals (Sweden)

    André Leon Sampaio Gradvohl

    2014-05-01

    This paper presents an analysis of four online stream processing systems (MillWheel, S4, Spark Streaming and Storm) regarding the strategies they use for fault tolerance. We use this sort of system for processing of data streams that can come from different sources such as web sites, sensors, mobile phones or any set of devices that provide real-time high-speed data. Typically, these systems are concerned more with the throughput in data processing than with fault tolerance. However, depending on the type of application, we should consider fault tolerance an important feature. The work describes some of the main strategies for fault tolerance (replication of components, upstream backup, checkpoint and recovery) and shows how each of the four systems uses these strategies. In the end, the paper discusses the advantages and disadvantages of the combination of the strategies for fault tolerance in these systems.
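
    To make the checkpoint-and-recovery and upstream-backup strategies concrete, the following toy sketch shows a stateful operator that snapshots its state and, after a failure, restores the snapshot and replays the events that an upstream log has retained since then. It does not reflect the implementation of any of the four systems; the class, its methods, and the in-memory checkpoint are illustrative assumptions (a real system would write checkpoints to durable storage).

```python
import copy


class CountingOperator:
    """Toy stateful stream operator that counts occurrences of each key."""

    def __init__(self):
        self.counts = {}
        self.processed = 0           # number of events applied to the state
        self._checkpoint = ({}, 0)   # (state snapshot, position in the stream)

    def process(self, key):
        self.counts[key] = self.counts.get(key, 0) + 1
        self.processed += 1

    def checkpoint(self):
        """Snapshot the state together with the stream position it covers."""
        self._checkpoint = (copy.deepcopy(self.counts), self.processed)

    def recover(self, upstream_log):
        """Restore the last snapshot, then replay the events the upstream
        node kept in its log after that position (upstream backup)."""
        saved_counts, saved_position = self._checkpoint
        self.counts = copy.deepcopy(saved_counts)
        self.processed = saved_position
        for key in upstream_log[saved_position:]:
            self.process(key)


if __name__ == "__main__":
    log = ["a", "b", "a", "c", "a"]   # events retained upstream
    op = CountingOperator()
    for key in log[:3]:
        op.process(key)
    op.checkpoint()
    op.process(log[3])
    op.counts, op.processed = {}, 0   # simulate losing the in-memory state
    op.recover(log)
    print(op.counts)
```

    The replay length is bounded by the checkpoint interval, which is the usual trade-off between checkpoint overhead during normal operation and recovery time after a failure.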

  20. From fault classification to fault tolerance for multi-agent systems

    CERN Document Server

    Potiron, Katia; Taillibert, Patrick

    2013-01-01

    Faults are a concern for Multi-Agent Systems (MAS) designers, especially if the MAS are built for industrial or military use because there must be some guarantee of dependability. Some fault classification exists for classical systems, and is used to define faults. When dependability is at stake, such fault classification may be used from the beginning of the system's conception to define fault classes and specify which types of faults are expected. Thus, one may want to use fault classification for MAS; however, From Fault Classification to Fault Tolerance for Multi-Agent Systems argues that

  1. Software fault tolerance

    OpenAIRE

    Kazinov, Tofik Hasanaga; Mostafa, Jalilian Shahrukh

    2009-01-01

    Because of our present inability to produce error-free software, software fault tolerance is and will continue to be an important consideration in software systems. The root cause of software design errors is the complexity of the systems. This paper surveys various software fault tolerance techniques and methodologies. They fall into two groups: single-version and multi-version software fault tolerance techniques. It is expected that software fault tolerance research will benefit from this research...

  2. Fault-Tolerant Control For A Robotic Inspection System

    Science.gov (United States)

    Tso, Kam Sing

    1995-01-01

    Report describes first phase of continuing program of research on fault-tolerant control subsystem of telerobotic visual-inspection system. Goal of program to develop robotic system for remotely controlled visual inspection of structures in outer space.

  3. H infinity Integrated Fault Estimation and Fault Tolerant Control of Discrete-time Piecewise Linear Systems

    DEFF Research Database (Denmark)

    Tabatabaeipour, Seyed Mojtaba; Bak, Thomas

    In this paper we consider the problem of fault estimation and accommodation for discrete time piecewise linear systems. A robust fault estimator is designed to estimate the fault such that the estimation error converges to zero and H∞ performance of the fault estimation is minimized. Then, the...... estimate of fault is used to compensate for the effect of the fault. Hence, using the estimate of fault, a fault tolerant controller using a piecewise linear static output feedback is designed such that it stabilizes the system and provides an upper bound on the H∞ performance of the faulty system. Sufficient conditions for the existence of robust fault estimator and fault tolerant controller are derived in terms of linear matrix inequalities. Upper bounds on the H∞ performance can be minimized by solving convex optimization problems with linear matrix inequality constraints. The efficiency of the...

  4. A fault-tolerant intelligent robotic control system

    Science.gov (United States)

    Marzwell, Neville I.; Tso, Kam Sing

    1993-01-01

    This paper describes the concept, design, and features of a fault-tolerant intelligent robotic control system being developed for space and commercial applications that require high dependability. The comprehensive strategy integrates system level hardware/software fault tolerance with task level handling of uncertainties and unexpected events for robotic control. The underlying architecture for system level fault tolerance is the distributed recovery block which protects against application software, system software, hardware, and network failures. Task level fault tolerance provisions are implemented in a knowledge-based system which utilizes advanced automation techniques such as rule-based and model-based reasoning to monitor, diagnose, and recover from unexpected events. The two level design provides tolerance of two or more faults occurring serially at any level of command, control, sensing, or actuation. The potential benefits of such a fault tolerant robotic control system include: (1) a minimized potential for damage to humans, the work site, and the robot itself; (2) continuous operation with a minimum of uncommanded motion in the presence of failures; and (3) more reliable autonomous operation providing increased efficiency in the execution of robotic tasks and decreased demand on human operators for controlling and monitoring the robotic servicing routines.

  5. Fault tolerant hypercube computer system architecture

    Science.gov (United States)

    Madan, Herb S. (Inventor); Chow, Edward (Inventor)

    1989-01-01

    A fault-tolerant multiprocessor computer system of the hypercube type comprising a hierarchy of computers of like kind which can be functionally substituted for one another as necessary is disclosed. Communication between the working nodes is via one communications network while communications between the working nodes and watch dog nodes and load balancing nodes higher in the structure is via another communications network separate from the first. A typical branch of the hierarchy reporting to a master node or host computer comprises, a plurality of first computing nodes; a first network of message conducting paths for interconnecting the first computing nodes as a hypercube. The first network provides a path for message transfer between the first computing nodes; a first watch dog node; and a second network of message connecting paths for connecting the first computing nodes to the first watch dog node independent from the first network, the second network provides an independent path for test message and reconfiguration affecting transfers between the first computing nodes and the first switch watch dog node. There is additionally, a plurality of second computing nodes; a third network of message conducting paths for interconnecting the second computing nodes as a hypercube. The third network provides a path for message transfer between the second computing nodes; a fourth network of message conducting paths for connecting the second computing nodes to the first watch dog node independent from the third network. The fourth network provides an independent path for test message and reconfiguration affecting transfers between the second computing nodes and the first watch dog node; and a first multiplexer disposed between the first watch dog node and the second and fourth networks for allowing the first watch dog node to selectively communicate with individual ones of the computing nodes through the second and fourth networks; as well as, a second watch dog node operably connected to the first multiplexer whereby the second watch dog node can selectively communicate with individual ones of the computing nodes through the second and fourth networks. The branch is completed by a first load balancing node; and a second multiplexer connected between the first load balancing node and the first and second watch dog nodes, allowing the first load balancing node to selectively communicate with the first and second watch dog nodes.

  6. Measurement and analysis of operating system fault tolerance

    Science.gov (United States)

    Lee, I.; Tang, D.; Iyer, R. K.

    1992-01-01

    This paper demonstrates a methodology to model and evaluate the fault tolerance characteristics of operational software. The methodology is illustrated through case studies on three different operating systems: the Tandem GUARDIAN fault-tolerant system, the VAX/VMS distributed system, and the IBM/MVS system. Measurements are made on these systems for substantial periods to collect software error and recovery data. In addition to investigating basic dependability characteristics such as major software problems and error distributions, we develop two levels of models to describe error and recovery processes inside an operating system and on multiple instances of an operating system running in a distributed environment. Based on the models, reward analysis is conducted to evaluate the loss of service due to software errors and the effect of the fault-tolerance techniques implemented in the systems. Software error correlation in multicomputer systems is also investigated.

  7. Synthesizing Fault Tolerant Safety Critical Systems

    OpenAIRE

    Seemanta Saha; Muhammad Sheikh Sadi

    2014-01-01

    To keep pace with today’s nanotechnology, safety critical embedded systems are becoming less tolerant to errors. Research into techniques to cope with errors in these systems has mostly focused on transformational approach, replication of hardware devices, parallel program design, component based design and/or information redundancy. It would be better to tackle the issue early in the design process so that a safety critical system never fails to satisfy its strict dependability requiremen...

  8. ROBUS-2: A Fault-Tolerant Broadcast Communication System

    Science.gov (United States)

    Torres-Pomales, Wilfredo; Malekpour, Mahyar R.; Miner, Paul S.

    2005-01-01

    The Reliable Optical Bus (ROBUS) is the core communication system of the Scalable Processor-Independent Design for Enhanced Reliability (SPIDER), a general-purpose fault-tolerant integrated modular architecture currently under development at NASA Langley Research Center. The ROBUS is a time-division multiple access (TDMA) broadcast communication system with medium access control by means of a time-indexed communication schedule. ROBUS-2 is a developmental version of the ROBUS providing guaranteed fault-tolerant services to the attached processing elements (PEs), in the presence of a bounded number of faults. These services include message broadcast (Byzantine Agreement), dynamic communication schedule update, clock synchronization, and distributed diagnosis (group membership). The ROBUS also features fault-tolerant startup and restart capabilities. ROBUS-2 is tolerant to internal as well as PE faults, and incorporates a dynamic self-reconfiguration capability driven by the internal diagnostic system. This version of the ROBUS is intended for laboratory experimentation and demonstrations of the capability to reintegrate failed nodes, dynamically update the communication schedule, and tolerate and recover from correlated transient faults.

  9. Robust Adaptive Switching Fault-Tolerant Control of a Class of Uncertain Systems against Actuator Faults

    OpenAIRE

    Xiao-Zheng Jin

    2013-01-01

    This paper deals with the fault-tolerant control (FTC) problem for a class of linear time-invariant systems with time-varying actuator faults and uncertainties. For more general consideration, the faults and uncertainties are supposed to depend on the states of systems and unknown constant bounds. For the sake of eliminating the effects of such state-dependent faults and uncertainties automatically, a switching control strategy which is formulated by a sign function is designed to configure c...

  10. Fault Tolerant Services for Safe In-Car Embedded Systems

    OpenAIRE

    Navet, Nicolas; Simonot-Lion, Françoise

    2005-01-01

    Due to the increasing criticality of the functions in terms of safety, embedded automotive systems must now respect stringent dependability constraints despite the faults that may occur in a very harsh environment. In a context where critical functions are distributed over the network, the communication system plays a major role. First, we discuss the main services and functionalities that a communication system should offer for easing the design of fault-tolerant applications in the automot...

  11. Data-driven design of fault diagnosis and fault-tolerant control systems

    CERN Document Server

    Ding, Steven X

    2014-01-01

    Data-driven Design of Fault Diagnosis and Fault-tolerant Control Systems presents basic statistical process monitoring, fault diagnosis, and control methods, and introduces advanced data-driven schemes for the design of fault diagnosis and fault-tolerant control systems catering to the needs of dynamic industrial processes. With ever increasing demands for reliability, availability and safety in technical processes and assets, process monitoring and fault-tolerance have become important issues surrounding the design of automatic control systems. This text shows the reader how, thanks to the rapid development of information technology, key techniques of data-driven and statistical process monitoring and control can now become widely used in industrial practice to address these issues. To allow for self-contained study and facilitate implementation in real applications, important mathematical and control theoretical knowledge and tools are included in this book. Major schemes are presented in algorithm form and...

  12. Comparing Distributed Online Stream Processing Systems Considering Fault Tolerance Issues

    OpenAIRE

    André Leon Sampaio Gradvohl; Hermes Senger; Luciana Arantes; Pierre Sens

    2014-01-01

    This paper presents an analysis of four online stream processing systems (MillWheel, S4, Spark Streaming and Storm) regarding the strategies they use for fault tolerance. We use this sort of system for processing of data streams that can come from different sources such as web sites, sensors, mobile phones or any set of devices that provide real-time high-speed data. Typically, these systems are concerned more with the throughput in data processing than on fault tolerance. However, depending ...

  13. Design of fault tolerant control system for steam generator using

    Energy Technology Data Exchange (ETDEWEB)

    Kim, Myung Ki; Seo, Mi Ro [Korea Electric Power Research Institute, Taejon (Korea, Republic of)

    1998-12-31

    A controller and sensor fault tolerant system for a steam generator is designed with fuzzy logic. The proposed fault tolerant redundant system is composed of a supervisor and two fuzzy weighting modulators. The supervisor alternately checks the controller-induced and sensor-induced performance to identify which part, a controller or a sensor, is faulty. To analyze controller-induced performance, both the error and the change in error of the system output are chosen as fuzzy variables. The fuzzy logic for sensor-induced performance uses two variables: the deviation between the two sensor outputs and its frequency. The fuzzy weighting modulator generates an output signal that compensates for a faulty input signal. Simulations show that the proposed fault tolerant control scheme regulates the steam generator water level well by suppressing the fault effect of either controllers or sensors. Therefore, by duplicating sensors and controllers with the proposed fault tolerant scheme, the reliability of both the steam generator control and sensor system and the power plant increases further. 2 refs., 9 figs., 1 tab. (Author)

  14. Fault-tolerant design of picture archiving and communication systems

    International Nuclear Information System (INIS)

    Reliability is perhaps the most important attribute of a PACS. Any downtime of the system may seriously affect patient care. This paper describes fault-tolerant measures employed in the design of a hospital-wide PACS. Six fault-tolerant measures have been implemented: hardware redundancy (networks and archives), data-base backups, monitoring routines for local host processes and network status, uninterruptible power supplies, structured software design techniques, and in-service training of all radiology technologists. A PACS consisting of 13 acquisition nodes, two optical archiving nodes, two data-base server nodes, and five workstation nodes has been developed.

  15. Fault Tolerance Middleware for a Multi-Core System

    Science.gov (United States)

    Some, Raphael R.; Springer, Paul L.; Zima, Hans P.; James, Mark; Wagner, David A.

    2012-01-01

    Fault Tolerance Middleware (FTM) provides a framework to run on a dedicated core of a multi-core system and handles detection of single-event upsets (SEUs), and the responses to those SEUs, occurring in an application running on multiple cores of the processor. This software was written expressly for a multi-core system and can support different kinds of fault strategies, such as introspection, algorithm-based fault tolerance (ABFT), and triple modular redundancy (TMR). It focuses on providing fault tolerance for the application code, and represents the first step in a plan to eventually include fault tolerance in message passing and the FTM itself. In the multi-core system, the FTM resides on a single, dedicated core, separate from the cores used by the application. This is done in order to isolate the FTM from application faults and to allow it to swap out any application core for a substitute. The structure of the FTM consists of an interface to a fault tolerant strategy module, a responder module, a fault manager module, an error factory, and an error mapper that determines the severity of the error. In the present reference implementation, the only fault tolerant strategy implemented is introspection. The introspection code waits for an application node to send an error notification to it. It then uses the error factory to create an error object, and at this time, a severity level is assigned to the error. The introspection code uses its built-in knowledge base to generate a recommended response to the error. Responses might include ignoring the error, logging it, rolling back the application to a previously saved checkpoint, swapping in a new node to replace a bad one, or restarting the application. The original error and recommended response are passed to the top-level fault manager module, which invokes the response. The responder module also notifies the introspection module of the generated response. 
This provides additional information to the introspection module that it can use in generating its next response. For example, if the responder triggers an application rollback and errors are still occurring, the introspection module may decide to recommend an application restart.
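
    The escalation behaviour described above can be sketched in a few lines. This is only an illustrative sketch, not the FTM reference implementation: the class names, the severity levels, and the severity-to-response mapping are all hypothetical, and the only rule taken from the abstract is that errors persisting after a rollback may lead to a recommended restart.

```python
from dataclasses import dataclass, field
from enum import IntEnum
from typing import List


class Severity(IntEnum):      # hypothetical severity levels
    LOW = 1
    MEDIUM = 2
    HIGH = 3


class Response(IntEnum):      # responses named in the abstract
    IGNORE = 0
    LOG = 1
    ROLLBACK = 2
    SWAP_NODE = 3
    RESTART = 4


@dataclass
class Introspection:
    """Toy stand-in for the introspection strategy module."""
    history: List[Response] = field(default_factory=list)

    def recommend(self, severity: Severity) -> Response:
        # Hypothetical "knowledge base": a fixed severity-to-response table.
        base = {
            Severity.LOW: Response.LOG,
            Severity.MEDIUM: Response.ROLLBACK,
            Severity.HIGH: Response.SWAP_NODE,
        }[severity]
        # If the last invoked response was a rollback and a non-trivial
        # error is still being reported, escalate to a restart.
        rolled_back = bool(self.history) and self.history[-1] == Response.ROLLBACK
        if rolled_back and severity >= Severity.MEDIUM:
            return Response.RESTART
        return base

    def notify(self, response: Response) -> None:
        # Called by the responder so later recommendations see the history.
        self.history.append(response)
```

    In this sketch the responder would call notify() after invoking each response, so a second MEDIUM error arriving right after a rollback yields RESTART instead of another rollback.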

  16. Development and application of diagnostic systems to achieve fault tolerance

    International Nuclear Information System (INIS)

    Much work is currently being done to develop and apply diagnostic systems that are tolerant to faulted conditions in the process being monitored and in the sensors that measure the critical parameters associated with the process. A fault-tolerant diagnostic system based on state-determination, pattern-recognition techniques is currently undergoing testing and evaluation in certain applications at the EBR-II reactor. Testing and operational experience with the system to date has shown a high degree of tolerance to sensor failures, while being sensitive to very slight changes in the plant operational state. This paper briefly mentions related work being done by others, and describes in more detail the pattern-recognition system and the results of the testing and operational experience with the system at EBR-II. 9 refs., 10 figs

  17. Design a Fault Tolerance for Real Time Distributed System

    OpenAIRE

    Ban M. Khammas

    2012-01-01

    This paper designs a fault tolerance system for soft real time distributed systems (FTRTDS). The system is designed to be independent of specific mechanisms and facilities of the underlying real time distributed system. It is designed to be distributed on all the computers in the distributed system and controlled by a central unit. Besides gathering information about a target program spontaneously, it provides information about the target operating system and the target hardware in order to diagno...

  18. Trends in reliability modeling technology for fault tolerant systems

    Science.gov (United States)

    Bavuso, S. J.

    1979-01-01

    Developments in reliability modeling for large fault tolerant avionic computing systems are presented. Issues of state size and complexity, fault coverage, and practical computation are addressed. A two-fold developmental effort is described based on the structural and fault coverage modeling approaches. A technique which was successfully applied to an 865 state pure death stationary Markov model is presented. Of particular interest is a short computer program which executes very quickly to produce reliability results of a large state space model. This model also incorporates fault coverage states for processor, memory, and bus line replaceable units. A second structural reliability modeling scheme is aimed at solving nonstationary Markov models. This technique provides the tool required for studying the reliability of systems with nonconstant failure rates and includes intermittent/transient faults, electronic hardware which exhibits decreasing failure rates, and hydromechanical devices which typically have wearout failure mechanisms. Several aspects of fault coverage, including modeling and data measurement of intermittent/transient faults and latent faults, are elucidated and illustrated. The CARE II (computer-aided reliability estimation) coverage is presented and shortcomings to be eliminated are discussed.

  19. Industrial Computing Systems: A Case Study of Fault Tolerance Analysis

    OpenAIRE

    Shchurov, Andrey A.

    2015-01-01

    Fault tolerance is a key factor of industrial computing systems design. But in practical terms, these systems, like every commercial product, are under great financial constraints and they have to remain in operational state as long as possible due to their commercial attractiveness. This work provides an analysis of the instantaneous failure rate of these systems at the end of their life-time period. On the basis of this analysis, we determine the effect of a critical increase in the system ...

  20. Fault Tolerant Middleware for Agent Systems: A Refinement Approach

    OpenAIRE

    Laibinis, Linas; Troubitsyna, Elena; Iliasov, Alexei; Romanovsky, Alexander

    2009-01-01

    Agent technology offers a number of advantages over traditional distributed systems, such as asynchronous communication, anonymity of individual agents and ability to change operational context. However, it is notoriously difficult to ensure dependability of agent systems. In this paper we present a formal approach for the top-down development of fault tolerant middleware for agent systems. We demonstrate how to develop the middleware that besides providing agent coordination is also able to ...

  1. Fault-Tolerant Control of a Distributed Database System

    OpenAIRE

    N. Eva Wu; Ruschmann, Matthew C.; Mark H. Linderman

    2008-01-01

    Optimal state information-based control policy for a distributed database system subject to server failures is considered. Fault-tolerance is made possible by the partitioned architecture of the system and data redundancy therein. Control actions include restoration of lost data sets in a single server using redundant data sets in the remaining servers, routing of queries to intact servers, or overhaul of the entire system for renewal. Control policies are determined by solving Markov decisio...

  2. Reliable, fault tolerant control systems for nuclear generating stations

    International Nuclear Information System (INIS)

    Two operational features of CANDU Nuclear Power Stations provide for high plant availability. First, the plant re-fuels on-line, thereby eliminating the need for periodic and lengthy refuelling 'outages'. Second, all the plants are controlled by real-time computer systems. Later plants are also protected using real-time computer systems. In the past twenty years, the control systems now operating in 21 plants have achieved an availability of 99.8%, making significant contributions to high CANDU plant capacity factors. This paper describes some of the features that ensure the high degree of system fault tolerance and hence high plant availability. The emphasis will be placed on the fault tolerant features of the computer systems included in the latest reactor design - the CANDU 3 (450MWe). (author)

  3. Diagnosis and Fault-Tolerant Control for Thruster-Assisted Position Mooring System

    DEFF Research Database (Denmark)

    Nguyen, Trong Dong; Blanke, Mogens; Sørensen, Asgeir

    2007-01-01

    Development of fault-tolerant control systems is crucial to maintain safe operation of offshore installations. The objective of this paper is to develop a fault-tolerant control for a thruster-assisted position mooring (PM) system with faults occurring in the mooring lines. Faults in line...

  4. Fault detection, diagnosis and active fault tolerant control for a satellite attitude control system

    OpenAIRE

    Baldi, Pietro

    2015-01-01

    Modern control systems are becoming more and more complex and control algorithms more and more sophisticated. Consequently, Fault Detection and Diagnosis (FDD) and Fault Tolerant Control (FTC) have gained central importance over the past decades, due to the increasing requirements of availability, cost efficiency, reliability and operating safety. This thesis deals with the FDD and FTC problems in a spacecraft Attitude Determination and Control System (ADCS). Firstly, the detailed nonlinea...

  5. Summarize of Electric Vehicle Electric System Fault and Fault-tolerant Technology

    Directory of Open Access Journals (Sweden)

    Zhang Liwei

    2013-09-01

    Electric vehicle drive systems are multi-variable and operate in complex, changeable environments, so their failure modes are complicated. In this paper, a vehicle fault table is established according to where faults occur, and the possible consequences and causes of each failure are analyzed. Considering hardware limitations and the need to preserve as much system performance as possible, a passive software redundancy fault-tolerant strategy is put forward, and an example is given to analyze the pros and cons of this method.

  6. Development and Evaluation of Fault-Tolerant Flight Control Systems

    Science.gov (United States)

    Song, Yong D.; Gupta, Kajal (Technical Monitor)

    2004-01-01

    The research is concerned with developing a new approach to enhancing fault tolerance of flight control systems. The original motivation for fault-tolerant control comes from the need for safe operation of control elements (e.g. actuators) in the event of hardware failures in high reliability systems. One such example is a modern space vehicle subjected to actuator/sensor impairments. A major task in flight control is to revise the control policy to balance impairment detectability and to achieve sufficient robustness. This involves careful selection of types and parameters of the controllers and the impairment detecting filters used. It also involves a decision, upon the identification of some failures, on whether and how a control reconfiguration should take place in order to maintain a certain system performance level. In this project a new flight dynamic model under uncertain flight conditions is considered, in which the effects of both ramp and jump faults are reflected. Stabilization algorithms based on neural network and adaptive methods are derived. The control algorithms are shown to be effective in dealing with uncertain dynamics due to external disturbances and unpredictable faults. The overall strategy is easy to set up and the computation involved is much less than with other strategies. Computer simulation software is developed. A series of simulation studies has been conducted with varying flight conditions.

  7. Fault tolerant aggregation for power system services

    DEFF Research Database (Denmark)

    Kosek, Anna Magdalena; Gehrke, Oliver; Kullmann, Daniel

    2013-01-01

    Exploiting the flexibility in distributed energy resources (DER) is seen as an important contribution to allow high penetrations of renewable generation in electrical power systems. However, the present control infrastructure in power systems is not well suited for the integration of a very large...... number of small units. A common approach is to aggregate a portfolio of such units together and expose them to the power system as a single large virtual unit. In order to realize the vision of a Smart Grid, concepts for flexible, resilient and reliable aggregation infrastructures are required. This...

  8. Fault-tolerant design

    CERN Document Server

    Dubrova, Elena

    2013-01-01

    This textbook serves as an introduction to fault-tolerance, intended for upper-division undergraduate students, graduate-level students and practicing engineers in need of an overview of the field. Readers will develop skills in modeling and evaluating fault-tolerant architectures in terms of reliability, availability and safety. They will gain a thorough understanding of fault tolerant computers, including both the theory of how to design and evaluate them and the practical knowledge of achieving fault-tolerance in electronic, communication and software systems. Coverage includes fault-tolerance techniques through hardware, software, information and time redundancy. The content is designed to be highly accessible, including numerous examples and exercises. Solutions and PowerPoint slides are available for instructors. Provides textbook coverage of the fundamental concepts of fault-tolerance; Describes a variety of basic techniques for achieving fault-toleran...

  9. A Game-theoretic Approach for Synthesizing Fault-Tolerant Embedded Systems

    CERN Document Server

    Cheng, Chih-Hong; Knoll, Alois; Buckl, Christian

    2010-01-01

    In this paper, we present an approach for fault-tolerant synthesis by combining predefined patterns for fault-tolerance with algorithmic game solving. A non-fault-tolerant system, together with the relevant fault hypothesis and fault-tolerant mechanism templates in a pool, is translated into a distributed game, and we perform an incomplete search of strategies to cope with undecidability. The result of the game is translated back to executable code concretizing fault-tolerant mechanisms using constraint solving. The overall approach is implemented in a prototype tool chain and is illustrated using examples.

  10. Fault-tolerant reactor protection system

    International Nuclear Information System (INIS)

    A reactor protection system is disclosed having four divisions, with quad redundant sensors for each scram parameter providing input to four independent microprocessor-based electronic chassis. Each electronic chassis acquires the scram parameter data from its own sensor, digitizes the information, and then transmits the sensor reading to the other three electronic chassis via optical fibers. To increase system availability and reduce false scrams, the reactor protection system employs two levels of voting on a need for reactor scram. The electronic chassis perform software divisional data processing, vote 2/3 with spare based upon information from all four sensors, and send the divisional scram signals to the hardware logic panel, which performs a 2/4 division vote on whether or not to initiate a reactor scram. Each chassis makes a divisional scram decision based on data from all sensors. Each division performs independently of the others (asynchronous operation). All communications between the divisions are asynchronous. Each chassis substitutes its own spare sensor reading in the 2/3 vote if a sensor reading from one of the other chassis is faulty or missing. Therefore the presence of at least two valid sensor readings in excess of a set point is required before terminating the output to the hardware logic of a scram inhibition signal even when one of the four sensors is faulty or when one of the divisions is out of service. 16 figs
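
    The two voting levels described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the disclosed design: the function names are hypothetical, a faulty or missing reading is modelled as None, a scram condition is modelled as a reading above the set point, and an out-of-service division is assumed to share no reading and to cast no scram vote.

```python
from typing import List, Optional, Sequence


def divisional_vote(own: Optional[float],
                    others: Sequence[Optional[float]],
                    set_point: float) -> bool:
    """2-out-of-3 vote with spare for one division (hypothetical helper)."""
    readings: List[Optional[float]] = list(others)
    # Substitute this chassis's own (spare) reading for the first
    # faulty or missing reading received from another chassis.
    for i, value in enumerate(readings):
        if value is None:
            readings[i] = own
            break
    exceeding = sum(1 for v in readings if v is not None and v > set_point)
    return exceeding >= 2


def scram(sensors: Sequence[Optional[float]],
          set_point: float,
          in_service: Optional[Sequence[bool]] = None) -> bool:
    """Combine four divisional votes with a 2-out-of-4 hardware vote."""
    if in_service is None:
        in_service = [True] * len(sensors)
    # Assumption: an out-of-service division shares no sensor reading.
    visible = [s if ok else None for s, ok in zip(sensors, in_service)]
    signals = []
    for i, ok in enumerate(in_service):
        if not ok:
            signals.append(False)  # assumption: no scram vote from this division
            continue
        others = visible[:i] + visible[i + 1:]
        signals.append(divisional_vote(visible[i], others, set_point))
    # Hardware logic panel: 2-out-of-4 vote on the divisional scram signals.
    return sum(signals) >= 2
```

    With a set point of 100, a single spurious high reading such as [150, 90, 91, 92] produces no scram in this sketch, while two valid readings above the set point still produce one when a third sensor is missing or one division is out of service.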

  11. Sliding mode based fault detection, reconstruction and fault tolerant control scheme for motor systems.

    Science.gov (United States)

    Mekki, Hemza; Benzineb, Omar; Boukhetala, Djamel; Tadjine, Mohamed; Benbouzid, Mohamed

    2015-07-01

    The fault-tolerant control problem belongs to the domain of complex control systems in which inter-control-disciplinary information and expertise are required. This paper proposes an improved fault detection, reconstruction and fault-tolerant control (FTC) scheme for motor systems (MS) with typical faults. For this purpose, a sliding mode controller (SMC) with an integral sliding surface is adopted. This controller makes the system output track the desired position reference signal in finite time and obtains a better dynamic response and anti-disturbance performance, but it cannot deal directly with total system failures. However, an appropriate combination of the adopted SMC with a sliding mode observer (SMO), the latter designed to detect and reconstruct the faults on-line and to provide a sensorless control strategy, can achieve tolerance to a wide class of total additive failures. The closed-loop stability is proved using Lyapunov stability theory. Simulation results in healthy and faulty conditions confirm the reliability of the suggested framework. PMID:25747198

  12. Fault Tolerant Operation in Aero Engine Using Distributed Computation System

    Directory of Open Access Journals (Sweden)

    Neela A G

    2014-04-01

    The paper presents fault tolerant operation in an aero engine based on real-time systems built for a very small set of mission-critical applications such as spacecraft, avionics and other distributed control systems. Modern software deals with external interfaces and has to consider various timing implications. The platform is based on C and developed using the Keil MDK tool, with a targeted deadline of 100 milliseconds at a baud rate of 500 kbps. The CAN interface handles transportation and communication: an interface cable is used for serial communication between the Digital Electronic Control Unit (DECU) and the host to transfer data to the pilot Online Monitoring System, which is based on Laboratory Virtual Instrument Engineering Workbench (LabVIEW 7.1). Fault diagnosis typically assumes a sufficiently large fault signature and enough time for a reliable decision to be reached. However, for a class of safety critical faults on commercial aircraft engines, prompt detection within a millisecond range is paramount to allow accommodation to avert undesired engine behavior. At the same time, false positives must be avoided to prevent inappropriate control action.

  13. Fault-Tolerant Relative Navigation System (RNS) for Docking Project

    Data.gov (United States)

    National Aeronautics and Space Administration — A method is proposed to develop a sensor fusion process for blending GPS/IMU/EO data for fault tolerant rendezvous and docking of spacecraft. The methodology takes...

  14. Advanced information processing system: The Army fault tolerant architecture conceptual study. Volume 2: Army fault tolerant architecture design and analysis

    Science.gov (United States)

    Harper, R. E.; Alger, L. S.; Babikyan, C. A.; Butler, B. P.; Friend, S. A.; Ganska, R. J.; Lala, J. H.; Masotto, T. K.; Meyer, A. J.; Morton, D. P.

    1992-01-01

    Described here is the Army Fault Tolerant Architecture (AFTA) hardware architecture and components and the operating system. The architectural and operational theory of the AFTA Fault Tolerant Data Bus is discussed. The test and maintenance strategy developed for use in fielded AFTA installations is presented. An approach to be used in reducing the probability of AFTA failure due to common mode faults is described. Analytical models for AFTA performance, reliability, availability, life cycle cost, weight, power, and volume are developed. An approach is presented for using VHSIC Hardware Description Language (VHDL) to describe and design AFTA's developmental hardware. A plan is described for verifying and validating key AFTA concepts during the Dem/Val phase. Analytical models and partial mission requirements are used to generate AFTA configurations for the TF/TA/NOE and Ground Vehicle missions.

  15. Fault-diagnosis applications: model-based condition monitoring: actuators, drives, machinery, plants, sensors, and fault-tolerant systems

    CERN Document Server

    Isermann, Rolf

    2011-01-01

    Supervision, condition-monitoring, fault detection, fault diagnosis and fault management play an increasing role for technical processes and vehicles in order to improve reliability, availability, maintenance and lifetime. For safety-related processes, fault-tolerant systems with redundancy are required in order to reach comprehensive system integrity. This book is a sequel to the book "Fault-Diagnosis Systems" published in 2006, where the basic methods were described. After a short introduction to fault-detection and fault-diagnosis methods the book shows how these methods can be applie

  16. Diagnostic software and fault tolerant microprocessor based system architectures

    International Nuclear Information System (INIS)

    In numerous industrial applications including power generation, the availability of electronic systems to perform the tasks assigned has become a major issue. At the same time, the functional complexity of these systems has increased enormously. Fortunately, the arrival of cost effective microprocessor based hardware has given the system designer a cadre of techniques to ensure the desired degree of system integrity and availability. These include: dynamic redundancy, isolation, functional diversity, built-in self-tests, embedded test subsystems, communications, error checking and error correcting codes, etc. The choice among the available techniques is generally heuristic and depends greatly on the structure of major components and systems external to the electronic system itself as well as the postulated faults and their relative frequency. Indiscriminate use of these techniques will inevitably increase cost and reduce maintainability while actually reducing system availability and reliability. The issues and the application of these techniques are discussed by describing recent examples of fault tolerant microprocessor based system architectures which include the Plant Safety Monitoring System, the EAGLE-21 Process Protection System and the Advanced Rod Position Indication System for pressurized water reactors. Each of these systems utilizes a unique internal architecture that addresses the reliability, availability, and communications issues while improving maintainability and man-machine interfaces.

  17. Passive Fault Tolerant Control of Piecewise Affine Systems Based on H Infinity Synthesis

    DEFF Research Database (Denmark)

    Gholami, Mehdi; Cocquempot, Vincent; Schiøler, Henrik; Bak, Thomas

    2011-01-01

    In this paper we design a passive fault tolerant controller against actuator faults for discrete-time piecewise affine (PWA) systems. By using dissipativity theory and H∞ analysis, fault tolerant state feedback controller design is expressed as a set of Linear Matrix Inequalities (LMIs). In the...

  18. Fault Tolerant Feedback Control

    DEFF Research Database (Denmark)

    Stoustrup, Jakob; Niemann, H.

    An architecture for fault tolerant feedback controllers based on the Youla parameterization is suggested. It is shown that the Youla parameterization will give a residual vector directly in connection with the fault diagnosis part of the fault tolerant feedback controller. It turns out that there...... is a separation between the feedback controller and the fault tolerant part. The closed loop feedback properties are handled by the nominal feedback controller and the fault tolerant part is handled by the design of the Youla parameter. The design of the fault tolerant part will not affect the...... design of the nominal feedback controller....

  19. Distributed Evaluation Functions for Fault Tolerant Multi-Rover Systems

    Science.gov (United States)

    Agogino, Adrian; Turner, Kagan

    2005-01-01

    The ability to evolve fault tolerant control strategies for large collections of agents is critical to the successful application of evolutionary strategies to domains where failures are common. Furthermore, while evolutionary algorithms have been highly successful in discovering single-agent control strategies, extending such algorithms to multiagent domains has proven to be difficult. In this paper we present a method for shaping evaluation functions for agents that provide control strategies that both are tolerant to different types of failures and lead to coordinated behavior in a multi-agent setting. This method relies neither on a centralized strategy (susceptible to single points of failure) nor on a distributed strategy where each agent uses a system wide evaluation function (severe credit assignment problem). In a multi-rover problem, we show that agents using our agent-specific evaluation perform up to 500% better than agents using the system evaluation. In addition we show that agents are still able to maintain a high level of performance when up to 60% of the agents fail due to actuator, communication or controller faults.

  20. Robust Adaptive Fault-Tolerant Control of Stochastic Systems with Modeling Uncertainties and Actuator Failures

    OpenAIRE

    Wenchuan Cai; Lingling Fan; Yongduan Song

    2014-01-01

    This paper deals with the problem of fault-tolerant control (FTC) of uncertain stochastic systems subject to modeling uncertainties and actuator failures. A robust adaptive fault-tolerant controller design method based on stochastic Lyapunov theory is developed to accommodate the negative impact on system performance arising from uncertain system parameters and external disturbances as well as actuation faults. There is no need for an on-line fault detection and diagnosis (FDD) unit in the propo...

  1. Fault-Tolerant Control of a Distributed Database System

    Directory of Open Access Journals (Sweden)

    N. Eva Wu

    2007-12-01

    Optimal state information-based control policy for a distributed database system subject to server failures is considered. Fault-tolerance is made possible by the partitioned architecture of the system and data redundancy therein. Control actions include restoration of lost data sets in a single server using redundant data sets in the remaining servers, routing of queries to intact servers, or overhaul of the entire system for renewal. Control policies are determined by solving Markov decision problems with cost criteria that penalize system unavailability and slow query response. Steady-state system availability and expected query response time of the controlled database are evaluated with the Markov model of the database. Robustness is addressed by introducing additional states into the database model to account for control action delays and decision errors. A robust control policy is solved for the Markov decision problem described by the augmented state model.
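
    As a rough illustration of how steady-state availability can be read off a Markov model, the sketch below uses a much simpler model than the one in the paper: a birth-death chain whose state is the number of failed servers, with no control policy at all. The function name, the single shared repair facility, and the rule that the system is available while at most tolerated_failures servers are down are assumptions made for this example only.

```python
from typing import List


def steady_state_availability(n_servers: int,
                              fail_rate: float,
                              repair_rate: float,
                              tolerated_failures: int) -> float:
    """Availability of a birth-death Markov chain over failed-server counts.

    State i means i servers are down. Failures occur at rate
    (n_servers - i) * fail_rate; one repair at a time completes at
    rate repair_rate (assumption: a single repair facility).
    """
    # Unnormalised stationary weights from the detailed-balance equations:
    # w[i + 1] = w[i] * birth_rate(i) / repair_rate.
    weights: List[float] = [1.0]
    for i in range(n_servers):
        birth = (n_servers - i) * fail_rate
        weights.append(weights[-1] * birth / repair_rate)
    total = sum(weights)
    stationary = [w / total for w in weights]
    # The system counts as available while few enough servers are down.
    return sum(stationary[: tolerated_failures + 1])
```

    For a single server with a failure rate of 1 and a repair rate of 9 (in the same time unit) and no tolerated failures, the function gives the familiar ratio 9 / (1 + 9) = 0.9.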

  2. Fault-Tolerant Static Scheduling for Real-Time Distributed Embedded Systems

    OpenAIRE

    Girault, Alain; Lavarenne, Christophe; Sighireanu, Mihaela; Sorel, Yves

    2000-01-01

    This paper investigates fault-tolerance issues in real-time distributed embedded systems. Our goal is to propose solutions to automatically produce distributed and fault-tolerant code. We first characterize the systems considered by giving the main assumptions about the physical and logical architecture of these systems. In particular, we consider only processor failures, with a fail-stop behavior. Then, we give a state of the art of the techniques used for fault-tolerance. We also briefly pr...

  3. Industrial Cost-Benefit Assessment for Fault-tolerant Control Systems

    DEFF Research Database (Denmark)

    Thybo, C.; Blanke, M.

    1998-01-01

    Economic aspects are decisive for industrial acceptance of research concepts including the promising ideas in fault tolerant control. Fault tolerance is the ability of a system to detect, isolate and accommodate a fault, such that simple faults in a sub-system do not develop into failures at a...... system level. In a design phase for an industrial system, possibilities span from fail safe design where any single point failure is accommodated by hardware, over fault-tolerant design where selected faults are handled without extra hardware, to fault-ignorant design where no extra precaution is taken...... objective of this paper is to help, in the early product development stage, to find the economically most suitable scheme. A salient result is that with increased customer awareness of total cost of ownership, new products can benefit significantly from applying fault tolerant control principles....

  4. Fault-tolerance in Two-dimensional Topological Systems

    Science.gov (United States)

    Anderson, Jonas T.

    This thesis is a collection of ideas with the general goal of building, at least in the abstract, a local fault-tolerant quantum computer. The connection between quantum information and topology has proven to be an active area of research in several fields. The introduction of the toric code by Alexei Kitaev demonstrated the usefulness of topology for quantum memory and quantum computation. Many quantum codes used for quantum memory are modeled by spin systems on a lattice, with operators that extract syndrome information placed on vertices or faces of the lattice. It is natural to wonder whether the useful codes in such systems can be classified. This thesis presents work that leverages ideas from topology and graph theory to explore the space of such codes. Homological stabilizer codes are introduced and it is shown that, under a set of reasonable assumptions, any qubit homological stabilizer code is equivalent to either a toric code or a color code. Additionally, the toric code and the color code correspond to distinct classes of graphs. Many systems have been proposed as candidate quantum computers. It is very desirable to design quantum computing architectures with two-dimensional layouts and low complexity in parity-checking circuitry. Kitaev's surface codes provided the first example of codes satisfying this property. They provided a new route to fault tolerance with more modest overheads and thresholds approaching 1%. The recently discovered color codes share many properties with the surface codes, such as the ability to perform syndrome extraction locally in two dimensions. Some families of color codes admit a transversal implementation of the entire Clifford group. This work investigates color codes on the 4.8.8 lattice known as triangular codes. I develop a fault-tolerant error-correction strategy for these codes in which repeated syndrome measurements on this lattice generate a three-dimensional space-time combinatorial structure. 
I then develop an integer program that analyzes this structure and determines the most likely set of errors consistent with the observed syndrome values. I implement this integer program to find the threshold for depolarizing noise on small versions of these triangular codes. Because the threshold for magic-state distillation is likely to be higher than this value and because logical CNOT gates can be performed by code deformation in a single block instead of between pairs of blocks, the threshold for fault-tolerant quantum memory for these codes is also the threshold for fault-tolerant quantum computation with them. Since the advent of a threshold theorem for quantum computers much has been improved upon. Thresholds have increased, architectures have become more local, and gate sets have been simplified. The overhead for magic-state distillation has been studied, but not nearly to the extent of the aforementioned topics. A method for greatly reducing this overhead, known as reusable magic states, is studied here. While examples of reusable magic states exist for Clifford gates, I give strong reasons to believe they do not exist for non-Clifford gates.

  5. Disturbance observer based fault estimation and dynamic output feedback fault tolerant control for fuzzy systems with local nonlinear models.

    Science.gov (United States)

    Han, Jian; Zhang, Huaguang; Wang, Yingchun; Liu, Yang

    2015-11-01

    This paper addresses the problems of fault estimation (FE) and fault tolerant control (FTC) for fuzzy systems with local nonlinear models, external disturbances, and sensor and actuator faults occurring simultaneously. A disturbance observer (DO) and an FE observer are designed simultaneously. Compared with existing results, the proposed observer has a wider application range. Using the estimation information, a novel fuzzy dynamic output feedback fault tolerant controller (DOFFTC) is designed. The controller can be used for fuzzy systems with unmeasurable local nonlinear models, mismatched input disturbances, and measurement outputs affected by sensor faults and disturbances. Finally, a simulation shows the effectiveness of the proposed methods. PMID:26456728

  6. Design and analysis of reliable and fault-tolerant computer systems

    CERN Document Server

    Abd-El-Barr, Mostafa

    2006-01-01

    Covering both the theoretical and practical aspects of fault-tolerant mobile systems, and fault tolerance and analysis, this book tackles the current issues of reliability-based optimization of computer networks, fault-tolerant mobile systems, and fault tolerance and reliability of high speed and hierarchical networks. The book is divided into six parts to facilitate coverage of the material by course instructors and computer systems professionals. The sequence of chapters in each part ensures the gradual coverage of issues from the basics to the most recent developments. A useful set of refere

  7. Fault-tolerant for Electric Vehicles Drive System Sensor Failure

    Directory of Open Access Journals (Sweden)

    Zhang Liwei

    2013-10-01

    When an EV failure happens, a fault-tolerant method is needed to ensure people's safety. When the current sensor and speed sensor fail, a software fault-tolerant control strategy based on switching control algorithms can be used. This paper presents a theoretical analysis of switching from the rotor field-oriented vector control algorithm to the open-loop constant V/F control algorithm; a phase angle compensation method is used to reduce the current and torque shock, and simulation is done in MATLAB/Simulink.

  8. Validation Methods for Fault-Tolerant avionics and control systems, working group meeting 1

    Science.gov (United States)

    1979-01-01

    The proceedings of the first working group meeting on validation methods for fault tolerant computer design are presented. The state of the art in fault tolerant computer validation was examined in order to provide a framework for future discussions concerning research issues for the validation of fault tolerant avionics and flight control systems. The development of positions concerning critical aspects of the validation process is given.

  9. Fault diagnosis and fault-tolerant control strategies for non-linear systems analytical and soft computing approaches

    CERN Document Server

    Witczak, Marcin

    2014-01-01

    This book presents selected fault diagnosis and fault-tolerant control strategies for non-linear systems in a unified framework. In particular, starting from advanced state estimation strategies up to modern soft computing, the discrete-time description of the system is employed. Part I of the book presents original research results regarding state estimation and neural networks for robust fault diagnosis. Part II is devoted to the presentation of integrated fault diagnosis and fault-tolerant systems. It starts with a general fault-tolerant control framework, which is then extended by introducing robustness with respect to various uncertainties. Finally, it is shown how to implement the proposed framework for fuzzy systems described by the well-known Takagi–Sugeno models. This research monograph is intended for researchers, engineers, and advanced postgraduate students in control and electrical engineering, computer science, as well as mechanical and chemical engineering.

  10. Evaluation of digital fault-tolerant architectures for nuclear power plant control systems

    International Nuclear Information System (INIS)

    Four fault tolerant architectures were evaluated for their potential reliability in service as control systems of nuclear power plants. The reliability analyses showed that human- and software-related common cause failures and single points of failure in the output modules are dominant contributors to system unreliability. The four architectures are triple-modular-redundant (TMR), both synchronous and asynchronous, and also dual synchronous and asynchronous. The evaluation includes a review of design features, an analysis of the importance of coverage, and reliability analyses of fault tolerant systems. An advantage of fault-tolerant controllers over those that are not fault tolerant is that they continue to function after the occurrence of most single hardware faults. However, most fault-tolerant controllers have single hardware components that will cause system failure, almost all controllers have single points of failure in software, and all are subject to common cause failures. Reliability analyses based on data from several industries that have fault-tolerant controllers were used to estimate the mean-time-between-failures of fault-tolerant controllers and to predict those failure modes that may be important in nuclear power plants. 7 refs., 4 tabs
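
    As an illustration of the triple-modular-redundant idea mentioned in this record, the following is a minimal sketch, not taken from the report. It assumes three independent channels and a perfect voter; the function names `majority_vote` and `tmr_reliability` are hypothetical. The perfect-voter assumption leaves out the single points of failure and common cause failures that the abstract identifies as dominant contributors.

```python
from collections import Counter


def majority_vote(outputs):
    """Return the value reported by a strict majority of channels, or None."""
    value, count = Counter(outputs).most_common(1)[0]
    return value if count > len(outputs) // 2 else None


def tmr_reliability(r):
    """Probability that at least 2 of 3 channels work.

    Assumes independent channels with equal reliability r and a perfect
    voter (an idealization; real voters and output modules can fail).
    """
    return 3 * r**2 - 2 * r**3


if __name__ == "__main__":
    print(majority_vote([1, 1, 0]))   # one faulty channel is outvoted
    print(tmr_reliability(0.9))
```

    Under these assumptions a single faulty channel is masked, but the formula says nothing about failures shared by all channels.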

  11. Factorizing fault tolerance

    OpenAIRE

    Prasetya, I.S.W.B.; Swierstra, S.D.

    2000-01-01

    This paper presents a theory of component based development for exception-handling in fault tolerant systems. The theory is based on a general theory of composition, which enables us to factorize the temporal specification of a system into the specifications of its components. This is a new development because in the past efforts to set up such a theory have always been hindered by the problem of composing progress properties.

  12. Synthesis of Fault-Tolerant Embedded Systems with Checkpointing and Replication

    DEFF Research Database (Denmark)

    Izosimov, Viacheslav; Pop, Paul; Eles, Petru; Peng, Zebo

    We present an approach to the synthesis of fault-tolerant hard real-time systems for safety-critical applications. We use checkpointing with rollback recovery and active replication for tolerating transient faults. Processes are statically scheduled and communications are performed using the time...
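
    The following minimal sketch illustrates checkpointing with rollback recovery for transient faults in general terms. It is not the synthesis approach of this record: it has no static scheduling or replication, and the function `run_with_checkpoints` and its `fault_at` parameter are hypothetical names used only for illustration.

```python
import copy


def run_with_checkpoints(steps, state, fault_at=frozenset()):
    """Run steps in order, re-executing a step from its checkpoint on a fault.

    fault_at holds the step indices at which one transient fault is
    injected (an assumption made for this demonstration).
    """
    injected = set(fault_at)
    rollbacks = 0
    i = 0
    while i < len(steps):
        checkpoint = copy.deepcopy(state)          # save state before the step
        new_state = steps[i](copy.deepcopy(state))
        if i in injected:                          # fault detected in this step
            injected.discard(i)                    # transient: it does not recur
            state = checkpoint                     # roll back to the checkpoint
            rollbacks += 1
            continue                               # re-execute the same step
        state = new_state
        i += 1
    return state, rollbacks


if __name__ == "__main__":
    steps = [lambda s: s + [1], lambda s: s + [2]]
    print(run_with_checkpoints(steps, [], fault_at={1}))
```

    Each rollback costs one extra execution of the affected step, which is the time overhead a hard real-time schedule has to budget for.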

  13. High-Intensity Radiated Field Fault-Injection Experiment for a Fault-Tolerant Distributed Communication System

    Science.gov (United States)

    Yates, Amy M.; Torres-Pomales, Wilfredo; Malekpour, Mahyar R.; Gonzalez, Oscar R.; Gray, W. Steven

    2010-01-01

    Safety-critical distributed flight control systems require robustness in the presence of faults. In general, these systems consist of a number of input/output (I/O) and computation nodes interacting through a fault-tolerant data communication system. The communication system transfers sensor data and control commands and can handle most faults under typical operating conditions. However, the performance of the closed-loop system can be adversely affected as a result of operating in harsh environments. In particular, High-Intensity Radiated Field (HIRF) environments have the potential to cause random fault manifestations in individual avionic components and to generate simultaneous system-wide communication faults that overwhelm existing fault management mechanisms. This paper presents the design of an experiment conducted at the NASA Langley Research Center's HIRF Laboratory to statistically characterize the faults that a HIRF environment can trigger on a single node of a distributed flight control system.

  14. Towards fault-tolerant decision support systems for ship operator guidance

    DEFF Research Database (Denmark)

    Nielsen, Ulrik Dam; Lajic, Zoran; Jensen, Jørgen Juncher

    2012-01-01

    Fault detection and isolation are very important elements in the design of fault-tolerant decision support systems for ship operator guidance. This study outlines remedies that can be applied for fault diagnosis, when the ship responses are assumed to be linear in the wave excitation. A novel...

  15. Adaptive Observer-Based Fault-Tolerant Control Design for Uncertain Systems

    OpenAIRE

    Huaming Qian; Yu Peng; Mei Cui

    2015-01-01

    This study focuses on the design of the robust fault-tolerant control (FTC) system based on adaptive observer for uncertain linear time invariant (LTI) systems. In order to improve robustness, rapidity, and accuracy of traditional fault estimation algorithm, an adaptive fault estimation algorithm (AFEA) using an augmented observer is presented. By utilizing a new fault estimator model, an improved AFEA based on linear matrix inequality (LMI) technique is proposed to increase the performance. ...

  16. Fault-Tolerant Control using Adaptive Time-Frequency Method in Bearing Fault Detection for DFIG Wind Energy System

    Directory of Open Access Journals (Sweden)

    Suratsavadee Koonlaboon KORKUA

    2015-02-01

    With the advances in power electronic technology, doubly-fed induction generators (DFIG) have increasingly drawn the interest of the wind turbine industry. To ensure the reliable operation and power quality of wind power systems, the fault-tolerant control for DFIG is studied in this paper. The fault-tolerant controller is designed to maintain an acceptable level of performance during bearing fault conditions. Based on measured motor current data, an adaptive statistical time-frequency method is then used to detect the fault occurrence in the system; the controller then compensates for faulty conditions. The feature vectors, including frequency components located in the neighborhood of the characteristic fault frequencies, are first extracted and then used to estimate the next sampling stator side current, in order to better perform the current control. Early fault detection, isolation and successful reconfiguration would be very beneficial in a wind energy conversion system. The feasibility of this fault-tolerant controller has been proven by means of mathematical modeling and digital simulation based on Matlab/Simulink. The simulation results of the generator output show the effectiveness of the proposed fault-tolerant controller.

  17. Active Fault Tolerant Control of Livestock Stable Ventilation System

    DEFF Research Database (Denmark)

    Gholami, Mehdi

    2011-01-01

    degraded performance even in the faulty case. In this thesis, we have designed such controllers for climate control systems for livestock buildings in three steps: Deriving a model for the climate control system of a pig-stable. Designing an active fault diagnosis (AFD) algorithm for different kinds of...... of the hybrid model are estimated by a recursive estimation algorithm, the Extended Kalman Filter (EKF), using experimental data which was provided by an equipped laboratory. Two methods for active fault diagnosis are proposed. The AFD methods excite the system by injecting a so-called excitation...... input. In both methods, the input is designed off-line based on a sensitivity analysis in order to improve the precision of estimation of parameters associated with faults. Two different algorithms, the EKF and a new adaptive filter, are used to estimate the parameters of the system. The fault is...

  18. Designing fault-tolerant distributed archives for picture archiving and communication systems

    OpenAIRE

    Mendenhall, Rebecca; Dewey, Matt; Soutar, Ian

    2001-01-01

    Purpose: Distributed archives in a picture archiving and communication system (PACS) environment can provide added fault tolerance and fail-over capability, as well as increased load capacity at a more economical price than traditional high-availability systems. Systems can be configured with varying levels of fault tolerance, depending on the amount of redundancy desired. There is, however, a direct correlation between the level of hardware redundancy and cost to implement. This presentatio...

  19. Award ER25750: Coordinated Infrastructure for Fault Tolerance Systems Indiana University Final Report

    Energy Technology Data Exchange (ETDEWEB)

    Lumsdaine, Andrew

    2013-03-08

    The main purpose of the Coordinated Infrastructure for Fault Tolerance in Systems initiative has been to conduct research with a goal of providing end-to-end fault tolerance on a systemwide basis for applications and other system software. While fault tolerance has been an integral part of most high-performance computing (HPC) system software developed over the past decade, it has been treated mostly as a collection of isolated stovepipes. Visibility and response to faults has typically been limited to the particular hardware and software subsystems in which they are initially observed. Little fault information is shared across subsystems, allowing little flexibility or control on a system-wide basis, making it practically impossible to provide cohesive end-to-end fault tolerance in support of scientific applications. As an example, consider faults such as communication link failures that can be seen by a network library but are not directly visible to the job scheduler, or consider faults related to node failures that can be detected by system monitoring software but are not inherently visible to the resource manager. If information about such faults could be shared by the network libraries or monitoring software, then other system software, such as a resource manager or job scheduler, could ensure that failed nodes or failed network links were excluded from further job allocations and that further diagnosis could be performed. As a founding member and one of the lead developers of the Open MPI project, our efforts over the course of this project have been focused on making Open MPI more robust to failures by supporting various fault tolerance techniques, and using fault information exchange and coordination between MPI and the HPC system software stack, from the application, numeric libraries, and programming language runtime to other common system components such as job schedulers, resource managers, and monitoring tools.

  20. Dataflow models for fault-tolerant control systems

    Science.gov (United States)

    Papadopoulos, G. M.

    1984-01-01

    Dataflow concepts are used to generate a unified hardware/software model of redundant physical systems which are prone to faults. Basic results in input congruence and synchronization are shown to reduce to a simple model of data exchanges between processing sites. Procedures are given for the construction of congruence schemata, the distinguishing features of any correctly designed redundant system.

  1. Evaluation of digital fault-tolerant architectures for nuclear power plant control systems

    International Nuclear Information System (INIS)

    This paper reports on four fault-tolerant architectures that were evaluated for their potential reliability in service as control systems of nuclear power plants. The reliability analyses showed that human- and software-related common cause failures and single points of failure in the output modules are dominant contributors to system unreliability. The four architectures are triple-modular-redundant, both synchronous and asynchronous, and also dual synchronous and asynchronous. The evaluation includes a review of design features, an analysis of the importance of coverage, and reliability analyses of fault-tolerant systems. Reliability analyses based on data from several industries that have fault-tolerant controllers were used to estimate the mean-time-between-failures of fault-tolerant controllers and to predict those failure modes that may be important in nuclear power plants

  2. Fault tolerant control system design by using clustering algorithms of data mining

    OpenAIRE

    Umut Altınışık; Mehmet Yıldırım

    2013-01-01

    In this study, two clustering algorithms and their success in fault isolation have been investigated for use in our fault tolerant control (FTC) system. With so many applications used today, the mathematical model of the system cannot be completely established. Therefore, in this study, fault detection and isolation (FDI) is realized by using knowledge-based methods, without the need for any mathematical model. Sensor data, which are taken offline by FDI, are clustered to create knowl...

  3. Passive Fault-tolerant Control of Discrete-time Piecewise Affine Systems against Actuator Faults

    DEFF Research Database (Denmark)

    Tabatabaeipour, Seyed Mojtaba; Izadi-Zamanabadi, Roozbeh; Bak, Thomas; Ravn, Anders Peter

    2012-01-01

    the existence of a passive fault-tolerant controller is derived and formulated as the feasibility of a set of linear matrix inequalities (LMIs). The upper bound on the performance cost can be minimized using a convex optimization problem with LMI constraints which can be solved efficiently. The...

  4. Active fault tolerant control of piecewise affine systems with reference tracking and input constraints

    DEFF Research Database (Denmark)

    Gholami, M.; Cocquempot, V.; Schiøler, H.; Bak, Thomas

    2014-01-01

    An active fault tolerant control (AFTC) method is proposed for discrete-time piecewise affine (PWA) systems. Only actuator faults are considered. The AFTC framework contains a supervisory scheme, which selects a suitable controller in a set of controllers such that the stability and an acceptable...

  5. Application-driven co-design of fault-tolerant industrial systems

    OpenAIRE

    Restrepo Calle, Felipe; Martínez Álvarez, Antonio; Guzmán Miranda, Hipólito; Palomo Pinto, Francisco Rogelio; Cuenca Asensi, Sergio

    2010-01-01

    This paper presents a novel methodology for the HW/SW co-design of fault tolerant embedded systems that pursues the mitigation of radiation-induced upset events (which are a class of Single Event Effects - SEEs) on critical industrial applications. The proposal combines the flexibility and low cost of Software Implemented Hardware Fault Tolerance (SIHFT) techniques with the high reliability of selective hardware replication. The co-design flow is supported by a hardening platform that compris...

  6. Robust Fault Tolerant Control for a Class of Time-Delay Systems with Multiple Disturbances

    OpenAIRE

    Songyin Cao; Jianzhong Qiao

    2013-01-01

    A robust fault tolerant control (FTC) approach is addressed for a class of nonlinear systems with time delay, actuator faults, and multiple disturbances. The first part of the multiple disturbances is supposed to be an uncertain modeled disturbance and the second one represents a norm-bounded variable. First, a composite observer is designed to estimate the uncertain modeled disturbance and actuator fault simultaneously. Then, an FTC strategy consisting of disturbance observer based control (...

  7. Diagnosis and fault-tolerant control

    CERN Document Server

    Blanke, Mogens; Lunze, Jan; Staroswiecki, Marcel

    2016-01-01

    Fault-tolerant control aims at a gradual shutdown response in automated systems when faults occur. It satisfies the industrial demand for enhanced availability and safety, in contrast to traditional reactions to faults, which bring about sudden shutdowns and loss of availability. The book presents effective model-based analysis and design methods for fault diagnosis and fault-tolerant control. Architectural and structural models are used to analyse the propagation of the fault through the process, to test the fault detectability and to find the redundancies in the process that can be used to ensure fault tolerance. It also introduces design methods suitable for diagnostic systems and fault-tolerant controllers for continuous processes that are described by analytical models of discrete-event systems represented by automata. The book is suitable for engineering students, engineers in industry and researchers who wish to get an overview of the variety of approaches to process diagnosis and fault-tolerant contro...

  8. Fault-Tolerant Consensus of Multi-Agent System With Distributed Adaptive Protocol.

    Science.gov (United States)

    Chen, Shun; Ho, Daniel W C; Li, Lulu; Liu, Ming

    2015-10-01

    In this paper, fault-tolerant consensus in multi-agent systems using a distributed adaptive protocol is investigated. Firstly, distributed adaptive online updating strategies for some parameters are proposed based on local information of the network structure. Then, under the online updating parameters, a distributed adaptive protocol is developed to compensate for the fault effects and the uncertainty effects in the leaderless multi-agent system. Based on the local state information of neighboring agents, a distributed updating protocol gain is developed which leads to a fully distributed continuous adaptive fault-tolerant consensus protocol design for the leaderless multi-agent system. Furthermore, a distributed fault-tolerant leader-follower consensus protocol for multi-agent systems is constructed by the proposed adaptive method. Finally, a simulation example is given to illustrate the effectiveness of the theoretical analysis. PMID:25415998

  9. Reliability Evaluation Methodologies of Fault Tolerant Techniques of Digital I and C Systems in Nuclear Power Plants

    International Nuclear Information System (INIS)

    Now that the reactor protection system has been changed from analog to digital, the digital reactor protection system has 4 redundant channels, and each channel has several modules. Because the DPPS uses complex components, various fault-tolerant techniques are needed to improve the availability and reliability of the digital system. Several studies have examined the effects of fault-tolerant techniques. However, these effects have not yet been properly considered in most fault tree models. The fault-tolerant techniques used in digital systems in NPPs should be reflected in fault tree analysis to obtain lower system unavailability and a more reliable PSA. When fault-tolerant techniques are modeled in a fault tree, the categorization of the modules covered by each technique, the fault coverage, the detection period, and the fault recovery should be considered. Further work will concentrate on various aspects of fault tree modeling. We will look for other important factors and develop a new theory for constructing the fault tree model

  10. Advanced information processing system: The Army fault tolerant architecture conceptual study. Volume 1: Army fault tolerant architecture overview

    Science.gov (United States)

    Harper, R. E.; Alger, L. S.; Babikyan, C. A.; Butler, B. P.; Friend, S. A.; Ganska, R. J.; Lala, J. H.; Masotto, T. K.; Meyer, A. J.; Morton, D. P.

    1992-01-01

    Digital computing systems needed for Army programs such as the Computer-Aided Low Altitude Helicopter Flight Program and the Armored Systems Modernization (ASM) vehicles may be characterized by high computational throughput and input/output bandwidth, hard real-time response, high reliability and availability, and maintainability, testability, and producibility requirements. In addition, such a system should be affordable to produce, procure, maintain, and upgrade. To address these needs, the Army Fault Tolerant Architecture (AFTA) is being designed and constructed under a three-year program comprised of a conceptual study, detailed design and fabrication, and demonstration and validation phases. Described here are the results of the conceptual study phase of the AFTA development. Given here is an introduction to the AFTA program, its objectives, and key elements of its technical approach. A format is designed for representing mission requirements in a manner suitable for first order AFTA sizing and analysis, followed by a discussion of the current state of mission requirements acquisition for the targeted Army missions. An overview is given of AFTA's architectural theory of operation.

  11. System Diagnosis and Fault Tolerance for Distributed Computing System: A Review

    OpenAIRE

    Nilotpal Baruah; Dr. Lakshmi P. Saikia; Dr. K. Hemachandran

    2013-01-01

    This review covers an adaptive system-diagnosis and fault tolerance method for distributed systems. The system is comprised of a network including N nodes, where N is an integer greater than or equal to 3, and each node is able to execute an algorithm to communicate with the network. A computer network, often simply referred to as a network, is a collection of hardware components and computers interconnected by communication channels that allow sharing of resources and information. As a computer network is a collection of...

  12. Fault-tolerant interconnection network and image-processing applications for the PASM parallel processing system

    International Nuclear Information System (INIS)

    The demand for very high speed data processing coupled with falling hardware costs has made large-scale parallel and distributed computer systems both desirable and feasible. Two modes of parallel processing are single instruction stream-multiple data stream (SIMD) and multiple instruction stream-multiple data stream (MIMD). PASM, a partitionable SIMD/MIMD system, is a reconfigurable multimicroprocessor system being designed for image processing and pattern recognition. An important component of these systems is the interconnection network, the mechanism for communication among the computation nodes and memories. Assuring high reliability for such complex systems is a significant task. Thus, a crucial practical aspect of an interconnection network is fault tolerance. In answer to this need, the Extra Stage Cube (ESC), a fault-tolerant, multistage cube-type interconnection network, is defined. The fault tolerance of the ESC is explored for both single and multiple faults, routing tags are defined, and consideration is given to permuting data and partitioning the ESC in the presence of faults. The ESC is compared with other fault-tolerant multistage networks. Finally, reliability of the ESC and an enhanced version of it are investigated

  13. Validation Methods Research for Fault-Tolerant Avionics and Control Systems: Working Group Meeting, 2

    Science.gov (United States)

    Gault, J. W. (Editor); Trivedi, K. S. (Editor); Clary, J. B. (Editor)

    1980-01-01

    The validation process comprises the activities required to ensure the agreement of system realization with system specification. A preliminary validation methodology for fault tolerant systems is documented. A general framework for a validation methodology is presented along with a set of specific tasks intended for the validation of two specimen systems, SIFT and FTMP. Two major areas of research are identified: first, those activities required to support the ongoing development of the validation process itself, and second, those activities required to support the design, development, and understanding of fault tolerant systems.

  14. Pivotal decomposition for reliability analysis of fault tolerant control systems on unmanned aerial vehicles

    International Nuclear Information System (INIS)

    In this paper, we describe a framework to efficiently assess the reliability of fault tolerant control systems on low-cost unmanned aerial vehicles. The analysis is developed for a system consisting of a fixed number of actuators. In addition, the system includes a scheme to detect failures in individual actuators and, as a consequence, switch between different control algorithms for automatic operation of the actuators. Existing dynamic reliability analysis methods are insufficient for this class of systems because the coverage parameters for different actuator failures can be time-varying, correlated, and difficult to obtain in practice. We address these issues by combining new fault detection performance metrics with pivotal decomposition. These new metrics capture the interactions in different fault detection channels, and can be computed from stochastic models of fault detection algorithms. Our approach also decouples the high dimensional analysis problem into low dimensional sub-problems, yielding a computationally efficient analysis. Finally, we demonstrate the proposed method on a numerical example. The analysis results are also verified by Monte Carlo simulations. - Highlights: • We study fault tolerant control (FTC) systems on low-cost unmanned aerial vehicles. • We build a reliability structure model for FTC systems. • New fault detection performance metrics are integrated via pivotal decomposition. • The fault detection metrics capture the interactions in fault detection channels. • Numerical results show that FTC techniques can improve system reliability
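
    Pivotal decomposition in its textbook form conditions on one component at a time: R = p · R(system | component works) + (1 − p) · R(system | component fails). The sketch below is a minimal illustration under the assumption of independent components with fixed probabilities, which is simpler than the time-varying, correlated coverage parameters treated in this record. The names `system_reliability` and `two_out_of_three` are hypothetical.

```python
def system_reliability(structure, probs):
    """Compute system reliability by pivoting on each component in turn.

    structure: function mapping a tuple of booleans (component states)
               to True if the system works.
    probs:     per-component working probabilities (assumed independent).
    """
    def recurse(fixed):
        i = len(fixed)
        if i == len(probs):
            # All component states are fixed: evaluate the structure function.
            return 1.0 if structure(fixed) else 0.0
        p = probs[i]
        # Pivot on component i: condition on it working, then on it failing.
        return p * recurse(fixed + (True,)) + (1 - p) * recurse(fixed + (False,))

    return recurse(())


def two_out_of_three(states):
    """Example structure: the system works if at least two components work."""
    return sum(states) >= 2


if __name__ == "__main__":
    print(system_reliability(two_out_of_three, [0.9, 0.9, 0.9]))
```

    Each pivot splits the problem into two smaller conditional problems, which is the decoupling idea the abstract refers to; this naive recursion enumerates all states and is only meant for small examples.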

  15. A novel mathematical setup for fault tolerant control systems with state-dependent failure process

    International Nuclear Information System (INIS)

    In this paper, we consider a fault tolerant control system (FTCS) with state-dependent failures and provide a tractable mathematical model to handle the state-dependent failures. By assuming abrupt changes in system parameters, we use a jump process modelling of failure process and the fault detection and isolation (FDI) process. In particular, we assume that the failure rates of the failure process vary according to which set the state of the system belongs to

  16. A Piecewise Affine Hybrid Systems Approach to Fault Tolerant Satellite Formation Control

    DEFF Research Database (Denmark)

    Grunnet, Jacob Deleuran; Larsen, Jesper Abildgaard; Bak, Thomas; Wisniewski, Rafal

    2008-01-01

    In this paper a procedure for modelling satellite formations including failure dynamics as a piecewise-affine hybrid system is shown. The formulation enables recently developed methods and tools for control and analysis of piecewise-affine systems to be applied leading to synthesis of fau...... tolerant controllers and analysis of the system behaviour given possible faults. The method is illustrated using a simple example involving two satellites trying to reach a specific formation despite actuator faults occurring....

  17. Application of Joint Parameter Identification and State Estimation to a Fault-Tolerant Robot System

    DEFF Research Database (Denmark)

    Sun, Zhen; Yang, Zhenyu

    The joint parameter identification and state estimation technique is applied to develop a fault-tolerant space robot system. The potential faults in the considered system are abrupt parametric faults, which indicate that some system parameters will immediately deviate from their nominal values if a......, it would further simplify the reconfigurable design task and possibly speed up the system recovery, if the system state information under the new operating circumstance can be available along with faulty parameter information. The joint parameter identification and state estimation using the combined...

  18. Fault detection and fault tolerant control of a smart base isolation system with magneto-rheological damper

    International Nuclear Information System (INIS)

    Fault detection and isolation (FDI) in real-time systems can provide early warnings for faulty sensors and actuator signals to prevent events that lead to catastrophic failures. The main objective of this paper is to develop FDI and fault tolerant control techniques for base isolation systems with magneto-rheological (MR) dampers. Thus, this paper presents a fixed-order FDI filter design procedure based on linear matrix inequalities (LMI). The necessary and sufficient conditions for the existence of a solution for detecting and isolating faults using the H∞ formulation is provided in the proposed filter design. Furthermore, an FDI-filter-based fuzzy fault tolerant controller (FFTC) for a base isolation structure model was designed to preserve the pre-specified performance of the system in the presence of various unknown faults. Simulation and experimental results demonstrated that the designed filter can successfully detect and isolate faults from displacement sensors and accelerometers while maintaining excellent performance of the base isolation technology under faulty conditions

  19. Fault-tolerant Stabilization for Linear System with Time Delay

    Directory of Open Access Journals (Sweden)

    Shaohua Wang

    2013-03-01

    In this note, the FTC problem for time-delay systems with a special sensor failure model is investigated. First, based on the Lyapunov stability theorem, a stability condition for the closed-loop system is obtained by constructing a proper LKF and using an integral inequality. Second, by using a nonlinear transformation and the cone complementary linearization algorithm, a controller existence condition for the time-delay system is obtained in terms of LMIs, which guarantees asymptotic stability of the closed-loop system even if sensor faults occur; the controller parameters are also given. Finally, an example is given to show the effectiveness of the proposed methods.

  20. Energy-efficient fault tolerance in multiprocessor real-time systems

    Science.gov (United States)

    Guo, Yifeng

    The recent progress in the multiprocessor/multicore systems has important implications for real-time system design and operation. From vehicle navigation to space applications as well as industrial control systems, the trend is to deploy multiple processors in real-time systems: systems with 4 -- 8 processors are common, and it is expected that many-core systems with dozens of processing cores will be available in near future. For such systems, in addition to general temporal requirement common for all real-time systems, two additional operational objectives are seen as critical: energy efficiency and fault tolerance. An intriguing dimension of the problem is that energy efficiency and fault tolerance are typically conflicting objectives, due to the fact that tolerating faults (e.g., permanent/transient) often requires extra resources with high energy consumption potential. In this dissertation, various techniques for energy-efficient fault tolerance in multiprocessor real-time systems have been investigated. First, the Reliability-Aware Power Management (RAPM) framework, which can preserve the system reliability with respect to transient faults when Dynamic Voltage Scaling (DVS) is applied for energy savings, is extended to support parallel real-time applications with precedence constraints. Next, the traditional Standby-Sparing (SS) technique for dual processor systems, which takes both transient and permanent faults into consideration while saving energy, is generalized to support multiprocessor systems with arbitrary number of identical processors. Observing the inefficient usage of slack time in the SS technique, a Preference-Oriented Scheduling Framework is designed to address the problem where tasks are given preferences for being executed as soon as possible (ASAP) or as late as possible (ALAP). 
    A preference-oriented earliest deadline (POED) scheduler is proposed and its application in multiprocessor systems for energy-efficient fault tolerance is investigated, where tasks' main copies are executed ASAP while backup copies are executed ALAP to reduce the overlapped execution of main and backup copies of the same task and thus reduce energy consumption. All proposed techniques are evaluated through extensive simulations and compared with other state-of-the-art approaches. The simulation results confirm that the proposed schemes can preserve the system reliability while still achieving substantial energy savings. Finally, for both SS and POED based Energy-Efficient Fault-Tolerant (EEFT) schemes, a series of recovery strategies are designed for cases where more than one fault (transient or permanent) needs to be tolerated.

  1. Adaptive Fault-Tolerant Tracking Control of Nonaffine Nonlinear Systems with Actuator Failure

    OpenAIRE

    Hongcheng Zhou; Dezhi Xu; Daobo Wang; Le Ge

    2014-01-01

    This paper proposes an adaptive fault-tolerant control scheme for nonaffine nonlinear systems. A model approximation method that bridges the gap between affine and nonaffine control systems is first developed. A joint estimation approach based on the unscented Kalman filter is used, in which both failure parameters and states are simultaneously estimated by means of the augmented state vector composed of the unknown faults and states. Then, stability analysis is given for the clos...

  2. Preface of the special issue on Advances in Control and Fault-Tolerant Systems

    OpenAIRE

    Korbicz, Jozef; Maquin, Didier; THEILLIOL, DIDIER

    2012-01-01

    Today's automatic control systems are of high degrees of integration, complexity, embedding and networking of heterogeneous entities. This trend is driven by the industrial needs for achieving new technical performance and meeting additional performance demands. A most critical and important issue surrounding the design and operation of complex automatic systems is the application of Fault Detection and Isolation and Fault-Tolerant Control (FDI/FTC) technology, aiming at guaranteeing high sys...

  3. Active Fault Tolerant Control-FTC-Design for Takagi-Sugeno Fuzzy Systems with Weighting Functions Depending on the FTC

    Directory of Open Access Journals (Sweden)

    Atef Khedher

    2011-05-01

    In this paper the problem of active fault tolerant control design for noisy systems described by Takagi-Sugeno fuzzy models is studied. The proposed control strategy is based on knowledge of the estimated fault and of the error between the faulty system state and a reference system state. The considered systems are affected by actuator and sensor faults and have weighting functions that depend on the fault tolerant control. A mathematical transformation is used to construct an augmented system in which all the faults affecting the initial system appear as actuator faults. Then, an adaptive proportional integral observer is used to estimate the state and the faults. The design of the proportional integral observer and of the fault tolerant control strategy is formulated in terms of linear matrix inequalities, which can be solved easily. To illustrate the proposed method, it is applied to a three-tank system.

  4. Novel fault tolerant modular system architecture for I and C applications

    International Nuclear Information System (INIS)

    Novel fault tolerant 3U modular system architecture has been developed for safety related and safety critical I and C systems of the reactor. Design innovatively utilizes simplest multi-drop serial bus called Inter-Integrated Circuits (I2C) Bus for system operation with simplicity, fault tolerance and online maintainability (hot swap). I2C bus failure modes analysis was done and system design was hardened for possible failure modes. System backplane uses only passive components, dual redundant I2C buses, data consistency checks and geographical addressing scheme to tackle bus lock ups/stuck buses and bit flips in data transactions. Dual CPU active/standby redundancy architecture with hot swap implements tolerance for CPU software stuck up conditions and hardware faults. System cards implement hot swap for online maintainability, power supply fault containment, communication buses fault containment and I/O channel to channel isolation and independency. Typical applications for pure hardwired (without real time software) Core Temperature Monitoring System for FBRs, as a Universal Signal Conditioning System for safety related I and C systems and as a complete control system for non nuclear safety systems have also been discussed. (author)

  5. Design of fault tolerant control system for steam generator using fuzzy logic

    International Nuclear Information System (INIS)

    A controller and sensor fault tolerant system for a steam generator is designed with fuzzy logic. The proposed fault tolerant redundant system is composed of a supervisor and two fuzzy weighting modulators. The supervisor alternately checks controller-induced and sensor-induced performance to identify which part, a controller or a sensor, is faulty. To analyze controller-induced performance, both the error and the change in error of the system output are chosen as fuzzy variables. The fuzzy logic for sensor-induced performance uses two variables: the deviation between two sensor outputs and its frequency. Each fuzzy weighting modulator generates an output signal compensated for a faulty input signal. Simulations show that the proposed fault tolerant control scheme for a steam generator regulates the water level well by suppressing the effect of faults in either controllers or sensors. Therefore, by duplicating sensors and controllers with the proposed fault tolerant scheme, the reliability of both the steam generator control and sensor system and the power plant increases

  6. The study of hardware redundancy techniques to provide a fault tolerant system

    OpenAIRE

    Sadeghi, Mostafa; SOLTAN, Hossein; KHAYYAMBASHI, Mohamadreza

    2015-01-01

    Increasing the reliability of computer system operation is feasible by means of fault tolerance. This tolerance in a digital system is achieved through redundancy in hardware, software, or computation. Such redundancy can be implemented in a static, dynamic, or hybrid configuration. Hardware redundancy is obtained by providing two or more physical instances of a hardware component. In this paper, we study different hardware redundancy techniques, their efficiency, and their problems.

  7. A Survey on Software Fault tolerance in Parallel Computing

    OpenAIRE

    Jashan Deep; Dr. Rajiv Mahajan

    2013-01-01

    Software almost inevitably contains defects. Do everything possible to reduce the fault rate; use fault-tolerance techniques to deal with software faults. Fault tolerance is the ability of a system to perform its function correctly even in the presence of internal faults. Most ordinary systems lack a fault tolerant software fix. This paper surveys various software fault tolerance techniques and methodologies. The conventional fault tolerant approaches viz., Recovery Block (RB), N Version ...

  8. Fault tolerance control of phase current in permanent magnet synchronous motor control system

    Science.gov (United States)

    Chen, Kele; Chen, Ke; Chen, Xinglong; Li, Jinying

    2014-08-01

    As photoelectric tracking systems move from earth-based platforms to all kinds of moving platforms (plane-based, ship-based, car-based, satellite-based and missile-based), a fault tolerance control system for the phase current sensors is studied in order to detect and handle phase current sensor failures on a moving platform. By using a DC-link current sensor and the switching state of the corresponding SVPWM inverter, failure detection and fault control of the three phase current sensors are achieved. Fault tolerance can be maintained under one, two, or three sensor failures. The reason why an error exists between the fault tolerance control value and the actual phase current under this method is analyzed, and a solution to reduce the error is provided. An experiment based on a permanent magnet synchronous motor system is conducted, and the method is shown to detect phase current sensor failures effectively and precisely while simultaneously providing fault tolerance. With this method, even if all three phase current sensors malfunction, the moving platform can still work by reconstructing the phase current of the motor.
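
    The reconstruction idea can be illustrated with the ideal inverter relation i_dc = S_a i_a + S_b i_b + S_c i_c, where S_x is 1 when the upper switch of leg x is on. The Python sketch below is not the authors' algorithm: it assumes ideal switches, a three-wire motor with i_a + i_b + i_c = 0, and DC-link samples taken during two different active vectors of one PWM period. All function names are hypothetical.

```python
def phase_from_dc_link(state, i_dc):
    """Map one DC-link sample to a phase current.

    state: tuple (Sa, Sb, Sc) of upper-switch states, each 0 or 1.
    Returns (phase_name, current), or None for the zero vectors
    (0,0,0) and (1,1,1), which carry no phase information.
    """
    n_on = sum(state)
    if n_on == 1:
        # Only one upper switch conducts: i_dc is that phase current.
        return "abc"[state.index(1)], i_dc
    if n_on == 2:
        # Two upper switches conduct: i_dc = -(current of the remaining phase).
        return "abc"[state.index(0)], -i_dc
    return None


def reconstruct_phase_currents(samples):
    """Rebuild i_a, i_b, i_c from (state, i_dc) samples of two active vectors."""
    currents = {}
    for state, i_dc in samples:
        result = phase_from_dc_link(state, i_dc)
        if result is not None:
            currents[result[0]] = result[1]
    if len(currents) == 2:
        # Third phase follows from i_a + i_b + i_c = 0.
        missing = (set("abc") - set(currents)).pop()
        currents[missing] = -sum(currents.values())
    return currents


# Example with true currents i_a = 1.0, i_b = -0.4, i_c = -0.6.
estimate = reconstruct_phase_currents([((1, 0, 0), 1.0), ((1, 1, 0), 0.6)])
```

    In practice the samples from the two active vectors are taken at slightly different instants, which is one plausible source of the reconstruction error mentioned in the abstract.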

  9. Scheduling and Optimization of Fault-Tolerant Embedded Systems with Transparency/Performance Trade-Offs

    DEFF Research Database (Denmark)

    Izosimov, Viacheslav; Pop, Paul; Eles, Petru; Peng, Zebo

    2012-01-01

    In this article, we propose a strategy for the synthesis of fault-tolerant schedules and for the mapping of fault-tolerant applications. Our techniques handle transparency/performance trade-offs and use the fault-occurrence information to reduce the overhead due to fault tolerance. Processes and...

  10. Fault Injection and Monitoring Capability for a Fault-Tolerant Distributed Computation System

    Science.gov (United States)

    Torres-Pomales, Wilfredo; Yates, Amy M.; Malekpour, Mahyar R.

    2010-01-01

    The Configurable Fault-Injection and Monitoring System (CFIMS) is intended for the experimental characterization of effects caused by a variety of adverse conditions on a distributed computation system running flight control applications. A product of research collaboration between NASA Langley Research Center and Old Dominion University, the CFIMS is the main research tool for generating actual fault response data with which to develop and validate analytical performance models and design methodologies for the mitigation of fault effects in distributed flight control systems. Rather than a fixed design solution, the CFIMS is a flexible system that enables the systematic exploration of the problem space and can be adapted to meet the evolving needs of the research. The CFIMS has the capabilities of system-under-test (SUT) functional stimulus generation, fault injection and state monitoring, all of which are supported by a configuration capability for setting up the system as desired for a particular experiment. This report summarizes the work accomplished so far in the development of the CFIMS concept and documents the first design realization.

  11. Fault Tolerant Computer Architecture

    CERN Document Server

    Sorin, Daniel

    2009-01-01

    For many years, most computer architects have pursued one primary goal: performance. Architects have translated the ever-increasing abundance of ever-faster transistors provided by Moore's law into remarkable increases in performance. Recently, however, the bounty provided by Moore's law has been accompanied by several challenges that have arisen as devices have become smaller, including a decrease in dependability due to physical faults. In this book, we focus on the dependability challenge and the fault tolerance solutions that architects are developing to overcome it. The two main purposes

  12. Fault tolerant distributed real time computer systems for I and C of prototype fast breeder reactor

    International Nuclear Information System (INIS)

    Highlights: • Architecture of distributed real time computer system (DRTCS) used in I and C of PFBR is explained. • Fault tolerant (hot standby) architecture, fault detection and switch over are detailed. • Scaled down model was used to study functional and performance requirements of DRTCS. • Quality of service parameters for scaled down model was critically studied. - Abstract: Prototype fast breeder reactor (PFBR) is in the advanced stage of construction at Kalpakkam, India. Three-tier architecture is adopted for instrumentation and control (I and C) of PFBR wherein bottom tier consists of real time computer (RTC) systems, middle tier consists of process computers and top tier constitutes of display stations. These RTC systems are geographically distributed and networked together with process computers and display stations. Hot standby architecture comprising of dual redundant RTC systems with switch over logic system is deployed in order to achieve fault tolerance. Fault tolerant dual redundant network connectivity is provided in each RTC system and TCP/IP protocol is selected for network communication. In order to assess the performance of distributed RTC systems, scaled down model was developed with 9 representative systems and nearly 15% of I and C signals of PFBR were connected and monitored. Functional and performance testing were carried out for each RTC system and the fault tolerant characteristics were studied by creating various faults into the system and observed the performance. Various quality of service parameters like connection establishment delay, priority parameter, transit delay, throughput, residual error ratio, etc., are critically studied for the network

  13. Fault-tolerant system considerations for a redundant strapdown inertial measurement unit

    Science.gov (United States)

    Motyka, P.; Ornedo, R.; Mangoubi, R.

    1984-01-01

    The development and evaluation of a fault-tolerant system for the Redundant Strapdown Inertial Measurement Unit (RSDIMU) being developed and evaluated by the NASA Langley Research Center was continued. The RSDIMU consists of four two-degree-of-freedom gyros and accelerometers mounted on the faces of a semi-octahedron which can be separated into two halves for damage protection. Compensated and uncompensated fault-tolerant system failure decision algorithms were compared. An algorithm to compensate for sensor noise effects in the fault-tolerant system thresholds was evaluated via simulation. The effects of sensor location and magnitude of the vehicle structural modes on system performance were assessed. A threshold generation algorithm, which incorporates noise compensation and filtered parity equation residuals for structural mode compensation, was evaluated. The effects of the fault-tolerant system on navigational accuracy were also considered. A sensor error parametric study was performed in an attempt to improve the soft failure detection capability without obtaining false alarms. Also examined was an FDI system strategy based on the pairwise comparison of sensor measurements. This strategy has the specific advantage of, in many instances, successfully detecting and isolating up to two simultaneously occurring failures.

  14. An Efficient Fault Tolerance System Design for Cmos/Nanodevice Digital Memories

    Directory of Open Access Journals (Sweden)

    D. Kavitha

    2014-11-01

    Targeting future fault-prone hybrid CMOS/nanodevice digital memories, this paper presents two fault-tolerance design approaches that integrally address tolerance for defects and transient faults. These two approaches share several key features, including the use of a group of Bose-Chaudhuri-Hocquenghem (BCH) codes for both defect tolerance and transient fault tolerance, and the integration of BCH code selection with dynamic logical-to-physical address mapping. A new model of BCH decoder is proposed to reduce the area and simplify the computational scheduling of both the syndrome and Chien search blocks without parallelism, leading to high throughput. The goal of fault tolerant computing is to improve the dependability of systems, where dependability can be defined as the ability of a system to deliver service at an acceptable level of confidence in either the presence or absence of faults. The results of simulation and implementation using Xilinx ISE software and the LCD screen on the FPGA board are shown at the end.

  15. STUDIES ON CONFIGURATION AND RECOVERY TECHNIQUES FOR FAULT-TOLERANT COMPUTING SYSTEMS

    OpenAIRE

    福本, 聡; フクモト, サトシ; Fukumoto, Satoshi

    1992-01-01

    It is of great importance to operate a computer system with high reliability. Several techniques to achieve the high reliability of a computer system have been proposed and implemented in the real computer systems. This dissertation discusses configuration and recovery techniques for fault-tolerant computing systems, for which stochastic models are presented to evaluate performance and/or reliability. Chapter 1 gives introduction for configuration and recovery techniques based on the concept ...

  16. Robust fault-tolerant H∞ control of active suspension systems with finite-frequency constraint

    Science.gov (United States)

    Wang, Rongrong; Jing, Hui; Karimi, Hamid Reza; Chen, Nan

    2015-10-01

    In this paper, the robust fault-tolerant (FT) H∞ control problem of active suspension systems with finite-frequency constraint is investigated. A full-car model is employed in the controller design such that the heave, pitch and roll motions can be simultaneously controlled. Both the actuator faults and external disturbances are considered in the controller synthesis. As the human body is more sensitive to the vertical vibration in 4-8 Hz, robust H∞ control with this finite-frequency constraint is designed. Other performances such as suspension deflection and actuator saturation are also considered. As some of the states such as the sprung mass pitch and roll angles are hard to measure, a robust H∞ dynamic output-feedback controller with fault tolerant ability is proposed. Simulation results show the performance of the proposed controller.

  17. BYZANTINE FAULT TOLERANCE MODEL FOR SOAP FAULTS

    Directory of Open Access Journals (Sweden)

    V. Ramachandran

    2012-04-01

    The proposed model configures a Byzantine fault tolerance mechanism for every SOAP fault message that is transmitted. Reliability and availability are major requirements of Web services, since they operate in a distributed environment. One of the reliability issues is handling faults. Faults occur in all phases of Service Oriented Architecture, i.e. during publishing, discovery, composition, binding, and execution. These faults may lead to service downtime, abnormal behaviour, and incorrect responses. These abnormalities are classified as Byzantine faults in Web services. Even though the SOAP specification provides fault handling mechanisms, the correctness of the received SOAP fault messages is not known. In this paper, a model is proposed to check the correctness of the received SOAP fault message by incorporating Byzantine agreement for fault tolerance. The existing fault tolerant mechanism detects server failure and routes the request to the next available server without the knowledge of the client. The proposed model ensures a transparent environment by providing fault handling information to the client. This is achieved by incorporating an active replication technique.
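
    A common client-side rule in Byzantine fault tolerant replication is to accept a reply only when at least f + 1 replicas return the same payload, where f is the assumed maximum number of faulty replicas; at least one of those f + 1 replicas must then be correct. The Python sketch below applies that rule to SOAP fault strings. It is an illustration under that assumption, not the model proposed in the paper, and the function name and payload strings are hypothetical.

```python
from collections import Counter


def accept_reply(replies, f):
    """Return the payload reported by at least f + 1 replicas, else None.

    replies: list of payloads, one per distinct replica (e.g. SOAP fault strings).
    f: assumed upper bound on the number of Byzantine (arbitrarily faulty) replicas.
    """
    if not replies:
        return None
    payload, count = Counter(replies).most_common(1)[0]
    # With at most f faulty replicas, f + 1 matching replies include a correct one.
    return payload if count >= f + 1 else None


# Four replicas (3f + 1 with f = 1); one replica returns a wrong response.
decision = accept_reply(
    ["soap:Receiver timeout", "soap:Receiver timeout", "OK", "soap:Receiver timeout"],
    f=1,
)
```

    When no payload reaches f + 1 votes the function returns None, and a client would typically wait for more replies or retry.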

  18. Fault Tolerance Mobile Agent System Using Witness Agent in 2-Dimensional Mesh Network

    Directory of Open Access Journals (Sweden)

    Ahmad Rostami

    2010-09-01

    Mobile agents are computer programs that act autonomously on behalf of a user or owner and travel through a network of heterogeneous machines. Fault tolerance is important along their itinerary. In this paper, existing methods of fault tolerance for mobile agents, which are designed for a linear network topology, are described. In these methods, three types of agents cooperate to detect and recover from server and agent failures: the actual agent, which performs programs for its owner; the witness agent, which monitors the actual agent and the witness agent after itself; and the probe, which is sent on behalf of the witness agent to recover the actual agent or a witness agent. The communication mechanism in these methods is message passing between the agents. We introduce our witness agent approach for fault-tolerant mobile agent systems in a Two-Dimensional Mesh (2D-Mesh) network. Our approach minimizes witness dependency in this network, and its algorithm is then presented.

  19. A Systematic Approach to Sensitivity Analysis of Fault Tolerant Systems in NMR Architecture

    Directory of Open Access Journals (Sweden)

    Kourosh Aslansefat

    2015-01-01

    A fault tree illustrates the ways in which a system can fail. It states the different ways in which combinations of faulty components result in an undesired event in the system. Used in phases such as the design and operation of industrial systems, it enables designers to evaluate dependability attributes such as reliability, MTTF and sensitivity. In addition, the fault tree is a systematic method for finding system bottlenecks and weak points. In spite of its extensive use in evaluating the reliability of systems, the fault tree is rarely used in calculating sensitivity. In the last decade, few studies have been conducted in this field, and those methods are not systematic and are not applicable to large-scale systems. This paper provides a systematic method for evaluating system sensitivity through the fault tree. It then introduces the sensitivity of the NMR architecture, one of the common fault tolerance structures used for enhancing systems' reliability, safety and availability in industry. The article presents a comprehensive and parameterized formula for the sensitivity of the NMR structure. The presented method can help engineers who design and operate reliable systems to calculate sensitivity systematically and quickly by means of the fault tree.
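
    As a point of reference for NMR sensitivity, the textbook case with N identical, independent modules of reliability R and a perfect majority voter gives R_NMR = sum over k = m..N of C(N,k) R^k (1-R)^(N-k), with m = (N+1)/2, and differentiating yields dR_NMR/dR = N C(N-1, m-1) R^(m-1) (1-R)^(N-m). The Python sketch below evaluates both expressions. These are standard binomial results under the stated assumptions and are not necessarily the parameterized formula derived in the paper; the function names are illustrative.

```python
from math import comb


def nmr_reliability(n, r):
    """Reliability of N-modular redundancy with a perfect majority voter.

    n: odd number of identical, independent modules.
    r: reliability of a single module.
    """
    m = n // 2 + 1  # minimum number of working modules for a correct majority
    return sum(comb(n, k) * r**k * (1 - r) ** (n - k) for k in range(m, n + 1))


def nmr_sensitivity(n, r):
    """Derivative of nmr_reliability with respect to the module reliability r."""
    m = n // 2 + 1
    return n * comb(n - 1, m - 1) * r ** (m - 1) * (1 - r) ** (n - m)


tmr_reliability = nmr_reliability(3, 0.9)   # TMR: 3R^2 - 2R^3
tmr_sensitivity = nmr_sensitivity(3, 0.9)   # TMR: 6R(1 - R)
```

    For TMR the sensitivity 6R(1 - R) peaks at R = 0.5 and falls toward zero as R approaches 1, so improving an already reliable module changes the system reliability only slightly.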

  20. Problems related to the integration of fault tolerant aircraft electronic systems

    Science.gov (United States)

    Bannister, J. A.; Adlakha, V.; Triyedi, K.; Alspaugh, T. A., Jr.

    1982-01-01

    Problems related to the design of the hardware for an integrated aircraft electronic system are considered. Taxonomies of concurrent systems are reviewed and a new taxonomy is proposed. An informal methodology intended to identify feasible regions of the taxonomic design space is described. Specific tools are recommended for use in the methodology. Based on the methodology, a preliminary strawman integrated fault tolerant aircraft electronic system is proposed. Next, problems related to the programming and control of integrated aircraft electronic systems are discussed. Issues of system resource management, including the scheduling and allocation of real time periodic tasks in a multiprocessor environment, are treated in detail. The role of software design in integrated fault tolerant aircraft electronic systems is discussed. Conclusions and recommendations for further work are included.

  1. Modeling and Design of Fault-Tolerant and Self-Adaptive Reconfigurable Networked Embedded Systems

    Directory of Open Access Journals (Sweden)

    Jürgen Teich

    2006-06-01

    Automotive, avionic, or body-area networks are systems that consist of several communicating control units specialized for certain purposes. Typically, different constraints regarding fault tolerance, availability and also flexibility are imposed on these systems. In this article, we will present a novel framework for increasing fault tolerance and flexibility by solving the problem of hardware/software codesign online. Based on field-programmable gate arrays (FPGAs) in combination with CPUs, we allow migrating tasks implemented in hardware or software from one node to another. Moreover, if not enough hardware/software resources are available, the migration of functionality from hardware to software or vice versa is provided. Supporting such flexibility through services integrated in a distributed operating system for networked embedded systems is a substantial step towards self-adaptive systems. Besides the formal definition of methods and concepts, we describe in detail a first implementation of a reconfigurable networked embedded system running automotive applications.

  2. Modeling and Design of Fault-Tolerant and Self-Adaptive Reconfigurable Networked Embedded Systems

    Directory of Open Access Journals (Sweden)

    Streichert Thilo

    2006-01-01

    Automotive, avionic, or body-area networks are systems that consist of several communicating control units specialized for certain purposes. Typically, different constraints regarding fault tolerance, availability and also flexibility are imposed on these systems. In this article, we will present a novel framework for increasing fault tolerance and flexibility by solving the problem of hardware/software codesign online. Based on field-programmable gate arrays (FPGAs) in combination with CPUs, we allow migrating tasks implemented in hardware or software from one node to another. Moreover, if not enough hardware/software resources are available, the migration of functionality from hardware to software or vice versa is provided. Supporting such flexibility through services integrated in a distributed operating system for networked embedded systems is a substantial step towards self-adaptive systems. Besides the formal definition of methods and concepts, we describe in detail a first implementation of a reconfigurable networked embedded system running automotive applications.

  3. Guaranteed Cost Fault-tolerant Controller Design of Networked Control Systems under Variable-period Sampling

    Directory of Open Access Journals (Sweden)

    Xuan Li

    2009-01-01

    This study investigates the problem of integrity against actuator failures for networked control systems under variable-period sampling. Assuming that the interval between any two consecutive sampling instants is less than a given bound, the input delay approach is used to transform the networked control systems under variable-period sampling into continuous-time networked control systems with time-varying delays. Existence conditions for a guaranteed cost fault-tolerant control law are then established using Lyapunov stability theory combined with Linear Matrix Inequalities (LMIs). Furthermore, the guaranteed cost fault-tolerant controller gain and the minimum guaranteed cost can be obtained by solving a minimization problem. A numerical simulation example demonstrates that the conclusions are feasible and effective. The proposed control method addresses the problems of variable-period sampling and actuator failures, which meets the requirements of industrial networked control systems.

  4. Fault detection and fault-tolerant control using sliding modes

    CERN Document Server

    Alwi, Halim; Tan, Chee Pin

    2011-01-01

    ""Fault Detection and Fault-tolerant Control Using Sliding Modes"" is the first text dedicated to showing the latest developments in the use of sliding-mode concepts for fault detection and isolation (FDI) and fault-tolerant control in dynamical engineering systems. It begins with an introduction to the basic concepts of sliding modes to provide a background to the field. This is followed by chapters that describe the use and design of sliding-mode observers for FDI using robust fault reconstruction. The development of a class of sliding-mode observers is described from first principles throug

  5. Architectures for fault-tolerant spacecraft computers

    Science.gov (United States)

    Rennels, D. A.

    1978-01-01

    This paper summarizes the results of a long-term research program in fault-tolerant computing for spacecraft on-board processing. In response to changing device technology this program has progressed from the design of a fault-tolerant uniprocessor to the development of fault-tolerant distributed computer systems. The unusual requirements of spacecraft computing are described along with the resulting real-time computer architectures. The following aspects of these designs are discussed: (1) architectural features to minimize complexity in the distributed computer system, (2) fault-detection and recovery, (3) techniques to enhance reliability and testability, and (4) design approaches for LSI implementation.

  6. Failure Detection vs. Group Membership in Fault-Tolerant Distributed Systems: Hidden Trade-Offs

    OpenAIRE

    Schiper, A.

    2002-01-01

    Failure detection and group membership are two important components of fault-tolerant distributed systems. Understanding their role is essential when developing efficient solutions, not only in failure-free runs, but also in runs in which processes do crash. While group membership provides consistent information about the status of processes in the system, failure detectors provide inconsistent information. This paper discusses the trade-offs related to the use of these two components, ...

  7. Efficient and Low-Cost Fault Tolerance for Web-Scale Systems

    OpenAIRE

    Serafini, Marco

    2010-01-01

    Online Web-scale services are being increasingly used to handle critical personal information. The trend towards storing and managing such information on the “cloud” is extending the need for dependable services to a growing range of Web applications, from emailing, to calendars, storage of photos, or finance. This motivates the increased adoption of fault-tolerant replication algorithms in Web-scale systems, ranging from classic, strongly-consistent replication in systems such as Chubby [Bur0...

  8. Fault Tolerant Wind Farm Control

    DEFF Research Database (Denmark)

    Odgaard, Peter Fogh; Stoustrup, Jakob

    In the recent years the wind turbine industry has focused on optimizing the cost of energy. One of the important factors in this is to increase reliability of the wind turbines. Advanced fault detection, isolation and accommodation are important tools in this process. Clearly most faults are deal...... relevant fault scenarios. This benchmark model is used in an international competition dealing with Wind Farm fault detection and isolation and fault tolerant control....

  9. A Decoding Approach to Fault Tolerant Control of Linear Systems with Quantized Disturbance Input

    CERN Document Server

    Fosson, Sophie M

    2010-01-01

    The aim of this paper is to propose an alternative method to solve a Fault Tolerant Control problem. The model is a linear system affected by a disturbance term: this represents a large class of technological faulty processes. The goal is to make the system able to tolerate the undesired perturbation, i.e., to remove or at least reduce its negative effects; such a task is performed in three steps: the detection of the fault, its identification and the consequent process recovery. When the disturbance function is known to be quantized over a finite number of levels, the detection can be successfully executed by a recursive decoding algorithm, arising from Information and Coding Theory and suitably adapted to the control framework. This technique is analyzed and tested in a flight control issue; both theoretical considerations and simulations are reported.
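
    The detection step for a quantized disturbance can be illustrated with a much simpler stand-in than the recursive decoding algorithm of the paper, which this record does not specify. The sketch below assumes a scalar system x[k+1] = a*x[k] + d[k] with a directly measured state and picks, at each step, the admissible level nearest to the residual. The function name `decode_disturbance`, the scalar model and the nearest-level rule are all assumptions made for illustration.

```python
from typing import List, Sequence


def decode_disturbance(states: Sequence[float], a: float,
                       levels: Sequence[float]) -> List[float]:
    """Estimate a quantized disturbance from consecutive state measurements.

    Assumed (hypothetical) model: x[k+1] = a * x[k] + d[k], with d[k] taking
    values in the finite set `levels`. Each residual x[k+1] - a * x[k] is
    mapped to the nearest admissible level.
    """
    decoded = []
    for x_now, x_next in zip(states, states[1:]):
        residual = x_next - a * x_now
        # Nearest-level decision: a memoryless stand-in for a real decoder.
        decoded.append(min(levels, key=lambda lv: abs(lv - residual)))
    return decoded


if __name__ == "__main__":
    # Slightly noisy measurements of a trajectory driven by d = [0, 1, 1, 0].
    measured = [2.0, 1.02, 1.49, 1.76, 0.87]
    print(decode_disturbance(measured, a=0.5, levels=[0.0, 1.0]))
```

    A memoryless decision like this only works while the measurement noise stays below half the spacing between levels; a recursive decoder would additionally exploit the history of the signal.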

  10. Economic modeling of fault tolerant flight control systems in commercial applications

    Science.gov (United States)

    Finelli, G. B.

    1982-01-01

    This paper describes the current development of a comprehensive model which will supply the assessment and analysis capability to investigate the economic viability of Fault Tolerant Flight Control Systems (FTFCS) for commercial aircraft of the 1990's and beyond. An introduction to the unique attributes of fault tolerance and how they will influence aircraft operations and consequent airline costs and benefits is presented. Specific modeling issues and elements necessary for accurate assessment of all costs affected by ownership and operation of FTFCS are delineated. Trade-off factors are presented, aimed at exposing economically optimal realizations of system implementations, resource allocation, and operating policies. A trade-off example is furnished to graphically display some of the analysis capabilities of the comprehensive simulation model now being developed.

  11. FTOS-Verify: Analysis and Verification of Non-Functional Properties for Fault-Tolerant Systems

    CERN Document Server

    Cheng, Chih-Hong; Esparza, Javier; Knoll, Alois

    2009-01-01

    The focus of the tool FTOS is to alleviate designers' burden by offering code generation for non-functional aspects including fault-tolerance mechanisms. One crucial aspect in this context is to ensure that user-selected mechanisms for the system model are sufficient to resist faults as specified in the underlying fault hypothesis. In this paper, formal approaches in verification are proposed to assist the claim. We first raise the precision of FTOS into pure mathematical constructs, and formulate the deterministic assumption, which is necessary as an extension of Giotto-like systems (e.g., FTOS) to equip with fault-tolerance abilities. We show that local properties of a system with the deterministic assumption will be preserved in a modified synchronous system used as the verification model. This enables the use of techniques known from hardware verification. As for implementation, we develop a prototype tool called FTOS-Verify, deploy it as an Eclipse add-on for FTOS, and conduct several case studies.

  12. Lightweight Adaptive fault-tolerant data storage system (AFTSYS)

    OpenAIRE

    Carretero Pérez, Jesús

    2008-01-01

    The ARCOS research group of Universidad Carlos III de Madrid (Spain) has been working on flexible and adaptive data storage systems for several years. The storage systems developed are characterized by software governance, which makes them portable across different hardware storage resources, and by their dynamic adaptivity to the different circumstances of computer systems, following the autonomic system paradigm. They also allow getting high performance storage by using data distribution or striping across ...

  13. Advanced information processing system: The Army Fault-Tolerant Architecture detailed design overview

    Science.gov (United States)

    Harper, Richard E.; Babikyan, Carol A.; Butler, Bryan P.; Clasen, Robert J.; Harris, Chris H.; Lala, Jaynarayan H.; Masotto, Thomas K.; Nagle, Gail A.; Prizant, Mark J.; Treadwell, Steven

    1994-01-01

    The Army Avionics Research and Development Activity (AVRADA) is pursuing programs that would enable effective and efficient management of the large amounts of situational data that occur during tactical rotorcraft missions. The Computer Aided Low Altitude Night Helicopter Flight Program has identified automated Terrain Following/Terrain Avoidance, Nap of the Earth (TF/TA, NOE) operation as a key enabling technology for advanced tactical rotorcraft to enhance mission survivability and mission effectiveness. The processing of critical information at low altitudes with short reaction times is life-critical and mission-critical, necessitating an ultra-reliable/high throughput computing platform for dependable service for flight control, fusion of sensor data, route planning, near-field/far-field navigation, and obstacle avoidance operations. To address these needs the Army Fault Tolerant Architecture (AFTA) is being designed and developed. This computer system is based upon the Fault Tolerant Parallel Processor (FTPP) developed by Charles Stark Draper Labs (CSDL). AFTA is a hard real-time, Byzantine fault-tolerant parallel processor which is programmed in the Ada language. This document describes the results of the Detailed Design (Phases 2 and 3 of a 3-year project) of the AFTA development. This document contains detailed descriptions of the program objectives, the TF/TA NOE application requirements, architecture, hardware design, operating systems design, systems performance measurements and analytical models.

  14. Fault tolerant synchronization of chaotic systems based on T-S fuzzy model with fuzzy sampled-data controller

    Science.gov (United States)

    Ma, Da-Zhong; Zhang, Hua-Guang; Wang, Zhan-Shan; Feng, Jian

    2010-05-01

    In this paper the fault tolerant synchronization of two chaotic systems based on fuzzy model and sample data is investigated. The problem of fault tolerant synchronization is formulated to study the global asymptotical stability of the error system with the fuzzy sampled-data controller which contains a state feedback controller and a fault compensator. The synchronization can be achieved no matter whether the fault occurs or not. To investigate the stability of the error system and facilitate the design of the fuzzy sampled-data controller, a Takagi-Sugeno (T-S) fuzzy model is employed to represent the chaotic system dynamics. To acquire good performance and produce a less conservative analysis result, a new parameter-dependent Lyapunov-Krasovskii functional and a relaxed stabilization technique are considered. The stability conditions based on linear matrix inequality are obtained to achieve the fault tolerant synchronization of the chaotic systems. Finally, a numerical simulation is shown to verify the results.

  15. Fault tolerant synchronization of chaotic systems based on T–S fuzzy model with fuzzy sampled-data controller

    International Nuclear Information System (INIS)

    In this paper the fault tolerant synchronization of two chaotic systems based on fuzzy model and sample data is investigated. The problem of fault tolerant synchronization is formulated to study the global asymptotical stability of the error system with the fuzzy sampled-data controller which contains a state feedback controller and a fault compensator. The synchronization can be achieved no matter whether the fault occurs or not. To investigate the stability of the error system and facilitate the design of the fuzzy sampled-data controller, a Takagi–Sugeno (T–S) fuzzy model is employed to represent the chaotic system dynamics. To acquire good performance and produce a less conservative analysis result, a new parameter-dependent Lyapunov–Krasovskii functional and a relaxed stabilization technique are considered. The stability conditions based on linear matrix inequality are obtained to achieve the fault tolerant synchronization of the chaotic systems. Finally, a numerical simulation is shown to verify the results. (general)

  16. Fault-Tolerant Onboard Monitoring and Decision Support Systems

    DEFF Research Database (Denmark)

    Lajic, Zoran

    The purpose of this research project is to improve current onboard decision support systems. Special focus is on the onboard prediction of the instantaneous sea state. In this project a new approach to increasing the overall reliability of a monitoring and decision support system has been...

  17. Service for fault tolerance in the Ad Hoc Networks based on Multi Agent Systems

    Directory of Open Access Journals (Sweden)

    Ghalem Belalem

    2011-02-01

    Full Text Available Ad hoc networks are distributed, self-organized networks that do not require infrastructure. In such networks, mobile infrastructures are subject to disconnections, which may be voluntary or involuntary and are caused by the high mobility of nodes in the ad hoc network. This work contributes to solving these problems and ensuring continuous service by proposing a fault tolerance service based on Multi Agent Systems (MAS), which predicts problems and makes decisions concerning critical nodes. Our work contributes to the study of predicting voluntary and involuntary disconnections in the ad hoc network; we therefore propose a fault tolerance service that allows effective distribution of information in the network by selecting some objects of the network to hold duplicates of the information.

  18. Fault tolerance improvement for queuing systems under stress load

    International Nuclear Information System (INIS)

    Various kinds of queuing information systems (exchange auction systems, web servers, SCADA) face unpredictable situations during operation, when the flow of information that has to be analyzed and processed rises sharply. Such stress load situations often require human (dispatcher's or administrator's) intervention, which is why the time of the first denial of service is extremely important. A common queuing system architecture is described. Existing approaches to computing resource management are considered. A new late-first-denial-of-service resource management approach is proposed.

  19. Failure transition distance-based importance sampling schemes for the simulation of repairable fault-tolerant computer systems

    OpenAIRE

    Carrasco, Juan A.

    2006-01-01

    Markov models are often used to evaluate dependability attributes of fault-tolerant computer systems. The use in practice of Markov models is, however, hampered by the well-known state space explosion problem. Simulation alleviates the problem. For Markov models of repairable fault-tolerant systems, standard simulation of dependability measures tends to be expensive due to the rarity of the system failure event. Importance sampling can speed up the simulation. This paper develops two importan...

  20. Safety Verification of a Fault Tolerant Reconfigurable Autonomous Goal-Based Robotic Control System

    Science.gov (United States)

    Braman, Julia M. B.; Murray, Richard M; Wagner, David A.

    2007-01-01

    Fault tolerance and safety verification of control systems are essential for the success of autonomous robotic systems. A control architecture called Mission Data System (MDS), developed at the Jet Propulsion Laboratory, takes a goal-based control approach. In this paper, a method for converting goal network control programs into linear hybrid systems is developed. The linear hybrid system can then be verified for safety in the presence of failures using existing symbolic model checkers. An example task is simulated in MDS and successfully verified using HyTech, a symbolic model checking software for linear hybrid systems.

  1. Transparent reliability model for fault-tolerant safety systems

    International Nuclear Information System (INIS)

    A reliability model is presented which may serve as a tool for identification of cost-effective configurations and operating philosophies of computer-based process safety systems. The main merit of the model is the explicit relationship in the mathematical formulas between failure cause and the means used to improve system reliability such as self-test, redundancy, preventive maintenance and corrective maintenance. A component failure taxonomy has been developed which allows the analyst to treat hardware failures, human failures, and software failures of automatic systems in an integrated manner. Furthermore, the taxonomy distinguishes between failures due to excessive environmental stresses and failures initiated by humans during engineering and operation. Attention has been given to develop a transparent model which provides predictions which are in good agreement with observed system performance, and which is applicable for non-experts in the field of reliability

  2. Fault Tolerant Control: A Simultaneous Stabilization Result

    DEFF Research Database (Denmark)

    Stoustrup, Jakob; Blondel, V.D.

    2004-01-01

    This paper discusses the problem of designing fault tolerant compensators that stabilize a given system both in the nominal situation, as well as in the situation where one of the sensors or one of the actuators has failed. It is shown that such compensators always exist, provided that the system...... is detectable from each output and that it is stabilizable. The proof of this result is constructive, and a worked example shows how to design a fault tolerant compensator for a simple, yet challenging system. A family of second order systems is described that requires fault tolerant compensators of...

  3. The Isis project: Fault-tolerance in large distributed systems

    Science.gov (United States)

    Birman, Kenneth P.; Marzullo, Keith

    1993-01-01

    This final status report covers activities of the Isis project during the first half of 1992. During the report period, the Isis effort has achieved a major milestone in its effort to redesign and reimplement the Isis system using Mach and Chorus as target operating system environments. In addition, we completed a number of publications that address issues raised in our prior work; some of these have recently appeared in print, while others are now being considered for publication in a variety of journals and conferences.

  4. Fault-tolerant Agreement in Synchronous Message-passing Systems

    CERN Document Server

    Raynal, Michel

    2010-01-01

    The present book focuses on the way to cope with the uncertainty created by process failures (crash, omission failures and Byzantine behavior) in synchronous message-passing systems (i.e., systems whose progress is governed by the passage of time). To that end, the book considers fundamental problems that distributed synchronous processes have to solve. These fundamental problems concern agreement among processes (if processes are unable to agree in one way or another in presence of failures, no non-trivial problem can be solved). They are consensus, interactive consistency, k-set agreement an

  5. Local rollback for fault-tolerance in parallel computing systems

    Energy Technology Data Exchange (ETDEWEB)

    Blumrich, Matthias A. (Yorktown Heights, NY); Chen, Dong (Yorktown Heights, NY); Gara, Alan (Yorktown Heights, NY); Giampapa, Mark E. (Yorktown Heights, NY); Heidelberger, Philip (Yorktown Heights, NY); Ohmacht, Martin (Yorktown Heights, NY); Steinmacher-Burow, Burkhard (Boeblingen, DE); Sugavanam, Krishnan (Yorktown Heights, NY)

    2012-01-24

    A control logic device performs a local rollback in a parallel super computing system. The super computing system includes at least one cache memory device. The control logic device determines a local rollback interval. The control logic device runs at least one instruction in the local rollback interval. The control logic device evaluates whether an unrecoverable condition occurs while running the at least one instruction during the local rollback interval. The control logic device checks whether an error occurs during the local rollback. The control logic device restarts the local rollback interval if the error occurs and the unrecoverable condition does not occur during the local rollback interval.
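
    The decision logic described in this record can be sketched in software. The following Python sketch is only an illustration of the stated rule (restart the interval when an error occurs and no unrecoverable condition arose) and is not the hardware control logic of the record. The names `IntervalResult` and `run_with_local_rollback`, the `max_restarts` limit and the "escalate" outcome are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class IntervalResult:
    error: bool          # an error was detected during the interval
    unrecoverable: bool  # a condition arose that rules out a local rollback


def run_with_local_rollback(run_interval: Callable[[], IntervalResult],
                            max_restarts: int = 3) -> str:
    """Run one rollback interval, restarting it while errors are recoverable.

    Returns "committed" when the interval finishes without error, and
    "escalate" when local rollback cannot be used (hypothetical fallback,
    e.g. a global checkpoint restore).
    """
    restarts = 0
    while True:
        result = run_interval()
        if not result.error:
            return "committed"
        if result.unrecoverable or restarts == max_restarts:
            # Restarting is only allowed for recoverable errors; the retry
            # limit is an assumption added to keep the sketch terminating.
            return "escalate"
        restarts += 1


if __name__ == "__main__":
    outcomes = iter([IntervalResult(True, False), IntervalResult(False, False)])
    print(run_with_local_rollback(lambda: next(outcomes)))
```

    The retry limit is not part of the record; it is included here so that a persistent fault does not loop forever.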

  6. Fault Tolerant Software: a Multi Agent System Solution

    DEFF Research Database (Denmark)

    Caponetti, Fabio; Bergantino, Nicola; Longhi, Sauro

    Development of highly dependable systems remains a labour-intensive task. This paper explores recent advances in adapting the software agent architecture to control applications, with attention to dependability issues. Multiple agent systems theory will be reviewed giving methods to supervise...... it. Software ageing is shown to be the most common problem and rejuvenation its countermeasure. The paper will show how an agent population can be monitored, faulty agents isolated and reloaded in a healthy state, hence rejuvenated. The aim is to propose an architecture as basis for the design of control...

  7. A Fault-Tolerant Modulation Method to Counteract the Double Open-Switch Fault in Matrix Converter Drive Systems without Redundant Power Devices

    DEFF Research Database (Denmark)

    Chen, Der-Fa; Nguyen-Duy, Khiem; Liu, Tian-Hua; Andersen, Michael A. E.

    This paper studies the double open-switch fault issue occurring within the conventional matrix converter driving a three-phase permanent-magnet synchronous motor system and proposes a fault-tolerant solution by introducing a revised modulation strategy. In this switching strategy, the rectifier...

  8. A Ship Propulsion System Model for Fault-tolerant Control

    DEFF Research Database (Denmark)

    Izadi-Zamanabadi, Roozbeh; Blanke, M.

    . The propulsion system model is presented in two versions: the first one consists of one engine and one propeller, and the other one consists of two engines and their corresponding propellers placed in parallel in the ship. The corresponding programs are developed and are available....

  9. Fault tolerant PLC system using CPU and I/O redundancy with switch over logic system for nuclear instrumentation

    International Nuclear Information System (INIS)

    Nuclear instrumentation in power plants and fuel reprocessing plants demands highly reliable, fault tolerant programmable logic controllers (PLCs), since it is directly related to hazardous operations involving the safety of the plant, the operators and, in turn, the public at large. Components of control systems can fail depending on the circumstances and level of preparedness in plants, leading to a minor or major disaster. Utilizing existing technology and configuring the system architecture, a fault tolerant PLC can provide a superior solution to meet challenges such as high reliability, integrity and availability. This paper presents the background, concepts and implementation of a fault tolerant PLC architecture using CPU and I/O redundancy with a switch over logic system for nuclear instrumentation. (author)

  10. Robust fault tolerant control based on sliding mode method for uncertain linear systems with quantization.

    Science.gov (United States)

    Hao, Li-Ying; Yang, Guang-Hong

    2013-09-01

    This paper is concerned with the problem of robust fault-tolerant compensation control for uncertain linear systems subject to both state and input signal quantization. By successfully incorporating a novel matrix full-rank factorization technique into the sliding surface design, the total failure of certain actuators can be coped with, under a special actuator redundancy assumption. In order to compensate for quantization errors, an adjustment range of quantization sensitivity for a dynamic uniform quantizer is given through the flexible choices of design parameters. Compared with existing results, the derived inequality condition leads to a stronger fault tolerance ability and a much wider scope of applicability. With a static adjustment policy of quantization sensitivity, an adaptive sliding mode controller is then designed to maintain the sliding mode, where the gain of the nonlinear unit vector term is updated automatically to compensate for the effects of actuator faults, quantization errors, exogenous disturbances and parameter uncertainties without the need for a fault detection and isolation (FDI) mechanism. Finally, the effectiveness of the proposed design method is illustrated via a structural-acoustic model of a rocket fairing. PMID:23701895

  11. Fault Tolerance for Real-Time Systems: Analysis and Optimization of Roll-back Recovery with Checkpointing

    OpenAIRE

    Nikolov, Dimitar

    2014-01-01

    Increasing soft error rates in recent semiconductor technologies enforce the usage of fault tolerance. While fault tolerance enables correct operation in the presence of soft errors, it usually introduces a time overhead. The time overhead is particularly important for a group of computer systems referred to as real-time systems (RTSs) where correct operation is defined as producing the correct result of a computation while satisfying given time constraints (deadlines). Depending on the conse...

  12. Rectifier Fault Diagnosis and Fault Tolerance of a Doubly Fed Brushless Starter Generator

    OpenAIRE

    Liwei Shi; Zhou Bo

    2015-01-01

    This paper presents a rectifier fault diagnosis method based on wavelet packet analysis to improve the reliability of the fault tolerant four-phase doubly fed brushless starter generator (DFBLSG) system. The system components and the fault tolerant principle of the highly reliable DFBLSG are given, and the common faults of the rectifier are analyzed. The wavelet packet transform fault detection/identification algorithm is introduced in detail. The fault tolerant performance and output voltage experi...

  13. Performance Evaluation of SDS Algorithm with Fault Tolerance for Distributed System

    Directory of Open Access Journals (Sweden)

    K. Sathiya Bharathi

    2012-07-01

    Full Text Available In the recent past, security-sensitive applications, such as electronic transaction processing systems and stock quote update systems, which require high quality of security to guarantee authentication, integrity, and confidentiality of information, have adopted Heterogeneous Distributed Systems (HDS) as their platforms. We systematically design a security-driven scheduling architecture that can dynamically measure the trust level of each node in the system by using differential equations, and introduce SRank to estimate the security overhead of critical tasks using the SDS algorithm. Furthermore, we can achieve high quality of security for applications by using a security-driven scheduling algorithm for DAGs in terms of minimizing the makespan, risk probability, and speedup. In addition, fault tolerance is included using the Security Driven Fault Tolerant Scheduling Algorithm (SDFT) to tolerate N processor failures at one time, and a new global scheduler is introduced to improve the efficiency of the scheduling process. Moreover, the SDFT supports a flexible security policy applied to real-time tasks according to their security requirements and considers the effect of security overhead during scheduling. We also observe that the improvement obtained by our algorithm increases as the security-sensitive data of applications increases.

  14. Survey On Fault Tolerance In Grid Computing

    Directory of Open Access Journals (Sweden)

    P. Latchoumy

    2011-12-01

    Full Text Available Grid computing is defined as a hardware and software infrastructure that enables coordinated resource sharing within dynamic organizations. In grid computing, the probability of a failure is much greater than in traditional parallel computing. Therefore, fault tolerance is an important property in order to achieve reliability, availability and QoS. In this paper, we give a survey on various fault tolerance techniques, fault management in different systems and related issues. A fault tolerance service deals with various types of resource failures, which include process failure, processor failure and network failures. This survey provides the related research results about fault tolerance in distinct functional areas of grid infrastructure and also gives future directions for fault tolerance techniques; it is a good reference for researchers.

  15. Fault-tolerant Control of Discrete-time LPV systems using Virtual Actuators and Sensors

    DEFF Research Database (Denmark)

    Tabatabaeipour, Mojtaba; Stoustrup, Jakob; Bak, Thomas

    2015-01-01

    transforms the output of the controller for the faulty system such that the stability and performance goals are preserved. Input-to-state stabilizing LPV gains of the virtual actuator and sensor are obtained by solving linear matrix inequalities (LMIs). We show that separate design of these gains guarantees......This paper proposes a new fault-tolerant control (FTC) method for discrete-time linear parameter varying (LPV) systems using a reconfiguration block. The basic idea of the method is to achieve the FTC goal without re-designing the nominal controller by inserting a reconfiguration block between the...

  16. A combinatorial method for the evaluation of yield of fault-tolerant systems-on-chip

    OpenAIRE

    Suñé, Víctor; Rodríguez Montañés, Rosa; Carrasco, Juan A.; Munteanu, D-P

    2003-01-01

    In this paper we develop a combinatorial method for the evaluation of yield of fault-tolerant systems-on-chip. The method assumes that defects are produced according to a model in which defects are lethal and affect given components of the system following a distribution common to all defects. The distribution of the number of defects is arbitrary. The method is based on the formulation of the yield as 1 minus the probability that a given boolean function with multiple-valued variables has...

  17. The NILE system architecture: fault-tolerant, wide-area access to computing and data resources

    International Nuclear Information System (INIS)

    NILE is a multi-disciplinary project building a distributed computing environment for HEP. It provides wide-area, fault-tolerant, integrated access to processing and data resources for collaborators of the CLEO experiment, though the goals and principles are applicable to many domains. NILE has three main objectives: a realistic distributed system architecture design, the design of a robust data model, and a Fast-Track implementation providing a prototype design environment which will also be used by CLEO physicists. This paper focuses on the software and wide-area system architecture design and the computing issues involved in making NILE services highly-available. (author)

  18. The BTeV DAQ and Trigger System - Some throughput, usability and fault tolerance aspects

    International Nuclear Information System (INIS)

    As presented at the last CHEP conference, the BTeV triggering and data collection pose a significant challenge in construction and operation, generating 1.5 Terabytes/second of raw data from over 30 million detector channels. We report on facets of the DAQ and trigger farms. We report on the current design of the DAQ, especially its partitioning features to support commissioning of the detector. We are exploring collaborations with computer science groups experienced in fault tolerant and dynamic real-time and embedded systems to develop a system to provide the extreme flexibility and high availability required of the heterogeneous trigger farm (∼ ten thousand DSPs and commodity processors). We describe directions in the following areas: system modeling and analysis using the Model Integrated Computing approach to assist in the creation of domain-specific modeling, analysis, and program synthesis environments for building complex, large-scale computer-based systems; System Configuration Management to include compilable design specifications for configurable hardware components, schedules, and communication maps; Runtime Environment and Hierarchical Fault Detection/Management--a system-wide infrastructure for rapidly detecting, isolating, filtering, and reporting faults which will be encapsulated in intelligent active entities (agents) to run on DSPs, L2/3 processors, and other supporting processors throughout the system

  19. The BTeV DAQ and trigger system - some throughput, usability and fault tolerance aspects

    International Nuclear Information System (INIS)

    As presented at the last CHEP conference, the BTeV triggering and data collection pose a significant challenge in construction and operation, generating 1.5 Terabytes/second of raw data from over 30 million detector channels. The authors report on facets of the DAQ and trigger farms. The authors report on the current design of the DAQ, especially its partitioning features to support commissioning of the detector. The authors are exploring collaborations with computer science groups experienced in fault tolerant and dynamic real-time and embedded systems to develop a system to provide the extreme flexibility and high availability required of the heterogeneous trigger farm (∼ten thousand DSPs and commodity processors). The authors describe directions in the following areas: system modeling and analysis using the Model Integrated Computing approach to assist in the creation of domain-specific modeling, analysis, and program synthesis environments for building complex, large-scale computer-based systems; System Configuration Management to include compilable design specifications for configurable hardware components, schedules, and communication maps; Runtime Environment and Hierarchical Fault Detection/Management- a system-wide infrastructure for rapidly detecting, isolating, filtering, and reporting faults which will be encapsulated in intelligent active entities (agents) to run on DSPs, L2/3 processors, and other supporting processors throughout the system

  20. Diagnosis and Tolerant Strategy of an Open-Switch Fault for T-type Three-Level Inverter Systems

    DEFF Research Database (Denmark)

    Choi, Uimin; Lee, Kyo Beum; Blaabjerg, Frede

    2014-01-01

    -tolerant strategy is explained by dividing into two cases: the faulty condition of half-bridge switches and the neutral-point switches. The performance of the T-type inverter system improves considerably by the proposed fault tolerant algorithm when a switch fails. The proposed method does not require additional...

  1. Fault-tolerant parallel processor

    Energy Technology Data Exchange (ETDEWEB)

    Harper, R.E.; Lala, J.H. (Charles Stark Draper Laboratory, Inc., Cambridge, MA (USA))

    1991-06-01

    This paper addresses issues central to the design and operation of an ultrareliable, Byzantine resilient parallel computer. Interprocessor connectivity requirements are met by treating connectivity as a resource that is shared among many processing elements, allowing flexibility in their configuration and reducing complexity. Redundant groups are synchronized solely by message transmissions and receptions, which also provide input data consistency and output voting. Reliability analysis results are presented that demonstrate the reduced failure probability of such a system. Performance analysis results are presented that quantify the temporal overhead involved in executing such fault-tolerance-specific operations. Empirical performance measurements of prototypes of the architecture are presented. 30 refs.

  2. Investigation of the applicability of a functional programming model to fault-tolerant parallel processing for knowledge-based systems

    Science.gov (United States)

    Harper, Richard

    1989-01-01

    In a fault-tolerant parallel computer, a functional programming model can facilitate distributed checkpointing, error recovery, load balancing, and graceful degradation. Such a model has been implemented on the Draper Fault-Tolerant Parallel Processor (FTPP). When used in conjunction with the FTPP's fault detection and masking capabilities, this implementation results in a graceful degradation of system performance after faults. Three graceful degradation algorithms have been implemented and are presented. A user interface has been implemented which requires minimal cognitive overhead by the application programmer, masking such complexities as the system's redundancy, distributed nature, variable complement of processing resources, load balancing, fault occurrence and recovery. This user interface is described and its use demonstrated. The applicability of the functional programming style to the Activation Framework, a paradigm for intelligent systems, is then briefly described.

  3. State of the art on fault-tolerant real time distributed systems

    International Nuclear Information System (INIS)

    The integration of new computerized functions in power plant, and especially nuclear power plant, control and instrumentation systems implies more and more stringent requirements as to communication system reliability. Although an item of equipment, or even a computer program, can be validated and qualified, no formal qualification procedure is presently imposed on communication networks. This is certainly due to the relative immaturity of these networks, but also to their complexity. It is for this reason that, in the context of preparation for the future PWR 2000 standardized nuclear plants, it would seem appropriate to take a look at fault-tolerant communication systems. Since C&I type applications (in the control room) are divided between several computers and are required to contend with extremely severe time constraints, EDF has undertaken investigation of fault-tolerant, real time distributed systems. This paper summarizes the state of the art in the field as it appears from discussion with computer manufacturers, academics and research workers on related projects. The results obtained were then used to determine trends as to "promising" solutions. The paper concludes with recommended study programs for the PCC department of EDF/R&DD for the next few years. (author), 9 figs., 10 refs., 2 annexes

  4. Model Checking a Byzantine-Fault-Tolerant Self-Stabilizing Protocol for Distributed Clock Synchronization Systems

    Science.gov (United States)

    Malekpour, Mahyar R.

    2007-01-01

    This report presents the mechanical verification of a simplified model of a rapid Byzantine-fault-tolerant self-stabilizing protocol for distributed clock synchronization systems. This protocol does not rely on any assumptions about the initial state of the system. This protocol tolerates bursts of transient failures, and deterministically converges within a time bound that is a linear function of the self-stabilization period. A simplified model of the protocol is verified using the Symbolic Model Verifier (SMV) [SMV]. The system under study consists of 4 nodes, where at most one of the nodes is assumed to be Byzantine faulty. The model checking effort is focused on verifying correctness of the simplified model of the protocol in the presence of a permanent Byzantine fault as well as confirmation of claims of determinism and linear convergence with respect to the self-stabilization period. Although model checking results of the simplified model of the protocol confirm the theoretical predictions, these results do not necessarily confirm that the protocol solves the general case of this problem. Modeling challenges of the protocol and the system are addressed. A number of abstractions are utilized in order to reduce the state space. Also, additional innovative state space reduction techniques are introduced that can be used in future verification efforts applied to this and other protocols.

  5. Modeling and Verification for Timing Satisfaction of Fault-Tolerant Systems with Finiteness

    CERN Document Server

    Cheng, Chih-Hong; Esparza, Javier; Knoll, Alois

    2009-01-01

    The increasing use of model-based tools enables further use of formal verification techniques in the context of distributed real-time systems. To avoid state explosion, it is necessary to construct a verification model that focuses on the aspects under consideration. In this paper, we discuss how we construct a verification model for timing analysis in distributed real-time systems. We (1) give observations concerning restrictions of timed automata to model these systems, (2) formulate mathematical representations of how to perform model-to-model transformation to derive verification models from system models, and (3) propose some theoretical criteria for reducing the model size. The latter is particularly important because, for the verification of complex systems, an efficient model reflecting the properties of the system under consideration is as important as the verification algorithm itself. Finally, we present an extension of the model-based development tool FTOS, designed to develop fault-tolerant system...

  6. Economical and Fault-Tolerant Load Balancing in Distributed Stream Processing Systems

    Science.gov (United States)

    Xiao, Fuyuan; Kitasuka, Teruaki; Aritsugi, Masayoshi

    We present an economical and fault-tolerant load balancing strategy (EFTLBS) based on an operator replication mechanism and a load shedding method, that fully utilizes the network resources to realize continuous and highly-available data stream processing without dynamic operator migration over wide area networks. In this paper, we first design an economical operator distribution (EOD) plan based on a bin-packing model under the constraints of each stream bandwidth as well as each server's CPU capacity. Next, we devise super-operator (SO) that load balances multi-degree operator replicas. Moreover, for improving the fault-tolerance of the system, we color the SOs based on a coloring bin-packing (CBP) model that assigns peer operator replicas to different servers. To minimize the effects of input rate bursts upon the system, we take advantage of a load shedding method while keeping the QoS guarantees made by the system based on the SO scheme and the CBP model. Finally, we substantiate the utility of our work through experiments on ns-3.
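
    The placement constraint described above (peer replicas of one operator must land on different servers, subject to each server's CPU capacity) can be illustrated with a small greedy sketch. This is not the authors' CBP model: it is a first-fit-decreasing heuristic, it ignores the stream bandwidth constraint, and the names `place_replicas`, `server_capacity` and `num_servers` are hypothetical.

```python
def place_replicas(replicas, server_capacity, num_servers):
    """Greedy first-fit-decreasing placement (illustrative assumption only).

    replicas: list of (operator_id, cpu_load) tuples; replicas that share an
    operator_id are peers and must not be placed on the same server.
    Returns a dict mapping replica index -> server index.
    """
    free = [server_capacity] * num_servers      # remaining CPU per server
    hosted = [set() for _ in range(num_servers)]  # operator ids per server
    placement = {}
    # Place the heaviest replicas first (classic bin-packing heuristic).
    order = sorted(enumerate(replicas), key=lambda item: -item[1][1])
    for index, (operator_id, load) in order:
        for server in range(num_servers):
            fits = load <= free[server]
            no_peer = operator_id not in hosted[server]
            if fits and no_peer:
                free[server] -= load
                hosted[server].add(operator_id)
                placement[index] = server
                break
        else:
            raise ValueError(f"no feasible server for replica {index}")
    return placement


if __name__ == "__main__":
    demo = [("A", 0.5), ("A", 0.5), ("B", 0.4), ("B", 0.4)]
    print(place_replicas(demo, server_capacity=1.0, num_servers=2))
```

    With two servers of capacity 1.0, each operator ends up with one replica per server, so the loss of either server leaves one live replica of every operator.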

  7. Design Approach for Fault Recoverable ALU with Improved Fault Tolerance

    Directory of Open Access Journals (Sweden)

    Ankit K V

    2015-08-01

    A new design for a fault-tolerant and fault-recoverable ALU system is proposed in this paper. Reliability is one of the most critical factors to be considered during the design phase of any IC. In critical applications such as medical equipment and military systems, reliability plays a decisive role in determining the acceptance of a product. Inserting special modules into the main design for reliability enhancement incurs a considerable area and power penalty. A novel approach to this problem is therefore to reuse components already available in the digital system to implement recovery methods efficiently. Triple Modular Redundancy (TMR) has traditionally been used to protect digital logic from single event upsets (SEUs) by triplicating the critical components of the system. ScTMR, a scan chain-based error recovery TMR technique, provides recovery for all internal faults. ScTMR uses a roll-forward approach and employs the scan chain implemented in the circuits for testability purposes to recover the system to a fault-free state. The proposed design incorporates an ScTMR controller over a TMR system of ALUs, making the system fault tolerant and fault recoverable. Hence, the proposed design is expected to be more efficient and reliable in critical applications than any other design presented to date.
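
    The masking step of TMR mentioned above can be sketched in a few lines. This is only a software illustration of a bitwise 2-of-3 majority voter over three replicated ALU results; it does not model the scan-chain recovery of ScTMR, and the function names and the `fault_mask` parameter are assumptions made for the example.

```python
def majority_vote(a: int, b: int, c: int) -> int:
    """Bitwise 2-of-3 majority: each output bit follows at least two inputs."""
    return (a & b) | (a & c) | (b & c)


def tmr_alu_add(x: int, y: int, fault_mask: int = 0, width: int = 8) -> int:
    """Add x and y on three replicated 'ALUs' and vote on the result.

    fault_mask flips bits in ONE replica to imitate a single event upset;
    the voter masks it because the other two replicas still agree.
    """
    mask = (1 << width) - 1
    copies = [(x + y) & mask for _ in range(3)]
    copies[1] ^= fault_mask & mask  # inject the fault into replica 1 only
    return majority_vote(*copies)


if __name__ == "__main__":
    print(tmr_alu_add(100, 27))                          # fault-free
    print(tmr_alu_add(100, 27, fault_mask=0b00010000))   # one replica upset
```

    Both calls return the same sum, because a fault confined to one replica is outvoted. A fault that corrupts the same bit in two replicas would not be masked, which is why recovery of the faulty replica matters in practice.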

  8. Analysis of Fault Tolerance in Peer to Peer Video on Demand System Using V Chaining

    Directory of Open Access Journals (Sweden)

    Hareesh K.

    2013-02-01

    A video on demand (VoD) system is a streaming application widely used to access remote video programs over the Internet. One variant of the VoD system is the peer-to-peer (P2P) system. In a P2P VoD system, peers frequently fail while chaining. A fault tolerance mechanism is used to overcome the failure of peers in the chain. We propose a VChaining mechanism based on a Continuous Time Markov Chain model, with a birth-death process used to model the proposed mechanism. The parameters used in our model are the arrival of peer requests versus the failure of peers, normal operation versus recovery, the average load on the system, the peer bandwidth and buffer in the system, and the server bandwidth and server load. We have simulated these parameters using the Video Chaining (VChaining) mechanism and compared the simulation results with existing mechanisms such as optimal and accelerated chaining. Our simulation results compare favorably with all of these chaining mechanisms.
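
    As a rough illustration of the birth-death modelling mentioned above, the sketch below computes the steady-state distribution of a finite birth-death chain. The interpretation of the states (for example, the number of peers currently in a chain) and the numeric rates are assumptions for the example; the abstract does not give the actual model parameters.

```python
def birth_death_steady_state(birth_rates, death_rates):
    """Steady-state probabilities of a finite birth-death CTMC.

    birth_rates[k] is the rate of moving from state k to k + 1
    (e.g. a peer request arriving); death_rates[k] is the rate of moving
    from state k + 1 back to k (e.g. a peer failing or leaving).
    Uses the balance equations pi[k+1] = pi[k] * birth[k] / death[k].
    """
    if len(birth_rates) != len(death_rates):
        raise ValueError("need one death rate per birth rate")
    weights = [1.0]
    for lam, mu in zip(birth_rates, death_rates):
        weights.append(weights[-1] * lam / mu)
    total = sum(weights)
    return [w / total for w in weights]


if __name__ == "__main__":
    # Hypothetical rates: arrivals at 1 per unit time, failures at 2.
    for state, p in enumerate(birth_death_steady_state([1.0, 1.0], [2.0, 2.0])):
        print(state, round(p, 4))
```

    With these illustrative rates the chain spends most of its time in the lowest state, which matches the intuition that frequent peer failures keep chains short.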

  9. Design of Fault-Tolerant Control for Trajectory Tracking

    OpenAIRE

    Németh, Balazs; Gaspar, Peter; Bokor, Jozsef; Sename, Olivier; Dugard, Luc

    2012-01-01

    The paper proposes a fault-tolerant integrated control system with the brake and the steering for developing a driver assistance system. The purpose is to design a fault-tolerant control which is able to guarantee the trajectory tracking and lateral stability of the vehicle against actuator fault scenarios. Since both actuators affect the lateral dynamics of the vehicle, in the control design a balance and priority between them must be achieved. The method is extended with a fault-tolerant fe...

  10. Multi-agent Platform and Toolbox for Fault Tolerant Networked Control Systems

    Directory of Open Access Journals (Sweden)

    Mário J. G. C. Mendes

    2009-04-01

    Industrial distributed networked control systems use different communication networks to exchange information of different criticality levels. Real-time control, fault diagnosis (FDI) and Fault Tolerant Networked Control (FTNC) systems place some of the most stringent demands on data exchange in the communication networks of these networked control systems (NCS). When dealing with large-scale complex NCS, designing FTNC systems is a very difficult task due to the large number of spatially distributed, network-connected sensors and actuators. To solve this issue, an FTNC platform and toolbox are presented in this paper using simple and verifiable principles coming mainly from a decentralized design based on causal modelling partitioning of the NCS and distributed computing using the multi-agent systems paradigm, allowing the use of agents with well established FTC methodologies or new ones developed taking into account the NCS specificities. The multi-agent platform and toolbox for FTNC systems have been built in the Matlab/Simulink environment, which is nowadays the scientific benchmark for this kind of research. Although the tests have been performed with a simple case, the results are promising and this approach is expected to succeed with more complex processes.

  11. Novel neural networks-based fault tolerant control scheme with fault alarm.

    Science.gov (United States)

    Shen, Qikun; Jiang, Bin; Shi, Peng; Lim, Cheng-Chew

    2014-11-01

    In this paper, the problem of adaptive active fault-tolerant control for a class of nonlinear systems with unknown actuator fault is investigated. The actuator fault is assumed to have no traditional affine appearance of the system state variables and control input. The useful property of the basis function of the radial basis function neural network (NN), which will be used in the design of the fault tolerant controller, is explored. Based on the analysis of the design of normal and passive fault tolerant controllers, by using the implicit function theorem, a novel NN-based active fault-tolerant control scheme with fault alarm is proposed. Compared with results in the literature, the fault-tolerant control scheme can minimize the time delay between fault occurrence and accommodation, known as the time delay due to fault diagnosis, and reduce the adverse effect on system performance. In addition, the FTC scheme has the advantages of a passive fault-tolerant control scheme as well as the traditional active fault-tolerant control scheme's properties. Furthermore, the fault-tolerant control scheme requires no additional fault detection and isolation model, which is necessary in the traditional active fault-tolerant control scheme. Finally, simulation results are presented to demonstrate the efficiency of the developed techniques. PMID:25014982

  12. Fault Tolerant External Memory Algorithms

    DEFF Research Database (Denmark)

    Jørgensen, Allan Grønlund; Brodal, Gerth Stølting; Mølhave, Thomas

    2009-01-01

    Algorithms dealing with massive data sets are usually designed for I/O-efficiency, often captured by the I/O model by Aggarwal and Vitter. Another aspect of dealing with massive data is how to deal with memory faults, e.g. captured by the adversary based faulty memory RAM by Finocchi and Italiano....... However, current fault tolerant algorithms do not scale beyond the internal memory. In this paper we investigate for the first time the connection between I/O-efficiency in the I/O model and fault tolerance in the faulty memory RAM, and we assume that both memory and disk are unreliable. We show a lower...... bound on the number of I/Os required for any deterministic dictionary that is resilient to memory faults. We design a static and a dynamic deterministic dictionary with optimal query performance as well as an optimal sorting algorithm and an optimal priority queue. Finally, we consider scenarios where...

  13. A Fault-Tolerant Emergency-Aware Access Control Scheme for Cyber-Physical Systems

    CERN Document Server

    Wu, Guowei; Xia, Feng; Yao, Lin

    2012-01-01

    Access control is an issue of paramount importance in cyber-physical systems (CPS). In this paper, an access control scheme, namely FEAC, is presented for CPS. FEAC can not only provide the ability to control access to data in normal situations, but also adaptively assign emergency-role and permissions to specific subjects and inform subjects without explicit access requests to handle emergency situations in a proactive manner. In FEAC, emergency-group and emergency-dependency are introduced. Emergencies are processed in sequence within the group and in parallel among groups. A priority and dependency model called PD-AGM is used to select the optimal response-action execution path aiming to eliminate all emergencies that occurred within the system. Fault-tolerant access control policies are used to address failure in emergency management. A case study of the hospital medical care application shows the effectiveness of FEAC.

  14. A Constraint Logic Programming Framework for the Synthesis of Fault-Tolerant Schedules for Distributed Embedded Systems

    OpenAIRE

    Poulsen, Kåre Harbo; Pop, Paul; Izosimov, Viacheslav

    2007-01-01

    We present a constraint logic programming (CLP) approach for synthesis of fault-tolerant hard real-time applications on distributed heterogeneous architectures. We address time-triggered systems, where processes and messages are statically scheduled based on schedule tables. We use process re-execution for recovering from multiple transient faults. We propose three scheduling approaches, which each present a trade-off between schedule simplicity and performance, (i) full transparency, (ii) sl...

  15. GRID COMPUTING AND FAULT TOLERANCE APPROACH

    Directory of Open Access Journals (Sweden)

    Pankaj Gupta

    2011-10-01

    Grid computing is a means of allocating the computational power of a large number of computers to a complex, difficult computation or problem. Grid computing is a distributed computing paradigm that differs from traditional distributed computing in that it is aimed toward large scale systems that even span organizational boundaries. This paper proposes a method to achieve maximum fault tolerance in the Grid environment by taking reliability into account through a replication approach and a checkpoint approach. Fault tolerance is an important property for large scale computational grid systems, where geographically distributed nodes co-operate to execute a task. In order to achieve a high level of reliability and availability, the grid infrastructure should be fault tolerant. Since the failure of resources affects job execution fatally, a fault tolerance service is essential to satisfy QoS requirements in grid computing. Commonly utilized techniques for providing fault tolerance are job checkpointing and replication. Both techniques mitigate the amount of work lost due to changing system availability but can introduce significant runtime overhead. The overhead largely depends on the length of the checkpointing interval and the chosen number of replicas, respectively. In the case of complex scientific workflows, where tasks execute in a well defined order, reliability is another major challenge because of the unreliable nature of the grid resources.

  16. Robot Position Sensor Fault Tolerance

    Science.gov (United States)

    Aldridge, Hal A.

    1997-01-01

    Robot systems in critical applications, such as those in space and nuclear environments, must be able to operate during component failure to complete important tasks. One failure mode that has received little attention is the failure of joint position sensors. Current fault tolerant designs require the addition of directly redundant position sensors which can affect joint design. A new method is proposed that utilizes analytical redundancy to allow for continued operation during joint position sensor failure. Joint torque sensors are used with a virtual passive torque controller to make the robot joint stable without position feedback and improve position tracking performance in the presence of unknown link dynamics and end-effector loading. Two Cartesian accelerometer based methods are proposed to determine the position of the joint. The joint specific position determination method utilizes two triaxial accelerometers attached to the link driven by the joint with the failed position sensor. The joint specific method is not computationally complex and the position error is bounded. The system wide position determination method utilizes accelerometers distributed on different robot links and the end-effector to determine the position of sets of multiple joints. The system wide method requires fewer accelerometers than the joint specific method to make all joint position sensors fault tolerant but is more computationally complex and has lower convergence properties. Experiments were conducted on a laboratory manipulator. Both position determination methods were shown to track the actual position satisfactorily. A controller using the position determination methods and the virtual passive torque controller was able to servo the joints to a desired position during position sensor failure.

  17. Scheduling of Fault-Tolerant Embedded Systems with Soft and Hard Timing Constraints

    DEFF Research Database (Denmark)

    Izosimov, Viacheslav; Pop, Paul; Eles, Petru; Peng, Zebo

    fails or completes, incurs an unacceptable overhead. Thus, we use a quasi-static scheduling strategy, where a set of schedules is synthesized off-line and, at run time, the scheduler will select the right schedule based on the occurrence of faults and the actual execution times of processes. The....../utility functions to capture the utility of soft processes. Process re-execution is employed to recover from multiple faults. A single static schedule computed off-line is not fault tolerant and is pessimistic in terms of utility, while a purely online approach, which computes a new schedule every time a process...

  18. A Constraint Logic Programming Framework for the Synthesis of Fault-Tolerant Schedules for Distributed Embedded Systems

    DEFF Research Database (Denmark)

    Poulsen, Kåre Harbo; Pop, Paul; Izosimov, Viacheslav

    We present a constraint logic programming (CLP) approach for synthesis of fault-tolerant hard real-time applications on distributed heterogeneous architectures. We address time-triggered systems, where processes and messages are statically scheduled based on schedule tables. We use process re...... framework that produces the fault-tolerant schedules, guaranteeing schedulability in the presence of transient faults. We show how the framework can be used to tackle design optimization problems.The proposed approach has been evaluated using extensive experiments.......-execution for recovering from multiple transient faults. We propose three scheduling approaches, which each present a trade-off between schedule simplicity and performance, (i) full transparency, (ii) slack sharing and (iii) conditional, and provide various degrees of transparency. We have developed a CLP...

  19. Methods and apparatuses for self-generating fault-tolerant keys in spread-spectrum systems

    Energy Technology Data Exchange (ETDEWEB)

    Moradi, Hussein; Farhang, Behrouz; Subramanian, Vijayarangam

    2015-12-15

    Self-generating fault-tolerant keys for use in spread-spectrum systems are disclosed. At a communication device, beacon signals are received from another communication device and impulse responses are determined from the beacon signals. The impulse responses are circularly shifted to place a largest sample at a predefined position. The impulse responses are converted to a set of frequency responses in a frequency domain. The frequency responses are shuffled with a predetermined shuffle scheme to develop a set of shuffled frequency responses. A set of phase differences is determined as a difference between an angle of the frequency response and an angle of the shuffled frequency response at each element of the corresponding sets. Each phase difference is quantized to develop a set of secret-key quantized phases and a set of spreading codes is developed wherein each spreading code includes a corresponding phase of the set of secret-key quantized phases.
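
    The processing chain described above (align the impulse response on its largest sample, transform to the frequency domain, shuffle, take phase differences, quantize) can be sketched as follows. This is a minimal illustration and not the disclosed implementation: the naive DFT, the choice of four quantization levels, the example permutation and all function names are assumptions, and the final construction of spreading codes from the quantized phases is omitted.

```python
import cmath
import math


def circular_shift_peak(h, position=0):
    """Circularly shift h so its largest-magnitude sample sits at `position`."""
    peak = max(range(len(h)), key=lambda i: abs(h[i]))
    shift = (position - peak) % len(h)
    return [h[(i - shift) % len(h)] for i in range(len(h))]


def dft(x):
    """Naive discrete Fourier transform (kept dependency-free on purpose)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def quantized_key_phases(h, permutation, levels=4, position=0):
    """Derive quantized phase differences from an impulse response h.

    permutation is the predetermined shuffle shared by both devices
    (hypothetical here); levels is the number of quantization steps.
    """
    aligned = circular_shift_peak(h, position)
    freq = dft(aligned)
    shuffled = [freq[p] for p in permutation]
    step = 2 * math.pi / levels
    phases = []
    for f, s in zip(freq, shuffled):
        diff = (cmath.phase(f) - cmath.phase(s)) % (2 * math.pi)
        index = int(round(diff / step)) % levels  # nearest quantization level
        phases.append(index * step)
    return phases


if __name__ == "__main__":
    h = [0.1, 1.0, 0.3 + 0.2j, 0.05]  # made-up impulse response
    print(quantized_key_phases(h, permutation=[1, 2, 3, 0]))
```

    Aligning on the peak makes the result independent of an unknown circular delay in the measured impulse response, and coarse quantization gives some tolerance to small measurement differences between the two devices.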

  20. Methods and apparatuses for self-generating fault-tolerant keys in spread-spectrum systems

    Science.gov (United States)

    Moradi, Hussein; Farhang, Behrouz; Subramanian, Vijayarangam

    2015-12-22

    Self-generating fault-tolerant keys for use in spread-spectrum systems are disclosed. At a communication device, beacon signals are received from another communication device and impulse responses are determined from the beacon signals. The impulse responses are circularly shifted to place a largest sample at a predefined position. The impulse responses are converted to a set of frequency responses in a frequency domain. The frequency responses are shuffled with a predetermined shuffle scheme to develop a set of shuffled frequency responses. A set of phase differences is determined as a difference between an angle of the frequency response and an angle of the shuffled frequency response at each element of the corresponding sets. Each phase difference is quantized to develop a set of secret-key quantized phases and a set of spreading codes is developed wherein each spreading code includes a corresponding phase of the set of secret-key quantized phases.

  1. Fault tolerant operation of switched reluctance machine

    Science.gov (United States)

    Wang, Wei

    The energy crisis and environmental challenges have driven industry towards more energy efficient solutions. With nearly 60% of electricity consumed by various electric machines in the industry sector, advancement in the efficiency of the electric drive system is of vital importance. An adjustable speed drive system (ASDS) provides excellent speed regulation and dynamic performance as well as dramatically improved system efficiency compared with conventional motors without electronic drives. Industry has witnessed tremendous growth in ASDS applications, not only as a driving force but also as an electric auxiliary system replacing bulky and low efficiency auxiliary hydraulic and mechanical systems. With the vast penetration of ASDS, its fault tolerant operation capability is more widely recognized as an important feature of drive performance, especially for aerospace and automotive applications and other industrial drive applications demanding high reliability. The Switched Reluctance Machine (SRM), a low cost, highly reliable electric machine with fault tolerant operation capability, has drawn substantial attention in the past three decades. Nevertheless, the SRM is not free of faults. Certain faults such as converter faults, sensor faults, winding shorts, eccentricity and position sensor faults are common to all ASDS. In this dissertation, a thorough understanding of various faults and their influence on the transient and steady state performance of the SRM is developed via simulation and experimental study, providing the necessary knowledge for fault detection and post fault management. Lumped parameter models are established for fast real time simulation and drive control. Based on the behavior of the faults, a fault detection scheme is developed for the purpose of fast and reliable fault diagnosis. In order to improve the SRM power and torque capacity under faults, maximum torque per ampere excitation is conceptualized and validated through theoretical analysis and experiments. With the proposed optimal waveform, torque production is greatly improved under the same Root Mean Square (RMS) current constraint. Additionally, position sensorless operation methods under phase faults are investigated to account for the combination of physical position sensor and phase winding faults. A comprehensive solution for position sensorless operation under single and multiple phase faults is proposed and validated through experiments. Continuous position sensorless operation with seamless transition between various numbers of phase faults is achieved.

  2. A Byzantine-Fault Tolerant Self-Stabilizing Protocol for Distributed Clock Synchronization Systems

    Science.gov (United States)

    Malekpour, Mahyar R.

    2006-01-01

    Embedded distributed systems have become an integral part of safety-critical computing applications, necessitating system designs that incorporate fault tolerant clock synchronization in order to achieve ultra-reliable assurance levels. Many efficient clock synchronization protocols do not, however, address Byzantine failures, and most protocols that do tolerate Byzantine failures do not self-stabilize. The Byzantine self-stabilizing clock synchronization algorithms that exist in the literature are based either on unjustifiably strong assumptions about initial synchrony of the nodes or on the existence of a common pulse at the nodes. The Byzantine self-stabilizing clock synchronization protocol presented here does not rely on any assumptions about the initial state of the clocks. Furthermore, there is neither a central clock nor an externally generated pulse system. The proposed protocol converges deterministically, is scalable, and self-stabilizes in a short amount of time. The convergence time is linear with respect to the self-stabilization period. Proofs of the correctness of the protocol as well as the results of formal verification efforts are reported.

  3. The MAFT architecture for distributed fault tolerance

    Energy Technology Data Exchange (ETDEWEB)

    Kieckhafer, R.M.; Walter, C.J.; Finn, A.M.; Thambidurai, P.M.

    1988-04-01

    This paper describes the Multicomputer Architecture for Fault-Tolerance (MAFT), a distributed system designed to provide extremely reliable computation in real-time control systems. MAFT is based on the physical and functional partitioning of executive functions from application functions. The implementation of the executive functions in a special-purpose hardware processor allows the fault-tolerance functions to be transparent to the application programs and minimizes overhead. Byzantine Agreement and Approximate Agreement algorithms are employed for critical system parameters. MAFT supports the use of multiversion hardware and software to tolerate built-in or generic faults. Graceful degradation and restoration of the application workload is permitted in response to the exclusion and readmission of nodes, respectively.
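
    To make the idea of approximate agreement concrete, the sketch below shows one common textbook voting function, a fault-tolerant (trimmed) mean. The abstract does not say which voting function MAFT uses, so this is an assumption for illustration; `fault_tolerant_mean` and `max_faults` are hypothetical names.

```python
def fault_tolerant_mean(values, max_faults):
    """Average the readings after discarding the extremes.

    Dropping the max_faults lowest and max_faults highest readings means
    that up to max_faults arbitrary (possibly malicious) values cannot pull
    the result outside the range spanned by the correct readings.
    """
    if len(values) <= 2 * max_faults:
        raise ValueError("need more than 2 * max_faults readings")
    trimmed = sorted(values)[max_faults:len(values) - max_faults]
    return sum(trimmed) / len(trimmed)


if __name__ == "__main__":
    # Three plausible sensor readings and one wildly faulty one.
    readings = [10.0, 10.2, 9.9, 1000.0]
    print(fault_tolerant_mean(readings, max_faults=1))
```

    Each node applying the same function to its own (slightly different) view of the readings obtains a value close to that of every other correct node, which is the goal of approximate rather than exact agreement.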

  4. Design of fault tolerant control system for individual blade control helicopters

    Science.gov (United States)

    Tamayo, Sergio

    This dissertation presents the development of a fault tolerant control scheme for helicopters fitted with individually controlled blades. This novel approach attempts to improve fault tolerant capabilities of helicopter control system by increasing control redundancy using additional actuators for individual blade input and software re-mixing to obtain nominal or close to nominal conditions under failure. An advanced interactive simulation environment has been developed including modeling of sensor failure, swashplate actuator failure, individual blade actuator failure, and blade delamination to support the design, testing, and evaluation of the control laws. This simulation environment is based on the blade element theory for the calculation of forces and moments generated by the main rotor. This discretized model allows for individual blade analysis, which in turn allows measuring the consequences of a stuck blade, or loss of the surface area of the blade itself, with respect to the dynamics of the whole helicopter. The control laws are based on non-linear dynamic inversion and artificial neural network augmentation, which is a mix of linear and nonlinear methods that compensates for model inaccuracies due to linearization or failure. A stability analysis based on the Lyapunov function approach has shown that bounded tracking error is guaranteed, and under specific circumstances, global stability is guaranteed as well. An analysis over the degrees of freedom of the mechanical system and its impact over the helicopter handling qualities is also performed to measure the degree of redundancy achieved with the addition of individual blade actuators as compared to a classic swashplate helicopter configuration. 
    Mathematical analysis and numerical simulation, using reconfiguration of the individual blade control under failure, have shown that this control architecture can potentially improve the survivability of the aircraft and reduce pilot workload under failure conditions.

  5. Diagnosis and Fault-tolerant Control, 2nd edition

    DEFF Research Database (Denmark)

    Blanke, Mogens; Kinnaert, Michel; Lunze, Jan; Staroswiecki, Marcel

    Fault-tolerant control aims at a graceful degradation of the behaviour of automated systems in case of faults. It satisfies the industrial demand for enhanced availability and safety, in contrast to traditional reactions to faults that bring about sudden shutdowns and loss of availability. The book...... presents effective model-based analysis and design methods for fault diagnosis and fault-tolerant control. Architectural and structural models are used to analyse the propagation of the fault through the process, to test the fault detectability and to find the redundancies in the process that can be used...... to ensure fault tolerance. Design methods for diagnostic systems and fault-tolerant controllers are presented for processes that are described by analytical models, by discrete-event models or that can be dealt with as quantised systems. Five case studies on pilot processes show the applicability of...

  6. Coordinated Fault Tolerance for High-Performance Computing

    Energy Technology Data Exchange (ETDEWEB)

    Dongarra, Jack; Bosilca, George; et al.

    2013-04-08

    Our work to meet our goal of end-to-end fault tolerance has focused on two areas: (1) improving fault tolerance in various software currently available and widely used throughout the HEC domain and (2) using fault information exchange and coordination to achieve holistic, systemwide fault tolerance and understanding how to design and implement interfaces for integrating fault tolerance features for multiple layers of the software stack, from the application, math libraries, and programming language runtime to other common system software such as job schedulers, resource managers, and monitoring tools.

  7. Modular, Fault-Tolerant Electronics Supporting Space Exploration Project

    Data.gov (United States)

    National Aeronautics and Space Administration — Modern electronic systems tolerate only as many point failures as there are redundant system copies, using mere macro-scale redundancy. Fault Tolerant Electronics...

  8. Simulation Framework for Evaluation of Fault Tolerant Large Dynamic Distributed System

    Directory of Open Access Journals (Sweden)

    Sanjay Bansal

    2012-08-01

    Java-based simulators are valuable in the design and development of distributed systems for evaluating the dependability of algorithms, owing to their efficiency and scalability, and they allow realistic simulation scenarios to be designed. In this work we propose Saturn, a multithreaded, process-oriented simulation framework designed for modeling large-scale distributed systems. It provides realistic simulation of a wide range of distributed system technologies and offers an innovative solution to the problem of evaluating the dependability characteristics of distributed systems. Our solution is based on several proposed extensions to the simulation model of the MONARC simulation framework. These extensions add fault tolerance and system orchestration mechanisms in order to assess the reliability and availability of distributed systems. The extended simulation model includes the components necessary to describe various actual failure situations and provides mechanisms to evaluate different strategies for replication and redundancy procedures, as well as security enforcement mechanisms. The simulator also evaluates the major QoS metrics of the heartbeat-based adaptive failure detection mechanism.
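
The abstract mentions a heartbeat-based adaptive failure detection mechanism without giving its details. A minimal sketch of one way such a detector can adapt is shown below: the timeout is derived from the recently observed heartbeat intervals rather than fixed in advance. The class name, the window size, and the safety parameters are assumptions for illustration and do not describe the Saturn or MONARC implementation.

```python
from collections import deque
from statistics import mean, pstdev


class AdaptiveHeartbeatDetector:
    """Suspects a peer when a heartbeat is later than an adaptive timeout."""

    def __init__(self, window=10, safety_factor=3.0, min_margin=0.05):
        self.intervals = deque(maxlen=window)  # recent inter-arrival times
        self.last_arrival = None
        self.safety_factor = safety_factor     # assumed tuning parameter
        self.min_margin = min_margin           # floor for the safety margin

    def heartbeat(self, now):
        """Record a heartbeat received at time `now` (seconds)."""
        if self.last_arrival is not None:
            self.intervals.append(now - self.last_arrival)
        self.last_arrival = now

    def timeout(self):
        """Mean interval plus a margin that grows with observed jitter."""
        if not self.intervals:
            return None
        margin = max(self.safety_factor * pstdev(self.intervals), self.min_margin)
        return mean(self.intervals) + margin

    def suspects(self, now):
        """True if the peer has been silent for longer than the timeout."""
        limit = self.timeout()
        if limit is None:
            return False
        return now - self.last_arrival > limit


if __name__ == "__main__":
    detector = AdaptiveHeartbeatDetector()
    for t in (0.0, 1.0, 2.0, 3.0):   # hypothetical arrival times
        detector.heartbeat(t)
    print(detector.suspects(3.5), detector.suspects(5.0))
```

Passing timestamps explicitly instead of reading a clock keeps the sketch deterministic, which is also how a simulator would normally drive such a component.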

  9. Task Mapping and Bandwidth Reservation for Mixed Hard/Soft Fault-Tolerant Embedded Systems

    DEFF Research Database (Denmark)

    Saraswat, Prabhat Kumar; Pop, Paul; Madsen, Jan

    reserved for the servers determines the quality of service (QoS) for soft tasks. CBS enforces temporal isolation, such that soft task overruns do not affect the timing guarantees of hard tasks. Transient faults in hard tasks are tolerated using checkpointing with rollback recovery. We have proposed a Tabu...... Search-based approach for task mapping and CBS bandwidth reservation, such that the deadlines for the hard tasks are satisfied, even in the case of transient faults, and the QoS for the soft tasks is maximized. Researchers have used fixed execution time models, such as the worst-case execution times for...
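
Checkpointing with rollback recovery, which the abstract uses for transient faults in hard tasks, can be illustrated with a short sketch. It assumes that a task is split into segments, that a checkpoint is taken after each successful segment, and that a fault is detected at the end of the segment in which it occurs. The function name and the `faulty_attempts` fault-injection argument are hypothetical and not part of the paper.

```python
import copy


def run_with_rollback(segments, initial_state, faulty_attempts=()):
    """Run task segments in order, re-executing a segment after a fault.

    segments        -- list of functions mapping a state to a new state
    faulty_attempts -- indices of execution attempts hit by a transient
                       fault (fault injection, for illustration only)
    Returns the final state and the number of segment executions.
    """
    checkpoint = copy.deepcopy(initial_state)
    attempt = 0
    executions = 0
    for segment in segments:
        while True:
            # Always start from the last saved checkpoint.
            state = segment(copy.deepcopy(checkpoint))
            executions += 1
            faulted = attempt in faulty_attempts
            attempt += 1
            if not faulted:
                checkpoint = state   # commit: save a new checkpoint
                break
            # Fault detected: discard `state` and roll back.
    return checkpoint, executions


if __name__ == "__main__":
    steps = [lambda s: s + [1], lambda s: s + [2], lambda s: s + [3]]
    print(run_with_rollback(steps, [], faulty_attempts={1}))
```

Each fault costs one extra segment execution rather than a full restart, which is why the number and placement of checkpoints matters when deadlines must hold under a bounded number of faults.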

  10. Fault tolerance issues in nanoelectronics

    OpenAIRE

    Spagocci, S.

    2008-01-01

    The astonishing success story of microelectronics cannot go on indefinitely. In fact, once devices reach the few-atom scale (nanoelectronics), transient quantum effects are expected to impair their behaviour. Fault tolerant techniques will then be required. The aim of this thesis is to investigate the problem of transient errors in nanoelectronic devices. Transient error rates for a selection of nanoelectronic gates, based upon quantum cellular automata and single electron devi...

  11. Electrical Steering of Vehicles - Fault-tolerant Analysis and Design

    DEFF Research Database (Denmark)

    Blanke, Mogens; Thomsen, Jesper Sandberg

    solutions and still meet strict requirements to functional safety. The paper applies graph-based analysis of functional system structure to find a novel fault-tolerant architecture for an electrical steering where a dedicated AC-motor design and cheap voltage measurements ensure ability to detect all...... relevant faults. The paper shows how active control reconfiguration can accommodate all critical faults and the fault-tolerant abilities are demonstrated on a warehouse truck hardware....

  12. Fault tolerant control schemes using integral sliding modes

    CERN Document Server

    Hamayun, Mirza Tariq; Alwi, Halim

    2016-01-01

    The key attribute of a Fault Tolerant Control (FTC) system is its ability to maintain overall system stability and acceptable performance in the face of faults and failures within the feedback system. In this book Integral Sliding Mode (ISM) Control Allocation (CA) schemes for FTC are described, which have the potential to maintain close to nominal fault-free performance (for the entire system response), in the face of actuator faults and even complete failures of certain actuators. Broadly, two approaches are described. The first is an ISM controller based around a model of the plant, with the aim of creating a nonlinear fault tolerant feedback controller whose closed-loop performance is established during the design process. The second approach involves retro-fitting an ISM scheme to an existing feedback controller to introduce fault tolerance. This may be advantageous from an industrial perspective, because fault tolerance can be introduced without changing the existing control loops. A high fidelity benchmark model of a large transport aircraft is u...

  13. Advanced Information Processing System (AIPS)-based fault tolerant avionics architecture for launch vehicles

    Science.gov (United States)

    Lala, Jaynarayan H.; Harper, Richard E.; Jaskowiak, Kenneth R.; Rosch, Gene; Alger, Linda S.; Schor, Andrei L.

    1990-01-01

    An avionics architecture for the advanced launch system (ALS) that uses validated hardware and software building blocks developed under the advanced information processing system program is presented. The AIPS for ALS architecture defined is preliminary, and reliability requirements can be met by the AIPS hardware and software building blocks that are built using the state-of-the-art technology available in the 1992-93 time frame. The level of detail in the architecture definition reflects the level of detail available in the ALS requirements. As the avionics requirements are refined, the architecture can also be refined and defined in greater detail with the help of analysis and simulation tools. A useful methodology is demonstrated for investigating the impact of the avionics suite to the recurring cost of the ALS. It is shown that allowing the vehicle to launch with selected detected failures can potentially reduce the recurring launch costs. A comparative analysis shows that validated fault-tolerant avionics built out of Class B parts can result in lower life-cycle-cost in comparison to simplex avionics built out of Class S parts or other redundant architectures.

  14. Synthesis of Fault-Tolerant Schedules with Transparency/Performance Trade-offs for Distributed Embedded Systems

    DEFF Research Database (Denmark)

    Izosimov, Viacheslav; Pop, Paul; Eles, Petru; Peng, Zebo

    application. We propose a novel algorithm for the synthesis of fault-tolerant schedules that can handle the transparency/performance trade-offs imposed by the designer, and makes use of the fault-occurrence information to reduce the overhead due to fault tolerance. We model the application as a conditional...... process graph, where the fault occurrence information is represented as conditional edges and the transparent recovery is captured using synchronization nodes....

  15. USAGE OF STANDARD PERSONAL COMPUTER PORTS FOR DESIGNING OF THE DOUBLE REDUNDANT FAULT-TOLERANT COMPUTER CONTROL SYSTEMS

    OpenAIRE

    Rafig SAMEDOV; ÇİFTÇİ, Ahmet

    2005-01-01

    In this study, the ports of standard personal computers are investigated for the design of fault-tolerant control systems; different structure versions are designed and a method for choosing an optimal structure is suggested. First, the ÇİFTYAK system is defined and its working principle is determined. Then, the data transmission ports of standard personal computers are classified and analyzed. After that, the structure ver...

  16. Validation Methods Research for Fault-Tolerant Avionics and Control Systems Sub-Working Group Meeting. CARE 3 peer review

    Science.gov (United States)

    Trivedi, K. S. (Editor); Clary, J. B. (Editor)

    1980-01-01

    A computer aided reliability estimation procedure (CARE 3), developed to model the behavior of ultrareliable systems required by flight-critical avionics and control systems, is evaluated. The mathematical models, numerical method, and fault-tolerant architecture modeling requirements are examined, and the testing and characterization procedures are discussed. Recommendations aimed at enhancing CARE 3 are presented; in particular, the need for a better exposition of the method and the user interface is emphasized.

  17. System Wide Joint Position Sensor Fault Tolerance in Robot Systems Using Cartesian Accelerometers

    Science.gov (United States)

    Aldridge, Hal A.; Juang, Jer-Nan

    1997-01-01

    Joint position sensors are necessary for most robot control systems. A single position sensor failure in a normal robot system can greatly degrade performance. This paper presents a method to obtain position information from Cartesian accelerometers without integration. Depending on the number and location of the accelerometers, the proposed system can tolerate the loss of multiple position sensors. A solution technique suitable for real-time implementation is presented. Simulations were conducted using 5 triaxial accelerometers to recover from the loss of up to 4 joint position sensors on a 7 degree of freedom robot moving in general three dimensional space. The simulations show good estimation performance using non-ideal accelerometer measurements.

  18. Microcontroller-Based Fault Tolerant Data Acquisition System For Air Quality Monitoring And Control Of Environmental Pollution

    OpenAIRE

    Tochukwu Chiagunye; Eze Aru Okereke; Ilo Somtoochukwu

    2015-01-01

    The design applied passive fault tolerance to a microcontroller-based data acquisition system to achieve the stated considerations: redundant sensors and microcontrollers with associated circuitry were designed and implemented to enable measurement of pollutant concentration information from chimney vents in two industries. Microsoft Visual Basic was used to develop a data mining tool which implemented an underlying artificial neural network model for forecasting pollutant concent...

  19. Thermoelectric-Driven Sustainable Sensing and Actuation Systems for Fault-Tolerant Nuclear Incidents

    Energy Technology Data Exchange (ETDEWEB)

    Longtin, Jon [Stony Brook Univ., NY (United States)

    2016-02-08

    The Fukushima Daiichi nuclear incident in March 2011 represented an unprecedented stress test on the safety and backup systems of a nuclear power plant. The lack of reliable information from key components due to station blackout was a serious setback, leaving sensing, actuation, and reporting systems unable to communicate, and safety was compromised. Although there were several independent backup power sources for required safety function on site, ultimately the batteries were drained and the systems stopped working. If, however, key system components were instrumented with self-powered sensing and actuation packages that could report indefinitely on the status of the system, then critical system information could be obtained while providing core actuation and control during off-normal status for as long as needed. This research project focused on the development of such a self-powered sensing and actuation system. The electrical power is derived from intrinsic heat in the reactor components, which is both reliable and plentiful. The key concept was based around using thermoelectric generators that can be integrated directly onto key nuclear components, including pipes, pump housings, heat exchangers, reactor vessels, and shielding structures, as well as secondary-side components. Thermoelectric generators are solid-state devices capable of converting heat directly into electricity. They are commercially available technology. They are compact, have no moving parts, are silent, and have excellent reliability. The key components to the sensor package include a thermoelectric generator (TEG), microcontroller, signal processing, and a wireless radio package, environmental hardening to survive radiation, flooding, vibration, mechanical shock (explosions), corrosion, and excessive temperature. 
    The energy harvested from the intrinsic heat of reactor components can be then made available to power sensors, provide bi-directional communication, recharge batteries for other safety systems, etc. Such an approach is intrinsically fault tolerant: in the event that system temperatures increase, the amount of available energy will increase, which will make more power available for applications. The system can also be used during normal conditions to provide enhanced monitoring of key system components.

  1. A study on quantification of unavailability of DPPS with fault tolerant techniques considering fault tolerant techniques' characteristics

    International Nuclear Information System (INIS)

    With the improvement of digital technologies, digital I&C systems have included a greater variety of fault tolerant techniques than conventional analog I&C systems, in order to increase fault detection and to help the system safely perform the required functions in spite of the presence of faults. Therefore, in the reliability evaluation of digital systems, the fault tolerant techniques (FTTs) and their fault coverage must be considered. Several studies have modeled the reliability of digital systems in order to account for the effects of FTTs. Building on a literature survey, this research attempts to develop a model to evaluate the plant reliability of the digital plant protection system (DPPS) with fault tolerant techniques, considering detection and process characteristics and human errors. Sensitivity analysis is performed to ascertain important variables from the fault management coverage and unavailability based on the proposed model.

  2. Industrial Cost-Benefit Assessment for Fault-tolerant Control Systems

    DEFF Research Database (Denmark)

    Thybo, C.; Blanke, M.

    1998-01-01

    against failure. The paper describes the assessments needed to find the right path for new industrial designs. The economic decisions in the design phase are discussed: cost of different failures, profits associated with available benefits, investments needed for development and life-time support. The...... objective of this paper is to help, in the early product development stage, to find the economically most suitable scheme. A salient result is that with increased customer awareness of total cost of ownership, new products can benefit significantly from applying fault tolerant control principles....

  3. Dynamic and fault-tolerant cluster management

    OpenAIRE

    Gidenstam, Anders; Koldehofe, Boris; Papatriantafilou, Marina; Tsigas, Philippas

    2005-01-01

    Recent decentralised event-based systems have focused on providing event delivery which scales with increasing number of processes. While the main focus of research has been on ensuring that processes maintain only a small amount of information on maintaining membership and routing, an important factor in achieving scalability for event-based peer-to-peer dissemination system is the number of events disseminated at the same time. This work presents a dynamic and fault tolerant cluster managem...

  4. Fault-tolerant Supervisory Control

    DEFF Research Database (Denmark)

    Izadi-Zamanabadi, Roozbeh

    could be increased through enhancing control systems' ability to perform on-line fault detection and reconfiguration when a fault occurs and before a safety system shuts down the entire process. The main contributions of this research effort are development and experimentation with methodologies for....... The first aims at constructing the decision logic in the form of a "language". This language is obtained as a direct result of the component-based approach presented in this thesis. This approach is based on the definition of a functional component, component placement in a control system hierarchy...... behavior during non-faulty as well as faulty situations. Using the structural model of the system, it is illustrated how to perform sensor information fusion when a sensor fault occurs. It is finally shown how the decision logic of the supervisor for the benchmark is designed. Main results in this...

  5. Evaluation of Simple Causal Message Logging for Large-Scale Fault Tolerant HPC Systems

    Energy Technology Data Exchange (ETDEWEB)

    Bronevetsky, G; Meneses, E; Kale, L V

    2011-02-25

    The era of petascale computing brought machines with hundreds of thousands of processors. The next generation of exascale supercomputers will make available clusters with millions of processors. In those machines, mean time between failures will range from a few minutes to a few tens of minutes, making the crash of a processor the common case, instead of a rarity. Parallel applications running on those large machines will need to simultaneously survive crashes and maintain high productivity. To achieve that, fault tolerance techniques will have to go beyond checkpoint/restart, which requires all processors to roll back in case of a failure. Incorporating some form of message logging will provide a framework where only a subset of processors are rolled back after a crash. In this paper, we discuss why a simple causal message logging protocol seems a promising alternative to provide fault tolerance in large supercomputers. As opposed to pessimistic message logging, it has low latency overhead, especially in collective communication operations. Besides, it saves messages when more than one thread is running per processor. Finally, we demonstrate that a simple causal message logging protocol has a faster recovery and a low performance penalty when compared to checkpoint/restart. Running NAS Parallel Benchmarks (CG, MG and BT) on 1024 processors, simple causal message logging has a latency overhead below 5%.
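
The point that message logging lets only the crashed process roll back can be shown with a small sketch. It is a generic illustration of log replay, not the causal protocol evaluated in the paper: it assumes that the process is deterministic, that its messages are logged in delivery order somewhere that survives the crash, and that its checkpoint records how many messages had been applied. All names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Worker:
    """A deterministic process whose state depends only on delivered messages."""
    total: int = 0
    delivered: int = 0   # number of messages applied so far

    def deliver(self, value):
        self.total += value
        self.delivered += 1


def recover(checkpoint, message_log):
    """Rebuild a crashed worker from its checkpoint by replaying the log.

    Only messages delivered after the checkpoint are replayed; the senders
    keep running and are never rolled back.
    """
    worker = Worker(checkpoint.total, checkpoint.delivered)
    for value in message_log[checkpoint.delivered:]:
        worker.deliver(value)
    return worker


if __name__ == "__main__":
    log = [5, 7, 11, 13]            # hypothetical messages, in delivery order
    worker = Worker()
    for value in log[:2]:
        worker.deliver(value)
    checkpoint = Worker(worker.total, worker.delivered)
    for value in log[2:]:
        worker.deliver(value)       # work done after the checkpoint
    recovered = recover(checkpoint, log)   # pretend `worker` crashed here
    print(recovered == worker)
```

With coordinated checkpoint/restart, every process would return to its own checkpoint and redo that work; here the cost of recovery is limited to replaying the tail of one process's log.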

  6. Fault Tolerant Ethernet Based Network for Time Sensitive Applications in Electrical Power Distribution Systems

    Directory of Open Access Journals (Sweden)

    Leos Bohac

    2013-01-01

    The paper analyses and experimentally verifies the deployment of Ethernet-based network technology to enable fault-tolerant and timely exchange of data among a number of high-voltage protective relays. The relays use a proprietary serial communication line to exchange real-time data on the state of their high-voltage circuitry, facilitating fast protection switching in case of critical failures. The digital serial signal is first fed into a PCM multiplexer, where it is mapped to the corresponding E1 (2 Mbit/s) time-division-multiplexed signal. The resulting E1 frames are then packetized and sent through the Ethernet control LAN to the opposite PCM demultiplexer, where the same processing is done in reverse before the signal is sent into the opposite protective relay. The challenge of this setup is to assure very timely delivery of the control information between protective relays even in cases of potential failures of the Ethernet network itself. The tolerance of the Ethernet network to faults is assured using the widespread per-VLAN Rapid Spanning Tree Protocol, potentially extended by 1+1 PCM protection as a valuable option.

  7. Fault-Tolerant Process Control Methods and Applications

    CERN Document Server

    Mhaskar, Prashant; Christofides, Panagiotis D

    2013-01-01

    Fault-Tolerant Process Control focuses on the development of general, yet practical, methods for the design of advanced fault-tolerant control systems; these ensure an efficient fault detection and a timely response to enhance fault recovery, prevent faults from propagating or developing into total failures, and reduce the risk of safety hazards. To this end, methods are presented for the design of advanced fault-tolerant control systems for chemical processes which explicitly deal with actuator/controller failures and sensor faults and data losses. Specifically, the book puts forward: a framework for detection, isolation and diagnosis of actuator and sensor faults for nonlinear systems; controller reconfiguration and safe-parking-based fault-handling methodologies; integrated data- and model-based fault-detection and isolation and fault-tolerant control methods; methods for handling sensor faults and data losses; and ...

  8. Real-time fault diagnosis and fault-tolerant control

    OpenAIRE

    Gao, Zhiwei; Ding, Steven X.; Cecati, Carlo

    2015-01-01

    This "Special Section on Real-Time Fault Diagnosis and Fault-Tolerant Control" of the IEEE Transactions on Industrial Electronics is motivated to provide a forum for academic and industrial communities to report recent theoretic/application results in real-time monitoring, diagnosis, and fault-tolerant design, and exchange the ideas about the emerging research direction in this field. Twenty-three papers were eventually selected through a strict peer-reviewed procedure, which represent the mo...

  9. Fault Tolerant Control of Induction Motor

    OpenAIRE

    Khalaf Salloum Gaeid

    2011-01-01

    The principle of vector control of electrical machines is to control both the magnitude and the phase of each phase current and voltage. A MATLAB/Simulink simulation has been performed to assess the operating features of the proposed scheme. A Proportional Integral (PI) speed controller is designed in this paper. The test response of the developed variable speed drive, along with the simulated response, is given and discussed in detail for torque and speed. Fault tolerant fundamental is applied to the system...

  10. Enhancement of Fault Tolerance in Cloud Computing

    OpenAIRE

    Pushpanjali Gupta; Rasmi Ranjan Patra

    2014-01-01

    In recent years, researchers have been trying to run scientific applications in the cloud in order to decrease infrastructure cost, widen the reach of teams, and ultimately encourage innovative ideas for applications. But the cloud is still not as reliable or controllable as the grid. So in the evolving cloud computing environment there is a great need for fault tolerance mechanisms that let the system work effectively even in the presence of failures. Moreover Big Organizations ar...

  11. RADIC II : a fault tolerant architecture with flexible dynamic redundancy

    OpenAIRE

    Santos, Guna Alexander Silva dos; Rexachs del Rosario, Dolores Isabel

    2007-01-01

    The demand for computational power has been leading the improvement of the High Performance Computing (HPC) area, generally represented by the use of distributed systems like clusters of computers running parallel applications. In this area, fault tolerance plays an important role in providing high availability by isolating the application from the effects of faults. Performance and availability form an undissociable binomial for some kind of applications. Therefore, the fault tolerant solut...

  12. Fault Tolerance-Challenges, Techniques and Implementation in Cloud Computing

    Directory of Open Access Journals (Sweden)

    Anju Bala

    2012-01-01

    Fault tolerance is a major concern to guarantee availability and reliability of critical services as well as application execution. In order to minimize failure impact on the system and application execution, failures should be anticipated and proactively handled. Fault tolerance techniques are used to predict these failures and take an appropriate action before failures actually occur. This paper discusses the existing fault tolerance techniques in cloud computing based on their policies, tools used and research challenges. Cloud virtualized system architecture has been proposed. In the proposed system autonomic fault tolerance has been implemented. The experimental results demonstrate that the proposed system can deal with various software faults for server applications in a cloud virtualized environment.

  13. Designing fault-tolerant real-time computer systems with diversified bus architecture for nuclear power plants

    International Nuclear Information System (INIS)

    Fault-tolerant real-time computer (FT-RTC) systems are widely used to perform safe operation of nuclear power plants (NPP) and safe shutdown in the event of any untoward situation. Design requirements for such systems need high reliability, availability, computational ability for measurement via sensors, control action via actuators, data communication and human interface via keyboard or display. All these attributes of FT-RTC systems are required to be implemented using best known methods such as redundant system design using diversified bus architecture to avoid common cause failure, fail-safe design to avoid unsafe failure and diagnostic features to validate system operation. In this context, the system designer must select efficient as well as highly reliable diversified bus architecture in order to realize fault-tolerant system design. This paper presents a comparative study between CompactPCI bus and Versa Module Eurocard (VME) bus architecture for designing FT-RTC systems with switch over logic system (SOLS) for NPP. (author)

  14. Fault-tolerant architectures for superconducting qubits

    International Nuclear Information System (INIS)

    In this short review, I draw attention to new developments in the theory of fault tolerance in quantum computation that may give concrete direction to future work in the development of superconducting qubit systems. The basics of quantum error-correction codes, which I will briefly review, have not significantly changed since their introduction 15 years ago. But an interesting picture has emerged of an efficient use of these codes that may put fault-tolerant operation within reach. It is now understood that two-dimensional surface codes, close relatives of the original toric code of Kitaev, can be adapted as shown by Raussendorf and Harrington to effectively perform logical gate operations in a very simple planar architecture, with error thresholds for fault-tolerant operation simulated to be 0.75%. This architecture uses topological ideas in its functioning, but it is not 'topological quantum computation'-there are no non-abelian anyons in sight. I offer some speculations on the crucial pieces of superconducting hardware that could be demonstrated in the next couple of years that would be clear stepping stones towards this surface-code architecture.

  15. Nonlinear, Adaptive and Fault-tolerant Control for Electro-hydraulic Servo Systems

    DEFF Research Database (Denmark)

    Choux, Martin

    -tolerant control for a representative electro hydraulic servo controlled motion system. The thesis extends existing models of hydraulic systems by considering more detailed dynamics in the servo valve and in the friction inside the hydraulic cylinder. It identifies the model parameters using experimental data from a...

  16. A Dynamic Effective Fault Tolerance System in Robotic Manipulator using a Hybrid Neural Network based Controller

    Directory of Open Access Journals (Sweden)

    G. Jiji

    2014-04-01

    Robot manipulators play an important role in the automobile industry, where they are mainly used in gas welding applications and in the manufacturing and assembly of motor parts. On a complex trajectory, the speed of the robot manipulator is affected at each joint. For that reason, it is necessary to analyze the noise and vibration of the robot's joints in order to predict faults and improve the control precision of the robotic manipulator. In this study we propose a new fault detection system for robot manipulators. The proposed hybrid fault detection system is based on a fuzzy support vector machine (SVM) and Artificial Neural Networks (ANNs). In this system the decoupled joints are identified and corrected using the fuzzy SVM, with non-linear signals used for the complete process and treatment; the ANNs are used to detect the free-swinging and locked joints of the robot, and two types of neural predictors are also employed in the proposed adaptive neural network structure. The simulation results of the hybrid controller demonstrate the feasibility and performance of the methodology.

  17. Microcontroller-Based Fault Tolerant Data Acquisition System For Air Quality Monitoring And Control Of Environmental Pollution

    Directory of Open Access Journals (Sweden)

    Tochukwu Chiagunye

    2015-08-01

    The design applied passive fault tolerance to a microcontroller-based data acquisition system to achieve the stated considerations: redundant sensors and microcontrollers with associated circuitry were designed and implemented to enable measurement of pollutant concentrations from chimney vents in two industries. Microsoft Visual Basic was used to develop a data mining tool that implemented an underlying artificial neural network (ANN) model for forecasting pollutant concentrations for future time periods. The feed-forward back-propagation method was used to train the ANN model with a training data set, while a decision tree algorithm was used to select an optimal output result for the model from its two output neurons.

  18. Application-Specific Fault Tolerance via Data Access Characterization

    Energy Technology Data Exchange (ETDEWEB)

    Ali, Nawab; Krishnamoorthy, Sriram; Govind, Niranjan; Kowalski, Karol; Sadayappan, Ponnuswamy

    2011-08-30

    Recent trends in semiconductor technology and supercomputer design predict an increasing probability of faults during an application's execution. Designing an application that is resilient to system failures requires careful evaluation of the impact of various approaches on preserving key application state. In this paper, we present our experiences in an ongoing effort to make a large computational chemistry application fault tolerant. We construct the data access signatures of key application modules to evaluate alternative fault tolerance approaches. We present the instrumentation methodology, characterization of the application modules, and evaluation of fault tolerance techniques using the information collected. The application signatures developed capture application characteristics not traditionally revealed by performance tools. We believe these can be used in the design and evaluation of runtimes beyond fault tolerance.

  19. An efficient fault-tolerant out-patient order entry system based on special distributed client/server architecture.

    Science.gov (United States)

    Chuang, C T

    1998-01-01

    An automatic order entry system is very important for processing out-patient information. This system not only helps physicians to enter their orders directly, but can also reduce order communication error and thus improve medical quality. Therefore, many hospitals have high aspirations to generate and implement direct order entry systems, but they are also concerned about the setbacks of system failure. In this paper, we present an effective and efficient fault-tolerant order entry system based on special distribution client/server architecture that satisfies the requirements of out-patient order entry very well. From the experimental results carried out on a prototype, we found that this system can improve the system response time of order entry and can also generate an operational method having a user friendly interface. The physicians can enter their orders easily, accurately, directly, flexibly and at a faster rate by making choices from standardized and personalized menus in this system. PMID:9667048

  20. Fault Tolerance in Control Architectures for Mobile Robots: Fantasy or Reality?

    OpenAIRE

    Crestani, Didier; Godary-Dejean, Karen

    2012-01-01

    Due to the future development of robotic autonomous systems in human environments, the fault tolerance paradigm will be a central issue in robotics. This article presents a survey of fault tolerance concepts, means and implementations in robotic architectures.

  1. Fault tolerance and reliability in integrated ship control

    DEFF Research Database (Denmark)

    Nielsen, Jens Frederik Dalsgaard; Izadi-Zamanabadi, Roozbeh; Schiøler, Henrik

    2002-01-01

    Various strategies for achieving fault tolerance in large scale control systems are discussed. The positive and negative impacts of distribution through network communication are presented. The ATOMOS framework for standardized reliable marine automation is presented along with the corresponding...

  2. Fault-Tolerant Precision Formation Guidance for Interferometry Project

    Data.gov (United States)

    National Aeronautics and Space Administration — A methodology is to be developed that will allow the development and implementation of fault-tolerant control system for distributed collaborative spacecraft. The...

  3. Fault-tolerant logics for FPGA linux

    International Nuclear Information System (INIS)

    The increasing use of SRAM-based reconfigurable architectures in important areas of research and development (such as particle accelerators and space applications) brings new, currently only partially addressed effects to the fore. An already well-known but nevertheless important problem of such systems is their susceptibility to radiation, which increases with particle flux and energy. According to current knowledge, errors induced by Single Event Upsets (SEU) and Single Event Transients (SET) are handled exclusively in hardware by the use of spatial and temporal redundancy features. Our field of research is to extend conventional fault tolerance to multiple layers of embedded computer systems, starting with the FPGA bit layer and ending in the software application layer, to achieve maximum radiation tolerance in systems running FPGA Linux in radiation-susceptible environments. Only a collaboration of all these layers can provide an adequate level of data security and process integrity.

  4. Extensions to the Parallel Real-Time Artificial Intelligence System (PRAIS) for fault-tolerant heterogeneous cycle-stealing reasoning

    Science.gov (United States)

    Goldstein, David

    1991-01-01

    Extensions to an architecture for real-time, distributed (parallel) knowledge-based systems called the Parallel Real-time Artificial Intelligence System (PRAIS) are discussed. PRAIS strives for transparently parallelizing production (rule-based) systems, even under real-time constraints. PRAIS accomplished these goals (presented at the first annual C Language Integrated Production System (CLIPS) conference) by incorporating a dynamic task scheduler, operating system extensions for fact handling, and message-passing among multiple copies of CLIPS executing on a virtual blackboard. This distributed knowledge-based system tool uses the portability of CLIPS and common message-passing protocols to operate over a heterogeneous network of processors. Results using the original PRAIS architecture over a network of Sun 3's, Sun 4's and VAX's are presented. Mechanisms using the producer-consumer model to extend the architecture for fault-tolerance and distributed truth maintenance initiation are also discussed.

  5. An efficient fault-tolerant order entry management information system based on special distributed client/server architecture.

    Science.gov (United States)

    Chuang, C T

    1998-11-01

    An automatic order entry system is very important for the processing of out-patient information, not only helping doctors to enter their orders directly but also reducing errors of communication. Many hospitals are anxious to set up a direct order entry system but are concerned about possible system failures. In this paper we report on an effective and efficient fault-tolerant order entry management system which satisfies the requirements for out-patient order entry. From the results of experiments on a prototype we found that the system was user friendly and reduced the time taken. Doctors are able to enter their orders more easily, accurately and quickly by selecting from the standardized and personalized menus to be found in the system. PMID:10338694

  6. Fault Tolerant Architecture for Telecom Wireless CORBA

    Directory of Open Access Journals (Sweden)

    Zhenpeng Xu

    2013-01-01

    In order for a non-mobile ORB to interoperate with CORBA objects and clients running on a mobile terminal, the OMG has specified Wireless Access and Terminal Mobility of CORBA. Fault tolerance has been specified in the common core of the CORBA specification, but it is intended for wired networks. This study proposes a fault tolerant architecture for Telecom wireless CORBA based on replication and checkpointing of objects. The storage available at the Access Bridge is employed to log messages and entity states of objects on behalf of mobile terminals. The logging and recovery infrastructures are designed on each Access Bridge to implement fault tolerance for Telecom wireless CORBA. The Logging Mechanism records each message in a log, from which the Recovery Mechanism can retrieve the message during recovery. The performance analysis shows that the proposed fault tolerant architecture ensures a low loss of computation incurred by a fault of the server object. The proposed fault tolerance architecture is a graceful extension of the original wired Fault Tolerant CORBA and is able to cooperate seamlessly with the published CORBA specifications.
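
    The log-and-replay recovery described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's design: the class names `AccessBridgeLog` and `CounterObject`, the integer message format, and the checkpoint interval are all hypothetical.

```python
import copy


class CounterObject:
    """Hypothetical server object whose state is a running total."""

    def __init__(self):
        self.total = 0

    def handle(self, message):
        # Apply one request; messages are plain integers in this sketch.
        self.total += message
        return self.total


class AccessBridgeLog:
    """Hypothetical log kept on behalf of a mobile terminal."""

    def __init__(self, checkpoint_every=3):
        self.checkpoint_every = checkpoint_every
        self.checkpoint = CounterObject()   # last saved object state
        self.messages = []                  # messages logged since the checkpoint

    def record(self, obj, message):
        # Log the message first, then let the object process it.
        self.messages.append(message)
        result = obj.handle(message)
        if len(self.messages) >= self.checkpoint_every:
            # Save a state snapshot and truncate the message log.
            self.checkpoint = copy.deepcopy(obj)
            self.messages.clear()
        return result

    def recover(self):
        # Rebuild the object: restore the checkpoint, then replay logged messages.
        obj = copy.deepcopy(self.checkpoint)
        for message in self.messages:
            obj.handle(message)
        return obj


log = AccessBridgeLog(checkpoint_every=3)
server = CounterObject()
for request in [5, 7, 1, 4, 2]:
    log.record(server, request)

recovered = log.recover()   # simulate losing `server` and rebuilding it
```

    Checkpointing bounds the number of messages that must be replayed, which is one plausible reading of the "low loss of computation" claim; the actual trade-off in the paper may differ.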

  7. Diagnosis and Fault-tolerant Control

    DEFF Research Database (Denmark)

    Blanke, Mogens; Kinnaert, Michel; Lunze, Jan; Staroswiecki, Marcel

    The book presents effective model-based analysis and design methods for fault diagnosis and fault-tolerant control. Architectural and structural models are used to analyse the propagation of the fault through the process, to test the fault detectability and to find the redundancies in the process...... applicability of the presented methods. The theoretical results are illustrated by two running examples which are used throughout the book. The book addresses engineering students, engineers in industry and researchers who wish to get a survey over the variety of approaches to process diagnosis and fault...

  8. Fault tolerant control of a three-phase three-wire shunt active filter system based on reliability analysis

    Energy Technology Data Exchange (ETDEWEB)

    Poure, P. [Laboratoire d' Instrumentation Electronique de Nancy LIEN, EA 3440, Nancy-Universite, Faculte des Sciences et Techniques, BP 239, 54506 Vandoeuvre Cedex (France); Weber, P.; Theilliol, D. [Centre de Recherche en Automatique de Nancy UMR 7039, Nancy-Universite, CNRS, Faculte des Sciences et Techniques, BP 239, 54506 Vandoeuvre Cedex (France); Saadate, S. [Groupe de Recherches en Electrotechnique et Electronique de Nancy UMR 7037, Nancy-Universite, CNRS, Faculte des Sciences et Techniques, BP 239, 54506 Vandoeuvre Cedex (France)

    2009-02-15

    This paper deals with fault tolerant shunt three-phase three-wire active filter topologies for which reliability is very important in industry applications. The determination of the optimal reconfiguration structure among various ones with or without redundant components is discussed based on reliability criteria. First, the reconfiguration of the inverter is detailed and a fast fault diagnosis method for power semi-conductor or driver fault detection and compensation is presented. This method avoids false fault detection due to power semi-conductors switching. The control architecture and algorithm are studied and a fault tolerant control strategy is considered. Simulation results in open and short circuit cases validate the theoretical study. Finally, the reliability of the studied three-phase three-wire filter shunt active topologies is analyzed to determine the optimal one. (author)

  9. Fault-tolerance of functionally adaptive and robust manipulator

    Energy Technology Data Exchange (ETDEWEB)

    Kotosaka, Shin-ya [ATR Human Information Processing Research Labs., Seika, Kyoto (Japan); Asama, Hajime; Kaetsu, Hayato; Endo, Isao

    1997-05-01

    Robots are required to have the ability to adapt their function according to the tasks to be carried out in an unexpected environment, and to execute tasks even if a part of the system malfunctions. Fault tolerance is a significant factor of functional adaptability. In this paper, a fault-tolerant control method with a proxy control strategy for a distributed manipulator is proposed. A Byzantine fault model is assumed in the method, wherein the behavior of the faulty part cannot be predicted. The method focuses on malfunction of the CPU (central processing unit), which is the controller of the manipulator. The method consists of procedures for fault detection, localization, containment, system reconfiguration and error recovery. The fault detection procedure is based on communication using shared memory. A voting algorithm for fault location is proposed. The fault-tolerant control method is implemented in a distributed manipulator with modular architecture, called Fun-ARM (functionally adaptive and robust manipulator). A reaching motion experiment with a CPU pseudo fault is shown, and the proposed fault-tolerant control method is verified. (author)

  10. SEU fault tolerance in artificial neural networks

    International Nuclear Information System (INIS)

    In this paper the authors investigate the robustness of Artificial Neural Networks when encountering transient modification of information bits related to the network operation. These kinds of faults are likely to occur as a consequence of interaction with radiation. Results of tests performed to evaluate the fault tolerance properties of two different digital neural circuits are presented.

  11. Scaling and renormalization in fault-tolerant quantum computers

    OpenAIRE

    Raginsky, Maxim

    2003-01-01

    This work is concerned with phrasing the concepts of fault-tolerant quantum computation within the framework of disordered systems, Bernoulli site percolation in particular. We show how the so-called "threshold theorems" on the possibility of fault-tolerant quantum computation with constant error rate can be cast as a renormalization (coarse-graining) of the site percolation process describing the occurrence of errors during computation. We also use percolation techniques to derive a trade-of...

  12. Fault-tolerant control for current sensors of doubly fed induction generators based on an improved fault detection method

    DEFF Research Database (Denmark)

    Li, Hui; Yang, Chao; Hu, Yaogang; Zhao, Bin; Zhao, Meng; Chen, Zhe

    2014-01-01

    Fault-tolerant control of current sensors is studied in this paper to improve the reliability of a doubly fed induction generator (DFIG). A fault-tolerant control system of current sensors is presented for the DFIG, which consists of a new current observer and an improved current sensor fault det...

  13. Byzantine-fault tolerant self-stabilizing protocol for distributed clock synchronization systems

    Science.gov (United States)

    Malekpour, Mahyar R. (Inventor)

    2010-01-01

    A rapid Byzantine self-stabilizing clock synchronization protocol that self-stabilizes from any state, tolerates bursts of transient failures, and deterministically converges within a linear convergence time with respect to the self-stabilization period. Upon self-stabilization, all good clocks proceed synchronously. The Byzantine self-stabilizing clock synchronization protocol does not rely on any assumptions about the initial state of the clocks. Furthermore, there is neither a central clock nor an externally generated pulse system. The protocol converges deterministically, is scalable, and self-stabilizes in a short amount of time. The convergence time is linear with respect to the self-stabilization period.

  14. Incorporating Fault Tolerance Tactics in Software Architecture Patterns

    OpenAIRE

    Harrison, Neil B.; Avgeriou, Paris

    2008-01-01

    One important way that an architecture impacts fault tolerance is by making it easy or hard to implement measures that improve fault tolerance. Many such measures are described as fault tolerance tactics. We studied how various fault tolerance tactics can be implemented in the best-known architecture patterns. This shows that certain patterns are better suited to implementing fault tolerance tactics than others, and that certain alternate tactics are better matches than others for a given pat...

  15. USAGE OF STANDARD PERSONAL COMPUTER PORTS FOR DESIGNING OF THE DOUBLE REDUNDANT FAULT-TOLERANT COMPUTER CONTROL SYSTEMS

    Directory of Open Access Journals (Sweden)

    Rafig SAMEDOV

    2005-01-01

    In this study, for the design of fault-tolerant control systems using standard personal computers, the ports have been investigated, different structure versions have been designed, and a method for choosing an optimal structure has been suggested. In this scope, first of all, the ÇİFTYAK system has been defined and its working principle has been determined. Then, the data transmission ports of standard personal computers have been classified and analyzed. After that, the structure versions have been designed and evaluated according to the data transmission methods used, the number of ports, and the criteria of reliability, performance, truth, control and cost. Finally, a method for choosing the optimal structure version has been suggested.

  16. Fault Tolerant Implementation of Xilinx Vertex FPGA for Sensor Systems through On-Chip System Evolution

    Science.gov (United States)

    Anandaraj, S. P.; Kumar, R. Naveen; Ravi, S.; Sharma, S. S. V. N.

    Nowadays, the majority of applications struggle to achieve good behavior through the cooperation of subsystems that are independently designed and separately located but mutually affecting. For such coordinating systems, specific structural models and effective parameters are hard to obtain. In such cases, evolvable hardware (EHW) methods with evolutionary algorithms (EA) are used to achieve a sophisticated level of information [2]. Numerous systems have been introduced with evolvable hardware on a single chip to overcome the lack of flexibility, with the support of a modifiable evolutionary algorithm stored in software on a built-in processor. This paper proposes an architecture based on a Xilinx Virtex-II Pro FPGA with an interfaced PowerPC processor. This allows time-consuming parts to be processed quickly in hardware while other parts remain easily modifiable in software. The proposed technique will provide further benefits in future work as regards cost and compactness [1]. The system was completely analyzed on physical devices with software executing in parallel with fitness computation in digital logic circuits, and the results show that the system uses only double the time of a PC running at a 10 times faster clock speed [6].

  17. Model-Based Fault Tolerant Control

    Science.gov (United States)

    Kumar, Aditya; Viassolo, Daniel

    2008-01-01

    The Model Based Fault Tolerant Control (MBFTC) task was conducted under the NASA Aviation Safety and Security Program. The goal of MBFTC is to develop and demonstrate real-time strategies to diagnose and accommodate anomalous aircraft engine events such as sensor faults, actuator faults, or turbine gas-path component damage that can lead to in-flight shutdowns, aborted take offs, asymmetric thrust/loss of thrust control, or engine surge/stall events. A suite of model-based fault detection algorithms were developed and evaluated. Based on the performance and maturity of the developed algorithms two approaches were selected for further analysis: (i) multiple-hypothesis testing, and (ii) neural networks; both used residuals from an Extended Kalman Filter to detect the occurrence of the selected faults. A simple fusion algorithm was implemented to combine the results from each algorithm to obtain an overall estimate of the identified fault type and magnitude. The identification of the fault type and magnitude enabled the use of an online fault accommodation strategy to correct for the adverse impact of these faults on engine operability thereby enabling continued engine operation in the presence of these faults. The performance of the fault detection and accommodation algorithm was extensively tested in a simulation environment.
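
    Residual-based detection of the kind described above can be illustrated with a much simpler filter. The sketch below is an assumption-laden stand-in, not the MBFTC algorithm: it uses a scalar linear Kalman filter with a constant-state model instead of an Extended Kalman Filter, and the function name `detect_sensor_faults`, the noise variances, and the gate value are hypothetical choices for illustration.

```python
def detect_sensor_faults(measurements, x0, p0=1.0, q=1e-4, r=0.01, gate=9.0):
    """Return indices of measurements whose normalized squared residual exceeds `gate`."""
    x, p = x0, p0
    flagged = []
    for k, z in enumerate(measurements):
        p = p + q                      # predict: constant state, uncertainty grows by q
        residual = z - x               # innovation: measurement minus prediction
        s = p + r                      # innovation variance
        if residual * residual / s > gate:
            flagged.append(k)          # treat as a fault and skip the update
            continue
        gain = p / s                   # Kalman gain
        x = x + gain * residual        # correct the estimate
        p = (1.0 - gain) * p           # shrink the estimate variance
    return flagged


readings = [10.0, 10.1, 9.9, 10.0, 10.05, 14.0, 14.1]
faulty = detect_sensor_faults(readings, x0=10.0)
```

    Skipping the update for flagged samples keeps a suspected fault from corrupting the estimate; a real accommodation strategy would also have to decide when a persistent offset reflects a genuine change in the engine rather than a sensor fault.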

  18. Simulation modeling based method for choosing an effective set of fault tolerance mechanisms for real-time avionics systems

    Science.gov (United States)

    Bakhmurov, A. G.; Balashov, V. V.; Glonina, A. B.; Pashkov, V. N.; Smeliansky, R. L.; Volkanov, D. Yu.

    2013-12-01

    In this paper, the reliability allocation problem (RAP) for real-time avionics systems (RTAS) is considered. The proposed method for solving this problem consists of two steps: (i) creation of an RTAS simulation model at the necessary level of abstraction and (ii) application of a metaheuristic algorithm to find an optimal solution (i.e., to choose an optimal set of fault tolerance techniques). When it is necessary during the algorithm execution to measure the execution time of some software components, simulation modeling is applied. The simulation modeling procedure consists of the following steps: automatic construction of a simulation model of the RTAS configuration and running this model in a simulation environment to measure the required time. This method was implemented as an experimental software tool. The tool works in cooperation with the DYANA simulation environment. The results of experiments with the implemented method are presented. Finally, future plans for development of the presented method and tool are briefly described.

  19. Fault-tolerant building-block computer study

    Science.gov (United States)

    Rennels, D. A.

    1978-01-01

    Ultra-reliable core computers are required for improving the reliability of complex military systems. Such computers can provide reliable fault diagnosis, failure circumvention, and, in some cases, serve as an automated repairman for their host systems. A small set of building-block circuits which can be implemented as single very large scale integration devices, and which can be used with off-the-shelf microprocessors and memories to build self checking computer modules (SCCM), is described. Each SCCM is a microcomputer which is capable of detecting its own faults during normal operation and is designed to communicate with other identical modules over one or more Mil Standard 1553A buses. Several SCCMs can be connected into a network with backup spares to provide fault-tolerant operation, i.e. automated recovery from faults. Alternative fault-tolerant SCCM configurations are discussed along with the cost and reliability associated with their implementation.

  20. Design and Verification of Fault-Tolerant Components

    DEFF Research Database (Denmark)

    Zhang, Miaomiao; Liu, Zhiming; Ravn, Anders Peter; Morisset, Charles

    2009-01-01

    We present a systematic approach to design and verification of fault-tolerant components with real-time properties as found in embedded systems. A state machine model of the correct component is augmented with internal transitions that represent hypothesized faults. Also, constraints on the...... occurrence or timing of faults are included in this model. This model of a faulty component is then extended with fault detection and recovery mechanisms, again in the form of state machines. Desired properties of the component are model checked for each of the successive models. The models can be made...... relatively detailed such that they can serve directly as blueprints for engineering, and yet be amenable to exhaustive verification. The approach is illustrated with a design of a triple modular fault-tolerant system that is a real case we received from our collaborators in the aerospace field. We use UPPAAL...

  1. SABRE: a bio-inspired fault-tolerant electronic architecture

    International Nuclear Information System (INIS)

    As electronic devices become increasingly complex, ensuring their reliable, fault-free operation is becoming correspondingly more challenging. It can be observed that, in spite of their complexity, biological systems are highly reliable and fault tolerant. Hence, we are motivated to take inspiration from biological systems in the design of electronic ones. In SABRE (self-healing cellular architectures for biologically inspired highly reliable electronic systems), we have designed a bio-inspired fault-tolerant hierarchical architecture for this purpose. As in biology, the foundation for the whole system is cellular in nature, with each cell able to detect faults in its operation and trigger intra-cellular or extra-cellular repair as required. At the next level in the hierarchy, arrays of cells are configured and controlled as function units in a transport triggered architecture (TTA), which is able to perform partial-dynamic reconfiguration to rectify problems that cannot be solved at the cellular level. Each TTA is, in turn, part of a larger multi-processor system which employs coarser grain reconfiguration to tolerate faults that cause a processor to fail. In this paper, we describe the details of operation of each layer of the SABRE hierarchy, and how these layers interact to provide a high systemic level of fault tolerance. (paper)

  2. Scheduling and Voltage Scaling for Energy/Reliability Trade-offs in Fault-Tolerant Time-Triggered Embedded Systems

    DEFF Research Database (Denmark)

    Pop, Paul; Poulsen, Kåre Harbo; Izosimov, Viacheslav; Eles, Petru

    -execution and dynamic voltage scaling-based low-power techniques are competing for the slack in the schedules. Our approach decides the voltage levels and start times of processes and the transmission times of messages, such that the transient faults are tolerated, the timing constraints of the application are...

  3. A Primer on Architectural Level Fault Tolerance

    Science.gov (United States)

    Butler, Ricky W.

    2008-01-01

    This paper introduces the fundamental concepts of fault tolerant computing. Key topics covered are voting, fault detection, clock synchronization, Byzantine Agreement, diagnosis, and reliability analysis. Low level mechanisms such as Hamming codes or low level communications protocols are not covered. The paper is tutorial in nature and does not cover any topic in detail. The focus is on rationale and approach rather than detailed exposition.

  4. FTMP (Fault Tolerant Multiprocessor) programmer's manual

    Science.gov (United States)

    Feather, F. E.; Liceaga, C. A.; Padilla, P. A.

    1986-01-01

    The Fault Tolerant Multiprocessor (FTMP) computer system was constructed using the Rockwell/Collins CAPS-6 processor. It is installed in the Avionics Integration Research Laboratory (AIRLAB) of NASA Langley Research Center. It is hosted by AIRLAB's System 10, a VAX 11/750, for the loading of programs and experimentation. The FTMP support software includes a cross compiler for a high level language called Automated Engineering Design (AED) System, an assembler for the CAPS-6 processor assembly language, and a linker. Access to this support software is through an automated remote access facility on the VAX which relieves the user of the burden of learning how to use the IBM 4381. This manual is a compilation of information about the FTMP support environment. It explains the FTMP software and support environment along with many of the finer points of running programs on FTMP. This will be helpful to the researcher trying to run an experiment on FTMP and even to the person probing FTMP with fault injections. Much of the information in this manual can be found in other sources; we are only attempting to bring together the basic points in a single source. If the reader should need points clarified, there is a list of support documentation in the back of this manual.

  5. Fault Tolerant Parallel Filters Based On Bch Codes

    Directory of Open Access Journals (Sweden)

    K.Mohana Krishna

    2015-04-01

    Digital filters are used in signal processing and communication systems. In some cases, the reliability of those systems is critical, and fault tolerant filter implementations are needed. Over the years, many techniques that exploit the filters' structure and properties to achieve fault tolerance have been proposed. As technology scales, it enables more complex systems that incorporate many filters. In those complex systems, it is common that some of the filters operate in parallel, for example, by applying the same filter to different input signals. Recently, a simple technique that exploits the presence of parallel filters to achieve multiple fault tolerance has been presented. In this brief, that idea is generalized to show that parallel filters can be protected using Bose–Chaudhuri–Hocquenghem (BCH) codes, in which each filter is the equivalent of a bit in a traditional ECC. This new scheme allows more efficient protection when the number of parallel filters is large.
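
    The idea of treating each parallel filter as a "bit" of a code rests on linearity: a redundant filter fed with the sum of several inputs must produce the sum of the corresponding outputs. The sketch below shows only the simplest case, a single parity-style check that detects an error but cannot locate it; it is not the BCH construction of the brief. The function names, coefficients, and test signals are hypothetical.

```python
def fir_filter(coeffs, signal):
    """Direct-form FIR filter with zero initial state."""
    out = []
    for n in range(len(signal)):
        acc = 0
        for k, h in enumerate(coeffs):
            if n - k >= 0:
                acc += h * signal[n - k]
        out.append(acc)
    return out


def check_parallel_filters(coeffs, inputs, outputs):
    """Return sample indices where the redundant filter disagrees with the summed outputs."""
    # The redundant filter processes the sample-wise sum of all inputs.
    summed_input = [sum(samples) for samples in zip(*inputs)]
    parity_output = fir_filter(coeffs, summed_input)
    # By linearity it must equal the sample-wise sum of the individual outputs.
    summed_output = [sum(samples) for samples in zip(*outputs)]
    return [n for n, (a, b) in enumerate(zip(parity_output, summed_output)) if a != b]


coeffs = [1, 2, 1]
inputs = [[1, 0, 3, 2], [4, 1, 0, 5], [2, 2, 2, 2]]
outputs = [fir_filter(coeffs, x) for x in inputs]
clean = check_parallel_filters(coeffs, inputs, outputs)

outputs[1][2] += 8          # inject an error into filter 1 at sample 2
corrupted = check_parallel_filters(coeffs, inputs, outputs)
```

    Integer samples are used so the comparison is exact; with floating-point filters a tolerance would be needed. Locating and correcting the faulty filter requires several redundant filters combining different subsets of inputs, which is where a code such as BCH comes in.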

  6. Energy/Reliability Trade-offs in Fault-Tolerant Event-Triggered Distributed Embedded Systems

    DEFF Research Database (Denmark)

    Gan, Junhe; Gruian, Flavius; Pop, Paul; Madsen, Jan

    2011-01-01

    reliability simultaneously is especially challenging, since lowering the voltage to reduce the energy consumption has been shown to increase the transient fault rate. We presented a Tabu Search-based approach which uses an energy/reliability trade-off model to find reliable and schedulable implementations...

  7. Design of Fault Tolerant Reversible Multiplier

    Directory of Open Access Journals (Sweden)

    H. P. Sinha

    2012-01-01

    In recent years, reversible logic has emerged as a promising technology with applications in low power CMOS, quantum computing, nanotechnology, and optical computing. The classical set of gates such as AND, OR, and EXOR are not reversible. This paper proposes a novel 4x4 bit reversible fault tolerant multiplier circuit which can multiply two 4-bit numbers. It is faster and has lower hardware complexity compared to the existing designs. In addition, the proposed reversible multiplier is better than the existing counterparts in terms of delay and power. It is based on two concepts: the partial products are generated in parallel using Fredkin gates, and the addition is then done using a reversible parallel adder designed from IG gates. Thus, this paper provides an initial step toward building more complex systems which can execute more complicated operations using reversible logic.
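
    A Fredkin gate is a controlled swap, and fixing its third input to 0 yields the AND of the other two, which is how a single partial-product bit can be formed. The sketch below checks this by brute force. It assumes that "fault tolerant" is meant in the parity-preserving sense common in the reversible-logic literature, which the abstract does not state; the function names are illustrative.

```python
from itertools import product


def fredkin(a, b, c):
    """Controlled swap: when a == 1, the second and third outputs are exchanged."""
    return (a, c, b) if a else (a, b, c)


def partial_product(x, y):
    """AND of two bits, read from the third Fredkin output with c fixed to 0."""
    return fredkin(x, y, 0)[2]


all_inputs = list(product((0, 1), repeat=3))
# Reversible: applying the gate twice restores every input (it is its own inverse).
is_reversible = all(fredkin(*fredkin(*bits)) == bits for bits in all_inputs)
# Parity preserving: the XOR of the inputs equals the XOR of the outputs.
is_parity_preserving = all(
    sum(bits) % 2 == sum(fredkin(*bits)) % 2 for bits in all_inputs
)
and_table = {(x, y): partial_product(x, y) for x, y in product((0, 1), repeat=2)}
```

    A 4x4 multiplier needs sixteen such partial products, one per pair of operand bits; the adder stage built from IG gates is not modeled here.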

  8. Operating system fault tolerance support for real-time embedded applications

    OpenAIRE

    Afonso, Francisco

    2009-01-01

    Fault tolerance is a means of achieving high reliability for critical and high-availability systems. Despite efforts to prevent and remove faults during the development of these systems, applying fault tolerance is usually necessary, since the hardware can fail during system operation and software faults are very difficult to eliminate completely. One of the difficulties in implementing fault tolerance techniques is the lack...

  9. Electronic Power Switch for Fault-Tolerant Networks

    Science.gov (United States)

    Volp, J.

    1987-01-01

    Power field-effect transistors reduce energy waste and simplify interconnections. Current switch containing power field-effect transistor (PFET) placed in series with each load in fault-tolerant power-distribution system. If system includes several loads and supplies, switches placed in series with adjacent loads and supplies. System of switches protects against overloads and losses of individual power sources.

  10. SIFT - Multiprocessor architecture for Software Implemented Fault Tolerance flight control and avionics computers

    Science.gov (United States)

    Forman, P.; Moses, K.

    1979-01-01

    A brief description of a SIFT (Software Implemented Fault Tolerance) Flight Control Computer with emphasis on implementation is presented. A multiprocessor system that relies on software-implemented fault detection and reconfiguration algorithms is described. A high level of reliability and fault tolerance is achieved by the replication of computing tasks among processing units.
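
Replicating a task across processing units and comparing the results in software can be sketched as follows. This is only a minimal illustration under stated assumptions, not the SIFT implementation: processors are modelled as plain Python callables, and the names `run_replicated` and `vote` are hypothetical.

```python
from collections import Counter


def run_replicated(task, args, processors):
    """Execute the same task on every processor and collect the results.

    Each processor is modelled as a callable proc(task, *args); this is an
    assumption made for illustration only.
    """
    return [proc(task, *args) for proc in processors]


def vote(results):
    """Majority vote over replicated results.

    Returns the agreed value and the indices of the replicas that disagreed,
    which a reconfiguration step could then exclude from later work.
    """
    value, count = Counter(results).most_common(1)[0]
    if count * 2 <= len(results):
        raise RuntimeError("no majority among replicas")
    disagreeing = [i for i, r in enumerate(results) if r != value]
    return value, disagreeing
```

For example, with two healthy replicas and one that returns a wrong product, `vote` yields the majority value and reports the index of the faulty replica. Results must be hashable for `Counter` to work.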

  11. Robust and Fault-Tolerant Linear Parameter-Varying Control of Wind Turbines

    DEFF Research Database (Denmark)

    Sloth, Christoffer; Esbensen, Thomas; Stoustrup, Jakob

    2011-01-01

    parameter variations along the nominal operating trajectory caused by nonlinear aerodynamics. To accommodate the fault in the pitch system, an active fault-tolerant controller (AFTC) and a passive fault-tolerant controller (PFTC) are designed. In addition to the nominal LPV controller, we also propose a...

  12. Fault Tolerant Heterogeneous Limited Duplication Scheduling algorithm for Decentralized Grid

    Directory of Open Access Journals (Sweden)

    DR. NITIN

    2013-04-01

    Full Text Available Fault tolerance is one of the most desirable properties in decentralized grid computing systems, where computational resources are geographically distributed. These resources collaborate in order to execute workflow applications as fast as possible. In workflow applications, tasks are dependent on each other, so it is vital that scheduling techniques also have a decentralized fault tolerant mechanism. In this paper, we propose a decentralized fault tolerant mechanism which utilizes the checkpoint concept for the Heterogeneous Limited Duplication (HLD) algorithm. HLD is based on task duplication scheduling in a heterogeneous environment. The benefits are twofold: first, if a node failure occurs, the rest of the grid nodes sustain the execution of the application; second, a smaller makespan of the application is obtained using the checkpoint concept. Therefore, an application scheduled over decentralized grid systems (which are known for their unreliable behavior) will yield results fast using the algorithm proposed in this paper.
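
The checkpoint concept mentioned in this abstract can be illustrated in isolation. The sketch below is not the HLD algorithm and makes several assumptions: the checkpoint is a plain dictionary (standing in for state stored where other nodes can reach it), and `run_with_checkpoints` and `fail_at` are hypothetical names used only to simulate a node failure.

```python
def run_with_checkpoints(items, step, checkpoint, fail_at=None):
    """Fold `step` over `items`, recording progress after every item.

    `checkpoint` holds the index of the next item and the accumulated
    result, so a second node can resume without redoing finished work.
    `fail_at` simulates a node failure just before that item is processed.
    """
    for i in range(checkpoint["next"], len(items)):
        if fail_at is not None and i == fail_at:
            raise RuntimeError("simulated node failure")
        checkpoint["acc"] = step(checkpoint["acc"], items[i])
        checkpoint["next"] = i + 1
    return checkpoint["acc"]


if __name__ == "__main__":
    work = [1, 2, 3, 4, 5]
    state = {"next": 0, "acc": 0}
    try:
        run_with_checkpoints(work, lambda a, x: a + x, state, fail_at=3)
    except RuntimeError:
        pass  # the first node failed after finishing items 0..2
    # A surviving node resumes from the checkpoint instead of restarting.
    print(run_with_checkpoints(work, lambda a, x: a + x, state))
```

Because the surviving node starts at the recorded index, only the unfinished items are processed again, which is the intuition behind the reduced makespan.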

  13. Learning Fault-tolerant Speech Parsing with SCREEN

    CERN Document Server

    Wermter, S; Wermter, Stefan; Weber, Volker

    1994-01-01

    This paper describes a new approach and a system SCREEN for fault-tolerant speech parsing. SCREEN stands for Symbolic Connectionist Robust EnterprisE for Natural language. Speech parsing describes the syntactic and semantic analysis of spontaneous spoken language. The general approach is based on incremental immediate flat analysis, learning of syntactic and semantic speech parsing, parallel integration of current hypotheses, and the consideration of various forms of speech related errors. The goal for this approach is to explore the parallel interactions between various knowledge sources for learning incremental fault-tolerant speech parsing. This approach is examined in a system SCREEN using various hybrid connectionist techniques. Hybrid connectionist techniques are examined because of their promising properties of inherent fault tolerance, learning, gradedness and parallel constraint integration. The input for SCREEN is hypotheses about recognized words of a spoken utterance potentially analyzed by a spe...

  14. Design Approach for Fault Tolerance in FPGA Architecture

    Directory of Open Access Journals (Sweden)

    Ms. Shweta S. Meshram

    2011-03-01

    Full Text Available Failures of nano-metric technologies owing to defects and shrinking process tolerances give rise to significant challenges for IC testing. In recent years the application space of reconfigurable devices has grown to include many platforms with a strong need for fault tolerance. While these systems frequently contain hardware redundancy to allow for continued operation in the presence of operational faults, the need to recover faulty hardware and return it to full functionality quickly and efficiently is great. In addition to providing functional density, FPGAs provide a level of fault tolerance generally not found in mask-programmable devices by including the capability to reconfigure around operational faults in the field. Reliability and process variability are serious issues for FPGAs in the future. With advancement in process technology, the feature size is decreasing, which leads to higher defect densities, and more sophisticated techniques at increased costs are required to avoid defects. If nanotechnology fabrication is applied, the yield may go down to zero, as avoiding defects during fabrication will not be a feasible option. Hence, future architectures have to be defect tolerant. In regular structures like FPGAs, redundancy is commonly used for fault tolerance. In this work we present a solution in which the configuration bit-stream of the FPGA is modified by a hardware controller that is present on the chip itself. The technique uses redundant devices to replace faulty devices and increases the yield.

  15. Design Approach for Fault Tolerance in FPGA Architecture

    Directory of Open Access Journals (Sweden)

    Ms. Shweta S. Meshram

    2011-03-01

    Full Text Available Failures of nano-metric technologies owing to defects and shrinking process tolerances give rise to significant challenges for IC testing. In recent years the application space of reconfigurable devices has grown to include many platforms with a strong need for fault tolerance. While these systems frequently contain hardware redundancy to allow for continued operation in the presence of operational faults, the need to recover faulty hardware and return it to full functionality quickly and efficiently is great. In addition to providing functional density, FPGAs provide a level of fault tolerance generally not found in mask-programmable devices by including the capability to reconfigure around operational faults in the field. Reliability and process variability are serious issues for FPGAs in the future. With advancement in process technology, the feature size is decreasing, which leads to higher defect densities, and more sophisticated techniques at increased costs are required to avoid defects. If nanotechnology fabrication is applied, the yield may go down to zero, as avoiding defects during fabrication will not be a feasible option. Hence, future architectures have to be defect tolerant. In regular structures like FPGAs, redundancy is commonly used for fault tolerance. In this work we present a solution in which the configuration bit-stream of the FPGA is modified by a hardware controller that is present on the chip itself. The technique uses redundant devices to replace faulty devices and increases the yield.

  16. Design methods for fault-tolerant finite state machines

    Science.gov (United States)

    Niranjan, Shailesh; Frenzel, James F.

    1993-01-01

    VLSI electronic circuits are increasingly being used in space-borne applications where high levels of radiation may induce faults, known as single event upsets. In this paper we review the classical methods of designing fault tolerant digital systems, with an emphasis on those methods which are particularly suitable for VLSI-implementation of finite state machines. Four methods are presented and will be compared in terms of design complexity, circuit size, and estimated circuit delay.

  17. Fault-tolerant search algorithms reliable computation with unreliable information

    CERN Document Server

    Cicalese, Ferdinando

    2013-01-01

    Why a book on fault-tolerant search algorithms? Searching is one of the fundamental problems in computer science. Time and again algorithmic and combinatorial issues originally studied in the context of search find application in the most diverse areas of computer science and discrete mathematics. On the other hand, fault-tolerance is a necessary ingredient of computing. Due to their inherent complexity, information systems are naturally prone to errors, which may appear at any level - as imprecisions in the data, bugs in the software, or transient or permanent hardware failures. This book pr

  18. Fault-tolerant and Diagnostic Methods for Navigation

    DEFF Research Database (Denmark)

    Blanke, Mogens

    2003-01-01

    Precise and reliable navigation is crucial, and for reasons of safety, essential navigation instruments are often duplicated. Hardware redundancy is mostly used to manually switch between instruments should faults occur. In contrast, diagnostic methods are available that can use analytic redundancy...... to diagnose faults and autonomously provide valid navigation data, disregarding any faulty sensor data and use sensor fusion to obtain a best estimate for users. This paper discusses how diagnostic and fault-tolerant methods are applicable in marine systems. An example chosen is sensor fusion for...
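
Disregarding faulty sensor data before fusing the remainder can be sketched with a simple consistency check. This is an illustrative assumption rather than the diagnostic method of the paper: it compares redundant readings of the same quantity against their median instead of using analytic redundancy from a model, and the function name `fuse` and the `tolerance` parameter are hypothetical.

```python
from statistics import median_low


def fuse(readings, tolerance):
    """Fuse redundant readings of one quantity, ignoring outliers.

    A reading is treated as faulty when it deviates from the (low) median
    by more than `tolerance`. The median itself is always one of the
    readings, so at least one reading is kept.
    Returns the mean of the accepted readings and the indices rejected.
    """
    reference = median_low(readings)
    valid = [r for r in readings if abs(r - reference) <= tolerance]
    faulty = [i for i, r in enumerate(readings) if abs(r - reference) > tolerance]
    return sum(valid) / len(valid), faulty
```

A median-based reference tolerates faults only while fewer than half of the sensors are affected; a model-based diagnosis would be needed beyond that.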

  19. Concepts and Methods in Fault-tolerant Control

    DEFF Research Database (Denmark)

    Blanke, Mogens; Staroswiecki, M.; Wu, N.E.

    Faults in automated processes will often cause undesired reactions and shut-down of a controlled plant, and the consequences could be damage to technical parts of the plant, to personnel or the environment. Fault-tolerant control combines diagnosis with control methods to handle faults in an...... other situations, complex reconfiguration or on-line controller redesign is required. This paper gives an overview of recent tools to analyze and explore structure and other fundamental properties of an automated system such that any inherent redundancy in the controlled process can be fully utilized to...

  20. Fault-tolerant Sensor Fusion Based on Inertial Measurements and GNSS

    OpenAIRE

    Bryne, Torleiv Håland

    2013-01-01

    The standard observer for inertial navigation systems (INS) has for many years been the extended Kalman filter. Extensive research in recent years on nonlinear observers applied with low-cost inertial sensors may change this. Fault tolerance is necessary in many applications, and it is required in dynamic positioning operations. This thesis dealt with the development of a fault-tolerant nonlinear observer for integration of INS and Global Navigation Satellite Systems (G...

  1. Implementation of middleware fault tolerance support for real-time embedded applications

    OpenAIRE

    Afonso, Francisco; Carlos A. Silva; Montenegro, Sérgio; Tavares, Adriano

    2006-01-01

    Critical real-time embedded systems need to apply fault tolerance strategies to deal with operation time errors, either in hardware or software. In this paper we present the ongoing work to provide application fault tolerance by means of implementing middleware transparent support over the BOSS embedded operating system. The middleware uses a publisher-subscriber protocol and enables the execution of several fault tolerance strategies with minimum burden to the application level software.

  2. Fault Tolerant Control for Kori Unit 1 Steam Generator

    International Nuclear Information System (INIS)

    In order to implement more reliable control systems, failures of a controller, a sensor and an actuator should be taken into consideration in the process of control system design. Traditionally there have been two approaches for dealing with the fault-tolerant control problem: active redundancy and passive redundancy. Active redundancy has no reconfiguration part to take an action such as diagnosing and selecting the intact controller when a controller failure occurs; that is, one controller guarantees the system stability and performance under failure of the other controller. Meanwhile, passive redundancy has reconfiguration parts which supervise the system, reject the faulty controller, and select the sound controller which performs the mission. This paper focuses on the active redundancy structure for fault-tolerant control and proposes design methods for fault-tolerant state feedback control and fault-tolerant output feedback control, which make a control system reliable while guaranteeing stability and performance in the sense of the H∞ norm in the face of controller failures in the dual-controller configuration. The proposed method is applied to the Kori Unit 1 steam generator level control system. The results show that the steam generator water level is well controlled in the situation of one controller failure.

  3. Fault-tolerant adaptive control for load-following in static space nuclear power systems

    Science.gov (United States)

    Parlos, Alexander G.; Onbasioglu, Fetiye O.; Peddicord, Kenneth L.; Metzger, John D.

    1992-01-01

    The possible use of a dual-loop model-based adaptive control system for load following in static space nuclear power systems is investigated. The proposed approach has thus far been applied only to a thermoelectric space nuclear power system but is equally applicable to other static space nuclear power systems such as thermionic systems.

  4. Control switching in high performance and fault tolerant control

    DEFF Research Database (Denmark)

    Niemann, Hans Henrik; Poulsen, Niels Kjølstad

    2010-01-01

    The problem of reliability in high performance control and in fault tolerant control is considered in this paper. A feedback controller architecture for high performance and fault tolerance is considered. The architecture is based on the Youla-Jabr-Bongiorno-Kucera (YJBK) parameterization. By using...... the nominal controller in the architecture as a simple and robust controller, it is possible to use the YJBK transfer function for optimization of the closed-loop performance. This can be done both in connection with normal operation of the system and in connection with faults in the system....... The architecture will also allow changing the applied sensors and/or actuators when switching between different controllers. This switching becomes particularly simple for open-loop stable systems....

  5. Robust TCP Connections for Fault Tolerant Computing

    OpenAIRE

    Ekwall, Richard; Urbán, Péter; Schiper, André

    2002-01-01

    When processes on two different machines communicate, they most often do so using the TCP protocol. While TCP is appropriate for a wide range of applications, it has shortcomings in other application areas. One of these areas is fault tolerant distributed computing. For some of those applications, TCP does not address link failures adequately: TCP breaks the connection if connectivity is lost for some duration (typically minutes). This is sometimes undesirable. The paper proposes robust TCP c...

  6. Design of Fault Tolerant Reversible Multiplier

    OpenAIRE

    H.P.Sinha; Nidhi Syal

    2012-01-01

    In recent years, reversible logic has emerged as a promising technology with applications in low power CMOS, quantum computing, nanotechnology, and optical computing. The classical set of gates such as AND, OR, and EXOR are not reversible. This paper proposes a novel 4x4 bit reversible fault tolerant multiplier circuit which can multiply two 4-bit numbers. It is faster and has lower hardware complexity compared to the existing designs. In addition, the proposed reversible multiplier...

  7. Design of Test Articles and Monitoring System for the Characterization of HIRF Effects on a Fault-Tolerant Computer Communication System

    Science.gov (United States)

    Torres-Pomales, Wilfredo; Malekpour, Mahyar R.; Miner, Paul S.; Koppen, Sandra V.

    2008-01-01

    This report describes the design of the test articles and monitoring systems developed to characterize the response of a fault-tolerant computer communication system when stressed beyond the theoretical limits for guaranteed correct performance. A high-intensity radiated electromagnetic field (HIRF) environment was selected as the means of injecting faults, as such environments are known to have the potential to cause arbitrary and coincident common-mode fault manifestations that can overwhelm redundancy management mechanisms. The monitors generate stimuli for the systems-under-test (SUTs) and collect data in real-time on the internal state and the response at the external interfaces. A real-time health assessment capability was developed to support the automation of the test. A detailed description of the nature and structure of the collected data is included. The goal of the report is to provide insight into the design and operation of these systems, and to serve as a reference document for use in post-test analyses.

  8. Steps toward fault-tolerant quantum chemistry.

    Energy Technology Data Exchange (ETDEWEB)

    Taube, Andrew Garvin

    2010-05-01

    Developing quantum chemistry programs on the coming generation of exascale computers will be a difficult task. The programs will need to be fault-tolerant and minimize the use of global operations. This work explores the use of a task-based model that uses a data-centric approach to allocate work to different processes as it applies to quantum chemistry. After introducing the key problems that appear when trying to parallelize a complicated quantum chemistry method such as coupled-cluster theory, we discuss the implications of that model as it pertains to the computational kernel of a coupled-cluster program - matrix multiplication. Also, we discuss the extensions that would be required to build a full coupled-cluster program using the task-based model. Current programming models for high-performance computing are fault-intolerant and use global operations. Those properties are unsustainable as computers scale to millions of CPUs; instead one must recognize that these systems will be hierarchical in structure, prone to constant faults, and global operations will be infeasible. The FAST-OS HARE project is introducing a scale-free computing model to address these issues. This model is hierarchical and fault-tolerant by design, allows for the clean overlap of computation and communication, reducing the network load, does not require checkpointing, and avoids the complexity of many HPC runtimes. Development of an algorithm within this model requires a change in focus from imperative programming to a data-centric approach. Quantum chemistry (QC) algorithms, in particular electronic structure methods, are an ideal test bed for this computing model. These methods describe the distribution of electrons in a molecule, which determine the properties of the molecule. The computational cost of these methods is high, scaling quartically or higher in the size of the molecule, which is why QC applications are major users of HPC resources. 
The complexity of these algorithms means that MPI alone is insufficient to achieve parallel scaling; QC developers have been forced to use alternative approaches to achieve scalability and would be receptive to radical shifts in the programming paradigm. Initial work in adapting the simplest QC method, Hartree-Fock, to this new programming model indicates that the approach is beneficial for QC applications. However, the advantages to being able to scale to exascale computers are greatest for the computationally most expensive algorithms; within QC these are the high-accuracy coupled-cluster (CC) methods. Parallel coupled-cluster programs are available; however, they are based on the conventional MPI paradigm. Much of the effort is spent handling the complicated data dependencies between the various processors, especially as the size of the problem becomes large. The current paradigm will not survive the move to exascale computers. Here we discuss the initial steps toward designing and implementing a CC method within this model. First, we introduce the general concepts behind a CC method, focusing on the aspects that make these methods difficult to parallelize with conventional techniques. Then we outline what is the computational core of the CC method - a matrix multiply - within the task-based approach that the FAST-OS project is designed to take advantage of. Finally we outline the general setup to implement the simplest CC method in this model, linearized CC doubles (LinCC).

  9. OPTIMAL DESIGN ALGORITHM FOR FAULT TOLERANT INFORMATION SYSTEMS USED FOR PROCESSING ELECTRONIC MEDICAL RECORDS

    Directory of Open Access Journals (Sweden)

    P. V. Melyushin

    2015-01-01

    Full Text Available The paper considers problems in designing medical information systems and proposes an approach to creating a highly reliable automated system for processing electronic medical records on the basis of file allocation optimization in the network nodes. A mathematical model has been developed for optimal distribution of the files in the network nodes, and an experimental investigation of two schemes of medical information systems has been carried out.

  10. Implementations of a four-level mechanical architecture for fault-tolerant robots

    International Nuclear Information System (INIS)

    This paper describes a fault tolerant mechanical architecture with four levels devised and implemented in concert with NASA (Tesar, D. and Sreevijayan, D., Four-level fault tolerance in manipulator design for space operations. In First Int. Symp. Measurement and Control in Robotics (ISMCR '90), Houston, Texas, 20-22 June 1990). Subsequent work has clarified and revised the architecture. The four levels proceed from fault tolerance at the actuator level, to fault tolerance via in-parallel chains, to fault tolerance using serial kinematic redundancy, and finally to the fault tolerance that multiple arm systems provide. This is a subsumptive architecture because each successive layer can incorporate the fault tolerance provided by all layers beneath. For instance, a serially-redundant robot can incorporate dual fault-tolerant actuators. Redundant systems provide the fault tolerance, but the guiding principle of this architecture is that functional redundancies actively increase the performance of the system. Redundancies do not simply remain dormant until needed. This paper includes specific examples of hardware and/or software implementation at all four levels.

  11. System-Level Development of Fault-Tolerant Distributed Aero-Engine Control Architecture Project

    Data.gov (United States)

    National Aeronautics and Space Administration — NASA's vision for an "intelligent engine" will be realized with the development of a truly distributed control system and reliable smart transducer node components;...

  12. Towards the design of fault-tolerant distributed real-time systems

    OpenAIRE

    Klobedanz, Kay

    2014-01-01

    The number and complexity of embedded systems is steadily increasing. For large distributed systems in particular, determining a suitable software distribution is a complex task. In this thesis we present an approach to the design of embedded real-time systems that supports the system developer in determining suitable solutions. In safety-critical systems with hard real-time requirements, violating a hard deadline can lead to damage. Such syst...

  13. Dependability modelling of a fault tolerant duplex system using AADL and GSPNs

    OpenAIRE

    Rugina, Ana-Elena; Kanoun, Karama; Kaâniche, Mohamed; Guiochet, Jérémie

    2005-01-01

    This research report is intended to explore the possibilities of deriving Generalised Stochastic Petri Nets (GSPNs) dependability models from AADL dependability models in order to estimate dependability measures for computer-based systems. The AADL dependability models are composed of i) AADL architecture models including the various components of the system and ii) their associated AADL error models, as described in Section 3 of this report. Our reference document for describing error models ...

  14. Assessing the reliability of diverse fault-tolerant software-based systems

    OpenAIRE

    Littlewood, B.; Popov, P. T.; L. Strigini

    2002-01-01

    We discuss a problem in the safety assessment of automatic control and protection systems. There is an increasing dependence on software for performing safety-critical functions, like the safety shut-down of dangerous plants. Software brings increased risk of design defects and thus systematic failures; redundancy with diversity between redundant channels is a possible defence. While diversity techniques can improve the dependability of software-based systems, they do not alleviate the diffic...

  15. A Structural Analysis Method Formulation for Fault-tolerant Control System Design

    DEFF Research Database (Denmark)

    Izadi-Zamanabadi, Roozbeh; Staroswiecki, M

    An analysis of structural model representation has been used to extract available inherent redundant information in the system. The paper presents a refined structured model representation based on bipartite directed graph definition and the necessary condition for sensor fusion based on the...

  16. Implementation of Fault Tolerant Method Using BCH Code on FPGA

    Directory of Open Access Journals (Sweden)

    Mahadevaswamy V P

    2012-09-01

    Full Text Available Fault tolerance, or graceful degradation, is the property that enables a system (often computer-based) to continue operating properly in the event of the failure of (or one or more faults within) some of its components. This paper designs a new 32-bit Arithmetic Logic Unit (ALU) that is secure against many attacks or faults and able to correct any 5-bit fault in any position of the 32-bit input register of the ALU, because radiation effects on electronic circuits may invert data bits in registers or memories. If one bit of the main storage system is changed, the mission of the system could be completely different. The strong motivation for choosing BCH (Bose, Chaudhuri, and Hocquenghem) codes is that they are able to correct multiple errors; this class of codes is a powerful kind of random-error-correcting cyclic code. In comparison with other area-penalty methods, a 32-bit fault tolerant ALU using BCH code is a better choice in terms of area than Triple Modular Redundancy (TMR) and Residue code. This is because the fault tolerant method for a 32-bit ALU using TMR with single or triplicated voting needs a single or tripled voter and two extra 32-bit ALUs, which increases the hardware overhead by 202% and 208% respectively. The Residue code requires a hardware overhead of 148.9%. In comparison with TMR and Residue code, however, BCH code needs a hardware overhead of 70 to 75%, which reduces the overall cost and power consumption. Thus the proposed fault tolerant method has lower hardware overhead and multiple error correction when compared to the other techniques.

  17. A novel adaptive switching function on fault tolerable sliding mode control for uncertain stochastic systems.

    Science.gov (United States)

    Zahiripour, Seyed Ali; Jalali, Ali Akbar

    2014-09-01

    A novel switching function based on an optimization strategy for the sliding mode control (SMC) method has been provided for uncertain stochastic systems subject to actuator degradation such that the closed-loop system is globally asymptotically stable with probability one. In previous research, the sliding surface has been a proportional or proportional-integral function of the states. In this research, a degree of freedom that depends on the designer's choice is used to meet certain objectives: the switching function contains a parameter which the designer can tune for specified objectives. A sliding-mode controller is synthesized to ensure the reachability of the specified switching surface, despite actuator degradation and uncertainties. Finally, the simulation results demonstrate the effectiveness of the proposed method. PMID:24954808

  18. Load management in a distributed multimedia streaming environment using a fault-tolerant hierarchical system

    OpenAIRE

    AYBAY, HADİ IŞIK; Shah, Mohammad Ahmed

    2015-01-01

    In contrast to text-only forms of communications, multimedia uses a combination of audiovisual means alongside textual modes of communication. Streaming multimedia is such multimedia that is constantly delivered by a provider of the multimedia to a client. In streaming multimedia the streamed content is continually presented to and received by the end user. Distributed multimedia systems (DMSs) deliver multimedia content to end-users by means of distributed multimedia databases and distribute...

  19. Fault-Tolerant Distributed Systems: a Modular Approach to the Non-Blocking Atomic Commitment Problem

    OpenAIRE

    Raynal, Michel

    1996-01-01

    Agreement problems allow a set of processes to agree on a common output value. These problems are of primary importance in distributed systems and difficult to solve in presence of failures. This paper considers one of these problems whose practical interest is well known, namely the Non-Blocking Atomic Commitment Problem. First, a generic protocol solving this problem is given and then instantiations of its generic statements are provided for both synchronous and asynchronous distributed sys...

  20. Fault tolerant attitude control for small unmanned aircraft systems equipped with an airflow sensor array

    International Nuclear Information System (INIS)

    Inspired by sensing strategies observed in birds and bats, a new attitude control concept of directly using real-time pressure and shear stresses has recently been studied. It was shown that with an array of onboard airflow sensors, small unmanned aircraft systems can promptly respond to airflow changes and improve flight performances. In this paper, a mapping function is proposed to compute aerodynamic moments from the real-time pressure and shear data in a practical and computationally tractable formulation. Since many microscale airflow sensors are embedded on the small unmanned aircraft system surface, it is highly possible that certain sensors may fail. Here, an adaptive control system is developed that is robust to sensor failure as well as other numerical mismatches in calculating real-time aerodynamic moments. The advantages of the proposed method are shown in the following simulation cases: (i) feedback pressure and wall shear data from a distributed array of 45 airflow sensors; (ii) 50% failure of the symmetrically distributed airflow sensor array; and (iii) failure of all the airflow sensors on one wing. It is shown that even if 50% of the airflow sensors have failures, the aircraft is still stable and able to track the attitude commands. (paper)

  1. Sliding mode fault detection and fault-tolerant control of smart dampers in semi-active control of building structures

    Science.gov (United States)

    Yeganeh Fallah, Arash; Taghikhany, Touraj

    2015-12-01

    Recent decades have witnessed much interest in the application of active and semi-active control strategies for seismic protection of civil infrastructures. However, the reliability of these systems is still in doubt as there remains the possibility of malfunctioning of their critical components (i.e. actuators and sensors) during an earthquake. This paper focuses on the application of the sliding mode method due to the inherent robustness of its fault detection observer and fault-tolerant control. The robust sliding mode observer estimates the state of the system and reconstructs the actuators’ faults which are used for calculating a fault distribution matrix. Then the fault-tolerant sliding mode controller reconfigures itself by the fault distribution matrix and accommodates the fault effect on the system. Numerical simulation of a three-story structure with magneto-rheological dampers demonstrates the effectiveness of the proposed fault-tolerant control system. It was shown that the fault-tolerant control system maintains the performance of the structure at an acceptable level in the post-fault case.

  2. Communication and Agreement Abstractions for Fault-Tolerant Asynchronous Distributed Systems

    CERN Document Server

    Raynal, Michel

    2010-01-01

    Understanding distributed computing is not an easy task. This is due to the many facets of uncertainty one has to cope with and master in order to produce correct distributed software. Considering the uncertainty created by asynchrony and process crash failures in the context of message-passing systems, the book focuses on the main abstractions that one has to understand and master in order to be able to produce software with guaranteed properties. These fundamental abstractions are communication abstractions that allow the processes to communicate consistently (namely the register abstraction

  3. ACID Support and Fault-Tolerant Database Systems on Cloud: A Review

    Directory of Open Access Journals (Sweden)

    Pratiyush Guleria

    2012-10-01

    Cloud computing represents a different way to architect and remotely manage computing resources. One has only to establish an account with Microsoft or Amazon or Google to begin building and deploying application systems into a cloud. These systems can be, but certainly are not restricted to being, simplistic. Some applications require HTTP services, some require a relational database, and others might require web service infrastructure and message queues. With clouds, IT-related applications can be provided as a service that is accessed through the internet. Some cloud platforms provide scalability and high availability for web applications, but they also raise data consistency problems, and server failures become a major problem for applications such as payment services. Data needs to be properly managed in a cloud environment, and to achieve proper transaction processing and consistency, RDBMS techniques such as ACID transactions should be used. Web services in Azure ensure application availability by replicating stored data at least three times and offer optional geolocation of replicas in separate Microsoft data centres to provide disaster recovery services. Azure storage services provide scalable persistent storage of structured tables, blobs and queues.

  4. A lightweight fault-tolerant middleware for a Subaru Telescope second generation observation control system

    Science.gov (United States)

    Jeschke, Eric; Bon, Bruce; Inagaki, Takeshi; Streeper, Sam

    2008-08-01

    Subaru Telescope is developing a second-generation Observation Control System that specifically addresses some of the deficiencies of the current Subaru OCS. Two areas of concern are complexity and failure handling. The current system has over 1000 dedicated OCS processes spread across a dozen hosts and provides nothing in the way of automated failover. Furthermore, manual failover is so fraught with difficulty that it is rarely attempted. Our Generation 2 OCS is written almost entirely in Python and builds upon a Subaru-developed middleware based on the XML-RPC protocol. This framework offers the following benefits:

    - has very few dependences outside of standard Python
    - provides a nearly seamless remote proxy object-oriented interface
    - provides optional user/password authentication and/or SSL encryption
    - is extremely simple to use from client applications
    - is connectionless, and assists transparent failover of communications and services on a cluster of hosts
    - has reasonable performance for a wide range of needs
    - allows multiple language bindings
    - for dynamic languages, requires no interface stub files

    The "back end" (service side) of the OCS is nearing completion, and has already been used successfully during two separate OCS engineering runs. It is comprised of only a couple dozen processes, and provides automated failover capabilities on a rack of commodity x86 Linux servers. We provide an overview of the middleware design and its failover capabilities. Some data on the performance of communications using the middleware protocol is included.
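
    The connectionless failover idea in this abstract can be illustrated with a minimal Python sketch. It is not the Subaru middleware: the class name `FailoverProxy`, the `make_proxy` parameter and the host URLs are assumptions made for illustration. Only `xmlrpc.client.ServerProxy` and `xmlrpc.client.ProtocolError` come from the standard library. Because each XML-RPC call is an independent request, a failed call can simply be retried on the next host.

```python
import xmlrpc.client
from typing import Callable, Sequence


class FailoverProxy:
    """Hypothetical helper: send each call to the first host that answers."""

    def __init__(self, urls: Sequence[str],
                 make_proxy: Callable = xmlrpc.client.ServerProxy):
        if not urls:
            raise ValueError("at least one server URL is required")
        self._urls = list(urls)
        # make_proxy is injectable so the logic can be exercised offline.
        self._make_proxy = make_proxy

    def call(self, method: str, *args):
        last_error = None
        for url in self._urls:
            try:
                proxy = self._make_proxy(url)
                # Every call is a separate request, so no session state
                # is lost when we move on to another host.
                return getattr(proxy, method)(*args)
            except (OSError, xmlrpc.client.ProtocolError) as exc:
                last_error = exc  # host unreachable or HTTP error: try next
        raise ConnectionError(
            f"all {len(self._urls)} hosts failed") from last_error
```

    A client would list the hosts of the cluster in order of preference, for example `FailoverProxy(["http://host-a:8000", "http://host-b:8000"]).call("status")`, where the host names and the `status` method are placeholders.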

  5. Database mirroring in fault-tolerant continuous technological process control

    Directory of Open Access Journals (Sweden)

    R. Danel

    2015-10-01

    This paper describes the implementations of mirroring technology of the selected database systems: Microsoft SQL Server, MySQL and Caché. By simulating critical failures, the systems' behavior and their resilience against failure were tested. The aim was to determine whether database mirroring is suitable for use in continuous metallurgical processes to ensure a fault-tolerant solution at affordable cost. Present-day database systems are characterized by high robustness and are resistant to sudden system failure. Database mirroring technologies are reliable, and even low-budget projects can be provided with a decent fault-tolerant solution. The database system technologies available for low-budget projects are not suitable for use in real-time systems.

  6. Fault handling schemes in electronic systems with specific application to radiation tolerance and VLSI design

    Science.gov (United States)

    Attia, John Okyere

    1993-10-01

    Naturally occurring space radiation particles can produce transient and permanent changes in the electrical properties of electronic devices and systems. In this work, the transient radiation effects on DRAM and CMOS SRAM were considered. In addition, the effect of total ionizing dose radiation on the switching times of CMOS logic gates was investigated. Effects of transient radiation on the column and cell of a MOS dynamic memory cell were simulated using SPICE. It was found that the critical charge of the bitline was higher than that of the cell. In addition, the critical charge of the combined cell-bitline was found to be dependent on the gate voltage of the access transistor. In addition, the effect of total ionizing dose radiation on the switching times of CMOS logic gates was obtained. The results of this work indicate that the rise time of CMOS logic gates increases, while the fall time decreases, with an increase in total ionizing dose radiation. Also, by increasing the size of the P-channel transistor with respect to that of the N-channel transistor, the propagation delay of a CMOS logic gate can be made to decrease with, or be independent of, an increase in total ionizing dose radiation. Furthermore, a method was developed for replacing the polysilicon feedback resistance of SRAMs with a switched capacitor network. A switched capacitor SRAM was implemented using MOS technology. The switched capacitor SRAM has a very large critical charge. The results of this work indicate that switched capacitor SRAM is a viable alternative to SRAM with polysilicon feedback resistance.

  7. Superior model for fault tolerance computation in designing nano-sized circuit systems

    Energy Technology Data Exchange (ETDEWEB)

    Singh, N. S. S., E-mail: narinderjit@petronas.com.my; Muthuvalu, M. S., E-mail: msmuthuvalu@gmail.com [Fundamental and Applied Sciences Department, Universiti Teknologi PETRONAS, Bandar Seri Iskandar, Perak (Malaysia); Asirvadam, V. S., E-mail: vijanth-sagayan@petronas.com.my [Electrical and Electronics Engineering Department, Universiti Teknologi PETRONAS, Bandar Seri Iskandar, Perak (Malaysia)

    2014-10-24

    As CMOS technology scales nano-metrically, reliability turns out to be a decisive subject in the design methodology of nano-sized circuit systems. As a result, several computational approaches have been developed to compute and evaluate reliability of desired nano-electronic circuits. The process of computing reliability becomes very troublesome and time consuming as the computational complexity builds up with the desired circuit size. Therefore, being able to measure reliability instantly and superiorly is fast becoming necessary in designing modern logic integrated circuits. For this purpose, the paper firstly looks into the development of an automated reliability evaluation tool based on the generalization of Probabilistic Gate Model (PGM) and Boolean Difference-based Error Calculator (BDEC) models. The Matlab-based tool allows users to significantly speed up the task of reliability analysis for a very large number of nano-electronic circuits. Secondly, by using the developed automated tool, the paper presents a comparative study involving reliability computation and evaluation by PGM and BDEC models for different implementations of same functionality circuits. Based on the reliability analysis, BDEC gives exact and transparent reliability measures, but as the complexity of the same functionality circuits with respect to gate error increases, the reliability measure by BDEC tends to be lower than the reliability measure by PGM. The lesser reliability measure by BDEC is well explained in this paper using the distribution of different signal input patterns over time for same functionality circuits. Simulation results conclude that the reliability measure by BDEC depends not only on faulty gates but also on circuit topology, the probability of input signals being one or zero and the probability of error on signal lines.

  8. Superior model for fault tolerance computation in designing nano-sized circuit systems

    International Nuclear Information System (INIS)

    As CMOS technology scales nano-metrically, reliability turns out to be a decisive subject in the design methodology of nano-sized circuit systems. As a result, several computational approaches have been developed to compute and evaluate reliability of desired nano-electronic circuits. The process of computing reliability becomes very troublesome and time consuming as the computational complexity builds up with the desired circuit size. Therefore, being able to measure reliability instantly and superiorly is fast becoming necessary in designing modern logic integrated circuits. For this purpose, the paper firstly looks into the development of an automated reliability evaluation tool based on the generalization of Probabilistic Gate Model (PGM) and Boolean Difference-based Error Calculator (BDEC) models. The Matlab-based tool allows users to significantly speed up the task of reliability analysis for a very large number of nano-electronic circuits. Secondly, by using the developed automated tool, the paper presents a comparative study involving reliability computation and evaluation by PGM and BDEC models for different implementations of same functionality circuits. Based on the reliability analysis, BDEC gives exact and transparent reliability measures, but as the complexity of the same functionality circuits with respect to gate error increases, the reliability measure by BDEC tends to be lower than the reliability measure by PGM. The lesser reliability measure by BDEC is well explained in this paper using the distribution of different signal input patterns over time for same functionality circuits. Simulation results conclude that the reliability measure by BDEC depends not only on faulty gates but also on circuit topology, the probability of input signals being one or zero and the probability of error on signal lines.

  9. Database mirroring in fault-tolerant continuous technological process control

    OpenAIRE

    Danel, Roman; Otte, Lukáš; Kozel, Roman; Johanides, David; Vilamová, Šárka; Janovská, Kamila; Řepka, Michal

    2016-01-01

    This paper describes the implementations of mirroring technology of the selected database systems – Microsoft SQL Server, MySQL and Caché. By simulating critical failures, the systems' behavior and their resilience against failure were tested. The aim was to determine whether database mirroring is suitable for use in continuous metallurgical processes to ensure a fault-tolerant solution at affordable cost. Present-day database systems are characterized by high robustness...

  10. Error Mitigation of Point-to-Point Communication for Fault-Tolerant Computing

    Science.gov (United States)

    Akamine, Robert L.; Hodson, Robert F.; LaMeres, Brock J.; Ray, Robert E.

    2011-01-01

    Fault tolerant systems require the ability to detect and recover from physical damage caused by the hardware's environment, faulty connectors, and system degradation over time. This ability applies to military, space, and industrial computing applications. The integrity of Point-to-Point (P2P) communication, between two microcontrollers for example, is an essential part of fault tolerant computing systems. In this paper, different methods of fault detection and recovery are presented and analyzed.

  11. Analysis of a cascaded multilevel inverter with fault-tolerant control

    OpenAIRE

    Jesús Aguayo Alquicira; Abraham Claudio Sánchez; Luis Gerardo Vela Valdés; Marco Antonio Rodríguez; Rodolfo Amalio Vargas Méndez

    2011-01-01

    Cascaded multilevel inverters are widely used in industry for speed control of induction motors and, even when the converters’ operation is highly reliable, several faults can occur, leading to poor engine performance or even causing the whole system to stop. It is desirable to keep the system operational when a failure occurs, even when degraded, and implementing fault-tolerant systems is thus a good choice. This paper presents a general strategy for fault-tolerant control in a 7-level casc...

  12. Fault tolerant microcomputer based alarm annunciator for Dhruva reactor

    International Nuclear Information System (INIS)

    The Dhruva alarm annunciator displays the status of 624 alarm points on an array of display windows using the standard ringback sequence. Recognizing the need for a very high availability, the system is implemented as a fault tolerant configuration. The annunciator is partitioned into three identical units; each unit is implemented using two microcomputers wired in a hot standby mode. In the event of one computer malfunctioning, the standby computer takes over control in a bouncefree transfer. The use of microprocessors has helped built-in flexibility in the system. The system also provides built-in capability to resolve the sequence of occurrence of events and conveys this information to another system for display on a CRT. This report describes the system features, fault tolerant organisation used and the hardware and software developed for the annunciation function. (author). 8 figs

  13. Diagnosis and Fault-tolerant Control, 3rd Edition

    DEFF Research Database (Denmark)

    Blanke, Mogens; Kinnaert, Michel; Lunze, Jan; Staroswiecki, Marcel

    The book presents effective model-based analysis and design methods for fault diagnosis and fault-tolerant control. Architectural and structural models are used to analyse the propagation of the fault through the process, to test the fault detectability and to find the redundancies in the process...

  14. Fault-Tolerant Routing in Butterfly Networks

    Directory of Open Access Journals (Sweden)

    Mohammed H. Mahafzah

    2010-01-01

    This research shows that Butterfly networks can be fault-tolerant using the Masked Interval Routing Scheme (MIRS). The MIRS was introduced with the aim of compressing the routing tables in a network. It was shown that MIRS could drastically reduce interval information stored in networks such as globe and hypercube graphs, compared to the classical Interval Routing Scheme (IRS). In Butterfly graphs of O(N) vertices the number of intervals per edge goes down from Ω in IRS to O(log N) in MIRS. This research shows that MIRS may be advantageously used in Butterfly networks, proving that optimal routing with one interval per edge is still possible with a harmless subset of faulty vertices. This research gives an optimal algorithm to reconfigure the intervals in the presence of faults.

  15. H∞ Fault Tolerant Control of WECS Based on the PWA Model

    Directory of Open Access Journals (Sweden)

    Yun-Tao Shi

    2014-03-01

    The main contribution of this paper is the development of H∞ fault tolerant control for a wind energy conversion system (WECS) based on the stochastic piecewise affine (PWA) model. In this paper the normal and fault stochastic PWA models for WECS including multiple working points at different wind speeds are established. A reliable piecewise linear quadratic regulator state feedback is designed for the fault tolerant actuator and sensor. A sufficient condition for the existence of the passive fault tolerant controller is derived based on some linear matrix inequalities (LMIs). It is shown that the H∞ fault tolerant controller of WECS can control the wind turbine exposed to multiple simultaneous sensor faults or actuator faults; that is, the reliability of wind turbines can be improved.

  16. Fault-Tolerant Coding for State Machines

    Science.gov (United States)

    Naegle, Stephanie Taft; Burke, Gary; Newell, Michael

    2008-01-01

    Two reliable fault-tolerant coding schemes have been proposed for state machines that are used in field-programmable gate arrays and application-specific integrated circuits to implement sequential logic functions. The schemes apply to strings of bits in state registers, which are typically implemented in practice as assemblies of flip-flop circuits. If a single-event upset (SEU, a radiation-induced change in the bit in one flip-flop) occurs in a state register, the state machine that contains the register could go into an erroneous state or could hang, by which is meant that the machine could remain in undefined states indefinitely. The proposed fault-tolerant coding schemes are intended to prevent the state machine from going into an erroneous or hang state when an SEU occurs. To ensure reliability of the state machine, the coding scheme for bits in the state register must satisfy the following criteria: 1. All possible states are defined. 2. An SEU brings the state machine to a known state. 3. There is no possibility of a hang state. 4. No false state is entered. 5. An SEU exerts no effect on the state machine. Fault-tolerant coding schemes that have been commonly used include binary encoding and "one-hot" encoding. Binary encoding is the simplest state machine encoding and satisfies criteria 1 through 3 if all possible states are defined. Binary encoding is a binary count of the state machine number in sequence; the table represents an eight-state example. In one-hot encoding, N bits are used to represent N states: All except one of the bits in a string are 0, and the position of the 1 in the string represents the state. With proper circuit design, one-hot encoding can satisfy criteria 1 through 4. Unfortunately, the requirement to use N bits to represent N states makes one-hot coding inefficient.
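
    The difference between the two common encodings can be checked with a minimal Python sketch. It models a single-event upset as the inversion of one register bit; all function names are hypothetical, and the sketch covers only binary and one-hot encoding, not the two proposed schemes, which the abstract does not detail. In a fully populated binary register every single-bit flip lands on another defined state, whereas in a one-hot register every single-bit flip produces a word with zero or two bits set, which a validity check can flag.

```python
def binary_encode(state: int, n_states: int) -> int:
    """Binary encoding: the register holds the state number itself."""
    if not 0 <= state < n_states:
        raise ValueError("state out of range")
    return state


def one_hot_encode(state: int, n_states: int) -> int:
    """One-hot encoding: N bits for N states, exactly one bit set."""
    if not 0 <= state < n_states:
        raise ValueError("state out of range")
    return 1 << state


def is_valid_one_hot(word: int, n_states: int) -> bool:
    """A word is a legal one-hot state iff exactly one of the N bits is 1."""
    return 0 < word < (1 << n_states) and word & (word - 1) == 0


def flip_bit(word: int, bit: int) -> int:
    """Model an SEU as the inversion of a single register bit."""
    return word ^ (1 << bit)


def undetected_upsets_binary(n_states: int) -> int:
    """Count single-bit flips that turn one valid binary state into another."""
    width = max(1, (n_states - 1).bit_length())
    count = 0
    for state in range(n_states):
        for bit in range(width):
            if flip_bit(binary_encode(state, n_states), bit) < n_states:
                count += 1
    return count


def undetected_upsets_one_hot(n_states: int) -> int:
    """Count single-bit flips that still leave a legal one-hot word."""
    count = 0
    for state in range(n_states):
        for bit in range(n_states):
            word = flip_bit(one_hot_encode(state, n_states), bit)
            if is_valid_one_hot(word, n_states):
                count += 1
    return count
```

    For the eight-state case, all 24 possible single-bit flips of the 3-bit binary register go undetected, while none of the 64 flips of the 8-bit one-hot register does. The price is the register width, which grows linearly with the number of states.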

  17. Highly Reliable Fault Tolerant Technique for Safety Critical Applications

    Directory of Open Access Journals (Sweden)

    Nanditha S

    2014-05-01

    This paper presents a highly reliable fault tolerant technique for safety critical applications using the Five Modular Redundancy method. In high radiation environments like spacecraft and nuclear thermal plants it is likely that single event upsets (SEUs) degrade the system operation. These cause single bit flips in the sequential elements of electronic components in the system. If these systems are not provided with fault tolerance then there is a high chance of obtaining a false response. In order to avoid this problem the system is made redundant and a roll-forward recovery mechanism is used to increase the overall reliability. Scan cell design is employed to shift out the internal states of all the flip flops during the comparison and recovery process. The proposed method is designed using Verilog HDL on the XILINX ISE simulator.
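
    The voting step at the heart of five modular redundancy can be sketched in a few lines of Python. This is an illustrative software model only, with hypothetical function names; the paper's design is in Verilog and adds roll-forward recovery through scan cells, which is not modelled here. With five copies, a strict majority (three or more) survives any two faulty modules.

```python
from collections import Counter
from typing import Sequence


def majority_vote(outputs: Sequence[int]) -> int:
    """Return the word produced by a strict majority of the modules."""
    value, votes = Counter(outputs).most_common(1)[0]
    if votes * 2 <= len(outputs):
        raise RuntimeError("no strict majority among module outputs")
    return value


def bitwise_majority(words: Sequence[int], width: int) -> int:
    """Vote on each bit position independently across all modules."""
    threshold = len(words) // 2 + 1  # 3 of 5 for five modular redundancy
    result = 0
    for bit in range(width):
        ones = sum((w >> bit) & 1 for w in words)
        if ones >= threshold:
            result |= 1 << bit
    return result
```

    Voting per bit is slightly more tolerant than voting on whole words: if three modules each suffer a flip in a different bit, no whole word has a majority, yet every bit position still does.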

  18. Fault-tolerant computation without concatenation

    CERN Document Server

    Dennis, E

    1999-01-01

    It has been known that error-correction via concatenated codes can be done with exponentially small failure rate if the error rate for physical qubits is below a certain accuracy threshold (probably 10^-3 - 10^-6). Other, un-concatenated codes with their own attractive features - e.g., a threshold of 10^-2 - have also been studied. A method to obtain universal computation is presented here which does not rely on any concatenated structure within the code itself, but instead emulates this structure with logical qubits in order to construct an encoded Toffoli gate. This realizes 10^-2 as a threshold for fault-tolerant quantum computation.

  19. Fault Tolerant Control of Wind Turbines

    DEFF Research Database (Denmark)

    Odgaard, Peter Fogh; Stoustrup, Jakob; Kinnaert, Michel

    2013-01-01

    ...the converter system. Since it is a system-level model, converter and pitch system models are simplified because these are controlled by internal controllers working at higher frequencies than the system model. The model represents a three-bladed pitch-controlled variable-speed wind turbine with a nominal power of 4.8 MW. The fault detection and isolation (FDI) problem was addressed by several teams, and five of the solutions are compared in the second part of this paper. This comparison relies on additional test data in which the faults occur in different operating conditions than in the test data...

  20. Diagnosis and Fault-tolerant Control for Ship Station Keeping

    DEFF Research Database (Denmark)

    Blanke, Mogens

    This paper addresses the design process of diagnosis and fault-tolerant control when a system should operate despite multiple failures in sensors or actuators. Graph-theory based analysis of system structure is demonstrated to be a unique design methodology that can cope with the diagnosis design for systems of high complexity, and also analyse the cases of cascaded or multiple faults. The paper takes as example a ship with two CP propellers, rudders and a bow thruster as actuators, and instrumentation with a suite of global position sensors, inertial navigation units and conventional gyro units to provide ship motion information. A salient feature of the design method is the ability to analyse cases where faults have occurred and easily determine where in the faulty system diagnosability and controllability are retained.

  1. Fault tolerance in Hadoop MapReduce implementation

    OpenAIRE

    Cogorno, Matas; Rey, Javier; Nesmachnow, Sergio

    2013-01-01

    This document reports the advances on exploring and understanding the fault tolerance mechanisms in Hadoop MapReduce. A description of the current fault tolerance features existing in Hadoop is provided, along with a review of related works on the topic. Finally, the document describes some relevant proposals about fault tolerance worth considering to implement in Hadoop within the PERMARE project in order to provide support for pervasive computing environments.

  2. Fault-tolerant distributed computing scheme based on erasure codes

    OpenAIRE

    Lacan, Jérôme

    2006-01-01

    Some emerging classes of distributed computing systems, such as peer-to-peer or grid computing systems, are composed of heterogeneous and potentially unreliable computing resources. This paper proposes to use erasure codes to improve the fault-tolerance of parallel distributed computing applications in this context. A general method to generate redundant processes from a set of parallel processes is presented. This scheme allows the recovery of the result of the application even if some...
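
    The recovery property of an erasure code can be shown with the simplest possible instance, a single XOR parity block, sketched below in Python. This is an assumption chosen for illustration and tolerates only one lost block; the abstract does not say which code the paper uses or how redundant processes are derived. The function names are hypothetical.

```python
from functools import reduce
from typing import List, Optional, Sequence


def xor_blocks(blocks: Sequence[bytes]) -> bytes:
    """Byte-wise XOR of equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def encode(blocks: List[bytes]) -> List[bytes]:
    """Append one parity block, giving k + 1 blocks for k results."""
    if len({len(b) for b in blocks}) != 1:
        raise ValueError("need at least one block, all of equal length")
    return list(blocks) + [xor_blocks(blocks)]


def recover(received: List[Optional[bytes]]) -> List[bytes]:
    """Rebuild the k original blocks when at most one block is missing."""
    missing = [i for i, block in enumerate(received) if block is None]
    if len(missing) > 1:
        raise ValueError("single parity can repair only one erasure")
    repaired = list(received)
    if missing:
        present = [block for block in received if block is not None]
        # The XOR of all surviving blocks equals the missing one.
        repaired[missing[0]] = xor_blocks(present)
    return repaired[:-1]  # drop the parity block
```

    If three workers return `b"abcd"`, `b"efgh"` and `b"ijkl"` and a fourth holds the parity, the loss of any one of the four leaves enough information to rebuild the three results. Codes that survive several simultaneous losses, such as Reed-Solomon codes, follow the same encode and recover pattern with more parity blocks.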

  3. Runtime Instrumentation of SystemC/TLM2 Interfaces for Fault Tolerance Requirements Verification in Software Cosimulation

    OpenAIRE

    Antonio da Silva; Pablo Parra; Óscar R. Polo; Sebastián Sánchez

    2014-01-01

    This paper presents the design of a SystemC transaction level modelling wrapping library that can be used for the assertion of system properties, protocol compliance, or fault injection. The library uses C++ virtual table hooks as a dynamic binary instrumentation technique to inline wrappers in the TLM2 transaction path. This technique can be applied after the elaboration phase and needs neither source code modifications nor recompilation of the top level SystemC modules. The proposed techniq...

  4. Fault Detection for Shipboard Monitoring and Decision Support Systems

    DEFF Research Database (Denmark)

    Lajic, Zoran; Nielsen, Ulrik Dam

    2009-01-01

    In this paper a basic idea of a fault-tolerant monitoring and decision support system will be explained. Fault detection is an important part of the fault-tolerant design for in-service monitoring and decision support systems for ships. In the paper, a virtual example of fault detection will be presented for a containership with a real decision support system onboard. All possible faults can be simulated and detected using residuals and the generalized likelihood ratio (GLR) algorithm.
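
    A residual-based GLR test can be sketched as follows in Python. The abstract does not specify the variant used, so this sketch assumes the textbook case: the residual is Gaussian with zero mean when fault-free, its standard deviation `sigma` is known, and a fault appears as a mean shift of unknown size. The function names, window length and threshold are hypothetical. For each candidate onset, the statistic is the squared sum of the residuals since that onset divided by 2·sigma² times the number of samples, and the alarm is raised when the maximum over the window exceeds the threshold.

```python
from typing import Optional, Sequence


def glr_statistic(residuals: Sequence[float], sigma: float, window: int) -> float:
    """GLR decision function at the latest sample for a mean change.

    Maximises, over the last `window` candidate onsets, the value
    (sum of residuals since onset)^2 / (2 * sigma^2 * number of samples).
    """
    best = 0.0
    total = 0.0
    n = len(residuals)
    for length in range(1, min(window, n) + 1):
        total += residuals[n - length]
        best = max(best, total * total / (2.0 * sigma * sigma * length))
    return best


def detect_fault(residuals: Sequence[float], sigma: float,
                 window: int, threshold: float) -> Optional[int]:
    """Return the index of the first sample that raises an alarm, or None."""
    for k in range(1, len(residuals) + 1):
        if glr_statistic(residuals[:k], sigma, window) > threshold:
            return k - 1
    return None
```

    The threshold trades detection delay against false alarms: a higher value needs more consecutive shifted samples before the alarm fires, but is less likely to trigger on noise alone.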

  5. Design study of Software-Implemented Fault-Tolerance (SIFT) computer

    Science.gov (United States)

    Wensley, J. H.; Goldberg, J.; Green, M. W.; Kutz, W. H.; Levitt, K. N.; Mills, M. E.; Shostak, R. E.; Whiting-Okeefe, P. M.; Zeidler, H. M.

    1982-01-01

    Software-implemented fault tolerant (SIFT) computer design for commercial aviation is reported. A SIFT design concept is addressed. Alternate strategies for physical implementation are considered. Hardware and software design correctness is addressed. System modeling and effectiveness evaluation are considered from a fault-tolerant point of view.

  6. Model Prediction-Based Approach to Fault Tolerant Control with Applications

    OpenAIRE

    Mahmoud, Professor Magdi S.; Khalid, Dr. Haris M.

    2013-01-01

    Fault-tolerant control (FTC) is an integral component in industrial processes as it enables the system to continue robust operation under some conditions. In this paper, an FTC scheme is proposed for interconnected systems within an integrated design framework to yield timely monitoring and detection of faults and to reconfigure the controller according to those faults. The unscented Kalman filter (UKF)-based fault detection and diagnosis system is initially run on the main plant an...

  7. Fault Tolerant Control in a Semi-active Suspension

    OpenAIRE

    Tudon-Martínez, Juan C.; Morales-Menéndez, Rubén; Ramirez-Mendoza, Ricardo; Sename, Olivier; Dugard, Luc

    2012-01-01

    A Fault Tolerant Control System (FTCS) in a Quarter of Vehicle (QoV) model is proposed. The control law is time-varying, using a Linear Parameter-Varying (LPV) based controller that includes two scheduling parameters: one for monitoring the nonlinear behavior of the damper, and another for fault accommodation using a reference model obtained by a state observer of the normal operating regime. The QoV model represents a semi-active suspension, including an experimental magneto-rhe...

  8. Incorporating Fault Tolerance Mechanism into Grid Meta-Scheduler

    Directory of Open Access Journals (Sweden)

    Hong He

    2014-10-01

    In large-scale grid platforms, providing fault-tolerance for users is always a challenging task because of the uncertainty of network resources. In this paper, we present an intelligent agent based meta-scheduler, which aims to improve the fault-tolerance of grid systems when running users' applications. The proposed meta-scheduler is designed as an extendable framework, which allows plugging in multiple scheduling policies to deal with different scenarios. The agent-based scheduling framework enables grid systems to deploy their local schedulers in a flexible manner. Extensive experiments are conducted to investigate the performance of the proposed meta-scheduler, and the results show that it is effective in providing enhanced dependability for grid users, especially when the system spans multiple organizations.

  9. A Blueprint for a Topologically Fault-tolerant Quantum Computer

    CERN Document Server

    Bonderson, Parsa; Freedman, Michael; Nayak, Chetan

    2010-01-01

    The advancement of information processing into the realm of quantum mechanics promises a transcendence in computational power that will enable problems to be solved which are completely beyond the known abilities of any "classical" computer, including any potential non-quantum technologies the future may bring. However, the fragility of quantum states poses a challenging obstacle for realization of a fault-tolerant quantum computer. The topological approach to quantum computation proposes to surmount this obstacle by using special physical systems -- non-Abelian topologically ordered phases of matter -- that would provide intrinsic fault-tolerance at the hardware level. The so-called "Ising-type" non-Abelian topological order is likely to be physically realized in a number of systems, but it can only provide a universal gate set (a requisite for quantum computation) if one has the ability to perform certain dynamical topology-changing operations on the system. Until now, practical methods of implementing thes...

  10. Design of fault-tolerant inductive position sensor

    International Nuclear Information System (INIS)

    The position sensors used in a magnetic bearing system should provide some degree of fault-tolerance, as the rotor position is necessary for the feedback control to overcome the open-loop instability. In this paper, we propose an inductive position sensor that can cope with a partial fault in the sensor. The sensor has multiple poles which can be combined to sense the in-plane motion of the rotor. When a high-frequency voltage signal drives each pole of the sensor, the resulting current in the sensor coil contains information regarding the rotor position. The signal processing circuit of the sensor extracts this position information. In this paper, we used the magnetic circuit model of the sensor that shows the analytical relationship between the sensor output and the rotor motion. The multi-polar structure of the sensor makes it possible to introduce redundancy which can be exploited for fault-tolerant operation. The proposed sensor is applied to a magnetically levitated turbo-molecular vacuum pump. Experimental results validate the fault-tolerance algorithm.

  11. On Reliability Analysis of Fault-tolerant Multistage Interconnection Networks

    Directory of Open Access Journals (Sweden)

    Rinkle Aggarwal

    2008-11-01

    The design of a suitable interconnection network for inter-processor communication is one of the key issues of system performance. The reliability of these networks and their ability to continue operating despite failures are major concerns in determining the overall system performance. In this paper a new irregular network, IABN, is proposed by modifying the existing ABN network. ABN is a regular multipath network with limited fault tolerance. The reliabilities of the IABN and ABN multi-stage interconnection networks have been calculated and compared in terms of the upper and lower bounds of mean time to failure (MTTF). The IABN provides much better fault-tolerance, with three times more paths between any source-destination pair, and better reliability at the expense of a little more cost than ABN.

  12. Fault-tolerant distributed mass storage for LHC computing

    CERN Document Server

    Wiebalck, A; Lindenstruth, V; Stinbeck, T M

    2003-01-01

    In this paper we present the concept and first prototyping results of a modular fault-tolerant distributed mass storage architecture for large Linux PC clusters as they are deployed by the upcoming particle physics experiments. The device masquerading technique using an Enhanced Network Block Device (ENBD) enables local RAID over remote disks as the key concept of the ClusterRAID system. The block level interface to remote files, partitions or disks provided by the ENBD makes it possible to use the standard Linux software RAID to add fault-tolerance to the system. Preliminary performance measurements indicate that the latency is comparable to a local hard drive. With four disks, throughput rates of up to 55 MB/s were achieved with first prototypes for a RAID0 setup, and about 40 MB/s for a RAID5 setup. (29 refs).
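
    The RAID5 fault tolerance mentioned above rests on XOR parity. The following is a minimal sketch of that idea only, not of ClusterRAID or the Linux software RAID implementation: the function names are hypothetical, blocks are short byte strings, and the parity block is kept in a fixed position rather than rotated across disks as real RAID5 does.

```python
from functools import reduce

def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

def make_stripe(data_blocks):
    """Return the data blocks followed by one parity block (simplified layout)."""
    return list(data_blocks) + [xor_blocks(data_blocks)]

def rebuild(stripe, lost_index):
    """Recompute the block at lost_index from all surviving blocks."""
    survivors = [block for i, block in enumerate(stripe) if i != lost_index]
    return xor_blocks(survivors)

# Three data "disks" plus parity; any single lost block can be rebuilt.
stripe = make_stripe([b"disk", b"over", b"enbd"])
recovered = rebuild(stripe, 1)
```

    Because XOR is its own inverse, the same routine rebuilds either a data block or the parity block, which is why a single disk failure is tolerated.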

  13. Checkpoint-based Intelligent Fault tolerance For Cloud Service Providers

    Directory of Open Access Journals (Sweden)

    Rejin Paul

    2012-12-01

    Full Text Available With the increasing demand for and benefits of cloud computing infrastructure, real time computing can be performed on cloud infrastructure. A real time system can take advantage of the intensive computing capabilities and scalable virtualized environment of cloud computing to execute real time tasks. In most real time cloud applications, processing is done on remote cloud computing nodes, so there are more chances of errors due to the undetermined latency and loose control over the computing node. On the other hand, most real time systems are also safety critical and should be highly reliable. There is therefore an increased requirement for fault tolerance to achieve reliability for real time computing on cloud infrastructure. This paper proposes a smart checkpoint infrastructure for virtualized service providers and a fault tolerance model for real time cloud computing. The checkpoints are stored in a Hadoop Distributed File System. This allows a task to resume execution faster after a node crash and increases the fault tolerance of the system, since checkpoints are distributed and replicated in all the nodes of the provider. This paper presents a running implementation of this infrastructure and its evaluation, demonstrating that it is an effective way to make faster checkpoints with low interference on task execution and efficient task recovery after a node failure. One advantage of cloud computing is the dynamicity of resource provisioning. Our architecture makes use of this advantage by enabling dynamic run-time modifications of replication groups.
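
    The checkpoint-and-resume idea described above can be sketched in a few lines. This is an illustration under stated assumptions, not the paper's infrastructure: a local JSON file stands in for the Hadoop Distributed File System, and the names save_checkpoint, load_checkpoint and run_task are hypothetical.

```python
import json
import os
import tempfile

def save_checkpoint(path, state):
    """Write the state to a temporary file, then atomically replace the checkpoint."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as handle:
        json.dump(state, handle)
        handle.flush()
        os.fsync(handle.fileno())
    os.replace(tmp_path, path)

def load_checkpoint(path, default):
    """Return the last saved state, or the default if no checkpoint exists yet."""
    try:
        with open(path) as handle:
            return json.load(handle)
    except FileNotFoundError:
        return default

def run_task(path, n_steps, crash_at=None):
    """Sum 0..n_steps-1, checkpointing after every step; optionally simulate a crash."""
    state = load_checkpoint(path, {"step": 0, "total": 0})
    while state["step"] < n_steps:
        if crash_at is not None and state["step"] == crash_at:
            raise RuntimeError("simulated node crash")
        state["total"] += state["step"]
        state["step"] += 1
        save_checkpoint(path, state)
    return state["total"]
```

    Calling run_task again with the same path after a crash resumes from the last saved step instead of starting over; writing to a temporary file first means a crash during the write cannot corrupt the previous checkpoint.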

  14. Ranking Components using FTCloud for Fault-Tolerant Cloud Applications

    OpenAIRE

    Ms. V. Asha Judi; Mr. C. Sathish

    2014-01-01

    Building highly reliable cloud applications is a challenging and critical research problem. The FTCloud framework is introduced to solve this issue in the cloud environment. FTCloud is a component ranking based framework for building fault-tolerant cloud applications. It consists of two algorithms. FTCloud1 uses component invocation structures and invocation frequencies for finding significant components. FTCloud2 fuses the system structure information as well as component characteristics to identify the ...

  15. H∞ Fault Tolerant Control of WECS Based on the PWA Model

    OpenAIRE

    Yun-Tao Shi; Qi Kou; De-Hui Sun; Zheng-Xi Li; Shu-Juan Qiao; Yan-Jiao Hou

    2014-01-01

    The main contribution of this paper is the development of H∞ fault tolerant control for a wind energy conversion system (WECS) based on the stochastic piecewise affine (PWA) model. In this paper the normal and fault stochastic PWA models for WECS including multiple working points at different wind speeds are established. A reliable piecewise linear quadratic regulator state feedback is designed for the fault tolerant actuator and sensor. A sufficient condition for the existence of the passiv...

  16. Fault tolerant control with torque limitation based on fault mode for ten-phase permanent magnet synchronous motor

    Directory of Open Access Journals (Sweden)

    Guo Hong

    2015-10-01

    Full Text Available This paper proposes a novel fault tolerant control with torque limitation based on the fault mode for the ten-phase permanent magnet synchronous motor (PMSM) under various open-circuit and short-circuit fault conditions, which includes the optimal torque control and the torque limitation control based on the fault mode. The optimal torque control is adopted to guarantee ripple-free electromagnetic torque operation for the ten-phase motor system under the post-fault condition. Furthermore, we systematically analyze the load capacity of the ten-phase motor system under different fault modes, and propose a torque limitation control approach based on the fault mode, which was not available earlier. This approach ensures safe operation of the faulted motor system over long operating times without causing an overheating fault. The simulation result confirms that the proposed fault tolerant control for the ten-phase motor system guarantees ripple-free electromagnetic torque and safe operation over long operating times under normal and fault conditions.

  17. Fault Tolerance in ZigBee Wireless Sensor Networks

    Science.gov (United States)

    Alena, Richard; Gilstrap, Ray; Baldwin, Jarren; Stone, Thom; Wilson, Pete

    2011-01-01

    Wireless sensor networks (WSN) based on the IEEE 802.15.4 Personal Area Network standard are finding increasing use in the home automation and emerging smart energy markets. The network and application layers, based on the ZigBee 2007 PRO Standard, provide a convenient framework for component-based software that supports customer solutions from multiple vendors. This technology is supported by System-on-a-Chip solutions, resulting in extremely small and low-power nodes. The Wireless Connections in Space Project addresses the aerospace flight domain for both flight-critical and non-critical avionics. WSNs provide the inherent fault tolerance required for aerospace applications utilizing such technology. The team from Ames Research Center has developed techniques for assessing the fault tolerance of ZigBee WSNs challenged by radio frequency (RF) interference or WSN node failure.

  18. Fault-tolerance techniques for SRAM-based FPGAs

    CERN Document Server

    Kastensmidt, Fernanda Lima; Reis, Ricardo

    2006-01-01

    Fault-tolerance in integrated circuits is no longer the exclusive concern of space designers or highly-reliable applications engineers. Today, designers of many next-generation products must cope with reduced noise margins. The continuous evolution of fabrication technology of semiconductor components – shrinking transistor geometry, power supply, speed, and logic density – has significantly reduced the reliability of very deep submicron integrated circuits, in the face of various internal and external sources of noise. Field Programmable Gate Arrays (FPGAs), customizable by SRAM cells, are the latest advance in the integrated circuit evolution: millions of memory cells to implement the logic, embedded memories, routing, and embedded microprocessor cores. These re-programmable systems-on-chip platforms must be fault-tolerant to cope with current requirements.

  19. Fault-tolerant quantum computation and communication on a distributed 2D array of small local systems

    Energy Technology Data Exchange (ETDEWEB)

    Fujii, K.; Yamamoto, T.; Imoto, N. [Graduate School of Engineering Science, Osaka University, Toyonaka, Osaka 560-8531 (Japan); Koashi, M. [Photon Science Center, The University of Tokyo, 2-11-16 Yayoi, Bunkyo-ku, Tokyo 113-8656 (Japan)

    2014-12-04

    We propose a scheme for distributed quantum computation with small local systems connected via noisy quantum channels. We show that the proposed scheme tolerates errors with probabilities ∼30% and ∼ 0.1% in quantum channels and local operations, respectively, both of which are improved substantially compared to the previous works.

  20. Fault-tolerant quantum computation and communication on a distributed 2D array of small local systems

    International Nuclear Information System (INIS)

    We propose a scheme for distributed quantum computation with small local systems connected via noisy quantum channels. We show that the proposed scheme tolerates errors with probabilities ∼30% and ∼ 0.1% in quantum channels and local operations, respectively, both of which are improved substantially compared to the previous works.

  1. Fault tolerant controller for a class of additive faults: a quasi-continuous high-order sliding mode approach

    Science.gov (United States)

    Dávila, J.; Cieslak, J.; Henry, D.; Zolghadri, A.

    2015-11-01

    In this paper a fault tolerant control strategy that combines the backstepping procedure and the quasi-continuous high-order sliding mode controller is proposed. The fault tolerance principle is based on a hierarchical application of the backstepping methodology ensuring the finite time convergence of the desired system states, in spite of the considered fault situations. The additive effect of the faults and disturbances is canceled out by the hierarchical application of the quasi-continuous controller ensuring fault-tolerance. The effect of Lebesgue measurable noise over the precision of the proposed controller is studied. Simulation results based on a nonlinear model of the F16 jet fighter show the efficiency of the proposed techniques.

  2. Fault Tolerance and Parallel Processing for NGST

    Science.gov (United States)

    Sengupta, R.; Offenberg, J. D.; Fixsen, D. J.; Nieto-Santisteban, M. A.; Hanisch, R. J.; Stockman, H. S.; Mather, J. C.

    1999-12-01

    The Next Generation Space Telescope (NGST) Image Processing Group is developing scalable cosmic ray rejection and data compression algorithms for parallel processors as part of NASA's Remote Exploration and Experimentation (REE) Project. The primary intention of the REE project is to use commercial-off-the shelf (COTS) technology to develop scalable, low-power, fault tolerant, high performance computers in space. NGST is one of the applications selected to demonstrate the benefit of having on-board supercomputing power. Real-time cosmic ray rejection would enable us to reduce the downlink data volume by as much as two orders of magnitude by combining multiple read-outs on the spacecraft rather than downlinking them separately. The combined read-outs can be further reduced in size by applying lossy and/or lossless data compression algorithms. This work is funded by NASA's REE project, managed by JPL.

  3. A fault-tolerant attitude control system for a satellite based on fuzzy global sliding mode control algorithm

    Science.gov (United States)

    Liang, Jinjin; Dong, Chaoyang; Wang, Qing

    2008-10-01

    An effective approach for fault diagnosis of aeroengines based on the integration of wavelet analysis and neural networks is presented. The wavelet transform can accurately localize the characteristics of a signal in the time-frequency domain. In view of the relationship between the wavelet transform and exponent theory, the global and local exponents obtained from the wavelet transform coefficients are used as features for extracting fault signals, which are fed into a radial basis function network for fault pattern recognition. The fault diagnosis model of the aeroengine is established, and the improved Levenberg-Marquardt training algorithm is used to identify the network structure and parameters. By training the fault diagnosis network on enough samples and feeding the information representing the faults into the neural network, the fault pattern can be determined. The robustness of the wavelet neural network for fault diagnosis is discussed. The practical fault diagnosis for aeroengine vibration proves to be accurate and comprehensive.

  4. On the Practicality of `Practical' Byzantine Fault Tolerance

    CERN Document Server

    Chondros, Nikos; Roussopoulos, Mema

    2011-01-01

    Byzantine Fault Tolerant (BFT) systems are considered by the systems research community to be state of the art with regard to providing reliability in distributed systems. BFT systems provide safety and liveness guarantees with reasonable assumptions, amongst a set of nodes where at most f nodes display arbitrarily incorrect behaviors, known as Byzantine faults. Despite this, BFT systems are still rarely used in practice. In this paper we describe our experience, from an application developer's perspective, trying to leverage the publicly available and highly-tuned PBFT middleware (by Castro and Liskov), to provide provable reliability guarantees for an electronic voting application with high security and robustness needs. We describe several obstacles we encountered and drawbacks we identified in the PBFT approach. These include some that we tackled, such as lack of support for dynamic client management and leaving state management completely up to the application. Others still remaining include the lack of...
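
    The "at most f nodes" bound above translates into concrete group sizes. The sketch below uses the standard PBFT relations (at least 3f + 1 replicas, quorums of 2f + 1); the function names are hypothetical and do not belong to the PBFT middleware's API.

```python
def pbft_sizes(f):
    """Return (minimum replicas, quorum size) needed to tolerate f Byzantine nodes."""
    n = 3 * f + 1       # minimum number of replicas
    quorum = 2 * f + 1  # matching replies needed so any two quorums share a correct node
    return n, quorum

def max_faults(n):
    """Largest number of Byzantine nodes that n replicas can tolerate."""
    return (n - 1) // 3
```

    For example, pbft_sizes(1) gives four replicas with a quorum of three, and adding a fifth or sixth replica does not raise the number of tolerated faults until n reaches seven.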

  5. Hypothetical Scenario Generator for Fault-Tolerant Diagnosis

    Science.gov (United States)

    James, Mark

    2007-01-01

    The Hypothetical Scenario Generator for Fault-tolerant Diagnostics (HSG) is an algorithm being developed in conjunction with other components of artificial-intelligence systems for automated diagnosis and prognosis of faults in spacecraft, aircraft, and other complex engineering systems. By incorporating prognostic capabilities along with advanced diagnostic capabilities, these developments hold promise to increase the safety and affordability of the affected engineering systems by making it possible to obtain timely and accurate information on the statuses of the systems and predicting impending failures well in advance. The HSG is a specific instance of a hypothetical-scenario generator that implements an innovative approach for performing diagnostic reasoning when data are missing. The special purpose served by the HSG is to (1) look for all possible ways in which the present state of the engineering system can be mapped with respect to a given model and (2) generate a prioritized set of future possible states and the scenarios of which they are parts.

  6. Fault diagnosis and fault-tolerant control and guidance for aerospace vehicles from theory to application

    CERN Document Server

    Zolghadri, Ali; Cieslak, Jerome; Efimov, Denis; Goupil, Philippe

    2014-01-01

    Fault Diagnosis and Fault-Tolerant Control and Guidance for Aerospace demonstrates the attractive potential of recent developments in control for resolving such issues as improved flight performance, self-protection and extended life of structures. Importantly, the text deals with a number of practically significant considerations: tuning, complexity of design, real-time capability, evaluation of worst-case performance, robustness in harsh environments, and extensibility when development or adaptation is required. Coverage of such issues helps to draw the advanced concepts arising from academic research back towards the technological concerns of industry. Initial coverage of basic definitions and ideas and a literature review gives way to a treatment of important electrical flight control system failures: the oscillatory failure case, runaway, and jamming. Advanced fault detection and diagnosis for linear and nonlinear systems are described. Lastly, recovery strategies appropriate to remaining actuator/sensor/c...

  7. Fault-Tolerant Postselected Quantum Computation: Threshold Analysis

    OpenAIRE

    Knill, E.

    2004-01-01

    The schemes for fault-tolerant postselected quantum computation given in [Knill, Fault-Tolerant Postselected Quantum Computation: Schemes, http://arxiv.org/abs/quant-ph/0402171] are analyzed to determine their error-tolerance. The analysis is based on computer-assisted heuristics. It indicates that if classical and quantum communication delays are negligible, then scalable qubit-based quantum computation is possible with errors above 1% per elementary quantum gate.

  8. Improvement of Matrix Converter Drive Reliability by Online Fault Detection and a Fault-Tolerant Switching Strategy

    DEFF Research Database (Denmark)

    Nguyen-Duy, Khiem; Liu, Tian-Hua; Chen, Der-Fa

    2011-01-01

    The matrix converter system is becoming a very promising candidate to replace the conventional two-stage ac/dc/ac converter, but system reliability remains an open issue. The most common reliability problem is that a bidirectional switch has an open-switch fault during operation. In this paper, a...... matrix converter driving a speed-controlled permanent-magnet synchronous motor is examined under a single open-switch fault. First, a new fault-detection method is proposed using only the motor currents. Second, a novel fault-tolerant switching strategy is presented. By treating the matrix converter as a....... Experimental results show that the proposed method can maintain the motor speed with a maximum ripple of 2%—a fivefold improvement over the uncompensated system. The proposed method therefore offers a very economical and effective solution for the matrix converter fault tolerance problem....

  9. Guaranteed Cost Fault-tolerant Control of Networked Control Systems with Short Output Delay and Short Control Delay Based on State Observer

    Directory of Open Access Journals (Sweden)

    Xiaomao Huang

    2013-04-01

    Full Text Available Supposing that the sensor and controller nodes were time-driven and the actuator node was event-driven, the problem of integrity against sensor failures for networked control systems with short output delay and short control delay was discussed based on an observer. The state observer of the system was designed according to the time-delay compensation strategy. Then, considering possible sensor failures, an augmented mathematical model for the observer-based networked control systems was developed. In terms of the given quadratic performance index function, the integrity condition of the system was given, and the designs for the guaranteed cost fault-tolerant controller and observer were presented by using the cooperative design approach of the controller and observer and the approach of bilinear matrix inequalities. Finally, a numerical simulation example demonstrated that the conclusions are feasible and effective. The proposed control method meets the requirements of industrial networked control systems.

  10. Observer-based Fault Detection and Isolation for Nonlinear Systems

    OpenAIRE

    Lootsma, T.F.

    2001-01-01

    With the rise in automation the increase in fault detection and isolation & reconfiguration is inevitable. Interest in fault detection and isolation (FDI) for nonlinear systems has grown significantly in recent years. The design of FDI is motivated by the need for knowledge about occurring faults in fault-tolerant control systems (FTC systems). The idea of FTC systems is to detect, isolate, and handle faults in such a way that the systems can still perform in a required manner. One prefers...

  11. Solar system fault detection

    Science.gov (United States)

    Farrington, R.B.; Pruett, J.C. Jr.

    1984-05-14

    A fault detecting apparatus and method are provided for use with an active solar system. The apparatus provides an indication as to whether one or more predetermined faults have occurred in the solar system. The apparatus includes a plurality of sensors, each sensor being used in determining whether a predetermined condition is present. The outputs of the sensors are combined in a pre-established manner in accordance with the kind of predetermined faults to be detected. Indicators communicate with the outputs generated by combining the sensor outputs to give the user of the solar system and the apparatus an indication as to whether a predetermined fault has occurred. Upon detection and indication of any predetermined fault, the user can take appropriate corrective action so that the overall reliability and efficiency of the active solar system are increased.
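
    Combining sensor outputs "in a pre-established manner" can be pictured as a fixed table of boolean rules. The sketch below is purely illustrative: the sensor names (collector_hot, flow_present) and the two fault rules are hypothetical and are not taken from the described apparatus.

```python
# Each rule maps a dictionary of boolean sensor conditions to "fault present?".
# Sensor names and rules are assumptions made for illustration only.
FAULT_RULES = {
    "pump_not_running": lambda s: s["collector_hot"] and not s["flow_present"],
    "pump_stuck_on": lambda s: not s["collector_hot"] and s["flow_present"],
}

def detect_faults(sensors):
    """Return the sorted names of all predetermined faults indicated by the sensors."""
    return sorted(name for name, rule in FAULT_RULES.items() if rule(sensors))
```

    Each returned name would correspond to one indicator; an empty list means no predetermined fault is currently indicated.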

  12. Fault tolerant wind speed estimator used in wind turbine controllers

    DEFF Research Database (Denmark)

    Odgaard, Peter Fogh; Stoustrup, Jakob

    Advanced control schemes can be used to optimize energy production and cost of energy in modern wind turbines. These control schemes most often rely on wind speed estimations. These designs of wind speed estimators are, however, not designed to be fault tolerant towards faults in the used sensors...... applying the proposed wind speed estimator to a simulation model of a wind turbine. Notice that since the faults are accommodated in the observer scheme the actual controller does not need to be adjusted or reconfigured to accommodate the sensor faults........ In this paper a fault tolerant wind speed estimator is designed based on a set of unknown input observers, each designed to the different sets of non-faulty sensors. Faults in the rotor, generator and wind speed sensors are considered. The designed wind speed estimator is passive tolerant towards...

  13. Fault Detection and Isolation and Fault Tolerant Control of Wind Turbines Using Set-Valued Observers

    DEFF Research Database (Denmark)

    Casau, Pedro; Rosa, Paulo Andre Nobre; Tabatabaeipour, Seyed Mojtaba; Silvestre, Carlos

    account process disturbances, uncertainty and sensor noise. The FTC strategy takes advantage of the proposed FDI algorithm, enabling the controller reconfiguration shortly after fault events. Additionally, a robust controller is designed so as to increase the wind turbine's performance during low severity......Research on wind turbine Operations & Maintenance (O&M) procedures is critical to the expansion of Wind Energy Conversion systems (WEC). In order to reduce O&M costs and increase the lifespan of the turbine, we study the application of Set-Valued Observers (SVO) to the problem of Fault Detection...... and Isolation (FDI) and Fault Tolerant Control (FTC) of wind turbines, by taking advantage of the recent advances in SVO theory for model invalidation. A simple wind turbine model is presented along with possible faulty scenarios. The FDI algorithm is built on top of the described model, taking into...

  14. Multiple Fault Isolation in Redundant Systems

    Science.gov (United States)

    Shakeri, M.; Pattipati, Krishna R.; Raghavan, V.; Patterson-Hine, Ann; Iverson, David L.

    1997-01-01

    We consider the problem of sequencing tests to isolate multiple faults in redundant (fault-tolerant) systems with minimum expected testing cost (time). It can be shown that single faults and minimal faults, i.e., minimum number of failures with a failure signature different from the union of failure signatures of individual failures, together with their failure signatures, constitute the necessary information for fault diagnosis in redundant systems. In this paper, we develop an algorithm to find all the minimal faults and their failure signatures. Then, we extend the Sure diagnostic strategies [1] of our previous work to diagnose multiple faults in redundant systems. The proposed algorithms and strategies are illustrated using several examples.

  15. A Convex Approach to Fault Tolerant Control

    Science.gov (United States)

    Maghami, Peiman G.; Cox, David E.; Bauer, Frank (Technical Monitor)

    2002-01-01

    The design of control laws for dynamic systems with the potential for actuator failures is considered in this work. The use of Linear Matrix Inequalities allows more freedom in controller design criteria than typically available with robust control. This work proposes an extension of fault-scheduled control design techniques that can find a fixed controller with provable performance over a set of plants. Through convexity of the objective function, performance bounds on this set of plants imply performance bounds on a range of systems defined by a convex hull. This is used to incorporate performance bounds for a variety of soft and hard failures into the control design problem.

  16. A New Fault-tolerant Switched Reluctance Motor with reliable fault detection capability

    DEFF Research Database (Denmark)

    Lu, Kaiyuan

    For reliable fault detection, often, search coils are used in many fault-tolerant drives. The search coils occupy extra slot space. They are normally open-circuited and are not used for torque production. This degrades the motor performance, increases the cost and manufacture complexity. A new...... Fault-Tolerant Switched Reluctance (FTSR) motor is proposed in this paper. A unique feature of this special design is that it allows use of the unexcited phase coils as search coils for fault detection. Therefore this new motor has all the advantages of using search coils for reliable fault detection...... while no extra search coil is actually needed. The motor itself is able to continue to work under any faulted conditions, providing fault-tolerant features. The working principle, performance evaluation of this motor will be demonstrated in this paper and Finite Element Analysis results are provided....

  17. Software fault-tolerant distributed applications in LiPS

    OpenAIRE

    Setz, Thomas

    1997-01-01

    This paper illustrates how software fault-tolerant distributed applications are implemented within LIPS version 2.4, a system for distributed computing using idle-cycles in networks of workstations. The LIPS system [SR92, SR93,STea94,Set95,SF96,ST96,SL97,Set97] employs the tuple space programming paradigm, as originally used in the LINDA programming language. Applications implemented using this paradigm easily adapt to changes in availability as they occur in workstation networks. In LIPS, ap...

  18. Fusion of Built in Test (BIT) Technologies with Embeddable Fault Tolerant Techniques for Power System and Drives in Space Exploration Project

    Data.gov (United States)

    National Aeronautics and Space Administration — Impact Technologies has proposed development of an effective prognostic and fault accommodation system for critical DC power systems including PV systems. Overall...

  19. A Framework-Based Approach for Fault-Tolerant Service Robots

    Directory of Open Access Journals (Sweden)

    Heejune Ahn

    2012-11-01

    Full Text Available Recently the component‐based approach has become a major trend in intelligent service robot development due to its reusability and productivity. The framework in a component‐based system should provide essential services for application components. However, to our knowledge the existing robot frameworks do not yet support a fault tolerance service. Moreover, it is often believed that faults can be handled only at the application level. In this paper, by extending the robot framework with the fault tolerance function, we argue that the framework‐based fault tolerance approach is feasible and even has many benefits, including that: 1) the system integrators can build fault tolerance applications from non‐fault‐aware components; 2) the constraints of the components and the operating environment can be considered at the time of integration, which cannot be anticipated easily at the time of component development; 3) consistency in system reliability can be obtained even in spite of diverse application component sources. In the proposed construction, we build XML rule files defining the rules for probing and determining the fault conditions of each component, contamination cases from a faulty component, and the possible recovery and safety methods. The rule files are established by a system integrator and the fault manager in the framework controls the fault tolerance process according to the rules. We demonstrate that the fault‐tolerant framework can incorporate widely accepted fault tolerance techniques. The effectiveness and real‐time performance of the framework‐based approach and its techniques are examined by testing an autonomous mobile robot in typical fault scenarios.

  20. Active and Passive Fault-Tolerant LPV Control of Wind Turbines

    DEFF Research Database (Denmark)

    Sloth, Christoffer; Esbensen, Thomas; Stoustrup, Jakob

    2010-01-01

    This paper addresses the design and comparison of active and passive fault-tolerant linear parameter-varying (LPV) controllers for wind turbines. The considered wind turbine plant model is characterized by parameter variations along the nominal operating trajectory and includes a model of an...... scheduled on the varying parameter to manage the parameter-varying nature of the model. The PFTC only relies on measured system variables and an estimated wind speed, while the AFTC also relies on information from a fault diagnosis system. Consequently, the optimization problem involved in designing the PFTC...... incipient fault in the pitch system. We propose the design of an active fault-tolerant controller (AFTC) based on an existing LPV controller design method and extend this method to apply for the design of a passive fault-tolerant controller (PFTC). Both controllers are based on output feedback and are...

  1. Active fault-tolerant control strategy of large civil aircraft under elevator failures

    OpenAIRE

    Wang Xingjian; Wang Shaoping; Yang Zhongwei; Zhang Chao

    2015-01-01

    Aircraft longitudinal control is the most important actuation system and its failures would lead to catastrophic accident of aircraft. This paper proposes an active fault-tolerant control (AFTC) strategy for civil aircraft with different numbers of faulty elevators. In order to improve the fault-tolerant flight control system performance and effective utilization of the control surface, trimmable horizontal stabilizer (THS) is considered to generate the extra pitch moment. A suitable switchin...

  2. Local fault-tolerant quantum computation

    International Nuclear Information System (INIS)

    We analyze and study the effects of locality on the fault-tolerance threshold for quantum computation. We analytically estimate how the threshold will depend on a scale parameter r which characterizes the scale-up in the size of the circuit due to encoding. We carry out a detailed seminumerical threshold analysis for concatenated coding using the seven-qubit CSS code in the local and the 'nonlocal' setting. First, we find that the threshold in the local model for the [7,1,3] code has a 1/r dependence, which is in correspondence with our analytical estimate. Second, the threshold, beyond the 1/r dependence, does not depend too strongly on the noise levels for transporting qubits. Beyond these results, we find that it is important to look at more than one level of concatenation in order to estimate the threshold and that it may be beneficial in certain places, like in the transportation of qubits, to do error correction only infrequently.
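
    As background for the threshold discussed above, the textbook estimate for a concatenated code that corrects one error per block can be written out as follows. This is a generic sketch, not the paper's seminumerical analysis; the constant c, the physical error rate p_0 and the level index k are notation assumed here.

```latex
% One level of encoding maps an error rate p_k to roughly c p_k^2,
% where c counts the ways two faults can defeat the code.
\begin{align}
  p_{k+1} &\approx c\, p_k^{2}, \qquad p_{\mathrm{th}} \equiv \frac{1}{c}, \\
  p_{L}   &\approx p_{\mathrm{th}} \left( \frac{p_0}{p_{\mathrm{th}}} \right)^{2^{L}} .
\end{align}
% The logical error rate p_L falls doubly exponentially in the number of
% levels L whenever p_0 < p_th, and grows when p_0 > p_th.
```

    In this picture, the reported 1/r dependence would correspond to the effective constant c growing in proportion to the scale parameter r in the local model.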

  3. Improved open switch fault detection based on normalized current analysis in multiphase fault tolerant converters

    OpenAIRE

    Salehifar, Mehdi; Moreno Eguilaz, Juan Manuel; Sala Caselles, Vicenç; Salehi Arashloo Arashloo, Ramin; Romeral Martínez, José Luis

    2013-01-01

    A new open switch fault detection method based on normalized current analysis is proposed for application in multiphase fault tolerant PMSM drives. Performance characteristics of the proposed method are a single, simple diagnostic variable, the ability to detect an open phase fault without using an auxiliary variable, the ability to detect multiple switch faults, generality, and robustness in case of highly unbalanced current waveforms. Theory of diagnostic method with special multiphase d...

  4. A novel fault tolerant permanent magnet synchronous motor with improved optimal torque control for aerospace application

    Directory of Open Access Journals (Sweden)

    Guo Hong

    2015-04-01

    Improving the fault tolerant performance of the permanent magnet synchronous motor has always been the central issue of the electrically supplied actuator for aerospace application. In this paper, a novel fault tolerant permanent magnet synchronous motor is proposed, which is characterized by two stators and two rotors on the same shaft with a circumferential displacement of a mechanical angle of 4.5°. This displacement helps to reduce the cogging torque. Each segment of the stator and the rotor can be considered as an 8-pole/10-slot five-phase permanent magnet synchronous motor with a concentrated, single-layer, alternate-teeth-wound winding, which enhances the fault isolation capacity of the motor. Furthermore, the motor has high phase inductance to restrain the short-circuit current. In addition, an improved optimal torque control strategy is proposed to make the motor work well under open-circuit and short-circuit fault conditions. Simulation and experiment results show that the proposed fault tolerant motor system has excellent fault tolerant capacity: it is able to operate continuously under the third open-circuit fault and second short-circuit fault condition without system performance degradation, which was not available earlier.

  5. Universal fault tolerant quantum computation on bilinear nearest neighbor arrays

    OpenAIRE

    Stephens, A. M.; Fowler, A. G.; Hollenberg, L C L

    2007-01-01

    Assuming an array that consists of two parallel lines of qubits and that permits only nearest neighbor interactions, we construct physical and logical circuitry to enable universal fault tolerant quantum computation under the [[7,1,3

  6. Reliability and fault tolerance in the European ADS project

    OpenAIRE

    Biarrotte, Jean-Luc

    2013-01-01

    After an introduction to the theory of reliability, this paper focuses on a description of the linear proton accelerator proposed for the European ADS demonstration project. Design issues are discussed and examples of cases of fault tolerance are given.

  7. Fault Tolerance In Grid Computing: State of the Art and Open Issues

    Directory of Open Access Journals (Sweden)

    Ritu Garg

    2011-02-01

    Fault tolerance is an important property for large-scale computational grid systems, where geographically distributed nodes cooperate to execute a task. To achieve a high level of reliability and availability, the grid infrastructure should be fault tolerant. Since the failure of resources affects job execution fatally, a fault tolerance service is essential to satisfy QoS requirements in grid computing. Commonly used techniques for providing fault tolerance are job checkpointing and replication. Both techniques mitigate the amount of work lost due to changing system availability but can introduce significant runtime overhead. The latter largely depends on the length of the checkpointing interval and the chosen number of replicas, respectively. In complex scientific workflows, where tasks execute in a well-defined order, reliability is another major challenge because of the unreliable nature of the grid resources.
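
    The dependence of checkpointing overhead on the interval length can be illustrated with Young's first-order approximation. This is a standard textbook model, not a method taken from the surveyed paper. The function names and the example numbers below are hypothetical, and the sketch assumes a fixed checkpoint cost and exponentially distributed failures.

```python
import math


def young_interval(checkpoint_cost: float, mtbf: float) -> float:
    """Young's first-order estimate of the checkpoint interval: sqrt(2 * C * M).

    checkpoint_cost (C) and mtbf (M) are assumed inputs in the same time unit.
    """
    if checkpoint_cost <= 0 or mtbf <= 0:
        raise ValueError("checkpoint_cost and mtbf must be positive")
    return math.sqrt(2.0 * checkpoint_cost * mtbf)


def expected_overhead(interval: float, checkpoint_cost: float, mtbf: float) -> float:
    """Approximate fraction of run time lost to checkpointing and rework.

    C / T is the cost of writing checkpoints; T / (2 * M) is the expected
    rework per failure (half an interval on average) spread over the MTBF.
    """
    if interval <= 0:
        raise ValueError("interval must be positive")
    return checkpoint_cost / interval + interval / (2.0 * mtbf)


if __name__ == "__main__":
    cost, mtbf = 50.0, 10_000.0  # hypothetical values, in seconds
    best = young_interval(cost, mtbf)
    for interval in (best / 2, best, best * 2):
        print(interval, expected_overhead(interval, cost, mtbf))
```

    Under this model, checkpointing either more or less often than the estimated interval increases the total overhead, which is the trade-off the abstract describes.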

  8. A universal, fault-tolerant, non-linear analytic network for modeling and fault detection

    International Nuclear Information System (INIS)

    The similarities and differences of a universal network to normal neural networks are outlined. The description and application of a universal network is discussed by showing how a simple linear system is modeled by normal techniques and by universal network techniques. A full implementation of the universal network as universal process modeling software on a dedicated computer system at EBR-II is described and example results are presented. It is concluded that the universal network provides different feature recognition capabilities than a neural network and that the universal network can provide extremely fast, accurate, and fault-tolerant estimation, validation, and replacement of signals in a real system

  9. Fault Tolerance Structure of Radix 2 Signed Digital Adders

    Directory of Open Access Journals (Sweden)

    Jishun Kuang

    2012-01-01

    In this study, a fault-tolerant adder structure based on the radix-2 signed-digit (SD) representation is proposed. The "carry-free" property of the SD adder, which limits the impact of a fault to a few digits, can be used for fault detection based on parity checking under a single-fault assumption. An encoding scheme is used to obtain the parity values of the digits involved in the computation, and these parity values are exploited to check the circuit. An error information register stores the checking results, and the bits of the register indicate whether the corresponding units are faulty. Depending on the fault type, recomputation or reconfiguration is used for error correction. The hardware overhead of adding fault tolerance is about 120%, and the maximum combinational path delay of the proposed adder remains constant as the operand size increases.

  10. Design of passive fault-tolerant controllers of a quadrotor based on sliding mode theory

    Directory of Open Access Journals (Sweden)

    Merheb Abdel-Razzak

    2015-09-01

    In this paper, sliding mode control is used to develop two passive fault tolerant controllers for an AscTec Pelican UAV quadrotor. In the first approach, a regular sliding mode controller (SMC) augmented with an integrator uses the robustness property of variable structure control to tolerate partial actuator faults. The second approach is a cascaded sliding mode controller with inner and outer SMC loops. In this configuration, faults are tolerated in the fast inner loop controlling the velocity system. The controllers are tuned to find the optimal values of the sliding mode controller gains using the ecological systems algorithm (ESA), a biologically inspired stochastic search algorithm based on the natural equilibrium of animal species. The controllers are tested using SIMULINK in the presence of two different types of actuator faults: partial loss of motor power affecting all the motors at once, and partial loss of motor speed. Results of the quadrotor following a continuous path demonstrate the effectiveness of the controllers, which are able to tolerate a significant number of actuator faults despite the lack of hardware redundancy in the quadrotor system. Tuning the controller using a faulty system further improves its ability to withstand more severe faults. Simulation results show that passive schemes preserve their important role in fault tolerant control and are complementary to active techniques.

  11. Fault-tolerant Sensor Fusion for Marine Navigation

    DEFF Research Database (Denmark)

    Blanke, Mogens

    2006-01-01

    Reliability of navigation data is critical for steering and manoeuvring control, particularly at high speed or in critical phases of a mission. Should faults occur, faulty instruments need to be autonomously isolated and faulty information discarded. This paper designs a navigation solution...... events where the fault-tolerant sensor fusion provided uninterrupted navigation data despite temporal instrument defects...

  12. Diagnosis and fault-tolerant control using set-based methods

    OpenAIRE

    Feng XU

    2014-01-01

    Fault-tolerant capability is an important performance specification for most technical systems. Its importance is illustrated by some catastrophes in civil aviation. According to some official investigations, some air incidents are technically avoidable if the pilots take the right measures. However, relying on the skill and experience of the pilots cannot guarantee that reliable flight decisions are always made. Instead, if fault-tolerant strategies can be included in the d...

  13. A Remote Characterization System and a fault-tolerant tracking system for subsurface mapping of buried waste sites

    International Nuclear Information System (INIS)

    This paper describes two closely related projects that will provide new technology for characterizing hazardous waste burial sites. The first project, a collaborative effort by five of the national laboratories, involves the development and demonstration of a remotely controlled site characterization system. The Remote Characterization System (RCS) includes a unique low-signature survey vehicle, a base station, radio telemetry data links, satellite-based vehicle tracking, stereo vision, and sensors for noninvasive inspection of the surface and subsurface. The second project, conducted by the Idaho National Engineering Laboratory (INEL), involves the development of a position sensing system that can track a survey vehicle or instrument in the field. This system can coordinate updates at a rate of 200/s with an accuracy better than 0.1% of the distance separating the target and the sensor. It can employ acoustic or electromagnetic signals in a wide range of frequencies and can be operated as a passive or active device

  14. Fault-tolerant control under controller-driven sampling using virtual actuator strategy

    OpenAIRE

    Osella, Esteban N.; Haimovich, Hernan; Seron, María M.

    2013-01-01

    We present a new output feedback fault tolerant control strategy for continuous-time linear systems. The strategy combines a digital nominal controller under controller-driven (varying) sampling with virtual-actuator (VA)-based controller reconfiguration to compensate for actuator faults. In the proposed scheme, the controller controls both the plant and the sampling period, and performs controller reconfiguration by engaging in the loop the VA adapted to the diagnosed fault. The VA also oper...

  15. Robust Adaptive Fault-Tolerant Tracking Control of Three-Phase Induction Motor

    OpenAIRE

    Hossein Tohidi; Koksal Erenturk

    2014-01-01

    This paper deals with the problem of induction motor tracking control against actuator faults and external disturbances using the linear matrix inequalities (LMIs) method and the adaptive method. A direct adaptive fault-tolerant tracking controller design method is developed based on Lyapunov stability theory and a constructive algorithm based on linear matrix inequalities for online tuning of adaptive and state feedback gains to stabilize the closed-loop system in order to reduce the fault e...

  16. Fault Tolerant Strategy for Semi-Active Suspensions with LPV Accommodation

    OpenAIRE

    Tudon-Martínez, Juan; Varrier, Sébastien; Sename, Olivier; Morales Menendez, Ruben; Martinez Molina, John Jairo; Dugard, Luc

    2013-01-01

    A novel fault tolerant strategy to compensate multiplicative actuator faults (damper oil leakages) in a semi-active suspension system is proposed. The lack of damping force caused by a faulty damper is compensated by the remaining three healthy semi-active dampers. Once a faulty damper is detected and isolated by a Fault Detection and Isolation strategy based on parity-space, an estimator is activated to compute the missing damping force to compensate. In order to ...

  17. Tolerance of Radial-Basis Functions Against Stuck-At-Faults

    OpenAIRE

    Eickhoff, Ralf; Rückert, Ulrich

    2005-01-01

    Neural networks are intended to be used in future nanoelectronic systems since neural architectures seem to be robust against malfunctioning elements and noise in their weights. In this paper we analyze the fault-tolerance of Radial Basis Function networks to Stuck-At-Faults at the trained weights and at the output of neurons. Moreover, we determine upper bounds on the mean square error arising from these faults.
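
    The effect of a stuck-at fault on a trained weight can be sketched numerically. The example below is an illustration only: the one-dimensional Gaussian network, its parameters, and the function names are hypothetical, and the bound it checks is an elementary one (a Gaussian basis output never exceeds 1, so a single stuck weight changes the output by at most the weight deviation), not the bounds derived in the paper.

```python
import math
from typing import List, Sequence


def rbf_output(x: float, centers: Sequence[float], widths: Sequence[float],
               weights: Sequence[float]) -> float:
    """Output of a 1-D Gaussian RBF network with a linear output layer."""
    return sum(
        w * math.exp(-((x - c) ** 2) / (2.0 * s ** 2))
        for c, s, w in zip(centers, widths, weights)
    )


def mse_under_stuck_at(inputs: Sequence[float], centers: Sequence[float],
                       widths: Sequence[float], weights: Sequence[float],
                       index: int, stuck_value: float) -> float:
    """Mean square deviation between the fault-free and the faulty network.

    The weight at `index` is forced to `stuck_value` (a stuck-at fault).
    """
    faulty: List[float] = list(weights)
    faulty[index] = stuck_value
    errors = [
        (rbf_output(x, centers, widths, weights)
         - rbf_output(x, centers, widths, faulty)) ** 2
        for x in inputs
    ]
    return sum(errors) / len(errors)


if __name__ == "__main__":
    xs = [i / 10.0 for i in range(-20, 21)]
    centers, widths, weights = [-1.0, 0.0, 1.0], [0.5, 0.5, 0.5], [0.8, -1.2, 0.5]
    mse = mse_under_stuck_at(xs, centers, widths, weights, index=1, stuck_value=0.0)
    bound = (weights[1] - 0.0) ** 2  # elementary bound for one stuck weight
    print(mse, bound)
```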

  18. Design of passive fault-tolerant flight controller against actuator failures

    Directory of Open Access Journals (Sweden)

    Yu Xiang

    2015-02-01

    The problem of designing a passive fault-tolerant flight controller is addressed for the situation in which the normal and faulty cases are prescribed. First, the considered fault and fault-free cases are represented by polytopes. Because the safety of a post-fault system is directly related to the maximum values of physical variables in the system, the peak-to-peak gain is selected to represent the relationships among the amplitudes of actuator outputs, system outputs, and reference commands. Based on the parameter-dependent Lyapunov and slack methods, passive fault-tolerant flight controllers are designed for actuator failure cases in the absence and presence of system uncertainty, respectively. Case studies of an airplane under actuator failures are carried out to validate the effectiveness of the proposed approach.

  19. Design of neuro fuzzy fault tolerant control using an adaptive observer

    International Nuclear Information System (INIS)

    New methodologies and concepts are developed in control theory to meet the ever-increasing demands of industrial applications. Fault detection and diagnosis of technical processes have become important in the course of progressive automation in the operation of groups of electric drives. When a group of electric drives is in operation, fault tolerant control becomes complicated. For multiple motors in operation, fault detection and diagnosis might prove to be difficult. Estimation of all states and parameters of all drives is necessary to analyze the actuator and sensor faults. To maintain system reliability, detection and isolation of failures should be performed quickly and accurately, and hardware should be properly integrated. A Luenberger full-order observer can be used to estimate all states in the system for the detection of actuator and sensor failures. Because the Luenberger observer does not account for system parameter variations, state estimation becomes inaccurate under the varying parameter conditions of the drives. Consequently, the estimation performance deteriorates, making ordinary state observers unsuitable for fault detection. Therefore an adaptive observer, which can estimate the system states and parameters and detect the faults simultaneously, is designed in our paper. For a group of DC drives, some of the drives may exhibit parameter variations while others may not, depending on load torque, friction, etc. So, estimation of all states and parameters of all drives is carried out using an adaptive observer. If there is any deviation from the estimated values, it is understood that a fault has occurred; the nature of the fault, whether sensor fault or actuator fault, is determined by a neural fuzzy network, and the fault tolerant control is reconfigured.
    Experimental results with the neuro-fuzzy system using adaptive observer-based fault tolerant control are good and confirm the merits of the proposed approach.
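
    The observer-based detection idea in this abstract can be sketched with a fixed-gain, scalar Luenberger observer whose output residual is compared against a threshold. This is only a minimal illustration: it is not the adaptive observer or the neuro-fuzzy classifier of the paper, and the first-order plant, the gain, the threshold, and all names are assumptions chosen for the example.

```python
from typing import List, Optional


def run_observer(a: float = 0.9, b: float = 0.1, gain: float = 0.5,
                 steps: int = 60, fault_step: int = 30,
                 sensor_bias: float = 1.0) -> List[float]:
    """Simulate x[k+1] = a*x[k] + b*u[k] with a Luenberger observer.

    A constant sensor bias is added to the measurement from `fault_step` on.
    Returns the residual r[k] = y[k] - x_hat[k] at every step.
    """
    x = 0.0       # true state (hypothetical drive speed)
    x_hat = 0.0   # observer estimate, initialised at the true state
    residuals: List[float] = []
    for k in range(steps):
        u = 1.0  # constant input
        fault = sensor_bias if k >= fault_step else 0.0
        y = x + fault                 # measured output, possibly biased
        r = y - x_hat                 # output residual
        residuals.append(r)
        x_hat = a * x_hat + b * u + gain * r   # observer update
        x = a * x + b * u                      # plant update
    return residuals


def first_alarm(residuals: List[float], threshold: float = 0.1) -> Optional[int]:
    """Index of the first residual whose magnitude exceeds the threshold."""
    for k, r in enumerate(residuals):
        if abs(r) > threshold:
            return k
    return None


if __name__ == "__main__":
    res = run_observer()
    print("fault flagged at step", first_alarm(res))
```

    With a matched model the residual stays at zero until the bias appears. When plant parameters drift, a fixed-gain observer produces nonzero residuals even without a fault, which is the limitation that motivates the adaptive observer described above.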

  20. Coordinated Fault-Tolerance for High-Performance Computing Final Project Report

    Energy Technology Data Exchange (ETDEWEB)

    Panda, Dhabaleswar Kumar [The Ohio State University; Beckman, Pete

    2011-07-28

    With the Coordinated Infrastructure for Fault Tolerance Systems (CIFTS, as the original project came to be called) project, our aim has been to understand and tackle the following broad research questions, the answers to which will help the HEC community analyze and shape the direction of research in the field of fault tolerance and resiliency on future high-end leadership systems. Will availability of global fault information, obtained by fault information exchange between the different HEC software on a system, allow individual system software to better detect, diagnose, and adaptively respond to faults? If fault-awareness is raised throughout the system through fault information exchange, is it possible to get all system software working together to provide a more comprehensive end-to-end fault management on the system? What are the missing fault-tolerance features that widely used HEC system software lacks today that would inhibit such software from taking advantage of systemwide global fault information? What are the practical limitations of a systemwide approach for end-to-end fault management based on fault awareness and coordination? What mechanisms, tools, and technologies are needed to bring about fault awareness and coordination of responses on a leadership-class system? What standards, outreach, and community interaction are needed for adoption of the concept of fault awareness and coordination for fault management on future systems? Keeping our overall objectives in mind, the CIFTS team has taken a parallel fourfold approach. Our central goal was to design and implement a light-weight, scalable infrastructure with a simple, standardized interface to allow communication of fault-related information through the system and facilitate coordinated responses. 
    This work led to the development of the Fault Tolerance Backplane (FTB) publish-subscribe API specification, together with a reference implementation and several experimental implementations on top of existing publish-subscribe tools. We enhanced the intrinsic fault tolerance capabilities of representative implementations of a variety of key HPC software subsystems and integrated them with the FTB. Targeted software subsystems included MPI communication libraries, checkpoint/restart libraries, resource managers and job schedulers, and system monitoring tools. Leveraging the aforementioned infrastructure, as well as developing and utilizing additional tools, we have examined issues associated with expanded, end-to-end fault response from both system and application viewpoints. From the standpoint of system operations, we have investigated log and root cause analysis, anomaly detection and fault prediction, and generalized notification mechanisms. Our applications work has included libraries for fault-tolerant linear algebra, application frameworks for coupled multiphysics applications, and external frameworks to support the monitoring and response for general applications. Our final goal was to engage the high-end computing community to increase awareness of tools and issues around coordinated end-to-end fault management.
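
    The publish-subscribe pattern behind a fault backplane can be sketched in a few lines. The class, method, and topic names below are hypothetical and do not reproduce the FTB API specification; the sketch is in-process and synchronous, whereas a real backplane would span many nodes.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, Dict, List

FaultEvent = Dict[str, str]
Handler = Callable[[FaultEvent], None]


class FaultBackplane:
    """Toy in-process publish-subscribe bus for fault events (illustrative only)."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        """Register a handler that is called for every event on `topic`."""
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, event: FaultEvent) -> int:
        """Deliver `event` to all handlers of `topic`; return how many were called."""
        handlers = list(self._subscribers.get(topic, []))
        for handler in handlers:
            handler(event)
        return len(handlers)


if __name__ == "__main__":
    bus = FaultBackplane()
    # Hypothetical subscribers: a job scheduler and a checkpoint library.
    bus.subscribe("node.failure", lambda e: print("scheduler: drain", e["node"]))
    bus.subscribe("node.failure", lambda e: print("checkpoint: save state near", e["node"]))
    bus.publish("node.failure", {"node": "n17", "source": "system-monitor"})
```

    The design point is that the monitoring tool does not need to know which subsystems react to a fault; each subsystem registers its own response, which is the kind of coordination the report describes.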

  1. A Robust Byzantine Fault-Tolerant Replication Technique for Peer-to-Peer Content Distribution

    Directory of Open Access Journals (Sweden)

    Ayyasamy Sellappan

    2011-01-01

    Problem statement: In peer-to-peer networks, Byzantine fault tolerance refers to the capability of a system to tolerate Byzantine faults. It can be achieved by replicating the server and by ensuring that all server replicas reach an agreement on the input despite Byzantine faulty replicas and clients. Since malicious attacks and software errors can cause faulty nodes to exhibit Byzantine behavior, Byzantine-fault-tolerant algorithms are increasingly important. Approach: In this study, we develop a robust Byzantine Fault-Tolerance Replication (BFTR) technique for peer-to-peer content distribution systems which includes fault detection and fault recovery. It is based on collaborative monitoring of each node to detect the occurrence of a fault. We previously proposed a QoS-based overlay network architecture (QIRM) involving an intelligent replica placement algorithm to improve the network utilization of the P2P system. Results: By simulation results, we show that the proposed technique involves less overhead and recovery time with increased accuracy. Conclusion/Recommendations: The result obtained is that the BFTR technique is much more efficient than QIRM with respect to packet drop ratio, average end-to-end delay, throughput and overhead.
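
    How replicated replies can mask Byzantine replicas is illustrated below with the classical voting rule used by clients of Byzantine-fault-tolerant replication: with at most f faulty replicas, a value reported by f + 1 replicas must come from at least one correct replica, and agreement protocols of this family need at least 3f + 1 replicas. This is a generic sketch with hypothetical function names, not the BFTR technique of the paper.

```python
from collections import Counter
from typing import Hashable, Optional, Sequence


def min_replicas(f: int) -> int:
    """Classical lower bound on replicas needed to tolerate f Byzantine faults."""
    if f < 0:
        raise ValueError("f must be non-negative")
    return 3 * f + 1


def accept_reply(replies: Sequence[Hashable], f: int) -> Optional[Hashable]:
    """Return a value reported by at least f + 1 replicas, or None.

    Assumes at most f replicas are faulty, so f + 1 matching replies
    include at least one reply from a correct replica.
    """
    if not replies:
        return None
    value, count = Counter(replies).most_common(1)[0]
    return value if count >= f + 1 else None


if __name__ == "__main__":
    f = 1
    print(min_replicas(f))
    # Three correct replicas return the same chunk hash; one replica lies.
    print(accept_reply(["hash-A", "hash-A", "hash-B", "hash-A"], f))
```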

  2. Adaptive Fault Tolerance for Many-Core Based Space-Borne Computing

    Science.gov (United States)

    James, Mark; Springer, Paul; Zima, Hans

    2010-01-01

    This paper describes an approach to providing software fault tolerance for future deep-space robotic NASA missions, which will require a high degree of autonomy supported by an enhanced on-board computational capability. Such systems have become possible as a result of the emerging many-core technology, which is expected to offer 1024-core chips by 2015. We discuss the challenges and opportunities of this new technology, focusing on introspection-based adaptive fault tolerance that takes into account the specific requirements of applications, guided by a fault model. Introspection supports runtime monitoring of the program execution with the goal of identifying, locating, and analyzing errors. Fault tolerance assertions for the introspection system can be provided by the user, domain-specific knowledge, or via the results of static or dynamic program analysis. This work is part of an on-going project at the Jet Propulsion Laboratory in Pasadena, California.

  3. Fault-tolerant computer architecture based on INMOS transputer processor

    Science.gov (United States)

    Ortiz, Jorge L.

    1987-01-01

    Redundant processing has been used for several years in mission flight systems. In these systems, more than one processor performs the same task at the same time but only one processor is actually in real use. A fault-tolerant computer architecture based on the features provided by INMOS Transputers is presented. The Transputer architecture provides several communication links that allow data and command communication with other Transputers without the use of a bus. Additionally, the Transputer allows the use of parallel processing to increase the system speed considerably. The processor architecture consists of three processors working in parallel, keeping all the processors at the same operational level, but only one processor is in real control of the process. The design allows each Transputer to perform a test on the other two Transputers and report the operating condition of the neighboring processors. A graphic display was developed to facilitate the identification of any problem by the user.

  4. MCNP load balancing and fault tolerance with PVM

    International Nuclear Information System (INIS)

    Version 4A of the Monte Carlo neutron, photon, and electron transport code MCNP, developed by LANL (Los Alamos National Laboratory), supports distributed-memory multiprocessing through the software package PVM (Parallel Virtual Machine, version 3.1.4). Using PVM for interprocessor communication, MCNP can simultaneously execute a single problem on a cluster of UNIX-based workstations. This capability provided system efficiencies that exceeded 80% on dedicated workstation clusters; however, on heterogeneous or multiuser systems, the performance was limited by the slowest processor (i.e., equal work was assigned to each processor). The next public release of MCNP will provide multiprocessing enhancements that include load balancing and fault tolerance, which are shown to dramatically increase multiuser system efficiency and reliability.

  5. MCNP load balancing and fault tolerance with PVM

    International Nuclear Information System (INIS)

    Version 4A of the Monte Carlo neutron, photon, and electron transport code MCNP developed by Los Alamos National Laboratory supports distributed-memory multiprocessing through the parallel virtual machine (PVM) software package, version 3.1.4. Using PVM for interprocessor communication, MCNP can simultaneously execute a single problem on a cluster of UNIX-based workstations. This capability provided system efficiencies that exceed 80% on dedicated workstation clusters; however, on heterogeneous or multiuser systems, the performance was limited by the slowest processor (i.e., equal work was assigned to each processor). The next public release of MCNP will provide multiprocessing enhancements that include load balancing and fault tolerance, which are shown to dramatically increase multiuser system efficiency and reliability

  6. An Accurate and Fault-Tolerant Target Positioning System for Buildings Using Laser Rangefinders and Low-Cost MEMS-Based MARG Sensors

    Science.gov (United States)

    Zhao, Lin; Guan, Dongxue; Landry, René Jr.; Cheng, Jianhua; Sydorenko, Kostyantyn

    2015-01-01

    Target positioning systems based on MEMS gyros and laser rangefinders (LRs) have extensive prospects due to their advantages of low cost, small size and easy realization. The target positioning accuracy is mainly determined by the attitude of the LRs derived from the gyros. However, the attitude error is large due to the inherent noise of isolated MEMS gyros. In this paper, both accelerometer/magnetometer and LR attitude aiding systems are introduced to aid the MEMS gyros. A no-reset Federated Kalman Filter (FKF) is employed, which consists of two local Kalman Filters (KFs) and a Master Filter (MF). The local KFs are designed by using the Direction Cosine Matrix (DCM)-based dynamic equations and the measurements from the two aiding systems. The KFs can estimate the attitude simultaneously to limit the attitude errors resulting from the gyros. Then, the MF fuses the redundant attitude estimates to yield globally optimal estimates. Simulation and experimental results demonstrate that the FKF-based system can improve the target positioning accuracy effectively and allow for good fault-tolerant capability. PMID:26512672
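
    The fusion step performed by a master filter can be sketched for the scalar case as an inverse-variance weighted average of the local estimates. This is a simplified illustration that assumes independent local estimation errors; it omits the information-sharing factors of a full federated filter and is not the DCM-based filter of the paper. The function name and the numbers are hypothetical.

```python
from typing import Sequence, Tuple


def fuse(estimates: Sequence[Tuple[float, float]]) -> Tuple[float, float]:
    """Fuse scalar (estimate, variance) pairs by inverse-variance weighting.

    Returns the fused estimate and its variance. Assumes the local
    estimation errors are mutually independent.
    """
    if not estimates:
        raise ValueError("at least one local estimate is required")
    if any(p <= 0 for _, p in estimates):
        raise ValueError("variances must be positive")
    information = sum(1.0 / p for _, p in estimates)
    fused = sum(x / p for x, p in estimates) / information
    return fused, 1.0 / information


if __name__ == "__main__":
    # Hypothetical heading estimates (degrees) from two local filters.
    accel_mag = (10.0, 4.0)
    rangefinder = (12.0, 4.0)
    print(fuse([accel_mag, rangefinder]))
    # If one aiding system is flagged as faulty, fuse only the healthy one.
    print(fuse([accel_mag]))
```

    Because each local filter produces a complete estimate on its own, a local filter that is diagnosed as faulty can simply be left out of the fusion, which is one way such an architecture provides fault tolerance.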

  7. FPGA-Based, Self-Checking, Fault-Tolerant Computers

    Science.gov (United States)

    Some, Raphael; Rennels, David

    2004-01-01

    A proposed computer architecture would exploit the capabilities of commercially available field-programmable gate arrays (FPGAs) to enable computers to detect and recover from bit errors. The main purpose of the proposed architecture is to enable fault-tolerant computing in the presence of single-event upsets (SEUs). [An SEU is a spurious bit flip (also called a soft error) caused by a single impact of ionizing radiation.] The architecture would also enable recovery from some soft errors caused by electrical transients and, to some extent, from intermittent and permanent (hard) errors caused by aging of electronic components. A typical FPGA of the current generation contains one or more complete processor cores, memories, and high-speed serial input/output (I/O) channels, making it possible to shrink a board-level processor node to a single integrated-circuit chip. Custom, highly efficient microcontrollers, general-purpose computers, custom I/O processors, and signal processors can be rapidly and efficiently implemented by use of FPGAs. Unfortunately, FPGAs are susceptible to SEUs. Prior efforts to mitigate the effects of SEUs have yielded solutions that degrade performance of the system and require support from external hardware and software. In comparison with other fault-tolerant-computing architectures (e.g., triple modular redundancy), the proposed architecture could be implemented with less circuitry and lower power demand. Moreover, the fault-tolerant computing functions would require only minimal support from circuitry outside the central processing units (CPUs) of computers, would not require any software support, and would be largely transparent to software and to other computer hardware. There would be two types of modules: a self-checking processor module and a memory system (see figure). The self-checking processor module would be implemented on a single FPGA and would be capable of detecting its own internal errors.
    It would contain two CPUs executing identical programs in lock step, with comparison of their outputs to detect errors. It would also contain various cache local memory circuits, communication circuits, and configurable special-purpose processors that would use self-checking checkers. (The basic principle of the self-checking checker method is to utilize logic circuitry that generates error signals whenever there is an error in either the checker or the circuit being checked.) The memory system would comprise a main memory and a hardware-controlled check-pointing system (CPS) based on a buffer memory denoted the recovery cache. The main memory would contain random-access memory (RAM) chips and FPGAs that would, in addition to everything else, implement double-error-detecting and single-error-correcting memory functions to enable recovery from single-bit errors.
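
    Single-error-correcting, double-error-detecting (SEC-DED) memory protection can be illustrated with an extended Hamming(8,4) code. The source does not say which code the proposed memory system would use, so the code choice, the 4-bit word size, and the function names below are assumptions made for the example.

```python
from typing import List, Optional, Sequence, Tuple


def encode(data: Sequence[int]) -> List[int]:
    """Encode 4 data bits into an 8-bit extended Hamming codeword.

    Index 0 holds the overall parity bit; indices 1..7 are Hamming positions
    (parity bits at 1, 2, 4 and data bits at 3, 5, 6, 7).
    """
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4
    p2 = d1 ^ d3 ^ d4
    p4 = d2 ^ d3 ^ d4
    word = [p1, p2, d1, p4, d2, d3, d4]
    overall = 0
    for bit in word:
        overall ^= bit
    return [overall] + word


def decode(code: Sequence[int]) -> Tuple[Optional[List[int]], str]:
    """Return (data bits, status); status is 'ok', 'corrected' or 'double-error'."""
    bits = list(code)
    syndrome = 0
    for position in range(1, 8):
        if bits[position]:
            syndrome ^= position   # XOR of the positions of all set bits
    overall = 0
    for bit in bits:
        overall ^= bit
    if overall == 1:
        # Odd number of flips: assume one. Syndrome 0 means the overall
        # parity bit itself (index 0) was hit.
        bits[syndrome] ^= 1
        status = "corrected"
    elif syndrome != 0:
        return None, "double-error"   # even parity but nonzero syndrome
    else:
        status = "ok"
    return [bits[3], bits[5], bits[6], bits[7]], status


if __name__ == "__main__":
    word = encode([1, 0, 1, 1])
    word[5] ^= 1  # inject a single bit flip
    print(decode(word))
```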

  8. A Universal Operator Theoretic Framework for Quantum Fault Tolerance

    CERN Document Server

    Gilbert, Gerald; Weinstein, Yaakov S; Aggarwal, Vaneet; Calderbank, A Robert

    2007-01-01

    In this paper we introduce a universal operator theoretic framework for quantum fault tolerance. This incorporates a top-down approach that implements a system-level criterion based on specification of the full system dynamics, applied at every level of error correction concatenation. This leads to more accurate determinations of error thresholds than could previously be obtained. The basis for the approach is the Quantum Computer Condition (QCC), an inequality governing the evolution of a quantum computer. In addition to more accurate determination of error threshold values, we show that the QCC provides a means to systematically determine optimality (or non-optimality) of different choices of error correction coding and error avoidance strategies. This is possible because, as we show, all known coding schemes are actually special cases of the QCC. We demonstrate this by introducing a new, operator theoretic form of entanglement assisted quantum error correction, which incorporates as special cases all known...

  9. Fault tolerant vector control of induction motor drive

    Science.gov (United States)

    Odnokopylov, G.; Bragin, A.

    2014-10-01

    For electric drives of technical facilities in hazardous industries, such as nuclear, military, and chemical plants, increasing fault tolerance and survivability is an urgent task. The paper presents the construction principle of a fault-tolerant vector control system for an asynchronous electric drive. It shows how the operability of a three-phase induction motor drive is recovered in emergency mode using a two-phase vector control system, and describes the formation of a simulation model of the asynchronous electric drive under unbalance in emergency mode. The model uses a coordinate transformation that supports emergency operation of the unbalanced drive. Modeling results are given for the transient following the loss of a stator phase. During a phase power failure, the induction motor cannot save the circular rotating field in the air gap of the motor and ensure the restoration of its efficiency at rated torque and speed.

  10. Fault tolerant strategies for automated operation of nuclear reactors

    International Nuclear Information System (INIS)

    This paper introduces an automatic control system incorporating a number of verification, validation, and command generation tasks within a fault-tolerant architecture. The integrated system utilizes recent methods of artificial intelligence such as neural networks and fuzzy logic control. Furthermore, advanced signal processing and nonlinear control methods are also included in the design. The primary goal is to create an on-line capability to validate signals, analyze plant performance, and verify the consistency of commands before control decisions are finalized. The application of this approach to the automated startup of the Experimental Breeder Reactor-II (EBR-II) is performed using a validated nonlinear model. The simulation results show that the advanced concepts have the potential to improve plant availability and safety.

  11. The Kaleidoscope switch-a new concept for implementation of a large and fault tolerant ATM switch system

    DEFF Research Database (Denmark)

    Dittmann, Lars

    This paper describes a new concept for implementing a large switch network based on smaller modules. The concept is based on an alternative self-routing structure that, due to a point symmetry, allows the bits in the routing tag to be processed in a random order. Among other things, this property provides ...... inherent fault protection and allows a simple implementation of broadcast and multicast. The concept has been implemented as a small prototype that is currently used in a national experimental ATM network in Denmark...

  12. Fault tolerance in space-based digital signal processing and switching systems: Protecting up-link processing resources, demultiplexer, demodulator, and decoder

    Science.gov (United States)

    Redinbo, Robert

    1994-01-01

    Fault tolerance features in the first three major subsystems appearing in the next generation of communications satellites are described. These satellites will contain extensive but efficient high-speed processing and switching capabilities to support the low signal strengths associated with very small aperture terminals. The terminals' numerous data channels are combined through frequency division multiplexing (FDM) on the up-links and are protected individually by forward error-correcting (FEC) binary convolutional codes. The front-end processing resources, demultiplexer, demodulators, and FEC decoders extract all data channels which are then switched individually, multiplexed, and remodulated before retransmission to earth terminals through narrow beam spot antennas. Algorithm based fault tolerance (ABFT) techniques, which relate real number parity values with data flows and operations, are used to protect the data processing operations. The additional checking features utilize resources that can be substituted for normal processing elements when resource reconfiguration is required to replace a failed unit.

  13. Fault tolerance in space-based digital signal processing and switching systems: Protecting up-link processing resources, demultiplexer, demodulator, and decoder

    Science.gov (United States)

    Redinbo, Robert

    1994-09-01

    Fault tolerance features in the first three major subsystems appearing in the next generation of communications satellites are described. These satellites will contain extensive but efficient high-speed processing and switching capabilities to support the low signal strengths associated with very small aperture terminals. The terminals' numerous data channels are combined through frequency division multiplexing (FDM) on the up-links and are protected individually by forward error-correcting (FEC) binary convolutional codes. The front-end processing resources, demultiplexer, demodulators, and FEC decoders extract all data channels which are then switched individually, multiplexed, and remodulated before retransmission to earth terminals through narrow beam spot antennas. Algorithm based fault tolerance (ABFT) techniques, which relate real number parity values with data flows and operations, are used to protect the data processing operations. The additional checking features utilize resources that can be substituted for normal processing elements when resource reconfiguration is required to replace a failed unit.

  14. Fault-tolerant three-level inverter

    Science.gov (United States)

    Edwards, John; Xu, Longya; Bhargava, Brij B.

    2006-12-05

    A method for driving a neutral point clamped three-level inverter is provided. In one exemplary embodiment, DC current is received at a neutral point-clamped three-level inverter. The inverter has a plurality of nodes including first, second and third output nodes. The inverter also has a plurality of switches. Faults are checked for in the inverter and predetermined switches are automatically activated responsive to a detected fault such that three-phase electrical power is provided at the output nodes.

  15. Production of Reliable Flight Crucial Software: Validation Methods Research for Fault Tolerant Avionics and Control Systems Sub-Working Group Meeting

    Science.gov (United States)

    Dunham, J. R. (Editor); Knight, J. C. (Editor)

    1982-01-01

    The state of the art in the production of crucial software for flight control applications was addressed. The association between reliability metrics and software is considered. Thirteen software development projects are discussed. A short term need for research in the areas of tool development and software fault tolerance was indicated. For the long term, research in formal verification or proof methods was recommended. Formal specification and software reliability modeling were recommended as topics for both short and long term research.

  16. H∞ Fault Tolerant Control of WECS Based on the PWA Model

    OpenAIRE

    Yun-Tao Shi; Qi Kou; De-Hui Sun; Zheng-Xi Li; Shu-Juan Qiao; Yan-Jiao Hou

    2014-01-01

    The main contribution of this paper is the development of H∞ fault tolerant control for a wind energy conversion system (WECS) based on the stochastic piecewise affine (PWA) model. In this paper the normal and fault stochastic PWA models for WECS including multiple working points at different wind speeds are established. A reliable piecewise linear quadratic regulator state feedback is designed for the fault tolerant actuator and sensor. A sufficient condition for the existence of the passive...

  17. A Fault Tolerant Resource Allocation Architecture for Mobile Grid

    Directory of Open Access Journals (Sweden)

    P. T. Vanathi

    2012-01-01

    Problem statement: To achieve a high level of reliability and availability, the grid infrastructure should be fault tolerant. Since the failure of resources affects job execution fatally, a fault tolerance service is essential to satisfy QoS requirements in grid computing with respect to mobile nodes. Approach: We propose a fault tolerant technique for improving reliability in a mobile grid environment, taking node mobility into account. The cluster head and monitoring agent were designed to address both resource and network failures and to provide recovery techniques for overcoming the faults. Results: The proposed model achieves an identifiable performance gain when compared to the previous model (HRAA). Using simulation results, we analyze the effect of node and link failures on parameters such as delivery ratio, throughput and delay against the rate of success. Conclusion: The proposed fault tolerant approach checks for the availability of the nodes with the least work load for transferring the executed job to the cluster head, providing an alternate path in case of failure and thereby enhancing the reliability of the grid environment.

  18. Single-Shot Fault-Tolerant Quantum Error Correction

    Science.gov (United States)

    Bombín, Héctor

    2015-07-01

    Conventional quantum error correcting codes require multiple rounds of measurements to detect errors with enough confidence in fault-tolerant scenarios. Here, I show that for suitable topological codes, a single round of local measurements is enough. This feature is generic and is related to self-correction and confinement phenomena in the corresponding quantum Hamiltonian model. Three-dimensional gauge color codes exhibit this single-shot feature, which also applies to initialization and gauge fixing. Assuming the time for efficient classical computations to be negligible, this yields a topological fault-tolerant quantum computing scheme where all elementary logical operations can be performed in constant time.

  19. Fault tolerant homopolar magnetic bearings with flux invariant control

    International Nuclear Information System (INIS)

    The theory for a novel fault-tolerant 4-active-pole homopolar magnetic bearing is developed. If any one of the four coils in the bearing actuator fails, the remaining three coil currents change via an optimal distribution matrix such that the same opposing pole, C-core type, control fluxes as those of the un-failed bearing are produced. The homopolar magnetic bearing thus provides unaltered magnetic forces without any loss of the bearing load capacity even if any one coil suddenly fails. Numerical examples are provided to illustrate the novel fault-tolerant, 4-active-pole homopolar magnetic bearings.

  20. Fault Tolerant, Radiation Hard DSP Project

    Data.gov (United States)

    National Aeronautics and Space Administration — We propose to develop a radiation tolerant/hardened signal processing node, which effectively utilizes state-of-the-art commercial semiconductors plus our...

  1. Energy Bounds for Fault-Tolerant Nanoscale Designs

    CERN Document Server

    Marculescu, Diana

    2011-01-01

    The problem of determining lower bounds for the energy cost of a given nanoscale design is addressed via a complexity theory-based approach. This paper provides a theoretical framework that is able to assess the trade-offs existing in nanoscale designs between the amount of redundancy needed for a given level of resilience to errors and the associated energy cost. Circuit size, logic depth and error resilience are analyzed and brought together in a theoretical framework that can be seamlessly integrated with automated synthesis tools and can guide the design process of nanoscale systems comprised of failure prone devices. The impact of redundancy addition on the switching energy and its relationship with leakage energy is modeled in detail. Results show that 99% error resilience is possible for fault-tolerant designs, but at the expense of at least 40% more energy if individual gates fail independently with probability of 1%.

  2. Fault tolerant LPV control of the GTM UAV with dynamic control allocation

    OpenAIRE

    Vanek, Bálint; Péni, Tamás; Szabó, Zoltán; Bokor, József

    2014-01-01

    The aim of the paper is to present a dynamic control allocation architecture for the design and development of reconfigurable and fault-tolerant control systems in aerial vehicles. The baseline control system is designed for the nominal dynamics of the aircraft, while faults and actuator saturation limits are handled by the dynamic control allocation scheme. Coordination of these components is provided by a supervisor which re-allocates control authority based on health information, flight en...

  3. Analysis of GPS Abnormal Conditions within Fault Tolerant Control Laws

    Science.gov (United States)

    Al-Sinbol, Gahssan

    The Global Positioning System (GPS) is a critical element for the functionality of autonomous flying vehicles. GPS operation under normal and abnormal conditions directly impacts the trajectory tracking performance of the controllers of autonomous Unmanned Aerial Vehicles (UAVs). The effects of GPS parameter variation must be well understood, and user-friendly computational tools must be developed to facilitate the design and evaluation of fault tolerant control laws. This thesis presents the development of a simplified GPS error model in Matlab/Simulink and its use in a sensitivity analysis of the effect of GPS parameters, under normal and abnormal system operation, on different UAV trajectory tracking controllers. The model statistically generates position and velocity errors, simulates the effect of GPS satellite configuration on the position and velocity measurement accuracy, and applies a set of failures to the GPS readings. The model and its graphical user interface were integrated within the WVU UAV simulation environment as a masked Simulink block. The effects of the following GPS parameters on the controllers' trajectory tracking performance were investigated within and outside normal operation ranges: time delay, update rate, error standard deviation, bias, and major position and velocity failures. Several sets of control laws with fixed and adaptive parameters and of different levels of complexity have been used in this investigation. A complex performance index formulated in terms of tracking errors and control activity was used for evaluating control law performance. The various metrics within the performance index were combined using fixed and variable weights depending on the local characteristics of the commanded trajectory. This study has revealed that GPS error parameters have a significant impact on control law performance. The proposed GPS model has proved to be a valuable, flexible tool for testing and evaluation of the fault tolerant capabilities of autonomous flight control laws.
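
    The error parameters listed above (time delay, update rate, error standard deviation, bias) can be illustrated with a minimal one-axis sketch. This is not the thesis's Matlab/Simulink model; the function name, argument names and the sample-and-hold behavior are assumptions made only for illustration.

```python
import random

def gps_readings(truth, dt, *, update_period, delay, bias, sigma, seed=0):
    """Degrade a list of true positions (sampled every dt seconds).

    Hypothetical model: the receiver refreshes its output every
    update_period seconds, reports the position from `delay` seconds ago,
    adds a constant bias and zero-mean Gaussian noise, and holds the
    last value between refreshes.
    """
    rng = random.Random(seed)
    hold_steps = max(1, round(update_period / dt))   # samples between refreshes
    delay_steps = max(0, round(delay / dt))          # samples of latency
    out, current = [], None
    for k in range(len(truth)):
        if k % hold_steps == 0:
            source = max(0, k - delay_steps)         # delayed true sample
            current = truth[source] + bias + rng.gauss(0.0, sigma)
        out.append(current)
    return out

# Noise-free example so the delay, hold and bias effects are easy to see.
truth = list(range(10))
print(gps_readings(truth, 0.1, update_period=0.5, delay=0.2, bias=2, sigma=0.0))
```

    Sweeping one argument at a time while keeping the others fixed gives a crude version of the sensitivity analysis the abstract describes.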

  4. RSFTS: RULE-BASED SEMANTIC FAULT TOLERANT SCHEDULING FOR CLOUD ENVIRONMENT

    Directory of Open Access Journals (Sweden)

    Pandeeswari R

    2013-02-01

    Cloud computing has emerged as one of the latest technologies for delivering sophisticated on-demand services over the Internet. To make effective use of the tremendous capabilities of the cloud, efficient scheduling algorithms are required. In large-scale systems, fault tolerance is a critical issue, since cloud resources are widely distributed among diverse locations. This leads to a higher probability of failures while solving huge problems, so cloud service reliability could be relatively low. Therefore, providing an effective fault tolerance technique for a cloud system is mandatory. This paper introduces an efficient and reliable Rule-based Semantic Fault Tolerant Scheduling (RSFTS) technique for the cloud environment. The overall system is described semantically to assign resources based on a set of semantic rules. The proposed technique could achieve maximum reliability, availability and high efficiency.

  5. Active fault-tolerant control strategy of large civil aircraft under elevator failures

    Directory of Open Access Journals (Sweden)

    Wang Xingjian

    2015-12-01

    Longitudinal control is the most important actuation system of an aircraft, and its failure would lead to a catastrophic accident. This paper proposes an active fault-tolerant control (AFTC) strategy for civil aircraft with different numbers of faulty elevators. In order to improve the performance of the fault-tolerant flight control system and the effective utilization of the control surfaces, the trimmable horizontal stabilizer (THS) is considered to generate extra pitch moment. A suitable switching mechanism with a performance improvement coefficient is proposed to determine when it is worthwhile to utilize the THS. Furthermore, the AFTC strategy is detailed using a model following technique and the proposed THS switching mechanism. The basic fault-tolerant controller is designed to guarantee longitudinal control system stability and acceptable performance degradation under partial elevator failure. The proposed AFTC is applied to a Boeing 747-200 numerical model, and simulation results validate the effectiveness of the proposed AFTC approach.

  6. FAULT TOLERANCE USING CREDENTIALS MANAGEMENT IN ONLINE TRANSACTION APPLICATION

    Directory of Open Access Journals (Sweden)

    L. Javid Ali

    2014-07-01

    Web applications play a vital role in the IT field in satisfying the web customer. The customer always depends on the online transaction processing system. A web application has various forms which together give a complete service to the customer. These forms have options that are used to satisfy the customer's needs, given the competition among web sites in the global market. Traditional web pages are closed from the current session whenever the customer selects an improper option, because of the single sign-on property. Selecting a wrong option that is not suitable for the current session leads to a reliability problem. If the same user needs the same service again, he has to navigate from the home page to the required page, which puts an extra burden on the customer. The customer session should be maintained properly, so that the customer's satisfaction with the online web application is retained. The existing system classifies users by their access level and also their fault level. The main objective of the proposed work is to manage the credential at all levels in order to keep the valuable customer in the current session for a long time. Credential management and session management are used to manage a multilevel credential from the web client to the web resource level and vice versa. The options selected by the customer can be classified based on the fault and type of access. Credential management also performs the maintenance process for fixing the fault tolerance level of the web user. A complete log is recorded to trace the overall process in the online transaction processing.

  7. Robust Fault-Tolerant Control for Satellite Attitude Stabilization Based on Active Disturbance Rejection Approach with Artificial Bee Colony Algorithm

    OpenAIRE

    Fei Song; Shiyin Qin

    2014-01-01

    This paper proposed a robust fault-tolerant control algorithm for satellite stabilization based on active disturbance rejection approach with artificial bee colony algorithm. The actuating mechanism of attitude control system consists of three working reaction flywheels and one spare reaction flywheel. The speed measurement of reaction flywheel is adopted for fault detection. If any reaction flywheel fault is detected, the corresponding fault flywheel is isolated and the spare reaction flywhe...

  8. A New and Efficient Algorithm-Based Fault Tolerance Scheme for A Million Way Parallelism

    CERN Document Server

    Yao, Erlin; Wang, Rui; Zhang, Wenli; Tan, Guangming

    2011-01-01

    Fault tolerance overhead of high performance computing (HPC) applications is becoming critical to the efficient utilization of HPC systems at large scale. HPC applications typically tolerate fail-stop failures by checkpointing. Another promising method works at the algorithm level and is called algorithmic recovery. These two methods can achieve high efficiency when the system scale is not very large, but both will lose their effectiveness when systems approach the scale of Exaflops, where the number of processors included in the system is expected to reach one million. This paper develops a new and efficient algorithm-based fault tolerance scheme for HPC applications. When a failure occurs during the execution, we do not stop to wait for the recovery of corrupted data, but replace them with the corresponding redundant data and continue the execution. A background accelerated recovery method is also proposed to rebuild redundancy so that multiple failures can be tolerated during the execution. To demonstrate the feasibility ...

  9. Critique of Fault-Tolerant Quantum Information Processing

    OpenAIRE

    Alicki, Robert

    2013-01-01

    This is a chapter in the book Quantum Error Correction, edited by D. A. Lidar and T. A. Brun and published by Cambridge University Press (2013) (http://www.cambridge.org/us/academic/subjects/physics/quantum-physics-quantum-information-and-quantum-computation/quantum-error-correction), presenting the author's view on the feasibility of fault-tolerant quantum information processing.

  10. Microprocessor-based fault-tolerant nuclear turbine governor

    International Nuclear Information System (INIS)

    A new microprocessor-based fault-tolerant nuclear turbine governor has been developed. A hierarchically distributed configuration and an asynchronous triplicated architecture with middle value voting logic maximize plant availability. A problem-oriented language is provided for design ease and program maintainability. The turbine governor with these features is described together with test results.
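
    Middle value voting over three redundant channels can be sketched in a few lines. This is a generic illustration of the voting idea, not the governor's actual logic; the function name and the sample values are assumptions.

```python
def mid_value_vote(a: float, b: float, c: float) -> float:
    """Return the middle of three redundant channel outputs.

    A single channel that fails high or low can never be the middle
    value, so one arbitrary channel fault is masked without needing
    to identify which channel failed.
    """
    return sorted((a, b, c))[1]

# Channel 2 fails high; the voted output stays with the healthy channels.
print(mid_value_vote(50.1, 1e6, 49.9))  # prints 50.1
```

    Because the voter only compares values, the three channels do not have to run in lockstep, which fits the asynchronous triplicated architecture mentioned in the abstract.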

  11. Modular Multilevel Converter Control Strategy with Fault Tolerance

    DEFF Research Database (Denmark)

    Teodorescu, Remus; Eni, Emanuel-Petre; Mathe, Laszlo; Rodriguez, Pedro

    2013-01-01

    The Modular Multilevel Converter (MMC) technology has recently emerged in VSC-HVDC applications where it demonstrated higher efficiency and fault tolerance compared to the classical 2-level topology. Due to the ability of MMC to connect to HV levels, MMC can be also used in transformerless STATCO...

  12. A Benchmark Evaluation of Fault Tolerant Wind Turbine Control Concepts

    DEFF Research Database (Denmark)

    Odgaard, Peter Fogh; Stoustrup, Jakob

    2015-01-01

    As the world’s power supply to a larger and larger degree depends on wind turbines, it is consequently and increasingly important that these are as reliable and available as possible. Modern fault tolerant control (FTC) could play a substantial part in increasing reliability of modern wind turbin...

  13. Reversible Logic Synthesis of Fault Tolerant Carry Skip BCD Adder

    CERN Document Server

    Islam, Md Saiful; 10.3329/jbas.v32i2.2431

    2010-01-01

    Reversible logic is emerging as an important research area with applications in diverse fields such as low power CMOS design, digital signal processing, cryptography, quantum computing and optical information processing. This paper presents a new 4*4 parity preserving reversible logic gate, IG. The proposed parity preserving reversible gate can be used to synthesize any arbitrary Boolean function. It makes any fault that affects no more than a single signal readily detectable at the circuit's primary outputs. It is shown that a fault tolerant reversible full adder circuit can be realized using only two IGs. The proposed fault tolerant full adder (FTFA) is used as the fundamental building block for designing other arithmetic logic circuits. It has also been demonstrated that the proposed design offers less hardware complexity and is more efficient in terms of gate count, garbage outputs and constant inputs than its existing counterparts.
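
    The two properties named above, reversibility and parity preservation, can be checked exhaustively for a small gate. The abstract does not define the IG gate, so the sketch below uses the well-known 3*3 Fredkin (controlled-swap) gate as a stand-in; the helper names are hypothetical.

```python
from itertools import product

def fredkin(a, b, c):
    """Controlled swap: when a is 1, b and c exchange places."""
    return (a, c, b) if a else (a, b, c)

def is_reversible(gate, width):
    """A gate is reversible when it maps the input set onto itself one-to-one."""
    inputs = list(product((0, 1), repeat=width))
    return sorted(gate(*x) for x in inputs) == inputs

def preserves_parity(gate, width):
    """Input and output must have the same XOR (parity) for every input."""
    return all(sum(x) % 2 == sum(gate(*x)) % 2
               for x in product((0, 1), repeat=width))

def single_fault_detected(gate, x, flipped_bit):
    """Flip one output bit and report whether the parity check notices it."""
    y = list(gate(*x))
    y[flipped_bit] ^= 1
    return sum(x) % 2 != sum(y) % 2

print(is_reversible(fredkin, 3), preserves_parity(fredkin, 3))
print(single_fault_detected(fredkin, (1, 0, 1), 2))
```

    Any single flipped signal changes the output parity, which is why a parity mismatch between inputs and outputs exposes such a fault.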

  14. Fault Tolerant Message Efficient Coordinator Election Algorithm in High Traffic Bidirectional Ring Network

    Directory of Open Access Journals (Sweden)

    Danial Rahdari

    2012-12-01

    The use of distributed systems such as the Internet and cloud computing is growing dramatically. A coordinator is crucial in these systems because processes must be coordinated and kept consistent. However, this growth makes the election algorithm even more complicated. Many algorithms have been proposed in this area, but the two best known are Bully and Ring. In this paper we propose a fault tolerant coordinator election algorithm for a typical bidirectional ring topology which is twice as fast as the Ring algorithm while passing far fewer messages during the election. A fault tolerance technique is applied which reduces the waiting time for the election to zero.
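
    For orientation, the baseline that the abstract compares against can be simulated as follows. This is a simplified unidirectional ring election in which the message carries only the highest identifier seen so far; it is not the bidirectional algorithm proposed in the paper, and the function name and message accounting are assumptions.

```python
def ring_election(ids, alive, starter):
    """Simulate one election on a unidirectional ring.

    ids[i] is the identifier of the process at ring position i and
    alive[i] says whether it is up. Crashed processes are skipped.
    Returns (coordinator_id, messages_sent).
    """
    n = len(ids)
    if not alive[starter]:
        raise ValueError("the starting process must be alive")
    # Live positions in the order the election message visits them.
    route = [(starter + k) % n for k in range(n) if alive[(starter + k) % n]]
    best = ids[starter]
    messages = 0
    # Election pass: one message per hop until it returns to the starter.
    for position in route[1:] + route[:1]:
        messages += 1
        best = max(best, ids[position])
    # Announcement pass: the result travels once more around the live ring.
    messages += len(route)
    return best, messages

# Position 3 (id 9) has crashed, so id 7 wins among the four live processes.
print(ring_election([3, 7, 5, 9, 1], [True, True, True, False, True], starter=0))
```

    With L live processes this baseline costs 2L messages and two full trips around the ring, which is the kind of cost the paper aims to reduce.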

  15. Final Project Report. Scalable fault tolerance runtime technology for petascale computers

    Energy Technology Data Exchange (ETDEWEB)

    Krishnamoorthy, Sriram [Pacific Northwest National Lab. (PNNL), Richland, WA (United States); Sadayappan, P [Ohio State Univ., Columbus, OH (United States)

    2015-06-16

    With the massive number of components comprising the forthcoming petascale computer systems, hardware failures will be routinely encountered during execution of large-scale applications. Due to the multidisciplinary, multiresolution, and multiscale nature of scientific problems that drive the demand for high end systems, applications place increasingly differing demands on the system resources: disk, network, memory, and CPU. In addition to MPI, future applications are expected to use advanced programming models such as those developed under the DARPA HPCS program as well as existing global address space programming models such as Global Arrays, UPC, and Co-Array Fortran. While there has been a considerable amount of work on fault tolerant MPI, with a number of strategies and extensions for fault tolerance proposed, virtually none of the advanced models proposed for emerging petascale systems is currently fault aware. To achieve fault tolerance, development of underlying runtime and OS technologies able to scale to the petascale level is needed. This project has evaluated a range of runtime techniques for fault tolerance for advanced programming models.

  16. Multi-fault Tolerance for Cartesian Data Distributions

    Energy Technology Data Exchange (ETDEWEB)

    Ali, Nawab; Krishnamoorthy, Sriram; Halappanavar, Mahantesh; Daily, Jeffrey A.

    2013-06-01

    Faults are expected to play an increasingly important role in how algorithms and applications are designed to run on future extreme-scale systems. Algorithm-based fault tolerance (ABFT) is a promising approach that involves modifications to the algorithm to recover from faults with lower overheads than replicated storage and a significant reduction in lost work compared to checkpoint-restart techniques. Fault-tolerant linear algebra (FTLA) algorithms employ additional processors that store parities along the dimensions of a matrix to tolerate multiple, simultaneous faults. Existing approaches assume regular data distributions (blocked or block-cyclic) with the failures of each data block being independent. To match the characteristics of failures on parallel computers, we extend these approaches to mapping parity blocks in several important ways. First, we handle parity computation for generalized Cartesian data distributions with each processor holding arbitrary subsets of blocks in a Cartesian-distributed array. Second, techniques to handle correlated failures, i.e., multiple processors that can be expected to fail together, are presented. Third, we handle the colocation of parity blocks with the data blocks and do not require them to be on additional processors. Several alternative approaches, based on graph matching, are presented that attempt to balance the memory overhead on processors while guaranteeing the same fault tolerance properties as existing approaches that assume independent failures on regular blocked data distributions. The evaluation of these algorithms demonstrates that the additional desirable properties are provided by the proposed approach with minimal overhead.
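
    The idea of storing parities along the dimensions of a matrix can be shown with a toy example. Here single integers stand in for data blocks and plain sums stand in for parity blocks; the function names are hypothetical, and the sketch does not cover the paper's graph-matching placement of parity blocks.

```python
def add_checksums(matrix):
    """Return a copy with an extra checksum column and an extra checksum row."""
    rows = [list(row) + [sum(row)] for row in matrix]
    rows.append([sum(col) for col in zip(*rows)])
    return rows

def recover_entry(encoded, i, j, along="row"):
    """Rebuild data entry (i, j) from the checksum along one dimension.

    Only the surviving entries of the chosen row or column are read,
    so the lost entry itself may hold any placeholder (for example None).
    """
    if along == "row":
        line, lost = encoded[i], j
    else:
        line, lost = [row[j] for row in encoded], i
    survivors = sum(v for k, v in enumerate(line[:-1]) if k != lost)
    return line[-1] - survivors

encoded = add_checksums([[1, 2, 3], [4, 5, 6]])
# Two simultaneous losses in the same row defeat the row checksum,
# but each lost entry is still recoverable along its column.
encoded[1][1] = None
encoded[1][2] = None
print(recover_entry(encoded, 1, 1, along="col"),
      recover_entry(encoded, 1, 2, along="col"))
```

    Keeping checksums along both dimensions is what allows several simultaneous losses to be repaired, provided each lost entry is the only loss in at least one of its lines.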

  17. Data-based fault-tolerant model predictive controller an application to a complex dearomatization process

    OpenAIRE

    Kettunen, Markus

    2010-01-01

    The tightening global competition during the last few decades has been the driving force for the optimisation of industrial plant operations through the use of advanced control methods, such as model predictive control (MPC). As the occurrence of faults in the process measurements and actuators has become more common due to the increase in the complexity of the control systems, the need for fault-tolerant control (FTC) to prevent the degradation of the controller performance, and therefore th...

  18. Combining dynamical decoupling with fault-tolerant quantum computation

    CERN Document Server

    Ng, Hui Khoon; Preskill, John

    2009-01-01

    We study how dynamical decoupling (DD) pulse sequences can improve the reliability of quantum computers. We prove upper bounds on the accuracy of DD-protected quantum gates and derive sufficient conditions for DD-protected gates to outperform unprotected gates. Under suitable conditions, fault-tolerant quantum circuits constructed from DD-protected gates can tolerate stronger noise, and have a lower overhead cost, than fault-tolerant circuits constructed from unprotected gates. Our accuracy estimates depend on the dynamics of the bath that couples to the quantum computer, and can be expressed either in terms of the operator norm of the bath's Hamiltonian or in terms of the power spectrum of bath correlations; we explain in particular how the performance of recursively generated concatenated pulse sequences can be analyzed from either viewpoint. Our results apply to Hamiltonian noise models with limited spatial correlations.

  19. Fault-tolerant Algorithms for Tick-Generation in Asynchronous Logic: Robust Pulse Generation

    CERN Document Server

    Dolev, Danny; Lenzen, Christoph; Schmid, Ulrich

    2011-01-01

    Today's hardware technology presents a new challenge in designing robust systems. Deep submicron VLSI technology introduced transient and permanent faults that were never considered in low-level system designs in the past. Still, robustness of that part of the system is crucial and needs to be guaranteed for any successful product. Distributed systems, on the other hand, have been dealing with similar issues for decades. However, neither the basic abstractions nor the complexity of contemporary fault-tolerant distributed algorithms match the peculiarities of hardware implementations. This paper is intended to be part of an attempt to overcome this gap between theory and practice for the clock synchronization problem. Solving this task sufficiently well will make it possible to build a very robust high-precision clocking system for hardware designs like systems-on-chips in critical applications. As our first building block, we describe and prove correct a novel Byzantine fault-tolerant self-stabilizing pulse syn...

  20. Active Fault Isolation in MIMO Systems

    DEFF Research Database (Denmark)

    Niemann, Hans Henrik; Poulsen, Niels Kjølstad

    Active fault isolation of parametric faults in closed-loop MIMO systems is considered in this paper. The fault isolation consists of two steps. The first step is group-wise fault isolation. Here, a group of faults is isolated from other possible faults in the system. The group-wise fault...

  1. Beam Dynamics Studies for the Fault Tolerance Assessment of the PDS-XADS Linac Design

    International Nuclear Information System (INIS)

    In order to meet the high availability/reliability required by the PDS-XADS design, the accelerator needs to implement to the maximum possible extent a fault tolerance strategy that would allow beam operation in the presence of most of the envisaged faults that could occur in its beam line components. In this work, we report the results of beam dynamics simulations performed to characterize the effects of the faults of the main linac components (cavities and focusing magnets) on the beam parameters. The outcome of this activity is the definition of the possible corrective actions that could be conceived (and implemented in the system) in order to guarantee the fault tolerance characteristics of the accelerator. This work has been supported by the PDS-XADS program, funded by the EU 5th Framework Program under contract FIKW-CT-2001-00179

  2. Fault-tolerant control design for over-actuated system conditioned by reliability: a drinking water network application

    OpenAIRE

    Weber, Philippe; Simon, Christophe; Theilliol, Didier; Puig, Vicenç

    2012-01-01

    An optimal control law synthesis conditioned by the reliability of actuators in the presence of failures is presented in this paper. The aim is to preserve the health of the actuators and the availability of the over-actuated system both in the nominal situation and in the presence of some actuator failures. The reliability assessment is computed by a Bayesian Network since it is well suited to model the reliability of complex systems with simple parameter matrices and also to compute actuators reli...

  3. Enhanced Maritime Safety through Diagnosis and Fault Tolerant Control

    DEFF Research Database (Denmark)

    Blanke, Mogens

    2001-01-01

    Faults in steering, navigation instruments or propulsion machinery are serious on a marine vessel since the consequence could be loss of maneuvering ability, with the risk of damage to vessel, personnel or environment. Early diagnosis and accommodation of faults could enhance safety. Fault-toleran...... properties of a faulty system; means to determine remedial actions. The paper illustrates the techniques by two marine examples, sensor fusion for automatic steering and control of the main engine....

  4. Active fault tolerant control research for nuclear power plant based on BP neural network

    International Nuclear Information System (INIS)

    In view of the sensor fault of nuclear power plant, the sensor was trained by adopting improved back propagation (BP) neural network method, and the dynamic model bank in different states was set up. The system was detected by using BP neural network in real time. When the sensor goes wrong, it will be controlled by reconstruction. Taking pressurizer as the case, a simulation experiment was performed on the nuclear power plant simulator. The results show that the proposed method is valid for the fault tolerant control of sensor faults in nuclear power plant. (authors)

  5. Bayesian reliability assessment of legacy safety-critical systems upgraded with fault-tolerant off-the-shelf software

    International Nuclear Information System (INIS)

    This paper presents a new way of applying Bayesian assessment to systems, which consist of many components. Full Bayesian inference with such systems is problematic, because it is computationally hard and, far more seriously, one needs to specify a multivariate prior distribution with many counterintuitive dependencies between the probabilities of component failures. The approach taken here is one of decomposition. The system is decomposed into partial views of the systems or part thereof with different degrees of detail and then a mechanism of propagating the knowledge obtained with the more refined views back to the coarser views is applied (recalibration of coarse models). The paper describes the recalibration technique and then evaluates the accuracy of recalibrated models numerically on contrived examples using two techniques: u-plot and prequential likelihood, developed by others for software reliability growth models. The results indicate that the recalibrated predictions are often more accurate than the predictions obtained with the less detailed models, although this is not guaranteed. The techniques used to assess the accuracy of the predictions are accurate enough for one to be able to choose the model giving the most accurate prediction

  6. Reliability analysis and fault-tolerant system development for a redundant strapdown inertial measurement unit. [inertial platforms

    Science.gov (United States)

    Motyka, P.

    1983-01-01

    A methodology is developed and applied for quantitatively analyzing the reliability of a dual, fail-operational redundant strapdown inertial measurement unit (RSDIMU). A Markov evaluation model is defined in terms of the operational states of the RSDIMU to predict system reliability. A 27 state model is defined based upon a candidate redundancy management system which can detect and isolate a spectrum of failure magnitudes. The results of parametric studies are presented which show the effect on reliability of the gyro failure rate, both the gyro and accelerometer failure rates together, false alarms, probability of failure detection, probability of failure isolation, and probability of damage effects and mission time. A technique is developed and evaluated for generating dynamic thresholds for detecting and isolating failures of the dual, separated IMU. Special emphasis is given to the detection of multiple, nonconcurrent failures. Digital simulation time histories are presented which show the thresholds obtained and their effectiveness in detecting and isolating sensor failures.
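
    The Markov evaluation idea can be illustrated on a far smaller model than the 27-state one in the abstract. The sketch below is a hypothetical three-state chain (both units healthy, one failed and isolated, system failed); the failure rate `lam` and the coverage `c` (probability that a failure is detected and isolated) are assumed values for illustration only and are not taken from the report.

```python
import math


def reliability_euler(lam: float, c: float, t: float, steps: int = 100_000) -> float:
    """Probability that a dual-redundant unit is still operational at time t.

    Hypothetical states and transition rates:
      0 (both healthy) -> 1 (one failed, isolated)  at 2*lam*c
      0 (both healthy) -> 2 (system failed)         at 2*lam*(1-c)
      1 (one failed)   -> 2 (system failed)         at lam
    The state equations are integrated with forward Euler steps.
    """
    p0, p1 = 1.0, 0.0
    dt = t / steps
    for _ in range(steps):
        d0 = -2.0 * lam * p0
        d1 = 2.0 * lam * c * p0 - lam * p1
        p0 += d0 * dt
        p1 += d1 * dt
    return p0 + p1


def reliability_closed_form(lam: float, c: float, t: float) -> float:
    """Closed-form solution of the same three-state model."""
    p0 = math.exp(-2.0 * lam * t)
    p1 = 2.0 * c * (math.exp(-lam * t) - math.exp(-2.0 * lam * t))
    return p0 + p1


if __name__ == "__main__":
    # Parametric study in miniature: effect of coverage on reliability.
    for coverage in (0.5, 0.9, 1.0):
        print(coverage, reliability_closed_form(1e-3, coverage, 1000.0))
```

    Sweeping `c` or `lam` in this way mirrors, in a very reduced form, the kind of parametric study the abstract describes.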

  7. A fault-tolerant voltage measurement method for series connected battery packs

    Science.gov (United States)

    Xia, Bing; Mi, Chris

    2016-03-01

    This paper proposes a fault-tolerant voltage measurement method for battery management systems. Instead of measuring the voltage of individual cells, the proposed method measures the voltage sum of multiple battery cells without additional voltage sensors. A matrix interpretation is developed to demonstrate the viability of the proposed sensor topology to distinguish between sensor faults and cell faults. A methodology is introduced to isolate sensor and cell faults by locating abnormal signals. A measurement electronic circuit is proposed to implement the design concept. Simulation and experiment results support the mathematical analysis and validate the feasibility and robustness of the proposed method. In addition, the measurement problem is generalized and the condition for valid sensor topology is discovered. The tuning of design parameters are analyzed based on fault detection reliability and noise levels.
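
    A minimal sketch of the matrix interpretation, under assumptions that are not from the paper: three cells, three sensors, and each sensor measuring the sum of two neighbouring cells. The names `TOPOLOGY`, `measure` and `classify`, the tolerance, and the nominal cell voltage are all hypothetical. With this topology a cell fault disturbs exactly the two sensors that include the cell, while a sensor fault disturbs only one reading, which is what lets the two be told apart.

```python
# Rows are sensors, columns are cells; a 1 means the sensor includes that cell.
# Hypothetical topology: s0 = v0 + v1, s1 = v1 + v2, s2 = v2 + v0.
TOPOLOGY = [
    [1, 1, 0],
    [0, 1, 1],
    [1, 0, 1],
]


def measure(topology, cells):
    """Ideal sensor readings: each reading is a sum of cell voltages."""
    return [sum(a * v for a, v in zip(row, cells)) for row in topology]


def classify(topology, readings, expected, tol=0.05):
    """Locate abnormal readings and attribute them to a sensor or a cell."""
    abnormal = [i for i, (r, e) in enumerate(zip(readings, expected))
                if abs(r - e) > tol]
    if not abnormal:
        return ("none", None)
    if len(abnormal) == 1:
        # Every cell feeds two sensors, so a lone outlier is a sensor fault.
        return ("sensor", abnormal[0])
    for j in range(len(topology[0])):
        # A cell fault shows up on exactly the sensors in that cell's column.
        if [i for i, row in enumerate(topology) if row[j]] == abnormal:
            return ("cell", j)
    return ("unknown", None)


if __name__ == "__main__":
    nominal = measure(TOPOLOGY, [3.7, 3.7, 3.7])
    print(classify(TOPOLOGY, measure(TOPOLOGY, [3.7, 3.2, 3.7]), nominal))
```

    The same pairwise-sum pattern on four cells gives a singular matrix, so not every topology is usable; the abstract's "condition for valid sensor topology" addresses that question in general.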

  8. Fault Tolerant Control Using Proportional-Integral-Derivative Controller Tuned by Genetic Algorithm

    Directory of Open Access Journals (Sweden)

    S. Kanthalakshmi

    2011-01-01

    Problem statement: The growing demand for reliability, maintainability and survivability in industrial processes has drawn significant research in the fault detection and fault tolerant control domain. A fault is usually defined as an unexpected change in a system, such as a component malfunction or a variation in operating conditions, which tends to degrade the overall system performance. The purpose of fault detection is to detect these malfunctions and take proper action in order to prevent faults from developing into a total system failure. Approach: In this study an effective integrated fault detection and fault tolerant control scheme was developed for a class of LTI systems. The scheme was based on a Kalman filter for simultaneous state and fault parameter estimation, statistical decisions for fault detection and activation of controller reconfiguration. Proportional-Integral-Derivative (PID) control schemes continue to provide the simplest and yet effective solutions to most control engineering applications today. Determination or tuning of the PID parameters continues to be important, as these parameters have a great influence on the stability and performance of the control system. In this study a genetic algorithm (GA) was proposed to tune the PID controller. Results: The results reflect that the proposed scheme improves the performance of the process in terms of time domain specifications, robustness to parametric changes and optimum stability. Also, a comparison with the conventional Ziegler-Nichols method proves the superiority of the GA-based system. Conclusion: This study demonstrates the effectiveness of the genetic algorithm in tuning a PID controller with optimum parameters. It is, moreover, proved to be robust to variations in plant dynamic characteristics and disturbances, assuring a parameter-insensitive operation of the process.
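
    For readers unfamiliar with the controller being tuned, here is a minimal discrete PID sketch driving a hypothetical first-order plant. It does not implement the paper's genetic algorithm or Kalman filter; the gains, the plant time constant `tau`, and the step size are hand-picked assumptions for illustration.

```python
class PID:
    """Textbook discrete PID controller (parallel form)."""

    def __init__(self, kp: float, ki: float, kd: float):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0
        self.prev_error = None

    def update(self, error: float, dt: float) -> float:
        self.integral += error * dt
        # No derivative action on the first sample (no previous error yet).
        if self.prev_error is None:
            derivative = 0.0
        else:
            derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def simulate(pid: PID, setpoint: float = 1.0, tau: float = 1.0,
             dt: float = 0.01, steps: int = 2000) -> float:
    """Closed-loop step response of the assumed plant tau * dy/dt = -y + u.

    Returns the plant output after `steps` forward Euler steps.
    """
    y = 0.0
    for _ in range(steps):
        u = pid.update(setpoint - y, dt)
        y += dt * (-y + u) / tau
    return y


if __name__ == "__main__":
    print(simulate(PID(kp=2.0, ki=1.0, kd=0.05)))
```

    A tuning method such as the one in the abstract would search over `kp`, `ki` and `kd` against a performance measure computed from a simulation like `simulate`.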

  9. Chasing the FLP Impossibility Result in a LAN or How Robust Can a Fault Tolerant Server Be?

    OpenAIRE

    Urbán, P.; Défago, X.; Schiper, A.

    2001-01-01

    Fault tolerance can be achieved in distributed systems by replication. However, Fischer, Lynch and Paterson have proven an impossibility result about consensus in the asynchronous system model. Similar impossibility results have been established for atomic broadcast and group membership, and should be as such relevant for implementations of a replicated s...

  10. High Performance Modeling of Intelligent Pattern Recognition with Enhanced Fault-Tolerance in Real Time

    Directory of Open Access Journals (Sweden)

    Renukaradhya P.C

    2014-03-01

    This work concerns the design of an ANN that can recognize learned patterns even when the applied test patterns vary from the learned patterns. A mechanism has been developed that provides this recognition facility intelligently. Recognition of patterns can be broadly categorized into two classes: when the precision of recognition is not defined, the process is termed forced recognition; when the precision of recognition is properly defined, the process is termed custom recognition. The fault-tolerance property of a feed-forward architecture trained with the back-propagation method is analyzed. This includes the effect of the initially selected random weights, and what the nature of those random weights should be in order to maximize the fault-tolerance capability of the system. The analysis is done with two different distributions, namely the Gaussian distribution and the uniform distribution. The effect of faults at the output is also a function of the fault position in the ANN system, such as hidden-layer weights, output-layer weights, and processing elements at the hidden layer. The capability of the back-propagation algorithm itself to tolerate faults through the learning process is analyzed. A test mechanism to check for a faulty system is developed, since in the future ANN systems will be realized in hardware, i.e. on VLSI chips; once the architecture is implemented, a mechanism is required to check its functioning. Analysis of the internal parameters of an ANN is purely research work on the behavior of those parameters, which will provide all the factors responsible for the success of an ANN.

  11. Lightweight storage and overlay networks for fault tolerance.

    Energy Technology Data Exchange (ETDEWEB)

    Oldfield, Ron A.

    2010-01-01

    The next generation of capability-class, massively parallel processing (MPP) systems is expected to have hundreds of thousands to millions of processors. In such environments, it is critical to have fault-tolerance mechanisms, including checkpoint/restart, that scale with the size of applications and the percentage of the system on which the applications execute. For application-driven, periodic checkpoint operations, the state-of-the-art does not provide a scalable solution. For example, on today's massive-scale systems that execute applications which consume most of the memory of the employed compute nodes, checkpoint operations generate I/O that consumes nearly 80% of the total I/O usage. Motivated by this observation, this project aims to improve I/O performance for application-directed checkpoints through the use of lightweight storage architectures and overlay networks. Lightweight storage provides direct access to underlying storage devices. Overlay networks provide caching and processing capabilities in the compute-node fabric. The combination has potential to significantly reduce I/O overhead for large-scale applications. This report describes our combined efforts to model and understand overheads for application-directed checkpoints, as well as implementation and performance analysis of a checkpoint service that uses available compute nodes as a network cache for checkpoint operations.

  12. A Decentralized Adaptive Approach to Fault Tolerant Flight Control

    Science.gov (United States)

    Wu, N. Eva; Nikulin, Vladimir; Heimes, Felix; Shormin, Victor

    2000-01-01

    This paper briefly reports some results of our study on the application of a decentralized adaptive control approach to a 6 DOF nonlinear aircraft model. The simulation results showed the potential of using this approach to achieve fault tolerant control. Based on this observation and some analysis, the paper proposes a multiple channel adaptive control scheme that makes use of the functionally redundant actuating and sensing capabilities in the model, and explains how to implement the scheme to tolerate actuator and sensor failures. The conditions, under which the scheme is applicable, are stated in the paper.

  13. Fault Tolerance for Industrial Actuators in Absence of Accurate Models and Hardware Redundancy

    DEFF Research Database (Denmark)

    Papageorgiou, Dimitrios; Blanke, Mogens; Niemann, Hans Henrik; Richter, Jan H.

    This paper investigates Fault-Tolerant Control for closed-loop systems where only coarse models are available and there is lack of actuator and sensor redundancies. The problem is approached in the form of a typical servomotor in closed-loop. A linear model is extracted from input/output data to ...

  14. Fault management for data systems

    Science.gov (United States)

    Boyd, Mark A.; Iverson, David L.; Patterson-Hine, F. Ann

    1993-01-01

    Issues related to automating the process of fault management (fault diagnosis and response) for data management systems are considered. Substantial benefits are to be gained by successful automation of this process, particularly for large, complex systems. The use of graph-based models to develop a computer assisted fault management system is advocated. The general problem is described and the motivation behind choosing graph-based models over other approaches for developing fault diagnosis computer programs is outlined. Some existing work in the area of graph-based fault diagnosis is reviewed, and a new fault management method which was developed from existing methods is offered. Our method is applied to an automatic telescope system intended as a prototype for future lunar telescope programs. Finally, an application of our method to general data management systems is described.

  15. Fault-Tolerant Quantum Computation With Constant Error Rate

    CERN Document Server

    Aharonov, Dorit; Ben-Or, Michael

    1999-01-01

    This paper proves the threshold result, which asserts that quantum computation can be made robust against errors and inaccuracies, when the error rate, $\eta$, is smaller than a constant threshold, $\eta_c$. The result holds for a very general, not necessarily probabilistic noise model, for quantum particles with any number of states, and is also generalized to one dimensional quantum computers with only nearest neighbor interactions. No measurements, or classical operations, are required during the quantum computation. The proceeding version was very succinct, and here we fill all the missing details, and elaborate on many parts of the proof. In particular, we devote a section for a discussion of universality issues and proofs that the sets of gates that we use are universal. Another section is devoted to a rigorous proof that fault tolerance can be achieved in the presence of general non probabilistic noise. The systematic structure of the fault tolerant procedures for polynomial codes is explained in lengt...

  16. Topological fault-tolerance in cluster state quantum computation

    International Nuclear Information System (INIS)

    We describe a fault-tolerant version of the one-way quantum computer using a cluster state in three spatial dimensions. Topologically protected quantum gates are realized by choosing appropriate boundary conditions on the cluster. We provide equivalence transformations for these boundary conditions that can be used to simplify fault-tolerant circuits and to derive circuit identities in a topological manner. The spatial dimensionality of the scheme can be reduced to two by converting one spatial axis of the cluster into time. The error threshold is 0.75% for each source in an error model with preparation, gate, storage and measurement errors. The operational overhead is poly-logarithmic in the circuit size

  17. Fault-tolerance performance evaluation of fieldbus for NPCS network of KNGR

    International Nuclear Information System (INIS)

    In contrast with conventional fieldbus research, which is focused merely on real-time performance, this study aims to evaluate the real-time performance of the communication system including fault-tolerant mechanisms. Maintaining performance in the presence of recoverable faults is very important because the communication network will be applied to next-generation NPPs (Nuclear Power Plants). In order to guarantee the performance of the NPP communication network, the time characteristics of the target system in the presence of recoverable faults should be investigated. If the time characteristics meet the requirements of the system, the faults will be recovered by fieldbus recovery mechanisms and the system will be safe. If the time characteristics cannot meet the requirements, the faults in the fieldbus can propagate to system failure. In this study, for the purpose of investigating the time characteristics of the fieldbus, the recoverable faults are classified, and then the formulas which represent delays including recovery mechanisms and the simulation model are developed. In order to validate the proposed approach, the simulation model is applied to the Korea Next Generation Reactor (KNGR) NSSS Process Control System (NPCS). The results of the simulation provide reasonable delay characteristics of the fault cases with recovery mechanisms. Using the outcome of the simulation and the system requirements, we can also calculate the failure propagation probability from the fieldbus to the outer system.

  18. Fault-tolerant Control of Unmanned Underwater Vehicles with Continuous Faults: Simulations and Experiments

    Directory of Open Access Journals (Sweden)

    Qian Liu

    2010-02-01

    A novel thruster fault diagnosis and accommodation method for open-frame underwater vehicles is presented in the paper. The proposed system consists of two units: a fault diagnosis unit and a fault accommodation unit. In the fault diagnosis unit, an ICMAC (Improved Credit Assignment Cerebellar Model Articulation Controllers) neural network information fusion model is used to identify thruster faults. The fault accommodation unit is based on direct calculations of moment, and the result of fault identification is used to find the solution of the control allocation problem. The approach resolves the continuous fault identification of the UV. Results from the experiment are provided to illustrate the performance of the proposed method in uncertain continuous faulty situations.

  19. BFTDT: Byzantine Fault Tolerance tryout for Dependable Transactions in Cloud

    Directory of Open Access Journals (Sweden)

    Gayathri S

    2012-11-01

    Cloud Web Services (CWS) is the technology used for business collaboration and integration among web users. Web Services Atomic Transactions (WS-AT) have been used for trusted distributed transaction processing over the web. In a distributed setting WS-AT is subject to Byzantine faults; to overcome them, Byzantine Faults Techniques (BFT) are used. The reliable coordinator provides Coordination, Activation, Registration and Completion services, which make the transaction effective and reliable. In the trusted environment, to avoid congestion of the resources, a fair-share bandwidth allocation scheme is used to allocate separate bandwidth to each web user, and the transaction is processed by the Coordinator server and the Transaction Processing Monitor (TPM). Analysis of WS-AT for business applications shows a high degree of dependability, security, trust, fault tolerance and fairness of the resources in the trusted environment.

  20. Faster Quantum Chemistry Simulation on Fault-Tolerant Quantum Computers

    OpenAIRE

    Jones, N. Cody; Whitfield, James D; McMahon, Peter L.; Yung, Man-Hong; Van Meter, Rodney; Aspuru-Guzik, Alan; Yamamoto, Yoshihisa

    2012-01-01

    Quantum computers can in principle simulate quantum physics exponentially faster than their classical counterparts, but some technical hurdles remain. We propose methods which substantially improve the performance of a particular form of simulation, ab initio quantum chemistry, on fault-tolerant quantum computers; these methods generalize readily to other quantum simulation problems. Quantum teleportation plays a key role in these improvements and is used extensively as a computing resource...

  1. Logic Synthesis for Fault-Tolerant Quantum Computers

    OpenAIRE

    Jones, N. Cody

    2013-01-01

    Efficient constructions for quantum logic are essential since quantum computation is experimentally challenging. This thesis develops quantum logic synthesis as a paradigm for reducing the resource overhead in fault-tolerant quantum computing. The model for error correction considered here is the surface code. After developing the theory behind general logic synthesis, the resource costs of magic-state distillation for the $T = \exp(i \pi (I-Z)/8)$ gate are quantitatively analyzed. The resour...
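
    As a worked step that the abstract leaves implicit, the exponential form of the T gate can be expanded into its matrix. This assumes the standard computational-basis conventions $I = \mathrm{diag}(1, 1)$ and $Z = \mathrm{diag}(1, -1)$, which the abstract does not state explicitly:

```latex
\[
T = \exp\!\left(\frac{i\pi}{8}\,(I - Z)\right)
  = \exp\!\left(\frac{i\pi}{8}\begin{pmatrix} 0 & 0 \\ 0 & 2 \end{pmatrix}\right)
  = \begin{pmatrix} 1 & 0 \\ 0 & e^{i\pi/4} \end{pmatrix}.
\]
```

    Because $I - Z$ is diagonal, the exponential acts entry by entry, so T leaves $|0\rangle$ unchanged and multiplies $|1\rangle$ by the phase $e^{i\pi/4}$.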

  2. Fault-Tolerant Shortest Paths - Beyond the Uniform Failure Model

    OpenAIRE

    Adjiashvili, David

    2013-01-01

    The overwhelming majority of survivable (fault-tolerant) network design models assume a uniform scenario set. Such a scenario set assumes that every subset of the network resources (edges or vertices) of a given cardinality $k$ comprises a scenario. While this approach yields problems with clean combinatorial structure and good algorithms, it often fails to capture the true nature of the scenario set coming from applications. One natural refinement of the uniform model is obtained by partitio...

  3. Fault-tolerant Operations for Universal Blind Quantum Computation

    OpenAIRE

    Chien, Chia-Hung; Van Meter, Rodney; Kuo, Sy-Yen

    2013-01-01

    Blind quantum computation is an appealing use of quantum information technology because it can conceal both the client's data and the algorithm itself from the server. However, problems need to be solved in the practical use of blind quantum computation and fault-tolerance is a major challenge. On an example circuit, the computational cost measured in T gates executed by the client is 97 times more than performing the original computation directly, without using the server, even before applyi...

  4. A Fault Tolerance protocol for ASP calculus: Design and Proof

    OpenAIRE

    Baude, Françoise; Caromel, Denis; Delbé, Christian; Henrio, Ludovic

    2004-01-01

    This research report first details a communication induced checkpointing fault tolerance protocol adapted to ProActive, a Java library that implements the ASP model. This model is based on a request/reply mechanism. In order to prove the correctness of this protocol, we introduce a local partial order between events occurring on a given process. This order is extended into a global order by Lamport's happened-before relation. Finally, we prove that from a cut that is ''consistent enough'',...
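
    The happened-before relation mentioned above is commonly tracked with Lamport logical clocks. The sketch below shows that generic technique only; it is not the report's checkpointing protocol, and the `Process` class and its method names are hypothetical.

```python
class Process:
    """A process carrying a Lamport logical clock."""

    def __init__(self, name: str):
        self.name = name
        self.clock = 0

    def local_event(self) -> int:
        """Tick the clock for an internal event."""
        self.clock += 1
        return self.clock

    def send(self) -> int:
        """Tick the clock and return the timestamp to attach to the message."""
        self.clock += 1
        return self.clock

    def receive(self, timestamp: int) -> int:
        """Advance past both the local clock and the message timestamp."""
        self.clock = max(self.clock, timestamp) + 1
        return self.clock


if __name__ == "__main__":
    p, q = Process("p"), Process("q")
    ts = p.send()          # request leaves p
    q.local_event()        # unrelated local work on q
    print(q.receive(ts))   # the receive is ordered after the send
```

    If event a happened before event b, the clock value assigned to a is strictly smaller than the one assigned to b; the converse does not hold, which is why such clocks give an ordering consistent with happened-before rather than the relation itself.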

  5. Formal verification of fault-tolerant software design: the CSP approach

    OpenAIRE

    Yeung, WL; Schneider, SA

    2005-01-01

    Software design techniques for tolerating both hardware and software faults have been developed over the past few decades. Paradoxically, it is essential that fault-tolerant software is designed with the highest possible rigour to prevent faults in itself. Such rigour is provided by formal methods and aided by model checking. We illustrate an approach to fault-tolerant software design based on communicating sequential processes through a running example.

  6. Unconstrained and Constrained Fault-Tolerant Resource Allocation

    CERN Document Server

    Liao, Kewen

    2011-01-01

    First, we study the Unconstrained Fault-Tolerant Resource Allocation (UFTRA) problem (a.k.a. FTFA problem in \cite{shihongftfa}). In the problem, we are given a set of sites equipped with an unconstrained number of facilities as resources, and a set of clients with set $\mathcal{R}$ as corresponding connection requirements, where every facility belonging to the same site has an identical opening (operating) cost and every client-facility pair has a connection cost. The objective is to allocate facilities from sites to satisfy $\mathcal{R}$ at a minimum total cost. Next, we introduce the Constrained Fault-Tolerant Resource Allocation (CFTRA) problem. It differs from UFTRA in that the number of resources available at each site $i$ is limited by $R_{i}$. Both problems are practical extensions of the classical Fault-Tolerant Facility Location (FTFL) problem \cite{Jain00FTFL}. For instance, their solutions provide optimal resource allocation (w.r.t. enterprises) and leasing (w.r.t. clients) strategies for the cont...

  7. Faster quantum chemistry simulation on fault-tolerant quantum computers

    Science.gov (United States)

    Cody Jones, N.; Whitfield, James D.; McMahon, Peter L.; Yung, Man-Hong; Van Meter, Rodney; Aspuru-Guzik, Alán; Yamamoto, Yoshihisa

    2012-11-01

    Quantum computers can in principle simulate quantum physics exponentially faster than their classical counterparts, but some technical hurdles remain. We propose methods which substantially improve the performance of a particular form of simulation, ab initio quantum chemistry, on fault-tolerant quantum computers; these methods generalize readily to other quantum simulation problems. Quantum teleportation plays a key role in these improvements and is used extensively as a computing resource. To improve execution time, we examine techniques for constructing arbitrary gates which perform substantially faster than circuits based on the conventional Solovay-Kitaev algorithm (Dawson and Nielsen 2006 Quantum Inform. Comput. 6 81). For a given approximation error $\epsilon$, arbitrary single-qubit gates can be produced fault-tolerantly and using a restricted set of gates in time which is $O(\log \epsilon)$ or $O(\log \log \epsilon)$; with sufficient parallel preparation of ancillas, constant average depth is possible using a method we call programmable ancilla rotations. Moreover, we construct and analyze efficient implementations of first- and second-quantized simulation algorithms using the fault-tolerant arbitrary gates and other techniques, such as implementing various subroutines in constant time. A specific example we analyze is the ground-state energy calculation for lithium hydride.

  8. Resource requirements for a fault-tolerant quantum Fourier transform

    Science.gov (United States)

    Goto, Hayato; Nakamura, Satoshi; Kujiraoka, Mamiko; Ichimura, Kouichi

    2015-03-01

    The quantum Fourier transform (QFT) is a basic subroutine for most quantum algorithms providing an exponential speedup over classical ones. We investigate resource requirements for a fault-tolerant QFT. To implement single-qubit rotations for a QFT in a fault-tolerant manner, we examine three types of approaches: ancilla-free gate synthesis, ancilla-assisted gate synthesis, and state distillation. While gate synthesis approximates single-qubit rotations with basic quantum operations, state distillation makes it possible to perform the specific single-qubit rotations required for the QFT exactly. It is unknown, however, which approach is better for the QFT. We estimated the resource requirement for a QFT in each case, where the resource is measured by the total number of $\pi/8$ gates, denoted by T, which is called the T count. Contrary to the initial expectation, the total T count for the state distillation is considerably larger than those for the ancilla-free and ancilla-assisted gate synthesis. Thus, we conclude that the ancilla-assisted gate synthesis is the best for a fault-tolerant QFT so far.

  9. Fault-Tolerant, Radiation-Hard DSP

    Science.gov (United States)

    Czajkowski, David

    2011-01-01

    Commercial digital signal processors (DSPs) for use in high-speed satellite computers are challenged by the damaging effects of space radiation, mainly single event upsets (SEUs) and single event functional interrupts (SEFIs). Innovations have been developed for mitigating the effects of SEUs and SEFIs, enabling the use of very-high-speed commercial DSPs with improved SEU tolerances. Time-triple modular redundancy (TTMR) is a method of applying traditional triple modular redundancy on a single processor, exploiting the VLIW (very long instruction word) class of parallel processors. TTMR improves SEU rates substantially. SEFIs are solved by a SEFI-hardened core circuit, external to the microprocessor. It monitors the health of the processor, and if a SEFI occurs, forces the processor to return to performance through a series of escalating events. TTMR and hardened-core solutions were developed for both DSPs and reconfigurable field-programmable gate arrays (FPGAs). This includes advancement of TTMR algorithms for DSPs and reconfigurable FPGAs, plus a rad-hard, hardened-core integrated circuit that services both the DSP and FPGA. Additionally, a combined DSP and FPGA board architecture was fully developed into a rad-hard engineering product. This technology enables use of commercial off-the-shelf (COTS) DSPs in computers for satellite and other space applications, allowing rapid deployment at a much lower cost. Traditional rad-hard space computers are very expensive and typically have long lead times. These computers are either based on traditional rad-hard processors, which have extremely low computational performance, or triple modular redundant (TMR) FPGA arrays, which suffer from power and complexity issues. Even more frustrating is that the TMR arrays of FPGAs require a fixed, external rad-hard voting element, thereby causing them to lose much of their reconfiguration capability and, in some cases, to suffer significant speed reduction.
    The benefits of COTS high-performance signal processing include significant increase in onboard science data processing, enabling orders of magnitude reduction in required communication bandwidth for science data return, orders of magnitude improvement in onboard mission planning and critical decision making, and the ability to rapidly respond to changing mission environments, thus enabling opportunistic science and orders of magnitude reduction in the cost of mission operations through reduction of required staff. Additional benefits of COTS-based, high-performance signal processing include the ability to leverage considerable commercial and academic investments in advanced computing tools, techniques, and infrastructure, and the familiarity of the science and IT community with these computing environments.
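
    The voting step at the heart of triple modular redundancy can be sketched in a few lines. This is a generic bitwise majority voter applied to three repeated runs on one processor, offered as an illustration of the idea only; it is not the TTMR algorithm itself, and `run_with_voting` and its `upset` parameter (a simulated single event upset) are hypothetical.

```python
def majority_vote(a: int, b: int, c: int) -> int:
    """Bitwise 2-of-3 majority: each output bit agrees with at least two inputs."""
    return (a & b) | (a & c) | (b & c)


def run_with_voting(func, x, upset=None):
    """Run `func` three times and vote on the integer results.

    `upset` is an optional (run_index, bit_mask) pair that flips bits in one
    of the three results to imitate a single event upset.
    """
    results = [func(x) for _ in range(3)]
    if upset is not None:
        index, mask = upset
        results[index] ^= mask
    return majority_vote(*results)


if __name__ == "__main__":
    # One corrupted run is outvoted by the two clean ones.
    print(run_with_voting(lambda v: v * 3, 14, upset=(1, 0b100)))
```

    A single corrupted result is masked because every bit is decided by the other two runs; two simultaneous upsets in the same bit position would defeat the vote.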

  10. Fault-Tolerant Quantum Dynamical Decoupling

    OpenAIRE

    Khodjasteh, K.; Lidar, D. A.

    2004-01-01

    Dynamical decoupling pulse sequences have been used to extend coherence times in quantum systems ever since the discovery of the spin-echo effect. Here we introduce a method of recursively concatenated dynamical decoupling pulses, designed to overcome both decoherence and operational errors. This is important for coherent control of quantum systems such as quantum computers. For bounded-strength, non-Markovian environments, such as for the spin-bath that arises in electron- and nuclear-spin b...

  11. A Modular and Fault-Tolerant Data Transport Framework

    CERN Document Server

    Steinbeck, Timm M

    2009-01-01

    The High Level Trigger (HLT) of the future ALICE heavy-ion experiment has to reduce its input data rate of up to 25 GB/s to at most 1.25 GB/s for output before the data is written to permanent storage. To cope with these data rates a large PC cluster system is being designed to scale to several 1000 nodes, connected by a fast network. For the software that will run on these nodes a flexible data transport and distribution software framework, described in this thesis, has been developed. The framework consists of a set of separate components, that can be connected via a common interface. This allows to construct different configurations for the HLT, that are even changeable at runtime. To ensure a fault-tolerant operation of the HLT, the framework includes a basic fail-over mechanism that allows to replace whole nodes after a failure. The mechanism will be further expanded in the future, utilizing the runtime reconnection feature of the framework's component interface. To connect cluster nodes a communication ...

  12. A Modular and Fault-Tolerant Data Transport Framework

    CERN Document Server

    Steinbeck, Timm M

    2004-01-01

    The High Level Trigger (HLT) of the future ALICE heavy-ion experiment has to reduce its input data rate of up to 25 GB/s to at most 1.25 GB/s for output before the data is written to permanent storage. To cope with these data rates a large PC cluster system is being designed to scale to several 1000 nodes, connected by a fast network. For the software that will run on these nodes a flexible data transport and distribution software framework, described in this thesis, has been developed. The framework consists of a set of separate components, that can be connected via a common interface. This allows to construct different configurations for the HLT, that are even changeable at runtime. To ensure a fault-tolerant operation of the HLT, the framework includes a basic fail-over mechanism that allows to replace whole nodes after a failure. The mechanism will be further expanded in the future, utilizing the runtime reconnection feature of the framework's component interface. To connect cluster nodes a communication ...

  13. Byzantine Fault Tolerance of Regenerating Codes

    CERN Document Server

    Oggier, Frédérique

    2011-01-01

    Recent years have witnessed a slew of coding techniques custom designed for networked storage systems. Network coding inspired regenerating codes are the most prolifically studied among these new age storage centric codes. A lot of effort has been invested in understanding the fundamental achievable trade-offs of storage and bandwidth usage to maintain redundancy in presence of different models of failures, showcasing the efficacy of regenerating codes with respect to traditional erasure coding techniques. For practical usability in open and adversarial environments, as is typical in peer-to-peer systems, we need however not only resilience against erasures, but also from (adversarial) errors. In this paper, we study the resilience of generalized regenerating codes (supporting multi-repairs, using collaboration among newcomers) in the presence of two classes of Byzantine nodes, relatively benign selfish (non-cooperating) nodes, as well as under more active, malicious polluting nodes. We give upper bounds on t...

  14. Two New Protocols for Fault Tolerant Agreement

    Directory of Open Access Journals (Sweden)

    Poonam Saini

    2011-02-01

    The paper attempts to handle failures effectively, while reaching agreement, in a distributed transaction processing system. The standard protocols such as BFTDC [3], Zyzzyva [4] and PBFT [5] handle the problem to a great extent. However, the limitation of these protocols is that they incur increased message overhead as well as large latency. Moreover, the nodes are evacuated from the transaction system after being declared faulty. We propose a novel proactive agreement protocol which identifies tentative failures in the system. To improve failure resiliency with minimum execution overhead, we also propose an optimized reactive view change mechanism. Both mechanisms have been analyzed and compared. The dynamic analysis of the protocol shows that, in a faulty scenario, the proactive approach is computationally more efficient, with reduced latency, than the reactive one. Moreover, unlike PBFT and BFTDC, our agreement protocol runs in two phases, which leads to reduced message overhead and total execution time. The protocol treats the fail-silent (i.e. crashed) nodes in the system.

  15. Design and Analysis of Software fault-Tolerant techniques for Softcore processors in reliable SRAM based FPGA

    OpenAIRE

    Vatsya Tiwari; Prof. Pratap Singh Patwal

    2011-01-01

    This paper discusses high level techniques for designing fault tolerant systems in SRAM-based FPGAs, without modification in the FPGA architecture. Triple Modular Redundancy (TMR) has been successfully applied in FPGAs to mitigate transient faults, which are likely to occur in space applications. However, TMR comes with high area and power dissipation penalties. The new technique proposed in this paper was specifically developed for FPGAs to cope with transient faults in the user combinationa...

  16. Actuator fault-tolerant control design based on reconfigurable reference input

    OpenAIRE

    Theilliol, Didier; Join, Cédric; Zhang, Youmin

    2008-01-01

    The prospective work reported in this paper explores a new approach to enhance the performance of an active fault tolerant control system. The proposed technique is based on a modified recovery/trajectory control system in which a reconfigurable reference input is considered when performance degradation occurs in the system due to faults in actuator dynamics. An added value of this work is to reduce the energy spent to achieve the desired closed-loop performance. This work is justified by t...

  17. Optimal Configuration of Fault-Tolerance Parameters for Distributed Server Access

    DEFF Research Database (Denmark)

    Daidone, Alessandro; Renier, Thibault; Bondavalli, Andrea; Schwefel, Hans-Peter

    2013-01-01

    Server replication is a common fault-tolerance strategy to improve transaction dependability for services in communications networks. In distributed architectures, fault-diagnosis and recovery are implemented via the interaction of the server replicas with the clients and other entities such as...... model using stochastic activity networks (SAN) for the evaluation of performance and dependability metrics of a generic transaction-based service implemented on a distributed replication architecture. The composite SAN model can be easily adapted to a wide range of client-server applications deployed in...... replicated server architectures. In order to obtain insight into the system behaviour, a set of relevant environment parameters and controllable fault-tolerance parameters are chosen and the dependability/performance trade-off is evaluated....

  18. Experimental Robot Position Sensor Fault Tolerance Using Accelerometers and Joint Torque Sensors

    Science.gov (United States)

    Aldridge, Hal A.; Juang, Jer-Nan

    1997-01-01

    Robot systems in critical applications, such as those in space and nuclear environments, must be able to operate during component failure to complete important tasks. One failure mode that has received little attention is the failure of joint position sensors. Current fault tolerant designs require the addition of directly redundant position sensors which can affect joint design. The proposed method uses joint torque sensors found in most existing advanced robot designs along with easily locatable, lightweight accelerometers to provide a joint position sensor fault recovery mode. This mode uses the torque sensors along with a virtual passive control law for stability and accelerometers for joint position information. Two methods for conversion from Cartesian acceleration to joint position based on robot kinematics, not integration, are presented. The fault tolerant control method was tested on several joints of a laboratory robot. The controllers performed well with noisy, biased data and a model with uncertain parameters.

  19. Online Reconfigurable Self-Timed Links for Fault Tolerant NoC

    Directory of Open Access Journals (Sweden)

    Teijo Lehtonen

    2007-05-01

    We propose link structures for NoC that have properties for efficiently tolerating transient, intermittent, and permanent errors. This is a necessary step to be taken in order to implement reliable systems in future nanoscale technologies. The protection against transient errors is realized using Hamming coding and interleaving for error detection and retransmission as the recovery method. We introduce two approaches for tackling the intermittent and permanent errors. In the first approach, spare wires are introduced together with reconfiguration circuitry. The other approach uses time redundancy: the transmission is split into two parts, where the data is doubled. In both structures the presence of permanent or intermittent errors is monitored by analyzing previous error syndromes. The links are based on self-timed signaling in which the handshake signals are protected using triple modular redundancy. We present the structures, operation, and designs for the different components of the links. The fault tolerance properties are analyzed using a fault model containing temporary, intermittent, and permanent faults that occur both as bursts and as single faults. The results show a considerable enhancement in the fault tolerance at the cost of performance and area, and with only a slight increase in power consumption.
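
    The detection-and-retransmission idea and the triple modular redundancy mentioned in this abstract can be illustrated with a minimal Python sketch. It assumes a Hamming(7,4) code and a bitwise majority voter; the abstract does not state the code parameters or the interleaving scheme, so the function names and the code size here are illustrative assumptions, not the authors' link design.

```python
def hamming74_encode(data):
    """Encode 4 data bits into a 7-bit Hamming codeword.

    Bit layout (1-indexed positions): p1 p2 d1 p3 d2 d3 d4.
    Hamming(7,4) is an assumed example size, not taken from the paper.
    """
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4  # parity over positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # parity over positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4  # parity over positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]


def hamming74_syndrome(word):
    """Return 0 if all parity checks pass, else the 1-indexed
    position of a single flipped bit."""
    s1 = word[0] ^ word[2] ^ word[4] ^ word[6]
    s2 = word[1] ^ word[2] ^ word[5] ^ word[6]
    s3 = word[3] ^ word[4] ^ word[5] ^ word[6]
    return s1 + 2 * s2 + 4 * s3


def needs_retransmission(word):
    """Detection-only use of the code: any nonzero syndrome
    triggers a retransmission request instead of a correction."""
    return hamming74_syndrome(word) != 0


def tmr_vote(a, b, c):
    """Bitwise majority of three redundant copies of a signal."""
    return (a & b) | (a & c) | (b & c)
```

    Used for detection only, a distance-3 code such as this one flags any single or double bit error in a codeword, which is why pairing it with retransmission can be attractive for transient faults.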

  20. Task-based Dynamic Fault Tolerance for Humanoid Robot Applications and Its Hardware Implementation

    Directory of Open Access Journals (Sweden)

    Masayuki Murakami

    2008-08-01

    This paper presents a new fault tolerance scheme suitable for humanoid robot applications. In the future, various tasks ranging from daily chores to safety-related tasks will be carried out by individual humanoid robots. If the importance of the tasks is different, the required dependability will vary accordingly. Therefore, for mobile humanoid robots operating under power constraints, fault tolerance that dynamically changes based on the importance of the tasks is desirable because fault-tolerant designs involving hardware redundancy are power intensive. In the proposed fault tolerance scheme, a duplex computer system switches between hot standby and cold standby according to each individual task. However, in mobile humanoid robots, a safety issue arises when cold standby is used for the standby computer unit. Since an unpowered unit cannot immediately start to operate, a biped-walking robot falls down when failover occurs during cold standby. This paper proposes a safety failover method to resolve this issue and describes the hardware design of the safety failover subsystem.

  1. Effect Analysis of Faults in Digital I and C Systems of Nuclear Power Plants

    Energy Technology Data Exchange (ETDEWEB)

    Lee, Seung Jun; Jung, Won Dea [KAERI, Dajeon (Korea, Republic of); Kim, Man Cheol [Chung-Ang University, Seoul (Korea, Republic of)

    2014-08-15

    A reliability analysis of digital instrumentation and control (I and C) systems in nuclear power plants has been introduced as one of the important elements of a probabilistic safety assessment because of the unique characteristics of digital I and C systems. Digital I and C systems have various features distinguishable from those of analog I and C systems such as software and fault-tolerant techniques. In this work, the faults in a digital I and C system were analyzed and a model for representing the effects of the faults was developed. First, the effects of the faults in a system were analyzed using fault injection experiments. A software-implemented fault injection technique in which faults can be injected into the memory was used based on the assumption that all faults in a system are reflected in the faults in the memory. In the experiments, the effect of a fault on the system output was observed. In addition, the success or failure in detecting the fault by fault-tolerant functions included in the system was identified. Second, a fault tree model for representing that a fault is propagated to the system output was developed. With the model, it can be identified how a fault is propagated to the output or why a fault is not detected by fault-tolerant techniques. Based on the analysis results of the proposed method, it is possible to not only evaluate the system reliability but also identify weak points of fault-tolerant techniques by identifying undetected faults. The results can be reflected in the designs to improve the capability of fault-tolerant techniques.

  2. Fault Diagnosis and Fault Tolerant Control with Application on a Wind Turbine Low Speed Shaft Encoder

    DEFF Research Database (Denmark)

    Odgaard, Peter Fogh; Sardi, Hector Eloy Sanchez; Escobet, Teressa; Puig, Vicenc

    2015-01-01

    In recent years, individual pitch control has been developed for wind turbines, with the purpose of reducing blade and tower loads. Such algorithms depend on reliable sensor information. The azimuth angle sensor, which positions the wind turbine rotor in its rotation, is quite important. This...... tolerant control of wind turbines using a benchmark model. In this paper, the fault diagnosis scheme is improved and integrated with a fault accommodation scheme which enables and disables the individual pitch algorithm based on the fault detection. In this way, the blade and tower loads are not increased...... due to individual pitch control algorithm operating with faulty azimuth angle inputs. The proposed approach is evaluated on a wind turbine benchmark model, which is based on the FAST aero-elastic code provided by NREL....

  3. The optimization of global fault tolerant trajectory for redundant manipulator based on self-motion

    Directory of Open Access Journals (Sweden)

    Zhang Jian

    2015-01-01

    The redundancy of manipulators makes fault tolerant trajectory planning possible. Aiming at the completion of a specific task, an algorithm of global fault tolerant trajectory optimization for redundant manipulators based on self-motion is proposed in this paper. Firstly, the inverse kinematics equation of a single-redundancy manipulator based on the self-motion variable and the null-space velocity array of the Jacobian are analyzed. Secondly, the mathematical description of the fault tolerance criteria for the manipulator configuration is established, and the fault tolerance configuration group of the manipulator is obtained by iterative traversal under the fault tolerance criteria. Then, taking the joint limits into account and using minimum energy consumption as the optimization target, the global fault tolerant joint trajectory is obtained. Finally, a simulation of a 7 degree of freedom (DOF) manipulator is performed, by which the effectiveness of the algorithm is validated.

  4. Data center networks topologies, architectures and fault-tolerance characteristics

    CERN Document Server

    Liu, Yang; Veeraraghavan, Malathi; Lin, Dong; Hamdi, Mounir

    2013-01-01

    This SpringerBrief presents a survey of data center network designs and topologies and compares several properties in order to highlight their advantages and disadvantages. The brief also explores several routing protocols designed for these topologies and compares the basic algorithms to establish connections, the techniques used to gain better performance, and the mechanisms for fault-tolerance. Readers will be equipped to understand how current research on data center networks enables the design of future architectures that can improve performance and dependability of data centers. This con

  5. Implementing fault tolerance in a superconducting quantum circuit

    Science.gov (United States)

    Barends, Rami

    2015-03-01

    The surface code error correction scheme is appealing for superconducting circuits as the fundamental operations have been demonstrated at the fault-tolerant threshold. Here, we present experimental results on the repetition code, a one-dimensional primitive of the surface code which can detect bit-flip errors, implemented on a device consisting of nine Xmon transmon qubits. We discuss the basic mechanics of error detection, show preservation of a Greenberger-Horne-Zeilinger state, and show suppression of environmentally-induced error.
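
    As a purely classical analogue of the bit-flip detection described in this abstract, the following Python sketch computes parity checks between neighbouring bits of a repetition codeword. The five-bit codeword length and the function names are illustrative assumptions; the sketch does not model qubits, measurement, or the experimental device.

```python
def parity_checks(bits):
    """Parity of each adjacent pair of bits in a repetition codeword.

    In an error-free codeword all bits are equal, so every check is 0.
    A single flipped bit turns on the checks next to it.
    """
    return [bits[i] ^ bits[i + 1] for i in range(len(bits) - 1)]


def majority_decode(bits):
    """Recover the logical bit by majority vote (odd length assumed)."""
    return int(sum(bits) > len(bits) // 2)


# Hypothetical usage: logical 0 stored in five bits, middle bit flipped.
codeword = [0, 0, 0, 0, 0]
codeword[2] ^= 1
syndrome = parity_checks(codeword)  # the two checks around index 2 fire
logical = majority_decode(codeword)
```

    The checks reveal where a flip happened without reading the logical value directly, which is the property the quantum version relies on.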

  6. Formal Analysis of a Fault-Tolerant Routing Algorithm for a Network-on-Chip

    OpenAIRE

    ZHANG Zhen; Serwe, Wendelin; Wu, Jian; Yoneda, Tomohiro; Zheng, Hao; Myers, Chris

    2014-01-01

    A fault-tolerant routing algorithm in Network-on-Chip architectures provides adaptivity for on-chip communications. Adding fault-tolerance adaptivity to a routing algorithm increases its design complexity and makes it prone to deadlock and other problems if improperly implemented. Formal verification techniques are needed to check the correctness of the design. This paper performs formal analysis on an extension of the link-fault tolerant Network-on-Chip architecture introduced by Wu et al. t...

  7. Prognostics Enhancemend Fault-Tolerant Control with an Application to a Hovercraft Project

    Data.gov (United States)

    National Aeronautics and Space Administration — Fault-Tolerant Control (FTC) is an emerging area of engineering and scientific research that integrates prognostics, health management concepts and intelligent...

  8. Fault diagnosis and fault-tolerant finite control set-model predictive control of a multiphase voltage-source inverter supplying BLDC motor

    OpenAIRE

    Salehifar, Mehdi; Moreno Eguilaz, Juan Manuel

    2016-01-01

    Due to its fault tolerance, a multiphase brushless direct current (BLDC) motor can meet high reliability demand for application in electric vehicles. The voltage-source inverter (VSI) supplying the motor is subjected to open circuit faults. Therefore, it is necessary to design a fault-tolerant (FT) control algorithm with an embedded fault diagnosis (FD) block. In this paper, finite control set-model predictive control (FCS-MPC) is developed to implement the fault-tolerant control algorithm of...

  9. Fault detection in photovoltaic systems

    OpenAIRE

    Nilsson, David

    2014-01-01

    This master’s thesis concerns three different areas in the field of fault detection in photovoltaic systems. Previous studies have concerned homogeneous systems with a large set of parameters being observed, while this study is focused on a more restrictive case. The first problem is to discover immediate faults occurring in solar panels. A new online algorithm is developed based on similarity measures within a single installation. It performs reliably and is able to detect all significant fau...

  10. Row fault detection system

    Science.gov (United States)

    Archer, Charles Jens; Pinnow, Kurt Walter; Ratterman, Joseph D.; Smith, Brian Edward

    2010-02-23

    An apparatus and program product check for nodal faults in a row of nodes by causing each node in the row to concurrently communicate with its adjacent neighbor nodes in the row. The communications are analyzed to determine a presence of a faulty node or connection.
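
    One way to picture the neighbour-exchange check in this abstract is the small Python simulation below. The classification rule (a node is suspected when all of its links fail, a link is suspected when it fails between two otherwise responsive nodes) is an assumption made for illustration; the abstract does not describe how the communications are analyzed, and `check_row` is a hypothetical name.

```python
def check_row(n, dead_nodes=frozenset(), broken_links=frozenset()):
    """Simulate every node in a row of n nodes talking to its neighbours.

    Link i connects node i and node i + 1. Returns a pair
    (suspect_nodes, suspect_links) under the assumed rule above.
    """
    # An exchange over link i succeeds only if both endpoints are
    # alive and the link itself is intact.
    link_ok = [
        i not in dead_nodes and (i + 1) not in dead_nodes
        and i not in broken_links
        for i in range(n - 1)
    ]

    suspect_nodes = []
    for node in range(n):
        links = [l for l in (node - 1, node) if 0 <= l < n - 1]
        if links and all(not link_ok[l] for l in links):
            suspect_nodes.append(node)

    # A failed link with two responsive endpoints points at the link.
    suspect_links = [
        l for l, ok in enumerate(link_ok)
        if not ok and l not in suspect_nodes and (l + 1) not in suspect_nodes
    ]
    return suspect_nodes, suspect_links
```

    With this rule an end node has only one link, so a failure there cannot be attributed unambiguously to the node or to the connection; the sketch reports the node in that case.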

  11. 2009 fault tolerance for extreme-scale computing workshop, Albuquerque, NM - March 19-20, 2009.

    Energy Technology Data Exchange (ETDEWEB)

    Katz, D. S.; Daly, J.; DeBardeleben, N.; Elnozahy, M.; Kramer, B.; Lathrop, S.; Nystrom, N.; Milfeld, K.; Sanielevici, S.; Scott, S.; Votta, L.; Louisiana State Univ.; Center for Exceptional Computing; LANL; IBM; Univ. of Illinois; Shodor Foundation; Pittsburgh Supercomputer Center; Texas Advanced Computing Center; ORNL; Sun Microsystems

    2009-02-01

    This is a report on the third in a series of petascale workshops co-sponsored by Blue Waters and TeraGrid to address challenges and opportunities for making effective use of emerging extreme-scale computing. This workshop was held to discuss fault tolerance on large systems for running large, possibly long-running applications. The main point of the workshop was to have systems people, middleware people (including fault-tolerance experts), and applications people talk about the issues and figure out what needs to be done, mostly at the middleware and application levels, to run such applications on the emerging petascale systems, without having faults cause large numbers of application failures. The workshop found that there is considerable interest in fault tolerance, resilience, and reliability of high-performance computing (HPC) systems in general, at all levels of HPC. The only way to recover from faults is through the use of some redundancy, either in space or in time. Redundancy in time, in the form of writing checkpoints to disk and restarting at the most recent checkpoint after a fault that causes an application to crash/halt, is the most common tool used in applications today, but there are questions about how long this can continue to be a good solution as systems and memories grow faster than I/O bandwidth to disk. There is interest in both modifications to this, such as checkpoints to memory, partial checkpoints, and message logging, and alternative ideas, such as in-memory recovery using residues. We believe that systematic exploration of these ideas holds the most promise for the scientific applications community. Fault tolerance has been an issue of discussion in the HPC community for at least the past 10 years; but much like other issues, the community has managed to put off addressing it during this period. There is a growing recognition that as systems continue to grow to petascale and beyond, the field is approaching the point where we don't have any choice but to address this through R&D efforts.
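
    The checkpoint-to-disk and restart pattern that this report calls the most common tool can be sketched in a few lines of Python. The workload (a running sum), the checkpoint interval, the `fail_at` fault injector, and the function name are all illustrative assumptions; real HPC checkpointing involves coordinated, distributed state that this sketch does not attempt to model.

```python
import os
import pickle


def run_with_checkpoints(n_steps, ckpt_path, every=10, fail_at=None):
    """Sum the integers 0..n_steps-1, saving progress every `every` steps.

    If a checkpoint file exists, the run resumes from it instead of
    starting over. `fail_at` injects a simulated fault at that step.
    """
    if os.path.exists(ckpt_path):
        with open(ckpt_path, "rb") as f:
            step, total = pickle.load(f)
    else:
        step, total = 0, 0

    while step < n_steps:
        if fail_at is not None and step == fail_at:
            raise RuntimeError("simulated fault")
        total += step
        step += 1
        if step % every == 0:
            # Write to a temporary file and rename it, so a crash during
            # the write cannot leave a truncated checkpoint behind.
            tmp_path = ckpt_path + ".tmp"
            with open(tmp_path, "wb") as f:
                pickle.dump((step, total), f)
            os.replace(tmp_path, ckpt_path)
    return total
```

    A run that fails at step 25 with `every=10` loses only the work done since step 20; calling the function again with the same `ckpt_path` and no injected fault resumes from there.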

  12. Checkpoint and Replication Oriented Fault Tolerant Mechanism for MapReduce Framework

    Directory of Open Access Journals (Sweden)

    Yang Liu

    2013-09-01

    MapReduce is an emerging programming paradigm and an associated implementation for processing and generating big data which has been widely applied in data-intensive systems. In cloud environments, node and task failure is no longer accidental but a common feature of large-scale systems. In the MapReduce framework, although the rescheduling-based fault-tolerant method is simple to implement, it fails to fully consider the location of distributed data and the computation and storage overhead. Thus, a single node failure will increase the completion time dramatically. In this paper, a Checkpoint and Replication Oriented Fault Tolerant scheduling algorithm (CROFT) is proposed, which takes both task and node failure into consideration. Preliminary experiments show that, with less storage and network overhead, CROFT significantly reduces the completion time when failures occur, and the overall performance of MapReduce can be improved by at least 30% over the original mechanism in Hadoop.

  13. Wiring systems and fault finding

    CERN Document Server

    Scaddan, Brian

    1905-01-01

    This book deals with an area of practice which many students and non-electricians find particularly challenging. It explains how to interpret circuit diagrams, wiring systems and the principles and practice of testing and fault diagnosis. It will give the reader confidence to understand the principles of testing and to apply this knowledge to fault finding in electrical circuits. It is a handy reference for anybody who needs to be able to trace faults in circuits, whether in domestic, commercial or industrial settings. It will be a time-saver for all electricians, plumbers, heating engineers, t

  14. Decoherence-Free Subspaces for Multiple-Qubit Errors (II) Universal, Fault-Tolerant Quantum Computation

    CERN Document Server

    Lidar, D A; Kempe, J; Whaley, K B; Lidar, Daniel A.; Bacon, David; Kempe, Julia

    2001-01-01

    Decoherence-free subspaces (DFSs) shield quantum information from errors induced by the interaction with an uncontrollable environment. Here we study a model of correlated errors forming an Abelian subgroup (stabilizer) of the Pauli group (the group of tensor products of Pauli matrices). Unlike previous studies of DFSs, this type of errors does not involve any spatial symmetry assumptions on the system-environment interaction. We solve the problem of universal, fault-tolerant quantum computation on the associated class of DFSs.

  15. Separation of Fault Tolerance and Non-Functional Concerns: Aspect Oriented Patterns and Evaluation

    OpenAIRE

    Kashif Hameed; Rob Williams; Jim Smith

    2010-01-01

    Dependable computer based systems employing fault tolerance and robust software development techniques demand additional error detection and recovery related tasks. This results in tangling of core functionality with these cross cutting non-functional concerns. In this regard current work identifies these dependability related non-functional and cross-cutting concerns and proposes design and implementation solutions in an aspect oriented framework that modularizes and separates them from core...

  16. A Replication-Based Mechanism for Fault Tolerance in MapReduce Framework

    OpenAIRE

    Yang Liu; Wei Wei

    2015-01-01

    MapReduce is a programming model and an associated implementation for processing and generating large data sets with a parallel, distributed algorithm on a cluster. In cloud environment, node and task failure are no longer accidental but a common feature of large-scale systems. Current rescheduling-based fault tolerance method in MapReduce framework failed to fully consider the location of distributed data and the computation and storage overhead of rescheduling failure tasks. Thus, a single ...

  17. Energy Efficient, Delay Sensitive, Fault Tolerant Wireless Sensor Network for Military Monitoring

    OpenAIRE

    Ilker Bekmezci; Fatih Alagöz

    2009-01-01

    In this article, a new TDMA based wireless sensor network (WSN), MILMON, is proposed for military monitoring. The most important design considerations of MILMON are energy consumption, delay, scalability, and fault tolerance. There are three main components of the system: a new time synchronization schema based on the sink with a high range transmitter, hr-FTSP; data indicator slot mechanism, DISM; and a new distributed time-scheduling mechanism ft_DTSM. An analytic and simulation model has s...

  18. Fault-tolerant permanent-magnet synchronous machine drives: fault detection and isolation, control reconfiguration and design considerations

    OpenAIRE

    MEINGUET, Fabien

    2012-01-01

    The need for efficiency, reliability and continuous operation has led over the years to the development of fault-tolerant electrical drives for various industrial purposes and for transport applications. Permanent-magnet synchronous machines have also been gaining interest due to their high torque-to-mass ratio and high efficiency, which make them a very good candidate to reduce the weight and volume of the equipment. In this work, a multidisciplinary approach for the design of fault-tolerant...

  19. Fault-Tolerant Control of Wind Turbines using a Takagi-Sugeno Sliding Mode Observer

    International Nuclear Information System (INIS)

    In this paper, observer-based fault-tolerant control schemes for actuator and sensor faults are implemented within dynamic wind turbine simulations. The faults are directly reconstructed by means of a Takagi-Sugeno sliding mode observer. As simulation models, both a reduced-order model with 4 degrees of freedom and the aero-elastic code FAST by NREL are used. A fault-tolerant control scheme is set up by subtracting the reconstructed fault from the faulty control signal or sensor value, respectively. With these fault compensation schemes, the corrected controller behaviour is close to the fault-free case. The global stability of the controller in the full-load region in the presence of faults and with active fault compensation is shown by analysing the derivative of an appropriate Lyapunov function.

  20. Systematic fault tolerant control based on adaptive Thau observer estimation for quadrotor UAVs

    Directory of Open Access Journals (Sweden)

    Cen Zhaohui

    2015-03-01

    A systematic fault tolerant control (FTC) scheme based on fault estimation for a quadrotor actuator, which integrates normal control, active and passive FTC and fault parking, is proposed in this paper. Firstly, an adaptive Thau observer (ATO) is presented to estimate the quadrotor rotor fault magnitudes, and then faults with different magnitudes and time-varying natures are rated into corresponding fault severity levels based on the pre-defined fault-tolerant boundaries. Secondly, a systematic FTC strategy which can coordinate various FTC methods is designed to compensate for failures depending on the fault types and severity levels. Unlike former stand-alone passive FTC or active FTC, our proposed FTC scheme can compensate for faults in the manner of condition-based maintenance (CBM), and in particular considers the fatal failures that traditional FTC techniques cannot accommodate, in order to avoid the crashing of UAVs. Finally, various simulations are carried out to show the performance and effectiveness of the proposed method.

  1. Design and Bandwidth Analysis of Fault-Tolerant Multistage Interconnection Networks

    Directory of Open Access Journals (Sweden)

    R. Aggarwal

    2008-01-01

    The design of a suitable interconnection network for inter-processor communication is one of the key issues for system performance. In this study a new irregular interconnection network, IABN (Irregular Augmented Baseline), has been proposed. IABN is designed by modifying the existing ABN (Augmented Baseline Network). ABN is a regular multi-path network with limited fault tolerance. IABN provides three times more paths between any source-destination pair in comparison to ABN. The ABN and IABN MINs are analyzed and compared in terms of the performance parameters Bandwidth, Cost and Bandwidth per unit Cost. The proposed network IABN provides much better fault tolerance and almost double the bandwidth at the expense of slightly more cost than ABN.

  2. Fault tolerant workflow scheduling based on replication and resubmission of tasks in Cloud Computing

    Directory of Open Access Journals (Sweden)

    Jayadivya S K

    2012-06-01

    The aim of a workflow scheduling system is to schedule workflows within the user-given deadline to achieve a good success rate. A workflow is a set of tasks processed in a predefined order based on its data and control dependencies. Scheduling these workflows in a computing environment, such as a cloud environment, is an NP-Complete problem, and it becomes more challenging when failures of tasks are considered. To overcome these failures, the workflow scheduling system should be fault tolerant. In this paper, the proposed Fault Tolerant Workflow Scheduling algorithm (FTWS) provides fault tolerance by using replication and resubmission of tasks based on the priority of the tasks. The replication of tasks depends on a heuristic metric which is calculated by finding the tradeoff between the replication factor and the resubmission factor. The heuristic metric is considered because replication alone may lead to resource wastage and resubmission alone may increase makespan. Tasks are prioritized based on the criticality of the task, which is calculated using parameters such as out degree, earliest deadline and high resubmission impact. Priority helps in meeting the deadline of a task and thereby reducing wastage of resources. FTWS schedules workflows within a deadline even in the presence of failures without using any history of information. The experiments were conducted in a simulated cloud environment by scheduling workflows in the presence of randomly generated failures. The experimental results of the proposed work demonstrate an effective success rate in spite of various failures.

  3. An Evaluation of Fault Tolerant Wind Turbine Control Schemes applied to a Benchmark Model

    DEFF Research Database (Denmark)

    Odgaard, Peter Fogh; Stoustrup, Jakob

    international competition on wind turbine fault tolerant control has been proposed. In this article the top three solutions from this wind fault tolerant control competition are introduced and evaluated. The evaluation presented in this paper shows that the winner of the competition performs very well on this...

  4. A TESTING FRAMEWORK FOR FAULT TOLERANT COMPOSITION OF TRANSACTIONAL WEB SERVICES

    Directory of Open Access Journals (Sweden)

    Deepali Diwase

    2012-12-01

    Software testers face great challenges in testing web services; therefore, testing techniques must be developed for them. Web service composition has been an active research area over the last few years. This paper proposes a framework for testing fault tolerant composition of web services. It tolerates faults during composition of web services. Exception handling and transaction techniques are used as fault handling mechanisms. After composition, web services are deployed on a WS-BPEL engine. The testing framework fetches the results of the composite web service from the WS-BPEL engine and checks whether the composed web service is fault tolerant and in a consistent state.

  5. Fault Detection for Nonlinear Systems

    DEFF Research Database (Denmark)

    Stoustrup, Jakob; Niemann, H.H.

    The paper describes a general method for designing fault detection and isolation (FDI) systems for nonlinear processes. For a rich class of nonlinear systems, a nonlinear FDI system can be designed using convex optimization procedures. The proposed method is a natural extension of methods based on...

  6. Fault-tolerant topology in the wireless sensor networks for energy depletion and random failure

    International Nuclear Information System (INIS)

    Nodes in the wireless sensor networks (WSNs) are prone to failure due to energy depletion and poor environment, which could have a negative impact on the normal operation of the network. In order to solve this problem, in this paper, we build a fault-tolerant topology which can effectively tolerate energy depletion and random failure. Firstly, a comprehensive failure model about energy depletion and random failure is established. Then an improved evolution model is presented to generate a fault-tolerant topology, and the degree distribution of the topology can be adjusted. Finally, the relation between the degree distribution and the topological fault tolerance is analyzed, and the optimal value of evolution model parameter is obtained. Then the target fault-tolerant topology which can effectively tolerate energy depletion and random failure is obtained. The performances of the new fault tolerant topology are verified by simulation experiments. The results show that the new fault tolerant topology effectively prolongs the network lifetime and has strong fault tolerance. (general)

  7. Synthesizing Logic in Fault-Tolerant Quantum Computers

    Science.gov (United States)

    Jones, Cody

    2014-03-01

    Quantum computers hold the promise of solving problems believed to be intractable using conventional computation, but this potential is impeded by the apparent difficulty in engineering reliable quantum hardware. One solution is quantum error correction (QEC), which enables fault-tolerant computation at the expense of a sizable overhead in qubits and gates. In this talk, I discuss several recent advancements in QEC to reduce the resource overhead in contemporary error-correction schemes like the surface code. Quantum logic can be encoded into so-called ``magic states,'' and the burden of error correction is shifted to verifying a well-characterized state, instead of protecting an arbitrary quantum process from errors. I discuss some of the recent work in magic-state distillation and its extensions to multi-qubit gates like Toffoli, which are ubiquitous in quantum algorithms. For operations in the surface code, resource overheads are improved by as much as two orders of magnitude.

  8. Multiple Dimensional Fault Tolerant Schemes for Crypto Stream Ciphers

    Directory of Open Access Journals (Sweden)

    Chang N. Zhang

    2010-07-01

    Full Text Available To enhance the security and reliability of the widely used stream ciphers, 2-D and 3-D mesh-knight Algorithm Based Fault Tolerant (ABFT) schemes for stream ciphers are developed which can be universally applied to RC4 and other stream ciphers. Based on the ready-made arithmetic unit in stream ciphers, the proposed 2-D ABFT scheme is able to detect and correct any simple error, and the 3-D mesh-knight ABFT scheme is capable of detecting and correcting up to three errors in an n^2-data matrix with linear computation and bandwidth overhead. The proposed schemes provide a one-to-one mapping between data index and checksum group so that errors can be located and recovered by easier logic and simple operations.
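
    The idea of locating an error from checksum groups can be illustrated with a generic 2-D row/column checksum. The sketch below is not the paper's mesh-knight scheme: it assumes XOR checksums over a rectangular matrix of integers, and the function names are hypothetical.

```python
from functools import reduce
from operator import xor


def checksums(matrix):
    """Return (row_sums, col_sums): XOR checksums of a rectangular integer matrix."""
    rows = [reduce(xor, row, 0) for row in matrix]
    cols = [reduce(xor, col, 0) for col in zip(*matrix)]
    return rows, cols


def detect_and_correct(matrix, rows, cols):
    """Locate and repair one corrupted element in place.

    Returns the (row, col) index that was repaired, or None when the data
    matches the stored checksums. Raises ValueError when the mismatch
    pattern is not that of a single-element error.
    """
    new_rows, new_cols = checksums(matrix)
    bad_rows = [i for i, (a, b) in enumerate(zip(rows, new_rows)) if a != b]
    bad_cols = [j for j, (a, b) in enumerate(zip(cols, new_cols)) if a != b]
    if not bad_rows and not bad_cols:
        return None
    if len(bad_rows) != 1 or len(bad_cols) != 1:
        raise ValueError("not a single-element error")
    i, j = bad_rows[0], bad_cols[0]
    # Stored XOR recomputed row checksum gives the error pattern itself.
    matrix[i][j] ^= rows[i] ^ new_rows[i]
    return i, j
```

    One failing row checksum and one failing column checksum intersect at exactly one element, which is why a single error can be both located and repaired with two small checksum vectors.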

  9. Fault Tolerant Distributed and Fixed Hierarchical Mobile IP

    Directory of Open Access Journals (Sweden)

    Paramesh C. Upadhyay

    2010-04-01

    Full Text Available For the several mobility management protocols proposed for IP-based mobile networks, fault tolerance of the mobility agents is a primary requirement to sustain continuous service availability to the mobile hosts. For a localized or micro-mobility management solution, the local mobility agent, i.e. the gateway, is a single point of failure because it is responsible for enforcing the signaling and data packets in its domain. Such failures may severely disrupt communications among the failure-affected users. The problem becomes even more severe for mobility agents in a distributed mobility management scheme with overlapping registration areas. This paper proposes a fault tolerance scheme for Distributed and Fixed Hierarchical Mobile IP (DFHMIP) and evaluates its performance in terms of data transmission cost and blocking probability.

  10. Adaptive Fault Tolerant Routing Algorithm for Tree-Hypercube Multicomputer

    Directory of Open Access Journals (Sweden)

    Qatawneh Mohammad

    2006-01-01

    Full Text Available A connected tree-hypercube with faulty links and/or nodes is called an injured tree-hypercube. To enable any non-faulty node to communicate with any other non-faulty node in an injured tree-hypercube, the information on component failures has to be made available to non-faulty nodes so that they can route messages around the faulty components. We propose an adaptive fault tolerant routing algorithm for an injured tree-hypercube which requires each node to know only the condition of its own links. This routing algorithm is shown to be capable of routing messages successfully in an injured tree-hypercube as long as the number of faulty components (links and/or nodes) is equal to d (depth).

  11. Certifying qubit operations below the fault tolerance threshold

    CERN Document Server

    Blume-Kohout, Robin; Nielsen, Erik; Rudinger, Kenneth; Mizrahi, Jonathan; Fortier, Kevin; Maunz, Peter

    2016-01-01

    Quantum information processors promise fast algorithms for problems inaccessible to classical computers. But since qubits are noisy and error-prone, they will depend on fault-tolerant quantum error correction (FTQEC) to compute reliably. Quantum error correction can protect against general noise if -- and only if -- the error in each physical qubit operation is smaller than a certain threshold. The threshold for general errors is quantified by their diamond norm. Until now, qubits have been assessed primarily by randomized benchmarking (RB), which reports a different "error rate" that is not sensitive to all errors, cannot be compared directly to diamond norm thresholds, and cannot efficiently certify a qubit for FTQEC. We use gate set tomography (GST) to completely characterize the performance of a trapped-Yb$^+$-ion qubit and certify it rigorously as suitable for FTQEC by establishing that its diamond norm error rate is less than $6.7\times10^{-4}$ with $95\%$ confidence.

  12. Optimising Model for Memory Fault Tolerance in Onboard Computer

    Directory of Open Access Journals (Sweden)

    Suresh V. Mathew

    2002-01-01

    Full Text Available This paper presents an optimising model for integrating the traditional reliability prediction methodology with simple analytical techniques to facilitate the designer to decide upon the memory fault-tolerant choices of an onboard computer. In this exercise, the hardware reliability estimates of a circuit without any error correction as well as that of a circuit with error detection and correction were calculated. The failure rates of each component and soldering have been accounted for in these prediction procedures. A suitable probability distribution is chosen for data errors and is analytically combined with the hardware reliability predictions to study the trade-offs. An optimum strategy for introducing the hardware error correction logic in the circuit is presented.

  13. Fault-Tolerant Energy-Efficient Tree in Dynamic WSNs

    Directory of Open Access Journals (Sweden)

    Tarek Moulahi

    2013-04-01

    Full Text Available Broadcasting is of major importance in Wireless Sensor Networks (WSNs). The sink node has to periodically collect data from the environment supervised by the sensors. To perform this operation, it sends requests to all nodes. Furthermore, WSNs behave dynamically because they evolve over time. At any time, a node can be removed from the network due to exhausted energy or a node problem. In fact, WSNs are prone to failures such as software or hardware malfunctioning, exhaustion of energy, wireless interference and environmental hazards. Thus, an appropriate broadcasting method should take this aspect into consideration and use the least possible amount of energy to accomplish the task. In this paper, a robust tree-based scheme called Robust Tree Broadcasting (RTB) is proposed. The new scheme has a load-balanced behaviour which induces an efficient use of energy. In addition, RTB has high-quality fault tolerant performance.

  14. Design of Parity Preserving Logic Based Fault Tolerant Reversible Arithmetic Logic Unit

    Directory of Open Access Journals (Sweden)

    Rakshith Saligram

    2013-06-01

    Full Text Available Reversible logic is gaining significant consideration as a potential logic design style for implementation in modern nanotechnology and quantum computing with minimal impact on physical entropy. Fault tolerant reversible logic is one class of reversible logic that maintains the parity of the inputs and the outputs. Significant contributions have been made in the literature towards the design of fault tolerant reversible logic gate structures and arithmetic units; however, there have not been many efforts directed towards the design of fault tolerant reversible ALUs. The Arithmetic Logic Unit (ALU) is the prime performing unit in any computing device and it has to be made fault tolerant. In this paper we aim to design one such fault tolerant reversible ALU that is constructed using parity preserving reversible logic gates. The designed ALU can generate up to seven arithmetic operations and four logical operations.
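
    The parity-preserving property can be checked exhaustively for small gates. The sketch below uses the well-known Fredkin (controlled-swap) gate as an example of a parity-preserving reversible gate and the Toffoli gate as a contrast. The abstract does not say which gates the ALU uses, so the choice of gates and the helper names are assumptions.

```python
from itertools import product


def fredkin(c, a, b):
    """Controlled swap: exchange a and b when the control bit c is 1."""
    return (c, b, a) if c else (c, a, b)


def toffoli(a, b, c):
    """Controlled-controlled NOT: flip c when both a and b are 1."""
    return (a, b, c ^ (a & b))


def parity(bits):
    """Return 0 for an even number of 1 bits, 1 for an odd number."""
    return sum(bits) % 2


def is_reversible(gate, width):
    """A gate is reversible when it maps the 2**width inputs to distinct outputs."""
    outputs = {gate(*bits) for bits in product((0, 1), repeat=width)}
    return len(outputs) == 2 ** width


def is_parity_preserving(gate, width):
    """True when input parity equals output parity for every input vector."""
    return all(
        parity(bits) == parity(gate(*bits))
        for bits in product((0, 1), repeat=width)
    )
```

    Both gates are reversible, but only the Fredkin gate keeps input and output parity equal. With a parity-preserving gate, a single bit error at the outputs shows up as a parity mismatch.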

  15. Network on Chip-based Fault Tolerant Routing Algorithm and Its Implementation

    Directory of Open Access Journals (Sweden)

    Shanshan Jiang

    2013-12-01

    Full Text Available In this paper, a new fault-tolerant routing algorithm is presented in order to effectively improve the fault-tolerant performance of NoC. Based on the classical XY dimension routing algorithm, this design realizes a fault-tolerant routing algorithm for a single routing error by increasing its adaptability, and maintains the advantages of the XY routing algorithm, such as simplicity, low hardware overhead, and scalability. A 3*3 2D-mesh NoC is then simulated on the ISE Design Suite 14.1 platform. Experimental results show that the proposed fault-tolerant routing algorithm can complete the functions of routing data forwarding and tolerance of a single fault on the NoC.
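
    A minimal sketch of dimension-order routing with one fallback follows. It is an illustration only, not the paper's algorithm: it assumes nodes addressed as (x, y) pairs, faulty links given as unordered node pairs, and a simple XY-then-YX retry. All function names are hypothetical.

```python
def dimension_order_path(src, dst, x_first=True):
    """Return the nodes visited by XY (or YX) dimension-order routing."""
    x, y = src
    path = [(x, y)]

    def walk_x():
        nonlocal x
        while x != dst[0]:
            x += 1 if dst[0] > x else -1
            path.append((x, y))

    def walk_y():
        nonlocal y
        while y != dst[1]:
            y += 1 if dst[1] > y else -1
            path.append((x, y))

    for walk in ((walk_x, walk_y) if x_first else (walk_y, walk_x)):
        walk()
    return path


def uses_faulty_link(path, faulty_links):
    """True when any hop of the path crosses a link marked as faulty."""
    return any(frozenset(hop) in faulty_links for hop in zip(path, path[1:]))


def route(src, dst, faulty_links=frozenset()):
    """Try XY order first, then YX; return None when both are blocked."""
    for x_first in (True, False):
        path = dimension_order_path(src, dst, x_first)
        if not uses_faulty_link(path, faulty_links):
            return path
    return None
```

    This fallback is weaker than a full adaptive scheme. When source and destination share a row or column, the XY and YX paths coincide, so a fault on that path is not avoided. Deadlock freedom is not addressed either.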

  16. Observer-Based Fault Estimation and Accommodation for Dynamic Systems

    CERN Document Server

    Zhang, Ke; Shi, Peng

    2013-01-01

    Due to the increasing security and reliability demands of actual industrial process control systems, the study of fault diagnosis and fault tolerant control of dynamic systems has received considerable attention. Fault accommodation (FA) is one of the effective methods that can be used to enhance system stability and reliability, so it has been widely investigated in depth and has become a hot topic in recent years. Fault detection is used to monitor whether a fault occurs, which is the first step in FA. On the basis of fault detection, fault estimation (FE) is utilized to determine online the magnitude of the fault, which is a very important step because the additional controller is designed using the fault estimate. Compared with fault detection, the design difficulties of FE increase considerably, so research on FE and accommodation is very challenging. Although there have been advancements reported on FE and accommodation for dynamic systems, the common methods at the present stage have design difficulties, whi...

  17. Remedial brushless AC operation of fault-tolerant doubly salient permanent-magnet motor drives

    OpenAIRE

    Zhao, W.; Chau, KT; Cheng, M.; Ji, J.; Zhu, X.

    2010-01-01

    The doubly salient permanent-magnet (DSPM) machine is a new class of stator-PM brushless machines, which inherently offers the fault-tolerant feature. In this paper, a new operation strategy is proposed and implemented for fault-tolerant DSPM motor drives. The key is to operate the DSPM motor drive in a remedial brushless ac (BLAC) mode under the open-circuit fault condition, while operating in the conventional brushless dc mode under normal condition. Both cosimulation and experimental resul...

  18. A TESTING FRAMEWORK FOR FAULT TOLERANT COMPOSITION OF TRANSACTIONAL WEB SERVICES

    OpenAIRE

    Deepali Diwase; Pujashree Vidap

    2012-01-01

    Software testers face great challenges in testing web services; therefore, techniques must be developed for testing them. Web service composition has been an active research area over the last few years. This paper proposes a framework for testing fault tolerant composition of web services. It tolerates faults during the composition of web services. Exception handling and transaction techniques are used as fault handling mechanisms. After composition, the web services are deployed on WS-...

  19. Fault-tolerant digital microfluidic biochips compilation and synthesis

    CERN Document Server

    Pop, Paul; Stuart, Elena; Madsen, Jan

    2016-01-01

    This book describes for researchers in the fields of compiler technology, design and test, and electronic design automation the new area of digital microfluidic biochips (DMBs), and thus offers a new application area for their methods. The authors present a routing-based model of operation execution, along with several associated compilation approaches, which progressively relax the assumption that operations execute inside fixed rectangular modules. Since operations can experience transient faults during the execution of a bioassay, the authors show how to use both offline (design time) and online (runtime) recovery strategies. The book also presents methods for the synthesis of fault-tolerant application-specific DMB architectures. · Presents the current models used for the research on compilation and synthesis techniques of DMBs in a tutorial fashion; · Includes a set of “benchmarks”, which are presented in great detail and includes the source code of most of the t...

  20. Review of fault diagnosis and fault-tolerant control for modular multilevel converter of HVDC

    DEFF Research Database (Denmark)

    Liu, Hui; Loh, Poh Chiang; Blaabjerg, Frede

    This review focuses on faults in Modular Multilevel Converter (MMC) for use in high voltage direct current (HVDC) systems by analyzing the vulnerable spots and failure mechanism from device to system and illustrating the control & protection methods under failure condition. At the beginning...

  1. Expert System Detects Power-Distribution Faults

    Science.gov (United States)

    Walters, Jerry L.; Quinn, Todd M.

    1994-01-01

    Autonomous Power Expert (APEX) computer program is prototype expert-system program detecting faults in electrical-power-distribution system. Assists human operators in diagnosing faults and deciding what adjustments or repairs needed for immediate recovery from faults or for maintenance to correct initially nonthreatening conditions that could develop into faults. Written in Lisp.

  2. Second-order sliding mode fault-tolerant control of heat recovery steam generator boiler in combined cycle power plants

    International Nuclear Information System (INIS)

    Power generation plants are intrinsically complex systems due to their numerous internal components. Higher energy efficiency in power plants is now achieved through employing combined cycles. In this article, an adaptive robust Sliding Mode Controller (SMC) is designed to overcome the faults in Heat Recovery Steam Generator boilers (HRSG boilers) as one of the main parts of a combined cycle plant. When a fault occurs in the HRSG boiler, the control system must be able to reconfigure its parameters to maintain the admissible thresholds in dynamic variables such as drum pressure, steam temperature, and drum water level. To achieve good performance for the boiler, the proposed adaptive robust SMC must overcome the effects of faults and uncertainties by estimating their upper bounds adaptively, and force the outputs of the multivariable boiler to track the outputs of a desired multivariable reference model. By manipulating a suitable control input and using a second-order sliding mode control strategy, the output tracking error slides to zero on a PID sliding surface. Besides tracking, the controlled boiler tolerates faults in the system matrix, faults in the input matrix, and external disturbance signals. Numerical simulations confirm the effectiveness of the proposed FTC (Fault-Tolerant Control) system for an uncertain non-minimum phase HRSG boiler. Highlights: ► This paper proposes a PID-based adaptive second-order sliding mode controller (SMC). ► SMC is robust to actuator and sensor faults and tracks outputs of a reference system. ► SMC is used in fault tolerant control of heat recovery steam generator boilers. ► Boiler and reference system have different numbers of states and inputs. ► Performance of SMC is investigated with different fault scenarios in simulations.

  3. Fault Diagnosis for Electrical Distribution Systems using Structural Analysis

    DEFF Research Database (Denmark)

    Knüppel, Thyge; Blanke, Mogens; Østergaard, Jacob

    Fault-tolerance in electrical distribution relies on the ability to diagnose possible faults and determine which components or units cause a problem or are close to doing so. Faults include defects in instrumentation, power generation, transformation and transmission. The focus of this paper is the...... structure graph. This paper shows how three-phase networks are modelled and analysed using structural methods, and it extends earlier results by showing how physical faults can be identified such that adequate remedial actions can be taken. The paper illustrates a feasible modelling technique for structural...... analysis of power systems, it demonstrates detection and isolation of failures in a network, and shows how typical faults are diagnosed. Nonlinear fault simulations illustrate the results....

  4. A performance evaluation of the software-implemented fault-tolerance computer

    Science.gov (United States)

    Palumbo, D. L.; Butler, R. W.

    1986-01-01

    The results of a performance evaluation of the Software-Implemented Fault-Tolerance (SIFT) computer system conducted in the NASA Avionics Integration Research Laboratory are presented. The essential system functions are described and compared to both earlier design proposals and subsequent design improvements. Using SIFT's specimen task load, the executive tasks, such as reconfiguration, clock synchronization, and interactive consistency, are found to consume significant computing resources. Together with other system overhead (e.g., voting and scheduling), the operating system overhead is in excess of 60 percent. The authors propose specific design changes that reduce this overhead burden significantly.

  5. Fault tolerant satellite attitude control using solar radiation pressure based on nonlinear adaptive sliding mode

    Science.gov (United States)

    Varma, S.; Kumar, K. D.

    2010-02-01

    An adaptive fault tolerant nonlinear control design based on the theory of sliding mode is proposed to control the attitude of a satellite using solar radiation pressure. The system comprises a satellite with two oppositely placed solar flaps. The nonlinear model describing the system is used to derive an adaptive fault tolerant control law, based on the Lyapunov stability theorem, in the presence of unknown, slowly varying satellite mass distribution and solar parameter. Using this control law, the solar flaps are suitably rotated to achieve the desired satellite attitude performance. The detailed numerical simulation of the governing nonlinear equations of motion, including the effects of various system parameters on the controller performance, establishes the feasibility of the proposed adaptive control strategy in comparison with sliding mode control without adaptation. This paper also examines several scenarios including sudden failure of one of the solar flaps, occurrence of an abrupt blockage of one of the rotating solar flaps, and occurrence of a periodic actuator fault. The numerical results show the robustness of the proposed adaptive control scheme in controlling the satellite attitude in the presence of external disturbances as well as in the event of failure of one of the solar flaps.

  6. Implementation Of High Reliable Fine Grain Fault Tolerance Redundant Technique For FPGA

    Directory of Open Access Journals (Sweden)

    M.J.C.prasad

    2013-11-01

    Full Text Available SRAM-based FPGAs are attractive for space applications because of their flexibility and reprogrammability. As technology sizes decrease to the nanometer scale, SRAM-based FPGAs become more susceptible to radiation. Radiation effects can cause transient or permanent bit flips in SRAM cells and thereby change the function of logic elements within FPGAs. Fault-masking methodologies are essential, because it is vital for the system to always work properly irrespective of the various faults that occur in complex digital circuitry. For this reason, redundancy techniques that target fault masking and fault tolerance are in our scope. In this project we propose Quadruple Force Decide Redundancy (QFDR), a new approach to fault tolerance for mitigating problems in digital circuits, since simply replicating complete systems with the Triple Modular Redundancy (TMR) technique may no longer be sufficient, especially in space applications, where the failure rate increases because a second fault can occur before the first one is recovered. QFDR makes SRAM-based FPGAs effectively immune to Single Event Upsets (SEUs). The proposed QFDR operates at the abstraction level of the CLBs of an FPGA. QFDR is a redundant logical structure which quadruplicates logical functions and defines two different rules, Force and Decide, for different quadruple logic functions based on their level in the design, and then connects them together using special connection patterns. The complete logic of QFDR is implemented in VHDL. ModelSim Xilinx Edition (MXE) will be used for simulation and functional verification. Xilinx ISE will be used for synthesis. A Xilinx FPGA board will be used for testing and demonstration of the implemented system.

  7. Fault tolerant small satellite attitude control using adaptive non-singular terminal sliding mode

    Science.gov (United States)

    Cao, Lu; Chen, XiaoQian; Sheng, Tao

    2013-06-01

    The Attitude Control System (ACS) plays a pivotal role in the overall performance of a spacecraft in orbit; therefore, it is vitally important to design the control system for rapid response, high control precision and insensitivity to external perturbations. First, this paper proposes two adaptive nonlinear control algorithms based on sliding mode control (SMC), which are designed for a small satellite attitude control system. The nonlinear dynamics describing the attitude of the small satellite are considered in a circular reference orbit, and the stability of the closed-loop system in the presence of external perturbations is investigated. Then, in order to account for accidental or degradation faults in satellite actuators, fault-tolerant control schemes are presented. Hence, two adaptive fault-tolerant control laws (continuous sliding mode control and non-singular terminal sliding mode control) are developed by adopting the nonlinear analytical model to describe the system, which can guarantee global asymptotic convergence of the attitude control error in the presence of unknown external perturbations. The nonlinear hyperplane based terminal sliding mode is introduced into the control law design; therefore, the system convergence performance improves and the control error converges in "finite time". As a result, the study emphasizes non-singular terminal sliding mode control, and continuous sliding mode control is used for comparison. Meanwhile, an adaptive fuzzy algorithm is proposed to suppress the chattering phenomenon. Moreover, several numerical examples are presented to demonstrate the efficacy of the proposed controllers in correcting for the external perturbations. Simulation results confirm that the suggested methodologies yield high control precision. In addition, actuator degradation, actuator stuck and actuator failure for a period of time are simulated to demonstrate the fault recovery capability of the fault tolerant controllers. The numerical results clearly demonstrate the good performance of the adaptive non-singular terminal control in the event of actuator fault compared with the continuous sliding mode control.

  8. Comparing fault susceptibility of multiple ISAs and operating systems

    Science.gov (United States)

    Chyłek, Sławomir

    2015-09-01

    This paper presents research that aims to compare the effects of faults on different configurations of computer systems. The study compares the susceptibility to faults of the x86, AMD64, ARM, PowerPC and MIPS architectures and the Linux, FreeBSD and Minix operating systems. An emulation-based software-implemented fault injection technique was used to perform the experiments. A discussion of how to choose an adequate number of tests is followed by a report of the collected results, in which multiple aspects of the test runs were analyzed: providing a correct computation result, availability of the system under test, and error messages. The research makes it possible to determine the fault susceptibility characteristics of each platform and is a first step towards designing new fault tolerance solutions and assessing their effectiveness.

  9. On the Practicality of Intrinsic Reconfiguration As a Fault Recovery Method in Analog Systems

    OpenAIRE

    Greenwood, Garrison W.

    2004-01-01

    Evolvable hardware combines the powerful search capability of evolutionary algorithms with the flexibility of reprogrammable devices, thereby providing a natural framework for reconfiguration. This framework has generated an interest in using evolvable hardware for fault-tolerant systems because reconfiguration can effectively deal with hardware faults whenever it is impossible to provide spares. But systems cannot tolerate faults indefinitely, which means reconfiguration does have a deadline...

  10. SIFT - A preliminary evaluation. [Software Implemented Fault Tolerant computer for aircraft control

    Science.gov (United States)

    Palumbo, D. L.; Butler, R. W.

    1983-01-01

    This paper presents the results of a performance evaluation of the SIFT computer system conducted in the NASA AIRLAB facility. The essential system functions are described and compared to both earlier design proposals and subsequent design improvements. The functions supporting fault tolerance are found to consume significant computing resources. With SIFT's specimen task load, scheduled at a 30-Hz rate, the executive tasks such as reconfiguration, clock synchronization and interactive consistency, require 55 percent of the available task slots. Other system overhead (e.g., voting and scheduling) use an average of 50 percent of each remaining task slot.
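
    The two overhead figures above can be combined into a rough total. The calculation below is an illustration under two assumptions that the abstract does not state: all task slots are of equal length, and the 50 percent figure applies uniformly to every slot left after the executive tasks.

```latex
% e: fraction of task slots used by executive tasks (from the abstract)
% o: fraction of each remaining slot used by other overhead (from the abstract)
\begin{align*}
  e &= 0.55, \qquad o = 0.50 \\
  \text{application share} &= (1 - e)(1 - o) = 0.45 \times 0.50 = 0.225 \\
  \text{total overhead}    &= 1 - 0.225 = 0.775
\end{align*}
```

    Under these assumptions, roughly 22.5 percent of the frame would be left for application work.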

  11. AVR microcontroller simulator for software implemented hardware fault tolerance algorithms research

    Science.gov (United States)

    Piotrowski, Adam; Tarnowski, Szymon; Napieralski, Andrzej

    2008-01-01

    Reliability of new, advanced electronic systems becomes a serious problem, especially in places like accelerators and synchrotrons, where sophisticated digital devices operate close to radiation sources. One of the possible solutions to harden a microprocessor-based system is a strict programming approach known as Software Implemented Hardware Fault Tolerance. Unfortunately, in real environments it is not possible to perform precise and accurate tests of new algorithms due to hardware limitations. This paper highlights the AVR-family microcontroller simulator project, equipped with appropriate monitoring and SEU injection systems.

  12. Fault-tolerant ancilla preparation and noise threshold lower bounds for the 23-qubit Golay code

    CERN Document Server

    Paetznick, Adam

    2011-01-01

    In fault-tolerant quantum computing schemes, the overhead is often dominated by the cost of preparing codewords reliably. This cost generally increases quadratically with the block size of the underlying quantum error-correcting code. In consequence, large codes that are otherwise very efficient have found limited fault-tolerance applications. Fault-tolerant preparation circuits therefore are an important target for optimization. We study the Golay code, a 23-qubit quantum error-correcting code that protects the logical qubit to a distance of seven. In simulations, even using a naive ancilla preparation procedure, the Golay code is competitive with other codes both in terms of overhead and the tolerable noise threshold. We provide two simplified circuits for fault-tolerant preparation of Golay code-encoded ancillas. The new circuits minimize error propagation, reducing the overhead by roughly a factor of four compared to standard encoding circuits. By adapting the malignant set counting technique to depolariz...

  13. LQCD workflow execution framework: Models, provenance and fault-tolerance

    International Nuclear Information System (INIS)

    Large computing clusters used for scientific processing suffer from systemic failures when operated over long continuous periods for executing workflows. Diagnosing job problems and faults leading to eventual failures in this complex environment is difficult, specifically when the success of an entire workflow might be affected by a single job failure. In this paper, we introduce a model-based, hierarchical, reliable execution framework that encompasses workflow specification, data provenance, execution tracking and online monitoring of each workflow task, also referred to as participants. The sequence of participants is described in an abstract parameterized view, which is translated into a concrete data dependency based sequence of participants with defined arguments. As participants belonging to a workflow are mapped onto machines and executed, periodic and on-demand monitoring of vital health parameters on allocated nodes is enabled according to pre-specified rules. These rules specify conditions that must be true pre-execution, during execution and post-execution. Monitoring information for each participant is propagated upwards through the reflex and healing architecture, which consists of a hierarchical network of decentralized fault management entities, called reflex engines. They are instantiated as state machines or timed automata that change state and initiate reflexive mitigation action(s) upon occurrence of certain faults. We describe how this cluster reliability framework is combined with the workflow execution framework using formal rules and actions specified within a structure of first-order predicate logic that enables a dynamic management design that reduces manual administrative workload and increases cluster productivity.

  14. Fault tolerance in a supercomputer through dynamic repartitioning

    Science.gov (United States)

    Chen, Dong (Croton On Hudson, NY); Coteus, Paul W. (Yorktown Heights, NY); Gara, Alan G. (Mount Kisco, NY); Takken, Todd E. (Mount Kisco, NY)

    2007-02-27

    A multiprocessor, parallel computer is made tolerant to hardware failures by providing extra groups of redundant standby processors and by designing the system so that these extra groups of processors can be swapped with any group which experiences a hardware failure. This swapping can be under software control, thereby permitting the entire computer to sustain a hardware failure but, after swapping in the standby processors, to still appear to software as a pristine, fully functioning system.

  15. Development and evaluation of a Fault-Tolerant Multiprocessor (FTMP) computer. Volume 2: FTMP software

    Science.gov (United States)

    Lala, J. H.; Smith, T. B., III

    1983-01-01

    The software developed for the Fault-Tolerant Multiprocessor (FTMP) is described. The FTMP executive is a timer-interrupt-driven dispatcher that schedules iterative tasks which run at 3.125, 12.5, and 25 Hz. Major tasks which run under the executive include system configuration control, flight control, and display. The flight control task includes autopilot and autoland functions for a jet transport aircraft. System displays include status displays of all hardware elements (processors, memories, I/O ports, buses), failure log displays showing transient and hard faults, and an autopilot display. All software is in a higher order language (AED, an ALGOL derivative). The executive is a fully distributed general purpose executive which automatically balances the load among available processor triads. Provisions for graceful performance degradation under processing overload are an integral part of the scheduling algorithms.
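
    The three task rates are related by powers of two (25 Hz, 25/2 Hz, 25/8 Hz), so one way to picture such a dispatcher is a 25 Hz base frame in which slower tasks run every 2nd or 8th frame. The sketch below assumes that frame-counting scheme, which the abstract does not confirm; the assignment of task names to rates is also hypothetical.

```python
from typing import Dict, List

BASE_RATE_HZ = 25.0
# Number of base frames between successive runs of each rate group.
RATE_DIVIDERS: Dict[float, int] = {25.0: 1, 12.5: 2, 3.125: 8}


def due_rates(frame: int) -> List[float]:
    """Return the rate groups that are due in the given base frame."""
    return [rate for rate, div in RATE_DIVIDERS.items() if frame % div == 0]


def run(frames: int, tasks: Dict[str, float]) -> Dict[str, int]:
    """Simulate `frames` timer interrupts and count how often each task runs."""
    counts = {name: 0 for name in tasks}
    for frame in range(frames):
        due = due_rates(frame)
        for name, rate in tasks.items():
            if rate in due:
                counts[name] += 1
    return counts


# Hypothetical mapping of tasks to rate groups, for illustration only.
EXAMPLE_TASKS = {"flight_control": 25.0, "display": 12.5, "config_control": 3.125}
```

    Over 200 base frames (8 seconds at 25 Hz), the three example tasks would run 200, 100 and 25 times respectively.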

  16. Design and Analysis of Software fault-Tolerant techniques for Softcore processors in reliable SRAM based FPGA

    Directory of Open Access Journals (Sweden)

    Vatsya Tiwari

    2011-11-01

    This paper discusses high-level techniques for designing fault-tolerant systems in SRAM-based FPGAs without modifying the FPGA architecture. Triple Modular Redundancy (TMR) has been successfully applied in FPGAs to mitigate transient faults, which are likely to occur in space applications. However, TMR comes with high area and power dissipation penalties. The new technique proposed in this paper was specifically developed for FPGAs to cope with transient faults in the user combinational and sequential logic, while also reducing pin count, area and power dissipation. The methodology was validated by fault injection experiments on an emulation board. We present some fault coverage results and a comparison with the TMR approach.
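
    For reference, the TMR baseline mentioned above relies on a bitwise majority vote over three copies of a value. The sketch below shows only that standard voter, not the new technique the paper proposes; the helper that injects a single bit flip is a hypothetical stand-in for a transient upset.

```python
def tmr_vote(a: int, b: int, c: int) -> int:
    """Bitwise majority of three redundant copies of a word."""
    return (a & b) | (a & c) | (b & c)


def flip_bit(word: int, bit: int) -> int:
    """Model a transient upset by inverting one bit of a word."""
    return word ^ (1 << bit)
```

    A single upset in one copy is masked by the vote, while identical upsets in two copies defeat it. Together with the tripled logic, this limit is part of why alternatives with lower area and power cost are of interest.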

  17. Dual-quaternion based fault-tolerant control for spacecraft formation flying with finite-time convergence.

    Science.gov (United States)

    Dong, Hongyang; Hu, Qinglei; Ma, Guangfu

    2016-03-01

    This paper presents results on developing a control system for spacecraft formation proximity operations between a target and a chaser. In particular, a coupled model using dual quaternions is employed to describe the proximity problem of spacecraft formation, and a nonlinear adaptive fault-tolerant feedback control law is developed to enable the chaser spacecraft to track the position and attitude of the target even when actuator faults occur. The multiple-task capability of the proposed control system is further demonstrated in the presence of disturbances and parametric uncertainties. In addition, the practical finite-time stability of the closed-loop system is guaranteed theoretically under the designed control law. Numerical simulation of the proposed method is presented to demonstrate its advantages with respect to interference suppression, fast tracking, fault tolerance and practical finite-time stability. PMID:26775087

  18. Particle Filter Based Fault-tolerant ROV Navigation using Hydro-acoustic Position and Doppler Velocity Measurements

    DEFF Research Database (Denmark)

    Zhao, Bo; Blanke, Mogens; Skjetne, Roger

    particle filter. This particle filter is able to run in an asynchronous manner to accommodate the measurement drop out problem, and it overcomes the measurement outliers by switching observation models. Simulations with experimental data show that this fault tolerant navigation system can accurately estimate...

  19. ALLIANCE: An architecture for fault tolerant multi-robot cooperation

    International Nuclear Information System (INIS)

    ALLIANCE is a software architecture that facilitates the fault tolerant cooperative control of teams of heterogeneous mobile robots performing missions composed of loosely coupled, largely independent subtasks. ALLIANCE allows teams of robots, each of which possesses a variety of high-level functions that it can perform during a mission, to individually select appropriate actions throughout the mission based on the requirements of the mission, the activities of other robots, the current environmental conditions, and the robot's own internal states. ALLIANCE is a fully distributed, behavior-based architecture that incorporates the use of mathematically modeled motivations (such as impatience and acquiescence) within each robot to achieve adaptive action selection. Since cooperative robotic teams usually work in dynamic and unpredictable environments, this software architecture allows the robot team members to respond robustly, reliably, flexibly, and coherently to unexpected environmental changes and modifications in the robot team that may occur due to mechanical failure, the learning of new skills, or the addition or removal of robots from the team by human intervention. The feasibility of this architecture is demonstrated in an implementation on a team of mobile robots performing a laboratory version of hazardous waste cleanup.
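
    The role of impatience and acquiescence can be illustrated with a deliberately simplified sketch. It is not the motivation model published for ALLIANCE: it assumes a single task, a motivation level that grows slowly while another robot is working on the task and quickly otherwise, and a fixed number of ticks after which an active robot gives the task up. All names and parameters are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Motivation:
    fast_rate: float       # impatience growth per tick when nobody works on the task
    slow_rate: float       # impatience growth per tick while another robot is trying
    threshold: float       # level at which this robot activates the behavior
    acquiesce_after: int   # ticks of own effort before giving the task up
    level: float = 0.0
    ticks_active: int = 0
    active: bool = False

    def _reset(self) -> None:
        self.level, self.ticks_active, self.active = 0.0, 0, False

    def step(self, task_needed: bool, other_working: bool) -> bool:
        """Advance one tick; return True if this robot is performing the task."""
        if not task_needed:
            self._reset()
            return False
        if self.active:
            # Acquiescence: abandon the task after too long without finishing it.
            self.ticks_active += 1
            if self.ticks_active >= self.acquiesce_after:
                self._reset()
            return self.active
        # Impatience: motivation grows until the robot takes the task over.
        self.level += self.slow_rate if other_working else self.fast_rate
        if self.level >= self.threshold:
            self.active = True
            self.ticks_active = 0
        return self.active
```

    In this sketch a robot waits longer before taking over a task that a teammate is already attempting, but it does eventually take over if the task remains undone. This is one simple way to tolerate a teammate's mechanical failure without central coordination.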

  20. Fault-tolerant authenticated quantum dialogue using logical Bell states

    Science.gov (United States)

    Ye, Tian-Yu

    2015-09-01

    Two fault-tolerant authenticated quantum dialogue protocols are proposed in this paper by employing logical Bell states as the quantum resource, which combat the collective-dephasing noise and the collective-rotation noise, respectively. Each of the two proposed protocols can accomplish mutual identity authentication and the dialogue between two participants simultaneously and securely over one kind of collective-noise channel. In each of the two proposed protocols, the information transmitted through the classical channel is assumed to be eavesdroppable and modifiable. The key for choosing the measurement bases of sample logical qubits is pre-shared privately between the two participants. Bell state measurements, rather than four-qubit joint measurements, are adopted for decoding. The two participants share the initial states of the message logical Bell states by resorting to the direct transmission of auxiliary logical Bell states, so that the information leakage problem is avoided. The impersonation attack, the man-in-the-middle attack, the modification attack and the Trojan horse attacks from Eve are all detectable.