WorldWideScience
 
 
1

Fault tolerant computing systems  

International Nuclear Information System (INIS)

Fault tolerance involves the provision of strategies for error detection, damage assessment, fault treatment and error recovery. A survey is given of the different sorts of strategies used in highly reliable computing systems, together with an outline of recent research on the problems of providing fault tolerance in parallel and distributed computing systems. (orig.)

2

Fault Tolerant Real Time Systems  

Digital Repository Infrastructure Vision for European Research (DRIVER)

Real time systems are systems in which there is a commitment for timely response by the computer to external stimuli. Real time applications have to function correctly even in the presence of faults. Fault tolerance can be achieved by hardware, software or time redundancy. Safety-critical applications have strict time and cost constraints, which means that not only must faults be tolerated but the constraints must also be satisfied. Deadline scheduling means that t...

Persya, A. Christy; Nair, T. R. Gopalakrishnan

2010-01-01

3

Fault Tolerant Real Time Systems  

CERN Document Server

Real time systems are systems in which there is a commitment for timely response by the computer to external stimuli. Real time applications have to function correctly even in the presence of faults. Fault tolerance can be achieved by hardware, software or time redundancy. Safety-critical applications have strict time and cost constraints, which means that not only must faults be tolerated but the constraints must also be satisfied. Deadline scheduling means that the task with the earliest required response time is processed. The most common scheduling algorithms are Rate Monotonic (RM) and Earliest Deadline First (EDF). This paper deals with the interaction between the fault tolerant strategy and the EDF real time scheduling strategy.

Persya, A Christy

2010-01-01
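
The interaction between EDF and time redundancy mentioned in the abstract above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the paper's method: time is divided into unit slots on a single processor, deadlines are absolute, and a fault is modelled by re-executing the affected job once (doubling its execution demand). The names `edf_schedule` and `with_reexecution` and the job set are hypothetical.

```python
import heapq


def edf_schedule(jobs, horizon):
    """Simulate preemptive EDF on one processor in unit time slots.

    jobs: list of (name, release, wcet, absolute_deadline) tuples.
    Returns (timeline, missed): the job run in each slot (None when idle)
    and the names of jobs dropped because their deadline passed.
    """
    pending = sorted(jobs, key=lambda j: j[1])        # ordered by release time
    remaining = {name: wcet for name, _, wcet, _ in jobs}
    ready = []                                        # min-heap of (deadline, name)
    timeline, missed = [], []
    nxt = 0
    for t in range(horizon):
        # Admit every job released at or before the start of this slot.
        while nxt < len(pending) and pending[nxt][1] <= t:
            name, _, _, deadline = pending[nxt]
            heapq.heappush(ready, (deadline, name))
            nxt += 1
        # A job still unfinished at its deadline has missed it.
        while ready and ready[0][0] <= t:
            missed.append(heapq.heappop(ready)[1])
        if not ready:
            timeline.append(None)
            continue
        _, name = ready[0]                            # earliest deadline first
        remaining[name] -= 1
        timeline.append(name)
        if remaining[name] == 0:
            heapq.heappop(ready)
    return timeline, missed


def with_reexecution(jobs, faulty):
    """Assumed fault model: each faulty job is executed twice (time redundancy)."""
    return [(n, r, 2 * c if n in faulty else c, d) for n, r, c, d in jobs]


# Hypothetical job set: (name, release, wcet, absolute deadline).
JOBS = [("A", 0, 2, 4), ("B", 0, 3, 10), ("C", 3, 1, 5)]

print(edf_schedule(JOBS, 10))
print(edf_schedule(with_reexecution(JOBS, {"A"}), 10))
print(edf_schedule(with_reexecution(JOBS, {"A", "C"}), 10))
```

In this job set a single fault in A is absorbed by the slack in the schedule, whereas faults in both A and C make C miss its deadline, which is the kind of trade-off between redundancy and timing constraints the abstract refers to.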

4

Fault Tolerance in Real Time Distributed System  

Directory of Open Access Journals (Sweden)

In this paper we investigate the different techniques of fault tolerance which are used in many real time distributed systems. The main focus is on the types of fault occurring in the system, fault detection techniques and the recovery techniques used. A fault, whether caused by link failure, resource failure or any other reason, must be tolerated for the system to work smoothly and accurately. These faults can be detected and recovered from by many techniques, used accordingly. An appropriate fault detector can avoid loss due to system crash, and a reliable fault tolerance technique can save the system from failure. This paper describes how these methods are applied to detect and tolerate faults in various real time distributed systems.

Arvind Kumar

2011-02-01

5

Software fault tolerance in computer operating systems  

Science.gov (United States)

This chapter provides data and analysis of the dependability and fault tolerance for three operating systems: the Tandem/GUARDIAN fault-tolerant system, the VAX/VMS distributed system, and the IBM/MVS system. Based on measurements from these systems, basic software error characteristics are investigated. Fault tolerance in operating systems resulting from the use of process pairs and recovery routines is evaluated. Two levels of models are developed to analyze error and recovery processes inside an operating system and interactions among multiple instances of an operating system running in a distributed environment. The measurements show that the use of process pairs in Tandem systems, which was originally intended for tolerating hardware faults, allows the system to tolerate about 70% of defects in system software that result in processor failures. The loose coupling between processors which results in the backup execution (the processor state and the sequence of events occurring) being different from the original execution is a major reason for the measured software fault tolerance. The IBM/MVS system fault tolerance almost doubles when recovery routines are provided, in comparison to the case in which no recovery routines are available. However, even when recovery routines are provided, there is almost a 50% chance of system failure when critical system jobs are involved.

Iyer, Ravishankar K.; Lee, Inhwan

1994-01-01

6

Energy-efficient fault-tolerant systems  

CERN Document Server

This book describes the state-of-the-art in energy efficient, fault-tolerant embedded systems. It covers the entire product lifecycle of electronic systems design, analysis and testing and includes discussion of both circuit and system-level approaches. Readers will be enabled to meet the conflicting design objectives of energy efficiency and fault-tolerance for reliability, given the up-to-date techniques presented.

Mathew, Jimson; Pradhan, Dhiraj K

2013-01-01

7

Fault-tolerant software - Experiment with the SIFT operating system. [Software Implemented Fault Tolerance computer]  

Science.gov (United States)

Results are presented of an experiment conducted in the NASA Avionics Integrated Research Laboratory (AIRLAB) to investigate the implementation of fault-tolerant software techniques on fault-tolerant computer architectures, in particular the Software Implemented Fault Tolerance (SIFT) computer. The N-version programming and recovery block techniques were implemented on a portion of the SIFT operating system. The results indicate that, to effectively implement fault-tolerant software design techniques, system requirements will be impacted and suggest that retrofitting fault-tolerant software on existing designs will be inefficient and may require system modification.

Brunelle, J. E.; Eckhardt, D. E., Jr.

1985-01-01
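
The two techniques named in the abstract above, recovery blocks and N-version programming, can be sketched generically as follows. This is a minimal illustration of the general techniques, not the SIFT implementation; the function names, the sorting example and the acceptance test are all assumptions made for the example.

```python
import copy
from collections import Counter


class AllAlternatesFailed(Exception):
    """Raised when no alternate passes the acceptance test."""


def recovery_block(data, alternates, acceptance_test):
    """Try each alternate on a fresh copy of the input (the checkpoint) and
    return the first result that passes acceptance_test(data, result)."""
    for alternate in alternates:
        working_copy = copy.deepcopy(data)   # roll back to the checkpoint
        try:
            result = alternate(working_copy)
        except Exception:
            continue                         # a crash counts as a failed alternate
        if acceptance_test(data, result):
            return result
    raise AllAlternatesFailed("no alternate produced an acceptable result")


def n_version_vote(versions, value):
    """Run every version on the same input and return the strict majority result.
    Results must be hashable."""
    results = [version(value) for version in versions]
    winner, count = Counter(results).most_common(1)[0]
    if 2 * count <= len(results):
        raise AllAlternatesFailed("no majority among versions")
    return winner


# Hypothetical example: a faulty primary sort and a trusted alternate.
def buggy_sort(xs):
    xs.sort()
    return xs[:-1]                           # bug: drops the largest element


def safe_sort(xs):
    return sorted(xs)


def is_sorted_permutation_size(original, result):
    ordered = all(a <= b for a, b in zip(result, result[1:]))
    return ordered and len(result) == len(original)


print(recovery_block([3, 1, 2], [buggy_sort, safe_sort], is_sorted_permutation_size))
print(n_version_vote([lambda x: x * x, lambda x: x ** 2, lambda x: x + x], 3))
```

The recovery block runs alternates sequentially and relies on an acceptance test, while N-version programming runs all versions and relies on a voter; the abstract's point that such techniques affect system requirements shows up here as the extra state copying and the multiplied execution cost.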

8

Fault tolerant control of systems with saturations  

DEFF Research Database (Denmark)

This paper presents a framework for fault tolerant controllers (FTC) that includes input saturation. The controller architecture known from FTC, which is based on the Youla-Jabr-Bongiorno-Kucera (YJBK) parameterization, is extended to handle input saturation. Applying this controller architecture to faulty systems with input saturation gives an additional YJBK transfer function related to the input saturation. In the fault free case, this additional YJBK transfer function can be applied directly for optimizing the feedback loop around the input saturation. In the faulty case, the design problem is a mixed design problem involving both parametric faults and input saturation.

Niemann, Hans Henrik

2013-01-01

9

Software engineering of fault tolerant systems  

CERN Document Server

In architecting dependable systems, what is required to improve the overall system robustness is fault tolerance. Many methods have been proposed to this end, but the solutions are usually considered late, during the design and implementation phases of the software life-cycle (e.g., Java and Windows NT exception handling), thus reducing the effectiveness of error and fault handling. Since the system design typically models only the normal behaviour of the system while ignoring exceptional behaviour, the implementation of the system is unable to handle abnormal events. Consequently, the system may fail in unexp

Pelliccione, P; Muccini, Henry

2007-01-01

10

A Fault-tolerant Development Methodology for Industrial Control Systems  

DEFF Research Database (Denmark)

Developing advanced detection schemes is not the only factor in obtaining successful fault diagnosis performance. Achieving significant results in applying fault tolerance in industrial development requires that fault diagnosis and recovery schemes be developed in a consistent and logically sound manner. This paper presents the employed fault-tolerant development methodology and highlights the steps that have been essential for achieving complete and consistent monitoring capabilities. Fault diagnosis for a commercial refrigeration system is treated as a case study.

Izadi-Zamanabadi, Roozbeh; Thybo, C.

2004-01-01

11

Method and system for environmentally adaptive fault tolerant computing  

Science.gov (United States)

A method and system for adapting fault tolerant computing. The method includes the steps of measuring an environmental condition representative of an environment. An on-board processing system's sensitivity to the measured environmental condition is measured. It is determined whether to reconfigure a fault tolerance of the on-board processing system based in part on the measured environmental condition. The fault tolerance of the on-board processing system may be reconfigured based in part on the measured environmental condition.

Copenhaver, Jason L. (Inventor); Jeremy, Ramos (Inventor); Wolfe, Jeffrey M. (Inventor); Brenner, Dean (Inventor)

2010-01-01

12

Synthesizing Fault Tolerant Safety Critical Systems  

Directory of Open Access Journals (Sweden)

To keep pace with today’s nanotechnology, safety critical embedded systems are becoming less tolerant to errors. Research into techniques to cope with errors in these systems has mostly focused on the transformational approach, replication of hardware devices, parallel program design, component based design and/or information redundancy. It would be better to tackle the issue early in the design process, so that a safety critical system never fails to satisfy its strict dependability requirements. This paper outlines a novel method that proposes an efficient approach to synthesizing safety critical systems. The proposed method outperforms dominant existing work by introducing the technique of run time detection and completion of proper execution of the system in the presence of faults.

Seemanta Saha

2014-08-01

13

Fault tolerant aggregation for power system services  

DEFF Research Database (Denmark)

Exploiting the flexibility in distributed energy resources (DER) is seen as an important contribution to allow high penetrations of renewable generation in electrical power systems. However, the present control infrastructure in power systems is not well suited for the integration of a very large number of small units. A common approach is to aggregate a portfolio of such units together and expose them to the power system as a single large virtual unit. In order to realize the vision of a Smart Grid, concepts for flexible, resilient and reliable aggregation infrastructures are required. This paper presents such a concept while focusing on the aspect of resilience and fault tolerance. The proposed concept makes use of a multi-level election algorithm to transparently manage the addition, removal, failure and reorganization of units. It has been implemented and tested as a proof-of-concept on the distributed smart grid test bed SYSLAB at the Technical University of Denmark.

Kosek, Anna Magdalena; Gehrke, Oliver

2013-01-01

14

Design and validation of fault-tolerant flight systems  

Science.gov (United States)

NASA has undertaken the development of a methodology for the design of easily validated fault-tolerant systems which emphasizes validation processes that can be directly incorporated into the design process. Attention is presently given to the statistical issues arising in the validation of highly reliable fault-tolerant systems. Structured specification and design methodologies, mathematical proof techniques, analytical modeling, simulation/emulation, and physical testing, are all discussed. Important design factors associated with fault-tolerance are noted; synchronization and 'Byzantine resilience' must accompany fault tolerance.

Finelli, George B.; Palumbo, Daniel L.

1987-01-01

15

Comparing Distributed Online Stream Processing Systems Considering Fault Tolerance Issues  

Directory of Open Access Journals (Sweden)

This paper presents an analysis of four online stream processing systems (MillWheel, S4, Spark Streaming and Storm) regarding the strategies they use for fault tolerance. Such systems are used to process data streams that can come from different sources such as web sites, sensors, mobile phones or any set of devices that provide real-time high-speed data. Typically, these systems are concerned more with throughput in data processing than with fault tolerance. However, depending on the type of application, fault tolerance should be considered an important feature. The work describes some of the main strategies for fault tolerance (component replication, upstream backup, checkpoint and recovery) and shows how each of the four systems uses these strategies. In the end, the paper discusses the advantages and disadvantages of the combination of the strategies for fault tolerance in these systems.

André Leon Sampaio Gradvohl

2014-05-01
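
Two of the strategies listed in the abstract above, upstream backup and checkpoint with recovery, can be combined in a toy stateful operator. The sketch below is generic and does not reproduce the mechanism of MillWheel, S4, Spark Streaming or Storm; the class names, the sequence-number scheme and the checkpoint interval are assumptions chosen for illustration.

```python
import copy


class UpstreamBuffer:
    """Upstream backup: keeps every event not yet covered by a downstream checkpoint."""

    def __init__(self):
        self._events = []

    def emit(self, seq, event):
        self._events.append((seq, event))

    def ack(self, seq):
        # Events up to and including seq are safe in a checkpoint; discard them.
        self._events = [(s, e) for s, e in self._events if s > seq]

    def replay(self):
        return list(self._events)


class CountingOperator:
    """Stateful operator that counts keys and checkpoints periodically."""

    def __init__(self, upstream, checkpoint_every=3):
        self.upstream = upstream
        self.checkpoint_every = checkpoint_every
        self.counts = {}
        self.last_seq = -1
        self._checkpoint = ({}, -1)          # (state snapshot, last sequence number)

    def process(self, seq, key):
        if seq <= self.last_seq:
            return                           # duplicate delivery, already applied
        self.counts[key] = self.counts.get(key, 0) + 1
        self.last_seq = seq
        if (seq + 1) % self.checkpoint_every == 0:
            self._checkpoint = (copy.deepcopy(self.counts), seq)
            self.upstream.ack(seq)

    def recover(self):
        """Restore the last checkpoint, then replay the events buffered upstream."""
        snapshot, seq = self._checkpoint
        self.counts = copy.deepcopy(snapshot)
        self.last_seq = seq
        for replay_seq, key in self.upstream.replay():
            self.process(replay_seq, key)


upstream = UpstreamBuffer()
operator = CountingOperator(upstream, checkpoint_every=3)
for seq, key in enumerate(["a", "b", "a", "c", "a"]):
    upstream.emit(seq, key)
    operator.process(seq, key)

operator.counts = {}                         # simulate losing in-memory state
operator.recover()
print(operator.counts)
```

The checkpoint bounds how much must be replayed, and the upstream buffer supplies exactly the events processed since that checkpoint, so the recovered counts match the state before the simulated crash.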

16

Software fault tolerance  

Digital Repository Infrastructure Vision for European Research (DRIVER)

Because of our present inability to produce error-free software, software fault tolerance is and will continue to be an important consideration in software systems. The root cause of software design errors is the complexity of the systems. This paper surveys various software fault tolerance techniques and methodologies. They fall into two groups: single-version and multi-version software fault tolerance techniques. It is expected that software fault tolerance research will benefit from this research...

Kazinov, Tofik Hasanaga; Mostafa, Jalilian Shahrukh

2009-01-01

17

From fault classification to fault tolerance for multi-agent systems  

CERN Document Server

Faults are a concern for Multi-Agent Systems (MAS) designers, especially if the MAS are built for industrial or military use because there must be some guarantee of dependability. Some fault classification exists for classical systems, and is used to define faults. When dependability is at stake, such fault classification may be used from the beginning of the system's conception to define fault classes and specify which types of faults are expected. Thus, one may want to use fault classification for MAS; however, From Fault Classification to Fault Tolerance for Multi-Agent Systems argues that

Potiron, Katia; Taillibert, Patrick

2013-01-01

18

Intelligent System for Parallel Fault-Tolerant Diagnostic Tests Construction  

Directory of Open Access Journals (Sweden)

This investigation deals with an intelligent system for the construction of parallel fault-tolerant diagnostic tests. A modified parallel algorithm for fault-tolerant diagnostic test construction is proposed. The algorithm makes it possible to optimize the processing time of test construction. A matrix model of data and knowledge representation, as well as various kinds of regularities in data and knowledge, are presented. An applied intelligent system for diagnostics of the mental health of the population, developed with the use of the intelligent system for parallel fault-tolerant diagnostic test construction, is suggested.

Anna Yankovskaya

2013-04-01

19

Active Fault Tolerant Control of Livestock Stable Ventilation System  

DEFF Research Database (Denmark)

Modern stables and greenhouses are equipped with different components for providing a comfortable climate for animals and plants. A component malfunction may result in loss of production. Therefore, it is desirable to design a control system which is stable and is able to provide an acceptable degraded performance even in the faulty case. In this thesis, we have designed such controllers for climate control systems of livestock buildings in three steps:
• Deriving a model for the climate control system of a pig stable.
• Designing an active fault diagnosis (AFD) algorithm for different kinds of fault.
• Designing a fault tolerant control scheme for the climate control system.
In the first step, a conceptual multi-zone model for climate control of a livestock building is derived. In the next step, two methods for active fault diagnosis are proposed. The AFD methods excite the system by injecting a so-called excitation input. Two different algorithms, the extended Kalman filter (EKF) and a new adaptive filter, are used to detect the faults. The fault tolerant controller (FTC) is based on a switching scheme between a set of predefined passive fault tolerant controllers (PFTCs). In the FTC part of the thesis, a PFTC based on state feedback is first proposed for discrete-time piecewise affine (PWA) systems. Only actuator faults are considered. The PFTC problem is then reformulated as the feasibility of a set of linear matrix inequalities (LMIs).

Gholami, Mehdi

2011-01-01

20

Fault-tolerant computation with higher-dimensional systems  

Energy Technology Data Exchange (ETDEWEB)

Instead of a quantum computer where the fundamental units are 2-dimensional qubits, the author can consider a quantum computer made up of d-dimensional systems. There is a straightforward generalization of the class of stabilizer codes to d-dimensional systems, and he will discuss the theory of fault-tolerant computation using such codes. He proves that universal fault-tolerant computation is possible with any higher-dimensional stabilizer code for prime d.

Gottesman, D.

1998-07-01

 
 
 
 
21

Fault tolerant oxygen control of a diesel engine air system  

Digital Repository Infrastructure Vision for European Research (DRIVER)

This paper is devoted to the fault tolerant control problem of a Diesel engine air system having a jammed Exhaust Gas Recirculation (EGR) valve. The fault tolerant control is based on replanning the trajectory in order to track a new controlled variable which is the oxygen concentration in the intake manifold instead of the fresh air mass flow. The trajectory planning is based on an inverse model approach, utilizing the fundamental thermodynamic relations of the air system.

Nitsche, Rainer; Bitzer, Matthias; El Khaldi, Mahmoud; Bloch, Gérard

2010-01-01

22

H infinity Integrated Fault Estimation and Fault Tolerant Control of Discrete-time Piecewise Linear Systems  

DEFF Research Database (Denmark)

In this paper we consider the problem of fault estimation and accommodation for discrete time piecewise linear systems. A robust fault estimator is designed to estimate the fault such that the estimation error converges to zero and the H∞ performance of the fault estimation is minimized. Then, the estimate of the fault is used to compensate for the effect of the fault. Hence, using the estimate of the fault, a fault tolerant controller using a piecewise linear static output feedback is designed such that it stabilizes the system and provides an upper bound on the H∞ performance of the faulty system. Sufficient conditions for the existence of a robust fault estimator and fault tolerant controller are derived in terms of linear matrix inequalities. Upper bounds on the H∞ performance can be minimized by solving convex optimization problems with linear matrix inequality constraints. The efficiency of the method is demonstrated by means of a numerical example.

Tabatabaeipour, Seyed Mojtaba; Bak, Thomas

2012-01-01

23

Measurement and analysis of operating system fault tolerance  

Science.gov (United States)

This paper demonstrates a methodology to model and evaluate the fault tolerance characteristics of operational software. The methodology is illustrated through case studies on three different operating systems: the Tandem GUARDIAN fault-tolerant system, the VAX/VMS distributed system, and the IBM/MVS system. Measurements are made on these systems for substantial periods to collect software error and recovery data. In addition to investigating basic dependability characteristics such as major software problems and error distributions, we develop two levels of models to describe error and recovery processes inside an operating system and on multiple instances of an operating system running in a distributed environment. Based on the models, reward analysis is conducted to evaluate the loss of service due to software errors and the effect of the fault-tolerance techniques implemented in the systems. Software error correlation in multicomputer systems is also investigated.

Lee, I.; Tang, D.; Iyer, R. K.

1992-01-01

24

ROBUS-2: A Fault-Tolerant Broadcast Communication System  

Science.gov (United States)

The Reliable Optical Bus (ROBUS) is the core communication system of the Scalable Processor-Independent Design for Enhanced Reliability (SPIDER), a general-purpose fault-tolerant integrated modular architecture currently under development at NASA Langley Research Center. The ROBUS is a time-division multiple access (TDMA) broadcast communication system with medium access control by means of a time-indexed communication schedule. ROBUS-2 is a developmental version of the ROBUS providing guaranteed fault-tolerant services to the attached processing elements (PEs) in the presence of a bounded number of faults. These services include message broadcast (Byzantine Agreement), dynamic communication schedule update, clock synchronization, and distributed diagnosis (group membership). The ROBUS also features fault-tolerant startup and restart capabilities. ROBUS-2 is tolerant to internal as well as PE faults, and incorporates a dynamic self-reconfiguration capability driven by the internal diagnostic system. This version of the ROBUS is intended for laboratory experimentation and demonstrations of the capability to reintegrate failed nodes, dynamically update the communication schedule, and tolerate and recover from correlated transient faults.

Torres-Pomales, Wilfredo; Malekpour, Mahyar R.; Miner, Paul S.

2005-01-01

25

Passive Fault-tolerant Control of Discrete-time Piecewise Affine Systems against Actuator Faults  

DEFF Research Database (Denmark)

In this paper, we propose a new method for passive fault-tolerant control of discrete time piecewise affine systems. Actuator faults are considered. A reliable piecewise linear quadratic regulator (LQR) state feedback is designed such that it can tolerate actuator faults. A sufficient condition for the existence of a passive fault-tolerant controller is derived and formulated as the feasibility of a set of linear matrix inequalities (LMIs). The upper bound on the performance cost can be minimized using a convex optimization problem with LMI constraints which can be solved efficiently. The approach is illustrated on a numerical example and a two degree of freedom helicopter.

Tabatabaeipour, Seyed Mojtaba; Izadi-Zamanabadi, Roozbeh

2012-01-01

26

Passive fault-tolerant control of discrete time piecewise affine systems against actuator faults  

DEFF Research Database (Denmark)

In this article, we propose a new method for passive fault-tolerant control of discrete time piecewise affine systems. Actuator faults are considered. A reliable piecewise linear quadratic regulator state feedback is designed such that it can tolerate actuator faults. A sufficient condition for the existence of a passive fault-tolerant controller is derived and formulated as the feasibility of a set of linear matrix inequalities (LMIs). The upper bound on the performance cost can be minimised using a convex optimisation problem with LMI constraints which can be solved efficiently. The approach is illustrated on a numerical example and a two degree of freedom helicopter. © 2012 Taylor & Francis Group, LLC.

Tabatabaeipour, Mojtaba; Izadi-Zamanabadi, Roozbeh

2012-01-01

27

Data-driven design of fault diagnosis and fault-tolerant control systems  

CERN Document Server

Data-driven Design of Fault Diagnosis and Fault-tolerant Control Systems presents basic statistical process monitoring, fault diagnosis, and control methods, and introduces advanced data-driven schemes for the design of fault diagnosis and fault-tolerant control systems catering to the needs of dynamic industrial processes. With ever increasing demands for reliability, availability and safety in technical processes and assets, process monitoring and fault-tolerance have become important issues surrounding the design of automatic control systems. This text shows the reader how, thanks to the rapid development of information technology, key techniques of data-driven and statistical process monitoring and control can now become widely used in industrial practice to address these issues. To allow for self-contained study and facilitate implementation in real applications, important mathematical and control theoretical knowledge and tools are included in this book. Major schemes are presented in algorithm form and...

Ding, Steven X

2014-01-01

28

Design of fault tolerant control system for steam generator using  

Energy Technology Data Exchange (ETDEWEB)

A controller and sensor fault tolerant system for a steam generator is designed with fuzzy logic. The proposed fault tolerant redundant system is composed of a supervisor and two fuzzy weighting modulators. The supervisor alternately checks controller-induced and sensor-induced performance to identify which part, a controller or a sensor, is faulty. To analyze controller-induced performance, both the error and the change in error of the system output are chosen as fuzzy variables. The fuzzy logic for sensor-induced performance uses two variables: the deviation between two sensor outputs and its frequency. Each fuzzy weighting modulator generates an output signal compensated for the faulty input signal. Simulations show that the proposed fault tolerant control scheme for a steam generator regulates the water level well by suppressing the fault effect of either controllers or sensors. Therefore, by duplicating sensors and controllers with the proposed fault tolerant scheme, the reliability of both the steam generator control and sensor system and the power plant increases further. 2 refs., 9 figs., 1 tab. (Author)

Kim, Myung Ki; Seo, Mi Ro [Korea Electric Power Research Institute, Taejon (Korea, Republic of)

1998-12-31

29

Fault Tolerance Middleware for a Multi-Core System  

Science.gov (United States)

Fault Tolerance Middleware (FTM) provides a framework to run on a dedicated core of a multi-core system and handles detection of single-event upsets (SEUs), and the responses to those SEUs, occurring in an application running on multiple cores of the processor. This software was written expressly for a multi-core system and can support different kinds of fault strategies, such as introspection, algorithm-based fault tolerance (ABFT), and triple modular redundancy (TMR). It focuses on providing fault tolerance for the application code, and represents the first step in a plan to eventually include fault tolerance in message passing and the FTM itself. In the multi-core system, the FTM resides on a single, dedicated core, separate from the cores used by the application. This is done in order to isolate the FTM from application faults and to allow it to swap out any application core for a substitute. The structure of the FTM consists of an interface to a fault tolerant strategy module, a responder module, a fault manager module, an error factory, and an error mapper that determines the severity of the error. In the present reference implementation, the only fault tolerant strategy implemented is introspection. The introspection code waits for an application node to send an error notification to it. It then uses the error factory to create an error object, and at this time, a severity level is assigned to the error. The introspection code uses its built-in knowledge base to generate a recommended response to the error. Responses might include ignoring the error, logging it, rolling back the application to a previously saved checkpoint, swapping in a new node to replace a bad one, or restarting the application. The original error and recommended response are passed to the top-level fault manager module, which invokes the response. The responder module also notifies the introspection module of the generated response. 
This provides additional information to the introspection module that it can use in generating its next response. For example, if the responder triggers an application rollback and errors are still occurring, the introspection module may decide to recommend an application restart.

Some, Raphael R.; Springer, Paul L.; Zima, Hans P.; James, Mark; Wagner, David A.

2012-01-01
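
The introspection flow described in the abstract above (an error factory creates an error object, an error mapper assigns a severity, a knowledge base recommends a response, and the responder's feedback can escalate a rollback to a restart) might be sketched as follows. This is not the FTM code: the error kinds, the severity table, the response names and the escalation rule are hypothetical stand-ins for the built-in knowledge base the abstract mentions.

```python
from dataclasses import dataclass
from enum import IntEnum


class Severity(IntEnum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3


# Hypothetical error mapper table: error kind -> severity.
SEVERITY_BY_KIND = {
    "corrected_bit_flip": Severity.LOW,
    "checksum_mismatch": Severity.MEDIUM,
    "core_hang": Severity.HIGH,
}


@dataclass
class Error:
    node: int
    kind: str
    severity: Severity


def make_error(node, kind):
    """Error factory: build an error object and map it to a severity.
    Unknown kinds are treated as HIGH (an assumption of this sketch)."""
    return Error(node, kind, SEVERITY_BY_KIND.get(kind, Severity.HIGH))


class Introspection:
    """Recommends a response and remembers which responses were already invoked."""

    def __init__(self):
        self.history = []

    def recommend(self, error):
        if error.severity == Severity.LOW:
            return "log"
        if error.severity == Severity.MEDIUM:
            # Escalate when a rollback was already tried and errors persist.
            return "restart" if "rollback" in self.history else "rollback"
        return "swap_node"

    def notify(self, response):
        """Feedback from the responder about the response it generated."""
        self.history.append(response)


introspection = Introspection()
error = make_error(node=2, kind="checksum_mismatch")
first = introspection.recommend(error)
introspection.notify(first)
second = introspection.recommend(error)
print(first, second)
```

Keeping the recommendation logic separate from the responder mirrors the module split in the abstract and lets the feedback path change later recommendations without touching the code that carries out a response.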

30

Development and application of diagnostic systems to achieve fault tolerance  

International Nuclear Information System (INIS)

Much work is currently being done to develop and apply diagnostic systems that are tolerant to faulted conditions in the process being monitored and in the sensors that measure the critical parameters associated with the process. A fault-tolerant diagnostic system based on state-determination, pattern-recognition techniques is currently undergoing testing and evaluation in certain applications at the EBR-II reactor. Testing and operational experience with the system to date has shown a high degree of tolerance to sensor failures, while being sensitive to very slight changes in the plant operational state. This paper briefly mentions related work being done by others, and describes in more detail the pattern-recognition system and the results of the testing and operational experience with the system at EBR-II. 9 refs., 10 figs

31

Implementing Fault-Tolerant Services in Goal-Oriented Multi-Agent Systems  

Directory of Open Access Journals (Sweden)

In this paper, findings and analysis detail the implementation of fault tolerance services in a goal-oriented multi-agent systems development platform. Fault tolerance services are used to provide replication-based fault tolerance policies (i.e. static and adaptive) to multi-agent systems. This approach provided flexibility and reusability to multi-agent systems because fault tolerance policies were implemented as reusable plan structures. Thus, whenever an agent needed to be made fault-tolerant, plans for fault tolerance policies were simply activated by sending a request message.

BORA, S.

2014-08-01

32

Fault tolerance of the NIF power conditioning system  

International Nuclear Information System (INIS)

The tolerance of the circuit topology proposed for the National Ignition Facility (NIF) power conditioning system to specific fault conditions is investigated. A new pulsed power circuit is proposed for the NIF which is simpler and less expensive than previous ICF systems. The inherent fault modes of the new circuit are different from the conventional approach, and must be understood to ensure adequate NIF system reliability. A test-bed which simulates the NIF capacitor module design was constructed to study the circuit design. Measurements from test-bed experiments with induced faults are compared with results from a detailed circuit model. The model is validated by the measurements and used to predict the behavior of the actual NIF module during faults. The model can be used to optimize fault tolerance of the NIF module through an appropriate distribution of circuit inductance and resistance. The experimental and modeling results are presented, and fault performance is compared with the ratings of pulsed power components. Areas are identified which require additional investigation

33

14 CFR Special Federal Aviation... - Fuel Tank System Fault Tolerance Evaluation Requirements  

Science.gov (United States)

... Special Federal Aviation Regulation No. 88: Fuel Tank System Fault Tolerance Evaluation Requirements 1....

2010-01-01

34

Trends in reliability modeling technology for fault tolerant systems  

Science.gov (United States)

Developments in reliability modeling for large fault tolerant avionic computing systems are presented. Issues of state size and complexity, fault coverage, and practical computation are addressed. A two-fold developmental effort is described based on the structural and fault coverage modeling approaches. A technique which was successfully applied to an 865 state pure death stationary Markov model is presented. Of particular interest is a short computer program which executes very quickly to produce reliability results of a large state space model. This model also incorporates fault coverage states for processor, memory, and bus line replaceable units. A second structural reliability modeling scheme is aimed at solving nonstationary Markov models. This technique provides the tool required for studying the reliability of systems with nonconstant failure rates and includes intermittent/transient faults, electronic hardware which exhibits decreasing failure rates, and hydromechanical devices which typically have wearout failure mechanisms. Several aspects of fault coverage, including modeling and data measurement of intermittent/transient faults and latent faults, are elucidated and illustrated. The CARE II (computer-aided reliability estimation) coverage is presented and shortcomings to be eliminated are discussed.

Bavuso, S. J.

1979-01-01
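
Illustrative sketch (not taken from the cited report): the abstract above refers to a pure death stationary Markov model, in which the system only ever loses working units and failure rates are constant. The Python fragment below integrates a very small chain of that kind and compares it with the binomial closed form. The function names, the k-out-of-n success criterion, and the assumption of perfect fault coverage are all hypothetical simplifications; the 865-state model and its coverage states are not reproduced here.

```python
import math


def pure_death_reliability(n, m, lam, t, steps=10_000):
    """Probability that at least m of n identical units still work at time t.

    State k means "k units working"; the only transitions are k -> k-1 at
    rate k * lam (pure death chain, constant rates, perfect coverage assumed).
    The state equations are integrated with simple forward Euler steps.
    """
    p = [0.0] * (n + 1)
    p[n] = 1.0  # all units healthy at t = 0
    dt = t / steps
    for _ in range(steps):
        nxt = p[:]
        for k in range(1, n + 1):
            flow = k * lam * p[k] * dt  # probability mass leaving state k
            nxt[k] -= flow
            nxt[k - 1] += flow
        p = nxt
    return sum(p[m:])


def closed_form(n, m, lam, t):
    """Binomial closed form for the same k-out-of-n system, used as a check."""
    r = math.exp(-lam * t)  # survival probability of a single unit
    return sum(math.comb(n, k) * r**k * (1 - r) ** (n - k)
               for k in range(m, n + 1))


if __name__ == "__main__":
    print(pure_death_reliability(3, 2, 1e-3, 100.0))
    print(closed_form(3, 2, 1e-3, 100.0))
```

For larger state spaces, or for the nonstationary case mentioned in the abstract, a dedicated ODE solver would be a more reasonable choice than the fixed-step loop used here.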

35

OPTIMAL CHOICE WITHIN A FAULT TOLERANT FLIGHT CONTROL SYSTEM  

Directory of Open Access Journals (Sweden)

Full Text Available Aircraft safety during flight is one of the most important problems in aviation. Failures or faults in the main elements of the automatic control system, and damage to the external contour of the aircraft by foreign objects, always change the characteristics of the aircraft, incur direct and indirect economic costs, and sometimes lead to injury or death of passengers and crew. A real-time active fault tolerant control system makes it possible to warn of or prevent emergency situations and thus improve safety.

Vasily Kazak

2013-04-01

36

An observer based approach for achieving fault diagnosis and fault tolerant control of systems modeled as hybrid Petri nets.  

Science.gov (United States)

In this paper, we propose an approach for achieving detection and identification of faults, and provide fault tolerant control for systems that are modeled using timed hybrid Petri nets. For this purpose, an observer based technique is adopted which is useful in detection of faults, such as sensor faults, actuator faults, signal conditioning faults, etc. The concepts of estimation, reachability and diagnosability have been considered for analyzing faulty behaviors, and based on the detected faults, different schemes are proposed for achieving fault tolerant control using optimization techniques. These concepts are applied to a typical three tank system and numerical results are obtained. PMID:21507399

Renganathan, K; Bhaskar, VidhyaCharan

2011-07-01

37

Fault-tolerant Supervisory Control : System Analysis and Logic Design  

DEFF Research Database (Denmark)

The main purpose of this work has been to achieve active fault-tolerance in control systems, defined as a methodology where fault detection and isolation techniques are combined with supervisory control to achieve autonomous accommodation of faults before they develop into failures. The aim of this work has been to develop and employ concepts and methods that are suitable for use in different automation processes, with applicability in various industrial fields. The requirements for high productivity and quality have resulted in the employment of additional instrumentation and the use of more sophisticated control algorithms. The drawback is, however, that these control systems have become more vulnerable to even simple faults in instrumentation. On the other hand, due to cost-optimality requirements, an extensive use of hardware redundancy has been prohibited. Nevertheless, dependability and availability could be increased by enhancing the control systems' ability to perform fault detection and reconfiguration on-line when a fault occurs and before a safety system shuts down the entire process. The main contributions of this research effort are development and experimentation with methodologies for systematic analysis of reconfiguration and design of supervisor logic. In addition, useful experience is obtained through implementation of a fault-tolerant control scheme against a simulated ship and its propulsion system. A development methodology, which was suggested in the Control Engineering Department, is extended to cope with the important reconfiguration problem. In order to enable a designer to acquire knowledge about reconfiguration possibilities, the structural analysis method is added as an extension to the existing methodology. This extension builds upon the earlier method where fault propagation and severity analysis are the essential parts.
Structural analysis (SA) enables the designer to distinguish between the parts of the systems with no redundant information and the parts with possible redundant information. This method, hence, provides the designer with information, which is necessary during the selection of remedial actions. Furthermore, it is shown how sensor information fusion is obtained by using the SA method. The construction of the supervisor's decision logic is essential for the active form of fault-tolerant control. In this regard, two approaches have been presented. The first aims at constructing the decision logic in the form of a "language". This language is obtained as a direct result of the component based approach, presented in this thesis. This approach is based on the definition of a functional component, component placement in a control system hierarchy and the definition of system level hierarchy. The supervisor language includes all valid strings, representing the combination of valid components, that keep the system functional. This approach is simple and can be automated. In the second approach, implementation of supervisor functionality is realized on the basis of an extension to the traditional state-event machines. Due to parallelism (inherent modularity) the supervisor logic is more easily modified, updated, maintained, and tested. A salient feature is that a change in one task only necessitates redesign of essentially one corresponding state-event machine (SEM). A heuristic guideline is provided for designing the logic in the form of SEMs. A ship propulsion system benchmark has been designed and used as a case study. This includes experimentation with the above methodologies and implementation of a fault-tolerant control against the simulation. Four generic faults have been considered. It has been shown how the SA method is easily employed to generate analytical redundancy relations, which in turn are then used for FDI purposes.
Three different methods are used to generate residuals. These methods are: simple numerical calculation, a non-linear observer, and a Neuro-Fuzzy method. Employment of each method follows the assumption about the available system information. The results show that it is p

Izadi-Zamanabadi, Roozbeh

1999-01-01

38

Summarize of Electric Vehicle Electric System Fault and Fault-tolerant Technology  

Directory of Open Access Journals (Sweden)

Full Text Available An electric vehicle drive system is a multi-variable system operating in a complex and changeable environment, so its failure modes are complicated. In this paper, faults are classified by the position in which they occur to establish a vehicle fault table, and the causes and possible consequences of each failure are analyzed. Taking into account hardware limitations and the requirement to preserve as much system performance as possible, a passive software-redundancy fault-tolerant strategy is put forward, and an example is given to analyze the pros and cons of this method.

Zhang Liwei

2013-09-01

39

Real Time System Fault Tolerance Scheduling Algorithms  

Directory of Open Access Journals (Sweden)

Full Text Available The main objective of this paper is to implement real-time scheduling algorithms and discuss their advantages and disadvantages. Tasks within a real-time system are designed to accomplish certain service(s) upon execution, and thus each task has a particular significance to the overall functionality of the system. Scheduling algorithms in non-real-time systems do not consider any type of deadline, but in real-time systems the deadline is the main criterion for scheduling tasks.

Ramita Mehta

2014-09-01

40

Scheduling and Optimization of Fault-Tolerant Distributed Embedded Systems  

Digital Repository Infrastructure Vision for European Research (DRIVER)

Safety-critical applications have to function correctly and deliver high level of quality-of-service even in the presence of faults. This thesis deals with techniques for tolerating effects of transient and intermittent faults. Re-execution, software replication, and rollback recovery with checkpointing are used to provide the required level of fault tolerance at the software level. Hardening is used to increase the reliability of hardware components. These techniques are considered in the con...

Izosimov, Viacheslav

2009-01-01

 
 
 
 
41

A Fault Tolerant Mobile Agent Information Retrieval System  

Directory of Open Access Journals (Sweden)

Full Text Available Problem statement: Most information retrieval systems used only client-server architectures. The client-server model, though powerful, had some limitations. In a mobile computing environment, which has both wired and wireless networks with limited communication capabilities, the performance of the system was very low. Approach: Mobile agents are considered a suitable technology for developing applications such as information retrieval systems for mobile computing environments. Mobile agents are autonomous and dynamic entities that can migrate between various nodes in the network. They offer many advantages over traditional design methodologies, such as reduction in network load, overcoming network latency, and disconnected operation. Since mobile agents do not need continuous communication with the mobile host, they are not affected by sudden disconnection of the wireless network or by the mobile host being turned off to save power. In order to get the complete benefit of a mobile agent system, the system must be fault tolerant. In the context of mobile agents, fault tolerance prevents a partial or complete loss of the agent. Results: Our system in a mobile computing environment ensured that the agent arrived at its destination with its result, and the performance of the system improved through a reduction in response time. The system also allowed more requests to be sent by creating many mobile agents without affecting performance. Conclusion: Our research compared the performance of the client-server architecture and the fault tolerant mobile agent information retrieval system and showed that our system solved the limitations faced by the client-server architecture. The system can also be extended to ad hoc networks.

R. Punithavathi

2010-01-01

42

Fault-tolerant reactor protection system  

Science.gov (United States)

A reactor protection system having four divisions, with quad redundant sensors for each scram parameter providing input to four independent microprocessor-based electronic chassis. Each electronic chassis acquires the scram parameter data from its own sensor, digitizes the information, and then transmits the sensor reading to the other three electronic chassis via optical fibers. To increase system availability and reduce false scrams, the reactor protection system employs two levels of voting on a need for reactor scram. The electronic chassis perform software divisional data processing, vote 2/3 with spare based upon information from all four sensors, and send the divisional scram signals to the hardware logic panel, which performs a 2/4 division vote on whether or not to initiate a reactor scram. Each chassis makes a divisional scram decision based on data from all sensors. Each division performs independently of the others (asynchronous operation). All communications between the divisions are asynchronous. Each chassis substitutes its own spare sensor reading in the 2/3 vote if a sensor reading from one of the other chassis is faulty or missing. Therefore the presence of at least two valid sensor readings in excess of a set point is required before terminating the output to the hardware logic of a scram inhibition signal even when one of the four sensors is faulty or when one of the divisions is out of service.

Gaubatz, Donald C. (Cupertino, CA)

1997-01-01
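
Illustrative sketch (not taken from the cited record): the two levels of voting described in the abstract above can be modelled in a few lines of Python. This is only one reading of the text. It assumes that each chassis votes 2/3 on the readings received from the other three chassis and uses its own sensor as the spare, that a faulty or missing reading is represented by `None`, and that a reading "in excess of a set point" means strictly greater. The function names and the `out_of_service` parameter are hypothetical.

```python
from typing import Optional, Sequence


def divisional_trip(other_readings: Sequence[Optional[float]],
                    own_reading: Optional[float],
                    setpoint: float) -> bool:
    """2-out-of-3 vote with spare, as performed by a single chassis."""
    readings = list(other_readings)
    # Substitute the chassis' own (spare) reading for the first faulty or
    # missing reading received from another chassis.
    if own_reading is not None:
        for i, value in enumerate(readings):
            if value is None:
                readings[i] = own_reading
                break
    votes = sum(1 for value in readings
                if value is not None and value > setpoint)
    return votes >= 2


def reactor_scram(sensor_readings: Sequence[Optional[float]],
                  setpoint: float,
                  out_of_service: frozenset = frozenset()) -> bool:
    """2-out-of-4 vote by the hardware logic on the divisional decisions."""
    trips = 0
    for division in range(4):
        if division in out_of_service:
            continue  # a division that is out of service casts no vote
        others = [sensor_readings[j] for j in range(4) if j != division]
        if divisional_trip(others, sensor_readings[division], setpoint):
            trips += 1
    return trips >= 2


if __name__ == "__main__":
    # Sensor 0 is faulty; sensors 1 and 2 exceed the set point of 100.
    print(reactor_scram([None, 110.0, 110.0, 90.0], setpoint=100.0))
```

Under these assumptions a single reading above the set point never produces a scram, while two valid readings above it do, even with one sensor faulty, which matches the behaviour the abstract describes.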

43

Fault-tolerant reactor protection system  

International Nuclear Information System (INIS)

A reactor protection system is disclosed having four divisions, with quad redundant sensors for each scram parameter providing input to four independent microprocessor-based electronic chassis. Each electronic chassis acquires the scram parameter data from its own sensor, digitizes the information, and then transmits the sensor reading to the other three electronic chassis via optical fibers. To increase system availability and reduce false scrams, the reactor protection system employs two levels of voting on a need for reactor scram. The electronic chassis perform software divisional data processing, vote 2/3 with spare based upon information from all four sensors, and send the divisional scram signals to the hardware logic panel, which performs a 2/4 division vote on whether or not to initiate a reactor scram. Each chassis makes a divisional scram decision based on data from all sensors. Each division performs independently of the others (asynchronous operation). All communications between the divisions are asynchronous. Each chassis substitutes its own spare sensor reading in the 2/3 vote if a sensor reading from one of the other chassis is faulty or missing. Therefore the presence of at least two valid sensor readings in excess of a set point is required before terminating the output to the hardware logic of a scram inhibition signal even when one of the four sensors is faulty or when one of the divisions is out of service. 16 figs.

44

Transient Fault Tolerance and System Safety Enhancement Based on System Theory  

Digital Repository Infrastructure Vision for European Research (DRIVER)

Transient faults are hard to be detected and located due to their unpredictable nature and short duration, and they are the dominant causations of system failures, which makes it necessary to consider transient fault-tolerant design in the development of modern safety-critical industrial system. In this paper an approach based on system theory is proposed to tolerate the transient faults in tunnel construction wireless monitoring and control systems (TCWMCS), in which the effects of transient...

Mingyue Yang; Ye Wang; Yuanqing Qin; Chunjie Zhou; Xiongfeng Huang

2011-01-01

45

System Diagnosis and Fault Tolerance for Distributed Computing System: A Review  

Directory of Open Access Journals (Sweden)

Full Text Available This paper reviews adaptive system diagnosis and fault tolerance methods for distributed systems. The system comprises a network of N nodes, where N is an integer greater than or equal to 3, and each node is able to execute an algorithm to communicate with the network. A computer network, often simply referred to as a network, is a collection of hardware components and computers interconnected by communication channels that allow sharing of resources and information. Because a computer network is a collection of hardware components, faults may often occur in either the hardware or the software of the network. To deal with such faults, fault diagnosis and fault tolerance mechanisms must be implemented for the proper functioning of the system. Such fault detection and fault tolerance mechanisms are discussed in this paper. The kinds of faults and how they occur are discussed, and suitable solutions to the problem are sought. Various fault detection mechanisms and fault tolerance methodologies are studied, and the main goal of the study is to identify automatic fault detection and fault tolerance techniques.

Nilotpal Baruah

2013-10-01

46

A Ship Propulsion System Model for Fault-tolerant Control  

DEFF Research Database (Denmark)

This report presents a propulsion system model for a low speed marine vehicle, which can be used as a test benchmark for Fault-Tolerant Control purposes. The benchmark serves the purpose of offering realistic and challenging problems relevant in both the FDI and (autonomous) supervisory control areas. The propulsion system model is presented in two versions: the first one consists of one engine and one propeller, and the other one consists of two engines and their corresponding propellers placed in parallel in the ship. The corresponding programs are developed and are available.

Izadi-Zamanabadi, Roozbeh; Blanke, M.

1998-01-01

47

Fault Tolerant Operation in Aero Engine Using Distributed Computation System  

Directory of Open Access Journals (Sweden)

Full Text Available The paper presents fault tolerant operation in an aero engine based on real-time systems built for a very small set of mission-critical applications such as spacecraft, avionics and other distributed control systems. Modern software deals with external interfaces and has to consider various timing implications. The platform is based on C and developed using the Keil MDK tool, with a targeted deadline of 100 milliseconds at a baud rate of 500 kbps. The CAN interface performs the role of transportation and communication: an interface cable is used for serial communication between the Digital Electronic Control Unit (DECU) and the host to transfer data to the pilot Online Monitoring System, which is based on Laboratory Virtual Instrument Engineering Workbench (LabVIEW 7.1). Fault diagnosis typically assumes a sufficiently large fault signature and enough time for a reliable decision to be reached. However, for a class of safety critical faults on commercial aircraft engines, prompt detection within a millisecond range is paramount to allow accommodation to avert undesired engine behavior. At the same time, false positives must be avoided to prevent inappropriate control action.

Neela A G

2014-04-01

48

On the description of fault-tolerant systems  

International Nuclear Information System (INIS)

Various demands arising from increasing complexity, together with the availability of new technologies such as the one-chip microcomputer and fiber optics, lead to control systems that are built as decentralized, distributed multi-microcomputer systems. They not only realize new control functions but also open up possibilities to increase availability through fault tolerance. The design, or the selection and layout, of such systems requires a quantitative description of these systems. This is possible on the basis of the set of hardware and software modules of the system by the use of queuing models, reliability nets and diagnostic graphs. This is shown by an example of a practically applied Really Distributed Computer Control System (RDC-System). Computer aided methods for these system descriptions are emphasized. (orig.)

49

Diagnostic software and fault tolerant microprocessor based system architectures  

International Nuclear Information System (INIS)

In numerous industrial applications including power generation, the availability of electronic systems to perform the tasks assigned has become a major issue. At the same time, the functional complexity of these systems has increased enormously. Fortunately, the arrival of cost effective microprocessor based hardware has given the system designer a cadre of techniques to ensure the desired degree of system integrity and availability. These include: dynamic redundancy, isolation, functional diversity, built-in self-tests, embedded test subsystems, communications, error checking and error correcting codes, etc. The choice among the available techniques is generally heuristic and depends greatly on the structure of major components and systems external to the electronic system itself as well as the postulated faults and their relative frequency. Indiscriminate use of these techniques will inevitably increase cost and reduce maintainability while actually reducing system availability and reliability. The issues and the application of these techniques are discussed by describing recent examples of fault tolerant microprocessor based system architectures which include the Plant Safety Monitoring System, the EAGLE-21 Process Protection System and the Advanced Rod Position Indication System for pressurized water reactors. Each of these systems utilizes a unique internal architecture that addresses the reliability, availability, and communications issues while improving maintainability and man-machine interfaces.

50

Extending the features of software for reliability analysis of fault-tolerant systems  

Digital Repository Infrastructure Vision for European Research (DRIVER)

The developed software ASNA-2, an improved version of the software ASNA-1, is based on the technology of automated estimation of reliability indexes of fault-tolerant systems. This software is designed for automated evaluation of the reliability indexes of fault-tolerant hardware-software systems. This paper describes the ASNA-2 software and the peculiarities of its procedures for reliability analysis of fault-tolerant systems. ...

Volochiy, Bohdan; Mandziy, Bohdan; Ozirkovskyi, Leonid

2012-01-01

51

Fault Tolerance in Real Time Multiprocessors - Embedded Systems  

CERN Document Server

All real time tasks that are termed critical tasks by nature have to complete their execution before their deadlines, even in the presence of faults. The most popularly used real time task assignment algorithms are First Fit (FF), Best Fit (BF), and Bin Packing (BP). The common task scheduling algorithms are Rate Monotonic (RM), Earliest Deadline First (EDF), etc. All the current approaches deal with either fault tolerance or criticality in real time. In this paper we propose an integrated approach with a new algorithm, called SASA (Sorting And Sequential Assignment), which maps the real time task assignment with task schedule and fault tolerance.

Persya, A Christy

2010-01-01
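
Illustrative sketch (not taken from the cited paper): the abstract above names First Fit task assignment and Rate Monotonic scheduling. The Python fragment below shows one common way the two are combined, placing each task on the first processor where the Liu and Layland utilization bound for RM still holds. The SASA algorithm itself is not reproduced; the function names, the `(wcet, period)` task representation, and the use of the utilization bound as the admission test are assumptions made for the example.

```python
def rm_bound(n: int) -> float:
    """Liu and Layland sufficient utilization bound for n tasks under RM."""
    return n * (2 ** (1 / n) - 1)


def first_fit_rm(tasks):
    """Assign tasks to processors with First Fit.

    tasks: list of (wcet, period) pairs.
    Returns a list of processors, each given as a list of task indices.
    """
    processors = []  # each entry: {"tasks": [indices], "util": float}
    for index, (wcet, period) in enumerate(tasks):
        utilization = wcet / period
        if utilization > 1.0:
            raise ValueError(f"task {index} cannot meet its deadline")
        for proc in processors:
            count = len(proc["tasks"]) + 1
            # Admit the task only if the RM bound still holds afterwards.
            if proc["util"] + utilization <= rm_bound(count):
                proc["tasks"].append(index)
                proc["util"] += utilization
                break
        else:
            # No existing processor can take the task: open a new one.
            processors.append({"tasks": [index], "util": utilization})
    return [proc["tasks"] for proc in processors]


if __name__ == "__main__":
    print(first_fit_rm([(2, 5), (4, 10), (8, 20), (3, 10)]))
```

The utilization bound is sufficient but not necessary, so this sketch may open more processors than an exact response-time test would.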

52

Piecewise Sliding Mode Decoupling Fault Tolerant Control System  

Directory of Open Access Journals (Sweden)

Full Text Available Problem statement: The method proposed in the present study deals with fault tolerant control systems by using decentralized control theory with decoupled sliding mode control, dealing with subsystems instead of the whole system; to the knowledge of the author, there is no known computational algorithm for the decentralized case. Approach: In this study we present a decoupling strategy based on the selection of the sliding surface, which should be partitioned piecewise so that PwLTool can be applied, in our case to delimit the regions where sliding mode occurs. Results: We obtain a simple linearized model, selected in those regions, that can depict the complex system. Conclusion: With a three-tank water level system as an example, we implement this new design scenario; since we are interested in networked control systems, we believe that this kind of controller implementation will not be affected by network delays.

Rafi Youssef

2010-01-01

53

Design and analysis of reliable and fault-tolerant computer systems  

CERN Document Server

Covering both the theoretical and practical aspects of fault-tolerant mobile systems, and fault tolerance and analysis, this book tackles the current issues of reliability-based optimization of computer networks, fault-tolerant mobile systems, and fault tolerance and reliability of high speed and hierarchical networks. The book is divided into six parts to facilitate coverage of the material by course instructors and computer systems professionals. The sequence of chapters in each part ensures the gradual coverage of issues from the basics to the most recent developments. A useful set of refere

Abd-El-Barr, Mostafa

2006-01-01

54

Fault-tolerance in Two-dimensional Topological Systems  

Science.gov (United States)

This thesis is a collection of ideas with the general goal of building, at least in the abstract, a local fault-tolerant quantum computer. The connection between quantum information and topology has proven to be an active area of research in several fields. The introduction of the toric code by Alexei Kitaev demonstrated the usefulness of topology for quantum memory and quantum computation. Many quantum codes used for quantum memory are modeled by spin systems on a lattice, with operators that extract syndrome information placed on vertices or faces of the lattice. It is natural to wonder whether the useful codes in such systems can be classified. This thesis presents work that leverages ideas from topology and graph theory to explore the space of such codes. Homological stabilizer codes are introduced and it is shown that, under a set of reasonable assumptions, any qubit homological stabilizer code is equivalent to either a toric code or a color code. Additionally, the toric code and the color code correspond to distinct classes of graphs. Many systems have been proposed as candidate quantum computers. It is very desirable to design quantum computing architectures with two-dimensional layouts and low complexity in parity-checking circuitry. Kitaev's surface codes provided the first example of codes satisfying this property. They provided a new route to fault tolerance with more modest overheads and thresholds approaching 1%. The recently discovered color codes share many properties with the surface codes, such as the ability to perform syndrome extraction locally in two dimensions. Some families of color codes admit a transversal implementation of the entire Clifford group. This work investigates color codes on the 4.8.8 lattice known as triangular codes. I develop a fault-tolerant error-correction strategy for these codes in which repeated syndrome measurements on this lattice generate a three-dimensional space-time combinatorial structure. 
I then develop an integer program that analyzes this structure and determines the most likely set of errors consistent with the observed syndrome values. I implement this integer program to find the threshold for depolarizing noise on small versions of these triangular codes. Because the threshold for magic-state distillation is likely to be higher than this value and because logical CNOT gates can be performed by code deformation in a single block instead of between pairs of blocks, the threshold for fault-tolerant quantum memory for these codes is also the threshold for fault-tolerant quantum computation with them. Since the advent of a threshold theorem for quantum computers much has been improved upon. Thresholds have increased, architectures have become more local, and gate sets have been simplified. The overhead for magic-state distillation has been studied, but not nearly to the extent of the aforementioned topics. A method for greatly reducing this overhead, known as reusable magic states, is studied here. While examples of reusable magic states exist for Clifford gates, I give strong reasons to believe they do not exist for non-Clifford gates.

Anderson, Jonas T.

55

Reactive system verification case study: Fault-tolerant transputer communication  

Science.gov (United States)

A reactive program is one which engages in an ongoing interaction with its environment. A system which is controlled by an embedded reactive program is called a reactive system. Examples of reactive systems are aircraft flight management systems, bank automatic teller machine (ATM) networks, airline reservation systems, and computer operating systems. Reactive systems are often naturally modeled (for logical design purposes) as a composition of autonomous processes which progress concurrently and which communicate to share information and/or to coordinate activities. Formal (i.e., mathematical) frameworks for system verification are tools used to increase the users' confidence that a system design satisfies its specification. A framework for reactive system verification includes formal languages for system modeling and for behavior specification and decision procedures and/or proof-systems for verifying that the system model satisfies the system specifications. Using the Ostroff framework for reactive system verification, an approach to achieving fault-tolerant communication between transputers was shown to be effective. The key components of the design, the decoupler processes, may be viewed as discrete-event-controllers introduced to constrain system behavior such that system specifications are satisfied. The Ostroff framework was also effective. The expressiveness of the modeling language permitted construction of a faithful model of the transputer network. The relevant specifications were readily expressed in the specification language. The set of decision procedures provided was adequate to verify the specifications of interest. The need for improved support for system behavior visualization is emphasized.

Crane, D. Francis; Hamory, Philip J.

1993-01-01

56

Modeling the Fault Tolerant Capability of a Flight Control System: An Exercise in SCR Specification  

Science.gov (United States)

In life-critical and mission-critical applications, it is important to make provisions for a wide range of contingencies, by providing means for fault tolerance. In this paper, we discuss the specification of a flight control system that is fault tolerant with respect to sensor faults. Redundancy is provided by analytical relations that hold between sensor readings; depending on the conditions, this redundancy can be used to detect, identify and accommodate sensor faults.

Alexander, Chris; Cortellessa, Vittorio; DelGobbo, Diego; Mili, Ali; Napolitano, Marcello

2000-01-01

57

Fault-tolerant for Electric Vehicles Drive System Sensor Failure  

Directory of Open Access Journals (Sweden)

Full Text Available When an EV failure happens, a fault-tolerant method is needed to ensure people's safety. When the current sensor and speed sensor are out of work, a switching strategy between software fault-tolerant control algorithms can be used. This paper presents a theoretical analysis of switching from the rotor field-oriented vector control algorithm to the open-loop constant V/F control algorithm; a phase angle compensation method is used to reduce the shock of current and torque, and simulation is done in MATLAB/Simulink.

Zhang Liwei

2013-10-01

58

Energy-Aware Synthesis of Fault-Tolerant Schedules for Real-Time Distributed Embedded Systems  

DEFF Research Database (Denmark)

This paper presents a design optimisation tool for distributed embedded real-time systems that 1) decides mapping, fault-tolerance policy and generates a fault-tolerant schedule, 2) is targeted for hard real-time, 3) has hard reliability goal, 4) generates static schedule for processes and messages, 5) provides fault-tolerance for k transient/soft faults, 6) optimises for minimal energy consumption, while considering impact of lowering voltages on the probability of faults, 7) uses constraint logic programming (CLP) based implementation.

Pop, Paul

2007-01-01

59

Evaluation of digital fault-tolerant architectures for nuclear power plant control systems  

International Nuclear Information System (INIS)

Four fault tolerant architectures were evaluated for their potential reliability in service as control systems of nuclear power plants. The reliability analyses showed that human- and software-related common cause failures and single points of failure in the output modules are dominant contributors to system unreliability. The four architectures are triple-modular-redundant (TMR), both synchronous and asynchronous, and also dual synchronous and asynchronous. The evaluation includes a review of design features, an analysis of the importance of coverage, and reliability analyses of fault tolerant systems. An advantage of fault-tolerant controllers over those that are not fault tolerant is that fault-tolerant controllers continue to function after the occurrence of most single hardware faults. However, most fault-tolerant controllers have single hardware components that will cause system failure, almost all controllers have single points of failure in software, and all are subject to common cause failures. Reliability analyses based on data from several industries that have fault-tolerant controllers were used to estimate the mean-time-between-failures of fault-tolerant controllers and to predict those failure modes that may be important in nuclear power plants. 7 refs., 4 tabs
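
Illustrative sketch (not taken from the cited report): the point made above about single points of failure can be seen in the textbook reliability expression for a triple-modular-redundant arrangement, where the system works if at least two of three identical, independent modules work and a single voter or output stage sits in series with them. The function name and the series voter term are assumptions for the example; common cause failures, which the abstract identifies as dominant, are deliberately not modelled.

```python
def tmr_reliability(r_module: float, r_voter: float = 1.0) -> float:
    """Reliability of a 2-out-of-3 (TMR) system with a series voter.

    r_module: reliability of each of the three identical modules,
              assumed statistically independent.
    r_voter:  reliability of the single voter or output stage, which is
              a single point of failure.
    """
    two_or_more = 3 * r_module**2 - 2 * r_module**3
    return r_voter * two_or_more


if __name__ == "__main__":
    print(tmr_reliability(0.9))        # ideal voter
    print(tmr_reliability(0.9, 0.99))  # voter limits the achievable gain
```

Because the voter term multiplies the whole expression, system reliability can never exceed that of the single shared component, however reliable the replicated modules are.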

60

Passive Fault Tolerant Control of Piecewise Affine Systems Based on H Infinity Synthesis  

DEFF Research Database (Denmark)

In this paper we design a passive fault tolerant controller against actuator faults for discrete-time piecewise affine (PWA) systems. By using dissipativity theory and H-infinity analysis, fault tolerant state feedback controller design is expressed as a set of Linear Matrix Inequalities (LMIs). In the current paper, the PWA system switches not only due to the state but also due to the control input. The method is applied on a large scale livestock ventilation model.

Gholami, Mehdi; Cocquempot, Vincent

2011-01-01

 
 
 
 
61

Optimal structure of fault-tolerant software systems  

International Nuclear Information System (INIS)

This paper considers software systems consisting of fault-tolerant components. These components are built from functionally equivalent but independently developed versions characterized by different reliability and execution time. Because of hardware resource constraints, the number of versions that can run simultaneously is limited. The expected system execution time and its reliability (defined as the probability of obtaining the correct output within a specified time) depend strictly on the parameters of the software versions and the sequence of their execution. The system structure optimization problem is formulated in which one has to choose software versions for each component and find the sequence of their execution in order to achieve the greatest system reliability subject to cost constraints. The versions are to be chosen from a list of available products. Each version is characterized by its reliability, execution time and cost. The suggested optimization procedure is based on an algorithm for determining the system execution time distribution that uses the moment generating function approach, and on the genetic algorithm. Both N-version programming and the recovery block scheme are considered within a universal model. An illustrative example is presented.
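
Why the execution sequence matters can be seen in a much simpler setting than the paper's model. The sketch below assumes a recovery-block-style component in which versions run one at a time, fail independently, and are checked by a perfect acceptance test. It ignores the deadline and cost constraint, and the function name is hypothetical.

```python
from math import prod


def sequential_versions(versions):
    """versions: list of (reliability, exec_time), tried in order until one passes.

    Returns (component reliability, expected execution time) under the
    assumptions of independent failures and a perfect acceptance test.
    """
    reliability = 1.0 - prod(1.0 - r for r, _ in versions)
    expected_time = 0.0
    p_reach = 1.0  # probability that all earlier versions have failed
    for r, t in versions:
        expected_time += p_reach * t
        p_reach *= 1.0 - r
    return reliability, expected_time


if __name__ == "__main__":
    slow_first = [(0.9, 2.0), (0.8, 1.0)]
    fast_first = list(reversed(slow_first))
    print(sequential_versions(slow_first))
    print(sequential_versions(fast_first))
```

In this simplified setting the order leaves reliability unchanged but changes the expected time. Once a deadline is imposed, as in the paper's definition of reliability, the order affects reliability as well.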

62

High-Intensity Radiated Field Fault-Injection Experiment for a Fault-Tolerant Distributed Communication System  

Science.gov (United States)

Safety-critical distributed flight control systems require robustness in the presence of faults. In general, these systems consist of a number of input/output (I/O) and computation nodes interacting through a fault-tolerant data communication system. The communication system transfers sensor data and control commands and can handle most faults under typical operating conditions. However, the performance of the closed-loop system can be adversely affected as a result of operating in harsh environments. In particular, High-Intensity Radiated Field (HIRF) environments have the potential to cause random fault manifestations in individual avionic components and to generate simultaneous system-wide communication faults that overwhelm existing fault management mechanisms. This paper presents the design of an experiment conducted at the NASA Langley Research Center's HIRF Laboratory to statistically characterize the faults that a HIRF environment can trigger on a single node of a distributed flight control system.

Yates, Amy M.; Torres-Pomales, Wilfredo; Malekpour, Mahyar R.; Gonzalez, Oscar R.; Gray, W. Steven

2010-01-01

63

Analysis and optimization of fault-tolerant embedded systems with hardened processors  

DEFF Research Database (Denmark)

In this paper we propose an approach to the design optimization of fault-tolerant hard real-time embedded systems, which combines hardware and software fault tolerance techniques. We trade-off between selective hardening in hardware and process reexecution in software to provide the required levels of fault tolerance against transient faults with the lowest-possible system costs. We propose a system failure probability (SFP) analysis that connects the hardening level with the maximum number of reexecutions in software. We present design optimization heuristics, to select the fault-tolerant architecture and decide process mapping such that the system cost is minimized, deadlines are satisfied, and the reliability requirements are fulfilled.

Pop, Paul

2009-01-01

64

A Novel Fault Tolerant Reversible Gate For Nanotechnology Based Systems  

Directory of Open Access Journals (Sweden)

This paper proposes a novel reversible logic gate, NFT. It is a parity-preserving reversible logic gate, that is, the parity of the outputs matches that of the inputs. We demonstrate that the NFT gate can implement all Boolean functions. It renders a wide class of circuit faults readily detectable at the circuit's outputs. The proposed parity-preserving reversible gate allows any fault that affects no more than a single signal to be detected at the circuit's primary outputs. The NFT gate can be used to make fault tolerant reversible logic circuits. We demonstrate how the well-known, and very useful, Toffoli gate can be synthesized from only two parity-preserving reversible gates. We show that our proposed parity-preserving Toffoli gate is much better in terms of number of reversible gates, number of garbage outputs and hardware complexity compared with the existing counterpart.

Majid Haghparast

2008-01-01
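
The parity-preservation property described in the abstract can be checked exhaustively for a small gate. Since the NFT gate's truth table is not given here, the sketch below uses the well-known Fredkin (controlled-swap) gate as a stand-in; it is not the gate proposed in the paper.

```python
from itertools import product


def fredkin(c, a, b):
    """Controlled swap: if c is 1, swap a and b; otherwise pass them through."""
    return (c, b, a) if c else (c, a, b)


def parity(bits):
    return sum(bits) % 2


inputs = list(product((0, 1), repeat=3))
outputs = [fredkin(*x) for x in inputs]

# Reversible: the mapping from inputs to outputs is a bijection.
is_reversible = len(set(outputs)) == len(inputs)

# Parity preserving: input parity equals output parity for every input.
is_parity_preserving = all(parity(x) == parity(y) for x, y in zip(inputs, outputs))


def single_fault_detected(x, flipped_index):
    """Flip one output signal and test whether the parity check exposes it."""
    y = list(fredkin(*x))
    y[flipped_index] ^= 1
    return parity(y) != parity(x)


if __name__ == "__main__":
    print(is_reversible, is_parity_preserving)
    print(all(single_fault_detected(x, i) for x in inputs for i in range(3)))
```

Because a single bit flip always changes parity, comparing input and output parity is enough to expose any fault confined to one signal.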

65

Fault-Tolerant Control using Adaptive Time-Frequency Method in Bearing Fault Detection for DFIG Wind Energy System  

Directory of Open Access Journals (Sweden)

With the advances of power electronic technology, doubly-fed induction generators (DFIG) have increasingly drawn the interest of the wind turbine industry. To ensure the reliable operation and power quality of wind power systems, fault-tolerant control for DFIG is studied in this paper. The fault-tolerant controller is designed to maintain acceptable performance during a bearing fault condition. Based on measured motor current data, an adaptive statistical time-frequency method is used to detect the occurrence of a fault in the system and then let the controller compensate for faulty conditions. The feature vectors, including frequency components located in the neighborhood of the characteristic fault frequencies, are first extracted and then used to estimate the stator-side current at the next sampling instant in order to better perform the current control. Early fault detection, isolation and successful reconfiguration would therefore be very beneficial in wind energy conversion systems. The feasibility of this fault-tolerant controller has been shown by means of a mathematical model and digital simulations based on Matlab/Simulink. The simulation results of the generator output show the effectiveness of the proposed fault-tolerant controller.

Suratsavadee Koonlaboon KORKUA

2015-01-01

66

Design and Assessment of a Multiple Sensor Fault Tolerant Robust Control System  

Directory of Open Access Journals (Sweden)

This paper presents an enhanced robust control design structure to realise fault tolerance towards sensor faults, suitable for implementation in multi-input-multi-output (MIMO) systems. The proposed design permits fault detection and controller elements to be designed on a common mathematical platform, with consideration for stability and robustness towards uncertainties as well as a multiple-fault environment. This framework can also cater to systems requiring fast responses. A design example is illustrated with a fast, multivariable and unstable system, namely the double inverted pendulum system. Results indicate the potential of this design framework to handle fast systems with multiple sensor faults.

J. Chen

2008-03-01

67

Implementing Fault-Tolerance in Real-Time Systems by Program Transformations  

Digital Repository Infrastructure Vision for European Research (DRIVER)

We present a formal approach to implement fault-tolerance in real-time embedded systems. The initial fault-intolerant system consists of a set of independent periodic tasks scheduled onto a set of fail-silent processors connected by a reliable communication network. We transform the tasks such that, assuming the availability of an additional spare processor, the system tolerates one failure at a time (transient or permanent). Failure detection is implemented using heartbeating, and failure ma...

Ayav, Tolga; Fradet, Pascal; Girault, Alain

2006-01-01
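
Heartbeat-based failure detection, mentioned in the abstract above, can be sketched as a timeout monitor. This is a generic illustration, not the program transformation from the paper: the class name, the timeout value and the explicit `now` timestamps are assumptions chosen to keep the example deterministic.

```python
class HeartbeatMonitor:
    """Declares a node failed when no heartbeat arrives within `timeout`."""

    def __init__(self, timeout):
        self.timeout = timeout
        self.last_seen = {}  # node name -> time of last heartbeat

    def beat(self, node, now):
        """Record a heartbeat from `node` at time `now`."""
        self.last_seen[node] = now

    def failed(self, now):
        """Return the sorted list of nodes whose last heartbeat is too old."""
        return sorted(
            node for node, t in self.last_seen.items() if now - t > self.timeout
        )


if __name__ == "__main__":
    monitor = HeartbeatMonitor(timeout=3.0)
    monitor.beat("p1", now=0.0)
    monitor.beat("p2", now=0.0)
    monitor.beat("p1", now=2.0)      # p2 stays silent after t = 0
    print(monitor.failed(now=4.0))   # p2 has exceeded the timeout
```

With fail-silent processors, as assumed in the abstract, a missing heartbeat is an unambiguous sign of failure, so a spare processor can take over the tasks of any node the monitor reports.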

68

Transient Fault Tolerance and System Safety Enhancement Based on System Theory  

Directory of Open Access Journals (Sweden)

Transient faults are hard to detect and locate because of their unpredictable nature and short duration, and they are the dominant cause of system failures, which makes it necessary to consider transient fault-tolerant design in the development of modern safety-critical industrial systems. In this paper an approach based on system theory is proposed to tolerate transient faults in tunnel construction wireless monitoring and control systems (TCWMCS), in which the effects of transient faults are expressed as dysfunctional interactions among software applications. After analyzing the dysfunctional interactions of the system with the operational process model and educing the causes of dysfunction in the functional control diagram, a safety enhancement method is proposed for designers, in which effective safety constraints are set up to tolerate the transient faults. The experimental evaluation indicated that the effects of transient faults could be exposed by the causal factors of dysfunctional interactions and that system safety could be enhanced by the enforcement of appropriate constraints.

Xiongfeng Huang

2011-10-01

69

A Reliable Fault-Tolerant Scheduling Algorithm for Real Time Embedded Systems  

Digital Repository Infrastructure Vision for European Research (DRIVER)

In this paper, we propose a fault-tolerant scheduling for realtime embedded systems. Our scheduling algorithm is dedicated to multibus heterogeneous architectures, which take as input a given system description and a given fault hypothesis. It is based on a data fragmentation and passive redundancy, which allow fast fault detection/retransmission and efficient use of buses. Our scheduling approach consist of a list scheduling heuristic based on a Global System Failure Rate (GSFR). In order to...

Arar, Chafik; Kalla, Hamoudi; Kalla, Salim; Riadh, Hocine

2013-01-01

70

A Piecewise Affine Hybrid Systems Approach to Fault Tolerant Satellite Formation Control  

DEFF Research Database (Denmark)

In this paper a procedure for modelling satellite formations including failure dynamics as a piecewise-affine hybrid system is shown. The formulation enables recently developed methods and tools for control and analysis of piecewise-affine systems to be applied, leading to synthesis of fault tolerant controllers and analysis of the system behaviour given possible faults. The method is illustrated using a simple example involving two satellites trying to reach a specific formation despite actuator faults occurring.

Grunnet, Jacob Deleuran; Larsen, Jesper Abildgaard

2008-01-01

71

Modeling and Design of Fault-Tolerant and Self-Adaptive Reconfigurable Networked Embedded Systems  

Digital Repository Infrastructure Vision for European Research (DRIVER)

Automotive, avionic, or body-area networks are systems that consist of several communicating control units specialized for certain purposes. Typically, different constraints regarding fault tolerance, availability and also flexibility are imposed on these systems. In this article, we will present a novel framework for increasing fault tolerance and flexibility by solving the problem of hardware/software codesign online. Based on field-programmable gate arrays (FPGAs) in combination wi...

Teich, Jürgen; Haubelt, Christian; Koch, Dirk; Streichert, Thilo

2006-01-01

72

Evaluation and Checkpointing of Fault Tolerant Mobile Agents Execution in Distributed Systems  

Directory of Open Access Journals (Sweden)

The reliable execution of a mobile agent is a very important design issue in building a mobile agent system, and many fault-tolerant schemes have been proposed. In this paper, we present an evaluation of the performance of fault-tolerant schemes for the mobile agent environment. Our evaluation focuses on checkpointing schemes and deals with cooperating agents. We derive the FANTOMAS (Fault-Tolerant approach for Mobile Agents) design, which offers user-transparent fault tolerance that can be activated on request, according to the needs of the task. A theoretical analysis examines the advantages and drawbacks of FANTOMAS. The use of mobile agents, however, is critical and requires reliability with regard to mobile agent failures, which may lead to poor response times and hence to loss of system availability. In this study, a fault tolerance technique is proposed so that the system autonomously detects and recovers from a mobile agent fault caused by a failure in a transmission link. We also discuss how transactional agents with different types of commitment constraints can commit. Furthermore, this paper proposes a solution for effective agent deployment using dynamic agent domains.

Hodjatollah Hamidi

2010-07-01

73

Award ER25750: Coordinated Infrastructure for Fault Tolerance Systems Indiana University Final Report  

Energy Technology Data Exchange (ETDEWEB)

The main purpose of the Coordinated Infrastructure for Fault Tolerance in Systems initiative has been to conduct research with a goal of providing end-to-end fault tolerance on a system-wide basis for applications and other system software. While fault tolerance has been an integral part of most high-performance computing (HPC) system software developed over the past decade, it has been treated mostly as a collection of isolated stovepipes. Visibility of and response to faults have typically been limited to the particular hardware and software subsystems in which they are initially observed. Little fault information is shared across subsystems, allowing little flexibility or control on a system-wide basis and making it practically impossible to provide cohesive end-to-end fault tolerance in support of scientific applications. As an example, consider faults such as communication link failures that can be seen by a network library but are not directly visible to the job scheduler, or consider faults related to node failures that can be detected by system monitoring software but are not inherently visible to the resource manager. If information about such faults could be shared by the network libraries or monitoring software, then other system software, such as a resource manager or job scheduler, could ensure that failed nodes or failed network links were excluded from further job allocations and that further diagnosis could be performed. As a founding member and one of the lead developers of the Open MPI project, our efforts over the course of this project have been focused on making Open MPI more robust to failures by supporting various fault tolerance techniques, and using fault information exchange and coordination between MPI and the HPC system software stack, from the application, numeric libraries, and programming language runtime to other common system components such as job schedulers, resource managers, and monitoring tools.

Lumsdaine, Andrew

2013-03-08

74

The analysis and optimization of fault tolerance in multiprocessor systems: A graph theoretic approach  

Energy Technology Data Exchange (ETDEWEB)

The proliferation of increasingly powerful and complex multiprocessor systems has made fault-tolerant design a necessity. Optimizing fault tolerance in multiprocessor systems is a very difficult task because it involves multi-dimensional tradeoffs. The system architecture, the computation structure, the implementation technology, the frequency, duration, and location of faults, and many other factors all have some impact on the effectiveness of a particular fault recovery procedure. The author has attempted to solve this difficult problem by a graph theoretic approach. In this dissertation, he introduces this approach and concentrates on the analysis and optimization of fault tolerance in multiprocessor systems. Specifically, a reconfiguration model that allows a faulted job to be recovered with minimum space and time overhead and without performance degradation is formally introduced. Additionally, eleven parameters are precisely defined to facilitate the evaluation of the fault tolerance of different multiprocessor systems for executing a given set of target applications. They also allow the quantitative comparison of various fault recovery techniques so that efficient algorithms can be developed. The graph theoretic approach presented is widely applicable to multiprocessor systems and applications with various topologies. In this dissertation, the author concentrates on two well-known systems, namely the mesh and the hypercube, and two frequently used computation structures, namely the path and the complete binary tree. Solutions and algorithms for determining various optimization parameters are presented.

Yau, H.W.

1989-01-01

75

Industrial Cost-Benefit Assessment for Fault-tolerant Control Systems  

DEFF Research Database (Denmark)

Economic aspects are decisive for industrial acceptance of research concepts, including the promising ideas in fault tolerant control. Fault tolerance is the ability of a system to detect, isolate and accommodate a fault, such that simple faults in a sub-system do not develop into failures at a system level. In the design phase for an industrial system, possibilities span from fail-safe design, where any single point failure is accommodated by hardware, through fault-tolerant design, where selected faults are handled without extra hardware, to fault-ignorant design, where no extra precaution is taken against failure. The paper describes the assessments needed to find the right path for new industrial designs. The economic decisions in the design phase are discussed: cost of different failures, profits associated with available benefits, and investments needed for development and life-time support. The objective of this paper is to help, at the early product development stage, to find the economically most suitable scheme. A salient result is that with increased customer awareness of total cost of ownership, new products can benefit significantly from applying fault tolerant control principles.

Thybo, C.; Blanke, M.

1998-01-01

76

Application-driven co-design of fault-tolerant industrial systems  

Digital Repository Infrastructure Vision for European Research (DRIVER)

This paper presents a novel methodology for the HW/SW co-design of fault tolerant embedded systems that pursues the mitigation of radiation-induced upset events (which are a class of Single Event Effects - SEEs) on critical industrial applications. The proposal combines the flexibility and low cost of Software Implemented Hardware Fault Tolerance (SIHFT) techniques with the high reliability of selective hardware replication. The co-design flow is supported by a hardening platform that compris...

Restrepo Calle, Felipe; Martínez Álvarez, Antonio; Guzmán Miranda, Hipólito; Palomo Pinto, Francisco Rogelio; Cuenca Asensi, Sergio

2010-01-01

77

Enhanced fault-tolerant quantum computing in $d$-level systems  

Digital Repository Infrastructure Vision for European Research (DRIVER)

Error correcting codes protect quantum information and form the basis of fault tolerant quantum computing. Leading proposals for fault-tolerant quantum computation require codes with an exceedingly rare property, a transverse non-Clifford gate. Codes with the desired property are presented for $d$-level, qudit, systems with prime $d$. The codes use $n=d-1$ qudits and can detect up to $\sim d/3$ errors. We quantify the performance of these codes for one approach to quantum com...

Campbell, Earl T.

2014-01-01

78

Fault-diagnosis applications. Model-based condition monitoring. Actuators, drives, machinery, plants, sensors, and fault-tolerant systems  

Energy Technology Data Exchange (ETDEWEB)

Supervision, condition monitoring, fault detection, fault diagnosis and fault management play an increasing role for technical processes and vehicles in order to improve reliability, availability, maintenance and lifetime. For safety-related processes, fault-tolerant systems with redundancy are required in order to reach comprehensive system integrity. This book is a sequel to the book "Fault-Diagnosis Systems" published in 2006, where the basic methods were described. After a short introduction to fault-detection and fault-diagnosis methods, the book shows how these methods can be applied to a selection of 20 real technical components and processes as examples, such as: electrical drives (DC, AC); electrical actuators; fluidic actuators (hydraulic, pneumatic); centrifugal and reciprocating pumps; pipelines (leak detection); industrial robots; machine tools (main and feed drive, drilling, milling, grinding); heat exchangers. Implemented fault-tolerant systems for electrical drives, actuators and sensors are also presented. The book describes why and how the various signal-model-based and process-model-based methods were applied and which experimental results could be achieved. In several cases a combination of different methods was most successful. The book is intended for graduate students of electrical, mechanical and chemical engineering and computer science, and for engineers. (orig.)

Isermann, Rolf [Technische Univ. Darmstadt (DE). Inst. fuer Automatisierungstechnik (IAT)

2011-07-01

79

Fault Tolerant Control Systems : a Development Method and Real-Life Case Study  

DEFF Research Database (Denmark)

This thesis considered the development of fault tolerant control systems. The focus was on the category of automated processes that do not necessarily comprise a high number of identical sensors and actuators to maintain safe operation, but still have a potential for improving immunity to component failures. It is often feasible to increase availability for these control loops by designing the control system to perform on-line detection and reconfiguration in case of faults before the safety system makes a close-down of the process. A general development methodology is given in the thesis that carried the control system designer through the steps necessary to consider fault handling in an early design phase. It was shown how an existing control loop with interface to the plant wide control system could be extended with three additional modules to obtain fault tolerance: Fault detection and isolation, remedial action decision, and reconfiguration. The integration of these modules in software were considered. The general methodology covered the analysis, design, and implementation of fault tolerant control systems on an overall level. Two detailed studies were presented, one on fault detection and isolation design and one on design of the decision logic. Two application case studies were used to emphasize practical aspects of both the development methodology and the detailed studies. One was an electro-mechanical actuator in a position control loop for a diesel engine speed governor where the purpose was to avoid a total close-down in case of the most likely faults. The second was a fault tolerant attitude control system for a micro satellite where the operation of the system is mission critical. The purpose was to avoid hazardous effects from faults and maintain operation if possible. 
A method was introduced that, after a systematic examination of possible component failures, enables analysis of the relationship between failures and their consequences for the system's operation. This fault propagation analysis is based on coarse models of the subsystems describing the reaction to faults, as for example a variable being zero, low or high. Examples were given that illustrate how such models can be established by simple means, and yet provide important information when combined into a complete system. A special achievement was a method to determine how control loops behave in case of faults. This is not straightforward as the system behaviour depends on the character of the feedback. One of the detailed studies was the design of the decision logic in fault handling, realized as state-event machines. Guidelines for the design were provided, based on experience from the two case studies. Methods for verifying correct operation of the decision logic were described, where a completeness check against the fault propagation analysis is able to guarantee coverage of all considered faults. The usage of software tools to support the development process was illustrated with an off-the-shelf product for constraint logic solving and state-event machine analysis. The coarse system models and the decision logic were analyzed with the tool-box and it was shown how an easy analysis could be performed to verify correctness and completeness of the fault handling design. Experience from this study highlights requirements for a dedicated software environment for fault tolerant control systems design. The second detailed study addressed the detection of a fault event and determination of the failed component. A variety of algorithms were compared, based on two fault scenarios in the speed governor actuator setup. One was a position sensor fault and the second was an actuator current fault. 
The sensor fault detection was trivial, whereas the actuator fault was more challenging. The study demonstrated that many existing methods have a potential to detect and isolate the two faults, but also that the research field still misses a systematic approach to handle realistic problems such as low sampling rate and nonlinear characteristics of the system

Bøgh, S.A.

1997-01-01

80

An architecture for fault tolerant controllers  

DEFF Research Database (Denmark)

A general architecture for fault tolerant control is proposed. The architecture is based on the (primary) YJBK parameterization of all stabilizing compensators and uses the dual YJBK parameterization to quantify the performance of the fault tolerant system. The approach suggested can be applied for additive faults, parametric faults, and for system structural changes. The modeling for each of these fault classes is described. The method allows to design for passive as well as for active fault handling. Also, the related design method can be fitted either to guarantee stability or to achieve graceful degradation in the sense of guaranteed degraded performance. A number of fault diagnosis problems, fault tolerant control problems, and feedback control with fault rejection problems are formulated/considered, mainly from a fault modeling point of view. The method is illustrated on a servo example including an additive fault and a parametric fault.

Niemann, Hans Henrik

2005-01-01

 
 
 
 
81

Fault-tolerant interconnection network and image-processing applications for the PASM parallel processing system  

International Nuclear Information System (INIS)

The demand for very high speed data processing coupled with falling hardware costs has made large-scale parallel and distributed computer systems both desirable and feasible. Two modes of parallel processing are single instruction stream-multiple data stream (SIMD) and multiple instruction stream-multiple data stream (MIMD). PASM, a partitionable SIMD/MIMD system, is a reconfigurable multimicroprocessor system being designed for image processing and pattern recognition. An important component of these systems is the interconnection network, the mechanism for communication among the computation nodes and memories. Assuring high reliability for such complex systems is a significant task. Thus, a crucial practical aspect of an interconnection network is fault tolerance. In answer to this need, the Extra Stage Cube (ESC), a fault-tolerant, multistage cube-type interconnection network, is defined. The fault tolerance of the ESC is explored for both single and multiple faults, routing tags are defined, and consideration is given to permuting data and partitioning the ESC in the presence of faults. The ESC is compared with other fault-tolerant multistage networks. Finally, the reliability of the ESC and of an enhanced version of it is investigated.

82

Intelligent fault-tolerant control for swing-arm system in the space-borne spectrograph  

Science.gov (United States)

Fault-tolerant control (FTC) for space-borne equipment is very important in engineering design. This paper presents a two-layer intelligent FTC approach to handle the speed stability problem in a swing-arm system suffering from various faults in space. This approach provides reliable FTC at the performance level, and improves the control flow error detection capability at the code level. The faults degrading the system performance are detected by the performance-based fault detection mechanism. The detected faults are categorized as anticipated or unanticipated faults by the fault bank. A neural network is used as an on-line estimator to approximate the unanticipated faults. Compensation control and intelligent integral sliding mode control are employed to accommodate the two types of faults at the performance level, respectively. To guarantee the reliability of the FTC at the code level, the key parts of the program code are modified by control flow checking by software signatures (CFCSS) to detect the control flow errors caused by single event upsets. Meanwhile, some of the undetected control flow errors can be detected by the FTC at the performance level. The FTC for the anticipated and unanticipated faults is verified in Synopsys Saber, and the detection of control flow errors is tested on the DSP controller. Simulation results demonstrate the efficiency of the novel FTC approach.

Shi, Yufeng; Zhou, Chunjie; Huang, Xiongfeng; Yin, Quan

2012-04-01

83

Fault-Tolerant Real-Time Scheduling Algorithm for Energy-Aware Embedded Systems  

Digital Repository Infrastructure Vision for European Research (DRIVER)

In this paper, we propose a fault-tolerant scheduling approach that achieves low energy consumption and high reliability efficiency. Our scheduling solution is dedicated to multi-bus heterogeneous architectures, which take as input a given system description and a given fault hypothesis. It is based on active redundancy to mask a fixed number k of failures supported in the system, so that there is no need for detecting and handling such failures. In order to maximize the system's reliability,...

Arar, Chafik; Kalla, Hamoudi; Kalla, Salim; Bendib, Sonia Sabrina

2013-01-01

84

Distributed Fault-Tolerant Avionic Systems - A Real-Time Perspective  

Digital Repository Infrastructure Vision for European Research (DRIVER)

This paper examines the problem of introducing advanced forms of fault-tolerance via reconfiguration into safety-critical avionic systems. This is required to enable increased availability after fault occurrence in distributed integrated avionic systems(compared to static federated systems). The approach taken is to identify a migration path from current architectures to those that incorporate re-configuration to a lesser or greater degree. Other challenges identified includ...

Burke, Michael; Audsley, Neil

2010-01-01

85

Algorithm Based Fault Tolerant and Check Pointing for High Performance Computing Systems  

Directory of Open Access Journals (Sweden)

We present a new approach to fault tolerance for High Performance Computing systems. An important consideration in the design of high performance multiprocessor systems is to ensure the correctness of the results computed in the presence of transient and intermittent failures. Concurrent error detection and correction have been applied to such systems in order to achieve reliability. Algorithm Based Fault Tolerance (ABFT) has been suggested as a cost-effective concurrent error detection scheme. This dissertation explores fault tolerance in a wide variety of matrix operations for parallel and distributed scientific computing. It proposes a novel computing paradigm to provide fault tolerance for numerical algorithms. The research reported in this study has been motivated by the complexity involved in the analysis and design of ABFT systems. We also present, implement and evaluate early detection in ABFT. In early detection, we try to detect the errors that occur in the checksum calculation before starting the actual computation. Early detection improves throughput in cases of intensive computations and cases of high error rates. An empirical performance evaluation of the implementations on a network of workstations confirms that the advantages of our paradigm are its low overhead, simplicity, ease of implementation and applicability to scientific applications.

Hodjatollah Hamidi

2009-01-01

86

Fault detection and fault tolerant control of a smart base isolation system with magneto-rheological damper  

International Nuclear Information System (INIS)

Fault detection and isolation (FDI) in real-time systems can provide early warnings for faulty sensors and actuator signals to prevent events that lead to catastrophic failures. The main objective of this paper is to develop FDI and fault tolerant control techniques for base isolation systems with magneto-rheological (MR) dampers. Thus, this paper presents a fixed-order FDI filter design procedure based on linear matrix inequalities (LMI). The necessary and sufficient conditions for the existence of a solution for detecting and isolating faults using the H∞ formulation are provided in the proposed filter design. Furthermore, an FDI-filter-based fuzzy fault tolerant controller (FFTC) for a base isolation structure model was designed to preserve the pre-specified performance of the system in the presence of various unknown faults. Simulation and experimental results demonstrated that the designed filter can successfully detect and isolate faults from displacement sensors and accelerometers while maintaining excellent performance of the base isolation technology under faulty conditions.

87

Fault Tolerant Control Systems : a Development Method and Real-Life Case Study  

Digital Repository Infrastructure Vision for European Research (DRIVER)

This thesis considered the development of fault tolerant control systems. The focus was on the category of automated processes that do not necessarily comprise a high number of identical sensors and actuators to maintain safe operation, but still have a potential for improving immunity to component failures. It is often feasible to increase availability for these control loops by designing the control system to perform on-line detection and reconfiguration in case of faults before the safety ...

Bøgh, S. A.

1997-01-01

88

Fault Tolerant Control Systems : a Development Method and Real-Life Case Study  

Digital Repository Infrastructure Vision for European Research (DRIVER)

This thesis considered the development of fault tolerant control systems. The focus was on the category of automated processes that do not necessarily comprise a high number of identical sensors and actuators to maintain safe operation, but still have a potential for improving immunity to component failures. It is often feasible to increase availability for these control loops by designing the control system to perform on-line detection and reconfiguration in case of faults before the safety ...

Bøgh, S. A.

2005-01-01

89

Preface of the special issue on Advances in Control and Fault-Tolerant Systems  

Digital Repository Infrastructure Vision for European Research (DRIVER)

Today's automatic control systems are of high degrees of integration, complexity, embedding and networking of heterogeneous entities. This trend is driven by the industrial needs for achieving new technical performance and meeting additional performance demands. A most critical and important issue surrounding the design and operation of complex automatic systems is the application of Fault Detection and Isolation and Fault-Tolerant Control (FDI/FTC) technology, aiming at guaranteeing high sys...

Korbicz, Jozef; Maquin, Didier; Theilliol, Didier

2012-01-01

90

Fault-tolerant Stabilization for Linear System with Time Delay  

Directory of Open Access Journals (Sweden)

In this note, the FTC problem of time-delay systems with a special sensor failure model is investigated. First, based on the Lyapunov stability theorem, by constructing a proper LKF and using an integral inequality, the stability condition of the closed-loop system is obtained. Second, by using a nonlinear transformation and the cone complementary linearization algorithm, the controller existence condition for the time-delay system is obtained in terms of LMIs, which guarantees the asymptotic stability of the closed-loop system even if sensor faults occur; the controller parameters are also given. Finally, an example is given to show the effectiveness of the proposed methods.

Shaohua Wang

2013-03-01

91

Active fault tolerant control of piecewise affine systems with reference tracking and input constraints  

DEFF Research Database (Denmark)

An active fault tolerant control (AFTC) method is proposed for discrete-time piecewise affine (PWA) systems. Only actuator faults are considered. The AFTC framework contains a supervisory scheme, which selects a suitable controller from a set of controllers such that stability and an acceptable performance of the faulty system are maintained. The design of the supervisory scheme is not considered here. The set of controllers is composed of a normal controller for the fault-free case, an active fault detection and isolation controller for isolation and identification of the faults, and a set of passive fault tolerant controller (PFTC) modules designed to be robust against a set of actuator faults. In this research, the piecewise nonlinear model is approximated by a PWA system. The PFTCs are state feedback laws. Each one is robust against a fixed set of actuator faults and is able to track the reference signal while the control inputs are bounded. The PFTC problem is transformed into a feasibility problem of a set of LMIs. The method is applied to a large-scale livestock ventilation model.

Gholami, M.; Cocquempot, V.

2013-01-01

92

Fault tolerant control system design by using clustering algorithms of data mining  

Directory of Open Access Journals (Sweden)

In this study, two clustering algorithms and their success in fault isolation have been investigated for use in our fault tolerant control (FTC) system. In many applications used today, the mathematical model of the system cannot be completely established. Therefore, in this study, fault detection and isolation (FDI) is realized by using knowledge-based methods, without the need for any mathematical model. Sensor data, which are taken offline by FDI, are clustered to create a knowledge base by means of k-means and the farthest first traversal algorithm (FFTA), respectively. The results obtained by the two algorithms are compared, and FFTA has been found to be more successful in fault tolerance.
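
To make the two clustering methods named in the abstract concrete, here is a minimal sketch of k-means and farthest-first traversal on a toy set of two-dimensional sensor readings. The data, the function names and the deterministic initialisation (first point, first k points) are assumptions made for illustration; they are not taken from the study, and the sketch makes no claim about which method isolates faults better.

```python
import math

def farthest_first_centers(points, k):
    """Pick k centers: start with the first point, then repeatedly take the
    point whose distance to its nearest chosen center is largest."""
    centers = [points[0]]
    while len(centers) < k:
        nxt = max(points, key=lambda p: min(math.dist(p, c) for c in centers))
        centers.append(nxt)
    return centers

def assign(points, centers):
    """Label each point with the index of its nearest center."""
    return [min(range(len(centers)), key=lambda i: math.dist(p, centers[i]))
            for p in points]

def k_means(points, k, iterations=20):
    """Lloyd's algorithm, initialised with the first k points (an assumption)."""
    centers = [list(p) for p in points[:k]]
    for _ in range(iterations):
        labels = assign(points, centers)
        for i in range(k):
            members = [p for p, lab in zip(points, labels) if lab == i]
            if members:  # keep the old center if a cluster becomes empty
                centers[i] = [sum(x) / len(members) for x in zip(*members)]
    return centers, assign(points, centers)

def groups(labels):
    """Turn a label list into a set of index groups, ignoring label names."""
    return {frozenset(i for i, lab in enumerate(labels) if lab == g)
            for g in set(labels)}

# Hypothetical offline readings: normal operation and two fault signatures.
readings = [(1.0, 1.0), (1.1, 0.9), (0.9, 1.2),   # normal
            (5.0, 1.0), (5.2, 1.1),               # fault signature 1
            (1.0, 6.0), (1.1, 6.2)]               # fault signature 2

fft_labels = assign(readings, farthest_first_centers(readings, 3))
_, km_labels = k_means(readings, 3)
```

On this well-separated toy data both methods recover the same three groups; differences between them would only show up on harder data, which is the question the study itself examines.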

Umut Altınışık

2013-01-01

93

Architecture for Intrusion Detection System with Fault Tolerance Using Mobile Agent  

Directory of Open Access Journals (Sweden)

This paper is a survey of the work done on making an IDS fault tolerant. An IDS architecture that uses mobile agents provides higher scalability. The mobile agent platform detects intrusions using a filter agent, a correlator agent, an interpreter agent and a rule database. When the server (IDS Monitor) goes down, other hosts take ownership based on priority. This architecture uses decentralized collection and analysis for identifying intrusions. Rule sets are fed based on user behaviour or application behaviour. This paper suggests that an intrusion detection system (IDS) must be fault tolerant; otherwise, the intruder may first subvert the IDS and then attack the target system at will.
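
The priority-based takeover described above can be sketched in a few lines. The abstract gives no protocol details, so the data layout, the host names and the convention that a larger number means higher priority are all assumptions for illustration.

```python
def elect_owner(hosts):
    """Return the name of the highest-priority host that is still alive, or None.

    hosts maps a host name to (priority, is_alive); a larger priority wins
    (an assumed convention, not one stated in the source).
    """
    alive = [(priority, name)
             for name, (priority, is_alive) in hosts.items() if is_alive]
    return max(alive)[1] if alive else None

# Hypothetical deployment: the monitor normally owns the IDS.
hosts = {"monitor": (3, True), "host-a": (2, True), "host-b": (1, True)}
owner_before = elect_owner(hosts)

hosts["monitor"] = (3, False)          # the IDS Monitor goes down
owner_after = elect_owner(hosts)
```

A real system would also need failure detection (for example heartbeats) and agreement among hosts on who is alive; this sketch only shows the selection rule.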

Chintan Bhatt

2011-10-01

94

Fault tolerance control of phase current in permanent magnet synchronous motor control system  

Science.gov (United States)

As photoelectric tracking systems move from earth-based platforms to all kinds of moving platforms (plane-based, ship-based, car-based, satellite-based and missile-based), a fault tolerant control system for the phase current sensors is studied in order to detect and handle phase current sensor failures on a moving platform. By using a DC-link current sensor and the switching state of the corresponding SVPWM inverter, failure detection and fault tolerant control of the three phase current sensors are achieved. Fault tolerance is maintained when one, two or all three sensors fail. The reason why an error exists between the reconstructed current under fault tolerant control and the actual phase current is analyzed, and a solution to reduce the error is provided. An experiment based on a permanent magnet synchronous motor system is conducted, and the method is shown to detect phase current sensor failures effectively and precisely while simultaneously providing fault tolerant control. With this method, even if all three phase current sensors malfunction, the moving platform can still work by reconstructing the phase currents of the motor.
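
The abstract does not give its reconstruction equations. The sketch below uses the standard idealised relationship for a three-phase two-level inverter: during an active switching state (Sa, Sb, Sc) the DC-link current equals Sa*ia + Sb*ib + Sc*ic, and the phase currents sum to zero for a motor without a neutral connection. Function names and sample values are hypothetical, and dead time, sampling delay and noise (the kind of error sources the paper analyses) are ignored.

```python
def dc_link_current(state, currents):
    """Ideal DC-link current: sum of the phase currents connected to the upper rail."""
    return sum(s * i for s, i in zip(state, currents))

def reconstruct(sample1, sample2):
    """Rebuild (ia, ib, ic) from two (state, i_dc) samples taken during two
    different active vectors within one PWM period."""
    known = {}
    for state, i_dc in (sample1, sample2):
        ones = sum(state)
        if ones == 1:            # one upper switch on: i_dc is that phase current
            known[state.index(1)] = i_dc
        elif ones == 2:          # two upper switches on: i_dc is minus the third phase
            known[state.index(0)] = -i_dc
        else:
            raise ValueError("zero vectors carry no phase current information")
    if len(known) != 2:
        raise ValueError("the two samples must reveal two different phases")
    missing = ({0, 1, 2} - set(known)).pop()
    known[missing] = -sum(known.values())   # ia + ib + ic = 0
    return tuple(known[p] for p in range(3))

# Usage with made-up currents: simulate the two DC-link samples, then reconstruct.
true_currents = (3.0, -1.0, -2.0)
s1 = ((1, 0, 0), dc_link_current((1, 0, 0), true_currents))
s2 = ((1, 1, 0), dc_link_current((1, 1, 0), true_currents))
estimate = reconstruct(s1, s2)
```

Comparing such an estimate with each phase sensor reading is one plausible way to flag a failed sensor, and the estimate can stand in for the sensor afterwards; whether the paper does exactly this is not stated in the abstract.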

Chen, Kele; Chen, Ke; Chen, Xinglong; Li, Jinying

2014-08-01

95

Energy/Reliability Trade-offs in Fault-Tolerant Event-Triggered Distributed Embedded Systems  

DEFF Research Database (Denmark)

This paper presents an approach to the synthesis of low-power fault-tolerant hard real-time applications mapped on distributed heterogeneous embedded systems. Our synthesis approach decides the mapping of tasks to processing elements, as well as the voltage and frequency levels for executing each task, such that transient faults are tolerated, the timing constraints of the application are satisfied, and the energy consumed is minimized. Tasks are scheduled using fixed-priority preemptive scheduling, while replication is used for recovery from multiple transient faults. Addressing energy and reliability simultaneously is especially challenging, since lowering the voltage to reduce the energy consumption has been shown to increase the transient fault rate. We present a Tabu Search-based approach which uses an energy/reliability trade-off model to find reliable and schedulable implementations with limited energy and hardware resources. We evaluated the proposed algorithm using several synthetic and real-life benchmarks.

Gan, Junhe; Pop, Paul

2011-01-01

96

Formal specification of requirements for analytical redundancy-based fault-tolerant flight control systems  

Science.gov (United States)

Flight control systems are undergoing a rapid process of automation. The use of Fly-By-Wire digital flight control systems in commercial aviation (Airbus 320 and Boeing FBW-B777) is a clear sign of this trend. The increased automation goes in parallel with an increased complexity of flight control systems with obvious consequences on reliability and safety. Flight control systems must meet strict fault-tolerance requirements. The standard solution to achieving fault tolerance capability relies on multi-string architectures. On the other hand, multi-string architectures further increase the complexity of the system inducing a reduction of overall reliability. In the past two decades a variety of techniques based on analytical redundancy have been suggested for fault diagnosis purposes. While research on analytical redundancy has obtained desirable results, a design methodology involving requirements specification and feasibility analysis of analytical redundancy based fault tolerant flight control systems is missing. The main objective of this research work is to describe within a formal framework the implications of adopting analytical redundancy as a basis to achieve fault tolerance. The research activity involves analysis of the analytical redundancy approach, analysis of flight control system informal requirements, and re-engineering (modeling and specification) of the fault tolerance requirements. The USAF military specification MIL-F-9490D and supporting documents are adopted as source for the flight control informal requirements. The De Havilland DHC-2 general aviation aircraft equipped with standard autopilot control functions is adopted as pilot application. Relational algebra is adopted as formal framework for the specification of the requirements. The detailed analysis and formalization of the requirements resulted in a better definition of the fault tolerance problem in the framework of analytical redundancy. 
Fault tolerance requirements and related certification procedures turned out to be considerably more demanding than those typically adopted in the literature. Furthermore, the research work brought to light important issues in all fields involved in the specification process, namely flight control system requirements, analytical redundancy, and requirements engineering.

Del Gobbo, Diego

2000-10-01

97

Classes of Byzantine Fault-Tolerant Algorithms for Dependable Distributed Systems.  

Digital Repository Infrastructure Vision for European Research (DRIVER)

This thesis concentrates on the design of new algorithms for fault-tolerant systems based on system-level hardware masking redundancy. It is argued that any system in which a reliability improvement of at least a factor 100 is required should be based on system-level hardware masking redundancy. The technique of system-level hardware masking redundancy is applicable in a redundant system consisting of a number of processors, in which the system services are replicated on the different process...

Postma, André

1998-01-01

98

Proactive Service Migration for Long-Running Byzantine Fault Tolerant Systems  

CERN Document Server

In this paper, we describe a novel proactive recovery scheme based on service migration for long-running Byzantine fault tolerant systems. Proactive recovery is an essential method for ensuring long term reliability of fault tolerant systems that are under continuous threats from malicious adversaries. The primary benefit of our proactive recovery scheme is a reduced vulnerability window. This is achieved by removing the time-consuming reboot step from the critical path of proactive recovery. Our migration-based proactive recovery is coordinated among the replicas; it can therefore automatically adjust to different system loads and avoid the problem of excessive concurrent proactive recoveries that may occur in previous work with fixed watchdog timeouts. Moreover, the fast proactive recovery also significantly improves the system availability in the presence of faults.

Zhao, Wenbing

2008-01-01

99

Fault Tolerant Computer Architecture  

CERN Document Server

For many years, most computer architects have pursued one primary goal: performance. Architects have translated the ever-increasing abundance of ever-faster transistors provided by Moore's law into remarkable increases in performance. Recently, however, the bounty provided by Moore's law has been accompanied by several challenges that have arisen as devices have become smaller, including a decrease in dependability due to physical faults. In this book, we focus on the dependability challenge and the fault tolerance solutions that architects are developing to overcome it. The two main purposes

Sorin, Daniel

2009-01-01

100

Cost and benefits design optimization model for fault tolerant flight control systems  

Science.gov (United States)

Requirements and specifications for a method of optimizing the design of fault-tolerant flight control systems are provided. Algorithms that could be used for developing new and modifying existing computer programs are also provided, with recommendations for follow-on work.

Rose, J.

1982-01-01

101

Fault-Tolerant Matrix Operations On Multiple Processor Systems Using Weighted Checksums  

Science.gov (United States)

Hardware for performing matrix operations at high speeds is in great demand in signal and image processing and in many real-time and scientific applications. VLSI technology has made it possible to perform fast large-scale vector and matrix computations by using multiple copies of low-cost processors. Since any functional error in a high performance system may seriously jeopardize the operation of the system and its data integrity, some level of fault-tolerance must be obtained to ensure that the results of long computations are valid. A low-cost checksum scheme had been proposed to obtain fault-tolerant matrix operations on multiple processor systems. However, this scheme can only correct errors in matrix multiplication; it can detect, but not correct errors in matrix-vector multiplication, LU-decomposition, and matrix inversion. In order to solve these problems with the checksum scheme, a very general matrix encoding scheme is proposed in this paper to achieve fault-tolerant matrix operations with multiple processor systems. Since many signal and image processing algorithms involving a "multiply-and-accumulate" type of expression can be transformed into matrix-vector multiplication operations and executed in a linear array, this scheme is extremely useful for cost-effective and fault-tolerant signal and image processing.
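
The abstract above says that a plain checksum can detect but not correct errors in matrix-vector multiplication. A minimal sketch of how a second, weighted checksum makes a single error correctable is given below, using exact integer arithmetic. The weights 1..n and all function names are assumptions for illustration; the paper's actual encoding may differ.

```python
def encode(a):
    """Append two checksum rows to a: the plain column sums and the
    column sums weighted by row number 1..n (assumed weights)."""
    n = len(a)
    plain = [sum(col) for col in zip(*a)]
    weighted = [sum((i + 1) * a[i][j] for i in range(n))
                for j in range(len(a[0]))]
    return a + [plain, weighted]

def matvec(a, x):
    """Matrix-vector product."""
    return [sum(r * v for r, v in zip(row, x)) for row in a]

def check_and_correct(y_full):
    """y_full holds n data entries followed by the two checksum entries.
    Returns the data entries, repairing at most one corrupted data entry."""
    n = len(y_full) - 2
    data = list(y_full[:n])
    s1 = sum(data) - y_full[n]                                     # error size
    s2 = sum((i + 1) * v for i, v in enumerate(data)) - y_full[n + 1]
    if s1 == 0 and s2 == 0:
        return data
    if s1 != 0 and s2 % s1 == 0 and 1 <= s2 // s1 <= n:
        data[s2 // s1 - 1] -= s1                                   # s2 / s1 is the position
        return data
    raise ValueError("error pattern cannot be corrected by this scheme")

# Usage: compute with the encoded matrix, inject one fault, recover the result.
A = [[1, 2], [3, 4], [5, 6]]
x = [1, 1]
y_full = matvec(encode(A), x)
y_full[1] += 5                          # simulated fault in one processor's output
y = check_and_correct(y_full)
```

The first syndrome gives the magnitude of the error and the ratio of the two syndromes gives its position, which is what the single unweighted checksum cannot provide.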

Jou, Jing-Yang; Abraham, Jacob A.

1984-11-01

102

A Fault tolerant Control Supervisory System development Procedure for Small Satellites : The AAUSAT-II case  

DEFF Research Database (Denmark)

The paper presents a stepwise procedure to develop a fault tolerant control system for small satellites. The procedure is illustrated through implementation on the AAUSAT-II spacecraft. As shown, the presented procedure requires expertise from several disciplines, all of which are necessary for obtaining a complete and consistent solution.

Izadi-Zamanabadi, Roozbeh; Larsen, Jesper Abildgaard

103

A Novel Fault Tolerant Reversible Gate For Nanotechnology Based Systems  

Digital Repository Infrastructure Vision for European Research (DRIVER)

This paper proposes a novel reversible logic gate, NFT. It is a parity preserving reversible logic gate, that is, the parity of the outputs matches that of the inputs. We demonstrate that the NFT gate can implement all Boolean functions. It renders a wide class of circuit faults readily detectable at the circuit's outputs. The proposed parity preserving reversible gate allows any fault that affects no more than a single signal to be detectable at the circuit's primary outputs. The NFT gate c...
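
The truncated abstract does not define the NFT gate's logic functions, so the sketch below uses the well-known Fredkin (controlled-swap) gate as a stand-in to illustrate the two properties the abstract relies on: reversibility and parity preservation. It is not the NFT gate.

```python
from itertools import product

def fredkin(a, b, c):
    """Fredkin gate: if the control a is 1, swap b and c; otherwise pass through."""
    return (a, c, b) if a else (a, b, c)

def parity(bits):
    """Parity (XOR) of a tuple of bits."""
    return sum(bits) % 2

inputs = list(product((0, 1), repeat=3))
outputs = [fredkin(*bits) for bits in inputs]

# Reversible: the eight inputs map to eight distinct outputs.
is_reversible = len(set(outputs)) == len(inputs)

# Parity preserving: input parity always equals output parity.
is_parity_preserving = all(parity(i) == parity(o) for i, o in zip(inputs, outputs))

def single_fault_detected(bits, position):
    """Flip one output signal and compare output parity with input parity."""
    out = list(fredkin(*bits))
    out[position] ^= 1
    return parity(out) != parity(bits)

all_single_faults_detected = all(single_fault_detected(bits, pos)
                                 for bits in inputs for pos in range(3))
```

Because a single flipped signal always changes the parity, comparing input and output parity exposes any fault confined to one signal, which is the detection property the abstract describes for parity preserving gates.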

Majid Haghparast; Keivan Navi

2008-01-01

104

Fault Injection and Monitoring Capability for a Fault-Tolerant Distributed Computation System  

Science.gov (United States)

The Configurable Fault-Injection and Monitoring System (CFIMS) is intended for the experimental characterization of effects caused by a variety of adverse conditions on a distributed computation system running flight control applications. A product of research collaboration between NASA Langley Research Center and Old Dominion University, the CFIMS is the main research tool for generating actual fault response data with which to develop and validate analytical performance models and design methodologies for the mitigation of fault effects in distributed flight control systems. Rather than a fixed design solution, the CFIMS is a flexible system that enables the systematic exploration of the problem space and can be adapted to meet the evolving needs of the research. The CFIMS has the capabilities of system-under-test (SUT) functional stimulus generation, fault injection and state monitoring, all of which are supported by a configuration capability for setting up the system as desired for a particular experiment. This report summarizes the work accomplished so far in the development of the CFIMS concept and documents the first design realization.

Torres-Pomales, Wilfredo; Yates, Amy M.; Malekpour, Mahyar R.

2010-01-01

105

A Fault Tolerant Colored Petri Net Model for Flexible Manufacturing Systems  

Scientific Electronic Library Online (English)

This paper introduces an approach based on Colored Petri Nets (CPN) to systematically introduce fault-tolerance in the design of a supervisor for a Flexible Manufacturing System (FMS). The system is modeled by means of Place/Transition nets and then is structurally reduced, resulting in a CPN that is independent of a specific production route. The introduction of fault tolerance in the design of such a supervisor considers both forward recovery and backward recovery. For forward recovery we anticipate faults in resources in a production route and reschedule the production routes for production orders before the faulty resource is reached. The backward recovery is considered at the level of a resource in such a way that when a faulty resource is fixed, the operation restarts on the last consistent operation executed.

Barros, Tomaz C.; de Figueiredo, Jorge C.A.; Perkusich, Angelo

106

Diagnosis and Fault-tolerant Control  

DEFF Research Database (Denmark)

The book presents effective model-based analysis and design methods for fault diagnosis and fault-tolerant control. Architectural and structural models are used to analyse the propagation of the fault through the process, to test the fault detectability and to find the redundancies in the process that can be used to ensure fault tolerance. Design methods for diagnostic systems and fault-tolerant controllers are presented for processes that are described by analytical models, by discrete-event models or that can be dealt with as quantised systems. Four case studies on pilot processes show the applicability of the presented methods. The theoretical results are illustrated by two running examples which are used throughout the book. The book addresses engineering students, engineers in industry and researchers who wish to get a survey over the variety of approaches to process diagnosis and fault-tolerant control.

Blanke, Mogens

2003-01-01

107

Stochastic Models for Fault Tolerance  

CERN Document Server

As modern society relies on the fault-free operation of complex computing systems, system fault-tolerance has become an indispensable requirement. Therefore, we need mechanisms that guarantee correct service in cases where system components fail, be they software or hardware elements. Redundancy patterns are commonly used, for either redundancy in space or redundancy in time. Wolter's book details methods of redundancy in time that need to be issued at the right moment. In particular, she addresses the so-called "timeout selection problem", i.e., the question of choosing the right ti

Wolter, Katinka M

2010-01-01

108

Diagnosis and Tolerant Strategy of an Open-Switch Fault for T-type Three-Level Inverter Systems  

DEFF Research Database (Denmark)

This paper proposes a new diagnosis method of an open-switch fault and fault-tolerant control strategy for T-type three-level inverter systems. The location of the faulty switch can be identified by the average of the normalized phase current and the change of the neutral-point voltage. The proposed fault-tolerant strategy is explained for two cases: the faulty condition of the half-bridge switches and that of the neutral-point switches. The performance of the T-type inverter system improves considerably with the proposed fault tolerant algorithm when a switch fails. The proposed method does not require additional components or complex calculations. Simulation and experimental results verify the feasibility of the proposed fault diagnosis and fault-tolerant control strategy.

Choi, Uimin; Lee, Kyo Beum

2014-01-01

109

Fault Tolerance Mobile Agent System Using Witness Agent in 2-Dimensional Mesh Network  

Directory of Open Access Journals (Sweden)

Mobile agents are computer programs that act autonomously on behalf of a user or owner and travel through a network of heterogeneous machines. Fault tolerance is important along their itinerary. In this paper, existing methods of fault tolerance for mobile agents are described; these methods assume a linear network topology. In these methods, three kinds of agents cooperate to detect and recover from server and agent failures: the actual agent, which executes programs for its owner; the witness agent, which monitors the actual agent and the witness agent that follows it; and the probe, which the witness agent sends to recover the actual agent or another witness agent. The agents communicate by message passing. We introduce our witness agent approach for fault tolerant mobile agent systems in a Two-Dimensional Mesh (2D-Mesh) network. Our approach minimizes witness dependency in this network, and we present its algorithm.

Ahmad Rostami

2010-09-01

110

Modeling and Design of Fault-Tolerant and Self-Adaptive Reconfigurable Networked Embedded Systems  

Directory of Open Access Journals (Sweden)

Automotive, avionic, or body-area networks are systems that consist of several communicating control units specialized for certain purposes. Typically, different constraints regarding fault tolerance, availability and also flexibility are imposed on these systems. In this article, we will present a novel framework for increasing fault tolerance and flexibility by solving the problem of hardware/software codesign online. Based on field-programmable gate arrays (FPGAs) in combination with CPUs, we allow migrating tasks implemented in hardware or software from one node to another. Moreover, if not enough hardware/software resources are available, the migration of functionality from hardware to software or vice versa is provided. Supporting such flexibility through services integrated in a distributed operating system for networked embedded systems is a substantial step towards self-adaptive systems. Besides the formal definition of methods and concepts, we describe in detail a first implementation of a reconfigurable networked embedded system running automotive applications.

Streichert Thilo

2006-01-01

111

Modeling and Design of Fault-Tolerant and Self-Adaptive Reconfigurable Networked Embedded Systems  

Directory of Open Access Journals (Sweden)

Automotive, avionic, or body-area networks are systems that consist of several communicating control units specialized for certain purposes. Typically, different constraints regarding fault tolerance, availability and also flexibility are imposed on these systems. In this article, we will present a novel framework for increasing fault tolerance and flexibility by solving the problem of hardware/software codesign online. Based on field-programmable gate arrays (FPGAs) in combination with CPUs, we allow migrating tasks implemented in hardware or software from one node to another. Moreover, if not enough hardware/software resources are available, the migration of functionality from hardware to software or vice versa is provided. Supporting such flexibility through services integrated in a distributed operating system for networked embedded systems is a substantial step towards self-adaptive systems. Besides the formal definition of methods and concepts, we describe in detail a first implementation of a reconfigurable networked embedded system running automotive applications.

Jürgen Teich

2006-06-01

112

Rigorous Development of Fault-Tolerant Transactions for Information Retrieval Systems Using Event-B  

Directory of Open Access Journals (Sweden)

The aim of this study is to demonstrate the stepwise development of a distributed transaction mechanism for information retrieval (IR) systems. In this study, we formally develop an abstract model of transactions in Event-B for an IR system, in which fault tolerance is provided in the distributed transaction execution. We start from an abstract system specification and gradually introduce implementation details in a series of correctness-preserving transformations, where complex system properties (such as fault tolerance) can be specified in a structured and rigorous way. During each transformation, the refinement between the abstract specification of the system and its detailed design is verified. Using Event-B, we achieve a high degree of automatic proof via this incremental approach.

Hong-Jiang Gao

2008-01-01

113

BYZANTINE FAULT TOLERANCE MODEL FOR SOAP FAULTS  

Directory of Open Access Journals (Sweden)

The proposed model configures a Byzantine Fault Tolerance mechanism for every SOAP fault message that is transmitted. Reliability and availability are major requirements of Web services, since they operate in a distributed environment. One of the reliability issues is handling faults. Faults occur in all the phases of Service Oriented Architecture, i.e. during publishing, discovery, composition, binding, and execution. These faults may cause service downtime, abnormal behavior, and incorrect responses. These abnormalities are classified as Byzantine faults in Web services. Even though the SOAP specification provides fault handling mechanisms, the correctness of the received SOAP fault messages is not known. In this paper, a model is proposed to check the correctness of the SOAP fault message received, by incorporating Byzantine agreement for fault tolerance. The existing fault tolerant mechanism detects server failure and routes the request to the next available server without the knowledge of the client. The proposed model ensures a transparent environment by providing fault handling information to the client. This is achieved by incorporating an active replication technique.
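
The abstract does not describe the agreement protocol itself. As a rough sketch of the client-side idea, assume the request is actively replicated to 3f + 1 servers of which at most f are Byzantine, and that the client trusts a fault message only when at least f + 1 replicas report the same one. The replica count, the acceptance threshold, the function name and the fault strings are all assumptions for illustration.

```python
from collections import Counter

def agreed_fault(replies, f):
    """Return the fault message vouched for by at least f + 1 replicas, else None.

    replies holds one fault message per replica; 3f + 1 replicas are assumed,
    with at most f of them faulty.
    """
    if len(replies) < 3 * f + 1:
        raise ValueError("need at least 3f + 1 replies to tolerate f faulty replicas")
    message, count = Counter(replies).most_common(1)[0]
    return message if count >= f + 1 else None

# Usage: four replicas, one of which returns a misleading fault (f = 1).
replies = ["Server: backend unavailable",
           "Server: backend unavailable",
           "Client: malformed request",      # faulty replica
           "Server: backend unavailable"]
accepted = agreed_fault(replies, f=1)
```

With at most f faulty replicas, any message reported f + 1 times is backed by at least one correct replica, which is why that threshold is a common choice in replicated services.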

V. Ramachandran

2012-04-01

114

Reliability Monitoring of Fault Tolerant Control Systems with Demonstration on an Aircraft Model  

Directory of Open Access Journals (Sweden)

This paper proposes a reliability monitoring scheme for active fault tolerant control systems using a stochastic modeling method. The reliability index is defined based on system dynamical responses and a safety region; the plant and controller are assumed to have a multiple regime model structure, and a semi-Markov model is built for reliability evaluation based on the safety behavior of each regime model estimated by using Monte Carlo simulation. Moreover, the history data of fault detection and isolation decisions is used to update its transition characteristics and reliability model. This method provides an up-to-date reliability index as demonstrated on an aircraft model.

Hongbin Li

2007-12-01

115

Design Optimization of Time- and Cost-Constrained Fault-Tolerant Embedded Systems with Checkpointing and Replication  

DEFF Research Database (Denmark)

We present an approach to the synthesis of fault-tolerant hard real-time systems for safety-critical applications. We use checkpointing with rollback recovery and active replication for tolerating transient faults. Processes and communications are statically scheduled. Our synthesis approach decides the assignment of fault-tolerance policies to processes, the optimal placement of checkpoints and the mapping of processes to processors such that multiple transient faults are tolerated and the timing constraints of the application are satisfied. We present several design optimization approaches which are able to find fault-tolerant implementations given a limited amount of resources. The developed algorithms are evaluated using extensive experiments, including a real-life example.

Pop, Paul

2009-01-01

116

Reliability of voting in fault-tolerant software systems for small output spaces  

Science.gov (United States)

Under a voting strategy in a fault-tolerant software system there is a difference between correctness and agreement. An independent N-version programming reliability model is proposed for treating small output spaces which distinguishes between correctness and agreement. System reliability is investigated using analytical relationships and simulation. A consensus majority voting strategy is proposed and its performance is analyzed and compared with other voting strategies. The consensus majority strategy automatically adapts the voting to different component reliability and output space cardinality characteristics. It is shown that the absolute majority voting strategy provides a lower bound on the reliability provided by the consensus majority, and the 2-of-n voting strategy an upper bound. If r is the cardinality of the output space, it is proved that 1/r is a lower bound on the average reliability of fault-tolerant system components below which the system reliability begins to deteriorate as more versions are added.
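
The difference between absolute majority and consensus (largest agreeing group) voting can be shown with a small sketch. The abstract does not say how ties are resolved, so returning no decision on a tie is an assumption here, as are the function names.

```python
from collections import Counter

def absolute_majority(outputs):
    """Return the output produced by more than half of the versions, else None."""
    value, count = Counter(outputs).most_common(1)[0]
    return value if count > len(outputs) / 2 else None

def consensus_majority(outputs):
    """Return the output of the largest agreeing group; None if the two
    largest groups tie (an assumed tie rule)."""
    ranked = Counter(outputs).most_common(2)
    if len(ranked) == 2 and ranked[0][1] == ranked[1][1]:
        return None
    return ranked[0][0]

# Usage: five versions, only two of which agree.
outputs = ["a", "a", "b", "c", "d"]
strict = absolute_majority(outputs)     # no value reaches 3 of 5
relaxed = consensus_majority(outputs)   # "a" is the largest agreeing group
```

Whenever an absolute majority exists the two voters return the same value, so the consensus voter decides in every case the absolute majority voter does and in some additional ones. Whether the chosen value is also correct is a separate question, which is the correctness-versus-agreement distinction the abstract draws.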

McAllister, David F.; Sun, Chien-En; Vouk, Mladen A.

1987-01-01
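
The lower-bound role of absolute majority voting mentioned in the abstract above can be illustrated with a short calculation. The following is a minimal sketch, not the authors' model: it assumes n statistically independent versions, each correct with the same probability p, and counts the system as correct only when more than half of the versions are correct. The function name is hypothetical.

```python
from math import comb


def absolute_majority_reliability(n: int, p: float) -> float:
    """Probability that more than n/2 of n independent versions are correct.

    Assumption (not from the abstract): every version is correct with the
    same probability p, independently of the others.
    """
    threshold = n // 2 + 1  # smallest number of correct versions forming a strict majority
    return sum(
        comb(n, k) * p**k * (1 - p) ** (n - k)
        for k in range(threshold, n + 1)
    )


# Compare component reliabilities below, at, and above 1/2 as versions are added.
for p in (0.4, 0.5, 0.9):
    values = [round(absolute_majority_reliability(n, p), 4) for n in (1, 3, 5, 7)]
    print(f"p={p}: {values}")
```

For a binary output space (r = 2) the threshold 1/r is 0.5: under these assumptions, adding versions lowers this quantity when p is below 0.5 and raises it when p is above 0.5, which is consistent with the deterioration threshold stated in the abstract.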

117

Fault Tolerance Mobile Agent System Using Witness Agent in 2-Dimensional Mesh Network  

Digital Repository Infrastructure Vision for European Research (DRIVER)

Mobile agents are computer programs that act autonomously on behalf of a user or owner and travel through a network of heterogeneous machines. Fault tolerance is important in their itinerary. In this paper, existing methods of fault tolerance for mobile agents, which assume a linear network topology, are described. In these methods, three agents cooperate with one another to detect and recover from server and agent failures. Three types of agents ar...

Ahmad Rostami; Hassan Rashidi; Majidreza Shams Zahraie

2010-01-01

118

Fault tolerant control of spacecraft  

Science.gov (United States)

Autonomous multiple spacecraft formation flying space missions demand the development of reliable control systems to ensure rapid, accurate, and effective response to various attitude and formation reconfiguration commands. Keeping in mind the complexities involved in the technology development to enable spacecraft formation flying, this thesis presents the development and validation of a fault tolerant control algorithm that augments the AOCS on-board a spacecraft to ensure that these challenging formation flying missions will fly successfully. Taking inspiration from the existing theory of nonlinear control, a fault-tolerant control system for the RyePicoSat missions is designed to cope with actuator faults whilst maintaining the desirable degree of overall stability and performance. An autonomous fault tolerant adaptive control scheme for spacecraft equipped with redundant actuators and robust control of spacecraft in underactuated configurations represent the two central themes of this thesis. The developed algorithms are validated using a hardware-in-the-loop simulation. A reaction wheel testbed is used to validate the proposed fault tolerant attitude control scheme. A spacecraft formation flying experimental testbed is used to verify the performance of the proposed robust control scheme for underactuated spacecraft configurations. The proposed underactuated formation flying concept leads to more than 60% savings in fuel consumption when compared to a fully actuated spacecraft formation configuration. We also developed a novel attitude control methodology that requires only a single thruster to stabilize the three-axis attitude and angular velocity components of a spacecraft. Numerical simulations and hardware-in-the-loop experimental results, along with rigorous analytical stability analysis, show that the proposed methodology will greatly enhance the reliability of the spacecraft, while allowing for potentially significant overall mission cost reduction.

Godard

119

Fault tolerant safety related computer based process control system for TAPP-3 and 4  

International Nuclear Information System (INIS)

Computer based control systems for safety related applications in nuclear power plants have to meet not only the functional, performance and interface requirements but also regulatory requirements such as enhanced reliability, safety and security. While meeting these stringent requirements, such computer based systems also need to ensure high availability. Availability of these safety related systems has a direct influence on commercial operation of the NPP and on the availability of several megawatts of electrical power to the national grid. Several design features, such as fault tolerance, on-line diagnostics and self-supervision, are to be incorporated in the computer system architecture, hardware design and software design to meet high reliability and high availability criteria. Reactor Control Division (RCnD) has designed and developed the 'Dual Processor Hot Standby' (DPHS) fault tolerant architecture, which not only meets the safety requirements but also provides very high availability. The fault tolerant features of the DPHS architecture and the design of the Process Control System based on the DPHS architecture (DPHS-PCS) for TAPP-3 and 4 are highlighted in this paper. DPHS-PCS for Tarapur Atomic Power Project (TAPP)-3 and 4 regulates Primary Heat Transport (PHT) system pressure, Pressuriser pressure, Pressuriser level, Bleed condenser pressure, Bleed condenser level and Steam generator pressure. (author)

120

Fault-tolerant communication channel structures  

Science.gov (United States)

Systems and techniques for implementing fault-tolerant communication channels and features in communication systems. Selected commercial-off-the-shelf devices can be integrated in such systems to reduce the cost.

Alkalai, Leon (Inventor); Chau, Savio N. (Inventor); Tai, Ann T. (Inventor)

2006-01-01

121

Fault tolerant synchronization of chaotic systems based on T–S fuzzy model with fuzzy sampled-data controller  

International Nuclear Information System (INIS)

In this paper the fault tolerant synchronization of two chaotic systems based on a fuzzy model and sampled data is investigated. The problem of fault tolerant synchronization is formulated to study the global asymptotic stability of the error system with the fuzzy sampled-data controller, which contains a state feedback controller and a fault compensator. The synchronization can be achieved whether or not the fault occurs. To investigate the stability of the error system and facilitate the design of the fuzzy sampled-data controller, a Takagi–Sugeno (T–S) fuzzy model is employed to represent the chaotic system dynamics. To acquire good performance and produce a less conservative analysis result, a new parameter-dependent Lyapunov–Krasovskii functional and a relaxed stabilization technique are considered. The stability conditions based on linear matrix inequalities are obtained to achieve the fault tolerant synchronization of the chaotic systems. Finally, a numerical simulation is shown to verify the results. (general)

122

Fault Tolerant Control: A Simultaneous Stabilization Result  

DEFF Research Database (Denmark)

This paper discusses the problem of designing fault tolerant compensators that stabilize a given system both in the nominal situation and in the situation where one of the sensors or one of the actuators has failed. It is shown that such compensators always exist, provided that the system is detectable from each output and that it is stabilizable. The proof of this result is constructive, and a worked example shows how to design a fault tolerant compensator for a simple, yet challenging system. A family of second order systems is described that requires fault tolerant compensators of arbitrarily high order. Publication date: FEB

Stoustrup, Jakob; Blondel, V.D.

2004-01-01

123

A Study on Fault-Tolerant Software Architecture for COTS-Based Dependable System  

International Nuclear Information System (INIS)

Recently, with the rapid development of digital computers and information processing technologies, nuclear instrumentation and control (I and C) systems that require safety-critical functions have adopted digital technologies. The use of commercial off-the-shelf (COTS) software in safety-critical systems has also increased, for reasons such as economic efficiency and technical problems. However, it requires considerable integration effort and raises software quality and safety issues. COTS software is usually provided as a black box that cannot be modified. The biggest problem when integrating such a product into dependable systems is the reliability of the COTS software. There is no guarantee that the software will perform its function correctly. It may have bugs or unidentified components. Recently, software verification and validation (V and V) has been accepted as a way to assure the dependability of newly developed safety-critical nuclear I and C software. However, because of the limitations of COTS software, software V and V cannot be applied as rigorously as it is to newly developed software. Considerable attention has been given to describing software architectures with respect to their dependability properties. In this paper, we present a fault-tolerant software architecture using the C2 architectural style. The remainder of the paper is organized as follows: Section 2 discusses background work on COTS software in nuclear I and C, software fault tolerance and the C2 architectural style. Section 3 describes the architecture for fault-tolerant COTS-based software. Finally, we discuss the conclusion and future work.

124

Safety Verification of a Fault Tolerant Reconfigurable Autonomous Goal-Based Robotic Control System  

Science.gov (United States)

Fault tolerance and safety verification of control systems are essential for the success of autonomous robotic systems. A control architecture called Mission Data System (MDS), developed at the Jet Propulsion Laboratory, takes a goal-based control approach. In this paper, a method for converting goal network control programs into linear hybrid systems is developed. The linear hybrid system can then be verified for safety in the presence of failures using existing symbolic model checkers. An example task is simulated in MDS and successfully verified using HyTech, a symbolic model checking software for linear hybrid systems.

Braman, Julia M. B.; Murray, Richard M.; Wagner, David A.

2007-01-01

125

Application of Joint Parameter Identification and State Estimation to a Fault-Tolerant Robot System  

DEFF Research Database (Denmark)

The joint parameter identification and state estimation technique is applied to develop a fault-tolerant space robot system. The potential faults in the considered system are abrupt parametric faults, meaning that some system parameters immediately deviate from their nominal values when a fault happens. The system parameters concerned consist of deterministic parts as well as those describing the stochastic features of the system. For the purpose of reconfigurable control design, these deviated system parameters need to be identified as precisely and quickly as possible. Meanwhile, it would further simplify the reconfigurable design task and possibly speed up the system recovery if the system state information under the new operating circumstances were available along with the faulty parameter information. The joint parameter identification and state estimation using the combined Kalman Filter and Maximum Likelihood (KF-ML) techniques is discussed and applied in this study. The simulation results on a space robot system showed that the proposed method is quite promising in providing both faulty parameter information and state estimation in a quick, accurate and robust manner.

Sun, Zhen; Yang, Zhenyu

2011-01-01

126

Fault tolerance improvement for queuing systems under stress load  

International Nuclear Information System (INIS)

Various kinds of queuing information systems (exchange auction systems, web servers, SCADA) face unpredictable situations during operation, when the information flow that must be analyzed and processed rises sharply. Such stress load situations often require human (dispatcher's or administrator's) intervention, which is why the time of the first denial of service is extremely important. A common queuing system architecture is described. Existing approaches to computing resource management are considered. A new late-first-denial-of-service resource management approach is proposed.

127

Fault-tolerant embedded system design and optimization considering reliability estimation uncertainty  

International Nuclear Information System (INIS)

In this paper, we model embedded system design and optimization, considering component redundancy and uncertainty in the component reliability estimates. The systems being studied consist of software embedded in associated hardware components. Very often, component reliability values are not known exactly. Therefore, for reliability analysis studies and system optimization, it is meaningful to consider component reliability estimates as random variables with associated estimation uncertainty. In this new research, the system design process is formulated as a multiple-objective optimization problem to maximize an estimate of system reliability, and also, to minimize the variance of the reliability estimate. The two objectives are combined by penalizing the variance for prospective solutions. The two most common fault-tolerant embedded system architectures, N-Version Programming and Recovery Block, are considered as strategies to improve system reliability by providing system redundancy. Four distinct models are presented to demonstrate the proposed optimization techniques with or without redundancy. For many design problems, multiple functionally equivalent software versions have failure correlation even if they have been independently developed. The failure correlation may result from faults in the software specification, faults from a voting algorithm, and/or related faults from any two software versions. Our approach considers this correlation in formulating practical optimization models. Genetic algorithms with a dynamic penalty function are applied in solving this optimization problem, and reasonable and interesting results are obtained and discussed

128

Fault-tolerant Agreement in Synchronous Message-passing Systems  

CERN Document Server

The present book focuses on the way to cope with the uncertainty created by process failures (crash, omission failures and Byzantine behavior) in synchronous message-passing systems (i.e., systems whose progress is governed by the passage of time). To that end, the book considers fundamental problems that distributed synchronous processes have to solve. These fundamental problems concern agreement among processes (if processes are unable to agree in one way or another in presence of failures, no non-trivial problem can be solved). They are consensus, interactive consistency, k-set agreement an

Raynal, Michel

2010-01-01

129

Local rollback for fault-tolerance in parallel computing systems  

Science.gov (United States)

A control logic device performs a local rollback in a parallel super computing system. The super computing system includes at least one cache memory device. The control logic device determines a local rollback interval. The control logic device runs at least one instruction in the local rollback interval. The control logic device evaluates whether an unrecoverable condition occurs while running the at least one instruction during the local rollback interval. The control logic device checks whether an error occurs during the local rollback. The control logic device restarts the local rollback interval if the error occurs and the unrecoverable condition does not occur during the local rollback interval.

Blumrich, Matthias A. (Yorktown Heights, NY); Chen, Dong (Yorktown Heights, NY); Gara, Alan (Yorktown Heights, NY); Giampapa, Mark E. (Yorktown Heights, NY); Heidelberger, Philip (Yorktown Heights, NY); Ohmacht, Martin (Yorktown Heights, NY); Steinmacher-Burow, Burkhard (Boeblingen, DE); Sugavanam, Krishnan (Yorktown Heights, NY)

2012-01-24
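
The control flow in the abstract above (run an interval, check for an unrecoverable condition and for an error, restart the interval when only the error occurred) can be sketched in software. This is an illustrative simulation under stated assumptions, not the patented hardware mechanism: the `Outcome` enum, the `run_interval` function, the `max_restarts` limit and the use of a copied dictionary in place of cache state are all hypothetical.

```python
import copy
from enum import Enum
from typing import Callable, Dict, List


class Outcome(Enum):
    OK = 0
    ERROR = 1          # detected error that a local rollback can undo
    UNRECOVERABLE = 2  # condition that cannot be undone locally


Instruction = Callable[[Dict[str, int]], Outcome]


def run_interval(state: Dict[str, int],
                 instructions: List[Instruction],
                 max_restarts: int = 3) -> Dict[str, int]:
    """Run one rollback interval and return the committed state.

    Each attempt works on a private copy of `state`, so discarding the copy
    plays the role of the rollback. The restart limit is an assumption.
    """
    for _ in range(max_restarts + 1):
        working = copy.deepcopy(state)
        outcomes = [instr(working) for instr in instructions]
        if Outcome.UNRECOVERABLE in outcomes:
            raise RuntimeError("unrecoverable condition: local rollback not possible")
        if Outcome.ERROR in outcomes:
            continue  # drop the working copy and restart the interval
        return working  # no error: commit the interval's result
    raise RuntimeError("interval still failing after the allowed restarts")


# Hypothetical instruction that reports an error on its first execution only.
calls = {"n": 0}


def flaky_increment(s: Dict[str, int]) -> Outcome:
    calls["n"] += 1
    s["x"] += 1
    return Outcome.ERROR if calls["n"] == 1 else Outcome.OK


print(run_interval({"x": 0}, [flaky_increment]))
```

In this sketch the first attempt is discarded because of the reported error, and the second attempt starts again from the unchanged input state.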

130

The Isis project: Fault-tolerance in large distributed systems  

Science.gov (United States)

This final status report covers activities of the Isis project during the first half of 1992. During the report period, the Isis effort has achieved a major milestone in its effort to redesign and reimplement the Isis system using Mach and Chorus as target operating system environments. In addition, we completed a number of publications that address issues raised in our prior work; some of these have recently appeared in print, while others are now being considered for publication in a variety of journals and conferences.

Birman, Kenneth P.; Marzullo, Keith

1993-01-01

131

Reliability of computer systems and networks fault tolerance, analysis, and design  

CERN Document Server

With computers becoming embedded as controllers in everything from network servers to the routing of subway schedules to NASA missions, there is a critical need to ensure that systems continue to function even when a component fails. In this book, bestselling author Martin Shooman draws on his expertise in reliability engineering and software engineering to provide a complete and authoritative look at fault tolerant computing. He clearly explains all fundamentals, including how to use redundant elements in system design to ensure the reliability of computer systems and networks. Market: Systems

Shooman, Martin L

2002-01-01

132

Fault tolerant computer control for a Maglev transportation system  

Science.gov (United States)

Magnetically levitated (Maglev) vehicles operating on dedicated guideways at speeds of 500 km/hr are an emerging transportation alternative to short-haul air and high-speed rail. They have the potential to offer a service significantly more dependable than air and with less operating cost than both air and high-speed rail. Maglev transportation derives these benefits by using magnetic forces to suspend a vehicle 8 to 200 mm above the guideway. Magnetic forces are also used for propulsion and guidance. The combination of high speed, short headways, stringent ride quality requirements, and a distributed offboard propulsion system necessitates high levels of automation for the Maglev control and operation. Very high levels of safety and availability will be required for the Maglev control system. This paper describes the mission scenario, functional requirements, and dependability and performance requirements of the Maglev command, control, and communications system. A distributed hierarchical architecture consisting of vehicle on-board computers, wayside zone computers, a central computer facility, and communication links between these entities was synthesized to meet the functional and dependability requirements on the maglev. Two variations of the basic architecture are described: the Smart Vehicle Architecture (SVA) and the Zone Control Architecture (ZCA). Preliminary dependability modeling results are also presented.

Lala, Jaynarayan H.; Nagle, Gail A.; Anagnostopoulos, George

1994-01-01

133

Fault Tolerant Neural Network for ECG Signal Classification Systems  

Directory of Open Access Journals (Sweden)

The aim of this paper is to apply a new robust hardware Artificial Neural Network (ANN) to ECG classification systems. This ANN includes a penalization criterion which improves performance in terms of robustness. Specifically, in this method, the ANN weights are normalized using the auto-prune method. Simulations performed on the MIT-BIH ECG signals have shown that significant robustness improvements are obtained with regard to potential hardware artificial neuron failures. Moreover, we show that the proposed design achieves better generalization performance than the standard back-propagation algorithm.

MERAH, M.

2011-08-01

134

A Survey on Software Fault tolerance in Parallel Computing  

Directory of Open Access Journals (Sweden)

Software almost inevitably contains defects. Two responses are possible: do everything possible to reduce the fault rate, and use fault-tolerance techniques to deal with software faults. Fault tolerance is the ability of a system to perform its function correctly even in the presence of internal faults. Most ordinary systems lack fault-tolerant software. This paper surveys various software fault tolerance techniques and methodologies. The conventional fault tolerant approaches, viz. Recovery Block (RB), N-Version Programming (NVP), etc., are too costly to fit in an ordinary low-cost application system because both RB and NVP rely on multiple (at least three) versions of both software and computing machines.

Jashan Deep

2013-08-01
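
The Recovery Block technique named in the abstract above is commonly described as a primary routine, one or more alternates, and an acceptance test applied to each result. The following is a minimal sketch of that general pattern, not code from the surveyed paper; the function names, the exception class and the sorting example are hypothetical.

```python
import copy
from typing import Any, Callable, Dict, List, Sequence


class AllAlternatesFailed(Exception):
    """Raised when no alternate produces a result that passes the acceptance test."""


def recovery_block(alternates: Sequence[Callable[[Dict[str, Any]], Any]],
                   acceptance_test: Callable[[Any], bool],
                   state: Dict[str, Any]) -> Any:
    """Try each alternate in order on a fresh copy of the input state."""
    for alternate in alternates:
        working = copy.deepcopy(state)  # checkpoint: each alternate starts from the same state
        try:
            result = alternate(working)
        except Exception:
            continue  # treat a crash like a failed acceptance test
        if acceptance_test(result):
            return result
    raise AllAlternatesFailed("no alternate passed the acceptance test")


def faulty_sort(state: Dict[str, Any]) -> List[int]:
    # Hypothetical faulty primary: returns the data unsorted.
    return list(state["data"])


def fallback_sort(state: Dict[str, Any]) -> List[int]:
    # Simpler alternate that relies on the built-in sort.
    return sorted(state["data"])


def is_sorted(values: List[int]) -> bool:
    # Acceptance test: checks ordering only; a fuller test would also
    # check that the output is a permutation of the input.
    return all(a <= b for a, b in zip(values, values[1:]))


print(recovery_block([faulty_sort, fallback_sort], is_sorted, {"data": [3, 1, 2]}))
```

The design choice worth noting is that each alternate runs against a copy of the input, so a faulty primary cannot corrupt the state that the next alternate sees.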

135

An integrated methodology for the dynamic performance and reliability evaluation of fault-tolerant systems  

Energy Technology Data Exchange (ETDEWEB)

We propose an integrated methodology for the reliability and dynamic performance analysis of fault-tolerant systems. This methodology uses a behavioral model of the system dynamics, similar to the ones used by control engineers to design the control system, but also incorporates artifacts to model the failure behavior of each component. These artifacts include component failure modes (and associated failure rates) and how those failure modes affect the dynamic behavior of the component. The methodology bases the system evaluation on the analysis of the dynamics of the different configurations the system can reach after component failures occur. For each of the possible system configurations, a performance evaluation of its dynamic behavior is carried out to check whether its properties, e.g., accuracy, overshoot, or settling time, which are called performance metrics, meet system requirements. Markov chains are used to model the stochastic process associated with the different configurations that a system can adopt when failures occur. This methodology not only enables an integrated framework for evaluating dynamic performance and reliability of fault-tolerant systems, but also enables a method for guiding the system design process, and further optimization. To illustrate the methodology, we present a case-study of a lateral-directional flight control system for a fighter aircraft.

Dominguez-Garcia, Alejandro D. [Department of Electrical and Computer Engineering, University of Illinois at Urbana-Champaign, Urbana, IL 61801-2918 (United States)], E-mail: aledan@UIUC.EDU; Kassakian, John G.; Schindall, Joel E. [Laboratory for Electromagnetic and Electronic Systems, Massachusetts Institute of Technology, Cambridge, MA 02139-4307 (United States); Zinchuk, Jeffrey J. [Charles Stark Draper Laboratory, Cambridge, MA 02139-3563 (United States)

2008-11-15
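
The use of Markov chains over system configurations, as described in the abstract above, can be illustrated with a small numerical example. This is a sketch under stated assumptions and not the authors' methodology: it models a hypothetical system of two redundant components with an assumed constant failure rate, covers only the configuration probabilities (not the per-configuration dynamic performance check), and integrates the chain with a simple explicit Euler scheme.

```python
import math
from typing import List


def transient_probabilities(Q: List[List[float]],
                            p0: List[float],
                            t: float,
                            steps: int = 10_000) -> List[float]:
    """Integrate dp/dt = p Q with explicit Euler steps and return p(t).

    Q is the generator matrix of a continuous-time Markov chain
    (each row sums to zero); p0 is the initial distribution.
    """
    dt = t / steps
    p = list(p0)
    n = len(p)
    for _ in range(steps):
        p = [p[j] + dt * sum(p[i] * Q[i][j] for i in range(n)) for j in range(n)]
    return p


lam = 0.5  # assumed failure rate per component (hypothetical value)

# States: 0 = both components working, 1 = one working, 2 = system failed.
Q = [
    [-2 * lam, 2 * lam, 0.0],
    [0.0, -lam, lam],
    [0.0, 0.0, 0.0],
]

p = transient_probabilities(Q, [1.0, 0.0, 0.0], t=1.0)
reliability = p[0] + p[1]  # probability that the system has not reached the failed state

# Closed form for this specific two-component model, used as a cross-check.
closed_form = 2 * math.exp(-lam) - math.exp(-2 * lam)
print(reliability, closed_form)
```

In the fuller setting the abstract describes, each non-failed state would additionally be checked against performance metrics such as overshoot or settling time before being counted as acceptable.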

136

An integrated methodology for the dynamic performance and reliability evaluation of fault-tolerant systems  

International Nuclear Information System (INIS)

We propose an integrated methodology for the reliability and dynamic performance analysis of fault-tolerant systems. This methodology uses a behavioral model of the system dynamics, similar to the ones used by control engineers to design the control system, but also incorporates artifacts to model the failure behavior of each component. These artifacts include component failure modes (and associated failure rates) and how those failure modes affect the dynamic behavior of the component. The methodology bases the system evaluation on the analysis of the dynamics of the different configurations the system can reach after component failures occur. For each of the possible system configurations, a performance evaluation of its dynamic behavior is carried out to check whether its properties, e.g., accuracy, overshoot, or settling time, which are called performance metrics, meet system requirements. Markov chains are used to model the stochastic process associated with the different configurations that a system can adopt when failures occur. This methodology not only enables an integrated framework for evaluating dynamic performance and reliability of fault-tolerant systems, but also enables a method for guiding the system design process, and further optimization. To illustrate the methodology, we present a case-study of a lateral-directional flight control system for a fighter aircraft

137

Fault-tolerant round-robin A/D converter system  

Science.gov (United States)

A robust A/D converter system that requires much less hardware overhead than traditional modular redundancy approaches is described. A modest amount of oversampling generates information that is exploited to achieve fault tolerance. A generalized likelihood ratio test is used to detect the most likely failure and also to estimate the optimum signal reconstruction. The error detection and correction algorithm reduces to a simple form and requires only a slight amount of hardware overhead. A derivation of the algorithm is presented, and modifications that lead to a realizable system are discussed. The authors then evaluate overall performance through software simulations.

Beckmann, Paul E.; Musicus, Bruce R.

1991-12-01

138

Fault-Tolerant Method in P2P Information Management Systems  

Directory of Open Access Journals (Sweden)

FissionE is a Kautz graph based infrastructure for P2P information management systems. It has the optimal network diameter given node degree d = 2. In order to address the problem of degraded routing performance caused by node failures, in this paper we propose a fault-tolerant routing algorithm for the FissionE system. The basic idea is to bypass a failed node or link by means of a dedicated mechanism, so that FissionE can achieve better routing performance.

??

2012-03-01

139

Fault-Tolerant Process Control Methods and Applications  

CERN Document Server

Fault-Tolerant Process Control focuses on the development of general, yet practical, methods for the design of advanced fault-tolerant control systems; these ensure an efficient fault detection and a timely response to enhance fault recovery, prevent faults from propagating or developing into total failures, and reduce the risk of safety hazards. To this end, methods are presented for the design of advanced fault-tolerant control systems for chemical processes which explicitly deal with actuator/controller failures and sensor faults and data losses. Specifically, the book puts forward:
· a framework for detection, isolation and diagnosis of actuator and sensor faults for nonlinear systems;
· controller reconfiguration and safe-parking-based fault-handling methodologies;
· integrated-data- and model-based fault-detection and isolation and fault-tolerant control methods;
· methods for handling sensor faults and data losses; and
· ...

Mhaskar, Prashant; Christofides, Panagiotis D

2013-01-01

140

The BTeV DAQ and Trigger System - Some throughput, usability and fault tolerance aspects  

International Nuclear Information System (INIS)

As presented at the last CHEP conference, the BTeV triggering and data collection pose a significant challenge in construction and operation, generating 1.5 Terabytes/second of raw data from over 30 million detector channels. We report on facets of the DAQ and trigger farms. We report on the current design of the DAQ, especially its partitioning features to support commissioning of the detector. We are exploring collaborations with computer science groups experienced in fault tolerant and dynamic real-time and embedded systems to develop a system to provide the extreme flexibility and high availability required of the heterogeneous trigger farm (~ten thousand DSPs and commodity processors). We describe directions in the following areas: system modeling and analysis using the Model Integrated Computing approach to assist in the creation of domain-specific modeling, analysis, and program synthesis environments for building complex, large-scale computer-based systems; System Configuration Management to include compilable design specifications for configurable hardware components, schedules, and communication maps; Runtime Environment and Hierarchical Fault Detection/Management, a system-wide infrastructure for rapidly detecting, isolating, filtering, and reporting faults which will be encapsulated in intelligent active entities (agents) to run on DSPs, L2/3 processors, and other supporting processors throughout the system.

141

The BTeV DAQ and trigger system - some throughput, usability and fault tolerance aspects  

International Nuclear Information System (INIS)

As presented at the last CHEP conference, the BTeV triggering and data collection pose a significant challenge in construction and operation, generating 1.5 Terabytes/second of raw data from over 30 million detector channels. The authors report on facets of the DAQ and trigger farms. The authors report on the current design of the DAQ, especially its partitioning features to support commissioning of the detector. The authors are exploring collaborations with computer science groups experienced in fault tolerant and dynamic real-time and embedded systems to develop a system to provide the extreme flexibility and high availability required of the heterogeneous trigger farm (~ten thousand DSPs and commodity processors). The authors describe directions in the following areas: system modeling and analysis using the Model Integrated Computing approach to assist in the creation of domain-specific modeling, analysis, and program synthesis environments for building complex, large-scale computer-based systems; System Configuration Management to include compilable design specifications for configurable hardware components, schedules, and communication maps; Runtime Environment and Hierarchical Fault Detection/Management, a system-wide infrastructure for rapidly detecting, isolating, filtering, and reporting faults which will be encapsulated in intelligent active entities (agents) to run on DSPs, L2/3 processors, and other supporting processors throughout the system.

142

Evaluation of error detection coverage and fault-tolerance of digital plant protection system in nuclear power plants  

International Nuclear Information System (INIS)

Recently, traditional analog-based safety-related instrumentation and control (I and C) systems in nuclear power plants (NPPs) have been replaced with modern digital-based systems. Due to the digitalization of nuclear I and C systems, the safety assessment has become a major issue, as it is crucial to the system's reliability. In the safety assessment of the digitalized system, evaluation of error detection coverage and fault-tolerance are critical factors. For the evaluation, we use C++ based hardware description instead of a board with integrated circuit components. We select the digital plant protection system (DPPS) in NPPs as a target system. Permanent fault is used as a possible fault in the system and some error detection methods are used to detect errors. From the experiment, we confirmed that the proposed approach can evaluate the error detection coverage and the fault-tolerance of DPPS in NPPs

143

State of the art on fault-tolerant real time distributed systems  

International Nuclear Information System (INIS)

The integration of new computerized functions in power plant, and especially nuclear power plant, control and instrumentation systems implies more and more stringent requirements as to communication system reliability. While an item of equipment, or even a computer program, can be validated and qualified, no formal qualification procedure is presently imposed on communication networks. This is certainly due to the relative immaturity of these networks, but also to their complexity. It is for this reason that, in the context of preparation for the future PWR 2000 standardized nuclear plants, it would seem appropriate to take a look at fault-tolerant communication systems. Since C and I type applications (in the control room) are divided between several computers and are required to contend with extremely severe time constraints, EDF has undertaken investigation of fault-tolerant, real time distributed systems. This paper summarizes the state of the art in the field as it appears from discussion with computer manufacturers, academics and research workers on related projects. The results obtained were then used to determine trends as to 'promising' solutions. The paper concludes with recommended study programs for the PCC department of EDF/R and DD for the next few years. (author), 9 figs., 10 refs., 2 annexes

144

Model Checking a Byzantine-Fault-Tolerant Self-Stabilizing Protocol for Distributed Clock Synchronization Systems  

Science.gov (United States)

This report presents the mechanical verification of a simplified model of a rapid Byzantine-fault-tolerant self-stabilizing protocol for distributed clock synchronization systems. This protocol does not rely on any assumptions about the initial state of the system. This protocol tolerates bursts of transient failures, and deterministically converges within a time bound that is a linear function of the self-stabilization period. A simplified model of the protocol is verified using the Symbolic Model Verifier (SMV) [SMV]. The system under study consists of 4 nodes, where at most one of the nodes is assumed to be Byzantine faulty. The model checking effort is focused on verifying correctness of the simplified model of the protocol in the presence of a permanent Byzantine fault as well as confirmation of claims of determinism and linear convergence with respect to the self-stabilization period. Although model checking results of the simplified model of the protocol confirm the theoretical predictions, these results do not necessarily confirm that the protocol solves the general case of this problem. Modeling challenges of the protocol and the system are addressed. A number of abstractions are utilized in order to reduce the state space. Also, additional innovative state space reduction techniques are introduced that can be used in future verification efforts applied to this and other protocols.

Malekpour, Mahyar R.

2007-01-01
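
The 4-node, one-faulty-node configuration in the abstract above matches the classical resilience bound for Byzantine agreement without message authentication, n >= 3f + 1. The abstract does not state this bound itself, so treating it as the reason for the configuration is an assumption. A minimal sketch (the function name is illustrative only):

```python
def max_byzantine_faults(n: int) -> int:
    """Largest f such that n >= 3*f + 1 (classical unauthenticated bound).

    Assumption: this bound is taken from the general Byzantine-agreement
    literature, not from the report summarized above.
    """
    if n < 1:
        raise ValueError("need at least one node")
    return (n - 1) // 3


# A 4-node system can tolerate one Byzantine node under this bound.
print(max_byzantine_faults(4))
```

Under this bound, three nodes tolerate no Byzantine fault at all, which is why four is the smallest interesting system size for such a model.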

145

Filtering and fault tolerant control of parameter-varying time-delay systems and applications  

Science.gov (United States)

This dissertation addresses some open problems in control systems theory. The problems considered include dynamic controller and filter design for Linear Parameter Varying (LPV) time-delay systems, reconfigurable control design in Fault Tolerant Control Systems (FTCS) and fault diagnostics in Diesel engines. In the first part of this thesis, we investigate the problem of designing parameter-dependent filters for output estimation of LPV time-delay systems. The filters are designed such that the filtering error system guarantees an optimum level of H2 or H∞ performance. A state-delay term is included in the filter dynamics to reduce the design conservatism and improve the performance. The Linear Matrix Inequality (LMI)-based synthesis conditions developed for filter design are categorized into rate-dependent and delay-dependent conditions, which can handle the time-varying state-delay and bounded small delay cases, respectively. Of these two, the latter is shown to provide a significant reduction in the conservativeness of the filter design. The second part of the thesis examines the analysis and synthesis of Fault Tolerant Control (FTC) systems in an LPV framework. For reconfigurable control design, the information from the Fault Detection and Isolation (FDI) module, which provides an estimate of the fault parameters, is used to schedule the controller matrices. We also present a formulation that incorporates the detection delay in the FTC supervisory system. It is shown that including this delay in the synthesis conditions leads to improved performance and reduced control effort. For analysis of FTC systems including time-delay, where the fault parameters might be identified inaccurately, we first introduce the notion of brief instability for LPV time-delay systems.
In these systems it is possible that the output trajectory converges to zero even though there are parameter trajectories for which the system is locally unstable for a short period of time. Using the analysis conditions for LPV time-delay systems including brief instability, we develop analysis conditions that lead to an explicit formula indicating how the FTC closed-loop system performance is degraded under false identification of the fault parameters. The results are validated on a model of a Highly Maneuverable Aircraft Technology (HiMAT) vehicle. The last part of this thesis presents a model-based diagnostic algorithm for the detection and estimation of internal leaks and restrictions in the Exhaust Gas Recirculation (EGR) system of Diesel engines. The initial step in the proposed method is the identification of two parameters in a static relationship. As soon as a fault occurs, the identification algorithm provides a change in the coefficients of the static equation. The results of the experimental validation of the diagnostic algorithm are illustrated on data collected from a test cell and using different trucks during the transient cycle. A statistical analysis is also performed to determine the thresholds that capture the normal variability of the healthy system.

Mohammadpour Velni, Javad

146

A multi-layer robust adaptive fault tolerant control system for high performance aircraft  

Science.gov (United States)

Modern high-performance aircraft demand advanced fault-tolerant flight control strategies. Not only control effector failures but also aerodynamic failures such as wing-body damage often result in substantially deteriorated performance because of low available redundancy. As a result, the remaining control actuators may yield substantially lower maneuvering capabilities, which do not permit the accomplishment of the aircraft's originally specified mission. The problem is to solve the control reconfiguration on the available control redundancies when a mission modification is required to save the aircraft. The proposed robust adaptive fault-tolerant control (RAFTC) system consists of a multi-layer reconfigurable flight controller architecture. It contains three layers accounting for different types and levels of failures, including sensor, actuator, and fuselage damage. In nominal operation with possible minor failure(s), a standard adaptive controller achieves the control allocation. This is referred to as the first layer, the controller layer. The performance adjustment is accounted for in the second layer, the reference layer, whose role is to adjust the reference model in the controller design with a degraded transit performance. The uppermost adjustment, that of the mission itself, takes place in the third layer, the mission layer, when the original mission is not feasible with greatly restricted control capabilities. The modified mission is achieved through the optimization of the command signal, which guarantees the boundedness of the closed-loop signals. The main distinguishing feature of this layer is the mission decision property based on the currently available resources. The contribution of the research is the multi-layer fault-tolerant architecture, which can address the full range of failure scenarios and their accommodation in practice.
Moreover, the emphasis is on the mission design capabilities which may guarantee the stability of the aircraft with restricted post-failure control capabilities. The implementation issues of the architecture are also addressed, with possible realizations and the feasibility analysis.

Huo, Ying

147

An empirical comparison of software fault tolerance and fault elimination  

Science.gov (United States)

Reliability is an important concern in the development of software for modern systems. Some researchers have hypothesized that particular fault-handling approaches or techniques are so effective that other approaches or techniques are superfluous. The authors have performed a study that compares two major approaches to the improvement of software, software fault elimination and software fault tolerance, by examining the fault detection obtained by five techniques: run-time assertions, multi-version voting, functional testing augmented by structural testing, code reading by stepwise abstraction, and static data-flow analysis. This study has focused on characterizing the sets of faults detected by the techniques and on characterizing the relationships between these sets of faults. The results of the study show that none of the techniques studied is necessarily redundant to any combination of the others. Further results reveal strengths and weaknesses in the fault detection of the techniques studied and suggest directions for future research.

Shimeall, Timothy J.; Leveson, Nancy G.

1991-01-01

148

Task Migration for Fault-Tolerance in Mixed-Criticality Embedded Systems  

DEFF Research Database (Denmark)

In this paper we are interested in mixed-criticality embedded applications implemented on distributed architectures. Depending on their time-criticality, tasks can be hard or soft real-time, and regarding safety-criticality, tasks can be fault-tolerant to transient faults, to permanent faults, or have no dependability requirements. We use Earliest Deadline First (EDF) scheduling for the hard tasks and the Constant Bandwidth Server (CBS) for the soft tasks. The CBS parameters determine the quality of service (QoS) of soft tasks. Transient faults are tolerated using checkpointing with rollback recovery. For tolerating permanent faults in processors, we use task migration, i.e., restarting the safety-critical tasks on other processors. We propose a greedy online heuristic for the migration of safety-critical tasks in response to permanent faults, and for the adjustment of CBS parameters on the target processors, such that the faults are tolerated, the deadlines for the hard real-time tasks are satisfied and the QoS for soft tasks is maximized. The proposed online adaptive approach has been evaluated using several synthetic benchmarks and a real-life case study.

Saraswat, Prabhat Kumar; Pop, Paul

2009-01-01
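
To illustrate the kind of decision the abstract above describes, the following is a minimal sketch of a greedy migration step guarded by the classical EDF utilization test (total utilization at most 1 on one processor, for independent periodic tasks with deadlines equal to periods). It is not the authors' heuristic: CBS bandwidth adjustment and QoS are left out, and the task model, function names and example numbers are assumptions made for illustration.

```python
from fractions import Fraction


def utilization(tasks):
    """Sum of C/T over (wcet, period) pairs, computed exactly."""
    return sum((Fraction(c, t) for c, t in tasks), Fraction(0))


def edf_schedulable(tasks):
    """Classical EDF test for implicit-deadline periodic tasks: U <= 1."""
    return utilization(tasks) <= 1


def greedy_migrate(task, processors):
    """Pick the least-loaded surviving processor that stays EDF-schedulable.

    Returns the processor index, or None if no processor can host the task.
    Hypothetical helper, not taken from the paper.
    """
    best = None
    for i, tasks in enumerate(processors):
        if not edf_schedulable(tasks + [task]):
            continue
        if best is None or utilization(tasks) < utilization(processors[best]):
            best = i
    return best


# Hypothetical surviving processors, each a list of (wcet, period) tasks.
survivors = [[(2, 4), (1, 4)], [(1, 10)]]
print(greedy_migrate((3, 5), survivors))
```

Using exact fractions avoids rounding problems when a task set sits right at the utilization bound.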

149

Fault-tolerant adaptive control for load-following in static space nuclear power systems  

International Nuclear Information System (INIS)

In this paper the possible use of a dual-loop, model-based adaptive control system for load-following in static space nuclear power systems is investigated. The objective of the fault-tolerant, autonomous control system is to deliver the demanded electric power at the desired voltage level by appropriately manipulating the neutron power through the control drums. As a result, sufficient thermal power is produced to meet the required demand in the presence of dynamically changing system operating conditions and potential sensor failures. The designed controller is proposed for use in combination with the currently considered shunt regulators, or as a back-up controller when other means of power system control, including some of the sensors, fail.

150

Adaptive Fault-Tolerant Control for Time-Varying Failure in High-Speed Train Computer Systems  

Directory of Open Access Journals (Sweden)

We investigate a novel adaptive fault-tolerant control method for time-varying failure in high-speed train computer systems and propose a fault model for such systems. First, the dynamics of high-speed train systems are analyzed and a multiple point-mass model is developed. When actuator outputs deviate from the expected value, a novel adaptive fault-tolerant control method based on Lyapunov stability theory is automatically implemented to compensate for the unknown fault effects and ensure system stability and performance. The effectiveness of the proposed approach is also confirmed through numerical simulations using a train model similar to China Railways High-speed 5.

Tao Tao

2013-12-01

151

Novel neural networks-based fault tolerant control scheme with fault alarm.  

Science.gov (United States)

In this paper, the problem of adaptive active fault-tolerant control for a class of nonlinear systems with unknown actuator fault is investigated. The actuator fault is assumed to have no traditional affine appearance of the system state variables and control input. The useful property of the basis function of the radial basis function neural network (NN), which will be used in the design of the fault-tolerant controller, is explored. Based on the analysis of the design of normal and passive fault-tolerant controllers, and by using the implicit function theorem, a novel NN-based active fault-tolerant control scheme with fault alarm is proposed. Compared with results in the literature, the fault-tolerant control scheme can minimize the time delay between fault occurrence and accommodation, known as the time delay due to fault diagnosis, and reduce the adverse effect on system performance. In addition, the FTC scheme has the advantages of a passive fault-tolerant control scheme as well as the properties of the traditional active fault-tolerant control scheme. Furthermore, the fault-tolerant control scheme requires no additional fault detection and isolation model, which is necessary in the traditional active fault-tolerant control scheme. Finally, simulation results are presented to demonstrate the efficiency of the developed techniques. PMID:25014982

Shen, Qikun; Jiang, Bin; Shi, Peng; Lim, Cheng-Chew

2014-11-01

152

Energy-Aware Fault Tolerance in Hard Real-Time Embedded Systems  

Directory of Open Access Journals (Sweden)

Energy consumption of electronic devices has become a serious concern in recent years. Energy efficiency is necessary to lengthen the battery lifetime in portable systems, as well as to reduce the operational costs and the environmental impact of stationary systems. Dynamic power management (DPM) algorithms aim to reduce the energy consumption at the system level by selectively placing components into low-power states. Dynamic voltage scaling (DVS) algorithms reduce energy consumption by changing processor speed and voltage at run-time depending on the needs of the applications running. The proposed method is extended by integrating the DPM model with the DVS algorithm, thus enabling larger energy savings. The proposed methods are (i) the Postponement method and (ii) the Hybrid method. Fault tolerance is also achieved by increasing transistor density and decreasing supply voltage.

S.Subha

2012-07-01

153

Fault tolerant integrated inertial navigation/global positioning systems for next generation spacecraft  

Science.gov (United States)

The authors address the requirements, benefits, and mitigation of risks to adapt a commercial Hexad fault-tolerant inertial navigation/global positioning system (FT IN/GPS) for use in next-generation spacecraft. Next-generation requirements are examined to determine whether a high production base system can meet autonomous, reliable, and low-cost requirements for future spacecraft. The major benefits are the combining and replacement of functions, the reduction of unscheduled maintenance and operations costs, and a higher probability of mission success. The design, development, and production risks are mitigated by the long-term commercial production schedule for the Boeing 777 air data inertial reference unit (ADIRU) which begins in the mid-1990s. The conclusion is that a strapdown ring laser gyro (RLG) Hexad FT IN/GPS is the preferred integrated navigation and control system for next-generation vehicles.

Miller, Hugh; Hilts, David A.

154

Fault tolerant control based on active fault diagnosis  

DEFF Research Database (Denmark)

An active fault diagnosis (AFD) method will be considered in this paper in connection with a Fault Tolerant Control (FTC) architecture based on the YJBK parameterization of all stabilizing controllers. The architecture consists of a fault diagnosis (FD) part and a controller reconfiguration (CR) part. The FTC architecture can be applied for additive faults, parametric faults, and for system structural changes. Only parametric faults will be considered in this paper. The main focus in this paper is on the use of the new approach of active fault diagnosis in connection with FTC. The active fault diagnosis approach is based on including an auxiliary input in the system. A fault signature matrix is introduced in connection with AFD, given as the transfer function from the auxiliary input to the residual output. This can be considered as a generalization of the passive fault diagnosis case, where the diagnosis is only based on a residual vector. The fault diagnosis is then derived by on-line tests by using the residual vector.

Niemann, Hans Henrik

2005-01-01

155

GRID COMPUTING AND FAULT TOLERANCE APPROACH  

Directory of Open Access Journals (Sweden)

Grid computing is a means of allocating the computational power of a large number of computers to a complex, difficult computation or problem. Grid computing is a distributed computing paradigm that differs from traditional distributed computing in that it is aimed toward large-scale systems that even span organizational boundaries. This paper proposes a method to achieve maximum fault tolerance in the grid environment by taking reliability into consideration, using a replication approach and a checkpoint approach. Fault tolerance is an important property for large-scale computational grid systems, where geographically distributed nodes co-operate to execute a task. In order to achieve a high level of reliability and availability, the grid infrastructure should be foolproof and fault tolerant. Since the failure of resources affects job execution fatally, a fault tolerance service is essential to satisfy QoS requirements in grid computing. Commonly utilized techniques for providing fault tolerance are job checkpointing and replication. Both techniques mitigate the amount of work lost due to changing system availability but can introduce significant runtime overhead. The latter largely depends on the length of the checkpointing interval and the chosen number of replicas, respectively. In complex scientific workflows, where tasks execute in a well-defined order, reliability is another major challenge because of the unreliable nature of grid resources.

Pankaj Gupta

2011-10-01
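
The dependence of lost work on the checkpointing interval, mentioned in the abstract above, can be sketched with a toy simulation. This is only an illustrative model under stated assumptions: work is counted in unit steps, failures are injected at chosen execution counts, a failure rolls progress back to the last checkpoint, and the cost of taking a checkpoint is ignored. The function name and parameters are hypothetical and do not come from the paper.

```python
def run_with_checkpoints(total_steps, interval, fail_at=()):
    """Return how many steps are executed (including redone work).

    total_steps: useful steps the job needs to complete.
    interval:    a checkpoint is saved every `interval` completed steps.
    fail_at:     execution counts at which a failure is injected
                 (hypothetical fault-injection schedule).
    """
    failures = set(fail_at)
    done = 0       # useful progress so far
    saved = 0      # progress recorded in the last checkpoint
    executed = 0   # all steps attempted, including lost ones
    while done < total_steps:
        executed += 1
        if executed in failures:
            done = saved          # roll back to the last checkpoint
            continue
        done += 1
        if done % interval == 0:
            saved = done          # checkpoint (assumed to cost nothing)
    return executed


# Same failure, two checkpoint intervals: the shorter one loses less work.
print(run_with_checkpoints(10, 5, fail_at=[8]))
print(run_with_checkpoints(10, 2, fail_at=[8]))
```

A real system would also charge a cost for each checkpoint, which is what produces the overhead trade-off the abstract refers to.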

156

A Fault-Tolerant Emergency-Aware Access Control Scheme for Cyber-Physical Systems  

CERN Document Server

Access control is an issue of paramount importance in cyber-physical systems (CPS). In this paper, an access control scheme, namely FEAC, is presented for CPS. FEAC can not only provide the ability to control access to data in normal situations, but also adaptively assign emergency roles and permissions to specific subjects and inform subjects without explicit access requests to handle emergency situations in a proactive manner. In FEAC, emergency-group and emergency-dependency are introduced. Emergencies are processed in sequence within a group and in parallel among groups. A priority and dependency model called PD-AGM is used to select the optimal response-action execution path, aiming to eliminate all emergencies that have occurred within the system. Fault-tolerant access control policies are used to address failure in emergency management. A case study of a hospital medical care application shows the effectiveness of FEAC.

Wu, Guowei; Xia, Feng; Yao, Lin

2012-01-01

157

A Constraint Logic Programming Framework for the Synthesis of Fault-Tolerant Schedules for Distributed Embedded Systems  

DEFF Research Database (Denmark)

We present a constraint logic programming (CLP) approach for the synthesis of fault-tolerant hard real-time applications on distributed heterogeneous architectures. We address time-triggered systems, where processes and messages are statically scheduled based on schedule tables. We use process re-execution for recovering from multiple transient faults. We propose three scheduling approaches, each of which presents a trade-off between schedule simplicity and performance, (i) full transparency, (ii) slack sharing and (iii) conditional, and which provide various degrees of transparency. We have developed a CLP framework that produces the fault-tolerant schedules, guaranteeing schedulability in the presence of transient faults. We show how the framework can be used to tackle design optimization problems. The proposed approach has been evaluated using extensive experiments.

Pop, Paul

2007-01-01

158

System-level fault-tolerance in large-scale parallel machines with buffered coscheduling  

Energy Technology Data Exchange (ETDEWEB)

As the number of processors for multi-teraflop systems grows to tens of thousands, with proposed petaflops systems likely to contain hundreds of thousands of processors, the assumption of fully reliable hardware has been abandoned. Although the mean time between failures for the individual components can be very high, the large total component count will inevitably lead to frequent failures. It is therefore of paramount importance to develop new software solutions to deal with the unavoidable reality of hardware faults. In this paper we will first describe the nature of the failures of current large-scale machines, and extrapolate these results to future machines. Based on this preliminary analysis we will present a new technology that we are currently developing, buffered coscheduling, which seeks to implement fault tolerance at the operating system level. Major design goals include dynamic reallocation of resources to allow continuing execution in the presence of hardware failures, very high scalability, high efficiency (low overhead), and transparency, requiring no changes to user applications. Preliminary results show that this is attainable with current hardware.

Petrini, F. (Fabrizio); Davis, Kei; Sancho, J. C. (Jose Carlos)

2004-01-01
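
The scaling argument in the abstract above (high per-component MTBF, yet frequent system failures) can be made concrete with a back-of-the-envelope calculation. The sketch assumes independent components with exponentially distributed failures and a system that counts any single component failure as a failure, in which case failure rates add and the system MTBF is the component MTBF divided by the component count. The numbers are hypothetical and are not taken from the paper.

```python
def system_mtbf_hours(component_mtbf_hours: float, n_components: int) -> float:
    """System MTBF under independent, exponentially distributed failures.

    Assumption: any single component failure counts as a system failure,
    so the system failure rate is n times the component failure rate.
    """
    if n_components < 1:
        raise ValueError("need at least one component")
    return component_mtbf_hours / n_components


# Hypothetical figures: a one-million-hour component MTBF (over a century)
# across 100,000 components gives a system failure roughly every 10 hours.
print(system_mtbf_hours(1_000_000, 100_000))
```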

159

Dynamic Distributed Intrusion Detection System Based on Mobile Agents with Fault Tolerance  

Directory of Open Access Journals (Sweden)

Problem statement: In earlier approaches, each individual system had its own IDS, a technique with many drawbacks, particularly on the network side. Intrusion detection covers "all processes used in discovery of unauthorized uses of network or computer devices": it detects unusual and abnormal activity or events in real time, and detects break-ins or attacks through various data sources such as logs, audit and surveillance data, and network traffic. Approach: The Intrusion Detection System (IDS) has the objective of identifying individuals who try to use a system in an unauthorized way, or who are authorized to use it but abuse their privileges. This study proposes the Dynamic Distributed Intrusion Detection System (DDIDS) to improve system processing and system networking. Results: The network plays a very important role in connecting every system. The experiments indicate that the enhanced agent-based intrusion detection system achieves improved detection performance with fault tolerance. Conclusion: The main aim of this study is to design and develop a dynamic distributed intrusion detection system that is accurate, low in false alarms, not easily fooled by small variations in pattern, adaptive and real-time, and that also increases system efficiency and network efficiency.

D. Manjula

2012-01-01

160

Reversible Fault-Tolerant Logic  

CERN Document Server

It is now widely accepted that the CMOS technology implementing irreversible logic will hit a scaling limit beyond 2016, and that the increased power dissipation is a major limiting factor. Reversible computing can potentially require arbitrarily small amounts of energy. Recently several nano-scale devices which have the potential to scale, and which naturally perform reversible logic, have emerged. This paper addresses several fundamental issues that need to be addressed before any nano-scale reversible computing systems can be realized, including reliability and performance trade-offs and architecture optimization. Many nano-scale devices will be limited to only near neighbor interactions, requiring careful optimization of circuits. We provide efficient fault-tolerant (FT) circuits when restricted to both 2D and 1D. Finally, we compute bounds on the entropy (and hence, heat) generated by our FT circuits and provide quantitative estimates on how large can we make our circuits before we lose any advantage ove...

Boykin, P O; Roychowdhury, Vwani P.

2005-01-01

161

Fault tolerant operation of switched reluctance machine  

Science.gov (United States)

The energy crisis and environmental challenges have driven industry towards more energy-efficient solutions. With nearly 60% of electricity consumed by various electric machines in the industry sector, advancement in the efficiency of the electric drive system is of vital importance. Adjustable speed drive systems (ASDS) provide excellent speed regulation and dynamic performance as well as dramatically improved system efficiency compared with conventional motors without electronic drives. Industry has witnessed tremendous growth in ASDS applications, not only as a driving force but also as an electric auxiliary system replacing bulky and low-efficiency auxiliary hydraulic and mechanical systems. With the vast penetration of ASDS, fault-tolerant operation capability is more widely recognized as an important feature of drive performance, especially for aerospace and automotive applications and other industrial drive applications demanding high reliability. The Switched Reluctance Machine (SRM), a low-cost, highly reliable electric machine with fault-tolerant operation capability, has drawn substantial attention in the past three decades. Nevertheless, the SRM is not free of faults. Certain faults, such as converter faults, sensor faults, winding shorts, eccentricity and position sensor faults, are common to all ASDS. In this dissertation, a thorough understanding of various faults and their influence on the transient and steady-state performance of the SRM is developed via simulation and experimental study, providing the necessary knowledge for fault detection and post-fault management. Lumped parameter models are established for fast real-time simulation and drive control. Based on the behavior of the faults, a fault detection scheme is developed for the purpose of fast and reliable fault diagnosis.
In order to improve the SRM power and torque capacity under faults, the maximum torque per ampere excitation is conceptualized and validated through theoretical analysis and experiments. With the proposed optimal waveform, torque production is greatly improved under the same Root Mean Square (RMS) current constraint. Additionally, position sensorless operation methods under phase faults are investigated to account for the combination of physical position sensor and phase winding faults. A comprehensive solution for position sensorless operation under single and multiple phase faults is proposed and validated through experiments. Continuous position sensorless operation with seamless transition between various numbers of phase faults is achieved.

Wang, Wei

162

Quantum Error Correction and Fault-Tolerance  

CERN Document Server

I give an overview of the basic concepts behind quantum error correction and quantum fault tolerance. This includes the quantum error correction conditions, stabilizer codes, CSS codes, transversal gates, fault-tolerant error correction, and the threshold theorem.

Gottesman, D

2005-01-01

163

Fault Tolerant Homopolar Magnetic Bearings  

Science.gov (United States)

Magnetic suspensions (MS) satisfy the long life and low loss conditions demanded by satellite and ISS based flywheels used for Energy Storage and Attitude Control (ACESE) service. This paper summarizes the development of a novel MS that improves reliability via fault tolerant operation. Specifically, flux coupling between poles of a homopolar magnetic bearing is shown to deliver desired forces even after termination of coil currents to a subset of failed poles. Linear, coordinate decoupled force-voltage relations are also maintained before and after failure by bias linearization. Current distribution matrices (CDMs) which adjust the currents and fluxes following a pole set failure are determined for many faulted pole combinations. The CDMs and the system responses are obtained utilizing 1D magnetic circuit models with fringe and leakage factors derived from detailed, 3D, finite element field models. Reliability results are presented vs. detection/correction delay time and individual power amplifier reliability for 4, 6, and 7 pole configurations. Reliability is shown for two success criteria, i.e. (a) no catcher bearing contact following pole failures and (b) re-levitation off of the catcher bearings following pole failures. An advantage of the method presented over other redundant operation approaches is a significantly reduced requirement for backup hardware such as additional actuators or power amplifiers.

Li, Ming-Hsiu; Palazzolo, Alan; Kenny, Andrew; Provenza, Andrew; Beach, Raymond; Kascak, Albert

2003-01-01

164

A Fault-Tolerant Modulation Method to Counteract the Double Open-Switch Fault in Matrix Converter Drive Systems without Redundant Power Devices  

DEFF Research Database (Denmark)

This paper studies the double open-switch fault occurring within the conventional matrix converter driving a three-phase permanent-magnet synchronous motor system and proposes a fault-tolerant solution by introducing a revised modulation strategy. In this switching strategy, the rectifier-stage modulation is adjusted based on knowledge of the switching logic of the inverter stage and the operating input voltage sectors. Notably, the proposed fault-tolerant method does not rely on the assistance of any redundant power devices or on any reconfiguration of the matrix converter circuit by means of redundant physical connections. It is shown that different locations of the double open-switch fault affect the availability of the revised modulation. The steady-state absolute speed error achieved with the proposed method is 4% of the nominal speed. Experimental results are presented to demonstrate the efficacy of the proposed methods.

Nguyen-Duy, Khiem; Andersen, Michael A. E.

2012-01-01

165

A Novel N-Input Voting Algorithm for X-by-Wire Fault-Tolerant Systems  

Science.gov (United States)

Voting is an important operation in the multichannel computation paradigm and in the realization of ultrareliable, real-time control systems; it arbitrates among the results of N redundant variants. These systems include N-modular redundant (NMR) hardware systems and diversely designed software systems based on N-version programming (NVP). Depending on the characteristics of the application and the type of voter selected, voting algorithms can be implemented for either hardware or software systems. In this paper, a novel voting algorithm is introduced for real-time fault-tolerant control systems, appropriate for applications in which N is large. Its behavior has then been implemented in software under different scenarios of error injection on the system inputs. The evaluation results, analyzed through plots and statistical computations, demonstrate that this novel algorithm does not have the limitations of some popular voting algorithms such as median and weighted; moreover, it is able to significantly increase the reliability and availability of the system in the best case to 2489.7% and 626.74%, respectively, and in the worst case to 3.84% and 1.55%, respectively.

Karimi, Abbas; Zarafshan, Faraneh; Al-Haddad, S. A. R.; Ramli, Abdul Rahman

2014-01-01
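
For readers unfamiliar with voters, the following minimal sketch shows two classical baselines, a median voter and an exact-match majority voter, arbitrating among N redundant results. It does not reproduce the novel algorithm of the paper above, whose details the abstract does not give; the function names and sample values are illustrative assumptions.

```python
from collections import Counter
from statistics import median


def median_vote(values):
    """Median voter: masks outliers as long as most inputs are sane."""
    return median(values)


def majority_vote(values):
    """Exact-match majority voter for a non-empty list of results.

    Returns the value reported by more than half of the variants,
    or None when no strict majority exists.
    """
    value, count = Counter(values).most_common(1)[0]
    return value if count > len(values) / 2 else None


# Three redundant channels, one of them faulty (hypothetical readings).
print(median_vote([10.0, 10.1, 99.0]))
print(majority_vote([1, 1, 7]))
```

The majority voter refuses to answer when the variants disagree completely, whereas the median voter always produces an output; which behavior is preferable depends on whether the application can tolerate a missing result.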

166

Fault Tolerant Analysis For Holonic Manufacturing Systems Based On Collaborative Petri Nets  

Directory of Open Access Journals (Sweden)

Uncertainties are significant characteristics of today's manufacturing systems. Holonic manufacturing systems are new paradigms to handle uncertainties and changes in manufacturing environments. Among many sources of uncertainty, failure-prone machines are one of the most important. This paper focuses on handling machine failures in holonic manufacturing systems. Machine failure reduces the number of available resources. Feasibility analysis needs to be conducted to check whether the works in process can be completed. To facilitate feasibility analysis, we characterize feasible conditions for systems with failure-prone machines. This paper combines the flexibility and robustness of multi-agent theory with the modeling and analytical power of Petri nets to adaptively synthesize Petri net agents to control holonic manufacturing systems. The main results include: (1) a collaborative Petri net (CPN) agent model for holonic manufacturing systems, (2) a feasibility condition to test whether a certain type of machine failure is allowed, based on collaborative Petri net agents, and (3) fault tolerant analysis of the proposed method.

Fu-Shiung Hsieh

2003-04-01

167

Fault-tolerant Control of Discrete-time LPV systems using Virtual Actuators and Sensors  

DEFF Research Database (Denmark)

This paper proposes a new fault-tolerant control (FTC) method for discrete-time linear parameter varying (LPV) systems using a reconfiguration block. The basic idea of the method is to achieve the FTC goal without re-designing the nominal controller by inserting a reconfiguration block between the plant and the nominal controller. The reconfiguration block is realized by an LPV virtual actuator and an LPV virtual sensor. Its goal is to transform the signals from the faulty system such that its behavior is similar to that of the nominal system from the viewpoint of the controller. Furthermore, it transforms the output of the controller for the faulty system such that the stability and performance goals are preserved. Input-to-state stabilizing LPV gains of the virtual actuator and sensor are obtained by solving linear matrix inequalities (LMIs). We show that separate design of these gains guarantees the input-to-state stability (ISS) of the closed-loop reconfigured system. Moreover, we obtain performances in terms of the ISS gains for the virtual actuator, the virtual sensor and their interconnection. Minimizing these performances is formulated as convex optimization problems subject to LMI constraints. Finally, the effectiveness of the method is demonstrated via a numerical example and stator current control of an induction motor.

Tabatabaeipour, Mojtaba; Stoustrup, Jakob

2014-01-01

168

Fault-tolerant control of discrete-time LPV systems using virtual actuators and sensors  

DEFF Research Database (Denmark)

A new fault-tolerant control (FTC) method for discrete-time linear parameter varying (LPV) systems is proposed using a reconfiguration block. As such, FTC can be achieved without redesigning the nominal controller. This involves the insertion of a reconfiguration block between the plant and the nominal controller. The reconfiguration block is realized by an LPV virtual actuator and an LPV virtual sensor. It transforms the signals from the faulty system such that its behavior is similar to that of the nominal system from the viewpoint of the controller. Also, it transforms the output of the controller for the faulty system such that the stability and performance are preserved. Input-to-state stabilizing LPV gains of the virtual actuator and sensor are obtained by solving LMIs. We show that separate design of these gains guarantees the input-to-state stability (ISS) of the closed-loop reconfigured system. Moreover, we obtain performances in terms of the ISS gains for the virtual actuator, the virtual sensor, and their interconnection.

Tabatabaeipour, S. Mojtaba; Stoustrup, Jakob

2014-01-01

169

Scheduling of Fault-Tolerant Embedded Systems with Soft and Hard Timing Constraints  

DEFF Research Database (Denmark)

In this paper we present an approach to the synthesis of fault-tolerant schedules for embedded applications with soft and hard real-time constraints. We aim to guarantee the deadlines for the hard processes even in the case of faults, while maximizing the overall utility. We use time/utility functions to capture the utility of soft processes. Process re-execution is employed to recover from multiple faults. A single static schedule computed off-line is not fault tolerant and is pessimistic in terms of utility, while a purely online approach, which computes a new schedule every time a process fails or completes, incurs an unacceptable overhead. Thus, we use a quasi-static scheduling strategy, where a set of schedules is synthesized off-line and, at run time, the scheduler selects the right schedule based on the occurrence of faults and the actual execution times of processes. The proposed schedule synthesis heuristics have been evaluated using extensive experiments.

Pop, Paul

2008-01-01

170

Scheduling and Voltage Scaling for Energy/Reliability Trade-offs in Fault-Tolerant Time-Triggered Embedded Systems  

DEFF Research Database (Denmark)

In this paper we present an approach to the scheduling and voltage scaling of low-power fault-tolerant hard real-time applications mapped on distributed heterogeneous embedded systems. Processes and messages are statically scheduled, and we use process re-execution for recovering from multiple transient faults. Addressing simultaneously energy and reliability is especially challenging because lowering the voltage to reduce the energy consumption has been shown to exponentially increase the number of transient faults. In addition, time-redundancy based fault-tolerance techniques such as re-execution and dynamic voltage scaling-based low-power techniques are competing for the slack in the schedules. Our approach decides the voltage levels and start times of processes and the transmission times of messages, such that the transient faults are tolerated, the timing constraints of the application are satisfied and the energy is minimized. We present a constraint logic programming-based approach which is able to find reliable and schedulable implementations within limited energy and hardware resources. The developed algorithms have been evaluated using extensive experiments.

Pop, Paul

2007-01-01

171

Fault tolerant capabilities of the Cosmic Background Explorer attitude control system  

Science.gov (United States)

The Cosmic Background Explorer (COBE), which was launched November 18, 1989 from Vandenberg Air Force Base aboard a Delta rocket, has been classified by the scientific community as a major success with regards to the field of cosmology theory. Despite a number of anomalies which have occurred during the mission, the attitude control system (ACS) has performed remarkably well. This is due in large part to the fault tolerant capabilities that were designed into the ACS. A unique triaxial control system orientated in the spacecraft's transverse plane provides the ACS the ability to safely survive various sensor and actuator failures. Features that help to achieve this fail-operational system include component cross-strapping and autonomous control electronics switching. This design philosophy was of utmost importance because of the constraint placed upon the ACS to keep the spinning observatory and its cryogen-cooled science instruments pointing away from the sun. Even though the liquid helium was depleted within the expected twelve months from launch, it is still very much desirable to avoid any thermal disturbances upon the remaining functional instruments.

Placanica, Samuel J.

1992-01-01

172

Heap Base Coordinator Finding with Fault Tolerant Method in Distributed Systems  

Directory of Open Access Journals (Sweden)

Coordinator finding in wireless networks is a very important problem, and it is solved by suitable algorithms. The main goals of coordinator finding are synchronizing the processes and making optimal use of the resources. Many different algorithms have been presented for coordinator finding. The most important leader election algorithms are the Bully and Ring algorithms. In this paper we analyze and compare these algorithms with each other, and we propose a new heap-based approach with fault tolerant mechanisms for coordinator finding in a wireless environment. Our algorithm's running time and message complexity compare favorably with existing algorithms. Our work involves substantial modifications of an existing algorithm and its proof, and we adapt the existing algorithms to the noisy environment based on fault tolerant mechanisms.

Mehdi EffatParvar

2011-07-01
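
The Bully algorithm mentioned in the abstract above, and the general idea of keeping candidate coordinators in a heap, can be sketched as below. This is a simplified single-machine simulation under stated assumptions, not the paper's algorithm: message passing is abstracted away, and `bully_election` and `CoordinatorHeap` are hypothetical names introduced for illustration.

```python
import heapq
from typing import Iterable, List, Optional, Set


def bully_election(initiator: int, alive: Set[int]) -> int:
    """Simulate a Bully election: the live process with the highest id wins."""
    if initiator not in alive:
        raise ValueError("the initiator must be a live process")
    current = initiator
    while True:
        higher = [p for p in alive if p > current]
        if not higher:
            return current  # nobody with a higher id answered: current wins
        current = min(higher)  # a higher live process takes over the election


class CoordinatorHeap:
    """Max-heap of process ids with lazy removal of failed processes
    (an assumed data structure, shown only to illustrate the heap idea)."""

    def __init__(self, ids: Iterable[int]) -> None:
        self._heap: List[int] = [-i for i in ids]  # negate ids for a max-heap
        heapq.heapify(self._heap)
        self._failed: Set[int] = set()

    def mark_failed(self, pid: int) -> None:
        self._failed.add(pid)

    def coordinator(self) -> Optional[int]:
        # Discard failed processes sitting at the top of the heap.
        while self._heap and -self._heap[0] in self._failed:
            heapq.heappop(self._heap)
        return -self._heap[0] if self._heap else None
```

With the heap, finding the next coordinator after a failure is a local pop rather than a full round of election messages, which is the kind of saving a heap-based scheme can aim for.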

173

Fault Tolerant External Memory Algorithms  

DEFF Research Database (Denmark)

Algorithms dealing with massive data sets are usually designed for I/O-efficiency, often captured by the I/O model by Aggarwal and Vitter. Another aspect of dealing with massive data is how to deal with memory faults, e.g. captured by the adversary based faulty memory RAM by Finocchi and Italiano. However, current fault tolerant algorithms do not scale beyond the internal memory. In this paper we investigate for the first time the connection between I/O-efficiency in the I/O model and fault tolerance in the faulty memory RAM, and we assume that both memory and disk are unreliable. We show a lower bound on the number of I/Os required for any deterministic dictionary that is resilient to memory faults. We design a static and a dynamic deterministic dictionary with optimal query performance as well as an optimal sorting algorithm and an optimal priority queue. Finally, we consider scenarios where only cells in memory or only cells on disk are corruptible and separate randomized and deterministic dictionaries in the latter.

Jørgensen, Allan Grønlund; Brodal, Gerth Stølting

2009-01-01

174

Mission reliability analysis of fault-tolerant multiple-phased systems  

International Nuclear Information System (INIS)

Fault-tolerant multiple-phased systems (FTMPS) are defined as systems whose critical components are independently replicated and whose operational life can be partitioned into a set of disjoint periods, called 'phases'. Because of their deployment in critical applications, their mission reliability analysis is a task of primary relevance to validate the designs. This paper is focused on the reliability analysis of FTMPS with random phase durations, non-exponentially distributed repair activities and different repair policies. For self-repairable FTMPS with a component-level reconfiguration architecture, we derive several efficient formulations from the underlying structure characteristics for their intraphase behavior analysis. We also present a uniform solution framework of the mission reliability for FTMPS with generally distributed phase durations. Compared with existing methods based on deterministic and stochastic Petri nets or Markov regenerative stochastic Petri nets, our approach is simpler in concept and more powerful in computation. Two examples of FTMPS are analyzed to illustrate the advantages of our approach.

175

Mission reliability analysis of fault-tolerant multiple-phased systems  

Energy Technology Data Exchange (ETDEWEB)

Fault-tolerant multiple-phased systems (FTMPS) are defined as systems whose critical components are independently replicated and whose operational life can be partitioned into a set of disjoint periods, called 'phases'. Because of their deployment in critical applications, their mission reliability analysis is a task of primary relevance to validate the designs. This paper is focused on the reliability analysis of FTMPS with random phase durations, non-exponentially distributed repair activities and different repair policies. For self-repairable FTMPS with a component-level reconfiguration architecture, we derive several efficient formulations from the underlying structure characteristics for their intraphase behavior analysis. We also present a uniform solution framework of the mission reliability for FTMPS with generally distributed phase durations. Compared with existing methods based on deterministic and stochastic Petri nets or Markov regenerative stochastic Petri nets, our approach is simpler in concept and more powerful in computation. Two examples of FTMPS are analyzed to illustrate the advantages of our approach.

Mo Yuchang [Harbin Institute of Technology, Harbin, Heilongjiang 150001 (China)], E-mail: myc@ftcl.hit.edu.cn; Siewiorek, Daniel [Department of Computer Science and Electrical and Computer Engineering, Carnegie Mellon University, Pittsburgh, PA 15213 (United States); Yang Xiaozong [Harbin Institute of Technology, Harbin, Heilongjiang 150001 (China)

2008-07-15

176

Fault-Tolerant, Real-Time, Multi-Core Computer System  

Science.gov (United States)

A document discusses a fault-tolerant, self-aware, low-power, multi-core computer for space missions with thousands of simple cores, achieving speed through concurrency. The proposed machine decides how to achieve concurrency in real time, rather than depending on programmers. The driving features of the system are simple hardware that is modular in the extreme, with no shared memory, and software with significant runtime reorganizing capability. The document describes a mechanism for moving ongoing computations and data that is based on a functional model of execution. Because there is no shared memory, the processor connects to its neighbors through a high-speed data link. Messages are sent to a neighbor switch, which in turn forwards that message on to its neighbor until reaching the intended destination. Except for the neighbor connections, processors are isolated and independent of each other. The processors on the periphery also connect chip-to-chip, thus building up a large processor net. There is no particular topology to the larger net, as a function at each processor allows it to forward a message in the correct direction. Some chip-to-chip connections are not necessarily nearest neighbors, providing short cuts for some of the longer physical distances. The peripheral processors also provide the connections to sensors, actuators, radios, science instruments, and other devices with which the computer system interacts.

Gostelow, Kim P.

2012-01-01

177

A Direct Design from Input/Output Data of Fault-Tolerant Control System Based on GIMC Structure  

Science.gov (United States)

This paper deals with a design method for fault-tolerant control systems based on the Generalized Internal Model Control (GIMC) structure, which consists of a standard outer loop feedback controller and an extra inner loop controller. The distinguishing feature of the GIMC structure is that the controller design for performance and robustness may be done separately. The outer loop controller is designed for nominal performance using some controller synthesis to meet the (nominal) control specification, while the inner loop controller is designed to make a trade-off between robustness and performance. This feature is suitable for fault-tolerant control. The outer loop controller is designed for the fault-free case, and the inner loop controller for the faulty case. In the conventional methods, the inner loop controller is designed to maximize the robust stability margin without information on the fault. Therefore, the performance in the faulty case tends to become conservative. In this paper, the inner loop controller is directly designed from experimental data collected from the faulty system. Since the collected data contains information on the fault, conservativeness in the conventional methods is decreased. The inner loop controller is designed by Virtual Reference Feedback Tuning (VRFT). VRFT is a direct design method from input-output data without identifying any models. Since the complexity of the controller can be specified by the designer, no complexity reduction is required, which is advantageous for implementation. The effectiveness of the proposed design method is confirmed by an experiment.

Sakuishi, Tsubasa; Yubai, Kazuhiro; Hirai, Junji

178

Research on fault diagnose and fault tolerant control of steam generator based on strong tracking filter  

International Nuclear Information System (INIS)

In order to further improve the safety of nuclear power plants, based on the nonlinear system with stochastic noise, the strong tracking filter is used to evaluate the sensor fault bias of steam generator control system and reconstruct the sensors output to implement the fault tolerant control. The simulation results demonstrate that this method can evaluate the time-varying sensor fault bias effectively and has great fault tolerant ability, and the methodology employing the strong tracking filter for steam generator fault tolerant control design is effective. (authors)

179

A study on quantification of unavailability of DPPS with fault tolerant techniques considering fault tolerant techniques' characteristics  

International Nuclear Information System (INIS)

With the improvement of digital technologies, digital I&C systems have included a wider variety of fault tolerant techniques than conventional analog I&C systems, in order to increase fault detection and to help the system safely perform the required functions in spite of the presence of faults. So, in the reliability evaluation of digital systems, the fault tolerant techniques (FTTs) and their fault coverage must be considered. To consider the effects of FTTs in a digital system, there have been several studies on reliability models of digital systems. Therefore, this research, based on a literature survey, attempts to develop a model to evaluate the plant reliability of the digital plant protection system (DPPS) with fault tolerant techniques, considering detection and process characteristics and human errors. Sensitivity analysis is performed to ascertain important variables from the fault management coverage and unavailability based on the proposed model.

180

Comment on "Fault Tolerant analysis for stochastic systems using switching diffusion processes" by Yang, Jiang and Cocquempot  

DEFF Research Database (Denmark)

Results are given in Yang, Jiang and Cocquempot (Yang, H., Jiang, B., and Cocquempot, V. (2009), ‘Fault Tolerance Analysis for Stochastic Systems using Switching Diffusion Processes’, International Journal of Control, 82, 1516–1525) regarding the overall stability of switched diffusion processes based on stability properties of separate processes combined through stochastic switching. This article argues two main results to be empty, in that the presented hypotheses are logically inconsistent.

Schiøler, Henrik; Leth, John-Josef

2011-01-01

 
 
 
 
181

Task Mapping and Bandwidth Reservation for Mixed Hard/Soft Fault-Tolerant Embedded Systems  

DEFF Research Database (Denmark)

In this paper we are interested in mixed hard/soft real-time fault-tolerant applications mapped on distributed heterogeneous architectures. We use the Earliest Deadline First (EDF) scheduling for the hard real-time tasks and the Constant Bandwidth Server (CBS) for the soft tasks. The bandwidth reserved for the servers determines the quality of service (QoS) for soft tasks. CBS enforces temporal isolation, such that soft task overruns do not affect the timing guarantees of hard tasks. Transient faults in hard tasks are tolerated using checkpointing with rollback recovery. We have proposed a Tabu Search-based approach for task mapping and CBS bandwidth reservation, such that the deadlines for the hard tasks are satisfied, even in the case of transient faults, and the QoS for the soft tasks is maximized. Researchers have used fixed execution time models, such as the worst-case execution times for hard tasks and average execution times for soft tasks. However, we show that by using stochastic execution times for soft tasks, significant improvements can be obtained. The proposed strategy has been evaluated using an extensive set of benchmarks.

Saraswat, Prabhat Kumar; Pop, Paul

2010-01-01
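
The combination of EDF for hard tasks and CBS for soft tasks described above admits a simple single-processor admission check, sketched below. The sketch assumes implicit deadlines (deadline equal to period), integer time units, and a single processor; it ignores the checkpointing and recovery overhead and the distributed mapping that the paper addresses. The name `edf_cbs_schedulable` is hypothetical.

```python
from fractions import Fraction
from typing import Iterable, Tuple


def edf_cbs_schedulable(
    hard_tasks: Iterable[Tuple[int, int]],
    servers: Iterable[Tuple[int, int]],
) -> bool:
    """Utilization-based check for EDF with Constant Bandwidth Servers.

    hard_tasks: (worst-case execution time, period) per hard task.
    servers:    (budget, server period) per CBS; budget/period is the
                bandwidth reserved for the soft tasks it serves.
    Returns True when total utilization does not exceed 1.
    """
    hard_util = sum(Fraction(c, t) for c, t in hard_tasks)
    reserved = sum(Fraction(q, p) for q, p in servers)
    return hard_util + reserved <= 1
```

Because each server never demands more than its reserved bandwidth, a soft task that overruns only delays itself, which is the temporal isolation property the abstract refers to. Exact fractions are used so that the boundary case of utilization exactly 1 is not lost to rounding.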

182

Fault Tolerant Control using the Primary and Dual Youla Parameterizations  

DEFF Research Database (Denmark)

Different aspects of modeling faults in dynamic systems are considered in connection with reliable control (RC). The fault models include models with additive faults, multiplicative faults and structural changes in the models due to faults in the systems. These descriptions are considered in connection with reliable control and feedback control with fault rejection. The main emphasis is on fault modeling. A number of fault diagnosis problems, reliable control problems, and feedback control with fault rejection problems are formulated/considered, again, mainly from a fault modeling point of view. Reliability is introduced by means of the (primary) Youla parameterization of all stabilizing controllers, where an additional loop is closed around a diagnostic signal. In order to quantify the level of reliability, the dual Youla parameterization is introduced which can be used to analyze how large faults can be tolerated without losing e.g. stability.

Niemann, H.; Stoustrup, Jakob

2002-01-01

183

Fault-tolerant control for current sensors of doubly fed induction generators based on an improved fault detection method  

DEFF Research Database (Denmark)

Fault-tolerant control of current sensors is studied in this paper to improve the reliability of a doubly fed induction generator (DFIG). A fault-tolerant control system of current sensors is presented for the DFIG, which consists of a new current observer and an improved current sensor fault detection algorithm. The current observer is constructed by using only voltage signals as inputs. The fault detection algorithm is based on the current observer, in which an adaptive threshold and different fault duration times are considered. The performance of the proposed observer, improved fault detection algorithm, and fault-tolerant control system are investigated by simulation. The results indicate that the outputs of the observer and the sensor are highly coherent. The fault detection algorithm can efficiently detect both soft and hard faults in current sensors, and the fault-tolerant control system can effectively tolerate both types of faults.

Li, Hui; Yang, Chao

2014-01-01
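
The detection principle outlined above (compare the sensor against an observer, apply an adaptive threshold, and require the deviation to persist) can be sketched generically as follows. The threshold form `base_threshold + gain * |estimate|`, the persistence counter, and all names are assumptions made for illustration; the abstract does not give the paper's actual formulas.

```python
from typing import Optional, Sequence


def detect_sensor_fault(
    measured: Sequence[float],
    estimated: Sequence[float],
    base_threshold: float,
    gain: float,
    min_duration: int,
) -> Optional[int]:
    """Return the sample index at which a fault is declared, or None.

    A fault is declared once the residual |measured - estimated| has
    exceeded the adaptive threshold for min_duration consecutive samples.
    """
    consecutive = 0
    for k, (y, y_hat) in enumerate(zip(measured, estimated)):
        # Assumed adaptive rule: the threshold grows with signal magnitude.
        threshold = base_threshold + gain * abs(y_hat)
        if abs(y - y_hat) > threshold:
            consecutive += 1
            if consecutive >= min_duration:
                return k
        else:
            consecutive = 0  # the deviation did not persist
    return None
```

Requiring persistence trades detection delay for fewer false alarms from isolated noise spikes; a short duration suits abrupt (hard) faults, while slowly growing (soft) faults may call for a longer window.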

184

Parallel and distributed computation for fault-tolerant object recognition  

Science.gov (United States)

The distributed associative memory (DAM) model is suggested for distributed and fault-tolerant computation as it relates to object recognition tasks. The fault-tolerance is with respect to geometrical distortions (scale and rotation), noisy inputs, occlusion/overlap, and memory faults. An experimental system was developed for fault-tolerant structure recognition which shows the feasibility of such an approach. The approach is further extended to the problem of multisensory data integration and applied successfully to the recognition of colored polyhedral objects.

Wechsler, Harry

1988-01-01

185

Strategies for Fault Tolerance in Multicomponent Applications  

Energy Technology Data Exchange (ETDEWEB)

This paper discusses on-going work with the Integrated Plasma Simulator (IPS), a framework for coupled multiphysics simulations of plasmas, to allow simulations to run through the loss of nodes on which the simulation is executing. While many different techniques are available to improve the fault tolerance of computational science applications on high-performance computer systems, checkpoint/restart (C/R) remains virtually the only one that sees widespread use in practice. Our focus here is to augment the traditional C/R approach with additional techniques that can provide a more localized and tailored response to faults based on the ability to restart failed tasks on an individual basis, and the use of information external to the application itself in order to guide decision-making, in many cases avoiding the need to stop and restart the entire simulation. This capability involves several features within the IPS framework, and leverages the Fault Tolerance Backplane, a publish/subscribe event service to disseminate fault-related information throughout HPC systems, to obtain information from the Reliability, Availability and Serviceability (RAS) subsystem of the HPC system. This work is described in the context of Cray XT-series computer systems for concreteness, but is applicable to other environments as well. As part of the analysis of this work, we discuss the requirements to generalize this approach to other complex simulation applications beyond the Integrated Plasma Simulator.

Shet, Aniruddha G [ORNL; Elwasif, Wael R [ORNL; Foley, Samantha S [ORNL; Park, Byung H [ORNL; Bernholdt, David E [ORNL; Bramley, Randall B [ORNL

2011-01-01

186

Efficient Fault-Tolerant Strategy Selection Algorithm in Cloud Computing  

Directory of Open Access Journals (Sweden)

Cloud computing is becoming a mainstream feature of information technology, and enterprises increasingly deploy their software systems in the cloud environment. Cloud applications are usually large scale and contain many distributed cloud components. Building highly reliable cloud applications is a challenging and critical research issue. The growing role of information processing systems has increased the significance of their correct and continuous operation even in the presence of faulty components. To address this issue, this paper proposes a cloud framework for building fault-tolerant cloud applications. We first propose fault detection algorithms to identify significant components from the huge number of cloud components. Then, we present an efficient fault-tolerance strategy selection algorithm to determine the most suitable fault-tolerance strategy for each significant component. Software fault tolerance is widely adopted to increase the overall system reliability in critical applications. System reliability can be enhanced by employing functionally equivalent components to tolerate component failures. Three well-known fault-tolerance strategies are introduced, together with formulas for calculating the failure probabilities of the fault-tolerant modules. Our work will mainly be driven toward the implementation of the framework to measure the strength of the fault tolerance service and to make an in-depth analysis of the cost benefits among all the stakeholders. An algorithm is proposed to automatically determine an efficient fault-tolerance strategy for the significant cloud components. Using real failure traces and a model, we evaluate the proposed resource provisioning policies to determine their performance, cost, and cost efficiency. The experimental results show that by tolerating faults of a small part of the most important components, the reliability of cloud applications can be highly improved.

P.Priyanka

2014-02-01
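
The abstract above refers to formulas for the failure probability of fault-tolerant modules without stating them. As a hedged illustration of what such formulas typically look like, the sketch below computes two textbook cases under the assumption of statistically independent component failures: a module of redundant components that fails only if every component fails, and a majority-voting module of n identical versions. These are standard reliability results, not necessarily the three strategies used in the paper, and the function names are hypothetical.

```python
from math import comb, prod
from typing import Sequence


def all_fail_probability(fail_probs: Sequence[float]) -> float:
    """Module fails only if every redundant component fails
    (independent failures assumed)."""
    return prod(fail_probs)


def majority_vote_failure(f: float, n: int) -> float:
    """Module of n identical versions (n odd) with majority voting fails
    when more than half of the versions fail (independent failures assumed)."""
    if n < 1 or n % 2 == 0:
        raise ValueError("n must be a positive odd number")
    need = n // 2 + 1  # smallest number of failed versions that defeats the vote
    return sum(comb(n, k) * f**k * (1 - f) ** (n - k) for k in range(need, n + 1))
```

For example, with a per-version failure probability of 0.1, three-version majority voting fails with probability 3(0.1)^2(0.9) + (0.1)^3 = 0.028, while two redundant components that must both fail give 0.01. The independence assumption is optimistic when components share a common cause of failure.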

187

Evaluation of Simple Causal Message Logging for Large-Scale Fault Tolerant HPC Systems  

Energy Technology Data Exchange (ETDEWEB)

The era of petascale computing brought machines with hundreds of thousands of processors. The next generation of exascale supercomputers will make available clusters with millions of processors. In those machines, mean time between failures will range from a few minutes to a few tens of minutes, making the crash of a processor the common case, instead of a rarity. Parallel applications running on those large machines will need to simultaneously survive crashes and maintain high productivity. To achieve that, fault tolerance techniques will have to go beyond checkpoint/restart, which requires all processors to roll back in case of a failure. Incorporating some form of message logging will provide a framework where only a subset of processors is rolled back after a crash. In this paper, we discuss why a simple causal message logging protocol seems a promising alternative to provide fault tolerance in large supercomputers. As opposed to pessimistic message logging, it has low latency overhead, especially in collective communication operations. Besides, it saves messages when more than one thread is running per processor. Finally, we demonstrate that a simple causal message logging protocol has a faster recovery and a low performance penalty when compared to checkpoint/restart. Running NAS Parallel Benchmarks (CG, MG and BT) on 1024 processors, simple causal message logging has a latency overhead below 5%.

Bronevetsky, G; Meneses, E; Kale, L V

2011-02-25

188

MULTILEVEL CONVERTER STATCOM FAULT TOLERANCE CAPABILITY  

Directory of Open Access Journals (Sweden)

This paper addresses the fault tolerant capability of multilevel converters in the STATCOM (static synchronous compensator), which is used as a power system controller for reactive power compensation and voltage regulation improvement. The advantages of the multilevel structure for the STATCOM are (1) elimination of bulky transformers, (2) reduction of the output harmonic levels by synthesizing a sinusoidal voltage, and (3) lower switching losses. The structure has only one disadvantage: an increased likelihood of switch failure due to the increased number of switches. A single switch failure, however, does not necessarily force a (2n + 1)-level STATCOM offline. Even with a reduced number of switches, a STATCOM can still provide a significant range of control by removing the module of the faulted switch and continuing with (2n − 1) levels. This paper introduces an approach to identify the existence of the faulted switch and reconfigure the STATCOM. This approach is illustrated on a 13-level converter STATCOM, and total harmonic distortion is analyzed using MATLAB.

K.Varalakshmi

2014-11-01

189

Designing fault-tolerant real-time computer systems with diversified bus architecture for nuclear power plants  

International Nuclear Information System (INIS)

Fault-tolerant real-time computer (FT-RTC) systems are widely used to perform safe operation of nuclear power plants (NPP) and safe shutdown in the event of any untoward situation. Design requirements for such systems need high reliability, availability, computational ability for measurement via sensors, control action via actuators, data communication and human interface via keyboard or display. All these attributes of FT-RTC systems are required to be implemented using best known methods such as redundant system design using diversified bus architecture to avoid common cause failure, fail-safe design to avoid unsafe failure and diagnostic features to validate system operation. In this context, the system designer must select efficient as well as highly reliable diversified bus architecture in order to realize fault-tolerant system design. This paper presents a comparative study between CompactPCI bus and Versa Module Eurocard (VME) bus architecture for designing FT-RTC systems with switch over logic system (SOLS) for NPP. (author)

190

Fault Tolerant Design for Attitude Orbit Control System (AOCS) of ADEOS-II (Advanced Earth Observing Satellite-II)  

Science.gov (United States)

Fault tolerance of the spacecraft Attitude and Orbit Control Subsystem (AOCS) is extremely important because an AOCS failure can result in the total loss of the spacecraft. More specifically, the FDIR (Fault Detection, Isolation and Recovery) function has been applied to many satellites, resulting in successful mission operations. This paper presents the FDIR function for the AOCS of ADEOS-II with emphasis on a newly developed FDIR for hybrid navigation. Hybrid navigation, which is the ADEOS-II nominal operational mode, processes data from the GPS Receiver (GPSR), Fine Sun Sensor Head (FSSH), Inertial Reference Unit (IRU) and Earth Sensor Assembly (ESA) for attitude determination and achieves the specified attitude control accuracy. Regarding fault tolerance of hybrid navigation, by employing the FDIR system, automatic recovery from failure modes has been proven in real orbit operations, and a highly robust AOCS was developed. Features of the new FDIR are as follows. (a) The FDIR not only monitors anomalies of AOCS components but also evaluates the attitude determination of hybrid navigation in comparison with the attitude determination of normal navigation, which operates independently. (b) In case of an anomaly, the FDIR switches to the normal navigation mode and maintains continuity of almost all of the Earth observation mission. The approach and algorithm developed in this FDIR design can be applied to the next Earth observing satellites that require precise attitude control accuracy.

Kojima, Yasushi; Tanamachi, Takehiko; Ohkami, Yoshiaki

191

Fault Tolerance in Control Architectures for Mobile Robots: Fantasy or Reality?  

Digital Repository Infrastructure Vision for European Research (DRIVER)

Due to the future development of robotic autonomous systems in human environment, the fault tolerance paradigm will be a central issue in robotics. This article presents a survey of fault tolerance concepts, means and implementations in robotic architectures.

Crestani, Didier; Godary-dejean, Karen

2012-01-01

192

Diagnosis and Fault-tolerant Control, 2nd edition.  

DEFF Research Database (Denmark)

Fault-tolerant control aims at a graceful degradation of the behaviour of automated systems in case of faults. It satisfies the industrial demand for enhanced availability and safety, in contrast to traditional reactions to faults that bring about sudden shutdowns and loss of availability. The book presents effective model-based analysis and design methods for fault diagnosis and fault-tolerant control. Architectural and structural models are used to analyse the propagation of the fault through the process, to test the fault detectability and to find the redundancies in the process that can be used to ensure fault tolerance. Design methods for diagnostic systems and fault-tolerant controllers are presented for processes that are described by analytical models, by discrete-event models or that can be dealt with as quantised systems. Five case studies on pilot processes show the applicability of the presented methods. The theoretical results are illustrated by two running examples used throughout the book. The second edition includes new material about reconfigurable control, diagnosis of nonlinear systems, and remote diagnosis. The application examples are extended by a steering-by-wire system and the air path of a diesel engine, both of which include experimental results. The bibliographical notes at the end of all chapters have been updated. The chapters end with exercises to be used in lectures.

Blanke, Mogens

2006-01-01

193

Fault Analysis on VSI Fed Induction Motor Drive with Fault Tolerant Strategy  

Directory of Open Access Journals (Sweden)

The aim of this study is to design and implement a fault tolerant inverter for an induction motor drive. The operation of induction motor drives is so crucial in some applications that any fault in the drive could result in serious loss to the industry in terms of capital, process and materials, not to mention the wastage due to idle labor time. Hence, it is essential that an induction motor drive be fault tolerant. This study investigates some of the possible faults in the inverter circuits by performing fault analysis on the current waveforms using the harmonic spectrum. A fault tolerant system is proposed which can continue to operate after a fault occurs at runtime. A new SIMULINK model for a leg swap module is proposed. The simulation studies are done using the MATLAB simulation tool and the results are presented. The hardware setup is configured and tested, and the experimental results are compared with the simulation results.

S. Nagarajan

2014-03-01

194

Fault tolerant control - a residual based set-up  

DEFF Research Database (Denmark)

A new set-up for fault tolerant control (FTC) for stable systems is presented in this paper. The new set-up is based on a simple implementation of the Youla-Jabr-Bongiorno-Kucera (YJBK) parameterization. This implementation of the YJBK parameterization allows a direct and simple reconfiguration of the feedback controller. Another central part of fault tolerant control is fault diagnosis. The controller implementation can be applied directly in connection with both passive fault diagnosis (PFD) and active fault diagnosis (AFD). The presented FTC set-up is investigated with respect to sensor reconfiguration. Actuator reconfiguration can be dealt with in a similar way.

Niemann, Hans Henrik; Poulsen, Niels Kjølstad

2009-01-01

195

Fault Tolerant Design for Magnetic Memories  

Directory of Open Access Journals (Sweden)

This study presents fault tolerant memory cores based on the property of component reusability, a method of fault tolerance for content addressable memories. The memories used in the design are 256, 512, 1024 and 2048 bytes. Faults are injected into the circuit operation by using Automatic Test Pattern Generators (ATPGs). The design has been implemented in Cadence 90 nm technology and tested with fault injection circuits, and the ATPG effectiveness was found to be 100% at a frequency of 500 MHz.

Arun Kumar P.

2014-03-01

196

Enhanced Maritime Safety through Diagnosis and Fault Tolerant Control  

DEFF Research Database (Denmark)

Faults in steering, navigation instruments or propulsion machinery are serious on a marine vessel, since the consequence could be loss of maneuvering ability, with a risk of damage to the vessel, personnel or environment. Early diagnosis and accommodation of faults could enhance safety. Fault-tolerant control is a methodology that helps prevent faults from developing into failures. The means include on-line fault diagnosis, automatic condition assessment and calculation of remedial action to avoid hazards. This paper gives an overview of methods to obtain fault tolerance: fault diagnosis; analysis of the properties of a faulty system; and means to determine remedial actions. The paper illustrates the techniques with two marine examples: sensor fusion for automatic steering and control of the main engine.

Blanke, Mogens

2001-01-01

197

Fault-tolerant logics for FPGA linux  

International Nuclear Information System (INIS)

The increasing use of SRAM-based reconfigurable architectures in important areas of research and development (like particle accelerators and space applications) brings new effects that are currently only partially addressed. An already well known, but nevertheless important, problem of such systems is their susceptibility to radiation, which increases with particle flux and energy. According to current knowledge, errors induced by Single Event Upsets (SEU) and Single Event Transients (SET) are handled exclusively in hardware by the use of spatial and temporal redundancy features. Our field of research is to extend conventional fault tolerance to multiple layers of embedded computer systems, starting with the FPGA bit layer and ending in the software application layer, to obtain maximum radiation tolerance in systems running FPGA Linux in radiation susceptible environments. Only a collaboration of all these layers is able to create an adequate amount of data security and process integrity.

198

Fault-tolerant logics for FPGA linux  

Energy Technology Data Exchange (ETDEWEB)

The increasing use of SRAM-based reconfigurable architectures in important areas of research and development (like particle accelerators and space applications) brings new effects that are currently only partially addressed. An already well known, but nevertheless important, problem of such systems is their susceptibility to radiation, which increases with particle flux and energy. According to current knowledge, errors induced by Single Event Upsets (SEU) and Single Event Transients (SET) are handled exclusively in hardware by the use of spatial and temporal redundancy features. Our field of research is to extend conventional fault tolerance to multiple layers of embedded computer systems, starting with the FPGA bit layer and ending in the software application layer, to obtain maximum radiation tolerance in systems running FPGA Linux in radiation susceptible environments. Only a collaboration of all these layers is able to create an adequate amount of data security and process integrity.

Gebelein, Jano; Abel, Norbert; Kebschull, Udo [Kirchhoff-Institute for Physics, University of Heidelberg (Germany)

2009-07-01

199

Fault Tolerant Architecture for Telecom Wireless CORBA  

Directory of Open Access Journals (Sweden)

In order for a non-mobile ORB to interoperate with CORBA objects and clients running on a mobile terminal, the OMG has specified Wireless Access and Terminal Mobility in CORBA. Fault Tolerance has been specified in the common core of the CORBA specification, but it is intended for wired networks. This study proposes a fault tolerant architecture for Telecom wireless CORBA based on replication and checkpointing of objects. The storage available at the Access Bridge is employed to log messages and entity states of objects on behalf of mobile terminals. The logging and recovery infrastructures are designed on each Access Bridge to implement fault tolerance for Telecom wireless CORBA. The Logging Mechanism records each message in a log, from which the Recovery Mechanism can retrieve the message during recovery. The performance analysis shows that the proposed fault tolerant architecture ensures a low loss of computation when the server object fails. The proposed fault tolerance architecture is a graceful extension of the original wired Fault Tolerant CORBA and is able to cooperate seamlessly with the published CORBA specifications.

Zhenpeng Xu

2013-01-01
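
The logging and recovery idea in the wireless CORBA entry above (checkpoint the object state, log each message, replay the log during recovery) can be illustrated with a minimal sketch. Everything named here is hypothetical and not taken from the paper or from any CORBA API: `Counter` stands in for a deterministic server object and `LoggingBridge` for the storage role the abstract assigns to the Access Bridge.

```python
import copy


class Counter:
    """Toy deterministic server object (hypothetical stand-in)."""

    def __init__(self):
        self.total = 0

    def handle(self, msg):
        self.total += msg
        return self.total


class LoggingBridge:
    """Hypothetical bridge that checkpoints state and logs messages."""

    def __init__(self, obj):
        self.obj = obj
        self.checkpoint = copy.deepcopy(obj)  # last saved entity state
        self.log = []                         # messages since the checkpoint

    def deliver(self, msg):
        self.log.append(msg)                  # record before delivery
        return self.obj.handle(msg)

    def take_checkpoint(self):
        self.checkpoint = copy.deepcopy(self.obj)
        self.log.clear()                      # older messages are no longer needed

    def recover(self):
        # Restore the checkpointed state, then replay the logged messages.
        self.obj = copy.deepcopy(self.checkpoint)
        for msg in self.log:
            self.obj.handle(msg)
        return self.obj
```

Replay only reproduces the lost state if the object handles messages deterministically, which is an assumption of this sketch rather than a claim from the abstract.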

200

Optimal Management of Redundant Control Authority for Fault Tolerance  

Science.gov (United States)

This paper is intended to demonstrate the feasibility of a solution to a fault tolerant control problem. It explains, through a numerical example, the design and the operation of a novel scheme for fault tolerant control. The fundamental principle of the scheme was formalized in [5] based on the notion of normalized nonspecificity. The novelty lies with the use of a reliability criterion for redundancy management, and therefore leads to a high overall system reliability.

Wu, N. Eva; Ju, Jianhong

2000-01-01

 
 
 
 
201

Redundant asynchronous microprocessor system for fault tolerant flight control and navigation  

Science.gov (United States)

Unlike their synchronized counterparts, redundant channels in an asynchronous flight system can, under no-fault conditions, exhibit cross-channel data disparities. Sources of these errors are examined in terms of the general, individual functions of the flight control and navigation application in the asynchronous digital environment. The effects of asynchronism on trajectory programmers, dynamic control algorithms and data reconstruction processes are examined in terms of data skews, data latencies, and clock rate uncertainties. An example is presented in which time corrections are applied to reduce the data disparities. Practical limitations of the approach of the example are discussed.

Dunn, W. R.

1983-01-01

202

Software reliability through fault-avoidance and fault-tolerance  

Science.gov (United States)

Strategies and tools for the testing, risk assessment and risk control of dependable software-based systems were developed. Part of this project consists of studies to enable the transfer of technology to industry, for example the risk management techniques for safety-conscious systems. Theoretical investigations of the Boolean and Relational Operator (BRO) testing strategy were conducted for condition-based testing. The Basic Graph Generation and Analysis tool (BGG) was extended to fully incorporate several variants of the BRO metric. Single- and multi-phase risk, coverage and time-based models are being developed to provide an additional theoretical and empirical basis for estimating the reliability and availability of large, highly dependable software. A model for software process and risk management was developed. The use of cause-effect graphing for software specification and validation was investigated. Lastly, advanced software fault-tolerance models were studied to provide alternatives and improvements in situations where simple software fault-tolerance strategies break down.

Vouk, Mladen A.; Mcallister, David F.

1993-01-01

203

Byzantine-fault tolerant self-stabilizing protocol for distributed clock synchronization systems  

Science.gov (United States)

A rapid Byzantine self-stabilizing clock synchronization protocol is described that self-stabilizes from any state, tolerates bursts of transient failures, and deterministically converges within a linear convergence time with respect to the self-stabilization period. Upon self-stabilization, all good clocks proceed synchronously. The protocol does not rely on any assumptions about the initial state of the clocks. Furthermore, there is neither a central clock nor an externally generated pulse system. The protocol is scalable and self-stabilizes in a short amount of time.

Malekpour, Mahyar R. (Inventor)

2010-01-01

204

A continuous-time semi-markov bayesian belief network model for availability measure estimation of fault tolerant systems  

Scientific Electronic Library Online (English)

In this work, a model is proposed for assessing the availability of fault tolerant systems, based on the integration of continuous-time semi-Markov processes and Bayesian belief networks. This integration results in a hybrid stochastic model that is able to represent the dynamic characteristics of a system as well as to deal with cause-effect relationships among external factors such as environmental and operational conditions. The hybrid model also allows for uncertainty propagation on the system availability. A numerical procedure is also proposed for solving the state probability equations of semi-Markov processes described in terms of transition rates. The numerical procedure is based on the application of Laplace transforms that are inverted by the Gauss quadrature method known as Gauss-Legendre. The hybrid model and numerical procedure are illustrated by means of an application example in the context of fault tolerant systems.

Moura, Márcio das Chagas; Droguett, Enrique López

2008-08-01

206

Sensitivity Analysis of Unavailability of a Component in DPS with Various Fault-Tolerant Techniques  

Energy Technology Data Exchange (ETDEWEB)

With the improvement of digital technologies, digital protection systems (DPS) incorporate multiple, more sophisticated fault-tolerant techniques (FTTs) in order to increase fault detection and to help the system safely perform the required functions in spite of the possible presence of faults. In the reliability evaluation of digital systems, FTTs and their fault coverage must be considered. Fault detection coverage is a crucial factor of an FTT in reliability. However, fault detection coverage alone is not enough to reflect the effects of various FTTs in a reliability model. Thus, integrated fault coverage is suggested to reflect the characteristics of FTTs.

Kim, Bo Gyung; Kang, Hyun Gook; Kim, Hee Eun; Seong, Poong Hyun [Korea Advanced Institute of Science and Technology, Daejeon (Korea, Republic of); Lee, Seung Jun [Korea Atomic Energy Research Institute, Daejeon (Korea, Republic of)

2012-05-15

207

On the Fault Tolerance and Hamiltonicity of the Optical Transpose Interconnection System of Non-Hamiltonian Base Graphs  

CERN Document Server

Hamiltonicity is an important property in parallel and distributed computation. The existence of a Hamiltonian cycle allows efficient emulation of distributed algorithms on a network wherever such an algorithm exists for the linear array and ring, and can ensure deadlock freedom in some routing algorithms in hierarchical interconnection networks. Hamiltonicity can also be used for the construction of independent spanning trees and leads to the design of fault tolerant protocols. The Optical Transpose Interconnection System or OTIS (also referred to as the two-level swapped network) is a widely studied interconnection network topology which is popular due to its high degree of scalability, regularity, modularity and packageability. Surprisingly, to our knowledge, only one strong result is known regarding the Hamiltonicity of OTIS, showing that OTIS graphs built of Hamiltonian base graphs are Hamiltonian. In this work we consider the Hamiltonicity of OTIS networks built on non-Hamiltonian base graphs and answer some important questions. First, we prove tha...

Ghosh, Esha; Rangan, C Pandu

2011-01-01

208

Analysis of an inherently fault tolerant program  

International Nuclear Information System (INIS)

Software for process-control systems, such as nuclear power plant safety control systems and robots, can be very complex because of the large number of cases which have to be considered. The approach proposed here uses decentralized control concepts and is based on Dijkstra's "relaxation" problem and self-stabilizing systems. The resulting program is inherently tolerant of partial hardware failures. Further, the software is often simplified, so that its correctness can be verified more easily. The authors present an overview of the model using a simple control program for a simulated robot as an example. Then they analyze this control program in terms of the degree to which it is decentralized, its partial correctness proof, its convergence proof and its performance. They also discuss some modifications to the basic algorithm.

209

Application of a fault-tolerant microprocessor-based core-surveillance system in a German fast breeder reactor  

International Nuclear Information System (INIS)

For the fast breeder reactor KNK II at Karlsruhe, Germany, a microprocessor-based safety shut-down system is being built. Analogous to the triple modular instrumentation, it consists of TMR hardware. Functionally it is split into four blocks which operate in cascade-like fashion. The main functions are mean value calculation, current limit control, trend control, and final evaluation. In order to secure correctness, several constructive and analytical methods are applied for fault avoidance, such as formal specification languages, programming guidelines, a software quality assurance plan, validation, verification, and testing. Since additional means for correct and safe operation are still necessary, fault-tolerance and error-detection techniques are applied. These include self-checking programs, plausibility checks, control data, information exchange and control between the redundancies, and especially diversity. This diversity refers to different teams for the different development phases as well as to different tools and environments, such as different programming languages for the application software. Three separate but functionally identical programs will be implemented in Iftran, Pascal and PL/M. These will be used not only during the extensive testing period, but also during final operation.

210

A Primer on Architectural Level Fault Tolerance  

Science.gov (United States)

This paper introduces the fundamental concepts of fault tolerant computing. Key topics covered are voting, fault detection, clock synchronization, Byzantine Agreement, diagnosis, and reliability analysis. Low level mechanisms such as Hamming codes or low level communications protocols are not covered. The paper is tutorial in nature and does not cover any topic in detail. The focus is on rationale and approach rather than detailed exposition.

Butler, Ricky W.

2008-01-01

211

Computer aided reliability, availability, and safety modeling for fault-tolerant computer systems with commentary on the HARP program  

Science.gov (United States)

Many of the most challenging reliability problems of our present decade involve complex distributed systems such as interconnected telephone switching computers, air traffic control centers, aircraft and space vehicles, and local area and wide area computer networks. In addition to the challenge of complexity, modern fault-tolerant computer systems require very high levels of reliability, e.g., avionic computers with MTTF goals of one billion hours. Most analysts find that it is too difficult to model such complex systems without computer aided design programs. In response to this need, NASA has developed a suite of computer aided reliability modeling programs beginning with CARE 3 and including a group of new programs such as: HARP, HARP-PC, Reliability Analysts Workbench (Combination of model solvers SURE, STEM, PAWS, and common front-end model ASSIST), and the Fault Tree Compiler. The HARP program is studied and how well the user can model systems using this program is investigated. One of the important objectives will be to study how user friendly this program is, e.g., how easy it is to model the system, provide the input information, and interpret the results. The experiences of the author and his graduate students who used HARP in two graduate courses are described. Some brief comparisons were made with the ARIES program which the students also used. Theoretical studies of the modeling techniques used in HARP are also included. Of course no answer can be any more accurate than the fidelity of the model, thus an Appendix is included which discusses modeling accuracy. A broad viewpoint is taken and all problems which occurred in the use of HARP are discussed. Such problems include: computer system problems, installation manual problems, user manual problems, program inconsistencies, program limitations, confusing notation, long run times, accuracy problems, etc.

Shooman, Martin L.

1991-01-01

212

Design of Fault Tolerant Reversible Multiplier  

Directory of Open Access Journals (Sweden)

In recent years, reversible logic has emerged as a promising technology with applications in low power CMOS, quantum computing, nanotechnology, and optical computing. The classical set of gates such as AND, OR, and EXOR are not reversible. This paper proposes a novel 4x4 bit reversible fault tolerant multiplier circuit which can multiply two 4-bit numbers. It is faster and has lower hardware complexity than the existing designs. In addition, the proposed reversible multiplier is better than its existing counterparts in terms of delay and power. It is based on two concepts: the partial products are generated in parallel using Fredkin gates, and the addition is then done by a reversible parallel adder designed from IG gates. Thus, this paper provides a starting point for building more complex systems which can execute more complicated operations using reversible logic.

H. P. Sinha

2012-01-01
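
The partial-product step of the reversible multiplier entry above can be sketched in a few lines. The Fredkin gate is a controlled swap; tying its third input to 0 yields the AND of the other two inputs on the third output, and the gate preserves the number of 1s between input and output, which is the property usually meant by "fault tolerant" (parity preserving) in the reversible-logic literature. This is an illustrative sketch, not the paper's circuit: the function names are hypothetical, and ordinary integer addition stands in for the IG-gate adder, which is not modelled here.

```python
from itertools import product


def fredkin(c, a, b):
    """Controlled swap: outputs (c, b, a) when c == 1, else (c, a, b)."""
    return (c, b, a) if c else (c, a, b)


def and_via_fredkin(x, y):
    """With the third input tied to 0, the third output equals x AND y."""
    return fredkin(x, y, 0)[2]


def multiply4(a, b):
    """4x4-bit product from Fredkin-generated partial products.

    Plain integer addition replaces the reversible adder (assumption).
    """
    total = 0
    for i in range(4):
        for j in range(4):
            bit = and_via_fredkin((a >> i) & 1, (b >> j) & 1)
            total += bit << (i + j)
    return total


def is_parity_preserving():
    """Check that the count of 1s is unchanged for every input pattern."""
    return all(sum(bits) == sum(fredkin(*bits))
               for bits in product((0, 1), repeat=3))
```

Applying `fredkin` twice returns the original inputs, so the gate is its own inverse, which is one simple way to see that it is reversible.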

213

SIFT - Multiprocessor architecture for Software Implemented Fault Tolerance flight control and avionics computers  

Science.gov (United States)

A brief description of a SIFT (Software Implemented Fault Tolerance) Flight Control Computer, with emphasis on implementation, is presented. A multiprocessor system that relies on software-implemented fault detection and reconfiguration algorithms is described. A high level of reliability and fault tolerance is achieved by the replication of computing tasks among processing units.

Forman, P.; Moses, K.

1979-01-01

214

An Active Fault-Tolerant PWM Tracker for Unknown Nonlinear Stochastic Hybrid Systems: NARMAX Model and OKID-Based State-Space Self-Tuning Control  

Directory of Open Access Journals (Sweden)

An active fault-tolerant pulse-width-modulated tracker using the nonlinear autoregressive moving average with exogenous inputs model-based state-space self-tuning control is proposed for continuous-time multivariable nonlinear stochastic systems with unknown system parameters, plant noises, measurement noises, and inaccessible system states. Through observer/Kalman filter identification method, a good initial guess of the unknown parameters of the chosen model is obtained so as to reduce the identification process time and enhance the system performances. Besides, by modifying the conventional self-tuning control, a fault-tolerant control scheme is also developed. For the detection of fault occurrence, a quantitative criterion is exploited by comparing the innovation process errors estimated by the Kalman filter estimation algorithm. In addition, the weighting matrix resetting technique is presented by adjusting and resetting the covariance matrix of parameter estimates to improve the parameter estimation for faulty system recovery. The technique can effectively cope with partially abrupt and/or gradual system faults and/or input failures with fault detection.

Shu-Mei Guo

2010-01-01

215

Measures of Fault Tolerance in Distributed Simulated Annealing  

Digital Repository Infrastructure Vision for European Research (DRIVER)

In this paper, we examine different measures of fault tolerance in a distributed simulated annealing process. Optimization by simulated annealing on a distributed system is prone to various sources of failure. We analyse the simulated annealing algorithm, its architecture on a distributed platform and potential sources of failure. We examine the behaviour of a fault tolerant distributed system for the optimization task. We present possible methods to overcome the failures and achieve fau...

Prakash, Aaditya

2012-01-01

216

Reconfigurable Fault Tolerance for FPGAs  

Science.gov (United States)

The invention allows a field-programmable gate array (FPGA) or similar device to be efficiently reconfigured in whole or in part to provide higher capacity, non-redundant operation. The redundant device consists of functional units such as adders or multipliers, configuration memory for the functional units, a programmable routing method, configuration memory for the routing method, and various other features such as block RAM, I/O (random access memory, input/output) capability, dedicated carry logic, etc. The redundant device has three identical sets of functional units and routing resources and majority voters that correct errors. The configuration memory may or may not be redundant, depending on need. For example, SRAM-based FPGAs will need some type of radiation-tolerant configuration memory, or they will need triple-redundant configuration memory. Flash or anti-fuse devices will generally not need redundant configuration memory. Some means of loading and verifying the configuration memory is also required. These are all components of the pre-existing redundant FPGA. This innovation modifies the voter to accept a MODE input, which specifies whether ordinary voting is to occur, or if redundancy is to be split. Generally, additional routing resources will also be required to pass data between sections of the device created by splitting the redundancy. In redundancy mode, the voters produce an output corresponding to the two inputs that agree, in the usual fashion. In the split mode, the voters select just one input and convey this to the output, ignoring the other inputs. In a dual-redundant system (as opposed to triple-redundant), instead of a voter, there is some means to latch or gate a state update only when both inputs agree. In this case, the invention would require modification of the latch or gate so that it would operate normally in redundant mode, and would separately latch or gate the inputs in non-redundant mode.

Shuler, Robert, Jr.

2010-01-01
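
The voter behaviour described in the reconfigurable FPGA entry above (ordinary majority voting in redundancy mode, pass-through of a single input in split mode) can be sketched in software as follows. This is only a behavioural illustration under stated assumptions: the `Mode` enum, the `select` parameter and the function name are hypothetical and do not come from the invention's description, which concerns hardware voters.

```python
from enum import Enum


class Mode(Enum):
    REDUNDANT = "redundant"  # ordinary triple-redundant voting
    SPLIT = "split"          # redundancy is split; one input is passed through


def vote(a, b, c, mode, select=0):
    """Bitwise voter over three integer-encoded inputs.

    REDUNDANT: each output bit is the value on which at least two inputs agree.
    SPLIT: the input chosen by `select` (0, 1 or 2) is conveyed unchanged,
    and the other two inputs are ignored.
    """
    if mode is Mode.REDUNDANT:
        return (a & b) | (a & c) | (b & c)
    return (a, b, c)[select]
```

For example, `vote(0b1010, 0b1010, 0b0110, Mode.REDUNDANT)` masks the disagreeing third input and returns `0b1010`, whereas in `Mode.SPLIT` with `select=2` the same call would return `0b0110`.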

217

Fault-Tolerant Spanners: Better and Simpler  

CERN Document Server

A natural requirement of many distributed structures is fault-tolerance: after some failures, whatever remains from the structure should still be effective for whatever remains from the network. In this paper we examine spanners of general graphs that are tolerant to vertex failures, and significantly improve their dependence on the number of faults $r$, for all stretch bounds. For stretch $k \geq 3$ we design a simple transformation that converts every $k$-spanner construction with at most $f(n)$ edges into an $r$-fault-tolerant $k$-spanner construction with at most $O(r^3 \log n) \cdot f(2n/r)$ edges. Applying this to standard greedy spanner constructions gives $r$-fault tolerant $k$-spanners with $\tilde O(r^{2} n^{1+\frac{2}{k+1}})$ edges. The previous construction by Chechik, Langberg, Peleg, and Roditty [STOC 2009] depends similarly on $n$ but exponentially on $r$ (approximately like $k^r$). For the case $k=2$ and unit-length edges, an $O(r \log n)$-approximation algorithm is known from recent work of D...

Dinitz, Michael

2011-01-01
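
The step from the general transformation to the greedy bound in the spanner entry above can be checked by direct substitution. The calculation below assumes the standard edge bound $f(n) = O(n^{1+\frac{2}{k+1}})$ for greedy $k$-spanners with odd $k$; that bound is not stated in the abstract and is supplied here as an assumption.

$$O(r^3 \log n) \cdot f(2n/r) = O\!\left(r^3 \log n \cdot \left(\frac{2n}{r}\right)^{1+\frac{2}{k+1}}\right) = O\!\left(r^{2-\frac{2}{k+1}}\, n^{1+\frac{2}{k+1}} \log n\right) \subseteq \tilde O\!\left(r^{2} n^{1+\frac{2}{k+1}}\right)$$

The factor $2^{1+\frac{2}{k+1}}$ is at most 4 and is absorbed into the constant, and the $\log n$ factor is absorbed by the $\tilde O$ notation, so the dependence on $r$ is polynomial rather than the roughly $k^r$ of the earlier construction.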

218

Fault-tolerant holonomic quantum computation  

CERN Document Server

We explain how to combine holonomic quantum computation (HQC) with fault tolerant quantum error correction. This establishes the scalability of HQC, putting it on equal footing with other models of computation, while retaining the inherent robustness the method derives from its geometric nature.

Oreshkov, Ognyan; Lidar, Daniel A

2008-01-01

219

Design methods for fault-tolerant finite state machines  

Science.gov (United States)

VLSI electronic circuits are increasingly being used in space-borne applications where high levels of radiation may induce faults, known as single event upsets. In this paper we review the classical methods of designing fault tolerant digital systems, with an emphasis on those methods which are particularly suitable for VLSI-implementation of finite state machines. Four methods are presented and will be compared in terms of design complexity, circuit size, and estimated circuit delay.

Niranjan, Shailesh; Frenzel, James F.

1993-01-01

220

Fault-Tolerant Partial Replication in Large-Scale Database Systems  

CERN Document Server

We investigate a decentralised approach to committing transactions in a replicated database, under partial replication. Previous protocols either reexecute transactions entirely and/or compute a total order of transactions. In contrast, ours applies update values, and orders only conflicting transactions. As a result, transactions execute faster, and distributed databases commit in small committees. Both effects help preserve scalability as the number of databases and transactions increases. Our algorithm ensures serializability, and is live and safe in spite of faults.

Sutra, Pierre

2008-01-01

 
 
 
 
221

Experiments in fault tolerant software reliability  

Science.gov (United States)

Twenty functionally equivalent programs were built and tested in a multiversion software experiment. Following unit testing, all programs were subjected to an extensive system test. In the process sixty-one distinct faults were identified among the versions. Less than 12 percent of the faults exhibited varying degrees of positive correlation. The common-cause (or similar) faults spanned as many as 14 components. However, a majority of these faults were trivial, and easily detected by proper unit and/or system testing. Only two of the seven similar faults were difficult faults, and both were caused by specification ambiguities. One of these faults exhibited a variable identical-and-wrong response span, i.e. a response span which varied with the testing conditions and input data. Techniques that could have been used to avoid the faults are discussed. For example, it was determined that back-to-back testing of 2-tuples could have been used to eliminate about 90 percent of the faults. In addition, four of the seven similar faults could have been detected by using back-to-back testing of 5-tuples. It is believed that most, if not all, similar faults could have been avoided had the specifications been written using more formal notation, had the unit testing phase been subject to more stringent standards and controls, and had better tools for measuring the quality and adequacy of the test data (e.g. coverage) been used.

McAllister, David F.; Vouk, Mladen A.

1989-01-01
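
The back-to-back testing of 2-tuples and 5-tuples mentioned in the abstract above can be illustrated with a minimal Python sketch. It is not the experiment's actual harness: the function `back_to_back` and the three toy versions are hypothetical names chosen for illustration. The sketch runs every group of `tuple_size` versions on the same inputs and reports each input on which the group's outputs disagree.

```python
from itertools import combinations


def back_to_back(versions, test_inputs, tuple_size=2):
    """Compare every group of `tuple_size` versions on each test input.

    Returns a list of (group_indices, input, outputs) for each disagreement.
    Hypothetical helper for illustration only.
    """
    disagreements = []
    for group in combinations(range(len(versions)), tuple_size):
        for x in test_inputs:
            outputs = tuple(versions[i](x) for i in group)
            if len(set(outputs)) > 1:  # at least two versions differ
                disagreements.append((group, x, outputs))
    return disagreements


# Three toy "functionally equivalent" versions; the third has a seeded fault.
def square_a(x):
    return x * x


def square_b(x):
    return x ** 2


def square_buggy(x):
    return 0 if x == 3 else x * x


if __name__ == "__main__":
    for report in back_to_back([square_a, square_b, square_buggy], range(5)):
        print(report)
```

A disagreement only shows that some version in the group is wrong, not which one. Versions that fail identically on the same input produce no disagreement at all, which is why the identical-and-wrong faults discussed in the abstract are the hard case.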

222

A modified NARMAX model-based self-tuner with fault tolerance for unknown nonlinear stochastic hybrid systems with an input-output direct feed-through term.  

Science.gov (United States)

A modified nonlinear autoregressive moving average with exogenous inputs (NARMAX) model-based state-space self-tuner with fault tolerance is proposed in this paper for the unknown nonlinear stochastic hybrid system with a direct transmission matrix from input to output. Through the off-line observer/Kalman filter identification method, one obtains a good initial guess of the modified NARMAX model, which reduces the on-line system identification time. Then, based on the modified NARMAX-based system identification, a corresponding adaptive digital control scheme is presented for the unknown continuous-time nonlinear system with an input-output direct transmission term, which also has measurement and system noises and inaccessible system states. In addition, an effective state-space self-tuner with a fault tolerance scheme is presented for the unknown multivariable stochastic system. A quantitative criterion is suggested by comparing the innovation process error estimated by the Kalman filter estimation algorithm, so that a weighting matrix resetting technique, which adjusts and resets the covariance matrices of the parameter estimates obtained by the Kalman filter estimation algorithm, is utilized to achieve the parameter estimation for faulty system recovery. Consequently, the proposed method can effectively cope with partially abrupt and/or gradual system faults and input failures by means of fault detection. PMID:24012389

Tsai, Jason S-H; Hsu, Wen-Teng; Lin, Long-Guei; Guo, Shu-Mei; Tann, Joseph W

2014-01-01

223

Concepts and Methods in Fault-tolerant Control  

DEFF Research Database (Denmark)

Faults in automated processes will often cause undesired reactions and shut-down of a controlled plant, and the consequences could be damage to technical parts of the plant, to personnel or to the environment. Fault-tolerant control combines diagnosis with control methods to handle faults in an intelligent way. The aim is to prevent simple faults from developing into serious failures, and hence to increase plant availability and reduce the risk of safety hazards. Fault-tolerant control merges several disciplines into a common framework to achieve these goals. The desired features are obtained through on-line fault diagnosis, automatic condition assessment and calculation of appropriate remedial actions to avoid certain consequences of a fault. The envelope of possible remedial actions is very wide. Sometimes a simple remedy can be achieved by replacing a measurement from a faulty sensor with an estimate. In other situations, complex reconfiguration or on-line controller redesign is required. This paper gives an overview of recent tools to analyze and explore structure and other fundamental properties of an automated system such that any inherent redundancy in the controlled process can be fully utilized to maintain availability, even though faults may occur.

Blanke, Mogens

2001-01-01

224

Fault-tolerant search algorithms reliable computation with unreliable information  

CERN Document Server

Why a book on fault-tolerant search algorithms? Searching is one of the fundamental problems in computer science. Time and again algorithmic and combinatorial issues originally studied in the context of search find application in the most diverse areas of computer science and discrete mathematics. On the other hand, fault-tolerance is a necessary ingredient of computing. Due to their inherent complexity, information systems are naturally prone to errors, which may appear at any level - as imprecisions in the data, bugs in the software, or transient or permanent hardware failures. This book pr

Cicalese, Ferdinando

2013-01-01

225

Fault Tolerant Control for Kori Unit 1 Steam Generator  

Energy Technology Data Exchange (ETDEWEB)

In order to implement more reliable control systems, failures of a controller, a sensor and an actuator should be taken into consideration in the process of control system design. Traditionally there have been two approaches for dealing with the fault-tolerant control problem: active redundancy and passive redundancy. Active redundancy has no reconfiguration part to take an action such as diagnosing and selecting an intact controller when a controller failure occurs; that is, one controller guarantees the system stability and performance under failure of the other controller. Meanwhile, passive redundancy has reconfiguration parts which supervise the system, reject the faulty controller, and select the sound controller which performs the mission. The active redundancy structure for fault-tolerant control is the focus of the paper, and design methods for fault-tolerant state feedback control and fault-tolerant output feedback control are proposed, which make a control system reliable while guaranteeing stability and performance in the sense of the $H_\infty$ norm, in the face of controller failures in the dual-controller configuration. The proposed method is applied to the Kori Unit 1 steam generator level control system. The results show that the steam generator water level is well controlled in the situation of one controller failure.

Kim, Myung-Ki [Korea Electric Power Research Institute, Daejeon (Korea, Republic of)]

2007-07-01

226

Fault Tolerant Control for Kori Unit 1 Steam Generator  

International Nuclear Information System (INIS)

In order to implement more reliable control systems, failures of a controller, a sensor and an actuator should be taken into consideration in the process of control system design. Traditionally there have been two approaches for dealing with the fault-tolerant control problem: active redundancy and passive redundancy. Active redundancy has no reconfiguration part to take an action such as diagnosing and selecting an intact controller when a controller failure occurs; that is, one controller guarantees the system stability and performance under failure of the other controller. Meanwhile, passive redundancy has reconfiguration parts which supervise the system, reject the faulty controller, and select the sound controller which performs the mission. The active redundancy structure for fault-tolerant control is the focus of the paper, and design methods for fault-tolerant state feedback control and fault-tolerant output feedback control are proposed, which make a control system reliable while guaranteeing stability and performance in the sense of the $H_\infty$ norm, in the face of controller failures in the dual-controller configuration. The proposed method is applied to the Kori Unit 1 steam generator level control system. The results show that the steam generator water level is well controlled in the situation of one controller failure.

227

A distributed fault tolerant architecture for nuclear reactor control and safety functions  

International Nuclear Information System (INIS)

This paper reports on a fault-tolerant architecture, currently being developed, that provides tolerance to a broad scope of hardware, software, and communications faults. This architecture relies on widely commercially available operating systems, local area networks, and software standards. Thus, development time is significantly shortened, and modularity allows for continuous and inexpensive system enhancement throughout the expected 20-year life. The fault containment and parallel processing capabilities of computer networks are being exploited to provide a high performance, high availability network capable of tolerating a broad scope of hardware, software, and operating system faults. The system can tolerate all but one known (and avoidable) single fault and two known and avoidable dual faults, and will detect all higher order fault sequences and provide diagnostics to allow for rapid manual recovery.

228

Unitary reflection groups for quantum fault tolerance  

CERN Document Server

This paper explores the representation of quantum computing in terms of unitary reflections (unitary transformations that leave invariant a hyperplane of a vector space). The symmetries of qubit systems are found to be supported by Euclidean real reflections (i.e., Coxeter groups) or by specific imprimitive reflection groups, introduced (but not named) in a recent paper [Planat M and Jorrand Ph 2008, J Phys A: Math Theor 41, 182001]. The automorphisms of multiple qubit systems are found to relate to some Clifford operations once the corresponding group of reflections is identified. For a short list, one may point out the Coxeter systems of type B3 and G2 (for single qubits), D5 and A4 (for two qubits), E7 and E6 (for three qubits), and the complex reflection groups G(2l, 2, 5). The relevant fault tolerant groups of reflections (the Bell groups) are generated, as subgroups of the Clifford groups, by the Hadamard gate, the $\pi/4$ phase gate and an entangling (braid) gate [Kauffman L H and Lomonaco S J 2004 Ne...

Planat, Michel

2008-01-01

229

Fault Tolerant Quantum Computation with Constant Error  

CERN Document Server

Recently Shor showed how to perform fault tolerant quantum computation when the error probability is logarithmically small. We improve this bound and describe fault tolerant quantum computation when the error probability is smaller than some constant threshold. The cost is polylogarithmic in time and space, and no measurements are used during the quantum computation. The result holds also for quantum circuits which operate on nearest neighbors only. To achieve this noise resistance, we use concatenated quantum error correcting codes. The scheme presented is general, and works with all quantum codes that satisfy some restrictions, namely that the code is ``proper''. We present two explicit classes of proper quantum codes. The first example of proper quantum codes generalizes classical secret sharing with polynomials. The second uses a known class of quantum codes and converts it to a proper code. This class is defined over a field with p elements, so the elementary quantum particle is not a qubit but a ``qupit...

Aharonov, Dorit; Ben-Or, Michael

1996-01-01

230

Optimized Nanometric Fault Tolerant Reversible BCD Adder  

Digital Repository Infrastructure Vision for European Research (DRIVER)

In this study a novel nanometric fault tolerant quantum and reversible binary coded decimal adder is proposed. Reversible logic has attracted growing attention in optical information processing, quantum computing, nanotechnology and low power design. A BCD adder is a combinational circuit that can be used for the addition of two numbers in BCD arithmetic. The proposed reversible BCD adder also has the parity preserving property. It is better than all the existing counterparts. The proposed circuit ...

Majid Haghparast; Masoumeh Shams

2012-01-01

231

Efficient fault-tolerant quantum computing  

Digital Repository Infrastructure Vision for European Research (DRIVER)

Fault tolerant quantum computing methods which work with efficient quantum error correcting codes are discussed. Several new techniques are introduced to restrict accumulation of errors before or during the recovery. Classes of eligible quantum codes are obtained, and good candidates exhibited. This permits a new analysis of the permissible error rates and minimum overheads for robust quantum computing. It is found that, under the standard noise model of ubiquitous stochasti...

Steane, Andrew M.

1998-01-01

232

An Approach to Build Software Based on Fault Tolerance Computing Using Uncertainty Factor  

Directory of Open Access Journals (Sweden)

In this work, we start with an overview of fault tolerance based systems. In the case of design diversity based software fault tolerance systems, we observed that uncertainty remains an important factor. With this factor in mind, we discuss implementing Bayes' theorem and a probabilistic mathematical model to handle the uncertainty factor. We assume that, once developed, the complete model will give better efficiency. The rest of this paper deals with other types of fault tolerance systems and their approaches. This part is a kind of literature review, which includes fault tolerant computing schemes that rely on the single-design as well as on the multiple-design. Further, in single-design, we discuss the recovery block, N-version programming and N self-checking programming schemes. Lastly, focusing on multiple-design, we discuss software engineering aspects, error detection mechanisms and fault tolerance by fault injection. The paper ends with a general conclusion.

Mrityunjay Brahma

2013-12-01

233

Quantum Control and Fault-tolerance  

Science.gov (United States)

Quantum control (QC) and the methods of fault-tolerant quantum computing (FTQC) are two of the cornerstones on which the hope for a quantum computer rests. However QC methods do not generally scale well with the size of the system, and it is not known how their performance is hindered when integration with FTQC methods, especially considering these demand a large system size overhead, is attempted under realistic noise models. Here we study this problem using dynamical decoupling in the bang-bang limit as a toy model, with a non-Markovian noise where interactions decay with distance, and show that there exists a regime of the norms of the relevant Hamiltonians, in which dynamical decoupling protected gates provide an advantage over the bare gate implementation. This is a first step towards showing that QC protocols designed for a small set of qubits can be extended to larger sets without a significant loss of performance, as long as the noise model behaves reasonably well.

Paz Silva, Gerardo; Dominy, Jason; Lidar, Daniel

2013-03-01

234

Improving Fault Tolerance in Ad-Hoc Networks by Using Residue Number System  

Directory of Open Access Journals (Sweden)

In this study, we present a method for distributing data storage using a residue number system for mobile systems and wireless networks based on the peer-to-peer paradigm. Generally, a redundant residue number system is capable of error detection and correction. In the proposed method, we build a new system by mixing the Redundant Residue Number System (RRNS), the Multi-Level Residue Number System (ML RNS) and the Multiple-Valued Logic Residue Number System (MVL RNS), which is well suited to parallel, carry-free, high-speed arithmetic and supports secure data communication. In addition, it is capable of error detection and correction. In comparison to other number systems, it offers many improvements in data security, error detection and correction, and speed of storage and retrieval.

A. Barati

2008-01-01
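
The error detection property of a redundant residue number system mentioned in the abstract above can be sketched in a few lines of Python. This is a generic textbook RRNS, not the mixed RRNS/ML RNS/MVL RNS scheme of the paper. The moduli (3, 5, 7 for information and 11, 13 for redundancy) and all function names are assumptions chosen for illustration. A value is stored as its residues. Reconstructing with the Chinese remainder theorem over all moduli yields a number inside the legitimate range only if the residues are consistent.

```python
from math import prod

# Illustrative, pairwise coprime moduli (assumed, not taken from the paper).
INFO_MODULI = (3, 5, 7)          # define the legitimate range [0, 105)
REDUNDANT_MODULI = (11, 13)      # extra residues used only for checking
MODULI = INFO_MODULI + REDUNDANT_MODULI
LEGIT_RANGE = prod(INFO_MODULI)


def encode(x):
    """Represent x by its residues modulo every modulus."""
    if not 0 <= x < LEGIT_RANGE:
        raise ValueError("value outside the legitimate range")
    return [x % m for m in MODULI]


def crt(residues, moduli):
    """Chinese remainder reconstruction for pairwise coprime moduli."""
    total = prod(moduli)
    value = 0
    for r, m in zip(residues, moduli):
        partial = total // m
        # pow(partial, -1, m) is the modular inverse (Python 3.8+).
        value += r * partial * pow(partial, -1, m)
    return value % total


def is_consistent(residues):
    """A residue vector is accepted only if it decodes into the legitimate range."""
    return crt(residues, MODULI) < LEGIT_RANGE


if __name__ == "__main__":
    word = encode(52)
    print(word, is_consistent(word))
    word[1] = (word[1] + 1) % MODULI[1]   # corrupt one stored residue
    print(word, is_consistent(word))
```

Because each redundant modulus here is larger than every information modulus, corrupting any single residue pushes the reconstructed value out of the legitimate range, so the corruption is detected. Correcting the error would need an additional search step that this sketch omits.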

235

Implementation of Fault Tolerant Method Using BCH Code on FPGA  

Directory of Open Access Journals (Sweden)

Fault tolerance is the property that enables a system (often computer-based) to continue operating properly in the event of the failure of (or one or more faults within) some of its components. The aim is to design a new 32-bit Arithmetic Logic Unit (ALU) that is secure against many attacks or faults and able to correct any 5-bit fault in any position of the 32-bit input register of the ALU, because radiation effects on electronic circuits may invert data bits of registers or memories. If one bit of the main storage system is changed, the mission of the system would be completely different. The main motivation for choosing BCH (Bose, Chaudhuri, and Hocquenghem) codes is that they are able to correct multiple errors, and this class of codes is a kind of powerful random error correcting cyclic code. In terms of area penalty, a 32-bit fault tolerant ALU using a BCH code is a better choice than Triple Modular Redundancy (TMR) or a Residue code. This is because the fault tolerant method for a 32-bit ALU using TMR with single or triplicated voting needs a single voter or a tripled voter and two extra 32-bit ALUs, which increases the hardware overhead by 202% and 208% respectively. The Residue code requires a hardware overhead of 148.9%. In comparison with TMR and the Residue code, however, the BCH code needs a hardware overhead of 70 to 75%, which reduces the overall cost and power consumption. Thus the proposed fault tolerant method has lower hardware overhead and provides multiple error correction when compared to the other techniques.

Mahadevaswamy V P

2012-09-01
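
The Triple Modular Redundancy baseline that the abstract above compares against can be sketched as a bitwise majority voter over three copies of a 32-bit word. This illustrates TMR only, not the paper's BCH-based design. The function name `vote` and the sample values are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def vote(a, b, c):
    """Bitwise 2-out-of-3 majority over three 32-bit words.

    Each output bit takes the value held by at least two of the inputs.
    """
    return ((a & b) | (a & c) | (b & c)) & MASK32


if __name__ == "__main__":
    word = 0xDEADBEEF
    upset = word ^ (0b11111 << 7)      # five flipped bits in one copy
    print(hex(vote(word, upset, word)))
```

The voter masks any number of flipped bits as long as they all fall in a single copy. If two copies are corrupted at the same bit position, the wrong value wins the vote. Tripling the ALU is what drives the roughly 200% hardware overhead quoted in the abstract.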

236

Fault Tolerant Weighted Voting Algorithms  

Directory of Open Access Journals (Sweden)

Computer networks are now necessities of modern organisations and network security has become a major concern for them. In this paper we propose a holistic approach to network security with a hybrid model that includes an Intrusion Detection System (IDS) to detect network attacks and a survivability model to assess the impacts of undetected attacks. A neural network-based IDS is proposed, where the learning mechanism for the neural network is evolved using a genetic algorithm. Then the case where an attack evades the IDS and takes the system into a compromised state is discussed. We propose a stochastic model which enables us to do a cost/benefit analysis for systems security. This integrated approach allows systems managers to make more informed decisions regarding both intrusion detection and system protection.

Azad Azadmanesh

2008-09-01

237

Cooperative Fault Tolerant Distributed Computing  

Energy Technology Data Exchange (ETDEWEB)

HARNESS was proposed as a system that combined the best of emerging technologies found in current distributed computing research and commercial products into a very flexible, dynamically adaptable framework that could be used by applications to allow them to evolve and better handle their execution environment. The HARNESS system was designed using the considerable experience from previous projects such as PVM, MPI, IceT and CUMULVS. As such, the system was designed to avoid the common problems found with using these current systems: it has no single point of failure and is able to survive machine, node and software failures. Additional features included improved inter-component connectivity, with full support for dynamic downloading of additional components at run-time, thus reducing the stress on application developers to build in all the libraries they need in advance.

Fagg, Graham E.

2006-03-15

238

Fault-tolerant battery system employing intra-battery network architecture  

Science.gov (United States)

A distributed energy storing system employing a communications network is disclosed. A distributed battery system includes a number of energy storing modules, each of which includes a processor and communications interface. In a network mode of operation, a battery computer communicates with each of the module processors over an intra-battery network and cooperates with individual module processors to coordinate module monitoring and control operations. The battery computer monitors a number of battery and module conditions, including the potential and current state of the battery and individual modules, and the conditions of the battery's thermal management system. An over-discharge protection system, equalization adjustment system, and communications system are also controlled by the battery computer. The battery computer logs and reports various status data on battery level conditions which may be reported to a separate system platform computer. A module transitions to a stand-alone mode of operation if the module detects an absence of communication connectivity with the battery computer. A module which operates in a stand-alone mode performs various monitoring and control functions locally within the module to ensure safe and continued operation.

Hagen, Ronald A. (Stillwater, MN); Chen, Kenneth W. (Fair Oaks, CA); Comte, Christophe (Montreal, CA); Knudson, Orlin B. (Vadnais Heights, MN); Rouillard, Jean (Saint-Luc, CA)

2000-01-01
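
The fallback behaviour described in the abstract above, where a module switches to stand-alone operation when it loses contact with the battery computer, can be sketched as follows. The abstract does not say how loss of connectivity is detected. The heartbeat timeout used here, the automatic return to network mode, and the names `ModuleController`, `on_heartbeat` and `timeout_s` are all assumptions made for illustration.

```python
class ModuleController:
    """Tracks whether a module should run in network or stand-alone mode.

    Assumption: connectivity is judged by the time since the last message
    (heartbeat) received from the battery computer.
    """

    def __init__(self, timeout_s=2.0):
        self.timeout_s = timeout_s
        self.last_heartbeat = None   # no contact with the battery computer yet

    def on_heartbeat(self, now):
        """Record a message from the battery computer at time `now` (seconds)."""
        self.last_heartbeat = now

    def mode(self, now):
        """Return 'network' while heartbeats are fresh, else 'stand-alone'."""
        if self.last_heartbeat is None:
            return "stand-alone"
        if now - self.last_heartbeat > self.timeout_s:
            return "stand-alone"
        return "network"


if __name__ == "__main__":
    module = ModuleController(timeout_s=2.0)
    module.on_heartbeat(now=1.0)
    print(module.mode(now=2.5))   # heartbeat is 1.5 s old
    print(module.mode(now=3.5))   # heartbeat is 2.5 s old
```

Passing the time in as an argument, rather than reading a clock inside the class, keeps the sketch deterministic and easy to test. In stand-alone mode a real module would run its local monitoring and control functions, which are outside the scope of this sketch.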

239

Error Mitigation of Point-to-Point Communication for Fault-Tolerant Computing  

Science.gov (United States)

Fault tolerant systems require the ability to detect and recover from physical damage caused by the hardware's environment, faulty connectors, and system degradation over time. This ability applies to military, space, and industrial computing applications. The integrity of Point-to-Point (P2P) communication, between two microcontrollers for example, is an essential part of fault tolerant computing systems. In this paper, different methods of fault detection and recovery are presented and analyzed.

Akamine, Robert L.; Hodson, Robert F.; LaMeres, Brock J.; Ray, Robert E.

2011-01-01

240

Fault tolerant microcomputer based alarm annunciator for Dhruva reactor  

International Nuclear Information System (INIS)

The Dhruva alarm annunciator displays the status of 624 alarm points on an array of display windows using the standard ringback sequence. Recognizing the need for very high availability, the system is implemented as a fault tolerant configuration. The annunciator is partitioned into three identical units; each unit is implemented using two microcomputers wired in a hot standby mode. In the event of one computer malfunctioning, the standby computer takes over control in a bounce-free transfer. The use of microprocessors has helped build flexibility into the system. The system also provides a built-in capability to resolve the sequence of occurrence of events and conveys this information to another system for display on a CRT. This report describes the system features, the fault tolerant organisation used and the hardware and software developed for the annunciation function. (author). 8 figs

 
 
 
 
241

A novel adaptive switching function on fault tolerable sliding mode control for uncertain stochastic systems.  

Science.gov (United States)

A novel switching function based on an optimization strategy for the sliding mode control (SMC) method is provided for uncertain stochastic systems subject to actuator degradation, such that the closed-loop system is globally asymptotically stable with probability one. In previous research, the sliding surface has been a proportional or proportional-integral function of the states. In this research, a degree of freedom that depends on the designer's choice is used to meet certain objectives: the design of the switching function includes a parameter which the designer can tune for specified objectives. A sliding-mode controller is synthesized to ensure the reachability of the specified switching surface, despite actuator degradation and uncertainties. Finally, the simulation results demonstrate the effectiveness of the proposed method. PMID:24954808

Zahiripour, Seyed Ali; Jalali, Ali Akbar

2014-09-01

242

H∞ Fault Tolerant Control of WECS Based on the PWA Model  

Directory of Open Access Journals (Sweden)

The main contribution of this paper is the development of H∞ fault tolerant control for a wind energy conversion system (WECS) based on the stochastic piecewise affine (PWA) model. In this paper the normal and fault stochastic PWA models for WECS, including multiple working points at different wind speeds, are established. A reliable piecewise linear quadratic regulator state feedback is designed for the fault tolerant actuator and sensor. A sufficient condition for the existence of the passive fault tolerant controller is derived based on some linear matrix inequalities (LMIs). It is shown that the H∞ fault tolerant controller of WECS can control the wind turbine exposed to multiple simultaneous sensor faults or actuator faults; that is, the reliability of wind turbines can be improved.

Yun-Tao Shi

2014-03-01

243

A High Performance Protocol for Fault Tolerant Distributed Shared Memory (FaTP)  

Directory of Open Access Journals (Sweden)

In distributed environments, runtime failures often occur. If the distributed system has the ability to handle such failures dynamically (at runtime), it is said to be fault tolerant. Such systems tend to be slow compared to non-fault tolerant systems. Moreover, if the system relies on a Distributed Shared Memory (DSM) for exchanging data among the distributed application members, it will be slower still and may be inefficient. In this study, a generic DSM based Fault Tolerance Protocol (FaTP) is introduced. FaTP is a high performance fault tolerance protocol. The proposed protocol is based on the Linda tuple space DSM model. It introduces a compact set of DSM access primitives and is supplied with a fault tolerance layer based on dynamic replication. The complexity of FaTP has been measured and its performance has been evaluated.

Hosam E. Reffat

2013-01-01

244

A lightweight fault-tolerant middleware for a Subaru Telescope second generation observation control system  

Science.gov (United States)

Subaru Telescope is developing a second-generation Observation Control System that specifically addresses some of the deficiencies of the current Subaru OCS. Two areas of concern are complexity and failure handling. The current system has over 1000 dedicated OCS processes spread across a dozen hosts and provides nothing in the way of automated failover. Furthermore, manual failover is so fraught with difficulty that it is rarely attempted. Our Generation 2 OCS is written almost entirely in Python and builds upon a Subaru-developed middleware based on the XML-RPC protocol. This framework offers the following benefits: it has very few dependencies outside of standard Python; provides a nearly seamless remote proxy object-oriented interface; provides optional user/password authentication and/or SSL encryption; is extremely simple to use from client applications; is connectionless, and assists transparent failover of communications and services on a cluster of hosts; has reasonable performance for a wide range of needs; allows multiple language bindings; and, for dynamic languages, requires no interface stub files. The "back end" (service side) of the OCS is nearing completion, and has already been used successfully during two separate OCS engineering runs. It comprises only a couple dozen processes, and provides automated failover capabilities on a rack of commodity x86 Linux servers. We provide an overview of the middleware design and its failover capabilities. Some data on the performance of communications using the middleware protocol is included.

Jeschke, Eric; Bon, Bruce; Inagaki, Takeshi; Streeper, Sam

2008-08-01

245

ACID Support and Fault-Tolerant Database Systems on Cloud: A Review  

Directory of Open Access Journals (Sweden)

Cloud computing represents a different way to architect and remotely manage computing resources. One has only to establish an account with Microsoft, Amazon or Google to begin building and deploying application systems into a cloud. These systems can be, but certainly are not restricted to being, simplistic. Some applications require HTTP services, some require a relational database, and some might require web service infrastructure and message queues. With clouds, IT-related applications can be provided as a service, which can be accessed through the internet. There are cloud platforms which provide scalability and high availability for web applications, but at the same time there are problems related to data consistency, and in the case of server failures this becomes a major problem in applications related to payment services. Data needs to be properly managed in a cloud environment, and to achieve proper transaction processing and consistency, RDBMS techniques such as ACID transactions should be used. Web services in Azure ensure application availability by replicating stored data at least three times and offer optional geolocation of replicas in separate Microsoft data centres to provide disaster recovery services. Azure storage services provide scalable persistent storage of structured tables, blobs and queues.

Pratiyush Guleria

2012-10-01

246

Special Issue: Fault Tolerant Control of Power Grids  

Digital Repository Infrastructure Vision for European Research (DRIVER)

This special issue contains articles on fault detection and isolation and fault tolerant control methods applied to different aspects of modern power grids, both for detecting, isolating and accommodating faults in the power grid, and for detecting, isolating and accommodating faults in power generating units.

Odgaard, Peter; Aubrun, Christophe; Majanne, Yrjo

2014-01-01

247

Universal Fault-Tolerant Computation on Decoherence-Free Subspaces  

CERN Document Server

A general scheme to perform universal quantum computation fault-tolerantly within decoherence-free subspaces (DFSs) of a system's Hilbert space is derived. This scheme leads to the first fault-tolerant realization of universal quantum computation on DFSs with the properties that (i) only one- and two-qubit interactions are required, and (ii) the system remains within the DFS throughout the entire implementation of a quantum gate. We show explicitly how to perform universal computation on clusters of the four-qubit DFS encoding one logical qubit each under "collective decoherence" (qubit-permutation-invariant system-bath coupling). Our results have immediate relevance to a number of proposed quantum computer implementations, in particular those in which the internal system Hamiltonian is of the Heisenberg type, such as spin-spin coupled quantum dots.

Bacon, D J; Lidar, D A; Whaley, K B

2000-01-01

248

Dynamic Fault Tolerance in Desktop Grids Based On Reliability  

Directory of Open Access Journals (Sweden)

Full Text Available Fault tolerance is an important issue in guaranteeing reliable execution of tasks in computational desktop grid environments, where execution failures are frequently expected. It requires the availability of efficient fault-tolerant strategies able to deal effectively with resource failures and/or unplanned periods of unavailability. In this paper we present a Dynamic Fault Tolerant strategy that, rather than just tolerating faults as traditional fault-tolerant schedulers do, exploits information about task size, resource speed and resource reliability, maintaining a resource history to improve application performance. The performance of this strategy has been compared via simulation with that attained by a traditional fault-tolerant strategy. Our results, obtained by considering a set of realistic scenarios modeled after real Desktop Grids, show that our approach results in better application performance and resource utilization.

Geeta Arora

2013-10-01

249

Superior model for fault tolerance computation in designing nano-sized circuit systems  

Science.gov (United States)

As CMOS technology scales down to nanometre dimensions, reliability becomes a decisive subject in the design methodology of nano-sized circuit systems. As a result, several computational approaches have been developed to compute and evaluate the reliability of nano-electronic circuits. The process of computing reliability becomes very troublesome and time consuming as the computational complexity builds up with the circuit size. Therefore, being able to measure reliability quickly and accurately is fast becoming necessary in designing modern logic integrated circuits. For this purpose, the paper first looks into the development of an automated reliability evaluation tool based on the generalization of the Probabilistic Gate Model (PGM) and Boolean Difference-based Error Calculator (BDEC) models. The Matlab-based tool allows users to significantly speed up the task of reliability analysis for a very large number of nano-electronic circuits. Secondly, using the developed automated tool, the paper presents a comparative study of reliability computation and evaluation by the PGM and BDEC models for different implementations of circuits with the same functionality. Based on the reliability analysis, BDEC gives exact and transparent reliability measures, but as the complexity of the same-functionality circuits with respect to gate error increases, the reliability measure by BDEC tends to be lower than the reliability measure by PGM. The lower reliability measure by BDEC is explained in this paper using the distribution of different signal input patterns over time for same-functionality circuits. Simulation results show that the reliability measure by BDEC depends not only on faulty gates but also on circuit topology, the probability of input signals being one or zero, and the probability of error on signal lines.

Singh, N. S. S.; Asirvadam, V. S.; Muthuvalu, M. S.

2014-10-01

250

Synthesis of Fault Tolerant Reversible Logic Circuits  

CERN Document Server

Reversible logic is emerging as an important research area having its application in diverse fields such as low power CMOS design, digital signal processing, cryptography, quantum computing and optical information processing. This paper presents a new 4*4 universal reversible logic gate, IG. It is a parity preserving reversible logic gate, that is, the parity of the inputs matches the parity of the outputs. The proposed parity preserving reversible gate can be used to synthesize any arbitrary Boolean function. It makes any fault that affects no more than a single signal readily detectable at the circuit's primary outputs. Finally, it is shown how a fault tolerant reversible full adder circuit can be realized using only two IGs. It has also been demonstrated that the proposed design offers less hardware complexity and is more efficient in terms of gate count, garbage outputs and constant inputs than the existing counterparts.

Islam, Md Saiful; Begum, Zerina; Hafiz, Mohd Zulfiquar; Mahmud, Abdullah Al; 10.1109/CAS-ICTD.2009.4960883

2010-01-01
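
The parity-preservation property described in the record above can be checked by brute force. The abstract does not give the truth table of the IG gate, so the sketch below uses the well-known 3*3 Fredkin (controlled-swap) gate as a stand-in, and the Toffoli gate as a counterexample. All function names are illustrative assumptions.

```python
from itertools import product


def fredkin(c, a, b):
    """Fredkin (controlled-swap) gate: swaps a and b when c is 1."""
    return (c, b, a) if c else (c, a, b)


def toffoli(c1, c2, t):
    """Toffoli gate: flips the target t when both controls are 1."""
    return (c1, c2, t ^ (c1 & c2))


def parity(bits):
    """Parity (0 or 1) of a tuple of bits."""
    return sum(bits) % 2


def is_reversible(gate, width):
    """A gate is reversible if it maps distinct inputs to distinct outputs."""
    outputs = {gate(*bits) for bits in product((0, 1), repeat=width)}
    return len(outputs) == 2 ** width


def is_parity_preserving(gate, width):
    """Check that input parity equals output parity for every input."""
    return all(
        parity(bits) == parity(gate(*bits))
        for bits in product((0, 1), repeat=width)
    )


def single_fault_detected(gate, bits, flip_index):
    """Flip one output bit and report whether a parity check notices it."""
    out = list(gate(*bits))
    out[flip_index] ^= 1
    return parity(out) != parity(bits)
```

Because a single flipped signal always changes the parity, comparing input and output parity exposes any single-bit fault at the outputs of a parity-preserving gate. The Toffoli gate is reversible but not parity preserving, so the same check would not be sound for it.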

251

Diagnosis and Fault-tolerant Control for Ship Station Keeping  

DEFF Research Database (Denmark)

This paper addresses the design process of diagnosis and fault-tolerant control when a system should operate despite multiple failures in sensors or actuators. Graph-theory based analysis of system structure is demonstrated to be a unique design methodology that can cope with the diagnosis design for systems of high complexity, and can also analyse cases of cascaded or multiple faults. The paper takes as an example a ship with two CP propellers, rudders and a bow thruster as actuators, and instrumentation with a suite of global position sensors, inertial navigation units and conventional gyro units to provide ship motion information. A salient feature of the design method is the ability to analyse cases where faults have occurred and easily determine where in the faulty system diagnosability and controllability are retained.

Blanke, Mogens

2005-01-01

252

Fault tolerance in Hadoop MapReduce implementation  

Digital Repository Infrastructure Vision for European Research (DRIVER)

This document reports the advances on exploring and understanding the fault tolerance mechanisms in Hadoop MapReduce. A description of the current fault tolerance features existing in Hadoop is provided, along with a review of related works on the topic. Finally, the document describes some relevant fault tolerance proposals worth considering for implementation in Hadoop within the PERMARE project in order to provide support for pervasive computing environments.

Cogorno, Matías; Rey, Javier; Nesmachnow, Sergio

2013-01-01

253

Design and Verification of Fault-Tolerant Components  

DEFF Research Database (Denmark)

We present a systematic approach to design and verification of fault-tolerant components with real-time properties as found in embedded systems. A state machine model of the correct component is augmented with internal transitions that represent hypothesized faults. Also, constraints on the occurrence or timing of faults are included in this model. This model of a faulty component is then extended with fault detection and recovery mechanisms, again in the form of state machines. Desired properties of the component are model checked for each of the successive models. The models can be made relatively detailed such that they can serve directly as blueprints for engineering, and yet be amenable to exhaustive verification. The approach is illustrated with a design of a triple modular fault-tolerant system that is a real case we received from our collaborators in the aerospace field. We use UPPAAL to model and check this design. Model checking uses concrete parameters, so we extend the result with parametric analysis using abstractions of the automata in a rigorous verification.

Zhang, Miaomiao; Liu, Zhiming

2009-01-01

254

Fault tolerant quantum computation with nondeterministic gates.  

Science.gov (United States)

In certain approaches to quantum computing the operations between qubits are nondeterministic and likely to fail. For example, a distributed quantum processor would achieve scalability by networking together many small components; operations between components should be assumed to be failure prone. In the ultimate limit of this architecture each component contains only one qubit. Here we derive thresholds for fault-tolerant quantum computation under this extreme paradigm. We find that computation is supported for remarkably high failure rates (exceeding 90%) providing that failures are heralded; meanwhile the rate of unknown errors should not exceed 2 in 10^4 operations. PMID:21231569

Li, Ying; Barrett, Sean D; Stace, Thomas M; Benjamin, Simon C

2010-12-17

255

A Blueprint for a Topologically Fault-tolerant Quantum Computer  

CERN Document Server

The advancement of information processing into the realm of quantum mechanics promises a transcendence in computational power that will enable problems to be solved which are completely beyond the known abilities of any "classical" computer, including any potential non-quantum technologies the future may bring. However, the fragility of quantum states poses a challenging obstacle for realization of a fault-tolerant quantum computer. The topological approach to quantum computation proposes to surmount this obstacle by using special physical systems -- non-Abelian topologically ordered phases of matter -- that would provide intrinsic fault-tolerance at the hardware level. The so-called "Ising-type" non-Abelian topological order is likely to be physically realized in a number of systems, but it can only provide a universal gate set (a requisite for quantum computation) if one has the ability to perform certain dynamical topology-changing operations on the system. Until now, practical methods of implementing thes...

Bonderson, Parsa; Freedman, Michael; Nayak, Chetan

2010-01-01

256

Analysis of Fault Tolerant Techniques in Secure Mobile Agent Paradigm  

Directory of Open Access Journals (Sweden)

Full Text Available Over the past few years, network domains and mobile agent technology have been among the fastest growing and emerging trends. However, the technology faces certain challenges and problems in meeting bandwidth requirements. Moreover, it suffers from reliability issues such as security and fault tolerance. During agent migration along an itinerary from one server to another, a common issue is a server crash or agent crash. The parameters used for the evaluation of the various techniques are agent centric, system centric, fault type, coordination performance analysis, central management and adaptive. The advantages of each mechanism are also described.

Parul Arora

2014-05-01

257

Fault Tolerant Control in a Semi-active Suspension  

Digital Repository Infrastructure Vision for European Research (DRIVER)

A Fault Tolerant Control System (FTCS) in a Quarter of Vehicle (QoV) model is proposed. The control law is time-varying, using a Linear Parameter-Varying (LPV) based controller that includes two scheduling parameters: one for monitoring the nonlinear behavior of the damper, and another for fault accommodation using a reference model obtained by a state observer of the normal operating regime. The QoV model represents a semi-active suspension, including an experimental magneto-rhe...

Tudon-martínez, Juan C.; Morales-menéndez, Rubén; Ramirez-mendoza, Ricardo; Sename, Olivier; Dugard, Luc

2012-01-01

258

A defect- and fault-tolerant architecture for nanocomputers  

Science.gov (United States)

Both von Neumann's NAND multiplexing, based on a massive duplication of imperfect devices and randomized imperfect interconnects, and reconfigurable architectures have been investigated to come up with solutions for integrations of highly unreliable nanometre-scale devices. In this paper, we review these two techniques, and present a defect- and fault-tolerant architecture in which von Neumann's NAND multiplexing is combined with a massively reconfigurable architecture. The system performance of this architecture is evaluated by studying its reliability, i.e. the probability of system survival. Our evaluation shows that the suggested architecture can tolerate a device error rate of up to 10^-2, with multiple redundant components; the structure is efficiently robust against both permanent and transient faults for an ultra-large integration of highly unreliable nanometre-scale devices.

Han, Jie; Jonker, Pieter

2003-02-01

259

On Reliability Analysis of Fault-tolerant Multistage Interconnection Networks  

Directory of Open Access Journals (Sweden)

Full Text Available The design of a suitable interconnection network for inter-processor communication is one of the key issues in system performance. The reliability of these networks and their ability to continue operating despite failures are major concerns in determining the overall system performance. In this paper a new irregular network, IABN, is proposed by modifying the existing ABN network. ABN is a regular multipath network with limited fault tolerance. The reliabilities of the IABN and ABN multi-stage interconnection networks have been calculated and compared in terms of the upper and lower bounds of mean time to failure (MTTF). The IABN provides much better fault tolerance by providing three times more paths between any source-destination pair, and better reliability, at the expense of slightly higher cost than ABN.

Rinkle Aggarwal

2008-11-01

260

On the Transition Improvement of EV or HEV Induction Motor Propulsion Sensor Fault-Tolerant Controller  

Digital Repository Infrastructure Vision for European Research (DRIVER)

This technical paper deals with the transition performance improvement of a sensor fault-tolerant controller devoted to Electric (EV) or Hybrid Electric Vehicles (HEV). Improvements are made over a previously developed technique that exhibits abrupt changes in the torque when a sensor fault is detected and the controller transitions from one control technique to another [1]. The Fault-Tolerant Control (FTC) system firstly concerns the sliding mode control technique since better performances...

Tabbache, Bekheira; Benbouzid, Mohamed; Kheloui, Abdelaziz

2010-01-01

 
 
 
 
261

Fault-tolerant adaptive FIR filters using variable detection threshold  

Science.gov (United States)

Adaptive filters are widely used in many digital signal processing applications, where the tap weights of the filters are adjusted by stochastic gradient search methods. Block adaptive filtering techniques, such as the block least mean square and block conjugate gradient algorithms, were developed to speed up convergence as well as improve tracking capability, which are two important factors in designing real-time adaptive filter systems. Even though algorithm-based fault tolerance can be used as a low-cost, high-level fault-tolerant technique to protect the aforementioned systems from hardware failures with minimal hardware overhead, choosing a good detection threshold remains a challenging problem. First of all, the systems usually have only limited computational resources, i.e., concurrent error detection and correction is not feasible. Secondly, prior knowledge of the input data is very difficult to obtain in practical settings. We propose a checksum-based fault detection scheme using two-level variable detection thresholds that are dynamically dependent on the past syndromes. Simulations show that the proposed scheme reduces the possibility of false alarms and has a high degree of fault coverage in adaptive filter systems.

Lin, L. K.; Redinbo, G. R.

1994-10-01
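
The record above describes checksum-based fault detection with a threshold that depends on past syndromes. The following is a minimal sketch of that general idea, not the authors' two-level scheme. It assumes fixed tap weights within a block and uses the identity that, for a full linear convolution, the sum of the outputs equals the sum of the inputs times the sum of the taps. The class name `AdaptiveDetector` and its `floor`, `scale` and `memory` parameters are hypothetical.

```python
def fir_block(x, h):
    """Full linear convolution of input block x with tap weights h."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y


def syndrome(x, h, y):
    """Checksum syndrome: zero (up to round-off) for a fault-free block,
    because sum(y) == sum(x) * sum(h) for a full convolution."""
    return abs(sum(y) - sum(x) * sum(h))


class AdaptiveDetector:
    """Flags a block as faulty when its syndrome exceeds a threshold
    derived from an exponential average of past fault-free syndromes."""

    def __init__(self, floor=1e-9, scale=8.0, memory=0.9):
        self.floor = floor    # minimum threshold, covers round-off noise
        self.scale = scale    # multiple of the running average allowed
        self.memory = memory  # weight given to the previous average
        self.avg = 0.0

    def threshold(self):
        return self.floor + self.scale * self.avg

    def check(self, s):
        """Return True if syndrome s indicates a fault."""
        faulty = s > self.threshold()
        if not faulty:
            # Only fault-free syndromes update the running average.
            self.avg = self.memory * self.avg + (1 - self.memory) * s
        return faulty
```

Letting the threshold track recent fault-free syndromes is one way to reduce false alarms caused by round-off that grows with signal level, while still flagging a large deviation caused by a corrupted output.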

262

Beam dynamics calculations for fault-tolerance  

International Nuclear Information System (INIS)

The European Transmutation Demonstration requires a high-power proton accelerator operating in CW mode. This accelerator is also expected to have a very limited number of unexpected beam interruptions per year. To reach such an ambitious goal, it is clear that reliability-oriented design practices need to be followed from the early stage of component design, and fault-tolerance capabilities have to be introduced to the maximum extent. The goal of this document is to investigate in more detail the fault-tolerance capability of the XT-ADS linac. From previous analysis, it appears that if nothing is done, a cavity failure leads in nearly all cases to a complete beam loss, due to the non-relativistic varying velocity of the particles. To avoid such a total beam loss, some kind of retuning has to be performed to compensate for the lack of acceleration due to the faulty cavity. We have to identify and develop fast failure recovery scenarios to ensure that such retuning can be performed in less than 1 second. Two ways are investigated. The first is to stop the beam to achieve the retuning (Scenario 1). The other is to try to perform the retuning without stopping the beam (Scenario 2). The present analysis demonstrates, from the beam dynamics point of view, that a fast retuning procedure can be envisaged without stopping the beam (Scenario 2). Nevertheless, Scenario 2 implies stringent specifications, especially on: (i) the fault detection time, which has to be extremely short (order of magnitude: 100 µs), and (ii) the margins required on the accelerating field and RF power, which are higher than in Scenario 1.

263

Fault Detection and Isolation and Fault Tolerant Control of Wind Turbines Using Set-Valued Observers  

DEFF Research Database (Denmark)

Research on wind turbine Operations & Maintenance (O&M) procedures is critical to the expansion of Wind Energy Conversion systems (WEC). In order to reduce O&M costs and increase the lifespan of the turbine, we study the application of Set-Valued Observers (SVO) to the problem of Fault Detection and Isolation (FDI) and Fault Tolerant Control (FTC) of wind turbines, by taking advantage of the recent advances in SVO theory for model invalidation. A simple wind turbine model is presented along with possible faulty scenarios. The FDI algorithm is built on top of the described model, taking into account process disturbances, uncertainty and sensor noise. The FTC strategy takes advantage of the proposed FDI algorithm, enabling the controller reconfiguration shortly after fault events. Additionally, a robust controller is designed so as to increase the wind turbine's performance during low severity faults. Finally, the FDI algorithm is assessed within a publicly available benchmark model, using Monte-Carlo simulation runs.

Casau, Pedro; Rosa, Paulo Andre Nobre

2012-01-01

264

Design and Analysis of a Fault Tolerant Microprocessor Based on Triple Modular Redundancy Using VHDL  

Directory of Open Access Journals (Sweden)

Full Text Available There are numerous real-time and operation-critical systems in which failure of the system is unacceptable at any stage of processing. Examples of such systems include ATMs, satellites and spacecraft. In this paper a fault tolerant microprocessor is developed by using checker units with a fault secure ALU; the fault secure ALU was developed using parity prediction logic and the two-rail checker method. Finally, triple modular redundancy is applied to develop a fault tolerant processor. The proposed method was validated using the VHDL test environment, and the results showed that the reliability of the system increased with little area overhead.

Deepti Shinghal

2011-03-01
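
Triple modular redundancy, as applied in the record above, runs three copies of a unit and takes a majority vote over their outputs. The paper's design is in VHDL; the following is only a software sketch of the voting principle in Python, with hypothetical function names.

```python
from collections import Counter


def majority_vote(a, b, c):
    """Return the value produced by at least two of three replicas.

    Raises ValueError when all three disagree, since no majority exists.
    """
    value, count = Counter((a, b, c)).most_common(1)[0]
    if count < 2:
        raise ValueError("no majority: all three replicas disagree")
    return value


def tmr(replicas, x):
    """Run three replicas of a computation on x and vote on the result."""
    results = [replica(x) for replica in replicas]
    return majority_vote(*results)
```

With two healthy replicas computing `x + 1` and one faulty replica returning something else, `tmr` still returns `x + 1`. A single faulty module is masked, but two simultaneous faults can outvote the healthy replica, which is the usual limit of TMR.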

265

Fault Tolerance in ZigBee Wireless Sensor Networks  

Science.gov (United States)

Wireless sensor networks (WSN) based on the IEEE 802.15.4 Personal Area Network standard are finding increasing use in the home automation and emerging smart energy markets. The network and application layers, based on the ZigBee 2007 PRO Standard, provide a convenient framework for component-based software that supports customer solutions from multiple vendors. This technology is supported by System-on-a-Chip solutions, resulting in extremely small and low-power nodes. The Wireless Connections in Space Project addresses the aerospace flight domain for both flight-critical and non-critical avionics. WSNs provide the inherent fault tolerance required for aerospace applications utilizing such technology. The team from Ames Research Center has developed techniques for assessing the fault tolerance of ZigBee WSNs challenged by radio frequency (RF) interference or WSN node failure.

Alena, Richard; Gilstrap, Ray; Baldwin, Jarren; Stone, Thom; Wilson, Pete

2011-01-01

266

Fault-tolerance techniques for SRAM-based FPGAs  

CERN Document Server

Fault-tolerance in integrated circuits is no longer the exclusive concern of space designers or highly-reliable applications engineers. Today, designers of many next-generation products must cope with reduced noise margins. The continuous evolution of fabrication technology of semiconductor components – shrinking transistor geometry, power supply, speed, and logic density – has significantly reduced the reliability of very deep submicron integrated circuits, in face of various internal and external sources of noise. Field Programmable Gate Arrays (FPGAs), customizable by SRAM cells, are the latest advance in the integrated circuit evolution: millions of memory cells to implement the logic, embedded memories, routing, and embedded microprocessors cores. These re-programmable systems-on-chip platforms must be fault-tolerant to cope with current requirements.

Kastensmidt, Fernanda Lima; Reis, Ricardo

2006-01-01

267

Active Fault Tolerant Control for Ultrasonic Piezoelectric Motor  

Science.gov (United States)

Ultrasonic piezoelectric motor technology is an important system component in integrated mechatronics devices working under extreme operating conditions. Due to these constraints, robustness and performance of the control interfaces should be taken into account in the motor design. In this paper, we apply a new architecture for fault tolerant control using Youla parameterization to an ultrasonic piezoelectric motor. The distinguishing feature of the proposed controller architecture is that it shows structurally how the controller design for performance and robustness may be done separately, which has the potential to overcome the conflict between performance and robustness in the traditional feedback framework. The fault tolerant control architecture includes two parts: one for performance and the other for robustness. The controller design works in such a way that the feedback control system is controlled solely by the proportional plus double-integral (PI²) performance controller for a nominal model without disturbances, and the H∞ robustification controller is activated only in the presence of uncertainties or external disturbances. The simulation results demonstrate the effectiveness of the proposed fault tolerant control architecture.

Boukhnifer, Moussa

2012-07-01

268

A Framework-Based Approach for Fault-Tolerant Service Robots  

Directory of Open Access Journals (Sweden)

Full Text Available Recently the component-based approach has become a major trend in intelligent service robot development due to its reusability and productivity. The framework in a component-based system should provide essential services for application components. However, to our knowledge the existing robot frameworks do not yet support a fault tolerance service. Moreover, it is often believed that faults can be handled only at the application level. In this paper, by extending the robot framework with a fault tolerance function, we argue that the framework-based fault tolerance approach is feasible and even has many benefits, including that: (1) system integrators can build fault tolerant applications from non-fault-aware components; (2) the constraints of the components and the operating environment, which cannot be anticipated easily at the time of component development, can be considered at the time of integration; (3) consistency in system reliability can be obtained even in spite of diverse application component sources. In the proposed construction, we build XML rule files defining the rules for probing and determining the fault conditions of each component, contamination cases from a faulty component, and the possible recovery and safety methods. The rule files are established by a system integrator, and the fault manager in the framework controls the fault tolerance process according to the rules. We demonstrate that the fault-tolerant framework can incorporate widely accepted fault tolerance techniques. The effectiveness and real-time performance of the framework-based approach and its techniques are examined by testing an autonomous mobile robot in typical fault scenarios.

Heejune Ahn

2012-11-01

269

Tolerance towards sensor faults: An application to a flexible arm manipulator  

Digital Repository Infrastructure Vision for European Research (DRIVER)

As more engineering operations become automatic, the need for robustness towards faults increases. Hence, a fault tolerant control (FTC) scheme is a valuable asset. This paper presents a robust sensor fault FTC scheme implemented on a flexible arm manipulator, which has many applications in automation. Sensor faults affect the system's performance in the closed loop when the faulty sensor readings are used to generate the control input. In this paper, the non-faulty sensors are used to recons...

Chee Pin Tan; Habib, Maki K.

2008-01-01

270

Analysis of a cascaded multilevel inverter with fault-tolerant control  

Directory of Open Access Journals (Sweden)

Full Text Available Cascaded multilevel inverters are widely used in industry for speed control of induction motors and, even though the converters' operation is highly reliable, several faults can occur, leading to poor engine performance or even causing the whole system to stop. It is desirable to keep the system operational when a failure occurs, even in a degraded mode, and implementing fault-tolerant systems is thus a good choice. This paper presents a general strategy for fault-tolerant control in a 7-level cascaded multilevel inverter (the faults are in semiconductor devices); the paper includes simulation and experimental results to validate the method.

Jesús Aguayo Alquicira

2011-08-01

271

Full Tolerant Archiving System  

Science.gov (United States)

The archiving system at the Italian center for Astronomical Archives (IA2) manages data from external sources like telescopes, observatories, or surveys and handles them in order to guarantee preservation, dissemination, and reliability, in most cases in a Virtual Observatory (VO) compliant manner. A metadata model dynamic constructor and a data archive manager are new concepts aimed at automatizing the management of different astronomical data sources in a fault tolerant environment. The goal is a full tolerant archiving system, nevertheless complicated by the presence of various and time changing data models, file formats (FITS, HDF5, ROOT, PDS, etc.) and metadata content, even inside the same project. To avoid this unpleasant scenario a novel approach is proposed in order to guarantee data ingestion, backward compatibility, and information preservation.

Knapic, C.; Molinaro, M.; Smareglia, R.

2013-10-01

272

Hypothetical Scenario Generator for Fault-Tolerant Diagnosis  

Science.gov (United States)

The Hypothetical Scenario Generator for Fault-tolerant Diagnostics (HSG) is an algorithm being developed in conjunction with other components of artificial-intelligence systems for automated diagnosis and prognosis of faults in spacecraft, aircraft, and other complex engineering systems. By incorporating prognostic capabilities along with advanced diagnostic capabilities, these developments hold promise to increase the safety and affordability of the affected engineering systems by making it possible to obtain timely and accurate information on the statuses of the systems and predicting impending failures well in advance. The HSG is a specific instance of a hypothetical-scenario generator that implements an innovative approach for performing diagnostic reasoning when data are missing. The special purpose served by the HSG is to (1) look for all possible ways in which the present state of the engineering system can be mapped with respect to a given model and (2) generate a prioritized set of future possible states and the scenarios of which they are parts.

James, Mark

2007-01-01

273

On the Practicality of `Practical' Byzantine Fault Tolerance  

CERN Document Server

Byzantine Fault Tolerant (BFT) systems are considered by the systems research community to be state of the art with regards to providing reliability in distributed systems. BFT systems provide safety and liveness guarantees with reasonable assumptions, amongst a set of nodes where at most f nodes display arbitrarily incorrect behaviors, known as Byzantine faults. Despite this, BFT systems are still rarely used in practice. In this paper we describe our experience, from an application developer's perspective, trying to leverage the publicly available and highly-tuned PBFT middleware (by Castro and Liskov), to provide provable reliability guarantees for an electronic voting application with high security and robustness needs. We describe several obstacles we encountered and drawbacks we identified in the PBFT approach. These include some that we tackled, such as lack of support for dynamic client management and leaving state management completely up to the application. Others still remaining include the lack of...

Chondros, Nikos; Roussopoulos, Mema

2011-01-01
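
The bound of "at most f" Byzantine nodes in the record above translates into concrete replica counts in PBFT-style protocols, which need n >= 3f + 1 replicas and use quorums of 2f + 1. A small sketch of that arithmetic follows; the function names are illustrative and are not part of the PBFT middleware.

```python
def min_replicas(f):
    """Smallest number of replicas that tolerates f Byzantine faults."""
    return 3 * f + 1


def quorum_size(f):
    """Number of matching replies needed to form a quorum."""
    return 2 * f + 1


def max_faults(n):
    """Largest f that a group of n replicas can tolerate."""
    return (n - 1) // 3


def quorum_overlap(f):
    """Minimum number of replicas shared by any two quorums when
    n = 3f + 1: 2 * (2f + 1) - (3f + 1) = f + 1."""
    return 2 * quorum_size(f) - min_replicas(f)
```

Since any two quorums share at least f + 1 replicas, at least one replica in the intersection is correct, which is what prevents two conflicting decisions from both gathering a quorum.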

274

Guaranteed Cost Fault-tolerant Control of Networked Control Systems with Short Output Delay and Short Control Delay Based on State Observer  

Directory of Open Access Journals (Sweden)

Full Text Available Supposing that the sensor and controller nodes are time-driven and the actuator node is event-driven, the problem of integrity against sensor failures for networked control systems with short output delay and short control delay is discussed based on a state observer. The state observer of the system is designed according to the time-delay compensation strategy. Then, considering possible sensor failures, an augmented mathematical model for the observer-based networked control system is developed. In terms of the given quadratic performance index function, the integrity condition of the system is given, and the designs for the guaranteed cost fault-tolerant controller and observer are presented using the cooperative design approach for controller and observer and the bilinear matrix inequality approach. Finally, a numerical simulation example demonstrates that the conclusions are feasible and effective. The proposed control method meets the requirements of industrial networked control systems.

Xiaomao Huang

2013-04-01

275

A fault-tolerant attitude control system for a satellite based on fuzzy global sliding mode control algorithm  

Science.gov (United States)

An effective approach for fault diagnosis of an aeroengine based on the integration of wavelet analysis and neural networks is presented. The wavelet transform can accurately localize the characteristics of a signal in the time-frequency domain. In view of the relationship between the wavelet transform and exponent theory, the whole and local exponents obtained from the wavelet transform coefficients are used as features for extracting fault signals, which are input into a radial basis function network for fault pattern recognition. The fault diagnosis model of the aeroengine is established, and the improved Levenberg-Marquardt training algorithm is used to determine the network structure and identify its parameters. By choosing enough samples to train the fault diagnosis network and feeding the information representing the faults into the neural network, the fault pattern can be determined. The robustness of the wavelet neural network for fault diagnosis is discussed. Practical fault diagnosis for aeroengine vibration proves to be accurate and comprehensive.

Liang, Jinjin; Dong, Chaoyang; Wang, Qing

2008-10-01

276

Fault diagnosis and fault-tolerant control and guidance for aerospace vehicles from theory to application  

CERN Document Server

Fault Diagnosis and Fault-Tolerant Control and Guidance for Aerospace demonstrates the attractive potential of recent developments in control for resolving such issues as improved flight performance, self-protection and extended life of structures. Importantly, the text deals with a number of practically significant considerations: tuning, complexity of design, real-time capability, evaluation of worst-case performance, robustness in harsh environments, and extensibility when development or adaptation is required. Coverage of such issues helps to draw the advanced concepts arising from academic research back towards the technological concerns of industry. Initial coverage of basic definitions and ideas and a literature review gives way to a treatment of important electrical flight control system failures: the oscillatory failure case, runaway, and jamming. Advanced fault detection and diagnosis for linear and nonlinear systems are described. Lastly recovery strategies appropriate to remaining actuator/sensor/c...

Zolghadri, Ali; Cieslak, Jerome; Efimov, Denis; Goupil, Philippe

2014-01-01

277

A Reflective Object-Oriented Architecture for Developing Fault-Tolerant Software  

Scientific Electronic Library Online (English)

Full Text Available This paper proposes a reflective object-oriented architecture for developing fault-tolerant software. Reflective object-oriented programming promotes a modular structuring of systems by means of a new dimension of modularization: the separation between base-level objects and meta-level objects. This [...] property allows the creation of metaobjects responsible for managing tasks of application objects located at the base level. In the context of this work, computational reflection is applied to implement various strategies of fault tolerance at the meta-level in a transparent manner for the application programmer, that is, without interfering with the original structure of application objects that require fault tolerance facilities. The use of the proposed architecture has the following advantages: (i) it separates concerns, that is, it separates the concerns related to the application domain from those related to the implementation of fault-tolerant mechanisms; (ii) it promotes code reuse of fault-tolerance mechanisms; (iii) it allows application programmers to use the most adequate fault-tolerance strategy for their implementation; and (iv) it provides a design that is more adaptable, flexible and easier to extend than traditional designs for developing fault-tolerant software. Our reflective architecture is composed of three levels, and is based on the abstraction of object groups.

Buzato, Luiz E.; Rubira, Cecília M. F.; Lisboa, Maria Lúcia B.

1997-11-01

278

Synchrony and Time in Fault-Tolerant Distributed Algorithms  

Science.gov (United States)

In 1985, Fischer, Lynch and Paterson published their celebrated result on the impossibility of solving distributed agreement (consensus) in purely asynchronous distributed systems with crash failures. Synchrony requirements, i.e., constraints on the occurrence of certain events in a distributed system, are hence mandatory for being able to solve interesting distributed computing problems. Timing requirements are the most obvious, though not the only, possibility to express synchrony conditions. We will survey existing partially synchronous distributed computing models and their ability to circumvent impossibility results, and explore the role of time and clocks in designing fault-tolerant distributed algorithms in such models.

Schmid, Ulrich

279

Hybrid fault tolerance techniques to detect transient faults in embedded processors  

CERN Document Server

This book describes fault tolerance techniques based on software and hardware to create hybrid techniques. They are able to reduce overall performance degradation and increase error detection when associated with applications implemented in embedded processors. Coverage begins with an extensive discussion of the current state-of-the-art in fault tolerance techniques. The authors then discuss the best trade-off between software-based and hardware-based techniques and introduce novel hybrid techniques. Proposed techniques increase existing fault detection rates up to 100%, while maintaining low performance overheads in area and application execution time. • Discusses the effects of radiation on modern integrated circuits; • Provides a comprehensive overview of state-of-the art fault tolerance techniques based on software, hardware, and hybrid techniques; • Introduces novel hybrid fault tolerance techniques for reconfigurable FPGAs and ASICs; • Performs fault injection campaigns by simulation, bitstream ...

Azambuja, José Rodrigo; Becker, Jürgen

2014-01-01

280

Two fault tolerant toggle-hook release  

Science.gov (United States)

A coupling device is disclosed which is mechanically two fault tolerant for release. The device comprises a fastener plate and fastener body, each of which is attachable to a different one of a pair of structures to be joined. The fastener plate and body are coupled by an elongate toggle mounted at one end in a socket on the fastener plate for universal pivotal movement thereon. The other end of the toggle is received in an opening in the fastener body and adapted for limited pivotal movement therein. The toggle is adapted to be restrained by three latch hooks arranged in symmetrical equiangular spacing about the axis of the toggle, each hook being mounted on the fastener body for pivotal movement between an unlatching non-contact position with respect to the toggle and a latching position in engagement with a latching surface of the toggle. The device includes releasable lock means for locking each latch hook in its latching position whereby the toggle couples the fastener plate to the fastener body and means for releasing the lock means to unlock each said latch hook from the latch position whereby the unlocking of at least one of the latch hooks from its latching position results in the decoupling of the fastener plate from the fastener body.

Graves, Thomas Joseph (inventor); Brown, Christopher William (inventor)

1991-01-01

281

FAULT-TOLERANT SAFETY CONCEPT FOR HUMAN ROBOT INTERACTION  

Directory of Open Access Journals (Sweden)

Full Text Available An important aspect of pHRI (physical Human Robot Interaction) is to assure the safety of the human participant in the interaction. One of the approaches to assure safety is to plan the paths executed by the robot based on a certain safety criterion. This acts as a constraint on the decision making of intelligent systems. This paper introduces the Fault-Tolerant Distance as a novel safety criterion. It assures an injury-free interaction between the robot and a human even in the case of a malfunction in the robot hardware or software.

Akos Csiszar

2012-11-01

282

Fault Tolerant Neuro-Robust Position Control of DC Motors  

Directory of Open Access Journals (Sweden)

Full Text Available DC motors are widely used in fields such as mechanics, robotics, and aerospace engineering. In this paper, we present a high performance control method for position control of DC motors. A fault-tolerant control model is also addressed and combined with the neuro-robust control approach. It is shown that with the proposed control algorithms, external disturbances and coupled dynamics inherent in the system are effectively compensated using a neural network unit in which no analytical estimation of the upper bound of the reconstruction error and uncertainties is needed. Simulations on various flight conditions also confirm the effectiveness of the proposed methods.

Ran Zhang

2011-10-01

283

Fault Tolerant CII Middleware for Wide Area Monitoring, Control and Protection in Realistic Operational Environments  

Digital Repository Infrastructure Vision for European Research (DRIVER)

Fault tolerance and dependability are of high importance to the information infrastructure that supports and controls the operation of the Critical Infrastructure such as the electrical power grid and telecommunication systems due to their vital role in the proper function of society and economy. This thesis examines GridStat, an established middleware approach for providing fault tolerance and dependability and performs a constructive evaluation of its architectural design. It, then, builds ...

Kakavas, Ioannis

2012-01-01

284

Solar system fault detection  

Science.gov (United States)

A fault detecting apparatus and method are provided for use with an active solar system. The apparatus provides an indication as to whether one or more predetermined faults have occurred in the solar system. The apparatus includes a plurality of sensors, each sensor being used in determining whether a predetermined condition is present. The outputs of the sensors are combined in a pre-established manner in accordance with the kind of predetermined faults to be detected. Indicators communicate with the outputs generated by combining the sensor outputs to give the user of the solar system and the apparatus an indication as to whether a predetermined fault has occurred. Upon detection and indication of any predetermined fault, the user can take appropriate corrective action so that the overall reliability and efficiency of the active solar system are increased.

Farrington, R.B.; Pruett, J.C. Jr.

1984-05-14

285

Wind turbine fault detection and fault tolerant control : An enhanced benchmark challenge  

DEFF Research Database (Denmark)

In this updated edition of a previous wind turbine fault detection and fault tolerant control challenge, we present a more sophisticated wind turbine model and updated fault scenarios to enhance the realism of the challenge and therefore the value of the solutions. This paper describes the challenge model and the requirements for challenge participants. In addition, it motivates many of the faults by citing publications that give field data from wind turbine control tests.

Odgaard, Peter Fogh; Johnson, Kathryn

2013-01-01

286

Fault tolerant task execution through global trajectory planning  

International Nuclear Information System (INIS)

Whether a task can be completed after a failure of one of the degrees-of-freedom of a redundant manipulator depends on the joint angle at which the failure takes place. It is possible to achieve fault tolerance by globally planning a trajectory that avoids unfavourable joint positions before a failure occurs. In this article, we present a trajectory planning algorithm that guarantees fault tolerance while simultaneously satisfying joint limit and obstacle avoidance requirements

287

Fault Tolerant Strategy for Semi-Active Suspensions with LPV Accommodation  

Digital Repository Infrastructure Vision for European Research (DRIVER)

A novel fault tolerant strategy to compensate multiplicative actuator faults (damper oil leakages) in a semi-active suspension system is proposed. The lack of damping force caused by a faulty damper is compensated by the remaining three healthy semi-active dampers. Once a faulty damper is detected and isolated by a Fault Detection and Isolation strategy based on parity-space, an estimator is activated to compute the missing damping force to compensate. In order to ...

Tudon-Martínez, Juan; Varrier, Sébastien; Sename, Olivier; Morales Menendez, Ruben; Martinez Molina, John Jairo; Dugard, Luc

2013-01-01

288

A Remote Characterization System and a fault-tolerant tracking system for subsurface mapping of buried waste sites  

International Nuclear Information System (INIS)

This paper describes two closely related projects that will provide new technology for characterizing hazardous waste burial sites. The first project, a collaborative effort by five of the national laboratories, involves the development and demonstration of a remotely controlled site characterization system. The Remote Characterization System (RCS) includes a unique low-signature survey vehicle, a base station, radio telemetry data links, satellite-based vehicle tracking, stereo vision, and sensors for noninvasive inspection of the surface and subsurface. The second project, conducted by the Idaho National Engineering Laboratory (INEL), involves the development of a position sensing system that can track a survey vehicle or instrument in the field. This system can coordinate updates at a rate of 200/s with an accuracy better than 0.1% of the distance separating the target and the sensor. It can employ acoustic or electromagnetic signals in a wide range of frequencies and can be operated as a passive or active device

289

Reliable Energy Efficient Fault Tolerant Clustering in Wireless Sensor Network  

Directory of Open Access Journals (Sweden)

Full Text Available This paper proposes a Reliable, Energy Efficient, Fault Tolerant (REEFT) clustering algorithm for aggregating sensor measurements in a Wireless Sensor Network (WSN). It is a hierarchical algorithm in which energy efficiency is achieved by constructing static clusters with a reliable cluster head chosen based on distance. The lifetime of the WSN is improved by solving important issues in WSNs: the distribution of clusters, the optimal number of clusters, the number of nodes in a cluster, and the optimal time duration of the clustering cycle. The algorithm also includes a fault tolerance feature to tolerate Cluster Head (CH) failure and improve the packet delivery ratio. The algorithm was tested using simulations and its performance improvements were analyzed.

L. Venkatesan

2014-01-01

290

Chip level simulation of fault tolerant computers  

Science.gov (United States)

Chip level modeling techniques, functional fault simulation, simulation software development, a more efficient, high level version of GSP, and a parallel architecture for functional simulation are discussed.

Armstrong, J. R.

1983-01-01

291

Design of neuro fuzzy fault tolerant control using an adaptive observer  

International Nuclear Information System (INIS)

New methodologies and concepts are developed in control theory to meet the ever-increasing demands of industrial applications. Fault detection and diagnosis of technical processes have become important in the course of progressive automation in the operation of groups of electric drives. When a group of electric drives is under operation, fault tolerant control becomes complicated. For multiple motors in operation, fault detection and diagnosis might prove to be difficult. Estimation of all states and parameters of all drives is necessary to analyze the actuator and sensor faults. To maintain system reliability, detection and isolation of failures should be performed quickly and accurately, and hardware should be properly integrated. A Luenberger full order observer can be used to estimate all the states in the system for the detection of actuator and sensor failures. Due to the insensitivity of the Luenberger observer to system parameter variations, state estimation becomes inaccurate under the varying parameter conditions of the drives. Consequently, the estimation performance deteriorates, making ordinary state observers unsuitable for fault detection. Therefore an adaptive observer, which can estimate the system states and parameters and detect the faults simultaneously, is designed in our paper. For a group of DC drives, there may be parameter variations for some of the drives, and for other drives there may be none, depending on load torque, friction, etc. So, estimation of all states and parameters of all drives is carried out using an adaptive observer. If there is any deviation from the estimated values, it is understood that a fault has occurred; the nature of the fault, whether sensor fault or actuator fault, is determined by a neural fuzzy network, and the fault tolerant control is reconfigured. Experimental results with the neuro fuzzy system using adaptive observer-based fault tolerant control are good, confirming the characteristics of the proposed approach
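
The observer-based detection step described in the abstract above can be illustrated with a much simpler, non-adaptive sketch. The code below is not the authors' adaptive observer or their neuro-fuzzy classifier; it assumes a scalar discrete-time plant x[k+1] = a*x[k] + b*u[k], y[k] = c*x[k] with known parameters, and the names `observer_residuals`, `first_fault_step`, `gain` and `threshold` are hypothetical. It only shows how the residual between the measured and the estimated output can flag a fault.

```python
from typing import List, Optional, Sequence


def observer_residuals(
    u: Sequence[float],
    y: Sequence[float],
    a: float,
    b: float,
    c: float,
    gain: float,
    x_hat0: float = 0.0,
) -> List[float]:
    """Run a scalar Luenberger observer and return the output residuals.

    Assumed plant model (illustrative only):
        x[k+1] = a * x[k] + b * u[k],   y[k] = c * x[k]
    """
    x_hat = x_hat0
    residuals: List[float] = []
    for u_k, y_k in zip(u, y):
        r = y_k - c * x_hat          # residual: measured minus predicted output
        residuals.append(r)
        # Observer update: model prediction corrected by the weighted residual.
        x_hat = a * x_hat + b * u_k + gain * r
    return residuals


def first_fault_step(residuals: Sequence[float], threshold: float) -> Optional[int]:
    """Return the first step whose residual magnitude exceeds the threshold."""
    for k, r in enumerate(residuals):
        if abs(r) > threshold:
            return k
    return None
```

With fixed parameters the residual stays near zero only while the model matches the drive, which is why the abstract argues for an adaptive observer when the drive parameters vary.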

292

Coordinated Fault-Tolerance for High-Performance Computing Final Project Report  

Energy Technology Data Exchange (ETDEWEB)

With the Coordinated Infrastructure for Fault Tolerance Systems (CIFTS, as the original project came to be called) project, our aim has been to understand and tackle the following broad research questions, the answers to which will help the HEC community analyze and shape the direction of research in the field of fault tolerance and resiliency on future high-end leadership systems. Will availability of global fault information, obtained by fault information exchange between the different HEC software on a system, allow individual system software to better detect, diagnose, and adaptively respond to faults? If fault-awareness is raised throughout the system through fault information exchange, is it possible to get all system software working together to provide a more comprehensive end-to-end fault management on the system? What are the missing fault-tolerance features that widely used HEC system software lacks today that would inhibit such software from taking advantage of systemwide global fault information? What are the practical limitations of a systemwide approach for end-to-end fault management based on fault awareness and coordination? What mechanisms, tools, and technologies are needed to bring about fault awareness and coordination of responses on a leadership-class system? What standards, outreach, and community interaction are needed for adoption of the concept of fault awareness and coordination for fault management on future systems? Keeping our overall objectives in mind, the CIFTS team has taken a parallel fourfold approach. Our central goal was to design and implement a light-weight, scalable infrastructure with a simple, standardized interface to allow communication of fault-related information through the system and facilitate coordinated responses. 
This work led to the development of the Fault Tolerance Backplane (FTB) publish-subscribe API specification, together with a reference implementation and several experimental implementations on top of existing publish-subscribe tools. We enhanced the intrinsic fault tolerance capabilities of representative implementations of a variety of key HPC software subsystems and integrated them with the FTB. Targeted software subsystems included MPI communication libraries, checkpoint/restart libraries, resource managers and job schedulers, and system monitoring tools. Leveraging the aforementioned infrastructure, as well as developing and utilizing additional tools, we have examined issues associated with expanded, end-to-end fault response from both system and application viewpoints. From the standpoint of system operations, we have investigated log and root cause analysis, anomaly detection and fault prediction, and generalized notification mechanisms. Our applications work has included libraries for fault-tolerant linear algebra, application frameworks for coupled multiphysics applications, and external frameworks to support the monitoring and response for general applications. Our final goal was to engage the high-end computing community to increase awareness of tools and issues around coordinated end-to-end fault management.

Panda, Dhabaleswar Kumar [The Ohio State University]; Beckman, Pete

2011-07-01

293

A Robust Byzantine Fault-Tolerant Replication Technique for Peer-to-Peer Content Distribution  

Directory of Open Access Journals (Sweden)

Full Text Available Problem statement: In peer-to-peer networks, Byzantine fault tolerance refers to the capability of a system to tolerate Byzantine faults. It can be achieved by replicating the server and by ensuring all server replicas reach an agreement on the input despite Byzantine faulty replicas and clients. Since malicious attacks and software errors can cause faulty nodes to exhibit Byzantine behavior, Byzantine-fault-tolerant algorithms are increasingly important. Approach: In this study, we develop a robust Byzantine Fault-Tolerance Replication (BFTR) technique for peer-to-peer content distribution systems, which includes fault detection and fault recovery. It is based on collaborative monitoring of each node to detect the occurrence of a fault. We previously proposed a QoS-based overlay network architecture (QIRM) involving an intelligent replica placement algorithm to improve the network utilization of the P2P system. Results: By simulation results, we show that the proposed technique involves less overhead and recovery time with increased accuracy. Conclusion/Recommendations: The result obtained is that the BFTR technique is much more efficient than QIRM with respect to packet drop ratio, average end-to-end delay, throughput and overhead.

Ayyasamy Sellappan

2011-01-01

294

Fault tolerance analysis and applications to microwave modules and MMIC's  

Science.gov (United States)

A project whose objective was to provide an overview of built-in-test (BIT) considerations applicable to microwave systems, modules, and MMICs (monolithic microwave integrated circuits) is discussed. Available analytical techniques and software for assessing system failure characteristics were researched, and the resulting investigation provides a review of two techniques which have applicability to microwave systems design. A system-level approach to fault tolerance and redundancy management is presented in its relationship to the subsystem/element design. An overview of the microwave BIT focus from the Air Force Integrated Diagnostics program is presented. The technical reports prepared by the GIMADS team were reviewed for applicability to microwave modules and components. A review of MIMIC (millimeter and microwave integrated circuit) program activities relative to BIT/BITE is given.

Boggan, Garry H.

295

Energy Efficient Fault Tolerant Routing Mechanism for Wireless Sensor Network  

Directory of Open Access Journals (Sweden)

Full Text Available Wireless sensor networks are self-organizing, resource-constrained systems that are often deployed in inhospitable and inaccessible environments in order to gather data about some phenomenon in the outside world. For most sensor network applications, point-to-point reliability is not the main objective (Paradis & Qi, 2007); instead, reliable delivery of the interesting event to the server has to be guaranteed (possibly with a certain probability). The communication in such networks is unpredictable and failure-prone, even more so than in regular wireless ad hoc networks. Hence, it is vital to provide fault tolerant techniques for distributed applications in sensor networks. Several approaches have been proposed in recent studies to address the fault tolerance issue in the application, transport and/or routing layers. In this paper, we propose a slight modification of the conventional routing entry (destination, next hop) by introducing second hop information in the route construction phase, so that it can be used in case of node/link failure (skipping only the failed link). Furthermore, the implementation of the proposed routing technique stabilizes the throughput, reduces the average jitter, provides low control overhead and decreases the energy consumption of the network. As a result, the reliability, availability, energy-efficiency and maintainability of the network are achieved.

Ahmed Roumane

2012-05-01
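
The routing entry extended with second hop information, as described in the abstract above, can be sketched as follows. This is a minimal illustration rather than the paper's protocol: the names `Route`, `choose_forwarder` and `alive` are hypothetical, and the sketch assumes that the second hop is within radio range of the forwarding node, so that a failed next hop can simply be skipped.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Set


@dataclass(frozen=True)
class Route:
    next_hop: str    # first node on the path toward the destination
    second_hop: str  # node that follows next_hop on the same path


def choose_forwarder(
    table: Dict[str, Route],
    destination: str,
    alive: Set[str],
) -> Optional[str]:
    """Pick the node to forward to, skipping a failed next hop if possible.

    `alive` is the set of neighbours currently believed reachable
    (how this is learned is outside the scope of the sketch).
    """
    route = table.get(destination)
    if route is None:
        return None                  # no route known for this destination
    if route.next_hop in alive:
        return route.next_hop        # normal case: use the primary next hop
    if route.second_hop in alive:
        return route.second_hop      # bypass only the failed node/link
    return None                      # both hops unavailable: rediscovery needed
```

Storing one extra identifier per entry avoids a full route rediscovery for a single node or link failure, which is the trade-off the abstract relies on for lower control overhead.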

296

Superconducting generator field winding design for high fault tolerance  

International Nuclear Information System (INIS)

Development of rotating electrical machines with superconducting field windings is proceeding at numerous sites worldwide. The primary emphasis is on large turbine generators for application to power systems. The EPRI/Westinghouse 300 MVA superconducting generator program is directed towards demonstration of the technology in an actual utility environment for a long period of time. The concept of stability, in the case of superconducting generators, includes the traditional concepts of stability with respect to the electromechanical interactions and oscillations of the machine with the power system as well as the thermohydraulic stability of the cryogenic rotor and its helium supply system. Power system disturbances, such as faults, produce flow and pressure transients in the rotor cooling system. Depending upon the severity and time history of the disturbances, these transients may occasion normalization of the superconductor and destabilize the generator output through loss of field excitation. This paper addresses the question of designing the superconducting winding and its cryogenic cooling system for stability in the presence of large disturbances, a capability which has been called high fault tolerance

297

Active and Passive Fault-Tolerant LPV Control of Wind Turbines  

DEFF Research Database (Denmark)

This paper addresses the design and comparison of active and passive fault-tolerant linear parameter-varying (LPV) controllers for wind turbines. The considered wind turbine plant model is characterized by parameter variations along the nominal operating trajectory and includes a model of an incipient fault in the pitch system. We propose the design of an active fault-tolerant controller (AFTC) based on an existing LPV controller design method and extend this method to apply for the design of a passive fault-tolerant controller (PFTC). Both controllers are based on output feedback and are scheduled on the varying parameter to manage the parameter-varying nature of the model. The PFTC only relies on measured system variables and an estimated wind speed, while the AFTC also relies on information from a fault diagnosis system. Consequently, the optimization problem involved in designing the PFTC is more difficult to solve, as it involves solving bilinear matrix inequalities (BMIs) instead of linear matrix inequalities (LMIs). Simulation results show the performance of the active fault-tolerant control system to be slightly superior to that of the passive fault-tolerant control system.

Sloth, Christoffer; Esbensen, Thomas

2010-01-01

298

A Fuzzy-Based Strategy to Improve Control Reconfiguration Performance of a Sensor Fault-Tolerant Induction Motor Propulsion  

Digital Repository Infrastructure Vision for European Research (DRIVER)

This short paper deals with the transition performance improvement of a sensor fault-tolerant controller devoted to automotive applications. Indeed, improvements are brought over a previously developed technique that exhibits abrupt changes in the torque if a sensor fault is detected and after a transition from one control technique to another [1]. The Fault-Tolerant Control (FTC) system firstly concerns the sliding mode control technique since better performances are obtained with an encode...

Tabbache, Bekheira; Benbouzid, Mohamed; Kheloui, Abdelaziz; Bourgeot, Jean-Matthieu

2011-01-01

299

Buffered coscheduling for parallel programming and enhanced fault tolerance  

Science.gov (United States)

A computer implemented method schedules processor jobs on a network of parallel machine processors or distributed system processors. Control information communications generated by each process performed by each processor during a defined time interval are accumulated in buffers, where adjacent time intervals are separated by strobe intervals for a global exchange of control information. A global exchange of the control information communications at the end of each defined time interval is performed during an intervening strobe interval so that each processor is informed by all of the other processors of the number of incoming jobs to be received by each processor in a subsequent time interval. The buffered coscheduling method of this invention also enhances the fault tolerance of a network of parallel machine processors or distributed system processors.

Petrini, Fabrizio (Los Alamos, NM); Feng, Wu-chun (Los Alamos, NM)

2006-01-31
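
The buffer-then-strobe pattern in the abstract above can be sketched as a toy simulation. This is not the patented method: the `Processor` class and the `post_send` and `strobe` names are hypothetical, the "network" is a plain Python list, and only the counting of incoming messages per processor is modelled.

```python
from collections import Counter
from typing import Dict, List


class Processor:
    """Buffers control messages during a time interval instead of sending them."""

    def __init__(self, rank: int) -> None:
        self.rank = rank
        self.outgoing: List[int] = []  # destination ranks queued this interval

    def post_send(self, dest: int) -> None:
        # Record the intent to send; nothing leaves the node until the strobe.
        self.outgoing.append(dest)


def strobe(processors: List[Processor]) -> Dict[int, int]:
    """Global exchange: report how many messages each processor should expect."""
    incoming: Counter = Counter()
    for proc in processors:
        incoming.update(proc.outgoing)  # count one message per queued destination
        proc.outgoing = []              # buffers are drained at the strobe
    return {proc.rank: incoming[proc.rank] for proc in processors}
```

Because every processor learns its expected incoming load at the strobe, the next interval can be scheduled with global knowledge, and a processor that fails to take part in the exchange becomes visible at a well-defined point in time.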

300

Fault tolerant vector control of induction motor drive  

Science.gov (United States)

For electric drives used in technical facilities of hazardous industries, such as nuclear, military, and chemical, increasing resiliency and survivability is an urgent task. The construction principle of a fault-tolerant vector control system for an asynchronous electric drive is presented. The recovery of operability of a three-phase induction motor drive in emergency mode using a two-phase vector control system is shown. The process of forming a simulation model of the asynchronous electric drive with unbalance in emergency mode is described. The modeling uses a coordinate transformation that provides for emergency operation of the unbalanced drive. The results of modeling the transient on loss of a motor stator phase are given. During a phase failure, the induction motor cannot maintain a circular rotating field in the air gap of the motor and ensure the restoration of its operability at rated torque and speed.

Odnokopylov, G.; Bragin, A.

2014-10-01

301

Fault tolerant strategies for automated operation of nuclear reactors  

International Nuclear Information System (INIS)

This paper introduces an automatic control system incorporating a number of verification, validation, and command generation tasks within a fault-tolerant architecture. The integrated system utilizes recent methods of artificial intelligence such as neural networks and fuzzy logic control. Furthermore, advanced signal processing and nonlinear control methods are also included in the design. The primary goal is to create an on-line capability to validate signals, analyze plant performance, and verify the consistency of commands before control decisions are finalized. The application of this approach to the automated startup of the Experimental Breeder Reactor-II (EBR-II) is performed using a validated nonlinear model. The simulation results show that the advanced concepts have the potential to improve plant availability and safety

302

A benchmark for fault tolerant flight control evaluation  

Science.gov (United States)

A large transport aircraft simulation benchmark (REconfigurable COntrol for Vehicle Emergency Return - RECOVER) has been developed within the GARTEUR (Group for Aeronautical Research and Technology in Europe) Flight Mechanics Action Group 16 (FM-AG(16)) on Fault Tolerant Control (2004-2008) for the integrated evaluation of fault detection and identification (FDI) and reconfigurable flight control strategies. The benchmark includes a suitable set of assessment criteria and failure cases, based on reconstructed accident scenarios, to assess the potential of new adaptive control strategies to improve aircraft survivability. The application of reconstruction and modeling techniques, based on accident flight data, has resulted in high-fidelity nonlinear aircraft and fault models to evaluate new Fault Tolerant Flight Control (FTFC) concepts and their real-time performance to accommodate in-flight failures.

Smaili, H.; Breeman, J.; Lombaerts, T.; Stroosma, O.

2013-12-01

303

Fault tolerance in space-based digital signal processing and switching systems: Protecting up-link processing resources, demultiplexer, demodulator, and decoder  

Science.gov (United States)

Fault tolerance features in the first three major subsystems appearing in the next generation of communications satellites are described. These satellites will contain extensive but efficient high-speed processing and switching capabilities to support the low signal strengths associated with very small aperture terminals. The terminals' numerous data channels are combined through frequency division multiplexing (FDM) on the up-links and are protected individually by forward error-correcting (FEC) binary convolutional codes. The front-end processing resources, demultiplexer, demodulators, and FEC decoders extract all data channels which are then switched individually, multiplexed, and remodulated before retransmission to earth terminals through narrow beam spot antennas. Algorithm based fault tolerance (ABFT) techniques, which relate real number parity values with data flows and operations, are used to protect the data processing operations. The additional checking features utilize resources that can be substituted for normal processing elements when resource reconfiguration is required to replace a failed unit.

Redinbo, Robert

1994-01-01
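
The idea of algorithm based fault tolerance mentioned in the abstract above, relating real number parity values to data flows and operations, can be illustrated with the classic checksum encoding of a matrix-vector product. This sketch is not the satellite processing scheme of the report; the function names `encode`, `multiply` and `check` and the tolerance `tol` are assumptions chosen for the example.

```python
from typing import List, Sequence, Tuple


def encode(matrix: Sequence[Sequence[float]]) -> List[List[float]]:
    """Append a checksum row holding the column sums of the matrix."""
    checksum_row = [sum(col) for col in zip(*matrix)]
    return [list(row) for row in matrix] + [checksum_row]


def multiply(matrix: Sequence[Sequence[float]], vector: Sequence[float]) -> List[float]:
    """Plain matrix-vector product; the operation being protected."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]


def check(result: Sequence[float], tol: float = 1e-9) -> Tuple[List[float], bool]:
    """Split an encoded result into data and a parity verdict.

    For a fault-free linear operation, the last element (the parity value)
    equals the sum of the data elements up to rounding error.
    """
    *data, parity = result
    return data, abs(sum(data) - parity) <= tol
```

For example, `check(multiply(encode([[1, 2], [3, 4]]), [5, 6]))` yields the data `[17, 39]` with a passing parity verdict, while corrupting any single output element makes the verdict fail. The tolerance is needed because real-number parity is only exact up to floating-point rounding.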

304

A Fault Tolerant Resource Allocation Architecture for Mobile Grid  

Directory of Open Access Journals (Sweden)

Full Text Available Problem statement: In order to achieve a high level of reliability and availability, the grid infrastructure should be fault tolerant. Since the failure of resources affects job execution fatally, a fault tolerance service is essential to satisfy QoS requirements in grid computing with respect to mobile nodes. Approach: We propose a fault tolerant technique for improving reliability in a mobile grid environment considering node mobility. The cluster head and monitoring agent were designed to address both resource and network failures and to provide recovery techniques for overcoming the faults. Results: The proposed model achieves identifiable performance gains when compared to the previous model (HRAA). By simulation results, we analyze the node and link failures on parameters such as delivery ratio, throughput and delay against the rate of success. Conclusion: The proposed fault tolerant approach checks for availability of the nodes with the least workload for transferring the executed job to the cluster head, providing an alternate path in case of failure and thereby enhancing the reliability of the grid environment.

P. T. Vanathi

2012-01-01

305

An upper bound on quantum fault tolerant thresholds  

CERN Document Server

In this paper we calculate upper bounds on fault tolerance without restrictions on the overhead involved. Optimally adaptive recovery operators are used, and the Shannon entropy is used to estimate the thresholds. By allowing for unrealistically high levels of overhead, we find a quantum fault tolerant threshold of 6.88% for the depolarizing noise used by Knill, which compares to "above 3%" evidenced by Knill. We conjecture that the optimal threshold is 6.90%. We also perform threshold calculations for types of noise other than that discussed by Knill.

Fern, Jesse

2008-01-01

306

Nonplanar VLSI arrays with high fault-tolerance capabilities  

Energy Technology Data Exchange (ETDEWEB)

This paper proposes and analyzes some new VLSI architectures for improved fault tolerance. The architectures include structures with two planar layers of processing elements as well as extended cubic designs. The analyses for arrays with various redundancy levels show remarkable improvement in both array yield and processor use over those exhibited by conventional 2-D structures. The improvement can be attributed to the benefits of the third dimension, which increases the flexibility in spares allocation. The architectures can readily substitute for arrays based on mesh or four nearest-neighbor interconnections. From the fault-tolerance viewpoint, the cubic structures offer no appreciable performance improvement over the simpler 2-layer structures.

Latifi, S.; El-Amawy, A. (Louisiana State Univ., Baton Rouge, LA (USA))

1989-04-01

307

A Novel Nanometric Fault Tolerant Reversible Subtractor Circuit  

Directory of Open Access Journals (Sweden)

Reversibility plays an important role when energy efficient computations are considered. Reversible logic circuits have received significant attention in quantum computing, low power CMOS design, optical information processing and nanotechnology in recent years. This study proposes a new fault tolerant reversible half-subtractor and a new fault tolerant reversible full-subtractor circuit with nanometric scales. This paper also demonstrates how the well-known and important PERES gate and TR gate can be synthesized from parity preserving reversible gates. All the designs have nanometric scales.

Mozhgan Shiri

2012-11-01

308

A New and Efficient Algorithm-Based Fault Tolerance Scheme for A Million Way Parallelism  

Digital Repository Infrastructure Vision for European Research (DRIVER)

Fault tolerance overhead of high performance computing (HPC) applications is becoming critical to the efficient utilization of HPC systems at large scale. HPC applications typically tolerate fail-stop failures by checkpointing. Another promising method is in the algorithm level, called algorithmic recovery. These two methods can achieve high efficiency when the system scale is not very large, but will both lose their effectiveness when systems approach the scale of Exaflops,...

Yao, Erlin; Chen, Mingyu; Wang, Rui; Zhang, Wenli; Tan, Guangming

2011-01-01

309

Improvement of Matrix Converter Drive Reliability by Online Fault Detection and a Fault-Tolerant Switching Strategy.  

DEFF Research Database (Denmark)

The matrix converter system is becoming a very promising candidate to replace the conventional two-stage ac/dc/ac converter, but system reliability remains an open issue. The most common reliability problem is that a bidirectional switch has an open-switch fault during operation. In this paper, a matrix converter driving a speed-controlled permanent-magnet synchronous motor is examined under a single open-switch fault. First, a new fault-detection method is proposed using only the motor currents. Second, a novel fault-tolerant switching strategy is presented. By treating the matrix converter as a two-stage rectifier/inverter, existing modulation techniques for the inverter stage can be reused, whereas the rectifier stage is modified by control to counteract the fault. Moreover, the proposed techniques require no additional hardware devices or circuit modifications to the matrix converter. Experimental results show that the proposed method can maintain the motor speed with a maximum ripple of 2%, a fivefold improvement over the uncompensated system. The proposed method therefore offers a very economical and effective solution for the matrix converter fault tolerance problem.

Nguyen-Duy, Khiem

2011-01-01

310

Fault tolerant LPV control of the GTM UAV with dynamic control allocation  

Digital Repository Infrastructure Vision for European Research (DRIVER)

The aim of the paper is to present a dynamic control allocation architecture for the design and development of reconfigurable and fault-tolerant control systems in aerial vehicles. The baseline control system is designed for the nominal dynamics of the aircraft, while faults and actuator saturation limits are handled by the dynamic control allocation scheme. Coordination of these components is provided by a supervisor which re-allocates control authority based on health information, flight en...

Vanek, Bálint; Péni, Tamás; Szabó, Zoltán; Bokor, József

2014-01-01

311

Fault-tolerant Sensor Fusion for Marine Navigation  

DEFF Research Database (Denmark)

Reliability of navigation data is critical for steering and manoeuvring control, particularly at high speed or in critical phases of a mission. Should faults occur, faulty instruments need to be autonomously isolated and faulty information discarded. This paper designs a navigation solution in which essential navigation information is provided even with multiple faults in instrumentation. The paper proposes a provably correct implementation through auto-generated state-event logics in a supervisory part of the algorithms. Test results from naval vessels document the performance and show events where the fault-tolerant sensor fusion provided uninterrupted navigation data despite temporary instrument defects.

Blanke, Mogens

2006-01-01

312

Energy Bounds for Fault-Tolerant Nanoscale Designs  

CERN Document Server

The problem of determining lower bounds for the energy cost of a given nanoscale design is addressed via a complexity theory-based approach. This paper provides a theoretical framework that is able to assess the trade-offs existing in nanoscale designs between the amount of redundancy needed for a given level of resilience to errors and the associated energy cost. Circuit size, logic depth and error resilience are analyzed and brought together in a theoretical framework that can be seamlessly integrated with automated synthesis tools and can guide the design process of nanoscale systems comprised of failure prone devices. The impact of redundancy addition on the switching energy and its relationship with leakage energy is modeled in detail. Results show that 99% error resilience is possible for fault-tolerant designs, but at the expense of at least 40% more energy if individual gates fail independently with probability of 1%.

Marculescu, Diana

2011-01-01

313

Analysis of GPS Abnormal Conditions within Fault Tolerant Control Laws  

Science.gov (United States)

The Global Positioning System (GPS) is a critical element for the functionality of autonomous flying vehicles. GPS operation under normal and abnormal conditions directly impacts the trajectory tracking performance of autonomous Unmanned Aerial Vehicle (UAV) controllers. The effects of GPS parameter variation must be well understood, and user-friendly computational tools must be developed to facilitate the design and evaluation of fault tolerant control laws. This thesis presents the development of a simplified GPS error model in Matlab/Simulink and its use in a sensitivity analysis of the effect of GPS parameters, under normal and abnormal system operation, on different UAV trajectory tracking controllers. The model statistically generates position and velocity errors, simulates the effect of GPS satellite configuration on the position and velocity measurement accuracy, and applies a set of failures to the GPS readings. The model and its graphical user interface were integrated within the WVU UAV simulation environment as a masked Simulink block. The effects of the following GPS parameters on the controllers' trajectory tracking performance were investigated within and outside normal operation ranges: time delay, update rate, error standard deviation, bias, and major position and velocity failures. Several sets of control laws with fixed and adaptive parameters and of different levels of complexity were used in this investigation. A complex performance index formulated in terms of tracking errors and control activity was used for control law performance evaluation. The various metrics within the performance index were combined using fixed and variable weights depending on the local characteristics of the commanded trajectory. This study has revealed that GPS error parameters have a significant impact on control law performance. The proposed GPS model has proved to be a valuable, flexible tool for testing and evaluation of the fault tolerant capabilities of autonomous flight control laws.

Al-Sinbol, Gahssan

314

Ethernet Implementation of Fault Tolerant Train Network for Entertainment and Mixed Control Traffic  

Directory of Open Access Journals (Sweden)

This paper studies the integration of the control system and entertainment on board train wagons. Both the control and entertainment loads are implemented on top of Gigabit Ethernet, each with a dedicated controller/server. The control load has mixed sampling periods. It is proven that this system can tolerate the failure of one controller in one wagon. In a two-wagon scenario, fault tolerance at the controller level is studied, and simulation results show that the system can tolerate the failure of 3 controllers. The system is successful in meeting the packet end-to-end delay with zero packet loss in all OPNET simulated scenarios. The maximum permissible entertainment load is determined for the fault tolerant scenarios.

Tarek K. Refaat

2013-01-01

315

Tolerance towards sensor faults: An application to a flexible arm manipulator  

Directory of Open Access Journals (Sweden)

As more engineering operations become automatic, the need for robustness towards faults increases. Hence, a fault tolerant control (FTC) scheme is a valuable asset. This paper presents a robust sensor fault FTC scheme implemented on a flexible arm manipulator, which has many applications in automation. Sensor faults affect the system's performance in the closed loop when the faulty sensor readings are used to generate the control input. In this paper, the non-faulty sensors are used to reconstruct the faults on the potentially faulty sensors. The reconstruction is subtracted from the faulty sensor outputs to form a compensated `virtual sensor', and this signal (instead of the normally used faulty sensor output) is then used to generate the control input. A design method is also presented in which the FTC scheme is made insensitive to any system uncertainties. Two fault conditions are tested: total failure and incipient faults. The robustness of the scheme is then tested by implementing the flexible joint's FTC scheme on a flexible link, which has different parameters. Excellent results have been obtained for both cases (joint and link); the FTC scheme made the system performance almost identical to the fault-free scenario, whilst providing an indication that a fault is present, even for simultaneous faults.

Chee Pin Tan

2008-11-01

316

Multi-version software reliability through fault-avoidance and fault-tolerance  

Science.gov (United States)

A number of experimental and theoretical issues associated with the practical use of multi-version software to provide run-time tolerance to software faults were investigated. A specialized tool was developed and evaluated for measuring testing coverage for a variety of metrics. The tool was used to collect information on the relationships between software faults and coverage provided by the testing process as measured by different metrics (including data flow metrics). Considerable correlation was found between coverage provided by some higher metrics and the elimination of faults in the code. Work on back-to-back testing was continued as an efficient mechanism for removal of uncorrelated faults and common-cause faults of variable span. Work was also continued on software reliability estimation methods based on non-random sampling, and on the relationship between software reliability and code coverage provided through testing. New fault tolerance models were formulated. Simulation studies of the Acceptance Voting and Multi-stage Voting algorithms were finished, and it was found that these two schemes for software fault tolerance are superior in many respects to some commonly used schemes. Particularly encouraging are the safety properties of the Acceptance testing scheme.

Vouk, Mladen A.; Mcallister, David F.

1989-01-01

317

Fault-tolerant quantum computing with color codes  

CERN Document Server

We present and analyze protocols for fault-tolerant quantum computing using color codes. We present circuit-level schemes for extracting the error syndrome of these codes fault-tolerantly. We further present an integer-program-based decoding algorithm for identifying the most likely error given the syndrome. We simulated our syndrome extraction and decoding algorithms against three physically-motivated noise models using Monte Carlo methods, and used the simulations to estimate the corresponding accuracy thresholds for fault-tolerant quantum error correction. We also used a self-avoiding walk analysis to lower-bound the accuracy threshold for two of these noise models. We present and analyze two architectures for fault-tolerantly computing with these codes: one in which 2D arrays of qubits are stacked atop each other and one in a single 2D substrate. Our analysis demonstrates that color codes perform slightly better than Kitaev's surface codes when circuit details are ignored. When these details are considered, w...

Landahl, Andrew J; Rice, Patrick R

2011-01-01

318

Fault Tolerant Congestion based Algorithms in OBS Network  

Directory of Open Access Journals (Sweden)

In Optical Burst Switched (OBS) networks, each light path carries a huge amount of traffic, so path failures may damage the user application. Hence fault tolerance becomes an important issue in these networks. Blocking probability is a key index of quality of service in an OBS network. The Erlang formula has been used extensively in the traffic engineering of optical communication to calculate the blocking probability. The paper revisits burst contention resolution problems in OBS networks. When the network is overloaded, no contention resolution scheme can effectively avoid collisions and the resulting blocking. It is important first to decide on a good routing algorithm and then to choose a wavelength assignment scheme. In this paper we develop two algorithms, the Fault Tolerant Optimized Blocking Algorithm (FTOBA) and the Fault Tolerant Least Congestion Algorithm (FTLCA), and compare their performance on the basis of blocking probability. These algorithms are based on the congestion on paths in the OBS network, and the simulation results show that reliable and fault tolerant routing algorithms reduce the blocking probability.

Hardeep Singh, Dr. Jai Prakash, Dinesh Arora & Dr. Amit Wason

2011-12-01
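
The abstract above refers to the Erlang formula for blocking probability. A minimal sketch of the standard Erlang B formula follows, assuming a loss system with a fixed number of channels (for instance the wavelengths on one link, which is an assumption about how it would be applied here) and an offered load measured in erlangs. The function name and the example values are hypothetical; the record does not give the authors' FTOBA or FTLCA algorithms, so they are not reproduced.

```python
def erlang_b(offered_load, channels):
    """Blocking probability of an Erlang B loss system.

    Uses the numerically stable recursion
        B(0) = 1,  B(k) = a * B(k-1) / (k + a * B(k-1)),
    where a is the offered load in erlangs and k counts channels.
    """
    if offered_load < 0 or channels < 0:
        raise ValueError("offered_load and channels must be non-negative")
    blocking = 1.0
    for k in range(1, channels + 1):
        blocking = offered_load * blocking / (k + offered_load * blocking)
    return blocking


# Hypothetical example: 2 erlangs offered to a link with 2 channels.
example = erlang_b(2.0, 2)
```

The recursion avoids the large factorials and powers that appear in the closed-form expression, which matters when the channel count is high.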

319

Fault tolerance and reliability in integrated ship control : the ATOMOS concept  

DEFF Research Database (Denmark)

Various strategies for achieving fault tolerance in large scale control systems are discussed. The positive and negative impacts of distribution through network communication are presented. The ATOMOS framework for standardized reliable marine automation is presented along with the corresponding reliability issues. A generic framework for simulation of network traffic under fault conditions is suggested and the first practical experiences from a prototype implementation are reported.

Nielsen, Jens Frederik Dalsgaard; Izadi-Zamanabadi, Roozbeh

2002-01-01

320

Reversible Logic Synthesis of Fault Tolerant Carry Skip BCD Adder  

CERN Document Server

Reversible logic is emerging as an important research area with applications in diverse fields such as low power CMOS design, digital signal processing, cryptography, quantum computing and optical information processing. This paper presents a new 4*4 parity preserving reversible logic gate, IG. The proposed parity preserving reversible gate can be used to synthesize any arbitrary Boolean function. It makes any fault that affects no more than a single signal readily detectable at the circuit's primary outputs. It is shown that a fault tolerant reversible full adder circuit can be realized using only two IGs. The proposed fault tolerant full adder (FTFA) is used as the fundamental building block for designing other arithmetic logic circuits. It is also demonstrated that the proposed design has lower hardware complexity and is more efficient in terms of gate count, garbage outputs and constant inputs than its existing counterparts.

Islam, Md Saiful; 10.3329/jbas.v32i2.2431

2010-01-01
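
The abstract above relies on parity preservation to make single-signal faults detectable at the outputs. The record does not give the truth table of the IG gate, so the sketch below swaps in the well-known 3*3 Fredkin (controlled-swap) gate as a stand-in to illustrate the property. It checks exhaustively that the gate is reversible, that input and output parity agree, and that flipping any single output bit breaks the parity match. All function names are hypothetical.

```python
from itertools import product


def fredkin(a, b, c):
    """Controlled swap: if a is 1, swap b and c; otherwise pass them through."""
    return (a, c, b) if a else (a, b, c)


def parity(bits):
    """Even/odd parity of a tuple of bits (0 for even, 1 for odd)."""
    return sum(bits) % 2


inputs = list(product((0, 1), repeat=3))
outputs = [fredkin(*bits) for bits in inputs]

# Reversible: the mapping from inputs to outputs is one-to-one.
is_reversible = len(set(outputs)) == len(inputs)

# Parity preserving: input parity equals output parity for every input.
is_parity_preserving = all(parity(i) == parity(o) for i, o in zip(inputs, outputs))


def single_fault_detected(bits, position):
    """Flip one output bit and check that the parity comparison notices it."""
    out = list(fredkin(*bits))
    out[position] ^= 1
    return parity(out) != parity(bits)


all_single_faults_detected = all(
    single_fault_detected(bits, pos) for bits in inputs for pos in range(3)
)
```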

 
 
 
 
321

Fault-tolerant scheme of holonomic quantum computation on stabilizer codes with robustness to low-weight thermal noise  

Science.gov (United States)

We show an equivalence relation between fault-tolerant circuits for a stabilizer code and fault-tolerant adiabatic processes for holonomic quantum computation (HQC), in the case where quantum information is encoded in the degenerate ground space of the system Hamiltonian. By this equivalence, we can systematically construct a fault-tolerant HQC scheme, which can geometrically implement a universal set of encoded quantum gates by adiabatically deforming the system Hamiltonian. During this process, quantum information is protected from low-weight thermal excitations by an energy gap that does not change with the problem size.

Zheng, Yi-Cong; Brun, Todd A.

2014-03-01

322

Algorithm-based fault-tolerant array architecture for Fermat number transform  

Science.gov (United States)

For many real-time and scientific applications, it is desirable to perform signal and image processing algorithms by means of special hardware with very high speeds. With the advent of VLSI technology, large collections of processing elements, which cooperate with each other to achieve high-speed computation, have become economically feasible. In such systems, some level of fault tolerance must be obtained to ensure the validity of the results. Fermat number transforms (FNT's) are attractive for the implementation of digital convolution because the computations are carried out in modular arithmetic which involves no round-off error. In this paper we present a fault tolerant linear array design for FNT by adopting the weighted checksum approach. The results show that the approach is ideally suited to the FNT since it offers fault tolerance, with very low cost, free from round-off error and overflow problems.

Tahir, Jamel M.; Dlay, Satnam S.; Gorgui-Naguib, Raouf N.; Hinton, Oliver R.

1994-10-01
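
The abstract above notes that Fermat number transforms compute convolutions in modular arithmetic with no round-off error. A minimal sketch of that point follows, assuming the small Fermat prime 257 as modulus, a transform length of 8 and the root 4 (which has order 8 modulo 257). These parameters and the function names are illustrative choices, and the paper's weighted checksum scheme is not reproduced because the record does not specify it. The sketch needs Python 3.8 or later for modular inverses via `pow`.

```python
P = 257      # Fermat number F_3 = 2**(2**3) + 1, which is prime
N = 8        # transform length
ALPHA = 4    # 4 has multiplicative order 8 modulo 257


def fnt(seq, root=ALPHA):
    """Direct O(N^2) Fermat number transform of a length-N integer sequence."""
    return [
        sum(x * pow(root, n * k, P) for n, x in enumerate(seq)) % P
        for k in range(N)
    ]


def inverse_fnt(spec):
    """Inverse transform: use the inverse root and scale by N^-1 modulo P."""
    inv_n = pow(N, -1, P)
    inv_root = pow(ALPHA, -1, P)
    return [(inv_n * v) % P for v in fnt(spec, inv_root)]


def cyclic_convolution(a, b):
    """Cyclic convolution modulo P via pointwise products in the transform domain."""
    fa, fb = fnt(a), fnt(b)
    return inverse_fnt([(u * v) % P for u, v in zip(fa, fb)])


a = [1, 2, 3, 0, 0, 0, 0, 0]
b = [4, 5, 0, 0, 0, 0, 0, 0]
result = cyclic_convolution(a, b)
```

The result equals the ordinary integer convolution only while every true output value stays below the modulus; larger data would need a larger Fermat number.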

323

An Active Fault-Tolerant Control Method of Unmanned Underwater Vehicles with Continuous and Uncertain Faults  

Directory of Open Access Journals (Sweden)

This paper introduces a novel thruster fault diagnosis and accommodation system for open-frame underwater vehicles with abrupt faults. The proposed system consists of two subsystems: a fault diagnosis subsystem and a fault accommodation subsystem. In the fault diagnosis subsystem, an ICMAC (Improved Credit Assignment Cerebellar Model Articulation Controllers) neural network is used to perform on-line fault identification and to compute the weighting matrix. The fault accommodation subsystem uses a control algorithm based on the weighted pseudo-inverse to solve the control allocation problem. To illustrate the effectiveness of the proposed method, a simulation example under multiple uncertain abrupt faults is given in the paper.

Yongsheng Yang

2008-11-01
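
The abstract above solves control allocation with a weighted pseudo-inverse whose weighting matrix reflects the identified faults. As a minimal sketch, assume a single force axis, a diagonal weighting matrix and thrusters that each contribute linearly to the demanded force. The general solution u = W^-1 B^T (B W^-1 B^T)^-1 tau then reduces to the scalar form below. The function name, the thruster effectiveness values and the weights are hypothetical, and the ICMAC fault identification step is not modelled.

```python
def allocate(effectiveness, weights, demand):
    """Weighted pseudo-inverse allocation for one force axis.

    Minimises sum(w_i * u_i**2) subject to sum(b_i * u_i) == demand,
    where b_i is the effectiveness of thruster i and w_i > 0 its weight.
    A large weight penalises a thruster, as one might for a faulty unit.
    """
    gains = [b / w for b, w in zip(effectiveness, weights)]
    denom = sum(b * g for b, g in zip(effectiveness, gains))
    if denom == 0:
        raise ValueError("no thruster can produce force on this axis")
    return [g * demand / denom for g in gains]


# Healthy case: two identical thrusters share the demand equally.
healthy = allocate([1.0, 1.0], [1.0, 1.0], 2.0)

# Hypothetical fault on thruster 2: a very large weight shifts effort to thruster 1.
degraded = allocate([1.0, 1.0], [1.0, 1e6], 2.0)
```

In both cases the commanded thrusts still sum to the demanded force; only the distribution between thrusters changes.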

324

Refinement for fault-tolerance: An aircraft hand-off protocol  

Science.gov (United States)

Part of the Advanced Automation System (AAS) for air-traffic control is a protocol to permit flight hand-off from one air-traffic controller to another. The protocol must be fault-tolerant and, therefore, is subtle -- an ideal candidate for the application of formal methods. This paper describes a formal method for deriving fault-tolerant protocols that is based on refinement and proof outlines. The AAS hand-off protocol was actually derived using this method; that derivation is given.

Marzullo, Keith; Schneider, Fred B.; Dehn, Jon

1994-01-01

325

Fault Tolerant Control of Wind Turbines : A benchmark model  

DEFF Research Database (Denmark)

This paper presents a test benchmark model for the evaluation of fault detection and accommodation schemes. This benchmark model deals with the wind turbine on a system level, and it includes sensor, actuator, and system faults, namely faults in the pitch system, the drive train, the generator, and the converter system. Since it is a system-level model, converter and pitch system models are simplified because these are controlled by internal controllers working at higher frequencies than the system model. The model represents a three-bladed pitch-controlled variable-speed wind turbine with a nominal power of 4.8 MW. The fault detection and isolation (FDI) problem was addressed by several teams, and five of the solutions are compared in the second part of this paper. This comparison relies on additional test data in which the faults occur in different operating conditions than in the test data used for the FDI design.

Odgaard, Peter Fogh; Stoustrup, Jakob

2013-01-01

326

Fault Tolerant Wind Farm Control : a Benchmark Model  

DEFF Research Database (Denmark)

This paper presents a test benchmark model for the evaluation of fault detection and accommodation schemes. This benchmark model deals with the wind turbine on a system level, and it includes sensor, actuator, and system faults, namely faults in the pitch system, the drive train, the generator, and the converter system. Since it is a system-level model, converter and pitch system models are simplified because these are controlled by internal controllers working at higher frequencies than the system model. The model represents a three-bladed pitch-controlled variable-speed wind turbine with a nominal power of 4.8 MW. The fault detection and isolation (FDI) problem was addressed by several teams, and five of the solutions are compared in the second part of this paper. This comparison relies on additional test data in which the faults occur in different operating conditions than in the test data used for the FDI design.

Odgaard, Peter Fogh; Stoustrup, Jakob

2013-01-01

327

Passive fault tolerant control of a double inverted pendulum - a case study  

DEFF Research Database (Denmark)

A passive fault tolerant control scheme is suggested, in which a nominal controller is augmented with an additional block, which guarantees stability and performance after the occurrence of a fault. The method is based on the YJBK parameterization, which requires the nominal controller to be implemented in observer based form. The proposed method is applied to a double inverted pendulum system, for which an H_inf controller has been designed and verified in a lab setup. In this case study, the fault is a degradation of the tacho loop.

Niemann, Hans Henrik

2005-01-01

328

Passive Fault tolerant Control of an Inverted Double Pendulum : A Case Study Example  

DEFF Research Database (Denmark)

A passive fault tolerant control scheme is suggested, in which a nominal controller is augmented with an additional block, which guarantees stability and performance after the occurrence of a fault. The method is based on the Youla parameterization, which requires the nominal controller to be implemented in the observer based form. The proposed method is applied to a double inverted pendulum system, for which an H∞ controller has been designed and verified in a lab setup. In this case study, the fault is a degradation of the tacho loop.

Niemann, H.; Stoustrup, Jakob

2003-01-01

329

Making classical ground-state spin computing fault-tolerant  

Science.gov (United States)

We examine a model of classical deterministic computing in which the ground state of the classical system is a spatial history of the computation. This model is relevant to quantum dot cellular automata as well as to recent universal adiabatic quantum computing constructions. In its most primitive form, systems constructed in this model cannot compute in an error-free manner when working at nonzero temperature. However, by exploiting a mapping between the partition function for this model and probabilistic classical circuits we are able to show that it is possible to make this model effectively error-free. We achieve this by using techniques in fault-tolerant classical computing and the result is that the system can compute effectively error-free if the temperature is below a critical temperature. We further link this model to computational complexity and show that a certain problem concerning finite temperature classical spin systems is complete for the complexity class Merlin-Arthur. This provides an interesting connection between the physical behavior of certain many-body spin systems and computational complexity.

Crosson, I. J.; Bacon, D.; Brown, K. R.

2010-09-01

330

Multiversion software reliability through fault-avoidance and fault-tolerance  

Science.gov (United States)

In this project we have proposed to investigate a number of experimental and theoretical issues associated with the practical use of multi-version software in providing dependable software through fault-avoidance and fault-elimination, as well as run-time tolerance of software faults. In the period reported here we have been working on the following. We have continued collection of data on the relationships between software faults and reliability, and the coverage provided by the testing process as measured by different metrics (including data flow metrics). We continued work on software reliability estimation methods based on non-random sampling, and on the relationship between software reliability and code coverage provided through testing. We have continued studying back-to-back testing as an efficient mechanism for removal of uncorrelated faults and common-cause faults of variable span. We have also been studying back-to-back testing as a tool for improvement of the software change process, including regression testing. We continued investigating existing fault-tolerance models and worked on the formulation of new ones. In particular, we have partly finished the evaluation of Consensus Voting in the presence of correlated failures, and are in the process of finishing the evaluation of Consensus Recovery Block (CRB) under failure correlation. We find both approaches far superior to commonly employed fixed agreement number voting (usually majority voting). We have also finished a cost analysis of the CRB approach.

Vouk, Mladen A.; Mcallister, David F.

1990-01-01
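
The abstract above contrasts Consensus Voting with fixed agreement number voting such as majority voting. The record does not define either voter precisely, so the sketch below makes simplifying assumptions: majority voting accepts a value only when more than half of the versions agree, while the consensus voter accepts the value with the largest agreement group and reports no decision on a tie. The function names are hypothetical, outputs are assumed non-empty and hashable, and the tie handling is an illustrative choice.

```python
from collections import Counter


def majority_vote(outputs):
    """Return the value produced by more than half of the versions, else None."""
    value, count = Counter(outputs).most_common(1)[0]
    return value if count > len(outputs) / 2 else None


def consensus_vote(outputs):
    """Return the value with the single largest agreement group, else None.

    Simplified: ties between the largest groups are left undecided here.
    """
    ranked = Counter(outputs).most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None
    return ranked[0][0]


# Five hypothetical versions; only two agree, so there is no strict majority.
outputs = [1, 1, 2, 3, 4]
majority_result = majority_vote(outputs)    # no decision
consensus_result = consensus_vote(outputs)  # selects the largest agreement group
```

The example shows a case in which the fixed-threshold voter gives up while the largest-group voter still returns an answer, which is the kind of situation the comparison in the abstract concerns.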

331

Fault detection coverage quantification of automatic test functions of digital I and C system in NPPs  

International Nuclear Information System (INIS)

Analog instrument and control systems in nuclear power plants have recently been replaced with digital systems for safer and more efficient operation. Digital instrument and control systems have adopted various fault-tolerant techniques that help the system correctly and safely perform the specific required functions regardless of the presence of faults. Each fault-tolerant technique has a different inspection period, from real-time monitoring to monthly testing. The range covered by each fault-tolerant technique is also different. The digital instrument and control system, therefore, adopts multiple barriers consisting of various fault-tolerant techniques to increase the total fault detection coverage. Even though these fault-tolerant techniques are adopted to ensure and improve the safety of a system, their effects on the system safety have not yet been properly considered in most probabilistic safety analysis models. Therefore, it is necessary to develop an evaluation method that can describe these features of digital instrument and control systems. Several issues must be considered in the fault coverage estimation of a digital instrument and control system, and two of these are addressed in this work. The first is to quantify the fault coverage of each fault-tolerant technique implemented in the system, and the second is to exclude the duplicated effect of fault-tolerant techniques implemented simultaneously at each level of the system's hierarchy, as a fault occurring in a system might be detected by one or more fault-tolerant techniques. For this work, a fault injection experiment was used to obtain the exact relations between faults and multiple barriers of fault-tolerant techniques. This experiment was applied to a bistable processor of a reactor protection system.

332

Software reliability models for fault-tolerant avionics computers and related topics  

Science.gov (United States)

Software reliability research is briefly described. General research topics are reliability growth models, quality of software reliability prediction, the complete monotonicity property of reliability growth, conceptual modelling of software failure behavior, assurance of ultrahigh reliability, and analysis techniques for fault-tolerant systems.

Miller, Douglas R.

1987-01-01

333

Bayesian reliability assessment of legacy safety-critical systems upgraded with fault-tolerant off-the-shelf software  

International Nuclear Information System (INIS)

This paper presents a new way of applying Bayesian assessment to systems that consist of many components. Full Bayesian inference with such systems is problematic, because it is computationally hard and, far more seriously, one needs to specify a multivariate prior distribution with many counterintuitive dependencies between the probabilities of component failures. The approach taken here is one of decomposition. The system is decomposed into partial views of the system, or parts thereof, with different degrees of detail, and then a mechanism for propagating the knowledge obtained with the more refined views back to the coarser views is applied (recalibration of coarse models). The paper describes the recalibration technique and then evaluates the accuracy of recalibrated models numerically on contrived examples using two techniques, u-plot and prequential likelihood, developed by others for software reliability growth models. The results indicate that the recalibrated predictions are often more accurate than the predictions obtained with the less detailed models, although this is not guaranteed. The techniques used to assess the accuracy of the predictions are accurate enough for one to be able to choose the model giving the most accurate prediction.

334

Fault tolerant formation control of nonholonomic mobile robots using online approximators  

Science.gov (United States)

For unmanned systems, it is desirable to have some sort of fault tolerant ability in order to accomplish the mission. Therefore, in this paper, the fault tolerant control of a formation of nonholonomic mobile robots in the presence of unknown faults is undertaken. Initially, a kinematic/torque leader-follower formation control law is developed for the robots under the assumption of normal operation, and the stability of the formation is verified using Lyapunov theory. Subsequently, the control law for the formation is modified by incorporating an additional term, and this new control law compensates for the effects of the faults. Moreover, the faults could be incipient or abrupt in nature. The additional term used in the modified control law is a function of the unknown fault dynamics, which are recovered using the online learning capabilities of online approximators. Additionally, asymptotic convergence of the FDA scheme and the formation errors in the presence of faults is shown using Lyapunov theory. Finally, numerical results are provided to verify the theoretical conjectures.

Thumati, Balaje T.; Dierks, Travis A.; Jagannathan, S.

2010-04-01

335

Improving the Navigability of a Hexapod Robot using a Fault-Tolerant Adaptive Gait  

Digital Repository Infrastructure Vision for European Research (DRIVER)

This paper presents a study on the development of a walking gait for fault tolerant locomotion in unstructured environments. The fault tolerant gait for adaptive locomotion fulfills stability conditions in the face of a fault event (locked joints or sensor failure) that would prevent a robot from achieving stable locomotion over uneven terrain. To accomplish this, a fault tolerant gait based on force-position control is proposed in this paper for a hexapod robot to enable stable walking with...

Umar Asif

2012-01-01

336

Topological fault-tolerance in cluster state quantum computation  

Energy Technology Data Exchange (ETDEWEB)

We describe a fault-tolerant version of the one-way quantum computer using a cluster state in three spatial dimensions. Topologically protected quantum gates are realized by choosing appropriate boundary conditions on the cluster. We provide equivalence transformations for these boundary conditions that can be used to simplify fault-tolerant circuits and to derive circuit identities in a topological manner. The spatial dimensionality of the scheme can be reduced to two by converting one spatial axis of the cluster into time. The error threshold is 0.75% for each source in an error model with preparation, gate, storage and measurement errors. The operational overhead is poly-logarithmic in the circuit size.

Raussendorf, R [Perimeter Institute for Theoretical Physics, Waterloo, ON, M6P 1N8 (Canada); Harrington, J [Perimeter Institute for Theoretical Physics, Waterloo, ON, M6P 1N8 (Canada); Goyal, K [Institute for Quantum Information, California Institute of Technology, Pasadena, CA 91125 (United States)

2007-06-15

337

Multiple Error Algorithm-Based Fault Tolerance For Matrix Triangularizations  

Science.gov (United States)

The checksum methods have been known as the most efficient fault-tolerant matrix triangularization schemes on systolic arrays in the presence of a single transient error. But it is not realistic to expect that at most one transient error occurs during any computation. In this paper, we extend the existing checksum schemes and introduce a block checksum scheme for multiple transient errors applicable to the fault tolerant matrix LU decomposition, Gaussian elimination with pairwise pivoting, and the QR decomposition. The block checksum scheme can detect, locate, and correct one transient error in each submatrix of a given matrix. Then we introduce examples that show that even one transient error can make the corrected results by factorization updates useless due to rounding errors. We also show that by introducing d weighted checksum vectors, we can detect all the transient errors that occur in a maximum of d different columns in matrix triangularizations.

Park, Haesun

1988-02-01
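
As a minimal sketch of the checksum idea described in the abstract above (not the paper's block scheme, its weighted checksum vectors, or its systolic-array implementation), the Python below stores row and column sums for a matrix and then uses the mismatching sums to detect, locate, and correct one corrupted entry. The function names and the tolerance `tol` are illustrative assumptions.

```python
def add_checksums(a):
    """Return (row_sums, col_sums) for matrix a, given as a list of lists."""
    row_sums = [sum(row) for row in a]
    col_sums = [sum(col) for col in zip(*a)]
    return row_sums, col_sums


def correct_single_error(a, row_sums, col_sums, tol=1e-9):
    """Detect, locate, and fix at most one corrupted entry of a, in place.

    Returns the (row, col) of the corrected entry, or None if all
    checksums agree. Raises ValueError if the mismatch pattern is not
    consistent with a single corrupted entry.
    """
    bad_rows = [i for i, row in enumerate(a)
                if abs(sum(row) - row_sums[i]) > tol]
    bad_cols = [j for j, col in enumerate(zip(*a))
                if abs(sum(col) - col_sums[j]) > tol]
    if not bad_rows and not bad_cols:
        return None
    if len(bad_rows) != 1 or len(bad_cols) != 1:
        raise ValueError("mismatch pattern is not a single error")
    i, j = bad_rows[0], bad_cols[0]
    # The row-sum discrepancy equals the size of the injected error.
    a[i][j] -= sum(a[i]) - row_sums[i]
    return i, j
```

In floating-point arithmetic the choice of `tol` matters: the abstract itself notes that rounding errors can make corrected results useless, so this exact-subtraction repair should be read as an idealized illustration.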

338

Compilation and Synthesis for Fault-Tolerant Digital Microfluidic Biochips  

DEFF Research Database (Denmark)

Microfluidic-based biochips are replacing the conventional biochemical analyzers, by integrating all the necessary functions for biochemical analysis using microfluidics. The digital microfluidic biochips (DMBs) manipulate discrete amounts of fluids of nanoliter volume, named droplets, on an array of electrodes to perform operations such as dispensing, transport, mixing, split, dilution and detection. Researchers have proposed compilation approaches, which, starting from a biochemical application and a biochip architecture, determine the allocation, resource binding, scheduling, placement and routing of the operations in the application. During the execution of a bioassay, operations could experience transient faults, thus negatively affecting the correctness of the application. We have proposed both offline (design time) and online (runtime) recovery strategies. The online recovery strategy decides the introduction of the redundancy required for fault-tolerance. We consider both time redundancy, i.e., re-executing erroneous operations, and space redundancy, i.e., creating redundant droplets for fault-tolerance. Error recovery is performed such that the number of transient faults tolerated is maximized and the timing constraints of the biochemical application are satisfied. Previous work has assumed that the biochip architecture is given, and most approaches consider a rectangular shape for the electrode array, where operations execute on rectangular “modules” formed of electrodes. However, non-regular application-specific architectures are common in practice. Hence, we have proposed an approach to the synthesis of application-specific architectures, such that the cost is minimized and the timing constraints of the application are satisfied. We propose an algorithm to build a library of non-regular modules for a given application-specific architecture, so that the area of a non-regular application-specific biochip can be used effectively.
During fabrication, DMBs can be affected by permanent faults, which may lead to the failure of the application. Our approach introduces redundant electrodes to synthesize fault-tolerant architectures aiming at increasing the yield of DMBs. We also propose a method to estimate, at design time, the application completion time in case of permanent faults in order to verify if an application can be successfully run on the architecture. The proposed approaches were evaluated using several real-life case studies and synthetic benchmarks.

Alistar, Mirela

2014-01-01

339

Fault-tolerant Landau-Zener quantum gates  

International Nuclear Information System (INIS)

We present a method to perform fault-tolerant single-qubit gate operations using Landau-Zener tunneling. In a single Landau-Zener pulse, the qubit transition frequency is varied in time so that it passes through the frequency of the radiation field. We show that a simple three-pulse sequence allows eliminating errors in the gate up to the third order in errors in the qubit energies or the radiation frequency.

340

Fault-Tolerant Landau-Zener Quantum Gates  

Digital Repository Infrastructure Vision for European Research (DRIVER)

We present a method to perform fault-tolerant single-qubit gate operations using Landau-Zener tunneling. In a single Landau-Zener pulse, the qubit transition frequency is varied in time so that it passes through the frequency of the radiation field. We show that a simple three-pulse sequence allows eliminating errors in the gate up to the third order in errors in the qubit energies or the radiation frequency.

Hicke, C.; Santos, L. F.; Dykman, M. I.

2005-01-01

341

Fault-Tolerant Time Synchronization in Wireless Sensor Networks  

Directory of Open Access Journals (Sweden)

Wireless Sensor Networks are a special type of ad-hoc network, where wireless devices collaborate with other devices to send data to the destination. Synchronization is an important issue for wireless sensor networks because temporal coordination is required for many of the collaborative tasks they perform, for example, for data fusion, in object tracking and velocity estimation, and in setting the sleep modes of the various nodes so that battery life is prolonged. Several synchronization schemes have been put forward to date, but only a few of them are fault-tolerant. Fault-tolerant, in this context, means that the scheme works efficiently even in the presence of malicious nodes. Malicious nodes in this paper refer mainly to nodes which may provide incorrect time. This paper proposes a novel fault-tolerant synchronization scheme which provides internal synchronization, taking into consideration the malicious or faulty nodes present in the network.

Vikram Singh, T. P. Sharma

2013-06-01

342

A Reliable and Fault Tolerant Routing for Optical WDM Networks  

CERN Document Server

In optical WDM networks, since each lightpath can carry a huge amount of traffic, failures may seriously damage the end user applications. Hence fault tolerance becomes an important issue in these networks. The lightpath which carries traffic during normal operation is called the primary path. The traffic is rerouted on a backup path in case of a failure. In this paper we propose a reliable and fault tolerant routing algorithm for establishing primary and backup paths. In order to establish the primary path, this algorithm uses load balancing in which link cost metrics are estimated based on the current load of the links. In backup path setup, the source sends a small fraction of probe packets along the existing paths and calculates the blocking probability from the feedback received from the destination. It then selects the optimal lightpath with the lowest blocking probability. Based on the simulation results, we show that the reliable and fault tolerant routing algorithm reduces the blocking ...

Ramesh, G

2009-01-01

343

Fault-tolerant Control of Unmanned Underwater Vehicles with Continuous Faults: Simulations and Experiments  

Directory of Open Access Journals (Sweden)

A novel thruster fault diagnosis and accommodation method for open-frame underwater vehicles is presented in the paper. The proposed system consists of two units: a fault diagnosis unit and a fault accommodation unit. In the fault diagnosis unit an ICMAC (Improved Credit Assignment Cerebellar Model Articulation Controllers) neural network information fusion model is used to realize the fault identification of the thruster. The fault accommodation unit is based on direct calculations of moment, and the result of fault identification is used to find the solution of the control allocation problem. The approach addresses continuous fault identification for the underwater vehicle (UV). Results from the experiment are provided to illustrate the performance of the proposed method in uncertain continuous-fault situations.

Qian Liu

2010-02-01

344

Fault-Tolerant, Radiation-Hard DSP  

Science.gov (United States)

Commercial digital signal processors (DSPs) for use in high-speed satellite computers are challenged by the damaging effects of space radiation, mainly single event upsets (SEUs) and single event functional interrupts (SEFIs). Innovations have been developed for mitigating the effects of SEUs and SEFIs, enabling the use of very-high-speed commercial DSPs with improved SEU tolerances. Time-triple modular redundancy (TTMR) is a method of applying traditional triple modular redundancy on a single processor, exploiting the VLIW (very long instruction word) class of parallel processors. TTMR improves SEU rates substantially. SEFIs are solved by a SEFI-hardened core circuit, external to the microprocessor. It monitors the health of the processor, and if a SEFI occurs, forces the processor to return to performance through a series of escalating events. TTMR and hardened-core solutions were developed for both DSPs and reconfigurable field-programmable gate arrays (FPGAs). This includes advancement of TTMR algorithms for DSPs and reconfigurable FPGAs, plus a rad-hard, hardened-core integrated circuit that services both the DSP and FPGA. Additionally, a combined DSP and FPGA board architecture was fully developed into a rad-hard engineering product. This technology enables use of commercial off-the-shelf (COTS) DSPs in computers for satellite and other space applications, allowing rapid deployment at a much lower cost. Traditional rad-hard space computers are very expensive and typically have long lead times. These computers are either based on traditional rad-hard processors, which have extremely low computational performance, or triple modular redundant (TMR) FPGA arrays, which suffer from power and complexity issues. Even more frustrating is that the TMR arrays of FPGAs require a fixed, external rad-hard voting element, thereby causing them to lose much of their reconfiguration capability and in some cases significant speed reduction.
The benefits of COTS high-performance signal processing include significant increase in onboard science data processing, enabling orders of magnitude reduction in required communication bandwidth for science data return, orders of magnitude improvement in onboard mission planning and critical decision making, and the ability to rapidly respond to changing mission environments, thus enabling opportunistic science and orders of magnitude reduction in the cost of mission operations through reduction of required staff. Additional benefits of COTS-based, high-performance signal processing include the ability to leverage considerable commercial and academic investments in advanced computing tools, techniques, and infrastructure, and the familiarity of the science and IT community with these computing environments.

Czajkowski, David

2011-01-01
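
To illustrate the voting logic behind triple modular redundancy applied in time on a single processor, as the abstract above describes, here is a minimal Python sketch. It is not the TTMR product's algorithm and does not model VLIW parallelism or the SEFI-hardened core; `majority_vote` and `run_with_time_redundancy` are hypothetical names, and the three executions simply run one after another.

```python
from collections import Counter


def majority_vote(results):
    """Return the value that a strict majority of results agree on."""
    value, count = Counter(results).most_common(1)[0]
    if count * 2 <= len(results):
        raise RuntimeError("no majority; results disagree")
    return value


def run_with_time_redundancy(task, arg, runs=3):
    """Execute task(arg) several times on one processor and vote on the results."""
    return majority_vote([task(arg) for _ in range(runs)])
```

With three runs, a single upset that corrupts one execution is outvoted by the two clean ones; if all three results differ, the sketch raises an error rather than guessing.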

345

Lecture Notes : Practical Approach to Reliability, Safety, and Active Fault-tolerance  

DEFF Research Database (Denmark)

"The fundamental objective of the combined safety and reliability assessment is to identify critical items in the design and the choice of equipment that may jeopardize safety or availability, and thereby to provide arguments for the selection between different options for the system." Achieving safety and reliability has for decades been one of the prime objectives for system designers when designing safety-critical systems. With growing environmental awareness, concerns, and demands, the scope of the design of reliable (and safe) systems has been extended even to small components such as sensors and actuators. In the past, the normal procedure to address the higher demand for reliability was to add hardware redundancy, which in turn increases the production and maintenance costs. Active fault-tolerant design is an attempt to achieve higher redundancy while minimizing the costs. In chapter 2, reliability and safety related issues are considered and described. The idea of introducing this chapter is to provide an overview of the concepts and methods used for reliability and safety assessment. The focus in chapter 3 is on the fault-tolerance concept. Types of possible faults in components and customary methods for applying redundancy are described. Finally, the chapter is wrapped up by considering and describing the main subject, which is a formal and consistent procedure to design active fault-tolerant systems.

Izadi-Zamanabadi, Roozbeh

2000-01-01

346

Algorithm-Based Fault Tolerance for Numerical Subroutines  

Science.gov (United States)

A software library implements a new methodology of detecting faults in numerical subroutines, thus enabling application programs that contain the subroutines to recover transparently from single-event upsets. The software library in question is fault-detecting middleware that is wrapped around the numerical subroutines. Conventional serial versions (based on LAPACK and FFTW) and a parallel version (based on ScaLAPACK) exist. The source code of the application program that contains the numerical subroutines is not modified, and the middleware is transparent to the user. The methodology used is a type of algorithm-based fault tolerance (ABFT). In ABFT, a checksum is computed before a computation and compared with the checksum of the computational result; an error is declared if the difference between the checksums exceeds some threshold. Novel normalization methods are used in the checksum comparison to ensure correct fault detections independent of algorithm inputs. In tests of this software reported in the peer-reviewed literature, this library was shown to enable detection of 99.9 percent of significant faults while generating no false alarms.

Tumon, Michael; Granat, Robert; Lou, John

2007-01-01
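
The ABFT check described in the abstract above (a checksum computed from the inputs, compared against a checksum of the result, with an error declared past a threshold) can be sketched for matrix multiplication as follows. This is a pure-Python illustration, not the library's interface: the function names are hypothetical, the checksum vector of ones is one common choice, and the fixed `threshold` stands in for the normalization methods that the abstract mentions but does not specify.

```python
def matvec(m, v):
    """Multiply matrix m (list of rows) by vector v."""
    return [sum(x * y for x, y in zip(row, v)) for row in m]


def matmul(a, b):
    """Plain matrix product of a and b (lists of rows)."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def checksum_ok(a, b, c, threshold=1e-9):
    """Check a claimed product c = a * b using a checksum vector of ones."""
    w = [1.0] * len(b[0])
    expected = matvec(a, matvec(b, w))  # checksum derived from the inputs
    observed = matvec(c, w)             # checksum of the computed result
    return all(abs(e - o) <= threshold for e, o in zip(expected, observed))
```

The check costs three matrix-vector products rather than a second full multiplication, which is why a wrapper of this kind can sit around an existing subroutine at modest overhead.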

347

Towards fault-tolerant quantum computing with trapped ions  

CERN Document Server

Today ion traps are among the most promising physical systems for constructing a quantum device harnessing the computing power inherent in the laws of quantum physics. The standard circuit model of quantum computing requires a universal set of quantum logic gates for the implementation of arbitrary quantum operations. As in classical models of computation, quantum error correction techniques enable rectification of small imperfections in gate operations, thus allowing for perfect computation in the presence of noise. For fault-tolerant computation, it is commonly believed that error thresholds ranging between 10^-4 and 10^-2 will be required depending on the noise model and the computational overhead for realizing the quantum gates. Up to now, all experimental implementations have fallen short of these requirements. Here, we report on a Molmer-Sorensen type gate operation entangling ions with a fidelity of 99.3(1)% which together with single-qubit operations forms a universal set of quantum gates. The gate op...

Benhelm, J; Roos, C F; Blatt, R

2008-01-01

348

A Modular and Fault-Tolerant Data Transport Framework  

CERN Document Server

The High Level Trigger (HLT) of the future ALICE heavy-ion experiment has to reduce its input data rate of up to 25 GB/s to at most 1.25 GB/s for output before the data is written to permanent storage. To cope with these data rates, a large PC cluster system is being designed to scale to several thousand nodes, connected by a fast network. For the software that will run on these nodes, a flexible data transport and distribution software framework, described in this thesis, has been developed. The framework consists of a set of separate components that can be connected via a common interface. This allows different configurations to be constructed for the HLT, which can even be changed at runtime. To ensure a fault-tolerant operation of the HLT, the framework includes a basic fail-over mechanism that allows whole nodes to be replaced after a failure. The mechanism will be further expanded in the future, utilizing the runtime reconnection feature of the framework's component interface. To connect cluster nodes a communication ...

Steinbeck, T M; Steinbeck, Timm M

2004-01-01

349

A Modular and Fault-Tolerant Data Transport Framework  

CERN Document Server

The High Level Trigger (HLT) of the future ALICE heavy-ion experiment has to reduce its input data rate of up to 25 GB/s to at most 1.25 GB/s for output before the data is written to permanent storage. To cope with these data rates, a large PC cluster system is being designed to scale to several thousand nodes, connected by a fast network. For the software that will run on these nodes, a flexible data transport and distribution software framework, described in this thesis, has been developed. The framework consists of a set of separate components that can be connected via a common interface. This allows different configurations to be constructed for the HLT, which can even be changed at runtime. To ensure a fault-tolerant operation of the HLT, the framework includes a basic fail-over mechanism that allows whole nodes to be replaced after a failure. The mechanism will be further expanded in the future, utilizing the runtime reconnection feature of the framework's component interface. To connect cluster nodes a communication ...

Steinbeck, Timm M

2009-01-01

350

Online Reconfigurable Self-Timed Links for Fault Tolerant NoC  

Directory of Open Access Journals (Sweden)

We propose link structures for NoC that efficiently tolerate transient, intermittent, and permanent errors. This is a necessary step to be taken in order to implement reliable systems in future nanoscale technologies. The protection against transient errors is realized using Hamming coding and interleaving for error detection and retransmission as the recovery method. We introduce two approaches for tackling the intermittent and permanent errors. In the first approach, spare wires are introduced together with reconfiguration circuitry. The other approach uses time redundancy: the transmission is split into two parts, where the data is doubled. In both structures the presence of permanent or intermittent errors is monitored by analyzing previous error syndromes. The links are based on self-timed signaling in which the handshake signals are protected using triple modular redundancy. We present the structures, operation, and designs for the different components of the links. The fault tolerance properties are analyzed using a fault model containing temporary, intermittent, and permanent faults that occur both as bursts and as single faults. The results show a considerable enhancement in the fault tolerance at the cost of performance and area, and with only a slight increase in power consumption.

Teijo Lehtonen

2007-05-01
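
The abstract above uses Hamming coding for error detection, with retransmission as the recovery method. As a minimal sketch of that combination, the Python below encodes four data bits with the standard Hamming(7,4) code and treats any nonzero syndrome as a request to retransmit. The paper's actual code width, interleaving, and link circuitry are not given in the abstract, so the code size and the function names here are assumptions.

```python
def hamming74_encode(d):
    """Encode 4 data bits [d1, d2, d3, d4] into a 7-bit Hamming codeword."""
    d1, d2, d3, d4 = d
    p1 = d1 ^ d2 ^ d4
    p2 = d1 ^ d3 ^ d4
    p4 = d2 ^ d3 ^ d4
    # Codeword positions 1..7 hold: p1 p2 d1 p4 d2 d3 d4
    return [p1, p2, d1, p4, d2, d3, d4]


def hamming74_syndrome(c):
    """Return 0 for a valid codeword; for a single-bit error, its 1-based position."""
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s4 = c[3] ^ c[4] ^ c[5] ^ c[6]
    return s1 + 2 * s2 + 4 * s4


def receive(c):
    """Detection-only receiver: accept the word or ask the sender to retransmit."""
    return "accept" if hamming74_syndrome(c) == 0 else "retransmit"
```

Used for detection only, the code's minimum distance of 3 means any one or two flipped bits in a word produce a nonzero syndrome. Logging the syndromes over time is also the kind of information the abstract says is analyzed to spot intermittent or permanent errors.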

351

Task-based Dynamic Fault Tolerance for Humanoid Robot Applications and Its Hardware Implementation  

Directory of Open Access Journals (Sweden)

This paper presents a new fault tolerance scheme suitable for humanoid robot applications. In the future, various tasks ranging from daily chores to safety-related tasks will be carried out by individual humanoid robots. If the importance of the tasks is different, the required dependability will vary accordingly. Therefore, for mobile humanoid robots operating under power constraints, fault tolerance that dynamically changes based on the importance of the tasks is desirable because fault-tolerant designs involving hardware redundancy are power intensive. In the proposed fault tolerance scheme, a duplex computer system switches between hot standby and cold standby according to each individual task. However, in mobile humanoid robots, a safety issue arises when cold standby is used for the standby computer unit. Since an unpowered unit cannot immediately start to operate, a biped-walking robot falls down when failover occurs during cold standby. This paper proposes a safety failover method to resolve this issue and describes the hardware design of the safety failover subsystem.

Masayuki Murakami

2008-08-01

352

Fault-Tolerant Quantum Dynamical Decoupling  

Digital Repository Infrastructure Vision for European Research (DRIVER)

Dynamical decoupling pulse sequences have been used to extend coherence times in quantum systems ever since the discovery of the spin-echo effect. Here we introduce a method of recursively concatenated dynamical decoupling pulses, designed to overcome both decoherence and operational errors. This is important for coherent control of quantum systems such as quantum computers. For bounded-strength, non-Markovian environments, such as for the spin-bath that arises in electron- ...

Khodjasteh, K.; Lidar, D. A.

2004-01-01

353

Fault Diagnosis and Accommodation of LTI systems by modified Youla parameterization  

Directory of Open Access Journals (Sweden)

In this paper an Active Fault Tolerant Control (FTC) scheme is proposed for Linear Time Invariant (LTI) systems, which achieves fault diagnosis followed by fault accommodation. The fault diagnosis scheme is carried out in two steps: fault detection followed by fault isolation. The fault detection filter uses the sensor measurements to generate residuals, which have a unique static pattern in response to each fault. Distortion in these static patterns generates the probability of the presence of a fault. The fault accommodation scheme is carried out using the Generalized Internal Model Control (GIMC) architecture, also known as modified Youla parameterization. In addition, performance indices are also evaluated to indicate that the resulting fault tolerant scheme can detect, identify and accommodate actuator and sensor faults under additive faults. The DC motor example is considered for the demonstration of the proposed scheme.

Minupriya A

2012-06-01

354

Fault management for data systems  

Science.gov (United States)

Issues related to automating the process of fault management (fault diagnosis and response) for data management systems are considered. Substantial benefits are to be gained by successful automation of this process, particularly for large, complex systems. The use of graph-based models to develop a computer assisted fault management system is advocated. The general problem is described and the motivation behind choosing graph-based models over other approaches for developing fault diagnosis computer programs is outlined. Some existing work in the area of graph-based fault diagnosis is reviewed, and a new fault management method which was developed from existing methods is offered. Our method is applied to an automatic telescope system intended as a prototype for future lunar telescope programs. Finally, an application of our method to general data management systems is described.

Boyd, Mark A.; Iverson, David L.; Patterson-Hine, F. Ann

1993-01-01

355

Trapped Ion Chain as a Neural Network: Fault-Tolerant Quantum Computation  

CERN Document Server

We demonstrate the possibility of realizing a neural network in a chain of trapped ions with induced long range interactions. Such models permit storing information distributed over the whole system. The storage capacity of such a network, which depends on the phonon spectrum of the system, can be controlled by changing the external trapping potential. We analyze the implementation of fault-tolerant universal quantum information processing in such systems.

Pons, M; Braungardt, S; De, A S; Lewenstein, M; Sanpera, A; Sen, U; Wunderlich, C; Ahufinger, Veronica; Braungardt, Sibylle; De, Aditi Sen; Lewenstein, Maciej; Pons, Marisa; Sanpera, Anna; Sen, Ujjwal; Wunderlich, Christof

2007-01-01

356

Byzantine Fault Tolerance of Regenerating Codes  

CERN Document Server

Recent years have witnessed a slew of coding techniques custom designed for networked storage systems. Network coding inspired regenerating codes are the most prolifically studied among these new age storage centric codes. A lot of effort has been invested in understanding the fundamental achievable trade-offs of storage and bandwidth usage to maintain redundancy in presence of different models of failures, showcasing the efficacy of regenerating codes with respect to traditional erasure coding techniques. For practical usability in open and adversarial environments, as is typical in peer-to-peer systems, we need however not only resilience against erasures, but also from (adversarial) errors. In this paper, we study the resilience of generalized regenerating codes (supporting multi-repairs, using collaboration among newcomers) in the presence of two classes of Byzantine nodes, relatively benign selfish (non-cooperating) nodes, as well as under more active, malicious polluting nodes. We give upper bounds on t...

Oggier, Frédérique

2011-01-01

357

Fault tolerance for kinematically redundant robotic manipulators  

Digital Repository Infrastructure Vision for European Research (DRIVER)

Robotic manipulators have been identified as a major technology to be utilized in the cleanup and management of hazardous waste and for deep sea and space exploration. The corrosive nature of such environments reduces the expected life of sensors and actuators. The cost of repairing systems in such environments is often prohibitively large especially when radioactive elements are present. This work is concerned with developing methods by which kinematically redundant manipulators can continue...

Lewis, Christopher L.

1994-01-01

358

Empirical Study of FFANNs Tolerance to Weight Stuck at Zero Fault  

Directory of Open Access Journals (Sweden)

The fault tolerance property of artificial neural networks has been investigated with reference to the hardware model of artificial neural networks. Weight fault is an important link, which causes breakup between two nodes. In this paper weight fault has been explained. Experiments have been performed for the weight-stuck-0 fault. The effect of the weight-stuck-0 fault on a trained network has been analyzed in this paper. The obtained results suggest that networks are not fault tolerant to this type of fault.

Chandra Sekhar Rai

2010-04-01
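
A weight stuck-at-zero experiment of the kind the abstract above describes can be sketched as a fault-injection loop: force one weight at a time to zero and measure how far the network output moves. The Python below is a minimal illustration only. The tiny network architecture, the tanh activation, and the worst-case deviation metric are assumptions, not the paper's experimental setup, and the weights are supplied by the caller rather than trained.

```python
import math


def forward(weights, x):
    """One hidden layer of tanh units and a linear output; no biases."""
    w_hidden, w_out = weights
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in w_hidden]
    return sum(w * h for w, h in zip(w_out, hidden))


def stuck_at_zero_deviations(weights, inputs):
    """For every weight, force it to 0 and record the worst-case output change."""
    w_hidden, w_out = weights
    baseline = [forward(weights, x) for x in inputs]
    deviations = {}
    for i, row in enumerate(w_hidden):
        for j in range(len(row)):
            faulty = [r[:] for r in w_hidden]  # copy so the caller's weights stay intact
            faulty[i][j] = 0.0
            outs = [forward((faulty, w_out), x) for x in inputs]
            deviations[("hidden", i, j)] = max(
                abs(o - b) for o, b in zip(outs, baseline))
    for k in range(len(w_out)):
        faulty_out = w_out[:]
        faulty_out[k] = 0.0
        outs = [forward((w_hidden, faulty_out), x) for x in inputs]
        deviations[("out", k)] = max(abs(o - b) for o, b in zip(outs, baseline))
    return deviations
```

A large deviation for a given weight indicates that the network does not tolerate that particular stuck-at-zero fault on the chosen inputs.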

359

Fault tolerant control in the case of actuator type of faults based on derivative estimation; Fehlertolerante Regelung bei aktuatoraehnlichen Fehlern mittels Ableitungsschaetzung  

Energy Technology Data Exchange (ETDEWEB)

In this article, we present an FTC architecture in which generalized actuator faults are diagnosed online and compensated for by the control system. Employing least-squares derivative estimators for identifying the faults inserts delay times into the control loop. In the course of a stability analysis, tolerable delay times are determined, from which admissible values of the process parameters can be deduced. The FTC scheme is illustrated by the classical three-tank system. (orig.)

Mai, Philipp; Hillermeier, Claus [Univ. der Bundeswehr Muenchen, Neubiberg (Germany). Professur fuer Automatisierungs- und Regelungstechnik

2010-07-01

360

Sensor and Sensorless Fault Tolerant Control for Induction Motors Using a Wavelet Index  

Directory of Open Access Journals (Sweden)

Fault Tolerant Control (FTC) systems are crucial in industry to ensure safe and reliable operation, especially of motor drives. This paper proposes the use of multiple controllers for a FTC system of an induction motor drive, selected based on a switching mechanism. The system switches between sensor vector control, sensorless vector control, closed-loop voltage by frequency (V/f) control and open loop V/f control. Vector control offers high performance, while V/f is a simple, low cost strategy with high speed and satisfactory performance. The faults dealt with are speed sensor failures, stator winding open circuits, shorts and minimum voltage faults. In the event of compound faults, a protection unit halts motor operation. The faults are detected using a wavelet index. For the sensorless vector control, a novel Boosted Model Reference Adaptive System (BMRAS) to estimate the motor speed is presented, which reduces tuning time. Both simulation results and experimental results with an induction motor drive show the scheme to be a fast and effective one for fault detection, while the control methods transition smoothly and ensure the effectiveness of the FTC system. The system is also shown to be flexible, reverting rapidly back to the dominant controller if the motor returns to a healthy state.

Ammar Masaoud

2012-03-01

361

Sensor and sensorless fault tolerant control for induction motors using a wavelet index.  

Science.gov (United States)

Fault Tolerant Control (FTC) systems are crucial in industry to ensure safe and reliable operation, especially of motor drives. This paper proposes the use of multiple controllers for a FTC system of an induction motor drive, selected based on a switching mechanism. The system switches between sensor vector control, sensorless vector control, closed-loop voltage by frequency (V/f) control and open loop V/f control. Vector control offers high performance, while V/f is a simple, low cost strategy with high speed and satisfactory performance. The faults dealt with are speed sensor failures, stator winding open circuits, shorts and minimum voltage faults. In the event of compound faults, a protection unit halts motor operation. The faults are detected using a wavelet index. For the sensorless vector control, a novel Boosted Model Reference Adaptive System (BMRAS) to estimate the motor speed is presented, which reduces tuning time. Both simulation results and experimental results with an induction motor drive show the scheme to be a fast and effective one for fault detection, while the control methods transition smoothly and ensure the effectiveness of the FTC system. The system is also shown to be flexible, reverting rapidly back to the dominant controller if the motor returns to a healthy state. PMID:22666016

Gaeid, Khalaf Salloum; Ping, Hew Wooi; Khalid, Mustafa; Masaoud, Ammar

2012-01-01

362

A Benchmark Evaluation of Fault Tolerant Wind Turbine Control Concepts  

DEFF Research Database (Denmark)

As the world’s power supply to a larger and larger degree depends on wind turbines, it is consequently and increasingly important that these are as reliable and available as possible. Modern fault tolerant control (FTC) could play a substantial part in increasing reliability of modern wind turbines. A benchmark model for wind turbine fault detection and isolation, and FTC has previously been proposed. Based on this benchmark, an international competition on wind turbine FTC was announced. In this brief, the top three solutions from that competition are presented and evaluated. The analysis shows that all three methods and, in particular, the winner of the competition shows potential for wind turbine FTC. In addition to showing good performance, the approach is based on a method, which is relevant for industrial usage. It is based on a virtual sensor and actuator strategy, in which the fault accommodation is handled in software sensor and actuator blocks. This means that the wind turbine controller can continue operation as in the fault free case. The other two evaluated solutions show some potential but probably need improvements before industrial applications.

Odgaard, Peter Fogh; Stoustrup, Jakob

2014-01-01

363

Interaction analysis for fault-tolerance in aspect-oriented programming  

Digital Repository Infrastructure Vision for European Research (DRIVER)

The key contribution of Aspect-Oriented Programming (AOP) is the encapsulation of crosscutting concerns in aspects, which facilitates modular reasoning. However, common methods of introducing aspects into the system, incorporating features such as implicit control-flow, mean that the ability to discover interactions between aspects can be compromised. This has profound implications for developers working on fault-tolerant systems. We present an analysis for aspects which can reveal these int...

Weston, N.; Taiani, F.; Rashid, A.

2007-01-01

364

Checkpoint and Replication Oriented Fault Tolerant Mechanism for MapReduce Framework  

Digital Repository Infrastructure Vision for European Research (DRIVER)

MapReduce is an emerging programming paradigm and an associated implementation for processing and generating big data which has been widely applied in data-intensive systems. In cloud environment, node and task failure is no longer accidental but a common feature of large-scale systems. In MapReduce framework, although the rescheduling based fault-tolerant method is simple to implement, it failed to fully consider the location of distributed data, the computation and storage overhead. Thus, a...

Yang Liu; Wei Wei; Yuhong Zhang

2013-01-01

365

Design and simulation of advanced fault tolerant flight control schemes  

Science.gov (United States)

This research effort describes the design and simulation of a distributed Neural Network (NN) based fault tolerant flight control scheme and the interface of the scheme within a simulation/visualization environment. The goal of the fault tolerant flight control scheme is to recover an aircraft from failures to its sensors or actuators. A commercially available simulation package, Aviator Visual Design Simulator (AVDS), was used for the purpose of simulation and visualization of the aircraft dynamics and the performance of the control schemes. For the purpose of the sensor failure detection, identification and accommodation (SFDIA) task, it is assumed that the pitch, roll and yaw rate gyros onboard are without physical redundancy. The task is accomplished through the use of a Main Neural Network (MNN) and a set of three De-Centralized Neural Networks (DNNs), providing analytical redundancy for the pitch, roll and yaw gyros. The purpose of the MNN is to detect a sensor failure while the purpose of the DNNs is to identify the failed sensor and then to provide failure accommodation. The actuator failure detection, identification and accommodation (AFDIA) scheme also features the MNN, for detection of actuator failures, along with three Neural Network Controllers (NNCs) for providing the compensating control surface deflections to neutralize the failure induced pitching, rolling and yawing moments. All NNs continue to train on-line, in addition to an offline trained baseline network structure, using the Extended Back-Propagation Algorithm (EBPA), with the flight data provided by the AVDS simulation package. The above mentioned adaptive flight control schemes have been traditionally implemented sequentially on a single computer. 
This research addresses the implementation of these fault tolerant flight control schemes on parallel and distributed computer architectures, using Berkeley Software Distribution (BSD) sockets and Message Passing Interface (MPI) for inter-process communication.

Gururajan, Srikanth

366

MPC fault-tolerant flight control case study: Flight 1862  

Digital Repository Infrastructure Vision for European Research (DRIVER)

We demonstrate that the fatal crash of El Al Flight 1862 might have been avoided by using MPC-based fault-tolerant control. Simulation on a detailed nonlinear model shows that it is possible to reconfigure the controller so that the aircraft is flown successfully down to ground level, without entering the condition in which it was lost. We use a reference-model based approach, in which an MPC controller attempts to restore the original functionality of the pilot’s controls. For the purposes...

Maciejowski, Jan; Jones, Colin

2003-01-01

367

Data center networks topologies, architectures and fault-tolerance characteristics  

CERN Document Server

This SpringerBrief presents a survey of data center network designs and topologies and compares several properties in order to highlight their advantages and disadvantages. The brief also explores several routing protocols designed for these topologies and compares the basic algorithms to establish connections, the techniques used to gain better performance, and the mechanisms for fault-tolerance. Readers will be equipped to understand how current research on data center networks enables the design of future architectures that can improve performance and dependability of data centers. This con

Liu, Yang; Veeraraghavan, Malathi; Lin, Dong; Hamdi, Mounir

2013-01-01

368

Fault Tolerant Air Bubble Sensor using Triple Modular Redundancy Method  

Directory of Open Access Journals (Sweden)

Detection of air bubbles in the blood is important for various medical treatments that use Extracorporeal Blood Circuits (ECBC), such as hemodialysis, hemofiltration and cardio-pulmonary bypass. A reliable air bubble detector is therefore needed. This study designed a fault tolerant air bubble detector. The Triple Modular Redundancy (TMR) method is used on the sensor section. A voter circuit of the TMR chooses one of the three sensor outputs to be processed further. Application of TMR will prevent errors in the detection of air bubbles, especially if a sensor fails to work.

Kuspriyanto Kuspriyanto

2013-03-01
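
The voting step mentioned in the abstract above can be illustrated with a minimal sketch. It assumes three boolean sensor readings and a plain two-out-of-three majority rule; the function name `tmr_vote` and the boolean encoding are illustrative assumptions, and the paper itself describes a hardware voter circuit rather than software.

```python
def tmr_vote(a: bool, b: bool, c: bool) -> bool:
    """Return the majority value of three redundant sensor readings.

    Hypothetical helper: the name and the boolean encoding are assumptions.
    """
    # Majority logic: the output is True when at least two inputs are True.
    return (a and b) or (a and c) or (b and c)


# One sensor stuck at False is outvoted by the two healthy sensors.
readings = (True, False, True)
bubble_detected = tmr_vote(*readings)
```

A majority voter masks any single faulty sensor, but it cannot mask two sensors that fail in the same direction.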

369

Fully fault tolerant quantum computation with non-deterministic gates  

CERN Document Server

In certain approaches to quantum computing the operations between qubits are non-deterministic and likely to fail. For example, a distributed quantum processor would achieve scalability by networking together many small components; operations between components should be assumed to be failure prone. In the logical limit of this architecture each component contains only one qubit. Here we derive thresholds for fault tolerant quantum computation under such extreme paradigms. We find that computation is supported for remarkably high failure rates (exceeding 90%) provided that failures are heralded; meanwhile, the rate of unknown errors should not exceed 2 in 10^4 operations.

Li, Ying; Stace, Thomas M; Benjamin, Simon C

2010-01-01

370

Optimal construction of arbitrary fault-tolerant gates  

International Nuclear Information System (INIS)

Full text: In this work, we perform a detailed study of the properties of optimal fault-tolerant approximations of arbitrary gates using the gate set directly applicable to the 7-qubit Steane code. Given a unitary matrix distance measure that we define, we find that for a given number of gates n the optimal distance that can be achieved is approximately d = 0.3 × 10^(-0.05n). Full details of the method used to construct these optimal approximations are given. Copyright (2005) Australian Institute of Physics
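
As a worked illustration of the scaling quoted above, the sketch below evaluates d = 0.3 × 10^(-0.05n) and inverts it to estimate how many gates a target distance would need. It takes the rounded constants 0.3 and 0.05 from the abstract at face value; the function names are hypothetical and the fit is only approximate.

```python
import math


def optimal_distance(n_gates: int) -> float:
    """Approximate best achievable distance for a sequence of n_gates gates."""
    return 0.3 * 10 ** (-0.05 * n_gates)


def gates_needed(target: float) -> int:
    """Smallest n for which the approximate fit predicts a distance <= target."""
    # Solve 0.3 * 10**(-0.05 * n) <= target for n, then round up.
    n = -math.log10(target / 0.3) / 0.05
    return max(0, math.ceil(n))


d20 = optimal_distance(20)   # roughly 0.03
n_for_1e3 = gates_needed(1e-3)
```

Under this fit, each additional 20 gates improves the achievable distance by about one order of magnitude.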

371

Fault Tolerant Electrical Machines. State of the Art and Future Directions  

Directory of Open Access Journals (Sweden)

Electrical engineering has recently expanded successfully into the area of fault tolerant electrical machines. To achieve fault tolerance, researchers have tried various machine geometries and different electrical drives. When new designs are to be carried out, knowledge of the current state of the work is urgently needed. The paper summarizes the most important information on these topics. Both the fault tolerant machine and the drive structure were taken into account. The paper also presents a new idea for a fault tolerant switched reluctance machine having a special winding. The future tasks to be performed are also mentioned in the paper.

Mircea RUBA

2008-05-01

372

Robust fault-tolerant control for a biped robot using a recurrent cerebellar model articulation controller.  

Science.gov (United States)

A design technique of a recurrent cerebellar model articulation controller (RCMAC)-based fault-tolerant control (FTC) system is investigated to rectify the nonlinear faults of a biped robot. The proposed RCMAC-based FTC (RCFTC) scheme contains two components: 1) an online fault estimation module based on an RCMAC is used to provide approximation information for any nonnominal behavior due to the system failure and modeling error of the biped robot; and 2) a controller module consisting of a computed torque controller and a robust FTC is utilized to achieve FTC. In the controller module, the computed torque controller reveals a basic stabilizing controller to stabilize the system, and the robust FTC is utilized to compensate for the effects of the system failure so as to achieve fault accommodation. The adaptive laws of the RCFTC system are rigorously established based on the Lyapunov function, so that the stability of the system can be guaranteed. Finally, two simulation cases of a biped robot are presented to illustrate the effectiveness of the proposed design method. Simulation results show that the RCFTC system can effectively recover the control performance for the system in the presence of the nonlinear faults and modeling uncertainties. PMID:17278565

Lin, Chih-Min; Chen, Chiu-Hsiung

2007-02-01

373

2009 fault tolerance for extreme-scale computing workshop, Albuquerque, NM - March 19-20, 2009.  

Energy Technology Data Exchange (ETDEWEB)

This is a report on the third in a series of petascale workshops co-sponsored by Blue Waters and TeraGrid to address challenges and opportunities for making effective use of emerging extreme-scale computing. This workshop was held to discuss fault tolerance on large systems for running large, possibly long-running applications. The main point of the workshop was to have systems people, middleware people (including fault-tolerance experts), and applications people talk about the issues and figure out what needs to be done, mostly at the middleware and application levels, to run such applications on the emerging petascale systems, without having faults cause large numbers of application failures. The workshop found that there is considerable interest in fault tolerance, resilience, and reliability of high-performance computing (HPC) systems in general, at all levels of HPC. The only way to recover from faults is through the use of some redundancy, either in space or in time. Redundancy in time, in the form of writing checkpoints to disk and restarting at the most recent checkpoint after a fault that causes an application to crash/halt, is the most common tool used in applications today, but there are questions about how long this can continue to be a good solution as systems and memories grow faster than I/O bandwidth to disk. There is interest in both modifications to this, such as checkpoints to memory, partial checkpoints, and message logging, and alternative ideas, such as in-memory recovery using residues. We believe that systematic exploration of these ideas holds the most promise for the scientific applications community. Fault tolerance has been an issue of discussion in the HPC community for at least the past 10 years; but much like other issues, the community has managed to put off addressing it during this period.
There is a growing recognition that as systems continue to grow to petascale and beyond, the field is approaching the point where we don't have any choice but to address this through R&D efforts.

Katz, D. S.; Daly, J.; DeBardeleben, N.; Elnozahy, M.; Kramer, B.; Lathrop, S.; Nystrom, N.; Milfeld, K.; Sanielevici, S.; Scott, S.; Votta, L.; Louisiana State Univ.; Center for Exceptional Computing; LANL; IBM; Univ. of Illinois; Shodor Foundation; Pittsburgh Supercomputer Center; Texas Advanced Computing Center; ORNL; Sun Microsystems

2009-02-01
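
The "redundancy in time" approach described in the report above (write checkpoints to disk, restart from the most recent one after a crash) can be sketched as follows. This is a toy illustration under stated assumptions: the computation is a simple running sum, the checkpoint is a small JSON file, and the names `run`, `ckpt_every`, and `fail_at` are hypothetical and do not come from the report.

```python
import json
import os
import tempfile


def run(total_steps, ckpt_path, ckpt_every=10, fail_at=None):
    """Sum 0..total_steps-1, checkpointing to disk every ckpt_every steps."""
    # On start, resume from the most recent checkpoint if one exists.
    state = {"step": 0, "acc": 0}
    if os.path.exists(ckpt_path):
        with open(ckpt_path) as f:
            state = json.load(f)

    while state["step"] < total_steps:
        if fail_at is not None and state["step"] == fail_at:
            raise RuntimeError("simulated fault")  # injected crash for the demo
        state["acc"] += state["step"]
        state["step"] += 1
        if state["step"] % ckpt_every == 0:
            # Write to a temporary file, then rename, so that a crash during
            # the write cannot leave a corrupt checkpoint behind.
            tmp_path = ckpt_path + ".tmp"
            with open(tmp_path, "w") as f:
                json.dump(state, f)
            os.replace(tmp_path, ckpt_path)
    return state["acc"]


path = os.path.join(tempfile.mkdtemp(), "ckpt.json")
try:
    run(100, path, fail_at=57)   # crashes at step 57; last checkpoint is step 50
except RuntimeError:
    pass
result = run(100, path)          # resumes from step 50 and finishes
```

Only the work done since the last checkpoint (steps 50 to 56 here) is repeated. The checkpoint interval trades I/O cost against lost work, which is the tension the report raises as memories grow faster than I/O bandwidth.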

374

A FAULT TOLERANT TOKEN BASED ATOMIC BROADCAST ALGORITHM RELYING ON RESPONSIVE PROPERTY  

Directory of Open Access Journals (Sweden)

In a distributed environment where shared resources are involved, there are basically two types of mechanism for allocating the shared resources: passing tokens or exchanging Request and Reply messages. In the shared environment, a processor might fail (i.e. may crash), which may lead to failure. This paper proposes a fault tolerant token based atomic broadcast algorithm which relies on unreliable failure detectors. It combines the failure detector and a token based mechanism, satisfying the responsiveness property. The mechanism can tolerate processor level faults, as compared to the existing system level failure, because the proposed system relies on the unreliable failure detector and also on the responsive property.

NEELAMANI SAMAL

2013-04-01

375

Checkpoint and Replication Oriented Fault Tolerant Mechanism for MapReduce Framework  

Directory of Open Access Journals (Sweden)

MapReduce is an emerging programming paradigm and an associated implementation for processing and generating big data which has been widely applied in data-intensive systems. In cloud environment, node and task failure is no longer accidental but a common feature of large-scale systems. In MapReduce framework, although the rescheduling based fault-tolerant method is simple to implement, it failed to fully consider the location of distributed data, the computation and storage overhead. Thus, a single node failure will increase the completion time dramatically. In this paper, a Checkpoint and Replication Oriented Fault Tolerant scheduling algorithm (CROFT) is proposed, which takes both task and node failure into consideration. Preliminary experiments show that, with less storage and network overhead, CROFT will significantly reduce the completion time at failure time, and the overall performance of MapReduce can be improved by at least 30% over the original mechanism in Hadoop.

Yang Liu

2013-09-01

376

Fault tolerant, radiation hard, high performance digital signal processor  

Science.gov (United States)

An architecture has been developed for a high-performance VLSI digital signal processor that is highly reliable, fault-tolerant, and radiation-hard. The signal processor, part of a spacecraft receiver designed to support uplink radio science experiments at the outer planets, organizes the connections between redundant arithmetic resources, register files, and memory through a shuffle exchange communication network. The configuration of the network and the state of the processor resources are all under microprogram control, which both maps the resources according to algorithmic needs and reconfigures the processing should a failure occur. In addition, the microprogram is reloadable through the uplink to accommodate changes in the science objectives throughout the course of the mission. The processor will be implemented with silicon compiler tools, and its design will be verified through silicon compilation simulation at all levels from the resources to full functionality. By blending reconfiguration with redundancy the processor implementation is fault-tolerant and reliable, and possesses the long expected lifetime needed for a spacecraft mission to the outer planets.

Holmann, Edgar; Linscott, Ivan R.; Maurer, Michael J.; Tyler, G. L.; Libby, Vibeke

1990-01-01

377

Scalable Fault-Tolerant Location Management Scheme for Mobile IP  

Directory of Open Access Journals (Sweden)

As the number of mobile nodes registering with a network rapidly increases in Mobile IP, multiple mobility (home or foreign) agents can be allocated to a network in order to improve performance and availability. Previous fault tolerant schemes (denoted by PRT schemes) to mask failures of the mobility agents use passive replication techniques. However, they result in high failure-free latency during the registration process if the number of mobility agents in the same network increases, and force each mobility agent to manage bindings of all the mobile nodes registering with its network. In this paper, we present a new fault-tolerant scheme (denoted by CML scheme) using checkpointing and message logging techniques. The CML scheme achieves low failure-free latency even if the number of mobility agents in a network increases, and improves scalability to a large number of mobile nodes registering with each network compared with the PRT schemes. Additionally, the CML scheme allows each failed mobility agent to recover bindings of the mobile nodes registering with the mobility agent when it is repaired even if all the other mobility agents in the same network concurrently fail.

JinHo Ahn

2001-11-01

378

1985 seminar on power plant digital control and fault-tolerant microcomputers: proceedings  

International Nuclear Information System (INIS)

An EPRI Seminar to address Power Plant Digital Controls and Fault-Tolerant Microcomputers Technology was hosted by Arizona Public Service Company in Phoenix, Arizona on April 9-12, 1986. The attendees represented a broad spectrum of US and foreign utilities, architect and consulting firms, and NSSS and computer system hardware vendors. These proceedings contain the text of the formal presentations as well as the papers and slides used during the short courses offered

379

Decoherence-Free Subspaces for Multiple-Qubit Errors (II) Universal, Fault-Tolerant Quantum Computation  

CERN Document Server

Decoherence-free subspaces (DFSs) shield quantum information from errors induced by the interaction with an uncontrollable environment. Here we study a model of correlated errors forming an Abelian subgroup (stabilizer) of the Pauli group (the group of tensor products of Pauli matrices). Unlike previous studies of DFSs, this type of errors does not involve any spatial symmetry assumptions on the system-environment interaction. We solve the problem of universal, fault-tolerant quantum computation on the associated class of DFSs.

Lidar, D A; Kempe, J; Whaley, K B; Lidar, Daniel A.; Bacon, David; Kempe, Julia

2001-01-01

380

The Nile fast-track implementation: fault-tolerant parallel processing of legacy CLEO data  

International Nuclear Information System (INIS)

Nile is a multi-disciplinary project building distributed parallel fault-tolerant computing for high energy physics and related fields. Nile Fast-Track is an early prototype of many key design principles of the full Nile project which is distributed computing over a wide area network. Object oriented design techniques are employed to produce a test-bed system which is extremely modular. We report on the Fast-Track project design, its status, and future plans. (author)

 
 
 
 
381

Fault-tolerant sub-lithographic design with rollback recovery.  

Science.gov (United States)

Shrinking feature sizes and energy levels coupled with high clock rates and decreasing node capacitance lead us into a regime where transient errors in logic cannot be ignored. Consequently, several recent studies have focused on feed-forward spatial redundancy techniques to combat these high transient fault rates. To complement these studies, we analyze fine-grained rollback techniques and show that they can offer lower spatial redundancy factors with no significant impact on system performance for fault rates up to one fault per device per ten million cycles of operation (P_f = 10^(-7)) in systems with 10^12 susceptible devices. Further, we concretely demonstrate these claims on nanowire-based programmable logic arrays. Despite expensive rollback buffers and general-purpose, conservative analysis, we show the area overhead factor of our technique is roughly an order of magnitude lower than a gate level feed-forward redundancy scheme. PMID:21730568

Naeimi, Helia; Dehon, André

2008-03-19

382

High Speed Fault Injection Tool Implemented With Verilog HDL on FPGA for Testing Fault Tolerance Designs  

Directory of Open Access Journals (Sweden)

This paper presents an FPGA-based fault injection tool, called FITO, that supports several synthesizable fault models for dependability analysis of digital systems modeled in Verilog HDL. Using FITO, experiments can be performed in real-time with good controllability and observability. As a case study, an OpenRISC 1200 microprocessor was evaluated using an FPGA circuit. About 4000 permanent, transient, and SEU faults were injected into this microprocessor. The results show that the FITO tool is more than 79 times faster than a pure simulation-based fault injection with only 2.5% FPGA area overhead.

G. Gopinath Reddy

2013-11-01

383

Actuator usage and fault tolerance of the James Webb Space Telescope optical element mirror actuators  

Science.gov (United States)

The James Webb Space Telescope (JWST) telescope's secondary mirror and eighteen primary mirror segments are each actively controlled in rigid body position via six hexapod actuators. The mirrors are stowed to the mirror support structure to survive the launch environment and then must be deployed 12.5 mm to reach the nominally deployed position before the Wavefront Sensing & Control (WFS&C) alignment and phasing process begins. The actuation system is electrically, but not mechanically redundant. Therefore, with the large number of hexapod actuators, the fault tolerance of the OTE architecture and WFS&C alignment process has been carefully considered. The details of the fault tolerance will be discussed, including motor life budgeting, failure signatures, and motor life.

Barto, A.; Acton, D. S.; Finley, P.; Gallagher, B.; Hardy, B.; Knight, J. S.; Lightsey, P.

2012-09-01

384

A taxonomy of reconfiguration techniques for fault-tolerant processor arrays  

Energy Technology Data Exchange (ETDEWEB)

The authors overview, characterize, and classify some typical reconfiguration schemes in light of a proposed taxonomy. This taxonomy can be used as a guide for future research in design and analysis of reconfiguration schemes. Studying how to evaluate fault-tolerant arrays and how to exploit application characteristics to achieve dependable computing are important complementary directions of research towards reliable processor-array design. A related research problem is that of functional reconfiguration, that is, learning how to configure the topology of a parallel system to implement a different function or run a different application. Important directions of research include how to apply or extend processor-array reconfiguration algorithms to other topologies and how to marry functional and fault-tolerance reconfiguration requirements and solutions. The Diogenes approach discussed in this article is a case where this goal is naturally achieved.

Chean, M. (Shell Development Co., Houston, TX (USA)); Fortes, J.A.B. (Purdue Univ., Lafayette, IN (USA))

1990-01-01

385

Robust and Fault-Tolerant Linear Parameter-Varying Control of Wind Turbines  

DEFF Research Database (Denmark)

High performance and reliability are required for wind turbines to be competitive within the energy market. To capture their nonlinear behavior, wind turbines are often modeled using parameter-varying models. In this paper we design and compare multiple linear parameter-varying (LPV) controllers, designed using a proposed method that allows the inclusion of both faults and uncertainties in the LPV controller design. We specifically consider a 4.8 MW, variable-speed, variable-pitch wind turbine model with a fault in the pitch system. We propose the design of a nominal controller (NC), handling the parameter variations along the nominal operating trajectory caused by nonlinear aerodynamics. To accommodate the fault in the pitch system, an active fault-tolerant controller (AFTC) and a passive fault-tolerant controller (PFTC) are designed. In addition to the nominal LPV controller, we also propose a robust controller (RC). This controller is able to take into account model uncertainties in the aerodynamic model. The controllers are based on output feedback and are scheduled on an estimated wind speed to manage the parameter-varying nature of the model. Furthermore, the AFTC relies on information from a fault diagnosis system. The optimization problems involved in designing the PFTC and RC are based on solving bilinear matrix inequalities (BMIs) instead of linear matrix inequalities (LMIs) due to unmeasured parameter variations. Consequently, they are more difficult to solve. The paper presents a procedure, where the BMIs are rewritten into two necessary LMI conditions, which are solved using a two-step procedure. Simulation results show the performance of the LPV controllers to be superior to that of a reference controller designed based on classical principles.

Sloth, Christoffer; Esbensen, Thomas

2011-01-01

386

Fault-Tolerant Control of Wind Turbines using a Takagi-Sugeno Sliding Mode Observer  

Science.gov (United States)

In this paper, observer-based fault-tolerant control schemes for actuator and sensor faults are implemented within dynamic wind turbine simulations. The faults are directly reconstructed by means of a Takagi-Sugeno sliding mode observer. As simulation models, both a reduced-order model with 4 degrees of freedom and the aero-elastic code FAST by NREL are used. A fault-tolerant control scheme is set up by subtracting the reconstructed fault from the faulty control signal or sensor value, respectively. With these fault compensation schemes, the corrected controller behaviour is close to the fault-free case. The global stability of the controller in the full-load region in the presence of faults and with active fault compensation is shown by analysing the derivative of an appropriate Lyapunov function.

Georg, Sören; Schulte, Horst

2014-06-01

387

Fault tolerant cooperative control for UAV rendezvous problem subject to actuator faults  

Science.gov (United States)

This paper investigates the problem of fault tolerant cooperative control for UAV rendezvous problem in which multiple UAVs are required to arrive at their designated target despite presence of a fault in the thruster of any UAV. An integrated hierarchical scheme is proposed and developed that consists of a cooperative rendezvous planning algorithm at the team level and a nonlinear fault detection and isolation (FDI) subsystem at individual UAV's actuator/sensor level. Furthermore, a rendezvous re-planning strategy is developed that interfaces the rendezvous planning algorithm with the low-level FDI. A nonlinear geometric approach is used for the FDI subsystem that can detect and isolate faults in various UAV actuators including thrusters and control surfaces. The developed scheme is implemented for a rendezvous scenario with three Aerosonde UAVs, a single target, and presence of a priori known threats. Simulation results reveal the effectiveness of our proposed scheme in fulfilling the rendezvous mission objective that is specified as a successful intercept of Aerosondes at their designated target, despite the presence of severe loss of effectiveness in Aerosondes engine thrusters.

Jiang, T.; Meskin, N.; Sobhani-Tehrani, E.; Khorasani, K.; Rabbath, C. A.

2007-04-01

388

Evaluation of Fault Detection Coverage of Digital I and C Systems  

Energy Technology Data Exchange (ETDEWEB)

In fault tolerance evaluation, fault detection coverage is a crucial factor. Fault detection coverage is the ability to detect errors that are caused by faults in a system. If faults are not detected by a given detection algorithm, the system could fail. Evaluating the fault detection coverage of a fault-tolerant technique is important for the safety analysis of digital systems. Digital I and C systems have a wider variety of fault-tolerant techniques than conventional analog I and C systems. Even though these fault-tolerant techniques are designed to ensure and improve the safety of systems, their effects have not yet been properly considered in most system probabilistic safety assessment (PSA) models. There have been several studies on the reliability of digital systems. However, systematic frameworks or reasonable models for obtaining the reliability of digital systems by considering the effects of fault-tolerant techniques have not been proposed. It is therefore necessary to develop an evaluation method reflecting the features of digital I and C systems. This work proposes an evaluation method for the fault detection coverage of digital I and C systems. The proposed method quantifies the fault detection coverage based on fault injection experiments. Even though the fault injection experiment has several limitations, such as fault injection into only memory and registers, the method has the advantage that the actual system behavior against faults can be observed. More accurate reliability evaluation of digital I and C systems can be expected from the experimental results.

Lee, Seung Jun; Jung, Wondea [Korea Atomic Energy Research Institute, Daejeon (Korea, Republic of)

2013-10-15
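
One simple way to turn fault injection results into a coverage figure is sketched below. It assumes that each injected fault is classified with one of a few outcome labels and that coverage is the fraction of injected faults that were detected; the labels and the function name `detection_coverage` are hypothetical, and the abstract does not state the exact estimator used in the paper.

```python
from collections import Counter


def detection_coverage(outcomes):
    """Fraction of injected faults whose outcome label is 'detected'."""
    if not outcomes:
        raise ValueError("no fault injection outcomes recorded")
    counts = Counter(outcomes)
    return counts["detected"] / len(outcomes)


# Hypothetical campaign: 8 injected faults, 6 of them detected.
campaign = ["detected"] * 6 + ["undetected", "undetected"]
coverage = detection_coverage(campaign)
```

A practical study might also exclude faults that never become active (for example, a bit flip in an unused memory word) before computing the ratio; that refinement is left out of this sketch.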

389

Evaluation of Fault Detection Coverage of Digital I and C Systems  

International Nuclear Information System (INIS)

In fault tolerance evaluation, fault detection coverage is a crucial factor. Fault detection coverage is the ability to detect errors that are caused by faults in a system. If faults are not detected by a given detection algorithm, the system could fail. Evaluating the fault detection coverage of a fault-tolerant technique is important for the safety analysis of digital systems. Digital I and C systems have a wider variety of fault-tolerant techniques than conventional analog I and C systems. Even though these fault-tolerant techniques are designed to ensure and improve the safety of systems, their effects have not yet been properly considered in most system probabilistic safety assessment (PSA) models. There have been several studies on the reliability of digital systems. However, systematic frameworks or reasonable models for obtaining the reliability of digital systems by considering the effects of fault-tolerant techniques have not been proposed. It is therefore necessary to develop an evaluation method reflecting the features of digital I and C systems. This work proposes an evaluation method for the fault detection coverage of digital I and C systems. The proposed method quantifies the fault detection coverage based on fault injection experiments. Even though the fault injection experiment has several limitations, such as fault injection into only memory and registers, the method has the advantage that the actual system behavior against faults can be observed. More accurate reliability evaluation of digital I and C systems can be expected from the experimental results.

390

A TESTING FRAMEWORK FOR FAULT TOLERANT COMPOSITION OF TRANSACTIONAL WEB SERVICES  

Directory of Open Access Journals (Sweden)

Software testers face great challenges in testing web services, so testing techniques must be developed for them. Web service composition has been an active research area over the last few years. This paper proposes a framework for testing fault-tolerant composition of web services. It tolerates faults during the composition of web services. Exception handling and transaction techniques are used as fault handling mechanisms. After composition, the web services are deployed on a WS-BPEL engine. The testing framework fetches the results of the composite web service from the WS-BPEL engine and checks whether the composed web service is fault tolerant and in a consistent state.
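
The combination of exception handling and transactional recovery described above can be sketched roughly as follows. This is not the paper's framework and does not use any WS-BPEL engine API; the step functions, the `run_composition` helper, and the in-memory `state` dictionary are hypothetical stand-ins for real service calls. Each step is paired with a compensating action, and when a step raises an exception the completed steps are undone in reverse order so that the composite ends in a consistent state.

```python
from typing import Callable, Dict, List, Tuple

State = Dict[str, int]
# (name, action, compensation): all hypothetical stand-ins for service calls
Step = Tuple[str, Callable[[State], None], Callable[[State], None]]


def run_composition(steps: List[Step], state: State) -> bool:
    """Run steps in order; on failure, compensate completed steps in reverse."""
    completed: List[Callable[[State], None]] = []
    for _name, action, compensate in steps:
        try:
            action(state)
            completed.append(compensate)
        except Exception:
            # Exception handling triggers transactional recovery.
            for undo in reversed(completed):
                undo(state)
            return False
    return True


def reserve(state: State) -> None:
    state["reserved"] += 1


def cancel_reservation(state: State) -> None:
    state["reserved"] -= 1


def charge(state: State) -> None:
    # Simulated fault: the payment service fails before changing any state.
    raise RuntimeError("payment service unavailable")


def refund(state: State) -> None:
    state["charged"] -= 1


state = {"reserved": 0, "charged": 0}
ok = run_composition(
    [("reserve", reserve, cancel_reservation), ("charge", charge, refund)],
    state,
)
```

A test in the spirit of the abstract would inject a fault such as the failing `charge` step and then check two things: that the composite reports failure instead of crashing, and that the observable state equals the state before the composition started.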

Deepali Diwase

2012-12-01

391

Fault-tolerant conversion between the Steane and Reed-Muller quantum codes.  

Science.gov (United States)

Steane's 7-qubit quantum error-correcting code admits a set of fault-tolerant gates that generate the Clifford group, which in itself is not universal for quantum computation. The 15-qubit Reed-Muller code also does not admit a universal fault-tolerant gate set but possesses fault-tolerant T and control-control-Z gates. Combined with the Clifford group, either of these two gates generates a universal set. Here, we combine these two features by demonstrating how to fault-tolerantly convert between these two codes, providing a new method to realize universal fault-tolerant quantum computation. One interpretation of our result is that both codes correspond to the same subsystem code in different gauges. Our scheme extends to the entire family of quantum Reed-Muller codes. PMID:25192082

Anderson, Jonas T; Duclos-Cianci, Guillaume; Poulin, David

2014-08-22

392

Fault-Tolerant Conversion between the Steane and Reed-Muller Quantum Codes  

Science.gov (United States)

Steane's 7-qubit quantum error-correcting code admits a set of fault-tolerant gates that generate the Clifford group, which in itself is not universal for quantum computation. The 15-qubit Reed-Muller code also does not admit a universal fault-tolerant gate set but possesses fault-tolerant T and control-control-Z gates. Combined with the Clifford group, either of these two gates generates a universal set. Here, we combine these two features by demonstrating how to fault-tolerantly convert between these two codes, providing a new method to realize universal fault-tolerant quantum computation. One interpretation of our result is that both codes correspond to the same subsystem code in different gauges. Our scheme extends to the entire family of quantum Reed-Muller codes.
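
For readers unfamiliar with the Steane code, the short Python sketch below illustrates its classical skeleton only. The Steane code is the CSS code built from the [7,4] Hamming code, with the rows of the parity-check matrix `H` serving as both the X-type and the Z-type stabilizer generators. The sketch checks the condition that makes those generators commute and shows how a single bit-flip is located from its syndrome. It does not implement the code conversion described in the abstract, and the helper names are chosen for this example.

```python
# Parity-check matrix of the classical [7,4] Hamming code.
# Column j (1-based) is the binary representation of j.
H = [
    [0, 0, 0, 1, 1, 1, 1],
    [0, 1, 1, 0, 0, 1, 1],
    [1, 0, 1, 0, 1, 0, 1],
]


def dot_mod2(a, b):
    """Inner product of two bit vectors over GF(2)."""
    return sum(x & y for x, y in zip(a, b)) % 2


def css_condition_holds(h):
    """True if every pair of rows (including a row with itself) overlaps evenly.

    This is H * H^T = 0 (mod 2), which guarantees that the X-type and
    Z-type stabilizers defined by the rows commute.
    """
    return all(dot_mod2(r, s) == 0 for r in h for s in h)


def syndrome(error):
    """Syndrome of a 7-bit error pattern with respect to H."""
    return tuple(dot_mod2(row, error) for row in H)


def locate_single_error(error):
    """Return the 0-based index of a single flipped bit, or None if no error."""
    s = syndrome(error)
    position = 4 * s[0] + 2 * s[1] + s[2]  # syndrome read as a binary number
    return position - 1 if position else None


flip_on_qubit_5 = [0, 0, 0, 0, 1, 0, 0]  # bit-flip on the fifth qubit (index 4)
located = locate_single_error(flip_on_qubit_5)
```

The same syndrome lookup applies to phase-flip errors using the X-type stabilizers, since both stabilizer types are defined by the same matrix.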

Anderson, Jonas T.; Duclos-Cianci, Guillaume; Poulin, David

2014-08-01