Role Reconfiguration Based on Decision-Making for Fault-Recovering in Multi-Agent Systems

Distributed hardware systems can be viewed as teams of distributed cooperative agents. This view enables designers to develop new agent-based ideas for increasing a system’s fault tolerance. This paper considers recovering a system from potential faults by helping the faulty agents perform their tasks. Through a two-phase distributed decision-making process, each agent decides whether it can help the faulty agents by undertaking their tasks; if it can, the helping actions begin. The developed ideas are implemented and tested in a simulated Distributed Control System. The results show that the proposed distributed fault-clearing method, which reconfigures the agents’ roles to recover the lost capabilities, is effective. The method uses no central agent, sentinel, or broker to observe the agents and redistribute tasks among them. Instead, the system is fully distributed, and each agent takes appropriate actions based on a designed cooperation strategy to clear the fault.
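
The abstract does not say what the two decision phases are, so the following Python sketch is only an illustration under stated assumptions, not the authors’ method. It assumes that phase one is a local capacity check and that phase two is a comparison of bids that the volunteers broadcast to one another. The names `Agent`, `phase_one`, `phase_two`, and `recover` are hypothetical, as are the integer load units.

```python
from dataclasses import dataclass, field


@dataclass
class Agent:
    name: str
    capacity: int                               # assumed: maximum total load
    tasks: dict = field(default_factory=dict)   # task name -> load
    faulty: bool = False

    @property
    def load(self) -> int:
        return sum(self.tasks.values())

    def phase_one(self, task_load: int) -> bool:
        """Assumed phase 1: local check for spare capacity."""
        return not self.faulty and self.load + task_load <= self.capacity

    def phase_two(self, task_load: int, peer_bids: list) -> bool:
        """Assumed phase 2: take the task only if my bid beats every peer bid.

        Every volunteer applies the same rule to the same broadcast bids,
        so exactly one agent accepts and no broker is needed.
        """
        my_bid = (self.load + task_load, self.name)
        return all(my_bid < bid for bid in peer_bids)


def recover(agents: list, faulty: Agent) -> None:
    """Simulation harness: stands in for message passing, not a central agent."""
    for task, task_load in list(faulty.tasks.items()):
        volunteers = [a for a in agents
                      if a is not faulty and a.phase_one(task_load)]
        # Each volunteer broadcasts its prospective load, with its name as tie-break.
        bids = {a.name: (a.load + task_load, a.name) for a in volunteers}
        for a in volunteers:
            peer_bids = [b for n, b in bids.items() if n != a.name]
            if a.phase_two(task_load, peer_bids):
                a.tasks[task] = task_load       # role reconfiguration
                del faulty.tasks[task]
                break


a = Agent("A", 10, {"a": 4})
b = Agent("B", 10, {"b": 7})
c = Agent("C", 10, {"c": 2})
f = Agent("F", 10, {"f1": 3, "f2": 5}, faulty=True)
recover([a, b, c, f], f)
print({agent.name: sorted(agent.tasks) for agent in (a, b, c, f)})
```

A task that no healthy agent can absorb simply stays with the faulty agent in this sketch; a real system would need a policy for that case.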
