“Adaptare-FD: A dependability-oriented adaptive failure detector”

Mônica Dixit, António Casimiro

in 29th IEEE Symposium on Reliable Distributed Systems (SRDS'10), Nov. 2010.

Abstract: Unreliable failure detectors are a fundamental building block in the design of reliable distributed systems. But unreliability must be bounded, despite the uncertainties affecting the timeliness of communication. This is why it is important to reason in terms of the quality of service (QoS) of failure detectors, both in their specification and evaluation. We propose a novel dependability-oriented approach for specifying the QoS of failure detectors, and introduce Adaptare-FD, an autonomous and adaptive failure detector that executes according to this new specification. The main distinguishing features of Adaptare-FD with respect to existing adaptive failure detection approaches are discussed and explained in detail. A comparative evaluation of Adaptare-FD is presented. We highlight the practical differences between our approach and the well-known Chen et al. approach for the specification of QoS requirements. We show that Adaptare-FD is easily configured, independently of the specific network environment. Furthermore, the results obtained using the PlanetLab platform indicate that Adaptare-FD outperforms other timeout-based solutions, combining versatility with improved QoS and dependability assurance.

Research line(s): Timeliness and Adaptation in Dependable Systems (TADS)

