Using Light-Weight Groups to Handle Timing Failures in Quasi-Synchronous Systems

Carlos Almeida and Paulo Veríssimo

From Proceedings of the 19th IEEE Real-Time Systems Symposium, December 1998, Madrid, Spain.

Abstract

In a quasi-synchronous environment, the worst-case time associated with a given activity is usually much higher than the average time needed for that activity. Always using those worst-case times can make a system useless, yet not using them may lead to timing failures. Fully synchronous behavior, on the other hand, is usually restricted to small parts of the global system. In a previously defined architecture, we use this small synchronous part to control and validate the other parts of the system. In this paper we present a light-weight group protocol that, together with that architecture, makes it possible to handle timing failures efficiently in a quasi-synchronous system. This is especially useful when active replication is used. It provides application support for fail-safe behavior, or for controlled (timely and safe) switching between different qualities of service.

An extended report is also available (gzipped PostScript).