Resilient State Machine Replication

Paulo Sousa, Nuno Ferreira Neves, Paulo Veríssimo

In Proceedings of the 11th Pacific Rim International Symposium on Dependable Computing (PRDC), Changsha, China, pages 305-309, December 2005.

Abstract

Nowadays, one of the major concerns about the services provided over the Internet is related to their availability. Replication is a well known way to increase the availability of a service. However, replication has some associated costs, namely it is necessary to guarantee a correct coordination among the replicas. Moreover, being the Internet such an unpredictable and insecure environment, coordination correctness should be tolerant to Byzantine faults and immune to timing failures. Several past works address agreement and replication techniques that tolerate Byzantine faults under the asynchronous model, but they all make the assumption that the number of faulty replicas is bounded and known. Assuming a maximum number of f faulty replicas under the asynchronous model is dangerous - there is no way of guaranteeing that no more than f faults will occur during the execution of the system. In this paper, we describe a resilient f fault/intrusion-tolerant state machine replication system, which guarantees that no more than f faults ever occur. The system is asynchronous in its most part and it resorts to a synchronous oracle to periodically remove the effects of faults/attacks from the replicas.

BibTeX

@InProceedings{sousa05resilient,
    author = "P. Sousa and N. F. Neves and P. Ver\'{\i}ssimo",
    title = "Resilient State Machine Replication",
    booktitle = "Proceedings of the 11th Pacific Rim International Symposium on Dependable Computing (PRDC)",
    year = "2005",
    pages = "305-309",
    month = dec
}

Extended Version

Download the pdf.