Abstract
|
This thesis introduces a new dimension over which systems dependability
may be evaluated: exhaustion-safety. Exhaustion-safety
means safety against resource exhaustion, and its concrete semantics
in a given system depends on the type of resource being considered.
The thesis focuses on the nodes of a fault-tolerant distributed
system as crucial resources and on understanding the conditions
in which the typical assumption on the maximum number of node
failures may or may not be violated.
An interesting first finding was that it is impossible to build a
node-exhaustion-safe intrusion-tolerant distributed system under the
asynchronous model. This result motivated research into
the right model and architecture to guarantee node-exhaustion-safety.
The main outcome of this research was proactive resilience,
a new paradigm to build intrusion-tolerant distributed systems.
Proactive resilience is based on architectural hybridization and hybrid
distributed system modeling: the system is asynchronous for
the most part and resorts to a synchronous subsystem to periodically
recover the nodes and remove the effects of faults/attacks.
The Proactive Resilience Model (PRM) is presented and shown to
be a way of building node-exhaustion-safe intrusion-tolerant distributed
systems.
Finally, the thesis presents two application scenarios of proactive
resilience. First, a proof-of-concept prototype of a secret sharing system
built according to the PRM is described and shown to be highly
resilient under different attack scenarios. Then, a novel intrusion-tolerant
state machine replication architecture (based on the PRM)
is presented and a new result is established: a minimum of 3f + 2k + 1
replicas are required to ensure availability in a system where
f arbitrary faults may happen between recoveries, with at most k
replicas recovering simultaneously.
|
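To make the replica bound concrete, the following minimal sketch evaluates 3f + 2k + 1 for a few small configurations. The function name `min_replicas` and the sample values of f and k are illustrative assumptions, not taken from the thesis; only the formula itself comes from the abstract.

```python
def min_replicas(f: int, k: int) -> int:
    """Return 3f + 2k + 1, the minimum replica count stated in the abstract.

    f: arbitrary faults that may happen between recoveries
    k: maximum number of replicas recovering simultaneously
    (min_replicas is a hypothetical helper name used for illustration.)
    """
    if f < 0 or k < 0:
        raise ValueError("f and k must be non-negative")
    return 3 * f + 2 * k + 1


# Illustrative configurations (assumed values, not results from the thesis).
for f, k in [(1, 1), (2, 1), (1, 2)]:
    print(f"f={f}, k={k}: at least {min_replicas(f, k)} replicas")
```

With k = 0 the expression reduces to 3f + 1, the classical bound for Byzantine fault-tolerant replication, so the 2k term can be read as the extra replicas needed to stay available while up to k replicas are recovering.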