COPYRIGHT NOTICE: Reports contained in this page are included by the contributing authors as a mechanism to ensure timely dissemination of scholarly/technical information on a non-commercial basis. Copyright and all rights therein are maintained by the authors, despite the fact they have offered this information electronically. It is understood that all individuals copying this information will adhere to the terms/constraints invoked by each author's copyright.
Reports may not be copied for commercial redistribution, republication, or dissemination without the explicit permission of the Navigators and the authors.
Sections of some of these reports have been published by IEEE and have IEEE Copyright. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works, must be obtained from the IEEE. Contact: Manager, Copyrights and Permissions / IEEE Service Center / 445 Hoes Lane / P.O. Box 1331 / Piscataway, NJ 08855-1331, USA. Telephone: + Intl. 732-562-3966.

 

The papers are split into four areas:

(T)TCB model, architecture and implementation

Time-related applications

Intrusion-tolerance applications

Other papers

 

(T)TCB model, architecture and implementation

The Timely Computing Base Model and Architecture
P. Veríssimo and A. Casimiro
IEEE Transactions on Computers - Special Issue on Asynchronous Real-Time Systems, vol. 51, n. 8, Aug 2002

Current systems are very often based on large-scale, unpredictable and unreliable infrastructures. However, users of these systems increasingly require services with timeliness properties. This creates a difficult-to-solve contradiction with regard to the adequate time model: synchronous, or asynchronous? In this paper, we propose an architectural construct and programming model, which address this problem. We assume the existence of a component that is capable of executing timely functions, however asynchronous the rest of the system may be. We call this component the Timely Computing Base, and it can be used by the other components to execute a set of simple but crucial time-related services. We also show how to use it to build dependable and timely applications exhibiting varying degrees of timeliness assurance, under several synchrony models.

Download PDF Download Postscript Bibtex Entry

How to Build a Timely Computing Base using Real-Time Linux
António Casimiro and Pedro Martins and Paulo Veríssimo
Proceedings of the 2000 IEEE International Workshop on Factory Communication Systems, Porto, Portugal, September 2000

In a recent paper we introduced a new model to deal with the problem of handling application timeliness requirements in environments with loose real-time guarantees. This model, called the Timely Computing Base (TCB), is one of partial synchrony. From an engineering point of view, it requires systems to be constructed with a small control part, a TCB module, to protect vital resources with respect to timeliness and to provide basic time related services to applications. Although many different instantiations of systems with a TCB can be envisaged, we have chosen to implement a TCB using PC hardware running the Real-Time Linux operating system over a Fast-Ethernet network. This paper describes the experience gained during the implementation process and shows that it is possible to construct a TCB without the need for special software or hardware components. The problem of achieving real-time communication under RT-Linux is also discussed: we describe the port we have done of a Linux network driver to RT-Linux, explaining the required modifications to allow predictability.

Download PDF Download Postscript Bibtex Entry

The Timely Computing Base: Timely Actions in the Presence of Uncertain Timeliness
Paulo Veríssimo, António Casimiro and Christof Fetzer
Proceedings of the International Conference on Dependable Systems and Networks, New York, USA, June 2000

Real-time behavior is specified in compliance with timeliness requirements, which in essence calls for synchronous system models. However, systems often rely on unpredictable and unreliable infrastructures, which suggest the use of asynchronous models. Several models have been proposed to address this issue. We propose an architectural construct that takes a generic approach to the problem of programming in the presence of uncertain timeliness. We assume the existence of a component, capable of executing timely functions, which helps applications with varying degrees of synchrony to behave reliably despite the occurrence of timing failures. We call this component the Timely Computing Base, TCB. This paper describes the TCB architecture and model, and discusses the application programming interface for accessing the TCB services. The implementation of the TCB services uses fail-awareness techniques to increase the coverage of TCB properties.

Download PDF Download Postscript Bibtex Entry

Timing Failure Detection with a Timely Computing Base
António Casimiro and Paulo Veríssimo
Third European Research Seminar on Advances in Distributed Systems, Madeira Island, Portugal, April 1999

In a recent report we proposed an architectural construct to address the problem of dealing with timeliness specifications in a generic way. We called it the Timely Computing Base, TCB. The TCB defines a set of services available to applications, including timely execution, duration measurement and timing failure detection. We showed how these services could be used to build dependable and timely applications. In this paper we further extend the description of the TCB, namely by presenting a protocol for its Timing Failure Detection (TFD) service. We discuss the essential aspects of providing such a service under the TCB framework and make some considerations relative to the service interface.

Download PDF Download Postscript Bibtex Entry

 


The Design of a COTS Real-Time Distributed Security Kernel
M. Correia, P. Veríssimo, Nuno F. Neves
Fourth European Dependable Computing Conference. Toulouse, France, pages 234--252, October 2002

This paper describes the design of a security kernel called TTCB, which has innovative features. Firstly, it is a distributed subsystem with its own secure network. Secondly, the TTCB is real-time, that is, a synchronous subsystem capable of timely behavior. These two characteristics together are uncommon in security kernels. Thirdly, the TTCB can be implemented using only COTS components.
We discuss essentially three things in this paper: (1) The TTCB is a simple component providing a small set of basic secure services. It aims at building a new style of protocols to achieve intrusion tolerance, which for the most part execute in insecure, arbitrary failure environments, and resort to the TTCB only in crucial parts of their operation. (2) Besides, the TTCB is a synchronous device supplying functions that may be an enabler of a new generation of timed secure protocols, until now known to be fragile due to attacks on timing assumptions. (3) Finally, we present a design methodology that establishes our hybrid failure assumptions in a well-founded manner. It helps us to achieve a robust design, despite using exclusively COTS components, with the advantage of allowing the security kernel to be easily deployed on widely used platforms.

 

  Download Postscript Bibtex Entry


Uncertainty and Predictability: Can they be reconciled?
Paulo Veríssimo
Future Directions in Distributed Computing, pp. 108-113, Springer Verlag LNCS 2584, May, 2003

We are faced today with the confluence of antagonistic aims, when designing and deploying distributed applications, such as uncertainty and predictability. Uncertainty is a common denominator of current systems: uncertain synchrony, fault model, and even topology. However, systems are required to fulfil more and more demanding goals which require predictability under several forms, e.g., timeliness, trustworthiness. This paper introduces a new design philosophy for distributed systems, based on the existence of architectural constructs with privileged properties (wormholes), which endow systems with the capability of evading the uncertainty of the environment for certain crucial steps of their operation where predictability is required. Recently, we have tested this philosophy by studying and prototyping two incarnations of distributed systems with wormholes, which we also report here.
 

Download PDF   Bibtex Entry

Time-related applications

Generic Timing Fault Tolerance using a Timely Computing Base
A. Casimiro and P. Veríssimo
Proceedings of the International Conference on Dependable Systems and Networks, Washington D.C., USA, June 2002

Designing applications with timeliness requirements in environments of uncertain synchrony is known to be a difficult problem. In this paper, we follow the perspective of timing fault tolerance: timing errors occur, and they are processed using redundancy, e.g., component replication, to recover and deliver timely service. We introduce a paradigm for generic timing fault tolerance with replicated state machines. The paradigm is based on the existence of Timing Failure Detection with timed completeness and accuracy properties. Generic timing fault tolerance implies the ability to dependably observe the system and to timely notify timing failures, which we discuss in the paper. On the other hand, it ensures replica determinism with respect to time (temporal consistency), and safety in case of spare exhaustion. We show that the paradigm can be addressed and realized in the framework of the Timely Computing Base (TCB) model and architecture. Furthermore, we illustrate the generality of our approach by reviewing previous existing solutions and by showing that in contrast with ours, they only secure a restricted semantics, or simply provide ad-hoc solutions.

Download PDF Download Postscript Bibtex Entry

Using the Timely Computing Base for Dependable QoS Adaptation
A. Casimiro and P. Veríssimo
Proceedings of the 20th IEEE Symposium on Reliable Distributed Systems, New Orleans, USA, October 2001

In open and heterogeneous environments, where an unpredictable number of applications compete for a limited amount of resources, executions can be affected by delays that are also unpredictable and may not even be bounded. Since many of these applications have timeliness requirements, they can only be implemented if they are able to adapt to the existing conditions. Adaptation can be done in several ways, taking into account many different factors, but an obvious factor of success is knowing what they have to adapt to. In this paper we present a novel approach, called Dependable QoS adaptation, which can only be achieved if the environment is accurately and reliably observed.

Dependable QoS adaptation is based on the Timely Computing Base (TCB) model. The TCB model is a partial synchrony model that adequately characterizes environments of uncertain synchrony and allows, at the same time, the specification and verification of timeliness requirements. We introduce the coverage stability property and show that adaptive applications can use the TCB to dependably adapt and enjoy this property. We describe the characteristics and the interface of a QoS coverage service and discuss its implementation details.

Download PDF Download Postscript Bibtex Entry

Intrusion-tolerance applications

A Simple Intrusion-Tolerant Reliable Multicast Protocol using the TTCB Model
Miguel Correia, Lau Cheuk Lung, Nuno Ferreira Neves, Paulo Veríssimo
Proceedings of the 21st Simpósio Brasileiro de Redes de Computadores, Natal, Brasil, May 2003

This paper proposes a simple reliable multicast protocol that tolerates arbitrary faults, including malicious faults such as intrusions. The goal is to show a novel way of designing intrusion-tolerant protocols based on a well-founded hybrid fault model. This model is based on a simple distributed security kernel, the TTCB, which is used by the processes only to execute securely critical steps of the protocol. Otherwise, the processes and their communication can be attacked in unlimited ways. The TTCB provides only a few basic services, which allow our protocol to tolerate a number of faults similar to accidental fault-tolerant protocols: for f faults, our protocol requires f + 2 processes, instead of 3f + 1 in typical intrusion-tolerant (or Byzantine) protocols. The protocol exhibits fast termination in the presence of intrusions and/or crash or malicious process failures, since it does not use any cryptography in runtime.

Download PDF   Bibtex Entry

Efficient Byzantine-Resilient Reliable Multicast on a Hybrid Failure Model
M. Correia and L. C. Lung and N. F. Neves and P. Veríssimo
21st IEEE Symposium on Reliable Distributed Systems. Suita, Japan, pages 2--11, October 2002

The paper presents a new reliable multicast protocol that tolerates arbitrary faults, including Byzantine faults. This protocol is developed using a novel way of designing secure protocols which is based on a well-founded hybrid failure model. Despite our claim of arbitrary failure resilience, the protocol need not necessarily incur the cost of "Byzantine agreement", in number of participants and round/message complexity. It can rely on the existence of a simple distributed security kernel -- the TTCB -- where the participants only execute crucial parts of the protocol operation, under the protection of a crash failure model. Otherwise, participants follow an arbitrary failure model.
The TTCB provides only a few basic services, which allow our protocol to have an efficiency similar to that of accidental fault-tolerant protocols: for f faults, our protocol requires f+2 processes, instead of 3f+1 in Byzantine systems. Besides, the TTCB (which is synchronous) allows secure operation of timed protocols, despite the unpredictable time behavior of the environment (possibly due to attacks on timing assumptions).
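As a rough illustration of the process counts quoted in the abstract above, the following sketch compares the two bounds for a few values of f. The function names are hypothetical and chosen only for this example; the formulas f+2 and 3f+1 are the ones stated in the abstract, and nothing here reproduces the protocol itself.

```python
def processes_with_ttcb(f: int) -> int:
    """Processes needed to tolerate f faults, using the f + 2 figure from the abstract."""
    if f < 0:
        raise ValueError("f must be non-negative")
    return f + 2


def processes_byzantine(f: int) -> int:
    """Processes needed to tolerate f faults, using the 3f + 1 figure from the abstract."""
    if f < 0:
        raise ValueError("f must be non-negative")
    return 3 * f + 1


if __name__ == "__main__":
    # Print both bounds side by side for a few fault counts.
    for f in range(1, 4):
        print(f"f={f}: f+2 -> {processes_with_ttcb(f)}, 3f+1 -> {processes_byzantine(f)}")
```

The two figures are close for a single fault, and the gap widens as f grows.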
 

Download Postscript   Bibtex Entry

Other papers

 

Measuring Distributed Durations with Stable Errors
António Casimiro, Pedro Martins, Paulo Veríssimo and Luis Rodrigues
Proceedings of the 22nd IEEE Real-Time Systems Symposium, London, UK, December 2001

The round-trip duration measurement technique is fundamental to solve many problems in asynchronous distributed systems. In essence, this technique provides the means for reading remote clocks with a known and bounded error. Therefore, it is used as a fundamental building block in several clock synchronization algorithms. In general, the technique can be used to implement duration measurement services, such as the one of the Timely Computing Base model. In this paper we propose a new technique to measure distributed durations that minimizes the measurement error and is able to keep this error almost stable. The new technique can be used to improve the precision of remote clock reading in certain situations. We provide a protocol that implements this new technique and we present some evaluation results. The results clearly show that our solution is indeed better than existing ones.
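For readers unfamiliar with the classic round-trip technique that the abstract above takes as its starting point, a minimal sketch follows. It shows the textbook midpoint estimate only, not the new technique proposed in the paper. The names `RemoteClockReading`, `read_remote_clock` and `min_delay` are assumptions made for this example, and clock drift during the exchange is ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RemoteClockReading:
    estimate: float   # estimated remote clock value at the local receive instant
    max_error: float  # bound on the absolute error of that estimate


def read_remote_clock(t_send: float, t_recv: float,
                      remote_timestamp: float,
                      min_delay: float = 0.0) -> RemoteClockReading:
    """Classic round-trip remote clock reading (hypothetical helper).

    t_send and t_recv are local clock values when the request was sent and
    the reply received; remote_timestamp is the remote clock value carried
    in the reply; min_delay is an assumed lower bound on one-way delay.
    """
    rtt = t_recv - t_send
    if rtt < 0:
        raise ValueError("reply cannot arrive before the request was sent")
    if min_delay < 0 or 2 * min_delay > rtt:
        raise ValueError("min_delay is inconsistent with the measured round trip")
    # The reply travelled for between min_delay and rtt - min_delay, so the
    # remote clock now lies in [T + min_delay, T + rtt - min_delay].
    # Taking the midpoint halves the width of that interval as the error bound.
    return RemoteClockReading(estimate=remote_timestamp + rtt / 2,
                              max_error=rtt / 2 - min_delay)
```

The error bound grows with the measured round-trip time, which is why long or variable round trips degrade the precision of remote clock reading.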

Download PDF Download Postscript Bibtex Entry


 

Event Timestamping Tool: a simple PC based kernel to timestamp distributed events
Pedro Martins and António Casimiro
Technical Report DI/FCUL TR-00-4, Department of Informatics, University of Lisboa, July 2000

This report describes the design and implementation of a tool to timestamp distributed events, using a standard PC hardware platform. The Event Timestamping Tool (ETT) is a small software kernel that detects externally generated events using two probe sources, and stores the respective timestamps with known precision bounds. A specialized kernel solution minimizes the response time for an event detection and registration and, consequently, maximizes the precision of the tool. Our approach exploits the Pentium µprocessor internal timestamp counter to provide timestamps with fine granularity.

Download PDF Download Postscript Bibtex Entry


Intrusion-Tolerant Architectures: Concepts and Design
Paulo Veríssimo and Nuno F. Neves and Miguel Correia
Architecting Dependable Systems, pp. 3-36, Springer-Verlag LNCS 2677, 2003

There is a significant body of research on distributed computing architectures, methodologies and algorithms, both in the fields of fault tolerance and security. Whilst they have taken separate paths until recently, the problems to be solved are of similar nature. In classical dependability, fault tolerance has been the workhorse of many solutions. Classical security-related work has on the other hand privileged, with few exceptions, intrusion prevention. Intrusion tolerance (IT) is a new approach that has slowly emerged during the past decade, and gained impressive momentum recently. Instead of trying to prevent every single intrusion, these are allowed, but tolerated: the system triggers mechanisms that prevent the intrusion from generating a system security failure. The paper describes the fundamental concepts behind IT, tracing their connection with classical fault tolerance and security. We discuss the main strategies and mechanisms for architecting IT systems, and report on recent advances on distributed IT system architectures.

 

Download PDF   Bibtex Entry

 

 
 

(C) 1999-2002, Navigators
Webmaster contact ttcb-webmaster@di.fc.ul.pt