COPE: Secure and Reliable Parallel Processing
- Research Line(s): Fault and Intrusion Tolerance in Open Distributed Systems (FIT)
- Sponsor: FCT
- Project Number: POSI/CHS/39815/2001
- Total award amount: 35K Euros
- Coordinator: FCUL
- Partners: FCUL
- Start Date: Apr. 2002
- Duration: 32 months
- Keywords: Distributed systems, Fault tolerance, Parallel Computing, Security
- Team at FCUL: 4 researchers, including Nuno Ferreira Neves, Paulo Verissimo, Miguel Correia
Over the last ten years, the area of parallel systems and applications has evolved considerably. With the end of the Cold War, the funds available for building specialized supercomputers were substantially reduced. These systems delivered superior floating-point performance, but they cost several orders of magnitude more than existing mainframes.
Users, however, have continued to show interest in this kind of system. In fact, the number of applications that need significant computational capacity has increased. In many cases parallel processing is the only solution; otherwise, results would take too long to compute and would lose their usefulness. These applications come from diverse areas of knowledge, such as the medical sciences (including genetic engineering), financial modeling, and robotics.
Today there is a more pragmatic attitude toward the development of support systems for parallel applications. The most common hardware architecture consists of a group of workstations or PCs interconnected by a high-performance network. Applications can be programmed on a message-passing platform with a standard interface, such as the Message Passing Interface (MPI). The existing platforms, however, still have a number of restrictions. For instance, they impose an interaction model in which processes may exchange messages only with processes of the same application, which prevents cooperation among parallel applications.
This project aims to contribute to solving three fundamental problems:
First, the project will investigate ways to extend existing interaction models so that more complex and dynamic applications can be executed. Processes should be able to communicate with other parallel applications and, ideally, with processes that are not supported by the same platform.
Second, the project will research new mechanisms for detecting failures and, where possible, recovering from them. The message-passing platform should at least be responsible for informing the application when communication is interrupted, and it should supply a set of techniques that facilitate the recovery of failed processes.
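As an illustration of the kind of notification described above, the following is a minimal sketch of a timeout-based failure detector that informs the application through a callback when a peer stops communicating. It is not the project's design: the class `HeartbeatMonitor`, its methods, and the use of heartbeats and timeouts are all assumptions made for this example.

```python
from typing import Callable, Dict, List, Set


class HeartbeatMonitor:
    """Hypothetical timeout-based failure detector (illustrative only)."""

    def __init__(self, timeout: float, on_failure: Callable[[str], None]) -> None:
        self.timeout = timeout            # seconds of silence tolerated
        self.on_failure = on_failure      # application callback
        self.last_seen: Dict[str, float] = {}
        self.suspected: Set[str] = set()

    def heartbeat(self, peer: str, now: float) -> None:
        """Record that a message from `peer` arrived at time `now`."""
        self.last_seen[peer] = now
        self.suspected.discard(peer)      # a suspected peer may come back

    def check(self, now: float) -> List[str]:
        """Notify the application about peers that have been silent too long."""
        newly_suspected = []
        for peer, seen in self.last_seen.items():
            if peer not in self.suspected and now - seen > self.timeout:
                self.suspected.add(peer)
                newly_suspected.append(peer)
                self.on_failure(peer)
        return newly_suspected


# Usage: the application registers a callback and the platform calls check()
# periodically. Times are passed explicitly to keep the sketch deterministic.
failed: List[str] = []
monitor = HeartbeatMonitor(timeout=2.0, on_failure=failed.append)
monitor.heartbeat("worker-1", now=0.0)
monitor.heartbeat("worker-2", now=0.0)
monitor.heartbeat("worker-1", now=2.5)
monitor.check(now=3.0)
```

A timeout can only suspect a failure, since a slow peer is indistinguishable from a crashed one; this is why the sketch lets a later heartbeat clear the suspicion.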
Third, the project will investigate mechanisms that increase the security of the overall system. Adopting a more generic interaction model exposes the platform and applications to malicious attacks. Authentication, for instance, is a service that should be available to applications.
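One common way to offer message authentication to applications is a keyed hash (HMAC) attached to each message. The sketch below assumes that two processes already share a secret key; the function names `seal` and `open_sealed` are hypothetical, and nothing here is taken from the project's actual mechanisms.

```python
import hashlib
import hmac

TAG_LEN = hashlib.sha256().digest_size  # 32 bytes for SHA-256


def seal(key: bytes, payload: bytes) -> bytes:
    """Prefix the payload with an HMAC-SHA256 tag computed under `key`."""
    tag = hmac.new(key, payload, hashlib.sha256).digest()
    return tag + payload


def open_sealed(key: bytes, message: bytes) -> bytes:
    """Return the payload if its tag verifies; raise ValueError otherwise."""
    if len(message) < TAG_LEN:
        raise ValueError("message too short to contain a tag")
    tag, payload = message[:TAG_LEN], message[TAG_LEN:]
    expected = hmac.new(key, payload, hashlib.sha256).digest()
    # compare_digest avoids leaking information through comparison timing
    if not hmac.compare_digest(tag, expected):
        raise ValueError("authentication failed")
    return payload
```

This authenticates the origin and integrity of a message but does not by itself prevent replay of old messages; a sequence number or nonce inside the payload would be needed for that.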
The project will also implement and evaluate a prototype message-passing system that incorporates the mechanisms developed.
- Miguel Correia, Nuno Ferreira Neves, Paulo Verissimo, Lau Cheuk Lung, “Low Complexity Byzantine-Resilient Consensus”, Distributed Computing, vol. 17, n. 3, pp. 237–249, March 2005. http://www.springerlink.com/index/10.1007/s00446-004-0110-7, Oct. 2005.
- Nuno Ferreira Neves, Miguel Correia, Paulo Verissimo, “Wormhole-Aware Byzantine Protocols”, in 2nd Bertinoro Workshop on Future Directions in Distributed Computing: Survivability - Obstacles and Solutions (FuDiCo: SOS), Bertinoro, Italy, Jun. 2004.
- Miguel Correia, Nuno Ferreira Neves, Lau Cheuk Lung, Paulo Verissimo, “Low Complexity Byzantine-Resilient Consensus”, Technical Report DI/FCUL TR-03-25, Department of Computer Science, University of Lisbon, August 2003.
- Paulo Verissimo, Nuno Ferreira Neves, Miguel Correia, “Intrusion-Tolerant Architectures: Concepts and Design”, in Architecting Dependable Systems, ser. LNCS. Springer-Verlag, Jun. 2003, vol. 2677, pp. 3–36. Extended version in http://hdl.handle.net/10455/2954
- Miguel Correia, Lau Cheuk Lung, Nuno Ferreira Neves, Paulo Verissimo, “A Simple Intrusion-Tolerant Reliable Multicast Protocol using the TTCB Model”, in Proceedings of the 21st Simpósio Brasileiro de Redes de Computadores, Natal, Brazil, May 2003.