|
|
Programme / Program
Exposé de Michel Raynal : The Information Structure of Consensus Michel Raynal,IRISA Transparents Publication complète en PostScript Publication complète en pdf
Exposé de Leslie Lamport : Part I: An introduction to state machines. Part II: Paxos. Part III: Byzantine Paxos. Leslie Lamport, Microsoft Bibliographie:
Exposé de Eli Gafni : Distributed Computing: A Glimmer of a Theory Eli Gafni,UCLA Transparents en format ppt
Exposé de Eric Goubault : Geometric methods for parallel and distributed system Eric Goubault,CEA Transparents en PostScript Transparents en pdf In this tutorial, I will explain how geometry can help solve some problems in fault-tolerant distributed computing, as originally developped by Maurice Herlihy, Nir Shavit, Sergio Rajsbaum and others. The main question is whether, on a given distributed architecture, and given a number of potential crashes of processors, one can "compute" some functions using a distributed algorithm, in a finite time (i.e. with a finite number of messages exchanged for instance). Not all functions on all architectures are "computable", and the obstructions are in fact geometrical in nature. We will exemplify this on easy examples, such as impossibility of consensus in the asynchronous model, and take advantage of these easy applications to introduce some simple algebraic topology.
Exposé de David Powell : Redundancy Management in an Automatic Subway Line: a Protocol based on the Timed Asynchronous Model David Powell, LAAS-CNRS Tranparents en format pdf Tranparents en format ps Bibliographie In modern automatic subway lines (such as Magaly, in Lyon, and Météor, in Paris), trains are controlled by a pool of processors distributed along the line and on board the trains. The processors are implemented using the "coded processor" technique, that ensure fail-safe operation despite physical faults and design faults affecting the processors. However, to ensure availability, the processors are organized in fault-tolerant duplex pairs. Such duplication can lead to catastrophic failure scenarios if redundancy switching occurs when the states of the processors are inconsistent. Since it is provably impossible to guarantee consistent updates of both processors in a pair in a system in which messages can be lost, it was necessary to devise a fail-safe redundancy management protocol. The protocol is based on the timed asynchronous model. It guarantees that redundancy switching can only occur when processor states are consistent. The protocol is now implemented in the Siemens Digisafe safety computer, that is being applied to the Carnesie Line in New York, which is currently being automated.
Exposé de Paulo Verissimo : Travelling through Wormholes or Reconciling Uncertainty and Predictability Paulo Veríssimo
We are faced today with the confluence of antagonistic aims, when designing and deploying distributed systems. On one hand, our applications have to achieve timeliness goals, dictated both by QoS expectations with regard to on-line services (e.g. time- bounded transactions), and by technical issues of real-time nature involved in the deployment of certain services (e.g., multimedia rendering). Likewise, services, despite their sometimes critical nature (not only money-critical, but also privacy- or even safety- critical), are more often deployed on-line or through open networks. On the other hand, the open and large-scale environments where applications and users execute and evolve exhibit a great deal of uncertainty. With the advent of wireless communication, ad-hoc networking, mobile and peer-to-peer computing, the architecture of systems is no longer a given. As such, it is hard to define fault and synchrony models of something we do not know exactly what it looks like. For example, the well known arbitrary failure assumption, general as it may seem, is only valid in the narrow context of a known component structure, and a known failure universe. In the previous paragraph, we essentially talked about uncertainty, the grand challenge faced by distributed system researchers and designers. When talking about uncertainty, 'impossibility' and 'probability' are words that come to mind. Literature has relevant examples on being pessimistic and accepting uncertainty, showing impossibility results, or producing solutions that are uncertain, albeit quantifiably uncertain. Other works have methodically studied what can be done when the system is incrementally less uncertain. Alternatively, other approaches are more optimistic, assuming that the system has periods of determinism, alternating with uncertainty, and try to identify and successfully explore those (sometimes scarce) periods, to perform useful tasks. Nevertheless, a designer does not make strong assumptions just for the sake of it. They are made because they provide guarantees (read: combinations of timeliness and reliability) not enjoyed by other alternatives, in essence, a degree of predictability about system attributes. Recently, we observe works which make increasingly strong assumptions about the environment, to get correspondingly higher guarantees. For example, arguing about the advantage of having perfect failure detectors. Such failure detectors, however, cannot be implemented on environments with uncertain synchrony, which have been the workhorse of all past work on failure detectors. Furthermore, if the objective is timeliness of the applications, those strong assumptions should extend to the very payload environment where applications run, said then to have real-time behaviour. On a soft side of the problem, designers of protocols and systems like Squid, Akamai, Inktomi, AOL, etc., confronted for example with the uncertainty of the Internet, have been providing ad hoc (but normally extremely effective) mechanisms, such as warm cache hierarchies or prioritary execution of special tasks, for performance, or dedicated channels (e.g., physical or overlay networks), both for performance and security. However, short of a systemic approach to the problem, guarantees are essentially statistical, and do not solve the predictability problem, for example: the latency of individual transactions, or multimedia frames, or the intrusion tolerance of a TTP server. We propose to discuss a novel design philosophy for distributed systems with uncertain or unknown attributes, such as synchrony, or security. This philosophy is based on the existence of architectural constructs--- Wormholes--- with privileged properties which endow systems with the capability of evading the uncertainty of the environment for certain crucial steps of their operation where predictability is required. It may open new research avenues allowing to reconcile uncertainty with predictability. The design methodology emerging from wormholes has a large potential for current and future distributed systems, and addresses generic attributes, such as synchrony, security or survivability. Since it is somewhat unconventional, we focus on synchrony, a well-known attribute, for a discussion of the philosophy and demonstration of its feasibility. Formally, synchrony requirements are timeliness specifications, in essence meaningful only in the synchronous system model. Under this model there are known bounds for timing variables, and the mechanisms to meet reliability and timeliness requirements are reasonably well understood, both in terms of distributed systems theory and in real-time systems design principles (e.g., real-time communication, scheduling and replication management). However, unpredictable and unreliable infrastructures are not adequate environments for synchronous models, since it is difficult to enforce timeliness assumptions. Violation of assumptions causes incorrect system behavior. In alternative, the asynchronous model is a well-studied framework, appropriate for these environments. 'Asynchronous' means, in order to be simple at this point, that there are no bounds on essential timing variables, such as processing speed or communication delay. This model has served a number of applications where uncertainty about the provision of service was accepted. In consequence, this status quo leaves us with a problem: fully asynchronous models do not satisfy our needs, because they do not allow timeliness specifications; on the other hand, correct operation under fully synchronous models is very difficult to achieve (if at all possible) in infrastructures with uncertain baseline timeliness properties. One issue of definitive importance is the following: what system model to use for applications with synchrony (i.e. real-time) requirements running on environments with uncertain timeliness? This question is replied by the Wormholes model, as we propose to show, by presenting one of its instantiations for enforcing timeliness or real-time behaviour, the Timely Computing Base (TCB), and reviewing alternative approaches.
Exposé de Gérard Le Lann : Distributed Real-Time Computing: Where Do We Stand Gerard Le Lann,INRIA Transparents en pdf Transparents en PostScript
Exposé de Dahlia Malhki : Distributed-Systems Security and Distributed System-Security Dahlia Malkhi, Université de Jerusalem Transparents en ppt Bibliographie: http://www.cs.huji.ac.il/~dalia/ Several security technqiues embrace the principle of increasing security by distribution of trust. Security is leveraged from reducing the trust in any single element of the system, so that services are provided through collaborative activity. This tutorial will present several technologies that realize this principle, including secret sharing schemes, commitment and verification, and pro-active mechanisms. These techniques together enable the ultimate secure distributed computation, called `secure function evaluation': Given n players with initial secret inputs x1..xn, they can compute securely and reliably together any function f(x1, ..., xn) without disclosing anything about their secrets. The tutorial will relate security techniques to fundamental issues in distributed computing. If time allows, I will get into some methods of distributed coordination in penetration-prone environments, and in particular, how to solve consensus using Byzantine quorum systems. No previous knowledge of cryptography is assumed, and almost no cryptography will be presented.
Exposé de Pierre Fraigniaud : Dynamic Networks for Peer-to-Peer Systemsholes Pierre Fraigniaud, LRI,Université d'Orsay Transparents en ppt A Distributed Hash Table (DHT) is distributed lookup table that can be used to implement peer-to-peer (P2P) systems. A DHT allows the discovery and location of data and/or resources, identified by keys, in a distributed network (e.g., Internet), in absence of centralized server or any hierarchical organization. Several DHTs have been recently described in the literature (e.g., Chord, Viceroy, etc.), and some of them have led to the development of experimental systems. We will first survey the main characteristics of these DHTs. Next, we will present a new DHT, called D2B for its relation with the de Bruijn graph. In term of performances, any join or leave of a user implies a constant expected number of link modifications, and, with high probability (w.h.p.), at most O(log n) link modifications, where n denotes the number of nodes currently in the system. The latency of a lookup routing is O(log n), w.h.p., in the sense that a key is at most O(log n) hops away from a consumer. A join involves key redistribution among two nodes (which is the minimum possible), including the node joining the system. Similarly, a leave involves key redistribution among at most three nodes. The set of keys is fairly distributed among nodes, in the sense that every node is responsible for an expected number of |K|/n keys, and, w.h.p., O(|K|log n/n) keys where K is the set of all keys managed by the network. The traffic load incurred by lookup routing is also fairly distributed, and the expected congestion of a node is O((\log n)/n). Finally, a parameter d allows a trade-off between the degree of the network, which increases linearly with d, and the diameter of the network, which decreases logarithmically with d. Hence, a large d allows faster lookup routing, at the price of a slight increase of the latency for joining and leaving the network. In particular, for degree O(\log n), lookups perform in time O(\log n/\log\log n). Variants of DHTs based on the de Bruijn topologies have been simultaneously independently discovered by I. Abraham, B. Awerbuch, Y. Azar, Y. Bartal, D. Malkhi, and E. Pavlov (to appear in the 17th Int. Parallel and Distributed Processing Symposium, IPDPS), F. Kaashoek and D. Karger (In 2nd Int. Workshop on Peer-to-Peer Systems (IPTPS), Feb. 2003), and M. Naor and U. Wieder (to appear in the 15th ACM Symp. on Parallelism in Algorithms and Architectures, SPAA). Simulations should eventually allow to discriminate these variants (including D2B), and to pick the most relevant ideas from each of them. For more details on D2B, we refer the audience to "The Content-Addressable Network D2B", Tech. Rep. 1349 LRI, Univ. Paris-Sud, France, Jan. 2003, available at http://www.lri.fr/~pierre/
Exposé de Joffroy Beauquier : Introduction à l'auto-stabilisation Joffroy Beauquier,LRI, Université d'Orsay Transparents en format ppt Le terme auto-stabilisation est apparu pour la première fois dans un article de Dijkstra en 1974. Pendant une dizaine d'années, cet article resta peu connu, jusqu'à ce que Lamport affirme, lors de la conférence inaugurale de l'un des PODC, qu'il s'agissait pour lui du meilleur papier de cet auteur. Depuis des centaines d'articles ont confirmé le fait qu'il n'est pas nécessaire d'être très rapide pour être en mesure d'exploiter une bonne idée. L'auto-stabilisation a un rapport avec la tolérance aux défaillances. Après des défaillances qui ont pu toucher un nombre arbitrairement grand de composants d'un système, on souhaite que ce système revienne de lui-même, c'est-à-dire sans réinitialisation ni intervention extérieure, à un comportement correct. Le but de ce cours est de définir la notion d'auto-stabilisation, de présenter un des algorithmes de Dijkstra ainsi que quelques propriétés de base. Une partie du matériel se trouve dans les chapitres consacrés à l'auto-stabilisation du livre de Gérard Tel, "Introduction to Distributed Algorithms". Le livre de Shlomi Dolev, "Self-stabilisation", contient également des éléments intéressants.
Exposé de Shay Kutten : Locality Issues in Self Stabilization Shay Kutten, Technion Transparents en PostScript Transparents en pdf Références bibliographiques
Exposé de Friedemann Mattern : Ubiquitous Computing Friedemann Mattern, Zurich Transparents en PostScript Transparents en pdf Bibliographie (accessible sur le site http://www.inf.ethz.ch/vs/publ/):
Résumé: The vision of Ubiquitous Computing is grounded in the believe that Moore's Law (i.e., the observation that the computer power available on a chip doubles every 18 months) will hold for at least another 10 years. This means that in the next few years processors, sensors, and other microelectronic devices will become so small and inexpensive that they can be embedded in almost everything, rendering everyday objects "smart". These smart objects may communicate by wireless means and form spontaneous networks, giving rise to a world-wide distributed system several orders of magnitude larger than today's Internet. The prospects of a world of networked sensors and smart things that virtually talk to each other are fascinating, leading to many new applications and opportunities. The lecture consists of four parts. In the first part we introduce Ubiquitous Computing and motivate why it is an important and interesting concept that will affect many areas of computer science. We also speculate about the future development and mention relevant technologies. The second part is on so-called smart identification techniques. Among other things we explain how RFIDs (i.e., "radio tags" or "smart labels") work. We also mention current and future applications of smart labels. In the third part we show how smart labels, wireless sensors, embedded processors, together with the Internet backend infrastructure, contribute to the integration of the physical world and the virtual world: Smart everyday objects enable many new applications and business ideas, and create unique opportunities and challenges. Living in a smart environment has interesting consequences, however - threats to privacy is one of the issues we briefly consider. In the last part we shall consider some research projects and discuss current research issues and opportunities.
Exposé de Francois Armand : Jaluna-2: A Multi-System Programming Environment Francois Armand, Jaluna,Paris Transparents en PostScript Transparents en pdf The Jaluna-2 system provides a dual operating system environment, enabling to run Linux applications and critical real-time applications simultaneously on the same processor. While Linux applications run on an almost un-modified Linux kernel, critical applications run on top of the C5 micro-kernel, previously known as Chorus. The C5 micro-kernel is a highly modular real-time micro-kernel supporting a rich set of features: various memory management models, real-time schedulers, synchronisation and various IPC mechanisms. Within such an architecture applications are in fact split in two parts: the critical side runs on C5 while non-critical side runs in the Linux environment taking benefit of all commodity software available. These two parts of applications may communicate by means of an Inter-System Communication mechanism, implemented as a virtual bus betwen the two systems. This communication mechanism is asynchronous, reliable and obviously bi-directionnal. The Jaluna-2 architecture is flexible and enables systems to be transparently deployed on the same processor or on different processors as long as they are inter-connected by a reliable bus such as a cPCI bus, for example. The Inter-System Communication mechanism hides the actual configuration of the system to the applications. When both systems run on the same processor, physical resources are dispatched between them: physical memory is split so that each system has its own piece of memory, devices may be dedicated to one system or an another. For example a network interface card may be attached to C5 or to Linux. A peer-device mechanism enables a device to be shared between both systems. This is used to transparently provide an IP stack within the C5 world while keeping the regular Linux IP stack. Jaluna-2 also enables to bring value to the Linux system without requiring deep modifications to its internals. A monitor agent running on the C5 micro-kernel permits to detect failures of the Linux world and restart it. Restart of Linux can be done from an image kept in memory, enabling to perform a fast restart of Linux after a failure. During such a scenario, C5 critical applications are kept running without being disturbed by the Linux failure and restart. In addition, C5 provides the possibility for Linux application to store information, such as checkpoint states in memory. Such memory is unaffected by a Linux failure and restart, this provides a convenient and efficient way for Linux applications to save checkpoints and transparently restart from these checkpoints.
Exposé de Pascal Felber : Algorithms for P2P Distributed Hash Tables Pascal Felber, Institut EURECOM, Sophia Antipolis, France Transparents en format ppt Peer-to-Peer Distributed Hash Tables (P2P DHTs) are structured overlay networks that provide efficient access to data and operate in a fully decentralized manner. This tutorial will introduce the basic principles of DHTs, survey some of the major architectures and algorithms for building scalable P2P DHTs, and evaluate them in terms of speed of lookup, complexity, and resilience to failures. |
|
|
|
|
|