Προχωρημένα Θέματα σε Κατανεμημένα Συστήματα. Distributed Hash Tables

Σχετικά έγγραφα
Physical DB Design. B-Trees Index files can become quite large for large main files Indices on index files are possible.

Other Test Constructions: Likelihood Ratio & Bayes Tests

Δίκτυα Επικοινωνιών ΙΙ: OSPF Configuration

The Simply Typed Lambda Calculus

Block Ciphers Modes. Ramki Thurimella

ΚΥΠΡΙΑΚΗ ΕΤΑΙΡΕΙΑ ΠΛΗΡΟΦΟΡΙΚΗΣ CYPRUS COMPUTER SOCIETY ΠΑΓΚΥΠΡΙΟΣ ΜΑΘΗΤΙΚΟΣ ΔΙΑΓΩΝΙΣΜΟΣ ΠΛΗΡΟΦΟΡΙΚΗΣ 19/5/2007

2 Composition. Invertible Mappings

HOMEWORK 4 = G. In order to plot the stress versus the stretch we define a normalized stretch:

ST5224: Advanced Statistical Theory II

Στο εστιατόριο «ToDokimasesPrinToBgaleisStonKosmo?» έξω από τους δακτυλίους του Κρόνου, οι παραγγελίες γίνονται ηλεκτρονικά.

EE512: Error Control Coding

Econ 2110: Fall 2008 Suggested Solutions to Problem Set 8 questions or comments to Dan Fetter 1

Math 6 SL Probability Distributions Practice Test Mark Scheme

DISTRIBUTED CACHE TABLE: EFFICIENT QUERY-DRIVEN PROCESSING OF MULTI-TERM QUERIES IN P2P NETWORKS

ΚΥΠΡΙΑΚΗ ΕΤΑΙΡΕΙΑ ΠΛΗΡΟΦΟΡΙΚΗΣ CYPRUS COMPUTER SOCIETY ΠΑΓΚΥΠΡΙΟΣ ΜΑΘΗΤΙΚΟΣ ΔΙΑΓΩΝΙΣΜΟΣ ΠΛΗΡΟΦΟΡΙΚΗΣ 6/5/2006

The challenges of non-stable predicates

Phys460.nb Solution for the t-dependent Schrodinger s equation How did we find the solution? (not required)

Matrices and Determinants

Συστήματα Διαχείρισης Βάσεων Δεδομένων

Section 8.3 Trigonometric Equations

DESIGN OF MACHINERY SOLUTION MANUAL h in h 4 0.

ΚΥΠΡΙΑΚΟΣ ΣΥΝΔΕΣΜΟΣ ΠΛΗΡΟΦΟΡΙΚΗΣ CYPRUS COMPUTER SOCIETY 21 ος ΠΑΓΚΥΠΡΙΟΣ ΜΑΘΗΤΙΚΟΣ ΔΙΑΓΩΝΙΣΜΟΣ ΠΛΗΡΟΦΟΡΙΚΗΣ Δεύτερος Γύρος - 30 Μαρτίου 2011

Εργαστήριο Ανάπτυξης Εφαρμογών Βάσεων Δεδομένων. Εξάμηνο 7 ο

Code Breaker. TEACHER s NOTES

Μηχανική Μάθηση Hypothesis Testing

Instruction Execution Times

EPL 603 TOPICS IN SOFTWARE ENGINEERING. Lab 5: Component Adaptation Environment (COPE)

derivation of the Laplacian from rectangular to spherical coordinates

Partial Differential Equations in Biology The boundary element method. March 26, 2013

TMA4115 Matematikk 3

CHAPTER 25 SOLVING EQUATIONS BY ITERATIVE METHODS

A Method for Creating Shortcut Links by Considering Popularity of Contents in Structured P2P Networks

ω ω ω ω ω ω+2 ω ω+2 + ω ω ω ω+2 + ω ω+1 ω ω+2 2 ω ω ω ω ω ω ω ω+1 ω ω2 ω ω2 + ω ω ω2 + ω ω ω ω2 + ω ω+1 ω ω2 + ω ω+1 + ω ω ω ω2 + ω

Statistical Inference I Locally most powerful tests

Assalamu `alaikum wr. wb.

Εργαστήριο Ανάπτυξης Εφαρμογών Βάσεων Δεδομένων. Εξάμηνο 7 ο

Fractional Colorings and Zykov Products of graphs

C.S. 430 Assignment 6, Sample Solutions

department listing department name αχχουντσ ϕανε βαλικτ δδσϕηασδδη σδηφγ ασκϕηλκ τεχηνιχαλ αλαν ϕουν διξ τεχηνιχαλ ϕοην µαριανι

Calculating the propagation delay of coaxial cable

Πανεπιστήμιο Κρήτης, Τμήμα Επιστήμης Υπολογιστών. Δημήτρης Πλεξουσάκης. Physical DB Design

Lecture 2. Soundness and completeness of propositional logic

PARTIAL NOTES for 6.1 Trigonometric Identities

Nowhere-zero flows Let be a digraph, Abelian group. A Γ-circulation in is a mapping : such that, where, and : tail in X, head in

3.4 SUM AND DIFFERENCE FORMULAS. NOTE: cos(α+β) cos α + cos β cos(α-β) cos α -cos β

the total number of electrons passing through the lamp.

Approximation of distance between locations on earth given by latitude and longitude

ΕΛΛΗΝΙΚΗ ΔΗΜΟΚΡΑΤΙΑ ΠΑΝΕΠΙΣΤΗΜΙΟ ΚΡΗΤΗΣ. Ψηφιακή Οικονομία. Διάλεξη 7η: Consumer Behavior Mαρίνα Μπιτσάκη Τμήμα Επιστήμης Υπολογιστών

ΚΥΠΡΙΑΚΗ ΕΤΑΙΡΕΙΑ ΠΛΗΡΟΦΟΡΙΚΗΣ CYPRUS COMPUTER SOCIETY ΠΑΓΚΥΠΡΙΟΣ ΜΑΘΗΤΙΚΟΣ ΔΙΑΓΩΝΙΣΜΟΣ ΠΛΗΡΟΦΟΡΙΚΗΣ 24/3/2007

Ευρετήρια. Βάσεις Δεδομένων. Διδάσκων: Μαρία Χαλκίδη

Bayesian statistics. DS GA 1002 Probability and Statistics for Data Science.

Capacitors - Capacitance, Charge and Potential Difference

CYTA Cloud Server Set Up Instructions

Galatia SIL Keyboard Information

5.4 The Poisson Distribution.

HOMEWORK#1. t E(x) = 1 λ = (b) Find the median lifetime of a randomly selected light bulb. Answer:

CE 530 Molecular Simulation

Υλοποίηση Δικτυακών Υποδομών και Υπηρεσιών: OSPF Cost

Homework 3 Solutions

Example Sheet 3 Solutions

ΠΑΝΕΠΙΣΤΗΜΙΟ ΠΑΤΡΩΝ ΠΟΛΥΤΕΧΝΙΚΗ ΣΧΟΛΗ ΤΜΗΜΑ ΜΗΧΑΝΙΚΩΝ Η/Υ & ΠΛΗΡΟΦΟΡΙΚΗΣ. του Γεράσιμου Τουλιάτου ΑΜ: 697

Υλοποίηση Δικτυακών Υποδομών και Υπηρεσιών: IOS Routing Configuration

Potential Dividers. 46 minutes. 46 marks. Page 1 of 11

Mean bond enthalpy Standard enthalpy of formation Bond N H N N N N H O O O

ΠΑΝΕΠΙΣΤΗΜΙΟ ΠΕΙΡΑΙΩΣ ΤΜΗΜΑ ΠΛΗΡΟΦΟΡΙΚΗΣ ΠΜΣ «ΠΡΟΗΓΜΕΝΑ ΣΥΣΤΗΜΑΤΑ ΠΛΗΡΟΦΟΡΙΚΗΣ» ΚΑΤΕΥΘΥΝΣΗ «ΕΥΦΥΕΙΣ ΤΕΧΝΟΛΟΓΙΕΣ ΕΠΙΚΟΙΝΩΝΙΑΣ ΑΝΘΡΩΠΟΥ - ΥΠΟΛΟΓΙΣΤΗ»

On a four-dimensional hyperbolic manifold with finite volume

Overview. Transition Semantics. Configurations and the transition relation. Executions and computation

Numerical Analysis FMN011

Practice Exam 2. Conceptual Questions. 1. State a Basic identity and then verify it. (a) Identity: Solution: One identity is csc(θ) = 1

ΤΕΧΝΟΛΟΓΙΚΟ ΠΑΝΕΠΙΣΤΗΜΙΟ ΚΥΠΡΟΥ ΤΜΗΜΑ ΝΟΣΗΛΕΥΤΙΚΗΣ

ΕΛΛΗΝΙΚΗ ΔΗΜΟΚΡΑΤΙΑ ΠΑΝΕΠΙΣΤΗΜΙΟ ΚΡΗΤΗΣ. Ψηφιακή Οικονομία. Διάλεξη 10η: Basics of Game Theory part 2 Mαρίνα Μπιτσάκη Τμήμα Επιστήμης Υπολογιστών

CHAPTER 48 APPLICATIONS OF MATRICES AND DETERMINANTS

Reminders: linear functions

ΕΙΣΑΓΩΓΗ ΣΤΗ ΣΤΑΤΙΣΤΙΚΗ ΑΝΑΛΥΣΗ

(C) 2010 Pearson Education, Inc. All rights reserved.

Inverse trigonometric functions & General Solution of Trigonometric Equations

Forced Pendulum Numerical approach

Πανεπιστήμιο Δυτικής Μακεδονίας. Τμήμα Μηχανικών Πληροφορικής & Τηλεπικοινωνιών. Ηλεκτρονική Υγεία

Απόκριση σε Μοναδιαία Ωστική Δύναμη (Unit Impulse) Απόκριση σε Δυνάμεις Αυθαίρετα Μεταβαλλόμενες με το Χρόνο. Απόστολος Σ.

Modbus basic setup notes for IO-Link AL1xxx Master Block

Pg The perimeter is P = 3x The area of a triangle is. where b is the base, h is the height. In our case b = x, then the area is

Correction Table for an Alcoholometer Calibrated at 20 o C

1. Αφετηρία από στάση χωρίς κριτή (self start όπου πινακίδα εκκίνησης) 5 λεπτά µετά την αφετηρία σας από το TC1B KALO LIVADI OUT

Bounding Nonsplitting Enumeration Degrees

Problem Set 3: Solutions

Introduction to the TCP IP protocol stack through a role playing game

Αρχεία και Βάσεις Δεδομένων

HY335Α Δίκτυα Υπολογιστών Xειμερινό Εξάμηνο Πανεπιστήμιο Κρήτης, Τμήμα Επιστήμης Υπολογιστών. Routing Algorithms. Network Layer.

Πέτσιος Στέφανος Κων/νος Α.Μ. #47. Οι απαντήσεις του paper:

Partial Trace and Partial Transpose

Web 論 文. Performance Evaluation and Renewal of Department s Official Web Site. Akira TAKAHASHI and Kenji KAMIMURA

Πανεπιστήμιο Κρήτης, Τμήμα Επιστήμης Υπολογιστών Άνοιξη HΥ463 - Συστήματα Ανάκτησης Πληροφοριών Information Retrieval (IR) Systems

Models for Probabilistic Programs with an Adversary

Ρύθμιση σε whitelist

Special edition of the Technical Chamber of Greece on Video Conference Services on the Internet, 2000 NUTWBCAM

Concrete Mathematics Exercises from 30 September 2016

2. THEORY OF EQUATIONS. PREVIOUS EAMCET Bits.

Finite Field Problems: Solutions

6.1. Dirac Equation. Hamiltonian. Dirac Eq.

Transcript:

Προχωρημένα Θέματα σε Κατανεμημένα Συστήματα Distributed Hash Tables

Today s Agenda What are DHTs? Why are they useful? What makes a good DHT design Case studies Chord Pastry Based on slides by Pascal Felber 2

P2P challenge: Locating content I have it Who has this paper? I have it Simple strategy: flood (e.g., expanding ring) until content is found If R of N nodes have a replica, the expected search cost is at least N/R, i.e., O(N) Need many replicas to keep overhead small Other strategy: centralized index (Napster) Single point of failure, high load Goal: Decentralize the index! 3

Indexed Search Idea Store particular content on particular nodes alternatively: pointers to content When a node wants this content, go to the node that is supposed to hold it (or knows where it is) Challenges Avoid bottlenecks: Distribute the responsibilities evenly among the existing nodes Self-organization w.r.t. nodes joining or leaving (or failing) Give responsibilities to joining nodes Redistribute responsibilities from leaving nodes Fault-tolerance and robustness Operate correctly also under failures 4

Idea: Hash Tables In a classic Hash Table: Table has N buckets Each data item has a key Key is hashed to find bucket in hash table Each bucket is expected to hold 1/N of the items, so storage is balanced insert (key, data) lookup (key) data key pos hash function Beattles h(key)%n 2 hash table 0 1 2 3... N-1 x y z hash bucket In a Distributed Hash Table (DHT), nodes are the buckets Network has N nodes Each data item has a key Key is hashed to find peer responsible for it Each node is expected to hold 1/N of the items, so storage is balanced Additional requirement: Also balance routing load!! insert (key, data) lookup (key) data key pos hash function Beattles h(key)%n 2 0 1 2 N-1... node 5

DHTs: Problems Problem 1 (dynamicity): adding or removing nodes With hash mod N, virtually every key will change its location! h(k) mod m h(k) mod (m+1) h(k) mod (m-1) Solution: use consistent hashing Define a fixed hash space All hash values fall within that space and do not depend on the number of peers (hash bucket) Peers have a hash value (ID) Data items have hash values (KEY) Each data items goes to peer with ID closest to its KEY 6

DHT Hashing Based on consistent hashing (designed for Web caching) Each server is identified by an ID uniformly distributed in range [0, 1] Each web page s URL is (via some hash function) associated with an ID which is uniformly distributed in [0, 1] A page is stored to the closest server in the ID space A client hashes the desired URL, retrieves page from appropriate server Good load balancing: each server covers roughly equal intervals and stores roughly the same number of pages Adding or removing a server invalidates few keys 0.3 0.4 0.5 0 1 server Web page 1. hash (URL) page ID apple.com 0.4 2. request page from closest server client 7

DHTs: Problems (cont d) Problem 2 (size): all nodes must be known to insert or lookup data Works with small and static server populations Solution: each peer has only a few neighbors Messages are routed through neighbors via multiple hops (overlay routing) 8

What Makes a Good DHT Design? Small diameter Should be able to route to any node in a few hops Different DHTs differ fundamentally only in the routing approach Load sharing DHT routing mechanisms should be decentralized no single point of failure no bottleneck Small degree The number of neighbors for each node should remain reasonable Low stretch To achieve good performance, minimize ratio of DHT vs. IP latency Should gracefully handle nodes joining and leaving Reorganize the neighbor sets Bootstrap mechanisms to connect new nodes into the DHT Repartition the affected keys over existing nodes 10

DHT Interface Minimal interface (data-centric) Lookup(key) IP address Generality: Supports a wide range of applications Keys have no semantic meaning Values are application dependent DHTs do not store the data Data storage can be built on top of DHTs Lookup(key) data Insert(key, data) 11

Application spectrum DHTs support many applications: Network storage *CFS, OceanStore, PAST, + Web cache *Squirrel, + E-mail [e-post, + Query and indexing *Kademlia, + Event notification [Scribe] Application-layer multicast *SplitStream, + Naming systems *ChordDNS, INS, +... 12

DHTs in Context User Application CFS DHash Chord TCP/IP store_file load_file File System store_block load_block Reliable Block Storage lookup DHT send receive Transport Retrieve and store files Map files to blocks Storage Replication Caching Lookup Routing Communication 13

DHT Case Studies Case Studies Chord Pastry Questions How is the hash space divided evenly among nodes? How do we locate a node? How do we maintain routing tables? How do we cope with (rapid) changes in membership? 15

Προτωρημένα Θέματα σε Κατανεμημένα Σσστήματα CHORD (MIT)

Chord (MIT) Circular m-bit ID space for both keys and node IDs m=6 Node ID = SHA-1(IP address) 2 m -1 0 N1 Key ID = SHA-1(key) Each key is mapped to its successor node Node whose ID is equal to or follows the key ID N48 N51 N56 K54 N8 K10 N14 Key distribution Each node responsible for O(K/N) keys O(K/N) keys move when a node joins or leaves N42 N38 K38 N32 K30 K24 N21 17

Basic Chord: State and Lookup m=6 Each node knows only two other nodes on the ring: 2 m -1 0 N1 Successor Predecessor (for ring management) N56 K54 lookup(k54) N8 Lookup is achieved by forwarding requests around the ring through successor pointers N48 N51 N14 Requires O(N) hops N42 N21 N38 N32 18

Basic Chord: State and Lookup 19

Complete Chord Finger table Each node knows these two nodes: Successor Predecessor (for ring management) m=6 2 m -1 0 N1 N8+1 N8+2 N8+4 N8+8 N8+16 N8+32 N14 N14 N14 N21 N32 N42 But also: Each node has m fingers n.finger(i) points to node on or after 2 i steps ahead n.finger(0) == n.successor N48 N51 N56 K54 lookup(k54) +16 +8 N8 +1 +2 +4 N14 O(log N) state per node +32 Lookup is achieved by following longest preceding fingers, then the successor N42 N38 N21 O(log N) hops N32 20

Complete Chord Finger table N8+1 N8+2 N8+4 N8+8 N8+16 N8+32 N14 N14 N14 N21 N32 N42 21

Chord Ring Management For correctness, Chord needs to maintain the following invariants Successors are correctly maintained For every key k, succ(k) is responsible for k Fingers are for efficiency, not necessarily correctness! One can always default to successor-based lookup Finger table can be updated lazily 22

Joining the Ring Three step process: 1. Outgoing links Initialize predecessor and all fingers of new node 2. Incoming links Update predecessors and fingers of existing nodes 3. Transfer some keys to the new node 23

Joining the Ring Step 1 Initialize the new node finger table Locate any node n in the ring Ask n to lookup the peers at j+2 0, j+2 1, j+2 2 Use results to populate finger table of j 24

Joining the Ring Step 2 Updating fingers of existing nodes New node j calls update function on existing nodes that must point to j Nodes in the ranges [j-2 i, pred(j)-2 i +1] N56 m=6 2 m -1 0 N1 5 N8+1 N8+2 N8+4 N8+8 N8+16 N8+32 N8 N14 N14 N14 N21 N32 N28 N42 O(log N) nodes need to be updated N51 12 N48 +16 N14-16 N42-16 N21 N38 N32 N28 25

Joining the Ring Step 3 Transfer key responsibility Connect to successor Copy keys from successor to new node Update successor pointer and remove keys N21 N21 N21 N21 N32 N32 N28 N32 N28 N32 N28 K24 K30 K24 K30 K24 K30 K24 K30 K24 26

Leaving the Ring (or Failing) Node departures are treated as node failures m=6 2 m -1 0 N1 Failure of nodes might cause incorrect lookup N8 doesn t know correct successor, so lookup of K19 fails N51 N56 lookup(k19)? N8 +1 +2 +4 Solution: successor list Each node n knows r immediate successors After failure, n contacts first alive successor and updates successor list Correct successors guarantee correct lookups N48 N42 N38 N32 +16 +8 N14 N18 K19 N21 27

Stabilization Case 1: finger tables are reasonably fresh Case 2: successor pointers are correct, not fingers Case 3: successor pointers are inaccurate or key migration is incomplete MUST BE AVOIDED! Stabilization algorithm periodically verifies and refreshes node pointers (including fingers) Eventually stabilizes the system when no node joins or fails N21 N21 N32 N28 n x = n.succ.pred if x in (n, n.succ) n.succ = x N32 n N28 x = n.pred.succ if x in (n.pred, n) n.pred = x 29

Chord and Network Topology Nodes numerically-close are not topologically-close (1M nodes = 10+ hops) 31

Cost of Lookup Cost is O(log N), constant is 0.5 32

Chord Discussion Search types Only equality Scalability Diameter (search and update) in O(log N) w.h.p. Degree in O(log N) Construction: O(log 2 N) if a new node joins Robustness Replication might be used by storing replicas at successor nodes 35

Προτωρημένα Θέματα σε Κατανεμημένα Σσστήματα PASTRY (MSR + Rice)

Pastry Circular m-bit ID space for both keys and nodes Addresses in base 2 b with m/b digits Address: m bits Digit: b bits ==> Address: m/b digits m=8 b=2 N3200 K3122 N3033 2 m -1 0 N0002 N0201 K0220 Node ID = SHA-1(IP address) N3001 N0322 Key ID = SHA-1(key) A key is mapped to the node whose ID is numerically-closest to the key ID N2222 N2120 K2120 N2001 K1320 K1201 N1113 37

Pastry Routing m=8 b=2 2 m -1 0 N0002 N3200 N0201 N0202 N3033 N3001 lookup(k0202) N0221 N0322 N2221 N1113 N2120 N2001 38

Pastry Routing m=8 b=2 2 m -1 0 N0002 N3200 N0201 N0202 N3033 N3001 lookup(k0202) N0221 N0322 N2221 N1113 N2120 N2001 39

Pastry Routing m=8 b=2 2 m -1 0 N0002 N3200 N0201 N0202 N3033 N3001 lookup(k0202) N0221 N0322 N2221 N1113 N2120 N2001 40

Pastry Routing m=8 b=2 2 m -1 0 N0002 N3200 N0201 N0202 N3033 N3001 lookup(k0202) N0221 N0322 N2221 N1113 N2120 N2001 41

Pastry Routing m=8 b=2 2 m -1 0 N0002 N3200 N0201 N0202 N3033 N3001 lookup(k0202) N0221 N0322 N2221 N1113 N2120 N2001 O(logN) hops 42

Pastry Routing O(logN) hops m=8 b=2 2 m -1 0 N0002 Route to 0202: 2221 0002 0221 0201 0202 N3200 N0201 N0202 If chain not complete, forward to numerically closest neighbor (successor) 2221 0002 0221 0210 0201 0202 N3033 N3001 lookup(k0202) N0221 N0322 N2221 N1113 N2120 N2001 43

Pastry State and Lookup For each prefix, a node knows some other node (if any) with same prefix and different next digit m=8 b=2 2 m -1 0 N0002 N0122 Routing table For instance, N0201: N-: N1???, N2???, N3??? N0: N00??, N01??, N03?? N02: N021?, N022?, N023? N020: N0200, N0202, N0203 N3033 N3001 N3200 N0201 N0212 N0221 N0233 N0322 When multiple nodes, choose topologically-closest Maintain good locality properties (more on that later) N2222 N2120 N1301 N1113 N2001 44

m/b rows Προτωρημένα Θέματα σε Κατανεμημένα Σσστήματα 4/3/2016 A Pastry Routing Table m=16 Contains the L nodes that are numerically closest to local node MUST BE UP TO DATE Contains the nodes that are closest to local node according to proximity metric b=2 Leaf set Routing Table Neighborhood set Node ID 10233102 < SMALLER LARGER > 10233033 10233021 10233120 10233122 10233001 10233000 10233230 10233232 02212102 1 22301203 31203203 0 11301233 12230203 13021022 10031203 10132102 2 10323302 10200230 10211302 10222302 3 10230322 10231000 10232121 3 10233001 1 10233232 0 10233120 2 2 b -1 entries per row 13021022 10200230 11301233 31301233 02212102 22301203 31203203 33213321 b=2, so node ID is base 4 (16 bits) Entries in the m th column have m as next digit n th digit of current node Entries in the n th row share the first n digits with current node [ common-prefix next-digit rest ] Entries with no suitable node ID are left empty 45

Pastry Lookup (Detailed) The routing procedure is executed whenever a message arrives at a node 1. IF (key in Leaf Set) 1. If key is in leaf set, destination is 1 hop away, forward directly to destination. 2. ELSE IF (key in Routing Table) 1. Forward to node that matches one more digit 3. ELSE 1. Forward to a node numerically closer, from Leaf Set The procedure always converges! 46

Joining X joins X knows A (A is close to X) X 0629 0629 s routing table Join message A 5324 Route message to node numerically closest to X s ID D s leaf set A???? B 0??? B 0748 C 06?? D 062? A s neighborhood set C 0605 D 0620 48

Locality Desired proximity invariant: all routing table entries refer to a node that is near the present node, according to proximity metric, among all live nodes with prefix appropriate for the entry Assumptions Scalar proximity metric (e.g., ping delay, # IP hops) A node can probe distance to any other node We assume this property holds prior to node X joining the system Show how we can maintain the property after X joins the system 49

Pastry and Network Topology Expected node distance increases with row number in routing table Smaller and smaller numerical jumps Bigger and bigger topological jumps 52

Node Departure To replace a failed node in the leaf set, the node contacts the live node with the largest index on the side of the failed node, and asks for its leaf set To repair a failed routing table entry R d l, node contacts first the node referred to by another entry R i l, id of the same row, and asks for that node s entry for R d l If a member in the Neighbors table is not responding, the node asks other members for their Neighbors table, check the distance of each of the newly discovered nodes, and updates its own Neighbors table 53

Average # Routing Hops 2 nodes selected randomly; a message is routed between Maximum route length is log 2 b N (5 for N=100,000) 54

Probability vs. # Routing Hops 55

# Routing Hops vs. Node Failures 57

Pastry Discussion Search types Only equality Scalability Search O(log 2 b N) w.h.p. Robustness Can route around failures via nodes of leaf set Replication might be used 58

Conclusion DHTs are a simple, yet powerful abstraction Building block of many distributed services (file systems, application-layer multicast, distributed caches, etc.) Many DHT designs, with various pros and cons Balance between state (degree), speed of lookup (diameter), and ease of management System must support rapid changes in membership Dealing with joins/leaves/failures is not trivial Dynamics of P2P networks are difficult to analyze Many open issues worth exploring 59