L13 P2P Overlay Networks: Theory

Author: Lora Hardy
by T.S.R.K. Prasad
EA C451 Internetworking Technologies

References / Acknowledgements
Ch 6: Peer-to-Peer Content Networks, [Hoffman]
Sec 2.6: Peer-to-Peer Applications, [Kurose]
[Theotokis] Stephanos Androutsellis-Theotokis, A Survey of Peer-to-Peer File Sharing Technologies

EA C451 INET TECH

[Ross-P2P] Keith W. Ross and Dan Rubenstein, P2P Systems, Infocom Tutorial
[Kurose-P2P] Ch 2: Applications, [Kurose]
[Zhang-P2P] Prof. Zhi-Li Zhang, P2P, CSci5221: Foundations of Advanced Networking, Spring 2011. (http://www-users.cselabs.umn.edu/classes/Spring2011/csci5221/index.php)
[Rexford-P2P] Jennifer Rexford, Peer-to-Peer File Sharing, COS 461: Computer Networks (http://www.cs.princeton.edu/courses/archive/spr12/cos461/)
[Stoica-P2P] Ion Stoica, P2P Networks, EE122: Computer Networking, Dept of EECS, UCB, Fall 2002. (www-inst.eecs.berkeley.edu/~ee122/fa02)

Optional Readings
[Andersen] David G. Andersen, Hari Balakrishnan, M. Frans Kaashoek, and Robert Morris, Resilient Overlay Networks, SOSP 2001.
[Lua] Eng Keong Lua, Jon Crowcroft, Marcelo Pias, Ravi Sharma and Steven Lim, A Survey and Comparison of Peer-to-Peer Overlay Network Schemes, IEEE Communications Surveys and Tutorials, March 2004.
[Rodrigues] Rodrigo Rodrigues and Peter Druschel, Peer-to-Peer Systems

Presentation Overview (Lecture Outline)
• Introduction
• File Distribution Example
• Overlay Networks
• Unstructured P2P Networks
• Structured P2P Networks

Introduction

Client-Server Limitations
• Scalability is hard to achieve
• Presents a single point of failure
• Requires administration
• Unused resources at the network edge

P2P systems try to address these limitations

Definition of P2P
1) Significant autonomy from central servers
2) Exploits resources at the edges of the Internet
  – storage and content
  – CPU cycles
  – human presence
3) Resources at the edge have intermittent connectivity, being added & removed

It’s a broad definition:
• P2P file sharing
  – Napster, Gnutella, KaZaA, etc.
• P2P communication
  – Instant messaging
• P2P computation
  – seti@home
• DHTs & their apps
  – Chord, CAN, Pastry, Tapestry
• P2P apps built over emerging overlays
  – PlanetLab

Pure P2P architecture
• no always-on server
• arbitrary end systems directly communicate
• peers are intermittently connected and change IP addresses

Characteristics of P2P Networks
• Clients are also servers and routers
  – Nodes contribute content, storage, memory, CPU
• Nodes are autonomous (no administrative authority)
• Network is dynamic: nodes enter and leave the network “frequently”
• Nodes collaborate directly with each other (not through well-known servers)
• Nodes have widely varying capabilities

Benefits of P2P Networks
• Efficient use of resources
  – Unused bandwidth, storage, processing power at the edge of the network
• Scalability
  – Consumers of resources also donate resources
  – Aggregate resources grow naturally with utilization
• Reliability
  – Replicas
  – Geographic distribution
  – No single point of failure
• Ease of administration
  – Nodes self-organize
  – No need to deploy servers to satisfy demand (cf. scalability)
  – Built-in fault tolerance, replication, and load balancing

Key Issues in P2P Networks
• Join/leave
  – How do nodes join/leave? Who is allowed?
• Search and retrieval
  – How to find content?
  – How are metadata indexes built, stored, distributed?
• Content Distribution
  – Where is content stored? How is it downloaded and retrieved?

Four Key Primitives (APIs)
• Join
  – How to enter/leave the P2P system?
• Publish
  – How to advertise a file?
• Search
  – How to find a file?
• Fetch
  – How to download a file?

File Distribution Example

File Distribution: Client-Server vs Peer-to-Peer
Question: how much time does it take to distribute a file (size F) from one server to N peers?
  – peer upload/download capacity is a limited resource

Server Distributing a Large File
[Figure: a server holding an F-bit file, with upload rate $u_s$, connected through the Internet to peers with upload rates $u_i$ and download rates $d_i$]

Server Distributing a Large File
• Sending an F-bit file to N receivers
  – Transmitting NF bits at rate $u_s$
  – … takes at least $NF/u_s$ time
• Receiving the data at the slowest receiver
  – Slowest receiver has download rate $d_{\min} = \min_i\{d_i\}$
  – … takes at least $F/d_{\min}$ time
• Download time: $\max\{NF/u_s,\ F/d_{\min}\}$

Speeding Up the File Distribution
• Increase the server upload rate
  – Higher link bandwidth at the server
  – Multiple servers, each with their own link
• Alternative: have the receivers help
  – Receivers get a copy of the data
  – … and redistribute to other receivers
  – To reduce the burden on the server

Peers Help Distributing a Large File
[Figure: the server (F-bit file, upload rate $u_s$) and four peers with upload rates $u_1 \dots u_4$ and download rates $d_1 \dots d_4$, all connected through the Internet]

Peers Help Distributing a Large File
• Components of distribution latency
  – Server must send each bit: min time $F/u_s$
  – Slowest peer must receive each bit: min time $F/d_{\min}$
• Upload time using all upload resources
  – Total number of bits: NF
  – Total upload bandwidth: $u_s + \sum_i u_i$
• Total: $\max\{F/u_s,\ F/d_{\min},\ NF/(u_s + \sum_i u_i)\}$
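
The two lower bounds on distribution time can be evaluated directly. The following is a minimal sketch; the function names and the numeric values (file size, rates, number of peers) are illustrative assumptions, not figures from the slides.

```python
def client_server_time(F, N, u_s, d_min):
    """Lower bound for client-server distribution: max{NF/u_s, F/d_min}."""
    return max(N * F / u_s, F / d_min)


def p2p_time(F, u_s, d_min, peer_uploads):
    """Lower bound for P2P distribution:
    max{F/u_s, F/d_min, NF/(u_s + sum_i u_i)}, with N = number of peers."""
    N = len(peer_uploads)
    return max(F / u_s, F / d_min, N * F / (u_s + sum(peer_uploads)))


# Illustrative numbers only (assumed): an 8 Gbit file, a 100 Mbit/s server,
# 20 peers that each upload at 10 Mbit/s and download at 50 Mbit/s or more.
F = 8e9
u_s = 100e6
d_min = 50e6
uploads = [10e6] * 20

t_cs = client_server_time(F, len(uploads), u_s, d_min)
t_p2p = p2p_time(F, u_s, d_min, uploads)
print(f"client-server: {t_cs:.0f} s, P2P: {t_p2p:.0f} s")
```

With these assumed numbers the server upload term dominates the client-server bound, while in the P2P bound the peers' aggregate upload capacity shortens the third term.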

Peer-to-Peer is Self-Scaling
• Download time grows slowly with N
  – Client-server: $\max\{NF/u_s,\ F/d_{\min}\}$
  – Peer-to-peer: $\max\{F/u_s,\ F/d_{\min},\ NF/(u_s + \sum_i u_i)\}$
• But…
  – Peers may come and go
  – Peers need to find each other
  – Peers need to be willing to help each other

Client-server vs. P2P: example
Client upload rate = $u$, $F/u$ = 1 hour, $u_s = 10u$, $d_{\min} \ge u_s$
[Figure: minimum distribution time (axis 0 to 3.5) versus N (axis 0 to 35), one curve for P2P and one for client-server]
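
As a worked check of this example, the two bounds can be written in units of hours. This sketch assumes every peer uploads at the same rate $u$, so that $\sum_i u_i = Nu$; the symbols $D_{cs}$ and $D_{p2p}$ are introduced here only for illustration.

```latex
\begin{align*}
D_{cs}(N)  &= \max\left\{\frac{NF}{u_s},\ \frac{F}{d_{\min}}\right\}
            = \frac{NF}{10u} = \frac{N}{10}\ \text{hours}, \\
D_{p2p}(N) &= \max\left\{\frac{F}{u_s},\ \frac{F}{d_{\min}},\ \frac{NF}{u_s + Nu}\right\}
            = \max\left\{\frac{1}{10},\ \frac{N}{10 + N}\right\}\ \text{hours}.
\end{align*}
```

Because $d_{\min} \ge u_s$, the term $F/d_{\min}$ never exceeds $F/u_s = 1/10$ hour. At $N = 30$ this gives $D_{cs} = 3$ hours against $D_{p2p} = 30/40 = 0.75$ hours, and $D_{p2p}$ stays below 1 hour for every $N$.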

P2P file distribution: BitTorrent
• file divided into 256 KB chunks
• peers in torrent send/receive file chunks
• tracker: tracks peers participating in torrent
• torrent: group of peers exchanging chunks of a file
• Alice arrives …
  – … obtains list of peers from tracker
  – … and begins exchanging file chunks with peers in torrent

Overlay Networks
[Figure: peers joined by overlay edges]

Overlay graph
Virtual edge
• TCP connection
• or simply a pointer to an IP address
Overlay maintenance
• Periodically ping to make sure the neighbor is still alive
• Or verify liveness while messaging
• If a neighbor goes down, may want to establish a new edge
• New node needs to bootstrap
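
The maintenance bullets amount to tracking when each neighbor was last heard from and dropping the ones that stay silent. A minimal sketch follows; the class name `NeighborTable`, the timeout value and the addresses are hypothetical, and real overlays differ in how they probe and replace neighbors.

```python
import time


class NeighborTable:
    """Tracks overlay neighbors and when each one was last heard from."""

    def __init__(self, timeout):
        self.timeout = timeout   # seconds of silence before a neighbor is presumed dead
        self.last_seen = {}      # neighbor address -> time of last pong or message

    def heard_from(self, addr, now=None):
        """Record a ping reply or any other message from a neighbor."""
        self.last_seen[addr] = time.monotonic() if now is None else now

    def expire(self, now=None):
        """Drop silent neighbors and return them so new edges can be set up."""
        now = time.monotonic() if now is None else now
        dead = [a for a, t in self.last_seen.items() if now - t > self.timeout]
        for a in dead:
            del self.last_seen[a]
        return dead


# Explicit timestamps keep the example deterministic.
table = NeighborTable(timeout=30)
table.heard_from("10.0.0.1", now=0)
table.heard_from("10.0.0.2", now=20)
print(table.expire(now=40))   # only 10.0.0.1 has been silent for more than 30 s
```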

Overlays: all in the application layer
Tremendous design flexibility
  – Topology, maintenance
  – Message types
  – Protocol
  – Messaging over TCP or UDP
Underlying physical net is transparent to developer
  – But some overlays exploit proximity
[Figure: three protocol stacks (application, transport, network, data link, physical); the overlay is built in the application layer]

Overlay Networks – Normal View
[Figure: overlay network, normal view]

Overlay Networks – Focus at the application level
[Figure: overlay network, application-level view]

Examples of overlays
• DNS
• BGP routers and their peering relationships
• Content distribution networks (CDNs)
• Application-level multicast
  – economical way around barriers to IP multicast
• And P2P apps!

More about overlays
Unstructured overlays
• e.g., new node randomly chooses three existing nodes as neighbors
Structured overlays
• e.g., edges arranged in restrictive structure
Proximity
• Not necessarily taken into account

Unstructured P2P Networks
• Central Server / Directory (Napster)
• Flooding (Gnutella)
• Super Peers / Super Nodes (KaZaA)
• Swarming (BitTorrent)

Central Server / Directory: Join and Publish
• File list and IP address are uploaded to the centralized directory
[Figure: peers A, B, and C each perform step 1 (upload) toward the centralized directory]

Central Server / Directory: Search
(a) User requests a file search at the server
(b) Server responds with a list of hosts (IP addresses) containing the file
[Figure: request (2a) to and response (2b) from the centralized directory]

Central Server / Directory: Fetch
(a) User pings the sources that have the data (looks for the host with the best bandwidth)
(b) Downloads the file
[Figure: pings go to the candidate peers, then the file is downloaded directly from the chosen peer; the centralized directory is not involved in the transfer]

Pros and Cons of Central Server
Pros:
• Simple
• Search scope is O(1)
• Controllable (pro or con?)
Cons:
• Server maintains O(N) state
• Server does all processing
• Single point of failure
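
The join/publish and search steps of the central directory scheme reduce to maintaining one index on the server. Below is a minimal in-memory sketch; the class and method names are hypothetical and no networking is shown.

```python
from collections import defaultdict


class CentralDirectory:
    """Toy central index: file name -> set of peer addresses holding it."""

    def __init__(self):
        self.index = defaultdict(set)

    def join_and_publish(self, peer_addr, file_names):
        """A joining peer uploads its file list together with its address."""
        for name in file_names:
            self.index[name].add(peer_addr)

    def leave(self, peer_addr):
        """Remove a departing peer from every entry."""
        for peers in self.index.values():
            peers.discard(peer_addr)

    def search(self, file_name):
        """Return the peers that advertise the file; the fetch is peer to peer."""
        return sorted(self.index.get(file_name, ()))


directory = CentralDirectory()
directory.join_and_publish("10.0.0.1", ["xyz.mp3", "abc.mp3"])
directory.join_and_publish("10.0.0.2", ["xyz.mp3"])
print(directory.search("xyz.mp3"))
```

The single dictionary is what makes the search scope O(1) and, at the same time, what makes the server hold O(N) state and act as a single point of failure.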


Query Flooding
• Join
  – contact a few nodes to become neighbors
• Publish
  – no need!
• Search
  – ask neighbors, who ask their neighbors
• Fetch
  – get file directly from another node

Search by Flooding
[Figure sequence: a search for xyz.mp3 is flooded from neighbor to neighbor until it reaches a peer holding xyz.mp3; the file is then transferred directly to the requester]

Flooding: Pros and Cons
• Advantages
  – Fully decentralized
  – Search cost distributed
  – Processing per node permits powerful search semantics
• Disadvantages
  – Search scope is O(N)
  – Search time may be quite long
  – High overhead, and nodes come and go often
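
Query flooding can be sketched as a breadth-first traversal of the overlay graph. The hop limit (`ttl`) used to bound the flood is an assumption of this sketch, as are the function name and the five-node example topology.

```python
from collections import deque


def flood_search(neighbors, files, start, wanted, ttl):
    """Flood a query from `start`; return (peers holding the file, messages sent).

    neighbors: node -> list of overlay neighbors
    files:     node -> set of file names stored at that node
    ttl:       maximum number of overlay hops (an assumed way to bound the flood)
    """
    seen = {start}
    queue = deque([(start, ttl)])
    hits, messages = [], 0
    while queue:
        node, remaining = queue.popleft()
        if node != start and wanted in files.get(node, ()):
            hits.append(node)
        if remaining == 0:
            continue
        for nxt in neighbors[node]:
            messages += 1              # every forwarded copy costs one message
            if nxt not in seen:        # receivers drop queries they already saw
                seen.add(nxt)
                queue.append((nxt, remaining - 1))
    return hits, messages


overlay = {"A": ["B", "C"], "B": ["A", "D"], "C": ["A", "D"],
           "D": ["B", "C", "E"], "E": ["D"]}
stored = {"E": {"xyz.mp3"}}
print(flood_search(overlay, stored, "A", "xyz.mp3", ttl=2))
print(flood_search(overlay, stored, "A", "xyz.mp3", ttl=3))
```

In this toy topology the holder sits three hops away, so the query with the smaller hop limit misses it even though it already costs several messages, which illustrates the overhead and search-scope disadvantages listed above.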


Super Node Hierarchy
• Join
  – on start, the client contacts a super-node
• Publish
  – client sends list of files to its super-node
• Search
  – queries flooded among super-nodes
• Fetch
  – get file directly from one or more peers

Motivation for Super Nodes
• Query consolidation
  – Many connected nodes may have only a few files
  – Propagating a query to a sub-node may take more time than for the super-node to answer itself
• Stability
  – Super-node selection favors nodes with high uptime
  – How long you’ve been on is a good predictor of how long you’ll be around in the future

Super Node Architecture
• Each peer is either a super node or is assigned to a super node
• Each super node knows about many other super nodes (almost mesh overlay)
[Figure: ordinary peers attached to a mesh of supernodes]

Super Node Architecture: After Join and Publish
[Figure: hosts H1, H2, and H3 have published their file lists (abc.mp3, xyz.mp3) to their supernodes]

Super Node Architecture: Search – Query Flooding
[Figure: the query “xyz.mp3 ?” is flooded among the supernodes]

Super Node Architecture: Search – Query Responses
[Figure: the response reports xyz.mp3 available at H1, H2, H3]

Super Node Architecture: Search – Fetch
[Figure: the requester downloads xyz.mp3 directly from the hosts]
Parallel download of pieces of the file is possible

Pros and Cons of Super Nodes
Pros:
• Tries to take into account node heterogeneity:
  – Bandwidth
  – Host Computational Resources
  – Host Availability (?)
• Can take into account network locality
Cons:
• Mechanisms easy to circumvent
• Still no real guarantees on search scope or search time


Swarming
Swarming: Download from others who are downloading the same object at the same time
• Join
  – contact centralized “tracker” server, get a list of peers.
• Publish
  – Run a tracker server.
• Search
  – Out-of-band. E.g., use Google to find a tracker for the file you want.
• Fetch
  – Download chunks of the file from your peers. Upload chunks you have to them.
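
The join and fetch steps of swarming can be sketched with a toy tracker and a chunk-request planner. Everything named here (`Tracker`, `announce`, `plan_requests`, the peer names) is hypothetical, and the planner simply takes the first peer that has a missing chunk; the chunk-selection and incentive policies of real BitTorrent clients are not modeled.

```python
class Tracker:
    """Toy tracker: remembers which peers are in the torrent."""

    def __init__(self):
        self.peers = set()

    def announce(self, peer_id):
        """Register a peer and return the other peers already in the torrent."""
        others = sorted(self.peers - {peer_id})
        self.peers.add(peer_id)
        return others


def plan_requests(my_chunks, peer_chunks, num_chunks):
    """For every chunk we lack, pick one peer that advertises it.

    my_chunks:   set of chunk indexes already downloaded
    peer_chunks: peer id -> set of chunk indexes that peer holds
    Returns {chunk index: peer id}; chunks nobody holds are left out.
    """
    plan = {}
    for chunk in range(num_chunks):
        if chunk in my_chunks:
            continue
        holders = sorted(p for p, have in peer_chunks.items() if chunk in have)
        if holders:
            plan[chunk] = holders[0]
    return plan


tracker = Tracker()
tracker.announce("bob")
tracker.announce("carol")
print(tracker.announce("alice"))   # Alice learns about bob and carol
print(plan_requests({0}, {"bob": {0, 1}, "carol": {1, 2}}, num_chunks=4))
```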

BitTorrent: Publish/Join
[Figure: peers contact the tracker]

BitTorrent: Fetch
[Figure: peers exchange chunks with one another]

Structured P2P Networks

The Search (Routing) Problem
[Figure: nodes N1 to N6 connected through the Internet; a publisher holds Key=“title”, Value=MP3 data…; a client issues Lookup(“title”) and must find the node holding the key]

Routed Queries in Structured P2P Networks
[Figure: nodes N1 to N9; the client’s Lookup(“title”) is routed hop by hop across the overlay to the node holding the publisher’s Key=“title”, Value=MP3 data…]
In structured overlay networks, searching is equivalent to routing on the structure of the overlay.

Distributed Hash Tables
• Academic answer to P2P
• Goals
  – Guaranteed lookup success
  – Provable bounds on search time
  – Provable scalability
• Makes some things harder
  – Fuzzy queries / full-text search / etc.
• Read-write, not read-only
• Hot topic in networking since introduction in ~2000/2001

DHT: Overview
• Abstraction: a distributed “hash-table” (DHT) data structure:
  – put(id, item);
  – item = get(id);
• Implementation: nodes in the system form a distributed data structure
  – Can be Ring, Tree, Hypercube, Skip List, Butterfly Network, ...
• Structured Overlay Routing:
  – Join: On startup, contact a “bootstrap” node and integrate yourself into the distributed data structure; get a node id
  – Publish: Route publication for file id toward a close node id along the data structure
  – Search: Route a query for file id toward a close node id. Data structure guarantees that query will meet the publication.
  – Fetch: Two options:
    • Publication contains actual file => fetch from where query stops
    • Publication says “I have file X” => query tells you 128.2.1.3 has X, use IP routing to get X from 128.2.1.3

DHT as Library / Platform
[Figure: a distributed application calls put(key, data) and get(key), which returns data, on the distributed hash table; the table is spread across many nodes]
DHT provides the information lookup service for P2P applications.
• Nodes uniformly distributed across key space
• Nodes form an overlay network
• Nodes maintain list of neighbors in routing table
• Decoupled from physical network topology

DHT Schemes for Structured P2P Networks
• Chord
• Pastry
• Tapestry
• Content Addressable Network (CAN)

Chord Overview (from MIT in 2001)
• Associate to each node and file a unique id in a unidimensional space (a Ring)
  – E.g., pick from the range $[0, 2^m - 1]$
  – Usually the hash of the file or IP address
• Properties:
  – Routing table size is O(log N), where N is the total number of nodes
  – Guarantees that a file is found in O(log N) hops

Chord IDs
• Key identifier = SHA-1(key)
• Node identifier = SHA-1(IP address)
• Both are uniformly distributed
• Both exist in the same ID space
• How to map key IDs to node IDs?
  – The heart of Chord protocol is “consistent hashing”
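
Deriving both kinds of identifier from SHA-1 takes only a few lines with Python's standard `hashlib`. The helper name `chord_id` and the optional truncation to `m` bits (handy for small examples) are assumptions of this sketch.

```python
import hashlib


def chord_id(text, m=160):
    """Map a key or an IP address string to an m-bit identifier via SHA-1."""
    digest = hashlib.sha1(text.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (2 ** m)


# Keys and node addresses land in the same identifier space.
key_id = chord_id("xyz.mp3")
node_id = chord_id("128.2.1.3")
print(hex(key_id))
print(hex(node_id))
print(chord_id("xyz.mp3", m=8))   # truncated to an 8-bit ring for toy examples
```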

Chord API
• insert(key, value) → store key/value at r nodes
• lookup(key)
• update(key, newval)
• join(n)
• leave()

DHT: Consistent Hashing
[Figure: circular ID space with nodes N32, N90, N105 and keys K5, K20, K80; for example, Key 5 is labeled K5 and Node 105 is labeled N105]
A key is stored at its successor: node with next higher ID
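
The successor rule can be sketched with a sorted list of node IDs and a binary search. The function name is hypothetical, and the sketch follows the usual Chord convention that a node whose ID equals the key counts as its successor.

```python
from bisect import bisect_left


def successor(node_ids, key_id):
    """Return the first node ID that is >= key_id, wrapping around the ring."""
    ids = sorted(node_ids)
    i = bisect_left(ids, key_id)   # index of the first node ID >= key_id
    return ids[i % len(ids)]       # past the highest ID, wrap to the lowest


nodes = [32, 90, 105]
for key in (5, 20, 80, 110):
    print(f"K{key} -> N{successor(nodes, key)}")
```

With the nodes and keys from the figure, K5 and K20 map to N32 and K80 maps to N90; a key above 105 (110 is an added illustration) wraps around to N32.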

DHT: Chord Basic Lookup
[Figure: ring with nodes N10, N32, N60, N90, N105, N120 and key K80; the question “Where is key 80?” is passed from successor to successor until the answer “N90 has K80” is returned]

Basic Chord Lookup (Routing) Algorithm

Lookup(my-id, key-id)
    n = my successor
    if my-id < n < key-id
        call Lookup(key-id) on node n    /* go to next hop */
    else
        return my successor              /* found the correct node */

• Correctness depends only on successors
• O(N) lookup time, but we can do better
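
A runnable version of the successor-only lookup is sketched below. The comparison `my-id < n < key-id` has to be read on the circle, so the sketch adds a hypothetical helper `in_open_interval` for wrap-around; the ring is the six-node example from the basic lookup figure.

```python
def in_open_interval(x, a, b):
    """True if x lies strictly between a and b when walking clockwise from a."""
    if a < b:
        return a < x < b
    return x > a or x < b   # the interval wraps past the top of the ring


def basic_lookup(successor_of, start, key_id):
    """Follow successor pointers; return (node responsible for key_id, hops)."""
    node, hops = start, 0
    while True:
        nxt = successor_of[node]
        if in_open_interval(nxt, node, key_id):
            node, hops = nxt, hops + 1   # successor still precedes the key: forward
        else:
            return nxt, hops             # successor is at or past the key: done


ring = {10: 32, 32: 60, 60: 90, 90: 105, 105: 120, 120: 10}
print(basic_lookup(ring, 10, 80))    # walks N10 -> N32 -> N60, answer N90
print(basic_lookup(ring, 120, 5))    # wrap-around case, answer N10
```

Each hop advances by only one node, which is why the lookup time is O(N).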

DHT: Chord “Finger Table”
[Figure: the fingers of node N80 span 1/2, 1/4, 1/8, 1/16, 1/32, 1/64, and 1/128 of the ring]
• Entry i in the finger table of node n is the first node that succeeds or equals $n + 2^i$
• In other words, in an m-bit identifier space the ith finger points $1/2^{m-i}$ of the way around the ring
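
The finger-table definition translates directly into code. This sketch assumes a global view of all node IDs (which a real Chord node does not have) and uses a 3-bit ring with nodes 0, 1, 2 and 6 as a small example; the function names are hypothetical.

```python
from bisect import bisect_left


def successor(sorted_ids, key_id):
    """First node ID >= key_id on the ring, wrapping around."""
    i = bisect_left(sorted_ids, key_id)
    return sorted_ids[i % len(sorted_ids)]


def finger_table(n, node_ids, m):
    """Rows (i, n + 2^i mod 2^m, successor of that point) for node n."""
    ids = sorted(node_ids)
    rows = []
    for i in range(m):
        start = (n + 2 ** i) % (2 ** m)
        rows.append((i, start, successor(ids, start)))
    return rows


nodes = [0, 1, 2, 6]
for n in nodes:
    print(f"n{n}:", finger_table(n, nodes, m=3))
```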

DHT: Chord Join
• Assume an identifier space [0..7] (m = 3)
• Node n1 joins

Succ. Table of n1
i   id+2^i   succ
0   2        1
1   3        1
2   5        1

[Figure: ring with positions 0 to 7]

DHT: Chord Join
• Node n2 joins

Succ. Table of n1
i   id+2^i   succ
0   2        2
1   3        1
2   5        1

Succ. Table of n2
i   id+2^i   succ
0   3        1
1   4        1
2   6        1

[Figure: ring with positions 0 to 7]

DHT: Chord Join
• Nodes n0, n6 join

Succ. Tables
node   i   id+2^i   succ
n0     0   1        1
n0     1   2        2
n0     2   4        6
n1     0   2        2
n1     1   3        6
n1     2   5        6
n2     0   3        6
n2     1   4        6
n2     2   6        6
n6     0   7        0
n6     1   0        0
n6     2   2        2

[Figure: ring with positions 0 to 7]

DHT: Chord Join
• Nodes: n1, n2, n0, n6
• Items: f7, f2

Succ. Tables
node   i   id+2^i   succ
n0     0   1        1
n0     1   2        2
n0     2   4        6
n1     0   2        2
n1     1   3        6
n1     2   5        6
n2     0   3        6
n2     1   4        6
n2     2   6        6
n6     0   7        0
n6     1   0        0
n6     2   2        2

[Figure: ring with positions 0 to 7; item 7 is stored at n0]

DHT: Chord Routing
• Upon receiving a query for item id, a node:
  – Checks whether it stores the item locally
  – If not, forwards the query to the largest node in its successor table that does not exceed id
[Figure: query(7) on the ring of nodes n0, n1, n2, n6 with their successor tables; item 7 is stored at n0]

Chord Routing Algorithm

Lookup(my-id, key-id)
    look in local finger table for
        highest node n such that my-id < n < key-id
    if n exists
        call Lookup(key-id) on node n    /* go to next hop */
    else
        return my successor              /* found the correct node */
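
The finger-based routing rule can be sketched as follows. The finger lists are those of a four-node, 3-bit example ring (nodes 0, 1, 2 and 6); the helper `between` handles wrap-around, and "highest node" is interpreted as the finger furthest clockwise from the current node, which is an assumption about how the comparison is meant on a circle.

```python
RING = 8  # identifier space 0..7 (m = 3)

# node -> finger targets for i = 0, 1, 2; finger 0 is the immediate successor
FINGERS = {0: [1, 2, 6], 1: [2, 6, 6], 2: [6, 6, 6], 6: [0, 0, 2]}


def between(x, a, b):
    """True if x lies strictly inside the clockwise interval (a, b)."""
    if a < b:
        return a < x < b
    return x > a or x < b


def lookup(fingers, start, key_id):
    """Return (node responsible for key_id, list of nodes the query visited)."""
    node, path = start, [start]
    while True:
        candidates = [f for f in fingers[node] if between(f, node, key_id)]
        if not candidates:
            return fingers[node][0], path   # no closer finger: answer is my successor
        # Jump to the candidate that is furthest clockwise from the current node.
        node = max(candidates, key=lambda f: (f - node) % RING)
        path.append(node)


print(lookup(FINGERS, 1, 7))   # query(7) issued at n1
print(lookup(FINGERS, 0, 5))
```

For query(7) issued at n1, the query jumps to n6, whose successor n0 is responsible for identifier 7.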

Node Joining in Chord
Node n joins the system:
• n picks a random identifier, id
• n performs n’ = lookup(id)
• n->successor = n’
• n’->predecessor = n

State Maintenance: Stabilization Protocol
• Periodically node n
  – Asks its successor, n’, about its predecessor n’’
  – If n’’ is between n and n’
    • n->successor = n’’
    • notify n’’ that n is its predecessor
• When node n’’ receives notification message from n
  – If n is between n’’->predecessor and n’’, then
    • n’’->predecessor = n
• Improve robustness
  – Each node maintains a successor list (usually of size 2*log N)
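
Join and stabilization can be simulated in one process with plain objects. This is a sketch under stated assumptions: `join` receives the result of lookup(id) from the caller instead of routing for it, and `stabilize` notifies the (possibly updated) successor on every round, as in the Chord paper's pseudocode, whereas the slide mentions the notification only after an update. All names are hypothetical.

```python
class Node:
    def __init__(self, ident):
        self.id = ident
        self.successor = self     # a lone node is its own successor
        self.predecessor = None


def between(x, a, b):
    """True if x lies strictly inside the clockwise interval (a, b)."""
    if a < b:
        return a < x < b
    return x > a or x < b


def join(new, n_prime):
    """n_prime is the node returned by lookup(new.id), supplied by the caller."""
    new.successor = n_prime
    n_prime.predecessor = new


def notify(node, candidate):
    """candidate believes it is node's predecessor."""
    if node.predecessor is None or between(candidate.id, node.predecessor.id, node.id):
        node.predecessor = candidate


def stabilize(n):
    """Adopt the successor's predecessor if it sits between n and the successor."""
    x = n.successor.predecessor
    if x is not None and between(x.id, n.id, n.successor.id):
        n.successor = x
    notify(n.successor, n)


a, b, c = Node(1), Node(2), Node(6)
join(b, a)                      # lookup(2) on the one-node ring returns a
for _ in range(2):
    for n in (a, b):
        stabilize(n)
join(c, a)                      # lookup(6) on the ring {1, 2} returns a
for _ in range(2):
    for n in (a, b, c):
        stabilize(n)
print([(n.id, n.successor.id, n.predecessor.id) for n in (a, b, c)])
```

After the stabilization rounds the successor pointers form the ring 1 → 2 → 6 → 1, with matching predecessors.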

DHT: Network Locality in Chord
• Nodes close on a ring can be far on the network

DHT: Chord Summary
• Routing table size?
  – log N fingers
• Routing time?
  – Each hop is expected to halve the distance to the desired id => expect O(log N) hops.


Pastry: Overview
• Similar interface to Chord
• Considers network locality to minimize the hops messages travel
• New node needs to know a nearby node to achieve locality
• Each routing hop matches the destination identifier by one more digit
• Many choices in each hop (locality possible)

Object Distribution
• Consistent hashing [Karger et al. ‘97]
• 128-bit circular id space (0 to $2^{128} - 1$)
• nodeIds (uniform random)
• objIds (uniform random)
• Invariant: node with numerically closest nodeId maintains object
[Figure: circular id space with nodeIds and an objId marked on it]

Object Insertion/Lookup
• Msg with key X is routed to live node with nodeId closest to X
• Problem: complete routing table not feasible
[Figure: Route(X) on the circular id space from 0 to $2^{128} - 1$]

Routing (Lookup) in Pastry
[Figure: Route(d46a1c) on a ring containing nodes 65a1fc, d13da3, d4213f, d462ba, d467c4, d471f1 and the key d46a1c]
Properties
• $\log_{16} N$ steps
• O(log N) state
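
The idea that each hop matches one more digit of the destination can be sketched with a shared-prefix helper. The function names are hypothetical, and the forwarding rule below is a simplification: real Pastry consults a routing table and a leaf set and also takes network proximity into account.

```python
def shared_prefix_len(a, b):
    """Number of leading hex digits that the two identifiers have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def next_hop(current, key, known_nodes):
    """Pick a known node that matches the key in more leading digits than we do.

    Among the improving candidates, the one with the shortest improved prefix
    is chosen, mirroring "one more digit per hop". Returns None if none improves.
    """
    have = shared_prefix_len(current, key)
    better = [n for n in known_nodes if shared_prefix_len(n, key) > have]
    if not better:
        return None
    return min(better, key=lambda n: shared_prefix_len(n, key))


key = "d46a1c"
for node in ("65a1fc", "d13da3", "d4213f", "d462ba", "d467c4", "d471f1"):
    print(node, shared_prefix_len(node, key))
print(next_hop("65a1fc", key, ["d13da3", "d4213f", "d462ba"]))
```

With 16 possible values per digit, a route resolves one hex digit per step, which is where the $\log_{16} N$ step count comes from.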

Node Addition in Pastry
[Figure: new node d46a1c; Route(d46a1c) on a ring containing nodes 65a1fc, d13da3, d4213f, d462ba, d467c4, d471f1]


What is Tapestry?
• A prototype of a decentralized, scalable, fault-tolerant, adaptive location and routing infrastructure (Zhao, Kubiatowicz, Joseph et al., U.C. Berkeley)
• Network layer of OceanStore global storage system
• Suffix-based hypercube routing
  – Core system inspired by Plaxton Algorithm (Plaxton, Rajaraman, Richa (SPAA97))
• Core API:
  – publishObject(ObjectID, [serverID])
  – sendmsgToObject(ObjectID)
  – sendmsgToNode(NodeID)

Tapestry
• Uses Plaxton mesh data structure
  – allows nodes to locate objects and route messages to them across an arbitrary-sized network while using a small, constant-sized routing map at each hop
  – route map is called neighborhood map
• Ideas similar to Pastry

Plaxton (Suffix) Routing
Example: from 67493 to 34567
Destination path resolution order: xxxx7 → xxx67 → xx567 → x4567 → 34567
x represents wild cards
All the numbers are in base-10
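
The resolution order in the example can be generated mechanically from the destination identifier. A small sketch, with hypothetical function names:

```python
def suffix_patterns(dest):
    """Wildcard patterns resolved hop by hop, shortest matching suffix first."""
    n = len(dest)
    return ["x" * (n - k) + dest[n - k:] for k in range(1, n + 1)]


def shared_suffix_len(a, b):
    """Number of trailing digits that the two identifiers have in common."""
    count = 0
    for x, y in zip(reversed(a), reversed(b)):
        if x != y:
            break
        count += 1
    return count


print(suffix_patterns("34567"))
print(shared_suffix_len("67493", "34567"))   # the source shares no trailing digit
```

Each hop moves to a node whose identifier matches the next pattern in the list, so a route needs at most as many hops as the identifier has digits.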

Example Neighborhood Map
• Suffix routing
• Each entry points to next node in the neighborhood
  – A neighbor is a connected node to the present node
• Level corresponds to the number of matching digits

Neighbor Map For “5712” (Octal)
Digit   Level 4   Level 3   Level 2   Level 1
0       0712      x012      xx02      xxx0
1       1712      x112      5712      xxx1
2       2712      x212      xx22      5712
3       3712      x312      xx32      xxx3
4       4712      x412      xx42      xxx4
5       5712      x512      xx52      xxx5
6       6712      x612      xx62      xxx6
7       7712      5712      xx72      xxx7
(Columns are routing levels; entries equal to 5712 are the node itself.)

Another Suffix Routing Example
Example: Octal digits, $2^{18}$ namespace, 005712 → 627510
[Figure: route 005712 → 340880 → 943210 → 834510 → 387510 → 727510 → 627510; each node shows routing entries 0 to 7, and each hop matches one more trailing digit of the destination]


Content Addressable Network (CAN)
• Associate to each node and item a unique id in a d-dimensional space
• Properties
  – Routing table size O(d)
  – Guarantee that a file is found in at most $d \cdot n^{1/d}$ steps, where n is the total number of nodes

CAN Example: Two Dimensional Space
• Space divided between nodes
• Together, the nodes cover the entire space
• Each node covers either a square or a rectangular area of ratios 1:2 or 2:1
• Example:
  – Assume space size (8 x 8)
  – Node n1:(1, 2) is the first node that joins → it covers the entire space
[Figure: 8 x 8 grid (axes 0 to 7) with n1]

CAN Example: Two Dimensional Space
• Node n2:(4, 2) joins → space is divided between n1 and n2
[Figure: 8 x 8 grid (axes 0 to 7) with n1 and n2]

CAN Example: Two Dimensional Space
• Node n3:(3, 5) joins → space is divided between n1 and n3
[Figure: 8 x 8 grid (axes 0 to 7) with n1, n2, and n3]

CAN Example: Two Dimensional Space
• Nodes n4:(5, 5) and n5:(6, 6) join
[Figure: 8 x 8 grid (axes 0 to 7) with n1, n2, n3, n4, and n5]

CAN Example: Two Dimensional Space
• Nodes: n1:(1, 2); n2:(4, 2); n3:(3, 5); n4:(5, 5); n5:(6, 6)
• Items: f1:(2, 3); f2:(5, 1); f3:(2, 1); f4:(7, 5)
[Figure: 8 x 8 grid (axes 0 to 7) showing the five nodes and the four items]

CAN Example: Two Dimensional Space
• Each item is stored by the node that owns its mapping in the space
[Figure: 8 x 8 grid (axes 0 to 7) showing the five nodes and the four items]

CAN: Query Example
• Each node knows its neighbors in the d-space
• Forward query to the neighbor that is closest to the query id
• Example: assume n1 queries f4
[Figure: 8 x 8 grid (axes 0 to 7) showing the five nodes, the four items, and the query from n1 toward f4]
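
Ownership and greedy forwarding in the two-dimensional example can be sketched as below. The zone boundaries are an assumption inferred from the join order (each joining node halves the zone it lands in); the space is treated as a flat square without wrap-around, and "closest" is measured from the query point to each neighbor's zone. All names are hypothetical.

```python
import math

# Zones as (x_min, x_max, y_min, y_max), upper bounds exclusive.
# Assumed split lines, inferred from the join order n1, n2, n3, n4, n5.
ZONES = {
    "n1": (0, 4, 0, 4),
    "n2": (4, 8, 0, 4),
    "n3": (0, 4, 4, 8),
    "n4": (4, 6, 4, 8),
    "n5": (6, 8, 4, 8),
}


def owner(zones, point):
    """Node whose zone contains the point."""
    x, y = point
    for name, (x0, x1, y0, y1) in zones.items():
        if x0 <= x < x1 and y0 <= y < y1:
            return name
    raise ValueError(f"{point} is outside the coordinate space")


def are_neighbors(a, b):
    """Zones are neighbors if they share an edge (touching corners do not count)."""
    x_overlap = min(a[1], b[1]) - max(a[0], b[0])
    y_overlap = min(a[3], b[3]) - max(a[2], b[2])
    return (x_overlap == 0 and y_overlap > 0) or (y_overlap == 0 and x_overlap > 0)


def distance_to_zone(zone, point):
    """Euclidean distance from the point to the nearest point of the zone."""
    x0, x1, y0, y1 = zone
    x, y = point
    return math.hypot(max(x0 - x, 0, x - x1), max(y0 - y, 0, y - y1))


def route(zones, start, point):
    """Greedy forwarding: hand the query to the neighbor closest to the point."""
    target = owner(zones, point)
    path = [start]
    while path[-1] != target:
        if len(path) > len(zones):
            raise RuntimeError("greedy forwarding did not converge")
        here = zones[path[-1]]
        nbrs = [n for n in zones if n != path[-1] and are_neighbors(here, zones[n])]
        path.append(min(nbrs, key=lambda n: distance_to_zone(zones[n], point)))
    return path


items = {"f1": (2, 3), "f2": (5, 1), "f3": (2, 1), "f4": (7, 5)}
print({name: owner(ZONES, p) for name, p in items.items()})
print(route(ZONES, "n1", items["f4"]))
```

Under these assumed zones, f1 and f3 are stored at n1, f2 at n2 and f4 at n5, and the query from n1 for f4 is forwarded through n2 to n5.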
