Scalable Routing Via Greedy Embedding

Author: Stanley Bates
This full text paper was peer reviewed at the direction of IEEE Communications Society subject matter experts for publication in the IEEE INFOCOM 2009 proceedings.

Cedric Westphal∗, Guanhong Pei†
∗DoCoMo Labs USA, †Virginia Tech
Email: [email protected], † [email protected]

Abstract: We investigate the construction of greedy embeddings in Euclidean spaces of polylogarithmic dimension in order to achieve scalable routing through geographic routing. We propose a practical algorithm which uses random projection to achieve greedy forwarding on a space of dimension O(log(n)) where nodes have coordinates of size O(log(n)), thus achieving greedy forwarding with a route table at each node whose size is polylogarithmic in the number of nodes. We further improve this algorithm with a quasi-greedy variant which ensures that greedy forwarding works along a path-wise construction, allowing us to further reduce the dimension of the embedding. The proposed algorithm, denoted GLoVE-U, is fully distributed and practical to implement. We evaluate its performance using extensive simulations and show that our greedy forwarding algorithm delivers low path stretch and scales properly.

I. INTRODUCTION

The scalability of routing impacts many systems in many application scenarios. For many of them, geometric (or geographic) routing can enhance the scalability of the routing. Greedy forwarding is particularly attractive, as it requires very little routing state: a packet is forwarded toward a destination point by comparing the distance from that point to each neighbor of the node holding the packet. Namely, the packet is forwarded to the neighbor which minimizes the Euclidean distance to the destination. Each node thus only needs to know the positions of its neighbors, and the size of the routing table is proportional to the degree of the node. Such routing is both local and memoryless.

Geometric routing based on the actual location of the nodes requires accurate knowledge of each node's location. Further, the physical location of the nodes might not coincide with the connectivity of the network. To take advantage of the network topology, [1] proposed to build virtual (geographic) coordinates on top of the graph created by the network connectivity. Greedy routing is then performed on these virtual coordinates. In essence, a geographic routing overlay is constructed above the network topology to ensure scalable routing using greedy forwarding.

The problem then becomes to embed the initial graph topology in a metric space in a way which preserves the connectivity property. Given a connection graph $G = (V, E)$, where $V$ is the set of nodes or routers and $E$ represents the (bi-directional) links between these nodes, we wish to construct virtual coordinates in a metric space $(X, d)$ on which to perform routing, using the distance $d$ for greedy forwarding. Define $n = |V|$, so that the network has $n$ nodes. For $v \in V$, denote by $N_v$ the set of neighbors of $v$, that is: $N_v = \{w \mid w \in V, (v, w) \in E\}$.

Definition 1.1: A greedy embedding is a mapping $f : V \to X$ such that $\forall u, w \in V$, $u \neq w$:

$$\exists v \in N_u \text{ such that } d(f(v), f(w)) < d(f(u), f(w)) \qquad (1)$$
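Definition 1.1 can be checked mechanically on a small graph. The following is a minimal sketch, not code from the paper: the function names `euclid` and `is_greedy_embedding`, the adjacency-dictionary representation, and the two toy embeddings are all assumptions made for illustration.

```python
import math
from itertools import permutations


def euclid(a, b):
    """Euclidean distance between two coordinate tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def is_greedy_embedding(neighbors, coords, dist=euclid):
    """Test condition (1): for every ordered pair (u, w) with u != w,
    some neighbor v of u must be strictly closer to w than u is."""
    for u, w in permutations(neighbors, 2):
        d_uw = dist(coords[u], coords[w])
        if not any(dist(coords[v], coords[w]) < d_uw for v in neighbors[u]):
            return False  # u is a local minimum for destination w
    return True


# Hypothetical 3-node path a - b - c.
path = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}

# Nodes laid out in order along a line: every hop gets closer.
straight = {"a": (0.0,), "b": (1.0,), "c": (2.0,)}

# Same graph, but b is placed far away: a is closer to c than b is,
# so a packet at a destined for c is stuck in a local minimum.
folded = {"a": (0.0,), "b": (2.0,), "c": (0.5,)}

print(is_greedy_embedding(path, straight))
print(is_greedy_embedding(path, folded))
```

The check costs one pass over all ordered node pairs, so it is only meant for validating small examples, not for use inside a routing protocol.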

Namely, a greedy embedding is an embedding such that, at each vertex, there always exists at least one neighbor which is closer to the destination. In other words, there is no local minimum in a greedy embedding, and greedily forwarding the packets ensures that they reach the destination.

It is known [2] that for any graph to be embedded into a Euclidean space, the dimension of the Euclidean space has to be at least log(n). Since log(n) bits are already required merely to express the name of a node in the network, embedding into a space of dimension log(n) is extremely attractive from a scalability perspective, provided each of the log(n) coordinates can be expressed concisely. However, we are not aware of practical protocols to derive virtual coordinates in such a space.

We study the embedding of graphs onto Euclidean spaces of poly-logarithmic dimension. We propose an algorithm to construct a greedy embedding such that the size of each entry in the route table is of order $\log^2(n)$, and the number of entries at each node is the degree of the node. We also propose a quasi-greedy embedding which improves the performance of the greedy routing and further reduces the dimension of the embedding space, although its scaling behavior remains $O(\log^2(n))$. The algorithm to construct the virtual coordinates is simple and fully distributed, and routes successfully for all node pairs with a low path stretch.

The paper is organized as follows: in the next section, we further motivate the problem with some prior related work. In Section III, we provide some background to help build intuition pertaining to our design choices, and describe our algorithm for embedding virtual coordinates. We propose a variant of the base algorithm in Section IV. We analyze its scaling properties and measure the performance of the algorithm in terms of path stretch and congestion in Section V. We offer concluding remarks in Section VI.

II. RELATED WORK

The ability to construct a greedy embedding depends on the graph $G = (V, E)$ and on the embedding space $(X, d)$.

978-1-4244-3513-5/09/$25.00 ©2009 IEEE


Kleinberg [3], in a milestone paper, showed how to construct a greedy embedding onto the 2-dimensional hyperbolic plane. This does not, however, solve the issue of scalability, since the coordinates in the hyperbolic plane in Kleinberg's scheme require n bits to describe the location of one node. This implies that, even though the number of entries in the route table is scalable, the size of each entry is not.

For graphs that are planar triangulations, Papadimitriou [4] conjectured that a greedy embedding was always possible in the 2-dimensional Euclidean space $l_2$. Leighton [5] recently solved Papadimitriou's conjecture, but the relative distance of points in their construction scales as $1/3^n$ for some graphs, thus again requiring n bits to describe the virtual location of a node. [6] uses the physics technique of spring relaxation to modify the virtual coordinates and make the virtual embedding greedy; however, not all graph topologies are amenable to the algorithm.

Without any restriction on the underlying graph, a greedy embedding in $l_2$ cannot exist for every graph. For a Euclidean space, the dimension needs to be at least log(n) for a greedy embedding to exist for any graph $G$ [2]. This leads us to consider embeddings in higher dimensions. Since describing an address in a network of size n already takes log(n) bits, poly-logarithmic dimensions are still very scalable.

In dimension more than two, however, most of the solutions to the issue of local minima in the virtual topology fail to apply. Typical geographic forwarding schemes combine two mechanisms to combat the existence of local minima: greedy forwarding whenever possible, and some recovery mechanism when packets reach a local minimum. GPSR [7] uses face routing, and many other protocols have been devised to route around local minima (see for instance the references in [8]). Most of these techniques work only for planar graphs [9]. Further, [10] showed that face routing induces sub-optimal congestion and reduces the available throughput through the network. Also, [11] proved that deterministic face routing fails in spaces of dimension strictly more than two. [12] describes a random walk recovery mechanism which applies to the 3D case, but is not easily generalizable to higher dimensions of the order of log(n).

Another direction for routing scalability is the study of compact routing (see for instance [13]). Compact routing trades off route table size for path stretch, namely the ratio of the length of the path achieved by the routing scheme to that of the shortest available path. [14] presents a routing scheme which uses a maximum route table size at each node of $O(\sqrt{n \log(n)})$ with a worst case path stretch of 3. [15] demonstrates that a route table size of order O(n) is required to achieve any worst case stretch strictly less than 3. Geographic routing does not provide explicit guarantees on the path stretch. However, our algorithm shows an average path stretch less than 1.5 on a wide range of topologies.
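The path stretch metric used above is easy to compute for a single route on an unweighted graph. The following is a minimal sketch under stated assumptions: the helper names `hop_distance` and `path_stretch` and the 4-node ring are hypothetical, and a route is simply the list of nodes a packet visits.

```python
from collections import deque


def hop_distance(neighbors, src, dst):
    """Shortest hop count from src to dst by breadth-first search.
    Returns None when dst is unreachable."""
    hops = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        if u == dst:
            return hops[u]
        for v in neighbors[u]:
            if v not in hops:
                hops[v] = hops[u] + 1
                queue.append(v)
    return None


def path_stretch(neighbors, route):
    """Ratio of the hops actually taken along `route` to the shortest
    hop count between the first and last node of the route."""
    shortest = hop_distance(neighbors, route[0], route[-1])
    if not shortest:  # None (unreachable) or 0 (source equals destination)
        raise ValueError("stretch is undefined for this source-destination pair")
    return (len(route) - 1) / shortest


# Hypothetical 4-node ring 0 - 1 - 2 - 3 - 0.
ring = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}

long_way = [0, 1, 2, 3]   # three hops around the ring
short_way = [0, 3]        # the direct link

print(path_stretch(ring, long_way), path_stretch(ring, short_way))
```

Averaging this ratio over all source-destination pairs gives the kind of average stretch figure quoted in the text; a worst-case stretch would take the maximum instead.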

III. BACKGROUND AND ALGORITHM

We consider a graph $G = (V, E)$. We assume that all edges have equal weight, say 1, so that shortest path routing on the graph corresponds to minimizing the hop count. This can be generalized by assigning other weights to the edges in $E$.

We denote by $\mathcal{E} = \{e_i, 1 \le i \le n\}$ the canonical orthonormal basis of $\mathbb{R}^n$, i.e., $e_i$ is the vector with a 1 in the $i$-th coordinate and 0 elsewhere. Choose an integer $k < n$. From this point on, the only metric we use is the $l_2$ norm. The distance between $x, y \in \mathbb{R}^k$ is thus $d(x, y) = \|y - x\| = \sqrt{\sum_{i=1}^{k} (x_i - y_i)^2}$.

Define $k$ vectors $r^1, \ldots, r^k$ in $l_2^n$. We set each coordinate of $r^j$, $1 \le j \le k$, namely $r^j_i$, $1 \le i \le n$, to be an i.i.d. Normal(0,1) random variable. For $x \in l_2^n$, we can define $f : l_2^n \to l_2^k$ such that:

$$f(x) = \frac{1}{\sqrt{k}} \left( \langle x, r^1 \rangle, \langle x, r^2 \rangle, \ldots, \langle x, r^k \rangle \right) \qquad (2)$$

$f$ is a random projection of $l_2^n$ onto $l_2^k$. We recall the Johnson-Lindenstrauss Lemma [16]:

Lemma 3.1: Pick $0 < \epsilon < 1$. For $u, v \in l_2^n$, and for $k > k_0$, where $k_0 = O(\epsilon^{-2} \log(n))$,

$$(1 - \epsilon) \le \frac{\|f(u) - f(v)\|^2}{\|u - v\|^2} \le (1 + \epsilon) \qquad (3)$$
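The projection of Equation (2) and the ratio bounded in Equation (3) can be tried out numerically. The following is a minimal sketch using only the standard library; the function names, the sizes n = 500 and k = 200, and the random test points are assumptions chosen for illustration, not values from the paper.

```python
import math
import random


def make_projection(n, k, rng):
    """Draw the k vectors r^1..r^k in R^n with i.i.d. Normal(0,1) coordinates."""
    return [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(k)]


def project(x, vectors):
    """Equation (2): f(x) = (1/sqrt(k)) * (<x, r^1>, ..., <x, r^k>)."""
    scale = 1.0 / math.sqrt(len(vectors))
    return [scale * sum(xi * ri for xi, ri in zip(x, r)) for r in vectors]


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def distortion(u, v, vectors):
    """The middle term of Equation (3): ||f(u)-f(v)||^2 / ||u-v||^2."""
    fu, fv = project(u, vectors), project(v, vectors)
    return squared_distance(fu, fv) / squared_distance(u, v)


rng = random.Random(0)          # fixed seed so the run is repeatable
n, k = 500, 200                 # illustrative sizes with k < n
vectors = make_projection(n, k, rng)
u = [rng.uniform(-1.0, 1.0) for _ in range(n)]
v = [rng.uniform(-1.0, 1.0) for _ in range(n)]
ratio = distortion(u, v, vectors)
print("squared-distance ratio:", ratio)
```

For a fixed pair of points the ratio is distributed as a chi-square variable with k degrees of freedom divided by k, so it concentrates around 1 as k grows; this is the behavior the lemma quantifies.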

The Johnson-Lindenstrauss (JL) Lemma states that one can reduce the dimension of the space and project from an n-dimensional space onto a k-dimensional space, while still preserving the distance between points within a factor $\epsilon$. Achlioptas [17] refined the JL lemma using vectors $r^j$, $1 \le j \le k$, whose i.i.d. coordinates $r^j_i$, $1 \le i \le n$, are drawn from the Rademacher distribution, that is, equal to 1 or −1 with equal probability. We use this refinement of the JL lemma below, as it is less computationally intensive while the distortion is no worse than in previous JL implementations.

For the JL lemma described above, $k_0$ should be chosen as $4 \log(n) / (\epsilon^2/2 - \epsilon^3/3)$. For the Achlioptas procedure, for any $\beta > 0$, taking $k_0 = (4 + 2\beta) \log(n) / (\epsilon^2/2 - \epsilon^3/3)$ ensures that Equation (3) is satisfied with probability $1 - n^{-\beta}$.

Due to space constraints, we only highlight the principles of our first algorithm, inspired by [2]. More details can be found in [19]. We call this algorithm GLoVE, for Greedy Logarithmic Virtual Embedding. We follow the same approach as [3] by first creating a tree-based system of coordinates which satisfies the greedy forwarding property, then using greedy forwarding to route along short-cuts on the links outside of the tree.

GLoVE construction:
1) Extract a spanning tree $T$ out of the graph $G$ (we pick a random root and use the Spanning Tree Protocol [18]);
2) Decompose the tree $T$ into branches $T_1, T_2, \ldots$, such that there are at most n branches;
3) Assign to each vertex on the branch $T_i$ some coordinates along a vector taken from the canonical basis $\mathcal{E} = \{e_1, \ldots, e_n\}$, where $e_i$ is the vector with a 1 in the $i$-th position and zeroes otherwise;
4) Using the Johnson-Lindenstrauss Lemma, reduce the dimension of the coordinates from n to $\log(n)/\epsilon^2$, with $\epsilon$ chosen so that the greedy forwarding property is preserved.

For small $\epsilon$, $k = O(\log(n)/\epsilon^2)$ can become quite large. We also propose a version which tolerates large values of $\epsilon$. A large $\epsilon$ implies that the random projection of the JL Lemma introduces too much distortion to ensure that the embedding is always greedy. Thus, we add a recovery mechanism to the routing performed on top of the GLoVE coordinates, which we describe in Section IV. For a more detailed description of the algorithm, we refer the reader to [19].

IV. VARIANTS OF GLOVE

A. Up-tracking

A modification of the GLoVE embedding procedure is the GLoVE-U scheme, for GLoVE with Up-tracking, which we detail below. Recall that $k_0 = O(\log(n)/\epsilon^2)$. Thus, some values of $k < k_0$ cause high distortion in the embedding, and greedy forwarding using the GLoVE embedding does not find all the source-destination paths: the random projection still creates local minima.

Our mechanism to recover from local minima in this case works in two phases. First, during the construction of the embedding, we ensure that the embedding is path-wise greedy between any node and the tree root. Namely, for a node $v_k \in V$ such that the path to the root $v_0$ is $v_k, v_{k-1}, \ldots, v_1, v_0$ (the index $k$ here denotes the depth of the node, not the dimension of the embedding), we generate the coordinates of $v_k$ to ensure that:

$$d(v_k, v_i) < d(v_k, v_{i-1}), \text{ for all } 1 \le i \le k - 1 \qquad (4)$$
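Condition (4) can be tested directly on a candidate coordinate, and a candidate that fails can be redrawn. The following is a minimal sketch of that rejection loop, assuming each tree edge contributes an independent ±1 vector scaled by one over the square root of the dimension; the names `rademacher_step`, `satisfies_condition_4` and `extend_path`, the `max_tries` cap, and the sizes used are all hypothetical.

```python
import math
import random


def dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def rademacher_step(dim, rng):
    """One projected edge vector: dim i.i.d. +/-1 entries scaled by 1/sqrt(dim)."""
    scale = 1.0 / math.sqrt(dim)
    return [scale * rng.choice((-1.0, 1.0)) for _ in range(dim)]


def satisfies_condition_4(path_coords, candidate):
    """path_coords holds the coordinates of v_0 (root) .. v_{k-1};
    candidate is the proposed coordinate of v_k.
    Checks d(v_k, v_i) < d(v_k, v_{i-1}) for 1 <= i <= k-1."""
    return all(
        dist(candidate, path_coords[i]) < dist(candidate, path_coords[i - 1])
        for i in range(1, len(path_coords))
    )


def extend_path(path_coords, dim, rng, max_tries=1000):
    """Draw a coordinate for the next node on the path, redrawing the
    edge vector until condition (4) holds. Returns (coords, attempts)."""
    parent = path_coords[-1]
    for attempt in range(1, max_tries + 1):
        step = rademacher_step(dim, rng)
        candidate = [p + s for p, s in zip(parent, step)]
        if satisfies_condition_4(path_coords, candidate):
            return candidate, attempt
    raise RuntimeError("no path-wise greedy candidate found")


rng = random.Random(1)
dim = 256                        # illustrative embedding dimension
coords = [[0.0] * dim]           # the root sits at the origin
attempts = []
for _ in range(8):               # grow one branch of depth 8
    new_coord, tries = extend_path(coords, dim, rng)
    coords.append(new_coord)
    attempts.append(tries)
print("redraws needed per node:", attempts)
```

Only the coordinates of a node's ancestors are needed to accept or reject a candidate, which is what makes this step suitable for a node to perform on its own.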

This guarantees that the path from $v_k$ to the root satisfies the greedy forwarding property. To achieve this, we proceed as follows. Recall that we compute the coordinates of $v_k$ in the embedding as $f(g(v_k)) = f(g(v_{k-1})) + f(e_l)$, for some canonical basis vector $e_l \in \mathcal{E}$ which is used for this edge $(v_{k-1}, v_k)$ exclusively (here $g(v)$ denotes the n-dimensional tree coordinates of $v$ from step 3 of the construction). Also, recall that $f(e_l) = \frac{1}{\sqrt{k}} (\langle e_l, r^1 \rangle, \langle e_l, r^2 \rangle, \ldots, \langle e_l, r^k \rangle) = \frac{1}{\sqrt{k}} (r^1_l, \ldots, r^k_l)$, with the $r^i_l$ all i.i.d. random variables taking the values ±1 with equal probability. Thus, we generate a candidate for $f(g(v_k))$ by drawing a set of random variables $(r^1_l, \ldots, r^k_l)$ with the proper distribution. We then check that Equation (4) is satisfied. If it is not, we repeat the procedure with a fresh set of random variables $(r^1_l, \ldots, r^k_l)$. The greedy property is path-wise satisfied on the tree, so we can always find a vector $(r^1_l, \ldots, r^k_l)$ in the random projection which satisfies the path-wise greedy property in the embedding. In our experiments, we never had to re-calculate the vector $(r^1_l, \ldots, r^k_l)$ more than 3 times for a 1,000-node topology with the number of dimensions k corresponding to $\epsilon = 0.5$, or more than 10 times for 100,000 nodes.

The routing now has two phases: a greedy phase and a recovery phase. The greedy phase proceeds as usual. When a packet reaches a local minimum, that is, when the packet reaches a node which is closer to the destination than any of its neighbors, the routing enters the recovery phase. The node simply forwards the packet to its parent in the tree. Since the packet is going up in the tree, we call this up-tracking. The intuition is the following: since we are in a local minimum, we are not on a path from the root to the destination, where greedy forwarding is ensured by construction. Thus going down the tree is going in the wrong direction. On the other hand, we know there exists a path-wise greedy path from the root to any destination, thus up-tracking always gets the packet closer to a valid greedy path. As soon as possible, the packet resumes greedy forwarding.

The packet also carries a short black list of nodes which are local minima, in order to avoid ping-ponging between two nodes. A node first checks that the next hop for greedy forwarding is not on the black list before forwarding. Unlike pure greedy forwarding, this scheme is no longer memoryless. However, we will see in the evaluation section that the size of the black list stays very limited, and that GLoVE-U always recovers from local minima, providing 100% delivery for all source-destination pairs.

B. Distributed Algorithm

As our intent is to devise a practical algorithm, we constructed a distributed version of GLoVE-U. In order to make the algorithm local, we use the short-branch tree decomposition, which assigns a new vector $e_i$ to each edge in the tree. For our purpose, the short-branch decomposition has an important property: the assignment of the vector $e_l$ to the $l$-th link is independent of the prior vector assignments. With a decomposition which assigns the same vector to different links, so that the coordinates are aligned along that vector's direction, one needs to keep track of whole branches. With the short-branch decomposition, since each branch contains exactly one link, the vector assignment process can be memoryless.
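The two-phase forwarding rule of Section IV-A (greedy phase, up-tracking recovery, black list carried by the packet) can be sketched as follows. This is an illustrative reading of the description, not the authors' simulator: the function `route`, the `max_hops` guard, the dictionary-based graph and the hand-placed 2-D coordinates are all assumptions.

```python
import math


def dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def route(neighbors, parent, coords, src, dst, max_hops=1000):
    """Forward greedily; at a local minimum, blacklist the node and
    hand the packet to its tree parent (up-tracking).
    Returns (list of visited nodes, black list carried by the packet)."""
    path, blacklist, current = [src], set(), src
    while current != dst:
        if len(path) > max_hops:
            raise RuntimeError("hop limit exceeded")
        here = dist(coords[current], coords[dst])
        # Greedy candidates: strictly closer neighbors that are not blacklisted.
        closer = [v for v in neighbors[current]
                  if v not in blacklist and dist(coords[v], coords[dst]) < here]
        if closer:
            nxt = min(closer, key=lambda v: dist(coords[v], coords[dst]))
        else:
            blacklist.add(current)          # remember this local minimum
            nxt = parent[current]           # recovery: go up the tree
            if nxt is None:
                raise RuntimeError("root is a local minimum")
        path.append(nxt)
        current = nxt
    return path, blacklist


# Hypothetical tree: r is the root, with branches r - a - c and r - b - d.
neighbors = {"r": ["a", "b"], "a": ["r", "c"], "b": ["r", "d"],
             "c": ["a"], "d": ["b"]}
parent = {"r": None, "a": "r", "b": "r", "c": "a", "d": "b"}
# Hand-placed coordinates: each branch is path-wise greedy from the root,
# but c is closer to d than its only neighbor a is, so c is a local minimum.
coords = {"r": (0.0, 0.0), "a": (0.0, 1.0), "b": (1.0, 0.0),
          "c": (1.0, 1.0), "d": (2.0, 0.0)}

path_cd, blacklist_cd = route(neighbors, parent, coords, "c", "d")
path_rd, blacklist_rd = route(neighbors, parent, coords, "r", "d")
print(path_cd, blacklist_cd)
print(path_rd, blacklist_rd)
```

In this toy layout, node a would greedily send the packet straight back to c if c were not on the black list, which is the ping-pong behavior the list is meant to prevent.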
Moreover, since $e_l$ is drawn from a known distribution and is used for only one link, its projection can be generated locally. The idea of the distributed algorithm is to build the tree $T$ and the embedding coordinates at the same time using a modified version of the STP protocol. Each node attempts to generate a tree rooted at itself, and then announces this tree to its neighbors. When receiving such an announcement, nodes either join the tree or discard the message. For further details about the algorithm, we refer the reader to [19].

V. PERFORMANCE EVALUATION

We validate our design choices and evaluate the performance of our virtual coordinates.

Theorem 5.1: GLoVE requires O(k log(n)) bits to describe the virtual coordinates of the embedding.

Proof: The coordinates are generated as $f(g(v)) = f(g(p(v))) + \frac{1}{\sqrt{k}} (r^i_l, 1 \le i \le k)$, where $p(v)$ is the parent of $v$ and $r_l$ is the vector obtained by randomly projecting $e_l$. Since all coordinates are divided by the same factor $\sqrt{k}$, and since the routing depends only on the relative coordinates, we can omit it. Thus the $i$-th coordinate of the embedding will be, in the worst case of a branch in the tree of length n, $\sum_{j=1}^{n} r^i_j$, where the $r^i_j$ are i.i.d. random variables. It is a consequence of the central limit theorem that this sum is of order $\sqrt{n}$ with high probability, so that $\log(\sqrt{n}) = \log(n)/2$ bits are necessary to describe the $i$-th coordinate. Since there are k coordinates, the result follows.

In order to perform the numerical evaluation, we generated topologies using open-source topology generators for different models. We wrote the implementation of GLoVE and GLoVE-U in C, as a built-from-scratch discrete-event simulator, using the graph topology as an input. We chose this implementation approach in order to make the evaluation run faster, so as to accommodate larger topologies.

We first show the proportion of routes found by the algorithm for topologies generated under three different topology models. We use the Barabasi-Albert model [20] (BA model), which models the AS-level topology of the Internet and explains the emergence of the power-law distribution of node degrees at the AS level based on the theory of "preferential attachment". We also use the Waxman model [21], which models interconnections within an AS cluster as well as the links between different clusters, and captures node locality. Finally, we use a random unit disk graph model with n nodes uniformly distributed and nodes connected when they are within a certain connectivity radius, which is set proportional to $\sqrt{\log(n)/n}$. Due to space constraints, only the BA model results are presented here. For the other models, please refer to [19].

Fig. 1. Hit rate for GLoVE for different values of $\epsilon$ as a function of n, BA model

Figure 1 depicts the fraction of paths that are routable under pure greedy forwarding. We picked some relatively low values for $\epsilon$, and we plot one curve per value of $\epsilon$. This shows that to achieve a constant hit rate as n grows, $\epsilon$ should be scaled.
This is particularly true in the unit disk model, where the graph looks less and less like a tree as n grows (since the average degree of each node increases as log(n) in order to ensure connectivity of the graph).

However, even for these low values of $\epsilon$, and even for graphs where the likelihood of hitting a local minimum is high using exclusively greedy forwarding, we see that GLoVE-U always ensures 100% delivery: packets can be routed between all source-destination pairs.

One key point to observe is that, despite relying on the up-tracking mechanism along the tree structure underlying the embedding coordinates, GLoVE-U does not perform tree routing. Up-tracking is used only to extricate packets from local minima in the embedding, but even on a topology with plenty of local minima, the embedding still offers enough hints for greedy forwarding to be effective. To see this, consider Figure 2, which depicts the path stretch achieved by the routing for different topologies. The path stretch is the ratio of the number of hops used by a packet under GLoVE-U between a source-destination pair to the length of the shortest path for that pair. Figure 2 also depicts the corresponding 95% confidence interval. We see that the average path stretch for GLoVE-U is always much below that of tree routing: the geometric embedding offers enough hints to the greedy forwarding to take shortcuts off the tree. For the wide range of topologies we studied, the average path stretch stays very limited, mostly around 1.5.

Fig. 2. Path stretch for GLoVE-U for different values of $\epsilon$ as a function of n, Barabasi model

Figure 3 shows the congestion for a network with 1,000 nodes and $\epsilon = 0.5$. We compute the distribution of the routes over the links in the network, and plot the CDF. A link is more congested when the routing procedure steers more traffic onto it. Tree routing has a profile composed of steps: for instance, all the links to a leaf node $v$ will appear on the 999 paths between $v$ and $u \in V \setminus \{v\}$. However, links near the root are more heavily congested, which explains the worse performance of tree routing at the high-congestion end of the distribution. We can confirm that GLoVE-U performs similarly to shortest path routing, and that the stair-step profile of tree routing does not appear for GLoVE-U: it spreads the traffic better over all links, despite using the tree for up-tracking.

Fig. 3. Path congestion for GLoVE-U, Barabasi model (CDF of the number of paths going through a link)

The cost of GLoVE-U resides in the number of up-tracking steps taken by the packets, and in the size of the black list carried by the packets. In our simulation, we did not limit the size of the black list: we counted all the nodes that were blacklisted, and then we retroactively obtained the size required to route all packets correctly. Figure 4 shows the average number of recoveries per packet and the size of the black list for GLoVE-U. On average, the number of recoveries stays very limited. Also, in the worst case, the black list stays relatively small. One can observe that it grows linearly in Figure 4. Since the x-axis is in a logarithmic scale, this means that the size of the black list is of order O(log(n)). Even though GLoVE-U is not memoryless, the state that packets carry scales comparably to the other routing state in the system.

Fig. 4. Average number of recoveries (top) and size of the black list (bottom) for GLoVE-U for different values of $\epsilon$ as a function of n, Barabasi model

VI. CONCLUDING REMARKS AND FUTURE WORK

We have presented an embedding of graphs into a set of virtual coordinates, combined with a distributed routing algorithm, denoted GLoVE-U, which ensures that routing on the embedded coordinates always finds the destination. Our performance evaluation shows that GLoVE-U indeed delivers packets reliably with a low path stretch. The path stretch depends on the underlying network graph topology, but for common application scenarios, this path stretch stays below 1.5. Further, the amount of information carried in the black list is observed to grow as log(n) as well. This confirms that GLoVE-U has very good scalability properties.

Our distributed algorithm can apply to mobile networks where the mobility is constrained to the edge of the topology, as in, for instance, cellular networks or wireless mesh networks with a static infrastructure. Indeed, a node can compute its virtual coordinates locally based on the coordinates of its attachment point. A question of interest for further investigation is that of translating mobility along the graph into a trajectory in the virtual space.

Another extension of this work involves studying the performance of the algorithm on weighted graphs. The algorithm translates naturally to graphs with weights $w_e$ for all $e \in E$, and we would like to investigate the performance of the routing algorithm when these $w_e$ correspond to some level of QoS in the network, such as available bandwidth or delay.

REFERENCES

[1] A. Rao, S. Ratnasamy, C. Papadimitriou, S. Shenker, and I. Stoica, “Geographic routing without location information,” in Proceedings of ACM MobiCom, 2003, pp. 96–108.
[2] P. Maymounkov, “Greedy embeddings, trees and Euclidian vs. Lobachevsky geometry,” Technical Report, available at http://pdos.csail.mit.edu/ petar/pubs.html, 2006.
[3] R. Kleinberg, “Geographic routing using hyperbolic space,” in Proceedings of IEEE Infocom, 2007.
[4] C. Papadimitriou and D. Ratajczak, “On a conjecture related to geometric routing,” Theoretical Computer Science, vol. 244, no. 1, pp. 3–14, 2005.
[5] T. Leighton and A. Moitra, “Some results on greedy embeddings in metric spaces,” in private communication, to appear in FOCS’08.
[6] B. Leong, B. Liskov, and R. Morris, “Greedy virtual coordinates for geographic routing,” in Proceedings of ICNP, October 2007, pp. 71–80.
[7] B. Karp and H. T. Kung, “GPSR: greedy perimeter stateless routing for wireless networks,” in Proceedings of ACM Mobicom, August 2000, pp. 243–254.
[8] B. W. L. Leong, “New techniques for geographic routing,” Ph.D. dissertation, MIT, May 2006.
[9] Y.-J. Kim, R. Govindan, B. Karp, and S. Shenker, “Geographic routing made practical,” in Proceedings of NSDI’05, May 2005.
[10] S. Subramanian, S. Shakkottai, and P. Gupta, “On optimal geographic routing in wireless networks with holes and non-uniform traffic,” in Proceedings of IEEE Infocom, 2008.
[11] S. Durocher, D. Kirkpatrick, and L. Narayanan, “On routing with guaranteed delivery in three-dimensional ad hoc wireless networks,” in Proceedings of ICDCN, 2008, pp. 546–557.
[12] R. Flury and R. Wattenhofer, “Randomized 3D geographic routing,” in Proceedings of IEEE Infocom, 2008.
[13] D. Krioukov, K. Fall, and X. Yang, “Compact routing on Internet-like graphs,” in Proceedings of IEEE INFOCOM’04, March 2004.
[14] M. Thorup and U. Zwick, “Compact routing schemes,” in ACM Symposium on Parallel Algorithms and Architectures, 2001, pp. 1–10.
[15] C. Gavoille and M. Gengler, “Space-efficiency of routing schemes of stretch factor three,” Journal of Parallel and Distributed Computing, vol. 61, no. 5, pp. 679–687, 2001.
[16] W. Johnson and J. Lindenstrauss, “Extensions of Lipschitz maps into a Hilbert space,” Contemporary Mathematics, vol. 26, pp. 189–206, 1984.
[17] D. Achlioptas, “Database-friendly random projections: Johnson-Lindenstrauss with binary coins,” Journal of Computer and System Sciences, vol. 66, no. 4, pp. 671–687, June 2003.
[18] R. Perlman, “An algorithm for distributed computation of a spanning tree in an extended LAN,” ACM SIGCOMM Computer Communication Review, vol. 15, no. 4, pp. 44–53, 1985.
[19] C. Westphal and G. Pei, “Scalable routing via greedy embedding,” Technical Report, Docomo Labs USA, August 2008.
[20] A.-L. Barabasi and R. Albert, “Emergence of scaling in random networks,” Science, vol. 286.
[21] B. M. Waxman, “Routing of multipoint connections,” IEEE Journal on Selected Areas in Communications, vol. 6, no. 9, pp. 1617–1622, December 1988.
