Author: Stephany Quinn

IEEE TRANSACTIONS ON COMMUNICATIONS, VOL. COM-30, NO. 7, JULY 1982

A Responsive Distributed Routing Algorithm for Computer Networks

JEFFREY M. JAFFE, MEMBER, IEEE, AND FRANKLIN H. MOSS, MEMBER, IEEE

Abstract: A new distributed algorithm is presented for dynamically determining weighted shortest paths used for message routing in computer networks. The major features of the algorithm are that the paths defined do not form transient loops when weights change and the number of steps required to find new shortest paths when network links fail is less than for previous algorithms. Specifically, the worst case recovery time is proportional to the largest number of hops h in any of the weighted shortest paths. For previous loop-free distributed algorithms this recovery time is proportional to h^2.

Paper approved by the Editor for Computer Communication of the IEEE Communications Society for publication after presentation at the Second International Conference on Distributed Computing Systems, Versailles, France, April 1981. Manuscript received October 3, 1980; revised August 5, 1981. The authors are with the IBM Thomas J. Watson Research Center, Yorktown Heights, NY 10598.

I. INTRODUCTION

Recent interest in the problem of routing in computer networks has spawned the development of a number of algorithms for the distributed computation of shortest paths [1]. Let a time-varying positive cost be associated with each link in a network. Then the problem is to determine, at each node, the next node along the minimal cost paths to all destinations. Algorithms involving distributed computation among the network nodes are preferred for reasons of reliability and responsiveness. These shortest paths may be used for routing all messages [2], or may be used for other purposes.

What may be referred to as the “first generation” of these works includes the ARPANET algorithm [2] and a similar algorithm developed for the MERIT network by Tajibnapis [3]. These algorithms are quite similar in terms of the method used for updating the next node information. In both, each node periodically sends “summary” information to its neighbors regarding its current best cost to one or more destinations. Each node uses information received from neighbors to determine its best next nodes. One basic attribute of these algorithms is that there is no notion of exercising control over the ordering in which the nodes perform their updates.

First-generation algorithms also have in common two important deficiencies. First, when link weights change, the paths induced by the next node indicator may form temporary loops. Second, the algorithms recover slowly from link and node failures [4]. Provisions are included to abate, but not solve, the latter problem [4].

The recognition of these problems motivated the development of a “second generation” of algorithms, seeking to rectify one or both of the problems. The new ARPANET algorithm [5] takes a different approach to the kind of information maintained at each node. In brief, total topological information is maintained at every node, and a protocol is supported for broadcasting changes in topology throughout the network.

In another second-generation work, Merlin and Segall [6] describe a pioneering algorithm which, like the first-generation algorithms, maintains “summary” information as opposed to global topology at each node. To avoid forming loops and to recover quickly from failure, there are strong constraints on the ordering of the updates among the nodes. The next node indicators to each destination induce a tree rooted at that destination, and an update ordering is implied by the positions of the nodes in the tree. This ordering in turn preserves the tree structure.

The algorithm presented here also provides loop freedom and rapid recovery using “summary” information rather than global topology. This approach is more suited to the evolving large network environment than the new ARPANET algorithm since the average storage requirement is smaller. The new feature of the algorithm is that it exploits the fact (pointed out in [2], [7] and proved formally here) that first-generation algorithms maintain loop-free paths in the presence of static or decreasing link weights. Therefore, update ordering coordination among nodes is only required for increasing link weights or link failures. Moreover, coordination in those instances need only occur among a subset of the nodes, not the whole tree as in Merlin-Segall [6]. Also, coordination is only required briefly after a failure, not for all subsequent updates. In summary, nodes execute a first-generation algorithm for static or decreasing link weights, and a coordinated algorithm among the set of affected nodes in response to link weight increases.

The actual use of this algorithm depends heavily on the application. If there are many changes in weight, even the coordination provided here may be excessive. Moreover, if the changes occur over short periods of time, the storage required (in the worst case) does become significant. The present algorithm is ideal for situations when changes in link weights are relatively infrequent and yet fast recovery is needed when the changes occur.

The paper proceeds as follows. In Section II, a description of the “independent update procedure (IUP)” common to first-generation algorithms is presented. Section III presents the theorem which states that IUP provides loop-free paths in the presence of static or decreasing weights. Section IV uses this result to present a simple version of the new algorithm which accounts for any change in link weights. The new algorithm is compared to first-generation algorithms and Merlin-Segall in Section V. The main improvement over first-generation algorithms is that they recover from failure in time proportional to n (number of network nodes), whereas a version of the new algorithm recovers in time proportional to h (height of the shortest path tree). Thus, the new algorithm is more “responsive” to local network changes. To recover from a single failure, Merlin-Segall may require as many as O(h^2) units of time, compared to O(h) for the new algorithm.

0090-6778/82/0700-1758$00.75 © 1982 IEEE

II. INDEPENDENT UPDATE PROCEDURE

This section gives a brief description of the independent update procedure (IUP) common to the “first generation” algorithms. The procedure is described by three elements: a table maintained at each node, an update message exchanged between adjacent nodes, and a simple table update algorithm executed at each node. This section focuses on an asynchronous version of the IUP similar to the one used in [3]. We consider the situation in which network topology is fixed, but in which link weights may change.

Let us focus on a node A with k adjacent nodes B1, ..., Bk. Fig. 1 depicts the table maintained at node A. Entry C(A,DES,Bi) is the estimated minimal cost from A to DES via adjacent node Bi. NN(A,DES) is the adjacent node which provides the smallest estimated cost to DES. C*(A,DES) is the cost to DES through NN(A,DES), i.e., C*(A,DES) = C(A,DES,NN(A,DES)). Initially C(A,DES,Bi) = ∞, unless DES = Bi, in which case C(A,DES,Bi) is set to d(A,Bi) [the cost from A to Bi along link (A,Bi)].

Fig. 1. Routing table at node A.

If a node Bi sends an update message “MSG(DES,C)” to an adjacent node A, the interpretation is that C*(Bi,DES) = C. Node Bi sends “MSG(DES,C)” to all neighbors whenever C*(Bi,DES) changes. Node A changes its tables when 1) it receives MSG(DES,C) from Bi, or 2) the weight of d(A,Bi) changes (a failure means d(A,Bi) = ∞). The procedure in those cases is

1) A receives MSG(DES,C) from Bi.
   a) Set C(A,DES,Bi) = C + d(A,Bi).
   b) Reevaluate NN(A,DES) and C*(A,DES) in light of a). If the lowest cost to DES has changed, update NN(A,DES) and C*(A,DES) and send MSG(DES,C) to all neighbors.
2) d(A,Bi) changes by Δd(A,Bi). Do for all DES's.
   a) Set C(A,DES,Bi) = C(A,DES,Bi) + Δd(A,Bi).
   b) Do 1) b).
3) d(A,Bi) goes from ∞ to some finite value. Do for all DES's. Send to Bi MSG(DES,C*(A,DES)).
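The table and steps 1) and 2) above can be sketched in Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the names IUPNode, outbox, and run are hypothetical, a node is assumed to keep no entry for itself as a destination, ties between neighbors are broken arbitrarily, and step 3) (recovery from an infinite cost) is not modeled.

```python
import math


class IUPNode:
    """Table of one node A: C(A,DES,Bi), NN(A,DES), C*(A,DES)."""

    def __init__(self, name, link_costs):
        self.name = name
        self.d = dict(link_costs)   # d(A,Bi) for each adjacent node Bi
        self.C = {}                 # (DES, Bi) -> C(A,DES,Bi); missing means infinity
        self.nn = {}                # DES -> NN(A,DES)
        self.best = {}              # DES -> C*(A,DES)
        self.outbox = []            # queued (neighbor, DES, cost) update messages
        for bi, cost in self.d.items():
            self.C[(bi, bi)] = cost  # initially C(A,Bi,Bi) = d(A,Bi)
            self._reevaluate(bi)

    def _reevaluate(self, des):
        # Step 1) b): recompute the best neighbor; announce only a changed cost.
        costs = {bi: self.C.get((des, bi), math.inf) for bi in self.d}
        nbr = min(costs, key=costs.get)
        changed = costs[nbr] != self.best.get(des, math.inf)
        self.best[des] = costs[nbr]
        self.nn[des] = nbr if costs[nbr] < math.inf else None
        if changed:
            self.outbox.extend((bi, des, costs[nbr]) for bi in self.d)

    def on_message(self, bi, des, cost):
        # Step 1) a): C(A,DES,Bi) = C + d(A,Bi).
        if des == self.name:
            return  # assumption: a node keeps no table entry for itself
        self.C[(des, bi)] = cost + self.d[bi]
        self._reevaluate(des)

    def on_link_change(self, bi, new_cost):
        # Step 2): shift every entry via Bi by the change in d(A,Bi).
        # A failure is new_cost = math.inf; the old cost is assumed finite.
        delta = new_cost - self.d[bi]
        self.d[bi] = new_cost
        for des, via in list(self.C):
            if via == bi:
                self.C[(des, via)] += delta
                self._reevaluate(des)


def run(nodes):
    """Deliver queued messages until no node has anything left to send."""
    delivered = True
    while delivered:
        delivered = False
        for node in nodes.values():
            out, node.outbox = node.outbox, []
            for to, des, cost in out:
                nodes[to].on_message(node.name, des, cost)
                delivered = True
```

For a triangle with d(A,B) = 1, d(B,DES) = 1, and d(A,DES) = 5, building the three nodes and calling run leaves A routing to DES through B at cost 2.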

If link costs are static then the paths implicitly defined are loop free and will indeed be the shortest paths, after sufficiently many steps of execution. If link costs vary, then the paths implicitly defined may define loops. There are several disadvantages to this. First of all, if messages are routed by these tables they will loop, wasting precious network resources. Second, loops which occur in the tables may delay or prevent the nodes from obtaining paths to the destination when failures occur. For example, in Fig. 2, let the arrows denote the next node to DES for nodes B, C, D, and E. Assume that NN(A,DES) = DES, and link (A,DES) fails. If C(A,DES,B) < C(A,DES,E), then at the failure A chooses NN(A,DES) = B. Assume C(A,DES,E) is much larger than any other cost, e.g., d(A,E) = 10^6 and d(x,y) = 1 for any other x, y. Then approximately 10^6 steps transpire while update messages loop (A,D,C,B,A) and before A correctly chooses E as the next node to DES.

Fig. 2. Potential loop when (A, DES) fails.

To guarantee reasonably fast detection of loops, a hop count field is included in the messages and tables for IUP. An update message MSG(DES,C,h) from Bi causes A to update C(A,DES,Bi) as before, and also to set HOPCOUNT(A,DES,Bi) = h + 1. Any message of the form MSG(DES,C,n + 1) is ignored (n = number of network nodes). While loops are then detected more quickly, after a single failure n steps may still be required to detect the loop, and more steps to correctly obtain the shortest paths.
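The hop count rule can be written as a small filter. This is only a sketch: the function name accept_update is hypothetical, and treating any hop count of n + 1 or more as invalid (rather than exactly n + 1) is a defensive assumption.

```python
def accept_update(hop_count, n):
    """Return the HOPCOUNT value to store for a received update, or None.

    hop_count is the h carried in MSG(DES,C,h); n is the number of
    network nodes. A path of more than n hops must contain a loop,
    so such messages are ignored (assumption: >= instead of ==).
    """
    if hop_count >= n + 1:
        return None
    return hop_count + 1
```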

III. STRUCTURE THEOREM

In this section we present a theorem which says that the cause of loop formation in the IUP is increasing link weights. Define G(D) to be the graph whose nodes consist of all the nodes of the network with directed edge (A,B) in G(D) iff NN(A,D) = B. As the IUP is executed in time, G(D) changes in time.

Theorem: Assume that the IUP is executed in a network. Assume that during execution link weights decrease or do not change. Then
1) ∀A, D, C*(A,D) is a nonincreasing function of time;
2) ∀D, G(D) is loop free at every instant of time.
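The property in part 2) can be checked mechanically on a snapshot of the next node indicators. The sketch below assumes G(D) is given as a dictionary mapping each node A to NN(A,D), with None or a missing key meaning "no next node"; the name has_loop is hypothetical.

```python
def has_loop(next_node):
    """Return True if following NN(.,D) from some node revisits a node."""
    for start in next_node:
        seen = set()
        node = start
        # Walk the chain of next nodes until it ends or repeats.
        while node is not None and node in next_node:
            if node in seen:
                return True
            seen.add(node)
            node = next_node[node]
    return False
```

The walk restarts from every node, which is quadratic in the worst case but adequate for illustrating the definition.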

Proof: 1) We prove this by contradiction. Assume that a cost increase occurs, say for the first time in the network at time t at node A to destination D. Let B = NN(A,D) just before t. By 1) b) and 2) b) of the algorithm, it is clear that C*(A,D) can increase only if A receives MSG(D,C) from B with greater cost than A has previously received from B or if d(A,B) increases. The former contradicts the assumption that it was the first increase, and the latter contradicts the assumption that weights are nonincreasing.

2) This is also proved by contradiction. Assume that a loop forms in G(D) and that time t is the first time at which this occurs in the network (see Fig. 3). Let A be the node of the loop for which C*(A,D) is largest, and B be the node on the loop such that NN(B,D) = A. Therefore, C*_{t}(A,D) ≥ C*_{t}(B,D) (i.e., the costs at time t). Consider the last time t' < t at which B set NN(B,D) = A and C*(B,D) = C*_{t}(B,D) = C(B,D,A). Clearly, C*_{t'}(B,D) > C*_{t''}(A,D) where t'' < t' is the time at which A sent the message to B. Thus, C*_{t}(A,D) ≥ C*_{t}(B,D) = C*_{t'}(B,D) > C*_{t''}(A,D), contradicting 1) above.

IV. COORDINATED UPDATE PROCEDURE

This section describes a simple version of the new algorithm designed to prevent loops in the tables and recover quickly from failure. The algorithm exploits the special property expressed in the theorem of the previous section: the approach is to use IUP for events stemming from static or decreasing link costs and to add more coordination to account for failures or cost increases.

The basic idea behind the coordinated update procedure (CUP) is quite simple. When a cost increase occurs on a link, all nodes uptree of the link on G(DES) [see Section III for definition of G(DES)] are progressively “frozen” starting at the node adjacent to the link and proceeding uptree.
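The set of nodes "uptree" of a node on G(DES) can be computed from the next node indicators. A minimal sketch, assuming the same dictionary representation of G(DES) as a map from node to NN(node,DES); the name uptree_nodes is hypothetical.

```python
def uptree_nodes(next_node, a):
    """Nodes whose path to DES on G(DES) passes through a (a itself excluded)."""
    result = set()
    for start in next_node:
        node, path, seen = start, [], set()
        # Follow next nodes until reaching a, the end of the path, or a repeat.
        while node is not None and node != a and node not in seen:
            seen.add(node)
            path.append(node)
            node = next_node.get(node)
        if node == a:
            result.update(path)
    return result
```

If the link from A to its next node increases in cost, A together with uptree_nodes(next_node, "A") is the set that would be progressively frozen.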
The freeze state for node A is with respect to a particular destination, DES, and means that A may update its cost entries to DES but may not change NN(A,DES). Node A is not unfrozen until all uptree nodes have increased their costs to reflect the link cost increase, thereby “purging” the old information which does not reflect the increase. In this way node A never causes a loop to form by choosing a node uptree of it on G(DES). This is true because the only time an uptree node has lower cost is when the downtree node is in freeze state. This is in distinction to IUP, in which an uptree node may have lower cost due to the fact that the “news of the increase” has yet to propagate uptree.

Fig. 3. Loop in algorithm.

The mechanism for achieving the CUP is as follows. A single bit is set in update messages to indicate whether or not it originated from a link increase (0 = not link increase, 1 = link increase). If d(A,B) increases and NN(A,DES) = B, then A increases C(A,DES,B) as in step 2) a) of IUP, recalculates C*(A,DES), sends MSG(DES,C,1) to all neighbors, and enters freeze state. Nodes receiving MSG(DES,C,1) from their current next nodes do the same, gradually freezing all nodes uptree of the link increase.

An acknowledgment procedure unfreezes the nodes. If a node N receives MSG(DES,C,1) from a neighbor Bi which is not its next node, it updates C(N,DES,Bi), but does not enter a freeze state. Instead, it acknowledges to Bi that the message has been received by sending ACK(DES). When Bi receives ACK(DES) from all neighbors, it is a signal that the outdated information has been purged uptree of Bi and Bi, in turn, acknowledges NN(Bi,DES). At this time, Bi leaves the freeze state. No loops will subsequently be caused by Bi as all uptree neighbors must have higher cost. In this manner, a downtree cycle notifies all nodes that they may unfreeze, and the IUP may again be executed.

Multiple cost increases may affect a node at once, that is, before one cost increase is purged uptree, one or more cost increases may occur downtree. To account for this, each node keeps track of the total set of acknowledgments that are “outstanding” from each neighbor. Assume that node A has k neighbors B1, ..., Bk.
Then A maintains k bit vectors V1, ..., Vk to determine when it can leave freeze state. Every time that A receives a freeze message from downtree and forwards it to all neighbors, A sets every Vi to Vi*0. That is, A adds a 0 to the end of each bit vector, designating that an unacknowledged freeze message received from downtree of A has been sent to all neighbors. When A initiates a freeze message due to an increase in link cost it sets every Vi to Vi*1.

The various freeze messages sent by node A are acknowledged by A's neighbors in the same order as received if all freeze messages are processed in FIFO order at all nodes. (This is easily seen by induction on the structure of the tree.) Thus, when A receives an acknowledgment from Bi, it knows that Bi is acknowledging the oldest freeze message not yet acknowledged by Bi. To record this, the leftmost symbol is removed from Vi, i.e., if Vi = xy where x is the leftmost symbol (either a 0 or a 1) then Vi is set to y.

Node A uses the Vi to decide when to send ACK(DES) to NN(A,DES). If A receives ACK(DES) from Bi and |Vi| (the size of Vi) is greater than |Vj| for j ≠ i, then the freeze message being acknowledged by Bi has already been acknowledged by all other neighbors of A. If the leftmost symbol of Vi is a 1, then the freeze message was initiated by A and need not be acknowledged to NN(A,DES). If it is 0, A acknowledges NN(A,DES). In either case, if |Vi| = 0 for all i, A leaves the freeze state.

We now account for the case in which A is awaiting an acknowledgment from Bi (i.e., Vi is not empty) and link (A,Bi) fails. To prevent A from waiting indefinitely for acknowledgments, A acts as if it had just received all acknowledgments from Bi, sending any acknowledgments to NN(A,DES) if relevant and setting Vi = empty. We also assume that, simultaneously, Bi sets NN(Bi,DES) = empty if NN(Bi,DES) = A so that if A ultimately chooses a next node uptree of Bi, a loop is avoided. After (A,Bi) is restored, Bi does not acknowledge any of the increases to A.

A complete description of the entire algorithm as executed at node A is presented in the Appendix.

V. FAILURE RECOVERY PROPERTIES

In this section we assess the failure recovery properties of the new algorithm presented in the previous section. This discussion will motivate some modifications to the algorithm, which, though slightly more complex, yield a “speed of recovery” which is superior to first-generation algorithms and Merlin-Segall.

A basic difficulty we face in discussing speed of recovery for asynchronous algorithms is that the time required for some target state to occur is a function of the arbitrary timing in which the various nodes execute the algorithm. In order to circumvent this problem, we shall henceforth consider a hypothetical synchronization of the algorithm in which every node executes a “step” of the algorithm simultaneously at fixed points in time. At each step a node may receive and process one message from each neighbor.
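The synchronized step model can be illustrated by counting rounds on a static network. The sketch below is a simplification and an assumption on our part: each node recomputes its cost from its neighbors' costs of the previous step (a synchronous Bellman-Ford style iteration) rather than running the message-driven protocol, and the name synchronous_steps is hypothetical.

```python
import math


def synchronous_steps(links, dest):
    """Return (costs, steps) for a static network.

    links maps an undirected pair (u, v) to its positive weight.
    steps counts synchronized rounds until no cost to dest changes.
    """
    nodes = sorted({u for pair in links for u in pair})
    nbrs = {u: {} for u in nodes}
    for (u, v), w in links.items():
        nbrs[u][v] = w
        nbrs[v][u] = w
    cost = {u: (0 if u == dest else math.inf) for u in nodes}
    steps = 0
    while True:
        # Every node acts simultaneously on the previous step's costs.
        new = {
            u: 0 if u == dest else min(
                (w + cost[v] for v, w in nbrs[u].items()), default=math.inf
            )
            for u in nodes
        }
        if new == cost:
            return cost, steps
        cost = new
        steps += 1
```

On a chain DES, A, B, C with unit weights the count is 3, the height of the shortest path tree, which matches the intuition that the step count scales with the number of hops.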
The question of “how fast?” is then equivalent to “how many steps?”

We then evaluate the speed of recovery according to the following criterion: assuming that a single failure occurs after optimal paths have been found to a destination, how many steps are required until all nodes again have optimal paths? This includes the number of steps required to detect the failure plus the number of steps needed to find the minimal cost paths (not just achieve connectivity). Finally, it is assumed that all other link weights are static. This is the natural measure to use for environments in which changes are relatively infrequent and yet fast recovery is desired (see Section I).

The algorithm of Section IV has worst case speed of recovery (according to the above criterion) of O(x) where x is the number of nodes affected by the failure. To see that x steps may be required, consider the network of Fig. 4 and focus on node DES and shortest path tree G(DES). The nodes affected by the indicated failure are denoted G'(DES). The structure of G'(DES) is a subtree G'''(DES) of y nodes connected to node A and x - y nodes in a chain [G''(DES)] connected to A. Consider a node C in G''(DES). Three steps after the failure (in which node C receives MSG(DES,∞,1) from A, sends MSG(DES,∞,1) to neighbors, and receives ACK(DES) from neighbors), node C may select a new NN(C,DES). When this occurs, all nodes in G'''(DES) may end up “attached” to node B in a chain as indicated in Fig. 5, before B learns of the failure. In this case O(x) steps are required for all nodes to become unfrozen.

Fig. 4. Structure of tables uptree of a failure.

Fig. 5. Long chain after a failure.

Conversely, to see that at most O(x) steps are required, we note that on each branch of the affected subtree (in any graph), only one step is required to move up one level on the subtree. Since the maximum height of any branch of the original subtree (even with new nodes added at the leaf before notification reaches the leaf) is x, only O(x) steps are required for all nodes to be frozen and then unfrozen. From there on using IUP, only O(h) time is required to obtain optimal paths, where h is the longest chain of the set of affected nodes in the new shortest path tree. Since h < x, the whole procedure requires O(x) steps.

The fact that recovery is as slow as O(x) results from the fact that a node that is unfrozen early (node C in Fig. 4) may reattach to a portion of the shortest path tree which is uptree of the failure (node B). We now present a modification to the algorithm which prevents such an event and thereby provides improved speed of recovery.

The basic idea is not to unfreeze any nodes until all nodes affected by a failure have learned of the failure. This prevents the situation in which a node which is unfrozen early in response to a failure reattaches to a portion of the shortest path tree which is actually uptree of that failure. To accomplish this a wait state is introduced.

Instead of “unfreezing” when uptree nodes have purged “old information,” a node enters a wait state. When the node (node A) that initiated the update cycle enters the wait state, then all nodes have learned of the failure and node A sends an “unfreeze” message uptree, unfreezing all uptree nodes currently in wait state when they receive the message. In this way, no affected node searches for a new next node until all outdated information is purged from the system. Let h1 denote the height of the subtree G'(DES) affected by the failure. Then only O(h1) time is required until all outdated costs have been purged and all nodes unfrozen. Let h2 be the longest path in the shortest path tree after the failure that consists only of nodes from G'(DES). Then after all nodes are unfrozen, only O(h2) time is required to obtain optimal paths using IUP.

This enhanced algorithm generalizes naturally to the multiple failure/cost increase case. A node is unfrozen only when it has received “unfreeze” messages for all freeze messages that were initiated downtree of it, and has received acknowledgments for all freeze messages initiated by it. The detailed implementation is omitted.

VI. COMPARISON TO OTHER ALGORITHMS

In this section we compare the worst case failure recovery performance of the algorithms presented in this paper to the “first-generation” algorithms employing IUP as well as the Merlin-Segall algorithm. The basis for comparison is the test considered in the previous section, in which we count the number of steps required to find optimal paths after a single failure has occurred in the network. In order to fix the notion of “step,” all algorithms are assumed to operate in a synchronized fashion.

First, it may be inferred directly from the discussion at the end of Section II that according to our present criteria, the number of steps required for first-generation algorithms employing IUP is O(n), n = number of nodes in the network.

Next, to analyze Merlin-Segall from this point of view we refer to the discussions in [6]. Briefly, the strong coordination causes every table update to be part of an update cycle that starts from DES, propagates uptree to the leaves, and back downtree to DES. If h is the height of the shortest path tree at the start of the cycle, then the cycle requires h steps to complete. In [6] it is shown that at most h cycles are required for recovery, and it is easy to construct examples which require h cycles where the height h is the same at each cycle. Therefore, the number of steps to recovery is O(h^2).

Finally, the speed of recovery of the new algorithms presented in this paper is discussed in Section V. Table I summarizes the recovery performance of all of these algorithms. We see that the algorithm of Section V performs best of all.

TABLE I
COMPARISON OF SHORTEST PATH ALGORITHMS

Algorithm                         Worst Case No. of Steps for Recovery
First Generation Employing IUP    O(n)
Merlin/Segall                     O(h^2)
New Algorithm (Section 4)         O(x)
New Algorithm (Section 5)         O(h)

where
  n = number of network nodes
  h = height of shortest path tree
  x = number of nodes uptree of failure on the shortest path tree

VII. SUMMARY

This paper has presented a new distributed algorithm for the dynamic determination of weighted shortest paths in a network. The algorithm was developed by focusing attention on the issue of obtaining loop-free paths in the circumstance of changing link weights. It was proved that the cause of loop formation in a well-known class of distributed algorithms (e.g., that used originally in ARPANET) is increasing link weights only. An algorithm which always maintains loop freedom, with two basic variations, was developed to exploit this simple yet useful fact. The new algorithm, particularly the second variation, tends to be more responsive than previous algorithms in terms of the number of steps required to completely recover from a link failure.

APPENDIX
COMPLETE DESCRIPTION OF PROTOCOL

Normal State

1) If A receives MSG(DES,C,0) from neighbor Bi, or d(A,Bi) decreases do IUP.
2) If A receives MSG(DES,C,1) from Bi and NN(A,DES) ≠ Bi
   a) set C(A,DES,Bi) = C + d(A,Bi)
   b) send ACK(DES) to Bi.
3) If d(A,Bi) increases or (A,Bi) fails and NN(A,DES) ≠ Bi, set C(A,DES,Bi) = C(A,DES,Bi) + Δd(A,Bi).
4) If A receives MSG(DES,C,1) from Bi and NN(A,DES) = Bi
   a) set C*(A,DES) = C(A,DES,Bi) = C + d(A,Bi)
   b) send to all neighbors MSG(DES,C*(A,DES),1)
   c) set Vj(A,DES) = Vj(A,DES)*0 for every j
   d) go to freeze state.
5) If d(A,Bi) increases and NN(A,DES) = Bi
   a) set C*(A,DES) = C(A,DES,Bi) = C(A,DES,Bi) + Δd(A,Bi)
   b) send to all neighbors MSG(DES,C*(A,DES),1)
   c) set Vj(A,DES) = Vj(A,DES)*1 for every j
   d) go to freeze state.
6) If (A,Bi) fails and NN(A,DES) = Bi do the same as step 5 with Δd(A,Bi) = ∞ and also set NN(A,DES) = empty.
7) If a failed link (A,Bj) recovers, send MSG(DES,C*(A,DES),0) to Bj for every DES.

Freeze State

1) If A receives MSG(DES,C,0) from neighbor Bi set C(A,DES,Bi) = C + d(A,Bi).
2) If d(A,Bi) changes by Δd(A,Bi) < 0, set C(A,DES,Bi) = C(A,DES,Bi) + Δd(A,Bi).
3) If d(A,Bi) increases and NN(A,DES) ≠ Bi set C(A,DES,Bi) = C(A,DES,Bi) + Δd(A,Bi).
4) If A receives MSG(DES,C,1) from Bi and NN(A,DES) ≠ Bi
   a) set C(A,DES,Bi) = C + d(A,Bi)
   b) send ACK(DES) to Bi.
5) If A receives MSG(DES,C,1) from Bi and NN(A,DES) = Bi do step 4 of normal state.
6) If d(A,Bi) increases and NN(A,DES) = Bi do step 5 of normal state.
7) Let Vi(A,DES) = xy, x ∈ {0, 1}. If A receives ACK(DES) from Bi then
   i) if |Vi(A,DES)| > |Vj(A,DES)|, ∀j ≠ i and x = 0 then
      a) Vi(A,DES) ← y
      b) send ACK(DES) to NN(A,DES)
      c) if |Vj(A,DES)| = 0 ∀j update NN(A,DES) and C*(A,DES), send MSG(DES,C*(A,DES),0) to all neighbors and go to normal state
   ii) else
      a) Vi(A,DES) ← y
      b) if |Vj(A,DES)| = 0 ∀j update NN(A,DES) and C*(A,DES), send MSG(DES,C*(A,DES),0) to all neighbors and go to normal state.
8) If (A,Bi) fails and NN(A,DES) ≠ Bi
   a) do step 3) of freeze state
   b) do step 7) i) of freeze state until |Vi(A,DES)| = 0.
9) If (A,Bi) fails and NN(A,DES) = Bi
   a) do step 6) of normal state
   b) for all j set Vj(A,DES) ← Vj(A,DES) ∨ 11...1 (kj ones), where kj = |Vj(A,DES)|
   c) do step 8) of freeze state.
10) If a failed link (A,Bj) recovers, send MSG(DES,C*(A,DES),0) to Bj for every DES.

Remark: 9) b) prevents A from sending any ACK's to Bi in case (A,Bi) comes up again.

REFERENCES

[1] M. Schwartz and T. E. Stern, “Routing techniques used in computer communication networks,” IEEE Trans. Commun., vol. COM-28, pp. 539-552, Apr. 1980.
[2] J. M. McQuillan, “Adaptive routing algorithms for distributed computer networks,” Bolt Beranek and Newman Inc., BBN Rep. 2831, May 1974.
[3] W. D. Tajibnapis, “A correctness proof of a topology information maintenance protocol for distributed computer networks,” Commun. Ass. Comput. Mach., vol. 20, pp. 477-485, July 1977.
[4] J. M. McQuillan, G. Falk, and I. Richer, “A review of the development and performance of the ARPANET routing algorithm,” IEEE Trans. Commun., vol. COM-26, pp. 1802-1811, Dec. 1978.
[5] J. M. McQuillan, I. Richer, and E. C. Rosen, “The new routing algorithm for the ARPANET,” IEEE Trans. Commun., vol. COM-28, May 1980.
[6] P. M. Merlin and A. Segall, “A failsafe distributed routing protocol,” IEEE Trans. Commun., vol. COM-27, pp. 1230-1237, Sept. 1979.
[7] T. E. Stern, “An improved routing algorithm for distributed computer networks,” presented at the ICCAS/80 Workshop on Large Scale Systems.
