5 Brewster Lane, Bellport, New York 11713-2803

Phone: 1-516-286-1339, Fax: 1-516-286-1999

E-mail: yaya@bernstein-plus-sons.com

May 1, 1988

Revised April 26, 1997

by

Herbert J. Bernstein

Click Here for PDF version

Abstract

1. Introduction

2. A Simple Three Node Example

3. A More General Case -- Inhomogeneous Rates, Many Paths

4. Unequal Service Rates on a Given Path

5. Instability in the General Case

Acknowledgement

References

Computer Science Department, New York University,

May 1988, 11 pp.

The author of this document provides it in the good faith belief that the information
contained therein in correct, but, as with any document, there may be errors and
omissions, and this document may not be applicable in any particular circumstances.
Comments, corrections and suggestions may be directed to yaya@bernstein-plus-sons.com.
Under no circumstances should this document be considered as an alternative to consultation
with appropriately trained professionals.

There are problems with the distributed algorithms. Kleinrock [10] pointed out that "... uncontrolled alternate routing in a congested net can lead to chaos. Indeed, the telephone company tends to limit (and even prohibit completely) alternate routing on unusually busy days (Mother's Day, for example)." As Schwartz notes of a common shortest path algorithm, "Although convergence to the shortest path is guaranteed, routing table entries may change during the convergence period, giving rise to possible loops during that interval," [11, p 277]. Even if one suppresses the creation of loops, there can be serious problems. When the offered load on which routing calculations are being done varies significantly on time-scales commensurate with the convergence period of the routing algorithm, one has created a feedback control system which can oscillate for very long periods. The reader is referred to standard texts on Control Theory, e.g. [1].

In this note, we present a simple example of the type of instability which can result from computing an optimal bifurcated routing for a load which changes on the time-scale of the calculation. While the example was created to clearly demonstrate the sub-optimal results of optimal routing in this case, it is not, in our opinion and observation of the Internet, unrealistic. (The Internet is a loose confederation of networks able to provide a reasonable degree of interoperability for users on connected hosts. See [5].)

As a contrast to optimal routing, we will mention routing found by the Maximum Entropy Method (MEM) [2-4, 8, 9, 12]. MEM, originally due to Jaynes, comes from the interaction of Information Theory with Statistical Mechanics. It works for underdetermined systems, producing the smoothest answer consistent with the data. In the case at hand, it would produce network flows equalized for whatever parameter we wish to smooth: traffic by links, total queue size along paths, etc. Since we need no more than that qualitative approach for our example, we will not present further detail on Maximum Entropy here.

We will form our examples by taking static cascades of independent M/M/1 queues as our model of multi-hop network paths, ignoring blocking, dependence in forwarding, and many other effects. Most importantly, we will ignore the probable failure of stochastic equilibrium by using M/M/1 queuing models even though we vary the customer arrival rate. In a network of large diameter and heavy traffic, it is not unreasonable to assume that the time scales of routing calculations are sufficiently large to consider them as leading to approximate equilibrium queue-by-queue. It would be a good idea to do a more accurate non-equilibrium model, but that would take us beyond our simple pedagogical objective into material better suited to a research paper.

We will start with a trivial three-node network. Consider a simple example of a network
consisting of three nodes, *A*, *B*, *C*,
where *A * has traffic l for *B*
and may route it either directly or via *C*.
We assume independent M/M/1 queues at each node, and an equal service
rate for each queue. We assume that *A *
supplies a Poisson distributed stream of packets at rate
along the direct route and (1 - ) along the indirect route.
Our routing decision is to choose the "best" value of .
Assume we seek to control average delay. The average delay is

This equation becomes clearer if we define

use a dimensionless delay *W *
(essentially queue length plus one) and define

and we seek to minimize *W *
by varying . In this case, it is sufficient to look at the boundary
case = 1 ( = .5 ) and at the location of zeros of the derivative of *W *
with respect to .

The zeros of the derivative are given by

The lower root is the one in the proper range ( -.5 <= <= .5) as long as > .293. Below that we use the direct route. Above that value the routing bifurcates.

Consider a situation in which the offered load switches between,
say, = .3 and =.9, spending about half its time at each level.
This might be due to the inherent characteristics of the applications,
or, perhaps due to a periodic sensing of overload at the higher load
and a backing off to the light load to relieve congestion. The average load is then
= . 6, and we face the choice of routing for the instantaneous values or for the average.
The "optimum" values of a for these values of are:

From this table, we observe that the average delay using the optimum a values for the = .3 and
=. 9 cases is *W * = 1.99, which is slightly below the average 2.13 of the*W *
values for alpha .7. We gained about seven percent by using highly dynamic routing. In fact, if we had
used a dynamic shortest path routing, which would have taken = 1.0
in all these cases, we would have paid a serious delay penalty. Worse yet, suppose
we had a processing time for the dynamic algorithm comparable to the cycle time of
the load and switched to the values of for = .3 just
when the load switched to = .9 and vice-versa. Then we would have an average delay for
dynamic routing of *W * > 5.3.* *
(For =. 3 we would have used = . 6 and gotten a delay of*W * = 1.64,
while for = . 9 we would have used = .99 and gotten a delay of*W *
= 9.10). Of further interest is the fact that the Maximum Entropy
value of = . 5 (assuming we balance traffic by links, not paths)
gives a true average delay of *W * = 2.24 using the correct delay
values of *W * = 1.76 for = .3 and *W * = 2. 72 for = .9,
a penalty of about fifteen percent for taking too low a value of .

If we look at the Maximum Entropy solution for traffic balanced by paths, i.e. making the average queue on the direct path equal to the average queue on the indirect path, we obtain bifurcated routing for all values of :

with an average delay for our test case using = . 6 for the alternating traffic of 2.10,
a penalty of about five percent. Our conjecture
is that the optimal dynamic routing solutions are, in many cases, similarly unstable
under reasonable dynamic load, and that the Maximum Entropy routings will prove a more robust
starting point for distributed dynamic adjustments.

Thus consider the extension of this analysis to a network with a single client offering
traffic load trying to reach a server via *n*
node-disjoint paths, where each path, *i*, acts as a cascade of *l _{i} *
independent M/M/1 queues of service rate

Define

and compute the partial derivatives which will be needed in finding minima of *W*
.

for *i* = 1, ..., *n* - 1 and

by taking * _{n}* = 1 -

*i.e.* when

for some independent of *i*, *i* = 1, ..., *n*.
We will solve these equations for * _{i}*, but first note that

with * _{i}* = +/-1. The negative value of

We can solve for by

so that

While all the resulting values of * _{i}* are certainly critical
points of

which, as the sum of a diagonal matrix with positive terms and a scalar times the
matrix of all ones, is positive definite as long as we have * _{i}* >

To select the paths to be dropped, order the paths so that

is monotone non-increasing with *i*, *i.e.*
so that the path with the fastest hop-corrected service rate comes first and the slowest
path comes last. Compare to

If is smaller, drop path *n*
and recompute on the reduced network, since in that case *[n]*
will be negative. To see this 0 > *[n]*
if and only if

if and only if

from which the bound on follows.

We can actually drop more such lines at the same time, since the effect of taking out lines with negative a is to reduce load on other lines, but we cannot assume that the calculation need not be repeated for the reduced set, since we have no assurance that more a will not go negative with this reduced load.

Once we enter a regime in which the critical points are indeed the minima, we can
compute the minimal *W*
from

Then define

as the pseudo-hop count to use of links all of rate * _{i,1}*.
The difference between the delay estimated on this path with the pseudo-hop count
and the real delay is

with equality at * _{i} * = 0 and for equal service rates.

We could extend this example to chains of M/G/1 queues, or even to more general models of the node-disjoint paths, but qualitatively we expect the same basic behavior. If we do our route planning for a light load case, we will tend to favor the "shortest" paths. If the load then forces those paths onto their delay curve knees, that routing will be significantly worse than a route plan which off-loaded some portion of the excess load onto longer paths earlier. It is tempting to think that we can solve this problem by responding to the load change quickly enough. The calculation of an optimal bifurcated route for the node-disjoint M/M/1 cascaded path model is simple, requiring only accurate data on hop counts and service rates. The difficulty lies in gathering the data, not in using it. If we rely on multi-hop distributed reporting of effective service rates and connectivity, by the time we have it available for use, it may well be out of date. At the very least, if we must compute "optimal" routing, we should do so not for the current load, but for a load which we can reasonably expect not to exceed often until the next routing update. Accumulation of variances of loads and delays would make such estimates feasible.

We wish to thank Frances C. Bernstein, Milan Tuba and the students in G22.2263 for their helpful comments.

2. H. J. Bernstein, *Tutorial on Maximum Entropy*,
Philadelphia, Pennsylvania (6 February 1986). Lecture at the Institute for Cancer
Research, Fox Chase Center.

3. H. J. Bernstein, *Digital Communications*,
New Paltz, New York (9 October 1986). Lecture presented in 1986 New Horizons in Physics
and Engineering Lecture Series on Telecommu-nications at the College of New Paltz,
State University of New York.

4. G. Bricogne, "Maximum Entropy and the Foundations of Direct Methods," *Acta Crystallo-graphica*,
A40 (4), pp. 410-445 (1984).

5. V. Cerf, "The Catenet Model for Internetworking," *DARPA/IPTO*,
IEN-48 (July, 1978).

6. L. Fratta, M. Gerla, and L. Kleinrock, "The Flow Deviation Method: An Approach
to Store and Forward Communication Network Design," *Networks*,
3, pp. 97-133 (1973).

7. M. Gerla, "The Design of Store and Forward Networks for Computer Communications,"
*Ph.D. Thesis*,
Dept. of Computer Science, UCLA (1973).

8. E. T. Jaynes, "Information Theory and Statistical Mechanics," *Physical Review*,
106 (4), pp. 620-630 (15 May 1957).

9. E. T. Jaynes, "Information Theory and Statistical Mechanics. II," *Physical Review*,
108 (2), pp. 171-190 (15 October 1957).

10. L. Kleinrock, "Computer Networks" in *Computer Science*,
ed. A. F. Cardenas, L. Presser & M. A. Marin, pp. 241-284, Wiley-Interscience, New
York (1972).

11. M. Schwartz, *Telecommunications Networks*,
Addison-Wesley Publishing Company, Read-ing, Massachusetts (1987).

12. C. E. Shannon, "A Mathematical Theory of Communication," *The Bell System Technical Journal*,
XXVII (3), pp. 379-423, 623-656 (July 1948).