[Jeff Mayersohn <mayersoh@cc5.bbn.com>: more on arpanet]

Wed 8 Oct 86 07:13:37-EDT

The latest in the ongoing saga, or "How to train a neglected child to
behave properly"


Received: from CC5.BBN.COM by vax.darpa.mil (4.12/4.7)
        id AA29722; Wed, 8 Oct 86 01:45:47 edt
Message-Id: <8610080545.AA29722@vax.darpa.mil>
To: prishivalko@ddn1.ARPA, grindle@ddn1.ARPA, leonard@ddn1.ARPA,
To: perry@vax.darpa.mil, blumenthal@cc5.bbn.com, hinden@cc5.bbn.com,
        mckenzie@cc5.bbn.com, pogran@cc5.bbn.com
To: jburke@cc5.bbn.com, bartlett@cc5.bbn.com
Cc: mayersoh@cc5.bbn.com, jwiggins@cc5.bbn.com, cgreenleaf@cc5.bbn.com,
        rpyle@cc5.bbn.com, fserr@cc5.bbn.com
Subject: more on arpanet
Date: 08 Oct 86 01:34:20 EDT (Wed)
From: Jeff Mayersohn <mayersoh@cc5.bbn.com>

I wanted to bring everyone up to date on the progress of our
investigation into Arpanet congestion.

There has been one major "discovery" made during the last week.
Apparently, in the middle of August, the line between the two USC
packet switches, which used to be a stub, was placed into the main
cross-country path. The problem is that this line, which connects two
endpoints on the USC campus, is apparently running at 19.2 kbps. This
change to the network topology is like closing a lane on Storrow Drive
during rush hour. It congests both the main artery and the roads
which feed it. This is undoubtedly a major contributor to the
cross-country congestion that we are seeing; the line should have its
capacity increased at once. This piece of information has been
communicated by Bob Steele, here at BBNCC, to the Arpanet manager at

In the past few weeks, it has been observed that some of the links on
the major cross-country paths have been bouncing up and down. We
believe this is due to a known problem in the microcode; the Arpanet
has not been running the most current microcode release. The newer
release is in the process of being installed. TAC 113, which contains
several efficiencies, is also in the process of being installed. Jeff
Burke tells me that these upgrades will be finished within the week.

Tracy Mallory and Bob Hinden tell me that a new release of the
mailbridge was installed in the network and changes have been made to
the routing tables in the Butterfly gateways which cause some traffic
to favor the Wideband Network over the Arpanet.

John Wiggins and Clive Greenleaf have taken a number of measurements
on the Arpanet and have changed several parameters in the packet
switch software. First, it was observed that routing updates were
being generated at very close to the maximum frequency, a sure sign
that routing is thrashing in its attempt to deal with congestion.
Changes were made to line parameters to stabilize routing by reporting
more or less equivalent delays on the three cross-country paths. It
was expected that this would reduce the pointless movement of traffic
from one trunk to another. In addition, it was observed that there
were end-to-end resource shortages in a number of packet switches.
John modified a parameter which reduces the amount of time that a
source packet switch can hold on to resources that it has reserved in
a destination packet switch. The hope is that this would alleviate
some of the contention for end-to-end resources.
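
To make the routing argument concrete, here is a toy sketch of why
load-sensitive delay reports can make traffic thrash between parallel
paths, and why reporting roughly equal delays stops it. It is not the
packet switch routing code. The three idle delays, the load penalty,
and the all-or-nothing traffic shift are assumptions chosen only for
illustration.

```python
# Toy model only: three parallel paths, and all cross-country traffic
# follows whichever path currently reports the lowest delay.
# The numbers and the all-or-nothing traffic shift are assumptions.

BASE_DELAY = [10.0, 11.0, 12.0]   # hypothetical idle delay per path
LOAD_PENALTY = 20.0               # hypothetical extra delay when loaded


def load_sensitive(load):
    """Report each path's delay including the traffic it now carries."""
    return [b + LOAD_PENALTY * l for b, l in zip(BASE_DELAY, load)]


def equalized(load):
    """Report the same delay on every path, whatever the load."""
    return [10.0, 10.0, 10.0]


def simulate(report, periods=6):
    """Return the path carrying the traffic in each routing period."""
    load = [0.0, 0.0, 0.0]
    current = 0
    chosen = []
    for _ in range(periods):
        delays = report(load)
        best = min(delays)
        # Move only if some other path reports a strictly lower delay.
        if delays[current] > best:
            current = delays.index(best)
        chosen.append(current)
        load = [1.0 if i == current else 0.0 for i in range(3)]
    return chosen


def moves(chosen):
    """Count how many times the traffic changed path."""
    return sum(1 for a, b in zip(chosen, chosen[1:]) if a != b)


print(simulate(load_sensitive), moves(simulate(load_sensitive)))
# -> [0, 1, 0, 1, 0, 1] 5
print(simulate(equalized), moves(simulate(equalized)))
# -> [0, 0, 0, 0, 0, 0] 0
```

In the load-sensitive case the traffic chases the idle path every
period and so never settles. With equal reports there is never a
strictly better path, so the traffic stays where it is.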

The changes described in the last paragraph and (I believe) the
installation of the new mailbridge release were made on October 2.
Measurements made by Wiggins and Greenleaf on October 3 suggest that
the changes had a positive effect. Traffic in the network increased
by about 30% (from October 2 to October 3) during the peak period and
round trip delays were halved. The major symptoms of congestion
persisted, however.

There are two other observations to be made. First, there appears to
have been a change in some of the characteristics of the network
traffic recently. In particular, we have observed an increase in the
distance between communicating hosts from an average of 2.75 in June
to 3.54 in October. This is worth looking into.
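
A rough way to see why this matters, assuming the distance figures are
hop counts (the message does not say so) and that each packet uses one
trunk transmission per hop: for the same host traffic, total trunk
load scales with the average path length. The helper name below is
hypothetical.

```python
def trunk_load_ratio(old_mean_hops, new_mean_hops):
    """Ratio of total trunk transmissions for the same host traffic,
    assuming one trunk transmission per packet per hop."""
    return new_mean_hops / old_mean_hops


JUNE_MEAN = 2.75      # average distance reported for June
OCTOBER_MEAN = 3.54   # average distance reported for October

ratio = trunk_load_ratio(JUNE_MEAN, OCTOBER_MEAN)
print(f"trunk load ratio: {ratio:.3f}")
print(f"extra trunk load: {(ratio - 1) * 100:.1f}%")
```

Under those assumptions the longer paths alone would add a bit under
30% to the load on the trunks, before any growth in the traffic
offered by hosts.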

Second, some of the mail on TCP-IP questions why cross-country
subnetwork congestion should affect traffic between, say, Stanford and
SRI. Clive Greenleaf has looked into this and has produced the
following explanation. In order to send a message between a pair of
packet switches, resources are used in both the source and
destination. What is happening is that the long delays across the
network are causing these resources to be held for long periods of
time. The fact that these resources are so occupied is affecting all
other traffic to and from a given switch, even if that traffic is to
or from adjacent switches, or local to the switch. The
Stanford IMP, which seems to send traffic to a large number of remote
destinations, and the SRI IMP appear to be major victims of this
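
The effect of long delays on held resources can be sketched with
Little's law: the mean number of resources in use equals the rate at
which messages claim them times the time each one is held. The pool
size and message rate below are hypothetical, not measured values, and
the result is a steady-state average rather than a model of the switch
software.

```python
def mean_resources_held(messages_per_sec, hold_time_sec):
    """Little's law: mean number in use = arrival rate * holding time."""
    return messages_per_sec * hold_time_sec


POOL_SIZE = 10     # hypothetical number of end-to-end resources
MESSAGE_RATE = 20.0  # hypothetical messages per second at one switch

# Longer delays across the network mean each resource is held longer.
for hold_time in (0.25, 0.5, 1.0):
    demand = mean_resources_held(MESSAGE_RATE, hold_time)
    status = "exceeds" if demand > POOL_SIZE else "fits within"
    print(f"hold {hold_time:.2f} s: mean demand {demand:.0f}, "
          f"{status} a pool of {POOL_SIZE}")
```

Once the mean demand exceeds the pool, every message through that
switch has to wait for a resource, including messages to a neighboring
switch whose own path is not congested at all.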

Anyway, here's where we stand:

1) The USC line should be upgraded asap.

2) The microcode release which should eliminate the flapping of key
lines is being installed.

3) The TAC release which should reduce TCP retransmissions and network
overhead is being installed.

4) We are making store-and-forward statistics measurements to identify
all subnetwork bottlenecks.

5) We will produce a complete host traffic matrix in order to
determine how the network traffic has changed and to determine whether
any hosts are exhibiting antisocial behavior.

6) The topological changes recommended in my previous message should
be made. Recall that the network was showing symptoms of congestion
in June, before the USC line was thrown into the cross-country path.

7) We intend to produce a new set of recommended assignments of hosts
to mailbridges and may recommend the addition of mailbridges when this
analysis is complete.
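
As an illustration of item 5, a host traffic matrix can be built from
per-pair packet counts and then summarized by source to show which
hosts account for most of the traffic. This is a minimal sketch. The
record layout, the host names, and the counts are all hypothetical and
do not come from any measurement.

```python
from collections import defaultdict


def build_traffic_matrix(records):
    """Sum packet counts into matrix[source][destination]."""
    matrix = defaultdict(lambda: defaultdict(int))
    for source, destination, packets in records:
        matrix[source][destination] += packets
    return matrix


def share_by_source(matrix):
    """Return each source host's fraction of all packets sent."""
    totals = {src: sum(row.values()) for src, row in matrix.items()}
    grand_total = sum(totals.values())
    return {src: total / grand_total for src, total in totals.items()}


# Hypothetical sample: (source host, destination host, packets)
sample = [
    ("host-a", "host-b", 10),
    ("host-a", "host-c", 30),
    ("host-b", "host-a", 10),
]

matrix = build_traffic_matrix(sample)
shares = share_by_source(matrix)
for host in sorted(shares, key=shares.get, reverse=True):
    print(f"{host}: {shares[host]:.0%} of packets sent")
```

Comparing matrices taken at different times would show how the traffic
pattern has shifted, and a source with an outsized share is a natural
place to start looking for misbehaving hosts.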
