Charles Hedrick (firstname.lastname@example.org)
Tue, 14 Jan 86 00:52:23 est
Since my posting yesterday, I have given a bit more thought to the
issue of keeping track of network topology. I got several responses
acknowledging that the issue was an important and difficult one, but
none proposing any real solutions. So it seemed worth putting a bit
more thought into the issue. While I haven't come up with any
startling innovations, I think I see a couple of approaches that would
work. First, let me start by enumerating the possibilities that I
have seen. We have several issues. The first is how hosts keep track
of what gateways are up. The second is how hosts keep track of
changes in gateway status. The third is how hosts know what gateways
exist. Of course these are not orthogonal.
Keeping track of what gateways are up:
pinging - every host sends an echo request to every gateway that
it knows about every 30 sec. or so. Most people consider
this unacceptable because it generates too much network
traffic. TOPS-20 does this, though with an interval of
several minutes. I believe it must be done every 30 sec.,
because we have to be able to discover that a gateway is
down in time to move to another one before connections start
timing out or users start thinking that the system is down.
gateway broadcast - every gateway sends a broadcast every 30 sec.
For a network that supports broadcasts, this gives as good
results as pinging, but the number of packets is far smaller.
PUP and XNS gatewayinfo do this. So does Unix routed. The
only disadvantage I can see is that it only works if the
network supports broadcasts, and that it may not be so good
for single-process systems (e.g. IBM PC). On an IBM PC, you
can't just have a daemon sitting there keeping track of what
networks are up. Telnet would have to wait a minute or two
gathering gateway information before starting to make the
connection.
host broadcast - when a host wants to make a connection, it sends
a broadcast asking for any gateways to a certain host to
respond. This is effectively done now by ARP-hacking gateways.
Since an ARP is needed anyway to initiate a connection, it
adds no overhead. This strategy is appropriate for single-
process systems. The only disadvantage I can think of is that
it only works on media that support broadcast. Note that in
a complex network, this strategy requires that the gateways
have some other way to keep track of each other. They must
arrange things so that only the preferred gateway will respond
to an ARP.
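As a rough illustration of the gateway broadcast approach, a host could keep a table of when each gateway was last heard from and treat a gateway as down once it has been silent for a few broadcast intervals. This is only a sketch: the class name `GatewayTable`, the `missed_allowed` parameter, and the gateway names are hypothetical, and the choice of three missed broadcasts is an assumption, not something specified above.

```python
class GatewayTable:
    """Tracks which gateways are believed up, based on their periodic broadcasts."""

    def __init__(self, broadcast_interval=30.0, missed_allowed=3):
        # Assumption: a gateway is presumed down after this many seconds of silence.
        self.dead_after = broadcast_interval * missed_allowed
        self.last_heard = {}  # gateway name -> time of its most recent broadcast

    def heard_broadcast(self, gateway, now):
        """Record that a broadcast from `gateway` arrived at time `now` (seconds)."""
        self.last_heard[gateway] = now

    def live_gateways(self, now):
        """Return the sorted names of gateways heard from recently enough."""
        return sorted(
            gateway
            for gateway, heard_at in self.last_heard.items()
            if now - heard_at <= self.dead_after
        )
```

The host sends nothing itself, so the packet count grows with the number of gateways rather than with hosts times gateways, which is the advantage over pinging noted above.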
Keeping track of changes. These techniques would normally be combined
with those above.
timeouts - when a connection times out, one has a good suspicion
that some part of the current route is down. What to do
about it depends upon which of the above strategies one is
using. If you are using pinging or gateway broadcast,
strictly speaking you don't need to do anything about timeouts.
4.3 uses timeouts because 4.3 establishes a route when a
connection is opened. Even if routed has figured out that
the gateway involved is down, the connection will still try
to use it. A timeout triggers the system to reexamine the
route, using its latest gateway information. On TOPS-20,
this is not needed, since the route is recomputed for each
packet sent. If you are depending upon a host broadcast
(e.g. ARP), a timeout should cause the current route (in this
case ARP table entry) to be removed, so that the host sends
another broadcast to look for a new route. Note that timeouts
do not totally solve the problem of detecting down gateways,
if we have traffic to some gateway (or for ARP-based schemes,
host) that is not connection-oriented. That is, UDP-based
protocols may not have a concept of timeout, or may find it
hard to feed back information about timeouts to lower levels
of the system.
ICMP redirect - depending upon the design, the system may not know
when a better route has become available. Again, TOPS-20
always will, because it recomputes routes each time, and
continually pings all gateways. But 4.2 will not change
routes during a connection. And a system that depends upon
the ARP hack probably doesn't have enough information to do
so either. So one can arrange for gateways to keep track of
each other, and to issue an ICMP redirect if a better route
becomes available. Note that this does not necessarily
require the host to keep track of gateway information. If
all of the gateways do the ARP hack, a host can process an
ICMP redirect simply by removing the ARP table entry for
the destination host involved.
ARP table expiration - Unix expires entries in the ARP table after
N minutes of non-use. This is primarily intended to keep
down the number of entries in the ARP table. However in
theory this could be used to keep routing up to date. If
we expired entries even when they are in use, it would
force a new ARP request. This would (we hope) come back
with the latest routing, taking into account any gateways
that have come up or gone down. The problem with doing
this is that it would increase the number of ARP requests.
If we only use it to discover better routes, we could afford
to do it fairly infrequently, say once every 30 min. If we
depend upon it to discover gateways that are down, we probably
have to do it every 30 sec. This is likely to cause results
that are about as bad as pinging. It would also interfere
with performance, since our experience shows that waiting
for an ARP causes a noticeable pause in telnet. Doing this
once every 30 min is not likely to cause a significant
load. Suppose we have 256 hosts on a subnet, each talking
to 4 other hosts at a time (this is probably a gross
overestimate for any real network). That is about 1000 ARP
requests in 1800 sec. This is a packet rate of roughly one
every 2 seconds. That should be tolerable. However the
requests are probably not going to be random. There may be
a tendency for them to cluster, due to the fact that all of
the systems will have been rebooted at the same time (the
last power failure).
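The load estimate for forced ARP expiration can be checked with a few lines of arithmetic. The function name `arp_request_rate` is hypothetical, and the model assumes requests are spread evenly over the expiry interval, which (as noted above) may not hold if all hosts were rebooted together.

```python
def arp_request_rate(hosts, peers_per_host, expiry_seconds):
    """Average ARP requests per second if every in-use entry is re-requested
    once per expiry interval (assumes requests are evenly spread in time)."""
    return hosts * peers_per_host / expiry_seconds


# 256 hosts, each talking to 4 others, entries expired every 30 minutes.
slow = arp_request_rate(256, 4, 30 * 60)

# The same subnet with entries expired every 30 seconds.
fast = arp_request_rate(256, 4, 30)
```

With a 30-minute expiry this gives about 0.57 requests per second, under one per second. With a 30-second expiry it gives about 34 per second, which is why that interval looks about as bad as pinging.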
Knowledge of network topology.
builtin tables - this is fairly common, but with a large network
it becomes a pain to update all the tables.
gateway broadcasts - the gateway broadcast strategy mentioned
above also solves this problem, since it allows the host
to discover what gateways exist simply by monitoring
the broadcasts.
host broadcasts - the host broadcast strategy mentioned above
also solves this problem, since the host no longer has to
know the network topology. When it needs to make a connection
it broadcasts a request and the gateways have to figure out
who should respond. To take advantage of changes in topology,
this should be combined with ICMP redirects when a better route
becomes available.
try a random gateway - TOPS-20 keeps a table with a small number
of "prime" gateways. When it wants to make a connection,
and none of the currently known gateways is right for the
job, it chooses a random prime gateway. This gateway is
expected to know about all of the others, and to issue an
ICMP redirect to the right one. However this only works
if one knows which of the prime gateways are up. TOPS-20
uses pinging. Any other solution to the problem of knowing
which gateway is up will also solve the problem of knowing
what gateways there are, so this strategy is probably not
Some choices are clear:
- we probably don't want pinging to be the primary method of
keeping track of the network.
- ARPs are probably the only reasonable way for single-process
machines to find out about the network, since they can't
be expected to have daemons that keep track of topology.
This implies that all gateways should be expected to
support the "ARP hack", even when subnetting is general
Now the question is whether we also want the gateways to broadcast,
a la routed. My initial reaction is that if we can come up with
a mechanism based on ARPs that will solve all of our problems, there
is no need to run routed or its equivalent on each host. So first,
let's look at a design based on ARPs, and no protocol like routed.
- connections are initially established by issuing an ARP
request. The gateways arrange to answer these in such a
way as to give an optimal route.
- when a connection times out, the ARP table entry for the host
involved is removed. This forces a new ARP for the next
packet to be sent.
- when a better route becomes available, it would be helpful
if the gateway currently being used issues an ICMP
redirect. Because of the timing out of ARP table
entries, this is not completely necessary.
- if non-connection-oriented protocols are being used (so that
timeouts are not possible), or if it is not practical
for gateways to issue ICMP redirects when a better route
becomes available, ARP table entries must expire after
This mechanism is obviously not sufficient for hosts with more than
one Ethernet interface, since they have no way to choose which
interface to use. ARPs don't help, since in general some other
gateway will probably be able to find a route to any host on any
subnet, so there will be responses to ARP requests on both interfaces.
However a host with more than one interface is effectively a gateway.
It should participate in whatever protocol is used among the gateways,
probably EGP or routed.
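For a single-interface host, the ARP-only design described above might be sketched as a small cache with three ways of dropping an entry: a connection timeout, an ICMP redirect, and a hard age limit that applies even to entries in use. Everything here is illustrative: `ArpRouteCache`, its method names, and the `resolve` callable (which stands in for broadcasting an ARP request and waiting for the preferred gateway to answer) are hypothetical, and the 1800-second default is just the 30-minute figure discussed earlier.

```python
class ArpRouteCache:
    """Host-side model of routing by ARP alone (a sketch, not kernel code)."""

    def __init__(self, resolve, max_age=1800.0):
        self.resolve = resolve  # callable: IP address -> hardware address
        self.max_age = max_age  # seconds before an entry expires, even if in use
        self.entries = {}       # IP address -> (hardware address, time resolved)

    def lookup(self, ip, now):
        """Return the hardware address to send to, re-ARPing if needed."""
        entry = self.entries.get(ip)
        if entry is None or now - entry[1] >= self.max_age:
            entry = (self.resolve(ip), now)
            self.entries[ip] = entry
        return entry[0]

    def on_connection_timeout(self, ip):
        """A connection to `ip` timed out: forget the entry so the next packet re-ARPs."""
        self.entries.pop(ip, None)

    def on_icmp_redirect(self, ip):
        """A gateway reported a better route to `ip`: same treatment as a timeout."""
        self.entries.pop(ip, None)
```

Note that the host never learns which gateways exist. It only learns which hardware address answered most recently, so all of the routing intelligence stays in the gateways.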
There are several reasons why one might prefer some other mechanism
for hosts that are capable of running daemons:
- if UDP-based protocols are in heavy use, it may be impractical to
detect down gateways by depending upon timeouts. For Suns,
the Network File System is critical, and that uses UDP.
While NFS does have a concept of timeout, our experience
shows that timeouts may indicate a number of conditions
other than routing failures. It is not clear whether it
would be appropriate to clear ARP table entries when there
is an NFS timeout.
- one may believe that it is not practical to implement ICMP
redirect in the gateways when a better route becomes
available, and that the overhead of expiring ARP entries
If one decides that another scheme is needed other than having the
host broadcast requests, it seems clear that the best alternative is
to have the gateway broadcast the fact that it is up. In that case,
routed seems to make a lot of sense. It is widely implemented, and
seems to do what needs to be done. In a Unix implementation, one also
needs a way to force routes to be recomputed when there is a change in
the gateway table. The method in 4.3 seems to depend upon timeouts.
I suspect it might be better to have an IOCTL that routed could do to
invalidate routes (either all routes whenever a topology change
happens, or some slightly more selective method).
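One possible shape for such an invalidation mechanism, modeled in user-space Python rather than as a real IOCTL, is a generation counter on the routing table: each connection remembers the generation at which it cached its route and looks the route up again when the counter has moved. The names `RoutingTable`, `Connection`, and `invalidate_all` are hypothetical, and this is the coarse "all routes" variant, not the more selective one.

```python
class RoutingTable:
    """Model of the gateway table that a routing daemon maintains."""

    def __init__(self):
        self.routes = {}     # destination network -> gateway
        self.generation = 0  # incremented whenever cached routes must be rechecked

    def set_route(self, network, gateway):
        """Install or change a route and invalidate every cached route."""
        self.routes[network] = gateway
        self.generation += 1

    def invalidate_all(self):
        """Stand-in for the proposed call that forces all routes to be recomputed."""
        self.generation += 1


class Connection:
    """A connection that caches its route until the table says otherwise."""

    def __init__(self, table, network):
        self.table = table
        self.network = network
        self._gateway = None
        self._seen_generation = -1  # forces a lookup on first use

    def gateway(self):
        """Return the cached gateway, refreshing it if the table has changed."""
        if self._seen_generation != self.table.generation:
            self._gateway = self.table.routes.get(self.network)
            self._seen_generation = self.table.generation
        return self._gateway
```

The per-packet cost is one integer comparison, so an established connection picks up a topology change on its next packet without waiting for a timeout.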
Unfortunately, the problem I have to solve is not just picking the
combination of strategies that I like the best. I also have to be
able to live with existing TCP/IP implementations. Currently Rutgers
is using 4.2 (Sun, Pyramid, Celerity, Ultrix), TOPS-20, DG, Symbolics,
Bridge, ... We only have source to some of these, and even where we
do have source, it may not be desirable to do major network development
work. If we are unable to change the host implementation, then
the advice to a gateway designer is pretty much the obvious:
1) Do the best one can for hosts that will depend upon ARPs to
discover routing. This means trying to coordinate gateways so
that only the best one responds.
2) Enough systems use code based on 4.2, and routed is a reasonable
enough way of doing things, that it probably makes sense for the
gateways to implement routed.
3) One should probably try to get gateways to issue ICMP redirects
whenever appropriate. However it is not clear which existing
implementations this is going to help. Certainly it would help
TOPS-20. Existing systems that use ARP are pretending that all of
the hosts are directly connected, so an ICMP redirect is going to be
irrelevant to them. For Unix systems, ICMP redirect doesn't add much
to what routed already provides (and indeed may even confuse it, if
routed thinks it is managing the gateway tables). Circumstances where
ICMP redirects could be generated are when a packet is sent to a
gateway that knows it is not the best route. Len Bosack at Stanford
suggests that gateways should have a command that says we are about to
shut them down. In that case, they can start issuing ICMP redirects
to an alternate. (However one has to be careful to avoid loops. If
the alternate doesn't know you are shutting down, and it is a less
preferred route, it may issue a redirect right back to you.)
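The redirect loop on shutdown can be illustrated with a toy model. Each gateway has a cost (lower is more preferred) and redirects a host whenever it is shutting down or knows of a cheaper peer. The class `Gateway`, the `cost` metric, and the `known_draining` set are all hypothetical, and telling the alternate about the shutdown is just one assumed way of avoiding the loop.

```python
class Gateway:
    """Toy gateway that decides whether to send an ICMP redirect."""

    def __init__(self, name, cost):
        self.name = name
        self.cost = cost             # lower cost = more preferred route
        self.shutting_down = False
        self.known_draining = set()  # names of peers known to be shutting down

    def redirect_target(self, peers):
        """Return the name of the gateway a host should use instead,
        or None if this gateway should simply forward the packet."""
        candidates = [
            peer for peer in peers
            if peer is not self and peer.name not in self.known_draining
        ]
        best = min(candidates, key=lambda peer: peer.cost, default=None)
        if best is None:
            return None
        if self.shutting_down or best.cost < self.cost:
            return best.name
        return None


gw_a = Gateway("gw-a", cost=1)  # preferred gateway, about to be shut down
gw_b = Gateway("gw-b", cost=2)  # less preferred alternate
peers = [gw_a, gw_b]
gw_a.shutting_down = True
```

In this setup `gw_a.redirect_target(peers)` names `gw-b`, but `gw_b.redirect_target(peers)` names `gw-a`, so a host would bounce between the two. Once `gw_b.known_draining` contains `"gw-a"`, the alternate stops redirecting and the loop is broken.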