intro to tcp admin, part 2 of 3


Charles Hedrick (aramis.rutgers.edu!hedrick@rutgers.edu)
25 Jul 88 03:01:39 GMT


down, and go back to the default gateway. A similar approach can also
be used to handle failures in the default gateway. If you have marked
two gateways as default, then the software should be capable of
switching when connections using one of them start failing.
Unfortunately, some common TCP/IP implementations do not mark routes
as down and change to new ones. (In particular Berkeley 4.2 Unix does
not.) However Berkeley 4.3 Unix does do this, and as other vendors
begin to base products on 4.3 rather than 4.2, this ability is
expected to be more common.

4.4 Other ways for hosts to find routes

As long as your TCP/IP implementations handle failing connections
properly, establishing one or more default routes in the configuration
file is likely to be the simplest way to handle routing. However
there are two other routing approaches that are worth considering for
special situations:

   - spying on the routing protocol

   - using proxy ARP

4.4.1 Spying on Routing

Gateways generally have a special protocol that they use among
themselves. Note that redirects cannot be used by gateways.
Redirects are simply ways for gateways to tell "dumb" hosts to use a
different gateway. The gateways themselves must have a complete
picture of the network, and a way to compute the optimal route to each
subnet. Generally they maintain this picture by exchanging
information among themselves. There are several different routing
protocols in use for this purpose. One way for a computer to keep
track of gateways is for it to listen to the gateways' messages.
There is software available for this purpose for most of the common
routing protocols. When you run this software, it maintains a
complete picture of the network, just as the gateways do. The
software is generally designed to maintain your computer's routing
tables dynamically, so that datagrams are always sent to the proper
gateway. In effect, the routing software issues the equivalent of the
Unix "route add" and "route delete" commands as the network topology
changes. Generally this results in a complete routing table, rather
than one that depends upon default routes. (This assumes that the
gateways themselves maintain a complete table. Sometimes gateways
keep track of your campus network completely, but use a default route
for all off-campus networks, etc.)
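
As a rough illustration of what such software does, here is a minimal
Python sketch. It assumes a simplified advertisement format in which
each gateway announces a map of destination network to hop count; the
class name, the method name and the address 128.6.5.9 used when
exercising it are hypothetical, and real routing protocols add timers
and other safeguards that are omitted here.

```python
# Hypothetical sketch: a host-side routing table kept in step with
# gateway advertisements. The advertisement format is an assumption.

class RoutingTable:
    def __init__(self):
        # destination network -> (gateway, cost)
        self.routes = {}

    def apply_advertisement(self, gateway, advertised):
        """Merge one gateway's advertised {network: metric} map."""
        for network, metric in advertised.items():
            cost = metric + 1  # one extra hop to reach the gateway itself
            current = self.routes.get(network)
            if current is None or cost < current[1] or current[0] == gateway:
                # Equivalent of "route add" (or of replacing a route).
                self.routes[network] = (gateway, cost)
        # Equivalent of "route delete": drop routes that this gateway
        # offered earlier but no longer advertises.
        for network, (gw, _) in list(self.routes.items()):
            if gw == gateway and network not in advertised:
                del self.routes[network]
```

Calling apply_advertisement each time a broadcast is heard keeps the
table current without any default route.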

Running routing software on each host does in some sense "solve" the
routing problem. However there are several reasons why this is not
normally recommended except as a last resort. The most serious
problem is that this reintroduces configuration options that must be
kept up to date on each host. Any computer that wants to participate
in the protocol among the gateways will need to configure its software
compatibly with the gateways. Modern gateways often have
configuration options that are complex compared with those of an
individual host. It is undesirable to spread these to every host.

There is a somewhat more specialized problem that applies only to
diskless computers. By its very nature, a diskless computer depends
upon the network and file servers to load programs and to do swapping.
It is dangerous for diskless computers to run any software that
listens to network broadcasts. Routing software generally depends
upon broadcasts. For example, each gateway on the network might
broadcast its routing tables every 30 seconds. The problem with
diskless nodes is that the software to listen to these broadcasts must
be loaded over the network. On a busy computer, programs that are not
used for a few seconds will be swapped or paged out. When they are
activated again, they must be swapped or paged in. Whenever a
broadcast is sent, every computer on the network needs to activate the
routing software in order to process the broadcast. This means that
many diskless computers will be doing swapping or paging at the same
time. This is likely to cause a temporary overload of the network.
Thus it is very unwise for diskless machines to run any software that
requires them to listen to broadcasts.

4.4.2 Proxy ARP

Proxy ARP is an alternative technique for letting gateways make all
the routing decisions. It is applicable to any broadcast network that
uses ARP or a similar technique for mapping Internet addresses into
network-specific addresses such as Ethernet addresses. This
presentation will assume Ethernet. Other network types can be
accommodated if you replace "Ethernet address" with the appropriate
network-specific address, and ARP with the protocol used for address
mapping by that network type.

In many ways proxy ARP is similar to using a default route and
redirects; however, it uses a different mechanism to communicate routes
to the host. With redirects, a full routing table is used. At any
given moment, the host knows what gateways it is routing datagrams to.
With proxy ARP, you dispense with explicit routing tables, and do
everything at the level of Ethernet addresses. Proxy ARP can be used
for all destinations, only for destinations within your network, or in
various combinations. It will be simplest to explain it as used for
all addresses. To do this, you instruct the host to pretend that
every computer in the world is attached directly to your local
Ethernet. On Unix, this would be done using a command

      route add default 128.6.4.2 0

where 128.6.4.2 is assumed to be the Internet address of your host.
As explained above, the metric of 0 causes everything that matches
this route to be sent directly on the local Ethernet.

When a datagram is to be sent to a local Ethernet destination, your
computer needs to know the Ethernet address of the destination. In
order to find that, it uses something generally called the ARP table.
This is simply a mapping from Internet address to Ethernet address.
Here's a typical ARP table. (On our system, it is displayed using the
command "arp -a".)

    FOKKER.RUTGERS.EDU (128.6.5.16) at 8:0:20:0:8:22 temporary
    CROSBY.RUTGERS.EDU (128.6.5.48) at 2:60:8c:49:50:63 temporary
    CAIP.RUTGERS.EDU (128.6.4.16) at 8:0:8b:0:1:6f temporary
    DUDE.RUTGERS.EDU (128.6.20.16) at 2:7:1:0:eb:cd temporary
    W20NS.MIT.EDU (18.70.0.160) at 2:7:1:0:eb:cd temporary
    OBERON.USC.EDU (128.125.1.1) at 2:7:1:2:18:ee temporary
    gatech.edu (128.61.1.1) at 2:7:1:0:eb:cd temporary
    DARTAGNAN.RUTGERS.EDU (128.6.5.65) at 8:0:20:0:15:a9 temporary

Note that it is simply a list of Internet addresses and the
corresponding Ethernet addresses. The "temporary" indicates that the
entry was added dynamically using ARP, rather than being put into the
table manually.

If there is an entry for the address in the ARP table, the datagram is
simply put on the Ethernet with the corresponding Ethernet address.
If not, an "ARP request" is broadcast, asking for the destination host
to identify itself. This request is in effect a question "will the
host with Internet address 128.6.4.194 please tell me what your
Ethernet address is?". When a response comes back, it is added to the
ARP table, and future datagrams for that destination can be sent
without delay.
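
The lookup just described can be sketched in a few lines of Python.
This is only an illustration: the function name is made up, and
broadcast_arp_request is a hypothetical stand-in for the real
broadcast-and-wait step, which lives inside the network software.

```python
# Sketch of the ARP table lookup described above.

def resolve(ip, arp_table, broadcast_arp_request):
    """Return the Ethernet address for ip, asking the network on a miss."""
    if ip in arp_table:
        return arp_table[ip]              # hit: send without delay
    ethernet = broadcast_arp_request(ip)  # "who has this Internet address?"
    if ethernet is not None:
        arp_table[ip] = ethernet          # remember the answer for next time
    return ethernet
```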

This mechanism was originally designed only for use with hosts
attached directly to a single Ethernet. If you need to talk to a host
on a different Ethernet, it was assumed that your routing table would
direct you to a gateway. The gateway would of course have one
interface on your Ethernet. Your computer would then end up looking
up the address of that gateway using ARP. It would generally be
useless to expect ARP to work directly with a computer on a distant
network. Since it isn't on the same Ethernet, there's no Ethernet
address you can use to send datagrams to it. And when you send an ARP
request for it, there's nobody to answer the request.

Proxy ARP is based on the concept that the gateways will act as
proxies for distant hosts. Suppose you have a host on network
128.6.5, with address 128.6.5.2 (computer A in the diagram below). It
wants to send a datagram to host 128.6.4.194 (computer C in the
diagram below), which is on a different Ethernet (subnet 128.6.4).
There is a gateway connecting the two subnets, with address 128.6.5.1
(gateway R):

                 network 1                network 2
                  128.6.5                  128.6.4
        ============================  ==================
          |              |        |    |      |    |
       ___|______   _____|____  __|____|__  __|____|____
       128.6.5.2    128.6.5.3   128.6.5.1   128.6.4.194
                                128.6.4.1
       __________   __________  __________  ____________
       computer A   computer B  gateway R    computer C

Now suppose computer A sends an ARP request for computer C. C isn't
able to answer for itself. It's on a different network, and never
even sees the ARP request. However gateway R can act on its behalf.
In effect, your computer asks "will the host with Internet address
128.6.4.194 please tell me what your Ethernet address is?", and the
gateway says "here I am, 128.6.4.194 is 2:7:1:0:eb:cd", where
2:7:1:0:eb:cd is actually the Ethernet address of the gateway. This
bit of illusion works just fine. Your host now thinks that
128.6.4.194 is attached to the local Ethernet with address
2:7:1:0:eb:cd. Of course it isn't. But it works anyway. Whenever
there's a datagram to be sent to 128.6.4.194, your host sends it to
the specified Ethernet address. Since that's the address of gateway
R, the gateway gets the packet. It then forwards it to the
destination.
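
The gateway's side of this exchange might look roughly like the
following Python sketch. The addresses come from the example above;
the /24 subnet width, the constant names and the function name are
assumptions made for illustration, not a description of any
particular gateway's software.

```python
import ipaddress

# Values taken from the example above. The /24 width is an assumption
# matching the 128.6.4 and 128.6.5 subnets.
GATEWAY_ETHERNET = "2:7:1:0:eb:cd"
LOCAL_SUBNET = ipaddress.ip_network("128.6.5.0/24")   # where computer A lives
REACHABLE = [ipaddress.ip_network("128.6.4.0/24")]    # nets R can forward to

def proxy_arp_reply(target_ip):
    """Return the Ethernet address gateway R should answer with, or None."""
    target = ipaddress.ip_address(target_ip)
    if target in LOCAL_SUBNET:
        return None               # a local host answers for itself
    if any(target in net for net in REACHABLE):
        return GATEWAY_ETHERNET   # R answers on the distant host's behalf
    return None                   # no route: stay silent
```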

Note that the net effect is exactly the same as having an entry in the
routing table saying to route destination 128.6.4.194 to gateway
128.6.5.1:

    128.6.4.194 128.6.5.1 UGH pe0

except that instead of having the routing done at the level of the
routing table, it is done at the level of the ARP table.

Generally it's better to use the routing table. That's what it's
there for. However here are some cases where proxy ARP makes sense:

   - when you have a host that does not implement subnets

   - when you have a host that does not respond properly to redirects

   - when you do not want to have to choose a specific default gateway

   - when your software is unable to recover from a failed route

The technique was first designed to handle hosts that do not support
subnets. Suppose that you have a subnetted network. For example, you
have chosen to break network 128.6 into subnets, so that 128.6.4 and
128.6.5 are separate. Suppose you have a computer that does not
understand subnets. It will assume that all of 128.6 is a single
network. Thus it will be difficult to establish routing table entries
to handle the configuration above. You can't tell it about the
gateway explicitly using "route add 128.6.4.0 128.6.5.1 1". Since it
thinks all of 128.6 is a single network, it can't understand that you
are trying to tell it where to send one subnet. It will instead
interpret this command as an attempt to set up a host route to a host
whose address is 128.6.4.0. The only thing that would work would be to
establish explicit host routes for every individual host on every
other subnet. You can't depend upon default gateways and redirects in
this situation either. Suppose you said "route add default 128.6.5.1
1". This would establish the gateway 128.6.5.1 as a default. However
the system wouldn't use it to send packets to other subnets. Suppose
the host is 128.6.5.2, and wants to send a datagram to 128.6.4.194.
Since the destination is part of 128.6, your computer considers it to
be on the same network as itself, and doesn't bother to look for a
gateway.
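
The difference between the two views of the network can be shown with
a short Python sketch. The function names are hypothetical, and the
24-bit subnet width is an assumption consistent with the 128.6.4 and
128.6.5 subnets used in the text.

```python
import ipaddress

def classful_prefix(ip):
    """Prefix length implied by the address class (A, B or C only)."""
    first_octet = int(ipaddress.ip_address(ip)) >> 24
    if first_octet < 128:
        return 8      # class A
    if first_octet < 192:
        return 16     # class B
    return 24         # class C (classes D and E are ignored in this sketch)

def same_network(src, dst, prefix):
    """True if dst falls inside the network that src belongs to."""
    net = ipaddress.ip_network(f"{src}/{prefix}", strict=False)
    return ipaddress.ip_address(dst) in net
```

With these definitions, same_network("128.6.5.2", "128.6.4.194",
classful_prefix("128.6.5.2")) is true, so a host that ignores subnets
never looks for a gateway, while the same call with a prefix of 24 is
false.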

Proxy ARP solves this problem by making the world look the way the
defective implementation expects it to look. Since the host thinks
all other subnets are part of its own network, it will simply issue
ARP requests for them. It expects to get back an Ethernet address
that can be used to establish direct communications. If the gateway
is practicing proxy ARP, it will respond with the gateway's Ethernet
address. Thus datagrams are sent to the gateway, and everything
works.

As you can see, no specific configuration is needed to use proxy ARP
with a host that doesn't understand subnets. All you need is for your
gateways to implement proxy ARP. In order to use it for other
purposes, you must explicitly set up the routing table to cause ARP to
be used. By default, TCP/IP implementations will expect to find a
gateway for any destination that is on a different network. In order
to make them issue ARP's, you must explicitly install a route with
metric 0, as in the example "route add default 128.6.5.2 0".

It is obvious that proxy ARP is reasonable in situations where you
have hosts that don't understand subnets. Some comments may be needed
on the other situations. Generally TCP/IP implementations do handle
ICMP redirects properly. Thus it is normally practical to set up a
default route to some gateway, and depend upon the gateway to issue
redirects for destinations that should use a different gateway.
However in case you ever run into an implementation that does not obey
redirects, or cannot be configured to have a default gateway, you may
be able to make things work by depending upon proxy ARP. Of course
this requires that you be able to configure the host to issue ARP's
for all destinations. You will need to read the documentation
carefully to see exactly what routing features your implementation
has.

Sometimes you may choose to depend upon proxy ARP for convenience.
The problem with routing tables is that you have to configure them.
The simplest configuration is simply to establish a default route, but
even there you have to supply some equivalent to the Unix command
"route add default ...". Should you change the addresses of your
gateways, you have to modify this command on all of your hosts, so
that they point to the new default gateway. If you set up a default
route that depends upon proxy ARP (i.e. has metric 0), you won't have
to change your configuration files when gateways change. With proxy
ARP, no gateway addresses are given explicitly. Any gateway can
respond to the ARP request, no matter what its address.

In order to save you from having to do configuration, some TCP/IP
implementations default to using ARP when they have no other route.
The most flexible implementations allow you to mix strategies. That
is, if you have specified a route for a particular network, or a
default route, they will use that route. But if there is no route for
a destination, they will treat it as local, and issue an ARP request.
As long as your gateways support proxy ARP, this allows such hosts to
reach any destination without any need for routing tables.
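
A host that mixes strategies in this way makes a decision along the
following lines. This Python sketch is illustrative only; the function
name and the representation of the routing table as a dictionary are
assumptions.

```python
import ipaddress

def next_hop(dst, routes, default_gateway=None):
    """Pick where to send a datagram for dst.

    routes maps ip_network objects to gateway addresses. The result is
    the gateway to use, or None to mean "treat dst as local and issue
    an ARP request for it directly".
    """
    addr = ipaddress.ip_address(dst)
    matches = [net for net in routes if addr in net]
    if matches:
        best = max(matches, key=lambda net: net.prefixlen)  # most specific
        return routes[best]
    if default_gateway is not None:
        return default_gateway
    return None  # no route at all: rely on proxy ARP
```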

Finally, you may choose to use proxy ARP because it provides better
recovery from failure. This choice is very much dependent upon your
implementation. The next section will discuss the tradeoffs in more
detail.

In situations where there are several gateways attached to your
network, you may wonder how proxy ARP allows you to choose the best
one. As described above, your computer simply sends a broadcast
asking for the Ethernet address for a destination. We assumed that
the gateways would be set up to respond to this broadcast. If there
is more than one gateway, this requires coordination among them.
Ideally, the gateways will have a complete picture of the network
topology. Thus they are able to determine the best route from your
host to any destination. If the gateways coordinate among themselves,
it should be possible for the best gateway to respond to your ARP
request. In practice, it may not always be possible for this to
happen. It is fairly easy to design algorithms to prevent very bad
routes. For example, consider the following situation:

           1           2            3
        ------- A ---------- B ----------

1, 2, and 3 are networks. A and B are gateways, connecting network 2
to 1 or 3. If a host on network 2 wants to talk to a host on network
1, it is fairly easy for gateway A to decide to answer, and for
gateway B to decide not to. Here's how: if gateway B accepted a
datagram for network 1, it would have to forward it to gateway A for
delivery. This would mean that it would take a packet from network 2
and send it right back out on network 2. It is very easy to test for
routes that involve this sort of circularity. It is much harder to
deal with a situation such as the following:

                         1
                  ---------------
                    A         B
                    |         | 4
                    |         |
                  3 |         C
                    |         |
                    |         | 5
                    D         E
                  ---------------
                         2

Suppose a computer on network 1 wants to send a datagram to one on
network 2. The route via A and D is probably better, because it goes
through only one intermediate network (3). It is also possible to go
via B, C, and E, but that path is probably slightly slower. Now
suppose the computer on network 1 sends an ARP request for a
destination on 2. It is likely that A and B will both respond to that
request. B is not quite as good a route as A. However it is not so
bad as the case above. B won't have to send the datagram right back
out onto network 1. It is unable to determine there is a better
alternative route without doing a significant amount of global
analysis on the network. This may not be practical in the amount of
time available to process an ARP request.
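
The easy test, the one that catches the circular route in the first
example, can be written in a few lines. This Python sketch uses a
hypothetical function name and identifies networks by number as in the
diagrams; it says nothing about how a real gateway looks up its
outgoing interface.

```python
def should_answer(arrival_network, outgoing_network):
    """Proxy ARP circularity test for one gateway.

    arrival_network: the network the ARP request was heard on.
    outgoing_network: the network this gateway would forward the
    datagram onto, or None if it has no route to the target.
    """
    if outgoing_network is None:
        return False
    # Answering would mean taking a packet from a network and sending
    # it right back out on the same one, so stay silent.
    return outgoing_network != arrival_network
```

In the first example gateway B would forward onto network 2, the
network the request arrived on, so it stays silent. In the second
example A forwards onto network 3 and B onto network 4, so both pass
the test and both answer, which is exactly the weakness described
above.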

4.4.3 Moving to New Routes After Failures

In principle, TCP/IP routing is capable of handling line failures and
gateway crashes. There are various mechanisms to adjust routing
tables and ARP tables to keep them up to date. Unfortunately, many
major implementations of TCP/IP have not implemented all of these
mechanisms. The net result is that you have to look carefully at the
documentation for your implementation, and consider what kinds of
failures are most likely. You then have to choose a strategy that
will work best for your site. The basic choices for finding routes
have all been listed above: spying on the gateways' routing protocol,
setting up a default route and depending upon redirects, and using
proxy ARP. These methods all have their own limitations in dealing
with a changing network.

Spying on the gateways' routing protocol is theoretically the cleanest
solution. Assuming that the gateways use good routing technology, the
tables that they broadcast contain enough information to maintain
optimal routes to all destinations. Should something in the network
change (a line or a gateway goes down), this information will be
reflected in the tables, and the routing software will be able to
update the hosts' routing tables appropriately. The disadvantages are
entirely practical. However in some situations the robustness of this
approach may outweigh the disadvantages. To summarize the discussion
above, the disadvantages are:

   - If the gateways are using sophisticated routing protocols,
     configuration may be fairly complex. Thus you will be faced with
     setting up and maintaining configuration files on every host.

   - Some gateways use proprietary routing protocols. In this case,
     you may not be able to find software for your hosts that
     understands them.

   - If your hosts are diskless, there can be very serious performance
     problems associated with listening to routing broadcasts.

Some gateways may be able to convert from their internal routing
protocol to a simpler one for use by your hosts. This could largely
bypass the first two disadvantages. Currently there is no known way
to get around the third one.

The problems with default routes/redirects and with proxy ARP are
similar: they both have trouble dealing with situations where their
table entries no longer apply. The only real difference is that
different tables are involved. Suppose a gateway goes down. If any
of your current routes are using that gateway, you may be in trouble.
If you are depending upon the routing table, the major mechanism for
adjusting routes is the redirect. This works fine in two situations:

   - where the default gateway is not the best route. The default
     gateway can direct you to a better gateway

   - where a distant line or gateway fails. If this changes the best
     route, the current gateway can redirect you to the gateway that
     is now best

The case it does not protect you against is where the gateway that you
are currently sending your datagrams to crashes. Since it is down, it
is unable to redirect you to another gateway. In many cases, you are
also unprotected if your default gateway goes down, since routing
starts by sending to the default gateway.

The situation with proxy ARP is similar. If the gateways coordinate
themselves properly, the right one will respond initially. If
something elsewhere in the network changes, the gateway you are
currently using can issue a redirect to a new gateway that is
better. (It is usually possible to use redirects to override routes
established by proxy ARP.) Again, the case you are not protected
against is where the gateway you are currently using crashes. There
is no equivalent to failure of a default gateway, since any gateway
can respond to the ARP request.

So the big problem is that failure of a gateway you are using is hard
to recover from. It's hard because the main mechanism for changing
routes is the redirect, and a gateway that is down can't issue
redirects. Ideally, this problem should be handled by your TCP/IP
implementation, using timeouts. If a computer stops getting responses,
it should cancel the existing route, and try to establish a new one.
Where you are using a default route, this means that the TCP/IP
implementation must be able to declare a route as down based on a
timeout. If you have been redirected to a non-default gateway, and
that route is declared down, traffic will return to the default. The
default gateway can then begin handling the traffic, or redirect it to
a different gateway. To handle failure of a default gateway, it
should be possible to have more than one default. If one is declared
down, another will be used. Together, these mechanisms should take
care of any failure.
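
Taken together, these mechanisms amount to something like the
following Python sketch. The class and method names are illustrative,
not taken from any implementation, and a gateway once declared down
stays down here, whereas real software would eventually retry it.

```python
class HostRoutes:
    """Sketch of timeout-driven route recovery on a host."""

    def __init__(self, defaults):
        self.defaults = list(defaults)  # default gateways, in preference order
        self.redirects = {}             # destination -> gateway from a redirect
        self.down = set()               # gateways declared down

    def gateway_for(self, destination):
        gw = self.redirects.get(destination)
        if gw is not None and gw not in self.down:
            return gw
        for default in self.defaults:
            if default not in self.down:
                return default
        return None                     # nothing left to try

    def on_redirect(self, destination, gateway):
        self.redirects[destination] = gateway

    def on_timeout(self, gateway):
        # A connection through this gateway stopped getting responses:
        # declare it down and cancel the routes that used it.
        self.down.add(gateway)
        for dest in [d for d, g in self.redirects.items() if g == gateway]:
            del self.redirects[dest]
```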

Similar mechanisms can be used by systems that depend upon proxy ARP.
If a connection is timing out, the ARP table entry that it uses should
be cleared. This will cause a new ARP request, which can be handled
by a gateway that is still up. A simpler mechanism would simply be to
time out all ARP entries after some period. Since making a new ARP
request has a very low overhead, there's no problem with removing an
ARP entry even if it is still good. The next time a datagram is to be
sent, a new request will be made. The response is normally fast
enough that users will not even notice the delay.
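
Both ideas, clearing an entry when a connection is timing out and
expiring every entry after a fixed period, fit in a small Python
sketch. The class name and the lifetime passed to it are assumptions;
the text does not say what period an implementation should use.

```python
import time

class ExpiringArpTable:
    def __init__(self, lifetime_seconds, clock=time.monotonic):
        self.lifetime = lifetime_seconds
        self.clock = clock        # injectable so the sketch can be tested
        self.entries = {}         # ip -> (ethernet, time_added)

    def add(self, ip, ethernet):
        self.entries[ip] = (ethernet, self.clock())

    def lookup(self, ip):
        """Return the cached address, or None if missing or expired."""
        entry = self.entries.get(ip)
        if entry is None:
            return None
        ethernet, added = entry
        if self.clock() - added >= self.lifetime:
            del self.entries[ip]  # stale: forces a fresh ARP request
            return None
        return ethernet

    def invalidate(self, ip):
        """Clear an entry because a connection using it is timing out."""
        self.entries.pop(ip, None)
```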

Unfortunately, many common implementations do not use these
strategies. In Berkeley 4.2, there is no automatic way of getting rid
of any kind of entry, either routing or ARP. It invalidates neither
routes nor ARP entries on timeout. ARP entries last forever. If
gateway crashes are a significant problem, there may be no choice but
to run software that listens to the routing protocol. In Berkeley
4.3, routing entries are removed when TCP connections are failing.
ARP entries are still not removed. This makes the default route
strategy more attractive for 4.3 than proxy ARP. Having more than one
default route may also allow for recovery from failure of a default
gateway. Note however that 4.3 only handles timeout for connections
using TCP. If a route is being used only by services based on UDP, it
will not recover from gateway failure. While the "traditional" TCP/IP
services use TCP, network file systems generally do not. Thus
4.3-based systems still may not always be able to recover from
failure.

In general, you should examine your implementation in detail to
determine what sort of error recovery strategy it uses. We hope that
the discussion in this section will then help you choose the best way
of dealing with routing.

There is one more strategy that some older implementations use. It is
strongly discouraged, but we mention it here so you can recognize it
if you see it. Some implementations detect gateway failure by taking
active measures to see what gateways are up. The best version of this
is based on a list of all gateways that are currently in use. (This
can be determined from the routing table.) Every minute or so, an
echo request datagram is sent to each such gateway. If a gateway
stops responding to echo requests, it is declared down, and all routes
using it revert to the default. With such an implementation, you
normally supply more than one default gateway. If the current default
stops responding, an alternate is chosen. In some cases, it is not
even necessary to choose an explicit default gateway. The software
will randomly choose any gateway that is responding. This
implementation is very flexible and recovers well from failures.
However a large network full of such implementations will waste a lot
of bandwidth on the echo datagrams that are used to test whether
gateways are up. This is the reason that this strategy is
discouraged.

5. Bridges and Gateways

This section will deal in more detail with the technology used to
construct larger networks. It will focus particularly on how to
connect together multiple Ethernets, token rings, etc. These days
most networks are hierarchical. Individual hosts attach to local-area
networks such as Ethernet or token ring. Then those local networks
are connected via some combination of backbone networks and point to
point links. A university might have a network that looks in part
like this:

     ________________________________
     |   net 1     net 2    net 3   |      net 4             net 5
     | ---------X---------X-------- |     --------          --------
     |                         |    |         |                 |
     | Building A              |    |         |                 |
     |               ----------X--------------X-----------------X
     |                              | campus backbone network   :
     |______________________________|                           :
                                                         serial :
                                                           line :
                                                         -------X-----
                                                             net 6

Nets 1, 2 and 3 are in one building. Nets 4 and 5 are in different
buildings on the same campus. Net 6 is in a somewhat more distant
location. The diagram above shows nets 1, 2, and 3 being connected
directly, with switches that handle the connections being labelled as
"X". Building A is connected to the other buildings on the same
campus by a backbone network. Note that traffic from net 1 to net 5
takes the following path:

   - from 1 to 2 via the direct connection between those networks

   - from 2 to 3 via another direct connection

   - from 3 to the backbone network

   - across the backbone network from building A to the building in
     which net 5 is housed

   - from the backbone network to net 5

Traffic for net 6 would additionally pass over a serial line. With
the setup as shown, the same switch is being used to connect the
backbone network to net 5 and to the serial line. Thus traffic from
net 5 to net 6 would not need to go through the backbone, since there
is a direct connection from net 5 to the serial line.

This section is largely about what goes in those "X"'s.

5.1 Alternative Designs

Note that there are alternatives to the sort of design shown above.
One is to use point to point lines or switched lines directly to each
host. Another is to use a single level of network technology that is
capable of handling both local and long-haul networking.

5.1.1 A mesh of point to point lines

Rather than connecting hosts to a local network such as Ethernet, and
then interconnecting the Ethernets, it is possible to connect
long-haul serial lines directly to the individual computers. If your
network consists primarily of individual computers at distant
locations, this might make sense. Here would be a small design of
that type.

          computer 1                computer 2             computer 3
              |                         |                       |
              |                         |                       |
              |                         |                       |
          computer 4 -------------- computer 5 ----------- computer 6

In the design shown earlier, the task of routing datagrams around the
network is handled by special-purpose switching units shown as "X"'s.
If you run lines directly between pairs of hosts, your hosts will be
doing this sort of routing and switching, as well as their normal
computing. Unless you run lines directly between every pair of
computers, some systems will end up handling traffic for others. For
example, in this design, traffic from 1 to 3 will go through 4, 5 and
6. This is certainly possible, since most TCP/IP implementations are
capable of forwarding datagrams. If your network is of this type, you
should think of your hosts as also acting as gateways. Much of the
discussion below on configuring gateways will apply to the routing
software that you run on your hosts. This sort of configuration is
not as common as it used to be, for two reasons:

   - Most large networks have more than one computer per location. In
     this case it is less expensive to set up a local network at each
     location than to run point to point lines to each computer.

   - Special-purpose switching units have become less expensive. It
     often makes sense to offload the routing and communications tasks
     to a switch rather than handling it on the hosts.
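
The forwarding path mentioned above (traffic from 1 to 3 passing
through 4, 5 and 6) can be checked with a short Python sketch. The
list of links is read off the small design shown earlier, and the
function name is hypothetical.

```python
from collections import deque

# Point to point lines in the small design: 1-4, 2-5, 3-6, 4-5, 5-6.
LINKS = [(1, 4), (2, 5), (3, 6), (4, 5), (5, 6)]

def path(src, dst, links=LINKS):
    """Breadth-first search for the shortest chain of hosts from src to dst."""
    neighbours = {}
    for a, b in links:
        neighbours.setdefault(a, []).append(b)
        neighbours.setdefault(b, []).append(a)
    queue = deque([[src]])
    seen = {src}
    while queue:
        route = queue.popleft()
        if route[-1] == dst:
            return route
        for nxt in neighbours.get(route[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(route + [nxt])
    return None   # no chain of lines connects the two hosts
```

Every host in the middle of the returned list is acting as a gateway
for that traffic.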

It is of course possible to have a network that mixes the two kinds of
technology. In this case, locations with more equipment would be
handled by a hierarchical system, with local-area networks connected
by switches. Remote locations with a single computer would be handled
by point to point lines going directly to those computers. In this
case the routing software used on the remote computers would have to
be compatible with that used by the switches, or there would need to
be a gateway between the two parts of the network.

Design decisions of this type are typically made after an assessment
of the level of network traffic, the complexity of the network, the
quality of routing software available for the hosts, and the ability
of the hosts to handle extra network traffic.

5.1.2 Circuit switching technology

Another alternative to the hierarchical LAN/backbone approach is to
use circuit switches connected to each individual computer. This is
really a variant of the point to point line technique, where the
circuit switch allows each system to have what amounts to a direct
line to every other system. This technology is not widely used within
the TCP/IP community, largely because the TCP/IP protocols assume that
the lowest level handles isolated datagrams. When a continuous
connection is needed, higher network layers maintain it using
datagrams. This datagram-oriented technology does not match a
circuit-oriented environment very closely. In order to use circuit
switching technology, the IP software must be modified to be able to
build and tear down virtual circuits as appropriate. When there is a
datagram for a given destination, a virtual circuit must be opened to
it. The virtual circuit would be closed when there has been no
traffic to that destination for some time. The major use of this
technology is for the DDN (Defense Data Network). The primary
interface to the DDN is based on X.25. This network appears to the
outside as a distributed X.25 network. TCP/IP software intended for
use with the DDN must do precisely the virtual circuit management just
described. Similar techniques could be used with other
circuit-switching technologies, e.g. ATT's DataKit, although there is
almost no software currently available to support this.
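
The open-on-demand, close-when-idle behavior described above can be
sketched as follows. This is only an illustration, not the code used
by any DDN implementation: the class name, the method names and the
idea of passing in the current time are all assumptions made for the
example, and a real implementation would also place and clear the
X.25 calls themselves.

```python
class CircuitTable:
    """Tracks one virtual circuit per destination and closes idle ones."""

    def __init__(self, idle_timeout):
        self.idle_timeout = idle_timeout   # seconds of silence before closing
        self.last_used = {}                # destination -> time of last datagram

    def send(self, destination, now):
        """Record a datagram; return True if a new circuit had to be opened."""
        opened = destination not in self.last_used
        self.last_used[destination] = now
        return opened

    def expire(self, now):
        """Close and return the circuits that have been idle too long."""
        idle = [dest for dest, last in self.last_used.items()
                if now - last >= self.idle_timeout]
        for dest in idle:
            del self.last_used[dest]
        return sorted(idle)
```

The choice of idle timeout is a tradeoff: a short timeout frees
circuits quickly, but makes it more likely that the next datagram has
to wait for a new circuit to be opened.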

5.1.3 Single-level networks

In some cases new developments in wide-area networks can eliminate the
need for hierarchical networks. Early hierarchical networks were set
up because the only convenient network technology was Ethernet or
other LANs, and those could not span distances large enough to cover
an entire campus. Thus it was necessary to use serial lines to
connect LANs in various locations. It is now possible to find
network technology whose characteristics are similar to Ethernet, but
where a single network can span a campus. Thus it is possible to
think of using a single large network, with no hierarchical structure.

The primary limitations of a large single-level network are
performance and reliability considerations. If a single network is
used for the entire campus, it is very easy to overload it.
Hierarchical networks can handle a larger traffic volume than
single-level networks if traffic patterns have a reasonable amount of
locality. That is, in many applications, traffic within an individual
department tends to be greater than traffic among departments.

Let's look at a concrete example. Suppose there are 10 departments,
each of which generates 1 Mbit/sec of traffic. Suppose further that 90%
of that traffic is to other systems within the department, and only
10% is to other departments. If each department has its own network,
that network only needs to handle 1 Mbit/sec. The backbone network
connecting the departments also needs only 1 Mbit/sec capacity, since
it is handling 10% of 1 Mbit from each department. In order to handle
this situation with a single wide-area network, that network would
have to be able to handle the simultaneous load from all 10
departments, which would be 10 Mbit/sec.
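
The arithmetic in this example can be written out as a short
calculation. The function and parameter names are made up for the
illustration, and the model is as simple as the one in the text: it
assumes every department generates the same load.

```python
def required_capacity(departments, mbit_per_department, inter_dept_percent):
    """Return the capacity in Mbit/sec that each kind of network must carry."""
    total = departments * mbit_per_department
    return {
        # Each department network carries that department's own traffic.
        "department_lan": mbit_per_department,
        # The backbone carries only the traffic that leaves a department.
        "backbone": total * inter_dept_percent / 100,
        # A single-level network carries everything.
        "single_level": total,
    }


capacity = required_capacity(10, 1, 10)
```

With the figures used above (10 departments, 1 Mbit/sec each, 10% of
it leaving the department), this gives 1 Mbit/sec for each department
network, 1 Mbit/sec for the backbone, and 10 Mbit/sec for a
single-level network.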

The second limitation on single-level networks involves reliability,
maintainability and security. Wide-area networks are more difficult
to diagnose and maintain than local-area networks, because problems
can be introduced from any building to which the network is connected.
They also make traffic visible in all locations. For these reasons,
it is often sensible to handle local traffic locally, and use the
wide-area network only for traffic that actually must go between
buildings. However if you have a situation where each location has
only one or two computers, it may not make sense to set up a local
network at each location, and a single-level network may make sense.

5.1.4 Mixed designs

In practice, few large networks have the luxury of adopting a
theoretically pure design.

It is very unlikely that any large network will be able to avoid using
a hierarchical design. Suppose we set out to use a single-level
network. Even if most buildings have only one or two computers, there
will be some location where there are enough that a local-area network
is justified. The result is a mixture of a single-level network and a
hierarchical network. Most buildings have their computers connected
directly to the wide-area network, as with a single-level network.
However in one building there is a local-area network which uses the
wide-area network as a backbone, connecting to it via a switching
unit.

On the other side of the story, even network designers with a strong
commitment to hierarchical networks are likely to find some parts of
the network where it simply doesn't make economic sense to install a
local-area network. So a host is put directly onto the backbone
network, or tied directly to a serial line.

However you should think carefully before making ad hoc departures
from your design philosophy in order to save a few dollars. In the
long run, network maintainability is going to depend upon your ability
to make sense of what is going on in the network. The more consistent
your technology is, the more likely you are to be able to maintain the
network.

5.2 An introduction to alternative switching technologies

This section will discuss the characteristics of various technologies
used to switch datagrams between networks. In effect, we are trying
to fill in some details about the black boxes assumed in previous
sections. There are three basic types of switches, generally referred
to as repeaters, bridges, and gateways, or alternatively as level 1, 2
and 3 switches (based on the level of the ISO model at which they
operate). Note however that there are systems that combine features
of more than one of these, particularly bridges and gateways.

The most important dimensions on which switches vary are isolation,
performance, routing and network management facilities. These will be
discussed below.

The most serious difference is between repeaters and the other two
types of switch. Until recently, gateways provided very different
services from bridges. However these two technologies are now coming
closer together. Gateways are beginning to adopt the special-purpose
hardware that has characterized bridges in the past. Bridges are
beginning to adopt more sophisticated routing, isolation features, and
network management, which have characterized gateways in the past.
There are also systems that can function as both bridge and gateway.
This means that at the moment, the crucial decision may not be to
decide whether to use a bridge or a gateway, but to decide what
features you want in a switch and how it fits into your overall
network design.

5.2.1 Repeaters

A repeater is a piece of equipment that connects two networks that use
the same technology. It receives every data packet on each network,
and retransmits it onto the other network. The net result is that the
two networks have exactly the same set of packets on them. For
Ethernet or IEEE 802.3 networks there are actually two different kinds
of repeater. (Other network technologies may not need to make this
distinction.)

A simple repeater operates at a very low level indeed. Its primary
purpose is to get around limitations in cable length caused by signal
loss or timing dispersion. It allows you to construct somewhat larger
networks than you would otherwise be able to construct. It can be
thought of as simply a two-way amplifier. It passes on individual
bits in the signal, without doing any processing at the packet level.
It even passes on collisions. That is, if a collision is generated on
one of the networks connected to it, the repeater generates a
collision on the other network. There is a limit to the number of
repeaters that you can use in a network. The basic Ethernet design
requires that signals must be able to get from one end of the network
to the other within a specified amount of time. This determines a
maximum allowable length. Putting repeaters in the path does not get
around this limit. (Indeed each repeater adds some delay, so in some
ways a repeater makes things worse.) Thus the Ethernet configuration
rules limit the number of repeaters that can be in any path.

A "buffered repeater" operates at the level of whole data packets.
Rather than passing on signals a bit at a time, it receives an entire
packet from one network into an internal buffer and then retransmits
it onto the other network. It does not pass on collisions. Because
such low-level features as collisions are not repeated, the two
networks continue to be separate as far as the Ethernet specifications
are concerned. Thus there are no restrictions on the number of
buffered repeaters that can be used. Indeed there is no requirement
that both of the networks be of the same type. However the two
networks must be sufficiently similar that they have the same packet
format. Generally this means that buffered repeaters can be used
between two networks of the IEEE 802.x family (assuming that they have
chosen the same address length), or two networks of some other related
family. A pair of buffered repeaters can be used to connect two
networks via a serial line.

Buffered repeaters share with simple repeaters the most basic feature:
they repeat every data packet that they receive from one network onto
the other. Thus the two networks end up with exactly the same set of
packets on them.

5.2.2 Bridges and gateways

A bridge differs from a buffered repeater primarily in the fact that
it exercises some selectivity as to which packets it forwards between
networks. Generally the goal is to increase the capacity of the
system by keeping local traffic confined to the network on which it
originates. Only traffic intended for the other network (or some
other network accessed through it) goes through the bridge. So far
this description would also apply to a gateway. Bridges and gateways
differ in the way they determine what packets to forward. A bridge
uses only the ISO level 2 address. In the case of Ethernet or IEEE
802.x networks, this is the 6-byte Ethernet or MAC-level address. (The
term MAC-level address is more general. However for the sake of
concreteness, examples in this section will assume that Ethernet is
being used. You may generally replace the term "Ethernet address"
with the equivalent MAC-level address for other similar technologies.)
A bridge does not examine the packet itself, so it does not use the IP
address or its equivalent for routing decisions. In contrast, a
gateway bases its decisions on the IP address, or its equivalent for
other protocols.

There are several reasons why it matters which kind of address is used
for decisions. The most basic is that it affects the relationship
between the switch and the upper layers of the protocol. If
forwarding is done at the level of the MAC-level address (bridge), the
switch will be invisible to the protocols. If it is done at the IP
level, the switch will be visible. Let's give an example. Here are
two networks connected by a bridge:

            network 1                   network 2
             128.6.5                     128.6.4
        ==================   ================================
          |            |      |             |            |
       ___|______    __|______|__    _______|___  _______|___
       128.6.5.2        bridge        128.6.4.3    128.6.4.4
       __________    ____________    ___________  ___________
       computer A                    computer B   computer C

Note that the bridge does not have an IP address. As far as computers
A, B, and C are concerned, there is a single Ethernet (or other
network) to which they are all attached. This means that the routing
tables must be set up so that computers on both networks treat both
networks as local. When computer A opens a connection to computer B,
it first broadcasts an ARP request asking for computer B's Ethernet
address. The bridge must pass this broadcast from network 1 to
network 2. (In general, bridges must pass all broadcasts.) Once the
two computers know each other's Ethernet addresses, communications use
the Ethernet address as the destination. At that point, the bridge
can start exerting some selectivity. It will only pass packets whose
Ethernet destination address is for a machine on the other network.
Thus a packet from B to A will be passed from network 2 to 1, but a
packet from B to C will be ignored.

In order to make this selection, the bridge needs to know which
network each machine is on. Most modern bridges build up a table for
each network, listing the Ethernet addresses of machines known to be
on that network. They do this by watching all of the packets on both
networks. When a packet first appears on network 1, it is reasonable
to conclude that the Ethernet source address corresponds to a machine
on network 1.

Note that a bridge must look at every packet on the Ethernet, for two
different reasons. First, it may use the source address to learn
which machines are on which network. Second, it must look at the
destination address in order to decide whether it needs to forward the
packet to the other network.
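
The learning and forwarding decisions just described might be sketched
like this for a bridge with two networks. The names are invented for
the example. One point is an assumption rather than something stated
above: when the destination has not yet been seen as a source, the
sketch forwards the packet, since the bridge cannot yet rule out that
the destination is on the other network.

```python
BROADCAST = "ff:ff:ff:ff:ff:ff"


class LearningBridge:
    """Two-network bridge that learns which network each address is on."""

    def __init__(self):
        self.network_of = {}   # Ethernet address -> network number (1 or 2)

    def should_forward(self, arrived_on, source, destination):
        # Learn: the sender must be on the network the packet arrived on.
        self.network_of[source] = arrived_on
        # Broadcasts are always passed to the other network.
        if destination == BROADCAST:
            return True
        # Assumption: an unknown destination is forwarded to be safe.
        # A destination known to be on the arrival network is ignored.
        return self.network_of.get(destination) != arrived_on
```

In the example network, once the bridge has seen packets from A on
network 1 and from B and C on network 2, a packet from B to A is
forwarded and a packet from B to C is not.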

As mentioned above, generally bridges must pass broadcasts from one
network to the other. Broadcasts are often used to locate a resource.
The ARP request is a typical example of this. Since the bridge has no
way of knowing what host is going to answer the broadcast, it must
pass it on to the other network. Some newer bridges have
user-selectable filters. With them, it is possible to block some
broadcasts and allow others. You might allow ARP broadcasts (which
are essential for IP to function), but confine less essential
broadcasts to one network. For example, you might choose not to pass
rwhod broadcasts, which some systems use to keep track of every user
logged into every other system. You might decide that it is
sufficient for rwhod to know about the systems on a single segment of
the network.
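
A filter of this kind could look roughly like the following. This is
a sketch of the policy only, not the configuration syntax of any
actual bridge. It assumes the bridge can see the Ethernet type field
and, for IP packets, the UDP destination port. The type codes and
the rwhod port (UDP 513) are the standard values; the function name
and the blocked-port set are hypothetical.

```python
ETHERTYPE_IP = 0x0800
ETHERTYPE_ARP = 0x0806
RWHOD_UDP_PORT = 513                  # the "who" service used by rwhod

BLOCKED_UDP_PORTS = {RWHOD_UDP_PORT}  # hypothetical site policy


def pass_broadcast(ethertype, udp_dest_port=None):
    """Decide whether a broadcast should cross to the other network."""
    if ethertype == ETHERTYPE_ARP:
        return True                   # ARP is essential for IP to function
    if ethertype == ETHERTYPE_IP and udp_dest_port in BLOCKED_UDP_PORTS:
        return False                  # keep rwhod traffic on one segment
    return True                       # everything else is passed as usual
```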

Now let's take a look at two networks connected by a gateway:

             network 1                     network 2
              128.6.5                       128.6.4
        ====================   ==================================
          |            |          |             |            |
       ___|______  ____|__________|____  _______|___  _______|___
       128.6.5.2   128.6.5.1  128.6.4.1   128.6.4.3    128.6.4.4
       __________  ____________________  ___________  ___________
       computer A        gateway         computer B   computer C

Note that the gateway has IP addresses assigned to each interface.
The computers' routing tables are set up to forward through the
appropriate address. For example, computer A has a routing entry
saying that it should use the gateway 128.6.5.1 to get to subnet
128.6.4.

Because the computers know about the gateway, the gateway does not
need to scan all the packets on the Ethernet. The computers will send
packets to it when appropriate. For example, suppose computer A needs
to send a message to computer B. Its routing table will tell it to use
gateway 128.6.5.1. It will issue an ARP request for that address.
The gateway will respond to the ARP request, just as any host would.
From then on, packets destined for B will be sent with the gateway's
Ethernet address as their destination.
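
Computer A's side of this exchange can be sketched as a table lookup
that decides which IP address to ARP for. The table contents come
from the diagram; the function name is invented, and writing the
subnets as 128.6.5.0/24 and 128.6.4.0/24 assumes that the third octet
is the subnet number, as the notation "subnet 128.6.4" suggests.

```python
import ipaddress

# Computer A's own subnet and its one routing entry (from the diagram).
LOCAL_SUBNET = ipaddress.ip_network("128.6.5.0/24")
ROUTES = {
    ipaddress.ip_network("128.6.4.0/24"): ipaddress.ip_address("128.6.5.1"),
}


def arp_target(destination):
    """Return the IP address computer A must resolve with ARP."""
    dest = ipaddress.ip_address(destination)
    if dest in LOCAL_SUBNET:
        return str(dest)              # on our own network: ARP for the host
    for subnet, gateway in ROUTES.items():
        if dest in subnet:
            return str(gateway)       # elsewhere: ARP for the gateway
    raise LookupError("no route to " + destination)
```

For computer B (128.6.4.3) the lookup returns 128.6.5.1, so A sends an
ARP request for the gateway rather than for B itself.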

5.2.3 More about bridges

There are several advantages to using the MAC-level address, as a
bridge does. First, every packet on an Ethernet or IEEE network has
such an address. The address is in the same place for every packet,
whether it is IP, DECnet, or some other protocol. Thus it is
relatively fast to get the address from the packet. A gateway must
decode the entire IP header, and if it is to support protocols other
than IP, it must have software for each such protocol. This means
that a bridge automatically supports every possible protocol, whereas
a gateway requires specific provisions for each protocol it is to
support.
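
To show how little work is involved, here is a sketch of pulling the
addresses out of a packet. It assumes the usual Ethernet layout, in
which the first 6 bytes are the destination address, the next 6 are
the source address, and the following 2 bytes are the type field (on
IEEE 802.3 networks that field holds a length instead). The function
name is made up for the example.

```python
import struct


def parse_ethernet_header(frame):
    """Return (destination, source, type) from the first 14 bytes."""
    dest, source, ethertype = struct.unpack("!6s6sH", frame[:14])

    def as_text(address):
        # Conventional colon-separated hexadecimal notation.
        return ":".join("%02x" % byte for byte in address)

    return as_text(dest), as_text(source), ethertype
```

Nothing here depends on what follows the header, which is why the same
code serves for IP, DECnet, or any other protocol.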

However there are also disadvantages. The one that is intrinsic to
the design of a bridge is:

   - A bridge must look at every packet on the network, not just those
     addressed to it. Thus it is possible to overload a bridge by
     putting it on a very busy network, even if very little traffic is
     actually going through the bridge.

However there is another set of disadvantages that are based on the
way bridges are usually built. It is possible in principle to design
bridges that do not have these disadvantages, but I don't know of any
plans to do so. They all stem from the fact that bridges do not have
a complete routing table that describes the entire system of networks.