Re: Broadcast Storms


Charles Hedrick (hedrick@topaz.rutgers.edu)
Wed, 26 Aug 87 23:39:09 EDT


Maybe we need to keep a collection of known causes of broadcast
storms, and send them to the list once a month. Almost certainly what
is going on is that most of your machines are configured to expect
128.165.0.0 as a broadcast address (this would be the default for
4.2), and they have ipforwarding turned on. Thus a packet sent to
128.165.255.255 looks to them like a message for a host with that
address. Since ipforwarding is on, they try to be nice and forward
the message, so every one of them ARPs for 128.165.255.255. The real fix is to make
sure that every one of your machines has the stormproofing code in it
that I have posted several times. In brief:
 1) in ipintr, make sure that all possible broadcast addresses
        are recognized.
 2) in udp_input, fix the code that sends unreachables so that its
        test for broadcast addresses includes all possible addresses.
 3) in ip_forward, when ipforwarding is off, discard the packet
        in all cases with no error message. [4.3 has fixed this
        already, but not 4.2.] Leave ipforwarding off except on
        actual gateways, and if you use Unix hosts as gateways,
        make sure that proper Martian filtering, etc., is done.
However, it is often impossible to modify the code on every one of
your machines. In that case, a reasonable approach is to make sure
that every one of your machines agrees about the broadcast address.
4.2 systems will use net.0.0. 4.3 systems and Ultrix will default to
net.255.255, but allow an option -broadcast in ifconfig to set a
different address. I suggest that you set your Ultrix machine to use
128.165.0.0 as its broadcast address. This is a violation of the
standards, but it's better to have everyone on your network agree than
to have one lone machine be right.
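
To make the failure mode concrete, here is a rough sketch in Python (not
the kernel code, and not the patch referred to above) of how a host might
react to a packet depending on which broadcast address it expects. The
function names, the three outcome strings, and the choice of exactly four
"possible" broadcast forms are assumptions made for illustration; a
subnetted network would need the subnet forms added as well.

```python
import ipaddress

def candidate_broadcasts(net="128.165.0.0", prefix_len=16):
    """Broadcast forms a stormproofed host should recognize (illustrative set)."""
    network = ipaddress.ip_network(f"{net}/{prefix_len}")
    return {
        ipaddress.ip_address("0.0.0.0"),          # all zeros
        ipaddress.ip_address("255.255.255.255"),  # all ones
        network.network_address,                  # net.0.0, the 4.2 default
        network.broadcast_address,                # net.255.255, the 4.3/Ultrix default
    }

def host_reaction(dst, my_broadcast, ipforwarding, stormproofed):
    """Return 'accept', 'forward' or 'discard' for a packet addressed to dst.

    Simplified model: an unpatched host knows only its own configured
    broadcast address; a stormproofed host knows every candidate form.
    """
    dst = ipaddress.ip_address(dst)
    if stormproofed:
        known = candidate_broadcasts()
    else:
        known = {ipaddress.ip_address(my_broadcast)}
    if dst in known:
        return "accept"
    if ipforwarding:
        # The host believes dst is an ordinary host and ARPs for it.
        return "forward"
    # An unpatched 4.2 host may also send an error here; not modelled.
    return "discard"

# A 4.2-style host with ipforwarding on, seeing a 4.3-style broadcast:
print(host_reaction("128.165.255.255", "128.165.0.0", True, False))
# The same host once it recognizes all broadcast forms:
print(host_reaction("128.165.255.255", "128.165.0.0", True, True))
```

The first call is the storm case: every such host on the wire tries to
forward the packet. The second shows why recognizing all forms is enough
to stop it even when the configured addresses still disagree.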

I do not recommend rwho on big networks in any case. But it should
not cause storms. There are enough other uses of broadcasts that you
should make sure they are safe. I would set the broadcast address to
128.165.0.0 and turn rwho back on. Now use Etherfind on a Sun (or
netwatch on a PC, but if you have 100 machines on an Ethernet, it is
worth buying a Sun just to run Etherfind) and verify that you are not
seeing any ARPs for 128.165.0.0, nor any ICMP unreachables. If you
see either of these, start tracking down the hosts one by one and
fixing them.
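
If you can get the monitor's output into a file, the tracking down can be
partly automated. The following Python sketch assumes you have already
reduced the capture to (source, kind, detail) records; that record format,
the kind strings, and the function name are invented here for illustration
and are not the output format of Etherfind or netwatch.

```python
def find_offenders(records, broadcast="128.165.0.0"):
    """Map each misbehaving source host to the reasons it was flagged.

    records: iterable of (source, kind, detail) tuples, where kind is
    'arp-request' (detail = address being ARPed for) or
    'icmp-unreachable' (detail is ignored). Hypothetical format.
    """
    offenders = {}
    for src, kind, detail in records:
        if kind == "arp-request" and detail == broadcast:
            offenders.setdefault(src, set()).add("ARP for broadcast address")
        elif kind == "icmp-unreachable":
            offenders.setdefault(src, set()).add("ICMP unreachable")
    return {src: sorted(reasons) for src, reasons in sorted(offenders.items())}

# Made-up sample records, not a real capture.
sample = [
    ("128.165.1.7", "arp-request", "128.165.0.0"),
    ("128.165.1.9", "arp-request", "128.165.1.7"),   # ordinary ARP, fine
    ("128.165.2.3", "icmp-unreachable", ""),
    ("128.165.1.7", "icmp-unreachable", ""),
]
for host, reasons in find_offenders(sample).items():
    print(host, reasons)
```

Each host that shows up in the result is one to visit and fix; an empty
result after a broadcast has gone out is what you are hoping to see.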

If you are using level 2 bridges, you are asking for this sort of
thing. In that case, you should do this sort of test periodically to
make sure no new problems have crept into your network.

If there are any hosts that insist on sending garbage in response to
broadcasts, isolate them from the rest of your network with a gateway.


