Re: SUN 3.4 problems


Charles Hedrick (topaz.rutgers.edu!hedrick@RUTGERS.EDU)
27 Aug 87 19:21:58 GMT


I claim no expertise in SunOS 3.4. We are using 3.2 with
locally-added networking enhancements that put it somewhere between
3.3 and 3.4 in terms of functionality. However, from your results, it
sounds like Sun's diagnosis is right. The fact that your hosts all
get "ie0: no carrier" or "Ethernet jammed" strongly indicates a
broadcast storm. The fact that things work when you use a separate
Ethernet suggests that there is no error in your software or setup.

However, it's not quite right to say that the problem is with your
"network". The problem is not with the network itself, but with the
hosts on that network. If all of the hosts on it are Suns, then Sun
can't entirely avoid blame. 3.4 is based on 4.3BSD's version of IP.
3.2 is based on 4.2BSD's version of IP. Between 4.2 and 4.3, the
broadcast address was changed. (The people who changed the standard
should be shot. The amount of damage done to networks and the
reputation of IP due to inconsistent broadcast addresses is enormous.
By the way, this is not Berkeley's fault. The standard actually
changed.) Unfortunately, there are various bugs in 4.2 (and
presumably Sun 3.2), such that any disagreement over the broadcast
address can cause such a flurry of ICMP unreachables and ARPs that
the network becomes unusable.

The solution is going to depend upon the particular set of machines
on your network. You have two choices: find some broadcast address on
which everyone can agree, or split the network.

4.3-based systems allow you to set the broadcast address. So do some
4.2-based systems that contain "4.3 enhancements". This includes
Ultrix and Pyramid. Unmodified 4.2 systems use net.0 as the broadcast
address. E.g., if your network number is 128.6, your broadcast address
is 128.6.0.0. The new standard allows either 128.6.255.255 or
255.255.255.255. If you are using subnets, things get more complex.
4.2 didn't support subnets, but if you patched your 4.2 to do so, you
will probably have ended up with a broadcast address of net.subnet.0.
E.g., for us a typical one would be 128.6.4.0. The new standard, and
4.3, say that the correct broadcast address for a subnetted network
is 128.6.4.255.
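
To make the two conventions concrete, here is a small sketch in Python
that lists the candidate broadcast addresses for a network or subnet.
It is only an illustration: the function name broadcast_candidates and
the dictionary keys are made up for this example, and the "/16" and
"/24" prefix notation is a modern shorthand for the netmask, not
something 4.2 or 4.3 uses.

```python
import ipaddress


def broadcast_candidates(network: str) -> dict:
    """Return the broadcast addresses discussed above for one (sub)network.

    `network` is given in prefix notation, e.g. "128.6.0.0/16".
    The function name and the dictionary keys are hypothetical.
    """
    net = ipaddress.ip_network(network)
    return {
        # Old 4.2 convention: host part all zeros (net.0 or net.subnet.0).
        "old_4.2": str(net.network_address),
        # New standard and 4.3: host part all ones.
        "new_4.3": str(net.broadcast_address),
        # The new standard also allows the all-ones address.
        "all_ones": "255.255.255.255",
    }


if __name__ == "__main__":
    print(broadcast_candidates("128.6.0.0/16"))  # unsubnetted network 128.6
    print(broadcast_candidates("128.6.4.0/24"))  # subnet 128.6.4
```

Whichever entry every host on the wire can be made to agree on is the
one to configure.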

One approach would be to tell your 4.3-based systems (i.e. your
Sun 3.4 systems) to use the old broadcast address. There should be
an option to ifconfig to do this. What bothers me is that this
option may not take effect during the early stages of booting.
However, the simplest thing to try would be to change the ifconfig
commands, normally present in /etc/rc or /etc/rc.boot, to contain
the appropriate option. Assuming you don't use subnets, this would
be something like
  ifconfig ie0 `/bin/hostname` up -trailers broadcast 128.6.0.0
Everything up to "broadcast" should be whatever your ifconfig command
is now. It may be that the option is -broadcast. You should use your
own net number in place of 128.6.0.0. You must make this change to
/etc/rc.boot for every individual client partition. This means you'll
have to bring up the clients one by one in single-user mode, or just
mount the partitions on the server, using /dev/ndlx (making sure that
the clients are not running at the time). You might try this for a
few clients to see whether it fixes your problem, before doing it on
all of them.
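
If you have many client partitions to edit, the change to each
ifconfig line is mechanical. The sketch below, in Python, shows the
text transformation only. The function name add_broadcast_option is
made up, it assumes the option is spelled "broadcast" (not
"-broadcast"), and it does not attempt to handle trailing comments or
to locate the rc files for you.

```python
def add_broadcast_option(line: str, broadcast: str) -> str:
    """Append 'broadcast <addr>' to an ifconfig command line.

    Lines that are not ifconfig commands, or that already carry a
    broadcast option, are returned unchanged. Hypothetical helper;
    assumes the option keyword is 'broadcast'.
    """
    words = line.split()
    # Accept both "ifconfig" and a full path such as "/etc/ifconfig".
    is_ifconfig = bool(words) and words[0].split("/")[-1] == "ifconfig"
    if not is_ifconfig or "broadcast" in words:
        return line
    return line.rstrip() + " broadcast " + broadcast


if __name__ == "__main__":
    old = "ifconfig ie0 `/bin/hostname` up -trailers"
    print(add_broadcast_option(old, "128.6.0.0"))
```

Check the result by hand on one client before applying it everywhere.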

In retrospect, Sun would probably have been better off distributing
3.4 with the old broadcast address as a default. Once everyone had
upgraded to 3.4, the next release could safely move to the new
address, since 3.4 should (if it is properly implemented) accept
either. At the very least the setup program should provide this as an
option. (Of course I haven't seen 3.4 yet -- maybe it does.)
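
By "accept either" I mean a receive-side test along the following
lines. This is a conceptual sketch in Python, not the 4.3 or Sun
kernel code: the function name is_broadcast is made up, and the
network is written in modern prefix notation for convenience.

```python
import ipaddress


def is_broadcast(destination: str, network: str) -> bool:
    """Return True if `destination` is a broadcast address for `network`
    under either the old or the new convention.

    Conceptual illustration only; hypothetical name and interface.
    """
    net = ipaddress.ip_network(network)
    dst = ipaddress.ip_address(destination)
    accepted = {
        net.network_address,                      # old form: host part zero
        net.broadcast_address,                    # new form: host part ones
        ipaddress.ip_address("255.255.255.255"),  # all-ones address
    }
    return dst in accepted


if __name__ == "__main__":
    for addr in ("128.6.4.0", "128.6.4.255", "255.255.255.255", "128.6.4.7"):
        print(addr, is_broadcast(addr, "128.6.4.0/24"))
```

A host that treats all three forms as broadcasts can coexist with
neighbors that still send the old one.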

Other approaches to this problem are to fix all your existing systems
to accept the new address (which may be the best solution if you
have source to them -- we can give you the changes), or to put a
gateway between your 3.4 systems and everything else. If you don't
have any other kind of gateway, you could add a second Ethernet board
to one of your servers and use it as a gateway.

Finally, if all of your systems are Suns, the simplest thing to do is
simply to upgrade them all at once. Bring them all down, and then
bring them up one by one on 3.4.


