WARNING: TOD clock not initialized -- CHECK AND RESET THE DATE!


Alexander Dupuy (westend!dupuy@columbia.edu)
1 Jan 88 08:16:36 GMT


Ever since the leap second (23:59:60 GMT Dec 31, 1987) the realtime clocks on
Sun-3s have been behaving strangely. When booting /vmunix, just after the
message about using nn buffers, the kernel prints out a little message like the
one above. That's not too bothersome for us, since we use rdate and ntpd to keep
our Suns' clocks in sync anyhow.

What is bothersome is that the system clocks have started to slew wildly.
Using a little program I hacked up, I have found that there are spurious deltas
showing up in adjtime(2) on *ANY SUN-3* which has had its time set or adjusted
since the leap second. Running the following program a few times:

adjtime.c
---------
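#include <stdio.h>
#include <sys/time.h>

/* Passing a zero delta cancels any pending adjustment; adjtime() returns
 * in olddelta whatever correction was still outstanding. */
struct timeval delta = { 0, 0 },
            olddelta = { 0, 0 };

int main (void)
{
        if ( adjtime (&delta, &olddelta) == -1)
                perror ("adjtime");

        /* tv_sec and tv_usec are not plain ints everywhere; cast to long. */
        printf ("adjust %ld.%ld, oldslew %ld.%ld\n",
                (long) delta.tv_sec, (long) delta.tv_usec,
                (long) olddelta.tv_sec, (long) olddelta.tv_usec);
        return 0;
}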

I get results like this:

Script started on Fri Jan 1 02:14:18 1988

finest# alias adj '/src/local/local/netdate/adjtime; date'
finest# adj
adjust 0.0, oldslew -1490.-143408
Fri Jan 1 02:15:26 EST 1988
westend# adj
adjust 0.0, oldslew 0.0
Fri Jan 1 02:15:32 EST 1988
westend# adj
adjust 0.0, oldslew -1729.-961408
Fri Jan 1 02:15:46 EST 1988
westend#

westend# date 8712311650
Thu Dec 31 16:50:00 EST 1987
westend# adj
adjust 0.0, oldslew 0.0
Thu Dec 31 16:50:09 EST 1987
westend# adj
adjust 0.0, oldslew 0.0
Thu Dec 31 16:50:29 EST 1987
westend# adj
adjust 0.0, oldslew 0.0
Thu Dec 31 16:50:48 EST 1987

script done on Thu Dec 31 16:50:49 1987

As can be seen, every ten to fifteen seconds, some monstrous time adjustment
gets added in by the kernel. This is *not* being done by ntpd or any other time
daemon; it even happens in single-user mode. It can also be seen that after
the date is reset to 1987 (GMT) this behavior disappears, and time stabilizes.
The silly message when booting disappears as well.
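
A note on reading the oldslew values in the transcript: a figure such as
-1490.-143408 is just the two signed fields of a struct timeval printed side by
side, so it stands for roughly -1490.143408 seconds. The following is a minimal
sketch of a formatter that folds the two fields into one signed decimal value.
The name format_timeval is hypothetical (it is not part of any Sun or BSD
library), and the sketch assumes a C compiler with long long and snprintf,
which the compilers of this era do not provide.

```c
#include <stdio.h>
#include <sys/time.h>

/*
 * format_timeval (hypothetical helper, not a system call or library routine):
 * write tv as signed decimal seconds with exactly six fractional digits.
 * Works whether the sign is carried by both fields (as in the transcript)
 * or only by tv_sec with a non-negative tv_usec.
 * Returns whatever snprintf returns.
 */
int format_timeval (char *buf, size_t len, const struct timeval *tv)
{
        /* Fold both fields into one signed count of microseconds. */
        long long total = (long long) tv->tv_sec * 1000000LL
                        + (long long) tv->tv_usec;
        const char *sign = (total < 0) ? "-" : "";

        if (total < 0)
                total = -total;

        /* %06lld zero-pads the fraction, so 5 usec prints as .000005 */
        return snprintf (buf, len, "%s%lld.%06lld", sign,
                         total / 1000000LL, total % 1000000LL);
}
```

Fed the values from the transcript, this gives -1490.143408 and -1729.961408,
which makes the size of the bogus adjustments (about 25 and 29 minutes) easier
to see at a glance.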

So it looks like the guilty party is /sys/sundev/clock.c. But not having
source code, what can I do?

Other observations: Our Sun-2s (bless their little obsolete cpus) have not even
stuttered since the leap second went down. Their TOD clock code seems to be
just fine.

So will someone with access to Sun kernel sources please help me out? This is
a serious bug, and I imagine Sun will have a patched OBJ/clock.o for binary
sites eventually, but in the meantime, it is stretching the resources of ntpd
to even keep the machines within a *minute* or so of true time. The poor
machines which aren't running ntp are okay until they are rebooted, or someone
foolishly tries to set their time, but once that happens, their watch gears get
unsprung.

@alex

---
arpanet: dupuy@columbia.edu
uucp:	...!seismo!columbia!dupuy



