network monitoring


Charles Hedrick (hedrick@topaz.rutgers.edu)
Fri, 16 Oct 87 00:56:17 EDT


I have just started to keep statistics and generate reports from our
cisco gateways. I had been waiting for HEMP to finish, and cisco to
implement some whizbang ASN.1 monster. Then I realized that really
that is unnecessary. A simple program can connect to a gateway and by
issuing "show" commands get just about any piece of data I could ever
want. The issue is not getting data. I can easily get so much data
that I drown in paper. The issue is what to do with it once I have
it. So the question is, does anybody have enough experience with
network monitoring to know what kinds of statistics it is useful to
collect and what kinds of reports it is useful to produce? For the
moment, I'm collecting data hourly, and producing daily reports on
errors and other items of comparatively short-term interest. (Of
course we don't wait for the daily report to know that a line is down.
We have monitoring tools that ping gateways and selected hosts
regularly, so we know when something is down very soon.) I am also
collecting packet counts for all the gateways, as well as counts of
some events that might indicate that the gateways are overloaded (if
they ever happened, which they don't seem to). From this I plan to
produce usage reports weekly or monthly, and generate long-term
trends. (Of course we all know what the graphs will look like, but
administrators like to see graphs showing that the stuff they have
paid for is getting growing usage.) Also I will probably try to pull
out some specific numbers like the busiest hour, and usage vs. time of
day. But there are zillions of things like this I could do. Does
anyone have any suggestions as to which ones turn out to be useful?
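
A minimal sketch of how the busiest hour and usage by time of day might be pulled out of hourly samples, written in Python rather than the awk used for the report below. The sample layout, the function names, and the rule for handling a counter that restarts are all assumptions made for illustration; none of them come from the actual scripts.

```python
from collections import defaultdict
from datetime import datetime


def interval_counts(samples):
    """Convert (time, cumulative_packets) samples into per-interval counts.

    Assumption: a counter that goes backwards restarted from zero (for
    example after a reload), so the new reading is taken as the count.
    """
    counts = []
    for (_, prev), (when, cur) in zip(samples, samples[1:]):
        counts.append((when, cur - prev if cur >= prev else cur))
    return counts


def busiest_hour(counts):
    """Return the (time, packets) pair with the largest packet count."""
    return max(counts, key=lambda c: c[1])


def usage_by_hour_of_day(counts):
    """Sum packet counts by hour of day (0-23) across all days sampled."""
    totals = defaultdict(int)
    for when, packets in counts:
        totals[when.hour] += packets
    return dict(totals)


# Hypothetical hourly readings of one interface's input packet counter.
samples = [
    (datetime(1987, 10, 15, 0, 55), 1000),
    (datetime(1987, 10, 15, 1, 55), 1500),
    (datetime(1987, 10, 15, 2, 55), 1600),
    (datetime(1987, 10, 15, 3, 55), 40),   # counter restarted after a reload
    (datetime(1987, 10, 15, 4, 55), 940),
]
counts = interval_counts(samples)
print(busiest_hour(counts))
print(usage_by_hour_of_day(counts))
```

Working from differences between successive readings, rather than the raw counters, keeps a reload from showing up as a huge negative or bogus value in the long-term totals.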

For your amusement, here's one of my daily error reports. (This is
done more or less entirely in awk, by the way.) [In case anybody
actually looks at it, a couple of comments:
  The reloads were to bring up new software.
  The large numbers of resets on some interfaces are mostly typical
        of 3Com Multibus Ethernet cards. They don't seem to
        indicate anything wrong. The Interlan cards on our
        newer boxes don't seem to do this.
  "lo-input" means an hour in which there were fewer than 10 packets
        input. This could indicate that something has stopped
        hearing the network. In this case it happens to be
        interfaces whose networks aren't completely in service yet.
]
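
The "lo-input" rule described above can be sketched in a few lines of Python. The threshold of 10 packets per hour is from the description; the function name, the data layout, and the sample numbers are hypothetical.

```python
def count_lo_input(hourly_input_packets, threshold=10):
    """Count the hours whose input packet count is below the threshold."""
    return sum(1 for packets in hourly_input_packets if packets < threshold)


# Hypothetical per-hour input packet counts for two interfaces over six hours.
hourly = {
    "Ethernet0": [812, 640, 955, 1201, 1100, 870],
    "Serial0": [530, 0, 0, 3, 412, 390],
}
lo_input = {name: count_lo_input(counts) for name, counts in hourly.items()}
print(lo_input)
```

A nonzero count is only a hint that an interface may have stopped hearing the network; as noted above, it also shows up on networks that are not fully in service yet.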

 Path: topaz.rutgers.edu!aramis.rutgers.edu!hedrick
 From: hedrick@aramis.rutgers.edu
 Newsgroups: ru.netlog
 Subject: gateway errors
 Message-ID: <1907@aramis.rutgers.edu>
 Date: 16 Oct 87 04:13:08 GMT
 Sender: root@aramis.rutgers.edu
 Lines: 55

Errors for lcsr-gw

Thu 1987 Oct 15 02:55:05 reload 1
Thu 1987 Oct 15 02:55:06 Ethernet0 state up
Thu 1987 Oct 15 02:55:06 Ethernet1 state up
Thu 1987 Oct 15 02:55:06 Ethernet2 state up
Thu 1987 Oct 15 02:55:06 Ethernet3 state up

interface  address      in-errs  out-errs  resets  hangs  in-hangs  lo-input

Ethernet0  128.6.4.1          0         0       0      0         0         0
Ethernet1  128.6.5.41         0        39      11      0         0         0
Ethernet2  128.6.13.1         0         0       0      0         0         0
Ethernet3  128.6.21.3         0         0       0      0         0         0

Errors for nb-gw

Thu 1987 Oct 15 02:55:17 reload 1
Thu 1987 Oct 15 02:55:17 Ethernet1 state up
Thu 1987 Oct 15 02:55:17 Ethernet2 state up
Thu 1987 Oct 15 02:55:17 Ethernet3 state up
Thu 1987 Oct 15 02:55:17 Ethernet0 state up
Thu 1987 Oct 15 02:55:17 Serial0 state up
Thu 1987 Oct 15 02:55:17 DDN-18220 state up

interface  address      in-errs  out-errs  resets  hangs  in-hangs  lo-input

Serial0    128.6.254.1        3         0       0      0         0         0
DDN-18220  10.1.0.89          4         0       0      0         0         3
Ethernet0  128.6.13.39        0         0       1      0         0         0
Ethernet1  128.6.21.1        11       206      18      0         0         0
Ethernet2  128.6.4.27        33      1448      85      0         0         0
Ethernet3  128.6.7.1          0       249      15      0         0         0

Errors for eng-gw

interface  address      in-errs  out-errs  resets  hangs  in-hangs  lo-input

Ethernet0  128.6.21.2         0         0       0      0         0         0
Ethernet1  128.6.3.13         0         0       0      0         0         0
Ethernet2  128.6.14.1         0         0       0      0         0         0
Ethernet3  128.6.22.1         0         0       0      0         0        23

Errors for ccis-gw

Thu 1987 Oct 15 10:55:35 Serial0 state down
Thu 1987 Oct 15 18:55:54 Serial0 state up

interface  address      in-errs  out-errs  resets  hangs  in-hangs  lo-input

Serial0    128.6.253.2       11         0       0      0         0        15
Serial1    128.6.252.2        0         0       0      0         0         0
Ethernet0  128.6.7.2          0         0       0      0         0         0
Ethernet1  128.6.21.7         0         0       0      0         0         0
Ethernet3  128.6.18.1        11        12       1      0         0         0
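
A report in the layout above is simple to post-process further. The following Python sketch reads the whitespace-separated interface rows back into records and lists the interfaces whose error counts exceed a limit. The parser, its function names, and the limit of 100 are assumptions for illustration and are not part of the awk scripts that produced the report; the sample text is an excerpt of the nb-gw section.

```python
COLUMNS = ["in-errs", "out-errs", "resets", "hangs", "in-hangs", "lo-input"]


def parse_error_report(text):
    """Parse 'Errors for <gateway>' sections into one record per interface."""
    rows = []
    gateway = None
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 3 and fields[:2] == ["Errors", "for"]:
            gateway = fields[2]
        elif len(fields) == 8 and all(f.isdigit() for f in fields[2:]):
            # Interface row: name, address, then the six counters.
            row = {"gateway": gateway, "interface": fields[0], "address": fields[1]}
            row.update(zip(COLUMNS, map(int, fields[2:])))
            rows.append(row)
        # Event lines, the column header, and blank lines are skipped.
    return rows


def over_limit(rows, column, limit):
    """Return (gateway, interface, value) for rows where column > limit."""
    return [(r["gateway"], r["interface"], r[column])
            for r in rows if r[column] > limit]


REPORT = """\
Errors for nb-gw

Thu 1987 Oct 15 02:55:17 Serial0 state up

interface address in-errs out-errs resets hangs in-hangs lo-input

Serial0 128.6.254.1 3 0 0 0 0 0
Ethernet1 128.6.21.1 11 206 18 0 0 0
Ethernet2 128.6.4.27 33 1448 85 0 0 0
"""

rows = parse_error_report(REPORT)
print(over_limit(rows, "out-errs", 100))
```

Because the rows are split on runs of whitespace, the same parser accepts the columns whether they are aligned or separated by single spaces.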


