Re: [Phil Dykstra: more interesting numbers]


Phil Dykstra (phil@BRL.ARPA)
Thu, 21 Apr 88 1:25:37 EDT


> Who is responsible for the remaining traffic?

Good question. I would bet that it is the old GGP-induced
"extra-hop" problem. Speaking EGP on the MILNET side of the world,
I can't verify this in your case, but here is an example of this very
bad phenomenon in action on this half of the core system.

Several weeks ago a typical set of EGP routes received from MINET-GW
(a MILNET Core EGP speaker) looked like this (summarized by number of
routes per gateway):

 # routes  Example net   Gateway
    307    10.0.0.0      26.0.0.106   Random "mailbridge"
     19    128.165.0.0   26.3.0.75    EGP speaker (YUMA-GW)
     18    128.171.0.0   26.1.0.65    EGP speaker (AERO-GW)
      7    128.56.0.0    26.3.0.29    BRL gw1
      6    128.20.0.0    26.2.0.29    BRL gw2
      4    128.115.0.0   26.6.0.21
      4    128.102.0.0   26.4.0.16
      3    128.60.0.0    26.20.0.8
      3    128.229.0.0   26.0.0.103
      2    129.43.0.0    26.0.0.88
      2    128.47.0.0    26.5.0.60
      2    128.122.0.0   26.0.0.58
      1    192.5.13.0    26.2.0.55
      1    192.31.98.0   26.5.0.129
    ... many more single route entries ...

The mailbridge 26.0.0.106 happened to be the "choice of the day" for
routes via the ARPANET. Seeing a very large number of routes to a
single mailbridge is quite common; it changes every few hours or days.
[By the way, I would like to hear whether this is load balanced on a
per-peer basis or something, or whether everyone polling a given EGP
speaker gets the same selection.]

But the real problem is the 37 routes to the Core EGP speakers! We
got these routes by polling the MINET gateway, and MINET did what
it was supposed to do - never gave ITSELF as a route to anything.
Any exterior gateway which advertised its routes to MINET came out
correctly. However, >>> any exterior gateway which advertised its
routes to YUMA and/or AERO but not to MINET (i.e. not to the EGP
peer that we polled), showed up as reachable via (one of) the EGP
peer(s) that it spoke to! <<<

This is a serious problem, because besides the silliness of inducing
an extra "hop" to reach those networks, it also directs a large amount
of traffic to the Core EGP speakers - something which BBN(?) has been
trying to avoid! Thus to answer Thomas Narten's question (I gather
that the machine in question is an ARPA-side Core EGP speaker): The
traffic is probably "extra-hop problem" induced.

How It Happens (in brief - those that know this can skip it):

Internal to the Core system GGP is used to communicate route information.
A GGP speaker can only say "I CAN REACH netX", not HOW. EGP on the other
hand says "I CAN REACH netX VIA gateY." When you speak EGP to one of
the Core EGP speakers, he learns how to reach your nets VIA your
gateway. If you ask that same EGP speaker how to get to netX you will
get the "correct" answer - gateY. However, if you ask a *different* EGP
speaker, his knowledge of the network in question came via GGP in which
the first Core EGP speaker simply said "I CAN REACH netX." The HOW
part, i.e. the gateway that advertised netX in the first place has been
dropped (due to this GGP limitation). Thus someone receiving this
information will end up (needlessly) sending packets to the Core EGP
speaker netX was advertised to, rather than to the gateway that
advertised it.
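
To make the mechanism concrete, here is a minimal sketch in Python. It is
only a toy model, not how the Core gateways are actually implemented: the
CoreSpeaker class and its method names are made up for illustration, and
"netX" and "gateY" are the placeholders from the explanation above.

```python
class CoreSpeaker:
    """Toy model of a Core EGP speaker (hypothetical, for illustration only)."""

    def __init__(self, name):
        self.name = name
        self.routes = {}      # net -> next-hop gateway as this speaker knows it
        self.via_egp = set()  # nets learned directly from an EGP neighbor

    def hear_egp(self, net, gateway):
        # EGP says "I CAN REACH net VIA gateway": the gateway is preserved.
        self.routes[net] = gateway
        self.via_egp.add(net)

    def hear_ggp(self, net, announcer):
        # GGP says only "I CAN REACH net": the best we can record is the
        # announcing speaker itself. A direct EGP route is never overwritten.
        if net not in self.via_egp:
            self.routes[net] = announcer.name

    def announce_ggp(self, peers):
        # Pass on nets learned directly via EGP to the other core speakers.
        for net in self.via_egp:
            for peer in peers:
                peer.hear_ggp(net, self)

    def answer_egp_poll(self):
        # A speaker never gives ITSELF as the route to anything.
        return {net: gw for net, gw in self.routes.items() if gw != self.name}


yuma = CoreSpeaker("YUMA")
minet = CoreSpeaker("MINET")

yuma.hear_egp("netX", "gateY")   # gateY advertises netX to YUMA only
yuma.announce_ggp([minet])       # YUMA tells MINET "I CAN REACH netX"

print(yuma.answer_egp_poll())    # {'netX': 'gateY'}
print(minet.answer_egp_poll())   # {'netX': 'YUMA'}
```

Polling YUMA gives the direct route; polling MINET gives the extra-hop
route through YUMA, because the "VIA gateY" part never crossed GGP.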

How To Avoid the Problem:

You can prevent others from getting extra-hop routes to YOU by advertising
your nets to all available EGP speakers. You can avoid getting extra-hop
routes to someone else by polling all available EGP speakers for routes
and favoring those routes that DON'T point to an EGP speaker. [The real
solution of course is to fix GGP.]
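
The receiving half of that advice might be sketched as follows. This is
an assumption-laden illustration, not any particular gateway's code: the
function name merge_polls, the dict-per-poll representation, and the
placeholder names are all hypothetical, and it assumes you already know
which gateway addresses belong to Core EGP speakers.

```python
def merge_polls(polls, egp_speakers):
    """Merge route tables polled from several EGP speakers.

    polls:        list of dicts mapping net -> advertised gateway
    egp_speakers: set of gateways known to be Core EGP speakers
    """
    chosen = {}
    for routes in polls:
        for net, gateway in routes.items():
            current = chosen.get(net)
            if current is None:
                # First route seen for this net: take it for now.
                chosen[net] = gateway
            elif current in egp_speakers and gateway not in egp_speakers:
                # Replace a (probable) extra-hop route with a direct one.
                chosen[net] = gateway
    return chosen


egp_speakers = {"YUMA", "AERO"}
from_minet = {"netX": "YUMA", "netZ": "AERO"}  # extra-hop routes
from_yuma = {"netX": "gateY"}                  # direct route to netX

merged = merge_polls([from_minet, from_yuma], egp_speakers)
print(merged)  # {'netX': 'gateY', 'netZ': 'AERO'}
```

Note that netZ still points at AERO: if no polled speaker knows the
direct gateway, the extra-hop route is the only one available.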

Of course if everyone did the above the EGP speakers would be all the
more loaded. One could also question at that point why there was more
than one EGP speaker. On the other hand, licking the extra hop problem
might get a lot of unnecessary non-EGP traffic off of the EGP speakers.
It's hard to tell where the balance would lie. [It is interesting to
note in the recent timetable how the EGP speakers were upgraded before
most of the mailbridges were.]

My apologies for such a long-winded answer, but it has been a long time
since anyone discussed this problem on this list.

- Phil
<phil@brl.arpa>


