TCP RTT woes revisited


Craig Partridge (craig@loki.bbn.com)
Sun, 14 Dec 86 11:52:37 -0500


    This weekend I had time to start processing Van Jacobson's suggested
fixes/modifications. Things started working very well after the first fix,
which made TCP choose better fragment sizes and increased the time to live
for IP fragments.

    The subsequent testing also revealed some interesting results. (These
are preliminary and subject to reappraisal.)

    (1) EACKs appear to make a huge difference in use of the network.
    After seeing signs this was the case, I ran the simple test of
    pushing 50,000 data packets through a software loopback that
    dropped 4% of the packets.

    With EACKs there were 1,930 retransmissions, of which only 1
    arrived at the receiver as a duplicate (note that some of the
    retransmissions were also dropped).

    Without EACKs there were 12,462 retransmissions, of which 9,344
    arrived at the receiver as duplicates.

    12,462 retransmissions is, of course, bad news, and comes from
    the fact that this RDP sends up to four packets in parallel.
    Typically the four get put into the send queue in the same
    tick of the timer, so when the first gets retransmitted,
    all four do. The moral seems to be: use EACKs, even though
    they aren't required for a conforming implementation.
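
    For what it's worth, here is a minimal sketch (in C; all the names
    are invented, and RFC 908 defines the real EACK segment format) of
    why EACKs cut the duplicate count: an EACK lets the sender take an
    individually-received packet off the retransmit queue, so a timeout
    resends only the packets that were actually lost rather than the
    whole burst that shared a timer tick.

        #include <stdio.h>

        #define WINDOW 4        /* this RDP sends up to 4 packets in parallel */

        struct outpkt {
            unsigned seq;       /* sequence number */
            int outstanding;    /* 1 = unacked, still on the retransmit queue */
        };

        /* An EACK names an individually-received, out-of-order packet;
         * take it off the retransmit queue. */
        void eack_process(struct outpkt q[], int n, unsigned eacked_seq)
        {
            int i;
            for (i = 0; i < n; i++)
                if (q[i].seq == eacked_seq)
                    q[i].outstanding = 0;
        }

        /* On a timer tick, resend only what is still outstanding. */
        void timeout_retransmit(struct outpkt q[], int n)
        {
            int i;
            for (i = 0; i < n; i++)
                if (q[i].outstanding)
                    printf("retransmit %u\n", q[i].seq);
        }

        int main(void)
        {
            struct outpkt q[WINDOW] = { {1,1}, {2,1}, {3,1}, {4,1} };

            eack_process(q, WINDOW, 3);     /* 3 and 4 arrived out of order */
            eack_process(q, WINDOW, 4);

            timeout_retransmit(q, WINDOW);  /* resends only 1 and 2 */
            return 0;
        }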

    (2) Lixia Zhang's suggestion that one use the RTT of the SYN to
    compute the initial timeout estimate appears to work very well.
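
    A sketch of what that amounts to (the names and units are mine, and
    the factor of 2 is just the conventional beta, not necessarily what
    the implementation uses):

        #include <stdio.h>

        struct conn {
            long srtt;          /* smoothed round-trip time, ms */
            long rto;           /* retransmission timeout, ms */
        };

        /* Called when the ack of our SYN arrives: seed the estimator
         * with the measured SYN round trip instead of a fixed default,
         * so the first data packets get a sensible timeout. */
        void syn_acked(struct conn *cp, long syn_rtt_ms)
        {
            cp->srtt = syn_rtt_ms;
            cp->rto = 2 * cp->srtt;
        }

        int main(void)
        {
            struct conn c;
            syn_acked(&c, 1850);    /* e.g. a 1.85 second SYN round trip */
            printf("initial srtt = %ld ms, rto = %ld ms\n", c.srtt, c.rto);
            return 0;
        }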

    (3) EACKs may make it possible to all but stomp out RTT feedback
    (those unfortunate cases where a dropped packet leads to an RTT of
    (number of retries * SRTT) + SRTT being used to compute a new SRTT).
    I've been experimenting with discarding RTTs for out-of-order
    acks. This is best explained by example. If packets 1, 2, 3
    and 4 are sent, and the first ack is an EACK for 3, the implementation
    uses the RTT for 3 to recompute the SRTT, but discards the RTTs
    for 1 and 2 when they are eventually acked (or EACKed). (A sketch
    of this discard rule appears at the end of this point.) The
    argument in favor of this scheme is that the acks for 1 and 2
    probably represent either (a) RTTs for packets that were dropped,
    in which case including them would lead to feedback, or (b) RTTs
    that reflect an earlier (and slower) state of the network (3 was
    sent after 1 and 2), in which case using them would make the SRTT
    a poorer predictor of the RTT of the next packet. Note that (b)
    would be more convincing if it weren't the case that 1, 2, 3 and 4
    were probably sent within a few milliseconds of each other.

    Over 5 trial runs of 100 64-byte data packets bounced off Goonhilly,
    this algorithm kept the SRTT within the observed range of real RTTs
    (as opposed to RTTs for packets that were dropped and had to be
    retransmitted).

    Using EACKs but taking the RTT for every packet (again over 5 trial
    runs), several cases of RTT feedback were seen. In one case the SRTT
    soared to ~35 seconds when a few packets were dropped in a short period.
    Since the implementation uses Mills' suggested changes, which make
    lowering the SRTT take longer than raising it, the SRTT took some
    time to recover.
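
    To make the discard rule (and the asymmetric smoothing just
    mentioned) concrete, here is a minimal sketch. All names are
    invented, the gains are illustrative only, and real code would use
    fixed-point arithmetic rather than doubles.

        #include <stdio.h>

        static double srtt = 2.0;           /* seeded earlier, e.g. from the SYN */
        static unsigned highest_timed = 0;  /* highest seq whose ack updated SRTT */

        /* Asymmetric smoothing in the spirit of Mills' changes: a
         * sample above SRTT pulls the estimate up quickly, one below
         * pulls it down slowly, so a single optimistic sample can't
         * collapse the timeout.  These gains are made up. */
        static void srtt_update(double sample)
        {
            double gain = (sample > srtt) ? 0.5 : 0.125;
            srtt += gain * (sample - srtt);
        }

        /* The discard rule: once an ack (or EACK) for packet seq has
         * updated SRTT, later acks for lower-numbered packets are
         * ignored; they are either inflated by retransmission or stale. */
        static void ack_rtt(unsigned seq, double sample)
        {
            if (seq <= highest_timed) {
                printf("seq %u: RTT %.2fs discarded\n", seq, sample);
                return;
            }
            highest_timed = seq;
            srtt_update(sample);
            printf("seq %u: RTT %.2fs used, srtt now %.2fs\n", seq, sample, srtt);
        }

        int main(void)
        {
            /* Packets 1..4 sent together; the first ack back is an
             * EACK for 3, as in the example above. */
            ack_rtt(3, 2.2);    /* used */
            ack_rtt(1, 6.3);    /* discarded: packet was retransmitted */
            ack_rtt(2, 6.1);    /* discarded */
            ack_rtt(4, 2.4);    /* used */
            return 0;
        }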

People may be wondering about observed throughput. How fast does RDP
run vis-a-vis TCP? That turns out to be very difficult to answer.
Identical tests run in parallel or one right after another give
throughput rates that vary by factors of 2 or more. As a result it
is difficult to get throughput numbers that demonstrably show differences
which reflect more than random variation. After running tests for 7
weekends (and millions of packets) I have some theories, but those keep
changing as different tests are run.

Craig

P.S. Those millions of packets are almost all over a software loopback.
The contribution to network congestion has been small.


