TCP checksum unrolling


Geof Cooper (imagen!geof@decwrl.dec.com)
Tue, 6 Oct 87 16:08:57 pdt


Here is one, undebugged, that illustrates the concept. It
also uses the trick that (if you have it) you can use 32-bit
two's complement addition and add all the carries in at the
end (another trick that is sometimes faster is to generate
a 32-bit one's complement sum and then add the top and bottom
halves together to get the 16-bit sum). Some C compilers
won't accept the weird syntax below; or maybe I should point
out, as you retch on the floor, that there is at least ONE
C compiler that DOES accept this syntax.
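
The parenthetical trick above (a 32-bit one's complement sum,
then adding the halves) could be sketched roughly as follows.
This is an illustration only, not the routine from this message:
the names oc_add32 and checksum32 are made up, and it assumes
each 32-bit value holds two consecutive 16-bit words of the data.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: one's complement add of two 32-bit values.
 * A carry out of bit 31 is added back in at bit 0. */
static uint32_t oc_add32(uint32_t a, uint32_t b)
{
    uint32_t s = a + b;

    if (s < a)      /* unsigned wraparound means a carry came out */
        s++;        /* end-around carry; cannot overflow again */
    return s;
}

/* Hypothetical name: sum n 32-bit words, then add the top and
 * bottom halves together to get the 16-bit sum. */
static uint16_t checksum32(const uint32_t *p, size_t n)
{
    uint32_t sum = 0;

    while (n-- > 0)
        sum = oc_add32(sum, *p++);

    sum = (sum >> 16) + (sum & 0xffff);
    sum = (sum >> 16) + (sum & 0xffff);   /* absorb the carry from the first fold */
    return (uint16_t)sum;
}
```

For the 16-bit words ffff 0001 1234 f00d, packed as ffff0001 and
1234f00d, this gives 0243, the same as adding the four words
one at a time with end-around carry.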

It is trivial to code it for all C compilers -- but
what you really want to do is code the exact intent of the
following into assembly language. That makes it a lot faster
to add the two halves of a 32-bit word.

These tricks don't work for XNS checksums. Our experience is
that this difference alone makes our XNS implementation a little
slower than our TCP implementation on a 68000.
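
To see why, here is a rough sketch of an XNS-style checksum as
I understand it from the XNS internet packet definition: a 16-bit
one's complement add followed by a left rotate after every word.
The function name is made up, and the final "minus zero becomes
zero" step is from memory, so treat the details as assumptions.
The point is that the rotate sits between the adds, so the
carries cannot be saved up in a wide accumulator and folded in
at the end, and the result depends on the order of the words.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical name; n is the count of 16-bit words at p. */
static uint16_t xns_checksum_sketch(const uint16_t *p, size_t n)
{
    uint32_t sum = 0;

    while (n-- > 0) {
        sum += *p++;
        /* The end-around carry has to be applied right away ... */
        sum = (sum & 0xffff) + (sum >> 16);
        /* ... because the 16-bit sum is rotated left one bit
         * before the next word is added. */
        sum = ((sum << 1) | (sum >> 15)) & 0xffff;
    }

    if (sum == 0xffff)      /* assumed: minus zero is sent as zero */
        sum = 0;
    return (uint16_t)sum;
}
```

With the words 0001 0002 this yields 0008; with the same words
swapped it yields 000a, which a plain one's complement sum would
never do.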

- Geof

checksum(p, n)
    unsigned short *p;
    short n;
{
    short nloop;
    short nrem;
    unsigned long sum;

    sum = 0;
    if ( n > 0 ) {
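        /* n counts 16-bit words.  The first pass through the loop
           is entered part-way to add the nrem odd words; every
           later pass adds a full eight. */
        nloop = (n >> 3) + 1;
        nrem = n & 7;

        switch ( nrem ) {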

            do {
                    sum += *p++;
                case 7:
                    sum += *p++;
                case 6:
                    sum += *p++;
                case 5:
                    sum += *p++;
                case 4:
                    sum += *p++;
                case 3:
                    sum += *p++;
                case 2:
                    sum += *p++;
                case 1:
                    sum += *p++;
                case 0:
                    ;   /* a label must be followed by a statement */
            } while ( --nloop > 0 );
        }
    }

    sum = (sum >> 16) + (sum & 0xffff);
    sum = (sum >> 16) + (sum & 0xffff);

    return ( sum );
}
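
For compilers that reject the switch-into-loop construct, a
portable form might look something like the sketch below. It is
not a drop-in replacement from the original message: the name
checksum_portable and the fixed-width types are assumptions, and
it assumes no more than 65536 words so that the 32-bit sum
cannot overflow.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical name; n is the count of 16-bit words at p. */
static uint16_t checksum_portable(const uint16_t *p, size_t n)
{
    uint32_t sum = 0;

    /* Main loop, unrolled eight words per pass. */
    while (n >= 8) {
        sum += p[0];
        sum += p[1];
        sum += p[2];
        sum += p[3];
        sum += p[4];
        sum += p[5];
        sum += p[6];
        sum += p[7];
        p += 8;
        n -= 8;
    }

    /* Up to seven leftover words. */
    while (n > 0) {
        sum += *p++;
        n--;
    }

    /* Add the saved-up carries back in.  The first fold can itself
     * carry into bit 16 (0x0001ffff becomes 0x10000), so fold twice. */
    sum = (sum >> 16) + (sum & 0xffff);
    sum = (sum >> 16) + (sum & 0xffff);
    return (uint16_t)sum;
}
```

Like the routine above, this returns the one's complement sum
itself; the caller still has to complement it before storing it
in the TCP checksum field.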


