Re: Protocol Development on SUN 2 and 3 computers.


Keith Lantz (lantz@gregorio.stanford.edu)
16 Dec 1986 2058-PST (Tuesday)


Folks might also be interested to know that protocol development in
Berkeley UNIX has been rather easy for years at CMU and Stanford, who
jointly developed what is referred to as the "packet filter". A paper
on the packet filter, by Jeff Mogul, Mike Accetta, and Rick Rashid was
just presented at the Conference on Practical Software Development
Environments. Perhaps the first thing to know is that it provides for
application-level protocol development, rather than kernel hacking.
For example, that's how our ``UNIX server'' for the V-System is
implemented.

We have been beating on Berkeley for several years to include same with
the BSD distributions, with little success. Rumor has it that it IS
included in the 4.3 distribution, but as unsupported software. I am
not offering to support it myself, but if you're sufficiently
interested and vocal enough, who knows who might respond...

Keith

Following is the man page for the 4.3 version of the packet filter.
The 4.2 version differs somewhat.

ENET(4) UNIX Programmer's Manual ENET(4)

NAME
     enet - ethernet packet filter

SYNOPSIS
     pseudo-device enetfilter 64

DESCRIPTION
     The packet filter provides a raw interface to Ethernets and
     similar network data link layers. Packets received that are
     not used by the kernel (i.e., to support IP, ARP, and on
     some systems XNS, protocols) are available through this
     mechanism. The packet filter appears as a set of character
     special files, one per hardware interface. Each enet file
     may be opened multiple times, allowing each interface to be
     used by many processes. The total number of open ethernet
     files is limited to the value given in the kernel configura-
     tion; the example given in the SYNOPSIS above sets the limit
     to 64.

     The minor device numbers are associated with interfaces when
     the system is booted. Minor device 0 is associated with the
     first Ethernet interface ``attached'', minor device 1 with
     the second, and so forth. (These character special files
     are, for historical reasons, given the names /dev/enet0,
     /dev/eneta0, /dev/enetb0, etc.)

     Associated with each open instance of an enet file is a
     user-settable packet filter which is used to deliver incom-
     ing ethernet packets to the appropriate process. Whenever a
     packet is received from the net, successive packet filters
     from the list of filters for all open enet files are applied
     to the packet. When a filter accepts the packet, it is
     placed on the packet input queue of the associated file. If
     no filters accept the packet, it is discarded. The format
     of a packet filter is described below.

     Reads from these files return the next packet from a queue
     of packets that have matched the filter. If insufficient
     buffer space to store the entire packet is specified in the
     read, the packet will be truncated and the trailing contents
     lost. Writes to these devices transmit packets on the net-
     work, with each write generating exactly one packet.

     The packet filter currently supports a variety of different
     ``Ethernet'' data-link levels:

     3mb Ethernet packets consist of 4 or more bytes with the
                    first byte specifying the source ethernet
                    address, the second byte specifying the des-
                    tination ethernet address, and the next two
                    bytes specifying the packet type. (Actually,
                    on the network the source and destination
                    addresses are in the opposite order.)

Printed 9/6/86 8 October 1985 1

     byte-swapping 3mb Ethernet
                    packets consist of 4 or more bytes with the
                    first byte specifying the source ethernet
                    address, the second byte specifying the des-
                    tination ethernet address, and the next two
                    bytes specifying the packet type. Each short
                    word (pair of bytes) is swapped from the net-
                    work byte order; this device type is only
                    provided as a concession to backwards-
                    compatibility.

     10mb Ethernet packets consist of 14 or more bytes with the
                    first six bytes specifying the destination
                    ethernet address, the next six bytes the
                    source ethernet address, and the next two
                    bytes specifying the packet type.

     The remaining words are interpreted according to the packet
     type. Note that 16-bit and 32-bit quantities may have to be
     byteswapped (and possibly short-swapped) to be intelligible
     on a Vax.
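
     As a rough illustration of the 10mb layout described above, the
     following sketch splits a raw packet into its header fields. The
     names ex_ether_hdr and ex_parse_10mb are hypothetical and are not
     part of <sys/enet.h>. The type field is assembled byte by byte,
     so the result does not depend on the byte order of the host.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical names for this sketch; not taken from <sys/enet.h>. */
#define EX_ADDR_LEN 6
#define EX_HDR_LEN  14

struct ex_ether_hdr {
    uint8_t  dst[EX_ADDR_LEN];  /* bytes 0-5: destination address */
    uint8_t  src[EX_ADDR_LEN];  /* bytes 6-11: source address */
    uint16_t type;              /* bytes 12-13: packet type, host order */
};

/* Returns 0 on success, -1 if the buffer is shorter than one header. */
int ex_parse_10mb(const uint8_t *pkt, size_t len, struct ex_ether_hdr *out)
{
    if (len < EX_HDR_LEN)
        return -1;
    memcpy(out->dst, pkt, EX_ADDR_LEN);
    memcpy(out->src, pkt + EX_ADDR_LEN, EX_ADDR_LEN);
    /* The first byte on the wire is the high-order byte of the type. */
    out->type = (uint16_t)((pkt[12] << 8) | pkt[13]);
    return 0;
}
```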

     The packet filter mechanism does not know anything about the
     data portion of the packets it sends and receives. The user
     must supply the headers for transmitted packets (although
     the system makes sure that the source address is correct)
     and the headers of received packets are delivered to the
     user. The packet filters treat the entire packet, including
     headers, as uninterpreted data.

IOCTL CALLS
     In addition to FIONREAD, ten special ioctl calls may be
     applied to an open enet file. The first two set and fetch
     parameters for the file and are of the form:

          #include <sys/types.h>
          #include <sys/enet.h>
          ioctl(fildes, code, param)
          struct eniocb *param;

     where param is defined in <sys/enet.h> as:

          struct eniocb
          {
                 u_char en_addr;
                 u_char en_maxfilters;
                 u_char en_maxwaiting;
                 u_char en_maxpriority;
                 long en_rtout;
          };

     with the applicable codes being:

     EIOCGETP
          Fetch the parameters for this file.

     EIOCSETP
          Set the parameters for this file.

     The maximum filter length parameter en_maxfilters indicates
     the maximum possible packet filter command list length (see
     EIOCSETF below). The maximum input wait queue size parameter
     en_maxwaiting indicates the maximum number of packets
     which may be queued for an ethernet file at one time (see
     EIOCSETW below). The maximum priority parameter
     en_maxpriority indicates the highest filter priority which
     may be set for the file (see EIOCSETF below). The en_addr
     field is no longer maintained by the driver; see EIOCDEVP
     below.

     The read timeout parameter en_rtout specifies the number of
     clock ticks to wait before timing out on a read request and
     returning an EOF. This parameter is initialized to zero by
     open(2), indicating no timeout. If it is negative, then read
     requests will return an EOF immediately if there are no
     packets in the input queue. (Note that all parameters
     except for the read timeout are read-only and are ignored
     when changed.)

     A different ioctl is used to get device parameters of the
     ethernet underlying the minor device. It is of the form:

          #include <sys/types.h>
          #include <sys/enet.h>
          ioctl(fildes, EIOCDEVP, param)

     where param is defined in <sys/enet.h> as:

          struct endevp {
                 u_char end_dev_type;
                 u_char end_addr_len;
                 u_short end_hdr_len;
                 u_short end_MTU;
                 u_char end_addr[EN_MAX_ADDR_LEN];
                 u_char end_broadaddr[EN_MAX_ADDR_LEN];
          };

     The fields are:

     end_dev_type   Specifies the device type; currently one of
                    ENDT_3MB, ENDT_BS3MB or ENDT_10MB.

     end_addr_len   Specifies the address length in bytes (e.g.,
                    1 or 6).

     end_hdr_len    Specifies the total header length in bytes
                    (e.g., 4 or 14).

     end_MTU        Specifies the maximum packet size, including
                    header, in bytes.

     end_addr       The address of this interface; aligned so
                    that the low order byte of the address is the
                    first byte in the array.

     end_broadaddr  The hardware destination address for
                    broadcasts on this network.

     The next two calls enable and disable the input packet sig-
     nal mechanism for the file and are of the form:

          #include <sys/types.h>
          #include <sys/enet.h>
          ioctl(fildes, code, signp)
          u_int *signp;

     where signp is a pointer to a word containing the number of
     the signal to be sent when an input packet arrives and with
     the applicable codes being:

     EIOCENBS
          Enable the specified signal when an input packet is
          received for this file. If the ENHOLDSIG flag (see
          EIOCMBIS below) is not set, further signals are
          automatically disabled whenever a signal is sent to
          prevent nesting and hence must be specifically re-
          enabled after processing. When a signal number of 0 is
          supplied, this call is equivalent to EIOCINHS.

     EIOCINHS
          Disable any signal when an input packet is received for
          this file (the signp parameter is ignored). This is
          the default when the file is first opened.

     The next two calls set and clear ``mode bits'' for the
     file and are of the form:

          #include <sys/types.h>
          #include <sys/enet.h>
          ioctl(fildes, code, bits)
          u_short *bits;

     where bits is a short word bit-mask specifying which bits to
     set or clear. Currently, the only bit mask recognized is
     ENHOLDSIG, which (if clear) means that the driver should
     disable the effect of EIOCENBS once it has delivered a
     signal. Setting this bit means that you need use EIOCENBS only
     once. (For historical reasons, the default is that
     ENHOLDSIG is set.) The applicable codes are:

     EIOCMBIS
          Sets the specified mode bits

     EIOCMBIC
          Clears the specified mode bits

     Another ioctl call is used to set the maximum size of the
     packet input queue for an open enet file. It is of the
     form:

          #include <sys/types.h>
          #include <sys/enet.h>
          ioctl(fildes, EIOCSETW, maxwaitingp)
          u_int *maxwaitingp;

     where maxwaitingp is a pointer to a word containing the
     input queue size to be set. If this is greater than maximum
     allowable size (see EIOCGETP above), it is set to the max-
     imum, and if it is zero, it is set to a default value.

     Another ioctl call flushes the queue of incoming packets.
     It is of the form:

          #include <sys/types.h>
          #include <sys/enet.h>
          ioctl(fildes, EIOCFLUSH, 0)

     The final ioctl call is used to set the packet filter for an
     open enet file. It is of the form:

          #include <sys/types.h>
          #include <sys/enet.h>
          ioctl(fildes, EIOCSETF, filter)
          struct enfilter *filter;

     where enfilter is defined in <sys/enet.h> as:

          struct enfilter
          {
                 u_char enf_Priority;
                 u_char enf_FilterLen;
                 u_short enf_Filter[ENMAXFILTERS];
          };

     A packet filter consists of a priority, the filter command
     list length (in shortwords), and the filter command list
     itself. Each filter command list specifies a sequence of
     actions which operate on an internal stack. Each shortword
     of the command list specifies an action from the set {
     ENF_PUSHLIT, ENF_PUSHZERO, ENF_PUSHWORD+N } which respec-
     tively push the next shortword of the command list, zero, or
     shortword N of the incoming packet on the stack, and a
     binary operator from the set { ENF_EQ, ENF_NEQ, ENF_LT,
     ENF_LE, ENF_GT, ENF_GE, ENF_AND, ENF_OR, ENF_XOR } which
     then operates on the top two elements of the stack and
     replaces them with its result. When both an action and
     operator are specified in the same shortword, the action is
     performed followed by the operation.

     The binary operator can also be from the set { ENF_COR,
     ENF_CAND, ENF_CNOR, ENF_CNAND }. These are ``short-
     circuit'' operators, in that they terminate the execution of
     the filter immediately if the condition they are checking
     for is found, and continue otherwise. All pop two elements
     from the stack and compare them for equality; ENF_CAND
     returns false if the result is false; ENF_COR returns true
     if the result is true; ENF_CNAND returns true if the result
     is false; ENF_CNOR returns false if the result is true.
     Unlike the other binary operators, these four do not leave a
     result on the stack, even if they continue.

     The short-circuit operators should be used when possible, to
     reduce the amount of time spent evaluating filters. When
     they are used, you should also arrange the order of the
     tests so that the filter will succeed or fail as soon as
     possible; for example, checking the Socket field of a Pup
     packet is more likely to indicate failure than the packet
     type field.

     The special action ENF_NOPUSH and the special operator
     ENF_NOP can be used to only perform the binary operation or
     to only push a value on the stack. Since both are (con-
     veniently) defined to be zero, indicating only an action
     actually specifies the action followed by ENF_NOP, and indi-
     cating only an operation actually specifies ENF_NOPUSH fol-
     lowed by the operation.

     After executing the filter command list, a non-zero value
     (true) left on top of the stack (or an empty stack) causes
     the incoming packet to be accepted for the corresponding
     enet file and a zero value (false) causes the packet to be
     passed through the next packet filter. (If the filter exits
     as the result of a short-circuit operator, the top-of-stack
     value is ignored.) Specifying an undefined operation or
     action in the command list or performing an illegal opera-
     tion or action (such as pushing a shortword offset past the
     end of the packet or executing a binary operator with fewer
     than two shortwords on the stack) causes a filter to reject
     the packet.
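
     The evaluation rules above can be summarized by a small
     interpreter. This is only a sketch of the semantics as described
     in this manual page, not the driver's code. The X_ constants, the
     split of a command word into a 10-bit action and an operator in
     the remaining bits, and the stack depth are assumptions made for
     the example and are not copied from <sys/enet.h>. The operand
     order of the relational operators is inferred from the comment
     "PupType > 0" in the filter examples.

```c
#include <stddef.h>
#include <stdint.h>

/* Illustrative encoding only: these values are assumptions made for
 * this sketch and are NOT copied from <sys/enet.h>. */
enum { X_NOPUSH = 0, X_PUSHLIT = 1, X_PUSHZERO = 2, X_PUSHWORD = 16 };
enum { X_NOP, X_EQ, X_NEQ, X_LT, X_LE, X_GT, X_GE, X_AND, X_OR, X_XOR,
       X_COR, X_CAND, X_CNOR, X_CNAND };
#define X_OPSHIFT 10
#define X_OP(n)   ((uint16_t)((n) << X_OPSHIFT))
#define X_STACK   32

/* Returns 1 if the filter accepts the packet, 0 if it rejects it. */
int x_filter(const uint16_t *cmd, size_t ncmd,
             const uint16_t *pkt, size_t nwords)
{
    uint16_t stack[X_STACK];
    size_t sp = 0, pc = 0;

    while (pc < ncmd) {
        uint16_t w = cmd[pc++];
        unsigned act = w & ((1u << X_OPSHIFT) - 1);
        unsigned op = (unsigned)w >> X_OPSHIFT;

        /* The action is performed first. */
        if (act != X_NOPUSH) {
            uint16_t v;
            if (act == X_PUSHLIT) {
                if (pc >= ncmd) return 0;    /* literal is missing */
                v = cmd[pc++];
            } else if (act == X_PUSHZERO) {
                v = 0;
            } else if (act >= X_PUSHWORD) {
                size_t n = act - X_PUSHWORD;
                if (n >= nwords) return 0;   /* past end of packet */
                v = pkt[n];
            } else {
                return 0;                    /* undefined action */
            }
            if (sp == X_STACK) return 0;     /* overflow: assumed to reject */
            stack[sp++] = v;
        }

        /* Then the operator, if any. */
        if (op == X_NOP) continue;
        if (op > (unsigned)X_CNAND || sp < 2) return 0;

        uint16_t b = stack[--sp];            /* top of stack */
        uint16_t a = stack[--sp];            /* element beneath it */
        uint16_t r;
        switch (op) {
        case X_EQ:  r = (a == b); break;
        case X_NEQ: r = (a != b); break;
        case X_LT:  r = (a <  b); break;
        case X_LE:  r = (a <= b); break;
        case X_GT:  r = (a >  b); break;
        case X_GE:  r = (a >= b); break;
        case X_AND: r = a & b;    break;
        case X_OR:  r = a | b;    break;
        case X_XOR: r = a ^ b;    break;
        /* Short-circuit operators exit or continue; nothing is pushed. */
        case X_COR:  if (a == b) return 1; continue;
        case X_CAND: if (a != b) return 0; continue;
        case X_CNOR: if (a == b) return 0; continue;
        default:     if (a != b) return 1; continue;  /* X_CNAND */
        }
        stack[sp++] = r;
    }
    /* An empty stack or a non-zero top of stack accepts the packet. */
    return sp == 0 || stack[sp - 1] != 0;
}
```

     With this encoding, a filter written as { X_PUSHWORD + 1,
     X_PUSHLIT | X_OP(X_CAND), 2 } accepts exactly those packets whose
     shortword 1 equals 2, and a zero-length filter accepts every
     packet.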

     In an attempt to deal with the problem of overlapping and/or
     conflicting packet filters, the filters for each open enet
     file are ordered by the driver according to their priority
     (lowest priority is 0, highest is 255). When processing
     incoming ethernet packets, filters are applied according to
     their priority (from highest to lowest) and for identical
     priority values according to their relative ``busyness''
     (the filter that has previously matched the most packets is
     checked first) until one or more filters accept the packet
     or all filters reject it and it is discarded.

     Filters at a priority of 2 or higher are called "high prior-
     ity" filters. Once a packet is delivered to one of these
     "high priority" enet files, no further filters are examined,
     i.e. the packet is delivered only to the first enet file
     with a "high priority" filter which accepts the packet. A
     packet may be delivered to more than one filter with a
     priority below 2; this might be useful, for example, in
     building replicated programs. However, the use of low-
     priority filters imposes an additional cost on the system,
     as these filters each must be checked against all packets
     not accepted by a high-priority filter.
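
     The ordering and delivery rules can be sketched as follows. This
     is a simplified model, not the driver's implementation: the
     structure x_file, the function x_dispatch, and the accepts field
     (which stands in for the result of running a file's filter on the
     current packet) are all hypothetical. The sketch also re-sorts
     the list on every packet purely for brevity.

```c
#include <stdlib.h>

#define X_HIGH_PRIO 2   /* priorities at or above this are "high priority" */

/* Hypothetical model of one open enet file. */
struct x_file {
    unsigned char priority;   /* 0 (lowest) through 255 (highest) */
    unsigned long matched;    /* packets matched so far ("busyness") */
    int accepts;              /* stand-in for this filter's verdict */
    int delivered;            /* set by x_dispatch() */
};

/* Highest priority first; among equals, the busiest filter first. */
static int x_cmp(const void *pa, const void *pb)
{
    const struct x_file *a = pa, *b = pb;
    if (a->priority != b->priority)
        return a->priority > b->priority ? -1 : 1;
    if (a->matched != b->matched)
        return a->matched > b->matched ? -1 : 1;
    return 0;
}

/* Returns the number of files the packet was delivered to. */
int x_dispatch(struct x_file *files, size_t n)
{
    size_t i;
    int count = 0;

    qsort(files, n, sizeof *files, x_cmp);
    for (i = 0; i < n; i++)
        files[i].delivered = 0;

    for (i = 0; i < n; i++) {
        if (!files[i].accepts)
            continue;
        files[i].delivered = 1;
        files[i].matched++;
        count++;
        /* A high-priority match ends the search. */
        if (files[i].priority >= X_HIGH_PRIO)
            break;
    }
    return count;   /* 0 means the packet is discarded */
}
```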

     The packet filter for an enet file is initialized with
     length 0 at priority 0 by open(2), and hence by default
     accepts all packets which no "high priority" filter is
     interested in.

     Priorities should be assigned so that, in general, the more
     packets a filter is expected to match, the higher its prior-
     ity. This will prevent a lot of needless checking of pack-
     ets against filters that aren't likely to match them.

FILTER EXAMPLES
     The following filter would accept all incoming Pup packets
     on a 3mb ethernet with Pup types in the range 1-0100:

     struct enfilter f =
     {
         10, 19, /* priority and length */
         ENF_PUSHWORD+1, ENF_PUSHLIT, 2,
                 ENF_EQ, /* packet type == PUP */
         ENF_PUSHWORD+3, ENF_PUSHLIT,
                 0xFF00, ENF_AND, /* mask high byte */
         ENF_PUSHZERO, ENF_GT, /* PupType > 0 */
         ENF_PUSHWORD+3, ENF_PUSHLIT,
                 0xFF00, ENF_AND, /* mask high byte */
         ENF_PUSHLIT, 0100, ENF_LE, /* PupType <= 0100 */
         ENF_AND, /* 0 < PupType <= 0100 */
         ENF_AND /* && packet type == PUP */
     };

     Note that shortwords, such as the packet type field, are
     byte-swapped and so the literals you compare them to must be
     byte-swapped. Also, although for this example the word
     offsets are constants, code that must run with either 3mb or
     10mb ethernets must use offsets that depend on the device
     type.
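
     Two small helpers illustrate both points. The names x_swap16 and
     x_type_word are hypothetical. The second relies only on the
     layouts given in the DESCRIPTION, where the packet type occupies
     the last two bytes of the header, so its shortword index can be
     computed from the header length reported in end_hdr_len.

```c
#include <stdint.h>

/* Swap the two bytes of a shortword, e.g. to prepare a literal for
 * comparison against a packet field on a host such as the Vax. */
uint16_t x_swap16(uint16_t v)
{
    return (uint16_t)((v << 8) | (v >> 8));
}

/* Shortword index of the packet type field, given the header length
 * in bytes: word 1 for a 4-byte header, word 6 for a 14-byte header. */
unsigned x_type_word(unsigned hdr_len)
{
    return hdr_len / 2 - 1;
}
```

     A filter built at run time could then use ENF_PUSHWORD plus the
     value of x_type_word(end_hdr_len) in place of the constant
     ENF_PUSHWORD+1 used in the 3mb example.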

     By taking advantage of the ability to specify both an action
     and operation in each word of the command list, the filter
     could be abbreviated to:

     struct enfilter f =
     {
         10, 14, /* priority and length */
         ENF_PUSHWORD+1, ENF_PUSHLIT | ENF_EQ, 2, /* packet type == PUP */
         ENF_PUSHWORD+3, ENF_PUSHLIT | ENF_AND,
                 0xFF00, /* mask high byte */
         ENF_PUSHZERO | ENF_GT, /* PupType > 0 */
         ENF_PUSHWORD+3, ENF_PUSHLIT | ENF_AND,
                 0xFF00, /* mask high byte */
         ENF_PUSHLIT | ENF_LE, 0100, /* PupType <= 0100 */
         ENF_AND, /* 0 < PupType <= 0100 */
         ENF_AND /* && packet type == PUP */
     };

     A different example shows the use of "short-circuit" opera-
     tors to create a more efficient filter. This one accepts
     Pup packets (on a 3Mbit ethernet) with a Socket field of
     12345. Note that we check the Socket field before the
     packet type field, since in most packets the Socket is not
     likely to match.

     struct enfilter f =
     {
         10, 9, /* priority and length */
         ENF_PUSHWORD+7, ENF_PUSHLIT | ENF_CAND,
                 0, /* High word of socket */
         ENF_PUSHWORD+8, ENF_PUSHLIT | ENF_CAND,
                 12345, /* Low word of socket */
         ENF_PUSHWORD+1, ENF_PUSHLIT | ENF_CAND,
                 2 /* packet type == Pup */
     };

SEE ALSO
     de(4), ec(4), en(4), il(4), enstat(8)

FILES
     /dev/enet{,a,b,c,...}0

BUGS
     The current implementation can only filter on words within
     the first "mbuf" of the packet; this is around 100 bytes (or
     50 words).

     Because packets are streams of bytes, yet the filters
     operate on short words, and standard network byte order is
     usually opposite from Vax byte order, the relational opera-
     tors ENF_LT, ENF_LE, ENF_GT, and ENF_GE are not all that
     useful. Fortunately, they were not often used when the
     packets were treated as streams of shorts, so this is prob-
     ably not a severe problem. If this becomes a severe prob-
     lem, a byte-swapping operator could be added.

     Many of the "features" of this driver are there for histori-
     cal reasons; the manual page could be a lot cleaner if these
     were left out.

HISTORY


