Pro/1000 (em) -current Sync

There are a number of new Pro/1000 family chips that are not supported by the 3.4-stable “em” driver. This patch brings the 3.4-stable driver up to -current. It is untested: I don't have a 3.4 box, so I don't even know whether it compiles, but if the resulting kernel does build it is very likely to work.

3.4-stable to -current “em” sync: em-sync_20040107.diff.gz

Pro/1000 (em) IP/UDP/TCP/VLAN Support

The Pro/1000 supports hardware IP, UDP, and TCP acceleration as well as hardware VLAN filtering. The first of these two diffs adds support for a generic hardware VLAN filtering framework. The second updates the “em” driver to take advantage of this framework as well as enabling IP/UDP/TCP acceleration.

This has not been extensively tested but is believed to work correctly on Pro/1000 MT cards. Cards that do not support these features will probably not like this patch very much (the driver should only try to use these features on cards that support them, but some of that code is still missing). Unfortunately, the per-packet setup code for the checksum support is rather nasty, so the actual performance impact may not be what one might hope for.

The VLAN filter is disabled when in IFF_PROMISC or IFF_ALLMULTI mode. In some applications it may be desirable to leave it enabled. Perhaps this behavior warrants a flag (e.g., link0).
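The promiscuous/allmulti rule, and the link0 override floated above, can be sketched as a small decision function. This is a userland illustration only, not code from the diff: the function name vlan_filter_wanted(), the honor_link0 switch, and the SK_-prefixed constants are made up for the example (the constants are local copies of the BSD <net/if.h> flag bits), and treating link0 as "keep the filter on" is just one possible reading of the idea.

```c
#include <assert.h>
#include <stdbool.h>

/* Local copies of the BSD <net/if.h> interface flag bits, so the sketch
 * builds outside a kernel tree. The SK_ prefix marks them as illustrative. */
#define SK_IFF_PROMISC  0x0100
#define SK_IFF_ALLMULTI 0x0200
#define SK_IFF_LINK0    0x1000

/*
 * Hypothetical helper: should the hardware VLAN filter be enabled for an
 * interface with these flags?  honor_link0 selects the proposed (not
 * implemented) policy where link0 keeps the filter on regardless.
 */
bool
vlan_filter_wanted(unsigned int if_flags, bool honor_link0)
{
	bool wide_open = (if_flags & (SK_IFF_PROMISC | SK_IFF_ALLMULTI)) != 0;

	if (!wide_open)
		return true;	/* normal mode: filter stays on */

	/* Promiscuous or allmulti: off, unless link0 is honored and set. */
	return honor_link0 && (if_flags & SK_IFF_LINK0) != 0;
}

/* A few cases, checked with assert(). */
void
vlan_filter_examples(void)
{
	assert(vlan_filter_wanted(0, false));
	assert(!vlan_filter_wanted(SK_IFF_PROMISC, false));
	assert(!vlan_filter_wanted(SK_IFF_ALLMULTI, false));
	assert(vlan_filter_wanted(SK_IFF_PROMISC | SK_IFF_LINK0, true));
}
```

With honor_link0 false the function reproduces the behavior described above; with it true, setting link0 on the interface would keep the filter enabled even in promiscuous mode.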

The generic bits: if_vlan_20030816.diff.gz

The “em” driver update: em_20030816.diff.gz

OpenBSD 3.3-stable users can try this diff (it includes all the other “em” updates in -current): em_stable_20030817.diff.gz

Routing Table Corruption (RNF_IGNORE)

There is a “feature” in OpenBSD that is meant to do something constructive when an interface is taken down (as in “ifconfig down”). Exactly what it is intended to do is unknown, but it is known that it can corrupt the routing table's radix tree. For some further discussion, see this and this thread.

The simplest solution is to get rid of the broken code: rnf_ignore_20030528.diff.gz

Note: The RNF_IGNORE patch has been checked in to the main tree (and made it into 3.4-release).

Routing Grief

A diff against -current for various routing issues (including the RNF_IGNORE purge and more than a few pieces of debug code) is here: net_20030816.diff.gz

There appear to be a number of races when adding/removing IP addresses as well as some sloppy rt_refcnt handling for routes. There are also some (untested) ICMP redirect fixes in there. There are a few other things I intend to look into before pulling out the debug code (e.g., for 10 points, tell me what happens to the rtentry (rt) in ip_output() after a goto bad:—more specifically, is it possible for IP_ROUTETOIF to be set and for a packet to go through the IPsec failure path around line 620?).

My firewall box (a Soekris net4501 running a 2.5MB installdisk-like bsd.gz) no longer boots with the default route showing a reference count of −475 or −476 (the original inspiration for these changes). In fact, I have not seen any negative reference counts in the routing tables of any of my boxes since I first applied these patches. It has also cured my logs of intermittent “rtfree: xxxx not freed (neg refs)” messages. However, I still don't understand the mechanism that caused this to happen so consistently on my firewall box; I will keep trying…
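For readers wondering how a route's reference count can go negative at all, here is a deliberately simplified userland model. It is not the kernel's rtfree(): struct mock_rtentry, mock_rtref(), and mock_rtfree() are invented for the illustration, and only the rt_refcnt field name and the wording of the warning are borrowed from the text above. The point is just that one release without a matching reference is enough.

```c
#include <assert.h>
#include <stdio.h>

/* Toy stand-in for a routing entry; only the counter matters here. */
struct mock_rtentry {
	int rt_refcnt;	/* number of outstanding references */
	int freed;	/* set once the last reference is dropped */
};

/* Take a reference. */
void
mock_rtref(struct mock_rtentry *rt)
{
	assert(rt != NULL);
	rt->rt_refcnt++;
}

/*
 * Drop a reference.
 * Returns 1 if this was the last reference (entry "freed"),
 * 0 if references remain, and -1 if the count went negative,
 * i.e. someone released a reference that was never taken.
 */
int
mock_rtfree(struct mock_rtentry *rt)
{
	assert(rt != NULL);
	rt->rt_refcnt--;
	if (rt->rt_refcnt < 0) {
		fprintf(stderr, "mock_rtfree: %p not freed (neg refs)\n",
		    (void *)rt);
		return -1;
	}
	if (rt->rt_refcnt == 0) {
		rt->freed = 1;
		return 1;
	}
	return 0;
}
```

One mock_rtref() followed by two mock_rtfree() calls leaves rt_refcnt at -1 and prints the warning on the second call. An error path that releases a route it never referenced has the same effect.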

Here's a subset of those diffs for what I think is the most serious issue, namely, the lack of splsoftnet()/splx() during address changes: in.diff.gz against -current. (I checked it in to -current on Oct. 3.) Compare to the checkin notice for version 1.24 of FreeBSD's in.c.
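The shape of the splsoftnet()/splx() bracket can be shown with a userland mock. Nothing below is taken from in.diff.gz: the mock_ functions only imitate raising and restoring an interrupt priority level with a plain integer, and the address list is a toy array. What the sketch tries to show is that the saved level is restored on every exit path, including the error path.

```c
#include <assert.h>

/* Mock priority levels; the real kernel levels are machine-dependent. */
#define MOCK_IPL_NONE    0
#define MOCK_IPL_SOFTNET 1

static int mock_cur_ipl = MOCK_IPL_NONE;

/* Raise to "softnet" level and return the previous level. */
int
mock_splsoftnet(void)
{
	int old = mock_cur_ipl;

	if (mock_cur_ipl < MOCK_IPL_SOFTNET)
		mock_cur_ipl = MOCK_IPL_SOFTNET;
	return old;
}

/* Restore a previously saved level. */
void
mock_splx(int s)
{
	mock_cur_ipl = s;
}

/* Accessor so callers can inspect the current mock level. */
int
mock_current_ipl(void)
{
	return mock_cur_ipl;
}

#define MOCK_MAX_ADDRS 4

/* Toy per-interface address list. */
struct mock_addr_list {
	unsigned int addrs[MOCK_MAX_ADDRS];
	int count;
};

/* Add an address with the list protected for the whole update. */
int
mock_add_addr(struct mock_addr_list *l, unsigned int addr)
{
	int s, error = 0;

	s = mock_splsoftnet();		/* block the network soft interrupt */
	assert(mock_cur_ipl >= MOCK_IPL_SOFTNET);

	if (l->count >= MOCK_MAX_ADDRS)
		error = -1;		/* list full: fall through to splx */
	else
		l->addrs[l->count++] = addr;

	mock_splx(s);			/* single exit: always restore */
	return error;
}
```

Funnelling both the success and failure cases through one mock_splx() call is a design choice in this sketch; the equivalent mistake in real code would be an early return that leaves the priority level raised.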

OpenBSD 3.1-stable users may want to apply these patches from -current: ip_input31.diff.gz and in31.diff2.gz. These patches should get rid of any “rtfree: xxxx not freed (neg refs)” messages and/or negative reference counts in the routing table. Note that I no longer have a 3.1-stable box set up.

For folks having IPv4 routing problems (after applying the ip_input.c and in.c patches), please update netstat to -current (see the archive page) and then supply the “netstat -rvAnf inet” output with any bug report. In the meantime, this fixes many problems: take note of the current routing table entries, reset the table with “route -n flush”, then add the static routes back. For some hint as to what is going wrong, try comparing the “netstat -rnvAf inet” output before and after this procedure. Look for things like interfaces or radix flags (the stuff between the “< >”s) changing. Another thing to try is to do a “route -n get <ip address>” to see if the routing table thinks a particular IP address should be reached in the way that you expect.

Aug 16
Rebuilt the diff against -current.
Mar 3, May 28 (2003)
Rebuilt the diff against -current.
Sept. 9 (2002)
It now compiles with ROUTE_DEBUG disabled and without DDB. IPv6 can be in the kernel, but has not been tested to work properly. Some more fixes have been added for possible rt leaks. Eventually, this will all be cleaned up and sorted into more appropriate patches. The extra RTFREE() stuff is no longer included as it is now in the -current tree.
Sept. 16
Minor tweakettes; I haven't had a chance to muck with it too much, but I did redo the diff against -current again (since there were a bunch of checkins to this part of the tree recently).
Sept. 25
gcc's __builtin_return_address() apparently doesn't like walking past the end of the stack. This is likely the cause of some panic()s. It is only a problem if a kernel was a) built with the “net*.diff.gz” patches and b) included the ROUTE_DEBUG option.
Oct. 3
The in.c race patch was checked in to -current.
Oct. 18
The patches themselves have not been changed since Sept. 25th. I've got some half-baked diagnostic/timing code all over my tree (trying to figure out where packets are spending their time as they wind through the stack and where they are dropped when the system is overloaded). Yes, that's what the Elan timer code is for.
Nov. 9
Someone emailed to complain that splassert() isn't happy in the in.c patch for OpenBSD 3.1-stable, so I took it out. (Hint to seekers-of-wisdom: those wishing to see replies are advised to include a working return email address.)
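On the Sept. 25 note about __builtin_return_address(): GCC documents that calling it with a nonzero level can have unpredictable effects, including crashing the program, while level 0 (the immediate caller) is the safe case. The sketch below is an assumption about what a safer debug hook might look like, not the ROUTE_DEBUG code itself; struct mock_rt_trace and mock_rt_note_caller() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical debug record: who touched the route last. */
struct mock_rt_trace {
	void *last_caller;
};

/*
 * Record only the immediate caller.  Level 0 is the one level that does
 * not require walking further up the stack; deeper levels are
 * deliberately not used here.
 */
__attribute__((noinline)) void
mock_rt_note_caller(struct mock_rt_trace *t)
{
	assert(t != NULL);
	t->last_caller = __builtin_return_address(0);
}
```

Recording a single level loses the deeper call chain, which is the trade-off for not walking frames that may not exist.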