Somehow the tg3 driver is strange
A lot of the recent systems I have to work with have Tigon3 Ethernet interfaces, which behave strangely under Linux in settings that are non-trivial, networking-wise.
The Tigon3 interface by Broadcom is extremely popular for on-board Gigabit Ethernet. At least the hp DL140 server and the hp compaq nc8000 notebook have Tigon3 interfaces.
First: An hp DL140, functioning as a router between about 15 network segments, connected to two interfaces which introduce themselves as
0000:02:00.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5704 Gigabit Ethernet (rev 02).
OS is Debian sarge.
With a 2.4.30 kernel (8021q as a module), routing works fine. However, when I use a 2.6 kernel (8021q as a module as well), the tagged VLANs don't work. It looks like the tagged frames never reach the kernel: tcpdump sees ARP requests coming in, but no ARP replies going out. This is a bad bug and rules out the 2.6 kernel on that box. Good luck for the time when upstream drops 2.4 support. While we're talking about upstream: I have reported this on the LKML, and apart from a kind soul who copied the message to the Kernel Bugzilla, nothing has happened. Gee, thanks.
The other issue, which has bugged me for over a year now, is that the Tigon3 driver seems to drop information that tcpdump needs to properly display the VLAN tags when tcpdumping the physical interface. This bug can be reproduced on my notebook, an hp compaq nc8000 with a tg3 interface claiming that it is a
0000:02:0e.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5705M_2 Gigabit Ethernet (rev 03)
This box handles 802.1q frames just fine. In fact, if it didn't, I wouldn't be typing this, since I am currently on the nc8000, connected to a tagged VLAN for testing purposes. However, when I tcpdump the physical interface, I only see outgoing frames, and tcpdump -e doesn't give any hint that the frames are actually tagged frames which belong to a VLAN. Untagged frames display just fine. Identical behavior can be observed on the DL140.
A reference system (an older IBM Thinkpad T22 which has an E100 interface) behaves as expected: tcpdumping the physical interface shows incoming and outgoing frames, and the -e option on the tcpdump command line shows that the frame is indeed 802.1q tagged, and the VLAN ID is shown as well.
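For readers who want to see what tcpdump -e has to find in a frame in order to report the tag, here is a minimal Python sketch of the 802.1q header layout. It is an illustration only, not code from tcpdump or from any driver; the function name and the sample frames (made-up MAC addresses, all-zero payload) are my own assumptions.

```python
import struct

TPID_8021Q = 0x8100  # EtherType value that marks an 802.1q tagged frame


def parse_ethernet_header(frame: bytes):
    """Return (ethertype, vlan_id) for a raw Ethernet frame; vlan_id is None if untagged."""
    if len(frame) < 14:
        raise ValueError("frame too short for an Ethernet header")
    # Bytes 0-5: destination MAC, 6-11: source MAC, 12-13: EtherType or TPID
    (ethertype,) = struct.unpack("!H", frame[12:14])
    if ethertype != TPID_8021Q:
        return ethertype, None
    if len(frame) < 18:
        raise ValueError("frame too short for an 802.1q tag")
    # Tagged frame: 2 bytes TCI (3 bit priority, 1 bit DEI, 12 bit VLAN ID),
    # followed by the EtherType of the encapsulated payload
    tci, inner_ethertype = struct.unpack("!HH", frame[14:18])
    return inner_ethertype, tci & 0x0FFF


# Hypothetical sample frames: broadcast destination, made-up source MAC, empty ARP body
macs = bytes.fromhex("ffffffffffff" "020000000001")
untagged_arp = macs + struct.pack("!H", 0x0806) + bytes(28)
tagged_arp = macs + struct.pack("!HHH", TPID_8021Q, 42, 0x0806) + bytes(28)

print(parse_ethernet_header(untagged_arp))  # (2054, None)
print(parse_ethernet_header(tagged_arp))    # (2054, 42)
```

If the four tag bytes are missing from what the capture sees, a parser like this has no way to tell that the frame ever belonged to VLAN 42.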
While debugging this, I found Debian bug #277765, which I filed almost exactly a year ago when I first encountered this issue. After doing some more research (most prominently using the e100-based system as a reference), I now believe that this is a driver issue as well.
Can anybody shed some light on these issues? Which mailing list is the correct one to get answers and possibly a fix? LKML and the Kernel Bugzilla are quite obviously the wrong places.
Oh, btw, I find it annoying that tcpdump doesn't show the 802.1q VLAN tag unless you specify the -e switch, which displays unimportant information such as the MAC addresses as well. It is, IMO, a bad thing to show something - uncommented - as an IP datagram which is not an IP datagram. Debian bug #324706 has the official complaint.
Ross Reedstrom on :
Hmm, any change in negotiation with the switch? Could it be that the new drivers don't let the switch know that the tg3 is dot1q compliant, so it's stripping tags?
Marc 'Zugschlus' Haber on :
Usually, 802.1q VLANning is not negotiated at all, it's all manual configuration.
Marius M. on :
I was actually reading up on getting VLANs working on my servers' BCM5704 and found your blog entry. Do you know if these problems still persist? I'm using the very same ("tg3") driver and I'd like to avoid running into problems when switching the servers over to VLANs.
Thanks in advance!
Marc 'Zugschlus' Haber on :
The tg3 driver has always worked just fine with VLAN tags. The problem that I reported five years ago was that it's hard to debug, since the hardware strips off the VLAN tag of incoming frames and thus confuses wireshark and tcpdump. This "problem" still persists, but doesn't really hurt as long as you refrain from using untagged frames on a link.
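To illustrate why a capture cannot tell the difference once the tag is gone, here is a small Python simulation. It only mimics the effect of tag stripping on the frame bytes. It is not driver code, and the helper name and sample frames are assumptions made up for this example.

```python
import struct

TPID_8021Q = 0x8100  # EtherType value that marks an 802.1q tagged frame


def strip_vlan_tag(frame: bytes) -> bytes:
    """Remove the 4-byte 802.1q tag, if present, mimicking a tag-stripping NIC."""
    if len(frame) >= 18 and frame[12:14] == struct.pack("!H", TPID_8021Q):
        # Keep both MAC addresses (12 bytes), skip TPID and TCI (4 bytes)
        return frame[:12] + frame[16:]
    return frame


# Hypothetical sample frames: same MACs and payload, once untagged, once in VLAN 42
macs = bytes.fromhex("ffffffffffff" "020000000001")
payload = bytes(28)
untagged = macs + struct.pack("!H", 0x0806) + payload
tagged = macs + struct.pack("!HHH", TPID_8021Q, 42, 0x0806) + payload

print(strip_vlan_tag(tagged) == untagged)         # True
print(len(tagged) - len(strip_vlan_tag(tagged)))  # 4
```

After stripping, the tagged frame is byte-for-byte identical to the untagged one, which is why mixing tagged and untagged traffic on the same link makes the capture ambiguous.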