Weird networking MTU issue
Given: a simple, switched LAN with Debian Linux (sid, iputils-ping) and Microsoft Windows XP (Service Pack 2, with the Windows Firewall disabled). A user complains that the network is slow (meaning that his Windows notebook is). I quickly find out that I can ping the box from my Linux notebook if my -s parameter is < 1393 or > 1472. If it's between 1393 and 1472, no replies are received.
I spend the next hour debugging this not-so-interesting phenomenon.
My Windows colleagues, when consulted, quickly conclude that "it must be a Linux issue", since everything is fine if it's a Windows box issuing the ICMP echo request.
Unfortunately, this issue cannot be reproduced with most of the test Windows boxes we have available. After finally finding an available test box which shows the issue, I install Wireshark on the Windows box, which is a rather pleasing experience since Wireshark has a clickable .exe installer that also takes care of installing the WinPcap library. This is a lot less painful than it was with Ethereal two years ago.
Looking at the packet trace, the issue is quickly explained.
- iputils-ping sends out the ICMP echo request packets with the "don't fragment" bit set.
- Windows' ping allows fragmentation by leaving the "don't fragment" bit unset.
- If the echo request carries fewer than 1393 data bytes, everything is fine. The Windows box answers as it is supposed to. It doesn't matter whether the request is from Linux or from Windows since fragmentation does not play a role here.
- If the echo request carries more than 1472 data bytes, the request needs to be fragmented as the Linux box's MTU is 1500. In this case, iputils-ping - of course - does not set the DF bit, and the Windows box sends back a properly fragmented echo reply. The same thing happens - of course - with the Windows-generated ping since Windows' ping never sets DF.
- Now, if the Linux-generated echo request (thus having DF set) is within the size range mentioned above, Windows does not answer at all.
- If the Windows-generated echo request (not having DF set) is within the appropriate size range, the request fits in a single frame. Windows fragments the reply with the first fragment being 1434 bytes long.
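The size range in the observations above can be mapped with a small sweep. The following is only a sketch: it assumes iputils-ping on the sending side (its `-M do` option sets DF, `-M dont` leaves it unset), and the helper names `ping_argv`, `probe`, `sweep` and `silent_ranges` are made up for illustration.

```python
import subprocess


def ping_argv(host, size, df=True, count=1, timeout_s=2):
    """Build an iputils-ping command line for a single probe.

    -M do   sets the DF bit, -M dont leaves it unset.
    -s      is the number of ICMP data bytes, as in the post.
    """
    return [
        "ping",
        "-c", str(count),
        "-W", str(timeout_s),
        "-M", "do" if df else "dont",
        "-s", str(size),
        host,
    ]


def probe(host, size, df=True):
    """Return True if ping reports at least one reply (exit status 0)."""
    result = subprocess.run(
        ping_argv(host, size, df),
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def sweep(host, sizes, df=True):
    """Probe every size and return a list of (size, replied) pairs."""
    return [(size, probe(host, size, df)) for size in sizes]


def silent_ranges(results):
    """Collapse (size, replied) pairs into inclusive ranges without replies."""
    ranges = []
    start = prev = None
    for size, replied in results:
        if not replied:
            if start is None:
                start = size
            prev = size
        elif start is not None:
            ranges.append((start, prev))
            start = None
    if start is not None:
        ranges.append((start, prev))
    return ranges
```

A call such as `silent_ranges(sweep("192.0.2.10", range(1380, 1473), df=True))` would list the sizes that get no answer; the address is a placeholder. With DF forced on, sizes above 1472 fail locally on a sender with an MTU of 1500, so the sweep stops there.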
I now hypothesize that some of our Windows boxes operate with an MTU of 1434. Such a box is not supposed to answer the Linux-generated echo request (with DF set) if it cannot send back the answer unfragmented. It is, however, IMO, supposed to send back a "destination unreachable, fragmentation needed but DF set" ICMP error packet. It doesn't do that, which causes the behavior observed.
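As a rough sanity check on these numbers, the header arithmetic can be written down. This is a sketch under two assumptions: the IPv4 header carries no options (20 bytes), and the 1434 bytes seen in the trace may be the length of the whole Ethernet frame, including the 14-byte Ethernet header. The post does not say which length Wireshark reported, so the second reading is a guess.

```python
IP_HEADER = 20        # IPv4 header without options (assumption)
ICMP_HEADER = 8       # ICMP echo header
ETHERNET_HEADER = 14  # dst MAC + src MAC + EtherType, no VLAN tag, no FCS


def max_unfragmented_payload(mtu):
    """Largest ping -s value that still fits into one IP packet."""
    return mtu - IP_HEADER - ICMP_HEADER


def mtu_for_payload(payload):
    """Smallest MTU that carries `payload` ICMP data bytes unfragmented."""
    return payload + ICMP_HEADER + IP_HEADER


# Upper boundary: the Linux box's MTU of 1500 gives the 1472 limit.
print(max_unfragmented_payload(1500))            # 1472

# Lower boundary: 1392 is the largest payload that is still answered.
print(mtu_for_payload(1392))                     # 1420
print(mtu_for_payload(1392) + ETHERNET_HEADER)   # 1434

# If 1434 were the IP MTU itself, the boundary would sit elsewhere.
print(max_unfragmented_payload(1434))            # 1406
```

If the second assumption holds, the observed boundary of 1392 data bytes would match an IP MTU of 1420 rather than 1434. That is only one possible reading of the trace.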
The Windows boxes in question do not have an "MTU" option in the advanced settings of the network interface as other boxes have, and the appropriate Registry key setting the MTU is missing in the adapter settings.
I am now wondering whether my reasoning is wrong, or Windows is indeed buggy. Or misconfigured.
Any ideas?
 
            
Comments
luke on :
"Same thing happens - of course - with the Windows-generated ping since Windows’ ping never sets DF." Only if you don't set it intentionally.
ping -l 1450 -f
should do the trick on at least newer Windows-boxes, so you could test that.
I am not aware of Windows 2000 or XP setting a smaller MTU on Ethernet connections. They do so on PPPoE connections (1480). On PPTP connections, the MTU should be below 1440, which is very close to your reading. So: are the Windows boxes on a RAS server, and is the RAS server configured with the Windows Firewall omitting ICMP?
Marc 'Zugschlus' Haber on :
No, the Windows boxes are plain clients that have never been on RAS, and the network is not firewalled at all in the test setup.
And Wireshark on the pinged host showed the request coming in and nothing going out.