MTU and TCP MSS when using PPPoE

I switched to Bell from Rogers about half a year ago. A goal I had was to remove their router and use my own EdgeRouter Pro. Once I got the PPPoE connection up I was able to ping the rest of the world but couldn't load most websites. Eventually I found I had to adjust the MTU and add MSS clamping to get everything to work. At the time just blindly used MTU and MSS clamp values I found online. They turned out to be correct but last night I decided to experiment and research to find the correct values I should be using.

Finding the MTU

First you should understand that almost all networking gear has their Maximum transmission unit set to 1500 bytes for each interface. The Ethernet header overhead (18 bytes1) is not included in this. This means that the payload inside the Ethernet frame can be at most 1500 bytes long.

What goes inside the payload of the frames depends on what you are doing. If you are pinging an IP, it would be a ICMP packet inside an IP packet so to figure out the largest ICMP packet size you can use, you subtract the size of the IP header (20 bytes2) and the ICMP header (8 bytes) from the MTU: 1500 - 20 - 8 = 1472.

Throw in some PPPoE

Now if you tried to ping with the Don't fragment (DF) flag set, a packet size of 1472 should work and a packet size of 1473 should not work. Like this (on Linux):

$ ping -M do -s 1473 8.8.8.8
PING 8.8.8.8 (8.8.8.8) 1473(1501) bytes of data.
ping: local error: Message too long, mtu=1500
ping: local error: Message too long, mtu=1500
ping: local error: Message too long, mtu=1500
ping: local error: Message too long, mtu=1500

$ ping -M do -s 1472 8.8.8.8
PING 8.8.8.8 (8.8.8.8) 1472(1500) bytes of data.
1480 bytes from 8.8.8.8: icmp_seq=1 ttl=51 time=1.27 ms
1480 bytes from 8.8.8.8: icmp_seq=2 ttl=51 time=24.3 ms
1480 bytes from 8.8.8.8: icmp_seq=3 ttl=51 time=1.31 ms
1480 bytes from 8.8.8.8: icmp_seq=4 ttl=51 time=1.77 ms

That is unless you're connecting over PPPoE. If you are using PPPoE you will find that your ping will fail with a packet size of 1472. This is because PPPoE has its own packet header of 8 bytes. If you subtract the PPPoE header from our previous value you will get the actual largest ICMP packet size: 1472 - 8 = 1464. Now you can try pinging with the new packet size, like this (on Mac):

$ ping -D -s 1465 8.8.8.8
PING 8.8.8.8 (8.8.8.8): 1465 data bytes
ping: sendto: Message too long
ping: sendto: Message too long
Request timeout for icmp_seq 0
ping: sendto: Message too long
Request timeout for icmp_seq 1
ping: sendto: Message too long
Request timeout for icmp_seq 2
ping: sendto: Message too long
Request timeout for icmp_seq 3

$ ping -D -s 1464 8.8.8.8
PING 8.8.8.8 (8.8.8.8): 1464 data bytes
1472 bytes from 8.8.8.8: icmp_seq=0 ttl=59 time=6.844 ms
1472 bytes from 8.8.8.8: icmp_seq=1 ttl=59 time=7.066 ms
1472 bytes from 8.8.8.8: icmp_seq=2 ttl=59 time=7.066 ms
1472 bytes from 8.8.8.8: icmp_seq=3 ttl=59 time=7.229 ms
1472 bytes from 8.8.8.8: icmp_seq=4 ttl=59 time=7.081 ms

What is MSS clamping?

Normally your computer will be able to determine a safe MTU using Path MTU Discovery (PMTUD) but this relies on your ISP actually sending back ICMP Too Big packets. Unfortunately Bell has decided (in their infinite wisdom) that this is not a good thing (probably under the guise of "security") so they leave you high and dry because your TCP connections may end up as "black hole connections"; this happens when the TCP handshake works but trying to send any data just gets dropped silently on their side.

The solution for this is called MSS clamping. You use your firewall to override the Maximum Segment Size (MSS) option on all TCP connections so they do not have issues with packets being too large. To figure out the MSS you want, you take the standard 1500 MTU and subtract the PPPoE header, the IP header, and the TCP header (20 bytes3): 1500 - 8 - 20 - 20 = 1452.

EdgeRouter

If you have an EdgeRouter, you'll want the following configuration options to set the MTU for your PPPoE connection and MSS clamping, where eth0 is the interface you are using and vif 35 is for VLAN 35.

set firewall options mss-clamp interface-type pppoe
set firewall options mss-clamp mss 1452
set interfaces ethernet eth0 vif 35 pppoe 0 mtu 1492

Conclusion

Blindly following values I found posted online worked but I wasn't satisfied. After some experimenting and reading Wikipedia, I now am confident in 1492 as the MTU and 1452 for the TCP MSS, and I understand why they work.

Notes:
  1. Ethernet frame headers start at 18 bytes long, grow to 22 bytes with VLAN tagging, and 26 bytes with Q-in-Q VLAN tagging.
  2. IP packet header start at 20 bytes long and can be up to 60 bytes if there are options specified; however, it is rarely used.
  3. Like IP, TCP packet headers start at 20 bytes long and can be up to 60 bytes if there are options.