Fix: Next-Hop MTU in ICMPv4 "Fragmentation Needed" Packets#12634
Fix: Next-Hop MTU in ICMPv4 "Fragmentation Needed" Packets#12634copybara-service[bot] merged 1 commit intogoogle:masterfrom
Conversation
|
Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). View this failed invocation of the CLA check for more information. For the most up to date status, view the checks section at the bottom of the pull request. |
|
Please squash your commits: https://github.com/google/gvisor/blob/master/CONTRIBUTING.md#code-reviews |
|
Hey, done Squashed the commits. Please take a look. |
manninglucas
left a comment
There was a problem hiding this comment.
Thanks for the change. e.MTU() returns the network MTU, but according to your change description you want to set the ICMP message value to the link MTU.
See the comment on NetworkEndpoint.MTU() in //pkg/tcpip/stack/registration.go:
// MTU is the maximum transmission unit for this endpoint. This is
// generally calculated as the MTU of the underlying data link endpoint
// minus the network endpoint max header length.
Oh yes, thanks for pointing this out. Updated to use |
Fixes #12568 ### Problem - When gVisor sends an ICMPv4 “Fragmentation Needed” message (Type 3, Code 4), it was leaving the Next-Hop MTU field as 0. - According to RFC 1191, this field must contain the MTU of the next-hop link. Sending 0 is incorrect. ### Impact - When a Linux host receives a Fragmentation Needed message with MTU set to 0, it falls back to an old default value from the RFC plateau table. In practice, this results in the system caching an MTU of 552 bytes. - That causes connections to containers behind gVisor to degrade significantly, especially for larger packets. It can also create long-lived MTU locks in the routing cache, which impacts performance to external services like GitHub. ### Root cause **In the IPv4 forwarding path**: If a packet: - Has the Don’t Fragment bit set - Exceeds the outgoing link MTU gVisor correctly generates an ICMP Destination Unreachable with Code 4. However, while constructing the ICMP header, it never sets the Next-Hop MTU field using icmpHdr.SetMTU(). As a result, the MTU remains 0. The method to set it already exists. It just wasn’t being used. ### What this change does 1. Introduces a new ICMP reason type that carries the next-hop MTU value. 2. Updates sendICMPv4 to: - Set the correct ICMP code for Fragmentation Needed. - Populate the MTU field with the actual link MTU. - Safely cap it to uint16 maximum. 3. Updates the forwarding path to pass the correct MTU when generating the ICMP error. ### Result Now, when a packet is too large and cannot be forwarded: - gVisor correctly includes the actual next-hop MTU in the ICMP response. - Hosts update their Path MTU properly. - No fallback to 552 bytes. - No unnecessary performance degradation. FUTURE_COPYBARA_INTEGRATE_REVIEW=#12634 from pawannn:master 2067f61 PiperOrigin-RevId: 891782710
Fixes #12568 ### Problem - When gVisor sends an ICMPv4 “Fragmentation Needed” message (Type 3, Code 4), it was leaving the Next-Hop MTU field as 0. - According to RFC 1191, this field must contain the MTU of the next-hop link. Sending 0 is incorrect. ### Impact - When a Linux host receives a Fragmentation Needed message with MTU set to 0, it falls back to an old default value from the RFC plateau table. In practice, this results in the system caching an MTU of 552 bytes. - That causes connections to containers behind gVisor to degrade significantly, especially for larger packets. It can also create long-lived MTU locks in the routing cache, which impacts performance to external services like GitHub. ### Root cause **In the IPv4 forwarding path**: If a packet: - Has the Don’t Fragment bit set - Exceeds the outgoing link MTU gVisor correctly generates an ICMP Destination Unreachable with Code 4. However, while constructing the ICMP header, it never sets the Next-Hop MTU field using icmpHdr.SetMTU(). As a result, the MTU remains 0. The method to set it already exists. It just wasn’t being used. ### What this change does 1. Introduces a new ICMP reason type that carries the next-hop MTU value. 2. Updates sendICMPv4 to: - Set the correct ICMP code for Fragmentation Needed. - Populate the MTU field with the actual link MTU. - Safely cap it to uint16 maximum. 3. Updates the forwarding path to pass the correct MTU when generating the ICMP error. ### Result Now, when a packet is too large and cannot be forwarded: - gVisor correctly includes the actual next-hop MTU in the ICMP response. - Hosts update their Path MTU properly. - No fallback to 552 bytes. - No unnecessary performance degradation. FUTURE_COPYBARA_INTEGRATE_REVIEW=#12634 from pawannn:master 2067f61 PiperOrigin-RevId: 892405056
Fixes #12568 ### Problem - When gVisor sends an ICMPv4 “Fragmentation Needed” message (Type 3, Code 4), it was leaving the Next-Hop MTU field as 0. - According to RFC 1191, this field must contain the MTU of the next-hop link. Sending 0 is incorrect. ### Impact - When a Linux host receives a Fragmentation Needed message with MTU set to 0, it falls back to an old default value from the RFC plateau table. In practice, this results in the system caching an MTU of 552 bytes. - That causes connections to containers behind gVisor to degrade significantly, especially for larger packets. It can also create long-lived MTU locks in the routing cache, which impacts performance to external services like GitHub. ### Root cause **In the IPv4 forwarding path**: If a packet: - Has the Don’t Fragment bit set - Exceeds the outgoing link MTU gVisor correctly generates an ICMP Destination Unreachable with Code 4. However, while constructing the ICMP header, it never sets the Next-Hop MTU field using icmpHdr.SetMTU(). As a result, the MTU remains 0. The method to set it already exists. It just wasn’t being used. ### What this change does 1. Introduces a new ICMP reason type that carries the next-hop MTU value. 2. Updates sendICMPv4 to: - Set the correct ICMP code for Fragmentation Needed. - Populate the MTU field with the actual link MTU. - Safely cap it to uint16 maximum. 3. Updates the forwarding path to pass the correct MTU when generating the ICMP error. ### Result Now, when a packet is too large and cannot be forwarded: - gVisor correctly includes the actual next-hop MTU in the ICMP response. - Hosts update their Path MTU properly. - No fallback to 552 bytes. - No unnecessary performance degradation. FUTURE_COPYBARA_INTEGRATE_REVIEW=#12634 from pawannn:master 2067f61 PiperOrigin-RevId: 891782710
Fixes #12568 ### Problem - When gVisor sends an ICMPv4 “Fragmentation Needed” message (Type 3, Code 4), it was leaving the Next-Hop MTU field as 0. - According to RFC 1191, this field must contain the MTU of the next-hop link. Sending 0 is incorrect. ### Impact - When a Linux host receives a Fragmentation Needed message with MTU set to 0, it falls back to an old default value from the RFC plateau table. In practice, this results in the system caching an MTU of 552 bytes. - That causes connections to containers behind gVisor to degrade significantly, especially for larger packets. It can also create long-lived MTU locks in the routing cache, which impacts performance to external services like GitHub. ### Root cause **In the IPv4 forwarding path**: If a packet: - Has the Don’t Fragment bit set - Exceeds the outgoing link MTU gVisor correctly generates an ICMP Destination Unreachable with Code 4. However, while constructing the ICMP header, it never sets the Next-Hop MTU field using icmpHdr.SetMTU(). As a result, the MTU remains 0. The method to set it already exists. It just wasn’t being used. ### What this change does 1. Introduces a new ICMP reason type that carries the next-hop MTU value. 2. Updates sendICMPv4 to: - Set the correct ICMP code for Fragmentation Needed. - Populate the MTU field with the actual link MTU. - Safely cap it to uint16 maximum. 3. Updates the forwarding path to pass the correct MTU when generating the ICMP error. ### Result Now, when a packet is too large and cannot be forwarded: - gVisor correctly includes the actual next-hop MTU in the ICMP response. - Hosts update their Path MTU properly. - No fallback to 552 bytes. - No unnecessary performance degradation. FUTURE_COPYBARA_INTEGRATE_REVIEW=#12634 from pawannn:master 2067f61 PiperOrigin-RevId: 891782710
Fixes #12568 ### Problem - When gVisor sends an ICMPv4 “Fragmentation Needed” message (Type 3, Code 4), it was leaving the Next-Hop MTU field as 0. - According to RFC 1191, this field must contain the MTU of the next-hop link. Sending 0 is incorrect. ### Impact - When a Linux host receives a Fragmentation Needed message with MTU set to 0, it falls back to an old default value from the RFC plateau table. In practice, this results in the system caching an MTU of 552 bytes. - That causes connections to containers behind gVisor to degrade significantly, especially for larger packets. It can also create long-lived MTU locks in the routing cache, which impacts performance to external services like GitHub. ### Root cause **In the IPv4 forwarding path**: If a packet: - Has the Don’t Fragment bit set - Exceeds the outgoing link MTU gVisor correctly generates an ICMP Destination Unreachable with Code 4. However, while constructing the ICMP header, it never sets the Next-Hop MTU field using icmpHdr.SetMTU(). As a result, the MTU remains 0. The method to set it already exists. It just wasn’t being used. ### What this change does 1. Introduces a new ICMP reason type that carries the next-hop MTU value. 2. Updates sendICMPv4 to: - Set the correct ICMP code for Fragmentation Needed. - Populate the MTU field with the actual link MTU. - Safely cap it to uint16 maximum. 3. Updates the forwarding path to pass the correct MTU when generating the ICMP error. ### Result Now, when a packet is too large and cannot be forwarded: - gVisor correctly includes the actual next-hop MTU in the ICMP response. - Hosts update their Path MTU properly. - No fallback to 552 bytes. - No unnecessary performance degradation. FUTURE_COPYBARA_INTEGRATE_REVIEW=#12634 from pawannn:master 8b1f272 PiperOrigin-RevId: 891782710
Fixes #12568 ### Problem - When gVisor sends an ICMPv4 “Fragmentation Needed” message (Type 3, Code 4), it was leaving the Next-Hop MTU field as 0. - According to RFC 1191, this field must contain the MTU of the next-hop link. Sending 0 is incorrect. ### Impact - When a Linux host receives a Fragmentation Needed message with MTU set to 0, it falls back to an old default value from the RFC plateau table. In practice, this results in the system caching an MTU of 552 bytes. - That causes connections to containers behind gVisor to degrade significantly, especially for larger packets. It can also create long-lived MTU locks in the routing cache, which impacts performance to external services like GitHub. ### Root cause **In the IPv4 forwarding path**: If a packet: - Has the Don’t Fragment bit set - Exceeds the outgoing link MTU gVisor correctly generates an ICMP Destination Unreachable with Code 4. However, while constructing the ICMP header, it never sets the Next-Hop MTU field using icmpHdr.SetMTU(). As a result, the MTU remains 0. The method to set it already exists. It just wasn’t being used. ### What this change does 1. Introduces a new ICMP reason type that carries the next-hop MTU value. 2. Updates sendICMPv4 to: - Set the correct ICMP code for Fragmentation Needed. - Populate the MTU field with the actual link MTU. - Safely cap it to uint16 maximum. 3. Updates the forwarding path to pass the correct MTU when generating the ICMP error. ### Result Now, when a packet is too large and cannot be forwarded: - gVisor correctly includes the actual next-hop MTU in the ICMP response. - Hosts update their Path MTU properly. - No fallback to 552 bytes. - No unnecessary performance degradation. FUTURE_COPYBARA_INTEGRATE_REVIEW=#12634 from pawannn:master 8b1f272 PiperOrigin-RevId: 891782710
Fixes #12568 ### Problem - When gVisor sends an ICMPv4 “Fragmentation Needed” message (Type 3, Code 4), it was leaving the Next-Hop MTU field as 0. - According to RFC 1191, this field must contain the MTU of the next-hop link. Sending 0 is incorrect. ### Impact - When a Linux host receives a Fragmentation Needed message with MTU set to 0, it falls back to an old default value from the RFC plateau table. In practice, this results in the system caching an MTU of 552 bytes. - That causes connections to containers behind gVisor to degrade significantly, especially for larger packets. It can also create long-lived MTU locks in the routing cache, which impacts performance to external services like GitHub. ### Root cause **In the IPv4 forwarding path**: If a packet: - Has the Don’t Fragment bit set - Exceeds the outgoing link MTU gVisor correctly generates an ICMP Destination Unreachable with Code 4. However, while constructing the ICMP header, it never sets the Next-Hop MTU field using icmpHdr.SetMTU(). As a result, the MTU remains 0. The method to set it already exists. It just wasn’t being used. ### What this change does 1. Introduces a new ICMP reason type that carries the next-hop MTU value. 2. Updates sendICMPv4 to: - Set the correct ICMP code for Fragmentation Needed. - Populate the MTU field with the actual link MTU. - Safely cap it to uint16 maximum. 3. Updates the forwarding path to pass the correct MTU when generating the ICMP error. ### Result Now, when a packet is too large and cannot be forwarded: - gVisor correctly includes the actual next-hop MTU in the ICMP response. - Hosts update their Path MTU properly. - No fallback to 552 bytes. - No unnecessary performance degradation. FUTURE_COPYBARA_INTEGRATE_REVIEW=#12634 from pawannn:master 8b1f272 PiperOrigin-RevId: 891782710
Fixes #12568 ### Problem - When gVisor sends an ICMPv4 “Fragmentation Needed” message (Type 3, Code 4), it was leaving the Next-Hop MTU field as 0. - According to RFC 1191, this field must contain the MTU of the next-hop link. Sending 0 is incorrect. ### Impact - When a Linux host receives a Fragmentation Needed message with MTU set to 0, it falls back to an old default value from the RFC plateau table. In practice, this results in the system caching an MTU of 552 bytes. - That causes connections to containers behind gVisor to degrade significantly, especially for larger packets. It can also create long-lived MTU locks in the routing cache, which impacts performance to external services like GitHub. ### Root cause **In the IPv4 forwarding path**: If a packet: - Has the Don’t Fragment bit set - Exceeds the outgoing link MTU gVisor correctly generates an ICMP Destination Unreachable with Code 4. However, while constructing the ICMP header, it never sets the Next-Hop MTU field using icmpHdr.SetMTU(). As a result, the MTU remains 0. The method to set it already exists. It just wasn’t being used. ### What this change does 1. Introduces a new ICMP reason type that carries the next-hop MTU value. 2. Updates sendICMPv4 to: - Set the correct ICMP code for Fragmentation Needed. - Populate the MTU field with the actual link MTU. - Safely cap it to uint16 maximum. 3. Updates the forwarding path to pass the correct MTU when generating the ICMP error. ### Result Now, when a packet is too large and cannot be forwarded: - gVisor correctly includes the actual next-hop MTU in the ICMP response. - Hosts update their Path MTU properly. - No fallback to 552 bytes. - No unnecessary performance degradation. FUTURE_COPYBARA_INTEGRATE_REVIEW=#12634 from pawannn:master 8b1f272 PiperOrigin-RevId: 891782710
PiperOrigin-RevId: 893208595
Fixes #12568
Problem
Impact
Root cause
In the IPv4 forwarding path:
If a packet:
gVisor correctly generates an ICMP Destination Unreachable with Code 4.
However, while constructing the ICMP header, it never sets the Next-Hop MTU field using icmpHdr.SetMTU(). As a result, the MTU remains 0.
The method to set it already exists. It just wasn’t being used.
What this change does
Result
Now, when a packet is too large and cannot be forwarded: