Skip to content

Fix: Next-Hop MTU in ICMPv4 "Fragmentation Needed" Packets#12634

Merged
copybara-service[bot] merged 1 commit intogoogle:masterfrom
pawannn:master
Apr 2, 2026
Merged

Fix: Next-Hop MTU in ICMPv4 "Fragmentation Needed" Packets#12634
copybara-service[bot] merged 1 commit intogoogle:masterfrom
pawannn:master

Conversation

@pawannn
Copy link
Copy Markdown
Contributor

@pawannn pawannn commented Feb 22, 2026

Fixes #12568

Problem

  • When gVisor sends an ICMPv4 “Fragmentation Needed” message (Type 3, Code 4), it was leaving the Next-Hop MTU field as 0.
  • According to RFC 1191, this field must contain the MTU of the next-hop link. Sending 0 is incorrect.

Impact

  • When a Linux host receives a Fragmentation Needed message with MTU set to 0, it falls back to an old default value from the RFC plateau table. In practice, this results in the system caching an MTU of 552 bytes.
  • That causes connections to containers behind gVisor to degrade significantly, especially for larger packets. It can also create long-lived MTU locks in the routing cache, which impacts performance to external services like GitHub.

Root cause

In the IPv4 forwarding path:

If a packet:

  • Has the Don’t Fragment bit set
  • Exceeds the outgoing link MTU
    gVisor correctly generates an ICMP Destination Unreachable with Code 4.

However, while constructing the ICMP header, it never sets the Next-Hop MTU field using icmpHdr.SetMTU(). As a result, the MTU remains 0.
The method to set it already exists. It just wasn’t being used.

What this change does

  1. Introduces a new ICMP reason type that carries the next-hop MTU value.
  2. Updates sendICMPv4 to:
    • Set the correct ICMP code for Fragmentation Needed.
    • Populate the MTU field with the actual link MTU.
    • Safely cap it to uint16 maximum.
  3. Updates the forwarding path to pass the correct MTU when generating the ICMP error.

Result

Now, when a packet is too large and cannot be forwarded:

  • gVisor correctly includes the actual next-hop MTU in the ICMP response.
  • Hosts update their Path MTU properly.
  • No fallback to 552 bytes.
  • No unnecessary performance degradation.

@google-cla
Copy link
Copy Markdown

google-cla Bot commented Feb 22, 2026

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

@ayushr2
Copy link
Copy Markdown
Collaborator

ayushr2 commented Mar 30, 2026

@pawannn
Copy link
Copy Markdown
Contributor Author

pawannn commented Mar 30, 2026

Hey, done Squashed the commits. Please take a look.

Copy link
Copy Markdown
Contributor

@manninglucas manninglucas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the change. e.MTU() returns the network MTU, but according to your change description you want to set the ICMP message value to the link MTU.

See the comment on NetworkEndpoint.MTU() in //pkg/tcpip/stack/registration.go:

// MTU is the maximum transmission unit for this endpoint. This is
// generally calculated as the MTU of the underlying data link endpoint
// minus the network endpoint max header length.

@pawannn
Copy link
Copy Markdown
Contributor Author

pawannn commented Mar 30, 2026

Thanks for the change. e.MTU() returns the network MTU, but according to your change description you want to set the ICMP message value to the link MTU.

See the comment on NetworkEndpoint.MTU() in //pkg/tcpip/stack/registration.go:

// MTU is the maximum transmission unit for this endpoint. This is
// generally calculated as the MTU of the underlying data link endpoint
// minus the network endpoint max header length.

Oh yes, thanks for pointing this out. Updated to use e.nic.MTU() to get the actual link MTU.

@pawannn pawannn requested a review from manninglucas March 30, 2026 17:09
copybara-service Bot pushed a commit that referenced this pull request Mar 30, 2026
Fixes #12568

### Problem
- When gVisor sends an ICMPv4 “Fragmentation Needed” message (Type 3, Code 4), it was leaving the Next-Hop MTU field as 0.
- According to RFC 1191, this field must contain the MTU of the next-hop link. Sending 0 is incorrect.

### Impact
- When a Linux host receives a Fragmentation Needed message with MTU set to 0, it falls back to an old default value from the RFC plateau table. In practice, this results in the system caching an MTU of 552 bytes.
- That causes connections to containers behind gVisor to degrade significantly, especially for larger packets. It can also create long-lived MTU locks in the routing cache, which impacts performance to external services like GitHub.

### Root cause
**In the IPv4 forwarding path**:

If a packet:
- Has the Don’t Fragment bit set
- Exceeds the outgoing link MTU
gVisor correctly generates an ICMP Destination Unreachable with Code 4.

However, while constructing the ICMP header, it never sets the Next-Hop MTU field using icmpHdr.SetMTU(). As a result, the MTU remains 0.
The method to set it already exists. It just wasn’t being used.

### What this change does
1. Introduces a new ICMP reason type that carries the next-hop MTU value.
2. Updates sendICMPv4 to:
     - Set the correct ICMP code for Fragmentation Needed.
     - Populate the MTU field with the actual link MTU.
     - Safely cap it to uint16 maximum.
3. Updates the forwarding path to pass the correct MTU when generating the ICMP error.

### Result
Now, when a packet is too large and cannot be forwarded:
- gVisor correctly includes the actual next-hop MTU in the ICMP response.
- Hosts update their Path MTU properly.
- No fallback to 552 bytes.
- No unnecessary performance degradation.

FUTURE_COPYBARA_INTEGRATE_REVIEW=#12634 from pawannn:master 2067f61
PiperOrigin-RevId: 891782710
copybara-service Bot pushed a commit that referenced this pull request Mar 31, 2026
Fixes #12568

### Problem
- When gVisor sends an ICMPv4 “Fragmentation Needed” message (Type 3, Code 4), it was leaving the Next-Hop MTU field as 0.
- According to RFC 1191, this field must contain the MTU of the next-hop link. Sending 0 is incorrect.

### Impact
- When a Linux host receives a Fragmentation Needed message with MTU set to 0, it falls back to an old default value from the RFC plateau table. In practice, this results in the system caching an MTU of 552 bytes.
- That causes connections to containers behind gVisor to degrade significantly, especially for larger packets. It can also create long-lived MTU locks in the routing cache, which impacts performance to external services like GitHub.

### Root cause
**In the IPv4 forwarding path**:

If a packet:
- Has the Don’t Fragment bit set
- Exceeds the outgoing link MTU
gVisor correctly generates an ICMP Destination Unreachable with Code 4.

However, while constructing the ICMP header, it never sets the Next-Hop MTU field using icmpHdr.SetMTU(). As a result, the MTU remains 0.
The method to set it already exists. It just wasn’t being used.

### What this change does
1. Introduces a new ICMP reason type that carries the next-hop MTU value.
2. Updates sendICMPv4 to:
     - Set the correct ICMP code for Fragmentation Needed.
     - Populate the MTU field with the actual link MTU.
     - Safely cap it to uint16 maximum.
3. Updates the forwarding path to pass the correct MTU when generating the ICMP error.

### Result
Now, when a packet is too large and cannot be forwarded:
- gVisor correctly includes the actual next-hop MTU in the ICMP response.
- Hosts update their Path MTU properly.
- No fallback to 552 bytes.
- No unnecessary performance degradation.

FUTURE_COPYBARA_INTEGRATE_REVIEW=#12634 from pawannn:master 2067f61
PiperOrigin-RevId: 892405056
copybara-service Bot pushed a commit that referenced this pull request Mar 31, 2026
Fixes #12568

### Problem
- When gVisor sends an ICMPv4 “Fragmentation Needed” message (Type 3, Code 4), it was leaving the Next-Hop MTU field as 0.
- According to RFC 1191, this field must contain the MTU of the next-hop link. Sending 0 is incorrect.

### Impact
- When a Linux host receives a Fragmentation Needed message with MTU set to 0, it falls back to an old default value from the RFC plateau table. In practice, this results in the system caching an MTU of 552 bytes.
- That causes connections to containers behind gVisor to degrade significantly, especially for larger packets. It can also create long-lived MTU locks in the routing cache, which impacts performance to external services like GitHub.

### Root cause
**In the IPv4 forwarding path**:

If a packet:
- Has the Don’t Fragment bit set
- Exceeds the outgoing link MTU
gVisor correctly generates an ICMP Destination Unreachable with Code 4.

However, while constructing the ICMP header, it never sets the Next-Hop MTU field using icmpHdr.SetMTU(). As a result, the MTU remains 0.
The method to set it already exists. It just wasn’t being used.

### What this change does
1. Introduces a new ICMP reason type that carries the next-hop MTU value.
2. Updates sendICMPv4 to:
     - Set the correct ICMP code for Fragmentation Needed.
     - Populate the MTU field with the actual link MTU.
     - Safely cap it to uint16 maximum.
3. Updates the forwarding path to pass the correct MTU when generating the ICMP error.

### Result
Now, when a packet is too large and cannot be forwarded:
- gVisor correctly includes the actual next-hop MTU in the ICMP response.
- Hosts update their Path MTU properly.
- No fallback to 552 bytes.
- No unnecessary performance degradation.

FUTURE_COPYBARA_INTEGRATE_REVIEW=#12634 from pawannn:master 2067f61
PiperOrigin-RevId: 891782710
copybara-service Bot pushed a commit that referenced this pull request Mar 31, 2026
Fixes #12568

### Problem
- When gVisor sends an ICMPv4 “Fragmentation Needed” message (Type 3, Code 4), it was leaving the Next-Hop MTU field as 0.
- According to RFC 1191, this field must contain the MTU of the next-hop link. Sending 0 is incorrect.

### Impact
- When a Linux host receives a Fragmentation Needed message with MTU set to 0, it falls back to an old default value from the RFC plateau table. In practice, this results in the system caching an MTU of 552 bytes.
- That causes connections to containers behind gVisor to degrade significantly, especially for larger packets. It can also create long-lived MTU locks in the routing cache, which impacts performance to external services like GitHub.

### Root cause
**In the IPv4 forwarding path**:

If a packet:
- Has the Don’t Fragment bit set
- Exceeds the outgoing link MTU
gVisor correctly generates an ICMP Destination Unreachable with Code 4.

However, while constructing the ICMP header, it never sets the Next-Hop MTU field using icmpHdr.SetMTU(). As a result, the MTU remains 0.
The method to set it already exists. It just wasn’t being used.

### What this change does
1. Introduces a new ICMP reason type that carries the next-hop MTU value.
2. Updates sendICMPv4 to:
     - Set the correct ICMP code for Fragmentation Needed.
     - Populate the MTU field with the actual link MTU.
     - Safely cap it to uint16 maximum.
3. Updates the forwarding path to pass the correct MTU when generating the ICMP error.

### Result
Now, when a packet is too large and cannot be forwarded:
- gVisor correctly includes the actual next-hop MTU in the ICMP response.
- Hosts update their Path MTU properly.
- No fallback to 552 bytes.
- No unnecessary performance degradation.

FUTURE_COPYBARA_INTEGRATE_REVIEW=#12634 from pawannn:master 2067f61
PiperOrigin-RevId: 891782710
Comment thread pkg/tcpip/network/ipv4/ipv4.go Outdated
copybara-service Bot pushed a commit that referenced this pull request Apr 1, 2026
Fixes #12568

### Problem
- When gVisor sends an ICMPv4 “Fragmentation Needed” message (Type 3, Code 4), it was leaving the Next-Hop MTU field as 0.
- According to RFC 1191, this field must contain the MTU of the next-hop link. Sending 0 is incorrect.

### Impact
- When a Linux host receives a Fragmentation Needed message with MTU set to 0, it falls back to an old default value from the RFC plateau table. In practice, this results in the system caching an MTU of 552 bytes.
- That causes connections to containers behind gVisor to degrade significantly, especially for larger packets. It can also create long-lived MTU locks in the routing cache, which impacts performance to external services like GitHub.

### Root cause
**In the IPv4 forwarding path**:

If a packet:
- Has the Don’t Fragment bit set
- Exceeds the outgoing link MTU
gVisor correctly generates an ICMP Destination Unreachable with Code 4.

However, while constructing the ICMP header, it never sets the Next-Hop MTU field using icmpHdr.SetMTU(). As a result, the MTU remains 0.
The method to set it already exists. It just wasn’t being used.

### What this change does
1. Introduces a new ICMP reason type that carries the next-hop MTU value.
2. Updates sendICMPv4 to:
     - Set the correct ICMP code for Fragmentation Needed.
     - Populate the MTU field with the actual link MTU.
     - Safely cap it to uint16 maximum.
3. Updates the forwarding path to pass the correct MTU when generating the ICMP error.

### Result
Now, when a packet is too large and cannot be forwarded:
- gVisor correctly includes the actual next-hop MTU in the ICMP response.
- Hosts update their Path MTU properly.
- No fallback to 552 bytes.
- No unnecessary performance degradation.

FUTURE_COPYBARA_INTEGRATE_REVIEW=#12634 from pawannn:master 8b1f272
PiperOrigin-RevId: 891782710
copybara-service Bot pushed a commit that referenced this pull request Apr 1, 2026
Fixes #12568

### Problem
- When gVisor sends an ICMPv4 “Fragmentation Needed” message (Type 3, Code 4), it was leaving the Next-Hop MTU field as 0.
- According to RFC 1191, this field must contain the MTU of the next-hop link. Sending 0 is incorrect.

### Impact
- When a Linux host receives a Fragmentation Needed message with MTU set to 0, it falls back to an old default value from the RFC plateau table. In practice, this results in the system caching an MTU of 552 bytes.
- That causes connections to containers behind gVisor to degrade significantly, especially for larger packets. It can also create long-lived MTU locks in the routing cache, which impacts performance to external services like GitHub.

### Root cause
**In the IPv4 forwarding path**:

If a packet:
- Has the Don’t Fragment bit set
- Exceeds the outgoing link MTU
gVisor correctly generates an ICMP Destination Unreachable with Code 4.

However, while constructing the ICMP header, it never sets the Next-Hop MTU field using icmpHdr.SetMTU(). As a result, the MTU remains 0.
The method to set it already exists. It just wasn’t being used.

### What this change does
1. Introduces a new ICMP reason type that carries the next-hop MTU value.
2. Updates sendICMPv4 to:
     - Set the correct ICMP code for Fragmentation Needed.
     - Populate the MTU field with the actual link MTU.
     - Safely cap it to uint16 maximum.
3. Updates the forwarding path to pass the correct MTU when generating the ICMP error.

### Result
Now, when a packet is too large and cannot be forwarded:
- gVisor correctly includes the actual next-hop MTU in the ICMP response.
- Hosts update their Path MTU properly.
- No fallback to 552 bytes.
- No unnecessary performance degradation.

FUTURE_COPYBARA_INTEGRATE_REVIEW=#12634 from pawannn:master 8b1f272
PiperOrigin-RevId: 891782710
copybara-service Bot pushed a commit that referenced this pull request Apr 1, 2026
Fixes #12568

### Problem
- When gVisor sends an ICMPv4 “Fragmentation Needed” message (Type 3, Code 4), it was leaving the Next-Hop MTU field as 0.
- According to RFC 1191, this field must contain the MTU of the next-hop link. Sending 0 is incorrect.

### Impact
- When a Linux host receives a Fragmentation Needed message with MTU set to 0, it falls back to an old default value from the RFC plateau table. In practice, this results in the system caching an MTU of 552 bytes.
- That causes connections to containers behind gVisor to degrade significantly, especially for larger packets. It can also create long-lived MTU locks in the routing cache, which impacts performance to external services like GitHub.

### Root cause
**In the IPv4 forwarding path**:

If a packet:
- Has the Don’t Fragment bit set
- Exceeds the outgoing link MTU
gVisor correctly generates an ICMP Destination Unreachable with Code 4.

However, while constructing the ICMP header, it never sets the Next-Hop MTU field using icmpHdr.SetMTU(). As a result, the MTU remains 0.
The method to set it already exists. It just wasn’t being used.

### What this change does
1. Introduces a new ICMP reason type that carries the next-hop MTU value.
2. Updates sendICMPv4 to:
     - Set the correct ICMP code for Fragmentation Needed.
     - Populate the MTU field with the actual link MTU.
     - Safely cap it to uint16 maximum.
3. Updates the forwarding path to pass the correct MTU when generating the ICMP error.

### Result
Now, when a packet is too large and cannot be forwarded:
- gVisor correctly includes the actual next-hop MTU in the ICMP response.
- Hosts update their Path MTU properly.
- No fallback to 552 bytes.
- No unnecessary performance degradation.

FUTURE_COPYBARA_INTEGRATE_REVIEW=#12634 from pawannn:master 8b1f272
PiperOrigin-RevId: 891782710
copybara-service Bot pushed a commit that referenced this pull request Apr 2, 2026
Fixes #12568

### Problem
- When gVisor sends an ICMPv4 “Fragmentation Needed” message (Type 3, Code 4), it was leaving the Next-Hop MTU field as 0.
- According to RFC 1191, this field must contain the MTU of the next-hop link. Sending 0 is incorrect.

### Impact
- When a Linux host receives a Fragmentation Needed message with MTU set to 0, it falls back to an old default value from the RFC plateau table. In practice, this results in the system caching an MTU of 552 bytes.
- That causes connections to containers behind gVisor to degrade significantly, especially for larger packets. It can also create long-lived MTU locks in the routing cache, which impacts performance to external services like GitHub.

### Root cause
**In the IPv4 forwarding path**:

If a packet:
- Has the Don’t Fragment bit set
- Exceeds the outgoing link MTU
gVisor correctly generates an ICMP Destination Unreachable with Code 4.

However, while constructing the ICMP header, it never sets the Next-Hop MTU field using icmpHdr.SetMTU(). As a result, the MTU remains 0.
The method to set it already exists. It just wasn’t being used.

### What this change does
1. Introduces a new ICMP reason type that carries the next-hop MTU value.
2. Updates sendICMPv4 to:
     - Set the correct ICMP code for Fragmentation Needed.
     - Populate the MTU field with the actual link MTU.
     - Safely cap it to uint16 maximum.
3. Updates the forwarding path to pass the correct MTU when generating the ICMP error.

### Result
Now, when a packet is too large and cannot be forwarded:
- gVisor correctly includes the actual next-hop MTU in the ICMP response.
- Hosts update their Path MTU properly.
- No fallback to 552 bytes.
- No unnecessary performance degradation.

FUTURE_COPYBARA_INTEGRATE_REVIEW=#12634 from pawannn:master 8b1f272
PiperOrigin-RevId: 891782710
@copybara-service copybara-service Bot merged commit 28a0852 into google:master Apr 2, 2026
6 checks passed
maci0 pushed a commit to maci0/gvisor that referenced this pull request Apr 23, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Next-hop MTU in ICMP "Fragmentation Needed" packets is set to 0

4 participants