IPv6 gateway high packet loss on WAN upload with shaping

Started by OPNenthu, January 16, 2025, 10:15:26 AM

Last Edit: January 16, 2025, 10:18:53 AM by OPNenthu (Reason: add images to post body)
OPNsense 24.7.11_2-amd64

The IPv6 gateway goes into "offline" status with high packet loss reported when uploading to the web (observed while running online speed tests). Once upload activity ceases, the gateway gradually returns to online status. The health graphs show the packet loss on WAN_DHCP6. The IPv4 gateway is not impacted.





Despite what OPNsense says, the packet loss is not real.  The gateway remains online and speed tests indicate 0% actual loss.  It appears to be a reporting issue with no real consequence as far as I can tell.

I found two necessary conditions for this:

- Traffic shaping must be in use; in my case I followed the guide on fixing Bufferbloat with FQ_CoDel exactly. I have one download pipe set to 760 Mbit/s and one upload pipe set to 21 Mbit/s.

- The 'Monitor IP' in the gateway configuration must be left at its default, so that the gateway itself is pinged.

If either of these is changed, e.g. disabling the shaping or setting a public DNS as the monitor IP, then the issue is not observed.

Only uploads cause the symptom. I confirmed this with "speedtest-cli --no-download" from a wireless client. Doing the inverse test with "--no-upload" has zero impact. I see exactly the same from my wired clients when using e.g. speedtest.net or the Cloudflare speed test.
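For anyone who wants to reproduce this outside of the dashboard, here is a minimal Python sketch that pings a target over IPv6 while an upload-only speed test runs, then reports the loss from the ping summary. It is only a sketch under assumptions: the function names are my own, it assumes a ping binary that accepts -6 and -c (older systems may need ping6), and a link-local gateway address needs its interface scope appended (e.g. fe80::1%igb0, where the interface name is a placeholder).

```python
import re
import subprocess

# Matches the summary line printed by common ping implementations,
# e.g. "25.0% packet loss" (FreeBSD) or "0% packet loss" (Linux).
LOSS_RE = re.compile(r"(\d+(?:\.\d+)?)% packet loss")


def parse_loss(ping_output: str) -> float:
    """Return the packet-loss percentage found in ping's summary output."""
    match = LOSS_RE.search(ping_output)
    if match is None:
        raise ValueError("no packet-loss summary found in ping output")
    return float(match.group(1))


def ping_loss(target: str, count: int = 20) -> float:
    """Ping target over IPv6 and return the reported loss percentage.

    Assumption: the local ping accepts -6 and -c.
    """
    result = subprocess.run(
        ["ping", "-6", "-c", str(count), target],
        capture_output=True,
        text=True,
        check=False,  # ping exits non-zero when probes are lost
    )
    return parse_loss(result.stdout)


def loss_during_upload(target: str, count: int = 20) -> float:
    """Measure ping loss to target while an upload-only speed test runs."""
    upload = subprocess.Popen(
        ["speedtest-cli", "--no-download"],
        stdout=subprocess.DEVNULL,
    )
    try:
        return ping_loss(target, count)
    finally:
        upload.wait()  # let the speed test finish before returning
```

Calling loss_during_upload() once with the gateway address and once with a public host should show whether the loss is specific to probes aimed at the gateway.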

It doesn't matter if the upload is over IPv4 or IPv6; both routes will cause the v6 gateway (only) to virtually go offline as the packet "loss" accumulates.
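To illustrate why the reported loss builds up during an upload and then fades gradually afterwards, here is a small Python sketch of a monitor that averages probe results over a sliding window. This is an assumption about the general approach, not dpinger's actual code; the class name and window size are made up for the example.

```python
from collections import deque


class LossWindow:
    """Track the most recent probe results and report the loss percentage."""

    def __init__(self, size: int) -> None:
        # Only the last `size` probes count; older results fall out.
        self.results = deque(maxlen=size)

    def record(self, replied: bool) -> None:
        """Store whether a probe received a reply in time."""
        self.results.append(replied)

    def loss_percent(self) -> float:
        """Return the share of unanswered probes in the window, in percent."""
        if not self.results:
            return 0.0
        lost = sum(1 for ok in self.results if not ok)
        return 100.0 * lost / len(self.results)
```

With a window of four probes, four missed replies give 100% loss; two answered probes afterwards bring it to 50%, and two more bring it back to 0%. The same effect would explain a gateway that only slowly returns to online once the upload stops.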

For now I've set the Cloudflare DNS as the gateway monitor IP to work around the issue.


Traffic shaping breaks some IPv6 functionality (esp. DHCPv6 / ICMPv6).

"Real" traffic, though, is not affected I think.

See https://github.com/opnsense/core/issues/7342

So it _might_ be that the packet loss shown is real but limited to ICMPv6, while your speeds are fine because the TCP/IP(v6) traffic is unaffected.

It does seem related to that.  Thank you for the link.

I understand now that by setting the monitor IP to another host I am just masking the issue from dpinger / Health RRD, so it is not a viable workaround. Unfortunately my WAN IPs are dynamic, so I cannot rely on setting the src/dest mask in shaping. Hopefully the suggestion on GitHub to add an option for excluding ICMP from the IP shaper rule will come to pass.
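Just to make the idea of such an option concrete, here is a purely hypothetical Python sketch of the decision it would make per packet. It is not how the OPNsense shaper is implemented and the function name is invented; the only facts it relies on are the IANA protocol numbers for ICMP (1) and ICMPv6 (58).

```python
# IANA-assigned IP protocol numbers.
ICMP = 1
ICMPV6 = 58


def should_shape(protocol: int, exclude_icmp: bool = True) -> bool:
    """Decide whether a packet with this IP protocol number goes to the shaper.

    Hypothetical: with exclude_icmp set, ICMP and ICMPv6 bypass the pipe,
    everything else is shaped as before.
    """
    if exclude_icmp and protocol in (ICMP, ICMPV6):
        return False
    return True
```

With the option on, gateway monitor probes and other ICMPv6 traffic would skip the upload pipe, while TCP and UDP would still be shaped.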

Side question: could this have any bearing on my earlier issue with unreliable IPv6 temporary address generation (it is still happening, btw)?  I know the shaping is for WAN traffic only and should not impact LAN-side, but I am not sure how the shaper works under the hood.