Overlapping ICMP ID not handled by PF state

Started by d00b2020, February 21, 2025, 11:59:56 PM

Hello everyone,

After upgrading from OPNsense v24 to v25, I've encountered an issue with ICMP handling that wasn't present in the previous version.

Observed Behavior in v24

When multiple internal hosts send ICMP echo requests (ping) to the same external destination (e.g., 1.1.1.1) using the same ICMP identifier, the state table shows separate entries for each host. For example:

pfctl -ss
all icmp 1.1.1.1:1 <- 10.3.0.92:1       0:0
all icmp 1.1.1.1:1 <- 10.3.0.94:1       0:0
all icmp <wan-ip>:33788 -> 1.1.1.1:33788       0:0
all icmp <wan-ip>:30580 (10.3.0.92:1) -> 1.1.1.1:30580       0:0
all icmp <wan-ip>:18146 (10.3.0.94:1) -> 1.1.1.1:18146       0:0

In this scenario, all hosts get responses from 1.1.1.1 as expected.

Observed Behavior in v25

After upgrading to version 25, only one host at a time receives a response. Here's an example of the state table now:

pfctl -ss
all icmp 1.1.1.1:8 <- 10.255.255.23:1       0:0
all icmp 1.1.1.1:8 <- 10.255.255.10:1       0:0
all icmp <wan-ip>:1 (10.255.255.23:1) -> 1.1.1.1:8       0:0

The behavior is that the first ping to arrive gets an entry in the state table, and the others time out. Once I cancel the ping from the active host, a few seconds later one of the other hosts starts receiving responses.
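
To spot the affected hosts quickly, the state listing can be checked for LAN-side states that have no translated WAN-side counterpart. The following is only a rough sketch: the line format is assumed from the pfctl -ss output pasted in this post, and the function name is made up for illustration.

```python
import re

# Translated outbound ICMP state as printed above, e.g.
#   all icmp <wan-ip>:30580 (10.3.0.92:1) -> 1.1.1.1:30580       0:0
NAT_STATE = re.compile(
    r"^\S+\s+icmp\s+(?P<wan>\S+):(?P<wan_id>\d+)\s+"
    r"\((?P<lan>[\d.]+):(?P<lan_id>\d+)\)\s+->\s+(?P<dst>[\d.]+):\d+"
)
# LAN-side state, e.g.
#   all icmp 1.1.1.1:8 <- 10.255.255.23:1       0:0
LAN_STATE = re.compile(
    r"^\S+\s+icmp\s+(?P<dst>[\d.]+):\d+\s+<-\s+(?P<lan>[\d.]+):(?P<lan_id>\d+)"
)


def hosts_without_nat_state(pfctl_output: str) -> list[str]:
    """Return LAN hosts that have a LAN-side ICMP state but no translated state."""
    lan_flows = []   # (lan_ip, dst) in order of appearance
    natted = set()   # (lan_ip, dst) pairs that do have a translated state
    for raw in pfctl_output.splitlines():
        line = raw.strip()
        m = NAT_STATE.match(line)
        if m:
            natted.add((m["lan"], m["dst"]))
            continue
        m = LAN_STATE.match(line)
        if m and (m["lan"], m["dst"]) not in lan_flows:
            lan_flows.append((m["lan"], m["dst"]))
    return [lan for (lan, dst) in lan_flows if (lan, dst) not in natted]
```

Fed with the v25 listing above it reports 10.255.255.10 as the host without a translated state; with the v24 listing it reports nothing.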

Windows hosts usually send pings with ID 1 by default. You can force the same ID on all pings from Linux machines using:
ping -e 1 1.1.1.1

What I've Tried So Far

Flushing the State Table: Using pfctl -F state clears the state table, but the problem persists when new pings are initiated.

Comparing Outbound Behavior: I've noticed that in other networks behind the same OPNsense instance, even when pings use the same ICMP identifier, the firewall randomizes the ICMP ID on the outgoing packet, effectively avoiding collisions. However, on this particular network, the outgoing ICMP packet retains its original ID, leading to the state collision.

My Suspicions

It appears that something in the new version is affecting how PF (and by extension, NAT and scrub rules) handles ICMP sessions. In v25, it seems that PF is not randomizing the ICMP identifier for this network, or perhaps a rule is different from what we had in v24. With ICMP lacking a port number, PF relies solely on the identifier to track sessions. If two hosts use the same identifier, the first to establish a state "wins" and the others are merged into (or blocked by) that state, causing the observed behavior.
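
To make the suspicion concrete, here is a toy model of the collision. It is not how PF is implemented; the class, its fields, and the WAN address 203.0.113.1 (a documentation address) are all assumptions made for illustration. It only models the idea that a WAN-side state is unique per (WAN IP, ICMP ID, destination).

```python
import random


class IcmpNatTable:
    """Toy outbound NAT for ICMP echo: one state per (WAN IP, ICMP ID, destination)."""

    def __init__(self, wan_ip, randomize_id, seed=None):
        self.wan_ip = wan_ip
        self.randomize_id = randomize_id
        self.rng = random.Random(seed)
        self.states = {}  # (wan_ip, wan_id, dst) -> (lan_ip, lan_id)

    def outbound(self, lan_ip, lan_id, dst):
        """Return the ICMP ID used on the WAN side, or None if no state can be created."""
        # Reuse the existing state if this flow already has one.
        for (_, wan_id, state_dst), owner in self.states.items():
            if owner == (lan_ip, lan_id) and state_dst == dst:
                return wan_id
        if self.randomize_id:
            # Pick a free ID, as the v24 state table suggests.
            wan_id = self.rng.randrange(1, 65536)
            while (self.wan_ip, wan_id, dst) in self.states:
                wan_id = self.rng.randrange(1, 65536)
        else:
            # Keep the original ID; a second host with the same ID collides.
            wan_id = lan_id
            if (self.wan_ip, wan_id, dst) in self.states:
                return None
        self.states[(self.wan_ip, wan_id, dst)] = (lan_ip, lan_id)
        return wan_id


v25 = IcmpNatTable("203.0.113.1", randomize_id=False)
print(v25.outbound("10.255.255.23", 1, "1.1.1.1"))  # 1
print(v25.outbound("10.255.255.10", 1, "1.1.1.1"))  # None
```

With randomize_id=True both hosts get distinct WAN-side IDs, which matches what the v24 state table shows.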

Questions / Request for Guidance

1. Intended Behavior Change?
Is this a known or intended change in how OPNsense (or PF on FreeBSD) handles ICMP state tracking and NAT between versions 24 and 25?

2. Configuration Adjustments:
• Are there new or modified scrub/NAT options in v25 that affect ICMP identifier randomization?
• Should I look for differences in the outbound NAT configuration (e.g., static-port settings) or in the scrub rules applied to this network compared to others that behave as expected?

3. Workaround Recommendations:
What configuration changes can be made in OPNsense (or directly on the FreeBSD shell) to restore behavior similar to v24, where multiple hosts with the same ICMP ID can receive responses?

I appreciate any insights or recommendations from the community or maintainers regarding this issue.

Thanks in advance for your help!

Hi d00b2020, I ran some tests and could reproduce the described behavior.

Quote from: d00b2020 on February 21, 2025, 11:59:56 PM
Windows hosts usually send pings with ID 1 by default. You can force the same ID on all pings from Linux machines using

I don't have a Windows machine running right now, but are you sure about sending pings with ID 1? According to the second link, type 1 is unassigned.

The two sources I found (one Windows-specific, the other not) list type 8 as echo request and type 0 as echo reply. If I set the type to 8 (the default on the Linux machines I use) in the ping call, it works as before. It doesn't work when set to 0, 3 or 4. I haven't tested any others.

## client 1
$ ping -e 8 -c 40 1.0.0.1
## client 2
$ ping -e 8 -c 40 1.0.0.1

## server
# pfctl -s s | fgrep 1.0.0
ax1 icmp 1.0.0.1:8 <- 192.168.169.7:8       0:0
ax0 icmp 37.17.238.198:58477 (192.168.169.7:8) -> 1.0.0.1:8       0:0
ax1 icmp 1.0.0.1:8 <- 192.168.169.8:8       0:0
ax0 icmp 37.17.238.198:14027 (192.168.169.8:8) -> 1.0.0.1:8       0:0

https://www.thewindowsclub.com/how-to-allow-pings-icmp-echo-requests-through-windows-firewall
https://www.iana.org/assignments/icmp-parameters/icmp-parameters.xhtml

I agree that the behavior changed and seems stricter now. The other *sense behaves the same as OPNsense 25. I have not yet found a sysctl setting to restore the earlier behavior.
Deciso DEC740

Hi patient0, the ID is not the same as the type. This is an echo request packet:

+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|   Type (8)    |   Code (0)    |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|      Checksum (16 bits)       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|      Identifier (ID) (16 bits)|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|      Sequence Number (16 bits)|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|        Data (variable)        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

As with the port number (session number) in TCP/UDP, the ID plays that role in ICMP echo requests.
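
To show that type and ID really are separate fields, here is a small sketch that builds and parses the 8-byte echo header from the diagram. The function names are made up for illustration, and nothing is sent on the wire; it only packs bytes.

```python
import struct

ICMP_ECHO_REQUEST = 8  # the Type field; unrelated to the Identifier


def inet_checksum(data: bytes) -> int:
    """16-bit one's complement checksum used by ICMP."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_echo_request(identifier: int, sequence: int, payload: bytes = b"") -> bytes:
    """Pack Type, Code, Checksum, Identifier and Sequence Number, then the payload."""
    header = struct.pack("!BBHHH", ICMP_ECHO_REQUEST, 0, 0, identifier, sequence)
    csum = inet_checksum(header + payload)
    header = struct.pack("!BBHHH", ICMP_ECHO_REQUEST, 0, csum, identifier, sequence)
    return header + payload


def parse_echo(packet: bytes) -> dict:
    """Unpack the first 8 bytes of an ICMP echo message into named fields."""
    icmp_type, code, csum, identifier, sequence = struct.unpack("!BBHHH", packet[:8])
    return {"type": icmp_type, "code": code, "checksum": csum,
            "id": identifier, "seq": sequence}


pkt = build_echo_request(identifier=1, sequence=7)
print(parse_echo(pkt))  # type is 8 while id is 1
```

So ping -e 1 changes only the "id" field; the "type" stays 8 for every echo request.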

Because pf does not change the ID, only one ICMP echo request with a given ID can leave the router through WAN at a time: there can be only one state with the public source IP and a given ID.

The first process to ping with that ID is the one that gets out. Subsequent ones won't get through, because they can't be routed while a state with the same ID already exists.

I think that this is only a matter of some configuration change between 24.7 and 25.1, but I didn't investigate it yet.

February 22, 2025, 04:53:34 PM #4 Last Edit: February 22, 2025, 05:02:17 PM by patient0
Quote from: muchacha_grande on February 22, 2025, 04:37:43 PM
Hi patient0, the ID is not the same as the type. [...] As with the port number (session number) in TCP/UDP, the ID plays that role in ICMP echo requests.

Thanks for taking the time to explain it, that does make sense! I feel a bit silly for having chimed in at all. Funnily enough, on pfSense 2.7.2 CE, which is based on FreeBSD 14, the issue does not occur, so it's not an OS-level issue.

I drew a false conclusion because of a) missing knowledge and b) misreading the output of pfctl -s s:

... when called with -e 1
ax1 icmp 1.0.0.1:8 <- 192.168.169.7:1      0:0

... when called with -e 8
ax1 icmp 1.0.0.1:8 <- 192.168.169.7:8      0:0

But wouldn't the router just be doing what you're asking it to do? You give it a fixed (session) ID, of which no two can coexist.

Seems there's a FreeBSD bug report for this issue https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=283795

Quote from: muchacha_grande on February 22, 2025, 04:37:43 PM
I think that this is only a matter of some configuration change between 24.7 and 25.1, but I didn't investigate it yet.

Well, it may not be a configuration problem at all.

Quote from: muchacha_grande on February 22, 2025, 05:42:17 PM
Well, it may not be a configuration problem at all.

And it adds to the confusion that it _does_ work when I set the ID to 8 (tested on OPNsense 25.7.a_36), but not with 0, 3, 4, 9, 15 or 16. Is the wrong header field being compared in the code?

Quote from: patient0 on February 22, 2025, 06:38:11 PM
And it adds to the confusion that it _does_ work when I set the ID to 8 (tested on OPNsense 25.7.a_36), but not with 0, 3, 4, 9, 15 or 16. Is the wrong header field being compared in the code?

Please open an issue on GitHub about this.
Deciso DEC750
People who think they know everything are a great annoyance to those of us who do. (Isaac Asimov)

Quote from: Patrick M. Hausen on February 22, 2025, 06:46:08 PM
Please open an issue on GitHub about this.

There is an open FreeBSD bug for it. I just requested an account for the FreeBSD Bugzilla and will update the report. I'll do a packet capture while waiting.

Quote from: patient0 on February 22, 2025, 06:51:34 PM
There is an open FreeBSD bug for it. I just requested an account for the FreeBSD Bugzilla and will update the report. I'll do a packet capture while waiting.

Great! Thanks!

Btw: can anyone confirm that when setting the ICMP ID to 8 it does work, or is it only me?

Quote from: muchacha_grande on February 23, 2025, 08:22:29 PM
Here is another, possibly fixed, bug that may be related to this problem: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=280701

Well, that was a fun read ... not.

Could be part of the whole ICMP kerfuffle, indeed. Too many commits involved to get my head around it.

Remember this old bug ;)

I think the state tracking is overly strict and we said as much before, but there's no expert there that agrees while copying OpenBSD code to FreeBSD. It's a bit ironic.

I'm CCed to the new ticket. Let's see what happens.


Cheers,
Franco

Quote from: patient0 on February 23, 2025, 08:44:45 AM
Btw: can anyone confirm that when setting the ICMP ID to 8 it does work, or is it only me?

patient0, I can confirm that using ID 8 works, as you stated.

One might suspect that the wrong variable is being used, given that the ICMP type of an echo request is also 8.