Messages - greg124816

#1
After thinking this through, I noticed that the longest-match issue discussed for that old uncommitted patch concerned traffic from the master to the backup.

On the master, the CARP IP was being used as the source IP by default for outgoing packets when pinging the backup's non-CARP IP. This confused the backup, which wouldn't reply because it held that same IP in backup state.

That is apparently no longer an issue in 11.2, since I can force the CARP IP to be used for outgoing traffic from the master and ping6 works. tcpdump looks good, everything proper: the backup sends a reply with Dst=CARP IP and Src=its non-CARP IP.

In my current case, I can't manipulate the address assignments to affect the longest match because I'm trying to ping the CARP IP from the backup, so the destination is the CARP IP and it will always be the longest match to the CARP IP present in backup state.
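To make the longest-match point concrete, here is a minimal Python sketch. It models only a "most shared leading bits wins" rule in isolation; the real kernel selection applies other rules as well, and the function names are my own, not anything from the FreeBSD source. The addresses are the ones from the ifconfig output in this thread.

```python
import ipaddress

def common_prefix_len(a: str, b: str) -> int:
    """Number of leading bits two IPv6 addresses share (0..128)."""
    diff = int(ipaddress.IPv6Address(a)) ^ int(ipaddress.IPv6Address(b))
    return 128 - diff.bit_length()

def pick_source(candidates, dst):
    """Pick the candidate sharing the most leading bits with dst (ties: first wins)."""
    return max(candidates, key=lambda src: common_prefix_len(src, dst))

# Addresses on the backup firewall's interface: permanent IP and CARP IP.
backup_addrs = ["2001:db8:d::3", "2001:db8:d::1"]

print(common_prefix_len("2001:db8:d::3", "2001:db8:d::1"))  # 126
print(common_prefix_len("2001:db8:d::1", "2001:db8:d::1"))  # 128
print(pick_source(backup_addrs, "2001:db8:d::1"))           # 2001:db8:d::1
```

Because the destination is the CARP IP itself, the CARP address always scores the full 128 bits, so renumbering the CARP IP cannot make the permanent address win under this rule.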

What seems to be needed is some logic like whatever they have on IPv4 to ignore CARP interfaces/addresses etc., at least when in backup state. I haven't looked and don't really have time to go any deeper right now.

I'm going to run with things as they are since ping6 with -S option specifying the source works fine and pinging any hosts/ips on or off the opnsense boxes works fine.

As I turn up IPv6 stuff, if any issues crop up due to how it works I'll revisit the issue.

#2
This is another old discussion I found, where someone proposed a prefer_source flag to help prefer a "default" IPv6 address on an interface. Some people argued that you should just specify the src address in your application instead, but it looks like it was committed:
https://github.com/freebsd/freebsd/blob/releng/11.2/sys/netinet6/in6_src.c#L461

This is in a different source file. I'm not sure how it all ties together, but in the ping6.c source code I see a comment about not using raw sockets since they dropped root priv. Maybe in6_src.c is not involved.


But, man ifconfig shows:
     prefer_source
             Set a flag to prefer address as a candidate of the source address
             for outgoing packets.

     -prefer_source
             Clear a flag prefer_source.


I set prefer_source on the non-CARP IP and things still work the same, so it seems it has no effect in this case.

I'm going to see if I can manipulate the address assignments to avoid the longest match landing on the CARP IP.
#3
Thanks for the pointer.

I was thinking it was not an opnsense issue, but I figured I was doing something wrong and someone had seen it before.

Before I try to submit a bug report, I found this promising info:

It seems freebsd 11.2 chooses an IPv6 src address with the longest match to dst address here:
https://github.com/freebsd/freebsd/blob/releng/11.2/sys/netinet6/in6.c#L1818

I found someone posted a patch long ago that prevents using the IP from a CARP interface here:
http://openbsd-archive.7691.n7.nabble.com/hack-for-carp-in-IPv6-source-address-selection-td256756.html

They said it can be worked around by choosing your IPv6 addresses to prevent the longest match from selecting the CARP interface. I tried with 2001:db8:d::12:1234 and didn't see any different results. His specific issue was pinging from the master to the backup; I'm seeing the opposite, ping from backup to master.

Looks like I should look a little closer at the in6_matchlen() code, see what is going on, and keep poking at it.
https://github.com/freebsd/freebsd/blob/releng/11.2/sys/netinet6/in6.c#L1709
#4
To add to this, I saw the backup firewall sending NSs to the solicited-node multicast address, but it still uses the wrong src IP and the master firewall does not reply... or it replies internally, since the sender is the CARP IP it has assigned to itself.

This shows the frames with the src MAC of the backup firewall and the src IP of the CARP IP... it should be using its permanent IP of 2001:db8:d::3

root@dmzfwa:~ # tcpdump -eni igb2_vlan4 ip6 and not proto 112
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on igb2_vlan4, link-type EN10MB (Ethernet), capture size 262144 bytes
08:14:19.238855 ac:1f:6b:67:01:fe > 33:33:ff:00:00:01, ethertype IPv6 (0x86dd), length 86: 2001:db8:d::1 > ff02::1:ff00:1: ICMP6, neighbor solicitation, who has 2001:db8:d::1, length 32
08:14:20.240621 ac:1f:6b:67:01:fe > 33:33:ff:00:00:01, ethertype IPv6 (0x86dd), length 86: 2001:db8:d::1 > ff02::1:ff00:1: ICMP6, neighbor solicitation, who has 2001:db8:d::1, length 32
08:14:21.242630 ac:1f:6b:67:01:fe > 33:33:ff:00:00:01, ethertype IPv6 (0x86dd), length 86: 2001:db8:d::1 > ff02::1:ff00:1: ICMP6, neighbor solicitation, who has 2001:db8:d::1, length 32



I tried setting the CARP IP to 2001:db8:d::12:1234  so that the last 24 bits were populated that get copied to solicited multicast address. It didn't make any difference in how things operated.
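For reference, this is how the solicited-node multicast address and the Ethernet destination seen in the trace are derived from the target address. It is a small Python sketch with helper names of my own choosing, not code from any of the systems involved.

```python
import ipaddress

def solicited_node(addr: str) -> ipaddress.IPv6Address:
    """ff02::1:ff00:0/104 combined with the low 24 bits of the unicast address."""
    low24 = int(ipaddress.IPv6Address(addr)) & 0xFFFFFF
    base = int(ipaddress.IPv6Address("ff02::1:ff00:0"))
    return ipaddress.IPv6Address(base | low24)

def multicast_mac(group: ipaddress.IPv6Address) -> str:
    """Ethernet destination for an IPv6 multicast group: 33:33 plus its low 32 bits."""
    low32 = int(group) & 0xFFFFFFFF
    octets = [0x33, 0x33] + list(low32.to_bytes(4, "big"))
    return ":".join(f"{o:02x}" for o in octets)

group = solicited_node("2001:db8:d::1")
print(group)                                  # ff02::1:ff00:1
print(multicast_mac(group))                   # 33:33:ff:00:00:01
print(solicited_node("2001:db8:d::12:1234"))  # ff02::1:ff12:1234
```

The first two results match the destination IP and destination MAC in the capture. Changing the CARP IP to 2001:db8:d::12:1234 only changes which multicast group the NS goes to, not which source address gets picked.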
#5
I have a HA firewall setup that has been working in production with IPv4 for many years.

I'm adding IPv6 now and ran into an issue with source address selection on the Backup CARP interface.

ping6 to any IPv6 address on the subnet works correctly from either firewall except for one case:

ping6 to the CARP IP from the Backup firewall fails

For some reason the Backup firewall uses the CARP IP as the source address, even though it is in BACKUP state. If I force ping6 to use the permanent IP assigned to the backup firewall it works fine.

I can see with tcpdump that the frames come with both src and dest IP as the CARP ip. Even the Neighbor Solicitation has incorrect src IP, and is also not sent to ff02::1:ff00:1 (Solicited-node Multicast address).

Here is tcpdump on the master firewall using a regular ping6 to CARP IP from the backup firewall ( ping6 2001:db8:d::1 )
root@dmzfwa:~ # tcpdump -ni igb2_vlan4 ip6 and not proto 112
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on igb2_vlan4, link-type EN10MB (Ethernet), capture size 262144 bytes
05:54:09.722690 IP6 2001:db8:d::1 > 2001:db8:d::1: ICMP6, echo request, seq 0, length 16
05:54:10.784383 IP6 2001:db8:d::1 > 2001:db8:d::1: ICMP6, echo request, seq 1, length 16
05:54:11.821819 IP6 2001:db8:d::1 > 2001:db8:d::1: ICMP6, echo request, seq 2, length 16
05:54:12.831525 IP6 2001:db8:d::1 > 2001:db8:d::1: ICMP6, echo request, seq 3, length 16
05:54:13.845976 IP6 2001:db8:d::1 > 2001:db8:d::1: ICMP6, echo request, seq 4, length 16
05:54:14.768000 IP6 2001:db8:d::1 > 2001:db8:d::1: ICMP6, neighbor solicitation, who has 2001:db8:d::1, length 32
05:54:14.909059 IP6 2001:db8:d::1 > 2001:db8:d::1: ICMP6, echo request, seq 5, length 16
05:54:15.768636 IP6 2001:db8:d::1 > 2001:db8:d::1: ICMP6, neighbor solicitation, who has 2001:db8:d::1, length 32
05:54:15.963281 IP6 2001:db8:d::1 > 2001:db8:d::1: ICMP6, echo request, seq 6, length 16
05:54:16.768648 IP6 2001:db8:d::1 > 2001:db8:d::1: ICMP6, neighbor solicitation, who has 2001:db8:d::1, length 32
05:54:17.026216 IP6 2001:db8:d::1 > 2001:db8:d::1: ICMP6, echo request, seq 7, length 16


Things work if I force the source IP selection of ping6  (ping6 -S 2001:db8:d::3 2001:db8:d::1)
root@dmzfwa:~ # tcpdump -ni igb2_vlan4 ip6 and not proto 112
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on igb2_vlan4, link-type EN10MB (Ethernet), capture size 262144 bytes
06:03:25.427573 IP6 2001:db8:d::3 > ff02::1:ff00:1: ICMP6, neighbor solicitation, who has 2001:db8:d::1, length 32
06:03:25.427666 IP6 2001:db8:d::2 > 2001:db8:d::3: ICMP6, neighbor advertisement, tgt is 2001:db8:d::1, length 32
06:03:25.427740 IP6 2001:db8:d::3 > 2001:db8:d::1: ICMP6, echo request, seq 0, length 16
06:03:25.427776 IP6 2001:db8:d::1 > 2001:db8:d::3: ICMP6, echo reply, seq 0, length 16


Pinging the IPv4 CARP master IP still works fine. It's also not just ping6 having issues: ssh to the CARP master IPv6 IP has the same symptoms (tcpdump looks the same, with src+dst as the CARP IP).

Has anyone seen anything like this? I've rebooted multiple times, and built and rebuilt the IPv6 CARP both as its own CARP item in opnsense with a different VHID and as an IP alias on the same VHID. I get the same results both ways. Never any problems with IPv4.

With tcpdump -e option I did verify that the ping6 and NS frames had the proper SRC MAC of the backup firewall interface.

Here are ifconfig details for the interface on both firewalls:

Master firewall(oops, edited to change IPv6 first part to 2001:db8 like the rest):
root@dmzfwa:~ # ifconfig igb2_vlan4
igb2_vlan4: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
        ether ac:1f:6b:67:01:b0
        inet6 fe80::ae1f:6bff:fe67:1b0%igb2_vlan4 prefixlen 64 scopeid 0xd
        inet6 2001:db8:d::2 prefixlen 64
        inet6 2001:db8:d::1 prefixlen 64 vhid 1
        inet 10.10.144.2 netmask 0xffffffc0 broadcast 10.10.144.63
        inet 10.10.144.1 netmask 0xffffffc0 broadcast 10.10.144.63 vhid 1
        inet 10.10.144.58 netmask 0xffffffc0 broadcast 10.10.144.63 vhid 1
        inet 10.10.144.54 netmask 0xffffffc0 broadcast 10.10.144.63 vhid 1
        inet 10.10.144.55 netmask 0xffffffc0 broadcast 10.10.144.63 vhid 1
        inet 10.10.144.56 netmask 0xffffffc0 broadcast 10.10.144.63 vhid 1
        inet 10.10.144.57 netmask 0xffffffc0 broadcast 10.10.144.63 vhid 1
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
        media: Ethernet autoselect (1000baseT <full-duplex>)
        status: active
        vlan: 4 vlanpcp: 0 parent interface: igb2
        carp: MASTER vhid 1 advbase 1 advskew 0
        groups: vlan
root@dmzfwa:~ #

Backup firewall:
root@dmzfwb:~ # ifconfig igb2_vlan4
igb2_vlan4: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
        ether ac:1f:6b:67:01:fe
        inet6 fe80::ae1f:6bff:fe67:1fe%igb2_vlan4 prefixlen 64 scopeid 0xd
        inet6 2001:db8:d::3 prefixlen 64
        inet6 2001:db8:d::1 prefixlen 64 vhid 1
        inet 10.10.144.3 netmask 0xffffffc0 broadcast 10.10.144.63
        inet 10.10.144.1 netmask 0xffffffc0 broadcast 10.10.144.63 vhid 1
        inet 10.10.144.58 netmask 0xffffffc0 broadcast 10.10.144.63 vhid 1
        inet 10.10.144.54 netmask 0xffffffc0 broadcast 10.10.144.63 vhid 1
        inet 10.10.144.55 netmask 0xffffffc0 broadcast 10.10.144.63 vhid 1
        inet 10.10.144.56 netmask 0xffffffc0 broadcast 10.10.144.63 vhid 1
        inet 10.10.144.57 netmask 0xffffffc0 broadcast 10.10.144.63 vhid 1
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
        media: Ethernet autoselect (1000baseT <full-duplex>)
        status: active
        vlan: 4 vlanpcp: 0 parent interface: igb2
        carp: BACKUP vhid 1 advbase 1 advskew 100
        groups: vlan
root@dmzfwb:~ #

#6
I was migrating an old pair of redundant firewalls w/pfsense to new hardware w/opnsense.

What I discovered is that the CARP implementations are NOT compatible and both become MASTER (while trying to swap in the new B firewall while leaving the A firewall running).

After much confusion, the reason is that opnsense seems to have a properly incrementing replay protection counter in each CARP advertisement, while pfsense's counter is static and unchanging. Since the counters never match the packets get ignored by each firewall and both stay MASTER and continue to send their advertisements. There are 2 counter fields, I'm not sure which one tcpdump is showing, but whichever one it is is definitely not matching between opnsense and pfsense.

I verified the same incrementing replay protection counter behavior on several uCARP https://www.pureftpd.org/project/ucarp systems we have running.

Here are some example tcpdump traces, all done from a single pfsense host looking out two different interfaces

Looking at an OPNsense 19.1 host (initial post had the wrong trace for the opnsense host, it's corrected now):
[2.3.4-RELEASE][root@localhost]/root: tcpdump -T carp -ni em1 vrrp and host 192.168.100.2
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on em1, link-type EN10MB (Ethernet), capture size 65535 bytes
13:05:27.216731 IP 192.168.100.2 > 224.0.0.18: CARPv2-advertise 36: vhid=1 advbase=1 advskew=0 authlen=7 counter=6491304834018196506
13:05:28.218369 IP 192.168.100.2 > 224.0.0.18: CARPv2-advertise 36: vhid=1 advbase=1 advskew=0 authlen=7 counter=6491304834018196507
13:05:29.220678 IP 192.168.100.2 > 224.0.0.18: CARPv2-advertise 36: vhid=1 advbase=1 advskew=0 authlen=7 counter=6491304834018196508
13:05:30.222307 IP 192.168.100.2 > 224.0.0.18: CARPv2-advertise 36: vhid=1 advbase=1 advskew=0 authlen=7 counter=6491304834018196509


Looking at a Linux Centos host running ucarp 1.5.1:
[2.3.4-RELEASE][root@localhost]/root: tcpdump -T carp -ni em1 vrrp and host 192.168.100.7
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on em1, link-type EN10MB (Ethernet), capture size 65535 bytes
12:57:37.999181 IP 192.168.100.7 > 224.0.0.18: CARPv2-advertise 36: vhid=5 advbase=1 advskew=120 authlen=7 counter=5446066235882559562
12:57:39.112282 IP 192.168.100.7 > 224.0.0.18: CARPv2-advertise 36: vhid=5 advbase=1 advskew=120 authlen=7 counter=5446066235882559563
12:57:40.999080 IP 192.168.100.7 > 224.0.0.18: CARPv2-advertise 36: vhid=5 advbase=1 advskew=120 authlen=7 counter=5446066235882559564
12:57:42.115312 IP 192.168.100.7 > 224.0.0.18: CARPv2-advertise 36: vhid=5 advbase=1 advskew=120 authlen=7 counter=5446066235882559565



Looking at its own CARP adverts (pfsense 2.3.4):
[2.3.4-RELEASE][root@localhost]/root: tcpdump -T carp -ni em0 vrrp and host 192.168.101.2
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on em0, link-type EN10MB (Ethernet), capture size 65535 bytes
12:59:24.369135 IP 192.168.101.2 > 224.0.0.18: CARPv2-advertise 36: vhid=1 advbase=1 advskew=0 authlen=7 counter=16106432045254150054
12:59:25.370135 IP 192.168.101.2 > 224.0.0.18: CARPv2-advertise 36: vhid=1 advbase=1 advskew=0 authlen=7 counter=16106432045254150054
12:59:26.371129 IP 192.168.101.2 > 224.0.0.18: CARPv2-advertise 36: vhid=1 advbase=1 advskew=0 authlen=7 counter=16106432045254150054
12:59:27.372131 IP 192.168.101.2 > 224.0.0.18: CARPv2-advertise 36: vhid=1 advbase=1 advskew=0 authlen=7 counter=16106432045254150054
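A quick way to compare captures like these is to pull the counter= values out of the tcpdump text and look at the step between consecutive adverts. This is only a sketch that works on tcpdump's printed output as shown in this post; it says nothing about which of the two counter fields tcpdump is decoding, and the function names are mine.

```python
import re

COUNTER_RE = re.compile(r"counter=(\d+)")

def counters(trace):
    """Extract the counter= values from tcpdump CARP advertisement lines."""
    return [int(m.group(1)) for m in COUNTER_RE.finditer(trace)]

def classify(values):
    """Label a counter sequence by the differences between neighbours."""
    if len(values) < 2:
        return "too few samples"
    steps = {b - a for a, b in zip(values, values[1:])}
    if steps == {0}:
        return "static"
    if all(s > 0 for s in steps):
        return "incrementing"
    return "mixed"

# Shortened copies of the traces from this post.
opnsense_trace = """
13:05:27.216731 IP 192.168.100.2 > 224.0.0.18: CARPv2-advertise 36: vhid=1 advbase=1 advskew=0 authlen=7 counter=6491304834018196506
13:05:28.218369 IP 192.168.100.2 > 224.0.0.18: CARPv2-advertise 36: vhid=1 advbase=1 advskew=0 authlen=7 counter=6491304834018196507
"""
pfsense_trace = """
12:59:24.369135 IP 192.168.101.2 > 224.0.0.18: CARPv2-advertise 36: vhid=1 advbase=1 advskew=0 authlen=7 counter=16106432045254150054
12:59:25.370135 IP 192.168.101.2 > 224.0.0.18: CARPv2-advertise 36: vhid=1 advbase=1 advskew=0 authlen=7 counter=16106432045254150054
"""

print(classify(counters(opnsense_trace)))  # incrementing
print(classify(counters(pfsense_trace)))   # static
```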


I can see from a 2004 presentation on OpenBSD CARP that the replay detection counters were "not implemented yet" at that time.
See page 10 in this pdf: https://cyber-defense.sans.org/resources/papers/gsec/carp-free-fail-over-protocol-106433

The FreeBSD initial commit of CARP code seems to have been in 2005 and to have included the incrementing counter (sc_counter++) here:
https://github.com/freebsd/freebsd/blob/e1d22638d0a8257ed01b7f95d1b6d5cef74ebd07/sys/netinet/ip_carp.c#L747

The above code with sc_counter++ is what is present in opnsense github source.

Finally, this pfsense code on github is supposed to be from Freebsd with their changes:
https://github.com/pfsense/FreeBSD-src/blob/RELENG_2_2/sys/netinet/ip_carp.c#L715





Does anyone have any comments on the history of this?
Am I somehow confused or totally out of the loop and this is a known incompatibility?
opnsense must have switched to the FreeBSD version of ip_carp.c at some point, when the fork occurred?

I haven't looked at the tcpdump CARP parsing code yet, but it's possible it is parsing one of the two counters and pfsense is using the other one. Either way it does not seem to be compatible with opnsense or FreeBSD or uCARP.

I have a couple more pairs to migrate, and I don't see where even the latest version of pfsense has the incrementing counter (based on the code shown on github). I'm not sure if upgrading to the latest pfsense would get me an incrementing counter so I could migrate more easily. I guess I'll have to bring one up in a VM and check.

If I'm right, this CARP incompatibility is an important thing to know if you are planning to "smoothly" migrate a redundant pair of firewalls from pfsense to opnsense... they will not cooperate on CARP so it's a hard cut from one to the other, not gracefully anyway.
#8
Hey! I fixed it. It was a typo on that disabled icon span line.

Here's the patch that fixes it for me:

--- firewall_rules.php.orig     2018-08-09 16:19:58.929729000 -0700
+++ firewall_rules.php  2018-08-09 16:20:21.632562000 -0700
@@ -440,7 +440,7 @@
                   }  elseif ($filterent['type'] == "reject" && empty($filterent['disabled'])) {
                       $iconfn = "fa fa-times-circle text-danger";
                   }  elseif ($filterent['type'] == "reject" && !empty($filterent['disabled'])) {
-                      $iconfn = "f afa-times-circle text-muted";
+                      $iconfn = "fa fa-times-circle text-muted";
                   } elseif (empty($filterent['disabled'])) {
                       $iconfn = "fa fa-play text-success";
                   } else {

#9
I may have found the issue, in the attached image I have a snapshot of the HTML source.
I highlighted in yellow the two reject rule icon <span> sections, the top one is working the bottom one not.

The text of the two icon spans are:


<span class="fa fa-times-circle text-danger"></span>


<span class="f afa-times-circle text-muted"></span>


Is there a typo here, "f afa-times-circle" instead of "fa fa-times-circle", which makes the icon not display and then of course not be clickable since there is no icon?
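A malformed class string like this is easy to catch mechanically. The sketch below is Python rather than the PHP the page is written in, and it assumes the "fa fa-<icon-name>" class convention seen in the working span; the pattern and function name are my own.

```python
import re

# Assumed convention: literal base class "fa", then "fa-<icon-name>",
# then any number of extra classes such as "text-muted".
FA_RE = re.compile(r"^fa fa-[a-z0-9-]+(?: [\w-]+)*$")

def malformed(classes):
    """Return the class strings that do not follow the 'fa fa-<name>' form."""
    return [c for c in classes if not FA_RE.match(c)]

icon_classes = [
    "fa fa-times-circle text-danger",
    "f afa-times-circle text-muted",
    "fa fa-play text-success",
]
print(malformed(icon_classes))  # ['f afa-times-circle text-muted']
```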

#10
Yes, the green arrow ones work as you are describing. Those are ones with Accept actions in the firewall rule.

I have 3 or 4 Rules with "Reject" action that used to work like that but no longer do.
When the rule is enabled the icon shows up, you can click it and the rule is disabled.... but at that point, there is no icon present and you cannot click to re-enable the rule.

In the image below you can see the green/gray arrows for enabled/disabled Accept rules. There are 2 reject action rules in the image but only the upper one has its icon. The rule below that reject rule is a disabled reject rule and there is no icon to click for re-enabling it.
#11
I searched a little bit on the forum and did not see this issue mentioned.

After upgrading to 18.7, my firewall "reject" rules still work and can be edited and enabled/disabled if I click the Edit icon (pencil) for the rule. Then on the Edit Firewall Rule page I can check/uncheck the "Disabled: X Disable this Rule" check box and everything works as expected.

The issue I'm seeing is with the "single click to enable/disable" rules from the Firewall: Rules: LAN page ( the list of all rules for LAN)

If the Reject rule is currently enabled, I can click the red circle with white X icon and the rule is disabled, I can then Apply the change and the rule is actually changed to disabled.

But, after disabling a reject rule, the Firewall: Rules: LAN page has no icon for that disabled reject rule. Normally (before my upgrade) it would have a grayed out circle with white X in it which you could click to Enable the rule, and then Apply/save the change.

As things are now since the upgrade, I have to click the Edit icon (pencil) on the far right of the disabled Reject rule to load the edit page for that rule, then uncheck the Disable checkbox and save/apply to re-enable the rule.

All the Accept/Pass rules I have still work as they did before the upgrade.... I can enable/disable with a single click from the Firewall: Rules: LAN page and they show a grayed out triangle or green triangle indicating the enable/disable state.

I have a redundant pair of firewalls running CARP and both act this same way. I have 4 or 5 reject rules and they all operate this same way now.

Anyone else seeing this?
#12
General Discussion / HA sync functionality question
August 03, 2017, 09:27:22 PM
Ok, I haven't really looked and definitely haven't dug into the code to find the answer myself but, I'm curious about the actual behind the scenes process of changing a simple firewall rule in a HA pair (changing from the master of course).

Ultimately I'm searching for the reason I see a 30-60 second delay in "applying" a rule change.

In the course of double and triple-checking things, I have a simple pass rule where I click on the green/gray triangle to "enable/disable", and of course it produces the "Apply" button up top. But what I've found is that after I click the green triangle to either enable or disable the rule, the change propagates to the GUI on the HA peer (without ever clicking Apply).

I have monitored the active pf ruleset before/after enable/disable and before/after clicking the Apply button. I need to redo the testing because I am not sure what I saw. I think sometimes I wasn't waiting long enough between checks and a previous click of the Apply button took effect on the active pf rules.

Anyway, what I'm most confused by is that when I click to disable/enable the rule, the change propagates to the web GUI on the peer in a couple of seconds and the local page refresh is complete. When I click Apply, it takes over 30 seconds for the page refresh to finish, but I see all the xmlsync traffic (via tcpdump) occur within a couple of seconds.

If I disable the HA Sync, of course there is no delay in page refresh after Apply.

I've gone over my entire config, compared to other setups and online examples. I'm not new to the opnsense/pfsense HA setup, I've had one running at home and a couple at work since the pfsense v1.x days.

I haven't had time to dig into the web GUI code to figure out what's supposed to happen as far as the xmlrpcsync and filter reload etc. on local and peer. I'm hoping someone knows and has the time and patience to tell me what's supposed to happen.

Thanks!
#13
I don't do any bandwidth usage tracking/control, just on/off per device. With desktops, laptops, phones, xbox/playstation/DS's etc. we wanted them grouped so we could shut them all off, but also be able to add a single rule to allow a different group, say a Nintendo DS and TV.

I use the static arp checkboxes in DHCP reservations to force the mac of the kids devices to only work on a certain IP for each mac. Then I have their IPs grouped in aliases and schedules setup and assigned to firewall rules.

The rule logic seems to need to be to schedule an "allow/pass their traffic" rule above a permanent "block all their traffic" rule at the bottom. Otherwise the clearing of existing connections doesn't work when a schedule kicks a rule in/out.

We also have a separate rule above everything else that just blocks all traffic for each alias. That way we can login and enable/disable that rule to stop internet on all their devices in one shot. Along these same lines, we have "allow" rules we can go enable that will allow internet to all or a subgroup of phone/devices manually while all the auto-scheduled rules have things blocked.
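Here is a toy model of that rule stack in Python, just to show why the order matters. It assumes top-down evaluation where the first matching enabled rule decides, which is how the ordering described here behaves for me. The rule names, IPs, hour-based schedule, and default action are all made up for illustration and are not how opnsense stores or evaluates rules internally.

```python
from dataclasses import dataclass

@dataclass
class Rule:
    name: str
    action: str                  # "pass" or "block"
    members: frozenset           # hypothetical alias: set of device IPs
    enabled: bool = True
    hours: range = range(0, 24)  # hypothetical schedule: hours the rule is active

def decide(rules, src_ip, hour, default="pass"):
    """Walk the list top-down; the first enabled, in-schedule match wins."""
    for rule in rules:
        if rule.enabled and src_ip in rule.members and hour in rule.hours:
            return rule.action, rule.name
    return default, "default"

kids = frozenset({"192.168.1.50", "192.168.1.51"})
rules = [
    Rule("manual kill switch", "block", kids, enabled=False),
    Rule("scheduled allow", "pass", kids, hours=range(16, 20)),
    Rule("permanent block", "block", kids),
]

print(decide(rules, "192.168.1.50", 17))  # ('pass', 'scheduled allow')
print(decide(rules, "192.168.1.50", 22))  # ('block', 'permanent block')
rules[0].enabled = True                   # flip the manual override on
print(decide(rules, "192.168.1.50", 17))  # ('block', 'manual kill switch')
```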

You can lock things down with "only static arp" for the entire interface(actually in the DHCP settings for that interface), but then no hosts will work without having their mac added as a reservation (even if they are setup static). So far kids haven't learned about changing their mac. They were changing IPs effortlessly before I set static ARP up.

Actually I take that "no bandwidth control" statement back, before I moved off pfsense I did do a 64K throttle on phones(that have no cell  service) so they could have messaging app like google hangouts but no "usable internet" as they considered it. I did NOT have a firm grasp on how the throttling worked on pfsense but I got it figured out. On opnsense it's even more confusing to me, although I did manage to get it working for one device, I haven't had much call to expand that throttling setup to other devices.

Basically it all becomes a big list of firewall rules, some enabled, some disabled manually or by schedule at different times, but it does work for complex scenarios with numerous internet connected devices when you dont necessarily want to block all of their devices(even for a single child) at the same time.

What I wonder about is if macvlan type setup might work better, but then I think instead of a very long vertical list of rules for "LAN", I'd have a very wide list of Interfaces tabs in the firewall rules page.

#14
Hello,
This may be dhcpd itself.

Here is a bug fixed in 4.3.5:
- Leases are now scrubbed of certain prior use information when pool
  re-balancing reassigns them from one FO peer to the other.  This
  corrects an issue where leases that were offered but ignored retained
  the client hostname from the original client. Thanks to Pavel Polacek,
  Jan Evangelista Purkyne University for reporting the issue.
  [ISC-Bugs #42008]


https://lists.isc.org/pipermail/dhcp-users/2016-October/020331.html

This Debian Bug report on the issue shows an old stale and incorrect hostname instead of an empty hostname, but it sure seems like it could be related to your issue.
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=810971

Then again, OPNsense 17.1.7 has isc-dhcp43-server-4.3.5, so... it would seem to have that fix, but maybe the first bug fix did not completely fix the issue... or maybe you are not on 17.1.7 with 4.3.5.

Definitely looks like it could be related though.

greg
#15
I do not speak German, but I think I understand your comments from reading the translation by Chrome browser. Hopefully you will be able to translate my reply, or someone could translate it for us :)

I see a similar situation with firewall rule changes taking up to 60 seconds to complete from the moment I hit save/apply. I frequently enable/disable several firewall rules to control  internet access for several users throughout the day. So several times a day I wait up to 60 seconds for a rule to be confirmed as applied. It's every time and not once in a while.

If I switch to non-HA setup(remove the IP/URI for HA sync partner), rule changes are applied within 1 or 2 seconds. I have Quad Core Celeron J1900 physical hardware for each firewall, not VMs.

I have monitored the actual pf ruleset from command line via ssh/shell into the machines while making a change via the web gui. The pf rules are modified quickly (less than 5 seconds as you say), the delay is only in the web response to the browser to indicate the changes are complete.

I believe it has nothing to do with CARP itself, as my translation of your post suggests. The changed settings are sent over via XMLRPC to the HA partner IP as set on the HA Settings page. I am using the IP of the HA Partners SYNC interface. Are you using the "LAN IP" of the partner as the HA Sync IP?

I have not had time to troubleshoot thoroughly but since no one has replied to your post, I wanted to say I see similar behavior... without VMs so I think your problem is not VM related. Although, I do not have any solution for you.