This seems to be a pretty common topic, but I haven't found anything definitive. I have a DHCP address on my WAN. I have seen multiple workarounds involving spoofing MACs, using non-routable IPs on the WAN interface for CARP, and others. It seems to me that simply doing an ifdown on the WAN interface of the backup firewall is fine for my use case.
The big question is, where should I create my notify logic? Can I do it directly in /usr/local/etc/devd/carp.conf or will that get overwritten with updates? Can I create another file /usr/local/etc/devd/mycustomtweaks.conf that will be safe from updates?
WAN with DHCP and CARP is no fun.
I usually let a modem do the dial-in and put OPNsense behind it with static IPs.
I guess the plan is to have stateful failover on DHCP WAN?
Please update the thread if you find any good solutions as I would like to have the same.
Currently I just keep my WAN interfaces without CARP so when a failover occurs it drops all external sessions but at least I still have Internet access.
Quote from: sorano on January 19, 2021, 05:06:05 PM
I guess the plan is to have stateful failover on DHCP WAN?
Please update the thread if you find any good solutions as I would like to have the same.
Currently I just keep my WAN interfaces without CARP so when a failover occurs it drops all external sessions but at least I still have Internet access.
The plan is if the firewall is BACKUP then 'ifdown vtnet0' which is my WAN interface. If the firewall is MASTER then 'ifup vtnet0'. I don't expect this to be stateful nor do I plan to have CARP VIPs on the WAN interface. I simply want to use the CARP state to trigger an interface change.
It actually sounds like you are doing what I am after. How are you achieving that? For instance, just in basic testing on my BACKUP, if I run 'ifconfig vtnet0 down' all interfaces go down and 'ifconfig vtnet0 up' brings all interfaces up. It's bizarre.
Quote from: bubbagump on January 19, 2021, 11:01:26 PM
It actually sounds like you are doing what I am after. How are you achieving that? For instance, just in basic testing on my BACKUP, if I run 'ifconfig vtnet0 down' all interfaces go down and 'ifconfig vtnet0 up' brings all interfaces up. It's bizarre.
I run CARP on all interfaces except for WAN. The WAN interface on each firewall is just configured like "normal" with DHCP.
So the gateway for clients is the CARP LAN IP, and outbound traffic goes out via the WAN of the current CARP master.
Quote from: sorano on January 26, 2021, 03:21:15 PM
Quote from: bubbagump on January 19, 2021, 11:01:26 PM
It actually sounds like you are doing what I am after. How are you achieving that? For instance, just in basic testing on my BACKUP, if I run 'ifconfig vtnet0 down' all interfaces go down and 'ifconfig vtnet0 up' brings all interfaces up. It's bizarre.
I run CARP on all interfaces except for WAN. The WAN interface on each firewall is just configured like "normal" with DHCP.
So the gateway for clients is the CARP LAN IP, and outbound traffic goes out via the WAN of the current CARP master.
I just set up a second OPNsense firewall in my VMware 7 environment. When I have the WAN interface active on the secondary firewall with the same DHCP lease as my primary firewall, I experience packet loss across the WAN interface.
I do not have CARP on my WAN interface. It's configured like "normal" as you described with DHCP.
What do you mean by "the same DHCP lease as my primary firewall"?
Obviously you cannot have the same public IP on two different hosts else you are going to have a bad time.
If you have both firewalls on DHCP, I assume only one of them gets the lease?
Assuming that is so, the second one probably has no internet access, so how do you update it and things like that?
Quote from: bimbar on October 08, 2021, 10:43:45 AM
If you have both firewalls on DHCP, I assume only one of them gets the lease?
Assuming that is so, the second one probably has no internet access, so how do you update it and things like that?
That depends on your ISP. Where I live, most ISPs provide more than one IP.
EDIT2: I misunderstood the use of pre-empt. As I now read it, pre-empt will address keeping all interfaces in a consistent state. More testing!
EDIT: I have done some additional digging and found that a script placed in /usr/local/etc/rc.syshook.d/carp/ will be called when a CARP event occurs. I have played around with this and now have something that works when all 3 CARP interfaces on the primary go down, i.e. a power failure; however, if a problem affects, say, only the WAN interface, then the LAN interface still points to the primary. More reading and testing needed :) lbe 11/02/2021
Has anyone found a hack that facilitates the OP request? Like the OP, I am fine with losing state. I would like to use the HA to keep everything else synced and just have a poor boy solution that will bring up the WAN interface (vtnet1) configured with an LAA MAC shared between the two firewalls in DHCP mode and then taking the WAN interface down when the primary is back in service.
I'm still too new to OPNsense (and HardenedBSD) to know how to implement the event detection and action. I do have many years of experience in Linux and other Unices and am glad to take a shot at writing the control scripts if someone knows what hooks/APIs to use.
Thanks!
lbe
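A minimal sketch of such a hook in plain shell, for anyone who only needs the up/down toggle. It assumes what the PHP scripts in this thread check for: the hook receives the event source as the first argument (containing an '@', e.g. 1@vtnet0) and the new state, MASTER or BACKUP, as the second. The function names, the WAN_IF and DRY_RUN variables and the default device are made up for illustration, and this is not tested on a production pair.

```shell
#!/bin/sh
# Sketch of a CARP hook script (assumed location: the carp syshook directory).
# Assumed arguments, mirroring the PHP scripts in this thread:
#   $1 = event source, e.g. "1@vtnet0" (must contain '@')
#   $2 = new CARP state, MASTER or BACKUP

WAN_IF="${WAN_IF:-vtnet1}"   # WAN device name; placeholder, adjust to your setup
DRY_RUN="${DRY_RUN:-1}"      # 1 = only print the command, 0 = really run it

# Map a CARP state to the ifconfig action; fail for anything else.
carp_action() {
    case "$1" in
        MASTER) echo up ;;
        BACKUP) echo down ;;
        *)      return 1 ;;
    esac
}

handle_carp_event() {
    subsystem="$1"
    state="$2"
    # Ignore events whose source does not look like a CARP vhid.
    case "$subsystem" in
        *@*) ;;
        *)   echo "ignoring event from '$subsystem'" >&2; return 1 ;;
    esac
    action=$(carp_action "$state") || {
        echo "unknown CARP state '$state'" >&2
        return 1
    }
    if [ "$DRY_RUN" = "1" ]; then
        echo "/sbin/ifconfig $WAN_IF $action"
    else
        /sbin/ifconfig "$WAN_IF" "$action"
    fi
}

# Only act when called with arguments, as the hook would be.
if [ "$#" -ge 2 ]; then
    handle_carp_event "$1" "$2"
fi
```

With the default DRY_RUN=1 the script only prints the ifconfig command it would run, so it can be tried by hand before letting it touch the interface.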
I have made a WIP script for WAN with single DHCP lease (only LAN setup as CARP).
I didn't switch to production with it yet, but testers with feedback are welcome.
At least some synthetic test cases did work as expected. A forced switch with Maintenance Mode is almost immediate, with no ping lost. The only thing that took a couple of seconds was when I shut down the master. There the switch takes a bit longer, but that is acceptable for me.
https://gist.github.com/spali/2da4f23e488219504b2ada12ac59a7dc
Oh how funny! I've been working on this independently over the past few days and didn't check this thread to see you've solved it a couple of days ago!
We've effectively arrived at the same method to achieve this. Except your calls, Spali, are probably much better since you are using the config system's normal calls (which I'm not familiar with). I'm instead smashing in console commands via exec, equivalent to using a hammer. (Unsanitized code execution risks here!)
Anyway, I also disable DHCPD on the passive/backup device (so I don't have two DHCP servers on my LAN) and make a call to the dhcp client to request a new lease on the WAN interface. I think we could also enumerate "wan*" interfaces to facilitate environments with multi-wan.
For reference, I create the following as this file on both devices: /usr/local/etc/rc.syshook.d/carp/50-DHCP
and then "chmod +x 50-DHCP"
#!/usr/local/bin/php
<?php
require_once("config.inc");
require_once("interfaces.inc");
require_once("util.inc");

$subsystem = !empty($argv[1]) ? $argv[1] : '';
$type = !empty($argv[2]) ? $argv[2] : '';

if ($type != 'MASTER' && $type != 'BACKUP') {
    log_error("Carp '$type' event unknown from source '{$subsystem}'");
    exit(1);
}
if (!strstr($subsystem, '@')) {
    log_error("Carp '$type' event triggered from wrong source '{$subsystem}'");
    exit(1);
}

foreach ($config['interfaces'] as $ifkey => $interface) {
    // could change this to match on wan* interfaces for multi-wan setups, maybe?
    if ($ifkey == 'wan') {
        if ($type == 'BACKUP') {
            log_error("Carp Status is now Backup!");
            log_error("Shutting interface: {$interface['if']}");
            shell_exec("/sbin/ifconfig {$interface['if']} down");
            log_error("Stopping DHCPD");
            shell_exec('pluginctl -s dhcpd stop');
        } else if ($type == 'MASTER') {
            log_error("Carp Status is now Master!");
            log_error("Starting interface: {$interface['if']}");
            shell_exec("/sbin/ifconfig {$interface['if']} up");
            log_error("Restarting DHCPD");
            shell_exec('pluginctl -s dhcpd restart');
            shell_exec("dhclient {$interface['if']}");
        }
    }
}
?>
For future reference in case Spali's github post ever disappears, they are doing the following instead of my foreach statement:
$ifkey = 'wan';
if ($type === "MASTER") {
    log_error("enable interface '$ifkey' due CARP event '$type'");
    $config['interfaces'][$ifkey]['enable'] = '1';
    write_config("enable interface '$ifkey' due CARP event '$type'", false);
    interface_configure(false, $ifkey, false, false);
} else {
    log_error("disable interface '$ifkey' due CARP event '$type'");
    unset($config['interfaces'][$ifkey]['enable']);
    write_config("disable interface '$ifkey' due CARP event '$type'", false);
    interface_configure(false, $ifkey, false, false);
}
Edit: For those who may be looking for a DIY on how to enable this, I have a small write-up on the opnsense subreddit, here: https://old.reddit.com/r/opnsense/comments/runb4r/diy_ha_activepassive_for_home_internet/
I am trying to do this on Dual WAN using Spali's script and the primary kicks but the secondary WAN just sits there.
bitcore's solution works, though I don't know if we need to kill the DHCP server on the backup. If it all works correctly, DHCP should fail over to the backup when the primary fails. If you sync all leases, the backup should take over as DHCP server.
If anyone sees this before I figure it out.. how can I tweak Spali's script to kick both WAN interfaces when there is a failure?
Quote from: DocGonzo74 on January 16, 2022, 04:37:28 PM
I am trying to do this on Dual WAN using Spali's script and the primary kicks but the secondary WAN just sits there.
bitcore's solution works, though I don't know if we need to kill the DHCP server on the backup. If it all works correctly, DHCP should fail over to the backup when the primary fails. If you sync all leases, the backup should take over as DHCP server.
If anyone sees this before I figure it out.. how can I tweak Spali's script to kick both WAN interfaces when there is a failure?
Regarding DHCP: I haven't tested it yet, but according to the docs, with DHCP synced and failover defined, I assume this should work on the LAN side.
I also replied in the gist to the other question.
But here too:
Assuming you just want to disable both WAN interfaces on the backup and enable both on the master, you can simply duplicate the script with any suffix in the filename and adjust the $ifkey variable for the second WAN interface.
A slightly cleaner solution would be to adapt the script so that $ifkey can be defined as an array, which the script then loops over.
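The looping idea can be sketched as follows. Note the assumptions: this is plain shell using ifconfig on device names (the "hammer" approach), not Spali's PHP with config keys, so it does not trigger OPNsense's own reconfiguration. The device names igb4 and igb5, the WAN_IFS and DRY_RUN variables and the function names are placeholders.

```shell
#!/bin/sh
# Sketch only: toggles several WAN devices with ifconfig on a CARP event.
# WAN_IFS holds device names (placeholders here), not config.xml keys.
WAN_IFS="${WAN_IFS:-igb4 igb5}"
DRY_RUN="${DRY_RUN:-1}"          # 1 = print commands instead of running them

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# $1 = new CARP state (MASTER or BACKUP)
toggle_wans() {
    case "$1" in
        MASTER) action=up ;;
        BACKUP) action=down ;;
        *)      echo "unknown CARP state '$1'" >&2; return 1 ;;
    esac
    # Apply the same action to every listed WAN device.
    for wan_if in $WAN_IFS; do
        run /sbin/ifconfig "$wan_if" "$action"
    done
}

# The hook is assumed to be called as: <script> <source> <state>
if [ "$#" -ge 2 ]; then
    toggle_wans "$2"
fi
```

On MASTER a DHCP WAN will probably also need a fresh lease; bitcore's script calls dhclient for that, and Spali's version leaves it to interface_configure.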
Quote from: bitcore on January 03, 2022, 01:34:36 AM
We've effectively arrived at the same method to achieve this. Except your calls, Spali, are probably much better since you are using the config system's normal calls (which I'm not familiar with). I'm instead smashing in console commands via exec, equivalent to using a hammer. (Unsanitized code execution risks here!)
If it works for you, then you've done a good job ;)
I started the same way as you, but also had the problem with the WAN lease not working, etc. So instead of manually issuing a renew, I decided to disable the WAN interface through the configuration (the same as unticking "enabled" in the interface GUI), so that OPNsense reconfigures everything as it would when the interface is disabled or enabled manually. That also takes care of getting the DHCP lease on enable and keeps everything else up to date. I just thought it would be less error-prone, but I don't like that it probably does a lot more than required.
I think your version works for what it was made for, but needs adapting for other use cases. Mine does more, but probably too much.
So people can choose what they want and that is good as it is :D
I'm 99% of the way there.. I can get the backup WAN interface to come up (Still trying to figure out how to get both WAN interfaces up) but they aren't passing traffic. I'm trying to figure out how to mod the script to down the WAN interface and up it properly when there is a failover. As it stands, I have 2 scripts set to execute but only the first one is working. A couple questions:
The ifkey: I assume that "wan" is just a placeholder. I've changed that variable to the interface (igb4) and it appears to be working. The second script has igb5 and is not working. Any ideas?
Thanks again. you guys solved a problem that has been vexing me for a while.. now if I could just get both WAN interfaces working.
Not sure why igb4 is working at all.
$ifkey is the interface key in config.xml, i.e. the lowercase internal interface name: lan, wan, opt1, opt2, etc.
Don't confuse it with the name you gave the interface or with the device name (igb4). You can see the keys under "Interfaces" -> "Overview", in brackets behind the interface name (the first entry before the comma).
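To see the keys without the GUI, something like the following could list each key next to its device name. It is a rough sketch that assumes the config lives at /conf/config.xml, is pretty-printed with one tag per line, and that the first standalone interfaces section is the one that matters. It is a plain text scan, not a real XML parser, and the function name is made up.

```shell
#!/bin/sh
# Print "<key> <device>" for every entry under <interfaces> in an OPNsense
# config read from stdin. Assumes one tag per line (text scan, no XML parser).
list_interface_keys() {
    awk '
        # Only the first standalone <interfaces> section is scanned.
        /^[ \t]*<interfaces>[ \t]*$/ { inside = 1; depth = 0; next }
        /^[ \t]*<\/interfaces>[ \t]*$/ {
            if (inside) exit
            next
        }
        !inside { next }
        # A closing tag on its own line ends the current (sub)section.
        /^[ \t]*<\/[^>]+>[ \t]*$/ { depth--; next }
        # An opening tag on its own line starts a (sub)section; at depth 0
        # its name is the interface key (wan, lan, opt1, ...).
        /^[ \t]*<[^\/>]+>[ \t]*$/ {
            if (depth == 0) {
                key = $0
                gsub(/[ \t<>]/, "", key)
            }
            depth++
            next
        }
        # The device name sits in <if>...</if> directly below the key.
        depth == 1 && /<if>.*<\/if>/ {
            dev = $0
            sub(/.*<if>/, "", dev)
            sub(/<\/if>.*/, "", dev)
            print key, dev
        }
    '
}

# Usage on the firewall (path assumed): list_interface_keys < /conf/config.xml
```

The first column is what goes into $ifkey; the second column is the device name that ifconfig expects.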
Awesome.. I couldn't figure out how to get that to work.
Another question.. your git has "install on backup router".. I would assume that I have to down the WAN on the primary router as well, no?
Thank you again!
Not sure if I got your question.
But you need the script on both routers.
But during setup I recommend to disable the WAN(s) on the BACKUP router manually to not have both enabled at the same time. On the MASTER you could leave the interface enabled.
Spali, thanks for the assistance... I have the failover workingish.. when my backup comes up, the interfaces come up and the system runs the newwanip script for both, but I don't get an IP address or an active gateway.
I am using a managed switch and have dhcp snooping off, the ISP modem and both interfaces in the same VLAN (L2 only, unrouted), and I'm spoofing the MAC from my primary to my backup. Still going through some ideas on what is happening.. Wondering if I should be spoofing the MAC address on my backup somewhere other than in the GUI. I'm currently just spoofing the primary router's WAN MACs on the secondary router.
Regarding the MAC, maybe you need to sniff the DHCP traffic to find out what's wrong (probably MAC spoofing not working properly?). In my case I have two virtual machines, so I spoof the MAC on the virtual network card. I have it entered in the GUI too, but maybe that alone doesn't really work? If your routers are virtual, then don't forget to enable promiscuous mode.
Do you use my version of the script with write_config and interface_configure, or a custom one?
I'm asking because I had a similar problem when I started out with a script that just starts and stops the interface. The version that uses OPNsense's configuration interface kicks off a lot of reconfiguration tasks that may help.
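For the sniffing part, a capture along these lines should show which source MAC the DHCP requests leave with and whether anything answers. A small sketch; the device name, the DRY_RUN variable and the function name are placeholders, and it needs to run as root on the firewall.

```shell
#!/bin/sh
# Sketch: watch DHCP traffic on a WAN device to check which source MAC is used.
DRY_RUN="${DRY_RUN:-1}"   # 1 = print the command, 0 = run it

# $1 = WAN device name
sniff_dhcp() {
    wan_if="$1"
    # -n: no name resolution, -e: show Ethernet (MAC) headers,
    # filter: DHCP uses UDP ports 67 (server) and 68 (client).
    set -- tcpdump -n -e -i "$wan_if" udp port 67 or udp port 68
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}
```

Set DRY_RUN=0 and call sniff_dhcp with the WAN device while forcing a failover; requests carrying the wrong MAC, or requests with no reply, would point at the spoofing or at the switch respectively.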
I am definitely game to try something new. I'm using your script from the git linked in this thread.
I'm playing with some settings on my managed switch to see if that's the culprit. I'm going to stick a cheap netgear switch on the primary Lan to rule out anything blocking traffic (STP is disabled, DHCP snooping off, and I've disabled the mac-move limitation on this switch).
I actually see my interfaces on the backup come up just fine.. they appear to get the same IP address that the WAN Primary had, though I'm not sure if it's because the dhcp lease was sync'd from the primary or if it's requesting a refresh. Either way, my gateways do not come up (I've tried with monitoring on and disabled.. same end result).
I am onto something here. I noticed that my gateways weren't working properly (I have 2 gateways configured and 2 gateway groups). To rule out gateway configuration, I deleted all of them and tried again (with a single wan for now) and boom.. missed one ping and back up.
Also finding that the gateway configuration is quite sticky.. not sure where it's hooked but I can't get rid of it. I had a gateway called "Verizon_WAN_DHCP" and noticed that the Verizon interface was coming up with a new GW "Verizon_WAN_GW".. that second GW isn't configured in the GUI. I checked the config file and all references to it are gone, but when I fail over, it pops back up. Very odd I think.
I also noticed that my switch (Juniper EX2200) was learning the MAC on the primary port, but when I switched over, I see that the MAC is still tied to the primary port. I set up both ports going to my router as no-mac-learning.. and that seems to have bypassed anything the switch is causing. Now the transaction is between the ISP device and my router, leaving DHCP snooping and other security features (for this vlan anyway) on the nightstand.
It's all working. The gateway is still wonky on the primary (I can't seem to delete my old gateway, but a new one pops up and works great). I had to disable dhcp snooping on my WAN VLANs on my managed switch. Disabling snooping didn't work alone, though. I had to disable mac-learning as well. I lose 1 ping and all is well. My secondary WAN (Spectrum) comes up quite slowly but that's OK. They suck.
Great, nice to hear a success ;D
Thanks Spali for being awesome and helping me a bit. You are awesome.
I am a Google Fiber subscriber. My environment is simple with an active/passive firewall - a KVM VM with hardware passthrough of a quad port NIC, and physical hardware firewall with some intel NICs. I have a single WAN, and a single LAN interface running CARP. The VPNs I use continue to function after failover. Stateful protocols such as ipsec or openvpn will drop and need to re-negotiate, but can reconnect immediately. Wireguard has no such issue.
Spali's github post is very useful: https://gist.github.com/spali/2da4f23e488219504b2ada12ac59a7dc
I have updated my personal script to the following, which is a mash of theirs and mine, which I posted in Reddit some time ago: https://www.reddit.com/r/opnsense/comments/runb4r/diy_ha_activepassive_for_home_internet/
#!/usr/local/bin/php
<?php
require_once("config.inc");
require_once("interfaces.inc");
require_once("util.inc");

// argv[1] = event source (must contain '@'), argv[2] = new CARP state
$subsystem = !empty($argv[1]) ? $argv[1] : '';
$type = !empty($argv[2]) ? $argv[2] : '';

if ($type != 'MASTER' && $type != 'BACKUP') {
    log_error("Carp '$type' event unknown from source '{$subsystem}'");
    exit(1);
}
if (!strstr($subsystem, '@')) {
    log_error("Carp '$type' event triggered from wrong source '{$subsystem}'");
    exit(1);
}

foreach ($config['interfaces'] as $ifkey => $interface) {
    // 'opt3' is the config key of my WAN interface; change it to match yours
    if ($ifkey == 'opt3') {
        if ($type == 'MASTER') {
            log_msg("Carp Status is now Master!");
            log_msg("Enabling interface: $ifkey - {$interface['if']}");
            // bring the device up, then enable it in the config and reconfigure
            shell_exec("/sbin/ifconfig {$interface['if']} up");
            $config['interfaces'][$ifkey]['enable'] = '1';
            write_config("enable interface '$ifkey' due CARP event '$type'", false);
            interface_configure(false, $ifkey, false, false);
            sleep(1);
            log_msg("Restarting DHCPD");
            shell_exec('pluginctl -s dhcpd restart');
            sleep(1);
            log_msg("Issuing dhclient command to request a DHCP lease");
            shell_exec("dhclient {$interface['if']}");
        } else if ($type == 'BACKUP') {
            log_msg("Carp Status is now Backup!");
            log_msg("Disabling interface: $ifkey - {$interface['if']}");
            // force the device down, then disable it in the config
            shell_exec("/sbin/ifconfig {$interface['if']} down");
            unset($config['interfaces'][$ifkey]['enable']);
            write_config("disable interface '$ifkey' due CARP event '$type'", false);
            interface_configure(false, $ifkey, false, false);
            log_msg("Stopping DHCPD");
            shell_exec('pluginctl -s dhcpd stop');
        }
    }
}
?>
(the forum is breaking the greater than and less than in the PHP brackets at the start and end, correct them yourself)
- This version will also manually "down" interfaces, as disabling them does not appear to fully "shut" the interface in my environment. This can cause mac flapping, and all of the issues related to that condition.
- My version also stops the DHCP Daemon, which ensures that I only have one DHCP server running on my LAN. I need the backup device to actually become "passive". Calling dhclient may not be necessary with the interface_configure call, but it's a holdover from when I previously only used shell_exec("/sbin/ifconfig {$interface['if']} down"); to up/down the interfaces, instead of enabling/disabling the interfaces.
- I use log_msg instead of log_error so that these events show up in the general system log as a "notice".
I do recommend creating a gateway with "Upstream Gateway" checked and a higher metric than the normal WAN gateway, as per spali's github comments to allow the backup to reach the internet via the LAN.
I also recommend disabling the "Backup" router's WAN interface - so that your secondary device will boot up with the WAN in disabled state, and the CARP script will re-enable the interface if CARP goes master. This prevents the devices from both booting up and each having active WAN interfaces.
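To check whether a disabled interface is really administratively down (the MAC flapping concern above), the flags line of ifconfig can be inspected. A small sketch: the function name is made up, it expects the FreeBSD ifconfig output format, and it reads from stdin so it can also be tried on saved output.

```shell
#!/bin/sh
# Sketch: succeed if the ifconfig output on stdin shows the UP flag.
# Expected format, e.g.:
#   vtnet0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
iface_is_up() {
    # Take the flag list between < and > on the first flags= line and
    # look for UP as a whole flag, so longer flag names do not match.
    sed -n 's/^[^:]*: flags=[0-9a-fA-F]*<\([^>]*\)>.*/\1/p' \
        | head -n 1 | tr ',' '\n' | grep -qx 'UP'
}

# Usage on the firewall (device name is a placeholder):
#   /sbin/ifconfig igb4 | iface_is_up && echo "igb4 is still up"
```

Running the usage line on the backup after a failover shows whether the extra ifconfig down is needed in a given environment.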
Quote from: bitcore on October 22, 2024, 04:53:25 AM
I am a Google Fiber subscriber. My environment is simple with an active/passive firewall - a KVM VM with hardware passthrough of a quad port NIC, and physical hardware firewall with some intel NICs. I have a single WAN, and a single LAN interface running CARP. The VPNs I use continue to function after failover. Stateful protocols such as ipsec or openvpn will drop and need to re-negotiate, but can reconnect immediately. Wireguard has no such issue.
Saw your reddit post and your most recent post on github. Thank you to you and Spali on figuring this out.
I'm on 24.7.6. Does the script no longer work, or is the one you posted here working with .6?
I haven't set this up yet, but I have been looking into doing this for a while.
My setup:
- opnsense main: 192.168.29.1
- opnsense backup: 192.168.29.100
- pfsync/halink between the two: 10.0.0.1 and 10.0.0.2
- GPON ATT is on VLAN 842 (to bypass the need for the ATT Fiber gateway)
What should my CARP virtual IPs be for WAN and LAN?
Should I keep the backup as a fully clean OPNsense install and add things like VLAN 842 for the GPON, or restore a Proxmox backup so it's all the same and just change the CARP settings and such?