[SOLVED] Remote firewall failed system update fallback approach question

Started by stumper, June 16, 2025, 12:52:23 PM

Scenario - remote small business office with an OPNsense firewall and only a handful of users, none tech savvy. No OOB remote management solution (budget constraints) for console access if remote connectivity is lost for any reason.

Background - I am familiar with snapshots and have used them successfully, manually falling back to a "known_good" snapshot from the UI and CLI on a locally accessible firewall, following the Snapshot documentation.

Goal - at the remote site, manually create a "known_good" snapshot, create a cron job to run in 20 minutes that falls back (e.g. via bectl) to the "known_good" snapshot, perform the system update (24.7.x to current 25.1.x), then log back into the remote firewall and cancel the cron job if all is good. If for some reason I can't get back into the remote system, it will fall back to the "known_good" snapshot.

Question - before I create my custom cron job: is this capability already available natively in OPNsense (I didn't find anything in the docs), or is there a working solution via an available plugin? My searches didn't find a working solution, but a number of posts described the approach I'm planning.
N5105  4GB | 250GB | 2x2.5GbE i226-v

For those who may come across this post, I answered my own question by following the "Recommended Workflow" in the Snapshot documentation, with the following additions:

1. Create a snapshot named "known_good".
2. SSH into the firewall and create a script that activates the "known_good" snapshot and then reboots the firewall.
3. Create an at job to run the script 30 minutes from now: `echo scriptname | at now + 30 minutes`
4. Back in the UI, apply the firmware updates, then let the firewall reboot.
5. Log in and verify that everything works as expected. If it does, SSH into the firewall and cancel the at job.

If you cannot regain access to the remote firewall for any reason, it will fall back to the "known_good" snapshot. After that, you can debug as needed, resolve the issue, and retry the update.
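
For anyone wanting a starting point, the script and at job from steps 2 and 3 could look roughly like this. It is only a sketch, not exactly what I ran: the function names, the path /root/fallback.sh, and the 30 minute delay are my own placeholders, and it assumes the snapshot is a boot environment named "known_good" that `bectl list` shows.

```shell
#!/bin/sh
# Sketch only: paths, the boot environment name, and the delay are assumptions.

# Write a script that activates the given boot environment and reboots.
write_fallback_script() {
    script_path="$1"
    be_name="$2"
    cat > "$script_path" <<EOF
#!/bin/sh
# Roll back to the known-good boot environment, then reboot into it.
bectl activate $be_name && shutdown -r now
EOF
    chmod +x "$script_path"
}

# Queue the script with at(1); at prints the job number needed for atrm later.
schedule_fallback() {
    script_path="$1"
    minutes="$2"
    echo "$script_path" | at now + "$minutes" minutes
}

# Example (run as root on the firewall):
#   write_fallback_script /root/fallback.sh known_good
#   schedule_fallback /root/fallback.sh 30
#   atq          # confirm the job is queued
#   atrm <job>   # cancel after verifying the update
```

Keep the script somewhere that survives a reboot (not /tmp), since the at job has to fire after the updated system comes back up.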

Alternatively:

- you want to update from e.g. 25.1.8 to 25.1.9
- you only have a single snapshot so far named "default"

then

- rename your current snapshot to 25.1.9 (what you want to update to): `bectl rename default 25.1.9`
- create a backup snapshot named like the older version: `bectl create 25.1.8`
- activate the backup snapshot: `bectl activate 25.1.8`
- activate the current and soon to be updated snapshot to boot only once: `bectl activate -t 25.1.9`
- update and reboot

In case that fails

- have someone in the remote location power cycle the system - it will boot into the backup snapshot


This works even if the update breaks the system so badly that the at job cannot run.
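
To avoid typos in the version names, the sequence above can be generated for review before running anything. This is a hedged sketch: `print_update_plan` is a hypothetical helper that only prints the commands, and the names default, 25.1.8 and 25.1.9 are just the example values from this post.

```shell
#!/bin/sh
# Sketch only: prints the bectl sequence described above for review; it runs nothing.
# Arguments: current BE name, version being updated from, version being updated to.
print_update_plan() {
    current="$1"
    old="$2"
    new="$3"
    echo "bectl rename $current $new"   # running BE takes the target version's name
    echo "bectl create $old"            # backup clone named after the old version
    echo "bectl activate $old"          # backup becomes the default boot target
    echo "bectl activate -t $new"       # updated BE boots on the next boot only
}

# Example:
#   print_update_plan default 25.1.8 25.1.9
#   bectl list    # check the Active column before updating
```

Because `-t` only applies to the next boot, the backup stays the default afterwards. Once the updated system has been verified, running `bectl activate 25.1.9` should make it the permanent default again.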
Deciso DEC750
People who think they know everything are a great annoyance to those of us who do. (Isaac Asimov)

@Patrick - thanks for that feedback and alternate approach, very helpful!!

The "activate for just once" flag (-t) is genius!