os-nut: Broken plugin kills UPS power and halt instead of shutdown

Started by hakuna, April 01, 2026, 09:32:24 AM

I have a Debian NUT server running on a 5V/3A client. Everything is working fine, and it is the last one to die during a power outage.
Everybody else, such as Proxmox, TrueNAS and OPNsense, fetches the UPS status from the little guy above, i.e. they run as "slave".

However, the os-nut plugin has broken logic and does not allow overriding it:

1. /usr/local/etc/rc.halt: This is wrong; it must be "shutdown -p now", which is compatible with FreeBSD (to be honest, I never understood halt; it and doing nothing are the same).
2. /etc/killpower: The worst offence. It fully kills the UPS power on exit. Proxmox and TrueNAS will be killed because of it. This should never be here.
3. As the first lines of the file say, you must not edit it; I did, and everything was reverted.

# Please don't modify this file as your changes might be overwritten with
# the next update.
#
MONITOR apc1000@10.19.0.14:3493 1 admin master slave
SHUTDOWNCMD "/usr/local/etc/rc.halt"
POWERDOWNFLAG /etc/killpower


How are you guys managing this??
Ditching os-nut plugin altogether and installing nut via pkg so OPNSense has no interference??

As it stands, I have to let the bare-metal OPNsense box have its power cut by force. It is wrong, but better that than having it destroy my NAS HDDs by killing the UPS power.

Thank you
Sophos SG210 Rev3 -  i7-6700T - 16GB

1. "/usr/local/etc/rc.halt" does call "shutdown -p now":

#!/bin/sh

# shutdown syshook / plugin scripts
/usr/local/etc/rc.syshook stop

/sbin/shutdown -op now

while :; do sleep 1; done

2. The hardwired killpower flag might call for a feature request to make it configurable.

I haven't noticed because here my OPNsense is the master NUT server and all other servers shut down first. I think the firewall killing the Internet connection should go down last.

HTH,
Patrick
Deciso DEC750
People who think they know everything are a great annoyance to those of us who do. (Isaac Asimov)

Here's the secret: OPNsense has an API that can be called programmatically by shell scripts. I have another computer run this shell script when it shuts down on a UPS event, so that it shuts down my OPNsense as well. Avoids NUT nonsense. You might want to edit it a bit, but it works fine for me. As written, though, this script requires OPNsense to have a valid SSL cert:

#!/usr/bin/env bash
set -Eeuo pipefail

OPNSENSE_HOSTS=("https://opnsense.example.com")   # add more for HA
API_KEY="<API KEY HERE>"
API_SECRET="<API SECRET HERE>"
ENDPOINT="/api/core/firmware/poweroff"

CONNECT_TIMEOUT=3
MAX_TIME=10
RETRIES=3
RETRY_DELAY=2

log(){ logger -t ppb-opnsense -- "$*"; echo "[$(date -Is)] $*"; }

call_shutdown() {
  local url="${1%/}${ENDPOINT}"
  local args=(
    --silent --show-error
    --header "Content-Type: application/json"
    --user "${API_KEY}:${API_SECRET}"
    --data '{}'
    --connect-timeout "$CONNECT_TIMEOUT"
    --max-time "$MAX_TIME"
    --write-out "HTTP_CODE=%{http_code}\n"
    --output /dev/null
  )
  # With LE, system trust store is fine; no -k used.

  local out rc code
  for ((i=1;i<=RETRIES;i++)); do
    set +e
    out=$(curl -X POST "${args[@]}" "$url" 2>&1); rc=$?
    set -e
    code=""; [[ "$out" =~ HTTP_CODE=([0-9]{3}) ]] && code="${BASH_REMATCH[1]}"

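    # 2xx means the API accepted the request. "000" means curl got no HTTP
    # response at all; it is treated as success here, presumably because the
    # box may already be going down, but an unreachable host looks the same.
    if [[ "$code" =~ ^2..$ || "$code" == "000" ]]; then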
      log "Accepted by $url (HTTP:${code:-none})."
      return 0
    fi
    log "Attempt $i failed (rc:$rc HTTP:${code:-none}). Out: $out"
    (( i < RETRIES )) && sleep "$RETRY_DELAY"
  done
  return 1
}

main(){
  command -v curl >/dev/null || { log "ERROR: curl not found"; exit 2; }
  local fail=0
  for h in "${OPNSENSE_HOSTS[@]}"; do
    log "Requesting shutdown: $h"
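    # Plain assignment: ((fail++)) returns non-zero when fail is 0, which
    # would abort the loop under `set -e` before the remaining hosts are tried.
    call_shutdown "$h" || { log "ERROR: $h did not acknowledge"; fail=$((fail+1)); }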
  done
  (( fail==0 )) && { log "All shutdown calls issued."; exit 0; } || exit 1
}
main "$@"

Quote from: Patrick M. Hausen on April 01, 2026, 09:55:38 AM
1. "/usr/local/etc/rc.halt" does call "shutdown -p now":

#!/bin/sh

# shutdown syshook / plugin scripts
/usr/local/etc/rc.syshook stop

/sbin/shutdown -op now

while :; do sleep 1; done

2. The hardwired killpower flag might call for a feature request to make it configurable.

I haven't noticed because here my OPNsense is the master NUT server and all other servers shut down first. I think the firewall killing the Internet connection should go down last.

HTH,
Patrick

When I was running OPNsense as master and testing things:

1. It went into halt mode and stayed there.
2. The UPS power was cycled, so it killed everything, including OPNsense itself, which was still halted; it never fully shut down.

Quote from: Stormscape on April 01, 2026, 09:59:09 AM
Avoids NUT nonsense

Patrick, this also answers your "I think the firewall killing the Internet connection should go down last."

By having a NUT server, it gives me full control of everything:

1. Everybody's BIOS is set to power back on when the power is restored.
2. You can run a command from the NUT server to cycle the UPS power.
3. If the battery is back above 80%, for example, cycle the UPS power, and that will bring everybody back online. [1]
4. This is a Dell Wyse 3040 (5V/3A), so it will run forever before going down; it also turns on automatically when the power is restored.
5. The NAS is my only priority; it must be the first one to go down. If OPNsense goes down first or later, so be it.
6. The Proxmox scripts check that the NAS is up and that the NFS shares are active before running backups; otherwise they skip.
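
Point 3 above could be sketched roughly like this on the NUT server. This is only an illustration, not what hakuna actually runs: the UPS name comes from the MONITOR line earlier in the thread, while the localhost address and the NUT_USER / NUT_PASS variables are assumptions. battery.charge and shutdown.return are standard NUT names, but whether a given driver supports the command has to be checked with upscmd -l.

```shell
#!/bin/sh
# Sketch only: UPS address, credentials and threshold are placeholders.
UPS="${UPS:-apc1000@localhost}"   # name from the MONITOR line; host is assumed
THRESHOLD="${THRESHOLD:-80}"      # percent, from the "above 80%" example

# Succeeds when the charge (integer or decimal percent) is >= the threshold.
charge_ok() {
  charge=${1%%.*}               # drop any decimal part
  case "$charge" in
    ''|*[!0-9]*) return 1 ;;    # empty or not a number: treat as "not ready"
  esac
  [ "$charge" -ge "$2" ]
}

# Read battery.charge via upsc and, if it is high enough, ask the UPS to
# switch its outlets off and back on.
recycle_if_charged() {
  charge=$(upsc "$UPS" battery.charge 2>/dev/null) || return 1
  if charge_ok "$charge" "$THRESHOLD"; then
    # NUT_USER / NUT_PASS are hypothetical: an upsd user allowed to run
    # instant commands. Check driver support first with: upscmd -l "$UPS"
    upscmd -u "$NUT_USER" -p "$NUT_PASS" "$UPS" shutdown.return
  fi
}
```

The block only defines the functions; something like a cron job calling recycle_if_charged after mains power returns would do the actual work. Note that cycling the outlets also cuts power to the NUT server itself, which in this setup boots again on its own.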

I did what I did to keep the whole process fully automated.
I am a novice, so don't take me too seriously :)

Quote from: hakuna on April 01, 2026, 10:21:19 AM
1. it went into halt mode and stayed there

Does your device support ACPI power off? Some embedded systems don't.

Quote from: hakuna on April 01, 2026, 10:21:19 AM
2. UPS power was recycled so it killed everything, including OPNSense itself which was in halt mode, it never fully shutdown.

But power cycling when halted does not hurt.

And I already agreed that hard-wiring the UPS shutdown is probably a bad idea - please raise a feature request on GitHub.

Quote from: Patrick M. Hausen on April 01, 2026, 10:37:41 AM
But power cycling when halted does not hurt.

And I already agreed hard wired shutting down the UPS is probably a bad idea - please raise a feature request on Github.

I want to believe that power cycling when halted does not hurt, but I cannot.

Back in the day when all we had were IDE HDDs, halting a computer meant the read/write head would move to the park position, away from the disk.
So when the power was cut, the head would not hit the disk and leave a hole, aka a bad block.
Under those circumstances only, halt had meaning and purpose.

SSD/NVMe is electronic; I cannot trust that cutting the power while halted won't damage it, because it is still in operational mode.
I will die on this hill: halt is not and never will be a shutdown.

I have come across many old posts about this hard-coded UPS power kill switch. A few mention an open request to remove it, but it does not seem to be a priority.

Reading between the lines, there is no solution right now, so:

1. Delete the os-nut plugin.
2. Install nut via pkg and hope OPNsense has no power over it, so the config file won't change on an OPNsense system update.
3. If the above does not work, have my NUT server shut down the OPNsense box; nothing fancy, just invoke its shutdown!
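
Option 3 could look roughly like the sketch below, run from the NUT server. The host name, the SSH user and key-based login without a password prompt are all assumptions, not something confirmed in this thread. It calls shutdown -p now directly; the rc.halt script quoted earlier additionally runs rc.syshook stop first, which this sketch skips.

```shell
#!/bin/sh
# Sketch only: host, user and passwordless SSH access are assumptions.
OPNSENSE_HOST="${OPNSENSE_HOST:-opnsense.example.com}"   # placeholder host
SSH_USER="${SSH_USER:-root}"

# Power off the firewall over SSH. With DRY_RUN=1 the command is only printed.
poweroff_opnsense() {
  remote_cmd="/sbin/shutdown -p now"
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "ssh -o BatchMode=yes -o ConnectTimeout=5 ${SSH_USER}@${OPNSENSE_HOST} ${remote_cmd}"
    return 0
  fi
  # BatchMode makes ssh fail instead of waiting for a password prompt,
  # which matters when this runs unattended during an outage.
  ssh -o BatchMode=yes -o ConnectTimeout=5 "${SSH_USER}@${OPNSENSE_HOST}" "$remote_cmd"
}
```

Running it once with DRY_RUN=1 shows the exact command before it is hooked into whatever the NUT server executes on a low-battery event.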

Quote from: hakuna on April 01, 2026, 09:32:24 AM
2. /etc/killpower: The worst offence. It fully kills the UPS power on exit. Proxmox, TrueNAS will go kill because of it. This should never be here.
3. If you noticed the first lines, you must not edit this file, I did and everything was reverted back.

# Please don't modify this file as your changes might be overwritten with
# the next update.
#
MONITOR apc1000@10.19.0.14:3493 1 admin master slave
SHUTDOWNCMD "/usr/local/etc/rc.halt"
POWERDOWNFLAG /etc/killpower


If I have understood your problem: when your Thin Client shuts down in response to a shutdown condition signalled by your Debian NUT server, something sees /etc/killpower on it and commands the UPS attached to your Debian box to shut down.

Have I interpreted this correctly?

I recently migrated a box, and on the new system I set up UPS monitoring with the UPS connected to the new box. Before migrating data I tested the expected shutdown behaviour of both computers. Whilst they are not OPNsense boxes, they both use NUT.

Looking at the configuration of the old system, which had 'upsmon' reconfigured to monitor NUT across the network on the new system, I see one striking difference between my setup and yours: I'm not using the "admin" account; I'm using a lower-privileged account for 'upsmon' to monitor with. I also configured the other endpoint as 'master' - I'll check this later and test with 'slave'.

MONITOR SUA1000I@192.168.199.254 1 upsmonitor <cryptic password> master

At a suitable time I'll perform tests to see whether /etc/killpower on the client will initiate a shutdown of the remote UPS.

Quote from: hakuna on April 02, 2026, 05:46:32 AM
I wanna believe that power cycling when halted does not hurt, but I cannot

When Unix halts there is no operating system running. Any activity will be from the BIOS of the Thin Client. There was also a time, before computers could be powered off programmatically, when halting the system was perfectly normal. The 'shutdown' command can also be used to lower the system's run state, e.g. back to single-user mode, where filesystems are still mounted and the operating system is still functioning.

Quote from: hakuna on April 02, 2026, 05:46:32 AM
Back in the day when all we had was IDE HDD, halting a computer meant its reading/writing head would move to park position away from the disk.

I believe you are referring to parking the HDDs with a park command as the computer shuts down. This goes back to RLL and MFM HDDs. As hard disk drives evolved, IDE drives (and others too) were designed to retract the heads to the landing zone on power failure.

I've moved my OPNsense firewall over to my APC SUA1000 UPS.

On OPNsense, NUT has been configured to connect to a NUT master on the network.

Here are the contents of upsmon.conf on the OPNsense system:

# Please don't modify this file as your changes might be overwritten with
# the next update.
#
MONITOR SUA1000I@192.168.199.254:3493 1 upsmonitor <cryptic password> slave
SHUTDOWNCMD "/usr/local/etc/rc.halt"
POWERDOWNFLAG /etc/killpower



The OpenBSD box monitors the UPS via the serial port; perhaps this differs from your configuration and you are using USB.

On the OPNsense system, issuing upsmon -c fsd results in OPNsense shutting down gracefully and powering off.

On the OpenBSD system, issuing upsmon -c fsd results in OPNsense shutting down gracefully and powering off, followed by the OpenBSD system doing the same.

Approximately 90 seconds later the UPS goes into sleep mode. As the mains power was still on, it awoke some 180 seconds later, and the devices powered on and booted normally.

I even created /etc/killpower on the OPNsense system. This file remained after boot-up, so I don't think anything running on the OPNsense system acts upon this file; it should have been removed if there were.
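
For context on that test: as far as I understand the NUT documentation, the POWERDOWNFLAG file is written by upsmon itself when it runs as master, and a late shutdown script is expected to test it with upsmon -K, which only succeeds if the file exists and contains upsmon's own marker string. A hand-made /etc/killpower would therefore not count. The sketch below shows that kind of check; whether OPNsense's shutdown path runs anything like it is not established in this thread. The UPSMON and UPSDRVCTL variables are only there so the logic can be tried without NUT installed.

```shell
#!/bin/sh
# Sketch of the check a late shutdown script typically performs on a NUT master.
# Command names are overridable; the defaults are the real NUT tools.
UPSMON="${UPSMON:-upsmon}"
UPSDRVCTL="${UPSDRVCTL:-upsdrvctl}"

kill_ups_if_flagged() {
  # upsmon -K exits 0 only when the POWERDOWNFLAG file exists and was
  # written by upsmon; any other condition gives a non-zero exit.
  if "$UPSMON" -K >/dev/null 2>&1; then
    echo "killpower flag set: telling the UPS to cut the load"
    "$UPSDRVCTL" shutdown
  else
    echo "no valid killpower flag: leaving the UPS alone"
  fi
}
```

On a box that is only a slave and has no UPS driver configured, the second branch is the one to expect.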

You listed an APC1000 earlier. Perhaps it is similar to mine. Have a look at my UPS settings as I suspect your problem is going to be related to how you have configured your UPS. With my set-up if a device were to simply halt and remain powered on, it would lose power when the UPS goes to sleep.
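
One way to compare UPS settings is to look at the delay variables the driver reports. The helper below is a small sketch that filters upsc output for ups.delay.shutdown and ups.delay.start, which are standard NUT variable names. Not every driver exposes them, and it is only my assumption that they correspond to the 90 and 180 second intervals observed above.

```shell
#!/bin/sh
# Print the UPS shutdown and restart delays from `upsc` output read on stdin.
# upsc prints one "name: value" pair per line.
show_delays() {
  awk -F': ' '
    $1 == "ups.delay.shutdown" { print "load off after: " $2 " s" }
    $1 == "ups.delay.start"    { print "load back on after: " $2 " s" }
  '
}
```

Usage would be along the lines of upsc SUA1000I@192.168.199.254 | show_delays, with the UPS name and address replaced by your own.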

I once had a Belkin F6C800 UPS. The only way I could get it to power off when commanded by NUT was to keep the system running so the UPS could then power itself off. Here is the section I used in rc.shutdown to deal with it. Note that the system never shut down, but the file systems were re-mounted read-only.
if [ -f /etc/killpower ]; then
       echo "Shutting down UPS Driver...."
       /usr/local/sbin/upsdrvctl stop >/dev/null 2>&1
       sync; sleep 5; sync
       echo "Mounting filesystems Read-Only...."
       /sbin/mount -A -t ffs -f -u -r
       echo "Waiting for UPS to turn off...."
       /usr/local/bin/belkinunv -a F6C800 -k
       echo "Ooops....didn't expect to get here!"
       reboot
fi

Quote from: hakuna on April 02, 2026, 05:46:32 AM
SSD/NVMe is electronic, I cannot trust cutting the power while in halt, won't damage it coz it is still in operational mode.

The issue with SSDs is that they have become a mess over time:

- First there were "real SSDs", as I call them: both caching RAM and Power Loss Prevention capacitors that would guard that RAM and the data on the SSD.
A simple example of such an SSD: the Intel 320 Series, and everything that was workstation/enterprise level at the time.
Later on there was the Crucial M500 DC, but that was different from the next group =>

- Then there were suddenly SSDs where this became a 50/50 deal: only the RAM and the index of the NAND are guarded. So not the actual data!!
A simple example of such an SSD: the Crucial M500.

- But then everything basically went to hell:
Samsung started selling "Pro SSDs" which were not Pro at all...
They do have caching RAM, but its protection is... NONE.
A so-called write-back procedure is used during the next boot and does some checks, and that's it: maybe it goes well, maybe it doesn't!!

The worst thing about this is that many other brands followed, and even though the market got flooded with them, somehow no one cared?! :(

- The next step was suddenly selling SSDs without any cache at all!!
They are the cheapest and they work, but really: what are we doing here??
How did we allow things to get this far?!

And this was just the SATA era of SSDs...

All of this was applied immediately when NVMe SSDs went on sale, so now your super-fast travelling data has even more chance of getting corrupted because of that speed!!

YAY ?!?!



/End of rant.

Quote
I will die in this hill, halt is not and will never be a shutdown.

I agree with you that the whole thing should be automatic, hassle-free and, most of all, clean :)

Quote from: lmoore on April 02, 2026, 07:53:53 AM
During the evolution of Hard Disk Drives, upon power failure, the IDE drives (and others too) were designed to retract the heads to the landing zone.

But AFAIK the SATA controller needs to give them that signal, and that signal in turn comes from the operating system, so it's one big chain that needs to do its work correctly.
Weird guy who likes everything Linux and *BSD on PC/Laptop/Tablet/Mobile and funny little ARM based boards :)