os-nut: Broken plugin kills UPS power and halts instead of shutting down

Started by hakuna, Today at 09:32:24 AM

I have a Debian NUT server running on a small 5V/3A client. Everything is working fine, and it is the last machine to die during a power outage.
Everything else (Proxmox, TrueNAS and OPNsense) fetches the UPS status from that little box as a "slave".
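
As an illustration of how a client can read the state from the NUT server, here is a minimal sketch using upsc. The UPS name and address are the ones from the upsmon.conf quoted in this post; the function names classify_status and report_ups are made up for the example.

```shell
#!/bin/sh
# Sketch only: read ups.status from the NUT server and classify it.
# Address taken from the upsmon.conf in this post; adjust for your network.
UPS="apc1000@10.19.0.14:3493"

# classify_status STATUS
# Prints "low-battery", "on-battery", "online" or "unknown".
# NUT reports space-separated flags such as "OL CHRG" or "OB LB".
classify_status() {
  case " $1 " in
    *" LB "*) echo "low-battery" ;;
    *" OB "*) echo "on-battery" ;;
    *" OL "*) echo "online" ;;
    *)        echo "unknown" ;;
  esac
}

# report_ups: query the server (requires the upsc client) and print the result.
report_ups() {
  status=$(upsc "$UPS" ups.status 2>/dev/null) || status=""
  echo "UPS state: $(classify_status "$status")"
}
```

Calling report_ups on any client with the NUT client tools installed should show whether the server is reachable before upsmon gets involved.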

However, the os-nut plugin has broken logic and does not allow overriding it:

1. /usr/local/etc/rc.halt: This is wrong; it should be "shutdown -p now", which is what FreeBSD expects. (To be honest, I never understood halt; as far as I can tell it is the same as doing nothing.)
2. /etc/killpower: The worst offence. It cuts the UPS output completely on exit, which takes Proxmox and TrueNAS down with it. This should never be here.
3. As the first lines of the file say, you must not edit it. I did, and everything was reverted.

# Please don't modify this file as your changes might be overwritten with
# the next update.
#
MONITOR apc1000@10.19.0.14:3493 1 admin master slave
SHUTDOWNCMD "/usr/local/etc/rc.halt"
POWERDOWNFLAG /etc/killpower


How are you all managing this?
Ditching the os-nut plugin altogether and installing NUT via pkg so OPNsense does not interfere?

As it stands, I have to let the bare-metal OPNsense box lose power by force. That is wrong, but better than it destroying my NAS HDDs by killing the UPS power.

Thank you
Sophos SG210 Rev3 -  i7-6700T - 16GB

1. "/usr/local/etc/rc.halt" does call "shutdown -p now":

#!/bin/sh

# shutdown syshook / plugin scripts
/usr/local/etc/rc.syshook stop

/sbin/shutdown -op now

while :; do sleep 1; done

2. The hardwired killpower flag might call for a feature request to make it configurable.
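
For reference, this is roughly how the POWERDOWNFLAG mechanism is wired up in stock NUT, as a sketch only. I have not checked how os-nut implements it, so do not read this as the plugin's code. The function name killpower_if_flagged is made up. According to the upsmon.conf documentation, upsmon only creates the flag file when it runs in master mode.

```shell
#!/bin/sh
# Sketch of a stock NUT late-shutdown hook, NOT the os-nut implementation.
# It would run at the very end of the shutdown sequence on the machine
# that has the UPS driver attached.
killpower_if_flagged() {
  # "upsmon -K" exits 0 only if the POWERDOWNFLAG file exists and was
  # written by upsmon itself.
  if command -v upsmon >/dev/null 2>&1 && upsmon -K >/dev/null 2>&1; then
    echo "POWERDOWNFLAG set: asking the UPS driver to cut the load"
    upsdrvctl shutdown
  else
    echo "POWERDOWNFLAG not set: leaving UPS output on"
  fi
}
```

A configurable plugin option would, in effect, decide whether a hook like this is allowed to run.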

I haven't noticed because here my OPNsense is the master NUT server and all other servers shut down first. I think the firewall killing the Internet connection should go down last.

HTH,
Patrick
Deciso DEC750
People who think they know everything are a great annoyance to those of us who do. (Isaac Asimov)

Here's the secret: OPNsense has an API that shell scripts can call. I have another computer run this script when it shuts down on a UPS event, so that it also shuts down my OPNsense. Avoids NUT nonsense. You might want to edit it a bit, but it works fine for me. As written, the script requires OPNsense to have a valid SSL certificate, though:

#!/usr/bin/env bash
set -Eeuo pipefail

OPNSENSE_HOSTS=("https://opnsense.example.com")   # add more for HA
API_KEY="<API KEY HERE>"
API_SECRET="<API SECRET HERE>"
ENDPOINT="/api/core/firmware/poweroff"

CONNECT_TIMEOUT=3
MAX_TIME=10
RETRIES=3
RETRY_DELAY=2

log(){ logger -t ppb-opnsense -- "$*"; echo "[$(date -Is)] $*"; }

call_shutdown() {
  local url="${1%/}${ENDPOINT}"
  local args=(
    --silent --show-error
    --header "Content-Type: application/json"
    --user "${API_KEY}:${API_SECRET}"
    --data '{}'
    --connect-timeout "$CONNECT_TIMEOUT"
    --max-time "$MAX_TIME"
    --write-out "HTTP_CODE=%{http_code}\n"
    --output /dev/null
  )
  # With LE, system trust store is fine; no -k used.

  local out rc code
  for ((i=1;i<=RETRIES;i++)); do
    set +e
    out=$(curl -X POST "${args[@]}" "$url" 2>&1); rc=$?
    set -e
    code=""; [[ "$out" =~ HTTP_CODE=([0-9]{3}) ]] && code="${BASH_REMATCH[1]}"

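    # HTTP 000 means curl received no HTTP response; it is treated as success here,
    # presumably because the host can drop the connection while powering off.
    if [[ "$code" =~ ^2..$ || "$code" == "000" ]]; then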
      log "Accepted by $url (HTTP:${code:-none})."
      return 0
    fi
    log "Attempt $i failed (rc:$rc HTTP:${code:-none}). Out: $out"
    (( i < RETRIES )) && sleep "$RETRY_DELAY"
  done
  return 1
}

main(){
  command -v curl >/dev/null || { log "ERROR: curl not found"; exit 2; }
  local fail=0
  for h in "${OPNSENSE_HOSTS[@]}"; do
    log "Requesting shutdown: $h"
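    # Plain assignment instead of ((fail++)): the latter returns non-zero when
    # fail is 0, which would abort the script under "set -e".
    call_shutdown "$h" || { log "ERROR: $h did not acknowledge"; fail=$((fail + 1)); }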
  done
  (( fail==0 )) && { log "All shutdown calls issued."; exit 0; } || exit 1
}
main "$@"

Quote from: Patrick M. Hausen on Today at 09:55:38 AM
1. "/usr/local/etc/rc.halt" does call "shutdown -p now":

#!/bin/sh

# shutdown syshook / plugin scripts
/usr/local/etc/rc.syshook stop

/sbin/shutdown -op now

while :; do sleep 1; done

2. The hardwired killpower flag might call for a feature request to make it configurable.

I haven't noticed because here my OPNsense is the master NUT server and all other servers shut down first. I think the firewall killing the Internet connection should go down last.

HTH,
Patrick

When I was running OPNsense as master and testing this:

1. It went into halt mode and stayed there.
2. The UPS power was cycled, so it killed everything, including OPNsense itself, which was still in halt mode and never fully shut down.

Quote from: Stormscape on Today at 09:59:09 AM
Avoids NUT nonsense

Patrick, this also answers your "I think the firewall killing the Internet connection should go down last."

By having a NUT server, it gives me full control of everything:

1. Every machine's BIOS is set to power back on when power is restored.
2. You can run a command from the NUT server to power-cycle the UPS.
3. If the battery is back above 80%, for example, power-cycle the UPS, and that will bring everybody back online [1]
4. The NUT server is a Dell Wyse 3040 (5V/3A), so it will run for a very long time before going down; it also turns on automatically when power is restored.
5. The NAS is my only priority; it must be the first one to go down. If OPNsense goes down sooner or later, so be it.
6. The Proxmox scripts check that the NAS is up and the NFS shares are active before running backups; otherwise they skip.
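
Item 3 could look roughly like the sketch below, run on the NUT server. The names charge_ok and cycle_if_recovered, the NUT_USER/NUT_PASS variables and the use of apc1000@localhost are assumptions for illustration. shutdown.return is a standard NUT instant-command name, but whether it actually cycles the outlets depends on the UPS and driver; "upscmd -l" lists what yours supports.

```shell
#!/bin/sh
# Sketch only: power-cycle the UPS once the battery has recovered.
UPS="apc1000@localhost"      # assumption: the driver runs on this host
THRESHOLD=80                 # percent, as in item 3
CYCLE_CMD="shutdown.return"  # assumption: check "upscmd -l $UPS" first

# charge_ok CHARGE THRESHOLD
# Exit 0 when CHARGE (e.g. "81" or "81.5") is a number >= THRESHOLD.
charge_ok() {
  charge=${1%%.*}                                  # drop any fractional part
  case "$charge" in ''|*[!0-9]*) return 1 ;; esac  # reject non-numeric input
  [ "$charge" -ge "$2" ]
}

# cycle_if_recovered: read battery.charge and send the instant command
# only when the threshold is met. NUT_USER and NUT_PASS must be set to a
# user from upsd.users that is allowed to run instant commands.
cycle_if_recovered() {
  current=$(upsc "$UPS" battery.charge 2>/dev/null) || return 1
  if charge_ok "$current" "$THRESHOLD"; then
    upscmd -u "${NUT_USER:?}" -p "${NUT_PASS:?}" "$UPS" "$CYCLE_CMD"
  fi
}
```

Running cycle_if_recovered from cron or a timer after power returns would keep the process automated without cycling the UPS on a nearly empty battery.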

I did what I did to keep the whole process fully automated.
I am a novice, so don't take me too seriously :)
Sophos SG210 Rev3 -  i7-6700T - 16GB

Quote from: hakuna on Today at 10:21:19 AM
1. it went into halt mode and stayed there

Does your device support ACPI power off? Some embedded systems don't.

Quote from: hakuna on Today at 10:21:19 AM
2. UPS power was recycled so it killed everything, including OPNSense itself which was in halt mode, it never fully shutdown.

But power cycling while halted does not hurt.

And I already agreed that hard-wiring the UPS shutdown is probably a bad idea. Please raise a feature request on GitHub.
Deciso DEC750
People who think they know everything are a great annoyance to those of us who do. (Isaac Asimov)