Reboot after periodic Interface reset

Started by Shcshc, February 03, 2024, 12:10:05 PM

After the update to 24.1, the system restarts after the cron job "periodic interface reset" runs on WAN. It's a PPPoE interface. Before the update it didn't do this.



Same for me...

After the update to 24.1.2 (24.1.2_1), OPNsense freezes or reboots after "CronJob: periodic interface reset wan".

Hi,

I have the same problem here.

Regards

After rebooting the system, I went to the reporting settings and reset the DNS data, RRD data and NetFlow data, then repaired the NetFlow data.

Since then, the system no longer reboots every night.



After the latest update, the problem is back. Resetting the reporting data no longer helps.



June 25, 2024, 05:07:27 AM #5 Last Edit: September 20, 2024, 04:21:49 PM by TheOfficialMrBlah
After it worked for me again for some time, the problem has returned with the 24.1.9 (24.1.9_4) update...

24.7.4_1: the problem still exists.

Same problem here, after the cron job "periodic interface reset" runs with the parameter "opt4".

Interface opt4 = device pppoe0

Has anyone been able to solve the problem?
OPNsense 24.1.10_8 (Proxmox VM)

I still have the problem.
Version: 24.7.4_1

I see the following message on the console:

Fatal trap 12: page fault while in kernel mode
cpuid = 1; apic id = 02
fault virtual address = 0x10
fault code = supervisor read data, page not present



January 12, 2025, 09:02:44 PM #8 Last Edit: June 10, 2025, 10:49:17 PM by circa1665
My WAN IP address is assigned via DHCP. I created a custom script that monitors WAN connectivity (IPv4 and/or IPv6), attempts recovery by restarting the appropriate DHCP client, and reboots the system as a last resort. I built this by combining snippets from various sources and refining the logic with help from ChatGPT.

  • The script monitors internet connectivity for both IPv4 and IPv6 — if either is configured on the WAN interface.
  • Skips checks for unused stacks (e.g., IPv6-only or IPv4-only environments).
  • If a connection test fails, it restarts the relevant DHCP client and retries up to 5 times, with increasing delays.
  • If recovery still fails, it reboots the system — limited to 2 reboots per hour to prevent loops.
  • Logs all actions to '/var/log/gateway_monitor.log' and syslog.
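
The "increasing delays" in the retry step can be sketched on their own. This is only an illustration, assuming a base delay of 10 seconds and a step of 5 seconds per attempt (the values the script below uses for IPv4); 'retry_delay' is a hypothetical helper name and is not part of the script.

```shell
#!/bin/sh
# Hypothetical helper: print the wait (in seconds) before re-testing
# connectivity after restart attempt number $1.
# $2 = base delay, $3 = step per attempt (defaults of 10 and 5 are assumptions).
retry_delay() {
    attempt="$1"
    base="${2:-10}"
    step="${3:-5}"
    echo $((base + (attempt - 1) * step))
}

# Print the schedule for 5 attempts on one line.
for n in 1 2 3 4 5; do
    printf '%s ' "$(retry_delay "$n")"
done
echo
```

With the defaults, five failed attempts add up to 100 seconds of waiting, so a cron interval of 5 minutes leaves some headroom before the next run starts.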



Step 1: Create the Script

Create the script in '/home':

nano /home/check_gateways.sh

Paste the script content below. Change the 'WAN_INTERFACE' variable if your WAN device is not 'igc0', and set 'ENABLE_IPV4_CHECK' and 'ENABLE_IPV6_CHECK' to true or false as appropriate:

#!/bin/sh

# Gateway Connection Monitor for OPNsense with auto-recovery and reboot safeguard
# Run via cron, e.g.: */5 * * * * /home/check_gateways.sh
# Follow the log with: tail -f /var/log/gateway_monitor.log

WAN_INTERFACE="igc0"
ENABLE_IPV4_CHECK=true
ENABLE_IPV6_CHECK=true

IPV4_TEST_HOST="1.1.1.1"
IPV6_TEST_HOST="2606:4700:4700::1111"

MAX_RETRIES=5
MAX_REBOOTS_PER_HOUR=2

LOG_FILE="/var/log/gateway_monitor.log"
REBOOT_RECORD_FILE="/var/db/gateway_monitor_reboots"

# ANSI color codes for terminal output
COLOR_OK="\033[1;32m"
COLOR_FAIL="\033[1;31m"
COLOR_WARN="\033[1;33m"
COLOR_RESET="\033[0m"

log() {
    local level="$1"
    local message="$2"
    local timestamp="$(date '+%Y-%m-%d %H:%M:%S')"
    echo "$timestamp [$level] $message" >> "$LOG_FILE"
    logger -p daemon.notice -t gateway_monitor "$level: $message"
}

log_colored_status() {
    local ipv4_status="$1"
    local ipv6_status="$2"

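    local color_ipv4="$COLOR_OK"
    local color_ipv6="$COLOR_OK"

    # SKIPPED is shown as a warning, FAILED as an error
    [ "$ipv4_status" = "SKIPPED" ] && color_ipv4="$COLOR_WARN"
    [ "$ipv4_status" = "FAILED" ] && color_ipv4="$COLOR_FAIL"
    [ "$ipv6_status" = "SKIPPED" ] && color_ipv6="$COLOR_WARN"
    [ "$ipv6_status" = "FAILED" ] && color_ipv6="$COLOR_FAIL"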

    echo -e "[INFO] Connection status: (IPv4: ${color_ipv4}${ipv4_status}${COLOR_RESET}, IPv6: ${color_ipv6}${ipv6_status}${COLOR_RESET})"
}

check_link_status() {
    if ! ifconfig "$WAN_INTERFACE" | grep -q "status: active"; then
        log "CRITICAL" "WAN link is DOWN on $WAN_INTERFACE. Skipping tests and triggering reboot logic."
        return 1
    fi
    return 0
}

has_ipv4() {
    # Use only the first IPv4 address so aliases do not produce a multi-line value
    ip=$(ifconfig "$WAN_INTERFACE" | awk '/inet / {print $2; exit}')
    if echo "$ip" | grep -qE '^([0-9]{1,3}\.){3}[0-9]{1,3}$' && \
       ! echo "$ip" | grep -qE '^169\.254\.|^127\.'; then
        log "INFO" "Detected IPv4 address: $ip"
        gw=$(netstat -rn -f inet | awk '/^default/ && $NF == "'"$WAN_INTERFACE"'" {print $2; exit}')
        [ -n "$gw" ] && log "INFO" "Detected IPv4 gateway: $gw"
        return 0
    fi
    return 1
}

has_ipv6() {
    ip=$(ifconfig "$WAN_INTERFACE" | awk '/inet6 / && !/fe80::/ {print $2; exit}')
    if [ -n "$ip" ]; then
        log "INFO" "Detected IPv6 address: $ip"
        gw=$(netstat -rn -f inet6 | awk '/^default/ && $NF == "'"$WAN_INTERFACE"'" {print $2; exit}')
        [ -n "$gw" ] && log "INFO" "Detected IPv6 gateway: $gw"
        return 0
    fi
    return 1
}

check_ipv4_connectivity() {
    ping -c 2 -t 2 "$IPV4_TEST_HOST" >/dev/null 2>&1
    return $?
}

check_ipv6_connectivity() {
    ping6 -c 2 -t 2 "$IPV6_TEST_HOST" >/dev/null 2>&1
    return $?
}

restart_ipv4_dhcp() {
    log "WARNING" "Restarting IPv4 DHCP client..."
    pkill dhclient
    rm -f /var/db/dhclient.leases
    /usr/sbin/dhclient -cf /var/etc/dhclient_wan.conf "$WAN_INTERFACE"
}

restart_ipv6_dhcp() {
    log "WARNING" "Restarting IPv6 DHCP client..."
    pkill dhcp6c
    rm -f /var/db/dhcp6c_duid
    /usr/local/sbin/dhcp6c -c /var/etc/dhcp6c_wan.conf "$WAN_INTERFACE"
}

can_reboot() {
    now=$(date +%s)
    cutoff=$((now - 3600))
    [ ! -f "$REBOOT_RECORD_FILE" ] && echo "0" > "$REBOOT_RECORD_FILE"
    grep -E '^[0-9]+$' "$REBOOT_RECORD_FILE" > "$REBOOT_RECORD_FILE.tmp"
    mv "$REBOOT_RECORD_FILE.tmp" "$REBOOT_RECORD_FILE"
    recent_reboots=$(awk -v cutoff="$cutoff" '$1 >= cutoff' "$REBOOT_RECORD_FILE" | wc -l)
    if [ "$recent_reboots" -lt "$MAX_REBOOTS_PER_HOUR" ]; then
        echo "$now" >> "$REBOOT_RECORD_FILE"
        return 0
    fi
    return 1
}

attempt_connectivity_recovery() {
    local type="$1"
    local attempt=1
    local delay=0

    while [ "$attempt" -le "$MAX_RETRIES" ]; do
        if [ "$type" = "ipv4" ]; then
            check_ipv4_connectivity && return 0
            log "WARNING" "IPv4 connectivity FAILED. Attempting DHCP restart ($attempt/$MAX_RETRIES)..."
            restart_ipv4_dhcp
            delay=$((10 + (attempt - 1) * 5))
        else
            check_ipv6_connectivity && return 0
            log "WARNING" "IPv6 connectivity FAILED. Attempting DHCP6 restart ($attempt/$MAX_RETRIES)..."
            restart_ipv6_dhcp
            delay=$((15 + (attempt - 1) * 5))
        fi

        sleep "$delay"

        # Re-test after the restart; $attempt is the restart that just ran
        if [ "$type" = "ipv4" ]; then
            if check_ipv4_connectivity; then
                log "INFO" "IPv4 connectivity restored after DHCP restart $attempt/$MAX_RETRIES"
                return 0
            fi
        else
            if check_ipv6_connectivity; then
                log "INFO" "IPv6 connectivity restored after DHCP6 restart $attempt/$MAX_RETRIES"
                return 0
            fi
        fi

        attempt=$((attempt + 1))
    done

    return 1
}

log "INFO" "Starting gateway connection check for interface: $WAN_INTERFACE"

check_link_status || {
    if can_reboot; then
        log "CRITICAL" "Reboot permitted under policy (<=$MAX_REBOOTS_PER_HOUR/hour). System will now reboot."
        /sbin/reboot
    else
        log "CRITICAL" "Reboot limit reached. System will NOT reboot."
    fi
    exit 1
}

ipv4_status="SKIPPED"
ipv6_status="SKIPPED"

if [ "$ENABLE_IPV4_CHECK" = true ] && has_ipv4; then
    ipv4_status="OK"
    if ! check_ipv4_connectivity; then
        ipv4_status="FAILED"
        if ! attempt_connectivity_recovery "ipv4"; then
            ipv4_status="FAILED"
        else
            ipv4_status="OK"
        fi
    fi
elif [ "$ENABLE_IPV4_CHECK" = true ]; then
    log "INFO" "IPv4 not configured on interface $WAN_INTERFACE, skipping IPv4 checks."
fi

if [ "$ENABLE_IPV6_CHECK" = true ] && has_ipv6; then
    ipv6_status="OK"
    if ! check_ipv6_connectivity; then
        ipv6_status="FAILED"
        if ! attempt_connectivity_recovery "ipv6"; then
            ipv6_status="FAILED"
        else
            ipv6_status="OK"
        fi
    fi
elif [ "$ENABLE_IPV6_CHECK" = true ]; then
    log "INFO" "IPv6 not configured on interface $WAN_INTERFACE, skipping IPv6 checks."
fi

# Summary logs: a SKIPPED stack is not a failure
if [ "$ipv4_status" != "FAILED" ] && [ "$ipv6_status" != "FAILED" ]; then
    log "INFO" "Connection status: HEALTHY (IPv4: $ipv4_status, IPv6: $ipv6_status)"
    log_colored_status "$ipv4_status" "$ipv6_status"
else
    log "CRITICAL" "Connection status: DEGRADED or FAILED (IPv4: $ipv4_status, IPv6: $ipv6_status)"
    log_colored_status "$ipv4_status" "$ipv6_status"
    failure_types=""
    [ "$ipv4_status" = "FAILED" ] && failure_types="IPv4"
    [ "$ipv6_status" = "FAILED" ] && failure_types="${failure_types:+$failure_types and }IPv6"
    log "CRITICAL" "Reboot triggered: $failure_types connectivity failed after $MAX_RETRIES retries."
    if can_reboot; then
        log "CRITICAL" "Reboot permitted under policy (<=$MAX_REBOOTS_PER_HOUR/hour). System will now reboot."
        /sbin/reboot
    else
        log "CRITICAL" "Reboot limit reached. System will NOT reboot."
    fi
fi

log "INFO" "Gateway connection check completed"
exit 0
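
The reboot safeguard in 'can_reboot' can be tried on its own, without touching '/var/db' or rebooting anything. The sketch below is a simplified variant and not the script's exact function: 'allow_reboot' is a hypothetical name, and the current time is passed in as an argument (an assumption, made so the logic can be replayed with fixed timestamps).

```shell
#!/bin/sh
# Hypothetical helper: succeed and record the timestamp only if fewer than
# $3 entries in file $1 fall within the hour before time $2.
# $1 = record file, $2 = current epoch seconds, $3 = max reboots per hour.
allow_reboot() {
    record="$1"
    now="$2"
    max="$3"
    cutoff=$((now - 3600))
    [ -f "$record" ] || : > "$record"
    # Count only well-formed timestamps at or after the cutoff.
    recent=$(awk -v c="$cutoff" '/^[0-9]+$/ && $1 >= c { n++ } END { print n + 0 }' "$record")
    if [ "$recent" -lt "$max" ]; then
        echo "$now" >> "$record"
        return 0
    fi
    return 1
}

# Dry run against a scratch file: the third request within the hour is refused,
# and a request more than an hour later is allowed again.
scratch=$(mktemp)
for t in 1000 1300 1600 5000; do
    if allow_reboot "$scratch" "$t" 2; then
        echo "t=$t: reboot allowed"
    else
        echo "t=$t: reboot refused"
    fi
done
rm -f "$scratch"
```

Counting inside awk also avoids the leading spaces that 'wc -l' prints on some systems.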


Set the correct permissions:

chmod 700 /home/check_gateways.sh
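
Before wiring the script into configd, it may be worth checking that it parses and that the mode is what 'chmod 700' should produce. The following is a small sketch; 'verify_script' is a hypothetical helper, and the demo runs against a scratch file rather than the real script.

```shell
#!/bin/sh
# Hypothetical helper: report whether script $1 exists, parses cleanly,
# and has mode 700 (-rwx------).
verify_script() {
    target="$1"
    if [ ! -f "$target" ]; then
        echo "missing: $target"
        return 1
    fi
    # sh -n reads the script and checks its syntax without running any commands.
    if ! sh -n "$target" 2>/dev/null; then
        echo "syntax error: $target"
        return 1
    fi
    perms=$(ls -ld "$target" | cut -c1-10)
    if [ "$perms" != "-rwx------" ]; then
        echo "unexpected mode $perms: $target"
        return 1
    fi
    echo "ok: $target"
}

# Dry run on a scratch script, so nothing on the firewall is touched.
scratch=$(mktemp)
printf '#!/bin/sh\necho hello\n' > "$scratch"
chmod 700 "$scratch"
verify_script "$scratch"
rm -f "$scratch"
```

On the firewall, the argument would be /home/check_gateways.sh.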


Step 2: Create a Config Action

Define a custom action to run the script via cron:

nano /usr/local/opnsense/service/conf/actions.d/actions_checkgateways.conf

Add the following content:

[check]
command:/home/check_gateways.sh
parameters:
type:script
message:Starting gateway check script
description:Check IPv4/IPv6 connectivity and restart DHCP clients or reboot if recovery fails.
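
If you prefer not to paste into nano, the same content can be written from the shell. This is a sketch only: 'write_action_file' is a hypothetical helper, and the target path is passed as an argument so it can be tried on a scratch file first.

```shell
#!/bin/sh
# Hypothetical helper: write the configd action definition to the path in $1.
write_action_file() {
    cat > "$1" <<'EOF'
[check]
command:/home/check_gateways.sh
parameters:
type:script
message:Starting gateway check script
description:Check IPv4/IPv6 connectivity and restart DHCP clients or reboot if recovery fails.
EOF
}

# Try it on a scratch file and inspect the result.
scratch=$(mktemp)
write_action_file "$scratch"
cat "$scratch"
rm -f "$scratch"
```

On the firewall, the argument would be /usr/local/opnsense/service/conf/actions.d/actions_checkgateways.conf.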


Then apply the changes:

service configd restart
configctl checkgateways check

If the action is registered correctly, the second command runs the script and prints its connection status summary.



Step 3: Add a Cron Job

Navigate to:

System > Settings > Cron

Create a new job with an interval of your choice (e.g. every 5 minutes). In the command dropdown, select:
Check IPv4/IPv6 connectivity and restart DHCP clients or reboot if recovery fails.