OPNsense Forum

Archive => 24.1, 24.4 Legacy Series => Topic started by: Shcshc on February 03, 2024, 12:10:05 PM

Title: Reboot after periodic Interface reset
Post by: Shcshc on February 03, 2024, 12:10:05 PM
After updating to 24.1, the system reboots whenever the cron job "Periodic interface reset" runs on WAN. WAN is a PPPoE interface. This did not happen before the update.


Title: Re: Reboot after periodic Interface reset
Post by: TheOfficialMrBlah on March 05, 2024, 12:33:53 PM
Same for me...

After the update to 24.1.2 (24.1.2_1), OPNsense freezes or reboots after "CronJob: periodic interface reset wan".
Title: Re: Reboot after periodic Interface reset
Post by: Roi on March 15, 2024, 12:58:23 PM
Hi,

I have the same problem here.

Regards
Title: Re: Reboot after periodic Interface reset
Post by: Shcshc on March 25, 2024, 07:23:09 AM
After rebooting the system, I went to the reporting settings and reset the DNS data, the RRD data and the NetFlow data, then ran the NetFlow repair.

Since then the system no longer reboots every night.


Title: Re: Reboot after periodic Interface reset
Post by: Shcshc on April 06, 2024, 08:20:03 AM
After the latest update the problem is back, and resetting the reporting data no longer helps.


Title: Re: Reboot after periodic Interface reset
Post by: TheOfficialMrBlah on June 25, 2024, 05:07:27 AM
After it worked for me again for some time, the problem has returned with the 24.1.9 (24.1.9_4) update...

24.7.4_1 <-> Problem still exists
Title: Re: Reboot after periodic Interface reset
Post by: Thomas B. on July 28, 2024, 12:20:22 PM
Same problem here, after the cron job "periodic interface reset" runs with the parameter "opt4".

Interface opt4 = device pppoe0

Has anyone been able to solve this?
Title: Re: Reboot after periodic Interface reset
Post by: TheOfficialMrBlah on September 25, 2024, 05:25:33 AM
I still have the problem.
Version: 24.7.4_1

I see the following message on the console:

Fatal trap 12: page fault while in kernel mode
cpuid = 1; apic id = 02
fault virtual address = 0x10
fault code = supervisor read data, page not present
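
A fatal trap 12 is a kernel panic, so the reboot comes from the kernel crashing rather than from the cron job asking for one. The trap lines alone do not show which code path faulted. On FreeBSD-based systems savecore(8) usually keeps a short summary of each crash as info.N under /var/crash; that dumps are enabled and stored there on this box is an assumption. A minimal sketch for locating the newest summary (latest_crash_info is a hypothetical helper name):

```shell
#!/bin/sh
# Hypothetical helper: print the newest info.* file in a crash directory.
# Assumes savecore(8) wrote its records to /var/crash (the FreeBSD default).

latest_crash_info() {
    dir="${1:-/var/crash}"
    # ls -t sorts newest first; errors are hidden when no info.* file exists.
    newest=$(ls -t "$dir"/info.* 2>/dev/null | head -n 1)
    if [ -z "$newest" ]; then
        echo "no crash info found in $dir"
        return 1
    fi
    echo "$newest"
}

# Usage on the firewall: show the summary of the most recent crash.
# f=$(latest_crash_info) && cat "$f"
```

The summary and the matching dump are the useful attachments for a bug report, since they carry the backtrace that the four console lines leave out.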


Title: Re: Reboot after periodic Interface reset
Post by: circa1665 on January 12, 2025, 09:02:44 PM
My WAN IP address is assigned via DHCP. I created a custom script that monitors WAN connectivity (IPv4 and/or IPv6), attempts recovery by restarting the appropriate DHCP client, and reboots the system as a last resort. I built this by combining snippets from various sources and refining the logic with help from ChatGPT.




Step 1: Create the Script

Create the script in '/home':

nano /home/check_gateways.sh

Paste the script content below. Set 'WAN_INTERFACE' to your WAN device if it is not 'igc0', and set 'ENABLE_IPV4_CHECK' and 'ENABLE_IPV6_CHECK' to true or false as appropriate:

#!/bin/sh

# Gateway Connection Monitor for OPNsense with auto-recovery and reboot safeguard
# Run via cron: */5 * * * * /home/check_gateways.sh
# Logs: tail -f /var/log/gateway_monitor.log

WAN_INTERFACE="igc0"
ENABLE_IPV4_CHECK=true
ENABLE_IPV6_CHECK=true

IPV4_TEST_HOST="1.1.1.1"
IPV6_TEST_HOST="2606:4700:4700::1111"

MAX_RETRIES=5
MAX_REBOOTS_PER_HOUR=2

LOG_FILE="/var/log/gateway_monitor.log"
REBOOT_RECORD_FILE="/var/db/gateway_monitor_reboots"

# ANSI color codes for terminal output
COLOR_OK="\033[1;32m"
COLOR_FAIL="\033[1;31m"
COLOR_WARN="\033[1;33m"
COLOR_RESET="\033[0m"

log() {
    local level="$1"
    local message="$2"
    local timestamp="$(date '+%Y-%m-%d %H:%M:%S')"
    echo "$timestamp [$level] $message" >> "$LOG_FILE"
    logger -p daemon.notice -t gateway_monitor "$level: $message"
}

log_colored_status() {
    local ipv4_status="$1"
    local ipv6_status="$2"

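    local color_ipv4="$COLOR_OK"
    local color_ipv6="$COLOR_OK"

    # FAILED is red, SKIPPED (check disabled) is yellow, OK stays green.
    case "$ipv4_status" in FAILED) color_ipv4="$COLOR_FAIL" ;; SKIPPED) color_ipv4="$COLOR_WARN" ;; esac
    case "$ipv6_status" in FAILED) color_ipv6="$COLOR_FAIL" ;; SKIPPED) color_ipv6="$COLOR_WARN" ;; esac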

    echo -e "[INFO] Connection status: (IPv4: ${color_ipv4}${ipv4_status}${COLOR_RESET}, IPv6: ${color_ipv6}${ipv6_status}${COLOR_RESET})"
}

check_link_status() {
    if ! ifconfig "$WAN_INTERFACE" | grep -q "status: active"; then
        log "CRITICAL" "WAN link is DOWN on $WAN_INTERFACE. Skipping tests and triggering reboot logic."
        return 1
    fi
    return 0
}

has_ipv4() {
    ip=$(ifconfig "$WAN_INTERFACE" | awk '/inet / {print $2}')
    if echo "$ip" | grep -qE '^([0-9]{1,3}\.){3}[0-9]{1,3}$' && \
       ! echo "$ip" | grep -qE '^169\.254|^127\.'; then
        log "INFO" "Detected IPv4 address: $ip"
        gw=$(netstat -rn -f inet | awk '/^default/ && $NF == "'"$WAN_INTERFACE"'" {print $2; exit}')
        [ -n "$gw" ] && log "INFO" "Detected IPv4 gateway: $gw"
        return 0
    fi
    return 1
}

has_ipv6() {
    ip=$(ifconfig "$WAN_INTERFACE" | awk '/inet6 / && !/fe80::/ {print $2; exit}')
    if [ -n "$ip" ]; then
        log "INFO" "Detected IPv6 address: $ip"
        gw=$(netstat -rn -f inet6 | awk '/^default/ && $NF == "'"$WAN_INTERFACE"'" {print $2; exit}')
        [ -n "$gw" ] && log "INFO" "Detected IPv6 gateway: $gw"
        return 0
    fi
    return 1
}

check_ipv4_connectivity() {
    ping -c 2 -t 2 "$IPV4_TEST_HOST" >/dev/null 2>&1
    return $?
}

check_ipv6_connectivity() {
    ping6 -c 2 -t 2 "$IPV6_TEST_HOST" >/dev/null 2>&1
    return $?
}

restart_ipv4_dhcp() {
    log "WARNING" "Restarting IPv4 DHCP client..."
    pkill dhclient
    rm -f /var/db/dhclient.leases
    /usr/sbin/dhclient -cf /var/etc/dhclient_wan.conf "$WAN_INTERFACE"
}

restart_ipv6_dhcp() {
    log "WARNING" "Restarting IPv6 DHCP client..."
    pkill dhcp6c
    rm -f /var/db/dhcp6c_duid
    /usr/local/sbin/dhcp6c -c /var/etc/dhcp6c_wan.conf "$WAN_INTERFACE"
}

can_reboot() {
    now=$(date +%s)
    cutoff=$((now - 3600))
    [ ! -f "$REBOOT_RECORD_FILE" ] && echo "0" > "$REBOOT_RECORD_FILE"
    grep -E '^[0-9]+$' "$REBOOT_RECORD_FILE" > "$REBOOT_RECORD_FILE.tmp"
    mv "$REBOOT_RECORD_FILE.tmp" "$REBOOT_RECORD_FILE"
    recent_reboots=$(awk -v cutoff="$cutoff" '$1 >= cutoff' "$REBOOT_RECORD_FILE" | wc -l)
    if [ "$recent_reboots" -lt "$MAX_REBOOTS_PER_HOUR" ]; then
        echo "$now" >> "$REBOOT_RECORD_FILE"
        return 0
    fi
    return 1
}

attempt_connectivity_recovery() {
    local type="$1"
    local attempt=1
    local delay=0

    while [ "$attempt" -le "$MAX_RETRIES" ]; do
        if [ "$type" = "ipv4" ]; then
            check_ipv4_connectivity && return 0
            log "WARNING" "IPv4 connectivity FAILED. Attempting DHCP restart ($attempt/$MAX_RETRIES)..."
            restart_ipv4_dhcp
            delay=$((10 + (attempt - 1) * 5))
        else
            check_ipv6_connectivity && return 0
            log "WARNING" "IPv6 connectivity FAILED. Attempting DHCP6 restart ($attempt/$MAX_RETRIES)..."
            restart_ipv6_dhcp
            delay=$((15 + (attempt - 1) * 5))
        fi

        sleep "$delay"

        if [ "$type" = "ipv4" ]; then
            if check_ipv4_connectivity; then
                log "INFO" "IPv4 connectivity restored after recovery attempt $attempt/$MAX_RETRIES"
                return 0
            fi
        else
            if check_ipv6_connectivity; then
                log "INFO" "IPv6 connectivity restored after recovery attempt $attempt/$MAX_RETRIES"
                return 0
            fi
        fi

        attempt=$((attempt + 1))
    done

    return 1
}

log "INFO" "Starting gateway connection check for interface: $WAN_INTERFACE"

check_link_status || {
    if can_reboot; then
        log "CRITICAL" "Reboot permitted under policy (<=$MAX_REBOOTS_PER_HOUR/hour). System will now reboot."
        /sbin/reboot
    else
        log "CRITICAL" "Reboot limit reached. System will NOT reboot."
    fi
    exit 1
}

ipv4_status="SKIPPED"
ipv6_status="SKIPPED"

if [ "$ENABLE_IPV4_CHECK" = true ]; then
    if has_ipv4; then
        ipv4_status="OK"
        if ! check_ipv4_connectivity; then
            ipv4_status="FAILED"
            if ! attempt_connectivity_recovery "ipv4"; then
                ipv4_status="FAILED"
            else
                ipv4_status="OK"
            fi
        fi
    else
        log "WARNING" "No valid IPv4 address on $WAN_INTERFACE. Attempting recovery..."
        ipv4_status="FAILED"
        if ! attempt_connectivity_recovery "ipv4"; then
            ipv4_status="FAILED"
        else
            ipv4_status="OK"
        fi
    fi
fi

if [ "$ENABLE_IPV6_CHECK" = true ]; then
    if has_ipv6; then
        ipv6_status="OK"
        if ! check_ipv6_connectivity; then
            ipv6_status="FAILED"
            if ! attempt_connectivity_recovery "ipv6"; then
                ipv6_status="FAILED"
            else
                ipv6_status="OK"
            fi
        fi
    else
        log "WARNING" "No valid IPv6 address on $WAN_INTERFACE. Attempting recovery..."
        ipv6_status="FAILED"
        if ! attempt_connectivity_recovery "ipv6"; then
            ipv6_status="FAILED"
        else
            ipv6_status="OK"
        fi
    fi
fi

# Summary logs
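if [ "$ipv4_status" != "FAILED" ] && [ "$ipv6_status" != "FAILED" ]; then
    # A disabled check stays SKIPPED and must not count as a failure.
    log "INFO" "Connection status: HEALTHY (IPv4: $ipv4_status, IPv6: $ipv6_status)"
    log_colored_status "$ipv4_status" "$ipv6_status"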
else
    log "CRITICAL" "Connection status: DEGRADED or FAILED (IPv4: $ipv4_status, IPv6: $ipv6_status)"
    log_colored_status "$ipv4_status" "$ipv6_status"
    if [ "$ipv4_status" = "FAILED" ] || [ "$ipv6_status" = "FAILED" ]; then
        failure_types="$( [ "$ipv4_status" = "FAILED" ] && echo IPv4 ) $( [ "$ipv6_status" = "FAILED" ] && echo IPv6 )"
        log "CRITICAL" "Reboot triggered: $failure_types connectivity failed after $MAX_RETRIES retries."
        if can_reboot; then
            log "CRITICAL" "Reboot permitted under policy (<=$MAX_REBOOTS_PER_HOUR/hour). System will now reboot."
            /sbin/reboot
        else
            log "CRITICAL" "Reboot limit reached. System will NOT reboot."
        fi
    fi
fi

log "INFO" "Gateway connection check completed"
exit 0
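
The reboot safeguard in can_reboot is a sliding-window limiter: it records a timestamp for every permitted reboot and refuses once too many fall inside the last hour. The same idea can be tried on its own, away from the firewall. The sketch below is a variation and not the script's own function: allow_action is a hypothetical name, the optional fourth argument (a fixed "now") exists only to make testing possible, and unlike the original it prunes expired entries so the record file does not grow.

```shell
#!/bin/sh
# Sketch of a sliding-window limiter, modelled on can_reboot.
# allow_action FILE MAX WINDOW_SECONDS [NOW]
# NOW is an optional epoch timestamp so the logic can be tested without waiting.

allow_action() {
    file="$1"; max="$2"; window="$3"
    now="${4:-$(date +%s)}"
    cutoff=$((now - window))
    [ -f "$file" ] || : > "$file"
    # Keep only numeric timestamps that are still inside the window.
    awk -v cutoff="$cutoff" '/^[0-9]+$/ && $1 >= cutoff' "$file" > "$file.tmp"
    mv "$file.tmp" "$file"
    count=$(awk 'END { print NR }' "$file")
    if [ "$count" -lt "$max" ]; then
        echo "$now" >> "$file"
        return 0
    fi
    return 1
}

# Usage: allow at most 2 actions per hour.
# allow_action /tmp/limiter_demo 2 3600 && echo "allowed" || echo "blocked"
```

Because the timestamp is written before the action runs, a reboot that never completes still counts against the limit, which is the conservative choice for a firewall.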


Set the correct permissions:

chmod 700 /home/check_gateways.sh
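
Text pasted from a forum can pick up broken quotes or lost line breaks, so it may be worth parsing the file before cron ever runs it. The sh -n option reads a script and reports syntax errors without executing anything. A minimal sketch (syntax_ok is a hypothetical helper; the default path is the one from Step 1):

```shell
#!/bin/sh
# Parse a script without executing it; sh -n only reports syntax errors.

syntax_ok() {
    script="${1:-/home/check_gateways.sh}"
    if sh -n "$script" 2>/dev/null; then
        echo "syntax OK: $script"
    else
        echo "syntax error in: $script"
        return 1
    fi
}

# Usage on the firewall:
# syntax_ok /home/check_gateways.sh
```

This only proves that the shell can parse the file. It says nothing about whether the DHCP client paths or the interface name are right for a given box.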


Step 2: Create a Config Action

Define a custom action to run the script via cron:

nano /usr/local/opnsense/service/conf/actions.d/actions_checkgateways.conf
Add the following content:

[check]
command:/home/check_gateways.sh
parameters:
type:script
message:Starting gateway check script
description:Check IPv4/IPv6 connectivity and restart DHCP clients or reboot if recovery fails.
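
A typo in the command: line is easy to miss, because the action file itself is never executed directly. The sketch below reads the first command: entry from an action file in the format shown above and checks that it points at an executable file. action_command_ok is a hypothetical helper, and it assumes the path contains no spaces.

```shell
#!/bin/sh
# Hypothetical check: read the command: line of a configd action file and
# confirm that the target exists and is executable.

action_command_ok() {
    conf="$1"
    # Text after the first "command:" key; only the first word is the program.
    cmd=$(sed -n 's/^command:[[:space:]]*//p' "$conf" | head -n 1 | awk '{ print $1 }')
    if [ -z "$cmd" ]; then
        echo "no command: line in $conf"
        return 1
    fi
    if [ -x "$cmd" ]; then
        echo "OK: $cmd"
    else
        echo "not executable: $cmd"
        return 1
    fi
}

# Usage on the firewall:
# action_command_ok /usr/local/opnsense/service/conf/actions.d/actions_checkgateways.conf
```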


Then apply the changes:

service configd restart
configctl checkgateways check

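With type:script, configctl should report OK once the script exits successfully. The detailed status summary is written to /var/log/gateway_monitor.log.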
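
The script appends every result to /var/log/gateway_monitor.log, one line per event in the form "date time [LEVEL] message". Counting the lines per level gives a quick picture of how often recovery or reboot logic has fired. A small sketch (count_level is a hypothetical helper):

```shell
#!/bin/sh
# Count entries per level in a log written as:
#   YYYY-MM-DD HH:MM:SS [LEVEL] message

count_level() {
    level="$1"
    file="${2:-/var/log/gateway_monitor.log}"
    # The level is the third whitespace-separated field, in square brackets.
    awk -v want="[$level]" '$3 == want { n++ } END { print n + 0 }' "$file"
}

# Usage on the firewall:
# echo "warnings: $(count_level WARNING), critical: $(count_level CRITICAL)"
```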



Step 3: Add a Cron Job

Navigate to:

System > Settings > Cron

Create a new job with an interval of your choice (e.g. every 5 minutes). In the command dropdown, select:
Check IPv4/IPv6 connectivity and restart DHCP clients or reboot if recovery fails.
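
When picking the interval, it helps to know how long one run can take. The retry loop sleeps 10 + (attempt - 1) * 5 seconds for IPv4 and 15 + (attempt - 1) * 5 seconds for IPv6, for up to MAX_RETRIES attempts each. The sketch below only adds up those sleeps; it leaves out ping timeouts and however long the DHCP clients take to start, so real runs can be longer. total_delay is a hypothetical helper.

```shell
#!/bin/sh
# Sum the sleep delays of the retry loop: base + (attempt - 1) * 5 seconds.

total_delay() {
    base="$1"; retries="$2"
    total=0; attempt=1
    while [ "$attempt" -le "$retries" ]; do
        total=$((total + base + (attempt - 1) * 5))
        attempt=$((attempt + 1))
    done
    echo "$total"
}

v4=$(total_delay 10 5)
v6=$(total_delay 15 5)
echo "IPv4: ${v4}s, IPv6: ${v6}s, both: $((v4 + v6))s"
```

With the defaults this prints "IPv4: 100s, IPv6: 125s, both: 225s". A 5 minute interval therefore leaves little headroom when both protocols fail, and a longer interval or a lower MAX_RETRIES may be safer if overlapping runs are a concern.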