Messages - jdiesel

#1
25.1, 25.4 Production Series / System Hang - zroot issue
February 26, 2025, 05:12:45 PM
Seems like I had a zroot error last night, which resulted in a complete hang.

Solaris: WARNING: Pool 'zroot' has encountered an uncorrectable I/O failure and has been suspended.
Pool 'zroot' has encountered an uncorrectable I/O failure and has been suspended.

I do not have most of these logs locally. I log remotely, and that system captured the messages just before the hang; the last entry was logged at 00:15:18 local time.
I do not seem to have access to disk hygiene information, which could be relevant?
I restarted, and everything seems fine now.


2025-02-26T01:23:13-05:00 firewall kernel - - [meta sequenceId="15"] AMD Features2=0x121<LAHF,ABM,Prefetch>
2025-02-26T01:23:13-05:00 firewall kernel - - [meta sequenceId="14"] AMD Features=0x2c100800<SYSCALL,NX,Page1GB,RDTSCP,LM>
2025-02-26T01:23:13-05:00 firewall kernel - - [meta sequenceId="13"] Features2=0x7ffafbbf<SSE3,PCLMULQDQ,DTES64,MON,DS_CPL,VMX,EST,TM2,SSSE3,SDBG,FMA,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,TSCDLT,AESNI,XSAVE,OSXSAVE,AVX,F16C,RDRAND>
2025-02-26T01:23:13-05:00 firewall kernel - - [meta sequenceId="12"] Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
2025-02-26T01:23:13-05:00 firewall kernel - - [meta sequenceId="11"] Origin="GenuineIntel" Id=0x406e3 Family=0x6 Model=0x4e Stepping=3
2025-02-26T01:23:13-05:00 firewall kernel - - [meta sequenceId="10"] CPU: Intel(R) Core(TM) i3-6006U CPU @ 2.00GHz (2000.00-MHz K8-class CPU)
2025-02-26T01:23:13-05:00 firewall kernel - - [meta sequenceId="9"] VT(efifb): resolution 800x600
2025-02-26T01:23:13-05:00 firewall kernel - - [meta sequenceId="8"] FreeBSD clang version 18.1.6 (https://github.com/llvm/llvm-project.git llvmorg-18.1.6-0-g1118c2e05e67)
2025-02-26T01:23:13-05:00 firewall kernel - - [meta sequenceId="7"] FreeBSD 14.2-RELEASE-p1 stable/25.1-n269632-cc316253c68 SMP amd64
2025-02-26T01:23:13-05:00 firewall kernel - - [meta sequenceId="6"] FreeBSD is a registered trademark of The FreeBSD Foundation.
2025-02-26T01:23:13-05:00 firewall kernel - - [meta sequenceId="5"] The Regents of the University of California. All rights reserved.
2025-02-26T01:23:13-05:00 firewall kernel - - [meta sequenceId="4"] Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
2025-02-26T01:23:13-05:00 firewall kernel - - [meta sequenceId="3"] Copyright (c) 1992-2023 The FreeBSD Project.
2025-02-26T01:23:13-05:00 firewall kernel - - [meta sequenceId="2"] ---<<BOOT>>---
2025-02-26T01:23:13-05:00 firewall syslog-ng 11475 - [meta sequenceId="1"] syslog-ng starting up; version='4.8.1'
2025-02-26T00:15:18-05:00 firewall kernel - - [meta sequenceId="6485644"] <7>sonewconn: pcb 0xfffff80017f06a80 ([::]:53 (proto 6)): Listen queue overflow: 193 already in queue awaiting acceptance (139 occurrences), euid 0, rgid 0, jail 0
2025-02-26T00:14:18-05:00 firewall kernel - - [meta sequenceId="6485643"] <7>sonewconn: pcb 0xfffff80017f06a80 ([::]:53 (proto 6)): Listen queue overflow: 193 already in queue awaiting acceptance (176 occurrences), euid 0, rgid 0, jail 0
2025-02-26T00:14:01-05:00 firewall kernel - - [meta sequenceId="6485642"] Solaris: WARNING: Pool 'zroot' has encountered an uncorrectable I/O failure and has been suspended.
2025-02-26T00:14:01-05:00 firewall kernel - - [meta sequenceId="6485641"] (aprobe0:ahcich0:0:0:0): Error 5, Retries exhausted
2025-02-26T00:14:01-05:00 firewall kernel - - [meta sequenceId="6485640"] (aprobe0:ahcich0:0:0:0): CAM status: Command timeout
2025-02-26T00:14:01-05:00 firewall kernel - - [meta sequenceId="6485639"] (aprobe0:ahcich0:0:0:0): ATA_IDENTIFY. ACB: ec 00 00 00 00 40 00 00 00 00 00 00
2025-02-26T00:14:01-05:00 firewall kernel - - [meta sequenceId="6485638"] ahcich0: is 00000000 cs 00000200 ss 00000000 rs 00000200 tfd 1d0 serr 00000000 cmd 0000c917
2025-02-26T00:14:01-05:00 firewall kernel - - [meta sequenceId="6485637"] ahcich0: Timeout on slot 9 port 0
2025-02-26T00:13:31-05:00 firewall kernel - - [meta sequenceId="6485636"] (aprobe0:ahcich0:0:0:0): Retrying command, 0 more tries remain
2025-02-26T00:13:31-05:00 firewall kernel - - [meta sequenceId="6485635"] (aprobe0:ahcich0:0:0:0): CAM status: Command timeout
2025-02-26T00:13:31-05:00 firewall kernel - - [meta sequenceId="6485634"] (aprobe0:ahcich0:0:0:0): ATA_IDENTIFY. ACB: ec 00 00 00 00 40 00 00 00 00 00 00
2025-02-26T00:13:31-05:00 firewall kernel - - [meta sequenceId="6485633"] ahcich0: is 00000000 cs 00000100 ss 00000000 rs 00000100 tfd 1d0 serr 00000000 cmd 0000c817
2025-02-26T00:13:31-05:00 firewall kernel - - [meta sequenceId="6485632"] ahcich0: Timeout on slot 8 port 0
2025-02-26T00:13:17-05:00 firewall kernel - - [meta sequenceId="6485631"] <7>sonewconn: pcb 0xfffff80017f06a80 ([::]:53 (proto 6)): Listen queue overflow: 193 already in queue awaiting acceptance (204 occurrences), euid 0, rgid 0, jail 0
2025-02-26T00:13:01-05:00 firewall kernel - - [meta sequenceId="6485630"] Solaris: WARNING: Pool 'zroot' has encountered an uncorrectable I/O failure and has been suspended.
2025-02-26T00:13:01-05:00 firewall kernel - - [meta sequenceId="6485629"] Solaris: WARNING: Pool 'zroot' has encountered an uncorrectable I/O failure and has been suspended.
2025-02-26T00:13:01-05:00 firewall kernel - - [meta sequenceId="6485628"] Pool 'zroot' has encountered an uncorrectable I/O failure and has been suspended.
2025-02-26T00:13:01-05:00 firewall kernel - - [meta sequenceId="6485627"] (ada0:ahcich0:0:0:0): Error 6, Periph was invalidated
2025-02-26T00:13:01-05:00 firewall kernel - - [meta sequenceId="6485626"] Solaris: WARNING: (ada0:ahcich0:0:0:0): CAM status: Unconditionally Re-queue Request
2025-02-26T00:13:01-05:00 firewall kernel - - [meta sequenceId="6485625"] (ada0:ahcich0:0:0:0): WRITE_FPDMA_QUEUED. ACB: 61 f0 28 63 f6 40 06 00 00 00 00 00
2025-02-26T00:13:01-05:00 firewall kernel - - [meta sequenceId="6485624"] (ada0:ahcich0:0:0:0): Error 6, Periph was invalidated
2025-02-26T00:13:01-05:00 firewall kernel - - [meta sequenceId="6485623"] (ada0:ahcich0:0:0:0): CAM status: Unconditionally Re-queue Request
2025-02-26T00:13:01-05:00 firewall kernel - - [meta sequenceId="6485622"] (ada0:ahcich0:0:0:0): WRITE_FPDMA_QUEUED. ACB: 61 f0 38 62 f6 40 06 00 00 00 00 00
2025-02-26T00:13:01-05:00 firewall kernel - - [meta sequenceId="6485621"] (ada0:ahcich0:0:0:0): Error 6, Periph was invalidated
2025-02-26T00:13:01-05:00 firewall kernel - - [meta sequenceId="6485620"] (ada0:ahcich0:0:0:0): CAM status: Command timeout
2025-02-26T00:13:01-05:00 firewall kernel - - [meta sequenceId="6485619"] (ada0:ahcich0:0:0:0): READ_FPDMA_QUEUED. ACB: 60 10 10 be e7 40 0e 00 00 00 00 00
2025-02-26T00:13:01-05:00 firewall kernel - - [meta sequenceId="6485618"] (ada0:ahcich0:0:0:0): Error 6, Periph was invalidated
2025-02-26T00:13:01-05:00 firewall kernel - - [meta sequenceId="6485617"] (ada0:ahcich0:0:0:0): CAM status: Unconditionally Re-queue Request
2025-02-26T00:13:01-05:00 firewall kernel - - [meta sequenceId="6485616"] (ada0:ahcich0:0:0:0): FLUSHCACHE48. ACB: ea 00 00 00 00 40 00 00 00 00 00 00
2025-02-26T00:13:01-05:00 firewall kernel - - [meta sequenceId="6485615"] ahcich0: is 00000000 cs 00000000 ss 000000e0 rs 000000e0 tfd 40 serr 00000000 cmd 0000c717
2025-02-26T00:13:01-05:00 firewall kernel - - [meta sequenceId="6485614"] ahcich0: Timeout on slot 5 port 0
2025-02-26T00:12:31-05:00 firewall kernel - - [meta sequenceId="6485613"] (aprobe0:ahcich0:0:0:0): Retrying command, 0 more tries remain
2025-02-26T00:12:31-05:00 firewall kernel - - [meta sequenceId="6485612"] (aprobe0:ahcich0:0:0:0): CAM status: Command timeout
2025-02-26T00:12:31-05:00 firewall kernel - - [meta sequenceId="6485611"] (aprobe0:ahcich0:0:0:0): ATA_IDENTIFY. ACB: ec 00 00 00 00 40 00 00 00 00 00 00
2025-02-26T00:12:31-05:00 firewall kernel - - [meta sequenceId="6485610"] ahcich0: is 00000000 cs 80000000 ss 00000000 rs 80000000 tfd 1d0 serr 00000000 cmd 0000df17
2025-02-26T00:12:31-05:00 firewall kernel - - [meta sequenceId="6485609"] ahcich0: Timeout on slot 31 port 0
2025-02-26T00:12:17-05:00 firewall kernel - - [meta sequenceId="6485608"] <7>sonewconn: pcb 0xfffff80017f06a80 ([::]:53 (proto 6)): Listen queue overflow: 193 already in queue awaiting acceptance (154 occurrences), euid 0, rgid 0, jail 0
2025-02-26T00:12:01-05:00 firewall kernel - - [meta sequenceId="6485607"] (ada0:ahcich0:0:0:0): Error 6, Periph was invalidated
2025-02-26T00:12:01-05:00 firewall kernel - - [meta sequenceId="6485606"] (ada0:ahcich0:0:0:0): CAM status: Command timeout
2025-02-26T00:12:01-05:00 firewall kernel - - [meta sequenceId="6485605"] (ada0:ahcich0:0:0:0): READ_DMA. ACB: c8 00 10 2a 08 41 00 00 00 00 10 00
2025-02-26T00:12:01-05:00 firewall kernel - - [meta sequenceId="6485604"] (ada0:ahcich0:0:0:0): Error 6, Periph was invalidated
2025-02-26T00:12:01-05:00 firewall kernel - - [meta sequenceId="6485603"] (ada0:ahcich0:0:0:0): CAM status: Unconditionally Re-queue Request
2025-02-26T00:12:01-05:00 firewall kernel - - [meta sequenceId="6485602"] (ada0:ahcich0:0:0:0): READ_FPDMA_QUEUED. ACB: 60 10 10 bc e7 40 0e 00 00 00 00 00
2025-02-26T00:12:01-05:00 firewall kernel - - [meta sequenceId="6485601"] ahcich0: is 00000000 cs 40000000 ss 00000000 rs 40000000 tfd d0 serr 00000000 cmd 0000de17
2025-02-26T00:12:01-05:00 firewall kernel - - [meta sequenceId="6485600"] ahcich0: Timeout on slot 30 port 0
2025-02-26T00:11:50-05:00 firewall filterlog 91002 - [meta sequenceId="6485599"] 117,,,78ab8ef80d8e923479f99b265b627394,igb3,match,pass,out,4,0x0,,62,38021,0,DF,6,tcp,60,216.212.95.186,37.120.205.197,57294,36069,0,S,998208525,,64240,,mss;sackOK;TS;nop;wscale
2025-02-26T00:11:48-05:00 firewall filterlog 91002 - [meta sequenceId="6485598"] 114,,,fae559338f65e11c53669fc3642c93c2,vlan0.2,match,pass,out,4,0x0,,127,19463,0,DF,6,tcp,52,10.20.1.230,10.20.2.128,3435,24800,0,S,342899722,,65535,,mss;nop;wscale;nop;nop;sackOK
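The remote log above is listed newest-first, and the sequenceId restarts at 1 after the reboot, so the order of events is easier to follow once the disk-related lines are pulled out and sorted by timestamp. The following is a minimal Python sketch of that, assuming the line layout matches what is pasted above; the regular expression, the keyword list, and the function names are illustrative assumptions, not part of any OPNsense tooling.

```python
import re
from datetime import datetime

# Assumed layout, based on the pasted lines:
# <timestamp> <host> <program> <pid or -> - [meta sequenceId="N"] <message>
LINE_RE = re.compile(
    r'^(?P<ts>\S+) (?P<host>\S+) (?P<prog>\S+) (?P<pid>\S+) - '
    r'\[meta sequenceId="(?P<seq>\d+)"\] (?P<msg>.*)$'
)

# Substrings that mark disk/pool messages in this particular log (an assumption).
STORAGE_KEYWORDS = ("ahcich", "ada0", "aprobe", "zroot")


def parse_line(line):
    """Return (timestamp, sequence id, message), or None if the line does not match."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    return datetime.fromisoformat(m["ts"]), int(m["seq"]), m["msg"]


def storage_events(lines):
    """Keep only storage-related lines, oldest first."""
    events = []
    for line in lines:
        parsed = parse_line(line)
        if parsed and any(k in parsed[2] for k in STORAGE_KEYWORDS):
            events.append(parsed)
    # Sort by timestamp first because sequenceId resets on reboot.
    return sorted(events, key=lambda e: (e[0], e[1]))


# A few lines copied from the log above, still in newest-first order.
sample = """\
2025-02-26T01:23:13-05:00 firewall syslog-ng 11475 - [meta sequenceId="1"] syslog-ng starting up; version='4.8.1'
2025-02-26T00:14:01-05:00 firewall kernel - - [meta sequenceId="6485641"] Solaris: WARNING: Pool 'zroot' has encountered an uncorrectable I/O failure and has been suspended.
2025-02-26T00:14:01-05:00 firewall kernel - - [meta sequenceId="6485637"] ahcich0: Timeout on slot 9 port 0
2025-02-26T00:12:01-05:00 firewall kernel - - [meta sequenceId="6485600"] ahcich0: Timeout on slot 30 port 0
""".splitlines()

for ts, seq, msg in storage_events(sample):
    print(ts.isoformat(), seq, msg)
```

Read in this order, the first disk-related entry in the excerpt is the `ahcich0` timeout at 00:12:01, and the pool suspension messages follow it.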
#2
Had a strange situation yesterday. OPNsense stopped passing traffic: it would respond to ping but not pass anything at all. I went to the unit and saw lots of odd messages on the screen. I did a hard reboot and it came up just fine. Reviewing the logs, I see the following in /var/log/dmesg.yesterday:


jason@srv:/var/log $ sudo more dmesg.yesterday
pid 98814 (sh), jid 0, uid 0: exited on signal 10
....

many many of these messages...

....
pid 6196 (sh), jid 0, uid 0: exited on signal 10
pid 6347 (sh), jid 0, uid 0: exited on signal 10
pid 99893 (sh), jid 0, uid 0: exited on signal 10
pid 7281 (sh), jid 0, uid 0: exited on signal 10
...
pid 11520 (sh), jid 0, uid 0: exited on signal 10
pid 11623 (sh), jid 0, uid 0: exited on signal 10
pid 96898 (pfctl), jid 0, uid 0: exited on signal 6 (core dumped)
pid 99746 (pfctl), jid 0, uid 884: exited on signal 6
pid 76134 (pfctl), jid 0, uid 884: exited on signal 6
pid 24347 (pfctl), jid 0, uid 884: exited on signal 6
pid 75209 (pfctl), jid 0, uid 884: exited on signal 6
pid 67322 (pfctl), jid 0, uid 0: exited on signal 6 (core dumped)
ovpnc3: link state changed to DOWN
ovpnc2: link state changed to DOWN
ovpnc1: link state changed to DOWN
Waiting (max 60 seconds) for system process `vnlru' to stop... done
Waiting (max 60 seconds) for system process `syncer' to stop...
Syncing disks, vnodes remaining... 20 3 0 0 0 done
All buffers synced.
Uptime: 31d4h9m32s
---<<BOOT>>---
Copyright (c) 1992-2021 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
        The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 13.2-RELEASE-p3 stable/23.7-n254818-f155405f505 SMP amd64
FreeBSD clang version 14.0.5 (https://github.com/llvm/llvm-project.git llvmorg-14.0.5-0-gc12386ae247c)
VT(efifb): resolution 800x600
CPU: Intel(R) Core(TM) i3-6006U CPU @ 2.00GHz (2000.00-MHz K8-class CPU)
  Origin="GenuineIntel"  Id=0x406e3  Family=0x6  Model=0x4e  Stepping=3
  Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
  Features2=0x7ffafbbf<SSE3,PCLMULQDQ,DTES64,MON,DS_CPL,VMX,EST,TM2,SSSE3,SDBG,FMA,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,TSCDLT,AESNI,XSAVE,OSXSAVE,AVX,F16C,RDRAND>
  AMD Features=0x2c100800<SYSCALL,NX,Page1GB,RDTSCP,LM>
  AMD Features2=0x121<LAHF,ABM,Prefetch>

regular startup messages continue....
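For reading the "exited on signal" lines: on FreeBSD, signal 10 is SIGBUS and signal 6 is SIGABRT (the numbering differs from Linux, where 10 is SIGUSR1). A small Python sketch like the one below could tally such lines per process and signal; the regular expression and function name are assumptions written for this excerpt, and the signal names are hard-coded rather than taken from the machine running the script.

```python
import re
from collections import Counter

# FreeBSD signal numbers seen in the excerpt. Hard-coded on purpose:
# Python's signal module reports the numbering of the host it runs on.
FREEBSD_SIGNALS = {6: "SIGABRT", 10: "SIGBUS"}

EXIT_RE = re.compile(
    r"^pid (?P<pid>\d+) \((?P<proc>[^)]+)\), jid \d+, uid (?P<uid>\d+): "
    r"exited on signal (?P<sig>\d+)(?P<core> \(core dumped\))?$"
)


def tally_signal_exits(lines):
    """Count 'exited on signal' lines, keyed by (process name, signal name)."""
    counts = Counter()
    for line in lines:
        m = EXIT_RE.match(line.strip())
        if m:
            sig = int(m["sig"])
            name = FREEBSD_SIGNALS.get(sig, f"signal {sig}")
            counts[(m["proc"], name)] += 1
    return counts


# A few lines copied from the dmesg excerpt above.
excerpt = [
    "pid 11520 (sh), jid 0, uid 0: exited on signal 10",
    "pid 11623 (sh), jid 0, uid 0: exited on signal 10",
    "pid 96898 (pfctl), jid 0, uid 0: exited on signal 6 (core dumped)",
    "pid 99746 (pfctl), jid 0, uid 884: exited on signal 6",
    "ovpnc3: link state changed to DOWN",
]

for (proc, sig_name), n in sorted(tally_signal_exits(excerpt).items()):
    print(f"{proc}: {sig_name} x{n}")
```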
#3
Recently upgraded to OPNsense 23.1.5_4-amd64

At midnight on each of the past 3 days, all traffic has stopped getting through.
The errors in the log are as follows:

2023-04-03T00:00:00-04:00   Error   configd.py   [520e7553-48a1-49cf-91a7-61c4955d2ed1] Inline action failed with not all arguments converted during string formatting at Traceback (most recent call last): File "/usr/local/opnsense/service/modules/processhandler.py", line 506, in execute inline_act_parameters = self.parameters % tuple(parameters) TypeError: not all arguments converted during string formatting

2023-04-03T00:00:01-04:00   Error   configd.py   [f6f504ec-1676-4279-99da-815a22c5089f] Script action failed with Command 'configctl template reload OPNsense/HAProxy 2 > /dev/null; /usr/local/opnsense/scripts/OPNsense/HAProxy/syncCerts.py sync --output json ' returned non-zero exit status 1. at Traceback (most recent call last): File "/usr/local/opnsense/service/modules/processhandler.py", line 482, in execute subprocess.check_call(script_command, env=self.config_environment, shell=True, File "/usr/local/lib/python3.9/subprocess.py", line 373, in check_call raise CalledProcessError(retcode, cmd) subprocess.CalledProcessError: Command 'configctl template reload OPNsense/HAProxy 2 > /dev/null; /usr/local/opnsense/scripts/OPNsense/HAProxy/syncCerts.py sync --output json ' returned non-zero exit status 1.

My guess is that something was not converted correctly during the upgrade, possibly in HAProxy?

Can I re-apply the upgrade...or...?
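For context on the first error: the traceback shows the failure at `self.parameters % tuple(parameters)`, and Python raises exactly this TypeError when a `%`-style template receives more arguments than it has placeholders. The sketch below reproduces the message in isolation; the template string and argument values are hypothetical and are not taken from the configd action definitions.

```python
def format_action(template, parameters):
    """Mimic the line from the traceback: template % tuple(parameters)."""
    return template % tuple(parameters)


# Hypothetical template with one placeholder and one argument: this works.
print(format_action("reload %s", ["haproxy"]))

# The same template with two arguments fails the way the log shows.
try:
    format_action("reload %s", ["haproxy", "extra"])
except TypeError as exc:
    print(f"TypeError: {exc}")
```

This only illustrates the class of mismatch (argument count versus placeholders in an action's parameter template); which action and which arguments are involved cannot be determined from the log lines above.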