Messages - BertQuodge

#1
Quote from: funkyd on January 13, 2025, 06:14:55 AMFor what it's worth, I also have a VP2420-4 that's been running 24.7.11_2 for the past two weeks and I haven't had any stability issues. Unfortunately I don't have any suggestions for you, but just mentioning this to rule out any common issue between 24.7.11_2 and this Protectli box.
Hi

Thanks for posting. It's great to know that others are not having issues with the VP2420-4, so it must be something specific to my setup.
#2
Just had another OPNsense crash, just over a day after the last one, right in the middle of watching a film with the family. The wife acceptance factor has reduced even further. OPNsense recovered and rebooted itself, though it took a while.

The RAM and SSD have been re-seated again, just in case. Memtest64 shows no issues.

I use LibreNMS to monitor my house equipment, and at the time of the crash OPNsense had lots of free memory and disk space and wasn't very warm. The firewall was near(ish) to a WiFi AP, so I moved the AP a few days ago in case EMI was an issue, but this hasn't helped. OPNsense seemed to be fine until I upgraded to 24.7.11, though this could be a coincidence. I've just run an "opnsense-revert -r 24.7.10 opnsense" with a reboot to see if this helps. I'm not sure if I need to run more commands to fully revert to 24.7.10. Any suggestions would be appreciated, or the number of a good divorce lawyer ;-)


Fatal trap 12: page fault while in kernel mode
cpuid = 1; apic id = 02
fault virtual address   = 0x37a891063000
fault code      = supervisor read data, page not present
instruction pointer   = 0x20:0xffffffff8109fa60
stack pointer           = 0x28:0xfffffe0037992430
frame pointer           = 0x28:0xfffffe0037992430
code segment      = base 0x0, limit 0xfffff, type 0x1b
         = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags   = interrupt enabled, resume, IOPL = 0
current process      = 0 (if_io_tqg_1)
rdi: 000037a891063000 rsi: fffffe0037992558 rdx: 0000000000000028
rcx: 0000000000098a7b  r8: 00000000000000ac  r9: 00000000a10c11ac
rax: 0000000000000000 rbx: fffff80001a65000 rbp: fffffe0037992430
r10: 00000000c7ae7521 r11: 0000000000000014 r12: fffffe0037992558
r13: 000037a891063000 r14: fffff8000fcc7300 r15: fffffe0106bdc000
trap number      = 12
panic: page fault
cpuid = 1
time = 1736709176
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe0037992120
vpanic() at vpanic+0x131/frame 0xfffffe0037992250
panic() at panic+0x43/frame 0xfffffe00379922b0
trap_fatal() at trap_fatal+0x40b/frame 0xfffffe0037992310
trap_pfault() at trap_pfault+0x46/frame 0xfffffe0037992360
calltrap() at calltrap+0x8/frame 0xfffffe0037992360
--- trap 0xc, rip = 0xffffffff8109fa60, rsp = 0xfffffe0037992430, rbp = 0xfffffe0037992430 ---
memcmp() at memcmp+0x110/frame 0xfffffe0037992430
pf_find_state() at pf_find_state+0xc0/frame 0xfffffe0037992480
pf_test_state_icmp() at pf_test_state_icmp+0x298/frame 0xfffffe00379925e0
pf_test() at pf_test+0x112c/frame 0xfffffe0037992790
pf_check_in() at pf_check_in+0x27/frame 0xfffffe00379927b0
pfil_mbuf_in() at pfil_mbuf_in+0x38/frame 0xfffffe00379927e0
ip_input() at ip_input+0x5d5/frame 0xfffffe0037992840
netisr_dispatch_src() at netisr_dispatch_src+0x9e/frame 0xfffffe0037992890
ether_demux() at ether_demux+0x149/frame 0xfffffe00379928c0
ether_nh_input() at ether_nh_input+0x36a/frame 0xfffffe0037992920
netisr_dispatch_src() at netisr_dispatch_src+0x9e/frame 0xfffffe0037992970
ether_input() at ether_input+0x56/frame 0xfffffe00379929c0
ether_demux() at ether_demux+0x8e/frame 0xfffffe00379929f0
ng_ether_rcv_upper() at ng_ether_rcv_upper+0x8c/frame 0xfffffe0037992a10
ng_apply_item() at ng_apply_item+0x13e/frame 0xfffffe0037992ab0
ng_snd_item() at ng_snd_item+0x274/frame 0xfffffe0037992af0
ng_apply_item() at ng_apply_item+0x13e/frame 0xfffffe0037992b90
ng_snd_item() at ng_snd_item+0x274/frame 0xfffffe0037992bd0
ng_ether_input() at ng_ether_input+0x4c/frame 0xfffffe0037992c00
ether_nh_input() at ether_nh_input+0x1dc/frame 0xfffffe0037992c60
netisr_dispatch_src() at netisr_dispatch_src+0x9e/frame 0xfffffe0037992cb0
ether_input() at ether_input+0x56/frame 0xfffffe0037992d00
iflib_rxeof() at iflib_rxeof+0xc0e/frame 0xfffffe0037992e00
_task_fn_rx() at _task_fn_rx+0x72/frame 0xfffffe0037992e40
gtaskqueue_run_locked() at gtaskqueue_run_locked+0x14e/frame 0xfffffe0037992ec0
gtaskqueue_thread_loop() at gtaskqueue_thread_loop+0xc2/frame 0xfffffe0037992ef0
fork_exit() at fork_exit+0x7f/frame 0xfffffe0037992f30
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe0037992f30
--- trap 0, rip = 0, rsp = 0, rbp = 0 ---
KDB: enter: panic
panic.txt: page fault
version.txt: FreeBSD 14.1-RELEASE-p6 stable/24.7-n267979-0d692990122 SMP
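When comparing panics, the frames listed after the first "--- trap 0x..." line are the interesting ones, because they show what the kernel was doing when the fault hit. The following is a minimal Python sketch for pulling those function names out of a pasted KDB backtrace. The function name `frames_after_trap` and the regular expressions are illustrative assumptions based only on the line format seen in this dump, not part of any OPNsense or FreeBSD tool.

```python
import re

# Matches frames such as
# "pf_find_state() at pf_find_state+0xc0/frame 0xfffffe0037992480".
FRAME_RE = re.compile(
    r"^(?P<func>\w+)\(\) at (?P=func)\+0x[0-9a-f]+/frame 0x[0-9a-f]+$"
)
# Matches the marker line that precedes the faulting frame.
TRAP_RE = re.compile(r"^--- trap 0x[0-9a-f]+, rip = ")


def frames_after_trap(dump: str) -> list[str]:
    """Return function names listed after the first trap marker,
    faulting frame first."""
    names = []
    seen_trap = False
    for line in dump.splitlines():
        line = line.strip()
        if TRAP_RE.match(line):
            seen_trap = True
            continue
        match = FRAME_RE.match(line)
        if seen_trap and match:
            names.append(match.group("func"))
    return names


# A few lines copied from the dump above.
SAMPLE = """\
calltrap() at calltrap+0x8/frame 0xfffffe0037992360
--- trap 0xc, rip = 0xffffffff8109fa60, rsp = 0xfffffe0037992430, rbp = 0xfffffe0037992430 ---
memcmp() at memcmp+0x110/frame 0xfffffe0037992430
pf_find_state() at pf_find_state+0xc0/frame 0xfffffe0037992480
pf_test_state_icmp() at pf_test_state_icmp+0x298/frame 0xfffffe00379925e0
"""

print(frames_after_trap(SAMPLE))
```

For this dump the faulting frame is memcmp(), called from pf_find_state(), so the fault happened during a pf state lookup.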
#3
Hi

I purchased a Protectli Vault Pro VP2420-4, Crucial 32GB DDR4 3200MHz CL22 RAM and an Integral 512GB M.2 SATA III 2280 SSD in April 2024 to run OPNsense. Since installation the system had been rock solid, with no crashes, until I upgraded to OPNsense 24.7.11_2 in December 2024. Since then I've had 3 OPNsense crashes, where the system reboots and recovers by itself. The crash reporter shows the crashes, and all 3 have been due to page faults. I've removed the memory and SSD from the Protectli and re-seated them, but the crashes still occur. The Protectli is UPS fed and no other devices on the same UPS have reported any power issues. The Protectli is in a cool environment and isn't near sources of EMI. The firewall isn't driven very hard and I use it at home. I use NUT, BGP and the DHCP server. I only use 2 ports on the Protectli: WAN access and a trunk for my home network. Interestingly, all 3 crashes have occurred after a few days of uptime while watching videos online, 2 with YouTube and one with the BBC.

The OPNSense crashes are receiving a poor wife acceptance factor, so I'd appreciate any advice on how to stop The Great British Bake Off from being interrupted ;-)

The kernel panic is shown below:

Fatal trap 12: page fault while in kernel mode
cpuid = 1; apic id = 02
fault virtual address   = 0x0
fault code      = supervisor write data, page not present
instruction pointer   = 0x20:0xffffffff82190d9c
stack pointer           = 0x28:0xffffffff82e54e00
frame pointer           = 0x28:0xffffffff82e54e30
code segment      = base 0x0, limit 0xfffff, type 0x1b
         = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags   = interrupt enabled, resume, IOPL = 0
current process      = 6 (pf purge)
rdi: fffff801e8d47d10 rsi: fffff801e8d47d10 rdx: 0000000095089b03
rcx: 0000000000000000  r8: 0000000022f0d653  r9: 0000000000000000
rax: 0000000000000000 rbx: fffff801e8d68dc0 rbp: ffffffff82e54e30
r10: 0000000000000000 r11: 00000000b9f5a6a9 r12: fffffe0106bdc000
r13: 00000000000877df r14: fffff801e8d47d10 r15: fffff80001b20000
trap number      = 12
panic: page fault
cpuid = 1
time = 1736625558
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xffffffff82e54af0
vpanic() at vpanic+0x131/frame 0xffffffff82e54c20
panic() at panic+0x43/frame 0xffffffff82e54c80
trap_fatal() at trap_fatal+0x40b/frame 0xffffffff82e54ce0
trap_pfault() at trap_pfault+0x46/frame 0xffffffff82e54d30
calltrap() at calltrap+0x8/frame 0xffffffff82e54d30
--- trap 0xc, rip = 0xffffffff82190d9c, rsp = 0xffffffff82e54e00, rbp = 0xffffffff82e54e30 ---
pf_detach_state() at pf_detach_state+0x5fc/frame 0xffffffff82e54e30
pf_unlink_state() at pf_unlink_state+0x290/frame 0xffffffff82e54e70
pf_purge_expired_states() at pf_purge_expired_states+0x188/frame 0xffffffff82e54ec0
pf_purge_thread() at pf_purge_thread+0x13b/frame 0xffffffff82e54ef0
fork_exit() at fork_exit+0x7f/frame 0xffffffff82e54f30
fork_trampoline() at fork_trampoline+0xe/frame 0xffffffff82e54f30
--- trap 0, rip = 0, rsp = 0, rbp = 0 ---
KDB: enter: panic
panic.txt: page fault
version.txt: FreeBSD 14.1-RELEASE-p6 stable/24.7-n267979-0d692990122 SMP
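The "time =" field in each panic dump is a Unix timestamp, which makes it possible to line the crashes up against monitoring graphs or other logs. Below is a small Python sketch that converts the two values from the dumps on this page to UTC and prints the gap between them. The helper name `to_utc` is just an illustration.

```python
from datetime import datetime, timezone

# "time =" values copied from the two panic dumps on this page.
PANIC_TIMES = [1736625558, 1736709176]


def to_utc(epoch_seconds: int) -> str:
    """Format a Unix timestamp as an ISO 8601 string in UTC."""
    return datetime.fromtimestamp(epoch_seconds, tz=timezone.utc).isoformat()


for t in PANIC_TIMES:
    print(t, to_utc(t))

# Elapsed time between the two panics, in hours.
gap_hours = (PANIC_TIMES[1] - PANIC_TIMES[0]) / 3600
print(f"gap between panics: {gap_hours:.1f} h")
```

The output is in UTC, so adjust for the local timezone before comparing it with anything that logs local time.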

EDIT: I forgot to mention, I ran memtest64 for a few hours but no errors were found.

Thanks!
#4
Many thanks for this!

I had disabled crowdsec and tried to perform an upgrade again but the same issue occurred in the logs. After reading your post I rebooted the firewall again and this time the upgrade worked!

So happy now  :)
#5
Hi

My OPNsense firewall has an issue where the system firmware update just says "fetching changelog information" and a spinning icon.

The firewall tried to update itself automatically from 24.7.5 to 24.7.6 last night. When I checked this morning the update was nearly complete: the process had got to "updating crowdsec" and was waiting for a couple of PIDs. That message was displayed over 12 hours ago, though, with no further messages. I then rebooted the firewall, and now that the unit has restarted it says it is still running 24.7.5. The firewall works and my clients have internet access etc. If I ssh to the firewall I can resolve names and ping www.google.com. I've tried changing to a different mirror but the "fetching changelog information" issue remains. If I run "fetch https://pkg.opnsense.org/FreeBSD:13:amd64/23.1/sets/changelog.txz" via ssh, the file is downloaded correctly.

I only have IPv4 access, and I read that enabling "Prefer to use IPv4 even if IPv6 is available" under settings/general might help, but this hasn't helped my system.

In the system/firmware/updates log I sometimes see an additional entry beyond "fetching changelog information", though if I leave the system running like this for 30 mins no further messages are displayed:

Updating OPNsense repository catalogue...
Waiting for another process to update repository OPNsense


I've noticed that if I check the upgrade log I have a message at the top saying "pkg-static: Warning: Major OS version upgrade detected. Running "pkg bootstrap -f" recommended". If I try to run "pkg bootstrap -f" at the command line I receive the message "The package management tool is not yet installed on your system", and I'm then asked whether I want to install the package. I've declined, as I don't want to make my situation worse without knowing the implications of installing it. This message might also be from a previous upgrade; there are no timestamps in the file, so I can't be sure when this log is from.

The system has a reliable 100Mb down internet connection and is using 4% of its RAM, with a load average of 0.21. I'd appreciate any suggestions on what I can try next.

Thanks
#6
24.1, 24.4 Legacy Series / Re: Missing Traffic Data
April 06, 2024, 10:34:02 AM
Patch applied, all working well now!

Many thanks for your help and support.
#7
24.1, 24.4 Legacy Series / Re: Missing Traffic Data
April 05, 2024, 06:53:03 PM
Brilliant, many thanks!!
#8
24.1, 24.4 Legacy Series / Missing Traffic Data
April 05, 2024, 05:53:21 PM
Hi

I'm new to OPNSense and to this forum so firstly hello & thanks for a great system!

I have installed OPNsense 24.1.5_1-amd64 on a Protectli Vault Pro VP2420-4 Port PC. I have 2 WAN connections that I'm load balancing between, and all is working well from a firewall point of view. The WAN connections are provided to me as separate ethernet connections on 2 subnets with private 192.168.x.x/24 IP addresses (this might be relevant, not sure!).

If I navigate to the Reporting/traffic page I can see graphs of the "in" and "out" data on LAN and WAN interfaces working correctly with real data. On the same page the "Top Hosts" in and out graphs are also displayed but without any data being shown. If I navigate to the "Top Talkers" tab no host data is displayed there either.

I've checked the fw's backend log and I can see the error message below, from the configd.py process, when I navigate to Reporting/Traffic:

Script action failed with Command '/usr/local/opnsense/scripts/interfaces/traffic_top.py --interfaces 'igc0'' returned non-zero exit status 1. at Traceback (most recent call last): File "/usr/local/opnsense/service/modules/actions/script_output.py", line 44, in execute subprocess.check_call(script_command, env=self.config_environment, shell=True, File "/usr/local/lib/python3.9/subprocess.py", line 373, in check_call raise CalledProcessError(retcode, cmd) subprocess.CalledProcessError: Command '/usr/local/opnsense/scripts/interfaces/traffic_top.py --interfaces 'igc0'' returned non-zero exit status 1.

If I then navigate to Top talkers I also see the same error in the backend log as above.
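The backend log entry only records that the script returned a non-zero exit status; the script's own error output is not part of that message. As a rough illustration of how the real error can be surfaced, here is a Python sketch that runs a command and captures its stderr. The helper name `run_and_report` is hypothetical, and the child process is a stand-in that deliberately raises, not the OPNsense script itself.

```python
import subprocess
import sys


def run_and_report(command: list[str]) -> tuple[int, str]:
    """Run a command and return (exit status, captured stderr)."""
    result = subprocess.run(command, capture_output=True, text=True)
    return result.returncode, result.stderr


# Stand-in for the failing script: a child Python process that raises.
status, stderr = run_and_report(
    [sys.executable, "-c", "raise AttributeError('demo failure')"]
)
print("exit status:", status)
print(stderr)
```

Unlike `subprocess.check_call`, which raises `CalledProcessError` carrying only the command and exit status, `subprocess.run` with `capture_output=True` hands back the traceback text for inspection.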

I don't know if this helps but if I run the command from the log via ssh:

/usr/local/opnsense/scripts/interfaces/traffic_top.py --interfaces 'igc0'

I receive this error message:

Traceback (most recent call last):
  File "/usr/local/opnsense/scripts/interfaces/traffic_top.py", line 154, in <module>
    if ip.is_private():
AttributeError: 'IPAddress' object has no attribute 'is_private'
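The AttributeError says that the `IPAddress` object used at line 154 of the script has no `is_private` method in the installed library version. Which library and version are involved is not stated in this post, so the following is only a sketch of the same check done with Python's standard `ipaddress` module. The helper name `is_private_address` is hypothetical and this is not the patch that was later applied.

```python
import ipaddress


def is_private_address(value) -> bool:
    """Return True if the address is in a private range.

    Accepts a string or any object whose str() is an IPv4/IPv6 address,
    so it does not depend on the methods of a third-party address class.
    """
    return ipaddress.ip_address(str(value)).is_private


for host in ("192.168.1.10", "10.0.0.5", "8.8.8.8"):
    print(host, is_private_address(host))
```

Note that in the standard module `is_private` is a property rather than a method, so it is read without parentheses.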


Interface igc0 is the fw's LAN interface and is passing LAN traffic ok. I'm using igc0 as a dot1q trunk.

Any ideas on how I could fix the issue above will be appreciated! As above, I'm new to OPNsense so user error on my part is a likely cause!

Thanks