Could not access opnsense from either web ui or ssh post reboot

Started by bob9744, May 04, 2023, 03:28:49 PM

Hi everyone,

Running 23.1.7.

I happened to reboot my router this morning after seeing a "fq_codel_new_sched cannot allocate memory" message in the logs, and discovered that I could no longer connect via either the web ui or ssh from vlan01.

Fortunately, I thought to plug a spare machine into igc0 (LAN), thinking that either something had locked me out or that bringing a different port up might kickstart things, and it did. Shortly after I tried to connect via ssh on the LAN port, the web ui and ssh came up and stayed up, including for vlan01 (igc2).

I rebooted a few times to try to catch where it happens. If I act quickly, I can log in from a machine on vlan01 (igc2) and then watch it boot me off, which happens when igc2 goes down after the 'iflib_netmap_config txr 4 rxr 4 txd 1024 rxd 1024 rbufsz 2048' message.
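
In case anyone wants to time exactly when access drops during boot, a small probe along these lines might help. It's only a sketch: the router address (192.168.1.1) and the ports (22 for ssh, 443 for the web ui) are assumptions rather than my actual settings, and `port_open` and `watch` are names made up for illustration.

```python
import socket
import time


def port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


def watch(host, ports, interval=2.0, rounds=None):
    """Print a timestamped line whenever a port's reachability changes.

    Runs forever when rounds is None, otherwise stops after that many passes.
    Returns the last observed state per port.
    """
    last = {}
    n = 0
    while rounds is None or n < rounds:
        for port in ports:
            state = port_open(host, port)
            if last.get(port) != state:
                stamp = time.strftime("%H:%M:%S")
                label = "open" if state else "closed"
                print(f"{stamp} {host}:{port} -> {label}")
                last[port] = state
        n += 1
        if rounds is None or n < rounds:
            time.sleep(interval)
    return last


# Example (hypothetical address and ports), run from a machine on vlan01:
# watch("192.168.1.1", [22, 443])
```

Started right after a reboot, the timestamps of the "closed" lines can be lined up against the dmesg output to see which event the drop follows.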

Note as well that it's just access to the router that's blocked from igc2 - I can access the internet and other things on my network without problem.

Here are logs from dmesg:


Dual Console: Video Primary, Serial Secondary
igc1: link state changed to UP
igc2: link state changed to UP
lo0: link state changed to UP
coretemp0: <CPU On-Die Thermal Sensors> on cpu0
pflog0: permanently promiscuous mode enabled
igc2: link state changed to DOWN
vlan0: changing name to 'vlan01'
vlan1: changing name to 'vlan02'
vlan2: changing name to 'vlan03'
vlan3: changing name to 'vlan04'
vlan4: changing name to 'vlan05'
igc1: link state changed to DOWN
igc2: link state changed to UP
vlan05: link state changed to UP
vlan02: link state changed to UP
vlan04: link state changed to UP
vlan03: link state changed to UP
vlan01: link state changed to UP
igc1: link state changed to UP
062.320591 [ 851] iflib_netmap_config       txr 4 rxr 4 txd 1024 rxd 1024 rbufsz 2048
062.707201 [ 851] iflib_netmap_config       txr 4 rxr 4 txd 1024 rxd 1024 rbufsz 2048
igc2: link state changed to DOWN
vlan05: link state changed to DOWN
vlan02: link state changed to DOWN
vlan04: link state changed to DOWN
vlan03: link state changed to DOWN
vlan01: link state changed to DOWN
ipfw2 (+ipv6) initialized, divert loadable, nat loadable, default to accept, logging disabled
load_dn_sched dn_sched FIFO loaded
load_dn_sched dn_sched QFQ loaded
load_dn_sched dn_sched RR loaded
load_dn_sched dn_sched WF2Q+ loaded
load_dn_sched dn_sched PRIO loaded
load_dn_sched dn_sched FQ_CODEL loaded
load_dn_sched dn_sched FQ_PIE loaded
load_dn_aqm dn_aqm CODEL loaded
load_dn_aqm dn_aqm PIE loaded
igc2: link state changed to UP
vlan05: link state changed to UP
vlan02: link state changed to UP
vlan04: link state changed to UP
vlan03: link state changed to UP
vlan01: link state changed to UP
igc0: link state changed to UP
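
To make the link flapping easier to see, something like the following could be used to pull the link state changes out of a saved dmesg dump and count how often each interface went down. This is a rough sketch: the regular expression only assumes the "link state changed to UP/DOWN" format shown above, the function names are made up, and the sample is an abridged excerpt of my log rather than the full output.

```python
import re
from collections import defaultdict

# Matches lines such as "igc2: link state changed to DOWN".
LINK_RE = re.compile(r"^(\S+): link state changed to (UP|DOWN)$")


def link_events(lines):
    """Yield (line_index, interface, state) for each link state change."""
    for i, line in enumerate(lines):
        m = LINK_RE.match(line.strip())
        if m:
            yield i, m.group(1), m.group(2)


def down_counts(lines):
    """Count how many times each interface reported DOWN."""
    counts = defaultdict(int)
    for _, iface, state in link_events(lines):
        if state == "DOWN":
            counts[iface] += 1
    return dict(counts)


# Abridged excerpt of the log above, for illustration only.
sample = """\
igc2: link state changed to UP
igc2: link state changed to DOWN
vlan01: link state changed to UP
062.320591 [ 851] iflib_netmap_config       txr 4 rxr 4 txd 1024 rxd 1024 rbufsz 2048
igc2: link state changed to DOWN
vlan01: link state changed to DOWN
igc2: link state changed to UP
""".splitlines()

print(down_counts(sample))
```

Feeding it the full dump instead of the sample (for example, the lines of a saved text file) would show whether igc2 is the only interface that bounces around the netmap messages.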


I double-checked my configuration history, and I can't see any change I've made that would cause this.

Any advice would be greatly appreciated!

Thanks!


Not sure if it's relevant, but I am running zenarmor, and have in the past applied the netmap patches and a patch dealing with dhcpv6 startup conflicts with the web ui. I'm used to the web ui having fits periodically - the fact that ssh was blocked as well, though, is a complete mystery to me.

Are there any other logs I should look at?

OK, while I was sitting here wrapping this post up, the web ui died again, showing a 503, though this time ssh has remained up.

I may be (probably am) wrong, but I feel like this is a symptom of the ongoing contention I see between dhcpv6 and the web ui, although that doesn't explain how ssh dropped off the radar this morning.

I upgraded the zenarmor engine this afternoon, which resulted in a web ui restart, and now dhcpv6 is stopped and refuses to start. Granted, I'm happy that I'm able to access opnsense via both the UI and ssh, but I'm hoping I don't need to coerce the UI and ssh into working via the LAN port each time I restart the router...

Removing ipv6 from every interface, and therefore disabling the dhcpv6 service, seems to have fixed the issue - i.e. I can reboot now without losing connectivity.