OPNsense Forum

Archive => 18.7 Legacy Series => Topic started by: draga on October 01, 2018, 02:15:38 pm

Title: Kernel panic when unplugging WAN network interface
Post by: draga on October 01, 2018, 02:15:38 pm
Hello everybody, I've been using OPNsense for 9 months now. Until last week it was just acting as a multi-WAN gateway and firewall, but now I'm implementing some more advanced configurations (IPv6, ZeroTier, etc.).
While running some tests, I noticed that when I unplug either of my two WAN interfaces (both show the same problem), I hit a kernel panic and everything hangs. Sometimes it reboots, sometimes it just sits there and stops working.
Both of my WANs are connected via PPPoE (one is a WLAN connection, the other an ADSL line) and this is the error I get:

Code:
Fatal trap 12: page fault while in kernel mode
cpuid = 3; apic id = 03
fault virtual address   = 0x0
fault code              = supervisor read data, page not present
instruction pointer     = 0x20:0xffffffff80c2ab63
stack pointer           = 0x28:0xfffffe011abf2f60
frame pointer           = 0x28:0xfffffe011abf2f70
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 12 (irq259: igb0:que 3)

I've also tried to follow this, but no luck:
https://www.netgate.com/docs/pfsense/hardware/tuning-and-troubleshooting-network-cards.html#Intel_igb.284.29_and_em.284.29_Cards

What could it be? Searching the forum I found some similar posts related to PPPoE WAN devices, but from what I read this should have been fixed long ago. My version is OPNsense 18.7.4-amd64 - FreeBSD 11.1 RELEASE-p14 - LibreSSL 2.7.4
Thank you
Stefano
Title: Re: Kernel panic when unplugging WAN network interface
Post by: mimugmail on October 04, 2018, 10:41:43 am
When you install a fresh 18.7 image, does this also happen?
Title: Re: Kernel panic when unplugging WAN network interface
Post by: draga on October 04, 2018, 12:55:42 pm
I haven't tried yet. I'll try as soon as possible.
Do you mean also re-importing the configuration from the backup?

Thank you
Title: Re: Kernel panic when unplugging WAN network interface
Post by: mimugmail on October 04, 2018, 01:08:48 pm
Yes, you can import the backup XML. Just to be sure whether it's related to 18.7 or was already there before.
Title: Re: Kernel panic when unplugging WAN network interface
Post by: draga on October 12, 2018, 07:05:58 pm
Hello,
sorry for the long delay, I had a bad flu.

I just tested a new installation from the ISO and a restore from the backup. No updates. Same result.

Thank you.
Title: Re: Kernel panic when unplugging WAN network interface
Post by: draga on January 08, 2019, 12:07:43 pm
Hello everybody,
I'm still having the same problem. This morning one of my two WANs is flapping and, from time to time, the PPPoE connection drops (it's a wireless provider). If the outage lasts for more than a minute or two, OPNsense hangs completely and I have to restart the APU manually, otherwise it stays stuck.

Is there anything I can do or try to avoid this? It's not a big issue when I'm here, but it is a serious one when I'm away and the APU stays hung for days.

Thank you!
Title: Re: Kernel panic when unplugging WAN network interface
Post by: AC on January 08, 2019, 03:08:48 pm
I had that kind of problem too.
Do you have the "Kill states" option enabled? (Firewall->Settings->Advanced-> Gateway Monitoring)
Title: Re: Kernel panic when unplugging WAN network interface
Post by: draga on January 08, 2019, 03:40:40 pm
Yes, I had that checked. I've now unchecked it and will see what happens. Thank you, I will report back here ASAP.
Title: Re: Kernel panic when unplugging WAN network interface
Post by: draga on January 08, 2019, 07:49:28 pm
It didn't work. As soon as I unplug one of the WANs, the system hangs :(
Title: Re: Kernel panic when unplugging WAN network interface
Post by: AC on January 09, 2019, 09:31:18 am
Under Firewall->Settings->Advanced, try with

unchecked:
"Kill states"
"Bind states to Interface"

checked:
"Use sticky Connection" (with MultiWAN)
"Shared forwarding"
"Gateway switching"
Title: Re: Kernel panic when unplugging WAN network interface
Post by: draga on January 09, 2019, 10:42:30 am
Thank you. Actually, everything was already set that way except sticky connections.
Unfortunately, it didn't help. Here's what appears on the console before the APU hangs:

Fatal trap 12: page fault while in kernel mode
cpuid = 3; apic id = 03
fault virtual address   = 0x0
fault code              = supervisor read data, page not present
instruction pointer     = 0x20:0xffffffff80c2f323
stack pointer           = 0x28:0xfffffe0120f9f5e0
frame pointer           = 0x28:0xfffffe0120f9f620
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 31763 (dpinger)
Title: Re: Kernel panic when unplugging WAN network interface
Post by: AC on January 09, 2019, 01:54:08 pm
Ok. Those were the options that caused trouble at our company. I don't know where the problem comes from. I see a "page fault" in this message. Maybe you could run a memtest on that device, just to confirm that the RAM is ok.

Maybe you could also run a tcpdump over SSH while disconnecting an interface, so you can see what happens on the wire.

stack pointer, frame pointer... sorry. That's where I'm out. :-\
Title: Re: Kernel panic when unplugging WAN network interface
Post by: schnipp on January 09, 2019, 05:34:50 pm
The error looks like a software bug in the NIC driver or kernel. Unplugging the interface (network cable) triggers a hardware interrupt (irq259 in the error message). The code behind that accesses a virtual memory address which is not mapped to a physical memory page (here 0x0, which suggests a NULL pointer dereference).

To avoid undefined system behaviour and possible data corruption, the kernel panics and stops (fail-stop). In this case, the NIC driver or kernel needs an update. Is the NIC driver installed separately or shipped with OPNsense? In the latter case the BSD kernel team needs a bug report.
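
To make this reading of the trap message a bit more concrete, here is a minimal Python sketch that pulls out the fields that matter most for a bug report. The function names `parse_trap` and `summarize` are made up for illustration, and the "address in the first page means NULL dereference" check is only a heuristic, not something the kernel reports itself. The sample text is taken from the first trace in this thread.

```python
import re

# Excerpt of the first trap message posted in this thread.
SAMPLE_TRAP = """\
Fatal trap 12: page fault while in kernel mode
cpuid = 3; apic id = 03
fault virtual address   = 0x0
fault code              = supervisor read data, page not present
instruction pointer     = 0x20:0xffffffff80c2ab63
current process         = 12 (irq259: igb0:que 3)
"""


def parse_trap(text):
    """Collect the 'key = value' lines of a trap message into a dict."""
    fields = {}
    for line in text.splitlines():
        match = re.match(r"^([a-z ]+?)\s*=\s*(.+)$", line)
        # Skip continuation lines that have no key in front of the '='.
        if match and match.group(1).strip():
            fields[match.group(1).strip()] = match.group(2).strip()
    return fields


def summarize(fields):
    """Extract the details that are most useful in a bug report."""
    address = int(fields["fault virtual address"], 16)
    # The instruction pointer is printed as "segment:offset"; keep the offset.
    offset = fields["instruction pointer"].split(":")[1]
    pid, _, name = fields["current process"].partition(" ")
    return {
        # Heuristic (assumption): a fault inside the first 4 KiB page is
        # usually a NULL pointer, possibly plus a small struct offset.
        "null_dereference": address < 0x1000,
        "instruction_pointer": offset,
        "pid": int(pid),
        "thread": name.strip("()"),
    }


print(summarize(parse_trap(SAMPLE_TRAP)))
```

Run against the second trace posted in this thread, the same code would report `dpinger` as the current process instead of the igb0 interrupt thread, with a different instruction pointer. Both values are worth including in a bug report, since mapping the instruction pointer to a function name needs the matching kernel debug symbols, which this sketch does not attempt.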
Title: Re: Kernel panic when unplugging WAN network interface
Post by: draga on January 11, 2019, 08:38:48 am
Thank you everybody.
Yes, it is the stock OPNsense kernel, so I guess the stock FreeBSD driver. The device is an APU2C4 with Intel NICs.
Title: Re: Kernel panic when unplugging WAN network interface
Post by: draga on January 11, 2019, 09:44:44 am
As I have another APU around, with Realtek NICs, I switched to it and tried again.
I don't see any error on the serial console, but the APU hangs and becomes unreachable. So different behaviour, but same result.
Thank you.
Title: Re: Kernel panic when unplugging WAN network interface
Post by: ubi_au on March 01, 2019, 09:16:46 am
Same situation as Stefano, on a Deciso A10 board with OPNsense 19.1.1-amd64 and 18.7 (tested on 3 different firewalls): DHCP WAN, 2 OPT interfaces, LAN and ZeroTier, and the firewall randomly crashes with "Fatal trap 12: page fault while in kernel mode...  igb[number]:que..."
A really frustrating situation! Does anyone have a hint, please?
Title: Re: Kernel panic when unplugging WAN network interface
Post by: newsense on March 02, 2019, 08:03:59 am
Interfaces - Settings

Make sure the three hardware settings (CRC, TSO and LRO) are checked.

Save and reboot
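
As far as I know, checking those three boxes disables the corresponding hardware offloads. To confirm after the reboot that they are really off, one could look at the `options=` line that `ifconfig` prints for the interface. Below is a small Python sketch of that check. The helper name `active_offloads` is made up, and the sample line is a hand-written illustration, not output captured from any system in this thread.

```python
import re

# Capability names that ifconfig prints for checksum, TSO and LRO offload.
OFFLOAD_FLAGS = {
    "RXCSUM", "TXCSUM", "RXCSUM_IPV6", "TXCSUM_IPV6", "TSO4", "TSO6", "LRO",
}


def active_offloads(ifconfig_output):
    """Return the offload capabilities still enabled according to ifconfig."""
    match = re.search(r"options=[0-9a-fA-F]+<([^>]*)>", ifconfig_output)
    if not match:
        return []
    enabled = match.group(1).split(",")
    return sorted(flag for flag in enabled if flag in OFFLOAD_FLAGS)


# Hand-written sample (assumption), shaped like an interface with offloads on.
SAMPLE = "\toptions=503<RXCSUM,TXCSUM,TSO4,LRO>\n"

print(active_offloads(SAMPLE))
```

An empty list would mean none of these offloads is active. For a quick test without going through the GUI, `ifconfig igb0 -rxcsum -txcsum -tso -lro` should switch them off at runtime (igb0 is just the interface name from the trace above), but that change does not survive a reboot.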
Title: Re: Kernel panic when unplugging WAN network interface
Post by: ubi_au on March 06, 2019, 05:08:23 pm
I tried with and without CRC, LRO and TSO checked, and the system crashed either way. For the moment, without DHCP on the WAN, I haven't experienced any further crashes. Seems strange. I'll wait and observe ;-)