Repeated crash/reboot when Wireguard port number is wrong

Started by CosmicRay, May 06, 2020, 02:43:14 PM

Hi folks,

This morning, I wanted to test my OPNsense firewall's failover procedures for when my Wireguard tunnel goes down. To do this, I edited the peer configuration for the Wireguard tunnel and added 1 to the remote port number, breaking the configuration.
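
In wg-quick terms, the change would look roughly like the fragment below. This is only a sketch: the key, hostname, and port numbers are placeholders, and OPNsense generates the actual file from the GUI settings, so the real contents may differ.

```
[Peer]
# Placeholder values, not taken from the real configuration
PublicKey = <peer public key>
AllowedIPs = 0.0.0.0/0
# Correct remote port assumed to be 51820; adding 1 breaks the tunnel
Endpoint = vpn.example.net:51821
```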

The test itself went fine, but when I went to set the port number back, the firewall crashed.

It rebooted, and from then on, its uptime would be measured in minutes, followed by another crash.

I eventually went over to it and plugged in a VGA monitor to see what was going on, since there was nothing in the logs.  Immediately before the crash, a ton of bold text flowed by really fast.  Some sort of kernel crash messages perhaps?  I don't know where to find these on FreeBSD.

Eventually I noticed that the initial crash had happened before the fix to the Wireguard port number was committed to disk.  I fixed that port number again, and then the crashes stopped.

As Wireguard isn't even in-kernel on FreeBSD, I have no idea how this could possibly have led to this sort of crash, yet it has.

How can I help debug this?

(I used FreeBSD *way* back but most of my more recent experience is on Linux)

Some more details - any issue with the Wireguard link appears to cause this.

I have the setup documented here - Wireguard peer/local defined, an unused RFC1918 as the gateway IP, locked interface for it, gateway for it, and NAT/routing rules about it.

After several crashes, I tried disabling the NAT/routing rules and the interface but this did not resolve it.

Outright deleting the gateway and interface did resolve the issue.

This is OPNsense 20.1.6.  RAM and disk do not appear to be culprits here.


For the benefit of anyone else looking here, the workaround for now is to disable shared forwarding.

References:

GitHub issue, "Bootloop and kernel exception with trace while configuring wireguard": https://github.com/opnsense/src/issues/46

Forum thread, Wireguard issues: https://forum.opnsense.org/index.php?topic=14403

GitHub issue, "pf+route-to: panic when shared forwarding is enabled": https://github.com/opnsense/src/issues/52

FreeBSD Bugzilla, possible kernel bug: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=233955