OPNsense Forum

Archive => 17.1 Legacy Series => Topic started by: jbakuwel on March 27, 2017, 09:26:58 am

Title: Fatal trap 12
Post by: jbakuwel on March 27, 2017, 09:26:58 am
Hi,

I'm new to OPNsense, have looked at pfSense in the past and until now have been happily rolling my own.
I've installed 17.1 running under Xen (open source) and got a hard crash; see the attached screenshot of the console.
Xen in this case is running on a PC Engines APU2, which is happily running a number of production Linux VMs, but unfortunately not OPNsense. Any suggestions?

Jan
Title: Re: Fatal trap 12
Post by: rgo on March 27, 2017, 10:57:23 am
Why would you run a hypervisor on such low-end hardware in the first place?  I would guess the hypervisor has taken up all the resources OPNsense needs, so when OPNsense goes to get them, the hypervisor errors out, which in turn causes OPNsense to error out because it cannot get the resources it needs to run.

I would guess that if you loaded OPNsense directly on the board without a hypervisor, it would come up and run with no problems.  Just my opinion.
Title: Re: Fatal trap 12
Post by: jbakuwel on March 27, 2017, 11:07:06 am
Hi rgo,

Thanks for your reply.

I have very good reasons to run VMs instead of running directly on the board, and they go beyond the topic of this post.

Also please note that the APU2 is not exactly "low-end" hardware - at least not for my purposes. It's running 5 (Linux) VMs at very pleasant speeds, some of which would be as taxing on the hardware as a fully configured OPNsense would be. In fact OPNsense (in a default post-install config, i.e. without doing anything at all really) currently has a whopping 1 GB of RAM, 4 times as much as the current VPN concentrator.

I very much doubt it's a matter of (lack of) resources. This looks like a bug to me.

Jan
Title: Re: Fatal trap 12
Post by: rgo on March 27, 2017, 11:22:56 am
Well, as a very long-time user of FreeBSD, I read the kernel error you are getting as a resource problem.  That tells me FreeBSD cannot get the resources it needs.  The dual-OS setup you are attempting is, I think, a bad idea on what I call low-end hardware.  A hypervisor should be run on systems that have resources.  4 cores @ 1 GHz with 4 GB of RAM is low-end hardware in my book.  I only consider running a hypervisor if the system has 100+ GB of RAM and more than 32 cores at over 2 GHz per core.  Then I will think about running dual OS.

In my opinion, you have already crippled OPNsense by running a hypervisor with OPNsense on top!  The tests I have done tell me OPNsense needs somewhere around 4 GHz of total CPU compute to get near 1 Gbit/s WAN<->LAN speed.  I know the Intel J1900 CPU, which is 2.0 GHz x 4 cores, has a hard time going over 1 Gbit/s WAN<->LAN throughput, and the CPU in the board you are talking about has half of that.  From the tests I have done with OPNsense, I can tell you running dual OS is a super, super bad idea on that board.

If you get it to run, I would like to know what kind of WAN<->LAN throughput you get!
Title: Re: Fatal trap 12
Post by: jbakuwel on March 27, 2017, 10:15:22 pm
Hey rgo,

I think it makes no sense to state that the only use case for virtualisation is using machines with 100+ GB of RAM and more than 32 cores. Let's agree to disagree.

Jan

Title: Re: Fatal trap 12
Post by: jbakuwel on March 27, 2017, 10:18:05 pm
Hi all,

A Google search for "xen freebsd trap 12" has many results... it seems I'm not the only one having that problem. When OPNsense boots I see that it realises it's running on a hypervisor and subsequently disables TCP offloading. My gut feeling is that this is a problem in the FreeBSD Xen drivers. I'll keep digging - I like what I've seen so far from OPNsense and would like to make it work.

Jan


Title: Re: Fatal trap 12
Post by: franco on March 27, 2017, 11:40:01 pm
What's the "bt" output on this? It's probing the length of a NULL string probably...
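
To illustrate the suspicion above: fatal trap 12 is a page fault in kernel mode, and taking the length of a NULL string means reading from address 0. The following is only a userland sketch of that failure mode and the usual guard. `safe_strlen` and `safe_strlen_examples` are hypothetical names made up for this example, not functions from the FreeBSD or OPNsense sources, and the real fix in the kernel may look quite different.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (an assumption for illustration, not kernel code):
 * returns the length of s, or 0 when s is NULL.
 */
size_t
safe_strlen(const char *s)
{
	/* strlen(NULL) would read from address 0 and fault. */
	return (s == NULL) ? 0 : strlen(s);
}

/* Small self-check showing the intended behaviour of the guard. */
void
safe_strlen_examples(void)
{
	assert(safe_strlen(NULL) == 0);   /* guarded: no dereference */
	assert(safe_strlen("") == 0);     /* empty string */
	assert(safe_strlen("igb0") == 4); /* ordinary string */
}
```

The "bt" output is what would show which caller actually passed the NULL pointer, so a guard like this is a guess at the shape of the fix rather than a diagnosis.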
Title: Re: Fatal trap 12
Post by: rgo on March 28, 2017, 07:57:48 am
You need to enable debugging in the kernel.  This will allow you to see the KDB stack backtrace dump, so you can see the error and how it occurred.  FYI, you have no address at the fault virtual address.  This means the resource is not assigned.  The error you're having is caused by OPNsense requesting something from the other side, and the other side not answering the request with an address for OPNsense.

Why do you have no address at the fault virtual address?  Because there was no resource.  Think of it like this: you have a network interface and you want to read from it.  You can open igb0 and read data.  If you cannot open igb0, then there is no resource to read from.  (This is not your problem; I am just using it as an example.)  Since you are low in the RAM space, this is where most of the OS or system is.  It could be, as you're thinking, a problem in the driver, or it could be something else.  You need KDB output to know what to chase down.  Without that you are really going on a wild chase, like looking for a needle in a 10-mile field.

I do not think the Xen driver is the problem, but I could be wrong.  KDB would turn the light on for you.  If you want to chase the problem down, I would strongly suggest you look at the debug dump.
Title: Re: Fatal trap 12
Post by: franco on March 29, 2017, 06:02:08 am
Hmm, the DDB prompt is there, just need to type "bt" and send the output for a bit of context.

We don't have debug kernels at the moment, but it's something I want to have when either base package support is added to FreeBSD or we add it as an optional (kernel) set.


Cheers,
Franco
Title: Re: Fatal trap 12
Post by: jbakuwel on April 01, 2017, 09:43:49 am
Hi Franco, rgo,

Sorry for the delay in my answer. Please see the output of "bt" attached. I've left the VM sitting at the debugger prompt, just in case you need more info. Happy to help debug this further with a debug kernel. I have many years of experience with Linux but unfortunately not much with FreeBSD. Am a reasonably quick learner - well at least I like to think I am ;-)

Jan
Title: Re: Fatal trap 12
Post by: jbakuwel on April 05, 2017, 06:15:27 am
Hi Franco,

I hope the "bt" output helps to close in on the problem.
Please let me know if there's anything I can do to help.
At this stage the trap is a show stopper for me...

kind regards,
Jan
Title: Re: Fatal trap 12
Post by: franco on April 05, 2017, 07:11:04 am
Hi Jan,

Looks like this one: https://github.com/opnsense/src/blob/master/sys/kern/kern_sysctl.c#L924

I'll ask Shawn how we could fix that.


Thanks,
Franco
Title: Re: Fatal trap 12
Post by: jbakuwel on April 06, 2017, 02:41:31 am
Hi Franco,

Many thanks, looking forward to hearing from you. It's quite easy for me to reproduce (just clicking around in the web interface seems to be sufficient); let me know if I can help beta test a new kernel.

Jan


Title: Re: Fatal trap 12
Post by: jbakuwel on April 25, 2017, 07:44:59 am
Hi Franco,

Would you be able to give me a rough estimate when you think this bug might be resolved?
I'm evaluating various systems and would very much like to keep OPNsense on the list...
Please let me know if I can help with this.

kind regards,
Jan