OPNsense Forum

Archive => 15.7 Legacy Series => Topic started by: HrvojeS on October 29, 2015, 10:25:50 pm

Title: [SOLVED] OPNsense bugus WAN IP
Post by: HrvojeS on October 29, 2015, 10:25:50 pm
Hello,

I have a problem with the latest OPNsense in a VirtualBox VM: the WAN interface is assigned a bogus IP immediately after getting a correct IP from my local LAN DHCP server. I currently run pfSense 2.2.4 on my LAN and I'm trying out OPNsense out of pure interest.

My problem is reproducible, and after investigating and pulling my hair out for a day I decided to make a video and do some packet captures of both OPNsense and pfSense in VMs.

I have a pfSense (2.2.4) firewall at home with a LAN (172.20.101.0/24) that has the DHCP server turned on. My pfSense DNS servers are:
127.0.0.1
209.244.0.3
8.26.56.26
8.20.247.20
209.244.0.4

I've created static mappings for firewall1 (00:50:56:00:00:01), which is intended for the OPNsense VM WAN link, and firewall2 (00:50:56:00:00:02), which is intended for the pfSense VM WAN link.

I've created two VMs with identical hardware. Both have NIC1 bridged to my local LAN and NIC2 attached to its own unique "Internal Network". I've installed OPNsense and pfSense side by side, and after the first boot they both get the proper WAN IPs that I've reserved on my LAN DHCP server.

The fun starts at the first reboot, where the pfSense VM renews its IP and gets the proper one, but the OPNsense VM renews its IP and gets a bogus one (probably related to the public DNS servers I'm using). However, I don't think a DNS response of any kind should affect any machine's assignment of its own IP.

VM1 (firewall1.lab) is OPNsense, and IP reservation is for 172.20.101.71.
VM2 (firewall2.lab) is pfSense, and IP reservation is for 172.20.101.72.

Yes, I've verified the checksums of the downloaded images. They're OK. Yes, I've tried this many times and still get the same result. Yes, I've looked at the source code (dhclient-script[.ext] and rc.renewwanip) for both OPNsense and pfSense, and even though there are a lot of differences now, I can't seem to find the culprit.

Video of the test is posted here: https://vimeo.com/144032616

After the VMs are created and before they are turned on for the first time, packet capture is enabled for the first NIC on both VMs. Upon inspection of both packet captures, it appears there is some overlap of packets between the traces. I'm guessing that VirtualBox duplicates packets at the physical NIC layer when it doesn't know the intended target (my laptop, VM1 or VM2).

Below is a timeline of important events for this bug. For the packet captures I've used the display filter ("dns || arp || bootp") to show the important packets. My LAN has a bunch of other traffic that was captured but is of no importance.
- (vid 0:00 - 2:48): shows LAN reservation for two VMs and creation of two VMs described above
- (vid 2:48 - 3:03): packet capture enabled for both VMs on NIC1
- (vid 3:11): OPNsense VM started
- (vid 3:15): pfSense VM started
- (vid 3:34): OPNsense installer invoked
- (vid 3:36): pfSense installer invoked
- (vid 4:30): pfSense installer finishes and VM reboots (forgot to remove install media)
- (vid 4:54): pfSense VM shut down manually because it booted from install media again
- (vid 5:00): OPNsense installer finishes and VM reboots
- (vid 5:20): OPNsense install media removed
- (vid 5:27): pfSense install media removed
- (vid 5:31): OPNsense VM started (from HDD)
- (vid 5:34): pfSense VM started (from HDD)
- (vid 6:01): OPNsense interfaces manual assignment prompt
- (vid 6:04): pfSense configuring WAN interface (seems that configuration is bypassed?)
- (pcap2 #24,25,32,33) - DHCP discover, offer, request and ack (2 seconds) for firewall2 (pfSense)
- (pcap2 #34) - pfSense checking ARP to ensure no one has the IP that it was offered
- (pcap2 #35,36) - pfSense checking ARP to find out DNS server
- (pcap2 #37) - pfSense doing first DNS query against root NS
- (vid 6:06): pfSense WAN interface config is done
- (vid 6:08): OPNsense NICs manually assigned
- (vid 6:16): pfSense bootup complete showing correct IP for WAN
- (vid 6:18): OPNsense configuring WAN interface
- (pcap1 #50,51,54,55) - DHCP discover, offer, request and ack (2 seconds) for firewall1 (OPNsense)
- (pcap1 #56) - OPNsense checking ARP to ensure no one has the IP that it was offered
- (note): above pcap shows WAN IP assignment only took 2 seconds between 6:18 and 6:20. What OPNsense was doing after that is a mystery to me.
- (note): OPNsense's first DNS query is on pcap1 #79 which is 28 seconds after DHCP and WAN IP assignment. This is strange compared to pfSense.
- (vid 6:41): OPNsense WAN interface config is done (took 23secs compared to 2secs on pfSense)
- (vid 6:53): OPNsense bootup complete showing correct IP for WAN
- (vid 7:29): OPNsense reboot requested
- (vid 7:41): pfSense reboot requested
- (vid 8:08): OPNsense configuring WAN interface
- (pcap1 #191,192) - DHCP req and ack (instantaneous) for firewall1 (OPNsense)
- (pcap1 #193) - OPNsense checking ARP to ensure no one has the IP that it was offered
- (vid 8:13): OPNsense WAN interface config is done (took 5secs compared to 23secs first time)
- (pcap1 #200) - OPNsense checking ARP to find out DNS
- (pcap1 #201) - OPNsense told the MAC address of the local LAN DNS server
- (pcap1 #202) - OPNsense querying DNS for "setfirst.lab" (huh??)
- (pcap1 #203) - OPNsense got a response from local LAN DNS server (this is ad-based public DNS response)
- (pcap1 #204) - OPNsense checking ARP to ensure no one has IP 92.242.144.50 (huh?? where did this come from??)

- (vid 8:23): OPNsense bootup complete showing wrong IP (huh??)
- (vid 8:25): pfSense WAN interface is started and done almost instantaneously
- (pcap2 #729,730) - DHCP req and ack (instantaneous) for firewall2 (pfSense)
- (pcap2 #731) - pfSense checking ARP to ensure no one has the IP that it was offered
- (vid 8:35): pfSense WAN interface still showing correct IP that was reserved initially
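
The response in pcap1 #203 suggests that the upstream resolver answers lookups for nonexistent names with an address instead of an error. A minimal Python sketch for checking this follows. The function name and the injectable `resolve` parameter are illustrative assumptions and are not part of OPNsense or pfSense.

```python
import socket
import uuid


def resolver_rewrites_nxdomain(resolve=socket.gethostbyname):
    """Return the address handed back for a name that should not exist, or None.

    A well-behaved resolver fails the lookup (NXDOMAIN) for the random name.
    A resolver that substitutes an ad/search page returns an address instead.
    """
    # ".invalid" is a reserved top-level domain that should never resolve.
    probe = f"{uuid.uuid4().hex}.invalid"
    try:
        return resolve(probe)
    except OSError:  # socket.gaierror is a subclass of OSError
        return None
```

Called with no arguments it uses the system resolver; any non-None result means a lookup for a made-up name such as "setfirst.lab" could come back with a real-looking address.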

So, this is a really strange problem that I only see on the OPNsense side, and not on pfSense. After this problem happens, OPNsense doesn't route anything properly. I'm honestly really scared of using OPNsense right now as I have no idea how something like this could happen. Any insight would be greatly appreciated. Thank you in advance,

Hrv
Title: Re: OPNsense bugus WAN IP
Post by: AdSchellevis on October 30, 2015, 08:45:05 pm
Hi Hrv,

Found the bug: pfSense added an extra (non-standard) option "setfirst" to ifconfig, which is called in the dhclient-script.ext script.
Because we don't use that custom patch, ifconfig tries to resolve the name to a number when booting.
On your end it receives an IP address and sets it. I tried it on my end and saw the setfirst.<local domain> query passing as well, but on my end there's no address involved.
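
As a rough illustration of that fallback, here is a simplified Python model. It is not the actual ifconfig source: the keyword set, the `patched` flag and the `resolve` callback are all hypothetical names chosen for the sketch.

```python
import ipaddress

# Hypothetical, heavily shortened keyword set; real ifconfig knows many more.
KNOWN_KEYWORDS = {"up", "down", "alias", "delete"}


def interpret_argument(token, resolve, patched=False):
    """Classify one ifconfig-style argument.

    Returns ("keyword", token) or ("address", ip_string).
    `resolve` stands in for the system's hostname lookup.
    `patched=True` models a build that knows the extra "setfirst" option.
    """
    keywords = KNOWN_KEYWORDS | ({"setfirst"} if patched else set())
    if token in keywords:
        return ("keyword", token)
    try:
        # An IP literal is used as-is.
        return ("address", str(ipaddress.ip_address(token)))
    except ValueError:
        # Neither a keyword nor an IP literal: treat it as a hostname.
        return ("address", resolve(token))
```

With `patched=False`, the token "setfirst" falls through to the hostname lookup, so whatever address the resolver returns would end up being applied to the interface.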

The fix for this issue can be found here: https://github.com/opnsense/core/commit/fb20b901d31b51e8b07c1d803697eb241b57f476

Regards,

Ad
Title: Re: OPNsense bugus WAN IP
Post by: HrvojeS on October 31, 2015, 01:07:39 pm
Hello Ad,

Thank you for the quick response. I knew setfirst wasn't standard, but I didn't realize what difference it makes in OPNsense vs pfSense, since both have it but behave differently. I looked at the source for pfSense and wasn't able to see 'setfirst' being utilized anywhere. When you refer to a custom patch, is that something that pfSense changes during their image build process to include this function?

Thank you again for your help. I greatly appreciate it. Best regards,

Hrvoje
Title: Re: OPNsense bugus WAN IP
Post by: franco on October 31, 2015, 01:10:56 pm
pfSense essentially is FreeBSD + pfSense Operating System Modifications + GUI that operates on those specific Modifications. OPNsense attempts to cut out the OS modifications and we are down to a number of patches that you can count on one hand. We've found this significantly reduces potential problems and incompatibilities between upgrades and may eventually pave the way for non-FreeBSD OPNsense versions.