OPNsense Forum

Archive => 18.7 Legacy Series => Topic started by: JeGr on October 09, 2018, 05:22:23 pm

Title: IPv6 IPs and GW Trouble in VM
Post by: JeGr on October 09, 2018, 05:22:23 pm
Hi all,

I'm trying to set up our lab environment, and to test various setup conditions we've also routed some public IP and IP6 prefixes to the lab VMs.

We run multiple VMs in parallel in the lab (various OPNsense and pfSense VMs on a separate ESXi host) and have set up one OPNsense and one pfSense instance very simply:

- WAN: private IP range (10.y.y.131 pf / .141 opn), Gateway .1
- LAN: another private range (172.x.x.131 pf / .141 opn)

So far, no problems. Our normal firewall also has addresses in those subnets (10.y.y.254 and 172.x.x.254), so we can reach either the WAN or the LAN side as needed.

Problems first came up with IPv6. All IPs (4/6) are assigned statically, and the pfSense VM is running as expected.
IP6 was configured analogously to the above, as shown in this little ASCII diagram (hope that makes it clearer ;))


                                 WAN Uplink                               
                                10.0.250.254                             
                           2001:DB8:210f:f1cf::2/64                       
                                                                         
                                +-----------+                             
           +-----------------+--- Upstream  ---------------------+       
           |                 |  +-----------+  |                 |       
           |                 |                 |                 |       
           |                 |                 |                 |       
           | CARP: .1 & ::1  |                 |                 |       
           |                 |                 |                 |       
     ::1001| .11       ::1002| .12        ::141| .141       ::131| .131   
     +-----|-----+     +-----|-----+     +-----|-----+     +-----|-----+ 
     |Office HW 1|     |Office HW 2|     |OPNsense VM|     |pfSense VM | 
     +--|-----|--+     +--|-----|--+     +--------|--+     +--------|--+ 
  .251  |     |           |     | .252            | .141            | .131
 ::1001 |     |           |     |::1002           |::141            |::131
        |     |           |     |                 |                 |     
        |     |-----------|-----|-----------------|-----------------|     
        |       CARP: .254|                                               
        |             ::1 |           2001:DB8:146:222::/64               
        |                 |              172.22.222.0/24                 
        |                 |                  LAB-NET                     
        |                 |                                               
        +-----------------+                                               
                 |                                                       
                 |                                                       
       other internal VLANs                                               


So as one can see, simple and straightforward: same address endings for v4 and v6, and all configured statically. Clean, simple dual-stack. (The DB8 prefixes are used for simplification; the real networks on all ends are public, routable IP6.)

BUT: With the OPNsense VM I ran into trouble right after adding the static IP6 config: I couldn't reach the upstream WAN IP, no ping6, no outgoing connections. Only after a reboot did it come up with ping OK and reachable, and with DNS and MTR via v6 working. OK, perhaps a glitch. But the second problem is still there: we are trying to reach the :222::141/131 side (LAB-NET) of those VMs from another v6-only network that is homed behind our office cluster (see "other internal VLANs"). To get the routing right, we added :146:222::1 of the office cluster as an additional gateway, configured a route for the other prefix and set it up to use the new gateway on the LAB side.

Result:
pf VM -> :146:222::1 GW is green, ping6 working, test prefix can be reached via LAN(LAB) interface.
OPN VM -> :146:222::1 GW is red, no ping working, test prefix can not be reached.

Further testing revealed quite strange behaviour: ::1001 and ::1002 of the cluster can be reached, but the CARP VIP6 ::1 is "dead" according to the gateway check and can't be ping6'ed from an SSH shell either. Checking the same thing on the WAN side showed the same result, but since the GW on WAN isn't a CARP IP but the upstream switch (::2), it never came up there. So it seems the IP6 side of the OPNsense VM somehow has a problem communicating with CARP VIPs on the same local network segment (no, there are no other clusters or devices speaking CARP, HSRP or VRRP there, so no conflict).

Is that a possibility, and is there anything to check it with? Neither the pfSense VMs with 2.4.3 (FreeBSD 11.1) nor those with 2.4.4 (11.2) show that behavior, and as long as this isn't fixed the whole testing lab for v6 cases is dead.

Happy to supply additional info to debug this :)

Greets
Jens
Title: Re: IPv6 IPs and GW Trouble in VM
Post by: marjohn56 on October 09, 2018, 06:16:03 pm
Odd. Let's try and find out what's different between pf and OPN.


First, can you run netstat -rn on both and compare the results? That SHOULD give a clue.
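
A minimal sketch of that comparison, assuming the output of netstat -rn -f inet6 from each VM has been saved to a file first. The helper name routes_only_in_one and the file names in the usage comment are made up for illustration. It reduces each table to its destination and gateway columns and prints only the routes that exist on one side:

```shell
# Hypothetical helper, not part of OPNsense or pfSense.
# $1 and $2 are files holding saved `netstat -rn -f inet6` output.
routes_only_in_one() {
    a=$(mktemp) && b=$(mktemp) || return 1
    # Keep destination and gateway of real route lines only
    # (skips the "Internet6:" and column header lines).
    awk 'NF >= 3 && ($1 ~ /:/ || $1 == "default") { print $1, $2 }' "$1" | sort > "$a"
    awk 'NF >= 3 && ($1 ~ /:/ || $1 == "default") { print $1, $2 }' "$2" | sort > "$b"
    # comm -3 drops lines common to both files; lines found only in
    # the second file are printed with a leading tab.
    comm -3 "$a" "$b"
    rm -f "$a" "$b"
}

# Usage (file names are placeholders):
#   routes_only_in_one routes6-pf.txt routes6-opn.txt
```

Link-local and per-host address entries will show up as differences by design, so only the remaining lines are interesting.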
Title: Re: IPv6 IPs and GW Trouble in VM
Post by: JeGr on October 10, 2018, 02:02:19 pm
Of course, but that reveals nothing (and we double checked the settings and routes etc. so that would have been a surprise).

The IPv4 routes are identical (apart from the 131/141 IPs of the VMs themselves), and since the VMs are similar, even the link#x entries and flags match.
The IPv6 routes differ in the link-local addresses (of course) and, apart from one route that the pfSense VM has and the OPNsense VM does not (the one we would need to point to the GW that is reported down, :146:222::1), are identical too. Nothing out of the ordinary here.

Another thing that puzzles me: the OPNsense VM could not ping6 ::1002 on the LAB side (the standby node of the CARP cluster), but after(!) a ping6 from there (::1002 -> ::141), ::1002 also responded to ping6 from the OPNsense VM.

And yes, I already re-installed with a fresh 18.7 ISO inside the VM and re-created the setup by hand to make sure there is no strange configuration or other problem with the VM, the installation or anything imported.

It seems the problem is with the CARP VIPs of the cluster.

The following is also reproducible:

- reboot pf and opn VM
- ping6 from ::141 (opn) to ::131 (pf) -> no ping
- ping6 from ::131 to ::141 -> 1-2 packets fail, then ping is stable
- ping6 from ::141 to ::131 again -> ping OK
- ?

Somehow that v6 behavior seems very strange.
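
The ping part of that sequence can be wrapped in a small loop so it is easy to repeat after each reboot. This is only a sketch: check_neighbors and the PING6 override variable are made up here, and the addresses in the usage comment are the simplified DB8 ones from the diagram.

```shell
# Hypothetical helper: ping6 each given address three times and
# report whether a reply came back. PING6 can be overridden
# (e.g. for a dry run); it defaults to the system ping6.
check_neighbors() {
    for addr in "$@"; do
        if ${PING6:-ping6} -c 3 "$addr" >/dev/null 2>&1; then
            echo "$addr: ping OK"
        else
            echo "$addr: no ping"
        fi
    done
}

# Usage on the OPNsense VM (::141):
#   check_neighbors 2001:DB8:146:222::131 2001:DB8:146:222::1
```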
Title: Re: IPv6 IPs and GW Trouble in VM
Post by: JeGr on October 10, 2018, 05:13:42 pm
Adding to the last reply, I finally got the CARP VIP responding (but only temporarily) by:

Checking the NDP table:

on 131 (pfsense)
2001:DB8:146:222::1                 00:00:5e:00:01:42   vmx1 23h59m58s S R

on 141 (opnsense)
2001:DB8:146:222::1                 (incomplete)        vmx1 1s        I  1

Huh... an incomplete entry without a proper MAC? Strange.
After testing with

> ndp -d 2001:DB8:146:222::1
> ndp -s 2001:DB8:146:222::1 00:00:5e:00:01:42 temp

and checking with

> ndp -an

2001:DB8:146:222::1                 00:00:5e:00:01:42   vmx1 17814d9h3 I

I can now successfully ping6 the CARP VIP on the LAB side. Adding my v6 prefix route to the now working gateway also has the expected result.
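
Incomplete entries like the one above can be picked out of the ndp -an output with a small filter. A sketch, assuming the column layout shown in the listings above (neighbor, link-layer address, interface, ...); ndp_incomplete is a made-up helper name:

```shell
# Hypothetical helper: reads `ndp -an` output on stdin and prints
# "address interface" for every neighbor still marked (incomplete).
ndp_incomplete() {
    awk '$2 == "(incomplete)" { print $1, $3 }'
}

# Usage on the firewall shell:
#   ndp -an | ndp_incomplete
```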

That raises the question: what is wrong with the NDP/ARP/Layer2 handling on the v6 side of OPNsense? Some problems seem to disappear when the remote hosts ping the OPNsense VM, which seems to help discover/create the necessary Layer2/NDP entries. But somehow the VM isn't able to resolve them by itself?

Edit: Now the WAN GW6 is down, too. After rebooting the VM to clean it up and purge the temp NDP entries, the WAN GW, which is not a CARP/HSRP/VRRP IP, isn't working anymore either. NDP shows incomplete for both entries, with either "expired" or "1s" as the expire time. Something is really weird...

Edit2: Funny thing - leaving the VM alone and checking again 10 minutes later, the WAN GW seems to have "found its MAC" and is now working again. Can't say the same about the LAB GW; the CARP one is still incomplete/expired. So definitely some strange readings.

Greets Jens
Title: Re: IPv6 IPs and GW Trouble in VM
Post by: JeGr on October 30, 2018, 10:56:41 am
Updated to 18.7.6 - Lab6GW still down, still having problems with CARP style VIPs and IP6. Still Strange...