OPNsense Forum

English Forums => High availability => Topic started by: Continuity on August 23, 2022, 12:40:42 pm

Title: DHCP not working on cluster, wrong listen port
Post by: Continuity on August 23, 2022, 12:40:42 pm
Hi all

I think this is a bug/issue, but maybe you can see a misconfiguration.

Two firewalls in a cluster, each with 6 interfaces:
- igb0 between the firewalls, for pfsync;
- igb1 used for emergencies: I can connect a PC here and manage the firewall in a worst-case scenario;
- igb2 to igb5 aggregated as lagg0 to the switch.

All the VLANs we need are on lagg0.

The ifconfig output is at the end of this post.

The DHCP server runs on two VLANs, vlan01 and vlan03.

The clients don't get an IP address from DHCP; they literally receive no response from the DHCP servers (checked with Wireshark).

The log file was full of:
Code: [Select]
2022-08-19T17:17:27 Error dhcpd DHCPDISCOVER from 0a:c4:ad:4b:47:fd via vlan03: peer holds all free leases
2022-08-19T17:17:26 Error dhcpd DHCPDISCOVER from 68:f7:28:fc:c9:f3 via vlan01: peer holds all free leases
2022-08-19T17:17:16 Error dhcpd DHCPDISCOVER from 68:f7:28:fc:c9:f3 via vlan01: peer holds all free leases

Looking around (I can't really remember how), I found that the DHCP servers could not communicate with each other.
The auto-generated firewall rule is in place, and there is also a manual rule that permits traffic between the firewalls on TCP ports 519 and 520.

Looking at the listening ports revealed the mistake: both DHCP daemons were listening on port 520.

Code: [Select]
root@fw-slave:~ # netstat -na
Active Internet connections (including servers)
Proto Recv-Q Send-Q Local Address          Foreign Address        (state)   
[...]
tcp4       0      0 10.203.1.252.520       *.*                    LISTEN     
tcp4       0      0 10.203.5.252.520       *.*                    LISTEN     
[...]

A packet capture showed that the secondary was trying to open a TCP connection to port 519 on the primary, but the primary was listening on port 520.
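
The port mismatch can be sketched as a small check. This is only an illustration in Python, not OPNsense code: the `FailoverEnd` type and the `can_connect` helper are hypothetical names, and the port numbers are the ones seen in the netstat output and the packet capture.

```python
from dataclasses import dataclass


@dataclass
class FailoverEnd:
    """One side of a DHCP failover pair (hypothetical helper type)."""
    name: str
    listen_port: int  # TCP port the local dhcpd listens on
    peer_port: int    # TCP port it dials on the other node


def can_connect(a: FailoverEnd, b: FailoverEnd) -> bool:
    """Each side must dial the port the other side listens on."""
    return a.peer_port == b.listen_port and b.peer_port == a.listen_port


# Observed state: both nodes ended up with the "secondary" port layout.
broken_primary = FailoverEnd("fw-master", listen_port=520, peer_port=519)
secondary = FailoverEnd("fw-slave", listen_port=520, peer_port=519)

# Expected state: the primary listens on 519 and dials 520.
fixed_primary = FailoverEnd("fw-master", listen_port=519, peer_port=520)

print(can_connect(broken_primary, secondary))  # False
print(can_connect(fixed_primary, secondary))   # True
```

With both nodes listening on 520 and dialing 519, neither connection attempt finds a listener, which fits the "peer holds all free leases" errors in the log.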

According to /var/dhcpd/etc/dhcpd.conf, both nodes were configured to listen on 520. The secondary's file is at the end of this post for reference.
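
To compare the two nodes quickly, the role and ports can be pulled out of each dhcpd.conf with a few lines of Python. This is a rough sketch, not an OPNsense tool: `failover_peers` is a hypothetical helper, and `SAMPLE` is a trimmed copy of one failover block from the secondary's file in this post. The regular expressions assume the simple one-statement-per-line layout shown there.

```python
import re

# Trimmed copy of one failover block from the secondary's dhcpd.conf.
SAMPLE = '''
failover peer "dhcp_opt4" {
  secondary;
  address 10.203.1.252;
  port 520;
  peer address 10.203.1.253;
  peer port 519;
}
'''


def failover_peers(conf_text):
    """Return {peer_name: {'role', 'port', 'peer_port'}} per failover block."""
    peers = {}
    # Only declarations followed by "{" match, not the references inside pools.
    block = re.compile(r'failover peer "([^"]+)"\s*\{(.*?)\}', re.S)
    for name, body in block.findall(conf_text):
        role = re.search(r'^\s*(primary|secondary);', body, re.M)
        port = re.search(r'^\s*port (\d+);', body, re.M)
        peer_port = re.search(r'^\s*peer port (\d+);', body, re.M)
        peers[name] = {
            "role": role.group(1) if role else None,
            "port": int(port.group(1)) if port else None,
            "peer_port": int(peer_port.group(1)) if peer_port else None,
        }
    return peers


print(failover_peers(SAMPLE))
```

Running the same function on the file from each node should show one `primary` and one `secondary` per peer name in a healthy pair; two `secondary` entries point to the problem described here.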

As a workaround, I modified /usr/local/etc/inc/plugins.inc.d/dhcpd.inc on the master to force it to act as primary:

Code: [Select]
root@fw-master:~ # grep -B3 -A35 ZETSU /usr/local/etc/inc/plugins.inc.d/dhcpd.inc

        if (!empty($dhcpifconf['failover_peerip'])) {
            $intip = get_interface_ip($dhcpif, $ifconfig_details);
            /* ZETSU $failover_primary = false; */
            $failover_primary = true;
            if (!empty($config['virtualip']['vip'])) {
                foreach ($config['virtualip']['vip'] as $vipent) {
                    if ($vipent['interface'] == $dhcpif) {
                        $carp_nw = gen_subnet($vipent['subnet'], $vipent['subnet_bits']);
                        if (ip_in_subnet($dhcpifconf['failover_peerip'], "{$carp_nw}/{$vipent['subnet_bits']}")) {
                            /* this is the interface! */
                            if (is_numeric($vipent['advskew']) && (intval($vipent['advskew']) < 20)) {
                                $failover_primary = true;
                            }
                            break;
                        }
                    }
                }
            } else {
                log_error('Warning! DHCP Failover setup and no CARP virtual IPs defined!');
            }
            $dhcpdconf_pri = "";
            if ($failover_primary) {
                $my_port = "519";
                $peer_port = "520";
                $type = "primary";
                $dhcpdconf_pri  = "split 128;\n";
                if (isset($dhcpifconf['failover_split'])) {
                    $dhcpdconf_pri  = "split {$dhcpifconf['failover_split']};\n";
                }
                $dhcpdconf_pri .= "  mclt 600;\n";
            } else {
                $type = "secondary";
                $my_port = "520";
                $peer_port = "519";
            }

I have not taken the time to read and understand the if statement.
So the question is:
Why does it decide that the master is not the master?


ifconfig on primary
Code: [Select]
root@fw-master:~ # ifconfig
igb0: flags=8863<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        description: Cluster
        options=4800028<VLAN_MTU,JUMBO_MTU,NOMAP>
        ether 00:30:18:01:6c:28
        inet 10.203.0.1 netmask 0xfffffff8 broadcast 10.203.0.7
        media: Ethernet autoselect (1000baseT <full-duplex>)
        status: active
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
igb1: flags=8863<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        description: Emergency
        options=4800028<VLAN_MTU,JUMBO_MTU,NOMAP>
        ether 00:30:18:01:6c:29
        inet 192.168.23.253 netmask 0xffffff00 broadcast 192.168.23.255
        media: Ethernet autoselect
        status: no carrier
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
igb2: flags=8963<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=4800028<VLAN_MTU,JUMBO_MTU,NOMAP>
        ether 00:30:18:01:6c:2a
        media: Ethernet autoselect (1000baseT <full-duplex>)
        status: active
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
igb3: flags=8963<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=4800028<VLAN_MTU,JUMBO_MTU,NOMAP>
        ether 00:30:18:01:6c:2a
        hwaddr 00:30:18:01:6c:2b
        media: Ethernet autoselect (1000baseT <full-duplex>)
        status: active
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
igb4: flags=8963<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=4800028<VLAN_MTU,JUMBO_MTU,NOMAP>
        ether 00:30:18:01:6c:2a
        hwaddr 00:30:18:01:6c:2c
        media: Ethernet autoselect (1000baseT <full-duplex>)
        status: active
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
igb5: flags=8963<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=4800028<VLAN_MTU,JUMBO_MTU,NOMAP>
        ether 00:30:18:01:6c:2a
        hwaddr 00:30:18:01:6c:2d
        media: Ethernet autoselect (1000baseT <full-duplex>)
        status: active
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
enc0: flags=0<> metric 0 mtu 1536
        groups: enc
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
        options=680003<RXCSUM,TXCSUM,LINKSTATE,RXCSUM_IPV6,TXCSUM_IPV6>
        inet6 ::1 prefixlen 128
        inet6 fe80::1%lo0 prefixlen 64 scopeid 0x8
        inet 127.0.0.1 netmask 0xff000000
        groups: lo
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
pflog0: flags=20100<PROMISC,PPROMISC> metric 0 mtu 33160
        groups: pflog
pfsync0: flags=41<UP,RUNNING> metric 0 mtu 1500
        pfsync: syncdev: igb0 syncpeer: 10.203.0.2 maxupd: 128 defer: off
        syncok: 1
        groups: pfsync
lagg0: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=4800028<VLAN_MTU,JUMBO_MTU,NOMAP>
        ether 00:30:18:01:6c:2a
        laggproto lacp lagghash l2,l3,l4
        laggport: igb2 flags=1c<ACTIVE,COLLECTING,DISTRIBUTING>
        laggport: igb3 flags=1c<ACTIVE,COLLECTING,DISTRIBUTING>
        laggport: igb4 flags=1c<ACTIVE,COLLECTING,DISTRIBUTING>
        laggport: igb5 flags=1c<ACTIVE,COLLECTING,DISTRIBUTING>
        groups: lagg
        media: Ethernet autoselect
        status: active
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
vlan01: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
        description: Management
        options=4000000<NOMAP>
        ether 00:30:18:01:6c:2a
        inet 10.203.5.253 netmask 0xffffff00 broadcast 10.203.5.255
        inet 10.203.5.254 netmask 0xffffffff broadcast 10.203.5.254 vhid 3
        groups: vlan
        carp: MASTER vhid 3 advbase 1 advskew 0
        vlan: 5 vlanproto: 802.1q vlanpcp: 0 parent interface: lagg0
        media: Ethernet autoselect
        status: active
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
vlan02: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
        description: LAN
        options=4000000<NOMAP>
        ether 00:30:18:01:6c:2a
        inet 192.168.0.213 netmask 0xffffff00 broadcast 192.168.0.255
        inet 192.168.0.254 netmask 0xffffffff broadcast 192.168.0.254 vhid 4
        groups: vlan
        carp: MASTER vhid 4 advbase 1 advskew 0
        vlan: 2 vlanproto: 802.1q vlanpcp: 0 parent interface: lagg0
        media: Ethernet autoselect
        status: active
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
vlan03: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
        description: Ospiti
        options=4000000<NOMAP>
        ether 00:30:18:01:6c:2a
        inet 10.203.1.253 netmask 0xffffff00 broadcast 10.203.1.255
        inet 10.203.1.254 netmask 0xffffffff broadcast 10.203.1.254 vhid 1
        groups: vlan
        carp: MASTER vhid 1 advbase 1 advskew 0
        vlan: 4 vlanproto: 802.1q vlanpcp: 0 parent interface: lagg0
        media: Ethernet autoselect
        status: active
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
vlan04: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
        description: WAN
        options=4000000<NOMAP>
        ether 00:30:18:01:6c:2a
        inet [snip...] netmask 0xfffffff8 broadcast 185.100.109.151
        inet [snip...] netmask 0xfffffffc broadcast 185.100.109.151 vhid 2
        groups: vlan
        carp: MASTER vhid 2 advbase 1 advskew 0
        vlan: 99 vlanproto: 802.1q vlanpcp: 0 parent interface: lagg0
        media: Ethernet autoselect
        status: active
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
ovpns1: flags=8051<UP,POINTOPOINT,RUNNING,MULTICAST> metric 0 mtu 1500
        options=80000<LINKSTATE>
        inet 192.168.203.1 --> 192.168.203.2 netmask 0xfffffff8
        groups: tun openvpn
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
        Opened by PID 54350
ovpns2: flags=8051<UP,POINTOPOINT,RUNNING,MULTICAST> metric 0 mtu 1500
        options=80000<LINKSTATE>
        inet 192.168.203.33 --> 192.168.203.34 netmask 0xffffffe0
        groups: tun openvpn
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
        Opened by PID 93215
root@fw-master:~ #

ifconfig on secondary
Code: [Select]
root@fw-slave:~ # ifconfig
igb0: flags=8863<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        description: Cluster
        options=4800028<VLAN_MTU,JUMBO_MTU,NOMAP>
        ether 00:30:18:01:66:b2
        inet 10.203.0.2 netmask 0xfffffff8 broadcast 10.203.0.7
        media: Ethernet autoselect (1000baseT <full-duplex>)
        status: active
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
igb1: flags=8863<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        description: Emergency
        options=4800028<VLAN_MTU,JUMBO_MTU,NOMAP>
        ether 00:30:18:01:66:b3
        inet 192.168.23.252 netmask 0xffffff00 broadcast 192.168.23.255
        media: Ethernet autoselect
        status: no carrier
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
igb2: flags=8963<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=4800028<VLAN_MTU,JUMBO_MTU,NOMAP>
        ether 00:30:18:01:66:b4
        media: Ethernet autoselect (1000baseT <full-duplex>)
        status: active
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
igb3: flags=8963<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=4800028<VLAN_MTU,JUMBO_MTU,NOMAP>
        ether 00:30:18:01:66:b4
        hwaddr 00:30:18:01:66:b5
        media: Ethernet autoselect (1000baseT <full-duplex>)
        status: active
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
igb4: flags=8963<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=4800028<VLAN_MTU,JUMBO_MTU,NOMAP>
        ether 00:30:18:01:66:b4
        hwaddr 00:30:18:01:66:b6
        media: Ethernet autoselect (1000baseT <full-duplex>)
        status: active
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
igb5: flags=8963<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=4800028<VLAN_MTU,JUMBO_MTU,NOMAP>
        ether 00:30:18:01:66:b4
        hwaddr 00:30:18:01:66:b7
        media: Ethernet autoselect (1000baseT <full-duplex>)
        status: active
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
        options=680003<RXCSUM,TXCSUM,LINKSTATE,RXCSUM_IPV6,TXCSUM_IPV6>
        inet6 ::1 prefixlen 128
        inet6 fe80::1%lo0 prefixlen 64 scopeid 0x7
        inet 127.0.0.1 netmask 0xff000000
        groups: lo
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
enc0: flags=0<> metric 0 mtu 1536
        groups: enc
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
pfsync0: flags=0<> metric 0 mtu 1500
        groups: pfsync
pflog0: flags=20100<PROMISC,PPROMISC> metric 0 mtu 33160
        groups: pflog
lagg0: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=4800028<VLAN_MTU,JUMBO_MTU,NOMAP>
        ether 00:30:18:01:66:b4
        laggproto lacp lagghash l2,l3,l4
        laggport: igb2 flags=1c<ACTIVE,COLLECTING,DISTRIBUTING>
        laggport: igb3 flags=1c<ACTIVE,COLLECTING,DISTRIBUTING>
        laggport: igb4 flags=1c<ACTIVE,COLLECTING,DISTRIBUTING>
        laggport: igb5 flags=1c<ACTIVE,COLLECTING,DISTRIBUTING>
        groups: lagg
        media: Ethernet autoselect
        status: active
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
vlan01: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
        description: Management
        options=4000000<NOMAP>
        ether 00:30:18:01:66:b4
        inet 10.203.5.252 netmask 0xffffff00 broadcast 10.203.5.255
        inet 10.203.5.254 netmask 0xffffffff broadcast 10.203.5.254 vhid 3
        groups: vlan
        carp: BACKUP vhid 3 advbase 1 advskew 100
        vlan: 5 vlanproto: 802.1q vlanpcp: 0 parent interface: lagg0
        media: Ethernet autoselect
        status: active
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
vlan02: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
        description: LAN
        options=4000000<NOMAP>
        ether 00:30:18:01:66:b4
        inet 192.168.0.212 netmask 0xffffff00 broadcast 192.168.0.255
        inet 192.168.0.254 netmask 0xffffffff broadcast 192.168.0.254 vhid 4
        groups: vlan
        carp: BACKUP vhid 4 advbase 1 advskew 100
        vlan: 2 vlanproto: 802.1q vlanpcp: 0 parent interface: lagg0
        media: Ethernet autoselect
        status: active
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
vlan03: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
        description: Ospiti
        options=4000000<NOMAP>
        ether 00:30:18:01:66:b4
        inet 10.203.1.252 netmask 0xffffff00 broadcast 10.203.1.255
        inet 10.203.1.254 netmask 0xffffffff broadcast 10.203.1.254 vhid 1
        groups: vlan
        carp: BACKUP vhid 1 advbase 1 advskew 100
        vlan: 4 vlanproto: 802.1q vlanpcp: 0 parent interface: lagg0
        media: Ethernet autoselect
        status: active
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
vlan04: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
        description: WAN
        options=4000000<NOMAP>
        ether 00:30:18:01:66:b4
        inet [snip...] netmask 0xfffffff8 broadcast 185.100.109.151
        inet [snip...] netmask 0xfffffffc broadcast 185.100.109.151 vhid 2
        groups: vlan
        carp: BACKUP vhid 2 advbase 1 advskew 100
        vlan: 99 vlanproto: 802.1q vlanpcp: 0 parent interface: lagg0
        media: Ethernet autoselect
        status: active
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
ovpns1: flags=8051<UP,POINTOPOINT,RUNNING,MULTICAST> metric 0 mtu 1500
        options=80000<LINKSTATE>
        inet 192.168.203.1 --> 192.168.203.2 netmask 0xfffffff8
        groups: tun openvpn
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
        Opened by PID 30516
ovpns2: flags=8051<UP,POINTOPOINT,RUNNING,MULTICAST> metric 0 mtu 1500
        options=80000<LINKSTATE>
        inet 192.168.203.33 --> 192.168.203.34 netmask 0xffffffe0
        groups: tun openvpn
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
        Opened by PID 30997

DHCP config file of secondary.
Code: [Select]
root@fw-slave:~ # cat /var/dhcpd/etc/dhcpd.conf
option domain-name "dave.lan";
option ldap-server code 95 = text;
option arch code 93 = unsigned integer 16; # RFC4578
option pac-webui code 252 = text;

default-lease-time 7200;
max-lease-time 86400;
log-facility local7;
one-lease-per-client true;
deny duplicates;
ping-check true;
update-conflict-detection false;
authoritative;
failover peer "dhcp_opt4" {
  secondary;
  address 10.203.1.252;
  port 520;
  peer address 10.203.1.253;
  peer port 519;
  max-response-delay 10;
  max-unacked-updates 10;
 
  load balance max seconds 3;
}

failover peer "dhcp_opt2" {
  secondary;
  address 10.203.5.252;
  port 520;
  peer address 10.203.5.253;
  peer port 519;
  max-response-delay 10;
  max-unacked-updates 10;
 
  load balance max seconds 3;
}


subnet 10.203.1.0 netmask 255.255.255.0 {
  pool {
    option domain-name-servers 10.203.1.254;
    deny dynamic bootp clients;
    failover peer "dhcp_opt4";
    range 10.203.1.100 10.203.1.199;
  }

  option routers 10.203.1.254;
  option domain-name-servers 10.203.1.254;

}

subnet 10.203.5.0 netmask 255.255.255.0 {
  pool {
    option domain-name-servers 10.203.5.254;
    deny dynamic bootp clients;
    ignore-client-uids true;
    failover peer "dhcp_opt2";
    range 10.203.5.100 10.203.5.109;
  }

  option routers 10.203.5.254;
  option domain-name-servers 10.203.5.254;

}
Title: DHCP not working on cluster, wrong listen port
Post by: Continuity on August 22, 2023, 05:40:08 pm
Hi all
We finally found the mistake.

We had set the CARP address as a /32. You can see it in the netmask:
Code: [Select]
0xffffffff.
We did this because there seemed to be no point in giving the CARP address a /24.
The interface addresses already carry the netmask, and that is sufficient to create the routing entry for "Ethernet reachable" addresses.
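
For readers not used to the hex netmasks printed by ifconfig, here is a small Python sketch that converts them to prefix lengths. The `prefix_len` helper is a hypothetical name, and it assumes a normal contiguous netmask.

```python
def prefix_len(hex_mask: str) -> int:
    """Convert an ifconfig-style hex netmask such as '0xffffff00' to a prefix length.

    Assumes a contiguous mask, so counting the set bits is enough.
    """
    return bin(int(hex_mask, 16)).count("1")


print(prefix_len("0xffffffff"))  # 32, the CARP addresses in this post
print(prefix_len("0xffffff00"))  # 24, the interface addresses on the VLANs
```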

But the code contains this check:
Code: [Select]
if (ip_in_subnet($dhcpifconf['failover_peerip'], "{$carp_nw}/{$vipent['subnet_bits']}")) {

This verifies that the failover peer IP is inside the CARP network. If the CARP network is a /32, the check is false and both DHCP servers go into "slave mode" (secondary).
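
The effect can be reproduced outside the firewall with Python's `ipaddress` module. This is a simplified sketch of the check, not the OPNsense code itself: `is_failover_primary` is a hypothetical name, it handles a single VIP instead of looping over all of them, and it skips the `is_numeric` test. The addresses are the vlan03 values from the ifconfig output in the first post.

```python
import ipaddress


def is_failover_primary(peer_ip, vip, vip_bits, advskew):
    """Primary only if the peer IP lies inside the CARP VIP's network
    and the advskew is below 20; otherwise secondary."""
    carp_net = ipaddress.ip_network(f"{vip}/{vip_bits}", strict=False)
    if ipaddress.ip_address(peer_ip) in carp_net:
        return advskew < 20
    return False


# fw-master: VIP 10.203.1.254, peer (fw-slave) 10.203.1.252, advskew 0
print(is_failover_primary("10.203.1.252", "10.203.1.254", 32, 0))    # False
print(is_failover_primary("10.203.1.252", "10.203.1.254", 24, 0))    # True
# fw-slave: peer (fw-master) 10.203.1.253, advskew 100
print(is_failover_primary("10.203.1.253", "10.203.1.254", 24, 100))  # False
```

With a /32 VIP the network contains only 10.203.1.254 itself, so the peer address can never match and the master falls through to the secondary branch.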


So for now, use the same netmask on the CARP address and on the interface address. In my opinion this check doesn't make much sense,
because there is no need for the CARP address to be in the same network as the failover IP. Maybe it would be better to check the failover IP against the interface addresses... but that is another story.
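
To make the idea concrete, here is a minimal Python sketch of a check against the interface address instead of the CARP address. It is only an illustration of the suggestion, not a patch for dhcpd.inc: `peer_on_interface_network` is a hypothetical name, and whether this is the right fix is for the maintainers to judge.

```python
import ipaddress


def peer_on_interface_network(peer_ip, if_ip, if_bits):
    """Hypothetical alternative: test the failover peer against the
    network of the interface's own address, ignoring the CARP netmask."""
    if_net = ipaddress.ip_interface(f"{if_ip}/{if_bits}").network
    return ipaddress.ip_address(peer_ip) in if_net


# fw-master vlan03: interface 10.203.1.253/24, peer 10.203.1.252
print(peer_on_interface_network("10.203.1.252", "10.203.1.253", 24))  # True
# A peer address from another VLAN does not match
print(peer_on_interface_network("10.203.5.252", "10.203.1.253", 24))  # False
```

This variant gives the same answer whether the CARP address is configured as a /24 or a /32, because the CARP netmask is never consulted.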

Could you share your opinion about this?

Best Regards