[SOLVED] WAN connectivity drops every 10 minutes, dhclient renews

Started by ruby404, June 07, 2023, 02:44:15 PM

I'm having a really frustrating issue with OPNsense 23.1.9-amd64 in the UK using TalkTalk fibre.

Approximately every 10 minutes, I get this message in the system log:
dhclient 3763 - [meta sequenceId="1"] Creating resolv.conf

And immediately internet connectivity fails. Wait another 10 minutes and it's back. I can also manually renew the WAN DHCP lease and the internet will start working again.

OPNsense is running on Proxmox as a virtual machine. This setup has worked for months without issues. I recently did some network restructuring, including some changes to the network interfaces and adding an Aruba network switch on the LAN. I'm not sure whether this problem is related to those changes or it's a coincidence that it started around the same time.

Looking at /var/db/dhclient.leases.vtnet0 (WAN interface) I see multiple lease entries (note: IPs edited to start with X.X):
lease {
  interface "vtnet0";
  fixed-address X.X.68.206;
  option subnet-mask 255.255.240.0;
  option routers X.X.64.1;
  option domain-name-servers X.X.79.79,X.X.79.80;
  option dhcp-lease-time 900;
  option dhcp-message-type 5;
  option dhcp-server-identifier X.X.64.1;
  renew 3 2023/6/7 10:40:24;
  rebind 3 2023/6/7 10:45:58;
  expire 3 2023/6/7 10:47:54;
}
lease {
  interface "vtnet0";
  fixed-address X.X.213.96;
  option subnet-mask 255.255.224.0;
  option routers X.X.192.1;
  option domain-name-servers X.X.79.79,X.X.79.80;
  option dhcp-lease-time 900;
  option dhcp-message-type 5;
  option dhcp-server-identifier X.X.192.1;
  renew 3 2023/6/7 10:46:58;
  rebind 3 2023/6/7 10:52:32;
  expire 3 2023/6/7 10:54:28;
}
lease {
  interface "vtnet0";
  fixed-address X.X.213.96;
  option subnet-mask 255.255.224.0;
  option routers X.X.192.1;
  option domain-name-servers X.X.79.79,X.X.79.80;
  option dhcp-lease-time 900;
  option dhcp-message-type 5;
  option dhcp-server-identifier X.X.192.1;
  renew 3 2023/6/7 10:47:30;
  rebind 3 2023/6/7 10:53:04;
  expire 3 2023/6/7 10:55:00;
}
lease {
  interface "vtnet0";
  fixed-address X.X.213.96;
  option subnet-mask 255.255.224.0;
  option routers X.X.192.1;
  option domain-name-servers X.X.79.79,X.X.79.80;
  option dhcp-lease-time 900;
  option dhcp-message-type 5;
  option dhcp-server-identifier X.X.192.1;
  renew 3 2023/6/7 10:55:00;
  rebind 3 2023/6/7 11:00:34;
  expire 3 2023/6/7 11:02:30;
}
lease {
  interface "vtnet0";
  fixed-address X.X.100.148;
  option subnet-mask 255.255.240.0;
  option routers X.X.96.1;
  option domain-name-servers X.X.79.79,X.X.79.80;
  option dhcp-lease-time 900;
  option dhcp-message-type 5;
  option dhcp-server-identifier X.X.96.1;
  renew 3 2023/6/7 11:10:05;
  rebind 3 2023/6/7 11:15:39;
  expire 3 2023/6/7 11:17:35;
}
lease {
  interface "vtnet0";
  fixed-address X.X.100.148;
  option subnet-mask 255.255.240.0;
  option routers X.X.96.1;
  option domain-name-servers X.X.79.79,X.X.79.80;
  option dhcp-lease-time 900;
  option dhcp-message-type 5;
  option dhcp-server-identifier X.X.96.1;
  renew 3 2023/6/7 11:17:35;
  rebind 3 2023/6/7 11:23:09;
  expire 3 2023/6/7 11:25:05;
}
lease {
  interface "vtnet0";
  fixed-address X.X.194.162;
  option subnet-mask 255.255.240.0;
  option routers X.X.192.1;
  option domain-name-servers X.X.79.79,X.X.79.80;
  option dhcp-lease-time 900;
  option dhcp-message-type 5;
  option dhcp-server-identifier X.X.192.1;
  renew 3 2023/6/7 11:32:41;
  rebind 3 2023/6/7 11:38:15;
  expire 3 2023/6/7 11:40:11;
}
lease {
  interface "vtnet0";
  fixed-address X.X.194.162;
  option subnet-mask 255.255.240.0;
  option routers X.X.192.1;
  option domain-name-servers X.X.79.79,X.X.79.80;
  option dhcp-lease-time 900;
  option dhcp-message-type 5;
  option dhcp-server-identifier X.X.192.1;
  renew 3 2023/6/7 11:40:11;
  rebind 3 2023/6/7 11:45:45;
  expire 3 2023/6/7 11:47:41;
}
lease {
  interface "vtnet0";
  fixed-address X.X.194.162;
  option subnet-mask 255.255.240.0;
  option routers X.X.192.1;
  option domain-name-servers X.X.79.79,X.X.79.80;
  option dhcp-lease-time 900;
  option dhcp-message-type 5;
  option dhcp-server-identifier X.X.192.1;
  renew 3 2023/6/7 11:55:16;
  rebind 3 2023/6/7 12:00:50;
  expire 3 2023/6/7 12:02:46;
}
lease {
  interface "vtnet0";
  fixed-address X.X.194.162;
  option subnet-mask 255.255.240.0;
  option routers X.X.192.1;
  option domain-name-servers X.X.79.79,X.X.79.80;
  option dhcp-lease-time 900;
  option dhcp-message-type 5;
  option dhcp-server-identifier X.X.192.1;
  renew 3 2023/6/7 12:02:46;
  rebind 3 2023/6/7 12:08:20;
  expire 3 2023/6/7 12:10:16;
}
lease {
  interface "vtnet0";
  fixed-address X.X.194.162;
  option subnet-mask 255.255.240.0;
  option routers X.X.192.1;
  option domain-name-servers X.X.79.79,X.X.79.80;
  option dhcp-lease-time 900;
  option dhcp-message-type 5;
  option dhcp-server-identifier X.X.192.1;
  renew 3 2023/6/7 12:17:51;
  rebind 3 2023/6/7 12:23:25;
  expire 3 2023/6/7 12:25:21;
}
lease {
  interface "vtnet0";
  fixed-address X.X.194.162;
  option subnet-mask 255.255.240.0;
  option routers X.X.192.1;
  option domain-name-servers X.X.79.79,X.X.79.80;
  option dhcp-lease-time 900;
  option dhcp-message-type 5;
  option dhcp-server-identifier X.X.192.1;
  renew 3 2023/6/7 12:25:21;
  rebind 3 2023/6/7 12:30:55;
  expire 3 2023/6/7 12:32:51;
}
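
A lease file like the one above can be scanned for points where the assigned address or gateway changes between consecutive entries, which is easier than reading it by eye. The following is only a minimal sketch: the function names are made up for illustration, and the sample uses documentation addresses (192.0.2.x, 198.51.100.x) as stand-ins for the redacted X.X values.

```python
import re

def parse_leases(text):
    """Return one dict per 'lease { ... }' block found in the text."""
    leases = []
    for block in re.findall(r"lease\s*\{(.*?)\}", text, flags=re.DOTALL):
        entry = {}
        for line in block.splitlines():
            line = line.strip().rstrip(";")
            if line.startswith("fixed-address "):
                entry["address"] = line.split(None, 1)[1]
            elif line.startswith("option routers "):
                entry["router"] = line.split(None, 2)[2]
            elif line.startswith("renew "):
                # "renew 3 2023/6/7 10:40:24" -> keep the date and time only
                entry["renew"] = line.split(None, 2)[2]
        leases.append(entry)
    return leases

def address_changes(leases):
    """Return (previous, current) pairs where the leased address differs."""
    changes = []
    for prev, cur in zip(leases, leases[1:]):
        if prev.get("address") != cur.get("address"):
            changes.append((prev, cur))
    return changes

# Placeholder data only; not the real addresses from the post.
SAMPLE = """
lease {
  interface "vtnet0";
  fixed-address 192.0.2.206;
  option routers 192.0.2.1;
  renew 3 2023/6/7 10:40:24;
}
lease {
  interface "vtnet0";
  fixed-address 198.51.100.96;
  option routers 198.51.100.1;
  renew 3 2023/6/7 10:46:58;
}
lease {
  interface "vtnet0";
  fixed-address 198.51.100.96;
  option routers 198.51.100.1;
  renew 3 2023/6/7 10:47:30;
}
"""

if __name__ == "__main__":
    for prev, cur in address_changes(parse_leases(SAMPLE)):
        print(f"{cur['renew']}: {prev['address']} -> {cur['address']} "
              f"(router {prev['router']} -> {cur['router']})")
```

To run it against the real file, the text could be read with `open("/var/db/dhclient.leases.vtnet0").read()` and passed to `parse_leases`. An address that keeps changing on a connection that is never unplugged is a hint that something unusual is happening on the WAN segment.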


# route show default
   route to: default
destination: default
       mask: default
    gateway: X.X.192.1
        fib: 0
  interface: vtnet0
      flags: <UP,GATEWAY,DONE,STATIC>
recvpipe  sendpipe  ssthresh  rtt,msec    mtu        weight    expire
       0         0         0         0      1500         1         0


10 minutes later when connectivity comes back I get this in the logs:
<13>1 2023-06-07T13:10:21+01:00 host dhclient 25670 - [meta sequenceId="1"] New IP Address (vtnet0): X.X.194.162
<13>1 2023-06-07T13:10:21+01:00 host dhclient 25863 - [meta sequenceId="2"] New Subnet Mask (vtnet0): 255.255.240.0
<13>1 2023-06-07T13:10:21+01:00 host dhclient 26331 - [meta sequenceId="3"] New Broadcast Address (vtnet0): X.X.207.255
<13>1 2023-06-07T13:10:21+01:00 host dhclient 26569 - [meta sequenceId="4"] New Routers (vtnet0): X.X.192.1
<13>1 2023-06-07T13:10:21+01:00 host dhclient 27027 - [meta sequenceId="5"] Creating resolv.conf
<13>1 2023-06-07T13:10:22+01:00 host opnsense 27541 - [meta sequenceId="6"] /usr/local/etc/rc.newwanip: IP renewal starting (new: X.X.194.162, old: X.X.194.162, interface: WAN[wan], device: vtnet0, force: yes)
<13>1 2023-06-07T13:10:22+01:00 host opnsense 27541 - [meta sequenceId="7"] /usr/local/etc/rc.newwanip: ROUTING: entering configure using 'wan'
<13>1 2023-06-07T13:10:22+01:00 host opnsense 27541 - [meta sequenceId="8"] /usr/local/etc/rc.newwanip: ROUTING: configuring inet default gateway on wan
<13>1 2023-06-07T13:10:22+01:00 host opnsense 27541 - [meta sequenceId="9"] /usr/local/etc/rc.newwanip: ROUTING: setting inet default route to X.X.192.1
<13>1 2023-06-07T13:10:22+01:00 host opnsense 27541 - [meta sequenceId="10"] /usr/local/etc/rc.newwanip: plugins_configure monitor (,WAN_DHCP)
<13>1 2023-06-07T13:10:22+01:00 host opnsense 27541 - [meta sequenceId="11"] /usr/local/etc/rc.newwanip: plugins_configure monitor (execute task : dpinger_configure_do(,WAN_DHCP))
<13>1 2023-06-07T13:10:22+01:00 host opnsense 27541 - [meta sequenceId="12"] /usr/local/etc/rc.newwanip: plugins_configure vpn (,wan)
<13>1 2023-06-07T13:10:22+01:00 host opnsense 27541 - [meta sequenceId="13"] /usr/local/etc/rc.newwanip: plugins_configure vpn (execute task : ipsec_configure_do(,wan))
<13>1 2023-06-07T13:10:22+01:00 host opnsense 27541 - [meta sequenceId="14"] /usr/local/etc/rc.newwanip: plugins_configure vpn (execute task : openvpn_configure_do(,wan))
<13>1 2023-06-07T13:10:22+01:00 host opnsense 27541 - [meta sequenceId="15"] /usr/local/etc/rc.newwanip: Resyncing OpenVPN instances for interface WAN.
<13>1 2023-06-07T13:10:22+01:00 host opnsense 27541 - [meta sequenceId="16"] /usr/local/etc/rc.newwanip: plugins_configure newwanip (,wan)
<13>1 2023-06-07T13:10:22+01:00 host opnsense 27541 - [meta sequenceId="17"] /usr/local/etc/rc.newwanip: plugins_configure newwanip (execute task : dnsmasq_configure_do())
<13>1 2023-06-07T13:10:22+01:00 host opnsense 27541 - [meta sequenceId="18"] /usr/local/etc/rc.newwanip: plugins_configure newwanip (execute task : dyndns_configure_do(,wan))
<13>1 2023-06-07T13:10:22+01:00 host opnsense 27541 - [meta sequenceId="19"] /usr/local/etc/rc.newwanip: plugins_configure newwanip (execute task : ntpd_configure_do())
<13>1 2023-06-07T13:10:22+01:00 host opnsense 27541 - [meta sequenceId="20"] /usr/local/etc/rc.newwanip: plugins_configure newwanip (execute task : opendns_configure_do())
<13>1 2023-06-07T13:10:22+01:00 host opnsense 27541 - [meta sequenceId="21"] /usr/local/etc/rc.newwanip: plugins_configure newwanip (execute task : openssh_configure_do(,wan))
<13>1 2023-06-07T13:10:22+01:00 host opnsense 27541 - [meta sequenceId="22"] /usr/local/etc/rc.newwanip: plugins_configure newwanip (execute task : unbound_configure_do(,wan))
<13>1 2023-06-07T13:10:23+01:00 host opnsense 27541 - [meta sequenceId="23"] /usr/local/etc/rc.newwanip: plugins_configure newwanip (execute task : vxlan_configure_do())
<13>1 2023-06-07T13:10:23+01:00 host opnsense 27541 - [meta sequenceId="24"] /usr/local/etc/rc.newwanip: plugins_configure newwanip (execute task : webgui_configure_do(,wan))
<13>1 2023-06-07T13:10:24+01:00 host kernel - - [meta sequenceId="25"] <6>ovpnc1: link state changed to DOWN
<13>1 2023-06-07T13:10:25+01:00 host opnsense 45452 - [meta sequenceId="26"] /usr/local/etc/rc.newwanip: IP renewal starting (new: X.X.0.7, old: X.X.0.11, interface: N[opt3], device: ovpnc1, force: yes)
<13>1 2023-06-07T13:10:25+01:00 host opnsense 45452 - [meta sequenceId="27"] /usr/local/etc/rc.newwanip: ROUTING: entering configure using 'opt3'
<13>1 2023-06-07T13:10:25+01:00 host opnsense 45452 - [meta sequenceId="28"] /usr/local/etc/rc.newwanip: plugins_configure monitor (,N_VPNV4)
<13>1 2023-06-07T13:10:25+01:00 host opnsense 45452 - [meta sequenceId="29"] /usr/local/etc/rc.newwanip: plugins_configure monitor (execute task : dpinger_configure_do(,N_VPNV4))
<13>1 2023-06-07T13:10:25+01:00 host kernel - - [meta sequenceId="30"] <6>ovpnc1: link state changed to UP
<13>1 2023-06-07T13:10:25+01:00 host opnsense 45452 - [meta sequenceId="31"] /usr/local/etc/rc.newwanip: IP address change detected, killing states of old ip X.X.0.11
<13>1 2023-06-07T13:10:25+01:00 host opnsense 45452 - [meta sequenceId="32"] /usr/local/etc/rc.newwanip: plugins_configure vpn (,opt3)
<13>1 2023-06-07T13:10:25+01:00 host opnsense 45452 - [meta sequenceId="33"] /usr/local/etc/rc.newwanip: plugins_configure vpn (execute task : ipsec_configure_do(,opt3))
<13>1 2023-06-07T13:10:25+01:00 host opnsense 45452 - [meta sequenceId="34"] /usr/local/etc/rc.newwanip: plugins_configure vpn (execute task : openvpn_configure_do(,opt3))
<13>1 2023-06-07T13:10:25+01:00 host opnsense 45452 - [meta sequenceId="35"] /usr/local/etc/rc.newwanip: Resyncing OpenVPN instances for interface N.
<13>1 2023-06-07T13:10:25+01:00 host opnsense 45452 - [meta sequenceId="36"] /usr/local/etc/rc.newwanip: plugins_configure newwanip (,opt3)
<13>1 2023-06-07T13:10:25+01:00 host opnsense 45452 - [meta sequenceId="37"] /usr/local/etc/rc.newwanip: plugins_configure newwanip (execute task : dnsmasq_configure_do())
<13>1 2023-06-07T13:10:25+01:00 host opnsense 45452 - [meta sequenceId="38"] /usr/local/etc/rc.newwanip: plugins_configure newwanip (execute task : dyndns_configure_do(,opt3))
<13>1 2023-06-07T13:10:25+01:00 host opnsense 45452 - [meta sequenceId="39"] /usr/local/etc/rc.newwanip: plugins_configure newwanip (execute task : ntpd_configure_do())
<13>1 2023-06-07T13:10:27+01:00 host opnsense 45452 - [meta sequenceId="40"] /usr/local/etc/rc.newwanip: plugins_configure newwanip (execute task : opendns_configure_do())
<13>1 2023-06-07T13:10:27+01:00 host opnsense 45452 - [meta sequenceId="41"] /usr/local/etc/rc.newwanip: plugins_configure newwanip (execute task : openssh_configure_do(,opt3))
<13>1 2023-06-07T13:10:27+01:00 host opnsense 45452 - [meta sequenceId="42"] /usr/local/etc/rc.newwanip: plugins_configure newwanip (execute task : unbound_configure_do(,opt3))
<13>1 2023-06-07T13:10:28+01:00 host opnsense 45452 - [meta sequenceId="43"] /usr/local/etc/rc.newwanip: plugins_configure newwanip (execute task : vxlan_configure_do())
<13>1 2023-06-07T13:10:28+01:00 host opnsense 45452 - [meta sequenceId="44"] /usr/local/etc/rc.newwanip: plugins_configure newwanip (execute task : webgui_configure_do(,opt3))


Then we're back to this logged message 10 minutes later, but with no other log messages in between:
<13>1 2023-06-07T13:17:51+01:00 host dhclient 77012 - [meta sequenceId="1"] Creating resolv.conf

Can anyone give advice on how to diagnose this further or resolve it? I've been pulling my hair out and don't seem to be getting anywhere!

Thanks

I forgot to add that when the internet connectivity is down, I am unable to ping the currently set WAN gateway from OPNsense, but I am able to ping Proxmox's IP as well as any normally accessible local network IPs.

I would suggest looking at the changes made.
I am with the same ISP, on the same product I think (FTTP by the way), and also have OPN on Proxmox, and all is good.
I also get those very regular "2023-06-07T16:30:52   Notice   dhclient   Creating resolv.conf" messages, but they don't seem to cause any problem in themselves.

Thanks, I'm on FTTP too (connected to Openreach fibre modem). I've tried reversing all the changes I could remember but I still have the problem. :-\

Here is my WAN configuration attached, would you be able to tell me if it matches with your WAN configuration for the FTTP connection?

Almost the same. Differences are:
- the interface name. Mine is ig0 as I am passing through the port, not using a virtualised one.
- I don't override the MTU. That field I have clear, not ticked.
*And I am on 22.7.11_1

I have now done a fresh install of OPNsense with all default settings except for adding WAN and LAN IP settings. The problem still occurs!

My next step after that was to factory reset the Openreach FTTP modem. That didn't work either, exact same issue.

So next I will try installing pfSense to see if it happens there..

I have a feeling it may be related to an update to 23.1.x, so that could explain why you're not seeing it on 22.7.11_1 but with a similar setup, maybe..?!

Any chance of installing 22.7 to compare, just in case?
Also, is it possible to run it on bare metal, either version?
Or, before doing any of this, put their router in to rule out the line itself.

Ah! I think I just found the problem, and it looks like my silly mistake. After the network restructuring I had accidentally left another VM on the old Proxmox WAN network bridge, which meant it was connected directly to the WAN network. So that VM was probably getting its IP via DHCP through the Openreach modem at the same time that OPNsense was. So I guess every time the other VM obtained an IP, OPNsense lost connectivity.
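
For anyone hitting the same thing, a quick way to spot a stray guest is to list which VMs are attached to each Proxmox bridge. The following is a rough sketch, not an official tool: it assumes VM config text with `netN:` lines that carry a `bridge=` option, and the VM IDs, MAC addresses and bridge names in the sample are invented.

```python
import re
from collections import defaultdict

# Matches lines such as "net0: virtio=AA:BB:CC:00:00:01,bridge=vmbr1,firewall=1"
NET_LINE = re.compile(r"^net\d+:\s*(?P<opts>.+)$")

def bridges_in_config(config_text):
    """Return the bridge names referenced by the netN lines of one VM config."""
    bridges = []
    for line in config_text.splitlines():
        line = line.strip()
        if line.startswith("["):
            # Stop at the first section header so snapshot sections are ignored.
            break
        match = NET_LINE.match(line)
        if not match:
            continue
        for opt in match.group("opts").split(","):
            key, _, value = opt.partition("=")
            if key.strip() == "bridge":
                bridges.append(value.strip())
    return bridges

def guests_per_bridge(configs):
    """Map each bridge name to the list of VM IDs attached to it."""
    result = defaultdict(list)
    for vmid, text in configs.items():
        for bridge in bridges_in_config(text):
            result[bridge].append(vmid)
    return dict(result)

# Hypothetical example: vmbr1 is the WAN bridge, and VM 105 was left on it.
EXAMPLE_CONFIGS = {
    "100": "name: opnsense\n"
           "net0: virtio=AA:BB:CC:00:00:01,bridge=vmbr1\n"
           "net1: virtio=AA:BB:CC:00:00:02,bridge=vmbr0\n",
    "105": "name: stray-vm\n"
           "net0: virtio=AA:BB:CC:00:00:03,bridge=vmbr1,firewall=1\n",
}

if __name__ == "__main__":
    for bridge, vmids in sorted(guests_per_bridge(EXAMPLE_CONFIGS).items()):
        print(bridge, "->", ", ".join(vmids))
```

On a real host the config text would have to be read from wherever Proxmox stores the VM definitions (I believe that is /etc/pve/qemu-server/<vmid>.conf, but treat that path as an assumption and check your own system). Any bridge that is meant to carry only the firewall's WAN port should list exactly one guest.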

I will find out for sure in the next 10-20 min now that I've fixed that.

By the way, how do you pass through the network port rather than virtualising it? Do you need to pass through the whole PCI device in Proxmox in the Hardware section of the VM?

Nice find. Makes sense.
To passthrough in proxmox, yes you use the Add > PCI Device. It is hardware dependent. The host system needs to have the ability. What are you using for host?

Yes! All is working now. I only wasted two days on that lol ::)

I'm running Proxmox on HP Proliant Microserver Gen10 Plus.

I do remember considering the option of passthrough but didn't do it in the end for some reason..

Thanks for your help.

I'm glad to hear it. You might want to consider marking the subject as solved. Edit the first post and put it at the beginning of the title (in brackets is good too).

Quote from: cookiemonster on June 07, 2023, 11:29:09 PM
To passthrough in proxmox, yes you use the Add > PCI Device. It is hardware dependent. The host system needs to have the ability. What are you using for host?

One more question on this.. How do you access the Proxmox management webUI when you're passing through the whole PCI device? Do you have a secondary NIC on a different device that you don't passthrough for that?

Thanks

No, just set proxmox to use a static ip in the same range as your lan and that's it. For example if your LAN is in the range 192.168.10.0/24, then you set your proxmox the ip 192.168.10.5. Then your dhcp range for clients is set from 192.168.10.50 - 192.168.10.254.
Then any client like a laptop will be able to reach it without any special setup.
The problem of course is that you can't work on proxmox UI or remote terminal if there is a routing problem with OPN. For instance if your access points are served by OPN and the laptop is unable to get an ip address from dhcp.
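
The addressing plan described above can be sanity-checked with a few lines of Python. This is just an illustrative sketch using the standard `ipaddress` module; the function name is made up, and the values are the example ones from this post.

```python
import ipaddress

def check_static_ip(lan_cidr, static_ip, pool_start, pool_end):
    """Check that a static IP sits inside the LAN but outside the DHCP pool."""
    lan = ipaddress.ip_network(lan_cidr)
    ip = ipaddress.ip_address(static_ip)
    start = ipaddress.ip_address(pool_start)
    end = ipaddress.ip_address(pool_end)
    return {
        # The Proxmox address must be in the same subnet as the LAN clients.
        "in_lan": ip in lan,
        # It must not collide with addresses the DHCP server may hand out.
        "outside_pool": not (start <= ip <= end),
        # The network and broadcast addresses cannot be assigned to a host.
        "usable": ip not in (lan.network_address, lan.broadcast_address),
    }

if __name__ == "__main__":
    result = check_static_ip(
        "192.168.10.0/24", "192.168.10.5", "192.168.10.50", "192.168.10.254"
    )
    print(result)
```

If all three values come back true, the static address is safe to use for the Proxmox management interface in that plan.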

Oh ok, I think I understand.

I assume that any NIC you pass through to a virtual machine is not available to Proxmox in the System->Network section, which is where I usually set up the management IP (on a Linux bridge). So if the Linux bridge cannot have the network device as its bridge port (since it's already passed through to a VM), then, if I'm understanding correctly, you set up an additional Linux bridge (vmbrX).

You then have vmbrX connected to the OPN VM with "Bridge port" field empty and network settings matching the LAN settings for the vmbrX interface in OPN. So traffic is routed through OPN first -> NIC -> vmbrX -> Proxmox webUI.

I missed that part of the setup. You are right that if you pass through the NIC, all its setup is to be done in the VM and therefore outside Proxmox. It might be possible to set some settings in Proxmox but for my setup, I haven't needed it.
I left the default bridge in Proxmox with one port assigned to it, and gave that bridge the static IP in the same range I mentioned earlier. The gateway for this bridge is the OPN LAN IP. This interface is _not_ given to the OPN VM.
No need for another bridge.

Clearly this is suboptimal. If OPN is "down", I can't connect to proxmox UI. I am OK with this setup. In those cases I walk to it and work from the host directly by plugging a keyboard and monitor. The perverse nature of having the main router in a VM in a host without BMC.