Loss of internet access and OPNSense reachability

Mr.Goodcat:
Hi,

a strange, new problem has reared its ugly head  :-\
Every few hours the connection to the internet through the OPNSense box is severed and the box becomes unreachable from the LAN, i.e. it doesn't load the GUI and doesn't answer pings. Devices on the LAN bridge, however, can still ping each other. From the VGA console on the box itself it is possible to ping out to the internet but not to any LAN client. Oddly enough, DHCP seems to work correctly and sets DNS and gateway appropriately if any new LAN clients are started.

Two options to get the setup working again have been identified: restarting OPNSense, or switching on one of the previously inactive LAN ports (e.g. by starting an attached switch).

From the logs these lock-ups always occur after the following WAN-side event (note: re0 is the NIC towards the WAN):

Feb. 12 18:14:32   opnsense: /usr/local/etc/rc.newwanip: ROUTING: setting IPv4 default route to X.Y.116.1
Feb. 12 18:14:31   opnsense: /usr/local/etc/rc.newwanip: rc.newwanip: on (IP address: X.Y.119.64) (interface: WAN[wan]) (real interface: re0).
Feb. 12 18:14:31   opnsense: /usr/local/etc/rc.newwanip: rc.newwanip: Informational is starting re0.

I tried changing the default setting for gateway monitoring but that didn't help. Actually I'm not quite sure if the observed behaviour also existed under 17.1RC from which an update was performed. Do you have any idea what could be the problem/solution? I suspect the WAN's DHCP lease but that shouldn't affect the ability to ping the OPNSense box from LAN. In such a case the problem should also occur in fixed intervals, but sometimes it takes ~24h and in other instances just a few hours. A case of PEBKAC is always a possibility as well but I'm at a loss regardless.
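
One way to check whether the lock-ups follow a fixed interval is to pull the timestamps of the rc.newwanip route changes out of the log and compare the gaps. The following is only a minimal sketch: it assumes the log has been exported as plain text in the same format as the lines quoted above, and the function name `newwanip_times` is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: list the timestamp of every default-route change
# that rc.newwanip wrote to the log. Reads log lines from stdin.
newwanip_times() {
    # Keep only the rc.newwanip routing lines, then print the first three
    # whitespace-separated fields (month, day, time).
    grep -F 'rc.newwanip: ROUTING: setting IPv4 default route' |
        awk '{ print $1, $2, $3 }'
}
```

Feeding the exported log into the function on stdin should print one timestamp per route change, which makes irregular gaps (a few hours versus ~24h) easy to spot.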


Details regarding my setup:

* WAN NIC: Realtek 8111G
* LAN BRIDGE (non-filtering): Chelsio T420-CR + Intel i350-T4 + Realtek 8111G
* Updated from 17.1 RC to 17.1 to 17.1.1
* Running default config for the most part, except for static ARP with DHCP on LAN and a non-filtering LAN-bridge
* Installed on SSD

Thanks and kind regards,
Fabian

markus:
Hi Fabian,

I just updated to 17.1.1 and I am also stuck with the problem that I cannot ping the LAN address from my LAN clients. Everything else seems to be working alright, though.

Have you found a solution to this yet?

Cheers,
Markus

Mr.Goodcat:
Hi Markus,

unfortunately there are no advancements to report. So far I tried to reconfigure most settings and fiddled with the gateway config. Rebooting the modem and OPNSense also didn't help. Maybe the problem is related to OPNSense locking up completely (i.e. https://forum.opnsense.org/index.php?topic=4414.0)?

In any case, having to reboot OPNSense multiple times per day to get back out to WAN isn't an acceptable workaround. If all else fails I'll try 16.x and see if that works.

Does your hardware setup have something in common with mine? Just to make sure this doesn't come from some FreeBSD 11.0 driver issue.

Cheers,
Fabian

djGrrr:
Please try out a kernel we are testing with a new Realtek NIC driver and see if this makes any difference:

From console/shell as root:

# opnsense-update -kr 17.1.1-re
# /usr/local/etc/rc.reboot

Mr.Goodcat:

--- Quote from: djGrrr on February 14, 2017, 06:05:54 pm ---Please try out a kernel we are testing with a new Realtek NIC driver and see if this makes any difference:

From console/shell as root:

# opnsense-update -kr 17.1.1-re
# /usr/local/etc/rc.reboot

--- End quote ---

Hi djGrrr,

thank you for your help! I've just performed the update and the system was up and running again in no time. So far so good.

From the logs I also saw a few other bits that might help to illuminate the situation.

The part below keeps repeating but that shouldn't affect my main problem of losing the path out to WAN and OPNSense itself:

--- Code: ---apinger: Error while feeding rrdtool: Broken pipe
apinger: rrdtool respawning too fast, waiting 300s.
--- End code ---


Throughout the day the part below occurs multiple times as well. What's strange about this is that opt1 (i.e. the Chelsio cxgbe1 NIC) is connected to a switch which was powered down during the day. So the port shouldn't come up or go down but simply stay off. As you can see, DynDNS is also mentioned in the log even though it was never set up in the first place.

--- Code: ---Feb. 14 17:26:36 configd.py: [5747513c-e9e6-4c9d-ab51-c7797f363eb3] updating dyndns opt1
Feb. 14 17:26:31 kernel: cxgbe0: cxgbe_media_change unimplemented.
Feb. 14 17:26:30 opnsense: /usr/local/etc/rc.linkup: HOTPLUG: Configuring interface opt1
Feb. 14 17:26:30 opnsense: /usr/local/etc/rc.linkup: DEVD Ethernet attached event for opt1
Feb. 14 17:26:30 configd.py: [5d214b1c-b327-480c-8ca2-db2090fb041e] Linkup starting cxgbe0
Feb. 14 17:26:30 kernel: cxgbe0: link state changed to UP
Feb. 14 17:26:30 opnsense: /usr/local/etc/rc.linkup: DEVD Ethernet detached event for opt1
Feb. 14 17:26:30 configd.py: [b24b7a30-bb2c-48b5-b10c-dcdc04d9c060] Linkup stopping cxgbe0
Feb. 14 17:26:30 kernel: cxgbe0: link state changed to DOWN
--- End code ---
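
To see how often each port flaps, the kernel's link state messages in an excerpt like the one above can be tallied per interface. This is a rough sketch under the same assumption of a plain-text log export in the format shown; `link_flaps` is a made-up name, not an OPNSense tool.

```shell
#!/bin/sh
# Hypothetical helper: count "link state changed" kernel messages per
# interface and state. Reads log lines from stdin.
link_flaps() {
    grep -F 'link state changed to' |
        awk '{
                 # Line ends in "<iface>: link state changed to <STATE>",
                 # so the interface is the sixth field from the end.
                 iface = $(NF - 5)
                 sub(/:$/, "", iface)      # drop the trailing colon
                 count[iface " " $NF]++
             }
             END { for (k in count) print k, count[k] }' |
        sort
}
```

Each output line has the form "interface state count". A port that is attached to a powered-down switch would not be expected to show up here at all.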


Last but not least, something regarding my original observation changed during the night:

--- Code: ---Feb. 14 04:40:01 opnsense: /usr/local/etc/rc.newwanip: The command '/sbin/route delete -inet '176.199.116.1' -interface 're0'' returned exit code '1', the output was 'route: route has not been found delete host 176.199.116.1: gateway re0 fib 0: not in table'
Feb. 14 04:40:01 opnsense: /usr/local/etc/rc.newwanip: ROUTING: setting IPv4 default route to 176.199.116.1
Feb. 14 04:40:00 opnsense: /usr/local/etc/rc.newwanip: rc.newwanip: on (IP address: 176.199.119.64) (interface: WAN[wan]) (real interface: re0).
Feb. 14 04:40:00 opnsense: /usr/local/etc/rc.newwanip: rc.newwanip: Informational is starting re0.
--- End code ---

Previously that was the part after which connectivity to the WAN and OPNSense itself broke off. Either it "fixed" itself after some time during which no one noticed it broke, or me reselecting a few settings triggered something. Anyhow I'll keep monitoring it and see if it's fixed with the new kernel. Thanks again!


Cheers,
Fabian
