OPNsense Forum
Archive => 16.1 Legacy Series => Topic started by: Pedro on April 22, 2016, 04:00:59 pm
-
Hi all,
I finally managed to convince the higher-ups to implement OPNsense. Initially everything seemed to be working fine in testing, but now that I've moved into production (the setup is slightly different) we occasionally lose internet access. Everything appears normal, but OPNsense shows the gateway as being down and we have no internet. I've yet to find a pattern, and usually the only way I get internet access back is by rebooting OPNsense. Even a release/renew of the DHCP lease on WAN doesn't solve the problem.
Being rather new to OPNsense and FreeBSD, I'm at a loss as to how to troubleshoot this issue further and would appreciate any help or guidance in solving it.
Specs
OPNsense 16.1.11_1-i386
FreeBSD 10.2-RELEASE-p14
LibreSSL 2.2.6
WAN using DHCP with gateway monitoring.
-
Really hate to bump this after just a couple of days, but the situation persists and I really could do with some help from the more experienced.
Any ideas?
-
I think nobody can help you with your problem because the required information is missing:
+ state of the interfaces when it is working and when it is not (for example, the output of ifconfig; you can redact your IP addresses)
+ which protocol are you running on your WAN?
+ ping from the firewall - does it work?
+ syslog messages
+ are the services running?
+ do you have any special configuration which is usually not used?
+ ...
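For example, most of that can be gathered from the console with something along these lines (8.8.8.8 is just a convenient external address):
ifconfig -a          # interface state, once while it works and once while it is broken
ping -c 4 8.8.8.8    # ping from the firewall itself
netstat -rn          # routing table, check the default route
plus the relevant syslog entries and the service status from the web GUI.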
Fabian
-
WAN using DHCP with gateway monitoring.
WAN failover?
This is often a big problem, depending on how you try to achieve monitoring. A lot of pingable gateways tend to ignore or rate-limit pings, and therefore you get false alarms and failovers. Even the often-used Google DNS servers (8.8.8.8 or 8.8.4.4) tend to drop pings quite a bit. What are your settings?
Also, some providers lack even a minimum of sensible DHCP handling and often ignore the TTL or handle it in a non-RFC-compliant way; check the logs for such errors.
A combination of these (no ICMP echo replies, a WAN restart, and no answer from DHCP) can lead to the WAN being marked offline even though it wasn't down at all. I've seen this quite often.
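For what it's worth, a quick way to see how reliably a candidate monitor IP answers ICMP from the firewall itself is to send a longer burst and look at the loss percentage at the end, e.g.:
ping -c 100 8.8.8.8
More than a few percent loss on an otherwise idle line makes that address a poor choice for gateway monitoring.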
-
Hi all, thanks for your answers so far. I'll try to address them in order:
@Fabian:
- WAN is using DHCP to connect to the outside world;
- When internet fails, ping fails with "no buffer space available";
- Services are all running fine, but I have not yet managed to look into the syslog messages; that's the next step;
- No special configuration or tunable set yet
@Zeitkind:
Thanks for your input. I've disabled gateway monitoring for the time being and the problem persists, only now OPNsense thinks everything is fine. The "no buffer space available" error gave me a little more to go on, and I've managed to collect the following:
root@gw:~ # netstat -m
781/1499/2280 mbufs in use (current/cache/total)
762/758/1520/26368 mbuf clusters in use (current/cache/total/max)
762/756 mbuf+clusters out of packet secondary zone in use (current/cache)
0/31/31/13184 4k (page size) jumbo clusters in use (current/cache/total/max)
0/0/0/3906 9k jumbo clusters in use (current/cache/total/max)
0/0/0/2197 16k jumbo clusters in use (current/cache/total/max)
1719K/2014K/3734K bytes allocated to network (current/cache/total)
0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
0/0/0 requests for mbufs delayed (mbufs/clusters/mbuf+clusters)
0/0/0 requests for jumbo clusters delayed (4k/9k/16k)
0/0/0 requests for jumbo clusters denied (4k/9k/16k)
0/8/6656 sfbufs in use (current/peak/max)
0 requests for sfbufs denied
0 requests for sfbufs delayed
0 requests for I/O initiated by sendfile
Any further ideas? I'll keep watching the logs to see if anything pertinent pops up
-
- When internet fails, ping fails with "no buffer space available";
- check for bad cables
- check for bad NIC
- check for kern.ipc.nmbclusters and kern.maxusers
- check for net.inet.tcp.recvbuf_max and net.inet.tcp.sendbuf_max
- check for lost default route
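Those values can be read straight from the console, roughly:
sysctl kern.ipc.nmbclusters kern.maxusers
sysctl net.inet.tcp.recvbuf_max net.inet.tcp.sendbuf_max
netstat -rn | grep default    # confirms the default route is still present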
tbh, I suspect a hardware/driver related problem.
-
Can we get the full unmodified log line of "no buffer space available" output?
-
So, it failed again (making that 5 times today). This time, however, I managed to get a little more info. Also, I neglected to mention earlier that we're in temporary facilities and are "sharing" another network on a separate VLAN, so in essence we have:
partner network on vlan => WAN with DHCP => LAN
When internet fails, any traffic on LAN continues to work just fine.
After internet failed
root@gw:~ # netstat -m
781/1244/2025 mbufs in use (current/cache/total)
746/524/1270/26368 mbuf clusters in use (current/cache/total/max)
746/519 mbuf+clusters out of packet secondary zone in use (current/cache)
0/83/83/13184 4k (page size) jumbo clusters in use (current/cache/total/max)
0/0/0/3906 9k jumbo clusters in use (current/cache/total/max)
0/0/0/2197 16k jumbo clusters in use (current/cache/total/max)
1687K/1691K/3378K bytes allocated to network (current/cache/total)
0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
0/0/0 requests for mbufs delayed (mbufs/clusters/mbuf+clusters)
0/0/0 requests for jumbo clusters delayed (4k/9k/16k)
0/0/0 requests for jumbo clusters denied (4k/9k/16k)
0/7/6656 sfbufs in use (current/peak/max)
0 requests for sfbufs denied
0 requests for sfbufs delayed
0 requests for I/O initiated by sendfile
root@gw:~ # ifconfig -a
re0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
options=82098<VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,WOL_MAGIC,LINKSTATE>
ether f8:1a:67:00:23:73
inet 192.168.200.1 netmask 0xffffff00 broadcast 192.168.200.255
inet6 fe80::fa1a:67ff:fe00:2373%re0 prefixlen 64 scopeid 0x1
nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
media: Ethernet autoselect (none)
status: no carrier
re1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
options=82098<VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,WOL_MAGIC,LINKSTATE>
ether 64:70:02:00:ef:4c
inet 10.10.0.1 netmask 0xffff0000 broadcast 10.10.255.255
inet6 fe80::6670:2ff:fe00:ef4c%re1 prefixlen 64 scopeid 0x2
nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
media: Ethernet autoselect (100baseTX <full-duplex>)
status: active
vr0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
options=82808<VLAN_MTU,WOL_UCAST,WOL_MAGIC,LINKSTATE>
ether 00:1b:fc:1e:62:1b
inet6 fe80::21b:fcff:fe1e:621b%vr0 prefixlen 64 scopeid 0x3
inet 192.168.4.20 netmask 0xffffff00 broadcast 192.168.4.255
nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
media: Ethernet autoselect (100baseTX <full-duplex>)
status: active
pflog0: flags=100<PROMISC> metric 0 mtu 33184
pfsync0: flags=0<> metric 0 mtu 1500
syncpeer: 0.0.0.0 maxupd: 128 defer: off
lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
options=600003<RXCSUM,TXCSUM,RXCSUM_IPV6,TXCSUM_IPV6>
inet 127.0.0.1 netmask 0xff000000
inet6 ::1 prefixlen 128
inet6 fe80::1%lo0 prefixlen 64 scopeid 0x6
nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
enc0: flags=0<> metric 0 mtu 1536
nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
root@gw:~ # ping 8.8.8.8
PING 8.8.8.8 (8.8.8.8): 56 data bytes
ping: sendto: No buffer space available
ping: sendto: No buffer space available
ping: sendto: No buffer space available
ping: sendto: No buffer space available
^C
--- 8.8.8.8 ping statistics ---
4 packets transmitted, 0 packets received, 100.0% packet loss
root@gw:~ # ping 192.168.4.1
PING 192.168.4.1 (192.168.4.1): 56 data bytes
ping: sendto: No buffer space available
ping: sendto: No buffer space available
ping: sendto: No buffer space available
^C
--- 192.168.4.1 ping statistics ---
3 packets transmitted, 0 packets received, 100.0% packet loss
After reboot
root@gw:~ # netstat -m
658/1367/2025 mbufs in use (current/cache/total)
640/630/1270/26368 mbuf clusters in use (current/cache/total/max)
640/625 mbuf+clusters out of packet secondary zone in use (current/cache)
0/27/27/13184 4k (page size) jumbo clusters in use (current/cache/total/max)
0/0/0/3906 9k jumbo clusters in use (current/cache/total/max)
0/0/0/2197 16k jumbo clusters in use (current/cache/total/max)
1444K/1709K/3154K bytes allocated to network (current/cache/total)
0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
0/0/0 requests for mbufs delayed (mbufs/clusters/mbuf+clusters)
0/0/0 requests for jumbo clusters delayed (4k/9k/16k)
0/0/0 requests for jumbo clusters denied (4k/9k/16k)
0/6/6656 sfbufs in use (current/peak/max)
0 requests for sfbufs denied
0 requests for sfbufs delayed
0 requests for I/O initiated by sendfile
root@gw:~ # ifconfig -a
re0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
options=82098<VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,WOL_MAGIC,LINKSTATE>
ether f8:1a:67:00:23:73
inet 192.168.200.1 netmask 0xffffff00 broadcast 192.168.200.255
inet6 fe80::fa1a:67ff:fe00:2373%re0 prefixlen 64 scopeid 0x1
nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
media: Ethernet autoselect (none)
status: no carrier
re1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
options=82098<VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,WOL_MAGIC,LINKSTATE>
ether 64:70:02:00:ef:4c
inet 10.10.0.1 netmask 0xffff0000 broadcast 10.10.255.255
inet6 fe80::6670:2ff:fe00:ef4c%re1 prefixlen 64 scopeid 0x2
nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
media: Ethernet autoselect (100baseTX <full-duplex>)
status: active
vr0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
options=82808<VLAN_MTU,WOL_UCAST,WOL_MAGIC,LINKSTATE>
ether 00:1b:fc:1e:62:1b
inet6 fe80::21b:fcff:fe1e:621b%vr0 prefixlen 64 scopeid 0x3
inet 192.168.4.20 netmask 0xffffff00 broadcast 192.168.4.255
nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
media: Ethernet autoselect (100baseTX <full-duplex>)
status: active
pflog0: flags=100<PROMISC> metric 0 mtu 33184
pfsync0: flags=0<> metric 0 mtu 1500
syncpeer: 0.0.0.0 maxupd: 128 defer: off
lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
options=600003<RXCSUM,TXCSUM,RXCSUM_IPV6,TXCSUM_IPV6>
inet 127.0.0.1 netmask 0xff000000
inet6 ::1 prefixlen 128
inet6 fe80::1%lo0 prefixlen 64 scopeid 0x6
nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
enc0: flags=0<> metric 0 mtu 1536
nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
root@gw:~ # ping 8.8.8.8
PING 8.8.8.8 (8.8.8.8): 56 data bytes
ping: sendto: Operation not permitted
ping: sendto: Operation not permitted
64 bytes from 8.8.8.8: icmp_seq=2 ttl=56 time=124.951 ms
64 bytes from 8.8.8.8: icmp_seq=3 ttl=56 time=110.339 ms
64 bytes from 8.8.8.8: icmp_seq=4 ttl=56 time=120.774 ms
64 bytes from 8.8.8.8: icmp_seq=5 ttl=56 time=88.388 ms
64 bytes from 8.8.8.8: icmp_seq=6 ttl=56 time=202.519 ms
^C
--- 8.8.8.8 ping statistics ---
7 packets transmitted, 5 packets received, 28.6% packet loss
As far as the system log goes, I saw this and am wondering if it could be related somehow:
Apr 27 14:13:43 gw kernel: warning: total configured swap (4194304 pages) exceeds maximum recommended amount (2097312 pages).
Apr 27 14:13:43 gw kernel: warning: increase kern.maxswzone or reduce amount of swap.
@Zeitkind:
We've tested with various cables already, but I'll test with another NIC as soon as I get my hands on one. I'll also look into tunables and adjust kern.ipc.nmbclusters and kern.maxusers. Any pointers as to which values would be best? As far as the default route goes, I do see this in the logs; I just can't quite be sure whether it's the cause of the fault or part of a startup/restart:
Apr 27 15:50:07 gw opnsense: /usr/local/etc/rc.bootup: ROUTING: remove current default route to 192.168.4.1
Apr 27 15:50:07 gw opnsense: /usr/local/etc/rc.bootup: ROUTING: setting default route to 192.168.4.1
Apr 27 15:50:07 gw dhcpleases: kqueue error: unkown
Once again, thanks for all the help, really appreciate it.
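As to the tunables mentioned above, a rough sketch of how they could be raised (the values below are purely illustrative, not recommendations for this box) is to add loader tunables, for example in /boot/loader.conf.local, and reboot:
kern.ipc.nmbclusters="65536"
kern.maxusers="512"
kern.ipc.nmbclusters can usually also be set from the tunables page in the web GUI.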
-
At first glance, looking at the good old docs from the parent project:
https://doc.pfsense.org/index.php/No_buffer_space_available
That's a good checklist to go through. What Zeitkind said is likely true... vr(4) and re(4) are not the best drivers.
What device is this?
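As an aside, a mitigation often attempted with re(4)/vr(4) NICs (not a confirmed fix for this particular case) is disabling hardware offloading on the affected interfaces, e.g.:
ifconfig vr0 -rxcsum -txcsum -tso
ifconfig re1 -rxcsum -txcsum -tso
There are equivalent "disable hardware offloading" settings in the GUI as well; changes made with ifconfig alone don't survive a reboot.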
-
I ran into a similar situation testing OPNsense in a Hyper-V VM. Running an iperf3 client with multiple parallel streams (the -P option) causes traffic to stop flowing through the OPNsense VM. Trying to ping any IP address through the outside interface then returns "ping: sendto: No buffer space available". Bringing the interface down and up restores connectivity. I ran into the same issue with an earlier version of OPNsense on a physical box (Intel N3150 CPU, dual RTL8111/8168/8411 NICs, 4 GB RAM).
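For anyone wanting to reproduce this, the trigger was roughly the following (the server address, stream count and the hn0 interface name are just examples from my setup):
iperf3 -c 192.0.2.10 -P 8
ifconfig hn0 down && ifconfig hn0 up    # bouncing the outside interface brings traffic back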
-
I had the same issue the other day with my VM on ESXi.
The VM had 2 GB of memory, and after I enabled the proxy server I lost connectivity with the firewall.
I added some CPU and memory to the firewall, et voilà, the issue was gone.
What I am trying to say is: make sure your firewall has enough memory.
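A quick way to check how much memory the box actually has and whether it is running out, for example:
sysctl hw.physmem        # total physical memory
top -b | head -n 8       # shows the Mem/Swap summary lines
Or just keep an eye on the memory and swap usage shown on the dashboard.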