OPNsense Forum
Archive => 17.1 Legacy Series => Topic started by: Werner Fischer on July 17, 2017, 03:54:41 pm
-
(UPDATE #1: the issue has finally been fixed through a BIOS/UEFI-Firmware update, see posting #44 in this thread for details: https://forum.opnsense.org/index.php?topic=5511.msg28681#msg28681)
(UPDATE #2: after many days of tests, the problem came up again - see https://forum.opnsense.org/index.php?topic=5511.msg28781#msg28781 for details)
Hi all,
I'm trying to analyze a strange issue: occasionally (very rarely; I have only been able to reproduce it twice), the WAN link goes away. From a laptop behind the OPNsense firewall I can no longer ping the WAN's default gateway. I can still access the OPNsense system.
Details:
- Hardware based on JBC390F541AA-19-B (System: http://www.jetwaycomputer.com/JBC390F541AA.html, Mainboard: http://www.jetwaycomputer.com/NF541.html, 2 quad-port NICs: http://www.jetwayipc.com/content/?ADPIE1ILAN4_3868.html)
- current OPNsense version 17.1.9 with FreeBSD 11.0-RELEASE-p10
- LAN is on igb0, WAN on igb9
- An "ifconfig igb9 down" followed by an "ifconfig igb9 up" fixes the issue, as also described in this thread: https://forum.opnsense.org/index.php?topic=4850.msg21238#msg21238
- I was able to reproduce the issue only two times. It occurred soon after booting the OPNsense machine. I do not get any hints in /var/log/system.log.
- I have attached a zip file with the output of various tools.
- Unfortunately I have not checked the ARP cache when the problem occurred. In case I can reproduce it again, I'll check that.
- I have also tried pfSense 2.3.4. I was not able to reproduce the error there, but this may have been just luck, or it may work there because pfSense uses the older FreeBSD 10.3 rather than FreeBSD 11.0.
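The down/up workaround from the list above could be wrapped in a small helper. This is only a minimal sketch: the function name bounce_iface, the DRY_RUN switch and the 2-second pause are assumptions for illustration and not part of the reported setup.

```shell
#!/bin/sh
# Bounce a network interface: take it down, wait briefly, bring it up again.
# Assumption: bounce_iface, DRY_RUN and the 2-second pause are illustrative
# choices, not something taken from the original report.
bounce_iface() {
    iface="$1"
    if [ -z "$iface" ]; then
        echo "usage: bounce_iface <interface>" >&2
        return 1
    fi
    if [ -n "$DRY_RUN" ]; then
        # Dry run: only print the commands that would be executed.
        echo "ifconfig $iface down"
        echo "ifconfig $iface up"
        return 0
    fi
    ifconfig "$iface" down || return 1
    sleep 2
    ifconfig "$iface" up
}
```

Calling "DRY_RUN=1 bounce_iface igb9" prints the two ifconfig commands without touching the interface; without DRY_RUN the function needs root privileges.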
Have you ever seen an issue like this?
Do you have any hints on what I could do to analyze the issue further?
Thanks in advance and best regards,
Werner
-
I also experience the same problem. My OPNsense is a VM (on ESXi) with Intel NICs, and I've never been able to find anything about troubleshooting this problem (I've previously posted this on the forum without a response). I thought it was a rare occurrence until I added a cron job to test the connection every minute. This is the (abbreviated) result:
2017-07-04 15:23:15 - WAN interface Restarted on OPNsense
2017-07-04 15:24:15 - WAN interface Restarted on OPNsense
2017-07-06 03:46:22 - WAN interface Restarted on OPNsense
2017-07-06 04:17:22 - WAN interface Restarted on OPNsense
2017-07-07 17:41:22 - WAN interface Restarted on OPNsense
2017-07-09 06:35:22 - WAN interface Restarted on OPNsense
2017-07-09 07:06:22 - WAN interface Restarted on OPNsense
2017-07-10 20:30:22 - WAN interface Restarted on OPNsense
2017-07-12 09:24:22 - WAN interface Restarted on OPNsense
2017-07-12 09:55:22 - WAN interface Restarted on OPNsense
2017-07-12 12:53:22 - WAN interface Restarted on OPNsense
2017-07-12 12:55:15 - WAN interface Restarted on OPNsense
2017-07-12 12:56:15 - WAN interface Restarted on OPNsense
I thought that apinger might restart the interface but that appears not to be the case. :( This hasn't always happened but I can't remember when it did start other than it seems to be recent, as in sometime this year.
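To see how often the restarts cluster, log lines in the format shown above could be counted per day. This is a minimal sketch; the function name restarts_per_day is an assumption, and it only relies on each line starting with a "YYYY-MM-DD" date.

```shell
#!/bin/sh
# Count "WAN interface Restarted" log entries per calendar day.
# Assumption: every line starts with "YYYY-MM-DD HH:MM:SS - " as in the
# log excerpt above; the function name is illustrative.
restarts_per_day() {
    # Reads the files given as arguments, or stdin if none are given.
    awk '/WAN interface Restarted/ { count[$1]++ }
         END { for (day in count) print day, count[day] }' "$@" | sort
}
```

The function takes the log file as its argument, or reads the log from stdin, and prints one "date count" pair per line.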
-
Thank you for reporting your experiences, Bill.
So your cron job checks the availability of the WAN interface and restarts the interface in case it is not available, right?
Could you maybe post your cron job script?
-
Sure, here's the script I use:
#!/bin/sh
# Replace the placeholder with your WAN gateway IP address.
GATEWAY="[your_wan_gateway]"
# -q quiet
# -c number of pings to perform
if ping -q -c5 "$GATEWAY" > /dev/null 2>&1
then
    echo "ok"
else
    # When we restart the NIC we also need to run dhclient to get our (fixed) IP address:
    /etc/rc.d/netif restart vmx0 > /dev/null 2>&1 ; dhclient vmx0 > /dev/null 2>&1
    echo "$(date '+%Y-%m-%d %H:%M:%S') - WAN interface Restarted on $(hostname -s)" >> /usr/home/restart_wan.log
fi
I put the script (and the log file) in /usr/home then modify the crontab to add this line at the end:
* * * * * (/usr/home/restart_wan) > /dev/null
I can't really take any credit for that (I found a similar script on the internet) as I'm not very experienced in scripting and it can probably be improved but it works for me.
-
Thank you for your script.
This morning, the error happened again:
- I switched on the OPNsense system (with igb9 connected as DHCP client for the WAN uplink, the WAN gateway being 10.1.102.1).
- At "05:45:52 UTC" I connected my laptop to igb0 as a client (getting an IP address via DHCP from OPNsense).
- For about 2-5 minutes, I was able to use the internet on the laptop. Then I noticed that I could not ping the default gateway 10.1.102.1 of the WAN uplink.
- Then I SSH'ed into OPNsense at "05:53:40 UTC". Here follows the shortened log of my session:
root@OPNsense:~ # date
Wed Jul 19 05:53:53 UTC 2017
root@OPNsense:~ # ifconfig
igb0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
options=4400b8<VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,VLAN_HWTSO,TXCSUM_IPV6>
ether 00:30:18:cd:e8:54
inet 192.168.1.1 netmask 0xffffff00 broadcast 192.168.1.255
inet6 fe80::1:1%igb0 prefixlen 64 scopeid 0x1
nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
media: Ethernet autoselect (1000baseT <full-duplex>)
status: active
igb1: flags=8c02<BROADCAST,OACTIVE,SIMPLEX,MULTICAST> metric 0 mtu 1500
[...]
igb9: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
options=4400b8<VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,VLAN_HWTSO,TXCSUM_IPV6>
ether 00:30:18:cd:ec:63
inet6 fe80::230:18ff:fecd:ec63%igb9 prefixlen 64 scopeid 0xa
inet 10.1.102.55 netmask 0xffffff00 broadcast 10.1.102.255
nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
media: Ethernet autoselect (1000baseT <full-duplex>)
status: active
[...]
root@OPNsense:~ # arp -a
? (10.1.102.1) at 4c:5e:0c:4b:23:30 on igb9 expires in 1084 seconds [ethernet]
? (10.1.102.55) at 00:30:18:cd:ec:63 on igb9 permanent [ethernet]
? (192.168.1.1) at 00:30:18:cd:e8:54 on igb0 permanent [ethernet]
? (192.168.1.100) at f0:de:f1:f3:17:88 on igb0 expires in 1136 seconds [ethernet]
root@OPNsense:~ # time arp -a
? (10.1.102.1) at 4c:5e:0c:4b:23:30 on igb9 expires in 1016 seconds [ethernet]
? (10.1.102.55) at 00:30:18:cd:ec:63 on igb9 permanent [ethernet]
? (192.168.1.1) at 00:30:18:cd:e8:54 on igb0 permanent [ethernet]
? (192.168.1.100) at f0:de:f1:f3:17:88 on igb0 expires in 1172 seconds [ethernet]
0.000u 0.004s 0:23.29 0.0% 0+0k 0+0io 0pf+0w
root@OPNsense:~ # clog /var/log/system.log | tail -n 150
[...]
Jul 19 05:37:41 OPNsense kernel: uhub1: 4 ports with 4 removable, self powered
Jul 19 05:37:41 OPNsense kernel: igb9: link state changed to UP
Jul 19 05:37:41 OPNsense kernel: aesni0: No AESNI support.
Jul 19 05:37:42 OPNsense kernel: done.
Jul 19 05:37:42 OPNsense kernel: igb9: link state changed to DOWN
Jul 19 05:37:42 OPNsense sshlockout[12834]: sshlockout/webConfigurator v3.0 starting up
Jul 19 05:37:42 OPNsense configd.py: [3a2068ae-8494-4ad6-9476-7ef4d08a0ce5] Linkup stopping igb9
Jul 19 05:37:46 OPNsense kernel: igb9: link state changed to UP
Jul 19 05:37:46 OPNsense configd.py: [f7d9bfd1-e05b-40c2-b52b-fd08a8c054c3] Linkup starting igb9
Jul 19 05:37:49 OPNsense kernel: done.
Jul 19 05:37:49 OPNsense kernel: pflog0: promiscuous mode enabled
Jul 19 05:37:50 OPNsense kernel: ...done.
Jul 19 05:37:50 OPNsense kernel: done.
Jul 19 05:37:50 OPNsense sshd[48047]: Server listening on :: port 22.
Jul 19 05:37:50 OPNsense sshd[48047]: Server listening on 0.0.0.0 port 22.
[...]
Jul 19 05:38:02 OPNsense sshlockout[70807]: sshlockout/webConfigurator v3.0 starting up
Jul 19 05:38:02 OPNsense kernel: OK
Jul 19 05:38:04 OPNsense kernel:
Jul 19 05:45:52 OPNsense kernel: igb0: link state changed to UP
Jul 19 05:45:52 OPNsense configd.py: [ac2de28c-7b2a-4fa0-9f27-2e8536f2c95d] Linkup starting igb0
Jul 19 05:45:53 OPNsense opnsense: /usr/local/etc/rc.linkup: DEVD Ethernet attached event for lan
Jul 19 05:45:53 OPNsense opnsense: /usr/local/etc/rc.linkup: HOTPLUG: Configuring interface lan
Jul 19 05:45:56 OPNsense configd.py: [2355f16d-4b08-4c1d-85db-cc50b95f937e] updating dyndns lan
Jul 19 05:45:56 OPNsense configd.py: [6072a8dd-cf07-4c0d-aea2-c0b32445b557] updating rfc2136 lan
Jul 19 05:53:40 OPNsense sshd[30826]: Postponed keyboard-interactive for root from 192.168.1.100 port 35188 ssh2 [preauth]
Jul 19 05:53:44 OPNsense opnsense: user 'root' authenticated successfully
Jul 19 05:53:44 OPNsense sshd[30826]: Postponed keyboard-interactive/pam for root from 192.168.1.100 port 35188 ssh2 [preauth]
Jul 19 05:53:44 OPNsense sshd[30826]: Accepted keyboard-interactive/pam for root from 192.168.1.100 port 35188 ssh2
root@OPNsense:~ # date
Wed Jul 19 06:05:53 UTC 2017
root@OPNsense:~ # time arp -a
? (10.1.102.1) at 4c:5e:0c:4b:23:30 on igb9 expires in 204 seconds [ethernet]
? (10.1.102.55) at 00:30:18:cd:ec:63 on igb9 permanent [ethernet]
? (192.168.1.1) at 00:30:18:cd:e8:54 on igb0 permanent [ethernet]
? (192.168.1.100) at f0:de:f1:f3:17:88 on igb0 expires in 1101 seconds [ethernet]
0.000u 0.004s 0:23.33 0.0% 0+0k 0+0io 0pf+0w
root@OPNsense:~ # date
Wed Jul 19 06:11:33 UTC 2017
root@OPNsense:~ # ping 10.1.102.1
PING 10.1.102.1 (10.1.102.1): 56 data bytes
^C
--- 10.1.102.1 ping statistics ---
5 packets transmitted, 0 packets received, 100.0% packet loss
root@OPNsense:~ # time arp -a
? (10.1.102.1) at (incomplete) on igb9 expired [ethernet]
? (10.1.102.55) at 00:30:18:cd:ec:63 on igb9 permanent [ethernet]
? (192.168.1.1) at 00:30:18:cd:e8:54 on igb0 permanent [ethernet]
? (192.168.1.100) at f0:de:f1:f3:17:88 on igb0 expires in 1074 seconds [ethernet]
0.000u 0.004s 0:23.59 0.0% 0+0k 0+0io 0pf+0w
My findings:
- Interface igb9 is still up.
- I cannot ping the default gateway of the WAN (10.1.102.1) anymore.
- Executing "arp -a" takes about 23 seconds (because it tries to resolve IP addresses to hostnames; I forgot to use the "-n" option of arp).
- I did a tcpdump for igb9 on the OPNsense system when I executed a ping. I only see the broadcasts being sent, but no answer (see attachment).
Does anybody have an idea or hint about what else I could run to analyze this issue? (I am keeping the OPNsense system in this state, as it is not easy to trigger the issue.)
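The telltale sign in the session above is the gateway's ARP entry changing from a MAC address to "(incomplete)". A check for that state could be sketched as follows; the function name arp_state is an assumption, and the parsing relies only on the "? (ip) at mac on iface ..." layout seen in the arp output above.

```shell
#!/bin/sh
# Report the ARP state of one IP address from arp-style input on stdin.
# Prints "resolved <mac>", "incomplete" or "missing".
# Assumption: input lines look like "? (10.1.102.1) at <mac> on igb9 ...";
# the function name is illustrative.
arp_state() {
    ip="$1"
    awk -v ip="($ip)" '
        $2 == ip {
            found = 1
            if ($4 == "(incomplete)") print "incomplete"
            else print "resolved " $4
            exit
        }
        END { if (!found) print "missing" }
    '
}
```

On the firewall this could be fed with "arp -an | arp_state 10.1.102.1"; the "-n" option avoids the slow hostname lookups.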
-
Hi Werner,
Which one is the WAN interface, igb9 or igb0?
WAN is DHCP, right?
What do the following commands yield?
# ps aux | grep dhclient
# netstat -nr | grep default
Does this bring it back?
# killall dhclient
# dhclient igbX
If not, does this?
# /usr/local/etc/rc.newwanip wan
Cheers,
Franco
-
Hi Franco,
thanks a lot for your friendly help.
WAN is igb9 and WAN is DHCP.
Unfortunately, the commands did not help to restore connectivity. But it's surprising to me that executing dhclient again shows a "DHCPACK". Here is my log:
root@OPNsense:~ # ps aux | grep dhclient
root 15548 0.0 0.0 1076296 2844 - Is 05:37 0:00.00 dhclient: igb9 [priv] (dhclient)
_dhcp 24493 0.0 0.0 1076296 2908 - Is 05:37 0:00.00 dhclient: igb9 (dhclient)
root 59205 0.0 0.0 1080488 2856 0 S+ 08:53 0:00.01 grep dhclient
root@OPNsense:~ # netstat -nr | grep default
default 10.1.102.1 UGS igb9
root@OPNsense:~ # ping 10.1.102.1
PING 10.1.102.1 (10.1.102.1): 56 data bytes
^C
--- 10.1.102.1 ping statistics ---
3 packets transmitted, 0 packets received, 100.0% packet loss
root@OPNsense:~ # date
Wed Jul 19 08:54:46 UTC 2017
root@OPNsense:~ # killall dhclient
root@OPNsense:~ # ps aux | grep dhclient
root 77072 0.0 0.0 1080488 2860 0 S+ 08:55 0:00.00 grep dhclient
root@OPNsense:~ # date
Wed Jul 19 08:55:16 UTC 2017
root@OPNsense:~ # dhclient igb9
DHCPREQUEST on igb9 to 255.255.255.255 port 67
DHCPACK from 10.1.102.5
bound to 10.1.102.55 -- renewal in 21600 seconds.
root@OPNsense:~ # date
Wed Jul 19 08:55:32 UTC 2017
root@OPNsense:~ # ping 10.1.102.1
PING 10.1.102.1 (10.1.102.1): 56 data bytes
^C
--- 10.1.102.1 ping statistics ---
5 packets transmitted, 0 packets received, 100.0% packet loss
root@OPNsense:~ # date
Wed Jul 19 08:56:08 UTC 2017
root@OPNsense:~ # arp -a -n
? (10.1.102.1) at (incomplete) on igb9 expired [ethernet]
? (10.1.102.55) at 00:30:18:cd:ec:63 on igb9 permanent [ethernet]
? (192.168.1.1) at 00:30:18:cd:e8:54 on igb0 permanent [ethernet]
? (192.168.1.101) at f0:de:f1:ab:ce:49 on igb0 expires in 1011 seconds [ethernet]
root@OPNsense:~ # date
Wed Jul 19 08:56:36 UTC 2017
root@OPNsense:~ # /usr/local/etc/rc.newwanip wan
root@OPNsense:~ # date
Wed Jul 19 08:57:09 UTC 2017
root@OPNsense:~ # ping 10.1.102.1
PING 10.1.102.1 (10.1.102.1): 56 data bytes
^C
--- 10.1.102.1 ping statistics ---
3 packets transmitted, 0 packets received, 100.0% packet loss
root@OPNsense:~ # date
Wed Jul 19 08:57:25 UTC 2017
root@OPNsense:~ #
In the past, an "ifconfig igb9 down" followed by an "ifconfig igb9 up" brought the connectivity back up again, but I am keeping the system in the current state in case you have any other ideas.
Should I do the steps again while recording a tcpdump?
Thanks,
Werner
-
I had a similar problem with another NIC on an embedded device.
It is possible that the power management / energy saver mode of your Intel NIC does not work correctly.
For the i211 Gigabit NIC, disable:
* the Energy Efficient Ethernet (EEE) mode
* and WOL (Wake-on-LAN) for this interface.
Test it manually with ethtool.
Disable Energy-Efficient Ethernet mode:
ethtool --set-eee <your nic device> eee off
Disable Wake-on-LAN:
ethtool -s <your nic device> wol d
If it works, add the commands to /etc/rc.local so that they are applied at boot.
Sincerely,
Wilbo.
-
Thank you wilbolinux for your suggestion to check the Energy Efficient Ethernet settings.
I have searched for ethtool, but it seems that this tool is not available for FreeBSD.
Does anybody have an idea how the Energy Efficient Ethernet settings can be checked for igb devices under FreeBSD/OPNsense?
-
Hi Werner,
Uhh, I do remember this now...
sysctl -a | grep dev.igb...eee_disabled
Set it to 1 for each interface to disable EEE (Energy Efficient Ethernet) as this may cause up/down issues.
Apologies, I took this from an old e-mail thread, it's not on the forum so hard to find.
Cheers,
Franco
-
Hi Franco,
thank you so much for your fast and confident answer! This makes me very optimistic that we can solve this issue :)
I will test this and let you know within the upcoming days whether I was able to fix my issue by disabling EEE.
Thanks again,
Werner
-
Hi Franco,
I was able to reproduce the problem with OPNsense 17.7 (after I have updated the server mentioned above from 17.1 to 17.7).
I have then disabled Energy-Efficient Ethernet for all 10 NICs of the system (setting the tunables from "dev.igb.0.eee_disabled" to "dev.igb.9.eee_disabled" to "1"). After that, I have had no problems anymore.
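Rather than typing the ten tunable names by hand, the list could be generated. This is a minimal sketch; the function name gen_eee_tunables is an assumption, and the interface count has to match the number of igb devices in the system.

```shell
#!/bin/sh
# Print one "dev.igb.<n>.eee_disabled=1" line per igb interface,
# numbered from 0 to count-1.
# Assumption: the function name is illustrative; the caller supplies
# the number of igb devices.
gen_eee_tunables() {
    count="$1"
    i=0
    while [ "$i" -lt "$count" ]; do
        echo "dev.igb.$i.eee_disabled=1"
        i=$((i + 1))
    done
}
```

For this system, "gen_eee_tunables 10" prints the entries for igb0 through igb9, which can then be entered wherever the tunables are configured.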
I have documented the fix in our wiki here (in German): https://www.thomas-krenn.com/de/wiki/OPNsense_igb_EEE_Funktion_deaktivieren
In case the problem arises again against my expectation, I will post it here - I hope this will not be necessary ;)
So thanks again for your help,
best regards,
Werner
-
Hi all,
I think Energy Efficient Ethernet (EEE) should be disabled by default in OPNsense.
cheers
till
-
It seems to only affect a tiny fraction of igb devices. And since the number of igb devices is device-specific, we'd have to write detection code as well or alternatively hardcode the EEE default in the kernel. Unless we narrow these chipsets down to look at the full picture, I am not sure what to do.
Cheers,
Franco
-
Hi again,
unfortunately the problem has now occurred again, although I have set eee_disabled to 1 for all 10 NICs.
The error occurred about 3-4 minutes after I booted the firewall system. I was able to use the Internet from my laptop behind the firewall, but after 3-4 minutes I was no longer able to ping the default gateway of the WAN link:
root@OPNsense:~ # ping 10.1.102.1
PING 10.1.102.1 (10.1.102.1): 56 data bytes
^C
--- 10.1.102.1 ping statistics ---
4 packets transmitted, 0 packets received, 100.0% packet loss
root@OPNsense:~ # sysctl -a | grep -i eee
options IEEE80211_SUPPORT_MESH
options IEEE80211_AMPDU_AGE
options IEEE80211_DEBUG
z0xfffff8000baeee80 [label="r0w0e0"];
z0xfffff8000baeee80 -> z0xfffff8000521d300;
z0xfffff8000bb5be00 -> z0xfffff8000baeee80;
<consumer id="0xfffff8000baeee80">
hw.bxe.autogreeen: 0
hw.em.eee_setting: 1
dev.igb.9.eee_disabled: 1
dev.igb.8.eee_disabled: 1
dev.igb.7.eee_disabled: 1
dev.igb.6.eee_disabled: 1
dev.igb.5.eee_disabled: 1
dev.igb.4.eee_disabled: 1
dev.igb.3.eee_disabled: 1
dev.igb.2.eee_disabled: 1
dev.igb.1.eee_disabled: 1
dev.igb.0.eee_disabled: 1
root@OPNsense:~ #
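Since "grep -i eee" also matches unrelated lines, a narrower filter could list only the igb interfaces where EEE is not yet disabled. This is a minimal sketch; the function name eee_not_disabled is an assumption, and it expects "name: value" lines as printed by sysctl above.

```shell
#!/bin/sh
# Read sysctl-style "name: value" lines on stdin and print the name of
# every dev.igb.<n>.eee_disabled entry whose value is not 1.
# Assumption: the function name is illustrative.
eee_not_disabled() {
    awk -F': ' '/^dev\.igb\.[0-9]+\.eee_disabled: / && $2 != "1" { print $1 }'
}
```

Piping "sysctl dev.igb" into the function prints nothing when all interfaces report eee_disabled as 1, as in the output above.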
Does anybody have any other hints as to what the root cause of this issue could be?
PS: I will now start some tests in parallel on another system from the same board manufacturer http://www.jetwaycomputer.com/JBC385F551.html I will keep you updated how the tests run there.
-
Where do you set these values, under System: Settings: Tunables?
The settings may be too late if the boot takes long, maybe they could also be set under /boot/loader.conf.local, but I'm not sure.
Cheers,
Franco
-
Hi Franco,
thank you very much for your hint. Indeed, I'm setting the variables as tunables - like I described here: https://www.thomas-krenn.com/de/wiki/OPNsense_igb_EEE_Funktion_deaktivieren
Regarding configuration files, you mentioned that I could try setting it in /boot/loader.conf.local.
In a blog posting about network tuning in BSD - https://calomel.org/freebsd_network_tuning.html - the Intel igb EEE setting is described to be set in /etc/sysctl.conf
My questions:
- Should I add the settings in both files to be on the safe side?
- Is there anything else I should test right now (as the error is currently present) before I apply the settings and reboot the system?
Thanks again very much for your valuable help.
-
On my test system (meanwhile running OPNsense 17.7) I have now added the setting to both /boot/loader.conf.local and /etc/sysctl.conf (and rebooted the system afterwards):
root@OPNsense:~ # cat /boot/loader.conf.local
dev.igb.0.eee_disabled=1
dev.igb.1.eee_disabled=1
dev.igb.2.eee_disabled=1
dev.igb.3.eee_disabled=1
dev.igb.4.eee_disabled=1
dev.igb.5.eee_disabled=1
dev.igb.6.eee_disabled=1
dev.igb.7.eee_disabled=1
dev.igb.8.eee_disabled=1
dev.igb.9.eee_disabled=1
root@OPNsense:~ # cat /etc/sysctl.conf
# $FreeBSD$
#
# This file is read when going to multi-user and its contents piped thru
# ``sysctl'' to adjust kernel values. ``man 5 sysctl.conf'' for details.
#
# Uncomment this to prevent users from seeing information about processes that
# are being run under another UID.
#security.bsd.see_other_uids=0
dev.igb.0.eee_disabled=1
dev.igb.1.eee_disabled=1
dev.igb.2.eee_disabled=1
dev.igb.3.eee_disabled=1
dev.igb.4.eee_disabled=1
dev.igb.5.eee_disabled=1
dev.igb.6.eee_disabled=1
dev.igb.7.eee_disabled=1
dev.igb.8.eee_disabled=1
dev.igb.9.eee_disabled=1
root@OPNsense:~ #
I'll continue to use this setup for the next 2 weeks (I'm powering down the firewall before I leave the office, as the problem occurred only 3-5 minutes after booting the firewall - at least in my tests).
By the way: With the current pfSense version I have not been able to reproduce this issue, although its FreeBSD 10.3 base uses the same igb driver version (2.5.3) as FreeBSD 11.0 does. Are there maybe any other networking changes/tunables between FreeBSD 10.3 and 11.0 that could lead to this issue?
I'll keep you updated once I have any news on the issue.
Best regards,
Werner
-
Unfortunately, the problem has now occurred again, although I have added the settings to both /boot/loader.conf.local and /etc/sysctl.conf as described above.
As a next step to narrow down the root cause, I will continue to test with another system. The current system has 10 * Intel i211AT Gigabit LAN, the second (which I want to test now - http://www.jetwaycomputer.com/JBC385F551.html) has the following NICs:
- 1 x Intel i219-LM PHY Gigabit LAN (iAMT 11)
- 1 x Intel i211-AT PCI-E Gigabit LAN
- 4 x Intel i350-AM4 PCI-E Gigabit LAN
Maybe the problem only affects the Intel i211-AT chip...
I will keep you updated. In case you have any news/ideas, just let me know.
Thanks & best regards,
Werner
-
As I think that the network issues might be related to some energy saving functions, I have now switched back to the JBC390F541AA-19-B system with its 10 Intel i211-AT based NICs.
I have changed the BIOS setting (BIOS Version file BAR1NA02, BIOS Date 02/25/2016) to the following settings:
- [F3] (Load Optimized Defaults)
- Advanced -> OS Selection -> Android (instead of the default "Windows 7")
- Advanced -> ACPI Settings -> ACPI Sleep State -> Suspend Disabled (instead of the default "S3 (Suspend to RAM)")
- Advanced -> CPU Configuration -> EIST -> Disabled (instead of the default "Enabled")
- Advanced -> CPU Configuration -> Max CPU C State -> C1 (instead of the default "C7")
- Chipset -> South Bridge -> Audio Controller -> Disabled (instead of the default "Enabled")
- Chipset -> South Bridge -> Azalia HDMI Codec -> Disabled (instead of the default "Enabled")
- Chipset -> South Bridge -> System State after Power Failure -> Former State (instead of the default "Always Off")
And I have set the following variables as suggested/mentioned in https://www.freebsd.org/cgi/man.cgi?query=pci&sektion=4 and https://calomel.org/freebsd_network_tuning.html
root@OPNsense:~ # cat /boot/loader.conf.local
dev.igb.0.eee_disabled=1
dev.igb.1.eee_disabled=1
dev.igb.2.eee_disabled=1
dev.igb.3.eee_disabled=1
dev.igb.4.eee_disabled=1
dev.igb.5.eee_disabled=1
dev.igb.6.eee_disabled=1
dev.igb.7.eee_disabled=1
dev.igb.8.eee_disabled=1
dev.igb.9.eee_disabled=1
hw.pci.do_power_suspend=0
dev.igb.0.fc=0
dev.igb.1.fc=0
dev.igb.2.fc=0
dev.igb.3.fc=0
dev.igb.4.fc=0
dev.igb.5.fc=0
dev.igb.6.fc=0
dev.igb.7.fc=0
dev.igb.8.fc=0
dev.igb.9.fc=0
root@OPNsense:~ # cat /etc/sysctl.conf
# $FreeBSD$
#
# This file is read when going to multi-user and its contents piped thru
# ``sysctl'' to adjust kernel values. ``man 5 sysctl.conf'' for details.
#
# Uncomment this to prevent users from seeing information about processes that
# are being run under another UID.
#security.bsd.see_other_uids=0
dev.igb.0.eee_disabled=1
dev.igb.1.eee_disabled=1
dev.igb.2.eee_disabled=1
dev.igb.3.eee_disabled=1
dev.igb.4.eee_disabled=1
dev.igb.5.eee_disabled=1
dev.igb.6.eee_disabled=1
dev.igb.7.eee_disabled=1
dev.igb.8.eee_disabled=1
dev.igb.9.eee_disabled=1
hw.pci.do_power_suspend=0
dev.igb.0.fc=0
dev.igb.1.fc=0
dev.igb.2.fc=0
dev.igb.3.fc=0
dev.igb.4.fc=0
dev.igb.5.fc=0
dev.igb.6.fc=0
dev.igb.7.fc=0
dev.igb.8.fc=0
dev.igb.9.fc=0
root@OPNsense:~ #
I have also set all these variables as "tunables" in OPNsense, as some settings (e.g. "dev.igb.0.fc=0") have not set the desired value (sysctl -a reported e.g. "dev.igb.0.fc=0" - adding the variables as tunables in OPNsense fixed this).
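Whether the configured values actually took effect after a reboot could be checked by comparing the configuration file with the live sysctl values. This is a minimal sketch under assumptions: the function name compare_tunables is hypothetical, the first file holds "name=value" lines (as in /etc/sysctl.conf), and the second file holds saved sysctl output in "name: value" form.

```shell
#!/bin/sh
# Compare expected "name=value" settings with saved sysctl output
# ("name: value" lines) and print every setting that differs.
# Returns 0 if everything matches, 1 otherwise.
# Assumption: function name and file layout are illustrative.
compare_tunables() {
    expected="$1"
    actual="$2"
    status=0
    while IFS='=' read -r name want; do
        # Skip blank lines and comments.
        case "$name" in ''|'#'*) continue ;; esac
        got=$(awk -F': ' -v n="$name" '$1 == n { print $2; exit }' "$actual")
        if [ "$got" != "$want" ]; then
            echo "$name: expected $want, got ${got:-nothing}"
            status=1
        fi
    done < "$expected"
    return "$status"
}
```

One possible use is to save the live values with "sysctl -a > /tmp/sysctl_live.txt" (the path is just an example) and then run "compare_tunables /etc/sysctl.conf /tmp/sysctl_live.txt".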
I have added the output of "pciconf -lvbce" as an attachment (forum login needed to see it). I wanted to check the setting for Active State Power Management - ASPM (all devices show "ASPM disabled(L0s/L1)"). I have found a posting (although 5 years old), where someone suggests disabling this feature in the BIOS ("Just make sure you keep the Active State Power Management option in the Advanced Chipset Control BIOS screen at the Disabled setting (this is the default), because when I enabled this, my Intel NICs occasionally got stuck in a low power state, needing a full reset to resolve." see https://forums.freebsd.org/threads/35529/#post-195907). There is currently no option in the BIOS of the JBC390F541AA-19-B for ASPM. But as pciconf reports it as "disabled" I _think_ this should be ok.
I'll keep you updated on whether I get the NIC issues again or not.
-
I did not have the impression that the settings have helped (although I have not tested over a longer period and I did not see the issue during my short tests).
Meanwhile I got feedback from the board manufacturer regarding "Active State Power Management" (ASPM) for PCIe. There is no option for this in the BIOS version BAR1NA02, but the default setting is already disabled (as "pciconf -lvbce" shows). As ASPM is not activated, it cannot be causing my issue.
I now want to narrow down whether the problem has to do with FreeBSD version 11.0. With pfSense 2.3 (FreeBSD 10.3) we have not observed the issue. As pfSense 2.4 RC is out (currently using 11.0-RELEASE-p12), I'll check whether it is running into this problem (I _think/assume_ that the problem could arise then, too).
For my test I went back to the BIOS and set the following options:
- [F3] (Load Optimized Defaults)
- Advanced -> OS Selection -> Android (instead of the default "Windows 7")
I have kept the default igb driver settings (so dev.igb.9.eee_disabled is set to the default "0" for all 10 NICs).
I will keep you updated once I have any new information.
-
Hi Werner,
Sorry to hear this is still happening, but thank you for keeping on top of it! :)
Cheers,
Franco
-
Using pfSense 2.4 RC I also got some issues after some minutes of using it. From my client laptop (connected directly to igb0, the LAN link of the firewall) I was not able to reach the WAN network anymore. It did not break completely: I was still able to reach pfSense's web interface, but an SSH session that I had opened before broke:
[2.4.0-RC][admin@pfSense24.test.thomas-krenn.com]/root: ifconfig
igb0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
options=6400bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,VLAN_HWTSO,RXCSUM_IPV6,TXCSUM_IPV6>
ether 00:30:18:cd:e8:54
hwaddr 00:30:18:cd:e8:54
inet 192.168.1.1 netmask 0xffffff00 broadcast 192.168.1.255
inet6 fe80::1:1%igb0 prefixlen 64 scopeid 0x1
nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
media: Ethernet autoselect (1000baseT <full-duplex>)
status: active
igb1: flags=8c02<BROADCAST,OACTIVE,SIMPLEX,MULTICAST> metric 0 mtu 1500
options=6403bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,VLAN_HWTSO,RXCSUM_IPV6,TXCSUM_IPV6>
ether 00:30:18:cd:e8:55
hwaddr 00:30:18:cd:e8:55
nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
media: Ethernet autoselect
status: no carrier
igb2: flags=8c02<BROADCAST,OACTIVE,SIMPLEX,MULTICAST> metric 0 mtu 1500
options=6403bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,VLAN_HWTSO,RXCSUM_IPV6,TXCSUM_IPV6>
ether 00:30:18:cd:ef:80
hwaddr 00:30:18:cd:ef:80
nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
media: Ethernet autoselect
status: no carrier
igb3: flags=8c02<BROADCAST,OACTIVE,SIMPLEX,MULTICAST> metric 0 mtu 1500
options=6403bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,VLAN_HWTSO,RXCSUM_IPV6,TXCSUM_IPV6>
ether 00:30:18:cd:ef:81
hwaddr 00:30:18:cd:ef:81
nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
media: Ethernet autoselect
status: no carrier
igb4: flags=8c02<BROADCAST,OACTIVE,SIMPLEX,MULTICAST> metric 0 mtu 1500
options=6403bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,VLAN_HWTSO,RXCSUM_IPV6,TXCSUM_IPV6>
ether 00:30:18:cd:ef:82
hwaddr 00:30:18:cd:ef:82
nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
media: Ethernet autoselect
status: no carrier
igb5: flags=8c02<BROADCAST,OACTIVE,SIMPLEX,MULTICAST> metric 0 mtu 1500
options=6403bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,VLAN_HWTSO,RXCSUM_IPV6,TXCSUM_IPV6>
ether 00:30:18:cd:ef:83
hwaddr 00:30:18:cd:ef:83
nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
media: Ethernet autoselect
status: no carrier
igb6: flags=8c02<BROADCAST,OACTIVE,SIMPLEX,MULTICAST> metric 0 mtu 1500
options=6403bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,VLAN_HWTSO,RXCSUM_IPV6,TXCSUM_IPV6>
ether 00:30:18:cd:ec:60
hwaddr 00:30:18:cd:ec:60
nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
media: Ethernet autoselect
status: no carrier
igb7: flags=8c02<BROADCAST,OACTIVE,SIMPLEX,MULTICAST> metric 0 mtu 1500
options=6403bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,VLAN_HWTSO,RXCSUM_IPV6,TXCSUM_IPV6>
ether 00:30:18:cd:ec:61
hwaddr 00:30:18:cd:ec:61
nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
media: Ethernet autoselect
status: no carrier
igb8: flags=8c02<BROADCAST,OACTIVE,SIMPLEX,MULTICAST> metric 0 mtu 1500
options=6403bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,VLAN_HWTSO,RXCSUM_IPV6,TXCSUM_IPV6>
ether 00:30:18:cd:ec:62
hwaddr 00:30:18:cd:ec:62
nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
media: Ethernet autoselect
status: no carrier
igb9: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
options=6400bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,VLAN_HWTSO,RXCSUM_IPV6,TXCSUM_IPV6>
ether 00:30:18:cd:ec:63
hwaddr 00:30:18:cd:ec:63
inet6 fe80::230:18ff:fecd:ec63%igb9 prefixlen 64 scopeid 0xa
inet 10.1.102.55 netmask 0xffffff00 broadcast 10.1.102.255
nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
media: Ethernet autoselect (1000baseT <full-duplex>)
status: active
enc0: flags=0<> metric 0 mtu 1536
nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
groups: enc
lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
options=600003<RXCSUM,TXCSUM,RXCSUM_IPV6,TXCSUM_IPV6>
inet6 ::1 prefixlen 128
inet6 fe80::1%lo0 prefixlen 64 scopeid 0xc
inet 127.0.0.1 netmask 0xff000000
nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
groups: lo
pfsync0: flags=0<> metric 0 mtu 1500
groups: pfsync
syncpeer: 224.0.0.240 maxupd: 128 defer: on
syncok: 1
pflog0: flags=100<PROMISC> metric 0 mtu 33160
groups: pflog
[2.4.0-RC][admin@pfSense24.test.thomas-krenn.com]/root: ps aux
USER PID %CPU %MEM VSZ RSS TT STAT STARTED TIME COMMAND
root 11 398.3 0.0 0 64 - RL 10:33 78:02.96 [idle]
root 0 0.0 0.0 0 560 - DLs 10:33 0:00.01 [kernel]
root 1 0.0 0.0 5004 840 - ILs 10:33 0:00.01 /sbin/init --
root 2 0.0 0.0 0 16 - DL 10:33 0:00.00 [crypto]
root 3 0.0 0.0 0 16 - DL 10:33 0:00.00 [crypto returns]
root 4 0.0 0.0 0 32 - DL 10:33 0:00.00 [cam]
root 5 0.0 0.0 0 16 - DL 10:33 0:00.00 [sctp_iterator]
root 6 0.0 0.0 0 16 - DL 10:33 0:00.26 [pf purge]
root 7 0.0 0.0 0 16 - DL 10:33 0:00.37 [rand_harvestq]
root 8 0.0 0.0 0 16 - DL 10:33 0:00.00 [soaiod1]
root 9 0.0 0.0 0 16 - DL 10:33 0:00.00 [soaiod2]
root 10 0.0 0.0 0 16 - DL 10:33 0:00.00 [audit]
root 12 0.0 0.0 0 1040 - WL 10:33 0:05.85 [intr]
root 13 0.0 0.0 0 64 - DL 10:33 0:00.00 [ng_queue]
root 14 0.0 0.0 0 48 - DL 10:33 0:00.02 [geom]
root 15 0.0 0.0 0 96 - DL 10:33 0:00.06 [usb]
root 16 0.0 0.0 0 16 - DL 10:33 0:00.02 [acpi_thermal]
root 17 0.0 0.0 0 16 - DL 10:33 0:00.00 [soaiod3]
root 18 0.0 0.0 0 16 - DL 10:33 0:00.00 [soaiod4]
root 19 0.0 0.0 0 32 - DL 10:33 0:00.02 [pagedaemon]
root 20 0.0 0.0 0 16 - DL 10:33 0:00.00 [vmdaemon]
root 21 0.0 0.0 0 16 - DL 10:33 0:00.00 [pagezero]
root 22 0.0 0.0 0 16 - DL 10:33 0:00.01 [bufspacedaemon]
root 23 0.0 0.0 0 32 - DL 10:33 0:00.03 [bufdaemon]
root 24 0.0 0.0 0 16 - DL 10:33 0:00.01 [vnlru]
root 25 0.0 0.0 0 16 - DL 10:33 0:00.05 [syncer]
root 56 0.0 0.0 0 16 - DL 10:33 0:00.01 [md0]
root 294 0.0 0.3 269012 27140 - Ss 10:33 0:00.03 php-fpm: master process (/usr/local/lib/php-fpm.conf) (php-fpm)
root 308 0.0 0.1 19404 4504 - INs 10:33 0:00.01 /usr/local/sbin/check_reload_status
root 310 0.0 0.1 19404 4300 - IN 10:33 0:00.00 check_reload_status: Monitoring daemon of check_reload_status
root 322 0.0 0.1 9508 4912 - Ss 10:33 0:00.01 /sbin/devd -q -f /etc/pfSense-devd.conf
root 6410 0.0 0.0 10496 2304 - Is 10:33 0:00.00 dhclient: igb9 [priv] (dhclient)
_dhcp 12169 0.0 0.0 10496 2404 - Is 10:33 0:00.00 dhclient: igb9 (dhclient)
root 15133 0.0 0.0 12636 2344 - Ss 10:33 0:00.08 /usr/local/sbin/filterlog -i pflog0 -p /var/run/filterlog.pid
root 24528 0.0 0.0 10948 2316 - Is 10:33 0:00.26 /usr/local/bin/dpinger -S -r 0 -i WAN_DHCP -B 10.1.102.55 -p /v
unbound 25239 0.0 0.3 72792 24572 - Ss 10:33 0:00.48 /usr/local/sbin/unbound -c /var/unbound/unbound.conf
root 30402 0.0 0.1 35588 6900 - Is 10:33 0:00.00 nginx: master process /usr/local/sbin/nginx -c /var/etc/nginx-w
root 30627 0.0 0.1 37636 7652 - I 10:33 0:00.02 nginx: worker process (nginx)
root 30669 0.0 0.1 35588 7492 - I 10:33 0:00.00 nginx: worker process (nginx)
root 31269 0.0 0.0 12468 2360 - Is 10:33 0:00.00 /usr/sbin/cron -s
root 31823 0.0 0.2 24564 12396 - Ss 10:33 0:00.20 /usr/local/sbin/ntpd -g -c /var/etc/ntpd.conf -p /var/run/ntpd.
dhcpd 35871 0.0 0.2 22808 13404 - Ss 10:33 0:00.07 /usr/local/sbin/dhcpd -user dhcpd -group _dhcp -chroot /var/dhc
root 36272 0.0 0.0 10332 2296 - S 10:33 0:00.02 /usr/local/sbin/radvd -p /var/run/radvd.pid -C /var/etc/radvd.c
root 39167 0.0 0.4 269012 35036 - I 10:47 0:00.03 php-fpm: pool nginx (php-fpm)
root 41695 0.0 0.1 53408 7524 - Is 10:47 0:00.00 /usr/sbin/sshd
root 42294 0.0 0.1 78756 8056 - Ss 10:47 0:00.06 sshd: admin@pts/0 (sshd)
root 55386 0.0 0.0 10448 2516 - Ss 10:34 0:00.05 /usr/sbin/syslogd -s -c -c -l /var/dhcpd/var/run/log -P /var/ru
root 56705 0.0 0.0 8200 1984 - Is 10:34 0:00.00 /usr/local/bin/minicron 240 /var/run/ping_hosts.pid /usr/local/
root 57021 0.0 0.0 8200 2000 - I 10:34 0:00.00 minicron: helper /usr/local/bin/ping_hosts.sh (minicron)
root 57155 0.0 0.0 8200 1984 - Is 10:34 0:00.00 /usr/local/bin/minicron 3600 /var/run/expire_accounts.pid /usr/
root 57320 0.0 0.0 6148 1908 - IN 10:52 0:00.00 sleep 60
root 57414 0.0 0.0 8200 2000 - I 10:34 0:00.00 minicron: helper /usr/local/sbin/fcgicli -f /etc/rc.expireaccou
root 57694 0.0 0.0 8200 1984 - Is 10:34 0:00.00 /usr/local/bin/minicron 86400 /var/run/update_alias_url_data.pi
root 57970 0.0 0.0 8200 2000 - I 10:34 0:00.00 minicron: helper /usr/local/sbin/fcgicli -f /etc/rc.update_alia
root 89712 0.0 0.0 10552 2296 - Is 10:34 0:00.00 /usr/local/sbin/sshlockout_pf 15
root 40597 0.0 0.0 13048 2544 v0- IN 10:34 0:00.23 /bin/sh /var/db/rrd/updaterrd.sh
root 88321 0.0 0.0 39404 2816 v0 Is 10:34 0:00.01 login [pam] (login)
root 89819 0.0 0.0 13048 2888 v0 I 10:34 0:00.01 -sh (sh)
root 89857 0.0 0.0 13048 2760 v0 I+ 10:34 0:00.00 /bin/sh /etc/rc.initial
root 88339 0.0 0.0 10364 2120 v1 Is+ 10:34 0:00.00 /usr/libexec/getty Pc ttyv1
root 88645 0.0 0.0 10364 2120 v2 Is+ 10:34 0:00.00 /usr/libexec/getty Pc ttyv2
root 88769 0.0 0.0 10364 2120 v3 Is+ 10:34 0:00.00 /usr/libexec/getty Pc ttyv3
root 88815 0.0 0.0 10364 2120 v4 Is+ 10:34 0:00.00 /usr/libexec/getty Pc ttyv4
root 88999 0.0 0.0 10364 2120 v5 Is+ 10:34 0:00.00 /usr/libexec/getty Pc ttyv5
root 89192 0.0 0.0 10364 2120 v6 Is+ 10:34 0:00.00 /usr/libexec/getty Pc ttyv6
root 89470 0.0 0.0 10364 2120 v7 Is+ 10:34 0:00.00 /usr/libexec/getty Pc ttyv7
root 42960 0.0 0.0 13048 2760 0 Is 10:47 0:00.01 /bin/sh /etc/rc.initial
root 57901 0.0 0.0 21056 2684 0 R+ 10:53 0:00.00 ps aux
root 60110 0.0 0.0 13336 3780 0 S 10:47 0:00.04 /bin/tcsh
[2.4.0-RC][admin@pfSense24.test.thomas-krenn.com]/root: packet_write_wait: Connection to 192.168.1.1 port 22: Broken pipe
wfischer@tpw:/home/wfischer-isos/pfsense$
Running "ifconfig igb0 down" followed by "ifconfig igb0 up", and then re-applying the DHCP client configuration on my laptop (using Network Manager in Ubuntu 16.04), fixed it, but I'm not sure how long it will keep working.
As these symptoms are not exactly the same as the ones I had with OPNsense before, they might be specific to pfSense 2.4 RC. But I think that on this very system (JBC390F541AA-19-B) there are some networking issues with FreeBSD 11.0 which did not exist with FreeBSD 10.3.
I'll keep you updated once I have any news.
-
I also have a preliminary kernel for 11.1 (no HardenedBSD additions, no shared forwarding) if you want to try. There could be some fixes we are simply missing?
-
This would be nice, and for sure worth a try!
Can you send me some details on how to grab this kernel and how to apply it to OPNsense?
-
I've sent you the details via PM.
Thanks,
Franco
-
Thanks a lot for the details in the PM, I'll try this as soon as 17.7.1 is out and keep you updated once I have any new findings.
-
I still have pfSense 2.4 RC on the system, and this morning I got exactly the same problem as with OPNsense 17.1/17.7: it shows up after a short time of operation (about 1-3 minutes after boot). EEE was _not_ deactivated, but as I have also seen the issue on OPNsense with EEE deactivated, I won't run any tests with EEE deactivated on pfSense 2.4 RC for now.
Attached you will find a full set of logs and command output (in case it helps us to analyze the root cause of the issue).
One thing I want to show you right here: as you can see, an "ifconfig igb9 down" followed by an "ifconfig igb9 up" fixes the issue:
[2.4.0-RC][admin@pfSense24.test.thomas-krenn.com]/root: ping 10.1.102.1
PING 10.1.102.1 (10.1.102.1): 56 data bytes
ping: sendto: Host is down
ping: sendto: Host is down
ping: sendto: Host is down
ping: sendto: Host is down
^C
--- 10.1.102.1 ping statistics ---
4 packets transmitted, 0 packets received, 100.0% packet loss
[2.4.0-RC][admin@pfSense24.test.thomas-krenn.com]/root: arp -a
? (10.1.102.1) at (incomplete) on igb9 expired [ethernet]
? (10.1.102.55) at 00:30:18:cd:ec:63 on igb9 permanent [ethernet]
pfSense24.test.thomas-krenn.com (192.168.1.1) at 00:30:18:cd:e8:54 on igb0 permanent [ethernet]
? (192.168.1.100) at f0:de:f1:f3:17:88 on igb0 expires in 480 seconds [ethernet]
[2.4.0-RC][admin@pfSense24.test.thomas-krenn.com]/root: date
Thu Aug 31 09:55:44 CEST 2017
[2.4.0-RC][admin@pfSense24.test.thomas-krenn.com]/root: ifconfig igb9 down
[2.4.0-RC][admin@pfSense24.test.thomas-krenn.com]/root: date
Thu Aug 31 09:56:26 CEST 2017
[2.4.0-RC][admin@pfSense24.test.thomas-krenn.com]/root: ifconfig igb9 up
[2.4.0-RC][admin@pfSense24.test.thomas-krenn.com]/root: date
Thu Aug 31 09:56:34 CEST 2017
[2.4.0-RC][admin@pfSense24.test.thomas-krenn.com]/root: date
Thu Aug 31 09:56:41 CEST 2017
[2.4.0-RC][admin@pfSense24.test.thomas-krenn.com]/root: arp -a
? (10.1.102.1) at 4c:5e:0c:4b:23:30 on igb9 expires in 1196 seconds [ethernet]
? (10.1.102.55) at 00:30:18:cd:ec:63 on igb9 permanent [ethernet]
pfSense24.test.thomas-krenn.com (192.168.1.1) at 00:30:18:cd:e8:54 on igb0 permanent [ethernet]
? (192.168.1.100) at f0:de:f1:f3:17:88 on igb0 expires in 1199 seconds [ethernet]
[2.4.0-RC][admin@pfSense24.test.thomas-krenn.com]/root: date
Thu Aug 31 09:56:47 CEST 2017
[2.4.0-RC][admin@pfSense24.test.thomas-krenn.com]/root: ping 10.1.102.1
PING 10.1.102.1 (10.1.102.1): 56 data bytes
64 bytes from 10.1.102.1: icmp_seq=0 ttl=64 time=0.340 ms
64 bytes from 10.1.102.1: icmp_seq=1 ttl=64 time=0.246 ms
64 bytes from 10.1.102.1: icmp_seq=2 ttl=64 time=0.238 ms
64 bytes from 10.1.102.1: icmp_seq=3 ttl=64 time=0.225 ms
^C
--- 10.1.102.1 ping statistics ---
4 packets transmitted, 4 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 0.225/0.262/0.340/0.046 ms
[2.4.0-RC][admin@pfSense24.test.thomas-krenn.com]/root:
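The "(incomplete)" ARP entry for the gateway in the first "arp -a" output above is a handy marker for the failed state. Below is a minimal Python sketch for spotting it automatically. The function name and the regular expression are my own assumptions, written against the output format shown in this post; they are not part of any pfSense or OPNsense tooling.

```python
import re

# Matches lines of FreeBSD `arp -a` output such as:
#   ? (10.1.102.1) at (incomplete) on igb9 expired [ethernet]
ARP_LINE = re.compile(
    r"^(?P<host>\S+) \((?P<ip>[0-9.]+)\) at (?P<mac>\S+) on (?P<iface>\S+)"
)

# Sample taken from the transcript above (failed state).
SAMPLE = """\
? (10.1.102.1) at (incomplete) on igb9 expired [ethernet]
? (10.1.102.55) at 00:30:18:cd:ec:63 on igb9 permanent [ethernet]
pfSense24.test.thomas-krenn.com (192.168.1.1) at 00:30:18:cd:e8:54 on igb0 permanent [ethernet]
? (192.168.1.100) at f0:de:f1:f3:17:88 on igb0 expires in 480 seconds [ethernet]
"""


def incomplete_entries(arp_output, iface=None):
    """Return (ip, iface) pairs whose MAC address is unresolved.

    If `iface` is given, only entries on that interface are reported.
    """
    found = []
    for line in arp_output.splitlines():
        match = ARP_LINE.match(line.strip())
        if not match or match.group("mac") != "(incomplete)":
            continue
        if iface is None or match.group("iface") == iface:
            found.append((match.group("ip"), match.group("iface")))
    return found
```

Feeding it the text of "arp -a" (for example SAMPLE above) and checking for the gateway address on igb9 would tell whether the system is in the failed state at that moment.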
Before I continue testing with the upcoming 17.7.1 and the preliminary kernel, I think I'll test with pfSense 2.3 (which is based on FreeBSD 10.3) and check whether it really runs rock-solid (to have more evidence that FreeBSD 11.0 is causing the issue, while FreeBSD 10.3 has no such issue).
If pfSense 2.3 indeed runs as solidly as expected, I'll grab all the output for analysis (especially "sysctl -a") and compare it to the outputs that I have attached to this post. Maybe we will find some settings that differ and could be the reason for this problem.
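For the "sysctl -a" comparison, a small Python sketch like the following could do the diffing. It is only a sketch under assumptions: the function names are hypothetical, it expects two saved dumps in the usual "name: value" form, and continuation lines of multi-line values that do not look like "name: value" are simply skipped.

```python
def parse_sysctl(text):
    """Parse `sysctl -a` style output ("name: value") into a dict.

    Lines that do not look like "name: value" (for example the
    continuation lines of multi-line values) are ignored.
    """
    values = {}
    for line in text.splitlines():
        name, sep, value = line.partition(": ")
        if sep and name and " " not in name:
            values[name] = value.strip()
    return values


def diff_sysctl(old_text, new_text):
    """Compare two dumps.

    Returns (only_in_old, only_in_new, changed), where `changed`
    maps a name to its (old_value, new_value) pair.
    """
    old, new = parse_sysctl(old_text), parse_sysctl(new_text)
    only_old = sorted(set(old) - set(new))
    only_new = sorted(set(new) - set(old))
    changed = {
        name: (old[name], new[name])
        for name in old.keys() & new.keys()
        if old[name] != new[name]
    }
    return only_old, only_new, changed
```

The two inputs would be the contents of the saved "sysctl -a" files from the FreeBSD 10.3 based and the FreeBSD 11.0 based installation.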
I will keep you updated ;)
-
Since my last post on August 31st, I've been running pfSense 2.3.4 on the system without any issues. So I'm fairly sure that my problem has to do with FreeBSD 11.0 vs. FreeBSD 10.3. I have attached a ZIP with the logs of pfSense 2.3.4.
I have searched for differences and I have found:
- On pfSense 2.3.4, sysctl -a shows the item "hw.igb.buf_ring_size: 4096". This item is missing on pfSense 2.4-RC. Could this be causing the issue? (UPDATE: on OPNsense 17.7.2 "hw.igb.buf_ring_size: 4096" is present, so I think this is not the root cause of my problem)
- On pfSense 2.4-RC, dmesg shows PCI entries marked "[GIANT-LOCKED]". pfSense 2.3.4 does not list any such entries, see the output below. Could this be causing the issue? (UPDATE: OPNsense 17.7.2 also shows "[GIANT-LOCKED]")
Here is the dmesg excerpt in question:
pfSense 2.3.4 dmesg:
...
pcib19: <PCI-PCI bridge> irq 19 at device 4.0 on pci15
pci19: <PCI bus> on pcib19
igb9: <Intel(R) PRO/1000 Network Connection, Version - 2.5.3-k> port 0x5000-0x501f mem 0xd0600000-0xd061ffff,0xd0620000-0xd0623fff irq 19 at device 0.0 on pci19
...
pfSense 2.4.0-RC dmesg:
...
pcib19: <PCI-PCI bridge> irq 19 at device 4.0 on pci12
pcib19: [GIANT-LOCKED]
pci16: <PCI bus> on pcib19
igb9: <Intel(R) PRO/1000 Network Connection, Version - 2.5.3-k> port 0x5000-0x501f mem 0xd0600000-0xd061ffff,0xd0620000-0xd0623fff irq 19 at device 0.0 on pci16
...
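To compare complete dmesg outputs instead of excerpts, the "[GIANT-LOCKED]" markers can be collected with a few lines of Python. This is a minimal sketch; the function name is hypothetical and the pattern is based only on the line format shown above.

```python
import re

# Matches marker lines such as "pcib19: [GIANT-LOCKED]".
GIANT = re.compile(r"^(?P<dev>\w+): \[GIANT-LOCKED\]\s*$")


def giant_locked_devices(dmesg_text):
    """Return the device names flagged [GIANT-LOCKED], in order of appearance."""
    devices = []
    for line in dmesg_text.splitlines():
        match = GIANT.match(line.strip())
        if match:
            devices.append(match.group("dev"))
    return devices
```

Run against the saved dmesg output of both installations, the two resulting lists show which devices are flagged on one system but not on the other.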
Tomorrow, I will switch back to OPNsense and I will install the FreeBSD 11.1 kernel. I'll keep you updated.
-
I have now installed a FreeBSD 11.1 kernel (I got instructions for that from Franco via PM):
root@OPNsense:~ # freebsd-version -k
11.1-RELEASE-p1
root@OPNsense:~ # freebsd-version -u
11.0-RELEASE-p12
root@OPNsense:~ #
With FreeBSD 11.1, the PCI bridges in front of the NICs also show "GIANT-LOCKED" in the dmesg output (as they do with FreeBSD 11.0, but not with pfSense 2.3/FreeBSD 10.3, which does not have the issue):
...
pcib18: <PCI-PCI bridge> irq 18 at device 3.0 on pci12
pcib18: [GIANT-LOCKED]
pci15: <PCI bus> on pcib18
igb8: <Intel(R) PRO/1000 Network Connection, Version - 2.5.3-k> port 0x6000-0x601f mem 0xd0700000-0xd071ffff,0xd0720000-0xd0723fff irq 18 at device 0.0 on pci15
igb8: Using MSIX interrupts with 3 vectors
igb8: Ethernet address: 00:30:18:cd:ec:62
igb8: Bound queue 0 to cpu 0
igb8: Bound queue 1 to cpu 1
igb8: netmap queues/slots: TX 2/1024, RX 2/1024
pcib19: <PCI-PCI bridge> irq 19 at device 4.0 on pci12
pcib19: [GIANT-LOCKED]
pci16: <PCI bus> on pcib19
igb9: <Intel(R) PRO/1000 Network Connection, Version - 2.5.3-k> port 0x5000-0x501f mem 0xd0600000-0xd061ffff,0xd0620000-0xd0623fff irq 19 at device 0.0 on pci16
igb9: Using MSIX interrupts with 3 vectors
igb9: Ethernet address: 00:30:18:cd:ec:63
igb9: Bound queue 0 to cpu 2
igb9: Bound queue 1 to cpu 3
igb9: netmap queues/slots: TX 2/1024, RX 2/1024
...
Anyway, I will stay with this setup for the next few days and watch whether the problem shows up again. I'll keep you updated.
-
Fingers crossed for 11.1. :)
-
Hi all, I have the same issue.
My test server is a Lenovo 3000 J Series
Pentium 4 HT 3.2 GHz, 1GB RAM, IDE HDD
OPNsense 17.7.3 updated (2017-SEP-25)
Configured my WAN and LAN, and after a couple of minutes WAN is DOWN.
WAN and LAN are both Realtek 8169 PCI GBE Family Controllers cards
Installed Ubuntu Server 16.04 to test my hardware and everything is OK.
Just OPNsense is having this issue,
thanks
-
I did this
root@opnsense:~ # cat /boot/loader.conf.local
dev.re.0.eee_disabled=1
dev.re.1.eee_disabled=1
hw.pci.do_power_suspend=0
dev.re.0.fe=0
dev.re.1.fe=0
root@opnsense:~ # cat /etc/sysctl.conf
# $FreeBSD$
#
# This file is read when going to multi-user and its contents piped thru
# ``sysctl'' to adjust kernel values. ``man 5 sysctl.conf'' for details.
#
# Uncomment this to prevent users from seeing information about processes that
# are being run under another UID.
#security.bsd.see_other_uids=0
dev.re.0.eee_disabled=1
dev.re.1.eee_disabled=1
hw.pci.do_power_suspend=0
dev.re.0.fe=0
dev.re.1.fe=0
root@opnsense:~ #
WAN is down after approx. 20 minutes.
-
Hi grolon,
This topic is about the Intel (igb) driver, not Realtek (re). You won't be able to do a lot with the workarounds described here with a different driver. People sporadically post here to say they have issues with Realtek chipsets and, unfortunately, the state in BSD is not as good as it could be, which usually ends in solving the issue by migrating to better network cards.
Cheers,
Franco
-
Hi and thanks,
Too bad,
I was planning to move from Zentyal 4/5 (Ubuntu Server 14) to pfSense or OPNsense; my hardware is 100% OK, and Realtek NICs are very popular over here.
Thanks anyway folks,
-
OT: We do have the official Realtek driver since 17.1 in contrast to FreeBSD and pfSense. Overall, it doesn't get much better than this, but that is still sub par compared to Linux.
Cheers,
Franco
-
I have done a lot of testing with different configurations and it seems that IPS needs a higher-end processor to function without errors. The lower-end Atom processors do not cut it and show high RTT on a gateway when IPS is using the WAN/gateway interface. Just my observation. There could be other factors involved.
I would like to hear from other users as to what hardware is working with IPS enabled on the WAN Interface. I really want to pin down if this is a performance issue or not.
-
Regarding the igb issues: I have no solution yet, but got some hints from different people. I'll just summarize them here (I have not tested them yet):
- Reports and some resolution hints for problems with igb on pfSense 2.4 (which is based on FreeBSD 11.1): https://www.administrator.de/wissen/pfsense-2-4-1-port-flapping-packet-loss-354821.html - mentioning "hw.igb.num_queues=1"
- Problems with igb on FreeBSD 11.0: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=212413 (The crash seems to be related to a certain type and amount of packets. Removing ALTQ support from the kernel fixes the issue.) - Some more info on ALTQ can be found here: https://www.freebsdnews.com/2016/04/01/enable-altq-igb-driver-freebsd/
- https://redmine.pfsense.org/issues/7149 (also mentioning "hw.igb.num_queues=1")
- Commit on github/pfSense https://github.com/pfsense/FreeBSD-src/commit/215ddb035593bc4cee275b9dbbf8fc3a7579aee1 - but as I have now also received reports from another user who has seen the issues with pfSense 2.4 (FreeBSD 11) and pfSense 2.4.2 (FreeBSD 11.1p4), I don't think that this change in the driver source improved the situation for the described problem
I will do further testing within the next days and keep you updated.
-
After some further research, I'll try these settings over the coming days:
root@OPNsense:~ # sysctl hw.igb.num_queues
hw.igb.num_queues: 0
root@OPNsense:~ # sysctl hw.pci.enable_msix
hw.pci.enable_msix: 1
root@OPNsense:~ # sysctl hw.igb.enable_msix
hw.igb.enable_msix: 1
root@OPNsense:~ # echo "Settings for next boots, try to fix nic issues:"
Settings for next boots, try to fix nic issues:
root@OPNsense:~ # cat /boot/loader.conf.local
hw.igb.num_queues=1
hw.pci.enable_msix=0
hw.igb.enable_msix=0
As the issues arise only from time to time, it could again take some days until I can say whether or not these settings could help. I'll keep you updated.
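After the reboot it is worth confirming that the tunables from /boot/loader.conf.local were actually picked up. The Python sketch below compares the file contents with "sysctl" output. The function names are hypothetical, and it assumes that every tunable of interest is also readable via sysctl, which holds for the three settings above but not necessarily for every loader tunable.

```python
def parse_loader_conf(text):
    """Parse loader.conf-style "name=value" lines, ignoring comments."""
    tunables = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if "=" not in line:
            continue
        name, value = line.split("=", 1)
        tunables[name.strip()] = value.strip().strip('"')
    return tunables


def parse_sysctl_output(text):
    """Parse "name: value" lines as printed by sysctl."""
    values = {}
    for line in text.splitlines():
        name, sep, value = line.partition(": ")
        if sep:
            values[name.strip()] = value.strip()
    return values


def unapplied_tunables(loader_text, sysctl_text):
    """Return {name: (wanted, actual)} for tunables that differ.

    `actual` is None when sysctl did not report the name at all.
    """
    wanted = parse_loader_conf(loader_text)
    actual = parse_sysctl_output(sysctl_text)
    return {
        name: (value, actual.get(name))
        for name, value in wanted.items()
        if actual.get(name) != value
    }
```

An empty result means all settings from the file are active; with the pre-reboot values shown above, all three entries would be reported as not yet applied.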
-
I am also doing research based on different hardware configurations. My tests all have to do with how OPNsense handles IPS and netmap. I used Lan Speed Test Client and Server as my endpoints. With and without internet connectivity.
I discovered one key fact. Noise is a major contributor in my tests. I will post details later on, but the short of it is, if I place a cheap router between the ISP and the OPNsense box running in IPS mode, everything seems to operate flawlessly because the background chatter is gone and netmap is not stressed. I can see this in the traffic graphs. The cheap router is set to only pass traffic for the WAN IP. It is amazing how much other traffic is on the ISP modem port. This is why some people don't see any issues. It all depends on the traffic coming from the ISP and the type of interface. I use a /28 block of static IPs.
Here are the results, so far, of the WAN traffic usage.
On my production pfSense box, WAN traffic averages 2-4 Mb/s. Only about 100k/sec is legitimate traffic. On the OPNsense box, behind the small router, the usage averages under 500 b/s. Huge difference.
I set up this small router to route and not bridge. This allows only legitimate traffic to pass on to the OPNsense box. Then OPNsense can do its thing to properly firewall only legitimate traffic. But I am not so sure that high-bandwidth operations would work well with a cheap front-end router.
wefinet, try to set up a similar configuration just using a small throwaway router on the front end and see if this ends the problems. It did for me.
Here is how I set up the small router.
Turned off any firewall settings.
Put the static IP info into the WAN settings and set up the LAN for 192.168.0.1/24
Then used the router's IP (192.168.0.1) as the gateway in the OPNsense box and 192.168.0.100 as the WAN address. Then set up the LAN on OPNsense to any other subnet. Simple setup.
I suppose if the router supports DMZ, that would be another way to do it.
The other advantage is you can tap into the small router to get unfiltered LAN traffic for streaming or large downloads to reduce traffic to the OPNsense box. This would be nice for media streaming devices that do not need to be behind a firewall.
Let's hear some thoughts on this approach.
-
Ran tests on different systems. All had an SSD drive, 16GB memory, and used a Quad NIC Intel i340-T4.
Fresh install of OPNsense using the same configuration on all systems.
Promiscuous mode made no difference. All readings in Mbps. Isolated the internet with an external router. Used one Workstation outside the router and one inside. Used Lan Speed Test and LST Server on both workstations.
Each system was tested with IPS on and off.
Conclusion is IPS did slow down the firewall on the slower devices. But the i7-7700 had unexpected results, being much slower than expected. The weakest processor, C2758, had the poorest IPS results and the best IPS-off results, even though it has 8 cores but the slowest bus speed @ 2.4GHz.
The number of cores improved non-IPS bandwidth and higher bus speeds improved IPS performance. So I would conclude to choose at least 4 cores and the highest bus speed processor available.
-
Thank you dcol for your hints. According to your reports you see different speeds. I think this is a separate topic, as in my situation the interface is somehow completely down.
In my current tests I'm running OPNsense 17.7.8-amd64 using FreeBSD 11.0-RELEASE-p15. Unfortunately, limiting hw.igb.num_queues to 1 did not help. About 5 minutes after boot, I got the issues again. Only an ifconfig down/up helped:
root@OPNsense:~ # ping 10.1.102.1
PING 10.1.102.1 (10.1.102.1): 56 data bytes
ping: sendto: Host is down
ping: sendto: Host is down
ping: sendto: Host is down
ping: sendto: Host is down
^C
--- 10.1.102.1 ping statistics ---
4 packets transmitted, 0 packets received, 100.0% packet loss
root@OPNsense:~ # arp -a && date
? (10.1.102.1) at (incomplete) on igb9 expired [ethernet]
? (10.1.102.55) at 00:30:18:cd:ec:63 on igb9 permanent [ethernet]
? (192.168.1.1) at 00:30:18:cd:e8:54 on igb0 permanent [ethernet]
? (192.168.1.100) at f0:de:f1:f3:17:88 on igb0 expires in 1151 seconds [ethernet]
Tue Dec 5 09:25:35 UTC 2017
root@OPNsense:~ # ifconfig igb9 down && date
Tue Dec 5 09:26:08 UTC 2017
root@OPNsense:~ # ifconfig igb9 up && date
Tue Dec 5 09:26:18 UTC 2017
root@OPNsense:~ # arp -a && date
OPNsense.test.thomas-krenn.com (192.168.1.1) at 00:30:18:cd:e8:54 on igb0 permanent [ethernet]
? (192.168.1.100) at f0:de:f1:f3:17:88 on igb0 expires in 1168 seconds [ethernet]
Tue Dec 5 09:26:22 UTC 2017
root@OPNsense:~ # ping 10.1.102.1
PING 10.1.102.1 (10.1.102.1): 56 data bytes
64 bytes from 10.1.102.1: icmp_seq=0 ttl=64 time=0.467 ms
64 bytes from 10.1.102.1: icmp_seq=1 ttl=64 time=0.358 ms
64 bytes from 10.1.102.1: icmp_seq=2 ttl=64 time=0.276 ms
64 bytes from 10.1.102.1: icmp_seq=3 ttl=64 time=0.384 ms
^C
--- 10.1.102.1 ping statistics ---
4 packets transmitted, 4 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 0.276/0.371/0.467/0.068 ms
root@OPNsense:~ # arp -a && date
? (10.1.102.1) at 4c:5e:0c:4b:23:30 on igb9 expires in 1198 seconds [ethernet]
OPNsense.test.thomas-krenn.com (10.1.102.55) at 00:30:18:cd:ec:63 on igb9 permanent [ethernet]
OPNsense.test.thomas-krenn.com (192.168.1.1) at 00:30:18:cd:e8:54 on igb0 permanent [ethernet]
? (192.168.1.100) at f0:de:f1:f3:17:88 on igb0 expires in 1193 seconds [ethernet]
Tue Dec 5 09:26:34 UTC 2017
root@OPNsense:~ # freebsd-version -ku
11.0-RELEASE-p15
11.0-RELEASE-p15
root@OPNsense:~ # sysctl hw.igb.num_queues
hw.igb.num_queues: 1
root@OPNsense:~ # sysctl hw.pci.enable_msix
hw.pci.enable_msix: 0
root@OPNsense:~ # sysctl hw.igb.enable_msix
hw.igb.enable_msix: 0
root@OPNsense:~ #
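Until the root cause is found, the manual down/up cycle shown above could be automated as a stop-gap. The following Python sketch is a workaround under assumptions, not a fix: the function names are hypothetical, the ping count of 4 is an arbitrary choice, and the gateway address and interface name must be passed in by the caller (10.1.102.1 and igb9 on this system). It would have to run as root, for example from cron.

```python
import re
import subprocess

# Summary line printed by ping, e.g.
#   4 packets transmitted, 0 packets received, 100.0% packet loss
SUMMARY = re.compile(r"(\d+) packets transmitted, (\d+) packets received")


def packets_received(ping_output):
    """Return the received-packet count from ping's summary line, or 0."""
    match = SUMMARY.search(ping_output)
    return int(match.group(2)) if match else 0


def run_command(argv):
    """Run a command and return its combined stdout/stderr as text."""
    proc = subprocess.run(
        argv,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    return proc.stdout


def check_and_bounce(gateway, iface, run=run_command):
    """Ping the gateway; if nothing comes back, cycle the interface.

    `run` is injectable so the logic can be exercised without touching
    a real interface. Returns True if the interface was bounced.
    """
    output = run(["ping", "-c", "4", gateway])
    if packets_received(output) > 0:
        return False
    run(["ifconfig", iface, "down"])
    run(["ifconfig", iface, "up"])
    return True
```

A call such as check_and_bounce("10.1.102.1", "igb9") performs one check; logging the return value together with a timestamp would also give a record of how often the problem really occurs.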
Next, I'll switch to OPNsense 18.1 Beta as described by Franco here: https://forum.opnsense.org/index.php?topic=6257.0
I'll keep you updated.
-
I have kept hw.igb.num_queues limited to 1 and both hw.pci.enable_msix and hw.igb.enable_msix set to 0, and have updated to OPNsense 18.1 Beta (using FreeBSD 11.1).
It did not help. As soon as I started testing again, the problem occurred. Starting a speed test on fast.com on a client immediately triggered the problem. Only running "ifconfig igb9 down" and "ifconfig igb9 up" helped again:
root@OPNsense:~ # opnsense-update -bkgr 18.1.b -n "snapshots\/beta"
Fetching base-18.1.b-amd64.obsolete: ... done
Fetching base-18.1.b-amd64.txz: .........................................^C
root@OPNsense:~ # ifconfig igb9 down
root@OPNsense:~ # ifconfig igb9 up
root@OPNsense:~ # opnsense-update -bkgr 18.1.b -n "snapshots\/beta"
Fetching base-18.1.b-amd64.obsolete: ... done
Fetching base-18.1.b-amd64.txz: .......................... done
Fetching kernel-dbg-18.1.b-amd64.txz: ................................ done
!!!!!!!!!!!! ATTENTION !!!!!!!!!!!!!!!
! A critical upgrade is in progress. !
! Please do not turn off the system. !
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
Installing kernel-dbg-18.1.b-amd64.txz... done
Installing base-18.1.b-amd64.txz... done
Installing base-18.1.b-amd64.obsolete... done
Please reboot.
root@OPNsense:~ #
root@OPNsense:~ # /usr/local/etc/rc.reboot
>>> Invoking stop script 'beep'
>>> Invoking stop script 'freebsd'
>>> Invoking stop script 'backup'
Cannot 'stop' flowd_aggregate. Set flowd_aggregate_enable to YES in /etc/rc.conf or use 'onestop' instead of 'stop'.
Shutdown NOW!
shutdown: [pid 63573]
root@OPNsense:~ #
*** FINAL System shutdown message from root@OPNsense.test.thomas-krenn.com ***
System going down IMMEDIATELY
System shutdown time has arrived
Connection to 192.168.1.1 closed by remote host.
Connection to 192.168.1.1 closed.
wfischer@tpw:~$ ssh root@192.168.1.1
Password for root@OPNsense.test.thomas-krenn.com:
Last login: Tue Dec 5 09:24:52 2017 from 192.168.1.100
----------------------------------------------
| Hello, this is OPNsense 17.7 | @@@@@@@@@@@@@@@
| | @@@@ @@@@
| Website: https://opnsense.org/ | @@@\\\ ///@@@
| Handbook: https://docs.opnsense.org/ | )))))))) ((((((((
| Forums: https://forum.opnsense.org/ | @@@/// \\\@@@
| Lists: https://lists.opnsense.org/ | @@@@ @@@@
| Code: https://github.com/opnsense | @@@@@@@@@@@@@@@
----------------------------------------------
0) Logout 7) Ping host
1) Assign interfaces 8) Shell
2) Set interface IP address 9) pfTop
3) Reset the root password 10) Firewall log
4) Reset to factory defaults 11) Reload all services
5) Power off system 12) Upgrade from console
6) Reboot system 13) Restore a backup
Enter an option: 8
root@OPNsense:~ # freebsd-version -ku
11.1-RELEASE-p2
11.1-RELEASE-p2
root@OPNsense:~ # ping 10.1.102.1
PING 10.1.102.1 (10.1.102.1): 56 data bytes
^C
--- 10.1.102.1 ping statistics ---
8 packets transmitted, 0 packets received, 100.0% packet loss
root@OPNsense:~ # arp -a && date
? (10.1.102.1) at 4c:5e:0c:4b:23:30 on igb9 expires in 1168 seconds [ethernet]
OPNsense.test.thomas-krenn.com (10.1.102.55) at 00:30:18:cd:ec:63 on igb9 permanent [ethernet]
OPNsense.test.thomas-krenn.com (192.168.1.1) at 00:30:18:cd:e8:54 on igb0 permanent [ethernet]
? (192.168.1.100) at f0:de:f1:f3:17:88 on igb0 expires in 1152 seconds [ethernet]
Tue Dec 5 10:23:35 UTC 2017
root@OPNsense:~ #
root@OPNsense:~ #
root@OPNsense:~ # clog /var/log/system.log
[...]
Dec 5 10:22:13 OPNsense kernel: aesni0: No AESNI support.
Dec 5 10:22:13 OPNsense kernel:
Dec 5 10:22:13 OPNsense kernel: igb9: link state changed to DOWN
Dec 5 10:22:13 OPNsense sshlockout[15498]: sshlockout/webConfigurator v3.0 starting up
Dec 5 10:22:13 OPNsense configd.py: [499c4346-ad71-4e2b-9e64-ffce20ce3d3c] Linkup stopping igb9
Dec 5 10:22:18 OPNsense kernel: igb9: link state changed to UP
Dec 5 10:22:18 OPNsense configd.py: [6df83b34-21a5-4d42-a53b-f492e8b7193b] Linkup starting igb9
Dec 5 10:22:18 OPNsense opnsense: /usr/local/etc/rc.bootup: Accept router advertisements on interface igb9
Dec 5 10:22:18 OPNsense kernel: igb0: link state changed to DOWN
Dec 5 10:22:18 OPNsense configd.py: [c7a56dbc-d156-4b0a-9022-97f35d436b47] Linkup stopping igb0
Dec 5 10:22:19 OPNsense kernel: pflog0: promiscuous mode enabled
Dec 5 10:22:19 OPNsense kernel: .done.
Dec 5 10:22:19 OPNsense sshd[40530]: Server listening on :: port 22.
Dec 5 10:22:19 OPNsense sshd[40530]: Server listening on 0.0.0.0 port 22.
Dec 5 10:22:19 OPNsense configd.py: [2bb1592c-4e7e-4285-a882-2a110317d983] generate template OPNsense/WebGui
Dec 5 10:22:19 OPNsense kernel: done.
Dec 5 10:22:19 OPNsense configd.py: generate template container OPNsense/WebGui
Dec 5 10:22:19 OPNsense lighttpd[41414]: (log.c.217) server started
Dec 5 10:22:20 OPNsense opnsense: /usr/local/etc/rc.bootup: ROUTING: setting IPv4 default route to 10.1.102.1
Dec 5 10:22:20 OPNsense kernel: done.
Dec 5 10:22:20 OPNsense kernel: done.
Dec 5 10:22:20 OPNsense kernel: done.
Dec 5 10:22:21 OPNsense kernel: done.
Dec 5 10:22:21 OPNsense kernel: done.
Dec 5 10:22:21 OPNsense configd.py: [4ac9229b-5738-4acf-8e67-ba3af24f9232] generate template *
Dec 5 10:22:21 OPNsense kernel: ....done.
Dec 5 10:22:22 OPNsense configd.py: generate template container OPNsense/Auth
Dec 5 10:22:22 OPNsense configd.py: generate template container OPNsense/Captiveportal
Dec 5 10:22:22 OPNsense configd.py: generate template container OPNsense/Cron
Dec 5 10:22:22 OPNsense configd.py: generate template container OPNsense/IDS
Dec 5 10:22:23 OPNsense configd.py: generate template container OPNsense/IPFW
Dec 5 10:22:23 OPNsense kernel: igb0: link state changed to UP
Dec 5 10:22:23 OPNsense configd.py: [2a3a9380-edb6-4a8a-9940-b38c2068244a] Linkup starting igb0
Dec 5 10:22:23 OPNsense configd.py: generate template container OPNsense/Macros
Dec 5 10:22:23 OPNsense configd.py: generate template container OPNsense/Netflow
Dec 5 10:22:24 OPNsense configd.py: generate template container OPNsense/Proxy
Dec 5 10:22:25 OPNsense configd.py: generate template container OPNsense/Sample
Dec 5 10:22:25 OPNsense configd.py: generate template container OPNsense/Sample/sub1
Dec 5 10:22:25 OPNsense configd.py: generate template container OPNsense/Sample/sub2
Dec 5 10:22:25 OPNsense configd.py: generate template container OPNsense/Syslog
Dec 5 10:22:25 OPNsense configd.py: generate template container OPNsense/WebGui
Dec 5 10:22:27 OPNsense configd.py: [58e260da-2e89-4290-a9a1-e985c024ff15] generate template OPNsense/Syslog
Dec 5 10:22:27 OPNsense kernel: done.
Dec 5 10:22:28 OPNsense configd.py: generate template container OPNsense/Syslog
Dec 5 10:22:28 OPNsense kernel: done.
Dec 5 10:22:31 OPNsense configd.py: [831530b7-a519-4d60-b14e-2d35f351ad66] restarting cron
Dec 5 10:22:31 OPNsense sshlockout[15018]: sshlockout/webConfigurator v3.0 starting up
Dec 5 10:22:31 OPNsense kernel: OK
Dec 5 10:22:33 OPNsense kernel:
Dec 5 10:22:54 OPNsense sshd[27160]: Postponed keyboard-interactive for root from 192.168.1.100 port 52728 ssh2 [preauth]
Dec 5 10:22:57 OPNsense opnsense: user 'root' authenticated successfully
Dec 5 10:22:57 OPNsense sshd[27160]: Postponed keyboard-interactive/pam for root from 192.168.1.100 port 52728 ssh2 [preauth]
Dec 5 10:22:57 OPNsense sshd[27160]: Accepted keyboard-interactive/pam for root from 192.168.1.100 port 52728 ssh2
root@OPNsense:~ # ping 10.1.102.1
PING 10.1.102.1 (10.1.102.1): 56 data bytes
^C
--- 10.1.102.1 ping statistics ---
3 packets transmitted, 0 packets received, 100.0% packet loss
root@OPNsense:~ # sysctl hw.igb.num_queues
hw.igb.num_queues: 1
root@OPNsense:~ # sysctl hw.pci.enable_msix
hw.pci.enable_msix: 0
root@OPNsense:~ # sysctl hw.igb.enable_msix
hw.igb.enable_msix: 0
root@OPNsense:~ # cat /boot/loader.conf.local
hw.igb.num_queues=1
hw.pci.enable_msix=0
hw.igb.enable_msix=0
root@OPNsense:~ # rm /boot/loader.conf.local
root@OPNsense:~ # ping 10.1.102.1
PING 10.1.102.1 (10.1.102.1): 56 data bytes
^C
--- 10.1.102.1 ping statistics ---
2 packets transmitted, 0 packets received, 100.0% packet loss
root@OPNsense:~ # ifconfig down igb9
ifconfig: interface down does not exist
root@OPNsense:~ # ifconfig igb9 down
root@OPNsense:~ # ifconfig igb9 up
root@OPNsense:~ # ping 10.1.102.1
PING 10.1.102.1 (10.1.102.1): 56 data bytes
ping: sendto: No route to host
ping: sendto: No route to host
64 bytes from 10.1.102.1: icmp_seq=3 ttl=64 time=0.354 ms
64 bytes from 10.1.102.1: icmp_seq=4 ttl=64 time=0.279 ms
64 bytes from 10.1.102.1: icmp_seq=5 ttl=64 time=26.825 ms
64 bytes from 10.1.102.1: icmp_seq=6 ttl=64 time=16.797 ms
^C
--- 10.1.102.1 ping statistics ---
7 packets transmitted, 4 packets received, 42.9% packet loss
round-trip min/avg/max/stddev = 0.279/11.064/26.825/11.317 ms
root@OPNsense:~ # ping 10.1.102.1
PING 10.1.102.1 (10.1.102.1): 56 data bytes
^C
--- 10.1.102.1 ping statistics ---
3 packets transmitted, 0 packets received, 100.0% packet loss
root@OPNsense:~ # ping 10.1.102.1
PING 10.1.102.1 (10.1.102.1): 56 data bytes
^C
--- 10.1.102.1 ping statistics ---
3 packets transmitted, 0 packets received, 100.0% packet loss
root@OPNsense:~ # ifconfig igb9 down
root@OPNsense:~ # ifconfig igb9 up
root@OPNsense:~ # ping 10.1.102.1
PING 10.1.102.1 (10.1.102.1): 56 data bytes
ping: sendto: No route to host
ping: sendto: No route to host
ping: sendto: No route to host
ping: sendto: No route to host
64 bytes from 10.1.102.1: icmp_seq=4 ttl=64 time=0.333 ms
64 bytes from 10.1.102.1: icmp_seq=5 ttl=64 time=0.263 ms
64 bytes from 10.1.102.1: icmp_seq=6 ttl=64 time=0.342 ms
64 bytes from 10.1.102.1: icmp_seq=7 ttl=64 time=0.284 ms
64 bytes from 10.1.102.1: icmp_seq=8 ttl=64 time=0.285 ms
64 bytes from 10.1.102.1: icmp_seq=9 ttl=64 time=0.299 ms
64 bytes from 10.1.102.1: icmp_seq=10 ttl=64 time=0.314 ms
64 bytes from 10.1.102.1: icmp_seq=11 ttl=64 time=0.364 ms
^C
--- 10.1.102.1 ping statistics ---
26 packets transmitted, 8 packets received, 69.2% packet loss
round-trip min/avg/max/stddev = 0.263/0.310/0.364/0.032 ms
root@OPNsense:~ # ping 10.1.102.1
PING 10.1.102.1 (10.1.102.1): 56 data bytes
^C
--- 10.1.102.1 ping statistics ---
2 packets transmitted, 0 packets received, 100.0% packet loss
root@OPNsense:~ # ifconfig igb9 down
root@OPNsense:~ # ifconfig igb9 up
root@OPNsense:~ # ping 10.1.102.1
PING 10.1.102.1 (10.1.102.1): 56 data bytes
ping: sendto: No route to host
ping: sendto: No route to host
ping: sendto: No route to host
64 bytes from 10.1.102.1: icmp_seq=3 ttl=64 time=0.415 ms
64 bytes from 10.1.102.1: icmp_seq=4 ttl=64 time=0.258 ms
64 bytes from 10.1.102.1: icmp_seq=5 ttl=64 time=0.258 ms
^C
--- 10.1.102.1 ping statistics ---
6 packets transmitted, 3 packets received, 50.0% packet loss
round-trip min/avg/max/stddev = 0.258/0.310/0.415/0.074 ms
root@OPNsense:~ # ping 10.1.102.1
PING 10.1.102.1 (10.1.102.1): 56 data bytes
^C
--- 10.1.102.1 ping statistics ---
2 packets transmitted, 0 packets received, 100.0% packet loss
root@OPNsense:~ # ifconfig igb9 down
root@OPNsense:~ # ifconfig igb9 up
root@OPNsense:~ # ping 10.1.102.1
PING 10.1.102.1 (10.1.102.1): 56 data bytes
ping: sendto: No route to host
ping: sendto: No route to host
ping: sendto: No route to host
64 bytes from 10.1.102.1: icmp_seq=3 ttl=64 time=0.443 ms
64 bytes from 10.1.102.1: icmp_seq=4 ttl=64 time=0.255 ms
64 bytes from 10.1.102.1: icmp_seq=5 ttl=64 time=0.341 ms
64 bytes from 10.1.102.1: icmp_seq=6 ttl=64 time=0.290 ms
64 bytes from 10.1.102.1: icmp_seq=7 ttl=64 time=0.288 ms
64 bytes from 10.1.102.1: icmp_seq=8 ttl=64 time=0.350 ms
64 bytes from 10.1.102.1: icmp_seq=9 ttl=64 time=0.318 ms
64 bytes from 10.1.102.1: icmp_seq=10 ttl=64 time=0.376 ms
64 bytes from 10.1.102.1: icmp_seq=11 ttl=64 time=0.301 ms
64 bytes from 10.1.102.1: icmp_seq=12 ttl=64 time=0.324 ms
64 bytes from 10.1.102.1: icmp_seq=13 ttl=64 time=0.287 ms
64 bytes from 10.1.102.1: icmp_seq=14 ttl=64 time=0.285 ms
64 bytes from 10.1.102.1: icmp_seq=15 ttl=64 time=0.279 ms
64 bytes from 10.1.102.1: icmp_seq=16 ttl=64 time=0.326 ms
64 bytes from 10.1.102.1: icmp_seq=17 ttl=64 time=0.267 ms
64 bytes from 10.1.102.1: icmp_seq=18 ttl=64 time=0.474 ms
64 bytes from 10.1.102.1: icmp_seq=19 ttl=64 time=0.264 ms
64 bytes from 10.1.102.1: icmp_seq=20 ttl=64 time=0.234 ms
64 bytes from 10.1.102.1: icmp_seq=21 ttl=64 time=0.339 ms
64 bytes from 10.1.102.1: icmp_seq=22 ttl=64 time=0.369 ms
64 bytes from 10.1.102.1: icmp_seq=23 ttl=64 time=0.476 ms
64 bytes from 10.1.102.1: icmp_seq=24 ttl=64 time=0.293 ms
64 bytes from 10.1.102.1: icmp_seq=25 ttl=64 time=0.413 ms
64 bytes from 10.1.102.1: icmp_seq=26 ttl=64 time=0.429 ms
64 bytes from 10.1.102.1: icmp_seq=27 ttl=64 time=0.345 ms
64 bytes from 10.1.102.1: icmp_seq=28 ttl=64 time=0.411 ms
64 bytes from 10.1.102.1: icmp_seq=29 ttl=64 time=0.292 ms
64 bytes from 10.1.102.1: icmp_seq=30 ttl=64 time=0.268 ms
64 bytes from 10.1.102.1: icmp_seq=31 ttl=64 time=0.237 ms
64 bytes from 10.1.102.1: icmp_seq=32 ttl=64 time=0.281 ms
64 bytes from 10.1.102.1: icmp_seq=33 ttl=64 time=0.385 ms
64 bytes from 10.1.102.1: icmp_seq=34 ttl=64 time=0.371 ms
64 bytes from 10.1.102.1: icmp_seq=35 ttl=64 time=0.332 ms
64 bytes from 10.1.102.1: icmp_seq=36 ttl=64 time=0.343 ms
64 bytes from 10.1.102.1: icmp_seq=37 ttl=64 time=0.314 ms
64 bytes from 10.1.102.1: icmp_seq=38 ttl=64 time=0.329 ms
64 bytes from 10.1.102.1: icmp_seq=39 ttl=64 time=0.712 ms
64 bytes from 10.1.102.1: icmp_seq=40 ttl=64 time=0.340 ms
64 bytes from 10.1.102.1: icmp_seq=41 ttl=64 time=0.328 ms
64 bytes from 10.1.102.1: icmp_seq=42 ttl=64 time=0.387 ms
64 bytes from 10.1.102.1: icmp_seq=43 ttl=64 time=0.252 ms
64 bytes from 10.1.102.1: icmp_seq=44 ttl=64 time=0.343 ms
64 bytes from 10.1.102.1: icmp_seq=45 ttl=64 time=0.368 ms
64 bytes from 10.1.102.1: icmp_seq=46 ttl=64 time=0.245 ms
64 bytes from 10.1.102.1: icmp_seq=47 ttl=64 time=0.466 ms
64 bytes from 10.1.102.1: icmp_seq=48 ttl=64 time=0.414 ms
64 bytes from 10.1.102.1: icmp_seq=49 ttl=64 time=0.302 ms
64 bytes from 10.1.102.1: icmp_seq=50 ttl=64 time=0.464 ms
64 bytes from 10.1.102.1: icmp_seq=51 ttl=64 time=0.262 ms
64 bytes from 10.1.102.1: icmp_seq=52 ttl=64 time=0.524 ms
64 bytes from 10.1.102.1: icmp_seq=53 ttl=64 time=0.421 ms
64 bytes from 10.1.102.1: icmp_seq=54 ttl=64 time=0.226 ms
^C
--- 10.1.102.1 ping statistics ---
55 packets transmitted, 52 packets received, 5.5% packet loss
round-trip min/avg/max/stddev = 0.226/0.346/0.712/0.087 ms
root@OPNsense:~ # ping 10.1.102.1
PING 10.1.102.1 (10.1.102.1): 56 data bytes
64 bytes from 10.1.102.1: icmp_seq=0 ttl=64 time=0.325 ms
64 bytes from 10.1.102.1: icmp_seq=1 ttl=64 time=0.356 ms
^C
--- 10.1.102.1 ping statistics ---
2 packets transmitted, 2 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 0.325/0.341/0.356/0.015 ms
root@OPNsense:~ #
After that, I deleted /boot/loader.conf.local (to get the default values after the next boot), powered off the OPNsense system and powered it on again. Now, when I start a speed test on fast.com on a client, the only effect I see on the OPNsense system is that the ping times increase; when the speed test is finished, the ping times go down again:
root@OPNsense:~ # ping 10.1.102.1
PING 10.1.102.1 (10.1.102.1): 56 data bytes
64 bytes from 10.1.102.1: icmp_seq=0 ttl=64 time=0.422 ms
64 bytes from 10.1.102.1: icmp_seq=1 ttl=64 time=0.319 ms
64 bytes from 10.1.102.1: icmp_seq=2 ttl=64 time=0.471 ms
64 bytes from 10.1.102.1: icmp_seq=3 ttl=64 time=23.658 ms
64 bytes from 10.1.102.1: icmp_seq=4 ttl=64 time=32.818 ms
64 bytes from 10.1.102.1: icmp_seq=5 ttl=64 time=31.154 ms
64 bytes from 10.1.102.1: icmp_seq=6 ttl=64 time=27.961 ms
64 bytes from 10.1.102.1: icmp_seq=7 ttl=64 time=18.703 ms
64 bytes from 10.1.102.1: icmp_seq=8 ttl=64 time=31.381 ms
64 bytes from 10.1.102.1: icmp_seq=9 ttl=64 time=33.733 ms
64 bytes from 10.1.102.1: icmp_seq=10 ttl=64 time=0.243 ms
64 bytes from 10.1.102.1: icmp_seq=11 ttl=64 time=0.357 ms
64 bytes from 10.1.102.1: icmp_seq=12 ttl=64 time=0.290 ms
64 bytes from 10.1.102.1: icmp_seq=13 ttl=64 time=0.246 ms
^C
--- 10.1.102.1 ping statistics ---
14 packets transmitted, 14 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 0.243/14.411/33.733/14.528 ms
root@OPNsense:~ #
I'll continue with some more tests with a HBJC385F551-63U-B - see http://www.jetwaycomputer.com/JBC385F551.html - and check the 4 i350, the one i211 and the one i219. As the current system has 10x i211 I'm curious how things will run on this other system on the i211 NIC.
I'll keep you updated.
-
I now have the HBJC385F551-63U-B up and running (it comes with an Intel Core i5-6300U CPU).
I'm using the following NICs:
- igb0 as LAN (the first port of the quad-port i350 NIC chip of the system)
- igb4 as WAN (the I211 NIC of the system)
- I do not use igb1/2/3 (the other three ports of the i350 NIC chip) and I do not use em0 (the i219 NIC chip of the system)
Currently I'm running the default OPNsense 17.7.8-amd64 with FreeBSD 11.0-RELEASE-p15. No issues so far. I'll keep you updated.
-
Hi Franco & Team,
as it has now turned out, the NIC issue was indeed somehow caused by the power management function of the I211.
Turning EEE off via the driver (as outlined in https://www.thomas-krenn.com/de/wiki/OPNsense_igb_EEE_Funktion_deaktivieren) did not help.
We have now received a BIOS update for the system, in which the power management of the LAN ports has been switched off via firmware. Since then, we have not detected any further problems.
We will do QA testing of the new BIOS/UEFI firmware and, once the tests are finished, provide the firmware in the Downloads section of our site: https://www.thomas-krenn.com/de/download.html?product=15417
Thank you all for your help.
PS: In case you are reading this because you are experiencing issues with FreeBSD 11.0/11.1 based systems with embedded I211 NICs, check with your hardware/firmware vendor and ask for a firmware which has the power management functions deactivated ;)
Best regards,
Werner
-
You could also use Intel's BootUtil to disable power management on the NIC, using the following command:
BootUtil -WOLD
It works with all Intel NICs.
Get the tool here:
https://downloadcenter.intel.com/downloads/eula/19186/Intel-Ethernet-Connections-Boot-Utility-Preboot-Images-and-EFI-Drivers?httpDown=https%3A%2F%2Fdownloadmirror.intel.com%2F19186%2Feng%2FPREBOOT.EXE
Intel's webpage
https://downloadcenter.intel.com/download/19186/Intel-Ethernet-Connections-Boot-Utility-Preboot-Images-and-EFI-Drivers
-
Thank you for the hint. I have downloaded the tool (although I'm not sure whether the tool should be used with I211-AT chips, as the Intel download site does not list the I211-AT as a valid product for this download). In the doc file bootutil.txt I have found this hint regarding -WOLD:
POWER MANAGEMENT OPTIONS:
-WOLENABLE or -WOLE
Enables Wake On LAN (WOL) functionality on the selected port.
-WOLDISABLE or -WOLD
Disables Wake On LAN (WOL) functionality on the selected port.
The I211 data sheet - see https://www.intel.com/content/dam/www/public/us/en/documents/datasheets/i211-ethernet-controller-datasheet.pdf?asset=9567 - lists 10 different power management features in Table 1-9.
I'm not sure how -WOLD really affects those 10 different power management features. So if you as a user are experiencing link down issues and are not sure how to fix them, ask your hardware vendor whether there is a firmware which has the power management deactivated.
-
Unfortunately, the problem has now come up once again :(
With the new BIOS running, I changed the setting "System State after Power Failure" from "Always Off" to "Always On". I then saved and exited (using the F4 key) and booted OPNsense. After a while, I unplugged the power cable, so the system was off. I plugged in power again, and I noticed during bootup that fsck was run. After the system had been running for a few minutes, the network problem was there again:
wfischer@tpw:~$ ssh root@192.168.1.1
Password for root@OPNsense.test.thomas-krenn.com:
Last login: Thu Dec 21 08:43:11 2017 from 192.168.1.100
----------------------------------------------
| Hello, this is OPNsense 17.7 | @@@@@@@@@@@@@@@
| | @@@@ @@@@
| Website: https://opnsense.org/ | @@@\\\ ///@@@
| Handbook: https://docs.opnsense.org/ | )))))))) ((((((((
| Forums: https://forum.opnsense.org/ | @@@/// \\\@@@
| Lists: https://lists.opnsense.org/ | @@@@ @@@@
| Code: https://github.com/opnsense | @@@@@@@@@@@@@@@
----------------------------------------------
0) Logout 7) Ping host
1) Assign interfaces 8) Shell
2) Set interface IP address 9) pfTop
3) Reset the root password 10) Firewall log
4) Reset to factory defaults 11) Reload all services
5) Power off system 12) Upgrade from console
6) Reboot system 13) Restore a backup
Enter an option: 8
root@OPNsense:~ # ifconfig
igb0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
options=4400b8<VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,VLAN_HWTSO,TXCSUM_IPV6>
ether 00:30:18:cd:e8:54
inet 192.168.1.1 netmask 0xffffff00 broadcast 192.168.1.255
inet6 fe80::1:1%igb0 prefixlen 64 scopeid 0x1
nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
media: Ethernet autoselect (1000baseT <full-duplex>)
status: active
igb1: flags=8c02<BROADCAST,OACTIVE,SIMPLEX,MULTICAST> metric 0 mtu 1500
options=6403bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,VLAN_HWTSO,RXCSUM_IPV6,TXCSUM_IPV6>
ether 00:30:18:cd:e8:55
nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
media: Ethernet autoselect
status: no carrier
igb2: flags=8c02<BROADCAST,OACTIVE,SIMPLEX,MULTICAST> metric 0 mtu 1500
options=6403bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,VLAN_HWTSO,RXCSUM_IPV6,TXCSUM_IPV6>
ether 00:30:18:cd:ef:80
nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
media: Ethernet autoselect
status: no carrier
igb3: flags=8c02<BROADCAST,OACTIVE,SIMPLEX,MULTICAST> metric 0 mtu 1500
options=6403bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,VLAN_HWTSO,RXCSUM_IPV6,TXCSUM_IPV6>
ether 00:30:18:cd:ef:81
nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
media: Ethernet autoselect
status: no carrier
igb4: flags=8c02<BROADCAST,OACTIVE,SIMPLEX,MULTICAST> metric 0 mtu 1500
options=6403bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,VLAN_HWTSO,RXCSUM_IPV6,TXCSUM_IPV6>
ether 00:30:18:cd:ef:82
nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
media: Ethernet autoselect
status: no carrier
igb5: flags=8c02<BROADCAST,OACTIVE,SIMPLEX,MULTICAST> metric 0 mtu 1500
options=6403bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,VLAN_HWTSO,RXCSUM_IPV6,TXCSUM_IPV6>
ether 00:30:18:cd:ef:83
nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
media: Ethernet autoselect
status: no carrier
igb6: flags=8c02<BROADCAST,OACTIVE,SIMPLEX,MULTICAST> metric 0 mtu 1500
options=6403bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,VLAN_HWTSO,RXCSUM_IPV6,TXCSUM_IPV6>
ether 00:30:18:cd:ec:60
nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
media: Ethernet autoselect
status: no carrier
igb7: flags=8c02<BROADCAST,OACTIVE,SIMPLEX,MULTICAST> metric 0 mtu 1500
options=6403bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,VLAN_HWTSO,RXCSUM_IPV6,TXCSUM_IPV6>
ether 00:30:18:cd:ec:61
nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
media: Ethernet autoselect
status: no carrier
igb8: flags=8c02<BROADCAST,OACTIVE,SIMPLEX,MULTICAST> metric 0 mtu 1500
options=6403bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,VLAN_HWTSO,RXCSUM_IPV6,TXCSUM_IPV6>
ether 00:30:18:cd:ec:62
nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
media: Ethernet autoselect
status: no carrier
igb9: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
options=4400b8<VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,VLAN_HWTSO,TXCSUM_IPV6>
ether 00:30:18:cd:ec:63
inet6 fe80::230:18ff:fecd:ec63%igb9 prefixlen 64 scopeid 0xa
inet 10.1.102.55 netmask 0xffffff00 broadcast 10.1.102.255
nd6 options=23<PERFORMNUD,ACCEPT_RTADV,AUTO_LINKLOCAL>
media: Ethernet autoselect (1000baseT <full-duplex>)
status: active
lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
options=600003<RXCSUM,TXCSUM,RXCSUM_IPV6,TXCSUM_IPV6>
inet6 ::1 prefixlen 128
inet6 fe80::1%lo0 prefixlen 64 scopeid 0xb
inet 127.0.0.1 netmask 0xff000000
nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
groups: lo
enc0: flags=0<> metric 0 mtu 1536
nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
groups: enc
pflog0: flags=100<PROMISC> metric 0 mtu 33160
groups: pflog
pfsync0: flags=0<> metric 0 mtu 1500
groups: pfsync
syncpeer: 0.0.0.0 maxupd: 128 defer: off
root@OPNsense:~ # arp -a
? (10.1.102.1) at 4c:5e:0c:4b:23:30 on igb9 expires in 1011 seconds [ethernet]
OPNsense.test.thomas-krenn.com (10.1.102.55) at 00:30:18:cd:ec:63 on igb9 permanent [ethernet]
OPNsense.test.thomas-krenn.com (192.168.1.1) at 00:30:18:cd:e8:54 on igb0 permanent [ethernet]
? (192.168.1.100) at f0:de:f1:f3:17:88 on igb0 expires in 1088 seconds [ethernet]
root@OPNsense:~ # ping 10.1.102.1
PING 10.1.102.1 (10.1.102.1): 56 data bytes
^C
--- 10.1.102.1 ping statistics ---
10 packets transmitted, 0 packets received, 100.0% packet loss
root@OPNsense:~ # arp -a
? (10.1.102.1) at 4c:5e:0c:4b:23:30 on igb9 expires in 957 seconds [ethernet]
OPNsense.test.thomas-krenn.com (10.1.102.55) at 00:30:18:cd:ec:63 on igb9 permanent [ethernet]
OPNsense.test.thomas-krenn.com (192.168.1.1) at 00:30:18:cd:e8:54 on igb0 permanent [ethernet]
? (192.168.1.100) at f0:de:f1:f3:17:88 on igb0 expires in 1034 seconds [ethernet]
root@OPNsense:~ # date
Thu Dec 21 13:19:23 UTC 2017
root@OPNsense:~ # freebsd-version -ku
11.0-RELEASE-p17
11.0-RELEASE-p17
root@OPNsense:~ # sysctl hw.igb.num_queues
hw.igb.num_queues: 0
root@OPNsense:~ # sysctl hw.pci.enable_msix
hw.pci.enable_msix: 1
root@OPNsense:~ # sysctl hw.igb.enable_msix
hw.igb.enable_msix: 1
root@OPNsense:~ # ping 10.1.102.1
PING 10.1.102.1 (10.1.102.1): 56 data bytes
^C
--- 10.1.102.1 ping statistics ---
3 packets transmitted, 0 packets received, 100.0% packet loss
root@OPNsense:~ # arp -a
? (10.1.102.1) at 4c:5e:0c:4b:23:30 on igb9 expires in 781 seconds [ethernet]
OPNsense.test.thomas-krenn.com (10.1.102.55) at 00:30:18:cd:ec:63 on igb9 permanent [ethernet]
OPNsense.test.thomas-krenn.com (192.168.1.1) at 00:30:18:cd:e8:54 on igb0 permanent [ethernet]
? (192.168.1.100) at f0:de:f1:f3:17:88 on igb0 expires in 858 seconds [ethernet]
root@OPNsense:~ # arp -a && date
? (10.1.102.1) at 4c:5e:0c:4b:23:30 on igb9 expires in 695 seconds [ethernet]
OPNsense.test.thomas-krenn.com (10.1.102.55) at 00:30:18:cd:ec:63 on igb9 permanent [ethernet]
OPNsense.test.thomas-krenn.com (192.168.1.1) at 00:30:18:cd:e8:54 on igb0 permanent [ethernet]
? (192.168.1.100) at f0:de:f1:f3:17:88 on igb0 expires in 772 seconds [ethernet]
Thu Dec 21 13:23:42 UTC 2017
root@OPNsense:~ # ping 10.1.102.1
PING 10.1.102.1 (10.1.102.1): 56 data bytes
^C
--- 10.1.102.1 ping statistics ---
7 packets transmitted, 0 packets received, 100.0% packet loss
root@OPNsense:~ # ifconfig igb9 down
root@OPNsense:~ # ifconfig igb9 up
root@OPNsense:~ # ping 10.1.102.1
PING 10.1.102.1 (10.1.102.1): 56 data bytes
ping: sendto: No route to host
ping: sendto: No route to host
64 bytes from 10.1.102.1: icmp_seq=2 ttl=64 time=0.316 ms
64 bytes from 10.1.102.1: icmp_seq=3 ttl=64 time=0.298 ms
64 bytes from 10.1.102.1: icmp_seq=4 ttl=64 time=0.377 ms
64 bytes from 10.1.102.1: icmp_seq=5 ttl=64 time=0.294 ms
^C
--- 10.1.102.1 ping statistics ---
6 packets transmitted, 4 packets received, 33.3% packet loss
round-trip min/avg/max/stddev = 0.294/0.321/0.377/0.033 ms
root@OPNsense:~ # arp -a && date
? (10.1.102.1) at 4c:5e:0c:4b:23:30 on igb9 expires in 1198 seconds [ethernet]
OPNsense.test.thomas-krenn.com (10.1.102.55) at 00:30:18:cd:ec:63 on igb9 permanent [ethernet]
OPNsense.test.thomas-krenn.com (192.168.1.1) at 00:30:18:cd:e8:54 on igb0 permanent [ethernet]
? (192.168.1.100) at f0:de:f1:f3:17:88 on igb0 expires in 1198 seconds [ethernet]
Thu Dec 21 13:24:22 UTC 2017
root@OPNsense:~ #
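The manual recovery shown in the transcript above (ping to the gateway fails, then "ifconfig igb9 down" followed by "ifconfig igb9 up" restores the link) could be automated as a stopgap. This is only a minimal sketch, assuming the gateway 10.1.102.1 and the interface igb9 from this thread; the function names are hypothetical, and it only works around the symptom without fixing the cause:

```shell
#!/bin/sh
# Hypothetical watchdog sketch: bounce the WAN NIC when the gateway stops answering.

gateway_alive() {
    # Send three probes; ping exits with 0 if at least one reply was received.
    ping -c 3 "$1" >/dev/null 2>&1
}

bounce_if_dead() {
    # $1 = gateway address, $2 = interface name
    if gateway_alive "$1"; then
        echo "ok"
    else
        echo "no reply from $1, bouncing $2"
        # Same manual workaround as in the transcript.
        ifconfig "$2" down && ifconfig "$2" up
    fi
}

# Only act when called with both arguments, e.g.:
#   sh wan_watchdog.sh 10.1.102.1 igb9
if [ $# -ge 2 ]; then
    bounce_if_dead "$1" "$2"
fi
```

It would have to run as root, for example from a cron job every minute; the script name wan_watchdog.sh is an assumption.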
Another user of this system (I think he is using pfSense 2.4) switched EEE off via the driver, and at the same time he has set hw.igb.num_queues=1. Up until now, he did not see any issues. I will try this, too. I'll keep you updated.
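For reference, the two settings mentioned above might look roughly like this on the shell. This is a sketch under assumptions: the sysctl name dev.igb.9.eee_disabled is what I expect the igb driver of this FreeBSD release to expose for the WAN port igb9, so check with sysctl first that it exists on your system:

```shell
# Boot-time tunable: one queue per igb port (takes effect after a reboot).
echo 'hw.igb.num_queues=1' >> /boot/loader.conf.local

# Runtime sysctl: disable EEE on the WAN port (igb9 in this thread).
# Assumption: the driver exposes eee_disabled for this device.
sysctl dev.igb.9.eee_disabled=1

# Show the values currently in effect.
sysctl hw.igb.num_queues dev.igb.9.eee_disabled
```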
-
So when you use the I350 as WAN this error does not occur?
-
There are a few items of concern here.
First, the num_queues setting relates to the number of available cores divided by the number of ports. There should never be more ports than cores, or the queues will overrun, which could cause a reset of the port. The value of num_queues should be less than or equal to cores/ports. The OS calculates this automatically unless the setting is overridden. As an example, if you have 4 cores and 3 ports, cores/ports is 4/3 = 1.33, so num_queues should be set to 1.
Secondly, the EEE setting must be made in the tunables section, as it does not work in loader.conf.local. Also, all power management settings in the BIOS should be disabled.
You can use the command 'sysctl -A' in the shell to see the actual settings in use.
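The rule of thumb above (cores divided by ports, rounded down) can be sketched as a small shell function. The function name is hypothetical, and clamping the result to 1 when there are more ports than cores is an assumption on my part, not something the driver documentation states:

```shell
#!/bin/sh
# Sketch of the rule of thumb: queues per port = floor(cores / ports).
queues_per_port() {
    cores=$1
    ports=$2
    q=$((cores / ports))        # shell integer division rounds down
    if [ "$q" -lt 1 ]; then
        q=1                     # assumption: never go below one queue
    fi
    echo "$q"
}

# The example from the post: 4 cores and 3 ports.
queues_per_port 4 3
```

On the firewall itself, the core count could be taken from the system, e.g. queues_per_port "$(sysctl -n hw.ncpu)" 10 for a box with ten igb ports.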
-
And check if you have the latest NIC firmware, esp. when you use Intel:
https://downloadcenter.intel.com/de/download/22283/Ethernet-Intel-Ethernet-Adapter-vollst-ndige-Treiber-Pack?product=64404
-
Another user of this system (I think he is using pfSense 2.4) switched EEE off via the driver, and at the same time he has set hw.igb.num_queues=1. Up until now, he did not see any issues. I will try this, too. I'll keep you updated.
Hi all,
that sounds good. That is why I also suggested that this should be the default or adjustable via the GUI.
Happy New Year
cheers till
-
That Intel download link is for drivers, not firmware. In the FreeBSD environment we have no control over the drivers that are used. The firmware is included as part of the BootUtil software. But it would be nice to have a hacked driver with no PM at all that could be compiled into the FreeBSD OS.
-
Correct, the firmware should be here, but I do not have an account: https://www.intel.de/content/www/de/de/embedded/products/networking/ethernet-controller-i210-i211-family-documentation.html
-
That link is for development and simulation tools. It is probably a good place if you were going to hack the drivers or firmware.
This is what you want:
https://downloadcenter.intel.com/download/19186
-
Thanks dcol! I have had many problems with the SFP+ cards over the last months; with SFP, updating the firmware on the chip is a much easier process.