OPNsense Forum

Archive => 17.1 Legacy Series => Topic started by: wefinet on July 17, 2017, 03:54:41 pm

Title: WAN link gone sometimes (igb driver, I211 nics), ifconfig d/u fixes it
Post by: wefinet on July 17, 2017, 03:54:41 pm
(UPDATE #1: the issue has finally been fixed through a BIOS/UEFI-Firmware update, see posting #44 in this thread for details: https://forum.opnsense.org/index.php?topic=5511.msg28681#msg28681)

(UPDATE #2: after many days of tests, the problem came up again - see https://forum.opnsense.org/index.php?topic=5511.msg28781#msg28781 for details)

Hi all,

I'm trying to analyze a strange issue: sometimes (very rarely; I was only able to reproduce the issue twice) the WAN link goes away. From a laptop behind the OPNsense firewall I can no longer ping the WAN's default gateway. I can still access the OPNsense system.

Details:

Have you ever seen an issue like this?
Do you have any hints what I could do to further analyze the issue?

Thanks in advance and best regards,
Werner
Title: Re: WAN link gone sometimes (igb driver, I211 nics), only ifconfig down/up fixes it
Post by: phoenix on July 17, 2017, 04:08:29 pm
I'm experiencing the same problem. My OPNsense is a VM (on ESXi) with Intel NICs, and I've never been able to find anything about troubleshooting this problem (I've previously posted this on the forum without a response). I thought it was a rare occurrence until I added a cron job to test the connection every minute, and this is the (abbreviated) result:

Code: [Select]
2017-07-04 15:23:15 - WAN interface Restarted on OPNsense
2017-07-04 15:24:15 - WAN interface Restarted on OPNsense
2017-07-06 03:46:22 - WAN interface Restarted on OPNsense
2017-07-06 04:17:22 - WAN interface Restarted on OPNsense
2017-07-07 17:41:22 - WAN interface Restarted on OPNsense
2017-07-09 06:35:22 - WAN interface Restarted on OPNsense
2017-07-09 07:06:22 - WAN interface Restarted on OPNsense
2017-07-10 20:30:22 - WAN interface Restarted on OPNsense
2017-07-12 09:24:22 - WAN interface Restarted on OPNsense
2017-07-12 09:55:22 - WAN interface Restarted on OPNsense
2017-07-12 12:53:22 - WAN interface Restarted on OPNsense
2017-07-12 12:55:15 - WAN interface Restarted on OPNsense
2017-07-12 12:56:15 - WAN interface Restarted on OPNsense

I thought that apinger might restart the interface but that appears not to be the case. :( This hasn't always happened but I can't remember when it did start other than it seems to be recent, as in sometime this year.
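
A log like the one above is easier to judge once the entries are grouped by day. The following is only a minimal sketch, assuming the log format shown above (date in the first whitespace-separated field); the function name `restarts_per_day` is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: count "WAN interface Restarted" entries per day.
# Reads the log on stdin; assumes the date is the first field of each line.
restarts_per_day() {
    awk '/WAN interface Restarted/ { count[$1]++ }
         END { for (day in count) print day, count[day] }' | sort
}
```

It would be fed the log file on standard input, for example `restarts_per_day < restart_wan.log`.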
Title: Re: WAN link gone sometimes (igb driver, I211 nics), only ifconfig down/up fixes it
Post by: wefinet on July 17, 2017, 04:30:52 pm
Thank you for reporting your experiences, Bill.

So your cron job checks the availability of the WAN interface and then restarts it in case it is not available, right?

Could you maybe post your cron job script?
Title: Re: WAN link gone sometimes (igb driver, I211 nics), only ifconfig down/up fixes it
Post by: phoenix on July 17, 2017, 04:51:40 pm
Sure, here's the script I use:

Code: [Select]
#!/bin/sh

# -q quiet
# -c number of pings to perform

# Replace [your_wan_gateway] with your WAN gateway IP.
ping -q -c5 [your_wan_gateway] > /dev/null 2>&1

if [ $? -eq 0 ]
  then
      echo "ok"
  else
      # When we restart the NIC we also need to run dhclient to get our (fixed) IP address:
      /etc/rc.d/netif restart vmx0 > /dev/null 2>&1 ; dhclient vmx0 > /dev/null 2>&1

      # Append a timestamped entry to the log file.
      echo "$(date '+%Y-%m-%d %H:%M:%S') - WAN interface Restarted on $(hostname -s)" >> /usr/home/restart_wan.log
fi

I put the script (and the log file) in /usr/home, then modified the crontab to add this line at the end:

Code: [Select]
*       *       *       *       *       (/usr/home/restart_wan) > /dev/null

I can't really take any credit for that (I found a similar script on the internet) as I'm not very experienced in scripting and it can probably be improved but it works for me.
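
Since a plain `ifconfig` down/up is reported in this thread to bring the link back, the same watchdog can also be sketched with the check and the restart split into functions, which makes a dry run possible. This is only a sketch under assumptions: the function names and the `DRY_RUN` variable are invented here, and the gateway address and interface name passed in are placeholders to adapt.

```shell
#!/bin/sh
# Sketch of a WAN watchdog; all function names are hypothetical.

# Return 0 if the gateway answers at least one of five pings.
gateway_reachable() {
    ping -q -c 5 "$1" > /dev/null 2>&1
}

# Bounce the interface; with DRY_RUN=1 only print what would be done.
restart_iface() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "would run: ifconfig $1 down; ifconfig $1 up"
    else
        ifconfig "$1" down && ifconfig "$1" up
    fi
}

# wan_watchdog GATEWAY IFACE: print "ok", or restart and print a log line.
wan_watchdog() {
    if gateway_reachable "$1"; then
        echo "ok"
    else
        restart_iface "$2"
        echo "$(date '+%Y-%m-%d %H:%M:%S') - WAN interface $2 restarted on $(hostname -s)"
    fi
}
```

A first trial could be `DRY_RUN=1 wan_watchdog 10.1.102.1 igb9`. Whether DHCP has to be re-run after the bounce depends on the setup; the script above calls dhclient for that reason.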
Title: Re: WAN link gone sometimes (igb driver, I211 nics), only ifconfig down/up fixes it
Post by: wefinet on July 19, 2017, 10:43:46 am
Thank you for your script.

This morning, the error happened again:

Code: [Select]
root@OPNsense:~ # date
Wed Jul 19 05:53:53 UTC 2017
root@OPNsense:~ # ifconfig
igb0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
options=4400b8<VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,VLAN_HWTSO,TXCSUM_IPV6>
ether 00:30:18:cd:e8:54
inet 192.168.1.1 netmask 0xffffff00 broadcast 192.168.1.255
inet6 fe80::1:1%igb0 prefixlen 64 scopeid 0x1
nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
media: Ethernet autoselect (1000baseT <full-duplex>)
status: active
igb1: flags=8c02<BROADCAST,OACTIVE,SIMPLEX,MULTICAST> metric 0 mtu 1500
[...]
igb9: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
options=4400b8<VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,VLAN_HWTSO,TXCSUM_IPV6>
ether 00:30:18:cd:ec:63
inet6 fe80::230:18ff:fecd:ec63%igb9 prefixlen 64 scopeid 0xa
inet 10.1.102.55 netmask 0xffffff00 broadcast 10.1.102.255
nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
media: Ethernet autoselect (1000baseT <full-duplex>)
status: active
[...]
root@OPNsense:~ # arp -a
? (10.1.102.1) at 4c:5e:0c:4b:23:30 on igb9 expires in 1084 seconds [ethernet]
? (10.1.102.55) at 00:30:18:cd:ec:63 on igb9 permanent [ethernet]
? (192.168.1.1) at 00:30:18:cd:e8:54 on igb0 permanent [ethernet]
? (192.168.1.100) at f0:de:f1:f3:17:88 on igb0 expires in 1136 seconds [ethernet]
root@OPNsense:~ # time arp -a
? (10.1.102.1) at 4c:5e:0c:4b:23:30 on igb9 expires in 1016 seconds [ethernet]
? (10.1.102.55) at 00:30:18:cd:ec:63 on igb9 permanent [ethernet]
? (192.168.1.1) at 00:30:18:cd:e8:54 on igb0 permanent [ethernet]
? (192.168.1.100) at f0:de:f1:f3:17:88 on igb0 expires in 1172 seconds [ethernet]
0.000u 0.004s 0:23.29 0.0% 0+0k 0+0io 0pf+0w
root@OPNsense:~ # clog /var/log/system.log | tail -n 150
[...]
Jul 19 05:37:41 OPNsense kernel: uhub1: 4 ports with 4 removable, self powered
Jul 19 05:37:41 OPNsense kernel: igb9: link state changed to UP
Jul 19 05:37:41 OPNsense kernel: aesni0: No AESNI support.
Jul 19 05:37:42 OPNsense kernel: done.
Jul 19 05:37:42 OPNsense kernel: igb9: link state changed to DOWN
Jul 19 05:37:42 OPNsense sshlockout[12834]: sshlockout/webConfigurator v3.0 starting up
Jul 19 05:37:42 OPNsense configd.py: [3a2068ae-8494-4ad6-9476-7ef4d08a0ce5] Linkup stopping igb9
Jul 19 05:37:46 OPNsense kernel: igb9: link state changed to UP
Jul 19 05:37:46 OPNsense configd.py: [f7d9bfd1-e05b-40c2-b52b-fd08a8c054c3] Linkup starting igb9
Jul 19 05:37:49 OPNsense kernel: done.
Jul 19 05:37:49 OPNsense kernel: pflog0: promiscuous mode enabled
Jul 19 05:37:50 OPNsense kernel: ...done.
Jul 19 05:37:50 OPNsense kernel: done.
Jul 19 05:37:50 OPNsense sshd[48047]: Server listening on :: port 22.
Jul 19 05:37:50 OPNsense sshd[48047]: Server listening on 0.0.0.0 port 22.
[...]
Jul 19 05:38:02 OPNsense sshlockout[70807]: sshlockout/webConfigurator v3.0 starting up
Jul 19 05:38:02 OPNsense kernel: OK
Jul 19 05:38:04 OPNsense kernel:
Jul 19 05:45:52 OPNsense kernel: igb0: link state changed to UP
Jul 19 05:45:52 OPNsense configd.py: [ac2de28c-7b2a-4fa0-9f27-2e8536f2c95d] Linkup starting igb0
Jul 19 05:45:53 OPNsense opnsense: /usr/local/etc/rc.linkup: DEVD Ethernet attached event for lan
Jul 19 05:45:53 OPNsense opnsense: /usr/local/etc/rc.linkup: HOTPLUG: Configuring interface lan
Jul 19 05:45:56 OPNsense configd.py: [2355f16d-4b08-4c1d-85db-cc50b95f937e] updating dyndns lan
Jul 19 05:45:56 OPNsense configd.py: [6072a8dd-cf07-4c0d-aea2-c0b32445b557] updating rfc2136 lan
Jul 19 05:53:40 OPNsense sshd[30826]: Postponed keyboard-interactive for root from 192.168.1.100 port 35188 ssh2 [preauth]
Jul 19 05:53:44 OPNsense opnsense: user 'root' authenticated successfully
Jul 19 05:53:44 OPNsense sshd[30826]: Postponed keyboard-interactive/pam for root from 192.168.1.100 port 35188 ssh2 [preauth]
Jul 19 05:53:44 OPNsense sshd[30826]: Accepted keyboard-interactive/pam for root from 192.168.1.100 port 35188 ssh2
root@OPNsense:~ # date
Wed Jul 19 06:05:53 UTC 2017
root@OPNsense:~ # time arp -a
? (10.1.102.1) at 4c:5e:0c:4b:23:30 on igb9 expires in 204 seconds [ethernet]
? (10.1.102.55) at 00:30:18:cd:ec:63 on igb9 permanent [ethernet]
? (192.168.1.1) at 00:30:18:cd:e8:54 on igb0 permanent [ethernet]
? (192.168.1.100) at f0:de:f1:f3:17:88 on igb0 expires in 1101 seconds [ethernet]
0.000u 0.004s 0:23.33 0.0% 0+0k 0+0io 0pf+0w
root@OPNsense:~ # date
Wed Jul 19 06:11:33 UTC 2017
root@OPNsense:~ # ping 10.1.102.1
PING 10.1.102.1 (10.1.102.1): 56 data bytes
^C
--- 10.1.102.1 ping statistics ---
5 packets transmitted, 0 packets received, 100.0% packet loss
root@OPNsense:~ # time arp -a
? (10.1.102.1) at (incomplete) on igb9 expired [ethernet]
? (10.1.102.55) at 00:30:18:cd:ec:63 on igb9 permanent [ethernet]
? (192.168.1.1) at 00:30:18:cd:e8:54 on igb0 permanent [ethernet]
? (192.168.1.100) at f0:de:f1:f3:17:88 on igb0 expires in 1074 seconds [ethernet]
0.000u 0.004s 0:23.59 0.0% 0+0k 0+0io 0pf+0w

My findings:

Does anybody have an idea or hint what else I could run to analyze this issue? (I am keeping the OPNsense system in this state, as it is not easy to trigger the issue.)
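
In the transcript above, the ARP entry for the gateway 10.1.102.1 changes from a MAC address to "(incomplete)" once the link is gone. A minimal sketch for checking that state from a script, assuming the `arp` output format shown above; the function name `arp_state` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: classify the ARP entry for one IP address.
# Reads `arp -an` style output on stdin and prints one of:
#   resolved    a MAC address is listed for the IP
#   incomplete  the entry exists but has no MAC address
#   missing     there is no entry for the IP at all
arp_state() {
    awk -v ip="($1)" '
        $2 == ip { state = ($4 == "(incomplete)") ? "incomplete" : "resolved" }
        END { print (state == "" ? "missing" : state) }'
}
```

Usage would be along the lines of `arp -an | arp_state 10.1.102.1`; the `-n` flag makes arp skip name lookups.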
Title: Re: WAN link gone sometimes (igb driver, I211 nics), only ifconfig down/up fixes it
Post by: franco on July 19, 2017, 10:52:15 am
Hi Werner,

Which one is the WAN interface, igb9 or igb0?

WAN is DHCP, right?

What do the following commands yield?

# ps aux | grep dhclient
# netstat -nr | grep default

Does this bring it back?

# killall dhclient
# dhclient igbX

If not, does this?

# /usr/local/etc/rc.newwanip wan


Cheers,
Franco

Title: Re: WAN link gone sometimes (igb driver, I211 nics), only ifconfig down/up fixes it
Post by: wefinet on July 19, 2017, 11:03:46 am
Hi Franco,

thanks a lot for your friendly help.

WAN is igb9 and WAN is DHCP.

Unfortunately, the commands did not bring the connectivity back. But it surprises me that running dhclient again shows a "DHCPACK". Here is my log:

Code: [Select]
root@OPNsense:~ # ps aux | grep dhclient
root   15548   0.0  0.0 1076296  2844  -  Is   05:37     0:00.00 dhclient: igb9 [priv] (dhclient)
_dhcp  24493   0.0  0.0 1076296  2908  -  Is   05:37     0:00.00 dhclient: igb9 (dhclient)
root   59205   0.0  0.0 1080488  2856  0  S+   08:53     0:00.01 grep dhclient
root@OPNsense:~ # netstat -nr | grep default
default            10.1.102.1         UGS        igb9
root@OPNsense:~ # ping 10.1.102.1
PING 10.1.102.1 (10.1.102.1): 56 data bytes
^C
--- 10.1.102.1 ping statistics ---
3 packets transmitted, 0 packets received, 100.0% packet loss
root@OPNsense:~ # date
Wed Jul 19 08:54:46 UTC 2017
root@OPNsense:~ # killall dhclient
root@OPNsense:~ # ps aux | grep dhclient
root   77072   0.0  0.0 1080488  2860  0  S+   08:55     0:00.00 grep dhclient
root@OPNsense:~ # date
Wed Jul 19 08:55:16 UTC 2017
root@OPNsense:~ # dhclient igb9
DHCPREQUEST on igb9 to 255.255.255.255 port 67
DHCPACK from 10.1.102.5
bound to 10.1.102.55 -- renewal in 21600 seconds.
root@OPNsense:~ # date
Wed Jul 19 08:55:32 UTC 2017
root@OPNsense:~ # ping 10.1.102.1
PING 10.1.102.1 (10.1.102.1): 56 data bytes
^C
--- 10.1.102.1 ping statistics ---
5 packets transmitted, 0 packets received, 100.0% packet loss
root@OPNsense:~ # date
Wed Jul 19 08:56:08 UTC 2017
root@OPNsense:~ # arp -a -n
? (10.1.102.1) at (incomplete) on igb9 expired [ethernet]
? (10.1.102.55) at 00:30:18:cd:ec:63 on igb9 permanent [ethernet]
? (192.168.1.1) at 00:30:18:cd:e8:54 on igb0 permanent [ethernet]
? (192.168.1.101) at f0:de:f1:ab:ce:49 on igb0 expires in 1011 seconds [ethernet]
root@OPNsense:~ # date
Wed Jul 19 08:56:36 UTC 2017
root@OPNsense:~ # /usr/local/etc/rc.newwanip wan
root@OPNsense:~ # date
Wed Jul 19 08:57:09 UTC 2017
root@OPNsense:~ # ping 10.1.102.1
PING 10.1.102.1 (10.1.102.1): 56 data bytes
^C
--- 10.1.102.1 ping statistics ---
3 packets transmitted, 0 packets received, 100.0% packet loss
root@OPNsense:~ # date
Wed Jul 19 08:57:25 UTC 2017
root@OPNsense:~ #

In the past, an "ifconfig igb9 down" followed by an "ifconfig igb9 up" brought the connectivity back, but I am keeping the system in its current state in case you have any other ideas.

Should I do the steps again while recording a tcpdump?

Thanks,
Werner
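
The default-route check from the transcript above can be scripted so that the gateway address does not have to be typed by hand. This is a small sketch, assuming the `netstat -nr` line format shown above; the function name `default_gateway` is made up here.

```shell
#!/bin/sh
# Hypothetical helper: print the gateway of the first "default" route
# found in `netstat -nr` output read from stdin.
default_gateway() {
    awk '$1 == "default" { print $2; exit }'
}
```

For example, `ping -c 3 "$(netstat -nr | default_gateway)"` would ping whatever gateway the first listed default route points to.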
Title: Re: WAN link gone sometimes (igb driver, I211 nics), only ifconfig down/up fixes it
Post by: wilbolinux on August 06, 2017, 02:27:59 am

I had a similar problem with another NIC on an embedded device.

It is possible that the power management / energy saver mode of your Intel NIC does not work correctly.

Disable the following for the I211 Gigabit NIC:

* the Energy Efficient Ethernet mode
* and WOL (Wake-on-LAN) for this interface.

Test it manually with ethtool:

Disable Energy-Efficient Ethernet mode:

    ethtool --set-eee <your nic device> eee off

Disable Wake-on-LAN:

    ethtool -s <your nic device> wol d

If it works, add the commands to /etc/rc.local so they are applied at boot.

Sincerely,
Wilbo.
Title: Re: WAN link gone sometimes (igb driver, I211 nics), only ifconfig down/up fixes it
Post by: wefinet on August 16, 2017, 10:29:05 am
Thank you wilbolinux for your suggestion to check the Energy Efficient Ethernet settings.

I have searched for ethtool, but it seems that this tool is not available for FreeBSD.

Does anybody have an idea how the Energy Efficient Ethernet settings can be checked for igb devices under FreeBSD/OPNsense?
Title: Re: WAN link gone sometimes (igb driver, I211 nics), only ifconfig down/up fixes it
Post by: franco on August 16, 2017, 10:41:00 am
Hi Werner,

Uhh, I do remember this now...

sysctl  -a | grep dev.igb...eee_disabled

Set it to 1 for each interface to disable EEE (Energy Efficient Ethernet) as this may cause up/down issues.

Apologies, I took this from an old e-mail thread, it's not on the forum so hard to find.


Cheers,
Franco
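
Setting the value "for each interface" means one line per igb device, which can be generated rather than typed. A minimal sketch; the function name `eee_tunables` is hypothetical and the interface count is an assumption to adapt to the hardware.

```shell
#!/bin/sh
# Hypothetical helper: print "dev.igb.N.eee_disabled=1" for N = 0 .. COUNT-1.
eee_tunables() {
    count=$1
    i=0
    while [ "$i" -lt "$count" ]; do
        echo "dev.igb.$i.eee_disabled=1"
        i=$((i + 1))
    done
}
```

The output of `eee_tunables 10` can be pasted into a configuration file, or tried at runtime with `eee_tunables 10 | while read -r kv; do sysctl "$kv"; done`. Whether the value can be changed at runtime may depend on the driver version.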
Title: Re: WAN link gone sometimes (igb driver, I211 nics), only ifconfig down/up fixes it
Post by: wefinet on August 16, 2017, 11:21:31 am
Hi Franco,

thank you so much for your fast and confident answer! This makes me optimistic that we can solve this issue :)

I will test this and let you know within the upcoming days whether I was able to fix my issue by disabling EEE.

Thanks again,
Werner
Title: Re: WAN link gone sometimes (igb driver, I211 nics), only ifconfig down/up fixes it
Post by: wefinet on August 18, 2017, 12:45:46 pm
Hi Franco,

I was able to reproduce the problem with OPNsense 17.7 (after I updated the server mentioned above from 17.1 to 17.7).

I then disabled Energy-Efficient Ethernet for all 10 NICs of the system (setting the tunables "dev.igb.0.eee_disabled" through "dev.igb.9.eee_disabled" to "1"). After that, I have had no more problems.

I have documented the fix in our wiki here (in German): https://www.thomas-krenn.com/de/wiki/OPNsense_igb_EEE_Funktion_deaktivieren

If, against my expectation, the problem arises again, I will post it here - I hope this will not be necessary  ;)

So thanks again for your help,
best regards,
Werner
Title: Re: [SOLVED] WAN link gone sometimes (igb driver, I211 nics), ifconfig d/u fixes it
Post by: tillsense on August 20, 2017, 07:52:10 pm
Hi all,

i think Energy Efficient Ethernet (eee) should be disabled by default in opnsense.

cheers
till
Title: Re: [SOLVED] WAN link gone sometimes (igb driver, I211 nics), ifconfig d/u fixes it
Post by: franco on August 21, 2017, 02:35:04 pm
It seems to only affect a tiny fraction of igb devices. And since the number of igb devices is device-specific, we'd have to write detection code as well or alternatively hardcode the EEE default in the kernel. Unless we narrow these chipsets down to look at the full picture, I am not sure what to do.


Cheers,
Franco
Title: WAN link again down, although dev.igb.X.eee_disabled is set to 1
Post by: wefinet on August 22, 2017, 09:06:52 am
Hi again,

unfortunately the problem has now occurred again, although I have set eee_disabled to 1 for all 10 NICs.

The error occurred about 3-4 minutes after I booted the firewall system. I was able to use the Internet from my laptop behind the firewall, but after 3-4 minutes I was no longer able to ping the WAN link's default gateway:

Code: [Select]
root@OPNsense:~ # ping 10.1.102.1
PING 10.1.102.1 (10.1.102.1): 56 data bytes
^C
--- 10.1.102.1 ping statistics ---
4 packets transmitted, 0 packets received, 100.0% packet loss
root@OPNsense:~ # sysctl -a | grep -i eee
options IEEE80211_SUPPORT_MESH
options IEEE80211_AMPDU_AGE
options IEEE80211_DEBUG
z0xfffff8000baeee80 [label="r0w0e0"];
z0xfffff8000baeee80 -> z0xfffff8000521d300;
z0xfffff8000bb5be00 -> z0xfffff8000baeee80;
<consumer id="0xfffff8000baeee80">
hw.bxe.autogreeen: 0
hw.em.eee_setting: 1
dev.igb.9.eee_disabled: 1
dev.igb.8.eee_disabled: 1
dev.igb.7.eee_disabled: 1
dev.igb.6.eee_disabled: 1
dev.igb.5.eee_disabled: 1
dev.igb.4.eee_disabled: 1
dev.igb.3.eee_disabled: 1
dev.igb.2.eee_disabled: 1
dev.igb.1.eee_disabled: 1
dev.igb.0.eee_disabled: 1
root@OPNsense:~ #

Does anybody have any other hints what the root cause for this issue could be?

PS: I will now start some tests in parallel on another system from the same board manufacturer: http://www.jetwaycomputer.com/JBC385F551.html I will keep you updated on how the tests go there.
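
The `grep -i eee` above also matches unrelated lines, so a tighter filter makes the check easier to read. A minimal sketch, assuming the "name: value" sysctl output format shown above; the function name `eee_still_enabled` is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: list igb interfaces whose EEE is NOT disabled.
# Reads sysctl output ("name: value" lines) on stdin; prints nothing
# when every dev.igb.N.eee_disabled value is 1.
eee_still_enabled() {
    awk -F': ' '/^dev\.igb\.[0-9]+\.eee_disabled: / && $2 != "1" { print $1 }'
}
```

Usage would be `sysctl dev.igb | eee_still_enabled`; empty output corresponds to the state shown in the listing above.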
Title: Re: WAN link gone sometimes (igb driver, I211 nics), ifconfig d/u fixes it
Post by: franco on August 22, 2017, 10:08:25 am
Where do you set these values, under System: Settings: Tunables?

The settings may be too late if the boot takes long, maybe they could also be set under /boot/loader.conf.local, but I'm not sure.


Cheers,
Franco
Title: Re: WAN link gone sometimes (igb driver, I211 nics), ifconfig d/u fixes it
Post by: wefinet on August 22, 2017, 10:22:36 am
Hi Franco,

thank you very much for your hint. Indeed, I'm setting the variables as tunables - like I described here: https://www.thomas-krenn.com/de/wiki/OPNsense_igb_EEE_Funktion_deaktivieren

Regarding configuration files, you mentioned that I could try setting it in /boot/loader.conf.local.
A blog post about network tuning in BSD - https://calomel.org/freebsd_network_tuning.html - describes setting the Intel igb EEE option in /etc/sysctl.conf.

My questions:

Thanks again very much for your valuable help.
Title: Re: WAN link gone sometimes (igb driver, I211 nics), ifconfig d/u fixes it
Post by: wefinet on August 22, 2017, 04:02:40 pm
On my test system (meanwhile running OPNsense 17.7) I have now added the setting to both /boot/loader.conf.local and /etc/sysctl.conf (and rebooted the system afterwards):

Code: [Select]
root@OPNsense:~ # cat /boot/loader.conf.local
dev.igb.0.eee_disabled=1
dev.igb.1.eee_disabled=1
dev.igb.2.eee_disabled=1
dev.igb.3.eee_disabled=1
dev.igb.4.eee_disabled=1
dev.igb.5.eee_disabled=1
dev.igb.6.eee_disabled=1
dev.igb.7.eee_disabled=1
dev.igb.8.eee_disabled=1
dev.igb.9.eee_disabled=1
root@OPNsense:~ # cat /etc/sysctl.conf
# $FreeBSD$
#
#  This file is read when going to multi-user and its contents piped thru
#  ``sysctl'' to adjust kernel values.  ``man 5 sysctl.conf'' for details.
#

# Uncomment this to prevent users from seeing information about processes that
# are being run under another UID.
#security.bsd.see_other_uids=0
dev.igb.0.eee_disabled=1
dev.igb.1.eee_disabled=1
dev.igb.2.eee_disabled=1
dev.igb.3.eee_disabled=1
dev.igb.4.eee_disabled=1
dev.igb.5.eee_disabled=1
dev.igb.6.eee_disabled=1
dev.igb.7.eee_disabled=1
dev.igb.8.eee_disabled=1
dev.igb.9.eee_disabled=1
root@OPNsense:~ #

I'll continue to use this setup for the next 2 weeks (I'm powering down the firewall before I leave the office, as the problem occurred only 3-5 minutes after boot of the firewall - at least in my tests).

By the way: With the current pfSense version I have not been able to reproduce this issue, although its FreeBSD 10.3 base uses the same igb driver version (2.5.3) as FreeBSD 11.0 does. Are there maybe any other networking changes/tunables between FreeBSD 10.3 and 11.0 that could lead to this issue?

I'll keep you updated once I have any news on the issue.

Best regards,
Werner
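
When the same settings are added by hand to more than one file, duplicate lines creep in easily. The following is a minimal sketch of an idempotent append; the function name `ensure_line` is hypothetical, and which file is actually honored at boot on OPNsense is not settled by this sketch.

```shell
#!/bin/sh
# Hypothetical helper: append LINE to FILE unless an identical line
# is already present (whole-line, fixed-string comparison).
ensure_line() {
    file=$1
    line=$2
    touch "$file"
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}
```

For example, `ensure_line /boot/loader.conf.local 'dev.igb.0.eee_disabled=1'` can be run repeatedly without growing the file.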
Title: Re: WAN link gone sometimes (igb driver, I211 nics), ifconfig d/u fixes it
Post by: wefinet on August 23, 2017, 07:54:12 am
Unfortunately, the problem now occurred again although I have added the settings to both /boot/loader.conf.local and /etc/sysctl.conf as described above.

As a next step to narrow down the root cause, I will continue to test with another system. The current system has 10 * Intel i211AT Gigabit LAN, the second (which I want to test now - http://www.jetwaycomputer.com/JBC385F551.html) has the following NICs:

Maybe the problem only affects the Intel i211-AT chip...

I will keep you updated. In case you have any news/ideas, just let me know.

Thanks & best regards,
Werner
Title: Re: WAN link gone sometimes (igb driver, I211 nics), ifconfig d/u fixes it
Post by: wefinet on August 24, 2017, 04:17:11 pm
As I think that the network issues might be related to some energy saving functions, I have now switched back to the JBC390F541AA-19-B system with its 10 Intel i211-AT based NICs.

I have changed the BIOS setting (BIOS Version file BAR1NA02, BIOS Date 02/25/2016) to the following settings:

And I have set the following variables as suggested/mentioned in https://www.freebsd.org/cgi/man.cgi?query=pci&sektion=4 and https://calomel.org/freebsd_network_tuning.html

Code: [Select]
root@OPNsense:~ # cat /boot/loader.conf.local
dev.igb.0.eee_disabled=1
dev.igb.1.eee_disabled=1
dev.igb.2.eee_disabled=1
dev.igb.3.eee_disabled=1
dev.igb.4.eee_disabled=1
dev.igb.5.eee_disabled=1
dev.igb.6.eee_disabled=1
dev.igb.7.eee_disabled=1
dev.igb.8.eee_disabled=1
dev.igb.9.eee_disabled=1

hw.pci.do_power_suspend=0

dev.igb.0.fc=0
dev.igb.1.fc=0
dev.igb.2.fc=0
dev.igb.3.fc=0
dev.igb.4.fc=0
dev.igb.5.fc=0
dev.igb.6.fc=0
dev.igb.7.fc=0
dev.igb.8.fc=0
dev.igb.9.fc=0
root@OPNsense:~ # cat /etc/sysctl.conf
# $FreeBSD$
#
#  This file is read when going to multi-user and its contents piped thru
#  ``sysctl'' to adjust kernel values.  ``man 5 sysctl.conf'' for details.
#

# Uncomment this to prevent users from seeing information about processes that
# are being run under another UID.
#security.bsd.see_other_uids=0
dev.igb.0.eee_disabled=1
dev.igb.1.eee_disabled=1
dev.igb.2.eee_disabled=1
dev.igb.3.eee_disabled=1
dev.igb.4.eee_disabled=1
dev.igb.5.eee_disabled=1
dev.igb.6.eee_disabled=1
dev.igb.7.eee_disabled=1
dev.igb.8.eee_disabled=1
dev.igb.9.eee_disabled=1

hw.pci.do_power_suspend=0

dev.igb.0.fc=0
dev.igb.1.fc=0
dev.igb.2.fc=0
dev.igb.3.fc=0
dev.igb.4.fc=0
dev.igb.5.fc=0
dev.igb.6.fc=0
dev.igb.7.fc=0
dev.igb.8.fc=0
dev.igb.9.fc=0
root@OPNsense:~ #

I have also set all these variables as "tunables" in OPNsense, because some settings (e.g. "dev.igb.0.fc=0") did not end up with the desired value (sysctl -a reported e.g. "dev.igb.0.fc=0"); adding the variables as tunables in OPNsense fixed this.

I have added the output of "pciconf -lvbce" as an attachment (forum login needed to see it). I wanted to check the setting for Active State Power Management - ASPM (all devices show "ASPM disabled(L0s/L1)"). I have found a posting (although 5 years old) where someone suggests disabling this feature in the BIOS ("Just make sure you keep the Active State Power Management option in the Advanced Chipset Control BIOS screen at the Disabled setting (this is the default), because when I enabled this, my Intel NICs occasionally got stuck in a low power state, needing a full reset to resolve.", see https://forums.freebsd.org/threads/35529/#post-195907). There is currently no option for ASPM in the BIOS of the JBC390F541AA-19-B. But as pciconf reports it as "disabled", I _think_ this should be ok.

I will keep you updated on whether the NIC issues come back.
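
Checking the ASPM state across many devices by eye is error-prone, so the pciconf output can be filtered instead. A minimal sketch, assuming the state appears on a line containing "ASPM" as in the "ASPM disabled(L0s/L1)" text quoted above; the function name `aspm_enabled_count` is made up here.

```shell
#!/bin/sh
# Hypothetical helper: count lines in `pciconf -lvbce` style output (stdin)
# that mention ASPM without reporting it as disabled.
aspm_enabled_count() {
    awk '/ASPM/ && !/ASPM disabled/ { n++ } END { print n + 0 }'
}
```

With `pciconf -lvbce | aspm_enabled_count`, a result of 0 corresponds to the "all devices show disabled" observation above.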
Title: Re: WAN link gone sometimes (igb driver, I211 nics), ifconfig d/u fixes it
Post by: wefinet on August 30, 2017, 10:50:42 am
I did not have the impression that the settings have helped (although I have not tested over a longer period and I did not see the issue during my short tests).

Meanwhile I got feedback from the board manufacturer regarding "Active State Power Management" (ASPM) for PCIe. There is no option for this in the BIOS version BAR1NA02, but the default setting is already "disabled" (as "pciconf -lvbce" shows). As ASPM is not activated, it cannot be causing my issue.

I now want to narrow down whether the problem has to do with FreeBSD version 11.0. With pfSense 2.3 (FreeBSD 10.3) we have not observed the issue. As pfSense 2.4 RC is out (currently using 11.0-RELEASE-p12), I'll check whether it is running into this problem (I _think/assume_ that the problem could arise then, too).

For my test I went back to the BIOS and set the following options:
I have kept the default igb driver settings (so dev.igb.X.eee_disabled is set to the default "0" for all 10 NICs).

I will keep you updated once I have any new information.
Title: Re: WAN link gone sometimes (igb driver, I211 nics), ifconfig d/u fixes it
Post by: franco on August 30, 2017, 10:57:45 am
Hi Werner,

Sorry to hear this is still happening, but thank you for keeping on top of it! :)


Cheers,
Franco
Title: Re: WAN link gone sometimes (igb driver, I211 nics), ifconfig d/u fixes it
Post by: wefinet on August 30, 2017, 11:20:48 am
Using pfSense 2.4 RC I also ran into issues after some minutes of using it. From my client laptop (connected directly to igb0, the LAN link of the firewall) I was no longer able to reach the WAN network. It did not break completely: I was able to reach pfSense's web interface, but on the other hand an SSH session that I had opened earlier broke:

Code: [Select]
[2.4.0-RC][admin@pfSense24.test.thomas-krenn.com]/root: ifconfig
igb0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
options=6400bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,VLAN_HWTSO,RXCSUM_IPV6,TXCSUM_IPV6>
ether 00:30:18:cd:e8:54
hwaddr 00:30:18:cd:e8:54
inet 192.168.1.1 netmask 0xffffff00 broadcast 192.168.1.255
inet6 fe80::1:1%igb0 prefixlen 64 scopeid 0x1
nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
media: Ethernet autoselect (1000baseT <full-duplex>)
status: active
igb1: flags=8c02<BROADCAST,OACTIVE,SIMPLEX,MULTICAST> metric 0 mtu 1500
options=6403bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,VLAN_HWTSO,RXCSUM_IPV6,TXCSUM_IPV6>
ether 00:30:18:cd:e8:55
hwaddr 00:30:18:cd:e8:55
nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
media: Ethernet autoselect
status: no carrier
igb2: flags=8c02<BROADCAST,OACTIVE,SIMPLEX,MULTICAST> metric 0 mtu 1500
options=6403bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,VLAN_HWTSO,RXCSUM_IPV6,TXCSUM_IPV6>
ether 00:30:18:cd:ef:80
hwaddr 00:30:18:cd:ef:80
nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
media: Ethernet autoselect
status: no carrier
igb3: flags=8c02<BROADCAST,OACTIVE,SIMPLEX,MULTICAST> metric 0 mtu 1500
options=6403bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,VLAN_HWTSO,RXCSUM_IPV6,TXCSUM_IPV6>
ether 00:30:18:cd:ef:81
hwaddr 00:30:18:cd:ef:81
nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
media: Ethernet autoselect
status: no carrier
igb4: flags=8c02<BROADCAST,OACTIVE,SIMPLEX,MULTICAST> metric 0 mtu 1500
options=6403bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,VLAN_HWTSO,RXCSUM_IPV6,TXCSUM_IPV6>
ether 00:30:18:cd:ef:82
hwaddr 00:30:18:cd:ef:82
nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
media: Ethernet autoselect
status: no carrier
igb5: flags=8c02<BROADCAST,OACTIVE,SIMPLEX,MULTICAST> metric 0 mtu 1500
options=6403bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,VLAN_HWTSO,RXCSUM_IPV6,TXCSUM_IPV6>
ether 00:30:18:cd:ef:83
hwaddr 00:30:18:cd:ef:83
nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
media: Ethernet autoselect
status: no carrier
igb6: flags=8c02<BROADCAST,OACTIVE,SIMPLEX,MULTICAST> metric 0 mtu 1500
options=6403bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,VLAN_HWTSO,RXCSUM_IPV6,TXCSUM_IPV6>
ether 00:30:18:cd:ec:60
hwaddr 00:30:18:cd:ec:60
nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
media: Ethernet autoselect
status: no carrier
igb7: flags=8c02<BROADCAST,OACTIVE,SIMPLEX,MULTICAST> metric 0 mtu 1500
options=6403bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,VLAN_HWTSO,RXCSUM_IPV6,TXCSUM_IPV6>
ether 00:30:18:cd:ec:61
hwaddr 00:30:18:cd:ec:61
nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
media: Ethernet autoselect
status: no carrier
igb8: flags=8c02<BROADCAST,OACTIVE,SIMPLEX,MULTICAST> metric 0 mtu 1500
options=6403bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,VLAN_HWTSO,RXCSUM_IPV6,TXCSUM_IPV6>
ether 00:30:18:cd:ec:62
hwaddr 00:30:18:cd:ec:62
nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
media: Ethernet autoselect
status: no carrier
igb9: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
options=6400bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,VLAN_HWTSO,RXCSUM_IPV6,TXCSUM_IPV6>
ether 00:30:18:cd:ec:63
hwaddr 00:30:18:cd:ec:63
inet6 fe80::230:18ff:fecd:ec63%igb9 prefixlen 64 scopeid 0xa
inet 10.1.102.55 netmask 0xffffff00 broadcast 10.1.102.255
nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
media: Ethernet autoselect (1000baseT <full-duplex>)
status: active
enc0: flags=0<> metric 0 mtu 1536
nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
groups: enc
lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
options=600003<RXCSUM,TXCSUM,RXCSUM_IPV6,TXCSUM_IPV6>
inet6 ::1 prefixlen 128
inet6 fe80::1%lo0 prefixlen 64 scopeid 0xc
inet 127.0.0.1 netmask 0xff000000
nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
groups: lo
pfsync0: flags=0<> metric 0 mtu 1500
groups: pfsync
syncpeer: 224.0.0.240 maxupd: 128 defer: on
syncok: 1
pflog0: flags=100<PROMISC> metric 0 mtu 33160
groups: pflog
[2.4.0-RC][admin@pfSense24.test.thomas-krenn.com]/root: ps aux
USER      PID  %CPU %MEM    VSZ   RSS TT  STAT STARTED     TIME COMMAND
root       11 398.3  0.0      0    64  -  RL   10:33   78:02.96 [idle]
root        0   0.0  0.0      0   560  -  DLs  10:33    0:00.01 [kernel]
root        1   0.0  0.0   5004   840  -  ILs  10:33    0:00.01 /sbin/init --
root        2   0.0  0.0      0    16  -  DL   10:33    0:00.00 [crypto]
root        3   0.0  0.0      0    16  -  DL   10:33    0:00.00 [crypto returns]
root        4   0.0  0.0      0    32  -  DL   10:33    0:00.00 [cam]
root        5   0.0  0.0      0    16  -  DL   10:33    0:00.00 [sctp_iterator]
root        6   0.0  0.0      0    16  -  DL   10:33    0:00.26 [pf purge]
root        7   0.0  0.0      0    16  -  DL   10:33    0:00.37 [rand_harvestq]
root        8   0.0  0.0      0    16  -  DL   10:33    0:00.00 [soaiod1]
root        9   0.0  0.0      0    16  -  DL   10:33    0:00.00 [soaiod2]
root       10   0.0  0.0      0    16  -  DL   10:33    0:00.00 [audit]
root       12   0.0  0.0      0  1040  -  WL   10:33    0:05.85 [intr]
root       13   0.0  0.0      0    64  -  DL   10:33    0:00.00 [ng_queue]
root       14   0.0  0.0      0    48  -  DL   10:33    0:00.02 [geom]
root       15   0.0  0.0      0    96  -  DL   10:33    0:00.06 [usb]
root       16   0.0  0.0      0    16  -  DL   10:33    0:00.02 [acpi_thermal]
root       17   0.0  0.0      0    16  -  DL   10:33    0:00.00 [soaiod3]
root       18   0.0  0.0      0    16  -  DL   10:33    0:00.00 [soaiod4]
root       19   0.0  0.0      0    32  -  DL   10:33    0:00.02 [pagedaemon]
root       20   0.0  0.0      0    16  -  DL   10:33    0:00.00 [vmdaemon]
root       21   0.0  0.0      0    16  -  DL   10:33    0:00.00 [pagezero]
root       22   0.0  0.0      0    16  -  DL   10:33    0:00.01 [bufspacedaemon]
root       23   0.0  0.0      0    32  -  DL   10:33    0:00.03 [bufdaemon]
root       24   0.0  0.0      0    16  -  DL   10:33    0:00.01 [vnlru]
root       25   0.0  0.0      0    16  -  DL   10:33    0:00.05 [syncer]
root       56   0.0  0.0      0    16  -  DL   10:33    0:00.01 [md0]
root      294   0.0  0.3 269012 27140  -  Ss   10:33    0:00.03 php-fpm: master process (/usr/local/lib/php-fpm.conf) (php-fpm)
root      308   0.0  0.1  19404  4504  -  INs  10:33    0:00.01 /usr/local/sbin/check_reload_status
root      310   0.0  0.1  19404  4300  -  IN   10:33    0:00.00 check_reload_status: Monitoring daemon of check_reload_status
root      322   0.0  0.1   9508  4912  -  Ss   10:33    0:00.01 /sbin/devd -q -f /etc/pfSense-devd.conf
root     6410   0.0  0.0  10496  2304  -  Is   10:33    0:00.00 dhclient: igb9 [priv] (dhclient)
_dhcp   12169   0.0  0.0  10496  2404  -  Is   10:33    0:00.00 dhclient: igb9 (dhclient)
root    15133   0.0  0.0  12636  2344  -  Ss   10:33    0:00.08 /usr/local/sbin/filterlog -i pflog0 -p /var/run/filterlog.pid
root    24528   0.0  0.0  10948  2316  -  Is   10:33    0:00.26 /usr/local/bin/dpinger -S -r 0 -i WAN_DHCP -B 10.1.102.55 -p /v
unbound 25239   0.0  0.3  72792 24572  -  Ss   10:33    0:00.48 /usr/local/sbin/unbound -c /var/unbound/unbound.conf
root    30402   0.0  0.1  35588  6900  -  Is   10:33    0:00.00 nginx: master process /usr/local/sbin/nginx -c /var/etc/nginx-w
root    30627   0.0  0.1  37636  7652  -  I    10:33    0:00.02 nginx: worker process (nginx)
root    30669   0.0  0.1  35588  7492  -  I    10:33    0:00.00 nginx: worker process (nginx)
root    31269   0.0  0.0  12468  2360  -  Is   10:33    0:00.00 /usr/sbin/cron -s
root    31823   0.0  0.2  24564 12396  -  Ss   10:33    0:00.20 /usr/local/sbin/ntpd -g -c /var/etc/ntpd.conf -p /var/run/ntpd.
dhcpd   35871   0.0  0.2  22808 13404  -  Ss   10:33    0:00.07 /usr/local/sbin/dhcpd -user dhcpd -group _dhcp -chroot /var/dhc
root    36272   0.0  0.0  10332  2296  -  S    10:33    0:00.02 /usr/local/sbin/radvd -p /var/run/radvd.pid -C /var/etc/radvd.c
root    39167   0.0  0.4 269012 35036  -  I    10:47    0:00.03 php-fpm: pool nginx (php-fpm)
root    41695   0.0  0.1  53408  7524  -  Is   10:47    0:00.00 /usr/sbin/sshd
root    42294   0.0  0.1  78756  8056  -  Ss   10:47    0:00.06 sshd: admin@pts/0 (sshd)
root    55386   0.0  0.0  10448  2516  -  Ss   10:34    0:00.05 /usr/sbin/syslogd -s -c -c -l /var/dhcpd/var/run/log -P /var/ru
root    56705   0.0  0.0   8200  1984  -  Is   10:34    0:00.00 /usr/local/bin/minicron 240 /var/run/ping_hosts.pid /usr/local/
root    57021   0.0  0.0   8200  2000  -  I    10:34    0:00.00 minicron: helper /usr/local/bin/ping_hosts.sh  (minicron)
root    57155   0.0  0.0   8200  1984  -  Is   10:34    0:00.00 /usr/local/bin/minicron 3600 /var/run/expire_accounts.pid /usr/
root    57320   0.0  0.0   6148  1908  -  IN   10:52    0:00.00 sleep 60
root    57414   0.0  0.0   8200  2000  -  I    10:34    0:00.00 minicron: helper /usr/local/sbin/fcgicli -f /etc/rc.expireaccou
root    57694   0.0  0.0   8200  1984  -  Is   10:34    0:00.00 /usr/local/bin/minicron 86400 /var/run/update_alias_url_data.pi
root    57970   0.0  0.0   8200  2000  -  I    10:34    0:00.00 minicron: helper /usr/local/sbin/fcgicli -f /etc/rc.update_alia
root    89712   0.0  0.0  10552  2296  -  Is   10:34    0:00.00 /usr/local/sbin/sshlockout_pf 15
root    40597   0.0  0.0  13048  2544 v0- IN   10:34    0:00.23 /bin/sh /var/db/rrd/updaterrd.sh
root    88321   0.0  0.0  39404  2816 v0  Is   10:34    0:00.01 login [pam] (login)
root    89819   0.0  0.0  13048  2888 v0  I    10:34    0:00.01 -sh (sh)
root    89857   0.0  0.0  13048  2760 v0  I+   10:34    0:00.00 /bin/sh /etc/rc.initial
root    88339   0.0  0.0  10364  2120 v1  Is+  10:34    0:00.00 /usr/libexec/getty Pc ttyv1
root    88645   0.0  0.0  10364  2120 v2  Is+  10:34    0:00.00 /usr/libexec/getty Pc ttyv2
root    88769   0.0  0.0  10364  2120 v3  Is+  10:34    0:00.00 /usr/libexec/getty Pc ttyv3
root    88815   0.0  0.0  10364  2120 v4  Is+  10:34    0:00.00 /usr/libexec/getty Pc ttyv4
root    88999   0.0  0.0  10364  2120 v5  Is+  10:34    0:00.00 /usr/libexec/getty Pc ttyv5
root    89192   0.0  0.0  10364  2120 v6  Is+  10:34    0:00.00 /usr/libexec/getty Pc ttyv6
root    89470   0.0  0.0  10364  2120 v7  Is+  10:34    0:00.00 /usr/libexec/getty Pc ttyv7
root    42960   0.0  0.0  13048  2760  0  Is   10:47    0:00.01 /bin/sh /etc/rc.initial
root    57901   0.0  0.0  21056  2684  0  R+   10:53    0:00.00 ps aux
root    60110   0.0  0.0  13336  3780  0  S    10:47    0:00.04 /bin/tcsh
[2.4.0-RC][admin@pfSense24.test.thomas-krenn.com]/root: packet_write_wait: Connection to 192.168.1.1 port 22: Broken pipe
wfischer@tpw:/home/wfischer-isos/pfsense$

Running "ifconfig igb0 down" followed by "ifconfig igb0 up", and then re-applying the DHCP client configuration on my laptop (using Network Manager in Ubuntu 16.04), fixed it, but I'm not sure how long it will keep working.

As these symptoms are not exactly the same as the ones I had with OPNsense before, these issues might be related to pfSense 2.4 RC. But I think that for this very system (JBC390F541AA-19-B) there are some networking issues with FreeBSD 11.0 which were not there with FreeBSD 10.3.

I'll keep you updated once I have any news.
Title: Re: WAN link gone sometimes (igb driver, I211 nics), ifconfig d/u fixes it
Post by: franco on August 30, 2017, 11:31:17 am
I also have a preliminary kernel for 11.1 (no HardenedBSD additions, no shared forwarding) if you want to try. There could be some fixes we are simply missing?
Title: Re: WAN link gone sometimes (igb driver, I211 nics), ifconfig d/u fixes it
Post by: wefinet on August 30, 2017, 11:34:17 am
This would be nice, and for sure worth a try!
Can you send me some details on how I can grab this kernel and how to apply it to OPNsense?
Title: Re: WAN link gone sometimes (igb driver, I211 nics), ifconfig d/u fixes it
Post by: franco on August 30, 2017, 12:43:08 pm
I've sent you the details via PM.


Thanks,
Franco
Title: Re: WAN link gone sometimes (igb driver, I211 nics), ifconfig d/u fixes it
Post by: wefinet on August 30, 2017, 01:03:38 pm
Thanks a lot for the details in the PM, I'll try this as soon as 17.7.1 is out and keep you updated once I have any new findings.
Title: Re: WAN link gone sometimes (igb driver, I211 nics), ifconfig d/u fixes it
Post by: wefinet on August 31, 2017, 10:20:59 am
I'm currently still running pfSense 2.4 RC on the system, and this morning I got exactly the same problem as with OPNsense 17.1/17.7: after some time of operation (about 1-3 minutes after boot) I ran into the problem. EEE was _not_ deactivated, but as I have seen the issue on OPNsense with EEE deactivated, too, I won't currently do any tests with EEE deactivated on pfSense 2.4 RC.

Attached you will find a full set of logs and command output (in case it helps us to analyze the root cause of the issue).

One thing I want to show you right here - as you can see an "ifconfig igb9 down" followed by an "ifconfig igb9 up" fixes the issue:
Code: [Select]
[2.4.0-RC][admin@pfSense24.test.thomas-krenn.com]/root: ping 10.1.102.1
PING 10.1.102.1 (10.1.102.1): 56 data bytes
ping: sendto: Host is down
ping: sendto: Host is down
ping: sendto: Host is down
ping: sendto: Host is down
^C
--- 10.1.102.1 ping statistics ---
4 packets transmitted, 0 packets received, 100.0% packet loss
[2.4.0-RC][admin@pfSense24.test.thomas-krenn.com]/root: arp -a
? (10.1.102.1) at (incomplete) on igb9 expired [ethernet]
? (10.1.102.55) at 00:30:18:cd:ec:63 on igb9 permanent [ethernet]
pfSense24.test.thomas-krenn.com (192.168.1.1) at 00:30:18:cd:e8:54 on igb0 permanent [ethernet]
? (192.168.1.100) at f0:de:f1:f3:17:88 on igb0 expires in 480 seconds [ethernet]
[2.4.0-RC][admin@pfSense24.test.thomas-krenn.com]/root: date
Thu Aug 31 09:55:44 CEST 2017
[2.4.0-RC][admin@pfSense24.test.thomas-krenn.com]/root: ifconfig igb9 down
[2.4.0-RC][admin@pfSense24.test.thomas-krenn.com]/root: date
Thu Aug 31 09:56:26 CEST 2017
[2.4.0-RC][admin@pfSense24.test.thomas-krenn.com]/root: ifconfig igb9 up
[2.4.0-RC][admin@pfSense24.test.thomas-krenn.com]/root: date
Thu Aug 31 09:56:34 CEST 2017
[2.4.0-RC][admin@pfSense24.test.thomas-krenn.com]/root: date
Thu Aug 31 09:56:41 CEST 2017
[2.4.0-RC][admin@pfSense24.test.thomas-krenn.com]/root: arp -a
? (10.1.102.1) at 4c:5e:0c:4b:23:30 on igb9 expires in 1196 seconds [ethernet]
? (10.1.102.55) at 00:30:18:cd:ec:63 on igb9 permanent [ethernet]
pfSense24.test.thomas-krenn.com (192.168.1.1) at 00:30:18:cd:e8:54 on igb0 permanent [ethernet]
? (192.168.1.100) at f0:de:f1:f3:17:88 on igb0 expires in 1199 seconds [ethernet]
[2.4.0-RC][admin@pfSense24.test.thomas-krenn.com]/root: date
Thu Aug 31 09:56:47 CEST 2017
[2.4.0-RC][admin@pfSense24.test.thomas-krenn.com]/root: ping 10.1.102.1
PING 10.1.102.1 (10.1.102.1): 56 data bytes
64 bytes from 10.1.102.1: icmp_seq=0 ttl=64 time=0.340 ms
64 bytes from 10.1.102.1: icmp_seq=1 ttl=64 time=0.246 ms
64 bytes from 10.1.102.1: icmp_seq=2 ttl=64 time=0.238 ms
64 bytes from 10.1.102.1: icmp_seq=3 ttl=64 time=0.225 ms
^C
--- 10.1.102.1 ping statistics ---
4 packets transmitted, 4 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 0.225/0.262/0.340/0.046 ms
[2.4.0-RC][admin@pfSense24.test.thomas-krenn.com]/root:
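
The manual down/up workaround shown above could be automated with a small watchdog run from cron. This is only a minimal sketch and not a fix for the root cause: the function names are hypothetical, the 5 second pause is an arbitrary assumption, and the gateway and interface in the usage line are the ones from this test setup.

```shell
#!/bin/sh
# Hypothetical watchdog sketch: bounce an interface when its gateway
# stops answering pings. All function names are made up for illustration.

# Return 0 if the gateway answers at least one of three pings.
gateway_alive() {
    ping -c 3 "$1" >/dev/null 2>&1
}

# Take the interface down and up again, as done manually above.
# The 5 second pause is an assumption, not a measured requirement.
bounce_iface() {
    ifconfig "$1" down
    sleep 5
    ifconfig "$1" up
}

# Check the gateway ($1) and bounce the interface ($2) only when the
# gateway is unreachable. Prints one log line when a bounce happens.
watchdog() {
    gw=$1
    ifc=$2
    if gateway_alive "$gw"; then
        return 0
    fi
    bounce_iface "$ifc"
    echo "$(date '+%Y-%m-%d %H:%M:%S') - $ifc restarted, gateway $gw was unreachable"
}
```

With the functions saved in a script, a call such as "watchdog 10.1.102.1 igb9 >> /var/log/wan_watchdog.log" (log path chosen only as an example) once per minute would record how often the link has to be reset.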

Before I continue testing with the upcoming 17.7.1 and the preliminary kernel, I think I'll test with pfSense 2.3 (which is based on FreeBSD 10.3) and watch whether it really works rock-solid (to have more evidence that FreeBSD 11.0 is causing the issue, while FreeBSD 10.3 has no issues).

If pfSense 2.3 indeed runs as solidly as expected, I'll grab all the output for analysis (especially "sysctl -a") and compare it to the outputs that I have attached to this post. Maybe we will find some differing settings that could be the reason for this problem.
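
A comparison of two saved "sysctl -a" dumps could look roughly like the following. This is a minimal sketch; the function name and the file names in the usage line are hypothetical, and it assumes both dumps were saved as plain text with one entry per line.

```shell
# Print the sysctl entries that differ between two saved "sysctl -a" dumps.
# Lines only in the first file start with "<", lines only in the second with ">".
sysctl_diff() {
    old_sorted=$(mktemp "${TMPDIR:-/tmp}/sysctl_old.XXXXXX") || return 1
    new_sorted=$(mktemp "${TMPDIR:-/tmp}/sysctl_new.XXXXXX") || return 1
    sort "$1" > "$old_sorted"
    sort "$2" > "$new_sorted"
    # diff exits non-zero when the files differ and grep exits non-zero
    # when nothing matches; neither case is an error here.
    diff "$old_sorted" "$new_sorted" | grep '^[<>]' || true
    rm -f "$old_sorted" "$new_sorted"
}
```

For example: "sysctl_diff sysctl-pfsense-2.3.4.txt sysctl-pfsense-2.4.0-rc.txt | grep igb". Counters and statistics change constantly, so filtering for the driver or for hw.pci entries keeps the result readable.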

I will keep you updated  ;)
Title: Re: WAN link gone sometimes (igb driver, I211 nics), ifconfig d/u fixes it
Post by: wefinet on September 18, 2017, 04:24:18 pm
Since my last posting on August 31st, I've been running pfSense 2.3.4 on the system without any issues. So I'm rather sure that my problem has to do with FreeBSD 11.0 vs. FreeBSD 10.3. I have attached a ZIP with the logs of pfSense 2.3.4.

I have searched for differences and I have found:

Here is the code for the mentioned dmesg part:
Code: [Select]
pfSense 2.3.4 dmesg:
  ...
  pcib19: <PCI-PCI bridge> irq 19 at device 4.0 on pci15
  pci19: <PCI bus> on pcib19
  igb9: <Intel(R) PRO/1000 Network Connection, Version - 2.5.3-k> port 0x5000-0x501f mem 0xd0600000-0xd061ffff,0xd0620000-0xd0623fff irq 19 at device 0.0 on pci19
  ...

pfSense 2.4.0-RC dmesg:
  ...
  pcib19: <PCI-PCI bridge> irq 19 at device 4.0 on pci12
  pcib19: [GIANT-LOCKED]
  pci16: <PCI bus> on pcib19
  igb9: <Intel(R) PRO/1000 Network Connection, Version - 2.5.3-k> port 0x5000-0x501f mem 0xd0600000-0xd061ffff,0xd0620000-0xd0623fff irq 19 at device 0.0 on pci16
  ...

Tomorrow, I will switch back to OPNsense and I will install the FreeBSD 11.1 kernel. I'll keep you updated.
Title: Re: WAN link gone sometimes (igb driver, I211 nics), ifconfig d/u fixes it
Post by: wefinet on September 22, 2017, 01:05:47 pm
I have now installed a FreeBSD 11.1 kernel (I got the instructions for that from Franco via PM):

Code: [Select]
root@OPNsense:~ # freebsd-version -k
11.1-RELEASE-p1
root@OPNsense:~ # freebsd-version -u
11.0-RELEASE-p12
root@OPNsense:~ #

With FreeBSD 11.1, too, the NICs show "GIANT-LOCKED" in the dmesg output, as they do with FreeBSD 11.0, but not with pfSense 2.3/FreeBSD 10.3 (which does not have the issue):
Code: [Select]
...
pcib18: <PCI-PCI bridge> irq 18 at device 3.0 on pci12
pcib18: [GIANT-LOCKED]
pci15: <PCI bus> on pcib18
igb8: <Intel(R) PRO/1000 Network Connection, Version - 2.5.3-k> port 0x6000-0x601f mem 0xd0700000-0xd071ffff,0xd0720000-0xd0723fff irq 18 at device 0.0 on pci15
igb8: Using MSIX interrupts with 3 vectors
igb8: Ethernet address: 00:30:18:cd:ec:62
igb8: Bound queue 0 to cpu 0
igb8: Bound queue 1 to cpu 1
igb8: netmap queues/slots: TX 2/1024, RX 2/1024
pcib19: <PCI-PCI bridge> irq 19 at device 4.0 on pci12
pcib19: [GIANT-LOCKED]
pci16: <PCI bus> on pcib19
igb9: <Intel(R) PRO/1000 Network Connection, Version - 2.5.3-k> port 0x5000-0x501f mem 0xd0600000-0xd061ffff,0xd0620000-0xd0623fff irq 19 at device 0.0 on pci16
igb9: Using MSIX interrupts with 3 vectors
igb9: Ethernet address: 00:30:18:cd:ec:63
igb9: Bound queue 0 to cpu 2
igb9: Bound queue 1 to cpu 3
igb9: netmap queues/slots: TX 2/1024, RX 2/1024
...
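
To compare boots or systems quickly, the devices flagged this way can be listed from the dmesg text. This is a minimal sketch with a hypothetical function name; it assumes the flag is printed on its own line in the form "device: [GIANT-LOCKED]", as in the excerpts above, where it appears on the pcib bridge lines.

```shell
# Read dmesg text on stdin and print the name of every device
# that is flagged as [GIANT-LOCKED].
giant_locked_devices() {
    sed -n 's/^[[:space:]]*\([[:alnum:]_]*\): \[GIANT-LOCKED\][[:space:]]*$/\1/p'
}
```

Usage: "dmesg | giant_locked_devices". Running it on FreeBSD 10.3 and on FreeBSD 11.x and comparing the two lists shows at a glance which devices gained the flag.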

But anyway, I will stay with this setup for the next few days and watch whether the problem shows up again or not. I'll keep you updated.

Title: Re: WAN link gone sometimes (igb driver, I211 nics), ifconfig d/u fixes it
Post by: franco on September 22, 2017, 07:51:39 pm
Fingers crossed for 11.1. :)
Title: Re: WAN link gone sometimes (igb driver, I211 nics), ifconfig d/u fixes it
Post by: grolon on September 25, 2017, 04:24:37 pm
Hi all, I have the same issue.

My test server is a Lenovo 3000 J Series
Pentium 4 HT 3.2 GHz, 1GB RAM, IDE HDD
OPNsense 17.7.3 updated (2017-SEP-25)

Configured my WAN and LAN, and after a couple of minutes WAN is DOWN.
WAN and LAN are both Realtek 8169 PCI GBE Family Controllers cards

Installed Ubuntu Server 16.04 to test my hardware and everything is OK.

Just OPNsense is having this issue,


thanks
Title: Re: WAN link gone sometimes (igb driver, I211 nics), ifconfig d/u fixes it
Post by: grolon on September 25, 2017, 05:41:33 pm
I did this

root@opnsense:~ # cat /boot/loader.conf.local
dev.re.0.eee_disabled=1
dev.re.1.eee_disabled=1
hw.pci.do_power_suspend=0
dev.re.0.fe=0
dev.re.1.fe=0

root@opnsense:~ # cat /etc/sysctl.conf
# $FreeBSD$
#
#  This file is read when going to multi-user and its contents piped thru
#  ``sysctl'' to adjust kernel values.  ``man 5 sysctl.conf'' for details.
#

# Uncomment this to prevent users from seeing information about processes that
# are being run under another UID.
#security.bsd.see_other_uids=0
dev.re.0.eee_disabled=1
dev.re.1.eee_disabled=1
hw.pci.do_power_suspend=0
dev.re.0.fe=0
dev.re.1.fe=0
root@opnsense:~ #

WAN is down after approx. 20 minutes.
Title: Re: WAN link gone sometimes (igb driver, I211 nics), ifconfig d/u fixes it
Post by: franco on September 25, 2017, 10:28:22 pm
Hi grolon,

This topic is about the Intel (igb) driver, not Realtek (re). You won't be able to do a lot with the workarounds described here with a different driver. People sporadically post here to say they have issues with Realtek chipsets and, unfortunately, the state in BSD is not as good as it could be, which ends up in solving the issue by migrating to better network cards.


Cheers,
Franco
Title: Re: WAN link gone sometimes (igb driver, I211 nics), ifconfig d/u fixes it
Post by: grolon on September 26, 2017, 08:08:39 pm
Hi and thanks,

Too bad,

I was planning to move from Zentyal 4/5 (Ubuntu Server 14) to pfSense or OPNsense, my hardware is 100% OK, and Realtek NICs are very popular over here.

Thanks anyway folks,
Title: Re: WAN link gone sometimes (igb driver, I211 nics), ifconfig d/u fixes it
Post by: franco on September 27, 2017, 06:38:28 am
OT: We have had the official Realtek driver since 17.1, in contrast to FreeBSD and pfSense. Overall, it doesn't get much better than this, but that is still subpar compared to Linux.


Cheers,
Franco
Title: Re: WAN link gone sometimes (igb driver, I211 nics), ifconfig d/u fixes it
Post by: dcol on November 22, 2017, 12:46:35 am
I have done a lot of testing with different configurations and it seems that IPS needs a higher end processor to function without errors. The lower end Atom processors do not cut it and show high RTT on a gateway when IPS is using the WAN/Gateway interface. Just my observation. There could be other factors involved.

I would like to hear from other users as to what hardware is working with IPS enabled on the WAN Interface. I really want to pin down if this is a performance issue or not.
Title: Re: WAN link gone sometimes (igb driver, I211 nics), ifconfig d/u fixes it
Post by: wefinet on November 28, 2017, 03:39:32 pm
Regarding the igb issues: I have no solution yet, but got some hints from different people - I just summarize them here (I have not tested them yet):


I will do further testing within the next days and keep you updated.
Title: Re: WAN link gone sometimes (igb driver, I211 nics), ifconfig d/u fixes it
Post by: wefinet on November 29, 2017, 02:00:03 pm
After some further research, I'll try these settings for the upcoming days:

Code: [Select]
root@OPNsense:~ # sysctl hw.igb.num_queues
hw.igb.num_queues: 0
root@OPNsense:~ # sysctl hw.pci.enable_msix
hw.pci.enable_msix: 1
root@OPNsense:~ # sysctl hw.igb.enable_msix
hw.igb.enable_msix: 1
root@OPNsense:~ # echo "Settings for next boots, try to fix nic issues:"
Settings for next boots, try to fix nic issues:
root@OPNsense:~ # cat /boot/loader.conf.local
hw.igb.num_queues=1
hw.pci.enable_msix=0
hw.igb.enable_msix=0

As the issues arise only from time to time, it could again take some days until I can say whether or not these settings could help. I'll keep you updated.
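
After the reboot it is worth confirming that the tunables were really picked up. The following is a minimal sketch with a hypothetical function name; it assumes every setting in the file is also readable through sysctl under the same name, which is the case for the three tunables above.

```shell
# Compare each name=value line of a loader tunables file with the live
# sysctl value. Prints one "OK ..." or "MISMATCH ..." line per setting.
check_tunables() {
    while IFS='=' read -r name wanted; do
        # Skip blank lines and comments.
        case "$name" in
            ''|'#'*) continue ;;
        esac
        # Loader files may quote values; strip surrounding double quotes.
        wanted=${wanted#\"}
        wanted=${wanted%\"}
        running=$(sysctl -n "$name" 2>/dev/null) || running="(unavailable)"
        if [ "$running" = "$wanted" ]; then
            echo "OK $name=$wanted"
        else
            echo "MISMATCH $name: wanted $wanted, running $running"
        fi
    done < "$1"
}
```

Usage: "check_tunables /boot/loader.conf.local". A MISMATCH line after a reboot would mean that the setting was not applied, so a test run with it says nothing about whether the setting helps.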
Title: Re: WAN link gone sometimes (igb driver, I211 nics), ifconfig d/u fixes it
Post by: dcol on December 01, 2017, 06:27:32 pm
I am also doing research based on different hardware configurations. My tests all have to do with how OPNsense handles IPS and netmap. I used Lan Speed Test Client and Server as my endpoints. With and without internet connectivity.

I discovered one key fact. Noise is a major contributor in my tests. I will post details later on, but the short of it is, if I place a cheap router between the ISP and the OPNsense box running in IPS mode, everything seems to operate flawlessly because the background chatter is gone and netmap is not stressed. I can see this in the traffic graphs. The cheap router is set to only pass traffic for the WAN IP. It is amazing how much other traffic is on the ISP modem port. This is why some people don't see any issues. It all depends on the traffic coming from the ISP and the type of interface. I use a /28 block of static IPs.

Here are the results, so far, of the WAN traffic usage.
On my production pfSense box, WAN traffic averages 2-4 Mb/s. Only about 100k/sec is legitimate traffic. On the OPNsense box, behind the small router, the usage averages under 500 b/s. Huge difference.

I set up this small router to route and not bridge. This allows only legitimate traffic to pass on to the OPNsense box. Then OPNsense can do its thing to properly firewall only legitimate traffic. But I am not so sure that high bandwidth operations would work well with a cheap front end router.

wefinet, try to set up a similar configuration just using a small throwaway router on the front end and see if this ends the problems. It did for me.

Here is how I set up the small router.
Turned off any firewall settings.
Put the static IP info into the WAN settings and set up the LAN for 192.168.0.1/24.
Then used the router's IP (192.168.0.1) as the gateway in the OPNsense box and 192.168.0.100 as the WAN address. Then set up the LAN on OPNsense to any other subnet. Simple setup.
I suppose if the router supports DMZ, that would be another way to do it.
The other advantage is you can tap into the small router to get unfiltered LAN traffic for streaming or large downloads to reduce traffic to the OPNsense box. This would be nice for media streaming devices that do not need to be behind a firewall.

Let's hear some thoughts on this approach.
Title: Re: WAN link gone sometimes (igb driver, I211 nics), ifconfig d/u fixes it
Post by: dcol on December 04, 2017, 06:27:16 pm
Ran tests on different systems. All had an SSD drive, 16GB memory, and used a Quad NIC Intel i340-T4.
Fresh install of OPNsense using the same configuration on all systems.
Promiscuous mode made no difference. All readings in Mbps. Isolated the internet with an external router. Used one Workstation outside the router and one inside. Used Lan Speed Test and LST Server on both workstations.
Each system was tested with IPS on and off.

The conclusion is that IPS did slow down the firewall on the slower devices. But the i7-7700 had unexpected results, being much slower than expected. The weakest processor, C2758, had the poorest IPS results and the best IPS off results even though it has 8 cores, but the slowest bus speeds @ 2.4 GHz.

A higher number of cores improved non-IPS bandwidth, and higher bus speeds improved IPS performance. So I would conclude that one should choose at least 4 cores and the processor with the highest bus speed available.
Title: Re: WAN link gone sometimes (igb driver, I211 nics), ifconfig d/u fixes it
Post by: wefinet on December 05, 2017, 10:32:52 am
Thank you dcol for your hints. According to your reports you see different speeds. I think this is another separate topic, as for my situation the interface is somehow completely down.

In my current tests I'm running OPNsense 17.7.8-amd64 using FreeBSD 11.0-RELEASE-p15. Unfortunately, limiting hw.igb.num_queues to 1 did not help. About 5 minutes after boot, I got the issues again. Only an ifconfig down/up helped:

Code: [Select]
root@OPNsense:~ # ping 10.1.102.1
PING 10.1.102.1 (10.1.102.1): 56 data bytes
ping: sendto: Host is down
ping: sendto: Host is down
ping: sendto: Host is down
ping: sendto: Host is down
^C
--- 10.1.102.1 ping statistics ---
4 packets transmitted, 0 packets received, 100.0% packet loss
root@OPNsense:~ # arp -a && date
? (10.1.102.1) at (incomplete) on igb9 expired [ethernet]
? (10.1.102.55) at 00:30:18:cd:ec:63 on igb9 permanent [ethernet]
? (192.168.1.1) at 00:30:18:cd:e8:54 on igb0 permanent [ethernet]
? (192.168.1.100) at f0:de:f1:f3:17:88 on igb0 expires in 1151 seconds [ethernet]
Tue Dec  5 09:25:35 UTC 2017
root@OPNsense:~ # ifconfig igb9 down && date
Tue Dec  5 09:26:08 UTC 2017
root@OPNsense:~ # ifconfig igb9 up && date
Tue Dec  5 09:26:18 UTC 2017
root@OPNsense:~ # arp -a && date
OPNsense.test.thomas-krenn.com (192.168.1.1) at 00:30:18:cd:e8:54 on igb0 permanent [ethernet]
? (192.168.1.100) at f0:de:f1:f3:17:88 on igb0 expires in 1168 seconds [ethernet]
Tue Dec  5 09:26:22 UTC 2017
root@OPNsense:~ # ping 10.1.102.1
PING 10.1.102.1 (10.1.102.1): 56 data bytes
64 bytes from 10.1.102.1: icmp_seq=0 ttl=64 time=0.467 ms
64 bytes from 10.1.102.1: icmp_seq=1 ttl=64 time=0.358 ms
64 bytes from 10.1.102.1: icmp_seq=2 ttl=64 time=0.276 ms
64 bytes from 10.1.102.1: icmp_seq=3 ttl=64 time=0.384 ms
^C
--- 10.1.102.1 ping statistics ---
4 packets transmitted, 4 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 0.276/0.371/0.467/0.068 ms
root@OPNsense:~ # arp -a && date
? (10.1.102.1) at 4c:5e:0c:4b:23:30 on igb9 expires in 1198 seconds [ethernet]
OPNsense.test.thomas-krenn.com (10.1.102.55) at 00:30:18:cd:ec:63 on igb9 permanent [ethernet]
OPNsense.test.thomas-krenn.com (192.168.1.1) at 00:30:18:cd:e8:54 on igb0 permanent [ethernet]
? (192.168.1.100) at f0:de:f1:f3:17:88 on igb0 expires in 1193 seconds [ethernet]
Tue Dec  5 09:26:34 UTC 2017
root@OPNsense:~ # freebsd-version -ku
11.0-RELEASE-p15
11.0-RELEASE-p15
root@OPNsense:~ # sysctl hw.igb.num_queues
hw.igb.num_queues: 1
root@OPNsense:~ # sysctl hw.pci.enable_msix
hw.pci.enable_msix: 0
root@OPNsense:~ # sysctl hw.igb.enable_msix
hw.igb.enable_msix: 0
root@OPNsense:~ #
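
In the session above, the failure shows up as an "(incomplete)" ARP entry for the gateway. A check for that state could be scripted as follows. This is a minimal sketch with a hypothetical function name; it assumes the FreeBSD "arp -a" line layout shown above and only detects this one symptom.

```shell
# Read "arp -a" output on stdin and print "address interface" for each
# entry whose MAC address is still unresolved.
incomplete_arp() {
    awk '$4 == "(incomplete)" { gsub(/[()]/, "", $2); print $2, $6 }'
}
```

Usage: "arp -a | incomplete_arp". An empty result means that no entry is currently unresolved.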

Next, I'll switch to OPNsense 18.1 Beta as described by Franco here: https://forum.opnsense.org/index.php?topic=6257.0

I'll keep you updated.
Title: Re: WAN link gone sometimes (igb driver, I211 nics), ifconfig d/u fixes it
Post by: wefinet on December 05, 2017, 12:28:20 pm
I have kept hw.igb.num_queues limited to 1 and both hw.pci.enable_msix and hw.igb.enable_msix set to 0, and have updated to OPNsense 18.1 Beta (using FreeBSD 11.1).

It did not help. As soon as I started testing again, the problem occurred. Starting a speed test on fast.com on a client led immediately to the problem. Only running "ifconfig igb9 down" and "ifconfig igb9 up" again helped:

Code: [Select]
root@OPNsense:~ # opnsense-update -bkgr 18.1.b -n "snapshots\/beta"
Fetching base-18.1.b-amd64.obsolete: ... done
Fetching base-18.1.b-amd64.txz: .........................................^C
root@OPNsense:~ # ifconfig igb9 down
root@OPNsense:~ # ifconfig igb9 up
root@OPNsense:~ # opnsense-update -bkgr 18.1.b -n "snapshots\/beta"
Fetching base-18.1.b-amd64.obsolete: ... done
Fetching base-18.1.b-amd64.txz: .......................... done
Fetching kernel-dbg-18.1.b-amd64.txz: ................................ done
!!!!!!!!!!!! ATTENTION !!!!!!!!!!!!!!!
! A critical upgrade is in progress. !
! Please do not turn off the system. !
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
Installing kernel-dbg-18.1.b-amd64.txz... done
Installing base-18.1.b-amd64.txz... done
Installing base-18.1.b-amd64.obsolete... done
Please reboot.
root@OPNsense:~ #
root@OPNsense:~ # /usr/local/etc/rc.reboot
>>> Invoking stop script 'beep'
>>> Invoking stop script 'freebsd'
>>> Invoking stop script 'backup'
Cannot 'stop' flowd_aggregate. Set flowd_aggregate_enable to YES in /etc/rc.conf or use 'onestop' instead of 'stop'.
Shutdown NOW!
shutdown: [pid 63573]
root@OPNsense:~ #                                                                               
*** FINAL System shutdown message from root@OPNsense.test.thomas-krenn.com ***
                                                                             

System going down IMMEDIATELY                                                 

                                                                               

System shutdown time has arrived
Connection to 192.168.1.1 closed by remote host.
Connection to 192.168.1.1 closed.
wfischer@tpw:~$ ssh root@192.168.1.1
Password for root@OPNsense.test.thomas-krenn.com:
Last login: Tue Dec  5 09:24:52 2017 from 192.168.1.100
----------------------------------------------
|      Hello, this is OPNsense 17.7          |         @@@@@@@@@@@@@@@
|                                            |        @@@@         @@@@
| Website: https://opnsense.org/        |         @@@\\\   ///@@@
| Handbook: https://docs.opnsense.org/   |       ))))))))   ((((((((
| Forums: https://forum.opnsense.org/  |         @@@///   \\\@@@
| Lists: https://lists.opnsense.org/  |        @@@@         @@@@
| Code: https://github.com/opnsense  |         @@@@@@@@@@@@@@@
----------------------------------------------

  0) Logout                              7) Ping host
  1) Assign interfaces                   8) Shell
  2) Set interface IP address            9) pfTop
  3) Reset the root password            10) Firewall log
  4) Reset to factory defaults          11) Reload all services
  5) Power off system                   12) Upgrade from console
  6) Reboot system                      13) Restore a backup

Enter an option: 8

root@OPNsense:~ # freebsd-version -ku
11.1-RELEASE-p2
11.1-RELEASE-p2
root@OPNsense:~ # ping 10.1.102.1
PING 10.1.102.1 (10.1.102.1): 56 data bytes
^C
--- 10.1.102.1 ping statistics ---
8 packets transmitted, 0 packets received, 100.0% packet loss
root@OPNsense:~ # arp -a && date
? (10.1.102.1) at 4c:5e:0c:4b:23:30 on igb9 expires in 1168 seconds [ethernet]
OPNsense.test.thomas-krenn.com (10.1.102.55) at 00:30:18:cd:ec:63 on igb9 permanent [ethernet]
OPNsense.test.thomas-krenn.com (192.168.1.1) at 00:30:18:cd:e8:54 on igb0 permanent [ethernet]
? (192.168.1.100) at f0:de:f1:f3:17:88 on igb0 expires in 1152 seconds [ethernet]
Tue Dec  5 10:23:35 UTC 2017
root@OPNsense:~ #
root@OPNsense:~ #
root@OPNsense:~ # clog /var/log/system.log
[...]
Dec  5 10:22:13 OPNsense kernel: aesni0: No AESNI support.
Dec  5 10:22:13 OPNsense kernel:
Dec  5 10:22:13 OPNsense kernel: igb9: link state changed to DOWN
Dec  5 10:22:13 OPNsense sshlockout[15498]: sshlockout/webConfigurator v3.0 starting up
Dec  5 10:22:13 OPNsense configd.py: [499c4346-ad71-4e2b-9e64-ffce20ce3d3c] Linkup stopping igb9
Dec  5 10:22:18 OPNsense kernel: igb9: link state changed to UP
Dec  5 10:22:18 OPNsense configd.py: [6df83b34-21a5-4d42-a53b-f492e8b7193b] Linkup starting igb9
Dec  5 10:22:18 OPNsense opnsense: /usr/local/etc/rc.bootup: Accept router advertisements on interface igb9
Dec  5 10:22:18 OPNsense kernel: igb0: link state changed to DOWN
Dec  5 10:22:18 OPNsense configd.py: [c7a56dbc-d156-4b0a-9022-97f35d436b47] Linkup stopping igb0
Dec  5 10:22:19 OPNsense kernel: pflog0: promiscuous mode enabled
Dec  5 10:22:19 OPNsense kernel: .done.
Dec  5 10:22:19 OPNsense sshd[40530]: Server listening on :: port 22.
Dec  5 10:22:19 OPNsense sshd[40530]: Server listening on 0.0.0.0 port 22.
Dec  5 10:22:19 OPNsense configd.py: [2bb1592c-4e7e-4285-a882-2a110317d983] generate template OPNsense/WebGui
Dec  5 10:22:19 OPNsense kernel: done.
Dec  5 10:22:19 OPNsense configd.py: generate template container OPNsense/WebGui
Dec  5 10:22:19 OPNsense lighttpd[41414]: (log.c.217) server started
Dec  5 10:22:20 OPNsense opnsense: /usr/local/etc/rc.bootup: ROUTING: setting IPv4 default route to 10.1.102.1
Dec  5 10:22:20 OPNsense kernel: done.
Dec  5 10:22:20 OPNsense kernel: done.
Dec  5 10:22:20 OPNsense kernel: done.
Dec  5 10:22:21 OPNsense kernel: done.
Dec  5 10:22:21 OPNsense kernel: done.
Dec  5 10:22:21 OPNsense configd.py: [4ac9229b-5738-4acf-8e67-ba3af24f9232] generate template *
Dec  5 10:22:21 OPNsense kernel: ....done.
Dec  5 10:22:22 OPNsense configd.py: generate template container OPNsense/Auth
Dec  5 10:22:22 OPNsense configd.py: generate template container OPNsense/Captiveportal
Dec  5 10:22:22 OPNsense configd.py: generate template container OPNsense/Cron
Dec  5 10:22:22 OPNsense configd.py: generate template container OPNsense/IDS
Dec  5 10:22:23 OPNsense configd.py: generate template container OPNsense/IPFW
Dec  5 10:22:23 OPNsense kernel: igb0: link state changed to UP
Dec  5 10:22:23 OPNsense configd.py: [2a3a9380-edb6-4a8a-9940-b38c2068244a] Linkup starting igb0
Dec  5 10:22:23 OPNsense configd.py: generate template container OPNsense/Macros
Dec  5 10:22:23 OPNsense configd.py: generate template container OPNsense/Netflow
Dec  5 10:22:24 OPNsense configd.py: generate template container OPNsense/Proxy
Dec  5 10:22:25 OPNsense configd.py: generate template container OPNsense/Sample
Dec  5 10:22:25 OPNsense configd.py: generate template container OPNsense/Sample/sub1
Dec  5 10:22:25 OPNsense configd.py: generate template container OPNsense/Sample/sub2
Dec  5 10:22:25 OPNsense configd.py: generate template container OPNsense/Syslog
Dec  5 10:22:25 OPNsense configd.py: generate template container OPNsense/WebGui
Dec  5 10:22:27 OPNsense configd.py: [58e260da-2e89-4290-a9a1-e985c024ff15] generate template OPNsense/Syslog
Dec  5 10:22:27 OPNsense kernel: done.
Dec  5 10:22:28 OPNsense configd.py: generate template container OPNsense/Syslog
Dec  5 10:22:28 OPNsense kernel: done.
Dec  5 10:22:31 OPNsense configd.py: [831530b7-a519-4d60-b14e-2d35f351ad66] restarting cron
Dec  5 10:22:31 OPNsense sshlockout[15018]: sshlockout/webConfigurator v3.0 starting up
Dec  5 10:22:31 OPNsense kernel: OK
Dec  5 10:22:33 OPNsense kernel:
Dec  5 10:22:54 OPNsense sshd[27160]: Postponed keyboard-interactive for root from 192.168.1.100 port 52728 ssh2 [preauth]
Dec  5 10:22:57 OPNsense opnsense: user 'root' authenticated successfully
Dec  5 10:22:57 OPNsense sshd[27160]: Postponed keyboard-interactive/pam for root from 192.168.1.100 port 52728 ssh2 [preauth]
Dec  5 10:22:57 OPNsense sshd[27160]: Accepted keyboard-interactive/pam for root from 192.168.1.100 port 52728 ssh2
root@OPNsense:~ # ping 10.1.102.1
PING 10.1.102.1 (10.1.102.1): 56 data bytes
^C
--- 10.1.102.1 ping statistics ---
3 packets transmitted, 0 packets received, 100.0% packet loss
root@OPNsense:~ # sysctl hw.igb.num_queues
hw.igb.num_queues: 1
root@OPNsense:~ # sysctl hw.pci.enable_msix
hw.pci.enable_msix: 0
root@OPNsense:~ # sysctl hw.igb.enable_msix
hw.igb.enable_msix: 0
root@OPNsense:~ # cat /boot/loader.conf.local
hw.igb.num_queues=1
hw.pci.enable_msix=0
hw.igb.enable_msix=0
root@OPNsense:~ # rm /boot/loader.conf.local
root@OPNsense:~ # ping 10.1.102.1
PING 10.1.102.1 (10.1.102.1): 56 data bytes
^C
--- 10.1.102.1 ping statistics ---
2 packets transmitted, 0 packets received, 100.0% packet loss
root@OPNsense:~ # ifconfig down igb9
ifconfig: interface down does not exist
root@OPNsense:~ # ifconfig igb9 down
root@OPNsense:~ # ifconfig igb9 up
root@OPNsense:~ # ping 10.1.102.1
PING 10.1.102.1 (10.1.102.1): 56 data bytes
ping: sendto: No route to host
ping: sendto: No route to host
64 bytes from 10.1.102.1: icmp_seq=3 ttl=64 time=0.354 ms
64 bytes from 10.1.102.1: icmp_seq=4 ttl=64 time=0.279 ms
64 bytes from 10.1.102.1: icmp_seq=5 ttl=64 time=26.825 ms
64 bytes from 10.1.102.1: icmp_seq=6 ttl=64 time=16.797 ms
^C
--- 10.1.102.1 ping statistics ---
7 packets transmitted, 4 packets received, 42.9% packet loss
round-trip min/avg/max/stddev = 0.279/11.064/26.825/11.317 ms
root@OPNsense:~ # ping 10.1.102.1
PING 10.1.102.1 (10.1.102.1): 56 data bytes
^C
--- 10.1.102.1 ping statistics ---
3 packets transmitted, 0 packets received, 100.0% packet loss
root@OPNsense:~ # ping 10.1.102.1
PING 10.1.102.1 (10.1.102.1): 56 data bytes
^C
--- 10.1.102.1 ping statistics ---
3 packets transmitted, 0 packets received, 100.0% packet loss
root@OPNsense:~ # ifconfig igb9 down
root@OPNsense:~ # ifconfig igb9 up
root@OPNsense:~ # ping 10.1.102.1
PING 10.1.102.1 (10.1.102.1): 56 data bytes
ping: sendto: No route to host
ping: sendto: No route to host
ping: sendto: No route to host
ping: sendto: No route to host
64 bytes from 10.1.102.1: icmp_seq=4 ttl=64 time=0.333 ms
64 bytes from 10.1.102.1: icmp_seq=5 ttl=64 time=0.263 ms
64 bytes from 10.1.102.1: icmp_seq=6 ttl=64 time=0.342 ms
64 bytes from 10.1.102.1: icmp_seq=7 ttl=64 time=0.284 ms
64 bytes from 10.1.102.1: icmp_seq=8 ttl=64 time=0.285 ms
64 bytes from 10.1.102.1: icmp_seq=9 ttl=64 time=0.299 ms
64 bytes from 10.1.102.1: icmp_seq=10 ttl=64 time=0.314 ms
64 bytes from 10.1.102.1: icmp_seq=11 ttl=64 time=0.364 ms
^C
--- 10.1.102.1 ping statistics ---
26 packets transmitted, 8 packets received, 69.2% packet loss
round-trip min/avg/max/stddev = 0.263/0.310/0.364/0.032 ms
root@OPNsense:~ # ping 10.1.102.1
PING 10.1.102.1 (10.1.102.1): 56 data bytes
^C
--- 10.1.102.1 ping statistics ---
2 packets transmitted, 0 packets received, 100.0% packet loss
root@OPNsense:~ # ifconfig igb9 down
root@OPNsense:~ # ifconfig igb9 up
root@OPNsense:~ # ping 10.1.102.1
PING 10.1.102.1 (10.1.102.1): 56 data bytes
ping: sendto: No route to host
ping: sendto: No route to host
ping: sendto: No route to host
64 bytes from 10.1.102.1: icmp_seq=3 ttl=64 time=0.415 ms
64 bytes from 10.1.102.1: icmp_seq=4 ttl=64 time=0.258 ms
64 bytes from 10.1.102.1: icmp_seq=5 ttl=64 time=0.258 ms
^C
--- 10.1.102.1 ping statistics ---
6 packets transmitted, 3 packets received, 50.0% packet loss
round-trip min/avg/max/stddev = 0.258/0.310/0.415/0.074 ms
root@OPNsense:~ # ping 10.1.102.1
PING 10.1.102.1 (10.1.102.1): 56 data bytes
^C
--- 10.1.102.1 ping statistics ---
2 packets transmitted, 0 packets received, 100.0% packet loss
root@OPNsense:~ # ifconfig igb9 down
root@OPNsense:~ # ifconfig igb9 up
root@OPNsense:~ # ping 10.1.102.1
PING 10.1.102.1 (10.1.102.1): 56 data bytes
ping: sendto: No route to host
ping: sendto: No route to host
ping: sendto: No route to host
64 bytes from 10.1.102.1: icmp_seq=3 ttl=64 time=0.443 ms
64 bytes from 10.1.102.1: icmp_seq=4 ttl=64 time=0.255 ms
64 bytes from 10.1.102.1: icmp_seq=5 ttl=64 time=0.341 ms
64 bytes from 10.1.102.1: icmp_seq=6 ttl=64 time=0.290 ms
64 bytes from 10.1.102.1: icmp_seq=7 ttl=64 time=0.288 ms
64 bytes from 10.1.102.1: icmp_seq=8 ttl=64 time=0.350 ms
64 bytes from 10.1.102.1: icmp_seq=9 ttl=64 time=0.318 ms
64 bytes from 10.1.102.1: icmp_seq=10 ttl=64 time=0.376 ms
64 bytes from 10.1.102.1: icmp_seq=11 ttl=64 time=0.301 ms
64 bytes from 10.1.102.1: icmp_seq=12 ttl=64 time=0.324 ms
64 bytes from 10.1.102.1: icmp_seq=13 ttl=64 time=0.287 ms
64 bytes from 10.1.102.1: icmp_seq=14 ttl=64 time=0.285 ms
64 bytes from 10.1.102.1: icmp_seq=15 ttl=64 time=0.279 ms
64 bytes from 10.1.102.1: icmp_seq=16 ttl=64 time=0.326 ms
64 bytes from 10.1.102.1: icmp_seq=17 ttl=64 time=0.267 ms
64 bytes from 10.1.102.1: icmp_seq=18 ttl=64 time=0.474 ms
64 bytes from 10.1.102.1: icmp_seq=19 ttl=64 time=0.264 ms
64 bytes from 10.1.102.1: icmp_seq=20 ttl=64 time=0.234 ms
64 bytes from 10.1.102.1: icmp_seq=21 ttl=64 time=0.339 ms
64 bytes from 10.1.102.1: icmp_seq=22 ttl=64 time=0.369 ms
64 bytes from 10.1.102.1: icmp_seq=23 ttl=64 time=0.476 ms
64 bytes from 10.1.102.1: icmp_seq=24 ttl=64 time=0.293 ms
64 bytes from 10.1.102.1: icmp_seq=25 ttl=64 time=0.413 ms
64 bytes from 10.1.102.1: icmp_seq=26 ttl=64 time=0.429 ms
64 bytes from 10.1.102.1: icmp_seq=27 ttl=64 time=0.345 ms
64 bytes from 10.1.102.1: icmp_seq=28 ttl=64 time=0.411 ms
64 bytes from 10.1.102.1: icmp_seq=29 ttl=64 time=0.292 ms
64 bytes from 10.1.102.1: icmp_seq=30 ttl=64 time=0.268 ms
64 bytes from 10.1.102.1: icmp_seq=31 ttl=64 time=0.237 ms
64 bytes from 10.1.102.1: icmp_seq=32 ttl=64 time=0.281 ms
64 bytes from 10.1.102.1: icmp_seq=33 ttl=64 time=0.385 ms
64 bytes from 10.1.102.1: icmp_seq=34 ttl=64 time=0.371 ms
64 bytes from 10.1.102.1: icmp_seq=35 ttl=64 time=0.332 ms
64 bytes from 10.1.102.1: icmp_seq=36 ttl=64 time=0.343 ms
64 bytes from 10.1.102.1: icmp_seq=37 ttl=64 time=0.314 ms
64 bytes from 10.1.102.1: icmp_seq=38 ttl=64 time=0.329 ms
64 bytes from 10.1.102.1: icmp_seq=39 ttl=64 time=0.712 ms
64 bytes from 10.1.102.1: icmp_seq=40 ttl=64 time=0.340 ms
64 bytes from 10.1.102.1: icmp_seq=41 ttl=64 time=0.328 ms
64 bytes from 10.1.102.1: icmp_seq=42 ttl=64 time=0.387 ms
64 bytes from 10.1.102.1: icmp_seq=43 ttl=64 time=0.252 ms
64 bytes from 10.1.102.1: icmp_seq=44 ttl=64 time=0.343 ms
64 bytes from 10.1.102.1: icmp_seq=45 ttl=64 time=0.368 ms
64 bytes from 10.1.102.1: icmp_seq=46 ttl=64 time=0.245 ms
64 bytes from 10.1.102.1: icmp_seq=47 ttl=64 time=0.466 ms
64 bytes from 10.1.102.1: icmp_seq=48 ttl=64 time=0.414 ms
64 bytes from 10.1.102.1: icmp_seq=49 ttl=64 time=0.302 ms
64 bytes from 10.1.102.1: icmp_seq=50 ttl=64 time=0.464 ms
64 bytes from 10.1.102.1: icmp_seq=51 ttl=64 time=0.262 ms
64 bytes from 10.1.102.1: icmp_seq=52 ttl=64 time=0.524 ms
64 bytes from 10.1.102.1: icmp_seq=53 ttl=64 time=0.421 ms
64 bytes from 10.1.102.1: icmp_seq=54 ttl=64 time=0.226 ms
^C
--- 10.1.102.1 ping statistics ---
55 packets transmitted, 52 packets received, 5.5% packet loss
round-trip min/avg/max/stddev = 0.226/0.346/0.712/0.087 ms
root@OPNsense:~ # ping 10.1.102.1
PING 10.1.102.1 (10.1.102.1): 56 data bytes
64 bytes from 10.1.102.1: icmp_seq=0 ttl=64 time=0.325 ms
64 bytes from 10.1.102.1: icmp_seq=1 ttl=64 time=0.356 ms
^C
--- 10.1.102.1 ping statistics ---
2 packets transmitted, 2 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 0.325/0.341/0.356/0.015 ms
root@OPNsense:~ #

After that, I deleted /boot/loader.conf.local (to get the default values back after the next boot) and powered the OPNsense system off and on again. Now, when I start a speed test on fast.com on a client, I only see the ping times on the OPNsense system increase. When the speed test is finished, the ping times go down again:

Code: [Select]
root@OPNsense:~ # ping 10.1.102.1
PING 10.1.102.1 (10.1.102.1): 56 data bytes
64 bytes from 10.1.102.1: icmp_seq=0 ttl=64 time=0.422 ms
64 bytes from 10.1.102.1: icmp_seq=1 ttl=64 time=0.319 ms
64 bytes from 10.1.102.1: icmp_seq=2 ttl=64 time=0.471 ms
64 bytes from 10.1.102.1: icmp_seq=3 ttl=64 time=23.658 ms
64 bytes from 10.1.102.1: icmp_seq=4 ttl=64 time=32.818 ms
64 bytes from 10.1.102.1: icmp_seq=5 ttl=64 time=31.154 ms
64 bytes from 10.1.102.1: icmp_seq=6 ttl=64 time=27.961 ms
64 bytes from 10.1.102.1: icmp_seq=7 ttl=64 time=18.703 ms
64 bytes from 10.1.102.1: icmp_seq=8 ttl=64 time=31.381 ms
64 bytes from 10.1.102.1: icmp_seq=9 ttl=64 time=33.733 ms
64 bytes from 10.1.102.1: icmp_seq=10 ttl=64 time=0.243 ms
64 bytes from 10.1.102.1: icmp_seq=11 ttl=64 time=0.357 ms
64 bytes from 10.1.102.1: icmp_seq=12 ttl=64 time=0.290 ms
64 bytes from 10.1.102.1: icmp_seq=13 ttl=64 time=0.246 ms
^C
--- 10.1.102.1 ping statistics ---
14 packets transmitted, 14 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 0.243/14.411/33.733/14.528 ms
root@OPNsense:~ #

I'll continue with some more tests on a HBJC385F551-63U-B (see http://www.jetwaycomputer.com/JBC385F551.html) and check the four i350 NICs, the one i211 and the one i219. As the current system has 10x i211, I'm curious how the i211 NIC behaves on this other system.

I'll keep you updated.
Title: Re: WAN link gone sometimes (igb driver, I211 nics), ifconfig d/u fixes it
Post by: wefinet on December 05, 2017, 01:21:06 pm
I now have the HBJC385F551-63U-B up and running (it comes with an Intel Core i5-6300U CPU).

I'm using the following NICs:

Currently I'm running the default OPNsense 17.7.8-amd64 with FreeBSD 11.0-RELEASE-p15. No issues so far. I'll keep you updated.
Title: Fixed through BIOS update (Re: WAN link gone sometimes (igb driver, ...)
Post by: wefinet on December 18, 2017, 03:33:23 pm
Hi Franco & Team,

As it has now turned out, the NIC issue really was caused in some way by the power management function of the I211.

Turning EEE off via the driver (as outlined in https://www.thomas-krenn.com/de/wiki/OPNsense_igb_EEE_Funktion_deaktivieren) did not help.

We have now received a BIOS update for the system in which the power management of the LAN ports is switched off via firmware. Since then, we have not detected any further problems.

We will do QA testing of the new BIOS/UEFI firmware and, once the tests are finished, provide it in the Downloads section of our site: https://www.thomas-krenn.com/de/download.html?product=15417

Thank you all for your help.

PS: If you are reading this because you are experiencing issues with FreeBSD 11.0/11.1 based systems with embedded I211 NICs, check with your hardware/firmware vendor and ask for a firmware that has the power management functions deactivated  ;)

Best regards,
Werner
Title: Re: [FIXED] WAN link gone sometimes (igb driver, I211 nics), ifconfig d/u fixes it
Post by: dcol on December 18, 2017, 05:05:24 pm
You could also use Intel's BootUtil to disable power management on the NIC using the following command:
BootUtil -WOLD

Works with all Intel NICs

Get the tool here
https://downloadcenter.intel.com/downloads/eula/19186/Intel-Ethernet-Connections-Boot-Utility-Preboot-Images-and-EFI-Drivers?httpDown=https%3A%2F%2Fdownloadmirror.intel.com%2F19186%2Feng%2FPREBOOT.EXE

Intel's webpage
https://downloadcenter.intel.com/download/19186/Intel-Ethernet-Connections-Boot-Utility-Preboot-Images-and-EFI-Drivers
Title: Re: [FIXED] WAN link gone sometimes (igb driver, I211 nics), ifconfig d/u fixes it
Post by: wefinet on December 19, 2017, 03:06:34 pm
Thank you for the hint. I have downloaded the tool (although I'm not sure whether it should be used with I211-AT chips, as the Intel download site does not list the I211-AT as a valid product for this download). In the doc file bootutil.txt I found this note regarding -WOLD:

Code: [Select]
POWER MANAGEMENT OPTIONS:
-WOLENABLE or -WOLE
  Enables Wake On LAN (WOL) functionality on the selected port.
-WOLDISABLE or -WOLD
  Disables Wake On LAN (WOL) functionality on the selected port.

The I211 data sheet - see https://www.intel.com/content/dam/www/public/us/en/documents/datasheets/i211-ethernet-controller-datasheet.pdf?asset=9567 - lists 10 different power management features in Table 1-9.

I'm not sure how -WOLD really affects those 10 different power management features. So if you are experiencing link down issues and are not sure how to fix them, ask your hardware vendor whether there is a firmware which has the power management deactivated.
Title: Re: WAN link gone sometimes (igb driver, I211 nics), ifconfig d/u fixes it
Post by: wefinet on December 21, 2017, 02:31:43 pm
Unfortunately, the problem has now come up once again :(

With the new BIOS running, I changed the setting "System State after Power Failure" from "Always Off" to "Always On". I then saved and exited (using the F4 key) and booted OPNsense. After a while, I unplugged the power cable, so the system was off. I plugged in power again, and noticed during bootup that fsck was run. After the system had been running for a few minutes, the network problem was there again:

Code: [Select]
wfischer@tpw:~$ ssh root@192.168.1.1
Password for root@OPNsense.test.thomas-krenn.com:
Last login: Thu Dec 21 08:43:11 2017 from 192.168.1.100
----------------------------------------------
|      Hello, this is OPNsense 17.7          |         @@@@@@@@@@@@@@@
|                                            |        @@@@         @@@@
| Website: https://opnsense.org/        |         @@@\\\   ///@@@
| Handbook: https://docs.opnsense.org/   |       ))))))))   ((((((((
| Forums: https://forum.opnsense.org/  |         @@@///   \\\@@@
| Lists: https://lists.opnsense.org/  |        @@@@         @@@@
| Code: https://github.com/opnsense  |         @@@@@@@@@@@@@@@
----------------------------------------------

  0) Logout                              7) Ping host
  1) Assign interfaces                   8) Shell
  2) Set interface IP address            9) pfTop
  3) Reset the root password            10) Firewall log
  4) Reset to factory defaults          11) Reload all services
  5) Power off system                   12) Upgrade from console
  6) Reboot system                      13) Restore a backup

Enter an option: 8

root@OPNsense:~ # ifconfig
igb0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
options=4400b8<VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,VLAN_HWTSO,TXCSUM_IPV6>
ether 00:30:18:cd:e8:54
inet 192.168.1.1 netmask 0xffffff00 broadcast 192.168.1.255
inet6 fe80::1:1%igb0 prefixlen 64 scopeid 0x1
nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
media: Ethernet autoselect (1000baseT <full-duplex>)
status: active
igb1: flags=8c02<BROADCAST,OACTIVE,SIMPLEX,MULTICAST> metric 0 mtu 1500
options=6403bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,VLAN_HWTSO,RXCSUM_IPV6,TXCSUM_IPV6>
ether 00:30:18:cd:e8:55
nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
media: Ethernet autoselect
status: no carrier
igb2: flags=8c02<BROADCAST,OACTIVE,SIMPLEX,MULTICAST> metric 0 mtu 1500
options=6403bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,VLAN_HWTSO,RXCSUM_IPV6,TXCSUM_IPV6>
ether 00:30:18:cd:ef:80
nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
media: Ethernet autoselect
status: no carrier
igb3: flags=8c02<BROADCAST,OACTIVE,SIMPLEX,MULTICAST> metric 0 mtu 1500
options=6403bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,VLAN_HWTSO,RXCSUM_IPV6,TXCSUM_IPV6>
ether 00:30:18:cd:ef:81
nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
media: Ethernet autoselect
status: no carrier
igb4: flags=8c02<BROADCAST,OACTIVE,SIMPLEX,MULTICAST> metric 0 mtu 1500
options=6403bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,VLAN_HWTSO,RXCSUM_IPV6,TXCSUM_IPV6>
ether 00:30:18:cd:ef:82
nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
media: Ethernet autoselect
status: no carrier
igb5: flags=8c02<BROADCAST,OACTIVE,SIMPLEX,MULTICAST> metric 0 mtu 1500
options=6403bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,VLAN_HWTSO,RXCSUM_IPV6,TXCSUM_IPV6>
ether 00:30:18:cd:ef:83
nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
media: Ethernet autoselect
status: no carrier
igb6: flags=8c02<BROADCAST,OACTIVE,SIMPLEX,MULTICAST> metric 0 mtu 1500
options=6403bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,VLAN_HWTSO,RXCSUM_IPV6,TXCSUM_IPV6>
ether 00:30:18:cd:ec:60
nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
media: Ethernet autoselect
status: no carrier
igb7: flags=8c02<BROADCAST,OACTIVE,SIMPLEX,MULTICAST> metric 0 mtu 1500
options=6403bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,VLAN_HWTSO,RXCSUM_IPV6,TXCSUM_IPV6>
ether 00:30:18:cd:ec:61
nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
media: Ethernet autoselect
status: no carrier
igb8: flags=8c02<BROADCAST,OACTIVE,SIMPLEX,MULTICAST> metric 0 mtu 1500
options=6403bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,VLAN_HWTSO,RXCSUM_IPV6,TXCSUM_IPV6>
ether 00:30:18:cd:ec:62
nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
media: Ethernet autoselect
status: no carrier
igb9: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
options=4400b8<VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,VLAN_HWTSO,TXCSUM_IPV6>
ether 00:30:18:cd:ec:63
inet6 fe80::230:18ff:fecd:ec63%igb9 prefixlen 64 scopeid 0xa
inet 10.1.102.55 netmask 0xffffff00 broadcast 10.1.102.255
nd6 options=23<PERFORMNUD,ACCEPT_RTADV,AUTO_LINKLOCAL>
media: Ethernet autoselect (1000baseT <full-duplex>)
status: active
lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
options=600003<RXCSUM,TXCSUM,RXCSUM_IPV6,TXCSUM_IPV6>
inet6 ::1 prefixlen 128
inet6 fe80::1%lo0 prefixlen 64 scopeid 0xb
inet 127.0.0.1 netmask 0xff000000
nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
groups: lo
enc0: flags=0<> metric 0 mtu 1536
nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
groups: enc
pflog0: flags=100<PROMISC> metric 0 mtu 33160
groups: pflog
pfsync0: flags=0<> metric 0 mtu 1500
groups: pfsync
syncpeer: 0.0.0.0 maxupd: 128 defer: off
root@OPNsense:~ # arp -a
? (10.1.102.1) at 4c:5e:0c:4b:23:30 on igb9 expires in 1011 seconds [ethernet]
OPNsense.test.thomas-krenn.com (10.1.102.55) at 00:30:18:cd:ec:63 on igb9 permanent [ethernet]
OPNsense.test.thomas-krenn.com (192.168.1.1) at 00:30:18:cd:e8:54 on igb0 permanent [ethernet]
? (192.168.1.100) at f0:de:f1:f3:17:88 on igb0 expires in 1088 seconds [ethernet]
root@OPNsense:~ # ping 10.1.102.1
PING 10.1.102.1 (10.1.102.1): 56 data bytes
^C
--- 10.1.102.1 ping statistics ---
10 packets transmitted, 0 packets received, 100.0% packet loss
root@OPNsense:~ # arp -a
? (10.1.102.1) at 4c:5e:0c:4b:23:30 on igb9 expires in 957 seconds [ethernet]
OPNsense.test.thomas-krenn.com (10.1.102.55) at 00:30:18:cd:ec:63 on igb9 permanent [ethernet]
OPNsense.test.thomas-krenn.com (192.168.1.1) at 00:30:18:cd:e8:54 on igb0 permanent [ethernet]
? (192.168.1.100) at f0:de:f1:f3:17:88 on igb0 expires in 1034 seconds [ethernet]
root@OPNsense:~ # date
Thu Dec 21 13:19:23 UTC 2017
root@OPNsense:~ # freebsd-version -ku
11.0-RELEASE-p17
11.0-RELEASE-p17
root@OPNsense:~ # sysctl hw.igb.num_queues
hw.igb.num_queues: 0
root@OPNsense:~ # sysctl hw.pci.enable_msix
hw.pci.enable_msix: 1
root@OPNsense:~ # sysctl hw.igb.enable_msix
hw.igb.enable_msix: 1
root@OPNsense:~ # ping 10.1.102.1
PING 10.1.102.1 (10.1.102.1): 56 data bytes
^C
--- 10.1.102.1 ping statistics ---
3 packets transmitted, 0 packets received, 100.0% packet loss
root@OPNsense:~ # arp -a
? (10.1.102.1) at 4c:5e:0c:4b:23:30 on igb9 expires in 781 seconds [ethernet]
OPNsense.test.thomas-krenn.com (10.1.102.55) at 00:30:18:cd:ec:63 on igb9 permanent [ethernet]
OPNsense.test.thomas-krenn.com (192.168.1.1) at 00:30:18:cd:e8:54 on igb0 permanent [ethernet]
? (192.168.1.100) at f0:de:f1:f3:17:88 on igb0 expires in 858 seconds [ethernet]
root@OPNsense:~ # arp -a && date
? (10.1.102.1) at 4c:5e:0c:4b:23:30 on igb9 expires in 695 seconds [ethernet]
OPNsense.test.thomas-krenn.com (10.1.102.55) at 00:30:18:cd:ec:63 on igb9 permanent [ethernet]
OPNsense.test.thomas-krenn.com (192.168.1.1) at 00:30:18:cd:e8:54 on igb0 permanent [ethernet]
? (192.168.1.100) at f0:de:f1:f3:17:88 on igb0 expires in 772 seconds [ethernet]
Thu Dec 21 13:23:42 UTC 2017
root@OPNsense:~ # ping 10.1.102.1
PING 10.1.102.1 (10.1.102.1): 56 data bytes
^C
--- 10.1.102.1 ping statistics ---
7 packets transmitted, 0 packets received, 100.0% packet loss
root@OPNsense:~ # ifconfig igb9 down
root@OPNsense:~ # ifconfig igb9 up
root@OPNsense:~ # ping 10.1.102.1
PING 10.1.102.1 (10.1.102.1): 56 data bytes
ping: sendto: No route to host
ping: sendto: No route to host
64 bytes from 10.1.102.1: icmp_seq=2 ttl=64 time=0.316 ms
64 bytes from 10.1.102.1: icmp_seq=3 ttl=64 time=0.298 ms
64 bytes from 10.1.102.1: icmp_seq=4 ttl=64 time=0.377 ms
64 bytes from 10.1.102.1: icmp_seq=5 ttl=64 time=0.294 ms
^C
--- 10.1.102.1 ping statistics ---
6 packets transmitted, 4 packets received, 33.3% packet loss
round-trip min/avg/max/stddev = 0.294/0.321/0.377/0.033 ms
root@OPNsense:~ # arp -a && date
? (10.1.102.1) at 4c:5e:0c:4b:23:30 on igb9 expires in 1198 seconds [ethernet]
OPNsense.test.thomas-krenn.com (10.1.102.55) at 00:30:18:cd:ec:63 on igb9 permanent [ethernet]
OPNsense.test.thomas-krenn.com (192.168.1.1) at 00:30:18:cd:e8:54 on igb0 permanent [ethernet]
? (192.168.1.100) at f0:de:f1:f3:17:88 on igb0 expires in 1198 seconds [ethernet]
Thu Dec 21 13:24:22 UTC 2017
root@OPNsense:~ #

Another user of this system (I think he is using pfSense 2.4) switched EEE off via the driver, and at the same time he has set hw.igb.num_queues=1. Up until now, he did not see any issues. I will try this, too. I'll keep you updated.
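
A minimal sketch of how the manual fix from the session above could be automated, assuming a cron job runs it every minute. The function names and the PING/IFCONFIG override variables are hypothetical and only for illustration; igb9 and 10.1.102.1 are the WAN interface and gateway from the session above. This only works around the symptom and does not address the cause.

```shell
#!/bin/sh
# Hypothetical watchdog sketch: bounce the WAN NIC when the gateway stops answering.
# PING and IFCONFIG can be overridden (e.g. for a dry run); both names are assumptions.

# Return 0 if the gateway answers at least one of three pings.
gateway_alive() {
    ${PING:-ping} -c 3 "$1" >/dev/null 2>&1
}

# Mirror the manual "ifconfig igb9 down" / "ifconfig igb9 up" sequence.
bounce_iface() {
    ${IFCONFIG:-ifconfig} "$1" down
    ${IFCONFIG:-ifconfig} "$1" up
}

# check_wan <interface> <gateway>: do nothing while the gateway answers,
# otherwise log a timestamped line and bounce the interface.
check_wan() {
    if gateway_alive "$2"; then
        return 0
    fi
    echo "$(date '+%Y-%m-%d %H:%M:%S') - WAN interface $1 restarted"
    bounce_iface "$1"
}
```

Appending a line such as `check_wan igb9 10.1.102.1` and calling the script from cron would log and bounce the interface whenever the gateway stops answering.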
Title: Re: WAN link gone sometimes (igb driver, I211 nics), ifconfig d/u fixes it
Post by: mimugmail on December 29, 2017, 09:10:13 pm
So when you use the I350 as WAN this error does not occur?
Title: Re: WAN link gone sometimes (igb driver, I211 nics), ifconfig d/u fixes it
Post by: dcol on December 30, 2017, 09:29:15 pm
There are a few items of concern here.
First, the num_queues setting has to do with the number of cores available divided by the number of ports. There should never be more ports than cores or the queues will overrun and could cause a reset of the port. The value of num_queues should be less than or equal to the cores/ports number. This is automatically calculated by the OS if not overridden by the settings. As an example, if you have 4 cores and 3 ports, the num_queues should be 4/3=1.33 which should be set to 1.
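
The rule of thumb above can be sketched as a small shell function. This is only an illustration of the arithmetic described in this post, not something the igb driver is confirmed to require; the function name max_queues is hypothetical, and clamping the result to 1 when there are more ports than cores is an assumption.

```shell
# Upper bound for hw.igb.num_queues per the rule of thumb above:
# the integer part of cores / ports.
max_queues() {
    cores=$1
    ports=$2
    q=$((cores / ports))
    # Assumption: never go below one queue, even with more ports than cores.
    if [ "$q" -lt 1 ]; then
        q=1
    fi
    echo "$q"
}
```

`max_queues 4 3` prints 1, matching the 4 cores / 3 ports example. On FreeBSD the core count is available via `sysctl -n hw.ncpu`.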

Secondly, the EEE setting must be made in the tunables section, as it does not work in loader.conf.local. Also, all power management settings in the BIOS should be disabled.
You can use the command 'sysctl -A' in the shell to see the actual settings in use.
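
Since the full sysctl listing is long, a small filter helps to show only the settings discussed in this thread. This is a sketch: the function name igb_settings and the SYSCTL override variable are hypothetical, and the per-device EEE knob is assumed to show up under a name like dev.igb.9.eee_disabled, which should be verified on the actual system.

```shell
# Print only the igb/MSI-X related settings from the full sysctl listing.
# SYSCTL can be overridden for testing; the default is the real sysctl binary.
igb_settings() {
    ${SYSCTL:-sysctl} -a 2>/dev/null |
        grep -E '^(hw\.igb\.|hw\.pci\.enable_msix|dev\.igb\.[0-9]+\.eee)'
}
```

Running `igb_settings` after a reboot shows whether values such as hw.igb.num_queues actually took effect.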
Title: Re: WAN link gone sometimes (igb driver, I211 nics), ifconfig d/u fixes it
Post by: mimugmail on December 31, 2017, 06:41:31 am
And check if you have the latest NIC firmware, esp. when you use Intel:

https://downloadcenter.intel.com/de/download/22283/Ethernet-Intel-Ethernet-Adapter-vollst-ndige-Treiber-Pack?product=64404
Title: Re: WAN link gone sometimes (igb driver, I211 nics), ifconfig d/u fixes it
Post by: tillsense on December 31, 2017, 11:06:36 am
Quote from: wefinet
Another user of this system (I think he is using pfSense 2.4) switched EEE off via the driver, and at the same time he has set hw.igb.num_queues=1. Up until now, he did not see any issues. I will try this, too. I'll keep you updated.

Hi all,

That sounds good. This is also why I suggested that it should be the default or adjustable via the GUI.
Happy New Year

cheers till

Title: Re: WAN link gone sometimes (igb driver, I211 nics), ifconfig d/u fixes it
Post by: dcol on January 01, 2018, 06:43:45 pm
That Intel download link is for drivers, not firmware. In the FreeBSD environment we have no control over the drivers that are used. The firmware is included as part of the BootUtil software. But it would be nice to have a hacked driver with no PM at all that could be compiled into the FreeBSD OS.
Title: Re: WAN link gone sometimes (igb driver, I211 nics), ifconfig d/u fixes it
Post by: mimugmail on January 01, 2018, 07:52:03 pm
Correct, the firmware should be here, but I don't have an account: https://www.intel.de/content/www/de/de/embedded/products/networking/ethernet-controller-i210-i211-family-documentation.html
Title: Re: WAN link gone sometimes (igb driver, I211 nics), ifconfig d/u fixes it
Post by: dcol on January 01, 2018, 08:07:07 pm
That link is for development and simulation tools. Probably a good place if you were going to hack the drivers or firmware.
This is what you want:
https://downloadcenter.intel.com/download/19186
Title: Re: WAN link gone sometimes (igb driver, I211 nics), ifconfig d/u fixes it
Post by: mimugmail on January 02, 2018, 06:59:46 am
Thanks dcol! I have had many problems with the SFP+ cards over the last few months; regarding SFP, updating the firmware on the chip is a much easier process.