Issue with I340-T4 Intel 82580 Network controller

Started by abel408, March 21, 2017, 07:54:50 PM

I have an Intel 82580 Network controller that I've been having issues with.

I just have a very simple network on it.
OPNsense via igb0 82580 port (10.134.0.1) ---> TP-Link Switch (10.134.0.2)


When the OPNsense box starts up, everything works fine. I can connect to the switch and all devices attached to the switch. Then, for no apparent reason, the connection between the switch and the OPNsense box breaks. Both devices still see an active connection, but the OPNsense box loses the ARP entries it had for the switch and the devices on the switch. The only way to get the connection back is to reboot the OPNsense box. Sometimes it will go hours without losing the ARP entries, sometimes only minutes.

I had a VyOS router in place of the OPNsense box before and never had these issues, so I know there is nothing wrong with the switch.

I just upgraded to the latest version and I'm still seeing this issue.

I have attached other devices to the Intel 82580 network controller and they also lose their ARP entry. The onboard network controller works fine.
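
One way to pin down when the ARP entry for the switch disappears is to log the ARP table periodically and check for the switch's address. Below is a minimal Python sketch, assuming the usual FreeBSD `arp -an` line format (`? (ip) at mac on iface ...`). The function names and the sample MAC address are made up for illustration.

```python
import re

# Matches lines such as:
#   ? (10.134.0.2) at 00:11:22:33:44:55 on igb0 expires in 1187 seconds [ethernet]
#   ? (10.134.0.9) at (incomplete) on igb0 expired [ethernet]
ARP_LINE = re.compile(r"\((?P<ip>[^)]+)\) at (?P<mac>\S+) on (?P<iface>\S+)")


def parse_arp(text):
    """Return {ip: (mac, interface)} for every line that looks like an ARP entry."""
    entries = {}
    for line in text.splitlines():
        match = ARP_LINE.search(line)
        if match:
            entries[match.group("ip")] = (match.group("mac"), match.group("iface"))
    return entries


def has_resolved_entry(entries, ip):
    """True if the table holds a real MAC address for ip (not missing, not incomplete)."""
    mac, _iface = entries.get(ip, ("(incomplete)", None))
    return mac != "(incomplete)"
```

Feeding it the output of `arp -an` from a cron job, together with a timestamp, would show whether the entry for 10.134.0.2 vanishes or merely goes incomplete at the moment the link stops passing traffic.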

I'm having the same issue, with a slightly different setup.
I'm using an old Cisco blade server with the latest Proxmox 5.0.9. This server is connected via a trunk to a Cisco switch at 1 Gbit/s.

Everything works fine, and then suddenly, after approximately 24 to 48 hours, I lose a virtual interface.
LAN, WAN and GUESTS keep working; DMZ does not.
Once I restart the virtual interface (disable, then enable), the connection is back.

I've now fixed all network cards to 1000baseT. Fingers crossed that it stays stable now...


My OPNsense version is: OPNsense 17.1.6-amd64.

My Proxmox config, with network card Intel E1000:


ifconfig

em0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=2088<VLAN_MTU,VLAN_HWCSUM,WOL_MAGIC>
        ether 96:7e:28:e9:06:1c
        inet6 fe80::947e:28ff:fee9:61c%em0 prefixlen 64 scopeid 0x1
        inet 192.168.111.254 netmask 0xffffff00 broadcast 192.168.111.255
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
        media: Ethernet 1000baseT (1000baseT <full-duplex>)
        status: active
em1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=2088<VLAN_MTU,VLAN_HWCSUM,WOL_MAGIC>
        ether 1e:2a:a7:75:12:28
        inet6 fe80::1c2a:a7ff:fe75:1228%em1 prefixlen 64 scopeid 0x2
        inet 192.168.222.254 netmask 0xffffff00 broadcast 192.168.222.255
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
        media: Ethernet 1000baseT (1000baseT <full-duplex>)
        status: active
em2: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=2088<VLAN_MTU,VLAN_HWCSUM,WOL_MAGIC>
        ether fa:13:81:16:cb:08
        inet6 fe80::f813:81ff:fe16:cb08%em2 prefixlen 64 scopeid 0x3
        inet 192.168.1.254 netmask 0xffffff00 broadcast 192.168.1.255
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
        media: Ethernet 1000baseT (1000baseT <full-duplex>)
        status: active
em3: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=2088<VLAN_MTU,VLAN_HWCSUM,WOL_MAGIC>
        ether 22:98:02:e4:30:b6
        inet6 fe80::2098:2ff:fee4:30b6%em3 prefixlen 64 scopeid 0x4
        inet 192.168.0.226 netmask 0xffffff00 broadcast 192.168.0.255
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
        media: Ethernet 1000base


netstat -i
Name    Mtu Network       Address              Ipkts Ierrs Idrop    Opkts Oerrs  Coll
em0    1500 <Link#1>      96:7e:28:e9:06:1c 183889088     0     0 122338557     0     0
em0       - fe80::%em0/64 fe80::947e:28ff:f     1764     -     -    30831     -     -
em0       - 192.168.111.0 securitus              888     -     -     1883     -     -
em1    1500 <Link#2>      1e:2a:a7:75:12:28 138311520  6339     0 218558256     0     0
em1       - fe80::%em1/64 fe80::1c2a:a7ff:f        0     -     -        0     -     -
em1       - 192.168.222.0 securitus             2389     -     -     2624     -     -
em2    1500 <Link#3>      fa:13:81:16:cb:08        0     0     0       13     0     0
em2       - fe80::%em2/64 fe80::f813:81ff:f        0     -     -        1     -     -
em2       - 192.168.1.0/2 192.168.1.254           51     -     -        0     -     -
em3    1500 <Link#4>      22:98:02:e4:30:b6 38157986     0     0 19099481     0     0
em3       - fe80::%em3/64 fe80::2098:2ff:fe        0     -     -        1     -     -
em3       - 192.168.0.0/2 192.168.0.226          563     -     -        0     -     -
enc0*  1536 <Link#5>      enc0                     0     0     0        0     0     0
lo0   16384 <Link#6>      lo0                   7341     0     0     7341     0     0
lo0       - localhost     localhost                0     -     -        0     -     -
lo0       - fe80::%lo0/64 fe80::1%lo0              0     -     -        0     -     -
lo0       - your-net      localhost             7341     -     -     7341     -     -
pflog 33160 <Link#7>      pflog0                   0     0     0   710086     0     0
pfsyn  1500 <Link#8>      pfsync0                  0     0     0        0     0     0
ovpns  1500 <Link#9>      ovpns1                   0     0     0       10     0     0
ovpns     - fe80::%ovpns1 fe80::4c76:46ba:6        0     -     -        1     -     -
ovpns     - 192.168.112.1 192.168.112.1            0     -     -        0     -     -
em0_v  1500 <Link#10>     96:7e:28:e9:06:1c        0     0     0        5     0     0
em0_v     - fe80::%em0_vl fe80::947e:28ff:f        0     -     -        1     -     -
em1_v  1500 <Link#11>     1e:2a:a7:75:12:28        0     0     0        3     0     0
em1_v     - fe80::%em1_vl fe80::1c2a:a7ff:f        0     -     -        1     -     -
em2_v  1500 <Link#12>     fa:13:81:16:cb:08        0     0     0        5     0     0
em2_v     - fe80::%em2_vl fe80::f813:81ff:f        0     -     -        1     -     -
em3_v  1500 <Link#13>     22:98:02:e4:30:b6        0     0     0        4     1     0
em3_v     - fe80::%em3_vl fe80::2098:2ff:fe        0     -     -        1     -     -



netstat -s

tcp:
        522658 packets sent
                57058 data packets (80334229 bytes)
                680 data packets (849291 bytes) retransmitted
                5 data packets unnecessarily retransmitted
                0 resends initiated by MTU discovery
                463766 ack-only packets (0 delayed)
                0 URG only packets
                0 window probe packets
                554 window update packets
                600 control packets
        498363 packets received
                31457 acks (for 80336075 bytes)
                1273 duplicate acks
                0 acks for unsent data
                439680 packets (634087515 bytes) received in-sequence
                99 completely duplicate packets (85432 bytes)
                12 old duplicate packets
                0 packets with some dup. data (0 bytes duped)
                23419 out-of-order packets (33905296 bytes)
                0 packets (0 bytes) of data after window
                0 window probes
                7 window update packets
                53 packets received after close
                0 discarded for bad checksums
                0 discarded for bad header offset fields
                0 discarded because packet too short
                2 discarded due to memory problems
        299 connection requests
        35 connection accepts
        0 bad connection attempts
        0 listen queue overflows
        0 ignored RSTs in the windows
        334 connections established (including accepts)
        6036 connections closed (including 0 drops)
                145 connections updated cached RTT on close
                145 connections updated cached RTT variance on close
                106 connections updated cached ssthresh on close
        0 embryonic connections dropped
        31111 segments updated rtt (of 28921 attempts)
        22 retransmit timeouts
                0 connections dropped by rexmit timeout
        0 persist timeouts
                0 connections dropped by persist timeout
        0 Connections (fin_wait_2) dropped because of timeout
        0 keepalive timeouts
                0 keepalive probes sent
                0 connections dropped by keepalive
        22336 correct ACK header predictions
        439265 correct data packet header predictions
        35 syncache entries added
                0 retransmitted
                0 dupsyn
                0 dropped
                35 completed
                0 bucket overflow
                0 cache overflow
                0 reset
                0 stale
                0 aborted
                0 badack
                0 unreach
                0 zone failures
        35 cookies sent
        0 cookies received
        12 hostcache entries added
                0 bucket overflow
        94 SACK recovery episodes
        568 segment rexmits in SACK recovery episodes
        691091 byte rexmits in SACK recovery episodes
        3406 SACK options (SACK blocks) received
        23041 SACK options (SACK blocks) sent
        0 SACK scoreboard overflow
        0 packets with ECN CE bit set
        0 packets with ECN ECT(0) bit set
        0 packets with ECN ECT(1) bit set
        0 successful ECN handshakes
        0 times ECN reduced the congestion window
TCP connection count by state:
        0 connections in CLOSED state
        6 connections in LISTEN state
        0 connections in SYN_SENT state
        0 connections in SYN_RCVD state
        3 connections in ESTABLISHED state
        0 connections in CLOSE_WAIT state
        0 connections in FIN_WAIT_1 state
        0 connections in CLOSING state
        0 connections in LAST_ACK state
        0 connections in FIN_WAIT_2 state
        0 connections in TIME_WAIT state
udp:
        585116 datagrams received
        0 with incomplete header
        0 with bad data length field
        0 with bad checksum
        1892 with no checksum
        34032 dropped due to no socket
        146301 broadcast/multicast datagrams undelivered
        0 dropped due to full socket buffers
        0 not for hashed pcb
        404783 delivered
        439680 datagrams output
        0 times multicast source filter matche



Let me know if I can help in any way to troubleshoot it a bit more...

No luck, the port just went down again.  :-[
A reboot and everything is back online. I don't have the time now to go through the log files...

So fixing the speed didn't resolve the issue.


You have many Ierrs on em1 (vmbr777). Is this the DMZ interface? Can't see a public IP address.
2 discards because of memory problems and 5% of the packets are out of order.
Any errors from dmesg?

Did you check the Proxmox logs?
Do you have bonding on the Proxmox host configured?
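
The figures mentioned above can be re-derived from the counters posted earlier in the thread. This is a minimal Python sketch: the variable names are my own, and the choice of denominator for the out-of-order figure is an assumption (in-sequence packets and all received packets both land near 5%).

```python
# Counters copied from the netstat -i and netstat -s output posted above.
em1_ipkts = 138311520        # em1 Ipkts
em1_ierrs = 6339             # em1 Ierrs
tcp_received = 498363        # tcp: packets received
tcp_in_sequence = 439680     # tcp: packets received in-sequence
tcp_out_of_order = 23419     # tcp: out-of-order packets


def percent(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


ooo_vs_in_sequence = percent(tcp_out_of_order, tcp_in_sequence)
ooo_vs_received = percent(tcp_out_of_order, tcp_received)
em1_error_rate = percent(em1_ierrs, em1_ipkts)

print(f"out-of-order vs in-sequence: {ooo_vs_in_sequence:.2f}%")
print(f"out-of-order vs received:    {ooo_vs_received:.2f}%")
print(f"em1 Ierrs vs Ipkts:          {em1_error_rate:.4f}%")
```

Note that the em1 error count is large in absolute terms but small relative to the packet count, so the out-of-order share is probably the more telling number here.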

@abel408
Please post the output of
sysctl dev.igb.0
sysctl hw.igb
netstat -idb -I igb0

Quote from: faunsen on May 19, 2017, 02:11:23 PM
You have many Ierrs on em1 (vmbr777). Is this the DMZ interface? Can't see a public IP address.
2 discards because of memory problems and 5% of the packets are out of order.
Any errors from dmesg?

Did you check the Proxmox logs?
Do you have bonding on the Proxmox host configured?


vmbr777 is my server DMZ. The public IP/side is on vmbr999, but it's hidden after a router.

No bonding. Just 1 interface in use, with 4 trunks (666 - 777 - 888 - 999). This is the part that bothers me: if it was an issue with the hardware (Cisco blade), all VLANs should have issues, not only one virtual VLAN, no? Or even on the Proxmox side?

I have a feeling that the problem is linked to heavy data transfer on OPNsense and my 'virtual drivers'.
For example, when my backup server starts, or during a big download... But this is just a feeling.



Attached are some lines from my syslog server...

May 19, 2017, 04:44:14 PM #7 Last Edit: May 19, 2017, 04:46:55 PM by abel408
Quote from: faunsen on May 19, 2017, 02:17:59 PM
@abel408
Please post the output of
sysctl dev.igb.0
sysctl hw.igb
netstat -idb -I igb0

I've switched ports to igb2

sysctl dev.igb.2

dev.igb.2.host.header_redir_missed: 0
dev.igb.2.host.serdes_violation_pkt: 0
dev.igb.2.host.length_errors: 0
dev.igb.2.host.tx_good_bytes: 2584813
dev.igb.2.host.rx_good_bytes: 9629714
dev.igb.2.host.breaker_tx_pkt_drop: 0
dev.igb.2.host.tx_good_pkt: 0
dev.igb.2.host.breaker_rx_pkt_drop: 0
dev.igb.2.host.breaker_rx_pkts: 0
dev.igb.2.host.rx_pkt: 2
dev.igb.2.host.host_tx_pkt_discard: 0
dev.igb.2.host.breaker_tx_pkt: 0
dev.igb.2.interrupts.rx_overrun: 0
dev.igb.2.interrupts.rx_desc_min_thresh: 0
dev.igb.2.interrupts.tx_queue_min_thresh: 1626039
dev.igb.2.interrupts.tx_queue_empty: 23780
dev.igb.2.interrupts.tx_abs_timer: 0
dev.igb.2.interrupts.tx_pkt_timer: 0
dev.igb.2.interrupts.rx_abs_timer: 0
dev.igb.2.interrupts.rx_pkt_timer: 115268
dev.igb.2.interrupts.asserts: 3561305
dev.igb.2.mac_stats.tso_ctx_fail: 0
dev.igb.2.mac_stats.tso_txd: 0
dev.igb.2.mac_stats.tx_frames_1024_1522: 278
dev.igb.2.mac_stats.tx_frames_512_1023: 588
dev.igb.2.mac_stats.tx_frames_256_511: 285
dev.igb.2.mac_stats.tx_frames_128_255: 144
dev.igb.2.mac_stats.tx_frames_65_127: 16075
dev.igb.2.mac_stats.tx_frames_64: 6410
dev.igb.2.mac_stats.mcast_pkts_txd: 5
dev.igb.2.mac_stats.bcast_pkts_txd: 26
dev.igb.2.mac_stats.good_pkts_txd: 23780
dev.igb.2.mac_stats.total_pkts_txd: 116515
dev.igb.2.mac_stats.total_octets_txd: 8519853
dev.igb.2.mac_stats.good_octets_txd: 2584813
dev.igb.2.mac_stats.total_octets_recvd: 11198418
dev.igb.2.mac_stats.good_octets_recvd: 9629714
dev.igb.2.mac_stats.rx_frames_1024_1522: 1114
dev.igb.2.mac_stats.rx_frames_512_1023: 803
dev.igb.2.mac_stats.rx_frames_256_511: 218
dev.igb.2.mac_stats.rx_frames_128_255: 193
dev.igb.2.mac_stats.rx_frames_65_127: 1006
dev.igb.2.mac_stats.rx_frames_64: 111936
dev.igb.2.mac_stats.mcast_pkts_recvd: 0
dev.igb.2.mac_stats.bcast_pkts_recvd: 111044
dev.igb.2.mac_stats.good_pkts_recvd: 115270
dev.igb.2.mac_stats.total_pkts_recvd: 139781
dev.igb.2.mac_stats.mgmt_pkts_txd: 0
dev.igb.2.mac_stats.mgmt_pkts_drop: 0
dev.igb.2.mac_stats.mgmt_pkts_recvd: 0
dev.igb.2.mac_stats.unsupported_fc_recvd: 0
dev.igb.2.mac_stats.xoff_txd: 58621
dev.igb.2.mac_stats.xoff_recvd: 0
dev.igb.2.mac_stats.xon_txd: 34114
dev.igb.2.mac_stats.xon_recvd: 0
dev.igb.2.mac_stats.coll_ext_errs: 0
dev.igb.2.mac_stats.tx_no_crs: 0
dev.igb.2.mac_stats.alignment_errs: 0
dev.igb.2.mac_stats.crc_errs: 0
dev.igb.2.mac_stats.recv_errs: 0
dev.igb.2.mac_stats.recv_jabber: 0
dev.igb.2.mac_stats.recv_oversize: 0
dev.igb.2.mac_stats.recv_fragmented: 0
dev.igb.2.mac_stats.recv_undersize: 0
dev.igb.2.mac_stats.recv_no_buff: 0
dev.igb.2.mac_stats.recv_length_errors: 0
dev.igb.2.mac_stats.missed_packets: 24511
dev.igb.2.mac_stats.defer_count: 0
dev.igb.2.mac_stats.sequence_errors: 0
dev.igb.2.mac_stats.symbol_errors: 0
dev.igb.2.mac_stats.collision_count: 0
dev.igb.2.mac_stats.late_coll: 0
dev.igb.2.mac_stats.multiple_coll: 0
dev.igb.2.mac_stats.single_coll: 0
dev.igb.2.mac_stats.excess_coll: 0
dev.igb.2.queue7.lro_flushed: 0
dev.igb.2.queue7.lro_queued: 0
dev.igb.2.queue7.rx_bytes: 0
dev.igb.2.queue7.rx_packets: 182577
dev.igb.2.queue7.rxd_tail: 1023
dev.igb.2.queue7.rxd_head: 0
dev.igb.2.queue7.tx_packets: 4035
dev.igb.2.queue7.no_desc_avail: 0
dev.igb.2.queue7.txd_tail: 93
dev.igb.2.queue7.txd_head: 0
dev.igb.2.queue7.interrupt_rate: 8000
dev.igb.2.queue6.lro_flushed: 0
dev.igb.2.queue6.lro_queued: 0
dev.igb.2.queue6.rx_bytes: 0
dev.igb.2.queue6.rx_packets: 114128
dev.igb.2.queue6.rxd_tail: 1023
dev.igb.2.queue6.rxd_head: 0
dev.igb.2.queue6.tx_packets: 1087
dev.igb.2.queue6.no_desc_avail: 0
dev.igb.2.queue6.txd_tail: 82
dev.igb.2.queue6.txd_head: 0
dev.igb.2.queue6.interrupt_rate: 8000
dev.igb.2.queue5.lro_flushed: 0
dev.igb.2.queue5.lro_queued: 0
dev.igb.2.queue5.rx_bytes: 0
dev.igb.2.queue5.rx_packets: 11567
dev.igb.2.queue5.rxd_tail: 1023
dev.igb.2.queue5.rxd_head: 0
dev.igb.2.queue5.tx_packets: 561
dev.igb.2.queue5.no_desc_avail: 0
dev.igb.2.queue5.txd_tail: 22
dev.igb.2.queue5.txd_head: 0
dev.igb.2.queue5.interrupt_rate: 8000
dev.igb.2.queue4.lro_flushed: 0
dev.igb.2.queue4.lro_queued: 0
dev.igb.2.queue4.rx_bytes: 0
dev.igb.2.queue4.rx_packets: 136909
dev.igb.2.queue4.rxd_tail: 1023
dev.igb.2.queue4.rxd_head: 0
dev.igb.2.queue4.tx_packets: 11072
dev.igb.2.queue4.no_desc_avail: 4095
dev.igb.2.queue4.txd_tail: 0
dev.igb.2.queue4.txd_head: 0
dev.igb.2.queue4.interrupt_rate: 8000
dev.igb.2.queue3.lro_flushed: 0
dev.igb.2.queue3.lro_queued: 0
dev.igb.2.queue3.rx_bytes: 0
dev.igb.2.queue3.rx_packets: 328
dev.igb.2.queue3.rxd_tail: 1023
dev.igb.2.queue3.rxd_head: 0
dev.igb.2.queue3.tx_packets: 5116
dev.igb.2.queue3.no_desc_avail: 5103
dev.igb.2.queue3.txd_tail: 1022
dev.igb.2.queue3.txd_head: 0
dev.igb.2.queue3.interrupt_rate: 8000
dev.igb.2.queue2.lro_flushed: 0
dev.igb.2.queue2.lro_queued: 0
dev.igb.2.queue2.rx_bytes: 0
dev.igb.2.queue2.rx_packets: 239406
dev.igb.2.queue2.rxd_tail: 1023
dev.igb.2.queue2.rxd_head: 0
dev.igb.2.queue2.tx_packets: 3577
dev.igb.2.queue2.no_desc_avail: 5095
dev.igb.2.queue2.txd_tail: 1022
dev.igb.2.queue2.txd_head: 0
dev.igb.2.queue2.interrupt_rate: 8000
dev.igb.2.queue1.lro_flushed: 0
dev.igb.2.queue1.lro_queued: 0
dev.igb.2.queue1.rx_bytes: 0
dev.igb.2.queue1.rx_packets: 342474
dev.igb.2.queue1.rxd_tail: 1023
dev.igb.2.queue1.rxd_head: 0
dev.igb.2.queue1.tx_packets: 2683
dev.igb.2.queue1.no_desc_avail: 5103
dev.igb.2.queue1.txd_tail: 1022
dev.igb.2.queue1.txd_head: 0
dev.igb.2.queue1.interrupt_rate: 8000
dev.igb.2.queue0.lro_flushed: 0
dev.igb.2.queue0.lro_queued: 0
dev.igb.2.queue0.rx_bytes: 0
dev.igb.2.queue0.rx_packets: 68432
dev.igb.2.queue0.rxd_tail: 1023
dev.igb.2.queue0.rxd_head: 0
dev.igb.2.queue0.tx_packets: 4274
dev.igb.2.queue0.no_desc_avail: 4095
dev.igb.2.queue0.txd_tail: 0
dev.igb.2.queue0.txd_head: 0
dev.igb.2.queue0.interrupt_rate: 8000
dev.igb.2.fc_low_water: 33152
dev.igb.2.fc_high_water: 33168
dev.igb.2.rx_buf_alloc: 0
dev.igb.2.tx_buf_alloc: 0
dev.igb.2.extended_int_mask: 2147483648
dev.igb.2.interrupt_mask: 4
dev.igb.2.rx_control: 67141634
dev.igb.2.device_control: 1490027073
dev.igb.2.watchdog_timeouts: 0
dev.igb.2.rx_overruns: 0
dev.igb.2.tx_dma_fail: 0
dev.igb.2.mbuf_defrag_fail: 0
dev.igb.2.link_irq: 2
dev.igb.2.dropped: 135555
dev.igb.2.tx_processing_limit: -1
dev.igb.2.rx_processing_limit: 100
dev.igb.2.fc: 3
dev.igb.2.enable_aim: 1
dev.igb.2.nvm: -1
dev.igb.2.%parent: pci1
dev.igb.2.%pnpinfo: vendor=0x8086 device=0x150e subvendor=0x8086 subdevice=0x12a1 class=0x020000
dev.igb.2.%location: slot=0 function=2 dbsf=pci0:1:0:2
dev.igb.2.%driver: igb
dev.igb.2.%desc: Intel(R) PRO/1000 Network Connection, Version - 2.5.3-k


sysctl hw.igb

hw.igb.tx_process_limit: -1
hw.igb.rx_process_limit: 100
hw.igb.num_queues: 0
hw.igb.header_split: 0
hw.igb.buf_ring_size: 4096
hw.igb.max_interrupt_rate: 8000
hw.igb.enable_msix: 1
hw.igb.enable_aim: 1
hw.igb.txd: 1024
hw.igb.rxd: 1024


netstat -idb -I igb2

Name    Mtu Network       Address              Ipkts Ierrs Idrop     Ibytes    Opkts Oerrs     Obytes  Coll  Drop
igb2   1500 <Link#3>      00:1b:21:a6:c5:06   115293 135555 24511    9631186    23780     0    2584813     0 109068
igb2      - fe80::%igb2/6 fe80::21b:21ff:fe        0     -     -          0        1     -         96     -     -
igb2      - x.x.x.x.0/2 x-x-x-x.tvc-i    17976     -     -    1500557        0     -          0     -     -
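
A dump this long is easier to read once the non-zero problem counters are filtered out. Below is a minimal Python sketch that parses `sysctl` output of the `name: value` form shown above. The list of "suspect" substrings is my own guess at what is worth flagging, not something defined by the igb driver.

```python
def parse_sysctl(text):
    """Parse 'name: value' lines into a dict, keeping only integer-valued entries."""
    counters = {}
    for line in text.splitlines():
        name, sep, value = line.partition(": ")
        if not sep:
            continue
        try:
            counters[name.strip()] = int(value)
        except ValueError:
            # Non-numeric entries such as %driver or %desc are skipped.
            pass
    return counters


# Assumption: counter names containing these substrings hint at trouble.
SUSPECT = ("err", "drop", "missed", "no_desc_avail", "overrun", "fail", "xoff", "timeouts")


def nonzero_suspects(counters):
    """Return the suspect counters whose value is not zero."""
    return {
        name: value
        for name, value in counters.items()
        if value != 0 and any(tag in name for tag in SUSPECT)
    }
```

Run against the dev.igb.2 output above, this would single out entries such as `missed_packets`, `xoff_txd`, `no_desc_avail` and `dropped`, while leaving out `crc_errs`, `alignment_errs` and `symbol_errors`, which are all 0 in the posted dump.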



@brononius
Device: /dev/bus/4 [megaraid_disk_22] [SAT], SMART Prefailure Attribute: 1 Raw_Read_Error_Rate changed from 78 to 79
Device: /dev/bus/4 [megaraid_disk_22] [SAT], SMART Usage Attribute: 190 Airflow_Temperature_Cel changed from 72 to 73

Quite hot. BBQ?
You should check your air con. And the disks. And the switch. Any errors/discards, temperature?
And can you check the errors from an ifconfig on the Proxmox host?

Just to understand you correctly: you have one physical interface on the Proxmox blade, put 4 VLANs on it (btw: what is your understanding of a trunk?), and push a backup in and out through this single NIC?
I'd try the virtio NIC. 6 GB of RAM is good for some performance tweaks.  ;)
FreeBSD uses resources very sparingly with the default settings.

One important thing:
When the problem occurs again, before you reboot the OPNsense VM please do a
netstat -m
netstat -s
sysctl dev.em
etc.
and post it here.
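
Since the state is lost on reboot, it may help to have the commands above ready as a script. Below is a minimal Python sketch, assuming Python 3.7 or later is available on the OPNsense box. The file name pattern, the `/tmp` default and the function names are my own choices.

```python
import subprocess
from datetime import datetime
from pathlib import Path

# The commands listed above; add others (e.g. ifconfig) as needed.
COMMANDS = [
    ["netstat", "-m"],
    ["netstat", "-s"],
    ["sysctl", "dev.em"],
]


def run_command(argv):
    """Run one command and return its combined output, or a note if it failed."""
    try:
        result = subprocess.run(argv, capture_output=True, text=True, timeout=30)
        return result.stdout + result.stderr
    except (OSError, subprocess.TimeoutExpired) as exc:
        return f"failed to run: {exc}\n"


def collect(commands=COMMANDS, runner=run_command):
    """Return one text blob with a header line per command followed by its output."""
    sections = []
    for argv in commands:
        sections.append("### " + " ".join(argv) + "\n" + runner(argv))
    return "\n".join(sections)


def save_snapshot(directory="/tmp", runner=run_command):
    """Write the collected output to a timestamped file and return its path."""
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    path = Path(directory) / f"opnsense-diag-{stamp}.txt"
    path.write_text(collect(runner=runner))
    return path
```

Calling `save_snapshot()` from the console when the interface hangs leaves a file that can be attached to the thread after the reboot.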

Is the backup traffic going through the DMZ interface?

@abel408
You have more Ierrs than Ipkts!
This could be a hardware problem. Check the cables.
I recently had a series of bad cables that caused this number of errors.
Or maybe the switch has a problem. Any errors on the switch?
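
For reference, the comparison can be made directly from the link-level row of the `netstat -idb -I igb2` output posted above. Below is a minimal Python sketch. The helper name is my own, and it only handles `<Link#N>` rows, where every column is filled in.

```python
def parse_link_row(header, row):
    """Map netstat column names to the values of one <Link#N> row."""
    names = header.split()
    fields = row.split()
    if len(names) != len(fields):
        raise ValueError("column count mismatch; only <Link#N> rows are supported")
    return dict(zip(names, fields))


# Header and row copied from the output posted above.
HEADER = "Name    Mtu Network       Address              Ipkts Ierrs Idrop     Ibytes    Opkts Oerrs     Obytes  Coll  Drop"
ROW = "igb2   1500 <Link#3>      00:1b:21:a6:c5:06   115293 135555 24511    9631186    23780     0    2584813     0 109068"

stats = parse_link_row(HEADER, ROW)
ipkts = int(stats["Ipkts"])
ierrs = int(stats["Ierrs"])
error_ratio = ierrs / ipkts

print(f"Ipkts={ipkts} Ierrs={ierrs} ratio={error_ratio:.2f}")
```

In the posted data the Ierrs value (135555) equals `dev.igb.2.dropped`, and Idrop (24511) equals `mac_stats.missed_packets`, so it may be worth checking those two counters alongside the cabling.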


This problem isn't isolated to just the switch or cable. Anything I plug into this interface card has this issue. My 2 onboard interface ports are the only ones that are working correctly.

Quote from: abel408 on May 19, 2017, 07:17:04 PM
This problem isn't isolated to just the switch or cable. Anything I plug into this interface card has this issue. My 2 onboard interface ports are the only ones that are working correctly.
Can you change the interface card too?
Just to be sure  :)

I was trying to avoid that, but it sounds like that's my only option at this point...

Yes, the high error rate relative to the low traffic suggests it.
I don't think it's a driver problem.
I use the igb driver too.
Keep an eye on hw.igb.buf_ring_size.


Cheers,
Frank

Quote from: faunsen on May 19, 2017, 05:19:49 PM
Quite hot. BBQ? You should check your air con. And the disks. And the switch. Any errors/discards, temperature?
No, just an old server running at my house, so not in a cold room or so...

Quote from: faunsen on May 19, 2017, 05:19:49 PM
To understand you right. You have one physical interface on the Proxmox blade and put 4 VLANs (btw: What is your understanding of a trunk?) on it and press a backup through this single NIC in and out?.

My understanding of a trunk: multiple VLANs through 1 cable/port...

For example, the config on my switch:
interface GigabitEthernet0/27
 description Proxmox
 switchport access vlan 777
 switchport trunk encapsulation dot1q
 switchport mode trunk
 switchport nonegotiate
 speed 1000
 duplex full
 spanning-tree portfast
 spanning-tree bpduguard enable


And for Proxmox:
root@proxmoxus:~# cat /etc/network/interfaces
auto lo
iface lo inet loopback

iface ens1f0 inet manual

iface ens1f1 inet manual

auto bond0
iface bond0 inet manual
    slaves ens1f0
    bond_miimon 100
    bond_mode 802.3ad

auto bond0.666
iface bond0.666 inet manual
    vlan-raw-device bond0

auto bond0.777
iface bond0.777 inet manual
    vlan-raw-device bond0

auto bond0.888
iface bond0.888 inet manual
    vlan-raw-device bond0

auto bond0.999
iface bond0.999 inet manual
    vlan-raw-device bond0

auto vmbr666
iface vmbr666 inet manual
    bridge_ports bond0.666
    bridge_stp off
    bridge_fd 0

auto vmbr777
iface vmbr777 inet static
    address 192.168.222.253
    netmask 255.255.255.0
    gateway 192.168.222.254
    bridge_ports bond0.777
    bridge_stp off
    bridge_fd 0

auto vmbr888
iface vmbr888 inet manual
    bridge_ports bond0.888
    bridge_stp off
    bridge_fd 0

auto vmbr999
iface vmbr999 inet manual
    bridge_ports bond0.999
    bridge_stp off
    bridge_fd 0




Quote from: faunsen on May 19, 2017, 05:19:49 PM
One important thing:
When the problem occurs again, before you reboot the OPNsense VM please do a
netstat -m
netstat -s
sysctl dev.em
And can you check the errors from a ifconfig on the Proxmox host?
For the moment, no more crashes, but I've written it down, so once I see it again, I'll check it.

Quote from: faunsen on May 19, 2017, 05:19:49 PM
Is the backup traffic going through the DMZ interface?
I think it does. I'm not sure how Proxmox handles traffic, but the backup server and the others are on the same VLAN, with the same gateway. But since they're all on the same KVM host, I guess it all passes through the network interface?


PS @abel408, sorry for posting this in your topic. I thought we were having the same issue. If a moderator reads this, feel free to split it.