OPNsense Forum

Archive => 16.7 Legacy Series => Topic started by: brady1408 on January 02, 2017, 02:59:20 am

Title: [SOLVED] Zotac nano ci323 LAN Drops after a few days
Post by: brady1408 on January 02, 2017, 02:59:20 am
I'm not sure what is causing this; I am wondering if it is just the NICs that are in this device. Please let me know your thoughts. Every couple of days I lose the ability to connect to the LAN IP and the logs fill with the following (newest entry first).

Once this starts happening I have to reboot the machine to get it back.

Jan 2 01:00:49   configd.py: [c7e56997-aa8f-4f94-9c4f-2c070f67ab76] updating dyndns lan
Jan 2 01:00:46   opnsense: /usr/local/etc/rc.linkup: HOTPLUG: Configuring interface lan
Jan 2 01:00:46   opnsense: /usr/local/etc/rc.linkup: DEVD Ethernet attached event for lan
Jan 2 01:00:46   configd.py: [7f6a2423-4cdc-49d5-8181-881fc54f12e0] Linkup starting re0
Jan 2 01:00:46   devd: Executing '/usr/local/opnsense/service/configd_ctl.py interface linkup start re0'
Jan 2 01:00:46   kernel: re0: link state changed to UP
Jan 2 01:00:42   opnsense: /usr/local/etc/rc.linkup: DEVD Ethernet detached event for lan
Jan 2 01:00:42   configd.py: [e29d0834-cbfb-4aff-9c82-5bf2df12a232] Linkup stopping re0
Jan 2 01:00:41   devd: Executing '/usr/local/opnsense/service/configd_ctl.py interface linkup stop re0'
Jan 2 01:00:41   kernel: re0: link state changed to DOWN
Jan 2 01:00:41   kernel: re0: watchdog timeout
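
To see how often this happens, the watchdog lines can be counted per interface. This is only a minimal sketch, assuming the log text is available on standard input (how it is exported from the box is not covered here); the helper name count_watchdog is made up for this example:

```shell
#!/bin/sh
# count_watchdog IFACE: count "watchdog timeout" kernel messages for one
# interface in the log text supplied on standard input.
# The helper name is hypothetical, not part of OPNsense.
count_watchdog() {
    # grep -c prints the number of matching lines (0 if there are none).
    grep -c "kernel: $1: watchdog timeout"
}

# Example with two lines taken from the log above:
printf '%s\n' \
    'Jan 2 01:00:41   kernel: re0: link state changed to DOWN' \
    'Jan 2 01:00:41   kernel: re0: watchdog timeout' | count_watchdog re0
```

A count that keeps growing between reboots would match the behaviour described here.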
Title: Re: Zotac nano ci323 LAN Drops after a few days
Post by: mbosner on January 04, 2017, 02:46:57 pm
Same here - afaik a known issue. Disabling all HW acceleration helps extend the time before failure.
Title: Re: Zotac nano ci323 LAN Drops after a few days
Post by: franco on January 04, 2017, 03:39:11 pm
Hi guys,

Watchdog timeout points to a hardware lockup. re(4) is not very good in general. Maybe it's better when we switch to FreeBSD 11.0 with 17.1, but in general migrating to better NICs is the best (and, ironically, cheapest) solution in the long run.

It could also be temperature concerns, malfunctioning lines / cables, etc.

The main question, though: what are you using the box for? How much traffic are you pushing? The more you approach the edge of the specification, the more visible such cases can be.


Cheers,
Franco
Title: Re: Zotac nano ci323 LAN Drops after a few days
Post by: mbosner on January 04, 2017, 10:48:13 pm
Hi franco,

thank you for the reply.

Since the box does not have any expansion options, it will be hard to replace the NICs (USB would be possible).

To replicate the failure you just need to push about 30 MB/s (megabytes per second); starting at about 20 GB of overall traffic, one or both NICs fail. dmesg tells you that they eventually come up again, but they do not forward any traffic after the first failure.
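
As a rough estimate of how long a load test has to run to reach that point: at about 30 MB/s, 20 GB is transferred in roughly eleven minutes. A small sketch of the arithmetic (the helper name is made up, and 1 GB is taken as 1024 MB, which is an assumption since the figures above are approximate):

```shell
#!/bin/sh
# seconds_to_transfer TOTAL_GB RATE_MB_PER_S
# Prints the whole number of seconds needed to move TOTAL_GB gigabytes at
# RATE_MB_PER_S megabytes per second (integer division, 1 GB = 1024 MB).
# The helper name is hypothetical.
seconds_to_transfer() {
    echo $(( $1 * 1024 / $2 ))
}

# 20 GB at 30 MB/s
seconds_to_transfer 20 30   # prints 682
```

So an iperf run would probably need to last noticeably longer than that (iperf's -t option sets the duration in seconds) before concluding that the NICs survive.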

Disabling the acceleration helps, even if you do not use the feature. For example:

If I disable VLAN hardware acceleration in the options, it only disables the hardware feature on my LAN side (I only use VLANs on my LAN side; it might be a bug, but I did not test VLANs on the WAN side). If I disable the acceleration manually, it takes much more time and traffic before the NICs fail.
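
For reference, disabling the acceleration manually on FreeBSD is usually done with ifconfig. The sketch below only prints the commands; the helper name and the exact set of flags are assumptions, since the post does not say which offloads were switched off:

```shell
#!/bin/sh
# offload_cmd IFACE: print the ifconfig command that turns off checksum
# offload, TSO, LRO and VLAN hardware tagging on IFACE.
# The helper name and the chosen flag set are assumptions.
offload_cmd() {
    echo "ifconfig $1 -rxcsum -txcsum -tso -lro -vlanhwtag"
}

# Print the commands for both NICs; review them, then run them as root.
for nic in re0 re1; do
    offload_cmd "$nic"
done
```

Settings applied this way are not persistent, so they would have to be repeated after a reboot or whenever the interface is reconfigured.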

By the way, with Windows as the OS (just a test installation) the NICs do not fail, even under much heavier load.

If it helps to grant access to the box for debugging purposes, I can add your public key next week (I won't make it earlier).

Cheers
Martin
Title: Re: Zotac nano ci323 LAN Drops after a few days
Post by: franco on January 05, 2017, 04:35:30 pm
Hi Martin,

Ok, let me rephrase: re(4) drivers on BSD are difficult. I checked the source code for fixes in newer versions and did not find a single one. This is not going to be fixed. ;(

If Windows works, maybe that's an option for the box, or a Linux distribution. But I don't know the state of it; maybe IPFire can do what you expect without issues.


Sorry,
Franco
Title: Re: Zotac nano ci323 LAN Drops after a few days
Post by: xofer on January 05, 2017, 05:35:52 pm
Same here.
kernel: re0: watchdog timeout, endlessly, and only a power cycle helps.
Seems to be trouble with the re(4) driver and/or the Zotac hardware, or the combination thereof.

At the same time the system itself is responsive, you can log in from terminal, the error is logged, etc. And it never seems to be re1 for us, no matter which is assigned to LAN or WAN.

The most baffling thing is that it's totally random. We had one crashing regularly at one office. At the same time we have two ci323 boxes working in different locations: one of them has never crashed in two months, and the other crashed several times in one week and has not done so since, for more than a month now. It does not seem to be related to traffic intensity.

Generally, disabling HW acceleration etc. did not seem to help us. Temperature is also not an issue (cooled room, well below 20 °C). As the box might work for days, it's pretty hard to diagnose.

After a couple of weeks we decided to replace the ci323 with a different mini PC that has two Intel NICs and have had no trouble since.
Title: Re: Zotac nano ci323 LAN Drops after a few days
Post by: mbosner on January 05, 2017, 09:24:28 pm
Hi Franco,

Windows is not an option ;)

I already ordered this box with Intel NICs: http://www.shuttle.eu/products/slim/ds68u/

Would you be able to debug the problem if I send you the hardware? Or maybe someone else?

Cheers
Martin

Title: Re: Zotac nano ci323 LAN Drops after a few days
Post by: franco on January 05, 2017, 09:56:46 pm
Hi Martin,

That's a very kind offer, but I cannot allocate time for this between OPNsense and my day job.

If someone else wants to take you up on this, that would be great. We can also try to reach more people on Twitter with this request if you want. :)


Cheers,
Franco
Title: Re: Zotac nano ci323 LAN Drops after a few days
Post by: mbosner on January 06, 2017, 12:25:21 am
Hi Franco,

sure!

If that works and we find someone who could take care of the problem and try to fix the bug(s) that would be great.

Cheers
Martin
Title: Re: Zotac nano ci323 LAN Drops after a few days
Post by: mbosner on January 18, 2017, 05:31:20 pm
Hi Franco,

any news on this topic? I sent you a private message with some details.

Cheers
Martin
Title: Re: Zotac nano ci323 LAN Drops after a few days
Post by: the-mk on January 18, 2017, 06:05:40 pm
I have such a CI323 box too, and in about six months I never had any problems... except once, when I ran iperf between two boxes, one on the LAN side and one on the WAN side. I had to power the box off and on again to make it run (I didn't have a monitor attached).

If I can perform some tests for you to gather logfiles just ask!
Title: Re: Zotac nano ci323 LAN Drops after a few days
Post by: mbosner on January 19, 2017, 02:46:33 am
If you just use the box at home, it happens only under high load (over 100 Mbit/s).

As soon as I start using iperf or fast downloads, the problems appear.
Title: Re: Zotac nano ci323 LAN Drops after a few days
Post by: xofer on January 24, 2017, 11:08:22 am
Seems that compiling a different realtek driver might help:
https://forum.pfsense.org/index.php?topic=103841.msg684436#msg684436
Title: Re: Zotac nano ci323 LAN Drops after a few days
Post by: franco on January 25, 2017, 06:34:56 pm
Removing re(4) from the kernel is not an easy task as the kernel gets reverted on firmware updates.

Let me look into this and provide an in-kernel driver test based on realtek's version 1.92...

https://github.com/opnsense/src/issues/15

If this works and the world doesn't end we can consider a full switch.


Cheers,
Franco
Title: Re: Zotac nano ci323 LAN Drops after a few days
Post by: franco on January 25, 2017, 11:05:05 pm
So, here's your code branch for the original re(4) driver from Realtek, slight adaptation for FreeBSD 11.0.

https://github.com/opnsense/src/commits/re

It builds fine, but I haven't tested it yet. If anybody wants to try, it will apply on any 17.1 for amd64, including prereleases:

# opnsense-update -kr 17.1-re
# /usr/local/etc/rc.reboot

Caveats: UNTESTED, amd64 only and netmap is not in native mode, only emulated.


Cheers,
Franco
Title: Re: Zotac nano ci323 LAN Drops after a few days
Post by: agh1701 on January 26, 2017, 05:03:06 pm
Working kernel module for FreeBSD 10.3 and OPNsense 16.7. Compiled from Realtek source v1.92.

Tested on OPNsense 16.7. Running for two days streaming DirecTV Now. Installer included.
Title: Re: Zotac nano ci323 LAN Drops after a few days
Post by: agh1701 on January 26, 2017, 05:06:31 pm
Working kernel module for FreeBSD 11 and OPNsense 17.1. Compiled from Realtek source v1.92.

Tested on OPNsense 17.1 for kernel loading and general I/O only; not load tested. Installer included.
Title: Re: Zotac nano ci323 LAN Drops after a few days
Post by: franco on January 26, 2017, 06:31:24 pm
Question: How does this even compile under FreeBSD 11.0 when taskqueue_enqueue_fast() is not in the OS anymore? I don't see a replacement for taskqueue_enqueue(). ;)

It's also useful to know that the module cannot be loaded without a replacement kernel without re, right?
Title: Re: Zotac nano ci323 LAN Drops after a few days
Post by: agh1701 on January 26, 2017, 07:35:04 pm
For FreeBSD 11 I found this patch; I don't remember where, either the pfSense or the FreeBSD forums:

Code: [Select]
--- if_re.c 2016-07-19 13:50:27.716636000 -0400
+++ if_re.c.Patched 2016-07-19 13:52:06.534495000 -0400
@@ -47,6 +47,8 @@
 * This driver also support Realtek 8139C+, 8110S/SB/SC, RTL8111B/C/CP/D and RTL8101E/8102E/8103E.
 */
 
+#define       M_DONTWAIT      M_NOWAIT
+
 #include <sys/param.h>
 #include <sys/systm.h>
 #include <sys/sockio.h>
@@ -57,6 +59,7 @@
 #include <sys/taskqueue.h>
 
 #include <net/if.h>
+#include <net/if_var.h>
 #include <net/if_arp.h>
 #include <net/ethernet.h>
 #include <net/if_dl.h>
@@ -5529,7 +5532,7 @@
 
                 sc->re_desc.tx_last_index = (sc->re_desc.tx_last_index+1)%RE_TX_BUF_NUM;
                 txptr=&sc->re_desc.tx_desc[sc->re_desc.tx_last_index];
-                ifp->if_opackets++;
+                if_inc_counter(ifp, IFCOUNTER_OPACKETS, 1);
                 ifp->if_drv_flags &= ~IFF_DRV_OACTIVE;
         }
 
@@ -5672,7 +5675,7 @@
                 }
 
                 eh = mtod(m, struct ether_header *);
-                ifp->if_ipackets++;
+                if_inc_counter(ifp, IFCOUNTER_IPACKETS, 1);
 #ifdef _DEBUG_
                 printf("Rcv Packet, Len=%d \n", m->m_len);
 #endif
@@ -5747,7 +5750,7 @@
 #if OS_VER < VERSION(7,0)
         re_int_task(arg, 0);
 #else
-        taskqueue_enqueue_fast(taskqueue_fast, &sc->re_inttask);
+        taskqueue_enqueue(taskqueue_fast, &sc->re_inttask);
 
         return (FILTER_HANDLED);
 #endif
@@ -5827,7 +5830,7 @@
 
 #if OS_VER>=VERSION(7,0)
         if (CSR_READ_2(sc, RE_ISR) & RE_INTRS) {
-                taskqueue_enqueue_fast(taskqueue_fast, &sc->re_inttask);
+                taskqueue_enqueue(taskqueue_fast, &sc->re_inttask);
                 return;
         }
 #endif

As for loading the module without replacing the kernel: it works as long as you set it to load (if_re_load="YES") in loader.conf.local. I verified this by checking dmesg, which showed version 1.92 loaded. You cannot mix kernel modules built for different OS versions (10.3, 11); the kernel module must be compiled for that kernel/OS version, but I suspect you already know that.
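
Adding that loader setting can be sketched as follows. The helper name is made up and the /boot/loader.conf.local path is an assumption based on the usual FreeBSD layout; the example writes to a scratch file so it is harmless to run:

```shell
#!/bin/sh
# ensure_if_re_load FILE: append if_re_load="YES" to FILE unless an
# identical line is already present. The helper name is hypothetical.
ensure_if_re_load() {
    line='if_re_load="YES"'
    # -x matches whole lines only, -F treats the pattern as a fixed string.
    if ! grep -qxF "$line" "$1" 2>/dev/null; then
        echo "$line" >> "$1"
    fi
}

# On the firewall the target would be /boot/loader.conf.local (assumed
# path); a scratch file stands in for it here.
conf="${TMPDIR:-/tmp}/loader.conf.local.example"
: > "$conf"
ensure_if_re_load "$conf"
ensure_if_re_load "$conf"   # second call leaves the file unchanged
cat "$conf"
```

After a reboot, dmesg (as mentioned above) or kldstat should show whether the module was picked up.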

I don't mean to impede any development. If you can get the proper driver built into 17.1, please do. Please, please!
Title: Re: Zotac nano ci323 LAN Drops after a few days
Post by: agh1701 on January 26, 2017, 07:43:12 pm
Franco, 

I did not apply the patch to the 10.3 module only the 11 module.  Do you think I should have?  Do you have a better Patch?
Title: Re: Zotac nano ci323 LAN Drops after a few days
Post by: franco on January 26, 2017, 10:31:33 pm
Hey,

Ok, better. :) I posted a similar patch: https://github.com/opnsense/src/commit/fc62dbeab5043

I built a full kernel with the new driver by replacing the FreeBSD one: https://github.com/opnsense/src/commit/9ab694091b

On OPNsense 17.1, this kernel can be installed by just running the command(s):

# opnsense-update -kr 17.1-re
# /usr/local/etc/rc.reboot

Last but not least, this kernel doesn't need a loader.conf fixup.

If it works for all testers we will consider merging it into an OPNsense 17.1.x release.


Cheers,
Franco
Title: Re: Zotac nano ci323 LAN Drops after a few days
Post by: agh1701 on January 26, 2017, 10:48:09 pm
Great! Thanks! I will start testing your kernel when 17.1 is stable, I guess on the 31st. My wife will kill me if the router crashes while I am at work. I hope this gets backported to FreeBSD. According to the forums, the pfSense and FreeNAS guys are having trouble too.
Title: Re: Zotac nano ci323 LAN Drops after a few days
Post by: paramedic233 on January 26, 2017, 11:08:27 pm
I have a Qotom thin mini PC with an onboard Intel Celeron J1900 processor (quad core, 2.42 GHz), 4 GB RAM, 64 GB SSD, dual LAN, dual display and a serial port (thank goodness for copy and paste).
I initially installed OPNsense, but it kept failing. Wife and family would contact me, irate as they were unable to go online.
Then I tried Sophos, and that too failed.
Then I tried the one that starts with a P, and had pretty good luck with that. It would run dangerously warm and only locked up a few times.
I was bored this past Tuesday, so I took it back down and reinstalled OPNsense (RC1), and have all of my traffic running through the proxy for one machine only. The four cores are running at about 40 °C and the sensor on the unit says 27 °C.
So, with all of that in mind, is it possible that your unit locked up due to heat issues? Once they get warm, they do some really wonky things.
Title: Re: Zotac nano ci323 LAN Drops after a few days
Post by: the-mk on February 06, 2017, 07:53:43 pm

did the update of the kernel as you described - after rebooting the OPNsense-box - it was still working :) can browse the net and write this post!

now performing some iperf tests between two BananaPis, one at LAN (client), one at WAN (server). Seems like I have a "weak" network-cable since it only transfers with around 100Mbps - but it is still transferring!

How long should it take until it fails with the old driver/kernel?

EDIT: after replacing that 100 Mbps cable, iperf was hitting the OPNsense box for more than 30 minutes and I can still post here without any reboot :)

I like the new driver - thank you franco!
Title: Re: Zotac nano ci323 LAN Drops after a few days
Post by: franco on February 07, 2017, 09:06:31 am
Thanks for the report. :)

I'm still waiting for another user and a test on a hardware that I have here, but it looks good and the change will be queued up for the development track soon. And if that works out ok we may be looking at inclusion in a 17.1.x release in a month or so.


Cheers,
Franco
Title: Re: Zotac nano ci323 LAN Drops after a few days
Post by: vxuser on February 07, 2017, 03:33:35 pm

Hi,
I have a CI323 with 16 GB memory and a 128 GB SSD.
I just found this post, and I will test today whether it works.
Currently I'm using a gigabit PPPoE connection and it hangs on upload.
Let's see after the patch.
Title: Re: Zotac nano ci323 LAN Drops after a few days
Post by: vxuser on February 08, 2017, 10:02:18 am
Just tried the latest kernel on the ci323 and it's working great.
Also note that with powerd enabled in hiadaptive mode it's working OK with the PPPoE connection.
Title: Re: Zotac nano ci323 LAN Drops after a few days
Post by: franco on February 08, 2017, 12:20:01 pm
Thanks, I'll upload another kernel for 17.1.1 in a few days, likely changing to the driver by default in 17.1.2 if all goes well.
Title: Re: Zotac nano ci323 LAN Drops after a few days
Post by: the-mk on February 08, 2017, 07:16:14 pm
After about 48 hours, still online with the new network driver/kernel...
Rebooted after the "load test" with iperf to reset the interface statistics and have had around 2 GB of traffic since then (didn't have the chance to stream a movie this week...).
Looking forward to the next update. Are there special steps to take before the next upgrade, or does the "manually changed" kernel stay?
Title: Re: Zotac nano ci323 LAN Drops after a few days
Post by: franco on February 08, 2017, 07:25:22 pm
Not yet: an upgrade installs the normal 17.1.1 kernel for safety reasons, so the -re kernel needs to be reinstalled afterwards.

If IPS tests work well here locally I'm very sure 17.1.2 will switch re(4) permanently.


Cheers,
Franco
Title: Re: Zotac nano ci323 LAN Drops after a few days
Post by: vxuser on February 08, 2017, 07:28:54 pm
Now,
if I used:
# opnsense-update -kr 17.1-re
# /usr/local/etc/rc.reboot

will the next build be updated with the 17.1-re or the 17.1.1 kernel?
Title: Re: Zotac nano ci323 LAN Drops after a few days
Post by: franco on February 09, 2017, 08:56:43 am
I think you did answer this question. :)

I'll provide a kernel for 17.1.1-re tomorrow.

FWIW, if you don't have issues with 17.1, you don't strictly need the 17.1.1 kernel.
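
For anyone who does want to stay on the Realtek driver across upgrades, the step to repeat follows the same pattern as the command earlier in this thread, with the release number swapped in. A small sketch that only prints the command (the helper name is made up, and the matching kernel set must already have been published for that release):

```shell
#!/bin/sh
# re_kernel_cmd VERSION: print the opnsense-update call that installs the
# "-re" test kernel for the given OPNsense version, following the pattern
# used in this thread. The helper name is hypothetical.
re_kernel_cmd() {
    echo "opnsense-update -kr $1-re"
}

re_kernel_cmd 17.1     # prints: opnsense-update -kr 17.1-re
re_kernel_cmd 17.1.1   # prints: opnsense-update -kr 17.1.1-re
```

As before, a reboot is needed afterwards for the new kernel to take effect.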
Title: Re: Zotac nano ci323 LAN Drops after a few days
Post by: xofer on February 10, 2017, 03:20:45 pm
Just put 17.1-re on one of my ci323 boxes that seemed to do the watchdog timeout thing on a weekly basis. Let's see if it lasts longer now.

The 17.1 upgrade lost the console on my VGA monitor though - neither VGA nor EFI works. But I'd rather live without a console than without Internet :-)
Title: Re: Zotac nano ci323 LAN Drops after a few days
Post by: the-mk on February 10, 2017, 03:32:52 pm
Take a look at System > Settings > Administration - there is a setting named primary console. You might want to switch to the VGA console and reboot.
Title: Re: Zotac nano ci323 LAN Drops after a few days
Post by: agh1701 on February 10, 2017, 03:59:26 pm
I don't see kernel-17.1.1-re-amd64.txz was this rolled into 17.1.1?
Title: Re: Zotac nano ci323 LAN Drops after a few days
Post by: franco on February 10, 2017, 04:19:00 pm
It's not there yet. Sorry.
Title: Re: Zotac nano ci323 LAN Drops after a few days
Post by: xofer on February 10, 2017, 05:41:22 pm

Well, as I said - neither VGA nor EFI works - where do you suppose I changed from one to the other?
 
But I don't actually want to bring new topics to this thread. If we can get rid of those pesky re0 watchdog timeouts on Zotac ci323, I will be over the moon and don't care that much about the console. Or I will drag an HDMI monitor to that particular room and try that.
Title: Re: Zotac nano ci323 LAN Drops after a few days
Post by: agh1701 on February 10, 2017, 06:23:59 pm

It's not just the Zotac, it's one or more Realtek LAN chips on the whole FreeBSD platform.
Title: Re: Zotac nano ci323 LAN Drops after a few days
Post by: vxuser on February 11, 2017, 09:25:27 pm
opnsense-update -kr 17.1.1-re is not working.
Also, the ci323 now has the same issue again, with re1 watchdog timeouts.
Title: Re: Zotac nano ci323 LAN Drops after a few days
Post by: franco on February 12, 2017, 09:11:05 pm
I said I did not have the time to do it then. ;)

But now it's up for amd64:

# opnsense-update -kr 17.1.1-re


Cheers,
Franco
Title: Re: Zotac nano ci323 LAN Drops after a few days
Post by: brady1408 on February 14, 2017, 04:54:14 am
Fantastic. I just had time today to upgrade to 17.1.1, and the upgrade was successful on my Zotac ci323.

I then ran # opnsense-update -kr 17.1.1-re, which also ran successfully for me.

I am now up and running on the new version and will watch closely for any further timeouts.

I just want to say thank you, this is a random issue with this hardware that can be frustrating when it happens. You could go weeks with no issues or you could have it start timing out a few times in a day.

My fingers are crossed that the driver update resolves the issue. I will report back when I know more.
Title: Re: Zotac nano ci323 LAN Drops after a few days
Post by: xofer on February 15, 2017, 04:08:08 pm
17.1-re has been up for 5 days now without a single watchdog timeout in the logs.
Even the occasional non-fatal timeouts I used to have from time to time are gone now.

I vote for this realtek driver to be included in the next release.
Title: Re: Zotac nano ci323 LAN Drops after a few days
Post by: franco on February 15, 2017, 09:54:58 pm
Realtek released the FreeBSD driver version 1.93 with built-in support for FreeBSD 11.0. All the more reason to go forward, I've queued it up for 17.1.2.

Thanks to everyone for the discussion and testing!


Cheers,
Franco
Title: Re: [SOLVED] Zotac nano ci323 LAN Drops after a few days
Post by: brady1408 on February 22, 2017, 10:30:48 pm
I know this is closed, but I just wanted to report back. It has been 8 days for me and I don't have a single timeout in my logs. Thank you!!!
Title: Re: [SOLVED] Zotac nano ci323 LAN Drops after a few days
Post by: franco on February 27, 2017, 01:01:20 pm
Hi Brady,

Thanks, really happy with the switch so far. :)


Cheers,
Franco