Shutdown/reboot hangs forever because stopping Suricata gets stuck.

Started by mrzaz, June 25, 2026, 09:38:25 AM

Quote from: Jorgek on June 29, 2026, 08:23:06 AM
Hi Franco,

I am facing the same issue. I discovered it when the system tried to reboot during the last update to Business 26.4.1.
My hardware is from Deciso: DEC 697
I had to connect to the console and kill Suricata manually, as the system had still not rebooted after more than 10 minutes.
My Suricata configuration is in divert mode. Since divert mode became available, I have switched from IPS mode to divert mode, as it makes more sense to have Suricata inspect only what the firewall allows (in my case, one specific rule) instead of inspecting all traffic.

I tried the same command shown earlier, but the result was always the same: it hangs when trying to stop Suricata.
I didn't try changing the Suricata mode back to IPS or IDS, but as far as I remember, I have never experienced this hang before. I have been using OPNsense for more than 3 years and this is the first time I have encountered this behavior. All previous updates were smooth, with no issues or hangs.

Regards,
Jorge

Hi Jorge,
Then at least I am not alone in this. 🙂

Due to hardware constraints in my old OPNsense machine I did not use Suricata much, but I have now enabled it more, and that's when I discovered it.

It always hung when trying to shut down. The only thing powerful enough to kill it was kill -9.

Dan Lundqvist
Best regards
Dan Lundqvist (mrzaz)

"It's better to burn up, than fade away..." (Highlander)

I have now changed from Divert (IPS) to Netmap (IPS), let it run for 24-36 hours, and then tried a normal reboot. At least this time it rebooted normally.
It only took a few seconds for the Suricata PID to stop, and then the rest of the shutdown/reboot continued.

I will keep an eye on this and test it again in a few days.

If it is the Divert setting that causes it, we need to try to find the culprit.

I will revert to Divert (IPS), see if I can reproduce it, and then use a bunch of hopefully good commands to debug.

//Dan Lundqvist

Hello,
I had it running for 1-2 days using "Netmap (IPS)" and did a few manual reboots; it shut down and rebooted OK every time.
I then reconfigured it to "Divert (IPS)", and after a while I got the same hard lock.

I collected a lot of debug printouts that I can send if someone is interested.
I first tried "kill -TERM 73697", but that did nothing; the process was still hanging.
I then did a "kill -9 73697", and the shutdown continued all the way to the reboot and the system came back up.

It seems to happen after a relatively short time on "Divert (IPS)".
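
For anyone who wants to repeat the manual sequence above (TERM first, then -9) in a consistent way, here is a minimal sketch. The function name terminate_pid and the 10 second grace period are my own assumptions and not anything shipped with OPNsense; it only automates the two kill commands already mentioned.

```shell
# Hypothetical helper: ask a process to exit with SIGTERM and fall back
# to SIGKILL only if it is still alive after a grace period.
terminate_pid() {
    pid=$1
    grace=${2:-10}   # seconds to wait after SIGTERM (assumed default)

    # If the signal cannot be delivered, the process is already gone.
    kill -TERM "$pid" 2>/dev/null || return 0

    waited=0
    while kill -0 "$pid" 2>/dev/null; do
        if [ "$waited" -ge "$grace" ]; then
            echo "pid $pid still alive ${grace}s after SIGTERM, sending SIGKILL" >&2
            kill -KILL "$pid" 2>/dev/null
            return 1   # signals that the hard kill was needed
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0           # the process exited on SIGTERM
}
```

Something like "terminate_pid 73697 10" would then return 1 whenever the -9 was needed, which makes it easy to note how often the hang occurs.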

//Dan Lundqvist
Stockholm, Sweden

Quote from: franco on June 25, 2026, 09:52:03 PM
Can you confirm this only happens with divert? It may be an open file descriptor / socket that the kernel doesn't yield.


Cheers,
Franco

Hi Franco,
I have now tested with Divert (IPS), then Netmap (IPS), and then back to Divert (IPS) again.

The result is that with "Netmap (IPS)" I do not get these hangs. It could run for hours, and
every time I tried a reboot it shut down cleanly and restarted as normal.

But when I reverted to "Divert (IPS)", let it run for a few hours, and then tried
a reboot, it hung waiting for the Suricata PID to die. I then tried "kill -TERM <pid>",
but that did not help. I had to go for the big gun with "kill -9 <pid>", and then
the shutdown continued and the system rebooted OK.

I do have some trace printouts that I could send if someone wants to review them.

//Dan Lundqvist

I'm not sure what the best thing to look at is, but let's start with the obvious during a hang vs. what I can see here during normal operation:

# sockstat | grep suricata
root    suricata   92847  3 dgram   ??                    -> /var/run/log     
root    suricata   92847  6 div4    *:8000                *:*

# ps auxwww | grep suricata
root    92847   0.6  1.3 155104 107252  -  Ss   19:50      0:01.29 /usr/local/bin/suricata -D -d 8000 --pidfile /var/run/suricata.pid -c /usr/local/etc/suricata/suricata.yaml
root    70011   0.0  0.0  14088   2620  1  S+   19:51      0:00.00 grep suricata

# fstat | grep suricata
root     suricata   92847 text /          6012 -rwxr-xr-x  11163768  r
root     suricata   92847   wd /            34 drwxr-xr-x      28  r
root     suricata   92847 root /            34 drwxr-xr-x      28  r
root     suricata   92847    0 /dev         20 crw-rw-rw-    null rw
root     suricata   92847    1 /dev         20 crw-rw-rw-    null rw
root     suricata   92847    2 /dev         20 crw-rw-rw-    null rw
root     suricata   92847    3* local dgram fffff800283498c0 <-> fffff800282b8280
root     suricata   92847    4 /var/log   2246 -rwx------  280764251  w
root     suricata   92847    5 /var/log   5591 -rwx------  1403297  w
root     suricata   92847    6* divert raw 0 0


Cheers,
Franco
"AI has absolutely reduced the cost of creating technical debt." -- ChatGPT

If you need a temporary solution, I have a suggestion. Add a timeout in the Suricata service script:
sed -i '' 's|command:/usr/local/etc/rc.d/suricata stop|command:/usr/local/etc/rc.d/suricata stop \|\| (sleep 10 \&\& killall -9 suricata)|' /usr/local/opnsense/service/conf/actions.d/actions_ids.conf
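
One caveat with the line above: with "A || B" the shell runs B only after A has returned with a non-zero status, so the fallback is reached only if the stop script itself gives up, not while it is still waiting. A rough sketch of an actual time limit follows. The name run_with_limit and the exit status 124 are my own choices and not an OPNsense interface.

```shell
# Hypothetical wrapper: run a command in the background and, if it has
# not returned after $1 seconds, kill it and report status 124.
run_with_limit() {
    limit=$1
    shift
    "$@" &
    cmd_pid=$!

    waited=0
    while kill -0 "$cmd_pid" 2>/dev/null; do
        if [ "$waited" -ge "$limit" ]; then
            kill -KILL "$cmd_pid" 2>/dev/null
            wait "$cmd_pid" 2>/dev/null
            return 124                 # time limit reached
        fi
        sleep 1
        waited=$((waited + 1))
    done
    wait "$cmd_pid"                    # pass on the command's own exit status
}
```

Used as "run_with_limit 10 /usr/local/etc/rc.d/suricata stop || killall -9 suricata", the killall would also run when the stop script is still hanging after 10 seconds. I have not tested this inside actions_ids.conf, so treat it as an idea and not as a drop-in replacement.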
Every morning, I wake up and check the Forbes list first. If I'm not on it, I go to work.

Quote from: franco on July 01, 2026, 07:52:16 PM
I'm not sure what the best thing to look at is, but let's start with the obvious during a hang vs. what I can see here during normal operation:


Hi Franco, see PM.

//Danne

Quote from: wincent on Today at 03:41:51 AM
If you need a temporary solution, I have a suggestion. Add a timeout in the Suricata service script:
sed -i '' 's|command:/usr/local/etc/rc.d/suricata stop|command:/usr/local/etc/rc.d/suricata stop \|\| (sleep 10 \&\& killall -9 suricata)|' /usr/local/opnsense/service/conf/actions.d/actions_ids.conf

Thanks, I will check that. A permanent fix would be best, but I will keep your workaround in the meantime.

//Danne

Hello,
This seems to be the same issue I faced here.
I have Suricata running in IPS mode with divert.
I updated to 26.1.11 and the issue showed up.

Here is what I have before the update:
fred@cerberus:~ $ sudo sockstat | grep suricata
root     suricata   29596 3   dgram  -> /var/run/log
root     suricata   29596 6   div4   *:8000                *:*
fred@cerberus:~ $
fred@cerberus:~ $
fred@cerberus:~ $ sudo ps auxwww | grep suricata
root    29596   0.1 23.8 2723192 1922216  -  Ss   20Jun26    96:13.75 /usr/local/bin/suricata -D -d 8000 --pidfile /var/run/suricata.pid -c /usr/local/etc/suricata/suricata.yaml
fred    19042   0.0  0.0   13744    2032  0  S+   10:40       0:00.00 grep suricata
fred@cerberus:~ $
fred@cerberus:~ $
fred@cerberus:~ $ sudo fstat | grep suricata
root     suricata   29596 text /        248115 -rwxr-xr-x  11994960  r
root     suricata   29596   wd /            34 drwxr-xr-x      28  r
root     suricata   29596 root /            34 drwxr-xr-x      28  r
root     suricata   29596    0 /dev         20 crw-rw-rw-    null rw
root     suricata   29596    1 /dev         20 crw-rw-rw-    null rw
root     suricata   29596    2 /dev         20 crw-rw-rw-    null rw
root     suricata   29596    3* local dgram fffff8001bb27640 <-> fffff8001bce2dc0
root     suricata   29596    4 /var/log  27980 -rw-r-----       0  w
root     suricata   29596    5 -           476 -rw-r-----  6318542  w
root     suricata   29596    6* divert raw 0 0

And this is what I have when Suricata is stuck:

fred@cerberus:~ $ sudo sockstat | grep suricata
root     suricata   29596 3   dgram  (not connected)
root     suricata   29596 6   div4   *:8000                *:*
fred@cerberus:~ $
fred@cerberus:~ $ sudo ps auxwww | grep suricata
root    29596   0.1 23.8 2723192 1922216  -  Ss   20Jun26    96:15.42 /usr/local/bin/suricata -D -d 8000 --pidfile /var/run/suricata.pid -c /usr/local/etc/suricata/suricata.yaml
root    44586   0.0  0.0   14312    2888  -  I    10:44       0:00.01 /bin/sh /usr/local/etc/rc.d/suricata stop
fred    67953   0.0  0.0   13744    2336  0  S+   10:46       0:00.00 grep suricata
fred@cerberus:~ $
fred@cerberus:~ $ sudo fstat | grep suricata
root     suricata   29596 text /        248115 -rwxr-xr-x  11994960  r
root     suricata   29596   wd /            34 drwxr-xr-x      28  r
root     suricata   29596 root /            34 drwxr-xr-x      28  r
root     suricata   29596    0 /dev         20 crw-rw-rw-    null rw
root     suricata   29596    1 /dev         20 crw-rw-rw-    null rw
root     suricata   29596    2 /dev         20 crw-rw-rw-    null rw
root     suricata   29596    3* local dgram fffff8001bb27640
root     suricata   29596    4 /var/log  27980 -rw-r-----       0  w
root     suricata   29596    5 -           476 -rw-r-----  6320460  w
root     suricata   29596    6* divert raw 0 0

When I killed PID 44586 (the rc.d stop script shown in the ps output above), OPNsense was able to reboot, and the upgrade completed OK.
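
To see at a glance what changed between the two captures, the sockstat lines can be put through diff. The sketch below just pastes the socket lines from this post into two temporary files (the temp file handling is only for illustration). In these captures only the fd 3 log socket differs, while the div4 socket on port 8000 is listed identically in both.

```shell
# The two sockstat captures from this post, reduced to the socket lines.
before=$(mktemp)
after=$(mktemp)

cat > "$before" <<'EOF'
root     suricata   29596 3   dgram  -> /var/run/log
root     suricata   29596 6   div4   *:8000                *:*
EOF

cat > "$after" <<'EOF'
root     suricata   29596 3   dgram  (not connected)
root     suricata   29596 6   div4   *:8000                *:*
EOF

# diff exits non-zero when the files differ, which is expected here.
changes=$(diff "$before" "$after" || true)
echo "$changes"

rm -f "$before" "$after"
```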

Thanks for the debug info so far.

I'm really unsure what the problem is here, but I've added a little patch to diagnose if the SIGINT is not properly handled in the divert packet loop.

https://github.com/opnsense/ports/commit/7883cddc3

# opnsense-revert -z suricata


Cheers,
Franco
"AI has absolutely reduced the cost of creating technical debt." -- ChatGPT