
1.17 update caused eastpect crash


nicktayl88:
Since updating to Zenarmor 1.17, I am getting the following message in the logs:

<6>pid 3076 (eastpect), jid 0, uid 0: exited on signal 3

When the error happens, I lose all network access to the device for 2-3 minutes. It happens roughly once every 5-10 minutes.

Running on an Intel N5105 / I226-V device. Works fine if I disable Zenarmor, and worked fine before upgrading.

Any ideas on how I can fix this one?

sy:
Hi,

Please share a report via the Have Feedback option, with all checkboxes selected. The team will check the logs and configuration to find the root cause.

almodovaris:
I suggest switching between the native driver and the emulated driver.

just4fun:
Hi,

I see the same behaviour, and I also find "exited on signal 3" in the logs.
My hardware is an Intel N100 with 4 x I226-V, 16 GB RAM and a Samsung NVMe drive.

I have enabled Zenarmor on two internal interfaces (LAN and GUEST). Only the worker process
of the LAN interface gets terminated, not the one for the GUEST interface. Since there is user traffic on the LAN and none on
the GUEST interface, I assume that some packet coming from the LAN is causing this, but I have no clue which one.
The LAN Ethernet link seems stable, and there is no indication of link flapping on the switch the LAN port is connected to.
For me it happens at random times, every few hours.
With the native netmap driver, the event is drastic: I lose the network and connectivity to the OPNsense device
itself, and it looks like re-initialising is difficult.
With the emulated netmap driver, the worker process is simply restarted and that's it, so it is barely noticeable,
but it still happens.

Also, in the log of the main process I find an indication that the worker process did not reply to keepalives
for 21 seconds and will therefore be restarted. I guess that is where the SIGQUIT signal (3) comes from.
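
To illustrate the kind of keepalive check described above, here is a minimal sketch in Python. It is not Zenarmor's actual code: the class name `HeartbeatWatchdog` and its methods are hypothetical, and the only numbers taken from the logs are the one-second interval and the limit of 21 missed keepalives.

```python
class HeartbeatWatchdog:
    """Hypothetical supervisor-side counter for missed worker keepalives."""

    def __init__(self, max_missed=21):
        # 21 is the limit seen in the main process log; everything else is assumed.
        self.max_missed = max_missed
        self.missed = 0

    def tick(self, replied):
        """Call once per second. Returns True when the worker should be restarted."""
        if replied:
            # Any reply resets the counter.
            self.missed = 0
            return False
        self.missed += 1
        if self.missed >= self.max_missed:
            # A real supervisor would signal the worker here, for example with
            # os.kill(worker_pid, signal.SIGQUIT), and then start a new one.
            self.missed = 0
            return True
        return False


watchdog = HeartbeatWatchdog()
# Simulate a worker that stays silent for 21 consecutive seconds.
decisions = [watchdog.tick(replied=False) for _ in range(21)]
print(decisions.index(True) + 1, "missed keepalives before restart")
```

If something like this is what happens, signal 3 would be a symptom of the worker hanging, not the cause of the hang.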

Inbound packets go first to Zenarmor and then to the firewall rules, if I understood correctly, so there is no way
to filter out trouble-causing packets before they reach Zenarmor.

Sounds more like a Zenarmor issue to me than an OPNsense issue.

Regards,
Stephan

just4fun:
Hi, I have been digging around in the log files after setting the log level to Debug4.
In my setup, no DNS service (Unbound etc.) is running on the OPNsense system, but I have
two DNS servers (dnsmasq) running in the LAN, listening on port 53, nothing special.
Both of them use
- DNSCrypt to query OpenDNS servers for most queries
- plain DNS (udp/53) to my carrier's DNS servers (to resolve hostnames for streaming services, so that I get nearby CDN distribution points for audio and video streams)

So some DNS queries may go from the OPNsense host to the internal DNS server and then back through the
OPNsense host to the external DNS servers.


What I found is that at the point in time when the worker stops responding to heartbeats (you can see this in
the main log, which records the missed heartbeats each second until the count reaches 21), I have a sequence like the following in the worker log (10.XXX.YYY.254 is OPNsense, 10.XXX.YYY.250 is the internal DNS server):

2024-05-02T18:24:34.608386 DBG1 [Connection::setDgaQueryState] [UDP][CLI][opnsense][10.XXX.YYY.254:38940]<->[UDP][SRV][hh_sgsvr_1][10.XXX.YYY.250:53][74bde58f-4c78-4f21-ac9d-4b9ad97e04f4]DGA query state changed to 'not_suspicious'
2024-05-02T18:24:34.608826 DBG2 [peeker_peek_packets] dns extracted name: cti.zenarmor.net
2024-05-02T18:24:34.608840 DBG4  WebApplicationsClassifier: trying hostname: cti.zenarmor.net
2024-05-02T18:24:34.608848 DBG4  WebApplicationsClassifier: trying hostname: zenarmor.net
2024-05-02T18:24:34.608863 DBG2 [PolicyManager::matchPolicyPacketInfo] dir: EGRESS, port: LAN, if: nm0::igc1, maclocal: 80ee73e6f0da, macremote: a8b8e001cfde, iplocal: 10.XXX.YYY.250, ipremote: 10.XXX.YYY.254, dnslocal: hh_sgsvr_1, dnsremote: opnsense, username: n/a, usergroup: n/a
2024-05-02T18:24:34.608872 DBG4 [UserDevice::setIPv4] Updated last_seen 80ee73e6f0da, Now: 1714667074:
2024-05-02T18:24:34.608879 DBG2 [PolicyManager::matchPolicyPacketInfo] [10.XXX.YYY.250:53<->10.XXX.YYY.254:38940] matched policy: System Default Policy
2024-05-02T18:24:34.608890 DBG4 WebCategoryManager: expired cache for domain: cti.zenarmor.net
2024-05-02T18:24:34.608907 DBG3 [10.XXX.YYY.250:53 <-> 10.XXX.YYY.254:38940] checkExceptions1 (System Base Policy)
2024-05-02T18:24:34.608917 DBG3 [10.XXX.YYY.250:53 <-> 10.XXX.YYY.254:38940] checkExceptions1.5 (cti.zenarmor.net)
2024-05-02T18:24:34.608925 DBG3 [Policy::checkExceptions] hostname: 'cti.zenarmor.net'
2024-05-02T18:24:34.608930 DBG3 [Policy::checkExceptions] hostname: 'cti.zenarmor.net'
2024-05-02T18:24:34.608935 DBG3 [Policy::checkExceptions] hostname: 'cti.zenarmor.net'
2024-05-02T18:24:34.608940 DBG3 [Policy::checkExceptions] hostname: 'cti.zenarmor.net'
2024-05-02T18:24:34.608944 DBG3 [Policy::checkExceptions] hostname: 'cti.zenarmor.net'
2024-05-02T18:24:34.608949 DBG3 [Policy::checkExceptions] hostname: 'cti.zenarmor.net'
2024-05-02T18:24:34.608957 DBG2 [10.XXX.YYY.250:53 <-> 10.XXX.YYY.254:38940] checkExceptions: DNS-pi is not whitelisted
2024-05-02T18:24:34.608966 DBG3 [10.XXX.YYY.250:53 <-> 10.XXX.YYY.254:38940] checkExceptions1 (System Default Policy)
2024-05-02T18:24:34.608975 DBG3 [10.XXX.YYY.250:53 <-> 10.XXX.YYY.254:38940] checkExceptions1.5 (cti.zenarmor.net)
2024-05-02T18:24:34.608988 DBG1 DNSWatcher: duplicate DNS entry: cti.zenarmor.net
2024-05-02T18:24:34.608994 DBG1 DNSWatcher: exceeded max pending resps size dropping cti.zenarmor.net[Policy::checkExceptions] [74bde58f-4c78-4f21-ac9d-4b9ad97e04f4] hostname: 'opnsense'
2024-05-02T18:24:34.609014 DBG3 [Policy::checkExceptions] [74bde58f-4c78-4f21-ac9d-4b9ad97e04f4] hostname: 'opnsense'
2024-05-02T18:24:34.609042 DBG4 WebCategoryManager: expired cache for domain: cti.zenarmor.net


This pattern around cti.zenarmor.net seems to occur only at exactly the times when the worker process stops responding to heartbeats, so one could think it is struggling with its own DNS query.
Grepping the log for "DNSWatcher: exceeded max pending resps size dropping" finds matches only at the times in question.
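
For anyone who wants to repeat this check on their own worker log, here is a rough Python sketch of the grep-and-compare step. It assumes every log line starts with an ISO timestamp, as in the excerpt above. The function names `find_events` and `near`, the 30-second window and the restart time are made up for illustration; the real restart time has to be read from the main process log.

```python
from datetime import datetime

PATTERN = "DNSWatcher: exceeded max pending resps size dropping"


def find_events(lines, pattern=PATTERN):
    """Return the timestamps of all log lines that contain `pattern`."""
    hits = []
    for line in lines:
        if pattern not in line:
            continue
        stamp = line.split(" ", 1)[0]  # first field, e.g. 2024-05-02T18:24:34.608994
        try:
            hits.append(datetime.fromisoformat(stamp))
        except ValueError:
            continue  # line does not start with a timestamp, skip it
    return hits


def near(events, restart_time, window_seconds=30):
    """Keep only the events within `window_seconds` of a worker restart."""
    return [t for t in events
            if abs((t - restart_time).total_seconds()) <= window_seconds]


# Two lines copied from the worker log excerpt above.
SAMPLE_LINES = [
    "2024-05-02T18:24:34.608988 DBG1 DNSWatcher: duplicate DNS entry: cti.zenarmor.net",
    "2024-05-02T18:24:34.608994 DBG1 DNSWatcher: exceeded max pending resps size dropping "
    "cti.zenarmor.net[Policy::checkExceptions] [74bde58f-4c78-4f21-ac9d-4b9ad97e04f4] "
    "hostname: 'opnsense'",
]

events = find_events(SAMPLE_LINES)
# Hypothetical restart time, chosen only to demonstrate the comparison.
assumed_restart = datetime(2024, 5, 2, 18, 24, 55)
print(near(events, assumed_restart))
```

To run it against a real file, pass the open file object to `find_events` instead of `SAMPLE_LINES`.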

Hope this helps,
regards, Stephan
