Zenarmor Cloud Nodes Status

Started by just4fun, May 18, 2024, 12:29:42 PM

May 24, 2024, 07:49:27 PM #15 Last Edit: May 24, 2024, 07:54:21 PM by Seimus
He meant in the UI; there is an enable/disable button at the top of that section.


@Sy

I upgraded today to 24.1.7 & 1.17.3

The Cloud Reputation Servers list is indeed empty; there is nothing in that section after upgrading to 1.17.3.

Looks like something is failing

6c348fd6-a31a-4f60-805e-85accda81ccb] Script action failed with Command '/usr/local/opnsense/scripts/OPNsense/Zenarmor/nodes_status.py --mode 'read'' returned non-zero exit status 1. at Traceback (most recent call last):
  File "/usr/local/opnsense/service/modules/actions/script_output.py", line 44, in execute
    subprocess.check_call(script_command, env=self.config_environment, shell=True,
  File "/usr/local/lib/python3.11/subprocess.py", line 413, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '/usr/local/opnsense/scripts/OPNsense/Zenarmor/nodes_status.py --mode 'read'' returned non-zero exit status 1.


This looks like it is related to the issue in your previous post:
https://forum.opnsense.org/index.php?topic=40513.msg198743#msg198743

Running

pkg install -fy os-sensei

Fixes it.

Regards,
S.
Networking is love. You may hate it, but in the end, you always come back to it.

OPNSense HW
APU2D2 - deceased
N5105 - i226-V | Patriot 2x8G 3200 DDR4 | L 790 512G - VM HA(SOON)
N100   - i226-V | Crucial 16G  4800 DDR5 | S 980 500G - PROD

Hi Stephan,

Do you still have this issue?

Yes, the node status is still "DOWN" 0% after some time. I'm still not sure whether it is really an issue or whether the system is using
the cache. Only my wife and I are behind this Zenarmor, so not many different websites are visited.

Hi,

They should appear as online even if the results come from the cache. Is it the same if you restart the Engine service?

Hi,

I was away on holiday for a few days, so I didn't follow this thread, sorry.

I now have 1.17.4 installed, same behaviour. But I have been able to nail it down.

Because my ISP forces a PPPoE Reset every 24 hours to renew the IP Address assigned to my Firewall,
I am running a cron job which performs a regular interface Reset of the PPPoE Interface in the very early
morning at 4.00 to always have the IP Renewal at night.

It seems that this is the point in time where the node status goes down and does not recover (which is what I would expect it to do). I verified this by having another cron job reset the interface during "waking hours" ;-).
Right before execution I restarted Zenarmor to get the node status to 100%; right after execution it was down at 0%
and didn't recover in the following 10 minutes.


Thanks a lot,
Stephan

Hi Stephan,

Zenarmor checks the node status every 10 minutes.



... repeated the test. Node Status goes Down right after PPPoE Interface reset via cron job and stays down,
now for > 2 hours.

Hi,

Please increase the log level in Settings - Logging - Level from INFO to DEBUG4, then share the /usr/local/zenarmor/log/active/worker*.log files with the support team. You can use the Have Feedback option to create a support ticket and attach the worker*.log files to it.


It looks like Zenarmor either misses that the public IP address changes, or it ignores the change and does not open
a new connection to the CTI servers.

root@opnsense:/usr/local/etc # netstat -n | grep 5355
udp4       0      0 185.171.XXX.YYY.13285   35.198.172.108.5355   
udp4       0      0 185.171.XXX.YYY.44679   34.65.117.157.5355     
udp4       0      0 185.171.XXX.YYY.25812   34.65.117.157.5355     
udp4       0      0 185.171.XXX.YYY.56199   35.198.172.108.5355   

After the PPPoE interface reset, netstat still shows the UDP connections from the old IP address instead of the new IP address on the PPPoE interface.

In the logs the Cloud Reputation Servers just go from healthy to unhealthy.
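
To make the check above repeatable, here is a minimal Python sketch that scans `netstat -n` output for UDP sockets to port 5355 whose local address is not the current WAN address. The function name `stale_sockets` and the sample addresses (taken from the documentation ranges) are made up for illustration, and the parser assumes the column layout shown in the output above (local address in the fourth column, foreign address in the fifth). It is not part of Zenarmor.

```python
def stale_sockets(netstat_output, current_ip, remote_port=5355):
    """Return (local, foreign) pairs to remote_port whose local IP is not current_ip."""
    stale = []
    for line in netstat_output.splitlines():
        fields = line.split()
        # Expected row: proto recv-q send-q local-address foreign-address [state]
        if len(fields) < 5 or not fields[0].startswith("udp"):
            continue
        # FreeBSD netstat prints "ip.port", so split on the last dot.
        local_ip, _, _local_port = fields[3].rpartition(".")
        _foreign_ip, _, foreign_port = fields[4].rpartition(".")
        if foreign_port == str(remote_port) and local_ip != current_ip:
            stale.append((fields[3], fields[4]))
    return stale


# Hypothetical sample: one socket still bound to the old address 203.0.113.10.
SAMPLE = """\
udp4       0      0 203.0.113.10.13285   35.198.172.108.5355
udp4       0      0 198.51.100.7.44679   34.65.117.157.5355
tcp4       0      0 198.51.100.7.443     192.0.2.1.5355      ESTABLISHED
"""

if __name__ == "__main__":
    for local, foreign in stale_sockets(SAMPLE, "198.51.100.7"):
        print("stale:", local, "->", foreign)
```

On the firewall itself the input would be the real `netstat -n` output and `current_ip` the address currently assigned to the PPPoE interface. A non-empty result after the interface reset matches the behaviour described above.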

My workaround for now is to restart Zenarmor after a new IP is acquired by adding this to
/usr/local/etc/rc.newwanip

/* restart zenarmor engine to re-establish connection to CTI servers */
log_msg("Restarting zenarmor engine");
mwexecf('/usr/local/sbin/zenarmorctl engine restart');

seems to work ok for now.



Hi,

Thanks for reporting the issue. We are going to investigate it and get back to you.


June 20, 2024, 12:01:42 PM #25 Last Edit: June 20, 2024, 05:08:29 PM by just4fun
The workaround has been working fine for some days now. So it is clear that under normal circumstances
Zenarmor maintains a good Cloud Node Status. It just does not recover from a PPPoE interface
reset. (It may do so if the carrier just disconnects the interface to enforce IP address renewal;
I didn't test that scenario, but it may be different from performing an interface reset from a cron job
to force the IP address renewal at a specific time.)

Upgrading to OPNsense 24.1.9_3-amd64 didn't improve this. I had to re-add the zenarmorctl engine restart into
rc.newwanip.

best regards,
Stephan

I would like to add that I have exactly the same problem.
I configured my modem to reconnect at 5:00 early in the morning, so that no IP change will occur during the day.
Every day the Cloud Nodes Status switches to the Offline state.

Thx @just4fun for posting a temporary fix.

This is a known cosmetic issue, you can check by restarting the Engine. You can follow the updates.

I don't think it is just a cosmetic issue. Zenarmor loses connectivity to its cloud nodes, so it is unable
to query them for reputation information. Restarting the engine re-establishes the connection; that's why I still use my workaround described earlier in this thread, adding a zenarmor engine restart command to the end of
/usr/local/etc/rc.newwanip.
I have to re-edit the file after every update. While I can live with it, I still think it should be fixed.
It cannot be that difficult to re-establish those connections from scratch when they are lost,
instead of sitting there with known-to-be-offline connections and waiting forever. I'd expect a few lines of code
to do that.

Regards,
Stephan
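
As a rough illustration of the "re-establish from scratch" idea in the post above, here is a minimal Python sketch of a UDP client that throws away its socket after a few consecutive failures and opens a new one, so the new socket picks up whatever source address the interface has at that moment. This is a generic pattern, not Zenarmor's code; the class name `ReconnectingUdpClient`, its parameters and the failure threshold are assumptions made for the example.

```python
import socket


class ReconnectingUdpClient:
    """UDP client that rebuilds its socket after repeated failures (illustrative only)."""

    def __init__(self, server, timeout=2.0, max_failures=3, socket_factory=None):
        self.server = server              # (host, port) tuple
        self.timeout = timeout
        self.max_failures = max_failures
        # The factory is injectable so the logic can be exercised without a network.
        self._factory = socket_factory or (
            lambda: socket.socket(socket.AF_INET, socket.SOCK_DGRAM))
        self._sock = None
        self.failures = 0
        self.reconnects = 0

    def _open(self):
        sock = self._factory()
        try:
            sock.settimeout(self.timeout)
            # connect() on a UDP socket fixes the peer and lets the OS choose
            # the local source address from the current routing table.
            sock.connect(self.server)
        except OSError:
            sock.close()
            raise
        self._sock = sock

    def _reset(self):
        # Drop the (possibly stale) socket; the next query opens a fresh one.
        if self._sock is not None:
            self._sock.close()
        self._sock = None
        self.failures = 0
        self.reconnects += 1

    def query(self, payload):
        """Send payload and return the reply bytes, or None if the attempt failed."""
        try:
            if self._sock is None:
                self._open()
            self._sock.send(payload)
            reply = self._sock.recv(4096)
        except OSError:  # includes timeouts
            self.failures += 1
            if self.failures >= self.max_failures:
                self._reset()
            return None
        self.failures = 0
        return reply
```

The point of the sketch is only that a client which counts failures can recover on its own, without an external restart hooked into rc.newwanip. How the real engine manages its connections is not known from this thread.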

Hi @just4fun,

You're right, and it is a known issue. It will be fixed in the next major version. Thanks for reporting it.