[SOLVED] hostwatch at 100% CPU

Started by tessus, January 17, 2026, 03:54:15 PM

Interfaces: Settings: ARP Handling is dead, long live the two tunables :)

But yes something similar needs to be done. We'll have a coordination meeting later about it and try to work through the reported items.


Cheers,
Franco

Maybe I just don't understand it completely but those movement messages seem odd to me.

In my log I see the same message over and over again:
INFO hostwatch: changed ethernet address host 00:1d:63:63:eb:35 moved from 64:62:66:22:44:8c to fe80::6662:66ff:fe22:448c at igc1
The message makes no sense to me, how can it move from a MAC address to an IPv6 address? Could there be a confusion of variables in the code?

Also the fact that it is the same message over and over again kind of makes me a bit suspicious that there might be an error in the code where the previous and current address are compared, or that the state isn't updated correctly.

I woke up this morning to my firewall at 100% CPU after upgrading to 25.7.11_1 last night. A reboot seemed to fix it, but then I came across this thread and also the _2 hotfix, which I have now applied. I have tried to configure it to watch only my LAN interface, but anything other than All keeps the service from running. In the logs I see:
2026-01-19T11:18:57 Notice kernel <6>[18592] pid 77296 (hostwatch), jid 0, uid 0: exited on signal 6 (no core dump - bad address)
2026-01-19T11:18:56 Notice root /usr/local/etc/rc.d/hostwatch: WARNING: failed to start hostwatch

For now I have just disabled it.

@troplin that's fixed, but the message has been disabled for now to avoid excessive logging

@Taomyn it's in the list of things to fix this week


Cheers,
Franco

Quote from: jp0469 on January 19, 2026, 01:16:09 AM
You should definitely demand a refund. Be sure to draw attention to your post count so the devs know who they're dealing with.

I'm glad you looked at your own..... don't be an ass

Today at 03:12:34 AM #20 Last Edit: Today at 03:15:35 AM by s1l3nce
I've just applied the latest patch, rebooted and for a while it was all good. After an hour of usage, suddenly I notice very high CPU usage on the hostwatch service.



I'm just stopping this service until they figure it out because it's clearly giving a lot of issues at the moment.

Thanks for this thread, this service was eating all my memory. After disabling it, usage immediately dropped to 28%. What a great solution, to roll this out for everyone as an implemented fix with the service already started...

@franco, what's your recommendation? I'm on 25.7.10. Should I wait until all the issues are solved, or wait for 26.1?

Thanks for your efforts and support.

BR


Quote from: amarek on Today at 08:14:11 AM
Thanks for this thread, this service was eating all my memory. After disabling it, usage immediately dropped to 28%. What a great solution, to roll this out for everyone as an implemented fix with the service already started...
I was away from home and thankfully only the firewall's Web UI became non-functional, so I could still do remote SSH and diagnose the problem. For me the new service silently ate up 52GB of space for logging alone in less than 2 days and somewhat stalled the system as a result. I even read the changelog and noticed it but didn't think much at the time.
So, it's one of those blunders with an unexpectedly high impact, yes, but it's rare. And they did promptly push out hotfixes to remedy the issue on reasonably short notice.

@franco Even if the log message has been fixed (and is now disabled), it still makes no sense:

Firstly,
00:1d:63:63:eb:35 is the MAC address of my dishwasher and
64:62:66:22:44:8c is the LAN interface of the OPNsense box itself. The IPv6 address is the link-local address of the OPNsense box.
So why would hostwatch think that the LLA of the OPNsense box itself has been previously used by my dishwasher?
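As a side note on the addresses above: the IPv6 address in the log line is exactly the EUI-64 link-local address derived from the "from" MAC, so both values belong to the same interface. A minimal sketch of that derivation (the function name `eui64_link_local` is made up for illustration and is not part of hostwatch):

```python
import ipaddress


def eui64_link_local(mac: str) -> ipaddress.IPv6Address:
    """Derive the modified EUI-64 link-local address for a 48-bit MAC."""
    octets = bytearray(int(part, 16) for part in mac.split(":"))
    if len(octets) != 6:
        raise ValueError(f"expected 6 octets, got {len(octets)}")
    # Flip the universal/local bit of the first octet.
    octets[0] ^= 0x02
    # Insert ff:fe between the two halves of the MAC.
    interface_id = bytes(octets[:3]) + b"\xff\xfe" + bytes(octets[3:])
    # fe80::/64 prefix: fe80 followed by 48 zero bits.
    prefix = bytes.fromhex("fe80") + bytes(6)
    return ipaddress.IPv6Address(prefix + interface_id)


lla = eui64_link_local("64:62:66:22:44:8c")
print(lla)  # fe80::6662:66ff:fe22:448c
```

So the log line pairs the dishwasher's MAC with the firewall's own MAC and the firewall's own link-local address.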

Secondly, the message is always just "host X moved from A to B". Shouldn't the database be updated to reflect that after the first time? There are no messages the opposite way, i.e. "host X moved from B to A".

I still believe that the logging issue is just a symptom of the actual problem, e.g. you're somehow comparing the wrong addresses.
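To illustrate the behaviour expected in the second point, here is a minimal in-memory sketch. It is not how hostwatch works internally; `MoveTracker` and its fields are hypothetical names. The stored state is updated on every observation, so a move is reported once and a move back is reported in the opposite direction.

```python
class MoveTracker:
    """Remember the last address seen per host and report changes once."""

    def __init__(self):
        self._last_seen = {}

    def observe(self, host, address):
        previous = self._last_seen.get(host)
        # Always store the latest address, so the next comparison
        # is made against up-to-date state.
        self._last_seen[host] = address
        if previous is None or previous == address:
            return None
        return f"host {host} moved from {previous} to {address}"


tracker = MoveTracker()
events = [tracker.observe("X", address) for address in ["A", "B", "B", "A"]]
for event in events:
    print(event)
```

With state handled like this, seeing B a second time produces no message, which is the opposite of the repeated log line described above.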

Today at 10:02:48 AM #25 Last Edit: Today at 10:21:50 AM by troplin
@franco maybe this line?
https://github.com/opnsense/hostwatch/blob/3000f8f6611c098a7e7d01eaa0253b31c6af9ca3/src/database.rs#L141

Shouldn't that be
when real_ether_address = excluded.real_ether_address
Instead of
when ether_address = excluded.real_ether_address?

EDIT:
Also, this condition is always true:
https://github.com/opnsense/hostwatch/blob/3000f8f6611c098a7e7d01eaa0253b31c6af9ca3/src/lib.rs#L196
Because in the SQL statement prev_ethernet_address is only updated when ethernet_address actually changes.
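A small SQLite sketch of the effect described in the EDIT. The table and column names here are invented for the example and are not the hostwatch schema. The previous address is only rewritten when the current one changes, so a check of "previous differs from current" stays true on every later sighting.

```python
import sqlite3

# Hypothetical upsert: keep the old address in prev_ether_address
# only when the incoming address differs from the stored one.
UPSERT = """
INSERT INTO hosts (ip_address, ether_address, prev_ether_address)
VALUES (?, ?, NULL)
ON CONFLICT(ip_address) DO UPDATE SET
    prev_ether_address = CASE
        WHEN hosts.ether_address = excluded.ether_address
            THEN hosts.prev_ether_address
        ELSE hosts.ether_address
    END,
    ether_address = excluded.ether_address
"""


def observe(conn, ip, mac):
    """Record one sighting and return (current, previous) for that IP."""
    conn.execute(UPSERT, (ip, mac))
    return conn.execute(
        "SELECT ether_address, prev_ether_address FROM hosts WHERE ip_address = ?",
        (ip,),
    ).fetchone()


def looks_moved(row):
    """Naive check done after the write: previous differs from current."""
    current, previous = row
    return previous is not None and previous != current


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE hosts ("
    "ip_address TEXT PRIMARY KEY, "
    "ether_address TEXT NOT NULL, "
    "prev_ether_address TEXT)"
)

sightings = ["aa:aa:aa:aa:aa:01", "aa:aa:aa:aa:aa:02",
             "aa:aa:aa:aa:aa:02", "aa:aa:aa:aa:aa:02"]
results = [looks_moved(observe(conn, "192.0.2.10", mac)) for mac in sightings]
print(results)  # [False, True, True, True]
```

One real move keeps being reported on every later sighting, because the stored previous address never catches up with the current one. Deciding whether a host moved before the write, or as part of it, would avoid that.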

@troplin best track coding things in the repository in an issue.
Hardware:
DEC740

Quote from: Monviech (Cedrik) on Today at 10:18:50 AM
@troplin best track coding things in the repository in an issue.

Ok, can do that. Is there already one that fits or should I create a new one?

Create a new one and point out the issue that you think exists; that way it can be evaluated and potentially fixed. Thank you.

https://github.com/opnsense/hostwatch/issues/21

Hope that works for you. I'm sick (and bored) in my bed and typing on my phone, not the ideal environment for programming stuff...