I am having a strange issue. I was running 25.7.4 without any problems and applied 25.7.5 on the 10th of October. On the morning of the 11th, at around 5:23 AM, the whole network went offline. I tried to connect to OPNsense but could not. I run OPNsense as a VM in Proxmox, and I noticed that the OPNsense VM was suddenly showing 90%+ CPU. I could not connect to the console, and the serial console was also unresponsive. I had to force restart the VM.
The same thing happened on the 12th and 13th of October. On the 13th, I reverted to the 25.7.4 snapshot and observed. There were no CPU issues on the 14th, 15th and 16th.
On the 16th, I reapplied 25.7.5. It worked on the 17th, but today, on the morning of the 18th at 5:40 AM, the OPNsense VM again suddenly went to 90% CPU and became unresponsive. The VM had to be force restarted.
After restoring the 25.7.4 snapshot on the 13th, I created a script that writes CPU usage to a file every 2 minutes between 4:00 AM and 6:00 AM, and scheduled it through cron in OPNsense. This morning, after I recovered the VM from its unresponsive state, I checked the CPU usage log and there was no entry after 5:40 AM. OPNsense was hung so badly that it did not even run the cron job. The CPU usage entries resumed at 5:50 AM after the reboot. To summarize:
25.7.5 applied on 10th.
11th : CPU 90% and OPNsense unresponsive at 5:23 AM
12th : CPU 90% and OPNsense unresponsive at 5:00 AM
13th : CPU 90% and OPNsense unresponsive at 5:00 AM
25.7.4 snapshot restored on 13th.
14th : no issue
15th : no issue
16th : no issue
25.7.5 re-applied on 16th.
17th : no issue
18th : CPU 90% and OPNsense unresponsive at 5:40 AM
The high CPU always happens between 5:00 AM and 6:00 AM. I did not make any changes to the OPNsense configuration during this test. This rules out everything except 25.7.5 as the source of the problem. What can I look at to find out what is causing this behavior?
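Since the CPU log described above only records overall usage, one option is to also record which processes are busiest in the minutes before the hang. The following is a minimal sketch of such a logger, not the script actually used here. It assumes Python 3 and a `ps` that accepts `-A -o pcpu,pid,comm`; the log path and the names `top_processes` and `format_entry` are made up for illustration.

```python
#!/usr/bin/env python3
"""Append a timestamped list of the busiest processes to a log file."""
import os
import subprocess
import time

LOG_PATH = "/var/log/cpu_watch.log"  # hypothetical path, adjust as needed
TOP_N = 5                            # how many processes to record per sample


def top_processes(ps_output, n=TOP_N):
    """Return the n busiest (cpu_percent, pid, command) tuples from ps output."""
    rows = []
    for line in ps_output.splitlines()[1:]:  # skip the header row
        parts = line.split(None, 2)
        if len(parts) < 3:
            continue
        try:
            rows.append((float(parts[0]), int(parts[1]), parts[2].strip()))
        except ValueError:
            continue  # ignore lines that do not parse as numbers
    rows.sort(key=lambda row: row[0], reverse=True)
    return rows[:n]


def format_entry(timestamp, rows):
    """Build one log line such as '<time> name[pid]=12.3%, ...'."""
    body = ", ".join(f"{cmd}[{pid}]={cpu:.1f}%" for cpu, pid, cmd in rows)
    return f"{timestamp} {body}\n"


def main():
    # Assumption: this ps invocation lists all processes with their CPU share.
    result = subprocess.run(
        ["ps", "-A", "-o", "pcpu,pid,comm"],
        capture_output=True, text=True, check=True,
    )
    entry = format_entry(time.strftime("%Y-%m-%d %H:%M:%S"),
                         top_processes(result.stdout))
    with open(LOG_PATH, "a") as fh:
        fh.write(entry)
        fh.flush()
        os.fsync(fh.fileno())  # ask the OS to write it out before a possible hard reset


if __name__ == "__main__":
    main()
```

Run from cron every minute or two during the 4:00 AM to 6:00 AM window, the last entries before the gap in the log would show which process, if any, was climbing before the VM stopped responding. If the hang is in the kernel rather than in a user process, this log may show nothing unusual, which would itself narrow things down.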
Thanks