Messages - falsifyable_entity

#1
Quote from: colourcode on December 16, 2024, 02:19:43 PM
Quote from: falsifyable_entity on December 16, 2024, 11:35:44 AM
Nope, not even one VLAN, besides Unbound I have pretty much nothing going on

It does sound pretty much exactly like the problem I'm having though and I still have the problem but the initial super long load times are for the most part gone.

Does it happen if you don't have the webgui open at all? Mine never stalls when I'm SSH'd into it but as soon as I open the GUI (dashboard) it 100% all cores immediately with PHP.

Mind checking over SSH with top -P? Start top, download a Steam game, start a speed test etc., and then open the dashboard and see if you can reproduce it that way. Assuming it's not fully borked without even doing anything.

I can use the GUI fine as long as I'm using Spotify / YouTube and browsing the net, but any heavy load and it's game over.

It happens regardless of what I do or how much traffic is going through; it's basically random for all intents and purposes. Sometimes it happens with literally 0 load, sometimes when I am actively doing something. It does not matter whether I have an SSH session or the web dashboard open.
#2
Quote from: colourcode on December 15, 2024, 08:49:10 PM
Quote from: falsifyable_entity on December 13, 2024, 11:48:40 AM
Quote from: newsense on December 13, 2024, 03:58:57 AM
You didn't answer my question about power mgmt features enabled in the BIOS...

Sorry, my bad, there are no power saving related settings in the BIOS, the only one I would consider close is auto boot when power is supplied.

Are you running plenty of VLANs?

Could be completely unrelated problems, but mine seems to be semi-remediated.

Noticed my webgui log was LOADED with dead/dying sessions. Running plenty of vlans and using my normal FQDN for access, guessing it chose different IPs or similar which could've been the reason gui loaded so damn slow (everywhere).

  • I put a 10 minute session timeout in settings > administration. Default 240 min.
  • Added a dns host override entry outside of the search domain.

It still completely shits the bed when working with a lot of traffic, but it seems to only happen on the dashboard now. Doesn't really seem to be related to netflow either, as it happens without traffic charts running. But I'm much too stupid to find the actual cause. The GUI is snappy again in most other areas even during higher load.

Nope, not even one VLAN, besides Unbound I have pretty much nothing going on
#3
Quote from: newsense on December 13, 2024, 03:58:57 AM
You didn't answer my question about power mgmt features enabled in the BIOS...

Sorry, my bad, there are no power saving related settings in the BIOS, the only one I would consider close is auto boot when power is supplied.
#4
Running the debug kernel now. For some reason the debug kernel crashes LESS than the normal one, but it does freeze nonetheless. Unfortunately, after a few crashes this is what /var/crash looks like:

root@Sense:/var/crash # ls -la
total 10
drwxr-x---   2 root wheel  3 Dec  4 23:42 .
drwxr-xr-x  28 root wheel 28 Dec  2 20:45 ..
-rw-r--r--   1 root wheel  5 Dec  2 20:45 minfree
root@Sense:/var/crash # cat minfree
2048
root@Sense:/var/crash #
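
For what it's worth, the "is there anything in /var/crash at all" check could be scripted with the Python that already ships on the box. This is only a rough sketch: `find_crash_dumps` is a made-up helper name, and treating `minfree` and `bounds` as the only non-dump bookkeeping files in that directory is an assumption on my part.

```python
import os

# Assumption: these are savecore's bookkeeping files, not actual dumps.
BOOKKEEPING = {"minfree", "bounds"}


def find_crash_dumps(directory="/var/crash"):
    """Return the sorted names of regular files that are not bookkeeping files."""
    found = []
    for entry in os.listdir(directory):
        if entry in BOOKKEEPING:
            continue
        if os.path.isfile(os.path.join(directory, entry)):
            found.append(entry)
    return sorted(found)
```

With the listing above, `find_crash_dumps()` would return an empty list, meaning no dump was saved for any of the freezes.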
#5
@newsense
Applied your changes, awaiting results.
#6
I doubt it's the CPU; the somewhat overpriced Deciso DEC677 has a worse CPU than the one in my box, and they basically guarantee it will always work.
Even when I load websites with idiotic amounts of iframes and media that load from all kinds of domains, the CPU usage barely reaches 40%, and I haven't seen it exceed 60% ever; even iperf (with all HW offload disabled) does not push the CPU past 60%. As for RAM, I have never seen it use more than 3G of the 8 available, and on 24.1.10 it was fine with an even longer blocklist (I reduced it, then tried disabling it thinking it might help, before turning off Unbound altogether).
#7
Intel NIC I225-V as mentioned in the OG post, no VLANs. The only hardware involved besides the OPN box is an OpenWRT router that acts only as a dumb switch and an AP and does nothing else.
The only services I am running are NTP, DHCP, and DNS (Unbound with blocklists), nothing more.
I also ran across the idea of remote log storage and set that up with a Raspberry Pi 4, but it just had the exact same logs as if they were stored locally; nothing of note was ever logged to it no matter how many times OPN crashed. My guess is the crash is so severe it just kills the logging alongside everything else.
Since I haven't been able to get anything useful out of a remote log device, I removed it.
#8
This box has an external 12V 5A PSU brick, and I did swap it with another one I had from a different device; it hasn't changed anything in the behavior.
Also, the 2 day Linux test was conducted on the original power brick, and as I already mentioned, it ran fine for far longer than OPN ever could without freezing.
#9
There's absolutely nothing of note in the log files listed in the linked page: no errors, everything in order.
I already tried disabling services, but it seems to be unrelated to what services I run, since even with DNS, DHCP and NTP disabled, basically doing nothing, it still did the same thing.
#10
# uname -v
FreeBSD 14.1-RELEASE-p6 stable/24.7-n267981-8375762712f SMP


This issue began months ago, the day I moved off 24.1.10. I figured an update would just fix it, but unfortunately the issue persisted and got really annoying, hence this thread.
#11
I also noticed that the wireless interface (the machine has one but I never used it) has disappeared. The last time I remember seeing it was on 24.1.10.
#12
Dmesg also shows absolutely nothing of note, but I cannot see it after the freeze because the machine does not accept ANY interaction, not from plugged in peripherals nor SSH nor serial
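
Since nothing survives the freeze, one thing that might at least pin down when it hits is a heartbeat file that gets forced to disk every few seconds; the last line written before the lockup gives a rough timestamp and the load at that moment. A minimal sketch, assuming the stock Python on the box. The function names, the log path and the 10 second interval are all made up for illustration:

```python
import os
import time
from datetime import datetime, timezone


def write_heartbeat(path):
    """Append one timestamped load-average line and force it onto disk."""
    load1, load5, load15 = os.getloadavg()
    stamp = datetime.now(timezone.utc).isoformat(timespec="seconds")
    line = f"{stamp} load={load1:.2f},{load5:.2f},{load15:.2f}\n"
    with open(path, "a") as fh:
        fh.write(line)
        fh.flush()
        os.fsync(fh.fileno())  # ask the OS to commit the line before a possible freeze
    return line


def run(path, interval=10, count=None):
    """Write a heartbeat every `interval` seconds; loop forever if count is None."""
    written = 0
    while count is None or written < count:
        write_heartbeat(path)
        written += 1
        time.sleep(interval)
```

To use it, add a call such as `run("/root/heartbeat.log")` at the bottom of the file and start the script in the background. After a reboot, the last line in the file shows roughly when the box stopped responding, which can then be lined up against the other logs.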
#13
I tested this machine with Linux running for 2 days straight with the 'stress' command running, and it was solid as a rock. I also ran memtest and the memory tested green... That's why I am utterly baffled by this.
#14
Here's the output of the command:

# ls -ltrh /var/crash && df -hT
total 1
-rw-r--r--  1 root wheel    5B Dec  2 20:45 minfree
Filesystem                 Type       Size    Used   Avail Capacity  Mounted on
zroot/ROOT/default         zfs        221G    1.6G    219G     1%    /
devfs                      devfs      1.0K      0B    1.0K     0%    /dev
/dev/gpt/efiboot0          msdosfs    260M    1.3M    259M     1%    /boot/efi
zroot/tmp                  zfs        219G    200K    219G     0%    /tmp
zroot/var/log              zfs        219G    117M    219G     0%    /var/log
zroot                      zfs        219G     96K    219G     0%    /zroot
zroot/var/audit            zfs        219G     96K    219G     0%    /var/audit
zroot/home                 zfs        219G     96K    219G     0%    /home
zroot/usr/src              zfs        219G     96K    219G     0%    /usr/src
zroot/usr/ports            zfs        219G     96K    219G     0%    /usr/ports
zroot/var/tmp              zfs        219G     10M    219G     0%    /var/tmp
zroot/var/crash            zfs        219G     96K    219G     0%    /var/crash
zroot/var/mail             zfs        219G     96K    219G     0%    /var/mail
devfs                      devfs      1.0K      0B    1.0K     0%    /var/dhcpd/dev
devfs                      devfs      1.0K      0B    1.0K     0%    /var/unbound/dev
/usr/local/lib/python3.11  nullfs     221G    1.6G    219G     1%    /var/unbound/usr/local/lib/python3.11
/lib                       nullfs     221G    1.6G    219G     1%    /var/unbound/lib


The only service of note I am running is Unbound with blocklists, that's it. I cut everything else out trying to isolate this issue.
#15
24.7, 24.10 Production Series / Constant lockups/crashes
December 08, 2024, 10:33:36 PM
Running on a physical system:

Celeron(R) J4125 CPU @ 2.00GHz (4 cores, 4 threads)
8 GB of RAM
256 GB NVMe SSD
Intel Ethernet Controller I225-V


Hardware is tested to be all functional.

The system locks up: the web UI and SSH become unresponsive. It keeps routing fine, but DNS also dies. There is absolutely NOTHING of note in the logs. This only started happening after I updated from 24.1.10 to the 24.7 branch. Before the update it was stable.

I have done a memtest and an fsck, and everything came up green. I also tried re-applying ALL the imported settings manually. I have no other ideas as to what I should do.

I have exhausted all ideas I had to diagnose this. Can someone with more brains than me help?