Intel PC: 24.7.2 install stopped routing/switching last night

Started by CanadaGuy, August 24, 2024, 05:54:40 PM

I updated to 24.7 shortly after it was released, then to .1 and .2 on August 21st. Things had been fine, but at some point last night it simply stopped switching VLANs (at least), and maybe routing too. All I know is that all my VLAN trunks and routing stopped working. A reset started it up again.

I couldn't reasonably use console access because my password is stupidly long and I don't have an easy-to-use console, and I didn't want to try troubleshooting over my phone's LTE connection.

I'm not experienced in looking for/parsing logs. Is there a way to find out when and why it stopped last night? I've already rebooted, so does that mean logs are gone?

Which hardware? Which NICs?

For example, I226 NICs are known to sometimes stall if ASPM is enabled. That would not necessarily be attributable to 24.7.2.
Intel N100, 4* I226-V, 2* 82559, 16 GByte, 500 GByte NVME, ZTE F6005

1100 down / 800 up, Bufferbloat A+

It's an OptiPlex 5050 SFF, so the LAN is the integrated Intel® I219-V Ethernet LAN 10/100/1000.
WAN is IO Crest 2.5 Gigabit Ethernet PCI Express PCI-E Network Interface Card 10/100/1000/2500 Mbps RJ45 LAN Intel I225 Chipset, Black, SY-PEX24076.

It has been running great for a year and a half (mostly; a few screw-ups on my part), but I haven't made any layer 2 or layer 3 changes in probably a year.

Your statement is intriguing because I could imagine that a new release with many upgrades may result in new/changed settings.

I do not know of any specific problem with those NICs, but that does not say much.
The OptiPlex is fairly old, so it could be the usual "6 weeks or 5 years" rule (hardware is said to fail most often in the first 6 weeks or after 5 years, less often in between), so I would definitely watch this.

Whether you can look at the logs and identify a panic (if there was one) or another fault depends on where you keep your logs: /var/log can be set to reside in RAM, in which case a reboot wipes it.

Other things to consider are: Is there enough free storage space? Is the disk O.K. (test with smartctl)?
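
As a rough illustration of the free-space check, here is a minimal Python sketch. The path /var/log and the 90% threshold are assumptions chosen for the example, not OPNsense defaults, and it says nothing about disk health, which still needs smartctl.

```python
import shutil


def percent_used(total, used):
    """Return used space as a percentage of total (0.0 if total is 0)."""
    return 100.0 * used / total if total else 0.0


def check_free_space(path="/var/log", warn_at=90.0):
    """Return (percent_used, is_over_threshold) for the filesystem holding path.

    path and warn_at are illustrative assumptions; adjust them to your setup.
    """
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    pct = percent_used(usage.total, usage.used)
    return pct, pct >= warn_at


if __name__ == "__main__":
    pct, too_full = check_free_space("/")
    print(f"/ is {pct:.1f}% full" + (" (WARNING)" if too_full else ""))
```

Running it against the filesystem that holds the logs gives a quick yes/no on whether a full disk is a plausible culprit.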

24.7.2 certainly still has a few problems, but none that I know of which manifest like yours.

Reboot brought everything back, so it doesn't seem like a systemic malfunction. I'm inclined to think that with it being new software, there could be something else going on that requires the right conditions.

It looks like local logging is enabled, which is the default per the image (I would never have changed it).

Thinking about another situation, UI EdgeRouters can crash when the local fs is filled up with logging. Interestingly, I currently have pages and pages (like hundreds) of this:

2024-08-24T12:28:28-04:00 Notice flowd_aggregate.py flowparser failed to unpack flow_times (unpack requires a buffer of 8 bytes)
2024-08-24T12:28:28-04:00 Notice flowd_aggregate.py flowparser failed to unpack agent_info (unpack requires a buffer of 16 bytes)
2024-08-24T12:28:28-04:00 Notice flowd_aggregate.py flowparser failed to unpack if_indices (unpack requires a buffer of 8 bytes)
2024-08-24T12:28:28-04:00 Notice flowd_aggregate.py flowparser failed to unpack octets (unpack requires a buffer of 8 bytes)
2024-08-24T12:28:28-04:00 Notice flowd_aggregate.py flowparser failed to unpack proto_flags_tos (unpack requires a buffer of 4 bytes)
2024-08-24T12:28:28-04:00 Notice flowd_aggregate.py flowparser failed to unpack flow_times (unpack requires a buffer of 8 bytes)
2024-08-24T12:28:28-04:00 Notice flowd_aggregate.py flowparser failed to unpack agent_info (unpack requires a buffer of 16 bytes)
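
To get a feel for how much of this spam there is, the lines can be tallied from an exported log. This is only a sketch: it assumes each line starts with an ISO 8601 timestamp and contains the "flowparser failed to unpack <field>" text exactly as pasted above, and the function names are made up for the example.

```python
import re
from collections import Counter

# Matches the message text seen in the pasted log lines.
UNPACK_FAILURE = re.compile(r"flowparser failed to unpack (\w+)")


def count_unpack_failures(lines):
    """Count flowparser unpack failures per field name (e.g. 'flow_times')."""
    counts = Counter()
    for line in lines:
        match = UNPACK_FAILURE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


def failures_per_day(lines):
    """Count unpack failures per calendar day, using the leading YYYY-MM-DD."""
    counts = Counter()
    for line in lines:
        if UNPACK_FAILURE.search(line):
            counts[line[:10]] += 1
    return counts
```

Feeding it the exported log as a list of lines shows which fields fail and roughly how many messages per day are being written.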


Here's the log (newest entries first) from last night, when it was last known to be working, up to my reboot this morning. August 21st, at the bottom of the log, is when I did the latest update.
2024-08-24T09:32:02-04:00 Notice kernel The Regents of the University of California. All rights reserved.
2024-08-24T09:32:02-04:00 Notice kernel Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
2024-08-24T09:32:02-04:00 Notice kernel Copyright (c) 1992-2023 The FreeBSD Project.
2024-08-24T09:32:02-04:00 Notice kernel ---<<BOOT>>---
2024-08-24T09:32:01-04:00 Notice syslog-ng syslog-ng starting up; version='4.8.0'
2024-08-24T01:11:13-04:00 Notice flowd_aggregate.py vacuum done
2024-08-24T01:11:13-04:00 Notice flowd_aggregate.py vacuum interface_086400.sqlite
2024-08-24T01:11:13-04:00 Notice flowd_aggregate.py vacuum interface_003600.sqlite
2024-08-24T01:11:13-04:00 Notice flowd_aggregate.py vacuum interface_000300.sqlite
2024-08-24T01:11:13-04:00 Notice flowd_aggregate.py vacuum interface_000030.sqlite
2024-08-24T01:11:13-04:00 Notice flowd_aggregate.py vacuum dst_port_086400.sqlite
2024-08-24T01:11:13-04:00 Notice flowd_aggregate.py vacuum dst_port_003600.sqlite
2024-08-24T01:11:13-04:00 Notice flowd_aggregate.py vacuum dst_port_000300.sqlite
2024-08-24T01:11:13-04:00 Notice flowd_aggregate.py vacuum src_addr_086400.sqlite
2024-08-24T01:11:13-04:00 Notice flowd_aggregate.py vacuum src_addr_003600.sqlite
2024-08-24T01:11:13-04:00 Notice flowd_aggregate.py vacuum src_addr_000300.sqlite
2024-08-24T01:11:13-04:00 Notice flowd_aggregate.py vacuum src_addr_details_086400.sqlite
2024-08-23T17:11:02-04:00 Notice flowd_aggregate.py vacuum done
2024-08-23T17:11:02-04:00 Notice flowd_aggregate.py vacuum interface_086400.sqlite
2024-08-23T17:11:02-04:00 Notice flowd_aggregate.py vacuum interface_003600.sqlite
2024-08-23T17:11:02-04:00 Notice flowd_aggregate.py vacuum interface_000300.sqlite
2024-08-23T17:11:02-04:00 Notice flowd_aggregate.py vacuum interface_000030.sqlite
2024-08-23T17:11:02-04:00 Notice flowd_aggregate.py vacuum dst_port_086400.sqlite
2024-08-23T17:11:02-04:00 Notice flowd_aggregate.py vacuum dst_port_003600.sqlite
2024-08-23T17:11:02-04:00 Notice flowd_aggregate.py vacuum dst_port_000300.sqlite
2024-08-23T17:11:02-04:00 Notice flowd_aggregate.py vacuum src_addr_086400.sqlite
2024-08-23T17:11:02-04:00 Notice flowd_aggregate.py vacuum src_addr_003600.sqlite
2024-08-23T17:11:02-04:00 Notice flowd_aggregate.py vacuum src_addr_000300.sqlite
2024-08-23T17:11:02-04:00 Notice flowd_aggregate.py vacuum src_addr_details_086400.sqlite
2024-08-23T09:10:47-04:00 Notice flowd_aggregate.py vacuum done
2024-08-23T09:10:47-04:00 Notice flowd_aggregate.py vacuum interface_086400.sqlite
2024-08-23T09:10:47-04:00 Notice flowd_aggregate.py vacuum interface_003600.sqlite
2024-08-23T09:10:47-04:00 Notice flowd_aggregate.py vacuum interface_000300.sqlite
2024-08-23T09:10:47-04:00 Notice flowd_aggregate.py vacuum interface_000030.sqlite
2024-08-23T09:10:47-04:00 Notice flowd_aggregate.py vacuum dst_port_086400.sqlite
2024-08-23T09:10:47-04:00 Notice flowd_aggregate.py vacuum dst_port_003600.sqlite
2024-08-23T09:10:47-04:00 Notice flowd_aggregate.py vacuum dst_port_000300.sqlite
2024-08-23T09:10:47-04:00 Notice flowd_aggregate.py vacuum src_addr_086400.sqlite
2024-08-23T09:10:47-04:00 Notice flowd_aggregate.py vacuum src_addr_003600.sqlite
2024-08-23T09:10:47-04:00 Notice flowd_aggregate.py vacuum src_addr_000300.sqlite
2024-08-23T09:10:47-04:00 Notice flowd_aggregate.py vacuum src_addr_details_086400.sqlite
2024-08-23T07:07:09-04:00 Notice dhclient dhclient-script: Creating resolv.conf
2024-08-23T07:07:09-04:00 Notice dhclient dhclient-script: New Hostname (igc0): CPE88c9b3bf769e-CMdc360ca0e2cc
2024-08-23T07:07:09-04:00 Notice dhclient dhclient-script: Reason RENEW on igc0 executing
2024-08-23T01:10:39-04:00 Notice flowd_aggregate.py vacuum done
2024-08-23T01:10:39-04:00 Notice flowd_aggregate.py vacuum interface_086400.sqlite
2024-08-23T01:10:39-04:00 Notice flowd_aggregate.py vacuum interface_003600.sqlite
2024-08-23T01:10:39-04:00 Notice flowd_aggregate.py vacuum interface_000300.sqlite
2024-08-23T01:10:39-04:00 Notice flowd_aggregate.py vacuum interface_000030.sqlite
2024-08-23T01:10:39-04:00 Notice flowd_aggregate.py vacuum dst_port_086400.sqlite
2024-08-23T01:10:39-04:00 Notice flowd_aggregate.py vacuum dst_port_003600.sqlite
2024-08-23T01:10:39-04:00 Notice flowd_aggregate.py vacuum dst_port_000300.sqlite
2024-08-23T01:10:39-04:00 Notice flowd_aggregate.py vacuum src_addr_086400.sqlite
2024-08-23T01:10:39-04:00 Notice flowd_aggregate.py vacuum src_addr_003600.sqlite
2024-08-23T01:10:39-04:00 Notice flowd_aggregate.py vacuum src_addr_000300.sqlite
2024-08-23T01:10:39-04:00 Notice flowd_aggregate.py vacuum src_addr_details_086400.sqlite
2024-08-22T17:10:35-04:00 Notice flowd_aggregate.py vacuum done
2024-08-22T17:10:35-04:00 Notice flowd_aggregate.py vacuum interface_086400.sqlite
2024-08-22T17:10:35-04:00 Notice flowd_aggregate.py vacuum interface_003600.sqlite
2024-08-22T17:10:35-04:00 Notice flowd_aggregate.py vacuum interface_000300.sqlite
2024-08-22T17:10:35-04:00 Notice flowd_aggregate.py vacuum interface_000030.sqlite
2024-08-22T17:10:35-04:00 Notice flowd_aggregate.py vacuum dst_port_086400.sqlite
2024-08-22T17:10:35-04:00 Notice flowd_aggregate.py vacuum dst_port_003600.sqlite
2024-08-22T17:10:35-04:00 Notice flowd_aggregate.py vacuum dst_port_000300.sqlite
2024-08-22T17:10:35-04:00 Notice flowd_aggregate.py vacuum src_addr_086400.sqlite
2024-08-22T17:10:35-04:00 Notice flowd_aggregate.py vacuum src_addr_003600.sqlite
2024-08-22T17:10:35-04:00 Notice flowd_aggregate.py vacuum src_addr_000300.sqlite
2024-08-22T17:10:35-04:00 Notice flowd_aggregate.py vacuum src_addr_details_086400.sqlite
2024-08-22T09:09:59-04:00 Notice flowd_aggregate.py vacuum done
2024-08-22T09:09:59-04:00 Notice flowd_aggregate.py vacuum interface_086400.sqlite
2024-08-22T09:09:59-04:00 Notice flowd_aggregate.py vacuum interface_003600.sqlite
2024-08-22T09:09:59-04:00 Notice flowd_aggregate.py vacuum interface_000300.sqlite
2024-08-22T09:09:59-04:00 Notice flowd_aggregate.py vacuum interface_000030.sqlite
2024-08-22T09:09:59-04:00 Notice flowd_aggregate.py vacuum dst_port_086400.sqlite
2024-08-22T09:09:59-04:00 Notice flowd_aggregate.py vacuum dst_port_003600.sqlite
2024-08-22T09:09:59-04:00 Notice flowd_aggregate.py vacuum dst_port_000300.sqlite
2024-08-22T09:09:59-04:00 Notice flowd_aggregate.py vacuum src_addr_086400.sqlite
2024-08-22T09:09:59-04:00 Notice flowd_aggregate.py vacuum src_addr_003600.sqlite
2024-08-22T09:09:59-04:00 Notice flowd_aggregate.py vacuum src_addr_000300.sqlite
2024-08-22T09:09:59-04:00 Notice flowd_aggregate.py vacuum src_addr_details_086400.sqlite
2024-08-22T01:09:05-04:00 Notice flowd_aggregate.py vacuum done
2024-08-22T01:09:05-04:00 Notice flowd_aggregate.py vacuum interface_086400.sqlite
2024-08-22T01:09:05-04:00 Notice flowd_aggregate.py vacuum interface_003600.sqlite
2024-08-22T01:09:05-04:00 Notice flowd_aggregate.py vacuum interface_000300.sqlite
2024-08-22T01:09:05-04:00 Notice flowd_aggregate.py vacuum interface_000030.sqlite
2024-08-22T01:09:05-04:00 Notice flowd_aggregate.py vacuum dst_port_086400.sqlite
2024-08-22T01:09:05-04:00 Notice flowd_aggregate.py vacuum dst_port_003600.sqlite
2024-08-22T01:09:05-04:00 Notice flowd_aggregate.py vacuum dst_port_000300.sqlite
2024-08-22T01:09:05-04:00 Notice flowd_aggregate.py vacuum src_addr_086400.sqlite
2024-08-22T01:09:05-04:00 Notice flowd_aggregate.py vacuum src_addr_003600.sqlite
2024-08-22T01:09:05-04:00 Notice flowd_aggregate.py vacuum src_addr_000300.sqlite
2024-08-22T01:09:05-04:00 Notice flowd_aggregate.py vacuum src_addr_details_086400.sqlite
2024-08-21T17:08:14-04:00 Notice flowd_aggregate.py vacuum done
2024-08-21T17:08:14-04:00 Notice flowd_aggregate.py vacuum interface_086400.sqlite
2024-08-21T17:08:14-04:00 Notice flowd_aggregate.py vacuum interface_003600.sqlite
2024-08-21T17:08:14-04:00 Notice flowd_aggregate.py vacuum interface_000300.sqlite
2024-08-21T17:08:14-04:00 Notice flowd_aggregate.py vacuum interface_000030.sqlite
2024-08-21T17:08:14-04:00 Notice flowd_aggregate.py vacuum dst_port_086400.sqlite
2024-08-21T17:08:14-04:00 Notice flowd_aggregate.py vacuum dst_port_003600.sqlite
2024-08-21T17:08:14-04:00 Notice flowd_aggregate.py vacuum dst_port_000300.sqlite
2024-08-21T17:08:14-04:00 Notice flowd_aggregate.py vacuum src_addr_086400.sqlite
2024-08-21T17:08:14-04:00 Notice flowd_aggregate.py vacuum src_addr_003600.sqlite
2024-08-21T17:08:14-04:00 Notice flowd_aggregate.py vacuum src_addr_000300.sqlite
2024-08-21T17:08:14-04:00 Notice flowd_aggregate.py vacuum src_addr_details_086400.sqlite
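
One way to narrow down when things stopped is to look for the last entry written before the boot marker. A minimal Python sketch, assuming the log is newest-first as pasted, that each line starts with an ISO 8601 timestamp, and that a 60-second slack (an arbitrary choice) is enough to skip lines written during the boot itself:

```python
from datetime import datetime, timedelta

BOOT_MARKER = "---<<BOOT>>---"  # kernel marker seen in the pasted log


def parse_time(line):
    """Return the timezone-aware timestamp that starts a log line."""
    return datetime.fromisoformat(line.split(" ", 1)[0])


def last_entry_before_boot(lines, slack=timedelta(seconds=60)):
    """Return (boot_time, last_line_before_boot) for newest-first log lines.

    Lines within `slack` of the boot marker are treated as part of the boot.
    Returns (None, None) if there is no boot marker, and (boot_time, None)
    if nothing older than the boot is present.
    """
    lines = [line for line in lines if line.strip()]
    for i, line in enumerate(lines):
        if BOOT_MARKER in line:
            boot_time = parse_time(line)
            for older in lines[i + 1:]:
                if boot_time - parse_time(older) > slack:
                    return boot_time, older
            return boot_time, None
    return None, None
```

On the excerpt above this points at the 01:11:13 "vacuum done" entry, which only bounds the time of the failure; it does not show the cause.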

Possibly netflow data is just too much for your device. Consider disabling it.
Deciso DEC750
People who think they know everything are a great annoyance to those of us who do. (Isaac Asimov)

If disabling is not an option, at least go to Reporting - Settings and reset the RRD and NetFlow data.

Thanks, I'll try those. I never look at the data anyway.

For reference, it's a 4-core 3 GHz CPU with 8 GB of RAM (only 1.5 GB ever used).

If you never use it, definitely disable the netflow thing. Will make life a whole lot longer for your SSD, if using one.

Quote from: doktornotor on August 25, 2024, 03:17:34 PM
If you never use it, definitely disable the netflow thing. Will make life a whole lot longer for your SSD, if using one.
Thanks, that's a good tip. In principle I wish I had an actual use for it, but you're absolutely right and I hadn't considered that. I suppose that's where off-device logging/NetFlow is useful.

Incidentally, those log messages have disappeared since I did a reset. Since they weren't there before one of the recent updates, I'm thinking something was changed during the upgrade causing that issue. If it was writing 5000 log messages per day (approx) then I could see how that might cause problems.

Yeah, once those NetFlow DBs get corrupted somehow, they are unfixable (in a reasonable timeframe at least), plus the resulting logspam is obnoxious. Send it off-device somewhere with the old trusty rotating drives if you want to experiment with that later and do some fancy graphing or whatever.