Messages - Apocalypsing

#1
Hi, I've been trying to track down this problem for months, but because it is extremely intermittent and unpredictable, it's been difficult. About 95% of the time, my network under OPNsense performs great, even with several people in the household surfing the web, streaming, etc.

It's the other 5% of the time where things are iffy. For a bit of background, my primary WAN connection is a VDSL2 connection with 100 Mbps of downstream bandwidth and 20 Mbps of upstream bandwidth. I also have a backup LTE link (similar speeds to the primary link), but for this topic I'm focused on my primary WAN link.

I think I've reached the point where I can reliably reproduce the problem. One such case is when files are downloaded from Akamai's CDN servers. While a download from their CDN is in progress, latency increases considerably, and it becomes bad enough that I even start to see some TCP packet loss, not just ICMP loss. This all occurs even though my download speeds from Akamai's network don't reach much past 24 Mbps, so despite the packet loss and the latency increase over idle, I am only using about 25% of my pipe's downstream bandwidth. This is what confuses me.
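To put rough numbers on that, here is a minimal Python sketch of the link utilisation and the bandwidth-delay product (the amount of data in flight at a given rate and round-trip time). It assumes an RTT of about 192 ms, taken from the final hop of the traceroute further down in this post; the function name `bdp_bytes` and the constants are just illustrative.

```python
def bdp_bytes(rate_mbps: float, rtt_ms: float) -> float:
    """Bandwidth-delay product in bytes for a rate in Mbps and an RTT in ms."""
    # Mbps * 1e6 bits/s * (ms / 1000) s = rate_mbps * 1000 * rtt_ms bits
    bits_in_flight = rate_mbps * 1000 * rtt_ms
    return bits_in_flight / 8


LINK_MBPS = 100      # VDSL2 downstream rate from the post
OBSERVED_MBPS = 24   # approximate download rate seen from Akamai
RTT_MS = 192         # assumption: roughly the last-hop RTT in the traceroute

utilisation = OBSERVED_MBPS / LINK_MBPS
print(f"Downstream utilisation: {utilisation:.0%}")
print(f"Data in flight at {OBSERVED_MBPS} Mbps: {bdp_bytes(OBSERVED_MBPS, RTT_MS):,.0f} bytes")
print(f"Data in flight at {LINK_MBPS} Mbps: {bdp_bytes(LINK_MBPS, RTT_MS):,.0f} bytes")
```

This only quantifies how much data a single long-RTT flow keeps in flight; it does not by itself explain the latency spikes.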

I should mention I already have AQM and fair queuing rules in place so that everyone connected to my network has a good quality of experience. I first noticed these significant latency increases over idle when I was still using fq_codel as my preferred AQM algorithm, but even with my current setup it's not any better (I switched away from fq_codel thinking I may have hit a quirk relating to that algorithm).

What's even stranger to me is that outside of these edge cases, if someone on the network hammers the downstream with, say, a Steam download, other people using the network still have a good experience (I say this as I know too well how easily Steam can cripple a typical home network without any AQM and fair queuing). Most other applications that download lots of data are also fine.

The heavy latency increases when downloading from Akamai just don't make sense to me at all, even with the high latency I seem to get to their CDN:

traceroute www.crucial.com
traceroute to www.crucial.com (104.84.168.83), 30 hops max, 60 byte packets
1  <redacted> (10.115.101.1)  0.394 ms  0.322 ms  0.306 ms
2  203-219-198-51.tpgi.com.au (203.219.198.51)  8.212 ms  8.083 ms  8.295 ms
3  per-apt-stg-crt1-be100.tpgi.com.au (203.219.57.129)  54.064 ms  54.287 ms  54.156 ms
4  syd-gls-har-crt1-Hu-0-5-0-1.tpg.com.au (202.7.162.133)  55.579 ms  55.604 ms  55.952 ms
5  203-221-3-82.tpgi.com.au (203.221.3.82)  54.010 ms 203-221-3-18.tpgi.com.au (203.221.3.18)  55.510 ms  55.518 ms
6  203-219-106-90.tpgi.com.au (203.219.106.90)  54.171 ms  53.791 ms  53.985 ms
7  ae5.r02.syd01.icn.netarch.akamai.com (23.56.128.38)  55.726 ms  55.216 ms  55.234 ms
8  ae3.r02.per01.icn.netarch.akamai.com (23.214.115.22)  92.233 ms  92.233 ms  92.216 ms
9  ae0.r01.per01.icn.netarch.akamai.com (23.214.115.16)  93.727 ms  93.613 ms  93.992 ms
10  ae4.r02.sin01.icn.netarch.akamai.com (23.214.115.20)  101.614 ms  101.476 ms  100.087 ms
11  ae7.r02.sin02.icn.netarch.akamai.com (23.215.54.215)  101.332 ms  101.198 ms  101.308 ms
12  ae6.r02.hkg02.icn.netarch.akamai.com (23.215.54.138)  193.061 ms  193.147 ms  193.395 ms
13  ae4.r01.hkg03.icn.netarch.akamai.com (23.215.54.153)  133.912 ms  133.269 ms  133.080 ms
14  ae11.r01.hkg03.ien.netarch.akamai.com (23.56.143.43)  154.087 ms ae13.r02.hkg03.ien.netarch.akamai.com (23.56.143.47)  133.296 ms  133.327 ms
15  ae10.cmignc-hkg.netarch.akamai.com (23.56.143.193)  197.989 ms  197.864 ms  197.974 ms
16  * a104-84-168-83.deploy.static.akamaitechnologies.com (104.84.168.83)  191.860 ms  191.969 ms
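For anyone wanting to see where the latency accumulates along that path, here is a rough Python sketch that takes the lowest RTT reported for each hop and finds the largest increase between consecutive hops. The sample is hops 6 to 13 copied from the output above; the helper names `parse_hops` and `largest_jump` are my own, and the parsing assumes the usual Linux traceroute line layout.

```python
import re

SAMPLE = """\
6  203-219-106-90.tpgi.com.au (203.219.106.90)  54.171 ms  53.791 ms  53.985 ms
7  ae5.r02.syd01.icn.netarch.akamai.com (23.56.128.38)  55.726 ms  55.216 ms  55.234 ms
8  ae3.r02.per01.icn.netarch.akamai.com (23.214.115.22)  92.233 ms  92.233 ms  92.216 ms
9  ae0.r01.per01.icn.netarch.akamai.com (23.214.115.16)  93.727 ms  93.613 ms  93.992 ms
10  ae4.r02.sin01.icn.netarch.akamai.com (23.214.115.20)  101.614 ms  101.476 ms  100.087 ms
11  ae7.r02.sin02.icn.netarch.akamai.com (23.215.54.215)  101.332 ms  101.198 ms  101.308 ms
12  ae6.r02.hkg02.icn.netarch.akamai.com (23.215.54.138)  193.061 ms  193.147 ms  193.395 ms
13  ae4.r01.hkg03.icn.netarch.akamai.com (23.215.54.153)  133.912 ms  133.269 ms  133.080 ms
"""


def parse_hops(text: str) -> dict[int, float]:
    """Map hop number -> lowest RTT (ms) reported on that hop's line."""
    hops = {}
    for line in text.splitlines():
        hop = re.match(r"\s*(\d+)\s", line)
        rtts = [float(v) for v in re.findall(r"(\d+\.\d+) ms", line)]
        if hop and rtts:  # skip the header line and hops with no replies
            hops[int(hop.group(1))] = min(rtts)
    return hops


def largest_jump(hops: dict[int, float]) -> tuple[int, float]:
    """Return (hop, increase in ms) for the biggest rise over the previous hop."""
    deltas = {n: hops[n] - hops[n - 1] for n in hops if n - 1 in hops}
    hop = max(deltas, key=deltas.get)
    return hop, deltas[hop]


hops = parse_hops(SAMPLE)
hop, delta = largest_jump(hops)
print(f"Largest increase: +{delta:.3f} ms arriving at hop {hop}")
```

A per-hop figure that is higher than the hops after it (as with hop 12 versus hop 13 here) often just means that router is slow to answer traceroute probes, so the end-to-end RTT at the final hop is the more meaningful number.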


Of course, most other high-latency connections that transfer a lot of data don't seem to do the same to my network, either. So I'm really stumped here. Seeing over 40 ms of jitter in this extreme case isn't doing any favours for anyone on my network who may be in the middle of a VoIP call, for instance.

Edit: should probably mention I'm also running OPNsense version 22.7.7_1 (amd64/OpenSSL). Hardware is a Qotom Q710G4: Intel Celeron J3455E, with 4 x Intel Gigabit NICs, and a 250GB Crucial MX500 SSD.
#2
I was stuck initially on trying to run through opnsense-revert via the console, as I was seeing:

# opnsense-revert -r 21.1.9_multi opnsense-update
Fetching opnsense-update.txz: .. failed


I had the Australian repo mirror set in the web UI as I'm located in Australia. Perhaps the Australian mirror isn't fully up to date for this process, as things didn't work until I changed back to the default Deciso repo. I also temporarily removed mimugmail's repo just to be sure.

Having done that, things seem to be okay now, as the rest of the upgrade process went very smoothly on my Qotom appliance.