Messages - sjjh

#1
*push* Does anyone have an idea? The issue persists with OPNsense 24.7.11_2-amd64. Thx!
#2
Since the last update to OPNsense 24.7, with squid as a transparent HTTP proxy and SSL SNI, some users report being unable to access some HTTPS websites. I tried to capture the relevant cache.log excerpt (with "debug_options ALL,1 11,6 26,6 83,6"):

2024/12/16 14:42:23.116 kid1| 83,5| Session.cc(96) NewSessionObject: SSL_new session=0x1c15fae80000
2024/12/16 14:42:23.116 kid1| 83,5| Session.cc(154) CreateSession: link FD 1247 to TLS session=0x1c15fae80000
2024/12/16 14:42:23.117 kid1| 83,5| bio.cc(114) write: FD 1247 wrote 2452 <= 2452
2024/12/16 14:42:23.118 kid1| 83,5| bio.cc(137) read: FD 1247 read -1 <= 5
2024/12/16 14:42:23.118 kid1| 83,5| Io.cc(92) Handshake: -1/35 for TLS connection 0x1c15fae80000 over conn19210 local=216.194.167.35:443 remote=10.63.10.46:60964 FD 1247 flags=33
2024/12/16 14:42:23.119 kid1| 83,5| bio.cc(137) read: FD 1247 read 5 <= 5
2024/12/16 14:42:23.119 kid1| 83,5| bio.cc(137) read: FD 1247 read 19 <= 19
2024/12/16 14:42:23.119 kid1| 83,5| Io.cc(92) Handshake: -1/0 for TLS connection 0x1c15fae80000 over conn19210 local=216.194.167.35:443 remote=10.63.10.46:60964 FD 1247 flags=33
2024/12/16 14:42:23 kid1| ERROR: failure while accepting a TLS connection on conn19210 local=216.194.167.35:443 remote=10.63.10.46:60964 FD 1247 flags=33: SQUID_TLS_ERR_ACCEPT+TLS_LIB_ERR=A000412+TLS_IO_ERR=1
2024/12/16 14:42:23.119 kid1| 83,5| Session.cc(201) SessionSendGoodbye: session=0x1c15fae80000
2024/12/16 14:42:23.119 kid1| 83,5| Session.cc(93) operator(): SSL_free session=0x1c15fae80000

Without squid, I can access the website just fine. The SSL parameters of the website connection seem to be OK (TLS_AES_256_GCM_SHA384, 256-bit key, TLS 1.3), and the cert is valid (according to Firefox).

What could be the issue here? How to debug further? Any help appreciated. :)
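
One way to dig further is to decode the TLS_LIB_ERR value from the ERROR line above. The following is a minimal sketch, assuming the value is an OpenSSL 3.x packed error code (library number in bits 23 to 30, reason in the low 23 bits) and that SSL-library reasons above 1000 mirror TLS alert numbers. The function name and the small alert table are illustrative, not part of squid or OpenSSL. If those assumptions hold, A000412 would correspond to a bad_certificate alert, which would point at certificate validation rather than at the cipher suite; treat that as a hint to verify, not a diagnosis.

```python
import re

# Assumed OpenSSL 3.x error-code layout (not confirmed by the post itself).
ERR_LIB_OFFSET = 23
ERR_LIB_MASK = 0xFF
ERR_REASON_MASK = 0x7FFFFF
ERR_LIB_SSL = 20
SSL_AD_REASON_OFFSET = 1000  # reason 1000+N is assumed to mirror TLS alert N

# Small, incomplete subset of TLS alert descriptions.
TLS_ALERTS = {
    40: "handshake_failure",
    42: "bad_certificate",
    43: "unsupported_certificate",
    45: "certificate_expired",
    46: "certificate_unknown",
    48: "unknown_ca",
    70: "protocol_version",
}


def decode_tls_lib_err(detail):
    """Split the TLS_LIB_ERR hex value of a squid error detail into parts."""
    match = re.search(r"TLS_LIB_ERR=([0-9A-Fa-f]+)", detail)
    if match is None:
        return None
    code = int(match.group(1), 16)
    lib = (code >> ERR_LIB_OFFSET) & ERR_LIB_MASK
    reason = code & ERR_REASON_MASK
    alert = None
    if lib == ERR_LIB_SSL and reason > SSL_AD_REASON_OFFSET:
        # Unknown alert numbers simply stay None in this sketch.
        alert = TLS_ALERTS.get(reason - SSL_AD_REASON_OFFSET)
    return {"lib": lib, "reason": reason, "alert": alert}


if __name__ == "__main__":
    print(decode_tls_lib_err(
        "SQUID_TLS_ERR_ACCEPT+TLS_LIB_ERR=A000412+TLS_IO_ERR=1"
    ))
```

Running `openssl errstr 0A000412` on the firewall itself should give the authoritative text for the same code.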
#3
Sorry for not expressing myself clearly enough. I'm also not sure I understand your question about "client export vs. simple OpenVPN client side".
I tested a little further and for me it seems to work by:
1. creating a user without adding/assigning any certificate
2. client export within OpenVPN, with the certificate I used to attach to my VPN users.
So there might actually not be a (technical) problem for me right now. And I might just have been confused ("usability problem") by not being able to add the cert to the user anymore and not seeing any users linked anymore to the cert in OpenVPN > Client export. Sry for the noise.
#4
My current situation is that I'm assigning one existing cert to multiple users (not matching the CN) for VPN access (as a kind of MFA next to a password). (Please let's not discuss whether this is a sensible setup. I inherited it and need a working setup now to buy time to design and introduce a better concept for my users, e.g., using individual certs per user.)
@franco if I understood you correctly, this should still be doable, but I fail to do so in the Web UI. Can you please elaborate on how I can assign an existing non-matching cert to a user?

Thanks for your support!
Simon
#5
Quote from: drbob on May 27, 2023, 01:21:36 PM
I have not touched the OPNsense configuration since the upgrade. SSH is disabled on the router so the web interface is the only way to manage it.
Can you try a local console (if you have physical access and can connect a monitor + keyboard)?

No idea about the root cause of the error message. If the web GUI worked after the update, maybe there's a chance that it is a temporary error. In that case, maybe forcing a reboot (e.g. cutting off power) could help?

Good luck!
#6
Don't worry. I'm interested in finding a good/correct solution myself. :)
I only fear my knowledge might be too limited to help. In case a look at our installation would help, just let me know via PM and we can figure that out.
Simon
#7
Thanks for your confirmation. I opened a ticket.
Simon
#8
Thanks for the replies!
Quote from: meyergru on May 24, 2023, 08:32:57 PM
Just an idea, because I saw symptoms like these: Did you try lowering your MTU or do MSS clamping?

If you use VLANs, PPPoE or anything else that limits or reduces your real MTU, you will experience packet loss with sites that cannot do correct PMTU discovery along the way to some sites. Such sites will then be much slower, because eventually, the dropped packets are being corrected.
Gateways are indeed PPPoE and internally all interfaces use VLANs. The assigned/default MTU is 1492. I once lowered the MTU and MRU in the settings to 1450, which led to a connection breakdown (but reasons for the breakage might be elsewhere as well).
In the meantime we noticed that suricata was running in IDS and IPS mode, which is apparently supported neither for VLANs nor for PPPoE. One hypothesis was that suricata might have messed with the packets as well (leading to MTU issues). We deactivated suricata and will observe it for a couple of weeks to see if the problem persists.
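
For reference, the arithmetic behind the MTU and MSS clamping suggestion can be sketched as follows. This assumes plain IPv4 and TCP headers without options and the usual 8 bytes of PPPoE overhead on a 1500-byte Ethernet link; the function names are illustrative only.

```python
IPV4_HEADER = 20     # bytes, IPv4 header without options
TCP_HEADER = 20      # bytes, TCP header without options
PPPOE_OVERHEAD = 8   # bytes, 6-byte PPPoE header + 2-byte PPP protocol ID


def pppoe_mtu(ethernet_mtu=1500):
    """MTU left for IP packets once PPPoE framing is subtracted."""
    return ethernet_mtu - PPPOE_OVERHEAD


def tcp_mss(mtu):
    """Largest TCP payload that fits into one IPv4 packet of the given MTU."""
    return mtu - IPV4_HEADER - TCP_HEADER


if __name__ == "__main__":
    print(pppoe_mtu())    # 1492, the default mentioned above
    print(tcp_mss(1492))  # 1452
    print(tcp_mss(1450))  # 1410, for the lowered MTU that was tried once
```

Under these assumptions, an MSS clamp of at most 1452 would match the 1492-byte PPPoE MTU.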

Quote from: LesterCLL on May 24, 2023, 06:49:59 PM
I have had the same problem for some time, and also the same temporary solution. I thought it had something to do with cache, memory, logs etc. But I think I solved it by turning on PowerD. All settings are set to HiAdaptive. It was turned off by default.

It has been running for 2 weeks now without any issues, no reboots. Something which wasn't possible before. I hope this helps.
It is turned off over here as well. I would not have thought of it as a reason, but by now I will not rule out anything... I'll keep it in the back of my mind. If disabling suricata does not solve it and another attempt at lowering the MTU does not help either, I'll certainly try it out.

In any case, I'll report back here. Thx! Simon
#9
The setup is as follows: two PPPoE gateways (two separate contracts) from the same ISP, one smaller uplink for VoIP and one bigger uplink for all the other traffic (mainly web browsing). Firewall/NAT rules choose the correct gateway. The IP addresses of the gateways are assigned by DHCP from the ISP. The ISP differentiates them by the PPPoE credentials and assigns us the respective WAN IP address. As both gateways come from the same ISP and are only two differently sized contracts, both receive the same upstream gateway IP address from the ISP.

As we have a mixture of different connection problems and I cannot really find the solution, I was researching on the net and stumbled upon a GitHub issue. If I understand the comments by fichtner correctly, this setup might not be supported at all.

Is my understanding correct that it is not supported to have two gateways with public IP addresses A and B which both have the same upstream gateway with IP address C?
I'm asking because I do not see any error message in the Web GUI, and I would have assumed that I would get one there if it is not supported. If someone can confirm my understanding, I'm happy to open a ticket. :)
Simon
#10
*push* Does anyone have an idea? :)
#11
Not sure if it is helpful or stating the obvious:
I've had good experiences using smokeping with different targets and probes to analyze network connection issues. You could add a smokeping instance targeting your ISP, Google, ... to double-check whether something changes before the issue occurs and which targets are still reachable.
Simon
#12
Thx for the offer! To be honest, as it currently seems to work with the IP address specified in postfix, the urgency has declined (I'm rather wondering if there is a bug in the postfix plugin (DNS vs. IP), but that might also be my limited understanding of the whole matter).
I once heard a rumor that dnsmasq is no longer recommended and that unbound is the way to go. Thus, I'd understand if dnsmasq is out of focus. On the other hand, if it helps, I'm more than happy to create a new ticket. :)

Personally, I'm more urgently in need of a solution for the HTTPS connection issues  :-X

Cheers
Simon
#13
Every couple of days we are currently experiencing a slowdown in internet speed. Users report websites (e.g. LinkedIn, PayPal, Atlassian/Jira, WordPress uploads, MS365 webmail, ...) not loading or loading only very slowly. It seems that not all websites/connections(/users?) are affected.
I have a smokeping instance running, where I can see an increase from ~200ms to ~4.5sec for a curl call of google.de. At the same time, a ping with fping to Google stays at ~5ms (see attachments).

During the affected times, I can see lots of the following messages in the squid cache log:
2023-05-18T11:37:03 squid kid1| conn690249 local=162.125.21.3:443 remote=10.63.12.82:53169 FD 323 flags=33: read/write failure: (60) Operation timed out
2023-05-18T11:34:01 squid kid1| conn681351 local=13.88.181.35:443 remote=10.63.13.68:59116 FD 300 flags=33: read/write failure: (60) Operation timed out
2023-05-18T11:33:41 squid kid1| conn467660 local=40.113.103.199:443 remote=10.63.13.68:49576 FD 618 flags=33: read/write failure: (60) Operation timed out
2023-05-18T11:29:37 squid kid1| conn678654 local=35.174.127.31:443 remote=10.63.12.145:58955 FD 785 flags=33: read/write failure: (60) Operation timed out
2023-05-18T11:21:10 squid kid1| conn366702 local=40.115.3.253:443 remote=10.63.19.98:58979 FD 97 flags=33: read/write failure: (60) Operation timed out
2023-05-18T11:04:35 squid kid1| conn637485 local=157.240.253.13:443 remote=10.63.14.104:62702 FD 863 flags=33: read/write failure: (60) Operation timed out
2023-05-18T11:04:05 squid kid1| conn637480 local=157.240.253.13:443 remote=10.63.14.104:62701 FD 829 flags=33: read/write failure: (60) Operation timed out
2023-05-18T11:01:06 squid kid1| conn550254 local=40.115.3.253:443 remote=10.63.10.18:60849 FD 113 flags=33: read/write failure: (60) Operation timed out
2023-05-18T10:52:20 squid kid1| conn578770 local=40.113.110.67:443 remote=10.63.19.18:65516 FD 495 flags=33: read/write failure: (60) Operation timed out
2023-05-18T10:45:37 squid kid1| conn610397 local=40.113.110.67:443 remote=10.63.75.20:51635 FD 633 flags=33: read/write failure: (60) Operation timed out
2023-05-18T10:43:31 squid kid1| conn610286 local=162.125.21.3:443 remote=10.63.75.20:51616 FD 978 flags=33: read/write failure: (60) Operation timed out
2023-05-18T10:43:26 squid kid1| conn610347 local=162.125.21.3:443 remote=10.63.19.133:49822 FD 1061 flags=33: read/write failure: (60) Operation timed out
2023-05-18T10:42:59 squid kid1| conn598172 local=172.217.18.10:443 remote=10.63.19.133:49264 FD 820 flags=33: read/write failure: (60) Operation timed out

as well as some
2023-05-18T11:29:32 squid kid1| conn682058 local=147.160.187.240:443 remote=10.63.14.104:64056 FD 776 flags=33: read/write failure: (32) Broken pipe
2023-05-18T10:37:14 squid kid1| conn604023 local=52.222.214.43:443 remote=10.63.16.66:51656 FD 633 flags=33: read/write failure: (32) Broken pipe
2023-05-18T09:52:11 squid kid1| conn550127 local=139.177.229.129:443 remote=10.63.15.99:34151 FD 440 flags=33: read/write failure: (32) Broken pipe
2023-05-18T07:19:34 squid kid1| conn440033 local=142.250.186.46:443 remote=10.63.11.30:34628 FD 94 flags=33: read/write failure: (32) Broken pipe
2023-05-18T04:21:08 squid kid1| conn358437 local=139.177.229.129:443 remote=10.63.15.117:62679 FD 64 flags=33: read/write failure: (32) Broken pipe
2023-05-18T00:43:12 squid kid1| conn239322 local=146.75.118.248:443 remote=10.63.12.147:63957 FD 316 flags=33: read/write failure: (32) Broken pipe
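
To see whether the failures cluster on particular clients or destinations, log lines like the ones above could be tallied with a short script. This is a minimal sketch, assuming the line format shown in this post; the regular expression, the function name and the sample list are illustrative and not part of squid.

```python
import re
from collections import Counter

# Matches the "read/write failure" lines as they appear in this post.
LINE_RE = re.compile(
    r"local=(?P<server>[\d.]+):\d+ "
    r"remote=(?P<client>[\d.]+):\d+ .*?"
    r"read/write failure: \((?P<errno>\d+)\) (?P<text>.+)$"
)


def summarize(lines):
    """Count failures per error, per internal client and per remote server."""
    by_error, by_client, by_server = Counter(), Counter(), Counter()
    for line in lines:
        match = LINE_RE.search(line.rstrip())
        if match is None:
            continue  # not a read/write failure line
        by_error[(int(match.group("errno")), match.group("text"))] += 1
        by_client[match.group("client")] += 1
        by_server[match.group("server")] += 1
    return by_error, by_client, by_server


# A few lines copied from the excerpt above.
SAMPLE = [
    "2023-05-18T11:37:03 squid kid1| conn690249 local=162.125.21.3:443 remote=10.63.12.82:53169 FD 323 flags=33: read/write failure: (60) Operation timed out",
    "2023-05-18T11:34:01 squid kid1| conn681351 local=13.88.181.35:443 remote=10.63.13.68:59116 FD 300 flags=33: read/write failure: (60) Operation timed out",
    "2023-05-18T11:33:41 squid kid1| conn467660 local=40.113.103.199:443 remote=10.63.13.68:49576 FD 618 flags=33: read/write failure: (60) Operation timed out",
    "2023-05-18T11:29:32 squid kid1| conn682058 local=147.160.187.240:443 remote=10.63.14.104:64056 FD 776 flags=33: read/write failure: (32) Broken pipe",
]

if __name__ == "__main__":
    errors, clients, servers = summarize(SAMPLE)
    print(errors.most_common())
    print(clients.most_common(5))
    print(servers.most_common(5))
```

Feeding the whole cache log through `summarize` (for example with `open(path)` as the `lines` argument) would show whether a handful of clients or destinations account for most of the timeouts.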


It seems to start out of the blue and to end seemingly at random. Once, rebooting the OPNsense machine helped; another time, rebooting it twice did not help, but a later restart seemed to bring it back to normal.

OPNsense 23.1.7_3-amd64
FreeBSD 13.1-RELEASE-p7
OpenSSL 1.1.1t 7 Feb 2023
with the following services:
acme
cicap
clamd
configd
cron
dhcpd
dnsmasq
flowd_aggregate
freshclam
ipfw
login
monit
nginx
ntpd
openssh
openvpn
pf
postfix
redis
routing
rspamd
samplicate
squid
suricata
sysctl
syslog-ng
webgui

edit: squid SSL interception only with SNI (no MITM).

Any idea how to debug the issue? Thx! Simon
#14
Thx for your reply! Dnsmasq was mentioned by fichtner in the linked GitHub issue, but I also understood from the PR that the Web GUI option was only added to Unbound.
Simon
#15
Note to myself: after changing the postfix plugin configuration to use an IP address instead of a domain name as the target, the email gets delivered. This doesn't answer my initial question, but makes it irrelevant for the moment.