Topics - eugenmayer

#1
AFAICS this is related to the upgrade to OpenVPN 2.6.3, which is included in 23.1.7 (compared to 2.5.8 in 23.1.6). The server crashes for us (the entire daemon) when Linux clients connect (about 100). 2.5.8 works just fine combined with 23.1.6.

One has to downgrade OPNsense to 23.1.6 to use OpenVPN 2.5.8, since it seems that 23.1.7 changed the OpenVPN config (so that it is compatible with 2.6.3?).

I did not find any logs showing how the OpenVPN server crashed, even with verb 9. The daemon was just killed after a couple of clients connected OR after a specific time (30 seconds).
#2
We are currently investigating a crash of OPNsense that happens every single week on Saturday between 3 and 6 am.

While doing so, we looked at the health data. While the traffic health report is available for all 3 years, the system metrics are not.

In fact, after a normal reboot of the box, all system rrd data is lost. The last data available afterwards is older than Aug 2018, so it seems to have worked back in the day, but since then every reboot of the box deletes the system rrd data, while traffic / packets data is still there.

I double-checked that the 2 checkboxes

/var RAM disk    Use memory file system for /var
/tmp RAM disk    Use memory file system for /tmp

are not checked, so I have no RAM disk for the rrd data.

Any idea how this happens? An interesting fact is that Aug 06 2018 was the release date of 18.7, and I most probably upgraded instantly. So that upgrade could have caused this ( https://opnsense.org/opnsense-18-7-released/ ).

I am currently on 19.1.2, cannot go higher due to the LDAP bugs introduced in 19.1.3+

Thanks for any hints
#3
Hello,

I have been fighting this for some time now and I am really out of ideas.

Setup
- I have about 5 KVM based OPNsense boxes, 1 AWS and 2 apu2c2 boxes. (18.7.8)
- Those 5 KVM boxes are basically identical, running: DHCP, Unbound, OpenVPN, Tinc, HAproxy, ACME (18.7.8)
- 1 AWS is running DHCP, Unbound, OpenVPN, HAproxy, ACME, webproxy (18.1 latest)

Problem
2 of those 5 KVM boxes keep stalling on Saturday every single week (for 5 or more weeks now). Right now it is always the same boxes; it used to hit any of those 5 at random.

The AWS box seems to stall every week as well, also on Saturday.

What I mean by "stall":
It seems some traffic is still passing through the OPNsense box: NAT still appears to work, as do stateful connections. The boxes behind OPNsense, though, can no longer access the WAN (outbound issue?).

Also, I cannot connect using SSH or the terminal. In both cases I can enter the user, but then instead of asking for the password it just "hangs" there.

What I deduced
Several weeks ago I noticed that the auto-upgrade did not work and the boxes were stuck at 18.7.4, so I upgraded them to 18.7.7 (then .8). Since then it is always the same boxes that get stuck. I suspected the upgrade itself, so I deactivated the upgrade cron tasks - but this week no update was available, and still those 2 and the AWS box stalled.

I also suspected that the KVM boxes "stall" during Proxmox backups. I disabled the backups, but that did not help either. Also, since the AWS box is not backed up that way at all, I expect that was not the right assumption anyway.

Also, both 18.1 and 18.7 boxes are affected by this, hosted on totally different hypervisors (AWS / KVM on Proxmox).

While the KVM boxes all have very similar duties, the AWS box is rather different, yet still affected.


Help
Could anyone help me get to the bottom of this? It is becoming a real blocker for me, in the sense that I might consider migrating away if I cannot solve this at some point.

If I can provide any logs, or can make the boxes log additional things while they are stalled, let me know. Maybe some rrd graph could be interesting, or whatever else. Thanks!
#4
I am using a public TLD for which I use the private-domain flag in Unbound and also a domain override.

So let's assume it is company.com - I use the namespace <namespace>.company.com as an internal domain, so internal.company.com (domain override in Unbound).

The problem now is that I am using a tool for ACME DNS-01 challenges which does a DNS lookup against the default DNS server (OPNsense in this case), searching for an NS record (the primary nameserver for company.com), like

dig mysub1.internal.company.com NS

during the challenge. If it finds an NS record, it will poll the primary server for the TXT record created during DNS-01 - if it does not find an NS record, it will fail.

Apparently with OPNsense + Unbound + domain override, those NS responses are all empty. I ask myself how I could potentially fix that.

So

dig mysub1.internal.company.com NS

and

dig internal.company.com NS

are empty, since the domain override is on internal.company.com

dig company.com NS

will return the proper primary NS (public server)

Any hints on how to solve this?
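To make the lookup order easier to reason about, here is a minimal sketch of how such a tool might walk from the challenge name up towards the public zone, assuming it simply strips one label at a time until an NS answer appears. The function name `candidate_zones` is made up for illustration; the actual tool may look for SOA records or use a different order.

```python
from typing import List


def candidate_zones(fqdn: str) -> List[str]:
    """Return the name itself and each parent domain, most specific first.

    The bare TLD is left out, since it is never the zone being validated.
    This only models the lookup order; it does not send any DNS queries.
    """
    labels = fqdn.rstrip(".").split(".")
    return [".".join(labels[i:]) for i in range(len(labels) - 1)]


if __name__ == "__main__":
    # Print the dig commands one would run by hand to see where NS answers stop.
    for zone in candidate_zones("mysub1.internal.company.com"):
        print(f"dig {zone} NS")
```

With the domain override in place, the first two names fall under internal.company.com and come back empty, and only company.com returns an NS record.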
#5
This setup is based on a Proxmox host sitting behind an OPNsense VM that runs on the Proxmox host itself. The OPNsense VM protects Proxmox, provides a firewall, a private LAN and DHCP/DNS for the VMs, and offers an IPsec connection into the LAN to reach all VMs/Proxmox that are not NATed. The server is a typical Hetzner server, so only one NIC but multiple IPs and/or subnets on this NIC.

Proxmox server with 1 NIC (eth0)
3 public IPs, IP2/3 are routed by MAC in the datacenter (to eth0)
eth0 is passed through to the OPNsense KVM via PCI passthrough
A private network on vmbr30, 10.1.7.0/24
An OpenVPN mobile client connection (172.16.0.0/24) to the LAN

See more here:

https://stackoverflow.com/questions/44134651/proxmox-with-opnsense-as-pci-passthrough-setup-used-as-firewall-router-ipsec-pri/44150668#44150668
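As a quick sanity check of the address plan above, the following sketch uses Python's standard `ipaddress` module to confirm that the LAN and the mobile client pool do not overlap, and to classify an address. The helper name `describe` and the sample addresses are only illustrative assumptions, not part of the actual setup.

```python
import ipaddress

# Networks taken from the setup description above.
LAN = ipaddress.ip_network("10.1.7.0/24")
VPN_POOL = ipaddress.ip_network("172.16.0.0/24")


def describe(address: str) -> str:
    """Say which of the two networks an address belongs to, if any."""
    ip = ipaddress.ip_address(address)
    if ip in LAN:
        return "LAN"
    if ip in VPN_POOL:
        return "VPN pool"
    return "other"


if __name__ == "__main__":
    # Overlapping ranges would make routing between LAN and VPN ambiguous.
    print("overlap:", LAN.overlaps(VPN_POOL))
    for sample in ("10.1.7.1", "172.16.0.10", "192.0.2.1"):
        print(sample, "->", describe(sample))
```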
#6

Posted an updated version on Stack Overflow: https://stackoverflow.com/questions/44118442/proxmox-with-opnsense-as-firewall-gw-routing-issue since the tools there are better suited to working out such an issue.

Since FreeBSD got a lot better on KVM (virtio anything), i created a setup in a datacenter:


  • Proxmox server with 1 NIC (eth0)
  • 3 public IPs, IP2/3 are routed by MAC in the datacenter (to eth0)
  • KVM bridged setup (eth0 no IP, vmbr1 bridged to eth0 with IP1)
  • A private network on vmbr30, 10.1.7.0/24
  • Shorewall on the Proxmox server

see https://stackoverflow.com/questions/44118442/proxmox-with-opnsense-as-firewall-gw-routing-issue for a brief description

Once I have this straightened out, I would love to publish a comprehensive guide on how to run OPNsense as an appliance with a private network on Proxmox, exposing some services to the outside world using HAProxy + LE, and also accessing the private LAN using IPsec.

#7
I have an IPsec mobile client connection (172.16.0.0/24) to my LAN (10.1.7.0).

- I run a DNS resolver and a DHCP server which is configured to set DNS entries for each client in the LAN. The DNS resolver does domain overriding for domain.tld and listens on LAN and 127.0.0.1

Question/Need:
I want the mobile client to be able to resolve names in my LAN domain, domain.tld, which the DNS resolver serves (i can do that when using).

Configuration:
That's how I configured the mobile client: https://goo.gl/qYxP56
That's how I configured the DNS resolver: https://goo.gl/o6Ibrs

Issue:
When I connect with my (El Capitan/Sierra) IPsec "Cisco" client, I can access the LAN, but I can't see that the DNS server is actually used.

If I query the DNS server directly (from the mobile client), it works:

dig test.domain.tld @10.1.7.1

But I cannot resolve names from domain.tld directly, since the DNS server does not seem to be forwarded to the client during the connection?
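To spell out the behaviour I am after, here is a minimal sketch of a split-DNS decision, assuming the client keeps a list of domains that should be answered by the resolver behind the tunnel. The function `pick_resolver` and the default resolver address 192.0.2.53 are hypothetical; this is not how the macOS client is actually implemented, only the outcome I would expect.

```python
from typing import Iterable


def pick_resolver(name: str, split_domains: Iterable[str],
                  vpn_dns: str, default_dns: str) -> str:
    """Pick the resolver for a query name.

    Names equal to, or underneath, one of the split domains go to the
    resolver reachable through the tunnel; everything else goes to the
    client's normal resolver.
    """
    qname = name.rstrip(".").lower()
    for domain in split_domains:
        suffix = domain.strip(".").lower()
        if qname == suffix or qname.endswith("." + suffix):
            return vpn_dns
    return default_dns


if __name__ == "__main__":
    # 10.1.7.1 is the LAN resolver from the dig example above;
    # 192.0.2.53 stands in for whatever resolver the client uses otherwise.
    for query in ("test.domain.tld", "example.org"):
        print(query, "->", pick_resolver(query, ["domain.tld"], "10.1.7.1", "192.0.2.53"))
```

Right now the client behaves as if the list of split domains were empty, so every query goes to its default resolver.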