Messages - patrick3000

#1
I have OPNsense 24.7.10_2, and I have two gateways set up in a gateway group, with WAN being the primary (high priority) gateway and WAN2 being the secondary (low priority) gateway that is only used through failover when WAN is down.

Today, WAN2 was down most of the day due to an outage at the internet service provider, and it caused some, though not all, websites to fail to load on WAN using Firefox, Chrome, or any other browser. Examples of the websites that failed to load are cnn.com and nytimes.com. Strangely, however, I could still ping those websites despite not being able to access them in a browser. When I disabled WAN2 under System > Gateways > Configuration, however, all websites loaded properly on WAN.

In sum, there is a bug in gateway groups such that when one of the monitored gateways goes down (in this case, the secondary gateway), the other gateway does not work properly.

This is a continuing issue with gateway groups and monitoring. They didn't work properly in 24.7.1, but the problem was corrected in 24.7.3. See, for example, this post: https://forum.opnsense.org/index.php?topic=41915.15.

Now, it appears that in 24.7.10_2, there are still some problems. In particular, if the secondary gateway goes down, some remote hosts are inaccessible on the primary gateway using http or https.
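
The symptom described above (ping works, browsers do not) suggests that ICMP gets through while TCP on the web ports does not. A minimal sketch for checking that from a LAN client, assuming Python 3 is installed there; `tcp_reachable` is a hypothetical helper name and not anything provided by OPNsense:

```python
import socket


def tcp_reachable(host, port=443, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the name and completes the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False
```

Calling `tcp_reachable("cnn.com")` and `tcp_reachable("nytimes.com")` while WAN2 is down, and again after disabling WAN2, would show whether the failure is at the TCP level rather than in the browser.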


If anyone has any insights or solutions, it would be great to hear them.
#2
Mr.Moo52, I just found your response to my old thread. That's brilliant. It finally solves the problem of me having to manually set the link speed every time I reboot OPNsense. As you note, the link speed can be set under Interfaces > WAN > Generic Configuration > Speed and Duplex.
#3
I had a problem that OPNsense took hours to update or, in some cases, hung and didn't update at all. I solved the problem a few weeks ago (when I was on something like version 24.7.3) by selecting "Prefer to use IPv4 even if IPv6 is available" in System > Settings > General.

Perhaps you are having a similar problem that could be solved by changing this setting.
#4
Thanks for trying.

However, I have now done some additional testing, and I believe that this problem might not be an OPNsense problem. It might be a host (KVM or TrueNAS) problem. Here is why I think that.

I just now tried passing through a different adapter (not the one I ultimately want to use) using para-virtualization, and OPNsense handled gateway monitoring and fail-over properly. In particular, when the cable to the physical adapter was yanked, the dashboard showed packet loss rising, and then the dot changed from green to red.

This is not the adapter I ultimately want to use because it's a 1 Gbps adapter, and my internet plan is 1.25 Gbps, so I need to use the other adapter (which is 10 Gbps, supports NBASE-T, and negotiates with the modem at 2.5 Gbps).

It appears, then, that I only have this problem with one particular physical adapter (the 10 Gbps adapter) that I pass to OPNsense. I suspect the problem might relate to not rebooting the host server between when this adapter is used for PCIe pass-through and when it's used for para-virtualization, but I cannot easily reboot the host server for the next couple of days, so I cannot test that theory at the moment.

In any event, it now seems unlikely that this is an OPNsense problem. However, I will leave this thread up in case anyone finds it useful. And again, thanks for your help troubleshooting.
#5
Here is a screenshot of the OPNsense dashboard gateway status, which shows packet loss (currently 0.0%, which is correct, because both WAN and WAN2 are up). IP addresses have been redacted.

The way it's always worked with gateway monitoring and PCIe pass-through is that if either WAN or WAN2 goes down, the "Loss" value gradually climbs from 0% toward 100%, and when it gets above, as I recall, 20% (a threshold that can be configured somewhere), the dashboard switches the dot from green to red for that gateway, and OPNsense takes the interface offline and fails over to the other gateway.
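
To illustrate the behavior I expect, here is a rough sketch of loss-based gateway monitoring. It is not how OPNsense actually implements it; the class name, the 10-probe window, and the 20% threshold (the figure I recalled above) are all assumptions for illustration:

```python
from collections import deque


class LossMonitor:
    """Track recent probe results and flag a gateway as down above a loss threshold."""

    def __init__(self, window=10, down_threshold_pct=20.0):
        # Only the most recent `window` probe results are kept (assumed size).
        self.results = deque(maxlen=window)
        self.down_threshold_pct = down_threshold_pct

    def record(self, replied):
        """Record one probe: True if a reply came back, False if it was lost."""
        self.results.append(bool(replied))

    def loss_pct(self):
        """Percentage of lost probes in the current window."""
        if not self.results:
            return 0.0
        lost = sum(1 for ok in self.results if not ok)
        return 100.0 * lost / len(self.results)

    def is_down(self):
        """The gateway counts as down once loss rises above the threshold."""
        return self.loss_pct() > self.down_threshold_pct
```

With a monitor like this, a pulled cable makes every new probe fail, so the loss figure climbs step by step and the gateway flips to "down" once it passes the threshold. That gradual climb is what the dashboard shows with pass-through adapters.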

With para-virtualized adapters rather than PCIe pass-through, when WAN or WAN2 goes down, this part of the dashboard continues to show Loss as "0.0%" and never switches the dot from green to red, even though fail-over to the working interface appears to happen properly.
#6
Yes, by virtual adapters, I mean VirtIO paravirtualized adapters, and you're correct, they appear in the OPNsense VM as "vtnet," not "vnet," which was a mistake.

I now see what you're saying: they will show as green in OPNsense even if the underlying physical adapter in the Linux host is down. However, I still think there is something wrong with the OPNsense dashboard, because it shows "Loss: 0.0%" for WAN, even when gateway monitoring has taken WAN offline due to being unable to ping the monitored IP (which is 8.8.8.8, i.e. Google, in my case).

Still, maybe this isn't such a big deal because gateway monitoring with fail-over still appears to actually work. It's just that the dashboard incorrectly shows 0.0% packet loss even when monitoring is unable to successfully ping.
#7
Proxmox (PVE) and TrueNAS SCALE are both KVM on top of Debian, so for the purpose of virtualization, they are very similar.

By "virtual adapters," I am referring to the feature in Linux, FreeBSD, and other Unix-like operating systems that allows adapters to be passed to a guest VM, where they appear as "vnet1," "vnet2," etc. This is different from PCIe pass-through, which is more like the VM having full control of the adapter as though it were the host. Both have advantages and disadvantages. I am trying to use virtual adapters for WAN and WAN2, rather than the PCIe pass-through I currently use, because I would like to eventually run WAN and WAN2 through a switch as VLANs and then pass each VLAN to OPNsense as a virtual adapter.

Gateway monitoring with virtual adapters does not reflect properly in the dashboard in OPNsense 24.7.6, however, because not only does the dashboard show WAN (or WAN2) as up (in green) even when it's down, but it shows packet loss as 0.0%.
#8
I have a multi-WAN setup with gateway monitoring in OPNsense 24.7.6 that I run virtually on TrueNAS SCALE.

It works properly if I pass through the two WANs (called "WAN" and "WAN2") using PCIE pass-through. However, if I virtualize either of the WANs and pass it through as a virtual adapter, then the dashboard always shows a green dot, which is supposed to indicate that the interface is up, even when the interface is down. (Please note that this appears to be a dashboard problem only, because gateway monitoring appears to still actually work and fail over to whichever WAN is still up.)

As I recall, this problem with the dashboard and virtual adapters subject to gateway monitoring did not exist in earlier versions of OPNsense, such as 24.2, which was the last time I passed through virtual adapters. So it appears to be a new problem and, I suspect, a bug in 24.7.

Does anyone know how to fix this dashboard problem with gateway monitoring in version 24.7? Alternatively, does anyone have any experience or feedback with this problem?
#9
Good job, PJW, on solving this problem. However, as of the latest update to OPNsense, I do not believe that your solution, which involves editing hidden LAN firewall rules, is necessary.

In particular, when I first ran into this problem of multi-gateway rules being broken after upgrading to 24.7 several weeks ago, I downgraded to 24.1 as a workaround.

Today, I again upgraded to 24.7, and immediately after the upgrade, the !sshlockout rule you mentioned appeared as a hidden LAN firewall rule. However, after that, I updated to the latest version as of today, which is 24.7.3_1, and when I looked in the LAN firewall rules, I observed that the !sshlockout rule was gone. So it appears that this problem has been addressed in firmware in the latest version.

Next, I yanked the cable from the WAN and WAN2 interfaces in turn, and failover to the other interface in the gateway group occurred properly. So, it seems that this problem, while it existed in the original release of 24.7 due to the problematic !sshlockout rule, no longer exists in 24.7.3_1.
#10
This recent post may shed some light on this issue: https://forum.opnsense.org/index.php?topic=42552.0.

If WAN cannot ping remote hosts in 24.7, that could explain why gateway monitoring is broken.

For those of you who have 24.7 installed (as noted, I rolled back to 24.1.10 due to this problem), I would suggest manually attempting to ping from each public-facing interface (WAN, WAN2, etc.) to 8.8.8.8 or some other remote host to determine if that's the source of the problem.
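
As a rough sketch of that test, the following builds a ping command bound to a specific source address and parses the loss figure from its output. It assumes the FreeBSD `ping` on OPNsense (where `-S` sets the source address and `-c` the probe count) and a Python 3 interpreter on the firewall shell; the function names are hypothetical:

```python
import re
import subprocess


def build_ping_command(source_ip, target="8.8.8.8", count=3):
    """Build a FreeBSD-style ping command that sends from source_ip."""
    # -c: number of probes to send, -S: source address to send from.
    return ["ping", "-c", str(count), "-S", source_ip, target]


def parse_loss_percent(ping_output):
    """Extract the packet-loss percentage from ping's summary line, or None."""
    match = re.search(r"([\d.]+)% packet loss", ping_output)
    return float(match.group(1)) if match else None


def ping_from(source_ip, target="8.8.8.8", count=3):
    """Run ping from source_ip and return the reported loss percentage."""
    result = subprocess.run(
        build_ping_command(source_ip, target, count),
        capture_output=True,
        text=True,
    )
    return parse_loss_percent(result.stdout)
```

Running `ping_from(...)` once with the WAN address and once with the WAN2 address would show whether either interface is unable to reach the monitor IP on its own.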
#11
For some reason, I have found nothing about this issue except this thread. It definitely worked prior to the upgrade to 24.7, and absolutely does not in 24.7, at least when I tried it last week.

As noted, I downgraded to 24.1.10, and it's back to working, but I was able to do so by rolling back to a snapshot.

One tip: If you downgrade manually to 24.1.10, make sure you have a config file ready that was created in 24.1.10 or earlier. At least in most similar setups (and I assume OPNsense is the same way), restoring from a config file only works if the file was created in the same version as, or an earlier version than, the one it's being restored to.

Of course, downgrading is only a temporary solution. It's not feasible to remain with 24.1.10 permanently, so hopefully there is some interest in a workaround or patch in 24.7 for this, because it's beyond my technical skills to fix it on my own.
#12
I'm not sure if you have a multi-WAN setup, but gateway groups with multi-WAN appear to be broken in 24.7. The symptoms are similar to yours, so it may be related. After an internet outage, WAN does not come back online. Here is a thread: https://forum.opnsense.org/index.php?topic=41915.0.

Because of this, I have downgraded to 24.1 for the time being.
#13
Thanks. I have downgraded to version 24.1.10_8, and multi-WAN works properly again. After the downgrade, I yanked the cable into the modem for each gateway respectively, and OPNsense properly failed over to the other gateway with uninterrupted internet.

Hopefully, this problem will get fixed at some point, and I will then upgrade again.
#14
Is there any update on this problem? I upgraded to 24.7 about a week ago, and I learned today that my multi-WAN setup with a gateway group no longer works.

I have two gateway interfaces--one is called WAN and the other WAN2. I have them in a gateway group, with WAN being the primary gateway and WAN2 the secondary gateway that is only supposed to be used if WAN fails.

Today, the secondary gateway, WAN2, went down due to an outage at the ISP, and I lost internet in my house even though the primary gateway, WAN, was still active and able to send and receive packets.

After WAN2 came back online, I duplicated the problem manually by yanking the cable on WAN2, and again, I lost internet in the house.

Interestingly, when the primary gateway, WAN, fails, then failover to WAN2 happens as it should, but when WAN2 fails, there is a loss of internet entirely. All of this worked in 24.1.
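
For clarity, this is the failover behavior I expect from a two-tier gateway group, written as a small sketch. It is only a model of the intended logic, not OPNsense's code; the function name and the data layout are assumptions:

```python
def pick_gateway(group, is_up):
    """Return the name of the best available gateway, or None if all are down.

    group: list of (name, tier) pairs, where a lower tier means higher priority.
    is_up: dict mapping gateway name to True (up) or False (down).
    """
    # Keep only the gateways that monitoring currently reports as up.
    candidates = [(tier, name) for name, tier in group if is_up.get(name, False)]
    if not candidates:
        return None
    # The lowest tier number wins; the name only breaks ties.
    return min(candidates)[1]
```

With `group = [("WAN", 1), ("WAN2", 2)]`, losing WAN2 should leave traffic on WAN, since WAN2 was never the selected gateway in the first place. In 24.7, that is the case that fails for me.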

I then searched and found this thread. Unfortunately, I do not see any solution here. Unless there is one, I'm going to downgrade to 24.1, which I can do easily because I run OPNsense in a VM on TrueNAS SCALE and can roll back to a snapshot. Of course, downgrading is not my first choice since I like the new dashboard in 24.7, but it's more important to have multi-WAN working.
#15
As a follow-up, I am making some progress on this. I haven't deployed instance configuration yet, but some things are becoming clearer.

First, I have realized that the lack of ability to specify an interface with instance configuration is not a problem, at least for me, because in the client export menu, I will specify a hostname associated with WAN, which will ensure that any authorized connection will arrive on the WAN, rather than WAN2, interface.

Second, some of my questions were answered in this thread: https://forum.opnsense.org/index.php?topic=38532.0.

The bottom line is that there does not appear to be a comprehensive, detailed source of information regarding migration from legacy server to instance configuration, but it's possible to piece it together from the official Deciso documentation here, https://docs.opnsense.org/manual/how-tos/sslvpn_instance_roadwarrior.html, and the other thread in this forum that I mentioned.