Messages - FLguy

#1
Quote from: EricPerl on April 02, 2025, 09:08:00 PMIf you're not using any load balancing or auto failover, you might be better off only having 1 WAN interface, which also means 1 GW.
You might still have gremlins left over from some multi-WAN behavior.

The arp message might just be a consequence of the existence of these 2 GWs.

No, no, no. I want to get back to the multi-WAN load-balancing configuration that I had in November. The only reason I am disabling interfaces or gateways is to troubleshoot this issue. This issue could definitely be related to multi-WAN, even though most, if not all, of the posts about the arpresolve message involve single-WAN configurations.
#2
Quote from: EricPerl on April 01, 2025, 08:32:14 PMI just did a search on the error message and there were a few entries in the forum.
https://forum.opnsense.org/index.php?topic=34340.0

Damn, I did see this post after all. It doesn't provide any helpful information. He mentioned trying to reach a gateway outside the subnet. My default gateway is the IP that arpresolve complains about, and of course that's in the same subnet as my interface.
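
For anyone who wants to rule out the "gateway outside the subnet" explanation on their own setup, here is a minimal sketch of that check. The addresses are placeholders from the documentation ranges, not values from this thread, and the function name is made up for illustration.

```python
import ipaddress


def gateway_on_link(interface_cidr: str, gateway: str) -> bool:
    """Return True if the gateway address falls inside the interface's subnet."""
    network = ipaddress.ip_interface(interface_cidr).network
    return ipaddress.ip_address(gateway) in network


# Placeholder values (documentation ranges); substitute the WAN lease and gateway.
print(gateway_on_link("203.0.113.57/24", "203.0.113.1"))   # True: on-link gateway
print(gateway_on_link("203.0.113.57/24", "198.51.100.1"))  # False: "far" gateway
```

If this prints True for the real lease and gateway, the "far gateway" explanation does not apply, which matches what I am seeing.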

After you suggested moving my WAN-F interface, that got me thinking: there is no special configuration between my WAN-F = WAN and WAN-C = OPT2 interfaces. They are both DHCP. I can just swap the two cables between those two interfaces and see if the problem starts happening with my cable provider. So the interface where the cable provider is plugged in is now WAN-F, and WAN-C is now <wan>, and that interface is still disabled. So far, it's been over 24 hours with no outages.

After 3 days, I will re-enable the other interface to see what happens. These two interfaces are configured in the same way:

    <wan>
      <if>igb0</if>
      <descr/>
      <spoofmac/>
      <blockpriv>1</blockpriv>
      <blockbogons>1</blockbogons>
      <ipaddr>dhcp</ipaddr>
      <dhcphostname/>
      <alias-address/>
      <alias-subnet>32</alias-subnet>
      <dhcprejectfrom/>
      <adv_dhcp_pt_timeout/>
      <adv_dhcp_pt_retry/>
      <adv_dhcp_pt_select_timeout/>
      <adv_dhcp_pt_reboot/>
      <adv_dhcp_pt_backoff_cutoff/>
      <adv_dhcp_pt_initial_interval/>
      <adv_dhcp_pt_values>SavedCfg</adv_dhcp_pt_values>
      <adv_dhcp_send_options/>
      <adv_dhcp_request_options/>
      <adv_dhcp_required_options/>
      <adv_dhcp_option_modifiers/>
      <adv_dhcp_config_advanced/>
      <adv_dhcp_config_file_override/>
      <adv_dhcp_config_file_override_path/>
    </wan>
    <opt2>
      <if>igb2</if>
      <descr>WANC</descr>
      <enable>1</enable>
      <spoofmac/>
      <blockpriv>1</blockpriv>
      <blockbogons>1</blockbogons>
      <ipaddr>dhcp</ipaddr>
      <dhcphostname/>
      <alias-address/>
      <alias-subnet>32</alias-subnet>
      <dhcprejectfrom/>
      <adv_dhcp_pt_timeout/>
      <adv_dhcp_pt_retry/>
      <adv_dhcp_pt_select_timeout/>
      <adv_dhcp_pt_reboot/>
      <adv_dhcp_pt_backoff_cutoff/>
      <adv_dhcp_pt_initial_interval/>
      <adv_dhcp_pt_values>SavedCfg</adv_dhcp_pt_values>
      <adv_dhcp_send_options/>
      <adv_dhcp_request_options/>
      <adv_dhcp_required_options/>
      <adv_dhcp_option_modifiers/>
      <adv_dhcp_config_advanced/>
      <adv_dhcp_config_file_override/>
      <adv_dhcp_config_file_override_path/>
    </opt2>
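
To double-check "configured in the same way" without eyeballing two long blocks, a short script can list only the tags that differ. This is a sketch under assumptions: the two blocks are wrapped in a single parent element so they parse as one document, the sample below is a trimmed stand-in for the config above, and the helper names are illustrative.

```python
import xml.etree.ElementTree as ET


def interface_settings(xml_text: str, name: str) -> dict:
    """Map each child tag of the named interface block to its text ('' if empty)."""
    node = ET.fromstring(xml_text).find(name)
    if node is None:
        raise KeyError(name)
    return {child.tag: (child.text or "").strip() for child in node}


def diff_interfaces(xml_text: str, first: str, second: str) -> dict:
    """Return {tag: (first_value, second_value)} for tags that differ; None means absent."""
    a = interface_settings(xml_text, first)
    b = interface_settings(xml_text, second)
    return {
        tag: (a.get(tag), b.get(tag))
        for tag in sorted(set(a) | set(b))
        if a.get(tag) != b.get(tag)
    }


# Trimmed sample; paste the full <wan> and <opt2> blocks inside the wrapper instead.
SAMPLE = """<interfaces>
  <wan><if>igb0</if><descr/><ipaddr>dhcp</ipaddr><blockpriv>1</blockpriv></wan>
  <opt2><if>igb2</if><descr>WANC</descr><enable>1</enable><ipaddr>dhcp</ipaddr><blockpriv>1</blockpriv></opt2>
</interfaces>"""

for tag, (wan_value, opt2_value) in diff_interfaces(SAMPLE, "wan", "opt2").items():
    print(f"{tag}: wan={wan_value!r} opt2={opt2_value!r}")
```

On the sample, only the port, description, and enable flag differ, which is what you would expect when one interface is disabled.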
#3
Quote from: cookiemonster on March 31, 2025, 02:47:10 PMYou now mention VLANs.
My bad. The VLANs and "LAN segments" have no issues reaching each other, or the cable internet when it is active. This is 100% a WAN issue, and only with my fiber provider. Every post/thread I read on the arpresolve message, for both pfSense and OPNsense, reads like my experience. Most if not all of them are random in nature and started out of the blue. None of these threads has found a root cause for this issue. I have seen workarounds. I want to confirm that I get this message again before I next repair the issue.

No on far gateway, though I can easily see that as a cause of this problem. I even tried disabling gateway monitoring and Host Route. I have read many posts about disabling the gateway monitor (aka dpinger).

Quote from: EricPerl on March 31, 2025, 07:39:51 PMPer franco on another thread: The error comes from trying to reach a gateway outside of your WAN subnet.
If you still have the GW group (even though you disable one of the interfaces) and the active one goes down...
Or possibly you get a bogus WAN IP (seems unlikely).

That's one of the reasons I suggested to reassign WAN instead of enabling/disabling WAN-C/WAN-F. No GW group.
Of course, I don't know how different your interfaces configs are, so it may be too cumbersome.

I wouldn't entirely rely on the GW monitor. Again, you can go to Interfaces > Diagnostics and check things out.
There's also ssh and command line...

I would like to see the message from Franco. 

So, there are no gateway groups. I did configure one to try to resolve this problem, but it didn't work, so I removed it. I believe gateway groups are more for PBR than anything else. This issue is 100% a Layer 2 or Layer 3 problem between my fiber provider's ONT device and my router. That arpresolve message popping up on other setups, and the problem it causes, is exactly my issue.

Moving WAN-F to another interface is a good suggestion. I don't have any inbound rules to move. 

Without being told to look at dmesg.today, I would still be in the dark. There are many posts describing my exact experience. It's a VERY frustrating issue for everyone.






#4
So the System log shows nothing other than my own actions (config changes) to resolve the problem. Now dmesg.today is confusing me, as I see a reboot:

Waiting (max 60 seconds) for system process `vnlru' to stop... done
Waiting (max 60 seconds) for system process `syncer' to stop...
Syncing disks, vnodes remaining... 0 0 0 0 done
All buffers synced.
Uptime: 16d1h20m50s
---<<BOOT>>---
Copyright (c) 1992-2023 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
        The Regents of the University of California. All rights reserved.

I don't remember rebooting the router, but it's possible I did. Here is the interesting part: right before the boot messages above, I have this message:

arpresolve: can't allocate llinfo for 1.2.3.4 on igb2
It appears over 100 times. Of course, 1.2.3.4 is a redaction of my fiber provider's gateway, and igb2 is my fiber interface.  :)  SOMETHING.  What does this message mean?!!?  haha

I'm still looking into the details, but it doesn't appear to be a DHCP issue, as I have messages at 10:30 a.m. today that the WAN interface renews DHCP without any issues. My problem today started at 22:26. 

But this arpresolve message is my problem. I see many people with pfSense have the same issue. So far, static ARP and a static IP resolve the problem, but there is still no root cause for this issue...
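
Counting the message by hand gets old quickly, so here is a rough sketch for tallying it per address and interface. It assumes the log text has already been read into a string (for example, the contents of dmesg.today); the sample lines reuse the redacted message from above, and the function name is illustrative.

```python
import re
from collections import Counter

# Matches the kernel message quoted above and captures the address and interface.
ARPRESOLVE = re.compile(r"arpresolve: can't allocate llinfo for (\S+) on (\S+)")


def count_arpresolve(log_text: str) -> Counter:
    """Count arpresolve failures per (address, interface) pair."""
    counts = Counter()
    for line in log_text.splitlines():
        match = ARPRESOLVE.search(line)
        if match:
            counts[(match.group(1), match.group(2))] += 1
    return counts


sample = "\n".join([
    "arpresolve: can't allocate llinfo for 1.2.3.4 on igb2",
    "arpresolve: can't allocate llinfo for 1.2.3.4 on igb2",
    "---<<BOOT>>---",
])

for (address, interface), hits in count_arpresolve(sample).items():
    print(f"{address} on {interface}: {hits} messages")
```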
#5
LOL, in the middle of submitting my last post above, the problem happened.

It is a Layer 2 or Layer 3 issue with my fiber provider.  My routing table still had the default route, but I could not ping the gateway from two different computers on two different VLANs.  Wired and Wireless.  Yet in Gateways, OPNsense shows that my gateway monitor is alive, which is my default route peer IP.  So internal clients can't reach the fiber provider peer IP, yet OPNsense thinks everything is fine. 

I'm digging into the details now.
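
Since the gateway monitor on the firewall and the LAN clients disagree, one way to get comparable timestamps is to log reachability changes from a client. The sketch below is only an illustration: the probe is pluggable, `ping_once` shells out to the system `ping` with `-c 1` (assumed to be available on the client), and the demo at the bottom uses a fake probe so it runs without touching the network.

```python
import subprocess
import time
from typing import Callable, List, Tuple


def ping_once(host: str) -> bool:
    """Send one ICMP echo via the system ping; True if a reply came back."""
    try:
        result = subprocess.run(
            ["ping", "-c", "1", host],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=5,
        )
    except (subprocess.TimeoutExpired, OSError):
        return False
    return result.returncode == 0


def record_transitions(
    probe: Callable[[], bool], samples: int, interval: float = 0.0
) -> List[Tuple[float, bool]]:
    """Call probe() repeatedly and keep (timestamp, state) only when the state changes."""
    transitions: List[Tuple[float, bool]] = []
    previous = None
    for _ in range(samples):
        state = probe()
        if state != previous:
            transitions.append((time.time(), state))
            previous = state
        time.sleep(interval)
    return transitions


# Demo with a fake probe: up, up, down, down, up.
fake_replies = iter([True, True, False, False, True])
events = record_transitions(lambda: next(fake_replies), samples=5)
print([state for _, state in events])  # [True, False, True]
```

For a real run from a LAN client, something like `record_transitions(lambda: ping_once("1.2.3.4"), samples=720, interval=5)` would watch the (redacted) gateway for about an hour; the timestamps can then be lined up against the firewall logs.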
#6
@cookiemonster, NO APOLOGY needed. I 1000% appreciate the support. I have tried to simplify the configuration as much as possible, to the point where I am disabling either the cable or fiber interface via Interfaces > WAN-C (or WAN-F) and unchecking Enable Interface. When I run on the cable provider, I never have the problem unless the cable provider itself has an issue. I can see that because the monitor for that link is red in Gateways.

Then, when I have the time and desire to troubleshoot this issue, I disable WAN-C and re-enable WAN-F. Most times, it lasts 24 hours or more. Then, randomly, everything breaks. Of course, in this simplified configuration, my work laptop does lose VPN access. I'm a network engineer by trade.  ;)  Not a good look.  haha

Yes, both WAN interfaces are using DHCP. I will start looking at the dmesg.today log file.

Thanks very much for the support here. I was running this configuration for years, but something happened last November, and it hasn't been right since. Of course, the fiber provider wasn't helpful.
#7
@cookiemonster, I appreciate the reply. In my original post, I did try to list the Multi-WAN settings (all the settings I thought would matter), with some highlighted in bold:
Quote from: FLguy on January 29, 2025, 03:57:30 PMBefore this problem started, I wasn't running a complete Multi-WAN configuration:
System : Settings : General : Gateway switching was not checked
Firewall : Settings : Advanced : Disable force gateway was not checked

I do have both gateways monitoring their default gateway.  The Fiber gateway is the preferred gateway and has a lower priority.
When the problem first started, I could disable the fiber gateway to restore internet access. 
If I disable the fiber gateway in System: Gateways: Configuration
or reboot the firewall
or Reload all services in cli
That will fix the problem.
I have tried other things like disabling the Cable WAN interface

Sorry, I didn't list my interfaces nor rules, as they are default for the most part.  No Pi-hole or AdGuard.  Funny thing is, really not many moving parts to my setup at all. 

That said, this problem isn't a MultiWAN issue (as I haven't discussed Multi-WAN since the first post). When I'm trying to troubleshoot this problem, my cable interface is entirely disabled.



What can make OPNsense stop routing traffic over the internet, where reloading all services fixes the problem?  Are there any log messages I should look out for?

Thanks for your time and support.
#8
I'm still having this issue, only with my fiber provider. I have been running off my cable provider, but it had issues on Monday, so I had to move to my fiber provider, and I can't go a full 24 hours without losing the internet completely. I really believe it's OPNsense now, because if I reload the services via the CLI, it always hangs on the interface connected to the fiber provider. After 20 to 30 seconds, the internet starts working.

I'm still trying to work out which logs I should look at. Anything to troubleshoot?

OPNsense was great for years, and I hate to move away from it, but I might try pfSense.

Any advice would be great.
#9
Hello all,

I still have this problem, but it's gotten worse. The problem is either with my fiber ISP, the ONT ("modem") device, or my OPNsense router. If I reboot my OPNsense router or reload services, the problem disappears for about 5 to 90 minutes. Then I lose internet access over that link. Right now, I have the gateway for that provider disabled, and I have no issues running on my cable provider.

Any advice on what I can look at to see what is causing this issue? My hardware is an i5 Protectli. I just purchased another router from AliExpress to test whether it is hardware-related. Once it gets here, I am going to install a fresh copy of OPNsense and apply my current configuration to it. I doubt it is hardware-related, but if the problem persists on the new router, it will have to be the ISP, the ONT, or my OPNsense configuration.

I would like to know if anyone has suggestions on what logs to look at, or any troubleshooting advice. I have a support ticket open with my fiber provider; of course, they don't see any problems, but I'm asking for a new ONT device. I have done everything I can think of. Part of me feels it's OPNsense because I can reload services from the CLI, and the internet is functional for another 5 to 90 minutes.

Also, another thing to mention is that when I reload services from the CLI menu, about 40 to 50% of the time it will "hang" on "Configuring WANF interface..." (which is the interface connected to my fiber ONT device). With my cable provider's interface, which is named just WAN, it never hangs on configuring that interface.

Thanks for your time,
Nick

#10
The internet gremlins have been out to get me since I wrote this post. The problem has happened 4 times now. Now I just have an active SSH session open to the firewall to use option 11) Reload all services. Once I do, everything goes back to normal. I'm not sure what to do at this point. This firewall has been perfect for over 2 years. Now, something is wrong with it. Rebooting doesn't solve the problem long term. Back in Nov, it was once every few weeks, then Dec - Jan, every 2 to 4 days. Now, it happens every 8 to 12 hours.  :(

Connections are randomly lost, and DNS does not resolve external FQDNs. My work computer is actively on Cisco VPN and is unaffected by this problem. That IPsec VPN from my laptop stays active, and all its internet traffic is backhauled over that session. When this problem starts, the rest of my house falls apart.

I would love to know what I should look at and a possible path to resolving or troubleshooting the issue. 

Thank you in advance.
#11
Hello all,

I have been running OPNsense for a few years without issues.  The firewall can access two internet providers, 1G Fiber and 1G Cable.  I have had both providers for years as well.  Back in November, we started having problems losing access to the internet.  Yet both providers were OK.  Then I noticed it was not a complete internet loss, as some clients could still ping 1.1.1.1.  But DNS wasn't resolving internet hostnames.  Internal hostnames were fine.

Before this problem started, I wasn't running a complete Multi-WAN configuration:

System : Settings : General : Gateway switching was not checked

Firewall : Settings : Advanced : Disable force gateway was not checked


I do have both gateways monitoring their default gateway.  The Fiber gateway is the preferred gateway and has a lower priority.

When the problem first started, I could disable the fiber gateway to restore internet access. At first, I thought the issue was DNS (Unbound). I would then re-enable the fiber gateway, and everything was fine.

So, I enabled the Multi-WAN settings above, but the problem still occurs.  Every few days, internet access is lost, DNS can't resolve external FQDN, some clients can ping 1.1.1.1, and others can't.  I updated the software to 24.7.11_2 back in Nov. The firewall had ~24.7.1.  I have removed the Multi-WAN settings.  Nothing is fixing this problem.
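
Because the symptoms mix "can reach 1.1.1.1" with "DNS can't resolve", it helps to test the two separately from a client when the problem hits. The following is a rough sketch, not a definitive diagnostic: it swaps ping for a TCP connection to 1.1.1.1 on port 53 (no raw sockets needed), `example.com` is a placeholder hostname, and all function names are made up for illustration.

```python
import socket


def can_connect(ip: str, port: int = 53, timeout: float = 3.0) -> bool:
    """TCP reachability probe by IP address, so DNS is not involved."""
    try:
        with socket.create_connection((ip, port), timeout=timeout):
            return True
    except OSError:
        return False


def can_resolve(hostname: str) -> bool:
    """True if the system resolver returns at least one address for hostname."""
    try:
        return bool(socket.getaddrinfo(hostname, None))
    except OSError:
        return False


def classify(ip_ok: bool, dns_ok: bool) -> str:
    """Turn the two probe results into a short diagnosis."""
    if ip_ok and dns_ok:
        return "routing and DNS both work"
    if ip_ok:
        return "routing works, DNS resolution fails"
    if dns_ok:
        return "name resolves (possibly from cache) but the IP probe fails"
    return "no connectivity: suspect the WAN link or default route"


def check(ip: str = "1.1.1.1", hostname: str = "example.com") -> str:
    """Run both live probes; call this from a client while the problem is happening."""
    return classify(can_connect(ip), can_resolve(hostname))


# The situation described above: some clients reach 1.1.1.1 but names do not resolve.
print(classify(ip_ok=True, dns_ok=False))
```

Running `check()` on a couple of clients during an outage would show whether each one is in the "routing works, DNS fails" state or has lost connectivity altogether.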

If I disable the fiber gateway in System: Gateways: Configuration
or reboot the firewall
or Reload all services in cli

That will fix the problem. 

I have tried other things like disabling the Cable WAN interface and running only on the fiber connection. 

I need some guidance on what to look at.

Thank you for any time and support,
Nick
#13
General Discussion / Re: DNS not working via LAN
June 25, 2024, 05:01:51 PM
There is minimal context to support this question. 
Is the LAN subnet 10.0.0.0/x?  Is the LAN interface 10.0.0.1? 
Unbound DNS using "Default settings" works most of the time.  So, what settings did you change? 
Is Unbound DNS running (is there a green Play button at the top right)?
Services: Unbound DNS: General > Network Interfaces set to All (recommended)?
Services: Unbound DNS: Statistics > Do you see Queries increasing?
#14
Hello nirr. 

Do you have an OPNsense question for this OPNsense forum? Your whole post is better suited to something like Reddit or Discord. Folks are here to support the OPNsense community on OPNsense-related topics.

#15
Quote from: domidam on June 18, 2024, 06:22:12 PMThat being said, there could just totally be on setting or something that I am missing. Any other suggestions?

After you make this change, nothing is broken.  The system you are using to configure the firewall needs to be in the same subnet as the NEW mgmt network, in this case, 192.168.2.x/24. 

Quote from: FLguy on June 18, 2024, 07:03:13 AMFor example, Is the new management interface the same physical interface you use to configure the firewall? If it's getting a new IP, you will have to request a new IP from DHCP or assign your PC to a new static address for management.

Brother, I have already mentioned this to you.  If your mgmt interface is now 192.168.2.1, then statically assign your computer to:

IP: 192.168.2.10
Subnet mask: 255.255.255.0

Now connect to your firewall again and configure DHCP on the MGMT network.  Then, set your computer back to DHCP, and you should be good to continue from there. 
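
If it is unclear whether a given PC address can talk to the new management interface, the check is just subnet membership. A minimal sketch using the addresses above; 192.168.1.50 is a hypothetical "old" address added for contrast, and the function name is illustrative.

```python
import ipaddress


def same_subnet(firewall_ip: str, pc_ip: str, netmask: str) -> bool:
    """True if the PC address sits in the subnet defined by the firewall IP and mask."""
    network = ipaddress.ip_interface(f"{firewall_ip}/{netmask}").network
    return ipaddress.ip_address(pc_ip) in network


print(same_subnet("192.168.2.1", "192.168.2.10", "255.255.255.0"))  # True
print(same_subnet("192.168.2.1", "192.168.1.50", "255.255.255.0"))  # False (hypothetical old address)
```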

domidam, please read the replies thoroughly.  Patrick was very clear in his first reply. 

Take care!