Show posts

This section allows you to view all posts made by this member. Note that you can only see posts made in areas you currently have access to.

Messages - Dowd

#1
Quote from: Patrick M. Hausen on June 06, 2024, 09:54:20 AM
Quote from: Dowd on June 06, 2024, 06:57:09 AM
I went into BIOS and under BMC there is a feature? function? called LAN Failover (why is this default I have no idea), and the MAC address for it matched the rogue one that kept popping up in the logs.
It's the default because it makes sense. You want IPMI for any server - at least I know that I do. And piggybacking the IPMI interface on the first network port is common practice so you have IPMI available with just a single cable.

I don't have any monitor and keyboard anywhere for about one hundred systems. IPMI is the way ...

If you disabled the BMC how do you intend to monitor and control your fan speeds, for example?

And BTW - since at least with the Supermicro boards I run, IPMI is always piggybacked to the first network port - why is that WAN in your installation? The first port is LAN in new OPNsense installs so you changed that? If you keep the standard order of interfaces, IPMI will receive an IP address from your OPNsense DHCP server and will be reachable from LAN - which makes much more sense.

Kind regards,
Patrick

I don't disagree with you at all about the benefits of IPMI/BMC. My issue is with how terrible this implementation is: by default, if the BMC port is not plugged in, it will hijack a port that is not allocated to it. Let's suppose you run a server that is issued a public IP address or sits at the edge of your network. By default your BMC can take over your IP address, exposing your IPMI to the web, assuming you did not properly create firewall rules (if they are even supported). What happens if your BMC loses connectivity because a switch goes down or a cable goes bad? It will take over whatever traffic is happening on your main interface, killing whatever is running on it. The BMC going down is peanuts compared to, say, your email server or web server going down because your BMC decided it wants the main port. It's a terrible design that needs to be changed so you opt INTO bonding interfaces rather than opt OUT, as the defaults are not secure in the slightest. I prefer the old days where management ports were dedicated: you could turn them on or off, but management always stayed on that specific port.

As for why I disabled it: I currently only use the BMC for firmware and BIOS flashing, so fan speeds don't really matter. It's a server after all; if fan loudness mattered, I would honestly just order quieter fans.

Quote from: cookiemonster on June 06, 2024, 06:19:35 PM
> If I don't reply back to this thread, then for anyone in the future this was the culprit. Apparently there is a way to turn it off in the firmware as well.

What about coming back to mark it as resolved and confirm it was? And to say thanks to people who stopped to try to give a hand. Would take all of about 30 secs.

It's marked as resolved and I have given rep.
#2
So I am 90% sure I found the issue and I am currently testing to see if it goes away. I noticed upon boot this time around that it output the WAN address for the IPMI interface despite there being no cable in the IPMI port, which struck me as really odd. I went into BIOS and under BMC there is a feature? function? called LAN Failover (why is this default I have no idea), and the MAC address for it matched the rogue one that kept popping up in the logs.

Since I can't toggle the failover on/off in BIOS, I just disabled the BMC completely. If I don't reply back to this thread, then for anyone in the future: this was the culprit. Apparently there is a way to turn it off in the firmware as well.

While I was trying to find a fix, I came across a Reddit thread from 8 years ago describing the same issue - https://www.reddit.com/r/homelab/comments/58ojk1/psa_if_you_are_using_an_asrock_board_with_ipmi/

This also explains why the system didn't disconnect for long periods of time when I first set it up: I had the BMC cable plugged in.
#3
Quote from: cookiemonster on June 06, 2024, 12:20:09 AM
also if proxmox is the OS on the host, list the interfaces $ip a

Can you check the networking setup of your other VM ? If the MAC in the list appears there, then it could be the VM causing the clash. Maybe some software there could also be worth visiting.

Sorry I meant ip a before instead of ip -a. Force of habit. Same result.

I ran ip a on the other VM. It has one interface, a virtual one, and its MAC address doesn't match. The other interfaces are all related to k8s/docker and also do not match.

The only idea I have so far is to recheck the BIOS for any weird networking that is being done, and then reinstall the OS from scratch with OPNsense being the only VM for 24 hours before I start doing anything else. At this stage I am baffled and out of ideas.
#4
Quote from: opnfwb on June 06, 2024, 12:09:13 AM
That all seems logical and correct.

The phantom MAC that is stealing the WAN IP, does that MAC coincide with the IPMI interface? I'd want to be extra sure that isn't somehow popping up on the NIC that proxmox is using for the WAN bridge and causing all kinds of issues. I know this is apples/oranges but some newer HP server hardware allows the lights out functionality "move" around to different NICs, so I'm not sure if Asrock has a similar function or not?

The other odd issue that might be worth checking is newer mainboard BIOS' have the option for "install drivers" or some such feature. This usually involves the BIOS UEFI firmware pulling an IP address and trying to stage drivers through to the OS. If your mainboard has such a feature I would suggest turning it off and see if this may also be using that ENO1 NIC for those attempts. This is a long shot but I'm just thinking out loud here trying to help narrow down possibilities.

It doesn't. I ran ip -a on the Proxmox machine and none of the MAC addresses match or come close. The only ones with d0:50:99 are the two eno interfaces, and they don't match the one in the log. The IPMI interface starts with 7e:89, so it's not the same vendor. It's really puzzling where this interface is coming from.
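A side note on comparing prefixes like this: the first three octets of a MAC are only a vendor OUI when the "locally administered" bit (0x02 in the first octet) is clear. A rough Python sketch of that check follows. The helper names are made up for illustration, and the addresses are just the prefixes quoted above padded with zeros, not real interface addresses.

```python
def is_locally_administered(mac: str) -> bool:
    """True if the locally administered (U/L) bit of the first octet is set."""
    first_octet = int(mac.split(":")[0], 16)
    return bool(first_octet & 0x02)


def oui(mac: str) -> str:
    """Return the first three octets (the OUI position), lower-cased."""
    return ":".join(mac.lower().split(":")[:3])


# Prefixes from the post, zero-padded as placeholders (assumption, not real MACs).
for mac in ("d0:50:99:00:00:00", "7e:89:00:00:00:00"):
    if is_locally_administered(mac):
        kind = "locally administered, prefix is not a vendor OUI"
    else:
        kind = "universally administered, OUI " + oui(mac)
    print(mac, "->", kind)
```

By this check a 7e:89 prefix is a locally administered address rather than a different vendor's OUI, so comparing vendors only makes sense for the d0:50:99 addresses.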

Quote from: opnfwb on June 06, 2024, 12:09:13 AM
I know this is apples/oranges but some newer HP server hardware allows the lights out functionality "move" around to different NICs, so I'm not sure if Asrock has a similar function or not?

I am not aware of such a feature, but I will check the BIOS for any NIC features that could be on and make sure they are off. Unfortunately I am not sure what you mean by lights out functionality; it's not a term I am familiar with.
#5
Quote from: opnfwb on June 05, 2024, 10:54:29 PM
How is proxmox configured for the OPNsense NICs? Are you using direct hardware passthrough or is there a virtual switch with virtual interfaces assigned to the OPNsense VM?

If it's using a virtual switch, is there another device on the OPNsense WAN side that is 'stealing' the DHCP address?

I have two physical ports on Proxmox, eno1 and eno2 (a third is IPMI, and I have not plugged anything into it currently). eno1 is wired to the ONT; eno2 is connected to a Cisco switch, where all my physical devices are connected. I have two virtual interfaces, vmbr0 and vmbr1: vmbr0 is bridged to eno1 (the ONT box), and vmbr1 is bridged to eno2 (the Cisco switch) and also serves the Proxmox interface.

The OPNsense VM uses vmbr0 and vmbr1. The vmbr0 interface is the WAN (since it's bridged to the ONT box) and vmbr1 is the LAN (since it's bridged to the Cisco switch interface).

I am attaching a screenshot for reference of the proxmox interfaces and the opnsense vm interfaces. (Ignore the fact it says 23.1, I was testing with 23.1 to see if it behaved differently than 24.1)

There is only one other VM and it is under the vmbr1 interface. Nothing else runs on the vmbr0 interface apart from opnsense.
#6
Quote from: apunkt on June 05, 2024, 09:04:08 PM
Is the MAC from your modem?
I have the same error on my box (with different MAC/IP) with MAC from my Starlink Router since changing to 24.x, however, this is not true. Starlink Router does hold the MAC but _not_ the IP!?
It is not affecting me in any case as far as I can see, though.

Thanks for the reply. I don't have any modem in the middle; my wire is Cat6 going straight from the Verizon ONT to the Proxmox box that runs the OPNsense VM. At this stage I am considering reinstalling Proxmox and everything else from scratch, since I don't know where this MAC address is coming from.
#7
Quote from: opnfwb on June 05, 2024, 05:30:15 AM
What do the logs say? This seems like a DHCP lease expiring on WAN and not renewing until the interface is force reloaded.

Is there anything in System/Log Files/General around the timeframe that the WAN link becomes unresponsive?

There is! I found it yesterday, but I am not sure how to handle it. I tried dealing with Verizon about it, but no luck.

2024-06-05T17:51:46 Notice kernel <3>arp: d0:50:99:f6:xx:xx is using my IP address xx.xx.127.59 on vtnet0!


The MAC address resolves to ASRock as the vendor, which is the brand of the motherboard I am using, but it does not belong to the physical NIC in Proxmox, nor is it the MAC address of the virtual NIC that's allocated to OPNsense. I am not sure where this MAC address came from or why it constantly takes my OPNsense IP address on renewals.
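For anyone who wants to script this comparison, here is a minimal Python sketch that pulls the MAC, IP and interface out of a kernel ARP conflict line like the one above and checks whether the MAC shares an OUI with a known address. The regex is written against this one log line only, and the function names and the zero-padded d0:50:99:fc address are assumptions for illustration.

```python
import re

# Matches the conflict message as it appears in the log line above.
# The character class allows "x" so that redacted octets still match.
ARP_CONFLICT = re.compile(
    r"arp: (?P<mac>[0-9a-fx:]{17}) is using my IP address "
    r"(?P<ip>\S+) on (?P<iface>\w+)!",
    re.IGNORECASE,
)


def parse_arp_conflict(line):
    """Return (mac, ip, iface) from an ARP conflict log line, or None."""
    match = ARP_CONFLICT.search(line)
    if match is None:
        return None
    return match.group("mac").lower(), match.group("ip"), match.group("iface")


def same_oui(mac_a, mac_b):
    """True if both MACs share the same first three octets."""
    return mac_a.lower().split(":")[:3] == mac_b.lower().split(":")[:3]


line = (
    "2024-06-05T17:51:46 Notice kernel <3>arp: d0:50:99:f6:xx:xx "
    "is using my IP address xx.xx.127.59 on vtnet0!"
)
mac, ip, iface = parse_arp_conflict(line)
# Placeholder for one of the host's own NICs (only the d0:50:99:fc prefix is known).
own_nic = "d0:50:99:fc:00:00"
print(mac, ip, iface, "same OUI as own NIC:", same_oui(mac, own_nic))
```

A matching OUI with a different fourth octet points at another interface on the same board rather than at something upstream.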
#8
Quote from: LOTRouter on June 03, 2024, 09:51:58 PM
Some devices (modems, etc.) have a feature in that they stop responding if they have not received an ARP request for a couple of minutes. The cache of BSD based routers (such as OPNSense) is longer than that.

Try adding net.link.ether.inet.max_age=120 to tunables, which forces the router to re-arp every two minutes and often solves this issue.

I changed it, it still dies in 2 hours. It dies in 2 hours on the dot apparently.
#9
Quote from: bestboy on June 03, 2024, 09:06:19 PM
Next time check in System > Routes > Status, if you have a default route set

I checked: the routes exist during the timeout and after the timeout, and there's no change. Shown below.
#10
I am currently running a memtest86 to rule out bad ram, and I will also be doing a new install of opnsense without proxmox to see if that is the culprit. In the meantime I am hoping someone has any ideas for things to try on my to do list.

Full disclosure: prior to this new system I had a Lenovo RS140 server that also ran Proxmox + OPNsense and did not experience any sudden disconnects on the WAN. I also used a small Ubiquiti EdgeRouter and experienced no disconnects with it either.

I am having an issue where OPNsense will randomly stop connecting out to the WAN. It gets the DHCP IP from Verizon and works, then after several hours it just stops working. The IP is still there, but I can't ping anything outside of the LAN network: the WAN gateway, 1.1.1.1, etc. None of it works. If I go to Overview, navigate to the WAN interface and click the reload button, it works again immediately, but then breaks again in a few hours.

I am running OPNsense 24.1, and everything is up to date. My specs are:

i9-9900k
128GB RAM (4GB allocated to OPNSense)
E3C246D4U2-2T Motherboard (BIOS and Firmware up to date)
2 x Intel X550-AT2 RJ45

LAN works fine, and I am not seeing any issues connecting there; it is solely the WAN that disconnects. I have tried replacing the cable, same result.

EDIT: Latest updates:

  • Memtest86 test passed after 24 hours.
  • Static Routes do exist during the timeout, before the timeout, after the timeout
  • Proxmox does not report any interface issues via dmesg | grep eno1 (the WAN interface)
  • I reloaded the WAN at 10:25 AM. At approx. 12:20 PM (~2 hours later), the WAN died again.
  • net.link.ether.inet.max_age did not make a difference, it still dies in 2 hours.
  • Clean install of 24.1 and clean install of 23.1 made no difference.
  • Logs show <3>arp: d0:50:99:f6:xx:xx is using my IP address <WANIP> on vtnet0! however the MAC Address does not belong to me as my MAC addresses start with d0:50:99:fc, not f6.

EDIT: I believe I found the solution for future reference, documented in the comment here - https://forum.opnsense.org/index.php?topic=40856.msg200606#msg200606
#11
21.7 Legacy Series / Re: Multiple Subnets per VLAN?
January 27, 2022, 08:05:55 AM
Thanks, I will give that a try and see if it works. I am currently absolutely stumped on where I was supposed to go to accomplish something like that.
#12
21.7 Legacy Series / Multiple Subnets per VLAN?
January 27, 2022, 07:34:01 AM
I know it is possible to make two different subnets part of a specific VLAN tag, but I am not sure how to configure this in OPNSense?

For instance suppose I would like 10.1.20.0/24 and 10.1.22.0/24 to be part of VLAN tag 20?
#13
Quote from: chemlud on January 20, 2022, 03:34:11 PM
Sounds like a crude hack :-D

I would have tried to import this section of config.xml from pfsense. No guarantee this works though...

Have you tried exporting config.xml on your opnsense and see if there is the appropriate section for DHCP?

I didn't. I thought there would be discrepancies between the configs, and I wasn't sure if the non-static DHCP entries would show up. I am still not sure about the former, but the latter is confirmed from my export. I didn't want clients to get issued new IP addresses from the DHCP pool, but I also don't want to make static entries for every single device at this stage.
#14
I didn't get any replies to this, but I am following up in case anyone has a similar question.

So I went ahead and copied the contents of /var/dhcpd/var/db/dhcpd.leases on pfSense into /var/dhcpd/var/db/dhcpd.leases on OPNsense. Both files have a server-duid value. I am not sure what the purpose of this value is, but I took the OPNsense server-duid value and pasted it into the copied pfSense dhcpd.leases in place of the original pfSense one.

After rebooting the server, all the DHCP leases from pfSense showed up in OPNsense, and we haven't had any issues so far. I confirmed all the DHCP devices retained their IPs after the router swap.
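If you would rather script the server-duid swap than edit the file by hand, something along these lines should work. This is only a sketch: it assumes the value sits on a single line of the form server-duid "...";, the function names are hypothetical, and the lease contents below are placeholders rather than data from either router.

```python
import re

# A single-line `server-duid "...";` statement (assumed layout of the leases file).
DUID_LINE = re.compile(r"^server-duid[ \t]+.*;[ \t]*$", re.MULTILINE)


def extract_server_duid(leases_text):
    """Return the server-duid line from a leases file, or None if absent."""
    match = DUID_LINE.search(leases_text)
    return match.group(0) if match else None


def swap_server_duid(source_leases, target_leases):
    """Return source_leases carrying the server-duid line of target_leases."""
    target_duid = extract_server_duid(target_leases)
    if target_duid is None:
        raise ValueError("target leases text has no server-duid line")
    if extract_server_duid(source_leases) is None:
        return target_duid + "\n" + source_leases
    # A function replacement keeps backslashes in the DUID from being read as escapes.
    return DUID_LINE.sub(lambda _m: target_duid, source_leases, count=1)


# Placeholder file contents for illustration only.
pfsense_leases = (
    'server-duid "PFSENSE-PLACEHOLDER";\n'
    "lease 10.0.0.5 {\n"
    "  hardware ethernet 00:00:5e:00:53:01;\n"
    "}\n"
)
opnsense_leases = 'server-duid "OPNSENSE-PLACEHOLDER";\n'

merged = swap_server_duid(pfsense_leases, opnsense_leases)
print(merged)
```

Keep a backup of the original OPNsense file before overwriting it with the merged text.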
#15
So I am in the process of trying to fix our existing network. One of the moves we are making is migrating from our several-year-old pfSense install, which won't update, to OPNsense. I have several things to fix post-move, but initially I just want to migrate between the two instances without causing any issues for people in-house (things like shared printers not working, etc.).

To achieve that I want to move pfSense's DHCP leases to our OPNsense instance. I did some research and it appears they are stored in /var/dhcpd/var/db/dhcpd.leases on pfSense. It appears the same file exists on OPNsense as well, with values.

Is there any issue with how OPNsense works if I copy the values from this file, paste them into the same location on OPNsense, and then restart? Would this work?