Messages - Linwood

#1
Bingo.  I changed "re0" to "igb0" for the interfaces thinking it was safe, and bang in the middle of one of the certs was that string.

Sigh.  Mea culpa. Global search and replace was a bit too global.

Thank you thank you. 
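
For anyone doing a similar interface rename, here is a minimal sketch of a pre-check that flags where a short name like re0 also turns up inside long base64-looking runs, which is roughly how cert and key material tends to look in a config export. The helper name and the 40-character threshold are my own assumptions, not anything OPNsense provides.

```python
import re

# Assumption: cert/key material shows up as long unbroken base64 runs.
# 40 characters is an arbitrary threshold chosen for illustration.
BASE64_RUN = re.compile(r"[A-Za-z0-9+/=]{40,}")

def risky_matches(text, needle):
    """Return offsets where `needle` occurs inside a base64-looking run."""
    hits = []
    for run in BASE64_RUN.finditer(text):
        blob = run.group()
        pos = blob.find(needle)
        while pos != -1:
            hits.append(run.start() + pos)
            pos = blob.find(needle, pos + 1)
    return hits

if __name__ == "__main__":
    # Made-up sample: one real interface reference, one accidental hit in a blob.
    sample = "<if>re0</if>\n<prv>" + "A" * 30 + "re0" + "B" * 30 + "</prv>"
    print(risky_matches(sample, "re0"))
```

If this reports anything, a blanket search and replace is not safe and the rename should be limited to the lines that really are interface references.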
#2
Quote from: Monviech (Cedrik) on August 05, 2025, 09:42:12 PM
Upon each service reload,

Ah, I was stuck on reboot not service reload.  Let me investigate (it takes a bit as I can't have the two up at the same time without making a mess, so need to run back and forth and wait for reboots a lot).  More in a bit.

Sorry about that and thanks.
#3
Quote from: Monviech (Cedrik) on August 05, 2025, 06:59:00 AM
If it does not extract the certificate correctly I assume its faulty in the config.xml.

I think I explained the problem poorly.

I've replaced the cert correctly, but as soon as I start the service it becomes corrupted.  It's not the config import.  I literally edited the file to get the right cert, and as soon as I hit the start button (a) it doesn't start, disappearing without error in the log, and (b) the cert key file is corrupt.

Can I somehow run it manually with output to an SSH session the same way it is run when hitting the service start?  I feel like there's an error not being captured that might give me a clue.
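
To make that question concrete, this is the kind of thing I have in mind: a rough sketch that runs caddy attached to the terminal instead of through the service wrapper. The `validate` and `run` subcommands with `--config` and `--adapter` are standard caddy CLI, but the Caddyfile path and the assumption that `caddy` is on PATH are guesses on my part, and this is surely not exactly how the plugin launches it.

```python
import subprocess

# Assumptions: the `caddy` binary is on PATH and the generated Caddyfile
# lives at this path; adjust both for your install.
CADDYFILE = "/usr/local/etc/caddy/Caddyfile"

def caddy_cmd(action, config=CADDYFILE):
    """Build argv for `caddy validate` or `caddy run` against a Caddyfile."""
    if action not in ("validate", "run"):
        raise ValueError(f"unsupported action: {action}")
    return ["caddy", action, "--config", config, "--adapter", "caddyfile"]

def run_in_foreground(action, config=CADDYFILE):
    """Run caddy attached to this terminal so its stderr is visible."""
    # No output capture: whatever caddy prints goes straight to the session.
    return subprocess.run(caddy_cmd(action, config)).returncode
```

Calling `run_in_foreground("validate")` first and then `run_in_foreground("run")` from an SSH session should at least show anything printed before logging is set up. Since the generated config sends logs to a unix socket, later messages may still not appear on the terminal.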

I just have no idea what is different between this machine and the other two where it ran fine (a very old HP I'm retiring, and a new BeeLink which is going to be the primary FW). It's rather old, but was powerful for its time, and not unusual for its time.  While it is a "K" it is not overclocked, and has been completely stable for many years; I relegated it to spare only because that processor is not supported on Windows (no TPM, and something about the processor itself).
#4
I have a weird issue.  I'm running OPNSense 25.7.1.1_1 on two separate machines (hardware not virtual). One is a new, simple minded Beelink EQi12 and it works there.

I have an older system, a Z170-WS motherboard with I7-6700K.  It's a pretty powerful desktop and I was going to use it as a backup system. So I installed OPNSense, then all the plugins, then restored the configuration from the first system changing the interface names.

Everything is fine -- almost.  CADDY won't start.  It does not give an error as it fails to start, it just disappears. The log file looks identical to the other system.

A validation gives an error about missing TLS.  So I started looking around and in /var/db/caddy/data/caddy/certificates/temp is a key pair with a numeric name.  If I look at the key file on the working system it's a private key and looks right.

On the non-working system, about 2/3rds of the way down the file the ASCII characters turn into binary junk.  So ... I think "something corrupted it, I'll just put it back".
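
Rather than eyeballing where the junk starts, a small sketch like the following can report the exact offset of the first byte that does not belong in a PEM-style text file. The function names are made up for illustration.

```python
def first_non_text_byte(data: bytes):
    """Return the offset of the first byte that is not printable ASCII,
    tab, CR or LF; None if the whole buffer looks like plain text."""
    allowed = set(range(0x20, 0x7F)) | {0x09, 0x0A, 0x0D}
    for offset, byte in enumerate(data):
        if byte not in allowed:
            return offset
    return None

def check_file(path):
    """Return (offset_of_first_bad_byte_or_None, total_size) for a file."""
    with open(path, "rb") as fh:
        data = fh.read()
    return first_non_text_byte(data), len(data)
```

Comparing the offset against the file size would show whether the damage always starts at the same place, which is useful to know when the corruption is repeatable.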

So I edited the file, and put it back correctly, and started CADDY and ... it disappears and the file is corrupted again.

I have no idea why or how.  So I deleted the plugin and re-installed it, and replaced the config file and rebooted and -- corrupted again.

There is no sign of any disk related issues, it's ZFS and happy.   The binary crap appears identical each time it gets corrupted (at least to the eye).
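
"Identical to the eye" can be checked properly by hashing a saved copy of each corrupted file. A minimal sketch, with hypothetical helper names, using only the standard library:

```python
import hashlib

def sha256_of(path, chunk_size=65536):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def same_content(path_a, path_b):
    """True if both files have byte-for-byte identical content."""
    return sha256_of(path_a) == sha256_of(path_b)
```

If two corrupted copies hash the same, the damage is deterministic, which points away from flaky hardware and toward something rewriting the file the same way each time.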

What makes this a bit more confusing is that the primary system is itself a copy from yet a different firewall system I replaced, and that one is just fine.  So it's not the process of restoring the config file.  Indeed if I replace the file all I have to do is start caddy to corrupt it, nothing else running.

I am using caddy only for proxy, it is not tied to ACME (I run that separately and it doesn't run during these issues, plus this isn't related to that cert I think).

Here's the log (but this is also the log, including that confusing host-checking error, on the working system).

<14>1 2025-08-04T19:59:32-04:00 OPNsense.leferguson.com caddy - - [meta sequenceId="1"] "info","ts":"2025-08-04T23:59:32Z","logger":"admin","msg":"admin endpoint started","address":"unix//var/run/caddy/caddy.sock|0220","enforce_origin":false,"origins":[]}
<12>1 2025-08-04T19:59:32-04:00 OPNsense.leferguson.com caddy - - [meta sequenceId="2"] "warn","ts":"2025-08-04T23:59:32Z","logger":"admin","msg":"admin endpoint on open interface; host checking disabled","address":"unix//var/run/caddy/caddy.sock|0220"}


This is the config file (slightly munged to remove private stuff):

# DO NOT EDIT THIS FILE -- OPNsense auto-generated file


# caddy_user=root

# Global Options
{
        log {
                include http.log.access.0920f7c1-fc06-4693-b451-93f7ec3e50d1
                output net unixgram//var/run/caddy/log.sock {
                }
                format json {
                        time_format rfc3339
                }
        }

        http_port  xxxx
        https_port qqqq

        servers {
                protocols h1 h2 h3
        }

        auto_https off
        grace_period 10s
        import /usr/local/etc/caddy/caddy.d/*.global
}

# Reverse Proxy Configuration


xxx.xxxxxxxxxxxxx.com:qqqq {
        log 0920f7c1-fc06-4693-b451-93f7ec3e50d1
        tls /var/db/caddy/data/caddy/certificates/temp/63a4a8494b255.pem /var/db/caddy/data/caddy/certificates/temp/63a4a8494b255.key {
        }

        handle {
                reverse_proxy 192.168.131.210:80 {
                        transport http {
                        }
                }
        }
}

import /usr/local/etc/caddy/caddy.d/*.conf

There are no config files in the import folder.

Any idea what is going on?  Or where to look?

Linwood

PS. On the working system, the reverse proxy does actually work, it's not just that caddy runs.
#5
So I rebuilt this so that the interface that becomes LAN is an untagged (access) port, and trunked all VLAN's into another interface from which I build the other OPNsense VLAN's, in that case not using the native (untagged, PVID) VLAN on that trunked port.

That works.  It's not a good mirror for the physical OPNsense, but it works.

It seems like, on HyperV, something about the LAN interface (specific to the LAN) is not happy on a trunked interface at least with the LAN VLAN native.

Which is weird. Ubuntu works perfectly fine in that setup. And OPNSense (same version) works perfectly fine in that setup on physical hardware.

I remain confused. But I guess I can work around this by just using LAN as a separate access/untagged virtual interface.
#6
I'm trying an experiment.  I want to see if I can backup my physical OPNsense setup (still running 24.7.12) with a hyperv guest.

First step was just to get opnsense running there, and I'm completely stuck.

My LAN interface VLAN (1) is untagged.  The other VLAN's are tagged. 

I cannot get the LAN interface to respond to an arp (or of course thus a ping) when the HyperV VM is set to pass VLAN 1 as native as below. I set an Ubuntu VM up this way and it works fine, and an untagged ethernet interface works fine, so I don't think it is my VM setup.

VMName            : OPNsense
VMId              : ed48f3f0-d5e0-41d1-af84-f3df9a701a81
AdapterName       : Network Adapter
AdapterId         : Microsoft:ED48F3F0-D5E0-41D1-AF84-F3DF9A701A81\58C90972-3220-4045-89AC-D22295476BC7
OperationMode     : Trunk
NativeVlanId      : 1
AllowedVlanIdList : 1,131,134,136-137
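
As a sanity check on that trunk definition, here is a small sketch that expands an allowed-VLAN string like the one above and confirms the native VLAN and the VLANs I expect are in it. The function name is made up; only the list format is taken from the output above.

```python
def parse_vlan_list(spec):
    """Expand a string like '1,131,134,136-137' into a set of VLAN IDs."""
    vlans = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            low, high = (int(x) for x in part.split("-", 1))
            vlans.update(range(low, high + 1))
        else:
            vlans.add(int(part))
    return vlans

if __name__ == "__main__":
    allowed = parse_vlan_list("1,131,134,136-137")
    native = 1
    print("native VLAN allowed:", native in allowed)
    print("allowed VLANs:", sorted(allowed))
```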

If I set up other VLANs in OPNSense, I can ping from them to devices on those vlan's, so the tagged interfaces work.

If I set the LAN interface to be tagged though (even the same as one I just used) it won't work so long as it is the LAN interface.

When it boots it shows the LAN interface as hn0 and with the right V4 IP address.  Pinging from it appears to arp (I can see packets on the target system) but the response never appears as the "who has" arp is never answered.  It's just a switch in between, no routing is involved.  This is the same basic configuration I use with a separate, physical OPNSense configuration with real nic's, and same switch configuration feeding its ports (obviously HyperV is not in the middle).

I'm thinking there is something odd about the combination of untagged and "LAN" but I do not know what I'm looking for, and since I cannot get to the GUI at all I can't explore there. TCPDUMP from the shell command line never shows the ARP "who has" reaching the NIC, so my GUESS is it's being filtered, or the lack of a tag is confusing it.
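
To compare captures from both ends without reading them line by line, a sketch like this lists ARP targets that were asked for but never answered in a saved tcpdump text capture. It assumes the usual `tcpdump -n` one-line ARP format; the function name is hypothetical.

```python
import re

# Assumes tcpdump -n text output such as:
#   ... ARP, Request who-has 192.168.1.10 tell 192.168.1.1, length 28
#   ... ARP, Reply 192.168.1.10 is-at aa:bb:cc:dd:ee:ff, length 46
REQUEST = re.compile(r"ARP, Request who-has (\S+)(?: \(\S+\))? tell (\S+),")
REPLY = re.compile(r"ARP, Reply (\S+) is-at (\S+),")

def unanswered_requests(lines):
    """Return sorted target IPs that were requested but never replied to."""
    requested, answered = set(), set()
    for line in lines:
        match = REQUEST.search(line)
        if match:
            requested.add(match.group(1))
            continue
        match = REPLY.search(line)
        if match:
            answered.add(match.group(1))
    return sorted(requested - answered)
```

Running it over a capture taken on the OPNsense side and one taken on the target would show on which side the request or the reply goes missing.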

I don't know enough about opnsense/linux from the shell command to know where to go from here.

I've reinstalled 3 times.  I'm using the older version as that is what I have running physically, but I do not think that's relevant.

I CAN get it to work if I un-trunk the LAN interface, but while that might work it won't mirror my physical machine, and I hope to be able to transfer configs between.

Any thoughts?

Linwood

PS. I do HyperV configs all the time for windows and linux. It's always possible I missed something but I think the issue is in my OPNSense setup not HyperV.
#7
Thank you @meyergru

I have found one issue but still do not quite understand.  First things first (after a night of it just sitting):

/var/unbound/host_entries.conf looks correct, the bogus entries are not there, and a correct entry for a different name is there for 192.168.130.73.

Oddly /var/unbound/dhcpleases.conf is empty, size zero, but /var/dhcpd/var/db/dhcpd.leases is not, but also looks correct, and does not have either IP address in it from the bogus translations.

I grep'd every file in both folders without finding the bogus IP entry. If I grep by name I find only the unbound override (correct) entry.
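
For the record, the search was nothing fancy. A rough Python equivalent of that recursive grep, with a made-up function name, looks like this:

```python
import os

def find_in_tree(root, needle):
    """Yield (path, line_number, line) for every line containing `needle`.

    Plain substring match, so '10.0.0.2' also matches '10.0.0.25'.
    """
    needle_bytes = needle.encode()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                with open(path, "rb") as fh:
                    for lineno, line in enumerate(fh, 1):
                        if needle_bytes in line:
                            text = line.decode(errors="replace").rstrip()
                            yield path, lineno, text
            except OSError:
                # Unreadable files (sockets, permissions) are skipped.
                continue
```

Pointing it at /var/unbound and /var/dhcpd with the bogus IP as the needle is the same check as the grep described above.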

It just feels like there's some persistent cache I have not found. 

Continuing to hunt I tried resetting "Aggressive NSEC" (just because it had the word cache) but I forgot to apply, but I did restart unbound (first restart of today). Now the bogus entries did not resolve and the legitimate one did not either.   Note this was JUST the restart, I didn't save the NSEC change by mistake (I saw later).

At that point I found a typo -- the legitimate override had the domain name mis-spelled (missing letter).  I fixed that, and restarted unbound again, and now it works fine.

So... the reason my override was not visible is it was mis-spelled.

But the reason these cached entries continued to appear remains a mystery, as is why they disappeared after about 8 hours of just sitting there.

I did look back and the lease time in DHCP is 12 hours, so that's almost certainly part of this.  However, these leases had been manually deleted.  Further, the TTL on the DNS entries per DIG was coming up as under an hour.  It seems like whatever was doing this was taking a DHCP lease (despite being deleted) and forming a DNS entry with shorter TTL.

When the lease (that was deleted!) expired, AND a restart of unbound occurred, the bogus entry was gone.
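
If anyone wants to check lease expiry against this kind of ghost entry, here is a minimal sketch that pulls the IP, client hostname and end time out of an ISC dhcpd.leases file. It assumes the standard `lease <ip> { ... }` block layout and ignores edge cases such as a `}` inside a quoted uid string; the sample data in the comment is invented.

```python
import re

# Assumes the usual ISC dhcpd.leases layout, for example:
#   lease 192.168.130.53 {
#     ends 1 2024/05/20 12:00:00;
#     client-hostname "zwave4";
#   }
LEASE_BLOCK = re.compile(r"lease\s+(\S+)\s*\{(.*?)\}", re.S)
FIELD = re.compile(r"^\s*(ends|client-hostname)\s+(.*?);\s*$", re.M)

def parse_leases(text):
    """Return a list of (ip, hostname, ends) tuples, one per lease block."""
    leases = []
    for ip, body in LEASE_BLOCK.findall(text):
        fields = dict(FIELD.findall(body))
        hostname = fields.get("client-hostname", "").strip('"')
        leases.append((ip, hostname, fields.get("ends", "")))
    return leases

def leases_for_host(text, hostname):
    """Return only the leases whose client-hostname matches `hostname`."""
    return [lease for lease in parse_leases(text) if lease[1] == hostname]
```

The same IP can appear in several blocks because the file is an append-only log, so the last block for an address is the one that counts.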

What a mess of flakey cleanup.  There are days when I really hate GUI's, I suspect if this was just plain old text file configs I might have found this.

Anyway, I leave this trail of confusing breadcrumbs in case anyone else runs across something similar and it might help.

Thank you for your info, it's surprisingly hard to find where the config files are in google, 95% of what I find just points to the GUI's.
#8
I'm baffled.   OPNSense 24.7.12.

I put a new rPi on the network and it got an address of 192.168.130.53.  I then logged in and set its static address to 192.168.130.248.  Somehow maybe it also got 192.168.130.73 (which is a duplicate, active IP, so maybe it got it and gave it up).

Anyway... I then went into unbound and added an override for 192.168.130.248, and deleted the lease for .53 (there was none showing for .73 but there was a legit unbound override for it for a different name).

Unbound is still responding to queries for this name with .53 and .73 (both), and not the override of .248.

I have so far:

- Unbound is set to clear its cache on restart
- Restarted DHCP4 (ISC) several times
- Restarted unbound several times
- Rebooted OPNSense
- Deleted the override for 192.168.130.73 (which was legit and different name) and put it back
- Downloaded a backup configuration of OPNSense as XML, searched for the bogus IP and name (found the name with correct IP, nothing bogus)
- Waited much longer than the TTL to see if it would expire and vanish - it doesn't.
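
The search through the XML backup can be done a bit more precisely than a text search. This is a minimal sketch that walks the tree and reports the element path for every text node containing a given string. The function name is made up, and nothing here assumes a particular OPNsense config layout.

```python
import xml.etree.ElementTree as ET

def find_text(xml_text, needle):
    """Return (element_path, text) for every element whose text contains needle."""
    root = ET.fromstring(xml_text)
    hits = []

    def walk(elem, path):
        here = f"{path}/{elem.tag}"
        if elem.text and needle in elem.text:
            hits.append((here, elem.text.strip()))
        for child in elem:
            walk(child, here)

    walk(root, "")
    return hits
```

An empty result for the bogus IP, together with a single hit for the name, matches what I saw: the backup only knows the correct entry.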

If I run dig it shows like it has a real A record:

;; ANSWER SECTION:
zwave4.xxxxx.com.  2099    IN      A       192.168.130.53
zwave4.xxxxx.com.  2099    IN      A       192.168.130.73
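
To watch how the answers and TTLs change over time, a sketch like this can pull the A records out of saved dig output and flag any that are not the expected override address. The function names are hypothetical, and it only handles A records in the ANSWER SECTION.

```python
def parse_answer_section(dig_output):
    """Return (name, ttl, address) for each A record in dig's ANSWER SECTION."""
    records = []
    in_answer = False
    for line in dig_output.splitlines():
        if line.startswith(";; ANSWER SECTION:"):
            in_answer = True
            continue
        if in_answer:
            # The section ends at the first blank or comment line.
            if not line.strip() or line.startswith(";"):
                break
            fields = line.split()
            if len(fields) >= 5 and fields[3] == "A":
                records.append((fields[0], int(fields[1]), fields[4]))
    return records

def not_matching(records, expected_ip):
    """Return the addresses in `records` that differ from `expected_ip`."""
    return [addr for _name, _ttl, addr in records if addr != expected_ip]
```

Logging the TTL column every few minutes would show whether it counts down normally or jumps back up, which is the behavior described below.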

If I run nslookup with debug it shows a regular query and response with both (wrong) answers.  Nothing weird, just wrong.

In NSLOOKUP if I turn off recursion I get the same wrong answers, so it's not somehow recursing to elsewhere.

The TTL resets occasionally to 3600, notably sometimes when I query it, making me think that after restarting it's being recreated from something.

I turned on verbose debug logging in unbound and see the query and answer but nothing about where it comes from.  It's not forwarding (I've turned off forwarding and nothing changes).

It looks like there is a record somewhere that the GUI is not showing me, but I do not know where to look.  It also doesn't seem to be anyplace that is part of what is backed up by the system.

Any idea how to make it go away?

Linwood
#9
General Discussion / Netdisco LLDP discovery
May 25, 2024, 04:37:37 AM
This is kind of a longshot...

I use netdisco to keep track of layer 2 topology, and most of its data comes from lldp, which it accesses via SNMP downloading the results of lldp discovery of neighbors.

lldp is available for OPNsense (and I'm running it).  OPNsense sees its neighbors.

Netdisco does not see neighbors, or connected devices, and I think it is because the snmp implementation is not providing data on macs on ports, neighbors from lldp, etc.  Emphasis on "think", a brief look through snmpwalk didn't see that kind of data.
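
To make that "brief look" less hand-wavy, here is a minimal sketch that scans saved `snmpwalk -On` output for the two subtrees I would expect netdisco to care about. The OIDs are the standard LLDP-MIB root and the BRIDGE-MIB forwarding table; that netdisco relies on exactly these is my assumption, and the function name is made up.

```python
LLDP_MIB = ".1.0.8802.1.1.2"        # lldpMIB root (IEEE 802.1AB LLDP-MIB)
BRIDGE_FDB = ".1.3.6.1.2.1.17.4.3"  # dot1dTpFdbTable (BRIDGE-MIB)

def has_subtree(walk_lines, prefix):
    """True if any numeric OID in snmpwalk -On output falls under `prefix`."""
    prefix = prefix.rstrip(".") + "."
    for line in walk_lines:
        oid = line.split(" ", 1)[0]
        if not oid.startswith("."):
            oid = "." + oid
        if oid.startswith(prefix):
            return True
    return False
```

If both checks come back False on a full walk, the agent is simply not exposing that data, and the gap is on the SNMP side rather than in netdisco.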

Anyone been down this path - is this a restriction in the snmp implementation?  Some mibs not turned on (and can they be)?  Something in netdisco?

It's a minor nit -- OPNsense is not connected to much as it is on the edge. But I was hoping for consistency.

Linwood
#10
I screwed up.  Due to a move I'm going to be without internet for a week or so, and on a whim I stopped and picked up a Wifi USB adapter thinking I could just plug it in and connect to my phone's hotspot.

I got the above mentioned device, plugged it in and... doesn't work.

On searching it looks like FreeBSD will add support in 14, not in the current version.

But I also see this page:

https://man.freebsd.org/cgi/man.cgi?query=rtwn_usb&apropos=0&sektion=4&manpath=FreeBSD+12.0-RELEASE&arch=default&format=html

That seems to indicate since 12 it could be added.  I'm hesitant to screw around with kernel changes in my (working, happy, stable) OPNsense system, especially since it is the only place I use FreeBSD.

Did I just simply screw up and should throw this out, or is it safe to add drivers as that link indicates?

Linwood
#11
This seems like a simple question but I am baffled.

I am running unbound and dns and dhcp4, but not adguard.

I created a new esp32 device, it probably pulled a dhcp address at some point, but I gave it a static address of 192.168.130.53.

Now OPNsense as a DNS server is giving out a pair of addresses for its name:

Name:    frontdoorpanel.xxxxx.com
Addresses:  192.168.130.53
          192.168.130.72


Simple (I think), the DHCP lease is hanging around.  But it's not there, even with show expired (I am not certain it ever pulled a DHCP address, that's supposition, usually these things do and I just delete them).

I have looked in the Unbound overrides and only the .53 address is there.  I have looked in DHCP leases and DHCP reservations, it is not in either place. 

There are no unbound aliases defined.

The above is nslookup, so it's not some broadcast mDNS answer from a device.

I asked unbound to log queries and replies and it does so but just basic info, at least I do not see where the detail may be. But showing details in nslookup shows it responding with both.
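
The same check the monitoring box is effectively doing can be sketched in a few lines: resolve the name and list any addresses that are not the expected one. `socket.gethostbyname_ex` goes through the system resolver, so it only reflects what OPNsense returns if the machine running it uses OPNsense for DNS. The function names are made up.

```python
import socket

def resolve_a(name, resolver=socket.gethostbyname_ex):
    """Return the sorted IPv4 addresses the resolver gives for `name`."""
    _hostname, _aliases, addresses = resolver(name)
    return sorted(addresses)

def unexpected_addresses(name, expected, resolver=socket.gethostbyname_ex):
    """Return resolved addresses that are not in the `expected` collection."""
    return [ip for ip in resolve_a(name, resolver) if ip not in expected]
```

The `resolver` parameter is only there so the logic can be tried with canned answers instead of a live lookup.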

It's really acting like there is a unbound override in there (or a dhcp4 address), but ... it's not showing up.

I have restarted dhcp4, unbound, and rebooted the OPNsense firewall (in that order) and nothing changes.

I don't know if there is a text file where overrides are stored -- is there?   That I could search?

Where else could this value be?  What could I be missing?  This looks so simple.

Opnsense 23.7.8 on FreeBSD 13.2-Release-p5 on Intel.

Any advice?   It's not actually causing much of a problem, just a network monitoring device (zabbix) is complaining about the DNS vs IP mismatch since it expects only the .53.

Linwood
#12
Perfect, thank you.  I've been burned by ASA's when you change interfaces and suddenly lines of config disappear due to removing and re-adding an interface name (I realize this is different), so I was just a bit paranoid.
#13
I think this is a silly question but prefer not to make a mess.

I had a bunch of VLAN's, assigned to opt1, opt2, opt3, opt4, etc

I no longer use the one assigned to opt2.  I've disabled it which takes care of most important stuff, but I'd like to just remove it.

I don't know what happens when I do relative to the opt numbers, and if that matters for any configuration purpose.  Nothing currently refers to the interface name (e.g. this is "Automation"), but I have no idea how it all connects under the covers.

If I just delete it from the assignments page does everything else stay the same?

(On 23.7.5 on FreeBSD amd64)
#14
No one?

Does Unbound not restart for others?
#15
I'm kind of a novice at this but my OpenVPN configuration still works after upgrading from 22 to 23.