Messages - RedVortex

#1
Quote from: stuza on October 13, 2025, 08:59:35 AM
Hi All,

I'm trying to connect my dual-NIC OPNsense x64 Windows 11 box running 25.7 to my ISP using PPPoE. I know the user ID, password, and VLAN are correct as they work on my Asus BT10, but I keep failing to connect directly with OPNsense despite being told it's easy. I've tried with and without MAC cloning, although that's not required on my BT10 router.

I've tried the official directions, but they get vague towards the end and I can't get them to work. I've also tried https://forum.opnsense.org/index.php?topic=21207.0 and failed.

I've followed the instructions sent to me in DM on Reddit and still failed.

Any ideas, please?


You were close.

You create a VLAN device and assign it to the physical network interface (you did that), then you create a PPPoE device and attach it to the VLAN device (you also did that), and finally you assign the WAN interface to the PPPoE device, not to the network interface. Judging by your screenshots, I believe that last step is where you made a mistake.

The WAN interface should be assigned to the PPPoE device, not to the network interface or the VLAN; the PPPoE device takes care of those layers by itself. The chain looks like the sketch below.
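
As a minimal sketch of the assignment chain (the interface names and VLAN tag here are illustrative, not taken from your screenshots):

igb0                   physical NIC
  -> vlan0.101         VLAN device, tag 101, parent: igb0
       -> pppoe0       PPPoE device, link interface: vlan0.101
            -> WAN     interface assignment points at pppoe0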

Let me know whether that works. I also use PPPoE over a VLAN. In fact, I run a redundant (CARP) OPNsense setup: the primary does not need to tag the VLAN since the switch does the tagging, but the backup is connected to a trunk port and has to tag the VLAN itself, so I'm set up for both cases and it works well.
#2
Quote from: RedVortex on September 19, 2025, 11:17:24 PM
Quote from: franco on September 15, 2025, 08:30:28 AM
Can you open a ticket for that?
Sure thing, sorry I couldn't do it sooner, here it is: https://github.com/opnsense/core/issues/9227

Also, after re-reading your comment, I wasn't too sure (lol) whether it was about the issue I reported regarding CARP and auto-collect using the interfaces' IPs instead of the VIP, or the other thing I was talking about (DHCP option 121, which I cannot fix, versus the VIP one, which I can fix), so I created another issue for the VIP/auto-collect in case that is the one you wanted: https://github.com/opnsense/core/issues/9228
#3
Quote from: franco on September 15, 2025, 08:30:28 AM
Can you open a ticket for that?

Hey franco, I hope you're doing well, long time no chat.

Sure thing, sorry I couldn't do it sooner, here it is: https://github.com/opnsense/core/issues/9227

Let me know if you need anything more.

Thanks
#4
FYI, I also had to disable auto-collect and manually fill in the gateway, DNS and NTP for my 5 networks, since I'm set up in HA and it was using the interfaces' IPs instead of the CARP VIP.

Every time there was a failover, everything lost connectivity, because the gateway and DNS no longer worked: they pointed at the interface that was down or rebooting instead of at the CARP VIP.

I'm not really considering this a "bug" but more of a "heads up" if you use HA and CARP VIP.

The biggest problem I still have with Kea is that I can no longer send option 121, because the UI doesn't support it (classless static routes; not the static routes currently in the UI, which are option 33, a single IP mapped to a single gateway, something hardly any client supports and nobody uses). I had a thread on that issue a long time ago and it is a real problem for me: https://forum.opnsense.org/index.php?topic=39563.0
#5
Quote from: dwasifar on August 09, 2024, 04:39:45 PM
I have two networks defined in the UniFi controller, one for the main subnet and another for a VLAN subnet (to isolate IOT devices).

After the 24.7.1 upgrade, nothing on either wi-fi network can reach the internet.  Wired connections are fine.

I can't spare the network downtime to troubleshoot it right now, so I reverted to 24.7 and reloaded the same configuration, and everything works again.  If anyone has any thoughts, it'd be welcome for when I can look at it.

Not sure if it's related or not, but I had a similar issue that was caused by Unbound no longer being able to start. It was caused by my Google Home devices generating a temporary IPv6 network while OPNsense rebooted. Once OPNsense was back up, I saw a ULA IPv6 address assigned to OPNsense on my Google Home IoT interface (Interfaces / Overview). This happens even though the interface's IPv6 configuration is "None". It feels related to SLAAC, which seems impossible to disable.

For some reason, that prevented Unbound from starting (I bind Unbound to specific interfaces, not to ALL as is recommended). When that happens, there are a few things I can do (a command-line sketch of the first option follows the list):

- Manually remove the ULA IPv6 address from the affected interface on the command line (it usually doesn't come back once the Google Homes have internet access; I suppose they create it to talk to each other during an outage)
- Enable DHCPv6 on the interface, save/apply, then set IPv6 back to None and save/apply again (this makes the ULA go away and Unbound can then start)
- Remove the specific interface bindings from Unbound so it binds to everything; for some reason this lets Unbound start even with the stray address present
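
A minimal sketch of the first option (the interface name igb3 is illustrative; the address is the one from my box):

root@opnsense:~ # ifconfig igb3 | grep inet6
	inet6 fd9c:85da:835d:8696:92e2:baff:feb0:efeb prefixlen 64 detached autoconf pltime 1800 vltime 1800
root@opnsense:~ # ifconfig igb3 inet6 fd9c:85da:835d:8696:92e2:baff:feb0:efeb delete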

This is reproducible every time I reboot OPNsense and only happens on my Google Home interface (which is linked to UniFi access points that have a dedicated SSID for my Google Homes).

Next time you upgrade or reinstall, run ifconfig on the command line (or check Interfaces / Overview) to see whether an IPv6 address has appeared on an interface that shouldn't have one, and check whether Unbound is running. Even without DNS running, you should still have IP-level access to everything (the OPNsense UI, the command line, or pinging 8.8.8.8).

Like I said... it could be related to your issue or not, but this has been my situation for the last few updates and I thought I'd share it in case it helps.
#6
Quote from: franco on July 27, 2024, 10:06:27 AM
https://github.com/opnsense/core/commit/287c13beb

# opnsense-patch 287c13beb

That seems to have helped for Starlink; the problem remains with the Hurricane Electric GIF tunnel.

I patched and rebooted twice, and both times the dpinger for Starlink was up and monitoring. I removed the patch and rebooted a third time: I saw dpinger start on boot, then stop, and it did not restart by itself. I then started it manually and it stayed up.

I wonder if this could also be related to a problem I started having on the very latest 24.1 updates. After a reboot I see a weird IPv6 address (a ULA, fd9c:xxxxxxxx) assigned to the interface where I have my Google Homes, as if the interface obtained an IPv6 address from somewhere on its own. The interface's IPv6 configuration is "None", so I would never expect it to end up with an IPv6 address. I wonder whether something like SLAAC is now active at all times, even though IPv6 is not enabled on the interface.

I need to manually remove the ULA address from the interface, or set the interface to DHCPv6, save/apply, then set it back to None and save/apply, to get rid of the address. This situation also prevents Unbound from starting: it stays down until I get rid of the IPv6 address or remove the interface from Unbound's list of bound interfaces.

I know this sounds outside the scope of this thread, but since it affects IPv6 only and we're talking about weird SLAAC issues, I'd rather mention it as well in case it is related or it helps.

EDIT1: Added info to specify that the weird IPv6 address I'm getting on an interface is fd9c:xxx, a ULA; probably some device created its own IPv6 network and is broadcasting it. I'm not sure why OPNsense uses it, though, since I have IPv6 configured to None on the interface. But it definitely prevents Unbound from starting.

EDIT2: It really looks like a router advertisement coming from my Google Home devices (or maybe the UniFi APs) that OPNsense picks up via autoconf. Something must have changed recently (likely in the kernel) that now autoconfigures IPv6 even when it is disabled:
inet6 fd9c:85da:835d:8696:92e2:baff:feb0:efeb prefixlen 64 detached autoconf pltime 1800 vltime 1800
#7
I use two providers for IPv6:

Starlink (DHCPv6) and Hurricane Electric (GIF tunnel).

After a reboot, I have an IPv6 address on both interfaces, but gateway monitoring (dpinger) is not enabled on either of the IPv6 gateways, so both are marked as down. All IPv4 gateways are OK and dpinger is running for them; only the IPv6 ones are affected.

I can manually start dpinger on both gateways; they then get marked as up and dpinger keeps running.

If I reboot, the same situation happens again. It also happens if I go to the Starlink interface and click Save to refresh the IPs. It seems dpinger gets disabled while the interface flaps but never gets re-enabled automatically unless I start it manually; a quick check is sketched below.
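
A simple way to see which dpinger instances actually survived the flap (plain ps, nothing OPNsense-specific; OPNsense runs roughly one dpinger process per monitored gateway):

root@opnsense:~ # ps auxww | grep '[d]pinger'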

This is new behaviour since I upgraded to 24.7; it was working fine in 24.1.
#8
This is not a bug report or a problem, it's basically just an FYI...

Careful: I made the mistake of thinking that this 24.1.4 release note meant DHCP option 121, i.e. sending static routes to DHCP clients (I need that option before moving to Kea), but it does not:

o kea-dhcp: add domain-search, time-servers and static-routes client options to subnet configuration


It's DHCP option 33, which maps a single IP to a router IP. You cannot use it to route 192.168.30.0/24 to 192.168.31.1, for instance; you can only route a single IP to a router, like 192.168.30.12 to 192.168.31.1.

That is basic and can still serve a purpose for some people, but most of us use DHCP option 121 in ISC, which is quite different from the rarely used option 33: option 33 effectively forces /32 on your static routes, because you cannot specify a subnet/CIDR for the destination you pass.

In other words, what we likely want is support in OPNsense for this Kea feature (DHCP option 121), which encompasses option 33 and, per the RFC, overrides it when present. Option 121 also lets you do exactly what option 33 does, since you can specify /32 on a subnet if you want, but the important part is that it lets you route subnets, not only individual IP addresses, to a router; a worked encoding follows.
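
To make the difference concrete, here is the route 192.168.30.0/24 via 192.168.31.1 encoded as a single option 121 descriptor per RFC 3442 (my own worked example, not taken from any capture): one prefix-length octet, then only the significant octets of the destination, then the router:

18            prefix length 24 (0x18)
c0 a8 1e      significant octets of destination 192.168.30.0
c0 a8 1f 01   router 192.168.31.1

Option 33, by contrast, carries only fixed IP/router address pairs, with no prefix length at all.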

https://gitlab.isc.org/isc-projects/kea/-/merge_requests/2135/diffs
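
For reference, that merge request lets you express such a route in a subnet's option-data; a minimal sketch based on my reading of the Kea documentation (double-check the exact data format against your Kea version):

"option-data": [
    {
        "name": "classless-static-route",
        "data": "192.168.30.0/24 - 192.168.31.1"
    }
]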

Now, since this has been implemented in the option 33 format, it will likely create a breaking change when option 121 is implemented, unless the upgrade process converts what was specified as ip,router into ip/32,router, so that the same field can serve the better DHCP option 121 without breaking the Kea config of people who already started using it in the option 33 format. Unless someone decides to support both options in the UI and configs, in which case we would have two separate static-route fields in the Kea UI: one for single IPs (DHCP option 33) and another for networks (DHCP option 121).

I'm very glad that a lot of work is being done on Kea, and I'm almost at the point of being able to move away from ISC. Once option 121 is implemented in OPNsense and Kea also registers its DHCP lease hostnames in Unbound, I'll be good to migrate.

Thanks everyone!
#9
Problem is still present in 24.1.2

Bad state

No ALTQ support in kernel
ALTQ related functions disabled
all icmp 100.79.101.92:33064 -> 1.1.1.1:33064       0:0
   age 08:41:39, expires in 00:00:10, 30734:0 pkts, 891286:0 bytes, rule 104
   id: d928da6500000003 creatorid: d7e1a47d gateway: 192.168.100.1
   origif: igb0


Killing it

root@opnsense:~ # pfctl -k id -k d928da6500000003
killed 1 states


The state is now back to what it should be, and the gateway is now recovering

root@opnsense:~ # pfctl -ss -vvv | grep "1\.1\.1\.1" -A 3
No ALTQ support in kernel
ALTQ related functions disabled
all icmp 100.79.101.92:33064 -> 1.1.1.1:33064       0:0
   age 00:00:05, expires in 00:00:09, 5:5 pkts, 145:145 bytes, rule 104
   id: 7698db6500000002 creatorid: d7e1a47d gateway: 100.64.0.1
   origif: igb0
#10
Quote from: cprsn on February 22, 2024, 04:45:02 PM

It seems to me this is still an unresolved issue.  I have disabled ISC on all but one interface and migrated the rest to Kea.  For this to work, I found I had to stop ISC entirely, restart Kea, then restart ISC.  Otherwise, the Kea log reports "Address already in use - is another DHCP server running?" errors.  If I then have to reboot opnsense (e.g. after firmware updates), it seems ISC will start before Kea and I will not have DHCP servers active on any of the interfaces except the one that I still have on ISC (Kea will report "address already in use" for the other interfaces).

Is the intent for now to support running ISC on some interfaces and Kea on others, or are users expected to migrate all interfaces to Kea?

This matches my experience: it is impossible to run both. franco also confirmed this earlier. ISC grabs hold of all interfaces and prevents Kea from binding to them, as you saw.

In my case, Kea was missing too many features that I need before migrating fully (custom DHCP options for additional routes, and Unbound DNS registration, both of which I rely on heavily), so I would have liked to migrate the subnets I could right away and keep the others on ISC, but that isn't possible.

For now it's unfortunately all or nothing, not because of Kea, but because ISC binds to all IPs, as sockstat output shows (see the one-liner below), and from what I read on ISC's side, that seems to be by design.
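
You can check this on your own box with the same command whose full output I posted in another reply:

root@opnsense:~ # sockstat -4l -p 67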

Kea itself worked well in my case when I tested it, but it is unfortunately still missing too many things for me to migrate, yet.
#11
Quote from: axsdenied on February 14, 2024, 06:44:42 PM
I don't have Starlink so I don't have firsthand experience, but out of curiosity, when the Starlink network is up and everything is working with a Starlink network IP, can you still access the dish via the 192.168.100.x network?

Yes, the dish keeps this IP, but the DHCP address it hands you will no longer be in this range once it properly gets an SL IP.

OPNsense usually handles this properly, because SL still sends this IP range in the DHCP options (Classless-Static-Route, option 121), which lists other networks reachable through it; that includes this range (and some other public IPs too, I guess for their services hosted in AWS).

Here's a DHCP reply when the SL dish is connected to the SL network.

You can see the default gateway in the DHCP reply being 100.64.0.1 (which is the case when SL is up). The SL dish itself still uses 192.168.100.1, and in fact, when you use the SL app to manage the antenna, it connects to that IP.

13:08:23.480240 xx:xx:xx:xx:xx:xx > xx:xx:xx:xx:xx:xx, ethertype IPv4 (0x0800), length 350: (tos 0x0, ttl 64, id 49857, offset 0, flags [DF], proto UDP (17), length 336)
    100.64.0.1.67 > 100.79.101.92.68: [no cksum] BOOTP/DHCP, Reply, length 308, xid 0x12b7a4ac, Flags [none] (0x0000)
  Your-IP 100.79.101.92
  Server-IP 10.10.10.10
  Gateway-IP 192.168.100.100
  Client-Ethernet-Address xx:xx:xx:xx:xx:xx
  Vendor-rfc1048 Extensions
    Magic Cookie 0x63825363
    DHCP-Message Option 53, length 1: ACK
    Subnet-Mask Option 1, length 4: 255.192.0.0
    Server-ID Option 54, length 4: 100.64.0.1
    Default-Gateway Option 3, length 4: 100.64.0.1
    Lease-Time Option 51, length 4: 300
    Domain-Name-Server Option 6, length 8: 1.1.1.1,8.8.8.8
    Classless-Static-Route Option 121, length 23: (192.168.100.1/32:0.0.0.0),(34.120.255.244/32:0.0.0.0),(default:100.64.0.1)
    MTU Option 26, length 2: 1500
    END Option 255, length 0
    PAD Option 0, length 0


If I put the SL dish in stow mode (flipped down so it doesn't talk to satellites; the same happens when SL is down, in maintenance, whatever), the DHCP reply becomes the following. The gateway is .1 and it hands me .100 in the 192.168.100.0/24 range.

13:11:42.957591 xx:xx:xx:xx:xx:xx > xx:xx:xx:xx:xx:xx, ethertype IPv4 (0x0800), length 320: (tos 0x0, ttl 255, id 0, offset 0, flags [none], proto UDP (17), length 306)
    192.168.100.1.67 > 192.168.100.100.68: [no cksum] BOOTP/DHCP, Reply, length 278, xid 0xae69f181, Flags [none] (0x0000)
  Your-IP 192.168.100.100
  Client-Ethernet-Address xx:xx:xx:xx:xx:xx
  Vendor-rfc1048 Extensions
    Magic Cookie 0x63825363
    DHCP-Message Option 53, length 1: ACK
    Subnet-Mask Option 1, length 4: 255.255.255.0
    Server-ID Option 54, length 4: 192.168.100.1
    Default-Gateway Option 3, length 4: 192.168.100.1
    Lease-Time Option 51, length 4: 5
    Domain-Name-Server Option 6, length 4: 192.168.100.1
    MTU Option 26, length 2: 1500
    END Option 255, length 0


And now I have the same problem: the gateway is marked as down even though SL is back up.

It's weird, because for a few seconds when SL comes back up I see two states, one of which would be the right one, but that one ends up disappearing and the bad state remains

root@opnsense:~ # pfctl -ss -vvv | grep "1\.1\.1\.1" -A 3
No ALTQ support in kernel
ALTQ related functions disabled
all icmp 100.79.101.92:59388 -> 1.1.1.1:59388       0:0
   age 00:02:54, expires in 00:00:09, 171:0 pkts, 4959:0 bytes, rule 104
   id: 9512dc6500000002 creatorid: 5f0e2da3 gateway: 192.168.100.1
   origif: igb0
--
all icmp 100.79.101.92:63965 (192.168.22.14:14148) -> 1.1.1.1:63965       0:0
   age 00:00:11, expires in 00:00:00, 2:2 pkts, 168:168 bytes, rule 104
   id: e113dc6500000002 creatorid: 5f0e2da3 gateway: 100.64.0.1
   origif: igb0


And after a few seconds... The bad one remains and the gateway remains marked as down

root@opnsense:~ # pfctl -ss -vvv | grep "1\.1\.1\.1" -A 3
No ALTQ support in kernel
ALTQ related functions disabled
all icmp 100.79.101.92:59388 -> 1.1.1.1:59388       0:0
   age 00:05:58, expires in 00:00:09, 353:0 pkts, 10237:0 bytes, rule 104
   id: 9512dc6500000002 creatorid: 5f0e2da3 gateway: 192.168.100.1
   origif: igb0


And dpinger is configured to use the right gateway (100.64.0.1) but doesn't work, likely because of the bad state

root@opnsense:~ # pluginctl -r host_routes
{
    "core": {
        "8.8.8.8": null,
        "8.8.4.4": null
    },
    "dpinger": {
        "8.8.4.4": "10.50.45.70",
        "1.1.1.1": "100.64.0.1",
        "2001:4860:4860::8844": "fe80::200:xxxx:xxxx:xxx%igb0",
        "149.112.112.112": "192.168.2.1",
        "192.168.170.2": "192.168.170.2",
        "192.168.171.2": "192.168.171.2",
        "2620:fe::9": "2001:470:xx:x::x"
    }
}


While SL was down, dpinger updated itself to use the dish IP properly, so it seems dpinger is doing its job but something else around the states is not working well

Here's how it looks when SL is down

root@opnsense:~ # pluginctl -r host_routes
{
    "core": {
        "8.8.8.8": null,
        "8.8.4.4": null
    },
    "dpinger": {
        "8.8.4.4": "10.50.45.70",
        "1.1.1.1": "192.168.100.1",
        "2001:4860:4860::8844": "fe80::200:xxxx:xxxx:xxx%igb0",
        "149.112.112.112": "192.168.2.1",
        "192.168.170.2": "192.168.170.2",
        "192.168.171.2": "192.168.171.2",
        "2620:fe::9": "2001:470:xx:x::x"
    }
}
#12
Quote from: axsdenied on February 13, 2024, 04:37:16 PM
After your reply I re-read your post.  I actually block DHCP leases from 192.168.100.1 on the WAN interface so that the modem can't temporarily assign an address from that block to opnsense.

If you don't, you could also have issues like the ones you're describing, since it's technically a valid network config; it just can't route anywhere, and sometimes when the real network becomes available it doesn't swap.

I do block it on interfaces other than Starlink. The reason I keep it enabled on Starlink is to be able to access the dish when there is an issue: snow on the dish, a firmware going bad, or needing to reach the antenna while it is stowed. In all those cases the dish falls back to its 192.168.100.1 IP and that's the only way to access it. As soon as it comes back up, it re-issues an IP in the Starlink network. When that happens, I expect the state to be cleared and/or dpinger to be reloaded/restarted, which should also clear the state.

But yes, as a workaround I could block those, or even run a cron job that flushes the state whenever it finds it stuck on 192.168.100.1 or something (sketched below)... But in theory the DHCP, interface and gateway scripts should all handle this automatically. It was working fine in 23.x once it was fixed (it was buggy at some point in 22.x or early 23.x; I can't remember exactly when it started, but it was around the time the devs were reworking the scripts that handle gateways, interfaces, etc.).
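
A minimal sketch of such a cron job, built from the same pfctl commands I already use manually (my monitor IP 1.1.1.1 and the stale gateway 192.168.100.1 are hard-coded; adjust for your own setup):

#!/bin/sh
# Kill monitor states still pinned to the dish's fallback gateway.
pfctl -ss -vvv 2>/dev/null | grep "1\.1\.1\.1" -A 3 | \
    awk '/id:/ && /gateway: 192\.168\.100\.1/ {print $2}' | \
    while read id; do pfctl -k id -k "$id"; done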

Thanks for the idea, I may give it a try if no bugfix comes soon. I did not open a new ticket since this is a regression, but maybe I should...  :-\

Most of this was discussed, patched and so on in this other thread: https://forum.opnsense.org/index.php?topic=33831.0
#13
Quote from: axsdenied on February 06, 2024, 08:36:11 PM
I used to have this issue in the past as well but hasn't been a problem in a bit.  Currently still on 23.7.12_5 as I was waiting for a few patches before upgrading.  However, if this issue is now back in 24.x I'll be waiting a bit longer :)

Yeah... this is definitely a regression. Almost every day I need to reset the state or the gateway, like this, so the state goes back to the right gateway instead of staying stuck on the temporary IP/gateway Starlink uses when it reboots or updates itself. The temporary gateway it gets stuck on is 192.168.100.1; the gateway once the link is really up is 100.64.0.1.

Killing the state resets it properly.

It is likely something that happens (or doesn't happen, in this case) during the interface flap and/or the DHCP address issuance by Starlink to OPNsense, so the states never reset to the new gateway...

Bad state (My gateway monitoring is configured to ping 1.1.1.1 on Starlink)

root@opnsense:~ # pfctl -ss -vvv | grep "1\.1\.1\.1" -A 3
No ALTQ support in kernel
ALTQ related functions disabled
all icmp 100.79.101.92:28961 -> 1.1.1.1:28961       0:0
   age 08:33:39, expires in 00:00:10, 30306:0 pkts, 878874:0 bytes, rule 102
   id: ec7de16500000001 creatorid: 5f0e2da3 gateway: 192.168.100.1
   origif: igb0


Killing the bad state

root@opnsense:~ # pfctl -k id -k ec7de16500000001
killed 1 states


The right state after killing the bad one. Gateway is now marked as up.


root@opnsense:~ # pfctl -ss -vvv | grep "1\.1\.1\.1" -A 3
No ALTQ support in kernel
ALTQ related functions disabled
all icmp 100.79.101.92:28961 -> 1.1.1.1:28961       0:0
   age 00:00:02, expires in 00:00:10, 3:3 pkts, 87:87 bytes, rule 104
   id: 3564d96500000002 creatorid: 5f0e2da3 gateway: 100.64.0.1
   origif: igb0
#14
Same situation this morning

root@opnsense:~ # pluginctl -r host_routes
{
    "core": {
        "8.8.8.8": null,
        "8.8.4.4": null
    },
    "dpinger": {
        "8.8.4.4": "10.50.45.70",
        "1.1.1.1": "100.64.0.1",
        "2001:4860:4860::8844": "fe80::200:xxxx:xxxx:xxx%igb0",
        "149.112.112.112": "192.168.2.1",
        "192.168.170.2": "192.168.170.2",
        "192.168.171.2": "192.168.171.2",
        "2620:fe::9": "2001:470:xx:x::x"
    }
}


root@opnsense:~ # pfctl -ss -vvv | grep "1\.1\.1\.1" -A 3
No ALTQ support in kernel
ALTQ related functions disabled
all icmp 100.79.101.92:47956 -> 1.1.1.1:47956       0:0
   age 08:02:44, expires in 00:00:09, 28494:0 pkts, 826326:0 bytes, rule 102
   id: ba64cd6500000000 creatorid: 5f0e2da3 gateway: 192.168.100.1
   origif: igb0


After killing the state, dpinger now sees the gateway as up (I did not restart/reload dpinger, I only cleared the state above)

root@opnsense:~ # pfctl -k id -k ba64cd6500000000
killed 1 states

root@opnsense:~ # pfctl -ss -vvv | grep "1\.1\.1\.1" -A 3
No ALTQ support in kernel
ALTQ related functions disabled
all icmp 100.79.101.92:47956 -> 1.1.1.1:47956       0:0
   age 00:00:17, expires in 00:00:09, 17:17 pkts, 493:493 bytes, rule 104
   id: 7168ce6500000000 creatorid: 5f0e2da3 gateway: 100.64.0.1
   origif: igb0


<165>1 2024-02-06T01:36:14-05:00 opnsense dpinger 53072 - [meta sequenceId="75"] ALERT: STARLINK_DHCP (Addr: 1.1.1.1 Alarm: loss -> down RTT: 0.0 ms RTTd: 0.0 ms Loss: 100.0 %)
<12>1 2024-02-06T01:37:03-05:00 opnsense dpinger 4447 - [meta sequenceId="76"] exiting on signal 15
<12>1 2024-02-06T01:37:03-05:00 opnsense dpinger 47956 - [meta sequenceId="77"] send_interval 1000ms  loss_interval 4000ms  time_period 60000ms  report_interval 0ms  data_len 1  alert_interval 1000ms  latency_alarm 0ms  loss_alarm 0%  alarm_hold 10000ms  dest_addr 1.1.1.1  bind_addr 100.79.101.92  identifier "STARLINK_DHCP "
<165>1 2024-02-06T01:37:03-05:00 opnsense dpinger 53072 - [meta sequenceId="78"] Reloaded gateway watcher configuration on SIGHUP
<165>1 2024-02-06T01:37:21-05:00 opnsense dpinger 53072 - [meta sequenceId="79"] Reloaded gateway watcher configuration on SIGHUP
<12>1 2024-02-06T01:38:19-05:00 opnsense dpinger 35161 - [meta sequenceId="80"] send_interval 1000ms  loss_interval 4000ms  time_period 60000ms  report_interval 0ms  data_len 1  alert_interval 1000ms  latency_alarm 0ms  loss_alarm 0%  alarm_hold 10000ms  dest_addr 2001:4860:4860::8844  bind_addr 2605:59c8:2300:98f9:xxxx:xxxx:xxxx:xxxx  identifier "STARLINK_DHCP6 "
<165>1 2024-02-06T01:38:19-05:00 opnsense dpinger 53072 - [meta sequenceId="81"] Reloaded gateway watcher configuration on SIGHUP
<165>1 2024-02-06T01:38:20-05:00 opnsense dpinger 53072 - [meta sequenceId="82"] ALERT: STARLINK_DHCP6 (Addr: 2001:4860:4860::8844 Alarm: down -> none RTT: 51.3 ms RTTd: 3.9 ms Loss: 0.0 %)
<165>1 2024-02-06T09:41:27-05:00 opnsense dpinger 53072 - [meta sequenceId="1"] ALERT: STARLINK_DHCP (Addr: 1.1.1.1 Alarm: down -> loss RTT: 30.7 ms RTTd: 3.7 ms Loss: 75.0 %)
<165>1 2024-02-06T09:41:57-05:00 opnsense dpinger 53072 - [meta sequenceId="2"] ALERT: STARLINK_DHCP (Addr: 1.1.1.1 Alarm: loss -> none RTT: 30.6 ms RTTd: 5.4 ms Loss: 25.0 %)

#15
Quote from: newsense on February 02, 2024, 07:30:12 PM
Quote from: franco on February 01, 2024, 05:19:56 PM
Quote from: bimbar on February 01, 2024, 10:37:37 AM
DHCPd opens a raw interface on all network interfaces. I don't think it is possible (at least with ISC DHCPd) to use two different DHCP daemons on one host simultaneously.

Correct for ISC-DHCP.

As previously stated, ISC-DHCP and KEA can run in parallel on different interfaces. I've done the transition on production systems with no downtime - as follows:


1) Create Subnet and Reservations for VLAN X in Kea

2) Go to ISC DHCP and disable it on VLAN X -- leaving it running on the other VLANs

3) Go to Kea and enable VLAN X in Settings

4) Validate and continue with the next VLAN in scope where Kea can run without missing any ISC functionality


QED :)

Unfortunately this isn't true. You were simply lucky that your DHCP leases kept working while you transitioned.

Kea and ISC cannot coexist. ISC can only bind to *:67. While that is the case, either you're unable to start Kea at all (it will show as green but will not actually be running), or, if you do manage to start both (you need to start Kea first and then ISC), they will conflict, and you won't be able to reload/restart Kea after ISC has started anyway.

Here's what you'll get if you are able to run both at the same time

root@opnsense:~ # sockstat -4l -p 67
USER     COMMAND    PID   FD PROTO  LOCAL ADDRESS         FOREIGN ADDRESS
dhcpd    dhcpd      61078 13 udp4   *:67                  *:*
root     kea-dhcp4  964   14 udp4   192.168.22.1:67       *:*
root     kea-dhcp4  964   16 udp4   192.168.42.1:67       *:*
root     kea-dhcp4  964   18 udp4   192.168.62.1:67       *:*
root     kea-dhcp4  964   20 udp4   192.168.63.1:67       *:*


This will prevent both from working properly.

And if you look into your Kea logs, even though the process shows as green, it is in reality not working, and you'll see the following for each interface you are trying to serve with Kea, even if you disabled it in ISC first:

WARN [kea-dhcp4.dhcpsrv.0x833712000] DHCPSRV_OPEN_SOCKET_FAIL failed to open socket: Failed to open socket on interface ix1_vlan630, reason: failed to bind fallback socket to address 192.168.63.1, port 67, reason: Address already in use - is another DHCP server running?