
Show posts

This section allows you to view all posts made by this member. Note that you can only see posts made in areas you currently have access to.


Topics - pmladenov

#1
Hello,

With almost a year of experience running OPNsense in what should be the simplest enterprise setup imaginable (a headquarters with 2 OPNsense boxes for HA and ~10 small remote branches with a single OPNsense box each, all single-homed to one service provider delivering an L2 Ethernet service to the HQ), I would like to share my negative feedback, mainly for folks who may consider it for similar future projects.

Overall I'm disappointed.
My complaints include, but are not limited to:

  • HA setup: forget about scaling out with active-active (let alone active-active-active) firewalls. Keep in mind that OPNsense is based on the FreeBSD implementations of PF and CARP, which differ slightly from OpenBSD's; OpenBSD has better HA features.
  • IPsec implementation: if your branch office loses connectivity for a day, don't expect IPsec to re-establish on its own if you rely only on the front-end UI and haven't modified the config files by hand.
  • Lack of an ECMP feature: yes, you can't have 2 routes (static or dynamic) for the same destination towards 2 different gateways, so you cannot load-share traffic.
  • Lack of any ability to fine-tune the PF firewall rules, unless you want to spend a week digging through PHP scripts.
  • Impossibility of adding an additional VLAN without bouncing the parent physical interface, even when it is a LACP bond: "well, the interruption is really short... and we're not adding VLANs every day, are we?"
  • Jumbo frames? Yes, supported, and it even works, but only if you enable it at the very beginning. If the jumbo-frame MTU requirement arrives a little late, after you have already deployed your VLAN subinterfaces, you'll need to start over with a fresh install.
  • A DHCP relay in a remote/branch office pointing to the DHCP server in the headquarters over a VTI IPsec interface? Don't even consider it: the buggy daemon doesn't like the ipsecXXX interface type and cannot even start.
  • Dynamic routing using FRR: yes, but don't expect to use the front-end UI unless you want your BGP daemon to restart every time you make a small modification such as adding a new BGP neighbor or editing a simple route-map. OSPFv2/v3 is no different. More advanced routing setups such as a BGP route reflector or route server: forget it; those may be implemented in the UI in 2122 if we're lucky (and no, they won't work as expected).
  • Multicast routing? PIM? IGMPv1/2/3? Forget it all. Moreover, the highly limited igmp-proxy doesn't like lagg interface types (and even VLANs in older versions), so it can't bind and start at all. Even if you somehow make it work, don't expect any functionality to replicate multicast state between 2 HA OPNsense devices. Then again, who needs HA?
  • VRF functionality, at least for management traffic: no, not supported by FreeBSD (although JunOS has been running on top of FreeBSD for ages). So be careful what you do with routing, PF, and the badly documented checkboxes in the UI, just to avoid locking yourself out.
  • Lack of documentation? Really? Try to understand something as simple as how the static-routing "gateway" concept works and you'll see what I mean. While you do, expect your statically configured default gateway to suddenly start pointing at a different interface or to be overwritten by a dynamic protocol.

I understand that some of the limitations listed above are not caused by OPNsense itself but by FreeBSD or third-party plugins.
I also understand that the product is mainly for home users bragging about having a "c00l f1r3w@ll @home", but it is far, far, far away from an enterprise-ready solution.
I would firmly say it isn't even being developed with a "more than 2 OPNsense boxes in a network" mindset.

On the positive side: if you are not expecting to change anything after the initial deployment, and you have all the requirements (and the caveats listed above) in hand in advance, it works and it is stable.

What else am I missing here? What is your experience?
#2
High availability / Active-Active HA tuning
March 30, 2021, 02:11:51 PM
Hello,

Currently I have an HA setup acting as active/active with 2 nodes, pfsync between them (unicast), and pure routing (BGP), without relying on CARP at the moment.

Although I have tested all possible kinds of session asymmetry (for instance TCP SYN via FW1, TCP SYN+ACK via FW2, and all other variations) and everything appears to work well in the lab, that's not the case outside the testing environment.
With real traffic (a low number of sessions, < 1000) I'm getting complaints about TCP retransmits, which seem to happen when the flows are asymmetric.
I suspect it is related to some kind of pfsync timer (preventing timely synchronization between both firewall nodes).

I've read pfsync(4) and ifconfig(8) for both FreeBSD and OpenBSD several times, but I can't fully understand the concept of:

1) The pfsync defer option, described in the OpenBSD pfsync man page but absent from the FreeBSD one:

Quote: Where more than one firewall might actively handle packets, e.g. with certain ospfd, bgpd or carp(4) configurations, it is beneficial to defer transmission of the initial packet of a connection. The pfsync state insert message is sent immediately; the packet is queued until either this message is acknowledged by another system, or a timeout has expired. This behaviour is enabled with the defer parameter to ifconfig.

So, in simple words: what happens after FW1 receives a TCP SYN segment that is allowed by the PF rulebase (and we expect the SYN+ACK segment to return via FW2), with and without the defer option enabled?
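For reference, here is how I understand the knob would be toggled. This is a sketch assuming FreeBSD's ifconfig accepts the same parameter as OpenBSD's (my test in P.S.2 below suggests it does), and that pfsync0 already has a syncdev configured:

```
# defer the connection's initial packet until the pfsync
# state-insert message is acknowledged by the peer (or times out)
ifconfig pfsync0 defer

# switch the deferral off again
ifconfig pfsync0 -defer

# verify: "defer" should appear in the pfsync options line
ifconfig pfsync0
```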

2) pfsync maxupd option, by default set to 128. 

Quote: The pfsync interface will attempt to collapse multiple state updates into a single packet where possible. The maximum number of times a single state can be updated before a pfsync packet will be sent out is controlled by the maxupd parameter to ifconfig (see ifconfig and the example below for more details). The sending out of a pfsync packet will be delayed by a maximum of one second.

Does it make sense to decrease that parameter, to avoid waiting for up to one second before pfsync packets are sent to the peer?
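If the collapse delay is indeed the culprit, the parameter could be lowered the same way; a sketch, with the value chosen purely for illustration:

```
# send a pfsync update after every state change instead of
# collapsing up to 128 updates into one packet
ifconfig pfsync0 maxupd 1
```

The trade-off would be more pfsync traffic on the sync link in exchange for faster state propagation.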

3) net.pfsync.pfsync_buckets

Quote: The number of pfsync buckets. This affects the performance and memory tradeoff. Defaults to twice the number of CPUs. Change only if benchmarks show this helps on your workload.

Any idea what and how I should monitor in order to set this properly?
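A minimal starting point, assuming the tunable exists on your build (on FreeBSD it appears to be a boot-time tunable, so changing it would go through loader.conf):

```
# read the current value (defaults to twice the number of CPUs)
sysctl net.pfsync.pfsync_buckets

# to change it, set it in /boot/loader.conf and reboot, e.g.:
# net.pfsync.pfsync_buckets="16"
```

As the man page says, only benchmarks on your own workload justify changing it; the pfsync counters in netstat -s would be the obvious thing to watch.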


P.S.1: I just went back to the pcap files. For almost all TCP sessions (with few exceptions), the segment with the SYN flag was retransmitted 1 second after the first SYN was sent.
So we have:

(1) Client ----------SYN ---------> FW1 ------------------> Server
                                                    |
                                                 pfsync
                                                    |
(2) Client <------------------------- FW2 <-- SYN+ACK----- Server

It seems that FW2 in (2) is denying the SYN+ACK sent from the Server in response to the Client, probably because it hasn't yet seen the SYN-SENT state from FW1.

P.S.2: Confirmed. The returned SYN+ACK segment (from Server to Client) is dropped by FW2; it arrives just before the state is replicated from FW1 to FW2. I tried with and without the defer option on the pfsync0 interface on both FWs and see no change in behavior. Perhaps the queuing of the initial packet is not working?
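To confirm the race on the wire, it may be worth capturing pfsync itself next to the handshake; a sketch, assuming em2 is the pfsync syncdev (adjust the interface names to your setup):

```
# pfsync is IP protocol 240; capture it on the sync link
tcpdump -ni em2 ip proto 240

# meanwhile, watch the TCP handshake on the forwarding side
tcpdump -ni em0 'tcp[tcpflags] & (tcp-syn|tcp-ack) != 0'
```

Comparing the timestamp of the state-insert message against the arrival of the SYN+ACK on FW2 should show whether the state really loses the race.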
#3
Hello,


I have an IPsec routed-mode tunnel between 2 OPNsense FWs, opnFW1 and opnFW2, running:

OPNsense 20.7.5-amd64
FreeBSD 12.1-RELEASE-p10-HBSD
OpenSSL 1.1.1h 22 Sep 2020

After approximately a week of uptime, without any configuration changes on either end, I'm getting the following errors in opnFW1's /var/log/ipsec.log, and of course IPsec is not working...

Mar 22 21:43:47 opnFW1 charon[16547]: 11[KNL] creating acquire job for policy 192.168.1.10/32 === 192.168.1.1/32 with reqid {1000}
Mar 22 21:43:47 opnFW1 charon[16547]: 07[CFG] trap not found, unable to acquire reqid 1000
Mar 22 21:44:19 opnFW1 charon[16547]: 07[KNL] creating acquire job for policy 192.168.1.10/32 === 192.168.1.1/32 with reqid {1000}
Mar 22 21:44:19 opnFW1 charon[16547]: 11[CFG] trap not found, unable to acquire reqid 1000


The ipsec logical interface on opnFW1 is ipsec1000:
root@opnFW1:~ # ifconfig ipsec1000
ipsec1000: flags=8051<UP,POINTOPOINT,RUNNING,MULTICAST> metric 0 mtu 1400
        tunnel inet 192.168.1.10 --> 192.168.1.1
        inet6 fe80::1a5a:58ff:fe10:13a0%ipsec1000 prefixlen 64 scopeid 0x13
        inet 172.16.1.10 --> 172.16.1.1 netmask 0xffffffff
        groups: ipsec
        reqid: 1000
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>



From opnFW1 I can successfully ping opnFW2's "underlay" IP address, 192.168.1.1; however, I can't ping the "overlay" IP, 172.16.1.1:

root@opnFW1:~ # ping -c 2 192.168.1.1
PING 192.168.1.1 (192.168.1.1): 56 data bytes
64 bytes from 192.168.1.1: icmp_seq=0 ttl=64 time=7.266 ms
64 bytes from 192.168.1.1: icmp_seq=1 ttl=64 time=3.638 ms

--- 192.168.1.1 ping statistics ---
2 packets transmitted, 2 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 3.638/5.452/7.266/1.814 ms
root@opnFW1:~ # ping -c 2 172.16.1.1
PING 172.16.1.1 (172.16.1.1): 56 data bytes

--- 172.16.1.1 ping statistics ---
2 packets transmitted, 0 packets received, 100.0% packet loss


The ipsec configuration on opnFW1 is:
root@opnFW1:/usr/local/etc # cat ipsec.conf
# This file is automatically generated. Do not edit
config setup
  uniqueids = yes

conn con1
  aggressive = no
  fragmentation = yes
  keyexchange = ikev2
  mobike = yes
  reauth = yes
  rekey = yes
  forceencaps = no
  installpolicy = no

  dpdaction = restart
  dpddelay = 10s
  dpdtimeout = 60s

  left = 192.168.1.10
  right = 192.168.1.1

  leftid = 192.168.1.10
  ikelifetime = 28800s
  lifetime = 3600s
  ike = aes256gcm16-sha512-ecp512bp!
  leftauth = psk
  rightauth = psk
  rightid = 192.168.1.1
  reqid = 1000
  rightsubnet = 0.0.0.0/0
  leftsubnet = 0.0.0.0/0
  esp = aes256gcm16-sha512-ecp512bp!
  auto = start


Based on the configuration above, that IPsec tunnel should rely on DPD.
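When it breaks again, it may be worth comparing charon's view with the kernel SPD before restarting anything; a sketch using the legacy stroke CLI that 20.7 ships:

```
# charon's view of the connection, its SAs and (trap) policies
ipsec statusall con1

# the kernel security policy database, for comparison
setkey -DP

# try re-initiating just this connection instead of restarting charon
ipsec up con1
```

If ipsec up con1 alone recovers the tunnel, that would narrow the problem down to charon losing the trap policy rather than the whole daemon wedging.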

On the other side, opnFW2, the logs I'm getting are:

root@opnFW2:/var/log # clog ipsec.log | grep 192.168.1.
Mar 22 13:38:37 opnFW2 charon[41296]: 05[KNL] creating acquire job for policy 192.168.1.1/32 === 192.168.1.10/32 with reqid {9000}
Mar 22 13:39:09 opnFW2 charon[41296]: 02[KNL] creating acquire job for policy 192.168.1.1/32 === 192.168.1.10/32 with reqid {9000}
Mar 22 13:41:20 opnFW2 charon[41296]: 07[KNL] creating acquire job for policy 192.168.1.1/32 === 192.168.1.10/32 with reqid {9000}
Mar 22 13:41:52 opnFW2 charon[41296]: 14[KNL] creating acquire job for policy 192.168.1.1/32 === 192.168.1.10/32 with reqid {9000}
Mar 22 13:42:25 opnFW2 charon[41296]: 15[KNL] creating acquire job for policy 192.168.1.1/32 === 192.168.1.10/32 with reqid {9000}


IPsec logical interface on opnFW2 is ipsec9000:

root@opnFW2:/var/log # ifconfig ipsec9000
ipsec9000: flags=8051<UP,POINTOPOINT,RUNNING,MULTICAST> metric 0 mtu 1400
        tunnel inet 192.168.1.1 --> 192.168.1.10
        inet6 fe80::1e72:1dff:feb6:c703%ipsec9000 prefixlen 64 scopeid 0x25
        inet 172.16.1.1 --> 172.16.1.10 netmask 0xffffffff
        groups: ipsec
        reqid: 9000
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>

The ping tests are identical: from opnFW2 I can ping 192.168.1.10 and cannot ping 172.16.1.10

root@opnFW2:/var/log # ping 192.168.1.10
PING 192.168.1.10 (192.168.1.10): 56 data bytes
64 bytes from 192.168.1.10: icmp_seq=0 ttl=64 time=7.893 ms
64 bytes from 192.168.1.10: icmp_seq=1 ttl=64 time=7.310 ms
64 bytes from 192.168.1.10: icmp_seq=2 ttl=64 time=7.990 ms
^C
--- 192.168.1.10 ping statistics ---
3 packets transmitted, 3 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 7.310/7.731/7.990/0.300 ms
root@opnFW2:/var/log # ping -c 2 172.16.1.10
PING 172.16.1.10 (172.16.1.10): 56 data bytes

--- 172.16.1.10 ping statistics ---
2 packets transmitted, 0 packets received, 100.0% packet loss
root@opnFW2:/var/log #


IPSec config on opnFW2 related to that tunnel is:

cat /usr/local/etc/ipsec.conf

config setup
  uniqueids = yes

conn con9
  aggressive = no
  fragmentation = yes
  keyexchange = ikev2
  mobike = yes
  reauth = yes
  rekey = yes
  forceencaps = no
  installpolicy = no

  dpdaction = restart
  dpddelay = 10s
  dpdtimeout = 60s

  left = 192.168.1.1
  right = 192.168.1.10

  leftid = 192.168.1.1
  ikelifetime = 28800s
  lifetime = 3600s
  ike = aes256gcm16-sha512-ecp512bp!
  leftauth = psk
  rightauth = psk
  rightid = 192.168.1.10
  reqid = 9000
  rightsubnet = 0.0.0.0/0
  leftsubnet = 0.0.0.0/0
  esp = aes256gcm16-sha512-ecp512bp!
  auto = start



And the funniest thing is that if I restart the strongswan service (/usr/local/etc/rc.d/strongswan onerestart) on opnFW1 (the one logging "unable to acquire reqid"), the issue disappears and everything starts working again... until the next time it stops.

Any ideas or comments are highly appreciated!
I intentionally haven't restored the connectivity this time, so I can provide any additional outputs/logs if required.

Regards,
Plamen
#4
That was supposed to be simple, but I still can't get it to work...

I have a very basic setup:

Site1 LAN <-> OPNsense-FW1 <-- VTI ipsec1000 --> OPNsense-FW2 <-> Site2 LAN

OPNsense-FW1 has a route to the Site2 LAN via OPNsense-FW2, dev ipsec1000
OPNsense-FW2 has a route to the Site1 LAN via OPNsense-FW1, dev ipsec1000

Hosts in Site1 LAN are able to communicate with hosts in Site2 LAN.

All I would like to accomplish is for locally originated traffic from OPNsense-FW1 destined to the Site2 LAN subnet to use its Site1 LAN IP address as the source, instead of the IP address of the ipsec1000 interface.
I assume this is some kind of source NAT with the following logic:

SRC_IP=ipsec1000_IP, DST_IP=Site2 LAN
SRC_NAT_IP=Site1 LAN_IP,
outgoing interface ipsec1000

I tried the above with a couple of variations and none of them worked.
What am I missing here?
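For what it's worth, expressed directly in PF the translation I'm describing would look something like the sketch below (the addresses are placeholders; on OPNsense this would normally be an outbound NAT rule on the ipsec1000 interface, not a hand-edited pf.conf):

```
# placeholder addresses: FW1's Site1 LAN IP = 10.1.0.1,
# Site2 LAN = 10.2.0.0/24
nat on ipsec1000 inet from (ipsec1000) to 10.2.0.0/24 -> 10.1.0.1
```

i.e. traffic that would otherwise leave sourced from the ipsec1000 address gets rewritten to the Site1 LAN address.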

Regards,
Plamen
#5
21.1 Legacy Series / OPNsense ECMP routing
March 09, 2021, 09:56:04 PM
Hello!

I'm wondering whether the latest version (21.1.X) of OPNsense supports ECMP (equal-cost multipath) routing with FRR.

With my lab devices (20.7.5) I tried to simulate it using OSPF and was able to see two paths in FRR (vtysh -> sh ip ro ospf); however, netstat -rnl4 showed a completely different story: only one of the paths is actually installed into the BSD routing table.
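The mismatch shows up when comparing FRR's RIB against the kernel FIB, along these lines:

```
# FRR's view: both OSPF paths are present
vtysh -c "show ip route ospf"

# the kernel's view: only one next hop actually installed
netstat -rn -f inet
```

As far as I can tell, FreeBSD only gained usable kernel multipath with the ROUTE_MPATH work in FreeBSD 13, so a 12.x-based OPNsense presumably cannot install both next hops regardless of what FRR computes; corrections welcome.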


I've found this thread from almost 2 years ago, https://forum.opnsense.org/index.php?topic=12815.0, which was not at all encouraging even back then...


Regards,
Plamen
#6
I have a very basic setup:

host1 <-> (em0) opnsense1 (em1) <----- ipsec routed mode -----> (em1) opnsense2 (em0) <-> host2

host1 is able to ping host2 with small packets but not with 1500-byte packets.
The MTU is at its default everywhere (1500 bytes; the ipsec interface has an MTU of 1400 by default).
When I disable "interface scrub" (Firewall -> Settings -> Normalization) on the opnsense1 firewall (ONLY!), everything starts working. The strange thing is that I don't touch that setting on the opnsense2 firewall (scrub remains enabled there, as per the default).

When host1 sends a 1500-byte packet, it is received by opnsense1, fragmented into 2 packets (1400 bytes and 100 bytes) and sent over the ipsec interface. It is received on opnsense2, reassembled, and forwarded to host2 as a 1500-byte packet.
The problem is in the opposite direction: the ICMP echo reply from host2 is received by FW2, fragmented, and sent via the ipsec interface to FW1. FW1 receives both fragments on the ipsec interface, combines them into a single 1500-byte packet and sends it to host1. And here is the problem:

According to tcpdump on the em0 interface of FW1, that 1500-byte ICMP packet has a WRONG checksum (and in the end host1 does not receive the replies from host2).
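For anyone wanting to reproduce this by hand, the traffic and the checksum check can be generated like the sketch below (FreeBSD ping flags; host2 stands in for the real address):

```
# 1472 bytes of payload + 8 ICMP + 20 IP = exactly 1500 bytes;
# -D sets the Don't Fragment bit (FreeBSD ping)
ping -D -s 1472 host2

# without DF, forcing fragmentation over the 1400-byte ipsec path
ping -s 1472 host2

# -v makes tcpdump verify IP checksums and print "bad cksum"
tcpdump -v -ni em0 icmp
```

Running the tcpdump on each hop should make it visible exactly where the reassembled reply picks up the bad checksum.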

I've spent 2 full days troubleshooting this, simplifying the setup down to the one above. Both opnsense1 and opnsense2 were initially 20.7.5. I upgraded opnsense1 to the latest 20.7.X (no luck), then completely reinstalled the opnsense1 VM from scratch without importing configs, which didn't help at all; after that I deleted it and installed the 21.1 image, but that didn't help either...
I think I'm missing something. It's really strange that in the other direction (host1 -> host2) everything works (no bad checksums for reassembled packets), with no need to disable PF scrub on the opnsense2 FW.

Any idea what I'm missing? I'm completely out of ideas.
Is there anything else I can try in order to troubleshoot this problem?

If I have to disable scrub entirely, what will happen with TCP MSS? I don't think it will be negotiated down to 1360 bytes, which will definitely break many apps.




#7
Hello,

I'm running OpnSense 20.7.5

I've configured Site-to-Site IPSec tunnel with IKEv2 and DPD with 2 seconds interval, 5 retries and action=restart tunnel.

My ipsec.conf:

root@OPNsense:/tmp # cat /usr/local/etc/ipsec.conf
# This file is automatically generated. Do not edit
config setup
  uniqueids = yes

conn pass
  right=127.0.0.1 # so this connection does not get used for other purposes
  leftsubnet=10.30.0.0/16
  rightsubnet=10.30.0.0/16
  type=passthrough
  auto=route

conn con1
  aggressive = no
  fragmentation = yes
  keyexchange = ikev2
  mobike = yes
  reauth = yes
  rekey = yes
  forceencaps = no
  installpolicy = yes
  type = tunnel
  dpdaction = restart
  dpddelay = 2s
  dpdtimeout = 12s



Based on
https://wiki.strongswan.org/projects/strongswan/wiki/connsection

Quote: dpdaction = none | clear | hold | restart

controls the use of the Dead Peer Detection protocol (DPD, RFC 3706) where R_U_THERE notification messages
(IKEv1) or empty INFORMATIONAL messages (IKEv2) are periodically sent in order to check the liveliness of the
IPsec peer. The values clear, hold, and restart all activate DPD and determine the action to perform on a timeout.
With clear the connection is closed with no further actions taken. hold installs a trap policy, which will catch
matching traffic and tries to re-negotiate the connection on demand. restart will immediately trigger an attempt
to re-negotiate the connection. The default is none which disables the active sending of DPD messages.

dpddelay = 30s | <time>

defines the period time interval with which R_U_THERE messages/INFORMATIONAL exchanges are sent to the peer.
These are only sent if no other traffic is received. In IKEv2, a value of 0 sends no additional INFORMATIONAL
messages and uses only standard messages (such as those to rekey) to detect dead peers.

dpdtimeout = 150s | <time>

defines the timeout interval, after which all connections to a peer are deleted in case of inactivity.
This only applies to IKEv1, in IKEv2 the default retransmission timeout applies, as every exchange is used to
detect dead peers.

And from https://wiki.strongswan.org/projects/strongswan/wiki/Retransmission

Quote
retransmit_tries    Integer    5    Number of retransmissions to send before giving up
retransmit_timeout    Double    4.0    Timeout in seconds
retransmit_base    Double    1.8    Base of exponential backoff

Using the default values, packets are retransmitted as follows:
Retransmission    Formula    Relative timeout    Absolute timeout
1    4 * 1.8 ^ 0    4s    4s
2    4 * 1.8 ^ 1    7s    11s
3    4 * 1.8 ^ 2    13s    24s
4    4 * 1.8 ^ 3    23s    47s
5    4 * 1.8 ^ 4    42s    89s
giving up    4 * 1.8 ^ 5    76s    165s
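The table above can be reproduced numerically, which makes it easy to evaluate alternative retransmit values before committing to them; a small awk sketch using the documented defaults:

```shell
# relative timeout = timeout * base^k, absolute = running sum;
# defaults: 5 tries, timeout 4.0, base 1.8 (matches the table above)
awk 'BEGIN {
    timeout = 4.0; base = 1.8; tries = 5; total = 0
    for (k = 0; k <= tries; k++) {
        rel = timeout * base ^ k
        total += rel
        printf "%d  %.0fs  %.0fs\n", k + 1, rel, total
    }
}'
```

The last line of output corresponds to the "giving up" row: roughly 165 seconds in total.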

Apparently that part of the ipsec.conf configuration is not honored for IKEv2, and that's why it takes so long to reset the tunnel (in my case ~90+ seconds).

Is there an easy way to fix this?
As stated in the comment at the top of /usr/local/etc/ipsec.conf:
root@OPNsense:/tmp # cat /usr/local/etc/ipsec.conf
# This file is automatically generated. Do not edit

so where should I make the modification (considering I'm only going to use IKEv2, not IKEv1, in this setup)?
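Since ipsec.conf is regenerated, the usual home for these knobs is the charon section of strongswan.conf. A sketch with illustrative values only; note that OPNsense may regenerate that file too, so check the generated /usr/local/etc/strongswan.conf for an include line your version honors before editing anything:

```
# hypothetical values - tune to taste
charon {
    retransmit_tries = 3
    retransmit_timeout = 2.0
    retransmit_base = 1.8
}
```

With 3/2.0/1.8 the retransmissions would land at roughly 2s, 3.6s and 6.5s, giving up after about 24s total instead of ~165s.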

Regards,
Plamen
#8
20.7 Legacy Series / How "Gateway" concept works
February 04, 2021, 01:47:28 PM
Hello,

I would like to understand the "Gateway" concept in OPNsense (I believe I'm missing something fundamental).
I have 2 static routes for exactly the same destination network (let's say 192.168.0.0/24) via two different gateways: GW1 (1.1.1.1) with priority 10 and GW2 (2.2.2.2) with priority 20.

Because the priority of GW1 is lower than GW2's, GW1 should (as I understand it) be the preferred exit point for 192.168.0.0/24, which seems to be the case: based on the netstat -rnl4 output, 192.168.0.0/24 points to 1.1.1.1.

I've also enabled the monitoring feature on both gateways (using the IP address of the next hop: 1.1.1.1 for GW1 and 2.2.2.2 for GW2).

What is the correct behavior if for some reason GW1 becomes unreachable (IP address 1.1.1.1 stops responding)?
My assumption is that it will be displayed as offline (which it is), and that because it's offline the routing-table entry will no longer point to 1.1.1.1 (GW1) but to 2.2.2.2 (GW2), since 2.2.2.2 is still reachable, albeit with a worse priority. However, that's not what I'm seeing. Despite GW1 being offline, 192.168.0.0/24 still points to it (1.1.1.1) and not to GW2 (2.2.2.2).

What am I missing here?

Regards,
Plamen





#9
Hello,

I have an OPNsense hub-and-spoke topology with the following IP addressing scheme:

Hub: 10.30.0.0/16
Spoke1: 10.31.0.0/16
Spoke2: 10.32.0.0/16
Spoke3: 10.33.0.0/16
...and so on
SpokeX: 10.X.0.0/16
SpokeY: 10.Y.0.0/16

Please note: all spokes have a connection to the hub only; there is no direct spoke-to-spoke physical link. The only way for SpokeX to communicate with SpokeY is via the hub.
All spokes and the hub are OPNsense firewalls, and there are no other firewalls/routers in the topology (only pure L2 switches and OPNsense FWs).

Of course, I need encryption between all spokes and the hub (including spoke-to-spoke traffic).

What I have done so far (which seems to work to a certain extent) is:

Each spoke has a static default gateway pointing to the hub, and the hub has a static route for each spoke's subnet pointing to that spoke's OPNsense FW.
Without encryption involved I have full connectivity between each spoke and the hub, and between the spokes themselves (via the hub).

Each spoke forms an IPsec tunnel to the hub, and for phase 2 I have:

SpokeX:
Local Subnet: 10.X.0.0/16
Remote Subnet: 10.0.0.0/8

Hub (to Spoke X):
Local Subnet: 10.0.0.0/8
Remote Subnet: 10.X.0.0/16

That works fine for connectivity between SpokeX and the hub, as well as between SpokeX and SpokeY (via the hub), but the problem I'm facing and looking to solve concerns local spoke traffic:

For example - Spoke 1 OpnSense FW has 2 inside logical interfaces:

VLAN100 - 10.31.100.1/24
VLAN200 - 10.31.200.1/24

Local hosts in the Spoke1 site living in VLAN100 (10.31.100.0/24) are NOT able to communicate with local hosts (also in the Spoke1 site) living in VLAN200 (10.31.200.0/24). These hosts are not even able to reach their default gateway. I assume their traffic is being encrypted per the 10/8 phase-2 policy.

Is there any way to exclude that local traffic from being encrypted? These 2 interfaces are locally connected to the same firewall, so I would like clear-text connectivity between both local subnets following the OPNsense routing table (directly connected routes).

Can I have more than 1 subnet in phase 2 (local or remote)?
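One approach that may be worth testing is a passthrough policy for the spoke's own aggregate, so intra-spoke traffic bypasses the 10/8 tunnel policy; OPNsense already generates exactly this construct for its own local networks (see the conn pass block that appears in the auto-generated ipsec.conf elsewhere in this thread list). A sketch for Spoke1, with 10.31.0.0/16 as the excluded range:

```
# hypothetical hand-written equivalent - on OPNsense this would be
# expressed through the GUI rather than edited into ipsec.conf
conn pass-local
  right = 127.0.0.1   # so this connection is not used for anything else
  leftsubnet = 10.31.0.0/16
  rightsubnet = 10.31.0.0/16
  type = passthrough
  auto = route
```

Since the narrower policy wins in the SPD, VLAN100-to-VLAN200 traffic should then follow the plain routing table instead of the 10/8 phase 2.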

Any suggestions are highly appreciated!

Regards,
Plamen


#10
Hello,

Today I noticed an annoying "feature" in both my virtual lab and my physical pre-production setup. Once I add an additional VLAN ID (via the web GUI: Interfaces -> Other Types -> VLAN), all data traffic on already-existing VLANs on the same physical interface stops for 2-3 seconds. This happens on a single physical interface as well as on a LAGG group.
Is that expected, and is there any workaround to prevent it?
(I'm running 20.7.5)

Regards,
Plamen
#11
High availability / CARP group tracking
December 04, 2020, 02:23:43 PM
Hello,

I have 2 OPNsense firewalls in HA with 2 different CARP groups: one for the LAN and one for the WAN.
I would like to implement slightly more complex failover logic: instead of relying on a physical interface-down event, the idea is to use a script that pings several WAN IPs and, in case all of them are down, demotes the active CARP LAN group.
Based on the https://docs.opnsense.org/development/backend/carp.html document, I've created a shell script that returns 0 when all is good and 1 when the node needs to demote.
The script is executable and located in /usr/local/etc/rc.carp_service_status.d/
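For concreteness, the script I have in mind is along these lines (the WAN test IPs and the file name are placeholders; exit 0 keeps the node up, exit 1 asks for demotion, per the linked document):

```
#!/bin/sh
# /usr/local/etc/rc.carp_service_status.d/10-wancheck (hypothetical name)
WAN_IPS="192.0.2.1 192.0.2.2 192.0.2.3"   # placeholder monitor targets

for ip in $WAN_IPS; do
    # one probe per target, 2-second timeout (FreeBSD ping)
    if ping -c 1 -t 2 "$ip" > /dev/null 2>&1; then
        exit 0    # at least one WAN target answers: all good
    fi
done
exit 1            # every target is down: demote
```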

The question I have is: how, when, and by whom is that script executed?


Regards,
Plamen
#12
Hello Community,

I'm having trouble with the FRR (1.17) plugin, installed on 20.7.4 and upgraded to 20.7.5.
It does not start, even with only the zebra daemon enabled.

The error message is:

root@OPNsense1:/usr/local/etc/rc.d # service frr start
Checking zebra.conf
2020/12/01 07:05:13 ZEBRA: [EC 4043309110] Disabling MPLS support (no kernel support)
OK
/usr/local/etc/rc.d/frr: WARNING: failed precmd routine for zebra
root@OPNsense1:/usr/local/etc/rc.d #


Since I've literally been playing with OPNsense only since yesterday, I assume I'm doing something wrong.
I simplified my setup as much as possible: a default installation in a virtual machine (VMware Workstation) with only the os-frr plugin installed, a vNIC dedicated to the management interface (obtaining an IP address from a DHCP server), and the SSH daemon enabled.
Initially I tried it in a more complex setup (a different OPNsense VM) with several VLAN interfaces, a VTI IPsec tunnel, and no firewalling (permit ip any any on all interfaces), with the intention of using OSPF for dynamic routing, but I hit exactly the same issue starting the service (and of course it doesn't start from the GUI either).
I assume it's something in the BSD startup scripts, because I can manually run the zebra and ospfd daemons from the CLI and routing works. Obviously that's not how I would like to operate.


Any help is appreciated!

Regards,
Plamen