Topics - JL

#1
This has been going on for quite some time. The reported fix is to 'delete all netflow data', which does not sit well with me.

This looks like a recurring bug? Quite a few posts on this board report that this error happened in the past.

flowd_aggregate.py: flowd aggregate died with message
Traceback (most recent call last):
  File "/usr/local/opnsense/scripts/netflow/flowd_aggregate.py", line 162, in run
    aggregate_flowd(self.config, do_vacuum)
  File "/usr/local/opnsense/scripts/netflow/flowd_aggregate.py", line 80, in aggregate_flowd
    stream_agg_object.add(copy.copy(flow_record))
  File "/usr/local/opnsense/scripts/netflow/lib/aggregates/source.py", line 69, in add
    super(FlowSourceAddrTotals, self).add(flow)
  File "/usr/local/opnsense/scripts/netflow/lib/aggregates/__init__.py", line 187, in add
    self._update_cur.execute(self._insert_stmt, flow)
sqlite3.DatabaseError: database disk image is malformed

/var/netflow shows no broken files or lock files left behind

Workaround (with data loss):

cd /usr/local/opnsense/scripts/netflow/
./flush_all.sh all


Bug report created: https://github.com/opnsense/core/issues/9499
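
Before flushing everything, it may be worth finding out which of the netflow databases is actually damaged. The following is only a minimal sketch, assuming the aggregates are stored as SQLite files matching `*.sqlite` under /var/netflow (the file pattern and the function names are my assumptions, not taken from the OPNsense scripts). It runs SQLite's built-in `PRAGMA integrity_check` on each file, opened read-only.

```python
import sqlite3
from pathlib import Path


def check_sqlite_integrity(db_path):
    """Return the rows of PRAGMA integrity_check; ['ok'] means no damage found."""
    try:
        # Open read-only so the check itself cannot modify the database.
        conn = sqlite3.connect(f"file:{db_path}?mode=ro", uri=True)
        try:
            rows = conn.execute("PRAGMA integrity_check").fetchall()
        finally:
            conn.close()
        return [row[0] for row in rows]
    except sqlite3.DatabaseError as exc:
        # Badly damaged or unreadable files can fail before the pragma reports anything.
        return [f"error: {exc}"]


def scan_directory(directory, pattern="*.sqlite"):
    """Map each matching file name to its integrity result (pattern is an assumption)."""
    return {
        path.name: check_sqlite_integrity(path)
        for path in sorted(Path(directory).glob(pattern))
    }
```

Calling `scan_directory("/var/netflow")` returns a dict such as `{"some_file.sqlite": ["ok"], ...}`; any entry that is not `["ok"]` points at a file worth looking at before deleting all of the data.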


#2
For the n-th time, my Proxmox + OPNsense setup has run into trouble.

Each time it takes a long time to track down the issue, which is essentially interface names being 'rewritten'.
Admittedly, I did not go through the configuration files, but there was no indication the config was wrong.
The config has not changed in recent months for either Proxmox or OPNsense.

What appears to happen, and shows up as a progressive issue, is this: first some disruptions; eventually the OPNsense VM shows a normal config from the outside in Proxmox, but on the inside the VM's interface MAC is assigned to a different interface or as a secondary MAC (yeah, I know).

I'm now switching the WAN from virtio to e1000 to see if that makes any difference.

To me this is a mind-numbing issue. At one point the OPNsense admin interface was no longer reachable, even though it sits on a dedicated interface with a direct UTP connection.

The fix was to use a Proxmox feature to rename the interfaces; last time I did something similar manually, which also worked.

Specific to OPNsense: I'm worried that Proxmox can simply reassign a MAC and that this doesn't revert. It is concerning that OPNsense allows multiple MACs, or a MAC that does not match the configured vtnetX interface.

Hoping for your feedback; I'm available to help with the analysis if possible.

#3
For some appliances central log collection is not an option, and the built-in log query features are not superb.

Would it be possible for me to use a database tool to query the OPNsense logs as if they were a database?
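
To illustrate what I mean, here is a rough sketch of loading plain-text syslog lines into an in-memory SQLite database so they can be queried with SQL. It assumes classic BSD-style lines ("Mon DD HH:MM:SS host process[pid]: message"); the regular expression, table layout and function name are my own assumptions, and the pattern would need adjusting to whatever format the log files actually use.

```python
import re
import sqlite3

# Assumed line layout: "Jan 28 12:39:45 host process[pid]: message" (pid optional).
LINE_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+(?P<host>\S+)\s+"
    r"(?P<proc>[^\s:\[]+)(?:\[(?P<pid>\d+)\])?:\s*(?P<msg>.*)$"
)


def load_logs(lines):
    """Parse syslog-style lines into an in-memory SQLite table named 'logs'."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE logs (ts TEXT, host TEXT, proc TEXT, pid INTEGER, msg TEXT)"
    )
    rows = []
    for line in lines:
        match = LINE_RE.match(line.rstrip("\n"))
        if match:  # lines that do not fit the assumed layout are skipped
            rows.append(
                (match["ts"], match["host"], match["proc"], match["pid"], match["msg"])
            )
    conn.executemany("INSERT INTO logs VALUES (?, ?, ?, ?, ?)", rows)
    return conn
```

With that in place, something like `load_logs(open(path)).execute("SELECT proc, COUNT(*) FROM logs GROUP BY proc")` gives a per-process line count, and any other SQL query works the same way.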
#4
Hey, please think along with me for a moment.

Using 24.10 in a VM hosted on a Linux VM server. The OPNsense VM is connected to a Linux bridge which simply passes all (tagged) VLANs from the interface connected to the switch.

Observation: VLAN traffic is seen on the physical interface and on the bridge with the VLAN tag present; the switch only offers tagged VLANs
Problem: inside the VM, though, the traffic seems untagged, since it is not observed on the vlan0.401 interface, for example
Validation: when connecting another VM to vlan0.401, the communication works well
Question: how can I make sure the VLAN tag from the hypervisor bridge is passed from the parent to the OPNsense VLAN interface?

The Linux bridge config is shown below. The OPNsense VLAN is attached to the parent interface, which is assigned to the bridge interface on the hypervisor host.
---
auto LIF
iface lif inet static
        bridge-ports eth1
        bridge-stp off
        bridge-vlan-aware yes
        bridge-vids 401 402 901 1500
        
For one other interface the bridge has a 1:1 mapping like the one below; this works well since the VLAN is not "inside" the VM.

auto DIF
iface dif inet static
        bridge-ports eths3.700
        bridge-stp off
        bridge-vlan-aware yes
        bridge-vids 700

I'd prefer to 'pass through' the interface to the VM, but this can only be done over a bridge, which leads to the current problem.

Br,

JL
#5
Posting this because I spent an extraordinary amount of time figuring it out, and I don't understand the context (yet).

Enabling syncookies blocks some sites but not all. This is particularly so for sites hosted by sucuri.net, such as linuxmint.com, and reportedly also Facebook sites, which makes one wonder what the common factor is.

There are two possible solutions: disable syncookies, or set them to adaptive. When setting them to adaptive, use a start value of >= 50% but < 100%.

Hope this helps.

br,

Joris
#6
a bad week, karma, or something entirely different, who will tell

things to know for working with 24.x

Foremost: know that Kea DHCP requires the GATEWAYIP/MASK to be set, not the subnet/mask as the GUI help suggests. Though Kea is interesting, it seems sub-par in features compared to ISC and also less configurable. Curious to see what will come of this.

There is a distinct impression the platform is just faster, pages load in an instant

Unbound is once again not behaving as well as it could; enabling DNSSEC is not recommended at first. Just leave it off. My assumption is that an update may fix whatever is going on. If not, let me know what I did wrong.

Pairing Unbound with DNScrypt can be a headache. Just point only the "query forwarding" to the DNSCrypt service, don't combine this with "DNS over TLS" from unbound, stuff breaks here :-D Also, Unbound has its own visual dashboard.

Writing rules 'feels different', could be me paired with a lack of sleep.

I've also had the weirdest experience with getting the network to actually work properly. This not in small part due to Ubuntu 24 LTS, it is not recommended to upgrade just yet. Something seems alive in there and it is cheeky and mischievous.

Somehow "automatic outbound NAT rules" gave up on me a few times. I had to switch to hybrid mode. I mean, is there a ghost in the machine or what ?!?  This makes it a requirement to hide each network separately behind the WAN address. Including the WAN network, it seems.

There are quite a few small and notable changes and improvements. I'm actually getting curious about OPNsense again. Though I do think there's too much awkwardness for casual use, it's growing on me. I'm sure to try out the central management features.

There's something different about how gateways are managed; not sure what, it seems too easy now ?

There's something odd about "Dynamic gateway policy": do I need it enabled or not? The change does not seem to propagate or act consistently over time.

Lack of consistent behavior seems to be a growing trend with open source software lately; it is quite concerning. Settings appeared to be saved but were not, flows seemed to pass until they did not. The mess I've seen Ubuntu make of simple things is just disingenuous. OPNsense seems to suffer from "ghost states" and can sometimes use a reboot. It is not advisable to accept rebooting as a standard practice.

What I miss profoundly is a way to add exceptions to the bogon filtering. Now it has to be disabled because it matches DHCP and disrupts it. There appear to be some things missing here; maybe a feature was deprecated ?

Suricata borks again; I hope I can find the post again on how to keep it up and not have it crash randomly. It is stupid that this is not documented or fixed in the build. Yes, all hardware offloading is disabled. Oh wait, yes, Suricata stops when the MTU is not set consistently for all interfaces. If you change the MTU, update it here too. At least that used to work. Now it reports "<Error> -- opening devname netmap:vtnet1/R failed: Invalid argument"; I don't see why and have not found out why yet. Oh no, it borks for all interfaces, with the same "Invalid argument" error. Let's disable IPS mode, which worked last time, until I remember the fix.

Just in case, here is how i fixed it last time https://forum.opnsense.org/index.php?topic=38140.0

Why is GeoIP under Firewall Aliases ? Why is it documented so vaguely which URL to point it at ?
Works though. I think. Can I actually see the GeoIP info anywhere ?


In all, it's okay to work with. It is a building block rather than a one stop toolkit.
Reminds me I have to get Elasticsearch up and running again, pair it with Grafana and stuff.




















Setting up DNScrypt requires little but you should know what.


#7
Please like or share a comment if this post is helpful or you have more questions.


This how-to-fix post explains how Suricata crashes with OPNsense on Proxmox (any version) can be remediated. The advice here may not be suitable for production environments; I trust you know this already.

SHORT FORM specific to Proxmox


set all bridge interfaces used for OPNsense to the same MTU
(it may be required to set the bridge interface MTU to the physical interface MTU minus 22)

use only virtio network adapters for the OPNsense VM interfaces monitored by Suricata
set the network adapter MTU to 1 to adopt the bridge MTU from Proxmox
in OPNsense, leave the MTU for the interface blank
in OPNsense, for Suricata, keep the MTU blank and disable promiscuous mode
in OPNsense, for Suricata, set the exact network masks configured for each interface; it may help to add or remove networks to match the interfaces enabled for Suricata
add the tunables:
> set dev.netmap.buf_size to the value displayed for NS_MOREFRAG or to the MTU value of Proxmox (trial and error)
> set dev.netmap.admode to 1
> set ns.morefrag to the same value as dev.netmap.buf_size
reboot the VM
Suricata should now be stable


Context


VM-hardware has Q35 chipset and uses virtio network interfaces.
The OPNSense host has qemu-guest-agent installed.


Indicator (console output)
Jan 28 12:39:45 opnsense kernel: 385.664273 [2197] netmap_buf_size_validate error: large MTU (8192) needed but igb1 does not support NS_MOREFRAG
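
To check a whole log for this indicator, the message can be picked apart with a small pattern. This is only a sketch built around the single console line above; the function name is made up for illustration, and other netmap messages will not match.

```python
import re

# Pattern derived from the single console line shown above (an assumption about its stability).
NETMAP_MTU_RE = re.compile(
    r"netmap_buf_size_validate error: large MTU \((?P<mtu>\d+)\) needed "
    r"but (?P<iface>\S+) does not support NS_MOREFRAG"
)


def parse_netmap_mtu_error(line):
    """Return {'mtu': int, 'interface': str} for a matching log line, else None."""
    match = NETMAP_MTU_RE.search(line)
    if match is None:
        return None
    return {"mtu": int(match["mtu"]), "interface": match["iface"]}
```

Fed the indicator line, this yields the MTU netmap asked for (8192) and the interface it complained about (igb1), which are the two values to compare against the bridge and interface settings.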

Assumption
This indicates an MTU inconsistency: the MTU is set above 1500 on the bridge, but this is 'broken' somewhere between the bridge and the IPS. To my understanding the network interfaces available on Proxmox are well supported by OPNsense.

For non-virtualised systems the issue may be the same. Check the MTU of the network, match the MTU of the network on the physical interfaces. Consider subtracting 22 from the MTU for compatibility.

It is recommended to check whether the MTU on the bridge is >1500.

configure : within Proxmox

check and set the MTU of the VM-hardware network interface(s) to 1 so these adopt the MTU of the connected network.
you can consider decreasing the MTU by 22 (this reduced value is called PMTU below)
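
As a worked example of the "MTU minus 22" arithmetic: the 22-byte margin is the value used throughout this post, not an official constant, and the helper name is made up for illustration.

```python
def pmtu(network_mtu, margin=22):
    """Return the reduced MTU (PMTU) used in this post: network MTU minus a margin."""
    if network_mtu <= margin:
        raise ValueError("network MTU must be larger than the margin")
    return network_mtu - margin


# Standard Ethernet MTU and a common jumbo-frame MTU.
standard_pmtu = pmtu(1500)  # 1478
jumbo_pmtu = pmtu(9000)     # 8978
```

So a bridge at MTU 1500 gives a PMTU of 1478, and a bridge at MTU 9000 gives 8978.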


configure : within OPNSense


[ for Suricata] under the 'advanced' section of the IPS service : check and/or clear default packet size (MTU) setting
setting the MTU here can affect detection reliability and 'drop' or 'conflate' frames on inspection, consider setting MTU-22


[ for Interfaces ] check and/or clear the MTU settings for the monitored interfaces, OR (recommended) set the PMTU as the value
important: know that non-enterprise network cards may not support 'real' jumbo frames, which permit an MTU >1500


Look up the specifications for the network interface cards (NIC) and do not set the MTU higher than the hardware supports, even if the MTU on the connecting switch is set to a much higher value.


[ for SYSTEM: SETTINGS: TUNABLES ] manually create the key dev.netmap.buf_size with value = <PMTUvalue>
this works around issues with some NICs where the MTU setting does not work well, by hard-setting the buffer size with this key


configure : optionally for OPNSense


[ for SYSTEM: SETTINGS: TUNABLES ] manually create the key dev.netmap.admode with value = 1
this avoids flapping between native and emulation state for the network interface


[ for Suricata] consider setting MTU-22 as the packet size for stability


Considerations

when the value for the MTU is cleared for an interface this defaults to 1500
consider this may severely impact IPS performance and/or accuracy

Resources

https://docs.opnsense.org/manual/ips.html
https://man.freebsd.org/cgi/man.cgi...eBSD+12.1-RELEASE+and+Ports#SUPPORTED_DEVICES
https://man.freebsd.org/cgi/man.cgi?vtnet
#8
After much wrestling and worrying, it turns out that NTPd failing to sync appears to be due to the bogon filtering flag.

I've noticed other mishaps with bogon filtering in the past; it seems there should be some automated exemptions so this flag can be left enabled.

Regards,

JL
#9
For some reason I ended up with over 60GB of logs for Unbound.

These are not compressed. I had hoped OPNsense would compress the logs automatically, yet I fail to find an option to configure it to do so. Typically on a *nix system there is little standing in the way of reading compressed text log files, so I'd prefer to go ahead and compress them.

Is there anything to consider for doing so ?

Documentation seems lacking.
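
For reference, this is roughly what I have in mind: a minimal sketch that gzips one already-rotated log file. The function name is made up, and it assumes the file is no longer being written to; compressing the log the service currently has open is not something this handles.

```python
import gzip
import shutil
from pathlib import Path


def compress_log(path, remove_original=False):
    """Gzip a closed log file next to the original and return the new path."""
    src = Path(path)
    dst = src.with_name(src.name + ".gz")
    # Stream the copy so large files are never loaded into memory at once.
    with src.open("rb") as fin, gzip.open(dst, "wb") as fout:
        shutil.copyfileobj(fin, fout)
    if remove_original:
        src.unlink()
    return dst
```

The resulting `.gz` files can still be read with the usual tools (zcat, zgrep, zless), which is why I'd expect little downside apart from whatever the GUI log viewer makes of them.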
#10
22.7 Legacy Series / Failing DNS services
December 31, 2022, 03:39:58 PM
While on older OPNsense releases the 'intrusion detection service' frequently crashed, after the upgrade to 22.7 there are new issues: now it is the DNS services that crash, and these worked fine with older OPNsense releases.

There are no apparent log entries indicating why the DNS services crash; I'm using unbound+dnscrypt+bind.
To my surprise all three services go down simultaneously. As I've noticed at least one successful (likely) DNS spoofing attempt, I'm not confident these crashes are benign.
#11
hey

thanks for taking a little bit of time to share your thoughts


I have this server at my disposal yet just one public IP

The server is a dual CPU 8c/16t with plenty of RAM and disk

the set-up I have in mind is    [ public IP ] > [virbr0, virbr1, virbr2] > ( opnsense-fw-1, opnsense-fw-2 ) > virtual-LAN > VM1...N
on VM1..N there will be just a few VMs running services

so, now I have the public IP, to which I point DNS, and I want this traffic to arrive at each of VM1..N on different ports

to this end I expected to use the public IP as a VIP-WAN, but now I'm not certain whether the ssh service running on the VM host will still be reachable if I do so

or, for that matter, whether the OPNsense HA cluster could correctly resolve the DNS and match it with the hosts behind the NAT









#12
20.7 Legacy Series / repeat crashing
October 31, 2020, 06:56:12 AM
Dear, it is with a sense of dread that I write this post, as it concerns OPNsense going haywire repeatedly.


The logs reviewed thus far do not contain a clear indicator of what happened, but it never happens just once; this is the third time in only two weeks. In all earlier cases the crashes appear to correlate in time with the scheduled updates for Suricata, except today, and this time it is even worse than just Unbound and Suricata dying.


I've now initiated the update to 20.7.4, which I had not done before since it only presented a pkg update.


My question is whether others experience such crashes as well. My concern is that it may not just be instability of OPNsense but an external factor. If so, there are no indicators left in the logs.
#13

       
  • with Unbound service there are recurrent issues where the service simply stops responding
  • dns lookups from the opnsense web-ui from any interface work as per normal
  • this makes me think the problem does originate within unbound
Validating the Unbound configuration, I could not find any blacklist enabled. After rebooting OPNsense, the domains that return 0.0.0.0 as their address briefly resolve correctly. The IDS was not enabled at the time since it had crashed once more; disabling the IDS also made no observable difference to the erroneous DNS results.




#14
20.7 Legacy Series / unstable on proxmox ?
September 27, 2020, 07:10:21 PM
Dear,

Using OPNsense since release 17 or so, I find it unstable to work with on Proxmox VE 6.2

the disk I/O is troublesome to the point that only IDE with SSD emulation appears to work well (for speed); choosing a different kind of controller results in a lot of swap failure notifications.

on shutdown a plethora of errors is thrown which appear low level, regardless of the controller chosen

in all I don't feel like 20.7 is as production ready as one typically assumes

memory consumption appears quite high out of the box; the VM has 2.5GB of RAM and frequently starts complaining it is out of swap space, shutting down multiple services without warning
#15
Dear,

Confronted with Zberp being reported as originating from my SmartTV in relation to Netflix traffic (yes, port 80), I came to look at Suricata SID 2021831, which is a flowbits:noalert rule.

It took me a while and I had to ask, but someone pointed out that this rule is not supposed to trigger, since it is a flowbits rule for which no alert is configured. Hence I wonder whether this is (most likely) my mistake in enabling such a rule, or a known error in the Suricata configuration with OPNsense.

Thank you
#16
18.7 Legacy Series / WebGUI very slow on LAN
November 02, 2018, 02:11:31 PM
Dear,


After blaming OPNsense, I came to realize that the slow loading of the WebGUI is most likely all on me.


I don't see how to start troubleshooting this. There is no indication thus far. When I do a factory reset the WebGUI is snappy as expected.


This setup has a WAN - LAN - OPT interface setup; Unbound DNS is set to query WAN and Localhost.


The slowness is the same whether the management interface is reached by IP or by DNS name. There is no other interface listening to the management interface.


I suspect this may be a routing issue or gateway weighting issue but could not find anything related.


Best Regards,


Joris
#17
General Discussion / restarting unbound by cron
August 20, 2018, 03:02:37 PM
Because of recurring performance issues with unbound i think it is wise to restart the service every n hours.

I could not find the way to configure this from the web interface or in the manual pages.

Please advise.


Thank you
#18
Dear,

My set-up is the latest production release of OPNsense on a system with three network interfaces (WAN, Mobile, LAN).

While my entire Sonos setup works fine, as it is entirely connected to Mobile, I now want to connect to it from LAN. This uses SSDP, a multicast-based protocol that uses 239.255.255.250 on port 1900/udp.


STATUS not working: traffic from Sonos Desktop does cross the interfaces but nothing returns

Validation: I ran a packet capture on the Mobile interface for "224.0.0.0/4 or 192.168.29.100", the latter being my LAN IP

As a "narrow it down" approach I've tried various settings. I now have a rule at the top of the rulebase permitting all addresses towards 239.255.255.250 on both Mobile and LAN; for these rules I've also enabled 'allow options' and 'any flags'.

In a desperate attempt I've even created src: any dst: 239.255.255.250 for any protocol, as well as src: 239.255.255.250 dst: any for any protocol, on both networks.

Please comment or advise on what to search for. Multicast is a notable omission in any thread related to OPNsense.

[update 10:22 CET 29/06/2018 ]

The Sonos App on a Microsoft System is sending SSDP (239.255.255.250) to port 1900/udp but this does not cross the interfaces on the firewall (since multicast)

Installed the IGMP Proxy Service (mixed non-results thus far)

Configured Mobile as Upstream as the Sonos Speakers are here as well as the Sonos Controller on a Tablet
Configured LAN as Downstream as the Sonos Desktop Application is located here

For each of the configured IGMP i have configured the relevant subnet and also added 239.255.255.250/32
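
To generate repeatable test traffic instead of relying on the Sonos app, an SSDP search can be sent by hand while the packet capture runs. This is a minimal sketch, assuming a generic `ssdp:all` search target and a multicast TTL of 4 (both are my choices for testing, not values taken from Sonos); a multicast packet sent with TTL 1 is never forwarded by a router, so the TTL is worth checking in the capture as well.

```python
import socket

SSDP_ADDR = "239.255.255.250"
SSDP_PORT = 1900


def build_msearch(search_target="ssdp:all", mx=2):
    """Build an SSDP M-SEARCH request as bytes (search_target is an assumption)."""
    lines = [
        "M-SEARCH * HTTP/1.1",
        f"HOST: {SSDP_ADDR}:{SSDP_PORT}",
        'MAN: "ssdp:discover"',
        f"MX: {mx}",
        f"ST: {search_target}",
        "",
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def discover(timeout=3.0, ttl=4):
    """Send one M-SEARCH and collect (source IP, first response line) until timeout."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    # A TTL above 1 is needed for the packet to survive being forwarded.
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, ttl)
    sock.settimeout(timeout)
    responders = []
    try:
        sock.sendto(build_msearch(), (SSDP_ADDR, SSDP_PORT))
        while True:
            try:
                data, addr = sock.recvfrom(65507)
            except socket.timeout:
                break
            text = data.decode("utf-8", "replace")
            responders.append((addr[0], (text.splitlines() or [""])[0]))
    finally:
        sock.close()
    return responders
```

Running `discover()` from a host on LAN while capturing on Mobile shows whether the M-SEARCH arrives there at all, and the returned list shows whether any unicast replies make it back.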





#19
Tutorials and FAQs / Telegraf input/output
March 03, 2018, 09:27:22 AM
Hello,

I'm looking to have Telegraf output from OPNsense. Not just for system monitoring but also for Suricata monitoring. Is this available in the current setup of Telegraf, or is extra work required ? If need be, I could provide extra hands here, I figure.