Messages - KeyHand

#1
Understandably, the primary concern with imposing limits on a tmpfs-backed `/var` is running out of disk space when performing updates.  Looking at my system, though, it appears the directories `pkg` uses are already symlinked to non-tmpfs locations.  Specifically cache (PKG_CACHEDIR)

root@OPNsense:~ # ls -l /var/cache/pkg
lrwxr-xr-x  1 root  wheel  19 Jun 30 23:55 /var/cache/pkg -> /root/var/cache/pkg


and database (PKG_DBDIR)

root@OPNsense:~ # ls -l /var/db/pkg
lrwxr-xr-x  1 root  wheel  16 Jun 30 23:55 /var/db/pkg -> /root/var/db/pkg


Are all aspects of OPNsense updates performed via `pkg`?  If so, updates via `pkg` should be unaffected by running out of space on a tmpfs-backed `/var`.  (Then running out of space on the primary disk is the concern, and you've got a whole bag of other problems.)  If there's some other update mechanism employed, could the relevant cache/database locations be changed to somewhere on the primary disk?
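For anyone wanting to repeat the symlink check on other directories, here is a minimal Python sketch.  The helper name `resolves_outside` is my own invention, and it only tells you that a path resolves to somewhere outside a given directory tree; it does not prove the target lives on a non-tmpfs filesystem (you'd still want to confirm that with `df`).

```python
import os


def resolves_outside(path, mount_root):
    """Return True if `path`, after following symlinks, lies outside `mount_root`.

    Hypothetical helper: it compares resolved paths only and knows nothing
    about which filesystem actually backs the target.
    """
    real_path = os.path.realpath(path)
    real_root = os.path.realpath(mount_root)
    # commonpath() equals the root only when real_path is inside the root.
    return os.path.commonpath([real_path, real_root]) != real_root


if __name__ == "__main__":
    for candidate in ("/var/cache/pkg", "/var/db/pkg"):
        print(candidate, "->", os.path.realpath(candidate),
              "| outside /var:", resolves_outside(candidate, "/var"))
```

On the system shown above, both directories would be expected to report as outside `/var`, since they point into `/root/var`.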
#2
Thanks for the additional information franco. 

Looking at the commit where you migrated from mdmfs to tmpfs, I can see how the previous limit of 60 MB for `/var` might cause issues.  It's certainly far too small.

I guess searching GitHub for related issues should be a requisite step before posting on the forums in future.  The discussion in issue #2856 is very similar to the ground I've covered so far.  I've read through the points you made in the issue and agree with your assessment of the situation:

  • adding another UI element requires translation which, although not overly difficult, does complicate matters
  • reverting back to a small, hard size limit for `/var` and `/tmp` should be avoided at all costs

Thankfully I'm in a position where using tmpfs for `/var` isn't absolutely necessary.  And if it was, I'm currently able to throw more RAM at the problem.  In fact, I may do just that in the short term to see how it will handle NetFlow and ntopng logging.

When you say you're not against a contributed upper bound feature, how do you see it being implemented?  Some internal conditional logic that sets tmpfs to some value smaller than total RAM, but not smaller than a minimum threshold?  Or a full-blown user-selectable size specified in the UI (with some safeguards in place)?
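To make the first option concrete, here is a rough Python sketch of the kind of conditional logic I have in mind.  The function name, the 50% fraction and the 256 MiB floor are all placeholder assumptions of mine, not values taken from OPNsense.

```python
MIB = 1024 * 1024


def tmpfs_size_bytes(total_ram_bytes, fraction=0.5, minimum_bytes=256 * MIB):
    """Pick a tmpfs size: a fraction of RAM, but never below a minimum floor.

    The defaults are illustrative assumptions only.
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in the range (0, 1]")
    wanted = int(total_ram_bytes * fraction)
    # Apply the floor first, then make sure we never exceed total RAM.
    return min(total_ram_bytes, max(minimum_bytes, wanted))
```

With these placeholder values a 4 GiB machine would get a 2 GiB tmpfs, while a very small machine would fall back to the floor (or to all of its RAM if it has less than the floor).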
#3
I have been testing OPNsense in a virtualised environment (KVM) with modest resources (10GB disk, 4GB memory, 4 cores).  The logic being if it behaves well here, it'll work just fine with more resources.

When investigating how OPNsense handles writing logs to disk, I enabled the '/var RAM disk' option in 'System: Settings: Miscellaneous'.  With conservative logging options set in 'System: Settings: Logging' ('Disable circular logs': checked, 'Preserve logs (Days)': 5) and only basic services running on the system (pf, dhcp, unbound DNS), this worked just fine.

The problem came with enabling more services.  Specifically NetFlow (required by 'Reporting: Insight'), ntopng (installed as a plugin) and redis (a required dependency of ntopng).  Everything would work fine for a while, but certain aspects of the system would become unresponsive after some time had passed.  Most noticeably unbound refused to respond to queries.  The services widget on the dashboard would show unbound and flowd_aggregate as stopped.  Manually restarting them again would work for a few minutes, but eventually they would stop again.  The dashboard also showed high memory usage and disk usage on `/var`; not surprising given the additional services running on the system.

The only place that provided a hint as to what was happening was the OPNsense console: the system was running out of memory and killing processes.  Unfortunately I didn't think to save the exact error messages at the time.  (Does OPNsense regularly save `dmesg` output to a file?)

NetFlow, ntopng and redis obviously require memory to operate.  The compounding factor is those services persisting a significant amount of data to memory-backed storage.  In the case of NetFlow, it's actually a spiral of doom: `/var/log/flowd.log*` files produced by NetFlow are rotated by `flowd_aggregate`, which gets killed due to low memory conditions.

Looking at how the '/var RAM disk' option works, it simply triggers creation of tmpfs with minimal options.  I.e.


mount -t tmpfs tmpfs /var


For reference, enabling '/tmp RAM disk' does something similar

mount -t tmpfs -o mode=01777 tmpfs /tmp

When no `size` parameter is specified, `tmpfs` defaults to using all of the available memory.

Quote

     size   Specifies the total file system size in bytes, unless
            suffixed with one of k, m, g, t, or p, which denote
            byte, kilobyte, megabyte, gigabyte, terabyte and
            petabyte respectively.  If zero (the default) or a
            value larger than SIZE_MAX - PAGE_SIZE is given, the
            available amount of memory (including main memory and
            swap space) will be used.


Understandably, having this particular combination of memory- and disk-hungry applications running on a system with constrained resources isn't a great idea.  However I imagine there are legitimate cases where having a tmpfs-backed `/var` is desirable, and it would be preferable to exhaust logging space before the system runs out of memory and starts killing off processes.

When using '/var RAM disk' option in 'System: Settings: Miscellaneous', can tmpfs be capped at a size less than total system memory?  Preferably user defined, either expressed as a percentage of system memory or an explicit value.
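As a sketch of what a user-defined cap could translate to, the Python below turns either a percentage or an explicit value into a `mount` argument list using the tmpfs `size` option quoted above.  The function name and the accepted input syntax (`50%`, `512m`, `2g`) are my own assumptions; the only thing taken from the man page is that `size` accepts a plain byte count.

```python
import re

# Binary multipliers for the explicit form; the input syntax is an assumption.
_UNITS = {"k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}


def tmpfs_mount_argv(spec, total_ram_bytes, mount_point="/var"):
    """Build a tmpfs mount command capped at a user-supplied size.

    `spec` is either a percentage of RAM ("50%") or an explicit size ("2g").
    """
    spec = spec.strip().lower()
    percent = re.fullmatch(r"(\d+)%", spec)
    explicit = re.fullmatch(r"(\d+)([kmg])", spec)
    if percent:
        size = total_ram_bytes * int(percent.group(1)) // 100
    elif explicit:
        size = int(explicit.group(1)) * _UNITS[explicit.group(2)]
    else:
        raise ValueError(f"unrecognised size specification: {spec!r}")
    # Safeguard: refuse zero (tmpfs treats it as unlimited) or anything >= RAM.
    if not 0 < size < total_ram_bytes:
        raise ValueError("size must be positive and smaller than total memory")
    return ["mount", "-t", "tmpfs", "-o", f"size={size}", "tmpfs", mount_point]
```

For a 4 GiB system, `tmpfs_mount_argv("50%", 4 * 1024**3)` yields the equivalent of `mount -t tmpfs -o size=2147483648 tmpfs /var`.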
#4
An alias of internal networks is likely easier to manage, but it's really down to how granular you want to make it and how often you can tolerate changing 'something'.

E.g. you could create an alias with the network ranges of all your current VLANs.  When you create a new VLAN, you'll have to add the new network to the alias (ugg, configuration management  ;)).  Conversely, you could just create an alias called 'RFC_1918', add 10.0.0.0/8, 172.16.0.0/12 and 192.168.0.0/16 (assuming all your VLANs use addresses that fall in these ranges), use the alias as the source address of a floating rule to your Pi-hole, and call it a day.

Then there's the (admittedly small) risk of random devices using addresses they shouldn't and gaining access to things they shouldn't be able to access.  (From the WAN this can be mitigated by setting 'Block private networks' and 'Block bogon networks' on the interface.)

In either case, I'd argue that making your configuration explicit limits the potential of unintended traffic and generally makes your OPNsense config easier to read.
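If you want to sanity-check which of your VLAN networks the broad 'RFC_1918' alias would actually cover, something like this Python sketch does the job using the standard `ipaddress` module.  The helper names and the sample networks are made up for illustration.

```python
import ipaddress

# The three private ranges suggested for the 'RFC_1918' alias.
RFC_1918 = [ipaddress.ip_network(n)
            for n in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")]


def in_alias(address, alias=RFC_1918):
    """Return True if a single address falls inside any network of the alias."""
    ip = ipaddress.ip_address(address)
    return any(ip in network for network in alias)


def uncovered_networks(vlan_networks, alias=RFC_1918):
    """Return the VLAN networks (CIDR strings) that the alias does not cover."""
    missing = []
    for cidr in vlan_networks:
        network = ipaddress.ip_network(cidr)
        if not any(network.subnet_of(candidate) for candidate in alias):
            missing.append(cidr)
    return missing
```

Any network returned by `uncovered_networks()` would need either its own alias entry or a different addressing plan.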

#5
Quote from: Sheldon on June 29, 2021, 07:42:02 AM
Thank you for your research!
You're most welcome

Quote from: Ricardo on June 29, 2021, 09:08:12 AM
Just among us.. If you made a poll, and asked 100 random opnsense admin, how many would know this is how this logging thing works in opnsense 21.1? As the docs in this topic  are unhelpful, even worse completely misleading the reader, as usual very frequently.
As with any software project, there are shortcomings in both technical features and the documentation.  If OPNsense was a commercial application that mandated a license fee for its use, and the money raised from said license fee went toward a team of software developers, technical writers, quality assurance testers and support staff, I would absolutely agree.

But it's not.

It's both free (monetarily) to use by anyone and free (libre) to inspect the inner workings of.  I value the latter far more than the former.  When I'm unsure as to why a system component isn't behaving the way I expect it to, I can directly consult the code; something that would be impossible with a closed source project.  Sometimes these investigations provide additional context for system components that may have been overlooked.  I feel it's the very least I can do to contribute to the community.

Quote from: chemlud on June 29, 2021, 09:17:01 AM
Bug report?
Absolutely! Bring it to someone's attention.  Without getting eyes on the problem, it's never going to get fixed.

Quote from: franco on June 29, 2021, 01:11:18 PM
Bug fix! https://github.com/opnsense/core/commit/6f7744993
See, the system works!  :D
#6
I'm glad you managed to figure this out.

For clarity's sake, I'm assuming the chain of DNS queries for clients is 'Pi-hole > Unbound > Stubby'  and the intended query chain for the host is just 'Stubby'.  If this assumption is correct, what are the relevant settings under 'System: Settings: General' and 'Services: Unbound DNS: General' (possibly also 'Services: Unbound DNS: Miscellaneous') that got it working for you?

Specifically, did you end up keeping the Unbound 'Custom options' block, or was it just a case of setting 'Do not use the local DNS service as a nameserver for this system' as checked (to remove the default `127.0.0.1` line in `/etc/resolv.conf`) and adding '127.0.0.1@8053' in one of the 'DNS Server' fields?
#7
Quote from: cookiemonster on June 25, 2021, 07:07:32 PM
Unbound options:
server:
do-not-query-localhost: no
forward-zone:
name: "."
forward-addr: 127.0.0.1@8053

Unbound is set to LAN and WAN.

Are you manually inserting that code block into the 'Custom options' field in 'Services: Unbound DNS: General'?  Does it end up in your generated `/var/unbound/unbound.conf`?

Quote from: cookiemonster on June 26, 2021, 11:04:21 PM
What option in the UI need I use in order for OPNsense to use the DNS server set in the unbound forward-addr: directive for system updates for example?

It looks like you can use the 'DNS over TLS Servers' setting in 'Services: Unbound DNS: Miscellaneous'.  The template used to generate the `unbound.conf` snippet for this section looks a lot like what you're already doing.  Perhaps try setting 'DNS over TLS Servers' to '127.0.0.1@8053' and see if that works.
#8
The 'Local Logging - Disable writing log files to the local disk' option corresponds to the `disablelocallogging` configuration parameter in the back end.  If the option is checked, the logic appears to skip over creating several syslog directives which would result in logs being written to disk; effectively not logging anything.  However, if I'm reading it right, this only appears to occur if 'Disable circular logs' is unchecked, i.e. when `clog` logs are in use.

So unless 'Disable circular logs' is unchecked, logs will always be written to disk regardless of the 'Local Logging' setting.  This is unlikely to be intentional behaviour.  If your goal is to minimise writes to disk, the current best option is to use '/var RAM disk' (as already suggested) and a small number for 'Preserve logs (Days)'.
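To spell out my reading of the interaction between the two checkboxes, here it is as a tiny Python truth function.  This is only a model of the behaviour as I understand it from the code, not a copy of the actual OPNsense logic, and the function and argument names are mine.

```python
def logs_written_to_disk(disable_local_logging, disable_circular_logs):
    """Model of the observed behaviour (an interpretation, not OPNsense code).

    Local logging is only suppressed when 'Local Logging' is checked AND
    'Disable circular logs' is left unchecked (i.e. clog is in use).
    """
    suppressed = disable_local_logging and not disable_circular_logs
    return not suppressed


if __name__ == "__main__":
    for local in (False, True):
        for circular in (False, True):
            print(f"disable local={local!s:5} disable circular={circular!s:5} "
                  f"-> written to disk: {logs_written_to_disk(local, circular)}")
```

Of the four combinations, only one stops logs hitting the disk, which is what makes the 'Local Logging' checkbox look broken when circular logs are disabled.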

#9
It seems this thread has a corresponding issue on GitHub.  If it's likely to get the devs' attention, I'm happy for my write-up to be echoed over there.
#10
I came across this thread attempting to achieve the same goal as OP: redirecting all DNS traffic to a specific address, but logging the original destination address before network translation took place.  I've tried various combinations of enabling firewall/NAT logging and setting local tags, but have not been successful so far.  Now that I've done some digging into the matter, I understand why it's not as straightforward as you might think it would be.

It's been established (and confirmed by franco) that OPNsense processes NAT and firewall rules in the same order as pfSense.  I.e. the pfSense documentation on this matter is relevant to OPNsense and, most importantly, NAT occurs before firewall rules.  This means address translation occurs first, and any subsequent firewall rules (and associated logging) operate on the translated address.  So there is no opportunity to log any details about the pre-NAT traffic, including the original destination address.
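A toy Python model of that ordering may help illustrate the problem.  It has nothing to do with how `pf` is actually implemented; the class, the function names and the addresses are all invented for the example.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Packet:
    src: str
    dst: str
    dport: int


def redirect_dns(packet, target):
    """NAT step: rewrite the destination of DNS traffic to `target`."""
    if packet.dport == 53:
        return replace(packet, dst=target)
    return packet


def process(packet, target):
    """Apply NAT first, then 'filter and log' the already translated packet."""
    translated = redirect_dns(packet, target)
    # The filter stage only ever sees the translated destination.
    log_entry = f"pass {translated.src} -> {translated.dst}:{translated.dport}"
    return translated, log_entry
```

Feeding in a query from 192.168.1.50 to 8.8.8.8 with a redirect target of 192.168.1.2 produces a log entry that mentions only 192.168.1.2; the original destination is gone by the time the logging stage runs.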

That's not to say obtaining this information is impossible.  Searching for prior art on the subject, I came across these projects
OPNsense's firewall logs (`/var/log/filter/filter_yyyymmdd.log`) are generated by `/usr/local/sbin/filterlog`.  When `pf` processes a packet that matches a rule with the `log` flag set, it makes it available on the `pflog0` interface.  `filterlog` then reads the packet, parses out various properties, then writes out logging information in a defined format.  It's possible there's scope to extend parsing to extract pre-NAT destination address and port information.  If it were possible, it's likely only one step of many required to expose the information through the web interface.  I.e. modifying views, updating the API, etc.

The above is based purely on my limited understanding of OPNsense and FreeBSD internals gained from poking around the source.  I'd really appreciate it if someone more knowledgeable could let me know if I'm on the right track.  Better yet, it would be great if one of the developers could chime in on the feasibility of my speculative implementation.
#11
21.1 Legacy Series / Re: NordVPN tunnel
June 25, 2021, 01:41:52 PM
You can achieve selective routing over an OpenVPN connection by following the relevant sections in these two guides:

Creating an OpenVPN client connection ('VPN: OpenVPN: Clients') will automatically create IPv4 and IPv6 gateways ('System: Gateways: Single').  The same is not true for WireGuard interfaces, hence the guide covering manual gateway creation.

Assigning the OpenVPN interface (`ovpnc1` or similar) to a new OPNsense system interface (steps 6 and 7 in the NordVPN guide) is required for selective routing to work.
#12
This instance of OPNsense is running under QEMU, so it could be a clock issue.  Something to look into.  Thanks for the tip.
#13
At least we got to the bottom of this. As to why the logic works like this, I have no idea;

Perhaps open an issue pointing back to this thread.  At the very least it will hopefully attract the attention of actual OPNsense developers.
#14
WireGuard has no concept of issuing DNS servers via a DHCP-like mechanism, so I'm not sure where this IP could be coming from.

`/etc/resolv.conf` ultimately gets generated by system_resolvconf_generate() which uses the various 'System: Settings: General' DNS parameters, and whether the Unbound (and/or dnsmasq) service is enabled.

Perhaps have a look through your `/conf/config.xml` file for that 10.x.y.z IP.  `dnsserver` should be empty, and `dns[1-9]gw` should be `none`.  I.e.


    <dnsserver/>
    <dns1gw>none</dns1gw>
    <dns2gw>none</dns2gw>
    <dns3gw>none</dns3gw>
    <dns4gw>none</dns4gw>
    <dns5gw>none</dns5gw>
    <dns6gw>none</dns6gw>
    <dns7gw>none</dns7gw>
    <dns8gw>none</dns8gw>


WireGuard 'servers' have a `<dns/>` key which should probably be empty; I can't see a way for this value to be populated through the webui.
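If eyeballing the file is tedious, a short Python sketch like the one below can flag the `dnsserver` and `dns[1-9]gw` elements that don't match the expected values.  The function name is hypothetical, and the script simply walks every element rather than assuming exactly where in `config.xml` these keys live.  It treats an empty gateway element as harmless, which is an assumption on my part.

```python
import re
import xml.etree.ElementTree as ET


def unexpected_dns_settings(config_xml_text):
    """Return (tag, value) pairs for DNS settings that differ from the defaults.

    Flags a non-empty <dnsserver> and any <dnsNgw> set to something other
    than 'none'.  Empty gateway elements are ignored (an assumption).
    """
    root = ET.fromstring(config_xml_text)
    findings = []
    for element in root.iter():
        value = (element.text or "").strip()
        if element.tag == "dnsserver" and value:
            findings.append((element.tag, value))
        elif re.fullmatch(r"dns[1-9]gw", element.tag) and value and value != "none":
            findings.append((element.tag, value))
    return findings


if __name__ == "__main__":
    with open("/conf/config.xml", encoding="utf-8") as handle:
        for tag, value in unexpected_dns_settings(handle.read()):
            print(f"{tag}: {value}")
```

An empty result means those particular keys look the way they should, in which case the stray IP is coming from somewhere else.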
#15
If requests can be successfully made from the shell (i.e. the host itself) and clients on the LAN (i.e. at least one network external to the host), the problem likely isn't firewall rules.  Requests made through the web interface are effectively the same as what you've just tested.

Looking at the code behind 'Interfaces: Diagnostics: DNS Lookup' (`diag_dns.php`), I can't see too many places where it could be going wrong.  Perhaps `/etc/resolv.conf` is not being populated correctly. It should look like this

root@OPNsense:~ # cat /etc/resolv.conf
domain localdomain
nameserver 127.0.0.1