Messages - redbull666

#1
Tutorials and FAQs / Monitoring your ZFS root using monit
February 26, 2022, 09:30:52 AM
I modified a ZFS monitoring script a bit, and use it on Opnsense. It will monitor your "zroot" ZFS pool if you have installed Opnsense on ZFS (you should, ZFS is amazing).

First, copy this script to your Opnsense install; I have it in /root. Make sure it's executable.
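
For example (the filename here is just a placeholder for wherever you save it):

chmod +x /root/zfs_health_check.sh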

#! /bin/sh
#
## ZFS health check script for monit.
## Original script from:
## Calomel.org
##     https://calomel.org/zfs_health_check_script.html
#

# Parameters

maxCapacity=$1 # in percentages

usage="Usage: $0 maxCapacityInPercentages\n"

if [ ! "${maxCapacity}" ]; then
  printf "Missing arguments\n"
  printf "${usage}"
  exit 1
fi

# Output for monit user interface

printf "==== ZPOOL STATUS ====\n"
printf "$(/sbin/zpool status)"
printf "\n\n==== ZPOOL LIST ====\n"
printf "%s\n" "$(/sbin/zpool list)"


# Health - Check if all zfs volumes are in good condition. We are looking for
# any keyword signifying a degraded or broken array.

condition=$(/sbin/zpool status | grep -E 'DEGRADED|FAULTED|OFFLINE|UNAVAIL|REMOVED|FAIL|DESTROYED|corrupt|cannot|unrecover')

if [ "${condition}" ]; then
  printf "\n==== ERROR ====\n"
  printf "One of the pools is in one of these statuses: DEGRADED|FAULTED|OFFLINE|UNAVAIL|REMOVED|FAIL|DESTROYED|corrupt|cannot|unrecover!\n"
  printf "$condition"
  exit 1
fi


# Capacity - Make sure the pool capacity is below 80% for best performance. The
# percentage really depends on how large your volume is. If you have a 128GB
# SSD then 80% is reasonable. If you have a 60TB raid-z2 array then you can
# probably set the warning closer to 95%.
#
# ZFS uses a copy-on-write scheme. The file system writes new data to
# sequential free blocks first and when the uberblock has been updated the new
# inode pointers become valid. This method is true only when the pool has
# enough free sequential blocks. If the pool is at capacity and space limited,
# ZFS will have to write blocks randomly. This means ZFS cannot create an
# optimal set of sequential writes and write performance is severely impacted.

capacity=$(/sbin/zpool list -H -o capacity | cut -d'%' -f1)

for line in ${capacity}
  do
    if [ "${line}" -ge "${maxCapacity}" ]; then
      printf "\n==== ERROR ====\n"
      printf "One of the pools has reached it's max capacity!"
      exit 1
    fi
  done


# Errors - Check the columns for READ, WRITE and CKSUM (checksum) drive errors
# on all volumes and all drives using "zpool status". If any non-zero errors
# are reported an email will be sent out. You should then look to replace the
# faulty drive and run "zpool scrub" on the affected volume after resilvering.

errors=$(/sbin/zpool status | grep ONLINE | grep -v state | awk '{print $3 $4 $5}' | grep -v 000)

if [ "${errors}" ]; then
  printf "\n==== ERROR ====\n"
  printf "One of the pools contains errors!"
  printf "$errors"
  exit 1
fi

# Finish - If we made it here then everything is fine
exit 0
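
Before wiring this into monit you can give it a quick run from the shell (again, the script name is a placeholder; the 80 is the capacity threshold in percent):

/root/zfs_health_check.sh 80
echo $?   # 0 = all checks passed, 1 = one of the checks tripped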


Then add a new service to your monit configuration in Opnsense. The "80" is a parameter for one of the alerts, specifically triggering when the pool is 80% full. Of course the script will also trigger on serious issues, such as a degraded pool if one of the disks in your mirror is offline.
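
In plain monit configuration terms, that service entry amounts to roughly the following (the check name and script path are just examples, adjust them to your setup):

# Run the ZFS health script every polling cycle and alert on a non-zero exit code.
check program zfs_health with path "/root/zfs_health_check.sh 80"
    if status != 0 then alert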



That's it, assuming you have configured monit correctly to send emails. For example, I am using:
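The exact values are site-specific, but the mailserver/alert part of a monit configuration looks roughly like this (placeholder host, credentials and addresses):

# Placeholder values -- substitute your own SMTP host, credentials and recipient.
set mailserver smtp.example.com port 587
    username "monit@example.com" password "secret"
set alert admin@example.com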

#2
Reinstall --> disable --> remove.

No more log errors. So I am not sure that is working as intended?
#3
I recently installed and later removed Postfix. However, I am now seeing errors in the logs; it seems Postfix did not get removed or cleaned up properly?

2022-02-25T18:08:08 Warning postfix/master warning: /usr/local/libexec/postfix/pickup: bad command startup -- throttling
2022-02-25T18:08:08 Warning postfix/master warning: process /usr/local/libexec/postfix/pickup pid 50581 exit status 1
2022-02-25T18:08:08 Warning postfix/master warning: /usr/local/libexec/postfix/qmgr: bad command startup -- throttling
2022-02-25T18:08:08 Warning postfix/master warning: process /usr/local/libexec/postfix/qmgr pid 50816 exit status 1
2022-02-25T18:08:07 Critical master fatal: master_spawn: exec /usr/local/libexec/postfix/qmgr: No such file or directory
2022-02-25T18:08:07 Critical master fatal: master_spawn: exec /usr/local/libexec/postfix/pickup: No such file or directory


Will try to reinstall it again and disable it.
#4
Quote from: The_Dave
It turns out the solution to the problem was not to use a server address in the form of de4-wg.socks5.mullvad.net as listed on the mullvad website under servers, but to use a server address like de4-wireguard.mullvad.net.
Mullvad should really fix this, it's very easy to miss for beginners! Good you figured it out.

And anyway, this guide is amazing work!
#5
I followed this guide today to successfully set up a site-to-site VPN using WG:
https://psychogun.github.io/docs/opnsense/WireGuard-Site-2-Site/

I did not have to use any console commands.
#6
General Discussion / Re: NextDNS
December 05, 2021, 08:46:56 AM
Quote from: rman50 on December 04, 2021, 02:27:07 PM
I configured forwarding to NextDNS using OPNSense's Unbound's DOT configuration (Services -> Unbound DNS -> DNS over TLS). With that configuration the only client device that will show up in the NextDNS GUI is OPNsense itself which is the way I wanted it. I use separate tools (Zeek, Influx & Grafana) to track/report on all my internal DNS queries. If you want individual device names to show up in the NextDNS GUI when utilizing a centralized forwarder, I believe you would need to use the NextDNS CLI client on OPNsense.
Ok thanks, then I misunderstood what you meant!
#7
General Discussion / Re: NextDNS
December 03, 2021, 08:46:50 AM
Quote from: rman50 on November 04, 2021, 01:49:11 PM
With the latest releases of the Unbound plugin, the DNS over TLS configurations works fine with NextDNS and client identification by using the hostname field. I switched from the custom configuration to the plugin once the DOT hostname option was added.
Hi, I am trying to make this work and have my client devices show up in the NextDNS web UI, but I am not sure what you mean by this. Are you referring to the Unbound in Opnsense, a NextDNS Unbound plugin, or a NextDNS Opnsense plugin (which I cannot find)?
#8
Quote from: franco on June 12, 2020, 08:31:58 AM
Is this the WFQ scheduler? Some people reported glitches over the years, but in FreeBSD the topic to improve on those crashes is nonexistent to my knowledge.


Cheers,
Franco
Yes, indeed it is!

I just replaced the SSD and restored the configuration; it was stable for days. Then I re-enabled the Shaper and had a crash within 5 minutes...

I am not sure I had this problem when I used the "regular" shaping, will test this.
#9
After many days of testing and enabling/disabling things, the issue appears to be in the Shaper. I have not had any hard crash with the shaping rules disabled (I am using weighted queues).

Anyone else seen this issue?
#10
Same issue on 20.7 so far.

Now testing with all plugins uninstalled.
#11
20.1 Legacy Series / Router keeps crashing on me
June 07, 2020, 05:15:18 PM
I am running Opnsense on a brand new SuperServer E200-9A. Sadly, it keeps crashing with kernel page faults.

https://paste.gg/p/anonymous/d356815f41644a658fd5048a191d5005/files/2f6b5318f60a4ef6905fbb422e5c5768/raw

Any thoughts? Anything I could try or do? Is this likely a hardware issue with the RAM or SSD for example?
#12
That's because that was a single DNS request which only has a 3-5s delay. I guess there's a fallback. The timeouts for the 70+ DNS requests in the firmware update stack up, however.
#13
Fixed!

The issue was indeed the slow DNS. The firmware update performs many DNS lookups, leading the update to run over its timeout.

I had to enable "Do not use the local DNS service as a nameserver for this system", as my Unbound runs on port 5300. Simply a setting I had missed somehow.
#14
I also tested instructions from another thread:

root@action:~ # opnsense-update -M
http://mirror.terrahost.no/opnsense/FreeBSD:11:amd64/20.1
root@action:~ # pkg update -f
Updating OPNsense repository catalogue...
Fetching meta.txz: 100%    1 KiB   1.5kB/s    00:01
Fetching packagesite.txz: 100%  183 KiB 187.0kB/s    00:01
Processing entries: 100%
OPNsense repository update completed. 708 packages processed.
All repositories are up to date.


And:

root@action:~ #  /usr/bin/time configctl firmware check
{
        "connection":"timeout",
        "downgrade_packages":[],
        "download_size":"",
        "last_check":"Fri May 29 11:49:01 CEST 2020",
        "new_packages":[],
        "os_version":"FreeBSD 11.2-RELEASE-p16-HBSD",
        "product_name":"opnsense",
        "product_version":"20.1",
        "reinstall_packages":[],
        "remove_packages":[],
        "repository":"error",
        "updates":"",
        "upgrade_major_message":"",
        "upgrade_major_version":"",
        "upgrade_needs_reboot":"0",
        "upgrade_packages":[]
}
       31.11 real         0.20 user         0.02 sys
#15
I had indeed disabled the IPv6 DHCP client, however.

- My provider does not support IPv6 with their modem in bridge mode (Ziggo, NL).
- I have the updates listed up to 20.6, so it worked at some point.
- I have enabled "Prefer IPv4 over IPv6" on the WAN interface.

So, this does not seem like an IPv6 issue? Plus, should updates not work over IPv4?