HOWTO for installing a jail under OPNsense

Started by ajm, February 15, 2022, 02:42:47 PM

February 15, 2022, 02:42:47 PM Last Edit: February 16, 2022, 08:31:10 PM by ajm
WRT: https://forum.opnsense.org/index.php?topic=26724

15-02-22: First version
16-02-22: Added a note on use of 'freebsd-update' inside the jail.


This guide describes one approach to creating jails running FreeBSD 13.0-RELEASE under OPNsense 22.1. The example shown should be adapted as necessary to suit the planned environment.

The example OPNsense host boots from CF, and has a SATA SSD added for the exclusive use of the jails and associated data. It is using ZFS/Boot environments, and the SSD is configured with a ZFS pool.

This way, minimal changes are needed to the OPNsense environment. ZFS is required to permit easy deployment of the jail filesystems using snapshots and clones.

In this scenario, a bridge is required in OPNsense, to which the VNET 'epair' interfaces are attached. Other methods of networking jails are available, but are outside the scope of this guide.

Note that for installation, the host machine was booted into stock FreeBSD 13.0-RELEASE.
The same procedures should work under OPNsense, but please see the note below regarding 'freebsd-update'.

The following configuration items are referenced:

'tank':
The ZFS pool, mounted at '/tank' in which the jails are created.

'bridge16':
The bridge created in OPNsense, which provides L2/L3 connectivity for the jails.

'epair101':
'epair102':
The VNET epair interfaces, one per jail, which connect the jails to the bridge.

Jails:
'fserv': a permanent NFS filestore
'mserv': a smart mailhost

These procedures were based on material at:

https://genneko.github.io/playing-with-bsd/system/learning-notes-on-jails/
https://genneko.github.io/playing-with-bsd/networking/freebsd-vlan/
https://clinta.github.io/freebsd-jails-the-hard-way/

DISCLAIMER: This guide may not show the technically optimal solution, and quite possibly could be improved by someone more technically adept than the author. However 'it works for me'.

E&OE, YMMV etc.

Host config file changes

Host: /usr/local/etc/rc.syshook.d/start/11-mount-tank

Purpose: mount ZFS pool 'tank', as OPNsense fails to auto-mount the second zpool.
Note: must be in running order BEFORE jails are started !

#!/bin/sh
zpool import -f tank
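
A slightly more defensive variant of the script above, as a sketch only: it skips the import when the pool is already present, so re-running the hook does not produce an error. The function name 'import_pool_if_needed' is made up for illustration; 'zpool list' and 'zpool import -f' are the stock commands.

```shell
#!/bin/sh
# Sketch: import a pool only when it is not already imported.
# 'import_pool_if_needed' is a hypothetical helper name.
import_pool_if_needed() {
    pool="$1"
    # 'zpool list <pool>' exits non-zero when the pool is not imported.
    if zpool list -H -o name "$pool" >/dev/null 2>&1; then
        echo "pool $pool already imported"
        return 0
    fi
    zpool import -f "$pool"
}
```

In the syshook script, the last line would then be: import_pool_if_needed tank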

Host: /usr/local/etc/rc.syshook.d/start/12-epair-create

Purpose: create VNET 'epair' interfaces
Note: must be in running order AFTER network interfaces are configured !

#!/bin/sh

# /usr/local/etc/rc.syshook.d/start/12-epair-create
# create required epair interfaces and add as bridge members
#
ifconfig epair101 create
ifconfig epair101b up
ifconfig epair102 create
ifconfig epair102b up
ifconfig bridge16 addm epair101b addm epair102b
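
If the hook might run more than once per boot, the same steps could be made repeatable. The following is only a sketch under that assumption; 'ensure_epair' is a hypothetical name, and silencing the 'addm' error also hides genuine failures, so it may not suit every setup.

```shell
#!/bin/sh
# Sketch: create an epair and join its 'b' end to a bridge, skipping
# creation when the pair already exists.
# 'ensure_epair' is a hypothetical helper name.
ensure_epair() {
    pair="$1"      # e.g. epair101
    bridge="$2"    # e.g. bridge16
    # 'ifconfig <if>' exits non-zero when the interface does not exist.
    # The 'b' end is checked because the 'a' end moves into the jail.
    if ! ifconfig "${pair}b" >/dev/null 2>&1; then
        ifconfig "$pair" create
    fi
    ifconfig "${pair}b" up
    # addm fails when the interface is already a member; ignore that case.
    ifconfig "$bridge" addm "${pair}b" 2>/dev/null || true
}
```

Usage in the script would be one call per jail, e.g. 'ensure_epair epair101 bridge16' and 'ensure_epair epair102 bridge16'.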

Host: /etc/rc.conf

Add:
jail_enable="YES"
jail_list="fserv mserv"


Host: /etc/jail.conf


exec.start = "/bin/sh /etc/rc";
exec.stop = "/bin/sh /etc/rc.shutdown";
exec.clean;
mount.devfs;

host.hostname = $name;
path = "/tank/jails/$name";
exec.consolelog = "/var/log/jail_${name}_console.log";

vnet;
vnet.interface = $vif;

exec.start += "ifconfig $vif $addr";
exec.start += "route add default $gw";

# workaround
# https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=238326
exec.prestop += "ifconfig $vif -vnet $name";

fserv { $vif = "epair101a"; $addr = "10.0.16.16/24"; $gw = "10.0.16.4"; }
mserv { $vif = "epair102a"; $addr = "10.0.16.17/24"; $gw = "10.0.16.4"; }
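
Each additional jail only needs one more stanza of the same shape. As a rough sketch, a small helper could print it; 'jail_stanza' is a hypothetical name and not part of jail(8), and the jail name and addresses in the usage line are invented for illustration.

```shell
#!/bin/sh
# Sketch: print a per-jail stanza in the same shape as the two above.
# 'jail_stanza' is a hypothetical helper, not part of jail(8).
jail_stanza() {
    name="$1"; epair_no="$2"; addr="$3"; gw="$4"
    # Single quotes keep $vif/$addr/$gw literal: they are jail.conf
    # variables, not shell variables.
    printf '%s { $vif = "epair%sa"; $addr = "%s"; $gw = "%s"; }\n' \
        "$name" "$epair_no" "$addr" "$gw"
}
```

For example, 'jail_stanza wserv 103 10.0.16.18/24 10.0.16.4 >> /etc/jail.conf' would add a third jail; the matching epair103 would also need creating in the syshook script, and the name added to jail_list.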


Host commands to set up jails:

Set up ZFS datasets

Note: The SSD had previously been partitioned and labelled using gpart and glabel.


root@a-fw:~ # zpool create -m /tank tank /dev/gpt/tank
root@a-fw:~ # zfs set compression=on tank
root@a-fw:~ # zfs set atime=off tank
root@a-fw:~ # zfs create tank/jails
root@a-fw:~ # zfs create tank/jails/base
root@a-fw:~ # zfs create tank/jails/base/13.0


Install base.txz into 'basejail' dataset


root@a-fw:~ # mkdir /tank/tmp
root@a-fw:~ # cd /tank/tmp
root@a-fw:/tank/tmp # fetch ftp://ftp.uk.freebsd.org/pub/FreeBSD/releases/amd64/13.0-RELEASE/base.txz
root@a-fw:/tank/tmp # tar -xJpf base.txz -C /tank/jails/base/13.0
root@a-fw:/tank/tmp # cp /etc/localtime /tank/jails/base/13.0/etc
root@a-fw:/tank/tmp # vi /tank/jails/base/13.0/etc/rc.conf

sendmail_enable="NO"
sendmail_submit_enable="NO"
sendmail_outbound_enable="NO"
sendmail_msp_queue_enable="NO"
syslogd_flags="-ss"
cron_flags="-J 60"
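
Before extracting, it may be worth checking the downloaded base.txz against the MANIFEST file published in the same release directory. This is a sketch that assumes the usual tab-separated MANIFEST layout (file name, SHA-256, file count, ...); 'manifest_sha256' is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: look up the expected SHA-256 of a distribution set in a release
# MANIFEST read from stdin. Assumes tab-separated columns with the file
# name first and the checksum second.
# 'manifest_sha256' is a hypothetical helper name.
manifest_sha256() {
    awk -F '\t' -v f="$1" '$1 == f { print $2 }'
}
```

With MANIFEST fetched alongside base.txz, the comparison on FreeBSD would be along the lines of: [ "$(sha256 -q base.txz)" = "$(manifest_sha256 base.txz < MANIFEST)" ] && echo OK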


Copy resolver configuration from the host to the template:

root@a-fw:~ # cp /etc/resolv.conf /tank/jails/base/13.0/etc/

Edit system crontab in the template to disable adjkerntz

root@a-fw:~ # vi /tank/jails/base/13.0/etc/crontab

      > #1,31 0-5 * * * root adjkerntz -a


Create /etc/periodic.conf to disable some of the predefined scheduled jobs which are not required for jails

root@a-fw:~ # vi /tank/jails/base/13.0/etc/periodic.conf

# No output for successful script runs.
daily_show_success="NO"
weekly_show_success="NO"
monthly_show_success="NO"
security_show_success="NO"

# Output to log files which are rotated by default.
daily_output="/var/log/daily.log"
daily_status_security_output="/var/log/daily.log"
weekly_output="/var/log/weekly.log"
weekly_status_security_output="/var/log/weekly.log"
monthly_output="/var/log/monthly.log"
monthly_status_security_output="/var/log/monthly.log"

# No need for those without sendmail
daily_clean_hoststat_enable="NO"
daily_status_mail_rejects_enable="NO"
daily_status_mailq_enable="NO"
daily_queuerun_enable="NO"

# Host does those
daily_status_disks_enable="NO"
daily_status_zfs_zpool_list_enable="NO"
daily_status_network_enable="NO"
daily_status_uptime_enable="NO"
daily_ntpd_leapfile_enable="NO"
weekly_locate_enable="NO"
weekly_whatis_enable="NO"
security_status_chksetuid_enable="NO"
security_status_neggrpperm_enable="NO"
security_status_chkuid0_enable="NO"
security_status_ipfwdenied_enable="NO"
security_status_ipfdenied_enable="NO"
security_status_ipfwlimit_enable="NO"
security_status_ipf6denied_enable="NO"
security_status_tcpwrap_enable="NO"


Apply latest FreeBSD patches to base jail


Note: This was performed under stock FreeBSD 13.0-RELEASE, the same version as the base.txz.
As 'freebsd-update' is not included in OPNsense, this step could be skipped, and the updates carried out within the jails once they are running. Alternatively, it may be an option to copy 'freebsd-update' into OPNsense, and use it with '--currently-running 13.0-RELEASE -b <path/to/jail>'.

root@a-fw:~ # freebsd-update -b /tank/jails/base/13.0 fetch install

Customize the jails' root shell prompt


root@a-fw:~ # vi /tank/jails/base/13.0/root/.cshrc

   >   # ANSI Color 32 = Green
      set prompt="%{\033[32m%}%B<%n@%m>%b%{\033[0m%}:%~%# "


Create snapshot and clone to a new jail hostname 'fserv'

root@a-fw:~ # zfs snapshot tank/jails/base/13.0@p7
root@a-fw:~ # zfs clone tank/jails/base/13.0@p7 tank/jails/fserv
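
The second jail in jail_list ('mserv') needs its own clone in the same way. As a sketch, the commands could be generated for any number of jails; 'clone_cmds' is a hypothetical helper that only prints the commands and does not run zfs itself.

```shell
#!/bin/sh
# Sketch: print one 'zfs clone' command per jail name.
# 'clone_cmds' is a hypothetical helper; it only prints, it does not run zfs.
clone_cmds() {
    snap="$1"   # e.g. tank/jails/base/13.0@p7
    shift
    for name in "$@"; do
        echo "zfs clone $snap tank/jails/$name"
    done
}
```

Review the output first, then pipe it to sh, e.g. 'clone_cmds tank/jails/base/13.0@p7 mserv | sh'.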

Test

root@a-fw:~ # service jail start fserv
root@a-fw:~ # jls
   JID  IP Address      Hostname                      Path
     1                  fserv                         /tank/jails/fserv

root@a-fw:~ # jexec 1 /bin/csh

<root@fserv>:/#

DONE !

The jails can now have application packages installed in the normal way. For compiling from source, the relevant 'sets' (.txz's) would need to be installed alongside 'base.txz'.

Post-install updates of the jail

freebsd-update can be used from inside the jail, using the '--currently-running' option so it ignores the OPNsense kernel version:


<root@fserv>:/# freebsd-update --currently-running 13.0-RELEASE fetch
Looking up update.FreeBSD.org mirrors... none found.
Fetching metadata signature for 13.0-RELEASE from update.FreeBSD.org... done.
Fetching metadata index... done.
Inspecting system... done.
Preparing to download files... done.

No updates needed to update system to 13.0-RELEASE-p7.
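
Rather than hard-coding the release, the argument could be derived from the jail's own userland version. A minimal sketch, assuming 'freebsd-version -u' reports a string such as 13.0-RELEASE-p7; 'release_of' is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: strip a trailing patch level from a FreeBSD version string,
# e.g. 13.0-RELEASE-p7 becomes 13.0-RELEASE.
# 'release_of' is a hypothetical helper name.
release_of() {
    # ${1%pattern} removes the shortest matching suffix, here "-p<digits>".
    echo "${1%-p[0-9]*}"
}
```

Inside the jail this might be used as: freebsd-update --currently-running "$(release_of "$(freebsd-version -u)")" fetch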

April 18, 2022, 07:35:08 PM #1 Last Edit: April 19, 2022, 04:06:19 PM by jwatzman
Thanks for writing this up! It was super useful as a baseline to get to the setup that I wanted. Info on my changes in a second, but first a very important and very subtle bug in this setup.

For whatever reason, OPNsense doesn't load any devfs rules on boot, which means that the mount.devfs directive gives full access to all devices in devfs, allowing root in the jail to, for example, have raw disk access. This kind of defeats the purpose of having a jail!

The fix is to make sure that, at some point, you run 'service devfs start', which will load the default rules. Once those are loaded, the default jail configuration will pick them up and use a reasonable set of defaults (ruleset 4, which hides a lot of things but exposes a set of "safe" devices that is enough for most systems).

In your setup, you can probably stick it in one of your syshook scripts (or make a new one). In my setup, since I don't need any syshook scripts, I just stuck it in exec.prepare. That's a little silly, since you only need to run it once per boot, but it's harmless to re-run and it lets me keep all of my conf in one place.

(It's maybe a bug that OPNsense doesn't do this by default, and maybe a bug that the FreeBSD base doesn't complain about jails trying to use a non-existent devfs ruleset? I dunno.)
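
A quick way to check whether the rules are loaded is to look for ruleset 4 in the list of known rulesets. This is a sketch which assumes that 'devfs rule showsets' prints one ruleset number per line; 'has_ruleset' is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: succeed when the given devfs ruleset number appears, on a line
# of its own, in the list read from stdin.
# 'has_ruleset' is a hypothetical helper name.
has_ruleset() {
    # -x matches whole lines only, so "14" does not count as "4".
    grep -qx "$1"
}
```

On the host this could be used as: devfs rule showsets | has_ruleset 4 || service devfs start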




Okay, so what I did differently. Gonna try to explain what I was trying to do and why I needed to set things up this way; my jail.conf is below.

For one, I'm using a read-only base system modified with symlinks into a read-write mount, nullfs mounted on each other to make the jail root -- this is exactly the setup in the "thin jails" section of this post which you helpfully linked. I named things slightly differently, and used exec.X directives to mount/unmount instead of fstabs (since those can substitute $name more easily) but the idea and mechanics are exactly the same as in that post.

But more interesting for this crowd is that I wanted my jails to live in their own LAN/subnet -- basically, have an interface dedicated to them in OPNsense like my other LANs and VLANs. Then I can set special firewall rules for my set of jails as a unit, e.g., firewall them off from both my main networks but also my guest networks. The way it works is pretty simple: I created a new bridge interface in OPNsense but with no interfaces added into the bridge, and then my exec directives create and configure epair interfaces and add them to the bridge and the jail's VNET as appropriate.

Only a couple of tricks to that. The first is that, for whatever reason, the epair interface on the host, which I add to the bridge, also needs an IP assigned. My understanding was that interfaces joined to a bridge don't/shouldn't have their own IPs, but things didn't seem to work without doing that.

The biggest trick is how to get OPNsense to create an empty bridge. Not only will the UI not let you do it, the low-level config code skips any empty bridges, and even contains a great comment calling out that this might not be ideal. However, it also skips adding any invalid member interfaces, but only after creating the bridge. So what I did was trick OPNsense: I created the bridge in the UI, selecting any interface as a member so it would let me save. (Make sure to enable the box for a link-local address too, otherwise IPv6 RA won't make it through to autoconfigure the default route.) Then I manually edited /conf/config.xml: I found my bridge in the config (inside a <bridged> block) and changed the members to <members>invalid</members>. You can use anything there as long as it doesn't match the name of another interface you have. Reload or reboot, and there you have your empty bridge. You can assign it in the UI, set up DHCP etc. as with any other interface.
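
To confirm that the bridge really came up empty, the member list can be pulled out of the ifconfig output. A sketch only, assuming that members are listed on lines beginning with 'member:' as on stock FreeBSD; 'bridge_members' is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: print the member interface names found in 'ifconfig <bridge>'
# output read from stdin, one per line.
# 'bridge_members' is a hypothetical helper name.
bridge_members() {
    # awk's default field splitting ignores the leading tab on each line.
    awk '$1 == "member:" { print $2 }'
}
```

Running 'ifconfig bridge0 | bridge_members' should print nothing right after boot, and the host-side epair interfaces once jails have been started.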




So let me sum up and actually show my config. Important points if you want to do what I did:


  • Whether you follow ajm's guide or my adaptations, you should almost certainly make sure service devfs start gets run at some point!
  • Follow the linked guide to set up zfs datasets and null mounts.
  • Create an empty bridge by manually editing config.xml to set the bridge members to a non-empty set of invalid interfaces. Make sure to enable the box for a link-local address on the bridge before you do this.

OK, now my config. In OPNsense, my jail/bridge0 LAN is set up on 10.0.9.0/24, with OPNsense itself living at 10.0.9.1; for IPv6 my ISP routes 2001:db8:1234:5670::/60 to me, and the jail LAN is set to "track interface" with a prefix ID of 9 (i.e., 2001:db8:1234:5679::/64 is my jail LAN).
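
The /64 follows from the delegated /60 by adding the prefix ID to the fourth hextet (0x5670 + 9 = 0x5679). A small sketch of that arithmetic; 'track_prefix' is a hypothetical helper name, and it only handles the case where the prefix ID fits in the last hex digit of that hextet, as with a /60.

```shell
#!/bin/sh
# Sketch: compute the fourth hextet of a tracked /64 from the fourth
# hextet of a delegated /60 and a prefix ID (0 to 15).
# 'track_prefix' is a hypothetical helper name.
track_prefix() {
    # $1: hextet of the delegated prefix in hex, without the 0x prefix
    # $2: prefix ID in decimal
    printf '%x\n' $(( 0x$1 + $2 ))
}
```

For example, 'track_prefix 5670 9' gives the 5679 used throughout the config below.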

/etc/jail.conf

exec.start = "/bin/sh /etc/rc";
exec.stop = "/bin/sh /etc/rc.shutdown";
exec.clean;

mount.devfs;
exec.prepare += "service devfs start";

host.hostname = "${name}.localdomain";
path = "/jails/roots/$name";

exec.prepare += "mount -t nullfs -o ro /jails/base /jails/roots/$name";
exec.prepare += "mount -t nullfs /jails/$name /jails/roots/$name/rw";
exec.release += "umount /jails/roots/$name/rw";
exec.release += "sleep 5 && umount /jails/roots/$name";

vnet;
vnet.interface = "${if}b";
exec.prepare += "ifconfig ${if} create";
exec.prepare += "ifconfig bridge0 addm ${if}a";
exec.prepare += "ifconfig ${if}a inet ${haddr}/24";
exec.prepare += "ifconfig ${if}a inet6 2001:db8:1234:5679::${haddr}";
exec.start += "ifconfig ${if}b inet ${addr}/24";
exec.start += "ifconfig ${if}b inet6 accept_rtadv";
exec.start += "ifconfig ${if}b inet6 2001:db8:1234:5679::${addr}";
exec.start += "route add default 10.0.9.1";
exec.prestop += "ifconfig ${if}b -vnet $name";
exec.release += "ifconfig ${if}a destroy";

alcatraz {
        $if = "epair101";
        $haddr = "10.0.9.10";
        $addr = "10.0.9.11";
}
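
Note that the inet6 lines append the dotted IPv4 address to the prefix, so the jail's address is written as 2001:db8:1234:5679::10.0.9.11. My understanding is that this is the embedded-IPv4 notation, and that tools will usually display it in plain hex instead. The sketch below works out that hex form, which may be handy when writing firewall rules; 'v4_to_hextets' is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: convert a dotted-quad IPv4 address into the two hex groups it
# occupies at the end of an IPv6 address.
# 'v4_to_hextets' is a hypothetical helper name.
v4_to_hextets() {
    old_ifs=$IFS
    IFS=.
    # Unquoted on purpose: split the address on dots into $1..$4.
    set -- $1
    IFS=$old_ifs
    printf '%x:%x\n' $(( $1 * 256 + $2 )) $(( $3 * 256 + $4 ))
}
```

Under that assumption, 'v4_to_hextets 10.0.9.11' gives a00:90b, i.e. the jail would appear as 2001:db8:1234:5679::a00:90b.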


/conf/config.xml snippet showing the bridge (only manual edit was to the <members>)

  <bridges>
    <bridged>
      <linklocal>1</linklocal>
      <descr>JailBridge</descr>
      <maxaddr/>
      <timeout/>
      <bridgeif>bridge0</bridgeif>
      <maxage/>
      <fwdelay/>
      <hellotime/>
      <priority/>
      <proto>rstp</proto>
      <holdcnt/>
      <members>invalid</members>
      <ifpriority/>
      <ifpathcost/>
    </bridged>
  </bridges>


zfs list (basically same as FreeBSD guide with things renamed to make more sense to me)

NAME                 USED  AVAIL     REFER  MOUNTPOINT
zroot               2.25G  11.3G       96K  /zroot
zroot/ROOT           904M  11.3G       96K  none
zroot/ROOT/default   903M  11.3G      903M  /
zroot/jails         1.35G  11.3G      132K  /jails
zroot/jails/base     464M  11.3G      464M  /jails/base
zroot/jails/skel    4.41M  11.3G     4.33M  /jails/skel
zroot/jails/alcatraz 910M  11.3G      876M  /jails/alcatraz
zroot/tmp            728K  11.3G      728K  /tmp
zroot/usr            384K  11.3G       96K  /usr
zroot/usr/home        96K  11.3G       96K  /usr/home
zroot/usr/ports       96K  11.3G       96K  /usr/ports
zroot/usr/src         96K  11.3G       96K  /usr/src
zroot/var           5.95M  11.3G       96K  /var
zroot/var/audit       96K  11.3G       96K  /var/audit
zroot/var/crash       96K  11.3G       96K  /var/crash
zroot/var/log       5.45M  11.3G     5.45M  /var/log
zroot/var/mail       120K  11.3G      120K  /var/mail
zroot/var/tmp         96K  11.3G       96K  /var/tmp

Thank you for this post! I was able to follow it and get a jail up and running on OPNsense 23.1.5_4-amd64. I had previously created a jail, but hadn't documented how, nor did I ever figure out how to give jails their own network interface; your documentation on creating a bridge in OPNsense and then creating an epair for each jail is great.

One thing I found different on my system compared to the docs was /etc/rc.conf.  Rather than a single file, I have a directory containing files for different services.  In my case, I created /etc/rc.conf.d/jail for the jails configuration:

root@OPNsense:~ # cat /etc/rc.conf.d/jail
jail_enable="YES"
jail_list="unifi-controller-13"
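
For anyone scripting this, the file contents can be generated from the list of jail names. A minimal sketch; 'jail_rc_conf' is a hypothetical helper name, and the only assumption is the /etc/rc.conf.d/jail location shown above.

```shell
#!/bin/sh
# Sketch: print the rc settings that enable the given jails at boot.
# 'jail_rc_conf' is a hypothetical helper name.
jail_rc_conf() {
    echo 'jail_enable="YES"'
    # "$*" joins all jail names with single spaces.
    echo "jail_list=\"$*\""
}
```

For the two jails in the original guide this would be: jail_rc_conf fserv mserv > /etc/rc.conf.d/jail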


Thank you again for this writeup!   8)

This was a really interesting read; I'm learning OPNsense as I read through forum posts.

What does a jail do in this instance?


It's a container. Actually jails are THE original container technology which Sun's Solaris later adopted and improved as "zones".