
Why? and for Who?

Casey Witt edited this page Mar 21, 2018 · 1 revision

This article focuses on what paczfs does rather than how it does it. If you find it interesting but it is not exactly what you want, read the README.md document, which has a more detailed technical explanation.

paczfs is a replacement for Docker and CoreOS, based on ZFS, systemd, and ArchLinux. It incorporates the best parts of each but takes a different philosophical approach, which in our opinion gives a simpler, more secure, and more robust result.

paczfs is a script that creates a fully customizable ArchLinux server image file that can be run on a VPS, on bare metal, or in a virtual machine. The image file uses a ZFS root filesystem and the zpool "bootfs" property to select which ZFS dataset is mounted as the root filesystem at boot, and is focused on running containers with systemd-nspawn instead of Docker.

ZFS is a filesystem that supports snapshotting, cloning, encryption, and transferring snapshot differentials at the "dataset" level. The following ZFS datasets are created by the paczfs script and mounted when the image is booted, according to the MOUNTPOINT column.

NAME                          USED  AVAIL  REFER  MOUNTPOINT
zroot                         683M  14.2G    96K  /zroot
zroot/data                    736K  14.2G    96K  none
zroot/data/home                96K  14.2G    96K  /home
zroot/data/machines           168K  14.2G    96K  /var/lib/machines
zroot/data/machines/miniroot    8K  14.2G   315M  /var/lib/machines/miniroot
zroot/data/postgres            96K  14.2G    96K  /var/lib/postgres
zroot/data/root               184K  14.2G   120K  /root
zroot/data/srv                 96K  14.2G    96K  /srv
zroot/system                  681M  14.2G    96K  legacy
zroot/system/default          681M  14.2G   675M  legacy
zroot/system/development        8K  14.2G   675M  legacy
zroot/system/production         8K  14.2G   622M  legacy

Any of the datasets under "zroot/system" can be mounted as the root filesystem on the next boot by setting the zroot pool "bootfs" property ("legacy" just tells ZFS not to mount them directly). For instance, to boot with "zroot/system/production" as the root filesystem on the next boot, use:

zpool set bootfs=zroot/system/production zroot

All other zfs datasets which have the "mountpoint" property set to a valid path, and have the "canmount=on" property will be mounted regardless of the root filesystem by the zfs-mount.service (if enabled in the root filesystem that booted). To control mounting of a zfs dataset using the /etc/fstab file (of whatever root filesystem is currently booted) set the zfs dataset property "mountpoint=legacy".
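As a minimal sketch of handing a dataset over to /etc/fstab: the helper name `use_fstab_mount` is hypothetical, and "zroot/data/srv" is used only as an example dataset from the listing above. The function switches the dataset to legacy mounting and prints a matching fstab line; it does not edit /etc/fstab itself.

```shell
# Hypothetical helper: switch a dataset to legacy mounting and print the
# matching /etc/fstab line. Nothing is written to /etc/fstab automatically.
use_fstab_mount() {
    local dataset="$1" mountpoint="$2"
    # Tell ZFS to stop mounting this dataset on its own.
    zfs set mountpoint=legacy "$dataset" || return 1
    # fstab fields: device, mount point, type, options, dump, pass
    printf '%s %s zfs defaults 0 0\n' "$dataset" "$mountpoint"
}

# Example (as root, on a booted image):
#   use_fstab_mount zroot/data/srv /srv >> /etc/fstab
```

Appending the printed line is left to the caller so the entry can be reviewed before the next boot.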

The "zroot/data/machines/miniroot" dataset is a read-only clone of the "zroot/system/default@miniroot" snapshot, which was taken during paczfs execution after the ArchLinux "base" package was installed (but without the kernel).

The "zroot/system/default" dataset is where ArchLinux was actually installed to when paczfs was run.

The "zroot/system/production" dataset is a clone of the "zroot/system/default@production" snapshot, which was taken during paczfs execution after the kernel and bootloader were installed but before the packages in the DEVELOPMENT_PKGS array were installed (i.e. a minimally bootable system to use on production servers).

The "zroot/system/development" dataset is a clone of the "zroot/system/default@development" snapshot which was taken at the end of paczfs execution.

In summary, the ArchLinux image produced by paczfs is ready to both:

  • Develop containers by booting into the "zroot/system/development" root filesystem and cloning the "zroot/system/default@miniroot" snapshot to use as the basis for a new container
  • Host and supervise containers by booting into the "zroot/system/production" root filesystem
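The cloning step in the first bullet might look like the sketch below. The helper name `new_container_root` and the container name passed to it are hypothetical; the dataset and snapshot names come from the listing above. The mountpoint is set explicitly here as an assumption, rather than relying on inheritance from "zroot/data/machines".

```shell
# Hypothetical helper: create a writable container root from the miniroot
# snapshot. The container name is whatever the caller passes in.
new_container_root() {
    local name="$1"
    [ -n "$name" ] || { echo "usage: new_container_root NAME" >&2; return 2; }
    # Clone the snapshot into the dataset tree mounted at /var/lib/machines,
    # setting the mountpoint explicitly.
    zfs clone -o mountpoint="/var/lib/machines/$name" \
        zroot/system/default@miniroot "zroot/data/machines/$name"
}

# Example: new_container_root web
```

Because a clone shares blocks with its origin snapshot, the new container root initially takes almost no extra space.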

The way the "miniroot", "production", and "development" snapshots were taken as paczfs builds the ArchLinux image is conceptually similar (but technically different) to the way Docker caches image "overlays" while building a Dockerfile.

ZFS provides a good filesystem for working with containerization because it doesn't "hide" anything, is extremely reliable and robust, and has very good snapshot and clone capabilities as well as built-in support for sending and receiving snapshot differentials.
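To illustrate sending a snapshot differential, here is a minimal sketch. The helper name `send_differential`, the snapshot names, and the target pool "backup" are all assumptions for the example; only `zfs send -i` and `zfs receive` are standard ZFS commands.

```shell
# Hypothetical helper: send only the changes between two snapshots of a
# dataset into another dataset (local here; the pipe could instead go
# through ssh to reach a remote host).
send_differential() {
    local dataset="$1" from="$2" to="$3" target="$4"
    zfs send -i "$dataset@$from" "$dataset@$to" | zfs receive "$target"
}

# Example, assuming a second pool named "backup" that already holds @monday:
#   zfs snapshot zroot/data/srv@tuesday
#   send_differential zroot/data/srv monday tuesday backup/srv
```

In a real bash script it is worth enabling `set -o pipefail` so that a failed send is not masked by the exit status of the receive.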

systemd-nspawn comes as part of systemd and can be used to run containers. One of the most noticeable differences between Docker and systemd-nspawn is that systemd-nspawn doesn't attempt any form of image management (except to look in the /var/lib/machines directory for images by default), and it can run containers from raw image files as well as from directories.
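A small sketch of the two ways to start a container, from a directory or from a raw image file. The helper name `run_container` is hypothetical; `--boot`, `--directory=`, and `--image=` are standard systemd-nspawn options.

```shell
# Hypothetical helper: boot a container from either a directory tree or a
# raw image file, choosing the systemd-nspawn option by what the path is.
run_container() {
    local path="$1"
    if [ -d "$path" ]; then
        # A plain directory containing an OS tree.
        systemd-nspawn --boot --directory="$path"
    else
        # Anything else is treated as a raw disk image file.
        systemd-nspawn --boot --image="$path"
    fi
}

# Example: run_container /var/lib/machines/web
```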

We only use Docker for testing things out as quickly as possible. We don't use it on production servers because you don't always really know what you are getting in a Docker image.

After we play around with something in Docker, we typically create our own ArchLinux-based container using a simple bash script. This is comparable to creating a Dockerfile and, if you aren't familiar with ArchLinux, probably has about the same learning curve. If you are familiar with ArchLinux, then you already know the benefits (which is why so many Dockerfiles use ArchLinux as a base).

Basically, for the purposes of creating an image, Docker can be thought of as a fancy wrapper around bash which manages a Docker "image cache" (i.e. docker images). When Docker builds an image, it automatically puts the result in this "image cache" so that it is available to run with docker. By default, Docker uses "overlayfs" to manage the image cache.

systemd-nspawn, on the other hand, doesn't run Docker images (at least as far as we know), but can run a container from either a raw image file or a plain directory.

So, instead of using Docker, the paczfs image is optimized for building and running images by cloning the "miniroot" dataset, and then using a simple bash script with the ArchLinux package manager pacman to "install" the system (instead of a Dockerfile), and then run it with systemd-nspawn as a systemd service.
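The workflow described above could be sketched roughly as follows. This is not the script paczfs ships: the function name `build_container`, the container name "web", and the package "nginx" are illustrative, and the sketch assumes the miniroot has a working pacman keyring and mirror list and that the host provides network access. The stock `systemd-nspawn@.service` template unit is used to run the result; its defaults (for example private networking) may need overriding for a given setup.

```shell
# Hypothetical build script: clone, install with pacman, run as a service.
build_container() {
    local name="$1"; shift
    local root="/var/lib/machines/$name"
    # 1. Start from a writable clone of the miniroot snapshot.
    zfs clone -o mountpoint="$root" \
        zroot/system/default@miniroot "zroot/data/machines/$name" || return 1
    # 2. "Install" the system: run pacman inside the new root with the
    #    remaining arguments as the package list.
    systemd-nspawn --directory="$root" pacman -Syu --noconfirm "$@" || return 1
    # 3. Run the container as a systemd service via the template unit,
    #    which looks for the machine under /var/lib/machines.
    systemctl enable --now "systemd-nspawn@$name.service"
}

# Example: build_container web nginx
```

Keeping the steps in a plain function plays the same role as a Dockerfile: the script is the reproducible recipe, and the ZFS clone is the cached base layer.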
