Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP: Assume MOK is already enrolled, add tests #19

Draft
wants to merge 69 commits into
base: main
Choose a base branch
from

Conversation

crichez
Copy link
Owner

@crichez crichez commented Jan 5, 2025

Objectives

Don't enroll machine owner keys as part of the role, pass them in as an argument. Also add tests.

Detailed Changes

  1. uki_config is renamed to uki: this simplifies argument names.
  2. Machine owner keys must be enrolled before the role starts: we still verify enrollment to avoid bricking a host, but it is now the caller's responsibility to enroll. This was impossible to automate, so doesn't belong in an Ansible role.
  3. make test now runs integration tests: using libvirt, xorriso, GNU make, and a single configuration file. This closes Pick a testing platform #18.

Christopher Palmer-Richez and others added 30 commits September 3, 2024 16:51
This has been tested with `python -m pip -r requirements.txt` and has
been proven to build an enviornment sufficient to run the test suite.
Still don't have an interactive VM at this point, I fear it may be
crashing on boot. We will try using the default OVMF firmware images and
see if that solves our problems.
This commit configures a testing environment for the 'uki' role. To run
these tests, libvirt and qemu are required. The machine runs and
configures well, but this commit is still missing configuration
instructions, and tests still do not pass. This is a milestone commit
that accomplishes the environment setup, most future commits will focus
on the tests and the collection themselves.
This commit does not provide a complete testing setup. We are running
into a limitation on ansible-test, where we can't provide a custom
inventory file without using the libvirt inventory plugin, but we can't
use SELinux with that plugn. We need to switch to a Makefile or
something else to generate an inventory file, then we can delegate
properly.
This commit changes the test VM configuration to not require
become or sudo. This only works with SELinux disabled due to a bug in
the passt policy that prevents the user from writing to the passt
socket.

Tests still don't run, we don't inherit the tempdir variable between the
two plays we try to run in setup.yml. This will be fixed by writing it
externally, but we should do our work in a separate temporary directory
instead of the workspace directory to make tracking changes easier.
This is meant to be set by the user and should not have been committed.
This commit changes paths and variable inheritance logic to ensure we
can run make test through 'ansible-test' instead of a raw script.

Currently this has lead to some odd behavior where libvirt is spawning
the qemu, passt and other supporting processes, but then loses track of
the domain entirely. I went from using staff to unconfined, so we may
need to reset the system for this to work properly. This is still a
milestone commit since we have a spawning VM.
This change avoids issues we were having with libvirt-python starting
machine processes but losing track of them. We now have a working
machine that we can authenticate with, and a makefile that has (almost)
perfectly working dependencies. There are still oddities related to the
machine rule, since the name of the machine file is not predictable. But
it mostly works, and this is another WIP commit, since we don't actually
have the role passing yet.

The next step is to copy the MOK files to the test machine, and re-read
the role to ensure we didn't make any mistakes or depend on things we
didn't respect during machine setup.
This commit now uses the PID file of the VM to detect whether it
needs to be created.
We might have found a bug in libvirt? If I template the firmware image
file that is generated by virt-fw-vars, the signature database for the
MokList variable gets mangled somehow, and the certificate fingerprint
changes. If I use the immediate output of virt-fw-vars, everything works
fine.
We now install pip from the ensurepip module, since we were getting some
missing library errors using ansible.builtin.package.
Note there is a bug right now where the openssh server become available
before the python3-libdnf5 package is installed by cloud-init. This
results in the package module failing due to an unmet dependency. The
fix is to wait longer or for a different condition, but we're committing
this as is because it's our first ever successful test run.
This is a dependency for Ansible that isn't included in the image for
some reason.
This commit moves most of the test setup logic out of the makefile and
into a new setup playbook. We still use make, but mostly as a
convenience tool to avoid passing all those arguments to
ansible-playbook. There is now a teardown target that stops machines,
but doesn't remove images and configuration.

The testimage.yml file is now a platforms.yml file that lists platforms,
the url to download a cloud image, and the prefix directory before
shim (fedora, centos, redhat, etc). The looping mechanics in ansible are
used to generate an inventory file using all of these platforms, which
allows us to run tests in parallel.
Add a new teardown rule to Makefile that removes test machines without
removing built files sot hey can quickly be spun up again.

Fix detection of UKI rebuild need. Note we should also inspect whether
the certificate and/or private key changed, not just their file paths
are those will likely be the same. Added conditions to only reboot if
the UKI was rebuilt.
Christopher Palmer-Richez added 17 commits January 22, 2025 16:31
We now verify that key identity is our condition for idempotency, not
key nickname.
Previously unnecessarily inserted uki signature verification tasks
between two postinst script installation tasks.
This adds an additional pre-reboot check: that BootNext is set to the
UKI for the kernel version we installed. We now call kernel-install
directly in case the package for the kernel version we need is not
available, in which case apt will fail silently.
This avoids verification issues with pesigcheck.
pesigcheck is used in favor of osslsigncode since it is available on
CentOS and RHEL. Detection of the BootNext EFI variable was broken on
ubuntu since it apparently doesn't include the 'UKI' tag, so a special
case was added.
This should avoid reusing addresses.
This allows the system to upgrade itself instead of relying on cloning a
repository.
Comment on lines +365 to +368
- name: Reboot the host to try and boot from the UKI
become: true
ansible.builtin.reboot:
reboot_timeout: 300
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't really make much sense to try and recover from this error, we should print some helpful debug information and exit.

Comment on lines +220 to +237
- name: Backup virt-firmware boot validation service
when:
- ansible_facts.distribution_file_variety == 'Debian'
- boot_svc_search.stat.exists
ansible.builtin.slurp:
src: "{{ boot_svc_path }}"
register: boot_svc_backup
changed_when: false

- name: Install virt-firmware boot validation service from git
when: ansible_facts.distribution_file_variety == 'Debian'
ansible.builtin.copy:
src: "{{ clone_dir.path }}/systemd/{{ boot_svc_name }}"
dest: "{{ boot_svc_path }}"
owner: root
group: root
mode: "0644"
setype: systemd_unit_file_t
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All of these Debian-conditioned tasks should likely be moved to a different tasks file so we don't pollute the playbook run output with skipped tasks. It's getting a little difficult to figure out what's going on.

Comment on lines +1 to +4
---
uki_mok:
database_path: /etc/pki/pesign
friendly_name: mok
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It turns out that this method causes some interface stability issues. If we nest all of our variables in a dictionary, the user needs to declare each key, not just the ones they want to modify. There is likely a way around this by merging the default dictionary and the one provided by the caller, but it's bad form in my opinion. I think we need to change this, and it's a good opportunity to revisit some naming ideas.

The role should be renamed to kernel, since what it essentially does is modify the kernel layout and desired signature state. We then have to add the "kernel_" prefix to all of our arguments. We should flatten the structure as discussed, and it might look like this:

kernel_layout: uki
kernel_sign: true
kernel_signing_tool: pesign
kernel_pesign_key: mok
kernel_pesign_db: /etc/pki/pesign

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Eventually as we reintroduce support for sbsign, other UKI initramfs generators:

kernel_signing_tool: sbsign
kernel_sbsign_key: /etc/kernel/mok.priv
kernel_sbsign_cert: /etc/kernel/mok.pem
kernel_initrd_generator: dracut
kernel_uki_generator: ukify

This change aligns the layout of the integration tests directory with
the expected structure of collection tests, even though we don't use
ansible-test. This lays the groundwork for splitting test components
into their own tasks files and reusing more tasks.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Pick a testing platform
1 participant