Using SELinux with systemd’s container support

July 03, 2021

We introduced systemd as an SELinux-aware application suite, capable of launching different services with configurable SELinux contexts. Besides service support, systemd has quite a few other features up its sleeve. One of these features is systemd-nspawn.

With systemd-nspawn, systemd provides container capabilities, allowing administrators to interact with systemd-managed containers in an integrated way, almost as if these containers were services themselves. It builds on the same kernel primitives as Docker and LXC (from the Linux Containers project, the predecessor of the modern container frameworks), most notably namespaces (hence the n in nspawn).

Note:

The Linux Containers project has a product called LXC that combines several isolation and resource management services within the Linux kernel, such as control groups (cgroups) and namespace isolation. cgroups allow for capping or throttling the consumption of resources such as CPU, memory, and I/O, whereas namespaces allow for hiding information and limiting the view on system resources. Early versions of Docker were built upon LXC, although Docker has since moved to using these kernel services directly, without LXC.
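
To make the namespace concept more tangible, the following sketch lists the namespaces a process belongs to by reading the links under /proc/<pid>/ns. The list_namespaces helper is a name made up for this illustration, and the sketch assumes a Linux system with procfs mounted.

```shell
# Hypothetical helper: list the namespaces of a process
# (defaults to the current shell when no PID is given).
list_namespaces() {
    pid="${1:-$$}"
    ns_dir="/proc/${pid}/ns"
    if [ ! -d "$ns_dir" ]; then
        echo "no namespace information for PID ${pid}" >&2
        return 1
    fi
    # Each entry is a symbolic link such as "mnt -> mnt:[<inode>]".
    for link in "$ns_dir"/*; do
        printf '%s -> %s\n' "${link##*/}" "$(readlink "$link")"
    done
}

list_namespaces || true
```

Running it on the host and again inside a container should show different identifiers for the namespaces that the container does not share with the host.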

SELinux-wise, the software running inside the container might not have a correct view on the SELinux state (depending on the container configuration) as the container is isolated from the host itself. SELinux does not yet have namespace support to allow containers or other isolated processes to have their own SELinux view, so if a container has a view on the SELinux state, it should never be allowed to modify it.
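
One way to check whether a container can see or modify the SELinux state is to look at how selinuxfs is mounted inside it. The sketch below is only an illustration: selinuxfs_mode is a hypothetical helper that parses a mounts table in the /proc/mounts format and reports whether selinuxfs is mounted read-only, read-write, or not at all.

```shell
# Hypothetical helper: print "ro", "rw", or "absent" depending on how
# selinuxfs appears in a mounts table (defaults to /proc/mounts).
selinuxfs_mode() {
    mounts_file="${1:-/proc/mounts}"
    if [ ! -r "$mounts_file" ]; then
        echo "cannot read ${mounts_file}" >&2
        return 1
    fi
    # Field 3 is the filesystem type, field 4 the comma-separated options.
    awk '$3 == "selinuxfs" {
             n = split($4, opts, ",")
             mode = "rw"
             for (i = 1; i <= n; i++) if (opts[i] == "ro") mode = "ro"
             found = 1
         }
         END { print (found ? mode : "absent") }' "$mounts_file"
}

# Usage, from a shell inside the container:
#   selinuxfs_mode
```

If the container exposes selinuxfs at all, a read-only mount is what we would hope to see here.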

Let’s see how systemd-nspawn works and what its SELinux support looks like.

Initializing a systemd container

To create a systemd container, we need to create a place on the filesystem where its files will be stored, and then call systemd-nspawn with the correct arguments. To prepare the filesystem, we can download prebuilt container images, or create one ourselves. Let’s use the Jailkit software, and build a container from it:

  • First, create the directory the container runtimes will be hosted in:
# mkdir /srv/ctr
  • Edit the /etc/jailkit/jk_init.ini file and include the following section:
[nginx]
comment = nginx runtime
paths = /usr/sbin/nginx, /etc/nginx, /var/log/nginx, /var/lib/nginx, /usr/share/nginx, /usr/lib64/nginx, /usr/lib64/perl5/vendor_perl
users = root,nginx
groups = root,nginx
includesections = netbasics, uidbasics, perl

This section tells Jailkit what it should copy into the directory, and which users to support.

  • Execute the jk_init command to populate the directory:
# jk_init -v -j /srv/ctr/nginx nginx
  • Finally, start the container using systemd-nspawn:
# systemd-nspawn -D /srv/ctr/nginx /usr/sbin/nginx \
 -g "daemon off;"

By default, nginx detaches and runs as a daemon. The process that systemd-nspawn started would then exit, and the container would stop immediately because it no longer has an active process. With the daemon off directive, nginx remains in the foreground and the container keeps running.
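
Before launching, it can help to confirm that jk_init actually placed the binary inside the container directory. The following is a minimal sketch of such a pre-flight check; check_rootfs is a hypothetical helper and not part of Jailkit or systemd.

```shell
# Hypothetical pre-flight check: verify that a container directory
# exists and contains an executable at the given in-container path.
check_rootfs() {
    rootfs="$1"
    binary="$2"
    if [ ! -d "$rootfs" ]; then
        echo "missing container directory: ${rootfs}" >&2
        return 1
    fi
    if [ ! -x "${rootfs}${binary}" ]; then
        echo "no executable ${binary} inside ${rootfs}" >&2
        return 1
    fi
    echo "ok: ${rootfs}${binary}"
}

# Usage with the paths from the steps above:
#   check_rootfs /srv/ctr/nginx /usr/sbin/nginx && \
#     systemd-nspawn -D /srv/ctr/nginx /usr/sbin/nginx -g "daemon off;"
```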

Using a specific SELinux context

When we launch a container directly, this container will run with the SELinux context of the user. We can, however, pass on the target context for the container using command-line arguments:

  • The --selinux-context= option (-Z for short) allows the administrator to define the SELinux context for the runtime processes of the container.
  • The --selinux-apifs-context= option (-L for short) allows the administrator to define the SELinux context used to label the files in the virtual API filesystems that systemd-nspawn mounts inside the container.
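
Both options expect a full SELinux context in the user:role:type:level form. As a small illustration of that structure, the sketch below splits a context into its fields; context_field is a hypothetical helper written for this example.

```shell
# Hypothetical helper: print one field (user, role, type, or level)
# of an SELinux context such as system_u:system_r:container_t:s0.
context_field() {
    wanted="$1"
    # The last variable receives everything after the third colon,
    # so a level with categories (for instance s0:c1,c2) stays intact.
    IFS=: read -r ctx_user ctx_role ctx_type ctx_level <<EOF
$2
EOF
    case "$wanted" in
        user)  printf '%s\n' "$ctx_user" ;;
        role)  printf '%s\n' "$ctx_role" ;;
        type)  printf '%s\n' "$ctx_type" ;;
        level) printf '%s\n' "$ctx_level" ;;
        *)     echo "unknown field: ${wanted}" >&2; return 1 ;;
    esac
}

context_field type system_u:system_r:container_t:s0   # prints container_t
```

The type field is the part that matters most for the discussion that follows, as it determines which policy rules apply to the container.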

The SELinux types that can be used here, however, need to be carefully selected. The processes running inside a container cannot perform any type transitions, so regular SELinux domains are often not feasible to use. Taking our Nginx example again, the httpd_t domain cannot be used for this container.

We can use the SELinux types that the distribution provides for container workloads. Recent CentOS versions will use a domain such as container_t (which was previously known as svirt_lxc_net_t) and a file-oriented SELinux type, container_file_t. While this domain does not hold all possible privileges needed for any container, it provides a good baseline for containers.

Let’s use this type for our container:

  • First, we need to extend the container_t privileges with some additional rights for the nginx daemon. Create a CIL policy file with the following content:
(typeattributeset cil_gen_require container_t)
(typeattributeset cil_gen_require container_file_t)
(typeattributeset cil_gen_require http_port_t)
(typeattributeset cil_gen_require node_t)
(allow container_t container_file_t (chr_file (read open getattr ioctl write)))
(allow container_t self (tcp_socket (create setopt bind listen accept read write)))
(allow container_t http_port_t (tcp_socket (name_bind)))
(allow container_t node_t (tcp_socket (node_bind)))
(allow container_t self (capability (net_bind_service setgid setuid)))
  • Load this file as a new SELinux module:
# semodule -i custom_container.cil
  • Relabel the files of the container with the container_file_t SELinux type:
# chcon -R -t container_file_t /srv/ctr/nginx
  • Launch the container with the appropriate labels:
# systemd-nspawn -D /srv/ctr/nginx \
-Z system_u:system_r:container_t:s0 \
-L system_u:object_r:container_file_t:s0 \
/usr/sbin/nginx -g "daemon off;"
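
Labels set with chcon can be lost when the filesystem is relabeled. A more durable alternative is to register the labeling rule with semanage fcontext and apply it with restorecon. The sketch below prints the commands by default rather than executing them; the run and label_container_tree helpers are hypothetical names, and the file context expression assumes the /srv/ctr/nginx location used above.

```shell
# Print the commands by default; set DRY_RUN=0 to execute them as root.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Hypothetical helper: persistently label a container directory tree.
label_container_tree() {
    rootfs="$1"
    # Register the rule in the local file context definitions ...
    run semanage fcontext -a -t container_file_t "${rootfs}(/.*)?"
    # ... and apply it to the files that are already there.
    run restorecon -R "$rootfs"
}

label_container_tree /srv/ctr/nginx
```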

Whenever a container is launched, it remains attached to the current session. We can of course create service files that launch the containers in the background, or use session management services such as screen or tmux. A more user-friendly approach, however, is to use machinectl.
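
As a rough sketch of the service file approach, the helper below writes a unit that runs the labeled container in the background. The unit name nginx-container.service, the write_container_unit helper, and the /usr/bin/systemd-nspawn path are assumptions for this example; adjust them to the system at hand.

```shell
# Hypothetical helper: write a service unit for the nginx container
# to the given path.
write_container_unit() {
    unit_path="$1"
    cat > "$unit_path" <<'EOF'
[Unit]
Description=nginx in a systemd-nspawn container

[Service]
ExecStart=/usr/bin/systemd-nspawn --quiet --keep-unit -D /srv/ctr/nginx \
    -Z system_u:system_r:container_t:s0 \
    -L system_u:object_r:container_file_t:s0 \
    /usr/sbin/nginx -g "daemon off;"

[Install]
WantedBy=multi-user.target
EOF
}

# Usage (as root):
#   write_container_unit /etc/systemd/system/nginx-container.service
#   systemctl daemon-reload
#   systemctl start nginx-container.service
```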

Facilitating container management with machinectl

The machinectl command allows administrators to manage containers or even virtual machines more easily through systemd. For containers, machinectl will use systemd-nspawn.

Let’s use this machinectl command to download, start, and stop a container:

  • First, download a ready-to-go container image with the pull-tar argument and prepare it on the system:
# machinectl pull-tar https://nspawn.org/storage/archlinux/archlinux/tar/image.tar.xz archlinux

We can also download the archive manually, and then import it using machinectl import-tar:

# machinectl import-tar archlinux.tar.xz
  • List the available images with the list-images argument:
# machinectl list-images
  • We can now clone this image and launch the container:
# machinectl clone archlinux test
# machinectl start test
  • To access the container environment, use the shell argument:
# machinectl shell test
  • We can shut down the container using the poweroff argument:
# machinectl poweroff test

When we use machinectl, the containers will run in the unconfined_service_t SELinux domain. There is currently no way to override this. Luckily, we have other tools available to facilitate container management that do have more significant built-in SELinux support, such as Docker and podman.
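
To see which domain a running container ended up in, we can read the SELinux context of its leader process from /proc. This is a sketch under a few assumptions: proc_context is a hypothetical helper, the machine is named test as in the steps above, and the installed machinectl supports the --value option of the show command.

```shell
# Hypothetical helper: print the security context of a process as
# reported by the kernel in /proc/<pid>/attr/current.
proc_context() {
    attr="/proc/$1/attr/current"
    if [ ! -r "$attr" ]; then
        echo "cannot read ${attr}" >&2
        return 1
    fi
    # Strip a trailing NUL byte, which the kernel may append.
    context=$(tr -d '\0' < "$attr" 2>/dev/null) || return 1
    printf '%s\n' "$context"
}

# Usage on the host, while the container is running:
#   leader=$(machinectl show test -p Leader --value)
#   proc_context "$leader"
```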
