Tuning systemd services, logging, and device management in Linux

June 23, 2021

systemd is a core component of many Linux distributions. Since its birth in 2010, many distributions have gradually adopted systemd as the core init system, responsible for handling services and boot-up operations.

Throughout its development phase, systemd added several other components to its portfolio:

  • D-Bus, which offers a system and session bus service for inter-application communication, merged with systemd.
  • systemd also incorporated udev, which offers a flexible device-node management application.
  • Login capabilities were added to systemd, enabling fine-grained control over user sessions.
  • The journald daemon joined the systemd family to provide a new approach to system and service logging, replacing some of the functionality of standard system loggers.
  • Timer units provide support for the time-based execution of tasks, replacing some of the functionality of standard cron daemons.
  • Network configurations can be managed by systemd-networkd.

This ongoing approach of absorbing several system services into a single application suite has not gone unnoticed and isn’t without controversy. Some distributions even refuse to have systemd as the default init system.

The systemd project includes SELinux support for most of its services. Applications such as systemd, which not only include SELinux awareness but also enforce access controls on specific SELinux classes and permissions (rather than relying on the Linux kernel), are called userspace object managers.

If an application enforces access controls toward certain classes and permissions, then it will also have its own AVC. Log events resulting from these applications will be identified as USER_AVC events rather than (kernel-managed) AVC events. The systemd application has support for systemd-specific classes, as we will see in the Governing unit operation access section. But before we dive into these specific details, let’s first see what systemd is all about and what SELinux support it has.

Service support in systemd

The main capability of the system daemon that most people know about is its support for system services. Unlike traditional SysV-compatible init systems, systemd does not use scripts to manage services. Instead, it uses a declarative approach for the various services, documenting the wanted state and configuration parameters while using its own logic to ensure that the right set of services start at the right time and in the correct order.

Understanding unit files

systemd uses unit files to declare how a service should behave. These unit files use the INI-style syntax, supporting sections and key/value pairs within each file. A service can have multiple unit files that influence the service at large. It is important to remember that different unit files for the same service are all related:

  • The *.service unit files define how a system service should be launched, what its dependencies are, how systemd should treat sudden failures, and so on.
  • The *.socket unit files define which socket(s) should be created and which permissions should be assigned to them. systemd uses these for services that can be launched on request rather than directly at boot.
  • The *.timer unit files define at what time or frequency the service should be launched. Services that do not necessarily run daemonized but need to execute a certain logic at defined intervals can use these timer files to ensure regular runs. These settings are comparable to the more classic yet still widely used crontabs, which we briefly touch upon in PAM services, in the subsection called Cron.

Other unit files exist as well, although those have more in common with generic system configurations (such as slice definitions and automount settings) and less with runtime services.

System unit files can be placed in one of three locations:

  • Unit files are installed by default by the system’s package manager inside /usr/lib/systemd/system.
  • At runtime, updates can be placed inside /run/systemd/system, which will override the unit files in the default location. However, this location is transient and will not persist across reboots.
  • System administrators can override the configurations in the two locations by placing unit files in /etc/systemd/system. These unit files override previous definitions, so there is no need to remove the unit files from the previous locations.
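
The precedence between the three locations can be sketched in a few lines of Python. This is only an illustration of the lookup order described above, not how systemd itself resolves units; the function name effective_unit_path is hypothetical, and the sketch ignores drop-in directories and masked units.

```python
from pathlib import Path

# Search order from highest to lowest priority, following the list above.
UNIT_DIRS = (
    "/etc/systemd/system",
    "/run/systemd/system",
    "/usr/lib/systemd/system",
)


def effective_unit_path(unit_name, search_dirs=UNIT_DIRS):
    """Return the first existing unit file in priority order, or None."""
    for directory in search_dirs:
        candidate = Path(directory) / unit_name
        if candidate.is_file():
            return candidate
    return None
```

Calling effective_unit_path("nginx.service") on a system where only the package-provided file exists would return the path under /usr/lib/systemd/system. On a live system, systemctl cat shows the unit file that systemd actually uses.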

As an example, check out the default Nginx service unit file, nginx.service, inside /usr/lib/systemd/system:

[Unit]
Description=The nginx HTTP and reverse proxy server
After=network.target remote-fs.target nss-lookup.target
[Service]
Type=forking
PIDFile=/run/nginx.pid
ExecStartPre=/usr/bin/rm -f /run/nginx.pid
ExecStartPre=/usr/sbin/nginx -t
ExecStart=/usr/sbin/nginx
ExecReload=/bin/kill -s HUP $MAINPID
KillSignal=SIGQUIT
TimeoutStopSec=5
KillMode=mixed
PrivateTmp=true
[Install]
WantedBy=multi-user.target

This unit file declares the command to launch Nginx with and informs systemd that the service should be launched after successfully reaching the network, remote-fs, and nss-lookup targets (which are milestones in the boot process, allowing proper dependency handling). The unit file also declares that it is a dependency of the multi-user target (the equivalent of the default run level when using SysV-style init services), which means the service should launch when the system boots.
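
To inspect such a file programmatically, a small parser is enough. The following is a minimal sketch, assuming plain "Key=Value" lines; parse_unit is a hypothetical helper, and it deliberately keeps every value in a list because keys such as ExecStartPre= can appear more than once. It does not handle line continuations or the empty-assignment reset that systemd supports.

```python
from collections import defaultdict


def parse_unit(text):
    """Parse INI-style unit text into {section: {key: [value, ...]}}."""
    sections = defaultdict(lambda: defaultdict(list))
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and comments.
        if not line or line.startswith(("#", ";")):
            continue
        # A "[Section]" header switches the current section.
        if line.startswith("[") and line.endswith("]"):
            current = line[1:-1]
            continue
        # Ignore anything outside a section or without an assignment.
        if current is None or "=" not in line:
            continue
        key, value = line.split("=", 1)
        sections[current][key.strip()].append(value.strip())
    return sections
```

With the Nginx unit above, parse_unit(text)["Service"]["ExecStartPre"] would hold both pre-start commands in their original order.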

Setting the SELinux context for a service

When systemd launches a service, it executes the command defined through the ExecStart= configuration entry in the service unit file. By default, a standard domain transition will occur as defined through the SELinux policy.

Package developers and system administrators can, however, update the service unit files to have the service launched in an explicitly mentioned SELinux domain. To accomplish this, the [Service] section of the unit file can be extended with the SELinuxContext= configuration entry.

For instance, to ensure that Nginx launches with the httpd_t:s0:c0.c128 context, you’d use this:

[Service]
Type=forking
PIDFile=/run/nginx.pid
ExecStartPre=/usr/bin/rm -f /run/nginx.pid
ExecStartPre=/usr/sbin/nginx -t
ExecStart=/usr/sbin/nginx
ExecReload=/bin/kill -s HUP $MAINPID
SELinuxContext=system_u:system_r:httpd_t:s0:c0.c128
KillSignal=SIGQUIT
TimeoutStopSec=5
KillMode=mixed
PrivateTmp=true

Of course, it is also possible to use this to have a service running with a different context, which can be useful when developing custom policies for daemons. However, keep in mind that the SELinux policy rules still apply: you cannot ask systemd to launch Nginx, for instance, with the dnsmasq_t domain without updating the SELinux policy so that httpd_exec_t (the entry point for the httpd_t domain) is also made an entry point for the dnsmasq_t domain.

When you request systemd to explicitly use an SELinux context for a service, systemd will attempt to use this context for all execution-related tasks: ExecStartPre, ExecStart, ExecStartPost, ExecStop, ExecStopPost, and ExecReload. As the commands behind these tasks are often not labeled with the right entry point label, they can fail. In that case, prefix the commands with + so that the SELinux context definition does not apply to them:

[Service]
Type=forking
PIDFile=/run/nginx.pid
ExecStartPre=+/usr/bin/rm -f /run/nginx.pid
ExecStartPre=/usr/sbin/nginx -t
ExecStart=/usr/sbin/nginx
ExecReload=/bin/kill -s HUP $MAINPID
SELinuxContext=system_u:system_r:httpd_t:s0:c0.c128
KillSignal=SIGQUIT
TimeoutStopSec=5
KillMode=mixed
PrivateTmp=true
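
When auditing unit files, it can help to see which commands carry the + prefix. The sketch below separates leading prefix characters from the command itself. The helper names are hypothetical, and the set of prefix characters is an assumption based on the systemd.service documentation; it may differ between systemd versions.

```python
# Special executable prefixes (assumed set; check systemd.service(5) for your version).
EXEC_PREFIX_CHARS = "@-:+!"


def split_exec_prefix(value):
    """Split an Exec*= value into (prefix characters, command line)."""
    i = 0
    while i < len(value) and value[i] in EXEC_PREFIX_CHARS:
        i += 1
    return value[:i], value[i:]


def skips_selinux_context(value):
    """True if the command carries the + prefix discussed above."""
    prefixes, _ = split_exec_prefix(value)
    return "+" in prefixes
```

For the unit file above, only the rm command reports True, which matches the intent: the cleanup step runs outside the requested context while Nginx itself still runs in it.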

While developing and changing unit files, the changed settings might not always be immediately applied to the system. Running systemctl daemon-reload after modifying unit files will ensure that the latest changes on the system are read by systemd.

Using transient services

systemd can also be used to launch applications as if they are services and have them under systemd’s control. Such applications are called transient services as they lack the unit files that generally declare how systemd should behave.

Transient services are launched through the systemd-run application. To show this, let’s create a simple Python script (one that calculates Pi up to 10,000 digits):

from decimal import Decimal, getcontext
# Work with 10,000 significant digits.
getcontext().prec=10000
# Sum the Bailey-Borwein-Plouffe series and write the result to /tmp/pi.out.
with open('/tmp/pi.out', 'w') as f:
  print(sum(1/Decimal(16)**k * (
    Decimal(4)/(8*k+1)-
    Decimal(2)/(8*k+4)-
    Decimal(1)/(8*k+5)-
    Decimal(1)/(8*k+6)) for k in range(10000)), file=f)

As this takes some time, we can opt to run this Python script under systemd’s control:

# systemd-run python3.6 /tmp/pi.py
Running as unit: run-rf9ce45c...f343.service

As transient services do not have unit files to manage, changing the SELinux context must be accomplished through the command line as well. Of course, this is only needed if the standard domain transitions defined in the policy do not result in the wanted behavior:

# systemd-run -p SELinuxContext=guest_u:guest_r:guest_t:s0 python3.6 /tmp/pi.py

The systemd-run application supports this through the --property (or -p) option, through which unit file properties can be added. In the previous example, we use this option to run the script in the guest_t domain using the SELinuxContext property, similar to how we would define this in the unit file itself.
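
When transient services are launched from a script, the command line can be assembled as an argument list rather than a shell string. This is a minimal sketch; systemd_run_argv is a hypothetical helper that only builds the list and does not execute anything.

```python
def systemd_run_argv(command, properties=None):
    """Build a systemd-run argument list with optional -p Name=Value pairs."""
    argv = ["systemd-run"]
    for name, value in (properties or {}).items():
        argv += ["-p", f"{name}={value}"]
    # The command to run comes last, after all options.
    return argv + list(command)
```

The returned list can be passed to subprocess.run(argv, check=True) by a sufficiently privileged user. Keeping the arguments as a list avoids shell quoting problems with context strings that contain colons or commas.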

Requiring SELinux for a service

Some services should only run when SELinux is enabled or disabled. With systemd, this can be defined through its conditional parameters.

A service unit file can contain several conditions that need to be valid before systemd will consider executing the service. These conditionals can point to the system type (virtualized or not), kernel command-line parameters, files that do or don’t exist, and so on. The one we are interested in is ConditionSecurity, which represents the state of the given security system—in our case, SELinux.

For instance, look at the selinux-autorelabel.service unit file inside /usr/lib/systemd/system:

[Unit]
Description=Relabel all filesystems
DefaultDependencies=no
Conflicts=shutdown.target
After=sysinit.target
Before=shutdown.target
ConditionSecurity=selinux
[Service]
ExecStart=/usr/libexec/selinux/selinux-autorelabel
Type=oneshot
TimeoutSec=0
RemainAfterExit=yes
StandardOutput=journal+console

Similarly, the Linux distribution provides the selinux-autorelabel-mark.service file. This service ensures that, if SELinux is not active when the system boots (and no /.autorelabel file exists yet), then systemd will create an empty /.autorelabel file. This file ensures that, when the system reboots with SELinux support, the relabeling operation occurs.
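
The logic behind such a condition can be approximated in Python. This is a rough sketch under stated assumptions: it treats SELinux as enabled when the selinuxfs pseudo filesystem exposes its enforce node, whereas systemd performs its own check internally. Both function names are hypothetical, and only the selinux value and the ! negation prefix are handled.

```python
from pathlib import Path


def selinux_enabled(selinuxfs="/sys/fs/selinux"):
    """Approximate check: selinuxfs is mounted when its 'enforce' node exists."""
    return (Path(selinuxfs) / "enforce").exists()


def condition_security_met(value, selinuxfs="/sys/fs/selinux"):
    """Evaluate 'selinux' or '!selinux' in the style of ConditionSecurity=."""
    negate = value.startswith("!")
    name = value[1:] if negate else value
    if name != "selinux":
        raise ValueError(f"unsupported security system in this sketch: {name}")
    enabled = selinux_enabled(selinuxfs)
    return not enabled if negate else enabled
```

A unit with ConditionSecurity=selinux corresponds to condition_security_met("selinux"), while a service that should only run without SELinux corresponds to the negated form.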

Relabeling files during service startup

One of the actions that many services require is the preparation of service-specific runtime directories, such as /run/httpd for the Apache service. systemd supports this through tmpfiles.d. Through tmpfiles.d definitions, we can declare the files and directories that must be created or updated at boot time because they do not live on a persistent filesystem.

For instance, the package that provides the Apache daemon installs the following definition as /usr/lib/tmpfiles.d/httpd.conf on the system:

d /run/httpd	710 root apache
d /run/httpd/htcacheclean	700 apache

Like the systemd unit files, the files that contain these settings should be declared in one of the following three locations. Each location overrides the settings of the previous one:

  • The default, package-provided location is /usr/lib/tmpfiles.d.
  • Runtime declarations can be placed in /run/tmpfiles.d.
  • Local system administrator-provided declarations are placed in /etc/tmpfiles.d.

These definitions can get much more specific than just directory creation. Through the systemd-tmpfiles application, definitions can be set to create files, empty directories upfront, create sub-volumes, manage special files such as symbolic links or block devices, set extended attributes, and more.

One of its features is to set the file mode and ownership, and restore the SELinux context on a file (z) or recursively against a directory (Z). This can be used to change contexts on files that have a proper context definition in the policy, but whose context is not properly assigned.

For instance, look at the definitions in the selinux-policy.conf file inside /usr/lib/tmpfiles.d:

z /sys/devices/system/cpu/online - - -
Z /sys/class/net - - -
z /sys/kernel/uevent_helper - - -
w /sys/fs/selinux/checkreqprot - - - - 0

We need to relabel files inside /sys because this location is labeled with sysfs_t by default, and a context changed at runtime does not persist across reboots. Yet some of its files should have a different label – the /sys/devices/system/cpu/online file, for instance, requires the cpu_online_t label:

# matchpathcon /sys/devices/system/cpu/online
/sys/devices/system/cpu/online  system_u:object_r:cpu_online_t:s0

The definition ensures that this (pseudo) file is relabeled at boot so that all other processes that rely on the file labeled with cpu_online_t can happily continue working.

The other arguments to the definition are explicitly marked with a dash in the previous example, meaning that no other parameters need to be configured. They can be used to set the mode, User Identifier (UID), Group Identifier (GID), age, and argument related to the rule.
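
The column layout can be made explicit with a small parser. This is a minimal sketch, assuming whitespace-separated fields without quoting; TmpfilesEntry and parse_tmpfiles_line are hypothetical names, and omitted trailing fields are filled with a dash, which is how the examples above express "not set".

```python
from typing import NamedTuple


class TmpfilesEntry(NamedTuple):
    type: str
    path: str
    mode: str
    user: str
    group: str
    age: str
    argument: str


def parse_tmpfiles_line(line):
    """Split one tmpfiles.d line into its seven columns, padding with '-'."""
    # The last column (argument) may itself contain spaces, so split at most 6 times.
    fields = line.split(None, 6)
    fields += ["-"] * (7 - len(fields))
    return TmpfilesEntry(*fields)
```

Parsing "d /run/httpd 710 root apache" gives an entry of type d with mode 710 and an unset age and argument, while the checkreqprot line above yields type w with the argument 0.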

An example configuration that uses some of these other parameters with the z or Z state is the systemd.conf file:

# grep '^[zZ]' /usr/lib/tmpfiles.d/systemd.conf
z /run/log/journal 2755 root systemd-journal - -
Z /run/log/journal/%m ~2750 root systemd-journal - -
z /var/log/journal 2755 root systemd-journal - -
z /var/log/journal/%m 2755 root systemd-journal - -
z /var/log/journal/%m/system.journal 0640 root systemd-journal - -

For more information about the definition format, see man tmpfiles.d.

Using socket-based activation

The system daemon also supports socket-based activation. When configured, systemd will create the socket on which the daemon usually listens and will have the daemon launched when the socket is first used. This allows systems to boot quickly (as many daemons do not need to be launched immediately) while still ensuring that all required sockets are available.

When a client only writes information to the socket (such as with the /dev/log socket), the client does not even need to wait for the daemon to be activated. The data is stored in a buffer until the daemon can read it. Only when the buffer is full will the operation block until the daemon flushes the buffer.

Take a look at the systemd-journald.socket unit file, available inside /usr/lib/systemd/system:

[Unit]
Description=Journal socket
Documentation=man:systemd-journal.service(8) man:journald.conf(8)
DefaultDependencies=no
Before=sockets.target
IgnoreOnIsolate=yes
[Socket]
ListenStream=/run/systemd/journal/stdout
ListenDatagram=/run/systemd/journal/socket
SocketMode=0666
PassCredentials=yes
PassSecurity=yes
ReceiveBuffer=8M
Service=systemd-journald.service

When a client uses one of the mentioned sockets, then systemd will launch the systemd-journald.service unit to accommodate the client interaction. As long as these sockets are not used, the service will not be started.

Inside the [Socket] section, an SELinux-specific entry can be defined: SELinuxContextFromNet=true. When a unit file has this entry set, systemd will obtain the MLS/MCS information from the client context (the application connecting to the socket) and append this to the context of the service. This sensitivity inheritance can be used to prevent any information leakage from taking place when communication is happening through sockets.
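
The effect can be illustrated with plain string handling. This is only a conceptual sketch of the idea described above (keep the service's user, role, and type, and take the MLS/MCS part from the client); it is not how systemd implements the feature, and the function names and example contexts are hypothetical.

```python
def split_context(context):
    """Split 'user:role:type:level' into four parts; the level may contain colons."""
    user, role, type_, level = context.split(":", 3)
    return user, role, type_, level


def context_with_client_level(service_context, client_context):
    """Combine the service's user:role:type with the client's MLS/MCS level."""
    user, role, type_, _ = split_context(service_context)
    client_level = split_context(client_context)[3]
    return f"{user}:{role}:{type_}:{client_level}"
```

For a service context of system_u:system_r:httpd_t:s0 and a client running at s0:c10, the combined context keeps httpd_t but carries the client's s0:c10 level.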

Governing unit operation access

Until now, we’ve looked at configuration settings related to systemd’s SELinux support. systemd also uses SELinux to control access to services defined through unit files. When a user wants to perform an operation against a unit (such as starting a service or checking the state of a running service), systemd queries the SELinux policy to see whether it will allow this operation.

The systemd daemon uses the service class to validate the permissions of the client’s domain toward the requested operation. For instance, to validate whether a user context, sysadm_t, can view the status of the service associated with the sshd.service unit file, it checks the context of this file (being sshd_unit_file_t) and then validates whether the status permission is granted:

# sesearch -s sysadm_t -t sshd_unit_file_t -c service -p status -A

Other supported permissions are disable, enable, reload, start, and stop. When a permission is not granted, a USER_AVC denial message will be visible in the audit logs (rather than an AVC message) as the message is not generated by the Linux kernel, but by systemd. So, while the rules themselves are part of the SELinux policy, it is systemd that enforces the access.

systemd, or the client through which systemd is queried, might also provide additional error messages to reflect that the SELinux policy prevents the action. For instance, if we attempt to query systemd over D-Bus (which we cover in the D-Bus communication section) from an unprivileged user domain, then we get the following error:

Error: GDBus.Error:org.freedesktop.DBus.Error.AccessDenied: SELinux policy denies access

To facilitate troubleshooting any systemd-triggered failures, systemd also has an extensive logging component, called systemd-journald, which we’ll cover next.

Logging with systemd

systemd is not only responsible for service management: it takes up several other tasks as well. One of these tasks is log management, traditionally implemented through a system logger.

While systemd still supports running with a traditional system logger, it now suggests the use of systemd-journald. One of the advantages of the journal daemon is that it is not limited to textual, single-line log messages. Daemons can now log binary data as well as multiline messages.

The journal daemon also registers information about the sending process alongside the log messages themselves. This additional information contains ownership data (the process owner) including the SELinux context of the sending process.

Retrieving SELinux-related information

The traditional approach to receive SELinux-related information (excluding the audit events we tackled before) is to grep through the log information. With the journal daemon, we can accomplish this as follows:

# journalctl -b | grep -i selinux

The -b option passed to the journal control application tells it that we are only interested in log messages from a specific boot; when no boot is given, as here, that is the current boot.

Querying logs given an SELinux context

A unique feature of the journal daemon is that the information associated with the log messages can be used in queries against the journal database. For instance, we can ask the journal daemon to only show those messages that originated from a daemon or application running in the init_t context:

# journalctl _SELINUX_CONTEXT=system_u:system_r:init_t:s0

The available contexts can be retrieved through the Bash completion support on the system. After writing _SELINUX_CONTEXT=, press Tab twice to see the possible values.
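
The same field is available when the journal is exported as JSON (journalctl -o json prints one JSON object per line). The sketch below counts entries per SELinux context from such lines; count_by_selinux_context is a hypothetical helper, and entries without the field are grouped under a placeholder label.

```python
import json
from collections import Counter


def count_by_selinux_context(json_lines):
    """Count journal entries per _SELINUX_CONTEXT value."""
    counts = Counter()
    for line in json_lines:
        line = line.strip()
        if not line:
            continue
        entry = json.loads(line)
        context = entry.get("_SELINUX_CONTEXT", "(none)")
        # Non-text field values are not exported as strings; group them separately.
        if not isinstance(context, str):
            context = "(non-text)"
        counts[context] += 1
    return counts
```

The function accepts any iterable of lines, so it can read from a saved export file or from the standard output of a journalctl -b -o json process started through the subprocess module.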

Using setroubleshoot integration with journal

The SELinux troubleshoot daemon is also integrated with systemd-journald. Any alert that comes up from setroubleshootd is also available through the journal daemon.

This helps administrators as they will quickly find out about SELinux denials when investigating problems. For instance, when the Nginx web server is not working properly and this is due to an SELinux policy, a quick investigation of the status of the service will reveal that the SELinux policy is preventing some actions:

# systemctl status nginx

To get more information about the message, use journalctl:

# journalctl -xe

The journal entries include environment information that systemd-journald captured for the service, which can provide much-needed guidance on resolving potential problems.

A third systemd service that has SELinux configuration possibilities is the device daemon.

Handling device files

Linux has a long history of device managers. Initially, administrators needed to make sure that the device nodes were already present on the filesystem (/dev was part of the persisted filesystem). Gradually, Linux adopted more dynamic approaches for device management.

Nowadays, device files are managed through a combination of a pseudo filesystem (devtmpfs) and a userspace device manager called udev. This device manager is merged in systemd as well, becoming systemd-udevd.

The device manager listens on a kernel socket for kernel events. These events inform the device manager about detected or plugged-in devices (or the removal of such devices) and allow the device manager to take appropriate action. For udev, these actions are defined in udev rules.

Using udev rules

Configuring the udev subsystem is mainly done through udev rules. These rules are one-liners that contain a matching part and an action part.

The matching part contains validations, executed against the event(s) that udev receives from the Linux kernel. This validation uses key/value pairs obtained from the event, and includes the following possible keys:

  • Kernel-provided device name (KERNEL)
  • Device subsystem (SUBSYSTEM)
  • Kernel driver (DRIVER)
  • Specific attributes (ATTR)
  • Active environment variables (ENV)
  • The action type to inform if the device is detected or removed (ACTION)

While more match keys are possible, the preceding list is most commonly used.

The Linux kernel will also inform the device manager about the device hierarchy. This allows rules to be defined based on, for instance, the USB controller through which a USB device is plugged in. Alongside the information for the device itself, the kernel will also provide hierarchically related information through similar key/value pairs. These pairs, however, use a key definition in plural form: SUBSYSTEMS instead of SUBSYSTEM, DRIVERS instead of DRIVER, and so on.

For instance, to match a USB webcam with vendor ID 05a9 and product ID 4519, the match-related pairs could look like this:

KERNEL=="video[0-9]*", SUBSYSTEM=="video4linux", SUBSYSTEMS=="usb", ATTR{idVendor}=="05a9", ATTR{idProduct}=="4519"

The second part of a udev rule is the action to take. The most common action is to create a symbolic link to the created device file, ensuring that applications can always reach the same device through the same symbolic link, even when the device from the kernel point of view has a different name. We can, for instance, extend the preceding example with SYMLINK+="webcam1" to have /dev/webcam1 point to this newly detected device.

The udev application supports many more actions than just defining symbolic links, of course. It can associate ownership (OWNER) or group membership (GROUP) on the device, controlling who can access the devices. udev can also set environment variables (ENV) and even run a command (RUN) when the matched device is plugged in or detached from the system. To make sure the command is only executed when the device is added, we need to add an ACTION setting such as ACTION=="add".

Note:

udev can interpret ENV as both a matching key and an action key. The difference is the operator used (a single equals sign, =, or a double one, ==). ENV{envvar}=="value" is a match operation (checking whether the variable matches the given value), whereas ENV{envvar}="value" is an action (setting the variable to value).
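
The distinction between match and action pairs can be made visible with a short parser. This is a minimal sketch, assuming every value is enclosed in double quotes without escaped quotes inside; split_udev_rule is a hypothetical helper and not part of udev.

```python
import re

# key (optionally with {attribute}), operator, quoted value
PAIR_RE = re.compile(
    r'\s*([A-Za-z_]+(?:\{[^}]*\})?)\s*(==|!=|\+=|-=|:=|=)\s*"([^"]*)"'
)
MATCH_OPS = {"==", "!="}


def split_udev_rule(rule):
    """Return (matches, actions) as lists of (key, operator, value) tuples."""
    matches, actions = [], []
    for key, op, value in PAIR_RE.findall(rule):
        # Comparison operators are matches; every assignment form is an action.
        (matches if op in MATCH_OPS else actions).append((key, op, value))
    return matches, actions
```

Applied to the webcam rule extended with SYMLINK+="webcam1", the function reports five match pairs and a single action, and it classifies ENV{envvar}=="value" as a match but ENV{envvar}="value" as an action.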

udev rules are provided by default through the /usr/lib/udev/rules.d location. Distributions and applications/drivers will store their default rules in this location. Additional rules or rule overrides can be placed in /etc/udev/rules.d.

It’s important to remember that udev will continue processing rules even when it has already encountered a matching rule. This can be changed on a per-rule basis through the OPTIONS action, as with OPTIONS+="last_rule", which informs udev that it can stop processing further rules for this event.

Setting an SELinux label on a device node

One of the actions that udev supports is to assign an SELinux context on the device node. We can do this using the SECLABEL{selinux} action:

KERNEL=="fd0", ..., SECLABEL{selinux}="system_u:object_r:my_device_t:s0"

Note that this action only sets the context on the device node. If the rule also sets a symbolic link, then the symbolic link itself will inherit the default device_t context.

Placing an SELinux label on a device node is often done together with the other security-related permissions, so the rule often receives additional actions such as setting the target owner (OWNER), group (GROUP), and permission set (MODE). After all, SELinux security controls only apply after the regular, discretionary access control checks have passed, so don’t forget to make sure your users have access to the device nodes outside of the SELinux controls as well.

All the settings we’ve seen so far are about systemd service management and system support. Another component within the systemd ecosystem is D-Bus, which is less about system management and more about facilitating communication and interaction between different applications over a programmable communication bus.
