systemd System and Service Manager

CHANGES WITH 254 in spe:

Announcements of Future Feature Removals and Incompatible Changes:

* We intend to remove cgroup v1 support from a systemd release after the end of 2023. If you run services that make explicit use of cgroup v1 features (i.e. the “legacy hierarchy” with separate hierarchies for each controller), please implement compatibility with cgroup v2 (i.e. the “unified hierarchy”) sooner rather than later. Most of Linux userspace has been ported over already.

* The next release (v255) will remove support for split-usr (/usr/ mounted separately during late boot, instead of being mounted by the initrd before switching to the rootfs) and unmerged-usr (parallel directories /bin/ and /usr/bin/, /lib/ and /usr/lib/, …). For more details, see: https://lists.freedesktop.org/archives/systemd-devel/2022-September/048352.html

* EnvironmentFile= now treats the line following a comment line that ends with an escape character as a non-comment line. For details, see: https://github.com/systemd/systemd/issues/27975

* Support for System V service scripts is now deprecated and will be removed in a future release. Please make sure to update your software *now* to include a native systemd unit file instead of a legacy System V script to retain compatibility with future systemd releases.

* Behaviour of the per-user service manager units has changed w.r.t. sandboxing options, so that they work without having to manually enable PrivateUsers= as well, which is not required for system units. To make this work, we will implicitly enable user namespaces (PrivateUsers=yes) when a sandboxing option is enabled in a user unit. The drawback is that system users will no longer be visible (and appear as ‘nobody’) to the user unit when a sandboxing option is enabled. By definition a sandboxed user unit should run with reduced privileges, so impact should be small.
This will remove a great source of confusion that has been reported by users over the years, due to how these options require an extra setting to be manually enabled when used in the per-user service manager, as opposed to the system service manager. For more details, see: https://lists.freedesktop.org/archives/systemd-devel/2022-December/048682.html

Security Relevant Changes:

* pam_systemd will now by default pass the CAP_WAKE_ALARM ambient process capability to invoked session processes of regular users on local seats (as well as to systemd --user), unless configured otherwise via data from JSON user records, or via the PAM module’s parameter list. This is useful in order to allow desktop tools such as GNOME’s Alarm Clock application to set a timer for CLOCK_REALTIME_ALARM that wakes up the system when it elapses. A per-user service unit file may thus use AmbientCapabilities= to pass the capability to invoked processes. Note that this capability is relatively narrow in focus (in particular compared to other process capabilities such as CAP_SYS_ADMIN) and we already permit, by default, more impactful operations such as system suspend to local users.

Service Manager:

* “Startup” memory settings are now supported. Previously IO and CPU settings were already supported via StartupCPUWeight= and similar; this adds the same logic for the various per-unit memory settings StartupMemoryMax= and related.

* The service manager gained support for enqueuing POSIX signals to services that carry an additional integer value, exposing the sigqueue() system call. This is accessible via new D-Bus calls QueueSignalUnit() (and related), as well as in systemctl via the new --kill-value= parameter.

* systemctl gained a new “list-paths” verb, which shows all currently active .path units, similar to how “systemctl list-timers” shows active timers, and “systemctl list-sockets” shows active sockets.
* If MemoryDenyWriteExecute= is enabled for a service and the kernel supports the new PR_SET_MDWE prctl() call, it is used in preference over seccomp() based system call filtering to achieve the same effect.

* systemctl gained a new --when= switch which is honoured by the various forms of shutdown (i.e. reboot, kexec, poweroff, halt) and allows scheduling these operations by time, similar in fashion to how this has been supported by SysV shutdown.

* A new set of kernel command line options is now understood: systemd.tty.term.TTY=, systemd.tty.rows.TTY=, systemd.tty.columns.TTY= allow configuring the TTY type and dimensions for the tty specified via TTY. When systemd invokes a service on a tty (via TTYName=) it will look for these and configure the TTY accordingly. This is particularly useful in VM environments, to propagate host terminal settings into the appropriate TTYs of the guest.

* A new RootEphemeral= setting is now understood in service units. It takes a boolean argument. If enabled for services that use RootImage= or RootDirectory=, an ephemeral copy of the disk image or directory tree is made when the service is started. It is removed automatically when the service is stopped. That ephemeral copy is made using btrfs/xfs reflinks or btrfs snapshots, if available.

* The service activation logic gained new settings RestartSteps= and RestartMaxDelaySec= which allow exponentially growing restart intervals for Restart=.

* PID 1 will now automatically load the virtio_console kernel module during early initialization if running in a suitable VM. This is done so that early-boot logging can be written to the console if available.

* Similarly, virtio-vsock support is loaded early too in suitable VM environments. Since PID 1 sends sd_notify() notifications via AF_VSOCK to the VMM these days (if requested), loading this early is beneficial.

* A new verb “fdstore” has been added to systemd-analyze to show the current contents of the file descriptor store of a unit.
This is backed by a new D-Bus call DumpUnitFileDescriptorStore() provided by the service manager.

* The service manager will now set a new $FDSTORE environment variable when invoking processes for services that have the file descriptor store enabled.

* A new service option FileDescriptorStorePreserve= has been added that allows tuning the life-cycle of the per-service file descriptor store. If set to “yes” the entries in the fd store are retained even after the service is fully stopped.

* The “systemctl clean” command may now be used to clear the fdstore of a service.

* Unit *.preset files gained a new directive “ignore”, in addition to the existing “enable” and “disable”. As the name suggests it leaves units defined like this in their status quo, i.e. neither enables nor disables them.

* Service units gained a new setting DelegateSubgroup=. It takes the name of a sub-cgroup to place any processes the service manager forks off in. Previously, the service manager would place all service processes directly in the top-level cgroup it creates for them, no matter what. This usually meant that services with delegation enabled would first have to move themselves down some level in order to not conflict with the “no processes in inner cgroups” rule of cgroupv2. With this option it is now possible to configure the name of a subgroup to place all processes forked off by PID 1 in directly.

* The service manager will now look for .upholds/ directories, similar to the existing support for .wants/ and .requires/ directories, and uses contained symlinked units for creating Upholds= dependencies. The [Install] section of unit files gained support for a new UpheldBy= directive to generate these symlinks automatically when a unit is enabled.

* The service manager now supports a new kernel command line option systemd.default_device_timeout_sec=, which may be used to override the default timeout for .device units.

* A new “soft-reboot” mechanism has been added to the service manager.
A “soft reboot” is similar to a regular reboot, except that it affects userspace only: the service manager shuts down the running services and other units, then optionally switches into a new root file system (mounted to /run/nextroot/), and then passes control to a systemd instance in the new file system which then starts the system up again. The kernel is not rebooted and neither is hardware, firmware or boot loader. It is a fast, lightweight mechanism to quickly reset or update userspace, without the latency that a full system reset involves. Moreover, open file descriptors may be passed across the soft reboot into the new system where they will be passed back to the originating services. This allows pinning resources across the reboot, thus minimizing grey-out time further. Moreover, it is possible to allow specific crucial services to survive the reboot process, if they run off a separate root file system (i.e. use RootDirectory= or RootImage=, or are portable services). This new reboot mechanism is accessible via the new “systemctl soft-reboot” command.

* A new service setting MemoryKSM= has been added, which may be used to enable kernel same-page merging individually for services.

* A new service setting ImportCredentials= has been added that augments LoadCredential= and LoadCredentialEncrypted= and searches for credentials to import from the system, and supports globbing.

Journal:

* The sd-journal API learnt a new call sd_journal_get_seqnum() for retrieving the current log record’s sequence number and sequence number ID, which allows applications to order records the same way as the journal does internally already. The sequence number is now also exported in the JSON and “export” output of the journal.

* journalctl gained a new switch --truncate-newline. If specified, multi-line log records will be truncated at the first newline, i.e. only the first line of each log message is shown.
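The two Journal items above can be illustrated with a small client-side sketch. It assumes that the JSON output carries the sequence number and its ID in fields named __SEQNUM and __SEQNUM_ID (these names are an assumption, they are not stated above), and it uses made-up sample records rather than real journal data. The first_line() helper only mimics the effect described for --truncate-newline; it is not how journalctl implements it.

```python
import json

def order_by_seqnum(json_lines):
    """Parse journal JSON lines and sort by (sequence number ID, sequence number)."""
    records = [json.loads(line) for line in json_lines if line.strip()]
    # Sequence numbers are only comparable within one sequence number ID,
    # so group by ID first, then order numerically within each ID.
    # Field names are assumed, see the note above.
    return sorted(records, key=lambda r: (r["__SEQNUM_ID"], int(r["__SEQNUM"])))

def first_line(message):
    """Keep only the text up to the first newline of a log message."""
    return message.split("\n", 1)[0]

# Hypothetical sample records, deliberately out of order.
sample = [
    '{"__SEQNUM": "12", "__SEQNUM_ID": "aaaa", "MESSAGE": "second\\ndetail"}',
    '{"__SEQNUM": "9", "__SEQNUM_ID": "aaaa", "MESSAGE": "first"}',
]
ordered = order_by_seqnum(sample)
for rec in ordered:
    print(rec["__SEQNUM"], first_line(rec["MESSAGE"]))
```

Sorting numerically rather than lexically matters here, since the values arrive as strings.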
systemd-repart:

* systemd-repart’s drop-in files gained a new ExcludeFiles= option which may be used to exclude certain files from the effect of CopyFiles=, which allows populating newly created partitions automatically.

* systemd-repart’s Verity support now implements the Minimize= setting to minimize the size of the resulting partition.

* systemd-repart gained a new --offline= switch, which may be used to control whether images shall be built “online” or “offline”, i.e. whether to make use of kernel facilities such as loopback block devices and DM or not.

* If systemd-repart is told to populate a newly created ESP or XBOOTLDR partition with some files it will now default to VFAT rather than ext4, unless specified otherwise.

* systemd-repart gained a new --architecture= switch. If specified, the per-architecture GPT partition types (i.e. the root and /usr/ partitions) configured in the partition drop-in files are automatically adjusted to match the specified CPU architecture, in order to simplify cross-architecture DDI building.

systemd-boot, systemd-stub, ukify, bootctl, kernel-install:

* bootctl gained a new switch --print-root-device (or short: -R) that prints the main block device the root file system is backed by. It’s useful for invocations such as “cfdisk $(bootctl -R)” to quickly have a look at the partition table of the running OS.

* systemd-stub will now look for the SMBIOS Type 1 field “io.systemd.stub.kernel-cmdline-extra” and append its value to the kernel command line it invokes. This is useful for VMMs such as qemu to pass additional kernel command lines into the system even when booting via full UEFI. It’s measured into TPM PCR 12.

* The KERNEL_INSTALL_LAYOUT= setting for kernel-install gained a new value “auto”. If used, a kernel will be automatically analyzed, and if it qualifies as UKI it will be installed as if the setting was set to “uki”, otherwise via “bls”.
* systemd-stub can now optionally load UEFI PE “add-on” images that may contain additional kernel command line information. These “add-ons” superficially look like a regular UEFI executable, and are expected to be signed via SecureBoot/shim. However, they do not actually contain code, but instead a subset of the PE sections that UKIs support. They are supposed to provide a way to extend UKIs with additional resources in a secure and authenticated way. Currently, only the .cmdline PE section may be used in add-ons, in which case any specified string is appended to the command line embedded into the UKI itself. A new ‘addon.efi.stub’ is now provided that can be used to trivially create add-ons, via ‘ukify’ or ‘objcopy’. In the future we expect other sections to be made extensible like this as well.

* ukify has been updated to allow building these UEFI PE “add-on” images, using the new ‘addon.efi.stub’.

* ukify gained a new “genkey” verb for generating a set of key pairs to sign UKIs and their PCR data with.

* The kernel-install script has been rewritten in C, and reuses much of the infrastructure of existing tools such as bootctl. Moreover it gained support for --root= and --image= switches, to operate relative to some root file system or DDI. It also gained --esp-path= and --boot-path= options to override the path to the ESP, and the $BOOT partition. Options --make-entry-directory= and --entry-token= have been added as well, similar to bootctl’s options of the same name.

* A new kernel-install plugin 60-ukify has been added which will combine kernel/initrd locally into a UKI and sign it with a local key. This may be used to switch to UKI mode even on systems where a local kernel or initrd shall be supported. (Typically UKIs are built and signed on OS vendor systems.)

* The ukify tool now supports “pesign” in addition to the pre-existing “sbsign” for signing UKIs.
* systemd-measure and systemd-stub now look for a new .uname PE section that should encode the kernel’s “uname -r” string.

* systemd-measure may now calculate expected PCR hashes for a UKI “offline”, i.e. it requires no access to a TPM (neither physical nor software emulated).

Memory Pressure & Control:

* The sd-event API gained new calls sd_event_add_memory_pressure(), sd_event_source_set_memory_pressure_type(), sd_event_source_set_memory_pressure_period() for creating and configuring an event source that is called whenever the OS signals memory pressure. Another call sd_event_trim_memory() is provided that compacts the process’ memory use by releasing allocated but unused malloc() memory back to the kernel. Services can also provide their own custom callback to do memory trimming. This should improve system behaviour under memory pressure, as Linux traditionally provided no mechanism to return process memory back to the kernel if the kernel was under pressure to acquire some. This makes use of the kernel’s PSI interface. Most long-running services that systemd contains have been hooked up with this, and in particular systems with low memory should benefit from this.

* Service units learnt the new MemoryPressureWatch= and MemoryPressureThresholdSec= settings for configuring the PSI memory pressure logic individually. If these options are used the $MEMORY_PRESSURE_WATCH and $MEMORY_PRESSURE_WRITE environment variables will be set for the invoked service processes to inform them about the requested memory pressure behaviour. (This is used by the aforementioned sd-event API additions, if set.)

* systemd-analyze gained a new “malloc” verb that shows the output generated by glibc’s malloc_info() on services that support it. Right now, only the service manager has been updated accordingly.

User & Session Management:

* The sd-login API gained a new call sd_session_get_username() for returning the name of the user who owns a specific login session.
It also gained a new call sd_session_get_start_time() for retrieving the time the login session started. A new call sd_session_get_leader() has been added to return the PID of the “leader” process of a session. A new call sd_uid_get_login_time() returns the time since which the specified user has been continuously logged in with at least one session.

* JSON user records gained a new set of fields capabilityAmbientSet and capabilityBoundingSet which contain a list of POSIX capabilities to set for the logged in users in the ambient and bounding sets, respectively. homectl gained the ability to configure these two sets for users via --capability-bounding-set=/--capability-ambient-set=.

* pam_systemd learnt two new module options default-capability-bounding-set= + default-capability-ambient-set= to configure the default bounding and ambient sets for users as they are logging in, if the JSON user record doesn’t specify this explicitly (see above). The built-in default for the ambient set now contains CAP_WAKE_ALARM, thus allowing regular users who may log in locally to resume from a system suspend via a timer (see above).

* The Session D-Bus objects systemd-logind provides gained a new SetTTY() method call for updating the TTY of a session after it has been allocated already. This is useful for SSH sessions which are typically allocated first, and for which a TTY is added in later.

* The sd-daemon API gained a new call sd_pid_notifyf_with_fds() which combines the various other sd_pid_notify() flavours into one: it takes a format string, an overriding PID, and a set of file descriptors to send along. It also gained a new call sd_pid_notify_barrier() which is equivalent to sd_notify_barrier() but allows specification of the originating PID.

* “loginctl list-users” and “loginctl list-sessions” will now show the state of each logged in user/session in their tabular output. They will also show the current idle state of sessions.
DDIs:

* systemd-dissect will now show the intended CPU architecture of an inspected DDI.

* systemd-dissect will now install itself as mount helper for the “ddi” pseudo-file system type. This means you may now mount DDIs directly via /bin/mount or /etc/fstab, making full use of embedded Verity information and all other DDI features. Example: mount -t ddi myimage.raw /some/where

* The systemd-dissect tool gained the new switches --attach/--detach for attaching a DDI to a loopback block device without mounting it. It will automatically derive the right sector size from the image and set up Verity and similar, but not mount the file systems in it.

* When systemd-gpt-auto-generator or the DDI mounting logic mount an ESP or XBOOTLDR partition the MS_NOSYMFOLLOW mount option is now implied. Given that these file systems are typically untrusted territory this should make mounting them automatically have less of a security impact.

* All tools that parse DDIs (such as systemd-nspawn, systemd-dissect, systemd-tmpfiles, …) now understand a new switch --image-policy= which takes a string encoding image dissection policy. With this mechanism automatic discovery and use of specific partition types and the cryptographic requirements on the partitions (Verity, LUKS, …) can be restricted, permitting better control of the exposed attack surfaces when mounting disk images. systemd-gpt-auto-generator will honour such an image policy too, configurable via the systemd.image_policy= kernel command line option. Unit files gained the RootImagePolicy=, MountImagePolicy= and ExtensionImagePolicy= settings to configure the same for disk images a service runs off.

* systemd-analyze gained a new verb “image-policy” for validating and parsing image policy strings.

* systemd-dissect gained support for a new --validate switch for superficially validating DDI structure, and checking whether a specific image policy allows the DDI.
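To make the idea of an image policy string more concrete, here is a rough parsing sketch. The text above does not spell out the string syntax; the sketch assumes the colon-separated “partition=flag+flag” form documented in systemd.image-policy(7), where an empty partition name sets the default for all partitions not listed. The function name and the example policy are purely illustrative, and special short-hand policies are not handled.

```python
def parse_image_policy(policy):
    """Split a policy string into {partition: set(flags)}.

    The empty-string key holds the default applied to unlisted partitions.
    Syntax is assumed (see the note above), not taken from the release notes.
    """
    result = {}
    for item in policy.split(":"):
        if not item:
            continue
        name, sep, flags = item.partition("=")
        if not sep:
            raise ValueError(f"missing '=' in policy item: {item!r}")
        result[name] = set(flags.split("+")) if flags else set()
    return result

# Hypothetical policy: root must be Verity-protected and signed, /usr/ must be
# Verity-protected, everything else has to be unused or absent.
policy = parse_image_policy("root=verity+signed:usr=verity:=unused+absent")
for partition, flags in policy.items():
    print(partition or "(default)", sorted(flags))
```

For real validation the “image-policy” verb of systemd-analyze mentioned above is the authoritative tool; this sketch only shows the general shape of the data.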
Network Management:

* networkd’s GENEVE support has gained a new .network option InheritInnerProtocol=.

Device Management:

* udevadm gained the new “verify” verb for validating udev rules files offline.

* udev will now create symlinks to loopback block devices in the /dev/loop/by-ref/ directory that are based on the .lo_file_name string field selected during allocation. The systemd-dissect tool and the util-linux losetup command now support a complementing new switch --loop-ref= for selecting the string. This means a loopback block device may now be allocated under a caller-chosen reference and can subsequently be referenced by that without first having to look up the block device name the caller ended up with.

* udev also creates symlinks to loopback block devices in the /dev/loop/by-ref/ directory based on the .st_dev/st_ino fields of the inode attached to the loopback block device. This means that attaching a file to a loopback device will implicitly make a handle available to be found via that file’s inode information.

* udev gained a new tool “iocost” that can be used to configure QoS IO cost data based on hwdb information onto suitable block devices. Also see https://github.com/iocost-benchmark/iocost-benchmarks.

TPM2 Support + Disk Encryption & Authentication:

* systemd-cryptenroll/systemd-cryptsetup will now install a TPM2 SRK (“Storage Root Key”) as first step in the TPM2, and then use that for binding FDE to, if TPM2 support is used. This matches recommendations of TCG (see https://trustedcomputinggroup.org/wp-content/uploads/TCG-TPM-v2.0-Provisioning-Guidance-Published-v1r1.pdf)

* systemd-cryptenroll and other tools that take TPM2 PCR parameters now understand textual identifiers for these PCRs.
* systemd-veritysetup + /etc/veritytab gained support for a series of new options: hash-offset=, superblock=, format=, data-block-size=, hash-block-size=, data-blocks=, salt=, uuid=, hash=, fec-device=, fec-offset=, fec-roots= to configure various aspects of a Verity volume.

* systemd-cryptsetup + /etc/crypttab gained support for a new veracrypt-pim= option for setting the Personal Iteration Multiplier of veracrypt volumes.

* systemd-integritysetup + /etc/integritytab gained support for a new mode= setting for controlling the dm-integrity mode (journal, bitmap, direct) for the volume.

* systemd-analyze gained a new verb “pcrs” that shows the known TPM PCR registers, their symbolic names and current values.

systemd-tmpfiles:

* The ACL support in tmpfiles.d/ has been updated: if an uppercase “X” access right is specified this is equivalent to “x” but only if the inode in question already has the executable bit set for at least some user/group. Otherwise the “x” bit will be turned off.

* tmpfiles.d/’s C line type now understands a new modifier “+”: a line with C+ will result in a “merge” copy, i.e. all files of the source tree are copied into the target tree, even if that tree already exists, resulting in a combined tree of files already present in the target tree and those copied in.

* systemd-tmpfiles gained a new --graceful switch. If specified, lines with unknown users/groups will silently be skipped.

systemd-notify:

* systemd-notify gained two new options --fd= and --fdname= for sending arbitrary file descriptors to the service manager (while specifying an explicit name for them).

* systemd-notify gained a new --exec switch, which makes it execute the specified command line after sending the requested messages. This is useful for sending out READY=1 first, and then continuing invocation without changing process ID, so that the tool can be nicely used within an ExecStart= line of a unit file that uses Type=notify.
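What sending a notification together with file descriptors looks like at the socket level can be sketched as follows. This is only an illustration of the general sd_notify() datagram protocol, not the implementation of systemd-notify: the function name notify_with_fds is hypothetical, and the FDSTORE=1 and FDNAME= fields used in the usage note come from the sd_notify() documentation rather than from the text above. It needs Python 3.9 or newer for socket.send_fds().

```python
import os
import socket

def notify_with_fds(state, fds=()):
    """Send a notification datagram (optionally with fds) to $NOTIFY_SOCKET.

    Returns False when no notification socket is configured.
    """
    path = os.environ.get("NOTIFY_SOCKET")
    if not path:
        return False  # not started by a service manager that wants notifications
    if path.startswith("@"):
        path = "\0" + path[1:]  # leading "@" denotes the abstract socket namespace
    with socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM) as sock:
        sock.connect(path)
        if fds:
            # File descriptors travel as SCM_RIGHTS ancillary data.
            socket.send_fds(sock, [state.encode()], list(fds))
        else:
            sock.sendall(state.encode())
    return True

if __name__ == "__main__":
    # Plain readiness message; returns False when run outside a service.
    print(notify_with_fds("READY=1"))
```

Under these assumptions, pushing a descriptor into the fd store under a chosen name would amount to a call such as notify_with_fds("FDSTORE=1\nFDNAME=demo", [fd]).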
sd-event + sd-bus APIs:

* The sd-event API gained a new call sd_event_source_leave_ratelimit() which may be used to explicitly end a rate-limit state an event source might be in, resetting all rate limiting counters.

* When the sd-bus library is used to make connections to AF_UNIX D-Bus sockets, it will now encode the “description” one can set via sd_bus_set_description() into the source socket address. It will also look for this information when accepting a connection. This is useful to track individual D-Bus connections on a D-Bus broker for debug purposes.

systemd-resolved:

* systemd-resolved gained a new resolved.conf setting StateRetentionSec= which may be used to retain cached DNS records even after their nominal TTL, and use them in case upstream DNS servers cannot be reached. This should make name resolution more resilient in case of network problems.

* resolvectl gained a new verb “show-cache” for showing current cache contents of systemd-resolved.

Other:

* The default keymap to apply may now be chosen at build-time via the new default-keymap meson option.

* Most of systemd’s long-running services now have a generic handler for the SIGRTMIN+18 signal which executes various operations depending on the sigqueue() parameter sent along. For example, values 0x100…0x107 allow changing the maximum log level of such services. 0x200…0x203 allow changing the log target of such services. 0x300 makes the services trim their memory similar to the automatic PSI triggered action, see above. 0x301 makes the services output their malloc_info() data to the logs.

* machinectl gained new “edit” and “cat” verbs for editing .nspawn files, inspired by systemctl’s verbs of the same name which edit unit files. Similarly, networkctl gained the same verbs for editing .network, .netdev, .link files.

* A new syscall filter group “@sandbox” has been added that contains system calls used for sandboxing, such as those for seccomp and Landlock.
* New documentation has been added: https://systemd.io/COREDUMP https://systemd.io/MEMORY_PRESSURE

* systemd-firstboot gained a new --reset option. If specified, the settings in /etc/ it normally initializes are reset instead.

* systemd-sysext is now a multi-call binary and also installed under the systemd-confext alias name (via a symlink). When invoked that way it will operate on /etc/ instead of /usr/ + /opt/. It thus becomes a powerful, atomic, secure configuration management mechanism of sorts, that locally can merge configuration from multiple confext configuration images into a single immutable tree.

* The --network-macvlan=, --network-ipvlan=, --network-interface= switches of systemd-nspawn may now optionally take the intended network interface name inside the container.

* All our programs will now send an sd_notify() message with their exit status in the EXIT_STATUS= field when exiting, using the usual protocol, including PID 1. This is useful for VMMs and container managers to collect an exit status from a system as it shuts down, as set via “systemctl exit …”. This is particularly useful in test cases and similar, as invocations via a VM can now nicely propagate an exit status to the host, similar to local processes.

* systemd-run gained a new switch --expand-environment=no to disable server-side environment variable expansion in specified command lines.

* The systemd-system-update-generator has been updated to also look for the special flag file /etc/system-update in addition to the existing support for /system-update to decide whether to enter system update mode.

* The /dev/hugepages/ file system is now mounted with nosuid + nodev mount options by default.

* systemd-fstab-generator now understands two new kernel command line options systemd.mount-extra= and systemd.swap-extra= which may be used to configure additional mounts or swaps via the kernel command line, in a format similar to /etc/fstab lines.
* systemd-sysupdate’s sysupdate.d/ drop-ins gained a new setting PathRelativeTo=, which can be set to “esp”, “xbootldr”, “boot”, in which case the Path= setting is taken relative to the ESP or XBOOTLDR partitions, rather than the system’s root directory /. The relevant directories are automatically discovered.

* The systemd-ac-power tool gained a new switch --low, which reports whether the battery charge is considered “low”, similar to how the s2h suspend logic checks this state to decide whether to enter system suspend or hibernation.

* The /etc/os-release file now has two new optional fields VENDOR_NAME= and VENDOR_URL= carrying information about the vendor of the OS.

* When the system hibernates, information about the used device and offset is now written to a non-volatile EFI variable. On next boot the system will attempt to resume from the location indicated in this EFI variable. This should make hibernation a lot more robust, and requires no manual configuration of the resume location.

* The $XDG_STATE_HOME environment variable (added in more recent versions of the XDG basedir specification) is now honoured to implement the StateDirectory= setting in user services.

* A new component “systemd-battery-check” has been added. It may run during early boot (usually in the initrd), and checks the battery charge level of the system. In case the charge level is very low the user is notified (graphically via Plymouth, if available, as well as in text form on the console), and the system is turned off after a 10s delay.

CHANGES WITH 253:

Announcements of Future Feature Removals and Incompatible Changes:

* We intend to remove cgroup v1 support from a systemd release after the end of 2023. If you run services that make explicit use of cgroup v1 features (i.e. the “legacy hierarchy” with separate hierarchies for each controller), please implement compatibility with cgroup v2 (i.e. the “unified hierarchy”) sooner rather than later.
Most of Linux userspace has been ported over already.

* We intend to remove support for split-usr (/usr mounted separately during boot) and unmerged-usr (parallel directories /bin and /usr/bin, /lib and /usr/lib, etc). This will happen in the second half of 2023, in the first release that falls into that time window. For more details, see: https://lists.freedesktop.org/archives/systemd-devel/2022-September/048352.html

* We intend to change behaviour w.r.t. units of the per-user service manager and sandboxing options, so that they work without having to manually enable PrivateUsers= as well, which is not required for system units. To make this work, we will implicitly enable user namespaces (PrivateUsers=yes) when a sandboxing option is enabled in a user unit. The drawback is that system users will no longer be visible (and appear as ‘nobody’) to the user unit when a sandboxing option is enabled. By definition a sandboxed user unit should run with reduced privileges, so impact should be small. This will remove a great source of confusion that has been reported by users over the years, due to how these options require an extra setting to be manually enabled when used in the per-user service manager, as opposed to the system service manager. We plan to enable this change in the next release later this year. For more details, see: https://lists.freedesktop.org/archives/systemd-devel/2022-December/048682.html

Deprecations and incompatible changes:

* systemctl will now warn when invoked without /proc/ mounted (e.g. when invoked after chroot() into a directory tree without the API mount points like /proc/ being set up). Operation in such an environment is not fully supported.

* The return value of ‘systemctl is-active|is-enabled|is-failed’ for unknown units is changed: previously 1 or 3 were returned, but now 4 (EXIT_PROGRAM_OR_SERVICES_STATUS_UNKNOWN) is used as documented.

* ‘udevadm hwdb’ subcommand is deprecated and will emit a warning.
systemd-hwdb (added in 2014) should be used instead.

* ‘bootctl --json’ now outputs a single JSON array, instead of a stream of newline-separated JSON objects.

* Udev rules in 60-evdev.rules have been changed to load hwdb properties for all modalias patterns. Previously only the first matching pattern was used. This could change what properties are assigned if the user has more and less specific patterns that could match the same device, but it is expected that the change will have no effect for most users.

* systemd-networkd-wait-online exits successfully when all interfaces are ready or unmanaged. Previously, if neither ‘--any’ nor ‘--interface=’ options were used, at least one interface had to be in configured state. This change allows the case where systemd-networkd is enabled, but no interfaces are configured, to be handled gracefully. It may occur in particular when a different network manager is also enabled and used.

* Some compatibility helpers were dropped: EmergencyAction= in the user manager, as well as measuring kernel command line into PCR 8 in systemd-stub, along with the -Defi-tpm-pcr-compat compile-time option.

* The ‘-Dupdate-helper-user-timeout=’ build-time option has been renamed to ‘-Dupdate-helper-user-timeout-sec=’, and now takes an integer as parameter instead of a string.

* The DDI image dissection logic (which backs RootImage= in service unit files, the --image= switch in various tools such as systemd-nspawn, as well as systemd-dissect) will now only mount file systems of types btrfs, ext4, xfs, erofs, squashfs, vfat. This list can be overridden via the $SYSTEMD_DISSECT_FILE_SYSTEMS environment variable. These file systems are fairly well supported and maintained in current kernels, while others are usually more niche, exotic or legacy and thus typically do not receive the same level of security support and fixes.

* The default per-link multicast DNS mode is changed to “yes” (that was previously “no”).
As the default global multicast DNS mode has been “yes” (but can be changed by the build option), multicast DNS is now enabled on all links by default. You can disable multicast DNS on all links by setting MulticastDNS= in resolved.conf, or on an interface by calling “resolvectl mdns INTERFACE no”.

New components:

* A tool ‘ukify’ to build, measure, and sign Unified Kernel Images (UKIs) has been added. This replaces functionality provided by ‘dracut --uefi’ and extends it with automatic calculation of PE file offsets, insertion of signed PCR policies generated by systemd-measure, support for initrd concatenation, signing of the embedded Linux image and the combined image with sbsign, and heuristics to autodetect the kernel uname and verify the splash image.

Changes in systemd and units:

* A new service type Type=notify-reload is defined. When such a unit is reloaded a UNIX process signal (typically SIGHUP) is sent to the main service process. The manager will then wait until it receives a “RELOADING=1” followed by a “READY=1” notification from the unit as response (via sd_notify()). Otherwise, this type is the same as Type=notify. A new setting ReloadSignal= may be used to change the signal to send from the default of SIGHUP. user@.service, systemd-networkd.service, systemd-udevd.service, and systemd-logind have been updated to this type.

* Initrd environments which are not on a pure memory file system (e.g. overlayfs combination as opposed to tmpfs) are now supported. With this change, during the initrd → host transition (“switch root”) systemd will erase all files of the initrd only when the initrd is backed by a memory file system such as tmpfs.

* A new per-unit MemoryZSwapMax= option has been added to configure the memory.zswap.max cgroup property (the maximum amount of zswap used).

* A new LogFilterPatterns= option has been added for units.
It may be used to specify accept/deny regular expressions for log messages generated by the unit, that shall be enforced by systemd-journald. Rejected messages are neither stored in the journal nor forwarded. This option may be used to suppress noisy or uninteresting messages from units.

* The manager has a new org.freedesktop.systemd1.Manager.GetUnitByPIDFD() D-Bus method to query process ownership via a PIDFD, which is more resilient against PID recycling issues.

* Scope units now support OOMPolicy=. Login session scopes default to OOMPolicy=continue, allowing login scopes to survive the OOM killer terminating some processes in the scope.

* systemd-fstab-generator now supports x-systemd.makefs option for /sysroot/ (in the initrd).

* The maximum rate at which daemon reloads are executed can now be limited with the new ReloadLimitIntervalSec=/ReloadLimitBurst= options. (Or the equivalent on the kernel command line: systemd.reload_limit_interval_sec=/systemd.reload_limit_burst=). In addition, systemd now logs the originating unit and PID when a reload request is received over D-Bus.

* When enabling a swap device systemd will now reinitialize the device when the page size of the swap space does not match the page size of the running kernel. Note that this requires the ‘swapon’ utility to provide the ‘--fixpgsz’ option, as implemented by util-linux, and it is not supported by busybox at the time of writing.

* systemd now executes generator programs in a mount namespace “sandbox” with most of the file system read-only and write access restricted to the output directories, and with a temporary /tmp/ mount provided. This provides a safeguard against programming errors in the generators, but also fixes here-docs in shells, which previously didn’t work in early boot when /tmp/ wasn’t available yet. (This feature has no security implications, because the code is still privileged and can trivially exit the sandbox.)
* The system manager will now parse a new “vmm.notify_socket” system credential, which may be supplied to a VM via SMBIOS. If found, the manager will send a “READY=1” notification on the specified socket after boot is complete. This allows readiness notification to be sent from a VM guest to the VM host over a VSOCK socket.

* The sample PAM configuration file for systemd-user@.service now includes a call to pam_namespace. This puts children of user@.service in the expected namespace. (Many distributions replace their file with something custom, so this change has limited effect.)

* A new environment variable $SYSTEMD_DEFAULT_MOUNT_RATE_LIMIT_BURST can be used to override the mount unit burst rate limit for parsing ‘/proc/self/mountinfo’, which was introduced in v249. Defaults to 5.

* Drop-ins for init.scope changing control group resource limits are now applied, while they were previously ignored.

* New build-time configuration options ‘-Ddefault-timeout-sec=’ and ‘-Ddefault-user-timeout-sec=’ have been added, to let distributions choose the default timeout for starting/stopping/aborting system and user units respectively.

* Service units gained a new setting OpenFile= which may be used to open arbitrary files in the file system (or connect to arbitrary AF_UNIX sockets in the file system), and pass the open file descriptor to the invoked process via the usual file descriptor passing protocol. This is useful to give unprivileged services access to select files which have restrictive access modes that would normally not allow this. It’s also useful in case RootDirectory= or RootImage= is used to allow access to files from the host environment (which is after all not visible from the service if these two options are used.)

Changes in udev:

* The new net naming scheme “v253” has been introduced. In the new scheme, ID_NET_NAME_PATH is also set for USB devices not connected via a PCI bus. This extends the coverage of predictable interface names in some embedded systems.
The “amba” bus path is now included in ID_NET_NAME_PATH, resulting in a more informative path on some embedded systems.

* Partition block devices will now also get symlinks in /dev/disk/by-diskseq/-part, which may be used to reference block device nodes via the kernel’s “diskseq” value. Previously those symlinks were only created for the main block device.

* A new operator ‘-=’ is supported for SYMLINK variables. This allows symlinks to be unconfigured even if an earlier rule added them.

* ‘udevadm trigger --settle’ now also works for network devices that are being renamed.

Changes in sd-boot, bootctl, and the Boot Loader Specification:

* systemd-boot now passes its random seed directly to the kernel’s RNG via the LINUX_EFI_RANDOM_SEED_TABLE_GUID configuration table, which means the RNG gets seeded very early in boot before userspace has started.

* systemd-boot will pass a disk-backed random seed – even when secure boot is enabled – if it can additionally get a random seed from EFI itself (via EFI’s RNG protocol), or a prior seed in LINUX_EFI_RANDOM_SEED_TABLE_GUID from a preceding bootloader.

* systemd-boot-system-token.service was renamed to systemd-boot-random-seed.service and extended to always save a random seed to ESP on every boot when a compatible boot loader is used. This allows a refreshed random seed to be used in the boot loader.

* systemd-boot handles various seed inputs using a domain- and field-separated hashing scheme.

* systemd-boot’s ‘random-seed-mode’ option has been removed. A system token is now always required to be present for random seeds to be used.

* systemd-boot now supports being loaded from other locations than the ESP, for example for direct kernel boot under QEMU or when embedded into the firmware.

* systemd-boot now parses SMBIOS information to detect virtualization. This information is used to skip some warnings which are not useful in a VM and to conditionalize other aspects of behaviour.
* systemd-boot now supports a new ‘if-safe’ mode that will perform UEFI Secure Boot automated certificate enrollment from the ESP only if it is considered ‘safe’ to do so. At the moment ‘safe’ means running in a virtual machine.

* systemd-stub now processes random seeds in the same way as systemd-boot already does, in case a unified kernel image is being used from a different bootloader than systemd-boot, or without any boot loader at all.

* bootctl will now generate a system token on all EFI systems, even virtualized ones, and is activated in the case that the system token is missing from either sd-boot or sd-stub booted systems.

* bootctl now implements two new verbs: ‘kernel-identify’ prints the type of a kernel image file, and ‘kernel-inspect’ provides information about the embedded command line and kernel version of UKIs.

* bootctl now honours $KERNEL_INSTALL_CONF_ROOT with the same meaning as for kernel-install.

* The JSON output of “bootctl list” will now contain two more fields: isDefault and isSelected are boolean fields set to true on the default and currently booted boot menu entries.

* bootctl gained a new verb “unlink” for removing a boot loader entry type #1 file from disk in a safe and robust way.

* bootctl also gained a new verb “cleanup” that automatically removes all files from the ESP’s and XBOOTLDR’s “entry-token” directory that are not referenced anymore by any installed Type #1 boot loader specification entry. This is particularly useful in environments where a large number of entries reference the same or partly the same resources (for example, for snapshot-based setups).

Changes in kernel-install:

* A new “installation layout” can be configured as layout=uki. With this setting, a Boot Loader Specification Type #1 entry will not be created. Instead, a new kernel-install plugin 90-uki-copy.install will copy any .efi files from the staging area into the boot partition. A plugin to generate the UKI .efi file must be provided separately.
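As an illustration of the bootctl JSON changes noted above (a single JSON array, with the isDefault and isSelected booleans on each entry), here is a minimal Python sketch that picks out the default and the currently booted entry from already-captured output. Only the two boolean field names come from these notes; the sample document and its “id” values are made up for the example, and treating a missing field as false is an assumption of the sketch.

```python
import json


def pick_entries(json_text):
    """Return (default_entry, booted_entry) from `bootctl list` JSON output.

    Assumes the output is a single JSON array of objects, and that an entry
    lacking isDefault/isSelected should be treated as false.
    """
    entries = json.loads(json_text)
    if not isinstance(entries, list):
        raise ValueError("expected a JSON array of boot entries")
    default = next((e for e in entries if e.get("isDefault", False)), None)
    booted = next((e for e in entries if e.get("isSelected", False)), None)
    return default, booted


# Hypothetical sample; real output carries more fields per entry.
SAMPLE = """
[
  {"id": "fooos-6.1.conf", "isDefault": true,  "isSelected": false},
  {"id": "fooos-6.0.conf", "isDefault": false, "isSelected": true}
]
"""

if __name__ == "__main__":
    default, booted = pick_entries(SAMPLE)
    print("default:", default["id"])
    print("booted:", booted["id"])
```

Because the output is now one array rather than a stream of objects, a single json.loads() call is enough; no line-by-line parsing is needed.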
Changes in systemctl:

* ‘systemctl reboot’ has dropped support for accepting a positional argument as the argument to the reboot(2) syscall. Please use the --reboot-argument= option instead.

* ‘systemctl disable’ will now warn when called on units without install information. A new --no-warn option has been added that silences this warning.

* New option ‘--drop-in=’ can be used to tell ‘systemctl edit’ the name of the drop-in to edit. (Previously, ‘override.conf’ was always used.)

* ‘systemctl list-dependencies’ now respects --type= and --state=.

* ‘systemctl kexec’ now supports XEN VMM environments.

* ‘systemctl edit’ will now tell the invoked editor to jump into the first line with actual unit file data, skipping over synthesized comments.

Changes in systemd-networkd and related tools:

* The [DHCPv4] section in .network file gained new SocketPriority= setting that assigns the Linux socket priority used by the DHCPv4 raw socket. This may be used in conjunction with the EgressQOSMaps= setting in [VLAN] section of .netdev file to send the desired ethernet 802.1Q frame priority for DHCPv4 initial packets. This cannot be achieved with netfilter mangle tables because of the raw socket bypass.

* The [DHCPv4] and [IPv6AcceptRA] sections in .network file gained a new QuickAck= boolean setting that enables the TCP quick ACK mode for the routes configured by the acquired DHCPv4 lease or received router advertisements (RAs).

* The RouteMetric= option (for DHCPv4, DHCPv6, and IPv6 advertised routes) now accepts three values, for high, medium, and low preference of the router (which can be set with the RouterPreference= setting).

* systemd-networkd-wait-online now supports matching via alternative interface names.

* The [DHCPv6] section in .network file gained new SendRelease= setting which enables the DHCPv6 client to send release when it stops. This is the analog of the [DHCPv4] SendRelease= setting. It is enabled by default.
* If the Address= setting in [Network] or [Address] sections in .network is specified without its prefix length, systemd-networkd now assumes /32 for IPv4 or /128 for IPv6 addresses.

* networkctl shows network and link file dropins in status output.

Changes in systemd-dissect:

* systemd-dissect gained a new option --list, to print the paths of all files and directories in a DDI.

* systemd-dissect gained a new option --mtree, to generate a file manifest compatible with BSD mtree(5) of a DDI.

* systemd-dissect gained a new option --with, to execute a command with the specified DDI temporarily mounted and used as working directory. This is for example useful to convert a DDI to “tar” simply by running it within a “systemd-dissect --with” invocation.

* systemd-dissect gained a new option --discover, to search for Discoverable Disk Images (DDIs) in well-known directories of the system. This will list machine, portable service and system extension disk images.

* systemd-dissect now understands 2nd stage initrd images stored as a Discoverable Disk Image (DDI).

* systemd-dissect will now display the main UUID of GPT DDIs (i.e. the disk UUID stored in the GPT header) among the other data it can show.

* systemd-dissect gained a new --in-memory switch to operate on an in-memory copy of the specified DDI file. This is useful to access a DDI with write access without persisting any changes. It’s also useful for accessing a DDI without keeping the originating file system busy.

* The DDI dissection logic will now automatically detect the intended sector size of disk images stored in files, based on the GPT partition table arrangement. Loopback block devices for such DDIs will then be configured automatically for the right sector size. This is useful to make dealing with modern 4K sector size DDIs fully automatic. The systemd-dissect tool will now show the detected sector size among the other DDI information in its output.
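To illustrate how a sector size can be inferred from the GPT arrangement of an image file, as described for the DDI dissection logic above, here is a small Python sketch. It relies only on the general GPT layout (the header lives in LBA 1 and begins with the signature “EFI PART”), so the header sits at a byte offset equal to the sector size. This is a simplified heuristic written for this example, not systemd’s actual implementation, and the candidate size list and helper names are assumptions of the sketch.

```python
GPT_SIGNATURE = b"EFI PART"
CANDIDATE_SECTOR_SIZES = (512, 1024, 2048, 4096)


def probe_gpt_sector_size(image):
    """Guess the sector size of a GPT disk image held in `image` (bytes).

    The GPT header sits in LBA 1, i.e. at byte offset == sector size, and
    starts with the 8-byte signature "EFI PART". Returns the sector size if
    exactly one candidate offset carries the signature, otherwise None.
    """
    matches = [
        size for size in CANDIDATE_SECTOR_SIZES
        if image[size:size + len(GPT_SIGNATURE)] == GPT_SIGNATURE
    ]
    return matches[0] if len(matches) == 1 else None


def make_fake_image(sector_size):
    """Build a minimal stand-in image: zeroed LBA 0, signature at LBA 1."""
    image = bytearray(2 * sector_size)
    image[sector_size:sector_size + len(GPT_SIGNATURE)] = GPT_SIGNATURE
    return bytes(image)


if __name__ == "__main__":
    for size in CANDIDATE_SECTOR_SIZES:
        print(size, probe_gpt_sector_size(make_fake_image(size)))
```

Returning None when zero or several offsets match keeps the sketch conservative: an ambiguous image is reported as undetected instead of being guessed at.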
Changes in systemd-repart:

* systemd-repart gained new options --include-partitions= and --exclude-partitions= to filter operation on partitions by type UUID. This allows systemd-repart to be used to build images in which the type of one partition is set based on the contents of another partition (for example when the boot partition shall include a verity hash of the root partition).

* systemd-repart also gained a --defer-partitions= option that is similar to --exclude-partitions=, except that the size of the deferred partition is still taken into account when sizing partitions; the partition itself is just not populated.

* systemd-repart gained a new --sector-size= option to specify what sector size should be used when an image is created.

* systemd-repart now supports generating erofs file systems via CopyFiles= (a read-only file system similar to squashfs).

* The Minimize= option was extended to accept “best” (which means the most minimal image possible, but may require multiple attempts) and “guess” (which means a reasonably small image).

* The systemd-growfs binary now comes with a regular unit file template systemd-growfs@.service which can be instantiated directly for any desired file system. (Previously, the unit was generated dynamically by various generators, but no regular unit file template was available.)

Changes in journal tools:

* Various systemd tools will append extra fields to log messages when in debug mode, or when SYSTEMD_ENABLE_LOG_CONTEXT=1 is set. Currently this includes information about D-Bus messages when sd-bus is used, e.g. DBUS_SENDER=, DBUS_DESTINATION=, and DBUS_PATH=, and information about devices when sd-device is used, e.g. DEVNAME= and DRIVER=. Details of what is logged and when are subject to change.

* The systemd-journald-audit.socket can now be disabled via the usual “systemctl disable” mechanism to stop collection of audit messages.
Please note that it is not enabled statically anymore and must be handled by the preset/enablement logic in package installation scripts.

* New options MaxUse=, KeepFree=, MaxFileSize=, and MaxFiles= can be used to curtail disk use by systemd-journal-remote. This is similar to the options supported by systemd-journald.

Changes in systemd-cryptenroll, systemd-cryptsetup, and related components:

* When enrolling new keys systemd-cryptenroll now supports unlocking via FIDO2 tokens (option --unlock-fido2-device=). Previously, a password was strictly required to be specified.

* systemd-cryptsetup now supports pre-flight requests for FIDO2 tokens (except for tokens with user verification, UV) to identify tokens before authentication. Multiple FIDO2 tokens can now be enrolled at the same time, and systemd-cryptsetup will automatically select one that corresponds to one of the available LUKS key slots.

* systemd-cryptsetup now supports new options tpm2-measure-bank= and tpm2-measure-pcr= in crypttab(5). These allow specifying the TPM2 PCR bank and number into which the volume key should be measured. This is automatically enabled for the encrypted root volume discovered and activated by systemd-gpt-auto-generator.

* systemd-gpt-auto-generator mounts the ESP and XBOOTLDR partitions with “noexec,nosuid,nodev”.

* systemd-gpt-auto-generator will now honour the rootfstype= and rootflags= kernel command line switches for root file systems it discovers, to match behaviour in case an explicit root fs is specified via root=.

* systemd-pcrphase gained new options --machine-id and --file-system= to measure the machine-id and mount point information into PCR 15. New service unit files systemd-pcrmachine.service and systemd-pcrfs@.service have been added that invoke the tool with these switches during early boot.

* systemd-pcrphase gained a --graceful switch that will make it exit cleanly with a success exit code even if no TPM device is detected.
* systemd-cryptenroll now stores the user-supplied PIN with a salt, making it harder to brute-force.

Changes in other tools:

* systemd-homed gained support for luksPbkdfForceIterations (the intended number of iterations for the PBKDF operation on LUKS).

* Environment variables $SYSTEMD_HOME_MKFS_OPTIONS_BTRFS, $SYSTEMD_HOME_MKFS_OPTIONS_EXT4, and $SYSTEMD_HOME_MKFS_OPTIONS_XFS may now be used to specify additional arguments for mkfs when systemd-homed formats a file system.

* systemd-hostnamed now exports the contents of /sys/class/dmi/id/bios_vendor and /sys/class/dmi/id/bios_date via two new D-Bus properties: FirmwareVendor and FirmwareDate. This allows unprivileged code to access those values. systemd-hostnamed also exports the SUPPORT_END= field from os-release(5) as OperatingSystemSupportEnd. hostnamectl makes use of this to show the status of the installed system.

* systemd-measure gained an --append= option to sign multiple phase paths with different signing keys. This allows secrets to be accessible only in certain parts of the boot sequence. Note that ‘ukify’ provides similar functionality in a more accessible form.

* systemd-timesyncd will now write a structured log message with MESSAGE_ID set to SD_MESSAGE_TIME_BUMP when it bumps the clock based on an on-disk timestamp, similarly to what it did when reaching synchronization via NTP.

* systemd-timesyncd will now update the on-disk timestamp file on each boot at least once, making it more likely that the system time increases in subsequent boots.

* systemd-vconsole-setup gained support for system/service credentials: vconsole.keymap/vconsole.keymap_toggle and vconsole.font/vconsole.font_map/vconsole.font_unimap are analogous to the similarly-named options in vconsole.conf.

* systemd-localed will now save the XKB keyboard configuration to /etc/vconsole.conf, and also read it from there with a higher preference than the /etc/X11/xorg.conf.d/00-keyboard.conf config file.
Previously, this information was stored in the former file in converted form, and only in the latter file in the original form. Tools which want to access keyboard configuration can now do so from a standard location.

* systemd-resolved gained support for configuring the nameservers and search domains via kernel command line (nameserver=, domain=) and credentials (network.dns, network.search_domains).

* systemd-resolved will now synthesize host names for the DNS stub addresses it supports. Specifically when “_localdnsstub” is resolved, 127.0.0.53 is returned, and if “_localdnsproxy” is resolved 127.0.0.54 is returned.

* systemd-notify will now send a “RELOADING=1” notification when called with --reloading, and “STOPPING=1” when called with --stopping. This can be used to implement notifications from units where it’s easier to call a program than to use the sd-daemon library.

* systemd-analyze’s ‘plot’ command can now output its information in JSON, controlled via the --json= switch. Also, new --table and --no-legend options have been added.

* ‘machinectl enable’ will now automatically enable machines.target unit in addition to adding the machine unit to the target. Similarly, ‘machinectl start|stop’ gained a --now option to enable or disable the machine unit when starting or stopping it.

* systemd-sysusers will now create /etc/ if it is missing.

* systemd-sleep ‘HibernateDelaySec=’ setting is changed back to pre-v252’s behaviour, and a new ‘SuspendEstimationSec=’ setting is added to provide the new initial value for the new automated battery estimation functionality. If ‘HibernateDelaySec=’ is set to any value, the automated estimate (and thus the automated hibernation on low battery to avoid data loss) functionality will be disabled.

* Default tmpfiles.d/ configuration will now automatically create credentials storage directory ‘/etc/credstore/’ with the appropriate, secure permissions.
If ‘/run/credstore/’ exists, its permissions will be fixed too in case they are not correct.

Changes in libsystemd and shared code:

* sd-bus gained new convenience functions sd_bus_emit_signal_to(), sd_bus_emit_signal_tov(), and sd_bus_message_new_signal_to().

* sd-id128 functions now return -EUCLEAN (instead of -EIO) when the 128bit ID in files such as /etc/machine-id has an invalid format. They also accept NULL as output parameter in more places, which is useful when the caller only wants to validate the inputs and does not need the output value.

* sd-login gained new functions sd_pidfd_get_session(), sd_pidfd_get_owner_uid(), sd_pidfd_get_unit(), sd_pidfd_get_user_unit(), sd_pidfd_get_slice(), sd_pidfd_get_user_slice(), sd_pidfd_get_machine_name(), and sd_pidfd_get_cgroup(), that are analogous to sd_pid_get_*(), but accept a PIDFD instead of a PID.

* sd-path (and systemd-path) now export four new paths: SD_PATH_SYSTEMD_SYSTEM_ENVIRONMENT_GENERATOR, SD_PATH_SYSTEMD_USER_ENVIRONMENT_GENERATOR, SD_PATH_SYSTEMD_SEARCH_SYSTEM_ENVIRONMENT_GENERATOR, and SD_PATH_SYSTEMD_SEARCH_USER_ENVIRONMENT_GENERATOR.

* sd_notify() now supports AF_VSOCK as transport for notification messages (in addition to the existing AF_UNIX support). This is enabled if $NOTIFY_SOCKET is set in a “vsock:CID:port” format.

* Detection of chroot() environments now works if /proc/ is not mounted. This affects systemd-detect-virt --chroot, but also means that systemd tools will silently skip various operations in such an environment.

* “Lockheed Martin Hardened Security for Intel Processors” (HS SRE) virtualization is now detected.

Changes in the build system:

* Standalone variants of systemd-repart and systemd-shutdown may now be built (if -Dstandalone=true).

* systemd-ac-power has been moved from /usr/lib/ to /usr/bin/, to, for example, allow scripts to conditionalize execution on AC power supply.

* The libp11kit library is now loaded through dlopen(3).
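The $NOTIFY_SOCKET formats mentioned under “Changes in libsystemd and shared code” above can be told apart with a few lines of code. The Python sketch below is a hedged illustration for services that speak the notification protocol without libsystemd: it classifies the variable’s value and sends a datagram over AF_UNIX only. The function names are invented for this example, and the socket type to use for the vsock transport is deliberately left out, since these notes do not specify it.

```python
import os
import socket


def parse_notify_socket(value):
    """Classify a $NOTIFY_SOCKET value.

    Returns ("vsock", cid, port) for the "vsock:CID:port" form, or
    ("unix", address) otherwise. For AF_UNIX, a leading "@" denotes an
    abstract socket, which Python addresses with a leading NUL byte.
    """
    if value.startswith("vsock:"):
        _, cid, port = value.split(":")
        return ("vsock", int(cid), int(port))
    if value.startswith("@"):
        return ("unix", "\0" + value[1:])
    if value.startswith("/"):
        return ("unix", value)
    raise ValueError(f"unsupported NOTIFY_SOCKET value: {value!r}")


def notify(message, env=os.environ):
    """Send one message such as "READY=1"; return False if no socket is set."""
    value = env.get("NOTIFY_SOCKET")
    if not value:
        return False
    parsed = parse_notify_socket(value)
    if parsed[0] != "unix":
        # Sending over AF_VSOCK is intentionally not covered by this sketch.
        raise NotImplementedError("only AF_UNIX is handled here")
    with socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM) as sock:
        sock.sendto(message.encode(), parsed[1])
    return True
```

A service would typically call notify("READY=1") once start-up has finished; when the variable is unset the call is a no-op, so the same binary also runs outside of a service manager.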
Changes in the documentation:

* Specifications that are not closely tied to systemd have moved to https://uapi-group.org/specifications/: the Boot Loader Specification and the Discoverable Partitions Specification.

Contributions from: 김인수, 13r0ck, Aidan Dang, Alberto Planas, Alvin Šipraga, Andika Triwidada, AndyChi, angus-p, Anita Zhang, Antonio Alvarez Feijoo, Arsen Arsenović, asavah, Benjamin Fogle, Benjamin Tissoires, berenddeschouwer, BerndAdameit, Bernd Steinhauser, blutch112, cake03, Callum Farmer, Carlo Teubner, Charles Hardin, chris, Christian Brauner, Christian Göttsche, Cristian Rodríguez, Daan De Meyer, Dan Streetman, DaPigGuy, Darrell Kavanagh, David Tardon, dependabot[bot], Dirk Su, Dmitry V. Levin, drosdeck, Edson Juliano Drosdeck, edupont, Eric DeVolder, Erik Moqvist, Evgeny Vereshchagin, Fabian Gurtner, Felix Riemann, Franck Bui, Frantisek Sumsal, Geert Lorang, Gerd Hoffmann, Gio, Hannoskaj, Hans de Goede, Hugo Carvalho, igo95862, Ilya Leoshkevich, Ivan Shapovalov, Jacek Migacz, Jade Lovelace, Jan Engelhardt, Jan Janssen, Jan Macku, January, Jason A. Donenfeld, jcg, Jean-Tiare Le Bigot, Jelle van der Waa, Jeremy Linton, Jian Zhang, Jiayi Chen, Jia Zhang, Joerg Behrmann, Jörg Thalheim, Joshua Goins, joshuazivkovic, Joshua Zivkovic, Kai-Chuan Hsieh, Khem Raj, Koba Ko, Lennart Poettering, lichao, Li kunyu, Luca Boccassi, Luca BRUNO, Ludwig Nussel, Łukasz Stelmach, Lycowolf, marcel151, Marcus Schäfer, Marek Vasut, Mark Laws, Michael Biebl, Michał Kotyla, Michal Koutný, Michal Sekletár, Mike Gilbert, Mike Yuan, MkfsSion, ml, msizanoen1, mvzlb, MVZ Ludwigsburg, Neil Moore, Nick Rosbrook, noodlejetski, Pasha Vorobyev, Peter Cai, p-fpv, Phaedrus Leeds, Philipp Jungkamp, Quentin Deslandes, Raul Tambre, Ray Strode, reuben olinsky, Richard E.
van der Luit, Richard Phibel, Ricky Tigg, Robin Humble, rogg, Rudi Heitbaum, Sam James, Samuel Cabrero, Samuel Thibault, Siddhesh Poyarekar, Simon Brand, Space Meyer, Spindle Security, Steve Ramage, Takashi Sakamoto, Thomas Haller, Tonći Galić, Topi Miettinen, Torsten Hilbrich, Tuetuopay, uerdogan, Ulrich Ölmann, Valentin David, Vitaly Kuznetsov, Vito Caputo, Waltibaba, Will Fancher, William Roberts, wouter bolsterlee, Youfu Zhang, Yu Watanabe, Zbigniew Jędrzejewski-Szmek, Дамјан Георгиевски, наб

– Warsaw, 2023-02-15

CHANGES WITH 252 🎃:

Announcements of Future Feature Removals:

* We intend to remove cgroup v1 support from systemd release after the end of 2023. If you run services that make explicit use of cgroup v1 features (i.e. the “legacy hierarchy” with separate hierarchies for each controller), please implement compatibility with cgroup v2 (i.e. the “unified hierarchy”) sooner rather than later. Most of Linux userspace has been ported over already.

* We intend to remove support for split-usr (/usr mounted separately during boot) and unmerged-usr (parallel directories /bin and /usr/bin, /lib and /usr/lib, etc). This will happen in the second half of 2023, in the first release that falls into that time window. For more details, see: https://lists.freedesktop.org/archives/systemd-devel/2022-September/048352.html

Compatibility Breaks:

* ConditionKernelVersion= checks that use the ‘=’ or ‘!=’ operators will now do simple string comparisons (instead of version comparisons à la strverscmp()). Version comparisons are still done for the ordering operators ‘<’, ‘>’, ‘<=’, ‘>=’. Moreover, if no operator is specified, a shell-style glob match is now done. This creates a minor incompatibility compared to older systemd versions when the ‘*’, ‘?’, ‘[’, ‘]’ characters are used, as these will now match as shell globs instead of literally. Given that kernel version strings typically do not include these characters we expect little breakage through this change.
* The service manager will now read the SELinux label used for SELinux access checks from the unit file at the time it loads the file. Previously, the label would be read at the moment of the access check, which was problematic since at that time the unit file might already have been updated or removed.

New Features:

* systemd-measure is a new tool for calculating and signing expected TPM2 PCR values for a given unified kernel image (UKI) booted via sd-stub. The public key used for the signature and the signed expected PCR information can be embedded inside the UKI. This information can be extracted from the UKI by external tools and code in the image itself and is made available to userspace in the booted kernel. systemd-cryptsetup, systemd-cryptenroll, and systemd-creds have been updated to make use of this information if available in the booted kernel: when locking an encrypted volume/credential to the TPM, systemd-cryptenroll/systemd-creds will use the public key to bind the volume/credential to any kernel that carries PCR information signed by the same key pair. When unlocking such volumes/credentials, systemd-cryptsetup/systemd-creds will use the signature embedded in the booted UKI to gain access. Binding TPM-based disk encryption to public keys/signatures of PCR values (instead of literal PCR values) addresses the inherent “brittleness” of traditional PCR-bound TPM disk encryption schemes: disks remain accessible even if the UKI is updated, without any TPM specific preparation during the OS update, as long as each UKI carries the necessary PCR signature information. Net effect: if you boot a properly prepared kernel, TPM-bound disk encryption now defaults to be locked to kernels which carry PCR signatures from the same key pair.
Example: if a hypothetical distro FooOS prepares its UKIs like this, TPM-based disk encryption is now – by default – bound to only FooOS kernels, and encrypted volumes bound to the TPM cannot be unlocked on kernels from other sources. (But do note this behaviour requires preparation/enabling in the UKI, and of course users can always enroll non-TPM ways to unlock the volume.)

* systemd-pcrphase is a new tool that is invoked at six places during system runtime, and measures additional words into TPM2 PCR 11, to mark milestones of the boot process. This allows binding access to specific TPM2-encrypted secrets to specific phases of the boot process. (Example: LUKS2 disk encryption key only accessible in the initrd, but not later.)

Changes in systemd itself, i.e. the manager and units:

* The cpu controller is delegated to user manager units by default, and CPUWeight= settings are applied to the top-level user slice units (app.slice, background.slice, session.slice). This provides a degree of resource isolation between different user services competing for the CPU.

* Systemd can optionally do a full preset in the “first boot” condition (instead of just enable-only). This behaviour is controlled by the compile-time option -Dfirst-boot-full-preset. Right now it defaults to ‘false’, but the plan is to switch it to ‘true’ for the subsequent release.

* Drop-ins are now allowed for transient units too.

* Systemd will set the taint flag ‘support-ended’ if it detects that the OS image is past its end-of-support date. This date is declared in a new /etc/os-release field SUPPORT_END= described below.

* Two new settings ConditionCredential= and AssertCredential= can be used to skip or fail units if a certain system credential is not provided.

* ConditionMemory= accepts size suffixes (K, M, G, T, …).

* DefaultSmackProcessLabel= can be used in system.conf and user.conf to specify the SMACK security label to use when not specified in a unit file.
* DefaultDeviceTimeoutSec= can be used in system.conf and user.conf to specify the default timeout when waiting for device units to activate.

* C.UTF-8 is used as the default locale if nothing else has been configured.

* [Condition|Assert]Firmware= have been extended to support certain SMBIOS fields. For example ConditionFirmware=smbios-field(board_name = "Custom Board") conditionalizes the unit to run only when /sys/class/dmi/id/board_name contains “Custom Board” (without the quotes).

* ConditionFirstBoot= now correctly evaluates as true only during the boot phase of the first boot. A unit executed later, after booting has completed, will no longer evaluate this condition as true.

* Socket units will now create sockets in the SELinuxContext= of the associated service unit, if any.

* Boot phase transitions (start initrd → exit initrd → boot complete → shutdown) will be measured into TPM2 PCR 11, so that secrets can be bound to a specific runtime phase. E.g.: a LUKS encryption key can be unsealed only in the initrd.

* Service credentials (i.e. SetCredential=/LoadCredential=/…) will now also be provided to ExecStartPre= processes.

* Various units are now correctly ordered against initrd-switch-root.target where previously a conflict without ordering was configured. A stop job for those units would be queued, but without the ordering it could be executed only after initrd-switch-root.service, leading to units not being restarted in the host system as expected.

* In order to fully support the IPMI watchdog driver, which has not yet been ported to the new common watchdog device interface, /dev/watchdog0 will be tried first and systemd will silently fall back to /dev/watchdog if it is not found.

* New watchdog-related D-Bus properties are now published by systemd: WatchdogDevice, WatchdogLastPingTimestamp, WatchdogLastPingTimestampMonotonic.

* At shutdown, API virtual file systems (proc, sys, etc.) will be unmounted lazily.
* At shutdown, systemd will now log about processes blocking unmounting of file systems. * A new meson build option ‘clock-valid-range-usec-max’ was added to allow disabling system time correction if the RTC returns a timestamp far in the future. * Propagated restart jobs will no longer be discarded while a unit is activating. * PID 1 will now import system credentials from SMBIOS Type 11 fields (“OEM vendor strings”), in addition to qemu_fwcfg. This provides a simple, fast and generic path for supplying credentials to a VM, without involving external tools such as cloud-init/ignition. * The CPUWeight= setting of unit files now accepts a new special value “idle”, which configures “idle” level scheduling for the unit. * Service processes that are activated due to a .timer or .path unit triggering will now receive information about this via environment variables. Note that this information is lossy, as activation might be coalesced and only one of the activating triggers will be reported. It is hence better suited to debugging or tracing than to behaviour decisions. * The riscv_flush_icache(2) system call has been added to the list of system calls allowed by default when SystemCallFilter= is used. * The SELinux context derived from the target executable, instead of ‘init_t’ used for the manager itself, is now used when creating listening sockets for units that specify SELinuxContextFromNet=yes. Changes in sd-boot, bootctl, and the Boot Loader Specification: * The Boot Loader Specification has been cleaned up and clarified. Various corner cases in version string comparisons have been fixed (e.g. comparisons for empty strings). Boot counting is now part of the main specification. * New PCR measurements are performed during boot: PCR 11 for the kernel+initrd combo, PCR 13 for any sysext images. If a measurement took place, this is now reported to userspace via the new StubPcrKernelImage and StubPcrInitRDSysExts EFI variables.
* As before, systemd-stub will measure kernel parameters and system credentials into PCR 12. It will now report this fact via the StubPcrKernelParameters EFI variable to userspace. * The UEFI monotonic boot counter is now included in the updated random seed file maintained by sd-boot, providing some additional entropy. * sd-stub will use LoadImage/StartImage to execute the kernel, instead of arranging the image manually and jumping to the kernel entry point. sd-stub also installs a temporary UEFI SecurityOverride to allow the (unsigned) nested image to be booted. This is safe because the outer (signed) stub+kernel binary must have been verified before the stub was executed. * Booting in EFI mixed mode (a 64-bit kernel over 32-bit UEFI firmware) is now supported by sd-boot. * bootctl gained a bunch of new options: –all-architectures to install binaries for all supported EFI architectures, –root= and –image= options to operate on a directory or disk image, and –install-source= to specify the source for binaries to install, –efi-boot-option-description= to control the name of the boot entry. * The sd-boot stub exports a StubFeatures flag, which is used by bootctl to show features supported by the stub that was used to boot. * The PE section offsets that are used by tools that assemble unified kernel images have historically been hard-coded. This may lead to overlapping PE sections which may break on boot. The UKI will now try to detect and warn about this. Any tools that assemble UKIs must update to calculate these offsets dynamically. Future sd-stub versions may use offsets that will not work with the currently used set of hard-coded offsets! * sd-stub now accepts (and passes to the initrd and then to the full OS) new PE sections ‘.pcrsig’ and ‘.pcrkey’ that can be used to embed signatures of expected PCR values, to allow sealing secrets via the TPM2 against pre-calculated PCR measurements. 
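The pre-calculated PCR values mentioned above are possible because a TPM2 PCR can only be changed by an "extend" operation: the new register value is the hash of the old value concatenated with the digest of the measured data. The following is a minimal Python sketch of replaying such measurements offline. It assumes the SHA-256 bank and a PCR that starts out as all zeroes; the function names and the measured byte strings are hypothetical placeholders, not the data systemd actually measures.

```python
import hashlib

PCR_SIZE = 32  # size of a SHA-256 PCR; assumption for this sketch


def pcr_extend(pcr: bytes, data: bytes) -> bytes:
    """Return the PCR value after measuring `data`: H(old || H(data))."""
    digest = hashlib.sha256(data).digest()
    return hashlib.sha256(pcr + digest).digest()


def replay(measurements: list) -> bytes:
    """Replay a sequence of measurements, starting from an all-zero PCR."""
    pcr = bytes(PCR_SIZE)
    for item in measurements:
        pcr = pcr_extend(pcr, item)
    return pcr


# Placeholder measurements: an image followed by two phase markers.
early = replay([b"example-kernel-image", b"phase-one"])
late = replay([b"example-kernel-image", b"phase-one", b"phase-two"])

print(early.hex())
print(late.hex())
print(early != late)  # a secret bound to `early` is not released at `late`
```

Because each extend folds in the previous value, the register cannot be moved back to an earlier state without a reboot, which is what makes binding a secret to one boot phase meaningful.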
Changes in the hardware database: * ‘systemd-hwdb query’ now supports the --root= option. Changes in systemctl: * systemctl now supports --state= and --type= options for the ‘show’ and ‘status’ verbs. * systemctl gained a new verb ‘list-automounts’ to list automount points. * systemctl gained support for a new --image= switch to be able to operate on the specified disk image (similar to the existing --root= which operates relative to some directory). Changes in systemd-networkd: * networkd can set Linux NetLabel labels for integration with the network control in security modules via a new NetLabel= option. * The RapidCommit= option has been (re-)introduced to enable faster configuration via DHCPv6 (RFC 3315). * networkd gained a new option TCPCongestionControlAlgorithm= that allows setting a per-route TCP congestion control algorithm. * networkd gained a new option KeepFileDescriptor= to allow keeping a reference (file descriptor) open on TUN/TAP interfaces, which is useful to avoid link flaps while the underlying service providing the interface is being serviced. * RouteTable= now also accepts route table names. Changes in systemd-nspawn: * The --bind= and --overlay= options now support relative paths. * The --bind= option now supports a ‘rootidmap’ value, which will use id-mapped mounts to map the root user inside the container to the owner of the mounted directory on the host. Changes in systemd-resolved: * systemd-resolved now persists DNSOverTLS in its state file too. This fixes a problem when used in combination with NetworkManager, which sends the setting only once, causing it to be lost if resolved was restarted at any point. * systemd-resolved now exposes a varlink socket at /run/systemd/resolve/io.systemd.Resolve.Monitor, accessible only to root. Processed DNS requests in a JSON format will be published to any clients connected to this socket. resolvectl gained a ‘monitor’ verb to make use of this.
* systemd-resolved now treats unsupported DNSSEC algorithms as INSECURE instead of returning SERVFAIL, as per RFC 6840: https://datatracker.ietf.org/doc/html/rfc6840#section-5.2 * OpenSSL is the default crypto backend for systemd-resolved. (gnutls is still supported.) Changes in libsystemd and other libraries: * libsystemd now exports sd_bus_error_setfv() (a convenience function for setting bus errors), sd_id128_string_equal() (a convenience function for 128bit ID string comparisons), and sd_bus_message_read_strv_extend() (a function to incrementally read string arrays). * libsystemd now exports sd_device_get_child_first()/_next() as a high-level interface for enumerating child devices. It also supports sd_device_new_child() for opening a child device given a device object. * libsystemd now exports sd_device_monitor_set()/get_description() which allow setting a custom description that will be used in log messages by sd_device_monitor*. * Private shared libraries (libsystemd-shared-nnn.so, libsystemd-core-nnn.so) are now installed into arch-specific directories to allow multi-arch installs. * A new sd-gpt.h header is now published, listing GUIDs from the Discoverable Partitions specification. For more details see: https://systemd.io/DISCOVERABLE_PARTITIONS/ * A new function sd_hwdb_new_from_path() has been added to open a hwdb database given an explicit path to the file. * The signal number argument to sd_event_add_signal() can now be ORed with the SD_EVENT_SIGNAL_PROCMASK flag, causing sigprocmask() to be automatically invoked to block the specified signal. This simplifies invocations, as the caller doesn’t have to do this manually. * A new convenience call sd_event_set_signal_exit() has been added to sd-event to set up signal handling so that the event loop automatically terminates cleanly on SIGTERM/SIGINT. Changes in other components: * systemd-sysusers, systemd-tmpfiles, and systemd-sysctl configuration can now be provided via the credential mechanism.
* systemd-analyze gained a new verb ‘compare-versions’ that implements comparisons for version strings (similarly to ‘rpmdev-vercmp’ and ‘dpkg --compare-versions’). * ‘systemd-analyze dump’ is extended to accept glob patterns for unit names to limit the output to matching units. * tmpfiles.d/ lines can read file contents to write from a credential. The new modifier char ‘^’ is used to specify that the argument is a credential name. This mechanism is used to automatically populate /etc/motd, /etc/issue, and /etc/hosts from credentials. * tmpfiles.d/ may now be configured to avoid changing uid/gid/mode of an inode if the specification is prefixed with ‘:’ and the inode already exists. * Default tmpfiles.d/ configuration now carries a line to automatically use an ‘ssh.authorized_keys.root’ credential if provided to set up the SSH authorized_keys file for the root user. * systemd-tmpfiles will now gracefully handle an absent source of “C” copy lines. * tmpfiles.d/ F/w lines now optionally permit encoding of the payload in base64. This is useful to write arbitrary binary data into files. * The pkgconfig and rpm macros files now export the directory for user units as ‘user_tmpfiles_dir’ and ‘%_user_tmpfilesdir’. * Detection of Apple Virtualization and detection of Parallels and KubeVirt virtualization on non-x86 archs have been added. * os-release gained a new field SUPPORT_END=YYYY-MM-DD to inform the user when their system will become unsupported. * When performing suspend-then-hibernate, the system will estimate the battery discharge rate and use it to set the delay until hibernation. When running from a battery whose capacity is below 5%, it will hibernate immediately instead of suspending. * systemd-sysctl gained a --strict option to fail when a sysctl setting is unknown to the kernel. * machinectl supports --force for the ‘copy-to’ and ‘copy-from’ verbs.
* coredumpctl gained the –root and –image options to look for journal files under the specified root directory, image, or block device. * ‘journalctl -o’ and similar commands now implement a new output mode “short-delta”. It is similar to “short-monotonic”, but also shows the time delta between subsequent messages. * journalctl now respects the –quiet flag when verifying consistency of journal files. * Journal log messages gained a new implicit field _RUNTIME_SCOPE= that will indicate whether a message was logged in the ‘initrd’ phase or in the ‘system’ phase of the boot process. * Journal files gained a new compatibility flag ‘HEADER_INCOMPATIBLE_COMPACT’. Files with this flag implement changes to the storage format that allow reducing size on disk. As with other compatibility flags, older journalctl versions will not be able to read journal files using this new format. The environment variable ‘SYSTEMD_JOURNAL_COMPACT=0’ can be passed to systemd-journald to disable this functionality. It is enabled by default. * systemd-run’s –working-directory= switch now works when used in combination with –scope. * portablectl gained a –force flag to skip certain sanity checks. This is implemented using new flags accepted by systemd-portabled for the *WithExtensions() D-Bus methods: SD_SYSTEMD_PORTABLE_FORCE_ATTACH flag now means that the attach/detach checks whether the units are already present and running will be skipped. Similarly, SD_SYSTEMD_PORTABLE_FORCE_SYSEXT flag means that the check whether image name matches the name declared inside of the image will be skipped. Callers must be sure to do those checks themselves if appropriate. * systemd-portabled will now use the original filename to check extension-release.NAME for correctness, in case it is passed a symlink. * systemd-portabled now uses PrivateTmp=yes in the ‘trusted’ profile too. 
* sysext’s extension-release files now support ‘_any’ as a special value for the ID= field, to allow distribution-independent extensions (e.g.: fully statically compiled binaries, scripts). It also gained support for a new ARCHITECTURE= field that may be used to explicitly restrict an image to hosts of a specific architecture. * systemd-repart now supports creating squashfs partitions. This requires mksquashfs from squashfs-tools. * systemd-repart gained a --split flag to also generate split artifacts, i.e. a separate file for each partition. This is useful in conjunction with systemd-sysupdate or other tools, or to generate split dm-verity artifacts. * systemd-repart is now able to generate dm-verity partitions, including signatures. * systemd-repart can now set a partition UUID to zero, allowing it to be filled in later, such as when using verity partitions. * systemd-repart now supports drop-ins for its configuration files. * Package metadata logged by systemd-coredump in the system journal is now more compact. * xdg-autostart-service now expands ‘tilde’ characters in Exec lines. * systemd-oomd now automatically links against libatomic, if available. * systemd-oomd now sends out a ‘Killed’ D-Bus signal when a cgroup is killed. * scope units now also provide oom-kill status. * systemd-pstore will now try to load only the efi_pstore kernel module before running, ensuring that pstore can be used. * systemd-logind gained a new StopIdleSessionSec= option to stop an idle session after a preconfigured timeout. * systemd-homed will now wait up to 30 seconds for workers to terminate, rather than indefinitely. * homectl gained a new ‘--luks-sector-size=’ flag that allows users to select the preferred LUKS sector size. The value must be a power of 2 between 512 and 4096. systemd-userdbd records gained a corresponding field.
* systemd-sysusers will now respect the ‘SOURCE_DATE_EPOCH’ environment variable when generating the ‘sp_lstchg’ field, to ensure an image build can be reproducible. * ‘udevadm wait’ will now listen to kernel uevents too when called with --initialized=no. * When naming network devices udev will now consult the Devicetree “alias” fields for the device. * systemd-udev will now create infiniband/by-path and infiniband/by-ibdev links for Infiniband verbs devices. * systemd-udev-trigger.service will now also prioritize input devices. * ConditionACPower= and systemd-ac-power will now assume the system is running on AC power if no battery can be found. * All features and tools using the TPM2 will now communicate with it using a bind key. Previously, the TPM2 support used encrypted sessions by creating a primary key that was used to encrypt traffic. This created a problem, as the key created for encrypting the traffic could be faked by an active interposer on the bus. In cases when a PIN is used, a bind key will be used. The PIN is used as the auth value for the seal key, aka the disk encryption key, and that auth value will be used in the session establishment. An attacker would need the PIN value to create the secure session, and thus an active interposer without the PIN cannot interpose on TPM2 traffic. * systemd-growfs no longer requires udev to run. * systemd-backlight will now better support systems with multiple graphics cards. * systemd-cryptsetup’s keyfile-timeout= option now also works when a device is used as a keyfile. * systemd-cryptenroll gained a new --unlock-key-file= option to get the unlocking key from a key file (instead of prompting the user). Note that this is the key for unlocking the volume in order to be able to enroll a new key, but it is not the key that is enrolled. * systemd-dissect gained a new --umount switch that will safely and synchronously unmount all partitions of an image previously mounted with ‘systemd-dissect --mount’.
* When using gcrypt, all systemd tools and services will now configure it to prefer the OS random number generator if present. * All example code shipped with documentation has been relicensed from CC0 to MIT-0. * Unit tests will no longer fail when running on a system without /etc/machine-id. Experimental features: * BPF programs can now be compiled with bpf-gcc (requires libbpf >= 1.0 and bpftool >= 7.0). * sd-boot can automatically enroll SecureBoot keys from files found on the ESP. This enrollment can be either automatic (‘force’ mode) or controlled by the user (‘manual’ mode). It is sufficient to place the SecureBoot keys in the right place in the ESP and they will be picked up by sd-boot and shown in the boot menu. * The mkosi config in systemd gained support for automatically compiling a kernel with the configuration appropriate for testing systemd. This may be useful when developing or testing systemd in tandem with the kernel. Contributions from: 김인수, Adam Williamson, adrian5, Aidan Dang, Akihiko Odaki, Alban Bedel, Albert Mikaelyan, Aleksey Vasenev, Alexander Graf, Alexander Shopov, Alexander Wilson, Alper Nebi Yasak, anarcat, Anders Jonsson, Andre Kalb, Andrew Stone, Andrey Albershteyn, Anita Zhang, Ansgar Burchardt, Antonio Alvarez Feijoo, Arnaud Ferraris, Aryan singh, asavah, Avamander, Avram Lubkin, Balázs Meskó, Bastien Nocera, Benjamin Franzke, BerndAdameit, bin456789, Celeste Liu, Chih-Hsuan Yen, Christian Brauner, Christian Göttsche, Christian Hesse, Clyde Byrd III, codefiles, Colin Walters, Cristian Rodríguez, Daan De Meyer, Daniel Braunwarth, Daniel Rusek, Dan Streetman, Darsey Litzenberger, David Edmundson, David Jaša, David Rheinsberg, David Seifert, David Tardon, dependabot[bot], Devendra Tewari, Dominique Martinet, drosdeck, Edson Juliano Drosdeck, Eduard Tolosa, eggfly, Einsler Lee, Elias Probst, Eli Schwartz, Evgeny Vereshchagin, exploide, Fei Li, Foster Snowhill, Franck Bui, Frank Dana, Frantisek Sumsal, Gerd Hoffmann, Gio, Goffredo 
Baroncelli, gtwang01, Guillaume W. Bres, H A, Hans de Goede, Heinrich Schuchardt, Hugo Carvalho, i-do-cpp, igo95862, j00512545, Jacek Migacz, Jade Bilkey, James Hilliard, Jan B, Janis Goldschmidt, Jan Janssen, Jan Kuparinen, Jan Luebbe, Jan Macku, Jason A. Donenfeld, Javkhlanbayar Khongorzul, Jeremy Soller, JeroenHD, jiangchuangang, João Loureiro, Joaquín Ignacio Aramendía, Jochen Sprickerhof, Johannes Schauer Marin Rodrigues, Jonas Kümmerlin, Jonas Witschel, Jonathan Kang, Jonathan Lebon, Joost Heitbrink, Jörg Thalheim, josh-gordon-fb, Joyce, Kai Lueke, lastkrick, Lennart Poettering, Leon M. George, licunlong, Li kunyu, LockBlock-dev, Loïc Collignon, Lubomir Rintel, Luca Boccassi, Luca BRUNO, Ludwig Nussel, Łukasz Stelmach, Maccraft123, Marc Kleine-Budde, Marius Vollmer, Martin Wilck, matoro, Matthias Lisin, Max Gautier, Maxim Mikityanskiy, Michael Biebl, Michal Koutný, Michal Sekletár, Michal Stanke, Mike Gilbert, Mitchell Freiderich, msizanoen1, Nick Rosbrook, nl6720, Oğuz Ersen, Oleg Solovyov, Olga Smirnova, Pablo Ceballos, Pavel Zhukov, Phaedrus Leeds, Philipp Gortan, Piotr Drąg, Pyfisch, Quentin Deslandes, Rahil Bhimjiani, Rene Hollander, Richard Huang, Richard Phibel, Rudi Heitbaum, Sam James, Sarah Brofeldt, Sean Anderson, Sebastian Scheibner, Shreenidhi Shedi, Sonali Srivastava, Steve Ramage, Suraj Krishnan, Swapnil Devesh, Takashi Sakamoto, Ted X. Toth, Temuri Doghonadze, Thomas Blume, Thomas Haller, Thomas Hebb, Tomáš Hnyk, Tomasz Paweł Gajc, Topi Miettinen, Ulrich Ölmann, undef, Uriel Corfa, Victor Westerhuis, Vincent Dagonneau, Vishal Chillara Srinivas, Vito Caputo, Weblate, Wenchao Hao, William Roberts, williamsumendap, wineway, xiaoyang, Yuri Chornoivan, Yu Watanabe, Zbigniew Jędrzejewski-Szmek, Zhaofeng Li, наб – The Great Beyond, 2022-10-31 👻 CHANGES WITH 251: Backwards-incompatible changes: * The minimum kernel version required has been bumped from 3.13 to 4.15, and CLOCK_BOOTTIME is now assumed to always exist. 
* C11 with GNU extensions (aka “gnu11”) is now used to build our components. Public API headers are still restricted to ISO C89. * In v250, a systemd-networkd feature that automatically configures routes to addresses specified in AllowedIPs= was added and enabled by default. However, this causes network connectivity issues in many existing setups. Hence, it has been disabled by default since systemd-stable 250.3. The feature can still be used by explicitly configuring RouteTable= setting in .netdev files. * Jobs started via StartUnitWithFlags() will no longer return ‘skipped’ when a Condition*= check does not succeed, restoring the JobRemoved signal to the behaviour it had before v250. * The org.freedesktop.portable1 methods GetMetadataWithExtensions() and GetImageMetadataWithExtensions() have been fixed to provide an extra return parameter, containing the actual extension release metadata. The current implementation was judged to be broken and unusable, and thus the usual procedure of adding a new set of methods was skipped, and backward compatibility broken instead on the assumption that nobody can be affected given the current state of this interface. * All kernels supported by systemd mix bytes returned by RDRAND (or similar) into the entropy pool at early boot. This means that on those systems, even if /dev/urandom is not yet initialized, it still returns bytes that are of at least RDRAND quality. For that reason, we no longer have reason to invoke RDRAND from systemd itself, which has historically been a source of bugs. Furthermore, kernels ≥5.6 provide the getrandom(GRND_INSECURE) interface for returning random bytes before the entropy pool is initialized without warning into kmsg, which is what we attempt to use if available. systemd’s direct usage of RDRAND has been removed. x86 systems ≥Broadwell that are running an older kernel may experience kmsg warnings that were not seen with 250. 
For newer kernels, non-x86 systems, or older x86 systems, there should be no visible changes. * sd-boot will now measure the kernel command line into TPM PCR 12 rather than PCR 8. This improves usefulness of the measurements on systems where sd-boot is chainloaded from Grub. Grub measures all commands it executes into PCR 8, which makes it very hard to use reasonably; hence we separate ourselves from that and use PCR 12 instead, which is what certain Ubuntu editions already do. To retain compatibility with systems running older systemd versions a new meson option ‘efi-tpm-pcr-compat’ has been added (which defaults to false). If enabled, the measurement is done twice: into the new-style PCR 12 *and* the old-style PCR 8. It’s strongly advised to migrate all users to PCR 12 for this purpose in the long run, as we intend to remove this compatibility feature in two years’ time. * busctl capture now writes output in the newer pcapng format instead of pcap. * A udev rule that imported hwdb matches for USB devices with lowercase hexadecimal vendor/product ID digits was added in systemd 250. This has been reverted, since uppercase hexadecimal digits are supposed to be used, and we already had a rule with the appropriate match. Users might need to adjust their local hwdb entries. * arch_prctl(2) has been moved to the @default set in the syscall filters (as exposed via the SystemCallFilter= setting in service unit files). It is apparently used by the linker now. * The tmpfiles entries that create the /run/systemd/netif directory and its subdirectories were moved from tmpfiles.d/systemd.conf to tmpfiles.d/systemd-network.conf. Users might need to adjust their files that override tmpfiles.d/systemd.conf to account for this change. * The requirement for Portable Services images to contain a well-formed os-release file (i.e.: contain at least an ID field) is now enforced. This applies to base images and extensions, and also to systemd-sysext.
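The os-release requirement above can be checked before an image is shipped. The following is a minimal Python sketch, assuming a deliberately simplified os-release parser (KEY=value lines, comments, and plain single or double quoting, but no shell escape handling). The function names are hypothetical, and this is not how systemd itself validates images.

```python
def parse_os_release(text: str) -> dict:
    """Parse os-release style text into a dict (simplified, no escapes)."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines, comments, and anything that is not KEY=value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        value = value.strip()
        # Strip one pair of matching surrounding quotes, if present.
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "\"'":
            value = value[1:-1]
        fields[key.strip()] = value
    return fields


def has_id_field(text: str) -> bool:
    """Return True if the text declares a non-empty ID= field."""
    return bool(parse_os_release(text).get("ID"))


good = 'NAME="FooOS"\nID=fooos\n# a comment\nVERSION_ID="1"\n'
bad = 'NAME="FooOS"\nVERSION_ID="1"\n'

print(has_id_field(good))  # True
print(has_id_field(bad))   # False
```

In practice the text would be read from the image's /etc/os-release or /usr/lib/os-release, or from the extension-release file of an extension image.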
Changes in the Boot Loader Specification, kernel-install and sd-boot: * kernel-install’s and bootctl’s Boot Loader Specification Type #1 entry generation logic has been reworked. The user may now pick explicitly by which “token” string to name the installation’s boot entries, via the new /etc/kernel/entry-token file or the new –entry-token= switch to bootctl. By default — as before — the entries are named after the local machine ID. However, in “golden image” environments, where the machine ID shall be initialized on first boot (as opposed to at installation time before first boot) the machine ID will not be available at build time. In this case the –entry-token= switch to bootctl (or the /etc/kernel/entry-token file) may be used to override the “token” for the entries, for example the IMAGE_ID= or ID= fields from /etc/os-release. This will make the OS images independent of any machine ID, and ensure that the images will not carry any identifiable information before first boot, but on the other hand means that multiple parallel installations of the very same image on the same disk cannot be supported. Summary: if you are building golden images that shall acquire identity information exclusively on first boot, make sure to both remove /etc/machine-id *and* to write /etc/kernel/entry-token to the value of the IMAGE_ID= or ID= field of /etc/os-release or another suitable identifier before deploying the image. * The Boot Loader Specification has been extended with /loader/entries.srel file located in the EFI System Partition (ESP) that disambiguates the format of the entries in the /loader/entries/ directory (in order to discern them from incompatible uses of this directory by other projects). For entries that follow the Specification, the string “type1” is stored in this file. bootctl will now write this file automatically when installing the systemd-boot boot loader. 
* kernel-install supports a new initrd_generator= setting in /etc/kernel/install.conf, that is exported as $KERNEL_INSTALL_INITRD_GENERATOR to kernel-install plugins. This allows choosing different initrd generators. * kernel-install will now create a “staging area” (an initially-empty directory to gather files for a Boot Loader Specification Type #1 entry). The path to this directory is exported as $KERNEL_INSTALL_STAGING_AREA to kernel-install plugins, which should drop files there instead of writing them directly to the final location. kernel-install will move them when all files have been prepared successfully. * New option sort-key= has been added to the Boot Loader Specification to override the sorting order of the entries in the boot menu. It is read by sd-boot and bootctl, and will be written by kernel-install, with the default value of IMAGE_ID= or ID= fields from os-release. Together, this means that on multiboot installations, entries should be grouped and sorted in a predictable way. * The sort order of boot entries has been updated: entries which have the new field sort-key= are sorted by it first, and all entries without it are ordered later. After that, entries are sorted by version so that newest entries are towards the beginning of the list. * The kernel-install tool gained a new ‘inspect’ verb which shows the paths and other settings used. * sd-boot can now optionally beep when the menu is shown and menu entries are selected, which can be useful on machines without a working display. (Controllable via a loader.conf setting.) * The –make-machine-id-directory= switch to bootctl has been replaced by –make-entry-directory=, given that the entry directory is not necessarily named after the machine ID, but after some other suitable ID as selected via –entry-token= described above. The old name of the option is still understood to maximize compatibility. * ‘bootctl list’ gained support for a new –json= switch to output boot menu entries in JSON format. 
* ‘bootctl is-installed’ now supports the --graceful option, and various verbs omit output with the new --quiet option. Changes in systemd-homed: * Starting with v250 systemd-homed uses UID/GID mapping on the mounts of activated home directories it manages (if the kernel and selected file systems support it). So far it mapped three UID ranges: the range from 0…60000, the user’s own UID, and the range 60514…65534, leaving everything else unmapped (in other words, the 16bit UID range is mapped almost fully; the UID subrange used for systemd-homed users is left out, except for the user’s own UID). Unmapped UIDs may not be used for file ownership in the home directory; any chown() attempts with them will fail. With this release a fourth range is added to these mappings: 524288…1879048191. This range is the UID range intended for container uses, see: https://systemd.io/UIDS-GIDS This range may be used for container managers that place container OS trees in the home directory (which is a questionable approach, for quota, permission, SUID handling and network file system compatibility reasons, but nonetheless apparently commonplace). Note that this range is mapped 1:1 in a pass-through fashion, i.e. the UID assignments from the range are not managed or mapped by `systemd-homed`, and must be managed with other mechanisms, in the context of the local system. Typically, a better approach to user namespacing in relevant container managers would be to leave container OS trees on disk at UID offset 0, but then map them to a dynamically allocated runtime UID range via another UID mount map at container invocation time. That way user namespace UID ranges become strictly a runtime concept, and do not leak into persistent file systems, persistent user databases or persistent configuration, thus greatly simplifying handling, and improving compatibility with home directories intended to be portable like the ones managed by systemd-homed.
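The mapping rules described above can be summarised as a simple membership test. The following is a minimal Python sketch, assuming the listed ranges are inclusive at both ends; the function name and the example UIDs are hypothetical and only illustrate which file owners would be usable inside an activated home directory.

```python
# UID ranges mapped by systemd-homed, as listed in the text above
# (bounds assumed to be inclusive for this sketch).
MAPPED_RANGES = [
    (0, 60000),
    (60514, 65534),
    (524288, 1879048191),  # container range, added in this release
]


def is_mapped(uid: int, own_uid: int) -> bool:
    """Return True if `uid` may own files in the home directory of `own_uid`."""
    if uid == own_uid:
        return True  # the user's own UID is always mapped
    return any(low <= uid <= high for low, high in MAPPED_RANGES)


own = 60100  # hypothetical UID of a systemd-homed user
for uid in (0, 1000, own, 60200, 65534, 100000, 524288):
    print(uid, is_mapped(uid, own))
```

With these inputs, 60200 (another UID in the gap between the first two ranges) and 100000 (between the 16bit range and the container range) come out as unmapped, so a chown() to either would fail.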
Changes in shared libraries: * A new libsystemd-core-.so private shared library is installed under /usr/lib/systemd/system, mirroring the existing libsystemd-shared-.so library. This allows the total installation size to be reduced by binary code reuse. * The tag used in the name of libsystemd-shared.so and libsystemd-core.so can be configured via the meson option ‘shared-lib-tag’. Distributions may build subsequent versions of the systemd package with unique tags (e.g. the full package version), thus allowing multiple installations of those shared libraries to be available at the same time. This is intended to fix an issue where programs that link to those libraries would fail to execute because they were installed earlier or later than the appropriate version of the library. * The sd-id128 API gained a new call sd_id128_to_uuid_string() that is similar to sd_id128_to_string() but formats the ID in RFC 4122 UUID format instead of as a simple series of hex characters. * The sd-device API gained two new calls sd_device_new_from_devname() and sd_device_new_from_path() which permit allocating an sd_device object from a device node name or file system path. * sd-device also gained a new call sd_device_open() which will open the device node associated with a device for which an sd_device object has been allocated. The call is supposed to address races around device nodes being removed/recycled due to hotplug events, or media change events: the call checks internally whether the major/minor of the device node and the “diskseq” (in case of block devices) match with the metadata loaded in the sd_device object, thus ensuring that the device once opened really matches the provided sd_device object. Changes in PID1, systemctl, and systemd-oomd: * A new set of service monitor environment variables will be passed to OnFailure=/OnSuccess= handlers, but only if exactly one unit lists the handler unit as OnFailure=/OnSuccess=. 
The variables are: $MONITOR_SERVICE_RESULT, $MONITOR_EXIT_CODE, $MONITOR_EXIT_STATUS, $MONITOR_INVOCATION_ID and $MONITOR_UNIT. For cases when a single handler needs to watch multiple units, use a templated handler. * A new ExtensionDirectories= setting in service unit files allows system extensions to be loaded from a directory. (It is similar to ExtensionImages=, but takes paths to directories, instead of disk image files.) ‘portablectl attach --extension=’ now also accepts directory paths. * The user.delegate and user.invocation_id extended attributes on cgroups are used in addition to trusted.delegate and trusted.invocation_id. The latter pair requires privileges to set, but the former doesn’t and can also be set by the unprivileged user manager. (Only supported on kernels ≥5.6.) * Units that were killed by systemd-oomd will now have a service result of ‘oom-kill’. The number of times a service was killed is tallied in the ‘user.oomd_ooms’ extended attribute. The OOMPolicy= unit file setting is now also honoured by systemd-oomd. * In unit files the new %y/%Y specifiers can be used to refer to the normalized unit file path, which is particularly useful for symlinked unit files. The new %q specifier resolves to the pretty hostname (i.e. PRETTY_HOSTNAME= from /etc/machine-info). The new %d specifier resolves to the credentials directory of a service (same as $CREDENTIALS_DIRECTORY). * The RootDirectory=, MountAPIVFS=, ExtensionDirectories=, *Capabilities*=, ProtectHome=, *Directory=, TemporaryFileSystem=, PrivateTmp=, PrivateDevices=, PrivateNetwork=, NetworkNamespacePath=, PrivateIPC=, IPCNamespacePath=, PrivateUsers=, ProtectClock=, ProtectKernelTunables=, ProtectKernelModules=, ProtectKernelLogs=, MountFlags= service settings now also work in unprivileged user services, i.e. those run by the user’s --user service manager, as long as user namespaces are enabled on the system.
* Services with Restart=always and a failing ExecCondition= will no longer be restarted, to bring ExecCondition= behaviour in line with Condition*= settings. * LoadCredential= now accepts a directory as the argument; all files from the directory will be loaded as credentials. * A new D-Bus property ControlGroupId is now exposed on service units, that encapsulates the service’s numeric cgroup ID that newer kernels assign to each cgroup. * PID 1 gained support for configuring the “pre-timeout” of watchdog devices and the associated governor, via the new RuntimeWatchdogPreSec= and RuntimeWatchdogPreGovernor= configuration options in /etc/systemd/system.conf. * systemctl’s --timestamp= option gained a new choice “unix”, to show timestamps as UNIX times, i.e. seconds since 1970, Jan 1st. * A new “taint” flag named “old-kernel” is introduced which is set when the kernel systemd runs on is older than the current baseline version (see above). The flag is shown in “systemctl status” output. * Two additional taint flags “short-uid-range” and “short-gid-range” have been added as well, which are set when systemd notices it is run within a user namespace that does not define the full 0…65535 UID range. * A new “unmerged-usr” taint flag has been added that is set whenever running on systems where /bin/ + /sbin/ are *not* symlinks to their counterparts in /usr/, i.e. on systems where the /usr/-merge has not been completed. * Generators invoked by PID 1 will now have a couple of useful environment variables set describing the execution context a bit. $SYSTEMD_SCOPE encodes whether the generator is called from the system service manager, or from the per-user service manager. $SYSTEMD_IN_INITRD encodes whether the generator is invoked in initrd context or on the host. $SYSTEMD_FIRST_BOOT encodes whether systemd considers the current boot to be a “first” boot. $SYSTEMD_VIRTUALIZATION encodes whether virtualization is detected and which type of hypervisor/container manager.
$SYSTEMD_ARCHITECTURE indicates which architecture the kernel is built for. * PID 1 will now automatically pick up system credentials from qemu’s fw_cfg interface, thus allowing passing arbitrary data into VM systems similar to how this is already supported for passing them into `systemd-nspawn` containers. Credentials may now also be passed in via the new kernel command line option `systemd.set_credential=` (note that kernel command line options are world-readable during runtime, and are only useful for credentials that require no confidentiality). The credentials that can be passed to unified kernels that use the `systemd-stub` UEFI stub are now similarly picked up automatically. Automatic importing of system credentials this way can be turned off via the new `systemd.import_credentials=no` kernel command line option. * LoadCredential= will now automatically look for credentials in the /etc/credstore/, /run/credstore/, /usr/lib/credstore/ directories if the argument is not an absolute path. Similarly, LoadCredentialEncrypted= will check the same directories plus /etc/credstore.encrypted/, /run/credstore.encrypted/ and /usr/lib/credstore.encrypted/. The idea is to use those directories as the system-wide location for credentials that services should pick up automatically. * System and service credentials are described in great detail in a new document: https://systemd.io/CREDENTIALS Changes in systemd-journald: * The journal JSON export format has been added to the list of stable interfaces (https://systemd.io/PORTABILITY_AND_STABILITY/). * journalctl --list-boots now supports JSON output and the --reverse option. * Under docs/: JOURNAL_EXPORT_FORMATS was imported from the wiki and updated, BUILDING_IMAGES is new: https://systemd.io/JOURNAL_EXPORT_FORMATS https://systemd.io/BUILDING_IMAGES Changes in udev: * Two new hwdb files have been added. One lists “handhelds” (PDAs, calculators, etc.), the other AV production devices (DJ tables, keypads, etc.)
that should be accessible to the seat owner user by default. * udevadm trigger gained a new --prioritized-subsystem= option to process certain subsystems (and all their parent devices) earlier. systemd-udev-trigger.service now uses this new option to trigger block and TPM devices first, hopefully making the boot a bit faster. * udevadm trigger now implements --type=all, --initialized-match, --initialized-nomatch to trigger both subsystems and devices, only already-initialized devices, and only devices which haven’t been initialized yet, respectively. * udevadm gained a new “wait” command for safely waiting for a specific device to show up in the udev device database. This is useful in scripts that asynchronously allocate a block device (e.g. through repartitioning, or allocating a loopback device or similar) and need to wait for the creation to complete. * udevadm gained a new “lock” command for locking one or more block devices while formatting them or writing a partition table to them. It is an implementation of https://systemd.io/BLOCK_DEVICE_LOCKING and usable in scripts dealing with block devices. * udevadm info will show a couple of additional device fields in its output, and will now apply a limited set of coloring to line types. * udevadm info --tree will now show a tree of objects (i.e. devices and suchlike) in the /sys/ hierarchy. * Block devices will now get a new set of device symlinks in /dev/disk/by-diskseq/, which may be used to reference block device nodes via the kernel’s “diskseq” value. Note that opening a device by a symlink like this does not guarantee that the opened device actually matches the specified diskseq value. To be safe against races, the actual diskseq value of the opened device (BLKGETDISKSEQ ioctl()) must still be compared with the one in the symlink path. * .link files gained support for setting MDI/MDI-X on a link.
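The race-safe use of /dev/disk/by-diskseq/ symlinks described above (open first, then compare the BLKGETDISKSEQ result with the value in the path) might look like the following sketch. The helper names are hypothetical, the ioctl number is spelled out by hand from its `_IOR(0x12, 128, __u64)` definition, and it is an assumption here that the symlink's base name begins with the decimal diskseq value.

```python
import fcntl
import os
import re
import struct

# _IOR(0x12, 128, __u64) from <linux/fs.h>; written out because Python's
# fcntl module does not export this constant.
BLKGETDISKSEQ = 0x80081280


def diskseq_from_symlink(path):
    """Extract the expected diskseq from a by-diskseq symlink path.

    Assumption: the base name of the link starts with the decimal diskseq.
    """
    match = re.match(r"(\d+)", os.path.basename(path))
    if match is None:
        raise ValueError(f"no diskseq in {path!r}")
    return int(match.group(1))


def open_checked(path):
    """Open the device behind the symlink and verify its diskseq.

    Returns an open file descriptor; raises OSError on a mismatch.
    """
    expected = diskseq_from_symlink(path)
    fd = os.open(path, os.O_RDONLY | os.O_CLOEXEC)
    try:
        # The kernel writes a 64-bit unsigned value into the buffer.
        buf = fcntl.ioctl(fd, BLKGETDISKSEQ, b"\0" * 8)
        (actual,) = struct.unpack("=Q", buf)
        if actual != expected:
            raise OSError(f"diskseq mismatch: wanted {expected}, got {actual}")
    except BaseException:
        os.close(fd)
        raise
    return fd
```

The comparison happens on the already opened descriptor, so a device that is recycled between path lookup and open is detected rather than silently used.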
* .link files gained support for [Match] Firmware= setting to match on the device firmware description string. By mistake, it was previously only supported in .network files. * .link files gained support for [Link] SR-IOVVirtualFunctions= setting and [SR-IOV] section to configure SR-IOV virtual functions. Changes in systemd-networkd: * The default scope for unicast routes configured through [Route] section is changed to “link”, to make the behavior consistent with “ip route” command. The manual configuration of [Route] Scope= is still honored. * A new unit systemd-networkd-wait-online@.service has been added that can be used to wait for a specific network interface to be up. * systemd-networkd gained a new [Bridge] Isolated=true|false setting that configures the eponymous kernel attribute on the bridge. * .netdev files now can be used to create virtual WLAN devices, and configure various settings on them, via the [WLAN] section. * .link/.network files gained support for [Match] Kind= setting to match on device kind (“bond”, “bridge”, “gre”, “tun”, “veth”, etc.) This value is also shown by ‘networkctl status’. * The Local= setting in .netdev files for various virtual network devices gained support for specifying, in addition to the network address, the name of a local interface which must have the specified address. * systemd-networkd gained a new [Tunnel] External= setting in .netdev files, to configure tunnels in external mode (a.k.a. collect metadata mode). * [Network] L2TP= setting was removed. Please use interface specifier in Local= setting in .netdev files of corresponding L2TP interface. * New [DHCPServer] BootServerName=, BootServerAddress=, and BootFilename= settings can be used to configure the server address, server name, and file name sent in the DHCP packet (e.g. to configure PXE boot). 
Changes in systemd-resolved: * systemd-resolved is started earlier (in sysinit.target), so it is available earlier and will also be started in the initrd if installed there. Changes in disk encryption: * systemd-cryptenroll can now control whether to require the user to enter a PIN when using TPM-based unlocking of a volume via the new --tpm2-with-pin= option. Option tpm2-pin= can be used in /etc/crypttab. * When unlocking devices via TPM, TPM2 parameter encryption is now used, to ensure that communication between CPU and discrete TPM chips cannot be eavesdropped on to acquire disk encryption keys. * A new switch --fido2-credential-algorithm= has been added to systemd-cryptenroll allowing selection of the credential algorithm to use when binding encryption to FIDO2 tokens. Changes in systemd-hostnamed: * HARDWARE_VENDOR= and HARDWARE_MODEL= can be set in /etc/machine-info to override the values gleaned from the hwdb. * An ID_CHASSIS property can be set in the hwdb (for the DMI device /sys/class/dmi/id) to override the chassis that is reported by hostnamed. * hostnamed’s D-Bus interface gained a new method GetHardwareSerial() for reading the hardware serial number, as reported by DMI. It also exposes a new D-Bus property FirmwareVersion that encodes the firmware version of the system. Changes in other components: * /etc/locale.conf is now populated through tmpfiles.d factory /etc/ handling with the values that were configured during systemd build (if /etc/locale.conf has not been created through some other mechanism). This means that /etc/locale.conf should always have reasonable contents and we avoid a potential mismatch in defaults. * The userdbctl tool will now show UID range information as part of the list of known users. * A new build-time configuration setting default-user-shell= can be used to set the default shell for user records and nspawn shell invocations (instead of the default /bin/bash).
* systemd-timesyncd now provides a D-Bus API for receiving NTP server information dynamically at runtime via IPC. * The systemd-creds tool gained a new “has-tpm2” verb, which reports whether a functioning TPM2 infrastructure is available, i.e. if firmware, kernel driver and systemd all have TPM2 support enabled and a device is found. * The systemd-creds tool gained support for generating encrypted credentials that are using an empty encryption key. While this provides neither integrity nor confidentiality, it’s useful to implement codeflows that work the same on TPM2-ful and TPM2-less systems. The service manager will only accept credentials “encrypted” that way if a TPM2 device cannot be detected, to ensure that credentials “encrypted” like that cannot be used to trick TPM2 systems. * When deciding whether to colorize output, all systemd programs now also check $COLORTERM (in addition to $NO_COLOR, $SYSTEMD_COLORS, and $TERM). * Meson’s new install_tag feature is now in use for several components, making it possible to build and install select binaries only: pam, nss, devel (pkg-config files), systemd-boot, libsystemd, libudev. Example: $ meson build systemd-boot $ meson install --tags systemd-boot --no-rebuild https://mesonbuild.com/Installing.html#installation-tags * A new build configuration option has been added, to allow selecting the default compression algorithm used by systemd-journald and systemd-coredump. This makes it possible to build in support for decompressing all supported formats, but choose a specific one for compression. E.g.: $ meson -Ddefault-compression=xz Experimental features: * sd-boot gained a new *experimental* setting “reboot-for-bitlocker” in loader.conf that implements booting Microsoft Windows from the sd-boot in a way that first reboots the system, to reset the TPM PCRs. This improves compatibility with BitLocker’s TPM use, as the PCRs will only record the Windows boot process, and not sd-boot itself, thus retaining the PCR measurements not involving sd-boot.
Note that this feature is experimental for now, and is likely going to be generalized and renamed in a future release, without retaining compatibility with the current implementation. * A new systemd-sysupdate component has been added that automatically discovers, downloads, and installs A/B-style updates for the host installation itself, or container images, portable service images, and other assets. See the new systemd-sysupdate man page for updates. Contributions from: 4piu, Adam Williamson, adrian5, Albert Brox, AlexCatze, Alex Henrie, Alfonso Sánchez-Beato, Alice S, Alvin Šipraga, amarjargal, Amarjargal, Andrea Pappacoda, Andreas Rammhold, Andy Chi, Anita Zhang, Antonio Alvarez Feijoo, Arfrever Frehtes Taifersar Arahesis, ash, Bastien Nocera, Be, bearhoney, Ben Efros, Benjamin Berg, Benjamin Franzke, Brett Holman, Christian Brauner, Clyde Byrd III, Curtis Klein, Daan De Meyer, Daniele Medri, Daniel Mack, Danilo Krummrich, David, David Bond, Davide Cavalca, David Tardon, davijosw, dependabot[bot], Donald Chan, Dorian Clay, Eduard Tolosa, Elias Probst, Eli Schwartz, Erik Sjölund, Evgeny Vereshchagin, Federico Ceratto, Franck Bui, Frantisek Sumsal, Gaël PORTAY, Georges Basile Stavracas Neto, Gibeom Gwon, Goffredo Baroncelli, Grigori Goronzy, Hans de Goede, Heiko Becker, Hugo Carvalho, Jakob Lell, James Hilliard, Jan Janssen, Jason A. 
Donenfeld, Joan Bruguera, Joerie de Gram, Josh Triplett, Julia Kartseva, Kazuo Moriwaka, Khem Raj, ksa678491784, Lance, Lan Tian, Laura Barcziova, Lennart Poettering, Leviticoh, licunlong, Lidong Zhong, lincoln auster, Lubomir Rintel, Luca Boccassi, Luca BRUNO, lucagoc, Ludwig Nussel, Marcel Hellwig, march1993, Marco Scardovi, Mario Limonciello, Mariusz Tkaczyk, Markus Weippert, Martin, Martin Liska, Martin Wilck, Matija Skala, Matthew Blythe, Matthias Lisin, Matthijs van Duin, Matt Walton, Max Gautier, Michael Biebl, Michael Olbrich, Michal Koutný, Michal Sekletár, Mike Gilbert, MkfsSion, Morten Linderud, Nick Rosbrook, Nikolai Grigoriev, Nikolai Kostrigin, Nishal Kulkarni, Noel Kuntze, Pablo Ceballos, Peter Hutterer, Peter Morrow, Pigmy-penguin, Piotr Drąg, prumian, Richard Neill, Rike-Benjamin Schuppner, rodin-ia, Romain Naour, Ruben Kerkhof, Ryan Hendrickson, Santa Wiryaman, Sebastian Pucilowski, Seth Falco, Simon Ellmann, Sonali Srivastava, Stefan Seering, Stephen Hemminger, tawefogo, techtino, Temuri Doghonadze, Thomas Batten, Thomas Haller, Thomas Weißschuh, Tobias Stoeckmann, Tomasz Pala, Tyson Whitehead, Vishal Chillara Srinivas, Vivien Didelot, w30023233, wangyuhang, Weblate, Xiaotian Wu, yangmingtai, YmrDtnJu, Yonathan Randolph, Yutsuten, Yu Watanabe, Zbigniew Jędrzejewski-Szmek, наб — Edinburgh, 2022-05-21 CHANGES WITH 250: * Support for encrypted and authenticated credentials has been added. This extends the credential logic introduced with v247 to support non-interactive symmetric encryption and authentication, based on a key that is stored on the /var/ file system or in the TPM2 chip (if available), or the combination of both (by default if a TPM2 chip exists the combination is used, otherwise the /var/ key only). The credentials are automatically decrypted at the moment a service is started, and are made accessible to the service itself in unencrypted form. 
A new tool ‘systemd-creds’ encrypts credentials for this purpose, and two new service file settings LoadCredentialEncrypted= and SetCredentialEncrypted= configure such credentials. This feature is useful to store sensitive material such as SSL certificates, passwords and similar securely at rest and only decrypt them when needed, and in a way that is tied to the local OS installation or hardware. * systemd-gpt-auto-generator can now automatically set up discoverable LUKS2 encrypted swap partitions. * The GPT Discoverable Partitions Specification has been substantially extended with support for root and /usr/ partitions for the majority of architectures systemd supports. This includes platforms that do not natively support UEFI, because even though GPT is specified under UEFI umbrella, it is useful on other systems too. Specifically, systemd-nspawn, systemd-sysext, systemd-gpt-auto-generator and Portable Services use the concept without requiring UEFI. * The GPT Discoverable Partitions Specifications has been extended with a new set of partitions that may carry PKCS#7 signatures for Verity partitions, encoded in a simple JSON format. This implements a simple mechanism for building disk images that are fully authenticated and can be tested against a set of cryptographic certificates. This is now implemented for the various systemd tools that can operate with disk images, such as systemd-nspawn, systemd-sysext, systemd-dissect, Portable services/RootImage=, systemd-tmpfiles, and systemd-sysusers. The PKCS#7 signatures are passed to the kernel (where they are checked against certificates from the kernel keyring), or can be verified against certificates provided in userspace (via a simple drop-in file mechanism). * systemd-dissect’s inspection logic will now report for which uses a disk image is intended. 
Specifically, it will display whether an image is suitable for booting on UEFI or in a container (using systemd-nspawn’s –image= switch), whether it can be used as portable service, or attached as system extension. * The system-extension.d/ drop-in files now support a new field SYSEXT_SCOPE= that may encode which purpose a system extension image is for: one of “initrd”, “system” or “portable”. This is useful to make images more self-descriptive, and to ensure system extensions cannot be attached in the wrong contexts. * The os-release file learnt a new PORTABLE_PREFIXES= field which may be used in portable service images to indicate which unit prefixes are supported. * The GPT image dissection logic in systemd-nspawn/systemd-dissect/… now is able to decode images for non-native architectures as well. This allows systemd-nspawn to boot images of non-native architectures if the corresponding user mode emulator is installed and systemd-binfmtd is running. * systemd-logind gained new settings HandlePowerKeyLongPress=, HandleRebootKeyLongPress=, HandleSuspendKeyLongPress= and HandleHibernateKeyLongPress= which may be used to configure actions when the relevant keys are pressed for more than 5s. This is useful on devices that only have hardware for a subset of these keys. By default, if the reboot key is pressed long the poweroff operation is now triggered, and when the suspend key is pressed long the hibernate operation is triggered. Long pressing the other two keys currently does not trigger any operation by default. * When showing unit status updates on the console during boot and shutdown, and a service is slow to start so that the cylon animation is shown, the most recent sd_notify() STATUS= text is now shown as well. Services may use this to make the boot/shutdown output easier to understand, and to indicate what precisely a service that is slow to start or stop is waiting for. 
In particular, the per-user service manager instance now reports what it is doing and which service it is waiting for this way to the system service manager. * The service manager will now re-execute on reception of the SIGRTMIN+25 signal. It previously already did that on SIGTERM, but only when running as PID 1. There was no signal to request this when running as per-user service manager, i.e. as any other PID than 1. SIGRTMIN+25 works for both system and user managers. * The hardware watchdog logic in PID 1 gained support for operating with the default timeout configured in the hardware, instead of insisting on re-configuring it. Set RuntimeWatchdogSec=default to request this behavior. * A new kernel command line option systemd.watchdog_sec= is now understood which may be used to override the hardware watchdog time-out for the boot. * A new setting DefaultOOMScoreAdjust= is now supported in /etc/systemd/system.conf and /etc/systemd/user.conf. It may be used to set the default process OOM score adjustment value for processes started by the service manager. For per-user service managers this now defaults to 100, but for per-system service managers is left as is. This means that by default now services forked off the user service manager are more likely to be killed by the OOM killer than system services or the managers themselves. * A new per-service setting RestrictFileSystems= has been added that restricts the file systems a service has access to by their type. This is based on the new BPF LSM of the Linux kernel. It provides an effective way to make certain API file systems unavailable to services (thus minimizing attack surface). A new command “systemd-analyze filesystems” has been added that lists all known file system types (and how they are grouped together under useful group handles). * Services now support a new setting RestrictNetworkInterfaces= for restricting access to specific network interfaces.
* Service unit files gained new settings StartupAllowedCPUs= and StartupAllowedMemoryNodes=. These are similar to their counterparts without the “Startup” prefix and apply during the boot process only. This is useful to improve boot-time behavior of the system and assign resources differently during boot than during regular runtime. This is similar to the preexisting StartupCPUWeight= vs. CPUWeight=. * Related to this: the various StartupXYZ= settings (i.e. StartupCPUWeight=, StartupAllowedCPUs=, …) are now also applied during shutdown. The settings not prefixed with “Startup” hence apply during regular runtime, and those that are prefixed like that apply during boot and shutdown. * A new per-unit set of conditions/asserts [Condition|Assert][Memory|CPU|IO]Pressure= has been added to make a unit skip/fail activation if the system’s (or a slice’s) memory/cpu/io pressure is above the configured threshold, using the kernel PSI feature. For more details see systemd.unit(5) and https://docs.kernel.org/accounting/psi.html * The combination of ProcSubset=pid and ProtectKernelTunables=yes and/or ProtectKernelLogs=yes can now be used. * The default maximum numbers of inodes have been raised from 64k to 1M for /dev/, and from 400k to 1M for /tmp/. * The per-user service manager learnt support for communicating with systemd-oomd to acquire OOM kill information. * A new service setting ExecSearchPath= has been added that allows changing the search path for executables for services. It affects where we look for the binaries specified in ExecStart= and similar, and the specified directories are also added to the $PATH environment variable passed to invoked processes. * A new setting RuntimeRandomizedExtraSec= has been added for service and scope units that allows extending the runtime time-out as configured by RuntimeMaxSec= with a randomized amount.
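The pressure conditions above are based on the kernel's PSI files. As a rough illustration of what such a check involves, the sketch below parses the text format of a file like /proc/pressure/memory and compares one average against a percentage threshold. The helper names, the default window and the choice of the "full" line are assumptions made for illustration; they do not describe how the service manager itself evaluates the condition.

```python
def parse_psi(text):
    """Parse PSI text into a dict such as {"some": {"avg10": 0.0, ...}, ...}."""
    result = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        # Each line looks like: "some avg10=0.00 avg60=0.00 avg300=0.00 total=0"
        kind, *pairs = line.split()
        result[kind] = {
            key: float(value)
            for key, value in (pair.split("=", 1) for pair in pairs)
        }
    return result


def pressure_above(text, threshold_percent, window="avg300", kind="full"):
    """Return True if the selected PSI average exceeds the threshold.

    The defaults for `window` and `kind` are illustrative assumptions.
    """
    return parse_psi(text)[kind][window] > threshold_percent


def read_pressure(resource="memory"):
    """Read the system-wide PSI file for one resource (memory, cpu or io)."""
    with open(f"/proc/pressure/{resource}", encoding="ascii") as f:
        return f.read()
```

A caller would combine the two, for example `pressure_above(read_pressure("memory"), 10.0)`, on a kernel that has PSI enabled.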
* The syntax of the service unit settings RuntimeDirectory=, StateDirectory=, CacheDirectory=, LogsDirectory= has been extended: if the specified value is now suffixed with a colon, followed by another filename, the latter will be created as symbolic link to the specified directory. This allows creating these service directories together with alias symlinks to make them available under multiple names. * Service unit files gained two new settings TTYRows=/TTYColumns= for configuring rows/columns of the TTY device passed to stdin/stdout/stderr of the service. This is useful to propagate TTY dimensions to a virtual machine. * A new service unit file setting ExitType= has been added that specifies when to assume a service has exited. By default systemd only watches the main process of a service. By setting ExitType=cgroup it can be told to wait for the last process in a cgroup instead. * Automount unit files gained a new setting ExtraOptions= that can be used to configure additional mount options to pass to the kernel when mounting the autofs instance. * “Urlification” (generation of ESC sequences that generate clickable hyperlinks in modern terminals) may now be turned off altogether during build-time. * Path units gained new TriggerLimitBurst= and TriggerLimitIntervalSec= settings that default to 200 and 2 s respectively. The ratelimit ensures that a path unit cannot cause PID1 to busy-loop when it is trying to trigger a service that is skipped because of a Condition*= not being satisfied. This matches the configuration and behaviour of socket units. * The TPM2/FIDO2/PKCS11 support in systemd-cryptsetup is now also built as a plug-in for cryptsetup. This means the plain cryptsetup command may now be used to unlock volumes set up this way. * The TPM2 logic in cryptsetup will now automatically detect systems where the TPM2 chip advertises SHA256 PCR banks but the firmware only updates the SHA1 banks. 
In such a case PCR policies will be automatically bound to the latter, not the former. This makes the PCR policies reliable, but of course does not provide the same level of trust as SHA256 banks. * The TPM2 logic in systemd-cryptsetup now supports RSA primary keys in addition to ECC, improving compatibility with TPM2 chips that do not support ECC. RSA keys are much slower to use than ECC, and hence are only used if ECC is not available. * /etc/crypttab gained support for a new token-timeout= setting for encrypted volumes that allows configuration of the maximum time to wait for PKCS#11/FIDO2 tokens to be plugged in. If the time elapses the logic will query the user for a regular passphrase/recovery key instead. * Support for activating dm-integrity volumes at boot via a new file /etc/integritytab and the tool systemd-integritysetup has been added. This is similar to /etc/crypttab and /etc/veritytab, but deals with dm-integrity instead of dm-crypt/dm-verity. * The systemd-veritysetup-generator now understands a new usrhash= kernel command line option for specifying the Verity root hash for the partition backing the /usr/ file system. A matching set of systemd.verity_usr_* kernel command line options has been added as well. These all work similarly to the corresponding options for the root partition. * The sd-device API gained a new API call sd_device_get_diskseq() to return the DISKSEQ property of a device structure. The “disk sequence” concept is a new feature recently introduced to the Linux kernel that allows detecting reuse cycles of block devices, i.e. can be used to recognize when loopback block devices are reused for a different purpose or CD-ROM drives get their media changed. * A new unit systemd-boot-update.service has been added. If enabled (the default) and the sd-boot loader is detected to be installed, it is automatically updated to the newest version when out of date.
This is useful to ensure the boot loader remains up-to-date, and updates automatically propagate from the OS tree in /usr/. * sd-boot will now build with SBAT by default in order to facilitate working with recent versions of Shim that require it to be present. * sd-boot can now parse Microsoft Windows’ Boot Configuration Data. This is used to robustly generate boot entry titles for Windows. * A new generic target unit factory-reset.target has been added. It is hooked into systemd-logind similar in fashion to reboot/poweroff/suspend/hibernate, and is supposed to be used to initiate a factory reset operation. What precisely this operation entails is up to the implementer to decide; the primary goal of the new unit is to provide a framework where to plug in the implementation and how to trigger it. * A new meson build-time option ‘clock-valid-range-usec-max’ has been added which takes a time in µs and defaults to 15 years. If the RTC time is noticed to be more than the specified time ahead of the built-in epoch of systemd (which by default is the release timestamp of systemd) it is assumed that the RTC is not working correctly, and the RTC is reset to the epoch. (It already is reset to the epoch when noticed to be before it.) This should increase the chance that time doesn’t accidentally jump too far ahead due to faulty hardware or batteries. * A new setting SaveIntervalSec= has been added to systemd-timesyncd, which may be used to automatically save the current system time to disk in regular intervals. This is useful to maintain a roughly monotonic clock even without RTC hardware and with some robustness against abnormal system shutdown. * systemd-analyze verify gained support for a pair of new --image= + --root= switches for verifying units below a specific root directory/image instead of on the host. * systemd-analyze verify gained support for verifying unit files under an explicitly specified unit name, independently of what the filename actually is.
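The RTC sanity rule tied to ‘clock-valid-range-usec-max’ above can be modelled in a few lines. This is only a sketch of the decision as the notes describe it: the function name is hypothetical, and the length of a year used for the 15-year default (365.25 days) is an assumption.

```python
USEC_PER_SEC = 1_000_000
# Assumption for illustration: one year counted as 365.25 days.
USEC_PER_YEAR = 31_557_600 * USEC_PER_SEC
DEFAULT_VALID_RANGE_USEC_MAX = 15 * USEC_PER_YEAR


def sanitize_rtc_time(rtc_usec, epoch_usec,
                      valid_range_usec_max=DEFAULT_VALID_RANGE_USEC_MAX):
    """Return the time to use, given the RTC reading and the built-in epoch.

    The RTC value is kept only if it lies within
    [epoch, epoch + valid_range_usec_max]; otherwise the epoch is used.
    """
    if rtc_usec < epoch_usec:
        return epoch_usec  # clock is before the epoch: reset
    if rtc_usec > epoch_usec + valid_range_usec_max:
        return epoch_usec  # clock is implausibly far ahead: reset
    return rtc_usec
```

Resetting to the epoch in both directions keeps the accepted window bounded, so a failing battery that yields a far-future date is treated the same way as one that yields a date in the past.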
* systemd-analyze verify gained a new switch --recursive-errors= which controls whether to only fail on errors found in the specified units or recursively any dependent units. * systemd-analyze security now supports a new --offline mode for analyzing unit files stored on disk instead of loaded units. It may be combined with --root=/--image= to analyze unit files under a root directory or disk image. It also learnt a new --threshold= parameter for specifying an exposure level threshold: if the exposure level exceeds the specified value the call will fail. It also gained a new --security-policy= switch for configuring security policies to enforce on the units. A policy is a JSON file that lists which tests shall be weighted how much to determine the overall exposure level. Altogether these new features are useful for fully automatic analysis and enforcement of security policies on unit files. * systemd-analyze security gained a new --json= switch for JSON output. * systemd-analyze learnt a new --quiet switch for reducing non-essential output. It’s honored by the “dot”, “syscall-filter”, “filesystems” commands. * systemd-analyze security gained a --profile= option that can be used to take into account a portable profile when analyzing portable services, since a lot of the security-related settings are enabled through them. * systemd-analyze learnt a new inspect-elf verb that parses ELF core files, binaries and executables and prints metadata information, including the build-id and other info described on: https://systemd.io/COREDUMP_PACKAGE_METADATA/ * .network files gained a new UplinkInterface= in the [IPv6SendRA] section, for automatically propagating DNS settings from other interfaces. * The static lease DHCP server logic in systemd-networkd may now serve IP addresses outside of the configured IP pool range for the server. * CAN support in systemd-networkd gained four new settings Loopback=, OneShot=, PresumeAck=, ClassicDataLengthCode= for tweaking CAN control modes.
It gained a number of further settings for tweaking CAN timing quanta. * The [CAN] section in .network file gained new TimeQuantaNSec=, PropagationSegment=, PhaseBufferSegment1=, PhaseBufferSegment2=, SyncJumpWidth=, DataTimeQuantaNSec=, DataPropagationSegment=, DataPhaseBufferSegment1=, DataPhaseBufferSegment2=, and DataSyncJumpWidth= settings to control bit-timing processed by the CAN interface. * DHCPv4 client support in systemd-networkd learnt a new Label= option for configuring the address label to apply to configured IPv4 addresses. * The [IPv6AcceptRA] section of .network files gained support for a new UseMTU= setting that may be used to control whether to apply the announced MTU settings to the local interface. * The [DHCPv4] section in .network file gained a new Use6RD= boolean setting to control whether the DHCPv4 client requests and processes the DHCP 6RD option. * The [DHCPv6PrefixDelegation] section in .network file is renamed to [DHCPPrefixDelegation], as now the prefix delegation is also supported with DHCPv4 protocol by enabling the Use6RD= setting. * The [DHCPPrefixDelegation] section in .network file gained a new setting UplinkInterface= to specify the upstream interface. * The [DHCPv6] section in .network file gained a new setting UseDelegatedPrefix= to control whether the delegated prefixes will be propagated to the downstream interfaces. * The [IPv6AcceptRA] section of .network files now understands two new settings UseGateway=/UseRoutePrefix= for explicitly configuring whether to use the relevant fields from the IPv6 Router Advertisement records. * The ForceDHCPv6PDOtherInformation= setting in the [DHCPv6] section has been removed. Please use the WithoutRA= and UseDelegatedPrefix= settings in the [DHCPv6] section and the DHCPv6Client= setting in the [IPv6AcceptRA] section to control when the DHCPv6 client is started and how the delegated prefixes are handled by the DHCPv6 client.
* The IPv6Token= setting in the [Network] section is deprecated, and the [IPv6AcceptRA] section gained the Token= setting as its replacement. The [IPv6Prefix] section also gained the Token= setting. The Token= setting gained an ‘eui64’ mode to explicitly configure an address with the EUI64 algorithm based on the interface MAC address. The ‘prefixstable’ mode can now optionally take a secret key. The Token= setting in the [DHCPPrefixDelegation] section now supports all algorithms supported by the same settings in the other sections.
* The [RoutingPolicyRule] section of .network files gained a new SuppressInterfaceGroup= setting.
* The IgnoreCarrierLoss= setting in the [Network] section of .network files now allows a duration to be specified, controlling how long to wait before reacting to carrier loss.
* The [DHCPServer] section of .network files gained a new Router= setting to specify the router address.
* The [CAKE] section of .network files gained various new settings AutoRateIngress=, CompensationMode=, FlowIsolationMode=, NAT=, MPUBytes=, PriorityQueueingPreset=, FirewallMark=, Wash=, SplitGSO=, and UseRawPacketSize= for configuring CAKE.
* systemd-networkd now ships with new default .network files: 80-container-vb.network, which matches the host-side network bridge device created by systemd-nspawn’s --network-bridge or --network-zone switch, and 80-6rd-tunnel.network, which matches the automatically created sit tunnel with a 6rd prefix when the DHCP 6RD option is received.
* systemd-networkd’s handling of Endpoint= resolution for WireGuard interfaces has been improved.
* systemd-networkd will now automatically configure routes to addresses specified in AllowedIPs=. This feature can be controlled via RouteTable= and RouteMetric= settings in [WireGuard] or [WireGuardPeer] sections.
* systemd-networkd will now once again automatically generate persistent MAC addresses for batadv and bridge interfaces. Users can disable this by using MACAddress=none in .netdev files.
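The ‘eui64’ Token= mode mentioned above derives the interface identifier from the interface MAC address. A minimal sketch of the conventional modified EUI-64 construction (RFC 4291, Appendix A) follows. It illustrates the algorithm in general and is not systemd-networkd's code; the function names, the prefix and the MAC address are hypothetical.

```python
import ipaddress


def eui64_interface_id(mac):
    """Build the 64-bit modified EUI-64 interface identifier from a 48-bit MAC."""
    octets = bytes(int(part, 16) for part in mac.split(":"))
    if len(octets) != 6:
        raise ValueError("expected a 48-bit MAC address")
    # Flip the universal/local bit of the first octet and insert ff:fe
    # between the two halves of the MAC address.
    return bytes([octets[0] ^ 0x02]) + octets[1:3] + b"\xff\xfe" + octets[3:6]


def eui64_address(prefix, mac):
    """Combine a /64 (or shorter) prefix with the EUI-64 interface identifier."""
    network = ipaddress.IPv6Network(prefix)
    if network.prefixlen > 64:
        raise ValueError("prefix must leave 64 bits for the interface identifier")
    iid = int.from_bytes(eui64_interface_id(mac), "big")
    return ipaddress.IPv6Address(int(network.network_address) | iid)


# Documentation prefix and a made-up MAC address, for illustration only.
print(eui64_address("2001:db8::/64", "00:11:22:33:44:55"))
# 2001:db8::211:22ff:fe33:4455
```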
* systemd-networkd and systemd-udevd now support IP over InfiniBand interfaces. The Kind= setting in .netdev files now accepts “ipoib”, and .netdev files gained a new [IPoIB] section.
* systemd-networkd and systemd-udevd now support the net.ifname-policy= option on the kernel command line. This is implemented through the systemd-network-generator service that automatically generates appropriate .link, .network, and .netdev files.
* The various systemd-udevd “ethtool” buffer settings now understand the special value “max” to configure the buffers to the maximum the hardware supports.
* systemd-udevd’s .link files may now configure a large variety of NIC coalescing settings, plus more hardware offload settings.
* .link files gained a new WakeOnLanPassword= setting in the [Link] section that allows specifying a WoL “SecureOn” password on hardware that supports this.
* systemd-nspawn’s --setenv= switch now supports an additional syntax: if only a variable name is specified (i.e. without being suffixed by a ‘=’ character and a value) the current value of the environment variable is propagated to the container. For example, --setenv=FOO will look up the current value of $FOO in the environment, and pass it down to the container. Similar behavior has been added to homectl’s, machinectl’s and systemd-run’s --setenv= switch.
* systemd-nspawn gained a new switch --suppress-sync= which may be used to optionally suppress the effect of the sync()/fsync()/fdatasync() system calls for the container payload. This is useful for build system environments where safety against abnormal system shutdown is not essential as all build artifacts can be regenerated any time, but the performance win is beneficial.
* systemd-nspawn will now raise the RLIMIT_NOFILE hard limit to the same value that PID 1 uses for most forked off processes.
* systemd-nspawn’s --bind=/--bind-ro= switches now optionally take uidmap/nouidmap options as last parameter.
If “uidmap” is used, the bind mounts are created with a UID mapping that ensures the host’s file ownerships are mapped 1:1 to container file ownerships, even if user namespacing is used. This way files/directories bound into containers will no longer show up as owned by the nobody user as they typically did if no special care was taken to shift them manually.
* When discovering Windows installations sd-boot will now attempt to show the Windows version.
* The color scheme to use in sd-boot may now be configured at build-time.
* sd-boot gained the ability to change screen resolution during boot-time, by hitting the “r” key. This will cycle through available resolutions and save the last selection.
* sd-boot learnt a new hotkey “f”. When pressed the system will enter firmware setup. This is useful in environments where it is difficult to hit the right keys early enough to enter the firmware, and works on any firmware regardless of which key it natively uses.
* sd-boot gained support for automatically booting into the menu item selected on the last boot (using the “@saved” identifier for menu items).
* sd-boot gained support for automatically loading all EFI drivers placed in the /EFI/systemd/drivers/ subdirectory of the EFI System Partition (ESP). These drivers are loaded before the menu entries are loaded. This is useful e.g. to load additional file system drivers for the XBOOTLDR partition.
* systemd-boot will now paint the input cursor on its own instead of relying on the firmware to do so, increasing compatibility with broken firmware that doesn’t make the cursor reasonably visible.
* sd-boot now embeds a .osrel PE section like we expect from Boot Loader Specification Type #2 Unified Kernels. This means sd-boot itself may be used in place of a Type #2 Unified Kernel. This is useful for debugging purposes as it allows chain-loading one (development) sd-boot instance from another.
* sd-boot now supports a new “devicetree” field in Boot Loader Specification Type #1 entries: if configured, the specified device tree file is installed before the kernel is invoked. This is useful for installing/applying new devicetree files without updating the kernel image.
* Similarly, sd-stub can now read devicetree data from a PE section “.dtb” and apply it before invoking the kernel.
* sd-stub (the EFI stub that can be glued in front of a Linux kernel) gained the ability to pick up credentials and sysext files, wrap them in a cpio archive, and pass them as an additional initrd to the invoked Linux kernel, in effect placing those files in the /.extra/ directory of the initrd environment. This is useful to implement trusted initrd environments which are fully authenticated but still can be extended (via sysexts) and parameterized (via encrypted/authenticated credentials, see above). Credentials can be located next to the kernel image file (credentials specific to a single boot entry), or in one of the shared directories (credentials applicable to multiple boot entries).
* sd-stub now comes with a full man page that explains its feature set and how to combine a kernel image, an initrd and the stub to build a complete EFI unified kernel image, implementing Boot Loader Specification Type #2.
* sd-stub may now provide the initrd to the executed kernel via the LINUX_EFI_INITRD_MEDIA_GUID EFI protocol, adding compatibility for non-x86 architectures.
* bootctl learnt new set-timeout and set-timeout-oneshot commands that may be used to set the boot menu time-out of the boot loader (for all or just the subsequent boot).
* bootctl and kernel-install will now read variables KERNEL_INSTALL_LAYOUT= from /etc/machine-info and layout= from /etc/kernel/install.conf. When set, it specifies the layout to use for installation directories on the boot partition, so that tools don’t need to guess it based on the already-existing directories.
The only value that is defined natively is “bls”, corresponding to the layout specified in https://systemd.io/BOOT_LOADER_SPECIFICATION/. Plugins for kernel-install that implement a different layout can declare other values for this variable.
‘bootctl install’ will now write KERNEL_INSTALL_LAYOUT=bls, on the assumption that if the user installed sd-boot to the ESP, they intend to use the entry layout understood by sd-boot. It’ll also write KERNEL_INSTALL_MACHINE_ID= if it creates any directories using the ID (and it wasn’t specified in the config file yet). Similarly, kernel-install will now write KERNEL_INSTALL_MACHINE_ID= (if it wasn’t specified in the config file yet). Effectively, those changes mean that the machine-id used for boot loader entry installation is “frozen” upon first use and becomes independent of the actual machine-id.
Configuring KERNEL_INSTALL_MACHINE_ID fixes the following problem: images created for distribution (“golden images”) are built with no machine-id, so that a unique machine-id can be created on the first boot. But those images may contain boot loader entries with the machine-id used during build included in paths. Using a “frozen” value allows unambiguously identifying entries that match the specific installation, while still permitting parallel installations without conflict.
Configuring KERNEL_INSTALL_LAYOUT obviates the need for kernel-install to guess the installation layout. This fixes the problem where a (possibly empty) directory in the boot partition is created from a different layout causing kernel-install plugins to assume the wrong layout. A particular example of how this may happen is the grub2 package in Fedora which includes directories under /boot directly in its file list. Various other packages pull in grub2 as a dependency, so it may be installed even if unused, breaking installations that use the bls layout.
* bootctl and systemd-bless-boot can now be linked statically.
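To illustrate the two configuration sources named above, the following sketch reads KERNEL_INSTALL_LAYOUT= and layout= from file contents. It assumes both files consist of simple shell-style KEY=VALUE lines; the helper name and the sample contents are hypothetical, and the sketch deliberately does not model which file takes precedence, since that is not stated here.

```python
import shlex


def read_setting(text, key):
    """Return the value assigned to `key` in KEY=VALUE text, or None if unset.

    Hypothetical helper: blank lines and '#' comments are skipped, and a later
    assignment overrides an earlier one.
    """
    value = None
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, _, raw = line.partition("=")
        if name.strip() == key:
            parts = shlex.split(raw)  # strips optional shell-style quoting
            value = parts[0] if parts else ""
    return value


# Made-up file contents standing in for /etc/machine-info and
# /etc/kernel/install.conf.
machine_info = 'PRETTY_HOSTNAME="Build box"\nKERNEL_INSTALL_LAYOUT=bls\n'
install_conf = "# kernel-install settings\nlayout=bls\n"

print(read_setting(machine_info, "KERNEL_INSTALL_LAYOUT"))  # bls
print(read_setting(install_conf, "layout"))  # bls
```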
* systemd-sysext now optionally doesn’t insist on extension-release.d/ files being placed in the image under the image’s file name. If the file system xattr user.extension-release.strict is set on the extension release file, it is accepted regardless of its name. This relaxes security restrictions a bit, as a system extension may be attached under a wrong name this way.
* udevadm’s test-builtin command learnt a new --action= switch for testing the built-in with the specified action (in place of the default ‘add’).
* udevadm info gained new switches --property=/--value for showing only specific udev properties/values instead of all.
* A new hwdb database has been added that contains matches for various types of signal analyzers (protocol analyzers, logic analyzers, oscilloscopes, multimeters, bench power supplies, etc.) that should be accessible to regular users.
* A new hwdb database entry has been added that carries information about types of cameras (regular or infrared), and in which direction they point (front or back).
* A new rule to allow console users access to rfkill by default has been added to hwdb.
* Device nodes for the Software Guard eXtension enclaves (sgx_vepc) are now also owned by the system group “sgx”.
* A new build-time meson option “extra-net-naming-schemes=” has been added to define additional naming schemes for udev’s network interface naming logic. This is useful for enterprise distributions and similar which want to pin the schemes of certain distribution releases under a specific name and previously had to patch the sources to introduce new named schemes.
* The predictable naming logic for network interfaces has been extended to generate stable names from Xen netfront device information.
* hostnamed’s chassis property can now be sourced from the chassis-type field encoded in devicetree (in addition to the existing DMI support).
* systemd-cgls now optionally displays cgroup IDs and extended attributes for each cgroup.
(Controllable via the new --xattr= and --cgroup-id= switches.)
* coredumpctl gained a new --all switch for operating on all Journal files instead of just the local ones.
* systemd-coredump will now use libdw/libelf via dlopen() rather than directly linking, allowing users to easily opt out of backtrace/metadata analysis of core files, and reduce image sizes when this is not needed.
* systemd-coredump will now analyze core files with libdw/libelf in a forked, sandboxed process.
* systemd-homed will now try to unmount an active home area at regular intervals once the user has logged out fully. Previously this was attempted exactly once, and if the home directory was busy for some reason it was not tried again.
* systemd-homed’s LUKS2 home area backend will now create a BSD file system lock on the image file while the home area is active (i.e. mounted). If a home area is found to be locked, logins are politely refused. This should improve behavior when using home area images that are accessible via the network from multiple clients, and reduce the chance of accidental file system corruption in that case.
* Optionally, systemd-homed will now drop the kernel buffer cache once a user has fully logged out, configurable via the new --drop-caches= homectl switch.
* systemd-homed now makes use of UID mapped mounts for the home areas. If the kernel and used file system support it, files are now internally owned by the “nobody” user (i.e. the user typically used for indicating “this ownership is not mapped”), and dynamically mapped to the UID used locally on the system via the UID mapping mount logic of recent kernels. This makes migrating home areas between different systems cheaper because recursively chown()ing file system trees is no longer necessary.
* systemd-homed’s CIFS backend now optionally supports CIFS service names with a directory suffix, in order to place home directories in a subdirectory of a CIFS share, instead of the top-level directory.
* systemd-homed’s CIFS backend gained support for specifying additional mount options in the JSON user record (cifsExtraMountOptions field, and --cifs-extra-mount-options= homectl switch). This is for example useful for configuring mount options such as “noserverino” that some SMB3 services require (for instance, this makes it possible to run a homed home directory from a FritzBox SMB3 share).
* systemd-homed will now default to btrfs’ zstd compression for home areas. This is inspired by Fedora’s recent decision to switch to zstd by default.
* Support for additional mount options to use when mounting the file system of LUKS2 volumes in systemd-homed has been added. They may be configured via the $SYSTEMD_HOME_MOUNT_OPTIONS_BTRFS, $SYSTEMD_HOME_MOUNT_OPTIONS_EXT4, $SYSTEMD_HOME_MOUNT_OPTIONS_XFS environment variables passed to systemd-homed, or via the luksExtraMountOptions user record JSON property (exposed via homectl --luks-extra-mount-options).
* homectl’s resize command now takes the special size specifications “min” and “max” to shrink/grow the home area to the minimum/maximum size possible, taking disk usage/space constraints and file system limitations into account. Resizing is now generally graceful: the logic will try to get as close to the specified size as possible, but not consider it a failure if the request couldn’t be fulfilled precisely.
* systemd-homed gained the ability to automatically shrink home areas on logout to their minimal size and grow them again on next login. This ensures that while inactive, a home area only takes up the minimal space necessary, but once activated, it provides sufficient space for the user’s needs. This behavior is only supported if btrfs is used as file system inside the home area (because only for btrfs online growing/shrinking is implemented in the kernel). This behavior is now enabled by default, but may be controlled via the new --auto-resize-mode= setting of homectl.
* systemd-homed gained support for automatically re-balancing free disk space among active home areas, in case the LUKS2 backends are used, and no explicit disk size was requested. This way disk space is automatically managed and home areas are resized at regular intervals, so that manual resizing when disk space becomes scarce should no longer be necessary. This behavior is only supported if btrfs is used within the home areas (as only then online shrinking and growing is supported), and may be configured via the new rebalanceWeight JSON user record field (as exposed via the new --rebalance-weight= homectl setting). Re-balancing is mostly automatic, but can also be requested explicitly via “homectl rebalance”, which is synchronous, and thus may be used to wait until the rebalance run is complete.
* userdbctl gained a --json= switch for configuring the JSON formatting to use when outputting user or group records.
* userdbctl gained a new --multiplexer= switch for explicitly configuring whether to use the systemd-userdbd server side user record resolution logic.
* userdbctl’s ssh-authorized-keys command learnt a new --chain switch, for chaining up another command to execute after completing the look-up. Since OpenSSH’s AuthorizedKeysCommand only allows configuring a single command to invoke, this may be used to invoke several: first userdbctl’s own implementation, and then any other command also configured on the command line.
* The sd-event API gained a new function sd_event_add_inotify_fd() that is similar to sd_event_add_inotify() but accepts a file descriptor instead of a path in the file system for referencing the inode to watch.
* The sd-event API gained a new function sd_event_source_set_ratelimit_expire_callback() that may be used to define a callback function that is called whenever an event source leaves the rate limiting phase.
* New documentation has been added explaining which steps are necessary to port systemd to a new architecture: https://systemd.io/PORTING_TO_NEW_ARCHITECTURES
* The x-systemd.makefs option in /etc/fstab now explicitly supports ext2, ext3, and f2fs file systems.
* Mount units and units generated from /etc/fstab entries with ‘noauto’ are now ordered the same as other units. Effectively, they will be started earlier (if something actually pulled them in) and stopped later, similarly to normal mount units that are part of local-fs.target. This change should be invisible to users, but should prevent those units from being stopped too early during shutdown.
* The systemd-getty-generator now honors a new kernel command line argument systemd.getty_auto= and a new environment variable $SYSTEMD_GETTY_AUTO that allow turning it off at boot. This is for example useful to turn off gettys inside of containers or similar environments.
* systemd-resolved now listens on a second DNS stub address: 127.0.0.54 (in addition to 127.0.0.53, as before). If DNS requests are sent to this address they are propagated in “bypass” mode only, i.e. they are barely processed locally, but mostly forwarded as-is to the current upstream DNS servers. This provides a stable DNS server address that proxies all requests dynamically to the right upstream DNS servers even if these dynamically change. This stub does not do mDNS/LLMNR resolution. However, it will translate look-ups to DNS-over-TLS if necessary. This new stub is particularly useful in container/VM environments, or for tethering setups: use DNAT to redirect traffic to any IP address to this stub.
* systemd-importd now honors new environment variables $SYSTEMD_IMPORT_BTRFS_SUBVOL, $SYSTEMD_IMPORT_BTRFS_QUOTA, $SYSTEMD_IMPORT_SYNC, which may be used to disable btrfs subvolume generation, btrfs quota setup and disk synchronization.
* systemd-importd and systemd-resolved can now be optionally built with OpenSSL instead of libgcrypt.
* systemd-repart no longer requires OpenSSL.
* systemd-sysusers will no longer create the redundant ‘nobody’ group by default, as the ‘nobody’ user is already created with an appropriate primary group.
* If a unit uses RuntimeMaxSec=, systemctl show will now display it.
* systemctl show-environment gained support for --output=json.
* pam_systemd will now first try to use the X11 abstract socket, and fall back to the socket file in /tmp/.X11-unix/ only if that does not work.
* systemd-journald will no longer go back to volatile storage regardless of configuration when its unit is restarted.
* Initial support for the LoongArch architecture has been added (system call lists, GPT partition table UUIDs, etc).
* systemd-journald’s own logging messages are now also logged to the journal itself when systemd-journald logs to /dev/kmsg.
* systemd-journald now re-enables COW for archived journal files on filesystems that support COW. One benefit of this change is that archived journal files will now get compressed on btrfs filesystems that have compression enabled.
* systemd-journald now deduplicates fields in a single log message before adding it to the journal. In archived journal files, it will also punch holes for unused parts and truncate the file as appropriate, leading to reductions in disk usage.
* journalctl --verify was extended with more informative error messages.
* More of sd-journal’s functions are now resistant against journal file corruption.
* The shutdown command learnt a new option --show, to display the scheduled shutdown.
* A LICENSES/ directory is now included in the git tree. It contains a README.md file that explains the licenses used by source files in this repository. It also contains the text of all applicable licenses as they appear on spdx.org.
Contributions from: Aakash Singh, acsfer, Adolfo Jayme Barrientos, Adrian Vovk, Albert Brox, Alberto Mardegan, Alexander Kanavin, alexlzhu, Alfonso Sánchez-Beato, Alvin Šipraga, Alyssa Ross, Amir Omidi, Anatol Pomozov, Andika Triwidada, Andreas Rammhold, Andreas Valder, Andrej Lajovic, Andrew Soutar, Andrew Stone, Andy Chi, Anita Zhang, Anssi Hannula, Antonio Alvarez Feijoo, Antony Deepak Thomas, Arnaud Ferraris, Arvid E. Picciani, Bastien Nocera, Benjamin Berg, Benjamin Herrenschmidt, Ben Stockett, Bogdan Seniuc, Boqun Feng, Carl Lei, chlorophyll-zz, Chris Packham, Christian Brauner, Christian Göttsche, Christian Wehrli, Christoph Anton Mitterer, Cristian Rodríguez, Daan De Meyer, Daniel Maixner, Dann Frazier, Dan Streetman, Davide Cavalca, David Seifert, David Tardon, dependabot[bot], Dimitri John Ledkov, Dimitri Papadopoulos, Dimitry Ishenko, Dmitry Khlebnikov, Dominique Martinet, duament, Egor, Egor Ignatov, Emil Renner Berthing, Emily Gonyer, Ettore Atalan, Evgeny Vereshchagin, Florian Klink, Franck Bui, Frantisek Sumsal, Geass-LL, Gibeom Gwon, GnunuX, Gogo Gogsi, gregzuro, Greg Zuro, Gustavo Costa, Hans de Goede, Hela Basa, Henri Chain, hikigaya58, Hugo Carvalho, Hugo Osvaldo Barrera, Iago Lopez Galeiras, Iago López Galeiras, I-dont-need-name, igo95862, Jack Dähn, James Hilliard, Jan Janssen, Jan Kuparinen, Jan Macku, Jan Palus, Jarkko Sakkinen, Jayce Fayne, jiangchuangang, jlempen, John Lindgren, Jonas Dreßler, Jonas Jelten, Jonas Witschel, Joris Hartog, José Expósito, Julia Kartseva, Kai-Heng Feng, Kai Wohlfahrt, Kay Siver Bø, KennthStailey, Kevin Kuehler, Kevin Orr, Khem Raj, Kristian Klausen, Kyle Laker, lainahai, LaserEyess, Lennart Poettering, Lia Lenckowski, longpanda, Luca Boccassi, Luca BRUNO, Ludwig Nussel, Lukas Senionis, Maanya Goenka, Maciek Borzecki, Marcel Menzel, Marco Scardovi, Marcus Harrison, Mark Boudreau, Matthijs van Duin, Mauricio Vásquez, Maxime de Roucy, Max Resch, MertsA, Michael Biebl, Michael Catanzaro, Michal Koutný, Michal 
Sekletár, Miika Karanki, Mike Gilbert, Milo Turner, ml, monosans, Nacho Barrientos, nassir90, Nishal Kulkarni, nl6720, Ondrej Kozina, Paulo Neves, Pavel Březina, pedro martelletto, Peter Hutterer, Peter Morrow, Piotr Drąg, Rasmus Villemoes, ratijas, Raul Tambre, rene, Riccardo Schirone, Robert-L-Turner, Robert Scheck, Ross Jennings, saikat0511, Scott Lamb, Scott Worley, Sergei Trofimovich, Sho Iizuka, Slava Bacherikov, Slimane Selyan Amiri, StefanBruens, Steven Siloti, svonohr, Taiki Sugawara, Takashi Sakamoto, Takuro Onoue, Thomas Blume, Thomas Haller, Thomas Mühlbacher, Tianlu Shao, Toke Høiland-Jørgensen, Tom Yan, Tony Asleson, Topi Miettinen, Ulrich Ölmann, Urs Ritzmann, Vincent Bernat, Vito Caputo, Vladimir Panteleev, WANG Xuerui, Wind/owZ, Wu Xiaotian, xdavidwu, Xiaotian Wu, xujing, yangmingtai, Yao Wei, Yao Wei (魏銘廷), Yegor Alexeyev, Yu Watanabe, Zbigniew Jędrzejewski-Szmek, Дамјан Георгиевски, наб
— Warsaw, 2021-12-23
CHANGES WITH 249:
* When operating on disk images via the --image= switch of various tools (such as systemd-nspawn or systemd-dissect), or when udev finds no ‘root=’ parameter on the kernel command line, and multiple suitable root or /usr/ partitions exist in the image, then a simple comparison inspired by strverscmp() is done on the GPT partition label, and the newest partition is picked. This permits a simple and generic whole-file-system A/B update logic where new operating system versions are dropped into partitions whose label is then updated with a matching version identifier.
* systemd-sysusers now supports querying the passwords to set for the users it creates via the “credentials” logic introduced in v247: the passwd.hashed-password.<user> and passwd.plaintext-password.<user> credentials are consulted for the password to use (either in UNIX hashed form, or literally). By default these credentials are inherited down from PID1 (which in turn imports them from a container manager if there is one).
This permits easy configuration of user passwords during first boot. Example:
    # systemd-nspawn -i foo.raw --volatile=yes --set-credential=passwd.plaintext-password.root:foo
Note that systemd-sysusers operates in purely additive mode: it executes no operation if the declared users already exist, and hence doesn’t set any passwords as effect of the command line above if the specified root user exists already in the image. (Note that --volatile=yes ensures it doesn’t, though.)
* systemd-firstboot now also supports querying various system parameters via the credential subsystem. Thus, as above, this may be used to initialize important system parameters on first boot of previously unprovisioned images (i.e. images with a mostly empty /etc/).
* PID 1 may now show both the unit name and the unit description strings in its status output during boot. This may be configured with StatusUnitFormat=combined in system.conf or systemd.status-unit-format=combined on the kernel command line.
* The systemd-machine-id-setup tool now supports a --image= switch for provisioning a machine ID file into an OS disk image, similar to how --root= operates on an OS file tree. This matches the existing switch of the same name for systemd-tmpfiles, systemd-firstboot, and systemd-sysusers tools.
* Similarly, systemd-repart gained support for the --image= switch too. In combination with the existing --size= option, this makes the tool particularly useful for easily growing disk images in a single invocation, following the declarative rules included in the image itself.
* systemd-repart’s partition configuration files gained support for a new setting MakeDirectories= which may be used to create arbitrary directories inside file systems that are created, before registering them in the partition table. This is useful in particular for root partitions to create mount point directories for other partitions included in the image.
For example, a disk image that contains root, /home/, and /var/ partitions may set MakeDirectories=/home /var to create /home/ and /var/ as empty directories in the root file system on its creation, so that the resulting image can be mounted immediately, even in read-only mode.
* systemd-repart’s CopyBlocks= setting gained support for the special value “auto”. If used, a suitable matching partition on the booted OS is found as source to copy blocks from. This is useful when implementing replicating installers, that are booted from one medium and then stream their own root partition onto the target medium.
* systemd-repart’s partition configuration files gained support for a Flags=, a ReadOnly= and a NoAuto= setting, allowing control of these GPT partition flags for the created partitions: this is useful for marking newly created partitions as read-only, or as not being subject to automatic mounting from creation on.
* The /etc/os-release file has been extended with two new (optional) variables IMAGE_VERSION= and IMAGE_ID=, carrying identity and version information for OS images that are updated comprehensively and atomically as one image. Two new specifiers %M, %A now resolve to these two fields in the various configuration options that resolve specifiers.
* portablectl gained a new switch --extension= for enabling portable service images with extensions that follow the extension image concept introduced with v248, and thus allows layering multiple images when setting up the root filesystem of the service.
* systemd-coredump will now extract ELF build-id information from processes dumping core and include it in the coredump report. Moreover, it will look for ELF .note.package sections with distribution packaging meta-information about the crashing process. This is useful to directly embed the rpm or deb (or any other) package name and version in ELF files, making it easy to match coredump reports with the specific package for which the software was compiled.
This is particularly useful in environments with ELF files from multiple vendors, different distributions and versions, as is common today in our containerized and sand-boxed world. For further information, see: https://systemd.io/COREDUMP_PACKAGE_METADATA
* A new udev hardware database has been added for FireWire devices (IEEE 1394).
* The “net_id” built-in of udev has been updated with three backwards-incompatible changes:
  - PCI hotplug slot names on s390 systems are now parsed as hexadecimal numbers. They were incorrectly parsed as decimal previously, or ignored if the name was not a valid decimal number.
  - PCI onboard indices up to 65535 are allowed. Previously, numbers above 16383 were rejected. This primarily impacts s390 systems, where values up to 65535 are used.
  - Invalid characters in interface names are replaced with “_”.
  The new version of the net naming scheme is “v249”. The previous scheme can be selected via the “net.naming-scheme=v247” kernel command line parameter.
* sd-bus’ sd_bus_is_ready() and sd_bus_is_open() calls now accept a NULL bus object, for which they will return false. Or in other words, an unallocated bus connection is neither ready nor open.
* The sd-device API acquired a new API function sd_device_get_usec_initialized() that returns the monotonic time when the udev device first appeared in the database.
* sd-device gained new APIs sd_device_trigger_with_uuid() and sd_device_get_trigger_uuid(). The former is similar to sd_device_trigger() but returns a randomly generated UUID that is associated with the synthetic uevent generated by the call. This UUID may be read from the sd_device object a monitor eventually receives, via sd_device_get_trigger_uuid(). This interface requires kernel 4.13 or above to work, and allows tracking a synthetic uevent through the entire device management stack. The “udevadm trigger --settle” logic has been updated to make use of this concept if available to wait precisely for the uevents it generates.
“udevadm trigger” also gained a new parameter --uuid that prints the UUID for each generated uevent.
* sd-device also gained new APIs sd_device_new_from_ifname() and sd_device_new_from_ifindex() for allocating an sd-device object for the specified network interface. The former accepts an interface name (either a primary or an alternative name), the latter an interface index.
* The native Journal protocol has been documented. Clients may speak this protocol as an alternative to the classic BSD syslog protocol for locally delivering log records to the Journal. The protocol has been stable for a long time and has in fact already been implemented in a variety of alternative client libraries. This documentation makes the support for that official: https://systemd.io/JOURNAL_NATIVE_PROTOCOL
* A new BPFProgram= setting has been added to service files. It may be set to a path to a loaded kernel BPF program, i.e. a path to a bpffs file, or a bind mount or symlink to one. This may be used to upload and manage BPF programs externally and then hook arbitrary systemd services into them.
* The “home.arpa” domain, which RFC 8375 officially designates as the domain of choice for local home networks, has been added to the default NTA list of resolved, since DNSSEC is generally not available on private domains.
* The CPUAffinity= setting of unit files now resolves “%” specifiers.
* A new ManageForeignRoutingPolicyRules= setting has been added to .network files which may be used to exclude foreign-created routing policy rules from systemd-networkd management.
* systemd-network-wait-online gained two new switches -4 and -6 that may be used to tweak whether to wait for only IPv4 or only IPv6 connectivity.
* .network files gained a new RequiredFamilyForOnline= setting to fine-tune whether to require an IPv4 or IPv6 address in order to consider an interface “online”.
* networkctl will now show an over-all “online” state in the per-link information.
* In .network files a new OutgoingInterface= setting has been added to specify the output interface in bridge FDB setups.

* In .network files the Multipath group ID may now be configured for [NextHop] entries, via the new Group= setting.

* The DHCP server logic configured in .network files gained a new setting RelayTarget= that turns the server into a DHCP server relay. The RelayAgentCircuitId= and RelayAgentRemoteId= settings may be used to further tweak the DHCP relay behaviour.

* The DHCP server logic also gained a new ServerAddress= setting in .network files that explicitly specifies the server IP address to use. If not specified, the address is determined automatically, as before.

* The DHCP server logic in systemd-networkd gained support for static DHCP leases, configurable via the [DHCPServerStaticLease] section. This allows explicitly mapping specific MAC addresses to fixed IP addresses and vice versa.

* The RestrictAddressFamilies= setting in service files now supports a new special value “none”. If specified, sockets of all address families will be made unavailable to services configured that way.

* systemd-fstab-generator and systemd-repart have been updated to support booting from disks that carry only a /usr/ partition but no root partition yet, and where systemd-repart can add it in on the first boot. This is useful for implementing systems that ship with a single /usr/ file system, and whose root file system shall be set up and formatted on a LUKS-encrypted volume whose key is generated locally (and possibly enrolled in the TPM) during the first boot.

* The [Address] section of .network files now accepts a new RouteMetric= setting that configures the routing metric to use for the prefix route created as an effect of the address configuration. Similarly, the [DHCPv6PrefixDelegation] and [IPv6Prefix] sections gained matching settings for their prefix routes.
(The option of the same name in the [DHCPv6] section is moved to [IPv6AcceptRA], since it conceptually belongs there; the old option is still understood for compatibility.)

* The DHCPv6 IAID and DUID are now explicitly configurable in .network files.

* A new udev property ID_NET_DHCP_BROADCAST on network interface devices is now honoured by systemd-networkd, controlling whether to issue DHCP offers via broadcasting. This is used to ensure that s390 layer 3 network interfaces work out-of-the-box with systemd-networkd.

* nss-myhostname and systemd-resolved will now synthesize address records for a new special hostname “_outbound”. The name will always resolve to the local IP addresses most likely used for outbound connections towards the default routes. On multi-homed hosts this is useful to have a stable handle referring to “the” local IP address that matters most, to the extent that this is defined.

* The Discoverable Partition Specification has been updated with a new GPT partition flag “grow-file-system” defined for its partition types. Whenever partitions with this flag set are automatically mounted (i.e. via systemd-gpt-auto-generator or the --image= switch of systemd-nspawn or other tools; and as opposed to explicit mounting via /etc/fstab), the file system within the partition is automatically grown to the full size of the partition. If the file system size already matches the partition size this flag has no effect. Previously, this functionality was available via the explicit x-systemd.growfs mount option, and this new flag extends this to automatically discovered mounts. A new GrowFileSystem= setting has been added to systemd-repart drop-in files that allows configuring this partition flag. This new flag defaults to on for partitions automatically created by systemd-repart, except if they are marked read-only.
See the specification for further details: https://systemd.io/DISCOVERABLE_PARTITIONS

* .network files gained a new setting RoutesToNTP= in the [DHCPv4] section. If enabled (which is the default), and an NTP server address is acquired through a DHCP lease on this interface, an explicit route to this address is created on this interface to ensure that NTP traffic to the NTP server acquired on an interface is also routed through that interface. The pre-existing RoutesToDNS= setting that implements the same for DNS servers is now enabled by default.

* A pair of service settings SocketBindAllow= + SocketBindDeny= have been added that may be used to restrict the network interfaces that sockets created by the service may be bound to. This is implemented via BPF.

* A new ConditionFirmware= setting has been added to unit files to conditionalize on certain firmware features. At the moment it may check whether running on a UEFI system, a device-tree system, or if the system is compatible with some specified device-tree feature.

* A new ConditionOSRelease= setting has been added to unit files to check os-release(5) fields. The “=”, “!=”, “<”, “<=”, “>=”, “>” operators may be used to check if some field has some specific value or do an alphanumerical comparison. Equality comparisons are useful for fields like ID, but relative comparisons for fields like VERSION_ID or IMAGE_VERSION.

* hostnamed gained a new Describe() D-Bus method that returns a JSON serialization of the host data it exposes. This is exposed via “hostnamectl --json=” to acquire a host identity description in JSON. It’s our intention to add similar features to most services and objects systemd manages, in order to simplify integration with program code that can consume JSON.

* Similarly, networkd gained a Describe() method on its Manager and Link bus objects. This is exposed via “networkctl --json=”.

* hostnamectl’s various “get-xyz”/“set-xyz” verb pairs (e.g.
“hostnamectl get-hostname”, “hostnamectl set-hostname”) have been replaced by a single “xyz” verb (e.g. “hostnamectl hostname”) that is used both to get the value (when no argument is given), and to set the value (when an argument is specified). The old names continue to be supported for compatibility.

* systemd-detect-virt and ConditionVirtualization= are now able to correctly identify Amazon EC2 environments.

* The LogLevelMax= setting of unit files now applies not only to log messages generated *by* the service, but also to log messages generated *about* the service by PID 1. To suppress logs concerning a specific service comprehensively, set this option to a high log level.

* bootctl gained support for a new --make-machine-id-directory= switch that allows precise control over whether to create the top-level per-machine directory in the boot partition that typically contains Type 1 boot loader entries.

* SBAT data to include in the systemd-boot EFI PE binaries may now be specified at build time.

* /etc/crypttab learnt a new option “headless”. If specified, any requests to query the user interactively for passwords or PINs will be skipped. This is useful on systems that are headless, i.e. where an interactive user is generally not present.

* /etc/crypttab also learnt a new option “password-echo=” that allows configuring whether the encryption password prompt shall echo the typed password and if so, do so literally or via asterisks. (The default is the same behaviour as before: provide echo feedback via asterisks.)

* FIDO2 support in systemd-cryptenroll/systemd-cryptsetup and systemd-homed has been updated to allow explicit configuration of the “user presence” and “user verification” checks, as well as whether a PIN is required for authentication, via the new switches --fido2-with-user-presence=, --fido2-with-user-verification=, --fido2-with-client-pin= to systemd-cryptenroll and homectl.
Which features are available, and may be enabled or disabled, depends on the FIDO2 token used.

* systemd-nspawn’s --private-users= switch now accepts the special value “identity” which configures a user namespacing environment with an identity mapping of 65535 UIDs. This means the container UID 0 is mapped to the host UID 0, and the UID 1 to host UID 1. At first glance this doesn’t appear to be useful; however, it does reduce the attack surface a bit, since the resulting container will possess process capabilities only within its namespace and not on the host.

* systemd-nspawn’s --private-users-chown switch has been replaced by a more generic --private-users-ownership= switch that accepts one of four values: “chown” is equivalent to the old --private-users-chown, and “off” is equivalent to the absence of the old switch. The value “map” uses the new UID mapping mounts of Linux 5.12 to map ownership of files and directories of the underlying image to the chosen UID range for the container. “auto” is equivalent to “map” if UID mapping mounts are supported, otherwise it is equivalent to “chown”.

  The short -U switch of systemd-nspawn now implies --private-users-ownership=auto instead of the old --private-users-chown. Effectively this means: if the backing file system supports UID mapping mounts the feature is now used by default if -U is used. Generally, it’s a good idea to use UID mapping mounts instead of recursive chown()ing, since it allows running containers off immutable images (since no modifications of the images need to take place), and sharing images between multiple instances. Moreover, the recursive chown()ing operation is slow and can be avoided. Conceptually it’s also a good thing if transient UID range uses do not leak into persistent file ownership anymore.
TLDR: finally, the last major drawback of user namespacing has been removed, and -U should always be used (unless you use btrfs, where UID mapped mounts do not exist; or your container actually needs privileges on the host).

* nss-systemd now synthesizes user and group shadow records in addition to the main user and group records. Thus, hashed passwords managed by systemd-homed are now accessible via the shadow database.

* The userdb logic (and thus nss-systemd, and so on) now reads additional user/group definitions in JSON format from the drop-in directories /etc/userdb/, /run/userdb/, /run/host/userdb/ and /usr/lib/userdb/. This is a simple and powerful mechanism for making additional users available to the system, with full integration into NSS including the shadow databases. Since the full JSON user/group record format is supported this may also be used to define users with resource management settings and other runtime settings that pam_systemd and systemd-logind enforce at login.

* The userdbctl tool gained two new switches --with-dropin= and --with-varlink= which can be used to fine-tune the sources used for user database lookups.

* systemd-nspawn gained a new switch --bind-user= for binding a host user account into the container. This does three things: the user’s home directory is bind mounted from the host into the container, below the /run/userdb/home/ hierarchy. A free UID is picked in the container, and a user namespacing UID mapping to the host user’s UID installed. And finally, a minimal JSON user and group record (along with its hashed password) is dropped into /run/host/userdb/. These records are picked up automatically by the userdb drop-in logic described above, and allow the user to log in with the same password as on the host. Effectively this means: if host and container run new enough systemd versions, making a host user available to the container is trivially simple.
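To make the userdb drop-in mechanism described above more concrete, here is a minimal Python sketch that writes a JSON user record into a drop-in directory. It is only an illustration: the helper write_user_dropin() is hypothetical, the field names follow the JSON user record format as far as we are aware, and the "<name>.user" file name plus "<uid>.user" symlink layout is stated here as an assumption about the drop-in convention rather than taken from the text above.

```python
import json
import os


def write_user_dropin(directory, name, uid, gid, home, shell="/bin/sh"):
    """Write <name>.user plus a <uid>.user symlink into a userdb drop-in directory."""
    record = {
        "userName": name,
        "uid": uid,
        "gid": gid,
        "homeDirectory": home,
        "shell": shell,
    }
    path = os.path.join(directory, f"{name}.user")
    with open(path, "w", encoding="utf-8") as f:
        json.dump(record, f, indent=2)
        f.write("\n")
    # Assumed convention: a symlink named after the UID allows lookups by number.
    link = os.path.join(directory, f"{uid}.user")
    if os.path.lexists(link):
        os.remove(link)
    os.symlink(f"{name}.user", link)
    return path
```

Pointing the function at a scratch directory first and inspecting the result is the safer way to experiment; writing to /etc/userdb/ requires root and makes the user visible system-wide. A matching group record would be needed as well for a complete setup.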
* systemd-journal-gatewayd now supports the switches --user, --system, --merge, --file= that are equivalent to the same switches of journalctl, and permit exposing only the specified subset of the Journal records.

* The OnFailure= dependency between units is now augmented with an implicit reverse dependency OnFailureOf= (this new dependency cannot be configured directly; it is only created as an effect of an OnFailure= dependency in the reverse order, but it is visible in “systemctl show”). Similarly, Slice= now has a reverse dependency SliceOf=, which is also not configurable directly, but useful to determine all units that are members of a slice.

* A pair of new dependency types between units PropagatesStopTo= + StopPropagatedFrom= has been added, that allows propagation of unit stop events between two units. It operates similarly to the existing PropagatesReloadTo= + ReloadPropagatedFrom= dependencies.

* A new dependency type OnSuccess= has been added (plus the reverse dependency OnSuccessOf=, which cannot be configured directly, but exists only as effect of the reverse OnSuccess=). It is similar to OnFailure=, but triggers in the opposite case: when a service exits cleanly. This allows “chaining up” of services where one or more services are started once another service has successfully completed.

* A new dependency type Upholds= has been added (plus the reverse dependency UpheldBy=, which cannot be configured directly, but exists only as effect of Upholds=). This dependency type is a stronger form of Wants=: if a unit has an Upholds= dependency on some other unit and the former is active then the latter is started whenever it is found inactive (and no job is queued for it). This is an alternative to Restart= inside service units, but less configurable, and the request to uphold a unit is not encoded in the unit itself but in another unit that intends to uphold it.
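The Upholds= rule described above can be modelled in a few lines. The following Python sketch is a toy model of the stated condition only (upholder active, target inactive, no job queued); it is not how the service manager is implemented, and the function name and data layout are invented for illustration.

```python
def units_to_start(upholds, active, queued):
    """Return upheld units that should be (re)started on this pass.

    upholds: dict mapping a unit name to the list of units it upholds.
    active:  set of units that are currently active.
    queued:  set of units that already have a job queued.
    """
    to_start = []
    for upholder, targets in upholds.items():
        if upholder not in active:
            continue  # the rule only applies while the upholding unit is active
        for target in targets:
            if target not in active and target not in queued and target not in to_start:
                to_start.append(target)
    return to_start
```

For example, with upholds = {"app.target": ["worker.service"]}, the worker is selected for start whenever app.target is active and the worker is neither active nor already has a job queued.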
* The systemd-ask-password tool now also supports reading passwords from the credentials subsystem, via the new --credential= switch.

* The systemd-ask-password tool learnt a new switch --emoji= which may be used to explicitly control whether the lock and key emoji (🔐) is shown in the password prompt on suitable TTYs.

* The --echo switch of systemd-ask-password now optionally takes a parameter that controls character echo. It may either show asterisks (default, as before), turn echo off entirely, or echo the typed characters literally.

* The systemd-ask-password tool also gained a new -n switch for suppressing output of a trailing newline character when writing the acquired password to standard output, similar to /bin/echo’s -n switch.

* New documentation has been added that describes the organization of the systemd source code tree: https://systemd.io/ARCHITECTURE

* Units using ConditionNeedsUpdate= will no longer be activated in the initrd.

* It is now possible to list a template unit in the WantedBy= or RequiredBy= settings of the [Install] section of another template unit, which will be instantiated using the same instance name.

* A new MemoryAvailable property is available for units. If the unit, or the slices it is part of, have a memory limit set via MemoryMax=/MemoryHigh=, MemoryAvailable will indicate how much more memory the unit can claim before hitting the limits.

* systemd-coredump will now try to stay below the cgroup memory limit placed on itself or one of the slices it runs under, if the storage area for core files (/var/lib/systemd/coredump/) is placed on a tmpfs, since files written on such filesystems count toward the cgroup memory limit. If there is not enough available memory in such cases to store the core file uncompressed, systemd-coredump will skip to compressed storage directly (if enabled) and it will avoid analyzing the core file to print backtrace and metadata in the journal.
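One way to think about the MemoryAvailable property mentioned above is as the tightest remaining headroom across the unit and its enclosing slices. The Python sketch below implements that reading; the exact calculation performed by the service manager is not given in these notes, so treat the formula and the function name as assumptions.

```python
def memory_available(levels):
    """Return the smallest remaining headroom in bytes, or None if nothing is limited.

    levels: (current_usage, limit_or_None) pairs for the unit itself and for
            each slice it is part of, in any order.
    """
    headrooms = [max(limit - usage, 0) for usage, limit in levels if limit is not None]
    return min(headrooms) if headrooms else None
```

With a unit using 100 MiB under a 512 MiB limit inside a slice using 900 MiB under a 1 GiB limit, the slice is the tighter constraint and the result is 124 MiB.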
* tmpfiles.d/ drop-ins gained a new ‘=’ modifier to check if the type of a path matches the configured expectations, and remove it if not.

* tmpfiles.d/’s ‘Age’ now accepts an ‘age-by’ argument, which allows specifying which of the several available filesystem timestamps (access time, birth time, change time, modification time) to look at when deciding whether a path has aged enough to be cleaned.

* A new IPv6StableSecretAddress= setting has been added to .network files, which takes an IPv6 address to use as secret for IPv6 address generation.

* The [DHCPServer] logic in .network files gained support for a new UplinkInterface= setting that permits configuration of the uplink interface name to propagate DHCP lease information from.

* The WakeOnLan= setting in .link files now accepts a list of flags instead of a single one, to configure multiple wake-on-LAN policies.

* User-space defined tracepoints (USDT) have been added to udev at strategic locations. This is useful for tracing udev behaviour and performance with bpftrace and similar tools.

* systemd-journal-upload gained a new NetworkTimeoutSec= option for setting a network timeout.

* If a system service is running in a new mount namespace (RootDirectory= and friends), all file systems will be mounted with MS_NOSUID by default, unless the system is running with SELinux enabled.

* When enumerating time zones the timedatectl tool will now consult the ‘tzdata.zi’ file shipped by the IANA time zone database package, in addition to ‘zone1970.tab’, as before. This makes sure time zone aliases are now correctly supported. Some distributions so far did not install this additional file, most do however. If your distribution does not install it yet, it might make sense to change that.

* The Intel HID rfkill event is no longer masked, since it’s the only source of rfkill events on newer HP laptops. To have both backward and forward compatibility, userspace daemons need to debounce duplicated events in a short time window.
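The debouncing suggested for rfkill events above could look roughly like the following Python sketch. The Debouncer class, the default 200 ms window and the event keys are all hypothetical; a real daemon would pick a window that matches how closely the duplicated events arrive on the affected hardware.

```python
import time


class Debouncer:
    """Drop events that repeat within `window` seconds of the last accepted one."""

    def __init__(self, window=0.2, clock=time.monotonic):
        self.window = window
        self.clock = clock
        self._last = {}  # event key -> timestamp of the last accepted event

    def accept(self, key):
        now = self.clock()
        last = self._last.get(key)
        if last is not None and now - last < self.window:
            return False  # duplicate inside the window: ignore it
        self._last[key] = now
        return True
```

A daemon would call accept() with some identifier of the event and only toggle the radio state when it returns True. The clock is injectable so the behaviour can be exercised without sleeping.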
Contributions from: Aakash Singh, adrian5, Albert Brox, Alexander Sverdlin, Alexander Tsoy, Alexey Rubtsov, alexlzhu, Allen Webb, Alvin Šipraga, Alyssa Ross, Anders Wenhaug, Andrea Pappacoda, Anita Zhang, asavah, Balint Reczey, Bertrand Jacquin, borna-blazevic, caoxia2008cxx, Carlo Teubner, Christian Göttsche, Christian Hesse, Daniel Schaefer, Dan Streetman, David Santamaría Rogado, David Tardon, Deepak Rawat, dgcampea, Dimitri John Ledkov, ei-ke, Emilio Herrera, Emil Renner Berthing, Eric Cook, Flos Lonicerae, Franck Bui, Francois Gervais, Frantisek Sumsal, Gibeom Gwon, gitm0, Hamish Moffatt, Hans de Goede, Harsh Barsaiyan, Henri Chain, Hristo Venev, Icenowy Zheng, Igor Zhbanov, imayoda, Jakub Warczarek, James Buren, Jan Janssen, Jan Macku, Jan Synacek, Jason Francis, Jayanth Ananthapadmanaban, Jeremy Szu, Jérôme Carretero, Jesse Stricker, jiangchuangang, Joerg Behrmann, Jóhann B. Guðmundsson, Jörg Deckert, Jörg Thalheim, Juergen Hoetzel, Julia Kartseva, Kai-Heng Feng, Khem Raj, KoyamaSohei, laineantti, Lennart Poettering, LetzteInstanz, Luca Adrian L, Luca Boccassi, Lucas Magasweran, Mantas Mikulėnas, Marco Antonio Mauro, Mark Wielaard, Masahiro Matsuya, Matt Johnston, Michael Catanzaro, Michal Koutný, Michal Sekletár, Mike Crowe, Mike Kazantsev, Milan, milaq, Miroslav Suchý, Morten Linderud, nerdopolis, nl6720, Noah Meyerhans, Oleg Popov, Olle Lundberg, Ondrej Kozina, Paweł Marciniak, Perry.Yuan, Peter Hutterer, Peter Kjellerstedt, Peter Morrow, Phaedrus Leeds, plattrap, qhill, Raul Tambre, Roman Beranek, Roshan Shariff, Ryan Hendrickson, Samuel BF, scootergrisen, Sebastian Blunt, Seong-ho Cho, Sergey Bugaev, Sevan Janiyan, Sibo Dong, simmon, Simon Watts, Srinidhi Kaushik, Štěpán Němec, Steve Bonds, Susant Sahani, sverdlin, syyhao1994, Takashi Sakamoto, Topi Miettinen, tramsay, Trent Piepho, Uwe Kleine-König, Viktor Mihajlovski, Vincent Dechenaux, Vito Caputo, William A. 
Kennington III, Yangyang Shen, Yegor Alexeyev, Yi Gao, Yu Watanabe, Zbigniew Jędrzejewski-Szmek, zsien, наб

— Edinburgh, 2021-07-07

CHANGES WITH 248:

* A concept of system extension images is introduced. Such images may be used to extend the /usr/ and /opt/ directory hierarchies at runtime with additional files (even if the file system is read-only). When a system extension image is activated, its /usr/ and /opt/ hierarchies and os-release information are combined via overlayfs with the file system hierarchy of the host OS. A new systemd-sysext tool can be used to merge, unmerge, list, and refresh system extension hierarchies. See https://www.freedesktop.org/software/systemd/man/systemd-sysext.html. The systemd-sysext.service automatically merges installed system extensions during boot (before basic.target, but not in very early boot, since various file systems have to be mounted first). The SYSEXT_LEVEL= field in os-release(5) may be used to specify the supported system extension level.

* A new ExtensionImages= unit setting can be used to apply the same system extension image concept from systemd-sysext to the namespaced file hierarchy of specific services, following the same rules and constraints.

* Support for a new special “root=tmpfs” kernel command-line option has been added. When specified, a tmpfs is mounted on /, and mount.usr= should be used to point to the operating system implementation.

* A new configuration file /etc/veritytab may be used to configure dm-verity integrity protection for block devices. Each line is in the format “volume-name data-device hash-device roothash options”, similar to /etc/crypttab.

* A new kernel command-line option systemd.verity.root_options= may be used to configure dm-verity behaviour for the root device.

* The key file specified in /etc/crypttab (the third field) may now refer to an AF_UNIX/SOCK_STREAM socket in the file system. The key is acquired by connecting to that socket and reading from it.
This allows the implementation of a service to provide key information dynamically, at the moment when it is needed.

* When the hostname is set explicitly to “localhost”, systemd-hostnamed will respect this. Previously such a setting would be mostly silently ignored. The goal is to honour configuration as specified by the user.

* The fallback hostname that will be used by the system manager and systemd-hostnamed can now be configured in two new ways: by setting DEFAULT_HOSTNAME= in os-release(5), or by setting $SYSTEMD_DEFAULT_HOSTNAME in the environment block. As before, it can also be configured during compilation. The environment variable is intended for testing and local overrides, the os-release(5) field is intended to allow customization by different variants of a distribution that share the same compiled packages.

* The environment block of the manager itself may be configured through a new ManagerEnvironment= setting in system.conf or user.conf. This complements existing ways to set the environment block (the kernel command line for the system manager, the inherited environment and user@.service unit file settings for the user manager).

* systemd-hostnamed now exports the default hostname and the source of the configured hostname (“static”, “transient”, or “default”) as D-Bus properties.

* systemd-hostnamed now exports the “HardwareVendor” and “HardwareModel” D-Bus properties, which are supposed to contain a pair of cleaned up, human readable strings describing the system’s vendor and model. The information is typically sourced from the firmware’s DMI tables, but may be augmented from a new hwdb database. hostnamectl shows this in the status output.

* Support has been added to systemd-cryptsetup for extracting the PKCS#11 token URI and encrypted key from the LUKS2 JSON embedded metadata header. This allows the information on how to open the encrypted device to be embedded directly in the device and obviates the need for configuration in an external file.
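For the /etc/crypttab key-file-as-socket mechanism described above, a key-providing service might be sketched as follows in Python. This is an illustration under assumptions, not a reference implementation: the function names are invented, where the key comes from is left open, and a production service would typically be socket-activated and would take more care over permissions than a chmod after bind.

```python
import os
import socket


def open_key_socket(path):
    """Create a listening AF_UNIX/SOCK_STREAM socket at `path`."""
    if os.path.exists(path):
        os.unlink(path)
    listener = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    listener.bind(path)
    os.chmod(path, 0o600)  # restrict who may connect and read the key
    listener.listen(1)
    return listener


def serve_key_once(listener, key):
    """Send `key` (bytes) to the next client, then close so the reader sees EOF."""
    conn, _ = listener.accept()
    with conn:
        conn.sendall(key)
```

The socket path would then be listed as the third field of the relevant /etc/crypttab line. Closing the connection after writing matters because the reading side consumes the stream until end-of-file.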
* systemd-cryptsetup gained support for unlocking LUKS2 volumes using TPM2 hardware, as well as FIDO2 security tokens (in addition to the pre-existing support for PKCS#11 security tokens).

* systemd-repart may enroll encrypted partitions using TPM2 hardware. This may be useful for example to create an encrypted /var partition bound to the machine on first boot.

* A new systemd-cryptenroll tool has been added to enroll TPM2, FIDO2 and PKCS#11 security tokens to LUKS volumes, list and destroy them. See: https://0pointer.net/blog/unlocking-luks2-volumes-with-tpm2-fido2-pkcs11-security-hardware-on-systemd-248.html It also supports enrolling “recovery keys” and regular passphrases.

* The libfido2 dependency is now based on dlopen(), so that the library is used at runtime when installed, but is not a hard runtime dependency.

* systemd-cryptsetup gained support for two new options in /etc/crypttab: “no-write-workqueue” and “no-read-workqueue” which request synchronous processing of encryption/decryption IO.

* The manager may be configured at compile time to use the fexecve() instead of the execve() system call when spawning processes. Using fexecve() closes a window between checking the security context of an executable and spawning it, but unfortunately the kernel displays stale information in the process’ “comm” field, which impacts ps output and such.

* The configuration option -Dcompat-gateway-hostname has been dropped. “_gateway” is now the only supported name.

* The ConditionSecurity=tpm2 unit file setting may be used to check if the system has at least one TPM2 (tpmrm class) device.

* A new ConditionCPUFeature= has been added that may be used to conditionalize units based on CPU features. For example, ConditionCPUFeature=rdrand will condition a unit so that it is only run when the system CPU supports the RDRAND opcode.

* The existing ConditionControlGroupController= setting has been extended with two new values “v1” and “v2”.
“v2” means that the unified v2 cgroup hierarchy is used, and “v1” means that the legacy v1 hierarchy or the hybrid hierarchy is used.

* A new PrivateIPC= setting on a unit file allows executed processes to be moved into a private IPC namespace, with separate System V IPC identifiers and POSIX message queues. A new IPCNamespacePath= allows the unit to be joined to an existing IPC namespace.

* The tables of system calls in seccomp filters are now automatically generated from kernel lists exported on https://fedora.juszkiewicz.com.pl/syscalls.html. The following architectures should now have complete lists: alpha, arc, arm64, arm, i386, ia64, m68k, mips64n32, mips64, mipso32, powerpc, powerpc64, s390, s390x, tilegx, sparc, x86_64, x32.

* The MountAPIVFS= service file setting now additionally mounts a tmpfs on /run/ if it is not already a mount point. A writable /run/ has always been a requirement for a functioning system, but this was not guaranteed when using a read-only image. Users can always specify BindPaths= or InaccessiblePaths= as overrides, and they will take precedence. If the host’s root mount point is used, there is no change in behaviour.

* New bind mounts and file system image mounts may be injected into the mount namespace of a service (without restarting it). This is exposed respectively as ‘systemctl bind …’ and ‘systemctl mount-image …’.

* The StandardOutput= and StandardError= settings can now specify files to be truncated for output (as “truncate:PATH”).

* The ExecPaths= and NoExecPaths= settings may be used to specify noexec for parts of the file system.

* sd-bus has a new function sd_bus_open_user_machine() to open a connection to the session bus of a specific user in a local container or on the local host. This is exposed in the existing -M switch to systemctl and similar tools:

      systemctl --user -M lennart@foobar start foo

  This will connect to the user bus of a user “lennart” in container “foobar”.
If no container name is specified, the specified user on the host itself is connected to:

      systemctl --user -M lennart@ start quux

* sd-bus also gained a convenience function sd_bus_message_send() to simplify invocations of sd_bus_send(), taking only a single parameter: the message to send.

* sd-event allows rate limits to be set on event sources, for dealing with high-priority event sources that might starve out others. See the new man page sd_event_source_set_ratelimit(3) for details.

* systemd.link files gained a [Link] Promiscuous= switch, which allows the device to be raised in promiscuous mode. New [Link] TransmitQueues= and ReceiveQueues= settings allow the number of TX and RX queues to be configured. A new [Link] TransmitQueueLength= setting allows the size of the TX queue to be configured. New [Link] GenericSegmentOffloadMaxBytes= and GenericSegmentOffloadMaxSegments= settings allow capping the packet size and the number of segments accepted in Generic Segment Offload.

* systemd-networkd gained support for the “B.A.T.M.A.N. advanced” wireless routing protocol that operates on ISO/OSI Layer 2 only and uses ethernet frames to route/bridge packets. This encompasses a new “batadv” netdev Type=, a new [BatmanAdvanced] section with a bunch of new settings in .netdev files, and a new BatmanAdvanced= setting in .network files.

* systemd.network files gained a [Network] RouteTable= configuration switch to select the routing policy table. systemd.network files gained a [RoutingPolicyRule] Type= configuration switch (one of “blackhole”, “unreachable”, “prohibit”). systemd.network files gained [IPv6AcceptRA] RouteDenyList= and RouteAllowList= settings to ignore/accept route advertisements from routers matching specified prefixes. The DenyList= setting has been renamed to PrefixDenyList= and a new PrefixAllowList= option has been added. systemd.network files gained a [DHCPv6] UseAddress= setting to optionally ignore the address provided in the lease.
systemd.network files gained a [DHCPv6PrefixDelegation] ManageTemporaryAddress= switch. systemd.network files gained a new ActivationPolicy= setting which allows configuring how the UP state of an interface shall be managed, i.e. whether the interface is always upped, always downed, or may be upped/downed by the user using “ip link set dev”.

* The default for the Broadcast= setting in .network files has slightly changed: the broadcast address will not be configured for wireguard devices.

* systemd.netdev files gained [VLAN] Protocol=, IngressQOSMaps=, EgressQOSMaps=, and [MACVLAN] BroadcastMulticastQueueLength= configuration options for VLAN packet handling.

* udev rules may now set the log_level= option. This allows debug logs to be enabled for select events, e.g. just for a specific subsystem or even a single device.

* udev now exports the VOLUME_ID, LOGICAL_VOLUME_ID, VOLUME_SET_ID, and DATA_PREPARED_ID properties for block devices with ISO9660 file systems.

* udev now exports decoded DMI information about installed memory slots as device properties under the /sys/class/dmi/id/ pseudo device.

* /dev/ is not mounted noexec anymore. This didn’t provide any significant security benefits and would conflict with the executable mappings used with /dev/sgx device nodes. The previous behaviour can be restored for individual services with NoExecPaths=/dev (or by allow-listing and excluding /dev from ExecPaths=).

* Permissions for /dev/vsock are now set to 0o666, and /dev/vhost-vsock and /dev/vhost-net are owned by the kvm group.

* The hardware database has been extended with a list of fingerprint readers that correctly support USB auto-suspend using data from libfprint.

* systemd-resolved can now answer DNSSEC questions through the stub resolver interface in a way that allows local clients to do DNSSEC validation themselves. For a question with DO+CD set, it’ll proxy the DNS query and respond with a mostly unmodified packet received from the upstream server.
* systemd-resolved learnt a new boolean option CacheFromLocalhost= in resolved.conf. If true the service will provide caching even for DNS lookups made to an upstream DNS server on the 127.0.0.1/::1 addresses. By default (and when the option is false) systemd-resolved will not cache such lookups, in order to avoid duplicate local caching, under the assumption the local upstream server caches anyway.

* systemd-resolved now implements RFC5001 NSID in its local DNS stub. This may be used by local clients to determine whether they are talking to the DNS resolver stub or a different DNS server.

* When resolving host names and other records resolvectl will now report where the data was acquired from (i.e. the local cache, the network, locally synthesized, …) and whether the network traffic it effected was encrypted or not. Moreover the tool acquired a number of new options --cache=, --synthesize=, --network=, --zone=, --trust-anchor=, --validate= that take booleans and may be used to tweak a lookup, i.e. whether it may be answered from cached information, locally synthesized information, information acquired through the network, the local mDNS/LLMNR zone, the DNSSEC trust anchor, and whether DNSSEC validation shall be executed for the lookup.

* systemd-nspawn gained a new --ambient-capability= setting (AmbientCapability= in .nspawn files) to configure ambient capabilities passed to the container payload.

* systemd-nspawn gained the ability to configure the firewall using the nftables subsystem (in addition to the existing iptables support). Similarly, systemd-networkd’s IPMasquerade= option now supports nftables as back-end, too. In both cases NAT on IPv6 is now supported too, in addition to IPv4 (the iptables back-end still is IPv4-only). “IPMasquerade=yes”, which was the same as “IPMasquerade=ipv4” before, retains its meaning, but has been deprecated. Please switch to either “ipv4” or “both” (if covering IPv6 is desired).
* systemd-importd will now download .verity and .roothash.p7s files along with the machine image (as exposed via machinectl pull-raw). * systemd-oomd now gained a new DefaultMemoryPressureDurationSec= setting to configure the time a unit’s cgroup needs to exceed memory pressure limits before action will be taken, and a new ManagedOOMPreference=none|avoid|omit setting to avoid killing certain units. systemd-oomd is now considered fully supported (the usual backwards-compatibility promises apply). Swap is not required for operation, but it is still recommended. * systemd-timesyncd gained a new ConnectionRetrySec= setting which configures the retry delay when trying to contact servers. * systemd-stdio-bridge gained --system/--user options to connect to the system bus (previous default) or the user session bus. * systemd-localed may now call locale-gen to generate missing locales on-demand (UTF-8-only). This improves integration with Debian-based distributions (Debian/Ubuntu/PureOS/Tanglu/…) and Arch Linux. * systemctl --check-inhibitors=true may now be used to obey inhibitors even when invoked non-interactively. The old --ignore-inhibitors switch is now deprecated and replaced by --check-inhibitors=false. * systemctl import-environment will now emit a warning when called without any arguments (i.e. to import the full environment block of the called program). This command will usually be invoked from a shell, which means that it’ll inherit a bunch of variables which are specific to that shell, and usually to the TTY the shell is connected to, and don’t have any meaning in the global context of the system or user service manager. Instead, only specific variables should be imported into the manager environment block. Similarly, programs which update the manager environment block by directly calling the D-Bus API of the manager, should also push specific variables, and not the full inherited environment.
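The recommendation to import only specific variables can be sketched as follows. This is a minimal illustration, not part of systemd: the helper name and the variable names in the usage note are assumptions chosen for the example. It builds an argument vector that names each variable explicitly and skips variables that are not set.

```python
import os


def import_environment_argv(names, environ=None, user=True):
    """Return an argv importing only the named variables, or None if none are set.

    Hypothetical helper: it never produces a bare "import-environment" call,
    which would push the whole inherited environment block.
    """
    environ = os.environ if environ is None else environ
    present = [name for name in names if name in environ]
    if not present:
        return None
    argv = ["systemctl"]
    if user:
        argv.append("--user")
    return argv + ["import-environment", *present]
```

A session start-up script might call `import_environment_argv(["DISPLAY", "WAYLAND_DISPLAY"])` and pass the result to `subprocess.run()` when it is not None.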
* systemctl’s status output now shows unit state with a more careful choice of Unicode characters: units in maintenance show a “○” symbol instead of the usual “●”, failed units show “×”, and services being reloaded “↻”. * coredumpctl gained a --debugger-arguments= switch to pass arguments to the debugger. It also gained support for showing coredump info in a simple JSON format. * systemctl/loginctl/machinectl’s --signal= option now accepts a special value “list”, which may be used to show a brief table with known process signals and their numbers. * networkctl now shows the link activation policy in status. * Various tools gained --pager/--no-pager/--json= switches to enable/disable the pager and provide JSON output. * Various tools now accept two new values for the SYSTEMD_COLORS environment variable: “16” and “256”, to configure how many terminal colors are used in output. * less 568 or newer is now required for the auto-paging logic of the various tools. Hyperlink ANSI sequences in terminal output are now used even if a pager is used, and older versions of less are not able to display these sequences correctly. SYSTEMD_URLIFY=0 may be used to disable this output again. * Builds with support for separate / and /usr/ hierarchies (“split-usr” builds, non-merged-usr builds) are now officially deprecated. A warning is emitted during build. Support is slated to be removed in about a year (when the Debian Bookworm release development starts). * Systems with the legacy cgroup v1 hierarchy are now marked as “tainted”, to make it clearer that using the legacy hierarchy is not recommended. * systemd-localed will now refuse to configure a keymap which is not installed in the file system. This is intended as a bug fix, but could break cases where systemd-localed was used to configure the keymap in advance of it being installed. It is necessary to install the keymap file first. * The main git development branch has been renamed to ‘main’.
* mmcblk[0-9]boot[0-9] devices will no longer be probed automatically for partitions, as in the vast majority of cases they contain none and are used internally by the bootloader (e.g. U-Boot). * systemd will now set the $SYSTEMD_EXEC_PID environment variable for spawned processes to the PID of the process itself. This may be used by programs for detecting whether they were forked off by the service manager itself or are a process forked off further down the tree. * The sd-device API gained four new calls: sd_device_get_action() to determine the uevent add/remove/change/… action the device object has been seen for, sd_device_get_seqno() to determine the uevent sequence number, sd_device_new_from_stat_rdev() to allocate a new sd_device object from stat(2) data of a device node, and sd_device_trigger() to write to the ‘uevent’ attribute of a device. * For most tools the --no-legend= switch has been replaced by --legend=no and --legend=yes, to force whether tables are shown with headers/legends. * Units acquired a new property “Markers” that takes a list of zero, one or two of the following strings: “needs-reload” and “needs-restart”. These markers may be set via “systemctl set-property”. Once a marker is set, “systemctl reload-or-restart --marked” may be invoked to execute the operation the units are marked for. This is useful for package managers that want to mark units for restart/reload while updating, but carry out the actual operations all at once at a later step. * The sd_bus_message_read_strv() API call of sd-bus may now also be used to parse arrays of D-Bus signatures and D-Bus paths, in addition to regular strings. * bootctl will now report whether the UEFI firmware used a TPM2 device and measured the boot process into it. * systemd-tmpfiles learnt support for a new environment variable $SYSTEMD_TMPFILES_FORCE_SUBVOL which takes a boolean value.
If true the v/q/Q lines in tmpfiles.d/ snippets will create btrfs subvolumes even if the root fs of the system is not itself a btrfs volume. * systemd-detect-virt/ConditionVirtualization= will now explicitly detect Docker/Podman environments where possible. Moreover, they should be able to generically detect any container manager as long as it assigns the container a cgroup. * portablectl gained a new “reattach” verb for detaching/reattaching a portable service image, useful for updating images on-the-fly. * Intel SGX enclave device nodes (which expose a security feature of newer Intel CPUs) will now be owned by a new system group “sgx”. Contributions from: Adam Nielsen, Adrian Vovk, AJ Jordan, Alan Perry, Alastair Pharo, Alexander Batischev, Ali Abdallah, Andrew Balmos, Anita Zhang, Annika Wickert, Ansgar Burchardt, Antonio Terceiro, Antonius Frie, Ardy, Arian van Putten, Ariel Fermani, Arnaud T, A S Alam, Bastien Nocera, Benjamin Berg, Benjamin Robin, Björn Daase, caoxia, Carlo Wood, Charles Lee, ChopperRob, chri2, Christian Ehrhardt, Christian Hesse, Christopher Obbard, clayton craft, corvusnix, cprn, Daan De Meyer, Daniele Medri, Daniel Rusek, Dan Sanders, Dan Streetman, Darren Ng, David Edmundson, David Tardon, Deepak Rawat, Devon Pringle, Dmitry Borodaenko, dropsignal, Einsler Lee, Endre Szabo, Evgeny Vereshchagin, Fabian Affolter, Fangrui Song, Felipe Borges, feliperodriguesfr, Felix Stupp, Florian Hülsmann, Florian Klink, Florian Westphal, Franck Bui, Frantisek Sumsal, Gablegritule, Gaël PORTAY, Gaurav, Giedrius Statkevičius, Greg Depoire-Ferrer, Gustavo Costa, Hans de Goede, Hela Basa, heretoenhance, hide, Iago López Galeiras, igo95862, Ilya Dmitrichenko, Jameer Pathan, Jan Tojnar, Jiehong, Jinyuan Si, Joerg Behrmann, John Slade, Jonathan G. Underwood, Jonathan McDowell, Josh Triplett, Joshua Watt, Julia Cartwright, Julien Humbert, Kairui Song, Karel Zak, Kevin Backhouse, Kevin P. 
Fleming, Khem Raj, Konomi, krissgjeng, l4gfcm, Lajos Veres, Lennart Poettering, Lincoln Ramsay, Luca Boccassi, Luca BRUNO, Lucas Werkmeister, Luka Kudra, Luna Jernberg, Marc-André Lureau, Martin Wilck, Matthias Klumpp, Matt Turner, Michael Gisbers, Michael Marley, Michael Trapp, Michal Fabik, Michał Kopeć, Michal Koutný, Michal Sekletár, Michele Guerini Rocco, Mike Gilbert, milovlad, moson-mo, Nick, nihilix-melix, Oğuz Ersen, Ondrej Mosnacek, pali, Pavel Hrdina, Pavel Sapezhko, Perry Yuan, Peter Hutterer, Pierre Dubouilh, Piotr Drąg, Pjotr Vertaalt, Richard Laager, RussianNeuroMancer, Sam Lunt, Sebastiaan van Stijn, Sergey Bugaev, shenyangyang4, simmon, Simonas Kazlauskas, Slimane Selyan Amiri, Stefan Agner, Steve Ramage, Susant Sahani, Sven Mueller, Tad Fisher, Takashi Iwai, Thomas Haller, Tom Shield, Topi Miettinen, Torsten Hilbrich, tpgxyz, Tyler Hicks, ulf-f, Ulrich Ölmann, Vincent Pelletier, Vinnie Magro, Vito Caputo, Vlad, walbit-de, Whired Planck, wouter bolsterlee, Xℹ Ruoyao, Yangyang Shen, Yuri Chornoivan, Yu Watanabe, Zach Smith, Zbigniew Jędrzejewski-Szmek, Zmicer Turok, Дамјан Георгиевски — Berlin, 2021-03-30 CHANGES WITH 247: * KERNEL API INCOMPATIBILITY: Linux 4.14 introduced two new uevents “bind” and “unbind” to the Linux device model. When this kernel change was made, systemd-udevd was only minimally updated to handle and propagate these new event types. The introduction of these new uevents (which are typically generated for USB devices and devices needing a firmware upload before being functional) resulted in a number of issues which we so far didn’t address. We hoped the kernel maintainers would themselves address these issues in some form, but that did not happen. To handle them properly, many (if not most) udev rules files shipped in various packages need updating, and so do many programs that monitor or enumerate devices with libudev or sd-device, or otherwise process uevents. 
Please note that this incompatibility is not the fault of systemd or udev, but caused by an incompatible kernel change that happened back in Linux 4.14, but is becoming more and more visible as the new uevents are generated by more kernel drivers. To minimize issues resulting from this kernel change (but not avoid them entirely) starting with systemd-udevd 247 the udev “tags” concept (which is a concept for marking and filtering devices during enumeration and monitoring) has been reworked: udev tags are now “sticky”, meaning that once a tag is assigned to a device it will not be removed from the device again until the device itself is removed (i.e. unplugged). This makes sure that any application monitoring devices that match a specific tag is guaranteed to both see uevents where the device starts being relevant, and those where it stops being relevant (the latter now regularly happening due to the new “unbind” uevent type). The udev tags concept is hence now a concept tied to a *device* instead of a device *event*, unlike for example udev properties whose lifecycle (as before) is generally tied to a device event, meaning that the previously determined properties are forgotten whenever a new uevent is processed. With the newly redefined udev tags concept, sometimes it’s necessary to determine which tags are the ones applied by the most recent uevent/database update, in order to discern them from those originating from earlier uevents/database updates of the same device. To accommodate this a new automatic property CURRENT_TAGS has been added that works similarly to the existing TAGS property but only lists tags set by the most recent uevent/database update. Similarly, the libudev/sd-device API has been updated with new functions to enumerate these ‘current’ tags, in addition to the existing APIs that now enumerate the ‘sticky’ ones.
To properly handle “bind”/“unbind” on Linux 4.14 and newer it is essential that all udev rules files and applications are updated to handle the new events. Specifically: • All rule files that currently use a header guard similar to ACTION!="add|change",GOTO="xyz_end" should be updated to use ACTION=="remove",GOTO="xyz_end" instead, so that the properties/tags they add are also applied whenever “bind” (or “unbind”) is seen. (This is most important for all physical device types, i.e. those for which “bind” and “unbind” are currently generated; for all other device types this change is still recommended but not as important, and it certainly prepares for future kernel uevent type additions.) • Similarly, all code monitoring devices that contains an ‘if’ branch discerning the “add” + “change” uevent actions from all other uevent actions (i.e. considering devices only relevant after “add” or “change”, and irrelevant on all other events) should be reworked to instead negatively check for “remove” only (i.e. considering devices relevant after all event types, except for “remove”, which invalidates the device). Note that this also means that devices should be considered relevant on “unbind”, even though conceptually this, in some form, invalidates the device. Since the precise effect of “unbind” is not generically defined, devices should be considered relevant even after “unbind”, however I/O errors accessing the device should then be handled gracefully. • Any code that uses device tags for deciding whether a device is relevant or not most likely needs to be updated to use the new udev_device_has_current_tag() API (or sd_device_has_current_tag() in case sd-device is used), to check whether the tag is set at the moment a uevent is seen (as opposed to the existing udev_device_has_tag() API which checks if the tag ever existed on the device, following the API concept redefinition explained above).
We are very sorry for this breakage and the requirement to update packages using these interfaces. We’d again like to underline that this is not caused by systemd/udev changes, but the result of a kernel behaviour change. * UPCOMING INCOMPATIBILITY: So far most downstream distribution packages have not retriggered devices once the udev package (or any auxiliary package installing additional udev rules) is updated. We intend to work with major distributions to change this, so that “udevadm trigger -c change” is issued on such upgrades, ensuring that the updated ruleset is applied to the devices already discovered, so that (asynchronously) after the upgrade completed the udev database is consistent with the updated rule set. This means udev rules must be ready to be retriggered with a “change” action any time, and result in correct and complete udev database entries. While the majority of udev rule files known to us currently get this right, some don’t. Specifically, there are udev rules files included in various packages that only set udev properties on the “add” action, but do not handle the “change” action. If a device matching those rules is retriggered with the “change” action (as is intended here) it would suddenly lose the relevant properties. This always has been problematic, but as soon as all udev devices are triggered on relevant package upgrades this will become particularly so. It is strongly recommended to fix offending rules so that they can handle a “change” action at any time, and acquire all necessary udev properties even then. Or in other words: the header guard mentioned above (ACTION=="remove",GOTO="xyz_end") is the correct approach to handle this, as it makes sure rules are rerun on “change” correctly, and accumulate the correct and complete set of udev properties. udev rule definitions that cannot handle “change” events being triggered at arbitrary times should be considered buggy.
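The rules for monitoring code described above (check negatively for “remove”, and distinguish sticky tags from tags set by the most recent event) can be illustrated with a small toy model. This is not libudev or sd-device code: the class and function names are hypothetical, and real applications would call udev_device_has_current_tag() or sd_device_has_current_tag() instead.

```python
def device_is_relevant(action: str) -> bool:
    """Treat a device as relevant after every uevent action except "remove"."""
    return action != "remove"


class DeviceTags:
    """Toy model of the sticky TAGS set versus the per-event CURRENT_TAGS set."""

    def __init__(self):
        self.tags = set()          # sticky: kept until the device is removed
        self.current_tags = set()  # only what the most recent event assigned

    def process_event(self, tags_set_by_rules):
        # The current set is replaced on every event, the sticky set only grows.
        self.current_tags = set(tags_set_by_rules)
        self.tags |= self.current_tags

    def has_tag(self, tag: str) -> bool:
        return tag in self.tags

    def has_current_tag(self, tag: str) -> bool:
        return tag in self.current_tags
```

In this model a tag assigned on “add” but not on a later “unbind” stays in the sticky set while disappearing from the current set, which is why relevance checks should use the current-tag variant.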
* The MountAPIVFS= service file setting now defaults to on if RootImage= and RootDirectory= are used, which means that with those two settings /proc/, /sys/ and /dev/ are automatically properly set up for services. Previous behaviour may be restored by explicitly setting MountAPIVFS=off. * Since PAM 1.2.0 (2015) configuration snippets may be placed in /usr/lib/pam.d/ in addition to /etc/pam.d/. If a file exists in the latter it takes precedence over the former, similar to how most of systemd’s own configuration is handled. Given that PAM stack definitions are primarily put together by OS vendors/distributions (though possibly overridden by users), this systemd release moves its own PAM stack configuration for the “systemd-user” PAM service (i.e. for the PAM session invoked by the per-user user@.service instance) from /etc/pam.d/ to /usr/lib/pam.d/. We recommend moving all packages’ vendor versions of their PAM stack definitions from /etc/pam.d/ to /usr/lib/pam.d/, but if such OS-wide migration is not desired the location to which systemd installs its PAM stack configuration may be changed via the -Dpamconfdir Meson option. * The runtime dependencies on libqrencode, libpcre2, libidn/libidn2, libpwquality and libcryptsetup have been changed to be based on dlopen(): instead of regular dynamic library dependencies declared in the binary ELF headers, these libraries are now loaded on demand only, if they are available. If the libraries cannot be found the relevant operations will fail gracefully, or a suitable fallback logic is chosen. This is supposed to be useful for general purpose distributions, as it allows minimizing the list of dependencies the systemd packages pull in, permitting building of more minimal OS images, while still making use of these “weak” dependencies should they be installed. 
Since many package managers automatically synthesize package dependencies from ELF shared library dependencies, some additional manual packaging work has to be done now to replace those (slightly downgraded from “required” to “recommended” or whatever is conceptually suitable for the package manager). Note that this change does not alter build-time behaviour: as before the build-time dependencies have to be installed during build, even if they now are optional during runtime. * sd-event.h gained a new call sd_event_add_time_relative() for installing timers relative to the current time. This is mostly a convenience wrapper around the pre-existing sd_event_add_time() call which installs absolute timers. * sd-event event sources may now be placed in a new “exit-on-failure” mode, which may be controlled via the new sd_event_source_get_exit_on_failure() and sd_event_source_set_exit_on_failure() functions. If enabled, any failure returned by the event source handler functions will result in exiting the event loop (unlike the default behaviour of just disabling the event source but continuing with the event loop). This feature is useful to set for all event sources that define “primary” program behaviour (where failure should be fatal) in contrast to “auxiliary” behaviour (where failure should remain local). * Most event source types sd-event supports now accept a NULL handler function, in which case the event loop is exited once the event source is to be dispatched, using the userdata pointer — converted to a signed integer — as exit code of the event loop. Previously this was supported for IO and signal event sources already. Exit event sources still do not support this (simply because it makes little sense there, as the event loop is already exiting when they are dispatched). * A new per-unit setting RootImageOptions= has been added which allows tweaking the mount options for any file system mounted as effect of the RootImage= setting. 
* Another new per-unit setting MountImages= has been added, that allows mounting additional disk images into the file system tree accessible to the service. * Timer units gained a new FixedRandomDelay= boolean setting. If enabled, the random delay configured with RandomizedDelaySec= is selected in a way that is stable on a given system (though still different for different units). * Socket units gained a new setting Timestamping= that takes “us”, “ns” or “off”. This controls the SO_TIMESTAMP/SO_TIMESTAMPNS socket options. * systemd-repart now generates JSON output when requested with the new --json= switch. * systemd-machined’s OpenMachineShell() bus call will now pass additional policy metadata fields to the PolicyKit authentication request. * systemd-tmpfiles gained a new -E switch, which is equivalent to --exclude-prefix=/dev --exclude-prefix=/proc --exclude-prefix=/run --exclude-prefix=/sys. It’s particularly useful in combination with --root=, when operating on OS trees that do not have any of these four runtime directories mounted, as this means no files below these subtrees are created or modified, since those mount points should probably remain empty. * systemd-tmpfiles gained a new --image= switch which is like --root=, but takes a disk image instead of a directory as argument. The specified disk image is mounted inside a temporary mount namespace and the tmpfiles.d/ drop-ins stored in the image are executed and applied to the image. systemd-sysusers similarly gained a new --image= switch, that allows the sysusers.d/ drop-ins stored in the image to be applied onto the image. * Similarly, the journalctl command also gained an --image= switch, which is a quick one-step solution to look at the log data included in OS disk images. * journalctl’s --output=cat option (which outputs the log content without any metadata, just the pure text messages) will now make use of terminal colors when run on a suitable terminal, similarly to the other output modes.
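The idea behind FixedRandomDelay= (a delay that is stable on one system but differs between units) can be sketched by deriving the delay from a hash instead of a random number generator. The hashing scheme below is an assumption made purely for illustration and is not claimed to match what systemd does internally; the function name and inputs are hypothetical.

```python
import hashlib


def fixed_random_delay(machine_id: str, unit_name: str, max_delay_sec: float) -> float:
    """Map (machine, unit) to a repeatable delay in [0, max_delay_sec)."""
    digest = hashlib.sha256(machine_id.encode() + b"\0" + unit_name.encode()).digest()
    # 48 bits fit exactly into a float, so the fraction is always below 1.0.
    fraction = int.from_bytes(digest[:6], "big") / 2**48
    return fraction * max_delay_sec
```

Calling the function twice with the same machine ID and unit name yields the same delay, while another unit name on the same machine yields a different one.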
* JSON group records now support a “description” string that may be used to add a human-readable textual description to such groups. This is supposed to match the user’s GECOS field which traditionally didn’t have a counterpart for group records. * The “systemd-dissect” tool that may be used to inspect OS disk images and that was previously installed to /usr/lib/systemd/ has now been moved to /usr/bin/, reflecting its updated status of an officially supported tool with a stable interface. It gained support for a new --mkdir switch which when combined with --mount has the effect of creating the directory to mount the image to if it is missing first. It also gained two new commands --copy-from and --copy-to for copying files and directories in and out of an OS image without the need to manually mount it. It also acquired support for a new option --json= to generate JSON output when inspecting an OS image. * The cgroup2 file system is now mounted with the “memory_recursiveprot” mount option, supported since kernel 5.7. This means that the MemoryLow= and MemoryMin= unit file settings now apply recursively to whole subtrees. * systemd-homed now defaults to using the btrfs file system (if available) when creating home directories in LUKS volumes. This may be changed with the DefaultFileSystemType= setting in homed.conf. It’s now the default file system in various major distributions and has the major benefit for homed that it can be grown and shrunk while mounted, unlike the other contenders ext4 and xfs, which can both be grown online, but not shrunk (in fact xfs is the technically most limited option here, as it cannot be shrunk at all). * JSON user records managed by systemd-homed gained support for “recovery keys”. These are basically secondary passphrases that can unlock user accounts/home directories. They are computer-generated rather than user-chosen, and typically have greater entropy.
homectl’s --recovery-key= option may be used to add a recovery key to a user account. The generated recovery key is displayed as a QR code, so that it can be scanned to be kept in a safe place. This feature is particularly useful in combination with systemd-homed’s support for FIDO2 or PKCS#11 authentication, as a secure fallback in case the security tokens are lost. Recovery keys may be entered wherever the system asks for a password. * systemd-homed now maintains a “dirty” flag for each LUKS encrypted home directory which indicates that a home directory has not been deactivated cleanly when offline. This flag is useful to identify home directories for which the offline discard logic did not run when offlining, and where it would be a good idea to log in again to catch up. * systemctl gained a new parameter --timestamp= which may be used to change the style in which timestamps are output, i.e. whether to show them in local timezone or UTC, or whether to show µs granularity. * Alibaba’s “pouch” container manager is now detected by systemd-detect-virt, ConditionVirtualization= and similar constructs. Similarly, they now also recognize IBM PowerVM machine virtualization. * systemd-nspawn has been reworked to use /run/host/incoming/ as the place for propagating external mounts into the container. Similarly /run/host/notify is now used as the socket path for container payloads to communicate with the container manager using sd_notify(). The container manager now uses the /run/host/inaccessible/ directory to place “inaccessible” file nodes of all relevant types which may be used by the container payload as bind mount source to over-mount inodes to make them inaccessible. /run/host/container-manager will now be initialized with the same string as the $container environment variable passed to the container’s PID 1. /run/host/container-uuid will be initialized with the same string as $container_uuid.
This means the /run/host/ hierarchy is now the primary way to make host resources available to the container. The Container Interface documents these new files and directories: https://systemd.io/CONTAINER_INTERFACE * Support for the “ConditionNull=” unit file condition has been deprecated and undocumented for 6 years. systemd started to warn about its use 1.5 years ago. It has now been removed entirely. * sd-bus.h gained a new API call sd_bus_error_has_names(), which takes a sd_bus_error struct and a list of error names, and checks if the error matches one of these names. It’s a convenience wrapper that is useful in cases where multiple errors shall be handled the same way. * A new system call filter list “@known” has been added, that contains all system calls known at the time systemd was built. * Behaviour of system call filter allow lists has changed slightly: system calls that are contained in @known will result in EPERM by default, while those not contained in it result in ENOSYS. This should improve compatibility because known system calls will thus be communicated as prohibited, while unknown (and thus newer ones) will be communicated as not implemented, which hopefully has the greatest chance of triggering the right fallback code paths in client applications. * “systemd-analyze syscall-filter” will now show two separate sections at the bottom of the output: system calls known during systemd build time but not included in any of the filter groups shown above, and system calls defined on the local kernel but not known during systemd build time. * If the $SYSTEMD_LOG_SECCOMP=1 environment variable is set for systemd-nspawn all system call filter violations will be logged by the kernel (audit). This is useful for tracking down system calls invoked by container payloads that are prohibited by the container’s system call filter policy.
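The allow-list behaviour described above (EPERM for calls in @known, ENOSYS for everything else) amounts to a simple decision rule. The sketch below models only that rule; it does not install a seccomp filter, and the function name and the example system call sets are assumptions for illustration.

```python
import errno


def denied_syscall_errno(syscall: str, allowed: set, known: set):
    """Return None if the call is allowed, else the errno the caller would see."""
    if syscall in allowed:
        return None
    # Known but not allowed: reported as prohibited.
    # Unknown (for example newer than the systemd build): reported as not implemented.
    return errno.EPERM if syscall in known else errno.ENOSYS
```

With `known = {"read", "write", "mount"}` and `allowed = {"read", "write"}`, a call to `mount` maps to EPERM, while a call that is not in `known` at all maps to ENOSYS and so tends to trigger the application's fallback path.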
* If the $SYSTEMD_SECCOMP=0 environment variable is set for systemd-nspawn (and other programs that use seccomp) all seccomp filtering is turned off. * Two new unit file settings ProtectProc= and ProcSubset= have been added that expose the hidepid= and subset= mount options of procfs. All processes of the unit will only see processes in /proc that are owned by the unit’s user. This is an important new sandboxing option that is recommended to be set on all system services. All long-running system services that are included in systemd itself set this option now. This option is only supported on kernel 5.8 and above, since the hidepid= option supported on older kernels was not a per-mount option but actually applied to the whole PID namespace. * Socket units gained a new boolean setting FlushPending=. If enabled all pending socket data/connections are flushed whenever the socket unit enters the “listening” state, i.e. after the associated service exited. * The unit file setting NUMAMask= gained a new “all” value: when used, all existing NUMA nodes are added to the NUMA mask. * A new “credentials” logic has been added to system services. This is a simple mechanism to pass privileged data to services in a safe and secure way. It’s supposed to be used to pass per-service secret data such as passwords or cryptographic keys but also associated less private information such as user names, certificates, and similar to system services. Each credential is identified by a short user-chosen name and may contain arbitrary binary data. Two new unit file settings have been added: SetCredential= and LoadCredential=. The former allows setting a credential to a literal string, the latter sets a credential to the contents of a file (or data read from a user-chosen AF_UNIX stream socket). Credentials are passed to the service via a special credentials directory, one file for each credential.
The path to the credentials directory is passed in a new $CREDENTIALS_DIRECTORY environment variable. Since the credentials are passed in the file system they may be easily referenced in ExecStart= command lines too, thus no explicit support for the credentials logic in daemons is required (though ideally daemons would look for the bits they need in $CREDENTIALS_DIRECTORY themselves automatically, if set). The $CREDENTIALS_DIRECTORY is backed by unswappable memory if privileges allow it, immutable if privileges allow it, is accessible only to the service’s UID, and is automatically destroyed when the service stops. * systemd-nspawn supports the same credentials logic. It can both consume credentials passed to it via the aforementioned $CREDENTIALS_DIRECTORY protocol as well as pass these credentials on to its payload. The service manager/PID 1 has been updated to match this: it can also accept credentials from the container manager that invokes it (in fact: any process that invokes it), and passes them on to its services. Thus, credentials can be propagated recursively down the tree: from a system’s service manager to a systemd-nspawn service, to the service manager that runs as container payload and to the service it runs below. Credentials may also be added on the systemd-nspawn command line, using new --set-credential= and --load-credential= command line switches that match the aforementioned service settings. * systemd-repart gained new settings Format=, Encrypt=, CopyFiles= in the partition drop-ins which may be used to format/LUKS encrypt/populate any created partitions. The partitions are encrypted/formatted/populated before they are registered in the partition table, so that they appear atomically: either the partitions do not exist yet or they exist fully encrypted, formatted, and populated; there is no time window where they are “half-initialized”.
Thus the system is robust to abrupt shutdown: if the tool is terminated half-way during its operations, on next boot it will start from the beginning. * systemd-repart’s --size= option gained a new “auto” value. If specified, and operating on a loopback file it is automatically sized to the minimal size the size constraints permit. This is useful to use “systemd-repart” as an image builder for minimally sized images. * systemd-resolved now gained a third IPC interface for requesting name resolution: besides D-Bus and local DNS to 127.0.0.53 a Varlink interface is now supported. The nss-resolve NSS module has been modified to use this new interface instead of D-Bus. Using Varlink has a major benefit over D-Bus: it works without a broker service, and thus already during earliest boot, before the dbus daemon has been started. This means name resolution via systemd-resolved now works at the same time systemd-networkd operates: from earliest boot on, including in the initrd. * systemd-resolved gained support for a new DNSStubListenerExtra= configuration file setting which may be used to specify additional IP addresses the built-in DNS stub shall listen on, in addition to the main one on 127.0.0.53:53. * Name lookups issued via systemd-resolved’s D-Bus and Varlink interfaces (and thus also via glibc NSS if nss-resolve is used) will now honour a trailing dot in the hostname: if specified the search path logic is turned off. Thus “resolvectl query foo.” is now equivalent to “resolvectl query --search=off foo.”. * systemd-resolved gained a new D-Bus property “ResolvConfMode” that exposes how /etc/resolv.conf is currently managed: by resolved (and in which mode if so) or another subsystem. “resolvectl” will display this property in its status output. * The resolv.conf snippets systemd-resolved provides will now set “.” as the search domain if no other search domain is known.
This turns off the derivation of an implicit search domain by nss-dns for the hostname, when the hostname is set to an FQDN. This change is done to make nss-dns using resolv.conf provided by systemd-resolved behave more similarly to nss-resolve. * systemd-tmpfiles’ file “aging” logic (i.e. the automatic clean-up of /tmp/ and /var/tmp/ based on file timestamps) now looks at the “birth” time (btime) of a file in addition to the atime, mtime, and ctime. * systemd-analyze gained a new verb “capability” that lists all capabilities known to the systemd build and to the kernel. * If a file /usr/lib/clock-epoch exists, PID 1 will read its mtime and advance the system clock to it at boot if it is noticed to be before that time. Previously, PID 1 would only advance the time to an epoch time that is set during build-time. With this new file OS builders can change this epoch timestamp on individual OS images without having to rebuild systemd. * systemd-logind will now listen to the KEY_RESTART key from the Linux input layer and reboot the system if it is pressed, similarly to how it already handles KEY_POWER, KEY_SUSPEND or KEY_SLEEP. KEY_RESTART was originally defined in the Multimedia context (to restart playback of a song or film), but is now primarily used in various embedded devices for “Reboot” buttons. Accordingly, systemd-logind will now honour it as such. This may be configured in more detail via the new HandleRebootKey= and RebootKeyIgnoreInhibited=. * systemd-nspawn/systemd-machined will now reconstruct hardlinks when copying OS trees, for example in “systemd-nspawn --ephemeral”, “systemd-nspawn --template=”, “machinectl clone” and similar. This is useful when operating with OSTree images, which use hardlinks heavily throughout, and where such copies previously resulted in “exploding” hardlinks. * systemd-nspawn’s --console= setting gained support for a new “autopipe” value, which is identical to “interactive” when invoked on a TTY, and “pipe” otherwise.
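The selection rule behind the “autopipe” console value described above can be illustrated with a small Python sketch. This is only a model of the rule as stated in the text, not systemd-nspawn's code: the function name resolve_console_mode and the on_tty parameter are hypothetical, and how the real tool decides whether it was "invoked on a TTY" is not covered here.

```python
def resolve_console_mode(mode: str, on_tty: bool) -> str:
    """Map a requested console mode to the effective one.

    Hypothetical helper: "autopipe" becomes "interactive" when the
    caller runs on a TTY and "pipe" otherwise; every other mode is
    passed through unchanged.
    """
    if mode == "autopipe":
        return "interactive" if on_tty else "pipe"
    return mode


if __name__ == "__main__":
    import sys

    # Assumption for this demo only: treat "stdin is a terminal" as "on a TTY".
    print(resolve_console_mode("autopipe", sys.stdin.isatty()))
```

A caller that always wants sensible behaviour in both interactive shells and scripts could thus request “autopipe” unconditionally.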
* systemd-networkd’s .network files gained support for explicitly configuring the multicast membership entries of bridge devices in the [BridgeMDB] section. It also gained support for the PIE queuing discipline in the [FlowQueuePIE] sections. * systemd-networkd’s .netdev files may now be used to create “BareUDP” tunnels, configured in the new [BareUDP] setting. * systemd-networkd’s Gateway= setting in .network files now accepts the special values “_dhcp4” and “_ipv6ra” to configure additional, locally defined, explicit routes to the gateway acquired via DHCP or IPv6 Router Advertisements. The old setting “_dhcp” is deprecated, but still accepted for backwards compatibility. * systemd-networkd’s [IPv6PrefixDelegation] section and IPv6PrefixDelegation= options have been renamed as [IPv6SendRA] and IPv6SendRA= (the old names are still accepted for backwards compatibility). * systemd-networkd’s .network files gained the DHCPv6PrefixDelegation= boolean setting in [Network] section. If enabled, the delegated prefix gained by another link will be configured, and an address within the prefix will be assigned. * systemd-networkd’s .network files gained the Announce= boolean setting in [DHCPv6PrefixDelegation] section. When enabled, the delegated prefix will be announced through IPv6 router advertisement (IPv6 RA). The setting is enabled by default. * VXLAN tunnels may now be marked as independent of any underlying network interface via the new Independent= boolean setting. * systemctl gained support for two new verbs: “service-log-level” and “service-log-target” may be used on services that implement the generic org.freedesktop.LogControl1 D-Bus interface to dynamically adjust the log level and target. All of systemd’s long-running services support this now, but ideally all system services would implement this interface to make the system more uniformly debuggable. 
* The SystemCallErrorNumber= unit file setting now accepts the new “kill” and “log” actions, in addition to arbitrary error number specifications as before. If “kill” is specified the processes are killed on the event; if “log” is specified the offending system call is audit logged. * A new SystemCallLog= unit file setting has been added that accepts a list of system calls that shall be logged (via audit). * The OS image dissection logic (as used by RootImage= in unit files or systemd-nspawn’s --image= switch) has gained support for identifying and mounting explicit /usr/ partitions, which are now defined in the discoverable partition specification. This should be useful for environments where the root file system is generated/formatted/populated dynamically on first boot and combined with an immutable /usr/ tree that is supplied by the vendor. * In the final phase of shutdown, within the systemd-shutdown binary we’ll now try to detach MD devices (i.e. software RAID) in addition to loopback block devices and DM devices as before. This is supposed to be a safety net only, in order to increase robustness if things go wrong. Storage subsystems are expected to properly detach their storage volumes during regular shutdown already (or in case of storage backing the root file system: in the initrd hook we return to later). * If the SYSTEMD_LOG_TID environment variable is set all systemd tools will now log the thread ID in their log output. This is useful when working with heavily threaded programs. * If the SYSTEMD_RDRAND environment variable is set to “0”, systemd will not use the RDRAND CPU instruction. This is useful in environments such as replay debuggers where non-deterministic behaviour is not desirable. * The autopaging logic in systemd’s various tools (such as systemctl) has been updated to turn on “secure” mode in “less” (i.e. $LESSSECURE=1) if execution in a “sudo” environment is detected. This disables invoking external programs from the pager, via the pipe logic.
This behaviour may be overridden via the new $SYSTEMD_PAGERSECURE environment variable. * Units which have resource limits (.service, .mount, .swap, .slice, .socket, and .scope) gained new configuration settings ManagedOOMSwap=, ManagedOOMMemoryPressure=, and ManagedOOMMemoryPressureLimitPercent= that specify resource pressure limits and optional action taken by systemd-oomd. * A new service systemd-oomd has been added. It monitors resource contention for selected parts of the unit hierarchy using the PSI information reported by the kernel, and kills processes when memory or swap pressure is above configured limits. This service is only enabled by default in developer mode (see below) and should be considered a preview in this release. Behaviour details and option names are subject to change without the usual backwards-compatibility promises. * A new helper oomctl has been added to introspect systemd-oomd state. It is only enabled by default in developer mode and should be considered a preview without the usual backwards-compatibility promises. * A new meson option -Dcompat-mutable-uid-boundaries= has been added. If enabled, systemd reads the system UID boundaries from /etc/login.defs at runtime, instead of using the built-in values selected during build. This is an option to improve compatibility for upgrades from old systems. It’s strongly recommended not to make use of this functionality on new systems (or even enable it during build), as it makes something runtime-configurable that is mostly an implementation detail of the OS, and permits avoidable differences in deployments that create all kinds of problems in the long run. * A new meson option ‘-Dmode=developer|release’ has been added. When ‘developer’, additional checks and features are enabled that are relevant during upstream development, e.g. verification that semi-automatically-generated documentation has been properly updated following API changes.
Those checks are considered hints for developers and are not actionable in downstream builds. In addition, extra features that are not ready for general consumption may be enabled in developer mode. It is thus recommended to set ‘-Dmode=release’ in end-user and distro builds. * systemd-cryptsetup gained support for processing detached LUKS headers specified on the kernel command line via the header= parameter of the luks.options= kernel command line option. The same device/path syntax as for key files is supported for header files like this. * The “net_id” built-in of udev has been updated to ignore ACPI _SUN slot index data for devices that are connected through a PCI bridge where the _SUN index is associated with the bridge instead of the network device itself. Previously this would create ambiguous device naming if multiple network interfaces were connected to the same PCI bridge. Since this is a naming scheme incompatibility on systems that possess hardware like this it has been introduced as new naming scheme “v247”. The previous scheme can be selected via the “net.naming-scheme=v245” kernel command line parameter. * ConditionFirstBoot= semantics have been modified to be safe towards abnormal system power-off during first boot. Specifically, the “systemd-machine-id-commit.service” service now acts as boot milestone indicating when the first boot process is sufficiently complete in order to not consider the next following boot also a first boot. If the system is reset before this unit is reached the first time, the next boot will still be considered a first boot; once it has been reached, no further boots will be considered a first boot. The “first-boot-complete.target” unit now acts as official hook point to order against this. If a service shall be run on every boot until the first boot fully succeeds it may thus be ordered before this target unit (and pull it in) and carry ConditionFirstBoot= appropriately. 
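The revised ConditionFirstBoot= semantics described above can be modelled as a tiny state machine. The following Python sketch is purely illustrative: the function first_boot_flags and its input format are hypothetical, and it does not reflect how systemd actually records that the milestone was reached. It only encodes the rule that every boot counts as a first boot until one boot reaches the milestone.

```python
from typing import Iterable, List


def first_boot_flags(milestone_reached: Iterable[bool]) -> List[bool]:
    """Return, per boot, whether that boot is considered a first boot.

    milestone_reached holds one boolean per boot, in order, telling
    whether that boot got as far as the first-boot milestone before the
    system was reset or powered off (hypothetical input format).
    """
    flags = []
    completed = False  # becomes True once any boot reaches the milestone
    for reached in milestone_reached:
        flags.append(not completed)
        if reached:
            completed = True
    return flags


if __name__ == "__main__":
    # Two interrupted boots, then a boot that reaches the milestone,
    # then a regular boot.
    print(first_boot_flags([False, False, True, True]))
```

In this model a service ordered before the milestone and carrying the condition would be attempted on each of the first three boots of the demo input, and skipped from the fourth boot on.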
* bootctl’s set-default and set-oneshot commands now accept the three special strings “@default”, “@oneshot”, “@current” in place of a boot entry id. These strings are resolved to the current default and oneshot boot loader entry, as well as the currently booted one. Thus a command “bootctl set-default @current” may be used to make the currently booted menu item the new default for all subsequent boots. * “systemctl edit” has been updated to show the original effective unit contents in commented form in the text editor. * Units in user mode are now segregated into three new slices: session.slice (units that form the core of the graphical session), app.slice (“normal” user applications), and background.slice (low-priority tasks). Unless otherwise configured, user units are placed in app.slice. The plan is to add resource limits and protections for the different slices in the future. * New GPT partition types for RISCV32/64 for the root and /usr partitions, and their associated Verity partitions have been defined, and are now understood by systemd-gpt-auto-generator, and the OS image dissection logic.
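The way the special strings “@default”, “@oneshot” and “@current” accepted by bootctl's set-default and set-oneshot commands stand in for a boot entry id can be sketched as a simple lookup. This Python snippet is an illustration under assumptions: the function resolve_entry_id, its parameters and the example entry ids are all hypothetical, and the real tool obtains these values from the boot loader rather than from arguments.

```python
from typing import Optional


def resolve_entry_id(
    arg: str,
    default: Optional[str],
    oneshot: Optional[str],
    current: Optional[str],
) -> str:
    """Resolve a boot entry argument, expanding the three special strings.

    Any argument that is not one of the special strings is treated as a
    literal boot entry id and returned unchanged.
    """
    special = {"@default": default, "@oneshot": oneshot, "@current": current}
    if arg not in special:
        return arg
    value = special[arg]
    if value is None:
        raise LookupError(f"no entry known for {arg}")
    return value


if __name__ == "__main__":
    # Made-up entry ids, for demonstration only.
    print(resolve_entry_id("@current", "old.conf", None, "new.conf"))
```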
Contributions from: Adolfo Jayme Barrientos, afg, Alec Moskvin, Alyssa Ross, Amitanand Chikorde, Andrew Hangsleben, Anita Zhang, Ansgar Burchardt, Arian van Putten, Aurelien Jarno, Axel Rasmussen, bauen1, Beniamino Galvani, Benjamin Berg, Bjørn Mork, brainrom, Chandradeep Dey, Charles Lee, Chris Down, Christian Göttsche, Christof Efkemann, Christoph Ruegge, Clemens Gruber, Daan De Meyer, Daniele Medri, Daniel Mack, Daniel Rusek, Dan Streetman, David Tardon, Dimitri John Ledkov, Dmitry Borodaenko, Elias Probst, Elisei Roca, ErrantSpore, Etienne Doms, Fabrice Fontaine, fangxiuning, Felix Riemann, Florian Klink, Franck Bui, Frantisek Sumsal, fwSmit, George Rawlinson, germanztz, Gibeom Gwon, Glen Whitney, Gogo Gogsi, Göran Uddeborg, Grant Mathews, Hans de Goede, Hans Ulrich Niedermann, Haochen Tong, Harald Seiler, huangyong, Hubert Kario, igo95862, Ikey Doherty, Insun Pyo, Jan Chren, Jan Schlüter, Jérémy Nouhaud, Jian-Hong Pan, Joerg Behrmann, Jonathan Lebon, Jörg Thalheim, Josh Brobst, Juergen Hoetzel, Julien Humbert, Kai-Chuan Hsieh, Kairui Song, Kamil Dudka, Kir Kolyshkin, Kristijan Gjoshev, Kyle Huey, Kyle Russell, Lee Whalen, Lennart Poettering, lichangze, Luca Boccassi, Lucas Werkmeister, Luca Weiss, Marc Kleine-Budde, Marco Wang, Martin Wilck, Marti Raudsepp, masmullin2000, Máté Pozsgay, Matt Fenwick, Michael Biebl, Michael Scherer, Michal Koutný, Michal Sekletár, Michal Suchanek, Mikael Szreder, Milo Casagrande, mirabilos, Mitsuha_QuQ, mog422, Muhammet Kara, Nazar Vinnichuk, Nicholas Narsing, Nicolas Fella, Njibhu, nl6720, Oğuz Ersen, Olivier Le Moal, Ondrej Kozina, onlybugreports, Pass Automated Testing Suite, Pat Coulthard, Pavel Sapezhko, Pedro Ruiz, perry_yuan, Peter Hutterer, Phaedrus Leeds, PhoenixDiscord, Piotr Drąg, Plan C, Purushottam choudhary, Rasmus Villemoes, Renaud Métrich, Robert Marko, Roman Beranek, Ronan Pigott, Roy Chen (陳彥廷), RussianNeuroMancer, Samanta Navarro, Samuel BF, scootergrisen, Sorin Ionescu, Steve Dodd, Susant Sahani, Timo 
Rothenpieler, Tobias Hunger, Tobias Kaufmann, Topi Miettinen, vanou, Vito Caputo, Weblate, Wen Yang, Whired Planck, williamvds, Yu, Li-Yu, Yuri Chornoivan, Yu Watanabe, Zbigniew Jędrzejewski-Szmek, Zmicer Turok, Дамјан Георгиевски – Warsaw, 2020-11-26 CHANGES WITH 246: * The service manager gained basic support for cgroup v2 freezer. Units can now be suspended or resumed either using new systemctl verbs, freeze and thaw respectively, or via D-Bus. * PID 1 may now automatically load pre-compiled AppArmor policies from /etc/apparmor/earlypolicy during early boot. * The CPUAffinity= setting in service unit files now supports a new special value “numa” that causes the CPU affinity mask to be set based on the NUMA mask. * systemd will now log about all left-over processes remaining in a unit when the unit is stopped. It will now warn about services using KillMode=none, as this is generally an unsafe thing to make use of. * Two new unit file settings ConditionPathIsEncrypted=/AssertPathIsEncrypted= have been added. They may be used to check whether a specific file system path resides on a block device that is encrypted on the block level (i.e. using dm-crypt/LUKS). * Another pair of new settings ConditionEnvironment=/AssertEnvironment= has been added that may be used for simple environment checks. This is particularly useful when passing in environment variables from a container manager (or from PAM in case of the systemd --user instance). * .service unit files now accept a new setting CoredumpFilter= which allows configuration of the memory sections coredumps of the service’s processes shall include. * .mount units gained a new ReadWriteOnly= boolean option. If set, no attempt is made to mount a file system read-only if mounting in read-write mode doesn’t succeed. An option x-systemd.rw-only is available in /etc/fstab to control the same. * .socket units gained a new boolean setting PassPacketInfo=.
If enabled, the kernel will attach additional per-packet metadata to all packets read from the socket, as an ancillary message. This controls the IP_PKTINFO, IPV6_RECVPKTINFO, NETLINK_PKTINFO socket options, depending on socket type. * .service units gained a new setting RootHash= which may be used to specify the root hash for verity enabled disk images which are specified in RootImage=. RootVerity= may be used to specify a path to the Verity data matching a RootImage= file system. (The latter is only useful for images that do not contain the Verity data embedded into the same image that carries a GPT partition table following the Discoverable Partition Specification). Similarly, systemd-nspawn gained a new switch --verity-data= that takes a path to a file with the verity data of the disk image supplied in --image=, if the image doesn’t contain the verity data itself. * .service units gained a new setting RootHashSignature= which takes either a base64 encoded PKCS#7 signature of the root hash specified with RootHash=, or a path to a file to read the signature from. This allows validation of the root hash against public keys available in the kernel keyring, and is only supported on recent kernels (>= 5.4)/libcryptsetup (>= 2.30). A similar switch has been added to systemd-nspawn and systemd-dissect (--root-hash-sig=). Support for this mechanism has also been added to systemd-veritysetup. * .service unit files gained two new options TimeoutStartFailureMode=/TimeoutStopFailureMode= that may be used to tune behaviour if a start or stop timeout is hit, i.e. whether to terminate the service with SIGTERM, SIGABRT or SIGKILL. * Most options in systemd that accept hexadecimal values prefixed with 0x in addition to the usual decimal notation now also support octal notation when the 0o prefix is used and binary notation if the 0b prefix is used.
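The prefix convention described above (0x for hexadecimal, 0o for octal, 0b for binary, plain digits for decimal) can be sketched in a few lines of Python. This is a minimal illustration of the notation only; the function name parse_prefixed_int is hypothetical, and details such as which options accept which prefixes, sign handling or range checks in systemd itself are not modelled.

```python
def parse_prefixed_int(text: str) -> int:
    """Parse a non-negative integer in decimal, 0x, 0o or 0b notation.

    Hypothetical helper illustrating the prefix convention; it is not
    systemd's parser.
    """
    s = text.strip().lower()
    bases = {"0x": 16, "0o": 8, "0b": 2}
    base = bases.get(s[:2], 10)
    digits = s[2:] if base != 10 else s
    # Reject empty input, signs and separators before handing over to int().
    if not digits or not digits.isalnum():
        raise ValueError(f"not a valid number: {text!r}")
    return int(digits, base)


if __name__ == "__main__":
    for sample in ("42", "0x1f", "0o17", "0b101"):
        print(sample, "=", parse_prefixed_int(sample))
```

With such a parser a file mode could be written as 0o644 and a bit mask as 0b1010 while plain decimal input keeps working as before.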
* Various command line parameters and configuration file settings that configure key or certificate files now optionally take paths to AF_UNIX sockets in the file system. If configured that way a stream connection is made to the socket and the required data read from it. This is a simple and natural extension to the existing regular file logic, and permits other software to provide keys or certificates via simple IPC services, for example when unencrypted storage on disk is not desired. Specifically, systemd-networkd’s Wireguard and MACSEC key file settings as well as systemd-journal-gatewayd’s and systemd-journal-remote’s PEM key/certificate parameters support this now. * Unit files, tmpfiles.d/ snippets, sysusers.d/ snippets and other configuration files that support specifier expansion learnt six new specifiers: %a resolves to the current architecture, %o/%w/%B/%W resolve to the various ID fields from /etc/os-release, %l resolves to the “short” hostname of the system, i.e. the hostname configured in the kernel truncated at the first dot. * Support for the .include syntax in unit files has been removed. The concept has been obsolete for 6 years and we started warning about its pending removal 2 years ago (also see NEWS file below). It’s finally gone now. * StandardError= and StandardOutput= in unit files no longer support the “syslog” and “syslog-console” switches. They were long removed from the documentation, but will now result in warnings when used, and be converted to “journal” and “journal+console” automatically. * If the service setting User= is set to the “nobody” user, a warning message is now written to the logs (but the value is nonetheless accepted). Setting User=nobody is unsafe, since the primary purpose of the “nobody” user is to own all files whose owner cannot be mapped locally. It’s in particular used by the NFS subsystem and in user namespacing. 
By running a service under this user’s UID it might get read and even write access to all these otherwise unmappable files, which is quite likely a major security problem. * tmpfs mounts automatically created by systemd (/tmp, /run, /dev/shm, and others) now have size and inode limits applied (50% of RAM for /tmp and /dev/shm, 10% of RAM for other mounts, etc.). Please note that the implicit kernel default is 50% too, so there is no change in the size limit for /tmp and /dev/shm. * nss-mymachines lost support for resolution of users and groups, and now only does resolution of hostnames. This functionality is now provided by nss-systemd. Thus, the ‘mymachines’ entry should be removed from the ‘passwd:’ and ‘group:’ lines in /etc/nsswitch.conf (and ‘systemd’ added if it is not already there). * A new kernel command line option systemd.hostname= has been added that allows controlling the hostname that is initialized early during boot. * A kernel command line option “udev.blockdev_read_only” has been added. If specified all hardware block devices that show up are immediately marked as read-only by udev. This option is useful for making sure that a specific boot under no circumstances modifies data on disk. Use “blockdev --setrw” to undo the effect of this, per device. * A new boolean kernel command line option systemd.swap= has been added, which may be used to turn off automatic activation of swap devices listed in /etc/fstab. * New kernel command line options systemd.condition-needs-update= and systemd.condition-first-boot= have been added, which override the result of the ConditionNeedsUpdate= and ConditionFirstBoot= conditions. * A new kernel command line option systemd.clock-usec= has been added that allows setting the system clock to the specified time in µs since Jan 1st, 1970 early during boot. This is in particular useful in order to make test cases more reliable. * The fs.suid_dumpable sysctl is set to 2 / “suidsafe”.
This allows systemd-coredump to save core files for suid processes. When saving the core file, systemd-coredump will use the effective uid and gid of the process that faulted. * The /sys/module/kernel/parameters/crash_kexec_post_notifiers file is now automatically set to “Y” at boot, in order to enable pstore generation for collection with systemd-pstore. * We provide a set of udev rules to enable auto-suspend on PCI and USB devices that were tested to correctly support it. Previously, this was distributed as a set of udev rules, but has now been replaced by a set of hwdb entries (and a much shorter udev rule to take action if the device modalias matches one of the new hwdb entries). As before, entries are periodically imported from the database maintained by the ChromiumOS project. If you have a device that supports auto-suspend correctly and where it should be enabled by default, please submit a patch that adds it to the database (see /usr/lib/udev/hwdb.d/60-autosuspend.hwdb). * systemd-udevd gained the new configuration option timeout_signal= as well as a corresponding kernel command line option udev.timeout_signal=. The option can be used to configure the UNIX signal that the main daemon sends to the worker processes on timeout. Setting the signal to SIGABRT is useful for debugging. * .link files managed by systemd-udevd gained options RxFlowControl=, TxFlowControl=, AutoNegotiationFlowControl= in the [Link] section, in order to configure various flow control parameters. They also gained RxMiniBufferSize= and RxJumboBufferSize= in order to configure jumbo frame ring buffer sizes. * networkd.conf gained a new boolean setting ManageForeignRoutes=. If enabled systemd-networkd manages all routes configured by other tools. * .network files managed by systemd-networkd gained a new section [SR-IOV], in order to configure SR-IOV capable network devices. * systemd-networkd’s [IPv6Prefix] section in .network files gained a new boolean setting Assign=.
If enabled an address from the prefix is automatically assigned to the interface. * systemd-networkd gained a new section [DHCPv6PrefixDelegation] which controls delegated prefixes assigned by the DHCPv6 client. The section has three settings: SubnetID=, Assign=, and Token=. The setting SubnetID= allows explicit configuration of the preferred subnet that systemd-networkd’s Prefix Delegation logic assigns to interfaces. If Assign= is enabled (which is the default) an address from any acquired delegated prefix is automatically chosen and assigned to the interface. The setting Token= specifies an optional address generation mode for Assign=. * systemd-networkd’s [Network] section gained a new setting IPv4AcceptLocal=. If enabled the interface accepts packets with local source addresses. * systemd-networkd gained support for configuring the HTB queuing discipline in the [HierarchyTokenBucket] and [HierarchyTokenBucketClass] sections. Similarly, the “pfifo” qdisc may be configured in the [PFIFO] section, “GRED” in [GenericRandomEarlyDetection], “SFB” in [StochasticFairBlue], “cake” in [CAKE], “PIE” in [PIE], “DRR” in [DeficitRoundRobinScheduler] and [DeficitRoundRobinSchedulerClass], “BFIFO” in [BFIFO], “PFIFOHeadDrop” in [PFIFOHeadDrop], “PFIFOFast” in [PFIFOFast], “HHF” in [HeavyHitterFilter], “ETS” in [EnhancedTransmissionSelection] and “QFQ” in [QuickFairQueueing] and [QuickFairQueueingClass]. * systemd-networkd gained support for a new Termination= setting in the [CAN] section for configuring the termination resistor. It also gained a new ListenOnly= setting for controlling whether to only listen on CAN interfaces, without interfering with traffic otherwise (which is useful for debugging/monitoring CAN network traffic). DataBitRate=, DataSamplePoint=, FDMode=, FDNonISO= have been added to configure various CAN-FD aspects. * systemd-networkd’s [DHCPv6] section gained a new option WithoutRA=.
When enabled, DHCPv6 will be attempted right away without requiring a Router Advertisement packet suggesting it first (i.e. without the ‘M’ or ‘O’ flags set). The [IPv6AcceptRA] section gained a boolean option DHCPv6Client= that may be used to turn off the DHCPv6 client even if the RA packets suggest it. * systemd-networkd’s [DHCPv4] section gained a new setting UseGateway= which may be used to turn off use of the gateway information provided by the DHCP lease. A new FallbackLeaseLifetimeSec= setting may be used to configure how to process leases that lack a lifetime option. * systemd-networkd’s [DHCPv4] and [DHCPServer] sections gained a new setting SendVendorOption= allowing configuration of additional vendor options to send in the DHCP requests/responses. The [DHCPv6] section gained a new SendOption= setting for sending arbitrary DHCP options. RequestOptions= has been added to request arbitrary options from the server. UserClass= has been added to set the DHCP user class field. * systemd-networkd’s [DHCPServer] section gained a new set of options EmitPOP3=/POP3=, EmitSMTP=/SMTP=, EmitLPR=/LPR= for including server information about these three protocols in the DHCP lease. It also gained support for including “MUD” URLs (“Manufacturer Usage Description”). Support for “MUD” URLs was also added to the LLDP stack, configurable in the [LLDP] section in .network files. * The Mode= settings in [MACVLAN] and [MACVTAP] now support ‘source’ mode. Also, the sections now support a new setting SourceMACAddress=. * systemd-networkd’s .netdev files now support a new setting VLANProtocol= in the [Bridge] section that allows configuration of the VLAN protocol to use. * systemd-networkd supports a new Group= setting in the [Link] section of the .network files, to control the link group. * systemd-networkd’s [Network] section gained a new IPv6LinkLocalAddressGenerationMode= setting, which specifies how the IPv6 link-local address is generated.
* A new default .network file is now shipped that matches TUN/TAP devices that begin with “vt-” in their name. Such interfaces will have IP routing onto the host links set up automatically. This is supposed to be used by VM managers to trivially acquire a network interface which is fully set up for host communication, simply by carefully picking an interface name to use. * systemd-networkd’s [DHCPv6] section gained a new setting RouteMetric= which sets the route priority for routes specified by the DHCP server. * systemd-networkd’s [DHCPv6] section gained a new setting VendorClass= which configures the vendor class information sent to the DHCP server. * The BlackList= settings in .network files’ [DHCPv4] and [IPv6AcceptRA] sections have been renamed DenyList=. The old names are still understood to provide compatibility. * networkctl gained the new “forcerenew” command for forcing all DHCP server clients to renew their lease. The interface “status” output will now show numerous additional fields of information about an interface. There are new “up” and “down” commands to bring specific interfaces up or down. * systemd-resolved’s DNS= configuration option now optionally accepts a port number (after “:”) and a host name (after “#”). When the host name is specified, the DNS-over-TLS certificate is validated to match the specified hostname. Additionally, in case of IPv6 addresses, an interface may be specified (after “%”). * systemd-resolved may be configured to forward single-label DNS names. This is not standard-conformant, but may make sense in setups where public DNS servers are not used. * systemd-resolved’s DNS-over-TLS support gained SNI validation. * systemd-nspawn’s --resolv-conf= switch gained a number of new supported values. Specifically, options starting with “replace-” are like those prefixed “copy-” but replace any existing resolv.conf file.
And options ending in “-uplink” and “-stub” can now be used to propagate other flavours of resolv.conf into the container (as defined by systemd-resolved). * The various programs included in systemd can now optionally output their log messages on stderr prefixed with a timestamp, controlled by the $SYSTEMD_LOG_TIME environment variable. * systemctl gained a new “-P” switch that is a shortcut for “--value --property=…”. * “systemctl list-units” and “systemctl list-machines” no longer hide their first output column with --no-legend. To hide the first column, use --plain. * “systemctl reboot” takes the option “--reboot-argument=”. The optional positional argument to “systemctl reboot” is now being deprecated in favor of this option. * systemd-run gained a new switch --slice-inherit. If specified the unit it generates is placed in the same slice as the systemd-run process itself. * systemd-journald gained support for zstd compression of large fields in journal files. The hash tables in journal files have been hardened against hash collisions. This is an incompatible change and means that journal files created with new systemd versions are not readable with old versions. If the $SYSTEMD_JOURNAL_KEYED_HASH boolean environment variable for systemd-journald.service is set to 0 this new hardening functionality may be turned off, so that generated journal files remain compatible with older journalctl implementations. * journalctl will now include a clickable link in the default output for each log message for which a URL with further documentation is known. This is only supported on terminal emulators that support clickable hyperlinks, and is turned off if a pager is used (since “less” still doesn’t support hyperlinks, unfortunately). Documentation URLs may be included in log messages either by including a DOCUMENTATION= journal field in it, or by associating a journal message catalog entry with the log message’s MESSAGE_ID, which then carries a “Documentation:” tag.
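Clickable links in terminal output are commonly produced with the OSC 8 escape sequence, which many terminal emulators recognise. The Python sketch below shows how a log line carrying a DOCUMENTATION= field might be wrapped in such a sequence. It is an assumption-laden illustration: the helper names hyperlink and render_entry are hypothetical, the entry is a plain dictionary rather than a real journal record, and the exact escape sequence and layout journalctl emits are not claimed here.

```python
ESC = "\x1b"
ST = ESC + "\\"  # string terminator closing an OSC sequence


def hyperlink(url: str, text: str) -> str:
    """Wrap text in an OSC 8 hyperlink pointing at url."""
    return f"{ESC}]8;;{url}{ST}{text}{ESC}]8;;{ST}"


def render_entry(entry: dict, use_links: bool) -> str:
    """Render a log entry, linking the message if a documentation URL is known.

    entry is a hypothetical mapping of journal field names to values;
    use_links would be False when output goes to a pager or a terminal
    without hyperlink support.
    """
    message = entry.get("MESSAGE", "")
    url = entry.get("DOCUMENTATION")
    if use_links and url:
        return hyperlink(url, message)
    return message


if __name__ == "__main__":
    sample = {"MESSAGE": "Unit failed.", "DOCUMENTATION": "https://example.com/doc"}
    print(render_entry(sample, use_links=True))
```

Terminals without OSC 8 support are generally expected to ignore the sequence and show only the enclosed text, which is why the message itself is kept unchanged inside the link.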
* journald.conf gained a new boolean setting Audit= that may be used to control whether systemd-journald will enable audit during initialization. * When systemd-journald’s log stream is broken up into multiple lines because the PID of the sender changed this is indicated in the generated log records via the _LINE_BREAK=pid-change field. * journalctl’s “-o cat” output mode will now show one or more journal fields specified with --output-fields= instead of unconditionally MESSAGE=. This is useful to retrieve a very specific set of fields without any decoration. * The sd-journal.h API gained two new functions: sd_journal_enumerate_available_unique() and sd_journal_enumerate_available_data() that operate like their counterparts that lack the _available_ in the name, but skip items that cannot be read and processed by the local implementation (i.e. are compressed in an unsupported format or such). * coredumpctl gained a new --file= switch, matching the same one in journalctl: a specific journal file may be specified to read the coredump data from. * coredumps collected by systemd-coredump may now be compressed using the zstd algorithm. * systemd-binfmt gained a new switch --unregister for unregistering all registered entries at once. This is now invoked automatically at shutdown, so that binary formats registered with the “F” flag will not block clean file system unmounting. * systemd-notify’s --pid= switch gained new values: “parent”, “self”, “auto” for controlling which PID to send to the service manager: the systemd-notify process’ PID, or the one of the process invoking it. * systemd-logind’s Session bus object learnt a new method call SetType() for temporarily updating the session type of an already allocated session. This is useful for upgrading tty sessions to graphical ones once a compositor is invoked. * systemd-socket-proxy gained a new switch --exit-idle-time= for configuring an exit-on-idle time. * systemd-repart’s --empty= setting gained a new value “create”.
If specified, a new empty regular disk image file is created under the specified name. Its size may be specified with the new --size= option. The latter is also supported without the “create” mode, in order to grow existing disk image files to the specified size. These two new options are useful when creating or manipulating disk images instead of operating on actual block devices. * systemd-repart drop-ins now support a new UUID= setting to control the UUID to assign to a newly created partition. * systemd-repart’s SizeMin= per-partition parameter now defaults to 10M instead of 0. * systemd-repart’s Label= setting now supports the usual, simple specifier expansion. * systemd-homed’s LUKS backend gained the ability to discard empty file system blocks automatically when the user logs out. This is enabled by default to ensure that home directories take minimal space when logged out but get full size guarantees when logged in. This may be controlled with the new --luks-offline-discard= switch to homectl. * If systemd-homed detects that /home/ is encrypted as a whole it will now default to the directory or subvolume backends instead of the LUKS backend, in order to avoid double encryption. The default storage and file system may now be configured explicitly, too, via the new /etc/systemd/homed.conf configuration file. * systemd-homed now supports unlocking home directories with FIDO2 security tokens that support the ‘hmac-secret’ extension, in addition to the existing support for unlocking with PKCS#11 security tokens. Note that many recent hardware security tokens support both interfaces. The FIDO2 support is accessible via homectl’s --fido2-device= option. * homectl’s --pkcs11-uri= setting now accepts two special parameters: if “auto” is specified and only one suitable PKCS#11 security token is plugged in, its URI is automatically determined and enrolled for unlocking the home directory. If “list” is specified, a brief table of suitable PKCS#11 security tokens is shown.
Similarly, the new --fido2-device= option also supports these two special values, for automatically selecting and listing suitable FIDO2 devices. * The /etc/crypttab tmp option now optionally takes an argument selecting the file system to use. Moreover, the default is now changed from ext2 to ext4. * There’s a new /etc/crypttab option “keyfile-erase”. If specified, the key file listed in the same line is removed after use, regardless of whether volume activation was successful. This is useful if the key file is only acquired transiently at runtime and shall be erased before the system continues to boot. * There’s also a new /etc/crypttab option “try-empty-password”. If specified, an attempt is made to unlock the volume with an empty password before asking the user for a password. This is useful for installing encrypted images whose password shall be set on first boot instead of at installation time. * systemd-cryptsetup will now attempt to automatically load the keys for unlocking volumes from files in /etc/cryptsetup-keys.d/.key and /run/cryptsetup-keys.d/.key, if any of these files exist. * systemd-cryptsetup may now activate Microsoft BitLocker volumes via /etc/crypttab, during boot. * logind.conf gained a new RuntimeDirectoryInodesMax= setting to control the inode limit for the per-user $XDG_RUNTIME_DIR tmpfs instance. * A new generator systemd-xdg-autostart-generator has been added. It generates systemd unit files from XDG autostart .desktop files, and may be used to let the systemd user instance manage services that are started automatically as part of the desktop session. * “bootctl” gained a new verb “reboot-to-firmware” that may be used to query and change the firmware’s ‘reboot into firmware’ setup flag. * systemd-firstboot gained a new switch --kernel-command-line= that may be used to initialize the /etc/kernel/cmdline file of the image.
It also gained a new switch --root-password-hashed= which is like --root-password= but accepts a pre-hashed UNIX password as argument. The new option --delete-root-password may be used to unset any password for the root user (dangerous!). The --root-shell= switch may be used to control the shell to use for the root account. A new --force option may be used to override any already set settings with the parameters specified on the command line (by default, the tool will not override what has already been set before, i.e. is purely incremental). * systemd-firstboot gained support for a new --image= switch, which is similar to --root= but accepts the path to a disk image file, on which it then operates. * A new sd-path.h API has been added to libsystemd. It provides a simple API for retrieving various search paths and primary directories for various resources. * A new call sd_notify_barrier() has been added to the sd-daemon.h API. The call will block until all previously sent sd_notify() messages have been processed by the service manager. This is useful to remove races caused by a process already having disappeared at the time a notification message is processed by the service manager, making correct attribution impossible. The systemd-notify tool will now make use of this call implicitly, but this can be turned off again via the new --no-block switch. * When sending a file descriptor (fd) to the service manager to keep track of, using the sd_notify() mechanism, a new parameter FDPOLL=0 may be specified. If passed the service manager will refrain from poll()ing on the file descriptor. Traditionally (and when the parameter is not specified), the service manager will poll it for POLLHUP or POLLERR events, and immediately close the fds in that case. * The service manager (PID1) gained a new D-Bus method call SetShowStatus() which may be used to control whether it shall show boot-time status output on the console.
This method has a similar effect to sending SIGRTMIN+20/SIGRTMIN+21 to PID 1. * The sd-bus API gained a number of convenience functions that take va_list arguments rather than “…”. For example, there’s now sd_bus_call_methodv() to match sd_bus_call_method(). Those calls make it easier to build wrappers that accept variadic arguments and want to pass a ready va_list structure to sd-bus. * sd-bus vtable entries can have a new SD_BUS_VTABLE_ABSOLUTE_OFFSET flag which alters how the userdata pointer to pass to the callbacks is determined. When the flag is set, the offset field is converted as-is into a pointer, without adding it to the object pointer the vtable is associated with. * sd-bus now exposes four new functions: sd_bus_interface_name_is_valid(), sd_bus_service_name_is_valid(), sd_bus_member_name_is_valid() and sd_bus_object_path_is_valid(), which validate strings to check if they qualify as various D-Bus concepts. * The sd-bus API gained the SD_BUS_METHOD_WITH_ARGS(), SD_BUS_METHOD_WITH_ARGS_OFFSET() and SD_BUS_SIGNAL_WITH_ARGS() macros that simplify adding argument names to D-Bus methods and signals. * The man pages for the sd-bus and sd-hwdb APIs have been completed. * Various D-Bus APIs of systemd daemons now have man pages that document the methods, signals and properties.
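The sd-bus validation functions listed above check strings against D-Bus naming syntax. As a rough illustration of what such checks involve, here is a sketch in Python that follows the rules of the D-Bus specification for interface names, member names and object paths. It is not the sd-bus implementation and may differ from it in corner cases; all function names below are made up for this example.

```python
import re

# One name element: letters, digits and underscore, not starting with a digit.
_ELEMENT = r"[A-Za-z_][A-Za-z0-9_]*"
_INTERFACE_RE = re.compile(rf"{_ELEMENT}(?:\.{_ELEMENT})+")  # at least two elements
_MEMBER_RE = re.compile(_ELEMENT)                             # a single element, no dots
_OBJECT_PATH_RE = re.compile(r"/|(?:/[A-Za-z0-9_]+)+")        # "/" or /a/b/c, no trailing slash

MAX_NAME_LENGTH = 255  # maximum length of names in the D-Bus specification


def interface_name_is_valid(name: str) -> bool:
    """Check an interface name such as org.freedesktop.systemd1.Manager."""
    return len(name) <= MAX_NAME_LENGTH and _INTERFACE_RE.fullmatch(name) is not None


def member_name_is_valid(name: str) -> bool:
    """Check a method or signal name such as SetShowStatus."""
    return len(name) <= MAX_NAME_LENGTH and _MEMBER_RE.fullmatch(name) is not None


def object_path_is_valid(path: str) -> bool:
    """Check an object path such as /org/freedesktop/systemd1."""
    return _OBJECT_PATH_RE.fullmatch(path) is not None
```

Bus (service) names are left out on purpose: they have additional rules for unique names and hyphens that would make the sketch longer without adding much.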
* The expectations on user/group name syntax are now documented in detail; documentation on how classic home directories may be converted into home directories managed by homed has been added; documentation regarding integration of homed/userdb functionality in desktops has been added: https://systemd.io/USER_NAMES https://systemd.io/CONVERTING_TO_HOMED https://systemd.io/USERDB_AND_DESKTOPS * Documentation for the on-disk Journal file format has been updated and has now moved to: https://systemd.io/JOURNAL_FILE_FORMAT * The interface for containers (https://systemd.io/CONTAINER_INTERFACE) has been extended by a set of environment variables that expose select fields from the host’s os-release file to the container payload. Similarly, host’s os-release files can be mounted into the container underneath /run/host. Together, those mechanisms provide a standardized way to expose information about the host to the container payload. Both interfaces are implemented in systemd-nspawn. * All D-Bus services shipped in systemd now implement the generic LogControl1 D-Bus API which allows clients to change log level + target of the service during runtime. * Only relevant for developers: the mkosi.default symlink has been dropped from version control. Please create a symlink to one of the distribution-specific defaults in .mkosi/ based on your preference. 
Contributions from: 24bisquitz, Adam Nielsen, Alan Perry, Alexander Malafeev, Amitanand.Chikorde, Alin Popa, Alvin Šipraga, Amos Bird, Andreas Rammhold, AndreRH, Andrew Doran, Anita Zhang, Ankit Jain, antznin, Arnaud Ferraris, Arthur Moraes do Lago, Arusekk, Balaji Punnuru, Balint Reczey, Bastien Nocera, bemarek, Benjamin Berg, Benjamin Dahlhoff, Benjamin Robin, Chris Down, Chris Kerr, Christian Göttsche, Christian Hesse, Christian Oder, Ciprian Hacman, Clinton Roy, codicodi, Corey Hinshaw, Daan De Meyer, Dana Olson, Dan Callaghan, Daniel Fullmer, Daniel Rusek, Dan Streetman, Dave Reisner, David Edmundson, David Wood, Denis Pronin, Diego Escalante Urrelo, Dimitri John Ledkov, dolphrundgren, duguxy, Einsler Lee, Elisei Roca, Emmanuel Garette, Eric Anderson, Eric DeVolder, Evgeny Vereshchagin, ExtinctFire, fangxiuning, Ferran Pallarès Roca, Filipe Brandenburger, Filippo Falezza, Finn, Florian Klink, Florian Mayer, Franck Bui, Frantisek Sumsal, gaurav, Georg Müller, Gergely Polonkai, Giedrius Statkevičius, Gigadoc2, gogogogi, Gaurav Singh, gzjsgdsb, Hans de Goede, Haochen Tong, ianhi, ignapk, Jakov Smolic, James T. Lee, Jan Janssen, Jan Klötzke, Jan Palus, Jay Burger, Jeremy Cline, Jérémy Rosen, Jian-Hong Pan, Jiri Slaby, Joel Shapiro, Joerg Behrmann, Jörg Thalheim, Jouke Witteveen, Kai-Heng Feng, Kenny Levinsen, Kevin Kuehler, Kumar Kartikeya Dwivedi, layderv, laydervus, Lénaïc Huard, Lennart Poettering, Lidong Zhong, Luca Boccassi, Luca BRUNO, Lucas Werkmeister, Lukas Klingsbo, Lukáš Nykrýn, Łukasz Stelmach, Maciej S. 
Szmigiero, MadMcCrow, Marc-André Lureau, Marcel Holtmann, Marc Kleine-Budde, Martin Hundebøll, Matthew Leeds, Matt Ranostay, Maxim Fomin, MaxVerevkin, Michael Biebl, Michael Chapman, Michael Gubbels, Michael Marley, Michał Bartoszkiewicz, Michal Koutný, Michal Sekletár, Mike Gilbert, Mike Kazantsev, Mikhail Novosyolov, ml, Motiejus Jakštys, nabijaczleweli, nerdopolis, Niccolò Maggioni, Niklas Hambüchen, Norbert Lange, Paul Cercueil, pelzvieh, Peter Hutterer, Piero La Terza, Pieter Lexis, Piotr Drąg, Rafael Fontenelle, Richard Petri, Ronan Pigott, Ross Lagerwall, Rubens Figueiredo, satmandu, Sean-StarLabs, Sebastian Jennen, sterlinghughes, Surhud More, Susant Sahani, szb512, Thomas Haller, Tobias Hunger, Tom, Tomáš Pospíšek, Tomer Shechner, Tom Hughes, Topi Miettinen, Tudor Roman, Uwe Kleine-König, Valery0xff, Vito Caputo, Vladimir Panteleev, Vladyslav Tronko, Wen Yang, Yegor Vialov, Yigal Korman, Yi Gao, YmrDtnJu, Yuri Chornoivan, Yu Watanabe, Zbigniew Jędrzejewski-Szmek, Zhu Li, Дамјан Георгиевски, наб – Warsaw, 2020-07-30 CHANGES WITH 245: * A new tool “systemd-repart” has been added, that operates as an idempotent declarative repartitioner for GPT partition tables. Specifically, a set of partitions that must or may exist can be configured via drop-in files, and during every boot the partition table on disk is compared with these files, creating missing partitions or growing existing ones based on configurable relative and absolute size constraints. The tool is strictly incremental, i.e. does not delete, shrink or move partitions, but only adds and grows them. The primary use-case is OS images that ship in minimized form, that on first boot are grown to the size of the underlying block device or augmented with additional partitions. For example, the root partition could be extended to cover the whole disk, or a swap or /home partitions could be added on first boot. 
It can also be used for systems that use an A/B update scheme but ship images with just the A partition, with B added on first boot. The tool is primarily intended to be run in the initrd, shortly before transitioning into the host OS, but can also be run after the transition took place. It automatically discovers the disk backing the root file system, and should hence not require any additional configuration besides the partition definition drop-ins. If no configuration drop-ins are present, no action is taken. * A new component “userdb” has been added, along with a small daemon “systemd-userdbd.service” and a client tool “userdbctl”. The framework allows defining rich user and group records in a JSON format, extending on the classic “struct passwd” and “struct group” structures. Various components in systemd have been updated to process records in this format, including systemd-logind and pam_systemd. The user records are intended to be extensible, and allow setting various resource management, security and runtime parameters that shall be applied to processes and sessions of the user as they log in. This facility is intended to allow associating such metadata directly with user/group records so that they can be produced, extended and consumed in unified form. We hope that eventually frameworks such as sssd will generate records this way, so that for the first time resource management and various other per-user settings can be configured in LDAP directories and then provided to systemd (specifically to systemd-logind and pam_systemd) to apply on login. For further details see: https://systemd.io/USER_RECORD https://systemd.io/GROUP_RECORD https://systemd.io/USER_GROUP_API * A small new service systemd-homed.service has been added, that may be used to securely manage home directories with built-in encryption. The complete user record data is unified with the home directory, thus making home directories naturally migratable.
Its primary back-end is based on LUKS volumes, but fscrypt, plain directories, and other storage schemes are also supported. This solves a couple of problems we saw with traditional ways to manage home directories, in particular when it comes to encryption. For further discussion of this, see the video of Lennart’s talk at AllSystemsGo! 2019: https://media.ccc.de/v/ASG2019-164-reinventing-home-directories For further details about the format and expectations on home directories this new daemon makes, see: https://systemd.io/HOME_DIRECTORY * systemd-journald is now multi-instantiable. In addition to the main instance systemd-journald.service there’s now a template unit systemd-journald@.service, with each instance defining a new named log ‘namespace’ (whose name is specified via the instance part of the unit name). A new unit file setting LogNamespace= has been added, taking such a namespace name, that assigns services to the specified log namespaces. As each log namespace is serviced by its own independent journal daemon, this functionality may be used to improve performance and increase isolation of applications, at the price of losing global message ordering. Each instance of journald has a separate set of configuration files, with possibly different disk usage limitations and other settings. journalctl now takes a new option --namespace= to show logs from a specific log namespace. The sd-journal.h API gained sd_journal_open_namespace() for opening the log stream of a specific log namespace. systemd-journald also gained the ability to exit on idle, which is useful in the context of log namespaces, as this means log daemons for log namespaces can be activated automatically on demand and will stop automatically when no longer used, minimizing resource usage. * When systemd-tmpfiles copies a file tree using the ‘C’ line type it will now label every copied file according to the SELinux database.
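The pieces of the log namespace feature described above (the journald template instance, the LogNamespace= setting and journalctl's namespace option) fit together by sharing one namespace name. The following sketch merely derives the three strings from such a name; the helper names and the namespace "myapp" are hypothetical, and placing LogNamespace= in a [Service] section is an assumption about a typical service unit drop-in.

```python
def journald_instance_unit(namespace: str) -> str:
    """Name of the journald template instance that serves the namespace."""
    return f"systemd-journald@{namespace}.service"


def log_namespace_dropin(namespace: str) -> str:
    """Text of a drop-in that assigns a service to the namespace."""
    return f"[Service]\nLogNamespace={namespace}\n"


def journalctl_argv(namespace: str) -> list[str]:
    """Command line for showing the logs of the namespace."""
    return ["journalctl", f"--namespace={namespace}"]


if __name__ == "__main__":
    ns = "myapp"  # hypothetical namespace name
    print(journald_instance_unit(ns))
    print(log_namespace_dropin(ns), end="")
    print(" ".join(journalctl_argv(ns)))
```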
* When systemd/PID 1 detects it is used in the initrd it will now boot into initrd.target rather than default.target by default. This should make it simpler to build initrds with systemd as for many cases the only difference between a host OS image and an initrd image now is the presence of the /etc/initrd-release file. * A new kernel command line option systemd.cpu_affinity= is now understood. It’s equivalent to the CPUAffinity= option in /etc/systemd/system.conf and allows setting the CPU mask for PID 1 itself and the default for all other processes. * When systemd/PID 1 is reloaded (with systemctl daemon-reload or equivalent), the SELinux database is now reloaded, ensuring that sockets and other file system objects are generated taking the new database into account. * systemd/PID 1 accepts a new “systemd.show-status=error” setting, and “quiet” has been changed to imply that instead of “systemd.show-status=auto”. In this mode, only messages about errors and significant delays in boot are shown on the console. * The sd-event.h API gained native support for the new Linux “pidfd” concept. This permits watching processes using file descriptors instead of PID numbers, which fixes a number of races and makes process supervision more robust and efficient. All of systemd’s components will now use pidfds for process watching if the kernel supports them, with the exception of PID 1 itself, unfortunately. We hope to move PID 1 to exclusively using pidfds too eventually, but this requires some more kernel work first. (Background: PID 1 watches processes using waitid() with the P_ALL flag, and that does not play together nicely with pidfds yet.) * Closely related to this, the sd-event.h API gained two new calls sd_event_source_send_child_signal() (for sending a signal to a watched process) and sd_event_source_set_child_process_own() (for marking a process so that it is killed automatically whenever the event source watching it is freed).
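The pidfd concept mentioned above can be tried out without sd-event. The sketch below uses Python's `os.pidfd_open()` (available from Python 3.9 on Linux 5.3 or newer) to wait for a child through a file descriptor, and falls back to a plain wait when pidfds are unavailable. It only illustrates the kernel interface and says nothing about how sd-event is implemented; the function name is made up for this example.

```python
import os
import select
import subprocess
import sys


def wait_for_exit(argv: list[str]) -> tuple[int, bool]:
    """Run argv and wait for it to exit.

    Returns (exit_code, used_pidfd). A pidfd is used when the running
    Python and kernel support it; otherwise a plain wait is performed.
    """
    proc = subprocess.Popen(argv)
    try:
        pidfd = os.pidfd_open(proc.pid)
    except (AttributeError, OSError):
        # No pidfd support here: fall back to waiting on the PID.
        return proc.wait(), False
    try:
        # A pidfd becomes readable once the process it refers to terminates,
        # so it can sit in the same poll/select loop as any other fd.
        select.select([pidfd], [], [])
    finally:
        os.close(pidfd)
    # The child has already terminated; this only collects its exit status.
    return proc.wait(), True


if __name__ == "__main__":
    code, used_pidfd = wait_for_exit([sys.executable, "-c", "raise SystemExit(3)"])
    print(code, used_pidfd)
```

Because the descriptor refers to one specific process rather than to a reusable PID number, this pattern avoids the PID reuse races alluded to above.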
* systemd-networkd gained support for configuring Token Bucket Filter (TBF) parameters in its qdisc configuration support. Similarly, support for Stochastic Fairness Queuing (SFQ), Controlled-Delay Active Queue Management (CoDel), and Fair Queue (FQ) has been added. * systemd-networkd gained support for Intermediate Functional Block (IFB) network devices. * systemd-networkd gained support for configuring multi-path IP routes, using the new MultiPathRoute= setting in the [Route] section. * systemd-networkd’s DHCPv4 client has been updated to support a new SendDecline= option. If enabled, duplicate address detection is done after a DHCP offer is received from the server. If a conflict is detected, the address is declined. The DHCPv4 client also gained support for a new RouteMTUBytes= setting that allows to configure the MTU size to be used for routes generated from DHCPv4 leases. * The PrefixRoute= setting in systemd-networkd’s [Address] section of .network files has been deprecated, and replaced by AddPrefixRoute=, with its sense inverted. * The Gateway= setting of [Route] sections of .network files gained support for a special new value “_dhcp”. If set, the configured static route uses the gateway host configured via DHCP. * New User= and SuppressPrefixLength= settings have been implemented for the [RoutingPolicyRule] section of .network files to configure source routing based on UID ranges and prefix length, respectively. * The Type= match property of .link files has been generalized to always match the device type shown by ‘networkctl status’, even for devices where udev does not set DEVTYPE=. This allows e.g. Type=ether to be used. * sd-bus gained a new API call sd_bus_message_sensitive() that marks a D-Bus message object as “sensitive”. Those objects are erased from memory when they are freed. This concept is intended to be used for messages that contain security sensitive data. 
A new flag SD_BUS_VTABLE_SENSITIVE has been introduced as well to mark methods in sd-bus vtables, causing any incoming and outgoing messages of those methods to be implicitly marked as “sensitive”. * sd-bus gained a new API call sd_bus_message_dump() for dumping the contents of a message (or parts thereof) to standard output for debugging purposes. * systemd-sysusers gained support for creating users with the primary group named differently than the user. * systemd-growfs (i.e. the x-systemd.growfs mount option in /etc/fstab) gained support for growing XFS partitions. Previously it supported only ext4 and btrfs partitions. * The support for /etc/crypttab gained a new x-initrd.attach option. If set, the specified encrypted volume is unlocked already in the initrd. This concept corresponds to the x-initrd.mount option in /etc/fstab. * systemd-cryptsetup gained native support for unlocking encrypted volumes utilizing PKCS#11 smartcards, i.e. for example to bind encryption of volumes to YubiKeys. This is exposed in the new pkcs11-uri= option in /etc/crypttab. * The /etc/fstab support in systemd now supports two new mount options x-systemd.{required,wanted}-by=, for explicitly configuring the units that the specified mount shall be pulled in by, in place of the usual local-fs.target/remote-fs.target. * The https://systemd.io/ web site has been relaunched, directly populated with most of the documentation included in the systemd repository. systemd also acquired a new logo, thanks to Tobias Bernard. * systemd-udevd gained support for managing “alternative” network interface names, as supported by new Linux kernels. For the first time this permits assigning multiple (and longer!) names to a network interface. systemd-udevd will now by default assign the names generated via all supported naming schemes to each interface. This may be further tweaked with .link files and the AlternativeName= and AlternativeNamesPolicy= settings. 
Other components of systemd have been updated to support the new alternative names wherever appropriate. For example, systemd-nspawn will now generate alternative interface names for the host-facing side of container veth links based on the full container name without truncation. * systemd-nspawn interface naming logic has been updated in another way too: if the main interface name (i.e. as opposed to new-style “alternative” names) based on the container name is truncated, a simple hashing scheme is used to give different interface names to multiple containers whose names all begin with the same prefix. Since this changes the primary interface names pointing to containers if truncation happens, the old scheme may still be requested by selecting an older naming scheme, via the net.naming-scheme= kernel command line option. * PrivateUsers= in service files now works in services run by the systemd --user per-user instance of the service manager. * A new per-service sandboxing option ProtectClock= has been added that locks down write access to the system clock. It takes away device node access to /dev/rtc as well as the system calls that set the system clock and the CAP_SYS_TIME and CAP_WAKE_ALARM capabilities. Note that this option does not affect access to auxiliary services that allow changing the clock, for example access to systemd-timedated. * The systemd-id128 tool gained a new “show” verb for listing or resolving a number of well-known UUIDs/128bit IDs, currently mostly GPT partition table types. * The Discoverable Partitions Specification has been updated to support /var and /var/tmp partition discovery. Support for this has been added to systemd-gpt-auto-generator. For details see: https://systemd.io/DISCOVERABLE_PARTITIONS * “systemctl list-unit-files” has been updated to show a new column with the suggested enablement state based on the vendor preset files for the respective units. * “systemctl” gained a new option “--with-dependencies”.
If specified, commands such as “systemctl status” or “systemctl cat” will now show all specified units along with all units they depend on. * networkctl gained support for showing per-interface logs in its “status” output. * systemd-networkd-wait-online gained support for specifying the maximum operational state to wait for, and to wait for interfaces to disappear. * The [Match] section of .link and .network files now supports a new option PermanentMACAddress= which may be used to check against the permanent MAC address of a network device even if a randomized MAC address is used. * The [TrafficControlQueueingDiscipline] section in .network files has been renamed to [NetworkEmulator] with the “NetworkEmulator” prefix dropped from the individual setting names. * Any .link and .network files that have an empty [Match] section (this also includes empty and commented-out files) will now be rejected. systemd-udev and systemd-networkd started warning about such files in version 243. * systemd-logind will now validate access to the operation of changing the virtual terminal via a polkit action. By default, only users with at least one session on a local VT are granted permission. * When systemd sets up PAM sessions that invoked service processes shall run in, the pam_setcred() API is now invoked, thus permitting PAM modules to set additional credentials for the processes. * portablectl attach/detach verbs now accept --now and --enable options to combine attachment with enablement and invocation, or detachment with stopping and disablement. * UPGRADE ISSUE: a bug where some jobs were trimmed as redundant was fixed, which in turn exposed bugs in unit configuration of services which have Type=oneshot and should only run once, but do not have RemainAfterExit=yes set. Without RemainAfterExit=yes, a one-shot service may be started again after exiting successfully, for example as a dependency in another transaction.
Affected services included some internal systemd services (most notably systemd-vconsole-setup.service, which was updated to have RemainAfterExit=yes), and plymouth-start.service. Please ensure that plymouth has been suitably updated or patched before upgrading to this systemd release. See https://bugzilla.redhat.com/show_bug.cgi?id=1807771 for some additional discussion. Contributions from: AJ Bagwell, Alin Popa, Andreas Rammhold, Anita Zhang, Ansgar Burchardt, Antonio Russo, Arian van Putten, Ashley Davis, Balint Reczey, Bart Willems, Bastien Nocera, Benjamin Dahlhoff, Charles (Chas) Williams, cheese1, Chris Down, Chris Murphy, Christian Ehrhardt, Christian Göttsche, cvoinf, Daan De Meyer, Daniele Medri, Daniel Rusek, Daniel Shahaf, Dann Frazier, Dan Streetman, Dariusz Gadomski, David Michael, Dimitri John Ledkov, Emmanuel Bourg, Evgeny Vereshchagin, ezst036, Felipe Sateler, Filipe Brandenburger, Florian Klink, Franck Bui, Fran Dieguez, Frantisek Sumsal, Greg “GothAck” Miell, Guilhem Lettron, Guillaume Douézan-Grard, Hans de Goede, HATAYAMA Daisuke, Iain Lane, James Buren, Jan Alexander Steffens (heftig), Jérémy Rosen, Jin Park, Jun’ichi Nomura, Kai Krakow, Kevin Kuehler, Kevin P. 
Fleming, Lennart Poettering, Leonid Bloch, Leonid Evdokimov, lothrond, Luca Boccassi, Lukas K, Lynn Kirby, Mario Limonciello, Mark Deneen, Matthew Leeds, Michael Biebl, Michal Koutný, Michal Sekletár, Mike Auty, Mike Gilbert, mtron, nabijaczleweli, Naïm Favier, Nate Jones, Norbert Lange, Oliver Giles, Paul Davey, Paul Menzel, Peter Hutterer, Piotr Drąg, Rafa Couto, Raphael, rhn, Robert Scheck, Rocka, Romain Naour, Ryan Attard, Sascha Dewald, Shengjing Zhu, Slava Kardakov, Spencer Michaels, Sylvain Plantefeve, Stanislav Angelovič, Susant Sahani, Thomas Haller, Thomas Schmitt, Timo Schlüßler, Timo Wilken, Tobias Bernard, Tobias Klauser, Tobias Stoeckmann, Topi Miettinen, tsia, WataruMatsuoka, Wieland Hoffmann, Wilhelm Schuster, Will Fleming, xduugu, Yong Cong Sin, Yuri Chornoivan, Yu Watanabe, Zach Smith, Zbigniew Jędrzejewski-Szmek, Zeyu DONG – Warsaw, 2020-03-06 CHANGES WITH 244: * Support for the cpuset cgroups v2 controller has been added. Processes may be restricted to specific CPUs using the new AllowedCPUs= setting, and to specific memory NUMA nodes using the new AllowedMemoryNodes= setting. * The signal used in restart jobs (as opposed to e.g. stop jobs) may now be configured using a new RestartKillSignal= setting. This allows units which use signals to request termination to implement different behaviour when stopping in preparation for a restart. * “systemctl clean” may now be used also for socket, mount, and swap units. * systemd will also read configuration options from the EFI variable SystemdOptions. This may be used to configure systemd behaviour when modifying the kernel command line is inconvenient, but configuration on disk is read too late, for example for the options related to cgroup hierarchy setup. ‘bootctl systemd-efi-options’ may be used to set the EFI variable. * systemd will now disable printk ratelimits in early boot.
This should allow us to capture more logs from the early boot phase where normal storage is not available and the kernel ring buffer is used for logging. Configuration on the kernel command line has higher priority and overrides the systemd setting. systemd programs which log to /dev/kmsg directly use internal ratelimits to prevent runaway logging. (Normally this is only used during early boot, so in practice this change has very little effect.) * Unit files now support top level dropin directories of the form .d/ (e.g. service.d/) that may be used to add configuration that affects all corresponding unit files. * systemctl gained support for ‘stop --job-mode=triggering’ which will stop the specified unit and any units which could trigger it. * Unit status display now includes units triggering and triggered by the unit being shown. * The RuntimeMaxSec= setting is now supported by scopes, not just .service units. This is particularly useful for PAM sessions which create a scope unit for the user login. The systemd.runtime_max_sec= setting may be used with the pam_systemd module to limit the duration of the PAM session, for example for time-limited logins. * A new @pkey system call group is now defined to make it easier to allow-list memory protection syscalls for containers and services which need to use them. * systemd-udevd: removed the 30s timeout for killing stale workers on exit. systemd-udevd now waits for workers to finish. The hard-coded exit timeout of 30s was too short for some large installations, where driver initialization could be prematurely interrupted during initrd processing if the root file system had been mounted and init was preparing to switch root. If udevd is run without systemd and workers are hanging while udevd receives an exit signal, udevd will now exit when udev.event_timeout is reached for the last hanging worker. With systemd, the exit timeout can additionally be configured using TimeoutStopSec= in systemd-udevd.service.
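As a small illustration of the top level drop-in directories mentioned above, the following sketch derives the type-wide directory name (as in the service.d/ example) and the long-standing per-unit directory name from a unit name. The function name is hypothetical, and the sketch makes no statement about search paths or about which directory takes precedence.

```python
def dropin_dir_names(unit_name: str) -> list[str]:
    """Return the type-wide and the unit-specific drop-in directory names.

    For "foo.service" this yields "service.d" (applies to all services)
    and "foo.service.d" (applies to this unit only).
    """
    stem, _, unit_type = unit_name.rpartition(".")
    if not stem or not unit_type:
        raise ValueError(f"not a unit name with a type suffix: {unit_name!r}")
    return [f"{unit_type}.d", f"{unit_name}.d"]


if __name__ == "__main__":
    print(dropin_dir_names("foo.service"))
```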
* udev now provides a program (fido_id) that identifies FIDO CTAP1 (“U2F”)/CTAP2 security tokens based on the usage declared in their report descriptor and outputs suitable environment variables. This replaces the externally maintained allow lists of all known security tokens that were used previously. * Automatically generated autosuspend udev rules for allow-listed devices have been imported from the Chromium OS project. This should improve power saving with many more devices. * udev gained a new “CONST{key}=value” setting that allows matching against system-wide constants without forking a helper binary. Currently “arch” and “virt” keys are supported. * udev now opens CDROMs in non-exclusive mode when querying their capabilities. This should fix issues where other programs trying to use the CDROM cannot gain access to it, but carries a risk of interfering with programs writing to the disk, if they did not open the device in exclusive mode as they should. * systemd-networkd does not create a default route for IPv4 link local addressing anymore. The creation of the route was unexpected and was breaking routing in various cases, but people who rely on it being created implicitly will need to adjust. Such a route may be requested with DefaultRouteOnDevice=yes. Similarly, systemd-networkd will not assign a link-local IPv6 address when IPv6 link-local routing is not enabled. * Receive and transmit buffers may now be configured on links with the new RxBufferSize= and TxBufferSize= settings. * systemd-networkd may now advertise additional IPv6 routes. A new [IPv6RoutePrefix] section with Route= and LifetimeSec= options is now supported. * systemd-networkd may now configure “next hop” routes using the [NextHop] section and Gateway= and Id= settings. * systemd-networkd will now retain DHCP config on restarts by default (but this may be overridden using the KeepConfiguration= setting). The default for SendRelease= has been changed to true.
* The DHCPv4 client now uses the OPTION_INFORMATION_REFRESH_TIME option received from the server. The client will use the received SIP server list if UseSIP=yes is set. The client may be configured to request specific options from the server using a new RequestOptions= setting. The client may be configured to send arbitrary options to the server using a new SendOption= setting. A new IPServiceType= setting has been added to configure the “IP service type” value used by the client. * The DHCPv6 client learnt a new PrefixDelegationHint= option to request prefix hints in the DHCPv6 solicitation. * The DHCPv4 server may be configured to send arbitrary options using a new SendOption= setting. * The DHCPv4 server may now be configured to emit SIP server list using the new EmitSIP= and SIP= settings. * systemd-networkd and networkctl may now renew DHCP leases on demand. networkctl has a new ‘networkctl renew’ verb. * systemd-networkd may now reconfigure links on demand. networkctl gained two new verbs: “reload” will reload the configuration, and “reconfigure DEVICE…” will reconfigure one or more devices. * .network files may now match on SSID and BSSID of a wireless network, i.e. the access point name and hardware address using the new SSID= and BSSID= options. networkctl will display the current SSID and BSSID for wireless links. .network files may also match on the wireless network type using the new WLANInterfaceType= option. * systemd-networkd now includes default configuration that enables link-local addressing when connected to an ad-hoc wireless network. * systemd-networkd may configure the Traffic Control queueing disciplines in the kernel using the new [TrafficControlQueueingDiscipline] section and Parent=, NetworkEmulatorDelaySec=, NetworkEmulatorDelayJitterSec=, NetworkEmulatorPacketLimit=, NetworkEmulatorLossRate=, NetworkEmulatorDuplicateRate= settings. * systemd-tmpfiles gained a new w+ setting to append to files. 
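To illustrate the new wireless matching options, here is a minimal sketch of a .network file that only applies while the interface is associated with one particular network. The file name and the SSID are assumptions:

```ini
# /etc/systemd/network/60-example-wifi.network (hypothetical file name)
[Match]
# Apply only to wireless station interfaces connected to this SSID.
WLANInterfaceType=station
SSID=ExampleNet

[Network]
DHCP=yes

[DHCPv4]
# Use the SIP server list received from the DHCP server.
UseSIP=yes
```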
* systemd-analyze dump will now report when the memory configuration in the kernel does not match what systemd has configured (usually, because some external program has modified the kernel configuration on its own). * systemd-analyze gained a new --base-time= switch which instructs the ‘calendar’ verb to resolve times relative to that timestamp instead of the present time. * journalctl --update-catalog now produces deterministic output (making reproducible image builds easier). * A new devicetree-overlay setting is now documented in the Boot Loader Specification. * The default value of the WatchdogSec= setting used in systemd services (the ones bundled with the project itself) may be set at configuration time using the -Dservice-watchdog= setting. If set to empty, the watchdogs will be disabled. * systemd-resolved now validates IP addresses in certificates when GnuTLS is being used. * libcryptsetup >= 2.0.1 is now required. * A configuration option -Duser-path= may be used to override the $PATH used by the user service manager. The default is again to use the same path as the system manager. * The systemd-id128 tool gained a new switch “-u” (or “--uuid”) for outputting the 128bit IDs in UUID format (i.e. in the “canonical representation”). * Service units gained a new sandboxing option ProtectKernelLogs= which makes sure the program cannot get direct access to the kernel log buffer anymore, i.e. the syslog() system call (not to be confused with the API of the same name in libc, which is not affected), the /proc/kmsg and /dev/kmsg nodes and the CAP_SYSLOG capability are made inaccessible to the service. It’s recommended to enable this setting for all services that should not be able to read from or write to the kernel log buffer, which are probably almost all.
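A minimal sketch of a service that opts in to ProtectKernelLogs= as recommended above. The unit name and the binary path are hypothetical:

```ini
# /etc/systemd/system/example.service (hypothetical unit)
[Unit]
Description=Example daemon without kernel log access

[Service]
ExecStart=/usr/bin/example-daemon
# Make syslog(2), /proc/kmsg, /dev/kmsg and CAP_SYSLOG inaccessible.
ProtectKernelLogs=yes
```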
Contributions from: Aaron Plattner, Alcaro, Anita Zhang, Balint Reczey, Bastien Nocera, Baybal Ni, Benjamin Bouvier, Benjamin Gilbert, Carlo Teubner, cbzxt, Chen Qi, Chris Down, Christian Rebischke, Claudio Zumbo, ClydeByrdIII, crashfistfight, Cyprien Laplace, Daniel Edgecumbe, Daniel Gorbea, Daniel Rusek, Daniel Stuart, Dan Streetman, David Pedersen, David Tardon, Dimitri John Ledkov, Dominique Martinet, Donald A. Cupp Jr, Evgeny Vereshchagin, Fabian Henneke, Filipe Brandenburger, Franck Bui, Frantisek Sumsal, Georg Müller, Hans de Goede, Haochen Tong, HATAYAMA Daisuke, Iwan Timmer, Jan Janssen, Jan Kundrát, Jan Synacek, Jan Tojnar, Jay Strict, Jérémy Rosen, Jóhann B. Guðmundsson, Jonas Jelten, Jonas Thelemann, Justin Trudell, J. Xing, Kai-Heng Feng, Kenneth D’souza, Kevin Becker, Kevin Kuehler, Lennart Poettering, Léonard Gérard, Lorenz Bauer, Luca Boccassi, Maciej Stanczew, Mario Limonciello, Marko Myllynen, Mark Stosberg, Martin Wilck, matthiasroos, Michael Biebl, Michael Olbrich, Michael Tretter, Michal Sekletar, Michal Sekletár, Michal Suchanek, Mike Gilbert, Mike Kazantsev, Nicolas Douma, nikolas, Norbert Lange, pan93412, Pascal de Bruijn, Paul Menzel, Pavel Hrdina, Peter Wu, Philip Withnall, Piotr Drąg, Rafael Fontenelle, Renaud Métrich, Riccardo Schirone, RoadrunnerWMC, Ronan Pigott, Ryan Attard, Sebastian Wick, Serge, Siddharth Chandrasekara, Steve Ramage, Steve Traylen, Susant Sahani, Thibault Nélis, Tim Teichmann, Tom Fitzhenry, Tommy J, Torsten Hilbrich, Vito Caputo, ypf791, Yu Watanabe, Zach Smith, Zbigniew Jędrzejewski-Szmek – Warsaw, 2019-11-29 CHANGES WITH 243: * This release enables unprivileged programs (i.e. requiring neither setuid nor file capabilities) to send ICMP Echo (i.e. ping) requests by turning on the “net.ipv4.ping_group_range” sysctl of the Linux kernel for the whole UNIX group range, i.e. all processes. 
This change should be reasonably safe, as the kernel support for it was specifically implemented to allow safe access to ICMP Echo for processes lacking any privileges. If this is not desirable, it can be disabled again by setting the parameter to “1 0”. * Previously, filters defined with SystemCallFilter= would have the effect that any calling of an offending system call would terminate the calling thread. This behaviour never made much sense, since killing individual threads of unsuspecting processes is likely to create more problems than it solves. With this release the default action changed from killing the thread to killing the whole process. For this to work correctly both a kernel version (>= 4.14) and a libseccomp version (>= 2.4.0) supporting this new seccomp action are required. If an older kernel or libseccomp is used the old behaviour continues to be used. This change does not affect any services that have no system call filters defined, or that use SystemCallErrorNumber= (and thus see EPERM or another error instead of being killed when calling an offending system call). Note that systemd documentation always claimed that the whole process is killed. With this change behaviour is thus adjusted to match the documentation. * On 64 bit systems, the “kernel.pid_max” sysctl is now bumped to 4194304 by default, i.e. the full 22bit range the kernel allows, up from the old 16bit range. This should improve security and robustness, as PID collisions are made less likely (though certainly still possible). There are rumours this might create compatibility problems, though at this moment no practical ones are known to us. Downstream distributions are hence advised to undo this change in their builds if they are concerned about maximum compatibility, but for everybody else we recommend leaving the value bumped.
Besides improving security and robustness this should also simplify things as the maximum number of allowed concurrent tasks was previously bounded by both “kernel.pid_max” and “kernel.threads-max” and now effectively only a single knob is left (“kernel.threads-max”). There have been concerns that usability is affected by this change because larger PID numbers are harder to type, but we believe the change from 5 digits to 7 digits doesn’t hamper usability. * MemoryLow= and MemoryMin= gained hierarchy-aware counterparts, DefaultMemoryLow= and DefaultMemoryMin=, which can be used to hierarchically set default memory protection values for a particular subtree of the unit hierarchy. * Memory protection directives can now take a value of zero, allowing explicit opting out of a default value propagated by an ancestor. * systemd now defaults to the “unified” cgroup hierarchy setup during build-time, i.e. -Ddefault-hierarchy=unified is now the build-time default. Previously, -Ddefault-hierarchy=hybrid was the default. This change reflects the fact that cgroupsv2 support has matured substantially in both systemd and in the kernel, and is clearly the way forward. Downstream production distributions might want to continue to use -Ddefault-hierarchy=hybrid (or even =legacy) for their builds as unfortunately the popular container managers have not caught up with the kernel API changes. * Man pages are not built by default anymore (html pages were already disabled by default), to make development builds quicker. When building systemd for a full installation with documentation, meson should be called with -Dman=true and/or -Dhtml=true as appropriate. 
The default was changed based on the assumption that quick one-off or repeated development builds are much more common than full optimized builds for installation, and people need to pass various other options when doing “proper” builds anyway, so the gain from making development builds quicker is bigger than the one time disruption for packagers. Two scripts are created in the *build* directory to generate and preview man and html pages on demand, e.g. “build/man/man systemctl” and “build/man/html systemd.index”. * libidn2 is used by default if both libidn2 and libidn are installed. Please use -Dlibidn=true if libidn is preferred. * The D-Bus “wire format” of the CPUAffinity= attribute is changed on big-endian machines. Before, bytes were written and read in native machine order as exposed by the native libc __cpu_mask interface. Now, little-endian order is always used (CPUs 0–7 are described by bits 0–7 in byte 0, CPUs 8–15 are described by byte 1, and so on). This change fixes D-Bus calls that cross an endianness boundary. The presentation format used for CPUAffinity= by “systemctl show” and “systemd-analyze dump” is changed to present CPU indices instead of the raw __cpu_mask bitmask. For example, CPUAffinity=0-1 was previously shown as CPUAffinity=03000000000000000000000000000… (on little-endian) or CPUAffinity=00000000000000300000000000000… (on 64-bit big-endian), and is now shown as CPUAffinity=0-1, matching the input format. The maximum integer that will be printed in the new format is 8191 (four digits), while the old format always used a very long number (with the length varying by architecture), so they can be unambiguously distinguished. * /usr/sbin/halt.local is no longer supported. Implementation in distributions was inconsistent and it seems this functionality was very rarely used.
To replace this functionality, users should: – either define a new unit and make it a dependency of final.target (systemctl add-wants final.target my-halt-local.service) – or move the shutdown script to /usr/lib/systemd/system-shutdown/ and ensure that it accepts “halt”, “poweroff”, “reboot”, and “kexec” as an argument, see the description in systemd-shutdown(8). * When a [Match] section in .link or .network file is empty (contains no match patterns), a warning will be emitted. Please add any “match all” pattern instead, e.g. OriginalName=* or Name=* in case all interfaces should really be matched. * A new setting NUMAPolicy= may be used to set process memory allocation policy. This setting can be specified in /etc/systemd/system.conf and hence will set the default policy for PID1. The default policy can be overridden on a per-service basis. The related setting NUMAMask= is used to specify NUMA node mask that should be associated with the selected policy. * PID 1 will now listen to Out-Of-Memory (OOM) events the kernel generates when processes it manages are reaching their memory limits, and will place their units in a special state, and optionally kill or stop the whole unit. * The service manager will now expose bus properties for the IO resources used by units. This information is also shown in “systemctl status” now (for services that have IOAccounting=yes set). Moreover, the IO accounting data is included in the resource log message generated whenever a unit stops. * Units may now configure an explicit timeout to wait for when killed with SIGABRT, for example when a service watchdog is hit. Previously, the regular TimeoutStopSec= timeout was applied in this case too — now a separate timeout may be set using TimeoutAbortSec=. * Services may now send a special WATCHDOG=trigger message with sd_notify() to trigger an immediate “watchdog missed” event, and thus trigger service termination. 
This is useful both for testing watchdog handling and for defining error paths in services that shall be handled the same way as watchdog events. * There are two new per-unit settings IPIngressFilterPath= and IPEgressFilterPath= which allow configuration of a BPF program (usually by specifying a path to a program uploaded to /sys/fs/bpf/) to apply to the IP packet ingress/egress path of all processes of a unit. This is useful to allow running systemd services with BPF programs set up externally. * systemctl gained a new “clean” verb for removing the state, cache, runtime or logs directories of a service while it is terminated. The new verb may also be used to remove the state maintained on disk for timer units that have Persistent= configured. * During the last phase of shutdown systemd will now automatically increase the log level configured in the “kernel.printk” sysctl so that any relevant loggable events happening during late shutdown are made visible. Previously, loggable events happening so late during shutdown were generally lost if the “kernel.printk” sysctl was set to high thresholds, as regular logging daemons are terminated at that time and thus nothing is written to disk. * If processes terminated during the last phase of shutdown do not exit quickly systemd will now show their names after a short time, to make debugging easier. After a longer timeout they are forcibly killed, as before. * journalctl (and the other tools that display logs) will now highlight warnings in yellow (previously, both LOG_NOTICE and LOG_WARNING were shown in bright bold, now only LOG_NOTICE is). Moreover, audit logs are now shown in blue color, to separate them visually from regular logs. References to configuration files are now turned into clickable links on terminals that support that. * systemd-journald will now stop logging to /var/log/journal during shutdown when /var/ is on a separate mount, so that it can be unmounted safely during shutdown.
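The per-unit BPF filter settings and the directories handled by the new “clean” verb could be sketched in a unit as follows. The unit name, the binary path and the names of the pinned BPF programs are assumptions; the programs themselves would have to be loaded and pinned by some external tool:

```ini
# /etc/systemd/system/example.service (hypothetical unit)
[Service]
ExecStart=/usr/bin/example-daemon
# Directories below /var/lib/ and /var/cache/ managed for this unit.
# These are the kind of directories the "clean" verb operates on.
StateDirectory=example
CacheDirectory=example
# BPF programs pinned below /sys/fs/bpf/ by an external tool,
# applied to the IP traffic of all processes of the unit.
IPIngressFilterPath=/sys/fs/bpf/example-ingress
IPEgressFilterPath=/sys/fs/bpf/example-egress
```

Once the unit is stopped, “systemctl clean example.service” can remove such directories again.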
* systemd-resolved gained support for a new ‘strict’ DNS-over-TLS mode. * systemd-resolved “Cache=” configuration option in resolved.conf has been extended to also accept the ‘no-negative’ value. Previously, only a boolean option was allowed (yes/no), having yes as the default. If this option is set to ‘no-negative’, negative answers are not cached while the old cache heuristics are used for positive answers. The default remains unchanged. * The predictable naming scheme for network devices now supports generating predictable names for “netdevsim” devices. Moreover, the “en” prefix was dropped from the ID_NET_NAME_ONBOARD udev property. Those two changes form a new net.naming-policy-scheme= entry. Distributions which want to preserve naming stability may want to set the -Ddefault-net-naming-scheme= configuration option. * systemd-networkd now supports MACsec, nlmon, IPVTAP and Xfrm interfaces natively. * systemd-networkd’s bridge FDB support now allows configuration of a destination address for each entry (Destination=), as well as the VXLAN VNI (VNI=), as well as an option to declare what an entry is associated with (AssociatedWith=). * systemd-networkd’s DHCPv4 support now understands a new MaxAttempts= option for configuring the maximum number of DHCP lease requests. It also learnt a new BlackList= option for deny-listing DHCP servers (a similar setting has also been added to the IPv6 RA client), as well as a SendRelease= option for configuring whether to send a DHCP RELEASE message when terminating. * systemd-networkd’s DHCPv4 and DHCPv6 stacks can now be configured separately in the [DHCPv4] and [DHCPv6] sections. * systemd-networkd’s DHCP support will now optionally create an implicit host route to the DNS server specified in the DHCP lease, in addition to the routes listed explicitly in the lease.
This should ensure that in multi-homed systems DNS traffic leaves the system on the interface that acquired the DNS server information even if other routes such as default routes exist. This behaviour may be turned on with the new RoutesToDNS= option. * systemd-networkd’s VXLAN support gained a new option GenericProtocolExtension= for enabling VXLAN Generic Protocol Extension support, as well as IPDoNotFragment= for setting the IP “Don’t fragment” bit on outgoing packets. A similar option has been added to the GENEVE support. * In systemd-networkd’s [Route] section you may now configure FastOpenNoCookie= for configuring per-route TCP fast-open support, as well as TTLPropagate= for configuring Label Switched Path (LSP) TTL propagation. The Type= setting now supports local, broadcast, anycast, multicast, any, xresolve routes, too. * systemd-networkd’s [Network] section learnt a new option DefaultRouteOnDevice= for automatically configuring a default route onto the network device. * systemd-networkd’s bridging support gained two new options ProxyARP= and ProxyARPWifi= for configuring proxy ARP behaviour as well as MulticastRouter= for configuring multicast routing behaviour. A new option MulticastIGMPVersion= may be used to change the bridge’s multicast Internet Group Management Protocol (IGMP) version. * systemd-networkd’s FooOverUDP support gained the ability to configure local and peer IP addresses via Local= and Peer=. A new option PeerPort= may be used to configure the peer’s IP port. * systemd-networkd’s TUN support gained a new setting VnetHeader= for tweaking Generic Segment Offload support. * The address family for policy rules may be specified using the new Family= option in the [RoutingPolicyRule] section. * networkctl gained a new “delete” command for removing virtual network devices, as well as a new “--stats” switch for showing device statistics.
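A few of the .network additions above, sketched in one file. The file name, the interface name and the MaxAttempts= value are illustrative assumptions:

```ini
# /etc/systemd/network/50-uplink.network (hypothetical file name)
[Match]
Name=eth0

[Network]
DHCP=ipv4
# Automatically configure a default route onto this device.
DefaultRouteOnDevice=yes

[DHCPv4]
# Create an implicit host route to the DNS server from the lease.
RoutesToDNS=yes
# Maximum number of DHCP lease requests (example value).
MaxAttempts=5
```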
* networkd.conf gained the new settings SpeedMeter= and SpeedMeterIntervalSec=, to measure the bitrate of network interfaces. The measured speed may be shown by ‘networkctl status’. * “networkctl status” now displays MTU and queue lengths, and more detailed information about VXLAN and bridge devices. * systemd-networkd’s .network and .link files gained a new Property= setting in the [Match] section, to match against devices with specific udev properties. * systemd-networkd’s tunnel support gained a new option AssignToLoopback= for selecting whether to use the loopback device “lo” as underlying device. * systemd-networkd’s MACAddress= setting in the [Neighbor] section has been renamed to LinkLayerAddress=, and it now allows configuration of IP addresses, too. * systemd-networkd’s handling of the kernel’s disable_ipv6 sysctl is simplified: systemd-networkd will disable the sysctl (enable IPv6) if IPv6 configuration (static or DHCPv6) was found for a given interface. It will not touch the sysctl otherwise. * The order of entries in $PATH used by the user manager instance was changed to put bin/ entries before the corresponding sbin/ entries. It is recommended to not rely on this order, and only ever have one binary with a given name in the system paths under /usr. * A new tool systemd-network-generator has been added that may generate .network, .netdev and .link files from IP configuration specified on the kernel command line in the format used by Dracut. * The CriticalConnection= setting in .network files is now deprecated, and replaced by a new KeepConfiguration= setting which allows more detailed configuration of the IP configuration to keep in place. * systemd-analyze gained a few new verbs: – “systemd-analyze timestamp” parses and converts timestamps. This is similar to the existing “systemd-analyze calendar” command which does the same for recurring calendar events. – “systemd-analyze timespan” parses and converts timespans (i.e.
durations as opposed to points in time). – “systemd-analyze condition” will parse and test ConditionXYZ= expressions. – “systemd-analyze exit-status” will parse and convert exit status codes to their names and back. – “systemd-analyze unit-files” will print a list of all unit file paths and unit aliases. * SuccessExitStatus=, RestartPreventExitStatus=, and RestartForceExitStatus= now accept exit status names (e.g. “DATAERR” is equivalent to “65”). Those exit status name mappings may be displayed with the systemd-analyze exit-status verb described above. * systemd-logind now exposes a per-session SetBrightness() bus call, which may be used to securely change the brightness of a kernel brightness device, if it belongs to the session’s seat. By using this call unprivileged clients can make changes to “backlight” and “leds” devices securely with strict requirements on session membership. Desktop environments may use this to generically make brightness changes to such devices without shipping private SUID binaries or udev rules for that purpose. * “udevadm info” gained a --wait-for-initialization switch to wait for a device to be initialized. * systemd-hibernate-resume-generator will now look for resumeflags= on the kernel command line, which is similar to rootflags= and may be used to configure device timeout for the hibernation device. * sd-event learnt a new API call sd_event_source_disable_unref() for disabling and unref’ing an event source in a single function. A related call sd_event_source_disable_unrefp() has been added for use with gcc’s cleanup extension. * The sd-id128.h public API gained a new definition SD_ID128_UUID_FORMAT_STR for formatting a 128bit ID in UUID format with printf(). * “busctl introspect” gained a new switch --xml-interface for dumping XML introspection data unmodified. * PID 1 may now show the unit name instead of the unit description string in its status output during boot.
This may be configured in the StatusUnitFormat= setting in /etc/systemd/system.conf or the kernel command line option systemd.status_unit_format=. * PID 1 now understands a new option KExecWatchdogSec= in /etc/systemd/system.conf to set a watchdog timeout for kexec reboots. Previously watchdog functionality was only available for regular reboots. The new setting defaults to off, because we don’t know in the general case if the watchdog will be reset after kexec (some drivers do reset it, but not all), and the new userspace might not be configured to handle the watchdog. Moreover, the old ShutdownWatchdogSec= setting has been renamed to RebootWatchdogSec= to more clearly communicate what it is about. The old name is still accepted for compatibility. * The systemd.debug_shell kernel command line option now optionally takes a tty name to spawn the debug shell on, which allows a different tty to be selected than the built-in default. * Service units gained a new ExecCondition= setting which will run before ExecStartPre= and either continue execution of the unit (for clean exit codes), stop execution without marking the unit failed (for exit codes 1 through 254), or stop execution and fail the unit (for exit code 255 or abnormal termination). * A new service systemd-pstore.service has been added that pulls data from /sys/fs/pstore/ and saves it to /var/lib/pstore for later review. * timedatectl gained new verbs for configuring per-interface NTP service configuration for systemd-timesyncd. * “localectl list-locales” won’t list non-UTF-8 locales anymore. It’s 2019. (You can set non-UTF-8 locales though, if you know their name.) * If variable assignments in sysctl.d/ files are prefixed with “-” any failures to apply them are now ignored. * systemd-random-seed.service now optionally credits entropy when applying the seed to the system. 
Set $SYSTEMD_RANDOM_SEED_CREDIT to true for the service to enable this behaviour, but please consult the documentation first, since this comes with a couple of caveats. * systemd-random-seed.service is now a synchronization point for full initialization of the kernel’s entropy pool. Services that require /dev/urandom to be correctly initialized should be ordered after this service. * The systemd-boot boot loader has been updated to optionally maintain a random seed file in the EFI System Partition (ESP). During the boot phase, this random seed is read and updated with a new seed cryptographically derived from it. Another derived seed is passed to the OS. The latter seed is then credited to the kernel’s entropy pool very early during userspace initialization (from PID 1). This allows systems to boot up with a fully initialized kernel entropy pool from earliest boot on, and thus entirely removes all entropy pool initialization delays from systems using systemd-boot. Special care is taken to ensure different seeds are derived on system images replicated to multiple systems. “bootctl status” will show whether a seed was received from the boot loader. * bootctl gained two new verbs: – “bootctl random-seed” will generate the file in ESP and an EFI variable to allow a random seed to be passed to the OS as described above. – “bootctl is-installed” checks whether systemd-boot is currently installed. * bootctl will warn if it detects that boot entries are misconfigured (for example if the kernel image was removed without purging the bootloader entry). 
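The two systemd-random-seed.service items above could be applied with drop-ins along the following lines. This is a sketch only: both file names and the example.service unit are hypothetical, and setting the variable through Environment= is one assumed way of passing it to the service. Consult the documentation on the caveats of crediting entropy before enabling it.

```ini
# File 1 (hypothetical name):
# /etc/systemd/system/systemd-random-seed.service.d/10-credit.conf
[Service]
# Credit entropy when the seed is applied to the system.
Environment=SYSTEMD_RANDOM_SEED_CREDIT=true

# File 2 (hypothetical name):
# /etc/systemd/system/example.service.d/10-entropy.conf
[Unit]
# Start only after the entropy pool synchronization point.
After=systemd-random-seed.service
```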
* A new document has been added describing systemd’s use and support for the kernel’s entropy pool subsystem: https://systemd.io/RANDOM_SEEDS * When the system is hibernated the swap device to write the hibernation image to is now automatically picked from all available swap devices, preferring the swap device with the highest configured priority over all others, and picking the device with the most free space if there are multiple devices with the highest priority. * /etc/crypttab support has learnt a new keyfile-timeout= per-device option that permits selecting how long to wait for a device with an encryption key before asking for the password. * IOWeight= has learnt to properly set the IO weight when using the BFQ scheduler officially found in kernels 5.0+. * A new mailing list has been created for reporting of security issues: systemd-security@redhat.com. For more details, see https://systemd.io/CONTRIBUTING#security-vulnerability-reports. Contributions from: Aaron Barany, Adrian Bunk, Alan Jenkins, Albrecht Lohofener, Andrej Valek, Anita Zhang, Arian van Putten, Balint Reczey, Bastien Nocera, Ben Boeckel, Benjamin Robin, camoz, Chen Qi, Chris Chiu, Chris Down, Christian Göttsche, Christian Kellner, Clinton Roy, Connor Reeder, Daniel Black, Daniel Lublin, Daniele Medri, Dan Streetman, Dave Reisner, Dave Ross, David Art, David Tardon, Debarshi Ray, Dimitri John Ledkov, Dominick Grift, Donald Buczek, Douglas Christman, Eric DeVolder, EtherGraf, Evgeny Vereshchagin, Feldwor, Felix Riemann, Florian Dollinger, Francesco Pennica, Franck Bui, Frantisek Sumsal, Franz Pletz, frederik, Hans de Goede, Iago López Galeiras, Insun Pyo, Ivan Shapovalov, Iwan Timmer, Jack, Jakob Unterwurzacher, Jan Chren, Jan Klötzke, Jan Losinski, Jan Pokorný, Jan Synacek, Jan-Michael Brummer, Jeka Pats, Jeremy Soller, Jérémy Rosen, Jiri Pirko, Joe Lin, Joerg Behrmann, Joe Richey, Jóhann B.
Guðmundsson, Johannes Christ, Johannes Schmitz, Jonathan Rouleau, Jorge Niedbalski, Jörg Thalheim, Kai Krakow, Kai Lüke, Karel Zak, Kashyap Chamarthy, Krayushkin Konstantin, Lennart Poettering, Lubomir Rintel, Luca Boccassi, Luís Ferreira, Marc-André Lureau, Markus Felten, Martin Pitt, Matthew Leeds, Mattias Jernberg, Michael Biebl, Michael Olbrich, Michael Prokop, Michael Stapelberg, Michael Zhivich, Michal Koutný, Michal Sekletar, Mike Gilbert, Milan Broz, Miroslav Lichvar, mpe85, Mr-Foo, Network Silence, Oliver Harley, pan93412, Paul Menzel, pEJipE, Peter A. Bigot, Philip Withnall, Piotr Drąg, Rafael Fontenelle, Robert Scheck, Roberto Santalla, Ronan Pigott, root, RussianNeuroMancer, Sebastian Jennen, shinygold, Shreyas Behera, Simon Schricker, Susant Sahani, Thadeu Lima de Souza Cascardo, Theo Ouzhinski, Thiebaud Weksteen, Thomas Haller, Thomas Weißschuh, Tomas Mraz, Tommi Rantala, Topi Miettinen, VD-Lycos, ven, Vladimir Yerilov, Wieland Hoffmann, William A. Kennington III, William Wold, Xi Ruoyao, Yuri Chornoivan, Yu Watanabe, Zach Smith, Zbigniew Jędrzejewski-Szmek, Zhang Xianwei – Camerino, 2019-09-03 CHANGES WITH 242: * In .link files, MACAddressPolicy=persistent (the default) is changed to cover more devices. For devices like bridges, tun, tap, bond, and similar interfaces that do not have other identifying information, the interface name is used as the basis for the persistent seed for MAC and IPv4LL addresses. The handling of devices that were covered previously is not changed; this change only extends the “persistent” policy to more devices than before. MACAddressPolicy=random may be used to force randomized MACs and IPv4LL addresses for a device if desired. Hint: the log output from udev (at debug level) was enhanced to clarify what policy is followed and which attributes are used. `SYSTEMD_LOG_LEVEL=debug udevadm test-builtin net_setup_link /sys/class/net/` may be used to view this.
Hint: if a bridge interface is created without any slaves, and gains a slave later, then the bridge now does not inherit the slave’s MAC. To inherit the slave’s MAC, create, for example, the following file:

```
# /etc/systemd/network/98-bridge-inherit-mac.link
[Match]
Type=bridge

[Link]
MACAddressPolicy=none
```

* The .device units generated by systemd-fstab-generator and other generators do not automatically pull in the corresponding .mount unit as a Wants= dependency. This means that simply plugging in the device will not cause the mount unit to be started automatically. But please note that the mount unit may be started for other reasons, in particular if it is part of local-fs.target, and any unit which (transitively) depends on local-fs.target is started. * networkctl list/status/lldp now accept globbing wildcards for network interface names to match against all existing interfaces. * The $PIDFILE environment variable is set to point to the absolute path configured with PIDFile= for processes of that service. * The fallback DNS server list was augmented with Cloudflare public DNS servers. Use `-Ddns-servers=` to set a different fallback. * A new special target usb-gadget.target will be started automatically when a USB Device Controller is detected (which means that the system is a USB peripheral). * A new unit setting CPUQuotaPeriodSec= assigns the time period relative to which the CPU time quota specified by CPUQuota= is measured. * A new unit setting ProtectHostname= may be used to prevent services from modifying hostname information (even if they otherwise would have privileges to do so). * A new unit setting NetworkNamespacePath= may be used to specify a namespace for service or socket units through a path referring to a Linux network namespace pseudo-file. * The PrivateNetwork= setting and JoinsNamespaceOf= dependencies now have an effect on .socket units: when used the listening socket is created within the configured network namespace instead of the host namespace.
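Several of the new unit settings listed above, sketched in a single service. The unit name, binary path, PID file and namespace path are assumptions, as are the quota values; the namespace pseudo-file is assumed to have been created beforehand by some other tool:

```ini
# /etc/systemd/system/example.service (hypothetical unit)
[Service]
Type=forking
ExecStart=/usr/bin/example-daemon
# This path is exported to the service's processes as $PIDFILE.
PIDFile=/run/example.pid
# Prevent the service from modifying hostname information.
ProtectHostname=yes
# Run inside an existing network namespace (path is an assumption).
NetworkNamespacePath=/run/netns/example
# CPU time quota, measured relative to a 10ms period (example values).
CPUQuota=20%
CPUQuotaPeriodSec=10ms
```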
* ExecStart= command lines in unit files may now be prefixed with ‘:’ in which case environment variable substitution is disabled. (Supported for the other ExecXYZ= settings, too.) * .timer units gained two new boolean settings OnClockChange= and OnTimezoneChange= which may be used to also trigger a unit when the system clock is changed or the local timezone is modified. systemd-run has been updated to make these options easily accessible from the command line for transient timers. * Two new conditions for units have been added: ConditionMemory= may be used to conditionalize a unit based on installed system RAM. ConditionCPUs= may be used to conditionalize a unit based on installed CPU cores. * The @default system call filter group understood by SystemCallFilter= has been updated to include the new rseq() system call introduced in kernel 4.15. * A new time-set.target has been added that indicates that the system time has been set from a local source (possibly imprecise). The existing time-sync.target is stronger and indicates that the time has been synchronized with a precise external source. Services where approximate time is sufficient should use the new target. * “systemctl start” (and related commands) learnt a new --show-transaction option. If specified, brief information about all jobs queued because of the requested operation is shown. * systemd-networkd recognizes a new operational state ‘enslaved’, used (instead of ‘degraded’ or ‘carrier’) for interfaces which form a bridge, bond, or similar, and a new ‘degraded-carrier’ operational state used for the bond or bridge master interface when one of the enslaved devices is not operational. * .network files learnt the new IgnoreCarrierLoss= option for leaving networks configured even if the carrier is lost. * The RequiredForOnline= setting in .network files may now specify a minimum operational state required for the interface to be considered “online” by systemd-networkd-wait-online.
Related to this, systemd-networkd-wait-online gained a new option --operational-state= to configure the same, and its --interface= option was updated to optionally also take an operational state specific to an interface.

* systemd-networkd-wait-online gained a new setting --any for waiting for only one of the requested interfaces instead of all of them.

* systemd-networkd now implements L2TP tunnels.

* Two new .network settings UseAutonomousPrefix= and UseOnLinkPrefix= may be used to cause autonomous and onlink prefixes received in IPv6 Router Advertisements to be ignored.

* New MulticastFlood=, NeighborSuppression=, and Learning= .network file settings may be used to tweak bridge behaviour.

* The new TripleSampling= option in .network files may be used to configure CAN triple sampling.

* New .netdev settings PrivateKeyFile= and PresharedKeyFile= may be used to point to a private or preshared key for a WireGuard interface.

* /etc/crypttab now supports the same-cpu-crypt and submit-from-crypt-cpus options to tweak encryption work scheduling details.

* systemd-tmpfiles will now take a BSD file lock before operating on the contents of a directory. This may be used to temporarily exclude directories from aging by taking the same lock (useful for example when extracting a tarball into /tmp or /var/tmp as a privileged user, which might create files with really old timestamps, which nevertheless should not be deleted). For further details, see: https://systemd.io/TEMPORARY_DIRECTORIES

* systemd-tmpfiles’ h line type gained support for the FS_PROJINHERIT_FL (‘P’) file attribute (introduced in kernel 4.5), controlling project quota inheritance.

* sd-boot and bootctl now implement support for an Extended Boot Loader (XBOOTLDR) partition, that is intended to be mounted to /boot, in addition to the ESP partition mounted to /efi or /boot/efi. Configuration file fragments, kernels, initrds and other EFI images to boot will be loaded from both the ESP and XBOOTLDR partitions.
The XBOOTLDR partition was previously described by the Boot Loader Specification, but implementation was missing in sd-boot. Support for this concept allows using the sd-boot boot loader in more conservative scenarios where the boot loader itself is placed in the ESP but the kernels to boot (and their metadata) in a separate partition.

* A system may now be booted with systemd.volatile=overlay on the kernel command line, which causes the root file system to be set up as an overlayfs mount combining the read-only root directory with a writable tmpfs. In this setup, the underlying root device is not modified, and any changes are lost at reboot.

* Similarly, systemd-nspawn can now boot containers with a volatile overlayfs root with the new --volatile=overlay switch.

* systemd-nspawn can now consume OCI runtime bundles using a new --oci-bundle= option. This implementation is fully usable, with most features in the specification implemented, but since this is a lot of new code and functionality, this feature should most likely not be used in production yet.

* systemd-nspawn now supports various options described by the OCI runtime specification on the command-line and in .nspawn files: --inaccessible=/Inaccessible= may be used to mask parts of the file system tree, --console=/--pipe may be used to configure how standard input, output, and error are set up.

* busctl learned the `emit` verb to generate D-Bus signals.

* systemd-analyze cat-config may be used to gather and display configuration spread over multiple files, for example system and user presets, tmpfiles.d, sysusers.d, udev rules, etc.

* systemd-analyze calendar now takes an optional new parameter --iterations= which may be used to show a maximum number of iterations the specified expression will elapse next.

* The sd-bus C API gained support for naming method parameters in the introspection data.

* systemd-logind gained D-Bus APIs to specify the “reboot parameter” the reboot() system call expects.
* journalctl learnt a new --cursor-file= option that points to a file from which a cursor should be loaded in the beginning and to which the updated cursor should be stored at the end.

* ACRN hypervisor and Windows Subsystem for Linux (WSL) are now detected by systemd-detect-virt (and may also be used in ConditionVirtualization=).

* The behaviour of systemd-logind may now be modified with environment variables $SYSTEMD_REBOOT_TO_FIRMWARE_SETUP, $SYSTEMD_REBOOT_TO_BOOT_LOADER_MENU, and $SYSTEMD_REBOOT_TO_BOOT_LOADER_ENTRY. They cause logind to either skip the relevant operation completely (when set to false), or to create a flag file in /run/systemd (when set to true), instead of actually commencing the real operation when requested. The presence of /run/systemd/reboot-to-firmware-setup, /run/systemd/reboot-to-boot-loader-menu, and /run/systemd/reboot-to-boot-loader-entry, may be used by alternative boot loader implementations to replace some steps logind performs during reboot with their own operations.

* systemctl can be used to request a reboot into the boot loader menu or a specific boot loader entry with the new --boot-loader-menu= and --boot-loader-entry= options to a reboot command. (This requires a boot loader that supports this, for example sd-boot.)

* kernel-install will no longer unconditionally create the output directory (e.g. /efi//) for boot loader snippets, but will do so only if the machine-specific parent directory (i.e. /efi//) already exists. bootctl has been modified to create this parent directory during sd-boot installation. This makes it easier to use kernel-install with plugins which support a different layout of the bootloader partitions (for example grub2).
* During package installation (with `ninja install`), we would create symlinks for getty@tty1.service, systemd-networkd.service, systemd-networkd.socket, systemd-resolved.service, remote-cryptsetup.target, remote-fs.target, systemd-networkd-wait-online.service, and systemd-timesyncd.service in /etc, as if `systemctl enable` was called for those units, to make the system usable immediately after installation. Now this is not done anymore, and instead calling `systemctl preset-all` is recommended after the first installation of systemd. * A new boolean sandboxing option RestrictSUIDSGID= has been added that is built on seccomp. When turned on creation of SUID/SGID files is prohibited. * The NoNewPrivileges= and the new RestrictSUIDSGID= options are now implied if DynamicUser= is turned on for a service. This hardens these services, so that they neither can benefit from nor create SUID/SGID executables. This is a minor compatibility breakage, given that when DynamicUser= was first introduced SUID/SGID behaviour was unaffected. However, the security benefit of these two options is substantial, and the setting is still relatively new, hence we opted to make it mandatory for services with dynamic users. 
Contributions from: Adam Jackson, Alexander Tsoy, Andrey Yashkin, Andrzej Pietrasiewicz, Anita Zhang, Balint Reczey, Beniamino Galvani, Ben Iofel, Benjamin Berg, Benjamin Dahlhoff, Chris, Chris Morin, Christopher Wong, Claudius Ellsel, Clemens Gruber, dana, Daniel Black, Davide Cavalca, David Michael, David Rheinsberg, emersion, Evgeny Vereshchagin, Filipe Brandenburger, Franck Bui, Frantisek Sumsal, Giacinto Cifelli, Hans de Goede, Hugo Kindel, Ignat Korchagin, Insun Pyo, Jan Engelhardt, Jonas Dorel, Jonathan Lebon, Jonathon Kowalski, Jörg Sommer, Jörg Thalheim, Jussi Pakkanen, Kai-Heng Feng, Lennart Poettering, Lubomir Rintel, Luís Ferreira, Martin Pitt, Matthias Klumpp, Michael Biebl, Michael Niewöhner, Michael Olbrich, Michal Sekletar, Mike Lothian, Paul Menzel, Piotr Drąg, Riccardo Schirone, Robin Elvedi, Roman Kulikov, Ronald Tschalär, Ross Burton, Ryan Gonzalez, Sebastian Krzyszkowiak, Stephane Chazelas, StKob, Susant Sahani, Sylvain Plantefève, Szabolcs Fruhwald, Taro Yamada, Theo Ouzhinski, Thomas Haller, Tobias Jungel, Tom Yan, Tony Asleson, Topi Miettinen, unixsysadmin, Van Laser, Vesa Jääskeläinen, Yu, Li-Yu, Yu Watanabe, Zbigniew Jędrzejewski-Szmek — Warsaw, 2019-04-11 CHANGES WITH 241: * The default locale can now be configured at compile time. Otherwise, a suitable default will be selected automatically (one of C.UTF-8, en_US.UTF-8, and C). * The version string shown by systemd and other tools now includes the git commit hash when built from git. An override may be specified during compilation, which is intended to be used by distributions to include the package release information. * systemd-cat can now filter standard input and standard error streams for different syslog priorities using the new –stderr-priority= option. * systemd-journald and systemd-journal-remote reject entries which contain too many fields (CVE-2018-16865) and set limits on the process’ command line length (CVE-2018-16864). 
* $DBUS_SESSION_BUS_ADDRESS environment variable is set by pam_systemd again.

* A new network device NamePolicy “keep” is implemented for link files, and used by default in 99-default.link (the fallback configuration provided by systemd). With this policy, if the network device name was already set by userspace, the device will not be renamed again. This matches the naming scheme that was implemented before systemd-240. If naming-scheme < 240 is specified, the “keep” policy is also enabled by default, even if not specified. Effectively, this means that if naming-scheme >= 240 is specified, network devices will be renamed according to the configuration, even if they have been renamed already, if “keep” is not specified as the naming policy in the .link file. The 99-default.link file provided by systemd includes “keep” for backwards compatibility, but it is recommended for user installed .link files to *not* include it. The “kernel” policy, which keeps kernel names declared to be “persistent”, now works again as documented.

* kernel-install script now optionally takes the paths to one or more initrd files, and passes them to all plugins.

* The mincore() system call has been dropped from the @system-service system call filter group, as it is pretty exotic and may potentially be used for side-channel attacks.

* -fPIE is dropped from compiler and linker options. Please specify the -Db_pie=true option to meson to build position-independent executables. Note that the meson option is supported since meson-0.49.

* The fs.protected_regular and fs.protected_fifos sysctls, which were added in Linux 4.19 to make some data spoofing attacks harder, are now enabled by default.
While this will hopefully improve the security of most installations, it is technically a backwards incompatible change; to disable these sysctls again, place the following lines in /etc/sysctl.d/60-protected.conf or a similar file: fs.protected_regular = 0 fs.protected_fifos = 0 Note that the similar hardlink and symlink protection has been enabled since v199, and may be disabled likewise. * The files read from the EnvironmentFile= setting in unit files now parse backslashes inside quotes literally, matching the behaviour of POSIX shells. * udevadm trigger, udevadm control, udevadm settle and udevadm monitor now automatically become NOPs when run in a chroot() environment. * The tmpfiles.d/ “C” line type will now copy directory trees not only when the destination is so far missing, but also if it already exists as a directory and is empty. This is useful to cater for systems where directory trees are put together from multiple separate mount points but otherwise empty. * A new function sd_bus_close_unref() (and the associated sd_bus_close_unrefp()) has been added to libsystemd, that combines sd_bus_close() and sd_bus_unref() in one. * udevadm control learnt a new option for –ping for testing whether a systemd-udevd instance is running and reacting. * udevadm trigger learnt a new option for –wait-daemon for waiting systemd-udevd daemon to be initialized. 
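The tmpfiles.d/ “C” entry above boils down to a small decision rule: copy when the destination is missing, or when it exists as an empty directory. The following Python sketch illustrates that rule only; it is not how systemd-tmpfiles is implemented, and the function name is hypothetical.

```python
import os
import shutil
import tempfile


def copy_tree_if_absent_or_empty(src, dst):
    """Copy the tree at src to dst when dst is missing or an empty directory.

    Returns True if a copy was made, False if dst was left untouched.
    Hypothetical helper that mirrors the rule described for "C" lines.
    """
    if os.path.isdir(dst):
        if os.listdir(dst):
            return False  # populated directory: leave it alone
    elif os.path.lexists(dst):
        return False  # exists, but is not a directory
    # dirs_exist_ok lets the copy proceed into an existing empty directory.
    shutil.copytree(src, dst, dirs_exist_ok=True)
    return True


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        source = os.path.join(tmp, "factory")
        os.mkdir(source)
        with open(os.path.join(source, "example.conf"), "w") as handle:
            handle.write("key=value\n")
        target = os.path.join(tmp, "etc-example")
        os.mkdir(target)  # e.g. an empty mount point
        print(copy_tree_if_absent_or_empty(source, target))
```

The empty-directory case is the one added in this release; it covers mount points that exist but have not been populated yet.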
Contributions from: Aaron Plattner, Alberts Muktupāvels, Alex Mayer, Ayman Bagabas, Beniamino Galvani, Burt P, Chris Down, Chris Lamb, Chris Morin, Christian Hesse, Claudius Ellsel, dana, Daniel Axtens, Daniele Medri, Dave Reisner, David Santamaría Rogado, Diego Canuhe, Dimitri John Ledkov, Evgeny Vereshchagin, Fabrice Fontaine, Filipe Brandenburger, Franck Bui, Frantisek Sumsal, govwin, Hans de Goede, James Hilliard, Jan Engelhardt, Jani Uusitalo, Jan Janssen, Jan Synacek, Jonathan McDowell, Jonathan Roemer, Jonathon Kowalski, Joost Heitbrink, Jörg Thalheim, Lance, Lennart Poettering, Louis Taylor, Lucas Werkmeister, Mantas Mikulėnas, Marc-Antoine Perennou, marvelousblack, Michael Biebl, Michael Sloan, Michal Sekletar, Mike Auty, Mike Gilbert, Mikhail Kasimov, Neil Brown, Niklas Hambüchen, Patrick Williams, Paul Seyfert, Peter Hutterer, Philip Withnall, Roger James, Ronnie P. Thomas, Ryan Gonzalez, Sam Morris, Stephan Edel, Stephan Gerhold, Susant Sahani, Taro Yamada, Thomas Haller, Topi Miettinen, YiFei Zhu, YmrDtnJu, YunQiang Su, Yu Watanabe, Zbigniew Jędrzejewski-Szmek, zsergeant77, Дамјан Георгиевски — Berlin, 2019-02-14 CHANGES WITH 240: * NoNewPrivileges=yes has been set for all long-running services implemented by systemd. Previously, this was problematic due to SELinux (as this would also prohibit the transition from PID1’s label to the service’s label). This restriction has since been lifted, but an SELinux policy update is required. (See e.g. https://github.com/fedora-selinux/selinux-policy/pull/234.) * DynamicUser=yes is dropped from systemd-networkd.service, systemd-resolved.service and systemd-timesyncd.service, which was enabled in v239 for systemd-networkd.service and systemd-resolved.service, and since v236 for systemd-timesyncd.service. The users and groups systemd-network, systemd-resolve and systemd-timesync are created by systemd-sysusers again. 
Distributors or system administrators may need to create these users and groups if they do not exist (or need to re-enable DynamicUser= for those units) while upgrading systemd. Also, the clock file for systemd-timesyncd may need to move from /var/lib/private/systemd/timesync/clock to /var/lib/systemd/timesync/clock.

* When unit files are loaded from disk, previously systemd would sometimes (depending on the unit loading order) load units from the target path of symlinks in .wants/ or .requires/ directories of other units. This meant that units could be loaded from different paths depending on whether the unit was requested explicitly or as a dependency of another unit, not honouring the priority of directories in the search path. It also meant that it was possible to successfully load and start units which are not found in the unit search path, as long as they were requested as a dependency and linked to from .wants/ or .requires/. The target paths of those symlinks are not used for loading units anymore and the unit file must be found in the search path.

* A new service type has been added: Type=exec. It’s very similar to Type=simple but ensures the service manager will wait for both fork() and execve() of the main service binary to complete before proceeding with follow-up units. This is primarily useful so that the manager propagates any errors in the preparation phase of service execution back to the job that requested the unit to be started. For example, consider a service that has ExecStart= set to a file system binary that doesn’t exist. With Type=simple starting the unit would be considered instantly successful, as only fork() has to complete successfully and the manager does not wait for execve(), and hence its failure is seen “too late”. With the new Type=exec service type starting the unit will fail, as the manager will wait for the execve() and notice its failure, which is then propagated back to the start job.
NOTE: with the next release 241 of systemd we intend to change the systemd-run tool to default to Type=exec for transient services started by it. This should be mostly safe, but in specific corner cases might result in problems, as the systemd-run tool will then block on NSS calls (such as user name look-ups due to User=) done between the fork() and execve(), which under specific circumstances might cause problems. It is recommended to specify “-p Type=simple” explicitly in the few cases where this applies. For regular, non-transient services (i.e. those defined with unit files on disk) we will continue to default to Type=simple. * The Linux kernel’s current default RLIMIT_NOFILE resource limit for userspace processes is set to 1024 (soft) and 4096 (hard). Previously, systemd passed this on unmodified to all processes it forked off. With this systemd release the hard limit systemd passes on is increased to 512K, overriding the kernel’s defaults and substantially increasing the number of simultaneous file descriptors unprivileged userspace processes can allocate. Note that the soft limit remains at 1024 for compatibility reasons: the traditional UNIX select() call cannot deal with file descriptors >= 1024 and increasing the soft limit globally might thus result in programs unexpectedly allocating a high file descriptor and thus failing abnormally when attempting to use it with select() (of course, programs shouldn’t use select() anymore, and prefer poll()/epoll, but the call unfortunately remains undeservedly popular at this time). This change reflects the fact that file descriptor handling in the Linux kernel has been optimized in more recent kernels and allocating large numbers of them should be much cheaper both in memory and in performance than it used to be. Programs that want to take benefit of the increased limit have to “opt-in” into high file descriptors explicitly by raising their soft limit. 
Of course, when they do that they must acknowledge that they cannot use select() anymore (and neither can any shared library they use, or any shared library used by any shared library they use, and so on). Which default hard limit is most appropriate is of course hard to decide. However, given reports that ~300K file descriptors are used in real-life applications we believe 512K is sufficiently high as the new default for now. Note that there are also reports that using very high hard limits (e.g. 1G) is problematic: some software allocates large arrays with one element for each potential file descriptor (Java, …), and a high hard limit thus triggers excessively large memory allocations in these applications. Hopefully, the new default of 512K is a good middle ground: higher than what real-life applications currently need, and low enough to avoid triggering excessively large allocations in problematic software. (And yes, somebody should fix Java.)

* The fs.nr_open and fs.file-max sysctls are now automatically bumped to the highest possible values, as separate accounting of file descriptors is no longer necessary, as memcg tracks them correctly as part of the memory accounting anyway. Thus, from the four limits on file descriptors currently enforced (fs.file-max, fs.nr_open, RLIMIT_NOFILE hard, RLIMIT_NOFILE soft) we turn off the first two, and keep only the latter two. A set of build-time options (-Dbump-proc-sys-fs-file-max=false and -Dbump-proc-sys-fs-nr-open=false) has been added to revert this change in behaviour, which might be an option for systems that turn off memcg in the kernel.

* When no /etc/locale.conf file exists (and hence no locale settings are in place), systemd will now use the “C.UTF-8” locale by default, and set LANG= to it. This locale is supported by various distributions including Fedora, with clear indications that upstream glibc is going to make it available too. This locale enables UTF-8 mode by default, which appears appropriate for 2018.
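The RLIMIT_NOFILE entry above says that programs have to opt in to high file descriptors by raising their own soft limit. As a minimal sketch of what that opt-in can look like, the following Python snippet raises the soft limit of the current process to whatever hard limit it inherited. The function name is hypothetical, and the sketch assumes a Linux system and a program that does not use select() anywhere, including in its libraries.

```python
import resource


def raise_nofile_soft_limit():
    """Raise the soft RLIMIT_NOFILE of this process to its hard limit.

    Returns the (soft, hard) pair in effect afterwards. Only call this
    in programs that never use select(), as the changelog entry warns.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if soft != hard:
        # An unprivileged process may raise its soft limit up to the hard limit.
        resource.setrlimit(resource.RLIMIT_NOFILE, (hard, hard))
    return resource.getrlimit(resource.RLIMIT_NOFILE)


if __name__ == "__main__":
    new_soft, new_hard = raise_nofile_soft_limit()
    print(f"RLIMIT_NOFILE soft={new_soft} hard={new_hard}")
```

The hard limit is left untouched, so the process stays within the bound the service manager passed down.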
* The “net.ipv4.conf.all.rp_filter” sysctl will now be set to 2 by default. This effectively switches the RFC3704 Reverse Path filtering from Strict mode to Loose mode. This is more appropriate for hosts that have multiple links with routes to the same networks (e.g. a client with a Wi-Fi and Ethernet both connected to the internet). Consult the kernel documentation for details on this sysctl: https://docs.kernel.org/networking/ip-sysctl.html * The v239 change to turn on “net.ipv4.tcp_ecn” by default has been reverted. * CPUAccounting=yes no longer enables the CPU controller when using kernel 4.15+ and the unified cgroup hierarchy, as required accounting statistics are now provided independently from the CPU controller. * Support for disabling a particular cgroup controller within a sub-tree has been added through the DisableControllers= directive. * cgroup_no_v1=all on the kernel command line now also implies using the unified cgroup hierarchy, unless one explicitly passes systemd.unified_cgroup_hierarchy=0 on the kernel command line. * The new “MemoryMin=” unit file property may now be used to set the memory usage protection limit of processes invoked by the unit. This controls the cgroup v2 memory.min attribute. Similarly, the new “IODeviceLatencyTargetSec=” property has been added, wrapping the new cgroup v2 io.latency cgroup property for configuring per-service I/O latency. * systemd now supports the cgroup v2 devices BPF logic, as counterpart to the cgroup v1 “devices” cgroup controller. * systemd-escape now is able to combine –unescape with –template. It also learnt a new option –instance for extracting and unescaping the instance part of a unit name. * sd-bus now provides the sd_bus_message_readv() which is similar to sd_bus_message_read() but takes a va_list object. The pair sd_bus_set_method_call_timeout() and sd_bus_get_method_call_timeout() has been added for configuring the default method call timeout to use. 
sd_bus_error_move() may be used to efficiently move the contents from one sd_bus_error structure to another, invalidating the source. sd_bus_set_close_on_exit() and sd_bus_get_close_on_exit() may be used to control whether a bus connection object is automatically flushed when an sd-event loop is exited.

* When processing classic BSD syslog log messages, journald will now save the original time-stamp string supplied in the new SYSLOG_TIMESTAMP= journal field. This permits consumers to reconstruct the original BSD syslog message more correctly.

* StandardOutput=/StandardError= in service files gained support for new “append:…” parameters, for connecting STDOUT/STDERR of a service to a file, and appending to it.

* The signal to use as last step of killing of unit processes is now configurable. Previously it was hard-coded to SIGKILL, which may now be overridden with the new FinalKillSignal= setting. Note that this is the signal used when regular termination (i.e. SIGTERM) does not suffice. Similarly, the signal used when aborting a program in case of a watchdog timeout may now be configured too (WatchdogSignal=).

* The XDG_SESSION_DESKTOP environment variable may now be configured in the pam_systemd argument line, using the new desktop= switch. This is useful to initialize it properly from a display manager without having to touch C code.

* Most configuration options that previously accepted percentage values now also accept permille values with the ‘‰’ suffix (instead of ‘%’).

* systemd-resolved may now optionally use OpenSSL instead of GnuTLS for DNS-over-TLS.

* systemd-resolved’s configuration file resolved.conf gained a new option ReadEtcHosts= which may be used to turn off processing and honoring /etc/hosts entries.

* The “--wait” switch may now be passed to “systemctl is-system-running”, in which case the tool will synchronously wait until the system finished start-up.

* hostnamed gained a new bus call to determine the DMI product UUID.
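The permille entry above means that a value such as 500‰ expresses the same ratio as 50%. The Python sketch below shows one way a consumer of such values could normalise both spellings to an integer number of permille. It is an illustration only, not systemd’s parser; the function name is hypothetical, and it deliberately accepts plain ASCII integers only.

```python
def parse_permille(value):
    """Convert a string such as '50%' or '500‰' to integer permille.

    Hypothetical helper: accepts an unsigned ASCII integer followed by
    either suffix and raises ValueError for anything else.
    """
    text = value.strip()
    if text.endswith("‰"):
        digits, scale = text[:-1], 1
    elif text.endswith("%"):
        digits, scale = text[:-1], 10  # one percent is ten permille
    else:
        raise ValueError(f"missing '%' or '‰' suffix: {value!r}")
    if not (digits.isascii() and digits.isdigit()):
        raise ValueError(f"not an unsigned integer: {value!r}")
    return int(digits) * scale


if __name__ == "__main__":
    for sample in ("50%", "500‰", "5‰"):
        print(sample, "->", parse_permille(sample), "permille")
```

Working in integer permille avoids floating-point rounding when the two spellings are compared.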
* On x86-64 systemd will now prefer using the RDRAND processor instruction over /dev/urandom whenever it requires randomness that neither has to be crypto-grade nor should be reproducible. This should substantially reduce the amount of entropy systemd requests from the kernel during initialization on such systems, though not reduce it to zero. (Why not zero? systemd still needs to allocate UUIDs and such uniquely, which require high-quality randomness.) * networkd gained support for Foo-Over-UDP, ERSPAN and ISATAP tunnels. It also gained a new option ForceDHCPv6PDOtherInformation= for forcing the “Other Information” bit in IPv6 RA messages. The bonding logic gained four new options AdActorSystemPriority=, AdUserPortKey=, AdActorSystem= for configuring various 802.3ad aspects, and DynamicTransmitLoadBalancing= for enabling dynamic shuffling of flows. The tunnel logic gained a new IPv6RapidDeploymentPrefix= option for configuring IPv6 Rapid Deployment. The policy rule logic gained four new options IPProtocol=, SourcePort= and DestinationPort=, InvertRule=. The bridge logic gained support for the MulticastToUnicast= option. networkd also gained support for configuring static IPv4 ARP or IPv6 neighbor entries. * .preset files (as read by ‘systemctl preset’) may now be used to instantiate services. * /etc/crypttab now understands the sector-size= option to configure the sector size for an encrypted partition. * Key material for encrypted disks may now be placed on a formatted medium, and referenced from /etc/crypttab by the UUID of the file system, followed by “=” suffixed by the path to the key file. * The “collect” udev component has been removed without replacement, as it is neither used nor maintained. * When the RuntimeDirectory=, StateDirectory=, CacheDirectory=, LogsDirectory=, ConfigurationDirectory= settings are used in a service the executed processes will now receive a set of environment variables containing the full paths of these directories. 
Specifically, RUNTIME_DIRECTORY=, STATE_DIRECTORY, CACHE_DIRECTORY, LOGS_DIRECTORY, CONFIGURATION_DIRECTORY are now set if these options are used. Note that these options may be used multiple times per service in which case the resulting paths will be concatenated and separated by colons. * Predictable interface naming has been extended to cover InfiniBand NICs. They will be exposed with an “ib” prefix. * tmpfiles.d/ line types may now be suffixed with a ‘-‘ character, in which case the respective line failing is ignored. * .link files may now be used to configure the equivalent to the “ethtool advertise” commands. * The sd-device.h and sd-hwdb.h APIs are now exported, as an alternative to libudev.h. Previously, the latter was just an internal wrapper around the former, but now these two APIs are exposed directly. * sd-id128.h gained a new function sd_id128_get_boot_app_specific() which calculates an app-specific boot ID similar to how sd_id128_get_machine_app_specific() generates an app-specific machine ID. * A new tool systemd-id128 has been added that can be used to determine and generate various 128bit IDs. * /etc/os-release gained two new standardized fields DOCUMENTATION_URL= and LOGO=. * systemd-hibernate-resume-generator will now honor the “noresume” kernel command line option, in which case it will bypass resuming from any hibernated image. * The systemd-sleep.conf configuration file gained new options AllowSuspend=, AllowHibernation=, AllowSuspendThenHibernate=, AllowHybridSleep= for prohibiting specific sleep modes even if the kernel exports them. * portablectl is now officially supported and has thus moved to /usr/bin/. * bootctl learnt the two new commands “set-default” and “set-oneshot” for setting the default boot loader item to boot to (either persistently or only for the next boot). This is currently only compatible with sd-boot, but may be implemented on other boot loaders too, that follow the boot loader interface. 
The updated interface is now documented here: https://systemd.io/BOOT_LOADER_INTERFACE * A new kernel command line option systemd.early_core_pattern= is now understood which may be used to influence the core_pattern PID 1 installs during early boot. * busctl learnt two new options -j and –json= for outputting method call replies, properties and monitoring output in JSON. * journalctl’s JSON output now supports simple ANSI coloring as well as a new “json-seq” mode for generating RFC7464 output. * Unit files now support the %g/%G specifiers that resolve to the UNIX group/GID of the service manager runs as, similar to the existing %u/%U specifiers that resolve to the UNIX user/UID. * systemd-logind learnt a new global configuration option UserStopDelaySec= that may be set in logind.conf. It specifies how long the systemd –user instance shall remain started after a user logs out. This is useful to speed up repetitive re-connections of the same user, as it means the user’s service manager doesn’t have to be stopped/restarted on each iteration, but can be reused between subsequent options. This setting defaults to 10s. systemd-logind also exports two new properties on its Manager D-Bus objects indicating whether the system’s lid is currently closed, and whether the system is on AC power. * systemd gained support for a generic boot counting logic, which generically permits automatic reverting to older boot loader entries if newer updated ones don’t work. The boot loader side is implemented in sd-boot, but is kept open for other boot loaders too. For details see: https://systemd.io/AUTOMATIC_BOOT_ASSESSMENT * The SuccessAction=/FailureAction= unit file settings now learnt two new parameters: “exit” and “exit-force”, which result in immediate exiting of the service manager, and are only useful in systemd –user and container environments. 
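The “json-seq” entry above refers to RFC 7464 framing, in which every JSON text is preceded by an ASCII Record Separator (0x1E) and followed by a line feed. As a minimal sketch, a consumer could split such a stream as follows. The sample input is made up for illustration (it only borrows the MESSAGE and PRIORITY journal field names), and the function name is hypothetical.

```python
import json

RS = "\x1e"  # ASCII Record Separator that RFC 7464 puts before each JSON text


def parse_json_seq(data):
    """Yield one decoded object per RFC 7464 record found in data (a str)."""
    for chunk in data.split(RS):
        chunk = chunk.strip()  # drops the trailing line feed of each record
        if chunk:
            yield json.loads(chunk)


# Made-up two-record sample in json-seq framing.
SAMPLE = (
    '\x1e{"MESSAGE": "started", "PRIORITY": "6"}\n'
    '\x1e{"MESSAGE": "stopped", "PRIORITY": "6"}\n'
)

if __name__ == "__main__":
    for entry in parse_json_seq(SAMPLE):
        print(entry["MESSAGE"])
```

A real consumer would read the stream incrementally rather than holding it in one string; the sketch keeps everything in memory for brevity.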
* Unit files gained support for a pair of options FailureActionExitStatus=/SuccessActionExitStatus= for configuring the exit status to use as service manager exit status when SuccessAction=/FailureAction= is set to exit or exit-force.

* A pair of LogRateLimitIntervalSec=/LogRateLimitBurst= per-service options may now be used to configure the log rate limiting applied by journald per-service.

* systemd-analyze gained a new verb “timespan” for parsing and normalizing time span values (i.e. strings like “5min 7s 8us”).

* systemd-analyze also gained a new verb “security” for analyzing the security and sand-boxing settings of services in order to determine an “exposure level” for them, indicating whether a service would benefit from more sand-boxing options turned on for them.

* “systemd-analyze syscall-filter” will now also show system calls supported by the local kernel but not included in any of the defined groups.

* .nspawn files now understand the Ephemeral= setting, matching the --ephemeral command line switch.

* sd-event gained the new APIs sd_event_source_get_floating() and sd_event_source_set_floating() for controlling whether a specific event source is “floating”, i.e. destroyed along with the event loop object itself.

* Unit objects on D-Bus gained a new “Refs” property that lists all clients that currently have a reference on the unit (to ensure it is not unloaded).

* The JoinControllers= option in system.conf is no longer supported, as it didn’t work correctly, is hard to support properly, is legacy (as the concept only exists on cgroup v1) and apparently wasn’t used.

* Journal messages that are generated whenever a unit enters the failed state are now tagged with a unique MESSAGE_ID. Similarly, messages generated whenever a service process exits are now made recognizable, too. A tagged message is also emitted whenever a unit enters the “dead” state on success.
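The “timespan” entry above mentions strings such as “5min 7s 8us”. To illustrate what normalizing such a value involves, here is a simplified Python sketch that converts a time span to microseconds. It is not the parser systemd-analyze uses: it handles only a small, assumed subset of unit suffixes and whole numbers, and the function name is hypothetical.

```python
import re

# Microseconds per unit; a deliberately small subset chosen for this sketch.
UNITS_USEC = {
    "us": 1,
    "ms": 1_000,
    "s": 1_000_000,
    "min": 60_000_000,
    "h": 3_600_000_000,
    "d": 86_400_000_000,
}

TOKEN = re.compile(r"\s*(\d+)\s*([a-z]+)")


def parse_timespan_usec(text):
    """Sum the '<number><unit>' components of text, in microseconds."""
    text = text.strip()
    if not text:
        raise ValueError("empty time span")
    position, total = 0, 0
    while position < len(text):
        match = TOKEN.match(text, position)
        if match is None or match.group(2) not in UNITS_USEC:
            raise ValueError(f"cannot parse time span: {text!r}")
        total += int(match.group(1)) * UNITS_USEC[match.group(2)]
        position = match.end()
    return total


if __name__ == "__main__":
    print(parse_timespan_usec("5min 7s 8us"))
```

For the example from the entry, five minutes plus seven seconds plus eight microseconds add up to 307000008 microseconds.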
* systemd-run gained a new switch --working-directory= for configuring the working directory of the service to start. The related --same-dir switch (short: -d) sets the working directory of the service to the current working directory of the invoking program. The new --shell (or just -S) option has been added for invoking the $SHELL of the caller as a service, and implies --pty --same-dir --wait --collect --service-type=exec. Or in other words, “systemd-run -S” is now the quickest way to get an interactive shell in a fully clean and well-defined system service context.

* machinectl gained a new verb “import-fs” for importing an OS tree from a directory. Moreover, when a directory or tarball is imported and a single top-level directory is found with the OS itself below it, the OS tree is automatically mangled and moved one level up.

* systemd-importd will no longer set up an implicit btrfs loop-back file system on /var/lib/machines. If one is already set up, it will continue to be used.

* A new generator “systemd-run-generator” has been added. It will synthesize a unit from one or more program command lines included in the kernel command line. This is very useful in container managers for example:

    # systemd-nspawn -i someimage.raw -b systemd.run='"some command line"'

This will run “systemd-nspawn” on an image, invoke the specified command line and immediately shut down the container again, returning the command line’s exit code.

* The block device locking logic is now documented: https://systemd.io/BLOCK_DEVICE_LOCKING

* loginctl and machinectl now optionally output the various tables in JSON using the --output= switch. It is our intention to add similar support to systemctl and all other commands.

* udevadm’s query and trigger verbs now optionally take a .device unit name as argument.

* systemd-udevd’s network naming logic now understands a new net.naming-scheme= kernel command line switch, which may be used to pick a specific version of the naming scheme.
This helps stabilize interface names even as systemd/udev are updated and the naming logic is improved.

* sd-id128.h learnt two new auxiliary helpers: sd_id128_is_allf() and SD_ID128_ALLF to test if a 128-bit ID is set to all 0xFF bytes, and to initialize one to all 0xFF.

* After loading the SELinux policy systemd will now recursively relabel all files and directories listed in /run/systemd/relabel-extra.d/*.relabel (which should be simple newline separated lists of paths) in addition to the ones it already implicitly relabels in /run, /dev and /sys. After the relabelling is completed the *.relabel files (and /run/systemd/relabel-extra.d/) are removed. This is useful to permit initrds (i.e. code running before the SELinux policy is in effect) to generate files in the host filesystem safely and ensure that the correct label is applied during the transition to the host OS.

* KERNEL API BREAKAGE: Linux kernel 4.18 changed behaviour regarding mknod() handling in user namespaces. Previously mknod() would always fail with EPERM in user namespaces. Since 4.18 mknod() will succeed but device nodes generated that way cannot be opened, and attempts to open them result in EPERM. This breaks the “graceful fallback” logic in systemd’s PrivateDevices= sand-boxing option. This option is implemented defensively, so that when systemd detects it runs in a restricted environment (such as a user namespace, or an environment where mknod() is blocked through seccomp or absence of CAP_MKNOD) where device nodes cannot be created, the effect of PrivateDevices= is bypassed (following the logic that 2nd-level sand-boxing is not essential if the system systemd runs in is itself already sand-boxed as a whole). This logic breaks with 4.18 in container managers where user namespacing is used: suddenly PrivateDevices= succeeds in setting up a private /dev/ file system containing device nodes, but when these are opened they don’t work.
At this point it is recommended that container managers utilizing user namespaces that intend to run systemd in the payload explicitly block mknod() with seccomp or similar, so that the graceful fallback logic works again. We are very sorry for the breakage and the requirement to change container configurations for newer kernels. It’s purely caused by an incompatible kernel change. The relevant kernel developers were notified about this userspace breakage promptly, but they chose to ignore it.

* The PermissionsStartOnly= setting is deprecated (but is still supported for backwards compatibility). The same functionality is provided by the more flexible “+”, “!”, and “!!” prefixes to ExecStart= and other commands.

* The $DBUS_SESSION_BUS_ADDRESS environment variable is not set by pam_systemd anymore.

* The naming scheme for network devices was changed to always rename devices, even if they were already renamed by userspace. The “kernel” policy was changed to only apply as a fallback, if no other naming policy took effect.

* The requirements to build systemd have been bumped to meson-0.46 and python-3.5.
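
To illustrate the PermissionsStartOnly= deprecation noted above, here is a minimal sketch of a service that uses the “+” prefix instead. The user name, paths and commands are hypothetical and only meant to show the shape of such a migration:

```ini
[Service]
# Hypothetical unprivileged account for the main process.
User=exampleuser
# Previously one might have set PermissionsStartOnly=yes so that only
# ExecStart= was subject to User=. With the "+" prefix, this single
# command line runs with full privileges instead.
ExecStartPre=+/usr/bin/mkdir -p /run/example-daemon
# No prefix: runs with the credentials configured by User= above.
ExecStart=/usr/local/bin/example-daemon
```

Unlike the deprecated setting, the prefix applies to one command line at a time, so privileged and unprivileged helper commands can be mixed within the same unit.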
Contributions from: afg, Alan Jenkins, Aleksei Timofeyev, Alexander Filippov, Alexander Kurtz, Alexey Bogdanenko, Andreas Henriksson, Andrew Jorgensen, Anita Zhang, apnix-uk, Arkan49, Arseny Maslennikov, asavah, Asbjørn Apeland, aszlig, Bastien Nocera, Ben Boeckel, Benedikt Morbach, Benjamin Berg, Bruce Zhang, Carlo Caione, Cedric Viou, Chen Qi, Chris Chiu, Chris Down, Chris Morin, Christian Rebischke, Claudius Ellsel, Colin Guthrie, dana, Daniel, Daniele Medri, Daniel Kahn Gillmor, Daniel Rusek, Daniel van Vugt, Dariusz Gadomski, Dave Reisner, David Anderson, Davide Cavalca, David Leeds, David Malcolm, David Strauss, David Tardon, Dimitri John Ledkov, Dmitry Torokhov, dj-kaktus, Dongsu Park, Elias Probst, Emil Soleyman, Erik Kooistra, Ervin Peters, Evgeni Golov, Evgeny Vereshchagin, Fabrice Fontaine, Faheel Ahmad, Faizal Luthfi, Felix Yan, Filipe Brandenburger, Franck Bui, Frank Schaefer, Frantisek Sumsal, Gautier Husson, Gianluca Boiano, Giuseppe Scrivano, glitsj16, Hans de Goede, Harald Hoyer, Harry Mallon, Harshit Jain, Helmut Grohne, Henry Tung, Hui Yiqun, imayoda, Insun Pyo, Iwan Timmer, Jan Janssen, Jan Pokorný, Jan Synacek, Jason A. Donenfeld, javitoom, Jérémy Nouhaud, Jeremy Su, Jiuyang Liu, João Paulo Rechi Vita, Joe Hershberger, Joe Rayhawk, Joerg Behrmann, Joerg Steffens, Jonas Dorel, Jon Ringle, Josh Soref, Julian Andres Klode, Jun Bo Bi, Jürg Billeter, Keith Busch, Khem Raj, Kirill Marinushkin, Larry Bernstone, Lennart Poettering, Lion Yang, Li Song, Lorenz Hübschle-Schneider, Lubomir Rintel, Lucas Werkmeister, Ludwin Janvier, Lukáš Nykrýn, Luke Shumaker, mal, Marc-Antoine Perennou, Marcin Skarbek, Marco Trevisan (Treviño), Marian Cepok, Mario Hros, Marko Myllynen, Markus Grimm, Martin Pitt, Martin Sobotka, Martin Wilck, Mathieu Trudel-Lapierre, Matthew Leeds, Michael Biebl, Michael Olbrich, Michael ‘pbone’ Pobega, Michael Scherer, Michal Koutný, Michal Sekletar, Michal Soltys, Mike Gilbert, Mike Palmer, Muhammet Kara, Neal