@kolyshkin kolyshkin commented Jul 25, 2025

This functionality has been available since systemd v238 (see systemd/systemd@6592b97) but was never exposed in go-systemd.

It is intended for use by the opencontainers/cgroups package (see opencontainers/cgroups#26) and, eventually, runc (see opencontainers/runc#4822).

kolyshkin added a commit to kolyshkin/oc-cgroups that referenced this pull request Jul 25, 2025
This will allow runtimes like runc to ask systemd to move the process
into a proper unit, instead of using cgroupfs directly.

Implementation for systemd requires [1] and systemd >= v238.

[1]: coreos/go-systemd#458
Signed-off-by: Kir Kolyshkin <[email protected]>
kolyshkin added a commit to kolyshkin/oc-cgroups that referenced this pull request Jul 28, 2025
Until now, cgroup managers had no way to add a process to an
existing cgroup. One could use cgroups.WriteCgroupProc, but that
is a cgroupfs operation and does not take systemd into account.

Let's introduce AddPid, which adds a process, identified by its PID,
to the manager's cgroup (or, optionally, to a sub-cgroup of it).

This will allow runtimes like runc to ask systemd to move the process
into a proper unit, instead of using cgroupfs directly.

Implementation for systemd requires [1] and systemd >= v238.

[1]: coreos/go-systemd#458
Signed-off-by: Kir Kolyshkin <[email protected]>
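
As a rough illustration of the cgroupfs side of AddPid described in the commit message above, here is a minimal sketch. The `Manager` interface, the `fsManager` type, the `AddPid(subcgroup, pid)` signature and the `demo` helper are assumptions made for this example, not the actual opencontainers/cgroups API; a systemd-backed manager would instead ask systemd over D-Bus to move the process.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strconv"
)

// Manager is a hypothetical, trimmed-down cgroup manager interface.
type Manager interface {
	// AddPid moves pid into the manager's cgroup, or into subcgroup
	// (a path relative to that cgroup) when subcgroup is non-empty.
	AddPid(subcgroup string, pid int) error
}

// fsManager is a hypothetical cgroupfs-only manager rooted at dir.
type fsManager struct {
	dir string
}

func (m *fsManager) AddPid(subcgroup string, pid int) error {
	// Joining with "" leaves m.dir unchanged. Validation of subcgroup
	// (for example rejecting "..") is omitted in this sketch.
	procs := filepath.Join(m.dir, subcgroup, "cgroup.procs")
	// On a real cgroup filesystem the file already exists, so open it
	// without O_CREATE; writing a PID asks the kernel to migrate it.
	f, err := os.OpenFile(procs, os.O_WRONLY, 0)
	if err != nil {
		return err
	}
	defer f.Close()
	_, err = f.WriteString(strconv.Itoa(pid))
	return err
}

// demo exercises AddPid against a fake cgroup tree in a temp directory
// and returns what ended up in the sub-cgroup's cgroup.procs file.
func demo() (string, error) {
	root, err := os.MkdirTemp("", "fake-cgroup")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(root)

	sub := filepath.Join(root, "exec")
	if err := os.Mkdir(sub, 0o755); err != nil {
		return "", err
	}
	procs := filepath.Join(sub, "cgroup.procs")
	if err := os.WriteFile(procs, nil, 0o644); err != nil {
		return "", err
	}

	var m Manager = &fsManager{dir: root}
	if err := m.AddPid("exec", 1234); err != nil {
		return "", err
	}
	data, err := os.ReadFile(procs)
	return string(data), err
}

func main() {
	got, err := demo()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("cgroup.procs now contains:", got)
}
```

The sketch only shows why a plain cgroupfs write bypasses systemd: nothing in it tells systemd that the process now belongs to the unit.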
@kolyshkin kolyshkin requested a review from Luap99 July 31, 2025 20:39
@kolyshkin kolyshkin marked this pull request as ready for review July 31, 2025 20:39
@Luap99 Luap99 left a comment

LGTM

@Luap99 Luap99 merged commit ed3a911 into coreos:main Jul 31, 2025
8 checks passed
kolyshkin added a commit to kolyshkin/oc-cgroups that referenced this pull request Aug 20, 2025
Until now, cgroup managers had no way to add a process to an
existing cgroup. One could use cgroups.WriteCgroupProc, but that
is a cgroupfs operation and does not take systemd into account.

Let's introduce AddPid, which adds a process, identified by its PID,
to the manager's cgroup (or, optionally, to a sub-cgroup of it).

This will allow runtimes like runc to ask systemd to move the process
into a proper unit, instead of using cgroupfs directly.

Implementation for systemd requires a go-systemd release that
includes [1] and systemd >= v238.

[1]: coreos/go-systemd#458
Signed-off-by: Kir Kolyshkin <[email protected]>
kolyshkin added a commit to kolyshkin/runc that referenced this pull request Sep 24, 2025
It makes sense for runc exec to benefit from clone3(CLONE_INTO_CGROUP)
when it is available. Since it requires a recent kernel (the flag was
added in Linux 5.7) and might not work, implement a fallback to the
older way of joining the cgroup.

Based on work done in
 - https://go-review.googlesource.com/c/go/+/417695
 - coreos/go-systemd#458
 - opencontainers/cgroups#26
 - opencontainers#4822

Signed-off-by: Kir Kolyshkin <[email protected]>
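
The try-then-fall-back flow described in the commit message above could look roughly like the following sketch. It relies on the `UseCgroupFD` and `CgroupFD` fields of `syscall.SysProcAttr` (Linux only, Go 1.20 or later); `startInCgroup`, `newTrueCmd`, and the `retry` and `join` hooks are hypothetical names made up for illustration and are not runc code.

```go
package main

import (
	"fmt"
	"os/exec"
	"syscall"
)

// startInCgroup first tries to create the child directly inside the cgroup
// referred to by cgroupFD (an open cgroup v2 directory), which Go implements
// with clone3 and CLONE_INTO_CGROUP. If that fails and retry(err) reports
// true, it starts the child normally and moves it afterwards via join.
//
// newCmd must return a fresh *exec.Cmd on every call; retry and join are
// hypothetical hooks supplied by the caller.
func startInCgroup(
	newCmd func() *exec.Cmd,
	cgroupFD int,
	retry func(error) bool,
	join func(pid int) error,
) (cmd *exec.Cmd, usedFD bool, err error) {
	cmd = newCmd()
	if cmd.SysProcAttr == nil {
		cmd.SysProcAttr = &syscall.SysProcAttr{}
	}
	cmd.SysProcAttr.UseCgroupFD = true
	cmd.SysProcAttr.CgroupFD = cgroupFD
	if err = cmd.Start(); err == nil {
		return cmd, true, nil
	}
	if !retry(err) {
		return nil, false, err
	}

	// Fallback: the child briefly runs in the parent's cgroup
	// before join moves it to the target cgroup.
	cmd = newCmd()
	if err = cmd.Start(); err != nil {
		return nil, false, err
	}
	if err = join(cmd.Process.Pid); err != nil {
		_ = cmd.Process.Kill()
		_ = cmd.Wait()
		return nil, false, err
	}
	return cmd, false, nil
}

// newTrueCmd builds a throwaway command for the demo.
func newTrueCmd() *exec.Cmd { return exec.Command("true") }

func main() {
	// -1 is deliberately not a valid cgroup fd, so the first attempt
	// fails and the demo exercises the fallback path.
	cmd, usedFD, err := startInCgroup(
		newTrueCmd, -1,
		func(error) bool { return true }, // demo only: retry on any error
		func(pid int) error {
			fmt.Println("joining the cgroup the older way, pid", pid)
			return nil
		},
	)
	if err != nil {
		fmt.Println("start failed:", err)
		return
	}
	fmt.Println("started via cgroup fd:", usedFD)
	_ = cmd.Wait()
}
```

In real use, cgroupFD would come from opening the target cgroup directory, and retry would accept only errors that indicate missing kernel support rather than any error.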
kolyshkin added a commit to kolyshkin/runc that referenced this pull request Sep 26, 2025
It makes sense for runc exec to benefit from clone3(CLONE_INTO_CGROUP)
when it is available. Since it requires a recent kernel (the flag was
added in Linux 5.7) and might not work, implement a fallback to the
older way of joining the cgroup.

Based on work done in
 - https://go-review.googlesource.com/c/go/+/417695
 - coreos/go-systemd#458
 - opencontainers/cgroups#26
 - opencontainers#4822

Regarding the E2BIG check in shouldRetryWithoutCgroupFD: the clone3
syscall first appeared in kernel v5.3 via commit [1], which added a
check that returns E2BIG if the clone_args structure passed from
userspace is larger than the one known to the kernel and the "unknown"
part contains non-zero values. Similar checks were already used in
comparable scenarios at the time, and later, in kernel v5.4, this was
generalized by patch series [2].

[1]: torvalds/linux@7f192e3
[2]: https://lore.kernel.org/all/[email protected]/#r

Signed-off-by: Kir Kolyshkin <[email protected]>
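
To make the E2BIG reasoning above concrete, here is a minimal sketch of what such a check might look like. Only the function name shouldRetryWithoutCgroupFD comes from the commit message; the body, the exact set of errno values, and the sample errors are assumptions for illustration and may differ from what runc actually does.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"syscall"
)

// shouldRetryWithoutCgroupFD reports whether a failed start looks like
// "this kernel cannot do clone3 with CLONE_INTO_CGROUP" rather than a
// genuine failure. The errno set below is an assumption for this sketch.
func shouldRetryWithoutCgroupFD(err error) bool {
	switch {
	case errors.Is(err, syscall.ENOSYS):
		// No clone3 at all (kernel older than v5.3), or the syscall
		// is blocked, for example by a seccomp filter.
		return true
	case errors.Is(err, syscall.E2BIG):
		// clone3 exists, but the kernel's clone_args is smaller than
		// the one passed in and the extra part (which carries the
		// cgroup fd) is non-zero.
		return true
	}
	return false
}

// Sample errors shaped like the *os.PathError that process start
// returns when the underlying fork/exec fails.
var (
	errOldKernel = &os.PathError{Op: "fork/exec", Path: "/bin/true", Err: syscall.E2BIG}
	errNoClone3  = &os.PathError{Op: "fork/exec", Path: "/bin/true", Err: syscall.ENOSYS}
	errNotFound  = &os.PathError{Op: "fork/exec", Path: "/bin/true", Err: syscall.ENOENT}
)

func main() {
	for _, err := range []error{errOldKernel, errNoClone3, errNotFound} {
		fmt.Printf("%v: retry=%v\n", err, shouldRetryWithoutCgroupFD(err))
	}
}
```

Because errors.Is unwraps the error chain, the check works whether the errno arrives bare or wrapped in an *os.PathError.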
kolyshkin added a commit to kolyshkin/runc that referenced this pull request Sep 26, 2025
It makes sense for runc exec to benefit from clone3(CLONE_INTO_CGROUP)
when it is available. Since it requires a recent kernel (the flag was
added in Linux 5.7) and might not work, implement a fallback to the
older way of joining the cgroup.

Based on:
 - https://go-review.googlesource.com/c/go/+/417695
 - coreos/go-systemd#458
 - opencontainers/cgroups#26
 - opencontainers#4822

Signed-off-by: Kir Kolyshkin <[email protected]>