-
Notifications
You must be signed in to change notification settings - Fork 2.2k
Fixes inability to use /dev/null when inside a container #3623
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
testPath, err := securejoin.SecureJoin("/sys", entry.Path) | ||
if err != nil { | ||
logrus.Errorf("error joining entry path: %s", err) | ||
continue |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If that's an error, I guess we should not continue but return it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or maybe we should not use securejoin here, since this is merely a check that such device exists, and we do not use testPath
in any other way. Worst case scenario, a bad guy will trick the system into adding a line for a device that does not exist into systemd unit properties.
So, I guess, using plain old path.Join
is sufficient here.
@cyphar WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The original patch didn't have secure join, but I saw @AkihiroSuda preferred having it (not sure if it's really needed here indeed); see #3620 (comment)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@AkihiroSuda WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SGTM, but probably the reason not to use securejoin should be recorded in the comment lines in the source
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No problem, I removed securejoin and added a comment why it's not used.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah securejoin
is not necessary here (these are host paths anyway).
87e308a
to
806e837
Compare
This is a forward port of opencontainers#3620 The original code depended on the origin filesystem to have /dev/{block,char} populated. This is done by udev normally and while is very common non-containerized systemd installs, it's very easy to start systemd in a container created by runc itself and not have /dev/{block,char} populated. When this occurs, the following error output is observed: $ docker run hello-world docker: Error response from daemon: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error reopening /dev/null inside container: open /dev/null: operation not permitted: unknown. /dev/null can't be opened because it was not added to the deviceAllowList, as there was no /dev/char directory. The change here utilizes the fact that when sysfs in in use, there is a /sys/dev/{block,char} that is kernel maintained that we can check. Signed-off-by: Evan Phoenix <[email protected]>
806e837
to
462e719
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, though I do wonder if we could just always use /sys/dev/...
.
|
||
// We don't bother with securejoin here because we create entry.Path | ||
// right above here, so we know it's safe. | ||
if _, err := os.Stat("/sys" + entry.Path); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just realised we probably want to do entry.Path = "/sys" + entry.Path
here, don't we? Otherwise systemd will get a /dev/...
path that doesn't exist?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So I’m not entirely clear on what systemd does with these values but I can tell you that in my setup, those /dev/char entries don’t exist and systemd still does the right thing. Maybe it parses the path and uses the major:minor directly?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nah, the /sys/dev/{block,char}/<major>:<minor>
is a directory, not a device file, so we can't use it for systemd.
Perhaps systemd has a set of built-in devices it can recognize. I'll take a look.
backport to 1.1 is done in #3620 |
This is a forward port of #3620
The original code depended on the origin filesystem to have /dev/{block,char} populated. This is done by udev normally and while is very common non-containerized systemd installs, it's very easy to start systemd in a container created by runc itself and not have /dev/{block,char} populated. When this occurs, the following error output is observed:
$ docker run hello-world
docker: Error response from daemon: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error reopening /dev/null inside container: open /dev/null: operation not permitted: unknown.
/dev/null can't be opened because it was not added to the deviceAllowList, as there was no /dev/char directory. The change here utilizes the fact that when sysfs in in use, there is a /sys/dev/{block,char} that is kernel maintained that we can check.
Signed-off-by: Evan Phoenix [email protected]