Skip to content

Conversation

N-R-K
Copy link
Contributor

@N-R-K N-R-K commented Mar 22, 2025

this turns mouninfo into a multicall binary which can also do
parallel unmounting. most mountinfo code is unchanged; only
notable change is find_mounts() now returns the number of mounts
found and also takes the "process" function as an argument; so
that is_mounted() can be implemented on top of it.

rc_unmount mostly follows the logic of previous do_unmount. some
notable changes:

  • rc_unmount is "lazy" when it comes to retrying failed
    unmounts. it will greedily keep running unmount as long as it
    can before it looks at the "waiting" queue.
  • rc_unmount will check if the mountpoint is still mounted or
    not when umount returns non-zero exit code. this is due to the
    fact that multiple umount calls might race to unmount a shared
    mount. so if umount fails but the mountpoint is no longer
    mounted, we assume success.
  • do_unmount used to fail if fuser did not find any pids using
    the mount. rc_unmount tries one more time; the rationale being
    that there's a gap between the umount call and the fuser call,
    and so whatever was using the mount before might have stopped
    using it now and so it's worth another attempt.

Closes: #662

Copy link
Contributor Author

@N-R-K N-R-K left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Depends on: #810

#!/bin/sh
# can be used for testing rc_unmount:
# # ./tools/manymounts.sh
# # rc_unmount -- -p '^/tmp/manymounts.*'
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if this should be kept or not. Included it so that others can test easily. Can remove if desired.

@N-R-K N-R-K mentioned this pull request Mar 22, 2025
@N-R-K N-R-K force-pushed the rc_unmount branch 3 times, most recently from 263d4d6 to b586bf6 Compare March 26, 2025 12:42
@N-R-K N-R-K marked this pull request as ready for review March 26, 2025 12:42
Comment on lines 629 to 631
tmps = rc_conf_value("rc_fuser_timeout");
if (!tmps || sscanf(tmps, "%d", &rc_fuser_timeout) != 1)
rc_fuser_timeout = 60;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rc.conf documents this option as following:

# This is how long fuser should wait for a remote server to respond. The
# default is 60 seconds, but  it can be adjusted here.
#rc_fuser_timeout=60

Which makes it seem like it only accepts integer seconds as value, so this change might be fine.

But worth noting that the older script simply passed this over to timeout(1) as is. Which meant it would accept stuff like "40m" or "16s" etc even though not explicitly documented.

Do we care about this? If yes: do we just parse similar to timeout(1) so that everything works as usual? Or emit a warning if sscanf fails instead of silently defaulting to 60 so the user knows something was messed up?

(I thought I had commented this already but apparently not).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using parse_duration from timeutils.h it can accept most of the suffixes now (s, m and h). d (for days) is missing, but that seems unlikely someone would set a timeout in days.

It only accepts integer though, whereas timeout(1) accepts fractional values.

@N-R-K N-R-K force-pushed the rc_unmount branch 2 times, most recently from d3c6a6c to e52d16f Compare April 10, 2025 01:12
Comment on lines +528 to +530
res = do_exec(&args);
if (res.pid > 0)
rc_waitpid(res.pid);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The first fuser command to query the pids has a timeout and is async, this one to send the signals on the other hand is synchronous and has no timeout (copied from the rc-mount.sh script).

But this doesn't seem to make sense (?) If we anticipate fuser might take a while then shouldn't we be timing out both of the calls rather than just one of them?

Also the only reason we're doing 2 commands (query pids + send signal) is to avoid killing ourselves if we happen to be using that mount. If we can figure out whether we're using it or not beforehand then it can be changed into a single fuser call which directly sends the signal (async, with timeout).

N-R-K added 2 commits April 15, 2025 04:41
openrc-run has a (hardcoded) 60 seconds timeout, it a service
fails to start/stop within that time, it will be SIGKILL-ed. so
a default 60 seconds fuser_timeout is kind of bad. change it to
20seconds which is somewhat more sensible.
this turns mouninfo into a multicall binary which can also do
parallel unmounting. most mountinfo code is unchanged; only
notable change is find_mounts() now returns the number of mounts
found and also takes the "process" function as an argument; so
that is_mounted() can be implemented on top of it.

rc_unmount mostly follows the logic of previous do_unmount. some
notable changes:

- rc_unmount is "lazy" when it comes to retrying failed
  unmounts. it will greedily keep running unmount as long as it
  can before it looks at the "waiting" queue.
- rc_unmount will check if the mountpoint is still mounted or
  not when umount returns non-zero exit code. this is due to the
  fact that multiple umount calls might race to unmount a shared
  mount. so if umount fails _but_ the mountpoint is no longer
  mounted, we assume success.
- do_unmount used to fail if fuser did not find any pids using
  the mount. rc_unmount tries one more time; the rationale being
  that there's a gap between the umount call and the fuser call,
  and so whatever was using the mount before might have stopped
  using it now and so it's worth another attempt.

Fixes: OpenRC#662
Closes: OpenRC#698
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Parallezing do_unmount
1 participant