check_process_leak overhaul #5739

Merged on Feb 3, 2022 (2 commits)

Conversation

crusaderky
Collaborator

@crusaderky crusaderky commented Jan 31, 2022

Two tests in xarray are decorated with @gen_cluster(client=True) and therefore do not spawn any processes, yet they are very flaky because they repeatedly fail on check_process_leak. I think the problem is caused by unrelated subprocesses, spawned by previous tests, that don't respond to SIGTERM.

xref: pydata/xarray#6211
log: https://github.com/pydata/xarray/runs/4990065672
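
To illustrate the failure mode described above, here is a minimal sketch of escalating from a polite termination request to a forced kill. It is not the code merged in this PR. The helper name `reap_children` and its timeout parameters are hypothetical. It assumes objects with the `multiprocessing.Process` interface (`is_alive`, `terminate`, `kill`, `join`).

```python
from typing import Iterable, List


def reap_children(
    children: Iterable, term_timeout: float = 2.0, kill_timeout: float = 2.0
) -> List:
    """Terminate process-like objects, escalating to kill() if needed.

    Hypothetical helper: returns the children still alive after both attempts.
    """
    children = list(children)

    # Polite request first (SIGTERM on POSIX).
    for proc in children:
        if proc.is_alive():
            proc.terminate()
    for proc in children:
        proc.join(term_timeout)

    # Escalate for anything that ignored the polite request (SIGKILL on POSIX).
    stubborn = [proc for proc in children if proc.is_alive()]
    for proc in stubborn:
        proc.kill()
    for proc in stubborn:
        proc.join(kill_timeout)

    return [proc for proc in stubborn if proc.is_alive()]
```

In a test fixture this could be called as `reap_children(multiprocessing.active_children())`, so that a subprocess leaked by an earlier test does not fail a later, unrelated one.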

@crusaderky crusaderky self-assigned this Jan 31, 2022
@crusaderky crusaderky closed this Feb 1, 2022
@crusaderky crusaderky reopened this Feb 1, 2022
@crusaderky crusaderky marked this pull request as ready for review February 1, 2022 14:52
@crusaderky
Collaborator Author

All test failures are unrelated to this PR.

Collaborator

@gjoseph92 gjoseph92 left a comment

This is a much, much better implementation than what was there before, thank you!

"""Wait until timeout for mp_context.active_children() to terminate.
Return list of active subprocesses after the timeout expired.
"""
t0 = time()
Collaborator

Tiny super nit: monotonic (or perhaps perf_counter) would technically be more appropriate here I think. I'm sure it will never ever make a difference in this case, and there are plenty of other more important places where we already use non-monotonic time and shouldn't. I'd just like to start being more careful about it.
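
As a rough illustration of this suggestion, the sketch below polls for active children against a deadline computed with `time.monotonic`, which is not affected by system clock adjustments. The function name `wait_active_children` and the `poll_interval` and `get_children` parameters are assumptions made for the example. This is not the code under review.

```python
import multiprocessing
from time import monotonic, sleep


def wait_active_children(
    timeout: float,
    poll_interval: float = 0.1,
    get_children=multiprocessing.active_children,  # injectable for testing
):
    """Wait up to `timeout` seconds for all child processes to exit.

    Returns the list of children still active when the wait ends.
    """
    # A monotonic deadline cannot jump if the wall clock is adjusted.
    deadline = monotonic() + timeout
    while True:
        children = get_children()
        if not children or monotonic() >= deadline:
            return children
        sleep(poll_interval)
```

With `time.time` the same loop would behave identically in almost every run. The difference only shows up if the wall clock is changed while the loop is waiting.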

Collaborator Author

Something for a broader discussion maybe? If we want to switch to monotonic, it should be done consistently everywhere, e.g. replace metrics.time. As usual, Windows is a source of pain, so we need to tread carefully there.
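
Because clock behaviour differs between platforms, one possible starting point for that discussion is to inspect what each clock reports on a given machine. This is a small sketch using the standard library's `time.get_clock_info`. The helper name `clock_summary` is hypothetical.

```python
import time


def clock_summary(names=("time", "monotonic", "perf_counter")):
    """Return the properties Python reports for each named clock."""
    summary = {}
    for name in names:
        info = time.get_clock_info(name)
        summary[name] = {
            "monotonic": info.monotonic,    # True if the clock cannot go backward
            "adjustable": info.adjustable,  # True if it can be changed, e.g. by NTP
            "resolution": info.resolution,  # resolution in seconds (platform dependent)
            "implementation": info.implementation,  # underlying C function
        }
    return summary


for name, props in clock_summary().items():
    print(name, props)
```

Running this on each supported operating system would show how the reported resolution and underlying implementation vary across platforms.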

Member

xref #4528

@crusaderky crusaderky merged commit 834421b into dask:main Feb 3, 2022
@crusaderky crusaderky deleted the process_leak_kill branch February 3, 2022 12:28
@github-actions
Contributor

github-actions bot commented Feb 3, 2022

Unit Test Results

16 files, 16 suites, 8h 27m 10s ⏱️
2 582 tests: 2 500 ✔️ passed, 80 💤 skipped, 2 failed
20 554 runs: 19 127 ✔️ passed, 1 425 💤 skipped, 2 failed

For more details on these failures, see this check.

Results for commit ff9ad76.
