
Synchronise integration tests on IPC run-finished signal instead of sleeps #5988

Draft
mostafaNazari702 wants to merge 9 commits into mochajs:main from mostafaNazari702:watch-test-sync

Conversation

@mostafaNazari702

@mostafaNazari702 mostafaNazari702 commented May 22, 2026

Contributor

PR Checklist

Overview

runMochaWatchAsync used a fixed 2-second delay between each "change", which was slow in CI.
This PR replaces those delays with an IPC mocha:watch:runFinished event so that tests wait for the run to complete instead of sleeping. The helper now always forks and exposes waitForRunFinished.
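
For readers unfamiliar with the pattern, here is a minimal sketch of waiting on an IPC message with a timeout. It is not the code in this PR: the message shape (`{ type: "mocha:watch:runFinished" }`), the 6000ms default, and the function signature are assumptions for illustration, and a plain EventEmitter stands in for the forked child process.

```javascript
const { EventEmitter } = require("node:events");

// Hypothetical helper: resolves when `child` emits a "message" event whose
// `type` matches, and rejects if none arrives within `timeoutMs`.
// The { type: ... } message shape is an assumption, not the PR's actual format.
function waitForRunFinished(child, timeoutMs = 6000, type = "mocha:watch:runFinished") {
  return new Promise((resolve, reject) => {
    const onMessage = (msg) => {
      if (msg && msg.type === type) {
        cleanup();
        resolve(msg);
      }
    };
    const timer = setTimeout(() => {
      cleanup();
      reject(new Error(`timed out after ${timeoutMs}ms waiting for watch run to finish`));
    }, timeoutMs);
    // Always remove the listener and timer so nothing leaks between runs.
    function cleanup() {
      clearTimeout(timer);
      child.off("message", onMessage);
    }
    child.on("message", onMessage);
  });
}

// Demo with a stand-in for the forked child (a real one would come from
// child_process.fork, which sets up the IPC channel).
const fakeChild = new EventEmitter();
waitForRunFinished(fakeChild, 1000).then((msg) => console.log("run finished:", msg.type));
fakeChild.emit("message", { type: "mocha:watch:runFinished" });
```

The listener is registered before the change is triggered, so a fast run cannot finish before the test starts listening.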

@codecov

codecov Bot commented May 22, 2026


Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 81.02%. Comparing base (6695fba) to head (9f670e3).

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #5988      +/-   ##
==========================================
+ Coverage   80.89%   81.02%   +0.12%     
==========================================
  Files          64       64              
  Lines        4602     4607       +5     
  Branches      976      997      +21     
==========================================
+ Hits         3723     3733      +10     
+ Misses        879      874       -5     


@JoshuaKGoldberg JoshuaKGoldberg left a comment

Member


This PR is meant to work on the watch tests, but looking at the two failures in CI, both are:

  1) --watch
       when enabled
         reruns test when file and directory paths under --watch-files are added:
     Error: runMochaWatchAsync: timed out after 6000ms waiting for watch run to finish
      at Timeout._onTimeout (test/integration/helpers.js:490:13)
      at listOnTimeout (node:internal/timers:588:17)
      at process.processTimers (node:internal/timers:523:7)

The strategy of waiting for explicit signals (rather than hardcoded timing) sounds good to me. But it looks like this PR doesn't fully fix the issues.

@JoshuaKGoldberg JoshuaKGoldberg added the status: waiting for author waiting on response from OP or other posters - more information needed label May 25, 2026
Comment thread lib/cli/watch-run.js Outdated
@mostafaNazari702

mostafaNazari702 commented May 25, 2026

Contributor Author

Not ready yet; please review once I re-request review.

@mostafaNazari702 mostafaNazari702 changed the title Synchronise integration tests on IPC run-finished signal instead of sleeps Synchronise integration tests on IPC run-finished signal instead of sleeps May 26, 2026
@mostafaNazari702 mostafaNazari702 changed the title Synchronise integration tests on IPC run-finished signal instead of sleeps Synchronise integration tests on IPC run-finished signal instead of sleeps May 27, 2026
@mostafaNazari702

mostafaNazari702 commented May 27, 2026

Contributor Author

The in-PR tests failed even after the commit "wait for chokidar to start watching before first run", which was supposed to fix the test timing issues. Then, when I pushed tracing instrumentation, the tests suddenly passed, twice. This is very mind-boggling.

We are dealing with a bug that does not want to be caught.

@JoshuaKGoldberg JoshuaKGoldberg marked this pull request as draft May 27, 2026 04:21
@mostafaNazari702

mostafaNazari702 commented May 27, 2026

Contributor Author

We (as in me) are now re-running the tests to confirm whether I have actually fixed it, so that I can remove the tracing instruments.

Run 1 (after committing): 79 successful checks, all passed and green.

(Josh commented during this specific stage).

Updates:

Re-run 1: 80 checks, all green and passed.

Re-run 2: 80 checks, all green and passed.

Re-run 3: The issue has appeared again:

[Tests / lint / lint]: Failing after 24s

[Tests / Test integration in all environments / test-node:integration with node.js 20.19.4 on ubuntu-latest]: Failing after 3m

Last edit to this message: evaluating whether I should give up or not. 6 hours of troubleshooting have been largely useless and returned only partially positive results.

@JoshuaKGoldberg

Member

Swell. Whenever you think it's ready, feel free to re-request my review & mark this as ready / not draft. Exciting!

@mostafaNazari702

mostafaNazari702 commented May 27, 2026

Contributor Author

> Swell. Whenever you think it's ready, feel free to re-request my review & mark this as ready / not draft. Exciting!

I recommend reading this comment first before continuing.

After instrumenting both sides and comparing passing vs. failing CI runs, the issue turned out not to be in our code or in chokidar itself, but a Linux inotify limitation.

inotify watches are per-inode and non-recursive, so when a new subdirectory is created, chokidar only starts watching it after receiving the parent's IN_CREATE event. Any file events inside the new subdirectory that occur before the child watch is installed are missed. Our test lands its touchFile exactly in that race window.
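
To make the ordering concrete, here is a toy model of the race described above. It is not inotify or chokidar code, only a simulation under stated assumptions: watches are non-recursive, and a new directory gets its watch only when the watcher processes the parent's create event (the `tick` step). The paths and the `simulate` function are hypothetical, and real watchers do more than this model shows.

```javascript
// Toy model of a non-recursive watcher. Not real inotify/chokidar behaviour,
// just the event ordering described in the comment above.
function simulate(steps) {
  const watched = new Set(["/w"]); // directories that currently have a watch
  const pending = [];              // events delivered but not yet processed
  const reported = [];             // file paths the watcher ends up reporting
  const parentOf = (p) => p.slice(0, p.lastIndexOf("/"));

  for (const step of steps) {
    if (step.op === "tick") {
      // The watcher drains its queue; new directories get a watch only now.
      for (const ev of pending.splice(0)) {
        if (ev.op === "mkdir") watched.add(ev.path);
        else reported.push(ev.path);
      }
    } else if (watched.has(parentOf(step.path))) {
      // An event is generated only if the parent directory is watched right now.
      pending.push(step);
    }
  }
  return reported;
}

// The touch lands before the watcher has installed a watch on /w/dir.
const racy = simulate([
  { op: "mkdir", path: "/w/dir" },
  { op: "touch", path: "/w/dir/a.js" },
  { op: "tick" },
]);

// The watcher gets a turn between the mkdir and the touch.
const safe = simulate([
  { op: "mkdir", path: "/w/dir" },
  { op: "tick" },
  { op: "touch", path: "/w/dir/a.js" },
  { op: "tick" },
]);

console.log(racy); // []
console.log(safe); // [ '/w/dir/a.js' ]
```

In this model the only difference between the two runs is whether the watcher processes the directory creation before the file is touched.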

This matches prior chokidar work (which made me give up after finding and going through it):

I also checked alternatives (@parcel/watcher, nsfw, Watchman), and they all have equivalent limitations or unresolved races, so this doesn't appear to be solvable at the userspace watcher level.

So I'm stopping further fix attempts here, as I have truly put in more effort than I should have. I don't mean that it doesn't deserve my time, but rather that I should have been smarter and googled the issue first to find out that it cannot be solved. The IPC handshake changes are still a real improvement and stabilize the rest of the watch suite; this one test just hits the kernel race window.

@mark-wiemer

Member

> a linux inotify limitation.

On main, these tests currently fail only on Windows (#5361), have we at least solved that? If we've solved that and introduced an issue unique to Linux (where these tests have passed consistently for the past 6 months), then I may have some ideas:

> Our test lands its touchFile exactly in that race window.

Can we, um, move outside of this window? That is, add a 500-millisecond sleep (or whatever works) back into the test to work around this? I know the PR title is currently "Synchronise integration tests on IPC run-finished signal instead of sleeps" but if we add "when possible" to the end I'm still a happy camper with passing tests.
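
A minimal sketch of what such a workaround could look like, assuming a hypothetical `afterSettle` helper (not part of the existing test helpers) and treating the 500ms figure as a placeholder:

```javascript
const { setTimeout: sleep } = require("node:timers/promises");

// Hypothetical helper: pause for `settleMs` so the watcher has time to install
// its watch, then run `action` and return its result. The delay value is a
// guess, not a measured bound.
async function afterSettle(action, settleMs = 500) {
  await sleep(settleMs);
  return action();
}

// Demo: the callback stands in for the test's file-touching step.
afterSettle(() => "touched", 50).then((result) => console.log(result));
```

A test could keep waiting on the run-finished signal afterwards, so the sleep would only cover the window in which the new directory is not yet watched.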

Of course, would this open us up to no longer catching failing scenarios? If so we should detail those and see what we can do.

(I'm almost back to 100% capacity from my disability, haven't looked at this code yet, but I definitely want to fix this bug!)

@mostafaNazari702

mostafaNazari702 commented May 28, 2026

Contributor Author

> On main, these tests currently fail only on Windows (#5361), have we at least solved that?

Windows is fixed. On this PR, the watch integration suite is green on windows-latest across Node 20/22/24 in every CI re-run (and locally), whereas main is the flaky one (#5361). The watch suite is also faster locally (44 seconds on my branch vs. 2 minutes on main), which was the original goal of #5714.

I have decided to drop the Linux-kernel-related issue and focus solely on #5714 and its goal. The kernel race will need a new, separate issue. The only remaining failure is intermittent on Ubuntu with Node 20: the "…file and directory paths under --watch-files are added" test.

@mark-wiemer mark-wiemer removed the status: waiting for author waiting on response from OP or other posters - more information needed label May 28, 2026
@mostafaNazari702

mostafaNazari702 commented May 29, 2026

Contributor Author

Fails again. Unfortunately, I cannot do anything more here.

@mark-wiemer

Member

Yes, please do feel free to step away if you're ever frustrated with a PR :)

@mark-wiemer mark-wiemer self-assigned this Jun 2, 2026
@mostafaNazari702

Contributor Author

> Yes, please do feel free to step away if you're ever frustrated with a PR :)

I wish you the best of luck with this PR, my friend. The best that can be done is basically extending the timeout. For now, a purely signal-based approach is not 100% feasible, but you are the expert!



Development

Successfully merging this pull request may close these issues.

🛠️ Repo: improve speed and reliability of watch mode tests

3 participants