[0.6.0-UT] Detect aborted tests and add abort info to UT logs #526
gulsumgudukbay commented Aug 3, 2025
- The test script now detects aborted or crashed tests and logs them to both the individual JSON and HTML test reports.
- conftest.py provides thread-safe test abort tracking for parallel GPU execution, since run_single_gpu.py runs tests on multiple GPUs.
- Aborted tests are added to both the individual test reports and the final compiled report.
- The JSON and HTML formats are compatible with the pytest-html and pytest_html_merger tools, so aborted tests show up correctly in final_compiled_report.html and final_compiled_report.json.
- NOTE: Aborted tests are shown as FAILED in the HTML reports, because the readily available pytest HTML format has no separate class for aborts.
- NOTE: For each aborted test file, we also generate an xxx_last_running_test.json log file to make the abort easy to track. This file is also used to build the HTML and JSON logs for aborted tests, since pytest cannot produce them itself when a test aborts.
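
For reviewers who want a feel for the mechanism, here is a minimal sketch of the "last running test" idea described above. It is not the code in this PR: the class `LastRunningTestTracker`, the function `find_aborted_test`, and the record fields (`nodeid`, `status`, `outcome`, `longrepr`) are hypothetical names chosen for illustration. The assumption is that a marker file is written when a test starts and removed when it finishes, so a leftover file after the process exits points at the test that aborted.

```python
import json
import os
import threading
from pathlib import Path


class LastRunningTestTracker:
    """Hypothetical helper: remembers which test is currently running."""

    def __init__(self, log_path):
        self.log_path = Path(log_path)
        # Guards the marker file when several threads report test progress.
        self._lock = threading.Lock()

    def start(self, nodeid):
        """Record that `nodeid` has started running."""
        with self._lock:
            tmp = self.log_path.with_suffix(".tmp")
            tmp.write_text(json.dumps({"nodeid": nodeid, "status": "running"}))
            # os.replace is atomic, so a crash never leaves a half-written file.
            os.replace(tmp, self.log_path)

    def finish(self, nodeid):
        """Clear the marker once the test has completed normally."""
        with self._lock:
            if self.log_path.exists():
                self.log_path.unlink()


def find_aborted_test(log_path):
    """Return a FAILED-style record if a marker file was left behind, else None."""
    path = Path(log_path)
    if not path.exists():
        return None
    record = json.loads(path.read_text())
    # Assumed record shape; aborts are reported as failures, as noted above.
    return {
        "nodeid": record["nodeid"],
        "outcome": "failed",
        "longrepr": "Test aborted before pytest could report a result",
    }
```

In a conftest.py, `start` and `finish` could be called from pytest's `pytest_runtest_logstart` and `pytest_runtest_logfinish` hooks, and the runner script could call `find_aborted_test` after the pytest process exits. That wiring is an assumption about how such a tracker might be used, not a description of this change.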
I think we should use the version of this script that lives in the plugin repo: https://github.com/ROCm/rocm-jax/blob/master/jax_rocm_plugin/build/rocm/run_single_gpu.py. Let's try to maintain only test skips here.
As we discussed, for 0.6.0 we will keep this script in both repositories. In the 0.7.0 release we will migrate it completely to the plugin repo. I am creating a PR to update this script in that repo as well.