[0.6.0-UT] Detect aborted tests and add abort info to UT logs #526

Merged
gulsumgudukbay merged 3 commits into rocm-jaxlib-v0.6.0 from add_abort_info_to_ut_logs on Aug 8, 2025

Conversation

gulsumgudukbay

  • The test script now detects aborted or crashed tests and records them in both the individual JSON and HTML test reports.
  • conftest.py provides thread-safe test abort tracking for parallel GPU execution, since run_single_gpu.py runs tests on multiple GPUs.
  • Aborted tests are added to both the individual test reports and the final compiled report.
  • The JSON and HTML formats are compatible with the pytest-html and pytest_html_merger tools, so aborted tests show up correctly in final_compiled_report.html and final_compiled_report.json.
  • NOTE: Aborted tests are shown as FAILED in the HTML reports, because the readily available pytest HTML format has no separate class for aborts.
  • NOTE: To make aborts easy to track, we also generate an xxx_last_running_test.json log file for each aborted test file. This file is also used to build the HTML and JSON logs for aborted tests, since pytest cannot produce them itself when a test aborts.
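
For readers who have not opened the diff, the tracking idea in the bullets above can be sketched roughly as follows. This is a minimal illustration and not the code merged in this PR. The helper names (`mark_test_started`, `mark_test_finished`, `collect_abort`), the `logs` directory, and the record fields are assumptions made for the example. Only the two hook names, `pytest_runtest_logstart` and `pytest_runtest_logfinish`, are standard pytest hooks.

```python
import json
import os
import threading
import time

# Guards marker-file updates made from multiple threads of one process.
_LOCK = threading.Lock()


def _state_path(test_file, log_dir="logs"):
    # Hypothetical naming scheme: <test file stem>_last_running_test.json
    stem = os.path.splitext(os.path.basename(test_file))[0]
    return os.path.join(log_dir, f"{stem}_last_running_test.json")


def mark_test_started(nodeid, log_dir="logs"):
    """Record the test that is about to run, before it can crash."""
    test_file = nodeid.split("::")[0]
    record = {"test_file": test_file, "nodeid": nodeid, "start_time": time.time()}
    path = _state_path(test_file, log_dir)
    with _LOCK:
        os.makedirs(log_dir, exist_ok=True)
        tmp = f"{path}.{os.getpid()}.tmp"
        with open(tmp, "w") as f:
            json.dump(record, f)
        os.replace(tmp, path)  # atomic rename, so readers never see a partial file
    return path


def mark_test_finished(nodeid, log_dir="logs"):
    """Remove the marker once the test has returned control to pytest."""
    path = _state_path(nodeid.split("::")[0], log_dir)
    with _LOCK:
        if os.path.exists(path):
            os.remove(path)


def collect_abort(test_file, log_dir="logs"):
    """Runner side: a leftover marker means the process died mid-test."""
    path = _state_path(test_file, log_dir)
    if not os.path.exists(path):
        return None
    with open(path) as f:
        record = json.load(f)
    record["outcome"] = "failed"  # aborts are reported as FAILED
    record["aborted"] = True
    return record


# Standard pytest hooks; placed in conftest.py they are picked up by name.
def pytest_runtest_logstart(nodeid, location):
    mark_test_started(nodeid)


def pytest_runtest_logfinish(nodeid, location):
    mark_test_finished(nodeid)
```

In this sketch the runner would call `collect_abort("tests/foo_test.py")` after the pytest subprocess for that file exits. A non-`None` result identifies the test that was running when the process died, and the runner can append it to the per-file JSON and HTML reports as a failed entry. The lock only protects threads within one process; the atomic rename is what keeps the file consistent for a separate reader process.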

@gulsumgudukbay gulsumgudukbay requested a review from a team as a code owner August 3, 2025 05:37
@charleshofer
Collaborator

I think we should use the version of this script that lives in the plugin repo: https://github.com/ROCm/rocm-jax/blob/master/jax_rocm_plugin/build/rocm/run_single_gpu.py. Let's try to only maintain test skips here.

@gulsumgudukbay
Author

> I think we should use the version of this script that lives in the plugin repo: https://github.com/ROCm/rocm-jax/blob/master/jax_rocm_plugin/build/rocm/run_single_gpu.py. Let's try to only maintain test skips here.

As we discussed, for 0.6.0 we will keep this script in both repositories. In the 0.7.0 release we will migrate it completely to the plugin repo. I am creating a PR to update the script in that repo as well.

@gulsumgudukbay gulsumgudukbay merged commit 248cf43 into rocm-jaxlib-v0.6.0 Aug 8, 2025
@gulsumgudukbay gulsumgudukbay deleted the add_abort_info_to_ut_logs branch August 8, 2025 15:43