Skip to content

Add integration tests #338

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 6 commits into from
Aug 19, 2025
Merged

Add integration tests #338

merged 6 commits into from
Aug 19, 2025

Conversation

Kovbo
Copy link
Collaborator

@Kovbo Kovbo commented Aug 15, 2025

Adding a script to run integration tests.
The script defines a list of notebooks for testing, and starts a SkyPilot cluster executing notebooks one by one.
No need to SSH into the cluster, just run scripts/launch-test-cluster.sh --no-pull and it will execute tests, print results in the console, and terminate the cluster.
Currently, we test the most popular notebooks from the examples folder.

  • I had to use openpipe-art==0.4.7 in all examples because it could not run notebooks programatically with the old version. To keep it compatible with Google Colab, we probably need to release a new art version that supports vLLM 0.9.2, and pin Colab examples to the old vLLM version. Can we?

  • I was constantly running into the RuntimeError: CUDA error: an illegal memory access was encountered. Turns out, it happens when you start different vLLM engines one by one. For example, in one notebook, we build a vLLM engine with the default configuration, but in the new run, we provide engine_args=enforce_eager=True. It does not fully clean the memory, and the new engine fails.
    Workeround: set model._internal_config to None during tests. Not ideal since we cannot test configs.

  • It overrides some notebook variables (to run only one training step for faster execution) and changes the project name so logs are recorded under the “Tester” project on W&B.

  • Added --no-pull args to both launch-test-cluster.sh and launch-cluster.sh to deploy the current version without reverting to the main branch.

  • Pytest caught some silent type errors that were not visible during regular execution. Minor changes in src/art/trajectories.py and src/art/unsloth/train.py

  • I’m still occasionally hitting the following error during test execution: RuntimeError: Sleep mode can only be used for one instance per process. Not sure why it’s happening. Can we disable sleep mode for tests?

@Kovbo Kovbo requested a review from bradhilton August 15, 2025 21:17
@Kovbo Kovbo marked this pull request as ready for review August 15, 2025 21:26
@bradhilton bradhilton merged commit 696c230 into main Aug 19, 2025
2 checks passed
@bradhilton bradhilton deleted the add-integration-tests branch August 19, 2025 01:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants