Support using SwanLab for experiment tracking #98

Merged 13 commits into inclusionAI:main on Jun 16, 2025

xichengpro
Contributor

A significant number of users are unable to access wandb due to network restrictions and are more accustomed to using the localized tool SwanLab. To improve the project's usability and local compatibility, this PR adds support for integrating with SwanLab.

Collaborator

@garrett4wade left a comment

Thank you for your contribution! The integration of swanlab configuration is fine, but some logging utilities should be merged with existing ones using wandb.

@@ -99,10 +99,14 @@ python3 training/main_sync_ppo.py --help

We recommend using Weights & Biases (wandb) for monitoring. Run `wandb login` or set the `WANDB_API_KEY` environment variable. Set `wandb.mode=True` in your configuration to upload training statistics.

Alternatively, you can use SwanLab for monitoring. Run swanlab login or set the `SWANLAB_API_KEY` environment variable. Set `swanlab.mode=True` in your configuration to upload training statistics.
Collaborator

Could you add a hyperlink to SwanLab so that more people know about it?

Also, use "`" to quote `swanlab login`.

`swanlab.mode` should be set to `online` if it has the same API as wandb. The previous typo was fixed in #100.

Can you try to merge the two lines about wandb and swanlab somehow?

Contributor Author

Thank you for the feedback!

I've updated the documentation based on your suggestions:

  • Added official links to Weights & Biases and SwanLab for better user reference.
  • Used backticks to quote commands and parameters (e.g., `wandb login`, `swanlab login`).
  • Updated `swanlab.mode` usage to align with WandB's API convention, now using `"local"` and `"cloud"` instead of `True`.
  • Merged WandB and SwanLab descriptions into a single, concise statement for better readability.
  • Added a note about using `swanlab.mode="local"` if the server is unreachable.

pyproject.toml Outdated
@@ -61,6 +61,8 @@ dependencies = [
"colorlog",
"psutil",
"pynvml",
"swanlab==0.6.2",
Collaborator

">=" or "=="?

@@ -158,6 +158,20 @@ def log_wandb_tensorboard(data, step=None, summary_writer=None):
    for key, val in data.items():
        summary_writer.add_scalar(f"{key}", val, step)

def log_swanlab_tensorboard(data, step=None, summary_writer=None):
Collaborator

This function should be merged with log_wandb_tensorboard.

Contributor Author

Thank you for the feedback!
I've refactored the code and merged log_swanlab_tensorboard with log_wandb_tensorboard into a single function called log_swanlab_wandb_tensorboard.

@@ -447,6 +448,11 @@ async def run_step(self, buf_indices, sample, buffer_id: int):
        step=ctrl.step_info.global_step,
        summary_writer=self.summary_writer,
    )
    logging.log_swanlab_tensorboard(
Collaborator

Merge this with log_wandb_tensorboard.

Contributor Author

I've refactored the code and merged log_swanlab_tensorboard with log_wandb_tensorboard into a single function called log_swanlab_wandb_tensorboard.

requirements.txt Outdated
prettytable
swanlab==0.6.2
Collaborator

Please double-check the version requirement.

Contributor Author

I've modified the dependency to use the latest version automatically.

xichengpro and others added 6 commits June 12, 2025 14:28
- Added official links for better user reference
- Used backticks to quote commands and parameters
- Unified mode settings to use "online" / "cloud" convention
- Merged WandB and SwanLab descriptions into a single concise statement
- Added note on using `swanlab.mode="local"` when server connection is unavailable
…o log_swanlab_wandb_tensorboard

 - Unified logging logic for SwanLab, WandB, and TensorBoard to reduce code duplication
 - Updated SwanLab version in pyproject.toml
 - Updated SwanLab version in requirements.txt
@xichengpro force-pushed the main branch 2 times, most recently from 8474630 to f86d103 (June 12, 2025 11:45)
- Config now uses provided arguments first
- Falls back to reading from config.yaml if no input is given
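
The "provided arguments first, then config.yaml" behaviour in this commit message can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from the PR: the function name `resolve_swanlab_config`, the example keys, and the `load_fallback` callable (standing in for whatever reads `config.yaml`) are all hypothetical.

```python
from typing import Any, Callable, Dict, Optional


def resolve_swanlab_config(
    provided: Optional[Dict[str, Any]],
    load_fallback: Callable[[], Dict[str, Any]],
) -> Dict[str, Any]:
    """Return the explicitly provided config, else the fallback config.

    `load_fallback` is a hypothetical stand-in for reading config.yaml.
    It is only called when nothing was provided (None or an empty dict).
    """
    if provided:
        # Explicit arguments win; copy so the caller's dict is not shared.
        return dict(provided)
    return dict(load_fallback())


if __name__ == "__main__":
    from_file = lambda: {"project": "from-yaml"}
    print(resolve_swanlab_config({"project": "from-args"}, from_file))
    print(resolve_swanlab_config(None, from_file))
```

Passing the loader as a callable keeps the file read lazy, so the fallback file is never touched when arguments are supplied.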
@xichengpro
Contributor Author

Thanks for the feedback! I've updated the code based on your suggestion. Kindly review it again at your convenience.

Collaborator

@garrett4wade left a comment

Thank you again for your contribution! We are almost there.

As a kind reminder, please format the files such that the CI will pass:

pip install -e .
# clear any external packages installed locally
rm -rf ./sympy
rm -rf ./sglang
# Run formatting
isort . && black .

Comment on lines 144 to 145
_LATEST_WANDB_STEP = 0
_LATEST_SWANLAB_STEP = 0
Collaborator

These two step variables are the same. Keeping a single `_LATEST_LOG_STEP` will be fine.

Contributor Author

@xichengpro, Jun 13, 2025

Thanks for the feedback!
I've merged _LATEST_WANDB_STEP and _LATEST_SWANLAB_STEP into _LATEST_LOG_STEP.
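
For readers following the refactor, here is a minimal sketch of what a merged logger with one shared step counter might look like. It is not the code from this PR. The names `log_swanlab_wandb_tensorboard` and `_LATEST_LOG_STEP` and the `add_scalar` loop come from the thread; the `backends` parameter, the return value, and the step-fallback logic are assumptions made so the sketch runs without wandb or SwanLab installed.

```python
from typing import Callable, Dict, Optional, Sequence

# One shared step counter instead of one counter per tracking backend.
_LATEST_LOG_STEP = 0


def log_swanlab_wandb_tensorboard(
    data: Dict[str, float],
    step: Optional[int] = None,
    summary_writer=None,
    backends: Sequence[Callable[..., None]] = (),
) -> int:
    """Send one dict of scalars to every backend at the same step.

    `backends` is a hypothetical parameter for this sketch: each entry is
    called as `backend(data, step=step)`. Returns the step that was used.
    """
    global _LATEST_LOG_STEP
    if step is None:
        # No explicit step given: reuse the latest step seen so far.
        step = _LATEST_LOG_STEP
    else:
        _LATEST_LOG_STEP = max(_LATEST_LOG_STEP, step)

    for backend in backends:
        backend(data, step=step)

    # TensorBoard-style writer, mirroring the loop shown in the diff.
    if summary_writer is not None:
        for key, val in data.items():
            summary_writer.add_scalar(f"{key}", val, step)
    return step
```

In a real run one would presumably pass something like `backends=[wandb.log, swanlab.log]` after initializing both libraries; that wiring is an assumption here, since the thread does not show how the merged function calls them.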

@@ -97,12 +97,15 @@ python3 training/main_sync_ppo.py --help

## Monitoring the Training Process

- We recommend using Weights & Biases (wandb) for monitoring. Run `wandb login` or set the `WANDB_API_KEY` environment variable. Set `wandb.mode=online` in your configuration to upload training statistics.
+ We recommend using [Weights & Biases (wandb)](https://github.com/wandb/wandb) or [SwanLab](https://github.com/SwanHubX/SwanLab) for monitoring—run `wandb login` or `swanlab login`, or set the corresponding environment variable API key (`WANDB_API_KEY` or `SWANLAB_API_KEY`). Set `wandb.mode="online"` or `swanlab.mode="cloud"` in your configuration to upload training statistics. If you cannot connect to the server, you can also use `swanlab.mode="local"` to save data locally without uploading.
Collaborator

Please mention wandb.mode=offline together with swanlab.mode=local.

Contributor Author

Thanks for the feedback!
I've added a note on using `wandb.mode="offline"` together with `swanlab.mode="local"`.
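
The mode pairing discussed in this thread can be summarized with a small sketch. The mode strings (`online`/`offline` for wandb, `cloud`/`local` for SwanLab) are the ones named in the comments above; the helper `tracker_modes` and its `server_reachable` flag are hypothetical and exist only to illustrate the mapping.

```python
from typing import Dict

# Mode strings as named in this thread. Other modes the libraries may
# support are left out because the thread does not mention them.
_MODES = {
    "wandb": {"upload": "online", "local_only": "offline"},
    "swanlab": {"upload": "cloud", "local_only": "local"},
}


def tracker_modes(server_reachable: bool) -> Dict[str, str]:
    """Pick the mode value for each tracker (hypothetical helper)."""
    key = "upload" if server_reachable else "local_only"
    return {f"{name}.mode": modes[key] for name, modes in _MODES.items()}


if __name__ == "__main__":
    print(tracker_modes(server_reachable=True))
    print(tracker_modes(server_reachable=False))
```

The two trackers use different words for the same idea, which is why the reviewer asked for the offline options to be documented side by side.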

@xichengpro
Contributor Author

> Thank you again for your contribution! We are almost there.
>
> As a kind reminder, please format the files such that the CI will pass:
>
> pip install -e .
> # clear any external packages installed locally
> rm -rf ./sympy
> rm -rf ./sglang
> # Run formatting
> isort . && black .

Thank you for the reminder!

I've formatted the code using isort and black as requested, and all files should now conform to the project's style guidelines. The CI should now pass successfully.

Let me know if there's anything else I can improve!

@garrett4wade
Collaborator

@GurrenLagann97 Can you provide another review?

Collaborator

@GurrenLagann97 left a comment

Seems perfect

@garrett4wade merged commit bb14f02 into inclusionAI:main on Jun 16, 2025
1 check passed
@Zeyi-Lin

🎉😄Thanks for your contribution to the SwanLab and AReaL community. @xichengpro

antoinegg1 pushed a commit that referenced this pull request Jul 7, 2025
* Support using SwanLab for experiment tracking

* docs: improve WandB and SwanLab integration documentation
- Added official links for better user reference
- Used backticks to quote commands and parameters
- Unified mode settings to use "online" / "cloud" convention
- Merged WandB and SwanLab descriptions into a single concise statement
- Added note on using `swanlab.mode="local"` when server connection is unavailable

* refactor: update default value of api_key

* fix: correct help description from WandB to SwanLab in SwanLabConfig

* refactor: merge log_swanlab_tensorboard and log_wandb_tensorboard into log_swanlab_wandb_tensorboard

 - Unified logging logic for SwanLab, WandB, and TensorBoard to reduce code duplication

* chore: update swanlab version in dependency config files

 - Updated SwanLab version in pyproject.toml
 - Updated SwanLab version in requirements.txt

* refactor: enhance SwanLab config handling for logging purposes
- Config now uses provided arguments first
- Falls back to reading from config.yaml if no input is given

* docs: add note on using  when server connection is unavailable

* refactor: merge _LATEST_WANDB_STEP and _LATEST_SWANLAB_STEP into _LATEST_LOG_STEP

* Format code with black and isort

* chore: update swanlab version in dependency config files
- Updated SwanLab version in requirements.txt

* refactor: rename swanlab_wandb_data to log_data

---------

Co-authored-by: dubingnan <[email protected]>
5 participants