Fix invalid "Runner" export in __all__ and incorrect metric aggregation#660
Merged
Merged
Conversation
1. Remove 'Runner' from __all__ in rllm/__init__.py since it doesn't exist 2. Fix reduce_metrics_by_trajectory_name to collect all rewards instead of replacing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Fixes #659
Summary
This PR fixes two issues related to package exports and metric aggregation:
"Runner"entry fromrllm/__init__.py::__all__.reduce_metrics_by_trajectory_name()so that all rewards for a trajectory name are collected instead of overwriting previous values.These changes improve API consistency and ensure metrics are computed from the complete set of trajectory rewards.
Problem
Invalid Public Export
rllm/__init__.pyexposes"Runner"through__all__, but no corresponding symbol is imported or defined. This results in an inconsistent public API and can mislead users relying on exported package symbols.Incorrect Metric Aggregation
reduce_metrics_by_trajectory_name()replaces previously stored rewards when multiple trajectories share the same name.Current behavior:
This causes only the last reward to be retained.
Expected behavior:
All rewards should be collected so that downstream statistics are calculated correctly.
Changes
rllm/init.py
"Runner"from__all__rllm/trainer/algorithms/metrics.py
Nonerewards during aggregationBefore
For trajectories:
[ {"trajectory_name": "task_a", "reward": 0.5}, {"trajectory_name": "task_a", "reward": 0.8}, ]Stored result:
{"task_a": 0.8}After
Stored result:
{"task_a": [0.5, 0.8]}This allows metrics to be computed using the full reward distribution.
Type of Change
Testing
Nonerewards do not break aggregation logic.Impact