Fix invalid "Runner" export in __all__ and incorrect metric aggregation #660

Merged
jeffreysijuntan merged 6 commits into rllm-org:main from Muhammad-Ikhwan-Fathulloh:main
Jun 17, 2026
Conversation

@Muhammad-Ikhwan-Fathulloh

Contributor

Fixes #659

Summary

This PR fixes two issues related to package exports and metric aggregation:

  1. Removes the invalid "Runner" entry from rllm/__init__.py::__all__.
  2. Fixes reward aggregation in reduce_metrics_by_trajectory_name() so that all rewards for a trajectory name are collected instead of overwriting previous values.

These changes improve API consistency and ensure metrics are computed from the complete set of trajectory rewards.

Problem

Invalid Public Export

rllm/__init__.py exposes "Runner" through __all__, but no corresponding symbol is imported or defined. This results in an inconsistent public API and can mislead users relying on exported package symbols.
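
The effect of a stale __all__ entry can be reproduced without rllm itself. The sketch below builds a throwaway module (fake_pkg and its Agent class are hypothetical stand-ins, and find_missing_exports is an illustrative helper, not part of rllm) and shows that a star import fails on the undefined name. It assumes the package has no module-level __getattr__ that would resolve the name lazily.

```python
import sys
import types


def find_missing_exports(module):
    """Return names listed in module.__all__ that the module does not define."""
    return [name for name in getattr(module, "__all__", []) if not hasattr(module, name)]


# Hypothetical stand-in for a package whose __all__ lists an undefined name.
fake_pkg = types.ModuleType("fake_pkg")
fake_pkg.Agent = type("Agent", (), {})
fake_pkg.__all__ = ["Agent", "Runner"]  # "Runner" is never defined
sys.modules["fake_pkg"] = fake_pkg

missing = find_missing_exports(fake_pkg)

# A star import looks up every name in __all__, so it fails on "Runner".
try:
    exec("from fake_pkg import *", {})
    star_import_error = None
except AttributeError as exc:
    star_import_error = exc
```

A check like find_missing_exports could run in a unit test so that a stale export is caught before release.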

Incorrect Metric Aggregation

reduce_metrics_by_trajectory_name() replaces previously stored rewards when multiple trajectories share the same name.

Current behavior:

trajectory_rewards[name] = reward

This causes only the last reward to be retained.

Expected behavior:

trajectory_rewards[name].append(reward)

All rewards should be collected so that downstream statistics are calculated correctly.
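
The difference between the two statements can be seen in isolation. This is a minimal sketch, not the actual body of reduce_metrics_by_trajectory_name(); the helper names and the (name, reward) pair input are assumptions made for illustration.

```python
from collections import defaultdict


def collect_overwriting(pairs):
    """Old behavior: each assignment replaces the previous reward for a name."""
    trajectory_rewards = {}
    for name, reward in pairs:
        trajectory_rewards[name] = reward
    return trajectory_rewards


def collect_appending(pairs):
    """Fixed behavior: every reward for a name is kept, in arrival order."""
    trajectory_rewards = defaultdict(list)
    for name, reward in pairs:
        trajectory_rewards[name].append(reward)
    return dict(trajectory_rewards)


pairs = [("task_a", 0.5), ("task_a", 0.8), ("task_b", 1.0)]
overwritten = collect_overwriting(pairs)  # {"task_a": 0.8, "task_b": 1.0}
collected = collect_appending(pairs)      # {"task_a": [0.5, 0.8], "task_b": [1.0]}
```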

Changes

rllm/__init__.py

  • Remove "Runner" from __all__
  • Ensure exported symbols accurately reflect available imports

rllm/trainer/algorithms/metrics.py

  • Aggregate rewards into lists instead of overwriting values
  • Preserve all rewards associated with a trajectory name
  • Add safe handling for None rewards during aggregation
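
The description does not say how None rewards are treated. One plausible reading is that they are skipped so they never enter the statistics. The sketch below assumes exactly that, with trajectory records shaped like the dictionaries used elsewhere in this description; collect_rewards_skipping_none is a hypothetical name, and the merged code may behave differently.

```python
from collections import defaultdict


def collect_rewards_skipping_none(trajectories):
    """Group rewards by trajectory name, ignoring entries whose reward is None.

    Assumption: None means "no reward recorded" and is skipped rather than
    treated as 0.0. The merged code may handle it differently.
    """
    rewards_by_name = defaultdict(list)
    for traj in trajectories:
        reward = traj.get("reward")
        if reward is None:
            continue
        rewards_by_name[traj["trajectory_name"]].append(float(reward))
    return dict(rewards_by_name)


trajectories = [
    {"trajectory_name": "task_a", "reward": 0.5},
    {"trajectory_name": "task_a", "reward": None},
    {"trajectory_name": "task_a", "reward": 0.8},
    {"trajectory_name": "task_b", "reward": None},
]
rewards = collect_rewards_skipping_none(trajectories)
```

Under this assumption a name whose rewards are all None (task_b here) produces no entry at all, so downstream code never averages an empty list.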

Before

For trajectories:

[
    {"trajectory_name": "task_a", "reward": 0.5},
    {"trajectory_name": "task_a", "reward": 0.8},
]

Stored result:

{"task_a": 0.8}

After

Stored result:

{"task_a": [0.5, 0.8]}

This allows metrics to be computed using the full reward distribution.
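
As a rough illustration of what the full distribution makes possible, the sketch below derives a few per-name statistics from the collected lists. The summarize_rewards helper and the "<name>/reward_mean" key format are hypothetical and are not taken from rllm.

```python
from statistics import mean


def summarize_rewards(rewards_by_name):
    """Compute per-name reward statistics from lists of rewards.

    The "<name>/reward_mean" key format is hypothetical, chosen for illustration.
    """
    metrics = {}
    for name, rewards in rewards_by_name.items():
        if not rewards:
            continue  # nothing to summarize for this name
        metrics[f"{name}/reward_mean"] = mean(rewards)
        metrics[f"{name}/reward_min"] = min(rewards)
        metrics[f"{name}/reward_max"] = max(rewards)
    return metrics


metrics = summarize_rewards({"task_a": [0.5, 0.8]})
```

With the overwritten value, only 0.8 would be available for task_a. With the list, the mean comes out near 0.65 and the minimum of 0.5 is no longer hidden.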

Type of Change

  • Bug fix (non-breaking change that fixes an issue)
  • New feature
  • Breaking change
  • Documentation update
  • Refactoring

Testing

  • Verified package exports after removing invalid symbol.
  • Verified reward aggregation preserves all rewards for identical trajectory names.
  • Verified metric reduction works correctly with multiple trajectories.
  • Verified None rewards do not break aggregation logic.

Impact

  • Fixes inconsistent package exports.
  • Prevents loss of reward information during metric aggregation.
  • Produces more accurate training and evaluation metrics.
  • Maintains backward compatibility.

@jeffreysijuntan merged commit e2d8d10 into rllm-org:main on Jun 17, 2026
2 checks passed
Development

Successfully merging this pull request may close these issues.

[Bug] Invalid "Runner" Export in __all__ and Incorrect Metric Aggregation
