Refactor Legacy and DRTulu parsers to use tool_definitions instead of tool_actors
The parsers previously had a code path using Ray actor handles (tool_actors) to
get tool names and parameters. Since the architecture now uses EnvironmentPools,
this was dead code for Legacy and completely broken for DRTulu (which raised an
error at startup).
Changes:
- Legacy parser: remove tool_actors path, use only OpenAI-format tool_definitions
- DRTulu parser: accept tool_definitions + explicit stop_sequences instead of
tool_actors. Stop sequences are fetched from the pool during initialization.
- create_tool_parser: replace tool_actors param with stop_sequences
- grpo_fast: fetch stop_strings from pool for dr_tulu, remove blocking error
- vllm_utils: plumb tool_stop_sequences through to LLMRayActor parser init
- Remove ray import from parsers.py (no longer needed)
- Update tests to use tool_definitions instead of mock Ray actors
Co-authored-by: Cursor <cursoragent@cursor.com>
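
For readers unfamiliar with the format, the sketch below illustrates how tool names and parameter names can be read directly from OpenAI-format `tool_definitions`, with no Ray actor handles involved. It is not the repository's code: the helper `tool_names_and_params`, the `search` tool, and the `</tool>` stop sequence are hypothetical, chosen only to show the shape of the data the refactored parsers receive.

```python
from typing import Any


def tool_names_and_params(tool_definitions: list[dict[str, Any]]) -> dict[str, list[str]]:
    """Map each tool name to its parameter names (hypothetical helper)."""
    result: dict[str, list[str]] = {}
    for definition in tool_definitions:
        function = definition["function"]
        # "parameters" is a JSON Schema object; its "properties" keys are the parameter names.
        properties = function.get("parameters", {}).get("properties", {})
        result[function["name"]] = list(properties)
    return result


# An illustrative OpenAI-format tool definition (assumed, not taken from the repo).
tool_definitions = [
    {
        "type": "function",
        "function": {
            "name": "search",
            "description": "Search the web.",
            "parameters": {
                "type": "object",
                "properties": {"query": {"type": "string"}},
                "required": ["query"],
            },
        },
    }
]

# Stop sequences are passed explicitly alongside the definitions (value is illustrative).
stop_sequences = ["</tool>"]

print(tool_names_and_params(tool_definitions))
```

Because the definitions are plain dictionaries, a parser built this way can be constructed and tested without starting Ray, which matches the test change described above.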
CHANGELOG.md: 1 addition & 0 deletions
@@ -22,6 +22,7 @@ All notable changes to this project will be documented in this file.
- Documentation and runtime warning for `dataset_mixer_list` format (float=proportion, int=count) (https://github.com/allenai/open-instruct/pull/1434).
### Changed
+- Refactor Legacy and DRTulu tool parsers to use OpenAI-format `tool_definitions` instead of Ray `tool_actors`. Removes `import ray` from `parsers.py`, fixes DRTulu parser which was broken after the pool refactor, and fixes `--tool_parser_type` typo in dr_tulu debug script (https://github.com/allenai/open-instruct/pull/1491).
- Replaces lambda collators with a "single_example_collator" (https://github.com/allenai/open-instruct/pull/1472).
- Clarified `activation_memory_budget` guidance in DPO utils with a practical default (`0.5`) and memory/speed tradeoff notes (https://github.com/allenai/open-instruct/pull/1460).
- Let TransformerTrainModule handle FSDP parallelism instead of manual application in DPO (https://github.com/allenai/open-instruct/pull/1458).