Refactor Legacy and DRTulu parsers to use tool_definitions instead of tool_actors#1491
Refactor Legacy and DRTulu parsers to use tool_definitions instead of tool_actors#1491
Conversation
Summary of ChangesHello @hamishivi, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request significantly refactors the tool parsing mechanism within the system, moving away from direct Ray actor dependencies towards a more standardized approach using OpenAI-format tool definitions and explicit stop sequences. This change streamlines the architecture, particularly for the Legacy and DRTulu parsers, and resolves a critical issue that previously blocked the use of the DRTulu parser. The update ensures that tool information is consistently handled across different parser types, improving maintainability and enabling future enhancements. Highlights
🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console. Changelog
Activity
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Code Review
This pull request is a solid refactoring that removes the dependency on Ray actors from the Legacy and DRTulu parsers, opting to use tool_definitions instead. This change significantly improves modularity and makes the code easier to test, as demonstrated by the cleaner, mock-free test suite. The implementation is consistent and well-executed across all modified files. I have one suggestion to further improve performance by parallelizing some of the Ray calls. Overall, this is an excellent enhancement.
3f2791c to
b0aefc3
Compare
|
/gemini review |
There was a problem hiding this comment.
Code Review
The pull request successfully refactors the Legacy and DRTulu tool parsers to use OpenAI-format tool_definitions instead of direct Ray actor handles. This improves the modularity of the system and fixes the previously broken dr_tulu parser by correctly plumbing stop sequences from the environment pools to the vLLM engines. The removal of the Ray dependency from parsers.py is a good cleanup. I have provided a suggestion to optimize the Ray metadata fetching in grpo_fast.py to improve startup performance by avoiding sequential blocking calls.
b0aefc3 to
4341085
Compare
|
/gemini review |
There was a problem hiding this comment.
Code Review
The pull request successfully refactors the Legacy and DRTulu tool parsers to use OpenAI-format tool_definitions instead of direct Ray tool_actors. This change improves modularity by removing the direct dependency on Ray within the parser logic and fixes the previously broken DRTuluToolParser. The plumbing of stop_sequences from the EnvironmentPool through to the LLMRayActor is well-implemented, and the parallelization of tool definition fetching in grpo_fast.py is a good performance optimization. The tests have been appropriately updated to reflect these architectural changes.
4c994e7 to
16a8db0
Compare
… tool_actors The parsers previously had a code path using Ray actor handles (tool_actors) to get tool names and parameters. Since the architecture now uses EnvironmentPools, this was dead code for Legacy and completely broken for DRTulu (which raised an error at startup). Changes: - Legacy parser: remove tool_actors path, use only OpenAI-format tool_definitions - DRTulu parser: accept tool_definitions + explicit stop_sequences instead of tool_actors. Stop sequences are fetched from the pool during initialization. - create_tool_parser: replace tool_actors param with stop_sequences - grpo_fast: fetch stop_strings from pool for dr_tulu, remove blocking error - vllm_utils: plumb tool_stop_sequences through to LLMRayActor parser init - Remove ray import from parsers.py (no longer needed) - Update tests to use tool_definitions instead of mock Ray actors Co-authored-by: Cursor <cursoragent@cursor.com>
16a8db0 to
1b8b9bb
Compare
|
/gemini review |
There was a problem hiding this comment.
Code Review
This is a great refactoring that simplifies the tool parsers by removing the dependency on Ray actors and instead using OpenAI-style tool_definitions. The changes are consistent across the codebase, and the tests have been thoroughly updated to reflect this, which is excellent. The fix for the dr_tulu parser and the performance improvement in initialize_tools_and_envs are also valuable additions. I have one suggestion to improve the robustness of resource handling.
|
Ran and checked that the existing tool and debug scripts for dr tulu, legacy, hermes all worked. |
natolambert
left a comment
There was a problem hiding this comment.
This one seemed just like a good add!
Summary
tool_actorscode path and uses only OpenAI-formattool_definitionsto derive tool names and parameter names for<tool_name>content</tool_name>tag parsing.tool_actorsdependency withtool_definitions+ explicitstop_sequences. Stop sequences are now fetched from the EnvironmentPool during initialization ingrpo_fast.py, then plumbed throughcreate_vllm_engines→LLMRayActor→create_tool_parser.import rayfromparsers.pysince neither parser needs direct Ray actor access anymore.grpo_fast.pythat prevented--tool_parser_type dr_tulufrom being used.tool_definitionsinstead of mock Ray actors.Context
After the EnvironmentPool refactor (#1479), the parsers still had code paths using
tool_actors(Ray actor handles). For the Legacy parser this was dead code (it already had atool_definitionspath). For the DRTulu parser it was completely broken — there was a hard error at startup blockingdr_tuluparser usage.Test plan
tool_definitionsonly)tool_definitions+stop_sequences)create_tool_parserfactory tests pass for all parser typesscripts/train/debug/tools/dr_tulu_parser_debug.sh) on GPUscripts/train/debug/tools/legacy_parser_debug.sh) on GPUMade with Cursor