fix(adk): fix ToolCall Arguments format in GenTransferMessages #741

Open
shenysun wants to merge 16 commits into cloudwego:alpha/08 from shenysun:fix/supervisor_transfer_messages

shenysun commented Feb 3, 2026

What type of PR is this?

fix

Check the PR title.

  • This PR title matches the format: <type>(optional scope): <description>
  • The description of this PR title is user-oriented and clear enough for others to understand.
  • Attach the PR updating the user documentation if the current PR requires user awareness at the usage level. User docs repo

(Optional) Translate the PR title into Chinese.

修复 GenTransferMessages 中的 ToolCall Arguments 格式问题

(Optional) More detailed description for this PR(en: English/zh: Chinese).

This PR fixes the ToolCall Arguments format in the GenTransferMessages function. Previously, the Arguments field directly used the destination agent name as a plain string, which doesn't conform to the standard JSON object format for tool calls. This would cause tool call parsing to fail in supervisor scenarios when transferring between agents.

The fix changes the Arguments from:

Arguments: destAgentName

To the proper JSON object format:

Arguments: '{"agent_name": "destAgentName"}'
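A minimal sketch of the change described above. The types `functionCall` and `toolCall`, the helper `buildTransferCall`, and the literal tool name `"transfer_to_agent"` are hypothetical stand-ins for the framework's `schema` types and `TransferToAgentToolName`; only `encoding/json` from the standard library is assumed.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// functionCall and toolCall are hypothetical stand-ins for the framework's
// schema types, reduced to the fields discussed in this PR.
type functionCall struct {
	Name      string
	Arguments string
}

type toolCall struct {
	ID       string
	Function functionCall
}

// buildTransferCall encodes the destination agent name as a JSON object
// instead of passing the raw name through as Arguments.
func buildTransferCall(id, destAgentName string) (toolCall, error) {
	argsJSON, err := json.Marshal(map[string]string{"agent_name": destAgentName})
	if err != nil {
		return toolCall{}, err
	}
	return toolCall{
		ID: id,
		// "transfer_to_agent" is assumed here for illustration only.
		Function: functionCall{Name: "transfer_to_agent", Arguments: string(argsJSON)},
	}, nil
}

func main() {
	tc, err := buildTransferCall("call-1", "agent_b")
	if err != nil {
		panic(err)
	}
	fmt.Println(tc.Function.Arguments) // {"agent_name":"agent_b"}
}
```

Unlike the one-line diff in this PR, the sketch returns the marshaling error rather than discarding it; for a `map[string]string` that error is not expected in practice, so this is a style choice.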

Files Changed:

  • adk/utils.go: Added JSON marshaling for tool call arguments
  • adk/utils_test.go: Added comprehensive unit tests for GenTransferMessages

Testing:

  • Added TestGenTransferMessages: Verify basic functionality including tool call structure, JSON argument format, and message content
  • Added TestGenTransferMessages_EmptyAgentName: Test edge case with empty agent name
  • Added TestGenTransferMessages_SpecialCharactersInAgentName: Validate handling of special characters in agent names
  • Run supervisor transfer scenarios to verify correct parameter parsing
  • All tests pass successfully ✅
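The empty-name and special-character cases listed above can be illustrated with a round trip through `encoding/json`. This is a sketch, not the tests added in `adk/utils_test.go`; `transferArgs` and `roundTrip` are hypothetical names introduced for the example.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// transferArgs mirrors the {"agent_name": ...} object shape from the PR.
type transferArgs struct {
	AgentName string `json:"agent_name"`
}

// roundTrip marshals the name into the JSON object form and decodes it
// again, returning the decoded agent name.
func roundTrip(name string) (string, error) {
	raw, err := json.Marshal(transferArgs{AgentName: name})
	if err != nil {
		return "", err
	}
	var decoded transferArgs
	if err := json.Unmarshal(raw, &decoded); err != nil {
		return "", err
	}
	return decoded.AgentName, nil
}

func main() {
	// Empty name, embedded quotes, a newline, non-ASCII text, a backslash.
	names := []string{"", `agent "b"`, "agent\nc", "代理-d", `back\slash`}
	for _, name := range names {
		got, err := roundTrip(name)
		fmt.Printf("%q -> %q (ok=%v)\n", name, got, err == nil && got == name)
	}
}
```

Because the marshaler handles quoting and escaping, names with quotes or control characters still produce a well-formed object, which string concatenation would not guarantee.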

CLAassistant commented Feb 3, 2026

CLA assistant check
All committers have signed the CLA.

- tooCall := schema.ToolCall{ID: toolCallID, Function: schema.FunctionCall{Name: TransferToAgentToolName, Arguments: destAgentName}}
+ args := map[string]string{"agent_name": destAgentName}
+ argsJSON, _ := json.Marshal(args)
+ tooCall := schema.ToolCall{ID: toolCallID, Function: schema.FunctionCall{Name: TransferToAgentToolName, Arguments: string(argsJSON)}}

Contributor

Under what specific circumstances is this needed?

Author (shenysun, Feb 4, 2026)

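  1. Our LLM API is a wrapper built on the OpenAI protocol and strictly validates the tool_calls field. When history messages are passed along in a multi-turn conversation, a request whose format does not conform to the spec is rejected with an error.
  2. The OpenAI protocol defines tool_calls arguments as JSON-encoded.
    https://platform.openai.com/docs/guides/function-calling?api-mode=chat#handling-function-calls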

Contributor

OpenAI models do not return an error just because this argument is not JSON.

Author

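True, the OpenAI models themselves do not return an error for this. The problem comes from the validation logic in our internal LLM API wrapper layer; it is not a bug in the framework.

Still, from the protocol's point of view, tool_calls.arguments is supposed to be JSON-encoded. My change brings the code closer to the official OpenAI specification.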

Contributor

"JSON-encoded arguments" does not literally mean the value must be an object. A plain string is JSON-encoded too.
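To make the distinction concrete, the sketch below classifies a few sample argument strings with the standard library. `jsonKind` is a hypothetical helper, and the sample values are illustrative. A quoted string is valid JSON, a bare unquoted name such as agent_b is not, and only the last sample is a JSON object.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// jsonKind reports whether s is valid JSON and, if so, whether its
// top-level value is an object, a string, or something else.
func jsonKind(s string) string {
	if !json.Valid([]byte(s)) {
		return "invalid"
	}
	var v interface{}
	if err := json.Unmarshal([]byte(s), &v); err != nil {
		return "invalid"
	}
	switch v.(type) {
	case map[string]interface{}:
		return "object"
	case string:
		return "string"
	default:
		return "other"
	}
}

func main() {
	samples := []string{
		`agent_b`,                  // bare name, no quotes
		`"agent_b"`,                // JSON string
		`{"agent_name":"agent_b"}`, // JSON object
	}
	for _, s := range samples {
		fmt.Printf("%-28s %s\n", s, jsonKind(s))
	}
}
```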

Author

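You're right: as far as the JSON specification goes, "hello" is indeed JSON-encoded.

My change is based on the following considerations, though:

  1. Correspondence with the tool definition: the transfer_to_agent tool is explicitly defined in the code as taking an object-typed parameter set.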
// adk/chatmodel.go:417-427
toolInfoTransferToAgent = &schema.ToolInfo{
    Name: TransferToAgentToolName,
    ParamsOneOf: schema.NewParamsOneOfByParams(map[string]*schema.ParameterInfo{
        "agent_name": {
            Desc:     "the name of the agent to transfer to",
            Required: true,
            Type:     schema.String,
        },
    }),
}

The tool's parameters are defined as an object containing an agent_name field, not as a bare string.

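  2. What the parsing code expects (this scenario does not currently parse the arguments, because deterministic_transfer constructs the tool message directly): the transfer_to_agent implementation parses its arguments with UnmarshalString.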
// adk/chatmodel.go:495-500
type transferParams struct {
    AgentName string `json:"agent_name"`
}

params := &transferParams{}
err := sonic.UnmarshalString(argumentsInJSON, params)

This code expects argumentsInJSON to be a JSON object string such as {"agent_name": "xxx"}; only then is params.AgentName populated correctly.
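The parsing expectation can be sketched with the standard library. This example substitutes `encoding/json` for `sonic.UnmarshalString` so that it stays self-contained, which is an assumption of the sketch and not the framework's code; `parseAgentName` is a hypothetical helper.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// transferParams mirrors the struct quoted from adk/chatmodel.go.
type transferParams struct {
	AgentName string `json:"agent_name"`
}

// parseAgentName decodes tool call arguments into transferParams.
// encoding/json stands in for sonic here (assumption for illustration).
func parseAgentName(argumentsInJSON string) (string, error) {
	params := &transferParams{}
	if err := json.Unmarshal([]byte(argumentsInJSON), params); err != nil {
		return "", err
	}
	return params.AgentName, nil
}

func main() {
	inputs := []string{
		`{"agent_name": "xxx"}`, // object form: decodes into AgentName
		`"xxx"`,                 // JSON string: wrong top-level type for a struct
		`xxx`,                   // bare name: not valid JSON at all
	}
	for _, in := range inputs {
		name, err := parseAgentName(in)
		fmt.Printf("%-24s name=%q err=%v\n", in, name, err != nil)
	}
}
```

With `encoding/json`, only the object form yields a populated `AgentName`; the other two inputs return an error.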

shentongmartin and others added 10 commits February 4, 2026 14:32
…loudwego#722)

* feat: add enhanced tool support with multimodal output capabilities and improve message formatting

This commit introduces enhanced tool interfaces that support structured multimodal outputs,
enabling tools to return rich content beyond simple text responses.

Key Changes:

1. New Enhanced Tool Interfaces:
   - Added EnhancedInvokableTool and EnhancedStreamableTool interfaces for multimodal tool execution
   - Both interfaces use ToolCallInfo as input and return ToolResult for structured output

2. ToolResult Schema:
   - Introduced ToolResult type to represent multimodal tool outputs
   - Supports multiple content types: text, image, audio, video, and file
   - Added ToolOutputPart with Index field for streaming chunk merging
   - Implemented ToMessageInputParts() for seamless model integration

3. ToolsNode Enhancements:
   - Extended ToolsNode to support both legacy and enhanced tool types
   - Added automatic conversion between invokable and streamable endpoints
   - Implemented middleware support for enhanced tools
   - Enhanced interrupt and rerun mechanism to handle ToolResult

4. React Agent Integration:
   - Introduce enhancedToolResultSender and enhancedStreamToolResultSender types
   - Support sending *schema.ToolResult with multimodal content (images, audio, video, files)
   - Implement EnhancedInvokable and EnhancedStreamable middleware in tool result collector

5. Message.String() Enhancement:
   - Add formatting support for UserInputMultiContent, AssistantGenMultiContent, and MultiContent
   - Implement formatInputPart, formatOutputPart, and formatChatMessagePart helper functions
   - Create mediaPartFormatter interface with wrapper types for unified media formatting

6. User Input Multi-Content Concatenation:
   - Implement concatUserMultiContent function for merging MessageInputPart slices
   - Support text and base64 audio merging with proper MIME type handling
   - Integrate into ConcatMessages function

7. Callback System:
   - Added CallbackInput and CallbackOutput types for tool callbacks
   - Implemented conversion functions for different callback input/output types

8. Comprehensive Test Coverage:
   - Added tests for enhanced invokable and streamable tools
   - Added TestMessageString with 14 test cases covering various message types

Impact:
- Enables tools to return rich multimodal content (images, audio, video, files)
- Provides foundation for more sophisticated tool implementations
- Maintains full backward compatibility with existing tool ecosystem
…string]bool (cloudwego#737)

- Update ToolsConfig.ReturnDirectly type
- Update ChatModelAgentContext.ReturnDirectly type
- Update internal types: chatModelAgentExecCtx, execContext, reactConfig
- Update all related function signatures
- Update test files with new type syntax

Change-Id: I7d819f1c44da91b76cf9a9f867a88008068153b8

7 participants