
GeminiLLMAdapter merges functionCall messages without merging matching functionResponse messages #3992

@yydrift-code

Description


pipecat version

0.0.104

Python version

3.12

Operating System

macOS

Issue description

GeminiLLMAdapter._merge_parallel_tool_calls_for_thinking() currently merges model messages containing function_call parts when thought signatures are present, but it does not merge the matching user messages containing function_response parts.

This can turn an otherwise valid sequential tool-call history into an invalid Gemini/Vertex function-call turn.

Related code path:

  • pipecat/adapters/services/gemini_adapter.py
  • _merge_parallel_tool_calls_for_thinking()

Related to, but distinct from, #3557.

Google docs:

Reproduction steps

Take a message history shaped like this:

  1. model(functionCall fc1 + thought_signature)
  2. user(functionResponse fr1)
  3. model(functionCall fc2)
  4. user(functionResponse fr2)

These are two sequential single-tool turns.
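For concreteness, the history above can be sketched as Gemini-style content dicts. This is a simplified stand-in (the field names follow the public Gemini REST shape; the tool names, args, signature value, and responses are invented for illustration):

```python
# Two sequential single-tool turns: fc1 (with a thought signature) answered by
# fr1, then fc2 answered by fr2. All names/values here are made up.
history = [
    {"role": "model", "parts": [
        {"functionCall": {"name": "get_weather", "args": {"city": "NYC"}},
         "thoughtSignature": "sig-abc"},  # fc1 + thought_signature
    ]},
    {"role": "user", "parts": [
        {"functionResponse": {"name": "get_weather", "response": {"temp": 20}}},  # fr1
    ]},
    {"role": "model", "parts": [
        {"functionCall": {"name": "get_time", "args": {}}},  # fc2, no signature
    ]},
    {"role": "user", "parts": [
        {"functionResponse": {"name": "get_time", "response": {"time": "12:00"}}},  # fr2
    ]},
]
```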

Today, _merge_parallel_tool_calls_for_thinking() can turn that into:

  1. model(functionCall fc1, functionCall fc2)
  2. user(functionResponse fr1)
  3. user(functionResponse fr2)

The next request to Gemini / Vertex then fails with:

400 Bad Request
Please ensure that the number of function response parts is equal to the number of function call parts of the function call turn.
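The 400 follows from Gemini's pairing rule: every model turn containing functionCall parts must be answered by an equal number of functionResponse parts in the immediately following user turn. A minimal, hypothetical checker (simplified dict shapes, not Pipecat's or Gemini's actual types) makes the failure visible:

```python
def validate_tool_turns(messages):
    """Return True if every model turn's function_call parts are matched by an
    equal number of function_response parts in the next turn (simplified)."""
    for i, msg in enumerate(messages):
        calls = [p for p in msg.get("parts", []) if "function_call" in p]
        if msg.get("role") == "model" and calls:
            nxt = messages[i + 1] if i + 1 < len(messages) else {}
            responses = [p for p in nxt.get("parts", []) if "function_response" in p]
            if len(responses) != len(calls):
                return False
    return True

# The merged-but-unpaired shape produced today:
broken = [
    {"role": "model", "parts": [{"function_call": "fc1"}, {"function_call": "fc2"}]},
    {"role": "user", "parts": [{"function_response": "fr1"}]},
    {"role": "user", "parts": [{"function_response": "fr2"}]},
]
print(validate_tool_turns(broken))  # False: 2 calls, only 1 response in the next turn
```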

We hit this with GoogleVertexLLMService after otherwise normal single-tool turns.

Expected behavior

If Pipecat decides to merge parallel tool calls into a single model turn, it should also merge the corresponding user function_response messages into a single user turn, preserving the same grouping expected by Gemini.

For example:

  1. model(functionCall fc1, functionCall fc2)
  2. user(functionResponse fr1, functionResponse fr2)

Alternatively, if Pipecat cannot safely prove the calls are truly parallel, it should not merge them at all.

Actual behavior

Pipecat merges only the function_call side and leaves function_response messages split, producing an invalid call/response structure for Gemini.

Suggested fix

In GeminiLLMAdapter._merge_parallel_tool_calls_for_thinking():

  • only merge contiguous tool-call groups
  • when merging model function_call parts, also merge the immediately corresponding user function_response parts
  • do not scan through arbitrary text messages and then retroactively regroup tool turns
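A rough sketch of the pairing-aware merge described above. Message and part shapes are simplified stand-ins for Pipecat's internals, and the real gating on thought signatures is omitted; the point is that calls and responses are consumed as contiguous pairs, and anything else (e.g. a model text turn) ends the group:

```python
def merge_parallel_tool_calls(messages):
    """Merge a contiguous run of M(fc), U(fr), M(fc), U(fr) pairs into one
    M(fc, fc) turn followed by one U(fr, fr) turn -- merging BOTH sides."""
    merged = []
    i = 0
    while i < len(messages):
        msg = messages[i]
        is_call = msg.get("role") == "model" and any(
            "function_call" in p for p in msg.get("parts", [])
        )
        if not is_call:
            merged.append(msg)
            i += 1
            continue
        calls, responses = [], []
        # Consume a contiguous run of call/response pairs; any other message
        # (text turn, mixed parts) ends the group instead of being scanned past.
        while i + 1 < len(messages):
            cur, nxt = messages[i], messages[i + 1]
            if cur.get("role") != "model" or not all(
                "function_call" in p for p in cur.get("parts", [])
            ):
                break
            if nxt.get("role") != "user" or not all(
                "function_response" in p for p in nxt.get("parts", [])
            ):
                break
            calls += cur["parts"]
            responses += nxt["parts"]
            i += 2
        if not calls:
            # Mixed or unpaired turn: leave it untouched rather than guess.
            merged.append(msg)
            i += 1
            continue
        merged.append({"role": "model", "parts": calls})
        merged.append({"role": "user", "parts": responses})
    return merged
```

Note that a lone M(fc), U(fr) pair passes through unchanged (the merge is the identity there), and a model text turn between two pairs blocks any cross-boundary merge.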

Suggested regression tests

  1. Parallel case:

    • M(fc1+sig), U(fr1), M(fc2), U(fr2)
    • expected -> M(fc1,fc2), U(fr1,fr2)
  2. Sequential case:

    • M(fc1+sigA), U(fr1), M(fc2+sigB), U(fr2)
    • expected -> no cross-turn merge
  3. Mixed text boundary:

    • M(fc1+sig), U(fr1), M(text), M(fc2), U(fr2)
    • expected -> no merge across the text boundary
