Skip to content

GH-3657: Fix DeepSeek tool call content null issue #3817

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

Acacian
Copy link

@Acacian Acacian commented Jul 14, 2025

✨ Summary

This PR resolves the issue where DeepSeek chat completions with tool calls return an assistant message with content = null, leading to schema validation errors or downstream failures (e.g., in sglang or FastAPI with pydantic).

To address this, we explicitly set the assistant message content to "__tool_call__" when tool calls are present but content is null.


🛠 Changes

  • In DeepSeekChatModel, updated buildGeneration(...) logic:
    • If content == null && toolCalls != empty, set content = "__tool_call__" to maintain compatibility.
  • This change preserves downstream contract expectations, especially for clients expecting content to be a non-null string.
  • No changes to the base AssistantMessage#getText() logic, keeping it backward compatible for other models.

💡 Context

DeepSeek sometimes returns:

{
  "role": "assistant",
  "content": null,
  "tool_calls": [...]
}

This results in downstream consumers (e.g., sglang, FastAPI with pydantic) throwing errors:

ChatCompletionMessageGenericParam.content Field required [type=missing]

By ensuring a placeholder content ("__tool_call__"), we:

  • Avoid null pointer or validation errors
  • Make the message displayable in logs or UIs
  • Align behavior with other models that always return non-null content

🧪 Tests & Verification

  • ✅ Added unit test: ChatClientResponseTests.whenAssistantMessageHasOnlyToolCalls_thenContentIsToolCallMarker
  • ✅ Validated getText() returns "__tool_call__" when expected
  • ✅ Verified compatibility with Generation.getOutput().getText()
  • ✅ All existing tests passed (./mvnw clean verify)
  • ✅ Checkstyle passed (-Ddisable.checks=false)
  • ✅ Apache license headers added
  • ✅ DCO signed
  • ✅ Linear commit history (git log --oneline --graph confirmed)
  • ✅ Rebased on latest main

📘 Documentation

  • JavaDoc and inline comments added where applicable to clarify the fallback behavior.
  • No changes required to public APIs or developer-facing contracts.

🔗 Closes #3657

Please review and let me know if you’d like any refinements.
Happy to iterate!

Closes spring-projects#3657

* Add content fallback for DeepSeek when only tool calls are present
* Ensures AssistantMessage has proper output for downstream processing
* Add test case for AssistantMessage with tool calls only

Signed-off-by: Dongha Koo <[email protected]>
@dev-jonghoonpark
Copy link
Contributor

Please update the copyright end year from 2024 to 2025.

Acacian and others added 2 commits July 14, 2025 22:20
…t/client/ChatClientResponseTests.java

Co-authored-by: jonghoonpark <[email protected]>
Signed-off-by: 구동하 <[email protected]>
Closes spring-projects#3657

* Set content = "__tool_call__" when content is null and toolCalls are present
* Prevents schema validation errors in downstream clients
* Add unit test: ChatClientResponseTests.whenAssistantMessageHasOnlyToolCalls_thenContentIsToolCallMarker
* Apply code style: import formatting, copyright year

Signed-off-by: Dongha Koo <[email protected]>
@Acacian Acacian force-pushed the GH-3657-deepseek-toolcall-fix branch from 55067cc to 4b054b0 Compare July 14, 2025 14:04
@Acacian
Copy link
Author

Acacian commented Jul 14, 2025

@dev-jonghoonpark
Thanks for the reminder regarding the import formatting and copyright year — I've made the changes accordingly.
I'll be more mindful of the style guidelines going forward. Appreciate your review!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Question about sglang + deepseek
3 participants