fix(litellm): avoid duplicating content and signed thinking blocks across parallel tool-call splits #3215

Merged

seratch merged 2 commits into openai:main from adityasingh2400:fix/litellm-tool-split-duplicates on May 9, 2026
Conversation

@adityasingh2400 (Contributor)

Summary

When `LitellmModel._fix_tool_message_ordering` splits an assistant message that has multiple `tool_calls` into one message per tool call, it copies the entire parent message into each split. That duplicates `content`, `thinking_blocks`, and `reasoning_content` across every split.

For Anthropic models with extended thinking, this is fatal: each thinking block carries a unique signature, and the API rejects requests that include the same signed block twice. It also corrupts conversation history by repeating the assistant's user-visible text whenever the model emits two or more parallel tool calls.

The fix keeps those shared fields only on the first split; subsequent splits carry their `tool_calls` alone.
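
A minimal sketch of the intended split behavior (a hypothetical standalone helper, not the actual `_fix_tool_message_ordering` implementation):

```python
from typing import Any

def split_parallel_tool_calls(message: dict[str, Any]) -> list[dict[str, Any]]:
    """Split an assistant message with N tool_calls into N messages,
    keeping content/thinking_blocks/reasoning_content only on the first."""
    splits: list[dict[str, Any]] = []
    for i, tool_call in enumerate(message.get("tool_calls") or []):
        split = dict(message)
        split["tool_calls"] = [tool_call]
        if i > 0:
            # Repeating a signed thinking block makes Anthropic reject the
            # request, so later splits carry only their own tool_call.
            split["content"] = None
            split.pop("thinking_blocks", None)
            split.pop("reasoning_content", None)
        splits.append(split)
    return splits
```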

Repro

Pass an assistant message with two parallel tool calls plus `content`/`thinking_blocks` (which is what LiteLLM emits for Anthropic Claude models with parallel tool use plus extended thinking) through `_fix_tool_message_ordering`. Before the fix, both split assistant messages contain the same `content` text and the same signed `thinking_blocks`; after the fix, only the first split carries them.
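
For illustration, a message of the shape involved (field names follow LiteLLM's Anthropic extended-thinking output; the values and tool names here are made up):

```python
assistant_message = {
    "role": "assistant",
    "content": "Let me check both sources.",
    "reasoning_content": "I should call both tools in parallel...",
    "thinking_blocks": [
        {
            "type": "thinking",
            "thinking": "I should call both tools in parallel...",
            "signature": "sig-abc123",  # unique per block; must not be sent twice
        }
    ],
    "tool_calls": [
        {"id": "call_1", "type": "function",
         "function": {"name": "get_weather", "arguments": '{"city": "Tokyo"}'}},
        {"id": "call_2", "type": "function",
         "function": {"name": "get_weather", "arguments": '{"city": "Osaka"}'}},
    ],
}

# Before the fix: both splits repeat the content text and the signed
# thinking block. After the fix: only the first split carries them.
```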

Test plan

  • New unit test test_split_does_not_duplicate_content_or_thinking covers the regression (a sketch follows this list).
  • All 9 pre-existing test_extended_thinking_message_order.py tests still pass.
  • All 25 LiteLLM-related tests pass.
  • ruff check clean on both files.
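
A sketch of the property the new test checks, reusing `split_parallel_tool_calls` and `assistant_message` from the sketches above (the real test exercises the actual private helper):

```python
def test_split_does_not_duplicate_content_or_thinking() -> None:
    splits = split_parallel_tool_calls(assistant_message)

    assert len(splits) == 2
    # The signed thinking block and the user-visible text appear exactly once.
    assert sum(1 for m in splits if m.get("thinking_blocks")) == 1
    assert sum(1 for m in splits if m.get("content")) == 1
    # Each split still carries exactly its own tool call.
    assert [m["tool_calls"][0]["id"] for m in splits] == ["call_1", "call_2"]
```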

🤖 Generated with Claude Code

…en splitting parallel tool calls

When `_fix_tool_message_ordering` split an assistant message with
multiple `tool_calls` into one message per tool call, it copied the entire
parent message into each split. That duplicated `content`, `thinking_blocks`,
and `reasoning_content` across every split.

For Anthropic models with extended thinking, this is fatal: each thinking
block carries a unique signature, and the API rejects requests that include
the same signed block twice. It also corrupts conversation history by
repeating the assistant's user-visible text.

Keep those shared fields only on the first split; subsequent splits carry
their `tool_calls` alone.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@seratch (Member)

seratch commented May 8, 2026

@codex review

@chatgpt-codex-connector

Codex Review: Didn't find any major issues. Delightful!


seratch marked this pull request as draft May 8, 2026 16:59
Construct the multi-tool assistant fixture via `cast` so mypy stops
complaining that `thinking_blocks` / `reasoning_content` are not part of
`ChatCompletionAssistantMessageParam`, and that the resulting list-item
type doesn't match the union.

Also cast the filtered assistants list to `dict[str, Any]` so subscript
and `in`/`not in` checks no longer need per-line type ignores.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
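
A sketch of that cast pattern (fixture values are made up; the point is silencing mypy on the extra LiteLLM-specific keys):

```python
from typing import Any, cast

from openai.types.chat import ChatCompletionAssistantMessageParam

# thinking_blocks / reasoning_content are LiteLLM extras that the OpenAI
# TypedDict doesn't declare, so build a plain dict and cast it once.
assistant_fixture = cast(
    ChatCompletionAssistantMessageParam,
    {
        "role": "assistant",
        "content": "Checking both cities.",
        "reasoning_content": "...",
        "thinking_blocks": [
            {"type": "thinking", "thinking": "...", "signature": "sig-abc123"}
        ],
        "tool_calls": [],
    },
)

# Casting to dict[str, Any] lets subscripting and "in" checks type-check
# without per-line "# type: ignore" comments.
as_dict = cast(dict[str, Any], assistant_fixture)
assert "thinking_blocks" in as_dict
```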
seratch marked this pull request as ready for review May 9, 2026 03:44
seratch added this to the 0.17.x milestone May 9, 2026
seratch merged commit 8f40dde into openai:main May 9, 2026
11 checks passed

Labels

bug (Something isn't working), feature:extensions