
fix(aws): add dummy user message for Claude 4.5+/Opus 4.6+ trailing assistant turns#6008

Open
eddiemonroe wants to merge 1 commit into livekit:main from eddiemonroe:fix/aws-bedrock-claude-4-5-trailing-assistant

@eddiemonroe

Summary

Mirror #4973's Anthropic-plugin fix on the AWS Bedrock plugin. When the active model is in the Claude 4.5+/Opus 4.6+ family (which no longer supports assistant-message prefill), append a minimal dummy user message if the conversation would otherwise end on an assistant turn. Without it, the Bedrock ConverseStream validation layer rejects the request with botocore.errorfactory.ValidationException: "This model does not support assistant message prefill. The conversation must end with a user message."

Closes #6007. Anthropic-plugin precedent: #4973 (merged 2026-03-03), #4907.

Motivation

Anthropic removed assistant-message prefill starting in Claude Sonnet 4.5 and Opus 4.6 (carried forward to Sonnet 4.6/4.7, Opus 4.7/4.8). The same removal applies on Bedrock, where it surfaces as a ConverseStream ValidationException. PR #4973 added end-side dummy-user injection to the Anthropic plugin gated on model ID, but the equivalent fix was never applied to the AWS Bedrock plugin — so anyone routing affected Claude models (e.g., global.anthropic.claude-sonnet-4-6, us.anthropic.claude-opus-4-7-v1:0, anthropic.claude-sonnet-4-5-20250929-v1:0) through livekit.plugins.aws.llm.LLM hits the same wall.

The AWS provider format already has start-side inject_dummy_user_message=True (prepends a dummy when messages[0]["role"] != "user"); this PR adds the symmetric end-side handling.

Changes

  • livekit-agents/livekit/agents/llm/_provider_format/aws.py — add inject_trailing_user_message: bool = False parameter to to_chat_ctx(). When True and messages[-1]["role"] == "assistant", append {"role": "user", "content": [{"text": " "}]}. Defaults to False, so existing callers see no behavior change.
  • livekit-plugins/livekit-plugins-aws/livekit/plugins/aws/llm.py — add _NO_PREFILL_PATTERNS and _model_disables_prefill(model), and pass inject_trailing_user_message=_model_disables_prefill(self._opts.model) to chat_ctx.to_provider_format. Models in the affected Claude family get the injection; everything else (Nova, Llama, Mistral, older Claude) is unaffected.
  • tests/test_chat_ctx.py — four new test cases:
    • test_aws_inject_trailing_user_message_appends_when_last_is_assistant — happy path.
    • test_aws_inject_trailing_user_message_default_off_keeps_assistant_last — backward compat for callers not opting in.
    • test_aws_inject_trailing_user_message_idempotent_when_last_is_user — no extra dummy when already user-terminated.
    • test_aws_model_disables_prefill_matches_inference_profiles — gate matches plain IDs, date-stamped snapshots, and cross-region inference profile prefixes; excludes Nova, Llama, Mistral, older Claude, and empty/None.
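
The end-side injection described in the first bullet can be sketched as a standalone function. This is a minimal illustration under stated assumptions, not the code in this PR: `append_trailing_user_message` is a hypothetical helper name, and the actual logic lives inside to_chat_ctx() behind the inject_trailing_user_message flag. The dummy message shape is the one quoted above.

```python
from typing import Any


# Hypothetical helper for illustration only; the PR implements the same idea
# inside to_chat_ctx() rather than as a separate function.
def append_trailing_user_message(
    messages: list[dict[str, Any]],
    inject_trailing_user_message: bool = False,
) -> list[dict[str, Any]]:
    """Return a copy of `messages` that ends on a user turn when the flag is set."""
    result = list(messages)  # shallow copy so the caller's list is not mutated
    if inject_trailing_user_message and result and result[-1]["role"] == "assistant":
        # Minimal dummy user turn, matching the shape quoted in the PR description.
        result.append({"role": "user", "content": [{"text": " "}]})
    return result
```

With the flag left at its default of False the input comes back unchanged, which corresponds to the backward-compatibility case in the test list. A conversation that already ends on a user turn is also returned unchanged, which corresponds to the idempotency case.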

Why substring matching instead of startswith

PR #4973 used _NO_PREFILL_PATTERNS = ("claude-sonnet-4-6", "claude-opus-4-6") with model.startswith(p). That works for the Anthropic plugin because its model IDs are short and unprefixed (claude-sonnet-4-6).

Bedrock model IDs vary in shape:

  • Plain: claude-sonnet-4-6
  • Date-stamped snapshot: anthropic.claude-sonnet-4-5-20250929-v1:0
  • Cross-region inference profile: global.anthropic.claude-sonnet-4-6, us.anthropic.claude-opus-4-7-v1:0, eu.anthropic.claude-opus-4-6-v1:0

startswith("claude-sonnet-4-6") would miss the prefixed forms, so this PR uses substring matching (any(p in model for p in _NO_PREFILL_PATTERNS)). If a future change unifies the two plugins' detection logic, that's a reasonable follow-up — keeping them separate for now mirrors #4973's local-to-the-plugin shape.

Pattern list scope

Includes Sonnet 4.5/4.6/4.7 and Opus 4.6/4.7/4.8. Sonnet 4.5 is included (it's the actual cut-line per Anthropic's migration guide); #4973's list omitted it. Future versions will need additions to this tuple; the doc comment above it points at the Anthropic migration guide as the source of truth.

Verification

  • make format: clean (848 files unchanged).
  • make lint: clean.
  • make type-check: clean.
  • Full tests/test_chat_ctx.py: 23 passed, 1 skipped (pre-existing), 0 failures.
  • The 4 new tests pass and exercise both branches of the new flag plus the model gate across all the Bedrock ID shapes I could enumerate.

…ssistant turns

Mirror PR livekit#4973's Anthropic-plugin fix on the AWS Bedrock plugin so
Claude Sonnet 4.5+ and Opus 4.6+ requests don't fail with
ValidationException ("This model does not support assistant message
prefill. The conversation must end with a user message.") when the
final message in `chat_ctx` is an assistant turn.

Changes:
- `_provider_format/aws.py::to_chat_ctx` — add
  `inject_trailing_user_message: bool = False`. When True and the
  messages list ends on `assistant`, append a minimal `{"role": "user",
  "content": [{"text": " "}]}` so the request ends on user.
- `livekit-plugins-aws/.../llm.py` — add `_NO_PREFILL_PATTERNS` and
  `_model_disables_prefill(model)`; pass `inject_trailing_user_message`
  to `to_provider_format` when the active model is in the affected
  Claude family.
- `tests/test_chat_ctx.py` — cover the new parameter (append when last
  is assistant; default-off keeps assistant last; idempotent when last
  is already user) and the model gate (matches plain IDs, snapshots,
  and cross-region inference profiles; excludes Nova/Llama/Mistral/older
  Claude/empty/None).

Why substring matching instead of `startswith` like livekit#4973: Bedrock
model IDs vary in shape — plain (`claude-sonnet-4-6`), date-stamped
snapshots (`anthropic.claude-sonnet-4-5-20250929-v1:0`), and
cross-region inference profiles (`global.anthropic.claude-sonnet-4-6`,
`us.anthropic.claude-opus-4-7-v1:0`). The Anthropic plugin only sees
short IDs from Anthropic's native API, so `startswith` works there;
Bedrock needs substring matching to handle the prefixes.

Closes livekit#6007
Refs livekit#4907, livekit#4973

@CLAassistant

CLAassistant commented Jun 8, 2026

CLA assistant check
All committers have signed the CLA.

@devin-ai-integration devin-ai-integration Bot left a comment

Devin Review found 1 potential issue.

Comment on lines +46 to +53
_NO_PREFILL_PATTERNS = (
    "claude-sonnet-4-5",
    "claude-sonnet-4-6",
    "claude-sonnet-4-7",
    "claude-opus-4-6",
    "claude-opus-4-7",
    "claude-opus-4-8",
)

🚩 AWS _NO_PREFILL_PATTERNS is broader than Anthropic's equivalent

The AWS plugin's _NO_PREFILL_PATTERNS at livekit-plugins/livekit-plugins-aws/livekit/plugins/aws/llm.py:46-53 includes "claude-sonnet-4-5" and several future model versions (4-7, 4-8), while the Anthropic plugin at livekit-plugins/livekit-plugins-anthropic/livekit/plugins/anthropic/llm.py:41 only has ("claude-sonnet-4-6", "claude-opus-4-6"). The AWS list explicitly covers "Claude 4.5+" per the comment, but there is no model named claude-sonnet-4-5 in the Anthropic models list (livekit-plugins/livekit-plugins-anthropic/livekit/plugins/anthropic/models.py:6-17). This is likely intentional future-proofing, but means the two plugins have divergent prefill-disable thresholds if a Claude Sonnet 4.5 ever ships. Worth confirming this asymmetry is deliberate.


Development

Successfully merging this pull request may close these issues.

AWS plugin: missing end-side dummy-user injection for Claude 4.5+/Opus 4.6+ (parallel to #4973)

2 participants