fix(voice): emit turn_ended when workflow yields no audio #3252
adityasingh2400 wants to merge 1 commit into openai:main
When a workflow yields only empty/whitespace deltas (e.g. an LLM streaming keepalives), `_add_text` triggers `_start_turn` but the splitter leaves `_text_buffer` empty, so `_turn_done` never schedules a TTS task. The dispatcher then emits `session_ended` without a matching `turn_ended`, breaking consumers that pair lifecycle events.

Push a synthetic `turn_ended` lifecycle onto the ordered task queue when no synthesizable text remains but a turn was started.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@codex review
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 12a03672f4
    local_queue = asyncio.Queue()
    await local_queue.put(VoiceStreamEventLifecycle(event="turn_ended"))
    self._ordered_tasks.append(local_queue)
Reset empty turns before accepting the next turn
In streamed sessions, when an empty workflow yield is followed immediately by another transcript, this change appends `turn_ended` but returns before the dispatcher calls `_finish_turn()`. `_started_processing_turn` therefore stays true, so the next `_add_text()` suppresses `turn_started`. The resulting event sequence is `turn_started`, `turn_ended`, audio, `turn_ended`, and that second turn's trace text is lost.
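The sequence described in this comment can be reproduced with a small standalone model. This is only a sketch: `TurnTracker`, `reset_on_empty`, and `replay` are hypothetical names, the `started` attribute merely stands in for `_started_processing_turn`, and the dispatcher's timing is collapsed into `turn_done`. It does not use the SDK classes, and the real fix may prefer calling `_finish_turn()` rather than clearing a flag directly.

```python
class TurnTracker:
    """Hypothetical model of the turn bookkeeping; not the SDK class."""

    def __init__(self, reset_on_empty: bool) -> None:
        self.reset_on_empty = reset_on_empty
        self.started = False  # stands in for _started_processing_turn
        self.buffer = ""      # stands in for _text_buffer
        self.events: list[str] = []

    def add_text(self, text: str) -> None:
        # turn_started is emitted only if no turn is considered in progress.
        if not self.started:
            self.started = True
            self.events.append("turn_started")
        self.buffer += text.strip()  # whitespace-only deltas add nothing

    def turn_done(self) -> None:
        if self.buffer:
            # Normal path: audio is produced, then the turn is finished.
            self.events += ["audio", "turn_ended"]
            self.buffer = ""
            self.started = False
            return
        if self.started:
            # Empty turn: emit the synthetic turn_ended from this PR.
            self.events.append("turn_ended")
            if self.reset_on_empty:
                self.started = False  # the reset this review asks for


def replay(reset_on_empty: bool) -> list[str]:
    """Empty turn followed immediately by a turn with real text."""
    tracker = TurnTracker(reset_on_empty)
    tracker.add_text("   ")
    tracker.turn_done()
    tracker.add_text("hello")
    tracker.turn_done()
    return tracker.events


print(replay(reset_on_empty=False))
print(replay(reset_on_empty=True))
```

Without the reset, the second turn has no `turn_started`; with it, both turns are bracketed by a start and an end event.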
Summary
When a `VoiceWorkflowBase.run()` (or `on_start()`) yields only empty/whitespace deltas (common with LLM streaming keepalives), `StreamedAudioResult._add_text` calls `_start_turn` (emitting `turn_started`) but the sentence splitter leaves `_text_buffer` empty. `_turn_done` then sees an empty buffer and never schedules a TTS task with `finish_turn=True`, so the dispatcher exits the loop and emits `session_ended` without ever emitting `turn_ended`.

Consumers that pair `turn_started`/`turn_ended` lifecycle events (e.g. UI state machines, transcript boundary tracking) silently break.

The fix pushes a synthetic `turn_ended` lifecycle event onto the ordered task queue whenever a turn was started but no synthesizable text remains, preserving balanced lifecycle pairs.

Test plan
- `test_voicepipeline_empty_workflow_yield_emits_turn_ended` fails on `main` (asserts only `[turn_started, session_ended]` are emitted) and passes after the fix
- `pytest tests/voice -v`

🤖 Generated with Claude Code
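For reviewers who want to see the behavior without the full pipeline, the following is a minimal standalone sketch of the before/after event streams. Everything in it is an assumption made for illustration: `Lifecycle` is a stand-in for `VoiceStreamEventLifecycle`, and `run_session`, `patched`, and `balanced` are hypothetical helpers rather than SDK functions. Only the idea of putting a lifecycle event on an `asyncio.Queue` and appending that queue to an ordered list comes from the diff.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class Lifecycle:
    """Hypothetical stand-in for VoiceStreamEventLifecycle."""
    event: str


async def run_session(deltas: list[str], patched: bool) -> list[str]:
    """Toy model of one turn; returns the event names a consumer would see."""
    ordered_tasks: list[asyncio.Queue] = []  # drained in order by the dispatcher
    emitted: list[str] = []
    text_buffer = ""
    started = False

    for delta in deltas:
        if not started:
            started = True
            emitted.append("turn_started")
        text_buffer += delta.strip()  # whitespace-only deltas leave nothing

    if text_buffer:
        # Normal path: a TTS task yields audio and then finishes the turn.
        queue: asyncio.Queue = asyncio.Queue()
        await queue.put(Lifecycle("audio"))
        await queue.put(Lifecycle("turn_ended"))
        ordered_tasks.append(queue)
    elif patched and started:
        # Patched path: nothing to synthesize, so close the turn explicitly.
        queue = asyncio.Queue()
        await queue.put(Lifecycle("turn_ended"))
        ordered_tasks.append(queue)

    # Dispatcher: drain each queue in order, then end the session.
    for queue in ordered_tasks:
        while not queue.empty():
            emitted.append((await queue.get()).event)
    emitted.append("session_ended")
    return emitted


def balanced(events: list[str]) -> bool:
    """True if every turn_started is closed by exactly one turn_ended."""
    open_turn = False
    for name in events:
        if name == "turn_started":
            if open_turn:
                return False
            open_turn = True
        elif name == "turn_ended":
            if not open_turn:
                return False
            open_turn = False
    return not open_turn


before = asyncio.run(run_session(["", "  "], patched=False))
after = asyncio.run(run_session(["", "  "], patched=True))
print(before, balanced(before))
print(after, balanced(after))
```

A pairing check like `balanced` is roughly what a UI state machine or transcript tracker relies on implicitly, which is why the unpatched stream breaks such consumers silently.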