feat(clevrlabs): add Clevr Labs TTS plugin#6005

Open
DynamicRex wants to merge 4 commits into
livekit:mainfrom
DynamicRex:clevrlabs-plugin

@DynamicRex

livekit-plugins-clevrlabs

Adds a streaming TTS plugin for the Clevr Labs conversational speech model,
following the existing provider-plugin structure (modeled on cartesia).

  • Streaming tts.TTS (streaming=True)
  • Per-conversation voice consistency via add_user_turn()
  • No generation knobs on the constructor (keeps provider-internal params
    off the public surface)

Already published & in production use:
https://pypi.org/project/livekit-plugins-clevrlabs/

Passes make check (ruff lint + format, mypy --strict) under the root
config.

Testing

Talks to the hosted Clevr API; needs a key (free at theclevr.com). The model
is live right now, and I'm happy to provision a credited test account for any
maintainer who wants to run it live, or you can talk to the hosted model
directly. Reach me on the LiveKit Slack (@cyrus) or at cyrus@theclevr.com.

Demo video, uploaded to YouTube for ease of access:

YouTube: https://youtu.be/pN7K82K9SzE

Cheers, and thank you for all the work you do. =)

Adds livekit-plugins-clevrlabs, a streaming TTS plugin backed by the
Clevr Labs conversational speech model. Supports per-conversation voice
consistency via add_user_turn(), and ships an is_whisper_hallucination()
helper for filtering Whisper-family STT before it pollutes voice context.

Registered in the workspace [tool.uv.sources]. Passes ruff (lint + format)
and mypy --strict under the repo's root config.
@CLAassistant

CLAassistant commented Jun 8, 2026

CLA assistant check
All committers have signed the CLA.

@devin-ai-integration devin-ai-integration Bot left a comment

Contributor

Devin Review found 4 potential issues.

View 4 additional findings in Devin Review.

Comment thread livekit-plugins/livekit-plugins-clevrlabs/livekit/plugins/clevrlabs/tts.py Outdated
Comment thread livekit-plugins/livekit-plugins-clevrlabs/livekit/plugins/clevrlabs/tts.py Outdated
Comment on lines +260 to +261
for chunk in buf.push(data):
    await self._synthesize_segment(chunk, output_emitter, audio_bstream)

@devin-ai-integration devin-ai-integration Bot Jun 8, 2026

Contributor

🟡 Missing _mark_started() call prevents TTS metrics from being emitted

All other streaming TTS plugins (e.g. Cartesia at cartesia/tts.py:424, ElevenLabs at elevenlabs/tts.py:480, Deepgram at deepgram/tts.py:316) call self._mark_started() when they begin synthesizing. The Clevr Labs plugin never calls it, so _started_time remains 0. In the base class's _metrics_monitor_task (livekit-agents/livekit/agents/tts/tts.py:576), the guard if not self._started_time returns early, meaning TTS metrics (TTFB, duration, etc.) are silently never emitted.
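
The effect of the missing call can be illustrated with a toy model. This is a minimal sketch, not LiveKit's code: `ToyStream`, `_emit_metrics`, and `run` are hypothetical names, and only the "start time stays 0, so the guard returns early" behaviour is taken from the comment above.

```python
import time


class ToyStream:
    """Simplified stand-in for a streaming TTS base class (not LiveKit's code)."""

    def __init__(self) -> None:
        self._started_time = 0.0
        self.metrics: list[dict[str, float]] = []

    def _mark_started(self) -> None:
        # Record the start only once, on the first segment.
        if not self._started_time:
            self._started_time = time.time()

    def _emit_metrics(self, first_audio_time: float) -> None:
        # Same kind of guard as described in the review comment:
        # without a start time there is nothing to measure against.
        if not self._started_time:
            return
        self.metrics.append({"ttfb": first_audio_time - self._started_time})


def run(stream: ToyStream, segments: list[str], mark: bool) -> None:
    for _segment in segments:
        if mark:
            stream._mark_started()
        # ... synthesis of the segment would happen here ...
    stream._emit_metrics(time.time())


silent = ToyStream()
run(silent, ["hello"], mark=False)  # never marked: no metrics recorded

fixed = ToyStream()
run(fixed, ["hello"], mark=True)  # marked: one metrics entry recorded
```

In the plugin, the equivalent fix would presumably be to call self._mark_started() before the first _synthesize_segment call in the loop shown above.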

Comment thread livekit-plugins/livekit-plugins-clevrlabs/livekit/plugins/clevrlabs/tts.py Outdated
devin-ai-integration[bot]

This comment was marked as resolved.

…nt import table

Avoids the ~100ms import-time scan of the full Unicode range by inspecting
only the characters present in each (short) input string. Behaviour is
byte-identical across all 1,114,112 codepoints.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
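
The trade-off described in this commit can be sketched as follows. The predicate here (`_is_punct`, based on Unicode general category) is an assumption chosen for illustration; the plugin's actual character test is not shown on this page. The sketch only contrasts building a table over every codepoint up front with classifying the characters of each input on demand.

```python
import sys
import unicodedata


def _is_punct(ch: str) -> bool:
    # Hypothetical predicate; the plugin's real check is not shown here.
    return unicodedata.category(ch).startswith("P")


def build_table_eagerly() -> frozenset[str]:
    # Import-time approach: scan every codepoint once (1,114,112 of them).
    return frozenset(
        ch for ch in map(chr, range(sys.maxunicode + 1)) if _is_punct(ch)
    )


def strip_punct_eager(text: str, table: frozenset[str]) -> str:
    # Uses the precomputed table.
    return "".join(ch for ch in text if ch not in table)


def strip_punct_lazy(text: str) -> str:
    # Lazy approach: classify only the characters that actually occur.
    return "".join(ch for ch in text if not _is_punct(ch))
```

Because both paths apply the same predicate per character, they agree on every input; the lazy path simply skips the up-front scan, which pays off when inputs are short.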
devin-ai-integration[bot]

This comment was marked as resolved.

…afe context, self-healing session, drop sample_rate

- _synthesize_segment: open the session inside the wrapped try so session-start
  HTTP errors become APIError (retryable/visible) instead of a raw httpx error
- clear _pending_user_turn only after a successful request so the user audio
  context survives a base-class retry
- simplify session handling to a lock + started flag; a failed start no longer
  caches the error permanently, it is retried on the next call
- remove the sample_rate constructor knob (server output is fixed at 24 kHz);
  output rate is now the _OUTPUT_SAMPLE_RATE constant

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
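
The "lock + started flag" session handling from the commit above can be sketched roughly like this. It is an illustration under assumptions, not the plugin's code: `SessionStarter`, `ensure_started`, and `flaky_start` are hypothetical names, and the simulated failure stands in for a session-start HTTP error.

```python
import asyncio
from typing import Awaitable, Callable


class SessionStarter:
    """Lock + started flag; a failed start is retried on the next call."""

    def __init__(self, start: Callable[[], Awaitable[None]]) -> None:
        self._start = start  # hypothetical coroutine that opens the session
        self._lock = asyncio.Lock()
        self._started = False

    async def ensure_started(self) -> None:
        async with self._lock:
            if self._started:
                return
            # If this raises, _started stays False, so no error is cached
            # and the next caller simply tries again.
            await self._start()
            self._started = True


async def demo() -> tuple[int, bool]:
    attempts = 0

    async def flaky_start() -> None:
        nonlocal attempts
        attempts += 1
        if attempts == 1:
            raise ConnectionError("simulated session-start failure")

    starter = SessionStarter(flaky_start)
    try:
        await starter.ensure_started()
    except ConnectionError:
        pass  # the commit says the plugin surfaces this as APIError
    await starter.ensure_started()  # retried, succeeds
    await starter.ensure_started()  # already started, no new attempt
    return attempts, starter._started


result = asyncio.run(demo())
```

The lock serializes concurrent callers so only one start request is in flight, and setting the flag only after a successful start is what makes the session self-healing.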

@devin-ai-integration devin-ai-integration Bot left a comment

Contributor

Devin Review found 1 new potential issue.

View 8 additional findings in Devin Review.

_CURRENCY_MAP = {
    "$": ("dollar", "cent"),
    "£": ("pound", "penny"),

Contributor

🟡 Incorrect pluralization of "penny" produces "pennys" instead of "pence"

The _expand_currency function naively appends 's' to pluralize all fractional currency names. For $ this works ("cent" → "cents"), but for £ it produces "pennys" instead of the correct "pence". For example, £2.50 becomes "two pounds and fifty pennys".

Suggested change
- "£": ("pound", "penny"),
+ "£": ("pound", "pence"),
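
One caveat: if the function keeps appending 's' as described, swapping the entry to "pence" would likely yield "pences" for plural amounts and "one pence" for a single penny. A possible alternative, sketched below under assumptions, is to store explicit singular and plural forms. `_CURRENCY_FORMS`, `unit_name`, and `currency_units` are hypothetical names, not the plugin's, and number-to-words conversion is left out to keep the sketch short.

```python
# Hypothetical map shape: symbol -> (major (singular, plural), minor (singular, plural))
_CURRENCY_FORMS = {
    "$": (("dollar", "dollars"), ("cent", "cents")),
    "£": (("pound", "pounds"), ("penny", "pence")),
}


def unit_name(forms: tuple[str, str], amount: int) -> str:
    # Pick the singular form only for exactly one unit.
    return forms[0] if amount == 1 else forms[1]


def currency_units(symbol: str, major: int, minor: int) -> str:
    # Digits are kept as-is; a real expander would also spell out the numbers.
    major_forms, minor_forms = _CURRENCY_FORMS[symbol]
    return (
        f"{major} {unit_name(major_forms, major)} and "
        f"{minor} {unit_name(minor_forms, minor)}"
    )
```

Storing both forms avoids any suffix logic, so irregular plurals need no special cases in the expansion function.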