fix: make base-model warning's instruct advice family-aware #137

Merged
inureyes merged 1 commit into main from fix/base-model-warning-instruct-advice
May 29, 2026

@inureyes (Member)

Summary

When a model ships no chat template, mlxcel run and the chat REPL print a warning that advised users to "try a variant named with an -it suffix". That is the Gemma naming convention only. For other families the advice was actively misleading: Llama and Qwen2.5 instruction-tuned checkpoints use -Instruct, and Qwen3 / Qwen3.5 use the plain repo name, with -Base marking the non-instruct variant. As a result, a user running Qwen3.5-0.8B-Base (the base sibling of this repo's own README test model, Qwen3.5-0.8B) was pointed at a non-existent -it repo instead of simply being told to drop -Base.

The advice now states the per-family conventions:

  • Gemma → -it suffix (e.g. gemma-4-e4b-it-4bit)
  • Llama / Qwen2.5 → -Instruct
  • Qwen3 / Qwen3.5 → plain name, with -Base marking the non-instruct variant

Base-model detection is unchanged — it keys on chat-template absence (processor.is_none()), never on the model name. Only the human-facing advice text changed.
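
A minimal sketch of the behaviour described above, not the code in src/commands/chat.rs. It assumes the chat-template processor is held as an `Option`; the type `ChatTemplateProcessor`, the constant `ADVICE`, the function `base_model_warning`, and the exact warning wording are all hypothetical names and text introduced for illustration.

```rust
/// Hypothetical stand-in for the chat-template processor type.
/// The real type used by mlxcel is not shown in this PR.
struct ChatTemplateProcessor;

/// Illustrative advice text listing the per-family naming conventions.
/// The wording is an assumption, not the string shipped in the PR.
const ADVICE: &str = concat!(
    "warning: this model ships no chat template; it is likely a base model.\n",
    "Instruction-tuned variants are named per family:\n",
    "  - Gemma: -it suffix (e.g. gemma-4-e4b-it-4bit)\n",
    "  - Llama / Qwen2.5: -Instruct suffix\n",
    "  - Qwen3 / Qwen3.5: plain name (-Base marks the non-instruct variant)\n",
);

/// Returns the warning only when no chat template is present.
/// Detection looks at template absence alone and never inspects the model name.
fn base_model_warning(processor: Option<&ChatTemplateProcessor>) -> Option<&'static str> {
    if processor.is_none() {
        Some(ADVICE)
    } else {
        None
    }
}
```

Because the text is static and lists every convention, the same message is correct for any family, and no name parsing is needed to choose which advice to print.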

Changes

  • src/commands/chat.rs — replace the -it-only advice block in the base-model warning with family-aware guidance.
  • CHANGELOG.md: [Unreleased] → Fixed entry.

Tests

  • cargo fmt --check — clean.
  • cargo check --features metal,accelerate --bin mlxcel — compiles. No test asserts the warning wording, so no test change is required.

Related

Follow-up to the base-model warning introduced in #134 and the structured User/Assistant fallback in #136.

The no-chat-template warning recommended trying a variant named with an "-it" suffix, but -it is the Gemma convention. For other families that advice is wrong: Llama and Qwen2.5 instruction-tuned checkpoints use -Instruct, and Qwen3 / Qwen3.5 use the plain repo name (with -Base marking the non-instruct variant), so a user running Qwen3.5-0.8B-Base was pointed at a non-existent -it repo instead of being told to drop -Base.

The advice now names the per-family conventions (Gemma -it; Llama / Qwen2.5 -Instruct; Qwen3 / Qwen3.5 plain name vs. -Base). Base-model detection is unchanged: it keys on chat-template absence, never on the model name.
@inureyes added the type:bug, status:review, priority:low, and area:core labels May 29, 2026
@inureyes merged commit 1fc83f9 into main May 29, 2026
4 checks passed
@inureyes deleted the fix/base-model-warning-instruct-advice branch May 29, 2026 00:57
@inureyes added the status:done label and removed the status:review label May 29, 2026
Labels

  • area:core (mlxcel-core: MLX FFI, primitives, KV cache, layers)
  • priority:low (Low priority)
  • status:done (Completed)
  • type:bug (Bug fixes, error corrections, or issue resolutions)