Add custom Qwen3 model with configurable attention and latentMoE. by copybara-service[bot] · Pull Request #3613 · AI-Hypercomputer/maxtext

copybara-service · 2026-04-08T23:03:45Z

Add custom Qwen3 model with configurable attention and latentMoE.

Specifically, this introduces:

* `attention_output_dim` and `moe_expert_input_dim` to allow the attention block output and the MoE expert input to have different dimensionalities than the base embedding dimension.
* A `dense_init_scale` config to allow configuring the initialization scale for dense layers across all models (replacing the hardcoded 1.0).
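As a rough illustration of what these options imply, the sketch below resolves the two new dimensions and shows how an initialization scale could feed into a fan-in variance-scaling initializer. It is not MaxText code. Only the three config names come from this PR; the `SketchConfig` class, the helper functions, the fallback to `emb_dim` when a dimension is unset, the kernel names, and the `sqrt(scale / fan_in)` formula are all assumptions made for illustration.

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass
class SketchConfig:
    """Hypothetical stand-in for a model config; not the MaxText config class."""
    emb_dim: int
    # Assumption: leaving these unset falls back to emb_dim.
    attention_output_dim: Optional[int] = None
    moe_expert_input_dim: Optional[int] = None
    # 1.0 mirrors the value the PR description says was previously hardcoded.
    dense_init_scale: float = 1.0


def resolved_dims(cfg: SketchConfig) -> tuple[int, int]:
    """Return (attention output dim, MoE expert input dim) after applying fallbacks."""
    attn_out = cfg.attention_output_dim or cfg.emb_dim
    expert_in = cfg.moe_expert_input_dim or cfg.emb_dim
    return attn_out, expert_in


def kernel_shapes(cfg: SketchConfig, num_heads: int, head_dim: int) -> dict:
    """Assumed kernel shapes for the two projections affected by the new options."""
    attn_out, expert_in = resolved_dims(cfg)
    return {
        # Concatenated attention heads projected to attention_output_dim.
        "attention_out_kernel": (num_heads * head_dim, attn_out),
        # Residual-stream activations projected into the expert input space.
        "expert_input_kernel": (cfg.emb_dim, expert_in),
    }


def init_stddev(scale: float, fan_in: int) -> float:
    """Stddev of a fan-in variance-scaling initializer (untruncated normal)."""
    return math.sqrt(scale / fan_in)


# Hypothetical sizes chosen only to make the shapes easy to read.
cfg = SketchConfig(emb_dim=2048, attention_output_dim=1024, moe_expert_input_dim=512)
shapes = kernel_shapes(cfg, num_heads=32, head_dim=128)
stddev = init_stddev(cfg.dense_init_scale, fan_in=cfg.emb_dim)
```

With both new dimensions left unset, the sketch reduces to the usual case where every projection reads from and writes to `emb_dim`.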

codecov · 2026-04-08T23:08:58Z

Codecov Report

❌ Patch coverage is 38.75000% with 49 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/maxtext/models/qwen3_custom.py | 37.68% | 43 Missing ⚠️ |
| src/maxtext/layers/moe.py | 50.00% | 4 Missing ⚠️ |
| src/maxtext/layers/decoders.py | 0.00% | 1 Missing and 1 partial ⚠️ |
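The reported percentages can be cross-checked with a little arithmetic. The sketch below assumes patch coverage is computed as covered lines divided by all changed lines, with partially covered lines counted as missing; the helper names are hypothetical and not part of any Codecov API.

```python
def total_lines(missing: int, coverage_pct: float) -> int:
    """Back out the number of changed lines from the missing count and coverage %."""
    return round(missing / (1 - coverage_pct / 100))


def coverage_pct(total: int, missing: int) -> float:
    """Coverage percentage, counting partial lines as missing (assumption)."""
    return 100 * (total - missing) / total


patch_total = total_lines(49, 38.75)   # whole patch: 49 lines missing coverage
qwen_total = total_lines(43, 37.68)    # qwen3_custom.py
moe_total = total_lines(4, 50.00)      # moe.py
decoders_total = 1 + 1                 # decoders.py: 1 missing + 1 partial at 0%

# Changed lines not attributed to the three listed files.
unlisted = patch_total - (qwen_total + moe_total + decoders_total)

print(patch_total, qwen_total, moe_total, decoders_total, unlisted)
print(round(coverage_pct(patch_total, 49), 2), round(coverage_pct(qwen_total, 43), 2))
```

Under these assumptions the patch touches 80 measured lines, of which 31 are covered. The three listed files account for 79 of them, so the remaining line presumably sits in a file with no missing coverage, which the table does not list.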


PiperOrigin-RevId: 901021328

copybara-service bot assigned gagika Apr 8, 2026

copybara-service bot force-pushed the test_896749554 branch 3 times, most recently from 86b6433 to 4d67a7d April 15, 2026 01:39

copybara-service bot requested a review from abhinavclemson as a code owner April 15, 2026 01:39

copybara-service bot force-pushed the test_896749554 branch 2 times, most recently from 8216eff to a1a0123 April 15, 2026 20:56

copybara-service bot force-pushed the test_896749554 branch 2 times, most recently from b21b20a to f679f0a April 15, 2026 22:05

copybara-service bot changed the title ~~Custom Qwen 30B-A3B~~ Add custom Qwen3 model with configurable attention and latentMoE. Apr 15, 2026

copybara-service bot force-pushed the test_896749554 branch 15 times, most recently from 9350e8a to 3cf64af April 17, 2026 01:52

copybara-service bot force-pushed the test_896749554 branch from 3cf64af to 7f78228 April 17, 2026 02:25

copybara-service bot merged commit 7f78228 into main Apr 17, 2026

copybara-service bot deleted the test_896749554 branch April 17, 2026 02:25
