Commit 414e2f0
Add MoE load balancing loss to distillation
1 parent 51c7f2b commit 414e2f0
6 files changed
Lines changed: 43 additions & 3 deletions
File tree
- src/maxtext
  - models
  - trainers/post_train/distillation
  - utils
- tests/post_training/unit
Hunk @@ -505,6 +505,8 @@: new lines 508-509 added.
Hunk @@ -1055,7 +1055,7 @@: line 1058 replaced.
Hunk @@ -1299,7 +1299,7 @@: line 1302 replaced.
Lines changed: 8 additions & 0 deletions
Hunk @@ -52,6 +52,8 @@: new lines 55-56 added.
Hunk @@ -373,13 +375,19 @@: new lines 378-382 and 390 added.
Lines changed: 7 additions & 1 deletion
Hunk @@ -150,8 +150,14 @@: new lines 153-158 added; old line 154 replaced by new line 160.
Hunk @@ -1081,6 +1081,29 @@: new lines 1084-1106 added.
Hunk @@ -437,6 +437,7 @@: new line 440 added.