Name and Version
version: 8772 (bafae27)
built with Clang 21.1.8 for Linux aarch64
Operating systems
Linux
GGML backends
CPU
Hardware
Dimensity 9000+ (X2+3xA710)
Models
https://huggingface.co/ggml-org/Qwen3-0.6B-GGUF/resolve/main/Qwen3-0.6B-Q4_0.gguf
Problem description & steps to reproduce
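I tested Qwen3 0.6B Q4_0 on my phone and noticed a severe prompt-processing slowdown starting from 684b361: pp512 drops from 351.60 t/s (3a00c98) to 161.30 t/s, while tg128 stays at about 80 t/s. The regression is still present in bafae27.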
| model | size | params | backend | threads | n_ubatch | fa | test | t/s |
| --- | ---: | ---: | --- | ---: | ---: | ---: | ---: | ---: |
| qwen3 0.6B Q4_0 | 403.42 MiB | 751.63 M | CPU | 4 | 64 | 1 | pp512 | 351.60 ± 0.00 |
| qwen3 0.6B Q4_0 | 403.42 MiB | 751.63 M | CPU | 4 | 64 | 1 | tg128 | 79.88 ± 0.00 |

build: 3a00c98 (8056)

| model | size | params | backend | threads | n_ubatch | fa | test | t/s |
| --- | ---: | ---: | --- | ---: | ---: | ---: | ---: | ---: |
| qwen3 0.6B Q4_0 | 403.42 MiB | 751.63 M | CPU | 4 | 64 | 1 | pp512 | 161.30 ± 0.00 |
| qwen3 0.6B Q4_0 | 403.42 MiB | 751.63 M | CPU | 4 | 64 | 1 | tg128 | 80.05 ± 0.00 |

build: 684b361 (8057)

| model | size | params | backend | threads | n_ubatch | fa | mmap | test | t/s |
| --- | ---: | ---: | --- | ---: | ---: | ---: | ---: | ---: | ---: |
| qwen3 0.6B Q4_0 | 403.42 MiB | 751.63 M | CPU | 4 | 64 | 1 | 0 | pp512 | 155.72 ± 0.00 |
| qwen3 0.6B Q4_0 | 403.42 MiB | 751.63 M | CPU | 4 | 64 | 1 | 0 | tg128 | 79.44 ± 0.00 |

build: bafae27 (8772)
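
To put a number on the regression, here is a minimal sketch that computes the percent change in t/s for each build relative to 3a00c98. The figures are copied from the llama-bench results above; the `results` dict and the `relative_change` helper are just illustrative names, not part of llama.cpp.

```python
# Throughput in tokens/s, copied from the llama-bench results in this report.
results = {
    "3a00c98 (8056)": {"pp512": 351.60, "tg128": 79.88},
    "684b361 (8057)": {"pp512": 161.30, "tg128": 80.05},
    "bafae27 (8772)": {"pp512": 155.72, "tg128": 79.44},
}

# Last known good build, used as the reference point.
BASELINE = "3a00c98 (8056)"


def relative_change(build: str, test: str) -> float:
    """Percent change in t/s of `build` versus the baseline build."""
    base = results[BASELINE][test]
    return (results[build][test] - base) / base * 100.0


for build in results:
    if build == BASELINE:
        continue
    for test in ("pp512", "tg128"):
        print(f"{build} {test}: {relative_change(build, test):+.1f}%")
```

By this measure pp512 is about 54% lower at 684b361 and about 56% lower at bafae27, while tg128 stays within 1% of the baseline, so only prompt processing appears to be affected.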
First Bad Commit
684b361
Relevant log output
bin/llama-bench -m /sdcard/Download/Qwen3-0.6B-Q4_0.gguf -fa 1 -ub 64 -r 1 -mmp 0 -t 4