Eval bug: Performance degradation on ARM starting from b8057 #21854

@pt13762104

Name and Version

version: 8772 (bafae27)
built with Clang 21.1.8 for Linux aarch64

Operating systems

Linux

GGML backends

CPU

Hardware

Dimensity 9000+ (X2+3xA710)

Models

https://huggingface.co/ggml-org/Qwen3-0.6B-GGUF/resolve/main/Qwen3-0.6B-Q4_0.gguf

Problem description & steps to reproduce

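I tested Qwen3 0.6B Q4_0 on my phone and noticed a severe performance degradation starting from 684b361: prompt processing (pp512) drops from 351.60 t/s to 161.30 t/s, while token generation (tg128) stays at about 80 t/s.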

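| model | size | params | backend | threads | n_ubatch | fa | test | t/s |
| --- | ---: | ---: | --- | ---: | ---: | ---: | ---: | ---: |
| qwen3 0.6B Q4_0 | 403.42 MiB | 751.63 M | CPU | 4 | 64 | 1 | pp512 | 351.60 ± 0.00 |
| qwen3 0.6B Q4_0 | 403.42 MiB | 751.63 M | CPU | 4 | 64 | 1 | tg128 | 79.88 ± 0.00 |

build: 3a00c98 (8056)

| model | size | params | backend | threads | n_ubatch | fa | test | t/s |
| --- | ---: | ---: | --- | ---: | ---: | ---: | ---: | ---: |
| qwen3 0.6B Q4_0 | 403.42 MiB | 751.63 M | CPU | 4 | 64 | 1 | pp512 | 161.30 ± 0.00 |
| qwen3 0.6B Q4_0 | 403.42 MiB | 751.63 M | CPU | 4 | 64 | 1 | tg128 | 80.05 ± 0.00 |

build: 684b361 (8057)

| model | size | params | backend | threads | n_ubatch | fa | mmap | test | t/s |
| --- | ---: | ---: | --- | ---: | ---: | ---: | ---: | ---: | ---: |
| qwen3 0.6B Q4_0 | 403.42 MiB | 751.63 M | CPU | 4 | 64 | 1 | 0 | pp512 | 155.72 ± 0.00 |
| qwen3 0.6B Q4_0 | 403.42 MiB | 751.63 M | CPU | 4 | 64 | 1 | 0 | tg128 | 79.44 ± 0.00 |

build: bafae27 (8772)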
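
For a rough sense of the size of the regression, the throughput figures above can be compared against build 3a00c98. The snippet below is only a minimal sketch and not part of llama.cpp: the `results` dictionary and the `relative_change` helper are illustrative names, and the values are copied by hand from the benchmark output above.

```python
# Throughput in tokens/second, copied by hand from the llama-bench output above.
results = {
    "3a00c98 (8056)": {"pp512": 351.60, "tg128": 79.88},
    "684b361 (8057)": {"pp512": 161.30, "tg128": 80.05},
    "bafae27 (8772)": {"pp512": 155.72, "tg128": 79.44},
}


def relative_change(baseline: float, value: float) -> float:
    """Return the percentage change of `value` relative to `baseline`."""
    return (value - baseline) / baseline * 100.0


# Use the last known good build as the reference point.
baseline = results["3a00c98 (8056)"]

# Percentage change per build and per test, relative to the baseline build.
changes = {
    build: {test: relative_change(baseline[test], tps) for test, tps in tests.items()}
    for build, tests in results.items()
}

for build, tests in changes.items():
    for test, pct in tests.items():
        print(f"{build} {test}: {pct:+.1f}% vs 3a00c98")
```

With these numbers, pp512 comes out roughly 54% lower on 684b361 and roughly 56% lower on bafae27, while tg128 stays within 1% of the baseline on both builds.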

First Bad Commit

684b361

Relevant log output

bin/llama-bench -m /sdcard/Download/Qwen3-0.6B-Q4_0.gguf -fa 1 -ub 64 -r 1 -mmp 0 -t 4
