Commit 5b3eb3d
[TRITON] Add Attention support to the bench_models benchmarking script (#2274)
* Add Attention support to bench_models.py
* Add MHA layout CLI arg
* Add support for batched_gemm_a16wfp4
* Refactor TP logic and _get_handler
* Remove unified attention from this branch
1 parent 5f9fdc9 commit 5b3eb3d
5 files changed
Lines changed: 364 additions & 102 deletions
File tree
- op_tests/op_benchmarks/triton
- model_benchmarking_tool
Lines changed: 14 additions & 11 deletions