[Optimization] merge matmul and add by BingooYang · Pull Request #6986 · PaddlePaddle/FastDeploy

BingooYang · 2026-03-24T06:23:27Z

Motivation

Performance optimization.

Modifications

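Replace the matmul and add calls in UnquantizedLinearMethod with linear.
With a bias, this is generally faster. Without a bias, performance drops for small shapes, mainly because of Python-level scheduling overhead such as the if branch; linear is itself implemented with matmul internally.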

Usage or Command

None

Accuracy Tests

Accuracy is unchanged.

Checklist

[ x ] Add at least a tag in the PR title.
- Tag list: [[FDConfig],[APIServer],[Engine], [Scheduler], [PD Disaggregation], [Executor], [Graph Optimization], [Speculative Decoding], [RL], [Models], [Quantization], [Loader], [OP], [KVCache], [DataProcessor], [BugFix], [Docs], [CI], [Optimization], [Feature], [Benchmark], [Others], [XPU], [HPU], [GCU], [DCU], [Iluvatar], [Metax]]
- You can add new tags based on the PR content, but the semantics must be clear.
[ x ] Format your code, run pre-commit before commit.
[ x ] Add unit tests. Please write the reason in this PR if no unit tests.
[ x ] Provide accuracy results.
[ x ] If the current PR is submitting to the release branch, make sure the PR has been submitted to the develop branch, then cherry-pick it to the release branch with the [Cherry-Pick] PR tag.

paddle-bot · 2026-03-24T06:23:36Z

Thanks for your contribution!

codecov-commenter · 2026-03-24T08:12:49Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
⚠️ Please upload report for BASE (develop@98f3fc9). Learn more about missing BASE report.

Additional details and impacted files

@@            Coverage Diff             @@
##             develop    #6986   +/-   ##
==========================================
  Coverage           ?   73.99%           
==========================================
  Files              ?      376           
  Lines              ?    53406           
  Branches           ?     8470           
==========================================
  Hits               ?    39518           
  Misses             ?    11138           
  Partials           ?     2750

Flag	Coverage Δ
GPU	`73.99% <100.00%> (?)`

BingooYang · 2026-03-25T12:40:19Z

/re-run all-failed

BingooYang · 2026-03-26T05:25:40Z

/re-run all-failed

BingooYang · 2026-03-26T13:22:22Z

/re-run all-failed

PaddlePaddle-bot

🤖 AI Code Review | 2026-04-03 17:42 CST

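📋 Review Summary

PR overview: Replaces paddle.matmul + paddle.add in UnquantizedLinearMethod.apply() with paddle.nn.functional.linear to reduce operator call overhead.

Scope of changes: model_executor/layers/linear.py (core optimization) and updated test baseline paths

Impact tags: OP Optimization

Issues

No blocking issues found.

✅ Code Analysis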

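Check	Result
PR title convention	✅ Contains the `[Optimization]` tag
PR description completeness	✅ Motivation/Modifications are filled in, with performance comparison charts attached
Code logic correctness	✅ PaddlePaddle's `F.linear(x, weight, bias)` computes `x @ weight + bias`, which is equivalent to the original `matmul + add` implementation
New assert check	✅ Verifies that the bias shape matches the last dimension of weight, which helps catch configuration errors early
Test coverage	✅ Baseline updated (0402 → 0403); accuracy verification passed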
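The equivalence and the shape check noted in the table can be illustrated without any framework. The sketch below uses plain Python lists as stand-ins for tensors; the helper names (matmul, add_bias, linear) are hypothetical and only mirror the arithmetic, and the assert is written in the spirit of the one the review describes, not copied from the PR.

```python
import math


def matmul(x, w):
    """Naive product of x ([m][k]) and w ([k][n]) as lists of lists."""
    return [[sum(xi * w[i][j] for i, xi in enumerate(row))
             for j in range(len(w[0]))]
            for row in x]


def add_bias(y, b):
    """Add bias b ([n]) to every row of y ([m][n])."""
    return [[v + b[j] for j, v in enumerate(row)] for row in y]


def linear(x, w, b):
    """Single-pass x @ w + b, with the bias seeding each accumulator."""
    # Assumed form of the check: bias length must equal weight's last dimension.
    assert len(b) == len(w[0]), "bias shape must match weight's last dimension"
    out = []
    for row in x:
        out_row = []
        for j in range(len(w[0])):
            acc = b[j]
            for i, xi in enumerate(row):
                acc += xi * w[i][j]
            out_row.append(acc)
        out.append(out_row)
    return out


x = [[1.0, 2.0], [3.0, 4.0]]
w = [[1.0, 0.0, -1.0], [0.5, 2.0, 1.0]]  # [in_features][out_features]
b = [0.1, 0.2, 0.3]

two_step = add_bias(matmul(x, w), b)
fused = linear(x, w, b)
same = all(math.isclose(p, q, rel_tol=1e-9, abs_tol=1e-12)
           for r1, r2 in zip(two_step, fused)
           for p, q in zip(r1, r2))
```

The comparison uses a tolerance instead of exact equality because the two paths add the bias at different points in the accumulation, so the last bits of the result may differ.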

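Overall Assessment

The change is logically correct and the optimization is reasonable. With a bias, using F.linear reduces Python-level scheduling overhead; without a bias, the code keeps matmul to avoid the performance loss on small shapes, which is consistent with the performance analysis in the PR description.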

* replace matmul+add to linear * modify baseline

* merge matmul and add * modify format * using paddle.nn.functional.linear * using _C_ops.linear * using paddle.nn.functional.linear * add FLAGS_use_legacy_linear env var in test case * fix format * add assert and remove env * modify format * using matmul for no bias * modify accurate baseline

BingooYang had a problem deploying to Metax_ci March 24, 2026 06:23 — with GitHub Actions Failure

BingooYang temporarily deployed to Metax_ci March 24, 2026 06:27 — with GitHub Actions Inactive

zhangbo9674 reviewed Mar 24, 2026

Comment thread fastdeploy/model_executor/layers/linear.py Outdated

BingooYang temporarily deployed to Metax_ci March 24, 2026 08:57 — with GitHub Actions Inactive

BingooYang temporarily deployed to Metax_ci March 26, 2026 14:08 — with GitHub Actions Inactive

BingooYang temporarily deployed to Metax_ci March 27, 2026 03:48 — with GitHub Actions Inactive

BingooYang temporarily deployed to Metax_ci March 29, 2026 09:10 — with GitHub Actions Inactive

BingooYang temporarily deployed to Metax_ci March 29, 2026 14:23 — with GitHub Actions Inactive

qingqing01 reviewed Mar 31, 2026

Comment thread tests/e2e/4cards_cases/test_GLM_45_AIR_mtp_tp4.py Outdated

BingooYang had a problem deploying to Metax_ci April 3, 2026 03:28 — with GitHub Actions Error

BingooYang had a problem deploying to Metax_ci April 3, 2026 03:30 — with GitHub Actions Failure

BingooYang had a problem deploying to Metax_ci April 3, 2026 04:28 — with GitHub Actions Failure

yuanlehome previously approved these changes Apr 3, 2026

BingooYang added 11 commits April 3, 2026 15:27

merge matmul and add

9d83df2

modify format

954645b

using paddle.nn.functional.linear

272272c

using _C_ops.linear

70a0724

using paddle.nn.functional.linear

7ac2448

add FLAGS_use_legacy_linear env var in test case

a8ed2a1

fix format

c420da7

add assert and remove env

0c421b7

modify format

251eb4e

using matmul for no bias

6acbde5

modify accurate baseline

ddabdb1

BingooYang dismissed yuanlehome’s stale review via ddabdb1 April 3, 2026 07:28

BingooYang force-pushed the linear_opt branch from bb6edb1 to ddabdb1 Compare April 3, 2026 07:28

BingooYang temporarily deployed to Metax_ci April 3, 2026 07:28 — with GitHub Actions Inactive

PaddlePaddle-bot reviewed Apr 3, 2026

qingqing01 approved these changes Apr 3, 2026

qingqing01 merged commit 2068656 into PaddlePaddle:develop Apr 3, 2026
36 of 37 checks passed

BingooYang had a problem deploying to Metax_ci April 3, 2026 10:05 — with GitHub Actions Failure

zoooo0820 pushed a commit that referenced this pull request Apr 9, 2026

[Cherry-Pick][Optimization] merge matmul and add(#6986) (#7184)

324f083

* replace matmul+add to linear * modify baseline

qingqing01 merged 11 commits into PaddlePaddle:develop from BingooYang:linear_opt

6 participants
