
[Auto Parallel] add main_grad for sharding in auto dy #72493


Open · wants to merge 11 commits into develop from FLAGS_enable_inplace_master_grad

Conversation

Contributor

@Xing-lil Xing-lil commented Apr 25, 2025

PR Category

Auto Parallel

PR Types

New features

Description

  • Add an in-place param.main_grad that replaces the old master_grad in auto dynamic mode.
  • Enable it by setting export FLAGS_enable_tensor_fusion=1 (a usage sketch follows at the end of this description).

Related sharding tensor_fusion PR: #72508

Pcard-70448
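
A minimal usage sketch of how the flag and the new attribute fit together in dynamic mode. Only FLAGS_enable_tensor_fusion and param.main_grad come from this PR's description; the model, optimizer, and AMP wiring below are illustrative assumptions, and the auto-parallel sharding setup is omitted.

```python
import os

# Opt in before building optimizer/AMP state (per the PR description).
os.environ["FLAGS_enable_tensor_fusion"] = "1"

import paddle

model = paddle.nn.Linear(8, 8)
opt = paddle.optimizer.AdamW(parameters=model.parameters())

with paddle.amp.auto_cast():
    loss = model(paddle.randn([4, 8])).mean()
loss.backward()

# With the flag on, each parameter is expected to expose an fp32
# accumulation buffer as param.main_grad instead of a separate master_grad.
for p in model.parameters():
    if getattr(p, "main_grad", None) is not None:
        print(p.name, p.main_grad.dtype)

opt.step()
opt.clear_grad()
```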


paddle-bot bot commented Apr 25, 2025

Your PR has been submitted. Thanks for your contribution!
Please wait for the CI results first. See the Paddle CI Manual for details.

@Xing-lil Xing-lil force-pushed the FLAGS_enable_inplace_master_grad branch from 900f276 to 388364d on May 12, 2025 11:20
@Xing-lil Xing-lil changed the title from "[Auto Parallel] add Flags_enable_inplace_master_grad" to "[Auto Parallel] add main_grad for sharding in auto dy" on May 13, 2025
@codecov-commenter

Codecov Report

Attention: Patch coverage is 97.43590% with 1 line in your changes missing coverage. Please review.

Please upload report for BASE (develop@441816a). Learn more about missing BASE report.

Files with missing lines | Patch % | Lines
python/paddle/amp/auto_cast.py | 96.55% | 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             develop   #72493   +/-   ##
==========================================
  Coverage           ?   97.43%           
==========================================
  Files              ?        3           
  Lines              ?       39           
  Branches           ?        0           
==========================================
  Hits               ?       38           
  Misses             ?        1           
  Partials           ?        0           

☔ View full report in Codecov by Sentry.

Contributor

@liym27 liym27 left a comment


LGTM.

Why use main_grad instead of master_grad here?

if param._grad_ivar() is not None:
    grad_var = param._grad_ivar()
    params_grads.append((param, grad_var))
    if os.getenv("FLAGS_enable_tensor_fusion") == '1':
Contributor


  1. The flag check also needs to handle the values true and True (a sketch of a tolerant check follows below).
  2. Rather than checking FLAGS_enable_tensor_fusion directly in auto_cast.py, api.py, and optimizer.py, consider turning it into a configuration option.
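
A sketch of the kind of tolerant check the first point asks for; _tensor_fusion_enabled is a hypothetical helper name, not something introduced by this PR.

```python
import os

def _tensor_fusion_enabled() -> bool:
    # Accept '1', 'true', 'True' (any casing) rather than comparing
    # the raw string against '1' only.
    value = os.getenv("FLAGS_enable_tensor_fusion", "0")
    return value.strip().lower() in ("1", "true")
```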

Contributor Author


OK, I will address both points together in a follow-up PR.

@Xing-lil
Contributor Author

> LGTM.
>
> Why use main_grad instead of master_grad here?

tensor_fusion fuses the per-parameter grads into one contiguous fuse_grad, which requires in-place grads to avoid a concat at every step.
Differences between main_grad and master_grad (illustrated in the sketch after this list):

  1. main_grad uses an in-place cast, whereas master_grad does not.
  2. main_grad is cast after each grad node, while master_grad is cast after the entire backward pass.
  3. main_grad is accessed via param.main_grad, while master_grad is accessed through the grad itself.
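
A small contrast of the two accumulation styles using plain tensors; this is my own illustration of the semantics described above, not code from this PR.

```python
import paddle

# A low-precision gradient produced by one grad node.
fp16_grad = paddle.ones([4], dtype="float16")

# master_grad style: keep fp16 grads through backward, then materialize a
# separate fp32 copy afterwards (an out-of-place cast).
master_grad = fp16_grad.astype("float32")

# main_grad style: the parameter owns a persistent fp32 buffer that is
# updated in place right after each grad node, so a fused, contiguous
# fuse_grad over these buffers stays valid without a per-step concat.
main_grad = paddle.zeros([4], dtype="float32")
main_grad.add_(fp16_grad.astype("float32"))
```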
