[Auto Parallel] add main_grad for sharding in auto dy #72493
base: develop
Your PR was submitted successfully. Thank you for contributing to the open-source project!
900f276 to 388364d
Codecov Report
Additional details and impacted files
@@            Coverage Diff             @@
##           develop   #72493   +/-   ##
==========================================
  Coverage         ?   97.43%
==========================================
  Files            ?        3
  Lines            ?       39
  Branches         ?        0
==========================================
  Hits             ?       38
  Misses           ?        1
  Partials         ?        0
LGTM.
Why use main_grad instead of master_grad here?
if param._grad_ivar() is not None:
    grad_var = param._grad_ivar()
    params_grads.append((param, grad_var))
if os.getenv("FLAGS_enable_tensor_fusion") == '1':
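- The flag check should also handle the values true and True.
- Checking FLAGS_enable_tensor_fusion separately in auto_cast.py, api.py, and optimizer.py is not recommended; it could be turned into a configuration option instead.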
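To illustrate the first point, here is a minimal sketch of a case-insensitive flag check. The helper name env_flag_enabled, the accepted spellings, and the optional environ parameter are all assumptions made for this example; this is not an existing Paddle API. Keeping the parsing in one helper would also give auto_cast.py, api.py, and optimizer.py a single place to ask, which touches on the second point.

```python
import os

# Hypothetical set of spellings treated as "enabled"; the exact list is an assumption.
_TRUTHY = {"1", "true", "yes", "on"}


def env_flag_enabled(name, default=False, environ=None):
    """Return True if the environment variable `name` holds a truthy value.

    `environ` can be replaced by a plain dict for testing; it defaults to os.environ.
    """
    env = os.environ if environ is None else environ
    raw = env.get(name)
    if raw is None:
        return default
    # Normalise whitespace and case so "1", "true" and "True" all match.
    return raw.strip().lower() in _TRUTHY


if __name__ == "__main__":
    # Reads the real environment; prints False unless the flag is set.
    print(env_flag_enabled("FLAGS_enable_tensor_fusion"))
```

With such a helper, a check like os.getenv("FLAGS_enable_tensor_fusion") == '1' could become env_flag_enabled("FLAGS_enable_tensor_fusion"), which also accepts true and True.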
OK, I will address these together in a follow-up PR.
tensor_fusion fuses the grads into one contiguous fuse_grad, which requires the grads to be updated in place to avoid a concat at every step.
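The idea can be sketched without any framework. The example below is only an illustration under stated assumptions: it uses the standard-library array and memoryview types as stand-ins for tensors, and the names build_fused_grad and accumulate_inplace are hypothetical, not taken from the PR. Each parameter's gradient is a view into one contiguous buffer, so accumulating in place updates the fused buffer directly and no concat is needed.

```python
from array import array


def build_fused_grad(sizes):
    """Allocate one contiguous float buffer and return it with one view per parameter."""
    fused = array("f", [0.0] * sum(sizes))
    whole = memoryview(fused)
    views = []
    offset = 0
    for n in sizes:
        # Slicing a memoryview shares memory with `fused`; nothing is copied.
        views.append(whole[offset:offset + n])
        offset += n
    return fused, views


def accumulate_inplace(view, grad):
    """Add `grad` into `view` element by element, writing through to the fused buffer."""
    for i, g in enumerate(grad):
        view[i] += g


if __name__ == "__main__":
    fused, views = build_fused_grad([2, 3])
    accumulate_inplace(views[0], [1.0, 2.0])
    accumulate_inplace(views[1], [0.5, 0.5, 0.5])
    accumulate_inplace(views[0], [1.0, 1.0])
    print(list(fused))  # [2.0, 3.0, 0.5, 0.5, 0.5]
```

If each step instead produced a fresh gradient object per parameter, the fused buffer would have to be rebuilt by concatenation every step, which is the cost the in-place grad avoids.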
PR Category
Auto Parallel
PR Types
New features
Description
param.main_grad replaces the old master_grad in auto dy.
Enable with: export FLAGS_enable_tensor_fusion=1
Related sharding tensor_fusion PR: #72508
Pcard-70448