[Auto Parallel] Add spmd rule No.2-3、8 for (log_softmax, cummax, cummin) and their backward ops #72720
Conversation
Your PR has been submitted successfully. Thank you for contributing to the open-source project!
/re-run all-failed
@Yeenyeong This PR is ready to be reviewed, thanks!
@@ -86,6 +86,10 @@ std::unordered_map<std::string, int64_t> ShardingMergeForTensors(
   for (auto& pair : tensor_axes_to_dim_pairs) {
     for (size_t i = 0; i < pair.second.size(); ++i) {
       auto tensor_axis = pair.first.substr(i, 1);
+      // Cooperate with GetDimsMappingForAxes to treat "1" as replicated.
+      if (tensor_axis == "1") {
What scenario do these lines of code work in? Has any bug actually happened here?
GetDimsMappingForAxes automatically treats "1" as replicated, and the input axis_to_dim_map always comes from the output of ShardingMergeForTensors. I was thinking of a situation like this: when x_axes="a1b" with dims_mapping={0,1,-1} and y_axes="a1b" with dims_mapping={0,-1,1}, the current merge outputs {a:0, 1:1, b:-1}, so b maps to -1. I think b could also map to 1 (because an axis we name "1" should map to -1 anyway). I'm not sure my modification is necessary; this case might never arise in practice. Should I change it back?
In the situation you assumed, the output of merging should be {a:0, 1:-1, b:1}, which follows from the rules that the ShardingMergeForAxis function implements.
In addition, an axis named "1" should always be in Replicate mode, so if dims_mapping["1"] != -1 as assumed in your situation, something has likely already gone wrong. We'd better not skip checking an axis named "1".
At the very least, this modification may affect every other spmd-rule-inference function that calls this one, so we'd better not modify the code unless we have to.
Thanks!
Thank you for such a clear and logical answer!
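To make the merge rule discussed above concrete, here is a minimal standalone sketch — not Paddle's actual implementation; MergeAxis is a simplified stand-in for ShardingMergeForAxis — that reproduces the reviewer's example: a concrete mesh dim wins over -1 (replicate), and an axis literally named "1" is forced back to replicate afterwards.

#include <cstdint>
#include <iostream>
#include <map>
#include <string>
#include <vector>

// Simplified stand-in for ShardingMergeForAxis: prefer the sharded
// mapping over -1 (replicate). Conflict handling is omitted.
int64_t MergeAxis(int64_t lhs, int64_t rhs) {
  return lhs == -1 ? rhs : lhs;
}

int main() {
  // x_axes = "a1b", dims_mapping = {0, 1, -1}
  // y_axes = "a1b", dims_mapping = {0, -1, 1}
  std::string axes = "a1b";
  std::vector<int64_t> x_dims = {0, 1, -1};
  std::vector<int64_t> y_dims = {0, -1, 1};

  std::map<std::string, int64_t> merged;
  for (size_t i = 0; i < axes.size(); ++i) {
    std::string axis = axes.substr(i, 1);
    int64_t m = MergeAxis(x_dims[i], y_dims[i]);
    if (axis == "1") m = -1;  // an axis named "1" must stay replicated
    merged[axis] = m;
  }
  for (auto& kv : merged) {
    std::cout << kv.first << " -> " << kv.second << "\n";
  }
  // Prints: 1 -> -1, a -> 0, b -> 1, i.e. the merge yields {a:0, 1:-1, b:1}.
}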
@@ -95,24 +91,29 @@ SpmdInfo TopkGradInferSpmd(const DistMetaTensor& x,
       axis));
   // Build einsum notation
   std::string alphabet = "abcdefghijlopqrstuvwxyz";
-  std::string x_axes = alphabet.substr(0, x_ndim - 1);
+  std::string x_axes = alphabet.substr(0, x_ndim);
   x_axes[axis] = '1';
Why do we need this line of code? Can we delete L95?
Thanks! It has been deleted. Because all dims_mapping[axis] entries are set to -1 afterwards, it was redundant.
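For illustration, a tiny standalone sketch of the notation-building step after the fix, assuming x_ndim = 3 and axis = 1 (both values are illustrative): every tensor dim gets a letter, and the reduced axis is renamed to '1' so the merge step treats it as replicated.

#include <iostream>
#include <string>

int main() {
  int x_ndim = 3;  // assumed tensor rank
  int axis = 1;    // assumed operation axis
  std::string alphabet = "abcdefghijlopqrstuvwxyz";
  std::string x_axes = alphabet.substr(0, x_ndim);  // "abc": one letter per dim
  x_axes[axis] = '1';                               // "a1c": axis 1 replicated
  std::cout << x_axes << "\n";                      // prints a1c
}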
PD_REGISTER_SPMD_RULE(cummax,
                      PD_INFER_SPMD(phi::distributed::TopkInferSpmdBase),
                      PD_INFER_SPMD(phi::distributed::TopkGradInferSpmdBase));
Even if CummaxInferSpmd actually calls TopkInferSpmdBase, it is best to register CummaxInferSpmd instead of TopkInferSpmdBase. Also, the parameters of CummaxInferSpmd and TopkInferSpmdBase are different. The same issue applies to CummaxGradInferSpmd and TopkGradInferSpmdBase.
Thanks! I adapted InferSpmdContext::AttrAt in inferspmd_utils.cc and parse_attr in auto_parallel_py.cc to add support for a new DataType attribute, and used CummaxInferSpmd instead of TopkInferSpmdBase.
PD_REGISTER_SPMD_RULE(cummin,
                      PD_INFER_SPMD(phi::distributed::TopkInferSpmdBase),
                      PD_INFER_SPMD(phi::distributed::TopkGradInferSpmdBase));
Same issue as above.
LGTM
PR Category
Auto Parallel
PR Types
New features
Description
Added sharding (SPMD) inference rules for the following operators: