
[PHI] fix paddle.Tensor.logit for big tensor #73046


Merged
merged 5 commits into from
Jun 5, 2025

DanielSun11
Contributor

@DanielSun11 DanielSun11 commented Jun 1, 2025

PR Category

Execute Infrastructure

PR Types

Bug fixes

Description

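paddle.Tensor.logit reports the following error:
[image]
The error shows that the result of the logit backward computation does not match torch: some values in torch's backward result are NaN, whereas Paddle initializes them to 0. According to the API description at https://www.paddlepaddle.org.cn/documentation/docs/zh/3.0-beta/api/paddle/logit_cn.html#logit, when the eps parameter is set to None, the outputs for inputs > 1 or < 0 are set to NaN. Testing shows, however, that torch also sets the backward values for these inputs to NaN, while Paddle sets them to 0. The problem is not limited to big tensors; it also occurs with other, smaller shapes.
Fix:
Modify the LogitGradFunctor.operator method called by logitGradkernel so that, when eps is not set, the fill value is NaN instead of 0.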
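
The intended semantics can be sketched in plain Python as a scalar reference. This is only an illustration, not the Paddle kernel code: the function names `logit_forward` and `logit_backward` are hypothetical, and the zero gradient outside `[eps, 1 - eps]` when eps is set is an assumption based on the usual clamping convention rather than something stated in this PR.

```python
import math


def logit_forward(x, eps=None):
    """Scalar reference for logit(x) = log(x / (1 - x))."""
    if eps is None:
        # Without eps, inputs outside [0, 1] yield NaN.
        if x < 0.0 or x > 1.0:
            return math.nan
    else:
        # With eps, the input is clamped to [eps, 1 - eps] first.
        x = min(max(x, eps), 1.0 - eps)
    if x == 0.0:
        return -math.inf
    if x == 1.0:
        return math.inf
    return math.log(x / (1.0 - x))


def logit_backward(x, grad_out, eps=None):
    """Scalar reference for d(logit)/dx * grad_out = grad_out / (x * (1 - x))."""
    if eps is None:
        # Behaviour described in this PR: fill NaN (previously 0)
        # for inputs outside [0, 1] when eps is not set.
        if x < 0.0 or x > 1.0:
            return math.nan
    elif x < eps or x > 1.0 - eps:
        # Assumption: clamped inputs receive a zero gradient.
        return 0.0
    denom = x * (1.0 - x)
    if denom == 0.0:
        # Emulate floating-point division by zero instead of raising.
        return math.nan if grad_out == 0.0 else math.copysign(math.inf, grad_out)
    return grad_out / denom


print(logit_backward(0.5, 1.0))            # in-range input
print(logit_backward(1.5, 1.0))            # out of range, eps=None
print(logit_backward(1.5, 1.0, eps=1e-6))  # out of range, eps set
```

With eps unset, the out-of-range input produces NaN in both the forward and backward pass, which is the behaviour the PR description says torch exhibits.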

Comparison of paddle against torch:

  • When the eps parameter is set:
    [image]

  • When the eps parameter is not set (eps=None):
    [image]

Both the forward and backward results match torch.

All previously failing configs in paddleAPITest now pass.

[image]

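The fix applied in #72973 is essentially the same as the one in this PR. Because #72973 was merged into the develop branch first, this PR is used to improve the unit tests.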
pcard-67164

paddle-bot bot commented Jun 1, 2025

Your PR has been submitted. Thanks for your contribution!
Please wait for the CI results first. See the Paddle CI Manual for details.

@DanielSun11
Contributor Author

/re-run all-failed

wanghuancoder
wanghuancoder previously approved these changes Jun 4, 2025
Contributor

@wanghuancoder wanghuancoder left a comment

LGTM

@lshpku lshpku merged commit 9d2c9e4 into PaddlePaddle:develop Jun 5, 2025
50 checks passed