PaddleOCR's logging setup changes the log level of the calling module #14955

Open
3 tasks done
jieliu2000 opened this issue Mar 31, 2025 · 3 comments
jieliu2000 commented Mar 31, 2025

🔎 Search before asking

  • I have searched the PaddleOCR Docs and found no similar bug report.
  • I have searched the PaddleOCR Issues and found no similar bug report.
  • I have searched the PaddleOCR Discussions and found no similar bug report.

🐛 Bug

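As the title says, as soon as paddleocr is called, the logging level of all other code is changed. This is wrong: paddleocr should only change its own log level. I have attached test code below; it runs as expected once the paddleocr initialization is removed, that is, the following snippet: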

        from paddleocr import PaddleOCR
        # Initialize PaddleOCR
        paddle = PaddleOCR(use_angle_cls=True, lang='en')     

🏃‍♂️ Environment

OS: Windows 11
PaddleOCR version: 2.10.0

🌰 Minimal Reproducible Example

import unittest
import logging
import io

class TestPaddleLogging(unittest.TestCase):
    def setUp(self):
        # Capture root logger output in an in-memory buffer
        self.log_output = io.StringIO()
        self.handler = logging.StreamHandler(self.log_output)
        self.handler.setFormatter(logging.Formatter('%(levelname)s - %(message)s'))
        logging.getLogger().addHandler(self.handler)
        logging.getLogger().setLevel(logging.INFO)

    def tearDown(self):
        # Detach the log handler and release the buffer
        logging.getLogger().removeHandler(self.handler)
        self.log_output.close()

    def test_paddle_logging_behavior(self):
        logger = logging.getLogger(__name__)
        
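        # Log a message before initializing PaddleOCR
        pre_init_message = "Log message emitted before PaddleOCR initialization"
        logger.info(pre_init_message)
        
        # Read back the output captured before initialization
        pre_init_log = self.log_output.getvalue()
        self.assertIn(pre_init_message, pre_init_log)
        
        # Clear the log buffer
        self.log_output.truncate(0)
        self.log_output.seek(0)
        
        from paddleocr import PaddleOCR
        # Initialize PaddleOCR
        paddle = PaddleOCR(use_angle_cls=True, lang='en')
        
        # Log a message after initializing PaddleOCR
        post_init_message = "Log message emitted after PaddleOCR initialization"
        logger.info(post_init_message)
        
        # Read back the output captured after initialization
        post_init_log = self.log_output.getvalue()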
        post_init_log = self.log_output.getvalue()
        
        # The post-init message should still be captured;
        # this fails if PaddleOCR changed the log level
        self.assertIn(post_init_message, post_init_log)

if __name__ == '__main__':
    unittest.main()
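
To narrow down what changes, it may help to snapshot the root logger's state before and after the library call and compare the two. The following is a rough diagnostic sketch; `describe_logger` is a hypothetical helper name, and the PaddleOCR call is left as a placeholder comment so the snippet runs on its own.

```python
import logging


def describe_logger(name=None):
    """Return a small, comparable summary of a logger's configuration.

    Hypothetical helper: name=None selects the root logger.
    """
    logger = logging.getLogger(name)
    return {
        "name": logger.name,
        "level": logging.getLevelName(logger.level),
        "effective_level": logging.getLevelName(logger.getEffectiveLevel()),
        "handlers": [type(h).__name__ for h in logger.handlers],
        "propagate": logger.propagate,
        "disabled": logger.disabled,
    }


before = describe_logger()
# ... the library initialization under suspicion would go here ...
after = describe_logger()

# Keep only the fields whose value differs between the two snapshots
changed = {key: (before[key], after[key]) for key in before if before[key] != after[key]}
print(changed)
```

If `changed` lists `level` or `handlers` after the real initialization is put in place of the placeholder, the root logger was modified by the call.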
Collaborator

SWHL commented Apr 1, 2025

I have run into this problem before. For now, all I can do is try to track down the root cause.

SWHL added this to the v3.0.0 milestone Apr 2, 2025
Member

Bobholamovic commented

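I traced this to a bug in the Paddle framework. Specifically, in paddle/distributed/utils/log_utils.py in Paddle 3.0, the get_logger function modifies the root logger by default, which is not good practice for a library. I suggest opening an issue in the Paddle framework repo to report it.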
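
To illustrate the mechanism, here is a minimal sketch of how a helper that targets the root logger by default leaks into the caller's logging, compared with one scoped to a named logger. The helper names `careless_get_logger` and `scoped_get_logger` and the logger names `app` and `mylib` are hypothetical and are not taken from Paddle's source; the sketch only assumes a helper that sets a level on whichever logger it resolves.

```python
import logging


def careless_get_logger(level, name=None):
    """Hypothetical library helper: with name=None it resolves the ROOT logger."""
    logger = logging.getLogger(name)
    logger.setLevel(level)  # changes the level every unconfigured logger inherits
    return logger


def scoped_get_logger(level, name="mylib"):
    """Hypothetical library helper that only touches the library's own logger."""
    logger = logging.getLogger(name)
    logger.setLevel(level)
    return logger


# Application side: the app logger has no level of its own, so it inherits from root
logging.getLogger().setLevel(logging.INFO)
app_logger = logging.getLogger("app")
level_before = app_logger.getEffectiveLevel()

# A scoped helper leaves the application's effective level alone
scoped_get_logger(logging.ERROR)
level_after_scoped = app_logger.getEffectiveLevel()

# A root-targeting helper silently raises the application's effective level
careless_get_logger(logging.ERROR)
level_after_careless = app_logger.getEffectiveLevel()

print(level_before, level_after_scoped, level_after_careless)
```

After the root-targeting call, `app_logger.info(...)` is dropped even though the application never changed its own configuration, which matches the symptom in the reproduction above.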

smilefufu commented

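> I traced this to a bug in the Paddle framework. Specifically, in paddle/distributed/utils/log_utils.py in Paddle 3.0, the get_logger function modifies the root logger by default, which is not good practice for a library. I suggest opening an issue in the Paddle framework repo to report it.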

Besides the default name being root, there is also one call that explicitly passes root as the name. I have reported this to the Paddle repo:
PaddlePaddle/Paddle#57165
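
Until this is fixed upstream, one possible workaround is to snapshot the root logger's level and handlers before calling into the library and restore them afterwards. This is a sketch only: `preserve_root_logger` and `fake_library_init` are hypothetical names, the stand-in function merely imitates a library that touches the root logger, and the approach has not been verified against PaddleOCR itself.

```python
import contextlib
import logging


@contextlib.contextmanager
def preserve_root_logger():
    """Restore the root logger's level and handler list when the block exits."""
    root = logging.getLogger()
    saved_level = root.level
    saved_handlers = list(root.handlers)  # copy, so later mutations do not affect it
    try:
        yield root
    finally:
        root.setLevel(saved_level)
        root.handlers[:] = saved_handlers


def fake_library_init():
    """Hypothetical stand-in for a library call that reconfigures the root logger."""
    root = logging.getLogger()
    root.setLevel(logging.ERROR)
    root.addHandler(logging.NullHandler())


logging.getLogger().setLevel(logging.INFO)
handlers_before = list(logging.getLogger().handlers)

with preserve_root_logger():
    fake_library_init()

restored_level = logging.getLogger().level
restored_handlers = list(logging.getLogger().handlers)
print(logging.getLevelName(restored_level))
```

In practice the `with` block would wrap the `PaddleOCR(...)` construction, and possibly the import as well. Restoring the handler list also drops any handler the library attached to the root logger, and the workaround will not help if the library touches the root logger again on later calls.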
