multiprocessing.spawn on Linux imports paddleocr.py module instead of paddleocr package, causing ModuleNotFoundError for submodules #15867

@DavGrg

Description

🔎 Search before asking

  • I have searched the PaddleOCR Docs and found no similar bug report.
  • I have searched the PaddleOCR Issues and found no similar bug report.
  • I have searched the PaddleOCR Discussions and found no similar bug report.

🐛 Bug (Problem Description)

When using Python's multiprocessing with the spawn (or forkserver) start method, child processes re-import the top-level name paddleocr from disk. In my environment the official PaddleOCR pip package is installed under …/site-packages/paddleocr/__init__.py (alongside paddleocr/paddleocr.py and the tools/ subpackage), so I expect:

import paddleocr
print(paddleocr.__file__)
# -> .../site-packages/paddleocr/__init__.py
from paddleocr.tools.infer.utility import init_args  # should work

However, in the worker process spawned via:

import multiprocessing as mp

mp.set_start_method("spawn", force=True)
p = mp.Process(target=worker)
p.start()
p.join()

the interpreter instead binds paddleocr to …/site-packages/paddleocr/paddleocr.py (the standalone module), not the package’s __init__.py. As a result:

from paddleocr.tools.infer.utility import init_args
# ModuleNotFoundError: No module named 'paddleocr.tools'; 'paddleocr' is not a package
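One way to narrow this down is to check, in both the parent and the worker, what the import system would resolve for the name and whether any sys.path entry holds a plain paddleocr.py file. The sketch below is only a diagnostic aid, not a fix. The helpers describe_import and shadowing_entries are hypothetical names introduced here, and the idea that a sys.path entry pointing inside the package directory causes the shadowing is an assumption this report does not confirm.

```python
import importlib.util
import os
import sys


def describe_import(name):
    """Return (origin, is_package) for a top-level module name.

    Hypothetical helper: a package has a non-None
    submodule_search_locations on its spec, a plain module does not.
    """
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None, False
    return spec.origin, spec.submodule_search_locations is not None


def shadowing_entries(name, paths=None):
    """List the path entries that directly contain a `<name>.py` file.

    Hypothetical helper: an entry listed here, if it comes before
    site-packages, would make `import <name>` bind to the plain module.
    """
    paths = sys.path if paths is None else paths
    hits = []
    for entry in paths:
        # An empty entry in sys.path means the current working directory.
        candidate = os.path.join(entry or os.getcwd(), name + ".py")
        if os.path.isfile(candidate):
            hits.append(entry)
    return hits
```

Printing describe_import("paddleocr") and shadowing_entries("paddleocr") first in the main block and then at the top of worker() should show whether the two processes see different sys.path contents.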

🏃‍♂️ Environment (Runtime Environment)

OS: Ubuntu 22.04.3 LTS
Python: 3.12.0
PaddleOCR: 2.10.0
PaddlePaddle: 2.6.2

🌰 Minimal Reproducible Example

import multiprocessing as mp
import paddleocr

def worker():
    try:
        from paddleocr.tools.infer.utility import init_args
        print("WORKER: import succeeded")
    except Exception as e:
        print("WORKER: import failed:", e)
    print("WORKER paddleocr loaded from:", paddleocr.__file__)

if __name__ == "__main__":
    mp.set_start_method("spawn", force=True)
    print("MAIN   paddleocr loaded from:", paddleocr.__file__)
    p = mp.Process(target=worker)
    p.start()
    p.join()