[Hackathon 8th No.41] Reproduce the OmniParser paper in PaddleOCR #15582

robinbg · 2025-06-04T18:53:44Z

…nition and KIE

paddle-bot · 2025-06-04T18:53:48Z

Thanks for your contribution!

Copilot

Pull Request Overview

This PR integrates the OmniParser unified framework into PaddleOCR by adding end-to-end support for data processing, model definition, inference, postprocessing, and documentation/configuration.

Introduces OmniParserPredictor and related inference tooling.
Adds unified model components: backbone, FPN neck, pixel/table/KIE heads, and multi-task loss.
Implements data augmentations, postprocessing logic, YAML configuration, and documentation for OmniParser.

Reviewed Changes

Copilot reviewed 13 out of 13 changed files in this pull request and generated 4 comments.

Show a summary per file

File	Description
tools/infer/predict_omniparser.py	Inference script for OmniParser
ppocr/postprocess/omniparser_postprocess.py	Postprocessing for text, table, and KIE outputs
ppocr/modeling/heads/omniparser_table_head.py	Table structure head
ppocr/modeling/heads/omniparser_pixel_head.py	Pixel-level text detection head
ppocr/modeling/heads/omniparser_kie_head.py	Key information extraction head
ppocr/modeling/backbones/omniparser_backbone.py	OmniParser backbone and FPN fusion
ppocr/modeling/architectures/omniparser.py	Unified OmniParser architecture
ppocr/losses/omniparser_loss.py	Multi-task loss combining all heads
ppocr/data/imaug/omniparser_process.py	Data processing and augmentation for OmniParser
docs/algorithm/omniparser/omniparser_en.md	English documentation for OmniParser
docs/algorithm/omniparser/omniparser.md	Chinese documentation for OmniParser
configs/omniparser/omniparser_base.yml	Configuration file for OmniParser training and inference

Comments suppressed due to low confidence (2)

tools/infer/predict_omniparser.py:74

[nitpick] Argument det_model_dir is used both as a directory and a file path. Consider renaming to det_model_path or splitting into separate directory and file parameters for clarity.

model_file_path = args.det_model_dir

configs/omniparser/omniparser_base.yml:52

KIEHead.num_classes is set to 10, but the post-process classes list has only 6 entries. Align the number of classes in the head configuration with the post-process class list to avoid mismatches.

    num_classes: 10
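
A mismatch like the one flagged above can be caught before training with a small consistency check. The following is a minimal sketch, not code from this PR: the nested key layout (`Architecture.KIEHead.num_classes`, `PostProcess.classes`) and the function name `check_kie_class_count` are assumptions made for illustration, and the real YAML file may be organized differently.

```python
def check_kie_class_count(config):
    """Raise ValueError if the KIE head size and the label list disagree.

    `config` is a plain dict, e.g. the result of loading the YAML file.
    The key paths used here are hypothetical.
    """
    head_classes = config["Architecture"]["KIEHead"]["num_classes"]
    labels = config["PostProcess"]["classes"]
    if head_classes != len(labels):
        raise ValueError(
            f"KIEHead.num_classes is {head_classes}, "
            f"but the post-process class list has {len(labels)} entries"
        )
    return head_classes


# Hypothetical config mirroring the situation in the review comment:
# a 10-way head paired with a 6-entry label list.
mismatched = {
    "Architecture": {"KIEHead": {"num_classes": 10}},
    "PostProcess": {"classes": ["c0", "c1", "c2", "c3", "c4", "c5"]},
}
```

Calling `check_kie_class_count(mismatched)` raises a `ValueError`, which is usually easier to diagnose than an index error surfacing later in postprocessing.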

Copilot · 2025-06-04T23:47:24Z

ppocr/postprocess/omniparser_postprocess.py

+        text_score = preds['text_prob'][0, 0].numpy()
+        center_score = preds['center_prob'][0, 0].numpy()
+        border_score = preds['border_prob'][0, 0].numpy()
+


Calling .numpy() on a NumPy array will raise an AttributeError. Ensure preds['text_prob'] is a Paddle tensor before using .numpy(), or remove the .numpy() call and work with the array directly.

Suggested change

-        text_score = preds['text_prob'][0, 0].numpy()
-        center_score = preds['center_prob'][0, 0].numpy()
-        border_score = preds['border_prob'][0, 0].numpy()
+        # Ensure compatibility with both Paddle tensors and NumPy arrays
+        text_prob = preds['text_prob'][0, 0]
+        center_prob = preds['center_prob'][0, 0]
+        border_prob = preds['border_prob'][0, 0]
+        text_score = text_prob.numpy() if isinstance(text_prob, paddle.Tensor) else text_prob
+        center_score = center_prob.numpy() if isinstance(center_prob, paddle.Tensor) else center_prob
+        border_score = border_prob.numpy() if isinstance(border_prob, paddle.Tensor) else border_prob
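
The three conditional expressions in the suggestion repeat the same pattern, so they could be folded into one helper. This is a hedged sketch rather than part of the PR: `to_numpy` is a hypothetical name, and it relies on duck typing (any object exposing a `.numpy()` method, as Paddle tensors do) instead of an explicit `isinstance(x, paddle.Tensor)` check, so it runs without importing Paddle.

```python
import numpy as np


def to_numpy(x):
    """Return `x` as a NumPy array.

    NumPy arrays pass through unchanged; tensor-like objects that expose
    a `.numpy()` method are converted through it; anything else (lists,
    scalars) goes through np.asarray.
    """
    if isinstance(x, np.ndarray):
        return x
    if hasattr(x, "numpy"):
        return x.numpy()
    return np.asarray(x)


# Works the same whether the prediction is already an array or not.
preds = {"text_prob": np.full((1, 1, 2, 2), 0.5)}
text_score = to_numpy(preds["text_prob"][0, 0])
```

With such a helper, each score line in the postprocessor would reduce to a single call, for example `center_score = to_numpy(preds['center_prob'][0, 0])`.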

Copilot · 2025-06-04T23:47:24Z

ppocr/postprocess/omniparser_postprocess.py

+    def _get_table_boundary(self, boundary_pred):
+        """Extract table boundary from prediction"""
+        # Obtain probability map for boundary
+        boundary_prob = paddle.nn.functional.softmax(boundary_pred, axis=1)[0, 1].numpy()


Using paddle.nn.functional.softmax on a NumPy array will fail. Convert boundary_pred to a Paddle tensor or use a NumPy-based softmax implementation to process the array.

Suggested change

-        boundary_prob = paddle.nn.functional.softmax(boundary_pred, axis=1)[0, 1].numpy()
+        boundary_pred_tensor = paddle.to_tensor(boundary_pred)
+        boundary_prob = paddle.nn.functional.softmax(boundary_pred_tensor, axis=1)[0, 1].numpy()
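
The comment also mentions a NumPy-based softmax as an alternative to converting back to a Paddle tensor. A minimal sketch of that option follows. The function name `softmax_np` is hypothetical, and the `[N, 2, H, W]` layout of `boundary_pred` is an assumption inferred from the `axis=1` and `[0, 1]` indexing in the original line.

```python
import numpy as np


def softmax_np(logits, axis=1):
    """Numerically stable softmax over `axis` for a NumPy array."""
    logits = np.asarray(logits, dtype=np.float64)
    # Subtract the per-slice maximum so np.exp cannot overflow.
    shifted = logits - logits.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


# Assumed layout: batch of 1, two channels (background, boundary), 4x4 map.
boundary_pred = np.zeros((1, 2, 4, 4))
# Probability map of the boundary channel for the first image.
boundary_prob = softmax_np(boundary_pred, axis=1)[0, 1]
```

This keeps the postprocessor free of tensor round-trips when the predictions already arrive as arrays. If they may arrive as either type, the conversion approach in the suggestion handles both.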

Copilot · 2025-06-04T23:47:25Z

ppocr/modeling/heads/omniparser_kie_head.py

+                logits.append(logit)
+            else:
+                # No regions case
+                logits.append(paddle.zeros([0, self.num_classes]))


Attribute self.num_classes is not defined in __init__. Initialize self.num_classes = num_classes to avoid an AttributeError.
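
The fix amounts to storing the constructor argument so the empty-region branch can read it. The sketch below illustrates the pattern with a framework-free stand-in: `KIEHeadSketch`, its `classify` method, and the zero-initialized weight matrix are all hypothetical and use NumPy in place of Paddle layers. It is not the head defined in the PR.

```python
import numpy as np


class KIEHeadSketch:
    """Stand-in classifier head that shows where num_classes must be stored."""

    def __init__(self, in_channels, num_classes):
        self.in_channels = in_channels
        # Without this assignment, the empty-region branch below would
        # raise AttributeError when it reads self.num_classes.
        self.num_classes = num_classes
        # Placeholder weights; a real head would use learned parameters.
        self.weight = np.zeros((in_channels, num_classes))

    def classify(self, region_feats):
        """Return logits of shape [num_regions, num_classes]."""
        region_feats = np.asarray(region_feats, dtype=np.float64)
        if len(region_feats) > 0:
            return region_feats @ self.weight
        # No regions case: an empty logits array with the right width.
        return np.zeros((0, self.num_classes))


head = KIEHeadSketch(in_channels=8, num_classes=5)
empty_logits = head.classify(np.zeros((0, 8)))
```

Returning a `[0, num_classes]` array for the empty case keeps downstream concatenation shape-compatible with batches that do contain regions.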

Copilot · 2025-06-04T23:47:25Z

tools/infer/predict_omniparser.py

+        pre_process_list = [{
+            'OmniParserDataProcess': {
+                'image_shape': [1024, 1024],
+                'augmentation': False,
+                'mean': [0.485, 0.456, 0.406],
+                'std': [0.229, 0.224, 0.225],
+            }
+        }]


The variable pre_process_list is defined but never used. Remove it or integrate it into the preprocessing pipeline to avoid dead code.

Suggested change

-        pre_process_list = [{
-            'OmniParserDataProcess': {
-                'image_shape': [1024, 1024],
-                'augmentation': False,
-                'mean': [0.485, 0.456, 0.406],
-                'std': [0.229, 0.224, 0.225],
-            }
-        }]
+        # Removed unused variable `pre_process_list`.
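
If the variable were integrated instead of removed, its `mean` and `std` entries would typically drive a per-channel normalization step. The following is only a sketch of that idea under stated assumptions: `normalize_image` is a hypothetical helper, the input is assumed to be an RGB `uint8` image in height-width-channel order, pixel values are assumed to be scaled to `[0, 1]` before normalization, and resizing to `image_shape` is left out. How `OmniParserDataProcess` actually handles these parameters is not shown in the review.

```python
import numpy as np

# Values copied from the `pre_process_list` entry in the diff.
MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)


def normalize_image(img_hwc_uint8):
    """Scale to [0, 1], normalize per channel, return a [1, 3, H, W] batch."""
    img = img_hwc_uint8.astype(np.float32) / 255.0
    # MEAN and STD broadcast over the trailing channel axis.
    img = (img - MEAN) / STD
    # Reorder HWC to CHW and add a leading batch dimension.
    return img.transpose(2, 0, 1)[np.newaxis]


# Tiny all-white dummy image standing in for a resized input.
dummy = np.full((4, 4, 3), 255, dtype=np.uint8)
batch = normalize_image(dummy)
```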

luotao1 · 2025-06-10T03:18:41Z

Please submit an RFC design document first.

Implement OmniParser unified framework for text spotting, table recog…

75e25ac

…nition and KIE

paddle-bot bot added the contributor label Jun 4, 2025

GreatV requested review from Topdu, cuicheng01 and Copilot June 4, 2025 23:44

Copilot AI reviewed Jun 4, 2025

luotao1 mentioned this pull request Jun 5, 2025

[Hackathon 8th] Open Source Contribution Individual Challenge PaddlePaddle/Paddle#71310

Open
