Dynamic Merging of Word Bounding Boxes #12283
satvik-27199 started this conversation in Ideas & Features
Replies: 2 comments
-
Could you please tell me which algorithm you are describing, and whether there is a corresponding example?
-
I would like some insight into this as well. I trained a Paddle detection model at the word level, but inference still returns line-level boxes. Is there a way to get unmerged boxes?
-
Currently, the algorithm for merging word bounding boxes combines words into a single bounding box if they are close enough. While this works in many cases, it can cause issues in others where words should remain separate despite their proximity. This leads to inaccurate text extraction and poor representation of the original content.
Example:
Consider the following text snippet:
Word1 Word2 Word3 Word4
In the current implementation, if Word1 and Word2 are close enough, they might be combined into a single bounding box as Word1Word2. Similarly, Word3 and Word4 might also be merged if they are close. This results in incorrect extraction as shown below:
Word1Word2 Word3Word4
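To make the problem concrete, here is a minimal sketch of proximity-based merging with a single fixed gap threshold. It is not the library's actual implementation: the box format `(x_min, y_min, x_max, y_max, text)`, the function name `merge_by_gap`, the `max_gap` parameter, and the coordinates are all assumptions chosen to reproduce the example above, and the boxes are assumed to lie on one text line.

```python
from typing import List, Tuple

# Hypothetical box format for illustration: (x_min, y_min, x_max, y_max, text).
Box = Tuple[float, float, float, float, str]


def merge_by_gap(boxes: List[Box], max_gap: float) -> List[Box]:
    """Greedily merge left-to-right neighbours whose horizontal gap is <= max_gap.

    Assumes all boxes belong to the same text line.
    """
    merged: List[Box] = []
    for box in sorted(boxes, key=lambda b: b[0]):
        if merged and box[0] - merged[-1][2] <= max_gap:
            x0, y0, x1, y1, text = merged[-1]
            # Replace the previous box with the union of both boxes and join the text.
            merged[-1] = (
                x0,
                min(y0, box[1]),
                max(x1, box[2]),
                max(y1, box[3]),
                text + box[4],
            )
        else:
            merged.append(box)
    return merged


# Made-up coordinates: 4 px between Word1/Word2 and Word3/Word4, 26 px in the middle.
words: List[Box] = [
    (0, 0, 50, 20, "Word1"),
    (54, 0, 104, 20, "Word2"),
    (130, 0, 180, 20, "Word3"),
    (184, 0, 234, 20, "Word4"),
]

fixed = merge_by_gap(words, max_gap=5)
print(" ".join(b[4] for b in fixed))  # Word1Word2 Word3Word4
```

With these made-up coordinates, lowering `max_gap` to 3 keeps all four words separate, but a single fixed value cannot suit every font size and word spacing.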
I wonder if there is a more sophisticated merging strategy that takes into account: