Commit 087db4f

Added the ability to perform image preprocessing: grayscale conversion, denoising and binarization. This should help improve OCR accuracy.
1 parent 75b7066 commit 087db4f

1 file changed: +40 -0 lines changed

image_preprocessing.py

Lines changed: 40 additions & 0 deletions
@@ -0,0 +1,40 @@
"""
Image preprocessing for OCR. Preprocessing is needed to make the image clearer, crisper and easier to read.

OCR performs better if the image is:
- grayscale
- denoised
- binarised

grayscale conversion: Converts the image from RGB to grayscale, so each pixel stores intensity rather than color. This reduces the data from 3 color channels to 1.
denoised: Applies a filter that removes dots and specks. Currently Pillow's MedianFilter (3x3 window) is used.
binarised: Converts the grayscale image to binary, which removes background clutter. Each pixel becomes either black (0) or white (255): values below the threshold of 128 become black and the rest become white.

TODO:
- DPI normalization: to make the image crisper and easier to read
- Contour detection

Currently we use Pillow, which is basic. We could move to OpenCV, which has better denoising and binarization support. It also supports contour detection and the resizing needed for DPI normalization.
"""

import os
import logging
from PIL import Image, ImageFilter

logger = logging.getLogger(__name__)


def preprocess_image(file_path: str) -> str:
    try:
        base, ext = os.path.splitext(file_path)
        with Image.open(file_path) as im:
            image_grayscale = im.convert("L")
        image_denoised = image_grayscale.filter(ImageFilter.MedianFilter(size=3))
        # Threshold at mid-gray: pixels below 128 become black, the rest white.
        image_binary = image_denoised.point(lambda x: 0 if x < 128 else 255, mode="1")
        output_path = f"{base}-processed{ext}"
        image_binary.save(output_path)
        return output_path
    except Exception:
        logger.exception("Image preprocessing failed for %s; using the original.", file_path)
        return file_path
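
To make the three steps concrete, here is a minimal pure-Python sketch of what grayscale conversion, a 3x3 median filter and a fixed threshold compute on a tiny image stored as nested lists. It is an illustration only, not Pillow's implementation: the helper names (`to_grayscale`, `median_filter_3x3`, `binarize`) are hypothetical, the grayscale weights are the ones Pillow documents for `convert("L")` but the rounding here is not guaranteed to be bit-exact, and clamping at the image border is an assumption that may differ from how Pillow treats edges.

```python
from statistics import median


def to_grayscale(rgb_rows):
    """Collapse (R, G, B) pixels to one intensity value per pixel."""
    # Weights documented for Pillow's convert("L"); integer math is an
    # illustrative simplification, not Pillow's exact arithmetic.
    return [
        [(r * 299 + g * 587 + b * 114) // 1000 for (r, g, b) in row]
        for row in rgb_rows
    ]


def median_filter_3x3(gray):
    """Replace each pixel with the median of its 3x3 neighbourhood."""
    height, width = len(gray), len(gray[0])
    out = []
    for y in range(height):
        row = []
        for x in range(width):
            # Assumption: coordinates outside the image are clamped to the
            # nearest edge pixel. Pillow's border handling may differ.
            window = [
                gray[min(max(y + dy, 0), height - 1)][min(max(x + dx, 0), width - 1)]
                for dy in (-1, 0, 1)
                for dx in (-1, 0, 1)
            ]
            row.append(int(median(window)))
        out.append(row)
    return out


def binarize(gray, threshold=128):
    """Map every pixel below the threshold to 0 (black), the rest to 255 (white)."""
    return [[0 if value < threshold else 255 for value in row] for row in gray]


if __name__ == "__main__":
    white, black = (255, 255, 255), (0, 0, 0)
    # A white page with a single black speck in the middle.
    page = [
        [white, white, white],
        [white, black, white],
        [white, white, white],
    ]
    cleaned = binarize(median_filter_3x3(to_grayscale(page)))
    print(cleaned)  # every pixel is 255: the isolated speck is removed
```

An isolated speck disappears because it is a single outlier among nine values in every window that contains it, so it never reaches the median. Thin strokes in real text can be eroded by the same effect, which is one reason to keep the window small.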