Commit d8b33ba

add pse curved text detection doc
1 parent 16bacf0 commit d8b33ba

4 files changed: 28 additions, 8 deletions

doc/doc_ch/algorithm_det_psenet.md

Lines changed: 13 additions & 3 deletions
@@ -52,17 +52,27 @@
 python3 tools/export_model.py -c configs/det/det_r50_vd_pse.yml -o Global.pretrained_model=./det_r50_vd_pse_v2.0_train/best_accuracy Global.save_inference_dir=./inference/det_pse
 ```

-To run inference with the PSE text detection model, execute the following command:
+To run inference with the PSE text detection model on non-curved text, execute the following command:

 ```shell
-python3 tools/infer/predict_det.py --image_dir="./doc/imgs_en/img_10.jpg" --det_model_dir="./inference/det_pse/" --det_algorithm="PSE"
+python3 tools/infer/predict_det.py --image_dir="./doc/imgs_en/img_10.jpg" --det_model_dir="./inference/det_pse/" --det_algorithm="PSE" --det_pse_box_type=quad
 ```

 The visualized text detection results are saved to the `./inference_results` folder by default, and result file names are prefixed with 'det_res'. An example result is shown below:

 ![](../imgs_results/det_res_img_10_pse.jpg)

-**Note**: Because the ICDAR2015 dataset has only 1,000 training images, mainly of English scenes, the above model performs relatively poorly on Chinese text images.
+To detect curved text, execute the following command:
+
+```shell
+python3 tools/infer/predict_det.py --image_dir="./doc/imgs_en/img_10.jpg" --det_model_dir="./inference/det_pse/" --det_algorithm="PSE" --det_pse_box_type=poly
+```
+
+The visualized text detection results are saved to the `./inference_results` folder by default, and result file names are prefixed with 'det_res'. An example result is shown below:
+
+![](../imgs_results/det_res_img_10_pse_poly.jpg)
+
+**Note**: Because the ICDAR2015 dataset has only 1,000 training images, mainly of English scenes, the above model performs relatively poorly on Chinese or curved text images.

 <a name="4-2"></a>
 ### 4.2 C++ Inference

doc/doc_en/algorithm_det_psenet_en.md

Lines changed: 13 additions & 3 deletions
@@ -52,17 +52,27 @@ First, convert the model saved in the PSE text detection training process into a
 python3 tools/export_model.py -c configs/det/det_r50_vd_pse.yml -o Global.pretrained_model=./det_r50_vd_pse_v2.0_train/best_accuracy Global.save_inference_dir=./inference/det_pse
 ```

-To run inference with the PSE text detection model, execute the following command:
+To run inference with the PSE text detection model on non-curved text, execute the following command:

 ```shell
-python3 tools/infer/predict_det.py --image_dir="./doc/imgs_en/img_10.jpg" --det_model_dir="./inference/det_pse/" --det_algorithm="PSE"
+python3 tools/infer/predict_det.py --image_dir="./doc/imgs_en/img_10.jpg" --det_model_dir="./inference/det_pse/" --det_algorithm="PSE" --det_pse_box_type=quad
 ```

 The visualized text detection results are saved to the `./inference_results` folder by default, and result file names are prefixed with 'det_res'. An example result is shown below:

 ![](../imgs_results/det_res_img_10_pse.jpg)

-**Note**: Because the ICDAR2015 dataset has only 1,000 training images, mainly of English scenes, the above model performs very poorly on Chinese text images.
+To detect curved text, execute the following command:
+
+```shell
+python3 tools/infer/predict_det.py --image_dir="./doc/imgs_en/img_10.jpg" --det_model_dir="./inference/det_pse/" --det_algorithm="PSE" --det_pse_box_type=poly
+```
+
+The visualized text detection results are saved to the `./inference_results` folder by default, and result file names are prefixed with 'det_res'. An example result is shown below:
+
+![](../imgs_results/det_res_img_10_pse_poly.jpg)
+
+**Note**: Because the ICDAR2015 dataset has only 1,000 training images, mainly of English scenes, the above model performs very poorly on Chinese or curved text images.


 <a name="4-2"></a>
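The two documented invocations differ only in `--det_pse_box_type`, so a script that runs both can make that choice explicit. The following is a minimal sketch and not part of this commit: `build_pse_command` is a hypothetical helper, and treating `quad` and `poly` as the only accepted values is an assumption based on the two commands shown in the docs.

```python
import shlex

# Box types shown in the updated docs; treating these as the only valid
# values is an assumption of this sketch.
PSE_BOX_TYPES = ("quad", "poly")


def build_pse_command(box_type,
                      image_dir="./doc/imgs_en/img_10.jpg",
                      det_model_dir="./inference/det_pse/"):
    """Return the predict_det.py argument list for a PSE box type (hypothetical helper)."""
    if box_type not in PSE_BOX_TYPES:
        raise ValueError(f"unsupported box type: {box_type!r}")
    return [
        "python3", "tools/infer/predict_det.py",
        f"--image_dir={image_dir}",
        f"--det_model_dir={det_model_dir}",
        "--det_algorithm=PSE",
        f"--det_pse_box_type={box_type}",
    ]


if __name__ == "__main__":
    for box_type in PSE_BOX_TYPES:
        # shlex.join quotes each argument so the line can be pasted into a shell.
        print(shlex.join(build_pse_command(box_type)))
```

Returning an argument list rather than a single string keeps the result usable with `subprocess.run` without shell quoting concerns.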

tools/infer/predict_det.py

Lines changed: 2 additions & 2 deletions
@@ -158,7 +158,7 @@ def order_points_clockwise(self, pts):
         rect[1] = pts[np.argmin(diff)]
         rect[3] = pts[np.argmax(diff)]
         return rect
-
+
     def clip_det_res(self, points, img_height, img_width):
         for pno in range(points.shape[0]):
             points[pno, 0] = int(min(max(points[pno, 0], 0), img_width - 1))
@@ -284,7 +284,7 @@ def __call__(self, img):
             total_time += elapse
         count += 1
         save_pred = os.path.basename(image_file) + "\t" + str(
-            json.dumps(np.array(dt_boxes).astype(np.int32).tolist())) + "\n"
+            json.dumps([x.tolist() for x in dt_boxes])) + "\n"
         save_results.append(save_pred)
         logger.info(save_pred)
         logger.info("The predict time of {}: {}".format(image_file, elapse))
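The `json.dumps` change matters once `poly` output is enabled: polygon boxes need not all have the same number of points, so they cannot in general be stacked into one regular `np.array` the way the removed line did, which is presumably why each box is now converted on its own. The sketch below illustrates the new serialization on made-up data. The coordinates are illustrative, and `format_pred` and `parse_pred` are hypothetical helpers, not functions from `predict_det.py`.

```python
import json
import os

import numpy as np


def format_pred(image_file, dt_boxes):
    """Build one result line, '<basename>\\t<json boxes>\\n', as the new code does."""
    return os.path.basename(image_file) + "\t" + json.dumps(
        [x.tolist() for x in dt_boxes]) + "\n"


def parse_pred(line):
    """Hypothetical inverse of format_pred: return (file name, list of boxes)."""
    name, boxes_json = line.rstrip("\n").split("\t", 1)
    return name, json.loads(boxes_json)


if __name__ == "__main__":
    # Illustrative data: one 4-point quad and one 6-point polygon.
    dt_boxes = [
        np.array([[0, 0], [10, 0], [10, 5], [0, 5]], dtype=np.int32),
        np.array([[0, 8], [5, 6], [10, 8], [10, 12], [5, 10], [0, 12]],
                 dtype=np.int32),
    ]
    line = format_pred("./doc/imgs_en/img_10.jpg", dt_boxes)
    name, boxes = parse_pred(line)
    # Each box keeps its own point count after the round trip.
    print(name, [len(box) for box in boxes])
```

Note that the new line no longer casts to `np.int32`, so the serialized values keep whatever dtype the detector returned.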
