Commit 0600bd7

Add Samaritan Script training
Samaritan script is written right to left, like Arabic and Hebrew. It is used for Samaritan Hebrew and Samaritan Aramaic, and some texts also contain Arabic letters.
https://en.wikipedia.org/wiki/Samaritan_(Unicode_block)
https://en.wikipedia.org/wiki/Samaritan_Hebrew
https://en.wikipedia.org/wiki/Samaritan_Aramaic_language
1 parent 79e47de commit 0600bd7
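
The config in this commit sets `character_dict_path` to `ppocr/utils/dict/samaritan_dict.txt`, but the dictionary file itself is not part of the diff. As a rough sketch only, one way to draft such a file is to enumerate the assigned code points of the Samaritan Unicode block (U+0800 to U+083F) linked in the commit message. The helper names `samaritan_characters` and `format_dict` are hypothetical, and the one-character-per-line layout is an assumption about the dictionary format, not something this commit shows.

```python
import unicodedata

# Samaritan Unicode block: U+0800..U+083F (some code points are unassigned).
SAMARITAN_BLOCK = range(0x0800, 0x0840)


def samaritan_characters():
    """Return the assigned characters of the Samaritan block, in code point order."""
    chars = []
    for code_point in SAMARITAN_BLOCK:
        ch = chr(code_point)
        # unicodedata.name() returns the given default for unassigned code points.
        if unicodedata.name(ch, ""):
            chars.append(ch)
    return chars


def format_dict(chars):
    """Render a dictionary as text, one character per line (assumed layout)."""
    return "".join(ch + "\n" for ch in chars)
```

Writing `format_dict(samaritan_characters())` to a UTF-8 file would give a starting point. Since some texts mix in Arabic letters, a real dictionary may need extra characters beyond this block, and whether the combining marks in the block belong in it depends on how the training labels are written.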

1 file changed: +110 -0 lines changed

@@ -0,0 +1,110 @@
+Global:
+  use_gpu: true
+  epoch_num: 500
+  log_smooth_window: 20
+  print_batch_step: 10
+  save_model_dir: ./output/rec_samaritan_lite
+  save_epoch_step: 3
+  eval_batch_step:
+  - 0
+  - 2000
+  cal_metric_during_train: true
+  pretrained_model: null
+  checkpoints: null
+  save_inference_dir: null
+  use_visualdl: false
+  infer_img: null
+  character_dict_path: ppocr/utils/dict/samaritan_dict.txt
+  max_text_length: 25
+  infer_mode: false
+  use_space_char: true
+Optimizer:
+  name: Adam
+  beta1: 0.9
+  beta2: 0.999
+  lr:
+    name: Cosine
+    learning_rate: 0.001
+  regularizer:
+    name: L2
+    factor: 1.0e-05
+Architecture:
+  model_type: rec
+  algorithm: CRNN
+  Transform: null
+  Backbone:
+    name: MobileNetV3
+    scale: 0.5
+    model_name: small
+    small_stride:
+    - 1
+    - 2
+    - 2
+    - 2
+  Neck:
+    name: SequenceEncoder
+    encoder_type: rnn
+    hidden_size: 48
+  Head:
+    name: CTCHead
+    fc_decay: 1.0e-05
+Loss:
+  name: CTCLoss
+PostProcess:
+  name: CTCLabelDecode
+Metric:
+  name: RecMetric
+  main_indicator: acc
+Train:
+  dataset:
+    name: SimpleDataSet
+    data_dir: train_data/
+    label_file_list:
+    - train_data/samaritan_train.txt
+    transforms:
+    - DecodeImage:
+        img_mode: BGR
+        channel_first: false
+    - RecAug: null
+    - CTCLabelEncode: null
+    - RecResizeImg:
+        image_shape:
+        - 3
+        - 32
+        - 320
+    - KeepKeys:
+        keep_keys:
+        - image
+        - label
+        - length
+  loader:
+    shuffle: true
+    batch_size_per_card: 256
+    drop_last: true
+    num_workers: 8
+Eval:
+  dataset:
+    name: SimpleDataSet
+    data_dir: train_data/
+    label_file_list:
+    - train_data/samaritan_val.txt
+    transforms:
+    - DecodeImage:
+        img_mode: BGR
+        channel_first: false
+    - CTCLabelEncode: null
+    - RecResizeImg:
+        image_shape:
+        - 3
+        - 32
+        - 320
+    - KeepKeys:
+        keep_keys:
+        - image
+        - label
+        - length
+  loader:
+    shuffle: false
+    drop_last: false
+    batch_size_per_card: 256
+    num_workers: 8
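
The config reads labels from `train_data/samaritan_train.txt` and `train_data/samaritan_val.txt`, caps labels at `max_text_length: 25`, and enables `use_space_char`. The sketch below is one possible pre-flight check for such label files, not part of the commit. It assumes each line has the form `image_path<TAB>label` and that the dictionary holds one character per line. The function names `load_dict` and `check_label_line` are hypothetical.

```python
MAX_TEXT_LENGTH = 25  # mirrors Global.max_text_length in the config


def load_dict(lines, use_space_char=True):
    """Build the allowed character set from dictionary lines (one character per line)."""
    chars = {line.rstrip("\r\n") for line in lines}
    chars.discard("")  # ignore blank lines
    if use_space_char:
        chars.add(" ")  # mirrors Global.use_space_char
    return chars


def check_label_line(line, charset, max_len=MAX_TEXT_LENGTH):
    """Return a list of problems found in one 'image_path<TAB>label' line."""
    parts = line.rstrip("\r\n").split("\t")
    if len(parts) != 2:
        return ["expected exactly one tab between image path and label"]
    label = parts[1]
    problems = []
    if not label:
        problems.append("empty label")
    # Length is counted in code points, so combining marks count individually.
    if len(label) > max_len:
        problems.append(f"label has {len(label)} characters, limit is {max_len}")
    unknown = sorted(set(label) - charset)
    if unknown:
        codes = " ".join(f"U+{ord(c):04X}" for c in unknown)
        problems.append("characters not in dictionary: " + codes)
    return problems
```

Running `check_label_line` over every line of the training and validation files before a long run would flag labels that mix in Arabic letters missing from the dictionary, which the commit message notes can occur in some texts.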
