Commit 64abc64

Merge pull request #1687 from cuicheng01/develop
polish some docs
2 parents 06be879 + ecd79ac commit 64abc64

4 files changed: +113 -74 lines changed

docs/en/models/PP-LCNet_en.md

Lines changed: 70 additions & 39 deletions
@@ -15,8 +15,10 @@
 - [4.1 Image Classification](#4.1)
 - [4.2 Object Detection](#4.2)
 - [4.3 Semantic Segmentation](#4.3)
-- [5. Conclusion](#5)
-- [6. Reference](#6)
+- [5. Inference speed based on V100 GPU](#5)
+- [6. Inference speed based on SD855](#6)
+- [7. Conclusion](#7)
+- [8. Reference](#8)

 <a name="1"></a>
 ## 1. Abstract
@@ -91,38 +93,38 @@ Since the introduction of GoogLeNet, GAP (Global-Average-Pooling) is often direc

 For image classification, ImageNet dataset is adopted. Compared with the current mainstream lightweight network, PP-LCNet can obtain faster inference speed with the same accuracy. When using Baidu’s self-developed SSLD distillation strategy, the accuracy is further improved, with the Top-1 Acc of ImageNet exceeding 80% at an inference speed of about 5ms on the Intel CPU side.

-| Model | Params(M) | FLOPs(M) | Top-1 Acc(\%) | Top-5 Acc(\%) | Latency(ms) |
+| Model | Params(M) | FLOPs(M) | Top-1 Acc(\%) | Top-5 Acc(\%) | Latency(ms) |
 |-------|-----------|----------|---------------|---------------|-------------|
-| PP-LCNet-0.25x | 1.5 | 18 | 51.86 | 75.65 | 1.74 |
-| PP-LCNet-0.35x | 1.6 | 29 | 58.09 | 80.83 | 1.92 |
-| PP-LCNet-0.5x | 1.9 | 47 | 63.14 | 84.66 | 2.05 |
-| PP-LCNet-0.75x | 2.4 | 99 | 68.18 | 88.30 | 2.29 |
-| PP-LCNet-1x | 3.0 | 161 | 71.32 | 90.03 | 2.46 |
-| PP-LCNet-1.5x | 4.5 | 342 | 73.71 | 91.53 | 3.19 |
-| PP-LCNet-2x | 6.5 | 590 | 75.18 | 92.27 | 4.27 |
-| PP-LCNet-2.5x | 9.0 | 906 | 76.60 | 93.00 | 5.39 |
-| PP-LCNet-0.5x\* | 1.9 | 47 | 66.10 | 86.46 | 2.05 |
-| PP-LCNet-1.0x\* | 3.0 | 161 | 74.39 | 92.09 | 2.46 |
-| PP-LCNet-2.5x\* | 9.0 | 906 | 80.82 | 95.33 | 5.39 |
-
-\* denotes the model after using SSLD distillation.
+| PPLCNet_x0_25 | 1.5 | 18 | 51.86 | 75.65 | 1.74 |
+| PPLCNet_x0_35 | 1.6 | 29 | 58.09 | 80.83 | 1.92 |
+| PPLCNet_x0_5 | 1.9 | 47 | 63.14 | 84.66 | 2.05 |
+| PPLCNet_x0_75 | 2.4 | 99 | 68.18 | 88.30 | 2.29 |
+| PPLCNet_x1_0 | 3.0 | 161 | 71.32 | 90.03 | 2.46 |
+| PPLCNet_x1_5 | 4.5 | 342 | 73.71 | 91.53 | 3.19 |
+| PPLCNet_x2_0 | 6.5 | 590 | 75.18 | 92.27 | 4.27 |
+| PPLCNet_x2_5 | 9.0 | 906 | 76.60 | 93.00 | 5.39 |
+| PPLCNet_x0_5_ssld | 1.9 | 47 | 66.10 | 86.46 | 2.05 |
+| PPLCNet_x1_0_ssld | 3.0 | 161 | 74.39 | 92.09 | 2.46 |
+| PPLCNet_x2_5_ssld | 9.0 | 906 | 80.82 | 95.33 | 5.39 |
+
+where `_ssld` represents the model after using `SSLD distillation`. For details about `SSLD distillation`, see [SSLD distillation](../advanced_tutorials/knowledge_distillation_en.md).

 Performance comparison with other lightweight networks:

 | Model | Params(M) | FLOPs(M) | Top-1 Acc(\%) | Top-5 Acc(\%) | Latency(ms) |
 |-------|-----------|----------|---------------|---------------|-------------|
-| MobileNetV2-0.25x | 1.5 | 34 | 53.21 | 76.52 | 2.47 |
-| MobileNetV3-small-0.35x | 1.7 | 15 | 53.03 | 76.37 | 3.02 |
-| ShuffleNetV2-0.33x | 0.6 | 24 | 53.73 | 77.05 | 4.30 |
-| <b>PP-LCNet-0.25x<b> | <b>1.5<b> | <b>18<b> | <b>51.86<b> | <b>75.65<b> | <b>1.74<b> |
-| MobileNetV2-0.5x | 2.0 | 99 | 65.03 | 85.72 | 2.85 |
-| MobileNetV3-large-0.35x | 2.1 | 41 | 64.32 | 85.46 | 3.68 |
-| ShuffleNetV2-0.5x | 1.4 | 43 | 60.32 | 82.26 | 4.65 |
-| <b>PP-LCNet-0.5x<b> | <b>1.9<b> | <b>47<b> | <b>63.14<b> | <b>84.66<b> | <b>2.05<b> |
-| MobileNetV1-1x | 4.3 | 578 | 70.99 | 89.68 | 3.38 |
-| MobileNetV2-1x | 3.5 | 327 | 72.15 | 90.65 | 4.26 |
-| MobileNetV3-small-1.25x | 3.6 | 100 | 70.67 | 89.51 | 3.95 |
-| <b>PP-LCNet-1x<b> |<b> 3.0<b> | <b>161<b> | <b>71.32<b> | <b>90.03<b> | <b>2.46<b> |
+| MobileNetV2_x0_25 | 1.5 | 34 | 53.21 | 76.52 | 2.47 |
+| MobileNetV3_small_x0_35 | 1.7 | 15 | 53.03 | 76.37 | 3.02 |
+| ShuffleNetV2_x0_33 | 0.6 | 24 | 53.73 | 77.05 | 4.30 |
+| <b>PPLCNet_x0_25<b> | <b>1.5<b> | <b>18<b> | <b>51.86<b> | <b>75.65<b> | <b>1.74<b> |
+| MobileNetV2_x0_5 | 2.0 | 99 | 65.03 | 85.72 | 2.85 |
+| MobileNetV3_large_x0_35 | 2.1 | 41 | 64.32 | 85.46 | 3.68 |
+| ShuffleNetV2_x0_5 | 1.4 | 43 | 60.32 | 82.26 | 4.65 |
+| <b>PPLCNet_x0_5<b> | <b>1.9<b> | <b>47<b> | <b>63.14<b> | <b>84.66<b> | <b>2.05<b> |
+| MobileNetV1_x1_0 | 4.3 | 578 | 70.99 | 89.68 | 3.38 |
+| MobileNetV2_x1_0 | 3.5 | 327 | 72.15 | 90.65 | 4.26 |
+| MobileNetV3_small_x1_25 | 3.6 | 100 | 70.67 | 89.51 | 3.95 |
+| <b>PPLCNet_x1_0<b> |<b> 3.0<b> | <b>161<b> | <b>71.32<b> | <b>90.03<b> | <b>2.46<b> |

 <a name="4.2"></a>
 ### 4.2 Object Detection
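
The renaming in the hunk above follows a regular pattern: the `-<scale>x` suffix becomes `_x<scale>` with the decimal point written as an underscore, `PP-LCNet` loses its hyphen, and the `*` marker for SSLD models becomes an `_ssld` suffix. The sketch below reproduces that mapping for the names that appear in these tables only. `to_config_name` is a hypothetical helper written for illustration and is not part of PaddleClas.

```python
import re


def to_config_name(display_name: str) -> str:
    """Map an old table name (e.g. 'PP-LCNet-0.5x*') to the new style.

    Hypothetical helper: the rules are inferred from the renamed rows in
    this diff, not taken from any PaddleClas API.
    """
    # A trailing '*' marked SSLD-distilled models in the old tables.
    is_ssld = display_name.endswith("*")
    core = display_name.rstrip("*")

    # Split '<family>-<scale>x', e.g. 'MobileNetV3-small' and '0.35'.
    match = re.fullmatch(r"(.+)-(\d+(?:\.\d+)?)x", core)
    if match is None:
        raise ValueError(f"unrecognized model name: {display_name!r}")
    family, scale = match.groups()

    # 'PP-LCNet' drops its hyphen; other hyphens become underscores.
    family = family.replace("PP-LCNet", "PPLCNet").replace("-", "_")

    # Integer scales gain an explicit '.0' (1x -> x1_0, 2x -> x2_0).
    if "." not in scale:
        scale += ".0"

    name = f"{family}_x{scale.replace('.', '_')}"
    return f"{name}_ssld" if is_ssld else name


if __name__ == "__main__":
    for old in ["PP-LCNet-0.25x", "PP-LCNet-1x", "PP-LCNet-2.5x*",
                "MobileNetV3-small-0.35x"]:
        print(old, "->", to_config_name(old))
```

Names outside this pattern raise `ValueError` rather than being guessed at.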
@@ -131,10 +133,10 @@ For object detection, we adopt Baidu’s self-developed PicoDet, which focuses o

 | Backbone | mAP(%) | Latency(ms) |
 |-------|-----------|----------|
-MobileNetV3-large-0.35x | 19.2 | 8.1 |
-<b>PP-LCNet-0.5x<b> | <b>20.3<b> | <b>6.0<b> |
-MobileNetV3-large-0.75x | 25.8 | 11.1 |
-<b>PP-LCNet-1x<b> | <b>26.9<b> | <b>7.9<b> |
+MobileNetV3_large_x0_35 | 19.2 | 8.1 |
+<b>PPLCNet_x0_5<b> | <b>20.3<b> | <b>6.0<b> |
+MobileNetV3_large_x0_75 | 25.8 | 11.1 |
+<b>PPLCNet_x1_0<b> | <b>26.9<b> | <b>7.9<b> |

 <a name="4.3"></a>
 ### 4.3 Semantic Segmentation
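
As a rough reading of the detection table in the hunk above, the snippet below computes the mAP difference and latency ratio for each backbone pair. The pairing of each PPLCNet row with the MobileNetV3 row directly above it is an assumption based on row order, and `compare` is a hypothetical helper. The numbers are copied from the table.

```python
# (backbone, mAP in %, latency in ms), copied from the detection table.
ROWS = [
    ("MobileNetV3_large_x0_35", 19.2, 8.1),
    ("PPLCNet_x0_5", 20.3, 6.0),
    ("MobileNetV3_large_x0_75", 25.8, 11.1),
    ("PPLCNet_x1_0", 26.9, 7.9),
]

# Assumed pairing: each PPLCNet backbone against the row above it.
PAIRS = [(ROWS[0], ROWS[1]), (ROWS[2], ROWS[3])]


def compare(baseline, candidate):
    """Return the mAP gain (points) and latency speedup (ratio)."""
    _, base_map, base_ms = baseline
    _, cand_map, cand_ms = candidate
    return {
        "map_gain": round(cand_map - base_map, 2),
        "speedup": round(base_ms / cand_ms, 2),
    }


if __name__ == "__main__":
    for baseline, candidate in PAIRS:
        result = compare(baseline, candidate)
        print(f"{candidate[0]} vs {baseline[0]}: "
              f"+{result['map_gain']} mAP, {result['speedup']}x faster")
```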
@@ -143,18 +145,47 @@ For semantic segmentation, DeeplabV3+ is adopted. The following table presents t

 | Backbone | mIoU(%) | Latency(ms) |
 |-------|-----------|----------|
-MobileNetV3-large-0.5x | 55.42 | 135 |
-<b>PP-LCNet-0.5x<b> | <b>58.36<b> | <b>82<b> |
-MobileNetV3-large-0.75x | 64.53 | 151 |
-<b>PP-LCNet-1x<b> | <b>66.03<b> | <b>96<b> |
+MobileNetV3_large_x0_5 | 55.42 | 135 |
+<b>PPLCNet_x0_5<b> | <b>58.36<b> | <b>82<b> |
+MobileNetV3_large_x0_75 | 64.53 | 151 |
+<b>PPLCNet_x1_0<b> | <b>66.03<b> | <b>96<b> |

 <a name="5"></a>
-## 5. Conclusion
+## 5. Inference speed based on V100 GPU
+
+| Models | Crop Size | Resize Short Size | FP32<br>Batch Size=1<br>(ms) | FP32<br/>Batch Size=1\4<br/>(ms) | FP32<br/>Batch Size=8<br/>(ms) |
+| ------------- | --------- | ----------------- | ---------------------------- | -------------------------------- | ------------------------------ |
+| PPLCNet_x0_25 | 224 | 256 | 0.72 | 1.17 | 1.71 |
+| PPLCNet_x0_35 | 224 | 256 | 0.69 | 1.21 | 1.82 |
+| PPLCNet_x0_5 | 224 | 256 | 0.70 | 1.32 | 1.94 |
+| PPLCNet_x0_75 | 224 | 256 | 0.71 | 1.49 | 2.19 |
+| PPLCNet_x1_0 | 224 | 256 | 0.73 | 1.64 | 2.53 |
+| PPLCNet_x1_5 | 224 | 256 | 0.82 | 2.06 | 3.12 |
+| PPLCNet_x2_0 | 224 | 256 | 0.94 | 2.58 | 4.08 |
+
+<a name="6"></a>
+
+## 6. Inference speed based on SD855
+
+| Models | SD855 time(ms)<br>bs=1, thread=1 | SD855 time(ms)<br/>bs=1, thread=2 | SD855 time(ms)<br/>bs=1, thread=4 |
+| ------------- | -------------------------------- | --------------------------------- | --------------------------------- |
+| PPLCNet_x0_25 | 2.30 | 1.62 | 1.32 |
+| PPLCNet_x0_35 | 3.15 | 2.11 | 1.64 |
+| PPLCNet_x0_5 | 4.27 | 2.73 | 1.92 |
+| PPLCNet_x0_75 | 7.38 | 4.51 | 2.91 |
+| PPLCNet_x1_0 | 10.78 | 6.49 | 3.98 |
+| PPLCNet_x1_5 | 20.55 | 12.26 | 7.54 |
+| PPLCNet_x2_0 | 33.79 | 20.17 | 12.10 |
+| PPLCNet_x2_5 | 49.89 | 29.60 | 17.82 |
+
+
+<a name="7"></a>
+## 7. Conclusion

 Rather than holding on to perfect FLOPs and Params as academics do, PP-LCNet focuses on analyzing how to add Intel CPU-friendly modules to improve the performance of the model, which can better balance accuracy and inference time. The experimental conclusions therein are available to other researchers in network structure design, while providing NAS search researchers with a smaller search space and general conclusions. The finished PP-LCNet can also be better accepted and applied in industry.

-<a name="6"></a>
-## 6. Reference
+<a name="8"></a>
+## 8. Reference

 Reference to cite when you use PP-LCNet in a paper:
 ```

docs/en/quick_start/quick_start_classification_professional_en.md

Lines changed: 4 additions & 0 deletions
@@ -75,6 +75,10 @@ python3 -m paddle.distributed.launch \

 The highest accuracy of the validation set is around 0.415.

+* ** Note**
+
+* If the number of GPU cards is not 4, the accuracy of the validation set may be different from 0.415. To maintain a comparable accuracy, you need to change the learning rate in the configuration file to `the current learning rate / 4 \* current card number`. The same below.
+
 <a name="2.1.2"></a>
8084

docs/zh_CN/models/PP-LCNet.md

Lines changed: 34 additions & 34 deletions
@@ -95,38 +95,38 @@ After the four improvements above, BaseNet becomes PP-LCNet. The table below further

 For image classification, we use the ImageNet dataset. Compared with current mainstream lightweight networks, PP-LCNet achieves faster inference at the same accuracy. With Baidu's self-developed SSLD distillation strategy, accuracy improves further: Top-1 Acc on ImageNet exceeds 80% at an inference time of about 5ms on an Intel CPU.

-| Model | Params(M) | FLOPs(M) | Top-1 Acc(\%) | Top-5 Acc(\%) | Latency(ms) |
+| Model | Params(M) | FLOPs(M) | Top-1 Acc(\%) | Top-5 Acc(\%) | Latency(ms) |
 |-------|-----------|----------|---------------|---------------|-------------|
-| PP-LCNet-0.25x | 1.5 | 18 | 51.86 | 75.65 | 1.74 |
-| PP-LCNet-0.35x | 1.6 | 29 | 58.09 | 80.83 | 1.92 |
-| PP-LCNet-0.5x | 1.9 | 47 | 63.14 | 84.66 | 2.05 |
-| PP-LCNet-0.75x | 2.4 | 99 | 68.18 | 88.30 | 2.29 |
-| PP-LCNet-1x | 3.0 | 161 | 71.32 | 90.03 | 2.46 |
-| PP-LCNet-1.5x | 4.5 | 342 | 73.71 | 91.53 | 3.19 |
-| PP-LCNet-2x | 6.5 | 590 | 75.18 | 92.27 | 4.27 |
-| PP-LCNet-2.5x | 9.0 | 906 | 76.60 | 93.00 | 5.39 |
-| PP-LCNet-0.5x\* | 1.9 | 47 | 66.10 | 86.46 | 2.05 |
-| PP-LCNet-1.0x\* | 3.0 | 161 | 74.39 | 92.09 | 2.46 |
-| PP-LCNet-2.5x\* | 9.0 | 906 | 80.82 | 95.33 | 5.39 |
-
-\* denotes the model after SSLD distillation.
+| PPLCNet_x0_25 | 1.5 | 18 | 51.86 | 75.65 | 1.74 |
+| PPLCNet_x0_35 | 1.6 | 29 | 58.09 | 80.83 | 1.92 |
+| PPLCNet_x0_5 | 1.9 | 47 | 63.14 | 84.66 | 2.05 |
+| PPLCNet_x0_75 | 2.4 | 99 | 68.18 | 88.30 | 2.29 |
+| PPLCNet_x1_0 | 3.0 | 161 | 71.32 | 90.03 | 2.46 |
+| PPLCNet_x1_5 | 4.5 | 342 | 73.71 | 91.53 | 3.19 |
+| PPLCNet_x2_0 | 6.5 | 590 | 75.18 | 92.27 | 4.27 |
+| PPLCNet_x2_5 | 9.0 | 906 | 76.60 | 93.00 | 5.39 |
+| PPLCNet_x0_5_ssld | 1.9 | 47 | 66.10 | 86.46 | 2.05 |
+| PPLCNet_x1_0_ssld | 3.0 | 161 | 74.39 | 92.09 | 2.46 |
+| PPLCNet_x2_5_ssld | 9.0 | 906 | 80.82 | 95.33 | 5.39 |
+
+where `_ssld` denotes the model after `SSLD distillation`. For details about `SSLD distillation`, see [SSLD distillation](../advanced_tutorials/knowledge_distillation.md)

 Performance comparison with other lightweight networks:

 | Model | Params(M) | FLOPs(M) | Top-1 Acc(\%) | Top-5 Acc(\%) | Latency(ms) |
 |-------|-----------|----------|---------------|---------------|-------------|
-| MobileNetV2-0.25x | 1.5 | 34 | 53.21 | 76.52 | 2.47 |
-| MobileNetV3-small-0.35x | 1.7 | 15 | 53.03 | 76.37 | 3.02 |
-| ShuffleNetV2-0.33x | 0.6 | 24 | 53.73 | 77.05 | 4.30 |
-| <b>PP-LCNet-0.25x<b> | <b>1.5<b> | <b>18<b> | <b>51.86<b> | <b>75.65<b> | <b>1.74<b> |
-| MobileNetV2-0.5x | 2.0 | 99 | 65.03 | 85.72 | 2.85 |
-| MobileNetV3-large-0.35x | 2.1 | 41 | 64.32 | 85.46 | 3.68 |
-| ShuffleNetV2-0.5x | 1.4 | 43 | 60.32 | 82.26 | 4.65 |
-| <b>PP-LCNet-0.5x<b> | <b>1.9<b> | <b>47<b> | <b>63.14<b> | <b>84.66<b> | <b>2.05<b> |
-| MobileNetV1-1x | 4.3 | 578 | 70.99 | 89.68 | 3.38 |
-| MobileNetV2-1x | 3.5 | 327 | 72.15 | 90.65 | 4.26 |
-| MobileNetV3-small-1.25x | 3.6 | 100 | 70.67 | 89.51 | 3.95 |
-| <b>PP-LCNet-1x<b> |<b> 3.0<b> | <b>161<b> | <b>71.32<b> | <b>90.03<b> | <b>2.46<b> |
+| MobileNetV2_x0_25 | 1.5 | 34 | 53.21 | 76.52 | 2.47 |
+| MobileNetV3_small_x0_35 | 1.7 | 15 | 53.03 | 76.37 | 3.02 |
+| ShuffleNetV2_x0_33 | 0.6 | 24 | 53.73 | 77.05 | 4.30 |
+| <b>PPLCNet_x0_25<b> | <b>1.5<b> | <b>18<b> | <b>51.86<b> | <b>75.65<b> | <b>1.74<b> |
+| MobileNetV2_x0_5 | 2.0 | 99 | 65.03 | 85.72 | 2.85 |
+| MobileNetV3_large_x0_35 | 2.1 | 41 | 64.32 | 85.46 | 3.68 |
+| ShuffleNetV2_x0_5 | 1.4 | 43 | 60.32 | 82.26 | 4.65 |
+| <b>PPLCNet_x0_5<b> | <b>1.9<b> | <b>47<b> | <b>63.14<b> | <b>84.66<b> | <b>2.05<b> |
+| MobileNetV1_x1_0 | 4.3 | 578 | 70.99 | 89.68 | 3.38 |
+| MobileNetV2_x1_0 | 3.5 | 327 | 72.15 | 90.65 | 4.26 |
+| MobileNetV3_small_x1_25 | 3.6 | 100 | 70.67 | 89.51 | 3.95 |
+| <b>PPLCNet_x1_0<b> |<b> 3.0<b> | <b>161<b> | <b>71.32<b> | <b>90.03<b> | <b>2.46<b> |

 <a name="4.2"></a>
 ### 4.2 Object Detection
@@ -135,10 +135,10 @@ After the four improvements above, BaseNet becomes PP-LCNet. The table below further

 | Backbone | mAP(%) | Latency(ms) |
 |-------|-----------|----------|
-MobileNetV3-large-0.35x | 19.2 | 8.1 |
-<b>PP-LCNet-0.5x<b> | <b>20.3<b> | <b>6.0<b> |
-MobileNetV3-large-0.75x | 25.8 | 11.1 |
-<b>PP-LCNet-1x<b> | <b>26.9<b> | <b>7.9<b> |
+MobileNetV3_large_x0_35 | 19.2 | 8.1 |
+<b>PPLCNet_x0_5<b> | <b>20.3<b> | <b>6.0<b> |
+MobileNetV3_large_x0_75 | 25.8 | 11.1 |
+<b>PPLCNet_x1_0<b> | <b>26.9<b> | <b>7.9<b> |

 <a name="4.3"></a>
 ### 4.3 Semantic Segmentation
@@ -147,10 +147,10 @@ MobileNetV3-large-0.75x | 25.8 | 11.1 |

 | Backbone | mIoU(%) | Latency(ms) |
 |-------|-----------|----------|
-|MobileNetV3-large-0.5x | 55.42 | 135 |
-|<b>PP-LCNet-0.5x<b> | <b>58.36<b> | <b>82<b> |
-|MobileNetV3-large-0.75x | 64.53 | 151 |
-|<b>PP-LCNet-1x<b> | <b>66.03<b> | <b>96<b> |
+MobileNetV3_large_x0_5 | 55.42 | 135 |
+<b>PPLCNet_x0_5<b> | <b>58.36<b> | <b>82<b> |
+MobileNetV3_large_x0_75 | 64.53 | 151 |
+<b>PPLCNet_x1_0<b> | <b>66.03<b> | <b>96<b> |

 <a name="5"></a>

docs/zh_CN/quick_start/quick_start_classification_professional.md

Lines changed: 5 additions & 1 deletion
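@@ -75,6 +75,10 @@ python3 -m paddle.distributed.launch \

 The highest accuracy on the validation set is around 0.415.

+* **Note**
+
+* If the number of GPU cards is not 4, the validation accuracy may differ from 0.415. To keep a comparable accuracy, change the learning rate in the configuration file to `current learning rate / 4 \* current number of cards`. The same applies below.
+
 <a name="2.1.2"></a>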
8084

@@ -153,7 +157,7 @@ python3 -m paddle.distributed.launch \
 * **Note**

 * For the configuration files of other data augmentation methods, refer to those in `ppcls/configs/ImageNet/DataAugment/`.
-* CIFAR100 is trained for relatively few epochs, so the validation accuracy may fluctuate by about 1% during training.
+* CIFAR100 is trained for relatively few epochs, so the validation accuracy may fluctuate by about 1% during training.

 <a name="4"></a>
