# INT8 MKL-DNN quantization
This document describes how to use the Paddle inference engine to convert FP32 models to INT8 models. We provide instructions on enabling INT8 MKL-DNN quantization in Paddle inference and show the accuracy and performance results of the quantized models, covering seven image classification models (GoogleNet, MobileNet-V1, MobileNet-V2, ResNet-101, ResNet-50, VGG16, VGG19) and one object detection model, Mobilenet-SSD.
## 0. Install PaddlePaddle
Note: MKL-DNN and MKL are required.
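As a hedged sketch of a source build with these libraries enabled (the CMake flags `WITH_MKL`, `WITH_MKLDNN`, and `WITH_TESTING` are standard PaddlePaddle build options; the paths and job count are illustrative):

```bash
# Sketch: build PaddlePaddle from source with MKL and MKL-DNN enabled.
# /PATH/TO/PADDLE is a placeholder for your Paddle source checkout.
cd /PATH/TO/PADDLE
mkdir -p build && cd build
cmake .. -DCMAKE_BUILD_TYPE=Release \
         -DWITH_MKL=ON \
         -DWITH_MKLDNN=ON \
         -DWITH_TESTING=ON   # needed to build the unit tests used below
make -j"$(nproc)"
```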
## 1. Enable INT8 MKL-DNN quantization
For reference, please examine the code of the unit tests enclosed in [analyzer_int8_image_classification_tester.cc](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/fluid/inference/tests/api/analyzer_int8_image_classification_tester.cc) and [analyzer_int8_object_detection_tester.cc](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/fluid/inference/tests/api/analyzer_int8_object_detection_tester.cc).
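If you want to exercise the INT8 path directly, a sketch along the following lines may help; the binary name `test_analyzer_int8_image_classification`, the flag names, and the model/data paths are assumptions inferred from the tester sources, not verbatim from this document:

```bash
# Sketch: run the INT8 image classification tester from the build directory.
# Binary name and flags are assumptions; check the tester's source and the
# CMake target names for the authoritative values.
cd /PATH/TO/PADDLE/build
./paddle/fluid/inference/tests/api/test_analyzer_int8_image_classification \
    --infer_model=third_party/inference_demo/int8v2/resnet50/model \
    --infer_data=third_party/inference_demo/int8v2/data.bin \
    --batch_size=1 \
    --paddle_num_threads=1
```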
We provide the results of accuracy and performance measured on Intel(R) Xeon(R) processors:

| Model | FP32 throughput | INT8 throughput | Ratio (INT8/FP32) |
| ----- | --------------- | --------------- | ----------------- |
| VGG16 | 3.64 | 10.56 | 2.90 |
| VGG19 | 2.95 | 9.02 | 3.05 |
## 3. Commands to reproduce the above accuracy and performance benchmark
Two steps are needed to reproduce the above-mentioned accuracy results; we take the GoogleNet benchmark as an example:
* ## Prepare dataset
Run the following commands to download and preprocess the ILSVRC2012 Validation dataset.
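The concrete commands are not visible in this excerpt. As a sketch, assuming the preprocessing is driven by a Python script shipped in the same tests directory (the script name below is an assumption based on the test layout):

```bash
# Sketch: download and preprocess the ILSVRC2012 validation set (large download).
# The script name is an assumption; look in paddle/fluid/inference/tests/api/.
cd /PATH/TO/PADDLE/build
python ../paddle/fluid/inference/tests/api/full_ILSVRC2012_val_preprocess.py
```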
For the object detection benchmark, the user dataset will be preprocessed and saved by default in `/PATH/TO/PADDLE/build/third_party/inference_demo/int8v2/pascalvoc_small/pascalvoc_small.bin`.
* ## Commands to reproduce object detection benchmark
You can run `test_analyzer_int8_object_detection` with the following arguments to reproduce the benchmark results for Mobilenet-SSD.
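The exact argument values are elided here; a hedged sketch, reusing the default dataset path quoted above (the model path and the `--warmup_batch_size`/`--batch_size` values are illustrative assumptions):

```bash
# Sketch: reproduce the Mobilenet-SSD INT8 benchmark. The model path and the
# batch-size values are illustrative; the dataset path is the default quoted
# in the text above.
cd /PATH/TO/PADDLE/build
./paddle/fluid/inference/tests/api/test_analyzer_int8_object_detection \
    --infer_model=third_party/inference_demo/int8v2/mobilenet-ssd/model \
    --infer_data=third_party/inference_demo/int8v2/pascalvoc_small/pascalvoc_small.bin \
    --warmup_batch_size=10 \
    --batch_size=50 \
    --paddle_num_threads=1
```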
Notes:

* Measurement of accuracy requires a model which accepts two inputs: data and labels.
* Different sampling batch sizes may cause slight differences in INT8 accuracy.
* C-API performance is better than Python API performance because of the Python overhead; especially for small computational models, the Python overhead is more pronounced.