[Bug fixes]: Modify code and setting to run ACT correctly (#3539)

ranchongzhi · web-flow · commit bf3662078804 · 2023-10-25T15:51:52.000+08:00
* Modify the code in test_seg.py to collect dynamic shape information for the ADE20K dataset

* Adjust the validation config for ppmobileseg; update all test results and the environment installation docs

* modify
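
The dynamic-shape collection in the first bullet runs inference only on the images that bound the dataset's height and width range, instead of on a single image. A minimal, framework-free sketch of that selection follows. It assumes each sample's shape is already available as a `(height, width)` pair, and `bounding_size_indices` is a hypothetical name used for illustration, not a function in this commit:

```python
from typing import List, Sequence, Tuple


def bounding_size_indices(shapes: Sequence[Tuple[int, int]]) -> List[int]:
    """Return the indices of the samples with the max width, max height,
    min width and min height, in that order.

    `shapes` holds one (height, width) pair per sample.
    """
    if not shapes:
        raise ValueError("shapes must not be empty")
    indices = range(len(shapes))
    # max()/min() return the first extreme on ties, matching the strict
    # `>` / `<` comparisons used in the diff.
    max_h = max(indices, key=lambda i: shapes[i][0])
    max_w = max(indices, key=lambda i: shapes[i][1])
    min_h = min(indices, key=lambda i: shapes[i][0])
    min_w = min(indices, key=lambda i: shapes[i][1])
    # Same order as the Subset built in test_seg.py: widths first, then heights.
    return [max_w, max_h, min_w, min_h]


shapes = [(512, 683), (512, 2048), (768, 512), (1024, 512)]
print(bounding_size_indices(shapes))  # [1, 3, 2, 0]
```

The same index may appear more than once (for example, when one image is both the widest and the tallest); the resulting subset then simply runs that image again.
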
diff --git a/deploy/slim/act/configs/datasets/ade_data.yml b/deploy/slim/act/configs/datasets/ade_data.yml
@@ -22,6 +22,10 @@ train_dataset:
 
 val_dataset:
   transforms:
+    - type: Resize
+      target_size: [2048, 512]
+      keep_ratio: True
+      size_divisor: 32
     - type: Normalize
       mean: [0.485, 0.456, 0.406]
       std: [0.229, 0.224, 0.225]
diff --git a/deploy/slim/act/readme.md b/deploy/slim/act/readme.md
@@ -17,30 +17,30 @@
 
 ## 2.Benchmark
 
-| Model | Strategy  | Total IoU (%) | CPU latency (ms)<br>thread=10<br>mkldnn=on| Nvidia GPU latency (ms)<br>TRT=on| Config | Inference model  |
+| Model | Strategy  | Total IoU (%) | CPU latency (ms)<br>thread=12<br>mkldnn=on | Nvidia GPU latency (ms)<br>TRT=on| Config | Inference model  |
 |:-----:|:-----:|:----------:|:---------:| :------:|:------:|:------:|
-| OCRNet_HRNetW48 |Baseline |82.15| **4332.2** | **154.9** | - | [model](https://paddleseg.bj.bcebos.com/deploy/slim_act/ocrnet/ocrnet_export.zip)|
-| OCRNet_HRNetW48 | Quantized distillation training |82.03| **3728.7** | **59.8**|[config](configs/ocrnet/ocrnet_hrnetw48_qat.yaml)| [model](https://paddleseg.bj.bcebos.com/deploy/slim_act/ocrnet/ocrnet_qat.zip) |
-| SegFormer-B0*  |Baseline | 75.27| 285.4| 34.3 |-| [model](https://paddleseg.bj.bcebos.com/deploy/slim_act/segformer/segformer_b0_export.zip) |
-| SegFormer-B0*  |Quantized distillation training | 75.22 | 284.1| 35.7|[config](configs/segformer/segformer_b0_qat.yaml)| [model](https://paddleseg.bj.bcebos.com/deploy/slim_act/segformer/segformer_qat.zip) |
-| PP-LiteSeg-Tiny  |Baseline | 77.04 | 640.72 | **11.9** | - |[model](https://paddleseg.bj.bcebos.com/deploy/slim_act/ppliteseg/liteseg_tiny_scale1.0.zip)|
-| PP-LiteSeg-Tiny  |Quantized distillation training | 77.14 | 450.19 | **7.5** | [config](./configs/ppliteseg/ppliteseg_qat.yaml)|[model](https://paddleseg.bj.bcebos.com/deploy/slim_act/ppliteseg/save_quant_model_qat.zip)|
-| PP-MobileSeg-Base  |Baseline |41.55| **311.1** | **17.8** | - | [model](https://paddleseg.bj.bcebos.com/deploy/slim_act/ppmobileseg/ppmobileseg_base_ade_export.zip) |
-| PP-MobileSeg-Base  |Quantized distillation training |39.08| **303.6** | **16.2**| [config](configs/ppmobileseg/ppmobileseg_qat.yml)| [model](https://paddleseg.bj.bcebos.com/deploy/slim_act/ppmobileseg/ppmobileseg_base_ade.zip)|
-
-* SegFormer-B0 is tested on CPU under deleted gpu_cpu_map_matmul_v2_to_mul_pass because it will raise an error.
+| OCRNet_HRNetW48 |Baseline |82.16| **5788.7** | **153.0** | - | [model](https://paddleseg.bj.bcebos.com/deploy/slim_act/ocrnet/ocrnet_export.zip)|
+| OCRNet_HRNetW48 | Quantized distillation training |82.02| **5291.4** | **60.0** |[config](configs/ocrnet/ocrnet_hrnetw48_qat.yaml)| [model](https://paddleseg.bj.bcebos.com/deploy/slim_act/ocrnet/ocrnet_qat.zip) |
+| SegFormer-B0*  |Baseline | 75.27| **3234.6** | **72.6** |-| [model](https://paddleseg.bj.bcebos.com/deploy/slim_act/segformer/segformer_b0_export.zip) |
+| SegFormer-B0*  |Quantized distillation training | 75.26 | **2906.2** | **52.4** |[config](configs/segformer/segformer_b0_qat.yaml)| [model](https://paddleseg.bj.bcebos.com/deploy/slim_act/segformer/segformer_qat.zip) |
+| PP-LiteSeg-Tiny  |Baseline | 77.04 | 1038.4 | **11.7** | - |[model](https://paddleseg.bj.bcebos.com/deploy/slim_act/ppliteseg/liteseg_tiny_scale1.0.zip)|
+| PP-LiteSeg-Tiny  |Quantized distillation training | 77.16 | 1163.8 | **7.2** | [config](./configs/ppliteseg/ppliteseg_qat.yaml)|[model](https://paddleseg.bj.bcebos.com/deploy/slim_act/ppliteseg/save_quant_model_qat.zip)|
+| PP-MobileSeg-Base  |Baseline |40.69| **547.7** | **22.3** | - | [model](https://paddleseg.bj.bcebos.com/deploy/slim_act/ppmobileseg/ppmobileseg_base_ade_export.zip) |
+| PP-MobileSeg-Base  |Quantized distillation training |38.18| **439.8** | **21.1** | [config](configs/ppmobileseg/ppmobileseg_qat.yml)| [model](https://paddleseg.bj.bcebos.com/deploy/slim_act/ppmobileseg/ppmobileseg_base_ade.zip)|
+| PP-MobileSeg-Base  |Quantized distillation training (IR optimization disabled) |39.92| **1296.3** | **44.3** | - | [model](https://paddleseg.bj.bcebos.com/deploy/slim_act/ppmobileseg/ppmobileseg_base_ade.zip)|
+
 * PP-MobileSeg-Base is tested on ADE20K dataset, while others are tested on cityscapes.
 
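 - CPU test environment:
-  - Intel(R) Xeon(R) Gold 6271C CPU @ 2.60GHz
+  - Intel(R) Xeon(R) Gold 6148 CPU @ 2.40GHz
   - cpu thread: 10
 
 
 - Nvidia GPU test environment:
 
   - Hardware: a single NVIDIA Tesla V100
   - Software: CUDA 11.2, cudnn 8.1.0, TensorRT-8.0.3.4
-  - Test configuration: batch_size: 4
+  - Test configuration: batch_size: 1
 
 - Timing requirements:
   - Average over repeated runs: the measured time for a single image fluctuates, so timing runs 10 warmup iterations and then averages over 100 runs. The batch test in the existing test_seg already includes this.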
@@ -53,21 +53,22 @@
 
 #### 3.1 Prepare the environment
 
-- PaddlePaddle == 2.5 (can be downloaded and installed from the [Paddle website](https://www.paddlepaddle.org.cn/install/quick?docurl=/documentation/docs/zh/install/pip/linux-pip.html))
-- PaddleSlim == 2.5
+- PaddlePaddle == develop (can be downloaded and installed from the [Paddle website](https://www.paddlepaddle.org.cn/install/quick?docurl=/documentation/docs/zh/install/pip/linux-pip.html))
+- PaddleSlim == develop
 - PaddleSeg == develop
 
 Install paddlepaddle:
 ```shell
 # CPU
-python -m pip install paddlepaddle==2.5.1 -i https://pypi.tuna.tsinghua.edu.cn/simple
-# GPU, using Ubuntu and CUDA 10.2 as an example
-python -m pip install paddlepaddle-gpu==2.5.1.post102 -f https://www.paddlepaddle.org.cn/whl/linux/mkl/avx/stable.html
+python -m pip install paddlepaddle==0.0.0 -f https://www.paddlepaddle.org.cn/whl/linux/cpu-mkl/develop.html
+# GPU, using Ubuntu and CUDA 11.2 as an example
+python -m pip install paddlepaddle-gpu==0.0.0.post112 -f https://www.paddlepaddle.org.cn/whl/linux/gpu/develop.html
 ```
 
-Install paddleslim 2.5:
+Install paddleslim develop:
 ```shell
-pip install paddleslim@git+https://gitee.com/paddlepaddle/PaddleSlim.git@release/2.5
+git clone https://github.com/PaddlePaddle/PaddleSlim.git && cd PaddleSlim
+python setup.py install
 ```
 
 Install paddleseg develop and the corresponding packages:
diff --git a/deploy/slim/act/run_seg.py b/deploy/slim/act/run_seg.py
@@ -96,9 +96,11 @@ def eval_function(exe, compiled_test_program, test_feed_names, test_fetch_list):
         paddle.disable_static()
         logit = logits[
             0]  # logit shape is 3, except  data['trans_info'] needs to be empty
+        for i, v in enumerate(data['trans_info'][-1][1]):
+            data['trans_info'][-1][1][i] = paddle.to_tensor(v)
         logit = reverse_transform(
-            paddle.to_tensor(logit), data['trans_info'], mode='bilinear')
-        pred = paddle.to_tensor(logit)
+            paddle.to_tensor(logit).unsqueeze(0), data['trans_info'], mode='bilinear')
+        pred = paddle.to_tensor(logit).squeeze(0)
         if len(
                 pred.shape
         ) == 4:  # for humanseg model whose prediction is distribution but not class id
diff --git a/deploy/slim/act/test_seg.py b/deploy/slim/act/test_seg.py
@@ -41,6 +41,39 @@ def _transforms(dataset):
     return transforms
 
 
+def find_images_with_bounding_size(eval_dataset: paddle.io.Dataset):
+    max_length_index = -1
+    max_width_index = -1
+    min_length_index = -1
+    min_width_index = -1
+
+    max_length = float('-inf')
+    max_width = float('-inf')
+    min_length = float('inf')
+    min_width = float('inf')
+    for idx, data in enumerate(eval_dataset):
+        image = np.array(data['img'])
+        h, w = image.shape[-2:]
+        if h > max_length:
+            max_length = h
+            max_length_index = idx
+        if w > max_width:
+            max_width = w
+            max_width_index = idx
+        if h < min_length:
+            min_length = h
+            min_length_index = idx
+        if w < min_width:
+            min_width = w
+            min_width_index = idx
+    print(f"Found max image length: {max_length}, index: {max_length_index}")
+    print(f"Found max image width: {max_width}, index: {max_width_index}")
+    print(f"Found min image length: {min_length}, index: {min_length_index}")
+    print(f"Found min image width: {min_width}, index: {min_width_index}")
+    return paddle.io.Subset(eval_dataset, [max_width_index, max_length_index,
+                                           min_width_index, min_length_index])
+
+
 def load_predictor(args):
     """
     load predictor func
@@ -109,7 +142,7 @@ def predict_image(args):
     data = transform({'img': args.image_file})
     data = data['img'][np.newaxis, :]
 
-    # Step2: Prepare prdictor
+    # Step2: Prepare predictor
     predictor, rerun_flag = load_predictor(args)
 
     # Step3: Inference
@@ -167,6 +200,15 @@ def eval(args):
 
     eval_dataset = builder.val_dataset
 
+    predictor, rerun_flag = load_predictor(args)
+
+    if rerun_flag and args.use_multi_img_for_dynamic_shape_collect:
+        print(
+            "***** Try to find the images with the largest and smallest length and width respectively in the ADE20K "
+            "dataset for collecting dynamic shape. *****"
+        )
+        eval_dataset = find_images_with_bounding_size(eval_dataset)
+
     batch_sampler = paddle.io.BatchSampler(
         eval_dataset, batch_size=1, shuffle=False, drop_last=False)
     loader = paddle.io.DataLoader(
@@ -175,8 +217,6 @@ def eval(args):
         num_workers=0,
         return_list=True)
 
-    predictor, rerun_flag = load_predictor(args)
-
     intersect_area_all = 0
     pred_area_all = 0
     label_area_all = 0
@@ -207,14 +247,22 @@ def eval(args):
         time_max = max(time_max, timed)
         predict_time += timed
         if rerun_flag:
-            print(
-                "***** Collect dynamic shape done, Please rerun the program to get correct results. *****"
-            )
-            return
-
+            if args.use_multi_img_for_dynamic_shape_collect:
+                if batch_id == sample_nums - 1:
+                    print(
+                        "***** Collect dynamic shape done, Please rerun the program to get correct results. *****"
+                    )
+                    return
+                else:
+                    continue
+            else:
+                print(
+                    "***** Collect dynamic shape done, Please rerun the program to get correct results. *****"
+                )
+                return
         logit = reverse_transform(
-            paddle.to_tensor(results), data['trans_info'], mode="bilinear")
-        pred = paddle.to_tensor(logit)
+            paddle.to_tensor(results).unsqueeze(0), data['trans_info'], mode="bilinear")
+        pred = paddle.to_tensor(logit).squeeze(0)
         if len(
                 pred.shape
         ) == 4:  # for humanseg model whose prediction is distribution but not class id
@@ -314,6 +362,12 @@ def eval(args):
         help="Whether use mkldnn or not.")
     parser.add_argument(
         "--cpu_threads", type=int, default=1, help="Num of cpu threads.")
+    parser.add_argument(
+        "--use_multi_img_for_dynamic_shape_collect",
+        type=lambda s: str(s).lower() in ("true", "1", "yes"),
+        default=False,
+        help="Whether to use multiple images to collect shape information. "
+        "Set to True when the images in the dataset have different sizes.")
     args = parser.parse_args()
     if args.image_file:
         predict_image(args)
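
Boolean command-line options such as `--use_multi_img_for_dynamic_shape_collect` need care with argparse: `type=bool` would call `bool()` on the raw string, so any non-empty value, including `False`, parses as `True`. A minimal sketch of explicit parsing, assuming a hypothetical helper `str2bool` that is not part of this commit:

```python
import argparse


def str2bool(value: str) -> bool:
    """Parse common true/false spellings and reject anything else."""
    text = value.strip().lower()
    if text in ("true", "1", "yes", "y"):
        return True
    if text in ("false", "0", "no", "n"):
        return False
    raise argparse.ArgumentTypeError(f"expected a boolean, got {value!r}")


parser = argparse.ArgumentParser()
parser.add_argument(
    "--use_multi_img_for_dynamic_shape_collect",
    type=str2bool,
    default=False,
    help="Use several images to collect dynamic shape information.")

# bool("False") is True because the string is non-empty, so a plain
# type=bool cannot switch the option off from the command line.
args = parser.parse_args(["--use_multi_img_for_dynamic_shape_collect", "False"])
print(args.use_multi_img_for_dynamic_shape_collect)  # False
```

Rejecting unknown spellings with `ArgumentTypeError` makes a typo fail at startup instead of silently choosing one branch of the shape-collection logic.
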