fix TRT not working in CPP inference. #371

Merged

Changes from all commits
26 changes: 17 additions & 9 deletions deploy/cpp_infer/readme.md
@@ -106,12 +106,12 @@ python -m pip install git+https://github.com/LDOUBLEV/AutoLog

#### 1.2.1 Direct download and installation

* The [Paddle prediction library official website](https://paddleinference.paddlepaddle.org.cn/v2.1/user_guides/download_lib.html) provides Linux prediction libraries for different CUDA versions; check the website and **select the appropriate prediction library version** (a prediction library built from Paddle >= 2.0.1 is recommended).
* The [Paddle prediction library official website](https://paddleinference.paddlepaddle.org.cn/v2.2/user_guides/download_lib.html) provides Linux prediction libraries for different CUDA versions; check the website and **select the appropriate prediction library version** (a prediction library built from Paddle >= 2.0.1 is recommended).

* The download is a `paddle_inference.tgz` archive; extract it into a folder with the following commands (taking a gcc 8.2 machine environment as an example):

```bash
wget https://paddle-inference-lib.bj.bcebos.com/2.1.1-gpu-cuda10.1-cudnn7-mkl-gcc8.2/paddle_inference.tgz
wget https://paddle-inference-lib.bj.bcebos.com/2.2.2/cxx_c/Linux/GPU/x86-64_gcc8.2_avx_mkl_cuda10.1_cudnn7.6.5_trt6.0.1.5/paddle_inference.tgz
tar -xf paddle_inference.tgz
```

@@ -123,7 +123,7 @@ python -m pip install git+https://github.com/LDOUBLEV/AutoLog

```shell
git clone https://github.com/PaddlePaddle/Paddle.git
git checkout release/2.1
git checkout release/2.2
```

* After entering the Paddle directory, compile as follows.
@@ -199,11 +199,11 @@ python -m pip install git+https://github.com/LDOUBLEV/AutoLog
Taking PP-TSM as an example, the above parameters are set as follows (modify the placeholder parts according to your own machine):

```bash
OPENCV_DIR=/xxx/xxx/xxx/xxx/xxx/xxx/opencv3
LIB_DIR=/xxx/xxx/xxx/xxx/xxx/paddle_inference
CUDA_LIB_DIR=/xxx/xxx/cuda-xxx/lib64
CUDNN_LIB_DIR=/xxx/xxx/cuda-xxx/lib64
TENSORRT_DIR=/xxx/xxx/TensorRT-7.0.0.11
OPENCV_DIR=/path/to/opencv3
LIB_DIR=/path/to/paddle_inference
CUDA_LIB_DIR=/path/to/cuda/lib64
CUDNN_LIB_DIR=/path/to/cuda/lib64
TENSORRT_DIR=/path/to/TensorRT-x.x.x.x
```

Here, `OPENCV_DIR` is the path where OpenCV was compiled and installed; `LIB_DIR` is the path of the downloaded (`paddle_inference` folder) or self-compiled (`build/paddle_inference_install_dir` folder) Paddle prediction library; `CUDA_LIB_DIR` is the path of the CUDA libraries, which is `/usr/local/cuda/lib64` in Docker; `CUDNN_LIB_DIR` is the path of the cuDNN libraries, which is `/usr/lib/x86_64-linux-gnu/` in Docker. **Note: write all of the above as absolute paths, not relative paths.**
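
For orientation, here is a minimal sketch of how these variables are typically passed to CMake when building the demo. The CMake option names (`PADDLE_LIB`, `OPENCV_DIR`, `CUDA_LIB`, `CUDNN_LIB`, `TENSORRT_DIR`, `WITH_GPU`, `WITH_TENSORRT`) are assumptions modeled on similar Paddle C++ inference demos; prefer the repository's own build script if one is provided.

```bash
# Hypothetical build invocation; the option names are assumptions, not part of this PR.
mkdir -p build && cd build
cmake .. \
    -DPADDLE_LIB=${LIB_DIR} \
    -DOPENCV_DIR=${OPENCV_DIR} \
    -DCUDA_LIB=${CUDA_LIB_DIR} \
    -DCUDNN_LIB=${CUDNN_LIB_DIR} \
    -DTENSORRT_DIR=${TENSORRT_DIR} \
    -DWITH_GPU=ON \
    -DWITH_TENSORRT=ON
make -j
```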
@@ -222,6 +222,14 @@ python -m pip install git+https://github.com/LDOUBLEV/AutoLog

Here, `mode` is a required parameter that selects the function to run; the value range is ['rec'], which means **video recognition** (more functions will be added over time).

Note: to enable the TensorRT optimization option at inference time, first run the following commands to set the relevant CUDA and TensorRT paths.
```bash
export PATH=$PATH:/path/to/cuda/bin
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/path/to/cuda/bin
export LIBRARY_PATH=$LIBRARY_PATH:/path/to/cuda/bin
export LD_LIBRARY_PATH=/path/to/TensorRT-x.x.x.x/lib:$LD_LIBRARY_PATH
```
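
Once these paths are set, TensorRT can be turned on when the recognizer is invoked. The sketch below is hypothetical: the flag names (`--rec_model_dir`, `--video_dir`, `--use_gpu`, `--use_tensorrt`, `--precision`) are assumptions modeled on similar Paddle C++ inference demos and should be checked against the flags actually accepted by `./build/ppvideo`.

```bash
# Hypothetical example; verify the flag names against the binary's supported flags.
./build/ppvideo rec \
    --rec_model_dir=/path/to/ppTSM_inference_model \
    --video_dir=/path/to/videos \
    --use_gpu=true \
    --use_tensorrt=true \
    --precision=fp32
```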

##### 1. Run video recognition:
```bash
# run PP-TSM inference
@@ -297,4 +305,4 @@ I1125 08:10:45.834602 13955 autolog.h:67] preprocess_time(ms): 10.6524, inferenc

### 3 Notes

* When using the Paddle prediction library, version 2.1.0 of the prediction library is recommended.
* When using the Paddle prediction library, version 2.2.2 of the prediction library is recommended.
27 changes: 18 additions & 9 deletions deploy/cpp_infer/readme_en.md
@@ -106,12 +106,12 @@ There are two ways to obtain the Paddle prediction library, which will be descri

#### 1.2.1 Download and install directly

* The [Paddle prediction library official website](https://paddleinference.paddlepaddle.org.cn/v2.1/user_guides/download_lib.html) provides Linux prediction libraries for different CUDA versions; check the website and **select the appropriate prediction library version** (a prediction library with Paddle version >= 2.0.1 is recommended).
* The [Paddle prediction library official website](https://paddleinference.paddlepaddle.org.cn/v2.2/user_guides/download_lib.html) provides Linux prediction libraries for different CUDA versions; check the website and **select the appropriate prediction library version** (a prediction library with Paddle version >= 2.0.1 is recommended).

* The download is a `paddle_inference.tgz` compressed package; unzip it into a folder with the following commands (taking a gcc 8.2 machine environment as an example):

```bash
wget https://paddle-inference-lib.bj.bcebos.com/2.1.1-gpu-cuda10.1-cudnn7-mkl-gcc8.2/paddle_inference.tgz
wget https://paddle-inference-lib.bj.bcebos.com/2.2.2/cxx_c/Linux/GPU/x86-64_gcc8.2_avx_mkl_cuda10.1_cudnn7.6.5_trt6.0.1.5/paddle_inference.tgz
tar -xf paddle_inference.tgz
```

@@ -123,7 +123,7 @@ There are two ways to obtain the Paddle prediction library, which will be descri

```shell
git clone https://github.com/PaddlePaddle/Paddle.git
git checkout release/2.1
git checkout release/2.2
```

* After entering the Paddle directory, the compilation method is as follows.
@@ -199,11 +199,11 @@ There are two ways to obtain the Paddle prediction library, which will be descri
Taking PP-TSM as an example, the above parameters are set as follows (modify the placeholder parts according to your own machine):

```bash
OPENCV_DIR=/xxx/xxx/xxx/xxx/xxx/xxx/opencv3
LIB_DIR=/xxx/xxx/xxx/xxx/xxx/paddle_inference
CUDA_LIB_DIR=/xxx/xxx/cuda-xxx/lib64
CUDNN_LIB_DIR=/xxx/xxx/cuda-xxx/lib64
TENSORRT_DIR=/xxx/xxx/TensorRT-7.0.0.11
OPENCV_DIR=/path/to/opencv3
LIB_DIR=/path/to/paddle_inference
CUDA_LIB_DIR=/path/to/cuda/lib64
CUDNN_LIB_DIR=/path/to/cuda/lib64
TENSORRT_DIR=/path/to/TensorRT-x.x.x.x
```

Among them, `OPENCV_DIR` is the path where OpenCV was compiled and installed; `LIB_DIR` is the path of the downloaded (`paddle_inference` folder) or self-compiled (`build/paddle_inference_install_dir` folder) Paddle prediction library; `CUDA_LIB_DIR` is the path of the CUDA libraries, which is `/usr/local/cuda/lib64` in Docker; `CUDNN_LIB_DIR` is the path of the cuDNN libraries, which is `/usr/lib/x86_64-linux-gnu/` in Docker. **Note: write all of the above as absolute paths, not relative paths.**
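
For orientation, a minimal sketch of how these variables might be passed to CMake when building the demo is shown below. The CMake option names (`PADDLE_LIB`, `OPENCV_DIR`, `CUDA_LIB`, `CUDNN_LIB`, `TENSORRT_DIR`, `WITH_GPU`, `WITH_TENSORRT`) are assumptions modeled on similar Paddle C++ inference demos; prefer the repository's own build script if one is provided.

```bash
# Hypothetical build invocation; the option names are assumptions, not part of this PR.
mkdir -p build && cd build
cmake .. \
    -DPADDLE_LIB=${LIB_DIR} \
    -DOPENCV_DIR=${OPENCV_DIR} \
    -DCUDA_LIB=${CUDA_LIB_DIR} \
    -DCUDNN_LIB=${CUDNN_LIB_DIR} \
    -DTENSORRT_DIR=${TENSORRT_DIR} \
    -DWITH_GPU=ON \
    -DWITH_TENSORRT=ON
make -j
```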
Expand All @@ -222,7 +222,16 @@ Operation mode:

Among them, `mode` is a required parameter that selects the function to run; the value range is ['rec'], which means **video recognition** (more functions will be added over time).

Note: If you want to enable the TensorRT optimization option during prediction, first run the following commands to set the relevant CUDA and TensorRT paths.
```bash
export PATH=$PATH:/path/to/cuda/bin
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/path/to/cuda/bin
export LIBRARY_PATH=$LIBRARY_PATH:/path/to/cuda/bin
export LD_LIBRARY_PATH=/path/to/TensorRT-x.x.x.x/lib:$LD_LIBRARY_PATH
```
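
Once these paths are set, TensorRT can be turned on when the recognizer is invoked. The sketch below is hypothetical: the flag names (`--rec_model_dir`, `--video_dir`, `--use_gpu`, `--use_tensorrt`, `--precision`) are assumptions modeled on similar Paddle C++ inference demos and should be checked against the flags actually accepted by `./build/ppvideo`.

```bash
# Hypothetical example; verify the flag names against the binary's supported flags.
./build/ppvideo rec \
    --rec_model_dir=/path/to/ppTSM_inference_model \
    --video_dir=/path/to/videos \
    --use_gpu=true \
    --use_tensorrt=true \
    --precision=fp32
```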

##### 1. Call video recognition:

```bash
# run PP-TSM inference
./build/ppvideo rec \
@@ -296,4 +305,4 @@ I1125 08:10:45.834602 13955 autolog.h:67] preprocess_time(ms): 10.6524, inferenc

### 3 Attention

* When using the Paddle prediction library, it is recommended to use the prediction library of version 2.1.0.
* When using the Paddle prediction library, it is recommended to use the prediction library of version 2.2.2.
47 changes: 32 additions & 15 deletions deploy/cpp_infer/src/video_rec.cpp
@@ -225,35 +225,52 @@ namespace PaddleVideo
if (this->inference_model_name == "ppTSM" || this->inference_model_name == "TSM")
{
config.EnableTensorRtEngine(
1 << 20, this->rec_batch_num * this->num_seg * 1, 3,
precision,
false, false);
1 << 30, // workspaceSize
this->rec_batch_num * this->num_seg * 1, // maxBatchSize
3, // minSubgraphSize
precision, // precision
false,// useStatic
false //useCalibMode
);
}
else if(this->inference_model_name == "ppTSN" || this->inference_model_name == "TSN")
{
config.EnableTensorRtEngine(
1 << 20, this->rec_batch_num * this->num_seg * 10, 3,
precision,
false, false);
1 << 30,
this->rec_batch_num * this->num_seg * 10,
3, // minSubgraphSize
precision,// precision
false,// useStatic
false //useCalibMode
);
}
else
{
config.EnableTensorRtEngine(
1 << 20, this->rec_batch_num, 3,
precision,
false, false);
1 << 30, // workspaceSize
this->rec_batch_num, // maxBatchSize
3, // minSubgraphSize
precision,// precision
false,// useStatic
false //useCalibMode
);
}
// std::map<std::string, std::vector<int>> min_input_shape =

std::cout << "Enable TensorRT is: " << config.tensorrt_engine_enabled() << std::endl;

/* some models do not support dynamic shape with TRT, so it is deactivated by default */

// std::map<std::string, std::vector<int> > min_input_shape =
// {
// {"data_batch", {1, 1, 1, 1, 1}}
// {"data_batch_0", {1, this->num_seg, 3, 1, 1}}
// };
// std::map<std::string, std::vector<int>> max_input_shape =
// std::map<std::string, std::vector<int> > max_input_shape =
// {
// {"data_batch", {10, this->num_seg, 3, 224, 224}}
// {"data_batch_0", {1, this->num_seg, 3, 256, 256}}
// };
// std::map<std::string, std::vector<int>> opt_input_shape =
// std::map<std::string, std::vector<int> > opt_input_shape =
// {
// {"data_batch", {this->rec_batch_num, this->num_seg, 3, 224, 224}}
// {"data_batch_0", {this->rec_batch_num, this->num_seg, 3, 224, 224}}
// };

// config.SetTRTDynamicShapeInfo(min_input_shape, max_input_shape,
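
For models that do support dynamic shapes with TensorRT, the commented-out block above could be completed roughly as sketched below. This is not part of the PR; the tensor name and shape values are taken directly from those comments.

```cpp
// Sketch only (not part of this PR): dynamic-shape configuration for TRT,
// using the tensor name and shapes from the commented-out block above.
std::map<std::string, std::vector<int>> min_input_shape = {
    {"data_batch_0", {1, this->num_seg, 3, 1, 1}}};
std::map<std::string, std::vector<int>> max_input_shape = {
    {"data_batch_0", {1, this->num_seg, 3, 256, 256}}};
std::map<std::string, std::vector<int>> opt_input_shape = {
    {"data_batch_0", {this->rec_batch_num, this->num_seg, 3, 224, 224}}};
// Registers the per-tensor shape ranges with the Paddle Inference TRT engine.
config.SetTRTDynamicShapeInfo(min_input_shape, max_input_shape, opt_input_shape);
```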