
Commit b0a30a7

[Serving] Add PPCls serving examples (#555)
* add ppcls serving examples
* fix ppcls/serving docs
* fix code style
1 parent 4bbfd97 commit b0a30a7

File tree

13 files changed: +785 −2 lines

Lines changed: 73 additions & 0 deletions
@@ -0,0 +1,73 @@
# PaddleClas Serving Deployment Example

## Launching the Service

```bash
# Download the deployment example code
git clone https://github.com/PaddlePaddle/FastDeploy.git
cd FastDeploy/examples/vision/classification/paddleclas/serving

# Download the ResNet50_vd model files and a test image
wget https://bj.bcebos.com/paddlehub/fastdeploy/ResNet50_vd_infer.tgz
tar -xvf ResNet50_vd_infer.tgz
wget https://gitee.com/paddlepaddle/PaddleClas/raw/release/2.4/deploy/images/ImageNet/ILSVRC2012_val_00000010.jpeg

# Move the config file into the preprocess directory
mv ResNet50_vd_infer/inference_cls.yaml models/preprocess/1/

# Move the model into the models/runtime/1 directory and rename it to model.pdmodel and model.pdiparams
mv ResNet50_vd_infer/inference.pdmodel models/runtime/1/model.pdmodel
mv ResNet50_vd_infer/inference.pdiparams models/runtime/1/model.pdiparams

# Pull the FastDeploy image
# GPU image
docker pull paddlepaddle/fastdeploy:0.3.0-gpu-cuda11.4-trt8.4-21.10
# CPU image
docker pull paddlepaddle/fastdeploy:0.3.0-cpu-only-21.10

# Run the container. The container is named fd_serving, and the current directory is mounted as the container's /serving directory
nvidia-docker run -it --net=host --name fd_serving -v `pwd`/:/serving paddlepaddle/fastdeploy:0.3.0-gpu-cuda11.4-trt8.4-21.10 bash

# Launch the service (if the CUDA_VISIBLE_DEVICES environment variable is not set, the service gets scheduling rights over all GPU cards)
CUDA_VISIBLE_DEVICES=0 fastdeployserver --model-repository=/serving/models --backend-config=python,shm-default-byte-size=10485760
```

>> **Note**:
>>
>> For images targeting other hardware, see the [main serving deployment document](../../../../../serving/README.md).
>>
>> If `fastdeployserver` fails to start with "Address already in use", launch it with `--grpc-port` to specify a different port, and change the request port in the client example accordingly.
>>
>> Other launch options can be listed with `fastdeployserver --help`.

Once the service has started successfully, it prints output like the following:
```
......
I0928 04:51:15.784517 206 grpc_server.cc:4117] Started GRPCInferenceService at 0.0.0.0:8001
I0928 04:51:15.785177 206 http_server.cc:2815] Started HTTPService at 0.0.0.0:8000
I0928 04:51:15.826578 206 http_server.cc:167] Started Metrics Service at 0.0.0.0:8002
```
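Before moving on to the client, it is worth confirming from the host that the service is actually up. A minimal readiness check with the Triton client library (a sketch, assuming the default ports above; `tritonclient` is installed in the client step below):

```python
import tritonclient.grpc as grpcclient

# Connect to the gRPC endpoint started above (default port 8001)
client = grpcclient.InferenceServerClient(url="localhost:8001")
print("server ready:", client.is_server_ready())
print("ensemble ready:", client.is_model_ready("paddlecls"))  # ensemble name from config.pbtxt
```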

## Client Request

Run the following commands on the physical host to send a gRPC request and print the result:
```bash
# Download the test image
wget https://gitee.com/paddlepaddle/PaddleClas/raw/release/2.4/deploy/images/ImageNet/ILSVRC2012_val_00000010.jpeg

# Install the client dependency
python3 -m pip install tritonclient\[all\]

# Send the request
python3 paddlecls_grpc_client.py
```

When the request succeeds, the classification result is returned in JSON format and printed:
```
output_name: CLAS_RESULT
{'label_ids': [153], 'scores': [0.6862289905548096]}
```
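`paddlecls_grpc_client.py` ships with this commit but is not reproduced in this view. As a rough sketch of what such a client looks like — the tensor names `INPUT` and `CLAS_RESULT` and the model name `paddlecls` come from the ensemble config below, and the shipped script may differ in details:

```python
import cv2
import numpy as np
import tritonclient.grpc as grpcclient

# Load the test image as an HWC uint8 array and add a batch dimension
im = cv2.imread("ILSVRC2012_val_00000010.jpeg")
im = np.expand_dims(im, axis=0)  # shape: [1, H, W, 3]

client = grpcclient.InferenceServerClient(url="localhost:8001")

inputs = [grpcclient.InferInput("INPUT", im.shape, "UINT8")]
inputs[0].set_data_from_numpy(im)
outputs = [grpcclient.InferRequestedOutput("CLAS_RESULT")]

result = client.infer(model_name="paddlecls", inputs=inputs, outputs=outputs)
print("output_name: CLAS_RESULT")
print(result.as_numpy("CLAS_RESULT"))
```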

## Modifying the Configuration

The current default configuration runs the TensorRT engine on GPU. To run on CPU or with another inference engine, modify the configuration in `models/runtime/config.pbtxt`; see the [configuration document](../../../../../serving/docs/zh_CN/model_configuration.md) for details.
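For orientation only, the stanza that pins a model to CPU in a Triton-style config looks like the fragment below (the postprocess config in this commit uses exactly this form); which engines can be selected for the runtime model is covered in the configuration document above:

```
instance_group [
  {
    count: 1
    kind: KIND_CPU
  }
]
```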
Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
# PaddleCls Pipeline

The pipeline directory does not have model files, but a version number directory needs to be maintained.
Lines changed: 57 additions & 0 deletions
@@ -0,0 +1,57 @@
name: "paddlecls"
platform: "ensemble"
max_batch_size: 16
input [
  {
    name: "INPUT"
    data_type: TYPE_UINT8
    dims: [ -1, -1, 3 ]
  }
]
output [
  {
    name: "CLAS_RESULT"
    data_type: TYPE_STRING
    dims: [ -1 ]
  }
]
ensemble_scheduling {
  step [
    {
      model_name: "preprocess"
      model_version: 1
      input_map {
        key: "preprocess_input"
        value: "INPUT"
      }
      output_map {
        key: "preprocess_output"
        value: "RUNTIME_INPUT"
      }
    },
    {
      model_name: "runtime"
      model_version: 1
      input_map {
        key: "inputs"
        value: "RUNTIME_INPUT"
      }
      output_map {
        key: "save_infer_model/scale_0.tmp_1"
        value: "RUNTIME_OUTPUT"
      }
    },
    {
      model_name: "postprocess"
      model_version: 1
      input_map {
        key: "post_input"
        value: "RUNTIME_OUTPUT"
      }
      output_map {
        key: "post_output"
        value: "CLAS_RESULT"
      }
    }
  ]
}
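The ensemble chains the three sub-models: `preprocess` turns the raw HWC uint8 image into the runtime's input tensor, `runtime` produces the 1000-dimensional classifier output, and `postprocess` serializes it into the JSON string returned as `CLAS_RESULT`. To double-check the tensor names the deployed ensemble actually exposes, a metadata query works (a sketch, assuming the server from the README is running):

```python
import tritonclient.grpc as grpcclient

client = grpcclient.InferenceServerClient(url="localhost:8001")
# Prints the ensemble's input/output names, data types, and shapes
print(client.get_model_metadata("paddlecls"))
```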
Lines changed: 108 additions & 0 deletions
@@ -0,0 +1,108 @@
# Copyright (c) 2022 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

import json
import numpy as np

import fastdeploy as fd

# triton_python_backend_utils is available in every Triton Python model. You
# need to use this module to create inference requests and responses. It also
# contains some utility functions for extracting information from model_config
# and converting Triton input/output types to numpy types.
import triton_python_backend_utils as pb_utils


class TritonPythonModel:
    """Your Python model must use the same class name. Every Python model
    that is created must have "TritonPythonModel" as the class name.
    """

    def initialize(self, args):
        """`initialize` is called only once when the model is being loaded.
        Implementing `initialize` function is optional. This function allows
        the model to initialize any state associated with this model.

        Parameters
        ----------
        args : dict
          Both keys and values are strings. The dictionary keys and values are:
          * model_config: A JSON string containing the model configuration
          * model_instance_kind: A string containing model instance kind
          * model_instance_device_id: A string containing model instance device ID
          * model_repository: Model repository path
          * model_version: Model version
          * model_name: Model name
        """
        # You must parse model_config. JSON string is not parsed here
        self.model_config = json.loads(args['model_config'])
        print("model_config:", self.model_config)

        self.input_names = []
        for input_config in self.model_config["input"]:
            self.input_names.append(input_config["name"])
        print("postprocess input names:", self.input_names)

        self.output_names = []
        self.output_dtype = []
        for output_config in self.model_config["output"]:
            self.output_names.append(output_config["name"])
            dtype = pb_utils.triton_string_to_numpy(output_config["data_type"])
            self.output_dtype.append(dtype)
        print("postprocess output names:", self.output_names)

        self.postprocess_ = fd.vision.classification.PaddleClasPostprocessor()

    def execute(self, requests):
        """`execute` must be implemented in every Python model. `execute`
        function receives a list of pb_utils.InferenceRequest as the only
        argument. This function is called when an inference is requested
        for this model. Depending on the batching configuration (e.g. Dynamic
        Batching) used, `requests` may contain multiple requests. Every
        Python model must create one pb_utils.InferenceResponse for every
        pb_utils.InferenceRequest in `requests`. If there is an error, you can
        set the error argument when creating a pb_utils.InferenceResponse.

        Parameters
        ----------
        requests : list
          A list of pb_utils.InferenceRequest

        Returns
        -------
        list
          A list of pb_utils.InferenceResponse. The length of this list must
          be the same as `requests`
        """
        responses = []
        for request in requests:
            infer_outputs = pb_utils.get_input_tensor_by_name(
                request, self.input_names[0])
            infer_outputs = infer_outputs.as_numpy()

            results = self.postprocess_.run([infer_outputs, ])
            r_str = fd.vision.utils.fd_result_to_json(results)

            # np.object_ is the numpy dtype backing Triton's TYPE_STRING
            # tensors (the bare np.object alias is deprecated in NumPy >= 1.20)
            r_np = np.array(r_str, dtype=np.object_)
            out_tensor = pb_utils.Tensor(self.output_names[0], r_np)
            inference_response = pb_utils.InferenceResponse(
                output_tensors=[out_tensor, ])
            responses.append(inference_response)
        return responses

    def finalize(self):
        """`finalize` is called only once when the model is being unloaded.
        Implementing `finalize` function is optional. This function allows
        the model to perform any necessary clean ups before exit.
        """
        print('Cleaning up...')
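The commit also ships a `models/preprocess/1/model.py` that is not reproduced in this view. As a rough idea of the shape such a preprocess model takes — the tensor names follow the ensemble config above, the transform parameters mirror `inference_cls.yaml` below, and the actual file may instead delegate to a FastDeploy preprocessor class rather than inlining the transforms:

```python
import json

import cv2
import numpy as np
import triton_python_backend_utils as pb_utils


class TritonPythonModel:
    def initialize(self, args):
        config = json.loads(args["model_config"])
        self.input_names = [i["name"] for i in config["input"]]
        self.output_names = [o["name"] for o in config["output"]]

    def _transform(self, im):
        # ResizeImage: scale the short side to 256
        h, w = im.shape[:2]
        scale = 256.0 / min(h, w)
        im = cv2.resize(im, (int(round(w * scale)), int(round(h * scale))))
        # CropImage: center crop to 224 x 224
        h, w = im.shape[:2]
        top, left = (h - 224) // 2, (w - 224) // 2
        im = im[top:top + 224, left:left + 224]
        # NormalizeImage: scale by 1/255, then standardize per channel
        # (the real pipeline also handles BGR->RGB ordering as configured)
        im = im.astype(np.float32) * 0.00392157
        im = (im - np.array([0.485, 0.456, 0.406])) / np.array([0.229, 0.224, 0.225])
        # ToCHWImage: HWC -> CHW
        return im.transpose((2, 0, 1)).astype(np.float32)

    def execute(self, requests):
        responses = []
        for request in requests:
            batch = pb_utils.get_input_tensor_by_name(
                request, self.input_names[0]).as_numpy()
            out = np.stack([self._transform(im) for im in batch])
            tensor = pb_utils.Tensor(self.output_names[0], out)
            responses.append(
                pb_utils.InferenceResponse(output_tensors=[tensor]))
        return responses
```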
Lines changed: 26 additions & 0 deletions
@@ -0,0 +1,26 @@
name: "postprocess"
backend: "python"
max_batch_size: 16

input [
  {
    name: "post_input"
    data_type: TYPE_FP32
    dims: [ 1000 ]
  }
]

output [
  {
    name: "post_output"
    data_type: TYPE_STRING
    dims: [ -1 ]
  }
]

instance_group [
  {
    count: 1
    kind: KIND_CPU
  }
]
Lines changed: 35 additions & 0 deletions
@@ -0,0 +1,35 @@
Global:
  infer_imgs: "./images/ImageNet/ILSVRC2012_val_00000010.jpeg"
  inference_model_dir: "./models"
  batch_size: 1
  use_gpu: True
  enable_mkldnn: True
  cpu_num_threads: 10
  enable_benchmark: True
  use_fp16: False
  ir_optim: True
  use_tensorrt: False
  gpu_mem: 8000
  enable_profile: False

PreProcess:
  transform_ops:
    - ResizeImage:
        resize_short: 256
    - CropImage:
        size: 224
    - NormalizeImage:
        scale: 0.00392157
        mean: [0.485, 0.456, 0.406]
        std: [0.229, 0.224, 0.225]
        order: ''
        channel_num: 3
    - ToCHWImage:

PostProcess:
  main_indicator: Topk
  Topk:
    topk: 5
    class_id_map_file: "../ppcls/utils/imagenet1k_label_list.txt"
  SavePreLabel:
    save_dir: ./pre_label/
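The NormalizeImage step first multiplies by `scale` 0.00392157 (that is, 1/255, mapping uint8 pixels into [0, 1]) and then standardizes each channel with the ImageNet mean and std. For a single pixel the arithmetic looks like this (the pixel value is made up for illustration):

```python
import numpy as np

pixel = np.array([124, 117, 104], dtype=np.float32)  # a hypothetical RGB pixel
scaled = pixel * 0.00392157                          # scale = 1/255
mean = np.array([0.485, 0.456, 0.406])
std = np.array([0.229, 0.224, 0.225])
print((scaled - mean) / std)                         # per-channel standardization
```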
