Commit 630004d

update benchmark info
1 parent f639803 commit 630004d

3 files changed, +69 -46 lines changed

docs/module_usage/instructions/benchmark.md

+42 -38
@@ -6,7 +6,7 @@ PaddleX supports measuring model inference time, which must be configured through environment variables,
 * `PADDLE_PDX_INFER_BENCHMARK_WARMUP`: sets the warm-up; before the test starts, the model is run for n iterations on random data. Default: `0`
 * `PADDLE_PDX_INFER_BENCHMARK_DATA_SIZE`: sets the size of the random data. Default: `224`
 * `PADDLE_PDX_INFER_BENCHMARK_ITER`: number of benchmark iterations run on random data; random data is used only when the input data is `None`;
-* `PADDLE_PDX_INFER_BENCHMARK_OUTPUT`: saves this run's benchmark metrics to a `txt` file, e.g. `./benchmark.txt`. Default: `None`, meaning the benchmark metrics are not saved;
+* `PADDLE_PDX_INFER_BENCHMARK_OUTPUT`: sets the directory the metrics are saved to, e.g. `./benchmark`. Default: `None`, meaning the benchmark metrics are not saved;

 A usage example follows:

@@ -15,7 +15,7 @@ PADDLE_PDX_INFER_BENCHMARK=True \
 PADDLE_PDX_INFER_BENCHMARK_WARMUP=5 \
 PADDLE_PDX_INFER_BENCHMARK_DATA_SIZE=320 \
 PADDLE_PDX_INFER_BENCHMARK_ITER=10 \
-PADDLE_PDX_INFER_BENCHMARK_OUTPUT=./benchmark.txt \
+PADDLE_PDX_INFER_BENCHMARK_OUTPUT=./benchmark \
 python main.py \
 -c ./paddlex/configs/object_detection/PicoDet-XS.yaml \
 -o Global.mode=predict \
@@ -26,44 +26,48 @@ python main.py \
 Once Benchmark is enabled, the benchmark metrics are printed automatically:

 ```
-+----------------+-----------------+------+---------------+
-|     Stage      | Total Time (ms) | Nums | Avg Time (ms) |
-+----------------+-----------------+------+---------------+
-|    ReadCmp     |   185.48870087  |  10  |  18.54887009  |
-|     Resize     |   16.95227623   |  30  |   0.56507587  |
-|   Normalize    |   41.12100601   |  30  |   1.37070020  |
-|   ToCHWImage   |    0.05745888   |  30  |   0.00191530  |
-|    Copy2GPU    |   14.58549500   |  10  |   1.45854950  |
-|     Infer      |   100.14462471  |  10  |  10.01446247  |
-|    Copy2CPU    |    9.54508781   |  10  |   0.95450878  |
-| DetPostProcess |    0.56767464   |  30  |   0.01892249  |
-+----------------+-----------------+------+---------------+
-+-------------+-----------------+------+---------------+
-|    Stage    | Total Time (ms) | Nums | Avg Time (ms) |
-+-------------+-----------------+------+---------------+
-|  PreProcess |   243.61944199  |  30  |   8.12064807  |
-|  Inference  |   124.27520752  |  30  |   4.14250692  |
-| PostProcess |    0.56767464   |  30  |   0.01892249  |
-|   End2End   |   379.70948219  |  30  |  12.65698274  |
-|    WarmUp   |  9465.68179131  |  5   | 1893.13635826 |
-+-------------+-----------------+------+---------------+
++----------------+-----------------+-----------------+------------------------+
+|   Component    | Total Time (ms) | Number of Calls | Avg Time Per Call (ms) |
++----------------+-----------------+-----------------+------------------------+
+|    ReadCmp     |   102.39458084  |        10       |      10.23945808       |
+|     Resize     |   11.20400429   |        20       |       0.56020021       |
+|   Normalize    |   34.11078453   |        20       |       1.70553923       |
+|   ToCHWImage   |    0.05555153   |        20       |       0.00277758       |
+|    Copy2GPU    |    9.10568237   |        10       |       0.91056824       |
+|     Infer      |   98.22225571   |        10       |       9.82222557       |
+|    Copy2CPU    |   14.30845261   |        10       |       1.43084526       |
+| DetPostProcess |    0.45251846   |        20       |       0.02262592       |
++----------------+-----------------+-----------------+------------------------+
++-------------+-----------------+---------------------+----------------------------+
+|    Stage    | Total Time (ms) | Number of Instances | Avg Time Per Instance (ms) |
++-------------+-----------------+---------------------+----------------------------+
+|  PreProcess |   147.76492119  |          20         |         7.38824606         |
+|  Inference  |   121.63639069  |          20         |         6.08181953         |
+| PostProcess |    0.45251846   |          20         |         0.02262592         |
+|   End2End   |   294.03519630  |          20         |        14.70175982         |
+|    WarmUp   |  7937.82591820  |          5          |       1587.56518364        |
++-------------+-----------------+---------------------+----------------------------+
 ```

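-The benchmark results report, for every component (`Component`) of the model, the total time (`Total Time`, in milliseconds), the **number of calls** (`Nums`), and the average time **per call** (`Avg Time`, in milliseconds). They also report timing broken down by warm-up (`WarmUp`), preprocessing (`PreProcess`), model inference (`Inference`), postprocessing (`PostProcess`), and end-to-end (`End2End`), including each stage's total time (`Total Time`, in milliseconds), the **number of samples** (`Nums`), and the average time **per sample** (`Avg Time`, in milliseconds). The metrics are also saved to the local file `./benchmark.csv`
+The benchmark results report, for every component (`Component`) of the model, the total time (`Total Time`, in milliseconds), the **number of calls** (`Number of Calls`), and the average time **per call** (`Avg Time Per Call`, in milliseconds). They also report timing broken down by warm-up (`WarmUp`), preprocessing (`PreProcess`), model inference (`Inference`), postprocessing (`PostProcess`), and end-to-end (`End2End`), including each stage's total time (`Total Time`, in milliseconds), the **number of samples** (`Number of Instances`), and the average time **per sample** (`Avg Time Per Instance`, in milliseconds). The metrics above are also saved locally to `./benchmark/detail.csv` and `./benchmark/summary.csv`: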

 ```csv
-Stage,Total Time (ms),Nums,Avg Time (ms)
-ReadCmp,0.18548870086669922,10,0.018548870086669923
-Resize,0.0169522762298584,30,0.0005650758743286133
-Normalize,0.04112100601196289,30,0.001370700200398763
-ToCHWImage,5.745887756347656e-05,30,1.915295918782552e-06
-Copy2GPU,0.014585494995117188,10,0.0014585494995117188
-Infer,0.10014462471008301,10,0.0100144624710083
-Copy2CPU,0.009545087814331055,10,0.0009545087814331055
-DetPostProcess,0.0005676746368408203,30,1.892248789469401e-05
-PreProcess,0.24361944198608398,30,0.0081206480662028
-Inference,0.12427520751953125,30,0.0041425069173177086
-PostProcess,0.0005676746368408203,30,1.892248789469401e-05
-End2End,0.37970948219299316,30,0.012656982739766438
-WarmUp,9.465681791305542,5,1.8931363582611085
+Component,Total Time (ms),Number of Calls,Avg Time Per Call (ms)
+ReadCmp,0.10199093818664551,10,0.01019909381866455
+Resize,0.011309385299682617,20,0.0005654692649841309
+Normalize,0.035140275955200195,20,0.0017570137977600097
+ToCHWImage,4.744529724121094e-05,20,2.3722648620605467e-06
+Copy2GPU,0.00861215591430664,10,0.000861215591430664
+Infer,0.820899248123169,10,0.08208992481231689
+Copy2CPU,0.006002187728881836,10,0.0006002187728881836
+DetPostProcess,0.0004436969757080078,20,2.218484878540039e-05
+```
+
+```csv
+Stage,Total Time (ms),Number of Instances,Avg Time Per Instance (ms)
+PreProcess,0.14848804473876953,20,0.007424402236938477
+Inference,0.8355135917663574,20,0.04177567958831787
+PostProcess,0.0004436969757080078,20,2.218484878540039e-05
+End2End,1.0054960250854492,20,0.05027480125427246
+WarmUp,8.869974851608276,5,1.7739949703216553
 ```
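
The two CSV files named in the updated docs can be read back with the standard library. The following is a minimal sketch and not part of this commit: `load_benchmark_csv` is a hypothetical helper, and `./benchmark` assumes `PADDLE_PDX_INFER_BENCHMARK_OUTPUT=./benchmark` as in the usage example. The sample CSV values look like seconds rather than milliseconds despite the `(ms)` headers (for example, `ReadCmp` is about `0.102` in the CSV and about `102.39` in the printed table), so the sketch takes an explicit `scale` argument instead of assuming a unit.

```python
import csv
from pathlib import Path


def load_benchmark_csv(path, scale=1.0):
    """Read one benchmark CSV into a dict keyed by the first column.

    Hypothetical helper, not part of PaddleX. Each data row is expected to be
    (name, total_time, count, avg_time); `scale` multiplies both time columns
    (e.g. 1000.0 to convert seconds to milliseconds).
    """
    rows = {}
    with open(path, newline="") as file:
        reader = csv.reader(file)
        next(reader)  # skip the header row
        for name, total, count, avg in reader:
            rows[name] = {
                "total": float(total) * scale,
                "count": int(count),
                "avg": float(avg) * scale,
            }
    return rows


if __name__ == "__main__":
    # Assumed value of PADDLE_PDX_INFER_BENCHMARK_OUTPUT.
    save_dir = Path("./benchmark")
    for filename in ("detail.csv", "summary.csv"):
        path = save_dir / filename
        if path.exists():
            for name, row in load_benchmark_csv(path, scale=1000.0).items():
                print(f"{filename}: {name} avg={row['avg']:.4f}")
```

Keeping the unit conversion in the caller avoids baking in an assumption that may change if the CSV writer is later adjusted.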

paddlex/configs/object_detection/PicoDet-S.yaml

+1 -1
@@ -33,7 +33,7 @@ Export:
   weight_path: https://paddledet.bj.bcebos.com/models/picodet_s_320_coco_lcnet.pdparams

 Predict:
-  batch_size: 3
+  batch_size: 1
   model_dir: "output/best_model/inference"
   input: "https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/general_object_detection_002.png"
   kernel_option:

paddlex/inference/utils/benchmark.py

+26 -7
@@ -16,6 +16,7 @@
 import functools
 from types import GeneratorType
 import time
+from pathlib import Path
 import numpy as np
 from prettytable import PrettyTable

@@ -116,8 +117,13 @@ def collect(self, e2e_num):
         self._e2e_elapse = time.time() - self._e2e_tic
         detail, summary = self.gather(e2e_num)

-        table_head = ["Stage", "Total Time (ms)", "Nums", "Avg Time (ms)"]
-        table = PrettyTable(table_head)
+        detail_head = [
+            "Component",
+            "Total Time (ms)",
+            "Number of Calls",
+            "Avg Time Per Call (ms)",
+        ]
+        table = PrettyTable(detail_head)
         table.add_rows(
             [
                 (name, f"{total * 1000:.8f}", cnts, f"{avg * 1000:.8f}")
@@ -126,7 +132,13 @@ def collect(self, e2e_num):
         )
         logging.info(table)

-        table = PrettyTable(table_head)
+        summary_head = [
+            "Stage",
+            "Total Time (ms)",
+            "Number of Instances",
+            "Avg Time Per Instance (ms)",
+        ]
+        table = PrettyTable(summary_head)
         table.add_rows(
             [
                 (name, f"{total * 1000:.8f}", cnts, f"{avg * 1000:.8f}")
@@ -136,10 +148,17 @@ def collect(self, e2e_num):
         logging.info(table)

         if INFER_BENCHMARK_OUTPUT:
-            csv_data = [table_head]
-            csv_data.extend(detail)
-            csv_data.extend(summary)
-            with open("benchmark.csv", "w", newline="") as file:
+            save_dir = Path(INFER_BENCHMARK_OUTPUT)
+            save_dir.mkdir(parents=True, exist_ok=True)
+            csv_data = [detail_head, *detail]
+            # csv_data.extend(detail)
+            with open(Path(save_dir) / "detail.csv", "w", newline="") as file:
+                writer = csv.writer(file)
+                writer.writerows(csv_data)
+
+            csv_data = [summary_head, *summary]
+            # csv_data.extend(summary)
+            with open(Path(save_dir) / "summary.csv", "w", newline="") as file:
                 writer = csv.writer(file)
                 writer.writerows(csv_data)
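
In the hunks above, the printed tables scale `total` and `avg` by 1000, while the CSV rows are written from `detail` and `summary` unscaled, which would explain why the sample CSV values do not match their `(ms)` headers. If the CSV values are meant to be in milliseconds, one option is to apply the same scaling before writing. This is a minimal sketch, assuming each row is a `(name, total_seconds, count, avg_seconds)` tuple as the table formatting implies; `write_benchmark_csv` is a hypothetical helper and not part of PaddleX.

```python
import csv
from pathlib import Path


def write_benchmark_csv(save_dir, filename, head, rows):
    """Write benchmark rows to save_dir/filename, converting seconds to ms.

    Hypothetical helper. `rows` is assumed to hold
    (name, total_seconds, count, avg_seconds) tuples.
    """
    save_dir = Path(save_dir)
    save_dir.mkdir(parents=True, exist_ok=True)
    path = save_dir / filename
    with open(path, "w", newline="") as file:
        writer = csv.writer(file)
        writer.writerow(head)
        for name, total, count, avg in rows:
            # Same formatting the commit uses for the printed tables.
            writer.writerow(
                [name, f"{total * 1000:.8f}", count, f"{avg * 1000:.8f}"]
            )
    return path
```

Under these assumptions, the same helper could be called once with `detail_head` and `detail`, and once with `summary_head` and `summary`.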
