Commit fab8a30

offline deploy qwen-vl llava (#1246)
1 parent 21aae87 commit fab8a30

11 files changed, +7 -1242 lines changed

deploy/README.md

Lines changed: 2 additions & 6 deletions
@@ -49,7 +49,6 @@ Python-side inference deployment mainly consists of two steps:
 - [blip2](./blip2/README.md)
 - [groundingdino](./groundingdino/README.md)
 - [sam](./sam/README.md)
-- [qwen_vl](./qwen_vl/README.md)
 
 Using groundingdino as an example.
 
@@ -77,10 +76,10 @@ python export.py \
 
 ## 3. Inference BenchMark
 
-> Note:
+> Note:
 > Test environment:
 Paddle 3.0,
-PaddleMIX release/2.0
+PaddleMIX release/2.0
 PaddleNLP2.7.2
 Single A100 80G GPU.
 
@@ -103,8 +102,5 @@ Single A100 80G GPU.
 # A100 performance data
 |Model|Image resolution|Data type|Paddle Deploy|
 |-|-|-|-|
-|qwen-vl-7b|448*448|fp16|669.8 ms|
-|llava-1.5-7b|336*336|fp16|981.2 ms|
-|llava-1.6-7b|336*336|fp16|778.7 ms|
 |groundingDino/groundingdino-swint-ogc|800*1193|fp32|100 ms|
 |Sam/SamVitH-1024|1024*1024|fp32|121 ms|

deploy/README_en.md

Lines changed: 5 additions & 9 deletions
@@ -4,10 +4,10 @@
 
 PaddleMIX utilizes Paddle Inference and provides a Python-based deployment solution. There are two deployment methods:
 
-1. **APPflow Deployment**:
+1. **APPflow Deployment**:
 - By setting the `static_mode = True` variable in APPflow, you can enable static graph inference. Additionally, you can accelerate inference using TensorRT. Note that not all models support static graph or TensorRT. Please refer to the [Multi Modal And Scenario](../applications/README_en.md/#multi-modal-and-scenario) section for specific model support.
 
-2. **Single Model Deployment**:
+2. **Single Model Deployment**:
 
 For APPflow usage, you can set the `static_mode = True` variable to enable static graph inference and optionally accelerate inference using TensorRT.
 
@@ -48,7 +48,6 @@ Currently supported models:
 - [blip2](./blip2/README.md)
 - [groundingdino](./groundingdino/README.md)
 - [sam](./sam/README.md)
-- [qwen_vl](./qwen_vl/README.md)
 
 Using groundingdino as an example.
 
@@ -76,10 +75,10 @@ Will be exported to the following directory, including `model_state.pdiparams`,
 
 ## 3. BenchMark
 
-> Note:
+> Note:
 > environment
 Paddle 3.0
-PaddleMIX release/2.0
+PaddleMIX release/2.0
 PaddleNLP 2.7.2
 A100 80G。
 
@@ -101,8 +100,5 @@ example: GroundingDino benchmark:
 
 |Model|image size|dtype |Paddle Deploy |
 |-|-|-|-|
-|qwen-vl-7b|448*448|fp16|669.8 ms|
-|llava-1.5-7b|336*336|fp16|981.2 ms|
-|llava-1.6-7b|336*336|fp16|778.7 ms|
 |groundingDino/groundingdino-swint-ogc|800*1193|fp32|100 ms|
-|Sam/SamVitH-1024|1024*1024|fp32|121 ms|
+|Sam/SamVitH-1024|1024*1024|fp32|121 ms|
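
Reviewer note: the benchmark table in the hunk above reports per-inference latency in milliseconds, but the diff does not show how those numbers were collected. As a rough illustration only, the sketch below times a callable with a few warm-up runs and returns per-call latencies in ms. The names `benchmark_ms` and `run_inference` are hypothetical and are not part of PaddleMIX; `run_inference` is a CPU stand-in for a real model call. For GPU inference, one would also need to synchronize the device before reading the clock, which this sketch does not do.

```python
import statistics
import time
from typing import Callable, List


def benchmark_ms(fn: Callable[[], object], warmup: int = 3, runs: int = 10) -> List[float]:
    """Return the wall-clock latency of each timed call to fn, in milliseconds."""
    if runs < 1:
        raise ValueError("runs must be at least 1")
    # Warm-up calls are executed but not timed, so one-off setup costs
    # (caches, lazy initialization) do not skew the measurements.
    for _ in range(warmup):
        fn()
    latencies = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        latencies.append((time.perf_counter() - start) * 1000.0)
    return latencies


def run_inference() -> int:
    # Hypothetical stand-in workload; replace with a real predictor call.
    return sum(i * i for i in range(10_000))


if __name__ == "__main__":
    lat = benchmark_ms(run_inference)
    print(f"mean {statistics.mean(lat):.1f} ms over {len(lat)} runs")
```

Reporting the mean over several timed runs after warm-up is one common convention; whether the table's figures follow it is not stated in this commit.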

deploy/llava/README.md

Lines changed: 0 additions & 83 deletions
This file was deleted.

deploy/llava/export_model.py

Lines changed: 0 additions & 104 deletions
This file was deleted.

deploy/llava/llama_inference_model.py

Lines changed: 0 additions & 127 deletions
This file was deleted.
