| Model |
| ---------------------------------|
| deepseek-ai/deepseek-vl2-small |
- | deepseek-ai/deepseek-vl2 |

## Environment Setup
[Install PaddlePaddle](https://github.com/PaddlePaddle/PaddleMIX?tab=readme-ov-file#3-%EF%B8%8F%E5%AE%89%E8%A3%85paddlepaddle)
- **python >= 3.10**
- **paddlepaddle-gpu: the develop version is required**
``` bash
- # Example develop-version install; make sure the Paddle version in use is the develop build
+ # Example develop-version install
python -m pip install --pre paddlepaddle-gpu -i https://www.paddlepaddle.org.cn/packages/nightly/cu123/
```
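
The `python >= 3.10` requirement listed above can be checked before installing anything. The following is a minimal sketch, not part of PaddleMIX: the helper name `version_ok` and the `PYTHON_BIN` variable are hypothetical, and the interpreter is assumed to be reachable as `python` unless overridden.

```bash
# Hypothetical helper (not from the PaddleMIX repo): succeeds when a
# "major.minor[.patch]" string satisfies the README's python >= 3.10 rule.
version_ok() {
    vo_major="${1%%.*}"        # text before the first dot
    vo_rest="${1#*.}"          # text after the first dot
    vo_minor="${vo_rest%%.*}"  # minor component, patch level dropped
    [ "$vo_major" -gt 3 ] || { [ "$vo_major" -eq 3 ] && [ "$vo_minor" -ge 10 ]; }
}

# Assumption: the interpreter is called "python"; set PYTHON_BIN to override.
PYTHON_BIN="${PYTHON_BIN:-python}"
if command -v "$PYTHON_BIN" >/dev/null 2>&1; then
    current="$("$PYTHON_BIN" -c 'import sys; print("%d.%d" % sys.version_info[:2])')"
    if version_ok "$current"; then
        echo "Python $current satisfies the >= 3.10 requirement"
    else
        echo "Python $current is older than 3.10" >&2
    fi
else
    echo "$PYTHON_BIN was not found on PATH" >&2
fi
```

Comparing the major and minor components as integers avoids the string-comparison trap in which `3.9` sorts after `3.10`.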

2) [Install the PaddleMIX dependencies](https://github.com/PaddlePaddle/PaddleMIX?tab=readme-ov-file#3-%EF%B8%8F%E5%AE%89%E8%A3%85paddlepaddle)
``` bash
# Example pip install: paddlemix, ppdiffusers, and the project dependencies
- python -m pip install -e .
+ python -m pip install -e .
python -m pip install -e ppdiffusers
python -m pip install -r requirements.txt

# Install PaddleNLP
- pip uninstall paddlenlp && rm -rf PaddleNLP
- git clone https://github.com/PaddlePaddle/PaddleNLP.git
+ pip uninstall -y paddlenlp && rm -rf PaddleNLP
+ git clone -b release/3.0-beta4-new --depth=1 https://github.com/PaddlePaddle/PaddleNLP.git
cd PaddleNLP
pip install -e .
- cd csrc
- python setup_cuda.py install
+ pip install https://paddlenlp.bj.bcebos.com/ops/cu118/paddlenlp_ops-3.0.0b4-py3-none-any.whl
```

## 3 High-Performance Inference
@@ -39,56 +37,72 @@ python setup_cuda.py install

```
export CUDA_VISIBLE_DEVICES=0
- export FLAGS_cascade_attention_max_partition_size=163840
- export FLAGS_mla_use_tensorcore=1
+ export FLAGS_mla_use_tensorcore=0
+ export FLAGS_cascade_attention_max_partition_size=128
+ export FLAGS_cascade_attention_deal_each_time=16
python deploy/deepseek_vl2/deepseek_vl2_infer.py \
--model_name_or_path deepseek-ai/deepseek-vl2-small \
--question "Describe this image." \
--image_file paddlemix/demo_images/examples_image1.jpg \
--min_length 128 \
--max_length 128 \
+ --inference_model True \
+ --append_attn True \
+ --mode dynamic \
+ --dtype bfloat16 \
--top_k 1 \
--top_p 0.001 \
--temperature 0.1 \
--repetition_penalty 1.05 \
- --block_attn True \
+ --benchmark
+
+ # Multi-image inference
+ python deploy/deepseek_vl2/deepseek_vl2_infer_multi_image.py \
+ --model_name_or_path deepseek-ai/deepseek-vl2-small \
+ --question "What are in these images." \
+ --image_file_1 paddlemix/demo_images/examples_image1.jpg \
+ --image_file_2 paddlemix/demo_images/examples_image2.jpg \
+ --image_file_3 paddlemix/demo_images/examples_image1.jpg \
+ --min_length 128 \
+ --max_length 128 \
--inference_model True \
--append_attn True \
--mode dynamic \
--dtype bfloat16 \
- --mla_use_matrix_absorption
+ --top_k 1 \
+ --top_p 0.001 \
+ --temperature 0.1 \
+ --repetition_penalty 1.05 \
+ --benchmark
```

### b. wint8 High-Performance Inference
```
export CUDA_VISIBLE_DEVICES=0
- export FLAGS_cascade_attention_max_partition_size=163840
- export FLAGS_mla_use_tensorcore=1
+ export FLAGS_mla_use_tensorcore=0
+ export FLAGS_cascade_attention_max_partition_size=128
+ export FLAGS_cascade_attention_deal_each_time=16
python deploy/deepseek_vl2/deepseek_vl2_infer.py \
--model_name_or_path deepseek-ai/deepseek-vl2-small \
--question "Describe this image." \
--image_file paddlemix/demo_images/examples_image1.jpg \
--min_length 128 \
--max_length 128 \
- --top_k 1 \
- --top_p 0.001 \
- --temperature 0.1 \
- --repetition_penalty 1.05 \
- --block_attn True \
--inference_model True \
--append_attn True \
--mode dynamic \
--dtype bfloat16 \
- --mla_use_matrix_absorption \
- --quant_type "weight_only_int8"
+ --top_k 1 \
+ --top_p 0.001 \
+ --temperature 0.1 \
+ --repetition_penalty 1.05 \
+ --quant_type "weight_only_int8" \
+ --benchmark True
```
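
In their updated form, the single-image command in the previous block and the wint8 command share every generation flag; apart from how `--benchmark` is written, the wint8 run only adds `--quant_type "weight_only_int8"`. The sketch below makes that difference explicit with a dry-run helper that prints the command instead of running it. The function name `build_infer_cmd` is hypothetical and not part of PaddleMIX; every flag and value is copied from the commands above, and `--benchmark` is left out because the two commands spell it differently.

```bash
# Hypothetical dry-run helper (not from the PaddleMIX repo): prints the
# inference command for an optional quantization type without executing it.
build_infer_cmd() {
    quant_type="${1:-}"   # empty: unquantized run; e.g. weight_only_int8 for wint8
    cmd="python deploy/deepseek_vl2/deepseek_vl2_infer.py"
    cmd="$cmd --model_name_or_path deepseek-ai/deepseek-vl2-small"
    cmd="$cmd --question \"Describe this image.\""
    cmd="$cmd --image_file paddlemix/demo_images/examples_image1.jpg"
    cmd="$cmd --min_length 128 --max_length 128"
    cmd="$cmd --inference_model True --append_attn True"
    cmd="$cmd --mode dynamic --dtype bfloat16"
    cmd="$cmd --top_k 1 --top_p 0.001 --temperature 0.1 --repetition_penalty 1.05"
    # Only the wint8 variant carries a --quant_type flag.
    if [ -n "$quant_type" ]; then
        cmd="$cmd --quant_type $quant_type"
    fi
    printf '%s\n' "$cmd"
}

build_infer_cmd ""                 # unquantized variant
build_infer_cmd weight_only_int8   # wint8 variant
```

Printing rather than executing keeps the sketch safe to run on a machine without a GPU or the model weights; the environment variables exported in the blocks above would still need to be set before a real run.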

## 4 One-Click Inference & Inference Notes
- Run from the PaddleMIX directory
- ``` bash
cd PaddleMIX
sh deploy/deepseek_vl2/shell/run.sh
- ```
#### Parameter Settings
| parameter | Value |
| ------------------ | -------------- |
@@ -102,10 +116,3 @@ sh deploy/deepseek_vl2/shell/run.sh
| ------------------ | -------------- |
| min_length | 128 |
| min_length | 128 |
-
- Speed measurements for a single image:
-
- | model | Paddle high-performance inference | Paddle |
- | ------------------------------ | ---------------------| ------------- |
- | deepseek-ai/deepseek-vl2-small | 9.3 s | 12.8 s |
- | deepseek-ai/deepseek-vl2 | - | 17.2 s |