Commit 261cd9d

add MiniGPT4 inference

1 parent 24f048c commit 261cd9d

File tree

5 files changed: +961 −13 lines changed

5 files changed

+961
-13
lines changed
Lines changed: 102 additions & 0 deletions
@@ -0,0 +1,102 @@
# Accelerated MiniGPT4 Inference

This project provides accelerated inference for MiniGPT4. The basic approach is to convert the MiniGPT4 dynamic graph into a static graph and then run it through the Paddle Inference library.

The figure below shows the overall MiniGPT4 architecture. MiniGPT4 is mainly composed of a ViT, a Q-Former, and a Vicuna model. Vicuna is trained from Llama, and the implementation actually calls the Llama code, so to keep the description simple the language-model part is referred to as Llama below.

In this solution, MiniGPT4 is exported as two static sub-graphs: the ViT and Q-Former parts form one sub-graph, and the Llama part forms the other. The two sub-graphs are then combined to run MiniGPT4 inference.

<center><img src="https://github.com/PaddlePaddle/Paddle/assets/35913314/f0306cb6-4837-4f52-8f57-a0e7e35238f6" /></center>
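The two-sub-graph flow described above can be sketched with plain-Python stubs. This is purely illustrative and is not PaddleMIX code; the two stub "predictors" stand in for the exported static graphs, which in real inference would be Paddle Inference predictors.

```python
# Illustrative sketch of the two-sub-graph pipeline (NOT the real PaddleMIX code).
# Each stub predictor stands in for one exported static graph.

def image_predictor(image):
    """Stub for the ViT + Q-Former sub-graph: image -> visual features."""
    return {"image_features": [0.0] * 32}  # placeholder feature vector

def llama_predictor(image_features, prompt):
    """Stub for the Llama sub-graph: visual features + prompt -> text."""
    return f"answer conditioned on {len(image_features)} features: {prompt}"

def minigpt4_infer(image, prompt):
    # Stage 1: encode the image; Stage 2: generate text from features + prompt.
    feats = image_predictor(image)["image_features"]
    return llama_predictor(feats, prompt)

print(minigpt4_infer("mugs.png", "describe the image"))
```

In the real pipeline, stage 1 corresponds to the sub-graph exported in section 2.1 and stage 2 to the one exported in section 2.2.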
## 1. Environment Setup

### 1.1 Base environment

This project has been verified with the following environment:

- CUDA: 11.7
- Python: 3.11
- Paddle: develop build

CUDA must be >= 11.2. Click [here](https://www.paddlepaddle.org.cn/install/quick?docurl=/documentation/docs/zh/develop/install/pip/linux-pip.html) to download the Paddle build that matches your setup.
### 1.2 Install the project libraries

1. This project depends on the PaddleMIX and PaddleNLP libraries; clone the latest develop branches:

```shell
git clone https://github.com/PaddlePaddle/PaddleNLP.git
git clone https://github.com/PaddlePaddle/PaddleMIX.git
```
2. Install paddlenlp_ops:

```shell
cd PaddleNLP/csrc
python setup_cuda.py install
```
3. Finally, set the corresponding environment variable so Python can find both libraries (replace the two paths with the locations of your own PaddleNLP and PaddleMIX checkouts):

```shell
export PYTHONPATH=/wangqinghui/PaddleNLP:/wangqinghui/PaddleMIX
```
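A quick sanity check for the variable is to verify that every `PYTHONPATH` entry actually exists on disk. This is a small stdlib helper, not part of the project:

```python
import os

def check_pythonpath(value=None):
    """Return the PYTHONPATH entries that do not exist as directories."""
    value = value if value is not None else os.environ.get("PYTHONPATH", "")
    entries = [p for p in value.split(os.pathsep) if p]
    return [p for p in entries if not os.path.isdir(p)]

missing = check_pythonpath()
if missing:
    print("missing PYTHONPATH entries:", missing)
else:
    print("all PYTHONPATH entries exist")
```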
### 1.3 Special notes

At present, parts of the PaddleNLP and Paddle code need to be patched before MiniGPT4 inference acceleration works. These fixes will gradually be merged into PaddleNLP and Paddle, but for now they have to be applied manually.

1. Patch PaddleNLP:

Following [this branch](https://github.com/1649759610/PaddleNLP/tree/bugfix_minigpt4), replace each of the following files:

- PaddleNLP/paddlenlp/experimental/transformers/generation_utils.py
- PaddleNLP/paddlenlp/experimental/transformers/llama/modeling.py
- PaddleNLP/llm/export_model.py

2. Patch Paddle

Go to the Paddle installation directory, open paddle/static/io.py, and comment out the following code at lines 284-287:

```python
if not skip_prune_program:
    copy_program = copy_program._prune_with_input(
        feeded_var_names=feed_var_names, targets=fetch_vars
    )
```
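To find the io.py file to patch, you can locate the installed paddle package programmatically. This is a small stdlib helper written here as a convenience (it assumes the standard package layout and returns None when paddle is not installed):

```python
import importlib.util
import os

def locate_paddle_static_io():
    """Return the path to paddle/static/io.py in the current install, or None."""
    spec = importlib.util.find_spec("paddle")
    if spec is None or not spec.submodule_search_locations:
        return None
    pkg_dir = list(spec.submodule_search_locations)[0]
    return os.path.join(pkg_dir, "static", "io.py")

path = locate_paddle_static_io()
print(path if path else "paddle is not installed")
```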
## 2. Exporting MiniGPT4 in Two Stages

### 2.1 Export the first sub-graph

Make sure you are in the PaddleMIX/paddlemix/examples/minigpt4/inference directory, then export with:

```shell
python export_image_encoder.py \
    --minigpt4_13b_path "your minigpt4 dir path" \
    --save_path "./checkpoints/encode_image/encode_image"
```
### 2.2 Export the second sub-graph

Go to the PaddleNLP/llm directory and export with:

```shell
python export_model.py \
    --model_name_or_path "your llama dir path" \
    --output_path "your output path" \
    --dtype float16 \
    --inference_model \
    --model_prefix llama \
    --model_type llama-img2txt
```

**Note**: Exporting the Llama part currently has to be done manually under PaddleNLP; one-click export from PaddleMIX will be supported later.
## 3. MiniGPT4 Static-Graph Inference

Go to the PaddleMIX/paddlemix/examples/minigpt4/inference directory and run:

```shell
python run_static_predict.py \
    --first_model_path "The dir name of image encoder model" \
    --second_model_path "The dir name of language model" \
    --minigpt4_path "The minigpt4 dir name of saving tokenizer"
```

The following shows the output of MiniGPT4 static-graph inference for this image:

<center><img src="https://paddlenlp.bj.bcebos.com/data/images/mugs.png" /></center>

```text
Reference: The image shows two black and white cats sitting next to each other on a blue background. The cats have black fur and white fur with black noses, eyes, and paws. They are both looking at the camera with a curious expression. The mugs are also blue with the same design of the cats on them. There is a small white flower on the left side of the mug. The background is a light blue color.

Outputs: ['The image shows two black and white cats sitting next to each other on a blue background. The cats have black fur and white fur with black noses, eyes, and paws. They are both looking at the camera with a curious expression. The mugs are also blue with the same design of the cats on them. There is a small white flower on the left side of the mug. The background is a light blue color.##']
```
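Note the trailing `##` in the generated string: the raw generation ends with a stop marker. A small post-processing helper (shown here as a generic sketch, not code from the repo) can strip it before display:

```python
def clean_output(text, stop_token="##"):
    """Strip a trailing stop marker from generated text."""
    text = text.strip()
    if text.endswith(stop_token):
        text = text[: -len(stop_token)].rstrip()
    return text

raw = "The background is a light blue color.##"
print(clean_output(raw))  # -> The background is a light blue color.
```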
Lines changed: 66 additions & 0 deletions
@@ -0,0 +1,66 @@
1+
import argparse
2+
import os
3+
os.environ["CUDA_VISIBLE_DEVICES"]="7"
4+
os.environ["FLAGS_use_cuda_managed_memory"]="true"
5+
6+
import paddle
7+
from paddlemix import MiniGPT4ForConditionalGeneration
8+
9+
10+
def export(args):
11+
model = MiniGPT4ForConditionalGeneration.from_pretrained(args.minigpt4_13b_path, vit_dtype="float16")
12+
model.eval()
13+
14+
# convert to static graph with specific input description
15+
model = paddle.jit.to_static(
16+
model.encode_images,
17+
input_spec=[
18+
paddle.static.InputSpec(
19+
shape=[None, 3, None, None], dtype="float32"), # images
20+
])
21+
22+
# save to static model
23+
paddle.jit.save(model, args.save_path)
24+
print(f"static model has been to {args.save_path}")
25+
26+
27+
if __name__ == "__main__":
28+
parser = argparse.ArgumentParser()
29+
parser.add_argument(
30+
"--minigpt4_13b_path",
31+
default="your minigpt4 dir path",
32+
type=str,
33+
help="The dir name of minigpt4 checkpoint.",
34+
)
35+
parser.add_argument(
36+
"--save_path",
37+
default="./checkpoints/encode_image/encode_image",
38+
type=str,
39+
help="The saving path of static minigpt4.",
40+
)
41+
args = parser.parse_args()
42+
43+
export(args)
44+
45+
46+
47+
48+
49+
50+
51+
52+
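The `InputSpec` above accepts any batch size and spatial size as long as the input is a 4-D tensor with 3 channels. A tiny stdlib check (illustrative only, not PaddleMIX code) makes the rule explicit:

```python
def matches_image_spec(shape, spec=(None, 3, None, None)):
    """True if a concrete tensor shape satisfies the [None, 3, None, None] InputSpec.

    None in the spec means "any size" for that dimension.
    """
    return len(shape) == len(spec) and all(
        expected is None or expected == actual for expected, actual in zip(spec, shape)
    )

print(matches_image_spec((1, 3, 224, 224)))  # -> True
print(matches_image_spec((8, 3, 448, 336)))  # -> True
print(matches_image_spec((1, 4, 224, 224)))  # -> False (wrong channel count)
```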
