
Commit f78516e

First order motion (PaddlePaddle#293)
update fom docs
1 parent b2ede95 commit f78516e

File tree

6 files changed: +98 -8 lines changed

docs/en_US/tutorials/motion_driving.md

Lines changed: 50 additions & 3 deletions
@@ -9,22 +9,23 @@
</div>

## Multi-Faces swapping

For photos with multiple faces, we first detect all of the faces, then perform facial expression transfer for each face, and finally paste those faces back into the original photo to generate a complete new video.

Specific technical steps are shown below:

1. Use the S3FD model to detect the faces in a photo
2. Use the First Order Motion model to perform the facial expression transfer for each face
3. Paste those newly generated faces back into the original photo

At the same time, specifically for face-related work, PaddleGAN provides a ["faceutils" tool](https://github.com/PaddlePaddle/PaddleGAN/tree/develop/ppgan/faceutils), including face detection, face segmentation models and more.
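The three steps above can be pictured with the minimal sketch below. It illustrates only the control flow: `detect_faces` and `animate_face` are hypothetical placeholder callables standing in for the S3FD detector and the First Order Motion model, not actual PaddleGAN APIs.

```
# Illustrative sketch only -- detect_faces and animate_face are
# hypothetical placeholders, not PaddleGAN functions.
import cv2

def animate_photo(photo, driving_frames, detect_faces, animate_face):
    """Animate every face in `photo` with the motion of `driving_frames`."""
    boxes = detect_faces(photo)  # step 1: detect all faces (S3FD-style)
    result_frames = []
    for driving_frame in driving_frames:
        canvas = photo.copy()
        for (x1, y1, x2, y2) in boxes:
            crop = photo[y1:y2, x1:x2]
            # step 2: facial expression transfer for this face
            new_face = animate_face(crop, driving_frame)
            # step 3: paste the generated face back into the original photo
            canvas[y1:y2, x1:x2] = cv2.resize(new_face, (x2 - x1, y2 - y1))
        result_frames.append(canvas)
    return result_frames
```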
## How to use

### 1 Test for Face

Users can upload a prepared source image and driving video, then substitute the paths of the source image and driving video for the `source_image` and `driving_video` parameters in the following running command. It will generate a video file named `result.mp4` in the `output` folder, which is the animated video file.

Note: for photos with multiple faces, the greater the distance between the faces, the better the resulting quality.

- single face:
```
cd applications/
python -u tools/first-order-demo.py \
@@ -33,13 +34,59 @@ python -u tools/first-order-demo.py \
    --ratio 0.4 \
    --relative --adapt_scale
```
- multi face:
```
cd applications/
python -u tools/first-order-demo.py \
    --driving_video ../docs/imgs/fom_dv.mp4 \
    --source_image ../docs/imgs/fom_source_image_multi_person.png \
    --ratio 0.4 \
    --relative --adapt_scale \
    --multi_person
```

**params:**
- driving_video: the driving video; its motion will be migrated to the source image.
- source_image: the source image, supporting both a single person and multiple people in the image; the people will be animated according to the motion of the driving video.
- relative: indicates whether the relative or absolute coordinates of the keypoints in the video are used in the program. Relative coordinates are recommended; if absolute coordinates are used, the characters will be distorted after animation.
- adapt_scale: adapt the movement scale based on the convex hull of the keypoints.
- ratio: the pasted face percentage of the generated image; adjust this parameter for multi-person images in which adjacent faces are close (see the sketch after this list). The default value is 0.4 and the range is [0.4, 0.5].
- multi_person: indicates that there are multiple faces in the image. By default, a single face is assumed.
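As a rough intuition for `ratio`, the hypothetical sketch below computes a paste-back box in which the detected face fills roughly the given fraction of the region. This is one plausible reading of "pasted face percentage", stated here as an assumption; the actual logic lives in ppgan/apps/first_order_predictor.py.

```
# Hypothetical illustration of the ratio parameter -- not the actual
# PaddleGAN implementation. A larger ratio yields a tighter paste-back
# box, which helps when adjacent faces sit close together.
def paste_box(face_box, ratio=0.4):
    x1, y1, x2, y2 = face_box
    cx, cy = (x1 + x2) / 2.0, (y1 + y2) / 2.0
    side = max(x2 - x1, y2 - y1) / ratio  # face fills `ratio` of the box
    half = side / 2.0
    return (int(cx - half), int(cy - half), int(cx + half), int(cy + half))

print(paste_box((100, 100, 160, 160), ratio=0.4))  # (55, 55, 205, 205)
print(paste_box((100, 100, 160, 160), ratio=0.5))  # (70, 70, 190, 190)
```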
### 2 Training

**Datasets:**
- fashion: see [here](https://vision.cs.ubc.ca/datasets/fashion/)
- VoxCeleb: see [here](https://github.com/AliaksandrSiarohin/video-preprocessing)

**params:**
- dataset_name.yaml: create a config file for your own dataset.
- For a single GPU:
```
export CUDA_VISIBLE_DEVICES=0
python tools/main.py --config-file configs/dataset_name.yaml
```
- For multiple GPUs:
```
export CUDA_VISIBLE_DEVICES=0,1,2,3
python -m paddle.distributed.launch \
    tools/main.py \
    --config-file configs/dataset_name.yaml
```
**Example:**
- For a single GPU:
```
export CUDA_VISIBLE_DEVICES=0
python tools/main.py --config-file configs/firstorder_fashion.yaml
```
- For multiple GPUs:
```
export CUDA_VISIBLE_DEVICES=0,1,2,3
python -m paddle.distributed.launch \
    tools/main.py \
    --config-file configs/firstorder_fashion.yaml
```
**Online Tutorial running in AI Studio:**
docs/en_US/tutorials/wav2lip.md

Lines changed: 0 additions & 2 deletions
@@ -43,7 +43,6 @@ python tools/main.py --config-file configs/wav2lip.yaml
```
export CUDA_VISIBLE_DEVICES=0,1,2,3
python -m paddle.distributed.launch \
-    --log_dir ./mylog_dd.log \
    tools/main.py \
    --config-file configs/wav2lip.yaml \
@@ -58,7 +57,6 @@ python tools/main.py --config-file configs/wav2lip_hq.yaml
```
export CUDA_VISIBLE_DEVICES=0,1,2,3
python -m paddle.distributed.launch \
-    --log_dir ./mylog_dd.log \
    tools/main.py \
    --config-file configs/wav2lip_hq.yaml \
Binary image file changed (149 KB); preview not shown.

docs/zh_CN/tutorials/motion_driving.md

Lines changed: 47 additions & 1 deletion
@@ -25,13 +25,14 @@ The task of the First Order Motion model is image animation: given a source image,
Meanwhile, for face-related processing, PaddleGAN provides the [faceutils tool](https://github.com/PaddlePaddle/PaddleGAN/tree/develop/ppgan/faceutils), with capabilities including face detection, facial feature segmentation, and keypoint detection.

## How to use

### 1 Face test

Users can upload a single-person or multi-person photo together with a driving video, substitute their own image and video paths for the source_image and driving_video parameters in the following command, and then run it to complete the single/multi-face motion and expression transfer. The result is a video file named result.mp4, saved in the output folder.

Note: when using multiple faces, photos with larger distances between the faces work better; the result can also be tuned by manually adjusting the ratio parameter.

This project provides a source image and a driving video for demonstration. The running command is as follows:

- single face (the default):
```
cd applications/
python -u tools/first-order-demo.py \
@@ -40,12 +41,57 @@ python -u tools/first-order-demo.py \
    --ratio 0.4 \
    --relative --adapt_scale
```
- multi face:
```
cd applications/
python -u tools/first-order-demo.py \
    --driving_video ../docs/imgs/fom_dv.mp4 \
    --source_image ../docs/imgs/fom_source_image_multi_person.jpg \
    --ratio 0.4 \
    --relative --adapt_scale \
    --multi_person
```
- driving_video: the driving video; the expressions and motions of the person in the video are what will be transferred.
- source_image: the source image, supporting both single-person and multi-person images; the expressions and motions in the driving video will be transferred onto the person(s) in this image.
- relative: indicates whether the program uses relative or absolute coordinates for the keypoints in the video and image. Relative coordinates are recommended; absolute coordinates can distort and deform the person after transfer.
- adapt_scale: adapts the movement scale based on the convex hull of the keypoints.
- ratio: the proportion of the original image occupied by the pasted-back generated face region. Users should adjust this parameter according to the generated result, especially when multiple faces are close together. The default is 0.4 and the adjustable range is [0.4, 0.5].
- multi_person: indicates that the image contains multiple faces; without this flag, a single face is assumed.
### 2 Training

**Datasets:**
- fashion: see [here](https://vision.cs.ubc.ca/datasets/fashion/)
- VoxCeleb: see [here](https://github.com/AliaksandrSiarohin/video-preprocessing)

**Parameters:**
- dataset_name.yaml: configure your own yaml file and parameters.
- single-GPU training:
```
export CUDA_VISIBLE_DEVICES=0
python tools/main.py --config-file configs/dataset_name.yaml
```
- multi-GPU training:
```
export CUDA_VISIBLE_DEVICES=0,1,2,3
python -m paddle.distributed.launch \
    tools/main.py \
    --config-file configs/dataset_name.yaml
```
**For example:**
- single-GPU training:
```
export CUDA_VISIBLE_DEVICES=0
python tools/main.py --config-file configs/firstorder_fashion.yaml
```
- multi-GPU training:
```
export CUDA_VISIBLE_DEVICES=0,1,2,3
python -m paddle.distributed.launch \
    tools/main.py \
    --config-file configs/firstorder_fashion.yaml
```
**Online demo project**
docs/zh_CN/tutorials/wav2lip.md

Lines changed: 0 additions & 2 deletions
@@ -45,7 +45,6 @@ python tools/main.py --config-file configs/wav2lip.yaml
```
export CUDA_VISIBLE_DEVICES=0,1,2,3
python -m paddle.distributed.launch \
-    --log_dir ./mylog_dd.log \
    tools/main.py \
    --config-file configs/wav2lip.yaml \
@@ -60,7 +59,6 @@ python tools/main.py --config-file configs/wav2lip_hq.yaml
```
export CUDA_VISIBLE_DEVICES=0,1,2,3
python -m paddle.distributed.launch \
-    --log_dir ./mylog_dd.log \
    tools/main.py \
    --config-file configs/wav2lip_hq.yaml \

ppgan/apps/first_order_predictor.py

Lines changed: 1 addition & 0 deletions
@@ -146,6 +146,7 @@ def get_prediction(face_image):
            for im in reader:
                driving_video.append(im)
        except RuntimeError:
+           print("Read driving video error!")
            pass
        reader.close()
