
Commit f78516e

First order motion (PaddlePaddle#293)
update fom docs
1 parent b2ede95 commit f78516e

File tree

6 files changed: +98 -8 lines changed

docs/en_US/tutorials/motion_driving.md

Lines changed: 50 additions & 3 deletions
@@ -9,22 +9,23 @@
</div>

## Multi-Faces swapping

For photos with multiple faces, we first detect all of the faces, then perform facial expression transfer for each face, and finally paste those faces back into the original photo to generate a complete new video.

Specific technical steps are shown below:

1. Use the S3FD model to detect the faces in a photo
2. Use the First Order Motion model to perform the facial expression transfer for each face
3. Paste those newly generated faces back into the original photo

At the same time, specifically for face-related work, PaddleGAN provides a ["faceutils" tool](https://github.com/PaddlePaddle/PaddleGAN/tree/develop/ppgan/faceutils), including face detection, face segmentation models and more.
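The three steps above can be pictured with the minimal sketch below. It illustrates only the control flow: `detect_faces` and `animate_face` are hypothetical placeholder callables standing in for the S3FD detector and the First Order Motion model, not actual PaddleGAN APIs.

```
# Illustrative sketch only -- detect_faces and animate_face are
# hypothetical placeholders, not PaddleGAN functions.
import cv2

def animate_photo(photo, driving_frames, detect_faces, animate_face):
    """Animate every face in `photo` with the motion of `driving_frames`."""
    boxes = detect_faces(photo)  # step 1: detect all faces (S3FD-style)
    result_frames = []
    for driving_frame in driving_frames:
        canvas = photo.copy()
        for (x1, y1, x2, y2) in boxes:
            crop = photo[y1:y2, x1:x2]
            # step 2: facial expression transfer for this face
            new_face = animate_face(crop, driving_frame)
            # step 3: paste the generated face back into the original photo
            canvas[y1:y2, x1:x2] = cv2.resize(new_face, (x2 - x1, y2 - y1))
        result_frames.append(canvas)
    return result_frames
```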
## How to use

### 1 Test for Face

Users can upload a prepared source image and driving video, then substitute the paths of the source image and driving video for the `source_image` and `driving_video` parameters in the following running command. It will generate a video file named `result.mp4` in the `output` folder, which is the animated video file.

Note: for photos with multiple faces, the greater the distance between the faces, the better the resulting quality.

- single face:
```
cd applications/
python -u tools/first-order-demo.py \
@@ -33,13 +34,59 @@ python -u tools/first-order-demo.py \
    --ratio 0.4 \
    --relative --adapt_scale
```
- multi face:
```
cd applications/
python -u tools/first-order-demo.py \
    --driving_video ../docs/imgs/fom_dv.mp4 \
    --source_image ../docs/imgs/fom_source_image_multi_person.png \
    --ratio 0.4 \
    --relative --adapt_scale \
    --multi_person
```

**params:**
- driving_video: the driving video; its motion will be migrated to the source image.
- source_image: the source image, supporting both a single person and multiple people in the image; the people will be animated according to the motion of the driving video.
- relative: indicates whether the relative or absolute coordinates of the keypoints in the video are used in the program. Relative coordinates are recommended; if absolute coordinates are used, the characters will be distorted after animation.
- adapt_scale: adapt the movement scale based on the convex hull of the keypoints.
- ratio: the pasted face percentage of the generated image; adjust this parameter for multi-person images in which adjacent faces are close (see the sketch after this list). The default value is 0.4 and the range is [0.4, 0.5].
- multi_person: indicates that there are multiple faces in the image. By default, a single face is assumed.
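As a rough intuition for `ratio`, the hypothetical sketch below computes a paste-back box in which the detected face fills roughly the given fraction of the region. This is one plausible reading of "pasted face percentage", stated here as an assumption; the actual logic lives in ppgan/apps/first_order_predictor.py.

```
# Hypothetical illustration of the ratio parameter -- not the actual
# PaddleGAN implementation. A larger ratio yields a tighter paste-back
# box, which helps when adjacent faces sit close together.
def paste_box(face_box, ratio=0.4):
    x1, y1, x2, y2 = face_box
    cx, cy = (x1 + x2) / 2.0, (y1 + y2) / 2.0
    side = max(x2 - x1, y2 - y1) / ratio  # face fills `ratio` of the box
    half = side / 2.0
    return (int(cx - half), int(cy - half), int(cx + half), int(cy + half))

print(paste_box((100, 100, 160, 160), ratio=0.4))  # (55, 55, 205, 205)
print(paste_box((100, 100, 160, 160), ratio=0.5))  # (70, 70, 190, 190)
```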
### 2 Training

**Datasets:**
- fashion: see [here](https://vision.cs.ubc.ca/datasets/fashion/)
- VoxCeleb: see [here](https://github.com/AliaksandrSiarohin/video-preprocessing)

**params:**
- dataset_name.yaml: create a config file for your own dataset.
- For a single GPU:
```
export CUDA_VISIBLE_DEVICES=0
python tools/main.py --config-file configs/dataset_name.yaml
```
- For multiple GPUs:
```
export CUDA_VISIBLE_DEVICES=0,1,2,3
python -m paddle.distributed.launch \
    tools/main.py \
    --config-file configs/dataset_name.yaml
```
**Example:**
- For a single GPU:
```
export CUDA_VISIBLE_DEVICES=0
python tools/main.py --config-file configs/firstorder_fashion.yaml
```
- For multiple GPUs:
```
export CUDA_VISIBLE_DEVICES=0,1,2,3
python -m paddle.distributed.launch \
    tools/main.py \
    --config-file configs/firstorder_fashion.yaml
```
**Online Tutorial running in AI Studio:**
docs/en_US/tutorials/wav2lip.md

Lines changed: 0 additions & 2 deletions
@@ -43,7 +43,6 @@ python tools/main.py --config-file configs/wav2lip.yaml
```
export CUDA_VISIBLE_DEVICES=0,1,2,3
python -m paddle.distributed.launch \
-    --log_dir ./mylog_dd.log \
    tools/main.py \
    --config-file configs/wav2lip.yaml \
@@ -58,7 +57,6 @@ python tools/main.py --config-file configs/wav2lip_hq.yaml
```
export CUDA_VISIBLE_DEVICES=0,1,2,3
python -m paddle.distributed.launch \
-    --log_dir ./mylog_dd.log \
    tools/main.py \
    --config-file configs/wav2lip_hq.yaml \
Binary image file changed (149 KB); preview not shown.

docs/zh_CN/tutorials/motion_driving.md

Lines changed: 47 additions & 1 deletion
@@ -25,13 +25,14 @@ The task of the First Order Motion model is image animation: given a source image,
Meanwhile, for face-related processing, PaddleGAN provides the [faceutils tool](https://github.com/PaddlePaddle/PaddleGAN/tree/develop/ppgan/faceutils), with capabilities including face detection, facial feature segmentation, and keypoint detection.

## How to use

### 1 Face test

Users can upload a single-person or multi-person photo together with a driving video, substitute their own image and video paths for the source_image and driving_video parameters in the following command, and then run it to complete the single/multi-face motion and expression transfer. The result is a video file named result.mp4, saved in the output folder.

Note: when using multiple faces, photos with larger distances between the faces work better; the result can also be tuned by manually adjusting the ratio parameter.

This project provides a source image and a driving video for demonstration. The running command is as follows:

- single face (the default):
```
cd applications/
python -u tools/first-order-demo.py \
@@ -40,12 +41,57 @@ python -u tools/first-order-demo.py \
    --ratio 0.4 \
    --relative --adapt_scale
```
- multi face:
```
cd applications/
python -u tools/first-order-demo.py \
    --driving_video ../docs/imgs/fom_dv.mp4 \
    --source_image ../docs/imgs/fom_source_image_multi_person.jpg \
    --ratio 0.4 \
    --relative --adapt_scale \
    --multi_person
```
- driving_video: the driving video; the expressions and motions of the person in the video are what will be transferred.
- source_image: the source image, supporting both single-person and multi-person images; the expressions and motions in the driving video will be transferred onto the person(s) in this image.
- relative: indicates whether the program uses relative or absolute coordinates for the keypoints in the video and image. Relative coordinates are recommended; absolute coordinates can distort and deform the person after transfer.
- adapt_scale: adapts the movement scale based on the convex hull of the keypoints.
- ratio: the proportion of the original image occupied by the pasted-back generated face region. Users should adjust this parameter according to the generated result, especially when multiple faces are close together. The default is 0.4 and the adjustable range is [0.4, 0.5].
- multi_person: indicates that the image contains multiple faces; without this flag, a single face is assumed.
### 2 Training

**Datasets:**
- fashion: see [here](https://vision.cs.ubc.ca/datasets/fashion/)
- VoxCeleb: see [here](https://github.com/AliaksandrSiarohin/video-preprocessing)

**Parameters:**
- dataset_name.yaml: configure your own yaml file and parameters.
- single-GPU training:
```
export CUDA_VISIBLE_DEVICES=0
python tools/main.py --config-file configs/dataset_name.yaml
```
- multi-GPU training:
```
export CUDA_VISIBLE_DEVICES=0,1,2,3
python -m paddle.distributed.launch \
    tools/main.py \
    --config-file configs/dataset_name.yaml
```
**For example:**
- single-GPU training:
```
export CUDA_VISIBLE_DEVICES=0
python tools/main.py --config-file configs/firstorder_fashion.yaml
```
- multi-GPU training:
```
export CUDA_VISIBLE_DEVICES=0,1,2,3
python -m paddle.distributed.launch \
    tools/main.py \
    --config-file configs/firstorder_fashion.yaml
```
**Online demo project**
docs/zh_CN/tutorials/wav2lip.md

Lines changed: 0 additions & 2 deletions
@@ -45,7 +45,6 @@ python tools/main.py --config-file configs/wav2lip.yaml
```
export CUDA_VISIBLE_DEVICES=0,1,2,3
python -m paddle.distributed.launch \
-    --log_dir ./mylog_dd.log \
    tools/main.py \
    --config-file configs/wav2lip.yaml \
@@ -60,7 +59,6 @@ python tools/main.py --config-file configs/wav2lip_hq.yaml
```
export CUDA_VISIBLE_DEVICES=0,1,2,3
python -m paddle.distributed.launch \
-    --log_dir ./mylog_dd.log \
    tools/main.py \
    --config-file configs/wav2lip_hq.yaml \

ppgan/apps/first_order_predictor.py

Lines changed: 1 addition & 0 deletions
@@ -146,6 +146,7 @@ def get_prediction(face_image):
            for im in reader:
                driving_video.append(im)
        except RuntimeError:
+           print("Read driving video error!")
            pass
        reader.close()
