From 28c5e02a932f60103183767b8ea59aa701aa3e3f Mon Sep 17 00:00:00 2001
From: BrilliantYuKaimin <91609464+BrilliantYuKaimin@users.noreply.github.com>
Date: Wed, 2 Mar 2022 10:10:02 +0800
Subject: [PATCH 1/2] Update README.md

---
 configs/xcit/README.md | 33 ++++++++++++++++++---------------
 1 file changed, 18 insertions(+), 15 deletions(-)

diff --git a/configs/xcit/README.md b/configs/xcit/README.md
index 393befed..277e5719 100755
--- a/configs/xcit/README.md
+++ b/configs/xcit/README.md
@@ -8,6 +8,8 @@ Following tremendous success in natural language processing, transformers have r

 ## Getting Started

+An [AI Studio](https://aistudio.baidu.com/aistudio/index) project about XCiT has been published, and you can click [here](https://aistudio.baidu.com/aistudio/projectdetail/3449604) to open the project and run commands of training and evaluation directly.
+
 #### Train with single gpu
 ```bash
 python tools/train.py -c configs/xcit/${XCIT_ARCH}.yaml
@@ -25,7 +27,7 @@ python tools/train.py -c configs/xcit/${XCIT_ARCH}.yaml --load ${XCIT_WEGHT_FILE

 #### Knowledge distillation

-For knowledge distillation, you only need to replace `${XCIT_ARCH}.yaml` to corresponding distillation config file, `${XCIT_ARCH}_dist.yaml`, at above commands. We provide pretrained weights of Teacher model `RegNetY_160`, which can be downloaded [here](https://passl.bj.bcebos.com/vision_transformers/pvt_v2/regnety_160.pdparams).
+For knowledge distillation, you only need to replace `${XCIT_ARCH}.yaml` to corresponding distillation config file, `${XCIT_ARCH}_dist.yaml`, at above commands. We provide pretrained weights of Teacher model `RegNetY_160`, which can be downloaded [here](https://passl.bj.bcebos.com/vision_transformers/xcit/regnety_160.pdparams).

 Checkpoints saved in distillation training include both Teacher's and Student's weights. You can extract the weights of Student by following command.
 ```bash
@@ -38,20 +40,21 @@ python tools/extract_weight.py ${DISTILLATION_WEIGHTS_FILE} --prefix Student --r
 The results are evaluated on ImageNet2012 validation set
 | Arch | Weight | Top-1 Acc | Top-5 Acc | Crop ratio | # Params |
 | ------------------ | ------------------------------------------------------------ | --------- | --------- | ---------- | -------- |
-| xcit_nano_12_p8_224 | [pretrain 1k](https://passl.bj.bcebos.com/vision_transformers/pvt_v2/xcit_nano_12_p8_224.pdparams) | 73.90 | 92.13 | 1.0 | 3.05M |
-| xcit_tiny_12_p8_224 | [pretrain 1k](https://passl.bj.bcebos.com/vision_transformers/pvt_v2/xcit_tiny_12_p8_224.pdparams) | 79.68 | 95.04 | 1.0 | 6.71M |
-| xcit_tiny_24_p8_224 | [pretrain 1k](https://passl.bj.bcebos.com/vision_transformers/pvt_v2/xcit_tiny_24_p8_224.pdparams) | 81.87 | 95.97 | 1.0 | 12.11M |
-| xcit_small_12_p8_224 | [pretrain 1k](https://passl.bj.bcebos.com/vision_transformers/pvt_v2/xcit_small_12_p8_224.pdparams) | 83.36 | 96.51 | 1.0 | 26.21M |
-| xcit_small_24_p8_224 | [pretrain 1k](https://passl.bj.bcebos.com/vision_transformers/pvt_v2/xcit_small_24_p8_224.pdparams) | 83.82 | 96.65 | 1.0 | 47.63M |
-| xcit_medium_24_p8_224 | [pretrain 1k](https://passl.bj.bcebos.com/vision_transformers/pvt_v2/xcit_medium_24_p8_224.pdparams ) | 83.73 | 96.39 | 1.0 | 84.32M |
-| xcit_large_24_p8_224 | [pretrain 1k](https://passl.bj.bcebos.com/vision_transformers/pvt_v2/xcit_large_24_p8_224.pdparams) | 84.42 | 96.65 | 1.0 | 188.93M |
-| xcit_nano_12_p16_224 | [pretrain 1k](https://passl.bj.bcebos.com/vision_transformers/pvt_v2/xcit_nano_12_p16_224.pdparams) | 70.01 | 89.82 | 1.0 | 3.05M |
-| xcit_tiny_12_p16_224 | [pretrain 1k](https://passl.bj.bcebos.com/vision_transformers/pvt_v2/xcit_tiny_12_p16_224.pdparams) | 77.15 | 93.72 | 1.0 | 6.72M |
-| xcit_tiny_24_p16_224 | [pretrain 1k](https://passl.bj.bcebos.com/vision_transformers/pvt_v2/xcit_tiny_24_p16_224.pdparams) | 79.42 | 94.86 | 1.0 | 12.12M |
-| xcit_small_12_p16_224 | [pretrain 1k](https://passl.bj.bcebos.com/vision_transformers/pvt_v2/xcit_small_12_p16_224.pdparams) | 81.89 | 95.83 | 1.0 | 26.25M |
-| xcit_small_24_p16_224 | [pretrain 1k](https://passl.bj.bcebos.com/vision_transformers/pvt_v2/xcit_small_24_p16_224.pdparams) | 82.51 | 95.97 | 1.0 | 47.67M |
-| xcit_medium_24_p16_224 | [pretrain 1k](https://passl.bj.bcebos.com/vision_transformers/pvt_v2/xcit_medium_24_p16_224.pdparams) | 82.67 | 95.91 | 1.0 | 84.40M |
-| xcit_large_24_p16_224 | [pretrain 1k](https://passl.bj.bcebos.com/vision_transformers/pvt_v2/xcit_large_24_p16_224.pdparams) | 82.89 | 95.89 | 1.0 | 189.10M |
+| xcit_nano_12_p8_224 | [pretrain 1k](https://passl.bj.bcebos.com/vision_transformers/xcit/xcit_nano_12_p8_224.pdparams) | 73.90 | 92.13 | 1.0 | 3.05M |
+| xcit_nano_12_p8_224_dist | [pretrain 1k](https://passl.bj.bcebos.com/vision_transformers/xcit/xcit_nano_12_p8_224_dist.pdparams) | 77.28 | 93.25 | 1.0 | 3.05M |
+| xcit_tiny_12_p8_224 | [pretrain 1k](https://passl.bj.bcebos.com/vision_transformers/xcit/xcit_tiny_12_p8_224.pdparams) | 79.68 | 95.04 | 1.0 | 6.71M |
+| xcit_tiny_24_p8_224 | [pretrain 1k](https://passl.bj.bcebos.com/vision_transformers/xcit/xcit_tiny_24_p8_224.pdparams) | 81.87 | 95.97 | 1.0 | 12.11M |
+| xcit_small_12_p8_224 | [pretrain 1k](https://passl.bj.bcebos.com/vision_transformers/xcit/xcit_small_12_p8_224.pdparams) | 83.36 | 96.51 | 1.0 | 26.21M |
+| xcit_small_24_p8_224 | [pretrain 1k](https://passl.bj.bcebos.com/vision_transformers/xcit/xcit_small_24_p8_224.pdparams) | 83.82 | 96.65 | 1.0 | 47.63M |
+| xcit_medium_24_p8_224 | [pretrain 1k](https://passl.bj.bcebos.com/vision_transformers/xcit/xcit_medium_24_p8_224.pdparams ) | 83.73 | 96.39 | 1.0 | 84.32M |
+| xcit_large_24_p8_224 | [pretrain 1k](https://passl.bj.bcebos.com/vision_transformers/xcit/xcit_large_24_p8_224.pdparams) | 84.42 | 96.65 | 1.0 | 188.93M |
+| xcit_nano_12_p16_224 | [pretrain 1k](https://passl.bj.bcebos.com/vision_transformers/xcit/xcit_nano_12_p16_224.pdparams) | 70.01 | 89.82 | 1.0 | 3.05M |
+| xcit_tiny_12_p16_224 | [pretrain 1k](https://passl.bj.bcebos.com/vision_transformers/xcit/xcit_tiny_12_p16_224.pdparams) | 77.15 | 93.72 | 1.0 | 6.72M |
+| xcit_tiny_24_p16_224 | [pretrain 1k](https://passl.bj.bcebos.com/vision_transformers/xcit/xcit_tiny_24_p16_224.pdparams) | 79.42 | 94.86 | 1.0 | 12.12M |
+| xcit_small_12_p16_224 | [pretrain 1k](https://passl.bj.bcebos.com/vision_transformers/xcit/xcit_small_12_p16_224.pdparams) | 81.89 | 95.83 | 1.0 | 26.25M |
+| xcit_small_24_p16_224 | [pretrain 1k](https://passl.bj.bcebos.com/vision_transformers/xcit/xcit_small_24_p16_224.pdparams) | 82.51 | 95.97 | 1.0 | 47.67M |
+| xcit_medium_24_p16_224 | [pretrain 1k](https://passl.bj.bcebos.com/vision_transformers/xcit/xcit_medium_24_p16_224.pdparams) | 82.67 | 95.91 | 1.0 | 84.40M |
+| xcit_large_24_p16_224 | [pretrain 1k](https://passl.bj.bcebos.com/vision_transformers/xcit/xcit_large_24_p16_224.pdparams) | 82.89 | 95.89 | 1.0 | 189.10M |

 ## Usage

From 0a2f86b7ed58ac2dd2e88bb0bbcd5dea0d631f29 Mon Sep 17 00:00:00 2001
From: BrilliantYuKaimin <91609464+BrilliantYuKaimin@users.noreply.github.com>
Date: Wed, 2 Mar 2022 10:10:15 +0800
Subject: [PATCH 2/2] Update Classification_Models_Guide.md

---
 docs/Classification_Models_Guide.md | 29 +++++++++++++++--------------
 1 file changed, 15 insertions(+), 14 deletions(-)

diff --git a/docs/Classification_Models_Guide.md b/docs/Classification_Models_Guide.md
index 68cca752..da04f650 100644
--- a/docs/Classification_Models_Guide.md
+++ b/docs/Classification_Models_Guide.md
@@ -42,20 +42,21 @@ PASSL provides developers with a number of implementations of Transformer classi
 | beit_large_p16_512 | [ft 22k to 1k](https://passl.bj.bcebos.com/vision_transformers/beit/beit_large_p16_512_ft.pdparams) | 88.60 | 98.66 | 1.0 | 304M |
 | mlp_mixer_b16_224 | [pretrain 1k](https://passl.bj.bcebos.com/vision_transformers/mlp_mixer/mlp-mixer_b16_224.pdparams) | 76.60 | 92.23 | 0.875 | 60.0M |
 | mlp_mixer_l16_224 | [pretrain 1k](https://passl.bj.bcebos.com/vision_transformers/mlp_mixer/mlp-mixer_l16_224.pdparams) | 72.06 | 87.67 | 0.875 | 208.2M |
-| xcit_nano_12_p8_224 | [pretrain 1k](https://passl.bj.bcebos.com/vision_transformers/pvt_v2/xcit_nano_12_p8_224.pdparams) | 73.90 | 92.13 | 1.0 | 3.05M |
-| xcit_tiny_12_p8_224 | [pretrain 1k](https://passl.bj.bcebos.com/vision_transformers/pvt_v2/xcit_tiny_12_p8_224.pdparams) | 79.68 | 95.04 | 1.0 | 6.71M |
-| xcit_tiny_24_p8_224 | [pretrain 1k](https://passl.bj.bcebos.com/vision_transformers/pvt_v2/xcit_tiny_24_p8_224.pdparams) | 81.87 | 95.97 | 1.0 | 12.11M |
-| xcit_small_12_p8_224 | [pretrain 1k](https://passl.bj.bcebos.com/vision_transformers/pvt_v2/xcit_small_12_p8_224.pdparams) | 83.36 | 96.51 | 1.0 | 26.21M |
-| xcit_small_24_p8_224 | [pretrain 1k](https://passl.bj.bcebos.com/vision_transformers/pvt_v2/xcit_small_24_p8_224.pdparams) | 83.82 | 96.65 | 1.0 | 47.63M |
-| xcit_medium_24_p8_224 | [pretrain 1k](https://passl.bj.bcebos.com/vision_transformers/pvt_v2/xcit_medium_24_p8_224.pdparams ) | 83.73 | 96.39 | 1.0 | 84.32M |
-| xcit_large_24_p8_224 | [pretrain 1k](https://passl.bj.bcebos.com/vision_transformers/pvt_v2/xcit_large_24_p8_224.pdparams) | 84.42 | 96.65 | 1.0 | 188.93M |
-| xcit_nano_12_p16_224 | [pretrain 1k](https://passl.bj.bcebos.com/vision_transformers/pvt_v2/xcit_nano_12_p16_224.pdparams) | 70.01 | 89.82 | 1.0 | 3.05M |
-| xcit_tiny_12_p16_224 | [pretrain 1k](https://passl.bj.bcebos.com/vision_transformers/pvt_v2/xcit_tiny_12_p16_224.pdparams) | 77.15 | 93.72 | 1.0 | 6.72M |
-| xcit_tiny_24_p16_224 | [pretrain 1k](https://passl.bj.bcebos.com/vision_transformers/pvt_v2/xcit_tiny_24_p16_224.pdparams) | 79.42 | 94.86 | 1.0 | 12.12M |
-| xcit_small_12_p16_224 | [pretrain 1k](https://passl.bj.bcebos.com/vision_transformers/pvt_v2/xcit_small_12_p16_224.pdparams) | 81.89 | 95.83 | 1.0 | 26.25M |
-| xcit_small_24_p16_224 | [pretrain 1k](https://passl.bj.bcebos.com/vision_transformers/pvt_v2/xcit_small_24_p16_224.pdparams) | 82.51 | 95.97 | 1.0 | 47.67M |
-| xcit_medium_24_p16_224 | [pretrain 1k](https://passl.bj.bcebos.com/vision_transformers/pvt_v2/xcit_medium_24_p16_224.pdparams) | 82.67 | 95.91 | 1.0 | 84.40M |
-| xcit_large_24_p16_224 | [pretrain 1k](https://passl.bj.bcebos.com/vision_transformers/pvt_v2/xcit_large_24_p16_224.pdparams) | 82.89 | 95.89 | 1.0 | 189.10M |
+| xcit_nano_12_p8_224 | [pretrain 1k](https://passl.bj.bcebos.com/vision_transformers/xcit/xcit_nano_12_p8_224.pdparams) | 73.90 | 92.13 | 1.0 | 3.05M |
+| xcit_nano_12_p8_224_dist | [pretrain 1k](https://passl.bj.bcebos.com/vision_transformers/xcit/xcit_nano_12_p8_224_dist.pdparams) | 77.28 | 93.25 | 1.0 | 3.05M |
+| xcit_tiny_12_p8_224 | [pretrain 1k](https://passl.bj.bcebos.com/vision_transformers/xcit/xcit_tiny_12_p8_224.pdparams) | 79.68 | 95.04 | 1.0 | 6.71M |
+| xcit_tiny_24_p8_224 | [pretrain 1k](https://passl.bj.bcebos.com/vision_transformers/xcit/xcit_tiny_24_p8_224.pdparams) | 81.87 | 95.97 | 1.0 | 12.11M |
+| xcit_small_12_p8_224 | [pretrain 1k](https://passl.bj.bcebos.com/vision_transformers/xcit/xcit_small_12_p8_224.pdparams) | 83.36 | 96.51 | 1.0 | 26.21M |
+| xcit_small_24_p8_224 | [pretrain 1k](https://passl.bj.bcebos.com/vision_transformers/xcit/xcit_small_24_p8_224.pdparams) | 83.82 | 96.65 | 1.0 | 47.63M |
+| xcit_medium_24_p8_224 | [pretrain 1k](https://passl.bj.bcebos.com/vision_transformers/xcit/xcit_medium_24_p8_224.pdparams ) | 83.73 | 96.39 | 1.0 | 84.32M |
+| xcit_large_24_p8_224 | [pretrain 1k](https://passl.bj.bcebos.com/vision_transformers/xcit/xcit_large_24_p8_224.pdparams) | 84.42 | 96.65 | 1.0 | 188.93M |
+| xcit_nano_12_p16_224 | [pretrain 1k](https://passl.bj.bcebos.com/vision_transformers/xcit/xcit_nano_12_p16_224.pdparams) | 70.01 | 89.82 | 1.0 | 3.05M |
+| xcit_tiny_12_p16_224 | [pretrain 1k](https://passl.bj.bcebos.com/vision_transformers/xcit/xcit_tiny_12_p16_224.pdparams) | 77.15 | 93.72 | 1.0 | 6.72M |
+| xcit_tiny_24_p16_224 | [pretrain 1k](https://passl.bj.bcebos.com/vision_transformers/xcit/xcit_tiny_24_p16_224.pdparams) | 79.42 | 94.86 | 1.0 | 12.12M |
+| xcit_small_12_p16_224 | [pretrain 1k](https://passl.bj.bcebos.com/vision_transformers/xcit/xcit_small_12_p16_224.pdparams) | 81.89 | 95.83 | 1.0 | 26.25M |
+| xcit_small_24_p16_224 | [pretrain 1k](https://passl.bj.bcebos.com/vision_transformers/xcit/xcit_small_24_p16_224.pdparams) | 82.51 | 95.97 | 1.0 | 47.67M |
+| xcit_medium_24_p16_224 | [pretrain 1k](https://passl.bj.bcebos.com/vision_transformers/xcit/xcit_medium_24_p16_224.pdparams) | 82.67 | 95.91 | 1.0 | 84.40M |
+| xcit_large_24_p16_224 | [pretrain 1k](https://passl.bj.bcebos.com/vision_transformers/xcit/xcit_large_24_p16_224.pdparams) | 82.89 | 95.89 | 1.0 | 189.10M |

 The above metrics were tested on the ImageNet 2012 dataset.