
Transformer Pre-trained weight #1456

Open
hshen14 opened this issue Nov 15, 2018 · 5 comments

hshen14 commented Nov 15, 2018

I would like to reproduce the BLEU score for the Transformer model. May I know whether there are pre-trained weights available for sharing? Thanks. @Superjomn @panyx0718 @luotao1

https://github.com/PaddlePaddle/models/blob/develop/fluid/PaddleNLP/neural_machine_translation/transformer/README_cn.md

luotao1 added the intel label Nov 15, 2018
@panyx0718 (Contributor)

@guoshengCS @kuke

@guoshengCS (Collaborator)

We have just uploaded the pre-trained weights and the data used for the Transformer base model. Please download the data from http://transformer-model-data.bj.bcebos.com/wmt16_ende_data_bpe_clean.tar.gz and the pre-trained weights from http://transformer-model-data.bj.bcebos.com/iter_100000.infer.model.tar.gz.
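For convenience, here is a minimal Python sketch for fetching and unpacking both archives. The URLs come from the comment above; the local file names and extraction directory are just arbitrary choices:

```python
import tarfile
import urllib.request

# URLs from the comment above; local file names are arbitrary choices.
ARCHIVES = {
    "wmt16_ende_data_bpe_clean.tar.gz":
        "http://transformer-model-data.bj.bcebos.com/wmt16_ende_data_bpe_clean.tar.gz",
    "iter_100000.infer.model.tar.gz":
        "http://transformer-model-data.bj.bcebos.com/iter_100000.infer.model.tar.gz",
}

for filename, url in ARCHIVES.items():
    urllib.request.urlretrieve(url, filename)  # download the tarball
    with tarfile.open(filename, "r:gz") as tar:
        tar.extractall(".")                    # unpack into the current directory
```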


hshen14 commented Nov 21, 2018

Thanks @guoshengCS. I tried inference on CPU and the speed seems very slow. I have some questions:

  1. Are there any instructions for measuring the performance? Do you have the total inference time on CPU?
  2. Some sentences look abnormal during inference, containing @@ and garbled characters. Is that okay?
  3. Could you please share the BLEU score corresponding to the model you provided?
  4. Could you please share gen_data/mosesdecoder/scripts/generic/multi-bleu.perl?

Er konnte nicht wieder@@ beleb@@ t werden , und er starb am nächsten Morgen .
Den@@ g besch@@ wert sich schlie[garbled]lich (schließlich) , dass sein Kopf verletzt dann un@@ bewusst .

@guoshengCS (Collaborator)

  1. I haven't tried training or inference on CPU, so unfortunately I can't give a performance benchmark on CPU, and I haven't seen any published Transformer performance numbers on CPU either.
  2. The @@ markers are introduced by BPE (byte-pair encoding, which is used in the paper) and can be removed with sed -r 's/(@@ )|(@@ ?$)//g' (a Python equivalent is sketched after this list). Your examples are the same as my results, so it seems to work correctly:
[paddle@yq01-gpu-v110-255-100-01 transformer_1.1]$ grep 'Er konnte nicht wieder\|Den@@ g besch@@ wert sich' ~/guosheng/transformer_test_1.1/models/fluid/neural_machine_translation/transformer/results_2016/predict_iter100000.txt
Den@@ g besch@@ wert sich schließlich , dass sein Kopf verletzt dann un@@ bewusst .
Er konnte nicht wieder@@ beleb@@ t werden , und er starb am nächsten Morgen .
  3. Testing with multi-bleu.perl and the pre-trained weights, the BLEU score is 33.64.
  4. You can get multi-bleu.perl by following https://github.com/PaddlePaddle/models/blob/develop/fluid/PaddleNLP/neural_machine_translation/transformer/gen_data.sh#L110
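For reference, here is a minimal Python sketch of the same post-processing and scoring: it strips the BPE markers exactly as the sed command above does, then scores the cleaned output with multi-bleu.perl (which reads the hypothesis on stdin and takes the reference file as an argument). The file names predict_iter100000.txt, predict.clean.txt, and ref.de are hypothetical placeholders:

```python
import re
import subprocess

def remove_bpe(line: str) -> str:
    """Undo BPE segmentation: same effect as sed -r 's/(@@ )|(@@ ?$)//g'."""
    return re.sub(r"(@@ )|(@@ ?$)", "", line)

# Hypothetical file names: raw predictions with BPE markers, cleaned output.
with open("predict_iter100000.txt") as fin, open("predict.clean.txt", "w") as fout:
    for line in fin:
        fout.write(remove_bpe(line))

# Score against a (hypothetical) reference file; multi-bleu.perl prints the BLEU score.
with open("predict.clean.txt") as hyp:
    subprocess.run(
        ["perl", "gen_data/mosesdecoder/scripts/generic/multi-bleu.perl", "ref.de"],
        stdin=hyp,
    )
```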


hshen14 commented Nov 22, 2018

Thanks @guoshengCS. Reproduced.
