
Transformer Pre-trained weight #1456

Open
hshen14 opened this issue Nov 15, 2018 · 5 comments

hshen14 commented Nov 15, 2018

I would like to reproduce the BLEU score for the Transformer model. May I know whether there are pre-trained weights available for sharing? Thanks. @Superjomn @panyx0718 @luotao1

https://github.com/PaddlePaddle/models/blob/develop/fluid/PaddleNLP/neural_machine_translation/transformer/README_cn.md

luotao1 added the intel label Nov 15, 2018
@panyx0718 (Contributor)

@guoshengCS @kuke

@guoshengCS (Collaborator)

We have just uploaded the pre-trained weights and the data used for the Transformer base model. Please download the data from http://transformer-model-data.bj.bcebos.com/wmt16_ende_data_bpe_clean.tar.gz and the pre-trained weights from http://transformer-model-data.bj.bcebos.com/iter_100000.infer.model.tar.gz.
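For convenience, here is a minimal Python sketch for fetching and unpacking both archives. The URLs come from the comment above; the local file names and extraction directory are just arbitrary choices:

```python
import tarfile
import urllib.request

# URLs from the comment above; local file names are arbitrary choices.
ARCHIVES = {
    "wmt16_ende_data_bpe_clean.tar.gz":
        "http://transformer-model-data.bj.bcebos.com/wmt16_ende_data_bpe_clean.tar.gz",
    "iter_100000.infer.model.tar.gz":
        "http://transformer-model-data.bj.bcebos.com/iter_100000.infer.model.tar.gz",
}

for filename, url in ARCHIVES.items():
    urllib.request.urlretrieve(url, filename)  # download the tarball
    with tarfile.open(filename, "r:gz") as tar:
        tar.extractall(".")                    # unpack into the current directory
```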


hshen14 commented Nov 21, 2018

Thanks @guoshengCS. I tried inference on CPU and the speed seems very slow. I have some questions:

  1. Are there any instructions for measuring the performance? Do you have the total inference time on CPU?
  2. Some sentences look abnormal during inference, containing @@ and garbled characters. Is that okay?
  3. Could you please share the BLEU score corresponding to the model you provided?
  4. Could you please share gen_data/mosesdecoder/scripts/generic/multi-bleu.perl?

Er konnte nicht wieder@@ beleb@@ t werden , und er starb am nächsten Morgen .
Den@@ g besch@@ wert sich schlie[garbled]lich (schließlich) , dass sein Kopf verletzt dann un@@ bewusst .

@guoshengCS (Collaborator)

  1. I haven't tried training or inference on CPU, so unfortunately I can't give a performance benchmark on CPU, and I haven't seen any published Transformer performance numbers on CPU either.
  2. The @@ markers are introduced by BPE (byte-pair encoding, which is used in the paper) and can be removed with sed -r 's/(@@ )|(@@ ?$)//g' (a Python equivalent is sketched after this list). Your examples are the same as my results, so it seems to work correctly:
[paddle@yq01-gpu-v110-255-100-01 transformer_1.1]$ grep 'Er konnte nicht wieder\|Den@@ g besch@@ wert sich' ~/guosheng/transformer_test_1.1/models/fluid/neural_machine_translation/transformer/results_2016/predict_iter100000.txt
Den@@ g besch@@ wert sich schließlich , dass sein Kopf verletzt dann un@@ bewusst .
Er konnte nicht wieder@@ beleb@@ t werden , und er starb am nächsten Morgen .
  3. Testing with multi-bleu.perl and the pre-trained weights, the BLEU score is 33.64.
  4. You can get multi-bleu.perl by following https://github.com/PaddlePaddle/models/blob/develop/fluid/PaddleNLP/neural_machine_translation/transformer/gen_data.sh#L110
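For reference, here is a minimal Python sketch of the same post-processing and scoring: it strips the BPE markers exactly as the sed command above does, then scores the cleaned output with multi-bleu.perl (which reads the hypothesis on stdin and takes the reference file as an argument). The file names predict_iter100000.txt, predict.clean.txt, and ref.de are hypothetical placeholders:

```python
import re
import subprocess

def remove_bpe(line: str) -> str:
    """Undo BPE segmentation: same effect as sed -r 's/(@@ )|(@@ ?$)//g'."""
    return re.sub(r"(@@ )|(@@ ?$)", "", line)

# Hypothetical file names: raw predictions with BPE markers, cleaned output.
with open("predict_iter100000.txt") as fin, open("predict.clean.txt", "w") as fout:
    for line in fin:
        fout.write(remove_bpe(line))

# Score against a (hypothetical) reference file; multi-bleu.perl prints the BLEU score.
with open("predict.clean.txt") as hyp:
    subprocess.run(
        ["perl", "gen_data/mosesdecoder/scripts/generic/multi-bleu.perl", "ref.de"],
        stdin=hyp,
    )
```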


hshen14 commented Nov 22, 2018

Thanks @guoshengCS. Reproduced.
