MT Evaluation

Thamme Gowda edited this page Mar 27, 2018 · 1 revision

To evaluate MT output, use the BLEU metric as follows:

Get the evaluator script, multi-bleu-detok.perl, from mosesdecoder.
Keep your references (gold translations) unmodified: no tokenization, no case modifications, literally no operation.
Detokenize your MT output using the detokenizer. See this page for instructions. Try to make the MT output look as close to the references as possible (restore case, punctuation, etc., if necessary).
Run the BLEU script as follows:

multi-bleu-detok.perl REFERENCE1.txt < MT_OUT.txt

Note: if you have multiple references, list all of them before the input redirect:

multi-bleu-detok.perl REFERENCE1.txt REFERENCE2.txt ... REFERENCEn.txt < MT_OUT.txt
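
The two invocations above can also be scripted. The following is a minimal sketch, not part of the original instructions. It assumes that perl is on your PATH, that multi-bleu-detok.perl is reachable at the path you pass in, and that every file holds one segment per line. The helper names (count_lines, build_bleu_command, run_bleu) are hypothetical and chosen only for illustration. The line-count check is an added precaution, since the script pairs hypothesis and reference segments by line.

```python
import subprocess
from typing import List, Sequence


def count_lines(path: str) -> int:
    """Count the lines in a file (assumes one segment per line)."""
    with open(path, "rb") as handle:
        return sum(1 for _ in handle)


def build_bleu_command(references: Sequence[str],
                       script: str = "multi-bleu-detok.perl") -> List[str]:
    """Build the argument list: perl SCRIPT REF1 [REF2 ... REFn]."""
    if not references:
        raise ValueError("at least one reference file is required")
    return ["perl", script, *references]


def run_bleu(mt_output: str,
             references: Sequence[str],
             script: str = "multi-bleu-detok.perl") -> str:
    """Feed the detokenized MT output on stdin and return the script's output."""
    expected = count_lines(mt_output)
    for ref in references:
        # Added sanity check (an assumption of this sketch, not of the script).
        if count_lines(ref) != expected:
            raise ValueError(
                f"{ref} has a different number of lines than {mt_output}"
            )
    with open(mt_output, "rb") as stdin:
        result = subprocess.run(
            build_bleu_command(references, script),
            stdin=stdin,
            capture_output=True,
            text=True,
            check=True,  # raise if the script exits with a non-zero status
        )
    return result.stdout.strip()
```

For example, run_bleu("MT_OUT.txt", ["REFERENCE1.txt", "REFERENCE2.txt"]) corresponds to the multi-reference command above. Pass script="..." with your own location of multi-bleu-detok.perl if it is not in the working directory.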