Multilingual BERT for Textual Entailment in Vietnamese and English

Introduction

This model addresses the Natural Language Inference (NLI) task using a pre-trained multilingual model. The goal is to predict the semantic relationship between two sentences: whether they agree (synonymous), disagree (antonymous), or are semantically unrelated (neutral).

Requirements

  • Python 3.6+
  • TensorFlow 2.0+
  • Transformers library
  • SentencePiece library

Dataset

Each dataset has three columns: sentence_1 and sentence_2 hold the sentence pair, and label holds the class. sentence_1 can be in English or Vietnamese, sentence_2 is in Vietnamese, and label takes one of three values: "agree", "disagree", or "neutral".
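As a minimal sketch of reading data in this layout, the snippet below assumes the dataset is stored as a CSV file with a header row (the README does not state the file format). The function name `read_pairs`, the integer ids in `LABEL_TO_ID`, and the two sample rows are hypothetical and serve only as illustration.

```python
import csv
import io

# Label names come from the dataset description; the integer ids are an
# arbitrary choice made for this sketch.
LABEL_TO_ID = {"agree": 0, "disagree": 1, "neutral": 2}
REQUIRED_COLUMNS = ("sentence_1", "sentence_2", "label")


def read_pairs(handle):
    """Read sentence pairs from an open CSV file and encode the labels."""
    reader = csv.DictReader(handle)
    missing = [c for c in REQUIRED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing columns: {missing}")

    examples = []
    for row_number, row in enumerate(reader, start=2):  # row 1 is the header
        label = row["label"].strip().lower()
        if label not in LABEL_TO_ID:
            raise ValueError(f"row {row_number}: unknown label {label!r}")
        examples.append((row["sentence_1"], row["sentence_2"], LABEL_TO_ID[label]))
    return examples


# Tiny in-memory sample (made up for illustration) so the sketch runs
# without a data file.
sample = io.StringIO(
    "sentence_1,sentence_2,label\n"
    "The cat is sleeping.,Con mèo đang ngủ.,agree\n"
    "Trời đang mưa.,Trời đang nắng.,disagree\n"
)
pairs = read_pairs(sample)
print(pairs[0][2], pairs[1][2])  # prints: 0 1
```

With a real file, the same function could be called as `read_pairs(open(path, encoding="utf-8", newline=""))`, where `path` points to your own copy of the data.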

Model

I use the InfoXLM-large model from Hugging Face, fine-tuned for the NLI task with TensorFlow Keras and extended with additional layers for classification.

Experiments

To find the most suitable model for the task, I experimented with several multilingual models: mBERT, XLM-R, and InfoXLM, including the base and large variants of the latter two. The table below shows the accuracy of each model on the training, validation, and test sets.

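| Model             | Train acc | Val acc | Test acc |
|-------------------|-----------|---------|----------|
| mBERT             | 0.95      | 0.86    | 0.85     |
| XLM-RoBERTa-base  | 0.96      | 0.90    | 0.90     |
| XLM-RoBERTa-large | 0.99      | 0.95    | 0.96     |
| InfoXLM-base      | 0.85      | 0.88    | 0.87     |
| InfoXLM-large     | 0.99      | 0.96    | 0.96     |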

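XLM-RoBERTa-large and InfoXLM-large yielded nearly identical results: both reach 0.99 training accuracy and 0.96 test accuracy, and they differ by only 0.01 on the validation set (0.95 vs. 0.96). Ultimately, I chose the InfoXLM-large model to address this task.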
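The comparison can be reproduced with a few lines of Python. This is only a sketch: the numbers are copied from the results table, while ranking by validation accuracy is an assumed selection criterion that the README does not state, and `rank_by_validation` is a hypothetical helper name.

```python
# Accuracies copied from the results table: (train, validation, test).
RESULTS = {
    "mBERT": (0.95, 0.86, 0.85),
    "XLM-RoBERTa-base": (0.96, 0.90, 0.90),
    "XLM-RoBERTa-large": (0.99, 0.95, 0.96),
    "InfoXLM-base": (0.85, 0.88, 0.87),
    "InfoXLM-large": (0.99, 0.96, 0.96),
}


def rank_by_validation(results):
    """Return model names sorted by validation accuracy, best first."""
    # Assumption: validation accuracy is the selection criterion.
    return sorted(results, key=lambda name: results[name][1], reverse=True)


ranking = rank_by_validation(RESULTS)
best = ranking[0]

for name in ranking:
    train, val, test = RESULTS[name]
    # The train-val gap is a rough indicator of overfitting.
    print(f"{name:<18} val={val:.2f} test={test:.2f} gap={train - val:+.2f}")

print("Best by validation accuracy:", best)
```

Under this criterion the top two entries are InfoXLM-large and XLM-RoBERTa-large, separated by 0.01 on validation and tied on test.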

Acknowledgements

Thank you to the authors of the mBERT, XLM-R, and InfoXLM models.