Open Source Pre-training Model Framework in PyTorch & Pre-trained Model Zoo
Tencent Pre-training framework in PyTorch & Pre-trained Model Zoo
🤖 A PyTorch library of curated Transformer models and their composable components
Trankit is a Light-Weight Transformer-based Python Toolkit for Multilingual Natural Language Processing
This repository contains the official release of the model "BanglaBERT" and associated downstream finetuning code and datasets introduced in the paper titled "BanglaBERT: Language Model Pretraining and Benchmarks for Low-Resource Language Understanding Evaluation in Bangla" accepted in Findings of the Annual Conference of the North American Chap…
CINO: Pre-trained Language Models for Chinese Minority (少数民族语言预训练模型)
Unattended Lightweight Text Classifiers with LLM Embeddings
Neural syntax annotator, supporting sequence labeling, lemmatization, and dependency parsing.
Deep-learning system proposed by HFL for SemEval-2022 Task 8: Multilingual News Similarity
Resources and tools for the Tutorial - "Hate speech detection, mitigation and beyond" presented at ICWSM 2021
PyTorch implementation of sentiment analysis of long texts written in Serbian (an underused language) using a pretrained multilingual RoBERTa-based model (XLM-R) on a small dataset.
Sentiment analysis of tweets written in underused Slavic languages (Serbian, Bosnian and Croatian) using the pretrained multilingual RoBERTa-based model XLM-R on two different datasets.
An implementation of drophead regularization for PyTorch transformers
A PyTorch (+ Hugging Face Transformers) implementation of a "simple" text classifier defined using BERT-based models. This lab shows how simple it is to use BERT for a sentence classification task, obtaining state-of-the-art results in a few lines of Python code.
Improving Low-Resource Neural Machine Translation of Related Languages by Transfer Learning
Improving Bilingual Lexicon Induction with Cross-Encoder Reranking (Findings of EMNLP 2022). Keywords: Bilingual Lexicon Induction, Word Translation, Cross-Lingual Word Embeddings.
A case study of NLI (Natural Language Inference) with transfer learning. Kaggle competition rank: 18th (global).
Notebooks to finetune `bert-small-amharic`, `bert-mini-amharic`, and `xlm-roberta-base` models using an Amharic text classification dataset and the transformers library
Language Model Decomposition: Quantifying the Dependency and Correlation of Language Models
Official repository of the ACL 2024 paper "Guardians of the Machine Translation Meta-Evaluation: Sentinel Metrics Fall In!".