len-sla/NLP_Flair_Texhero_DistilBERT

Project Name

Predict which tweets are about real disasters and which are not, in a couple of lines of code. The data comes from the Kaggle competition at https://www.kaggle.com/c/nlp-getting-started

General info

If you need initial results quickly on a typical NLP task, the Flair, Texthero and DistilBERT packages give quite good results.
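As a rough illustration of the kind of shortcut these packages offer, here is a minimal sketch of turning tweets into DistilBERT document vectors with Flair. It is not the code from the notebooks in this repository. The helper names `flair_distilbert_embedder` and `embed_tweets` are hypothetical, and the model name `distilbert-base-uncased` is an assumed choice. Flair is imported lazily, so the helpers can be defined even where Flair is not installed.

```python
from typing import Callable, List, Sequence


def flair_distilbert_embedder(
    model_name: str = "distilbert-base-uncased",  # assumed model choice
) -> Callable[[str], List[float]]:
    """Return a function that maps one text to a DistilBERT document vector."""
    # Imported here so this module loads even without Flair installed.
    from flair.data import Sentence
    from flair.embeddings import TransformerDocumentEmbeddings

    embedding = TransformerDocumentEmbeddings(model_name)

    def embed(text: str) -> List[float]:
        sentence = Sentence(text)
        embedding.embed(sentence)  # stores the vector on the sentence
        return sentence.get_embedding().detach().cpu().tolist()

    return embed


def embed_tweets(
    texts: Sequence[str], embed: Callable[[str], List[float]]
) -> List[List[float]]:
    """Apply an embedding function to every tweet, keeping the input order."""
    return [embed(text) for text in texts]
```

With Flair installed, `embed_tweets(tweets, flair_distilbert_embedder())` returns one vector per tweet, which can then be passed to any classifier.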

Libraries and useful links

  1. Flair Embeddings
  2. FastText Embeddings
  3. TransformerWordEmbeddings
  4. How to use flair with keras

Status

Project is: in progress

Inspiration

Project inspired by a Kaggle notebook

result on leaderboard

Second attempt

Rev_B_real_or_not.ipynb: Results were worse than with the initial simple automatic approach. This suggests how well optimised the Flair framework is out of the box: tweaking did not give better results. More extensive text cleaning and expanding abbreviations and other shortcuts might improve the result. Things like hero.visualization.wordcloud, kmeans and custom_pipeline were tried.
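The text cleaning and abbreviation expansion mentioned above could look roughly like the sketch below. It uses only the standard library and was not part of the experiments described here. The `ABBREVIATIONS` table and the `clean_tweet` helper are hypothetical and would need to be extended with the shortcuts that actually occur in the data.

```python
import re

# Hypothetical abbreviation table; extend it with shortcuts seen in the tweets.
ABBREVIATIONS = {
    "u": "you",
    "pls": "please",
    "ppl": "people",
    "b4": "before",
    "gr8": "great",
}

URL_RE = re.compile(r"https?://\S+")
MENTION_RE = re.compile(r"@\w+")
NON_WORD_RE = re.compile(r"[^a-z0-9\s]")


def clean_tweet(text: str) -> str:
    """Lowercase, drop URLs, mentions and punctuation, expand abbreviations."""
    text = text.lower()
    text = URL_RE.sub(" ", text)
    text = MENTION_RE.sub(" ", text)
    text = NON_WORD_RE.sub(" ", text)  # also strips the '#' from hashtags
    tokens = [ABBREVIATIONS.get(token, token) for token in text.split()]
    return " ".join(tokens)
```

For a pandas column this could be applied with `df["text"].apply(clean_tweet)` before computing embeddings.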

Info

Created by lencz.sla@gmail.com

About

using Flair, Texthero and DistilBERT to get good results
