Quora Question Pair Similarity Project

Techniques Used:

  1. Bag of Words (BOW)
  2. TF-IDF
  3. Own Word Embedding
  4. Pre-Trained Word2Vec
  5. Pre-Trained GloVe
  6. Pre-Trained BERT
  7. Sentence Similarity
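
As a rough illustration of the first two techniques, the sketch below builds bag-of-words counts and TF-IDF weights using only the Python standard library. The helper names (`tokenize`, `bow_vector`, `tfidf_vectors`), the punctuation handling, and the smoothed IDF formula are assumptions made for this example; the project itself may well rely on a library vectorizer with different settings.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase, split on whitespace, and strip basic punctuation."""
    stripped = (w.strip("?.,!") for w in text.lower().split())
    return [w for w in stripped if w]


def bow_vector(tokens, vocab):
    """Bag of words: raw count of each vocabulary word in the token list."""
    counts = Counter(tokens)
    return [counts[w] for w in vocab]


def tfidf_vectors(docs):
    """Return (vocab, vectors), one TF-IDF vector per document.

    Uses a smoothed IDF, log((1 + n) / (1 + df)) + 1, which is an
    assumption of this sketch rather than the project's exact weighting.
    """
    tokenized = [tokenize(d) for d in docs]
    vocab = sorted({w for toks in tokenized for w in toks})
    n = len(docs)
    # Document frequency: number of documents containing each word.
    df = Counter(w for toks in tokenized for w in set(toks))
    idf = {w: math.log((1 + n) / (1 + df[w])) + 1 for w in vocab}
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        vectors.append([counts[w] * idf[w] for w in vocab])
    return vocab, vectors


# Illustrative question pair (not taken from the dataset).
questions = ["How do I learn Python?", "How can I learn Python quickly?"]
vocab, vectors = tfidf_vectors(questions)
```

Words shared by both questions, such as "how", get the lowest IDF weight, while words unique to one question are weighted up.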

Extracted Features:

  1. Basic Word features
  2. Length Based Features
  3. Token Based Features
  4. Fuzzy features
  5. Cosine similarity between two sentences
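
The feature groups above could be computed along the following lines. This is a minimal sketch: the feature names (`word_common`, `word_share`, and so on) are hypothetical, and `difflib.SequenceMatcher` from the standard library stands in for whichever fuzzy-matching library the project actually uses.

```python
import math
from collections import Counter
from difflib import SequenceMatcher


def pair_features(q1, q2):
    """Compute a few illustrative features for one question pair."""
    t1, t2 = q1.lower().split(), q2.lower().split()
    s1, s2 = set(t1), set(t2)
    common = len(s1 & s2)

    # Cosine similarity between the two bag-of-words count vectors.
    c1, c2 = Counter(t1), Counter(t2)
    dot = sum(c1[w] * c2[w] for w in s1 & s2)
    norm = (math.sqrt(sum(v * v for v in c1.values()))
            * math.sqrt(sum(v * v for v in c2.values())))

    both_nonempty = bool(t1) and bool(t2)
    return {
        # Basic word features
        "word_common": common,
        "word_share": common / (len(s1) + len(s2)) if (s1 or s2) else 0.0,
        # Length-based features
        "len_diff": abs(len(t1) - len(t2)),
        "mean_len": (len(t1) + len(t2)) / 2,
        # Token-based features
        "first_word_eq": int(both_nonempty and t1[0] == t2[0]),
        "last_word_eq": int(both_nonempty and t1[-1] == t2[-1]),
        # Fuzzy feature (difflib used here as a stand-in)
        "fuzzy_ratio": SequenceMatcher(None, q1.lower(), q2.lower()).ratio(),
        # Cosine similarity between the two sentences
        "cosine": dot / norm if norm else 0.0,
    }


# Illustrative pair (not taken from the dataset).
features = pair_features("how do i learn python",
                         "how can i learn python fast")
```

Each pair becomes one row of numeric features, which can then be concatenated with the text vectors before training a classifier.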

Machine Learning Models Used:

  1. Logistic Regression
  2. SVM
  3. Naive Bayes
  4. KNN
  5. Decision Tree
  6. Random Forest
  7. Gradient Boosting
  8. AdaBoost
  9. XGBoost
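
To show how such features feed a classifier, here is a from-scratch sketch of the first model in the list, logistic regression, trained with batch gradient descent. The feature rows, labels, learning rate, and epoch count are all made up for illustration, and the project presumably uses library implementations rather than hand-written training loops.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logreg(X, y, lr=0.5, epochs=3000):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n, n_features = len(X), len(X[0])
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for xi, yi in zip(X, y):
            # Prediction error for this row: p(duplicate) minus the label.
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(w, b, x):
    """Return 1 (duplicate) if the predicted probability is at least 0.5."""
    return int(sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b) >= 0.5)


# Toy rows of [cosine, word_share]; values and labels are invented.
X = [[0.90, 0.45], [0.80, 0.40], [0.85, 0.42],
     [0.20, 0.10], [0.30, 0.15], [0.10, 0.05]]
y = [1, 1, 1, 0, 0, 0]

w, b = train_logreg(X, y)
train_accuracy = sum(predict(w, b, xi) == yi for xi, yi in zip(X, y)) / len(y)
```

The same train-then-score pattern applies to the other models in the list; only the fitting procedure changes.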

Best Method:

XGBoost trained on BERT embeddings combined with the advanced extracted features, reaching an accuracy of 86%.