Extensive feature engineering with an NLTK pipeline, served through a Streamlit web application.

nishadneeraj1/Quora-Question-pair-Duplicate-checker

Quora Question Pair Duplicate Checker

A Streamlit-based web application that checks whether a pair of Quora questions are duplicates. It uses a pre-trained Random Forest classifier to predict similarity between the two questions, reaching 76.56% accuracy on the validation dataset. Natural language processing (NLP) techniques are used to preprocess the question text and engineer features.

Objective

The application gives users a convenient and efficient way to identify duplicate question pairs on Quora. This can be useful for a variety of reasons, such as:

  • To improve the quality of the Quora platform by reducing the number of duplicate questions.
  • To help users save time by avoiding answering questions that have already been answered.
  • To help users find the most relevant answers to their questions by directing them to the existing duplicate question.

Features

  • The web application uses a pre-trained Random Forest classifier to predict the similarity between questions. The classifier was trained on a dataset of labeled question pairs and reaches 76.56% accuracy on the validation dataset, which means roughly one validation pair in four is still misclassified.

  • The web application also applies natural language processing (NLP) techniques to preprocess the question text and engineer features from it. This helps to improve the accuracy of the Random Forest classifier.
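
To make the preprocessing and feature engineering step more concrete, here is a minimal sketch of how a question pair could be turned into numeric features for a classifier. It is an illustration under stated assumptions, not the repository's actual pipeline: the function names `preprocess` and `pair_features`, the feature set, and the small `STOP_WORDS` list are all hypothetical, and the standard library is used in place of NLTK so the snippet runs without any downloads.

```python
import re

# Hypothetical stop-word list for illustration only; an NLTK-based pipeline
# would more likely load a full stop-word corpus instead.
STOP_WORDS = {"a", "an", "the", "is", "are", "what", "how",
              "do", "does", "i", "to", "of", "in"}


def preprocess(question: str) -> list[str]:
    """Lowercase the text, replace punctuation with spaces, and tokenize."""
    cleaned = re.sub(r"[^a-z0-9\s]", " ", question.lower())
    return cleaned.split()


def pair_features(q1: str, q2: str) -> dict[str, float]:
    """Build a few simple overlap features for one question pair."""
    t1, t2 = preprocess(q1), preprocess(q2)
    # Content words: unique tokens that are not stop words.
    w1 = {t for t in t1 if t not in STOP_WORDS}
    w2 = {t for t in t2 if t not in STOP_WORDS}
    common, union = w1 & w2, w1 | w2
    both_non_empty = bool(t1) and bool(t2)
    return {
        "len_diff": float(abs(len(t1) - len(t2))),
        "common_words": float(len(common)),
        "jaccard": len(common) / len(union) if union else 0.0,
        "first_word_equal": float(both_non_empty and t1[0] == t2[0]),
        "last_word_equal": float(both_non_empty and t1[-1] == t2[-1]),
    }


features = pair_features("How do I learn Python?",
                         "How can I learn Python quickly?")
print(features)
```

In a setup like the one described above, a feature dictionary of this kind would be converted to a fixed-order vector and passed to the trained Random Forest's prediction method; the Streamlit app would then display the resulting label for the two questions entered by the user.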
