Simple resume classifier that helps categorize resumes into predefined job categories

Naindeep-Singh/Resume-Classifier

Resume-Classifier

Overview

This project is a simple resume classifier that sorts resumes into predefined job categories. It uses a K-Nearest Neighbors (KNN) classifier trained on a labeled dataset of resumes.

Features

  • Data Exploration: Explore the provided dataset to understand the distribution of resume categories using visualizations.

  • Text Cleaning: Use the cleanResume function to preprocess resume text, removing unnecessary elements such as URLs, mentions, and punctuation.

  • Word Cloud: Generate a word cloud to visualize the most frequent words in the cleaned resume text using the wordcloud library.

  • Feature Extraction: Use the TfidfVectorizer from scikit-learn to convert the cleaned text into numerical features suitable for machine learning.

  • Model Training: Train a KNN classifier wrapped in scikit-learn's OneVsRestClassifier and evaluate its accuracy on the training and test sets.

  • Making Predictions: Enter new resume text and the model predicts its category, with a probability score for each category.

  • Category Descriptions: Display a brief description of the top predicted category.
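
As an illustration of the text cleaning step, here is a minimal sketch that uses only the Python standard library. The function name clean_resume_sketch and its regular expressions are assumptions made for this example. The project's own cleanResume function may apply different or additional rules.

```python
import re
import string


def clean_resume_sketch(text: str) -> str:
    """Hypothetical stand-in for cleanResume: strip URLs, mentions, punctuation."""
    text = re.sub(r"https?://\S+", " ", text)  # drop URLs
    text = re.sub(r"@\S+", " ", text)          # drop mentions such as @user
    # Replace every ASCII punctuation character with a space.
    text = re.sub(f"[{re.escape(string.punctuation)}]", " ", text)
    # Collapse runs of whitespace left behind by the substitutions.
    return re.sub(r"\s+", " ", text).strip()


sample = "Python dev, see https://example.com/cv @jane_doe! Skills: SQL, pandas."
cleaned = clean_resume_sketch(sample)
print(cleaned)  # Python dev see Skills SQL pandas
```

The order of the substitutions matters: URLs and mentions are removed before punctuation, because stripping punctuation first would break them into ordinary-looking words that stay in the text.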

Prerequisites

Ensure you have the necessary libraries installed. You can install them using:

pip install numpy pandas matplotlib seaborn scikit-learn nltk wordcloud

Dataset

The project uses the UpdatedResumeDataSet.csv dataset, containing labeled resumes for training and evaluation.
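
To show how labeled resumes feed the feature extraction, training, and prediction steps listed under Features, here is a minimal sketch. The column layout of UpdatedResumeDataSet.csv is not described here, so the sketch uses a tiny in-memory sample invented for illustration instead of the CSV. The category names, n_neighbors=3, and the default TfidfVectorizer settings are likewise assumptions, not the project's actual configuration.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.multiclass import OneVsRestClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline

# Invented toy data standing in for the labeled resumes (assumption).
texts = [
    "python pandas machine learning statistics",
    "deep learning python numpy neural networks",
    "statistics regression python data visualization",
    "java spring hibernate backend microservices",
    "java rest api spring boot backend",
    "java maven junit spring servlet",
    "recruitment payroll employee onboarding",
    "employee relations hiring payroll compliance",
    "talent acquisition recruitment interviews hiring",
]
labels = ["Data Science"] * 3 + ["Java Developer"] * 3 + ["HR"] * 3

# TF-IDF turns text into numeric features; KNN is wrapped in one-vs-rest.
model = make_pipeline(
    TfidfVectorizer(),
    OneVsRestClassifier(KNeighborsClassifier(n_neighbors=3)),
)
model.fit(texts, labels)

# Accuracy on the training data only; a real evaluation would also hold out
# a test split, for example with sklearn.model_selection.train_test_split.
train_accuracy = model.score(texts, labels)

new_resume = "python machine learning statistics"
predicted = model.predict([new_resume])[0]
probabilities = dict(zip(model.classes_, model.predict_proba([new_resume])[0]))
print(predicted, train_accuracy, probabilities)
```

Putting the vectorizer and classifier in one pipeline means new text is transformed with the vocabulary learned during training, so prediction takes raw strings directly.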
