Design, implement, and train a machine learning model to automatically categorize resumes based on their domain (e.g., sales, marketing, etc.). Following this, develop a script that can be run via the command line to process a batch of resumes, categorize them, and output results to both directory structures and a CSV file.
You can find this dataset here
- Install Anaconda
- Create a new environment with Python 3.6:
conda create -n venv python=3.6
- Activate the environment:
source activate venv
- Install other dependencies:
pip install -r requirements.txt
- Alternatively, to use virtualenv instead of Anaconda, enter in your Command Prompt:
pip install virtualenv
- Create a virtualenv: navigate to your project and create the environment:
cd your_project
virtualenv env
- Activate the virtualenv on macOS/Linux:
source env/bin/activate
- Activate the virtualenv on Windows:
env\Scripts\activate
- Install other dependencies:
pip install -r requirements.txt
- Open command prompt/terminal/anaconda prompt
- Go to the project directory:
cd C:\Users\{user}\{your directory}
- Run script:
python script.py --input_dir cv/ --output_dir sort/
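
For orientation, here is a minimal sketch of how a script with this command-line interface could be structured. It is not the actual `script.py`. The keyword-based `predict_category` is a hypothetical stand-in for the trained model, the resumes are assumed to be readable as plain text (real resumes may need PDF text extraction first), and the CSV file name `categorized_resumes.csv` and its column names are assumptions. Only the `--input_dir` and `--output_dir` flags come from the command above.

```python
import argparse
import csv
import shutil
from pathlib import Path

# Hypothetical keyword lists standing in for the trained model.
KEYWORDS = {
    "sales": ("sales", "revenue", "quota"),
    "marketing": ("marketing", "campaign", "brand"),
}


def predict_category(text):
    """Placeholder classifier: return the category with the most keyword hits."""
    lowered = text.lower()
    scores = {
        category: sum(lowered.count(word) for word in words)
        for category, words in KEYWORDS.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "unknown"


def categorize_directory(input_dir, output_dir):
    """Copy each resume into output_dir/<category>/ and write a summary CSV."""
    input_dir, output_dir = Path(input_dir), Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    rows = []
    for path in sorted(input_dir.iterdir()):
        if not path.is_file():
            continue
        # Assumption: the file can be read as plain text.
        text = path.read_text(encoding="utf-8", errors="ignore")
        category = predict_category(text)
        target_dir = output_dir / category
        target_dir.mkdir(exist_ok=True)
        shutil.copy2(path, target_dir / path.name)
        rows.append((path.name, category))
    # Assumed CSV name and column headers.
    csv_path = output_dir / "categorized_resumes.csv"
    with open(csv_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["filename", "category"])
        writer.writerows(rows)
    return rows


def main(argv=None):
    parser = argparse.ArgumentParser(description="Categorize a batch of resumes.")
    parser.add_argument("--input_dir", required=True, help="directory of resumes")
    parser.add_argument("--output_dir", required=True, help="directory for results")
    args = parser.parse_args(argv)
    rows = categorize_directory(args.input_dir, args.output_dir)
    print(f"Categorized {len(rows)} resumes into {args.output_dir}")

# To run from the command line, call main() under the usual
# `if __name__ == "__main__":` guard at the end of the file.
```

With this layout, each resume ends up in a subdirectory of the output directory named after its predicted category, and the CSV lists one row per file. Replacing `predict_category` with a call to the trained model would leave the rest of the flow unchanged.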