RalphGradien/Employee-Turnover-and-HR-data-Exploration

Employee-Turnover-and-HR-data-Exploration

Project Overview

This project explores the factors that contribute to employee turnover and builds a predictive model to identify employees at risk of leaving the company.

OSEMN Data Science Pipeline

  1. Obtaining the Data:

    • Downloaded the dataset from Kaggle.
    • Imported the data into the working environment.
  2. Scrubbing the Data:

    • Checked for missing values (dataset was clean).
    • Examined the dataset for readability and appropriate feature names.
    • Converted categorical features (department, salary) to numeric types.
  3. Exploratory Data Analysis (EDA):

    • Produced a statistical overview and summary of the dataset.
    • Explored correlations among features using a correlation matrix and heatmap.
    • Analyzed turnover patterns in relation to department, salary, promotion, years at the company, project count, evaluation, average monthly hours, etc.
  4. Modeling the Data:

    • Split the data into training and testing sets.
    • Implemented various machine learning models (Logistic Regression, SVM, kNN, Random Forest).
    • Evaluated model performance using training and testing scores.
  5. Interpreting the Data:

    • Summarized findings from EDA.
    • Highlighted trends related to turnover, satisfaction, salary, project count, and evaluations.
    • Raised questions for further consideration about the impact of losing employees and factors affecting satisfaction and turnover.
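
The scrubbing and modeling steps above can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: it runs on a small synthetic stand-in rather than the Kaggle dataset, and the column names (`satisfaction`, `avg_monthly_hours`, `turnover`, and so on), the rule that generates the labels, the ordinal salary mapping, the feature scaling, and the 80/20 stratified split are all assumptions made for the example.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def make_synthetic_hr(n_rows=600, seed=0):
    """Build a synthetic stand-in for the HR data (hypothetical columns, not the Kaggle file)."""
    rng = np.random.default_rng(seed)
    df = pd.DataFrame({
        "satisfaction": rng.uniform(0.1, 1.0, n_rows),
        "evaluation": rng.uniform(0.4, 1.0, n_rows),
        "project_count": rng.integers(2, 8, n_rows),
        "avg_monthly_hours": rng.integers(100, 310, n_rows),
        "department": rng.choice(["sales", "technical", "support"], n_rows),
        "salary": rng.choice(["low", "medium", "high"], n_rows),
    })
    # Made-up labeling rule for illustration: unhappy or overworked employees leave,
    # with about 5% of the labels flipped at random.
    risk = (df["satisfaction"] < 0.4) | (df["avg_monthly_hours"] > 270)
    noise = rng.random(n_rows) < 0.05
    df["turnover"] = (risk ^ noise).astype(int)
    return df


def encode_categoricals(df):
    """Convert the categorical features (department, salary) to numeric types."""
    out = df.copy()
    # Salary has a natural order, so an ordinal mapping is one reasonable choice.
    out["salary"] = out["salary"].map({"low": 0, "medium": 1, "high": 2})
    # Department has no order; integer codes are a simple (assumed) encoding.
    out["department"] = out["department"].astype("category").cat.codes
    return out


def compare_models(df, seed=0):
    """Fit the four model families and return (train score, test score) per model."""
    X = df.drop(columns="turnover")
    y = df["turnover"]
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, stratify=y, random_state=seed
    )
    models = {
        "Logistic Regression": make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
        "SVM": make_pipeline(StandardScaler(), SVC()),
        "kNN": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5)),
        "Random Forest": RandomForestClassifier(n_estimators=100, random_state=seed),
    }
    scores = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        # score() reports mean accuracy for scikit-learn classifiers.
        scores[name] = (model.score(X_train, y_train), model.score(X_test, y_test))
    return scores


if __name__ == "__main__":
    data = encode_categoricals(make_synthetic_hr())
    for name, (train_acc, test_acc) in compare_models(data).items():
        print(f"{name}: train={train_acc:.3f} test={test_acc:.3f}")
```

Reporting the training and testing scores side by side makes it easier to spot a model that fits the training data much better than it generalizes.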

Conclusion and Questions

  • Noted trends related to working hours, salary, promotion, and project count.
  • Highlighted correlations between turnover, satisfaction, and salary.
  • Posed questions about the impact of losing employees and factors influencing turnover and satisfaction.
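
As a rough illustration of how trends like these can be checked, the sketch below computes the turnover rate per salary band and the correlation of each numeric feature with turnover. The ten rows are invented for the example and are not results from the project; the column names and helper function names are hypothetical.

```python
import pandas as pd

# Hand-made illustrative rows (not taken from the Kaggle dataset).
df = pd.DataFrame({
    "satisfaction": [0.38, 0.11, 0.37, 0.80, 0.41, 0.72, 0.92, 0.89, 0.85, 0.66],
    "avg_monthly_hours": [157, 272, 159, 180, 290, 200, 175, 190, 210, 168],
    "salary": ["low", "low", "low", "low", "medium", "medium", "medium",
               "high", "high", "high"],
    "turnover": [1, 1, 1, 0, 1, 0, 0, 0, 0, 0],
})


def turnover_rate_by(frame, column):
    """Share of employees who left within each group of `column`, highest first."""
    return frame.groupby(column)["turnover"].mean().sort_values(ascending=False)


def correlation_with_turnover(frame):
    """Pearson correlation of every numeric feature with the turnover flag."""
    numeric = frame.select_dtypes("number")
    return numeric.corr()["turnover"].drop("turnover")


if __name__ == "__main__":
    print(turnover_rate_by(df, "salary"))
    print(correlation_with_turnover(df))
```

The full matrix from `numeric.corr()` is the kind of table that a heatmap function such as `seaborn.heatmap` can render.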

Note: The code sections may need to be reformatted before they can be run in a Python environment.

About

The OSEMN Data Science Pipeline, Logistic Regression, SVM, kNN, and Random Forest
