This repository contains Python scripts that clean employee data and run analyses on it, such as computing average salaries, identifying the youngest and most experienced employees, and producing department-wise insights.
- `clean_data(data)`: function to clean raw JSON employee data.
- `Analysis(data)`: function to perform analytics on the cleaned data.
- Prepare your raw data as a list of dictionaries (e.g., loaded from JSON or entered manually).
- Import the cleaning and analysis functions into your Jupyter Notebook or JupyterLab session.
- Clean your raw data using `clean_data()`.
- Run the analysis using `Analysis()` on the cleaned data.
- Visualize or print the results as needed.
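For the first step, here is a minimal sketch of turning JSON text into the list of dictionaries that the functions expect. It uses only the standard `json` module. The helper name `load_records` and the sample string are illustrative assumptions and are not part of this repository.

```python
import json


def load_records(json_text):
    """Parse JSON text and check that it is a list of dictionaries.

    Hypothetical helper for illustration; not part of this repository.
    """
    records = json.loads(json_text)
    if not isinstance(records, list) or not all(isinstance(r, dict) for r in records):
        raise ValueError("expected a JSON array of objects")
    return records


# Sample JSON text; in practice this could come from a file read with open(...).read()
sample = '[{"id": "1", "name": "Alice", "age": "30"}]'
raw_data = load_records(sample)
```

Checking the shape up front means a malformed file fails at load time, before any cleaning is attempted.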
# Import the functions from your local scripts
from data_cleaning import clean_data
from analysis import Analysis
# Example raw data with potential inconsistencies
raw_data = [
    {"id": "1", "name": "Alice", "age": "30", "department": "Sales", "salary": "60000", "experience_years": "5"},
    {"id": "2", "name": "Bob", "age": " ", "department": "HR", "salary": "75000", "experience_years": "20"},
    {"id": "3", "name": "Charlie", "age": "25", "department": "IT", "salary": "$50000", "experience_years": "2"},
    # Add more data rows here
]
# Clean the raw data
cleaned = clean_data(raw_data)
# Perform analysis
result = Analysis(cleaned)
# Optionally: print or visualize the `result`
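The implementations of `clean_data` and `Analysis` are not shown here, so the following is only a rough sketch of what cleaning and analysing rows like the ones above might involve. It assumes that blank or unparseable numeric fields become `None` and that currency symbols are stripped. The names `sketch_clean`, `sketch_analysis`, and `to_int` are hypothetical stand-ins, not the repository's actual functions, and the real behavior may differ.

```python
from statistics import mean

NUMERIC_FIELDS = ("age", "salary", "experience_years")


def to_int(value):
    """Return an int, or None if the value is blank or unparseable (assumed policy)."""
    text = str(value).strip().replace("$", "").replace(",", "")
    try:
        return int(text)
    except ValueError:
        return None


def sketch_clean(rows):
    """Hypothetical stand-in for clean_data: trim strings and convert numeric fields."""
    cleaned = []
    for row in rows:
        new_row = {k: v.strip() if isinstance(v, str) else v for k, v in row.items()}
        for field in NUMERIC_FIELDS:
            new_row[field] = to_int(row.get(field, ""))
        cleaned.append(new_row)
    return cleaned


def sketch_analysis(rows):
    """Hypothetical stand-in for Analysis: a few summary statistics, skipping missing values."""
    salaries = [r["salary"] for r in rows if r["salary"] is not None]
    with_age = [r for r in rows if r["age"] is not None]
    with_exp = [r for r in rows if r["experience_years"] is not None]
    by_dept = {}
    for r in rows:
        if r["salary"] is not None:
            by_dept.setdefault(r.get("department"), []).append(r["salary"])
    return {
        "average_salary": mean(salaries) if salaries else None,
        "youngest": min(with_age, key=lambda r: r["age"])["name"] if with_age else None,
        "most_experienced": (
            max(with_exp, key=lambda r: r["experience_years"])["name"] if with_exp else None
        ),
        "average_salary_by_department": {d: mean(s) for d, s in by_dept.items()},
    }


sample_rows = [
    {"id": "1", "name": "Alice", "age": "30", "department": "Sales", "salary": "60000", "experience_years": "5"},
    {"id": "2", "name": "Bob", "age": " ", "department": "HR", "salary": "75000", "experience_years": "20"},
    {"id": "3", "name": "Charlie", "age": "25", "department": "IT", "salary": "$50000", "experience_years": "2"},
]
sketch_cleaned = sketch_clean(sample_rows)
sketch_result = sketch_analysis(sketch_cleaned)
print(sketch_result)
```

In this sketch, rows with a missing age are left out of the "youngest" comparison instead of being dropped entirely, so Bob still counts toward the salary and experience figures.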