birukd1/student-performance-analysis

Student Performance Analysis

A data analysis project examining factors that influence student exam scores, including study time, attendance, and sleep patterns.

Project Overview

This project analyzes student performance data to identify key factors affecting academic success. The analysis includes data cleaning, statistical analysis, and visualization to provide actionable insights.

Dataset

The dataset contains 1000 student records with the following variables:

  • hours_studied: Daily study hours
  • previous_score: Previous exam score
  • attendance: Attendance percentage
  • sleep_hours: Average sleep hours per night
  • internet_usage: Daily internet usage in hours
  • final_score: Final exam score (0-100)
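Before running any analysis it can help to confirm that a loaded file actually has these columns. The following is a minimal sketch, not the project's own code: the helper name `missing_columns` is hypothetical, and the sample frame is made-up data standing in for `pd.read_csv("student_performance.csv")`.

```python
import pandas as pd

# Column names taken from the variable list above.
EXPECTED_COLUMNS = [
    "hours_studied",
    "previous_score",
    "attendance",
    "sleep_hours",
    "internet_usage",
    "final_score",
]


def missing_columns(df: pd.DataFrame) -> list:
    """Return the expected columns that are absent from df (hypothetical helper)."""
    return [col for col in EXPECTED_COLUMNS if col not in df.columns]


# Tiny made-up frame for illustration; the real data would be loaded with
# pd.read_csv("student_performance.csv").
sample = pd.DataFrame({"hours_studied": [2.5], "final_score": [71.0]})
print(missing_columns(sample))
```

An empty list means every documented variable is present.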

Installation

Required Libraries

pip install pandas matplotlib numpy seaborn jupyter

Usage

Option 1: Python Script

For quick analysis and results:

python analysis.py

This will generate:

  • Cleaned dataset (student_performance_cleaned.csv)
  • Visualization charts (student_analysis_charts.png)
  • Statistical summary printed to the console

Option 2: Jupyter Notebook

For interactive analysis with detailed visualizations:

jupyter notebook

Then open student_analysis.ipynb and run all cells.

Analysis Components

1. Data Cleaning

  • Handling missing values
  • Removing invalid data
  • Data validation
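The cleaning steps above could look roughly like the sketch below. The specific rules are assumptions, since the README does not spell them out: rows with any missing value are dropped, scores and attendance must lie in 0-100, and study hours must be non-negative. The function name `clean_student_data` is hypothetical.

```python
import pandas as pd


def clean_student_data(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical cleaning pass; the rules below are assumptions."""
    # Handle missing values by dropping incomplete rows.
    cleaned = df.dropna()
    # Remove invalid data: out-of-range percentages/scores, negative hours.
    valid = (
        cleaned["final_score"].between(0, 100)
        & cleaned["attendance"].between(0, 100)
        & (cleaned["hours_studied"] >= 0)
    )
    return cleaned[valid].reset_index(drop=True)


# Made-up rows: one valid, one with a missing value, two out of range.
raw = pd.DataFrame(
    {
        "hours_studied": [2.0, float("nan"), 4.0, -1.0],
        "attendance": [80.0, 90.0, 120.0, 70.0],
        "final_score": [65.0, 88.0, 92.0, 55.0],
    }
)
cleaned = clean_student_data(raw)
print(len(raw), "->", len(cleaned), "rows")
```

Dropping rows is the simplest choice; imputing missing values (for example with a column median) would be an alternative if too many records are lost.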

2. Descriptive Statistics

  • Average, maximum, and minimum scores
  • Pass/fail rates (passing grade: 60)
  • Study hours analysis
  • Attendance patterns
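As an illustration of the score statistics listed above, here is a small sketch using the passing grade of 60. It assumes a score of exactly 60 counts as a pass, which the README does not state; `summarize_scores` is a hypothetical helper and the scores are made-up.

```python
import pandas as pd

PASSING_GRADE = 60  # passing grade stated above


def summarize_scores(scores: pd.Series) -> dict:
    """Mean, max, min, and pass rate for a series of final scores."""
    # Assumption: a score equal to the passing grade counts as a pass.
    passed = scores >= PASSING_GRADE
    return {
        "mean": float(scores.mean()),
        "max": float(scores.max()),
        "min": float(scores.min()),
        "pass_rate_pct": float(passed.mean() * 100),
    }


scores = pd.Series([55.0, 60.0, 75.0, 90.0])  # made-up example scores
summary = summarize_scores(scores)
print(summary)
```

Taking the mean of the boolean `passed` series gives the fraction of students who passed, which avoids a separate count-and-divide step.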

3. Visualizations

Python Script (4 charts):

  • Study Hours vs Exam Score
  • Pass vs Fail Distribution
  • Attendance vs Score
  • Score Distribution

Jupyter Notebook (10+ charts):

  • All script visualizations plus:
  • Correlation heatmap
  • Box plots by study groups
  • Violin plots
  • Pair plots
  • Missing data visualization
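The four script charts could be laid out on a single 2x2 figure, along the lines of the sketch below. This is an assumed layout rather than the actual contents of analysis.py; `plot_overview` is a hypothetical name and the data frame is made-up.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend so this runs without a display
import matplotlib.pyplot as plt
import pandas as pd


def plot_overview(df: pd.DataFrame, passing_grade: float = 60):
    """Draw the four script charts on one 2x2 figure (hypothetical layout)."""
    fig, axes = plt.subplots(2, 2, figsize=(12, 9))

    axes[0, 0].scatter(df["hours_studied"], df["final_score"], alpha=0.6)
    axes[0, 0].set(title="Study Hours vs Exam Score",
                   xlabel="Hours studied per day", ylabel="Final score")

    passed = int((df["final_score"] >= passing_grade).sum())
    axes[0, 1].bar(["Pass", "Fail"], [passed, len(df) - passed])
    axes[0, 1].set(title="Pass vs Fail Distribution", ylabel="Students")

    axes[1, 0].scatter(df["attendance"], df["final_score"], alpha=0.6)
    axes[1, 0].set(title="Attendance vs Score",
                   xlabel="Attendance (%)", ylabel="Final score")

    axes[1, 1].hist(df["final_score"], bins=10)
    axes[1, 1].set(title="Score Distribution",
                   xlabel="Final score", ylabel="Students")

    fig.tight_layout()
    return fig


demo = pd.DataFrame(
    {
        "hours_studied": [1.0, 2.0, 4.0, 5.0],
        "attendance": [60.0, 70.0, 85.0, 95.0],
        "final_score": [50.0, 60.0, 80.0, 90.0],
    }
)
fig = plot_overview(demo)
# fig.savefig("student_analysis_charts.png") would write the image to disk.
```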

4. Key Findings

  • Impact of study time on performance
  • Relationship between attendance and scores
  • Pass rate analysis
  • Correlation analysis
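A sketch of how the group comparison and correlation findings might be computed is shown below. The helper `compare_groups` is hypothetical, the data is made-up, and the 3-hour threshold is borrowed from the Results section; Pearson correlation is assumed because it is the pandas default.

```python
import pandas as pd


def compare_groups(df, column, threshold, target="final_score"):
    """Mean of `target` for rows above vs. at or below `threshold` (hypothetical)."""
    above = df.loc[df[column] > threshold, target].mean()
    at_or_below = df.loc[df[column] <= threshold, target].mean()
    return float(above), float(at_or_below)


# Made-up data in which score rises linearly with study hours.
demo = pd.DataFrame(
    {
        "hours_studied": [1.0, 2.0, 4.0, 5.0],
        "attendance": [60.0, 70.0, 85.0, 95.0],
        "final_score": [50.0, 60.0, 80.0, 90.0],
    }
)

high, low = compare_groups(demo, "hours_studied", 3)
print(f"mean score: >3 h = {high}, <=3 h = {low}")

# Pearson correlation of every variable with the final score.
corr_with_score = demo.corr()["final_score"]
print(corr_with_score)
```

The same helper can be reused for attendance, for example `compare_groups(demo, "attendance", 75)`.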

Results

Key insights from the analysis:

  • Students who study more than 3 hours per day score significantly higher than those who study less
  • Attendance above 75% is associated with higher final scores
  • Overall pass rate: 98%
  • Average score: 88.49

See REPORT.md for detailed findings and recommendations.

Project Structure

.
├── analysis.py                          # Python script for analysis
├── student_analysis.ipynb               # Jupyter notebook
├── student_performance.csv              # Original dataset
├── student_performance_cleaned.csv      # Cleaned dataset (generated)
├── student_analysis_charts.png          # Visualizations (generated)
├── README.md                            # This file
└── REPORT.md                            # Detailed analysis report

Contributing

Feel free to fork this project and submit pull requests for improvements.

License

This project is open source and available for educational purposes.

Author

Biruk D. (GitHub: @birukd1)
