A web application for university students to upload CSV datasets and see the step-by-step mathematical workings of machine learning models, starting with Linear Regression.
- Upload CSV datasets with drag-and-drop
- Interactive data preview with column statistics
- Step-by-step mathematical explanations with LaTeX rendering
- Visualizations including scatter plots, residual plots, and actual vs predicted charts
- Full derivation of the Normal Equation for Linear Regression
- Frontend: React + Vite, Tailwind CSS, KaTeX, Recharts
- Backend: FastAPI (Python), NumPy, Pandas
- Python 3.8+
- Node.js 18+
- npm or yarn
cd backend
# Create a virtual environment (recommended)
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
# Install dependencies
pip install -r requirements.txt
# Run the server
uvicorn main:app --reload
The API will be available at http://localhost:8000
cd frontend
# Install dependencies
npm install
# Run the development server
npm run dev
The app will be available at http://localhost:5173
- Open the app in your browser at http://localhost:5173
- Upload a CSV file (a sample file, sample_data.csv, is provided in the root directory)
- Select your feature column(s) (X) and target column (Y)
- Click "Train Linear Regression"
- Explore the step-by-step mathematical explanation
A sample CSV file (sample_data.csv) is included with student study data:
- hours_studied: Hours spent studying
- practice_tests: Number of practice tests taken
- previous_score: Score on previous exam
- final_score: Final exam score (target variable)
- POST /api/upload - Upload a CSV file
- POST /api/train/linear-regression - Train a linear regression model
- GET /api/sessions/{session_id} - Get session data
- DELETE /api/sessions/{session_id} - Delete a session
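The session endpoints listed above can be called from a script as well as from the frontend. The following is a minimal sketch using only the Python standard library. The helper names (`session_url`, `get_session`, `delete_session`) are hypothetical, and the assumption that the backend replies with a JSON object is not confirmed by this README. The upload and training endpoints are left out because their request formats are not documented here.

```python
import json
from urllib.parse import quote
from urllib.request import Request, urlopen

BASE_URL = "http://localhost:8000"  # backend address from the setup steps


def session_url(session_id: str, base_url: str = BASE_URL) -> str:
    """Build the URL for /api/sessions/{session_id}, escaping the id."""
    return f"{base_url}/api/sessions/{quote(session_id, safe='')}"


def get_session(session_id: str) -> dict:
    """GET a session. Assumption: the response body is a JSON object."""
    with urlopen(session_url(session_id)) as response:
        return json.load(response)


def delete_session(session_id: str) -> int:
    """DELETE a session and return the HTTP status code."""
    request = Request(session_url(session_id), method="DELETE")
    with urlopen(request) as response:
        return response.status
```

With the backend running, `get_session("abc123")` would fetch the data for a session, where `abc123` is a placeholder id rather than a real one.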
The app explains:
- Data Overview - Basic statistics of your dataset
- Problem Setup - The linear model equation and matrix notation
- Cost Function - Mean Squared Error (MSE) and why we use it
- Normal Equation - Step-by-step derivation with actual matrix calculations
- Results - Coefficient interpretation and model evaluation metrics
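To make the Cost Function and Normal Equation steps concrete, here is a small NumPy sketch of the same calculation on toy data. The data values are made up for illustration (generated from y = 1 + 2x), and the variable names are not taken from the app's source. Whether the backend solves the linear system or forms the matrix inverse explicitly is not stated in this README, so treat this as one reasonable way to do it.

```python
import numpy as np

# Hypothetical toy data generated from y = 1 + 2x, so the fit is exact.
x = np.array([0.0, 1.0, 2.0, 3.0])
y = 1.0 + 2.0 * x

# Design matrix: a column of ones (intercept) next to the feature column.
X = np.column_stack([np.ones_like(x), x])

# Normal Equation: solve (X^T X) theta = X^T y instead of inverting X^T X.
theta = np.linalg.solve(X.T @ X, X.T @ y)

# Mean Squared Error of the fitted model on the same data.
residuals = y - X @ theta
mse = float(np.mean(residuals ** 2))

print("intercept, slope:", theta)
print("MSE:", mse)
```

Because the toy data lies exactly on a line, the recovered intercept and slope match the generating values up to floating-point error, and the MSE is effectively zero. Real datasets such as sample_data.csv will leave nonzero residuals.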