LixinTu/Predicting-Student-Test-Scores

Predicting Student Test Scores: A Psychometric Feature Engineering Approach

Project Overview

This repository contains my solution for the Kaggle Playground Series (S6E1) competition "Predicting Student Test Scores". The objective was to predict students' standardized test scores from demographic data and study habits.

The solution integrates educational psychology theory (e.g., the Yerkes-Dodson law) into feature engineering and combines gradient boosting with deep learning in a hybrid ensemble. It finished in the top 9% of the competition leaderboard.

Competition Performance

  • Ranking: 315 / 3459 (Top 9.1%)
  • Evaluation Metric: RMSE (Root Mean Squared Error)
  • Final Score: 8.557
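
For reference, RMSE is the square root of the mean squared difference between predicted and true scores, so lower is better. A minimal sketch in plain Python follows; the function name `rmse` and the toy values are illustrative and are not taken from the notebook or the competition data.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("inputs must be non-empty")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Toy values for illustration only (not competition data).
y_true = [70.0, 80.0, 90.0]
y_pred = [73.0, 76.0, 90.0]
score = rmse(y_true, y_pred)  # sqrt((9 + 16 + 0) / 3)
```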

Methodology

1. Domain-Driven Feature Engineering

Unlike standard statistical feature generation, this project focused on "Psychometric Feature Construction" to capture latent behavioral patterns:

  • Cognitive Efficiency: Calculated as the ratio of study hours to break frequency, modeling the efficiency of learning sessions.
  • Study Intensity: A weighted composite feature derived from study duration and session frequency.
  • Resource Interaction: Interaction terms combining parental involvement levels with access to educational resources.
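
The three features above can be sketched roughly as follows. This is an illustration under assumptions, not the notebook's code. The field names (`study_hours`, `break_frequency`, `session_frequency`, `parental_involvement`, `resource_access`), the weights, and the `+ 1` guard against division by zero are all hypothetical, and the categorical levels are assumed to be already encoded as numbers.

```python
def add_psychometric_features(row, w_hours=0.7, w_sessions=0.3):
    """Return a copy of `row` with three derived features added.

    All field names and weights are hypothetical placeholders.
    """
    out = dict(row)

    # Cognitive efficiency: study hours per break.
    # The +1 (an assumption) avoids dividing by zero when no breaks are taken.
    out["cognitive_efficiency"] = row["study_hours"] / (row["break_frequency"] + 1)

    # Study intensity: weighted sum of study duration and session frequency.
    out["study_intensity"] = (
        w_hours * row["study_hours"] + w_sessions * row["session_frequency"]
    )

    # Resource interaction: product of two ordinal-encoded levels.
    out["resource_interaction"] = (
        row["parental_involvement"] * row["resource_access"]
    )
    return out


# Toy record for illustration only.
student = {
    "study_hours": 6.0,
    "break_frequency": 2,
    "session_frequency": 4,
    "parental_involvement": 2,
    "resource_access": 3,
}
features = add_psychometric_features(student)
```

The same expressions would apply column-wise to a data frame; a per-record function is used here only to keep the sketch dependency-free.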

2. Model Architecture & Ensemble Strategy

To balance bias and variance, a heterogeneous ensemble approach was implemented:

  • Component A (Structured Learning): XGBoost Regressor optimized for tabular data interactions.
  • Component B (Representation Learning): TabM / SENet (Squeeze-and-Excitation Network) to capture non-linear and latent data representations.
  • Fusion Strategy: A weighted averaging technique (Linear Blending) was applied to the predictions of Component A and Component B to produce the final output.
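
Linear blending amounts to a convex combination of the two models' predictions for each row. The sketch below assumes a single weight `weight_a` for Component A and `1 - weight_a` for Component B. The function name, the weight value, and the sample predictions are illustrative; the weights actually used in the submission are not stated here.

```python
def blend(pred_a, pred_b, weight_a=0.5):
    """Weighted average of two prediction lists (linear blending)."""
    if not 0.0 <= weight_a <= 1.0:
        raise ValueError("weight_a must lie in [0, 1]")
    if len(pred_a) != len(pred_b):
        raise ValueError("prediction lists must have the same length")
    weight_b = 1.0 - weight_a
    return [weight_a * a + weight_b * b for a, b in zip(pred_a, pred_b)]


# Toy predictions for illustration only.
xgb_preds = [70.0, 80.0]  # stand-in for Component A output
nn_preds = [74.0, 76.0]   # stand-in for Component B output
final_preds = blend(xgb_preds, nn_preds, weight_a=0.75)
```

In practice the weight would typically be chosen on held-out validation data rather than fixed in advance.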

Repository Structure

Kaggle_Student_Score_Project/
├── Code/
│   └── student-scores-tabm-xgb-advanced-fe.ipynb   # Main notebook containing FE and Modeling logic
├── Data_Output/
│   ├── My_Psych_XGB.csv                            # Predictions from the XGBoost model
│   ├── Deep_Learning.csv                           # Predictions from the Neural Network model
│   └── Final_Fusion_Conservative.csv               # Final submission file (Ensemble result)
└── RankingTop 9%.PNG                               # Leaderboard proof

About

Kaggle Playground Series S6E1 – Predicting Student Test Scores | Psychometric Feature Engineering & Hybrid Ensemble Modeling (Top 9%)
