Problem
The batch_predict endpoint currently re-trains models from scratch on every request, which is inefficient and wastes computational resources.
Root Cause
In the batch_predict function, models are trained every time instead of reusing previously trained models:
File: /app/services/gaze_tracker.py
Lines: ~330–340 in predict_new_data_simple() function
# ============================
# MODELS
# ============================
model_x = make_pipeline(StandardScaler(), Ridge(alpha=1.0))
model_y = make_pipeline(StandardScaler(), Ridge(alpha=1.0))
model_x.fit(X_train_x, y_train_x)  # <-- training on every request
model_y.fit(X_train_y, y_train_y)  # <-- training on every request

File: /app/routes/session.py
Line: ~175, where batch_predict() calls gaze_tracker.predict_new_data_simple()
Current Behavior
- Every prediction request re-reads the calibration CSV and retrains both models
- Training is repeated for the same calibration data
- Computational resources are used unnecessarily
Proposed Solution
Cache trained models in memory using a thread-safe LRU cache keyed by {calib_id} (see the sketch after this list):
- If the models for a calib_id already exist in the cache → reuse them
- If not → train, store the models in the cache, and let LRU eviction keep memory usage bounded
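A minimal sketch of what this could look like, assuming a module-level cache inside gaze_tracker.py. The names here (ModelCache, get_or_train, MODEL_CACHE, max_size) are placeholders for illustration, not existing code in the repo:

# Minimal sketch of a thread-safe LRU model cache keyed by calib_id.
# ModelCache / get_or_train / MODEL_CACHE are hypothetical names; the real
# integration point would be predict_new_data_simple() in
# /app/services/gaze_tracker.py.
import threading
from collections import OrderedDict

from sklearn.linear_model import Ridge
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


class ModelCache:
    def __init__(self, max_size=32):
        self._max_size = max_size
        self._lock = threading.Lock()
        self._cache = OrderedDict()  # calib_id -> (model_x, model_y)

    def get_or_train(self, calib_id, X_train_x, y_train_x, X_train_y, y_train_y):
        with self._lock:
            if calib_id in self._cache:
                # Cache hit: mark as most recently used and reuse the models.
                self._cache.move_to_end(calib_id)
                return self._cache[calib_id]

        # Cache miss: train outside the lock so other requests are not blocked.
        model_x = make_pipeline(StandardScaler(), Ridge(alpha=1.0))
        model_y = make_pipeline(StandardScaler(), Ridge(alpha=1.0))
        model_x.fit(X_train_x, y_train_x)
        model_y.fit(X_train_y, y_train_y)

        with self._lock:
            self._cache[calib_id] = (model_x, model_y)
            self._cache.move_to_end(calib_id)
            # Evict least recently used entries once the cache is full.
            while len(self._cache) > self._max_size:
                self._cache.popitem(last=False)
        return model_x, model_y


# Shared across requests; batch_predict would resolve calib_id and then call:
# model_x, model_y = MODEL_CACHE.get_or_train(calib_id, X_train_x, y_train_x,
#                                             X_train_y, y_train_y)
MODEL_CACHE = ModelCache(max_size=32)

With a small max_size the cache stays bounded. Two concurrent misses for the same calib_id could at worst train twice before one result overwrites the other, which is still far cheaper than retraining on every request.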
Benefits
- Eliminates redundant training computation
- Reduces server CPU usage

If there's a particular reason this wasn't implemented initially, please let me know.