feat: Add model caching to batch_predict endpoint to eliminate redund…#68

Open
sohampirale wants to merge 3 commits into ruxailab:main from sohampirale:feat/pkl_files_trained_models

Conversation

@sohampirale commented Feb 15, 2026

Summary

This PR implements model caching in the predict_new_data_simple function to eliminate redundant model training in the batch_predict endpoint, addressing the performance issue reported in #61.

Fixes #61.


Changes Made

  • Added pickle file caching mechanism for trained models using {calib_id}_model_x.pkl and {calib_id}_model_y.pkl naming convention
  • Implemented cache lookup before training - loads cached models if they exist (see the sketch after this list)
  • Added fallback mechanism to train new models if cache is not available
  • Automatically saves newly trained models to cache for future use
  • Maintains existing functionality while improving performance
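
For illustration, here is a minimal sketch of the file-based flow described in this list. The cache directory, helper names, and the train_models callable are assumptions for the example rather than the actual code in app/services/gaze_tracker.py; only the {calib_id}_model_x.pkl / {calib_id}_model_y.pkl naming follows the PR.

```python
import os
import pickle

CACHE_DIR = "model_cache"  # illustrative location, not the path used in the PR


def _cache_paths(calib_id):
    # {calib_id}_model_x.pkl / {calib_id}_model_y.pkl naming convention from the PR
    return (
        os.path.join(CACHE_DIR, f"{calib_id}_model_x.pkl"),
        os.path.join(CACHE_DIR, f"{calib_id}_model_y.pkl"),
    )


def load_or_train_models(calib_id, train_models):
    """Return (model_x, model_y), loading from the pickle cache when possible."""
    path_x, path_y = _cache_paths(calib_id)

    # Cache lookup before training: load cached models if they exist
    if os.path.exists(path_x) and os.path.exists(path_y):
        with open(path_x, "rb") as fx, open(path_y, "rb") as fy:
            return pickle.load(fx), pickle.load(fy)

    # Fallback: train new models, then save them to the cache for future requests
    model_x, model_y = train_models()
    os.makedirs(CACHE_DIR, exist_ok=True)
    with open(path_x, "wb") as fx, open(path_y, "wb") as fy:
        pickle.dump(model_x, fx)
        pickle.dump(model_y, fy)
    return model_x, model_y
```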

Performance Impact

  • Eliminates redundant model training on every batch_predict request
  • Reduces computational overhead for repeated predictions with same calibration data
  • Improves response time for subsequent requests with cached models

Files Changed

  • app/services/gaze_tracker.py: Added caching logic to predict_new_data_simple function

Testing

  • Verified cache creation and loading works correctly
  • Confirmed fallback mechanism works when cache is unavailable
  • Ensured existing functionality remains intact

Caching tested

  • The implemented logic works correctly in both single-threaded and multi-threaded use (see the test sketch after this section)
(screenshot of the caching test run attached in the original PR)
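
A rough sketch of the kind of concurrency check meant here, assuming a cached lookup of the form get_models(calib_id, trainer) like the helpers sketched in this thread; the fake trainer, thread counts, and assertions are illustrative, not the project's actual tests.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

train_calls = 0
_count_lock = threading.Lock()


def fake_trainer():
    """Stand-in trainer that records how many times training actually runs."""
    global train_calls
    with _count_lock:
        train_calls += 1
    return object(), object()  # placeholders for model_x, model_y


def run_concurrency_check(get_models):
    """Fire many concurrent lookups for the same calib_id through the cache."""
    global train_calls
    train_calls = 0
    with ThreadPoolExecutor(max_workers=8) as pool:
        results = list(pool.map(
            lambda _: get_models("calib_123", fake_trainer),
            range(32),
        ))
    # Every concurrent caller must receive a usable (model_x, model_y) pair.
    assert all(m is not None for pair in results for m in pair)
    # With a thread-safe cache, the same calib_id is trained exactly once
    # even though 32 requests raced for it.
    assert train_calls == 1
```

Passing the project's cached lookup (or the in-memory LRU cache sketched further down) into run_concurrency_check would exercise the multi-threaded path described above.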

Deployment Note

  • The current cache is in-memory and scoped per process.
  • In a multi-worker deployment (e.g., multiple Gunicorn workers), each process maintains its own cache.
  • This means the same calib_id may still be retrained in different processes.
  • If horizontal scaling becomes necessary, a shared cache solution (e.g., Redis) can be considered.

@midaa1 (Contributor) commented Feb 15, 2026

Saving the models as .pkl files actually has a negative impact and will cause several serious problems:

  • File size and efficiency -> persisting a file per model accumulates and can lead to memory/storage problems.
  • Security risks -> arbitrary code execution: pickle can run arbitrary Python when unpickling, so a tampered .pkl file can execute code on load.

You could do it with joblib.dump() instead, but I don't recommend that either because of the same memory problems. I think caching and using the model on the fly would be the best practice here.

@sohampirale (Author) commented

Thanks for pointing out the pickle security issues.

Addressed the pickle security concerns by implementing a thread-safe in-memory LRU cache. This replaces the file-based caching with dictionary-based caching that automatically manages memory usage.
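
For reference, a minimal sketch of what such a thread-safe in-memory LRU cache could look like; the class name, default size, and get_or_train signature are assumptions for illustration rather than the exact code added to gaze_tracker.py.

```python
import threading
from collections import OrderedDict


class ModelCache:
    """Thread-safe in-memory LRU cache keyed by calib_id."""

    def __init__(self, max_size=32):
        self._lock = threading.Lock()
        self._cache = OrderedDict()  # calib_id -> (model_x, model_y)
        self._max_size = max_size

    def get_or_train(self, calib_id, train_models):
        with self._lock:
            if calib_id in self._cache:
                # Cache hit: mark as most recently used and skip retraining.
                self._cache.move_to_end(calib_id)
                return self._cache[calib_id]
            # Training under the lock keeps things simple and guarantees the
            # same calib_id is never trained twice by concurrent requests;
            # per-key locks could reduce contention if training is slow.
            models = train_models()
            self._cache[calib_id] = models
            if len(self._cache) > self._max_size:
                self._cache.popitem(last=False)  # evict least recently used
            return models
```

A single module-level instance (e.g. MODEL_CACHE = ModelCache()) shared by all requests would give the per-process scoping described in the deployment note above.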

Linked issue: Performance: Eliminate redundant model training in batch_predict endpoint