feat: Validation Split & MLflow Tracking #62
Conversation
Manual Action Required! This pull request is not yet linked to an issue. Please update the description to include 'Closes: #issue_number' in the appropriate section. This is required to pass validation and initiate metadata synchronization.

Thank you for opening this PR! Our automated system is currently verifying the PR requirements.
Validation Successful! This pull request has been verified and linked to issue #22. The system is now synchronizing metadata from the referenced issue. Kindly await maintainer review of your changes.
Hey, CI is failing on `test_model_save_load_preserves_preprocessing`.
Thanks @Satyamgupta2365 for the review. I'll keep this in mind; the work is currently in progress.
Hi @Satyamgupta2365, I've finished the PR from my side and it is ready for review. Thanks!
@SK8-infi Thank you for the effort! While validation tracking is a great addition, this PR introduces critical architectural issues that must be addressed:
- Optimizer State Reset: Moving the training loop to Python causes a new Adam/SGD instance to be created every epoch. This resets Adam's moment estimates and time step to zero, breaking its adaptive logic.
- Performance Overhead: Calling the Rust backend once per epoch significantly increases FFI marshaling and context-switching overhead.
- Regressions: The code lacks the `batch_size` parameter recently merged in PR #63, which will cause build and runtime failures.
Requested Changes:
- Revert the Python Loop: Move the validation logic into the Rust `train` method so the optimizer remains persistent.
- Sync with Main: Update signatures to include the mandatory `batch_size` parameter.
- Preserve `forward()`: Keep this new method in `lib.rs` as it is a valuable utility.
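The optimizer-reset issue called out above can be illustrated with a minimal, self-contained sketch. The `Adam` class and variable names here are hypothetical stand-ins, not the project's actual API; the point is only that constructing a fresh optimizer inside the epoch loop zeroes Adam's moment estimates and time step every epoch:

```python
# Hypothetical minimal Adam for a single scalar parameter, to show why
# re-creating the optimizer each epoch breaks its adaptive state.
class Adam:
    def __init__(self, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
        self.lr, self.beta1, self.beta2, self.eps = lr, beta1, beta2, eps
        self.m = 0.0   # first moment estimate
        self.v = 0.0   # second moment estimate
        self.t = 0     # time step (used for bias correction)

    def step(self, param, grad):
        self.t += 1
        self.m = self.beta1 * self.m + (1 - self.beta1) * grad
        self.v = self.beta2 * self.v + (1 - self.beta2) * grad ** 2
        m_hat = self.m / (1 - self.beta1 ** self.t)
        v_hat = self.v / (1 - self.beta2 ** self.t)
        return param - self.lr * m_hat / (v_hat ** 0.5 + self.eps)

# Broken pattern (Python-side loop): new optimizer per epoch.
param = 1.0
for epoch in range(3):
    opt = Adam()             # m, v, t all reset to zero here
    param = opt.step(param, grad=0.5)
    assert opt.t == 1        # time step never advances past 1

# Correct pattern (persistent optimizer, as in the Rust train method):
opt = Adam()
param = 1.0
for epoch in range(3):
    param = opt.step(param, grad=0.5)
assert opt.t == 3            # state accumulates across epochs
```

With the persistent optimizer, the bias-correction terms and moment estimates evolve across epochs as Adam expects; with the per-epoch instance, every epoch behaves like step one of a fresh run.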
@SK8-infi Please update us on the status. You have 24 hours to address the issues or we'll close this PR and reassign the task to keep development on track.
… into feat/val-split-track
Had to make some changes in the tests, since the new return signature is `(train_losses, val_losses)`.
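For anyone updating callers: the change described above means tests that previously bound a single loss history now unpack a tuple. A hypothetical sketch (the `train` stub here is a stand-in for `Model.train()`, whose real implementation lives in the Rust backend):

```python
def train(epochs=3):
    # Stand-in for Model.train(); real training happens in the Rust backend.
    # Returns per-epoch loss histories for both the train and validation sets.
    train_losses = [1.0 / (e + 1) for e in range(epochs)]
    val_losses = [1.2 / (e + 1) for e in range(epochs)]
    return train_losses, val_losses

# Old tests did: losses = train()
# New tests must unpack the tuple:
train_losses, val_losses = train(epochs=3)
assert len(train_losses) == len(val_losses) == 3
```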
@SK8-infi Build tests are failing.
Seems to be an issue that happened while resolving conflicts. Looking into it.
Fixes
Closes: #22
Type of Change
Description
Implemented validation split and MLflow tracking for monitoring model generalization performance.
Changes:
- Added `validation_split: float = 0.2` parameter to the `Model.train()` method
- Added `_calculate_validation_loss()` method supporting both classification (cross-entropy) and regression (MSE)
- Added `forward()` method to expose raw model outputs for validation loss calculation
- Updated `save_model()` to log both `loss` and `val_loss` metrics to MLflow with epoch steps

Result: MLflow dashboard now displays overlapping train/validation loss curves for monitoring overfitting and generalization.
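The split-and-log flow described above can be sketched as follows. This is a hypothetical illustration, not the PR's actual implementation: the helper name `split_train_val` and the list-based data are assumptions, and the MLflow calls are shown as comments since logging requires an active run. (`mlflow.log_metric(key, value, step=...)` is the real MLflow API for per-epoch metrics.)

```python
import random

def split_train_val(X, y, validation_split=0.2, seed=42):
    """Shuffle indices and hold out a fraction for validation.

    Hypothetical sketch; the PR's actual split logic may differ.
    """
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    n_val = int(len(X) * validation_split)
    val_idx, train_idx = idx[:n_val], idx[n_val:]
    train = ([X[i] for i in train_idx], [y[i] for i in train_idx])
    val = ([X[i] for i in val_idx], [y[i] for i in val_idx])
    return train, val

X = [[float(i)] for i in range(10)]
y = [i % 2 for i in range(10)]
train, val = split_train_val(X, y, validation_split=0.2)
# validation_split=0.2 on 10 samples -> 2 validation, 8 training samples

# Each epoch would then log both curves to MLflow, e.g.:
#   mlflow.log_metric("loss", train_loss, step=epoch)
#   mlflow.log_metric("val_loss", val_loss, step=epoch)
```

Logging both metrics with the same `step` is what makes the train/validation curves overlap on the MLflow dashboard.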
How Has This Been Tested?
- Added `test_validation_split.py` covering classification, regression, and edge cases
- Verified the `forward()` method works correctly with the Python API

Screenshots / Logs
Contribution Context