Reproducibility: Physiological filtering, unified prediction saving, batch testing, docs #36
Open
b55585wy wants to merge 12 commits into thuhci:paper from b55585wy:paper-code-clean
Conversation
…ltin-conflict: Fix parameter name conflicts with built-in functions
…ltin-conflict: Fix parameter name conflicts with built-in functions in preprocess.py
Add physiological range filtering to improve model training quality by removing samples with physiologically impossible values.

Implementation:
- Define medically reasonable ranges for vital signs (HR, RR, SpO2, BP)
- Apply filtering during the training phase to avoid learning from outliers
- Apply filtering during the validation phase for accurate metrics
- No filtering during the test phase, to evaluate true model performance

Physiological ranges defined:
- Heart rate (HR): 40-200 bpm
- Respiratory rate (RR): 4-30 breaths/min
- Blood oxygen saturation (SpO2): 75-100%
- Systolic blood pressure (SBP): 60-260 mmHg
- Diastolic blood pressure (DBP): 30-200 mmHg

Files modified:
- utils/utils.py: Add filtering functions (+150 lines)
- trainer/load_trainer.py: Apply filtering in train/valid loops (+80 lines)

Usage: The filtering is applied automatically when using the standard training pipeline. Filters can be customized via the physiological_filter() function parameters.
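A minimal sketch of how such range filtering can work, assuming NumPy arrays of labels. The PHYSIO_RANGES dictionary and this physiological_filter signature are illustrative only, not the exact implementation in utils/utils.py:

```python
import numpy as np

# Medically reasonable ranges, as listed in the commit message.
PHYSIO_RANGES = {
    "HR":   (40, 200),   # bpm
    "RR":   (4, 30),     # breaths/min
    "SpO2": (75, 100),   # %
    "SBP":  (60, 260),   # mmHg
    "DBP":  (30, 200),   # mmHg
}

def physiological_filter(inputs, targets, task, ranges=PHYSIO_RANGES):
    """Drop samples whose target lies outside the medically plausible range."""
    low, high = ranges[task]
    mask = (targets >= low) & (targets <= high)
    return inputs[mask], targets[mask]

# Example: keep only plausible heart-rate samples during training/validation.
x = np.random.randn(5, 128)
y = np.array([55.0, 250.0, 72.0, 38.0, 180.0])
x_f, y_f = physiological_filter(x, y, task="HR")  # drops 250 and 38 bpm
```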
Add save_detailed_predictions() function to save prediction pairs during testing.

Features:
- Save predictions to predictions/<exp_name>/<fold>.csv
- Support multi-task accumulation in a single CSV file
- Include task, fold, exp_name metadata fields
- Optional via the --save-predictions command-line flag

Output CSV format:
- Columns: prediction, target, task, fold, exp_name
- One row per prediction sample
- Multiple tasks can be stored in the same file

File modified:
- main.py: Add save_detailed_predictions() function and integration (+41 lines)

Usage: python main.py --config <config.json> --save-predictions
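A hedged sketch of how the --save-predictions flag could be wired into main.py's argument parsing; only the flag name, the --config option, and the output path come from the commit, the surrounding argparse setup is an assumption:

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("--config", required=True, help="path to the experiment config JSON")
parser.add_argument("--save-predictions", action="store_true",
                    help="write prediction/target pairs to predictions/<exp_name>/<fold>.csv")
args = parser.parse_args()

# Hypothetical call site after testing each fold:
# if args.save_predictions:
#     save_detailed_predictions(preds, targets, task, fold, exp_name)
```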
Add test_all_models_predictions.py for batch testing all trained models and generating detailed predictions with metadata.

Features:
- Automatically find and test all models in models/ directory
- Generate predictions with complete metadata:
  * subject_id: participant identifier
  * scenario: test scenario (sitting, standing, etc.)
  * start_time, end_time: data timestamps
  * task, exp_name: experiment info
- Support all tasks: HR, RR, SpO2, BP (SBP and DBP)
- Handle multiple folds automatically
- Save to predictions/<exp_name>/<fold>.csv

Implementation:
- DetailedDataset class: custom dataset with metadata collection
- Custom collate function: handle metadata in batches
- Batch processing: test all models efficiently

Output format: prediction, target, subject_id, scenario, start_time, end_time, task, exp_name

File added:
- test_all_models_predictions.py: Complete testing script (334 lines)

Usage:
python test_all_models_predictions.py --models <model_name> [<model_name> ...]
# Or test all models:
python test_all_models_predictions.py
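An illustrative sketch (not the actual 334-line script) of the metadata-aware dataset and collate function described above; the DetailedDataset name and metadata keys follow the commit, while the constructor arguments and tensor handling are assumptions:

```python
import torch
from torch.utils.data import Dataset, DataLoader

class DetailedDataset(Dataset):
    """Wraps signals/labels plus a per-sample metadata dict
    (subject_id, scenario, start_time, end_time)."""
    def __init__(self, signals, labels, metadata):
        self.signals, self.labels, self.metadata = signals, labels, metadata

    def __len__(self):
        return len(self.labels)

    def __getitem__(self, idx):
        x = torch.as_tensor(self.signals[idx], dtype=torch.float32)
        y = float(self.labels[idx])
        return x, y, self.metadata[idx]

def collate_with_metadata(batch):
    """Stack tensors as usual, but keep metadata as a plain list of dicts
    so non-numeric fields survive batching."""
    xs, ys, metas = zip(*batch)
    return torch.stack(xs), torch.tensor(ys), list(metas)

# loader = DataLoader(dataset, batch_size=32, collate_fn=collate_with_metadata)
```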
Add documentation for physiological filtering and prediction analysis features.

Sections added:
- Physiological Range Filtering: Medical range definitions and automatic application
- Detailed Prediction Pairs: How to save predictions from main.py training
- Batch Testing with Metadata: How to use the test_all_models_predictions.py script

File modified:
- README.md: Add Advanced Features section with usage examples (+55 lines)
Move test_all_models_predictions.py to the scripts/ directory for better organization.

Changes:
- Move test_all_models_predictions.py to scripts/
- Update scripts/README.md with comprehensive documentation for the test script
- Add usage examples and feature descriptions

Files:
- scripts/test_all_models_predictions.py (moved)
- scripts/README.md (updated with test script documentation)
Refactor to use the unified save_prediction_pairs_detailed() from utils.py instead of duplicate implementations.

Changes:
- main.py: use save_prediction_pairs_detailed() from utils
- scripts/test_all_models_predictions.py: use the unified function
- Eliminates code duplication

Benefits:
- Single source of truth for saving logic
- Consistent output format
- Easier maintenance
- main.py: saves without metadata
- test script: saves with metadata (subject_id, scenario, timestamps)
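A minimal sketch of the unified saving helper: the function name, output layout, and column names follow the commits above, while this pandas-based implementation and the exact signature are assumptions:

```python
import os
import pandas as pd

def save_prediction_pairs_detailed(preds, targets, task, fold, exp_name,
                                   metadata=None, out_dir="predictions"):
    """Append prediction/target pairs to predictions/<exp_name>/<fold>.csv,
    optionally joined with per-sample metadata columns."""
    df = pd.DataFrame({"prediction": preds, "target": targets,
                       "task": task, "fold": fold, "exp_name": exp_name})
    if metadata is not None:
        # metadata: list of dicts with subject_id, scenario, start_time, end_time
        df = pd.concat([df, pd.DataFrame(metadata)], axis=1)
    exp_dir = os.path.join(out_dir, str(exp_name))
    os.makedirs(exp_dir, exist_ok=True)
    path = os.path.join(exp_dir, f"{fold}.csv")
    # Append mode lets several tasks accumulate in the same per-fold CSV.
    df.to_csv(path, mode="a", header=not os.path.exists(path), index=False)

# main.py (no metadata):
#   save_prediction_pairs_detailed(preds, targets, "HR", fold, exp_name)
# scripts/test_all_models_predictions.py (with metadata):
#   save_prediction_pairs_detailed(preds, targets, "HR", fold, exp_name, metadata=metas)
```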
Add scripts for generating statistical tables and Excel reports from predictions.

Scripts added:
- run_statistics.py: Main entry point (109 lines)
- calculate_metrics_statistics.py: Metrics computation (588 lines)
- generate_excel_report.py: Excel report generation (175 lines)

Features:
- Compute MAE, RMSE, MAPE, Pearson across experiments
- Weighted averaging by sample size
- Generate tables by channel, scenario, task
- Consolidate into an Excel workbook

Output:
- table1_*.csv: Performance by channel
- table2_*.csv: Performance by scenario group
- table3_*.csv: Channel comparison
- table4_*.csv: Best model recommendations
- statistical_report.xlsx: Excel consolidation

Usage: python scripts/run_statistics.py

Complete workflow: Train → Test → Analyze
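A hedged sketch of the aggregation step, assuming the per-fold prediction CSVs produced earlier; the grouping and sample-size weighting shown here illustrate the idea rather than reproduce calculate_metrics_statistics.py:

```python
import glob
import numpy as np
import pandas as pd

def metrics(df):
    """MAE, RMSE, MAPE, Pearson plus sample count for one group of predictions."""
    err = df["prediction"] - df["target"]
    return pd.Series({
        "n": len(df),
        "MAE": err.abs().mean(),
        "RMSE": np.sqrt((err ** 2).mean()),
        "MAPE": (err.abs() / df["target"].abs()).mean() * 100,
        "Pearson": df["prediction"].corr(df["target"]),
    })

# Collect all prediction CSVs written by the saving helper.
frames = [pd.read_csv(p) for p in glob.glob("predictions/*/*.csv")]
all_preds = pd.concat(frames, ignore_index=True)
per_exp = all_preds.groupby(["exp_name", "task"]).apply(metrics)

# Weighted average across experiments for one task, weighted by sample size.
hr = per_exp.xs("HR", level="task")
weighted_mae = np.average(hr["MAE"], weights=hr["n"])
```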
Motivation
Improve reproducibility for the paper: add physiological filtering, unify prediction saving, and add batch testing and documentation.
Changes
- save_prediction_pairs_detailed; fix HR (40–200), RR (4–30)
- utils.save_prediction_pairs_detailed
- channels_first for non-InceptionTime
- README.md and scripts/README.md
Compatibility
Non-breaking; channels_first only for InceptionTime and handled in scripts.
Usage
See READMEs; batch testing via scripts/test_all_models_predictions.py; predictions saved as CSV with metadata.
QA
Unit tests for filtering/saving; manual training and batch test runs verified.
Affected Files
README.md; scripts/README.md; utils/utils.py; trainer/load_trainer.py; main.py; scripts/test_all_models_predictions.py
PS
Please refer to Baidu Netdisk for the relevant configuration files and reproduction conditions.