- 94% Classification Accuracy across seven distinct human activities
- 12% Accuracy Improvement over the baseline through preprocessing and optimization
- Comprehensive ML Pipeline implementation for time-series data
- Robust Feature Engineering with multiple selection techniques

This project implements a machine learning pipeline that classifies human activities from multi-dimensional time-series data recorded by body-worn sensors. It processes the AReM (Activity Recognition system based on Multisensor data fusion) dataset and assigns each recording to one of seven distinct activities.

graph LR
A[Raw Sensor Data] --> B[Preprocessing]
B --> C[Feature Engineering]
C --> D[Feature Selection]
D --> E[Model Training]
E --> F[Evaluation]
- Time Series Processing
  - Dynamic segmentation into up to 20 segments per series
  - Temporal pattern extraction
  - Local characteristic analysis
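
As a minimal sketch of the segmentation step, the snippet below splits one series into consecutive, near-equal chunks. The helper name `split_into_segments` and the use of `np.array_split` are assumptions for illustration; the project's own segmentation code may differ.

```python
import numpy as np

def split_into_segments(series, n_segments):
    """Split a 1-D series into n_segments consecutive, near-equal chunks."""
    series = np.asarray(series, dtype=float)
    if not 1 <= n_segments <= len(series):
        raise ValueError("n_segments must be between 1 and len(series)")
    # np.array_split tolerates lengths that are not an exact multiple
    # of n_segments; the leading chunks get one extra value each.
    return np.array_split(series, n_segments)

series = np.arange(480)                  # stand-in for one 480-value series
segments = split_into_segments(series, 20)
print(len(segments), len(segments[0]))   # 20 24
```

Summary statistics can then be computed per chunk, so each series contributes local rather than only global characteristics.
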
- Statistical Feature Extraction

features = {
    'statistical': ['min', 'max', 'mean', 'median'],
    'distribution': ['std', 'q1', 'q3'],
    'temporal': ['sliding_windows', 'segment_analysis']
}
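
The seven summary statistics listed above could be computed along these lines. This is a sketch, not the project's code: `summary_features` is a hypothetical helper, and the choice of the sample standard deviation (`ddof=1`) and NumPy's default linear-interpolation quartiles are assumptions.

```python
import numpy as np

def summary_features(series):
    """Return min, max, mean, median, std, q1 and q3 for one series."""
    x = np.asarray(series, dtype=float)
    q1, median, q3 = np.percentile(x, [25, 50, 75])
    return {
        "min": x.min(),
        "max": x.max(),
        "mean": x.mean(),
        "median": median,
        "std": x.std(ddof=1),   # sample standard deviation (assumption)
        "q1": q1,
        "q3": q3,
    }

print(summary_features([1, 2, 3, 4, 5]))
```

Applying the helper to every segment of every series and concatenating the results yields one feature vector per instance.
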
- Dimensionality Reduction
  - Principal Component Analysis (PCA)
  - Recursive Feature Elimination (RFE)
  - Cross-validated feature selection
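
The two reduction approaches can be sketched with scikit-learn as follows. The data here is synthetic (`make_classification`), and the 95% variance threshold, the logistic-regression estimator inside `RFECV`, and the 5-fold setting are illustrative assumptions rather than the project's actual configuration.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.feature_selection import RFECV
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data; not the AReM features.
X, y = make_classification(n_samples=120, n_features=20,
                           n_informative=5, random_state=0)
X = StandardScaler().fit_transform(X)

# PCA: keep as many components as needed to explain 95% of the variance.
pca = PCA(n_components=0.95, svd_solver="full")
X_pca = pca.fit_transform(X)

# RFE with cross-validation: drop one feature per step and keep the
# subset size with the best mean cross-validated accuracy.
selector = RFECV(LogisticRegression(max_iter=1000), step=1, cv=5,
                 scoring="accuracy")
selector.fit(X, y)

print("PCA components kept:", X_pca.shape[1])
print("RFECV features kept:", selector.n_features_)
```

PCA builds new axes from combinations of all inputs, whereas RFE keeps a subset of the original features, which is easier to interpret when individual statistics matter.
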
- SMOTE Implementation
  - Synthetic sample generation
  - Balanced class distribution
  - Enhanced model robustness
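
To illustrate how SMOTE generates synthetic samples, here is a simplified NumPy sketch of its core idea: pick a minority sample, pick one of its nearest minority neighbours, and interpolate between them. The function `smote_like_oversample` is hypothetical and written for this example; it is not the library implementation (the imbalanced-learn package provides a `SMOTE` class, which is the more likely choice in practice).

```python
import numpy as np

def smote_like_oversample(X_minority, n_new, k=5, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbours."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X_minority, dtype=float)
    k = min(k, len(X) - 1)
    if k < 1:
        raise ValueError("need at least two minority samples")
    # Pairwise Euclidean distances; column 0 of the sorted indices is
    # normally the point itself, so it is skipped.
    dists = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    neighbours = np.argsort(dists, axis=1)[:, 1:k + 1]
    base = rng.integers(0, len(X), size=n_new)
    picked = neighbours[base, rng.integers(0, k, size=n_new)]
    gap = rng.random((n_new, 1))          # interpolation factor in [0, 1)
    return X[base] + gap * (X[picked] - X[base])

minority = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
synthetic = smote_like_oversample(minority, n_new=10, k=2)
print(synthetic.shape)   # (10, 2)
```

Oversampling is normally applied to the training split only, so that synthetic points never leak into the data used for evaluation.
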
- Regularization Techniques
  - L1 (Lasso) for feature sparsity
  - L2 (Ridge) for overfitting prevention
  - Cross-validation for hyperparameter tuning
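
The contrast between the two penalties can be seen in a small experiment. The sketch below tunes `C` by 5-fold cross-validation for each penalty and counts coefficients that are exactly zero. The synthetic data, the `saga` solver, and the `C` grid are assumptions chosen for the example.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data; not the AReM features.
X, y = make_classification(n_samples=200, n_features=30,
                           n_informative=5, random_state=0)
X = StandardScaler().fit_transform(X)

zero_counts = {}
for penalty in ("l1", "l2"):
    # saga supports both penalties; C is the inverse regularization strength.
    base = LogisticRegression(penalty=penalty, solver="saga",
                              max_iter=5000, random_state=0)
    search = GridSearchCV(base, {"C": [0.01, 0.1, 1.0, 10.0]},
                          cv=5, scoring="accuracy")
    search.fit(X, y)
    coef = search.best_estimator_.coef_
    zero_counts[penalty] = int(np.sum(coef == 0))
    print(penalty, search.best_params_,
          "zero coefficients:", zero_counts[penalty])
```

L1 tends to drive uninformative coefficients to exactly zero, which acts as built-in feature selection, while L2 shrinks all coefficients without removing any.
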
- Performance Metrics

metrics = {
    'accuracy': 0.94,
    'improvement': '12%',
    'cross_validation': '5-fold',
    'evaluation': ['confusion_matrix', 'ROC_curves']
}
- Overall Accuracy: 94%
- Improvement: 12% increase over the baseline
- Cross-Validation: 5-fold, with consistent performance across folds
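
An evaluation of this kind might be set up as below: out-of-fold predictions from 5-fold stratified cross-validation feed both the accuracy and the confusion matrix. The data is synthetic with seven classes to mirror the task, so the printed numbers are not the project's results; the classifier settings are assumptions.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import StratifiedKFold, cross_val_predict

# Synthetic seven-class stand-in data; not the AReM features.
X, y = make_classification(n_samples=210, n_features=12, n_informative=6,
                           n_classes=7, n_clusters_per_class=1,
                           random_state=0)

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
clf = LogisticRegression(max_iter=2000)

# Each sample is predicted by a model that never saw it during training.
y_pred = cross_val_predict(clf, X, y, cv=cv)

cm = confusion_matrix(y, y_pred)   # rows: true class, columns: predicted
print("accuracy:", accuracy_score(y, y_pred))
print(cm)
```

The diagonal of the matrix counts correct predictions per class, so off-diagonal cells show which activities are confused with one another.
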
key_features = {
    'temporal': ['segment_patterns', 'window_statistics'],
    'statistical': ['quartile_ranges', 'distribution_metrics'],
    'engineered': ['custom_indicators', 'derived_features']
}
- Raw Data Handling
  - 88 instances, each with 6 time series
  - 480 consecutive values per series
  - 7 distinct activity classes
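
A quick calculation shows how fast the feature count grows relative to the 88 instances. It assumes all seven summary statistics (min, max, mean, median, std, q1, q3) are computed for every segment of every series, which is an assumption about the feature layout rather than something stated here.

```python
import numpy as np

N_INSTANCES, N_SERIES, N_VALUES = 88, 6, 480   # from the dataset description
N_STATS = 7   # min, max, mean, median, std, q1, q3 (assumed per segment)

def feature_count(n_segments):
    """Number of features per instance for a given segment count."""
    return N_SERIES * N_STATS * n_segments

raw = np.zeros((N_INSTANCES, N_SERIES, N_VALUES))   # placeholder array
print(raw.shape)                                    # (88, 6, 480)
for n_segments in (1, 5, 20):
    print(n_segments, feature_count(n_segments))
# 1 42
# 5 210
# 20 840
```

With 20 segments there would be 840 features for only 88 instances, which is why feature selection and regularization matter in this setting.
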
- Feature Engineering Framework
  - Statistical feature extraction
  - Temporal pattern analysis
  - Dimensionality optimization
- Model Architecture
  - Regularized logistic regression
  - SMOTE for class balancing
  - Cross-validated parameter tuning
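
One way to assemble these pieces is a scikit-learn `Pipeline` tuned with `GridSearchCV`, sketched below on synthetic seven-class data. The step names, the PCA stage, and the parameter grid are assumptions for illustration. The SMOTE step is left out because it needs a pipeline that supports resampling (imbalanced-learn ships one), so that oversampling is applied to training folds only.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic seven-class stand-in data; not the AReM features.
X, y = make_classification(n_samples=210, n_features=40, n_informative=8,
                           n_classes=7, n_clusters_per_class=1,
                           random_state=0)

# Every step is refit inside each training fold, which avoids leaking
# scaling or projection statistics from the validation fold.
pipe = Pipeline([
    ("scale", StandardScaler()),
    ("reduce", PCA()),
    ("clf", LogisticRegression(max_iter=2000)),   # L2 penalty by default
])
param_grid = {
    "reduce__n_components": [5, 10, 20],
    "clf__C": [0.1, 1.0, 10.0],
}
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
search = GridSearchCV(pipe, param_grid, cv=cv, scoring="accuracy")
search.fit(X, y)

print(search.best_params_, round(search.best_score_, 3))
```

Keeping preprocessing inside the pipeline means the cross-validated score reflects the whole workflow rather than the classifier alone.
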