This project implements a Deep Learning solution (LSTM) to predict the Remaining Useful Life (RUL) of turbofan engines using the NASA CMAPSS dataset. The goal is to facilitate Predictive Maintenance (PdM) strategies by forecasting engine failure before it occurs.
The model processes multivariate time-series sensor data to estimate how many operational cycles an engine has left before failure.
| Dataset | Conditions | Fault Modes | Model | RMSE Score |
|---|---|---|---|---|
| FD001 | Sea Level | HPC Degradation | LSTM | 13.14 |
| FD003 | Sea Level | HPC + Fan Degradation | LSTM | 41.45 |
Note: The baseline RMSE for FD001 in the literature is typically between 15.0 and 17.0. This model reaches 13.14, largely due to its preprocessing pipeline (RUL clipping, sensor selection, exponential smoothing).
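For reference, the RMSE reported in the table is the square root of the mean squared difference between predicted and true RUL, so it is expressed in cycles. The snippet below is a minimal sketch of that metric; the function name `rmse` and the sample values are illustrative assumptions, not taken from this repository.

```python
import numpy as np


def rmse(y_true, y_pred):
    """Root mean squared error between true and predicted RUL (in cycles)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_pred - y_true) ** 2)))


# Hypothetical RUL values for three engines, for illustration only.
example = rmse([100, 50, 20], [90, 55, 20])
print(round(example, 2))  # 6.45
```

Because errors are squared before averaging, a few badly mispredicted engines raise the score more than many small misses.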
- RUL Clipping: Training RUL was clipped at 125 cycles. Degradation signals are negligible in early life; forcing the model to predict RUL > 125 adds noise and reduces accuracy.
- Sensor Selection: Identified and dropped 7 sensors with zero variance (using EDA).
- Exponential Smoothing: Applied Exponential Weighted Moving Average (alpha=0.1) to denoise high-frequency sensor jitter, exposing the true degradation trend.
- Sliding Window: Transformed 2D sensor data into 3D tensors `(Samples, Window_Size, Features)` using a 30-cycle lookback window.
- MinMax Scaling: Normalized all sensor inputs to the [0, 1] range to aid LSTM convergence.
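The clipping, smoothing, and windowing steps above can be sketched roughly as follows. This is not the code in `src/data_utils.py`: the column names (`unit`, `cycle`, `rul`, `s1`, `s2`), the function names, the `adjust=False` EWMA variant, and the choice to label each window with the RUL at its last cycle are all assumptions made for illustration. Sensor selection and MinMax scaling are left out to keep the sketch short.

```python
import numpy as np
import pandas as pd

RUL_CAP = 125   # training RUL ceiling, as stated above
ALPHA = 0.1     # EWMA smoothing factor, as stated above
WINDOW = 30     # lookback window in cycles, as stated above


def add_clipped_rul(df):
    """Add an 'rul' column: cycles left until the unit's last cycle, capped at RUL_CAP."""
    out = df.copy()
    last_cycle = out.groupby("unit")["cycle"].transform("max")
    out["rul"] = (last_cycle - out["cycle"]).clip(upper=RUL_CAP)
    return out


def smooth_sensors(df, sensor_cols):
    """Apply an EWMA to each sensor, separately for each engine unit."""
    out = df.copy()
    out[sensor_cols] = out.groupby("unit")[sensor_cols].transform(
        lambda s: s.ewm(alpha=ALPHA, adjust=False).mean()  # adjust=False is an assumption
    )
    return out


def make_windows(df, sensor_cols):
    """Return X of shape (samples, WINDOW, features) and y of shape (samples,)."""
    X, y = [], []
    for _, g in df.groupby("unit"):
        g = g.sort_values("cycle")
        values = g[sensor_cols].to_numpy(dtype=float)
        targets = g["rul"].to_numpy(dtype=float)
        for end in range(WINDOW, len(g) + 1):
            X.append(values[end - WINDOW:end])
            y.append(targets[end - 1])  # label = RUL at the window's last cycle
    return np.asarray(X), np.asarray(y)


# Toy data: two engines with 200 and 40 cycles and two synthetic sensors.
rng = np.random.default_rng(0)
frames = []
for unit, n in [(1, 200), (2, 40)]:
    frames.append(pd.DataFrame({
        "unit": unit,
        "cycle": np.arange(1, n + 1),
        "s1": rng.normal(size=n),
        "s2": rng.normal(size=n),
    }))
toy = pd.concat(frames, ignore_index=True)

toy = smooth_sensors(add_clipped_rul(toy), ["s1", "s2"])
X, y = make_windows(toy, ["s1", "s2"])
print(X.shape, y.shape)  # (182, 30, 2) (182,)
```

An engine with `n` recorded cycles yields `n - 30 + 1` windows, so engines shorter than the lookback contribute no samples in this sketch.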
- Input Layer: (30, 14) Time-series window
- LSTM Layer 1: 100 Units + Dropout (0.2)
- LSTM Layer 2: 50 Units + Dropout (0.2)
- Output Layer: Dense (1 Unit) with Linear Activation
- Optimizer: Adam
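A minimal Keras sketch of the architecture listed above is shown here. It is not the contents of `src/model.py`: the function name `build_model`, the use of separate `Dropout` layers (rather than the `dropout` argument of `LSTM`), and the mean squared error loss are assumptions, since the list only fixes the layer sizes, dropout rate, output activation, and optimizer.

```python
from tensorflow import keras
from tensorflow.keras import layers


def build_model(window=30, n_features=14):
    """Two stacked LSTM layers followed by a single linear output unit."""
    model = keras.Sequential([
        keras.Input(shape=(window, n_features)),
        # return_sequences=True passes the full sequence to the second LSTM.
        layers.LSTM(100, return_sequences=True),
        layers.Dropout(0.2),
        layers.LSTM(50),
        layers.Dropout(0.2),
        layers.Dense(1, activation="linear"),
    ])
    # Loss choice is an assumption; the optimizer matches the list above.
    model.compile(optimizer="adam", loss="mse")
    return model


model = build_model()
model.summary()
```

The first LSTM must return its full output sequence so the second LSTM receives a 3D input; the second returns only its final state, which feeds the regression head.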
- `src/config.py`: Centralized configuration for hyperparameters (window size, smoothing alpha).
- `src/data_utils.py`: Modular data pipeline (loading, smoothing, windowing).
- `src/model.py`: Keras model definition.
- `train.py`: Main execution script for training and evaluation.
- Clone the repository: `git clone https://github.com/YOUR_USERNAME/nasa-turbofan-pdm.git`, then `cd nasa-turbofan-pdm`
- Install dependencies: `pip install -r requirements.txt`
- Run the training pipeline: `python train.py`
- Operating Condition Clustering: To improve performance on FD002 and FD004, I plan to implement K-Means clustering to normalize sensors based on operating altitude/speed.
- Attention Mechanisms: Experimenting with Attention-based LSTMs to weigh critical time steps more heavily.
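One possible shape for the planned operating-condition normalization is sketched below, using scikit-learn's `KMeans`. Nothing here exists in the repository yet: the function name `normalize_by_condition`, the per-cluster z-scoring, and the synthetic two-regime data are assumptions for illustration. FD002 and FD004 contain six operating conditions, hence the default of six clusters.

```python
import numpy as np
from sklearn.cluster import KMeans


def normalize_by_condition(settings, sensors, n_conditions=6, seed=0):
    """Cluster rows by operating settings, then z-score sensors within each cluster."""
    km = KMeans(n_clusters=n_conditions, n_init=10, random_state=seed)
    labels = km.fit_predict(settings)
    normalized = np.empty_like(sensors, dtype=float)
    for k in range(n_conditions):
        mask = labels == k
        mean = sensors[mask].mean(axis=0)
        std = sensors[mask].std(axis=0)
        std[std == 0] = 1.0  # avoid division by zero for constant sensors
        normalized[mask] = (sensors[mask] - mean) / std
    return normalized, labels


# Synthetic data with two well-separated operating regimes (illustration only).
rng = np.random.default_rng(0)
regime = rng.integers(0, 2, size=400)
settings = np.column_stack([regime * 35.0, regime * 0.84])
settings = settings + rng.normal(scale=0.01, size=settings.shape)
sensors = np.column_stack([500.0 + 50.0 * regime, 20.0 - 5.0 * regime])
sensors = sensors + rng.normal(size=sensors.shape)

normalized, labels = normalize_by_condition(settings, sensors, n_conditions=2)
```

After this step each sensor is centered within its own operating regime, so shifts caused by changes in flight condition no longer mask the degradation trend. In practice the cluster centers and per-cluster statistics would be fitted on the training set and reused for the test set.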
- NASA CMAPSS Dataset
- Damage Propagation Modeling for Aircraft Engine Run-to-Failure Simulation (Saxena et al.)