CNN-based Stock Price Prediction with Instantaneous Sharpe Ratio

This project implements a CNN-LSTM model to predict the risk-adjusted stock return, represented as an instantaneous Sharpe ratio. The prediction target quantifies the expected k-bar return normalized by its inherent volatility, allowing the model to focus on high signal-to-noise opportunities.

Prediction Target: Instantaneous Sharpe Ratio

The model predicts s_t^(k), defined as:

\[ s_t^{(k)} = \frac{(C_{t+k} - C_t) / C_t}{\sqrt{\frac{1}{k}\sum_{i=1}^{k} r_{t+i}^2}} \]

Where:

C_t is the closing price at time t
r_{t+i} is the per-bar return at time t+i, for i = 1, ..., k (the future returns over the next k periods)
The denominator represents the square root of the mean of squared future returns (RMS volatility)
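
As a minimal sketch of the definition above, the target can be computed from a list of closing prices in plain Python. The function name `instantaneous_sharpe` is hypothetical, and the sketch assumes r_{t+i} is the simple one-bar return C_{t+i}/C_{t+i-1} - 1; the project's own preprocessing may differ in these details.

```python
import math

def instantaneous_sharpe(closes, t, k):
    """Return s_t^(k) for a list of closing prices, or None if it is undefined."""
    if k < 1 or t < 0 or t + k >= len(closes):
        return None  # not enough future bars after t
    # Assumed definition: simple one-bar returns r_{t+i}, i = 1..k
    rets = [closes[t + i] / closes[t + i - 1] - 1.0 for i in range(1, k + 1)]
    # Denominator: root mean square of the k future returns
    rms_vol = math.sqrt(sum(r * r for r in rets) / k)
    if rms_vol == 0.0:
        return None  # flat prices, ratio undefined
    # Numerator: k-bar return from t to t+k
    k_bar_return = (closes[t + k] - closes[t]) / closes[t]
    return k_bar_return / rms_vol

closes = [100.0, 101.0, 99.0, 102.0, 103.0]
s3 = instantaneous_sharpe(closes, t=0, k=3)  # roughly 0.92
s1 = instantaneous_sharpe(closes, t=0, k=1)  # +1: with k = 1 only the sign survives
```

Note that with k = 1 the numerator and the denominator are built from the same single return, so the formula as written reduces to the sign of that return.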

Interpretation

This ratio represents an instantaneous Sharpe for a k-bar trade, where:

s = 1 means the k-bar return equals one unit of RMS per-bar volatility over the same k bars (approximately one standard deviation when the mean per-bar return is close to zero)
s = 2 means two such units, etc.

This standardized measure helps the model identify opportunities with attractive risk-reward profiles rather than just high absolute returns.

Model Architecture

The model combines a CNN-LSTM backbone with several attention mechanisms:

Feature Importance Layer:
- Learns to weight input features with a trainable parameter vector
- Uses softmax activation to create a probability distribution over features
Convolutional Blocks:
- Gated 1D convolutional layers (similar to GLU) for adaptive feature extraction
- Batch normalization and ReLU activation
Dual Attention Mechanism:
- Channel attention: Captures feature-wise importance
- Spatial attention: Focuses on important time steps
Temporal Processing:
- Temporal attention to aggregate features across time
- Bidirectional LSTM with 2 layers to capture sequential patterns
Output Layers:
- Residual blocks with skip connections
- Trainable output scaling with tanh activation for bounded predictions
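
To make three of the building blocks above concrete, here is a small, framework-free sketch of softmax feature weighting, a GLU-style gated 1D convolution, and tanh-bounded output scaling. It is written in plain Python for readability and is only an illustration of the ideas: all function names, kernels, and values are hypothetical, and the actual layers in model.py may be organized differently.

```python
import math

def softmax(logits):
    """Turn a trainable logit vector into a probability distribution over features."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

def conv1d_valid(series, kernel):
    """Single-channel 1D cross-correlation without padding."""
    n = len(kernel)
    return [sum(series[i + j] * kernel[j] for j in range(n))
            for i in range(len(series) - n + 1)]

def gated_conv1d(series, value_kernel, gate_kernel):
    """GLU-style gating: one convolution supplies values, a second one gates them."""
    values = conv1d_valid(series, value_kernel)
    gates = conv1d_valid(series, gate_kernel)
    return [v * sigmoid(g) for v, g in zip(values, gates)]

def bounded_output(x, scale):
    """tanh keeps the prediction inside [-scale, +scale]."""
    return scale * math.tanh(x)

weights = softmax([0.0, 1.0, 2.0])                            # hypothetical feature logits
gated = gated_conv1d([1.0, 2.0, 3.0, 4.0], [1.0, 0.0], [0.0, 0.0])
bounded = bounded_output(100.0, scale=3.0)                    # saturates near 3.0
```

With a zero gate kernel every gate equals sigmoid(0) = 0.5, so `gated` is simply half of the value convolution; a trained gate kernel would let the layer pass or suppress each position adaptively.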

Implementation Details

Data Preprocessing

Raw OHLCV data is transformed into 426+ technical indicators
For "raw mode," only 9 base features are used: open, high, low, close, volume, ret, ma5, ma10, ma20
Risk-adjusted returns are calculated using the exact formula shown above
Data is normalized feature-wise to zero mean and unit variance
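
The feature-wise normalization step can be sketched as follows. The helper names `fit_zscore` and `apply_zscore` are hypothetical, and fitting the statistics on the training split only (to avoid leaking future information) is an assumption of this sketch, not something stated about data_preprocessing.py.

```python
import statistics

def fit_zscore(rows):
    """Compute per-feature mean and population standard deviation from a list of rows."""
    columns = list(zip(*rows))
    means = [statistics.fmean(col) for col in columns]
    # A constant column has zero deviation; fall back to 1.0 to avoid division by zero
    stds = [statistics.pstdev(col) or 1.0 for col in columns]
    return means, stds

def apply_zscore(rows, means, stds):
    """Scale every feature to zero mean and unit variance using fitted statistics."""
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in rows]

train_rows = [[1.0, 10.0], [2.0, 10.0], [3.0, 10.0]]  # toy data: 3 bars, 2 features
means, stds = fit_zscore(train_rows)
normalized = apply_zscore(train_rows, means, stds)
```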

Training Process

Uses sliding window approach for time series data (default window = 30 bars)
Loss function combines multiple components:
- Primary loss: Smooth L1 (Huber) loss
- Correlation penalty: Encourages alignment between prediction and target trends
- R²-like term: Normalizes MSE by target variance
- Sparsity regularization: L1 penalty on feature importance
- Group LASSO: Encourages feature group selection
Supports balanced sampling across target-magnitude bins to handle imbalance in the target distribution
Implements early stopping with patience and learning rate scheduling
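
One way the loss components listed above could be combined is sketched below in plain Python. The weights `w_corr`, `w_r2`, and `w_l1` are hypothetical placeholders, the Group LASSO term is omitted for brevity, and the exact formulation in train.py may differ.

```python
import math

def smooth_l1(pred, target, beta=1.0):
    """Mean Smooth L1 loss: quadratic for small errors, linear for large ones."""
    total = 0.0
    for p, y in zip(pred, target):
        d = abs(p - y)
        total += 0.5 * d * d / beta if d < beta else d - 0.5 * beta
    return total / len(pred)

def pearson(xs, ys):
    """Pearson correlation; returns 0.0 if either series is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0.0 or vy == 0.0:
        return 0.0
    return cov / math.sqrt(vx * vy)

def combined_loss(pred, target, feature_weights,
                  w_corr=0.1, w_r2=0.1, w_l1=0.01):  # hypothetical weights
    n = len(target)
    mean_y = sum(target) / n
    var_y = sum((y - mean_y) ** 2 for y in target) / n
    mse = sum((p - y) ** 2 for p, y in zip(pred, target)) / n
    corr_penalty = 1.0 - pearson(pred, target)        # 0 when perfectly correlated
    r2_like = mse / (var_y + 1e-8)                    # MSE normalized by target variance
    sparsity = sum(abs(w) for w in feature_weights)   # L1 penalty on feature importance
    return (smooth_l1(pred, target) + w_corr * corr_penalty
            + w_r2 * r2_like + w_l1 * sparsity)

target = [0.5, -1.0, 2.0]
loss_perfect = combined_loss(target, target, feature_weights=[0.0, 0.0])
loss_flipped = combined_loss([-0.5, 1.0, -2.0], target, feature_weights=[0.0, 0.0])
```

A perfect prediction drives every term to zero, while a sign-flipped prediction is penalized by the error terms and by the correlation penalty at the same time.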

Evaluation Metrics

MSE/RMSE: Basic prediction error
MAE: Absolute magnitude of errors
Directional Accuracy: % of correct direction predictions
R²: Proportion of variance explained
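
The metrics above can be computed from paired predictions and targets as in the following sketch. The function name is hypothetical, and counting a value of exactly zero as "not positive" in the directional test is an assumption of this example rather than a documented choice of eval.py.

```python
import math

def regression_metrics(pred, target):
    """Return MSE, RMSE, MAE, directional accuracy and R² for two equal-length lists."""
    n = len(target)
    errors = [p - y for p, y in zip(pred, target)]
    mse = sum(e * e for e in errors) / n
    mae = sum(abs(e) for e in errors) / n
    mean_y = sum(target) / n
    ss_tot = sum((y - mean_y) ** 2 for y in target)
    # R² = 1 - SS_res / SS_tot; undefined for a constant target
    r2 = 1.0 - (mse * n) / ss_tot if ss_tot > 0 else float("nan")
    # Directional accuracy: share of samples where prediction and target agree in sign
    hits = sum(1 for p, y in zip(pred, target) if (p > 0) == (y > 0))
    return {"mse": mse, "rmse": math.sqrt(mse), "mae": mae,
            "directional_accuracy": hits / n, "r2": r2}

metrics = regression_metrics(pred=[0.5, -0.2, 0.1, -0.4],
                             target=[1.0, -0.5, -0.3, -0.2])
```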

Backtesting

Converts model predictions to trading signals (+1/-1/0)
Computes equity curves, returns, Sharpe ratios, and max drawdowns
Compares performance against buy-and-hold benchmark
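
The backtesting steps above can be sketched in a few lines. The threshold of 0.5, the function names, and the convention that `bar_returns[i]` is the return realized after signal `i` is issued are all assumptions of this example; transaction costs are ignored, and backtest.py may handle these points differently.

```python
import math

def to_signals(preds, threshold=0.5):
    """Map predictions to +1 (long), -1 (short) or 0 (flat) using a symmetric threshold."""
    return [1 if p > threshold else -1 if p < -threshold else 0 for p in preds]

def backtest(signals, bar_returns):
    """Return the equity curve, maximum drawdown and per-bar Sharpe ratio."""
    equity = [1.0]
    strat_returns = []
    for s, r in zip(signals, bar_returns):
        strat_returns.append(s * r)
        equity.append(equity[-1] * (1.0 + s * r))
    # Maximum drawdown: largest relative drop from a running peak
    peak, max_dd = equity[0], 0.0
    for v in equity:
        peak = max(peak, v)
        max_dd = max(max_dd, (peak - v) / peak)
    # Unannualized Sharpe ratio of the strategy's per-bar returns
    n = len(strat_returns)
    mean = sum(strat_returns) / n
    var = sum((x - mean) ** 2 for x in strat_returns) / n
    sharpe = mean / math.sqrt(var) if var > 0 else float("nan")
    return equity, max_dd, sharpe

preds = [0.8, -0.9, 0.1, 1.2]
bar_returns = [0.01, 0.02, -0.01, -0.03]
signals = to_signals(preds)
equity, max_dd, sharpe = backtest(signals, bar_returns)
# Buy-and-hold benchmark: always long
bh_equity, bh_max_dd, bh_sharpe = backtest([1] * len(bar_returns), bar_returns)
```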

Usage

Data Preprocessing

python data_preprocessing.py data/stock_data.xlsx --out processed_data/output.csv --k-bar 1 --vol-window 20

Training

python train.py processed_data/output.csv --win 30 --target-shift 1 --epochs 20 --raw-only
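
The windowing behind the `--win` option can be pictured with the sketch below. How `--target-shift` is actually applied inside train.py is not described here; this example assumes it offsets the label by that many bars past the last bar of the window, and the function name `make_windows` is hypothetical.

```python
def make_windows(features, targets, win=30, target_shift=1):
    """Pair each window of `win` consecutive feature rows with a later target value."""
    samples = []
    for end in range(win, len(features) + 1):
        # Assumed alignment: label sits target_shift bars after the window's last bar
        label_idx = end - 1 + target_shift
        if label_idx >= len(targets):
            break  # no label available for the remaining windows
        samples.append((features[end - win:end], targets[label_idx]))
    return samples

features = [[float(i)] for i in range(6)]       # toy data: 6 bars, 1 feature
targets = [10.0 * i for i in range(6)]
samples = make_windows(features, targets, win=3, target_shift=1)
```

Each sample is a `(window, label)` pair, so six bars with a window of 3 and a shift of 1 produce three training samples.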

Evaluation

python eval.py processed_data/output.csv model_checkpoint/model.pt --win 30

Backtesting

python backtest.py eval_out/predictions.csv

Key Findings

The risk-adjusted target provides more stable training signals
Short-term predictions (target_shift=1) significantly outperform longer horizons
Using only raw features often yields better results than all engineered features
Balanced sampling improves training when target distribution is highly skewed
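
The balanced sampling mentioned in the last finding could look roughly like this: samples are grouped into bins by target magnitude, and each draw first picks a bin uniformly, then a sample within it. The bin edges, the function name, and the two-stage draw are assumptions of this sketch, not a description of the sampler in train.py.

```python
import random

def balanced_sample(targets, bin_edges, n_samples, seed=0):
    """Draw sample indices so that every non-empty magnitude bin is equally likely."""
    rng = random.Random(seed)
    bins = {}
    for idx, y in enumerate(targets):
        # Bin number = how many edges the absolute target value reaches
        b = sum(1 for edge in bin_edges if abs(y) >= edge)
        bins.setdefault(b, []).append(idx)
    keys = sorted(bins)
    # Stage 1: pick a bin uniformly. Stage 2: pick an index inside that bin.
    return [rng.choice(bins[rng.choice(keys)]) for _ in range(n_samples)]

# Skewed toy targets: 90 small moves and 10 large ones
targets = [0.1] * 90 + [2.5] * 10
drawn = balanced_sample(targets, bin_edges=[1.0], n_samples=1000)
share_large = sum(1 for i in drawn if i >= 90) / len(drawn)  # about one half
```

Although large moves are only 10% of the data, they make up about half of the drawn samples, which is the effect balanced sampling is meant to achieve.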

Implementation Notes

The CNN extracts local patterns from price action
Attention mechanisms help focus on relevant parts of the time series
LSTM captures temporal dependencies and sequence information
Feature importance learning identifies the most predictive indicators
Output scaling helps maintain prediction stability

Project Structure

The codebase follows a modular design with four main components:

Data Preprocessing (data_preprocessing.py):
- Loads raw OHLCV data from Excel files
- Computes technical indicators (426+ features)
- Calculates risk-adjusted returns
- Saves processed data as CSV files
Model Training (train.py):
- Builds and trains CNN-LSTM models
- Uses sliding window approach for time series data
- Implements several loss components (Smooth L1, correlation, R²-like, sparsity regularization)
- Offers balanced sampling across target-magnitude bins
Model Evaluation (eval.py):
- Evaluates model performance on test data
- Calculates metrics (MSE, MAE, directional accuracy, R²)
- Generates prediction files for backtesting
Backtesting (backtest.py):
- Simulates trading strategies based on model predictions
- Calculates performance metrics (return, Sharpe ratio, drawdown)
- Compares to buy-and-hold benchmark

Repository files:
- README.md
- backtest.py
- data_preprocessing.py
- dataset.py
- eval.py
- index.py
- model.py
- train.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

CNN-based Stock Price Prediction with Instantaneous Sharpe Ratio

Prediction Target: Instantaneous Sharpe Ratio

Interpretation

Model Architecture

Implementation Details

Data Preprocessing

Training Process

Evaluation Metrics

Backtesting

Usage

Data Preprocessing

Training

Evaluation

Backtesting

Key Findings

Implementation Notes

Project Structure

About

Uh oh!

Releases

Packages

Languages

mmTheBest/-DeepSharpeNet

Folders and files

Latest commit

History

Repository files navigation

CNN-based Stock Price Prediction with Instantaneous Sharpe Ratio

Prediction Target: Instantaneous Sharpe Ratio

Interpretation

Model Architecture

Implementation Details

Data Preprocessing

Training Process

Evaluation Metrics

Backtesting

Usage

Data Preprocessing

Training

Evaluation

Backtesting

Key Findings

Implementation Notes

Project Structure

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages