
purefe11/StockPatchTST


Overview of Deep Learning Model Evolution

A chronological overview of model improvements, from LSTM to Transformer variants.

Figure: model_history

Transformer-Based Ranking Model for Stock Selection

📘 See TRANSFORMER.md for a detailed architecture explanation.

Figure: stock_patch_tst_model

A Conv1D layer extracts overlapping patches from each input window, compressing the time axis. A Transformer encoder then models the dependencies between patches and produces the score used for ranking.
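
The patching step can be sketched as follows. This is a minimal illustration under stated assumptions, not the repository's model: the class name `PatchEncoderSketch`, the mean pooling, the linear score head, and the dropout value are hypothetical, and positional encoding and the industry embedding are left out. With a 30-day window, patch length 6, and stride 3, the convolution yields (30 - 6) / 3 + 1 = 9 patch tokens.

```python
import torch
import torch.nn as nn


class PatchEncoderSketch(nn.Module):
    """Hypothetical sketch: Conv1d patching followed by a Transformer encoder."""

    def __init__(self, input_dim=41, d_model=64, patch_len=6, stride=3,
                 n_heads=4, n_layers=2, dropout=0.1):
        super().__init__()
        # kernel_size=patch_len and stride=stride map each run of patch_len
        # days to a single d_model-dimensional token.
        self.patch_embed = nn.Conv1d(input_dim, d_model,
                                     kernel_size=patch_len, stride=stride)
        layer = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=n_heads, dim_feedforward=4 * d_model,
            dropout=dropout, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=n_layers)
        # Assumption: one scalar ranking score per stock, from pooled tokens.
        self.score_head = nn.Linear(d_model, 1)

    def forward(self, x):
        # x: (batch, window, features). Conv1d expects (batch, channels, time).
        tokens = self.patch_embed(x.transpose(1, 2))   # (batch, d_model, patches)
        tokens = tokens.transpose(1, 2)                # (batch, patches, d_model)
        encoded = self.encoder(tokens)                 # attention across patches
        return self.score_head(encoded.mean(dim=1)).squeeze(-1)


model = PatchEncoderSketch().eval()
window = torch.randn(8, 30, 41)        # 8 stocks, 30 days, 41 features
with torch.no_grad():
    scores = model(window)             # one score per stock
```

Higher scores would then be read as higher predicted rank within the day's candidate set.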


🧠 Input Features

📘 Full feature list: FEATURES_TECHNICAL_INDICATORS.md

  • Feature engineering:
    Ratio-based transformations and log-scaling of long-tailed features keep inputs on a comparable scale across stocks.
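
To illustrate why ratio features are comparable across stocks, here is a small sketch. The exact formulas used in the repository are documented in FEATURES_TECHNICAL_INDICATORS.md; the definitions below (`close_rate` as a day-over-day ratio, `signed_log1p` as the log transform) are assumptions made for illustration only.

```python
import numpy as np


def close_rate(close):
    """Day-over-day return: close[t] / close[t-1] - 1 (first value is NaN)."""
    close = np.asarray(close, dtype=float)
    out = np.full_like(close, np.nan)
    out[1:] = close[1:] / close[:-1] - 1.0
    return out


def signed_log1p(x):
    """Compress long tails while keeping the sign: sign(x) * log(1 + |x|)."""
    x = np.asarray(x, dtype=float)
    return np.sign(x) * np.log1p(np.abs(x))


# Two hypothetical stocks whose prices differ by a factor of 100
# produce identical ratio features.
cheap = np.array([1_000.0, 1_020.0, 990.0, 1_005.0])
expensive = cheap * 100
same_scale = np.allclose(close_rate(cheap)[1:], close_rate(expensive)[1:])
```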

🏭 Stock Metadata

  • industry_id, is_kospi

💹 Price Flow Indicators

  • close_rate, open_to_close, high_to_low, rsi, atr, MACD-related features

📊 VWMA & Bollinger Bands

  • vwma5_gap, vwma20_gap, vwma_bb_width etc.

🌍 Market Indices

  • KOSPI, KOSDAQ, S&P500 Futures, Nasdaq 100 Futures, VIX, etc.

🔁 Volume & Flow

  • Volume volatility ratios, net buy rates

📈 Candlestick Patterns

  • Upper/lower tail ratios, body ratios
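
One plausible way to compute such candlestick ratios is to normalize each part of the candle by the day's high-low range. The normalization and the function name `candle_ratios` are assumptions; the repository may define these features differently.

```python
def candle_ratios(open_, high, low, close):
    """Split a daily candle into upper tail, body, and lower tail,
    each expressed as a fraction of the high-low range (assumed definition)."""
    price_range = high - low
    if price_range <= 0:
        # Flat candle: no range to normalize by.
        return {"upper_tail": 0.0, "lower_tail": 0.0, "body": 0.0, "sign": 0}
    body_top = max(open_, close)
    body_bottom = min(open_, close)
    sign = 1 if close > open_ else (-1 if close < open_ else 0)
    return {
        "upper_tail": (high - body_top) / price_range,
        "lower_tail": (body_bottom - low) / price_range,
        "body": (body_top - body_bottom) / price_range,
        "sign": sign,
    }


ratios = candle_ratios(open_=100.0, high=110.0, low=90.0, close=105.0)
```

Under this definition the three ratios sum to 1 for any candle with a non-zero range.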

Feature Heatmap

The following heatmap shows pairwise correlations between selected input features.

Figure: heatmap

Distribution per Feature

Below are the individual distributions of input features used in model training.

is_kospi close_rate vwma5_gap vwma20_gap
vwma_bb_upper_ratio vwma_bb_lower_ratio vwma_bb_width kospi_vwma5_gap
kospi_vwma20_gap kosdaq_vwma5_gap kosdaq_vwma20_gap open_to_close
high_to_low rsi macd_ratio macd_signal_ratio
macd_golden_cross macd_dead_cross atr_ratio kospi_close_rate
kosdaq_close_rate sp500f_close_rate sp500f_ma5_gap sp500f_ma20_gap
nasdaq100f_close_rate nasdaq100f_ma5_gap nasdaq100f_ma20_gap vix_close_rate
sp500v_close_rate trading_volume_volatility_ratio trading_change trading_rolling_change
foreign_rate institution_rate individual_rate foreign_net_buy_days
institution_net_buy_days candle_upper_tail_ratio candle_lower_tail_ratio candle_body_ratio
candle_sign

🔧 Key Features

  • Patch-wise Transformer encoder
  • Industry embedding
  • Soft label generation from 5-day future returns
  • LambdaRankLoss for ranking optimization
  • Real-time applicability (15:40–16:00 trading window)

🎯 Target Selection Strategy

  • Top 200 stocks by daily trading volume
  • Market cap ≥ 500B KRW
  • Excludes limit-up and newly listed stocks
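
The selection rules above could be applied to one day's cross-section roughly as follows. The column names (`trading_volume`, `market_cap`, `is_limit_up`, `is_newly_listed`) and the order of operations (filter first, then take the top 200) are assumptions; the README does not specify either.

```python
import pandas as pd


def select_universe(day, top_n=200, min_market_cap=500e9):
    """Filter one day's stocks, then keep the top_n by trading volume.

    Assumes hypothetical columns: trading_volume, market_cap (KRW),
    is_limit_up (bool), is_newly_listed (bool).
    """
    eligible = day[
        (day["market_cap"] >= min_market_cap)
        & ~day["is_limit_up"]
        & ~day["is_newly_listed"]
    ]
    return eligible.nlargest(top_n, "trading_volume")


# Tiny made-up example with top_n=2 instead of 200.
day = pd.DataFrame({
    "ticker": ["A", "B", "C", "D", "E"],
    "trading_volume": [500, 400, 300, 200, 100],
    "market_cap": [600e9, 100e9, 700e9, 800e9, 900e9],
    "is_limit_up": [False, False, True, False, False],
    "is_newly_listed": [False, False, False, False, False],
})
universe = select_universe(day, top_n=2)
```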

🔧 Model Hyperparameters

  • Input Dim: 41
  • Industry Embedding Dim: 4
  • Model Dim: 64
  • Sliding Window Size: 30
  • Patch Length: 6, Stride: 3
  • Transformer Encoder Heads: 4, Layers: 2
  • Dropout: 0.2
  • Learning Rate: 5e-4
  • Weight Decay: 5e-5
  • Early Stopping Patience: 10 epochs
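
Early stopping with a patience of 10 epochs can be sketched as a small helper. The class below is hypothetical, and it assumes the monitored value is a validation metric where higher is better (for example NDCG); the README does not state which metric is monitored. The learning rate and weight decay listed above would typically be passed to an optimizer such as `torch.optim.AdamW(model.parameters(), lr=5e-4, weight_decay=5e-5)`, though the optimizer type is also an assumption.

```python
class EarlyStopping:
    """Stop training once the metric has not improved for `patience` epochs."""

    def __init__(self, patience=10, min_delta=0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best = float("-inf")
        self.bad_epochs = 0

    def step(self, metric):
        """Record one epoch's metric and return True if training should stop."""
        if metric > self.best + self.min_delta:
            self.best = metric
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


# Made-up validation scores, with patience shortened to 3 for the demo.
stopper = EarlyStopping(patience=3)
decisions = [stopper.step(m) for m in [0.10, 0.20, 0.15, 0.18, 0.19]]
```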

🏷️ Labeling & Ranking

  • 5-day future returns are binned into 20 quantiles (20 bins), which serve as soft labels
  • TOP3 selection by predicted score per day
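
A minimal sketch of the labeling and selection steps is shown below. It assumes the quantiles are computed across the stocks of a single day and that the soft label is the bin index scaled to [0, 1]; both are illustrative choices, and the function names are hypothetical.

```python
import numpy as np


def quantile_bins(future_returns, n_bins=20):
    """Map each stock's 5-day future return to a bin 0..n_bins-1 by rank."""
    r = np.asarray(future_returns, dtype=float)
    ranks = r.argsort().argsort()      # 0 = lowest return, len(r)-1 = highest
    return (ranks * n_bins) // len(r)


def soft_labels(bins, n_bins=20):
    """Scale bin indices to relevance values in [0, 1] (assumed scaling)."""
    return np.asarray(bins) / (n_bins - 1)


def top_n(scores, n=3):
    """Indices of the n highest predicted scores, best first."""
    return np.argsort(-np.asarray(scores, dtype=float))[:n]


returns = np.linspace(-0.10, 0.10, 40)     # 40 made-up stocks on one day
bins = quantile_bins(returns)
labels = soft_labels(bins)
picks = top_n([0.1, 0.9, 0.5, 0.7])
```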

🧮 20-Quantile Label Bins

Figures: training_labels_20, validation_labels_20, test_labels_20

📉 Distribution of Predicted Scores

Figures: val_top3_pred, test_top3_pred

🔍 Raw Label Distribution

Figures: training_labels, validation_labels, test_labels

📚 Learn More: Ranking Metrics & Loss

For an in-depth explanation of the ranking metric and training objective used in this model, see:

  • 📘 NDCG Explained:
    Understand how Normalized Discounted Cumulative Gain (NDCG) measures ranking quality in stock selection.

  • ⚙️ LambdaRank Loss Guide:
    Dive into the pairwise ranking loss function that optimizes NDCG by comparing stock relevance in every batch.
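
For orientation, the sketch below shows one common formulation of NDCG and a LambdaRank-style loss: a pairwise logistic loss in which each pair is weighted by the absolute NDCG change from swapping the two items. It is not the repository's `LambdaRankLoss`; function names, the exponential gain, and the normalization by pair count are assumptions.

```python
import torch
import torch.nn.functional as F


def dcg(relevance, k=None):
    """DCG of relevance values that are already in ranked order."""
    rel = relevance[:k]
    positions = torch.arange(1, rel.numel() + 1, dtype=rel.dtype)
    return ((2.0 ** rel - 1.0) / torch.log2(positions + 1.0)).sum()


def ndcg(scores, relevance, k=None):
    """NDCG@k of the ranking induced by sorting scores in descending order."""
    order = torch.argsort(scores, descending=True)
    ideal = dcg(torch.sort(relevance, descending=True).values, k)
    return dcg(relevance[order], k) / ideal if ideal > 0 else torch.tensor(0.0)


def lambdarank_loss(scores, relevance):
    """Pairwise logistic loss weighted by |delta NDCG| of swapping each pair."""
    n = scores.numel()
    order = torch.argsort(scores, descending=True)
    ranks = torch.empty(n, dtype=scores.dtype)
    ranks[order] = torch.arange(1, n + 1, dtype=scores.dtype)  # 1 = top
    gain = 2.0 ** relevance - 1.0
    discount = 1.0 / torch.log2(ranks + 1.0)
    ideal = dcg(torch.sort(relevance, descending=True).values).clamp(min=1e-12)
    # NDCG change if items i and j swapped positions (treated as a constant).
    delta = torch.abs((gain[:, None] - gain[None, :])
                      * (discount[:, None] - discount[None, :])) / ideal
    should_outrank = relevance[:, None] > relevance[None, :]
    score_diff = scores[:, None] - scores[None, :]
    pair_losses = delta * F.softplus(-score_diff)
    return pair_losses[should_outrank].sum() / should_outrank.sum().clamp(min=1)


relevance = torch.tensor([1.0, 0.5, 0.0])
good_scores = torch.tensor([3.0, 2.0, 1.0])   # agrees with relevance
bad_scores = torch.tensor([1.0, 2.0, 3.0])    # reversed ranking
```

A ranking that agrees with the labels yields an NDCG of 1 and a lower loss than the reversed ranking, and the gradient pushes the score of the most relevant item upward.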


🔍 Post-Filtering Rules

Model predictions provide the initial candidates. Additional filters and dynamic sell rules are then applied to make the system more robust in live trading.

🛒 Buy Signal Post-Filtering

  • pred_rank <= topn (e.g., TOP3)
  • atr_ratio > 0.03 (sufficient volatility)
  • close_rate > -10% (avoiding sharp decliners)
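
The three buy conditions combine into a single boolean mask. The sketch below assumes a DataFrame of one day's predictions with columns `pred_rank`, `atr_ratio`, and `close_rate`, and that `close_rate` is stored as a fraction (so -10% is -0.10); the function name is hypothetical.

```python
import pandas as pd


def filter_buy_candidates(preds, topn=3, min_atr_ratio=0.03,
                          min_close_rate=-0.10):
    """Keep rows that pass all three post-filtering rules."""
    mask = (
        (preds["pred_rank"] <= topn)             # within TOP-N by score
        & (preds["atr_ratio"] > min_atr_ratio)   # sufficient volatility
        & (preds["close_rate"] > min_close_rate) # not a sharp decliner
    )
    return preds[mask]


# Made-up predictions for four stocks.
preds = pd.DataFrame({
    "ticker": ["A", "B", "C", "D"],
    "pred_rank": [1, 2, 3, 4],
    "atr_ratio": [0.05, 0.02, 0.04, 0.06],
    "close_rate": [0.01, 0.02, -0.12, 0.03],
})
buys = filter_buy_candidates(preds)
```

In this example only A survives: B lacks volatility, C fell more than 10%, and D is outside the top 3.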

💵 Sell Signal Detection

  • Max holding period: the position is force-closed after a fixed number of days (e.g., 5 days).
  • Trailing Entry Extension:
    If a new buy signal occurs while a position is held, the holding period resets.
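
The interaction between the two sell rules can be sketched with trading-day indices. The function `exit_day` is hypothetical, and it assumes a signal only extends the position if it arrives strictly before the current forced-exit day.

```python
def exit_day(entry_day, signal_days, max_hold=5):
    """Return the trading-day index on which the position is closed.

    The holding clock restarts whenever a new buy signal for the same
    stock arrives while the position is still open.
    """
    deadline = entry_day + max_hold
    for day in sorted(signal_days):
        if entry_day < day < deadline:   # signal while still holding
            deadline = day + max_hold    # reset the holding period
    return deadline


plain = exit_day(0, [])            # no repeat signal
extended = exit_day(0, [3])        # one repeat signal on day 3
chained = exit_day(0, [3, 7])      # second signal extends again
too_late = exit_day(0, [6])        # signal after the exit has no effect
```

Extensions of this kind would be one way for the average holding period to exceed the nominal maximum.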

📈 Return Evaluation

| Item | 2024 TOP3 | 2025 TOP3 |
| --- | --- | --- |
| Number of Samples | 195,866 | 42,966 |
| Number of Buy / Sell Trades | 205 / 205 | 35 / 35 |
| Win Rate (Count, Ratio) | 112 trades (54.63%) | 26 trades (74.29%) |
| Loss Rate (Count, Ratio) | 93 trades (45.37%) | 9 trades (25.71%) |
| Average Return (Win) | 6.23% | 7.54% |
| Average Return (Loss) | -4.57% | -3.88% |
| Avg. Holding Period (Win) | 9.6 calendar days (6.4 trading days) | 9.2 calendar days (5.8 trading days) |
| Avg. Holding Period (Loss) | 9.5 calendar days (6.5 trading days) | 10.7 calendar days (6.2 trading days) |
| Return Deciles | [-36.0, -6.5, -3.6, -2.0, -0.7, 0.8, 2.1, 4.1, 5.9, 9.9, 34.6] | [-8.2, -3.8, -2.3, 0.5, 1.1, 2.5, 4.2, 7.2, 8.3, 16.3, 30.7] |
| Trade Capital | 10,000,000 | 10,000,000 |
| Expected Net Return | 1.335% | 4.601% |
| Cumulative Net Profit | 27,365,419 | 16,105,142 |
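
As a sanity check on the figures above, the expected return per trade can be approximated as the win rate times the average winning return plus the loss rate times the average losing return. Whether the reported Expected Net Return was computed this way is an assumption; the sketch only shows that the reported numbers are consistent with this reading to within about 0.01 percentage points.

```python
def expected_return(win_rate, avg_win, avg_loss):
    """Per-trade expectancy in percent: p * win + (1 - p) * loss."""
    return win_rate * avg_win + (1.0 - win_rate) * avg_loss


# Inputs taken from the evaluation table (returns in percent).
r2024 = expected_return(112 / 205, 6.23, -4.57)
r2025 = expected_return(26 / 35, 7.54, -3.88)

# Differences from the reported 1.335% and 4.601%.
gap_2024 = abs(r2024 - 1.335)
gap_2025 = abs(r2025 - 4.601)
```

The small remaining gaps may come from rounding of the table inputs or from cost adjustments, which the README does not detail.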

📅 Monthly Return

Figures: 2024 returns, 2025 returns

📊 Daily Return

Figures: January 2024 returns, January 2025 returns

🏦 Return by Stock

Figure: returns by stock

🤔 Return Distribution by Purchase Decision

Figures: mean return, return distribution, return distribution (0), return distribution (1)

🔍 Case Study: Specific Stocks

Figure: Hanwha Ocean (한화오션)


💻 Environment

  • Python 3.12.8
  • PyTorch 2.6.0 + CUDA 12.6
  • pykrx 1.0.48
  • Full list in requirements.txt

🧪 Experiment Notebook

The full end-to-end workflow is implemented in the following notebook:
Stock_PatchTST_Ranking.ipynb

This includes:

  • Raw data retrieval
  • Feature engineering and preprocessing
  • Model training and validation
  • Return evaluation and analysis
  • Case studies on selected stocks

📄 License & Acknowledgements

This project is licensed under the MIT License.
See the LICENSE file for details.

This work is inspired by PatchTST.
It was developed and tested on KRX daily stock data from 2020 to 2025. KRX data was primarily retrieved using the pykrx library.
Industry classification codes were retrieved via the Open API from Korea Investment & Securities.

About

PatchTST-based stock ranking model trained with LambdaRank loss on KRX data. Crafted by 🍡 DungiBomi
