GitHub - purefe11/StockPatchTST: PatchTST-based stock ranking model trained with LambdaRank loss on KRX data. Crafted by 🍡 DungiBomi

📚 Table of Contents

Input Features
Key Features
Target Selection Strategy
Model Hyperparameters
Labeling & Ranking
Return Evaluation
Case Study
Environment
Experiment Notebook
License & Acknowledgements

Overview of Deep Learning Model Evolution

A chronological overview of model improvements, from LSTM to Transformer variants.

Transformer-Based Ranking Model for Stock Selection

📘 See TRANSFORMER.md for a detailed architecture explanation.

Patches are extracted via Conv1D to compress temporal data, and a Transformer encoder captures inter-patch dependencies for ranking prediction.
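The sentence above can be sketched in PyTorch roughly as follows. This is a minimal illustration, not the repository's actual model: the class name `PatchRanker`, the industry vocabulary size `n_industries`, the mean pooling, the learned positional embedding, and the dropout value are all assumptions made for the example, and the remaining defaults mirror the values listed under Model Hyperparameters. See TRANSFORMER.md for the real architecture.

```python
import torch
import torch.nn as nn


class PatchRanker(nn.Module):
    """Hypothetical sketch: Conv1d patch embedding -> Transformer encoder -> score."""

    def __init__(self, input_dim=41, n_industries=100, industry_dim=4,
                 d_model=64, window=30, patch_len=6, stride=3,
                 n_heads=4, n_layers=2, dropout=0.2):
        super().__init__()
        # n_industries and dropout are illustrative placeholders (assumptions).
        self.industry_emb = nn.Embedding(n_industries, industry_dim)
        # A Conv1d with kernel_size=patch_len and stride=stride maps each
        # overlapping patch of the time axis to a single d_model-sized token.
        self.patch_embed = nn.Conv1d(input_dim + industry_dim, d_model,
                                     kernel_size=patch_len, stride=stride)
        n_patches = (window - patch_len) // stride + 1
        self.pos_emb = nn.Parameter(torch.zeros(1, n_patches, d_model))
        layer = nn.TransformerEncoderLayer(d_model, n_heads,
                                           dim_feedforward=4 * d_model,
                                           dropout=dropout, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=n_layers)
        self.head = nn.Linear(d_model, 1)

    def forward(self, x, industry_id):
        # x: (batch, window, input_dim), industry_id: (batch,)
        emb = self.industry_emb(industry_id)               # (batch, industry_dim)
        emb = emb.unsqueeze(1).expand(-1, x.size(1), -1)   # repeat along time axis
        h = torch.cat([x, emb], dim=-1).transpose(1, 2)    # (batch, channels, window)
        tokens = self.patch_embed(h).transpose(1, 2)       # (batch, n_patches, d_model)
        tokens = self.encoder(tokens + self.pos_emb)       # inter-patch attention
        return self.head(tokens.mean(dim=1)).squeeze(-1)   # one score per stock


model = PatchRanker().eval()
x = torch.randn(8, 30, 41)                  # 8 stocks, 30-day window, 41 features
industry_id = torch.randint(0, 100, (8,))
with torch.no_grad():
    scores = model(x, industry_id)          # higher score = ranked higher
```

The scores are only meaningful relative to each other within one trading day, which is what a ranking loss optimizes.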

🧠 Input Features

📘 Full feature list: FEATURES_TECHNICAL_INDICATORS.md

Feature engineering:
Ratio-based transformations and log-scaling of long-tailed features keep the inputs comparable across stocks with different price and volume levels.
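As a rough illustration of why ratio and log transforms make features comparable across stocks, the sketch below computes a few of them for one daily bar. The formulas and the `log_volume` name are assumptions made for this example. The definitions actually used are in FEATURES_TECHNICAL_INDICATORS.md.

```python
import math


def engineer_features(bar, prev_close):
    """Turn one raw daily bar into scale-free features (illustrative formulas)."""
    return {
        # Ratios against a reference price do not depend on the absolute
        # price level, so a 1,000 KRW stock and a 100,000 KRW stock line up.
        "close_rate": bar["close"] / prev_close - 1.0,
        "open_to_close": bar["close"] / bar["open"] - 1.0,
        "high_to_low": bar["high"] / bar["low"] - 1.0,
        # log1p compresses long-tailed quantities such as traded volume.
        "log_volume": math.log1p(bar["volume"]),
    }


cheap = {"open": 1_000, "high": 1_050, "low": 990, "close": 1_020, "volume": 5_000_000}
pricey = {"open": 100_000, "high": 105_000, "low": 99_000, "close": 102_000, "volume": 50_000}

f_cheap = engineer_features(cheap, prev_close=1_000)
f_pricey = engineer_features(pricey, prev_close=100_000)
```

Both bars yield the same `close_rate` of about 2% even though their prices differ by a factor of 100.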

🏭 Stock Metadata

industry_id, is_kospi

💹 Price Flow Indicators

close_rate, open_to_close, high_to_low, rsi, ato, MACD-related features

📊 VWMA & Bollinger Bands

vwma5_gap, vwma20_gap, vwma_bb_width etc.

🌍 Market Indices

KOSPI, KOSDAQ, S&P500 Futures, Nasdaq 100 Futures, VIX, etc.

🔁 Volume & Flow

Volume volatility ratios, net buy rates

📈 Candlestick Patterns

Upper/lower tail ratios, body ratios

Feature Heatmap

The following heatmap shows pairwise correlations between selected input features.

Distribution per Feature

Below are the individual distributions of input features used in model training.

🔧 Key Features

Patch-wise Transformer encoder
Industry embedding
Soft label generation from 5-day future returns
LambdaRankLoss for ranking optimization
Real-time applicability (15:40–16:00 trading window)

🎯 Target Selection Strategy

Top 200 stocks by daily trading volume
Market cap ≥ 500B KRW
Excludes limit-up and newly listed stocks
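The three rules above could be applied to one day's cross-section along these lines. The record keys (`volume`, `market_cap`, `is_limit_up`, `is_newly_listed`) and the order of operations (exclusions first, then the volume cut) are assumptions made for the sketch, not details taken from the repository.

```python
def select_universe(stocks, top_n=200, min_market_cap=500_000_000_000):
    """Filter one day's stock records down to the candidate universe."""
    eligible = [
        s for s in stocks
        if s["market_cap"] >= min_market_cap   # market cap >= 500B KRW
        and not s["is_limit_up"]               # drop limit-up stocks
        and not s["is_newly_listed"]           # drop newly listed stocks
    ]
    # Assumption: exclusions are applied before the top-N trading-volume cut.
    eligible.sort(key=lambda s: s["volume"], reverse=True)
    return eligible[:top_n]


sample = [
    {"ticker": "A", "volume": 900, "market_cap": 800e9, "is_limit_up": False, "is_newly_listed": False},
    {"ticker": "B", "volume": 950, "market_cap": 300e9, "is_limit_up": False, "is_newly_listed": False},
    {"ticker": "C", "volume": 700, "market_cap": 600e9, "is_limit_up": True, "is_newly_listed": False},
    {"ticker": "D", "volume": 500, "market_cap": 2e12, "is_limit_up": False, "is_newly_listed": False},
    {"ticker": "E", "volume": 100, "market_cap": 1e12, "is_limit_up": False, "is_newly_listed": False},
]
picked = [s["ticker"] for s in select_universe(sample, top_n=2)]
```

Here B fails the market cap rule and C is limit-up, so the two most traded remaining stocks are A and D.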

🔧 Model Hyperparameters

Input Dim: 41
Industry Embedding Dim: 4
Model Dim: 64
Sliding Window Size: 30
Patch Length: 6, Stride: 3
Transformer Encoder Heads: 4, Layers: 2
Dropout: 0.2
Learning Rate: 5e-4
Weight Decay: 5e-5
Early Stopping Patience: 10 epochs
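A quick way to see what the window, patch, and stride settings imply is to list where each patch starts. The helper `patch_starts` is a hypothetical name used only for this arithmetic check, and it assumes no padding is added to the window.

```python
def patch_starts(window=30, patch_len=6, stride=3):
    """Start index of every patch that fits fully inside the window."""
    return list(range(0, window - patch_len + 1, stride))


starts = patch_starts()
n_patches = len(starts)                 # number of tokens seen by the encoder
overlap = 6 - 3                         # days shared by neighbouring patches
head_dim = 64 // 4                      # model dim split evenly across heads
```

With these settings each 30-day window becomes 9 overlapping patches, the last one covering days 24 to 29, and each of the 4 attention heads works on 16 dimensions.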

🏷️ Labeling & Ranking

5-day returns binned into 20 quantiles (20 bins) used for soft labels
TOP3 selection by predicted score per day
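The two steps above can be sketched in plain Python. How the soft label is derived from a bin is not spelled out here, so this example simply uses the bin index (0 to 19) as a graded relevance label within one day's cross-section. The function names are hypothetical.

```python
def quantile_labels(returns, n_bins=20):
    """Map each return to a bin index 0..n_bins-1 by its cross-sectional rank."""
    order = sorted(range(len(returns)), key=lambda i: returns[i])
    labels = [0] * len(returns)
    for rank, i in enumerate(order):
        labels[i] = rank * n_bins // len(returns)
    return labels


def top_n(tickers, scores, n=3):
    """Return the n tickers with the highest predicted score."""
    ranked = sorted(zip(tickers, scores), key=lambda pair: pair[1], reverse=True)
    return [ticker for ticker, _ in ranked[:n]]


# 40 distinct synthetic 5-day returns for one trading day.
returns = [((i * 37) % 40 - 20) / 100 for i in range(40)]
labels = quantile_labels(returns)

picks = top_n(["A", "B", "C", "D", "E"], [0.1, 0.9, 0.4, 0.7, 0.2])
```

With 40 stocks and 20 bins each bin holds two stocks, and the stock with the best return receives label 19.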

🧮 20-Quantile Label Bins

📉 Distribution of Predicted Scores

🔍 Raw Label Distribution

📚 Learn More: Ranking Metrics & Loss

For an in-depth explanation of the ranking metric and training objective used in this model, see:

📘 NDCG Explained:
Understand how Normalized Discounted Cumulative Gain (NDCG) measures ranking quality in stock selection.
⚙️ LambdaRank Loss Guide:
Dive into the pairwise ranking loss function that optimizes NDCG by comparing stock relevance in every batch.
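For readers who want the idea before opening the guides, here is a small pure-Python sketch of NDCG and of a LambdaRank-style loss, in which each pairwise logistic term is weighted by the change in NDCG that swapping the pair would cause. It follows the textbook formulation with exponential gains and is an assumption about the general technique, not a copy of the repository's LambdaRankLoss.

```python
import math


def dcg(labels, k=None):
    """Discounted cumulative gain of labels given in ranked order."""
    labels = labels[:k] if k else labels
    return sum((2 ** rel - 1) / math.log2(pos + 2) for pos, rel in enumerate(labels))


def ndcg(scores, labels, k=None):
    """DCG of the score-induced ranking divided by the ideal DCG."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    ideal = dcg(sorted(labels, reverse=True), k)
    return dcg([labels[i] for i in order], k) / ideal if ideal > 0 else 0.0


def lambdarank_loss(scores, labels):
    """Pairwise logistic loss weighted by |delta NDCG| of swapping each pair."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    pos = {i: p for p, i in enumerate(order)}   # current rank of each item
    ideal = dcg(sorted(labels, reverse=True))
    if ideal == 0:
        return 0.0
    loss = 0.0
    for i in range(len(scores)):
        for j in range(len(scores)):
            if labels[i] <= labels[j]:
                continue                        # keep pairs where i should beat j
            gain_diff = 2 ** labels[i] - 2 ** labels[j]
            disc_diff = 1 / math.log2(pos[i] + 2) - 1 / math.log2(pos[j] + 2)
            delta_ndcg = abs(gain_diff * disc_diff) / ideal
            loss += delta_ndcg * math.log1p(math.exp(-(scores[i] - scores[j])))
    return loss


labels = [3, 2, 1, 0]
good, bad = [4.0, 3.0, 2.0, 1.0], [1.0, 2.0, 3.0, 4.0]
ndcg_good, ndcg_bad = ndcg(good, labels), ndcg(bad, labels)
loss_good, loss_bad = lambdarank_loss(good, labels), lambdarank_loss(bad, labels)
```

A perfectly ordered list reaches an NDCG of 1, while the reversed list scores lower and incurs a larger loss, so minimizing the loss pushes high-label stocks toward the top.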

🔍 Post-Filtering Rules

While model prediction provides initial candidates, additional filtering and dynamic sell strategies are applied to make the system robust for real-world trading.

🛒 Buy Signal Post-Filtering

pred_rank <= topn (e.g., TOP3)
atr_ratio > 0.03 (sufficient volatility)
close_rate > -10% (avoiding sharp decliners)
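Expressed as code, the three conditions above amount to a simple predicate. The dictionary layout and the convention that `close_rate` is stored as a fraction (so -10% is -0.10) are assumptions made for the sketch.

```python
def is_buy_signal(row, topn=3, min_atr_ratio=0.03, min_close_rate=-0.10):
    """True when a prediction row passes all buy-side post-filters."""
    return (
        row["pred_rank"] <= topn                 # among the TOP-N predictions
        and row["atr_ratio"] > min_atr_ratio     # sufficient volatility
        and row["close_rate"] > min_close_rate   # not a sharp decliner
    )


candidates = [
    {"ticker": "A", "pred_rank": 1, "atr_ratio": 0.05, "close_rate": 0.01},
    {"ticker": "B", "pred_rank": 2, "atr_ratio": 0.02, "close_rate": 0.03},   # too quiet
    {"ticker": "C", "pred_rank": 3, "atr_ratio": 0.06, "close_rate": -0.12},  # sharp drop
    {"ticker": "D", "pred_rank": 4, "atr_ratio": 0.08, "close_rate": 0.02},   # outside TOP3
]
buys = [c["ticker"] for c in candidates if is_buy_signal(c)]
```

Only A survives all three filters in this example.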

💵 Sell Signal Detection

Max holding period: forced exit after a fixed number of days (e.g., 5 days).
Trailing entry extension: if a new buy signal occurs while the position is held, the holding period is reset.
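The interaction between the forced exit and the reset rule can be sketched with trading-day indices. The function name `exit_day` and the choice to ignore a signal that arrives exactly on the exit day are assumptions made for this example.

```python
def exit_day(entry_day, signal_days, max_hold=5):
    """Trading-day index on which the position is force-closed."""
    deadline = entry_day + max_hold
    for day in sorted(signal_days):
        # A fresh buy signal while still holding restarts the countdown.
        if entry_day < day < deadline:
            deadline = day + max_hold
    return deadline


plain = exit_day(entry_day=0, signal_days=[])
extended = exit_day(entry_day=0, signal_days=[3, 7, 20])
```

Without further signals the position closes on day 5. With signals on days 3 and 7 the deadline moves first to day 8 and then to day 12, while the signal on day 20 arrives after the exit and has no effect.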

📈 Return Evaluation

Item	2024 TOP3	2025 TOP3
Number of Samples	195,866	42,966
Number of Buy / Sell Trades	205 / 205	35 / 35
Win Rate (Count, Ratio)	112 trades (54.63%)	26 trades (74.29%)
Loss Rate (Count, Ratio)	93 trades (45.37%)	9 trades (25.71%)
Average Return (Win)	6.23%	7.54%
Average Return (Loss)	-4.57%	-3.88%
Avg. Holding Period (Win)	9.6 calendar days (6.4 trading days)	9.2 calendar days (5.8 trading days)
Avg. Holding Period (Loss)	9.5 calendar days (6.5 trading days)	10.7 calendar days (6.2 trading days)
Return Deciles	[-36.0, -6.5, -3.6, -2.0, -0.7, 0.8, 2.1, 4.1, 5.9, 9.9, 34.6]	[-8.2, -3.8, -2.3, 0.5, 1.1, 2.5, 4.2, 7.2, 8.3, 16.3, 30.7]
Trade Capital	10,000,000	10,000,000
Expected Net Return	1.335%	4.601%
Cumulative Net Profit	27,365,419	16,105,142
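The figures in the table can be cross-checked with a few lines of arithmetic. This is only a consistency check under the assumption that the expected return is roughly the win-rate-weighted average of the win and loss returns, and that cumulative profit is roughly trades times capital times expected net return. How fees and taxes enter the net figures is not stated, so small differences are expected.

```python
def summarize(n_trades, n_wins, avg_win_pct, avg_loss_pct):
    """Win rate and win-rate-weighted expected return per trade (in %)."""
    win_rate = n_wins / n_trades
    expected_pct = win_rate * avg_win_pct + (1 - win_rate) * avg_loss_pct
    return win_rate, expected_pct


capital = 10_000_000

wr_2024, exp_2024 = summarize(205, 112, 6.23, -4.57)
wr_2025, exp_2025 = summarize(35, 26, 7.54, -3.88)

# Approximate cumulative profit from the reported expected net returns.
profit_2024 = 205 * capital * 1.335 / 100
profit_2025 = 35 * capital * 4.601 / 100
```

The weighted averages come out near 1.33% and 4.60%, and the approximate profits land within a few thousand of the reported 27,365,419 and 16,105,142.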

📅 Monthly Return

📊 Daily Return

🏦 Return by Stock

🤔 Return Distribution by Purchase Decision

🔍 Case Study: Specific Stocks

💻 Environment

Python 3.12.8
PyTorch 2.6.0 + CUDA 12.6
pykrx 1.0.48
Full list in requirements.txt

🧪 Experiment Notebook

The full end-to-end workflow is implemented in the following notebook:
Stock_PatchTST_Ranking.ipynb

This includes:

Raw data retrieval
Feature engineering and preprocessing
Model training and validation
Return evaluation and analysis
Case studies on selected stocks

📄 License & Acknowledgements

This project is licensed under the MIT License.
See the LICENSE file for details.

This work is inspired by PatchTST.
It was developed and tested on KRX daily stock data from 2020 to 2025. KRX data was primarily retrieved using the pykrx library.
Industry classification codes were retrieved via the Open API from Korea Investment & Securities.
