While extensive research exists on option hedging strategies from the seller's perspective, there remains a significant gap in developing effective strategies for option buyers. Given a fixed strike price, premium, and expiry date, we explore how buyers should optimally hold the underlying stock over time. We compare Deep Double Q-Network (DDQN) against three Monte Carlo Policy Gradient (MCPG) variants using different risk-based loss functions: entropic, Sharpe ratio, and Markowitz. Our custom trading environment features a 7-dimensional state space that includes position size, normalized stock price, time to expiry, portfolio value, and option Greeks.
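As one concrete point of reference, the entropic loss mentioned above is commonly defined as the entropic risk measure $\rho_\lambda(R) = \frac{1}{\lambda}\log \mathbb{E}[e^{-\lambda R}]$ over episode returns $R$. The sketch below is a minimal, hypothetical illustration of that standard definition, not the project's actual loss code:

```python
import math

def entropic_risk(returns, lam=1.0):
    """Standard entropic risk measure: (1/lam) * log E[exp(-lam * R)].

    Lower is better. For lam > 0 it penalizes dispersion in returns,
    so two return streams with the same mean but different variance
    are ranked apart. As lam -> 0 it approaches -mean(returns).
    """
    mean_exp = sum(math.exp(-lam * r) for r in returns) / len(returns)
    return math.log(mean_exp) / lam
```

For a constant return stream `[0.05] * 10` this evaluates to exactly `-0.05`, while the riskier `[0.10, 0.00]` (same mean) scores slightly higher (worse), which is the risk-aversion behavior the MCPG entropic variant exploits.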
- Custom gym environment simulating realistic option trading scenarios
- Multiple ticker support with random episode generation
- Black-Scholes pricing with dynamic volatility calculation
- Comprehensive state space including normalized prices and option Greeks
- Flexible action space for position sizing (0-100% of portfolio)
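To make the pricing feature concrete, here is a minimal sketch of European call pricing under Black-Scholes, the model the environment uses. The helper names (`norm_cdf`, `bs_call`) are hypothetical; the project's actual implementation lives under `src/environment`:

```python
import math

def norm_cdf(x: float) -> float:
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_call(S: float, K: float, T: float, r: float, sigma: float):
    """Black-Scholes price and delta of a European call.

    S: spot price, K: strike, T: years to expiry,
    r: risk-free rate, sigma: annualized volatility.
    """
    if T <= 0:
        # At expiry the option is worth its intrinsic value.
        return max(S - K, 0.0), 1.0 if S > K else 0.0
    d1 = (math.log(S / K) + (r + 0.5 * sigma ** 2) * T) / (sigma * math.sqrt(T))
    d2 = d1 - sigma * math.sqrt(T)
    price = S * norm_cdf(d1) - K * math.exp(-r * T) * norm_cdf(d2)
    return price, norm_cdf(d1)  # a call's delta is N(d1)
```

With `S=K=100`, `T=1`, `r=0`, and `sigma=0.2`, the price comes out near 7.97 and the delta near 0.54, matching standard Black-Scholes tables.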
- Create a virtual environment
- Run the following command in the root directory to install dependencies:
  `pip install -r requirements.txt`
- Optionally, if you have a compatible GPU and want to train with CUDA, follow the guide *Successfully using your local NVIDIA GPU with PyTorch or TensorFlow*
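The setup steps above can be sketched as follows (Unix shells; on Windows activate with `.venv\Scripts\activate` instead — the `.venv` directory name is just a common choice):

```shell
# Create and activate a virtual environment, then install dependencies
python3 -m venv .venv
. .venv/bin/activate
pip install -r requirements.txt
```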
- Now you are ready to train and test models! The relevant project structure is below:
├── policies
├── results
│   ├── data
│   │   ├── testing
│   │   │   ├── DDQN
│   │   │   └── MCPG
│   │   └── training
│   │       ├── DDQN
│   │       ├── MCPG
│   │       └── Q-Learning
│   └── images
│       ├── testing
│       │   ├── comparison
│       │   ├── DDQN
│       │   └── MCPG
│       └── training
│           ├── DDQN
│           ├── MCPG
│           └── Q-Learning
├── src
│   ├── environment
│   ├── models
│   ├── testing
│   ├── training
│   ├── util
│   └── visualization
├── README.md
└── report.pdf
