While extensive research exists on option hedging strategies from the seller's perspective, there remains a significant gap in developing effective strategies for option buyers. Given a fixed strike price, premium, and expiry date, we explore how buyers should optimally hold the underlying stock over time. We compare Deep Double Q-Network (DDQN) against three Monte Carlo Policy Gradient (MCPG) variants using different risk-based loss functions: entropic, Sharpe ratio, and Markowitz. Our custom trading environment features a 7-dimensional state space that includes position size, normalized stock price, time to expiry, portfolio value, and option Greeks.
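As one concrete point of reference, the entropic loss mentioned above is commonly defined as the entropic risk measure $\rho_\lambda(R) = \frac{1}{\lambda}\log \mathbb{E}[e^{-\lambda R}]$ over episode returns $R$. The sketch below is a minimal, hypothetical illustration of that standard definition, not the project's actual loss code:

```python
import math

def entropic_risk(returns, lam=1.0):
    """Standard entropic risk measure: (1/lam) * log E[exp(-lam * R)].

    Lower is better. For lam > 0 it penalizes dispersion in returns,
    so two return streams with the same mean but different variance
    are ranked apart. As lam -> 0 it approaches -mean(returns).
    """
    mean_exp = sum(math.exp(-lam * r) for r in returns) / len(returns)
    return math.log(mean_exp) / lam
```

For a constant return stream `[0.05] * 10` this evaluates to exactly `-0.05`, while the riskier `[0.10, 0.00]` (same mean) scores slightly higher (worse), which is the risk-aversion behavior the MCPG entropic variant exploits.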
- Custom gym environment simulating realistic option trading scenarios
- Multiple ticker support with random episode generation
- Black-Scholes pricing with dynamic volatility calculation
- Comprehensive state space including normalized prices and option Greeks
- Flexible action space for position sizing (0-100% of portfolio)
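To make the pricing feature concrete, here is a minimal sketch of European call pricing under Black-Scholes, the model the environment uses. The helper names (`norm_cdf`, `bs_call`) are hypothetical; the project's actual implementation lives under `src/environment`:

```python
import math

def norm_cdf(x: float) -> float:
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_call(S: float, K: float, T: float, r: float, sigma: float):
    """Black-Scholes price and delta of a European call.

    S: spot price, K: strike, T: years to expiry,
    r: risk-free rate, sigma: annualized volatility.
    """
    if T <= 0:
        # At expiry the option is worth its intrinsic value.
        return max(S - K, 0.0), 1.0 if S > K else 0.0
    d1 = (math.log(S / K) + (r + 0.5 * sigma ** 2) * T) / (sigma * math.sqrt(T))
    d2 = d1 - sigma * math.sqrt(T)
    price = S * norm_cdf(d1) - K * math.exp(-r * T) * norm_cdf(d2)
    return price, norm_cdf(d1)  # a call's delta is N(d1)
```

With `S=K=100`, `T=1`, `r=0`, and `sigma=0.2`, the price comes out near 7.97 and the delta near 0.54, matching standard Black-Scholes tables.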
- Create a virtual environment
- Run the following command in the root directory to install dependencies:
  `pip install -r requirements.txt`
- Optionally, if you have a compatible GPU and want to train with CUDA, follow the guide *Successfully using your local NVIDIA GPU with PyTorch or TensorFlow*
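The setup steps above can be sketched as follows (Unix shells; on Windows activate with `.venv\Scripts\activate` instead — the `.venv` directory name is just a common choice):

```shell
# Create and activate a virtual environment, then install dependencies
python3 -m venv .venv
. .venv/bin/activate
pip install -r requirements.txt
```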
- Now you are ready to train and test models! The relevant project structure is below:
├── policies
├── results
│   ├── data
│   │   ├── testing
│   │   │   ├── DDQN
│   │   │   └── MCPG
│   │   └── training
│   │       ├── DDQN
│   │       ├── MCPG
│   │       └── Q-Learning
│   └── images
│       ├── testing
│       │   ├── comparison
│       │   ├── DDQN
│       │   └── MCPG
│       └── training
│           ├── DDQN
│           ├── MCPG
│           └── Q-Learning
├── src
│   ├── environment
│   ├── models
│   ├── testing
│   ├── training
│   ├── util
│   └── visualization
├── README.md
└── report.pdf
