This is a simple Python program to understand the basics of Statistical Arbitrage (Stat Arb). It walks through:
- Cointegration Test: Testing if two assets are cointegrated, which means their prices move together in the long term, making them suitable for a pairs trading strategy.
- Modeling the Spread: Calculating the spread between two assets using linear regression.
- Entry and Exit Signals: Generating trading signals when the spread deviates significantly from its mean.
- Backtesting: Simulating the performance of the strategy over time and visualizing the portfolio performance.
- Visualization: Plotting the portfolio value and the spread with entry and exit points.
Install required libraries:
pip install pandas numpy statsmodels matplotlib
The program uses randomly generated price data for two assets.
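The generator used by the program is not shown in this README, so the following is only a minimal sketch of one way to simulate two price series. It assumes asset A follows a random walk and asset B is a linear function of A plus stationary noise, which makes the pair cointegrated by construction. The function name `simulate_pair`, the column names, and all numeric parameters are hypothetical.

```python
import numpy as np
import pandas as pd


def simulate_pair(n=500, seed=42):
    """Simulate two price series (hypothetical helper, not from the original script)."""
    rng = np.random.default_rng(seed)
    # Asset A: a random walk starting near 100
    asset_a = 100 + np.cumsum(rng.normal(0, 1, n))
    # Asset B: linear in A plus stationary noise, so the pair is cointegrated by design
    asset_b = 0.8 * asset_a + 20 + rng.normal(0, 2, n)
    return pd.DataFrame({"asset_a": asset_a, "asset_b": asset_b})


prices = simulate_pair()
print(prices.head())
```

Two independent random walks would usually not pass a cointegration test, so the way the data is generated determines what the later steps report.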
Use coint() to check if the assets are cointegrated:
- If the p-value < 0.05, the null hypothesis of no cointegration is rejected, so the pair is treated as cointegrated and suitable for Stat Arb.
A linear regression predicts one asset’s price using the other, and the spread is calculated as the difference between actual and predicted prices.
- Entry Threshold: Spread rises above Mean + 2 * Std Dev or falls below Mean - 2 * Std Dev.
- Exit Threshold: Spread reverts to the mean.
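The two rules above can be sketched as a small state machine. This is one possible reading, not necessarily how the script implements it: short the spread when it is above the upper band, go long when it is below the lower band, and close the position once the spread crosses back to the mean. The function name `generate_positions` and its parameters are hypothetical.

```python
import numpy as np


def generate_positions(spread, mean, std, entry_z=2.0):
    """Return +1 (long spread), -1 (short spread) or 0 (flat) for each bar."""
    upper = mean + entry_z * std
    lower = mean - entry_z * std
    positions = np.zeros(len(spread), dtype=int)
    position = 0
    for i, value in enumerate(spread):
        if position == 0:
            if value > upper:
                position = -1  # spread unusually high: short it
            elif value < lower:
                position = 1   # spread unusually low: go long
        elif position == -1 and value <= mean:
            position = 0       # reverted to the mean: exit the short
        elif position == 1 and value >= mean:
            position = 0       # reverted to the mean: exit the long
        positions[i] = position
    return positions


# Toy spread with mean 0 and std 1 supplied directly for clarity
spread = np.array([0.0, 2.5, 1.0, -0.1, -2.5, -1.0, 0.2, 0.0])
positions = generate_positions(spread, mean=0.0, std=1.0)
print(positions.tolist())  # [0, -1, -1, 0, 1, 1, 0, 0]
```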
The backtest simulates trades based on the signals, calculates the portfolio value, and tracks performance over time.
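A simplified sketch of such a backtest, under stated assumptions: the position decided on one bar earns the spread change of the next bar, one unit of the spread is traded, and transaction costs are ignored. The function name `backtest` and the `units` parameter are hypothetical. `initial_cash` mirrors the variable named in this README, and the default of 100,000 is an assumption.

```python
import numpy as np


def backtest(spread, positions, initial_cash=100_000.0, units=1.0):
    """Track portfolio value when holding `positions` units of the spread."""
    # Change in the spread from the previous bar (0 for the first bar)
    spread_change = np.diff(spread, prepend=spread[0])
    # The position held during a bar is the one decided on the previous bar
    held = np.concatenate(([0], positions[:-1]))
    pnl = held * spread_change * units
    return initial_cash + np.cumsum(pnl)


spread = np.array([0.0, 2.5, 1.0, -0.1, -2.5, -1.0, 0.2, 0.0])
positions = np.array([0, -1, -1, 0, 1, 1, 0, 0])
portfolio = backtest(spread, positions)
print(f"Final Portfolio Value: {portfolio[-1]}")
```

In this toy run the short leg earns 1.5 + 1.1 and the long leg earns 1.5 + 1.2, so the portfolio ends 5.3 above the starting cash.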
Cointegration Test p-value: 0.3424181037499916
Spread Mean: -5.3660187404602763e-14
Spread Std Dev: 5.106547769214354
Final Portfolio Value: 99777.6834577597
The script will also generate the following plots:
- Portfolio Value Over Time: A graph showing how the portfolio value evolves over time.
- Spread with Entry and Exit Thresholds: A plot displaying the spread between the two assets, along with entry and exit thresholds and marked entry/exit points.
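A rough sketch of how two such plots could be drawn with matplotlib. The data here is random stand-in data and every variable name is an assumption. For brevity the sketch marks only the bars beyond the entry bands, not the exits. The `Agg` backend is selected so the example runs without a display. Remove that line and call `plt.show()` for interactive use.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend (assumption: no display available)
import matplotlib.pyplot as plt
import numpy as np

# Stand-in data (assumption): a noisy spread and a drifting portfolio value
rng = np.random.default_rng(1)
spread = rng.normal(0, 5, 300)
portfolio = 100_000 + np.cumsum(rng.normal(0, 20, 300))

mean, std = spread.mean(), spread.std()
upper, lower = mean + 2 * std, mean - 2 * std
entries = np.where((spread > upper) | (spread < lower))[0]

fig, (ax_value, ax_spread) = plt.subplots(2, 1, figsize=(10, 8), sharex=True)

ax_value.plot(portfolio, label="Portfolio value")
ax_value.set_title("Portfolio Value Over Time")
ax_value.legend()

ax_spread.plot(spread, label="Spread")
ax_spread.axhline(upper, color="red", linestyle="--", label="Entry threshold")
ax_spread.axhline(lower, color="red", linestyle="--")
ax_spread.axhline(mean, color="green", linestyle=":", label="Exit (mean)")
ax_spread.scatter(entries, spread[entries], color="black", zorder=3, label="Entry points")
ax_spread.set_title("Spread with Entry and Exit Thresholds")
ax_spread.legend()

fig.tight_layout()
```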
- Data Input: Replace the simulated data with actual price data for the assets you want to trade.
- Entry/Exit Thresholds: Adjust the entry and exit thresholds based on your desired risk/reward profile.
- Initial Capital: Change the initial_cash variable to reflect your starting portfolio value.
This is a basic implementation for learning the Statistical Arbitrage strategy. To improve it:
- Use real data (e.g., stock prices).
- Account for transaction costs and risk.
- Experiment with different thresholds.
Run the program, visualize the results, and tweak the parameters to deepen your understanding!
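As one example of the transaction-cost improvement listed above, a flat fee could be charged every time the position changes. This is only a sketch under that assumption: the helper `apply_costs` and the `cost_per_trade` parameter are hypothetical, and real costs (commissions, bid-ask spread, slippage) are usually proportional to trade size.

```python
import numpy as np


def apply_costs(pnl, positions, cost_per_trade=1.0):
    """Subtract a flat fee from per-bar P&L whenever the position changes."""
    # Each unit of position change counts as one trade (a flip from -1 to +1 counts twice)
    trades = np.abs(np.diff(positions, prepend=0))
    return pnl - trades * cost_per_trade


positions = np.array([0, -1, -1, 0, 1, 1, 0, 0])
gross_pnl = np.array([0.0, 0.0, 1.5, 1.1, 0.0, 1.5, 1.2, 0.0])
net_pnl = apply_costs(gross_pnl, positions, cost_per_trade=0.25)

print(f"Gross P&L: {gross_pnl.sum()}")
print(f"Net P&L after costs: {net_pnl.sum()}")
```

Here four position changes at 0.25 each reduce the gross P&L of 5.3 by 1.0.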
Updated script: statistical-arbitage-pep-ko.py
This script implements a statistical arbitrage strategy using historical stock data for Coca-Cola (KO) and PepsiCo (PEP). It performs the following steps:
- Data Collection: Fetches historical stock prices from Yahoo Finance (2015-2020).
- Cointegration Test: Checks if the assets are cointegrated with a p-value threshold of 0.1.
- Spread Modeling: Calculates the spread between the two stock prices using linear regression.
- Entry/Exit Signals: Generates signals based on spread deviations from the mean.
- Backtesting: Simulates trades and tracks portfolio performance.
- Results: Outputs final portfolio value and plots performance.
Dependencies:
pip install yfinance statsmodels matplotlib pandas numpy
MIT License