Asset managers and quantitative analysts face a consistent challenge: selecting an allocation strategy that is robust out-of-sample, not just historically attractive. Mean-variance optimisation has been the industry standard since the work of Harry Markowitz (1952), yet in practice it is well documented to be highly sensitive to estimation error: small changes in expected-return inputs can produce dramatically different, often extreme, portfolios.
This project addresses two operational questions directly relevant to portfolio management:
- Which allocation strategy delivers the best risk-adjusted return when deployed forward in time — not fitted to history?
- How accurately does each strategy predict its own performance — and which strategies should be trusted when their assumptions are violated by regime change?
The predicted vs actual framework makes estimation error visible and measurable, rather than burying it in aggregate backtest statistics. This is directly applicable to strategy selection, risk budgeting, and model validation workflows at asset management firms.
Because return estimates are noisy, this project focuses on risk-based and regularised portfolio optimisation. All methods are evaluated using rolling out-of-sample backtests.
| Strategy | Estimation inputs | Purpose | Strength | Weakness |
|---|---|---|---|---|
| Equal Weight | None — 1/N baseline | Baseline | Very robust baseline | Ignores risk structure |
| Minimum Variance | Covariance only | Risk control | Good downside protection | Sensitive to covariance estimation |
| Risk Parity | Covariance only | Robust allocation | Diversified risk exposure | Requires stable covariance |
| Regularised Max Sharpe | Mean + Covariance, L2 penalty | Active tilt | Strong theoretical foundation | Unstable with noisy returns |
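
The four strategies in the table can be sketched as small weight functions. This is a minimal illustration, not the repository's `optimization.py`: the function names, the long-only and fully-invested constraints, the SLSQP solver, and the exact form of the L2 penalty (`l2 * sum(w**2)` added to the negative Sharpe ratio) are all assumptions made for the example.

```python
import numpy as np
from scipy.optimize import minimize


def _solve_long_only(objective, n):
    """Minimise `objective` over long-only weights summing to 1 (assumed constraints)."""
    res = minimize(
        objective,
        np.full(n, 1.0 / n),  # start from equal weights
        method="SLSQP",
        bounds=[(0.0, 1.0)] * n,
        constraints=[{"type": "eq", "fun": lambda w: w.sum() - 1.0}],
        options={"ftol": 1e-12, "maxiter": 500},
    )
    return res.x


def equal_weight(n):
    """1/N baseline: no estimation inputs at all."""
    return np.full(n, 1.0 / n)


def min_variance(cov):
    """Covariance-only: minimise portfolio variance w' C w."""
    scaled = cov / np.mean(np.diag(cov))  # rescaling leaves the minimiser unchanged
    return _solve_long_only(lambda w: w @ scaled @ w, cov.shape[0])


def risk_parity(cov):
    """Covariance-only: equalise each asset's share of portfolio variance."""
    n = cov.shape[0]

    def objective(w):
        contrib = w * (cov @ w)           # per-asset contribution to variance
        shares = contrib / contrib.sum()  # scale-free risk shares
        return np.sum((shares - 1.0 / n) ** 2)

    return _solve_long_only(objective, n)


def regularised_max_sharpe(mu, cov, l2=0.1):
    """Mean + covariance: maximise Sharpe minus an L2 penalty on the weights."""
    def objective(w):
        sharpe = (w @ mu) / np.sqrt(w @ cov @ w)
        return -sharpe + l2 * np.sum(w ** 2)

    return _solve_long_only(objective, len(mu))
```

Under this formulation a larger `l2` pulls the Max Sharpe weights towards 1/N, which is one way a penalty can limit the weight concentration that noisy return estimates would otherwise cause.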
Each strategy is evaluated across return, risk, and efficiency dimensions:
| Return Quality | Risk |
|---|---|
| Annualised Return (CAGR) | Annualised Volatility |
| Sharpe Ratio | Maximum Drawdown |
| Sortino Ratio | CVaR (95%) |
| Calmar Ratio | — |
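
The metrics in the table can be computed from a series of daily portfolio returns roughly as follows. This is a sketch under stated assumptions rather than the project's `metrics.py`: the function name, the 252-day annualisation factor, the zero default risk-free rate, and the historical (empirical quantile) CVaR are choices made for the example.

```python
import numpy as np

TRADING_DAYS = 252  # assumed annualisation factor for daily data


def performance_metrics(returns, rf=0.0, alpha=0.95):
    """Summarise a 1-D array of daily simple returns."""
    r = np.asarray(returns, dtype=float)
    wealth = np.concatenate(([1.0], np.cumprod(1.0 + r)))  # start at 1.0
    years = len(r) / TRADING_DAYS

    cagr = wealth[-1] ** (1.0 / years) - 1.0
    vol = r.std(ddof=1) * np.sqrt(TRADING_DAYS)

    excess = r - rf / TRADING_DAYS
    sharpe = excess.mean() / r.std(ddof=1) * np.sqrt(TRADING_DAYS)

    # Downside deviation: root mean square of negative excess returns only.
    downside = np.sqrt(np.mean(np.minimum(excess, 0.0) ** 2))
    sortino = excess.mean() / downside * np.sqrt(TRADING_DAYS)

    # Maximum drawdown: worst peak-to-trough loss of the wealth curve.
    max_dd = np.min(wealth / np.maximum.accumulate(wealth) - 1.0)

    # Historical CVaR: mean return in the worst (1 - alpha) tail.
    var = np.quantile(r, 1.0 - alpha)
    cvar = r[r <= var].mean()

    return {
        "cagr": cagr,
        "volatility": vol,
        "sharpe": sharpe,
        "sortino": sortino,
        "max_drawdown": max_dd,
        "cvar": cvar,
        "calmar": cagr / abs(max_dd),
    }
```

The sketch assumes the series contains at least one loss; a series with no negative returns would make the Sortino and Calmar denominators zero and needs separate handling.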
Turnover is tracked separately per rebalance — higher turnover means higher transaction costs in production.
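
One common way to measure turnover at a rebalance is half the sum of absolute weight changes, taken against the weights after they have drifted with the period's returns. The helper names, the one-way convention, and the proportional cost model below are assumptions for illustration; the project may define turnover differently.

```python
import numpy as np


def drifted_weights(weights, period_returns):
    """Weights at the end of a holding period, before rebalancing."""
    grown = np.asarray(weights) * (1.0 + np.asarray(period_returns))
    return grown / grown.sum()


def turnover(prev_weights, new_weights):
    """One-way turnover: half the sum of absolute weight changes."""
    diff = np.asarray(new_weights) - np.asarray(prev_weights)
    return 0.5 * np.abs(diff).sum()


def trading_cost(prev_weights, new_weights, cost_bps=10.0):
    """Proportional cost as a fraction of portfolio value (cost_bps is hypothetical)."""
    return 2.0 * turnover(prev_weights, new_weights) * cost_bps / 1e4
```

For example, moving from 50/50 to 70/30 is a one-way turnover of 0.2: 20% of the portfolio is sold and 20% is bought.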
To mitigate estimation error in portfolio inputs, the framework optionally applies shrinkage to both expected returns and the covariance matrix. This makes it possible to explore how different levels of regularisation affect portfolio stability and out-of-sample performance.
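
The idea behind both kinds of shrinkage is a convex blend of the noisy sample estimate with a simpler, more stable target. The sketch below uses a user-chosen intensity in [0, 1], a scaled-identity covariance target, and the cross-sectional grand mean as the return target. These function names and the fixed intensity are assumptions for the example; data-driven estimators pick the intensity from the sample instead.

```python
import numpy as np


def shrink_covariance(sample_cov, intensity):
    """Blend the sample covariance with a scaled-identity target.

    intensity = 0 keeps the sample estimate; intensity = 1 returns the target.
    """
    n = sample_cov.shape[0]
    target = np.eye(n) * np.trace(sample_cov) / n  # average variance on the diagonal
    return (1.0 - intensity) * sample_cov + intensity * target


def shrink_means(sample_mu, intensity):
    """Pull each asset's mean return towards the cross-sectional grand mean."""
    sample_mu = np.asarray(sample_mu, dtype=float)
    return (1.0 - intensity) * sample_mu + intensity * sample_mu.mean()
```

Shrinking the covariance this way preserves total variance (the trace) while damping off-diagonal terms, and shrinking the means narrows the spread between the highest and lowest expected returns.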
Additionally, each rebalance records predicted vs realised Sharpe, Sortino, and Volatility — measuring how accurately each strategy anticipated its own risk-adjusted performance.
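
A predicted-versus-realised record can be built by evaluating the same weights twice: once against the training-window estimates and once against the returns actually observed in the test period. The function names, the 252-day annualisation, and the assumption of constant weights within the test period are illustrative; Sortino is omitted to keep the sketch short.

```python
import numpy as np


def predicted_stats(weights, mu, cov, periods_per_year=252):
    """Ex-ante volatility and Sharpe implied by the training estimates."""
    ret = weights @ mu * periods_per_year
    vol = np.sqrt(weights @ cov @ weights * periods_per_year)
    return {"vol": vol, "sharpe": ret / vol}


def realised_stats(weights, test_returns, periods_per_year=252):
    """Ex-post volatility and Sharpe of the same weights over the test period."""
    port = np.asarray(test_returns) @ weights  # daily portfolio returns
    std = port.std(ddof=1)
    return {
        "vol": std * np.sqrt(periods_per_year),
        "sharpe": port.mean() / std * np.sqrt(periods_per_year),
    }
```

The gap between the two dictionaries at each rebalance is the estimation error the framework sets out to measure. If the test period were identical to the training window, the two would coincide.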
The framework employs a rolling walk-forward backtesting procedure: portfolio parameters are estimated on a fixed-length historical window, optimized weights are applied to the next rebalancing period, and the window is then advanced to simulate sequential real-time portfolio management.
flowchart LR
A["Select rolling training window<br>(1–10 years)"] --> B["Estimate expected returns<br>and covariance matrix"]
B --> C["Compute portfolio weights<br>for each strategy"]
C --> D["Apply weights to<br>next rebalance period"]
D --> E["Record realized returns<br>and risk metrics"]
E --> F{Next rebalance<br>date?}
F -->|Yes| G["Slide training window forward"]
G --> B
F -->|No| H["Aggregate results<br>and compare strategies"]
- Training: estimate expected returns and the covariance matrix from the rolling window.
- Weights: each strategy computes portfolio allocations independently using the same inputs.
- Test: the weights are held fixed until the next rebalance. No adjustments, no look-ahead.
- Window Update: the training window slides forward by one month, and the process repeats through the entire out-of-sample period.
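
The four steps above can be sketched as a single loop. This is a simplified illustration, not the project's `backtest.py`: the function name is hypothetical, windows are counted in rows rather than calendar dates, and a step of 21 trading days stands in for one month.

```python
import numpy as np


def walk_forward(returns, train_len, step, weight_fn):
    """Rolling walk-forward backtest on a (T, N) array of asset returns.

    weight_fn(mu, cov) -> weights is any strategy; it only ever sees
    data from before the test period, so there is no look-ahead.
    """
    returns = np.asarray(returns, dtype=float)
    oos_returns, weight_history = [], []
    start = train_len
    while start < len(returns):
        train = returns[start - train_len:start]  # Training window
        mu = train.mean(axis=0)
        cov = np.cov(train, rowvar=False)
        w = weight_fn(mu, cov)                     # Weights
        test = returns[start:start + step]         # Test: weights held fixed
        oos_returns.append(test @ w)
        weight_history.append(w)
        start += step                              # Window update
    return np.concatenate(oos_returns), np.array(weight_history)
```

For example, `walk_forward(daily_returns, train_len=4 * 252, step=21, weight_fn=lambda mu, cov: np.full(len(mu), 1 / len(mu)))` would run the equal-weight baseline with a four-year window, where `daily_returns` is whatever return matrix the caller supplies.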
Daily adjusted price data is downloaded from Yahoo Finance using the Python library yfinance.
The pipeline performs:
- price alignment across assets
- return calculation
- missing value handling
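
The three steps above might look like the following once a table of adjusted prices is in hand. The function name and the specific missing-value policy (forward-fill gaps, then drop leading rows where some asset has no price yet) are assumptions for the example; the project's `data.py` may handle gaps differently.

```python
import pandas as pd


def prices_to_returns(prices: pd.DataFrame) -> pd.DataFrame:
    """Turn a date-indexed price table (one column per asset) into daily returns."""
    aligned = prices.sort_index()       # align all assets on one ordered date index
    aligned = aligned.ffill()           # carry the last known price over gaps
    aligned = aligned.dropna()          # drop dates before every asset has a price
    returns = aligned / aligned.shift(1) - 1.0  # simple daily returns
    return returns.dropna()             # first row has no previous price


# The price table could come from yfinance, for example (not run here):
#   import yfinance as yf
#   prices = yf.download(["LMT", "NOC"], start="2018-01-01", auto_adjust=True)["Close"]
```

Forward-filling means a day with a missing quote contributes a zero return, which slightly understates volatility for assets with many gaps.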
All results in this project are fully reproducible. The workflow follows a deterministic pipeline:
flowchart LR
A["Download Prices<br>yfinance"] --> B["Compute Returns<br>Align Series"]
B --> C["Walk-Forward<br>Backtest"]
C --> D["Evaluation<br>Metrics"]
D --> E["Streamlit<br>Dashboard"]
The repository includes:
- Automated tests (pytest)
- Continuous integration (GitHub Actions)
- Deterministic backtesting (fixed pipeline)
This ensures the results can be independently verified and extended.
Python 3.11 · scipy · scikit-learn · pandas · numpy · yfinance · Streamlit · Plotly · pytest · GitHub Actions
└── 📁portfolio_optimization
    └── 📁.github
        └── 📁workflows
            ├── ci.yaml
    └── 📁.streamlit
        ├── config.toml
    └── 📁src
        └── 📁portfolio_optimization
            ├── __init__.py
            ├── backtest.py
            ├── config.py
            ├── data.py
            ├── main.py
            ├── metrics.py
            ├── optimization.py
    └── 📁tests
        ├── __init__.py
        ├── test_backtest.py
        ├── test_data.py
        ├── test_metrics.py
        ├── test_optimization.py
    ├── .gitignore
    ├── app.py
    ├── pyproject.toml
    ├── README.md
    └── requirements.txt
git clone https://github.com/marieltv/portfolio_optimization.git
cd portfolio_optimization
pip install -e ".[dev]"
streamlit run app.py
pytest -v

Evaluated on US defence equities (Boeing (BA), Northrop Grumman (NOC), Lockheed Martin (LMT), RTX Corporation (RTX), Axon Enterprise (AXON), and General Dynamics (GD)) over 2018–2026 using a 4-year rolling training window with monthly rebalancing.
| Strategy | CAGR | Sharpe | Sortino | Max Drawdown |
|---|---|---|---|---|
| Equal Weight | 20.3% | 1.05 | 1.08 | -17.5% |
| Min Variance | 19.8% | 1.05 | 1.10 | -16.8% |
| Risk Parity | 20.0% | 1.07 | 1.11 | -15.8% |
| Reg Max Sharpe | 24.9% | 1.16 | 1.23 | -17.4% |
Regularised Max Sharpe achieves the highest CAGR (24.9%), Sharpe (1.16), and Sortino (1.23), while Risk Parity has the shallowest maximum drawdown (-15.8%). Although Regularised Max Sharpe is theoretically the most sensitive to estimation error, the L2 penalty appears sufficient to prevent extreme weight concentration in this asset universe.
Estimation accuracy.
Volatility forecasts exhibit relatively low error (MAE ≈ 0.07) across strategies, consistent with covariance estimates being substantially more stable than expected-return estimates. However, the Spearman correlations between predicted and realised Sharpe and Sortino ratios are close to zero. This indicates that month-to-month risk-adjusted performance cannot be reliably inferred from historical estimates alone, in line with findings such as those of Robert C. Merton.
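
The two accuracy measures used here answer different questions: MAE asks how far the forecast is from the outcome on average, while Spearman correlation asks whether periods with a higher forecast also tend to have a higher outcome. A minimal sketch, assuming one predicted and one realised value per rebalance (the function name is hypothetical):

```python
import numpy as np
from scipy.stats import spearmanr


def forecast_accuracy(predicted, realised):
    """Return (MAE, Spearman rank correlation) for paired forecasts and outcomes."""
    predicted = np.asarray(predicted, dtype=float)
    realised = np.asarray(realised, dtype=float)
    mae = np.mean(np.abs(predicted - realised))
    rho, _pvalue = spearmanr(predicted, realised)
    return mae, rho
```

A forecast can have a low MAE and still a near-zero rank correlation if it barely moves from one period to the next, since a nearly constant forecast carries little information about which periods will be better or worse.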
Applying shrinkage techniques such as Ledoit–Wolf shrinkage and the James–Stein estimator resulted in only marginal improvements (≈0.01 or less across metrics), suggesting that estimation error is not the dominant source of out-of-sample degradation in this particular universe.
Predicted Sharpe and Sortino values remain relatively flat across time. This behaviour arises from estimating expected returns over a multi-year rolling window. Rather than indicating model failure, it reflects a fundamental empirical property of equity markets: covariance is moderately forecastable, whereas mean returns are not. In practice, these allocation methods therefore provide the greatest value through risk budgeting, diversification, and drawdown control, rather than through short-horizon return prediction.
An interactive dashboard built with Streamlit allows exploration of the backtest results and strategy behaviour.
The dashboard enables users to:
- select the training window length used for walk-forward optimisation
- compare portfolio strategies across multiple performance metrics
- analyse cumulative returns and drawdowns
- inspect predicted vs realised Sharpe, Sortino, and volatility
- control mean return shrinkage and covariance matrix shrinkage
- evaluate portfolio turnover and allocation dynamics
Interactive charts are rendered using Plotly.