A theory-derived optimizer that matches Adam with zero hyperparameter tuning.
The Syntonic optimizer dynamically infers the optimal learning-rate timescale from gradient statistics using the scaling law:

τ* = κ √(σ² / λ)

where σ² is the gradient variance, λ is the innovation rate (how fast the gradient landscape is changing), and κ = 1.0 is theory-determined, not tuned.
Adam's fixed EMA constants (β₁=0.9, β₂=0.999) accidentally encode temporal scales that happen to match typical training regimes. The Syntonic optimizer makes this structure explicit by computing τ* dynamically.
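To make the "accidental timescales" concrete: an EMA with decay β has an effective memory of roughly 1/(1−β) steps, so Adam's defaults hard-code two fixed timescales. A minimal sketch (the `ema_timescale` helper is illustrative, not part of the optimizer):

```python
def ema_timescale(beta):
    """Effective memory length, in steps, of an EMA with decay `beta`."""
    return 1.0 / (1.0 - beta)

tau_m = ema_timescale(0.9)    # momentum timescale: ~10 steps
tau_v = ema_timescale(0.999)  # second-moment timescale: ~1000 steps
```

These are the fixed scales the Syntonic optimizer replaces with a dynamically computed τ*.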
CIFAR-10 multi-regime results, accuracy by phase:

| Phase | Adam | Syntonic | Δ |
|---|---|---|---|
| Baseline (bs=128) | 80.48% | 82.37% | +1.89% |
| Large Batch (bs=512) | 86.86% | 84.38% | −2.48% |
| Gradient Noise (σ=0.1) | 82.32% | 82.30% | −0.02% |
| Label Noise (20%) | 85.13% | 84.95% | −0.18% |
| Recovery (bs=64) | 86.70% | 87.04% | +0.34% |
CIFAR-100 multi-regime results:

| Metric | Adam | Syntonic | Δ |
|---|---|---|---|
| Final accuracy | 62.56% | 61.79% | −0.77% |
Summary across benchmarks:

| Benchmark | Adam | Syntonic | Δ | Tuning required |
|---|---|---|---|---|
| CIFAR-10 (10 classes) | 86.70% | 87.04% | +0.34% | None |
| CIFAR-100 (100 classes) | 62.56% | 61.79% | −0.77% | None |
Both optimizers use identical conditions: same seed (42), same architecture, same data augmentation, same weight decay (1e-4), same gradient clipping (1.0). Neither optimizer is retuned across the five training phases.
Look at the bottom-left panel in the figures above: τ* Dynamic Adaptation.
Adam's temporal scale (dashed line, τ₁≈10) is fixed. The Syntonic optimizer adjusts τ* in response to regime changes: it rises during the gradient-noise and label-noise phases, then recovers. This is inference rather than coincidence.
The bottom-right panel (Effective Learning Rate) shows the consequence: Adam stays flat at 10⁻³ regardless of regime. The Syntonic optimizer modulates its learning rate based on the local σ²/λ ratio.
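This modulation follows directly from the update rule (effective step size base_lr × κ / τ*, with τ* clamped to [τ_min, τ_max]). A small numeric sketch under those formulas:

```python
import math

# Illustrative only (not the full optimizer): noisier gradients (larger
# sigma2) give a larger tau*, and therefore a smaller effective step size.
def effective_lr(base_lr, sigma2, lam, kappa=1.0, tau_min=1.0, tau_max=500.0):
    tau_star = kappa * math.sqrt(sigma2 / lam)
    tau_star = min(max(tau_star, tau_min), tau_max)  # clamp as in the optimizer
    return base_lr * kappa / tau_star

quiet = effective_lr(1e-3, sigma2=1.0, lam=0.01)    # tau* = 10  -> lr 1e-4
noisy = effective_lr(1e-3, sigma2=100.0, lam=0.01)  # tau* = 100 -> lr 1e-5
```

Adam, by contrast, applies the same 10⁻³ in both cases.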
Both notebooks run directly in Google Colab (free GPU):
| Notebook | Colab |
|---|---|
| CIFAR-10 Multi-Regime | |
| CIFAR-100 Multi-Regime | |
Expected runtime: 30 minutes per notebook on Colab free tier (T4 GPU).
```python
import torch

class SyntonicV4(torch.optim.Optimizer):
    """
    Syntonic V4 — Theory-derived adaptive optimizer.

    Update: p -= (base_lr × κ / τ*) ⊙ (m̂ / (√v̂ + ε))
    Tempo:  τ* = κ √(σ²/λ)   [per-element, clamped]

    κ = 1.0 is the theoretical value (not tuned).
    """
    def __init__(self, params, base_lr=0.001, kappa=1.0,
                 tau_min=1.0, tau_max=500.0,
                 beta_m=0.9, beta_v=0.999,
                 beta_sigma=0.999, beta_lambda=0.99,
                 eps=1e-8, weight_decay=0.0):
        # ... (see notebooks for full implementation)
```

The full implementation is ~80 lines of PyTorch; see the notebooks for the complete, runnable code.
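To illustrate the update rule without the full PyTorch machinery, here is a toy single-scalar re-implementation of the docstring's formulas. Everything in it (the class name, the variance and innovation estimators) is an assumption for illustration, not the notebooks' actual code:

```python
import math

# Toy scalar sketch of the V4 rule:
#   tau* = kappa * sqrt(sigma2 / lam), clamped to [tau_min, tau_max]
#   p   -= (base_lr * kappa / tau*) * m_hat / (sqrt(v_hat) + eps)
class TinySyntonic:
    def __init__(self, base_lr=1e-3, kappa=1.0, tau_min=1.0, tau_max=500.0,
                 beta_m=0.9, beta_v=0.999, beta_sigma=0.999, beta_lambda=0.99,
                 eps=1e-8):
        self.base_lr, self.kappa, self.eps = base_lr, kappa, eps
        self.tau_min, self.tau_max = tau_min, tau_max
        self.beta_m, self.beta_v = beta_m, beta_v
        self.beta_sigma, self.beta_lambda = beta_sigma, beta_lambda
        self.m = self.v = self.sigma2 = self.lam = 0.0
        self.prev_g = None
        self.t = 0

    def step(self, p, g):
        self.t += 1
        # Adam-style first and second moments with bias correction.
        self.m = self.beta_m * self.m + (1 - self.beta_m) * g
        self.v = self.beta_v * self.v + (1 - self.beta_v) * g * g
        m_hat = self.m / (1 - self.beta_m ** self.t)
        v_hat = self.v / (1 - self.beta_v ** self.t)
        # Gradient-variance estimate around the running mean.
        self.sigma2 = (self.beta_sigma * self.sigma2
                       + (1 - self.beta_sigma) * (g - self.m) ** 2)
        # Innovation-rate estimate: how fast the gradient itself changes.
        if self.prev_g is not None:
            self.lam = (self.beta_lambda * self.lam
                        + (1 - self.beta_lambda) * abs(g - self.prev_g))
        self.prev_g = g
        tau = self.kappa * math.sqrt(self.sigma2 / max(self.lam, self.eps))
        tau = min(max(tau, self.tau_min), self.tau_max)
        return p - (self.base_lr * self.kappa / tau) * m_hat / (math.sqrt(v_hat) + self.eps)

# Toy run on f(p) = p^2 (gradient 2p): the iterate moves toward 0.
opt = TinySyntonic()
p = 1.0
for _ in range(200):
    p = opt.step(p, 2.0 * p)
```

On a clean quadratic the estimated λ is tiny, so τ* clamps at τ_max and steps are conservative; the point of the sketch is the mechanics, not the convergence rate.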
Both optimizers are configured once at epoch 1 and never retuned:
| Phase | Epochs | Perturbation | Rationale |
|---|---|---|---|
| 1 | 1–10 | Batch size 128 (baseline) | Establish convergence |
| 2 | 11–20 | Batch size 512 | Variance landscape shift |
| 3 | 21–30 | Gradient noise σ=0.1 | Stochastic perturbation |
| 4 | 31–40 | 20% label corruption | Task non-stationarity |
| 5 | 41–50 | Batch size 64, clean | Re-adaptation |
This protocol tests whether the optimizer adapts to regime shifts without human intervention.
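The schedule above can be written down as a plain config. The field names (`grad_noise`, `label_noise`) are assumptions for illustration, not the notebooks' variable names, and the batch size in phases 3–4 is assumed to return to the 128 baseline since the table leaves it unchanged:

```python
# Five-phase protocol as data; epochs are inclusive ranges.
PHASES = [
    {"phase": 1, "epochs": range(1, 11),  "batch_size": 128, "grad_noise": 0.0, "label_noise": 0.0},
    {"phase": 2, "epochs": range(11, 21), "batch_size": 512, "grad_noise": 0.0, "label_noise": 0.0},
    {"phase": 3, "epochs": range(21, 31), "batch_size": 128, "grad_noise": 0.1, "label_noise": 0.0},
    {"phase": 4, "epochs": range(31, 41), "batch_size": 128, "grad_noise": 0.0, "label_noise": 0.2},
    {"phase": 5, "epochs": range(41, 51), "batch_size": 64,  "grad_noise": 0.0, "label_noise": 0.0},
]

def phase_for_epoch(epoch):
    """Return the phase config active at a given 1-indexed epoch."""
    return next(p for p in PHASES if epoch in p["epochs"])
```

Crucially, nothing in this config touches the optimizers: the perturbations change, the hyperparameters do not.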
The Syntony Principle proposes that the optimal integration time for any adaptive system operating under uncertainty follows:

τ* = κ √(σ² / λ)
This is the geometric mean between the noise timescale and the innovation timescale. The result emerges independently from 10 mathematical derivations (Kalman-Riccati, bias-variance, dimensional analysis, information theory, optimal stopping, H∞ control, Cramér-Rao bounds, Allan variance, multi-agent consensus, and the UTAE axiomatic framework).
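One way the square-root law can arise is from a bias-variance trade-off. The cost below is an assumed toy model, not the paper's exact derivation: averaging over a window τ shrinks noise variance like σ²/τ but accumulates drift like λτ, and the sum is minimized exactly at τ* = √(σ²/λ). A brute-force numeric check:

```python
import math

def total_error(tau, sigma2, lam):
    # Toy cost: residual noise variance + accumulated drift.
    return sigma2 / tau + lam * tau

sigma2, lam = 4.0, 0.01
# Fine grid search over tau in [0.1, ~1000].
taus = [0.1 + 0.01 * k for k in range(100000)]
tau_best = min(taus, key=lambda t: total_error(t, sigma2, lam))
tau_theory = math.sqrt(sigma2 / lam)  # closed-form minimizer: 20.0
```

The grid minimizer lands on the closed-form √(σ²/λ), the geometric balance point between the two error sources.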
Full theoretical framework: Bronsard, J.-P. (2025). The Syntony Principle: A Structural Scaling Law for Adaptive Systems — V4.1 Canonical Edition. Zenodo. DOI: 10.5281/zenodo.17254395
Deep learning validation paper: DOI: 10.5281/zenodo.18527033
ImageNet-scale validation (ResNet-50, ~25M parameters) is the natural next step. The key question: does the square-root scaling τ* = κ√(σ²/λ) persist in high-dimensional loss landscapes?
If you have access to multi-GPU compute and are interested in collaborating on this, please reach out.
- Seed: 42 (fixed for all experiments)
- Framework: PyTorch
- Hardware: Google Colab free tier (T4 GPU)
- All results reproducible by running the notebooks end-to-end
```bibtex
@software{bronsard2025syntonic,
  author    = {Bronsard, Jean-Pierre},
  title     = {Syntonic Optimizer: Theory-Derived Adaptive Learning Rate
               from the Syntony Principle},
  year      = {2025},
  publisher = {Zenodo},
  doi       = {10.5281/zenodo.18527033},
  url       = {https://doi.org/10.5281/zenodo.18527033}
}
```

CC BY 4.0: you are free to use, share, and adapt this work with attribution.
Jean-Pierre Bronsard · SyntonicAI Recherche, Montréal, QC, Canada · ORCID: 0009-0008-6639-7553

