A lightweight Automatic Differentiation (AD) library built from scratch in Python and NumPy. This project implements Reverse-Mode Differentiation (Backpropagation) using a dynamic computational graph, enabling gradient computation for complex linear algebra operations without relying on heavy frameworks like PyTorch or TensorFlow.
- Reverse-Mode AD: Efficiently computes gradients for scalar and tensor operations via the Chain Rule.
- Dynamic Computational Graph: Constructs graphs on-the-fly (similar to PyTorch's autograd).
- Linear Algebra Support: Includes custom backward definitions for complex matrix operations:
  - Matrix Multiplication (`@`)
  - Linear Solve (`np.linalg.solve`)
  - Log Determinant (`np.linalg.slogdet`)
- Broadcasting Support: Handles gradient shape matching automatically for broadcasted operations.
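As context for the broadcasting point above: when a forward operation broadcasts an input to a larger shape, the backward pass must reduce the upstream gradient back to the input's original shape. Below is a minimal, generic sketch of that reduction; the `unbroadcast` helper name and the code are illustrative only and are not taken from autodiff.py.

```python
import numpy as np

def unbroadcast(grad, shape):
    """Sum `grad` down to `shape`, undoing NumPy broadcasting.

    Illustrative helper, not the library's actual implementation.
    """
    # Sum away leading axes that broadcasting prepended.
    while grad.ndim > len(shape):
        grad = grad.sum(axis=0)
    # Sum over axes that were size 1 in the original shape.
    for axis, size in enumerate(shape):
        if size == 1:
            grad = grad.sum(axis=axis, keepdims=True)
    return grad

# Example: z = x + y with x of shape (3, 1) broadcast against y of shape (3, 4).
x = np.ones((3, 1))
y = np.ones((3, 4))
upstream = np.ones((3, 4))           # dL/dz
dx = unbroadcast(upstream, x.shape)  # shape (3, 1), each entry 4.0
dy = unbroadcast(upstream, y.shape)  # shape (3, 4), each entry 1.0
print(dx.shape, dy.shape)
```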
The engine dynamically builds a computational graph to track operations. Below is the graph for the expression `z = x * y + x`.
Figure 1: Gradient accumulation in the backward pass (∂z/∂x = y + 1).
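Because x appears on two paths of the graph (once through the multiplication and once through the addition), its gradient accumulates one contribution per path:

$$
z = x \cdot y + x
\quad\Longrightarrow\quad
\frac{\partial z}{\partial x}
  = \underbrace{\frac{\partial (x y)}{\partial x}}_{=\,y}
  + \underbrace{\frac{\partial x}{\partial x}}_{=\,1}
  = y + 1,
\qquad
\frac{\partial z}{\partial y} = x.
$$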
- Clone the repository:
  `git clone https://github.com/YOUR-USERNAME/mini-autodiff.git`
  `cd mini-autodiff`
- Install dependencies (this project requires `numpy` and `scipy`):
  `pip install numpy scipy`
You can use the autodiff library to define scalar or matrix functions and compute their gradients automatically.
```python
import numpy as np
import autodiff as ad

# 1. Define inputs
x = 2.0
y = 3.0

# 2. Define a computation function
def my_func(x, y):
    # z = x * y + x
    return x * y + x

# 3. Compute Gradients
# ad.grad() returns a function that computes the gradient w.r.t. inputs
grad_fn = ad.grad(my_func)
grads = grad_fn(x, y)

print(f"Gradient dx: {grads[0]}")  # Output: 4.0 (y + 1)
print(f"Gradient dy: {grads[1]}")  # Output: 2.0 (x)
```

The engine is robust enough to handle complex statistical models. The included demo.ipynb showcases computing the gradient of the Negative Log-Likelihood of a Multivariate Gaussian distribution with respect to the Covariance Matrix $\Sigma$.
The engine correctly handles the gradients for `logdet` and `solve`:
- $\frac{\partial \log|\Sigma|}{\partial \Sigma} = (\Sigma^{-1})^\top$
- Backpropagation through linear systems $Ax = b$.
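For the solve case, the corresponding vector-Jacobian product follows from implicitly differentiating $Ax = b$. With upstream gradient $\bar{x}$, the standard identities (stated generically, not quoted from autodiff.py) are:

$$
\bar{b} = A^{-\top}\bar{x}, \qquad \bar{A} = -\,\bar{b}\,x^{\top},
$$

so the backward pass only needs one additional solve with $A^\top$ rather than an explicit matrix inverse.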
Check demo.ipynb for the full implementation and numerical verification against finite differences.
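The finite-difference check itself is framework-agnostic; a minimal sketch of the central-difference pattern is shown below (a generic utility, not code taken from demo.ipynb).

```python
import numpy as np

def finite_difference_grad(f, x, eps=1e-6):
    """Central-difference approximation of df/dx for a scalar-valued f.

    Generic verification helper, not taken from demo.ipynb.
    """
    x = np.asarray(x, dtype=float)
    grad = np.zeros_like(x)
    it = np.nditer(x, flags=["multi_index"])
    for _ in it:
        idx = it.multi_index
        x_plus, x_minus = x.copy(), x.copy()
        x_plus[idx] += eps
        x_minus[idx] -= eps
        grad[idx] = (f(x_plus) - f(x_minus)) / (2 * eps)
    return grad

# Compare against the analytic gradient of f(x) = sum(x**2), which is 2x.
x0 = np.array([[1.0, 2.0], [3.0, 4.0]])
approx = finite_difference_grad(lambda x: np.sum(x ** 2), x0)
assert np.allclose(approx, 2 * x0, atol=1e-4)
```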
- `autodiff.py`: The core library containing the `Var` class, `Op` definitions, and the topological sort backpropagation engine.
- `demo.ipynb`: A Jupyter Notebook demonstrating usage and validating correctness against numerical gradients.
This implementation uses Operator Overloading to build a Directed Acyclic Graph (DAG) as operations are performed. When grad() is called:
- Topological Sort: The graph is traversed to ensure we process nodes in dependency order.
- Vector-Jacobian Product (VJP): Gradients are propagated backward from the output to the inputs using defined VJPs for each operator.
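As an illustration of this pattern only (the real `Var` and `Op` classes in autodiff.py are more complete), a stripped-down engine with operator overloading, topological sort, and VJP-based backpropagation might look like this:

```python
import numpy as np

class Var:
    """Minimal sketch of a reverse-mode AD node; not the library's actual Var class."""

    def __init__(self, value, parents=(), vjps=()):
        self.value = np.asarray(value, dtype=float)
        self.parents = parents   # upstream Vars this node depends on
        self.vjps = vjps         # one VJP callable per parent
        self.grad = None

    def __mul__(self, other):
        other = other if isinstance(other, Var) else Var(other)
        return Var(self.value * other.value,
                   parents=(self, other),
                   vjps=(lambda g: g * other.value,   # d(xy)/dx = y
                         lambda g: g * self.value))   # d(xy)/dy = x

    def __add__(self, other):
        other = other if isinstance(other, Var) else Var(other)
        return Var(self.value + other.value,
                   parents=(self, other),
                   vjps=(lambda g: g, lambda g: g))   # addition passes gradients through

    def backward(self):
        # Topological sort of the DAG rooted at this node.
        order, visited = [], set()
        def visit(node):
            if id(node) not in visited:
                visited.add(id(node))
                for p in node.parents:
                    visit(p)
                order.append(node)
        visit(self)

        # Propagate vector-Jacobian products from the output back to the inputs.
        self.grad = np.ones_like(self.value)
        for node in reversed(order):
            for parent, vjp in zip(node.parents, node.vjps):
                contribution = vjp(node.grad)
                parent.grad = contribution if parent.grad is None else parent.grad + contribution

# Reproduces the earlier example: z = x * y + x  =>  dz/dx = y + 1, dz/dy = x
x, y = Var(2.0), Var(3.0)
z = x * y + x
z.backward()
print(x.grad, y.grad)  # 4.0 2.0
```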
Completed as part of the coursework for COMPSCI 689: Machine Learning at the University of Massachusetts Amherst (Fall 2025).
