This is one of the 100+ free recipes of the IPython Cookbook, Second Edition, by Cyrille Rossant, a guide to numerical computing and data science in the Jupyter Notebook. The ebook and printed book are available for purchase at Packt Publishing.
▶ Text on GitHub with a CC-BY-NC-ND license
▶ Code on GitHub with a MIT license
Chapter 15: Symbolic and Numerical Mathematics
SymPy includes a module named stats that lets us create and manipulate random variables. This is useful when we work with probabilistic or statistical models; we can compute symbolic expectations, variances, probabilities, and densities of random variables.
- Let's import SymPy and the stats module:
from sympy import *
from sympy.stats import *
init_printing()
- Let's roll two dice, X and Y, with six faces each:
X, Y = Die('X', 6), Die('Y', 6)
- We can compute probabilities defined by equalities (with the Eq operator) or inequalities:
P(Eq(X, 3))
P(X > 3)
- Conditions can also involve multiple random variables:
P(X > Y)
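- We can compute conditional probabilities by passing the condition as the second argument of P (here, the probability that X + Y > 6 given that X < 5):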
P(X + Y > 6, X < 5)
- We can also work with arbitrary discrete or continuous random variables:
Z = Normal('Z', 0, 1) # Gaussian variable
P(Z > pi)
- We can compute expectations and variances:
E(Z**2), variance(Z**2)
- We can also compute densities:
f = density(Z)
var('x')
f(x)
- We can plot these densities:
%matplotlib inline
plot(f(x), (x, -6, 6))
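As a sanity check on the dice probabilities computed in the steps above, the same quantities can be recomputed by brute-force enumeration of the 36 equally likely outcomes. The following is a minimal sketch using only the standard library; the helper name prob and its given parameter are made up for this illustration and are not part of SymPy.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes (x, y) of two six-sided dice.
outcomes = list(product(range(1, 7), repeat=2))


def prob(event, given=lambda x, y: True):
    """Exact probability of `event`, restricted to outcomes where `given` holds.

    Hypothetical helper for illustration only (not a SymPy function).
    """
    kept = [(x, y) for x, y in outcomes if given(x, y)]
    hits = [(x, y) for x, y in kept if event(x, y)]
    return Fraction(len(hits), len(kept))


p_eq = prob(lambda x, y: x == 3)        # counterpart of P(Eq(X, 3))
p_gt = prob(lambda x, y: x > 3)         # counterpart of P(X > 3)
p_x_gt_y = prob(lambda x, y: x > y)     # counterpart of P(X > Y)
p_cond = prob(lambda x, y: x + y > 6,   # counterpart of P(X + Y > 6, X < 5)
              given=lambda x, y: x < 5)

print(p_eq, p_gt, p_x_gt_y, p_cond)
```

These fractions should agree with the exact values returned by SymPy, since both approaches count outcomes of the same finite probability space.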
SymPy's stats module contains many functions to define random variables with classical laws (binomial, exponential, and so on), discrete or continuous. It works by leveraging SymPy's powerful integration algorithms to compute exact probabilistic quantities as integrals of probability distributions. For example, the probability P(Z > pi) computed earlier is the integral of the density f over the interval from pi to infinity:
Eq(Integral(f(x), (x, pi, oo)),
simplify(integrate(f(x), (x, pi, oo))))
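To make the link between probabilities and integrals concrete, here is a small self-contained sketch that compares three routes to the same number: the stats module's P, a direct integration of the density, and the usual closed form of the Gaussian tail, erfc(pi/sqrt(2))/2. The variable names are chosen for this illustration; explicit imports are used instead of the star imports so that the block can run on its own.

```python
from sympy import N, erfc, integrate, oo, pi, sqrt, symbols
from sympy.stats import Normal, P, density

x = symbols('x')
Z = Normal('Z', 0, 1)   # standard Gaussian variable, as in the recipe
f = density(Z)          # callable density; f(x) is a symbolic expression

prob_value = P(Z > pi)                          # via the stats module
integral_value = integrate(f(x), (x, pi, oo))   # via direct integration
closed_form = erfc(pi / sqrt(2)) / 2            # Gaussian tail formula

# Convert the exact symbolic results to floats for a numerical comparison.
prob_float = float(N(prob_value))
integral_float = float(N(integral_value))
closed_float = float(N(closed_form))

print(prob_float, integral_float, closed_float)
```

The three printed values should coincide up to floating-point precision; the symbolic forms may be written differently (for instance with erf instead of erfc) depending on the SymPy version.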
Note that the equality condition is written using the Eq operator rather than the more standard == Python syntax. This is a general feature in SymPy: == tests equality between Python objects (for SymPy expressions, it checks whether they have the same structure and returns a plain Boolean), whereas Eq represents the mathematical equality between symbolic expressions.
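The difference between == and Eq can be seen on ordinary symbolic expressions, independently of the stats module. The snippet below is an illustrative sketch; the expressions and variable names are chosen for this example only.

```python
from sympy import Eq, expand, simplify, symbols
from sympy.core.relational import Equality

x = symbols('x')
lhs = (x + 1)**2
rhs = x**2 + 2*x + 1

# == compares the structure of the two expressions and returns a Python bool.
# The expressions are mathematically equal but written differently.
structurally_equal = (lhs == rhs)
equal_after_expand = (expand(lhs) == rhs)

# A common way to test mathematical equality: simplify the difference.
difference_is_zero = (simplify(lhs - rhs) == 0)

# Eq builds a symbolic equation object that can be passed to P, solve, etc.
equation = Eq(x, 3)
is_symbolic = isinstance(equation, Equality)

# Substituting a value lets SymPy evaluate the equation to a truth value.
holds_at_3 = bool(equation.subs(x, 3))

print(structurally_equal, equal_after_expand, difference_is_zero,
      is_symbolic, holds_at_3)
```

This is why the recipe writes P(Eq(X, 3)): the expression X == 3 would be evaluated immediately to a Python Boolean instead of producing a condition that P can work with.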
Here are a few references:
- SymPy stats module documentation at http://docs.sympy.org/latest/modules/stats.html
- Probability lectures on Awesome Math, at https://github.com/rossant/awesome-math/#probability-theory
- Statistics lectures on Awesome Math, at https://github.com/rossant/awesome-math/#statistics