This is one of the 100+ free recipes of the IPython Cookbook, Second Edition, by Cyrille Rossant, a guide to numerical computing and data science in the Jupyter Notebook. The ebook and printed book are available for purchase at Packt Publishing.

Text on GitHub with a CC-BY-NC-ND license
Code on GitHub with a MIT license

Chapter 15: Symbolic and Numerical Mathematics

15.4. Computing exact probabilities and manipulating random variables

SymPy includes a module named stats that lets us create and manipulate random variables. This is useful when we work with probabilistic or statistical models; we can compute symbolic expectations, variances, probabilities, and densities of random variables.

How to do it...

  1. Let's import SymPy and the stats module:
from sympy import *
from sympy.stats import *
init_printing()
  1. Let's roll two dice, X and Y, with six faces each:
X, Y = Die('X', 6), Die('Y', 6)
  1. We can compute probabilities defined by equalities (with the Eq operator) or inequalities:
P(Eq(X, 3))

1/6

P(X > 3)

1/2
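
As a cross-check on the value above, here is a minimal sketch that recovers the same probability by summing the probability mass function of the die. It assumes the `dict` attribute of the object returned by `density`, which maps each face to its probability; the names `pmf` and `p_gt_3` are illustrative only.

```python
from sympy.stats import Die, P, density

X = Die('X', 6)

# Probability mass function of the die as a mapping {face: probability}.
pmf = density(X).dict

# Sum the masses of the faces that satisfy the condition X > 3.
p_gt_3 = sum(p for face, p in pmf.items() if face > 3)

print(pmf)
print(p_gt_3, P(X > 3))
```

Both printed probabilities should agree, since `P` evaluates the same finite sum symbolically.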

  1. Conditions can also involve multiple random variables:
P(X > Y)

5/12
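
One way to see where this value comes from is to note that the event X > Y is the same as X - Y > 0. The sketch below, which is an illustration rather than part of the recipe, builds the distribution of the difference with `density` and sums the mass on the positive values; `diff_pmf` and `p_x_gt_y` are hypothetical names.

```python
from sympy.stats import Die, P, density

X, Y = Die('X', 6), Die('Y', 6)

# Distribution of the difference X - Y as a mapping {value: probability}.
diff_pmf = density(X - Y).dict

# X > Y holds exactly when the difference is strictly positive.
p_x_gt_y = sum(p for diff, p in diff_pmf.items() if diff > 0)

print(p_x_gt_y, P(X > Y))
```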

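  1. We can compute conditional probabilities by passing the condition as a second argument. Here, we compute the probability that X + Y > 6 given that X < 5: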
P(X + Y > 6, X < 5)

5/12
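
The two-argument form of `P` can be checked against the definition of conditional probability by counting outcomes directly. The following sketch uses only the standard library for the counting part; the names `given_outcomes`, `both`, and `by_counting` are assumptions made for this illustration.

```python
from fractions import Fraction
from itertools import product

from sympy.stats import Die, P

X, Y = Die('X', 6), Die('Y', 6)

# SymPy's answer: probability of X + Y > 6 given X < 5.
conditional = P(X + Y > 6, X < 5)

# Enumerate the 36 equally likely outcomes and keep those with x < 5.
outcomes = list(product(range(1, 7), repeat=2))
given_outcomes = [(x, y) for x, y in outcomes if x < 5]

# Among those, count the outcomes that also satisfy x + y > 6.
both = [(x, y) for x, y in given_outcomes if x + y > 6]
by_counting = Fraction(len(both), len(given_outcomes))

print(conditional, by_counting)
```

Restricting the sample space to the outcomes where the condition holds, and then counting, is exactly what the conditional probability expresses for a finite uniform space.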

  1. We can also work with arbitrary discrete or continuous random variables:
Z = Normal('Z', 0, 1)  # Gaussian variable
P(Z > pi)

$\frac{1}{2} - \frac{1}{2} \operatorname{erf}\left(\frac{\sqrt{2}\pi}{2}\right)$
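
The exact result is a symbolic expression, so it can be useful to see its numerical value as well. Here is a small sketch that converts it to a float and compares it with the same tail probability computed from the standard library's `math.erfc`; the variable names are illustrative.

```python
import math

from sympy import pi
from sympy.stats import Normal, P

Z = Normal('Z', 0, 1)

# Exact symbolic tail probability of a standard normal variable.
exact = P(Z > pi)

# Numerical value of the symbolic expression.
approx = float(exact)

# Independent reference: P(Z > a) = erfc(a / sqrt(2)) / 2 for a standard normal.
reference = 0.5 * math.erfc(math.pi / math.sqrt(2))

print(exact)
print(approx, reference)
```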

  1. We can compute expectations and variances:
E(Z**2), variance(Z**2)

(1, 2)
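
The variance above follows from the identity Var(Z²) = E(Z⁴) − E(Z²)². A minimal sketch that checks this with the same functions (the intermediate names are chosen for this example only):

```python
from sympy.stats import Normal, E, variance

Z = Normal('Z', 0, 1)

# Second and fourth moments of the standard normal variable.
second_moment = E(Z**2)
fourth_moment = E(Z**4)

# Variance of Z**2 rebuilt from the two moments.
var_from_moments = fourth_moment - second_moment**2

print(second_moment, fourth_moment, var_from_moments, variance(Z**2))
```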

  1. We can also compute densities:
f = density(Z)
var('x')
f(x)

$\frac{\sqrt{2}\, e^{-x^2/2}}{2\sqrt{\pi}}$
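
As a sanity check, the density returned by SymPy can be compared with the textbook formula for the standard normal density and integrated over the real line. This is a sketch under the assumption that `x` is a plain SymPy symbol, as in the step above; `closed_form`, `difference`, and `total_mass` are illustrative names.

```python
from sympy import Symbol, exp, integrate, oo, pi, simplify, sqrt
from sympy.stats import Normal, density

x = Symbol('x')
Z = Normal('Z', 0, 1)

# Density of Z evaluated at the symbol x.
pdf = density(Z)(x)

# Textbook density of the standard normal distribution.
closed_form = exp(-x**2 / 2) / sqrt(2 * pi)

# The difference should simplify to zero if both expressions agree.
difference = simplify(pdf - closed_form)

# A probability density integrates to 1 over its support.
total_mass = integrate(pdf, (x, -oo, oo))

print(difference, total_mass)
```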

  1. We can plot these densities:
%matplotlib inline
plot(f(x), (x, -6, 6))

[Figure: plot of the density f(x) of Z for x between -6 and 6]

How it works...

SymPy's stats module contains many functions to define random variables following classical distributions (binomial, exponential, and so on), either discrete or continuous. It relies on SymPy's integration algorithms to compute exact probabilistic quantities as integrals of probability densities. For example, $P(Z > \pi)$ is:

Eq(Integral(f(x), (x, pi, oo)),
   simplify(integrate(f(x), (x, pi, oo))))

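$\int_{\pi}^{\infty} \frac{\sqrt{2}\, e^{-x^2/2}}{2\sqrt{\pi}}\, dx = \frac{1}{2} - \frac{1}{2} \operatorname{erf}\left(\frac{\sqrt{2}\pi}{2}\right)$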
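
The binomial and exponential distributions mentioned above are available as `Binomial` and `Exponential` in `sympy.stats`. The following sketch shows one discrete and one continuous example; the parameters (four fair coin tosses, a rate of 2) are arbitrary choices for illustration and do not come from the recipe.

```python
from sympy import Eq, Rational, exp, simplify
from sympy.stats import Binomial, Exponential, E, P

# Discrete case: number of heads in 4 tosses of a fair coin.
B = Binomial('B', 4, Rational(1, 2))
p_two_heads = P(Eq(B, 2))
mean_heads = E(B)

# Continuous case: exponential variable with rate 2.
T = Exponential('T', 2)
mean_wait = E(T)
p_long_wait = P(T > 1)

print(p_two_heads, mean_heads)
print(mean_wait, p_long_wait)
```

In the discrete case the probabilities are finite sums, whereas in the continuous case they are computed as integrals of the density, as in the Gaussian example.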

Note that the equality condition is written using the Eq operator rather than the more standard == Python syntax. This is a general feature in SymPy: == tests whether two Python objects are structurally identical and immediately returns True or False, whereas Eq builds a symbolic equation between expressions that SymPy can then manipulate.
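
The difference can be seen with a short sketch (the symbol `x` and the variable names are introduced here only for illustration):

```python
from sympy import Eq, Symbol
from sympy.stats import Die, P

x = Symbol('x')
X = Die('X', 6)

# == compares the two objects structurally and returns a plain Python bool.
structural = (x == 3)

# Eq builds a symbolic equation that stays unevaluated until more is known.
equation = Eq(x, 3)

print(structural)            # False
print(equation)              # Eq(x, 3)
print(equation.subs(x, 3))   # True

# This is why probabilities of equalities are written with Eq.
prob = P(Eq(X, 3))
print(prob)                  # 1/6
```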

There's more...

Here are a few references: