Using a Monte Carlo simulation to calculate a value for π
- Length of a simulation = How many points/coordinates were used in that simulation
- Execution time = How long it takes the program to complete the simulation
- Unit Circle = Circle of radius 1 with a center at the origin (0,0) on a standard coordinate system
- Determine the association, if any exists, between the accuracy of a simulation and the length of the simulation
- Determine the association, if any exists, between the spread of resulting values of π and how many coordinates/points were used in the simulation
- Determine the association, if any exists, between the length of the simulation and the execution time
By definition, Monte Carlo simulations use random number sampling to obtain a desired numerical result. To understand the justification for using such a simulation to calculate an experimental value for π, one needs only a rudimentary knowledge of probability, algebra and geometry.
The above is the unit circle inscribed inside a square. While it is technically possible to run the simulation with this figure, I decided not to for several reasons.
- This is a larger area and the resultant data would be equivalent to data acquired from a smaller segment
- It would be more difficult, in my view, to understand the visual representation
- The side length of the square is double the magnitude of the circle's radius. Because of this, the calculations would be more complex
I instead opted to slice the diagram into a fourth as shown below.
The area of the region inside the quarter circle is exactly equal to 1/4 the area of a full unit circle and since in this example we're using the unit circle as our baseline, the radius is equal to 1, the area of the region is as follows
The area of a square in which the unit circle is inscribed is
The area of the square in the above diagram, which is a quarter of the area of the original square is
With the areas determined, we can now determine the probability, if points are "placed" in the quarter square randomly, that they fall within the area of the quarter circle
So, by this chain of logic, we may construct a Monte Carlo simulation to determine a value for π by having the program generate 2 sets of random numbers (for the x and y dimensions) between 0 and 1. If, as the image above states, the sum of the 2 squares is less than 1, the point is counted as falling inside the area of the quarter circle. At the end, the experimental value of π is calculated by
- Dividing the number of points inside the quarter circle by the total amount of points "thrown".
- Multiply the ratio from above by 4, since otherwise, as the equation from above demonstrates, the result would be approaching π/4
I used 4 Python libraries within my program. They and their function within my program are as follows:
- Mathplotlib.pyplot. I used this library to generate a visual representation of the data.
- Numpy. This library was used to make arrays to store the coordinate data. That data was used to calculate the experimental value for Pi. It was also later exported to pyplot and xlsxwriter. Numpy also contained a built-in value for pi that I used to compare the experimental value against.
- Xlsxwriter. This library was used to export my data to excel once the run is complete and the data is completely collected. One limitation of this library was that it did not support writing to pre-existing excel documents. Because of this, I split the data by run and later combined 10 runs of the same length into 1 excel document by trial, and combined those trials into 1 final document.
- Time. I used the built-in time package to measure the execution time of a run.
Line 9 of my code is what controls the length of a run. I commented out the input section to make data collection faster for myself, but if it is desirable for you to control it via console input, feel free to uncomment it.
Once a run is completed, an excel document (with its name defined based on line 15) is generated in the root directory and a mathplot diagram is displayed. Points that fall inside the area of the curve are marked with a blue color and points that fall outside are marked with a red color to make them more distinct from one another.
Main.py is the main program. This is what I used for my data collection.
mc_pi.py is a reference code that I found online. Credits for this piece of code got to @dandrewmyers on Github (see note in Credits section). I used this code as the basis for my code but heavily modified it. As the code had no license attached and was given freely on a public forum, I assumed that the code was open to public use. My version has the MIT license attached so the code can be used without many limitations (see LICENSE for full legalese).
Project_outline was the original project proposal I submitted to my AP Statistics teacher
The Excel Workbooks directory contains all the data that I collected in my trials. I collected 10 runs per trial of a given length. The length went from 10 to 50 to 100 etc. I could not manage to do any length greater than 100,000 points within a reasonable timeframe. The original data (untouched other than minor formatting) is categorized by length and run number.
The figure Snapshots directory stores all of the mathplot images for each run.
The tex directory is a directory used by a github app attached to this repository. Its function is to render out the LaTEX equations within the readme automatically
This repository also includes the final project report (in DOCX format) and presentation (in PPTX format). I also included the speaker notes (in DOCX format) I used to prepare the presentation.
For this AP Statistics group project, I was responsible for setting up the python simulation and collecting data, and the rest of my group took responsibility for analyzing the data, and using that to come to conclusions. We all worked together on the final report, but the powerpoint was largely created by my group members.
The following are visual representations of the data I collected.
Respectively they are runs of length 10, 50, 100, 500, 1000, 5000, 10000, 50000, and 100000
https://github.com/dandrewmyers/numerical/blob/master/mc_pi.py - My code's foundation. This link has since been taken offline but I have the original stored in this repository
http://mathfaculty.fullerton.edu/mathews/n2003/montecarlopimod.html and https://academo.org/demos/estimating-pi-monte-carlo/
Both of the above links described the math behind my program.