Skip to content

Using a Monte Carlo simulation to calculate a value for PI

License

Notifications You must be signed in to change notification settings

IkeKap/StatsProject

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

72 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

StatsProject

Using a Monte Carlo simulation to calculate a value for π

2bb5d450319f0a721c07d202cc35d324

2nd GIF

Useful definitions

  1. Length of a simulation = How many points/coordinates were used in that simulation
  2. Execution time = How long it takes the program to complete the simulation
  3. Unit Circle = Circle of radius 1 with a center at the origin (0,0) on a standard coordinate system

Initial goals

  • Determine the association, if any exists, between the accuracy of a simulation and the length of the simulation
  • Determine the association, if any exists, between the spread of resulting values of π and how many coordinates/points were used in the simulation
  • Determine the association, if any exists, between the length of the simulation and the execution time

Description of the methodology

By definition, Monte Carlo simulations use random number sampling to obtain a desired numerical result. To understand the justification for using such a simulation to calculate an experimental value for π, one needs only a rudimentary knowledge of probability, algebra and geometry.

unit_circle image

The above is the unit circle inscribed inside a square. While it is technically possible to run the simulation with this figure, I decided not to for several reasons.

  1. This is a larger area and the resultant data would be equivalent to data acquired from a smaller segment
  2. It would be more difficult, in my view, to understand the visual representation
  3. The side length of the square is double the magnitude of the circle's radius. Because of this, the calculations would be more complex

I instead opted to slice the diagram into a fourth as shown below.

quarter_segment

The area of the region inside the quarter circle is exactly equal to 1/4 the area of a full unit circle and since in this example we're using the unit circle as our baseline, the radius is equal to 1, the area of the region is as follows

The area of a square in which the unit circle is inscribed is

The area of the square in the above diagram, which is a quarter of the area of the original square is

With the areas determined, we can now determine the probability, if points are "placed" in the quarter square randomly, that they fall within the area of the quarter circle

So, by this chain of logic, we may construct a Monte Carlo simulation to determine a value for π by having the program generate 2 sets of random numbers (for the x and y dimensions) between 0 and 1. If, as the image above states, the sum of the 2 squares is less than 1, the point is counted as falling inside the area of the quarter circle. At the end, the experimental value of π is calculated by

  1. Dividing the number of points inside the quarter circle by the total amount of points "thrown".
  2. Multiply the ratio from above by 4, since otherwise, as the equation from above demonstrates, the result would be approaching π/4

How the program works

I used 4 Python libraries within my program. They and their function within my program are as follows:

  1. Mathplotlib.pyplot. I used this library to generate a visual representation of the data.
  2. Numpy. This library was used to make arrays to store the coordinate data. That data was used to calculate the experimental value for Pi. It was also later exported to pyplot and xlsxwriter. Numpy also contained a built-in value for pi that I used to compare the experimental value against.
  3. Xlsxwriter. This library was used to export my data to excel once the run is complete and the data is completely collected. One limitation of this library was that it did not support writing to pre-existing excel documents. Because of this, I split the data by run and later combined 10 runs of the same length into 1 excel document by trial, and combined those trials into 1 final document.
  4. Time. I used the built-in time package to measure the execution time of a run.

Line 9 of my code is what controls the length of a run. I commented out the input section to make data collection faster for myself, but if it is desirable for you to control it via console input, feel free to uncomment it.

Once a run is completed, an excel document (with its name defined based on line 15) is generated in the root directory and a mathplot diagram is displayed. Points that fall inside the area of the curve are marked with a blue color and points that fall outside are marked with a red color to make them more distinct from one another.

Layout of the repository

Main.py is the main program. This is what I used for my data collection.

mc_pi.py is a reference code that I found online. Credits for this piece of code got to @dandrewmyers on Github (see note in Credits section). I used this code as the basis for my code but heavily modified it. As the code had no license attached and was given freely on a public forum, I assumed that the code was open to public use. My version has the MIT license attached so the code can be used without many limitations (see LICENSE for full legalese).

Project_outline was the original project proposal I submitted to my AP Statistics teacher

The Excel Workbooks directory contains all the data that I collected in my trials. I collected 10 runs per trial of a given length. The length went from 10 to 50 to 100 etc. I could not manage to do any length greater than 100,000 points within a reasonable timeframe. The original data (untouched other than minor formatting) is categorized by length and run number.

The figure Snapshots directory stores all of the mathplot images for each run.

The tex directory is a directory used by a github app attached to this repository. Its function is to render out the LaTEX equations within the readme automatically

This repository also includes the final project report (in DOCX format) and presentation (in PPTX format). I also included the speaker notes (in DOCX format) I used to prepare the presentation.

For this AP Statistics group project, I was responsible for setting up the python simulation and collecting data, and the rest of my group took responsibility for analyzing the data, and using that to come to conclusions. We all worked together on the final report, but the powerpoint was largely created by my group members.

Example Mathplots

The following are visual representations of the data I collected.

Respectively they are runs of length 10, 50, 100, 500, 1000, 5000, 10000, 50000, and 100000

10pi3

50pi3

100pi3

500pi3

1kpi3

5kpi3

10kpi3

50kpi3

100kpi3

Credits

https://github.com/dandrewmyers/numerical/blob/master/mc_pi.py - My code's foundation. This link has since been taken offline but I have the original stored in this repository

http://mathfaculty.fullerton.edu/mathews/n2003/montecarlopimod.html and https://academo.org/demos/estimating-pi-monte-carlo/

Both of the above links described the math behind my program.

About

Using a Monte Carlo simulation to calculate a value for PI

Topics

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages