A toolkit for measuring the efficacy of various methods for calculating a confidence interval (CI). It currently covers CI methods for the following statistics:
- proportion
- difference between two proportions
This library was mainly inspired by the article "Five Confidence Intervals for Proportions That You Should Know About" by Dr. Dennis Robert.
https://pypi.org/project/CI-methods-analyser/
The Wald interval is defined as follows:
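In its usual textbook form, the interval can be written as below. The symbols are chosen here for illustration: $x$ is the number of successes observed in a sample of size $n$, $1-\alpha$ is the confidence level, and $z_{1-\alpha/2}$ is the corresponding standard normal quantile (about 1.96 for 95% confidence).

```latex
\hat{p} \pm z_{1-\alpha/2}\,\sqrt{\frac{\hat{p}\,(1-\hat{p})}{n}},
\qquad \hat{p} = \frac{x}{n}
```

The example that follows measures how often this interval actually contains the true proportion.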
from CI_methods_analyser import CImethodForProportion_efficacyToolkit as toolkit, methods_for_CI_for_proportion
# take an already implemented method for calculating CI for proportions
wald_interval = methods_for_CI_for_proportion.wald_interval
# initialize the toolkit
wald_interval_test_toolkit = toolkit(
    method=wald_interval, method_name="Wald Interval")
# calculate the real coverage that the method produces
# for each case of a true population proportion (taken from the list `proportions`)
wald_interval_test_toolkit.calculate_coverage_analytically(
    sample_size=100, proportions=('0.001', '0.999', '0.01'), confidence=0.95)
# now you can access the calculated coverage and a few statistics:
# wald_interval_test_toolkit.coverage # 1-d array of 0-100, the same shape as passed `proportions`
# wald_interval_test_toolkit.average_coverage # np.longdouble 0-100, avg of `coverage`
# wald_interval_test_toolkit.average_deviation # np.longdouble 0-100, avg abs diff w/ `confidence`
# plots the calculated coverage in a matplotlib.pyplot figure
wald_interval_test_toolkit.plot_coverage(
    plt_figure_title="Wald Interval coverage")
# you can access the figure here:
# wald_interval_test_toolkit.figure
# shows the figure (non-blocking)
wald_interval_test_toolkit.show_plot()
# because show_plot() is non-blocking,
# you have to pause the execution in order for the figure to be rendered completely
input('press Enter to exit')
This will render the coverage plot for the Wald interval.
The plot indicates poor overall performance of the method, and particularly poor performance for extreme proportions.
You may well want to use a different method. Check out the wonderful medium.com article by Dr. Dennis Robert, "Five Confidence Intervals for Proportions That You Should Know About".
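To see why extreme proportions are a problem, it helps to evaluate the interval by hand for very small counts. The sketch below is a standalone textbook Wald interval, not the library's own `wald_interval`; the name `wald_interval_sketch` is hypothetical and only the Python standard library is assumed.

```python
from math import sqrt
from statistics import NormalDist


def wald_interval_sketch(x: int, n: int, conflevel: float = 0.95):
    """Textbook Wald interval; a stand-in, not the library's own code."""
    # two-tailed z score for the requested confidence level
    z = NormalDist().inv_cdf(1 - (1 - conflevel) / 2)
    p = x / n
    half_width = z * sqrt(p * (1 - p) / n)
    return (p - half_width, p + half_width)


# With zero successes the interval has zero width, so it cannot
# contain any true proportion other than exactly 0.
print(wald_interval_sketch(0, 100))

# With one success the lower bound falls below 0, outside the valid range.
print(wald_interval_sketch(1, 100))

# Near the middle of the range the interval behaves sensibly.
print(wald_interval_sketch(50, 100))
```

When the true proportion is tiny, a sample with zero successes is very likely, and every such sample produces an interval that misses the true value. That is consistent with the coverage dropping sharply at both ends of the plot.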
The shortcut function calculate_coverage_and_show_plot performs the equivalent calculation and renders the same picture:
from CI_methods_analyser import CImethodForProportion_efficacyToolkit as toolkit, methods_for_CI_for_proportion
toolkit(
    method=methods_for_CI_for_proportion.wald_interval, method_name="Wald Interval"
).calculate_coverage_and_show_plot(
    sample_size=100, proportions=('0.001', '0.999', '0.01'), confidence=0.95,
    plt_figure_title="Wald Interval coverage"
)
input('press Enter to exit')
I personally prefer night light-friendly styling:
from CI_methods_analyser import CImethodForProportion_efficacyToolkit as toolkit, methods_for_CI_for_proportion
toolkit(
    method=methods_for_CI_for_proportion.wald_interval, method_name="Wald Interval"
).calculate_coverage_and_show_plot(
    sample_size=100, proportions=('0.001', '0.999', '0.01'), confidence=0.95,
    plt_figure_title="Wald Interval coverage",
    theme='dark_background', plot_color="green", line_color="orange"
)
input('press Enter to exit')
You can implement your own methods and test them:
from CI_methods_analyser import CImethodForProportion_efficacyToolkit as toolkit
from CI_methods_analyser.math_functions import normal_z_score_two_tailed
from functools import lru_cache
# not a particularly good method for calculating CI for proportion
@lru_cache(100_000)
def im_telling_ya_test(x: int, n: int, conflevel: float = 0.95):
    # fixed-width interval around the sample proportion
    z = normal_z_score_two_tailed(conflevel)
    p = float(x)/n
    return (
        p - 0.02*z,
        p + 0.02*z
    )

toolkit(
    method=im_telling_ya_test, method_name='"I\'m telling ya" test'
).calculate_coverage_and_show_plot(
    sample_size=100, proportions=('0.001', '0.999', '0.01'), confidence=0.95,
    plt_figure_title='"I\'m telling ya" coverage',
    theme='dark_background', plot_color="green", line_color="orange"
)
input('press Enter to exit')
from CI_methods_analyser import CImethodForProportion_efficacyToolkit as toolkit
from CI_methods_analyser.math_functions import normal_z_score_two_tailed
from functools import lru_cache
# you could say, this method is "too good"
@lru_cache(100_000)
def God_is_my_witness_score(x: int, n: int, conflevel: float = 0.95):
    # very wide interval: it spans more than half of the [0, 1] range
    z = normal_z_score_two_tailed(conflevel)
    p = float(x)/n
    return (
        (0 + p)/2 - 0.005*z,
        (1 + p)/2 + 0.005*z
    )

toolkit(
    method=God_is_my_witness_score, method_name='"God is my witness" score'
).calculate_coverage_and_show_plot(
    sample_size=100, proportions=('0.001', '0.999', '0.01'), confidence=0.95,
    plt_figure_title='"God is my witness" score coverage', theme='dark_background'
)
input('press Enter to exit')
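A more serious candidate can follow the same `(x, n, conflevel)` signature used by the two toy methods above. The sketch below implements the standard Wilson score interval with only the Python standard library; the name `wilson_score_sketch` is hypothetical, and `statistics.NormalDist` is used here in place of the library's `normal_z_score_two_tailed` so that the block runs on its own.

```python
from functools import lru_cache
from math import sqrt
from statistics import NormalDist


@lru_cache(100_000)
def wilson_score_sketch(x: int, n: int, conflevel: float = 0.95):
    """Wilson score interval for a proportion (illustrative sketch)."""
    # two-tailed z score for the requested confidence level
    z = NormalDist().inv_cdf(1 - (1 - conflevel) / 2)
    p = x / n
    denom = 1 + z * z / n
    # the center is pulled from the sample proportion towards 0.5
    center = (p + z * z / (2 * n)) / denom
    half_width = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return (center - half_width, center + half_width)


# Unlike the Wald interval, zero successes still give a non-degenerate interval.
print(wilson_score_sketch(0, 100))
print(wilson_score_sketch(50, 100))
```

Assuming the toolkit accepts any callable with this signature, as the examples above suggest, the function could be passed as `method=wilson_score_sketch` in the same way as `im_telling_ya_test`.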
Let's use the implemented Pooled Z test:
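For reference, a common textbook form of the pooled Z interval for the difference between two proportions is shown below. The symbols are chosen here for illustration ($x_i$ successes in a sample of size $n_i$), and the library's `Z_test_pooled` may differ in details.

```latex
(\hat{p}_1 - \hat{p}_2) \pm z_{1-\alpha/2}\,
\sqrt{\hat{p}\,(1-\hat{p})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)},
\qquad \hat{p}_i = \frac{x_i}{n_i},
\qquad \hat{p} = \frac{x_1 + x_2}{n_1 + n_2}
```

The pooled estimate $\hat{p}$ treats both samples as if they came from the same population, which matters for the behaviour discussed after the example.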
from CI_methods_analyser import CImethodForDiffBetwTwoProportions_efficacyToolkit as toolkit_d, methods_for_CI_for_diff_betw_two_proportions as methods
toolkit_d(
    method=methods.Z_test_pooled, method_name='Z test pooled'
).calculate_coverage_and_show_plot(
    sample_size1=100, sample_size2=100, proportions=('0.001', '0.999', '0.01'), confidence=0.95,
    plt_figure_title='Z test pooled', theme='dark_background',
)
input('press Enter to exit')
As you can see, this test is generally very good for close proportions, unless the proportions have extreme values [purple].
Also, this test is extremely conservative for large and extreme differences between the two proportions, i.e. for proportions whose values are far apart [green].
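Both observations can be checked numerically for a couple of proportion pairs. The sketch below is a standalone illustration under stated assumptions: it uses a textbook pooled Z interval rather than the library's `Z_test_pooled`, the function names are hypothetical, and coverage is returned as a fraction in 0-1.

```python
from math import comb, sqrt
from statistics import NormalDist


def pooled_z_interval_sketch(x1, n1, x2, n2, conflevel=0.95):
    """Textbook pooled Z interval for p1 - p2 (illustrative stand-in)."""
    z = NormalDist().inv_cdf(1 - (1 - conflevel) / 2)
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    half_width = z * sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2 - half_width, p1 - p2 + half_width)


def binom_pmf(k, n, p):
    """Probability of exactly k successes in n trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def coverage_two_proportions(p1, p2, n1, n2, conflevel=0.95):
    """Exact coverage: sum the probability of every (x1, x2) outcome
    whose interval contains the true difference p1 - p2."""
    pmf1 = [binom_pmf(k, n1, p1) for k in range(n1 + 1)]
    pmf2 = [binom_pmf(k, n2, p2) for k in range(n2 + 1)]
    true_diff = p1 - p2
    total = 0.0
    for x1 in range(n1 + 1):
        for x2 in range(n2 + 1):
            low, high = pooled_z_interval_sketch(x1, n1, x2, n2, conflevel)
            if low <= true_diff <= high:
                total += pmf1[x1] * pmf2[x2]
    return total


# close proportions: coverage should sit near the nominal level
print(coverage_two_proportions(0.5, 0.5, 100, 100))
# far-apart proportions: the pooled variance overstates the true variance
print(coverage_two_proportions(0.1, 0.9, 100, 100))
```

For far-apart proportions the pooled estimate lands near 0.5, where the variance term is largest, so the interval is wider than it needs to be and coverage exceeds the nominal level.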
You may want to change the color palette (although I wouldn't):
from CI_methods_analyser import CImethodForDiffBetwTwoProportions_efficacyToolkit as toolkit_d, methods_for_CI_for_diff_betw_two_proportions as methods
toolkit_d(
    method=methods.Z_test_pooled, method_name='Z test pooled'
).calculate_coverage_and_show_plot(
    sample_size1=100, sample_size2=100, proportions=('0.001', '0.999', '0.01'), confidence=0.95,
    plt_figure_title='Z test pooled', theme='dark_background',
    colors=("gray", "purple", "white", "orange", "#d62728")
)
input('press Enter to exit')
Two ways can be used to calculate the efficacy of CI methods:
- approximately, with random simulation (as implemented in R by Dr. Dennis Robert, see the article mentioned above). Here: calculate_coverage_randomly
- precisely, with the analytical solution. Here: calculate_coverage_analytically
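The two approaches listed above can be sketched for a single proportion as follows. This is a minimal, self-contained illustration, not the library's implementation: `wald_sketch`, `coverage_analytic` and `coverage_random` are hypothetical names, and coverage is returned as a fraction in 0-1 rather than on the 0-100 scale the toolkit reports.

```python
import random
from math import comb, sqrt
from statistics import NormalDist


def wald_sketch(x, n, conflevel=0.95):
    """Textbook Wald interval, used here only as a method under test."""
    z = NormalDist().inv_cdf(1 - (1 - conflevel) / 2)
    p = x / n
    half_width = z * sqrt(p * (1 - p) / n)
    return (p - half_width, p + half_width)


def coverage_analytic(method, p, n, conflevel=0.95):
    """Exact coverage: add up the binomial probability of every outcome x
    whose interval contains the true proportion p."""
    total = 0.0
    for x in range(n + 1):
        low, high = method(x, n, conflevel)
        if low <= p <= high:
            total += comb(n, x) * p**x * (1 - p) ** (n - x)
    return total


def coverage_random(method, p, n, conflevel=0.95, n_samples=10_000, seed=0):
    """Approximate coverage: draw random samples and count how often
    the interval contains the true proportion p."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_samples):
        x = sum(rng.random() < p for _ in range(n))  # one binomial draw
        low, high = method(x, n, conflevel)
        hits += low <= p <= high
    return hits / n_samples


print(coverage_analytic(wald_sketch, 0.3, 100))
print(coverage_random(wald_sketch, 0.3, 100))
```

The simulated value fluctuates around the analytical one, and the gap shrinks as `n_samples` grows. The analytical version has no sampling noise, which is why it is the more precise of the two.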
Both methods are implemented for both statistics: proportion, and difference between two proportions. For the precise analytical solution, an optimization was made. Theoretically it is lossy, but in practice the error is always negligible (as demonstrated by test_z_precision_difference.py). The optimization is controlled by the parameter z_precision, which is estimated automatically by default.
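One plausible way a lossy but practically exact optimization of this kind can work is to skip outcomes that lie many standard deviations away from the expected count, since their total probability is vanishingly small. Whether the library's `z_precision` does exactly this is an assumption; the sketch below, with hypothetical names and a hypothetical `z_cutoff` parameter, only illustrates the general idea.

```python
from math import ceil, comb, floor, sqrt
from statistics import NormalDist


def wald_sketch(x, n, conflevel=0.95):
    """Textbook Wald interval, used here only as a method under test."""
    z = NormalDist().inv_cdf(1 - (1 - conflevel) / 2)
    p = x / n
    half_width = z * sqrt(p * (1 - p) / n)
    return (p - half_width, p + half_width)


def coverage_over_range(method, p, n, first, last, conflevel=0.95):
    """Sum binomial probabilities of outcomes first..last whose interval covers p."""
    total = 0.0
    for x in range(first, last + 1):
        low, high = method(x, n, conflevel)
        if low <= p <= high:
            total += comb(n, x) * p**x * (1 - p) ** (n - x)
    return total


def coverage_truncated(method, p, n, z_cutoff=6.0, conflevel=0.95):
    """Same sum, restricted to outcomes within z_cutoff standard
    deviations of the expected count n * p (hypothetical optimization)."""
    mean = n * p
    sd = sqrt(n * p * (1 - p))
    first = max(0, floor(mean - z_cutoff * sd))
    last = min(n, ceil(mean + z_cutoff * sd))
    return coverage_over_range(method, p, n, first, last, conflevel)


full = coverage_over_range(wald_sketch, 0.3, 200, 0, 200)
fast = coverage_truncated(wald_sketch, 0.3, 200)
print(full, fast, abs(full - fast))
```

With these numbers the truncated sum visits fewer than half of the 201 possible outcomes, and the saving grows with the sample size because the visited range scales with the square root of `n`.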
1. Equivalence and Noninferiority Testing (as I understand it, these are fancy terms for 2-sided and 1-sided p tests for the difference between two proportions)
- https://ncss-wpengine.netdna-ssl.com/wp-content/themes/ncss/pdf/Procedures/PASS/Confidence_Intervals_for_the_Difference_Between_Two_Proportions.pdf
- https://ncss-wpengine.netdna-ssl.com/wp-content/themes/ncss/pdf/Procedures/PASS/Non-Inferiority_Tests_for_the_Difference_Between_Two_Proportions.pdf
- https://www.ncss.com/wp-content/themes/ncss/pdf/Procedures/NCSS/Two_Proportions-Non-Inferiority,_Superiority,_Equivalence,_and_Two-Sided_Tests_vs_a_Margin.pdf
- https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3019319/
- https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2701110/
- https://pubmed.ncbi.nlm.nih.gov/9595617/
- http://thescipub.com/pdf/10.3844/amjbsp.2010.23.31
2. Biostatistics course (Dr. Nicolas Padilla Raygoza, et al.)
- https://docs.google.com/presentation/d/1t1DowyVDDRFYGHDlJgmYMRN4JCrvFl3q/edit#slide=id.p1
- https://www.google.com/search?q=Dr.+Sc.+Nicolas+Padilla+Raygoza+Biostatistics+course+Part+10&oq=Dr.+Sc.+Nicolas+Padilla+Raygoza+Biostatistics+course+Part+10&aqs=chrome..69i57.3448j0j7&sourceid=chrome&ie=UTF-8
- https://slideplayer.com/slide/9837395/
3. Using z-test instead of a binomial test: