Use prettyunits to make p-values pretty (#95)

tidy-survey-r · Feb 25, 2024 · commit c500658
1 parent 1aa432f

Showing 3 changed files with 7 additions and 6 deletions.
diff --git a/06-statistical-testing.Rmd b/06-statistical-testing.Rmd
@@ -195,7 +195,7 @@ recs_des %>%
   summarize(mu = survey_mean(SummerTempNight, na.rm = TRUE))
 ```
 
-The result is the same in both methods, so we see that the average temperature U.S. households set their thermostat to in the summer at night is `r signif(ttest_ex1$estimate + 68,3)`$^\circ$F. Looking at the output from `svyttest()`, the t-statistic is `r signif(ttest_ex1$statistic, 3)`, and the p-value is $`r signif(ttest_ex1[["p.value"]], 3)`$, indicating that the average is statistically different from 68$^\circ$F at an $\alpha$ level of $0.05$.
+The result is the same in both methods, so we see that the average temperature U.S. households set their thermostat to in the summer at night is `r signif(ttest_ex1$estimate + 68,3)`$^\circ$F. Looking at the output from `svyttest()`, the t-statistic is `r signif(ttest_ex1$statistic, 3)`, and the p-value is $`r pretty_p_value(ttest_ex1[["p.value"]])`$, indicating that the average is statistically different from 68$^\circ$F at an $\alpha$ level of $0.05$.
 
 If we want an 80% confidence interval for the test statistic, we can use the function `confint()` to change the confidence level. Below, we print both the original 95% confidence interval and the 80% confidence interval:
 
@@ -245,7 +245,7 @@ The output from the `svyttest()` function can be a bit hard to read. Using the {
 broom::tidy(ttest_ex2)
 ```
 
-The estimate differs from Example 1 in that the estimate is not displaying \(\mu - 0.90\) but rather \(\mu\), or the difference between the U.S. households that use AC and the proportion we are comparing to. We can see that there is a difference of `r signif(ttest_ex2$estimate*100,3)` percentage points. Additionally, the t-statistic value in the `statistic` column is `r signif(ttest_ex2$statistic,3)`, and the p-value is `r signif(ttest_ex2$p.value,3)`. These results indicate that the fewer than 90% of U.S. households use AC in their homes.
+The estimate differs from Example 1 in that the estimate is not displaying \(\mu - 0.90\) but rather \(\mu\), or the difference between the U.S. households that use AC and the proportion we are comparing to. We can see that there is a difference of `r signif(ttest_ex2$estimate*100,3)` percentage points. Additionally, the t-statistic value in the `statistic` column is `r signif(ttest_ex2$statistic,3)`, and the p-value is `r pretty_p_value(ttest_ex2$p.value)`. These results indicate that the fewer than 90% of U.S. households use AC in their homes.
 
 <!--Add in callout box about how to use the $ notation to help call out the different values? Maybe indicate how this will be covered more in the reporting chapter? IV: I added a bit up top, not sure if it needs a whole call out box but happy to revisit.-->
 
@@ -277,7 +277,7 @@ ttest_ex3 <- recs_des %>%
 broom::tidy(ttest_ex3)
 ```
 
-The results indicate that the difference in electrical bills for those that used AC and those that did not is, on average, \$`r round(ttest_ex3$estimate,2)`. The difference appears to be statistically significant as the t-statistic is `r signif(ttest_ex3$statistic, 3)` and the p-value is $`r signif(ttest_ex3[["p.value"]], 3)`$. Households that used AC spent, on average, $`r round(ttest_ex3[["estimate"]], 2) %>% unname()` more in 2020 on electricity than households without AC.
+The results indicate that the difference in electrical bills for those that used AC and those that did not is, on average, \$`r round(ttest_ex3$estimate,2)`. The difference appears to be statistically significant as the t-statistic is `r signif(ttest_ex3$statistic, 3)` and the p-value is $`r pretty_p_value(ttest_ex3[["p.value"]])`$. Households that used AC spent, on average, $`r round(ttest_ex3[["estimate"]], 2) %>% unname()` more in 2020 on electricity than households without AC.
 
 #### Example 4: Paired two-sample t-test {.unnumbered #stattest-ttest-ex4}
 
@@ -300,7 +300,7 @@ ttest_ex4 <- recs_des %>%
 broom::tidy(ttest_ex4)
 ```
 
-U.S. households set their thermostat on average `r signif(ttest_ex4$estimate,2)`$^\circ$F warmer in summer nights than winter nights, which is statistically significant (t = `r signif(ttest_ex4$statistic, 3)`, p-value = $`r signif(ttest_ex4$p.value, 3)`$).  
+U.S. households set their thermostat on average `r signif(ttest_ex4$estimate,2)`$^\circ$F warmer in summer nights than winter nights, which is statistically significant (t = `r signif(ttest_ex4$statistic, 3)`, p-value = $`r pretty_p_value(ttest_ex4[["p.value"]])`$).  
 
 ## Chi-Square Tests {#stattest-chi}
 
@@ -432,7 +432,7 @@ chi_ex1 <- anes_des_educ %>%
 chi_ex1
 ```
 
-The output from the `svygofchisq()` indicates that at least one proportion from ANES does not match the ACS data ($\chi^2 =$ `r chi_ex1$statistic`; $p-value =$ `r signif(chi_ex1$p.value,3)`). To get a better idea of the differences, we can use the `expected` output along with `survey_mean()` to create a comparison table:  
+The output from the `svygofchisq()` indicates that at least one proportion from ANES does not match the ACS data ($\chi^2 =$ `r chi_ex1$statistic`; $p-value =$ `r pretty_p_value(chi_ex1[["p.value"]])`). To get a better idea of the differences, we can use the `expected` output along with `survey_mean()` to create a comparison table:  
 
 ```{r}
 #| label: stattest-chi-ex1-table

diff --git a/07-modeling.Rmd b/07-modeling.Rmd
@@ -275,7 +275,7 @@ urb_reg_test <- regTermTest(m_electric_multi, ~Urbanicity:Region)
 urb_reg_test
 ```
 
-This output indicates there is a significant interaction between urbanicity and region (p-value=$`r signif(urb_reg_test[["p"]], 3)`$).
+This output indicates there is a significant interaction between urbanicity and region (p-value=$`r pretty_p_value(urb_reg_test[["p"]])`$).
 
 To examine the predictions, residuals and more from the model, the function `augment()` from {broom} can be used. The `augment()` function will return a tibble with the independent and dependent variables and other fit statistics. The `augment()` function has not been specifically written for objects of class `svyglm`, and as such, a warning will be displayed indicating this at this time. As it was not written exactly for this class of objects, a little tweaking needs to be done after using augment to get the predicted (`.fitted`) and standard error (`.se.fit`) values. To obtain the standard error of the fitted values we need to use the `attr()` function on the `.fitted` values created by `augment()`. 
 

diff --git a/index.Rmd b/index.Rmd
@@ -34,6 +34,7 @@ if (knitr:::is_html_output()){
   options(width=72)
 }
 library(formatR)
+library(prettyunits)
 
 book_colors <- c("#0b3954", "#087e8b", "#bfd7ea", "#ff8484", "#8d6b94")