Commit

Add citations for ch2, clean up warnings on build, and update code chunk names to be consistent in each chapter
rpowell22 committed Aug 7, 2023
1 parent 17c1098 commit 6d96341
Showing 11 changed files with 106 additions and 53 deletions.
26 changes: 12 additions & 14 deletions 02-overview-surveys.Rmd

Large diffs are not rendered by default.

4 changes: 2 additions & 2 deletions 03-specifying-sample-designs.Rmd
@@ -1,7 +1,7 @@
# Specifying sample designs and replicate weights in {srvyr} {#c03-specifying-sample-designs}

::: {.prereqbox-header}
`r if (knitr:::is_html_output()) '### Prerequisites {- #prereq}'`
`r if (knitr:::is_html_output()) '### Prerequisites {- #prereq3}'`
:::

::: {.prereqbox data-latex="{Prerequisites}"}
@@ -21,7 +21,7 @@ source("helper-fun/helper-functions.R")

To help explain the different types of sample designs, this chapter will use the `api` and `scd` data that comes in the {survey} package:
```{r}
#| label: ch3-setup-surveydata
#| label: samp-setup-surveydata
data(api)
data(scd)
```
22 changes: 11 additions & 11 deletions 04-understanding-survey-data-documentation.Rmd
@@ -1,7 +1,7 @@
# Understanding survey data documentation {#c04-understanding-survey-data-documentation}

::: {.prereqbox-header}
`r if (knitr:::is_html_output()) '### Prerequisites {- #prereq}'`
`r if (knitr:::is_html_output()) '### Prerequisites {- #prereq4}'`
:::

::: {.prereqbox data-latex="{Prerequisites}"}
@@ -50,21 +50,21 @@ A questionnaire is a series of questions asked to obtain information from survey

The questionnaire is an essential resource for understanding and interpreting the survey data (see Section \@ref(overview-design-questionnaire)), and we should use it alongside any analysis. It provides details about each of the questions asked in the survey, such as question name, question wording, response options, skip logic, randomizations, display specification, mode differences, and the universe (if only a subset of respondents were asked the question).

Below in Figure \@ref(fig:que-examp), we show a question from the ANES 2020 questionnaire [@anes-svy]. This figure shows a particular question's question name (`postvote_rvote`), description (Did R Vote?), full wording of the question and responses, response order, universe, question logic (if `vote_pre` = 0), and other specifications. The section also includes the variable name, which we can link to the codebook.
Below in Figure \@ref(fig:understand-que-examp), we show a question from the ANES 2020 questionnaire [@anes-svy]. This figure shows a particular question's question name (`postvote_rvote`), description (Did R Vote?), full wording of the question and responses, response order, universe, question logic (if `vote_pre` = 0), and other specifications. The section also includes the variable name, which we can link to the codebook.

```{r}
#| label: que-examp
#| label: understand-que-examp
#| echo: false
#| fig.cap: ANES 2020 Questionnaire Example
#| fig.alt: Question information about the variable postvote_rvote from ANES 2020 questionnaire. Survey question, Universe, Logic, Web Spec, Response Order, and Released Variable are included.
knitr::include_graphics(path="images/questionnaire-example.jpg")
```

The content and structure of questionnaires vary depending on the specific survey. For instance, question names may be informative (like the ANES example), sequential, or denoted by a code. In some cases, surveys may not use separate names for questions and variables. Figure \@ref(fig:que-examp-2) shows a question from the Behavioral Risk Factor Surveillance System (BRFSS) questionnaire that shows a sequential question number and a coded variable name (as opposed to a question name) [@brfss-svy].
The content and structure of questionnaires vary depending on the specific survey. For instance, question names may be informative (like the ANES example), sequential, or denoted by a code. In some cases, surveys may not use separate names for questions and variables. Figure \@ref(fig:understand-que-examp-2) shows a question from the Behavioral Risk Factor Surveillance System (BRFSS) questionnaire that shows a sequential question number and a coded variable name (as opposed to a question name) [@brfss-svy].

```{r}
#| label: que-examp-2
#| label: understand-que-examp-2
#| echo: false
#| fig.cap: BRFSS 2021 Questionnaire Example
#| fig.alt: Question information about the variable BPHIGH6 from BRFSS 2021 questionnaire. Question number, question text, variable names, responses, skip info and CATI note, interviewer notes, and columns are included.
@@ -78,18 +78,18 @@ Given the variety in how the survey information is presented in documentation, i

While a questionnaire provides information about the questions asked to respondents, the codebook explains how the survey data was coded and recorded. The codebook lists details such as variable names, variable labels, variable meanings, codes for missing data, value labels, and value types (whether categorical or continuous, etc.). In particular, the codebook often includes information on missing data (as opposed to the questionnaire). The codebook enables us to understand and use the variables appropriately in our analysis.

Figure \@ref(fig:codebook-examp) is a question from the ANES 2020 codebook [@anes-cb]. This part indicates a particular variable's name (`V202066`), question wording, value labels, universe, and associated survey question (`postvote_rvote`).
Figure \@ref(fig:understand-codebook-examp) is a question from the ANES 2020 codebook [@anes-cb]. This part indicates a particular variable's name (`V202066`), question wording, value labels, universe, and associated survey question (`postvote_rvote`).

```{r}
#| label: codebook-examp
#| label: understand-codebook-examp
#| echo: false
#| fig.cap: ANES 2020 Codebook Example
#| fig.alt: Variable information about the variable V202066 from ANES 2020 codebook. Variable meaning, Value labels, Universe, and Survey Question(s) are included.
knitr::include_graphics(path="images/codebook-example.jpg")
```

Reviewing both questionnaires and codebooks in parallel is important (Figures \@ref(fig:que-examp) and \@ref(fig:codebook-examp)), as questions and variables do not always correspond directly to each other in a one-to-one mapping. A single question may have multiple associated variables, or a single variable may summarize multiple questions. Reviewing the codebook clarifies how to interpret the variables.
Reviewing both questionnaires and codebooks in parallel is important (Figures \@ref(fig:understand-que-examp) and \@ref(fig:understand-codebook-examp)), as questions and variables do not always correspond directly to each other in a one-to-one mapping. A single question may have multiple associated variables, or a single variable may summarize multiple questions. Reviewing the codebook clarifies how to interpret the variables.

### Errata

@@ -117,7 +117,7 @@ Missing data can be a significant problem in survey analysis, as it can introduc

c. **Missing not at random (MNAR)**: The missing data is related to unobserved data, and the probability of being missing varies for reasons we are not measuring. For example, if respondents with depression do not answer a question about depression severity.

The survey documentation, often the codebook, represents the missing data with a code. For example, a survey may have "Yes" responses coded to `1`, "No" responses coded to `2`, and missing responses coded to `-9`. Or, the codebook may list different codes depending on why certain data is missing. In the example of variable `V202066` from the ANES (Figure \@ref(fig:codebook-examp)), `-9` represents "Refused," `-7` means that the response was deleted due to an incomplete interview, `-6` means that there is no response because there was no follow-up interview, and `-1` means "Inapplicable" (due to the designed skip pattern).
The survey documentation, often the codebook, represents the missing data with a code. For example, a survey may have "Yes" responses coded to `1`, "No" responses coded to `2`, and missing responses coded to `-9`. Or, the codebook may list different codes depending on why certain data is missing. In the example of variable `V202066` from the ANES (Figure \@ref(fig:understand-codebook-examp)), `-9` represents "Refused," `-7` means that the response was deleted due to an incomplete interview, `-6` means that there is no response because there was no follow-up interview, and `-1` means "Inapplicable" (due to the designed skip pattern).

When running analysis in R, we must handle missing responses as missing data (i.e., `NA`) and not numeric data. If missing responses are treated as zeros or arbitrary values, they can artificially alter summary statistics or introduce spurious patterns in the analysis. Recoding these values to `NA` allows us to handle missing data in different ways in R, such as using functions like `na.omit()`, `complete.cases()`, or specialized packages like {tidyimpute} or {mice}. These tools allow us to treat missing responses as missing data so that we can conduct our analysis accurately and obtain valid results.
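
As a rough illustration of the recoding step described above, the following base R sketch replaces codebook-style missing codes with `NA`. The vector `responses` and its values are made up for this example and are not taken from the ANES data; only the four missing codes (`-9`, `-7`, `-6`, `-1`) come from the text.

```r
# Hypothetical responses (invented for illustration) that mix valid values
# with the missing codes described above:
# -9 refused, -7 deleted, -6 no follow-up interview, -1 inapplicable.
responses <- c(1, 4, -9, 2, -1, -6, 4, -7)

# Codes that the codebook flags as missing.
missing_codes <- c(-9, -7, -6, -1)

# Recode the flagged values to NA so R treats them as missing, not numeric.
responses_clean <- replace(responses, responses %in% missing_codes, NA)

# Count how many responses are now treated as missing.
n_missing <- sum(is.na(responses_clean))

# Left as numbers, the codes distort summaries; after recoding they do not.
mean_with_codes <- mean(responses)
mean_valid_only <- mean(responses_clean, na.rm = TRUE)
```

Whether all missing codes should be collapsed into a single `NA`, or kept distinct (for example, to separate refusals from skip patterns), depends on the analysis; this sketch assumes they can be collapsed.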

@@ -136,7 +136,7 @@ Dealing with missing data due to skip patterns requires careful consideration.
When dealing with missing data that is MCAR, MAR, or MNAR, we must consider the implications of how we handle these missing data and avoid introducing more sources of bias. For instance, we can analyze only the respondents who answered all questions by performing listwise deletion, which drops all rows from a data frame with a missing value in any column. We can use the function `tidyr::drop_na()` for listwise deletion. For example, let's say we have a dataset `dat` that has one complete case and 2 cases with some missing data.

```{r}
#| label: drop-na-example1
#| label: understand-dropna-example1
dat <- tibble::tribble(~ col1, ~ col2, ~ col3,
"a", "d", "e",
"b", NA, NA,
@@ -147,7 +147,7 @@ dat

If we use the `tidyr::drop_na()` function, only the first case will remain as the other two cases have at least one missing value.
```{r}
#| label: drop-na-example2
#| label: understand-dropna-example2
dat %>%
tidyr::drop_na()
```
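
For comparison, the same listwise deletion can be sketched in base R with `complete.cases()`, one of the functions mentioned earlier. The data frame `dat_sketch` below is hypothetical: it only mimics the shape described in the text (one complete row and two rows with missing values) and is not the chapter's `dat` object.

```r
# Hypothetical data frame: one complete row and two rows with missing values.
dat_sketch <- data.frame(
  col1 = c("a", "b", "c"),
  col2 = c("d", NA, "f"),
  col3 = c("e", NA, NA),
  stringsAsFactors = FALSE
)

# complete.cases() returns TRUE for rows with no NA in any column.
keep <- complete.cases(dat_sketch)

# Keep only the fully observed rows (listwise deletion).
dat_complete <- dat_sketch[keep, ]
nrow(dat_complete)
```

Both approaches drop any row with at least one missing value, so the caveats about listwise deletion apply equally to either.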
15 changes: 1 addition & 14 deletions 05-descriptive-analysis.Rmd
@@ -1,20 +1,7 @@
# Descriptive analyses in srvyr {#c05-descriptive-analysis}


```{r}
#| label: desc-summary-tab
#| echo: FALSE
tribble(
~c1, ~c2,
"**Topic**", "Descriptive analysis of survey data",
"**Purpose**", "purpose-blah",
"**Learning Goals**", "learning-goals-blah"
) %>%
knitr::kable(format="pandoc", col.names=NULL, caption="Summary of Chapter 5")
```

::: {.prereqbox-header}
`r if (knitr:::is_html_output()) '### Prerequisites {- #prereq}'`
`r if (knitr:::is_html_output()) '### Prerequisites {- #prereq5}'`
:::

::: {.prereqbox data-latex="{Prerequisites}"}
2 changes: 1 addition & 1 deletion 06-statistical-testing.Rmd
@@ -1,7 +1,7 @@
# Statistical testing {#c06-statistical-testing}

::: {.prereqbox-header}
`r if (knitr:::is_html_output()) '### Prerequisites {- #prereq}'`
`r if (knitr:::is_html_output()) '### Prerequisites {- #prereq6}'`
:::

::: {.prereqbox data-latex="{Prerequisites}"}
2 changes: 1 addition & 1 deletion 07-modeling.Rmd
@@ -1,7 +1,7 @@
# Modeling {#c07-modeling}

::: {.prereqbox-header}
`r if (knitr:::is_html_output()) '### Prerequisites {- #prereq}'`
`r if (knitr:::is_html_output()) '### Prerequisites {- #prereq7}'`
:::

::: {.prereqbox data-latex="{Prerequisites}"}
2 changes: 1 addition & 1 deletion 08-communicating-results.Rmd
@@ -1,7 +1,7 @@
# Communicating Results {#c08-communicating-results}

::: {.prereqbox-header}
`r if (knitr:::is_html_output()) '### Prerequisites {- #prereq}'`
`r if (knitr:::is_html_output()) '### Prerequisites {- #prereq8}'`
:::

::: {.prereqbox data-latex="{Prerequisites}"}
2 changes: 1 addition & 1 deletion 09-ncvs-vignette.Rmd
@@ -1,7 +1,7 @@
# National Crime Victimization Survey Vignette {#c09-ncvs-vignette}

::: {.prereqbox-header}
`r if (knitr:::is_html_output()) '### Prerequisites {- #prereq}'`
`r if (knitr:::is_html_output()) '### Prerequisites {- #prereq9}'`
:::

::: {.prereqbox data-latex="{Prerequisites}"}
2 changes: 1 addition & 1 deletion 10-ambarom-vignette.Rmd
@@ -1,7 +1,7 @@
# AmericasBarometer Vignette {#c10-ambarom-vignette}

::: {.prereqbox-header}
`r if (knitr:::is_html_output()) '### Prerequisites {- #prereq}'`
`r if (knitr:::is_html_output()) '### Prerequisites {- #prereq10}'`
:::

::: {.prereqbox data-latex="{Prerequisites}"}
80 changes: 74 additions & 6 deletions book.bib
@@ -244,14 +244,13 @@ @misc{acs-5yr-doc
}
@article{tse-doc,
author = {Biemer, Paul P.},
title = "{Total Survey Error: Design, Implementation, and Evaluation}",
title = {Total Survey Error: Design, Implementation, and Evaluation},
journal = {Public Opinion Quarterly},
volume = {74},
number = {5},
pages = {817-848},
year = {2010},
month = {01},
abstract = "{The total survey error (TSE) paradigm provides a theoretical framework for optimizing surveys by maximizing data quality within budgetary constraints. In this article, the TSE paradigm is viewed as part of a much larger design strategy that seeks to optimize surveys by maximizing total survey quality; i.e., quality more broadly defined to include user-specified dimensions of quality. Survey methodology, viewed within this larger framework, alters our perspectives on the survey design, implementation, and evaluation. As an example, although a major objective of survey design is to maximize accuracy subject to costs and timeliness constraints, the survey budget must also accommodate additional objectives related to relevance, accessibility, interpretability, comparability, coherence, and completeness that are critical to a survey's “fitness for use.” The article considers how the total survey quality approach can be extended beyond survey design to include survey implementation and evaluation. In doing so, the “fitness for use” perspective is shown to influence decisions regarding how to reduce survey error during design implementation and what sources of error should be evaluated in order to assess the survey quality today and to prepare for the surveys of the future.}",
issn = {0033-362X},
doi = {10.1093/poq/nfq058},
url = {https://doi.org/10.1093/poq/nfq058},
@@ -271,7 +270,7 @@ @book{groves2009survey
}
@book{biemer2003survqual,
title = {Introduction to survey quality},
author = {Biemer, Paul P and Lyberg, Lars E},
author = {Biemer, Paul P. and Lyberg, Lars E.},
year = 2003,
publisher = {John Wiley \& Sons}
}
@@ -297,15 +296,84 @@ @article{DeLeeuw_2018
}
@article{biemer_choiceplus,
title = {{Using Bonus Monetary Incentives to Encourage Web Response in Mixed-Mode Household Surveys}},
author = {Biemer, Paul P and Murphy, Joe and Zimmer, Stephanie and Berry, Chip and Deng, Grace and Lewis, Katie},
author = {Biemer, Paul P. and Murphy, Joe and Zimmer, Stephanie and Berry, Chip and Deng, Grace and Lewis, Katie},
year = 2017,
month = {06},
journal = {Journal of Survey Statistics and Methodology},
volume = 6,
number = 2,
pages = {240--261},
pages = {240-261},
doi = {10.1093/jssam/smx015},
issn = {2325-0984},
url = {https://doi.org/10.1093/jssam/smx015},
eprint = {https://academic.oup.com/jssam/article-pdf/6/2/240/24807375/smx015.pdf}
}
}
@book{Bradburn2004,
author = {Norman M. Bradburn and Seymour Sudman and Brian Wansink},
edition = {2nd Edition},
publisher = {Jossey-Bass},
title = {Asking Questions: The Definitive Guide to Questionnaire Design},
year = {2004},
}
@book{Fowler1989,
author = {Floyd J. Fowler and Thomas W. Mangione},
publisher = {SAGE},
title = {Standardized Survey Interviewing},
year = {1989},
}
@book{Kim2021,
author = {Jae Kwang Kim and Jun Shao},
publisher = {Chapman \& Hall/CRC Press},
title = {Statistical Methods for Handling Incomplete Data},
year = {2021},
}
@book{Schouten2018,
author = {Barry Schouten and Andy Peytchev and James Wagner},
publisher = {Chapman \& Hall/CRC Press},
title = {Adaptive Survey Design},
year = {2018},
}
@book{Tourangeau2000psych,
author = {Roger Tourangeau and Lance J. Rips and Kenneth Rasinski},
publisher = {Cambridge University Press},
title = {Psychology of Survey Response},
year = {2000},
}
@article{Tourangeau2004spacing,
author = {Roger Tourangeau and Mick P. Couper and Frederick Conrad},
isbn = {0033-362X},
issn = {0033362X},
issue = {3},
journal = {Public Opinion Quarterly},
pages = {368-393},
publisher = {Oxford University Press},
title = {Spacing, Position, and Order: Interpretive Heuristics for Visual Features of Survey Questions},
volume = {68},
url = {http://www.jstor.org/stable/3521676},
year = {2004},
}
@book{Valliant2018weights,
author = {Richard Valliant and Jill A. Dever},
publisher = {Stata Press},
title = {Survey Weights: A Step-by-step Guide to Calculation},
year = {2018},
}
@article{deLeeuw2005,
author = {DeLeeuw, Edith D.},
issue = {2},
journal = {Journal of Official Statistics},
pages = {233-255},
title = {To Mix or Not to Mix Data Collection Modes in Surveys},
volume = {21},
year = {2005},
}

@inbook{Skinner2009,
author = {Chris Skinner},
editor = {C.R. Rao},
title = {Chapter 15: Statistical Disclosure Control for Survey Data},
booktitle = {Handbook of Statistics: Sample Surveys: Design, Methods and Applications},
pages = {381-396},
publisher = {Elsevier B.V.},
year = {2009},
}
2 changes: 1 addition & 1 deletion css/style.css
@@ -55,7 +55,7 @@ li.ro::marker{
border-top-right-radius: 10px;
}

h3.hasAnchor#prereq {
h3.hasAnchor#prereq3, h3.hasAnchor#prereq4, h3.hasAnchor#prereq5, h3.hasAnchor#prereq6, h3.hasAnchor#prereq7, h3.hasAnchor#prereq8, h3.hasAnchor#prereq9, h3.hasAnchor#prereq10 {
margin-top: 0em !important;
margin-bottom: 0em !important;
}