-
Notifications
You must be signed in to change notification settings - Fork 0
/
HumGen_Labg1_covid_map.qmd
276 lines (229 loc) · 11.7 KB
/
HumGen_Labg1_covid_map.qmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
---
title: 'Lab 5g (graduate students) : Mapping the COVID-19 data'
---
## Learning Objectives
* Making shape maps
* Interactive graphs
### Today's libaries
```{r}
library(tidyverse)
library(maps)
library(mapdata)
library(lubridate)
library(viridis)
library(wesanderson)
library(plotly)
```
## Creating Country, State and County maps
The ideas for this course tutorial came from multiple examples contributed by Prof. Chris Sutherland and these tutorials
1. [Maps in R using maps](https://cran.r-project.org/web/packages/maps/maps.pdf) by [Eric Anderson](http://eriqande.github.io/rep-res-web/lectures/making-maps-with-R.html)
1. [geom_maps](https://ggplot2.tidyverse.org/reference/geom_map.html)
1. [Drawing beautiful maps programmatically with R, sf and ggplot2](https://www.r-spatial.org/r/2018/10/25/ggplot2-sf.html)
For our labs two types of approaches will be used to add data to maps. The first is based on using the longitude and latitude information to add a point at a specific position on a map. The second is two add the information to shapes in a map based on the name of the shape (e.g. states). Although ggmaps is a wonderful tool for mapping using Google Maps and other resources, it is beyond what is needed for now.
As in the previous labs the sources of data are from [Github repo for Novel Coronavirus (COVID-19) Cases](https://github.com/CSSEGISandData/COVID-19) that supports the [dashboard](https://www.arcgis.com/apps/opsdashboard/index.html#/bda7594740fd40299423467b48e9ecf6).
For this lab it is important to note that the time series data does not currently have entries for US States. The daily reports include US State and more recently US country/administrative units. Is possible to concatenate the daily reports to create a time series for US States, but cognizant of changes in the formats of the daily reports.
### Building maps
Here is a graph containing all the coordinate information. Note this is not summarized by country. Since there is a point for each US county. There are many points in the US
```{r}
daily_report <- read_csv(url("https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_daily_reports/04-02-2020.csv")) |>
rename(Long = "Long_")
ggplot(daily_report, aes(x = Long, y = Lat, size = Confirmed/1000)) +
borders("world", colour = NA, fill = "grey90") +
geom_point(shape = 21, color='purple', fill='purple', alpha = 0.5) +
labs(title = 'World COVID-19 Confirmed cases',x = '', y = '',
size="Cases (x1000))") +
theme(legend.position = "right") +
coord_fixed(ratio=1.5) +
theme_bw()
```
Zoom in on US 48 states. To do this Alaska, Hawaii and US Territories are filtered. Some US State entries have a Lat and Long of zero, so these are filtered as well.
```{r}
daily_report_filtered_US <- daily_report |>
filter(Country_Region == "US") |>
filter (!Province_State %in% c("Alaska","Hawaii", "American Samoa",
"Puerto Rico","Northern Mariana Islands",
"Virgin Islands", "Recovered", "Guam", "Grand Princess",
"District of Columbia", "Diamond Princess")) |>
filter(Lat > 0)
ggplot(daily_report_filtered_US, aes(x = Long, y = Lat, size = Confirmed)) +
borders("state", colour = "black", fill = "grey90") +
theme_bw() +
geom_point(shape = 21, color='purple', fill='purple', alpha = 0.5) +
labs(title = 'COVID-19 Confirmed Cases in the US', x = '', y = '',
size="Cases") +
theme(legend.position = "right") +
coord_fixed(ratio=1.5)
```
Here is a prettier version based on an example by Anisa Dhana
```{r}
daily_report_filtered_Dhana <- daily_report |>
filter(Country_Region == "US") |>
filter (!Province_State %in% c("Alaska","Hawaii", "American Samoa",
"Puerto Rico","Northern Mariana Islands",
"Virgin Islands", "Recovered", "Guam", "Grand Princess",
"District of Columbia", "Diamond Princess")) |>
filter(Lat > 0)
mybreaks <- c(1, 100, 1000, 10000, 10000)
ggplot(daily_report_filtered_Dhana, aes(x = Long, y = Lat, size = Confirmed)) +
borders("state", colour = "white", fill = "grey90") +
geom_point(aes(x=Long, y=Lat, size=Confirmed, color=Confirmed),stroke=F, alpha=0.7) +
scale_size_continuous(name="Cases", trans="log", range=c(1,7),
breaks=mybreaks, labels = c("1-99",
"100-999", "1,000-9,999", "10,000-99,999", "50,000+")) +
scale_color_viridis_c(option="viridis",name="Cases",
trans="log", breaks=mybreaks, labels = c("1-99",
"100-999", "1,000-9,999", "10,000-99,999", "50,000+")) +
# Cleaning up the graph
theme_void() +
guides( colour = guide_legend()) +
labs(title = "Anisa Dhana's lagout for COVID-19 Confirmed Cases in the US'") +
theme(
legend.position = "bottom",
text = element_text(color = "#22211d"),
plot.background = element_rect(fill = "#ffffff", color = NA),
panel.background = element_rect(fill = "#ffffff", color = NA),
legend.background = element_rect(fill = "#ffffff", color = NA)
) +
coord_fixed(ratio=1.5)
```
* Note that in both examples the ggplot funtion __borders__ is used to define the areas in the map
### Mapping data to shapes
```{r}
daily_report_filtered_shapes <- daily_report |>
filter(Country_Region == "US") |>
group_by(Province_State) |>
summarize(Confirmed = sum(Confirmed)) |>
mutate(Province_State = tolower(Province_State))
# load the US map data
us <- map_data("state")
# We need to join the us map data with our daily report to make one data frame/tibble
state_join <- left_join(us, daily_report_filtered_shapes, by = c("region" = "Province_State"))
# plot state map
```
#### Using R color palattes
This is a bit of a digression back to Labs 3 and 4, but there are many R color palattes to choose from or you can create your own. In the above a simple gradient is used. The example from Anisa Dhana uses the viridis palatte which is designed to be perceived by viewers with common forms of colour blindness. Here is an example using a different color package - Wes Anderson. ...and more [Top R Color Palettes to Know for Great Data Visualization](https://www.datanovia.com/en/blog/top-r-color-palettes-to-know-for-great-data-visualization/)
```{r}
# plot state map
ggplot(data = us, mapping = aes(x = long, y = lat, group = group)) +
coord_fixed(1.3) +
# Add data layer
geom_polygon(data = state_join, aes(fill = Confirmed), color = "black") +
scale_fill_gradientn(colours =
wes_palette("Zissou1", 100, type = "continuous"),
trans = "log10") +
labs(title = "COVID-19 Confirmed Cases in the US'")
```
A look now by the counties using the RColorBrewer
```{r}
library(RColorBrewer)
# To display only colorblind-friendly brewer palettes, specify the option colorblindFriendly = TRUE as follow:
# display.brewer.all(colorblindFriendly = TRUE)
# Get and format the covid report data
daily_report_filtered_shapes <- daily_report |>
unite(Key, Admin2, Province_State, sep = ".") |>
group_by(Key) |>
summarize(Confirmed = sum(Confirmed)) |>
mutate(Key = tolower(Key))
# dim(report_03_27_2020)
# get and format the map data
us <- map_data("state")
counties <- map_data("county") |>
unite(Key, subregion, region, sep = ".", remove = FALSE)
# Join the 2 tibbles
state_join <- left_join(counties, daily_report_filtered_shapes, by = c("Key"))
# sum(is.na(state_join$Confirmed))
ggplot(data = us, mapping = aes(x = long, y = lat, group = group)) +
coord_fixed(1.3) +
# Add data layer
borders("state", colour = "black") +
geom_polygon(data = state_join, aes(fill = Confirmed)) +
scale_fill_gradientn(colors = brewer.pal(n = 5, name = "PuRd"),
breaks = c(1, 10, 100, 1000, 10000, 100000),
trans = "log10", na.value = "White") +
ggtitle("Number of Confirmed Cases by US County") +
theme_bw()
```
If we look at just Massachusetts
```{r}
daily_report_filtered_MA <- daily_report |>
filter(Province_State == "Massachusetts") |>
group_by(Admin2) |>
summarize(Confirmed = sum(Confirmed)) |>
mutate(Admin2 = tolower(Admin2))
us <- map_data("state")
ma_us <- subset(us, region == "massachusetts")
counties <- map_data("county")
ma_county <- subset(counties, region == "massachusetts")
state_join <- left_join(ma_county, daily_report_filtered_MA, by = c("subregion" = "Admin2"))
# plot state map
ggplot(data = ma_county, mapping = aes(x = long, y = lat, group = group)) +
coord_fixed(1.3) +
# Add data layer
geom_polygon(data = state_join, aes(fill = Confirmed), color = "white") +
scale_fill_gradientn(colors = brewer.pal(n = 5, name = "BuGn"),
trans = "log10") +
labs(title = "COVID-19 Confirmed Cases in Massachusetts'")
```
* Note the cases on Nantucket and Dukes counties were reported as one value and not included on the graph. There is also an asssigned category that includes 303 Confirmed cases as of 3/31/2020.
```{r}
daily_report
```
## Interactive graphs
In Lab 4 plotly was introduced. It is a great simple way to make interactive graphs with the maps
```{r}
# !!!! This example assumes the prior code was run
library(plotly)
ggplotly(
ggplot(data = ma_county, mapping = aes(x = long, y = lat, group = group)) +
coord_fixed(1.3) +
# Add data layer
geom_polygon(data = state_join, aes(fill = Confirmed), color = "black") +
scale_fill_gradientn(colours =
wes_palette("Zissou1", 100, type = "continuous")) +
ggtitle("COVID-19 Cases in MA") +
# Cleaning up the graph
labs(x=NULL, y=NULL) +
theme(panel.border = element_blank()) +
theme(panel.background = element_blank()) +
theme(axis.ticks = element_blank()) +
theme(axis.text = element_blank())
)
```
Here is an example with the world map
```{r}
# Read in the daily report
daily_report_filtered_world <- daily_report |>
group_by(Country_Region) |>
summarize(Confirmed = sum(Confirmed), Deaths = sum(Deaths))
# Read in the world map data
world <- as_tibble(map_data("world"))
# Check to see if there are differences in the naming of countries
setdiff(world$region, daily_report$Country_Region)
# Many of these countries are considered states or territories in the JHU covid reports,
# but let's fix a few of them
world <- as_tibble(map_data("world")) |>
mutate(region = str_replace_all(region, c("USA" = "US", "Czech Republic" = "Czechia",
"Ivory Coast" = "Cote d'Ivoire", "Democratic Republic of the Congo" = "Congo (Kinshasa)",
"Republic of Congo" = "Congo (Brazzaville)")))
# Join the covid report with the map data
country_join <- left_join(world, daily_report_filtered_world, by = c("region" = "Country_Region"))
# Create the graph
ggplotly(
ggplot(data = world, mapping = aes(x = long, y = lat, text = region, group = group)) +
coord_fixed(1.3) +
# Add data layer
geom_polygon(data = country_join, aes(fill = Deaths), color = "black") +
scale_fill_gradientn(colours =
wes_palette("Zissou1", 100, type = "continuous")) +
labs(title = "COVID-19 Deaths")
)
```
## Exercises
1. Update Anisa Dhana's graph layout of the US to 3/09/2023. You may need to adjust the range or change to a linear scale (delete trans="log")
2. Update the above graph "Number of Confirmed Cases by US County" to 3/09/2023 and use a different color scheme or theme
3. Make an interactive plot using a state of your choosing using a theme different from used in the above examples. To do this change the above code chunk to the abbreviation for your choice state.
```{r, eval=FALSE}
ma_us <- subset(us, region == "massachusetts")
counties <- map_data("county")
ma_county <- subset(counties, region == "massachusetts")
state_join <- left_join(ma_county, daily_report_filtered_MA, by = c("subregion" = "Admin2"))