Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve the apply_weekly_lag() and format_fb_data() - Adopt final changes review #6

Merged
merged 4 commits into from
Mar 5, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
49 changes: 0 additions & 49 deletions .github/workflows/check-standard.yaml

This file was deleted.

92 changes: 65 additions & 27 deletions R/apply_weekly_lag.R
Original file line number Diff line number Diff line change
Expand Up @@ -2,9 +2,11 @@
#'
#' @description Applying a week lag to the data will create raster images
#' showing the mobility a week before the date of interest.
#' This function works only for QuadKeys without NAs.
#' This function works only for QuadKeys reported without NAs for
#' `n_crisis` and `percent_change` variables .
#'
#' @param data A data.frame
#' @param data A data.frame with the columns `quadkey`,
#' `day`, `hour` and `n_crisis`.
#'
#' @importFrom rlang .data
#'
Expand All @@ -26,9 +28,12 @@
#' package = "quadkeyr"
#' ), "/"),
#' colnames = c(
#' "lat", "lon",
#' "quadkey", "date_time",
#' "n_crisis", "percent_change"
#' "lat",
#' "lon",
#' "quadkey",
#' "date_time",
#' "n_crisis",
#' "percent_change"
#' ),
#' coltypes = list(
#' lat = "d",
Expand All @@ -40,29 +45,62 @@
#' )
#' )
#'
#' apply_weekly_lag(files)
#' apply_weekly_lag(data = files)
apply_weekly_lag <- function(data) {
out_data <- c()
# First I must check that we have all the days and months
if (nrow(missing_combinations(data)) != 0) {
mc <- missing_combinations(data)
# create the combination of QuadKeys,
# days and hours missing in a grid
missing_data <- expand.grid(
quadkey = unique(data$quadkey),
day = mc$day,
hour = mc$hour
)

# Add the missing data to the original files
# Now I have quadkey, day and hour columns complete
data <- dplyr::bind_rows(data, missing_data) |>
dplyr::arrange(.data$day, .data$hour)
}

# I am only considering cases where there aren't NAs
# Let's remove the QuadKeys with 100% NAs for n_crisis
qk_data_without_NA <- data |>
dplyr::group_by(.data$quadkey) |>
dplyr::summarise(empty = !is.na(sum(.data$n_crisis))) |>
dplyr::filter(.data$empty == FALSE) |>
dplyr::ungroup()

data <- data |>
dplyr::filter(.data$quadkey %in% qk_data_without_NA$quadkey)

# QuadKey that appears in all the combination of possible days and hours
# should occur `qk_rep` times.
# If a QuadKey is reported fewer times than that, we will remove it
# to avoid discontinuous sequences of days and subsequent gaps.
min_date <- min(data$day)
max_date <- max(data$day)
days <- as.numeric(max_date - min_date)
qk_rep <- (days + 1) * 3

qk_reg <- data |>
dplyr::count(.data$quadkey) |>
dplyr::filter(.data$n == qk_rep)

data <- data |>
dplyr::filter(.data$quadkey %in% qk_reg$quadkey)

# Now that this is all sorted,
# let's create the lag column
quadkey_lag <- data |>
dplyr::group_by(.data$quadkey) |>
dplyr::arrange(.data$day, .data$hour, .by_group = TRUE) |>
dplyr::mutate(n_crisis_lag_7 = dplyr::lag(.data$n_crisis,
n = (7 * 3))) |>
dplyr::mutate(percent_change_7 = ((.data$n_crisis_lag_7 - .data$n_crisis) /
.data$n_crisis) * 100)
}

for (i in unique(data$quadkey)) {
inter <- data |>
dplyr::filter(.data$quadkey == i)

# I am only considering cases where there aren't NAs
if (!is.na(sum(inter$n_crisis))) {
quadkey_lag <- inter |>
dplyr::group_by(.data$quadkey, .data$hour) |>
dplyr::mutate(n_crisis_lag_7 = dplyr::lag(as.numeric(.data$n_crisis),
n = 7
)) |>
dplyr::mutate(percent_change_7 = ((
.data$n_crisis_lag_7 - .data$n_crisis
) /
.data$n_crisis) * 100)

out_data <- rbind(out_data, quadkey_lag)
}
}

return(out_data)
}
4 changes: 0 additions & 4 deletions R/get_coords.R
Original file line number Diff line number Diff line change
Expand Up @@ -72,11 +72,7 @@ get_qk_coord <- function(data) {
#' (`tileX`, `tileY`), a `quadkey` and a `geometry` column.
#'
#' @keywords internal
#' @export
#'
#' @examples
#'
#'
#' grid <- create_qk_grid(
#' xmin = -59,
#' xmax = -40,
Expand Down
1 change: 0 additions & 1 deletion R/grid_to_polygon.R
Original file line number Diff line number Diff line change
Expand Up @@ -135,7 +135,6 @@ grid_to_polygon <- function(data) {
#' Their value is not necessary.
#'
#' @keywords internal
#' @export
#'
#' @examples
#'
Expand Down
3 changes: 1 addition & 2 deletions R/polygon_to_raster.R
Original file line number Diff line number Diff line change
Expand Up @@ -77,7 +77,7 @@ polygon_to_raster <- function(data,
# detect and reposrt combination of dates and times missing
if (nrow(missing_combinations(data)) > 0) {
warning("These combinations of days and times are missing in the dataset")
missing_combinations(data)
print(missing_combinations(data))
}


Expand All @@ -104,7 +104,6 @@ polygon_to_raster <- function(data,
next
}


data_raster <- data |>
dplyr::filter(.data$day == as.Date(i,
origin = "1970-01-01"
Expand Down
31 changes: 22 additions & 9 deletions R/read_and_format_files.R
Original file line number Diff line number Diff line change
Expand Up @@ -12,6 +12,9 @@
#' @param coltypes Column specifications (as strings).
#' See vignette("readr", package = "readr") for more details.
#' documentation.
#' @param keep_format Vector of column names,
#' besides `date_time`, `day` and `quadkey`, that you
#' don't want to convert to a number.
#'
#' @seealso \code{\link{format_fb_data}}
#' @seealso \code{\link[readr]{read_csv}}
Expand Down Expand Up @@ -50,7 +53,8 @@
#' head(files)
read_fb_mobility_files <- function(path_to_csvs,
colnames,
coltypes) {
coltypes,
keep_format = NULL) {

# This data always have the same format
fnames <- list.files(
Expand Down Expand Up @@ -84,8 +88,8 @@ read_fb_mobility_files <- function(path_to_csvs,
}
)


data <- format_fb_data(data)
data <- format_fb_data(data,
keep_format = keep_format)

if (nrow(missing_combinations(data)) > 0) {
message(paste(
Expand All @@ -103,10 +107,16 @@ read_fb_mobility_files <- function(path_to_csvs,
#' @description This function removes unnecessary characters such as `\\N`
#' and ensures that the format of the date and QuadKeys is correct.
#'
#' @param data A data.frame with a `quadkey`, `date_time`, `country` columns
#' and other numeric variables
#' @param data A data.frame with a `quadkey` and `date_time` columns
#' and other variables
#' @param keep_format Vector of column names,
#' besides `date_time`, `day` and `quadkey`, that you
#' don't want to convert to a number.
#'
#' @return A data.frame.
#' @return A data.frame without `\N`,
#' `quadkey` without scientific notation and
#' a new column `day` and `hour`
#'
#' @export
#'
#' @seealso \code{\link{read_fb_mobility_files}}
Expand All @@ -115,7 +125,8 @@ read_fb_mobility_files <- function(path_to_csvs,
#'
#' data(result_read_fb_mobility_data)
#' format_fb_data(data = result_read_fb_mobility_data)
format_fb_data <- function(data) {
format_fb_data <- function(data,
keep_format = NULL) {

# remove scientific notation
data$quadkey <- format(data$quadkey,
Expand All @@ -125,6 +136,7 @@ format_fb_data <- function(data) {
# change date format
data$day <- lubridate::date(data$date_time)

# get the hour
data$hour <- as.numeric(format(as.POSIXct(data$date_time,
format = "%Y-%m-%d %H%M"
),
Expand All @@ -138,7 +150,8 @@ format_fb_data <- function(data) {
~ ifelse(. == "\\N", NA, .)
)) |>
dplyr::mutate(dplyr::across(
-c("date_time", "day", "quadkey"), # tidyselect
-c("date_time", "day", "quadkey",
dplyr::all_of(keep_format)), # tidyselect
as.numeric
))

Expand All @@ -163,6 +176,7 @@ format_fb_data <- function(data) {
#'
#' # Sample dataset
#' data <- data.frame(
#' country = c("US", "MX", "MX"),
#' day = c("2023-01-01", "2023-01-03", "2023-01-05"),
#' hour = c(0, 8, 16)
#' )
Expand All @@ -187,6 +201,5 @@ missing_combinations <- function(data) {
data,
by = c("day", "hour")
)

return(missing_combinations)
}
2 changes: 0 additions & 2 deletions R/regular_grid.R
Original file line number Diff line number Diff line change
Expand Up @@ -39,7 +39,6 @@ regular_qk_grid <- function(data) {
zoom = qk_zoom
)


if (nrow(grid$data) == nrow(data)) {
warning("The grid is already complete, this function is not necessary")
return(list(
Expand Down Expand Up @@ -124,7 +123,6 @@ add_regular_polygon_grid <- function(data) {
))
}


#' Get regular QuadKey polygon grid derived from
#' the bounding box of the `quadkey` column of a data.frame.
#'
Expand Down
6 changes: 4 additions & 2 deletions README.Rmd
Original file line number Diff line number Diff line change
Expand Up @@ -48,9 +48,11 @@ to address and retrieve specific map tiles,
facilitating rapid display within
mapping applications.

```{r qk, echo = FALSE, out.width= "50%", fig.align='center', fig.cap= "The QuadKey of any tile starts with the QuadKey of its parent tile (the containing tile at the previous level). Image extracted from Microsoft's Bing Maps Tile System webpage."}
knitr::include_graphics("vignettes/quadkeys.jpg")

```{r qks, echo = FALSE, out.width= "70%", fig.align='center'}
knitr::include_graphics("vignettes/quadkeys.png")
```
<p style="text-align: center">The QuadKey of any tile starts with the QuadKey of its parent tile (the containing tile at the previous level). Image extracted from Microsoft's Bing Maps Tile System webpage.</p>

---

Expand Down
14 changes: 5 additions & 9 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -38,17 +38,13 @@ spatial position as a sequence of characters. They provide an efficient
method to address and retrieve specific map tiles, facilitating rapid
display within mapping applications.

<div class="figure" style="text-align: center">

<img src="vignettes/quadkeys.jpg" alt="The QuadKey of any tile starts with the QuadKey of its parent tile (the containing tile at the previous level). Image extracted from Microsoft's Bing Maps Tile System webpage." width="50%" />
<p class="caption">
<img src="vignettes/quadkeys.png" width="70%" style="display: block; margin: auto;" />
<p style="text-align: center">
The QuadKey of any tile starts with the QuadKey of its parent tile (the
containing tile at the previous level). Image extracted from Microsoft’s
Bing Maps Tile System webpage.
</p>

</div>

------------------------------------------------------------------------

The goal of `quadkeyr` is to:
Expand All @@ -62,21 +58,21 @@ The goal of `quadkeyr` is to:
to convert QuadKeys to and from tiles, pixels and geographic
coordinates are available.

<img src="vignettes/workflow_quadkey.png" width="70%" style="display: block; margin: auto;" />
<img src="vignettes/workflow_quadkey.png" width="80%" style="display: block; margin: auto;" />

2. [**Generate Raster Images from Quadkey-Identified
Data**](https://fernandez-lab-wsu.github.io/quadkeyr/articles/quadkey_identified_data_to_raster.html)
Complete a grid of QuadKeys within a specified area and zoom level,
and create a `stars` raster. You can also directly convert QuadKeys
in a data.frame column into an `sf` POLYGON data.frame.

<img src="vignettes/workflow_raster.png" width="70%" style="display: block; margin: auto;" />
<img src="vignettes/workflow_raster.png" width="80%" style="display: block; margin: auto;" />
3. [**Convert Facebook Mobility QuadKey-identified Datasets into Raster
Files**](https://fernandez-lab-wsu.github.io/quadkeyr/articles/facebook_mobility_csvs_to_raster_files.html)
Convert Facebook mobility data `.csv` files into `.tif` files by day and
hour reported.

<img src="vignettes/workflow_facebook.png" width="70%" style="display: block; margin: auto;" />
<img src="vignettes/workflow_facebook.png" width="80%" style="display: block; margin: auto;" />

4. [**Offer an App for visualizing QuadKeys on a
map**](https://fernandez-lab-wsu.github.io/quadkeyr/articles/quadkey_visualization_app.html)
Expand Down
2 changes: 1 addition & 1 deletion docs/articles/facebook_mobility_csvs_to_raster_files.html

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

2 changes: 1 addition & 1 deletion docs/articles/quadkey_to_sf_conversion.html

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

2 changes: 1 addition & 1 deletion docs/articles/quadkey_visualization_app.html

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

Binary file added docs/articles/quadkeys.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Loading
Loading