Use format = "file" instead of tar_file #30

Open
wants to merge 3 commits into
base: main
13 changes: 7 additions & 6 deletions episodes/files.Rmd
@@ -98,7 +98,7 @@ tar_dir({

The target `some_data` was skipped, even though the contents of the file changed.

- That is because right now, targets is only tracking the **name** of the file, not its contents. We need to use a special function for that, `tar_file()` from the `tarchetypes` package. `tar_file()` will calculate the "hash" of a file---a unique digital signature that is determined by the file's contents. If the contents change, the hash will change, and this will be detected by `targets`.
+ That is because right now, targets is only tracking the **name** of the file, not its contents. We need to use a special argument for that, `tar_target(format = "file")`. This will cause `targets` to calculate the "hash" of a file---a unique digital signature that is determined by the file's contents. If the contents change, the hash will change, and this will be detected by `targets`.

```{r}
#| label: example-file-show-3
@@ -107,7 +107,7 @@ library(targets)
library(tarchetypes)

tar_plan(
- tar_file(data_file, "_targets/user/data/hello.txt"),
+ tar_target(data_file, "_targets/user/data/hello.txt", format = "file"),
some_data = readLines(data_file)
)
```
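
As a reviewer's aside on the hashing idea in the changed paragraph, here is a minimal sketch showing that a file's content hash changes even when its path stays the same. It uses base R's `tools::md5sum()` purely as a stand-in; `targets` uses its own hashing internally, and the temporary file and variable names are hypothetical.

```r
# Illustration only: tools::md5sum() stands in for the hashing that
# targets performs internally when format = "file" is used.
path <- tempfile(fileext = ".txt")  # hypothetical file for the demo

writeLines("Hello World", path)
hash_before <- unname(tools::md5sum(path))

# Same path, different contents
writeLines("Hello World. How are you?", path)
hash_after <- unname(tools::md5sum(path))

# The file name did not change, but the content hash did
same_hash <- identical(hash_before, hash_after)
same_hash
```

Tracking only the path would treat both versions as identical, which is why the first version of the pipeline skipped `some_data`.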
@@ -217,7 +217,7 @@ tar_dir({

## Writing out data

- Writing to files is similar to loading in files: we will use the `tar_file()` function. There is one important caveat: in this case, the second argument of `tar_file()` (the command to build the target) **must return the path to the file**. Not all functions that write files do this (some return nothing; these treat the output file as a side effect of running the function), so you may need to define a custom function that writes out the file and then returns its path.
+ Writing to files is similar to loading in files: we will use `tar_target(format = "file")`. There is one important caveat: in this case, the `command` argument of `tar_target()` **must return the path to the file**. Not all functions that write files do this (some return nothing; these treat the output file as a side effect of running the function), so you may need to define a custom function that writes out the file and then returns its path.

Let's do this for `writeLines()`, the R function that writes character data to a file. Normally, its output would be `NULL` (nothing), as we can see here:

@@ -277,9 +277,10 @@ tar_plan(
readLines(!!.x)
),
hello_caps = toupper(hello),
- tar_file(
+ tar_target(
hello_caps_out,
- write_lines_file(hello_caps, "_targets/user/results/hello_caps.txt")
+ write_lines_file(hello_caps, "_targets/user/results/hello_caps.txt"),
+ format = "file"
)
)
```
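
The definition of `write_lines_file()` falls outside the hunks shown in this diff. As a rough sketch of what such a wrapper might look like (an assumption, not the lesson's actual code), the function below writes the lines and then returns the path, which is what `format = "file"` requires of the command. Creating the output directory is an extra convenience added here.

```r
# Hypothetical helper: the lesson's real definition is not visible in this diff.
write_lines_file <- function(text, file, ...) {
  # Make sure the output directory exists before writing (added convenience)
  dir.create(dirname(file), recursive = TRUE, showWarnings = FALSE)
  # writeLines() returns NULL invisibly, so we cannot use its return value
  writeLines(text, con = file, ...)
  # Return the path so the file's contents can be tracked
  file
}

# Usage outside a pipeline, writing to a temporary directory
out <- write_lines_file(
  toupper("hello"),
  file.path(tempdir(), "results", "hello_caps.txt")
)
readLines(out)
```

Returning the path rather than `NULL` is the only behavioural difference from calling `writeLines()` directly.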
@@ -318,7 +319,7 @@ So this way of writing out results makes your pipeline more robust: we have a gu

::::::::::::::::::::::::::::::::::::: keypoints

- - `tarchetypes::tar_file()` tracks the contents of a file
+ - `tar_target(format = "file")` tracks the contents of a file
- Use `tarchetypes::tar_file_read()` in combination with data loading functions like `read_csv()` to keep the pipeline in sync with your input data
- Use `tarchetypes::tar_file()` in combination with a function that writes to a file and returns its path to write out data

1 change: 0 additions & 1 deletion episodes/organization.Rmd
@@ -157,4 +157,3 @@ Striking this balance is more of art than science, and only comes with practice.
- Writing functions is a key skill for `targets` pipelines

::::::::::::::::::::::::::::::::::::::::::::::::
