Link to Tidy Data Tutor to help visualize data transformations step-by-step when using pipes #783

Open
samanthacsik opened this issue Dec 22, 2021 · 1 comment
Labels
type:enhancement Propose enhancement to the lesson

Comments

@samanthacsik

There's a super cool new browser-based tool called Tidy Data Tutor (and the Pandas version, Pandas Tutor), which lets you visualize how a data frame changes at each step of a data analysis/transformation pipeline. I've found this to be a super helpful teaching tool for workshops where I am introducing the pipe operator, %>%, for the first time. Tidy Data Tutor breaks down your pipeline into its steps (i.e. at each %>%) and shows exactly how the data frame is altered in each one. I've created a simple example demonstrating the dplyr functions filter(), select() and arrange(), or see the screenshots below:

Step 1: create some data in the browser-based editor (here I created a mini version of the portal_data_joined.csv used in this workshop)
[Screenshot: example data created in the Tidy Data Tutor editor]

Step 2: Visualize the data analysis pipeline
[Screenshot: Tidy Data Tutor visualization of the pipeline]
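
For readers without the screenshots, here is a rough sketch of what stepping through such a pipeline looks like in plain R. The `surveys` data frame below is a made-up stand-in (its column names follow the workshop data, its values are invented), and the particular filter condition is only an assumption for illustration, not necessarily the one in the linked example.

```r
library(dplyr)

# Hypothetical mini data set: column names follow the workshop data,
# but the values are invented for this sketch.
surveys <- data.frame(
  species_id = c("NL", "DM", "DM", "PF", "DM"),
  sex        = c("M", "F", "M", "F", "M"),
  weight     = c(218, 40, 48, 7, 44)
)

# The full pipeline, written with pipes
result <- surveys %>%
  filter(species_id == "DM") %>%
  select(sex, weight) %>%
  arrange(weight)

# The same pipeline split into the intermediate data frames that a
# step-by-step visualization would show
step1 <- filter(surveys, species_id == "DM")  # keeps only the "DM" rows
step2 <- select(step1, sex, weight)           # keeps only two columns
step3 <- arrange(step2, weight)               # sorts rows by increasing weight

print(step1)
print(step2)
print(step3)
```

Printing `step1`, `step2` and `step3` in turn gives a text-only version of the idea: each %>% hands one intermediate data frame on to the next function.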

It's easy to embed the reproducible pipeline visualization into lesson materials using the "Sharable URL" at the bottom of the page. It is important to note that because this is a super new tool (I think it was released only about two weeks ago), it may still be a little buggy.

Still, I think it could add value to the Data Manipulation using dplyr and tidyr episode in the Data Analysis and Visualization in R for Ecologists Data Carpentry workshop (or any workshop where pipes are taught), particularly after the first example of a pipe:
[Screenshot: the first pipe example in the lesson]

One suggestion may be to include language immediately following the text in the screenshot above that states something like:

...Since %>% takes the object on its left and passes it as the first argument to the function on its right, we don’t need to explicitly include the data frame as an argument to the filter() and select() functions any more. To understand the step-wise transformations taking place each time a pipe is used to string together tidyverse functions, you can explore the output of this new online tool, Tidy Data Tutor.
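
To make the "first argument" point concrete, here is a minimal sketch comparing a nested call with its piped equivalent. The `surveys` data frame is again a made-up stand-in with invented values, and the `weight < 5` condition is an assumption chosen for illustration.

```r
library(dplyr)

# Hypothetical stand-in data: invented values, not the real portal data.
surveys <- data.frame(
  species_id = c("PF", "DM", "PF", "RM", "NL"),
  sex        = c("F", "M", "M", "F", "M"),
  weight     = c(4, 44, 7, 3, 218)
)

# Without pipes: the data frame is written out as the first argument,
# and the calls have to be read from the inside out.
nested <- select(filter(surveys, weight < 5), species_id, sex, weight)

# With pipes: each %>% passes the object on its left as the first
# argument of the function on its right, so the data frame is named once.
piped <- surveys %>%
  filter(weight < 5) %>%
  select(species_id, sex, weight)

# Compare the two results
all.equal(nested, piped)
```

Both versions perform the same two operations; the piped form just reads top to bottom in the order the steps happen.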

@mondorescue mondorescue added the type:enhancement Propose enhancement to the lesson label Jan 11, 2022
@tobyhodges
Member

tobyhodges commented Jul 10, 2024

Thanks @samanthacsik for opening this issue. The lesson underwent a major update and reorganisation when #887 was merged. Although this issue refers to content in a version of the lesson before that update took place, the overall suggestion may still be relevant.
