Usability improvements for pd() and pd_to_p() #665

bwiernik · 2024-08-22T14:34:20Z

I've been using pd() a bunch lately, and there are a few things that make it a little annoying to integrate into my workflow. These changes would make it much smoother for me:

[ ] An as_vector argument that returns a simple vector of pd values, rather than a data frame.

This would fit into a workflow where I have a data frame with posterior draws (e.g., as a list column or as posterior::rvar) and I want to compute pd as a new column.

(An alternative would be to make the method for rvar always return a vector value, but I don't like that inconsistency, and it would mean it's not applicable to similar data structures that don't use posterior, like list columns.)

results |> transform(pd = pd(.epred))

[ ] A pd_to_p.p_direction() method that takes the data frame output of p_direction() and directly converts it to a data frame with p values instead. That would avoid having to do this rather involved set of steps:

results |> pd() |> transform(p = pd_to_p(pd)) |> subset(select = -pd)

[ ] This may be a little too heretical, but maybe add an as_frequentist_p or as_p argument to pd() that returns the results scaled as p values in one step.
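To make the requests above concrete, here is a minimal base-R sketch of the two behaviours. Both helpers (pd_vec and pd_col_to_p) are hypothetical names made up for illustration and are not part of bayestestR. The sketch assumes pd is the larger of the two sign proportions of the draws, and that the two-sided conversion is p = 2 * (1 - pd).

```r
# Hypothetical helpers for illustration; neither is part of bayestestR.

# pd for each element of a list of posterior draws, returned as a plain vector.
# Assumes pd = larger share of draws on one side of zero.
pd_vec <- function(draws) {
  vapply(draws, function(x) max(mean(x > 0), mean(x < 0)), numeric(1))
}

# Swap the pd column of a data frame for a p column.
# Assumes p = 1 - pd (one-sided) or 2 * (1 - pd) (two-sided).
pd_col_to_p <- function(df, two_sided = TRUE) {
  p <- 1 - df$pd
  if (two_sided) p <- 2 * p
  df$pd <- NULL
  df$p <- p
  df
}

# Toy data: a list column holding a handful of "draws" per row.
results <- data.frame(id = 1:2)
results$.epred <- list(c(-1, 1, 2, 3), c(-2, -1, -3, -4))

# pd as a new column, then converted to p in one call.
results$pd <- pd_vec(results$.epred)
results_p <- pd_col_to_p(results)
```

Because pd_vec returns a bare numeric vector, it can be assigned directly as a column, which is the behaviour the as_vector request is after.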


strengejacke · 2024-08-22T18:06:18Z

For the first point, I think @mattansb added an as.numeric() method for all functions.

mattansb · 2024-08-22T19:40:54Z

I think that was @DominiqueMakowski ?

@bwiernik How about something like allowing the data.frame method to select an rvar column? (Which I think was what I meant with #604)

grid <- data.frame(
  A = letters[1:2],
  B = c(2, 3),
  val = posterior::rvar(array(rnorm(1200), dim = c(600, 2)))
)

# Pull rvar column:
bayestestR::p_direction(grid$val)
#> Probability of Direction
#> 
#> Parameter |     pd
#> ------------------
#> x[1]      | 53.50%
#> x[2]      | 52.67%

# Or pass the data frame and tell the function what column has rvars:
bayestestR::p_direction(grid, rvar_col = "val")
#>   A B          val        pd
#> 1 a 2 0.062 ± 0.99 0.5350000
#> 2 b 3 0.037 ± 1.00 0.5266667

# Original behavior is maintained when not specifying rvar_col:
bayestestR::p_direction(grid)
#> Probability of Direction
#> 
#> Parameter |   pd
#> ----------------
#> B         | 100%

This is implemented in #666 😈

bwiernik · 2024-08-24T16:13:42Z

I typically use pd() inside a call to mutate() with several other transformations (e.g., taking an rvar column and computing its median, CI, and pd in one step).
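As a rough illustration of that kind of one-step summary, here is a base-R sketch that works on a plain list of draws rather than an rvar column. The helper name draw_summary is hypothetical, the interval is assumed to be an equal-tailed quantile interval, and pd is again taken as the larger sign proportion of the draws.

```r
# Hypothetical helper: summarise one vector of posterior draws.
# Assumes an equal-tailed quantile interval for the CI.
draw_summary <- function(x, ci = 0.95) {
  probs <- c((1 - ci) / 2, 1 - (1 - ci) / 2)
  bounds <- unname(quantile(x, probs = probs))
  c(
    median = median(x),
    ci_low = bounds[1],
    ci_high = bounds[2],
    pd = max(mean(x > 0), mean(x < 0))
  )
}

# Simulated draws standing in for predictions or custom contrasts.
set.seed(123)
draws <- list(rnorm(1000, mean = 0.5), rnorm(1000, mean = -2))

# One row per set of draws, one column per summary.
summary_table <- as.data.frame(t(vapply(draws, draw_summary, numeric(4))))
```

Each summary here is a plain number per row, which is what makes this pattern easy to drop into a mutate() or transform() call.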

strengejacke · 2024-08-24T18:13:04Z

This sounds like something that can be done with model_parameters().

bwiernik · 2024-08-26T16:07:44Z

No, I'm working with vectors of predictions or custom contrasts.

DominiqueMakowski · 2024-08-26T19:01:49Z

I agree that there's room for improvement; I also find our functions sometimes fiddly within tidyverse pipelines.

strengejacke · 2024-08-31T08:27:16Z

[ ] A pd_to_p.p_direction() method that takes the data frame output of p_direction() and directly converts it to a data frame with p values instead. That would avoid having to do this rather involved set of steps:

Should that return a data frame again?

Fixes #665

* Usability improvements for pd() and pd_to_p() Fixes #665
* docs
* add as_p
* Work on #664
* add tests
* add tests
* tests for as..vector
* news
* include #666
* news
* news
* lintr

bwiernik · 2024-09-03T23:04:51Z

[ ] A pd_to_p.p_direction() method that takes the data frame output of p_direction() and directly converts it to a data frame with p values instead. That would avoid having to do this rather involved set of steps:

Should that return a data frame again?

Yes, I think that's what you implemented?

strengejacke · 2024-09-04T05:29:43Z

Yes, and we have as.numeric() and as.vector() methods to return a vector instead of a data frame.

strengejacke self-assigned this Aug 31, 2024

strengejacke added a commit that referenced this issue Aug 31, 2024

Usability improvements for pd() and pd_to_p()

205e645

Fixes #665

strengejacke mentioned this issue Aug 31, 2024

Usability improvements for pd() and pd_to_p() #668

Merged

strengejacke closed this as completed in #668 Aug 31, 2024
