behaviour of na_str in formatters 0.5.9 #321

iaugusty · 2024-09-27T12:04:03Z

Summary

Just checking if this is intended behavior of na_str, when na_str is a character string of length > 1

xx <- list(c(NA, 1, NA), c(NA, NA, 1), c(1, NA, NA), c(NA, NA, NA))

sapply(xx, FUN = function(x)format_value(x, format = "xx.x (xx.x - xx.x)", na_str = paste0("ne",1:10)))

"ne1 (1.0 - ne2)" "ne1 (ne2 - 1.0)" "1.0 (ne1 - ne2)" "ne1 (ne2 - ne3)"

I would find it more intuitive if the replacement of an NA value is according to the position of the na_str vector, rather than the position of the NA value in the input vector.

In the above example, I would expect the result to read as

"ne1 (1.0 - ne3)" "ne1 (ne2 - 1.0)" "1.0 (ne2 - ne3)" "ne1 (ne2 - ne3)"

library(formatters)
xx <- list(c(NA, 1, NA), c(NA, NA, 1), c(1, NA, NA), c(NA, NA, NA))
sapply(xx, FUN = function(x)format_value(x, format = "xx.x (xx.x - xx.x)", na_str = paste0("ne",1:10)))
#> [1] "ne1 (1.0 - ne2)" "ne1 (ne2 - 1.0)" "1.0 (ne1 - ne2)" "ne1 (ne2 - ne3)"

R session info
R version 4.2.1 (2022-06-23)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: CentOS Linux 7 (Core)

other attached packages:
[1] rtables_0.6.9 magrittr_2.0.3 formatters_0.5.9

Melkiades · 2024-09-27T13:57:30Z

@iaugusty could you format the code correctly? so I can run it directly (check reprex!) thanks

shajoezhu · 2025-01-05T16:19:34Z

hi @iaugusty , can you please provide us with the reproducible code and an example please. Thanks!

iaugusty · 2025-01-06T13:19:59Z

library(formatters)

xx <- list(c(NA, 1, NA), c(NA, NA, 1), c(1, NA, NA), c(NA, NA, NA))
sapply(xx, FUN = function(x)format_value(x, format = "xx.x (xx.x - xx.x)", na_str = paste0("ne",1:10)))
#> [1] "ne1 (1.0 - ne2)" "ne1 (ne2 - 1.0)" "1.0 (ne1 - ne2)" "ne1 (ne2 - ne3)"

shajoezhu · 2025-01-07T10:49:52Z

hi @iaugusty , I see what you mean, yes, this was intentional, and you can see the usage of

format_value(c(NA, 1, NA), format = "xx.x (xx.x - xx.x)", na_str = c("NE", "<missing>"))
[1] "NE (1.0 - <missing>)"

the na_str index was intentionally specified to the i_th NA value in input c(NA, 1, NA), how the i_th NA string should be displayed

suppose a <- c(NA, 1, NA), the i_th element is not referring to a_i.

i understand your suggestion. and I think it would be a very nice feature!

iaugusty · 2025-01-07T12:12:35Z

hi @shajoezhu, It would be nice if the user has some control on whether to refer to the i-th element of the na_str, or the i-th missing value in the values data input

shajoezhu · 2025-01-07T12:14:04Z

awesome! thanks! agreed. I will change the label of this issue to feature

Melkiades · 2025-01-21T08:52:28Z

I think the current behavior is the expected behavior. The best solution is to give a vector also to na_str as it is an sapply. Adding the possibility of a longer vector that matches multiple values in format_values can easily break upper-level code or have unexpected final outcomes. The user has an easy way to apply it in a custom way, so I would not pursue this additional feature for format_value

iaugusty · 2025-01-21T09:25:00Z

In case of multi-stats with different length, eg in a_summary : mean, sd, range, a 1-d stat vs a 2-d stat. If you want to report the 2-d stat as (NE, NE) you have to pass na_str as vector of length 2, while mean itself is a 1-d stat

This is the current behaviour

> format_value(c(NA, 1, NA), format = "xx.x (xx.x - xx.x)", na_str = c("NE", "<missing>", "NA"))
[1] "NE (1.0 - <missing>)"
> format_value(c(1, NA, NA), format = "xx.x (xx.x - xx.x)", na_str = c("NE", "<missing>", "NA"))
[1] "1.0 (NE - <missing>)"
> format_value(c(NA, NA, NA), format = "xx.x (xx.x - xx.x)", na_str = c("NE", "<missing>", "NA"))
[1] "NE (<missing> - NA)"

I would find the following result more natural to understand.

> format_value(c(NA, 1, NA), format = "xx.x (xx.x - xx.x)", na_str = c("NE", "<missing>", "NA"))
[1] "NE (1.0 - NA)"
> format_value(c(1, NA, NA), format = "xx.x (xx.x - xx.x)", na_str = c("NE", "<missing>", "NA"))
[1] "1.0 (<missing> - NA)"

If it could be a user option to have this as the behaviour, aside to the current behaviour, the user can better control the outcome.

It is not that big deal, with the current behaviour and na_str a vector with the same value, we achieve the behaviour we want.

Melkiades · 2025-01-21T09:41:12Z

If it could be a user option to have this as the behaviour, aside to the current behaviour, the user can better control the outcome.

It is not that big deal, with the current behaviour and na_str a vector with the same value, we achieve the behaviour we want.

I see; now I understand well. It makes sense to have it as a mask, as you suggest, instead of a filling vector. Having both options seem impractical, though, as it is hard to differentiate the intent of the user without another flag that would be hard to motivate or explain. I would stick with the current behavior only because of the number of fixes that we should potentially make downstream (even if I think it is usually used with only one value like NE instead of NA)

iaugusty · 2025-01-21T09:44:15Z

I'm fine with it

iaugusty added the bug Something isn't working label Sep 27, 2024

Melkiades added the sme Tracks changes for the sme board label Sep 27, 2024

shajoezhu added enhancement New feature or request and removed bug Something isn't working labels Jan 7, 2025

Melkiades added discussion Needs to consider if worth it or how and removed enhancement New feature or request labels Jan 21, 2025

Melkiades closed this as completed Jan 22, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

behaviour of na_str in formatters 0.5.9 #321

behaviour of na_str in formatters 0.5.9 #321

iaugusty commented Sep 27, 2024 •

edited

Loading

Melkiades commented Sep 27, 2024

shajoezhu commented Jan 5, 2025

iaugusty commented Jan 6, 2025

shajoezhu commented Jan 7, 2025

iaugusty commented Jan 7, 2025

shajoezhu commented Jan 7, 2025

Melkiades commented Jan 21, 2025

iaugusty commented Jan 21, 2025 •

edited

Loading

Melkiades commented Jan 21, 2025

iaugusty commented Jan 21, 2025

behaviour of na_str in formatters 0.5.9 #321

behaviour of na_str in formatters 0.5.9 #321

Comments

iaugusty commented Sep 27, 2024 • edited Loading

Melkiades commented Sep 27, 2024

shajoezhu commented Jan 5, 2025

iaugusty commented Jan 6, 2025

shajoezhu commented Jan 7, 2025

iaugusty commented Jan 7, 2025

shajoezhu commented Jan 7, 2025

Melkiades commented Jan 21, 2025

iaugusty commented Jan 21, 2025 • edited Loading

Melkiades commented Jan 21, 2025

iaugusty commented Jan 21, 2025

iaugusty commented Sep 27, 2024 •

edited

Loading

iaugusty commented Jan 21, 2025 •

edited

Loading