Feature request: Support for multi-components models #1540

Open
larmarange opened this issue Aug 4, 2023 · 5 comments

Comments

@larmarange
Collaborator

larmarange commented Aug 4, 2023

Dear @ddsjoberg

I'm currently adding support for zero-inflated models in broom.helpers, cf. larmarange/broom.helpers#233

With such models, you end up with two sets of coefficients, one for each component of the model (see the component column in the tidy table).

The situation is quite similar with multinomial models and their "y.level" column. Could we add support in tbl_regression() using the code already written for multinomial models?

Best regards

@ddsjoberg
Owner

Absolutely! 🚀 🚀 🚀

@larmarange changed the title from "Feature request: Support for zero-inflated models" to "Feature request: Support for multi-components models" on Aug 5, 2023
@larmarange
Collaborator Author

betareg models are also multi-component models. Therefore, we should be more general.

More precisely, we should check (1) whether there is a "component" column in the result returned by tidy_plus_plus(), and (2) whether there are at least two different values in this column (after selecting certain variables, removing intercepts, etc., we could end up with just one value). If both conditions hold, the table should be presented in the same way as for multinomial models.

@larmarange
Collaborator Author

Dear @ddsjoberg

Would this be a good time to reopen this discussion and see how to make tbl_regression() more general, both for multi-component models and for multinomial models other than multinom()?

Best regards

@ddsjoberg
Owner

Let's do it!

I think there are two general classes of multicomponent models: multinomial outputs (like polytomous regression) and combination models (like the zero-inflated models or the joint longitudinal-survival models). Does that sound right to you?

I think the easiest way to make these work with tbl_regression() would be for broom.helpers to use consistent column names that identify these model types as multi-component models. tbl_regression() could then simply look for these column names and print the results slightly differently (i.e. in two or more sections).

Is there anything else we need to consider?

@larmarange
Collaborator Author

Thanks, @ddsjoberg, for your feedback.

You are right: there are polytomous regressions (resulting in a y.level column) and multi-component models (with a component column). We must keep in mind that, in some cases, a user may need to group results differently (e.g. #1567, related to mixed models, where the user wanted to group according to the effect column).

My suggestions are as follows.

on broom.helpers side

New function tidy_group_by() and new option group_by in tidy_plus_plus().

This function will be in charge of preparing the results and, if needed, returning an additional group_by column in the results.

By default (group_by = "auto"), the function checks y.level first: if there is a y.level column with at least 2 different values, group_by is populated with this column. Otherwise, if there is a component column with at least 2 different values, group_by is populated with component. Otherwise, no grouping is performed. This should cover most standard use cases.

group_by = NULL to force no grouping.

group_by = "effect" (or another column name) indicating that we want to group according to a specific column returned by the tidier.

An option group_labels could be implemented to allow the user to rename the different groups.

Such an approach should be pretty flexible. It even allows the development of custom tidiers with other grouping options.
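The resolution rules proposed above can be sketched in base R as follows. This is only an illustration under stated assumptions: tidy_group_by() does not exist yet, so the signature, the shape of group_labels (a named character vector) and the component values in the example data are all hypothetical, not the broom.helpers implementation.

```r
# Hypothetical sketch of the proposed tidy_group_by(); not the broom.helpers API.
tidy_group_by <- function(x, group_by = "auto", group_labels = NULL) {
  # A column is usable for automatic grouping only if it exists
  # and holds at least 2 distinct values.
  has_groups <- function(col) {
    col %in% names(x) && length(unique(x[[col]])) >= 2
  }

  # group_by = NULL forces no grouping.
  if (is.null(group_by)) {
    return(x)
  }

  if (identical(group_by, "auto")) {
    if (has_groups("y.level")) {
      group_by <- "y.level"
    } else if (has_groups("component")) {
      group_by <- "component"
    } else {
      return(x) # nothing to group on
    }
  } else if (!group_by %in% names(x)) {
    stop("Column '", group_by, "' not found in the tidy table.")
  }

  groups <- as.character(x[[group_by]])
  if (!is.null(group_labels)) {
    # Assumed format: c(old_value = "New label")
    relabel <- groups %in% names(group_labels)
    groups[relabel] <- unname(group_labels[groups[relabel]])
  }
  x$group_by <- groups
  x
}

# Illustrative tidy table for a zero-inflated model (values are made up).
zi <- data.frame(
  component = c("conditional", "conditional", "zero_inflated", "zero_inflated"),
  term = c("(Intercept)", "age", "(Intercept)", "age"),
  estimate = c(1.2, 0.03, -0.8, 0.01),
  stringsAsFactors = FALSE
)

grouped <- tidy_group_by(zi, group_labels = c(zero_inflated = "Zero-inflation part"))
single <- tidy_group_by(zi[zi$component == "conditional", ])
```

In this sketch, `grouped` gains a group_by column, while `single` is returned unchanged because only one component value remains after filtering.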

on gtsummary side

No need to have methods according to the type of model. Simply look at the existence of a group_by column, and if there is a group_by column, simply add group headers (similarly to the variable group headers that you just implemented).
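As a rough illustration of this idea, the consumer side only needs to test for a group_by column and emit one section per group. The snippet below uses plain console output as a stand-in for gtsummary's rendering; the data and the header format are assumptions made up for the example.

```r
# Illustrative tidy table that already carries a group_by column.
tidy_tbl <- data.frame(
  group_by = c("conditional", "conditional", "zero_inflated"),
  term = c("age", "sex", "age"),
  estimate = c(0.03, 0.40, 0.01),
  stringsAsFactors = FALSE
)

if ("group_by" %in% names(tidy_tbl)) {
  # Keep groups in order of first appearance rather than alphabetical order.
  groups <- factor(tidy_tbl$group_by, levels = unique(tidy_tbl$group_by))
  sections <- split(tidy_tbl[, c("term", "estimate")], groups)
  for (header in names(sections)) {
    cat("== ", header, " ==\n", sep = "") # group header row
    print(sections[[header]], row.names = FALSE)
  }
}
```

No model-specific method is involved: the same code path would handle y.level, component or any user-chosen grouping column.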

It would be nice to have bold_groups() and italicize_groups() functions.

Do you think it would be possible to natively include a pivot_wider() function (inspired by https://stackoverflow.com/questions/64463878/multinomial-logistic-regression-results-table-in-wide-format-using-the-gtsummary )?
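To make the wide layout concrete, here is a minimal sketch of the reshaping step on a tidy multinomial result. It uses base R's reshape() as a dependency-free stand-in for a pivot_wider()-style helper; the outcome levels and estimates are invented for the example.

```r
# Illustrative long-format tidy table: one row per outcome level and term.
long <- data.frame(
  y.level = c("B", "B", "C", "C"),
  term = c("age", "sex", "age", "sex"),
  estimate = c(0.10, 0.20, 0.30, 0.40),
  stringsAsFactors = FALSE
)

# One row per term, one estimate column per outcome level.
wide <- reshape(long, idvar = "term", timevar = "y.level", direction = "wide")
rownames(wide) <- NULL
print(wide)
```

Each outcome level becomes its own column (here estimate.B and estimate.C), which is the side-by-side presentation the linked question asks for.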

What do you think?

NB: in parallel, related to #1684, we could contact the VGAM team to discuss whether they could introduce tidiers in their package.
