Skip to content

Add explicit classes for structured and unstructured formula variants. #222

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Dec 1, 2024

Conversation

matthewwardrop
Copy link
Owner

@matthewwardrop matthewwardrop commented Nov 30, 2024

This patch adds explicit classes for the structured and unstructured formula variants. I was on the fence about pulling the trigger on this, but the issues identified in #213 make the increase in local complexity worth it.

This is a largely backward compatible change, unless code is hard-coded to expect structured instance methods like ._map and _has_structure. I doubt anyone is using them, but just in case, before merging I will add in backward compatible shims for these methods that raise the appropriate deprecation warnings; and ultimately we'll remove these shims in version 2.0 (whenever that happens). Users need never intentionally interact with the new SimpleFormula and StructuredFormula classes.

As a result of these changes, code like:

Formula(lhs=Formula(["y"]), rhs=Formula("1 + z:x + x + y + g:h", _ordering="none"))

will retain the ordering of the nested formulae rather than reordering them; due to the new distinction between List[Term] (which is still reordered according to the local ordering policy) and SimpleFormula which is responsible for its own ordering.

TODO:

  • Deprecated shims for Structured methods and attributes on the new SimpleFormula class.
  • Unit tests
  • Documentation

closes: #213

@matthewwardrop
Copy link
Owner Author

@bashtage I'll clean this up in over the next few days, but let me know if this breaks anything in statsmodels. Even if it doesn't, I plan to add the backwards compatible shims with deprecation warnings (one of which is already present: .root).

@matthewwardrop matthewwardrop force-pushed the improve_formula_nesting branch from 203497f to 6fa58f2 Compare November 30, 2024 07:21
@matthewwardrop matthewwardrop changed the title Allow sentinel values to be used as types rather than instances. Add explicit classes for structured and unstructured formula variants. Nov 30, 2024
@matthewwardrop matthewwardrop force-pushed the improve_formula_nesting branch 3 times, most recently from b6e3be6 to d9e564f Compare December 1, 2024 06:25
@matthewwardrop matthewwardrop force-pushed the improve_formula_nesting branch from d9e564f to 11d7180 Compare December 1, 2024 06:48
@matthewwardrop matthewwardrop merged commit 1f77812 into main Dec 1, 2024
7 checks passed
@matthewwardrop matthewwardrop deleted the improve_formula_nesting branch December 1, 2024 06:52
@bashtage
Copy link
Contributor

bashtage commented Dec 1, 2024

Thanks for getting this in. I just need to finalize cubic splines and then I think statsmodels will be able to pass tests against main. It already is passing on main + cubic + ordering. I'll let you know if there are any issues after I pull this one in instead of my ordering patch.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants