feat: consolidate ibis.case(), Value.case(), Value.cases(), Value.substitute() #7280
Comments
+1 to consolidating
I don't think we should remove these. They are extremely familiar for users coming from SQL. We can likely consolidate all the various APIs under the
The builder pattern is probably going to become more used in the codebase, especially where there's state that needs to be tracked. For example, I'm working on a way to better support chaining joins that uses this pattern. We're also thinking about how this might help clean up the
Agree that we should document any of these that are user-facing.
I actually kinda like the idea of replacing the case builders with
Say we had a top-level

def cases(*branches: tuple[ir.BooleanValue, ir.Value], default=None):
    cases, results = zip(*branches)
    return ops.SearchedCase(cases, results, default).to_expr()

Compare the following examples:

t = ibis.table({"sym": "str", "left": "int", "right": "int"})

# Existing `ibis.case`
expr1 = t.mutate(
    result=(
        ibis.case()
        .when(_.sym == "+", _.left + _.right)
        .when(_.sym == "-", _.left - _.right)
        .when(_.sym == "*", _.left * _.right)
        .when(_.sym == "/", _.left / _.right)
        .else_(0)
        .end()
    )
)

# New `ibis.cases`
expr2 = t.mutate(
    result=ibis.cases(
        (_.sym == "+", _.left + _.right),
        (_.sym == "-", _.left - _.right),
        (_.sym == "*", _.left * _.right),
        (_.sym == "/", _.left / _.right),
        default=0,
    )
)

I personally like the
IMO I think we should deprecate the builders in favor of
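To make the semantics under discussion concrete, here is a minimal plain-Python sketch of a "searched case" with the same call shape as the proposed top-level function. It is not ibis code: `searched_case`, `Row`, and `calc` are hypothetical names, rows are ordinary dicts, and the only assumption is that branches are tried in order and the first matching one wins, as in SQL `CASE WHEN`.

```python
from typing import Any, Callable

# Hypothetical stand-ins: a "row" is a plain dict, and both the condition and
# the result of a branch are functions of that row.
Row = dict[str, Any]
Predicate = Callable[[Row], bool]
Result = Callable[[Row], Any]


def searched_case(
    *branches: tuple[Predicate, Result], default: Any = None
) -> Callable[[Row], Any]:
    """Return a function that evaluates the first matching branch for a row."""

    def evaluate(row: Row) -> Any:
        for predicate, result in branches:
            if predicate(row):
                return result(row)
        # No branch matched, so fall back to the default.
        return default

    return evaluate


# Same shape as the arithmetic example in the thread.
calc = searched_case(
    (lambda r: r["sym"] == "+", lambda r: r["left"] + r["right"]),
    (lambda r: r["sym"] == "-", lambda r: r["left"] - r["right"]),
    (lambda r: r["sym"] == "*", lambda r: r["left"] * r["right"]),
    (lambda r: r["sym"] == "/", lambda r: r["left"] / r["right"]),
    default=0,
)
```

The single call takes the place of the `.when(...).else_(...).end()` chain, which is the trade-off being weighed here.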
What do you both think about my proposal of supporting both boolean conditions and equality values? Passing single values already works with Value.cases:

t.x.cases([
    (t.y, "equal to y"),
    (t.z, "equal to z"),
])

vs the more precise/verbose

t.x.cases([
    (_ == t.y, "equal to y"),
    (_ == t.z, "equal to z"),
])

Should this feature get propagated, or should we deprecate and drop it? Similarly, what about passing in lambdas for the predicates? Similarly, what about supporting
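The question of accepting both equality values and predicates can be illustrated with a small plain-Python model. This is only a sketch under stated assumptions: `simple_case` is a hypothetical helper, not an ibis function, and it treats any callable key as a predicate on the value and anything else as an equality comparison.

```python
from typing import Any, Iterable


def simple_case(
    value: Any, branches: Iterable[tuple[Any, Any]], default: Any = None
) -> Any:
    """Return the result of the first branch whose key matches `value`.

    A callable key is treated as a predicate and called with `value`;
    any other key is compared to `value` with `==`.
    """
    for key, result in branches:
        matched = key(value) if callable(key) else value == key
        if matched:
            return result
    return default


x, y, z = 3, 3, 4

# Shorthand form: bare values mean "equal to".
short = simple_case(x, [(y, "equal to y"), (z, "equal to z")])

# Precise/verbose form: explicit predicates.
verbose = simple_case(
    x,
    [(lambda v: v == y, "equal to y"), (lambda v: v == z, "equal to z")],
)
```

Both forms give the same answer here. The open design question is whether the shorthand is worth the extra dispatch rule.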
I think that makes sense for
Seems fine to me, although given how good deferred expressions have gotten I'm tempted to avoid introducing lambda support into new apis. 🤷 could go either way here.
Also seems fine. Only argument I'd have to avoid it is it's a bit weird to type/document/support variadic

def cases(branches: Mapping[Any, Any] | Iterable[tuple[Any, Any]], *, default: Any = None):
    ...

Again, no strong thoughts 🤷.
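A signature that takes either a mapping or an iterable of pairs needs a normalization step before the branches are used. The following is a hedged sketch of what that step might look like in plain Python. `normalize_branches` is a hypothetical helper name, and the choice to raise `ValueError` on malformed branches is an assumption rather than anything decided in this thread.

```python
from collections.abc import Iterable, Mapping
from typing import Any, Union


def normalize_branches(
    branches: Union[Mapping[Any, Any], Iterable[tuple[Any, Any]]],
) -> list[tuple[Any, Any]]:
    """Turn either accepted input shape into an ordered list of (key, result) pairs."""
    if isinstance(branches, Mapping):
        # Mappings preserve insertion order, so branch order is kept.
        return list(branches.items())

    pairs = []
    for branch in branches:
        pair = tuple(branch)
        if len(pair) != 2:
            raise ValueError(f"expected a (key, result) pair, got {pair!r}")
        pairs.append(pair)
    return pairs


from_mapping = normalize_branches({"a": 1, "b": 2})
from_pairs = normalize_branches([("a", 1), ("b", 2)])
```

One consequence of accepting a mapping is that keys must be hashable and unique, which the list-of-pairs form does not require.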
Ok thanks for pointing that out, I felt there was something asymmetrical there but I couldn't put my finger on it. Annoying can't-have-it-all here: I would really like both flavors to have the exact same API, but at the same time I also want it to be a drop-in replacement for
I had this vague intuition there was some madness here, but I haven't been able to come up with the actual case. Do you have an example that you want to avoid?
I think this can get added as a followup without breaking anything, so maybe we leave it out for now.
Oh I didn't think of a variadic
will
Yes, any work done here will have a deprecation period.
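As a rough illustration of what a deprecation period can look like in practice, the sketch below keeps an old entry point alive as a thin wrapper that warns and forwards to the new one. Both function names are hypothetical and the bodies are plain Python. This is not how ibis implements it.

```python
import warnings
from typing import Any


def cases(*branches: tuple[bool, Any], default: Any = None) -> Any:
    """Hypothetical new API: return the result of the first true condition."""
    for condition, result in branches:
        if condition:
            return result
    return default


def case_legacy(condition: bool, result: Any, default: Any = None) -> Any:
    """Hypothetical old API, kept working during the deprecation period."""
    warnings.warn(
        "case_legacy() is deprecated; use cases() instead",
        DeprecationWarning,
        stacklevel=2,  # point the warning at the caller, not at this wrapper
    )
    return cases((condition, result), default=default)
```

Existing callers keep working unchanged and see a warning that names the replacement.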
This was finally addressed in #9096. If there are further changes desired, please start a new issue instead of reopening this one (since the "current behavior" is now different from what it was when I originally opened this issue).
Is your feature request related to a problem?
All 4 functions are basically doing the same thing. It feels to me like we should be able to merge them all into one function with the same API. It is confusing to users (including me!) which they should use, and it is an extra burden keeping the docstrings etc. all correct.
Also, a long-running grossness of .case() is that it returns a CaseBuilder object, which is undocumented. I don't think we want to add this CaseBuilder object to the public API; I think it would be totally doable and much simpler if we could keep it to a single function call.
Describe the solution you'd like
I propose:
(not sure about default vs else_, and if mapping is the best name)
where the keys can be
"foo" or 5, value.upper()
value and evaluated eg _.upper() == _
value and return one of the above eg lambda val: val.upper() == "FOO" or lambda val: val.upper()
The values can be
_.lower()
value and return one of the above eg lambda val: val.lower()
I'm trying to think if there is some ambiguity if we accept both Values and BooleanValues similar to the ambiguity with selection vs filtering in t[_.x.isnull()], but I can't think of it.
I want to support the usecase of chained comparing one column to another:
(t.x == t.y).ifelse("same", (t.x > t.y).ifelse("greater", "less"))
I think this would be done with
t.x.case({lambda x: x == t.y: "same", lambda x: x > t.y: "greater"}, default="less")
but this feels a little gross. Makes me think either a top-level or a table-based API would be better:
t.cases({_.x == _.y: "same", _.x > _.y: "greater"}, default="less")
ibis.cases(t, {_.x == _.y: "same", _.x > _.y: "greater"}, default="less")
I would want to make this
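The equivalence between the chained ifelse form and a mapping of conditions can be checked with a small plain-Python model. This sketch assumes first-match-wins evaluation in the mapping's insertion order. `nested`, `cases_from_mapping`, and `table_driven` are hypothetical names, and rows are plain dicts rather than ibis tables.

```python
from typing import Any, Callable, Mapping

Row = dict[str, Any]


def nested(x: Any, y: Any) -> str:
    """Chained conditional, mirroring the nested ifelse expression."""
    return "same" if x == y else ("greater" if x > y else "less")


def cases_from_mapping(
    row: Row, mapping: Mapping[Callable[[Row], bool], Any], default: Any = None
) -> Any:
    """Evaluate predicates in insertion order; the first true one wins."""
    for predicate, result in mapping.items():
        if predicate(row):
            return result
    return default


# Functions are hashable, so they can serve as dict keys like the
# deferred expressions do in the examples above.
branches = {
    (lambda r: r["x"] == r["y"]): "same",
    (lambda r: r["x"] > r["y"]): "greater",
}


def table_driven(x: Any, y: Any) -> str:
    return cases_from_mapping({"x": x, "y": y}, branches, default="less")
```

Because the predicates take the whole row, neither column is privileged, which is the argument for a top-level or table-based entry point over a method on a single column.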
What version of ibis are you running?
main
What backend(s) are you using, if any?
all