[Enh]: implement nw.expr.interpolate_by #1739

Open
limlam96 opened this issue Jan 6, 2025 · 2 comments
Labels
enhancement New feature or request

Comments

@limlam96

limlam96 commented Jan 6, 2025

We would like to learn about your use case. For example, if this feature is needed to adopt Narwhals in an open source project, could you please enter the link to it below?

https://github.com/lvgig/tubular

Please describe the purpose of the new feature or describe the problem to solve.

We currently use np.interp in our weighted_quantile calculation (here) and are looking at how to convert this to Narwhals. I think the most direct route would be a nw version of pl.Expr.interpolate_by. Does this fit into your roadmap at all?
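
For context, here is a minimal sketch of the kind of np.interp-based weighted quantile being described. It is not the tubular implementation: the function name, its signature, and the choice to sort by the values are assumptions made for illustration.

```python
import numpy as np


def weighted_quantile(values, quantiles, sample_weight):
    """Hypothetical helper: interpolate `values` at the given weight quantiles."""
    values = np.asarray(values, dtype=float)
    sample_weight = np.asarray(sample_weight, dtype=float)

    # Sort values (and their weights) so the cumulative weight is increasing
    # alongside the values. Sorting by value is an assumption of this sketch.
    order = np.argsort(values)
    values = values[order]
    sample_weight = sample_weight[order]

    # Normalised cumulative weight: the x-coordinates for the interpolation.
    cum_weight = np.cumsum(sample_weight) / np.sum(sample_weight)

    # np.interp(x, xp, fp) evaluates the piecewise-linear curve through
    # (xp, fp) at each x, clamping to the end values outside the range of xp.
    return np.interp(quantiles, cum_weight, values).tolist()


result = weighted_quantile([3, 1, 4, 2], [0.5, 0.625], [1, 1, 1, 1])
```

The np.interp call is the only step here without an obvious dataframe equivalent, which is why interpolate_by is the piece being asked for.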

Suggest a solution if possible.

a nw version of pl.Expr.interpolate_by

If you have tried alternatives, please describe them below.

No response

Additional information that may help us understand your needs.

No response

@MarcoGorelli
Member

thanks @limlam96 for the request

yes, i'd say that this is in-scope

does interpolate_by completely cover your needs?

@limlam96
Author

limlam96 commented Jan 8, 2025

Nice :)

I think it would, yeah. The rough idea is: given a df with a weight column and some numeric column, we want to interpolate the numeric column at some given weight quantiles:

  • sort the df by weight, and add in a cumulative weight column
  • insert rows with the wanted quantiles into this cumulative weight column, and re-sort by cumulative weight
  • run interpolate_by on the numeric column and cumulative weight column

I think that, apart from interpolate_by, the rest should be achievable in Narwhals

Hope that was clear - and also open to any alternative suggestions!
