Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Series.hist #1859

Open
wants to merge 7 commits into
base: main
Choose a base branch
from
Open

feat: Series.hist #1859

wants to merge 7 commits into from

Conversation

camriddell
Copy link
Contributor

What type of PR is this? (check all applicable)

  • πŸ’Ύ Refactor
  • ✨ Feature
  • πŸ› Bug Fix
  • πŸ”§ Optimization
  • πŸ“ Documentation
  • βœ… Test
  • 🐳 Other

Related issues

Checklist

  • Code follows style guide (ruff)
  • Tests added
  • Documented the changes

If you have comments or can explain your changes, please do so below

Narwhals Expressions do not yet allow one to return a DataFrame (or a struct), so .hist is only implemented at the Series level to be compliant with the rest of the API.

The PyArrow implementation can likely be streamlined a bit more, so a review on that section would be appreciated.

@MarcoGorelli
Copy link
Member

wow, amazing!

.hist is only implemented at the Series level to be compliant with the rest of the API.

agree, good design decision here, I think it would be quite awkward to add Expr.hist because:

  • struct dtype is not supported by default in pandas (and not at all in pandas pre 2.0. Maybe 1.5. But not before that)
  • it's a length-changing expression which it doesn't make sense to aggregate (e.g. nw.col('a').unique().len() makes sense but nw.col('a').hist().struct.field('value').sum() doesn't seem useful..), so supporting this for pyspark / duckdb / ibis could be a real issue. Ibis does seem to have bucket but there's no examples and I have no idea what it does

Hi @mscolnick - just wanted to check that Series.hist would still be useful to you?

Copy link
Member

@FBruzzesi FBruzzesi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a ton @camriddell ! I left a couple of comments and will go through the arrow implementation in more details later today πŸš€

narwhals/_pandas_like/series.py Outdated Show resolved Hide resolved
narwhals/_arrow/series.py Outdated Show resolved Hide resolved
@FBruzzesi FBruzzesi added the enhancement New feature or request label Jan 24, 2025
@mscolnick
Copy link
Contributor

awesome work @camriddell! yes, @MarcoGorelli this is definitely valuable for us (marimo) and would use it right away, thank you all for the efforts.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants