@cBournhonesque cBournhonesque commented Nov 21, 2025

PR that closes #19133
Adopted from #20019

nimit and others added 30 commits November 27, 2024 02:47
… single element and ddof=1 and there are nulls elsewhere in the Series (pola-rs#20077)
…dding a 'strict' keyword argument to concat_df(how='horizontal')
…lementation and added a more robust set of tests on concat
Change-Id: Icd5fb4d566af250b49f25834c931f8d7377e7515
Charles Bournhonesque added 4 commits November 21, 2025 13:53
Change-Id: Ie5f9bd91b671ae5f7e31e228f7b5a246d9168ab3
Change-Id: Iddb25d6798666abc0b9374b071b369f3e92cde39
Change-Id: I32f869c406cc0b34386c7c22aceea12495655eb5
Change-Id: I377b4efd92e0fd1a25639bdc723ceaeaea86113e
@cBournhonesque cBournhonesque changed the title from "Strict concat 19133" to "feat: Add strict parameter to pl.concat(how='horizontal')" Nov 21, 2025
@github-actions github-actions bot added the enhancement (New feature or an improvement of an existing feature), python (Related to Python Polars), and rust (Related to Rust Polars) labels and removed the title needs formatting label Nov 21, 2025
Charles Bournhonesque added 4 commits November 21, 2025 16:39
Change-Id: I9298a909b7f6eaac6893cecdbdd3be3286fd26f1
Change-Id: I74bdd5d39ec41fa5b4fe3371d62cbeac7b34ee62
Change-Id: If50f99a0b18a3a604620a40623a8323cb52d4e78
Change-Id: I5206f843ef735230ef1c2d37f0627a4e74b54b48
@github-actions github-actions bot added the changes-dsl (Do not merge if this label is present and red) label Nov 21, 2025
@cBournhonesque
Author

@orlp I followed the suggestion in #20019 (comment)
please let me know if that works!

codecov bot commented Nov 21, 2025

Codecov Report

❌ Patch coverage is 84.78261% with 7 lines in your changes missing coverage. Please review.
✅ Project coverage is 82.16%. Comparing base (76758c7) to head (9a99383).
⚠️ Report is 10 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| crates/polars-stream/src/nodes/zip.rs | 75.00% | 5 Missing ⚠️ |
| ...ates/polars-plan/src/plans/ir/visualization/mod.rs | 0.00% | 2 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             main   #25452      +/-   ##
==========================================
- Coverage   82.16%   82.16%   -0.01%     
==========================================
  Files        1728     1727       -1     
  Lines      240097   240195      +98     
  Branches     3028     3031       +3     
==========================================
+ Hits       197277   197353      +76     
- Misses      42040    42061      +21     
- Partials      780      781       +1     

}
- let out_df = concat_df_horizontal(&out, false)?;
+ let strict_concat = matches!(self.extend_behavior, ExtendBehavior::Raise);
+ let out_df = concat_df_horizontal(&out, false, strict_concat)?;
Member

Same here.

Author

I don't understand why that is true.
If we have ExtendBehavior::FillNulls, couldn't we have one input head that has data with a different height than the others, in which case we would want to concat without strict?
I don't see where that happens in the code either.

Member

If we are in this branch, all input heads are broadcast inputs, meaning they have length 1.
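To make that argument concrete, here is a minimal pure-Python sketch. It is not Polars code: the `ExtendBehavior` enum below is a hypothetical Python mirror of the two Rust variants named in this thread, and `validate_heights` is an illustrative helper, not a function from the PR. It only shows that when every input has height 1, the strict flag cannot change the outcome.

```python
from enum import Enum, auto


class ExtendBehavior(Enum):
    # Hypothetical mirror of the Rust variants mentioned in this thread.
    RAISE = auto()
    FILL_NULLS = auto()


def validate_heights(heights, behavior):
    """Return the output height; raise only if strict and heights differ."""
    # Mirrors `matches!(self.extend_behavior, ExtendBehavior::Raise)`.
    strict = behavior is ExtendBehavior.RAISE
    out_height = max(heights, default=0)
    if strict and any(h != out_height for h in heights):
        raise ValueError(f"height mismatch: {heights}")
    return out_height


# In the branch under discussion, every input head is a broadcast input.
broadcast_heights = [1, 1, 1]
outcomes = {b.name: validate_heights(broadcast_heights, b) for b in ExtendBehavior}
print(outcomes)  # {'RAISE': 1, 'FILL_NULLS': 1}
```

Under these assumptions, both behaviors yield the same height, so the value passed for strictness is irrelevant in that branch.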

Only relevant for LazyFrames. This determines if the concatenated
lazy computations may be executed in parallel.
strict
When how=`horizontal`, require all DataFrames to be the same height
Member

Missing period. Can you also emphasize that this is only to disable broadcasting? Different heights are never allowed anyway, only for broadcasting:

When how=`horizontal`, require all DataFrames to be the same height, raising an error instead of broadcasting unit height DataFrames.

Author

Are different heights never allowed?
In this example I have two non-unit dataframes being concatenated together: #25263 (comment)

Member

@orlp orlp Nov 25, 2025

No, they're never allowed.

EDIT: my bad, you're right. Please disregard this. I'm so deep in the implementation side of things, where this is never allowed, that I forgot this is the API by which users can request the ExtendWithNulls behavior.
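For readers following along, the behavior settled on here can be sketched with a toy model. This is plain Python over dict-of-lists "frames", not the Polars implementation. The function name `concat_horizontal` and its error messages are assumptions made for illustration, and `strict` is deliberately a required keyword so the sketch makes no claim about the real default.

```python
def concat_horizontal(frames, *, strict):
    """Toy horizontal concat: pad shorter frames with None unless strict."""
    # Height of a frame is the length of any of its columns (0 if it has none).
    heights = [len(next(iter(f.values()), [])) for f in frames]
    out_height = max(heights, default=0)
    if strict and any(h != out_height for h in heights):
        raise ValueError(f"frames have different heights: {heights}")
    out = {}
    for frame, h in zip(frames, heights):
        for name, values in frame.items():
            if name in out:
                raise ValueError(f"duplicate column name: {name!r}")
            # Non-strict mode extends shorter columns with nulls (None).
            out[name] = list(values) + [None] * (out_height - h)
    return out


left = {"a": [1, 2, 3]}
right = {"b": ["x", "y"]}

padded = concat_horizontal([left, right], strict=False)
print(padded)  # {'a': [1, 2, 3], 'b': ['x', 'y', None]}

strict_error = None
try:
    concat_horizontal([left, right], strict=True)
except ValueError as err:
    strict_error = str(err)
print(strict_error)  # frames have different heights: [3, 2]
```

In this model, two non-unit frames of different heights are accepted and null-extended when `strict=False`, which matches the case raised in the comment above, and rejected when `strict=True`.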

Charles Bournhonesque added 2 commits November 24, 2025 12:21
Change-Id: I0934aa1fdabe12b6a333f0c97ec0958d0a545175
Change-Id: Id71f67f073bcf3d1dd3d26f62654b50ec40ae673
Member

orlp commented Nov 25, 2025

Can you address the final comment and then rebase on main? Either a squash or a cherry-pick is fine, whichever you prefer, but right now there are a lot of extra, irrelevant commits in this branch.

Labels

changes-dsl Do not merge if this label is present and red enhancement New feature or an improvement of an existing feature python Related to Python Polars rust Related to Rust Polars

Development

Successfully merging this pull request may close these issues.

pl.concat(how='horizontal') should be strict by default

9 participants