feat: `SparkLikeNamespace` methods #1779

FBruzzesi · 2025-01-10T09:38:38Z

What type of PR is this? (check all applicable)

Related issues

Related issue [Enh]: Spark Expr missing methods #1714

Checklist

Code follows style guide (ruff)
Tests added
Documented the changes

If you have comments or can explain your changes, please do so below

Introduces:

Expr.is_null
Expr.n_unique
Namespace.len
Namespace.any_horizontal
Namespace.mean_horizontal
Namespace.min_horizontal
Namespace.max_horizontal
Namespace.concat
Namespace.concat_str

FBruzzesi · 2025-01-10T09:39:15Z

tests/utils.py

+    sorted_indices = sorted(
+        range(len(sort_list)), key=lambda i: (sort_list[i] is None, sort_list[i])
+    )


Otherwise a list containing `None` values would simply fail to sort.
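For readers unfamiliar with the trick, here is a minimal standalone sketch of the same key function. The helper name `sort_indices_nones_last` and the sample data are made up for illustration and are not part of `tests/utils.py`. The tuple key compares the `is None` flag first, so a `None` is never ordered against a real value.

```python
def sort_indices_nones_last(sort_list):
    """Return the indices that sort `sort_list`, placing None entries last.

    Hypothetical helper for illustration only; it mirrors the key used in the diff.
    """
    return sorted(
        range(len(sort_list)),
        # (False, value) sorts before (True, None), so the flag decides first
        # and Python never has to evaluate `None < value`.
        key=lambda i: (sort_list[i] is None, sort_list[i]),
    )


values = [3, None, 1, None, 2]
indices = sort_indices_nones_last(values)
print(indices)                       # [2, 4, 0, 1, 3]
print([values[i] for i in indices])  # [1, 2, 3, None, None]

# Without the flag, a plain sort raises TypeError on mixed None/int input.
try:
    sorted(values)
except TypeError as exc:
    print(f"plain sort fails: {exc}")
```

Because Python's sort is stable, multiple `None` entries keep their original relative order at the end.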

FBruzzesi · 2025-01-10T09:39:56Z

narwhals/_spark_like/expr.py

+        from pyspark.sql.types import IntegerType
+
+        def _n_unique(_input: Column) -> Column:
+            return F.count_distinct(_input) + F.max(F.isnull(_input).cast(IntegerType()))


Heavily inspired by the DuckDB implementation 😂😉
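The arithmetic behind that expression can be sketched in plain Python, without a Spark session. This assumes, as in Spark, that the distinct count ignores nulls; the second term then adds 1 when at least one null is present, so that null counts as one extra unique value. The function name `n_unique_with_nulls` is hypothetical.

```python
def n_unique_with_nulls(values):
    """Count distinct values, treating None as one additional unique value.

    Illustrative sketch of the logic only, not the Spark implementation.
    """
    # Analogue of F.count_distinct: nulls are skipped.
    distinct_non_null = len({v for v in values if v is not None})
    # Analogue of F.max(F.isnull(col).cast(IntegerType())): 1 if any null, else 0.
    # `default=0` is a choice made for this sketch so empty input returns 0.
    has_null = max((int(v is None) for v in values), default=0)
    return distinct_non_null + has_null


print(n_unique_with_nulls([1, 1, None, 2, None]))  # 3
print(n_unique_with_nulls([1, 2, 2]))              # 2
```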

FBruzzesi · 2025-01-10T09:41:37Z

narwhals/_spark_like/group_by.py

-                expr._function_name, expr._function_name
-            )
-            agg_func = get_spark_function(function_name, **expr._kwargs)
+            agg_func = get_spark_function(expr._function_name, **expr._kwargs)


I could not get `nw.len()` to work in the group_by context.

FBruzzesi · 2025-01-10T10:52:21Z

@MarcoGorelli the scikit-lego issue is definitely unrelated

FBruzzesi added 3 commits January 9, 2025 18:55

feat: spark like namespace functions

4eaa78d

horizontal and concat_str

b6160e8

Merge branch 'main' into feat/pyspark-namespace

82718f7

n_unique returns_scalar=True

0e1f345

FBruzzesi added 2 commits January 10, 2025 10:52

rollback sort test

790d829

avoid nw.all() with reductions

476db55

FBruzzesi added the enhancement New feature or request label Jan 10, 2025
