Allex-Nik
Collaborator

Fixes #1245

Swapping the is AnyCol check with the is AnyFrame check avoids applying dataFrameOf() to a ColumnGroup object when convertToDataFrame() in KotlinNotebookPluginUtils is called.

In that case, a standalone ColumnGroup object is rendered without the group header. This might look a bit unusual, but it makes column access more intuitive: from the code we can only access columns a and b, not letters. With the existing approach we would see the group letters but would not be able to access it.
Rendering of a column group
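The effect of the branch order can be sketched with stand-in types. This is a minimal illustration, not the actual convertToDataFrame() code: Frame, Col, Group, headerOldOrder and headerNewOrder are hypothetical names, and the only assumption carried over from the library is that a column group is both a column and a dataframe.

```kotlin
// Hypothetical stand-ins for AnyFrame / AnyCol / ColumnGroup (not the real API).
interface Frame { val columnNames: List<String> }
interface Col { val name: String }

// A group is both a column (it has a name) and a frame (it has child columns).
class Group(
    override val name: String,
    override val columnNames: List<String>,
) : Col, Frame

// Old order: the Col branch matches first, so the group is wrapped as a single column.
fun headerOldOrder(value: Any): List<String> = when (value) {
    is Col -> listOf(value.name)
    is Frame -> value.columnNames
    else -> emptyList()
}

// New order: the Frame branch matches first, so the group's own columns are used.
fun headerNewOrder(value: Any): List<String> = when (value) {
    is Frame -> value.columnNames
    is Col -> listOf(value.name)
    else -> emptyList()
}

fun main() {
    val letters = Group("letters", listOf("a", "b"))
    println(headerOldOrder(letters)) // [letters]
    println(headerNewOrder(letters)) // [a, b]
}
```

With the new order, the top-level header matches what is reachable from code (a and b), which is the behaviour described above.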

Casting with asDataFrame() also leaves it rendered without the group:
Rendering of a column group with cast to dataframe

At the same time, if we convert this ColumnGroup object to a dataframe (with the function public fun AnyBaseCol.toDataFrame(): AnyFrame = dataFrameOf(listOf(this)), which is already present in the library), the group is rendered in the dataframe:
Rendering of a dataframe with a column group

This seems to be the correct result: in all cases the rendering of the columns matches the way we access them from the code (so the structure of the data is reflected correctly), and column groups inside a dataframe are actually shown as groups, so the user is not misled.

Rendering of two dataframes combined

…e() to align rendering with column access and make it conform to the structure of data
@Jolanrensen
Collaborator

Jolanrensen commented Sep 29, 2025

Hmm, while the results are a bit more consistent, it's not exactly how I imagined it to look, and there can still be different results depending on whether you use toDataFrame() or asDataFrame(), which can be confusing.

What I'd expect when working with a column group:
image

and when working with a DataFrame:
image

wdyt?

This might require some extra checks, as instance checks alone are not enough to figure out the declared type.
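A minimal sketch of why that is, using hypothetical stand-in types (Frame, Col, Group and asFrame are illustrative names, not the library's API). It assumes the cast is a plain upcast that returns the same instance:

```kotlin
// Hypothetical stand-ins; the real library types are not used here.
interface Frame
interface Col { val name: String }
class Group(override val name: String) : Col, Frame

// Assumed to behave like a pure upcast: same object, only the declared type changes.
fun Group.asFrame(): Frame = this

fun main() {
    val group = Group("letters")
    val frame: Frame = group.asFrame()

    // Same instance, so any runtime check gives the same answer for both references.
    println(frame === group) // true
    println(frame is Col)    // true
}
```

Because the declared type Frame exists only at compile time, a renderer that receives the value cannot tell from is checks whether the caller meant "column group" or "dataframe".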

@koperagen
Collaborator

> What I'd expect when working with a column group:

For df.columnGroup, displaying the columnGroup header would be redundant because we refer to the columns as group.a, group.b, not group.columnGroup.a. So there's consistency between the rendered structure and the actual structure, and ColumnGroup fulfills its DataFrame contract. I'd go with this fix.

@Jolanrensen
Collaborator

Jolanrensen commented Sep 29, 2025

> For df.columnGroup, displaying the columnGroup header would be redundant because we refer to the columns as group.a, group.b, not group.columnGroup.a. So there's consistency between the rendered structure and the actual structure, and ColumnGroup fulfills its DataFrame contract. I'd go with this fix.

I agree with that, but the difference between a column group and a dataframe is that the former can have a name. That is not displayed anywhere in the output currently, but it's still valuable information. So while it could render like a dataframe, the name of the column group should still be visible, I think.

I'm not sure whether we have such an ability in the current table output, but at least, rendering it as a DF with a single column group achieves this goal.

Successfully merging this pull request may close these issues.

ColumnGroup.asDataFrame(), still gets rendered as ColumnGroup in notebook
3 participants