feat: stabilize `Table` class #979
This will be restored later, when we finalize the API to work with time series.
🦙 MegaLinter status: ✅ SUCCESS
See detailed report in MegaLinter reports
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## main #979 +/- ##
==========================================
+ Coverage 94.42% 94.99% +0.57%
==========================================
Files 121 123 +2
Lines 7459 7696 +237
==========================================
+ Hits 7043 7311 +268
+ Misses 416 385 -31
☔ View full report in Codecov by Sentry.
Closes #875
Closes #877
Closes partially #977
Summary of Changes
Stabilize the API of the `Table` class. This PR introduces several breaking changes to this class:
- The `data` parameter of `__init__` is now required.
- `remove_columns_except` is renamed to `select_columns`.
- `add_table_as_columns` is renamed to `add_tables_as_columns`.
- `add_table_as_rows` is renamed to `add_tables_as_rows`.
It also adds new functionality throughout the library:
- `Table.add_index_column` to add a new column with auto-incrementing integer values to a table.
- `Table.filter_rows` to keep only the rows matched by some predicate.
- `Table.filter_rows_by_column` to keep only the rows that have a value in a specific column that matches some predicate.
- `random_seed` for `Table.shuffle_rows` and `Table.split_rows` to control the pseudorandom number generator. Previously, the methods were deterministic, but the seed was hidden.
- `missing_value_ratio_threshold` of `Table.remove_columns_with_missing_values` to be able to keep columns with only a few missing values.
- `ColumnType` to instantiate column types. This prepares for Overwrite specified schema (selectively) #754.

Finally, the methods `Table.summarize_statistics` and `Column.summarize_statistics` are now considerably faster.
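
To make the intended semantics of the new functionality easier to review, here is a rough, library-independent sketch in plain Python. It is not the code from this PR: a "table" is modelled as a dict of equal-length lists, and every function below is a hypothetical stand-in whose name only mirrors the method it illustrates. The position of the index column, the default `random_seed`, and the `<=` comparison against `missing_value_ratio_threshold` are assumptions made for the example.

```python
import random


def row_count(table):
    """Number of rows in a dict-of-lists table (0 if it has no columns)."""
    return len(next(iter(table.values()), []))


def add_index_column(table, name, first_index=0):
    """Stand-in: prepend a column with auto-incrementing integers."""
    indices = list(range(first_index, first_index + row_count(table)))
    return {name: indices, **table}


def filter_rows(table, predicate):
    """Stand-in: keep only the rows for which predicate(row) is true."""
    names = list(table)
    rows = [dict(zip(names, values)) for values in zip(*table.values())]
    kept = [row for row in rows if predicate(row)]
    return {name: [row[name] for row in kept] for name in names}


def shuffle_rows(table, random_seed=0):
    """Stand-in: shuffle rows with an explicit seed, so results are reproducible."""
    order = list(range(row_count(table)))
    # A private Random instance keeps the global generator untouched.
    random.Random(random_seed).shuffle(order)
    return {name: [values[i] for i in order] for name, values in table.items()}


def remove_columns_with_missing_values(table, missing_value_ratio_threshold=0.0):
    """Stand-in: drop columns whose share of None values exceeds the threshold."""
    return {
        name: values
        for name, values in table.items()
        if values.count(None) / max(len(values), 1) <= missing_value_ratio_threshold
    }


table = {"a": [1, 2, 3, 4], "b": [None, 5, 6, 7], "c": [None, None, None, 8]}

indexed = add_index_column(table, "id")
evens = filter_rows(table, lambda row: row["a"] % 2 == 0)
shuffled = shuffle_rows(table, random_seed=42)
strict = remove_columns_with_missing_values(table)
lenient = remove_columns_with_missing_values(table, missing_value_ratio_threshold=0.25)
```

In this sketch, calling `shuffle_rows` twice with the same `random_seed` gives the same row order, which is the reproducibility the exposed seed is meant to provide. With a threshold of `0.25`, column `b` (one missing value in four) is kept while column `c` is still dropped.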