Is your feature request related to a problem? Please describe.
In some cases, it's advantageous not to hold an entire dataframe in memory at once. This matters most when indexing, but it also applies to large sets of queries.
Describe the solution you'd like
Two versions of transform(): one that takes a dataframe and returns a dataframe (maybe transform_df()?), and one that takes an iter[dict] and returns an iter[dict] (maybe transform_iter()?).
The main transform() function can then inspect its input and dispatch to whichever of the two matches.
Transformers will be required to implement at least one of these. Each can be derived from the other through a default implementation. In rare cases, one might want to implement optimized versions of both.
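A minimal sketch of how this could look, assuming pandas dataframes and plain dicts as rows. The `Transformer` base class, the `Row` alias, and the `Lowercase` and `AddLength` subclasses are hypothetical names used for illustration only, not the existing API; the override check is just one possible way to avoid the two defaults calling each other forever.

```python
from typing import Any, Dict, Iterable, Iterator, Union

import pandas as pd

Row = Dict[str, Any]  # hypothetical alias for one record


class Transformer:
    """Hypothetical base class: subclasses override transform_df or transform_iter."""

    def transform(self, inp: Union[pd.DataFrame, Iterable[Row]]):
        # Dispatch on the input type, as proposed above.
        if isinstance(inp, pd.DataFrame):
            return self.transform_df(inp)
        return self.transform_iter(inp)

    def transform_df(self, df: pd.DataFrame) -> pd.DataFrame:
        # Default: convert to rows and reuse the subclass's transform_iter.
        if type(self).transform_iter is Transformer.transform_iter:
            raise NotImplementedError("override transform_df or transform_iter")
        rows = df.to_dict(orient="records")
        return pd.DataFrame(list(self.transform_iter(rows)))

    def transform_iter(self, rows: Iterable[Row]) -> Iterator[Row]:
        # Default: build a dataframe and reuse the subclass's transform_df.
        # This materialises every row, so it brings no memory benefit.
        if type(self).transform_df is Transformer.transform_df:
            raise NotImplementedError("override transform_df or transform_iter")
        df = self.transform_df(pd.DataFrame(list(rows)))
        return iter(df.to_dict(orient="records"))


class Lowercase(Transformer):
    """Implements only the streaming variant."""

    def transform_iter(self, rows):
        for row in rows:
            yield {**row, "query": row["query"].lower()}


class AddLength(Transformer):
    """Implements only the dataframe variant."""

    def transform_df(self, df):
        return df.assign(length=df["query"].str.len())
```

With this shape, `Lowercase().transform(...)` accepts either a dataframe or an iterable of dicts and returns the matching type. Only the transformers that implement `transform_iter` natively actually stream; the fallback from `transform_df` still builds the full dataframe.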
Describe alternatives you've considered
transform_iter currently takes an iterable, but returns a dataframe.
transform_gen currently takes a dataframe and returns an iterable of dataframes.
Additional context
We'll need to consider how this interacts with existing transformers and other features. Ideally, the solution would not affect them.