
two versions of transform #232

Closed
seanmacavaney opened this issue Sep 29, 2021 · 1 comment
Labels
enhancement New feature or request

Comments

@seanmacavaney
Collaborator

Is your feature request related to a problem? Please describe.

In some cases, it's advantageous not to have an entire dataframe in memory at once. This is especially the case when indexing, but can also happen for large sets of queries.

Describe the solution you'd like

Two versions of transform(): one that takes a dataframe and returns a dataframe (maybe transform_df()?), and one that takes an iter[dict] and returns an iter[dict] (maybe transform_iter()?).

The main transform() function can, based on the input, decide what function to send the input to.

Transformers will be required to implement at least one of these. Each can be converted to the other with a default implementation. In rare cases, one might want to implement optimized versions of both.
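A rough sketch of how the dispatch and the default conversions could fit together. This is only an illustration of the proposal, not the existing PyTerrier API: `SketchTransformer`, `UppercaseQuery` and `AddQueryLength` are hypothetical names, and the `query` column is assumed for the example.

```python
from typing import Any, Dict, Iterable, Iterator, Union

import pandas as pd

Row = Dict[str, Any]


class SketchTransformer:
    """Hypothetical base class; subclasses override at least one method."""

    def transform(self, inp: Union[pd.DataFrame, Iterable[Row]]):
        # Dispatch on the input type, as proposed above.
        if isinstance(inp, pd.DataFrame):
            return self.transform_df(inp)
        return self.transform_iter(inp)

    def transform_df(self, df: pd.DataFrame) -> pd.DataFrame:
        # Default: fall back to transform_iter, unless it was not overridden
        # either (which would otherwise recurse forever).
        if type(self).transform_iter is SketchTransformer.transform_iter:
            raise NotImplementedError("override transform_df or transform_iter")
        rows = df.to_dict(orient="records")
        return pd.DataFrame(list(self.transform_iter(rows)))

    def transform_iter(self, rows: Iterable[Row]) -> Iterator[Row]:
        # Default: fall back to transform_df. Note that this materialises
        # the whole input, so it gives no memory benefit by itself.
        if type(self).transform_df is SketchTransformer.transform_df:
            raise NotImplementedError("override transform_df or transform_iter")
        out = self.transform_df(pd.DataFrame(list(rows)))
        return iter(out.to_dict(orient="records"))


class UppercaseQuery(SketchTransformer):
    """Implements only the streaming version."""

    def transform_iter(self, rows: Iterable[Row]) -> Iterator[Row]:
        for row in rows:
            yield {**row, "query": row["query"].upper()}


class AddQueryLength(SketchTransformer):
    """Implements only the dataframe version."""

    def transform_df(self, df: pd.DataFrame) -> pd.DataFrame:
        return df.assign(query_len=df["query"].str.len())
```

With this, `UppercaseQuery().transform(df)` returns a dataframe and `AddQueryLength().transform(iter(rows))` returns an iterator of dicts, even though each class implements only one of the two methods. A transformer that really needs to stream would implement transform_iter directly, since the default conversion from transform_df loads everything into memory.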

Describe alternatives you've considered

  • transform_iter currently takes an iterable but returns a dataframe.
  • transform_gen currently takes a dataframe and returns an iterable of dataframes.

Additional context
We'll need to consider how this interacts with existing transformers and other features. Ideally, the solution would not affect them.

@seanmacavaney seanmacavaney added the enhancement New feature or request label Sep 29, 2021
@cmacdonald
Contributor

This is resolved by #481
