[DOP-22350] Add transformations for Transfers with dataframe column filtering #186

Merged 1 commit on Jan 21, 2025


IlyasDevelopment
Contributor

@IlyasDevelopment IlyasDevelopment commented Jan 20, 2025

Change Summary

  • Added a schema for DataframeColumnsFilter transformation:

    {
      "type": "dataframe_columns_filter",
      "filters": [
        {
          "type": "include",
          "field": "col1"
        },
        {
          "type": "rename",
          "field": "col2",
          "to": "new_col2"
        },
        {
          "type": "cast",
          "field": "col3",
          "as_type": "DATE"
        }
      ]
    }
  • Supported filter types:

    • include
    • rename
    • cast
  • Implemented logic in the worker to handle DataframeColumnsFilter transformations. If a filter is present, it is applied via DBReader(columns=['"col1"', '"col2" AS "new_col2"', 'CAST("col3" AS DATE) AS "col3"']) for DBHandler types and via df.selectExpr("...") for FileHandlers. The filters are converted into a comma-separated list of SQL-like column expressions, e.g.,

    "col1", "col2" AS "new_col2", CAST("col3" AS DATE) AS "col3"
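
The conversion described above can be sketched roughly as follows. This is only an illustration of the idea, not the code from this PR: the helper name `filters_to_expressions` and the error handling are assumptions, and the real worker may quote or validate identifiers differently.

```python
from typing import Any


def filters_to_expressions(filters: list[dict[str, Any]]) -> list[str]:
    """Turn column filters into SQL-like select expressions.

    Hypothetical helper for illustration; not the actual worker code.
    """
    expressions: list[str] = []
    for item in filters:
        kind = item["type"]
        # Quote the source column name the same way as in the PR example.
        column = '"' + item["field"] + '"'
        if kind == "include":
            expressions.append(column)
        elif kind == "rename":
            target = '"' + item["to"] + '"'
            expressions.append(f"{column} AS {target}")
        elif kind == "cast":
            # The cast keeps the original column name as the alias.
            expressions.append(f"CAST({column} AS {item['as_type']}) AS {column}")
        else:
            raise ValueError(f"Unsupported filter type: {kind}")
    return expressions


filters = [
    {"type": "include", "field": "col1"},
    {"type": "rename", "field": "col2", "to": "new_col2"},
    {"type": "cast", "field": "col3", "as_type": "DATE"},
]
expressions = filters_to_expressions(filters)
print(", ".join(expressions))
```

Under these assumptions, the resulting list could be passed as `DBReader(columns=expressions)` for DBHandler types or as `df.selectExpr(*expressions)` for FileHandlers.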

Checklist

  • Commit message and PR title are comprehensive
  • Keep the change as small as possible
  • Unit and integration tests for the changes exist
  • Tests pass on CI and coverage does not decrease
  • Documentation reflects the changes where applicable
  • docs/changelog/next_release/<pull request or issue id>.<change type>.rst file added describing change
    (see CONTRIBUTING.rst for details.)
  • My PR is ready to review.

@IlyasDevelopment IlyasDevelopment merged commit d83ad79 into develop Jan 21, 2025
18 checks passed
@IlyasDevelopment IlyasDevelopment deleted the feature/DOP-22350 branch January 21, 2025 13:44