Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(low-code): add DpathFlattenFields #227

Merged
merged 5 commits into from
Jan 20, 2025

Conversation

darynaishchenko
Copy link
Contributor

@darynaishchenko darynaishchenko commented Jan 17, 2025

What

source-airtable need record transformations that moves everything inside "fields" field to the top level of records and keeps others top level fields.
We already have FlattenFields transformation, but it transforms also objects inside "fields", records in airtable should not follow this format. Everything inside "fields" should be kept as is.

How.

Added DpathFlattenFields, which works with provided path and updates record with object found in the path, doesn't change fields inside the object.

Summary by CodeRabbit

  • New Features

    • Introduced a new DpathFlattenFields transformation for flattening nested fields in records.
    • Supports dynamic field path selection and optional deletion of original values.
    • Enhances data manipulation capabilities for declarative sources.
  • Tests

    • Added comprehensive unit tests for the DpathFlattenFields transformation.
    • Verified functionality across various scenarios, including handling of wildcards and non-existing fields.

@darynaishchenko darynaishchenko self-assigned this Jan 17, 2025
@github-actions github-actions bot added the enhancement New feature or request label Jan 17, 2025
@darynaishchenko
Copy link
Contributor Author

darynaishchenko commented Jan 17, 2025

/autofix

Auto-Fix Job Info

This job attempts to auto-fix any linting or formating issues. If any fixes are made,
those changes will be automatically committed and pushed back to the PR.

Note: This job can only be run by maintainers. On PRs from forks, this command requires
that the PR author has enabled the Allow edits from maintainers option.

PR auto-fix job started... Check job output.

✅ Changes applied successfully.

Copy link
Contributor

coderabbitai bot commented Jan 17, 2025

📝 Walkthrough

Walkthrough

This pull request introduces a new transformation called DpathFlattenFields in the Airbyte CDK declarative components. This transformation allows for the flattening of nested field values to the top level of a record based on a specified field path, with an option to delete the original value. The changes encompass multiple files, including schema definitions, model classes, a factory method, the transformation implementation, and corresponding unit tests.

Changes

File Change Summary
airbyte_cdk/sources/declarative/declarative_component_schema.yaml Added DpathFlattenFields transformation definition; updated transformations in DeclarativeStream and DynamicSchemaLoader to include it.
airbyte_cdk/sources/declarative/models/declarative_component_schema.py Added DpathFlattenFields class with field_path, delete_origin_value, and parameters attributes; updated DeclarativeStream to include DpathFlattenFields in transformations.
airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py Added create_dpath_flatten_fields method to support new transformation and updated mapping.
airbyte_cdk/sources/declarative/transformations/dpath_flatten_fields.py Implemented DpathFlattenFields transformation logic with transform method.
unit_tests/sources/declarative/transformations/test_dpath_flatten_fields.py Added comprehensive test cases for the new transformation.

Sequence Diagram

sequenceDiagram
    participant Record
    participant DpathFlattenFields
    participant Config
    
    Record->>DpathFlattenFields: Provide record
    DpathFlattenFields->>Config: Evaluate field path
    Config-->>DpathFlattenFields: Return path values
    DpathFlattenFields->>Record: Flatten fields
    alt delete_origin_value is true
        DpathFlattenFields->>Record: Remove original field
    end
Loading

Possibly related PRs

Suggested reviewers

  • maxi297
  • aldogonzalez8
  • lazebnyi

Hey there! 👋 Do you think there are any specific use cases for the DpathFlattenFields transformation that we should highlight in the documentation? Wdyt? 🤔

✨ Finishing Touches
  • 📝 Generate Docstrings (Beta)

Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

‼️ IMPORTANT
Auto-reply has been disabled for this repository in the CodeRabbit settings. The CodeRabbit bot will not respond to your replies unless it is explicitly tagged.

  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR. (Beta)
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🔭 Outside diff range comments (1)
airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (1)

Line range hint 1-465: The import block needs to be sorted according to the style guide.

Similar to the first file, the imports need to be reorganized. Would you like me to help with that? wdyt?

🧹 Nitpick comments (7)
airbyte_cdk/sources/declarative/declarative_component_schema.yaml (2)

1974-1977: The transformation description could be more detailed.

The current description "A transformation that flatten field values to the to top of the record" could be enhanced to better explain the behavior and use cases. What do you think about expanding it to something like:

"A transformation that moves field values from a specified path to the top level of the record, while preserving other top-level fields. This is useful when you need to flatten nested data structures but want to maintain specific top-level fields intact."?


1988-1991: Consider simplifying the array items type definition.

The field_path property defines array items as items: [{ type: string }] which is an unusual syntax. Would it be clearer to use the more common form?

      items:
-         - type: string
+         type: string
airbyte_cdk/sources/declarative/transformations/dpath_flatten_fields.py (3)

13-19: Would you like to enhance the docstring for better clarity? 📝

The current docstring is a bit brief. What do you think about expanding it to include more details? Here's a suggestion:

"""
Flattens fields from a specified path in the record to the top level.

Args:
    config (Config): The connector configuration
    field_path (List[Union[InterpolatedString, str]]): Path to the field to flatten. 
        Can include wildcards (*) for matching multiple fields.
    parameters (Mapping[str, Any]): Additional parameters for string interpolation
    delete_origin_value (bool, optional): Whether to delete the original field after flattening.
        Defaults to False.

Example:
    For a record like:
    {
        "id": 1,
        "data": {
            "name": "test",
            "value": 42
        }
    }
    
    Using field_path=["data"] will result in:
    {
        "id": 1,
        "data": {"name": "test", "value": 42},  # Only if delete_origin_value=False
        "name": "test",
        "value": 42
    }
"""

26-34: How about simplifying this initialization? 🤔

The current implementation has some duplicate logic for converting paths to InterpolatedString. Would this simpler version work for you?

 def __post_init__(self, parameters: Mapping[str, Any]) -> None:
-    self._field_path = [
-        InterpolatedString.create(path, parameters=parameters) for path in self.field_path
-    ]
-    for path_index in range(len(self.field_path)):
-        if isinstance(self.field_path[path_index], str):
-            self._field_path[path_index] = InterpolatedString.create(
-                self.field_path[path_index], parameters=parameters
-            )
+    self._field_path = [
+        path if isinstance(path, InterpolatedString)
+        else InterpolatedString.create(path, parameters=parameters)
+        for path in self.field_path
+    ]

36-54: A few suggestions to make this transform method even more robust! 💪

  1. Would you consider adding a return type hint? Even though it modifies in-place, it helps with clarity:
 def transform(
     self,
     record: Dict[str, Any],
     config: Optional[Config] = None,
     stream_state: Optional[StreamState] = None,
     stream_slice: Optional[StreamSlice] = None,
-) -> None:
+) -> Dict[str, Any]:  # Shows it returns the modified record
  1. What about adding some error handling for dpath operations? They might raise exceptions:
try:
    if "*" in path:
        matched = dpath.values(record, path)
        extracted = matched[0] if matched else None
    else:
        extracted = dpath.get(record, path, default=[])
except (TypeError, ValueError) as e:
    logger.warning(f"Failed to extract value at path {path}: {e}")
    return record
  1. Consider documenting the in-place modification behavior in the docstring? It's important for users to know the record is modified directly.

What do you think? 🤔

unit_tests/sources/declarative/transformations/test_dpath_flatten_fields.py (2)

5-7: Quick suggestion about the test constants! ✨

What do you think about making these constants more descriptive and type-hinted?

-_ANY_VALUE = -1
+_DUMMY_VALUE: int = -1  # Used as a placeholder in test records
-_DELETE_ORIGIN_VALUE = True
-_DO_NOT_DELETE_ORIGIN_VALUE = False
+_DELETE_ORIGIN_VALUE: bool = True
+_KEEP_ORIGIN_VALUE: bool = False  # More descriptive than "DO_NOT_DELETE"

10-94: Great test coverage! Mind if we add a few more edge cases? 🎯

The current test suite looks solid! Would you consider adding these scenarios to make it even more robust?

# Invalid path that might raise exceptions
pytest.param(
    {"field1": {"field2": [1, 2, 3]}},
    {},
    ["field1", "field2", 0],  # Invalid path with array index
    _KEEP_ORIGIN_VALUE,
    {"field1": {"field2": [1, 2, 3]}},  # Should remain unchanged
    id="handle invalid path gracefully",
),

# Empty dictionary at path
pytest.param(
    {"field1": {"field2": {}}},
    {},
    ["field1", "field2"],
    _KEEP_ORIGIN_VALUE,
    {"field1": {"field2": {}}},  # Should handle empty dict
    id="handle empty dictionary at path",
),

# Nested arrays
pytest.param(
    {"field1": {"field2": [{"a": 1}, {"b": 2}]}},
    {},
    ["field1", "field2", "*"],
    _KEEP_ORIGIN_VALUE,
    {
        "field1": {"field2": [{"a": 1}, {"b": 2}]},
        "a": 1,
        "b": 2
    },
    id="flatten nested arrays with objects",
),

What do you think about these additions? They would help ensure the transformation handles edge cases gracefully! 🤔

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b5ed82c and 1067a85.

📒 Files selected for processing (5)
  • airbyte_cdk/sources/declarative/declarative_component_schema.yaml (3 hunks)
  • airbyte_cdk/sources/declarative/models/declarative_component_schema.py (3 hunks)
  • airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (4 hunks)
  • airbyte_cdk/sources/declarative/transformations/dpath_flatten_fields.py (1 hunks)
  • unit_tests/sources/declarative/transformations/test_dpath_flatten_fields.py (1 hunks)
🧰 Additional context used
🪛 GitHub Actions: Linters
airbyte_cdk/sources/declarative/transformations/dpath_flatten_fields.py

[error] 1-11: Import block is un-sorted or un-formatted. The imports need to be organized according to the style guide.

airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py

[error] 5-465: Import block is un-sorted or un-formatted. The imports need to be organized according to the style guide.

⏰ Context from checks skipped due to timeout of 90000ms (8)
  • GitHub Check: Check: 'source-pokeapi' (skip=false)
  • GitHub Check: Check: 'source-the-guardian-api' (skip=false)
  • GitHub Check: Check: 'source-shopify' (skip=false)
  • GitHub Check: Check: 'source-hardcoded-records' (skip=false)
  • GitHub Check: Analyze (python)
  • GitHub Check: Pytest (All, Python 3.11, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.10, Ubuntu)
  • GitHub Check: Pytest (Fast)
🔇 Additional comments (3)
airbyte_cdk/sources/declarative/models/declarative_component_schema.py (1)

795-812: Well-structured schema definition!

The DpathFlattenFields model is well-documented with clear examples and proper field definitions.

airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (1)

682-693: Clean factory implementation!

The DpathFlattenFields factory method is well-implemented with proper parameter handling and default values.

airbyte_cdk/sources/declarative/declarative_component_schema.yaml (1)

1319-1319: LGTM! The transformation is properly integrated.

The DpathFlattenFields transformation is correctly added to both the transformations and schema_transformations arrays, maintaining consistency with other transformation types.

Also applies to: 1869-1869

@darynaishchenko
Copy link
Contributor Author

darynaishchenko commented Jan 17, 2025

/autofix

Auto-Fix Job Info

This job attempts to auto-fix any linting or formating issues. If any fixes are made,
those changes will be automatically committed and pushed back to the PR.

Note: This job can only be run by maintainers. On PRs from forks, this command requires
that the PR author has enabled the Allow edits from maintainers option.

PR auto-fix job started... Check job output.

✅ Changes applied successfully.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (1)

682-694: Consider validating field_path contents?

The field_path list comprehension assumes all elements are valid. Should we add validation to ensure each element is either a string or InterpolatedString? This would help catch configuration errors early. wdyt?

 def create_dpath_flatten_fields(
     self, model: DpathFlattenFieldsModel, config: Config, **kwargs: Any
 ) -> DpathFlattenFields:
-    model_field_path: List[Union[InterpolatedString, str]] = [x for x in model.field_path]
+    model_field_path: List[Union[InterpolatedString, str]] = []
+    for x in model.field_path:
+        if not isinstance(x, (str, InterpolatedString)):
+            raise ValueError(f"Invalid field_path element: {x}. Expected string or InterpolatedString.")
+        model_field_path.append(x)
     return DpathFlattenFields(
         config=config,
         field_path=model_field_path,
         delete_origin_value=model.delete_origin_value
         if model.delete_origin_value is not None
         else False,
         parameters=model.parameters or {},
     )
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ee2abc3 and 979e67a.

📒 Files selected for processing (2)
  • airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (4 hunks)
  • airbyte_cdk/sources/declarative/transformations/dpath_flatten_fields.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • airbyte_cdk/sources/declarative/transformations/dpath_flatten_fields.py
⏰ Context from checks skipped due to timeout of 90000ms (8)
  • GitHub Check: Check: 'source-pokeapi' (skip=false)
  • GitHub Check: Check: 'source-the-guardian-api' (skip=false)
  • GitHub Check: Check: 'source-shopify' (skip=false)
  • GitHub Check: Check: 'source-hardcoded-records' (skip=false)
  • GitHub Check: Analyze (python)
  • GitHub Check: Pytest (All, Python 3.11, Ubuntu)
  • GitHub Check: Pytest (Fast)
  • GitHub Check: Pytest (All, Python 3.10, Ubuntu)
🔇 Additional comments (2)
airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (2)

213-215: LGTM! Import follows the established pattern.

The import statement for DpathFlattenFields model is properly grouped with other model imports from the same module.


547-547: LGTM! Factory mapping is consistent with other entries.

The mapping of DpathFlattenFieldsModel to its factory method follows the established pattern.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (2)
airbyte_cdk/sources/declarative/transformations/dpath_flatten_fields.py (2)

13-19: How about enhancing the docstring to be more descriptive? 🤔

The current docstring could better explain the transformation's purpose and behavior. What do you think about something like this:

-    """
-    Flatten fields only for provided path.
-
-    field_path: List[Union[InterpolatedString, str]] path to the field to flatten.
-    delete_origin_value: bool = False whether to delete origin field or keep it. Default is False.
-
-    """
+    """
+    A transformation that flattens nested fields at specified paths by moving them to the record's top level.
+
+    For example, if a record has {'meta': {'field1': 'value1'}}, using path 'meta' would result in {'field1': 'value1'}.
+
+    Args:
+        field_path: List of paths to the fields that should be flattened. Supports both static strings and interpolated values.
+        delete_origin_value: If True, removes the original nested field after flattening. Defaults to False.
+
+    Note:
+        Paths can contain wildcards (*) for matching multiple fields.
+    """

wdyt? This would make it clearer how the transformation works! 🎯


26-35: Could we simplify the post-initialization logic? 🤔

The current implementation has duplicate conversion logic. The list comprehension already converts all paths to InterpolatedString, making the subsequent loop unnecessary. How about simplifying it like this:

     def __post_init__(self, parameters: Mapping[str, Any]) -> None:
-        self._field_path = [
-            InterpolatedString.create(path, parameters=parameters) for path in self.field_path
-        ]
-        for path_index in range(len(self.field_path)):
-            if isinstance(self.field_path[path_index], str):
-                self._field_path[path_index] = InterpolatedString.create(
-                    self.field_path[path_index], parameters=parameters
-                )
+        self._field_path = [InterpolatedString.create(path, parameters=parameters) for path in self.field_path]

This achieves the same result more concisely. What are your thoughts? 🎯

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 979e67a and 2494e51.

📒 Files selected for processing (1)
  • airbyte_cdk/sources/declarative/transformations/dpath_flatten_fields.py (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (8)
  • GitHub Check: Check: 'source-pokeapi' (skip=false)
  • GitHub Check: Check: 'source-the-guardian-api' (skip=false)
  • GitHub Check: Check: 'source-shopify' (skip=false)
  • GitHub Check: Check: 'source-hardcoded-records' (skip=false)
  • GitHub Check: Pytest (All, Python 3.11, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.10, Ubuntu)
  • GitHub Check: Pytest (Fast)
  • GitHub Check: Analyze (python)

Copy link
Contributor

@natikgadzhi natikgadzhi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good!

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (3)
unit_tests/sources/declarative/transformations/test_dpath_flatten_fields.py (3)

10-102: Great test coverage! Would you consider adding a few more edge cases? 🤔

The test cases are well-structured and cover many scenarios. What do you think about adding tests for:

  1. Empty objects: {"field2": {}}
  2. Empty arrays: {"field2": []}
  3. Nested arrays: {"field2": {"field3": [{"field4": 1}]}}

This would help ensure the transformation handles these edge cases gracefully, wdyt?


103-110: Would you consider preserving the original input for better debugging? 🤔

Currently, we modify the input record directly. How about:

 def test_dpath_flatten_lists(
     input_record, config, field_path, delete_origin_value, expected_record
 ):
+    # Create a copy to preserve the original input
+    record = input_record.copy()
     flattener = DpathFlattenFields(
         field_path=field_path, parameters={}, config=config, delete_origin_value=delete_origin_value
     )
-    flattener.transform(input_record)
-    assert input_record == expected_record
+    flattener.transform(record)
+    assert record == expected_record

This would help with debugging by preserving the original input, wdyt?


103-105: Would you like to add a docstring to explain the test function? 📝

How about adding a docstring to explain what the test function verifies? Something like:

def test_dpath_flatten_lists(
    input_record, config, field_path, delete_origin_value, expected_record
):
    """
    Test the DpathFlattenFields transformation with various input records and configurations.
    
    Args:
        input_record: The record to transform
        config: Configuration dictionary for the transformation
        field_path: Path to the field to flatten
        delete_origin_value: Whether to delete the original field
        expected_record: Expected transformed record
    """

This would help other developers understand the purpose of the test, wdyt?

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2494e51 and f73cbcb.

📒 Files selected for processing (2)
  • airbyte_cdk/sources/declarative/transformations/dpath_flatten_fields.py (1 hunks)
  • unit_tests/sources/declarative/transformations/test_dpath_flatten_fields.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • airbyte_cdk/sources/declarative/transformations/dpath_flatten_fields.py
⏰ Context from checks skipped due to timeout of 90000ms (8)
  • GitHub Check: Check: 'source-pokeapi' (skip=false)
  • GitHub Check: Check: 'source-the-guardian-api' (skip=false)
  • GitHub Check: Check: 'source-shopify' (skip=false)
  • GitHub Check: Check: 'source-hardcoded-records' (skip=false)
  • GitHub Check: Pytest (All, Python 3.11, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.10, Ubuntu)
  • GitHub Check: Pytest (Fast)
  • GitHub Check: Analyze (python)
🔇 Additional comments (1)
unit_tests/sources/declarative/transformations/test_dpath_flatten_fields.py (1)

1-8: Nice work on the imports and constants! 👍

Clean and well-organized setup with meaningful constant names that improve readability.

Copy link
Contributor

@lazebnyi lazebnyi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm okay with these changes to avoid slowing down the migration. However, I believe it would be better to add transformation_level to FlattenFields to control the depth to which a record should be flattened.

@darynaishchenko darynaishchenko merged commit 17dd71f into main Jan 20, 2025
23 checks passed
@darynaishchenko darynaishchenko deleted the daryna/low-code/add-dpath-flatten-fields branch January 20, 2025 15:19
rpopov pushed a commit to rpopov/airbyte-python-cdk that referenced this pull request Jan 21, 2025
* main:
  feat(low-code): add DpathFlattenFields (airbytehq#227)
rpopov added a commit to rpopov/airbyte-python-cdk that referenced this pull request Jan 26, 2025
Created a DPath Enhancing Extractor
Refactored the record enhancement logic - moved to the extracted class
Split the tests of DPathExtractor and DPathEnhancingExtractor

Fix the failing tests:

FAILED unit_tests/sources/declarative/parsers/test_model_to_component_factory.py::test_create_custom_components[test_create_custom_component_with_subcomponent_that_uses_parameters]
FAILED unit_tests/sources/declarative/parsers/test_model_to_component_factory.py::test_custom_components_do_not_contain_extra_fields
FAILED unit_tests/sources/declarative/parsers/test_model_to_component_factory.py::test_parse_custom_component_fields_if_subcomponent
FAILED unit_tests/sources/declarative/parsers/test_model_to_component_factory.py::test_create_page_increment
FAILED unit_tests/sources/declarative/parsers/test_model_to_component_factory.py::test_create_offset_increment
FAILED unit_tests/sources/file_based/test_file_based_scenarios.py::test_file_based_read[simple_unstructured_scenario]
FAILED unit_tests/sources/file_based/test_file_based_scenarios.py::test_file_based_read[no_file_extension_unstructured_scenario]

They faile because of comparing string and int values of the page_size (public) attribute.
Imposed an invariant:
  on construction, page_size can be set to a string or int
  keep only values of one type in page_size for uniform comparison (convert the values of the other type)
  _page_size holds the internal / working value
... unless manipulated directly.

Merged:
feat(low-code concurrent): Allow async job low-code streams that are incremental to be run by the concurrent framework (airbytehq#228)
fix(low-code): Fix declarative low-code state migration in SubstreamPartitionRouter (airbytehq#267)
feat: combine slash command jobs into single job steps (airbytehq#266)
feat(low-code): add items and property mappings to dynamic schemas (airbytehq#256)
feat: add help response for unrecognized slash commands (airbytehq#264)
ci: post direct links to html connector test reports (airbytehq#252) (airbytehq#263)
fix(low-code): Fix legacy state migration in SubstreamPartitionRouter (airbytehq#261)
fix(airbyte-cdk): Fix RequestOptionsProvider for PerPartitionWithGlobalCursor (airbytehq#254)
feat(low-code): add profile assertion flow to oauth authenticator component (airbytehq#236)
feat(Low-Code Concurrent CDK): Add ConcurrentPerPartitionCursor (airbytehq#111)
fix: don't mypy unit_tests (airbytehq#241)
fix: handle backoff_strategies in CompositeErrorHandler (airbytehq#225)
feat(concurrent cursor): attempt at clamping datetime (airbytehq#234)
fix(airbyte-cdk): Fix RequestOptionsProvider for PerPartitionWithGlobalCursor (airbytehq#254)
feat(low-code): add profile assertion flow to oauth authenticator component (airbytehq#236)
feat(Low-Code Concurrent CDK): Add ConcurrentPerPartitionCursor (airbytehq#111)
fix: don't mypy unit_tests (airbytehq#241)
fix: handle backoff_strategies in CompositeErrorHandler (airbytehq#225)
feat(concurrent cursor): attempt at clamping datetime (airbytehq#234)
ci: use `ubuntu-24.04` explicitly (resolves CI warnings) (airbytehq#244)
Fix(sdm): module ref issue in python components import (airbytehq#243)
feat(source-declarative-manifest): add support for custom Python components from dynamic text input (airbytehq#174)
chore(deps): bump avro from 1.11.3 to 1.12.0 (airbytehq#133)
docs: comments on what the `Dockerfile` is for (airbytehq#240)
chore: move ruff configuration to dedicated ruff.toml file (airbytehq#237)
Fix(sdm): module ref issue in python components import (airbytehq#243)
feat(low-code): add DpathFlattenFields (airbytehq#227)
feat(source-declarative-manifest): add support for custom Python components from dynamic text input (airbytehq#174)
chore(deps): bump avro from 1.11.3 to 1.12.0 (airbytehq#133)
docs: comments on what the `Dockerfile` is for (airbytehq#240)
chore: move ruff configuration to dedicated ruff.toml file (airbytehq#237)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants