-
Notifications
You must be signed in to change notification settings - Fork 6
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix(airbyte-cdk): unable to create custom retriever #198
base: main
Are you sure you want to change the base?
Conversation
📝 WalkthroughWalkthroughThe pull request introduces modifications to the Changes
Possibly related PRs
Suggested labels
Suggested reviewers
Wdyt about the related PRs? Do you think they could provide additional context for your changes? 📜 Recent review detailsConfiguration used: CodeRabbit UI 📒 Files selected for processing (1)
🚧 Files skipped from review as they are similar to previous changes (1)
⏰ Context from checks skipped due to timeout of 90000ms (8)
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (2)
airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (1)
1899-1899
: Double-check optional transformations parameter.Making
transformations
optional simplifies usage. Nice addition! Would you consider adding a docstring clarifying the default empty list? wdyt?🧰 Tools
🪛 GitHub Actions: Linters
[warning] File needs formatting using Ruff formatter
unit_tests/sources/declarative/parsers/test_model_to_component_factory.py (1)
2631-2633
: Add a docstring or comment for the custom retriever?Your
MyCustomRetriever
class inheritsSimpleRetriever
but has no docstring. Would you add a quick description for clarity? wdyt?🧰 Tools
🪛 GitHub Actions: Linters
[warning] File needs formatting using Ruff formatter
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py
(3 hunks)unit_tests/sources/declarative/parsers/test_model_to_component_factory.py
(1 hunks)
🧰 Additional context used
🪛 GitHub Actions: Linters
unit_tests/sources/declarative/parsers/test_model_to_component_factory.py
[warning] File needs formatting using Ruff formatter
airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py
[warning] File needs formatting using Ruff formatter
⏰ Context from checks skipped due to timeout of 90000ms (7)
- GitHub Check: Check: 'source-pokeapi' (skip=false)
- GitHub Check: Check: 'source-the-guardian-api' (skip=false)
- GitHub Check: Check: 'source-shopify' (skip=false)
- GitHub Check: Check: 'source-hardcoded-records' (skip=false)
- GitHub Check: Pytest (All, Python 3.11, Ubuntu)
- GitHub Check: Pytest (All, Python 3.10, Ubuntu)
- GitHub Check: Pytest (Fast)
🔇 Additional comments (3)
airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (2)
1515-1515
: Consider verifying default decoder usage.Setting
decoder=JsonDecoder(parameters={})
as the default is handy. However, if a user wants no decoding, do you plan to support passingNone
or a different decoder? wdyt?🧰 Tools
🪛 GitHub Actions: Linters
[warning] File needs formatting using Ruff formatter
1931-1931
: Good fallback to empty list for transformations.This effectively prevents errors from passing
None
. Looks great to me!🧰 Tools
🪛 GitHub Actions: Linters
[warning] File needs formatting using Ruff formatter
unit_tests/sources/declarative/parsers/test_model_to_component_factory.py (1)
2635-2665
: Solid test for custom retriever creation.This covers instantiating a custom retriever in a
DeclarativeStream
. Everything looks in order. Great job!🧰 Tools
🪛 GitHub Actions: Linters
[warning] File needs formatting using Ruff formatter
/format-fix |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (2)
unit_tests/sources/declarative/parsers/test_model_to_component_factory.py (2)
2637-2639
: Consider adding a docstring or comment to describe the purpose of this subclass.
Even though it's a simple subclass, it might help future developers if you provide a quick explanation of this custom retriever. wdyt?
2641-2671
: Add an explicit assertion to ensureMyCustomRetriever
is properly tested.
Would you consider adding a small test that verifies any overridden methods or states if added in the future? For example, by checking the retriever’s internal attributes or behavior? wdyt?
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py
(3 hunks)unit_tests/sources/declarative/parsers/test_model_to_component_factory.py
(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py
⏰ Context from checks skipped due to timeout of 90000ms (5)
- GitHub Check: Check: 'source-the-guardian-api' (skip=false)
- GitHub Check: Check: 'source-shopify' (skip=false)
- GitHub Check: Check: 'source-hardcoded-records' (skip=false)
- GitHub Check: Pytest (All, Python 3.11, Ubuntu)
- GitHub Check: Pytest (All, Python 3.10, Ubuntu)
airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py
Outdated
Show resolved
Hide resolved
… from create_record_selector method
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If all other tests are green, I want to unblock this! Asked a bunch of non-blocking questions, but they are, again, non-blocking.
self, | ||
model: HttpRequesterModel, | ||
config: Config, | ||
decoder: Decoder = JsonDecoder(parameters={}), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Help me with my shitty Python knowledge. The signature change does two things here:
- Add default value for
decoder
— this is additive and will fix situations where decoder was not passed in - The order of named arguments changed. Is this important, or did we always pass them named and it's compatible?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The order of named arguments changed. Is this important, or did we always pass them named and it's compatible?
This is important but we assume model_to_component_factory
is quite mostly private to the CDK hence I'm fine with this breaking change
decoder: Optional[Decoder] = None, | ||
client_side_incremental_sync: Optional[Dict[str, Any]] = None, | ||
transformations: List[RecordTransformation] | None = None, | ||
decoder: Decoder | None = None, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Intentional that it defaults to None here, but JSONDecoder upstairs?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting point. Default To None seems to work fine in a quick test, so maybe JsonDecoder(parameters={}) is unnecessary. Let me dig in a little bit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh yes, now I remember. The HttpRequester expects a decoder.
airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py:1619: error: Argument "decoder" to "HttpRequester" has incompatible type "Decoder | None"; expected "Decoder" [arg-type]
But we could create one if None for consistency with the arguments I other methods.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In a second thought, it would be a little bit redundant, so I think I will put it back as it was so:
Intentional that it defaults to None here, but JSONDecoder upstairs?
Yes, The HttpRequester expects a decoder and mypy yells for that.
@@ -2008,7 +2013,7 @@ def create_record_selector( | |||
name=name, | |||
config=config, | |||
record_filter=record_filter, | |||
transformations=transformations, | |||
transformations=transformations or [], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Non-blocking: Is it intentional that transformations default to None
on argument list of the method, but to []
here? Should we just say transormations: List[RecordTransformation] = []
in the args list, or is it parsed to None
explicitly?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lint was very annoyed about using a mutable object as the default, and I thought it was safer to pass [] as the default factory does here for RecordSelector.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
source-linkedinads has a custom retriever. Was it broken? Is there anything to do with this one?
self, | ||
model: HttpRequesterModel, | ||
config: Config, | ||
decoder: Decoder = JsonDecoder(parameters={}), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The order of named arguments changed. Is this important, or did we always pass them named and it's compatible?
This is important but we assume model_to_component_factory
is quite mostly private to the CDK hence I'm fine with this breaking change
Summary by CodeRabbit
Summary by CodeRabbit
New Features
Improvements
create_http_requester
.create_record_selector
.What?
I was encountering errors like the following when creating a Custom Retriever:
From what I can see, decoder can be optional here can is default to JsonDecoder here.
For transformations also we set it as optional here.