Service: Make federation logic more explicit#4091
Service: Make federation logic more explicit#4091flyrain wants to merge 3 commits intoapache:mainfrom
Conversation
| return realmConfig().getConfig(LIST_PAGINATION_ENABLED, getResolvedCatalogEntity()); | ||
| } | ||
|
|
||
| public ListNamespacesResponse listNamespaces( |
There was a problem hiding this comment.
Move it down to group related methods together
There was a problem hiding this comment.
nit: moving code is fine, but it's outside the declared scope of this PR (per PR description) and makes the "real diff" harder to deduce 🤷
There was a problem hiding this comment.
What's the suggested action item?
singhpk234
left a comment
There was a problem hiding this comment.
LGTM overall ! Thanks @flyrain !
| // Indicates whether the catalog is a non-federated (Polaris-managed) catalog. | ||
| // Non-federated catalogs support additional capabilities such as pagination, | ||
| // location transformation, credential vending, and multi-table transactions. | ||
| private boolean isNonFederated = false; |
There was a problem hiding this comment.
How about we have isFederated a flag ?
There was a problem hiding this comment.
or isManaged ? to have same semantics
There was a problem hiding this comment.
isFederated was one of my versions. I changed to isNonFederated as there are many !isFederated. I can change it back if you feel strongly about it.
There was a problem hiding this comment.
Option 1 : how about isManaged ? which means catalogs managed by Polaris, this can replace nonFederated
Option 2: also i wonder what if we filp the neg and positive cases
if (isFederated) {
// existing
} else {
// existing
}
wdyt ? isNon went hand in was a bit hard to reason, though i don't strongly feel about it, let me know what do you think or option #1 or option #2
There was a problem hiding this comment.
Good idea, implemented option 2.
singhpk234
left a comment
There was a problem hiding this comment.
LGTM, thanks @flyrain !
|
the test failure seems unrelated. |
dimas-b
left a comment
There was a problem hiding this comment.
Sorry to push back, but I hope we can improve this code a lot more with an OO approach (since we're touching it)... more specific comments below.
| return realmConfig().getConfig(LIST_PAGINATION_ENABLED, getResolvedCatalogEntity()); | ||
| } | ||
|
|
||
| public ListNamespacesResponse listNamespaces( |
There was a problem hiding this comment.
nit: moving code is fine, but it's outside the declared scope of this PR (per PR description) and makes the "real diff" harder to deduce 🤷
| return catalogHandlerUtils().listNamespaces(namespaceCatalog, parent, pageToken, pageSize); | ||
| } else { | ||
| PageToken pageRequest = PageToken.build(pageToken, pageSize, this::shouldDecodeToken); | ||
| var results = ((IcebergCatalog) baseCatalog).listNamespaces(parent, pageRequest); |
There was a problem hiding this comment.
I would not call this "more explicit logic". IMHO, the logic becomes less explicit because instead of using a checked cast (line 239 before) we now implicitly assume that non-federated catalogs are instances of IcebergCatalog
There was a problem hiding this comment.
we now implicitly assume that non-federated catalogs are instances of IcebergCatalog
I don't understand this part. Why would we assume that now?
There was a problem hiding this comment.
That assumption is in the new code - it checks the isFedeated flag, and performs an unchecked (no instanceof) cast here. This means the code assumes that baseCatalog is of type IcebergCatalog if isFedeated == false, which is based purely on ConnectionConfigInfoDpo and it is not clear how ConnectionConfigInfoDpo affects the type of baseCatalog.
There was a problem hiding this comment.
The cast is required regardless, so I don’t see how this refactor makes things worse. Previously, the assumption was implicit and hidden, whereas now it’s made explicit. That’s exactly the intent of this PR.
There was a problem hiding this comment.
If needed we could introduce more guardrail to ensure the catalog of federated catalogs and others.
There was a problem hiding this comment.
Before this change, the code already relied on the fact that certain catalogs are federated and others are not, but that assumption was hidden inside a checked cast, so it wasn’t obvious when reading the flow.
With this refactor:
- The logic is now structured around the federation flag first. This was done in the initialization method, which provide clearly distinguish whether a catalog is federated or not.
- The branching makes it clear that we are handling two distinct cases
- The cast is still required, but it now happens in a well-defined branch, instead of being buried in-line
So the assumption didn’t get introduced by this PR, it was already there. What changed is that it is now surfaced explicitly in the control flow, making the intent easier to see and reason about.
There was a problem hiding this comment.
IMHO a checked cast is more explicit and easier to reason about than an if on a flag that is derived from the config, which is not directly related to the type of the variable being cast.
There was a problem hiding this comment.
initializeCatalog() decides whether a catalog is federated or not. That's the reason we make it explicit there. A condition on catalog class type is hard to understand, and reasoning, given there are many different catalog classes, and all of them are Iceberg catalogs. A conditional check like if (baseCatalog instanceof IcebergCatalog polarisCatalog) makes it harder to understand whether it is a federated one or not.
There was a problem hiding this comment.
initializeCatalog() decides the value of the isFederated field. It does not decide the type of baseCatalog.
There was a problem hiding this comment.
If agree that the current logic in this class is not ideal. Yet, I believe changing checked casts to implicit casts is not better... that's why I'm advocating for a more holistic object-oriented approach.
| authorizeBasicNamespaceOperationOrThrow(op, namespace); | ||
|
|
||
| if (baseCatalog instanceof IcebergCatalog polarisCatalog) { | ||
| if (isFederated) { |
There was a problem hiding this comment.
I'd prefer a different kind of refactoring for this code so as to remove these if/else constructs and have a more object-oriented design.
There was a problem hiding this comment.
Can you be more specific about more object-oriented design?
There was a problem hiding this comment.
Avoid if/else blocks on the "federated" flag but use different sub-classes to implement local and federated catalogs.
The diff will be larger, of course, but I hope the overall code quality will be better.
There was a problem hiding this comment.
I thought about that while making this refactor, but due to a lot of common code, I decide not to do so. But I'm open to it. With the current refactor in the this PR, it becomes easier if anyone want to do so. Is this a blocker for this refactor?
There was a problem hiding this comment.
Not a blocker, but I do not really see how it improves the codebase 🤷
Make federation logic more explicit instead of relying on checking on the object type. No logic change.
Checklist
CHANGELOG.md(if needed)site/content/in-dev/unreleased(if needed)