Report failed status #48

Merged
merged 1 commit into from
Aug 25, 2023
ryannedolan (Collaborator) commented Aug 25, 2023

Summary

It's not always possible to ensure that the SQL submitted with a Subscription will be implementable by the operator. In addition to syntax errors, it's possible that the user and operator disagree on what databases or tables exist, or which tables the user has access to. In such cases, the operator would just keep retrying and failing, without providing a reason to the user.

With this change, we report .status.failed anytime the operator is unable to implement a pipeline for the Subscription. In addition, we set .status.message to the offending exception. This gives the user some insight into why the Subscription is not being deployed successfully.
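To make the status contract above concrete, here is a minimal sketch in Python. It is not the operator's actual code: `Subscription`, `SubscriptionStatus`, `reconcile`, and `fake_planner` are hypothetical names, and the `Error: ...` / `Deployed.` message strings are assumptions modeled on the STATUS column in the testing output below.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class SubscriptionStatus:
    # Hypothetical mirror of the .status.failed and .status.message fields.
    failed: bool = False
    message: Optional[str] = None


@dataclass
class Subscription:
    name: str
    sql: str
    status: SubscriptionStatus = field(default_factory=SubscriptionStatus)


def reconcile(sub: Subscription, plan_pipeline: Callable[[str], object]) -> bool:
    """Try to implement a pipeline and record the outcome on the status."""
    try:
        plan_pipeline(sub.sql)
    except Exception as exc:
        # Surface the offending exception instead of failing silently.
        sub.status.failed = True
        sub.status.message = f"Error: {exc}"
        return False
    sub.status.failed = False
    sub.status.message = "Deployed."
    return True


def fake_planner(sql: str) -> str:
    """Stand-in for the real planner (assumption): rejects one unknown table."""
    if "xxproducts_on_hand" in sql:
        raise LookupError("Object 'xxproducts_on_hand' not found within 'INVENTORY'")
    return "pipeline"
```

Calling `reconcile` with the misspelled table name leaves `failed` set and the exception text in `message`. Calling it again with corrected SQL clears `failed`, which matches the before/after behavior shown in the testing section.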

N.B. The operator will still back off and retry failed Subscriptions rather than give up entirely. In particular, a failed Subscription may suddenly transition to success if the catalog changes, e.g. if a missing table is created.
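The retry behavior can be sketched roughly as follows. This is an illustration under stated assumptions, not the operator's implementation: the PR only says the operator backs off and retries, so the capped exponential schedule, the function names, and the `max_attempts` bound (present only to keep the sketch finite, since the real operator does not give up) are all hypothetical.

```python
import time
from typing import Callable, Iterator, List, Tuple


def backoff_delays(base: float = 1.0, factor: float = 2.0,
                   cap: float = 60.0) -> Iterator[float]:
    """Yield capped exponential delays (assumed schedule): base, base*factor, ..."""
    delay = base
    while True:
        yield min(delay, cap)
        delay = min(delay * factor, cap)


def retry_until_deployed(attempt: Callable[[], bool], max_attempts: int,
                         sleep: Callable[[float], None] = time.sleep) -> Tuple[bool, int]:
    """Call attempt() until it succeeds; return (succeeded, attempts_used)."""
    delays = backoff_delays()
    for n in range(1, max_attempts + 1):
        if attempt():
            return True, n
        if n < max_attempts:
            sleep(next(delays))  # wait before the next try
    return False, max_attempts


def demo(table_appears_after_waits: int = 2) -> Tuple[bool, int, List[float]]:
    """Simulate a missing table that is created while the operator is waiting."""
    catalog: set = set()
    waits: List[float] = []

    def fake_sleep(seconds: float) -> None:
        # Record the delay instead of sleeping; the catalog changes mid-wait.
        waits.append(seconds)
        if len(waits) == table_appears_after_waits:
            catalog.add("products_on_hand")

    ok, attempts = retry_until_deployed(
        lambda: "products_on_hand" in catalog, max_attempts=5, sleep=fake_sleep)
    return ok, attempts, waits
```

In `demo`, the first two attempts fail because the table is missing, and the third succeeds once it has been created. This is the "failed, then suddenly deployed" transition described above, with no change to the Subscription itself.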

Testing Done

I deployed a subscription with invalid SQL, and validated that the subscription was marked as failed:

$ kubectl get subs
NAME       STATUS                                                                                                                                                             DB         SQL
products   Error: org.apache.calcite.runtime.CalciteContextException: From line 1, column 45 to line 1, column 74: Object 'xxproducts_on_hand' not found within 'INVENTORY'   RAWKAFKA   SELECT "quantity", "product_id" AS KEY FROM INVENTORY."xxproducts_on_hand"

$ kubectl get subs -o yaml | grep failed
    failed: true

Then, after fixing the SQL:

$ kubectl get subs
NAME       STATUS      DB         SQL
products   Deployed.   RAWKAFKA   SELECT "quantity", "product_id" AS KEY FROM INVENTORY."products_on_hand"

$ kubectl get subs -o yaml | grep failed
    failed: false

@ryannedolan ryannedolan enabled auto-merge (squash) August 25, 2023 03:19
hshukla (Collaborator) commented Aug 25, 2023

> N.B. the operator will still backoff and retry failed Subscriptions, rather than give up entirely. In particular, a failed Subscription may suddenly transition to success if the catalog changes, e.g. if a missing table is created etc.

[For follow-up PR] As we discussed offline, we should see if we can purge/clean up resources in the operator for a failed Subscription request. This would help in two ways:

  • It removes dangling/orphan resources from hoptimator, so we need not worry about them still being active a few hours later, when they are almost certainly not needed. This matters especially because we have an API in front of it, and API users may submit a fresh request after fixing the actual issue.
  • With no orphan pipelines left behind, we also avoid polluting downstream resources.

@ryannedolan ryannedolan merged commit f81e6ac into main Aug 25, 2023
1 check passed
@ryannedolan ryannedolan deleted the report-failed branch August 25, 2023 16:31