Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix cometindex destination db and table initialization logic #4758

Closed
wants to merge 1 commit into from

Conversation

ejmg
Copy link
Collaborator

@ejmg ejmg commented Jul 25, 2024

Describe your changes

The original logic for Indexer::run had a initialization check that created the watermark table and destination tables. The problem with this check is that there isn't a guarantee that the destination tables were initialized just because a watermark table is detected. This PR moves the initialization of destination tables outside of the check.

Issue ticket number and link

Checklist before requesting a review

  • If this code contains consensus-breaking changes, I have added the "consensus-breaking" label. Otherwise, I declare my belief that there are not consensus-breaking changes, for the following reason:

    REPLACE THIS TEXT WITH RATIONALE (CAN BE BRIEF)

Only the crate cometindex is effected by this PR.

@cronokirby
Copy link
Contributor

I think the initialization logic we want can just be "is the database completely empty or not", having intermediate states beyond this is complicated to reason about and we can just not reason about it for now by completely wiping the database when we want to run a different indexer, which makes this particular situation not a thing we have to deal with.

@ejmg
Copy link
Collaborator Author

ejmg commented Jul 25, 2024

Fair, but the current initialization logic still allows for a custom AppView to fail when ran because of how the check is structured. The watermark table can be defined without having the destination tables initialized and that will always throw an unnecessary error. This prevents that specific case regardless of what else downstream consumers might do with respect to initializing their indexers.

@cronokirby
Copy link
Contributor

Yes, but you want it to fail, because you don't want to start running an indexing component that's missed a bunch of history, because:

  • missing history is in and of itself, not good
  • components can enter weird states if they're run on partial history (for example, validator voting power could end up negative or something)

@aubrika
Copy link
Contributor

aubrika commented Jul 25, 2024

there isn't a guarantee that the destination tables were initialized just because a watermark table is detected

the above is worth fixing

I think the initialization logic we want can just be "is the database completely empty or not",

and I agree that wiping the database completely & restarting from zero if not completely empty is the preferable fix, for the reasons Lucas indicates

@conorsch
Copy link
Contributor

@ejmg I'm seeing consensus here that the failure at startup when db contents are not what's as expected is by design, and part of the contract. I recognize that we could make the db init a bit smarter, but I agree it's safest to require that indexer operators provision the database cleanly ahead of running the indexer against it. Ideally we'll improve that story over time, but for now, especially given the debugging in e.g. #4759, it's the best use of our time to be explicit about the db initial state.

There's some great suggestions over in #4752, and I'll let @cronokirby chime in about how best to support that workflow.

@conorsch conorsch closed this Jul 25, 2024
@conorsch
Copy link
Contributor

the failure at startup when db contents are not what's as expected is by design, and part of the contract.

We made a similar call over in #4764

@ejmg
Copy link
Collaborator Author

ejmg commented Jul 25, 2024

Sounds good to me 👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants