Skip to content

Conversation

@thomas-waite
Copy link
Contributor

Add timeout protections to migration controller

The Migration controller.rs does not currently have timeout protections in case a migration hangs indefinitely. This PR adds them, and enforces a limit of 20seconds per migration.

If the migration takes longer than 20seconds, it is failed and marked retryable. Exponential backoff is used to determine the next time it can be retried.

@thomas-waite
Copy link
Contributor Author

@codex review

Copy link

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 1dd4a02631

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +227 to +231
// Execute with timeout to prevent indefinite hangs
let execute_result = tokio::time::timeout(
tokio::time::Duration::from_secs(MIGRATION_TIMEOUT_SECS),
processor.execute(),
)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Ensure timed‑out executes stop before running next migration

The new timeout wraps processor.execute() but does not guarantee the underlying work stops; tokio::time::timeout only drops the future. If an implementation uses blocking work or spawns background tasks, the migration can keep running after the timeout while the controller immediately continues to the next migration. That breaks the controller’s sequential‑execution guarantee and can lead to concurrent migrations mutating shared state. Consider running execute in a task that can be explicitly aborted/joined on timeout, or otherwise enforce that timed‑out work is fully stopped before proceeding.

Useful? React with 👍 / 👎.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Documented that migrations should not spawn background tasks

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant