
ResourceVersion conflicts block pod rollout due to Patroni annotation updates #4439

@rdalmas

Description

Overview

The Postgres Operator encounters frequent ResourceVersion conflicts when attempting to delete pods during rollout operations. The error occurs in instance.go:876 within the rolloutInstance() function when the operator tries to delete a pod with a stale ResourceVersion due to Patroni's continuous annotation updates.

Root Cause: Patroni updates the pod's status annotation every ~10 seconds (based on loop_wait config) to track cluster state (xlog_location, role, replication_state, etc.). On active databases with frequent writes, this causes the pod's ResourceVersion to increment constantly. When the operator attempts to delete a pod during rollout using client.Preconditions with a ResourceVersion check, the precondition fails because Patroni has updated the annotation in the meantime.
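The race can be sketched with a toy in-memory model (illustrative names only, not client-go): every annotation write bumps the object's resourceVersion, so a delete whose precondition pins the version read *before* the write is rejected, while a UID-only precondition still identifies the same pod.

```go
package main

import (
	"errors"
	"fmt"
)

// Toy model of an apiserver object: every metadata write bumps resourceVersion.
type object struct {
	uid             string
	resourceVersion int
	annotations     map[string]string
}

var errConflict = errors.New(
	"the ResourceVersion in the precondition does not match the ResourceVersion in record")

// patroniHeartbeat mimics Patroni rewriting the status annotation (~every loop_wait).
func patroniHeartbeat(o *object, status string) {
	o.annotations["status"] = status
	o.resourceVersion++
}

// deleteWithPreconditions mimics a precondition-guarded delete.
// Passing rv == nil skips the ResourceVersion check (UID-only precondition).
func deleteWithPreconditions(o *object, uid string, rv *int) error {
	if o.uid != uid {
		return errors.New("UID precondition failed")
	}
	if rv != nil && o.resourceVersion != *rv {
		return errConflict
	}
	return nil // delete accepted
}

func main() {
	pod := &object{uid: "abc-123", resourceVersion: 682170681,
		annotations: map[string]string{}}

	// Operator reads the pod, capturing the current ResourceVersion.
	observedRV := pod.resourceVersion

	// Patroni updates the status annotation before the delete lands.
	patroniHeartbeat(pod, `{"xlog_location":331786904544}`)

	// Delete pinned to the stale ResourceVersion is rejected...
	fmt.Println(deleteWithPreconditions(pod, pod.uid, &observedRV))
	// ...while a UID-only precondition still matches the same pod.
	fmt.Println(deleteWithPreconditions(pod, pod.uid, nil))
}
```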

Impact:

  • Pod rollouts fail to complete
  • PostgresCluster status.instances[].updatedReplicas field remains empty
  • Clusters show as 2//2 instead of 2/2/2 in status (the empty updatedReplicas count drops the middle field)
  • Pod operations and replication continue normally (cosmetic status issue, but blocks intentional rollouts)

Environment

  • Platform: Kubernetes
  • Platform Version: Unknown (1.2x+)
  • PGO Image Tag: 5.8.6
  • Postgres Version: 15
  • Storage: Cloud provider persistent volumes
  • Patroni Configuration: loop_wait: 10, ttl: 30, synchronous_mode: true

Steps to Reproduce

REPRO

  1. Deploy a PostgresCluster with multiple instances (HA setup with Patroni)
  2. Run an active workload generating frequent database writes (updates xlog_location continuously)
  3. Trigger a pod rollout by updating the PostgresCluster spec (e.g., change resource limits, update image)
  4. Observe operator logs for ResourceVersion conflicts

The issue is more pronounced on:

  • Databases with high transaction rates
  • Clusters with default Patroni loop_wait: 10 seconds
  • Rollouts taking longer than one Patroni loop cycle

EXPECTED

  1. Operator successfully deletes pod with UID check
  2. StatefulSet recreates pod with new template
  3. status.instances[].updatedReplicas updates correctly
  4. Rollout completes without errors

ACTUAL

  1. Operator fails to delete the pod with the error:
     Operation cannot be fulfilled on Pod "...": the ResourceVersion in the precondition (682170681) does not match the ResourceVersion in record (682171386). The object might have been modified
  2. Pod is NOT deleted (rollout blocked)
  3. Status field updatedReplicas remains empty
  4. Error repeats on every reconciliation attempt

Logs

Operator Error Logs

time="2026-02-23T11:44:13Z" level=error msg="Reconciler error" 
PostgresCluster=postgres-51e9e197-1ca5-4ecf-a6a2-4d7a57ff5572/db-51e9e197-1ca5-4ecf-a6a2-4d7a57ff5572 
controller=postgrescluster 
controllerGroup=postgres-operator.crunchydata.com 
controllerKind=PostgresCluster 
error="Operation cannot be fulfilled on Pod \"db-51e9e197-1ca5-4ecf-a6a2-4d7a57ff5572-inst-2f2r-0\": the ResourceVersion in the precondition (682170681) does not match the ResourceVersion in record (682171386). The object might have been modified" 
file="internal/controller/postgrescluster/instance.go:876" 
func="postgrescluster.(*Reconciler).rolloutInstance" 
name=db-51e9e197-1ca5-4ecf-a6a2-4d7a57ff5572 
namespace=postgres-51e9e197-1ca5-4ecf-a6a2-4d7a57ff5572 
reconcileID=8e87f130-56a3-427c-bdea-b2502e144e69

Errors occur across multiple clusters, at a rate of roughly 600 occurrences over 13 hours on a landscape of 87 PostgresClusters.

Verification of Root Cause

Pod ResourceVersion updates every ~10 seconds on active databases:

$ kubectl get pod <pod> -n <namespace> --watch -o jsonpath='{.metadata.resourceVersion}{"\n"}'
682382578
682382926  # ~10s later
682383283  # ~10s later

Patroni annotation changing frequently:

$ kubectl get pod <pod> -o jsonpath='{.metadata.annotations.status}' | jq .xlog_location
331786904544  # Changes with every database write

Labels and Patroni topology match (no actual impact on cluster health):

$ kubectl get pod <pod> -o jsonpath='{.metadata.labels.postgres-operator\.crunchydata\.com/role}'
replica

$ kubectl exec <pod> -c database -- patronictl topology
| Member | Host | Role         | State     | TL | Lag in MB |
|--------|------|--------------|-----------|----|-----------| 
| inst-0 | ...  | Sync Standby | streaming | 26 |         0 |

Proposed Solution

Remove ResourceVersion from Delete Precondition

In internal/controller/postgrescluster/instance.go around line 850:
https://github.com/CrunchyData/postgres-operator/blob/main/internal/controller/postgrescluster/instance.go#L850

// Current code causing conflicts:
return errors.WithStack(
    r.Writer.Delete(ctx, pod, client.Preconditions{
        UID:             &pod.UID,
        ResourceVersion: &pod.ResourceVersion,  // ← Remove this
    }))

// Proposed fix:
return errors.WithStack(
    r.Writer.Delete(ctx, pod, client.Preconditions{
        UID: &pod.UID,  // Keep UID check for safety
    }))

Rationale: The UID check is sufficient to ensure we're deleting the correct pod. The ResourceVersion check is overly strict for intentional deletion during rollout. Patroni's annotation updates don't affect the operator's intent to delete the pod.
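If keeping the ResourceVersion guard is preferred, an alternative would be to re-read the pod and retry the delete on conflict, in the spirit of client-go's retry.RetryOnConflict. A minimal self-contained sketch of that retry shape, using a toy store standing in for the API client (names are illustrative, not the operator's actual code):

```go
package main

import (
	"errors"
	"fmt"
)

var errConflict = errors.New("resourceVersion conflict")

// store is a toy stand-in for the apiserver: Get returns the current
// resourceVersion; Delete rejects a stale one, like client.Preconditions.
type store struct {
	rv      int
	deleted bool
	churn   int // times a background writer bumps rv before the delete wins
}

func (s *store) Get() int { return s.rv }

func (s *store) Delete(rv int) error {
	if s.churn > 0 { // simulate Patroni bumping the annotation between Get and Delete
		s.rv++
		s.churn--
	}
	if rv != s.rv {
		return errConflict
	}
	s.deleted = true
	return nil
}

// deleteWithRetry re-reads the object and retries the precondition-guarded
// delete a bounded number of times, the same shape as retry.RetryOnConflict.
func deleteWithRetry(s *store, attempts int) error {
	var err error
	for i := 0; i < attempts; i++ {
		rv := s.Get() // refresh the ResourceVersion on every attempt
		if err = s.Delete(rv); err == nil || !errors.Is(err, errConflict) {
			return err
		}
	}
	return err
}

func main() {
	s := &store{rv: 682170681, churn: 2}
	fmt.Println(deleteWithRetry(s, 5), s.deleted)
}
```

The trade-off: retrying keeps the strict guard but can still lose the race indefinitely on a busy database, whereas the UID-only precondition above resolves the conflict outright.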
