Etcd Upgrade Fails When Persistent Storage Flag is Disabled #78398

Open
saurabhnetskope opened this issue Feb 27, 2025 · 2 comments
Labels
etcd tech-issues The user has a technical issue about an application triage Triage is needed

Comments

saurabhnetskope commented Feb 27, 2025

Name and Version

bitnami/etcd:3.5.18

What architecture are you using?

amd64

What steps will reproduce the bug?

Description:
We are running a 3-pod etcd cluster without persistent storage, relying on emptyDir. However, after the recent change introduced in commit 1aff4e2, the etcd upgrade is failing.

Steps to Reproduce:

  1. Deploy a 3-pod etcd cluster with persistent storage disabled (emptyDir used instead).
  2. Attempt to perform a rolling upgrade.
  3. Observe that the upgrade fails.

Root Cause:
The issue lies in the is_new_etcd_cluster function. To determine if the cluster is new or existing, this function executes:

is_new_etcd_cluster() {
    local -a extra_flags
    read -r -a extra_flags <<<"$(etcdctl_auth_flags)"
    is_boolean_yes "$ETCD_ON_K8S" && extra_flags+=("--endpoints=$(etcdctl_get_endpoints)")
    ! debug_execute etcdctl endpoint status --cluster "${extra_flags[@]}"
}

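During a rolling upgrade, not all endpoints are responsive. When the list returned by etcdctl_get_endpoints includes the pod's own endpoint, the status check in:

! debug_execute etcdctl endpoint status --cluster "${extra_flags[@]}"

fails. Because of the leading !, is_new_etcd_cluster then returns success, so the restarting pod appears to be treated as part of a new cluster rather than an existing one, which breaks the upgrade.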
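To make the exit-status inversion concrete, here is a minimal sketch. probe_all is a hypothetical stand-in for the etcdctl status check (it is not the real CLI), and the endpoint names and the REACHABLE list are assumptions for illustration only.

```shell
#!/usr/bin/env bash
# Assumption: etcd-0 is the pod being restarted, so it is not reachable.
REACHABLE="etcd-1:2379 etcd-2:2379"

# Hypothetical stand-in for the status check: succeeds only if every
# endpoint passed to it appears in the REACHABLE list.
probe_all() {
    local ep
    for ep in "$@"; do
        [[ " $REACHABLE " == *" $ep "* ]] || return 1
    done
}

# Same shape as is_new_etcd_cluster: negate the probe's exit status.
looks_new() {
    ! probe_all "$@"
}

if looks_new etcd-0:2379 etcd-1:2379 etcd-2:2379; then
    echo "treated as NEW cluster"
else
    echo "treated as EXISTING cluster"
fi
```

With the pod's own endpoint in the list, the probe fails and the negation turns that failure into a "new cluster" verdict, even though two members are healthy.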

Proposed Fix:
Modify etcdctl_get_endpoints to exclude the pod's own endpoint before executing etcdctl endpoint status --cluster. This will prevent failures when checking the cluster status during a rolling upgrade.
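As a rough illustration of the proposed filtering, the sketch below drops one endpoint from a comma-separated list. exclude_endpoint and the host names are hypothetical and not part of the Bitnami scripts; the real etcdctl_get_endpoints may build its list differently.

```shell
#!/usr/bin/env bash
# Hypothetical helper: remove one endpoint from a comma-separated list.
exclude_endpoint() {
    local list="$1" own="$2"
    local -a all kept=()
    local ep
    # Split the list on commas into an array.
    IFS=',' read -r -a all <<<"$list"
    for ep in "${all[@]}"; do
        [[ "$ep" == "$own" ]] || kept+=("$ep")
    done
    # Join the remaining endpoints with commas again.
    local IFS=','
    echo "${kept[*]}"
}

# Example with assumed host names: drop this pod's own endpoint.
exclude_endpoint \
    "etcd-0.etcd-headless:2379,etcd-1.etcd-headless:2379,etcd-2.etcd-headless:2379" \
    "etcd-1.etcd-headless:2379"
```

The resulting list could then be passed to --endpoints so that only the other members are queried.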

Expected Behavior:
The etcd cluster should successfully upgrade even when persistent storage is disabled, allowing rolling upgrades to complete without failure.

Environment Details:

  • Etcd Cluster: 3 pods
  • Storage: emptyDir
  • Affected Commit: 1aff4e2
  • Kubernetes Environment: [Provide Kubernetes version]
  • Helm Chart Version (if applicable): [Provide chart version]

What is the expected behavior?

The etcd cluster should successfully upgrade even when persistent storage is disabled, allowing rolling upgrades to complete without failure.

What do you see instead?

  • The rolling upgrade fails for etcd clusters running without persistent storage.
  • This could impact production environments that rely on emptyDir for ephemeral etcd storage.
@saurabhnetskope saurabhnetskope added the tech-issues The user has a technical issue about an application label Feb 27, 2025
@github-actions github-actions bot added the triage Triage is needed label Feb 27, 2025

saurabhnetskope commented Feb 27, 2025

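Here is the fixed function. The only change is passing true to etcdctl_get_endpoints, so that the pod's own endpoint is excluded: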

is_new_etcd_cluster() {
    local -a extra_flags
    read -r -a extra_flags <<< "$(etcdctl_auth_flags)"
    
    is_boolean_yes "$ETCD_ON_K8S" && extra_flags+=("--endpoints=$(etcdctl_get_endpoints true)")
    
    ! debug_execute etcdctl endpoint status --cluster "${extra_flags[@]}"
}
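The difference between the two versions can be exercised outside a cluster with stubs. Everything below except the two function bodies is an assumption: the helper stubs, the fake etcdctl, and the idea that a first argument of true makes etcdctl_get_endpoints skip the pod's own endpoint are hypothetical stand-ins, not the container's actual code.

```shell
#!/usr/bin/env bash
ETCD_ON_K8S="yes"
MY_ENDPOINT="etcd-0:2379"   # assumed: this pod, restarted and not serving yet
ALL_ENDPOINTS="etcd-0:2379,etcd-1:2379,etcd-2:2379"

# Hypothetical stubs for the library helpers used by the function.
etcdctl_auth_flags() { echo ""; }
is_boolean_yes() { [[ "$1" == "yes" ]]; }
debug_execute() { "$@" >/dev/null 2>&1; }

# Assumed behaviour: "true" as first argument omits this pod's endpoint.
etcdctl_get_endpoints() {
    local only_others="${1:-false}" ep out=""
    for ep in ${ALL_ENDPOINTS//,/ }; do
        [[ "$only_others" == "true" && "$ep" == "$MY_ENDPOINT" ]] && continue
        out+="${out:+,}$ep"
    done
    echo "$out"
}

# Fake etcdctl: fails if --endpoints contains the pod that is down.
etcdctl() {
    local arg
    for arg in "$@"; do
        [[ "$arg" == --endpoints=*"$MY_ENDPOINT"* ]] && return 1
    done
    return 0
}

# Original version from the report.
is_new_original() {
    local -a extra_flags
    read -r -a extra_flags <<<"$(etcdctl_auth_flags)"
    is_boolean_yes "$ETCD_ON_K8S" && extra_flags+=("--endpoints=$(etcdctl_get_endpoints)")
    ! debug_execute etcdctl endpoint status --cluster "${extra_flags[@]}"
}

# Fixed version: only the other members are queried.
is_new_fixed() {
    local -a extra_flags
    read -r -a extra_flags <<<"$(etcdctl_auth_flags)"
    is_boolean_yes "$ETCD_ON_K8S" && extra_flags+=("--endpoints=$(etcdctl_get_endpoints true)")
    ! debug_execute etcdctl endpoint status --cluster "${extra_flags[@]}"
}

is_new_original && echo "original: new" || echo "original: existing"
is_new_fixed && echo "fixed: new" || echo "fixed: existing"
```

Under these assumptions the original version reports a new cluster while the fixed version reports an existing one. Whether the real etcdctl behaves like the stub should be confirmed against an actual rolling upgrade.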

carrodher (Member) commented

Thank you for bringing this issue to our attention. We appreciate your involvement! If you're interested in contributing a solution, we welcome you to create a pull request. The Bitnami team is excited to review your submission and offer feedback. You can find the contributing guidelines here.

Your contribution will greatly benefit the community. Feel free to reach out if you have any questions or need assistance.

@carrodher carrodher added the etcd label Feb 27, 2025