Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update Cilium v1.16.4 #493

Merged
merged 1 commit into from
Dec 2, 2024
Merged

Update Cilium v1.16.4 #493

merged 1 commit into from
Dec 2, 2024

Conversation

kvaps
Copy link
Member

@kvaps kvaps commented Dec 2, 2024

Summary by CodeRabbit

  • New Features

    • Introduced new configuration options for socket-based load balancing tracing and initial fetch timeout settings in the Cilium deployment.
    • Enhanced validation checks for deprecated options to prevent misconfigurations.
  • Bug Fixes

    • Improved error messaging for deprecated or invalid settings.
  • Documentation

    • Updated version numbers in README and configuration files to reflect the new version (1.16.4).
  • Chores

    • Updated Dockerfile and image tags to reference the latest version (1.16.4).

@dosubot dosubot bot added the size:L This PR changes 100-499 lines, ignoring generated files. label Dec 2, 2024
Copy link
Contributor

coderabbitai bot commented Dec 2, 2024

Walkthrough

The pull request includes updates to the Cilium Helm chart, primarily focusing on version increments from 1.16.3 to 1.16.4 across various files. Key changes involve enhancements to the Chart.yaml, README.md, and configuration files, introducing new parameters for resource management and validation, as well as updates to the Dockerfile. The changes also include improvements in the Envoy configuration, allowing for customizable timeout settings and enhanced tracing capabilities.

Changes

File Path Change Summary
packages/system/cilium/charts/cilium/Chart.yaml Updated appVersion and version from 1.16.3 to 1.16.4; modified annotations for various resource kinds with detailed descriptions.
packages/system/cilium/charts/cilium/README.md Updated version and app version badges from 1.16.3 to 1.16.4.
packages/system/cilium/charts/cilium/files/cilium-envoy/configmap/bootstrap-config.json Added initialFetchTimeout parameters and restructured the resource management section, replacing layeredRuntime with overload_manager.
packages/system/cilium/charts/cilium/templates/cilium-configmap.yaml Added and modified configuration parameters, including trace-sock, proxy-initial-fetch-timeout, and conditional updates to bpf-lb-sock-terminate-pod-connections.
packages/system/cilium/charts/cilium/templates/hubble-relay/serviceaccount.yaml Added automountServiceAccountToken field conditionally based on .Values.serviceAccounts.relay.automount.
packages/system/cilium/charts/cilium/templates/validate.yaml Introduced validation checks for deprecated options and specific ranges for cluster.id and expanded checks for cluster.name.
packages/system/cilium/charts/cilium/values.schema.json Added initialFetchTimeoutSeconds property and updated enabled property to accept null type.
packages/system/cilium/charts/cilium/values.yaml Updated image tags and digests from v1.16.3 to v1.16.4, added tracing option in socketLB, and adjusted certValidityDuration from 1095 to 365 days.
packages/system/cilium/charts/cilium/values.yaml.tmpl Added tracing and initialFetchTimeoutSeconds options, and updated certValidityDuration for the hubble section.
packages/system/cilium/images/cilium/Dockerfile Updated the version argument from v1.16.3 to v1.16.4.

Possibly related PRs

Suggested labels

bug, documentation, enhancement, size:M

🐇 Hopping through the code, so bright and new,
With version updates and features, oh what a view!
Configs are clearer, with tracing in sight,
Cilium's now better, oh what a delight!
From 1.16.3 to 1.16.4, we cheer,
For a smoother deployment, let’s give a big cheer! 🎉


Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Outside diff range and nitpick comments (1)
packages/system/cilium/charts/cilium/files/cilium-envoy/configmap/bootstrap-config.json (1)

381-387: Consider making max_active_downstream_connections configurable

The overload manager configuration adds protection against resource exhaustion by limiting active downstream connections. However, the hard-coded limit of 50,000 connections might not be suitable for all deployments.

Consider making this value configurable through Helm values:

-          "max_active_downstream_connections": "50000"
+          "max_active_downstream_connections": "{{ .Values.envoy.overloadManager.maxActiveDownstreamConnections }}"
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between edbbb9b and e971892.

📒 Files selected for processing (10)
  • packages/system/cilium/charts/cilium/Chart.yaml (2 hunks)
  • packages/system/cilium/charts/cilium/README.md (9 hunks)
  • packages/system/cilium/charts/cilium/files/cilium-envoy/configmap/bootstrap-config.json (3 hunks)
  • packages/system/cilium/charts/cilium/templates/cilium-configmap.yaml (3 hunks)
  • packages/system/cilium/charts/cilium/templates/hubble-relay/serviceaccount.yaml (1 hunks)
  • packages/system/cilium/charts/cilium/templates/validate.yaml (1 hunks)
  • packages/system/cilium/charts/cilium/values.schema.json (1 hunks)
  • packages/system/cilium/charts/cilium/values.yaml (9 hunks)
  • packages/system/cilium/charts/cilium/values.yaml.tmpl (3 hunks)
  • packages/system/cilium/images/cilium/Dockerfile (1 hunks)
✅ Files skipped from review due to trivial changes (2)
  • packages/system/cilium/images/cilium/Dockerfile
  • packages/system/cilium/charts/cilium/README.md
🔇 Additional comments (12)
packages/system/cilium/charts/cilium/templates/hubble-relay/serviceaccount.yaml (1)

16-16: LGTM! Security enhancement for service account token mounting

The addition of automountServiceAccountToken follows Kubernetes security best practices by allowing explicit control over service account token mounting for the Hubble relay pods. This helps implement the principle of least privilege.

packages/system/cilium/charts/cilium/Chart.yaml (1)

82-82: LGTM! Version update is consistent

The version update from 1.16.3 to 1.16.4 is consistently applied to both appVersion and version fields.

Let's verify version consistency across other files in the repository:

Also applies to: 98-98

packages/system/cilium/charts/cilium/files/cilium-envoy/configmap/bootstrap-config.json (1)

341-341: LGTM: Initial fetch timeout configuration added

Adding explicit timeouts for initial xDS (LDS/CDS) fetch operations is a good practice to prevent indefinite hangs during Envoy startup.

Also applies to: 357-357

packages/system/cilium/charts/cilium/templates/cilium-configmap.yaml (3)

1278-1278: LGTM: Initial fetch timeout configuration

The proxy-initial-fetch-timeout configuration is properly added and aligns with the Envoy bootstrap configuration.


721-723: LGTM: Improved socket LB termination configuration

The change provides better backward compatibility by falling back to socketLB.enabled when terminatePodConnections is not set.


724-725: LGTM: Socket LB tracing configuration added

New tracing capability for socket LB is properly integrated into the configuration.

packages/system/cilium/charts/cilium/values.schema.json (1)

1956-1958: LGTM: Schema updates for new configurations

The schema properly defines:

  • initialFetchTimeoutSeconds as integer type
  • envoy.enabled to accept null values for more flexible configuration
packages/system/cilium/charts/cilium/values.yaml.tmpl (4)

Line range hint 156-159: Version updates look consistent!

The version updates from v1.16.3 to v1.16.4 and corresponding digest updates have been applied consistently across all components:

  • Cilium agent
  • Hubble relay
  • Cilium operator (all variants)
  • Preflight check

Also applies to: 1317-1319, 2483-2491, 2765-2767


1009-1010: New observability feature: Socket LB tracing

A new configuration option has been added to enable tracing for socket-based load balancing, which will help with debugging and observability.


2162-2163: New reliability feature: Envoy initial fetch timeout

Added configuration for the initial fetch timeout on xDS streams, which helps control how long Envoy will wait for the initial configuration fetch.


1280-1283: Security improvement: Reduced certificate validity duration

The certificate validity duration has been reduced from 1095 days (3 years) to 365 days (1 year) to improve security and ensure compatibility with MacOS, which does not accept self-signed certificates with validity periods exceeding 825 days.

packages/system/cilium/charts/cilium/values.yaml (1)

Line range hint 1-3: Generated file matches template

This is an auto-generated file from values.yaml.tmpl. All changes are consistent with the template file reviewed above.

Comment on lines +153 to +157
{{ if and
(or (and (ge (int .Values.cluster.id) 128) (le (int .Values.cluster.id) 255)) (and (ge (int .Values.cluster.id) 384) (le (int .Values.cluster.id) 511)))
(or .Values.eni.enabled .Values.alibabacloud.enabled (eq .Values.cni.chainingMode "aws-cni")) -}}
{{ fail "Cilium is currently affected by a bug that causes traffic matched by network policies to be incorrectly dropped when running in either ENI mode (both AWS and AlibabaCloud) or AWS VPC CNI chaining mode, if the cluster ID is 128-255 (and 384-511 when maxConnectedClusters=511). Please refer to https://github.com/cilium/cilium/issues/21330 for additional details." }}
{{- end }}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codebase verification

Important validation needs to be documented in upgrade notes

The validation for cluster ID ranges (128-255, 384-511) when using ENI mode or AWS VPC CNI chaining mode is not documented in any of the chart's documentation files. This validation is critical as it prevents network policy traffic from being incorrectly dropped.

  • Add documentation about this limitation in the chart's README.md or values.yaml under the relevant sections (cluster.id, eni.enabled, or cni.chainingMode)
  • Consider adding a note in UPGRADE.md (if it exists) to warn users about this limitation when upgrading to v1.16.4
🔗 Analysis chain

LGTM! Important validation for preventing traffic drops.

The new validation block correctly prevents using problematic cluster ID ranges (128-255, 384-511) when running in ENI mode or AWS VPC CNI chaining mode, which could cause network policy traffic to be incorrectly dropped.

Let's verify if this validation is documented in the upgrade notes:

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check if the cluster ID validation is documented in the upgrade notes
# Expected: Find mentions of cluster ID ranges and the associated bug

# Search for cluster ID related documentation in README/upgrade notes
fd -t f -e md | xargs rg -i "cluster.*id.*128.*255|21330"

Length of output: 59


Script:

#!/bin/bash
# Let's try a broader search for upgrade notes and documentation
# First, let's check for any UPGRADE or migration guides
fd -t f "UPGRADE|MIGRATION|CHANGELOG" -i

# Then, let's check the root README and docs directory
fd -t f "README|\.md$" -d 2

Length of output: 786


Script:

#!/bin/bash
# Let's also check the Helm chart's documentation specifically
fd -t f "README|VALUES|NOTES" -i packages/system/cilium/charts/cilium

Length of output: 388


Script:

#!/bin/bash
# Let's also check for any mentions of this issue with a broader search
rg -i "network polic.*drop|eni.*mode.*cluster|vpc.*cni.*cluster" 

Length of output: 952


Script:

#!/bin/bash
# Let's check the content of the Cilium chart's README and values files
# to see if this limitation is documented
cat packages/system/cilium/charts/cilium/README.md

Length of output: 65589


Script:

#!/bin/bash
# Also check the values.yaml for any comments or documentation
cat packages/system/cilium/charts/cilium/values.yaml | rg -i -A 5 -B 5 "cluster.*id|eni|aws|vpc"

Length of output: 11261

@kvaps kvaps merged commit 6caefcd into main Dec 2, 2024
2 checks passed
@kvaps kvaps deleted the upd-cilium3 branch December 2, 2024 18:20
@coderabbitai coderabbitai bot mentioned this pull request Dec 3, 2024
klinch0 pushed a commit to klinch0/cozystack that referenced this pull request Dec 5, 2024
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **New Features**
- Introduced new configuration options for socket-based load balancing
tracing and initial fetch timeout settings in the Cilium deployment.
- Enhanced validation checks for deprecated options to prevent
misconfigurations.

- **Bug Fixes**
	- Improved error messaging for deprecated or invalid settings.

- **Documentation**
- Updated version numbers in README and configuration files to reflect
the new version (1.16.4).

- **Chores**
- Updated Dockerfile and image tags to reference the latest version
(1.16.4).

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
@coderabbitai coderabbitai bot mentioned this pull request Dec 5, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
size:L This PR changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant