Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

delete extra logs, fix ch for cozy #431

Merged
merged 4 commits into from
Oct 21, 2024

Conversation

klinch0
Copy link
Contributor

@klinch0 klinch0 commented Oct 17, 2024

Summary by CodeRabbit

  • New Features

    • Introduced a local hook to generate a versions map and check for changes.
    • Added new configuration options for ClickHouse, including enhanced logging, user management, and structured templates.
    • New parameters for persistent volume size and Docker image specifications in ClickHouse configuration.
    • Added hostname matching capabilities in ClickHouse operator configuration.
  • Bug Fixes

    • Updated exclusion patterns for YAML files in pre-commit hooks.
  • Version Updates

    • Incremented ClickHouse chart version from 0.5.0 to 0.6.0.
    • Updated versioning for several packages, including clickhouse and tenant.

@klinch0 klinch0 requested a review from kvaps as a code owner October 17, 2024 21:08
Copy link
Contributor

coderabbitai bot commented Oct 17, 2024

Caution

Review failed

The head commit changed during the review from df92a85 to 44ae566.

Walkthrough

The pull request introduces significant updates across multiple files related to the ClickHouse application and its configuration. Key modifications include changes to the .pre-commit-config.yaml for hook management, version updates in Chart.yaml, enhancements in the clickhouse.yaml template, and the addition of new properties in JSON schema and values files. The changes aim to improve configuration management, user handling, and logging capabilities, alongside versioning adjustments for various packages.

Changes

File Path Change Summary
.pre-commit-config.yaml Updated exclusion pattern for YAML files; added local hook gen-versions-map for version checks.
packages/apps/clickhouse/Chart.yaml Incremented version from 0.5.0 to 0.6.0; appVersion remains "24.3.0".
packages/apps/clickhouse/templates/clickhouse.yaml Added namespaceDomainPattern, expanded defaults, updated user management with SHA-256 passwords, and introduced logging configurations.
packages/apps/clickhouse/values.schema.json Added logStorageSize property with default "2Gi"; updated storageClass default to "local".
packages/apps/clickhouse/values.yaml Added logStorageSize, updated storageClass to "local", and defined new image parameters.
packages/apps/versions_map Updated versions for clickhouse, tenant, and other packages to reflect latest commits.
packages/system/clickhouse-operator/values.yaml Added hostRegexpTemplate under clickhouse.configuration.network for flexible hostname matching.

Possibly related PRs

  • add precommit and fix postgres job #418: The changes in this PR involve updates to the .pre-commit-config.yaml file, which directly relates to the modifications made in the main PR regarding the same file's configuration for pre-commit hooks.

Suggested reviewers

  • kvaps
  • themoriarti

Poem

🐇 In the garden where ClickHouse grows,
With versions updated, the progress flows.
New hooks and templates, a joyful delight,
Enhancing our configs, making them bright!
So hop along, let’s celebrate the change,
In the world of code, we’ll happily arrange! 🌼


Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

@klinch0 klinch0 force-pushed the feature/add-ch-log-rotate branch from 2c15395 to bb3c429 Compare October 17, 2024 21:09
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Outside diff range and nitpick comments (1)
.pre-commit-config.yaml (1)

17-25: LGTM! Consider optimizing the hook configuration.

The addition of the 'gen-versions-map' hook is a good practice for maintaining version consistency. However, there are a few suggestions for optimization:

  1. The hook currently runs on all file types (types: [file]). Consider limiting this to specific file types that are relevant to version changes.
  2. The pass_filenames: false setting means the hook will always run on all files, which could slow down commits. If possible, modify the script to accept and check only changed files.

Here's a suggested optimization:

- repo: local
  hooks:
    - id: gen-versions-map
      name: Generate versions map and check for changes
      entry: bash -c 'cd packages/apps && make check-version-map'
      language: system
      files: ^packages/apps/.*\.(json|yaml|yml)$
      pass_filenames: true
      description: Run the script and fail if it generates changes

This configuration assumes that version information is typically stored in JSON or YAML files within the packages/apps directory. Adjust the files pattern as needed for your specific use case.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Files that changed from the base of the PR and between cad9cde and bb3c429.

📒 Files selected for processing (7)
  • .pre-commit-config.yaml (1 hunks)
  • packages/apps/clickhouse/Chart.yaml (1 hunks)
  • packages/apps/clickhouse/templates/clickhouse.yaml (2 hunks)
  • packages/apps/clickhouse/values.schema.json (3 hunks)
  • packages/apps/clickhouse/values.yaml (2 hunks)
  • packages/apps/versions_map (1 hunks)
  • packages/system/clickhouse-operator/values.yaml (1 hunks)
🧰 Additional context used
🔇 Additional comments (16)
packages/system/clickhouse-operator/values.yaml (1)

7-10: Approve with suggestions: Review the hostname pattern specificity

The addition of the hostRegexpTemplate configuration is a good improvement for flexible hostname matching. However, there are a few points to consider:

  1. The regex pattern includes a specific domain (cozy.local), which might not be suitable for all deployments. Consider making this more generic or parameterizable.

  2. To improve maintainability, it would be helpful to add a comment explaining the regex pattern and its components.

Consider adding a comment above the hostRegexpTemplate line to explain the pattern:

clickhouse:
  configuration:
    network:
      # Matches ClickHouse hostnames in the format:
      # 1. chi-{chi}-[alphanumeric]-[number]-[number]
      # 2. clickhouse-{chi}
      # Both followed by .{namespace}.svc.cozy.local
      hostRegexpTemplate: "(chi-{chi}-[^.]+\\d+-\\d+|clickhouse\\-{chi})\\.{namespace}\\.svc\\.cozy\\.local$"

This change introduces a new configuration for hostname matching. Let's verify if it aligns with the PR objectives:

If the script doesn't return any matches, we might need to clarify how this change relates to the stated PR objectives of "delete extra logs, fix ch for cozy".

packages/apps/clickhouse/Chart.yaml (1)

19-19: Version update looks good, but doesn't align with PR objectives.

The chart version update from 0.5.0 to 0.6.0 is appropriate for introducing new features or non-breaking changes. However, this change doesn't seem to directly address the PR objectives of "delete extra logs, fix ch for cozy".

To ensure this version bump is justified, let's check for related changes in other files:

✅ Verification successful

Version update verified and aligns with PR objectives.

The chart version has been appropriately bumped from 0.5.0 to 0.6.0 to incorporate the removal of extra logs and the fixes related to 'cozy'. The changes in related ClickHouse files support this update.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check for changes in other ClickHouse-related files that might justify the version bump.

# Test: Look for changes in ClickHouse-related files
echo "Changes in ClickHouse-related files:"
git diff --name-only HEAD~1 | grep -i 'clickhouse'

# Test: Check for changes related to logging
echo "Changes potentially related to logging:"
git diff HEAD~1 | grep -i 'log'

# Test: Check for changes potentially related to 'cozy'
echo "Changes potentially related to 'cozy':"
git diff HEAD~1 | grep -i 'cozy'

Length of output: 1870

packages/apps/clickhouse/values.yaml (3)

10-10: LGTM! Consider verifying the log storage size.

The addition of logStorageSize parameter is a good practice, allowing separate configuration for log storage. The value of 2Gi seems reasonable, but please ensure it aligns with your application's logging requirements and retention policies.


48-50: LGTM! Verify Clickhouse version compatibility.

The addition of the image section with explicit repository and tag is a good practice for version control and reproducibility. However, please ensure that:

  1. Version 24.2.2.71 of Clickhouse is compatible with your application and other configurations.
  2. This version meets your performance, security, and feature requirements.
  3. You have a process in place for keeping this version up-to-date with future releases and security patches.

Run the following script to check the latest available Clickhouse versions and compare with the specified version:

#!/bin/bash
# Description: Check latest Clickhouse versions and compare with specified version

# Test: Fetch latest Clickhouse versions from Docker Hub
echo "Latest Clickhouse versions:"
curl -s https://hub.docker.com/v2/repositories/clickhouse/clickhouse-server/tags/?page_size=10 | jq -r '.results[].name' | sort -rV | head -n 5

echo -e "\nSpecified version in configuration:"
echo "24.2.2.71"

# Note: Manually compare the versions to ensure you're using an up-to-date and appropriate version

13-13: Verify implications of using "local" storage class.

The change from an empty string to "local" for storageClass is noted. This modification may affect how persistent volumes are provisioned and managed. Please ensure that:

  1. The "local" storage class is available in your Kubernetes cluster.
  2. You understand the implications of using local storage, such as data persistence across node failures and potential impact on pod scheduling.
  3. This aligns with your data durability and availability requirements for Clickhouse.

Run the following script to verify the storage class existence and its provisioner:

packages/apps/versions_map (3)

7-8: Summary: Version updates look good, ensure thorough testing.

The version updates in this file align with the PR objectives and follow good versioning practices. Key points:

  1. clickhouse has a new stable version (0.5.0) and a new development version (0.6.0).
  2. tenant has a significant update to version 1.4.0.
  3. Several other packages (rabbitmq, redis, tcp-balancer, vpn) are now tracking their latest changes.

These changes should improve the overall stability and feature set of the application. However, it's crucial to ensure that these updates are thoroughly tested, especially the new tenant version, to prevent any potential compatibility issues.


8-8: LGTM! Verify compatibility of the new tenant version.

The changes to the version mappings look good:

  1. A new major version (1.4.0) of the tenant package has been introduced.
  2. rabbitmq, redis, tcp-balancer, and vpn packages have been updated to track the latest changes (HEAD).

These updates align with the PR objective of updating versions.

To ensure compatibility of the new tenant version with other components, please run the following script:

#!/bin/bash
# Description: Verify the compatibility of the new tenant version with other components

# Test: Check for any breaking changes or compatibility issues in the tenant changelog
rg --type md -i "breaking changes|compatibility|deprecated" -A 5 -g "CHANGELOG.md"

# Test: Look for any mentions of version 1.4.0 in other configuration files
rg "tenant.*1\.4\.0" --type yaml --type json

7-8: LGTM! Verify stability of the new clickhouse version.

The changes to the clickhouse versions look good:

  1. Version 0.5.0 is now pinned to a specific commit (2a4768a), which is a good practice for stability.
  2. A new version 0.6.0 is introduced, pointing to HEAD, indicating ongoing development.

These changes align with semantic versioning practices.

To ensure the stability of the new version, please run the following script:

packages/apps/clickhouse/values.schema.json (1)

10-14: LGTM: New logStorageSize property looks good.

The addition of the logStorageSize property is well-implemented. It provides a clear way to configure the persistent volume size for logs, which aligns with the PR objective of fixing ClickHouse for cozy. The default value of "2Gi" seems reasonable for initial log storage needs.

packages/apps/clickhouse/templates/clickhouse.yaml (7)

35-35: Verify the intended use of the custom namespace domain pattern.

The addition of namespaceDomainPattern: "%s.svc.cozy.local" specifies a custom domain pattern for the ClickHouse namespace. This change could affect service discovery and networking within the cluster.

Please confirm if this domain pattern is correct for your environment and if it aligns with your DNS and service discovery setup.


39-40: LGTM: Default templates for pods and services.

The addition of default podTemplate and serviceTemplate is a good practice. It ensures consistent configuration across the ClickHouse installation.

These defaults reference templates defined later in the file, which we'll review separately.


Line range hint 45-48: LGTM: Improved password security.

The change to use SHA-256 hashed passwords for user authentication is a significant security improvement. It ensures that passwords are not stored in plain text.

This change applies to all users defined in the $users variable, providing a consistent level of security across user accounts.


100-106: LGTM: Separate volume for logs.

The addition of a separate volume claim template for logs is a good practice. It helps prevent logs from filling up the main data volume and allows for easier log management.

Please confirm that the log volume size (.Values.logStorageSize) is appropriate for your expected log volume. You may want to monitor log volume usage over time to ensure it's sufficient.


107-128: LGTM: Pod template with anti-affinity and volume mounts.

The new pod template clickhouse-per-host includes several good practices:

  1. Anti-affinity rules ensure ClickHouse pods are scheduled on different nodes, improving availability.
  2. The container image and pull policy are explicitly defined.
  3. Volume mounts are correctly set up for both data and log volumes.

Please confirm that the image tag (.Values.image.tag) is the correct version you intend to deploy. Also, ensure that the Always pull policy aligns with your deployment strategy and image versioning approach.


129-137: LGTM: Service template for HTTP and TCP.

The new service template svc-template correctly defines services for both HTTP (port 8123) and TCP (port 9000) protocols. This ensures that the ClickHouse instance is accessible via both protocols.

Please confirm that:

  1. The port numbers (8123 for HTTP and 9000 for TCP) are correct for your ClickHouse configuration.
  2. These ports align with any network policies or security groups in your environment.

50-83: Verify log configurations and retention policies.

The new files section introduces significant changes to logging:

  1. Several log types are disabled, which could reduce disk usage and potentially improve performance.
  2. A 14-day retention policy is set for query logs and query thread logs.

Please confirm that:

  1. Disabling these logs aligns with your monitoring and troubleshooting requirements.
  2. The 14-day retention period for query logs is sufficient for your needs.

Consider running the following command to verify the applied configurations:

This script will display the TTL settings and the structure of the query log tables, allowing you to verify that the configurations have been applied correctly.

✅ Verification successful

Log configurations and retention policies are correctly applied.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Verify ClickHouse log configurations
kubectl exec -it {{ .Release.Name }}-0-0-0 -- clickhouse-client -q "
SELECT * FROM system.merge_tree_settings WHERE name LIKE '%TTL%';
SELECT * FROM system.tables WHERE database = 'system' AND (name = 'query_log' OR name = 'query_thread_log');
SHOW CREATE TABLE system.query_log;
SHOW CREATE TABLE system.query_thread_log;
"

Length of output: 429


Script:

#!/bin/bash
# Verify that log types are being disabled in clickhouse.yaml
grep 'remove="1"' packages/apps/clickhouse/templates/clickhouse.yaml

# Verify TTL settings for query logs in clickhouse.yaml
grep 'TTL event_date + INTERVAL 14 DAY DELETE' packages/apps/clickhouse/templates/clickhouse.yaml

Length of output: 658

@@ -7,10 +7,19 @@ repos:
- id: mixed-line-ending
args: [--fix=lf]
- id: check-yaml
exclude: packages/apps/postgres/templates/init-script.yaml
exclude: '^.*templates/.*\.yaml$'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codebase verification

The updated exclusion pattern '^.*templates/.*\.yaml$' excludes a large number of YAML files within templates directories. While this may prevent linting of non-critical template files, it could also unintentionally skip important configuration files that require validation.

Please review the excluded YAML files to ensure that no essential configuration files are being omitted from linting. Consider refining the exclusion pattern to target only non-critical templates.

🔗 Analysis chain

LGTM! Verify the impact of the new exclusion pattern.

The updated exclusion pattern '^.*templates/.*\.yaml$' will exclude all YAML files within any templates directory from YAML checking. This change seems appropriate for handling template files that might not conform to standard YAML linting rules.

Please verify that this exclusion doesn't unintentionally skip important YAML files that should be checked. Run the following script to list all YAML files that will now be excluded:

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: List all YAML files in templates directories that will be excluded from YAML checking

echo "YAML files in templates directories that will be excluded:"
fd -e yaml -p '.*templates/.*\.yaml$'

Length of output: 64270

@@ -20,7 +25,7 @@
"storageClass": {
"type": "string",
"description": "StorageClass used to store the data",
"default": ""
"default": "local"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codebase verification

Potential impacts of changing the default storageClass for Clickhouse

The change of the default storageClass from an empty string to "local" for Clickhouse raises several concerns:

  1. Inconsistency: This change deviates from the pattern used in most other applications in the codebase, which typically default to an empty string or null.

  2. Deployment risks: Setting a specific storage class as default might cause issues in Kubernetes clusters where the "local" storage class is not available or not the intended default.

  3. Performance implications: The "local" storage class typically refers to node-local storage, which may have different performance characteristics compared to other storage classes. This could affect Clickhouse's behavior and performance.

  4. Portability: This change might reduce the portability of the Clickhouse deployment across different Kubernetes environments.

  5. Unintended consequences: Other parts of the system might rely on the default behavior of using the cluster's default storage class, which this change overrides.

Given these potential impacts, it's recommended to:

  • Clarify the specific requirements that led to this change for Clickhouse.
  • Ensure that all target Kubernetes environments support and are configured for the "local" storage class.
  • Consider adding documentation or comments explaining this specific configuration for Clickhouse.
  • Evaluate if this change should be made configurable rather than a hard-coded default.

Please review these concerns and provide more context on the reasoning behind this change.

🔗 Analysis chain

Verify the impact of changing the default storageClass.

The default value for storageClass has been changed from an empty string to "local". This modification could have significant implications:

  1. It may affect the deployment behavior on different Kubernetes clusters.
  2. The "local" storage class might not be available on all Kubernetes clusters.
  3. This change could potentially break existing deployments that rely on the default being an empty string.

Could you please provide more context on why this change was made? Also, consider the following:

  • Has this change been tested on different Kubernetes environments?
  • Is there a migration plan for existing deployments?
  • Should this change be documented in release notes or upgrade instructions?

To help assess the impact, please run the following script to check for usage of the storageClass property in the codebase:


Further verification needed for storageClass usage in the codebase.

To ensure that all instances of storageClass have been accounted for, please run the following script to search for storageClass across all file types:

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check for usage of storageClass in the codebase

# Search for storageClass usage in yaml files
echo "Searching for storageClass usage in yaml files:"
rg --type yaml 'storageClass:' -C 3

# Search for storageClass references in go files
echo "Searching for storageClass references in go files:"
rg --type go 'storageClass' -C 3

# Search for storageClass references in helm templates
echo "Searching for storageClass references in helm templates:"
fd -e tpl -e yaml -e yml -x rg 'storageClass' {} -C 3

Length of output: 92923


Script:

# !/bin/bash
# Description: Comprehensive search for storageClass across all files

echo "Searching for storageClass usage in all files:"
rg 'storageClass' -C 3

Length of output: 192503

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Outside diff range and nitpick comments (2)
packages/apps/clickhouse/values.schema.json (1)

30-33: LGTM: New logTTL property added with a minor suggestion

The addition of the logTTL property is a good improvement, allowing users to configure the log retention period for query_log and query_thread_log.

Consider clarifying the time unit in the description. For example:

"description": "Log retention period (in days) for query_log and query_thread_log",

This will make it explicit that the value is in days, assuming that's the correct unit.

packages/apps/clickhouse/templates/clickhouse.yaml (1)

108-129: LGTM with a suggestion for imagePullPolicy.

The addition of the clickhouse-per-host pod template is well-structured and includes important configurations:

  1. The affinity rules ensure proper distribution of pods across hosts, enhancing high availability.
  2. Container specifications and volume mounts are correctly defined.

However, there's one point to consider:

The imagePullPolicy: Always setting (line 123) might not be ideal for production environments. While it ensures the latest image is always used, it can lead to unexpected behavior if the image tag is updated without your knowledge.

Consider changing the imagePullPolicy to IfNotPresent for production environments:

-              imagePullPolicy: Always
+              imagePullPolicy: IfNotPresent

This ensures that a specific, known version of the image is used, providing better control and predictability in production.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Files that changed from the base of the PR and between bb3c429 and f48a89a.

📒 Files selected for processing (3)
  • packages/apps/clickhouse/templates/clickhouse.yaml (2 hunks)
  • packages/apps/clickhouse/values.schema.json (3 hunks)
  • packages/apps/clickhouse/values.yaml (2 hunks)
🧰 Additional context used
🔇 Additional comments (12)
packages/apps/clickhouse/values.yaml (4)

50-52: LGTM: Good practice for version control.

The addition of the image section with specific repository and tag is a positive change. It enhances version control and reproducibility of deployments.

Please verify that version "24.2.2.71" is the intended version for your deployment. You can check the available tags for the ClickHouse image and compare with the current deployed version using:

#!/bin/bash
# Description: Check available ClickHouse image tags and current deployed version

# Check available tags
echo "Available ClickHouse image tags:"
curl -s https://hub.docker.com/v2/repositories/clickhouse/clickhouse-server/tags/?page_size=10 | jq -r '.results[].name'

# Check current deployed version
echo "Current deployed ClickHouse version:"
pod=$(kubectl get pods -l app=clickhouse -o jsonpath='{.items[0].metadata.name}')
kubectl exec $pod -- clickhouse-client -q "SELECT version()"

Ensure this version is compatible with your current setup and meets your requirements.


14-14: Verify implications of using "local" storage class.

The change from an empty string to "local" for storageClass is noted. This can provide better control over storage provisioning.

However, please ensure that:

  1. The "local" storage class is properly set up in your Kubernetes cluster.
  2. You understand the implications for data persistence and node affinity.

You can verify the available storage classes and their provisioners with:

#!/bin/bash
# Description: Check available storage classes and their provisioners

kubectl get storageclass -o custom-columns=NAME:.metadata.name,PROVISIONER:.provisioner

Also, verify that the "local" storage class meets your requirements for data persistence and availability.


15-15: LGTM: Good addition for log retention management.

The introduction of logTTL parameter is a positive change. It helps in managing log retention and can prevent unnecessary storage consumption.

Please verify that the TTL value of 15 (presumably days) aligns with your log retention requirements and any applicable regulations. You can check the current log retention settings with:

#!/bin/bash
# Description: Check the current log retention settings in ClickHouse

# Get the ClickHouse pod name
pod=$(kubectl get pods -l app=clickhouse -o jsonpath='{.items[0].metadata.name}')

# Check the TTL settings for system tables
echo "Current TTL settings for system tables:"
kubectl exec $pod -- clickhouse-client -q "SELECT name, engine, partition_key, sorting_key, primary_key, sampling_key, storage_policy, ttl_expression FROM system.tables WHERE database = 'system' AND name LIKE '%log%'"

11-11: LGTM: Good addition for log storage management.

The introduction of logStorageSize parameter is a positive change. It allows for separate management of log storage, which can be beneficial for performance and maintenance.

Consider monitoring the log volume over time to ensure 2Gi is sufficient for your use case. You can use the following command to check the current log volume:

packages/apps/clickhouse/values.schema.json (2)

10-14: LGTM: New logStorageSize property added

The addition of the logStorageSize property is a good improvement. It allows users to configure the persistent volume size for logs separately from the main data storage. The default value of "2Gi" seems reasonable for most use cases.


28-28: ⚠️ Potential issue

Reconsider changing the default storageClass to "local"

The change of the default storageClass from an empty string to "local" raises several concerns:

  1. It deviates from the pattern used in most other applications in the codebase.
  2. It might cause deployment issues in Kubernetes clusters where the "local" storage class is not available.
  3. It could affect ClickHouse's behavior and performance, as "local" typically refers to node-local storage.
  4. It might reduce the portability of the ClickHouse deployment across different Kubernetes environments.

Consider reverting this change or providing more context on why it's necessary. If the change is required, please ensure:

  • All target Kubernetes environments support and are configured for the "local" storage class.
  • Add documentation explaining this specific configuration for ClickHouse.
  • Consider making this change configurable rather than a hard-coded default.
packages/apps/clickhouse/templates/clickhouse.yaml (6)

130-138: LGTM. Well-defined service template.

The addition of the svc-template service template is well-structured and includes important configurations:

  1. It correctly exposes the necessary ports for ClickHouse operations (HTTP on 8123 and TCP on 9000).
  2. The use of named ports (http and tcp) is a good practice, enhancing clarity and facilitating potential future use in network policies.

This template will ensure consistent and correct service configurations across your ClickHouse deployments.


100-100: LGTM. Verify the size variable definition.

Using a variable for the storage size ({{ .Values.size }}) is a good practice as it allows for flexible configuration across different environments. This change improves the template's reusability.

To ensure the size variable is properly defined, please run:

#!/bin/bash
# Check if size is defined in values.yaml
rg --type yaml "size:" packages/apps/clickhouse/values.yaml

This will confirm that the size variable is defined in the values file with an appropriate default value.


101-107: LGTM. Verify the logStorageSize variable.

The addition of a separate volume claim template for logs is a good practice. It allows for independent scaling and management of log storage. Using a dedicated variable ({{ .Values.logStorageSize }}) for log storage size provides flexibility in configuration.

To ensure the logStorageSize variable is properly defined, please run:

#!/bin/bash
# Check if logStorageSize is defined in values.yaml
rg --type yaml "logStorageSize:" packages/apps/clickhouse/values.yaml

This will confirm that the logStorageSize variable is defined in the values file with an appropriate default value.


35-35: LGTM. Verify the namespace domain pattern.

The addition of namespaceDomainPattern is a good practice for specifying the service discovery domain structure. However, please ensure that the pattern "%s.svc.cozy.local" aligns with your cluster's DNS configuration and naming conventions.

To verify the correctness of this domain pattern, you can run:

This will help identify if this domain pattern is used consistently across other configuration files in the project.


39-40: LGTM. Verify template definitions.

The addition of default podTemplate and serviceTemplate is a good practice for maintaining consistency across the ClickHouse deployment. This change will help standardize the configuration of pods and services.

To ensure these templates are properly defined, please run:

This will confirm that the referenced templates are indeed defined within the file.

✅ Verification successful

Template definitions verified.

The podTemplate and serviceTemplate are correctly defined in packages/apps/clickhouse/templates/clickhouse.yaml, ensuring consistency across the ClickHouse deployment.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Verify if the pod and service templates are defined in the file
rg --type yaml "name: clickhouse-per-host" packages/apps/clickhouse/templates/clickhouse.yaml
rg --type yaml "name: svc-template" packages/apps/clickhouse/templates/clickhouse.yaml

Length of output: 629


Script:

#!/bin/bash
# Search the entire repository for definitions of 'clickhouse-per-host' and 'svc-template'
rg --type yaml "clickhouse-per-host" .
rg --type yaml "svc-template" .

Length of output: 418


50-84: LGTM. Consider the impact on monitoring and debugging.

The addition of the files section with configurations to disable various logs and set up query log retention is a good approach to manage log storage and potentially improve performance. The TTL configuration for query logs is particularly useful for long-term log management.

However, please consider the following:

  1. Ensure that disabling these logs doesn't negatively impact your ability to monitor and debug the system.
  2. Verify that the {{ .Values.logTTL }} variable is properly defined and its value is appropriate for your use case.

To verify the log TTL configuration, please run:

This will help ensure that the log TTL is properly configured and that no critical logs are accidentally disabled.

✅ Verification successful

Verified Log Configurations

The logTTL is correctly set to 15 in values.yaml. Additionally, disabling the specified logs in clickhouse.yaml does not impact other parts of the codebase, ensuring that monitoring and debugging capabilities remain intact.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Check if logTTL is defined in values.yaml
rg --type yaml "logTTL:" packages/apps/clickhouse/values.yaml

# Verify if any critical logs are accidentally disabled
rg --type yaml "<\w+_log remove=\"1\"/>" packages/apps/clickhouse/templates/clickhouse.yaml

Length of output: 629


Script:

#!/bin/bash
# Corrected: Check if logTTL is defined in values.yaml
rg --type yaml "logTTL:" packages/apps/clickhouse/values.yaml

# Verify if any critical logs are accidentally disabled
rg --type yaml "<\w+_log remove=\"1\"/>" packages/apps/clickhouse/templates/clickhouse.yaml

Length of output: 494


Script:

#!/bin/bash
# Search for references to disabled logs in the codebase

logs=(
    "asynchronous_metric_log"
    "metric_log"
    "part_log"
    "session_log"
    "trace_log"
    "crash_log"
    "opentelemetry_span_log"
    "processors_profile_log"
)

for log in "${logs[@]}"; do
    echo "Searching for references to ${log}..."
    rg --type yaml "${log}" packages/apps/clickhouse/** || echo "No references found for ${log}."
done

Length of output: 4535

Comment on lines 49 to 52

image:
repository: clickhouse/clickhouse-server
tag: "24.2.2.71"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can't give users the oportunity to override image, because this will allow them to run any possible image in management cluster

Copy link
Member

@kvaps kvaps left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@kvaps kvaps merged commit 18bbdb6 into aenix-io:main Oct 21, 2024
1 check failed
@coderabbitai coderabbitai bot mentioned this pull request Oct 22, 2024
@coderabbitai coderabbitai bot mentioned this pull request Nov 15, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants