fix: Add TTL cleanup for OpenFGA migration jobs to resolve HPA metric failures #281

keyurbhole · 2026-01-27T10:21:02Z

Description

What problem is being solved?

When running OpenFGA with a Horizontal Pod Autoscaler (HPA) in Kubernetes, completed migration Job pods can cause HPA metric collection to fail. Because the migration pods lack resource requests, the HPA reports FailedGetResourceMetric errors and shows `<unknown>` for CPU and memory utilization, which prevents proper autoscaling.

How is it being solved?

By adding a configurable ttlSecondsAfterFinished option to the migration Job spec. This leverages Kubernetes' built-in TTL controller to automatically delete the finished Job, and with it the completed pods, after a specified duration.
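
As a rough illustration of where the field sits, here is a minimal sketch of a rendered migration Job. The Job name, container name, and image tag are hypothetical placeholders and are not taken from the chart; only `ttlSecondsAfterFinished` and its position directly under `spec` reflect the change described here.

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: openfga-migrate              # hypothetical name, for illustration only
spec:
  # Once the Job finishes (Complete or Failed), the TTL controller
  # deletes the Job and its pods after this many seconds.
  ttlSecondsAfterFinished: 300
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: migrate-database     # hypothetical container name
          image: openfga/openfga:latest   # illustrative tag, not the chart's pinned version
          args: ["migrate"]
```

Once the Job object is gone, its finished pods no longer exist for the HPA to pick up, which is the effect this PR relies on.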

What changes are made to solve it?

- Added ttlSecondsAfterFinished field to the migration Job template
- Added corresponding value in values.yaml with a default of 300 seconds (5 minutes)
- Included inline documentation explaining the purpose of the configuration

References

N/A

Review Checklist

[x] I have clicked on "allow edits by maintainers".
[ ] I have added documentation for new/changed functionality in this PR or in a PR to openfga.dev [Provide a link to any relevant PRs in the references section above]
[x] The correct base branch is being used, if not main
[ ] I have added tests to validate that the change in functionality is working as expected

Summary by CodeRabbit

Chores
- Updated Helm chart version to 0.2.52
New Features
- Added automatic Job cleanup configuration for migration jobs. Migration Job pods can now be automatically deleted after completion with a configurable duration (in seconds). The default cleanup duration is set to 300 seconds but can be customized through your Helm values configuration to meet your specific operational requirements.


linux-foundation-easycla · 2026-01-27T10:21:10Z

The committers listed above are authorized under a signed CLA.

✅ login: keyurbhole / name: Keyur Bhole (2850056, c97391a)

coderabbitai · 2026-01-27T10:21:32Z

Walkthrough

The Helm chart is bumped from version 0.2.51 to 0.2.52 and adds support for configuring automatic cleanup of migration Job pods through a new ttlSecondsAfterFinished field with a default value of 300 seconds.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Chart Metadata `charts/openfga/Chart.yaml` | Version bumped from 0.2.51 to 0.2.52 |
| Migration Job TTL Configuration `charts/openfga/templates/job.yaml`, `charts/openfga/values.schema.json`, `charts/openfga/values.yaml` | Added `ttlSecondsAfterFinished` field to control automatic cleanup of completed migration Job pods, preventing interference with HPA metric calculations; includes schema definition with default 300 seconds and template conditional rendering. Minor YAML formatting adjustments to empty maps. |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Possibly related PRs

chore(openfga): bump OpenFGA Helm chart appVersion #239: Modifies the Chart.yaml version field in another chart version bump

Suggested reviewers

rhamzeh

🚥 Pre-merge checks | ✅ 3

✅ Passed checks (3 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly and specifically describes the main change: adding TTL cleanup configuration for OpenFGA migration jobs to fix HPA metric failures. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |


coderabbitai

Actionable comments posted: 2

🤖 Fix all issues with AI agents

In `@charts/openfga/templates/job.yaml`:
- Around line 16-18: The template currently omits ttlSecondsAfterFinished when
the value is 0 because the truthy check drops falsy values; update the
conditional to detect presence instead of truthiness so zero is rendered.
Replace the existing "if .Values.migrate.ttlSecondsAfterFinished" check with a
presence check such as "if hasKey .Values.migrate \"ttlSecondsAfterFinished\""
(or use a default-nil comparison like "if (not (eq
(.Values.migrate.ttlSecondsAfterFinished | default nil) nil))") so the
ttlSecondsAfterFinished field is emitted even when set to 0.

In `@charts/openfga/values.yaml`:
- Around line 356-358: The values.yaml currently leaves
migrate.ttlSecondsAfterFinished empty so Helm won’t set the runtime default;
update the chart values to set migrate.ttlSecondsAfterFinished: 300 so the
migration Job pods are automatically deleted after 300 seconds (ensure the key
name migrate.ttlSecondsAfterFinished is used exactly and saved in
charts/openfga/values.yaml).

coderabbitai · 2026-01-27T10:28:40Z

charts/openfga/templates/job.yaml

+  {{- if .Values.migrate.ttlSecondsAfterFinished }}
+  ttlSecondsAfterFinished: {{ .Values.migrate.ttlSecondsAfterFinished }}
+  {{- end }}


⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

git ls-files | grep -E "(job\.yaml|Chart\.yaml)"

Repository: openfga/helm-charts

Length of output: 151

🏁 Script executed:

cat -n charts/openfga/templates/job.yaml | head -30

Repository: openfga/helm-charts

Length of output: 1213

🏁 Script executed:

fd -e "values\.yaml" charts/openfga

Repository: openfga/helm-charts

Length of output: 424

🏁 Script executed:

fd "values" charts/openfga

Repository: openfga/helm-charts

Length of output: 123

🏁 Script executed:

cat -n charts/openfga/values.yaml | grep -A 10 -B 5 "ttlSecondsAfterFinished"

Repository: openfga/helm-charts

Length of output: 711

🏁 Script executed:

cat -n charts/openfga/values.schema.json | grep -A 20 -B 5 "ttlSecondsAfterFinished"

Repository: openfga/helm-charts

Length of output: 1326

Allow 0 TTL values to render.
Using a truthy check skips the field when users set 0 for immediate cleanup.

🔧 Proposed fix

-  {{- if .Values.migrate.ttlSecondsAfterFinished }}
+  {{- if ne .Values.migrate.ttlSecondsAfterFinished nil }}
   ttlSecondsAfterFinished: {{ .Values.migrate.ttlSecondsAfterFinished }}
   {{- end }}
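
For comparison, another way to express a presence check rather than a truthiness check is Sprig's `kindIs` function, which reports the kind `invalid` for a nil value. The fragment below is a sketch only. It assumes `.Values.migrate` is always defined by the chart, and it has not been run against this chart.

```yaml
spec:
  # Render the field whenever a value is set, including 0 (immediate cleanup).
  # An unset (nil) value has kind "invalid", so the field is omitted in that case.
  {{- if not (kindIs "invalid" .Values.migrate.ttlSecondsAfterFinished) }}
  ttlSecondsAfterFinished: {{ .Values.migrate.ttlSecondsAfterFinished }}
  {{- end }}
```

With this form, a value of `0` should still render the field, while leaving the key empty omits it and keeps the Job until it is removed by other means.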


coderabbitai · 2026-01-27T10:28:41Z

charts/openfga/values.yaml

+  # Automatically delete migration Job pods after completion (in seconds).
+  # Prevents completed pods from interfering with HPA metric calculations.
+  ttlSecondsAfterFinished:


⚠️ Potential issue | 🟠 Major

Set the runtime default for migrate.ttlSecondsAfterFinished.
Right now it’s empty in values.yaml, so Helm won’t apply the intended default of 300s (schema defaults don’t set runtime values). That leaves cleanup disabled by default.

🔧 Proposed fix

-  ttlSecondsAfterFinished:
+  ttlSecondsAfterFinished: 300
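
To make the point about runtime defaults concrete, here is a sketch of how the relevant part of `charts/openfga/values.yaml` might look with the default set explicitly. Only the `migrate.ttlSecondsAfterFinished` key and the two comment lines come from this PR; any surrounding keys are omitted.

```yaml
migrate:
  # Automatically delete migration Job pods after completion (in seconds).
  # Prevents completed pods from interfering with HPA metric calculations.
  # The default must be set here: a "default" in values.schema.json is only
  # used for validation and is not injected into the rendered values.
  ttlSecondsAfterFinished: 300
```

A user who wants a different retention period could then override it at install or upgrade time, for example with `--set migrate.ttlSecondsAfterFinished=600`.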


keyurbhole added 2 commits January 27, 2026 15:37

fix: auto-cleanup openfga migration pods after completion

c97391a

fix: auto-cleanup openfga migration pods after completion

2850056

keyurbhole requested review from a team as code owners January 27, 2026 10:21

coderabbitai bot reviewed Jan 27, 2026
