@jubrad
Collaborator

@jubrad jubrad commented Dec 6, 2025

- Add `materialize_spec_override` to materialize-instance module for arbitrary CRD fields
- Expose `helm_values_override` and `materialize_spec_override` in all cloud examples
- Change default authenticator_kind to Password for all deployments
- Add deepmerge provider for spec merging
- Add Python script to generate Terraform types from upstream CRD/Helm schemas
- Set up pyproject.toml with uv for dependency management
- Add GitHub Action to verify schema sync with upstream Materialize
- Update CONTRIBUTING.md with development setup and type generation docs

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Auto-generating Terraform variables for our CRD and Helm chart is something I've been thinking about for a while, and this is a proof of concept for it.

There are many benefits to having a typed object or set of variables that works with an LSP or AI tooling to help write code (as opposed to object(any)). It's also extremely beneficial to be able to set any valid attribute of the Helm chart or the Materialize CR.
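
As an illustration of the override idea, here is a minimal Python sketch of a recursive deep merge, the kind of merge an override object might undergo against module defaults. The `deep_merge` helper and the sample keys and values are hypothetical and not taken from this PR or from the deepmerge provider, whose actual semantics (for example, how lists are combined) may differ.

```python
from copy import deepcopy


def deep_merge(base: dict, override: dict) -> dict:
    """Return a new dict with `override` recursively merged over `base`."""
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are dicts: merge field by field instead of replacing.
            merged[key] = deep_merge(merged[key], value)
        else:
            # Scalars, lists, and new keys: the override wins outright.
            merged[key] = deepcopy(value)
    return merged


# Hypothetical defaults and user override, for illustration only.
defaults = {"kind": "Password", "resources": {"cpu": "1", "memory": "4Gi"}}
override = {"resources": {"memory": "8Gi"}, "extra": "value"}
result = deep_merge(defaults, override)
```

Only the keys named in the override change; sibling keys such as `resources.cpu` keep their default values, and the inputs are left unmodified.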

The PR has been updated to be a bit more reviewable:
- Commit 1: adds the generation script
- Commit 2: adds the GH Action and docs
- Commit 3: adds the auto-generated files and the deprecations

@jubrad jubrad force-pushed the auto-generate-tfvars branch 2 times, most recently from f6c168a to e83a36e Compare December 6, 2025 04:45
@jshiwamV
Collaborator

jshiwamV commented Dec 6, 2025

This looks promising; I'll take a detailed look on Monday.

Contributor

@alex-hunt-materialize alex-hunt-materialize left a comment

This is a great idea, but I really don't want us manually parsing and building HCL. Terraform supports json, or we could use a proper HCL parser. I lean toward just converting it all to json, as that opens the door for customers to use their own tools for this too.

I didn't review most of this PR, as it is just a draft. I only did a quick skim of the generator code.


# Rollout configuration
force_rollout = var.force_rollout
request_rollout = var.request_rollout
Contributor

We should get consistent about how we handle this stuff. These are removed, while most of the others are marked as deprecated.

Collaborator Author

yeah, marking this as deprecated is probably for the best.

@jubrad
Collaborator Author

jubrad commented Dec 8, 2025

> This is a great idea, but I really don't want us manually parsing and building HCL. Terraform supports json, or we could use a proper HCL parser. I lean toward just converting it all to json, as that opens the door for customers to use their own tools for this too.

> Should we use an HCL parser instead?
> https://pypi.org/project/python-hcl2/
> We could also rewrite the variables file into json, which both python and terraform can parse easily.

Great ideas, I'll give both a shot.

@jubrad jubrad force-pushed the auto-generate-tfvars branch 3 times, most recently from 6a3990f to 63cb495 Compare December 13, 2025 04:24
@jubrad
Collaborator Author

jubrad commented Dec 13, 2025

@alex-hunt-materialize I went down a rabbit hole with variables in .tf.json... the variable type ends up being a single string with horrific formatting. We could also use python-hcl2 to output HCL, but it formats complex variables just as badly. Hand-rolling the HCL generation and formatting is potentially the ugliest option to code up, but it generates the cleanest output, which I think is worth prioritizing.
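
To make the formatting concern concrete, here is a small Python sketch of what a variable looks like in Terraform's JSON syntax, where the type constraint is written as one string containing the whole type expression. The variable name and its fields are hypothetical and chosen only for illustration.

```python
import json

# Hypothetical variable definition, for illustration only.
variable_file = {
    "variable": {
        "example_override": {
            "description": "Illustrative nested override variable.",
            # In JSON syntax the whole type constraint is one string,
            # no matter how deeply the object type is nested.
            "type": "object({enabled = optional(bool), labels = optional(map(string))})",
        }
    }
}

rendered = json.dumps(variable_file, indent=2)
print(rendered)
```

However the JSON is indented, the nested type stays on a single line inside a string, so neither a JSON pretty-printer nor a reader gets any structure to work with.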

@jubrad jubrad force-pushed the auto-generate-tfvars branch 2 times, most recently from ec06ba6 to 83e2a04 Compare December 13, 2025 05:26
jubrad and others added 3 commits December 12, 2025 23:32
Introduces generate_terraform_types.py which dynamically generates
Terraform variable type definitions from Materialize's upstream schemas:
- CRD types from materialize_crd_descriptions.json
- Helm parameter types from materialize_operator_chart_parameter.yml

The script reads the version from the environmentd_version variable default
in the source code and outputs native HCL format (.gen.tf) with proper
formatting via terraform fmt.
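
A rough idea of what hand-rolled HCL output can look like, as a Python sketch. The `render_type` and `render_variable` helpers and the sample schema are hypothetical and are not the functions in generate_terraform_types.py; the sketch also leaves alignment of `=` signs to a later `terraform fmt` pass.

```python
def render_type(node, indent=0):
    """Render a nested dict of field -> type as a multi-line HCL object type."""
    if isinstance(node, str):
        return node  # already a type expression such as "bool" or "map(string)"
    pad = "  " * (indent + 1)
    lines = ["object({"]
    for field, field_type in sorted(node.items()):
        lines.append(f"{pad}{field} = optional({render_type(field_type, indent + 1)})")
    lines.append("  " * indent + "})")
    return "\n".join(lines)


def render_variable(name, type_tree):
    """Wrap a rendered type in a variable block with a null default."""
    body = render_type(type_tree, 1)
    return f'variable "{name}" {{\n  type = {body}\n  default = null\n}}\n'


# Hypothetical schema, for illustration only.
schema = {"enabled": "bool", "labels": "map(string)", "resources": {"cpu": "string"}}
output = render_variable("example", schema)
print(output)
```

Each nesting level gets its own lines and indentation, which is the readability gain over a single-string type.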

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

- Add check_schema_sync.py to verify generated types match upstream schemas
- Add GitHub Actions workflow to run sync check on PRs
- Update CONTRIBUTING.md with instructions for regenerating types

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Generated type definitions:
- kubernetes/modules/materialize-instance/crd_variables.gen.tf
- {aws,azure,gcp}/modules/operator/helm_variables.gen.tf
- {aws,azure,gcp}/examples/simple/override_variables.gen.tf

Additional changes:
- Add helm_values_override and materialize_spec_override variables to
  operator modules and examples for full customization
- Add deprecated force_rollout and request_rollout variables to examples
  for backwards compatibility, merged into materialize_spec_override
- Update materialize-instance module to use the new override variable

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Contributor

@alex-hunt-materialize alex-hunt-materialize left a comment

Overall this looks pretty good. I do worry a bit about how we will test this, as it is a pretty significant change.

if crd_type in type_lookup:
    # Detect cycles to prevent infinite recursion
    if crd_type in visited:
        return "any"
Contributor

Are we actually hitting cycles? Should we log something here, so we can fix that?
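
One way to surface cycles instead of silently returning "any" is to log a warning at the point of detection. The following Python sketch is a simplified, hypothetical resolver (not the PR's `crd_type_to_terraform`); the logger name, the `resolve_type` function, and the shape of `type_lookup` are assumptions made for illustration.

```python
import logging

logger = logging.getLogger("generate_terraform_types")


def resolve_type(name, type_lookup, visited=frozenset()):
    """Resolve a named type to a Terraform-style type string (simplified)."""
    if name not in type_lookup:
        return name  # assume it is already a primitive such as "string"
    if name in visited:
        # Report the cycle so the schema or generator can be fixed later.
        logger.warning("cycle detected at %r (path: %s); using 'any'", name, sorted(visited))
        return "any"
    seen = visited | {name}
    parts = []
    for field, field_type in sorted(type_lookup[name].items()):
        resolved = resolve_type(field_type, type_lookup, seen)
        parts.append(f"{field} = optional({resolved})")
    return "object({" + ", ".join(parts) + "})"


# Hypothetical self-referential type, for illustration only.
lookup = {"Node": {"name": "string", "next": "Node"}}
resolved_node = resolve_type("Node", lookup)
```

With this input the self-reference in `next` resolves to `any` and a warning naming `Node` is emitted, which answers whether cycles actually occur on real schemas.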

if len(parts) == 2:
    value_type = crd_type_to_terraform(parts[1], type_lookup, visited)
    if value_type == "string":
        return "map(string)"
Contributor

These two lines are useless, as they evaluate to the same as the line below.
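
The redundancy can be checked with a short Python sketch. It assumes the line following the snippet is a general fall-through of the form `return f"map({value_type})"`, which is not shown in this excerpt; both function names are hypothetical.

```python
def map_type_with_special_case(value_type: str) -> str:
    """Shape of the reviewed code, assuming a general fall-through line."""
    if value_type == "string":
        return "map(string)"
    return f"map({value_type})"  # assumed fall-through, not shown in the excerpt


def map_type(value_type: str) -> str:
    """Simplified version with the special case removed."""
    return f"map({value_type})"


candidates = ("string", "number", "bool", "any")
results_match = all(
    map_type_with_special_case(candidate) == map_type(candidate)
    for candidate in candidates
)
```

Under that assumption the two functions agree for every input, so the `"string"` branch can be dropped without changing the output.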

affinity = optional(any)
defaultResources = optional(any)
enabled = optional(bool)
nodeSelector = optional(any)
Contributor

Many of these are labeled any. We should at least be able to indicate object(any) for the ones that are objects, even if we can't easily say they are a map of string to string.

I have an update to the auto-generated helm values documentation in MaterializeInc/materialize#34563 which may help with this, if you decide to use the helm-docs generated file.

Comment on lines +22 to +28
environmentdExtraEnv = length(var.environmentd_extra_env) > 0 ? [{
  name = "MZ_SYSTEM_PARAMETER_DEFAULT"
  value = join(";", [
    for item in var.environmentd_extra_env :
    "${item.name}=${item.value}"
  ])
}] : null
Contributor

Maybe not something to fix in this PR, but this seems wrong, and it was wrong before this PR. The environmentd_extra_env items should not be assumed to be system parameter defaults; they could be any environment variable.
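
The difference can be sketched in Python. The first function mirrors what the HCL expression above does; the second is one possible reading of the pass-through behavior the comment asks for. Both function names and the sample variable names are hypothetical.

```python
def as_system_parameter_default(extra_env):
    """Mirror the reviewed HCL: fold every item into one MZ_SYSTEM_PARAMETER_DEFAULT variable."""
    if not extra_env:
        return None
    joined = ";".join(f"{item['name']}={item['value']}" for item in extra_env)
    return [{"name": "MZ_SYSTEM_PARAMETER_DEFAULT", "value": joined}]


def as_plain_env(extra_env):
    """Possible alternative: pass each item through as its own environment variable."""
    if not extra_env:
        return None
    return [{"name": item["name"], "value": item["value"]} for item in extra_env]


# Hypothetical input, for illustration only.
extra_env = [{"name": "EXAMPLE_A", "value": "1"}, {"name": "EXAMPLE_B", "value": "two"}]
folded = as_system_parameter_default(extra_env)
passthrough = as_plain_env(extra_env)
```

In the folded form neither `EXAMPLE_A` nor `EXAMPLE_B` exists as an environment variable in its own right; both end up inside the value of a single variable.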

"string": "any", # string in helm params often means complex YAML that's stringified
"bool": "bool",
"int": "number",
"object": "any", # object without more info maps to any
Contributor

Shouldn't this map to object(any)?

# Run terraform fmt to ensure proper formatting
run_terraform_fmt(file_path)

return True # We always write, so always return True for simplicity
Contributor

Why even have these return a bool?
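
If the boolean is kept, one option is to make it carry information by reporting whether the file on disk actually changed. This is a minimal Python sketch with a hypothetical `write_if_changed` helper; it leaves out the `terraform fmt` step, so in the real script the comparison would need to be made against formatted content. The other option is simply to return nothing.

```python
import tempfile
from pathlib import Path


def write_if_changed(path: Path, content: str) -> bool:
    """Write `content` to `path`; return True only if the file on disk changed."""
    if path.exists() and path.read_text(encoding="utf-8") == content:
        return False  # nothing to do, the existing file already matches
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(content, encoding="utf-8")
    return True


# Writing the same content twice: only the first call reports a change.
with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "example.gen.tf"
    first = write_if_changed(target, 'variable "example" {}\n')
    second = write_if_changed(target, 'variable "example" {}\n')
    third = write_if_changed(target, 'variable "other" {}\n')
```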

4 participants