[DCP] Use the image tag instead of latest in TF deployments #19

Open
gmechali wants to merge 6 commits into datacommonsorg:main from gmechali:tagimage

gmechali commented Mar 12, 2026

FYI - this PR is a child of #15

If you use the image tag "latest", Terraform can't tell whether the image behind that tag has changed, so it will not redeploy the new image. This makes updates more annoying for a customer who wants to move to the latest DCP image.

With these changes, the DCP image tag is passed in as a variable that defaults to latest. Running ./update_image_tag.sh modifies the user's terraform.tfvars file to set the latest image's tag, so they can then run terraform plan to see what would be redeployed.
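
A minimal sketch of the terraform.tfvars update step, assuming the variable is named dcp_image_tag and that the tag to apply has already been looked up. set_image_tag is a hypothetical helper written for illustration, not the contents of update_image_tag.sh, and the Artifact Registry lookup is left out.

```shell
#!/usr/bin/env bash
# Sketch only: set dcp_image_tag in a tfvars file to the given tag.
# set_image_tag is a hypothetical helper, not the PR's actual script.
set_image_tag() {
  local tag="$1"
  local tfvars="${2:-terraform.tfvars}"
  local line="dcp_image_tag = \"${tag}\""
  local tmp

  # Create the file if it does not exist yet.
  touch "$tfvars"

  if grep -qE '^[[:space:]]*dcp_image_tag[[:space:]]*=' "$tfvars"; then
    # Replace the existing assignment. A temp file is used instead of
    # sed's in-place flag, which differs between GNU and BSD sed.
    tmp="$(mktemp)"
    sed -E "s|^[[:space:]]*dcp_image_tag[[:space:]]*=.*|${line}|" "$tfvars" > "$tmp"
    cat "$tmp" > "$tfvars"
    rm -f "$tmp"
  else
    # No assignment yet: make sure the file ends with a newline, then append.
    if [ -s "$tfvars" ] && [ -n "$(tail -c 1 "$tfvars")" ]; then
      printf '\n' >> "$tfvars"
    fi
    printf '%s\n' "$line" >> "$tfvars"
  fi
}
```

For example, `set_image_tag v1.2.3` would rewrite terraform.tfvars in the current directory (v1.2.3 is a made-up tag), after which terraform plan should show the Cloud Run image change.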

gmechali requested a review from dwnoble March 12, 2026 20:54
@gemini-code-assist

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the deployment and management of Data Commons Platform (DCP) infrastructure using Terraform. The core improvement addresses a common issue with container image deployments where using the "latest" tag prevents infrastructure-as-code tools like Terraform from detecting and applying updates. By introducing explicit image tag management and an automation script, users can now reliably update their DCP images. Furthermore, the PR modularizes the infrastructure into distinct DCP and legacy CDC stacks, improving organization, flexibility, and maintainability for future deployments.

Highlights

  • Image Tag Management: Introduced a mechanism to use specific image tags instead of "latest" for Terraform deployments, ensuring that image updates trigger re-deployments.
  • Automated Tag Updates: Added a new update_image_tag.sh script to automatically fetch and apply the latest non-"latest" image tag from Google Artifact Registry to terraform.tfvars.
  • New DCP Infrastructure Module: Implemented a new Terraform module (infra/dcp/modules/dcp) for deploying the Data Commons Platform (DCP) stack, including Cloud Run services and Spanner instances.
  • Enhanced CDC Infrastructure Module: Updated the existing Terraform module (infra/dcp/modules/cdc) for the Custom Data Commons (CDC) legacy stack, adding Spanner integration and refining environment variable handling.
  • Improved Terraform Configuration: Expanded the main Terraform configuration (infra/dcp/main.tf) to orchestrate both DCP and CDC modules, along with comprehensive variable definitions and outputs.
  • Documentation and Setup Scripts: Provided a new README.md for the DCP infrastructure and a setup.sh script to streamline the initial configuration of the Terraform GCS backend.
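
The "latest non-latest tag" selection mentioned in the highlights could look roughly like the sketch below. It assumes the tags arrive on stdin one per line, newest first; pick_newest_tag is a hypothetical name, and how the list is fetched from Artifact Registry is deliberately not shown.

```shell
#!/usr/bin/env bash
# Sketch only: print the first tag on stdin that is not "latest".
# Assumes one tag per line, ordered newest first (an assumption, not
# something the PR states). Returns non-zero if no such tag exists.
pick_newest_tag() {
  local tag
  # Drop the literal "latest" tag and blank lines, then keep the first line.
  tag="$(grep -vxF 'latest' | sed '/^[[:space:]]*$/d' | head -n 1)"
  [ -n "$tag" ] || return 1
  printf '%s\n' "$tag"
}
```

Returning a non-zero status when only "latest" is present lets a calling script stop before it touches terraform.tfvars.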

Changelog

  • .gitignore
    • Updated ignored files to include Terraform state, lock files, and Python virtual environment directories.
  • infra/dcp/README.md
    • Added a comprehensive README file detailing the Data Commons Platform (DCP) infrastructure, including prerequisites, setup, deployment, architecture, and troubleshooting.
  • infra/dcp/main.tf
    • Added the main Terraform configuration, defining required providers, enabling necessary Google Cloud APIs, and orchestrating the deployment of DCP and CDC modules.
  • infra/dcp/modules/cdc/locals.tf
    • Added local variable definitions for the CDC module, including dynamic GCS bucket naming, API key handling, Redis instance details, and shared Cloud Run environment variables.
  • infra/dcp/modules/cdc/main.tf
    • Added core Terraform resources for the CDC module, encompassing Cloud SQL, Redis, VPC Access Connector, GCS bucket, API keys, Cloud Run services and jobs, and database initialization.
  • infra/dcp/modules/cdc/outputs.tf
    • Added output variables for the CDC module, exposing Redis and MySQL connection details, user credentials, GCS bucket name, and Cloud Run service URLs.
  • infra/dcp/modules/cdc/service_account.tf
    • Added a dedicated service account for the CDC module, assigned necessary IAM roles, and configured secret management for API keys.
  • infra/dcp/modules/cdc/variables.tf
    • Added input variables for the CDC module, covering project configuration, API keys, GCS settings, MySQL, Cloud Run service parameters, Redis, VPC connector, and Spanner integration.
  • infra/dcp/modules/dcp/cloudrun.tf
    • Added the Cloud Run service definition for the DCP module, specifying container image, resources, ports, environment variables for Spanner, and traffic management.
  • infra/dcp/modules/dcp/iam.tf
    • Added IAM configurations for the DCP module, including a service account for the Cloud Run service, Spanner database user role, and public invoker binding.
  • infra/dcp/modules/dcp/locals.tf
    • Added local variable definitions for the DCP module, primarily for resource name prefixing.
  • infra/dcp/modules/dcp/outputs.tf
    • Added output variables for the DCP module, exposing the Cloud Run service URL, service account email, and Spanner instance/database IDs.
  • infra/dcp/modules/dcp/spanner.tf
    • Added Spanner instance and database resources for the DCP module, with conditional creation and deletion protection settings.
  • infra/dcp/modules/dcp/variables.tf
    • Added input variables for the DCP module, defining project details, image configuration, service parameters, Spanner settings, and deletion protection.
  • infra/dcp/outputs.tf
    • Added root output variables for the overall DCP infrastructure, providing URLs and connection names for both the DCP and CDC stacks.
  • infra/dcp/setup.sh
    • Added a new shell script to guide users through configuring the Terraform GCS backend, creating a GCS bucket, and initializing Terraform.
  • infra/dcp/terraform.tfvars.example
    • Added an example Terraform variables file, illustrating configuration options for global settings, stack toggles, and specific DCP/CDC parameters.
  • infra/dcp/update_image_tag.sh
    • Added a new shell script to automate updating the dcp_image_tag in terraform.tfvars with the latest non-'latest' tag from Google Artifact Registry.
  • infra/dcp/variables.tf
    • Added root input variables for the entire DCP infrastructure, encompassing shared global settings, stack enablement toggles, and detailed configurations for both DCP and CDC modules.

gemini-code-assist bot left a comment

Code Review

The pull request introduces a new Terraform module for deploying the Data Commons Platform (DCP) alongside an existing module for the Custom Data Commons (CDC) stack, allowing users to enable either or both. Key changes include adding new Terraform and configuration files to manage GCP resources like Cloud Run services, Spanner instances, MySQL, and Redis, along with updates to .gitignore and setup scripts. Review comments highlight several improvement opportunities: the .terraform.lock.hcl file should be committed for consistent provider versions, a redundant google_project_service resource in the CDC module needs removal to prevent conflicts, and the FORCE_RESTART environment variable using timestamp() in the CDC module should be removed to avoid unnecessary Cloud Run redeployments. Additionally, the logic for constructing Spanner instance and database names is duplicated across multiple DCP module files and should be refactored into modules/dcp/locals.tf for better maintainability.

terraform.rc
tfplan
infra/dcp/backend.tf
.terraform.lock.hcl

high

The .terraform.lock.hcl file should be committed to version control, not ignored. According to the official Terraform documentation, this file locks the provider versions used for your configuration. Committing it ensures that all team members and CI/CD systems use the exact same provider versions, which prevents inconsistencies and potential "works on my machine" issues.
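
A quick way to check for this, sketched under the assumption that the lock file is listed in .gitignore as an exact line (as in the diff context above). lockfile_is_ignored is a hypothetical helper; glob patterns that would also match the file are not detected, and `git check-ignore` is the more thorough tool inside a real repository.

```shell
#!/usr/bin/env bash
# Sketch only: succeed if the given .gitignore lists the Terraform lock
# file as an exact line. Does not evaluate glob patterns.
lockfile_is_ignored() {
  local gitignore="${1:-.gitignore}"
  [ -f "$gitignore" ] && grep -qxF '.terraform.lock.hcl' "$gitignore"
}
```

If the function succeeds, removing that line from .gitignore and committing .terraform.lock.hcl would address the comment.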

Comment on lines +29 to +43
resource "google_project_service" "required_apis" {
  for_each = toset([
    "run.googleapis.com",
    "sqladmin.googleapis.com",
    "compute.googleapis.com",
    "redis.googleapis.com",
    "secretmanager.googleapis.com",
    "vpcaccess.googleapis.com",
    "artifactregistry.googleapis.com",
    "iam.googleapis.com"
  ])
  project            = var.project_id
  service            = each.value
  disable_on_destroy = false
}

high

This google_project_service resource is redundant. The root module (infra/dcp/main.tf) already enables all the necessary APIs for both the DCP and CDC stacks. This duplication will cause conflicts during terraform apply because both resources will attempt to manage the same project services. Please remove this entire resource block.

Comment on lines +55 to +58
{
  name  = "FORCE_RESTART"
  value = "${timestamp()}"
},

medium

The FORCE_RESTART environment variable is set using timestamp(). This will force the Cloud Run service to redeploy on every terraform apply, even when no configuration has changed. This can lead to unnecessary service restarts and downtime. Since the goal is to move towards immutable image tags, this forced restart mechanism might no longer be necessary or desirable. Consider removing it to avoid unexpected deployments.
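
One way to observe this behavior is `terraform plan -detailed-exitcode`, which exits with 0 when there are no changes, 1 on error, and 2 when the plan contains changes. The sketch below wraps that convention; plan_status is a hypothetical helper, and the expectation that a timestamp() value makes every plan report changes is an assumption to verify against the actual module.

```shell
#!/usr/bin/env bash
# Sketch only: run the given command and translate its exit status using
# the convention of `terraform plan -detailed-exitcode`:
#   0 = no changes, 1 = error, 2 = changes present.
plan_status() {
  local rc=0
  # Capture the status without tripping `set -e` in a calling script.
  "$@" >/dev/null 2>&1 || rc=$?
  case "$rc" in
    0) echo "no-changes" ;;
    2) echo "changes-pending" ;;
    *) echo "error" ;;
  esac
}
```

Usage would be `plan_status terraform plan -detailed-exitcode`. Once FORCE_RESTART is removed and the image tag is pinned, two consecutive runs with no edits in between should report no-changes.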
