Dataflow Template for Firestore to Firestore Documents Copy (Clone of PR 3098) #3392

Open
jingqizz wants to merge 49 commits into GoogleCloudPlatform:main from jingqizz:pr3098-noloadtest

jingqizz commented Feb 20, 2026

Implements a basic Dataflow template for copying documents from one Firestore database to another. (Patched from #3098)

Co-authored-by: pacoavila808 <pacoavila@google.com>

@gemini-code-assist

Summary of Changes

Hello @jingqizz, I'm Gemini Code Assist [1]! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request adds a Dataflow template that batch-copies documents from one Firestore database to another, supporting data migration and replication workflows. It also adds utility classes for managing Firestore resources in integration tests, updated project configuration, and documentation covering usage and deployment.

Highlights

  • New Dataflow Template: Introduced a new Dataflow template named 'Firestore to Firestore' for batch copying documents between two Firestore databases.
  • Firestore Resource Management: Added new resource manager classes (FirestoreAdminResourceManager and FirestoreResourceManager) and associated exceptions for managing Firestore databases and collections within integration tests.
  • Beam Transforms for Firestore: Implemented custom Apache Beam PTransform and DoFn classes (CreatePartitionQueryRequestFn, PrepareWritesFn, RunQueryResponseToDocumentFn) to handle Firestore-specific operations like partitioning queries, preparing writes, and extracting documents.
  • Dependency Updates and Module Integration: Updated the it/google-cloud-platform/pom.xml to include necessary Firestore dependencies and added the new firestore-to-firestore module to the parent v2/pom.xml.
  • Documentation and Terraform Support: Provided comprehensive documentation (README_Cloud_Firestore_to_Firestore.md) for the new template, including usage instructions and a Terraform configuration (dataflow_job.tf) for deployment.
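
As a rough illustration of the query partitioning mentioned in the highlights: Firestore's PartitionQuery returns a sorted list of cursors that act as split points, so N cursors divide a query into N + 1 contiguous ranges that can be read in parallel. The sketch below is plain Java and only shows that range arithmetic. `Range`, `PartitionRanges`, and the use of strings as cursors are assumptions made for illustration; they are not classes from this PR or from Apache Beam.

```java
import java.util.ArrayList;
import java.util.List;

/** A half-open cursor range; a null bound means "unbounded" on that side. */
record Range(String startAt, String endBefore) {}

/** Hypothetical helper, not part of the PR. */
final class PartitionRanges {
  private PartitionRanges() {}

  /** Turns N sorted split points into N + 1 contiguous ranges. */
  static List<Range> fromCursors(List<String> sortedCursors) {
    List<Range> ranges = new ArrayList<>();
    String previous = null; // the first range has no lower bound
    for (String cursor : sortedCursors) {
      ranges.add(new Range(previous, cursor));
      previous = cursor;
    }
    ranges.add(new Range(previous, null)); // the last range has no upper bound
    return ranges;
  }
}
```

With two cursors this yields three ranges: everything before the first cursor, everything between the two, and everything from the second cursor onward. With no cursors it yields a single unbounded range, which corresponds to running the original query unpartitioned.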

Changelog

  • it/google-cloud-platform/pom.xml
    • Updated google-cloud-spanner dependency version to 6.102.0.
    • Added google-cloud-pubsublite dependency.
    • Added google-cloud-firestore and google-cloud-firestore-admin dependencies.
  • it/google-cloud-platform/src/main/java/org/apache/beam/it/gcp/firestore/FirestoreAdminResourceManager.java
    • Added new class FirestoreAdminResourceManager to manage Firestore database creation and deletion for integration tests.
  • it/google-cloud-platform/src/main/java/org/apache/beam/it/gcp/firestore/FirestoreAdminResourceManagerException.java
    • Added custom exception FirestoreAdminResourceManagerException for Firestore admin resource management operations.
  • it/google-cloud-platform/src/main/java/org/apache/beam/it/gcp/firestore/FirestoreResourceManager.java
    • Added new class FirestoreResourceManager to manage Firestore collection writes, reads, and cleanup for integration tests.
  • it/google-cloud-platform/src/main/java/org/apache/beam/it/gcp/firestore/FirestoreResourceManagerException.java
    • Added custom exception FirestoreResourceManagerException for Firestore resource management operations.
  • it/google-cloud-platform/src/main/java/org/apache/beam/it/gcp/firestore/package-info.java
    • Added package-level Javadoc for Firestore resource management classes.
  • v2/firestore-to-firestore/README_Cloud_Firestore_to_Firestore.md
    • Added documentation for the new Firestore to Firestore Dataflow template, including parameters, getting started guide, and Terraform usage.
  • v2/firestore-to-firestore/pom.xml
    • Added new Maven module firestore-to-firestore with dependencies for the Dataflow template.
  • v2/firestore-to-firestore/src/main/java/com/google/cloud/teleport/v2/templates/FirestoreToFirestore.java
    • Added the main class for the Firestore to Firestore Dataflow template, implementing the pipeline logic for document copying.
  • v2/firestore-to-firestore/src/main/java/com/google/cloud/teleport/v2/templates/package-info.java
    • Added package-level Javadoc for the Firestore to Firestore template.
  • v2/firestore-to-firestore/src/main/java/com/google/cloud/teleport/v2/transforms/CreatePartitionQueryRequestFn.java
    • Added a Beam PTransform to create Firestore PartitionQueryRequest objects from collection IDs.
  • v2/firestore-to-firestore/src/main/java/com/google/cloud/teleport/v2/transforms/PrepareWritesFn.java
    • Added a Beam DoFn to transform source Firestore Document objects into Write requests for the destination database.
  • v2/firestore-to-firestore/src/main/java/com/google/cloud/teleport/v2/transforms/RunQueryResponseToDocumentFn.java
    • Added a Beam DoFn to extract Document objects from Firestore RunQueryResponse messages.
  • v2/firestore-to-firestore/src/main/java/com/google/cloud/teleport/v2/transforms/package-info.java
    • Added package-level Javadoc for the Firestore-related Beam transforms.
  • v2/firestore-to-firestore/src/test/java/com/google/cloud/teleport/v2/transforms/CreatePartitionQueryRequestFnTest.java
    • Added unit tests for CreatePartitionQueryRequestFn to verify correct generation of partition queries.
  • v2/firestore-to-firestore/src/test/java/com/google/cloud/teleport/v2/transforms/PrepareWritesFnTest.java
    • Added unit tests for PrepareWritesFn to ensure correct document name transformation and field preservation.
  • v2/firestore-to-firestore/src/test/java/com/google/cloud/teleport/v2/transforms/RunQueryResponseToDocumentFnTest.java
    • Added unit tests for RunQueryResponseToDocumentFn to confirm proper document extraction from query responses.
  • v2/firestore-to-firestore/terraform/Cloud_Firestore_to_Firestore/dataflow_job.tf
    • Added Terraform configuration for deploying the Firestore to Firestore Dataflow Flex Template job.
  • v2/pom.xml
    • Added firestore-to-firestore as a new module to the parent pom.xml.
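
The document name transformation attributed to PrepareWritesFn above can be sketched in plain Java. Firestore document names have the form projects/{project}/databases/{database}/documents/{path}, so pointing a document at another database amounts to swapping the project and database segments while keeping the relative path. `DocumentNames` and `retarget` are hypothetical names used for illustration; the PR's actual implementation is not shown in this thread and may differ.

```java
/** Hypothetical helper, not part of the PR. */
final class DocumentNames {
  private DocumentNames() {}

  /**
   * Rewrites a fully qualified document name so that it points at the
   * destination project and database, keeping the relative document path.
   */
  static String retarget(String sourceName, String destProject, String destDatabase) {
    // A limit of 6 keeps the whole relative path (which contains slashes) in parts[5].
    String[] parts = sourceName.split("/", 6);
    if (parts.length != 6
        || !parts[0].equals("projects")
        || !parts[2].equals("databases")
        || !parts[4].equals("documents")
        || parts[5].isEmpty()) {
      throw new IllegalArgumentException("Not a Firestore document name: " + sourceName);
    }
    return String.format(
        "projects/%s/databases/%s/documents/%s", destProject, destDatabase, parts[5]);
  }
}
```

For example, `retarget("projects/src/databases/(default)/documents/users/alice", "dst", "copy")` returns `projects/dst/databases/copy/documents/users/alice`. Parsing by segment rather than searching for the substring "/documents/" avoids misreading a project or collection that happens to be named "documents".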

Activity

  • The pull request is a clone of an existing PR (3098), indicating a re-submission or continuation of previous work.
  • The author jingqizz has submitted these changes, with pacoavila808 credited as a co-author, indicating that they contributed to the original work.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

jingqizz marked this pull request as draft February 20, 2026 17:19
ninjaAB-5 requested a review from a team February 20, 2026 19:09
jingqizz marked this pull request as ready for review February 20, 2026 21:32
ninjaAB-5 previously approved these changes Feb 24, 2026
…tract documents from the RunQueryResponse and step 6. write documents to destination db.
… Being consistent with DataStreamMongoDBToFirestore.java
ninjaAB-5 previously approved these changes Mar 5, 2026
ninjaAB-5 previously approved these changes Mar 5, 2026
ninjaAB-5 previously approved these changes Mar 6, 2026
// 3. Execute each partitioned query
PCollection<RunQueryResponse> responses =
    partitionedQueries.apply(
        "QueryDocumentsInPartitions",

I can't remember: do these step names have to exclude spaces and be camel case, or is that just convention?

It might be better if they were human readable, so they show up better in the console.

I don't think it's required, but UpperCamelCase is the general convention in other templates and in the documentation, e.g. https://beam.apache.org/documentation/programming-guide/#applying-transforms.

With that said, I see other templates using spaces, and I agree that is more readable in the UI, so I've added spaces to make them more human readable.
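
To make the naming difference concrete, here is a small plain-Java sketch that converts an UpperCamelCase step name into the spaced form. `StepNames.humanReadable` is a hypothetical helper for illustration only; nothing in this thread suggests the PR uses one, and the step names there may simply have been edited by hand.

```java
/** Hypothetical helper, not part of the PR. */
final class StepNames {
  private StepNames() {}

  /** Inserts a space wherever a lowercase letter or digit is followed by an uppercase letter. */
  static String humanReadable(String camelCaseName) {
    return camelCaseName.replaceAll("(?<=[a-z0-9])(?=[A-Z])", " ");
  }
}
```

Calling `StepNames.humanReadable("QueryDocumentsInPartitions")` returns `Query Documents In Partitions`, which is the style of name that reads more easily in the Dataflow job graph.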

Labels

addition (New feature or request), size/XXL

4 participants