feat: Add SQL Server CDC support to Datastream-to-BigQuery template #3450
mahendrarathore1742 wants to merge 1 commit into GoogleCloudPlatform:main
Closes GoogleCloudPlatform#3410

Add SQL Server as a supported source database for the Datastream to BigQuery Flex template, enabling CDC replication from SQL Server to BigQuery.

Changes:
- DataStreamClient: Add SQL Server schema discovery, primary key extraction, and type conversion (20+ SQL Server types to BigQuery)
- FormatDatastreamRecordToJson: Handle SQL Server CDC metadata (_metadata_schema, _metadata_lsn, _metadata_tx_id) for Avro format
- FormatDatastreamJsonToJson: Handle SQL Server CDC metadata for JSON format. Also fix pre-existing bug where _metadata_lsn was reading from 'database' instead of 'lsn'
- DatastreamRow: Add SQL Server sort fields for CDC event ordering
- BigQueryDefaultSchemas: Add _metadata_lsn to default schema
- DataStreamToBigQuery: Update template requirements and parameter docs to include SQL Server
- DataStreamClientTest: Add unit tests for SQL Server type conversion

Tested:
- Maven compile: SUCCESS (datastream-common + datastream-to-bigquery)
- Checkstyle: 0 violations
- Unit tests: 11 run, 0 failures
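For reviewers who want a concrete picture of two of the changes listed above, here is a minimal sketch of the type conversion and the `_metadata_lsn` fix. It is not the code from this PR: the class name `SqlServerCdcSketch`, both method names, the partial type table, and the `schema` / `lsn` / `tx_id` keys of the source metadata map are assumptions made for illustration only. The actual mapping in DataStreamClient covers more types and may choose different BigQuery targets.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

/** Illustrative sketch only; names and mappings are hypothetical, not taken from the PR. */
final class SqlServerCdcSketch {

  /** Maps a SQL Server column type name to a BigQuery type name (assumed partial table). */
  static String toBigQueryType(String sqlServerType) {
    switch (sqlServerType.toLowerCase(Locale.ROOT)) {
      case "tinyint":
      case "smallint":
      case "int":
      case "bigint":
        return "INT64";
      case "bit":
        return "BOOL";
      case "real":
      case "float":
        return "FLOAT64";
      case "decimal":
      case "numeric":
        return "NUMERIC";
      case "date":
        return "DATE";
      case "datetime":
      case "datetime2":
        return "DATETIME";
      case "datetimeoffset":
        return "TIMESTAMP";
      case "binary":
      case "varbinary":
        return "BYTES";
      default:
        // Unknown or text-like types fall back to STRING in this sketch.
        return "STRING";
    }
  }

  /** Copies CDC metadata into output columns; the key names are assumptions. */
  static Map<String, Object> toMetadataColumns(Map<String, Object> sourceMetadata) {
    Map<String, Object> out = new LinkedHashMap<>();
    out.put("_metadata_schema", sourceMetadata.get("schema"));
    // The bug described above read this value from "database"; it should come from "lsn".
    out.put("_metadata_lsn", sourceMetadata.get("lsn"));
    out.put("_metadata_tx_id", sourceMetadata.get("tx_id"));
    return out;
  }
}
```

Falling back to STRING for unrecognized types keeps the sketch total over its input; whether the template does the same is not stated in this PR description.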
Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). View this failed invocation of the CLA check for more information. For the most up to date status, view the checks section at the bottom of the pull request.
Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly expands the capabilities of the Datastream-to-BigQuery Flex template by introducing full support for SQL Server as a source for Change Data Capture (CDC) replication. The changes enable seamless data streaming from SQL Server instances to BigQuery, encompassing schema mapping, metadata processing, and proper data type conversions. This enhancement allows users to leverage the template for a broader range of enterprise data sources, facilitating real-time analytics and data warehousing solutions.