Demo of Redshift Streaming Ingestion (Preview) to sync DynamoDB data with Redshift in near realtime for ETL, Analytics, and Reporting all using SQL

damc-dev/example-cdk-dynamodb-stream-to-redshift

Near Realtime Analytics of DynamoDB Data with Redshift Streaming Ingestion

This demo shows how to use Redshift Streaming Ingestion (Preview) to sync DynamoDB data with Redshift in near real time for ETL, analytics, and reporting, all using SQL.

Motivation

There are plenty of great tools for streaming ETL, but if you already know SQL, why complicate things? You can use the tools you are already familiar with to load data in near real time for analytics.

Architecture Diagram

[Image: architecture diagram]

Data Flow Diagram

[Image: data flow diagram]

Deployment

Requirements

  • AWS CLI
  • Node.js
  • npm
  • jq
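
The requirements above can be checked before deploying. The following is a minimal sketch, not part of the repository: `check_tools` is a hypothetical helper name, and it assumes the AWS CLI and Node.js are installed under their usual command names, `aws` and `node`.

```shell
# Report any required command-line tool that is not on PATH.
# Prints one "missing: <tool>" line per absent tool and returns
# non-zero if at least one tool is missing.
# check_tools is a hypothetical helper, not a script from this repository.
check_tools() {
  local missing=0
  local tool
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing: $tool"
      missing=1
    fi
  done
  return "$missing"
}
```

For example, `check_tools aws node npm jq` prints nothing and returns 0 when all four tools are available.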

Deploy Infrastructure

Install dependencies

npm install

Deploy the DynamoDB table, data generator Lambda, Kinesis Data Stream, VPC, Redshift cluster, and Redshift IAM role

npm run deploy

Set up Redshift

Note: this step reads the outputs.json file generated by the deploy step above.

bash scripts/setup_redshift.sh
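
If the setup script fails, it can help to inspect outputs.json first. The sketch below assumes the file uses the nested layout that `cdk deploy --outputs-file` writes (stack name, then output key, then value); the actual stack and key names depend on this project's CDK code and are not shown here. `list_outputs` is a hypothetical helper name.

```shell
# Print every entry of a CDK outputs file as "Stack.Key=Value".
# Assumes the {"<stack>": {"<key>": "<value>"}} layout produced by
# `cdk deploy --outputs-file`; real stack and key names will differ.
# list_outputs is a hypothetical helper, not a script from this repository.
list_outputs() {
  local file="${1:-outputs.json}"
  jq -r '
    to_entries[]
    | .key as $stack
    | .value
    | to_entries[]
    | "\($stack).\(.key)=\(.value)"
  ' "$file"
}
```

Running `list_outputs` with no argument reads outputs.json from the current directory.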

Export DynamoDB Table and Initial Redshift Data Load

bash scripts/export_dynamodb_backup.sh
bash scripts/initial_load_from_export.sh -a <export_arn>
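
DynamoDB exports run asynchronously, so the export may still be in progress when the first script returns. Whether the repository's scripts already wait for completion is not stated here; as a hedged sketch, the status can be polled with `aws dynamodb describe-export` before starting the initial load. `wait_for_export` is a hypothetical helper name and the 30-second interval is an arbitrary default.

```shell
# Poll a DynamoDB export until it is no longer IN_PROGRESS.
# Returns 0 when the export status is COMPLETED, 1 otherwise.
# wait_for_export is a hypothetical helper, not a script from this repository.
wait_for_export() {
  local export_arn="$1"
  local interval="${2:-30}"   # seconds between polls (illustrative default)
  local status
  while :; do
    status=$(aws dynamodb describe-export \
      --export-arn "$export_arn" \
      --query 'ExportDescription.ExportStatus' \
      --output text) || return 1
    case "$status" in
      COMPLETED) return 0 ;;
      IN_PROGRESS) sleep "$interval" ;;
      *) echo "export ended with status: $status" >&2; return 1 ;;
    esac
  done
}
```

A possible use is `wait_for_export <export_arn> && bash scripts/initial_load_from_export.sh -a <export_arn>`, with the ARN taken from the export step.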

Test Incremental Sync of New Member Records

bash scripts/test_sync_time.sh

Log in to the Redshift query editor v2 to explore

Go to https://us-east-1.console.aws.amazon.com/sqlworkbench/home?region=us-east-1#/client and log in to your AWS account.

To connect to the database, select temporary credentials and use admin as the user.

Clean up

npm run destroy
