Demo Projects

Serverless Application Model (SAM) for Data Professionals
- AWS Lambda provides serverless computing capabilities and can be used for validation or light processing/transformation of data. Its integration with more than 140 AWS services also makes it easier to build complex systems based on event-driven architectures. Serverless applications can be built in many ways, and specialised frameworks such as the AWS Serverless Application Model (SAM) and Serverless Framework are among the most efficient. In this post, I’ll demonstrate how to build a serverless data processing application using SAM.
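
To make the "validation or light processing" role concrete, here is a minimal sketch of a Lambda handler in Python. It is not taken from the demo project: the event shape (a `records` list), the required fields and the response format are all assumptions chosen for illustration.

```python
import json

# Hypothetical schema: every record must carry these fields.
REQUIRED_FIELDS = ("id", "amount")


def validate_record(record):
    """Return a list of validation errors for one record (empty if valid)."""
    errors = [f"missing field: {field}" for field in REQUIRED_FIELDS if field not in record]
    if "amount" in record and not isinstance(record["amount"], (int, float)):
        errors.append("amount must be numeric")
    return errors


def lambda_handler(event, context):
    """Split incoming records into valid and invalid ones and report counts.

    The event layout ({"records": [...]}) is an assumption for this sketch;
    a real event depends on the service that triggers the function.
    """
    valid, invalid = [], []
    for record in event.get("records", []):
        errors = validate_record(record)
        if errors:
            invalid.append({"record": record, "errors": errors})
        else:
            valid.append(record)
    body = {"valid": len(valid), "invalid": len(invalid)}
    return {"statusCode": 200, "body": json.dumps(body)}
```

The handler can be called locally with a plain dictionary, for example `lambda_handler({"records": [{"id": 1, "amount": 2.5}]}, None)`, which is convenient for unit testing before deploying with SAM.
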
Kafka Connect for AWS Services Integration - Aiven OpenSearch Sink Connector
- We discuss how to develop a data pipeline from Apache Kafka into OpenSearch. Part 1 develops the pipeline locally using Docker, and Part 2 deploys it on AWS.
  - Part 1
  - Part 2
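
A sink connector is usually registered by posting a JSON payload of the form `{"name": ..., "config": {...}}` to the Kafka Connect REST API. The sketch below only builds that payload. The connector class name and the `connection.url` key are assumptions based on the Aiven OpenSearch connector and should be checked against its documentation; the function name, topic and URL are hypothetical.

```python
import json


def build_connector_payload(name, topics, opensearch_url, tasks_max=1):
    """Build a Kafka Connect payload for an OpenSearch sink connector.

    "connector.class", "tasks.max" and "topics" are standard Kafka Connect
    settings. The class name and "connection.url" are assumptions about the
    Aiven OpenSearch sink connector, not values taken from the posts.
    """
    config = {
        "connector.class": "io.aiven.kafka.connect.opensearch.OpensearchSinkConnector",
        "tasks.max": str(tasks_max),
        "topics": ",".join(topics),
        "connection.url": opensearch_url,
    }
    return {"name": name, "config": config}


if __name__ == "__main__":
    # Hypothetical names and endpoint, for illustration only.
    payload = build_connector_payload("orders-sink", ["orders"], "https://localhost:9200")
    print(json.dumps(payload, indent=2))
```

The resulting dictionary can be sent to the `/connectors` endpoint of a Kafka Connect worker with any HTTP client.
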
Setup Local Development Environment for Apache Flink and Spark Using EMR Container Images
- In this post, we will discuss how to set up a local development environment for Apache Flink and Spark using the EMR container images. For Flink, a custom Docker image is created, which downloads the dependent connector JAR files into the Flink library folder, fixes process startup issues, and updates Hadoop configurations for Glue Data Catalog integration. For Spark, instead of creating a custom image, the EMR image is used to launch the Spark container, and the required configuration updates are added at runtime via volume-mapping. After illustrating the environment setup, we will discuss a solution where data ingestion/processing is performed in real time using Apache Flink and the processed data is consumed by Apache Spark for analysis.
Data Build Tool (dbt) Pizza Shop Demo
- The data build tool (dbt) is a popular data transformation tool for data warehouse development. Moreover, it can be used for data lakehouse development thanks to open table formats such as Apache Iceberg, Apache Hudi and Delta Lake. In this series of posts, we discuss practical data warehouse/lakehouse examples including ETL orchestration with Apache Airflow.

.vscode
airflow-demo
automate-dv-demo
daft-quickstart
dbt-athena-demo
dbt-bigquery-demo
dbt-postgres-demo
dbt_bootcamp
ecommerce
elasticsearch-crash-course
flink-spark-local-dev
iceberg-cookbook
opensearch-kafka-connect
pyspark-data-analysis
pyspark-hands-on
sam-for-data-professionals
simple_dv
.gitignore
README.md

jaehyeon-kim/general-demos