From ebf102e71f1bba2635550abe79167f3ef1e744d9 Mon Sep 17 00:00:00 2001
From: Lirone Samoun
Date: Mon, 17 Apr 2023 16:50:17 +0200
Subject: [PATCH] add countries_code seed file

---
 README.md                    | 33 ++++++++++++++++++++++++++++++++-
 dbt/seeds/countries_code.csv |  3 +++
 2 files changed, 35 insertions(+), 1 deletion(-)
 create mode 100644 dbt/seeds/countries_code.csv

diff --git a/README.md b/README.md
index cda25e7..14108a8 100644
--- a/README.md
+++ b/README.md
@@ -1,5 +1,33 @@
 # data-engineering-capstone-project
 
+Cloud: GCP
+Infrastructure as code (IaC): Terraform
+Workflow orchestration: Prefect
+Data Wareshouse: BigQuery
+Data transformation: DBT
+Batch processing: Spark
+
+Problem description
+[Problem is well described and it's clear what the problem the project solves]
+
+Cloud
+[The project is developed in the cloud and IaC tools are used]
+
+Data ingestion: Batch / Workflow orchestration
+[End-to-end pipeline: multiple steps in the DAG, uploading data to data lake]
+
+Data warehouse
+[Tables are partitioned and clustered in a way that makes sense for the upstream queries (with explanation)]
+
+Transformations (dbt, spark, etc)
+[Tranformations are defined with dbt, Spark or similar technologies]
+
+Dashboard
+[A dashboard with 2 tiles]
+
+Reproducibility
+[Instructions are clear, it's easy to run the code, and the code works]
+
 to do
 
 
@@ -9,4 +37,7 @@ Create a new service account
 Storage Object Admin
 Compute Storage Admin
 
-Create and download the json key file
\ No newline at end of file
+Create and download the json key file
+
+
+dbt run --select global_terrorism_lite --vars '{"is_test_run": "true"}'
\ No newline at end of file
diff --git a/dbt/seeds/countries_code.csv b/dbt/seeds/countries_code.csv
new file mode 100644
index 0000000..8433e77
--- /dev/null
+++ b/dbt/seeds/countries_code.csv
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:5bf65b65d0a6565c6131038a4df517beb55bd884214488fa3b8012011b8265dd
+size 4315