Merge pull request #21 from ivdatahub/feature/UpdateProject
chore: update docs + package project
IvanildoBarauna authored Sep 20, 2024
2 parents 02306b2 + deae037 commit 6065014
Showing 3 changed files with 82 additions and 23 deletions.
65 changes: 65 additions & 0 deletions .github/workflows/deploy-image.yml
@@ -0,0 +1,65 @@
#
name: Docker deploy

# Configures this workflow to run every time a change is pushed to the branch called `main`.
on:
  push:
    branches:
      - main
  workflow_dispatch:

# Defines two custom environment variables for the workflow. These are used for the Container registry domain, and a name for the Docker image that this workflow builds.
env:
  REGISTRY: ghcr.io
  IMAGE_NAME: ${{ github.repository }}

# There is a single job in this workflow. It's configured to run on the latest available version of Ubuntu.
jobs:
  build-and-push-image:
    runs-on: ubuntu-latest
    # Sets the permissions granted to the `GITHUB_TOKEN` for the actions in this job.
    permissions:
      contents: read
      packages: write
      attestations: write
      id-token: write
    #
    steps:
      - name: Checkout repository
        uses: actions/checkout@v4
      # Uses the `docker/login-action` action to log in to the Container registry using the account and password that will publish the packages. Once published, the packages are scoped to the account defined here.
      - name: Log in to the Container registry
        uses: docker/login-action@65b78e6e13532edd9afa3aa52ac7964289d1a9c1
        with:
          registry: ${{ env.REGISTRY }}
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}
      # This step uses [docker/metadata-action](https://github.com/docker/metadata-action#about) to extract tags and labels that will be applied to the specified image. The `id` "meta" allows the output of this step to be referenced in a subsequent step. The `images` value provides the base name for the tags and labels.
      - name: Extract metadata (tags, labels) for Docker
        id: meta
        uses: docker/metadata-action@9ec57ed1fcdbf14dcef7dfbe97b2010124a938b7
        with:
          images: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}
      # This step uses the `docker/build-push-action` action to build the image, based on your repository's `Dockerfile`. If the build succeeds, it pushes the image to GitHub Packages.
      # It uses the `context` parameter to define the build's context as the set of files located in the specified path. For more information, see "[Usage](https://github.com/docker/build-push-action#usage)" in the README of the `docker/build-push-action` repository.
      # It uses the `tags` and `labels` parameters to tag and label the image with the output from the "meta" step.
      - name: Build and push Docker image
        id: push
        uses: docker/build-push-action@f2a1d5e99d037542a71f64918e516c093c6f3fc4
        with:
          context: .
          push: true
          tags: ${{ steps.meta.outputs.tags }}
          labels: ${{ steps.meta.outputs.labels }}

      # This step generates an artifact attestation for the image, which is an unforgeable statement about where and how it was built. It increases supply chain security for people who consume the image. For more information, see "[Using artifact attestations to establish provenance for builds](https://docs.github.com/actions/security-guides/using-artifact-attestations-to-establish-provenance-for-builds)."
      - name: Generate artifact attestation
        uses: actions/attest-build-provenance@v1
        with:
          subject-name: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}
          subject-digest: ${{ steps.push.outputs.digest }}
          push-to-registry: true

    environment:
      name: github-packages
      url: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}
14 changes: 7 additions & 7 deletions CONTRIBUTING.md
@@ -1,20 +1,20 @@
# Contributing to GCP-streaming-pipeline
# Contributing to data-consumer-pipeline

Firstly, thank you very much for your interest in contributing to GCP-streaming-pipeline! This document provides guidelines to help ensure the contribution process is smooth and efficient for everyone involved.
Firstly, thank you very much for your interest in contributing to data-consumer-pipeline! This document provides guidelines to help ensure the contribution process is smooth and efficient for everyone involved.

## How to Contribute

### 1. Fork the Repository

1. Go to [repository page](https://github.com/IvanildoBarauna/GCP-streaming-pipeline).
1. Go to [repository page](https://github.com/ivdatahub/data-consumer-pipeline).
2. Click the "Fork" button in the top right corner to create a copy of the repository on your GitHub.

### 2. Clone the Repository

Clone the forked repository to your local machine using the command:

```sh
git clone https://github.com/seu-usuario/GCP-streaming-pipeline.git
git clone https://github.com/<your-username>/data-consumer-pipeline.git
```

### 3. Create a Branch
@@ -67,7 +67,7 @@ git push origin branchname

## Reporting Bugs

If you find a bug, please open an [issue](https://github.com/IvanildoBarauna/GCP-streaming-pipeline/issues) and provide as much information as possible, including:
If you find a bug, please open an [issue](https://github.com/ivdatahub/data-consumer-pipeline/issues) and provide as much information as possible, including:

- Detailed description of the problem.
- Steps to reproduce the issue.
@@ -76,8 +76,8 @@ If you find a bug, please open an [issue](https://github.com/IvanildoBarauna/GCP

## Improvement suggestions

If you have suggestions for improvements, please open an [issue](https://github.com/IvanildoBarauna/GCP-streaming-pipeline/issues) and describe your idea in detail.
If you have suggestions for improvements, please open an [issue](https://github.com/ivdatahub/data-consumer-pipeline/issues) and describe your idea in detail.

## Thanks

Thanks for considering contributing to GCP-streaming-pipeline! Every contribution is valuable and helps to improve the project.
Thanks for considering contributing to data-consumer-pipeline! Every contribution is valuable and helps to improve the project.
26 changes: 10 additions & 16 deletions README.md
@@ -1,49 +1,44 @@
## Data Consumer Pipeline: Data Pipeline for ingesting data in near real time

![Project Status](https://img.shields.io/badge/status-development-yellow?style=for-the-badge&logo=github)
![Python Version](https://img.shields.io/badge/python-3.9-blue?style=for-the-badge&logo=python)
![License](https://img.shields.io/badge/license-MIT-blue?style=for-the-badge&logo=mit)


![Black](https://img.shields.io/badge/code%20style-black-000000.svg?style=for-the-badge&logo=python)
![pylint](https://img.shields.io/badge/pylint-10.00-green?style=for-the-badge&logo=python)

[//]: # ([![CI-CD]&#40;https://img.shields.io/github/actions/workflow/status/IvanildoBarauna/data-consumer-pipeline/CI-CD.yaml?&style=for-the-badge&logo=githubactions&cacheSeconds=60&label=Tests&#41;]&#40;https://github.com/IvanildoBarauna/data-consumer-pipeline/actions/workflows/CI-CD.yml&#41;)

[//]: # ([![IMAGE-DEPLOY]&#40;https://img.shields.io/github/actions/workflow/status/IvanildoBarauna/data-consumer-pipeline/deploy-image.yml?&style=for-the-badge&logo=github&cacheSeconds=60&label=Registry&#41;]&#40;https://github.com/IvanildoBarauna/data-consumer-pipeline/actions/workflows/deploy-cloud-run.yaml&#41;)
[//]: # "[![CI-CD](https://img.shields.io/github/actions/workflow/status/ivdatahub/data-consumer-pipeline/CI-CD.yaml?&style=for-the-badge&logo=githubactions&cacheSeconds=60&label=Tests)](https://github.com/data-consumer-pipeline/data-consumer-pipeline/actions/workflows/CI-CD.yml)"
[//]: # "[![IMAGE-DEPLOY](https://img.shields.io/github/actions/workflow/status/data-consumer-pipeline/data-consumer-pipeline/deploy-image.yml?&style=for-the-badge&logo=github&cacheSeconds=60&label=Registry)](https://github.com/data-consumer-pipeline/data-consumer-pipeline/actions/workflows/deploy-cloud-run.yaml)"
[//]: # "[![GCP-DEPLOY](https://img.shields.io/github/actions/workflow/status/data-consumer-pipeline/data-consumer-pipeline/deploy-cloud-run.yaml?&style=for-the-badge&logo=google&cacheSeconds=60&label=Deploy)](https://github.com/data-consumer-pipeline/data-consumer-pipeline/actions/workflows/deploy-cloud-run.yaml)"

[//]: # ([![GCP-DEPLOY]&#40;https://img.shields.io/github/actions/workflow/status/IvanildoBarauna/data-consumer-pipeline/deploy-cloud-run.yaml?&style=for-the-badge&logo=google&cacheSeconds=60&label=Deploy&#41;]&#40;https://github.com/IvanildoBarauna/data-consumer-pipeline/actions/workflows/deploy-cloud-run.yaml&#41;)


[![Codecov](https://img.shields.io/codecov/c/github/IvanildoBarauna/data-consumer-pipeline?style=for-the-badge&logo=codecov)](https://app.codecov.io/gh/IvanildoBarauna/data-consumer-pipeline)
[![Codecov](https://img.shields.io/codecov/c/github/data-consumer-pipeline/data-consumer-pipeline?style=for-the-badge&logo=codecov)](https://app.codecov.io/gh/data-consumer-pipeline/data-consumer-pipeline)

## Project Summary

Pipeline for processing and consuming streaming data from Pub/Sub, integrating with Dataflow for real-time data processing



## Development Stack

[![My Skills](https://skillicons.dev/icons?i=pycharm,python,github,gcp&perline=7)](https://skillicons.dev)

## Cloud Stack (GCP)

<img src="docs/icons/pubsub.png" Alt="Pub/Sub" width="50" height="50"><img src="docs/icons/dataflow.png" Alt="Dataflow" width="50" height="50"><img src="docs/icons/bigquery.png" Alt="BigQuery" width="50" height="50">

- Pub/Sub: Messaging service provided by GCP for sending and receiving messages between FastAPI and Dataflow pipeline.
- Dataflow: Serverless data processing service provided by GCP for executing the ETL process.
- BigQuery: Fully managed, serverless data warehouse provided by GCP for storing and analyzing large datasets.
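
On the publishing side, a minimal sketch of sending a message to the Pub/Sub topic the pipeline consumes might look like the following. The project and topic names are placeholders, not values taken from this repository:

```python
# Minimal sketch: publish a JSON payload to the Pub/Sub topic consumed by the pipeline.
# "<gcp-project-id>" and "<topic-name>" are placeholders.
import json

from google.cloud import pubsub_v1

publisher = pubsub_v1.PublisherClient()
topic_path = publisher.topic_path("<gcp-project-id>", "<topic-name>")

payload = {"event": "example", "value": 42}
# publish() takes bytes and returns a future; result() yields the server-assigned message id.
future = publisher.publish(topic_path, json.dumps(payload).encode("utf-8"))
print(f"Published message id: {future.result()}")
```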

## Continuous Integration and Continuous Deployment (CI/CD, DevOps)
![My Skills](https://skillicons.dev/icons?i=githubactions)


![My Skills](https://skillicons.dev/icons?i=githubactions)

## Contributing

See the following docs:

- [Contributing Guide](https://github.com/IvanildoBarauna/data-consumer-pipeline/blob/main/CONTRIBUTING.md)
- [Code Of Conduct](https://github.com/IvanildoBarauna/data-consumer-pipeline/blob/main/CODE_OF_CONDUCT.md)
- [Contributing Guide](https://github.com/ivdatahub/data-consumer-pipeline/blob/main/CONTRIBUTING.md)
- [Code Of Conduct](https://github.com/ivdatahub/data-consumer-pipeline/blob/main/CODE_OF_CONDUCT.md)

## Project Highlights:

@@ -59,9 +54,8 @@ See the following docs:

- Documentation: Creation of detailed documentation to facilitate the understanding and use of the application, including installation instructions, usage examples and troubleshooting guides.


# Data Pipeline Process:

1. Data Extraction: The data extraction process consists of making requests to the API to obtain the data. The requests are made in parallel workers using Cloud Dataflow to optimize the process. The data is extracted in JSON format.
2. Data Transformation: The data transformation process consists of converting the data to BigQuery Schema. The transformation is done using Cloud Dataflow in parallel workers to optimize the process.
3. Data Loading: The data loading process consists of loading the data into BigQuery. The data is loaded in parallel workers using Cloud Dataflow to optimize the process.
3. Data Loading: The data loading process consists of loading the data into BigQuery. The data is loaded in parallel workers using Cloud Dataflow to optimize the process.
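
A minimal Apache Beam sketch of these three steps (streaming read from Pub/Sub, a JSON-decoding transform, and a BigQuery write) is shown below; the subscription, dataset, and table names are placeholders and the real pipeline's transforms and options will differ:

```python
# Minimal sketch of the extract -> transform -> load flow described above.
# Subscription and table names are placeholders.
import json

import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions, StandardOptions


def run() -> None:
    options = PipelineOptions()
    options.view_as(StandardOptions).streaming = True  # Pub/Sub sources require streaming mode

    with beam.Pipeline(options=options) as pipeline:
        (
            pipeline
            # 1. Data Extraction: read raw messages from a Pub/Sub subscription.
            | "ReadFromPubSub" >> beam.io.ReadFromPubSub(
                subscription="projects/<gcp-project-id>/subscriptions/<subscription-name>"
            )
            # 2. Data Transformation: decode each message into a dict matching the BigQuery schema.
            | "DecodeJson" >> beam.Map(lambda message: json.loads(message.decode("utf-8")))
            # 3. Data Loading: append rows to the destination BigQuery table.
            | "WriteToBigQuery" >> beam.io.WriteToBigQuery(
                table="<gcp-project-id>:<dataset>.<table>",
                write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
                create_disposition=beam.io.BigQueryDisposition.CREATE_NEVER,
            )
        )


if __name__ == "__main__":
    run()
```

To execute such a sketch on Cloud Dataflow rather than locally, the standard Beam pipeline options (`--runner DataflowRunner`, `--project`, `--region`, `--temp_location`) would be passed on the command line.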
