
A Cloud Function that updates a BigQuery Table with csv data loaded from an api endpoint


robquinn/super-crunch


Google Cloud PubSub Function and CronJob


About

This project contains 4 Google Cloud Functions, written in Python, that share common logic. The functions are triggered by a PubSub topic, which is in turn triggered by a cron job. Each function retrieves data from a different API endpoint that outputs a .csv file, then runs ETL (extract, transform, and load) on that data to load it into a BigQuery table.
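The flow described above can be sketched roughly as follows. This is a minimal illustration and not the repository's actual code: the function names, the `amount` column, and the inline sample CSV are all hypothetical, and the load step is a placeholder rather than a real BigQuery call. It assumes the `(event, context)` entry-point signature with a base64-encoded `data` field, as used by PubSub-triggered background functions.

```python
import base64
import csv
import io


def decode_pubsub_event(event):
    """Return the text payload of a PubSub background-function event."""
    data = event.get("data")
    return base64.b64decode(data).decode("utf-8") if data else ""


def extract(csv_text):
    """Parse CSV text (as an endpoint might return it) into a list of dicts."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def transform(rows):
    """Hypothetical cleanup: strip whitespace and cast 'amount' to float."""
    cleaned = []
    for row in rows:
        record = {key.strip(): value.strip() for key, value in row.items()}
        record["amount"] = float(record["amount"])
        cleaned.append(record)
    return cleaned


def load(rows):
    """Placeholder for the BigQuery load step; here it only counts rows."""
    return len(rows)


def handler(event, context):
    """Hypothetical entry point for a PubSub-triggered function."""
    trigger = decode_pubsub_event(event)
    # A real function would fetch the CSV from its API endpoint here.
    sample_csv = "id, amount\n1, 9.50\n2, 12.00\n"
    rows = transform(extract(sample_csv))
    return {"trigger": trigger, "loaded": load(rows)}
```

Keeping extract, transform, and load as separate functions is one way to let several entry points share the same logic, and it makes each stage testable on its own.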

All the commands necessary to setup, configure, and deploy the function are written into the scripts folder and accessible by make commands.

Requirements

In order to run this project, you'll need

Install

To install ALL the necessary python packages for development:

pip install -r requirements-dev.txt

To install just the production python packages:

pip install -r requirements.txt

Commands

Publish Cloud Function

To publish the cloud function:

make gcp_functions_publish

Delete Cloud Function

To delete the cloud function:

make gcp_functions_delete

Create the PubSub

To create the PubSub that triggers the cloud function:

make gcp_pubsub_create

Delete the PubSub

To delete the PubSub that triggers the cloud function:

make gcp_pubsub_delete

Create the CronJob

To create the cronjob for the PubSub:

make gcp_cronjob_create

Edit the CronJob

To edit the cronjob for the PubSub, edit the file:

scripts/google-cloud/cronjob/edit.sh

Then run the command:

make gcp_cronjob_edit

Delete the CronJob

To delete the cronjob for the PubSub:

make gcp_cronjob_delete

Unit Testing

Run All Tests

To run all tests:

make test

Test Endpoint

To test the retrieval of the api endpoint csv data:

make test_endpoint

Test Transactions

To test setting up of the Pandas Dataframe and data transformations:

make test_transactions

Test Upload

To test the uploading of the transaction data to the BigQuery table:

make test_upload

Integration Testing

To run the integration test on a function, first select that function by setting the FUNCTIONS_FRAMEWORK_TARGET env var to its name.

Next, you will need 3 separate shells.

In the first shell, start the Functions Framework:

make ff_start

Next, in your second shell, start the PubSub emulator:

make gcp_em_pubsub_start

In your third shell, run the following command:

make gcp_em_pubsub_env_init

The output of the command should look something like:

export PUBSUB_EMULATOR_HOST=localhost:8085

Copy that output, paste it into your (third) shell, and hit enter.

That step is important, as the next set of commands will not work without it.
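The Google Cloud client libraries read `PUBSUB_EMULATOR_HOST` to decide whether to talk to the local emulator, which is why the export matters. As a sketch (the helper name `emulator_address` is hypothetical and not part of this repository), a quick check that the variable is set and well formed in the current shell could look like this:

```python
import os


def emulator_address(environ=None):
    """Return (host, port) from PUBSUB_EMULATOR_HOST, or raise if it is unusable."""
    environ = os.environ if environ is None else environ
    value = environ.get("PUBSUB_EMULATOR_HOST", "")
    if not value:
        raise RuntimeError(
            "PUBSUB_EMULATOR_HOST is not set; export it in this shell first"
        )
    # Split on the last colon so the port is always the final component.
    host, _, port = value.rpartition(":")
    if not host or not port.isdigit():
        raise ValueError(f"expected host:port, got {value!r}")
    return host, int(port)
```

Calling `emulator_address()` with no arguments reads the real environment; passing a dict makes the check easy to exercise in a test.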

Before running the next steps, make sure every variable in your .env file is filled out.

Then, run the following command (in your third shell) to create a PubSub topic:

make gcp_em_pubsub_create_topic

If successful, you should see something like:

Created topic: projects/my-project/topics/my-topic

Next, run the following command (in your third shell) to create a subscription for the PubSub topic:

make gcp_em_pubsub_create_sub

If successful, you should see something like:

Push subscription created: name: "projects/my-project/subscriptions/my-subscription"
topic: "projects/my-project/topics/my-topic"
push_config {
  push_endpoint: "http://localhost:8080"
}
ack_deadline_seconds: 10
message_retention_duration {
  seconds: 604800
}
.
Endpoint for subscription is: http://localhost:8080
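With a push subscription, PubSub delivers each message to the endpoint as an HTTP POST whose JSON body wraps the message, with the payload base64-encoded in `message.data`. The Functions Framework may well handle this unwrapping for you; the sketch below only illustrates the envelope shape, and the helper name `decode_push_envelope` is hypothetical.

```python
import base64
import json


def decode_push_envelope(body):
    """Unwrap a PubSub push request body into its basic parts."""
    envelope = json.loads(body)
    message = envelope["message"]
    # The payload arrives base64-encoded; an absent payload decodes to "".
    data = base64.b64decode(message.get("data", "")).decode("utf-8")
    return {
        "subscription": envelope.get("subscription"),
        "message_id": message.get("messageId"),
        "attributes": message.get("attributes", {}),
        "data": data,
    }
```

A request body built by hand in this shape can be used to exercise the decoding logic without the emulator running.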

Lastly, run the following command (in your third shell) to publish messages to the topic:

make gcp_em_pubsub_publish_topic

If successful, you should see something like:

Published messages to projects/my-project/topics/my-topic.

Now return to your first shell, where you should see the output of your integration tests.
