# Deployment Documentation
## Things to Remember

- create/update the `.gcloudignore` file (see the sketch after this list)
  - ensure generated files (`env/`, `__pycache__/`, `.csv`/`.jsonl`/other data files, etc.) are listed so they are not uploaded
  - include `.gcloudignore` itself
- note the options available for `gcloud functions deploy`
  - the ones normally used are in the example below
  - options such as `--timeout` or `--entry-point` change for each function as necessary
- remember that the line-continuation character `\` on Linux/Mac is a backtick (`` ` ``) in Windows PowerShell
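
A minimal sketch of what that `.gcloudignore` could contain (the exact entries are assumptions and depend on the repository layout, not the repo's actual file):

```
# .gcloudignore -- anything matched here is NOT uploaded with the function source
.gcloudignore
.git
.gitignore
env/
__pycache__/
*.pyc
*.csv
*.jsonl
.env
```

Keeping `.env` out of the upload is consistent with passing environment variables through `--set-env-vars` or `--env-vars-file` instead.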

## gcloud CLI commands

### Deploy

**extract_phl_opa_properties:**

- Linux/Mac

```sh
gcloud functions deploy extract_phl_opa_properties \
--gen2 \
--region=us-central1 \
--runtime=python312 \
--source=. \
--entry-point=extract_phl_opa_prop_main \
--service-account='data-pipeline-robot-2024@musa509s24-team2.iam.gserviceaccount.com' \
--timeout=60s \
--memory=1024Mi \
--no-allow-unauthenticated \
--trigger-http
```
- Windows

```powershell
gcloud functions deploy extract_phl_opa_properties `
--gen2 `
--region=us-central1 `
--runtime=python312 `
--source=. `
--entry-point=extract_phl_opa_prop_main `
--service-account="data-pipeline-robot-2024@musa509s24-team2.iam.gserviceaccount.com" `
--timeout=60s `
--memory=1024Mi `
--no-allow-unauthenticated `
--trigger-http
```
- One-line version (note that this version also passes `--env-vars-file=../.env`)

```sh
gcloud functions deploy extract_phl_opa_properties --gen2 --region=us-central1 --runtime=python312 --source=. --entry-point=extract_phl_opa_prop_main --service-account="data-pipeline-robot-2024@musa509s24-team2.iam.gserviceaccount.com" --timeout=60s --memory=1024Mi --no-allow-unauthenticated --trigger-http --env-vars-file=../.env
```
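
After deploying, the resulting configuration and the function's HTTPS endpoint can be checked with `gcloud functions describe` (shown here for `extract_phl_opa_properties`, using the same region and `--gen2` flag as the deploy):

```sh
# Inspect the deployed function; for gen2 functions the serviceConfig.uri field is the HTTPS endpoint
gcloud functions describe extract_phl_opa_properties \
  --gen2 \
  --region=us-central1
```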

**prepare_phl_opa_properties:**

**load_phl_opa_properties:**

**run_sql:**

**load_opa_assessments:**

- Windows

```powershell
gcloud functions deploy extract_phl_opa_assess --gen2 --region=us-central1 --runtime=python38 --source=. --entry-point=extract_opa_assess_main --service-account="data-pipeline-robot-2024@musa509s24-team2.iam.gserviceaccount.com" --timeout=60s --memory=2048Mi --set-env-vars=PREP_DATA_LAKE_BUCKET=musa509s24_team02_prepared_data --no-allow-unauthenticated --trigger-http
```
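
Because these functions are deployed with `--no-allow-unauthenticated`, a quick manual test needs an identity token with invoker access; a minimal sketch (the URL is a placeholder for whatever `gcloud functions describe` reports):

```sh
# Call an authenticated HTTP function using the active gcloud account's identity token
curl -i \
  -H "Authorization: Bearer $(gcloud auth print-identity-token)" \
  "https://us-central1-musa509s24-team2.cloudfunctions.net/extract_phl_opa_assess"
```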

**extract_phl_pwd_parcels:**

```sh
gcloud functions deploy extract_phl_pwd_parcels --gen2 --region=us-central1 --runtime=python312 --source=. --entry-point=extract_pwd_parcel_main --service-account="data-pipeline-robot-2024@musa509s24-team2.iam.gserviceaccount.com" --timeout=60s --memory=1024Mi --no-allow-unauthenticated --trigger-http
```

- *hope to add `--env-vars-file=../.env`* (expected file format sketched below)
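
Note that `--env-vars-file` expects a YAML file of key/value pairs rather than dotenv-style `KEY=VALUE` lines, so `../.env` would need to look something like the sketch below (the variable name is borrowed from the `load_opa_assessments` command above; the actual contents depend on the function):

```yaml
# ../.env -- environment variables in the YAML format gcloud expects
PREP_DATA_LAKE_BUCKET: musa509s24_team02_prepared_data
```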

## CORS configuration

```sh
gcloud storage buckets update gs://musa5090s24_team02_public/ --cors-file=public_cors_config.json
```

Ensure the path to `public_cors_config.json` is correct.

- *it is currently located in the root folder of the repository*

### View current configuration

Run `gcloud storage buckets describe gs://<bucket_name>` in any CLI where gcloud is available.

For example, use the following for the public bucket:

```sh
gcloud storage buckets describe gs://musa5090s24_team02_public
```
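
To print only the CORS settings instead of the full bucket description, the output can be filtered (assuming the field is named `cors_config`, as in the current `gcloud storage` output):

```sh
# Show just the bucket's CORS configuration
gcloud storage buckets describe gs://musa5090s24_team02_public --format="default(cors_config)"
```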

### JSON File contents

```json
[
  {
    "origin": ["*"],
    "method": ["GET", "POST", "PUT", "OPTIONS", "HEAD", "DELETE"],
    "responseHeader": ["*"],
    "maxAgeSeconds": 3600
  }
]
```
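
A rough way to confirm the bucket is actually returning CORS headers is to send a preflight request against an object and look for `Access-Control-*` headers in the response (the object name below is a placeholder):

```sh
# Send a CORS preflight request; the response should include Access-Control-Allow-* headers
curl -i -X OPTIONS \
  -H "Origin: https://example.com" \
  -H "Access-Control-Request-Method: GET" \
  "https://storage.googleapis.com/musa5090s24_team02_public/some-object.json"
```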

## Workflow Details

### YAML

```yaml
# Runs the PHL OPA properties pipeline end to end: extract, prepare, then load,
# calling each Cloud Function over HTTP with OIDC authentication.
main:
  params: [input]
  steps:
    - extractPHLPropertyData:
        call: http.post
        args:
          url: https://us-central1-musa509s24-team2.cloudfunctions.net/extract_phl_opa_properties
          auth:
            type: OIDC
    - preparePHLPropertyData:
        call: http.post
        args:
          url: https://prepare-phl-opa-properties-u7ppop2rpa-uc.a.run.app # https://us-central1-musa509s24-team2.cloudfunctions.net/_phl_opa_properties
          auth:
            type: OIDC
    - loadPHLPropertyData:
        call: http.post
        args:
          url: https://us-central1-musa509s24-team2.cloudfunctions.net/load_phl_opa_properties
          auth:
            type: OIDC
```
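
To (re)deploy and execute the workflow from the CLI, something like the following should work (the workflow name and source filename are assumptions, not taken from the repository):

```sh
# Deploy the workflow definition, then execute it and wait for the result
gcloud workflows deploy phl-properties-pipeline \
  --source=workflow.yaml \
  --location=us-central1 \
  --service-account="data-pipeline-robot-2024@musa509s24-team2.iam.gserviceaccount.com"

gcloud workflows run phl-properties-pipeline --location=us-central1
```

The OIDC calls in the steps are made as the workflow's service account, so that account needs invoker access on each of the functions.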