This project aims to act as a "wrapper" around Google's ZetaSQL library, for the purpose of generating AST diagrams out of arbitrary BigQuery operations.
Inspiration for the design of this project comes from @ZTRossouw.
Building this app has only been tested on Ubuntu 22.04.1 LTS, but should be compatible with any newer release.
The versions specified below are what was used to build the project, and are the only versions that have been tested against.
The Gradle and Maven build systems is used to build the Java portion of this app, and its dependencies.
To build this portion of the app, execute the following commands from the root folder of this project:
$ gradle buildMaven
$ gradle build
Each command will take some time to process, so be patient! When they're finished, this will have built the zeta-toolkit-core
folder with Maven, the src
folder using Gradle, and all of their dependencies.
This project leverages Poetry for Python dependency management.
To install the required dependencies for the Python portion, execute the following command from the root folder of this project:
$ poetry install --without=dev
This app uses the Google BigQuery API, and thus requires service account credentials for a Google Cloud project with the BigQuery API enabled.
If you already have a service account with the BigQuery API enabled, skip to step 8.
The documentation regarding these credentials is out of the scope of this project, but a quick overview of the steps is as follows:
- Go to the Google Cloud Console.
- Create a new project.
- Go to Enabled APIs & Services, and choose
Enable APIs and Services
. - Search for
BigQuery API
, choose it, and chooseEnable
. - Go to Credentials.
- Choose
Create Credentials
, and thenService Account
. - Give this service account a unique name and ID, and then choose
Done
. Changing optional access scopes is not required for this project. - Back on Credentials, click on the newly created service account (it will be named
[service account ID]@[project ID].iam.gserviceaccount.com
). - Choose
Keys
,Add Key
,Create new Key
, and thenJSON
. - Save the newly created file somewhere secure on your computer. It cannot be recovered if lost.
- Copy the file from step 10 into the root folder of this project, and name it
credentials.json
.
This file will be used by the app to authenticate with your service account. Always ensure that your credentials are protected in a production environment.
To run this app, simply start the Flask backend via:
$ python ./main.py
While running, the backend serves a single endpoint: /gradle
. Sending a POST
request with the following body data will trigger the app to generate an AST diagram, and output it to a file in the generated
folder.
Parameter | Description | Example Value |
---|---|---|
projectId |
The project ID of your Google Cloud project. | bigquery-public-data |
table |
The BigQuery table to be analyzed. | samples.wikipedia |
statement |
An SQL statement to be analyzed. | INSERT INTO bigquery-public-data.samples.wikipedia (title) VALUES ('random title'); |
A complete request looks like:
$ curl \
--location
--request POST '127.0.0.1:5000/gradle' \
--header 'Content-Type: application/json' \
--data-raw '{
"projectId": "bigquery-public-data",
"table": "samples.wikipedia",
"statement": "INSERT INTO `bigquery-public-data.samples.wikipedia` (title) VALUES (\'random title\');"
}'
This app can also be run using Docker.
To build the image, run the following command in the root folder of this project:
$ docker build -t zetasql-bigquery-analyzer .
And to run it, you can start a container with the following command:
$ docker run -d --name zetasql-analyzer-container -v /path/to/your/credentials.json:/app/credentials.json -v /path/to/your/generated/folder:/app/generated zetasql-bigquery-analyzer
While the container is running, you can access it at http://localhost:5000
.