Setup for running Presto with Hive Metastore on OpenShift 4, as extended from this blog post. Additionally, there are Docker Compose instructions for alternate development environments.
This project utilizes Red Hat software and assumes you have access to registry.redhat.io for both the OpenShift 4 and Docker Compose setups.
Deploying Presto on OpenShift 4 assumes you have a working OpenShift 4 cluster.
Both the OpenShift 4 deployment and the Docker Compose setup assume you have access to an S3 object store (AWS).
- Clone the repo
git clone https://github.com/chambridge/presto-on-ocp4.git
- Set up the pull secret in your project
oc project <project>
make setup-pull-secret
- Deploy Hive Metastore (using Embedded Derby)
make setup-metastore-secret key=<aws access key> secret=<aws secret>
make deploy-metastore s3bucket=<s3bucket>
- Deploy Presto services (coordinator, workers)
make deploy-presto
- Deploy Redash
make deploy-redash
- Initialize Redash
make init-redash
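You can watch the pods come up before moving on (standard oc usage; pod names will vary by deployment):
oc get pods -w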
Things you may need to modify:
- Memory settings and worker counts.
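For reference, Presto's memory limits live in config.properties; a minimal illustrative snippet (the values below are examples to tune for your nodes, not recommendations from this repo):
# Illustrative per-query memory caps -- size these to your nodes
query.max-memory=4GB
query.max-memory-per-node=1GB
Worker count is typically the replica count on the Presto worker deployment.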
You can execute SQL using the presto-cli:
make oc-run-presto-cli sql=<sql>
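For example, a quick smoke test (the SQL here is illustrative):
make oc-run-presto-cli sql="show catalogs;"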
Now that your connection works, you can generate some data in S3 with the following command:
scripts/gendata.sh <s3bucket-with-path>
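For example (the bucket name and prefix here are placeholders):
scripts/gendata.sh my-presto-bucket/sample-data  # placeholder bucket/path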
From here you can port-forward or create a route for Redash and start querying and building charts, as discussed in the blog post above.
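For example, assuming the Redash service is named redash and listens on Redash's default port 5000 (check with oc get svc):
oc port-forward service/redash 5000:5000
or create a route (again assuming the service name):
oc expose service redash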
- Copy the example.env file to .env and update your AWS credential values
cp example.env .env
- Source the .env file for use with Docker Compose
source .env
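For reference, the sourced .env exports your credentials as environment variables; a sketch assuming the conventional AWS variable names (mirror whatever example.env actually defines):
# Illustrative variable names -- keep them in sync with example.env
export AWS_ACCESS_KEY_ID=<aws access key>
export AWS_SECRET_ACCESS_KEY=<aws secret>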
- Start Docker Compose with S3 bucket
make docker-up s3bucket=<s3bucket>
- View the Docker logs
make docker-logs
- Launch the Presto UI: http://localhost:8080
- Shutdown the containers
make docker-down
You can execute SQL using the presto-cli:
make docker-run-presto-cli sql=<sql>
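For example (illustrative SQL):
make docker-run-presto-cli sql="select 1;"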
- Set up requirements
pipenv install
- Copy the example.env file to .env and update your AWS credential values
cp example.env .env
- Port-forward presto (OCP only)
make oc-port-forward-presto
- Run script
python scripts/presto_connect.py
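For a rough idea of what such a script does, here is a minimal sketch using the presto-python-client package; the library choice and connection parameters are assumptions, and the repo's actual scripts/presto_connect.py may differ:
# Minimal sketch; assumes presto-python-client is installed via pipenv
import prestodb

conn = prestodb.dbapi.connect(
    host="localhost",  # matches the port-forward above
    port=8080,
    user="presto",     # hypothetical user name
    catalog="hive",
    schema="default",
)
cur = conn.cursor()
cur.execute("show catalogs")
print(cur.fetchall())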