Warning: Here be dragons. This is an attempt to arrive at a minimal config required to launch Meltano on k8s, for design, planning and documentation purposes. It is not yet intended for production use.
Before getting started with the deployment, a few tools are needed. These are all cross-platform (macOS and Windows at least) and are installable via `brew` or `chocolatey` respectively:
- Docker Desktop for running containers locally
- Kind for deploying Kubernetes into containers
- Terraform for managing the deployment and configuration of Kind
- `kubectl` for interacting with Kubernetes
- `helm` for deploying apps onto Kubernetes
- Lens for exploring the deployment in a lovely UI
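On macOS, for example, the whole toolchain can be installed with brew. The package and cask names below are assumptions based on common brew conventions and may have changed (Terraform in particular has moved to the `hashicorp/tap` tap), so check `brew search` if one fails:

```bash
# GUI apps install as casks; CLI tools as regular formulae.
brew install --cask docker   # Docker Desktop
brew install kind            # Kubernetes-in-Docker
brew install terraform       # or: brew install hashicorp/tap/terraform
brew install kubectl         # Kubernetes CLI
brew install helm            # Kubernetes package manager
brew install --cask lens     # Kubernetes IDE
```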
If you have deployed applications before, you may already have many of these tools.
With the above in place, deployment is as simple as:
```bash
# change to the local deployment dir
cd deploy/local

# Initialise Terraform. This installs the required providers.
terraform init

# See what terraform will create
terraform plan

# Apply to deploy
terraform apply
```
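Once the apply completes, you can sanity-check that the cluster is up and that `kubectl` is pointed at it. Kind prefixes its kubeconfig contexts with `kind-`, so for a cluster named `meltano-cluster` the context should be `kind-meltano-cluster` (an assumption based on Kind's default naming):

```bash
# Confirm the API server is reachable.
kubectl cluster-info --context kind-meltano-cluster

# List the nodes Kind created inside Docker.
kubectl get nodes --context kind-meltano-cluster
```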
So what did that just deploy? Quite a lot:
- A local Docker registry, to push and pull images to and from.
- Prometheus for collecting container and Kubernetes cluster metrics.
- A Postgres database instance, configured with two databases for Meltano and Airflow.
- An NFS file server provisioner, for logs and output data.
- Nginx ingress controller, to expose Meltano at localhost and Airflow at localhost/airflow.
- The Meltano UI.
- Airflow.
  - Airflow is deployed in two containers, for the webserver and scheduler.
  - It is configured to use the Kubernetes Executor, which dynamically provisions a pod per task.
  - Logging is centralised using a shared NFS volume mount.
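A quick way to confirm all of this landed is to list the pods. Meltano and Airflow live in the `meltano` namespace (per the notes below); where the supporting components sit is decided by the Terraform config, so listing across all namespaces is the safe bet:

```bash
# Everything the apply created, across all namespaces.
kubectl get pods --all-namespaces

# Narrow to the Meltano and Airflow workloads.
kubectl get pods -n meltano
```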
- Check out the respective UIs; Meltano at localhost and Airflow at localhost/airflow.
- To see the containers created, and to connect to them directly, I recommend using Lens to explore.
- Meltano and Airflow are deployed into a namespace called `meltano`. Each container can be connected to directly (see the example after this list).
- Airflow logs are centralised using a shared NFS volume mounted at `/project/.meltano/run/airflow/logs`. This can be used to access log files directly. Logs are also viewable for each task in the Airflow UI as normal.
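As a minimal sketch of connecting to a container directly, the pod name below is hypothetical; substitute a real one from `kubectl get pods` or from Lens:

```bash
# List pods in the meltano namespace to get their names.
kubectl get pods -n meltano

# Open an interactive shell in a pod (the name 'meltano-ui-0'
# is hypothetical -- use one from the listing above).
kubectl exec -it meltano-ui-0 -n meltano -- /bin/bash

# Inside the pod, the shared NFS log mount noted above
# is browsable directly:
ls /project/.meltano/run/airflow/logs
```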
As Kubernetes is just running in Docker locally, the ultimate recourse for fixing things is to drop the cluster and start again:
```bash
kind delete cluster --name="meltano-cluster"
```
At that point you will also want to delete the state files generated by Terraform, as they relate to the resources we just deleted. You'll be prompted to run `terraform init` again, as we removed the `.terraform` folder for good measure:
```
.terraform
.terraform.lock.hcl
terraform.tfstate
terraform.tfstate.backup
meltano-cluster-config
```
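A one-liner for this cleanup, assuming you are still in the `deploy/local` directory, might look like:

```bash
# Remove local Terraform state and the generated kubeconfig.
# Run from deploy/local; double-check paths before deleting.
rm -rf .terraform .terraform.lock.hcl \
  terraform.tfstate terraform.tfstate.backup \
  meltano-cluster-config
```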
Finally, delete the local registry container in Docker Desktop to avoid name-clashes if/when you redeploy.
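You can do this in the Docker Desktop UI, or from the command line. The container name below is an assumption (Kind's docs conventionally call the local registry `kind-registry`), so check `docker ps` for the actual name first:

```bash
# Find the registry container by its image.
docker ps --filter "ancestor=registry:2"

# Stop and remove it (name assumed to be 'kind-registry').
docker rm -f kind-registry
```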