This repo lets you launch a single-region, self-hosted CockroachDB cluster on AWS using Terraform.
We are going to use the following tools to launch this cluster.
- CockroachDB - Scalable & Resilient Distributed SQL Database that can survive anything.
- AWS Cloud - Cloud infrastructure to host a single region of CRDB
- Terraform - Automates the infrastructure build on AWS
- PSSH - Parallel SSH tool to install and set up CockroachDB on the AWS EC2 instances
- AWS Client VPN - Creates secure remote access to AWS Cloud from client machines
- You can choose to start cockroachDB in secure or insecure mode.
- You can choose to connect over the internet gateway or through a VPN tunnel using AWS Client VPN.
We recommend using AWS Client VPN and secure mode to add extra layers of security when you connect from your local system. The architecture below is for the single-region secure CRDB cluster that this repo helps you build.
As per the architecture above, we create 3 EC2 instances, 1 VPC, 3 subnets, 1 Network Load Balancer, and AWS Client VPN for secure connections from remote machines.
Note: Change variables as needed in variables.tf
- Last Updated on 03/10/2022
- Terraform version : 1.3.1
The following prerequisites must be set up before using this repo:
- Install Terraform on local machine
- Install and configure AWS CLI properly on local machine
- Create an SSH key pair so that the launched AWS EC2 instances can be reached over SSH.
- Install PSSH on local system
- Install AWS Client VPN for Desktop on local machine
Terraform enables users to plan, create, and manage infrastructure as code. Providers are available for cloud platforms such as AWS, GCP, Azure, and more. These providers supply the resources that Terraform users rely on to provision and manage infrastructure.
- `terraform init` - initialize the Terraform scripts
- `terraform fmt` - format the Terraform configuration files
- `terraform validate` - validate the Terraform configuration
- `terraform apply` - create infrastructure from the configuration
- `terraform show` - inspect the build
- `terraform destroy` - destroy the infrastructure
- `terraform apply -var 'instance_name=yetanothername'` - change variables from the command line
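The commands above are typically run in sequence. The following is a minimal sketch of that sequence, assuming it is run from the directory that contains `main.tf`. The function name `tf_build` and the plan file name `plan.out` are hypothetical and not part of this repo.

```shell
# Hypothetical helper: run the usual Terraform sequence in order and stop
# at the first command that fails.
# plan.out is an illustrative file name for the saved plan.
tf_build() {
  terraform init &&
  terraform fmt &&
  terraform validate &&
  terraform plan -out=plan.out &&
  terraform apply plan.out
}
```

Saving the plan and then applying that file means the changes that get applied are the ones that were reviewed in the plan step.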
This is divided into 5 parts
- Infrastructure build using Terraform
- Setting up a VPN Tunnel for secure remote access
- Installing and setting up CockroachDB
- Starting CockroachDB
- Workload Testing
Read the below documentation for detailed understanding. https://www.cockroachlabs.com/docs/v22.1/deploy-cockroachdb-on-aws.html
`main.tf` and `variables.tf` are the two essential files in this repo that create the infrastructure. See `main.tf`, where we create the resources that support this install. The following resources are created:
- EC2 Instance
- VPC
- Subnets
- Internet gateway
- Route tables
- Security groups
- Load Balancers & Target groups
- To check the Terraform build plan, run the following command: `terraform plan`
- To build the infrastructure, run the following command: `terraform apply`
- Run the following command to get the public IP addresses of the EC2 instances that Terraform just created. We will need these IP addresses in the next steps: `terraform output`
- Go to the AWS Cloud Console and verify that all the infrastructure is built as expected.
Note : `This step is only needed if you want to create a VPN tunnel for secure access. If you do not, you can continue by connecting through the internet gateway. The architecture for that setup looks like the one below.`
Follow the detailed steps here for the AWS Client VPN connection. The high-level steps are listed below for your understanding:
- Generate server and client certificates and keys - Detailed steps here
- Create a Client VPN endpoint, associate the target network you created in Terraform, and associate subnets.
- Verify that your default security group and terraform created security groups are added.
- Add Authorization rule for VPC
- Download the Client VPN endpoint configuration and set up a profile in the VPN client.
- Connect via the Client VPN endpoint from your local machine.
- Test connection by connecting via the internal IP for any EC2 Instance.
- Go to `pssh_hosts_files.txt` and add `Internalhost-ip-address` for each node from the build that just completed if connecting via AWS Client VPN. If connecting via the internet gateway, add `Externalhost-ip-address` instead.
- Run the `setup.sh` script to configure the AWS Time Sync Service and install CockroachDB: `pssh -i -h pssh_hosts_files.txt -x "-oStrictHostKeyChecking=no -i add-your-key" -I < setup.sh`
- Log into each node and check that the setup ran as expected: `ssh -i add-ec2-key ec2-user@public-ip-of-host`
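The hosts file passed to `pssh -h` holds one `[user@]host[:port]` entry per line. Below is a minimal sketch for generating `pssh_hosts_files.txt` from a list of IP addresses. The function name `write_pssh_hosts` and the example addresses are hypothetical; substitute the internal or external IPs reported by `terraform output`.

```shell
# Hypothetical helper: print one "user@host" entry per line, the format
# pssh reads from the file given to -h.
# Usage: write_pssh_hosts <ssh-user> <ip> [<ip> ...]
write_pssh_hosts() {
  user="$1"
  shift
  for ip in "$@"; do
    printf '%s@%s\n' "$user" "$ip"
  done
}

# Example with placeholder addresses (replace with your own):
# write_pssh_hosts ec2-user 10.0.1.10 10.0.2.10 10.0.3.10 > pssh_hosts_files.txt
```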
- These steps can vary depending on how you want to configure the cluster. You can set up the cluster as either insecure or secure. Follow the secure cluster creation steps described here. Below are some key things you need to do after you install the cockroach binary on your local machine.
  Note : If AWS Client VPN is used, use internal IPs; for connections through the internet gateway, use external IPs.
- Modify and run the command below on each EC2 node, setting `--advertise-addr` to that node's own address: `cockroach start --certs-dir=certs --advertise-addr=node1:26257 --join=node1:26257,node2:26257,node3:26257 --cache=.25 --max-sql-memory=.25 --background`
- Initialize the cluster from your local machine: `cockroach init --certs-dir=certs --host=<internal ip address of any node on --join list>`
- Test whether the cluster has started by running the command below from your local machine: `cockroach node status --certs-dir=certs --host=<internal address of any node on --join list>` This should list every node running in the cluster; for a 3-node cluster, you should see 3 nodes.
- You can also go to https://ip-any-node:8080, which should take you to the DB Console. To log into the DB Console you will need a user. It is recommended to create a new user with a password, as below.
  On your local machine: `cockroach sql --certs-dir=certs --host=<address of any node on --join list>`
  In the SQL shell: `CREATE USER with_password WITH LOGIN PASSWORD 'add_password'; SHOW USERS;`
  Note : Since we use a self-signed certificate, the browser may warn that the connection is insecure. To solve this in production, you can use a separate ui.crt/ui.key pair signed by a well-known certificate authority such as Verisign. If you do this, the DB Console will use that key/cert pair for TLS, while the CRDB nodes will still use the node certs signed by your self-signed CA.
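The start and verification steps above can be sketched as two small helpers. This is only an illustration: the function names `start_cmd` and `count_nodes` are hypothetical, and `count_nodes` assumes that `cockroach node status` is run with `--format=csv` and that the first output line is the column header.

```shell
# Hypothetical helper: print the start command for one node, so the only
# per-node difference (--advertise-addr) is explicit.
# Usage: start_cmd <this-node-address> <comma-separated-join-list>
start_cmd() {
  printf 'cockroach start --certs-dir=certs --advertise-addr=%s:26257 --join=%s --cache=.25 --max-sql-memory=.25 --background\n' "$1" "$2"
}

# Hypothetical helper: count the nodes listed in CSV read from stdin, e.g.
#   cockroach node status --format=csv --certs-dir=certs --host=<address> | count_nodes
# Assumption: line 1 is the header, so only non-empty lines after it are counted.
count_nodes() {
  awk 'NR > 1 && NF > 0 { n++ } END { print n + 0 }'
}
```

For the three-node cluster built here, `count_nodes` would be expected to print 3 once every node has joined.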
We will run the workload against the AWS load balancer that we created. For this we need the IP address or DNS name of the load balancer.
- For Application Load Balancers and Network Load Balancers, use the following command to find the load balancer ID and DNS name. Alternatively, you can get this information from the load balancer details in the console: `aws elbv2 describe-load-balancers --names load-balancer-name`
- Initialize the tpcc workload: `cockroach workload init tpcc 'postgresql://root@ip-or-dns-name-of-network-load-balancer:26257/tpcc?sslmode=verify-full&sslrootcert=certs/ca.crt&sslcert=certs/client.root.crt&sslkey=certs/client.root.key'`
- Run the workload against the AWS load balancer: `cockroach workload run tpcc --duration=10m 'postgresql://root@ip-or-dns-name-of-network-load-balancer:26257/tpcc?sslmode=verify-full&sslrootcert=certs/ca.crt&sslcert=certs/client.root.crt&sslkey=certs/client.root.key'`
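The lookup and the connection string used in the steps above can be combined in a short sketch. The function names `lb_dns` and `tpcc_url` are hypothetical. The `--query` and `--output text` options are standard AWS CLI options, used here to print only the DNS name of the first matching load balancer.

```shell
# Look up the DNS name of a load balancer by its name.
# Usage: lb_dns <load-balancer-name>
lb_dns() {
  aws elbv2 describe-load-balancers --names "$1" \
    --query 'LoadBalancers[0].DNSName' --output text
}

# Hypothetical helper: build the connection URL used by the workload commands.
# Usage: tpcc_url <ip-or-dns-name-of-network-load-balancer>
tpcc_url() {
  printf 'postgresql://root@%s:26257/tpcc?sslmode=verify-full&sslrootcert=certs/ca.crt&sslcert=certs/client.root.crt&sslkey=certs/client.root.key\n' "$1"
}

# Example (replace load-balancer-name with your own):
# cockroach workload init tpcc "$(tpcc_url "$(lb_dns load-balancer-name)")"
```

Building the URL once keeps the `init` and `run` commands pointed at the same endpoint and certificate paths.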
You should now have a running cluster with a test workload flowing into the DB. Go try some other things on this running cluster using these tutorials/features, and have fun.