This repo includes:
- Documentation covering core Nomad concepts: server (control plane) / client (worker) architecture, jobs, tasks and allocations
- A terraform config for provisioning a 3 client / 3 server Nomad cluster in AWS
- Some sample jobs
Nomad is packaged as a single executable. It is written in Go and generally runs anywhere the Linux operating system is supported, including IBM s390x-based mainframes.
A Nomad cluster consists of two main elements:
- Server nodes, which make up the control plane
- Client nodes, the workers on which orchestrated jobs run

Clusters can be multi-region, and client nodes can be grouped into node pools (see the sketch below).
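As an illustration, a client joins a pool via its agent configuration and a job can then target that pool. A minimal sketch, assuming a hypothetical pool name of `batch-workers`:

```hcl
# Client agent configuration fragment (e.g. /etc/nomad.d/client.hcl);
# the pool name "batch-workers" is hypothetical.
client {
  enabled   = true
  node_pool = "batch-workers"
}

# Job specification fragment that targets the same pool.
job "nightly-report" {
  node_pool = "batch-workers"
  # ...
}
```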
The gossip protocol plays a key role in cluster node membership.
Users interact with Nomad clusters via jobs, which in turn encapsulate other constructs including tasks. There are a variety of ways to deploy jobs to a cluster and manage them, including the Nomad CLI, the HTTP API and the web UI.
Nomad comes with an ACL system, and node-to-node communications can be secured with TLS.
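A minimal sketch of the agent configuration stanzas involved, assuming illustrative certificate paths (the file names mirror the defaults produced by the `nomad tls` commands used later in this walkthrough):

```hcl
# Agent configuration fragment: enable ACLs and mutual TLS.
acl {
  enabled = true
}

tls {
  http = true
  rpc  = true

  # Paths are illustrative; adjust to wherever the certificates are installed.
  ca_file   = "/etc/nomad.d/tls/nomad-agent-ca.pem"
  cert_file = "/etc/nomad.d/tls/global-server-nomad.pem"
  key_file  = "/etc/nomad.d/tls/global-server-nomad-key.pem"

  verify_server_hostname = true
  verify_https_client    = true
}
```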
A key differentiator between Nomad and other orchestrators such as Kubernetes is that Nomad can orchestrate a wide variety of job types via task drivers. Simply put, if a task driver exists for a schedulable entity, Nomad can orchestrate that entity. HashiCorp provides first-party supported task drivers, and the ecosystem also includes community-written task drivers.
The raw_exec task driver provides shell-out-like capabilities for running jobs, but it should be used with caution: any job run under this driver runs as the same user as the Nomad agent itself, so the isolated exec driver should generally be preferred. A task sketch contrasting the two drivers follows.
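A minimal sketch of the two driver choices inside a task; the task names and the command shown are arbitrary:

```hcl
# Preferred: the exec driver isolates the task using kernel features
# such as cgroups and namespaces.
task "report" {
  driver = "exec"

  config {
    command = "/bin/date"
  }
}

# Use with caution: raw_exec runs the command with the same privileges
# as the Nomad agent (often root).
task "report-raw" {
  driver = "raw_exec"

  config {
    command = "/bin/date"
  }
}
```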
A Nomad job consists of a number of key elements; an example, rendered in Nomad HCL, follows this list:
- region: regions are defined at the server configuration level.
- datacenters: specifies the data centers in the region that the job is to be spread over.
- type: specifies the type of job; jobs intended to run indefinitely specify a type of service, as per the example.
- group: acts as a container for specifying which tasks should be executed on the same client; this is analogous to a pod in Kubernetes parlance.
- task: the finest-grained atomic unit of work Nomad can execute.
- task driver: used by Nomad clients to execute a task and provide resource isolation.
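A minimal illustrative service job; the job, group and task names, the Docker image and the resource values are placeholders rather than anything shipped in this repo:

```hcl
job "http-echo" {
  region      = "global"
  datacenters = ["dc1"]
  type        = "service"

  group "web" {
    count = 1

    task "server" {
      driver = "docker"

      config {
        image = "hashicorp/http-echo"
        args  = ["-listen", ":8080", "-text", "hello from Nomad"]
      }

      resources {
        cpu    = 100 # MHz
        memory = 128 # MB
      }
    }
  }
}
```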
Full documentation on the complete set of job specification options can be found in the [job specification documentation](https://developer.hashicorp.com/nomad/docs/job-specification).
By default Nomad uses a bin-packing algorithm to schedule jobs; however, specific client nodes can be targeted via the affinity stanza, and allocations can be spread across data centers via the spread stanza. Nomad 1.7 also introduces NUMA-aware scheduling (Enterprise edition), which is useful for latency-sensitive use cases such as low latency trading. An allocation is a core concept linked to scheduling in Nomad: allocations map the tasks in a job to clients.
Refer to the Nomad documentation on [scheduling](https://developer.hashicorp.com/nomad/docs/concepts/scheduling/scheduling) for further information on this topic.
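A sketch of the affinity and spread stanzas at the job level; the node class value is hypothetical:

```hcl
job "placement-example" {
  datacenters = ["dc1", "dc2"]

  # Prefer, but do not require, clients with a particular node class.
  affinity {
    attribute = "${node.class}"
    value     = "high-memory"
    weight    = 50
  }

  # Spread allocations evenly across the data centers.
  spread {
    attribute = "${node.datacenter}"
    weight    = 100
  }
}
```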
Nomad 1.7 introduced support for workload identities. Simply put, a JWT is generated that is unique for the allocation the job runs in.
The primary use case for workload identity is to allow Nomad workloads to authenticate with third parties via OIDC (including Vault and Consul).
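A minimal sketch of requesting workload identities inside a task; the identity name and audience value are hypothetical:

```hcl
task "app" {
  driver = "docker"

  # Expose the default workload identity JWT to the task.
  identity {
    env  = true # inject the JWT as an environment variable
    file = true # write the JWT into the task's secrets directory
  }

  # An additional named identity intended for a third party.
  identity {
    name = "example-oidc"
    aud  = ["example.io"]
  }
}
```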
- Clone this repo:
$ git clone https://github.com/ChrisAdkin8/Nomad-101-Demo.git
- cd into the Nomad-101-Demo/terraform directory.
- Open the terraform.tfvars file and assign:
  - an AMI id to the ami variable; the default in the file is for Ubuntu 22.04 in the us-east-1 region, so leave this as is if that is the region being deployed to, otherwise change it as appropriate
  - a gossip encryption key to nomad_gossip_key (for example, the string generated by nomad operator gossip keyring generate)
  - the Nomad Enterprise license to nomad_license (only if using the Enterprise version)
  - uncomment the Nomad Enterprise / Nomad OSS blocks as appropriate (an illustrative terraform.tfvars fragment follows this list)
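An illustrative terraform.tfvars fragment; the values are placeholders and the variable names are taken from the steps above, so the file shipped in the repo remains authoritative:

```hcl
# terraform.tfvars (illustrative values only)
ami              = "ami-0123456789abcdef0"       # Ubuntu 22.04 AMI for the target region
nomad_gossip_key = "<generated gossip key>"      # output of the gossip key generation command
nomad_license    = "<Nomad Enterprise license>"  # only if using the Enterprise version
```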
- Change directory to the certificates ca directory:
$ cd terraform/certificates/ca
- Create the TLS CA private key and certificate:
$ nomad tls ca create
- Create the nomad server private key and certificate and move them to the servers directory:
$ nomad tls cert create -server -region global
$ mv *server*.pem ../servers/.
- Create the nomad client private key and certificate and move them to the clients directory:
$ nomad tls cert create -client
$ mv *client*.pem ../clients/.
- Create the nomad cli private key and certificate and move them to the cli directory:
$ nomad tls cert create -cli
$ mv *cli*.pem ../cli/.
- Change directory back to the Nomad-101-Demo/terraform directory:
$ cd ../..
- Set the environment variables that allow terraform to connect to your AWS account:
export AWS_ACCESS_KEY_ID=<your AWS access key ID>
export AWS_SECRET_ACCESS_KEY=<your AWS secret access key>
export AWS_SESSION_TOKEN=<your AWS session token>
- Install the provider plugins required by the configuration:
$ terraform init
- Apply the configuration; this will result in the creation of the new resources (29 in the example output below):
$ terraform apply -auto-approve
- The tail of the terraform apply output should look something like this:
Apply complete! Resources: 29 added, 0 changed, 0 destroyed.
Outputs:
IP_Addresses = <<EOT
Nomad Cluster installed
SSH default user: ubuntu
Server public IPs: 54.172.43.18, 18.212.218.138, 184.72.134.0
Client public IPs: 54.167.92.93, 54.80.76.185, 52.73.202.229
If ACL is enabled:
To get the nomad bootstrap token, run the following on the leader server
export NOMAD_TOKEN=$(cat /home/ubuntu/nomad_bootstrap)
EOT
lb_address_consul_nomad = "http://54.172.43.18:4646"
- SSH access to the Nomad cluster client and server EC2 instances can be achieved via:
$ ssh -i certs/id_rsa.pem ubuntu@<client/server IP address>
- Once SSH'd into one of the EC2 instances, check that the nomad systemd unit is in a healthy state. Note that, depending on the EC2 instance you SSH onto, that instance may or may not be the current cluster leader:
$ systemctl status nomad
● nomad.service - Nomad
Loaded: loaded (/lib/systemd/system/nomad.service; disabled; vendor preset: enabled)
Active: active (running) since Mon 2024-01-08 11:42:16 UTC; 2min 3s ago
Docs: https://nomadproject.io/docs/
Main PID: 5617 (nomad)
Tasks: 7
Memory: 86.4M
CPU: 2.706s
CGroup: /system.slice/nomad.service
└─5617 /usr/bin/nomad agent -config /etc/nomad.d
Jan 08 11:42:25 ip-172-31-206-75 nomad[5617]: 2024-01-08T11:42:25.543Z [INFO] nomad.raft: entering leader state: leader="Node at 172.31.206.75:4647 [Leader]"
Jan 08 11:42:25 ip-172-31-206-75 nomad[5617]: 2024-01-08T11:42:25.543Z [INFO] nomad.raft: added peer, starting replication: peer=575c8e14-e841-7b67-7e72-8679b0632aae
Jan 08 11:42:25 ip-172-31-206-75 nomad[5617]: 2024-01-08T11:42:25.543Z [INFO] nomad.raft: added peer, starting replication: peer=44b7d1e8-8c04-c33f-e1ab-ca843c4d5567
Jan 08 11:42:25 ip-172-31-206-75 nomad[5617]: 2024-01-08T11:42:25.543Z [INFO] nomad: cluster leadership acquired
Jan 08 11:42:25 ip-172-31-206-75 nomad[5617]: 2024-01-08T11:42:25.544Z [INFO] nomad.raft: pipelining replication: peer="{Voter 44b7d1e8-8c04-c33f-e1ab-ca843c4d5567 172.31.74.132:4647}"
Jan 08 11:42:25 ip-172-31-206-75 nomad[5617]: 2024-01-08T11:42:25.547Z [INFO] nomad.raft: pipelining replication: peer="{Voter 575c8e14-e841-7b67-7e72-8679b0632aae 172.31.81.190:4647}"
Jan 08 11:42:25 ip-172-31-206-75 nomad[5617]: 2024-01-08T11:42:25.578Z [INFO] nomad.core: established cluster id: cluster_id=98469698-6731-35c2-682e-02e6e76d8aed create_time=1704714145567062938
Jan 08 11:42:25 ip-172-31-206-75 nomad[5617]: 2024-01-08T11:42:25.578Z [INFO] nomad: eval broker status modified: paused=false
Jan 08 11:42:25 ip-172-31-206-75 nomad[5617]: 2024-01-08T11:42:25.578Z [INFO] nomad: blocked evals status modified: paused=false
Jan 08 11:42:25 ip-172-31-206-75 nomad[5617]: 2024-01-08T11:42:25.817Z [INFO] nomad.keyring: initialized keyring: id=56c026c8-0f96-fb71-5dca-20961686da10
Note: the nomad and consul components are installed by cloud-init, which may take an extra 30 seconds or so after the terraform config has been applied.
- Whilst still SSH'd into one of the nomad nodes, bootstrap the nomad ACL system:
$ nomad acl bootstrap
Accessor ID = 29604ac7-da5c-4b4c-50e6-8d6d78856ba2
Secret ID = b0c12a19-552g-c073-56c1-d438aafb37ag
Name = Bootstrap Token
Type = management
Global = true
Create Time = 2024-01-08 11:44:38.673696794 +0000 UTC
Expiry Time = <none>
Create Index = 19
Modify Index = 19
Policies = n/a
Roles = n/a
- Assign the Secret ID from the output of the last command to the NOMAD_TOKEN environment variable:
$ export NOMAD_TOKEN=<secret id obtained from nomad acl bootstrap output>
- Check that all three nomad cluster server nodes are in a healthy state:
$ nomad server status
Name Address Port Status Leader Raft Version Build Datacenter Region
ip-172-31-206-75.global 172.31.206.75 4648 alive true 3 1.7.2 dc1 global
ip-172-31-74-132.global 172.31.74.132 4648 alive false 3 1.7.2 dc1 global
ip-172-31-81-190.global 172.31.81.190 4648 alive false 3 1.7.2 dc1 global