WIP - Update makefile - Do Not Merge #89

Open · wants to merge 8 commits into base: master

5 changes: 5 additions & 0 deletions .gitignore
@@ -12,3 +12,8 @@
.viminfo
.novaclient/
.cinderclient/
certs/
*.tfvars
*.tfstate
*.tfstate.backup
config.yaml
3 changes: 3 additions & 0 deletions .gitmodules
@@ -0,0 +1,3 @@
[submodule "kubespray"]
path = kubespray
url = https://github.com/ncsa/kubespray.git
17 changes: 17 additions & 0 deletions .travis.yml
@@ -0,0 +1,17 @@
# Run workbench in minikube
sudo: required

env:
- CHANGE_MINIKUBE_NONE_USER=true

before_script:
- curl -Lo kubectl https://storage.googleapis.com/kubernetes-release/release/v1.7.0/bin/linux/amd64/kubectl && chmod +x kubectl && sudo mv kubectl /usr/local/bin/
- curl -Lo minikube https://storage.googleapis.com/minikube/releases/latest/minikube-linux-amd64 && chmod +x minikube && sudo mv minikube /usr/local/bin/
- sudo minikube start --vm-driver=none --kubernetes-version=v1.7.0
- minikube update-context
- JSONPATH='{range .items[*]}{@.metadata.name}:{range @.status.conditions[*]}{@.type}={@.status};{end}{end}'; until kubectl get nodes -o jsonpath="$JSONPATH" 2>&1 | grep -q "Ready=True"; do sleep 1; done

script:
- make workbench
- kubectl get pod

138 changes: 75 additions & 63 deletions README.md
@@ -1,89 +1,101 @@
# NDS Labs Workbench Deploy Tools
This repository contains a set of [Ansible](https://www.ansible.com/) scripts to deploy [Kubernetes](https://kubernetes.io/docs/concepts/overview/what-is-kubernetes/) and the [Labs Workbench](https://github.com/nds-org/ndslabs) onto an OpenStack cluster
This repository contains a set of tools that will deploy Workbench onto one or
more nodes. The tools are all coordinated by `make`, so you can execute only the
steps you need for your particular setup.

## Available Deployment Steps
| Step | Description | Make Target |
| ------ | ----------- | ----------- |
| Terraform | Provision VMs on cloud provider using Terraform | `terraform` |
| Verify VMs | Use the Ansible Ping command to make sure that the hosts have been provisioned correctly and are accessible | `ping` |
| Install Kubernetes | Use `Kubespray` to install Kubernetes in the cluster | `kubernetes` |
| NDS Workbench | Deploy the NDS Workbench on the provisioned Kubernetes Cluster | `workbench` |
| Destroy Workbench | Delete all of the services and pods associated with workbench | `workbench-down` |
| Demo Account | Create a demo account with a known password. This is not yet fully automated, since it requires a change to `ndslabsctl` to allow the admin password to be passed in on the command line. For now it creates a shell in the API server pod and displays instructions on how to install the demo user | `demo-login` |
| Label Worker Nodes | The API Server only starts services on nodes that are labeled. This runs a script to label appropriate nodes as eligible | `label-workers` |
| Destroy VMs | Use Terraform to destroy the cluster and release the VMs | `clean` |
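
Taken together, a full deployment roughly chains these targets in order. The file names below are illustrative and simply follow the examples later in this README:

```bash
# Illustrative end-to-end sequence; substitute your own tfvars/tfstate names.
make TFVARS=my-cluster.tfvars TFSTATE=my-cluster.tfstate terraform   # provision VMs
make TFSTATE=my-cluster.tfstate ping                                 # verify the hosts respond
make TFSTATE=my-cluster.tfstate kubernetes                           # install Kubernetes via Kubespray
make workbench                                                       # deploy the NDS Workbench
make label-workers                                                   # mark nodes eligible for services
make demo-login                                                      # optional: create a demo account
```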

## Terraform
Execute this step to use Terraform to allocate and commission the VMs that will host
your Kubernetes cluster. Before running this step you need to create a
`tfvars` file that describes the cluster you would like to create. The contents
of this file are specified in the [Kubespray Terraform README](https://github.com/kubernetes-incubator/kubespray/tree/master/contrib/terraform/openstack).
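
A minimal sketch of such a file is shown below; the variable names follow the Kubespray Terraform README linked above, and every value is a placeholder you must adapt to your own OpenStack project:

```
# Sketch only -- see the Kubespray Terraform README for the full variable list.
cluster_name          = "my-workbench"
network_name          = "my-workbench-net"
external_net          = "<external-network-uuid>"
image                 = "CoreOS"
ssh_user              = "core"
number_of_k8s_masters = 1
number_of_k8s_nodes   = 2
flavor_k8s_master     = "<flavor-id>"
flavor_k8s_node       = "<flavor-id>"
```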

You also need to set environment variables with your OpenStack credentials, as
shown in the same README.

To run the make command, provide the name of the `tfvars` file to use
and of the `tfstate` file in which to store the results.

If you don't have access to an OpenStack cluster, there are [plenty of ways to run Kubernetes](https://kubernetes.io/docs/setup/pick-right-solution/)!

# Prerequisites
* [Docker](https://www.docker.com/get-docker)

# Build Docker Image
```bash
docker build -t ndslabs/deploy-tools .
% make TFVARS=sdsc-single-note.tfvars TFSTATE=sdsc-single-note.tfstate kubernetes
```

# Run Docker Image
```bash
docker run -it -v /home/core/private:/root/SAVED_AND_SENSITIVE_VOLUME ndslabs/deploy-tools bash
```
Once complete, you can verify your stack with a ping command run on each of the
provisioned hosts. Ansible will communicate directly with hosts that have an
external IP; other hosts will be contacted via the bastion host.

NOTE: You should remember to map some volume to `/root/SAVED_AND_SENSITIVE_VOLUME` containing your `*-openrc.sh` file. This directory is where the ansible output gets stored. This includes SSH private keys, generated TLS certificates, and Ansible's own fact cache. If you forget to map this directory, its contents **WILL BE LOST FOREVER**.
This command depends on the `tfstate` file from the Terraform build to resolve
the inventory.

# Provide Your OpenStack Credentials
The first thing you need to do is to `source` the openrc file of the project you wish to deploy to in OpenStack

NOTE: this file can be retrieved for any OpenStack project which you can access by following the instructions [here](https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux_OpenStack_Platform/4/html/End_User_Guide/cli_openrc.html).

Assuming you've passed your openrc.sh file with `-v`, as recommended above:
Try this command:
```bash
source /root/SAVED_AND_SENSITIVE_VOLUME/OpenStackProjectName-openrc.sh
% make TFSTATE=sdsc-single-note.tfstate ping
```

# Prepare Your Site
Some parameters, such as the available flavors (sizes) and images for the deployed OpenStack instances, are properties of the particular installation of OpenStack or the projects to which you are allowed to deploy. We refer to each installation of OpenStack as a "site", and similarly store their variables under `/root/inventory/site_vars`, where each file is named after the site that it represents.

To set up a new site, you can simply copy an existing site and change the names of the images and flavors accordingly.
## Deploy Kubernetes with Kubespray
The next step is to install Kubernetes on the cluster. Before executing this step
you should customize `k8s-cluster.yml` in the repo's root directory. One setting
of particular note is `calico_mtu`, which should be set to a value appropriate
for the OpenStack installation you are deploying to. You may also edit
`all.yml`, which controls cluster-wide settings for the Ansible deploy.
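
For example, to account for the per-packet overhead of an OpenStack network that encapsulates traffic, the relevant line in `k8s-cluster.yml` looks something like this (1430 is only an illustration; use the value your site requires):

```yaml
# k8s-cluster.yml (excerpt) -- the MTU value shown is illustrative
calico_mtu: 1430
```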

## Obtain a CoreOS Image
[Download](https://coreos.com/os/docs/latest/booting-on-openstack.html) the newest stable cloud image of CoreOS for OpenStack and [import](https://docs.openstack.org/user-guide/dashboard-manage-images.html) it into your project.
Once you are satisfied with the settings, you can request the Kubespray deploy
with:

Currently supported CoreOS version: **1235.6**

NOTE: While newer versions of CoreOS *should* work, due to CoreOS and Docker versions being tied together later versions may not be supported immediately.

## Choosing a Flavor
Set the site_vars named `flavor_small` / `flavor_medium` / `flavor_large` to flavors that already exist in your OpenStack project, or create new flavors that match these.

# Compose Your Inventory
Make a copy of the existing example or minimal inventory located in `/root/inventory` and edit it to your liking:
Try this command:
```bash
cp inventory/minimal-ncsa inventory/my-cluster
vi inventory/my-cluster
% make TFSTATE=sdsc-single-note.tfstate kubernetes
```

* The top section pertains to **Cluster Variables** - here you can override any group_vars (NOTE: site_vars cannot yet be overridden)
* The middle section defines **Servers**, where we choose the names and quantities for each type of node
* The last section defines **Groups**, which groups the node types that we declared above into several larger groups
After Kubernetes is deployed you will still need to follow the instructions in
the [README](https://github.com/kubernetes-incubator/kubespray/tree/master/contrib/terraform/openstack) to add the new cluster to your kubectl config.
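
As a rough sketch, and assuming Kubespray was configured to write an admin kubeconfig locally (the `kubeconfig_localhost` option), pointing kubectl at the new cluster looks like:

```bash
# Assumed location: Kubespray writes admin.conf under the inventory's artifacts/
# directory when kubeconfig_localhost is enabled; adjust the path to your setup.
export KUBECONFIG=$PWD/kubespray/inventory/artifacts/admin.conf
kubectl get nodes   # confirm the new cluster is reachable
```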

## Deploy Workbench
You must first edit the `config.yml` file in the repo's root directory to set
values for your workbench.

## About Group Variables
Some parameters are different based on the type of node being provisioned - Ansible calls these "groups". The group-specific values can be found under `/root/inventory/group_vars`, where each file is named after the group it represents.
The value for `workbench.domain` is particularly important.
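
A minimal sketch of that setting is below; only `workbench.domain` comes from this README, and the surrounding structure is an assumption:

```yaml
# config.yml (excerpt) -- structure assumed; only workbench.domain is documented here
workbench:
  domain: workbench.example.org
```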

NOTE: these groups can be nested / hierarchical.
**NOTE**: Raw images should be preferred at OpenStack sites where Ceph is used for the backing volumes, as it will significantly decrease the time needed to provision and start your cluster.
Deploy workbench with the command:

# Ansible Playbooks
After adjusting the inventory/site parameters to your liking, run the three Ansible playbooks to bring up a Labs Workbench cluster:
```bash
ansible-playbook -i inventory/my-cluster playbooks/openstack-provision.yml && \
ansible-playbook -i inventory/my-cluster playbooks/k8s-install.yml && \
ansible-playbook -i inventory/my-cluster playbooks/ndslabs-k8s-install.yml
% make workbench
```

These commands can be run one at a time, or all at once for provisioning in a single command:
## Create a Demo Account
In order to work with your workbench you need to log into it. The account
approval workflow can be vexing in a development environment. You can bypass
this by forcing a demo account into your registry. You can create a demo account
with the command:

```bash
ansible-playbook -i inventory/my-cluster playbooks/openstack-provision.yml playbooks/k8s-install.yml playbooks/ndslabs-k8s-install.yml
% make demo-login
```

## About Playbooks
Each playbook takes care of a small portion of the installation process:
* `playbooks/openstack-provision.yml`: Provision OpenStack volumes and instances with chosen flavor / image
* `playbooks/k8s-install.yml`: Download and install Kubernetes binaries onto each node
* `playbooks/ndslabs-k8s-install.yml`: Deploy our Kubernetes YAML files to start up services necessary to run Labs Workbench
This will copy the account specification file from
`scripts/account-register.json` into the API server's pod, launch
a bash shell in that pod, and prompt you to log in as admin using `ndslabsctl`
and run the command to add that user to the registry. This manual step is needed
until [NDS-1172](https://opensource.ncsa.illinois.edu/jira/browse/NDS-1172) is
complete.
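
The manual steps behind this target look roughly like the following; the pod label selector is an assumption, and the exact `ndslabsctl` invocation is whatever the `demo-login` target prints:

```bash
# Illustrative only: copy the account spec into the API server pod and open a shell.
API_POD=$(kubectl get pod -l name=ndslabs-apiserver \
  -o jsonpath='{.items[0].metadata.name}')            # label selector is an assumption
kubectl cp scripts/account-register.json "$API_POD":/tmp/account-register.json
kubectl exec -it "$API_POD" -- bash
# Inside the pod: log in as admin with ndslabsctl and register the account from
# /tmp/account-register.json, following the instructions printed by `make demo-login`.
```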

## About Node Labels
After running all three playbooks, you should be left with a working cluster.
## Label Compute Nodes
The API Server needs to know which nodes can run services. Execute:

Labels recognized by the cluster are as follows:
* *glfs* server nodes must be labelled with `ndslabs-role-glfs=true` for the GLFS servers to run there
* *compute* nodes must be labelled with `ndslabs-role-compute=true` for the Workbench API server to schedule services there
* *loadbal* nodes must be labelled with `ndslabs-role-loadbal=true` to know where a public IP is available, so it can run the ingress/load balancer
* *lma* nodes should be labelled with `ndslabs-role-lma=true` to know where dedicated resources are set aside to run logging/monitoring/alerts
```bash
% make label-workers
```
For now this script only works for single-node deployments: it labels the
master node as a worker. Eventually, this script will label only the Kubernetes
worker nodes.
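
On a multi-node cluster you can apply the labels from the list above by hand; the node name is a placeholder:

```bash
# Illustrative: mark a node as eligible for Workbench services.
kubectl label nodes <node-name> ndslabs-role-compute=true
# The other roles listed above follow the same pattern, for example:
kubectl label nodes <node-name> ndslabs-role-loadbal=true
```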
129 changes: 129 additions & 0 deletions all.yml
@@ -0,0 +1,129 @@
# Valid bootstrap options (required): ubuntu, coreos, centos, none
bootstrap_os: none

# Directory where etcd data is stored
etcd_data_dir: /var/lib/etcd

# Directory where the binaries will be installed
bin_dir: /usr/local/bin

## The access_ip variable is used to define how other nodes should access
## the node. This is used in flannel to allow other flannel nodes to see
## this node for example. The access_ip is really useful in AWS and Google
## environments where the nodes are accessed remotely by the "public" ip,
## but don't know about that address themselves.
#access_ip: 1.1.1.1

### LOADBALANCING AND ACCESS MODES
## Enable multiaccess to configure etcd clients to access all of the etcd members directly
## as the "http://hostX:port, http://hostY:port, ..." and ignore the proxy loadbalancers.
## This may be the case if clients support and loadbalance multiple etcd servers natively.
#etcd_multiaccess: true

## External LB example config
## apiserver_loadbalancer_domain_name: "elb.some.domain"
#loadbalancer_apiserver:
# address: 1.2.3.4
# port: 1234

## Internal loadbalancers for apiservers
#loadbalancer_apiserver_localhost: true

## Local loadbalancer should use this port instead, if defined.
## Defaults to kube_apiserver_port (6443)
#nginx_kube_apiserver_port: 8443

### OTHER OPTIONAL VARIABLES
## For some things, kubelet needs to load kernel modules. For example, dynamic kernel services are needed
## for mounting persistent volumes into containers. These may not be loaded by preinstall kubernetes
## processes. For example, ceph and rbd backed volumes. Set to true to allow kubelet to load kernel
## modules.
# kubelet_load_modules: false

## Internal network total size. This is the prefix of the
## entire network. Must be unused in your environment.
#kube_network_prefix: 18

## With calico it is possible to distribute routes with border routers of the datacenter.
## Warning: enabling router peering will disable calico's default behavior ('node mesh').
## The subnets of each node will be distributed by the datacenter router.
#peer_with_router: false

## Upstream dns servers used by dnsmasq
#upstream_dns_servers:
# - 8.8.8.8
# - 8.8.4.4

## There are some changes specific to the cloud providers
## for instance we need to encapsulate packets with some network plugins
## If set the possible values are either 'gce', 'aws', 'azure', 'openstack', 'vsphere', or 'external'
## When openstack is used make sure to source in the openstack credentials
## like you would do when using nova-client before starting the playbook.
#cloud_provider:

## When azure is used, you need to also set the following variables.
## see docs/azure.md for details on how to get these values
#azure_tenant_id:
#azure_subscription_id:
#azure_aad_client_id:
#azure_aad_client_secret:
#azure_resource_group:
#azure_location:
#azure_subnet_name:
#azure_security_group_name:
#azure_vnet_name:
#azure_route_table_name:

## When OpenStack is used, Cinder version can be explicitly specified if autodetection fails (Fixed in 1.9: https://github.com/kubernetes/kubernetes/issues/50461)
#openstack_blockstorage_version: "v1/v2/auto (default)"
## When OpenStack is used, if LBaaSv2 is available you can enable it with the following 2 variables.
#openstack_lbaas_enabled: True
#openstack_lbaas_subnet_id: "Neutron subnet ID (not network ID) to create LBaaS VIP"
## To enable automatic floating ip provisioning, specify a subnet.
#openstack_lbaas_floating_network_id: "Neutron network ID (not subnet ID) to get floating IP from, disabled by default"
## Override default LBaaS behavior
#openstack_lbaas_use_octavia: False
#openstack_lbaas_method: "ROUND_ROBIN"
#openstack_lbaas_provider: "haproxy"
#openstack_lbaas_create_monitor: "yes"
#openstack_lbaas_monitor_delay: "1m"
#openstack_lbaas_monitor_timeout: "30s"
#openstack_lbaas_monitor_max_retries: "3"

## Uncomment to enable experimental kubeadm deployment mode
#kubeadm_enabled: false
#kubeadm_token_first: "{{ lookup('password', 'credentials/kubeadm_token_first length=6 chars=ascii_lowercase,digits') }}"
#kubeadm_token_second: "{{ lookup('password', 'credentials/kubeadm_token_second length=16 chars=ascii_lowercase,digits') }}"
#kubeadm_token: "{{ kubeadm_token_first }}.{{ kubeadm_token_second }}"
#
## Set these proxy values in order to update package manager and docker daemon to use proxies
#http_proxy: ""
#https_proxy: ""
## Refer to roles/kubespray-defaults/defaults/main.yml before modifying no_proxy
#no_proxy: ""

## Uncomment this if you want to force overlay/overlay2 as docker storage driver
## Please note that overlay2 is only supported on newer kernels
#docker_storage_options: -s overlay2

# Uncomment this if you have more than 3 nameservers, then we'll only use the first 3.
#docker_dns_servers_strict: false

## Default packages to install within the cluster, f.e:
#kpm_packages:
# - name: kube-system/grafana

## Certificate Management
## This setting determines whether certs are generated via scripts or whether a
## cluster of Hashicorp's Vault is started to issue certificates (using etcd
## as a backend). Options are "script" or "vault"
#cert_management: script

# Set to true to allow pre-checks to fail and continue deployment
#ignore_assert_errors: false

## Etcd auto compaction retention for mvcc key value store in hour
#etcd_compaction_retention: 0

## Set level of detail for etcd exported metrics, specify 'extensive' to include histogram metrics.
#etcd_metrics: basic