WIP - Update makefile - Do Not Merge #89

Open · wants to merge 8 commits into base: master

5 changes: 5 additions & 0 deletions .gitignore
@@ -12,3 +12,8 @@
.viminfo
.novaclient/
.cinderclient/
certs/
*.tfvars
*.tfstate
*.tfstate.backup
config.yaml
3 changes: 3 additions & 0 deletions .gitmodules
@@ -0,0 +1,3 @@
[submodule "kubespray"]
path = kubespray
url = https://github.com/ncsa/kubespray.git
17 changes: 17 additions & 0 deletions .travis.yml
@@ -0,0 +1,17 @@
# Run workbench in minikube
sudo: required

env:
- CHANGE_MINIKUBE_NONE_USER=true

before_script:
- curl -Lo kubectl https://storage.googleapis.com/kubernetes-release/release/v1.7.0/bin/linux/amd64/kubectl && chmod +x kubectl && sudo mv kubectl /usr/local/bin/
- curl -Lo minikube https://storage.googleapis.com/minikube/releases/latest/minikube-linux-amd64 && chmod +x minikube && sudo mv minikube /usr/local/bin/
- sudo minikube start --vm-driver=none --kubernetes-version=v1.7.0
- minikube update-context
- JSONPATH='{range .items[*]}{@.metadata.name}:{range @.status.conditions[*]}{@.type}={@.status};{end}{end}'; until kubectl get nodes -o jsonpath="$JSONPATH" 2>&1 | grep -q "Ready=True"; do sleep 1; done

script:
- make workbench
- kubectl get pod

138 changes: 75 additions & 63 deletions README.md
@@ -1,89 +1,101 @@
# NDS Labs Workbench Deploy Tools
This repository contains a set of [Ansible](https://www.ansible.com/) scripts to deploy [Kubernetes](https://kubernetes.io/docs/concepts/overview/what-is-kubernetes/) and the [Labs Workbench](https://github.com/nds-org/ndslabs) onto an OpenStack cluster
This repository contains a set of tools that will deploy Workbench onto one or
more nodes. The tools are all coordinated by `make`, so you can execute only the
steps you need for your particular setup.

## Available Deployment Steps
| Step | Description | Make Target |
| ------ | ----------- | ----------- |
| Terraform | Provision VMs on cloud provider using Terraform | `terraform` |
| Verify VMs | Use the Ansible Ping command to make sure that the hosts have been provisioned correctly and are accessible | `ping` |
| Install Kubernetes | Use `Kubespray` to install Kubernetes in the cluster | `kubernetes` |
| NDS Workbench | Deploy the NDS Workbench on the provisioned Kubernetes Cluster | `workbench` |
| Destroy Workbench | Delete all of the services and pods associated with workbench | `workbench-down` |
| Demo Account | Create a demo account with a known password. This is not yet fully automated, since it requires a change to `ndslabsctl` to allow the admin password to be passed in on the command line. For now it creates a shell in the API server pod and displays instructions on how to install the demo user | `demo-login` |
| Label Worker Nodes | The API Server only starts services on nodes that are labeled. This runs a script to label appropriate nodes as eligible | `label-workers` |
| Destroy VMs | Use Terraform to destroy the cluster and release the VMs | `clean` |
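
Taken together, a full deployment roughly chains these targets in order. The file names below are illustrative and simply follow the examples later in this README:

```bash
# Illustrative end-to-end sequence; substitute your own tfvars/tfstate names.
make TFVARS=my-cluster.tfvars TFSTATE=my-cluster.tfstate terraform   # provision VMs
make TFSTATE=my-cluster.tfstate ping                                 # verify the hosts respond
make TFSTATE=my-cluster.tfstate kubernetes                           # install Kubernetes via Kubespray
make workbench                                                       # deploy the NDS Workbench
make label-workers                                                   # mark nodes eligible for services
make demo-login                                                      # optional: create a demo account
```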

## Terraform
Execute this step to use Terraform to allocate and commission the VMs that will host
your Kubernetes cluster. Before running this step you need to create a
`tfvars` file that describes the cluster you would like to create. The contents
of this file are specified in the [Kubespray Terraform README](https://github.com/kubernetes-incubator/kubespray/tree/master/contrib/terraform/openstack).
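
A minimal sketch of such a file is shown below; the variable names follow the Kubespray Terraform README linked above, and every value is a placeholder you must adapt to your own OpenStack project:

```
# Sketch only -- see the Kubespray Terraform README for the full variable list.
cluster_name          = "my-workbench"
network_name          = "my-workbench-net"
external_net          = "<external-network-uuid>"
image                 = "CoreOS"
ssh_user              = "core"
number_of_k8s_masters = 1
number_of_k8s_nodes   = 2
flavor_k8s_master     = "<flavor-id>"
flavor_k8s_node       = "<flavor-id>"
```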

You also need to set environment variables with your OpenStack credentials, as
shown in the same README.

To run the make command, provide the name of the `tfvars` file to use
and of the `tfstate` file in which to store the results.

If you don't have access to an OpenStack cluster, there are [plenty of ways to run Kubernetes](https://kubernetes.io/docs/setup/pick-right-solution/)!

# Prerequisites
* [Docker](https://www.docker.com/get-docker)

# Build Docker Image
```bash
docker build -t ndslabs/deploy-tools .
% make TFVARS=sdsc-single-note.tfvars TFSTATE=sdsc-single-note.tfstate kubernetes
```

# Run Docker Image
```bash
docker run -it -v /home/core/private:/root/SAVED_AND_SENSITIVE_VOLUME ndslabs/deploy-tools bash
```
Once complete, you can verify your stack with a ping command run on each of the
provisioned hosts. Ansible will communicate directly with hosts that have an
external IP; other hosts will be contacted via the bastion host.

NOTE: You should remember to map some volume to `/root/SAVED_AND_SENSITIVE_VOLUME` containing your `*-openrc.sh` file. This directory is where the ansible output gets stored. This includes SSH private keys, generated TLS certificates, and Ansible's own fact cache. If you forget to map this directory, its contents **WILL BE LOST FOREVER**.
This command depends on the `tfstate` file from the Terraform build to resolve
the inventory.

# Provide Your OpenStack Credentials
The first thing you need to do is to `source` the openrc file of the project you wish to deploy to in OpenStack

NOTE: this file can be retrieved for any OpenStack project which you can access by following the instructions [here](https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux_OpenStack_Platform/4/html/End_User_Guide/cli_openrc.html).

Assuming you've passed your openrc.sh file with `-v`, as recommended above:
Try this command:
```bash
source /root/SAVED_AND_SENSITIVE_VOLUME/OpenStackProjectName-openrc.sh
% make TFSTATE=sdsc-single-note.tfstate ping
```

# Prepare Your Site
Some parameters, such as the available flavors (sizes) and images for the deployed OpenStack instances, are properties of the particular installation of OpenStack or the projects to which you are allowed to deploy. We refer to each installation of OpenStack as a "site", and similarly store their variables under `/root/inventory/site_vars`, where each file is named after the site that it represents.

To set up a new site, you can simply copy an existing site and change the names of the images and flavors accordingly.
## Deploy Kubernetes with Kubespray
The next step is to install Kubernetes on the cluster. Before executing this step
you should customize `k8s-cluster.yml` in the repo's root directory. One setting
of particular note is `calico_mtu`, which should be set to a value appropriate
for the OpenStack installation you are deploying to. You may also edit
`all.yml`, which controls cluster-wide settings for the Ansible deploy.
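
For example, to account for the per-packet overhead of an OpenStack network that encapsulates traffic, the relevant line in `k8s-cluster.yml` looks something like this (1430 is only an illustration; use the value your site requires):

```yaml
# k8s-cluster.yml (excerpt) -- the MTU value shown is illustrative
calico_mtu: 1430
```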

## Obtain a CoreOS Image
[Download](https://coreos.com/os/docs/latest/booting-on-openstack.html) the newest stable cloud image of CoreOS for OpenStack and [import](https://docs.openstack.org/user-guide/dashboard-manage-images.html) it into your project.
Once you are satisfied with the settings, you can request the Kubespray deploy
with:

Currently supported CoreOS version: **1235.6**

NOTE: While newer versions of CoreOS *should* work, due to CoreOS and Docker versions being tied together later versions may not be supported immediately.

## Choosing a Flavor
Set the site_vars named `flavor_small` / `flavor_medium` / `flavor_large` to flavors that already exist in your OpenStack project, or create new flavors that match these.

# Compose Your Inventory
Make a copy of the existing example or minimal inventory located in `/root/inventory` and edit it to your liking:
Try this command:
```bash
cp inventory/minimal-ncsa inventory/my-cluster
vi inventory/my-cluster
% make TFSTATE=sdsc-single-note.tfstate kubernetes
```

* The top section pertains to **Cluster Variables** - here you can override any group_vars (NOTE: site_vars cannot yet be overridden)
* The middle section defines **Servers**, where we choose the names and quantities for each type of node
* The last section defines **Groups**, which groups the node types that we declared above into several larger groups
After Kubernetes is deployed you will still need to follow the instructions in
the [README](https://github.com/kubernetes-incubator/kubespray/tree/master/contrib/terraform/openstack) to add the new cluster to your kubectl config.
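
As a rough sketch, and assuming Kubespray was configured to write an admin kubeconfig locally (the `kubeconfig_localhost` option), pointing kubectl at the new cluster looks like:

```bash
# Assumed location: Kubespray writes admin.conf under the inventory's artifacts/
# directory when kubeconfig_localhost is enabled; adjust the path to your setup.
export KUBECONFIG=$PWD/kubespray/inventory/artifacts/admin.conf
kubectl get nodes   # confirm the new cluster is reachable
```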

## Deploy Workbench
You must first edit the `config.yml` file in the repo's root directory to set
values for your workbench.

## About Group Variables
Some parameters are different based on the type of node being provisioned - Ansible calls these "groups". The group-specific values can be found under `/root/inventory/group_vars`, where each file is named after the group it represents.
The value for `workbench.domain` is particularly important.
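
A minimal sketch of that setting is below; only `workbench.domain` comes from this README, and the surrounding structure is an assumption:

```yaml
# config.yml (excerpt) -- structure assumed; only workbench.domain is documented here
workbench:
  domain: workbench.example.org
```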

NOTE: these groups can be nested / hierarchical.
**NOTE**: Raw images should be preferred at OpenStack sites where Ceph is used for the backing volumes, as it will significantly decrease the time needed to provision and start your cluster.
Deploy workbench with the command:

# Ansible Playbooks
After adjusting the inventory/site parameters to your liking, run the three Ansible playbooks to bring up a Labs Workbench cluster:
```bash
ansible-playbook -i inventory/my-cluster playbooks/openstack-provision.yml && \
ansible-playbook -i inventory/my-cluster playbooks/k8s-install.yml && \
ansible-playbook -i inventory/my-cluster playbooks/ndslabs-k8s-install.yml
% make workbench
```

These commands can be run one at a time, or all at once for provisioning in a single command:
## Create a Demo Account
In order to work with your workbench you need to log into it. The account
approval workflow can be vexing in a development environment. You can bypass
this by forcing a demo account into your registry. You can create a demo account
with the command:

```bash
ansible-playbook -i inventory/my-cluster playbooks/openstack-provision.yml playbooks/k8s-install.yml playbooks/ndslabs-k8s-install.yml
% make demo-login
```

## About Playbooks
Each playbook takes care of a small portion of the installation process:
* `playbooks/openstack-provision.yml`: Provision OpenStack volumes and instances with chosen flavor / image
* `playbooks/k8s-install.yml`: Download and install Kubernetes binaries onto each node
* `playbooks/ndslabs-k8s-install.yml`: Deploy our Kubernetes YAML files to start up services necessary to run Labs Workbench
This will copy the account specification file from
`scripts/account-register.json` into the API server's pod, launch
a bash shell in that pod, and prompt you to log in as admin using `ndslabsctl`
and run the command to add that user to the registry. This manual step is needed
until [NDS-1172](https://opensource.ncsa.illinois.edu/jira/browse/NDS-1172) is
complete.
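
The manual steps behind this target look roughly like the following; the pod label selector is an assumption, and the exact `ndslabsctl` invocation is whatever the `demo-login` target prints:

```bash
# Illustrative only: copy the account spec into the API server pod and open a shell.
API_POD=$(kubectl get pod -l name=ndslabs-apiserver \
  -o jsonpath='{.items[0].metadata.name}')            # label selector is an assumption
kubectl cp scripts/account-register.json "$API_POD":/tmp/account-register.json
kubectl exec -it "$API_POD" -- bash
# Inside the pod: log in as admin with ndslabsctl and register the account from
# /tmp/account-register.json, following the instructions printed by `make demo-login`.
```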

## About Node Labels
After running all three playbooks, you should be left with a working cluster.
## Label Compute Nodes
The API Server needs to know which nodes can run services. Execute:

Labels recognized by the cluster are as follows:
* *glfs* server nodes must be labelled with `ndslabs-role-glfs=true` for the GLFS servers to run there
* *compute* nodes must be labelled with `ndslabs-role-compute=true` for the Workbench API server to schedule services there
* *loadbal* nodes must be labelled with `ndslabs-role-loadbal=true` to know where a public IP is available, so it can run the ingress/load balancer
* *lma* nodes should be labelled with `ndslabs-role-lma=true` to know where dedicated resources are set aside to run logging/monitoring/alerts
```bash
% make label-workers
```
For now this script only works for single-node deployments: it labels the
master node as a worker. Eventually, this script will label only the Kubernetes
worker nodes.
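
On a multi-node cluster you can apply the labels from the list above by hand; the node name is a placeholder:

```bash
# Illustrative: mark a node as eligible for Workbench services.
kubectl label nodes <node-name> ndslabs-role-compute=true
# The other roles listed above follow the same pattern, for example:
kubectl label nodes <node-name> ndslabs-role-loadbal=true
```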
129 changes: 129 additions & 0 deletions all.yml
@@ -0,0 +1,129 @@
# Valid bootstrap options (required): ubuntu, coreos, centos, none
bootstrap_os: none

# Directory where etcd data is stored
etcd_data_dir: /var/lib/etcd

# Directory where the binaries will be installed
bin_dir: /usr/local/bin

## The access_ip variable is used to define how other nodes should access
## the node. This is used in flannel to allow other flannel nodes to see
## this node for example. The access_ip is really useful in AWS and Google
## environments where the nodes are accessed remotely by the "public" ip,
## but don't know about that address themselves.
#access_ip: 1.1.1.1

### LOADBALANCING AND ACCESS MODES
## Enable multiaccess to configure etcd clients to access all of the etcd members directly
## as the "http://hostX:port, http://hostY:port, ..." and ignore the proxy loadbalancers.
## This may be the case if clients support and loadbalance multiple etcd servers natively.
#etcd_multiaccess: true

## External LB example config
## apiserver_loadbalancer_domain_name: "elb.some.domain"
#loadbalancer_apiserver:
# address: 1.2.3.4
# port: 1234

## Internal loadbalancers for apiservers
#loadbalancer_apiserver_localhost: true

## Local loadbalancer should use this port instead, if defined.
## Defaults to kube_apiserver_port (6443)
#nginx_kube_apiserver_port: 8443

### OTHER OPTIONAL VARIABLES
## For some things, kubelet needs to load kernel modules. For example, dynamic kernel services are needed
## for mounting persistent volumes into containers. These may not be loaded by preinstall kubernetes
## processes. For example, ceph and rbd backed volumes. Set to true to allow kubelet to load kernel
## modules.
# kubelet_load_modules: false

## Internal network total size. This is the prefix of the
## entire network. Must be unused in your environment.
#kube_network_prefix: 18

## With calico it is possible to distribute routes with border routers of the datacenter.
## Warning: enabling router peering will disable calico's default behavior ('node mesh').
## The subnets of each node will be distributed by the datacenter router.
#peer_with_router: false

## Upstream dns servers used by dnsmasq
#upstream_dns_servers:
# - 8.8.8.8
# - 8.8.4.4

## There are some changes specific to the cloud providers
## for instance we need to encapsulate packets with some network plugins
## If set the possible values are either 'gce', 'aws', 'azure', 'openstack', 'vsphere', or 'external'
## When openstack is used make sure to source in the openstack credentials
## like you would do when using nova-client before starting the playbook.
#cloud_provider:

## When azure is used, you need to also set the following variables.
## see docs/azure.md for details on how to get these values
#azure_tenant_id:
#azure_subscription_id:
#azure_aad_client_id:
#azure_aad_client_secret:
#azure_resource_group:
#azure_location:
#azure_subnet_name:
#azure_security_group_name:
#azure_vnet_name:
#azure_route_table_name:

## When OpenStack is used, Cinder version can be explicitly specified if autodetection fails (Fixed in 1.9: https://github.com/kubernetes/kubernetes/issues/50461)
#openstack_blockstorage_version: "v1/v2/auto (default)"
## When OpenStack is used, if LBaaSv2 is available you can enable it with the following 2 variables.
#openstack_lbaas_enabled: True
#openstack_lbaas_subnet_id: "Neutron subnet ID (not network ID) to create LBaaS VIP"
## To enable automatic floating ip provisioning, specify a subnet.
#openstack_lbaas_floating_network_id: "Neutron network ID (not subnet ID) to get floating IP from, disabled by default"
## Override default LBaaS behavior
#openstack_lbaas_use_octavia: False
#openstack_lbaas_method: "ROUND_ROBIN"
#openstack_lbaas_provider: "haproxy"
#openstack_lbaas_create_monitor: "yes"
#openstack_lbaas_monitor_delay: "1m"
#openstack_lbaas_monitor_timeout: "30s"
#openstack_lbaas_monitor_max_retries: "3"

## Uncomment to enable experimental kubeadm deployment mode
#kubeadm_enabled: false
#kubeadm_token_first: "{{ lookup('password', 'credentials/kubeadm_token_first length=6 chars=ascii_lowercase,digits') }}"
#kubeadm_token_second: "{{ lookup('password', 'credentials/kubeadm_token_second length=16 chars=ascii_lowercase,digits') }}"
#kubeadm_token: "{{ kubeadm_token_first }}.{{ kubeadm_token_second }}"
#
## Set these proxy values in order to update package manager and docker daemon to use proxies
#http_proxy: ""
#https_proxy: ""
## Refer to roles/kubespray-defaults/defaults/main.yml before modifying no_proxy
#no_proxy: ""

## Uncomment this if you want to force overlay/overlay2 as docker storage driver
## Please note that overlay2 is only supported on newer kernels
#docker_storage_options: -s overlay2

# Uncomment this if you have more than 3 nameservers, then we'll only use the first 3.
#docker_dns_servers_strict: false

## Default packages to install within the cluster, f.e:
#kpm_packages:
# - name: kube-system/grafana

## Certificate Management
## This setting determines whether certs are generated via scripts or whether a
## cluster of Hashicorp's Vault is started to issue certificates (using etcd
## as a backend). Options are "script" or "vault"
#cert_management: script

# Set to true to allow pre-checks to fail and continue deployment
#ignore_assert_errors: false

## Etcd auto compaction retention for mvcc key value store in hour
#etcd_compaction_retention: 0

## Set level of detail for etcd exported metrics, specify 'extensive' to include histogram metrics.
#etcd_metrics: basic