To get started with Kubernetes quickly, you can use Kismatic to stand up a small cluster in AWS or virtualized on a personal computer.
But setting up a proper cluster takes a little forethought. Depending on your intent, you may need to engage multiple teams within your organization to correctly provision the required infrastructure. Planning will also help you identify provisioning tasks and know what information will be needed to proceed with installation.
Planning focuses mainly on three areas of concern:
- The machines that will form a cluster
- The network the cluster will operate on
- Other services the cluster will be interacting with
Node Role | Suggested Count | Options |
---|---|---|
Etcd nodes | 3 | 1, 3, 5, 7 |
Master nodes | 2 | 1, 2 |
Kubernetes is installed on multiple physical or virtual machines running Linux. These machines become nodes of the Kubernetes cluster.
In a Kismatic installation of Kubernetes, nodes are specialized to one of three distinct roles within the cluster: etcd, master or worker.
- etcd: These nodes provide data storage for the master.
- master: These nodes provide API endpoints and manage the Pods installed on workers.
- worker: These nodes are where your Pods are instantiated.
Nodes within a cluster should have latencies between them of 10ms or lower to prevent instability. If you would like to host workloads in multiple data centers, or in a hybrid cloud scenario, you should expect to set up at least one cluster in each geographically separated region.
Infrastructure supported:
- bare metal
- virtual machines
- AWS EC2
- Packet.net
If using VMs or IaaS, we suggest avoiding virtualization strategies that rely on the assignment of partial CPUs to your VM, such as AWS T2 instances or CPU oversubscription of VMs.
Operating Systems supported:
- RHEL 7
- CentOS 7
- Ubuntu 16.04
Minimum hardware requirements:
Node Role | CPU | RAM | Disk (Prototyping¹) | Disk (Production¹) |
---|---|---|---|---|
etcd | 1 CPU core, 2 GHz | 1 GB | 8 GB | 50 GB |
master | 1 CPU core, 2 GHz | 2 GB | 8 GB | 50 GB |
worker | 1 CPU core, 2 GHz | 1 GB | 8 GB | 200 GB |
¹ A prototype cluster is one you build for a short-term use case (less than a week or so). It can have smaller drives, but you wouldn't want to run like this for extended use.
Worker Count | CPUs | RAM (GB) |
---|---|---|
< 5 | 1 | 3.75 |
< 10 | 2 | 7.5 |
< 100 | 4 | 15 |
< 250 | 8 | 30 |
< 500 | 16 | 30 |
< 1000 | 32 | 60 |
Kubernetes nodes must have swap memory disabled. Otherwise, the Kubelet will fail to start. If you want to run your Kubernetes nodes with swap memory enabled, you must override the Kubelet configuration to disable the swap check:
```yaml
cluster:
  # ...
  kubelet:
    option_overrides:
      fail-swap-on: false
```
Each etcd node receives all the data for a cluster, which helps protect against data loss in the event that something happens to one of the nodes. A Kubernetes cluster is able to operate as long as more than 50% of its etcd nodes are online. Always use an odd number of etcd nodes. The etcd node count is primarily an availability concern; adding more etcd nodes can actually decrease Kubernetes performance.
Node Count | Safe for |
---|---|
1 | Unsafe. Use only for small development clusters. |
3 | Failure of any one node |
5 | Simultaneous failure of two nodes |
7 | Simultaneous failure of three nodes |
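Because etcd requires more than half of its nodes to be online, the failures a cluster can survive follow directly from the node count. A minimal sketch of that arithmetic (illustrative only, not part of the installer):

```python
def etcd_fault_tolerance(node_count: int) -> int:
    """How many etcd nodes can fail simultaneously while more than
    50% of the nodes (a quorum) remain online."""
    quorum = node_count // 2 + 1
    return node_count - quorum

# An odd node count buys extra fault tolerance; an even count does not.
for n in (1, 2, 3, 4, 5, 7):
    print(f"{n} etcd nodes -> quorum of {n // 2 + 1}, tolerates {etcd_fault_tolerance(n)} failure(s)")
```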
Master nodes provide API endpoints and keep Kubernetes workloads running. A Kubernetes cluster is able to operate as long as one of its master nodes is online. We suggest at least two master nodes for availability.
Node Count | Safe for |
---|---|
1 | Unsafe. Use only for small development clusters. |
2 | Failure of any one node. |
Both users of Kubernetes and Kubernetes itself occasionally need to communicate with a master via a URL. With two or more masters, we suggest introducing a load-balanced URL (via a virtual IP or a DNS CNAME). This is required to allow clients and Kubernetes components to balance between the masters and to provide uninterrupted operation in the event that a master node goes offline.
Worker nodes are where your applications will run. Your initial worker count should be large enough to hold all the workloads you intend to deploy, plus enough slack to handle a partial failure. You can add more workers as necessary after the initial setup without interrupting operation of the cluster.
Networking Decision | Options |
---|---|
Networking technique | Routed or Overlay |
How hostnames will be resolved for nodes | Use DNS, or let Kismatic manage hosts files on nodes |
Network policy control | No network policy, or Calico-managed network policy |
Pod network CIDR block | |
Services network CIDR block | |
Load-balanced URL for master nodes | |
Kubernetes allocates a unique IP address for every Pod created on a cluster. Within a cluster, all Pods are visible to all other Pods and directly addressable by IP, simplifying point-to-point communications.
Similarly, Kubernetes uses a special network for Services, allowing them to talk to each other via an address that is stable even as the underlying cluster topology changes.
For this to work, Kubernetes makes use of technologies built into Docker and Linux (including iptables, bridge networking, and the Container Network Interface). We tie these together with a networking technology from Tigera called Calico.
To provide these behaviors, Kubernetes needs to be able to issue IP addresses from two IP ranges: a pod network and a services network. This is in addition to the IP addresses nodes will be assigned on their local network.
The pod and service network ranges each need to be assigned a single contiguous CIDR block large enough to handle your workloads and any future scaling. With Calico, worker and master nodes are assigned pod IP addresses in blocks of 64; newly created pods on a node receive an address from its block until all 64 IPs are consumed, at which point an additional block is allocated to the node.
Thus, your pod network must be sized so that:
Pod Network IP Block Size >= (Worker Node Count + Master Node Count) * 64
Our default CIDR block for the pod network is 172.16.0.0/16, which allows for a maximum of roughly 65k pods in total, or roughly 1,000 nodes running 64 pods each or fewer.
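As a quick sanity check, this sizing rule can be evaluated with Python's standard ipaddress module (the node counts below are placeholders for your own plan):

```python
import ipaddress

def pod_network_is_large_enough(pod_cidr: str, worker_count: int,
                                master_count: int, block_size: int = 64) -> bool:
    """Check the rule: pod network IPs >= (workers + masters) * 64."""
    network = ipaddress.ip_network(pod_cidr)
    return network.num_addresses >= (worker_count + master_count) * block_size

# Example: the default pod CIDR with 100 workers and 2 masters.
print(pod_network_is_large_enough("172.16.0.0/16", worker_count=100, master_count=2))  # True
```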
Similarly, the service network needs to be large enough to handle all of the Services that might be created on the cluster. Our default is 172.20.0.0/16, which would allow for 65k services and that ought to be enough for anybody.
Take care that the IP ranges managed by Kubernetes do not collide with IP addresses on the local network; in particular, exclude these ranges from DHCP control.
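One way to catch a collision early is to compare the planned pod and service ranges against your node subnet before installation. A minimal sketch, with example subnets standing in for your own:

```python
import ipaddress

# Example ranges; substitute the subnets from your own plan.
node_network = ipaddress.ip_network("10.0.50.0/24")
pod_network = ipaddress.ip_network("172.16.0.0/16")
service_network = ipaddress.ip_network("172.20.0.0/16")

pairs = [("node", node_network, "pod", pod_network),
         ("node", node_network, "service", service_network),
         ("pod", pod_network, "service", service_network)]

for name_a, net_a, name_b, net_b in pairs:
    if net_a.overlaps(net_b):
        print(f"WARNING: the {name_a} network {net_a} overlaps the {name_b} network {net_b}")
    else:
        print(f"OK: the {name_a} network {net_a} does not overlap the {name_b} network {net_b}")
```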
There are two techniques we support for pod networking on Kubernetes: overlay and routed.
In an overlay network, communications between pods happen on a virtual network that is only visible to machines that are running an agent. This agent communicates with other agents via the node's local network and establishes IP-over-IP tunnels through which Kubernetes Pod traffic is routed.
In this model, no work has to be done to allow pods to communicate with each other (other than ensuring that you are not blocking IP-over-IP traffic). Two or more Kubernetes clusters might even operate on the same pod and services IP ranges without being able to see each other's traffic.
However, work does need to be done to expose pods to the local network. This role is usually filled by a Kubernetes Ingress Controller.
Overlay networks work best for development clusters.
In a routed network, communications between pods happen on a network that is accessible to all machines on their local network. In this model, each node acts as a router for the IP ranges of the pods that it hosts. The cluster communicates with existing network routers via BGP to establish which nodes are responsible for routing which addresses. Once routing is in place, a request to a pod or service IP is treated the same as any other request on the network; there is no tunnel or IP wrapping involved. This may also make it easier to inspect traffic via tools like Wireshark and tcpdump.
In a routed model, cluster communications often work out of the box. Sometimes routers need to be configured to expect and acknowledge BGP messages from a cluster.
Routed networks work best when you want a majority of your workloads to be automatically visible to clients that aren't on Kubernetes, including other systems on the local network.
Sometimes, it is valuable to peer nodes in the cluster with a network router that is physically near to them. For this purpose, the cluster announces its BGP messages with an AS Number that may be specified when Kubernetes is installed. Our default AS Number is 64511.
By default, Pods can talk to any port on any other Pod, Service, or node on their network. Pod-to-pod network access is a requirement of Kubernetes, but this degree of openness is not.
When policy is enabled, access to all Pods is restricted and managed in part by Kubernetes and the Calico networking plugin. When new Pods are added, any ports identified within the Pod definition will be made accessible to other pods. Access can be further opened or closed using the Calico command line tools installed on every master node -- for example, you may grant a developer's machine access to a pod, or to a namespace of pods.
Network policy is an experimental feature that can make prototyping the cluster more difficult. It’s turned off by default.
All nodes in the cluster will need a short name with which they can communicate with each other. DNS is one way to provide this.
It's also valuable to have a load-balanced alias for the master servers in your cluster, allowing for transparent failover if a master node goes offline. This can be provided either via DNS load balancing or via a virtual IP if your network has a load balancer already. Pick an FQDN and short name for this master alias that reflects your cluster's intent -- for example, if this is the only Kubernetes cluster on your network, kubernetes.yourdomain.com would be ideal.
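Once the alias exists, it's easy to confirm that it resolves before installation. A minimal sketch, using kubernetes.yourdomain.com as a stand-in for whatever FQDN you choose:

```python
import socket

# Stand-in alias; use the FQDN you picked for your masters.
MASTER_ALIAS = "kubernetes.yourdomain.com"

# Collect the distinct addresses behind the alias; a DNS-load-balanced name
# may return several, while a virtual IP will typically return just one.
infos = socket.getaddrinfo(MASTER_ALIAS, 6443, proto=socket.IPPROTO_TCP)
addresses = sorted({info[4][0] for info in infos})
print(f"{MASTER_ALIAS} resolves to: {', '.join(addresses)}")
```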
If you do not wish to run DNS, you may optionally allow the Kismatic installer to manage hosts files on all of your nodes. Be aware that this option will not scale beyond a few dozen nodes, as adding or removing nodes through the installer will force a hosts file update to all nodes on the cluster.
Kubernetes must be allowed to manage network policy for any IP range it manages.
Network policies for the local network on which nodes reside will need to be set up prior to construction of the cluster, or installation will fail.
Purpose for rule | Target node types | Source IP range | Allow Rules |
---|---|---|---|
To allow communication with the Kismatic Inspector | all | installer node | tcp:8888 |
To allow access to the API server | master | worker nodes, master nodes, and the IP ranges of any machines you want to be able to manage Kubernetes workloads | tcp:6443 |
To allow all internal traffic between Kubernetes nodes | all | all nodes in the Kubernetes cluster | tcp:0-65535, udp:0-65535 |
To allow SSH | all | worker nodes, master nodes, and the IP ranges of any machines you want to be able to manage Kubernetes nodes | tcp:22 |
To allow communications between etcd nodes | etcd | etcd nodes | tcp:2380, tcp:6660 |
To allow communications between Kubernetes nodes and etcd | etcd | master nodes | tcp:2379 |
To allow communications between Calico networking and etcd | etcd | etcd nodes | tcp:6666 |
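Before running the installer, it can be worth confirming that these ports are actually reachable between the machines involved. A minimal sketch of such a check (the hostnames are placeholders; run it from a machine that should have access):

```python
import socket

# Placeholder targets; substitute your own node addresses and the ports
# from the table above that apply to them.
checks = [
    ("master-1.example.com", 6443),  # API server
    ("etcd-1.example.com", 2379),    # etcd client port used by master nodes
    ("etcd-1.example.com", 2380),    # etcd peer port
]

for host, port in checks:
    try:
        with socket.create_connection((host, port), timeout=3):
            print(f"OK: {host}:{port} is reachable")
    except OSError as err:
        print(f"BLOCKED or DOWN: {host}:{port} ({err})")
```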
Certificates Decision | Default |
---|---|
Expiration period for certificates | 17520h |
Kismatic will automate generation and installation of TLS certificates and keys used for intra-cluster security. It does this using the open source CloudFlare SSL library. These certificates and keys are exclusively used to encrypt and authorize traffic between Kubernetes components; they are not presented to end-users.
The default expiry period for certificates is 17520h (2 years). Certificates must be updated prior to expiration, or the cluster will cease to operate without warning. As of Kubernetes 1.4, replacing certificates causes momentary downtime; future versions should allow certificates to be rolled without downtime.
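Because an expired certificate stops the cluster without warning, it's worth keeping an eye on expiry dates. A minimal sketch that reads a certificate's expiration, assuming the third-party cryptography package is installed; the file path is hypothetical, so point it at one of the certificates Kismatic generated:

```python
from datetime import datetime, timedelta

from cryptography import x509                      # pip install cryptography
from cryptography.hazmat.backends import default_backend

# Hypothetical path; use one of the certificates generated for your cluster.
CERT_PATH = "/etc/kubernetes/apiserver.pem"

with open(CERT_PATH, "rb") as f:
    cert = x509.load_pem_x509_certificate(f.read(), default_backend())

remaining = cert.not_valid_after - datetime.utcnow()
print(f"Certificate expires {cert.not_valid_after} UTC ({remaining.days} days from now)")
if remaining < timedelta(days=30):
    print("WARNING: certificate expires within 30 days; plan to regenerate it")
```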