- Based on Debian 11 x86-64; also applies to Debian-family distributions such as Ubuntu 18.04/20.04;
- Built "The Hard Way", entirely by hand;
- Kubernetes cluster components are deployed and run as plain binaries;
- Nodes join the cluster using Bootstrap Tokens;
- The default container runtime is CRI-O; Containerd deployment steps are also included;
- Cluster layout:
  - 1 HAProxy (external load balancer for the Kube API Server and Ingress traffic, also the kubectl management host)
  - 3 Master nodes
  - 3 Node (worker) nodes
- The etcd cluster is co-located on the 3 Master nodes and is likewise deployed from binaries;
- Container network: Calico;
- Container storage: NFS-CSI and Ceph-CSI, with an NFS StorageClass;
- Kubernetes Dashboard;
- Nginx Ingress;
- E.F.K (Elasticsearch / Fluentd / Kibana);
- kube-prometheus;
- metrics-server;
- Helm;
- Certificates are issued with a default validity of 100 years;
Component | Version / Branch / Image Tag |
---|---|
Kubernetes | 1.27.16 |
CRI-O | 1.27.8 |
cfssl | 1.6.2 |
Containerd (可选) | 1.7.20 |
CNI-Plugins (可选) | 1.4.0 |
crictl (可选) | 1.30.0 |
RunC (可选) | 1.1.13 |
etcd | 3.5.15 |
CoreDNS | 1.8.6 |
Dashboard | 2.7.0 |
kube-prometheus | 0.10 |
Calico | 3.28.0 |
calicoctl | 3.21.5 |
Ingress-Nginx | 1.10.3 |
Elastic Search | 7.16.2 |
Kibana | 7.16.2 |
Fluentd | latest |
Helm | 3.15.2 |
Ceph-CSI | 3.11.0 |
nfs-subdir-external-provisioner | master |
metrics-server | latest |
Cluster Network | IP / CIDR / Domain |
---|---|
Pod CIDR | 172.20.0.0/16 |
Service Cluster CIDR | 10.254.0.0/16 |
Cluster Endpoint | 10.254.0.1 |
CoreDNS | 10.254.0.53 |
Cluster Domain | k8s.inanu.net |
Dashboard | k8s.inanu.net |
Prometheus | k8s-prometheus.inanu.net |
Grafana | k8s-grafana.inanu.net |
Alert Manager | k8s-alertmanager.inanu.net |
Kibana | k8s-kibana.inanu.net |
Hostname | IP | Role | OS |
---|---|---|---|
v0 / v0.inanu.net | 172.31.31.70 | External load balancer / kubectl host | Debian 11 |
v1 / v1.inanu.net | 172.31.31.71 | Master / etcd | Debian 11 |
v2 / v2.inanu.net | 172.31.31.72 | Master / etcd | Debian 11 |
v3 / v3.inanu.net | 172.31.31.73 | Master / etcd | Debian 11 |
v4 / v4.inanu.net | 172.31.31.74 | Node / Ingress | Debian 11 |
v5 / v5.inanu.net | 172.31.31.75 | Node / Ingress | Debian 11 |
v6 / v6.inanu.net | 172.31.31.76 | Node / Ingress | Debian 11 |
- All operations in this document are performed as root;
- v0
  - Acts as the external load balancer for the K8S API Server and Ingress traffic; in production, deploy at least two such hosts and configure KeepAlived for high availability;
  - Download the CloudFlare cfssl certificate tooling (cfssl / cfssljson), e.g. as in the sketch below;
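A minimal download sketch, assuming the cfssl v1.6.2 GitHub release assets keep their usual names (adjust if they differ):
cd /usr/local/src
wget https://github.com/cloudflare/cfssl/releases/download/v1.6.2/cfssl_1.6.2_linux_amd64
wget https://github.com/cloudflare/cfssl/releases/download/v1.6.2/cfssljson_1.6.2_linux_amd64
install -m 0755 cfssl_1.6.2_linux_amd64 /usr/local/bin/cfssl
install -m 0755 cfssljson_1.6.2_linux_amd64 /usr/local/bin/cfssljson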
- v1 - v6
  - Make sure the /var partition is large enough;
  - Configure /etc/hosts or DNS so that every node is reachable by hostname and FQDN;
  - Configure systemd-timesyncd or NTP time synchronization;
  - Disable iptables;
  - v0 must be able to log in to all other hosts as root via SSH key authentication (see the sketch below);
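A hedged example of the host entries and SSH key distribution, using the hostnames and addresses from the tables above:
cat >> /etc/hosts <<EOF
172.31.31.70 v0 v0.inanu.net
172.31.31.71 v1 v1.inanu.net
172.31.31.72 v2 v2.inanu.net
172.31.31.73 v3 v3.inanu.net
172.31.31.74 v4 v4.inanu.net
172.31.31.75 v5 v5.inanu.net
172.31.31.76 v6 v6.inanu.net
EOF
# On v0: generate a key pair and push the public key to v1 - v6
ssh-keygen -t ed25519 -N '' -f /root/.ssh/id_ed25519
for I in {1..6};do ssh-copy-id -i /root/.ssh/id_ed25519.pub root@v${I};done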
- v0 - v6
timedatectl set-timezone Asia/Shanghai
- v1 - v6
cat > /etc/profile.d/path.sh <<EOF
export PATH="/bin:/usr/bin:/sbin:/usr/sbin:/usr/local/bin:/usr/local/sbin:/opt/kubernetes/client/bin:/opt/kubernetes/server/bin:/opt/kubernetes/node/bin:/opt/cni/bin:/opt/etcd"
EOF
- If using the Containerd CRI, configure the following PATH instead:
cat > /etc/profile.d/path.sh <<EOF
export PATH="/bin:/usr/bin:/sbin:/usr/sbin:/usr/local/bin:/usr/local/sbin:/opt/kubernetes/client/bin:/opt/kubernetes/server/bin:/opt/kubernetes/node/bin:/opt/cni/bin:/opt/runc/sbin:/opt/containerd/bin:/opt/crictl/bin:/opt/etcd"
EOF
source /etc/profile
- v1 - v6
apt update && apt upgrade -y
apt install -y \
bash-completion \
bridge-utils \
wget \
socat \
jq \
git \
curl \
rsync \
conntrack \
ipset \
ipvsadm \
ebtables \
sysstat \
libltdl7 \
lvm2 \
iptables \
lsb-release \
libseccomp2 \
scdaemon \
gnupg \
gnupg2 \
gnupg-agent \
nfs-common \
ceph-common \
glusterfs-client \
ca-certificates \
apt-transport-https \
software-properties-common
- v1 - v6
cat > /etc/systemd/system/swapoff.service <<EOF
[Unit]
Before=network.target
[Service]
Type=oneshot
ExecStart=/usr/sbin/swapoff -a
ExecStop=/usr/sbin/swapon -a
RemainAfterExit=yes
[Install]
WantedBy=multi-user.target
EOF
systemctl daemon-reload && systemctl enable --now swapoff.service
sed -i '/swap/ s/^\(.*\)$/#\1/g' /etc/fstab
- v1 - v6
modprobe -- ip_vs
modprobe -- ip_vs_rr
modprobe -- ip_vs_wrr
modprobe -- ip_vs_sh
modprobe -- nf_conntrack
modprobe -- overlay
modprobe -- br_netfilter
modprobe -- rbd
cat >> /etc/modules <<EOF
overlay
br_netfilter
ip_vs
ip_vs_rr
ip_vs_wrr
ip_vs_sh
nf_conntrack
rbd
EOF
- v0 - v6
- Be sure to adjust these kernel parameters to your actual environment;
cat >> /etc/sysctl.conf <<EOF
### K8S
net.ipv4.ip_forward = 1
net.ipv4.neigh.default.gc_thresh1 = 1024
net.ipv4.neigh.default.gc_thresh2 = 2048
net.ipv4.neigh.default.gc_thresh3 = 4096
net.ipv6.neigh.default.gc_thresh1 = 1024
net.ipv6.neigh.default.gc_thresh2 = 2048
net.ipv6.neigh.default.gc_thresh3 = 4096
net.ipv4.tcp_timestamps = 1
net.ipv4.tcp_retries1 = 3
net.ipv4.tcp_retries2 = 10
net.ipv4.tcp_keepalive_time = 1200
net.ipv4.tcp_keepalive_probes = 5
net.ipv4.tcp_keepalive_intvl = 30
net.ipv4.tcp_window_scaling = 1
net.ipv4.tcp_sack = 1
net.ipv4.tcp_no_metrics_save = 0
net.ipv4.tcp_max_syn_backlog = 1024
net.ipv4.tcp_fin_timeout = 30
net.ipv4.ip_local_port_range = 10240 60999
net.core.somaxconn = 8192
net.core.optmem_max = 20480
net.core.netdev_max_backlog = 3000
net.netfilter.nf_conntrack_max = 2310720
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
vm.dirty_background_ratio = 10
vm.swappiness = 0
vm.overcommit_memory = 1
vm.panic_on_oom = 0
fs.inotify.max_user_instances = 8192
fs.inotify.max_user_watches = 1048576
fs.file-max = 52706963
fs.nr_open = 52706963
EOF
sysctl -p
- v0 - v6
vi /etc/security/limits.conf
* soft nproc 131072
* hard nproc 131072
* soft nofile 131072
* hard nofile 131072
root soft nproc 131072
root hard nproc 131072
root soft nofile 131072
root hard nofile 131072
- If using the Containerd CRI, you may need to downgrade Debian 11 from its default cgroups v2 to v1;
- With the CRI-O CRI there is no need to switch the cgroups version;
vi /etc/default/grub
GRUB_CMDLINE_LINUX_DEFAULT="quiet systemd.unified_cgroup_hierarchy=0 cgroup_enable=memory swapaccount=1"
update-grub
reboot
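A quick sanity check after the reboot (not part of the original text): the cgroup mount reports cgroup2fs while v2 is still active, and tmpfs once the hierarchy has been switched back to v1.
stat -fc %T /sys/fs/cgroup/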
- v0
  - This document deploys only one HAProxy instance. In production, deploy at least two HAProxy hosts and configure KeepAlived for high availability.
  - The HAProxy stats page defaults to http://172.31.31.70:9090/ha-status; change the URI and port if needed.
  - The default stats credentials are admin / admin-inanu; change them as needed.
apt install -y haproxy
mv /etc/haproxy/haproxy.cfg{,.ori}
cat > /etc/haproxy/haproxy.cfg <<EOF
global
log /dev/log local0
log /dev/log local1 notice
chroot /var/lib/haproxy
stats socket /run/haproxy/admin.sock mode 660 level admin expose-fd listeners
stats timeout 30s
user haproxy
group haproxy
daemon
ca-base /etc/ssl/certs
crt-base /etc/ssl/private
ssl-default-bind-ciphers ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-CHACHA20-POLY1305:ECDHE-RSA-CHACHA20-POLY1305:DHE-RSA-AES128-GCM-SHA256:DHE-RSA-AES256-GCM-SHA384
ssl-default-bind-ciphersuites TLS_AES_128_GCM_SHA256:TLS_AES_256_GCM_SHA384:TLS_CHACHA20_POLY1305_SHA256
ssl-default-bind-options ssl-min-ver TLSv1.2 no-tls-tickets
defaults
log global
mode tcp
option httplog
option dontlognull
timeout connect 10s
timeout client 30s
timeout server 30s
frontend K8S-API-Server
bind 0.0.0.0:6443
option tcplog
mode tcp
default_backend K8S-API-Server
frontend K8S-Ingress-HTTP
bind 0.0.0.0:80
option tcplog
mode tcp
default_backend K8S-Ingress-HTTP
frontend K8S-Ingress-HTTPS
bind 0.0.0.0:443
option tcplog
mode tcp
default_backend K8S-Ingress-HTTPS
frontend HA-Admin
bind 0.0.0.0:9090
mode http
timeout client 5000
stats uri /ha-status
stats realm HAProxy\ Statistics
stats auth admin:admin-inanu ### CHANGE THIS!
#This allows you to take down and bring up back end servers.
#This will produce an error on older versions of HAProxy.
stats admin if TRUE
backend K8S-API-Server
mode tcp
balance roundrobin
option tcp-check
server api-server-1 172.31.31.71:6443 check fall 3 rise 2 maxconn 2000
server api-server-2 172.31.31.72:6443 check fall 3 rise 2 maxconn 2000
server api-server-3 172.31.31.73:6443 check fall 3 rise 2 maxconn 2000
backend K8S-Ingress-HTTP
mode tcp
balance roundrobin
option tcp-check
server ingress-1 172.31.31.74:80 check fall 3 rise 2 maxconn 2000
server ingress-2 172.31.31.75:80 check fall 3 rise 2 maxconn 2000
server ingress-3 172.31.31.76:80 check fall 3 rise 2 maxconn 2000
backend K8S-Ingress-HTTPS
mode tcp
balance roundrobin
option tcp-check
server ingress-1 172.31.31.74:443 check fall 3 rise 2 maxconn 2000
server ingress-2 172.31.31.75:443 check fall 3 rise 2 maxconn 2000
server ingress-3 172.31.31.76:443 check fall 3 rise 2 maxconn 2000
EOF
systemctl enable --now haproxy.service
Open http://172.31.31.70:9090/ha-status in a browser. All backends will show as down at this point, because the Kube API Server and Nginx Ingress have not been deployed yet.
- v0
- Download the Kubernetes 1.27.16 binary tarballs (sketch below), then unpack and distribute them
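A hedged fetch sketch, assuming the standard dl.k8s.io release layout and /usr/local/src as the working directory:
wget https://dl.k8s.io/v1.27.16/kubernetes-client-linux-amd64.tar.gz
wget https://dl.k8s.io/v1.27.16/kubernetes-node-linux-amd64.tar.gz
wget https://dl.k8s.io/v1.27.16/kubernetes-server-linux-amd64.tar.gz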
cd /usr/local/src
tar xzvf kubernetes-client-linux-amd64.tar.gz
tar xzvf kubernetes-node-linux-amd64.tar.gz
tar xzvf kubernetes-server-linux-amd64.tar.gz
chown -R root.root ./kubernetes
mv kubernetes kubernetes-1.27.16
# K8S components are placed under /opt/app/kubernetes-1.27.16 and symlinked to /opt/kubernetes
for I in {1..6};do
ssh root@v${I} "mkdir -p /opt/app"
rsync -avr /usr/local/src/kubernetes-1.27.16 root@v${I}:/opt/app/
ssh root@v${I} "ln -snf /opt/app/kubernetes-1.27.16 /opt/kubernetes"
done
- v0
cp /usr/local/src/kubernetes-1.27.16/client/bin/kubectl /usr/local/bin/
cp /usr/local/src/kubernetes-1.27.16/node/bin/kubeadm /usr/local/bin/
chmod +x /usr/local/bin/kube* && chown root:root /usr/local/bin/kube*
- v0 - v6
cat >> /root/.bashrc <<EOF
# kubectl autocompletion
source <(/opt/kubernetes/client/bin/kubectl completion bash)
EOF
kubectl completion bash > /etc/bash_completion.d/kubectl
source /root/.bashrc
- v0
mkdir -p /etc/kubernetes/pki
cd /etc/kubernetes/pki
cat > ca-config.json <<EOF
{
"signing": {
"default": {
"expiry": "876000h"
},
"profiles": {
"kubernetes": {
"usages": [
"signing",
"key encipherment",
"server auth",
"client auth"
],
"expiry": "876000h"
}
}
}
}
EOF
cat > ca-csr.json <<EOF
{
"CN": "kubernetes-ca",
"key": {
"algo": "rsa",
"size": 2048
},
"names": [
{
"C": "CN",
"ST": "Beijing",
"L": "Beijing",
"O": "Nanu-Network",
"OU": "K8S"
}
],
"ca": {
"expiry": "876000h"
}
}
EOF
cfssl gencert -initca ca-csr.json | cfssljson -bare ca
for I in {1..6};do
ssh root@v${I} "mkdir -p /etc/kubernetes/cert"
rsync -avr ./ca*.pem root@v${I}:/etc/kubernetes/cert/
done
- v0
  - Kubernetes uses the certificate's O (Organization) field as the user's group for authentication; system:masters is the built-in administrators group.
cat > admin-csr.json <<EOF
{
"CN": "admin",
"hosts": [],
"key": {
"algo": "rsa",
"size": 2048
},
"names": [
{
"C": "CN",
"ST": "Beijing",
"L": "Beijing",
"O": "system:masters",
"OU": "K8S"
}
]
}
EOF
cfssl gencert -ca=ca.pem \
-ca-key=ca-key.pem \
-config=ca-config.json \
-profile=kubernetes admin-csr.json | cfssljson -bare admin
for I in {1..6};do
rsync -avr ./admin*.pem root@v${I}:/root/
ssh root@v${I} "chmod 600 /root/admin*.pem"
done
- v0
kubectl config set-cluster kubernetes \
--certificate-authority=ca.pem \
--embed-certs=true \
--server=https://172.31.31.70:6443 \
--kubeconfig=admin.kubeconfig
kubectl config set-credentials admin \
--client-certificate=admin.pem \
--client-key=admin-key.pem \
--embed-certs=true \
--kubeconfig=admin.kubeconfig
kubectl config set-context kubernetes \
--cluster=kubernetes \
--user=admin \
--kubeconfig=admin.kubeconfig
kubectl config use-context kubernetes --kubeconfig=admin.kubeconfig
mkdir -p /root/.kube
cp ./admin.kubeconfig /root/.kube/config
chmod 700 /root/.kube && chmod 600 /root/.kube/config
- With the kubeconfig from the previous step, v0 can authenticate against the K8S API Server; distribute it to the other nodes:
for I in {1..6};do
ssh root@v${I} "mkdir -p /root/.kube"
scp ./admin.kubeconfig root@v${I}:/root/.kube/config
ssh root@v${I} "chmod 700 /root/.kube && chmod 600 /root/.kube/config"
done
- v0
  - Download etcd v3.5.15 (sketch below);
  - etcd data directory: /data/etcd/data;
  - etcd wal directory: /data/etcd/wal;
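A hedged download sketch, assuming the etcd GitHub release asset naming and /usr/local/src as the working directory:
wget https://github.com/etcd-io/etcd/releases/download/v3.5.15/etcd-v3.5.15-linux-amd64.tar.gz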
cd /usr/local/src
tar xzvf etcd-v3.5.15-linux-amd64.tar.gz
for I in {1..3};do
rsync -avr /usr/local/src/etcd-v3.5.15-linux-amd64 root@v${I}:/opt/app/
ssh root@v${I} "chown -R root.root /opt/app/etcd-v3.5.15-linux-amd64; ln -snf /opt/app/etcd-v3.5.15-linux-amd64 /opt/etcd"
done
cat > etcd-csr.json <<EOF
{
"CN": "etcd",
"hosts": [
"127.0.0.1",
"172.31.31.71",
"172.31.31.72",
"172.31.31.73",
"v1",
"v2",
"v3",
"v1.inanu.net",
"v2.inanu.net",
"v3.inanu.net"
],
"key": {
"algo": "rsa",
"size": 2048
},
"names": [
{
"C": "CN",
"ST": "Beijing",
"L": "Beijing",
"O": "Nanu-Network",
"OU": "K8S"
}
]
}
EOF
cfssl gencert -ca=ca.pem \
-ca-key=ca-key.pem \
-config=ca-config.json \
-profile=kubernetes etcd-csr.json | cfssljson -bare etcd
for I in {1..3};do
ssh root@v${I} "mkdir -p /etc/etcd/cert"
ssh root@v${I} "mkdir -p /data/etcd/{data,wal}"
rsync -avr etcd*.pem root@v${I}:/etc/etcd/cert/
done
- v1 - v3
  - Adjust the following flags on each node to match its own name and addresses:
    - --name=
    - --listen-peer-urls=
    - --initial-advertise-peer-urls=
    - --listen-client-urls=
    - --advertise-client-urls=
cat > /etc/systemd/system/etcd.service <<EOF
[Unit]
Description=etcd Server
After=network.target
After=network-online.target
Wants=network-online.target
Documentation=https://github.com/coreos
[Service]
Type=notify
WorkingDirectory=/data/etcd/data
ExecStart=/opt/etcd/etcd \\
--name=v1 \\
--data-dir=/data/etcd/data \\
--wal-dir=/data/etcd/wal \\
--snapshot-count=5000 \\
--cert-file=/etc/etcd/cert/etcd.pem \\
--key-file=/etc/etcd/cert/etcd-key.pem \\
--trusted-ca-file=/etc/kubernetes/cert/ca.pem \\
--peer-cert-file=/etc/etcd/cert/etcd.pem \\
--peer-key-file=/etc/etcd/cert/etcd-key.pem \\
--peer-trusted-ca-file=/etc/kubernetes/cert/ca.pem \\
--peer-client-cert-auth \\
--client-cert-auth \\
--listen-peer-urls=https://172.31.31.71:2380 \\
--initial-advertise-peer-urls=https://172.31.31.71:2380 \\
--listen-client-urls=https://172.31.31.71:2379,http://127.0.0.1:2379 \\
--advertise-client-urls=https://172.31.31.71:2379 \\
--initial-cluster-token=etcd-cluster-0 \\
--initial-cluster="v1=https://172.31.31.71:2380,v2=https://172.31.31.72:2380,v3=https://172.31.31.73:2380" \\
--initial-cluster-state=new \\
--auto-compaction-mode=periodic \\
--auto-compaction-retention=1 \\
--max-snapshots=5 \\
--max-wals=5 \\
--max-txn-ops=512 \\
--max-request-bytes=33554432 \\
--quota-backend-bytes=6442450944 \\
--heartbeat-interval=250 \\
--election-timeout=2000
Restart=on-failure
RestartSec=5
LimitNOFILE=65536
[Install]
WantedBy=multi-user.target
EOF
- v0
for I in {1..3};do
ssh root@v${I} "systemctl daemon-reload && systemctl enable --now etcd.service"
done
- v1 or v2 or v3
for I in {1..3};do
etcdctl \
--endpoints=https://v${I}:2379 \
--cacert=/etc/kubernetes/cert/ca.pem \
--cert=/etc/etcd/cert/etcd.pem \
--key=/etc/etcd/cert/etcd-key.pem endpoint health
done
etcdctl \
-w table --cacert=/etc/kubernetes/cert/ca.pem \
--cert=/etc/etcd/cert/etcd.pem \
--key=/etc/etcd/cert/etcd-key.pem \
--endpoints=https://172.31.31.71:2379,https://172.31.31.72:2379,https://172.31.31.73:2379 endpoint status
- v1 or v2 or v3
cat > /usr/local/bin/backup_etcd.sh <<'EOF'
#!/bin/bash
BACKUP_DIR="/data/backup/etcd"
BACKUP_FILE="etcd-snapshot-$(date +%Y%m%d-%H%M).db"
ENDPOINTS="http://127.0.0.1:2379"
#CACERT="/etc/ssl/etcd/ssl/ca.pem"
#CERT="/etc/ssl/etcd/ssl/node-master1.pem"
#KEY="/etc/ssl/etcd/ssl/node-master1-key.pem"
if [ ! -d ${BACKUP_DIR} ];then
mkdir -p ${BACKUP_DIR}
fi
#etcdctl \
# --cacert="${CACERT}" --cert="${CERT}" --key="${KEY}" \
# --endpoints="${ENDPOINTS}" \
# snapshot save ${BACKUP_DIR}/${BACKUP_FILE}
/opt/etcd/etcdctl --endpoints="${ENDPOINTS}" \
snapshot save ${BACKUP_DIR}/${BACKUP_FILE}
cd ${BACKUP_DIR}
tar czf ./${BACKUP_FILE}.tar.gz ./${BACKUP_FILE}
rm -f ./${BACKUP_FILE}
# Keep 7 days backup
find ${BACKUP_DIR}/ -name "*.gz" -mtime +7 -exec rm -f {} \;
EOF
chmod +x /usr/local/bin/backup_etcd.sh
crontab -e
# Backup etcd
0 4 * * * /usr/local/bin/backup_etcd.sh > /dev/null 2>&1
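If you prefer not to open the crontab interactively, the same entry can be appended non-interactively, for example:
(crontab -l 2>/dev/null; echo '0 4 * * * /usr/local/bin/backup_etcd.sh > /dev/null 2>&1') | crontab -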
- v0
cat > kubernetes-csr.json <<EOF
{
"CN": "kubernetes-master",
"hosts": [
"10.254.0.1",
"127.0.0.1",
"172.31.31.70",
"172.31.31.71",
"172.31.31.72",
"172.31.31.73",
"v0",
"v1",
"v2",
"v3",
"v0.inanu.net",
"v1.inanu.net",
"v2.inanu.net",
"v3.inanu.net",
"kubernetes",
"kubernetes.default",
"kubernetes.default.svc",
"kubernetes.default.svc.cluster",
"kubernetes.default.svc.cluster.local.",
"k8s.inanu.net."
],
"key": {
"algo": "rsa",
"size": 2048
},
"names": [
{
"C": "CN",
"ST": "Beijing",
"L": "Beijing",
"O": "Nanu-Network",
"OU": "K8S"
}
]
}
EOF
cfssl gencert -ca=ca.pem \
-ca-key=ca-key.pem \
-config=ca-config.json \
-profile=kubernetes kubernetes-csr.json | cfssljson -bare kubernetes
for I in {1..3};do
ssh root@v${I} "mkdir -p /etc/kubernetes/cert"
rsync -avr kubernetes*.pem root@v${I}:/etc/kubernetes/cert/
done
ENCRYPTION_KEY=$(head -c 32 /dev/urandom | base64)
cat > encryption-config.yaml <<EOF
kind: EncryptionConfig
apiVersion: v1
resources:
- resources:
- secrets
providers:
- aescbc:
keys:
- name: key1
secret: ${ENCRYPTION_KEY}
- identity: {}
EOF
for I in {1..3};do
rsync -avr encryption-config.yaml root@v${I}:/etc/kubernetes/
done
- v0
cat > audit-policy.yaml <<EOF
apiVersion: audit.k8s.io/v1
kind: Policy
rules:
# The following requests were manually identified as high-volume and low-risk, so drop them.
- level: None
resources:
- group: ""
resources:
- endpoints
- services
- services/status
users:
- 'system:kube-proxy'
verbs:
- watch
- level: None
resources:
- group: ""
resources:
- nodes
- nodes/status
userGroups:
- 'system:nodes'
verbs:
- get
- level: None
namespaces:
- kube-system
resources:
- group: ""
resources:
- endpoints
users:
- 'system:kube-controller-manager'
- 'system:kube-scheduler'
- 'system:serviceaccount:kube-system:endpoint-controller'
verbs:
- get
- update
- level: None
resources:
- group: ""
resources:
- namespaces
- namespaces/status
- namespaces/finalize
users:
- 'system:apiserver'
verbs:
- get
# Don't log HPA fetching metrics.
- level: None
resources:
- group: metrics.k8s.io
users:
- 'system:kube-controller-manager'
verbs:
- get
- list
# Don't log these read-only URLs.
- level: None
nonResourceURLs:
- '/healthz*'
- /version
- '/swagger*'
# Don't log events requests.
- level: None
resources:
- group: ""
resources:
- events
# node and pod status calls from nodes are high-volume and can be large, don't log responses
# for expected updates from nodes
- level: Request
omitStages:
- RequestReceived
resources:
- group: ""
resources:
- nodes/status
- pods/status
users:
- kubelet
- 'system:node-problem-detector'
- 'system:serviceaccount:kube-system:node-problem-detector'
verbs:
- update
- patch
- level: Request
omitStages:
- RequestReceived
resources:
- group: ""
resources:
- nodes/status
- pods/status
userGroups:
- 'system:nodes'
verbs:
- update
- patch
# deletecollection calls can be large, don't log responses for expected namespace deletions
- level: Request
omitStages:
- RequestReceived
users:
- 'system:serviceaccount:kube-system:namespace-controller'
verbs:
- deletecollection
# Secrets, ConfigMaps, and TokenReviews can contain sensitive & binary data,
# so only log at the Metadata level.
- level: Metadata
omitStages:
- RequestReceived
resources:
- group: ""
resources:
- secrets
- configmaps
- group: authentication.k8s.io
resources:
- tokenreviews
# Get responses can be large; skip them.
- level: Request
omitStages:
- RequestReceived
resources:
- group: ""
- group: admissionregistration.k8s.io
- group: apiextensions.k8s.io
- group: apiregistration.k8s.io
- group: apps
- group: authentication.k8s.io
- group: authorization.k8s.io
- group: autoscaling
- group: batch
- group: certificates.k8s.io
- group: extensions
- group: metrics.k8s.io
- group: networking.k8s.io
- group: policy
- group: rbac.authorization.k8s.io
- group: scheduling.k8s.io
- group: settings.k8s.io
- group: storage.k8s.io
verbs:
- get
- list
- watch
# Default level for known APIs
- level: RequestResponse
omitStages:
- RequestReceived
resources:
- group: ""
- group: admissionregistration.k8s.io
- group: apiextensions.k8s.io
- group: apiregistration.k8s.io
- group: apps
- group: authentication.k8s.io
- group: authorization.k8s.io
- group: autoscaling
- group: batch
- group: certificates.k8s.io
- group: extensions
- group: metrics.k8s.io
- group: networking.k8s.io
- group: policy
- group: rbac.authorization.k8s.io
- group: scheduling.k8s.io
- group: settings.k8s.io
- group: storage.k8s.io
# Default level for all other requests.
- level: Metadata
omitStages:
- RequestReceived
EOF
for I in {1..3};do
rsync -avr audit-policy.yaml root@v${I}:/etc/kubernetes/
done
- v0
cat > proxy-client-csr.json <<EOF
{
"CN": "aggregator",
"hosts": [],
"key": {
"algo": "rsa",
"size": 2048
},
"names": [
{
"C": "CN",
"ST": "Beijing",
"L": "Beijing",
"O": "Nanu-Network",
"OU": "K8S"
}
]
}
EOF
cfssl gencert -ca=ca.pem \
-ca-key=ca-key.pem \
-config=ca-config.json \
-profile=kubernetes proxy-client-csr.json | cfssljson -bare proxy-client
for I in {1..3};do
rsync -avr proxy-client*.pem root@v${I}:/etc/kubernetes/cert/
done
- v0
cat > service-account-csr.json <<EOF
{
"CN": "service-accounts",
"key": {
"algo": "rsa",
"size": 2048
},
"names": [
{
"C": "CN",
"L": "Beijing",
"O": "Nanu-Network",
"OU": "K8S"
}
]
}
EOF
cfssl gencert \
-ca=ca.pem \
-ca-key=ca-key.pem \
-config=ca-config.json \
-profile=kubernetes \
service-account-csr.json | cfssljson -bare service-account
for I in {1..3};do
rsync -avr service-account*.pem root@v${I}:/etc/kubernetes/cert/
done
- v0
- Kube API Server working directory: /var/lib/kube-apiserver
for I in {1..3};do
ssh root@v${I} "mkdir -p /var/lib/kube-apiserver"
done
- v1 - v3
  - Adjust the following flags as needed:
    - --apiserver-count (change if you run more than 3 Masters);
    - --etcd-servers (change if your etcd nodes differ from this document);
    - --service-account-issuer (the K8S API Server load-balancer address);
    - --service-cluster-ip-range (the Service cluster CIDR);
cat > /etc/systemd/system/kube-apiserver.service <<EOF
[Unit]
Description=Kubernetes API Server
Documentation=https://github.com/GoogleCloudPlatform/kubernetes
After=network.target
[Service]
WorkingDirectory=/var/lib/kube-apiserver
ExecStart=/opt/kubernetes/server/bin/kube-apiserver \\
--advertise-address=0.0.0.0 \\
--allow-privileged=true \\
--anonymous-auth=false \\
--apiserver-count=3 \\
--audit-log-compress \\
--audit-log-format="json" \\
--audit-log-maxage=30 \\
--audit-log-maxbackup=50 \\
--audit-log-maxsize=10 \\
--audit-log-mode="blocking" \\
--audit-log-path="/var/lib/kube-apiserver/audit.log" \\
--audit-log-truncate-enabled \\
--audit-policy-file="/etc/kubernetes/audit-policy.yaml" \\
--authorization-mode="Node,RBAC" \\
--bind-address="0.0.0.0" \\
--client-ca-file="/etc/kubernetes/cert/ca.pem" \\
--default-not-ready-toleration-seconds=300 \\
--default-unreachable-toleration-seconds=300 \\
--default-watch-cache-size=200 \\
--delete-collection-workers=4 \\
--enable-admission-plugins=NodeRestriction \\
--enable-aggregator-routing \\
--enable-bootstrap-token-auth \\
--encryption-provider-config="/etc/kubernetes/encryption-config.yaml" \\
--etcd-cafile="/etc/kubernetes/cert/ca.pem" \\
--etcd-certfile="/etc/kubernetes/cert/kubernetes.pem" \\
--etcd-keyfile="/etc/kubernetes/cert/kubernetes-key.pem" \\
--etcd-servers="https://172.31.31.71:2379,https://172.31.31.72:2379,https://172.31.31.73:2379" \\
--event-ttl=168h \\
--goaway-chance=.001 \\
--http2-max-streams-per-connection=42 \\
--kubelet-certificate-authority="/etc/kubernetes/cert/ca.pem" \\
--kubelet-client-certificate="/etc/kubernetes/cert/kubernetes.pem" \\
--kubelet-client-key="/etc/kubernetes/cert/kubernetes-key.pem" \\
--kubelet-timeout=10s \\
--lease-reuse-duration-seconds=120 \\
--max-mutating-requests-inflight=2000 \\
--max-requests-inflight=4000 \\
--profiling \\
--proxy-client-cert-file="/etc/kubernetes/cert/proxy-client.pem" \\
--proxy-client-key-file="/etc/kubernetes/cert/proxy-client-key.pem" \\
--requestheader-allowed-names="aggregator" \\
--requestheader-client-ca-file="/etc/kubernetes/cert/ca.pem" \\
--requestheader-extra-headers-prefix="X-Remote-Extra-" \\
--requestheader-group-headers="X-Remote-Group" \\
--requestheader-username-headers="X-Remote-User" \\
--runtime-config='api/all=true' \\
--secure-port=6443 \\
--service-account-extend-token-expiration=true \\
--service-account-issuer="https://172.31.31.70:6443" \\
--service-account-key-file="/etc/kubernetes/cert/service-account.pem" \\
--service-account-signing-key-file="/etc/kubernetes/cert/service-account-key.pem" \\
--service-cluster-ip-range="10.254.0.0/16" \\
--service-node-port-range=10001-65535 \\
--tls-cert-file="/etc/kubernetes/cert/kubernetes.pem" \\
--tls-private-key-file="/etc/kubernetes/cert/kubernetes-key.pem" \\
--v=2
Restart=on-failure
RestartSec=10
Type=notify
LimitNOFILE=65536
[Install]
WantedBy=multi-user.target
EOF
- v0
for I in {1..3};do
ssh root@v${I} "systemctl daemon-reload && systemctl enable --now kube-apiserver.service"
done
- v0
kubectl cluster-info
kubectl cluster-info dump
kubectl get all -A
- v0
cat > kube-controller-manager-csr.json <<EOF
{
"CN": "system:kube-controller-manager",
"key": {
"algo": "rsa",
"size": 2048
},
"hosts": [
"127.0.0.1",
"172.31.31.71",
"172.31.31.72",
"172.31.31.73",
"v1",
"v2",
"v3",
"v1.inanu.net",
"v2.inanu.net",
"v3.inanu.net"
],
"names": [
{
"C": "CN",
"ST": "Beijing",
"L": "Beijing",
"O": "system:kube-controller-manager",
"OU": "K8S"
}
]
}
EOF
cfssl gencert -ca=ca.pem \
-ca-key=ca-key.pem \
-config=ca-config.json \
-profile=kubernetes kube-controller-manager-csr.json | cfssljson -bare kube-controller-manager
- v0
for I in {1..3};do
rsync -avr kube-controller-manager*.pem root@v${I}:/etc/kubernetes/cert/
done
- v0
kubectl config set-cluster kubernetes \
--certificate-authority=ca.pem \
--embed-certs=true \
--server="https://172.31.31.70:6443" \
--kubeconfig=kube-controller-manager.kubeconfig
kubectl config set-credentials system:kube-controller-manager \
--client-certificate=kube-controller-manager.pem \
--client-key=kube-controller-manager-key.pem \
--embed-certs=true \
--kubeconfig=kube-controller-manager.kubeconfig
kubectl config set-context system:kube-controller-manager \
--cluster=kubernetes \
--user=system:kube-controller-manager \
--kubeconfig=kube-controller-manager.kubeconfig
kubectl config use-context system:kube-controller-manager --kubeconfig=kube-controller-manager.kubeconfig
- v0
for I in {1..3};do
rsync -avr kube-controller-manager.kubeconfig root@v${I}:/etc/kubernetes/
done
- v0
- kube-controller-manager working directory: /var/lib/kube-controller-manager
for I in {1..3};do
ssh root@v${I} "mkdir -p /var/lib/kube-controller-manager"
done
- v1 - v3
cat > /etc/systemd/system/kube-controller-manager.service <<EOF
[Unit]
Description=Kubernetes Controller Manager
Documentation=https://github.com/GoogleCloudPlatform/kubernetes
[Service]
WorkingDirectory=/var/lib/kube-controller-manager
ExecStart=/opt/kubernetes/server/bin/kube-controller-manager \\
--authentication-kubeconfig="/etc/kubernetes/kube-controller-manager.kubeconfig" \\
--authorization-kubeconfig="/etc/kubernetes/kube-controller-manager.kubeconfig" \\
--bind-address=0.0.0.0 \\
--client-ca-file="/etc/kubernetes/cert/ca.pem" \\
--cluster-name="kubernetes" \\
--cluster-signing-cert-file="/etc/kubernetes/cert/ca.pem" \\
--cluster-signing-duration=876000h \\
--cluster-signing-key-file="/etc/kubernetes/cert/ca-key.pem" \\
--concurrent-deployment-syncs=10 \\
--concurrent-endpoint-syncs=10 \\
--concurrent-gc-syncs=30 \\
--concurrent-namespace-syncs=10 \\
--concurrent-rc-syncs=10 \\
--concurrent-replicaset-syncs=10 \\
--concurrent-resource-quota-syncs=10 \\
--concurrent-service-endpoint-syncs=10 \\
--concurrent-service-syncs=2 \\
--concurrent-serviceaccount-token-syncs=10 \\
--concurrent-statefulset-syncs=10 \\
--concurrent-ttl-after-finished-syncs=10 \\
--contention-profiling \\
--controllers=*,bootstrapsigner,tokencleaner \\
--horizontal-pod-autoscaler-sync-period=10s \\
--http2-max-streams-per-connection=42 \\
--kube-api-burst=2000 \\
--kube-api-qps=1000 \\
--kubeconfig="/etc/kubernetes/kube-controller-manager.kubeconfig" \\
--leader-elect \\
--mirroring-concurrent-service-endpoint-syncs=10 \\
--profiling \\
--requestheader-allowed-names="aggregator" \\
--requestheader-client-ca-file="/etc/kubernetes/cert/ca.pem" \\
--requestheader-extra-headers-prefix="X-Remote-Exra-" \\
--requestheader-group-headers="X-Remote-Group" \\
--requestheader-username-headers="X-Remote-User" \\
--root-ca-file="/etc/kubernetes/cert/ca.pem" \\
--secure-port=10252 \\
--service-account-private-key-file="/etc/kubernetes/cert/service-account-key.pem" \\
--service-cluster-ip-range="10.254.0.0/16" \\
--tls-cert-file="/etc/kubernetes/cert/kube-controller-manager.pem" \\
--tls-private-key-file="/etc/kubernetes/cert/kube-controller-manager-key.pem" \\
--use-service-account-credentials=true \\
--v=2
Restart=on-failure
RestartSec=5
[Install]
WantedBy=multi-user.target
EOF
- v0
for I in {1..3};do
ssh root@v${I} "systemctl daemon-reload && systemctl enable --now kube-controller-manager"
done
curl -s --cacert ca.pem --cert admin.pem --key admin-key.pem https://172.31.31.71:10252/metrics
curl -s --cacert ca.pem --cert admin.pem --key admin-key.pem https://172.31.31.72:10252/metrics
curl -s --cacert ca.pem --cert admin.pem --key admin-key.pem https://172.31.31.73:10252/metrics
journalctl -u kube-controller-manager.service --no-pager | grep -i 'became leader'
- v0
cat > kube-scheduler-csr.json <<EOF
{
"CN": "system:kube-scheduler",
"hosts": [
"127.0.0.1",
"172.31.31.71",
"172.31.31.72",
"172.31.31.73",
"v1",
"v2",
"v3",
"v1.inanu.net",
"v2.inanu.net",
"v3.inanu.net"
],
"key": {
"algo": "rsa",
"size": 2048
},
"names": [
{
"C": "CN",
"ST": "Beijing",
"L": "Beijing",
"O": "system:kube-scheduler",
"OU": "K8S"
}
]
}
EOF
cfssl gencert -ca=ca.pem \
-ca-key=ca-key.pem \
-config=ca-config.json \
-profile=kubernetes kube-scheduler-csr.json | cfssljson -bare kube-scheduler
for I in {1..3};do
rsync -avr kube-scheduler*.pem root@v${I}:/etc/kubernetes/cert/
done
kubectl config set-cluster kubernetes \
--certificate-authority=ca.pem \
--embed-certs=true \
--server="https://172.31.31.70:6443" \
--kubeconfig=kube-scheduler.kubeconfig
kubectl config set-credentials system:kube-scheduler \
--client-certificate=kube-scheduler.pem \
--client-key=kube-scheduler-key.pem \
--embed-certs=true \
--kubeconfig=kube-scheduler.kubeconfig
kubectl config set-context system:kube-scheduler \
--cluster=kubernetes \
--user=system:kube-scheduler \
--kubeconfig=kube-scheduler.kubeconfig
kubectl config use-context system:kube-scheduler --kubeconfig=kube-scheduler.kubeconfig
for I in {1..3};do
rsync -avr kube-scheduler.kubeconfig root@v${I}:/etc/kubernetes/
done
- v1 - v3
cat > /etc/kubernetes/kube-scheduler.yaml <<EOF
apiVersion: kubescheduler.config.k8s.io/v1beta3
kind: KubeSchedulerConfiguration
clientConnection:
burst: 200
kubeconfig: "/etc/kubernetes/kube-scheduler.kubeconfig"
qps: 100
enableContentionProfiling: false
enableProfiling: true
leaderElection:
leaderElect: true
EOF
- v0
- kube-scheduler working directory: /var/lib/kube-scheduler
for I in {1..3};do
ssh root@v${I} "mkdir -p /var/lib/kube-scheduler"
done
- v1 - v3
cat > /etc/systemd/system/kube-scheduler.service <<EOF
[Unit]
Description=Kubernetes Scheduler
Documentation=https://github.com/GoogleCloudPlatform/kubernetes
[Service]
WorkingDirectory=/var/lib/kube-scheduler
ExecStart=/opt/kubernetes/server/bin/kube-scheduler \\
--authentication-kubeconfig="/etc/kubernetes/kube-scheduler.kubeconfig" \\
--authorization-kubeconfig="/etc/kubernetes/kube-scheduler.kubeconfig" \\
--bind-address=0.0.0.0 \\
--client-ca-file="/etc/kubernetes/cert/ca.pem" \\
--config="/etc/kubernetes/kube-scheduler.yaml" \\
--http2-max-streams-per-connection=42 \\
--leader-elect=true \\
--requestheader-allowed-names="" \\
--requestheader-client-ca-file="/etc/kubernetes/cert/ca.pem" \\
--requestheader-extra-headers-prefix="X-Remote-Extra-" \\
--requestheader-group-headers="X-Remote-Group" \\
--requestheader-username-headers="X-Remote-User" \\
--secure-port=10259 \\
--tls-cert-file="/etc/kubernetes/cert/kube-scheduler.pem" \\
--tls-private-key-file="/etc/kubernetes/cert/kube-scheduler-key.pem" \\
--v=2
Restart=always
RestartSec=5
StartLimitInterval=0
[Install]
WantedBy=multi-user.target
EOF
- v0
for I in {1..3};do
ssh root@v${I} "systemctl daemon-reload && systemctl enable --now kube-scheduler"
done
curl -s --cacert ca.pem --cert admin.pem --key admin-key.pem https://172.31.31.71:10259/metrics
curl -s --cacert ca.pem --cert admin.pem --key admin-key.pem https://172.31.31.72:10259/metrics
curl -s --cacert ca.pem --cert admin.pem --key admin-key.pem https://172.31.31.73:10259/metrics
journalctl -u kube-scheduler.service --no-pager | grep -i 'leader'
- v0
  - Download the CRI-O v1.27.8 static bundle (sketch below), then install it on every node:
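A hedged download sketch, assuming the CRI-O project's usual static-bundle artifact location (verify against the CRI-O v1.27.8 release notes):
wget -O /usr/local/src/cri-o-v1.27.8.tar.gz https://storage.googleapis.com/cri-o/artifacts/cri-o.amd64.v1.27.8.tar.gz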
cd /usr/local/src
tar xzvf cri-o-v1.27.8.tar.gz
for I in {1..6};do
rsync -avr /usr/local/src/cri-o root@v${I}:/usr/local/src/
ssh root@v${I} "cd /usr/local/src/cri-o && ./install && rm -rf /usr/local/src/cri-o && mkdir -p /etc/containers"
done
- v1 - v6
cat > /etc/containers/registries.conf <<EOF
unqualified-search-registries = ["docker.io"]
EOF
- v0
for I in {1..6};do
ssh root@v${I} "systemctl daemon-reload && systemctl enable --now crio.service"
done
- v0
for I in {1..6};do
ssh root@v${I} "crictl info"
done
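The remaining steps in this subsection cover the optional Containerd runtime. A hedged sketch of the downloads the commands below expect, run on v0 and assuming each project's usual GitHub release asset names:
cd /usr/local/src
wget https://github.com/containernetworking/plugins/releases/download/v1.4.0/cni-plugins-linux-amd64-v1.4.0.tgz
wget https://github.com/containerd/containerd/releases/download/v1.7.20/containerd-1.7.20-linux-amd64.tar.gz
wget https://github.com/kubernetes-sigs/cri-tools/releases/download/v1.30.0/crictl-v1.30.0-linux-amd64.tar.gz
wget https://github.com/opencontainers/runc/releases/download/v1.1.13/runc.amd64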
mkdir /usr/local/src/cni-plugins
tar xzvf cni-plugins-linux-amd64-v1.4.0.tgz -C ./cni-plugins
mkdir /usr/local/src/containerd
tar xzvf containerd-1.7.20-linux-amd64.tar.gz -C ./containerd
tar xzvf crictl-v1.30.0-linux-amd64.tar.gz
for I in {1..6};do
ssh root@v${I} "mkdir -p /opt/app/cni-plugins-linux-amd64-v1.4.0/bin && ln -snf /opt/app/cni-plugins-linux-amd64-v1.4.0 /opt/cni"
rsync -avr /usr/local/src/cni-plugins/ root@v${I}:/opt/cni/bin/
ssh root@v${I} "chown -R root.root /opt/cni/bin/* && chmod +x /opt/cni/bin/*"
ssh root@v${I} "mkdir -p /opt/app/containerd-1.7.20-linux-amd64/bin && ln -snf /opt/app/containerd-1.7.20-linux-amd64 /opt/containerd"
rsync -avr /usr/local/src/containerd/bin/ root@v${I}:/opt/containerd/bin/
ssh root@v${I} "chown -R root.root /opt/containerd/bin/* && chmod +x /opt/containerd/bin/*"
ssh root@v${I} "mkdir -p /opt/app/crictl-v1.30.0-linux-amd64/bin && ln -snf /opt/app/crictl-v1.30.0-linux-amd64 /opt/crictl"
rsync -avr /usr/local/src/crictl root@v${I}:/opt/crictl/bin/
ssh root@v${I} "chown -R root.root /opt/crictl/bin/* && chmod +x /opt/crictl/bin/*"
ssh root@v${I} "mkdir -p /opt/app/runc-v1.1.13-linux-amd64/sbin && ln -snf /opt/app/runc-v1.1.13-linux-amd64 /opt/runc"
rsync -avr /usr/local/src/runc.amd64 root@v${I}:/opt/runc/sbin/
ssh root@v${I} "chown -R root.root /opt/runc/sbin/* && chmod +x /opt/runc/sbin/* && mv /opt/runc/sbin/runc.amd64 /opt/runc/sbin/runc"
ssh root@v${I} "mkdir -p /etc/containerd/ && mkdir -p /etc/cni/net.d && mkdir -p /data/containerd"
done
- v1 - v6
cat > /etc/containerd/config.toml <<EOF
disabled_plugins = []
imports = []
oom_score = 0
plugin_dir = ""
required_plugins = []
root = "/data/containerd"
state = "/run/containerd"
version = 2
[cgroup]
path = ""
[debug]
address = ""
format = ""
gid = 0
level = ""
uid = 0
[grpc]
address = "/run/containerd/containerd.sock"
gid = 0
max_recv_message_size = 16777216
max_send_message_size = 16777216
tcp_address = ""
tcp_tls_cert = ""
tcp_tls_key = ""
uid = 0
[metrics]
address = ""
grpc_histogram = false
[plugins]
[plugins."io.containerd.gc.v1.scheduler"]
deletion_threshold = 0
mutation_threshold = 100
pause_threshold = 0.02
schedule_delay = "0s"
startup_delay = "100ms"
[plugins."io.containerd.grpc.v1.cri"]
disable_apparmor = false
disable_cgroup = false
disable_hugetlb_controller = true
disable_proc_mount = false
disable_tcp_service = true
enable_selinux = false
enable_tls_streaming = false
ignore_image_defined_volumes = false
max_concurrent_downloads = 3
max_container_log_line_size = 16384
netns_mounts_under_state_dir = false
restrict_oom_score_adj = false
sandbox_image = "k8s.gcr.io/pause:3.5"
selinux_category_range = 1024
stats_collect_period = 10
stream_idle_timeout = "4h0m0s"
stream_server_address = "127.0.0.1"
stream_server_port = "0"
systemd_cgroup = false
tolerate_missing_hugetlb_controller = true
unset_seccomp_profile = ""
[plugins."io.containerd.grpc.v1.cri".cni]
bin_dir = "/opt/cni/bin"
conf_dir = "/etc/cni/net.d"
conf_template = ""
max_conf_num = 1
[plugins."io.containerd.grpc.v1.cri".containerd]
default_runtime_name = "runc"
disable_snapshot_annotations = true
discard_unpacked_layers = false
no_pivot = false
snapshotter = "overlayfs"
[plugins."io.containerd.grpc.v1.cri".containerd.default_runtime]
base_runtime_spec = ""
container_annotations = []
pod_annotations = []
privileged_without_host_devices = false
runtime_engine = "/opt/runc/sbin/runc"
runtime_root = ""
runtime_type = "io.containerd.runtime.v1.linux"
[plugins."io.containerd.grpc.v1.cri".containerd.default_runtime.options]
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes]
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc]
base_runtime_spec = ""
container_annotations = []
pod_annotations = []
privileged_without_host_devices = false
runtime_engine = ""
runtime_root = ""
runtime_type = "io.containerd.runc.v2"
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
BinaryName = ""
CriuImagePath = ""
CriuPath = ""
CriuWorkPath = ""
IoGid = 0
IoUid = 0
NoNewKeyring = false
NoPivotRoot = false
Root = ""
ShimCgroup = ""
SystemdCgroup = true
[plugins."io.containerd.grpc.v1.cri".containerd.untrusted_workload_runtime]
base_runtime_spec = ""
container_annotations = []
pod_annotations = []
privileged_without_host_devices = false
runtime_engine = ""
runtime_root = ""
runtime_type = ""
[plugins."io.containerd.grpc.v1.cri".containerd.untrusted_workload_runtime.options]
[plugins."io.containerd.grpc.v1.cri".image_decryption]
key_model = "node"
[plugins."io.containerd.grpc.v1.cri".registry]
config_path = ""
[plugins."io.containerd.grpc.v1.cri".registry.auths]
[plugins."io.containerd.grpc.v1.cri".registry.configs]
[plugins."io.containerd.grpc.v1.cri".registry.headers]
[plugins."io.containerd.grpc.v1.cri".registry.mirrors]
[plugins."io.containerd.grpc.v1.cri".x509_key_pair_streaming]
tls_cert_file = ""
tls_key_file = ""
[plugins."io.containerd.internal.v1.opt"]
path = "/opt/containerd"
[plugins."io.containerd.internal.v1.restart"]
interval = "10s"
[plugins."io.containerd.metadata.v1.bolt"]
content_sharing_policy = "shared"
[plugins."io.containerd.monitor.v1.cgroups"]
no_prometheus = false
[plugins."io.containerd.runtime.v1.linux"]
no_shim = false
runtime = "runc"
runtime_root = ""
shim = "containerd-shim"
shim_debug = false
[plugins."io.containerd.runtime.v2.task"]
platforms = ["linux/amd64"]
[plugins."io.containerd.service.v1.diff-service"]
default = ["walking"]
[plugins."io.containerd.snapshotter.v1.aufs"]
root_path = ""
[plugins."io.containerd.snapshotter.v1.btrfs"]
root_path = ""
[plugins."io.containerd.snapshotter.v1.devmapper"]
async_remove = false
base_image_size = ""
pool_name = ""
root_path = ""
[plugins."io.containerd.snapshotter.v1.native"]
root_path = ""
[plugins."io.containerd.snapshotter.v1.overlayfs"]
root_path = ""
[plugins."io.containerd.snapshotter.v1.zfs"]
root_path = ""
[proxy_plugins]
[stream_processors]
[stream_processors."io.containerd.ocicrypt.decoder.v1.tar"]
accepts = ["application/vnd.oci.image.layer.v1.tar+encrypted"]
args = ["--decryption-keys-path", "/etc/containerd/ocicrypt/keys"]
env = ["OCICRYPT_KEYPROVIDER_CONFIG=/etc/containerd/ocicrypt/ocicrypt_keyprovider.conf"]
path = "ctd-decoder"
returns = "application/vnd.oci.image.layer.v1.tar"
[stream_processors."io.containerd.ocicrypt.decoder.v1.tar.gzip"]
accepts = ["application/vnd.oci.image.layer.v1.tar+gzip+encrypted"]
args = ["--decryption-keys-path", "/etc/containerd/ocicrypt/keys"]
env = ["OCICRYPT_KEYPROVIDER_CONFIG=/etc/containerd/ocicrypt/ocicrypt_keyprovider.conf"]
path = "ctd-decoder"
returns = "application/vnd.oci.image.layer.v1.tar+gzip"
[timeouts]
"io.containerd.timeout.shim.cleanup" = "5s"
"io.containerd.timeout.shim.load" = "5s"
"io.containerd.timeout.shim.shutdown" = "3s"
"io.containerd.timeout.task.state" = "2s"
[ttrpc]
address = ""
gid = 0
uid = 0
EOF
- v1 - v6
cat > /etc/systemd/system/containerd.service <<EOF
[Unit]
Description=containerd container runtime
Documentation=https://containerd.io
After=network.target
[Service]
Environment="PATH=/bin:/usr/bin:/sbin:/usr/sbin:/usr/local/bin:/usr/local/sbin:/opt/kubernetes/client/bin:/opt/kubernetes/server/bin:/opt/kubernetes/node/bin:/opt/cni/bin:/opt/runc/sbin:/opt/containerd/bin:/opt/crictl/bin:/opt/etcd"
ExecStartPre=/sbin/modprobe overlay
ExecStart=/opt/containerd/bin/containerd
Restart=always
RestartSec=5
Delegate=yes
KillMode=process
OOMScoreAdjust=-999
LimitNOFILE=1048576
LimitNPROC=infinity
LimitCORE=infinity
[Install]
WantedBy=multi-user.target
EOF
- v0
for I in {1..6};do
ssh root@v${I} "systemctl daemon-reload && systemctl enable --now containerd"
done
- v1 - v6
cat > /etc/crictl.yaml <<EOF
runtime-endpoint: unix:///run/containerd/containerd.sock
image-endpoint: unix:///run/containerd/containerd.sock
timeout: 10
debug: false
EOF
- v0
for I in {1..6};do
ssh root@v${I} "crictl info"
done
- v0
for I in {1..6};do
kubeadm token create \
--description kubelet-bootstrap-token \
--groups system:bootstrappers:v${I} \
--kubeconfig /root/.kube/config
done
- v0
kubeadm token list
kubectl -n kube-system get secret | grep 'bootstrap-token'
- v0
for I in {1..6};do
kubectl config set-cluster kubernetes \
--certificate-authority=ca.pem \
--embed-certs=true \
--server=https://172.31.31.70:6443 \
--kubeconfig=./kubelet-bootstrap-v${I}.kubeconfig
BS_TOKEN=$(kubeadm token list --kubeconfig /root/.kube/config | grep "bootstrappers:v${I}" | awk '{print $1}')
kubectl config set-credentials kubelet-bootstrap \
--token=${BS_TOKEN} \
--kubeconfig=./kubelet-bootstrap-v${I}.kubeconfig
kubectl config set-context default \
--cluster=kubernetes \
--user=kubelet-bootstrap \
--kubeconfig=./kubelet-bootstrap-v${I}.kubeconfig
kubectl config use-context default \
--kubeconfig=./kubelet-bootstrap-v${I}.kubeconfig
scp ./kubelet-bootstrap-v${I}.kubeconfig root@v${I}:/etc/kubernetes/kubelet-bootstrap.kubeconfig
done
- v1 - v6
  - Adjust the following settings to your environment:
    - podCIDR
    - clusterDomain
    - clusterDNS
cat > /etc/kubernetes/kubelet-config.yaml <<EOF
kind: KubeletConfiguration
apiVersion: kubelet.config.k8s.io/v1beta1
address: "0.0.0.0"
staticPodPath: ""
syncFrequency: 1m
fileCheckFrequency: 20s
httpCheckFrequency: 20s
staticPodURL: ""
port: 10250
readOnlyPort: 0
rotateCertificates: true
serverTLSBootstrap: true
authentication:
anonymous:
enabled: false
webhook:
enabled: true
x509:
clientCAFile: "/etc/kubernetes/cert/ca.pem"
authorization:
mode: Webhook
registryPullQPS: 0
registryBurst: 20
eventRecordQPS: 0
eventBurst: 20
enableDebuggingHandlers: true
enableContentionProfiling: true
healthzPort: 10248
healthzBindAddress: "0.0.0.0"
clusterDomain: "k8s.inanu.net"
clusterDNS:
- "10.254.0.53"
nodeStatusUpdateFrequency: 10s
nodeStatusReportFrequency: 1m
imageMinimumGCAge: 2m
imageGCHighThresholdPercent: 85
imageGCLowThresholdPercent: 80
volumeStatsAggPeriod: 1m
kubeletCgroups: ""
systemCgroups: ""
cgroupRoot: ""
cgroupsPerQOS: true
cgroupDriver: systemd
runtimeRequestTimeout: 10m
hairpinMode: promiscuous-bridge
maxPods: 220
podCIDR: "172.20.0.0/16"
podPidsLimit: -1
resolvConf: /etc/resolv.conf
maxOpenFiles: 1000000
kubeAPIQPS: 1000
kubeAPIBurst: 2000
serializeImagePulls: false
evictionHard:
memory.available: "100Mi"
nodefs.available: "10%"
nodefs.inodesFree: "5%"
imagefs.available: "15%"
evictionSoft: {}
enableControllerAttachDetach: true
failSwapOn: true
containerLogMaxSize: 20Mi
containerLogMaxFiles: 10
systemReserved: {}
kubeReserved: {}
systemReservedCgroup: ""
kubeReservedCgroup: ""
enforceNodeAllocatable: ["pods"]
EOF
- v0
  - kubelet working directory: /var/lib/kubelet
for I in {1..6};do
ssh root@v${I} "mkdir -p /var/lib/kubelet/kubelet-plugins/volume/exec"
done
- v1 - v6
  - If using the Containerd CRI, change the unit below as follows:
    - After=containerd.service
    - Requires=containerd.service
    - Environment="PATH=/bin:/usr/bin:/sbin:/usr/sbin:/usr/local/bin:/usr/local/sbin:/opt/kubernetes/client/bin:/opt/kubernetes/server/bin:/opt/kubernetes/node/bin:/opt/cni/bin:/opt/runc/sbin:/opt/containerd/bin:/opt/crictl/bin:/opt/etcd"
    - --container-runtime-endpoint="unix:///run/containerd/containerd.sock"
cat > /etc/systemd/system/kubelet.service <<EOF
[Unit]
Description=Kubernetes Kubelet
Documentation=https://github.com/GoogleCloudPlatform/kubernetes
After=crio.service
Requires=crio.service
[Service]
Environment="PATH=export PATH="/bin:/usr/bin:/sbin:/usr/sbin:/usr/local/bin:/usr/local/sbin:/opt/kubernetes/client/bin:/opt/kubernetes/server/bin:/opt/kubernetes/node/bin:/opt/cni/bin:/opt/etcd""
WorkingDirectory=/var/lib/kubelet
ExecStart=/opt/kubernetes/node/bin/kubelet \\
--bootstrap-kubeconfig="/etc/kubernetes/kubelet-bootstrap.kubeconfig" \\
--cert-dir="/etc/kubernetes/cert" \\
--config="/etc/kubernetes/kubelet-config.yaml" \\
--container-runtime-endpoint="unix:///var/run/crio/crio.sock" \\
--kubeconfig="/etc/kubernetes/kubelet.kubeconfig" \\
--root-dir="/var/lib/kubelet" \\
--volume-plugin-dir="/var/lib/kubelet/kubelet-plugins/volume/exec/" \\
--v=2
Restart=always
RestartSec=5
StartLimitInterval=0
[Install]
WantedBy=multi-user.target
EOF
- v0
kubectl create clusterrolebinding kube-apiserver:kubelet-apis \
--clusterrole=system:kubelet-api-admin \
--user kubernetes-master
- v0
kubectl create clusterrolebinding kubelet-bootstrap \
--clusterrole=system:node-bootstrapper \
--group=system:bootstrappers
cat > csr-crb.yaml <<EOF
# Approve all CSRs for the group "system:bootstrappers"
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
name: auto-approve-csrs-for-group
subjects:
- kind: Group
name: system:bootstrappers
apiGroup: rbac.authorization.k8s.io
roleRef:
kind: ClusterRole
name: system:certificates.k8s.io:certificatesigningrequests:nodeclient
apiGroup: rbac.authorization.k8s.io
---
# To let a node of the group "system:nodes" renew its own credentials
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
name: node-client-cert-renewal
subjects:
- kind: Group
name: system:nodes
apiGroup: rbac.authorization.k8s.io
roleRef:
kind: ClusterRole
name: system:certificates.k8s.io:certificatesigningrequests:selfnodeclient
apiGroup: rbac.authorization.k8s.io
---
# A ClusterRole which instructs the CSR approver to approve a node requesting a
# serving cert matching its client cert.
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
name: approve-node-server-renewal-csr
rules:
- apiGroups: ["certificates.k8s.io"]
resources: ["certificatesigningrequests/selfnodeserver"]
verbs: ["create"]
---
# To let a node of the group "system:nodes" renew its own server credentials
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
name: node-server-cert-renewal
subjects:
- kind: Group
name: system:nodes
apiGroup: rbac.authorization.k8s.io
roleRef:
kind: ClusterRole
name: approve-node-server-renewal-csr
apiGroup: rbac.authorization.k8s.io
EOF
kubectl apply -f ./csr-crb.yaml
- v0
for I in {1..6};do
ssh root@v${I} "systemctl daemon-reload && systemctl enable --now kubelet"
done
- v0
kubectl get csr
kubectl get csr | grep Pending | awk '{print $1}' | xargs kubectl certificate approve
kubectl get csr
- Using only the CA certificate
curl -s --cacert /etc/kubernetes/cert/ca.pem https://172.31.31.74:10250/metrics
Unauthorized
- Using the CA certificate and an arbitrary bearer token
curl -s --cacert /etc/kubernetes/cert/ca.pem \
-H "Authorization: Bearer 123456" \
https://172.31.31.74:10250/metrics
Unauthorized
- Using the kubelet client certificate
curl -s --cacert /etc/kubernetes/cert/ca.pem \
--cert /etc/kubernetes/cert/kubelet-client-current.pem \
--key /etc/kubernetes/cert/kubelet-client-current.pem \
https://172.31.31.74:10250/metrics
Forbidden
- Using the admin user certificate
curl -s --cacert /etc/kubernetes/cert/ca.pem \
  --cert admin.pem \
  --key admin-key.pem \
  https://172.31.31.74:10250/metrics
- v0
kubectl create sa kubelet-api-test
kubectl create clusterrolebinding kubelet-api-test \
--clusterrole=system:kubelet-api-admin \
--serviceaccount=default:kubelet-api-test
SECRET=$(kubectl get secrets | grep kubelet-api-test | awk '{print $1}')
TOKEN=$(kubectl describe secret ${SECRET} | grep -E '^token' | awk '{print $2}')
echo ${TOKEN}
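If the Secret lookup above comes back empty (on Kubernetes 1.24+ a ServiceAccount no longer gets a long-lived token Secret created automatically), a short-lived token can be requested instead, for example:
TOKEN=$(kubectl create token kubelet-api-test)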
curl -s --cacert /etc/kubernetes/cert/ca.pem \
-H "Authorization: Bearer ${TOKEN}" \
https://172.31.31.74:10250/metrics
kubectl delete sa kubelet-api-test
kubectl delete clusterrolebindings.rbac.authorization.k8s.io kubelet-api-test
cat > kube-proxy-csr.json <<EOF
{
"CN": "system:kube-proxy",
"key": {
"algo": "rsa",
"size": 2048
},
"names": [
{
"C": "CN",
"ST": "Beijing",
"L": "Beijing",
"O": "Nanu-Network",
"OU": "K8S"
}
]
}
EOF
cfssl gencert -ca=ca.pem \
-ca-key=ca-key.pem \
-config=ca-config.json \
-profile=kubernetes kube-proxy-csr.json | cfssljson -bare kube-proxy
- v0
for I in {1..6};do
rsync -avr ./kube-proxy*.pem root@v${I}:/etc/kubernetes/cert/
done
- v0
kubectl config set-cluster kubernetes \
--certificate-authority=ca.pem \
--embed-certs=true \
--server=https://172.31.31.70:6443 \
--kubeconfig=kube-proxy.kubeconfig
kubectl config set-credentials kube-proxy \
--client-certificate=kube-proxy.pem \
--client-key=kube-proxy-key.pem \
--embed-certs=true \
--kubeconfig=kube-proxy.kubeconfig
kubectl config set-context default \
--cluster=kubernetes \
--user=kube-proxy \
--kubeconfig=kube-proxy.kubeconfig
kubectl config use-context default --kubeconfig=kube-proxy.kubeconfig
- v0
for I in {1..6};do
rsync -avr ./kube-proxy.kubeconfig root@v${I}:/etc/kubernetes/
done
- v1 - v6
cat > /etc/kubernetes/kube-proxy-config.yaml <<EOF
kind: KubeProxyConfiguration
apiVersion: kubeproxy.config.k8s.io/v1alpha1
clientConnection:
burst: 200
kubeconfig: "/etc/kubernetes/kube-proxy.kubeconfig"
qps: 100
bindAddress: 0.0.0.0
healthzBindAddress: 0.0.0.0:10256
metricsBindAddress: 0.0.0.0:10249
enableProfiling: true
clusterCIDR: 172.20.0.0/16
mode: "ipvs"
portRange: ""
iptables:
masqueradeAll: false
ipvs:
scheduler: rr
excludeCIDRs: []
EOF
- v0
  - kube-proxy working directory: /var/lib/kube-proxy
for I in {1..6};do
ssh root@v${I} "mkdir -p /var/lib/kube-proxy"
done
- v1 - v6
cat > /etc/systemd/system/kube-proxy.service <<EOF
[Unit]
Description=Kubernetes Kube-Proxy Server
Documentation=https://github.com/GoogleCloudPlatform/kubernetes
After=network.target
[Service]
Environment="PATH=/bin:/usr/bin:/sbin:/usr/sbin:/usr/local/bin:/usr/local/sbin:/opt/kubernetes/client/bin:/opt/kubernetes/server/bin:/opt/kubernetes/node/bin:/opt/cni/bin:/opt/etcd"
WorkingDirectory=/var/lib/kube-proxy
ExecStart=/opt/kubernetes/node/bin/kube-proxy \\
--config=/etc/kubernetes/kube-proxy-config.yaml \\
--v=2
Restart=on-failure
RestartSec=5
LimitNOFILE=65536
[Install]
WantedBy=multi-user.target
EOF
- v0
for I in {1..6};do
ssh root@v${I} "systemctl daemon-reload && systemctl enable --now kube-proxy"
done
- v1 - v6
ipvsadm -ln
- v0
for I in {1..3};do
kubectl label node v${I} node-role.kubernetes.io/master=""
kubectl taint nodes v${I} node-role.kubernetes.io/master=:NoSchedule
done
kubectl get nodes
- v0
wget https://github.com/projectcalico/calico/releases/download/v3.28.0/release-v3.28.0.tgz
tar xzvf release-v3.28.0.tgz
docker load --input ./release-v3.28.0/images/calico-cni.tar
docker load --input ./release-v3.28.0/images/calico-kube-controllers.tar
docker load --input ./release-v3.28.0/images/calico-node.tar
vi ./release-v3.28.0/manifests/calico.yaml
- This document does not enable IPIP (CALICO_IPV4POOL_IPIP is set to Never); set it to Always if your nodes span multiple L2 networks.
- Edit the following settings:
- name: CALICO_IPV4POOL_CIDR
value: "172.20.0.0/16"
- name: IP_AUTODETECTION_METHOD
value: "cidr=172.31.31.0/24"
# IPIP mode (set to Always to enable)
- name: CALICO_IPV4POOL_IPIP
value: "Never"
kubectl apply -f ./release-v3.28.0/manifests/calico.yaml
kubectl get pods -n kube-system -o wide -w
kubectl describe pods -n kube-system calico-kube-controllers-
kubectl describe pods -n kube-system calico-node-
- v0
cp ./release-v3.28.0/bin/calicoctl/calicoctl-linux-amd64 /usr/local/bin/kubectl-calico
chmod +x /usr/local/bin/kubectl-calico
kubectl calico get node -o wide
kubectl calico get ipPool -o wide
- v0
for I in {4..6};do
kubectl label nodes v${I} node-role.kubernetes.io/node=""
done
kubectl get nodes
- v0
cat > nginx-test.yaml <<EOF
apiVersion: v1
kind: Service
metadata:
name: nginx-test
labels:
app: nginx-test
spec:
type: NodePort
selector:
app: nginx-test
ports:
- name: http
port: 80
targetPort: 80
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
name: nginx-test
labels:
addonmanager.kubernetes.io/mode: Reconcile
spec:
selector:
matchLabels:
app: nginx-test
template:
metadata:
labels:
app: nginx-test
spec:
containers:
- name: nanu-nginx
image: nginx
ports:
- containerPort: 80
EOF
kubectl apply -f ./nginx-test.yaml
kubectl get pods -o wide -l app=nginx-test
ping Pod-IP
kubectl get svc -l app=nginx-test
curl -s http://CLUSTER-IP
kubectl delete -f ./nginx-test.yaml
- v0
wget https://get.helm.sh/helm-v3.15.2-linux-amd64.tar.gz
tar xzvf helm-v3.15.2-linux-amd64.tar.gz
chown -R root.root ./linux-amd64
cp -rp ./linux-amd64/helm /usr/local/bin/
chmod +x /usr/local/bin/helm
https://artifacthub.io/packages/search?kind=0
helm repo add stable https://charts.helm.sh/stable
- v0
helm repo add coredns https://coredns.github.io/helm
helm repo update
- Note the CoreDNS Service IP; it must match the clusterDNS value in kubelet-config.yaml
helm --namespace=kube-system install coredns coredns/coredns --set service.clusterIP="10.254.0.53"
kubectl get pods -n kube-system -o wide -w
helm status coredns -n kube-system
- v0
kubectl get all -n kube-system -l k8s-app=kube-dns
cat > nginx-test-coredns.yaml <<EOF
apiVersion: apps/v1
kind: Deployment
metadata:
name: nginx-test-coredns
spec:
replicas: 3
selector:
matchLabels:
run: nginx-test-coredns
template:
metadata:
labels:
run: nginx-test-coredns
spec:
containers:
- name: nginx-test-coredns
image: nginx
ports:
- containerPort: 80
EOF
kubectl apply -f ./nginx-test-coredns.yaml
kubectl get pods -o wide
kubectl expose deploy nginx-test-coredns
kubectl get svc nginx-test-coredns -o wide
cat > dnsutils-check.yml <<EOF
apiVersion: v1
kind: Service
metadata:
name: dnsutils-check
labels:
app: dnsutils-check
spec:
type: NodePort
selector:
app: dnsutils-check
ports:
- name: http
port: 80
targetPort: 80
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
name: dnsutils-check
labels:
addonmanager.kubernetes.io/mode: Reconcile
spec:
selector:
matchLabels:
app: dnsutils-check
template:
metadata:
labels:
app: dnsutils-check
spec:
containers:
- name: dnsutils
image: tutum/dnsutils:latest
command:
- sleep
- "3600"
ports:
- containerPort: 80
EOF
kubectl apply -f ./dnsutils-check.yml
kubectl get pods -lapp=dnsutils-check -o wide -w
kubectl exec dnsutils-check-XXX -- cat /etc/resolv.conf
kubectl exec dnsutils-check-XXX -- nslookup dnsutils-check
kubectl exec dnsutils-check-XXX -- nslookup nginx-test-coredns
kubectl exec dnsutils-check-XXX -- nslookup kubernetes
kubectl exec dnsutils-check-XXX -- nslookup www.baidu.com
kubectl delete svc nginx-test-coredns
kubectl delete -f ./dnsutils-check.yml
kubectl delete -f ./nginx-test-coredns.yaml
- v0
cat > ./cluster-readonly-clusterrole.yaml <<EOF
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
name: cluster-readonly
rules:
- apiGroups:
- ""
resources:
- configmaps
- endpoints
- persistentvolumeclaims
- pods
- replicationcontrollers
- replicationcontrollers/scale
- serviceaccounts
- services
- nodes
- persistentvolumeclaims
- persistentvolumes
- bindings
- events
- limitranges
- namespaces/status
- pods/log
- pods/status
- replicationcontrollers/status
- resourcequotas
- resourcequotas/status
- namespaces
verbs:
- get
- list
- watch
- apiGroups:
- apps
resources:
- daemonsets
- deployments
- deployments/scale
- replicasets
- replicasets/scale
- statefulsets
verbs:
- get
- list
- watch
- apiGroups:
- autoscaling
resources:
- horizontalpodautoscalers
verbs:
- get
- list
- watch
- apiGroups:
- batch
resources:
- cronjobs
- jobs
verbs:
- get
- list
- watch
- apiGroups:
- extensions
resources:
- daemonsets
- deployments
- deployments/scale
- ingresses
- networkpolicies
- replicasets
- replicasets/scale
- replicationcontrollers/scale
verbs:
- get
- list
- watch
- apiGroups:
- policy
resources:
- poddisruptionbudgets
verbs:
- get
- list
- watch
- apiGroups:
- networking.k8s.io
resources:
- networkpolicies
- ingresses
- ingressclasses
verbs:
- get
- list
- watch
- apiGroups:
- storage.k8s.io
resources:
- storageclasses
- volumeattachments
verbs:
- get
- list
- watch
- apiGroups:
- rbac.authorization.k8s.io
resources:
- clusterrolebindings
- clusterroles
- roles
- rolebindings
verbs:
- get
- list
- watch
EOF
kubectl apply -f ./cluster-readonly-clusterrole.yaml
cat > ./cluster-readonly-clusterrolebinding.yaml <<EOF
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
name: cluster-readonly
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: cluster-readonly
subjects:
- kind: User
name: cluster-ro
EOF
kubectl apply -f ./cluster-readonly-clusterrolebinding.yaml
- v0
cat > cluster-ro-csr.json <<EOF
{
"CN": "cluster-ro",
"hosts": [],
"key": {
"algo": "rsa",
"size": 2048
},
"names": [
{
"C": "CN",
"ST": "Beijing",
"L": "Beijing",
"O": "Nanu-Network",
"OU": "K8S"
}
]
}
EOF
cfssl gencert -ca=ca.pem \
-ca-key=ca-key.pem \
-config=ca-config.json \
-profile=kubernetes cluster-ro-csr.json | cfssljson -bare cluster-ro
- v0
kubectl config set-cluster kubernetes \
--certificate-authority=ca.pem \
--embed-certs=true \
--server=https://172.31.31.70:6443 \
--kubeconfig=cluster-ro.kubeconfig
kubectl config set-credentials cluster-ro \
--client-certificate=cluster-ro.pem \
--client-key=cluster-ro-key.pem \
--embed-certs=true \
--kubeconfig=cluster-ro.kubeconfig
kubectl config set-context kubernetes \
--cluster=kubernetes \
--user=cluster-ro \
--kubeconfig=cluster-ro.kubeconfig
kubectl config use-context kubernetes --kubeconfig=cluster-ro.kubeconfig
- v0
  - Single-instance deployment:
wget -O metrics-server.yaml https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
kubectl apply -f ./metrics-server.yaml
- High-availability deployment
wget -O metrics-server-ha.yaml https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/high-availability.yaml
kubectl apply -f ./metrics-server-ha.yaml
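- Verify (either deployment mode; the label and namespace below are those used by the upstream manifest). If the pod logs show kubelet certificate verification errors, adding --kubelet-insecure-tls to the container args is a possible workaround:
kubectl get pods -n kube-system -l k8s-app=metrics-server -o wide -w
kubectl top nodes
kubectl top pods -A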
- v0
for I in {4..6};do
kubectl label nodes v${I} node-role.kubernetes.io/ingress="true"
done
wget -O ingress-nginx-1.10.3.yaml https://raw.githubusercontent.com/kubernetes/ingress-nginx/controller-v1.10.3/deploy/static/provider/baremetal/deploy.yaml
- v0
kubectl create secret tls inanu.net \
--cert=./inanu.net.crt \
--key=./inanu.net.key
vi ingress-nginx-1.10.3.yaml
- Comment out the NodePort Service (kind: Service, type: NodePort); for performance, hostNetwork is used instead
- Configure the Deployment:
args:
- --default-ssl-certificate=default/inanu.net
...
nodeSelector:
kubernetes.io/os: linux
node-role.kubernetes.io/ingress: "true"
hostNetwork: true
kubectl apply -f ./ingress-nginx-1.10.3.yaml
kubectl get pods -n ingress-nginx -o wide -w
ipvsadm -Ln
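- Optional smoke test against one of the ingress nodes (hostNetwork): an unknown host should return 404 from the default backend, and the controller's health endpoint listens on its default status port 10254:
curl -k -H "Host: nonexistent.inanu.net" https://v4.inanu.net/
curl http://v4.inanu.net:10254/healthz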
- v0
- Download and extract the Ceph-CSI 3.11.0 release (e.g. the GitHub archive for tag v3.11.0, which unpacks to ceph-csi-3.11.0/)
wget -O ceph-csi-3.11.0.tar.gz https://github.com/ceph/ceph-csi/archive/refs/tags/v3.11.0.tar.gz
tar xzvf ceph-csi-3.11.0.tar.gz
cd ./ceph-csi-3.11.0/deploy/cephfs/kubernetes
kubectl apply -f csi-provisioner-rbac.yaml
kubectl apply -f csi-nodeplugin-rbac.yaml
- Configure the Ceph cluster information
vi csi-config-map.yaml
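- The ConfigMap (named ceph-csi-config in the upstream manifest) carries the Ceph cluster ID (fsid) and monitor addresses; roughly, with placeholder values:
  config.json: |-
    [
      {
        "clusterID": "<ceph-cluster-fsid>",
        "monitors": [
          "<mon1-ip>:6789",
          "<mon2-ip>:6789",
          "<mon3-ip>:6789"
        ]
      }
    ]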
kubectl apply -f csi-config-map.yaml
kubectl apply -f csi-cephfsplugin-provisioner.yaml
kubectl get pods -o wide -w
kubectl create -f csi-cephfsplugin.yaml
kubectl get all
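- To actually provision CephFS volumes, a StorageClass (plus a Secret with Ceph credentials, see examples/cephfs in the ceph-csi tree) is still required; a sketch with placeholder values, the name sc-cephfs being arbitrary:
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: sc-cephfs
provisioner: cephfs.csi.ceph.com
parameters:
  clusterID: <ceph-cluster-fsid>
  fsName: <cephfs-filesystem-name>
  csi.storage.k8s.io/provisioner-secret-name: csi-cephfs-secret
  csi.storage.k8s.io/provisioner-secret-namespace: default
  csi.storage.k8s.io/controller-expand-secret-name: csi-cephfs-secret
  csi.storage.k8s.io/controller-expand-secret-namespace: default
  csi.storage.k8s.io/node-stage-secret-name: csi-cephfs-secret
  csi.storage.k8s.io/node-stage-secret-namespace: default
reclaimPolicy: Delete
allowVolumeExpansion: true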
- v0
git clone https://github.com/kubernetes-sigs/nfs-subdir-external-provisioner.git
cd nfs-subdir-external-provisioner/deploy
kubectl apply -f ./rbac.yaml
- v0
- An NFS server must already be deployed with the shared directory exported;
vi ./deployment.yaml
- Configure the NFS server address (the NFS_SERVER env)
- Configure the NFS export path (the NFS_PATH env)
- Configure the NFS volume:
volumes:
- name: nfs-client-root
nfs:
server: NFS_SERVER_IP
path: NFS_PATH
kubectl apply -f ./deployment.yaml
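- The provisioner should come up shortly (label as used by the upstream manifest):
kubectl get pods -l app=nfs-client-provisioner -o wide -w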
- v0
vi ./class.yaml
metadata:
name: SC_NAME
parameters:
onDelete: "retain"
pathPattern: "${.PVC.namespace}/${.PVC.name}"
kubectl apply -f ./class.yaml
kubectl get sc
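- For reference, the edited class.yaml ends up roughly as below; sc-nfs matches the storageClassName the Elasticsearch StatefulSet uses later in this guide, and the provisioner must equal the PROVISIONER_NAME env set in deployment.yaml:
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: sc-nfs
provisioner: k8s-sigs.io/nfs-subdir-external-provisioner
parameters:
  onDelete: "retain"
  pathPattern: "${.PVC.namespace}/${.PVC.name}"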
- v0
vi ./test-claim.yaml
spec:
storageClassName: SC_NAME
kubectl apply -f ./test-claim.yaml
kubectl get pvc
kubectl get pv
kubectl apply -f ./test-pod.yaml
- Check that a SUCCESS file has been created in the NFS export directory
kubectl delete -f ./test-pod.yaml
kubectl delete -f ./test-claim.yaml
- v0
kubectl create namespace kubernetes-dashboard
kubectl create secret tls kubernetes-dashboard-certs \
--cert=./inanu.net.crt \
--key=./inanu.net.key \
-n kubernetes-dashboard
- v0
wget https://raw.githubusercontent.com/kubernetes/dashboard/v2.7.0/aio/deploy/recommended.yaml
mv recommended.yaml dashboard-recommended.yaml
vi ./dashboard-recommended.yaml
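# Comment out the auto-created kubernetes-dashboard-certs Secret (the TLS secret was already created manually above):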
#apiVersion: v1
#kind: Secret
#metadata:
# labels:
# k8s-app: kubernetes-dashboard
# name: kubernetes-dashboard-certs
# namespace: kubernetes-dashboard
#type: Opaque
# Add Start Command
command:
- /dashboard
args:
- --auto-generate-certificates
- --namespace=kubernetes-dashboard
# Add TLS Config
- --token-ttl=3600
- --bind-address=0.0.0.0
- --tls-cert-file=tls.crt
- --tls-key-file=tls.key
kubectl apply -f ./dashboard-recommended.yaml
kubectl get pods -n kubernetes-dashboard -o wide -w
kubectl create sa dashboard-admin -n kube-system
- v0
kubectl create clusterrolebinding \
dashboard-admin \
--clusterrole=cluster-admin \
--serviceaccount=kube-system:dashboard-admin
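- Since Kubernetes 1.24 a ServiceAccount no longer gets a token Secret automatically, so the grep below would find nothing; create one first (the Secret name here is arbitrary, only the annotation must match the ServiceAccount):
kubectl apply -f - <<EOF
apiVersion: v1
kind: Secret
metadata:
  name: dashboard-admin-token
  namespace: kube-system
  annotations:
    kubernetes.io/service-account.name: dashboard-admin
type: kubernetes.io/service-account-token
EOF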
ADMIN_SECRET=$(kubectl get secrets -n kube-system | grep dashboard-admin | awk '{print $1}')
DASHBOARD_LOGIN_TOKEN=$(kubectl describe secret -n kube-system ${ADMIN_SECRET} | grep -E '^token' | awk '{print $2}')
kubectl config set-cluster kubernetes \
--certificate-authority=ca.pem \
--embed-certs=true \
--server=https://172.31.31.70:6443 \
--kubeconfig=dashboard.kubeconfig
kubectl config set-credentials dashboard_user \
--token=${DASHBOARD_LOGIN_TOKEN} \
--kubeconfig=dashboard.kubeconfig
kubectl config set-context default \
--cluster=kubernetes \
--user=dashboard_user \
--kubeconfig=dashboard.kubeconfig
kubectl config use-context default --kubeconfig=dashboard.kubeconfig
- v0
kubectl create sa dashboard-ro -n kube-system
kubectl create clusterrolebinding \
dashboard-ro \
--clusterrole=cluster-readonly \
--serviceaccount=kube-system:dashboard-ro
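- As above, create a token Secret for the dashboard-ro ServiceAccount first (the Secret name is arbitrary):
kubectl apply -f - <<EOF
apiVersion: v1
kind: Secret
metadata:
  name: dashboard-ro-token
  namespace: kube-system
  annotations:
    kubernetes.io/service-account.name: dashboard-ro
type: kubernetes.io/service-account-token
EOF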
RO_SECRET=$(kubectl get secrets -n kube-system | grep dashboard-ro | awk '{print $1}')
RO_LOGIN_TOKEN=$(kubectl describe secret -n kube-system ${RO_SECRET} | grep -E '^token' | awk '{print $2}')
kubectl config set-cluster kubernetes \
--certificate-authority=ca.pem \
--embed-certs=true \
--server=https://172.31.31.70:6443 \
--kubeconfig=dashboard-ro.kubeconfig
kubectl config set-credentials dashboard_ro \
--token=${RO_LOGIN_TOKEN} \
--kubeconfig=dashboard-ro.kubeconfig
kubectl config set-context default \
--cluster=kubernetes \
--user=dashboard_ro \
--kubeconfig=dashboard-ro.kubeconfig
kubectl config use-context default --kubeconfig=dashboard-ro.kubeconfig
- v0
kubectl port-forward -n kubernetes-dashboard svc/kubernetes-dashboard 4443:443 --address 0.0.0.0
https://172.31.31.70:4443/
- v0
- Domain: k8s.inanu.net
cat > ./ingress-dashboard.yaml <<EOF
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: k8s-dashboard
namespace: kubernetes-dashboard
annotations:
nginx.ingress.kubernetes.io/use-regex: "true"
nginx.ingress.kubernetes.io/rewrite-target: /
nginx.ingress.kubernetes.io/ssl-redirect: "true"
nginx.ingress.kubernetes.io/backend-protocol: "HTTPS"
spec:
ingressClassName: nginx
rules:
- host: k8s.inanu.net
http:
paths:
- path: /
pathType: Prefix
backend:
service:
name: kubernetes-dashboard
port:
number: 443
tls:
- secretName: kubernetes-dashboard-certs
hosts:
- k8s.inanu.net
EOF
kubectl delete -A ValidatingWebhookConfiguration ingress-nginx-admission # remove the admission webhook if it rejects the Ingress with "failed calling webhook" errors
kubectl apply -f ./ingress-dashboard.yaml
https://v0.inanu.net:30443
- v0
- Check version compatibility against the kube-prometheus compatibility matrix
wget -O kube-prometheus-release-0.10.zip https://github.com/prometheus-operator/kube-prometheus/archive/refs/heads/release-0.10.zip
unzip kube-prometheus-release-0.10.zip
cd kube-prometheus-release-0.10
- Replace the image registry with a mirror (optional)
sed -i -e 's_quay.io_quay.mirrors.ustc.edu.cn_' manifests/*.yaml manifests/setup/*.yaml
- Patch the API version: policy/v1beta1 (PodDisruptionBudget) was removed in Kubernetes 1.25
sed -i -e 's_policy/v1beta1_policy/v1_' manifests/*.yaml manifests/setup/*.yaml
kubectl apply --server-side -f manifests/setup
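- Optionally wait for the CRDs registered by manifests/setup to become Established before applying the rest (as suggested in the kube-prometheus README):
kubectl wait --for condition=Established --all CustomResourceDefinition --namespace=monitoring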
kubectl apply -f manifests/
kubectl get all -n monitoring
- v0
kubectl port-forward -n monitoring svc/prometheus-k8s 9091:9090 --address 0.0.0.0
http://v0.inanu.net:9091
- v0
cat > ./ingress-prometheus.yaml <<EOF
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: k8s-prometheus
namespace: monitoring
annotations:
nginx.ingress.kubernetes.io/use-regex: "true"
nginx.ingress.kubernetes.io/rewrite-target: /
#nginx.ingress.kubernetes.io/ssl-redirect: "true"
nginx.ingress.kubernetes.io/backend-protocol: "HTTP"
spec:
ingressClassName: nginx
rules:
- host: k8s-prometheus.inanu.net
http:
paths:
- path: /
pathType: Prefix
backend:
service:
name: prometheus-k8s
port:
number: 9090
tls:
#- secretName: inanu.net
# hosts:
# - k8s-prometheus.inanu.net
EOF
kubectl apply -f ./ingress-prometheus.yaml
http://k8s-prometheus.inanu.net
- v0
kubectl port-forward -n monitoring svc/grafana 3000:3000 --address 0.0.0.0
http://v0.inanu.net:3000
- v0
cat > ingress-grafana.yaml <<EOF
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: k8s-grafana
namespace: monitoring
annotations:
nginx.ingress.kubernetes.io/use-regex: "true"
nginx.ingress.kubernetes.io/rewrite-target: /
#nginx.ingress.kubernetes.io/ssl-redirect: "true"
nginx.ingress.kubernetes.io/backend-protocol: "HTTP"
spec:
ingressClassName: nginx
rules:
- host: k8s-grafana.inanu.net
http:
paths:
- path: /
pathType: Prefix
backend:
service:
name: grafana
port:
number: 3000
tls:
#- secretName: inanu.net
# hosts:
# - k8s-grafana.inanu.net
EOF
kubectl apply -f ./ingress-grafana.yaml
http://k8s-grafana.inanu.net
- v0
kubectl port-forward -n monitoring svc/alertmanager-main 9093:9093 --address 0.0.0.0
http://v0.inanu.net:9093
- v0
cat > ingress-alertmanager.yaml <<EOF
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: k8s-alertmanager
namespace: monitoring
annotations:
nginx.ingress.kubernetes.io/use-regex: "true"
nginx.ingress.kubernetes.io/rewrite-target: /
#nginx.ingress.kubernetes.io/ssl-redirect: "true"
nginx.ingress.kubernetes.io/backend-protocol: "HTTP"
spec:
ingressClassName: nginx
rules:
- host: k8s-alertmanager.inanu.net
http:
paths:
- path: /
pathType: Prefix
backend:
service:
name: alertmanager-main
port:
number: 9093
tls:
#- secretName: inanu.net
# hosts:
# - k8s-grafana.inanu.net
EOF
kubectl apply -f ./ingress-alertmanager.yaml
http://k8s-alertmanager.inanu.net
- v0
kubectl create ns logging
- v0
- Configure NFS squash so that all NFS clients map to root, otherwise Elasticsearch may fail to write to its NFS persistent volumes due to permission errors;
- Pay attention to the following settings:
- name: discovery.zen.minimum_master_nodes = quorum = master_nodes/2 + 1
- name: cluster.initial_master_nodes
- name: ES_JAVA_OPTS
- storageClassName
- storage
cat > elasticsearch-sts.yaml <<EOF
kind: Service
apiVersion: v1
metadata:
name: elasticsearch
namespace: logging
labels:
app: elasticsearch
spec:
selector:
app: elasticsearch
clusterIP: None
ports:
- port: 9200
name: rest-api
- port: 9300
name: node-comm
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
name: es
namespace: logging
spec:
serviceName: elasticsearch
replicas: 3
selector:
matchLabels:
app: elasticsearch
template:
metadata:
labels:
app: elasticsearch
spec:
initContainers:
- name: increase-vm-max-map
image: busybox
command:
- "sysctl"
- "-w"
- "vm.max_map_count=262144"
securityContext:
privileged: true
- name: increase-fd-ulimit
image: busybox
command:
- "sh"
- "-c"
- "ulimit -n 65536"
securityContext:
privileged: true
containers:
- name: elasticsearch
image: docker.elastic.co/elasticsearch/elasticsearch:7.16.2
securityContext:
capabilities:
add:
- "SYS_CHROOT"
resources:
limits:
cpu: 1000m
requests:
cpu: 100m
ports:
- containerPort: 9200
name: rest-api
protocol: TCP
- containerPort: 9300
name: node-comm
protocol: TCP
volumeMounts:
- name: elasticsearch-data
mountPath: /usr/share/elasticsearch/data
env:
- name: cluster.name
value: k8s-logs
- name: node.name
valueFrom:
fieldRef:
fieldPath: metadata.name
- name: cluster.initial_master_nodes
value: "es-0,es-1,es-2"
- name: discovery.zen.minimum_master_nodes
value: "2"
- name: discovery.seed_hosts
value: "elasticsearch"
- name: ES_JAVA_OPTS
value: "-Xms256m -Xmx256m"
- name: network.host
value: "0.0.0.0"
volumeClaimTemplates:
- metadata:
name: elasticsearch-data
labels:
app: elasticsearch
spec:
storageClassName: sc-nfs
accessModes:
- "ReadWriteOnce"
resources:
requests:
storage: 5Gi
EOF
kubectl apply -f ./elasticsearch-sts.yaml
kubectl get all -n logging
kubectl get pods -n logging -o wide
- v0
kubectl port-forward es-0 9200:9200 --namespace=logging
curl http://localhost:9200/_cluster/state?pretty
- v0
for I in {4..6};do
kubectl label node v${I} node-role.kubernetes.io/efk="kibana"
done
kubectl get nodes --show-labels
- Note the SERVER_PUBLICBASEURL setting (the external URL Kibana is reached at):
cat > kibana.yaml <<EOF
apiVersion: v1
kind: Service
metadata:
name: kibana
namespace: logging
labels:
app: kibana
spec:
type: NodePort
ports:
- port: 5601
nodePort: 15601
targetPort: 5601
selector:
app: kibana
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: kibana
namespace: logging
labels:
app: kibana
spec:
selector:
matchLabels:
app: kibana
template:
metadata:
labels:
app: kibana
spec:
nodeSelector:
node-role.kubernetes.io/efk: kibana
containers:
- name: kibana
image: docker.elastic.co/kibana/kibana:7.16.2
resources:
limits:
cpu: 1000m
requests:
cpu: 1000m
env:
- name: ELASTICSEARCH_HOSTS
value: http://elasticsearch:9200
- name: I18N_LOCALE
value: zh-CN
- name: SERVER_PUBLICBASEURL
value: https://k8s-kibana.inanu.net
ports:
- containerPort: 5601
EOF
kubectl apply -f ./kibana.yaml
kubectl get pods -n logging -o wide
kubectl get svc -n logging
http://<any_node_ip>:15601 (NodePort exposed by the kibana Service)
- v0
- Note the host setting: k8s-kibana.inanu.net
cat > kibana-ingress.yaml <<EOF
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: k8s-kibana
namespace: logging
annotations:
nginx.ingress.kubernetes.io/use-regex: "true"
nginx.ingress.kubernetes.io/rewrite-target: /
#nginx.ingress.kubernetes.io/ssl-redirect: "true"
nginx.ingress.kubernetes.io/backend-protocol: "HTTP"
spec:
ingressClassName: nginx
rules:
- host: k8s-kibana.inanu.net
http:
paths:
- path: /
pathType: Prefix
backend:
service:
name: kibana
port:
number: 5601
EOF
kubectl apply -f ./kibana-ingress.yaml
http://k8s-kibana.inanu.net
- v0
cat > fluentd-cf.yaml <<'EOF' # quoted delimiter keeps the $ and backticks in the config literal
kind: ConfigMap
apiVersion: v1
metadata:
name: fluentd-config
namespace: logging
data:
system.conf: |-
<system>
root_dir /tmp/fluentd-buffers/
</system>
containers.input.conf: |-
<source>
@id fluentd-containers.log
@type tail # Get latest log from tail.
path /var/log/containers/*.log # Containers log DIR.
pos_file /var/log/es-containers.log.pos # Log position since last time.
tag raw.kubernetes.* # Set log tag.
read_from_head true
<parse> # Format multi-line to JSON.
@type multi_format # Use `multi-format-parser` plugin.
<pattern>
format json
time_key time # Set `time_key` word.
time_format %Y-%m-%dT%H:%M:%S.%NZ
</pattern>
<pattern>
format /^(?<time>.+) (?<stream>stdout|stderr) [^ ]* (?<log>.*)$/
time_format %Y-%m-%dT%H:%M:%S.%N%:z
</pattern>
</parse>
</source>
# https://github.com/GoogleCloudPlatform/fluent-plugin-detect-exceptions
<match raw.kubernetes.**>
@id raw.kubernetes
@type detect_exceptions
remove_tag_prefix raw
message log
stream stream
multiline_flush_interval 5
max_bytes 500000
max_lines 1000
</match>
<filter **> # Join log.
@id filter_concat
@type concat # Fluentd Filter plugin - Join multi-events at different lines.
key message
multiline_end_regexp /\n$/ # Joined by `\n`.
separator ""
</filter>
# Add Kubernetes metadata.
<filter kubernetes.**>
@id filter_kubernetes_metadata
@type kubernetes_metadata
</filter>
# Fix JSON fields in Elasticsearch.
# https://github.com/repeatedly/fluent-plugin-multi-format-parser
<filter kubernetes.**>
@id filter_parser
@type parser
key_name log # Field name.
reserve_data true # Keep original field value.
remove_key_name_field true # Delete field after it's analysed.
<parse>
@type multi_format
<pattern>
format json
</pattern>
<pattern>
format none
</pattern>
</parse>
</filter>
# Delete unused fields.
<filter kubernetes.**>
@type record_transformer
remove_keys $.docker.container_id,$.kubernetes.container_image_id,$.kubernetes.pod_id,$.kubernetes.namespace_id,$.kubernetes.master_url,$.kubernetes.labels.pod-template-hash
</filter>
# Only keep `logging=true` tag in pod's log.
<filter kubernetes.**>
@id filter_log
@type grep
<regexp>
key $.kubernetes.labels.logging
pattern ^true$
</regexp>
</filter>
# Listen configuration - generally used for log aggregation.
forward.input.conf: |-
# Listen TCP messages.
<source>
@id forward
@type forward
</source>
output.conf: |-
<match **>
@id elasticsearch
@type elasticsearch
@log_level info
include_tag_key true
host elasticsearch
port 9200
logstash_format true
logstash_prefix k8s
request_timeout 30s
<buffer>
@type file
path /var/log/fluentd-buffers/kubernetes.system.buffer
flush_mode interval
retry_type exponential_backoff
flush_thread_count 2
flush_interval 5s
retry_forever
retry_max_interval 30
chunk_limit_size 2M
queue_limit_length 8
overflow_action block
</buffer>
</match>
EOF
kubectl apply -f ./fluentd-cf.yaml
- v0
cat > fluentd-ds.yaml <<EOF
apiVersion: v1
kind: ServiceAccount
metadata:
name: fluentd-es
namespace: logging
labels:
k8s-app: fluentd-es
kubernetes.io/cluster-service: "true"
addonmanager.kubernetes.io/mode: Reconcile
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
name: fluentd-es
labels:
k8s-app: fluentd-es
kubernetes.io/cluster-service: "true"
addonmanager.kubernetes.io/mode: Reconcile
rules:
- apiGroups:
- ""
resources:
- "namespaces"
- "pods"
verbs:
- "get"
- "watch"
- "list"
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
name: fluentd-es
labels:
k8s-app: fluentd-es
kubernetes.io/cluster-service: "true"
addonmanager.kubernetes.io/mode: Reconcile
subjects:
- kind: ServiceAccount
name: fluentd-es
namespace: logging
apiGroup: ""
roleRef:
kind: ClusterRole
name: fluentd-es
apiGroup: ""
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
name: fluentd-es
namespace: logging
labels:
k8s-app: fluentd-es
kubernetes.io/cluster-service: "true"
addonmanager.kubernetes.io/mode: Reconcile
spec:
selector:
matchLabels:
k8s-app: fluentd-es
template:
metadata:
labels:
k8s-app: fluentd-es
# Keep Fluentd from being evicted when the node comes under resource pressure.
kubernetes.io/cluster-service: "true"
spec:
priorityClassName: system-cluster-critical
serviceAccountName: fluentd-es
containers:
- name: fluentd-es
image: quay.io/fluentd_elasticsearch/fluentd
env:
- name: FLUENTD_ARGS
value: --no-supervisor -q
resources:
limits:
memory: 500Mi
requests:
cpu: 100m
memory: 200Mi
volumeMounts:
- name: varlog
mountPath: /var/log
- name: containerslog
mountPath: /var/log/containers
readOnly: true
- name: config-volume
mountPath: /etc/fluent/config.d
#nodeSelector:
# beta.kubernetes.io/fluentd-ds-ready: "true"
tolerations:
- operator: Exists
terminationGracePeriodSeconds: 30
volumes:
- name: varlog
hostPath:
path: /var/log
- name: containerslog
hostPath:
path: /var/log/containers
- name: config-volume
configMap:
name: fluentd-config
EOF
kubectl apply -f ./fluentd-ds.yaml
kubectl get pods -n logging
- v0
- Note: only Pods carrying the logging: "true" label have their logs collected
cat > test-log-pod.yaml <<'EOF' # quoted delimiter keeps $(date) literal in the file
apiVersion: v1
kind: Pod
metadata:
name: test-log
labels:
logging: "true" # Trun log on.
spec:
containers:
- name: test-log
image: busybox
args:
- "/bin/sh"
- "-c"
- "while true; do echo $(date); sleep 1; done"
EOF
kubectl apply -f ./test-log-pod.yaml
kubectl logs test-log -f
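- The output should land in Elasticsearch under the k8s-* indices (logstash_prefix k8s in the Fluentd output config); a quick check:
kubectl port-forward es-0 9200:9200 --namespace=logging
curl 'http://localhost:9200/_cat/indices?v'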
- Create an Index Pattern in Kibana matching k8s-*
- v0
kubectl delete -f ./test-log-pod.yaml