
1 Initialize the Operating System

Basic OS information:

  • OS: CentOS 7.6
  • Installation type: Minimal
  • IP: 192.168.2.105
  • Hostname: ks-allinone

Install some dependencies and basic tools:

yum update -y
yum install vim ntpdate net-tools socat conntrack ebtables ipset -y

Sync the system time:

ntpdate ntp.aliyun.com

## Sync time at boot
echo "/usr/sbin/ntpdate ntp.aliyun.com" >> /etc/rc.d/rc.local
chmod +x /etc/rc.d/rc.local

Raise the open-file and max-process limits:

cat >> /etc/security/limits.conf << EOF
*    soft    nofile    65535
*    hard    nofile    65535
*    soft    nproc     65535
*    hard    nproc     65535
EOF

Reboot the system:

reboot
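
After the reboot, it is worth confirming that the new limits and the time sync actually took effect; a quick spot check:

ulimit -n    # expect 65535
ulimit -u    # expect 65535
date         # should now track ntp.aliyun.com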

2 Install Docker

## Remove old Docker versions
yum remove docker \
    docker-client \
    docker-client-latest \
    docker-common \
    docker-latest \
    docker-latest-logrotate \
    docker-logrotate \
    docker-engine

## Set up the repository
yum install -y yum-utils
yum-config-manager \
    --add-repo \
    https://download.docker.com/linux/centos/docker-ce.repo
    
## List the installable versions
yum list docker-ce --showduplicates | sort -r

## Install a specific version
yum install docker-ce-20.10.10 docker-ce-cli-20.10.10 containerd.io -y

## Start Docker
systemctl start docker

## Enable Docker at boot
systemctl enable docker

## Check the service status
systemctl status docker 
● docker.service - Docker Application Container Engine
   Loaded: loaded (/usr/lib/systemd/system/docker.service; enabled; vendor preset: disabled)
   Active: active (running) since 六 2022-02-26 16:20:48 CST; 10s ago
     Docs: https://docs.docker.com
 Main PID: 5483 (dockerd)
   CGroup: /system.slice/docker.service
           └─5483 /usr/bin/dockerd -H fd:// --containerd=/run/containerd/containerd.sock
......
Hint: Some lines were ellipsized, use -l to show in full.

3 Install KubeSphere

## Download the kk installer
export KKZONE=cn
curl -sfL https://get-kk.kubesphere.io | VERSION=v1.2.1 sh -
## If the download fails, retry a few times; if it still will not download, use the link below:
## wget https://oss.iuskye.com/article/2022-02-26/kubekey-v1.2.1-linux-amd64.tar.gz
## tar zxf kubekey-v1.2.1-linux-amd64.tar.gz && chmod +x kk && export KKZONE=cn

## Make kk executable
chmod +x kk

## List the supported Kubernetes versions
./kk version --show-supported-k8s

## Start the installation; it takes quite a while, so be patient
./kk create cluster --with-kubernetes v1.22.1 --with-kubesphere v3.2.1

## After it finishes, check the installation result with:
kubectl logs -n kubesphere-system $(kubectl get pod -n kubesphere-system -l app=ks-install -o jsonpath='{.items[0].metadata.name}') -f

......
#####################################################
###              Welcome to KubeSphere!           ###
#####################################################

Console: http://192.168.2.105:30880
Account: admin
Password: P@88w0rd

NOTES:
  1. After you log into the console, please check the
     monitoring status of service components in
     "Cluster Management". If any service is not
     ready, please wait patiently until all components
     are up and running.
  2. Please change the default password after login.

#####################################################
https://kubesphere.io             2022-02-26 16:51:55
#####################################################
## Check the status of all components
kubectl get pod --all-namespaces
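
The command above installs an all-in-one cluster on the local machine. For a multi-node cluster, kk can generate a cluster definition first and then install from it; a sketch of that workflow (the hosts and roleGroups in the generated config-sample.yaml must be edited to match your environment):

./kk create config --with-kubernetes v1.22.1 --with-kubesphere v3.2.1
## edit config-sample.yaml (hosts, roleGroups, ...)
./kk create cluster -f config-sample.yaml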

4 Log in to the Web Console

Open http://192.168.2.105:30880 in a browser.

The username and password were shown by the command in the previous step:

  • Username: admin
  • Password: P@88w0rd

5 Breakdown of the Installed Components

5.1 Installed as binaries

  • kubeadm
  • kubectl
  • kubelet
  • etcd
  • cni-plugins
  • helm

5.2 Installed as container images

## Kubernetes control-plane components
registry.cn-beijing.aliyuncs.com/kubesphereio/kube-apiserver:v1.22.1
registry.cn-beijing.aliyuncs.com/kubesphereio/kube-controller-manager:v1.22.1
registry.cn-beijing.aliyuncs.com/kubesphereio/kube-scheduler:v1.22.1
## Kubernetes worker-node components
registry.cn-beijing.aliyuncs.com/kubesphereio/kube-proxy:v1.22.1
registry.cn-beijing.aliyuncs.com/kubesphereio/pause:3.5
kubesphere/kubectl:v1.22.0
## Network plugin
registry.cn-beijing.aliyuncs.com/kubesphereio/cni:v3.20.0
## DNS component
registry.cn-beijing.aliyuncs.com/kubesphereio/coredns:1.8.0
## Other cluster components (Calico controllers, NodeLocal DNSCache, OpenEBS LocalPV provisioner)
registry.cn-beijing.aliyuncs.com/kubesphereio/k8s-dns-node-cache:1.15.12
registry.cn-beijing.aliyuncs.com/kubesphereio/kube-controllers:v3.20.0
registry.cn-beijing.aliyuncs.com/kubesphereio/node:v3.20.0
registry.cn-beijing.aliyuncs.com/kubesphereio/pod2daemon-flexvol:v3.20.0
registry.cn-beijing.aliyuncs.com/kubesphereio/provisioner-localpv:2.10.1
## KubeSphere components
registry.cn-beijing.aliyuncs.com/kubesphereio/ks-installer:v3.2.1
kubesphere/ks-controller-manager:v3.2.1
kubesphere/ks-apiserver:v3.2.1
kubesphere/ks-console:v3.2.1
kubesphere/notification-manager:v1.4.0
kubesphere/notification-tenant-sidecar:v3.2.0
kubesphere/notification-manager-operator:v1.4.0
prom/prometheus:v2.26.0
csiplugin/snapshot-controller:v4.0.0
kubesphere/prometheus-config-reloader:v0.43.2
kubesphere/prometheus-operator:v0.43.2
kubesphere/kube-rbac-proxy:v0.8.0
registry.cn-beijing.aliyuncs.com/kubesphereio/coredns:1.8.0
prom/alertmanager:v0.21.0
kubesphere/kube-state-metrics:v1.9.7
prom/node-exporter:v0.18.1
mirrorgooglecontainers/defaultbackend-amd64:1.4

5.3 Kernel tuning

/etc/sysctl.conf

net.ipv4.ip_forward = 1
net.bridge.bridge-nf-call-arptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_local_reserved_ports = 30000-32767
vm.max_map_count = 262144
vm.swappiness = 1
fs.inotify.max_user_instances = 524288
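
These values live in /etc/sysctl.conf as shown above; to load and spot-check them manually (for example on a node added later):

sysctl -p /etc/sysctl.conf
sysctl net.ipv4.ip_local_reserved_ports vm.max_map_count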

6 References

KubeSphere docs: https://kubesphere.com.cn/docs/quick-start/all-in-one-on-linux/
KubeSphere GitHub: https://github.com/kubesphere/kubesphere
Docker: https://docs.docker.com/engine/install/centos/

1 Query Commands

# Get node information and versions
kubectl get nodes
# Get node information with additional details
kubectl get nodes -o wide

# Get pods in the default namespace
kubectl get pod
# Get pods in the default namespace with extra details (pod IP, node it runs on, ...)
kubectl get pod -o wide
# Get pods in a specific namespace
kubectl get pod -n kube-system
# Get a specific pod in a specific namespace
kubectl get pod -n kube-system podName
# Get pods in all namespaces
kubectl get pod -A
# Show pod details in YAML or JSON format
kubectl get pods -o yaml
kubectl get pods -o json

# Show pod labels
kubectl get pod -A --show-labels
# Filter pods by selector (label query)
kubectl get pod -A --selector="k8s-app=kube-dns"

# Show the environment variables of a running pod
kubectl exec podName -- env
# Tail the logs of a specific pod
kubectl logs -f --tail 500 -n kube-system kube-apiserver-k8s-master

# List services in all namespaces
kubectl get svc -A
# List services in a specific namespace
kubectl get svc -n kube-system

# Show componentstatuses
kubectl get cs
# List all configmaps
kubectl get cm -A
# List all serviceaccounts
kubectl get sa -A
# List all daemonsets
kubectl get ds -A
# List all deployments
kubectl get deploy -A
# List all replicasets
kubectl get rs -A
# List all statefulsets
kubectl get sts -A
# List all jobs
kubectl get jobs -A
# List all ingresses
kubectl get ing -A
# List namespaces
kubectl get ns

# Describe a pod
kubectl describe pod podName
kubectl describe pod -n kube-system kube-apiserver-k8s-master
# Describe a deployment in a specific namespace
kubectl describe deploy -n kube-system coredns

# Show node or pod resource usage
# Requires heapster or metrics-server
kubectl top node
kubectl top pod

# Show cluster information
kubectl cluster-info    # or: kubectl cluster-info dump
# Show component statuses (172.16.1.110 is the master)
kubectl -s https://172.16.1.110:6443 get componentstatuses

2 Operation Commands

# Create resources
kubectl create -f xxx.yaml
# Apply resources
kubectl apply -f xxx.yaml
# Apply resources from a directory; every .yaml, .yml, or .json file in it is used
kubectl apply -f <directory>
# Create the "test" namespace
kubectl create namespace test

# Delete resources
kubectl delete -f xxx.yaml
kubectl delete -f <directory>
# Delete a specific pod
kubectl delete pod podName
# Delete a specific pod in a specific namespace
kubectl delete pod -n test podName
# Delete other resources
kubectl delete svc svcName
kubectl delete deploy deployName
kubectl delete ns nsName
# Force deletion
kubectl delete pod podName -n nsName --grace-period=0 --force
kubectl delete pod podName -n nsName --grace-period=1
kubectl delete pod podName -n nsName --now

# Edit a resource
kubectl edit pod podName

3 Advanced Operation Commands

# kubectl exec: open a shell in a pod's container
kubectl exec -it podName -n nsName -- /bin/sh    # enter the container
kubectl exec -it podName -n nsName -- /bin/bash  # enter the container

# kubectl label: manage labels
kubectl label nodes k8s-node01 zone=north  # add a label to a node
kubectl label nodes k8s-node01 zone-       # remove a label from a node
kubectl label pod podName -n nsName role-name=test    # add a label to a pod
kubectl label pod podName -n nsName role-name=dev --overwrite  # change a label value
kubectl label pod podName -n nsName role-name-        # remove a label

# Rolling upgrades; first start the deployment with kubectl apply -f myapp-deployment-v1.yaml
kubectl apply -f myapp-deployment-v2.yaml     # rolling upgrade via config file
kubectl set image deploy/myapp-deployment myapp="registry.cn-beijing.aliyuncs.com/google_registry/myapp:v3"   # rolling upgrade via command
kubectl rollout undo deploy/myapp-deployment    # or: kubectl rollout undo deploy myapp-deployment; roll back to the previous revision
kubectl rollout undo deploy/myapp-deployment --to-revision=2  # roll back to a specific revision

# kubectl scale: scale in and out
kubectl scale deploy myapp-deployment --replicas=5  # scale the deployment
kubectl scale --replicas=8 -f myapp-deployment-v2.yaml  # scale by the resource type and name from the file (other fields in the file, e.g. a different image tag, are not applied)
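
Alongside rolling upgrades and rollbacks, the rollout subcommand can watch an upgrade and list the revision numbers used by --to-revision; for the myapp-deployment used above:

# Watch a rollout until it completes or fails
kubectl rollout status deploy/myapp-deployment
# List revisions (use with --to-revision)
kubectl rollout history deploy/myapp-deployment
# Pause / resume a rollout
kubectl rollout pause deploy/myapp-deployment
kubectl rollout resume deploy/myapp-deployment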

1 Pod Example

---
apiVersion: v1                        # Required. API version, e.g. v1; must appear in "kubectl api-versions"
kind: Pod                             # Required. Resource kind, here Pod
metadata:                             # Required. Metadata
  name: string                        # Required. Pod name
  namespace: string                   # Namespace the Pod belongs to; defaults to "default"
  labels:                             # Custom labels (key: value map)
    key: value
  annotations:                        # Custom annotations (key: value map)
    key: value
spec:                                 # Required. Specification of the containers in the Pod
  containers:                         # Required. List of containers in the Pod
  - name: string                      # Required. Container name, must conform to RFC 1035
    image: string                     # Required. Container image name
    imagePullPolicy: [Always|Never|IfNotPresent]  # Image pull policy: Always pulls every time, IfNotPresent prefers a local image and pulls only if missing, Never uses only the local image
    command: [string]                 # Startup command; defaults to the command baked into the image if omitted
    args: [string]                    # Arguments to the startup command
    workingDir: string                # Container working directory
    volumeMounts:                     # Volumes mounted into the container
    - name: string                    # Name of a volume defined in the Pod's volumes[] section
      mountPath: string               # Absolute mount path inside the container (less than 512 characters)
      readOnly: boolean               # Whether the mount is read-only
    ports:                            # Ports to expose
    - name: string                    # Port name
      containerPort: int              # Port the container listens on
      hostPort: int                   # Port to open on the host; defaults to containerPort
      protocol: string                # Port protocol, TCP or UDP; defaults to TCP
    env:                              # Environment variables to set before the container starts
    - name: string                    # Variable name
      value: string                   # Variable value
    resources:                        # Resource limits and requests
      limits:                         # Resource limits
        cpu: string                   # CPU limit in cores (maps to docker run --cpu-shares)
        memory: string                # Memory limit, e.g. Mi/Gi (maps to docker run --memory)
      requests:                       # Resource requests
        cpu: string                   # CPU initially requested at startup
        memory: string                # Memory initially requested at startup
    livenessProbe:                    # Health check; the container is restarted after repeated failed probes. Use exactly one of exec, httpGet or tcpSocket per container
      exec:                           # exec-style check
        command: [string]             # Command or script to run
      httpGet:                        # HTTP GET check; requires path and port
        path: string
        port: number
        host: string
        scheme: string
        httpHeaders:
        - name: string
          value: string
      tcpSocket:                      # TCP socket check
        port: number
      initialDelaySeconds: 0          # Seconds to wait after the container starts before the first probe
      timeoutSeconds: 0               # Probe timeout in seconds; default 1
      periodSeconds: 0                # Probe interval in seconds; default 10
      successThreshold: 0
      failureThreshold: 0
    securityContext:
      privileged: false
  restartPolicy: [Always|Never|OnFailure]  # Restart policy: Always restarts however the container exits, OnFailure restarts only on a non-zero exit code, Never does not restart
  nodeSelector: object                # Schedule the Pod onto nodes carrying these labels, given as key: value pairs
  imagePullSecrets:                   # Secrets used when pulling the image
  - name: string
  hostNetwork: false                  # Whether to use the host network; defaults to false
  volumes:                            # Shared volumes defined at the Pod level (many volume types exist)
  - name: string                      # Volume name
    emptyDir: {}                      # emptyDir volume: a temporary directory sharing the Pod's lifetime; left empty
    hostPath:                         # hostPath volume: mounts a directory of the host the Pod runs on
      path: string                    # Host directory that will be mounted into the container
    secret:                           # secret volume: mounts a predefined Secret object into the container
      secretName: string
      items:
      - key: string
        path: string
    configMap:                        # configMap volume: mounts a predefined ConfigMap object into the container
      name: string
      items:
      - key: string
        path: string
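
The template above enumerates the available fields; most of them are optional in practice. For reference, here is a minimal runnable Pod that exercises a handful of them (the name, image and probe values are only illustrative):

cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: nginx-demo
  labels:
    app: nginx-demo
spec:
  containers:
  - name: nginx
    image: nginx:1.21
    imagePullPolicy: IfNotPresent
    ports:
    - containerPort: 80
    resources:
      requests:
        cpu: 100m
        memory: 64Mi
      limits:
        cpu: 500m
        memory: 128Mi
    livenessProbe:
      httpGet:
        path: /
        port: 80
      initialDelaySeconds: 5
      periodSeconds: 10
EOF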

2 Deployment Example

---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: redis-deployment
spec:
  replicas: 2
  selector:                 # required for apps/v1; must match the template labels
    matchLabels:
      app: default.Deployment.redis_server
  template:
    metadata:
      labels:
        app: default.Deployment.redis_server
    spec:
      containers:
      - name: redis
        image: redis:latest
        imagePullPolicy: IfNotPresent
        ports:
        - containerPort: 6379
      volumes:
      - name: data
        emptyDir: {}

3 Service Example

kind: Service
apiVersion: v1
metadata:
  name: redis
spec:
  type: NodePort
  ports:
  - protocol: TCP
    port: 6379
    targetPort: 6379
    nodePort: 30379
    name: test
  selector:
    app: default.Deployment.redis_server
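
After applying the Deployment and the Service above, the Service's selector should resolve to the redis Pods; a quick way to verify (the manifest file names are whatever you saved them as):

kubectl apply -f redis-deployment.yaml -f redis-service.yaml
kubectl get svc redis          # shows the 6379:30379/TCP NodePort mapping
kubectl get endpoints redis    # should list the redis Pod IPs
# from any node: redis-cli -h <node-ip> -p 30379 ping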

4 PVC Example

---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: nfs-pvc-mbs-log
spec:
  storageClassName: "nfs-storage"
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 100Gi
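
A workload consumes the claim by referencing its name in a volume. A minimal sketch, assuming the PVC above exists (the pod name and mount path are arbitrary):

cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: mbs-log-writer
spec:
  containers:
  - name: app
    image: busybox:1.34
    command: ["sh", "-c", "echo hello > /data/log/test.log && sleep 3600"]
    volumeMounts:
    - name: mbs-log
      mountPath: /data/log
  volumes:
  - name: mbs-log
    persistentVolumeClaim:
      claimName: nfs-pvc-mbs-log
EOF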

Create the storageClass:

helm install nfs-storage stable/nfs-client-provisioner --set nfs.server=192.168.1.90 --set nfs.path=/home/k8s --set storageClass.name=nfs-storage  --set storageClass.defaultClass=true

Reference: https://juejin.cn/post/6912071173413011470

At the top level, a Compose file is split into a few sections: version, services, and the optional volumes, networks, configs and secrets.

version: "3.9"

services:
  service_name:
  my_app:
  db:
  ...:

volumes:
networks:
configs:
secrets:

Reference Links: https://docs.docker.com/compose/compose-file/compose-file-v3/

1 version

1.1 Compose file versions support specific Docker releases

Compose file format     Docker Engine release
Compose specification   19.03.0+
3.8                     19.03.0+
3.7                     18.06.0+
3.6                     18.02.0+
3.5                     17.12.0+
3.4                     17.09.0+
3.3                     17.06.0+
3.2                     17.04.0+
3.1                     1.13.1+
3.0                     1.13.0+
2.4                     17.12.0+
2.3                     17.06.0+
2.2                     1.13.0+
2.1                     1.12.0+
2.0                     1.10.0+

1.2 Compose file structure and examples

version: "3.9"
services:

  redis:
    image: redis:alpine
    ports:
      - "6379"
    networks:
      - frontend
    deploy:
      replicas: 2
      update_config:
        parallelism: 2
        delay: 10s
      restart_policy:
        condition: on-failure

  db:
    image: postgres:9.4
    volumes:
      - db-data:/var/lib/postgresql/data
    networks:
      - backend
    deploy:
      placement:
        max_replicas_per_node: 1
        constraints:
          - "node.role==manager"

  vote:
    image: dockersamples/examplevotingapp_vote:before
    ports:
      - "5000:80"
    networks:
      - frontend
    depends_on:
      - redis
    deploy:
      replicas: 2
      update_config:
        parallelism: 2
      restart_policy:
        condition: on-failure

  result:
    image: dockersamples/examplevotingapp_result:before
    ports:
      - "5001:80"
    networks:
      - backend
    depends_on:
      - db
    deploy:
      replicas: 1
      update_config:
        parallelism: 2
        delay: 10s
      restart_policy:
        condition: on-failure

  worker:
    image: dockersamples/examplevotingapp_worker
    networks:
      - frontend
      - backend
    deploy:
      mode: replicated
      replicas: 1
      labels: [APP=VOTING]
      restart_policy:
        condition: on-failure
        delay: 10s
        max_attempts: 3
        window: 120s
      placement:
        constraints:
          - "node.role==manager"

  visualizer:
    image: dockersamples/visualizer:stable
    ports:
      - "8080:8080"
    stop_grace_period: 1m30s
    volumes:
      - "/var/run/docker.sock:/var/run/docker.sock"
    deploy:
      placement:
        constraints:
          - "node.role==manager"

networks:
  frontend:
  backend:

volumes:
  db-data:

2 services

Each block under the services key defines one service; service names are user-defined, so pick something descriptive.

eg:

version: "3.9"
services:
  web:
    build: .
    depends_on:
      - mysql
      - redis
  mysql:
    image: mysql:v1
  redis:
    image: redis:v1
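
With a file like this saved as docker-compose.yml, the whole stack is built and started in one step; a typical session (assuming the Dockerfile and the mysql:v1 / redis:v1 images actually exist):

docker-compose up -d --build   # build the web image and start all three services
docker-compose ps              # confirm web, mysql and redis are running
docker-compose logs -f web     # follow the web service's logs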

2.1 build

  • context
build:
  context: ./dir
  • dockerfile
build:
  context: .
  dockerfile: Dockerfile-alternate
  • args
build:
  context: .
  args:
    - buildno=1
    - gitcommithash=cdc3b19
  • cache_from
build:
  context: .
  cache_from:
    - alpine:latest
    - corp/web_app:3.14
  • labels
build:
  context: .
  labels:
    com.example.description: "Accounting webapp"
    com.example.department: "Finance"
    com.example.label-with-empty-value: ""
build:
  context: .
  labels:
    - "com.example.description=Accounting webapp"
    - "com.example.department=Finance"
    - "com.example.label-with-empty-value"
  • network
build:
  context: .
  network: host
build:
  context: .
  network: custom_network_1
build:
  context: .
  network: none
  • shm_size
build:
  context: .
  shm_size: '2gb'
build:
  context: .
  shm_size: 10000000
  • target
build:
  context: .
  target: prod

2.2 cap_add, cap_drop

cap_add:
  - ALL

cap_drop:
  - NET_ADMIN
  - SYS_ADMIN

2.3 cgroup_parent

cgroup_parent: m-executor-abcd

2.4 command

command: bundle exec thin -p 3000
command: ["bundle", "exec", "thin", "-p", "3000"]

2.5 configs

  • Short syntax
version: "3.9"
services:
  redis:
    image: redis:latest
    deploy:
      replicas: 1
    configs:
      - my_config
      - my_other_config
configs:
  my_config:
    file: ./my_config.txt
  my_other_config:
    external: true
  • Long syntax
version: "3.9"
services:
  redis:
    image: redis:latest
    deploy:
      replicas: 1
    configs:
      - source: my_config
        target: /redis_config
        uid: '103'
        gid: '103'
        mode: 0440
configs:
  my_config:
    file: ./my_config.txt
  my_other_config:
    external: true

2.6 container_name

container_name: my-web-container

2.7 credential_spec

credential_spec:
  file: my-credential-spec.json
credential_spec:
  registry: my-credential-spec

eg:

version: "3.9"
services:
  myservice:
    image: myimage:latest
    credential_spec:
      config: my_credential_spec

configs:
  my_credentials_spec:
    file: ./my-credential-spec.json

2.8 depends_on

version: "3.9"
services:
  web:
    build: .
    depends_on:
      - db
      - redis
  redis:
    image: redis
  db:
    image: postgres

2.9 deploy

version: "3.9"
services:
  redis:
    image: redis:alpine
    deploy:
      replicas: 6
      placement:
        max_replicas_per_node: 1
      update_config:
        parallelism: 2
        delay: 10s
      restart_policy:
        condition: on-failure
  • endpoint_mode
version: "3.9"

services:
  wordpress:
    image: wordpress
    ports:
      - "8080:80"
    networks:
      - overlay
    deploy:
      mode: replicated
      replicas: 2
      endpoint_mode: vip

  mysql:
    image: mysql
    volumes:
       - db-data:/var/lib/mysql/data
    networks:
       - overlay
    deploy:
      mode: replicated
      replicas: 2
      endpoint_mode: dnsrr

volumes:
  db-data:

networks:
  overlay:
  • labels
version: "3.9"
services:
  web:
    image: web
    deploy:
      labels:
        com.example.description: "This label will appear on the web service"
version: "3.9"
services:
  web:
    image: web
    labels:
      com.example.description: "This label will appear on all containers for the web service"
  • mode
version: "3.9"
services:
  worker:
    image: dockersamples/examplevotingapp_worker
    deploy:
      mode: global
  • placement
version: "3.9"
services:
  db:
    image: postgres
    deploy:
      placement:
        constraints:
          - "node.role==manager"
          - "engine.labels.operatingsystem==ubuntu 18.04"
        preferences:
          - spread: node.labels.zone
  • max_replicas_per_node
version: "3.9"
services:
  worker:
    image: dockersamples/examplevotingapp_worker
    networks:
      - frontend
      - backend
    deploy:
      mode: replicated
      replicas: 6
      placement:
        max_replicas_per_node: 1
  • replicas
version: "3.9"
services:
  worker:
    image: dockersamples/examplevotingapp_worker
    networks:
      - frontend
      - backend
    deploy:
      mode: replicated
      replicas: 6
  • resources
version: "3.9"
services:
  redis:
    image: redis:alpine
    deploy:
      resources:
        limits:
          cpus: '0.50'
          memory: 50M
        reservations:
          cpus: '0.25'
          memory: 20M
  • restart_policy
version: "3.9"
services:
  redis:
    image: redis:alpine
    deploy:
      restart_policy:
        condition: on-failure
        delay: 5s
        max_attempts: 3
        window: 120s
  • rollback_config

    • parallelism: The number of containers to rollback at a time. If set to 0, all containers rollback simultaneously.
    • delay: The time to wait between each container group’s rollback (default 0s).
    • failure_action: What to do if a rollback fails. One of continue or pause (default pause)
    • monitor: Duration after each task update to monitor for failure (ns|us|ms|s|m|h) (default 5s) Note: Setting to 0 will use the default 5s.
    • max_failure_ratio: Failure rate to tolerate during a rollback (default 0).
    • order: Order of operations during rollbacks. One of stop-first (old task is stopped before starting new one), or start-first (new task is started first, and the running tasks briefly overlap) (default stop-first).
  • update_config

    • parallelism: The number of containers to update at a time.
    • delay: The time to wait between updating a group of containers.
    • failure_action: What to do if an update fails. One of continue, rollback, or pause (default: pause).
    • monitor: Duration after each task update to monitor for failure (ns|us|ms|s|m|h) (default 5s) Note: Setting to 0 will use the default 5s.
    • max_failure_ratio: Failure rate to tolerate during an update.
    • order: Order of operations during updates. One of stop-first (old task is stopped before starting new one), or start-first (new task is started first, and the running tasks briefly overlap) (default stop-first) Note: Only supported for v3.4 and higher.
version: "3.9"
services:
  vote:
    image: dockersamples/examplevotingapp_vote:before
    depends_on:
      - redis
    deploy:
      replicas: 2
      update_config:
        parallelism: 2
        delay: 10s
        order: stop-first

2.10 devices

devices:
  - "/dev/ttyUSB0:/dev/ttyUSB0"

2.11 dns

dns: 8.8.8.8
dns:
  - 8.8.8.8
  - 9.9.9.9

2.12 dns_search

dns_search: example.com
dns_search:
  - dc1.example.com
  - dc2.example.com

2.13 entrypoint

entrypoint: /code/entrypoint.sh
entrypoint: ["php", "-d", "memory_limit=-1", "vendor/bin/phpunit"]

2.14 env_file

env_file: .env
env_file:
  - ./common.env
  - ./apps/web.env
  - /opt/runtime_opts.env

2.15 environment

environment:
  RACK_ENV: development
  SHOW: 'true'
  SESSION_SECRET:
environment:
  - RACK_ENV=development
  - SHOW=true
  - SESSION_SECRET

2.16 expose

expose:
  - "3000"
  - "8000"

2.17 external_links

external_links:
  - redis_1
  - project_db_1:mysql
  - project_db_1:postgresql

2.18 extra_hosts

extra_hosts:
  - "somehost:162.242.195.82"
  - "otherhost:50.31.209.229"

2.19 healthcheck

healthcheck:
  test: ["CMD", "curl", "-f", "http://localhost"]
  interval: 1m30s
  timeout: 10s
  retries: 3
  start_period: 40s
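
The outcome of this check is stored on the container and drives the (healthy)/(unhealthy) marker in docker ps; it can also be read back directly once the container is running (the container name is whatever Compose assigned or you set):

docker ps                                                      # STATUS column shows (healthy) / (unhealthy)
docker inspect --format '{{.State.Health.Status}}' <container-name>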

2.20 image

image: redis
image: ubuntu:18.04
image: tutum/influxdb
image: example-registry.com:4000/postgresql
image: a4bc65fd

2.21 init

version: "3.9"
services:
  web:
    image: alpine:latest
    init: true

2.22 isolation

2.23 labels

labels:
  com.example.description: "Accounting webapp"
  com.example.department: "Finance"
  com.example.label-with-empty-value: ""
labels:
  - "com.example.description=Accounting webapp"
  - "com.example.department=Finance"
  - "com.example.label-with-empty-value"

2.24 links

web:
  links:
    - "db"
    - "db:database"
    - "redis"

2.25 logging

logging:
  driver: syslog
  options:
    syslog-address: "tcp://192.168.0.42:123"
driver: "json-file"

driver: "syslog"

driver: "none"

eg:

version: "3.9"
services:
  some-service:
    image: some-service
    logging:
      driver: "json-file"
      options:
        max-size: "200k"
        max-file: "10"

2.26 network_mode

network_mode: "bridge"

network_mode: "host"

network_mode: "none"

network_mode: "service:[service name]"

network_mode: "container:[container name/id]"

2.27 networks

services:
  some-service:
    networks:
     - some-network
     - other-network
  • aliases
services:
  some-service:
    networks:
      some-network:
        aliases:
          - alias1
          - alias3
      other-network:
        aliases:
          - alias2

eg:

version: "3.9"

services:
  web:
    image: "nginx:alpine"
    networks:
      - new

  worker:
    image: "my-worker-image:latest"
    networks:
      - legacy

  db:
    image: mysql
    networks:
      new:
        aliases:
          - database
      legacy:
        aliases:
          - mysql

networks:
  new:
  legacy:
  • ipv4_address, ipv6_address
version: "3.9"

services:
  app:
    image: nginx:alpine
    networks:
      app_net:
        ipv4_address: 172.16.238.10
        ipv6_address: 2001:3984:3989::10

networks:
  app_net:
    ipam:
      driver: default
      config:
        - subnet: "172.16.238.0/24"
        - subnet: "2001:3984:3989::/64"

2.28 pid

pid: "host"

2.29 ports

  • Short syntax
ports:
  - "3000"
  - "3000-3005"
  - "8000:8000"
  - "9090-9091:8080-8081"
  - "49100:22"
  - "127.0.0.1:8001:8001"
  - "127.0.0.1:5000-5010:5000-5010"
  - "127.0.0.1::5000"
  - "6060:6060/udp"
  - "12400-12500:1240"
  • Long syntax

    • target: the port inside the container
    • published: the publicly exposed port
    • protocol: the port protocol (tcp or udp)
    • mode: host for publishing a host port on each node, or ingress for a swarm mode port to be load balanced.
ports:
  - target: 80
    published: 8080
    protocol: tcp
    mode: host

2.30 profiles

profiles: ["frontend", "debug"]
profiles:
  - frontend
  - debug

2.31 restart

restart: "no"
restart: always
restart: on-failure
restart: unless-stopped

2.32 secrets

  • Short syntax
version: "3.9"
services:
  redis:
    image: redis:latest
    deploy:
      replicas: 1
    secrets:
      - my_secret
      - my_other_secret
secrets:
  my_secret:
    file: ./my_secret.txt
  my_other_secret:
    external: true
  • Long syntax

    • source: The identifier of the secret as it is defined in this configuration.
    • target: The name of the file to be mounted in /run/secrets/ in the service’s task containers. Defaults to source if not specified.
    • uid and gid: The numeric UID or GID that owns the file within /run/secrets/ in the service’s task containers. Both default to 0 if not specified.
    • mode: The permissions for the file to be mounted in /run/secrets/ in the service’s task containers, in octal notation. For instance, 0444 represents world-readable. The default in Docker 1.13.1 is 0000, but is 0444 in newer versions. Secrets cannot be writable because they are mounted in a temporary filesystem, so if you set the writable bit, it is ignored. The executable bit can be set. If you aren’t familiar with UNIX file permission modes, you may find a permissions calculator useful.
version: "3.9"
services:
  redis:
    image: redis:latest
    deploy:
      replicas: 1
    secrets:
      - source: my_secret
        target: redis_secret
        uid: '103'
        gid: '103'
        mode: 0440
secrets:
  my_secret:
    file: ./my_secret.txt
  my_other_secret:
    external: true

2.33 security_opt

security_opt:
  - label:user:USER
  - label:role:ROLE

2.34 stop_grace_period

stop_grace_period: 1s
stop_grace_period: 1m30s

2.35 stop_signal

stop_signal: SIGUSR1

2.36 sysctls

sysctls:
  net.core.somaxconn: 1024
  net.ipv4.tcp_syncookies: 0
sysctls:
  - net.core.somaxconn=1024
  - net.ipv4.tcp_syncookies=0

2.37 tmpfs

tmpfs: /run
tmpfs:
  - /run
  - /tmp

eg:

- type: tmpfs
  target: /app
  tmpfs:
    size: 1000

2.38 ulimits

ulimits:
  nproc: 65535
  nofile:
    soft: 20000
    hard: 40000

2.39 userns_mode

userns_mode: "host"

2.40 volumes

version: "3.9"
services:
  web:
    image: nginx:alpine
    volumes:
      - type: volume
        source: mydata
        target: /data
        volume:
          nocopy: true
      - type: bind
        source: ./static
        target: /opt/app/static

  db:
    image: postgres:latest
    volumes:
      - "/var/run/postgres/postgres.sock:/var/run/postgres/postgres.sock"
      - "dbdata:/var/lib/postgresql/data"

volumes:
  mydata:
  dbdata:
  • Short syntax
volumes:
  # Just specify a path and let the Engine create a volume
  - /var/lib/mysql

  # Specify an absolute path mapping
  - /opt/data:/var/lib/mysql

  # Path on the host, relative to the Compose file
  - ./cache:/tmp/cache

  # User-relative path
  - ~/configs:/etc/configs/:ro

  # Named volume
  - datavolume:/var/lib/mysql
  • Long syntax

    • type: the mount type volume, bind, tmpfs or npipe
    • source: the source of the mount, a path on the host for a bind mount, or the name of a volume defined in the top-level volumes key. Not applicable for a tmpfs mount.
    • target: the path in the container where the volume is mounted
    • read_only: flag to set the volume as read-only
    • bind: configure additional bind options

      • propagation: the propagation mode used for the bind
    • volume: configure additional volume options

      • nocopy: flag to disable copying of data from a container when a volume is created
    • tmpfs: configure additional tmpfs options

      • size: the size for the tmpfs mount in bytes
version: "3.9"
services:
  web:
    image: nginx:alpine
    ports:
      - "80:80"
    volumes:
      - type: volume
        source: mydata
        target: /data
        volume:
          nocopy: true
      - type: bind
        source: ./static
        target: /opt/app/static

networks:
  webnet:

volumes:
  mydata:

3 Other Fields

3.1 volumes

version: "3.9"

services:
  db:
    image: db
    volumes:
      - data-volume:/var/lib/db
  backup:
    image: backup-service
    volumes:
      - data-volume:/var/lib/backup/data

volumes:
  data-volume:
  • driver
driver: foobar
  • driver_opts
volumes:
  example:
    driver_opts:
      type: "nfs"
      o: "addr=10.40.0.199,nolock,soft,rw"
      device: ":/docker/example"
  • external
version: "3.9"

services:
  db:
    image: postgres
    volumes:
      - data:/var/lib/postgresql/data

volumes:
  data:
    external: true
  • labels
labels:
  com.example.description: "Database volume"
  com.example.department: "IT/Ops"
  com.example.label-with-empty-value: ""
labels:
  - "com.example.description=Database volume"
  - "com.example.department=IT/Ops"
  - "com.example.label-with-empty-value"
  • name
version: "3.9"
volumes:
  data:
    name: my-app-data
version: "3.9"
volumes:
  data:
    external: true
    name: my-app-data

3.2 networks

  • driver

    • bridge
    • overlay
    • host or none
  • driver_opts
driver_opts:
  foo: "bar"
  baz: 1
  • attachable
networks:
  mynet1:
    driver: overlay
    attachable: true
  • ipam
ipam:
  driver: default
  config:
    - subnet: 172.28.0.0/16
  • internal
internal: true
  • labels
labels:
  com.example.description: "Financial transaction network"
  com.example.department: "Finance"
  com.example.label-with-empty-value: ""
labels:
  - "com.example.description=Financial transaction network"
  - "com.example.department=Finance"
  - "com.example.label-with-empty-value"
  • external
version: "3.9"

services:
  proxy:
    build: ./proxy
    networks:
      - outside
      - default
  app:
    build: ./app
    networks:
      - default

networks:
  outside:
    external: true
version: "3.9"
networks:
  outside:
    external:
      name: actual-name-of-network
  • name
version: "3.9"
networks:
  network1:
    name: my-app-net
version: "3.9"
networks:
  network1:
    external: true
    name: my-app-net

3.3 configs

  • file: The config is created with the contents of the file at the specified path.
  • external: If set to true, specifies that this config has already been created. Docker does not attempt to create it, and if it does not exist, a config not found error occurs.
  • name: The name of the config object in Docker. This field can be used to reference configs that contain special characters. The name is used as is and will not be scoped with the stack name. Introduced in version 3.5 file format.
  • driver and driver_opts: The name of a custom secret driver, and driver-specific options passed as key/value pairs. Introduced in version 3.8 file format, and only supported when using docker stack.
  • template_driver: The name of the templating driver to use, which controls whether and how to evaluate the config payload as a template. If no driver is set, no templating is used. The only driver currently supported is golang, which uses Go templates. Introduced in the version 3.8 file format, and only supported when using docker stack. See the Docker documentation on templated configs for examples.
configs:
  my_first_config:
    file: ./config_data
  my_second_config:
    external: true
configs:
  my_first_config:
    file: ./config_data
  my_second_config:
    external:
      name: redis_config

3.4 secrets

  • file: The secret is created with the contents of the file at the specified path.
  • external: If set to true, specifies that this secret has already been created. Docker does not attempt to create it, and if it does not exist, a secret not found error occurs.
  • name: The name of the secret object in Docker. This field can be used to reference secrets that contain special characters. The name is used as is and will not be scoped with the stack name. Introduced in version 3.5 file format.
  • template_driver: The name of the templating driver to use, which controls whether and how to evaluate the secret payload as a template. If no driver is set, no templating is used. The only driver currently supported is golang, which uses Go templates. Introduced in the version 3.8 file format, and only supported when using docker stack.
secrets:
  my_first_secret:
    file: ./secret_data
  my_second_secret:
    external: true
    name: redis_secret

1 Create a Working Directory

mkdir -p /opt/wordpress
cd /opt/wordpress

2 Create the docker-compose.yml File

version: '3.8'
services:
  db:
    image: mysql:5.7
    container_name: "wordpress_mysql"
    volumes:
      - $PWD/db:/var/lib/mysql
    restart: always
    environment:
      MYSQL_ROOT_PASSWORD: somewordpress
      MYSQL_DATABASE: wordpress
      MYSQL_USER: wordpress
      MYSQL_PASSWORD: wordpress
  wordpress:
    depends_on:
      - db
    image: wordpress:latest
    container_name: "wordpress"
    ports:
      - "80:80"
    restart: always
    environment:
      WORDPRESS_DB_HOST: db:3306
      WORDPRESS_DB_USER: wordpress
      WORDPRESS_DB_PASSWORD: wordpress
      WORDPRESS_DB_NAME: wordpress
    volumes:
      - $PWD/wp-content:/var/www/html/wp-content

3 Start the Containers

docker-compose up -d
docker-compose ps
NAME                COMMAND                  SERVICE             STATUS              PORTS
wordpress           "docker-entrypoint.s…"   wordpress           running             0.0.0.0:80->80/tcp, :::80->80/tcp
wordpress_mysql     "docker-entrypoint.s…"   db                  running             33060/tcp
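
If either container fails to stay in the running state, the service logs are the first place to look; the stack can also be torn down again from the same directory:

docker-compose logs -f db wordpress   # follow both services' logs
docker-compose down                   # stop and remove the containers (the ./db and ./wp-content bind mounts are kept)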

4 Run the Installer

Open http://192.168.2.105 in a browser and follow the installation wizard step by step:

1 Request an Alibaba Cloud SSL Certificate

The domain was registered with Alibaba Cloud, which offers free single-domain SSL certificates, so a certificate (Nginx format) was requested for harbor.iuskye.com. The downloaded files:

/root/certs/server.pem
/root/certs/server.key

2 Install Docker

yum install -y yum-utils device-mapper-persistent-data lvm2 
yum-config-manager --add-repo http://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo 
yum makecache fast
yum install -y docker-ce-20.10.8-3.el7
systemctl start docker
systemctl enable docker

3 Install docker-compose

wget https://oss.iuskye.com/article/2021-08-25/docker-compose-Linux-x86_64
chmod +x docker-compose-Linux-x86_64 
mv docker-compose-Linux-x86_64 /usr/bin/docker-compose
docker-compose --help

4 Install Harbor

wget https://oss.iuskye.com/article/2021-08-25/harbor-offline-installer-v2.3.2.tgz
tar zxf harbor-offline-installer-v2.3.2.tgz 
cd harbor
cp harbor.yml.tmpl harbor.yml

Change a few values:

vim harbor.yml

hostname: harbor.iuskye.com
https:
  # https port for harbor, default is 443
  port: 443
  # The path of cert and key files for nginx
  certificate: /root/certs/server.pem
  private_key: /root/certs/server.key
./install.sh

5 Access via Browser

https://harbor.iuskye.com

The account and password can be found in the harbor.yml file.

6 Test Pushing an Image

Log in to Harbor:

docker login --username=admin harbor.iuskye.com

List the local images:

docker images
REPOSITORY                                               TAG       IMAGE ID       CREATED       SIZE
registry.cn-beijing.aliyuncs.com/iuskye/kube-apiserver   v1.21.3   3d174f00aa39   5 weeks ago   126MB

Re-tag the image:

docker tag registry.cn-beijing.aliyuncs.com/iuskye/kube-apiserver:v1.21.3 harbor.iuskye.com/k8s/kube-apiserver:v1.21.3

docker images
REPOSITORY                                               TAG       IMAGE ID       CREATED       SIZE
harbor.iuskye.com/k8s/kube-apiserver                     v1.21.3   3d174f00aa39   5 weeks ago   126MB
registry.cn-beijing.aliyuncs.com/iuskye/kube-apiserver   v1.21.3   3d174f00aa39   5 weeks ago   126MB

Push the image:

docker push harbor.iuskye.com/k8s/kube-apiserver:v1.21.3
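
To confirm the round trip works, the image can be removed locally and pulled back from Harbor (from any host that has logged in to harbor.iuskye.com):

docker rmi harbor.iuskye.com/k8s/kube-apiserver:v1.21.3
docker pull harbor.iuskye.com/k8s/kube-apiserver:v1.21.3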

The pushed image can now be seen in the Harbor web UI.

1 Overview

These are notes from building a Kubernetes v1.21.3 cluster on three CentOS 7.9 virtual machines. kubeadm, kubelet and kubectl are all installed via yum, and flannel is used as the network plugin.

2 Environment Preparation

Unless stated otherwise, all commands are run as the root user.

2.1 Hardware

IP             hostname        mem   disk   role
192.168.4.120  centos79-node1  4GB   30GB   k8s control-plane node
192.168.4.121  centos79-node2  4GB   30GB   k8s worker node 1
192.168.4.123  centos79-node3  4GB   30GB   k8s worker node 2

2.2 Software

software      version
CentOS        CentOS Linux release 7.9.2009 (Core)
Kubernetes    1.21.3
Docker        20.10.8
Kernel        5.4.138-1.el7.elrepo.x86_64

2.3 Sanity Checks

purpose                                   commands
Ensure all nodes can reach each other     ping -c 3 <ip>
Ensure MAC addresses are unique           ip link / ifconfig -a
Ensure hostnames are unique               check with hostnamectl status, change with hostnamectl set-hostname <hostname>
Ensure the product UUID is unique         dmidecode -s system-uuid / sudo cat /sys/class/dmi/id/product_uuid

Example commands for changing a MAC address:

ifconfig eth0 down
ifconfig eth0 hw ether 00:0c:29:84:fd:a4
ifconfig eth0 up

If the product_uuid values are not unique, consider reinstalling CentOS.

2.4 Required Open Ports

Port checks for centos79-node1:

Protocol   Direction   Port Range    Purpose
TCP        Inbound     6443*         kube-apiserver
TCP        Inbound     2379-2380     etcd API
TCP        Inbound     10250         Kubelet API
TCP        Inbound     10251         kube-scheduler
TCP        Inbound     10252         kube-controller-manager

Port checks for centos79-node2 and centos79-node3:

Protocol   Direction   Port Range     Purpose
TCP        Inbound     10250          Kubelet API
TCP        Inbound     30000-32767    NodePort Services

2.5 Set Up Trust Between Hosts

Configure hosts resolution:

cat >> /etc/hosts <<EOF 
192.168.4.120 centos79-node1
192.168.4.121 centos79-node2
192.168.4.123 centos79-node3 
EOF

Generate an SSH key on centos79-node1 and distribute it to every node:

# Generate the key; just press Enter through the prompts
ssh-keygen -t rsa 
# Copy the new key to each node's trusted list; enter each host's password when prompted
ssh-copy-id root@centos79-node1 
ssh-copy-id root@centos79-node2 
ssh-copy-id root@centos79-node3
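
With passwordless SSH in place, the per-node steps that follow can be driven from centos79-node1; a small helper loop, purely for convenience:

for host in centos79-node1 centos79-node2 centos79-node3; do
  echo "=== $host ==="
  ssh root@"$host" "hostnamectl --static; swapon -s"
done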

2.6 Disable Swap

Swap only kicks in when memory runs low, and disk I/O is far slower than RAM, so swap is disabled for performance. Run on every node:

swapoff -a 
cp /etc/fstab  /etc/fstab.bak
cat /etc/fstab.bak | grep -v swap > /etc/fstab

2.7 Disable SELinux

Disable SELinux, otherwise kubelet may fail with Permission denied when mounting directories. It can be set to permissive or disabled; permissive still logs warnings. Run on every node:

setenforce 0 
sed -i 's/^SELINUX=.*/SELINUX=disabled/' /etc/selinux/config

2.8 Set the Time Zone and Sync Time

timedatectl set-timezone Asia/Shanghai 
systemctl enable --now chronyd

Check the sync status:

timedatectl status
# Write the current UTC time to the hardware clock
timedatectl set-local-rtc 0 
# Restart services that depend on the system time
systemctl restart rsyslog && systemctl restart crond

2.9 Disable the Firewall

systemctl stop firewalld
systemctl disable firewalld

2.10 Adjust Kernel Parameters

cp /etc/sysctl.conf{,.bak}
echo "net.ipv4.ip_forward = 1" >> /etc/sysctl.conf
echo "net.bridge.bridge-nf-call-ip6tables = 1" >> /etc/sysctl.conf
echo "net.bridge.bridge-nf-call-iptables = 1" >> /etc/sysctl.conf
echo "net.ipv6.conf.all.disable_ipv6 = 1" >> /etc/sysctl.conf
echo "net.ipv6.conf.default.disable_ipv6 = 1" >> /etc/sysctl.conf
echo "net.ipv6.conf.lo.disable_ipv6 = 1" >> /etc/sysctl.conf
echo "net.ipv6.conf.all.forwarding = 1"  >> /etc/sysctl.conf
echo "vm.swappiness = 0" >> /etc/sysctl.conf
modprobe br_netfilter
sysctl -p

2.11 Enable IPVS Support

vim /etc/sysconfig/modules/ipvs.modules
#!/bin/bash
ipvs_modules="ip_vs ip_vs_lc ip_vs_wlc ip_vs_rr ip_vs_wrr ip_vs_lblc ip_vs_lblcr ip_vs_dh ip_vs_sh ip_vs_fo ip_vs_nq ip_vs_sed ip_vs_ftp nf_conntrack"
for kernel_module in ${ipvs_modules}; do
  /sbin/modinfo -F filename ${kernel_module} > /dev/null 2>&1
  if [ $? -eq 0 ]; then
    /sbin/modprobe ${kernel_module}
  fi
done
chmod 755 /etc/sysconfig/modules/ipvs.modules 
sh /etc/sysconfig/modules/ipvs.modules 
lsmod | grep ip_vs

2.12 Upgrade the Kernel

Reference link

3 Deploy Docker

Docker must be installed on all nodes.

3.1 Add the Docker yum Repository

# Install required dependencies
yum install -y yum-utils device-mapper-persistent-data lvm2 
# Add the Aliyun docker-ce yum repository
yum-config-manager --add-repo http://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo 
# Rebuild the yum cache
yum makecache fast

3.2 Install Docker

# List available Docker versions
yum list docker-ce.x86_64 --showduplicates | sort -r
已加载插件:fastestmirror
已安装的软件包
可安装的软件包
Loading mirror speeds from cached hostfile
 * elrepo: mirrors.tuna.tsinghua.edu.cn
docker-ce.x86_64            3:20.10.8-3.el7                    docker-ce-stable 
docker-ce.x86_64            3:20.10.8-3.el7                    @docker-ce-stable
docker-ce.x86_64            3:20.10.7-3.el7                    docker-ce-stable 
docker-ce.x86_64            3:20.10.6-3.el7                    docker-ce-stable 
docker-ce.x86_64            3:20.10.5-3.el7                    docker-ce-stable 
docker-ce.x86_64            3:20.10.4-3.el7                    docker-ce-stable 
docker-ce.x86_64            3:20.10.3-3.el7                    docker-ce-stable 
docker-ce.x86_64            3:20.10.2-3.el7                    docker-ce-stable 
docker-ce.x86_64            3:20.10.1-3.el7                    docker-ce-stable 
docker-ce.x86_64            3:20.10.0-3.el7                    docker-ce-stable 
docker-ce.x86_64            3:19.03.9-3.el7                    docker-ce-stable 
docker-ce.x86_64            3:19.03.8-3.el7                    docker-ce-stable 
docker-ce.x86_64            3:19.03.7-3.el7                    docker-ce-stable 
docker-ce.x86_64            3:19.03.6-3.el7                    docker-ce-stable 
docker-ce.x86_64            3:19.03.5-3.el7                    docker-ce-stable 
docker-ce.x86_64            3:19.03.4-3.el7                    docker-ce-stable 
docker-ce.x86_64            3:19.03.3-3.el7                    docker-ce-stable 
docker-ce.x86_64            3:19.03.2-3.el7                    docker-ce-stable 
docker-ce.x86_64            3:19.03.15-3.el7                   docker-ce-stable 
docker-ce.x86_64            3:19.03.14-3.el7                   docker-ce-stable 
docker-ce.x86_64            3:19.03.1-3.el7                    docker-ce-stable 
docker-ce.x86_64            3:19.03.13-3.el7                   docker-ce-stable 
docker-ce.x86_64            3:19.03.12-3.el7                   docker-ce-stable 
docker-ce.x86_64            3:19.03.11-3.el7                   docker-ce-stable 
docker-ce.x86_64            3:19.03.10-3.el7                   docker-ce-stable 
docker-ce.x86_64            3:19.03.0-3.el7                    docker-ce-stable 
docker-ce.x86_64            3:18.09.9-3.el7                    docker-ce-stable 
docker-ce.x86_64            3:18.09.8-3.el7                    docker-ce-stable 
docker-ce.x86_64            3:18.09.7-3.el7                    docker-ce-stable 
docker-ce.x86_64            3:18.09.6-3.el7                    docker-ce-stable 
docker-ce.x86_64            3:18.09.5-3.el7                    docker-ce-stable 
docker-ce.x86_64            3:18.09.4-3.el7                    docker-ce-stable 
docker-ce.x86_64            3:18.09.3-3.el7                    docker-ce-stable 
docker-ce.x86_64            3:18.09.2-3.el7                    docker-ce-stable 
docker-ce.x86_64            3:18.09.1-3.el7                    docker-ce-stable 
docker-ce.x86_64            3:18.09.0-3.el7                    docker-ce-stable 
docker-ce.x86_64            18.06.3.ce-3.el7                   docker-ce-stable 
docker-ce.x86_64            18.06.2.ce-3.el7                   docker-ce-stable 
docker-ce.x86_64            18.06.1.ce-3.el7                   docker-ce-stable 
docker-ce.x86_64            18.06.0.ce-3.el7                   docker-ce-stable 
docker-ce.x86_64            18.03.1.ce-1.el7.centos            docker-ce-stable 
docker-ce.x86_64            18.03.0.ce-1.el7.centos            docker-ce-stable 
docker-ce.x86_64            17.12.1.ce-1.el7.centos            docker-ce-stable 
docker-ce.x86_64            17.12.0.ce-1.el7.centos            docker-ce-stable 
docker-ce.x86_64            17.09.1.ce-1.el7.centos            docker-ce-stable 
docker-ce.x86_64            17.09.0.ce-1.el7.centos            docker-ce-stable 
docker-ce.x86_64            17.06.2.ce-1.el7.centos            docker-ce-stable 
docker-ce.x86_64            17.06.1.ce-1.el7.centos            docker-ce-stable 
docker-ce.x86_64            17.06.0.ce-1.el7.centos            docker-ce-stable 
docker-ce.x86_64            17.03.3.ce-1.el7                   docker-ce-stable 
docker-ce.x86_64            17.03.2.ce-1.el7.centos            docker-ce-stable 
docker-ce.x86_64            17.03.1.ce-1.el7.centos            docker-ce-stable 
docker-ce.x86_64            17.03.0.ce-1.el7.centos            docker-ce-stable
# Install the specified Docker version
yum install -y docker-ce-20.10.8-3.el7

This example installs version 20.10.8; note that the version argument omits the epoch, i.e. the leading "3:".

3.3 Ensure the Required Kernel Modules Load at Boot

lsmod | grep overlay 
lsmod | grep br_netfilter

If the commands above print nothing, or report that the module does not exist, run:

cat > /etc/modules-load.d/docker.conf <<EOF 
overlay 
br_netfilter 
EOF
modprobe overlay 
modprobe br_netfilter

3.4 Make Bridged Traffic Visible to iptables

Run on every node:

cat > /etc/sysctl.d/k8s.conf <<EOF 
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
EOF

sysctl --system

Verify the settings took effect; both commands should return 1.

sysctl -n net.bridge.bridge-nf-call-iptables 
sysctl -n net.bridge.bridge-nf-call-ip6tables

3.5 Configure Docker

mkdir /etc/docker

Set the cgroup driver to systemd (recommended by Kubernetes), limit container log size and set the storage driver; the Docker data directory at the end can be changed as needed:

cat > /etc/docker/daemon.json <<EOF
{
  "exec-opts": ["native.cgroupdriver=systemd"],
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "100m"
  },
  "storage-driver": "overlay2",
  "storage-opts": [
    "overlay2.override_kernel_check=true"
  ],
  "registry-mirrors": ["https://gp8745ui.mirror.aliyuncs.com"],
  "data-root": "/data/docker"
}
EOF

Modify line 13 of the service unit file:

vim /lib/systemd/system/docker.service

ExecStart=/usr/bin/dockerd -H fd:// --containerd=/run/containerd/containerd.sock --default-ulimit core=0:0
systemctl daemon-reload

Enable at boot and start immediately:

systemctl enable --now docker

3.6 Verify Docker Works

# Check docker info and confirm it matches the configuration
docker info
Client:
 Context:    default
 Debug Mode: false
 Plugins:
  app: Docker App (Docker Inc., v0.9.1-beta3)
  buildx: Build with BuildKit (Docker Inc., v0.6.1-docker)
  scan: Docker Scan (Docker Inc., v0.8.0)

Server:
 Containers: 0
  Running: 0
  Paused: 0
  Stopped: 0
 Images: 0
 Server Version: 20.10.8
 Storage Driver: overlay2
  Backing Filesystem: xfs
  Supports d_type: true
  Native Overlay Diff: true
  userxattr: false
 Logging Driver: json-file
 Cgroup Driver: systemd
 Cgroup Version: 1
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runc.v2 io.containerd.runtime.v1.linux runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: e25210fe30a0a703442421b0f60afac609f950a3
 runc version: v1.0.1-0-g4144b63
 init version: de40ad0
 Security Options:
  seccomp
   Profile: default
 Kernel Version: 5.4.138-1.el7.elrepo.x86_64
 Operating System: CentOS Linux 7 (Core)
 OSType: linux
 Architecture: x86_64
 CPUs: 2
 Total Memory: 3.846GiB
 Name: centos79-node1
 ID: GFMO:BC7P:5L4S:JACH:EX5I:L6UM:AINU:A3SE:E6B6:ZLBQ:UBPG:QV7O
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Registry: https://index.docker.io/v1/
 Labels:
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false
# hello-world test
docker run --rm hello-world
# Remove the test image
docker rmi hello-world

3.7 Add a User to the docker Group

Non-root users added to the docker group can run docker commands without sudo.

# Add the user to the docker group
usermod -aG docker <USERNAME> 
# Refresh the docker group membership in the current session
newgrp docker

4 Deploy the Kubernetes Cluster

Unless stated otherwise, run the following steps on every node:

4.1 Add the Kubernetes yum Repository

cat > /etc/yum.repos.d/kubernetes.repo <<EOF
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64/
enabled=1
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF

# Rebuild the yum cache; answer y to accept the GPG keys
yum makecache fast

4.2 Install kubeadm, kubelet and kubectl

  • kubeadm and kubelet must be installed on every node
  • kubectl only needs to be installed on centos79-node1 (worker nodes cannot use kubectl anyway, so it can be skipped there)
# List the available package versions
yum list kubelet kubeadm kubectl --showduplicates | sort -r

version=1.21.3-0
yum install -y kubelet-${version} kubeadm-${version} kubectl-${version}
systemctl enable kubelet

4.3 Configure Command Completion

# Install the bash completion package
yum install bash-completion -y
# Enable kubectl and kubeadm completion; takes effect at the next login
kubectl completion bash >/etc/bash_completion.d/kubectl
kubeadm completion bash > /etc/bash_completion.d/kubeadm

4.4 Configure a Proxy for Docker (skipped here; the Aliyun mirror is used instead)

When kubeadm deploys a Kubernetes cluster it pulls images such as k8s.gcr.io/kube-apiserver from Google's registry k8s.gcr.io by default, which is not reachable from mainland China. If needed, configure a suitable proxy to fetch these images, or pull them from Docker Hub and re-tag them locally.

Briefly, to set up a proxy, edit /lib/systemd/system/docker.service and add lines like the following to the [Service] section, replacing PROXY_SERVER_IP and PROXY_PORT with real values.

Environment="HTTP_PROXY=http://$PROXY_SERVER_IP:$PROXY_PORT"
Environment="HTTPS_PROXY=https://$PROXY_SERVER_IP:$PROXY_PORT"
Environment="NO_PROXY=192.168.4.0/24"

After the change, reload systemd and restart the docker service:

systemctl daemon-reload
systemctl restart docker.service

Note that on a kubeadm-built cluster the core components (kube-apiserver, kube-controller-manager, kube-scheduler, etcd and so on) all run as static Pods, and their images come from the k8s.gcr.io registry by default. Since that registry cannot be reached directly, there are two common workarounds; this guide pulls from a domestic mirror (see section 4.6):

  • Use a proxy service that can reach the registry
  • Use a domestic mirror registry such as gcr.azk8s.cn/google_containers or registry.aliyuncs.com/google_containers (based on testing, no longer usable as of v1.22.0)

4.5 List the Images Required by the Target Kubernetes Version

kubeadm config images list --kubernetes-version v1.21.3
k8s.gcr.io/kube-apiserver:v1.21.3
k8s.gcr.io/kube-controller-manager:v1.21.3
k8s.gcr.io/kube-scheduler:v1.21.3
k8s.gcr.io/kube-proxy:v1.21.3
k8s.gcr.io/pause:3.4.1
k8s.gcr.io/etcd:3.4.13-0
k8s.gcr.io/coredns/coredns:v1.8.0

4.6 Pull the Images

vim pullimages.sh
#!/bin/bash
# pull images

ver=v1.21.3
registry=registry.cn-hangzhou.aliyuncs.com/google_containers
images=`kubeadm config images list --kubernetes-version=$ver |awk -F '/' '{print $2}'`

for image in $images
do
if [ $image != coredns ];then
    docker pull ${registry}/$image
    if [ $? -eq 0 ];then
        docker tag ${registry}/$image k8s.gcr.io/$image
        docker rmi ${registry}/$image
    else
        echo "ERROR: failed to pull image $image"
    fi
else
    docker pull coredns/coredns:1.8.0
    docker tag coredns/coredns:1.8.0  k8s.gcr.io/coredns/coredns:v1.8.0
    docker rmi coredns/coredns:1.8.0
fi
done
chmod +x pullimages.sh && ./pullimages.sh

When the pull finishes, run docker images to check the result:

docker images

REPOSITORY                           TAG        IMAGE ID       CREATED         SIZE
k8s.gcr.io/kube-apiserver            v1.21.3    3d174f00aa39   3 weeks ago     126MB
k8s.gcr.io/kube-scheduler            v1.21.3    6be0dc1302e3   3 weeks ago     50.6MB
k8s.gcr.io/kube-proxy                v1.21.3    adb2816ea823   3 weeks ago     103MB
k8s.gcr.io/kube-controller-manager   v1.21.3    bc2bb319a703   3 weeks ago     120MB
k8s.gcr.io/pause                     3.4.1      0f8457a4c2ec   6 months ago    683kB
k8s.gcr.io/coredns/coredns           v1.8.0     296a6d5035e2   9 months ago    42.5MB
k8s.gcr.io/etcd                      3.4.13-0   0369cf4303ff   11 months ago   253MB

Export the images and copy them to the other nodes:

docker save $(docker images | grep -v REPOSITORY | awk 'BEGIN{OFS=":";ORS=" "}{print $1,$2}') -o k8s-images.tar

scp k8s-images.tar root@centos79-node2:~
scp k8s-images.tar root@centos79-node3:~

Import them on the other nodes:

docker load -i k8s-images.tar

4.7 Set the Default cgroup Driver for kubelet

mkdir /var/lib/kubelet

cat > /var/lib/kubelet/config.yaml <<EOF
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
cgroupDriver: systemd
EOF

4.8 Initialize the Master Node

This step is performed on centos79-node1 only.

4.8.1 Generate the kubeadm Init Configuration

[Optional] Only needed when the init configuration has to be customized.

kubeadm config print init-defaults > kubeadm-config.yaml

Edit the configuration file:

localAPIEndpoint:
  advertiseAddress: 1.2.3.4
# replace with:
localAPIEndpoint:
  advertiseAddress: 192.168.4.120
  name: centos79-node1
kubernetesVersion: 1.21.0
networking:
  dnsDomain: cluster.local
  serviceSubnet: 10.96.0.0/12
# replace with:
kubernetesVersion: 1.21.3
networking:
  podSubnet: "10.244.0.0/16"
  serviceSubnet: 10.96.0.0/12

4.8.2 Verify the Environment with a Preflight Check

kubeadm init phase preflight
I0810 13:46:36.581916   20512 version.go:254] remote version is much newer: v1.22.0; falling back to: stable-1.21
[preflight] Running pre-flight checks
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'

4.8.3 Initialize the Master

10.244.0.0/16 is the pod CIDR that flannel expects by default; the value to use depends on the network plugin.

kubeadm init --config=kubeadm-config.yaml --ignore-preflight-errors=2 --upload-certs | tee kubeadm-init.log

The output looks like this:

W0810 14:55:25.741990   13062 strict.go:54] error unmarshaling configuration schema.GroupVersionKind{Group:"kubeadm.k8s.io", Version:"v1beta2", Kind:"InitConfiguration"}: error unmarshaling JSON: while decoding JSON: json: unknown field "name"
[init] Using Kubernetes version: v1.21.3
[preflight] Running pre-flight checks
        [WARNING Hostname]: hostname "node" could not be reached
        [WARNING Hostname]: hostname "node": lookup node on 223.5.5.5:53: no such host
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "ca" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local node] and IPs [10.96.0.1 192.168.4.120]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "front-proxy-ca" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "etcd/ca" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [localhost node] and IPs [192.168.4.120 127.0.0.1 ::1]
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [localhost node] and IPs [192.168.4.120 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Starting the kubelet
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[apiclient] All control plane components are healthy after 17.503592 seconds
[upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet] Creating a ConfigMap "kubelet-config-1.21" in namespace kube-system with the configuration for the kubelets in the cluster
[upload-certs] Storing the certificates in Secret "kubeadm-certs" in the "kube-system" Namespace
[upload-certs] Using certificate key:
fceedfd1392b27957c5f6345661d62dc09359b61e07f76f444a9e3095022dab4
[mark-control-plane] Marking the node node as control-plane by adding the labels: [node-role.kubernetes.io/master(deprecated) node-role.kubernetes.io/control-plane node.kubernetes.io/exclude-from-external-load-balancers]
[mark-control-plane] Marking the node node as control-plane by adding the taints [node-role.kubernetes.io/master:NoSchedule]
[bootstrap-token] Using token: abcdef.0123456789abcdef
[bootstrap-token] Configuring bootstrap tokens, cluster-info ConfigMap, RBAC Roles
[bootstrap-token] configured RBAC rules to allow Node Bootstrap tokens to get nodes
[bootstrap-token] configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstrap-token] configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstrap-token] configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[bootstrap-token] Creating the "cluster-info" ConfigMap in the "kube-public" namespace
[kubelet-finalize] Updating "/etc/kubernetes/kubelet.conf" to point to a rotatable kubelet client certificate and key
[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy

Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

Alternatively, if you are the root user, you can run:

  export KUBECONFIG=/etc/kubernetes/admin.conf

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join 192.168.4.120:6443 --token abcdef.0123456789abcdef \
        --discovery-token-ca-cert-hash sha256:6ad6978a7e72cfae06c836886276634c87bedfa8ff02e44f574ffb96435b4c2b

4.8.4 Grant kubectl Access to the Regular User of the Cluster

su - iuskye
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/admin.conf
sudo chown $(id -u):$(id -g) $HOME/.kube/admin.conf
echo "export KUBECONFIG=$HOME/.kube/admin.conf" >> ~/.bashrc
exit

4.8.5 Configure Authentication on the master

echo 'export KUBECONFIG=/etc/kubernetes/admin.conf' >> /etc/profile 
. /etc/profile

Without this configuration, kubectl reports: The connection to the server localhost:8080 was refused - did you specify the right host or port?
At this point the master node has been initialized successfully, but no network plugin is installed yet, so it cannot communicate with the other nodes.
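
A quick, illustrative way to confirm which API endpoint kubectl is now talking to:

kubectl config view --minify -o jsonpath='{.clusters[0].cluster.server}'; echo
kubectl cluster-info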

4.8.6 Install the Network Plugin

Using flannel as an example:

curl -o kube-flannel.yml https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
kubectl apply -f kube-flannel.yml    # The image pull here is very slow; pull it manually first and retry a few times if needed
docker pull quay.io/coreos/flannel:v0.14.0
kubectl apply -f kube-flannel.yml

4.8.7 Check the Node Status on centos79-node1

kubectl get nodes
NAME             STATUS     ROLES                  AGE     VERSION
centos79-node2   NotReady   <none>                 7m29s   v1.21.3
centos79-node3   NotReady   <none>                 7m15s   v1.21.3
node             Ready      control-plane,master   33m     v1.21.3

If STATUS shows NotReady, run kubectl describe node centos79-node2 to see the details; on low-spec servers it takes longer for a node to reach the Ready state.
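
A few illustrative commands for narrowing down why a node stays NotReady (the node name is an example):

## Node conditions usually show whether the network/CNI is ready
kubectl describe node centos79-node2 | grep -A 10 'Conditions:'
## Check that the flannel and kube-proxy pods are running on that node
kubectl get pods -n kube-system -o wide | grep -E 'flannel|kube-proxy'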

4.9 Initialize the Worker Nodes and Join Them to the Cluster

4.9.1 Get the Join Command

On centos79-node1, run the command that creates a new token:

kubeadm token create --print-join-command

The join command is printed along with it:

kubeadm join 192.168.4.120:6443 --token 8dj8i5.6jua6ogqvve1ci5u --discovery-token-ca-cert-hash sha256:6ad6978a7e72cfae06c836886276634c87bedfa8ff02e44f574ffb96435b4c2b

Alternatively, the token printed by the kubeadm init output on the master (shown above) can be used.
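
If the join command is lost, the token list and the CA certificate hash can also be recovered on the master (standard kubeadm/openssl procedure; paths assume the default kubeadm layout):

## List existing bootstrap tokens
kubeadm token list
## Recompute the --discovery-token-ca-cert-hash value from the cluster CA
openssl x509 -pubkey -in /etc/kubernetes/pki/ca.crt \
  | openssl rsa -pubin -outform der 2>/dev/null \
  | openssl dgst -sha256 -hex | sed 's/^.* //'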

4.9.2 Run the Join Command on Each Worker Node

kubeadm join 192.168.4.120:6443 --token 8dj8i5.6jua6ogqvve1ci5u --discovery-token-ca-cert-hash sha256:6ad6978a7e72cfae06c836886276634c87bedfa8ff02e44f574ffb96435b4c2b
[preflight] Running pre-flight checks
        [WARNING Service-Kubelet]: kubelet service is not enabled, please run 'systemctl enable kubelet.service'
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Starting the kubelet
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...

This node has joined the cluster:
* Certificate signing request was sent to apiserver and a response was received.
* The Kubelet was informed of the new secure connection details.

Run 'kubectl get nodes' on the control-plane to see this node join the cluster.

4.10 Check the Cluster Node Status

kubectl get nodes
NAME             STATUS     ROLES                  AGE     VERSION
centos79-node2   NotReady   <none>                 7m29s   v1.21.3
centos79-node3   NotReady   <none>                 7m15s   v1.21.3
node             Ready      control-plane,master   33m     v1.21.3

The worker nodes show NotReady at first; don't worry, they become Ready after a few minutes:

NAME             STATUS   ROLES                  AGE     VERSION
centos79-node2   Ready    <none>                 8m29s   v1.21.3
centos79-node3   Ready    <none>                 8m15s   v1.21.3
node             Ready    control-plane,master   34m     v1.21.3

4.11 Deploy the Dashboard

4.11.1 Deployment

curl -o recommended.yaml https://raw.githubusercontent.com/kubernetes/dashboard/v2.3.1/aio/deploy/recommended.yaml

By default the Dashboard is only reachable from inside the cluster; change the Service type to NodePort to expose it externally:

vi recommended.yaml

kind: Service
apiVersion: v1
metadata:
  labels:
    k8s-app: kubernetes-dashboard
  name: kubernetes-dashboard
  namespace: kubernetes-dashboard
spec:
  ports:
    - port: 443
      targetPort: 8443
      nodePort: 30001
  type: NodePort
  selector:
    k8s-app: kubernetes-dashboard
kubectl apply -f recommended.yaml    # The image pulls here are very slow; pull them manually first and retry a few times if needed
docker pull kubernetesui/dashboard:v2.3.1
docker pull kubernetesui/metrics-scraper:v1.0.6

kubectl apply -f recommended.yaml
kubectl get pods,svc -n kubernetes-dashboard
NAME                                             READY   STATUS              RESTARTS   AGE
pod/dashboard-metrics-scraper-856586f554-nb68k   0/1     ContainerCreating   0          52s
pod/kubernetes-dashboard-67484c44f6-shtz7        0/1     ContainerCreating   0          52s

NAME                                TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)         AGE
service/dashboard-metrics-scraper   ClusterIP   10.96.188.208   <none>        8000/TCP        52s
service/kubernetes-dashboard        NodePort    10.97.164.152   <none>        443:30001/TCP   53s

The pods are still being created (ContainerCreating); check again a little later:

NAME                                             READY   STATUS    RESTARTS   AGE
pod/dashboard-metrics-scraper-856586f554-nb68k   1/1     Running   0          2m11s
pod/kubernetes-dashboard-67484c44f6-shtz7        1/1     Running   0          2m11s

NAME                                TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)         AGE
service/dashboard-metrics-scraper   ClusterIP   10.96.188.208   <none>        8000/TCP        2m11s
service/kubernetes-dashboard        NodePort    10.97.164.152   <none>        443:30001/TCP   2m12s

Access URL: https://NodeIP:30001. Use Firefox here; Chrome refuses to open sites whose SSL certificate is not trusted.

Create a service account and bind it to the built-in cluster-admin cluster role:

kubectl create serviceaccount dashboard-admin -n kube-system
kubectl create clusterrolebinding dashboard-admin --clusterrole=cluster-admin --serviceaccount=kube-system:dashboard-admin
kubectl describe secrets -n kube-system $(kubectl -n kube-system get secret | awk '/dashboard-admin/{print $1}')
Name:         dashboard-admin-token-q2kjk
Namespace:    kube-system
Labels:       <none>
Annotations:  kubernetes.io/service-account.name: dashboard-admin
              kubernetes.io/service-account.uid: fa1e812e-4487-4288-a444-d4ba49711366

Type:  kubernetes.io/service-account-token

Data
====
ca.crt:     1066 bytes
namespace:  11 bytes
token:      eyJhbGciOiJSUzI1NiIsImtpZCI6IlJ4OWQ5ZUJ5MDlEMkdQSnBYeUtXZDg5M2ZjX090RkhPOUtQZ3JTc1B0Z0UifQ.eyJpc3MiOiJrdWJlcm5ldGVzL3NlcnZpY2VhY2NvdW50Iiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9uYW1lc3BhY2UiOiJrdWJlLXN5c3RlbSIsImt1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQvc2VjcmV0Lm5hbWUiOiJkYXNoYm9hcmQtYWRtaW4tdG9rZW4tcTJramsiLCJrdWJlcm5ldGVzLmlvL3NlcnZpY2VhY2NvdW50L3NlcnZpY2UtYWNjb3VudC5uYW1lIjoiZGFzaGJvYXJkLWFkbWluIiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9zZXJ2aWNlLWFjY291bnQudWlkIjoiZmExZTgxMmUtNDQ4Ny00Mjg4LWE0NDQtZDRiYTQ5NzExMzY2Iiwic3ViIjoic3lzdGVtOnNlcnZpY2VhY2NvdW50Omt1YmUtc3lzdGVtOmRhc2hib2FyZC1hZG1pbiJ9.nCpdYK5SjhAI8wqDP6QEDx9dyD4n5yCrx8eZ3R5XkR99vo8diMFdL_6VHtiQekQpwVc7vCkQ0qYhpaGjD2Pzn4EpU44UhQFH5EpG4L5zYvQf6QHBgaZJ68dQe1nMUUMto2jbTq8lEBt3FsJT_If6TkfeHtwfR-X8D2Nm1M8E153hXUPycSbGZImPeE-JVqRC3IJuhv6xgYi-EE08va2d6kDd4MBm-XdCm7QweG5cZaCQAP1qqF8kPfNZzelAGDe6F8V2caxAUECpNE6e4ZW2-h0D7Hp4bZpM4hZZpVr6WCfxuKXwPd-2srorjLi8h_lqSdZCJKJ56TpsED6nkBRffg

The token obtained:

eyJhbGciOiJSUzI1NiIsImtpZCI6IlJ4OWQ5ZUJ5MDlEMkdQSnBYeUtXZDg5M2ZjX090RkhPOUtQZ3JTc1B0Z0UifQ.eyJpc3MiOiJrdWJlcm5ldGVzL3NlcnZpY2VhY2NvdW50Iiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9uYW1lc3BhY2UiOiJrdWJlLXN5c3RlbSIsImt1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQvc2VjcmV0Lm5hbWUiOiJkYXNoYm9hcmQtYWRtaW4tdG9rZW4tcTJramsiLCJrdWJlcm5ldGVzLmlvL3NlcnZpY2VhY2NvdW50L3NlcnZpY2UtYWNjb3VudC5uYW1lIjoiZGFzaGJvYXJkLWFkbWluIiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9zZXJ2aWNlLWFjY291bnQudWlkIjoiZmExZTgxMmUtNDQ4Ny00Mjg4LWE0NDQtZDRiYTQ5NzExMzY2Iiwic3ViIjoic3lzdGVtOnNlcnZpY2VhY2NvdW50Omt1YmUtc3lzdGVtOmRhc2hib2FyZC1hZG1pbiJ9.nCpdYK5SjhAI8wqDP6QEDx9dyD4n5yCrx8eZ3R5XkR99vo8diMFdL_6VHtiQekQpwVc7vCkQ0qYhpaGjD2Pzn4EpU44UhQFH5EpG4L5zYvQf6QHBgaZJ68dQe1nMUUMto2jbTq8lEBt3FsJT_If6TkfeHtwfR-X8D2Nm1M8E153hXUPycSbGZImPeE-JVqRC3IJuhv6xgYi-EE08va2d6kDd4MBm-XdCm7QweG5cZaCQAP1qqF8kPfNZzelAGDe6F8V2caxAUECpNE6e4ZW2-h0D7Hp4bZpM4hZZpVr6WCfxuKXwPd-2srorjLi8h_lqSdZCJKJ56TpsED6nkBRffg

Note that the token may get wrapped onto several lines when pasted; if that happens, join it back into a single line in a text editor first.

Log in to the Dashboard with this token.
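
A sketch that prints the token as a single line straight from the Secret, which avoids the line-wrapping problem (assumes the dashboard-admin service account created above):

kubectl -n kube-system get secret \
  $(kubectl -n kube-system get secret | awk '/dashboard-admin/{print $1}') \
  -o jsonpath='{.data.token}' | base64 -d; echo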

4.11.2 Login Page

4.11.3 Pods

4.11.4 Service

4.11.5 Config Maps

4.11.6 Secrets

4.11.7 Cluster Role Bindings

4.11.8 NameSpace

5 Resources Provided by the Author

docker pull registry.cn-beijing.aliyuncs.com/iuskye/kube-apiserver:v1.21.3
docker pull registry.cn-beijing.aliyuncs.com/iuskye/kube-scheduler:v1.21.3
docker pull registry.cn-beijing.aliyuncs.com/iuskye/kube-proxy:v1.21.3
docker pull registry.cn-beijing.aliyuncs.com/iuskye/kube-controller-manager:v1.21.3
docker pull registry.cn-beijing.aliyuncs.com/iuskye/coredns:v1.8.0
docker pull registry.cn-beijing.aliyuncs.com/iuskye/etcd:3.4.13-0
docker pull registry.cn-beijing.aliyuncs.com/iuskye/pause:3.4.1
docker pull registry.cn-beijing.aliyuncs.com/iuskye/dashboard:v2.3.1
docker pull registry.cn-beijing.aliyuncs.com/iuskye/metrics-scraper:v1.0.6
docker pull registry.cn-beijing.aliyuncs.com/iuskye/flannel:v0.14.0

Retag:

docker tag registry.cn-beijing.aliyuncs.com/iuskye/kube-apiserver:v1.21.3 k8s.gcr.io/kube-apiserver:v1.21.3
docker tag registry.cn-beijing.aliyuncs.com/iuskye/kube-scheduler:v1.21.3 k8s.gcr.io/kube-scheduler:v1.21.3
docker tag registry.cn-beijing.aliyuncs.com/iuskye/kube-proxy:v1.21.3 k8s.gcr.io/kube-proxy:v1.21.3
docker tag registry.cn-beijing.aliyuncs.com/iuskye/kube-controller-manager:v1.21.3 k8s.gcr.io/kube-controller-manager:v1.21.3
docker tag registry.cn-beijing.aliyuncs.com/iuskye/coredns:v1.8.0 k8s.gcr.io/coredns/coredns:v1.8.0
docker tag registry.cn-beijing.aliyuncs.com/iuskye/etcd:3.4.13-0 k8s.gcr.io/etcd:3.4.13-0
docker tag registry.cn-beijing.aliyuncs.com/iuskye/pause:3.4.1 k8s.gcr.io/pause:3.4.1
docker tag registry.cn-beijing.aliyuncs.com/iuskye/dashboard:v2.3.1 kubernetesui/dashboard:v2.3.1
docker tag registry.cn-beijing.aliyuncs.com/iuskye/metrics-scraper:v1.0.6 kubernetesui/metrics-scraper:v1.0.6
docker tag registry.cn-beijing.aliyuncs.com/iuskye/flannel:v0.14.0 quay.io/coreos/flannel:v0.14.0

6 References

1 Official Architecture Diagram

Prometheus is an open-source systems monitoring and alerting framework. It was created in 2012 by former Google engineers, developed as a community open-source project, and officially released in 2015. In 2016, Prometheus joined the Cloud Native Computing Foundation.

2 Components

2.1 Prometheus Server

Collects and stores time-series data. Prometheus Server is the core component of Prometheus: it is responsible for scraping, storing, and querying monitoring data. Scrape targets can be managed through static configuration or discovered dynamically via service discovery, and metrics are pulled from those targets. Prometheus Server is itself a time-series database and stores the scraped samples on local disk, organized by time series. Finally, it exposes the PromQL query language for querying and analyzing the data.
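
For example, PromQL queries can be issued through the HTTP API exposed by Prometheus Server. A minimal sketch, assuming Prometheus is listening on localhost:9090:

## Instant query: the up metric for every scrape target
curl -s 'http://localhost:9090/api/v1/query?query=up'
## Range query over the last 5 minutes at 15s resolution (the API also accepts POST form parameters)
curl -s 'http://localhost:9090/api/v1/query_range' \
  --data-urlencode 'query=rate(prometheus_http_requests_total[1m])' \
  --data-urlencode "start=$(date -d '5 minutes ago' +%s)" \
  --data-urlencode "end=$(date +%s)" \
  --data-urlencode 'step=15'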

2.2 Exporter

Exposes metrics of existing third-party services to Prometheus. An exporter exposes a metrics endpoint over HTTP; Prometheus Server scrapes that endpoint to collect the monitoring data it needs.
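
The endpoint exposed by an exporter can be inspected directly with curl (assuming a node_exporter on localhost:9100):

curl -s http://localhost:9100/metrics | head -n 20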

2.3 Push Gateway

Mainly used for short-lived jobs. Because such jobs may disappear before Prometheus gets a chance to pull their metrics, they push their metrics to the Pushgateway instead, and Prometheus scrapes them from there.
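
A minimal sketch of such a push, using the Pushgateway text-format API (the Pushgateway address, job name, and metric are illustrative):

cat <<'EOF' | curl --data-binary @- http://pushgateway.example.com:9091/metrics/job/backup_job/instance/node1
# TYPE backup_duration_seconds gauge
backup_duration_seconds 42.5
EOF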

2.4 Grafana

A third-party visualization tool. It integrates with Prometheus over HTTP and lets you build dashboards from PromQL queries.

2.5 AlertManager

After receiving alerts from Prometheus Server, it deduplicates, groups, and routes them to the configured receivers. Common receivers include email, DingTalk, WeCom (enterprise WeChat), PagerDuty, and so on.

2.6 Client Library

Client libraries generate metrics for the services you want to monitor and expose them to Prometheus Server; when Prometheus Server pulls, the current metric values are returned directly.

3 Prometheus Workflow

  • Metric collection: Prometheus Server collects metrics by pulling them, either directly from the targets or via a Pushgateway as an intermediary, to which short-lived targets first push their data;
  • Metric processing: Prometheus Server stores the collected data in its own database or in a third-party database;
  • Metric display: Prometheus Server exposes an HTTP interface used by its built-in UI or by third-party visualization systems;
  • Metric alerting: Prometheus Server pushes alerts to Alertmanager, which processes them through silencing, inhibition, grouping, and dispatching before notifying the receivers.

4 The Four Prometheus Metric Types

Counter

A counter only increases and never decreases, e.g. total uptime or the cumulative number of HTTP requests. This type is usually combined with rate() to obtain its rate of change over a time window; rate() also compensates for counter resets (for example, after a process restart).
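
A typical counter query, issued here through the HTTP API (the metric name is an example):

curl -s 'http://localhost:9090/api/v1/query' \
  --data-urlencode 'query=rate(http_requests_total[5m])'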

Gauge

A gauge can go up and down, e.g. CPU usage; most monitoring data is of this type.

Summary

Quantiles are computed on the client side. Summary and Histogram are both advanced metric types used to describe how sample values are distributed.

Histogram

Quantiles are computed on the server side; compared with a Summary it is cheaper to produce, and it records how many samples fall into each bucket (interval).
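
A typical histogram query computes a quantile on the server side with histogram_quantile() (the metric name is an example):

curl -s 'http://localhost:9090/api/v1/query' \
  --data-urlencode 'query=histogram_quantile(0.95, sum by (le) (rate(http_request_duration_seconds_bucket[5m])))'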

5 Service Discovery

5.1 File-based Service Discovery

Create a targets.json file and keep all targets in it; whenever the targets change, only targets.json has to be updated. The format is as follows:

[
  {
    "targets": [ "localhost:8080"],
    "labels": {
      "env": "localhost",
      "job": "cadvisor"
    }
  }
]

When starting Prometheus, point --config.file at a prometheus.yml such as the following:

global:
  scrape_interval: 15s
  scrape_timeout: 10s
  evaluation_interval: 15s
scrape_configs:
  - job_name: 'file_ds'
    file_sd_configs:
      - files:
          - targets.json
        refresh_interval: 1m

Here refresh_interval controls how often targets.json is re-read.

This approach avoids editing prometheus.yml, but the problem merely turns into maintaining targets.json by hand, so the root issue is not solved.

5.2 Consul-based Service Discovery

Consul itself is not introduced in detail here; it is enough to know that it is a middleware for service configuration and discovery.

As mentioned for file-based service discovery, updating targets still requires editing targets.json, so the operational burden does not really go away. With Consul-based service discovery, nodes register themselves with Consul, and Prometheus, integrated with Consul, fetches target information from it over HTTP. The integration is configured in prometheus.yml as follows:

  - job_name: consul_sd
    metrics_path: /metrics
    scheme: http
    consul_sd_configs:
      - server: ${consul_ip}:8500
    relabel_configs:
      # drop Consul's own service
      - source_labels: [__meta_consul_service]
        regex: '^consul$'
        action: drop
      # copy the Consul service name into the job label
      - source_labels: [__meta_consul_service]
        target_label: "job"

Drawbacks:

  • An additional middleware component, Consul, has to be introduced
  • Targets must integrate with Consul and register themselves on startup (see the registration sketch below)
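
A sketch of such a registration using Consul's HTTP API (service name, address, and port are examples; ${consul_ip} as in the scrape configuration above):

curl -X PUT "http://${consul_ip}:8500/v1/agent/service/register" \
  -H 'Content-Type: application/json' \
  -d '{
        "Name": "node-exporter",
        "Address": "192.168.0.235",
        "Port": 9100,
        "Tags": ["prometheus"]
      }'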

5.3 Other Dynamic Service Discovery Mechanisms

  • azure_sd_config: pulls targets from Azure VMs
  • dns_sd_config: DNS-based service discovery

5.4 Comparison

Method                          | Pros            | Cons                                                                                                              | Dependencies   | Recommended?
File-based service discovery    | Simple          | Does not solve the underlying problem                                                                             | None           | Not recommended
Consul-based service discovery  | Easy to operate | Targets must integrate an SDK and register themselves with Consul                                                 | Consul cluster | Recommended
Other service discovery         | Easy to operate | Depends on the underlying resource/platform environment                                                           | None           | Not recommended
Pushgateway                     | Easy to operate | Targets must integrate an SDK and push metrics periodically; single point of failure, needs e.g. nginx for multi-instance deployment | None | Worth considering

6 AlertManager Configuration Overview

Grouping

Groups alert messages so that a flood of incoming alerts does not produce an excessive number of notifications.

Silencing

According to configured rules, notifications are withheld for a period of time and dispatched once the time threshold is reached.
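
Silences can also be created from the command line with amtool (a sketch; the matcher, duration, and Alertmanager address are examples):

amtool silence add alertname=NodeDown \
  --comment="planned maintenance" --duration=2h \
  --alertmanager.url=http://127.0.0.1:9093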

Inhibition

One alert is inhibited by another: once the inhibiting alert has been sent, the inhibited alert is not dispatched.

Management API

HTTP Method | Path       | Description
GET         | /-/healthy | Returns 200 whenever the Alertmanager is healthy.
GET         | /-/ready   | Returns 200 whenever the Alertmanager is ready to serve traffic.
PUT         | /-/reload  | Triggers a reload of the Alertmanager configuration file.

0 Background

The setup below reuses the hosts of the Kubernetes cluster, so OS-level tuning and configuration are not repeated here.

Note: unless stated otherwise, the following steps are executed on all three machines.

1 Add a New Disk

Add a new disk to the virtual machine. Since I had already added one disk earlier, the new device shows up as sdc:

/dev/sdc

2 Configure the yum Repositories

vim /etc/yum.repos.d/ceph.repo

[Ceph]
name=Ceph packages
baseurl=https://mirrors.aliyun.com/ceph/rpm-nautilus/el7/x86_64/
gpgcheck=0

[Ceph-noarch]
name=Ceph noarch packages
baseurl=https://mirrors.aliyun.com/ceph/rpm-nautilus/el7/noarch/
gpgcheck=0

vim /etc/yum.repos.d/epel.repo

[epel]
name=Extra Packages for Enterprise Linux 7 - $basearch
#baseurl=http://download.fedoraproject.org/pub/epel/7/$basearch
metalink=https://mirrors.fedoraproject.org/metalink?repo=epel-7&arch=$basearch
failovermethod=priority
enabled=1
gpgcheck=1
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-EPEL-7

[epel-debuginfo]
name=Extra Packages for Enterprise Linux 7 - $basearch - Debug
#baseurl=http://download.fedoraproject.org/pub/epel/7/$basearch/debug
metalink=https://mirrors.fedoraproject.org/metalink?repo=epel-debug-7&arch=$basearch
failovermethod=priority
enabled=0
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-EPEL-7
gpgcheck=1

[epel-source]
name=Extra Packages for Enterprise Linux 7 - $basearch - Source
#baseurl=http://download.fedoraproject.org/pub/epel/7/SRPMS
metalink=https://mirrors.fedoraproject.org/metalink?repo=epel-source-7&arch=$basearch
failovermethod=priority
enabled=0
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-EPEL-7
gpgcheck=1

3 Create a Regular User and Configure Passwordless sudo

groupadd -g 3000 ceph
useradd -u 3000 -g ceph ceph
echo "ceph" | passwd --stdin ceph
echo "ceph ALL = (root) NOPASSWD:ALL" | sudo tee /etc/sudoers.d/ceph
chmod 0440 /etc/sudoers.d/ceph

4 Configure Passwordless SSH Login for the New User

Run on the master node:

su - ceph
ssh-keygen
ssh-copy-id ceph@k8s-master
ssh-copy-id ceph@k8s-node1
ssh-copy-id ceph@k8s-node2

5 Install the Software

sudo su - root    # on the master, switch from the ceph user back to root
yum install ceph-deploy -y
wget -O /etc/pki/rpm-gpg/RPM-GPG-KEY-EPEL-7 https://archive.fedoraproject.org/pub/epel/RPM-GPG-KEY-EPEL-7
yum install python-pip -y
yum install ceph ceph-osd ceph-mds ceph-mon ceph-radosgw -y
yum install ntp -y
systemctl start ntpd
systemctl enable ntpd

Tips: the time synchronization service is installed to prevent the cluster health status from later changing from OK to WARN because of clock skew between nodes.

6 Create the Cluster

Run on the master node:

su - ceph
mkdir cephcluster
cd cephcluster/
# initialize the Ceph cluster
ceph-deploy new --cluster-network 192.168.0.0/24 --public-network 192.168.0.0/24 k8s-master k8s-node1 k8s-node2
# bootstrap the monitor service
ceph-deploy mon create-initial
# push the configuration and admin keyring to all three nodes
ceph-deploy admin k8s-master k8s-node1 k8s-node2
sudo chown -R ceph:ceph /etc/ceph
chown -R ceph:ceph /etc/ceph    # run on the other nodes

Check the status:

ceph -s
  cluster:
    id:     14450b7d-84ce-40c4-8a1e-46af50457fc6
    health: HEALTH_OK

  services:
    mon: 3 daemons, quorum k8s-master,k8s-node1,k8s-node2 (age 65s)
    mgr: no daemons active
    osd: 0 osds: 0 up, 0 in

  data:
    pools:   0 pools, 0 pgs
    objects: 0 objects, 0 B
    usage:   0 B used, 0 B / 0 B avail
    pgs:

7 Configure the mgr Service

Run on the master node:

ceph-deploy mgr create k8s-master k8s-node1 k8s-node2

Check the status:

ceph -s
  cluster:
    id:     14450b7d-84ce-40c4-8a1e-46af50457fc6
    health: HEALTH_WARN
            OSD count 0 < osd_pool_default_size 3

  services:
    mon: 3 daemons, quorum k8s-master,k8s-node1,k8s-node2 (age 99s)
    mgr: k8s-master(active, since 23s), standbys: k8s-node2, k8s-node1
    osd: 0 osds: 0 up, 0 in

  data:
    pools:   0 pools, 0 pgs
    objects: 0 objects, 0 B
    usage:   0 B used, 0 B / 0 B avail
    pgs:

8 Configure the osd Service

Run on the master node:

ceph-deploy osd create --data /dev/sdc k8s-master
ceph-deploy osd create --data /dev/sdc k8s-node1
ceph-deploy osd create --data /dev/sdc k8s-node2

9 Configure the mon Service

Run on the master node:

First check the status of the mon services in the Ceph cluster:

ceph mon stat

e1: 3 mons at {k8s-master=[v2:192.168.0.234:3300/0,v1:192.168.0.234:6789/0],k8s-node1=[v2:192.168.0.235:3300/0,v1:192.168.0.235:6789/0],k8s-node2=[v2:192.168.0.236:3300/0,v1:192.168.0.236:6789/0]}, election epoch 10, leader 0 k8s-master, quorum 0,1,2 k8s-master,k8s-node1,k8s-node2
ceph mon_status --format json-pretty

{
    "name": "k8s-master",
    "rank": 0,
    "state": "leader",
    "election_epoch": 10,
    "quorum": [
        0,
        1,
        2
    ],
    "quorum_age": 495,
    "features": {
        "required_con": "2449958747315912708",
        "required_mon": [
            "kraken",
            "luminous",
            "mimic",
            "osdmap-prune",
            "nautilus"
        ],
        "quorum_con": "4611087854035861503",
        "quorum_mon": [
            "kraken",
            "luminous",
            "mimic",
            "osdmap-prune",
            "nautilus"
        ]
    },
    "outside_quorum": [],
    "extra_probe_peers": [
        {
            "addrvec": [
                {
                    "type": "v2",
                    "addr": "192.168.0.235:3300",
                    "nonce": 0
                },
                {
                    "type": "v1",
                    "addr": "192.168.0.235:6789",
                    "nonce": 0
                }
            ]
        },
        {
            "addrvec": [
                {
                    "type": "v2",
                    "addr": "192.168.0.236:3300",
                    "nonce": 0
                },
                {
                    "type": "v1",
                    "addr": "192.168.0.236:6789",
                    "nonce": 0
                }
            ]
        }
    ],
    "sync_provider": [],
    "monmap": {
        "epoch": 1,
        "fsid": "14450b7d-84ce-40c4-8a1e-46af50457fc6",
        "modified": "2021-03-02 18:32:42.613085",
        "created": "2021-03-02 18:32:42.613085",
        "min_mon_release": 14,
        "min_mon_release_name": "nautilus",
        "features": {
            "persistent": [
                "kraken",
                "luminous",
                "mimic",
                "osdmap-prune",
                "nautilus"
            ],
            "optional": []
        },
        "mons": [
            {
                "rank": 0,
                "name": "k8s-master",
                "public_addrs": {
                    "addrvec": [
                        {
                            "type": "v2",
                            "addr": "192.168.0.234:3300",
                            "nonce": 0
                        },
                        {
                            "type": "v1",
                            "addr": "192.168.0.234:6789",
                            "nonce": 0
                        }
                    ]
                },
                "addr": "192.168.0.234:6789/0",
                "public_addr": "192.168.0.234:6789/0"
            },
            {
                "rank": 1,
                "name": "k8s-node1",
                "public_addrs": {
                    "addrvec": [
                        {
                            "type": "v2",
                            "addr": "192.168.0.235:3300",
                            "nonce": 0
                        },
                        {
                            "type": "v1",
                            "addr": "192.168.0.235:6789",
                            "nonce": 0
                        }
                    ]
                },
                "addr": "192.168.0.235:6789/0",
                "public_addr": "192.168.0.235:6789/0"
            },
            {
                "rank": 2,
                "name": "k8s-node2",
                "public_addrs": {
                    "addrvec": [
                        {
                            "type": "v2",
                            "addr": "192.168.0.236:3300",
                            "nonce": 0
                        },
                        {
                            "type": "v1",
                            "addr": "192.168.0.236:6789",
                            "nonce": 0
                        }
                    ]
                },
                "addr": "192.168.0.236:6789/0",
                "public_addr": "192.168.0.236:6789/0"
            }
        ]
    },
    "feature_map": {
        "mon": [
            {
                "features": "0x3ffddff8ffecffff",
                "release": "luminous",
                "num": 1
            }
        ],
        "osd": [
            {
                "features": "0x3ffddff8ffecffff",
                "release": "luminous",
                "num": 2
            }
        ],
        "client": [
            {
                "features": "0x3ffddff8ffecffff",
                "release": "luminous",
                "num": 2
            }
        ],
        "mgr": [
            {
                "features": "0x3ffddff8ffecffff",
                "release": "luminous",
                "num": 1
            }
        ]
    }
}

There are already three mon daemons, so no additional mon configuration is needed.

10 Check Service Status

Run on the master node:

systemctl list-units | grep ceph-mon
ceph-mon@k8s-master.service                                                                                                           loaded active running   Ceph cluster monitor daemon
ceph-mon.target                                                                                                                       loaded active active    ceph target allowing to start/stop all ceph-mon@.service instances at once

systemctl list-units | grep ceph-mgr
ceph-mgr@k8s-master.service                                                                                                           loaded active running   Ceph cluster manager daemon
ceph-mgr.target                                                                                                                       loaded active active    ceph target allowing to start/stop all ceph-mgr@.service instances at once

systemctl list-units | grep ceph-osd
var-lib-ceph-osd-ceph\x2d0.mount                                                                                                      loaded active mounted   /var/lib/ceph/osd/ceph-0
ceph-osd@0.service                                                                                                                    loaded active running   Ceph object storage daemon osd.0
ceph-osd.target                                                                                                                       loaded active active    ceph target allowing to start/stop all ceph-osd@.service instances at once

Check the status:

ceph -s
  cluster:
    id:     14450b7d-84ce-40c4-8a1e-46af50457fc6
    health: HEALTH_WARN
            clock skew detected on mon.k8s-node1

  services:
    mon: 3 daemons, quorum k8s-master,k8s-node1,k8s-node2 (age 15m)
    mgr: k8s-master(active, since 41s), standbys: k8s-node2, k8s-node1
    osd: 3 osds: 3 up (since 10m), 3 in (since 10m)

  data:
    pools:   0 pools, 0 pgs
    objects: 0 objects, 0 B
    usage:   3.0 GiB used, 597 GiB / 600 GiB avail
    pgs:

The health status is HEALTH_WARN (clock skew detected); fix it as follows:

su - ceph
echo "mon clock drift allowed = 2" >> ~/cephcluster/ceph.conf
echo "mon clock drift warn backoff = 30" >> ~/cephcluster/ceph.conf
ceph-deploy --overwrite-conf config push k8s-master k8s-node1 k8s-node2
sudo systemctl restart ceph-mon.target

Check the status again:

ceph -s
  cluster:
    id:     14450b7d-84ce-40c4-8a1e-46af50457fc6
    health: HEALTH_OK

  services:
    mon: 3 daemons, quorum k8s-master,k8s-node1,k8s-node2 (age 2m)
    mgr: k8s-master(active, since 5m), standbys: k8s-node2, k8s-node1
    osd: 3 osds: 3 up (since 16m), 3 in (since 16m)

  data:
    pools:   0 pools, 0 pgs
    objects: 0 objects, 0 B
    usage:   3.0 GiB used, 597 GiB / 600 GiB avail
    pgs:

This time the status is healthy.

11 Configure the Dashboard

Run on the master node:

yum -y install ceph-mgr-dashboard    # install this on all three nodes
echo "mgr initial modules = dashboard" >> ~/cephcluster/ceph.conf
ceph-deploy --overwrite-conf config push k8s-master k8s-node1 k8s-node2
sudo systemctl restart ceph-mgr@k8s-master
ceph mgr module enable dashboard
ceph dashboard create-self-signed-cert
ceph dashboard set-login-credentials admin ceph123
******************************************************************
***          WARNING: this command is deprecated.              ***
*** Please use the ac-user-* related commands to manage users. ***
******************************************************************
Username and password updated
ceph mgr services
{
    "dashboard": "https://k8s-master:8443/"
}

Open a browser and go to https://192.168.0.234:8443/

Log in with the account admin and the password ceph123:

12 Usage Examples

https://kubernetes.io/zh/docs/concepts/storage/volumes/#cephfs
https://github.com/kubernetes/examples/tree/master/volumes/cephfs
https://github.com/kubernetes/examples/blob/master/volumes/cephfs/cephfs.yam
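
Following those examples, a Pod can mount CephFS through the in-tree cephfs volume type. The sketch below is only illustrative: it assumes a CephFS filesystem has already been created and that a Secret named ceph-secret holds the admin key; the monitor address is the one used in this cluster.

kubectl apply -f - <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: cephfs-demo
spec:
  containers:
  - name: app
    image: busybox
    command: ["sleep", "3600"]
    volumeMounts:
    - name: cephfs
      mountPath: /mnt/cephfs
  volumes:
  - name: cephfs
    cephfs:
      monitors:
      - 192.168.0.234:6789
      user: admin
      secretRef:
        name: ceph-secret    # Secret with the Ceph admin key, created beforehand
      readOnly: false
EOF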

13 Reference Links

https://www.cnblogs.com/weiwei2021/p/14060186.html
https://blog.csdn.net/weixin_43902588/article/details/109147778
https://www.cnblogs.com/huchong/p/12435957.html
https://www.cnblogs.com/sisimi/p/7700608.html

1 Deploy the Docker Service

curl https://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo -o /etc/yum.repos.d/docker.repo
yum list docker-ce --showduplicates | sort -r    # list all available versions
yum install -y docker-ce-20.10.5    # install a specific docker version
systemctl start docker    # start the docker service
systemctl status docker    # check the docker service status
systemctl enable docker    # enable docker at boot

2 Deploy the Prometheus Service

Create the mon user and the directories:

groupadd -g 2000 mon
useradd -u 2000 -g mon mon
mkdir -p /home/mon/prometheus/{etc,data,rules}

Create the configuration file:

vim /home/mon/prometheus/etc/prometheus.yml

# my global config
global:
  scrape_interval:     15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

# Alertmanager configuration
alerting:
  alertmanagers:
  - static_configs:
    - targets:
      # - alertmanager:9093

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  # - "first_rules.yml"
  # - "second_rules.yml"

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: 'prometheus'

    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.

    static_configs:
    - targets: ['localhost:9090']

Start the container:

docker pull prom/prometheus
cd /home/mon/
chown mon. -R prometheus
docker run -d --user root -p 9090:9090 --name prometheus \
    -v /home/mon/prometheus/etc/prometheus.yml:/etc/prometheus/prometheus.yml \
    -v /home/mon/prometheus/rules:/etc/prometheus/rules \
    -v /home/mon/prometheus/data:/data/prometheus \
    prom/prometheus \
    --config.file="/etc/prometheus/prometheus.yml" \
    --storage.tsdb.path="/data/prometheus" \
    --web.listen-address="0.0.0.0:9090"

3 Deploy the Grafana Service

Create the data directory:

mkdir -p /home/mon/grafana/plugins

Install the plugins (download the Grafana plugins archive first):

tar zxf /tmp/grafana-plugins.tar.gz -C /home/mon/grafana/plugins/
chown -R mon. /home/mon/grafana
chmod 777 -R /home/mon/grafana

Start the container:

docker pull grafana/grafana:latest
docker run -d -p 3000:3000 -v /home/mon/grafana:/var/lib/grafana --name=grafana grafana/grafana:latest

4 Configure Grafana to Use Prometheus

Go to http://ip:3000. The initial username and password are admin/admin; you will be asked to change the password on first login.

Configure the Prometheus data source and dashboard in the order shown in the screenshots:
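
If you prefer to script this step instead of clicking through the UI, the data source can also be created through Grafana's HTTP API. A sketch; the credentials and the Prometheus URL are the ones used in this setup and should be adjusted:

curl -s -u admin:admin -H 'Content-Type: application/json' \
  -X POST http://localhost:3000/api/datasources \
  -d '{
        "name": "Prometheus",
        "type": "prometheus",
        "url": "http://localhost:9090",
        "access": "proxy",
        "isDefault": true
      }'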

5 Deploy the Node_Exporter Service

As an example, I will monitor an Alibaba Cloud ECS instance.

Install and configure Node_Exporter:

curl -L https://github.com/prometheus/node_exporter/releases/download/v1.1.1/node_exporter-1.1.1.linux-amd64.tar.gz > /opt/node_exporter-1.1.1.linux-amd64.tar.gz    # -L follows the GitHub release redirect
cd /opt
tar zxf node_exporter-1.1.1.linux-amd64.tar.gz
mv node_exporter-1.1.1.linux-amd64 node_exporter

Configure the systemd service unit:

vim /usr/lib/systemd/system/node_exporter.service

[Unit]
Description=node_exporter service
 
[Service]
User=root
ExecStart=/opt/node_exporter/node_exporter
 
TimeoutStopSec=10
Restart=on-failure
RestartSec=5
 
[Install]
WantedBy=multi-user.target

systemctl daemon-reload
systemctl start node_exporter
systemctl status node_exporter
systemctl enable node_exporter

Configure a reverse proxy in the Nginx running on that ECS server:

vim /opt/nginx/conf/conf.d/blog.conf

    # prometheus monitor node exporter
    location /node/service {
        proxy_pass   http://127.0.0.1:9100/metrics;
        proxy_set_header Host $http_host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
    }

/opt/nginx/sbin/nginx -s reload

Modify the configuration file on the Prometheus server:

vim /home/mon/prometheus/etc/prometheus.yml
        
  - job_name: 'node-service'
    static_configs:
    - targets: ['blog.iuskye.com']
      labels:
        instance: node-service
    scheme: https
    metrics_path: /node/service

Restart the Prometheus container:

docker restart prometheus

To verify that the metrics are reachable, open https://blog.iuskye.com/node/service in a browser:

# HELP go_gc_duration_seconds A summary of the pause duration of garbage collection cycles.
# TYPE go_gc_duration_seconds summary
go_gc_duration_seconds{quantile="0"} 0
go_gc_duration_seconds{quantile="0.25"} 0
go_gc_duration_seconds{quantile="0.5"} 0
go_gc_duration_seconds{quantile="0.75"} 0
go_gc_duration_seconds{quantile="1"} 0
go_gc_duration_seconds_sum 0
go_gc_duration_seconds_count 0
# HELP go_goroutines Number of goroutines that currently exist.
# TYPE go_goroutines gauge
go_goroutines 7
# HELP go_info Information about the Go environment.
# TYPE go_info gauge
go_info{version="go1.15.8"} 1
# HELP go_memstats_alloc_bytes Number of bytes allocated and still in use.
# TYPE go_memstats_alloc_bytes gauge

······

Create a dashboard:

6 Deploy the Alertmanager Service

Create the directories:

mkdir -p /home/mon/alertmanager/{etc,data}
chmod 777 -R /home/mon/alertmanager

Create the configuration file:

vim  /home/mon/alertmanager/etc/alertmanager.yml

global:
  resolve_timeout: 5m
  smtp_from: '319981932@qq.com'
  smtp_smarthost: 'smtp.qq.com:465'
  smtp_auth_username: '319981932@qq.com'
  # Note: this must be the QQ mailbox authorization code (found in the mailbox account settings), not the login password
  smtp_auth_password: 'abcdefghijklmmop'
  smtp_require_tls: false

route:
  group_by: ['alert_node']
  group_wait: 5s
  group_interval: 5s
  repeat_interval: 5m
  receiver: 'email'

receivers:
- name: 'email'
  email_configs:
  # change this recipient address to your own mailbox
  - to: '319981932@qq.com'
    send_resolved: true

inhibit_rules:
  - source_match:
      severity: 'critical'
    target_match:
      severity: 'warning'
    equal: ['alert_node', 'dev', 'instance']

Pull the image and start the container:

docker pull prom/alertmanager:latest
chown -R mon. alertmanager/
docker run -d --user root -p 9093:9093 --name alertmanager \
    -v /home/mon/alertmanager/etc/alertmanager.yml:/etc/alertmanager/alertmanager.yml \
    -v /home/mon/alertmanager/data:/alertmanager/data \
    prom/alertmanager:latest \
    --config.file="/etc/alertmanager/alertmanager.yml" \
    --web.listen-address="0.0.0.0:9093"

Check the IP address of the alertmanager container; it is needed for the Prometheus integration below:

docker exec -it alertmanager /bin/sh -c "ip a"

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
10: eth0@if11: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> mtu 1500 qdisc noqueue
    link/ether 02:42:ac:11:00:04 brd ff:ff:ff:ff:ff:ff
    inet 172.17.0.4/16 brd 172.17.255.255 scope global eth0
       valid_lft forever preferred_lft forever
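
Alternatively, the container IP can be read with docker inspect (illustrative):

docker inspect -f '{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}' alertmanager; echo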

Modify the Prometheus configuration to point at Alertmanager:

vim /home/mon/prometheus/etc/prometheus.yml

alerting:
  alertmanagers:
  - static_configs:
    - targets:
      - 172.17.0.4:9093

rule_files:
  - "/etc/prometheus/rules/*rules.yml"

Configure the alerting rules:

vim /home/mon/prometheus/rules/alert-node-rules.yml

groups:
  - name: alert-node
    rules:
    - alert: NodeDown
      # Note: the job name here must match the job_name configured in prometheus.yml
      expr: up{job="node-service"} == 0
      for: 1m
      labels:
        severity: critical
        instance: "{{ $labels.instance }}"
      annotations:
        summary: "instance: {{ $labels.instance }} is down"
        description: "Instance: {{ $labels.instance }} 已经宕机 1分钟"
        value: "{{ $value }}"

    - alert: NodeCpuHigh
      expr: (1 - avg by (instance) (irate(node_cpu_seconds_total{job="node-service",mode="idle"}[5m]))) * 100 > 80
      for: 5m
      labels:
        severity: warning
        instance: "{{ $labels.instance }}"
      annotations:
        summary: "instance: {{ $labels.instance }} cpu使用率过高"
        description: "CPU 使用率超过 80%"
        value: "{{ $value }}"

    - alert: NodeCpuIowaitHigh
      expr: avg by (instance) (irate(node_cpu_seconds_total{job="node-service",mode="iowait"}[5m])) * 100 > 50
      for: 5m
      labels:
        severity: warning
        instance: "{{ $labels.instance }}"
      annotations:
        summary: "instance: {{ $labels.instance }} cpu iowait 使用率过高"
        description: "CPU iowait 使用率超过 50%"
        value: "{{ $value }}"

    - alert: NodeLoad5High
      expr: node_load5 > (count by (instance) (node_cpu_seconds_total{job="node-service",mode='system'})) * 1.2
      for: 5m
      labels:
        severity: warning
        instance: "{{ $labels.instance }}"
      annotations:
        summary: "instance: {{ $labels.instance }} load(5m) 过高"
        description: "Load(5m) 过高,超出cpu核数 1.2倍"
        value: "{{ $value }}"

    - alert: NodeMemoryHigh
      expr: (1 - node_memory_MemAvailable_bytes{job="node-service"} / node_memory_MemTotal_bytes{job="node-service"}) * 100 > 60
      for: 5m
      labels:
        severity: warning
        instance: "{{ $labels.instance }}"
      annotations:
        summary: "instance: {{ $labels.instance }} memory 使用率过高"
        description: "Memory 使用率超过 90%"
        value: "{{ $value }}"

    - alert: NodeDiskRootHigh
      expr: (1 - node_filesystem_avail_bytes{job="node-service",fstype=~"ext.*|xfs",mountpoint ="/"} / node_filesystem_size_bytes{job="node-service",fstype=~"ext.*|xfs",mountpoint ="/"}) * 100 > 90
      for: 10m
      labels:
        severity: warning
        instance: "{{ $labels.instance }}"
      annotations:
        summary: "instance: {{ $labels.instance }} disk(/ 分区) 使用率过高"
        description: "Disk(/ 分区) 使用率超过 90%"
        value: "{{ $value }}"

    - alert: NodeDiskBootHigh
      expr: (1 - node_filesystem_avail_bytes{job="node-service",fstype=~"ext.*|xfs",mountpoint ="/boot"} / node_filesystem_size_bytes{job="node-service",fstype=~"ext.*|xfs",mountpoint ="/boot"}) * 100 > 80
      for: 10m
      labels:
        severity: warning
        instance: "{{ $labels.instance }}"
      annotations:
        summary: "instance: {{ $labels.instance }} disk(/boot 分区) 使用率过高"
        description: "Disk(/boot 分区) 使用率超过 80%"
        value: "{{ $value }}"

    - alert: NodeDiskReadHigh
      expr: irate(node_disk_read_bytes_total{job="node-service"}[5m]) > 20 * (1024 ^ 2)
      for: 5m
      labels:
        severity: warning
        instance: "{{ $labels.instance }}"
      annotations:
        summary: "instance: {{ $labels.instance }} disk 读取字节数 速率过高"
        description: "Disk 读取字节数 速率超过 20 MB/s"
        value: "{{ $value }}"

    - alert: NodeDiskWriteHigh
      expr: irate(node_disk_written_bytes_total{job="node-service"}[5m]) > 20 * (1024 ^ 2)
      for: 5m
      labels:
        severity: warning
        instance: "{{ $labels.instance }}"
      annotations:
        summary: "instance: {{ $labels.instance }} disk 写入字节数 速率过高"
        description: "Disk 写入字节数 速率超过 20 MB/s"
        value: "{{ $value }}"

    - alert: NodeDiskReadRateCountHigh
      expr: irate(node_disk_reads_completed_total{job="node-service"}[5m]) > 3000
      for: 5m
      labels:
        severity: warning
        instance: "{{ $labels.instance }}"
      annotations:
        summary: "instance: {{ $labels.instance }} disk iops 每秒读取速率过高"
        description: "Disk iops 每秒读取速率超过 3000 iops"
        value: "{{ $value }}"

    - alert: NodeDiskWriteRateCountHigh
      expr: irate(node_disk_writes_completed_total{job="node-service"}[5m]) > 3000
      for: 5m
      labels:
        severity: warning
        instance: "{{ $labels.instance }}"
      annotations:
        summary: "instance: {{ $labels.instance }} disk iops 每秒写入速率过高"
        description: "Disk iops 每秒写入速率超过 3000 iops"
        value: "{{ $value }}"

    - alert: NodeInodeRootUsedPercentHigh
      expr: (1 - node_filesystem_files_free{job="node-service",fstype=~"ext4|xfs",mountpoint="/"} / node_filesystem_files{job="node-service",fstype=~"ext4|xfs",mountpoint="/"}) * 100 > 80
      for: 10m
      labels:
        severity: warning
        instance: "{{ $labels.instance }}"
      annotations:
        summary: "instance: {{ $labels.instance }} disk(/ 分区) inode 使用率过高"
        description: "Disk (/ 分区) inode 使用率超过 80%"
        value: "{{ $value }}"

    - alert: NodeInodeBootUsedPercentHigh
      expr: (1 - node_filesystem_files_free{job="node-service",fstype=~"ext4|xfs",mountpoint="/boot"} / node_filesystem_files{job="node-service",fstype=~"ext4|xfs",mountpoint="/boot"}) * 100 > 80
      for: 10m
      labels:
        severity: warning
        instance: "{{ $labels.instance }}"
      annotations:
        summary: "instance: {{ $labels.instance }} disk(/boot 分区) inode 使用率过高"
        description: "Disk (/boot 分区) inode 使用率超过 80%"
        value: "{{ $value }}"

    - alert: NodeFilefdAllocatedPercentHigh
      expr: node_filefd_allocated{job="node-service"} / node_filefd_maximum{job="node-service"} * 100 > 80
      for: 10m
      labels:
        severity: warning
        instance: "{{ $labels.instance }}"
      annotations:
        summary: "instance: {{ $labels.instance }} filefd 打开百分比过高"
        description: "Filefd 打开百分比 超过 80%"
        value: "{{ $value }}"

    - alert: NodeNetworkNetinBitRateHigh
      expr: avg by (instance) (irate(node_network_receive_bytes_total{device=~"eth0|eth1|ens33|ens37"}[1m]) * 8) > 20 * (1024 ^ 2) * 8
      for: 3m
      labels:
        severity: warning
        instance: "{{ $labels.instance }}"
      annotations:
        summary: "instance: {{ $labels.instance }} network 接收比特数 速率过高"
        description: "Network 接收比特数 速率超过 20MB/s"
        value: "{{ $value }}"

    - alert: NodeNetworkNetoutBitRateHigh
      expr: avg by (instance) (irate(node_network_transmit_bytes_total{device=~"eth0|eth1|ens33|ens37"}[1m]) * 8) > 20 * (1024 ^ 2) * 8
      for: 3m
      labels:
        severity: warning
        instance: "{{ $labels.instance }}"
      annotations:
        summary: "instance: {{ $labels.instance }} network 发送比特数 速率过高"
        description: "Network 发送比特数 速率超过 20MB/s"
        value: "{{ $value }}"

    - alert: NodeNetworkNetinPacketErrorRateHigh
      expr: avg by (instance) (irate(node_network_receive_errs_total{device=~"eth0|eth1|ens33|ens37"}[1m])) > 15
      for: 3m
      labels:
        severity: warning
        instance: "{{ $labels.instance }}"
      annotations:
        summary: "instance: {{ $labels.instance }} 接收错误包 速率过高"
        description: "Network 接收错误包 速率超过 15个/秒"
        value: "{{ $value }}"

    - alert: NodeNetworkNetoutPacketErrorRateHigh
      expr: avg by (instance) (irate(node_network_transmit_errs_total{device=~"eth0|eth1|ens33|ens37"}[1m])) > 15
      for: 3m
      labels:
        severity: warning
        instance: "{{ $labels.instance }}"
      annotations:
        summary: "instance: {{ $labels.instance }} 发送错误包 速率过高"
        description: "Network 发送错误包 速率超过 15个/秒"
        value: "{{ $value }}"

    - alert: NodeProcessBlockedHigh
      expr: node_procs_blocked{job="node-service"} > 10
      for: 10m
      labels:
        severity: warning
        instance: "{{ $labels.instance }}"
      annotations:
        summary: "instance: {{ $labels.instance }} 当前被阻塞的任务的数量过多"
        description: "Process 当前被阻塞的任务的数量超过 10个"
        value: "{{ $value }}"

    - alert: NodeTimeOffsetHigh
      expr: abs(node_timex_offset_seconds{job="node-service"}) > 3 * 60
      for: 2m
      labels:
        severity: info
        instance: "{{ $labels.instance }}"
      annotations:
        summary: "instance: {{ $labels.instance }} 时间偏差过大"
        description: "Time 节点的时间偏差超过 3m"
        value: "{{ $value }}"

Restart the Prometheus container:

docker restart prometheus

To verify that alert emails are delivered, stop node_exporter on the monitored ECS server:

systemctl stop node_exporter

Then refresh the Prometheus web UI and open the Alerts page; the NodeDown rule is in the PENDING state:

After waiting about a minute and refreshing again, it has changed to the FIRING state:

Now check the mailbox:

The alert email has arrived. Now restore the service:

systemctl start node_exporter

Shortly afterwards we receive the resolved (recovery) notification email:

7 Running Multiple Container Instances

To run multiple sets of containers, change the ports, container names, data paths, and so on, for example:

Prometheus

docker run -d --user root -p 9091:9090 --name prometheus-poc \
    -v /home/mon/prometheus-poc/etc/prometheus.yml:/etc/prometheus/prometheus.yml \
    -v /home/mon/prometheus-poc/rules:/etc/prometheus/rules \
    -v /home/mon/prometheus-poc/data:/data/prometheus \
    prom/prometheus \
    --config.file="/etc/prometheus/prometheus.yml" \
    --storage.tsdb.path="/data/prometheus" \
    --web.listen-address="0.0.0.0:9090"

Differences:

  • -p 9091:9090
  • --name prometheus-poc
  • -v /home/mon/prometheus-poc/etc/prometheus.yml:/etc/prometheus/prometheus.yml
  • -v /home/mon/prometheus-poc/rules:/etc/prometheus/rules
  • -v /home/mon/prometheus-poc/data:/data/prometheus

Grafana

docker run -d -p 3001:3000 -v /home/mon/grafana-poc:/var/lib/grafana --name=grafana-poc grafana/grafana:latest

Differences:

  • -p 3001:3000
  • --name=grafana-poc
  • -v /home/mon/grafana-poc:/var/lib/grafana

Alertmanager

docker run -d --user root -p 9094:9093 --name alertmanager-poc \
    -v /home/mon/alertmanager-poc/etc/alertmanager.yml:/etc/alertmanager/alertmanager.yml \
    -v /home/mon/alertmanager-poc/data:/alertmanager/data \
    prom/alertmanager:latest \
    --config.file="/etc/alertmanager/alertmanager.yml" \
    --web.listen-address="0.0.0.0:9093"

Differences:

  • -p 9094:9093
  • --name alertmanager-poc
  • -v /home/mon/alertmanager-poc/etc/alertmanager.yml:/etc/alertmanager/alertmanager.yml
  • -v /home/mon/alertmanager-poc/data:/alertmanager/data

8 References

Docker 部署 Prometheus+Grafana (Deploying Prometheus+Grafana with Docker)

Dashboard Download