Environment information:
K3s version:
1.25 stable
Node CPU architecture, OS, and version:
Linux k8s-4 3.10.0-957.el7.x86_64 #1 SMP Thu Nov 8 23:39:32 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
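Cluster configuration:
1 master, 1 worker
Describe the bug:
Steps to reproduce:
Expected behavior:
Actual behavior:
The master is created successfully; the worker node fails.
Additional context / logs:
Logs: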
12月 01 14:38:26 k8s-1 systemd[1]: Failed to start Lightweight Kubernetes.
12月 01 14:38:26 k8s-1 systemd[1]: Unit k3s-agent.service entered failed state.
12月 01 14:38:26 k8s-1 systemd[1]: k3s-agent.service failed.
12月 01 14:38:31 k8s-1 systemd[1]: k3s-agent.service holdoff time over, scheduling restart.
12月 01 14:38:31 k8s-1 systemd[1]: Stopped Lightweight Kubernetes.
12月 01 14:38:31 k8s-1 systemd[1]: Starting Lightweight Kubernetes...
12月 01 14:38:31 k8s-1 sh[7165]: + /usr/bin/systemctl is-enabled --quiet nm-cloud-setup.service
12月 01 14:38:31 k8s-1 sh[7165]: Failed to get unit file state for nm-cloud-setup.service: No such file or directory
12月 01 14:38:31 k8s-1 k3s[7181]: time="2022-12-01T14:38:31+08:00" level=info msg="Starting k3s agent v1.25.4+k3s1 (0dc63334)"
12月 01 14:38:31 k8s-1 k3s[7181]: time="2022-12-01T14:38:31+08:00" level=info msg="Running load balancer k3s-agent-load-balancer 127.0.0.1:6444 -> [192.168.8.4:6443]"
12月 01 14:38:31 k8s-1 k3s[7181]: time="2022-12-01T14:38:31+08:00" level=error msg="failed to get CA certs: Get \"https://127.0.0.1:6444/cacerts\": read tcp 127.0.0.1:60974->127.0.0.1:6444: read: connection reset by peer"
12月 01 14:38:33 k8s-1 k3s[7181]: time="2022-12-01T14:38:33+08:00" level=error msg="failed to get CA certs: Get \"https://127.0.0.1:6444/cacerts\": read tcp 127.0.0.1:60982->127.0.0.1:6444: read: connection reset by peer"
12月 01 14:38:35 k8s-1 k3s[7181]: time="2022-12-01T14:38:35+08:00" level=error msg="failed to get CA certs: Get \"https://127.0.0.1:6444/cacerts\": read tcp 127.0.0.1:60990->127.0.0.1:6444: read: connection reset by peer"
12月 01 14:38:37 k8s-1 k3s[7181]: time="2022-12-01T14:38:37+08:00" level=error msg="failed to get CA certs: Get \"https://127.0.0.1:6444/cacerts\": read tcp 127.0.0.1:60998->127.0.0.1:6444: read: connection reset by peer"
12月 01 14:38:37 k8s-1 systemd[1]: Stopped Lightweight Kubernetes.
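- Create the virtual machines from a CentOS 8 minimal image
- Install Docker 20.10.7 and disable the firewall
- Change the hostname of each host
- Use autok3s to create a cluster with 1 master and 1 worker, installing with Docker and keeping the default cluster options: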
autok3s create --provider native --cluster --docker-script https://get.docker.com --k3s-channel stable --k3s-install-mirror INSTALL_K3S_MIRROR=cn --k3s-install-script https://rancher-mirror.oss-cn-beijing.aliyuncs.com/k3s/k3s-install.sh --master-extra-args '--docker' --name native --ssh-password admin@123! --ssh-port 22 --ssh-user root --token native_token --worker-extra-args '--docker' --master-ips 192.168.8.10 --worker-ips 192.168.8.12
- The logs show that creation fails at the worker node and is then rolled back
- Agent failure logs:
12月 01 04:57:04 k3s-12 k3s[28244]: Error: failed to run Kubelet: unable to determine runtime API version: rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing dial unix /run/k3s/cri-dockerd/cri-dockerd.sock: connect: no such file or directory"
12月 01 04:57:04 k3s-12 k3s[28244]: time="2022-12-01T04:57:04-05:00" level=fatal msg="kubelet exited: failed to run Kubelet: unable to determine runtime API version: rpc error: code = Unavailable desc = connection error: desc = \"transport: Error while dialing dial unix /run/k3s/cri-dockerd/cri-dockerd.sock: connect: no such file or directory\""
12月 01 04:57:04 k3s-12 systemd[1]: k3s-agent.service: Main process exited, code=exited, status=1/FAILURE
12月 01 04:57:04 k3s-12 systemd[1]: k3s-agent.service: Failed with result 'exit-code'.
12月 01 04:57:04 k3s-12 systemd[1]: Failed to start Lightweight Kubernetes.
12月 01 04:57:09 k3s-12 systemd[1]: k3s-agent.service: Service RestartSec=5s expired, scheduling restart.
12月 01 04:57:09 k3s-12 systemd[1]: k3s-agent.service: Scheduled restart job, restart counter is at 1.
12月 01 04:57:09 k3s-12 systemd[1]: Stopped Lightweight Kubernetes.
12月 01 04:57:09 k3s-12 systemd[1]: Starting Lightweight Kubernetes...
12月 01 04:57:09 k3s-12 sh[28308]: + /usr/bin/systemctl is-enabled --quiet nm-cloud-setup.service
12月 01 04:57:09 k3s-12 sh[28309]: Failed to get unit file state for nm-cloud-setup.service: No such file or directory
12月 01 04:57:09 k3s-12 k3s[28315]: time="2022-12-01T04:57:09-05:00" level=info msg="Starting k3s agent v1.25.4+k3s1 (0dc63334)"
12月 01 04:57:09 k3s-12 k3s[28315]: time="2022-12-01T04:57:09-05:00" level=info msg="Running load balancer k3s-agent-load-balancer 127.0.0.1:6444 -> [192.168.8.10:6443]"
12月 01 04:57:09 k3s-12 k3s[28315]: time="2022-12-01T04:57:09-05:00" level=error msg="failed to get CA certs: Get \"https://127.0.0.1:6444/cacerts\": read tcp 127.0.0.1:58174->127.0.0.1:6444: read: connection reset by peer"
12月 01 04:57:11 k3s-12 k3s[28315]: time="2022-12-01T04:57:11-05:00" level=error msg="failed to get CA certs: Get \"https://127.0.0.1:6444/cacerts\": read tcp 127.0.0.1:58184->127.0.0.1:6444: read: connection reset by peer"
12月 01 04:57:13 k3s-12 k3s[28315]: time="2022-12-01T04:57:13-05:00" level=error msg="failed to get CA certs: Get \"https://127.0.0.1:6444/cacerts\": read tcp 127.0.0.1:58204->127.0.0.1:6444: read: connection reset by peer"
12月 01 04:57:15 k3s-12 k3s[28315]: time="2022-12-01T04:57:15-05:00" level=error msg="failed to get CA certs: Get \"https://127.0.0.1:6444/cacerts\": read tcp 127.0.0.1:41142->127.0.0.1:6444: read: connection reset by peer"
12月 01 04:57:17 k3s-12 k3s[28315]: time="2022-12-01T04:57:17-05:00" level=error msg="failed to get CA certs: Get \"https://127.0.0.1:6444/cacerts\": read tcp 127.0.0.1:41164->127.0.0.1:6444: read: connection reset by peer"
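
The agent logs above contain two distinct failure signatures: repeated `failed to get CA certs` errors, and a missing `cri-dockerd.sock` socket. To see which one dominates on a given worker, a small helper along these lines could be used. This is only a sketch: the function name `summarize_agent_errors` and the output labels are hypothetical, and it only counts matching lines in whatever journal text it is given.

```shell
# Count the two failure signatures that appear in the agent logs above.
# Reads journal text on stdin; prints one "name=count" line per signature.
# The function name and the output labels are illustrative, not part of k3s.
summarize_agent_errors() {
  awk '
    /failed to get CA certs/ { ca++ }
    /cri-dockerd\.sock: connect: no such file or directory/ { cri++ }
    END {
      printf "ca_cert_failures=%d\n", ca
      printf "cri_dockerd_socket_missing=%d\n", cri
    }
  '
}

# Example usage on the worker node:
#   journalctl -u k3s-agent --no-pager | summarize_agent_errors
```

A non-zero `cri_dockerd_socket_missing` count would suggest looking at the Docker runtime side first, while only `ca_cert_failures` would point at connectivity between the agent and the server on port 6443.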