Rancher: Kubernetes cluster node stuck at "Reconciling" on startup

Environment:
RKE2 version: 2.9.3

Node CPU architecture, operating system and version:

x86_64, Ubuntu 22.04

Cluster configuration:

1 server

Problem description:

After joining a Linux node with the registration script from Rancher, the node stays stuck at "Reconciling".

Steps to reproduce:

  • Command used to install RKE2:

    Ran the node registration script shown in the screenshot on the Linux node.

Expected result:

Actual result:
The node stays stuck at "Reconciling".

Logs from rke2-server:

Logs

Dec 29 08:52:59 a10-node rke2[3236]: time="2024-12-29T08:52:59Z" level=info msg="Handling backend connection request [a10-node]"
Dec 29 08:52:59 a10-node rke2[3236]: time="2024-12-29T08:52:59Z" level=info msg="Remotedialer connected to proxy" url="wss://127.0.0.1:9345/v1-rke2/connect"
Dec 29 08:52:59 a10-node rke2[3236]: time="2024-12-29T08:52:59Z" level=error msg="Sending HTTP 503 response to 127.0.0.1:48174: runtime core not ready"
Dec 29 08:52:59 a10-node rke2[3236]: time="2024-12-29T08:52:59Z" level=info msg="Running kube-proxy --cluster-cidr=10.42.0.0/16 --conntrack-max-per-core=0 --conntrack-tcp-timeout-close-wait=0s --conntrack-tcp-timeout-established=0s --healthz-bind-address=127.0.0.1 --hostname-override=a10-node --kubeconfig=/var/lib/rancher/rke2/agent/kubeproxy.kubeconfig --proxy-mode=iptables"
Dec 29 08:53:14 a10-node rke2[3236]: time="2024-12-29T08:53:14Z" level=warning msg="Failed to list nodes with etcd role: runtime core not ready"
Dec 29 08:53:16 a10-node rke2[3236]: time="2024-12-29T08:53:16Z" level=info msg="Pod for etcd is synced"
Dec 29 08:53:16 a10-node rke2[3236]: time="2024-12-29T08:53:16Z" level=info msg="Pod for kube-apiserver is synced"
Dec 29 08:53:26 a10-node rke2[3236]: time="2024-12-29T08:53:26Z" level=info msg="Waiting for API server to become available"
Dec 29 08:53:26 a10-node rke2[3236]: time="2024-12-29T08:53:26Z" level=info msg="Waiting for API server to become available"
Dec 29 08:53:29 a10-node rke2[3236]: time="2024-12-29T08:53:29Z" level=warning msg="Failed to list nodes with etcd role: runtime core not ready"
Dec 29 08:53:44 a10-node rke2[3236]: time="2024-12-29T08:53:44Z" level=warning msg="Failed to list nodes with etcd role: runtime core not ready"
Dec 29 08:53:56 a10-node rke2[3236]: time="2024-12-29T08:53:56Z" level=info msg="Waiting for API server to become available"
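
To see which messages dominate an excerpt like the one above, the lines can be tallied by level and message. The following is a minimal Python sketch and not part of the original report; the `summarize` helper and its regular expression are illustrative assumptions about the log layout shown here, and they may not cover every rke2 log line.

```python
import re
from collections import Counter

# Assumed layout: ... level=<level> msg="<message>" ...
# Accepts straight or curly quotes, since pasted logs often contain the latter.
LINE_RE = re.compile(r'level=(\w+)\s+msg=["“](.*?)["”](?:\s|$)')

def summarize(lines):
    """Count (level, message) pairs in an iterable of log lines."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.search(line)
        if match:
            counts[(match.group(1), match.group(2))] += 1
    return counts
```

Calling `summarize(open("rke2.log"))` on a saved excerpt (the file name is hypothetical) and then `most_common()` on the result lists the repeated messages first.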

You can check the logs of the rancher server and of the rancher-system-agent service. If both look fine, check the status and logs of the RKE2 components; see: RKE2 commands
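
The node-side part of this check can be scripted. Below is a minimal Python sketch, assuming the node runs systemd and that the units are named `rancher-system-agent` and `rke2-server` as in this thread; `journal_cmd` and `tail_unit` are hypothetical helper names, and the Rancher server logs themselves live wherever Rancher is deployed and are not covered here.

```python
import subprocess

# Unit names as they appear in this thread.
UNITS = ["rancher-system-agent", "rke2-server"]

def journal_cmd(unit, lines=200):
    """Build a journalctl command printing the last `lines` entries of a unit."""
    return ["journalctl", "-u", unit, "-n", str(lines), "--no-pager"]

def tail_unit(unit, lines=200):
    """Return recent journal output for a unit (usually needs root on the node)."""
    result = subprocess.run(
        journal_cmd(unit, lines), capture_output=True, text=True, check=False
    )
    return result.stdout
```

On the affected node, something like `for unit in UNITS: print(tail_unit(unit))` would print the recent entries of both services.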

I checked the rancher server logs (that is, the log above) and the rancher-system-agent log.


What does "runtime core not ready" mean?

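Nine times out of ten it is because the images have not been pulled. Check the rancher-system-agent logs for any messages about failed image pulls, and give the logs some time to refresh.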
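
A quick way to look for such messages is to filter the saved log for pull-related errors. This is only a sketch: the patterns below are guesses at typical wording (for example the Kubernetes pod states `ErrImagePull` and `ImagePullBackOff`), not strings confirmed to appear in rancher-system-agent output, and `find_pull_failures` is a hypothetical helper.

```python
import re

# Hypothetical patterns; the actual wording in the logs may differ.
PULL_FAILURE_RE = re.compile(
    r"failed to pull|error pulling|ErrImagePull|ImagePullBackOff",
    re.IGNORECASE,
)

def find_pull_failures(lines):
    """Return the log lines that look like image pull failures."""
    return [line.rstrip("\n") for line in lines if PULL_FAILURE_RE.search(line)]
```

An empty result does not prove the pulls succeeded; it only means none of the assumed patterns matched.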