RKE 安装k8s中报错,failed to fetch state from kubernetes。

Rancher Server 设置

  • Rancher 版本:还没到
  • Docker 版本20.10
  • rke version v1.3.15
  • 在线或离线部署:我是进行内网离线部署,操作系统是debian11.

我按照cluster.yml的内容,下载了所有的镜像,导入到每台主机中

docker load -i rancher_mirrored-k8s-dns-kube-dns1.21.1.tar
docker load -i aci-containers-controller.5.2.tar
docker load -i aci-containers-controller-host.5.2.tar
docker load -i autoscaler.1.85.tar
docker load -i calico-cni_v3.22.tar
docker load -i calico-ctl_v3.22.tar
docker load -i calico-kube-controllers_v3.22.0.tar
docker load -i calico-nodeV3.22.0.tar
docker load -i calico-pod2daemon-flexvol3.22.tar
docker load -i coredns_1.9.3.tar 
docker load -i dnsmasq-nanny_1.12.1.tar
docker load -i dns-sidecar.1.21.1.tar
docker load -i flannel-cni_v0.3.0.tar
docker load -i gbp-server5.23.tar
docker load -i hyperkubeV1.24.4.tar
docker load -i mirrored-calico-nodev3.22.0.tar
docker load -i mirrored-coreos-etcd_v3.5.4.tar
docker load -i mirrored-coreos-flannel_v0.15.1.tar
docker load -i mirrored-flannelcni-flannel_v0.17.0.tar
docker load -i mirrored-ingress-nginx-kube-webhook-certgen_v1.1.1.tar
docker load -i mirrored-metrics-server_v0.6.1.tar
docker load -i mirrored-nginx-ingress-controller-defaultbackend1.5.tar
docker load -i mirrored-pause3.6.tar
docker load -i nginx-ingress-controller_1.2.1.tar
docker load -i node-cache1.21.1.tar
docker load -i noiro-cnideploy.5.2.tar 
docker load -i noiro-openvswitch5.23.tar
docker load -i noiro-opflex5.23.tar
docker load -i noiro-opflex-server5.23.tar
docker load -i rke-tools0.1.87.tar
docker load -i weave-kube2.8.1.tar
docker load -i weave-npc2.8.1.tar

安装报错:

root@harbor:~/rancher# rke up
INFO[0000] Running RKE version: v1.3.15                 
INFO[0000] Initiating Kubernetes cluster                
INFO[0000] [state] Possible legacy cluster detected, trying to upgrade 
INFO[0000] [reconcile] Rebuilding and updating local kube config 
INFO[0000] Successfully Deployed local admin kubeconfig at [./kube_config_cluster.yml] 
INFO[0000] [reconcile] host [192.168.1.86] is a control plane node with reachable Kubernetes API endpoint in the cluster 
INFO[0000] [state] Fetching cluster state from Kubernetes 
INFO[0030] Timed out waiting for kubernetes cluster to get state 
WARN[0030] Failed to fetch state from kubernetes: Timeout waiting for kubernetes cluster to get state 
INFO[0030] [dialer] Setup tunnel for host [192.168.1.89] 
INFO[0030] [dialer] Setup tunnel for host [192.168.1.90] 
INFO[0030] [dialer] Setup tunnel for host [192.168.1.86] 
INFO[0030] [dialer] Setup tunnel for host [192.168.1.91] 
INFO[0030] [dialer] Setup tunnel for host [192.168.1.87] 
INFO[0030] [dialer] Setup tunnel for host [192.168.1.88] 
INFO[0030] [state] Fetching cluster state from Nodes    
INFO[0030] Finding container [cluster-state-deployer] on host [192.168.1.90], try #1 
INFO[0030] Finding container [cluster-state-deployer] on host [192.168.1.91], try #1 
INFO[0030] Finding container [cluster-state-deployer] on host [192.168.1.86], try #1 
INFO[0030] Finding container [cluster-state-deployer] on host [192.168.1.87], try #1 
INFO[0030] Finding container [cluster-state-deployer] on host [192.168.1.88], try #1 
INFO[0030] Finding container [cluster-state-deployer] on host [192.168.1.89], try #1 
INFO[0030] [certificates] Getting Cluster certificates from Kubernetes 
WARN[0030] Failed to fetch certs from kubernetes: secrets "kube-ca" not found 
INFO[0030] [certificates] Fetching kubernetes certificates from nodes 
INFO[0030] Finding container [cert-fetcher] on host [192.168.1.86], try #1 
INFO[0030] Image [rancher/rke-tools:v0.1.87] exists on host [192.168.1.86] 
INFO[0031] Starting container [cert-fetcher] on host [192.168.1.86], try #1 
INFO[0031] [certificates] Successfully started [cert-fetcher] container on host [192.168.1.86] 
INFO[0031] Finding container [cert-fetcher] on host [192.168.1.86], try #1 
INFO[0031] Finding container [cert-fetcher] on host [192.168.1.86], try #1 
INFO[0031] Finding container [cert-fetcher] on host [192.168.1.87], try #1 
INFO[0031] Image [rancher/rke-tools:v0.1.87] exists on host [192.168.1.87] 
INFO[0032] Starting container [cert-fetcher] on host [192.168.1.87], try #1 
INFO[0032] [certificates] Successfully started [cert-fetcher] container on host [192.168.1.87] 
INFO[0032] Finding container [cert-fetcher] on host [192.168.1.88], try #1 
INFO[0032] Image [rancher/rke-tools:v0.1.87] exists on host [192.168.1.88] 
INFO[0032] Starting container [cert-fetcher] on host [192.168.1.88], try #1 
INFO[0033] [certificates] Successfully started [cert-fetcher] container on host [192.168.1.88] 
INFO[0033] Finding container [cert-fetcher] on host [192.168.1.88], try #1 
INFO[0033] Finding container [cert-fetcher] on host [192.168.1.88], try #1 
INFO[0033] Finding container [cert-fetcher] on host [192.168.1.88], try #1 
INFO[0033] Finding container [cert-fetcher] on host [192.168.1.88], try #1 
INFO[0033] Finding container [cert-fetcher] on host [192.168.1.88], try #1 
FATA[0033] Failed to fetch cluster certs from nodes, aborting upgrade: Certificate /etc/kubernetes/.tmp/kube-controller-manager.pem is not found

这个报错是我有镜像没有下载还是什么?
INFO[0000] [state] Fetching cluster state from Kubernetes
INFO[0030] Timed out waiting for kubernetes cluster to get state
WARN[0030] Failed to fetch state from kubernetes: Timeout waiting for kubernetes cluster to get state



[/details]

参考 RKE shouldn't pull kube-admin certs from nodes · Issue #1622 · rancher/rke · GitHub 试试?

我看ca是正常的。我rke remove 了。又重新rke up。

我看最后的提示是
Finished building kubernetes cluster successfully

但是依旧有报错:是3台master节点报错。

WARN[0016] [reconcile] host [192.168.1.86] is a control plane node without reachable Kubernetes API endpoint in the cluster 
INFO[0016] Successfully Deployed local admin kubeconfig at [./kube_config_cluster.yml] 
WARN[0016] [reconcile] host [192.168.1.87] is a control plane node without reachable Kubernetes API endpoint in the cluster 
INFO[0016] Successfully Deployed local admin kubeconfig at [./kube_config_cluster.yml] 
WARN[0016] [reconcile] host [192.168.1.88] is a control plane node without reachable Kubernetes API endpoint in the cluster 
WARN[0016] [reconcile] no control plane node with reachable Kubernetes API endpoint in the cluster found 
INFO[0016] [certificates] Successfully deployed kubernetes certificates to Cluster nodes