Rancher 2.6.3: RKE2 version upgrade fails

Rancher Server setup

  • Rancher version: 2.6.3
  • Installation option: single-node Docker deployment

Downstream cluster information

  • Kubernetes version: v1.21.9

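Problem description:
I used the Rancher UI to upgrade the RKE2 cluster from v1.21.9 to v1.21.13. The upgrade failed, and all 3 master nodes in the cluster reported errors. Node upgrade strategy: 10%, without deleting pods.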

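Result:
After the upgrade failed, I tried to roll back to the old version by restoring etcd and the RKE2 version number. The rollback also failed, and the processes on all master nodes stopped.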

Checking rke2-server.service shows that the running RKE2 version is v1.21.13+rke2r2, not the rolled-back v1.21.9+rke2r1:

6月 23 04:52:36 k8smaster29 systemd[1]: Starting Rancher Kubernetes Engine v2 (server)...
-- Subject: Unit rke2-server.service has begun start-up
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
-- 
-- Unit rke2-server.service has begun starting up.
6月 23 04:52:36 k8smaster29 sh[30989]: + /usr/bin/systemctl is-enabled --quiet nm-cloud-setup.service
6月 23 04:52:36 k8smaster29 sh[30989]: Failed to get unit file state for nm-cloud-setup.service: No such file or directory
6月 23 04:52:36 k8smaster29 rke2[30998]: time="2022-06-23T04:52:36+08:00" level=warning msg="not running in CIS mode"
6月 23 04:52:36 k8smaster29 rke2[30998]: time="2022-06-23T04:52:36+08:00" level=info msg="Starting rke2 v1.21.13+rke2r2 (852510e54a652f62c59c1d125cc1e65490ffc45c)"
6月 23 04:52:36 k8smaster29 rke2[30998]: time="2022-06-23T04:52:36+08:00" level=fatal msg="starting kubernetes: preparing server: failed to get CA certs: Get \"https://192.168.0.28:9345/cacerts\": dial tcp 192.1
6月 23 04:52:36 k8smaster29 systemd[1]: rke2-server.service: main process exited, code=exited, status=1/FAILURE
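The fatal line above says this node could not fetch the CA certificates from the server at 192.168.0.28 on port 9345. A minimal sketch for testing that endpoint by hand from the failing node is shown below. The helper names `cacerts_url` and `check_supervisor` are hypothetical, and the sketch assumes `curl` is installed on the node; it only tests reachability and does not explain why the server is down.

```shell
#!/bin/sh
# Hypothetical helpers, not part of rke2.

# Build the CA-certs URL that appears in the fatal log line for a given server address.
cacerts_url() {
    printf 'https://%s:9345/cacerts\n' "$1"
}

# Return 0 if the endpoint answers within 5 seconds, non-zero otherwise.
# -k skips certificate verification, since the CA is what we are trying to fetch.
check_supervisor() {
    curl -k -sS --max-time 5 -o /dev/null "$(cacerts_url "$1")"
}

# Usage (run on the failing node):
#   check_supervisor 192.168.0.28 && echo reachable || echo unreachable
```

If the endpoint is unreachable from every master, that would be consistent with all server processes being stopped, as described in the result above.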

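The private registry is the Alibaba Cloud mirror, and the image for v1.21.13+rke2r2 cannot be pulled from it, although the same tag exists on Docker Hub: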

[root@rancher ~]# docker pull registry.cn-hangzhou.aliyuncs.com/rancher/rke2-runtime:v1.21.13-rke2r2
Error response from daemon: manifest for registry.cn-hangzhou.aliyuncs.com/rancher/rke2-runtime:v1.21.13-rke2r2 not found: manifest unknown: manifest unknown
[root@rancher ~]# docker pull rancher/rke2-runtime:v1.21.13-rke2r2                                  
v1.21.13-rke2r2: Pulling from rancher/rke2-runtime
166582ab719c: Pull complete 
Digest: sha256:398241e0d6e5b976d9301e0240598de2949a08a862f9a6837286d476c82cb1ad
Status: Downloaded newer image for rancher/rke2-runtime:v1.21.13-rke2r2
docker.io/rancher/rke2-runtime:v1.21.13-rke2r2
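The pulls above use the tag v1.21.13-rke2r2 for RKE2 version v1.21.13+rke2r2, that is, the `+` becomes `-` in the image tag. A minimal sketch for checking whether a registry carries the runtime image before starting an upgrade could look like this. The function names are hypothetical, and it only checks `rke2-runtime`, on the assumption that a missing runtime image is enough to flag the mirror as incomplete.

```shell
#!/bin/sh
# Hypothetical helpers, not part of rke2 or Rancher.

# Convert an RKE2 version (v1.21.13+rke2r2) to its image tag form (v1.21.13-rke2r2).
rke2_image_tag() {
    printf '%s\n' "$1" | tr '+' '-'
}

# Build the rke2-runtime image reference.
# $1 = registry prefix (e.g. registry.cn-hangzhou.aliyuncs.com/rancher), $2 = RKE2 version
rke2_runtime_image() {
    printf '%s/rke2-runtime:%s\n' "$1" "$(rke2_image_tag "$2")"
}

# Return 0 if the image can be pulled from the given registry prefix.
registry_has_runtime() {
    docker pull "$(rke2_runtime_image "$1" "$2")" >/dev/null 2>&1
}

# Usage:
#   registry_has_runtime registry.cn-hangzhou.aliyuncs.com/rancher v1.21.13+rke2r2 \
#       && echo "image present" || echo "image missing"
```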

See: "Rancher v2.6.5 fails to create downstream cluster v1.23.7-rke2r2" - #2, by niusmallnan