Restore drill fails after backing up Rancher Server with rancher-backup on an RKE2 cluster

Rancher Server setup

  • Rancher version:
  • Installation option (Helm Chart):
    • RKE2, Rancher 2.6.4:
  • Online:
    NAME STATUS ROLES AGE VERSION
    weifor5 Ready control-plane,etcd,master 3d v1.22.7+rke2r2
    weifor6 Ready control-plane,etcd,master 3d v1.22.7+rke2r2
    weifor7 Ready control-plane,etcd,master 3d v1.22.7+rke2r2

Problem description:
When restoring through a rancher-backup Restore, an error occurs and the restore operation is retried over and over:

Error restoring namespaced resources [error restoring weifor of type provisioning.cattle.io/v1, Resource=clusters: restoreResource: err updating resource admission webhook "rancherauth.cattle.io" denied the request: creatorID annotation cannot be changed] 
error syncing 'restore-vlg69': handler restore: error restoring namespaced resources, check logs for exact error, requeuing 

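Steps to reproduce:
1. Set up a 3-node RKE2 cluster.
2. Install Rancher 2.6.4.
3. Install rancher-backup directly from Apps.
4. Create a workload cluster named weifor (a downstream cluster).
5. Create the secret holding the MinIO credentials.
6. Back up: create a Backup that uses the secret from the previous step and stores the backup file in MinIO.
7. Restore: create a Restore whose parameters are all taken from the Backup in the previous step, including the backup file name.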
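
For anyone trying to reproduce step 7 outside the UI, the Restore I created should correspond roughly to the custom resource built below. This is only a sketch: the backup file name, bucket and endpoint are copied from the log further down, but the helper `build_restore_manifest`, the resource name `restore-drill`, and the secret name and namespace (`minio-creds` in `default`) are placeholders I made up for illustration, not the values from my cluster.

```python
import json


def build_restore_manifest(name, backup_filename, bucket, endpoint,
                           secret_name, secret_namespace):
    """Return a rancher-backup Restore resource as a plain dict.

    Hypothetical helper for illustration; only the field names follow the
    rancher-backup Restore CRD (resources.cattle.io/v1).
    """
    return {
        "apiVersion": "resources.cattle.io/v1",
        "kind": "Restore",
        "metadata": {"name": name},
        "spec": {
            # File name of the backup archive stored in the bucket.
            "backupFilename": backup_filename,
            "storageLocation": {
                "s3": {
                    "credentialSecretName": secret_name,
                    "credentialSecretNamespace": secret_namespace,
                    "bucketName": bucket,
                    "endpoint": endpoint,
                }
            },
        },
    }


manifest = build_restore_manifest(
    name="restore-drill",        # placeholder name
    backup_filename=(
        "rancher-backup-rke2-weifor-with-nginx-"
        "60478ad5-a41d-4941-b147-918984e0c63f-2022-04-13T08-16-01Z.tar.gz"
    ),
    bucket="rancher-backup",
    endpoint="wf.weifor.com:49000",
    secret_name="minio-creds",   # placeholder secret name
    secret_namespace="default",  # placeholder namespace
)

print(json.dumps(manifest, indent=2))
```

The printed JSON can be saved to a file and passed to `kubectl apply -f`, since kubectl accepts JSON as well as YAML.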

Result:

INFO[2022/04/14 04:57:21] restoreResource: Restoring library-openebs-0.6.0 of type management.cattle.io/v3, Resource=catalogtemplateversions 
INFO[2022/04/14 04:57:21] Getting new UID for library-openebs          
INFO[2022/04/14 04:57:21] Successfully restored library-openebs-0.6.0  
INFO[2022/04/14 04:57:21] restoreResource: Restoring import-token-weifor of type /v1, Resource=secrets 
INFO[2022/04/14 04:57:21] Getting new UID for import-token-weifor      
INFO[2022/04/14 04:57:21] Successfully restored import-token-weifor    
INFO[2022/04/14 04:57:21] restoreResource: Restoring import-token-weifor-98abca5f-950e-4d9c-acca-3802d7f2a24d-f305e4 of type rbac.authorization.k8s.io/v1, Resource=rolebindings 
INFO[2022/04/14 04:57:21] Getting new UID for import-token-weifor      
INFO[2022/04/14 04:57:21] restoreResource: Restoring import-token-weifor-98abca5f-950e-4d9c-acca-3802d7f2a24d-role of type rbac.authorization.k8s.io/v1, Resource=roles 
INFO[2022/04/14 04:57:21] Getting new UID for import-token-weifor      
INFO[2022/04/14 04:57:21] restoreResource: Restoring import-token-weifor-98abca5f-950e-4d9c-acca-3802d7f2a24d of type /v1, Resource=serviceaccounts 
INFO[2022/04/14 04:57:21] Getting new UID for import-token-weifor      
INFO[2022/04/14 04:57:21] Processing controllerRef apps/v1/deployments/rancher 
INFO[2022/04/14 04:57:21] Scaling up controllerRef apps/v1/deployments/rancher to 1 
ERRO[2022/04/14 04:57:21] Error restoring namespaced resources [error restoring weifor of type provisioning.cattle.io/v1, Resource=clusters: restoreResource: err updating resource admission webhook "rancherauth.cattle.io" denied the request: creatorID annotation cannot be changed] 
ERRO[2022/04/14 04:57:21] error syncing 'restore-vlg69': handler restore: error restoring namespaced resources, check logs for exact error, requeuing 
INFO[2022/04/14 04:57:51] Processing Restore CR restore-vlg69          
INFO[2022/04/14 04:57:51] Restoring from backup rancher-backup-rke2-weifor-with-nginx-60478ad5-a41d-4941-b147-918984e0c63f-2022-04-13T08-16-01Z.tar.gz 
INFO[2022/04/14 04:57:51] invoking set s3 service client                s3-accessKey=AKIAIOSFODNN7EXAMPLE s3-bucketName=rancher-backup s3-endpoint="wf.weifor.com:49000" s3-endpoint-ca= s3-folder= s3-region=
INFO[2022/04/14 04:57:51] Temporary location of backup file from s3: /tmp/rancher-backup-rke2-weifor-with-nginx-60478ad5-a41d-4941-b147-918984e0c63f-2022-04-13T08-16-01Z.tar.gz 
INFO[2022/04/14 04:57:51] Successfully downloaded [rancher-backup-rke2-weifor-with-nginx-60478ad5-a41d-4941-b147-918984e0c63f-2022-04-13T08-16-01Z.tar.gz] 
INFO[2022/04/14 04:57:51] Processing controllerRef apps/v1/deployments/rancher 
INFO[2022/04/14 04:57:51] Scaling down controllerRef apps/v1/deployments/rancher to 0 
INFO[2022/04/14 04:57:51] Starting to restore CRDs for restore CR restore-vlg69 
INFO[2022/04/14 04:57:51] restoreResource: Restoring catalogtemplates.management.cattle.io of type apiextensions.k8s.io/v1, Resource=customresourcedefinitions 
INFO[2022/04/14 04:57:51] Successfully restored catalogtemplates.management.cattle.io 
INFO[2022/04/14 04:57:51] restoreResource: Restoring catalogtemplateversions.management.cattle.io of type apiextensions.k8s.io/v1, Resource=customresourcedefinitions 
INFO[2022/04/14 04:57:51] Successfully restored catalogtemplateversions.management.cattle.io 
INFO[2022/04/14 04:57:51] restoreResource: Restoring clusterrepos.catalog.cattle.io of type apiextensions.k8s.io/v1, Resource=customresourcedefinitions 
INFO[2022/04/14 04:57:51] Successfully restored clusterrepos.catalog.cattle.io 
INFO[2022/04/14 04:57:51] restoreResource: Restoring etcdsnapshots.rke.cattle.io of type apiextensions.k8s.io/v1, Resource=customresourcedefinitions 
INFO[2022/04/14 04:57:52] Successfully restored etcdsnapshots.rke.cattle.io 
INFO[2022/04/14 04:57:52] restoreResource: Restoring machinedeployments.cluster.x-k8s.io of type apiextensions.k8s.io/v1, Resource=customresourcedefinitions 
INFO[2022/04/14 04:57:52] Successfully restored machinedeployments.cluster.x-k8s.io 
INFO[2022/04/14 04:57:52] restoreResource: Restoring nodepools.management.cattle.io of type apiextensions.k8s.io/v1, Resource=customresourcedefinitions 
INFO[2022/04/14 04:57:52] Successfully restored nodepools.management.cattle.io    
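
The log repeats the whole restore pass on every retry (here the Restore is processed again 30 seconds after the error), so the actual failures are easy to miss. Below is a small sketch for pulling out only the resources that failed. The function name `summarize_errors` and the regular expressions are my own and assume the log format shown above; they are not part of rancher-backup.

```python
import re

# Matches lines such as: ERRO[2022/04/14 04:57:21] message...
LINE_RE = re.compile(r"^(?P<level>[A-Z]{4})\[(?P<ts>[^\]]+)\]\s+(?P<msg>.*\S)\s*$")
# Matches: error restoring <name> of type <group/version>, Resource=<resource>
FAIL_RE = re.compile(
    r"error restoring (?P<name>\S+) of type (?P<type>[^,]+), "
    r"Resource=(?P<resource>[^:\s]+)"
)


def summarize_errors(log_text):
    """Return (timestamp, name, type, resource) for each failed resource."""
    failures = []
    for line in log_text.splitlines():
        m = LINE_RE.match(line)
        if not m or m.group("level") != "ERRO":
            continue  # skip INFO lines and anything that is not a log line
        for f in FAIL_RE.finditer(m.group("msg")):
            failures.append(
                (m.group("ts"), f.group("name"), f.group("type"), f.group("resource"))
            )
    return failures


sample = """\
INFO[2022/04/14 04:57:21] Scaling up controllerRef apps/v1/deployments/rancher to 1
ERRO[2022/04/14 04:57:21] Error restoring namespaced resources [error restoring weifor of type provisioning.cattle.io/v1, Resource=clusters: restoreResource: err updating resource admission webhook "rancherauth.cattle.io" denied the request: creatorID annotation cannot be changed]
ERRO[2022/04/14 04:57:21] error syncing 'restore-vlg69': handler restore: error restoring namespaced resources, check logs for exact error, requeuing
"""

for ts, name, rtype, resource in summarize_errors(sample):
    print(f"{ts}  {resource}/{name}  ({rtype})")
```

On the excerpt above this reports a single failing object, the `weifor` cluster of type `provisioning.cattle.io/v1`, which is the one rejected by the `rancherauth.cattle.io` webhook.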

Screenshots:

Additional context:

[/details]

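I do not yet know why this happens, since all I did was run through a drill plan:

With Rancher Server healthy and available
==> take a backup for future recovery (done in the Rancher UI)
==> run a restore as a drill (done in the Rancher UI)
==> the result was not as expected: the restore broke the previously working Rancher Server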

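Because I could not find a way to cleanly and completely uninstall Rancher Server from an RKE2 cluster, I had to delete the whole cluster, rebuild it, and restore from the backup file into the fresh environment. That restore succeeded (so why does it fail in the original cluster?!). The restored Rancher is usable, but the downstream cluster it manages now shows **Waiting for plan to be applied**.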