Single-node Rancher 2.6.3 keeps restarting and cannot communicate with K8s

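I deployed Rancher 2.6.3 on a single node using the rancher/rancher:stable image. After running for a while, the rancher-server container started restarting continuously. Part of the log follows: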

2022/05/05 08:23:30 [ERROR] error syncing 'validating-webhook-configuration': handler need-a-cert: services "webhook-service" not found, requeuing
2022/05/05 08:23:30 [ERROR] error syncing 'mutating-webhook-configuration': handler need-a-cert: services "webhook-service" not found, requeuing
2022/05/05 08:24:17 [ERROR] Failed to install system chart fleet: pod cattle-system/helm-operation-6dwzg failed, watch closed
2022/05/05 08:25:18 [ERROR] Failed to install system chart fleet-crd: pod cattle-system/helm-operation-grdjp failed, watch closed
2022/05/05 08:25:30 [ERROR] error syncing 'validating-webhook-configuration': handler need-a-cert: services "webhook-service" not found, requeuing
2022/05/05 08:25:30 [ERROR] error syncing 'mutating-webhook-configuration': handler need-a-cert: services "webhook-service" not found, requeuing
2022/05/05 08:26:18 [ERROR] Failed to install system chart rancher-webhook: pod cattle-system/helm-operation-sg4g7 failed, watch closed
2022/05/05 08:27:19 [ERROR] Failed to install system chart fleet: pod cattle-system/helm-operation-fc7d9 failed, watch closed
2022/05/05 08:27:30 [ERROR] error syncing 'validating-webhook-configuration': handler need-a-cert: services "webhook-service" not found, requeuing
2022/05/05 08:27:30 [ERROR] error syncing 'mutating-webhook-configuration': handler need-a-cert: services "webhook-service" not found, requeuing
2022/05/05 08:28:19 [ERROR] Failed to install system chart fleet-crd: pod cattle-system/helm-operation-flltp failed, watch closed
2022/05/05 08:29:19 [ERROR] Failed to install system chart rancher-webhook: pod cattle-system/helm-operation-5shsp failed, watch closed
2022/05/05 08:29:30 [ERROR] error syncing 'validating-webhook-configuration': handler need-a-cert: services "webhook-service" not found, requeuing
2022/05/05 08:29:30 [ERROR] error syncing 'mutating-webhook-configuration': handler need-a-cert: services "webhook-service" not found, requeuing
2022/05/05 08:30:20 [ERROR] Failed to install system chart fleet: pod cattle-system/helm-operation-jmdpb failed, watch closed
2022/05/05 08:31:20 [ERROR] Failed to install system chart fleet-crd: pod cattle-system/helm-operation-g9r84 failed, watch closed
2022/05/05 08:31:30 [ERROR] error syncing 'validating-webhook-configuration': handler need-a-cert: services "webhook-service" not found, requeuing
2022/05/05 08:31:30 [ERROR] error syncing 'mutating-webhook-configuration': handler need-a-cert: services "webhook-service" not found, requeuing
2022/05/05 08:32:20 [ERROR] Failed to install system chart rancher-webhook: pod cattle-system/helm-operation-sc6b6 failed, watch closed
2022/05/05 08:33:21 [ERROR] Failed to install system chart fleet: pod cattle-system/helm-operation-7pw9w failed, watch closed
2022/05/05 08:33:30 [ERROR] error syncing 'validating-webhook-configuration': handler need-a-cert: services "webhook-service" not found, requeuing
2022/05/05 08:33:30 [ERROR] error syncing 'mutating-webhook-configuration': handler need-a-cert: services "webhook-service" not found, requeuing
2022/05/05 08:34:21 [ERROR] Failed to install system chart fleet-crd: pod cattle-system/helm-operation-pp6pm failed, watch closed
2022/05/05 08:35:21 [ERROR] Failed to install system chart rancher-webhook: pod cattle-system/helm-operation-ldv5n failed, watch closed
2022/05/05 08:35:30 [ERROR] error syncing 'validating-webhook-configuration': handler need-a-cert: services "webhook-service" not found, requeuing
2022/05/05 08:35:30 [ERROR] error syncing 'mutating-webhook-configuration': handler need-a-cert: services "webhook-service" not found, requeuing
2022/05/05 08:36:21 [ERROR] Failed to install system chart fleet-crd: pod cattle-system/helm-operation-rd8xx failed, watch closed
2022/05/05 08:37:22 [ERROR] Failed to install system chart fleet: pod cattle-system/helm-operation-t2csl failed, watch closed
2022/05/05 08:37:30 [ERROR] error syncing 'validating-webhook-configuration': handler need-a-cert: services "webhook-service" not found, requeuing
2022/05/05 08:37:30 [ERROR] error syncing 'mutating-webhook-configuration': handler need-a-cert: services "webhook-service" not found, requeuing
2022/05/05 08:38:22 [ERROR] Failed to install system chart rancher-webhook: pod cattle-system/helm-operation-m5qnz failed, watch closed
2022/05/05 08:39:22 [ERROR] Failed to install system chart fleet: pod cattle-system/helm-operation-kdskb failed, watch closed
2022/05/05 08:39:30 [ERROR] error syncing 'validating-webhook-configuration': handler need-a-cert: services "webhook-service" not found, requeuing
2022/05/05 08:39:30 [ERROR] error syncing 'mutating-webhook-configuration': handler need-a-cert: services "webhook-service" not found, requeuing
2022/05/05 08:40:22 [ERROR] Failed to install system chart fleet-crd: pod cattle-system/helm-operation-vxvdd failed, watch closed
2022/05/05 08:41:23 [ERROR] Failed to install system chart rancher-webhook: pod cattle-system/helm-operation-gxj9p failed, watch closed
2022/05/05 08:41:30 [ERROR] error syncing 'validating-webhook-configuration': handler need-a-cert: services "webhook-service" not found, requeuing
2022/05/05 08:41:30 [ERROR] error syncing 'mutating-webhook-configuration': handler need-a-cert: services "webhook-service" not found, requeuing
2022/05/05 08:42:23 [ERROR] Failed to install system chart fleet: pod cattle-system/helm-operation-mn6x9 failed, watch closed
2022/05/05 08:43:23 [ERROR] Failed to install system chart fleet-crd: pod cattle-system/helm-operation-xt9bf failed, watch closed
2022/05/05 08:43:30 [ERROR] error syncing 'validating-webhook-configuration': handler need-a-cert: services "webhook-service" not found, requeuing
2022/05/05 08:43:30 [ERROR] error syncing 'mutating-webhook-configuration': handler need-a-cert: services "webhook-service" not found, requeuing
2022/05/05 08:44:02 [ERROR] error syncing 'cattle-fleet-system/helm-operation-khf4p': handler helm-operation: Operation cannot be fulfilled on operations.catalog.cattle.io "helm-operation-khf4p": StorageError: invalid object, Code: 4, Key: /registry/catalog.cattle.io/operations/cattle-fleet-system/helm-operation-khf4p, ResourceVersion: 0, AdditionalErrorMsg: Precondition failed: UID in precondition: e94ac49a-08ec-4515-9479-99fba21b65c7, UID in object meta: , requeuing
2022/05/05 08:44:23 [ERROR] Failed to install system chart rancher-webhook: pod cattle-system/helm-operation-6fwn2 failed, watch closed
2022/05/05 08:45:24 [ERROR] Failed to install system chart fleet: pod cattle-system/helm-operation-lcf8f failed, watch closed
2022/05/05 08:45:30 [ERROR] error syncing 'validating-webhook-configuration': handler need-a-cert: services "webhook-service" not found, requeuing
2022/05/05 08:45:30 [ERROR] error syncing 'mutating-webhook-configuration': handler need-a-cert: services "webhook-service" not found, requeuing
2022/05/05 08:46:03 [ERROR] error syncing 'cattle-system/helm-operation-rqm2m': handler helm-operation: Operation cannot be fulfilled on operations.catalog.cattle.io "helm-operation-rqm2m": StorageError: invalid object, Code: 4, Key: /registry/catalog.cattle.io/operations/cattle-system/helm-operation-rqm2m, ResourceVersion: 0, AdditionalErrorMsg: Precondition failed: UID in precondition: 91c7e506-fdee-41d0-840e-3f6cd0a80c03, UID in object meta: , requeuing
2022/05/05 08:46:24 [ERROR] Failed to install system chart fleet-crd: pod cattle-system/helm-operation-4t96x failed, watch closed
2022/05/05 08:47:24 [ERROR] Failed to install system chart rancher-webhook: pod cattle-system/helm-operation-qqrct failed, watch closed
2022/05/05 08:47:30 [ERROR] error syncing 'validating-webhook-configuration': handler need-a-cert: services "webhook-service" not found, requeuing
2022/05/05 08:47:30 [ERROR] error syncing 'mutating-webhook-configuration': handler need-a-cert: services "webhook-service" not found, requeuing
2022/05/05 08:48:25 [ERROR] Failed to install system chart fleet-crd: pod cattle-system/helm-operation-xlbh2 failed, watch closed
2022/05/05 08:48:33 [ERROR] error syncing '_all_': handler cluster-deploy: Get "https://127.0.0.1:6443/apis/management.cattle.io/v3/users?labelSelector=EDSN6T35DKT2UBQVC5M6ONO%!D(MISSING)hashed-principal-name": dial tcp 127.0.0.1:6443: connect: connection refused, requeuing
2022/05/05 08:48:33 [ERROR] Failed to install system chart fleet: pod cattle-system/helm-operation-chkh9 failed, watch closed
2022/05/05 08:48:33 [FATAL] k3s exited with: exit status 255

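In the single-node Docker install mode, the rancher-server container starts an embedded k3s inside itself.
"k3s exited with: exit status 255" is the key clue. You need to check the k3s logs to find out why it cannot run.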
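
To see what Rancher logged just before k3s died, something like the following may help. It is only a sketch: fatal_context is a hypothetical helper name, and the container name "rancher" is taken from the start command posted in this thread.

```shell
#!/bin/sh
# Hypothetical helper (not part of Rancher or Docker): reads log text on stdin
# and prints every [FATAL] line together with the N lines that precede it.
fatal_context() {
    lines_before="${1:-5}"
    # -F treats "[FATAL]" as a literal string; -B prints preceding context lines.
    grep -B "$lines_before" -F '[FATAL]'
}

# Usage (assumes the container is named "rancher"):
#   docker logs --tail 500 rancher 2>&1 | fatal_context 5
```

This only shows the Rancher side of the failure; the reason k3s itself exited still has to come from the k3s log.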

For the embedded k3s logs, take a look at the container's startup script and try to locate the k3s log file for further analysis.

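Thank you for the reply.
I found the source file and ran the script on the server, but it failed with an error: exec: tini: not found
I commented out the last line of the script and ran it again, and it returned nothing.
My Rancher start command is as follows: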

docker run -d --name=rancher \
  --privileged --restart=unless-stopped \
  -v /data/rancher:/var/lib/rancher \
  -p 80:80 -p 443:443 \
  rancher/rancher:stable

You have already mapped the data directory with -v /data/rancher:/var/lib/rancher
So it looks like you only need to look for the k3s log under /data/rancher on the host.
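
A minimal sketch of that search, assuming the log file's name contains "k3s" and ends in ".log" (the exact file name is not confirmed in this thread, so adjust the pattern if nothing turns up). find_k3s_logs is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: list files under the mapped data directory whose name
# looks like a k3s log. The '*k3s*.log' pattern is an assumption.
find_k3s_logs() {
    data_dir="${1:-/data/rancher}"
    # Limit the depth so the search does not walk the whole k3s data tree.
    find "$data_dir" -maxdepth 3 -type f -name '*k3s*.log' 2>/dev/null
}

# Usage on the host (the path comes from the -v mapping above):
#   find_k3s_logs /data/rancher | while read -r f; do
#       echo "== $f"; tail -n 50 "$f"
#   done
```

Reading the file from the host has the advantage that it works even while the container is stuck in its restart loop.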

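As for "ran the script on the server, but it failed with an error: exec: tini: not found", I don't understand what you mean. That script has already run as part of the container startup.
I posted it so that you could see how k3s is started.
If you want to get into the container, docker exec usually needs a bash/sh command appended. For example: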

docker exec -it xxx bash

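Can I delete this Rancher container, deploy a fresh Rancher, and then import the clusters that the old Rancher provisioned into the new one? Is there an established procedure for this?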

Did you find a solution? I ran into the same problem: I get the same errors after upgrading from 2.5.0 to 2.6.3.