Local k3s failure causes Rancher 2.6.3 to restart in a loop

Rancher Server setup

  • Rancher version: v2.6.3
  • Installation option (Docker install/Helm Chart): Docker install
  • Online or air-gapped deployment: online

Downstream cluster information

  • Embedded K3s version: v1.21.7+k3s1

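Problem description:
Rancher restarts in a loop. Judging by the logs, the cause appears to be a failure of the k3s instance embedded in Rancher.
Rancher currently cannot start, and after a restart, docker logs rancher still prints only the old log entries with nothing new.

Steps to reproduce:
I don't know how to reproduce it. I last logged in to Rancher about two weeks ago and found the problem on this login. I made no changes in between.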

Logs
2022/06/11 10:57:03 [ERROR] error syncing 'rancher-rke2-charts': handler helm-clusterrepo-ensure: git -C /var/lib/rancher-data/local-catalogs/v2/rancher-rke2-charts/675f1b63a0a83905972dcab2794479ed599a6f41b86cd6193d69472d0fa889c9 reset --hard FETCH_HEAD error: exit status 128, detail: fatal: Unable to create '/var/lib/rancher-data/local-catalogs/v2/rancher-rke2-charts/675f1b63a0a83905972dcab2794479ed599a6f41b86cd6193d69472d0fa889c9/.git/index.lock': File exists.
Another git process seems to be running in this repository, e.g.
an editor opened by 'git commit'. Please make sure all processes
are terminated then try again. If it still fails, a git process
may have crashed in this repository earlier:
remove the file manually to continue.
, handler helm-clusterrepo-download: git -C /var/lib/rancher-data/local-catalogs/v2/rancher-rke2-charts/675f1b63a0a83905972dcab2794479ed599a6f41b86cd6193d69472d0fa889c9 reset --hard HEAD error: exit status 128, detail: fatal: Unable to create '/var/lib/rancher-data/local-catalogs/v2/rancher-rke2-charts/675f1b63a0a83905972dcab2794479ed599a6f41b86cd6193d69472d0fa889c9/.git/index.lock': File exists.
Another git process seems to be running in this repository, e.g.
an editor opened by 'git commit'. Please make sure all processes
are terminated then try again. If it still fails, a git process
may have crashed in this repository earlier:
remove the file manually to continue.
, requeuing
2022/06/11 10:57:04 [ERROR] Failed to install system chart rancher-webhook: failed to install , pod cattle-system/helm-operation-74kwq exited 123
2022/06/11 10:57:22 [ERROR] error syncing 'rancher-partner-charts': handler helm-clusterrepo-ensure: git -C /var/lib/rancher-data/local-catalogs/v2/rancher-partner-charts/8f17acdce9bffd6e05a58a3798840e408c4ea71783381ecd2e9af30baad65974 fetch origin e54404468214578ddce3d906415aca6f064aee78 error: exit status 128, detail: fatal: .git/index: index file smaller than expected
, handler helm-clusterrepo-download: git -C /var/lib/rancher-data/local-catalogs/v2/rancher-partner-charts/8f17acdce9bffd6e05a58a3798840e408c4ea71783381ecd2e9af30baad65974 reset --hard HEAD error: exit status 128, detail: fatal: .git/index: index file smaller than expected
, requeuing
2022/06/11 10:59:02 [ERROR] error syncing 'rancher-charts': handler helm-clusterrepo-ensure: git -C /var/lib/rancher-data/local-catalogs/v2/rancher-charts/4b40cac650031b74776e87c1a726b0484d0877c3ec137da0872547ff9b73a721 reset --hard FETCH_HEAD error: exit status 128, detail: fatal: Unable to create '/var/lib/rancher-data/local-catalogs/v2/rancher-charts/4b40cac650031b74776e87c1a726b0484d0877c3ec137da0872547ff9b73a721/.git/index.lock': File exists.
Another git process seems to be running in this repository, e.g.
an editor opened by 'git commit'. Please make sure all processes
are terminated then try again. If it still fails, a git process
may have crashed in this repository earlier:
remove the file manually to continue.
, handler helm-clusterrepo-download: git -C /var/lib/rancher-data/local-catalogs/v2/rancher-charts/4b40cac650031b74776e87c1a726b0484d0877c3ec137da0872547ff9b73a721 reset --hard HEAD error: exit status 128, detail: fatal: Unable to create '/var/lib/rancher-data/local-catalogs/v2/rancher-charts/4b40cac650031b74776e87c1a726b0484d0877c3ec137da0872547ff9b73a721/.git/index.lock': File exists.
Another git process seems to be running in this repository, e.g.
an editor opened by 'git commit'. Please make sure all processes
are terminated then try again. If it still fails, a git process
may have crashed in this repository earlier:
remove the file manually to continue.
, requeuing
2022/06/11 10:59:04 [ERROR] error syncing 'rancher-rke2-charts': handler helm-clusterrepo-ensure: git -C /var/lib/rancher-data/local-catalogs/v2/rancher-rke2-charts/675f1b63a0a83905972dcab2794479ed599a6f41b86cd6193d69472d0fa889c9 reset --hard FETCH_HEAD error: exit status 128, detail: fatal: Unable to create '/var/lib/rancher-data/local-catalogs/v2/rancher-rke2-charts/675f1b63a0a83905972dcab2794479ed599a6f41b86cd6193d69472d0fa889c9/.git/index.lock': File exists.
Another git process seems to be running in this repository, e.g.
an editor opened by 'git commit'. Please make sure all processes
are terminated then try again. If it still fails, a git process
may have crashed in this repository earlier:
remove the file manually to continue.
, handler helm-clusterrepo-download: git -C /var/lib/rancher-data/local-catalogs/v2/rancher-rke2-charts/675f1b63a0a83905972dcab2794479ed599a6f41b86cd6193d69472d0fa889c9 reset --hard HEAD error: exit status 128, detail: fatal: Unable to create '/var/lib/rancher-data/local-catalogs/v2/rancher-rke2-charts/675f1b63a0a83905972dcab2794479ed599a6f41b86cd6193d69472d0fa889c9/.git/index.lock': File exists.
Another git process seems to be running in this repository, e.g.
an editor opened by 'git commit'. Please make sure all processes
are terminated then try again. If it still fails, a git process
may have crashed in this repository earlier:
remove the file manually to continue.
, requeuing
2022/06/11 10:59:05 [ERROR] Failed to install system chart rancher-webhook: failed to install , pod cattle-system/helm-operation-czqrm exited 123
2022/06/11 10:59:22 [ERROR] error syncing 'rancher-partner-charts': handler helm-clusterrepo-ensure: git -C /var/lib/rancher-data/local-catalogs/v2/rancher-partner-charts/8f17acdce9bffd6e05a58a3798840e408c4ea71783381ecd2e9af30baad65974 fetch origin e54404468214578ddce3d906415aca6f064aee78 error: exit status 128, detail: fatal: .git/index: index file smaller than expected
, handler helm-clusterrepo-download: git -C /var/lib/rancher-data/local-catalogs/v2/rancher-partner-charts/8f17acdce9bffd6e05a58a3798840e408c4ea71783381ecd2e9af30baad65974 reset --hard HEAD error: exit status 128, detail: fatal: .git/index: index file smaller than expected
, requeuing
2022/06/11 11:00:28 httputil: ReverseProxy read error during body copy: unexpected EOF
2022/06/11 11:00:28 [ERROR] [updateClusterHealth] Failed to update cluster [c-m-q6v9sgz6]: Put "https://127.0.0.1:6443/apis/management.cattle.io/v3/clusters/c-m-q6v9sgz6": EOF
E0611 11:00:28.852309    1225 leaderelection.go:330] error retrieving resource lock kube-system/cattle-controllers: Get "https://127.0.0.1:6443/api/v1/namespaces/kube-system/configmaps/cattle-controllers?timeout=15m0s": dial tcp 127.0.0.1:6443: connect: connection refused
2022/06/11 11:00:28 [FATAL] k3s exited with: exit status 255

If this is a single-node Docker install, look for the k3s logs in the subdirectories of /var/lib/rancher inside the Rancher server container. They should show why k3s crashed.
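As a rough illustration of that log search, here is a minimal Python sketch. It assumes the contents of /var/lib/rancher are reachable from the host (for example through a bind mount or a copy of the directory), that the k3s log files end in .log, and that fatal entries carry the text level=fatal, as in the k3s line quoted later in this thread. The function name find_fatal_lines is hypothetical.

```python
from pathlib import Path


def find_fatal_lines(log_root):
    """Return (file, line) pairs for every log line containing level=fatal.

    log_root is an assumption: a directory on the host that holds the
    contents of /var/lib/rancher from the Rancher container.
    """
    hits = []
    for log_file in sorted(Path(log_root).rglob("*.log")):
        # errors="replace" keeps the scan going if a log contains odd bytes
        with log_file.open(encoding="utf-8", errors="replace") as handle:
            for line in handle:
                if "level=fatal" in line:
                    hits.append((str(log_file), line.rstrip("\n")))
    return hits
```

Calling find_fatal_lines with the host-side path of the data directory returns the fatal lines together with the file each one came from, which narrows down which k3s log to read in full.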

Thanks, it has recovered.
I still don't know why k3s crashed, but the last line of k3s-cluster-reset.log was:

time="2022-06-20T10:34:16.530473756Z" level=fatal msg="starting kubernetes: preparing server: start managed database: cluster-reset was successfully performed, please remove the cluster-reset flag and start k3s normally, if you need to perform another cluster reset, you must first manually delete the /var/lib/rancher/k3s/server/db/reset-flag file"

So I deleted the /var/lib/rancher/k3s/server/db/reset-flag file and restarted Docker, and everything worked again.
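The deletion step could be sketched in Python as follows. This is only an illustration under assumptions: it presumes /var/lib/rancher from the container is bind-mounted to a host directory, passed in as host_data_dir. The poster does not say how the file was reached, and a named Docker volume would live at a different host path. The function name remove_reset_flag is hypothetical.

```python
from pathlib import Path

# Path inside the Rancher container, taken from the k3s fatal message.
RESET_FLAG_IN_CONTAINER = "/var/lib/rancher/k3s/server/db/reset-flag"


def remove_reset_flag(host_data_dir):
    """Delete the reset-flag file under a host copy of /var/lib/rancher.

    host_data_dir is an assumption: the host directory that is
    bind-mounted to /var/lib/rancher in the container.
    Returns True if a file was removed, False if none was found.
    """
    relative = Path(RESET_FLAG_IN_CONTAINER).relative_to("/var/lib/rancher")
    flag = Path(host_data_dir) / relative
    if flag.is_file():
        flag.unlink()
        return True
    return False
```

A False return means no flag file was found at that location, so either the mount path is wrong or the flag is not what keeps k3s from starting. After a successful removal, the Rancher container still has to be restarted, as described above.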

Awesome. Deleting /var/lib/rancher/k3s/server/db/reset-flag also fixed the endless Rancher restart loop for me.