Rancher single-node install: UI becomes unreachable every so often while other Deployments stay accessible; the rancher/rancher:v2.5.12 container keeps restarting

  • Rancher version:
    v2.5.12, started directly with docker run from the pulled image
  • Online or air-gapped deployment:
    Online

Downstream cluster information

  • Kubernetes version:
    v1.20.15

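User information
Administrator
Problem description:
Every so often the web UI becomes unreachable. The business applications deployed in the cluster keep working at that point, but docker ps shows the rancher/rancher:v2.5.12 container restarting continuously.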


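docker logs on that container hangs with no output, docker exec -it cannot enter the container, docker stop cannot stop it, and restarting Docker also hangs.
Rebooting the operating system once or twice restores service.
Steps to reproduce:
It happens once every few days.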

Expected result:
rancher/rancher:v2.5.12 keeps running normally.
Screenshots:

[details="Logs"]
Viewing the container logs just hangs with no response.

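The Docker daemon itself has stopped responding. Is the system load normal?
First find out why dockerd is unresponsive and get hold of the Rancher logs; only then can this be investigated further.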
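
As a starting point for the two checks suggested above, here is a minimal Python sketch. It is only an illustration, not a Rancher or Docker tool: the helper names `load_is_high` and `docker_responds`, the 10-second timeout, and the "load above CPU count" threshold are all assumptions chosen for the example. It compares the 1-minute load average with the number of CPUs and checks whether `docker version` (which has to contact the daemon) returns within the timeout.

```python
import os
import subprocess


def load_is_high(load1: float, cpus: int, factor: float = 1.0) -> bool:
    """Return True when the 1-minute load average exceeds cpus * factor.

    The threshold is a rule of thumb for this example, not an official limit.
    """
    return load1 > cpus * factor


def docker_responds(timeout_s: float = 10.0) -> bool:
    """Return True if `docker version` succeeds within timeout_s seconds.

    A timeout suggests the daemon is hung rather than simply stopped.
    """
    try:
        result = subprocess.run(
            ["docker", "version"],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,
        )
    except (subprocess.TimeoutExpired, FileNotFoundError):
        return False
    return result.returncode == 0


if __name__ == "__main__":
    load1, load5, load15 = os.getloadavg()
    cpus = os.cpu_count() or 1
    print(f"load averages: {load1:.2f} {load5:.2f} {load15:.2f} on {cpus} CPUs")
    print("load high:", load_is_high(load1, cpus))
    print("dockerd responds:", docker_responds())
```

Running something like this periodically (for example from cron) and keeping the output could help show whether the host was already overloaded before the daemon stopped answering.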

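All the other containers are fine and every business pod is still reachable; only the Rancher UI cannot be accessed. After rebooting the server, the k3s log from around the time of the failure was: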
W0624 02:15:35.378967 54 watcher.go:207] watch chan error: etcdserver: mvcc: required revision has been compacted
W0624 02:15:39.101660 54 watcher.go:207] watch chan error: etcdserver: mvcc: required revision has been compacted
W0624 02:21:19.081733 54 watcher.go:207] watch chan error: etcdserver: mvcc: required revision has been compacted
W0624 02:22:19.213129 54 watcher.go:207] watch chan error: etcdserver: mvcc: required revision has been compacted
W0624 02:25:35.914033 54 watcher.go:207] watch chan error: etcdserver: mvcc: required revision has been compacted
W0624 02:28:08.484462 54 watcher.go:207] watch chan error: etcdserver: mvcc: required revision has been compacted
W0624 02:34:04.291023 54 watcher.go:207] watch chan error: etcdserver: mvcc: required revision has been compacted
W0624 02:36:22.645434 54 watcher.go:207] watch chan error: etcdserver: mvcc: required revision has been compacted
time="2022-06-24T02:40:10.663978932+08:00" level=error msg="Remotedialer proxy error" error="read tcp 172.17.0.2:49372->172.17.0.2:6443: i/o timeout"
time="2022-06-24T02:40:10.666742158+08:00" level=info msg="error in remotedialer server [400]: websocket: close 1006 (abnormal closure): unexpected EOF"
time="2022-06-24T02:40:15.766481516+08:00" level=info msg="Connecting to proxy" url="wss://172.17.0.2:6443/v1-k3s/connect"
time="2022-06-24T02:40:15.772207393+08:00" level=info msg="Handling backend connection request [local-node]"
I0624 02:40:22.201334 54 trace.go:205] Trace[432890754]: "GuaranteedUpdate etcd3" type:*coordination.Lease (24-Jun-2022 02:40:21.627) (total time: 573ms):
Trace[432890754]: ---"Transaction committed" 573ms (02:40:00.201)
Trace[432890754]: [573.669847ms] [573.669847ms] END
I0624 02:40:22.201523 54 trace.go:205] Trace[1608209100]: "Update" url:/apis/coordination.k8s.io/v1/namespaces/kube-system/leases/kube-scheduler,user-agent:k3s/v1.19.13+k3s1 (linux/amd64) kubernetes/99eadcc/leader-election,client:127.0.0.1 (24-Jun-2022 02:40:21.627) (total time: 574ms):
Trace[1608209100]: ---"Object stored in database" 573ms (02:40:00.201)
Trace[1608209100]: [574.094096ms] [574.094096ms] END
I0624 02:40:24.925386 54 trace.go:205] Trace[154254355]: "Get" url:/apis/coordination.k8s.io/v1/namespaces/kube-system/leases/cloud-controller-manager,user-agent:k3s/v1.19.13+k3s1 (linux/amd64) kubernetes/99eadcc/leader-election,client:127.0.0.1 (24-Jun-2022 02:40:23.631) (total time: 1293ms):
Trace[154254355]: ---"About to write a response" 1293ms (02:40:00.925)
Trace[154254355]: [1.293436449s] [1.293436449s] END
I0624 02:40:24.925744 54 trace.go:205] Trace[1027320657]: "GuaranteedUpdate etcd3" type:*core.ConfigMap (24-Jun-2022 02:40:23.629) (total time: 1296ms):
Trace[1027320657]: ---"Transaction committed" 1296ms (02:40:00.925)
Trace[1027320657]: [1.29665995s] [1.29665995s] END
I0624 02:40:24.925914 54 trace.go:205] Trace[484912717]: "Update" url:/api/v1/namespaces/kube-system/configmaps/k3s,user-agent:k3s/v1.19.13+k3s1 (linux/amd64) kubernetes/99eadcc,client:127.0.0.1 (24-Jun-2022 02:40:23.628) (total time: 1297ms):
Trace[484912717]: ---"Object stored in database" 1296ms (02:40:00.925)
Trace[484912717]: [1.297049871s] [1.297049871s] END
I0624 02:40:24.927045 54 trace.go:205] Trace[1882061195]: "Get" url:/api/v1/namespaces/kube-system/endpoints/kube-scheduler,user-agent:k3s/v1.19.13+k3s1 (linux/amd64) kubernetes/99eadcc/leader-election,client:127.0.0.1 (24-Jun-2022 02:40:24.205) (total time: 721ms):
Trace[1882061195]: ---"About to write a response" 721ms (02:40:00.926)
Trace[1882061195]: [721.338054ms] [721.338054ms] END
I0624 02:40:24.927078 54 trace.go:205] Trace[627919641]: "Get" url:/api/v1/namespaces/kube-system/endpoints/kube-controller-manager,user-agent:k3s/v1.19.13+k3s1 (linux/amd64) kubernetes/99eadcc/leader-election,client:127.0.0.1 (24-Jun-2022 02:40:23.628) (total time: 1298ms):
Trace[627919641]: ---"About to write a response" 1298ms (02:40:00.926)
Trace[627919641]: [1.298278863s] [1.298278863s] END
I0624 02:40:54.315558 54 trace.go:205] Trace[895555987]: "Get" url:/api/v1/namespaces/kube-system/endpoints/cloud-controller-manager,user-agent:k3s/v1.19.13+k3s1 (linux/amd64) kubernetes/99eadcc/leader-election,client:127.0.0.1 (24-Jun-2022 02:40:53.505) (total time: 810ms):
Trace[895555987]: ---"About to write a response" 810ms (02:40:00.315)
Trace[895555987]: [810.108013ms] [810.108013ms] END
I0624 02:40:54.315898 54 trace.go:205] Trace[793712190]: "Get" url:/api/v1/namespaces/kube-system/endpoints/kube-scheduler,user-agent:k3s/v1.19.13+k3s1 (linux/amd64) kubernetes/99eadcc/leader-election,client:127.0.0.1 (24-Jun-2022 02:40:53.504) (total time: 811ms):
Trace[793712190]: ---"About to write a response" 811ms (02:40:00.315)
Trace[793712190]: [811.832438ms] [811.832438ms] END
I0624 02:40:54.317958 54 trace.go:205] Trace[1398811364]: "Get" url:/api/v1/namespaces/kube-system/endpoints/kube-controller-manager,user-agent:k3s/v1.19.13+k3s1 (linux/amd64) kubernetes/99eadcc/leader-election,client:127.0.0.1 (24-Jun-2022 02:40:53.511) (total time: 806ms):
Trace[1398811364]: ---"About to write a response" 806ms (02:40:00.317)
Trace[1398811364]: [806.671169ms] [806.671169ms] END
I0624 02:40:59.226104 54 controller.go:609] quota admission added evaluator for: etcdbackups.management.cattle.io
W0624 02:45:45.689305 54 watcher.go:207] watch chan error: etcdserver: mvcc: required revision has been compacted
W0624 02:45:47.804456 54 watcher.go:207] watch chan error: etcdserver: mvcc: required revision has been compacted
W0624 02:47:08.454472 54 watcher.go:207] watch chan error: etcdserver: mvcc: required revision has been compacted
W0624 02:49:23.704694 54 watcher.go:207] watch chan error: etcdserver: mvcc: required revision has been compacted
I0624 02:52:31.895997 54 request.go:645] Throttling request took 1.73090654s, request: GET:https://127.0.0.1:6444/apis/events.k8s.io/v1?timeout=32s
W0624 03:00:29.828353 54 watcher.go:207] watch chan error: etcdserver: mvcc: required revision has been compacted
W0624 03:00:52.256920 54 watcher.go:207] watch chan error: etcdserver: mvcc: required revision has been compacted
W0624 03:01:53.310979 54 watcher.go:207] watch chan error: etcdserver: mvcc: required revision has been compacted
W0624 03:03:49.868578 54 watcher.go:207] watch chan error: etcdserver: mvcc: required revision has been compacted
I0624 03:12:41.439797 54 trace.go:205] Trace[1511575211]: "Get" url:/api/v1/namespaces/fleet-system/configmaps/gitjob,user-agent:gitjob/v0.0.0 (linux/amd64) kubernetes/$Format,client:10.42.0.92 (24-Jun-2022 03:12:40.793) (total time: 644ms):
Trace[1511575211]: ---"About to write a response" 588ms (03:12:00.382)
Trace[1511575211]: [644.717457ms] [644.717457ms] END
I0624 03:12:46.757196 54 trace.go:205] Trace[2134714444]: "Get" url:/api/v1/namespaces/kube-system/endpoints/cloud-controller-manager,user-agent:k3s/v1.19.13+k3s1 (linux/amd64) kubernetes/99eadcc/leader-election,client:127.0.0.1 (24-Jun-2022 03:12:46.087) (total time: 669ms):
Trace[2134714444]: ---"About to write a response" 669ms (03:12:00.757)
Trace[2134714444]: [669.670972ms] [669.670972ms] END
I0624 03:12:46.758831 54 trace.go:205] Trace[1208491509]: "Get" url:/api/v1/namespaces/kube-system/configmaps/k3s,user-agent:k3s/v1.19.13+k3s1 (linux/amd64) kubernetes/99eadcc,client:127.0.0.1 (24-Jun-2022 03:12:46.006) (total time: 752ms):
Trace[1208491509]: ---"About to write a response" 752ms (03:12:00.758)
Trace[1208491509]: [752.622092ms] [752.622092ms] END
I0624 03:12:46.758670 54 trace.go:205] Trace[1808268135]: "Get" url:/api/v1/namespaces/kube-system/endpoints/kube-scheduler,user-agent:k3s/v1.19.13+k3s1 (linux/amd64) kubernetes/99eadcc/leader-election,client:127.0.0.1 (24-Jun-2022 03:12:46.108) (total time: 649ms):
Trace[1808268135]: ---"About to write a response" 649ms (03:12:00.758)
Trace[1808268135]: [649.708672ms] [649.708672ms] END
I0624 03:12:46.759112 54 trace.go:205] Trace[511863102]: "Get" url:/api/v1/namespaces/fleet-system/configmaps/fleet-agent-lock,user-agent:fleetagent/v0.0.0 (linux/amd64) kubernetes/$Format,client:10.42.0.94 (24-Jun-2022 03:12:46.000) (total time: 758ms):
Trace[511863102]: ---"About to write a response" 758ms (03:12:00.758)
Trace[511863102]: [758.417297ms] [758.417297ms] END
I0624 03:12:46.759400 54 trace.go:205] Trace[991489714]: "Get" url:/api/v1/namespaces/kube-system/endpoints/kube-controller-manager,user-agent:k3s/v1.19.13+k3s1 (linux/amd64) kubernetes/99eadcc/leader-election,client:127.0.0.1 (24-Jun-2022 03:12:46.111) (total time: 647ms):
Trace[991489714]: ---"About to write a response" 647ms (03:12:00.759)
Trace[991489714]: [647.887379ms] [647.887379ms] END
I0624 03:12:46.762956 54 trace.go:205] Trace[888943931]: "Get" url:/api/v1/namespaces/fleet-system/configmaps/fleet-controller-lock,user-agent:fleetcontroller/v0.0.0 (linux/amd64) kubernetes/$Format,client:10.42.0.90 (24-Jun-2022 03:12:46.009) (total time: 753ms):
Trace[888943931]: ---"About to write a response" 753ms (03:12:00.762)
Trace[888943931]: [753.487463ms] [753.487463ms] END
I0624 03:12:55.502186 54 trace.go:205] Trace[1139464299]: "GuaranteedUpdate etcd3" type:*core.ConfigMap (24-Jun-2022 03:12:54.864) (total time: 637ms):
Trace[1139464299]: ---"Transaction committed" 635ms (03:12:00.502)
Trace[1139464299]: [637.600924ms] [637.600924ms] END
I0624 03:12:55.502365 54 trace.go:205] Trace[1725929589]: "Update" url:/api/v1/namespaces/fleet-system/configmaps/fleet-agent-lock,user-agent:fleetagent/v0.0.0 (linux/amd64) kubernetes/$Format,client:10.42.0.94 (24-Jun-2022 03:12:54.864) (total time: 637ms):
Trace[1725929589]: ---"Object stored in database" 637ms (03:12:00.502)
Trace[1725929589]: [637.944378ms] [637.944378ms] END
I0624 03:12:55.502651 54 trace.go:205] Trace[1980603388]: "GuaranteedUpdate etcd3" type:*core.ConfigMap (24-Jun-2022 03:12:54.867) (total time: 635ms):
Trace[1980603388]: ---"Transaction committed" 634ms (03:12:00.502)
Trace[1980603388]: [635.198734ms] [635.198734ms] END
I0624 03:12:55.502787 54 trace.go:205] Trace[1179359797]: "Update" url:/api/v1/namespaces/fleet-system/configmaps/fleet-controller-lock,user-agent:fleetcontroller/v0.0.0 (linux/amd64) kubernetes/$Format,client:10.42.0.90 (24-Jun-2022 03:12:54.867) (total time: 635ms):
Trace[1179359797]: ---"Object stored in database" 635ms (03:12:00.502)
Trace[1179359797]: [635.550432ms] [635.550432ms] END
W0624 03:22:40.214730 54 watcher.go:207] watch chan error: etcdserver: mvcc: required revision has been compacted
W0624 03:22:41.522922 54 watcher.go:207] watch chan error: etcdserver: mvcc: required revision has been compacted
W0624 03:24:07.800897 54 watcher.go:207] watch chan error: etcdserver: mvcc: required revision has been compacted
I0624 03:25:03.861947 54 request.go:645] Throttling request took 2.280354377s, request: GET:https://127.0.0.1:6444/apis/storage.k8s.io/v1?timeout=32s
W0624 03:27:52.372962 54 watcher.go:207] watch chan error: etcdserver: mvcc: required revision has been compacted
W0624 03:31:37.382246 54 watcher.go:207] watch chan error: etcdserver: mvcc: required revision has been compacted
W0624 03:36:05.092390 54 watcher.go:207] watch chan error: etcdserver: mvcc: required revision has been compacted
W0624 03:37:38.534708 54 watcher.go:207] watch chan error: etcdserver: mvcc: required revision has been compacted
W0624 03:38:02.360933 54 watcher.go:207] watch chan error: etcdserver: mvcc: required revision has been compacted
time="2022-06-24T03:45:10.465515608+08:00" level=error msg="Remotedialer proxy error" error="read tcp 172.17.0.2:35892->172.17.0.2:6443: i/o timeout"
time="2022-06-24T03:45:10.494254958+08:00" level=info msg="error in remotedialer server [400]: websocket: close 1006 (abnormal closure): unexpected EOF"
I0624 03:45:10.466399 54 trace.go:205] Trace[1974183864]: "Get" url:/api/v1/namespaces/fleet-system/configmaps/gitjob,user-agent:gitjob/v0.0.0 (linux/amd64) kubernetes/$Format,client:10.42.0.92 (24-Jun-2022 03:45:09.366) (total time: 1099ms):
Trace[1974183864]: ---"About to write a response" 1097ms (03:45:00.463)
Trace[1974183864]: [1.099217383s] [1.099217383s] END
time="2022-06-24T03:45:15.470631564+08:00" level=info msg="Connecting to proxy" url="wss://172.17.0.2:6443/v1-k3s/connect"
time="2022-06-24T03:45:15.476648878+08:00" level=info msg="Handling backend connection request [local-node]"
I0624 03:45:23.729767 54 trace.go:205] Trace[222170756]: "GuaranteedUpdate etcd3" type:*core.ConfigMap (24-Jun-2022 03:45:23.106) (total time: 623ms):
Trace[222170756]: ---"Transaction committed" 620ms (03:45:00.729)
Trace[222170756]: [623.224593ms] [623.224593ms] END
I0624 03:45:23.730511 54 trace.go:205] Trace[77619226]: "Update" url:/api/v1/namespaces/kube-system/configmaps/k3s,user-agent:k3s/v1.19.13+k3s1 (linux/amd64) kubernetes/99eadcc,client:127.0.0.1 (24-Jun-2022 03:45:23.106) (total time: 623ms):
Trace[77619226]: ---"Object stored in database" 623ms (03:45:00.729)
Trace[77619226]: [623.711034ms] [623.711034ms] END
I0624 03:45:33.500640 54 trace.go:205] Trace[1271072751]: "Get" url:/apis/coordination.k8s.io/v1/namespaces/kube-system/leases/kube-scheduler,user-agent:k3s/v1.19.13+k3s1 (linux/amd64) kubernetes/99eadcc/leader-election,client:127.0.0.1 (24-Jun-2022 03:45:32.615) (total time: 885ms):
Trace[1271072751]: ---"About to write a response" 885ms (03:45:00.500)
Trace[1271072751]: [885.167563ms] [885.167563ms] END
I0624 03:45:33.501005 54 trace.go:205] Trace[777429063]: "Get" url:/apis/coordination.k8s.io/v1/namespaces/kube-system/leases/kube-controller-manager,user-agent:k3s/v1.19.13+k3s1 (linux/amd64) kubernetes/99eadcc/leader-election,client:127.0.0.1 (24-Jun-2022 03:45:32.609) (total time: 891ms):
Trace[777429063]: ---"About to write a response" 891ms (03:45:00.500)
Trace[777429063]: [891.218925ms] [891.218925ms] END
I0624 03:45:33.527837 54 trace.go:205] Trace[1540134021]: "Get" url:/apis/coordination.k8s.io/v1/namespaces/kube-system/leases/cloud-controller-manager,user-agent:k3s/v1.19.13+k3s1 (linux/amd64) kubernetes/99eadcc/leader-election,client:127.0.0.1 (24-Jun-2022 03:45:32.608) (total time: 919ms):
Trace[1540134021]: ---"About to write a response" 919ms (03:45:00.527)
Trace[1540134021]: [919.12473ms] [919.12473ms] END
I0624 03:45:34.174289 54 trace.go:205] Trace[966668254]: "GuaranteedUpdate etcd3" type:*core.Endpoints (24-Jun-2022 03:45:33.544) (total time: 629ms):
Trace[966668254]: ---"Transaction committed" 521ms (03:45:00.174)
Trace[966668254]: [629.79ms] [629.79ms] END
I0624 03:45:34.174466 54 trace.go:205] Trace[1869885918]: "Update" url:/api/v1/namespaces/kube-system/endpoints/kube-controller-manager,user-agent:k3s/v1.19.13+k3s1 (linux/amd64) kubernetes/99eadcc/leader-election,client:127.0.0.1 (24-Jun-2022 03:45:33.544) (total time: 630ms):
Trace[1869885918]: ---"Object stored in database" 629ms (03:45:00.174)
Trace[1869885918]: [630.160099ms] [630.160099ms] END
I0624 03:45:34.175127 54 trace.go:205] Trace[974612186]: "GuaranteedUpdate etcd3" type:*core.ConfigMap (24-Jun-2022 03:45:33.654) (total time: 520ms):
Trace[974612186]: ---"Transaction committed" 503ms (03:45:00.175)
Trace[974612186]: [520.536201ms] [520.536201ms] END
I0624 03:45:34.175299 54 trace.go:205] Trace[468684007]: "Update" url:/api/v1/namespaces/fleet-system/configmaps/fleet-agent-lock,user-agent:fleetagent/v0.0.0 (linux/amd64) kubernetes/$Format,client:10.42.0.94 (24-Jun-2022 03:45:33.654) (total time: 520ms):
Trace[468684007]: ---"Object stored in database" 520ms (03:45:00.175)
Trace[468684007]: [520.940325ms] [520.940325ms] END
I0624 03:45:34.182602 54 trace.go:205] Trace[411094504]: "GuaranteedUpdate etcd3" type:*coordination.Lease (24-Jun-2022 03:45:33.526) (total time: 654ms):
Trace[411094504]: ---"Transaction prepared" 165ms (03:45:00.692)
Trace[411094504]: ---"Transaction committed" 488ms (03:45:00.181)
Trace[411094504]: [654.995867ms] [654.995867ms] END
I0624 03:45:34.182814 54 trace.go:205] Trace[451322157]: "Update" url:/apis/coordination.k8s.io/v1/namespaces/kube-system/leases/kube-scheduler,user-agent:k3s/v1.19.13+k3s1 (linux/amd64) kubernetes/99eadcc/leader-election,client:127.0.0.1 (24-Jun-2022 03:45:33.526) (total time: 656ms):
Trace[451322157]: ---"Object stored in database" 656ms (03:45:00.182)
Trace[451322157]: [656.18087ms] [656.18087ms] END
W0624 03:52:11.728568 54 watcher.go:207] watch chan error: etcdserver: mvcc: required revision has been compacted
W0624 03:52:28.527324 54 watcher.go:207] watch chan error: etcdserver: mvcc: required revision has been compacted
W0624 03:55:10.807300 54 watcher.go:207] watch chan error: etcdserver: mvcc: required revision has been compacted
I0624 03:57:22.963182 54 trace.go:205] Trace[532054795]: "Get" url:/apis/coordination.k8s.io/v1/namespaces/kube-system/leases/kube-scheduler,user-agent:k3s/v1.19.13+k3s1 (linux/amd64) kubernetes/99eadcc/leader-election,client:127.0.0.1 (24-Jun-2022 03:57:21.043) (total time: 1919ms):
Trace[532054795]: ---"About to write a response" 1919ms (03:57:00.963)
Trace[532054795]: [1.919555755s] [1.919555755s] END
W0624 03:58:13.418121 54 watcher.go:207] watch chan error: etcdserver: mvcc: required revision has been compacted
W0624 04:03:03.011593 54 watcher.go:207] watch chan error: etcdserver: mvcc: required revision has been compacted
W0624 04:06:17.832919 54 watcher.go:207] watch chan error: etcdserver: mvcc: required revision has been compacted
W0624 04:07:57.886843 54 watcher.go:207] watch chan error: etcdserver: mvcc: required revision has been compacted
W0624 04:13:08.316844 54 watcher.go:207] watch chan error: etcdserver: mvcc: required revision has been compacted
W0624 04:14:02.423081 54 watcher.go:207] watch chan error: etcdserver: mvcc: required revision has been compacted
W0624 04:16:56.865006 54 watcher.go:207] watch chan error: etcdserver: mvcc: required revision has been compacted
time="2022-06-24T04:17:54.104594859+08:00" level=info msg="Cluster-Http-Server 2022/06/24 04:17:54 http: TLS handshake error from 10.42.0.94:57066: EOF"
E0624 04:18:03.792005 54 leaderelection.go:325] error retrieving resource lock kube-system/cloud-controller-manager: Get "https://127.0.0.1:6444/api/v1/namespaces/kube-system/endpoints/cloud-controller-manager": context deadline exceeded
I0624 04:18:03.793070 54 leaderelection.go:278] failed to renew lease kube-system/cloud-controller-manager: timed out waiting for the condition
F0624 04:18:03.793168 54 controllermanager.go:220] leaderelection lost
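
The log above repeats three patterns: "required revision has been compacted" watch warnings, API and etcd traces whose total time exceeds 500ms, and a final fatal "leaderelection lost" line. To compare several incidents it may help to tally them. The following is a rough Python sketch, not an official k3s or Rancher tool; the function name `summarize` and the file name `k3s.log` are hypothetical, and it only assumes the klog line shapes visible in this paste. Slow traces like these may point at slow storage or an overloaded host, though the log alone does not prove that.

```python
import re
import sys

COMPACTED = "required revision has been compacted"
# Trace header lines end with "(total time: 573ms):" in the log above.
TRACE_TOTAL = re.compile(r"trace\.go:\d+\].*\(total time: (\d+)ms\)")
# klog fatal lines start with "F" followed by the month and day, e.g. "F0624 ".
FATAL = re.compile(r"^F\d{4} ")


def summarize(lines):
    """Count compaction warnings, collect slow-trace durations, keep fatal lines."""
    compacted = 0
    trace_ms = []
    fatal = []
    for raw in lines:
        line = raw.rstrip("\n")
        if COMPACTED in line:
            compacted += 1
        match = TRACE_TOTAL.search(line)
        if match:
            trace_ms.append(int(match.group(1)))
        if FATAL.match(line):
            fatal.append(line)
    return {
        "compacted_warnings": compacted,
        "slow_traces": len(trace_ms),
        "max_trace_ms": max(trace_ms) if trace_ms else 0,
        "fatal": fatal,
    }


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage: python summarize_k3s_log.py k3s.log
    with open(sys.argv[1], encoding="utf-8") as handle:
        for key, value in summarize(handle).items():
            print(f"{key}: {value}")
```

Because the patterns do not depend on the quote characters, the sketch works on the log whether or not the forum software has replaced straight quotes with typographic ones.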