K3s crashes unexpectedly: [FATAL] k3s exited with: exit status 255

Rancher Server setup

  • Rancher version: v2.5.11
  • Installation option (Docker install/Helm Chart): Docker install
  • Online or air-gapped deployment: online

Downstream cluster information

  • Kubernetes version: v1.20.12
  • Cluster Type (Local/Downstream): Downstream
    • If Downstream, what type of cluster? (Custom/Imported or Hosted, etc.): Custom

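Problem description:
Rancher is deployed as a single container, and it frequently crashes and restarts. Could someone help figure out the cause? The error messages are in the logs below.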

Logs
2022-06-15 03:25:48.388181 E | etcdserver: failed to get read index from raft: context deadline exceeded
2022-06-15 03:26:35.429361 W | etcdserver: failed to revoke 694d81610ffa20d6 ("context deadline exceeded")
2022-06-15 03:26:35.429718 W | etcdserver: read-only range request "key:\"/registry/jobs/\" range_end:\"/registry/jobs0\" limit:500 " with result "error:context deadline exceeded" took too long (4m58.054074433s) to execute
2022-06-15 03:26:35.429929 W | etcdserver: failed to revoke 694d81610ffa1fdc ("context deadline exceeded")
2022-06-15 03:26:35.429948 W | etcdserver: failed to revoke 694d81610ffa20d6 ("context deadline exceeded")
2022-06-15 03:26:35.429955 W | etcdserver: failed to revoke 694d81610ffa20d6 ("context deadline exceeded")
2022-06-15 03:26:35.429960 W | etcdserver: failed to revoke 694d81610ffa1fdc ("context deadline exceeded")
2022-06-15 03:26:35.430059 W | etcdserver: failed to revoke 694d81610ffa1fdc ("context deadline exceeded")
2022-06-15 03:26:35.430067 W | etcdserver: failed to revoke 694d81610ffa20d6 ("context deadline exceeded")
2022-06-15 03:26:35.430073 W | etcdserver: failed to revoke 694d81610ffa1fdc ("context deadline exceeded")
2022-06-15 03:26:35.430083 W | etcdserver: failed to revoke 694d81610ffa20d6 ("context deadline exceeded")
2022-06-15 03:26:35.430088 W | etcdserver: failed to revoke 694d81610ffa1fdc ("context deadline exceeded")
2022-06-15 03:26:35.431093 W | etcdserver: read-only range request "key:\"/registry/project.cattle.io/pipelineexecutions/\" range_end:\"/registry/project.cattle.io/pipelineexecutions0\" count_only:true " with result "error:context deadline exceeded" took too long (4m57.990389077s) to execute
2022-06-15 03:26:35.471452 W | etcdserver: failed to revoke 694d81610ffa20d6 ("lease not found")
2022-06-15 03:26:35.471533 W | etcdserver: failed to revoke 694d81610ffa20d6 ("lease not found")
2022-06-15 03:26:35.471558 W | etcdserver: failed to revoke 694d81610ffa20d6 ("lease not found")
2022-06-15 03:26:35.471566 W | etcdserver: failed to revoke 694d81610ffa1fdc ("lease not found")
2022-06-15 03:26:35.471582 W | etcdserver: failed to revoke 694d81610ffa1fdc ("lease not found")
2022-06-15 03:26:35.474196 W | etcdserver: read-only range request "key:\"/registry/services/endpoints/kube-system/cloud-controller-manager\" " with result "range_response_count:1 size:596" took too long (2m43.012266297s) to execute
2022/06/15 03:26:36 [INFO] error in remotedialer server [400]: read tcp 172.17.0.2:80->192.168.0.182:46646: i/o timeout
2022/06/15 03:26:37 [ERROR] error syncing 'p-9mknf/p-8g9nj-332': handler pipeline-execution-controller: Get "http://10.43.192.184:8080/job/pipeline_p-8g9nj-332/api/json": context deadline exceeded (Client.Timeout exceeded while awaiting headers), requeuing
2022/06/15 03:26:37 [ERROR] error syncing 'p-9mknf/p-8g9nj-231': handler pipeline-execution-controller: Get "http://10.43.192.184:8080/crumbIssuer/api/xml?xpath=concat(//crumbRequestField,\":\",//crumb)": context deadline exceeded (Client.Timeout exceeded while awaiting headers), requeuing
2022/06/15 03:26:37 [ERROR] error syncing 'p-9mknf/p-8g9nj-627': handler pipeline-execution-controller: Get "http://10.43.192.184:8080/job/pipeline_p-8g9nj-627/api/json": context deadline exceeded (Client.Timeout exceeded while awaiting headers), requeuing
2022/06/15 03:26:37 [INFO] error in remotedialer server [400]: websocket: write timeout
2022/06/15 03:26:37 [INFO] error in remotedialer server [400]: write tcp 172.17.0.2:80->192.168.0.182:46666: write: broken pipe
2022/06/15 03:26:37 [INFO] error in remotedialer server [400]: write tcp 172.17.0.2:80->192.168.0.182:46674: write: broken pipe
2022/06/15 03:26:37 [INFO] error in remotedialer server [400]: write tcp 172.17.0.2:80->192.168.0.182:46670: write: broken pipe
2022/06/15 03:26:37 [INFO] error in remotedialer server [400]: websocket: close 1006 (abnormal closure): unexpected EOF
2022/06/15 03:26:37 [INFO] error in remotedialer server [400]: write tcp 172.17.0.2:80->192.168.0.182:46650: write: broken pipe
2022/06/15 03:26:37 [INFO] error in remotedialer server [400]: write tcp 172.17.0.2:80->192.168.0.182:46652: write: broken pipe
2022/06/15 03:26:37 [INFO] error in remotedialer server [400]: write tcp 172.17.0.2:80->192.168.0.182:46660: write: broken pipe
2022/06/15 03:26:37 [INFO] error in remotedialer server [400]: write tcp 172.17.0.2:80->192.168.0.182:46656: write: broken pipe
2022/06/15 03:26:37 [ERROR] error syncing 'p-lp9jn/p-6ft69-85': handler pipeline-execution-controller: Get "http://10.43.5.174:8080/job/pipeline_p-6ft69-85/api/json": context deadline exceeded (Client.Timeout exceeded while awaiting headers), requeuing
2022/06/15 03:26:37 [ERROR] error syncing 'p-9mknf/p-fb5gg-63': handler pipeline-execution-controller: Get "http://10.43.192.184:8080/crumbIssuer/api/xml?xpath=concat(//crumbRequestField,\":\",//crumb)": context deadline exceeded (Client.Timeout exceeded while awaiting headers), requeuing
2022/06/15 03:26:37 [ERROR] error syncing 'p-9mknf/p-fb5gg-3': handler pipeline-execution-controller: Get "http://10.43.192.184:8080/crumbIssuer/api/xml?xpath=concat(//crumbRequestField,\":\",//crumb)": context deadline exceeded (Client.Timeout exceeded while awaiting headers), requeuing
2022/06/15 03:26:37 [ERROR] error syncing 'p-9mknf/p-mbhqp-85': handler pipeline-execution-controller: Get "http://10.43.192.184:8080/crumbIssuer/api/xml?xpath=concat(//crumbRequestField,\":\",//crumb)": context deadline exceeded (Client.Timeout exceeded while awaiting headers), requeuing
2022/06/15 03:26:37 [ERROR] error syncing 'p-9mknf/p-mbhqp-88': handler pipeline-execution-controller: Get "http://10.43.192.184:8080/crumbIssuer/api/xml?xpath=concat(//crumbRequestField,\":\",//crumb)": context deadline exceeded (Client.Timeout exceeded while awaiting headers), requeuing
2022/06/15 03:26:37 [ERROR] error syncing 'p-9mknf/p-7qq2m-181': handler pipeline-execution-controller: Get "http://10.43.192.184:8080/job/pipeline_p-7qq2m-181/api/json": context deadline exceeded (Client.Timeout exceeded while awaiting headers), requeuing
2022/06/15 03:26:37 [ERROR] error syncing 'p-lp9jn/p-g68td-252': handler pipeline-execution-controller: Get "http://10.43.5.174:8080/crumbIssuer/api/xml?xpath=concat(//crumbRequestField,\":\",//crumb)": context deadline exceeded (Client.Timeout exceeded while awaiting headers), requeuing
2022/06/15 03:26:37 [ERROR] error syncing 'p-9mknf/p-8g9nj-427': handler pipeline-execution-controller: Get "http://10.43.192.184:8080/job/pipeline_p-8g9nj-427/api/json": context deadline exceeded (Client.Timeout exceeded while awaiting headers), requeuing
2022/06/15 03:26:37 [ERROR] error syncing 'p-9mknf/p-8g9nj-666': handler pipeline-execution-controller: Get "http://10.43.192.184:8080/crumbIssuer/api/xml?xpath=concat(//crumbRequestField,\":\",//crumb)": context deadline exceeded (Client.Timeout exceeded while awaiting headers), requeuing
2022/06/15 03:26:37 [ERROR] error syncing 'p-9mknf/p-8g9nj-467': handler pipeline-execution-controller: Get "http://10.43.192.184:8080/job/pipeline_p-8g9nj-467/api/json": context deadline exceeded (Client.Timeout exceeded while awaiting headers), requeuing
2022/06/15 03:26:37 [ERROR] error syncing 'p-9mknf/p-8g9nj-453': handler pipeline-execution-controller: Get "http://10.43.192.184:8080/job/pipeline_p-8g9nj-453/api/json": context deadline exceeded (Client.Timeout exceeded while awaiting headers), requeuing
2022/06/15 03:26:37 [ERROR] error syncing 'p-lp9jn/p-g68td-142': handler pipeline-execution-controller: Get "http://10.43.5.174:8080/crumbIssuer/api/xml?xpath=concat(//crumbRequestField,\":\",//crumb)": context deadline exceeded (Client.Timeout exceeded while awaiting headers), requeuing
2022/06/15 03:26:37 [ERROR] error syncing 'p-9mknf/p-mbhqp-97': handler pipeline-execution-controller: Get "http://10.43.192.184:8080/crumbIssuer/api/xml?xpath=concat(//crumbRequestField,\":\",//crumb)": context deadline exceeded (Client.Timeout exceeded while awaiting headers), requeuing
2022/06/15 03:26:37 [ERROR] error syncing 'p-9mknf/p-76qmm-109': handler pipeline-execution-controller: Get "http://10.43.192.184:8080/job/pipeline_p-76qmm-109/api/json": context deadline exceeded (Client.Timeout exceeded while awaiting headers), requeuing
2022/06/15 03:26:37 [ERROR] error syncing 'p-9mknf/p-8dwnj-7': handler pipeline-execution-controller: Get "http://10.43.192.184:8080/job/pipeline_p-8dwnj-7/api/json": context deadline exceeded (Client.Timeout exceeded while awaiting headers), requeuing
2022/06/15 03:26:37 [ERROR] error syncing 'p-9mknf/p-8g9nj-251': handler pipeline-execution-controller: Get "http://10.43.192.184:8080/crumbIssuer/api/xml?xpath=concat(//crumbRequestField,\":\",//crumb)": context deadline exceeded (Client.Timeout exceeded while awaiting headers), requeuing
2022/06/15 03:26:37 [ERROR] error syncing 'p-9mknf/p-8g9nj-584': handler pipeline-execution-controller: Get "http://10.43.192.184:8080/job/pipeline_p-8g9nj-584/api/json": context deadline exceeded (Client.Timeout exceeded while awaiting headers), requeuing
2022/06/15 03:26:37 [ERROR] error syncing 'p-9mknf/p-76qmm-91': handler pipeline-execution-controller: Get "http://10.43.192.184:8080/job/pipeline_p-76qmm-91/api/json": context deadline exceeded (Client.Timeout exceeded while awaiting headers), requeuing
2022/06/15 03:26:37 [ERROR] error syncing 'p-9mknf/p-mbhqp-29': handler pipeline-execution-controller: Get "http://10.43.192.184:8080/job/pipeline_p-mbhqp-29/api/json": context deadline exceeded (Client.Timeout exceeded while awaiting headers), requeuing
2022/06/15 03:26:37 [ERROR] error syncing 'p-9mknf/p-8g9nj-605': handler pipeline-execution-controller: Get "http://10.43.192.184:8080/job/pipeline_p-8g9nj-605/lastBuild/api/json": context deadline exceeded (Client.Timeout exceeded while awaiting headers), requeuing
2022/06/15 03:26:37 [ERROR] error syncing 'p-pwkhj/p-r7vjw-123': handler pipeline-execution-controller: Get "http://10.43.110.65:8080/job/pipeline_p-r7vjw-123/api/json": context deadline exceeded (Client.Timeout exceeded while awaiting headers), requeuing
2022/06/15 03:26:37 [ERROR] error syncing 'p-9mknf/p-8g9nj-451': handler pipeline-execution-controller: Get "http://10.43.192.184:8080/job/pipeline_p-8g9nj-451/api/json": context deadline exceeded (Client.Timeout exceeded while awaiting headers), requeuing
2022/06/15 03:26:37 [ERROR] error syncing 'p-pwkhj/p-r7vjw-174': handler pipeline-execution-controller: Get "http://10.43.110.65:8080/job/pipeline_p-r7vjw-174/api/json": context deadline exceeded (Client.Timeout exceeded while awaiting headers), requeuing
2022/06/15 03:26:37 [ERROR] error syncing 'p-lp9jn/p-88cvr-40': handler pipeline-execution-controller: Get "http://10.43.5.174:8080/crumbIssuer/api/xml?xpath=concat(//crumbRequestField,\":\",//crumb)": context deadline exceeded (Client.Timeout exceeded while awaiting headers), requeuing
2022/06/15 03:26:37 [ERROR] error syncing 'p-9mknf/p-mbhqp-188': handler pipeline-execution-controller: Get "http://10.43.192.184:8080/job/pipeline_p-mbhqp-188/api/json": context deadline exceeded (Client.Timeout exceeded while awaiting headers), requeuing
2022/06/15 03:26:37 [ERROR] error syncing 'p-pwkhj/p-x5fhs-513': handler pipeline-execution-controller: Get "http://10.43.110.65:8080/job/pipeline_p-x5fhs-513/api/json": context deadline exceeded (Client.Timeout exceeded while awaiting headers), requeuing
2022/06/15 03:26:37 [ERROR] error syncing 'p-9mknf/p-8g9nj-230': handler pipeline-execution-controller: Get "http://10.43.192.184:8080/job/pipeline_p-8g9nj-230/api/json": context deadline exceeded (Client.Timeout exceeded while awaiting headers), requeuing
2022/06/15 03:26:37 [ERROR] error syncing 'p-9mknf/p-7qq2m-107': handler pipeline-execution-controller: Get "http://10.43.192.184:8080/crumbIssuer/api/xml?xpath=concat(//crumbRequestField,\":\",//crumb)": context deadline exceeded (Client.Timeout exceeded while awaiting headers), requeuing
2022/06/15 03:26:37 [ERROR] error syncing 'p-9mknf/p-fb5gg-57': handler pipeline-execution-controller: Get "http://10.43.192.184:8080/crumbIssuer/api/xml?xpath=concat(//crumbRequestField,\":\",//crumb)": context deadline exceeded (Client.Timeout exceeded while awaiting headers), requeuing
2022/06/15 03:26:37 [ERROR] error syncing 'p-9mknf/p-mbhqp-172': handler pipeline-execution-controller: Get "http://10.43.192.184:8080/job/pipeline_p-mbhqp-172/api/json": context deadline exceeded (Client.Timeout exceeded while awaiting headers), requeuing
2022/06/15 03:26:37 [ERROR] error syncing 'p-9mknf/p-8g9nj-387': handler pipeline-execution-controller: Get "http://10.43.192.184:8080/crumbIssuer/api/xml?xpath=concat(//crumbRequestField,\":\",//crumb)": context deadline exceeded (Client.Timeout exceeded while awaiting headers), requeuing
2022/06/15 03:26:37 [ERROR] error syncing 'p-9mknf/p-76qmm-75': handler pipeline-execution-controller: Get "http://10.43.192.184:8080/crumbIssuer/api/xml?xpath=concat(//crumbRequestField,\":\",//crumb)": context deadline exceeded (Client.Timeout exceeded while awaiting headers), requeuing
2022/06/15 03:26:37 [ERROR] error syncing 'p-9mknf/p-8g9nj-272': handler pipeline-execution-controller: Get "http://10.43.192.184:8080/job/pipeline_p-8g9nj-272/api/json": context deadline exceeded (Client.Timeout exceeded while awaiting headers), requeuing
2022/06/15 03:26:37 [ERROR] error syncing 'p-9mknf/p-8g9nj-294': handler pipeline-execution-controller: Get "http://10.43.192.184:8080/crumbIssuer/api/xml?xpath=concat(//crumbRequestField,\":\",//crumb)": context deadline exceeded (Client.Timeout exceeded while awaiting headers), requeuing
2022/06/15 03:26:37 [ERROR] error syncing 'p-9mknf/p-8g9nj-24': handler pipeline-execution-controller: Get "http://10.43.192.184:8080/job/pipeline_p-8g9nj-24/api/json": context deadline exceeded (Client.Timeout exceeded while awaiting headers), requeuing
2022/06/15 03:26:37 [ERROR] error syncing 'p-lp9jn/p-9rbpv-162': handler pipeline-execution-controller: Get "http://10.43.5.174:8080/job/pipeline_p-9rbpv-162/api/json": context deadline exceeded (Client.Timeout exceeded while awaiting headers), requeuing
2022/06/15 03:26:37 [ERROR] error syncing 'p-9mknf/p-mbhqp-18': handler pipeline-execution-controller: Get "http://10.43.192.184:8080/job/pipeline_p-mbhqp-18/api/json": context deadline exceeded (Client.Timeout exceeded while awaiting headers), requeuing
2022/06/15 03:26:37 [ERROR] error syncing 'p-9mknf/p-mbhqp-54': handler pipeline-execution-controller: Get "http://10.43.192.184:8080/job/pipeline_p-mbhqp-54/api/json": context deadline exceeded (Client.Timeout exceeded while awaiting headers), requeuing
2022/06/15 03:26:37 [ERROR] error syncing 'p-9mknf/p-8g9nj-644': handler pipeline-execution-controller: Get "http://10.43.192.184:8080/job/pipeline_p-8g9nj-644/api/json": context deadline exceeded (Client.Timeout exceeded while awaiting headers), requeuing
2022/06/15 03:26:37 [ERROR] error syncing 'p-9mknf/p-2n8ps-39': handler pipeline-execution-controller: Get "http://10.43.192.184:8080/crumbIssuer/api/xml?xpath=concat(//crumbRequestField,\":\",//crumb)": context deadline exceeded (Client.Timeout exceeded while awaiting headers), requeuing
2022/06/15 03:26:37 [ERROR] error syncing 'p-9mknf/p-8g9nj-672': handler pipeline-execution-controller: Get "http://10.43.192.184:8080/job/pipeline_p-8g9nj-672/api/json": context deadline exceeded (Client.Timeout exceeded while awaiting headers), requeuing
2022/06/15 03:26:37 [ERROR] error syncing 'p-9mknf/p-76qmm-210': handler pipeline-execution-controller: Get "http://10.43.192.184:8080/job/pipeline_p-76qmm-210/lastBuild/api/json": context deadline exceeded (Client.Timeout exceeded while awaiting headers), requeuing
2022/06/15 03:26:37 [ERROR] error syncing 'p-9mknf/p-fb5gg-13': handler pipeline-execution-controller: Get "http://10.43.192.184:8080/crumbIssuer/api/xml?xpath=concat(//crumbRequestField,\":\",//crumb)": context deadline exceeded (Client.Timeout exceeded while awaiting headers), requeuing
2022/06/15 03:26:37 [ERROR] error syncing 'p-9mknf/p-7qq2m-127': handler pipeline-execution-controller: Get "http://10.43.192.184:8080/job/pipeline_p-7qq2m-127/api/json": context deadline exceeded (Client.Timeout exceeded while awaiting headers), requeuing
W0615 03:26:37.646463       7 reflector.go:437] pkg/mod/github.com/rancher/client-go@v1.20.0-rancher.1/tools/cache/reflector.go:168: watch of *v1.Role ended with: an error on the server ("unable to decode an event from the watch stream: tunnel disconnect") has prevented the request from succeeding
W0615 03:26:37.675540       7 reflector.go:437] pkg/mod/github.com/rancher/client-go@v1.20.0-rancher.1/tools/cache/reflector.go:168: watch of *v3.ClusterUserAttribute ended with: an error on the server ("unable to decode an event from the watch stream: tunnel disconnect") has prevented the request from succeeding
W0615 03:26:37.676183       7 reflector.go:437] pkg/mod/github.com/rancher/client-go@v1.20.0-rancher.1/tools/cache/reflector.go:168: watch of *v1.ClusterRoleBinding ended with: an error on the server ("unable to decode an event from the watch stream: tunnel disconnect") has prevented the request from succeeding
W0615 03:26:37.679402       7 reflector.go:437] pkg/mod/github.com/rancher/client-go@v1.20.0-rancher.1/tools/cache/reflector.go:168: watch of *v1.Deployment ended with: an error on the server ("unable to decode an event from the watch stream: tunnel disconnect") has prevented the request from succeeding
W0615 03:26:37.681582       7 reflector.go:437] pkg/mod/github.com/rancher/client-go@v1.20.0-rancher.1/tools/cache/reflector.go:168: watch of *v1.Namespace ended with: an error on the server ("unable to decode an event from the watch stream: tunnel disconnect") has prevented the request from succeeding
W0615 03:26:37.682306       7 reflector.go:437] pkg/mod/github.com/rancher/client-go@v1.20.0-rancher.1/tools/cache/reflector.go:168: watch of *v1.ReplicaSet ended with: an error on the server ("unable to decode an event from the watch stream: tunnel disconnect") has prevented the request from succeeding
W0615 03:26:37.682320       7 reflector.go:437] pkg/mod/github.com/rancher/client-go@v1.20.0-rancher.1/tools/cache/reflector.go:168: watch of *v1.StatefulSet ended with: an error on the server ("unable to decode an event from the watch stream: tunnel disconnect") has prevented the request from succeeding
W0615 03:26:37.687258       7 reflector.go:437] pkg/mod/github.com/rancher/client-go@v1.20.0-rancher.1/tools/cache/reflector.go:168: watch of *v1.NetworkPolicy ended with: an error on the server ("unable to decode an event from the watch stream: tunnel disconnect") has prevented the request from succeeding
W0615 03:26:37.689719       7 reflector.go:437] pkg/mod/github.com/rancher/client-go@v1.20.0-rancher.1/tools/cache/reflector.go:168: watch of *v1.Service ended with: an error on the server ("unable to decode an event from the watch stream: tunnel disconnect") has prevented the request from succeeding
W0615 03:26:37.691274       7 reflector.go:437] pkg/mod/github.com/rancher/client-go@v1.20.0-rancher.1/tools/cache/reflector.go:168: watch of *v1.LimitRange ended with: an error on the server ("unable to decode an event from the watch stream: tunnel disconnect") has prevented the request from succeeding
W0615 03:26:37.691289       7 reflector.go:437] pkg/mod/github.com/rancher/client-go@v1.20.0-rancher.1/tools/cache/reflector.go:168: watch of *v1.DaemonSet ended with: an error on the server ("unable to decode an event from the watch stream: tunnel disconnect") has prevented the request from succeeding
W0615 03:26:37.691301       7 reflector.go:437] pkg/mod/github.com/rancher/client-go@v1.20.0-rancher.1/tools/cache/reflector.go:168: watch of *v1.ReplicationController ended with: an error on the server ("unable to decode an event from the watch stream: tunnel disconnect") has prevented the request from succeeding
W0615 03:26:37.692291       7 reflector.go:437] pkg/mod/github.com/rancher/client-go@v1.20.0-rancher.1/tools/cache/reflector.go:168: watch of *v3.ClusterAuthToken ended with: an error on the server ("unable to decode an event from the watch stream: tunnel disconnect") has prevented the request from succeeding
W0615 03:26:37.693137       7 reflector.go:437] pkg/mod/github.com/rancher/client-go@v1.20.0-rancher.1/tools/cache/reflector.go:168: watch of *v1.Endpoints ended with: an error on the server ("unable to decode an event from the watch stream: tunnel disconnect") has prevented the request from succeeding
W0615 03:26:37.693161       7 reflector.go:437] pkg/mod/github.com/rancher/client-go@v1.20.0-rancher.1/tools/cache/reflector.go:168: watch of *v1.Secret ended with: an error on the server ("unable to decode an event from the watch stream: tunnel disconnect") has prevented the request from succeeding
W0615 03:26:37.693906       7 reflector.go:437] pkg/mod/github.com/rancher/client-go@v1.20.0-rancher.1/tools/cache/reflector.go:168: watch of *v1.Job ended with: an error on the server ("unable to decode an event from the watch stream: tunnel disconnect") has prevented the request from succeeding
W0615 03:26:37.693920       7 reflector.go:437] pkg/mod/github.com/rancher/client-go@v1.20.0-rancher.1/tools/cache/reflector.go:168: watch of *v1.Event ended with: an error on the server ("unable to decode an event from the watch stream: tunnel disconnect") has prevented the request from succeeding
W0615 03:26:37.694408       7 reflector.go:437] pkg/mod/github.com/rancher/client-go@v1.20.0-rancher.1/tools/cache/reflector.go:168: watch of *v1beta1.CronJob ended with: an error on the server ("unable to decode an event from the watch stream: tunnel disconnect") has prevented the request from succeeding
W0615 03:26:37.696612       7 reflector.go:437] pkg/mod/github.com/rancher/client-go@v1.20.0-rancher.1/tools/cache/reflector.go:168: watch of *v1.ServiceAccount ended with: an error on the server ("unable to decode an event from the watch stream: tunnel disconnect") has prevented the request from succeeding
W0615 03:26:37.699931       7 reflector.go:437] pkg/mod/github.com/rancher/client-go@v1.20.0-rancher.1/tools/cache/reflector.go:168: watch of *v1.APIService ended with: an error on the server ("unable to decode an event from the watch stream: tunnel disconnect") has prevented the request from succeeding
W0615 03:26:37.699945       7 reflector.go:437] pkg/mod/github.com/rancher/client-go@v1.20.0-rancher.1/tools/cache/reflector.go:168: watch of *v1.Node ended with: an error on the server ("unable to decode an event from the watch stream: tunnel disconnect") has prevented the request from succeeding
W0615 03:26:37.699956       7 reflector.go:437] pkg/mod/github.com/rancher/client-go@v1.20.0-rancher.1/tools/cache/reflector.go:168: watch of *v1.ResourceQuota ended with: an error on the server ("unable to decode an event from the watch stream: tunnel disconnect") has prevented the request from succeeding
W0615 03:26:37.699967       7 reflector.go:437] pkg/mod/github.com/rancher/client-go@v1.20.0-rancher.1/tools/cache/reflector.go:168: watch of *v1.Pod ended with: an error on the server ("unable to decode an event from the watch stream: tunnel disconnect") has prevented the request from succeeding
W0615 03:26:37.699989       7 reflector.go:437] pkg/mod/github.com/rancher/client-go@v1.20.0-rancher.1/tools/cache/reflector.go:168: watch of *v1.RoleBinding ended with: an error on the server ("unable to decode an event from the watch stream: tunnel disconnect") has prevented the request from succeeding
W0615 03:26:37.700000       7 reflector.go:437] pkg/mod/github.com/rancher/client-go@v1.20.0-rancher.1/tools/cache/reflector.go:168: watch of *v1.ClusterRole ended with: an error on the server ("unable to decode an event from the watch stream: tunnel disconnect") has prevented the request from succeeding
W0615 03:26:37.700012       7 reflector.go:437] pkg/mod/github.com/rancher/client-go@v1.20.0-rancher.1/tools/cache/reflector.go:168: watch of *v1beta1.PodSecurityPolicy ended with: an error on the server ("unable to decode an event from the watch stream: tunnel disconnect") has prevented the request from succeeding
W0615 03:26:37.874450       7 reflector.go:437] pkg/mod/github.com/rancher/client-go@v1.20.0-rancher.1/tools/cache/reflector.go:168: watch of *v1beta1.Ingress ended with: an error on the server ("unable to decode an event from the watch stream: tunnel disconnect") has prevented the request from succeeding
2022/06/15 03:26:37 [INFO] Handling backend connection request [c-bth7s:m-c039fa301449]
2022/06/15 03:26:37 [INFO] Handling backend connection request [c-bth7s:m-17065c97d7be]
2022/06/15 03:26:37 [INFO] Handling backend connection request [c-bth7s:m-0bbe7317da7f]
2022/06/15 03:26:37 [INFO] error in remotedialer server [400]: websocket: close 1006 (abnormal closure): unexpected EOF
2022/06/15 03:26:37 [INFO] Handling backend connection request [c-bth7s:m-d58f2d8525e9]
2022/06/15 03:26:37 [INFO] Handling backend connection request [c-bth7s:m-1bbb828e29a2]
2022/06/15 03:26:37 [INFO] Handling backend connection request [c-bth7s:m-ab1862ef9759]
2022/06/15 03:26:38 [INFO] Handling backend connection request [c-bth7s:m-0bbe7317da7f]
2022/06/15 03:26:38 [INFO] Handling backend connection request [c-bth7s:m-fe396830bfd6]
exit status 255
2022/06/15 03:26:38 [INFO] Handling backend connection request [c-bth7s:m-5e0b687b36f5]
W0615 03:26:38.378573       7 reflector.go:437] pkg/mod/github.com/rancher/client-go@v1.20.0-rancher.1/tools/cache/reflector.go:168: watch of *v3.Setting ended with: very short watch: pkg/mod/github.com/rancher/client-go@v1.20.0-rancher.1/tools/cache/reflector.go:168: Unexpected watch close - watch lasted less than a second and no items received
W0615 03:26:38.378623       7 reflector.go:437] pkg/mod/github.com/rancher/client-go@v1.20.0-rancher.1/tools/cache/reflector.go:168: watch of *v3.SourceCodeCredential ended with: very short watch: pkg/mod/github.com/rancher/client-go@v1.20.0-rancher.1/tools/cache/reflector.go:168: Unexpected watch close - watch lasted less than a second and no items received
2022/06/15 03:26:38 [INFO] Handling backend connection request [c-bth7s:m-17065c97d7be]
2022/06/15 03:26:38 [INFO] Handling backend connection request [c-bth7s:m-d58f2d8525e9]
2022/06/15 03:26:38 [INFO] Handling backend connection request [c-bth7s:m-5e0b687b36f5]
2022/06/15 03:26:38 [INFO] Handling backend connection request [c-bth7s:m-ab1862ef9759]
2022/06/15 03:26:38 [ERROR] Error updating user attribute to trigger refresh: Put "https://127.0.0.1:6443/apis/management.cattle.io/v3/userattributes/u-d4apeatbdc": http: server closed idle connection
2022/06/15 03:26:38 [ERROR] failed to write nodeConfig to agent: write tcp 172.17.0.2:80->192.168.0.182:44206: write: broken pipe
2022/06/15 03:26:38 [ERROR] failed to write nodeConfig to agent: write tcp 172.17.0.2:80->192.168.0.182:43938: write: broken pipe
2022/06/15 03:26:38 [INFO] Handling backend connection request [c-bth7s:m-c039fa301449]
2022/06/15 03:26:38 [INFO] Handling backend connection request [c-bth7s:m-17065c97d7be]
2022/06/15 03:26:38 [INFO] Handling backend connection request [c-bth7s:m-1bbb828e29a2]
2022/06/15 03:26:38 [INFO] Handling backend connection request [c-bth7s:m-d58f2d8525e9]
2022/06/15 03:26:38 [INFO] Handling backend connection request [c-bth7s:m-ab1862ef9759]
2022/06/15 03:26:38 [INFO] Handling backend connection request [c-bth7s:m-0bbe7317da7f]
2022/06/15 03:26:38 [INFO] Handling backend connection request [c-bth7s:m-fe396830bfd6]
2022/06/15 03:26:38 [INFO] Handling backend connection request [c-bth7s:m-5e0b687b36f5]
2022/06/15 03:26:38 [ERROR] failed to write nodeConfig to agent: write tcp 172.17.0.2:80->192.168.0.182:43948: write: broken pipe
2022/06/15 03:26:38 [ERROR] failed to write nodeConfig to agent: write tcp 172.17.0.2:80->192.168.0.182:44260: write: broken pipe
2022/06/15 03:26:38 [ERROR] failed to write nodeConfig to agent: write tcp 172.17.0.2:80->192.168.0.182:43916: write: broken pipe
2022/06/15 03:26:38 [ERROR] failed to write nodeConfig to agent: write tcp 172.17.0.2:80->192.168.0.182:44304: write: broken pipe
W0615 03:26:38.480345       7 reflector.go:437] pkg/mod/github.com/rancher/client-go@v1.20.0-rancher.1/tools/cache/reflector.go:168: watch of *v1.ConfigMap ended with: an error on the server ("unable to decode an event from the watch stream: tunnel disconnect") has prevented the request from succeeding
2022/06/15 03:26:38 [ERROR] failed to write nodeConfig to agent: write tcp 172.17.0.2:80->192.168.0.182:44374: write: broken pipe
2022/06/15 03:26:38 [ERROR] Error updating user attribute to trigger refresh: Put "https://127.0.0.1:6443/apis/management.cattle.io/v3/userattributes/u-d4apeatbdc": http: server closed idle connection
2022/06/15 03:26:38 [INFO] Handling backend connection request [c-bth7s:m-ab1862ef9759]
2022/06/15 03:26:38 [INFO] error in remotedialer server [400]: websocket: close 1006 (abnormal closure): unexpected EOF
2022/06/15 03:26:38 [INFO] Handling backend connection request [c-bth7s:m-0bbe7317da7f]
2022/06/15 03:26:38 [ERROR] failed to write nodeConfig to agent: write tcp 172.17.0.2:80->192.168.0.182:43920: write: broken pipe
2022/06/15 03:26:38 [FATAL] k3s exited with: exit status 255

It looks like the etcd service is having problems.

Do you mean the etcd of the local cluster? This happens quite often; every so often it restarts on its own.

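The logs repeatedly show etcd read-only range requests that "took too long" to execute. This is most likely caused by poor disk performance on the Rancher server host.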
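To see how bad the slow reads are, the "took too long" warnings can be pulled out of the container log and ranked by duration. The following is only a rough sketch: the function names are made up for illustration, and the pattern assumes the warning format seen in the log above (other etcd versions may format it differently).

```python
import re

# Assumed format, taken from the etcd warnings in the log above:
#   ... read-only range request "key:\"/registry/jobs/\" ..." ... took too long (4m58.054074433s) to execute
SLOW_RE = re.compile(
    r'read-only range request "key:\\"(?P<key>[^\\"]+).*took too long \((?P<dur>[^)]+)\) to execute'
)

# Go-style duration units, converted to seconds.
UNIT_SECONDS = {"h": 3600.0, "m": 60.0, "s": 1.0, "ms": 1e-3, "us": 1e-6, "µs": 1e-6, "ns": 1e-9}
PART_RE = re.compile(r"(\d+(?:\.\d+)?)(h|ms|us|µs|ns|m|s)")


def parse_go_duration(text):
    """Convert a Go duration string such as '4m58.054074433s' to seconds."""
    parts = PART_RE.findall(text)
    if not parts:
        raise ValueError(f"not a duration: {text!r}")
    return sum(float(value) * UNIT_SECONDS[unit] for value, unit in parts)


def slow_requests(lines):
    """Return (key, seconds) pairs for slow range requests, slowest first."""
    found = []
    for line in lines:
        match = SLOW_RE.search(line)
        if match:
            found.append((match.group("key"), parse_go_duration(match.group("dur"))))
    return sorted(found, key=lambda pair: pair[1], reverse=True)
```

Feeding it the lines of a saved log file (for example `slow_requests(open("rancher.log", encoding="utf-8"))`, where `rancher.log` is a hypothetical file name) gives a quick list of which keys were being read when etcd stalled.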

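I checked the monitoring, and there is indeed a disk I/O bottleneck.

The disk performance limits of the Rancher server host are as follows:

Maximum IOPS per disk: 5,000 IOPS
Maximum throughput per disk: 150 MB/s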

Does etcd really demand that much from the disk?
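For context, etcd is mostly sensitive to how long a synced write takes rather than to headline IOPS or throughput; as far as I recall, the etcd documentation suggests keeping the 99th percentile of WAL fsync duration below roughly 10 ms. A rough way to get a feel for this on the host is to time small synced appends. The sketch below is only an approximation under assumptions: it uses `os.fsync` (etcd itself uses fdatasync), the write size and count are arbitrary, and the helper names are hypothetical. A dedicated benchmark tool would give more reliable numbers.

```python
import os
import tempfile
import time


def measure_fsync_latency(directory, writes=200, block_size=2048):
    """Append small blocks to a temp file in `directory`, syncing after each.

    Returns the per-sync latencies in milliseconds, sorted ascending.
    """
    latencies = []
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        block = b"\0" * block_size
        for _ in range(writes):
            os.write(fd, block)
            start = time.perf_counter()
            os.fsync(fd)  # force the data to stable storage
            latencies.append((time.perf_counter() - start) * 1000.0)
    finally:
        os.close(fd)
        os.unlink(path)
    return sorted(latencies)


def percentile(sorted_values, pct):
    """Pick the value at the given percentile from an ascending list."""
    index = min(len(sorted_values) - 1, int(len(sorted_values) * pct / 100))
    return sorted_values[index]
```

Run it against the directory on the same disk that holds the Rancher data, for example `percentile(measure_fsync_latency("/path/to/rancher/data"), 99)` with the path replaced by the real one, and compare the result with the 10 ms guideline.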

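An excessive backlog of certain data can also cause etcd I/O problems, which in turn can make k3s crash or leave it unable to serve requests for a long time.
For example, apprevision: https://github.com/rancher/rancher/issues/20654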

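You could first try scaling the host up to a larger spec, to make sure it does not go down because of OOM.
Then clean up the apprevisions and scale the host back down to its original spec.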
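Before deleting anything, it may help to see how many apprevisions have piled up and where. The sketch below counts objects per namespace from a JSON list dump. It assumes the dump was produced with something like `kubectl get apprevisions.project.cattle.io --all-namespaces -o json` run against the local cluster; that resource name is my assumption and should be checked with `kubectl api-resources` first. The function name is hypothetical.

```python
import json
from collections import Counter


def count_by_namespace(kubectl_json):
    """Count objects per namespace in a `kubectl get ... -o json` list dump."""
    doc = json.loads(kubectl_json)
    return Counter(
        item.get("metadata", {}).get("namespace", "")
        for item in doc.get("items", [])
    )
```

Calling `count_by_namespace(text).most_common(10)` on the dump shows the namespaces holding the most revisions, which are the first candidates for cleanup.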

OK, I'll give it a try.