Cluster abnormal and unrecoverable after backing up the rke2 directory

Rancher Server Setup

  • Rancher version:

k8s-agent-01 Ready 369d v1.27.8+rke2r1
k8s-server-01 Ready control-plane,etcd,master 374d v1.27.8+rke2r1
k8s-server-02 Ready control-plane,etcd,master 374d v1.27.8+rke2r1
k8s-server-03 Ready control-plane,etcd,master 374d v1.27.8+rke2r1

  • Node CPU / kernel version

Linux k8s-server-01 5.10.178 #1 SMP Thu Jul 13 08:45:43 UTC 2023 x86_64 GNU/Linux
PRETTY_NAME="Debian GNU/Linux 10 (buster)"

  • Online or air-gapped deployment: air-gapped (offline)

Problem description:

After RKE2 is installed, the /var/lib/rancher/rke2 directory lives on the system disk by default. Because the system disk is small, it had to be migrated to a data disk. After the migration I restarted the rke2-server / rke2-agent services; the cluster started up normally, but the underlying image files appear to have been damaged.

Steps to reproduce:

1. On the 3 server + 1 agent nodes, stop the rke2 services: systemctl stop rke2-server ; systemctl stop rke2-agent
2. Back up the current RKE2 directory: mv /var/lib/rancher/rke2/* /var/lib/rancher/rke2_bak
3. Mount the data disk: mount /dev/nvme1n1 /var/lib/rancher/rke2/
4. Copy the backup into the mount point: cp -r /var/lib/rancher/rke2_bak/* /var/lib/rancher/rke2/ (a permission-preserving variant is sketched after this list)
5. Start the rke2 services: systemctl start rke2-server ; systemctl start rke2-agent
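
For reference, here is a permission-preserving variant of steps 2-4. This is only a sketch (device name and paths are taken from the steps above, adjust per node); rsync -a keeps mode, owner, group, timestamps and symlinks, and -H/-A/-X additionally keep hard links, ACLs and extended attributes, which plain cp -r does not fully preserve:

systemctl stop rke2-server                 # rke2-agent on the agent node
mkdir -p /var/lib/rancher/rke2_bak
mv /var/lib/rancher/rke2/* /var/lib/rancher/rke2_bak/
mount /dev/nvme1n1 /var/lib/rancher/rke2
rsync -aHAX /var/lib/rancher/rke2_bak/ /var/lib/rancher/rke2/
# or, with cp: cp -a /var/lib/rancher/rke2_bak/. /var/lib/rancher/rke2/
systemctl start rke2-server                # rke2-agent on the agent node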

Result:

1. In some business namespaces, pods ended up with an empty root filesystem.
2. Some pods failed to start because of permission problems; the most important one is used as the example here (its log is pasted below).

3. Every pod whose startup involved the network plugin failed to start.

Expected result:

After the services restart, the cluster is healthy, business pods are unaffected, and all other pods run normally.

Additional context:

1. As soon as the problem appeared I compared the contents of the rke2 directory with the rke2_bak directory and found that the permissions of some directories and files no longer match; my guess is this comes from using cp -r instead of cp -ra (see the comparison sketch below).
2. The pods in business namespaces whose root filesystem was empty recovered after being restarted; presumably this was caused by the change of the image file directory.
3. The network plugins in use are canal + multus.
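
To confirm the cp -r hypothesis in point 1, a dry-run rsync can list exactly which entries differ only in metadata between the backup and the live tree. A small sketch (read-only, nothing is copied because of -n):

# itemized dry-run: a 'p', 'o', 'g' or 't' in the flags column means the
# permissions, owner, group or timestamps differ between backup and live copy
rsync -aHAX -n -i /var/lib/rancher/rke2_bak/ /var/lib/rancher/rke2/ | head -50

# spot-check a single path on both sides (path chosen only as an example)
stat -c '%A %U:%G %n' /var/lib/rancher/rke2/server/tls /var/lib/rancher/rke2_bak/server/tls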

cat /etc/rancher/rke2/config.yaml 

tls-san: 
  - k8s-server-01
  - k8s-server-02
  - k8s-server-03

node-name: k8s-server-01
# bind-address: 0.0.0.0
# data-dir: /var/lib/rancher/rke2
# cluster-cidr: 10.42.0.0/16
# service-cidr: 10.43.0.0/16
# service-node-port-range: 30000-32767
# cluster-domian: cluster.local

bind-address: 172.29.71.11
node-ip: 172.29.71.11
cni:
  - multus
  - canal
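
For completeness, the commented-out data-dir option above suggests another way to get the data off the system disk: mount the disk somewhere else and point data-dir at it. This is only a sketch (the /data mount point is made up, and I have not verified that changing data-dir on an already-running cluster is supported, so please check before relying on it):

mount /dev/nvme1n1 /data
rsync -aHAX /var/lib/rancher/rke2/ /data/rke2/
# then in /etc/rancher/rke2/config.yaml:
#   data-dir: /data/rke2
systemctl restart rke2-server
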
[rke2-ingress-nginx-controller-pmt96] [Pod startup failure log]
2025-01-24T17:27:27.754575571+08:00 stderr F E0124 09:27:27.7544557 main.go:157] "unexpected error obtaining NGINX version" err="fork/exec /usr/bin/nginx: permission denied"
2025-01-24T17:27:27.754597953+08:00 stdout F -------------------------------------------------------------------------------
2025-01-24T17:27:27.754653035+08:00 stdout F NGINX Ingress controller
2025-01-24T17:27:27.754660472+08:00 stdout F   Release:nginx-1.9.3-hardened1
2025-01-24T17:27:27.754666549+08:00 stdout F   Build:git-1d7cec346
2025-01-24T17:27:27.754672379+08:00 stdout F   Repository:    https://github.com/rancher/ingress-nginx.git
2025-01-24T17:27:27.754677974+08:00 stdout F   N/A
2025-01-24T17:27:27.754684948+08:00 stdout F -------------------------------------------------------------------------------
2025-01-24T17:27:27.754690686+08:00 stdout F
2025-01-24T17:27:27.754864771+08:00 stderr F W0124 09:27:27.7548127 client_config.go:618] Neither --kubeconfig nor --master was specified.  Using the inClusterConfig.  This might not work.
2025-01-24T17:27:27.754972379+08:00 stderr F I0124 09:27:27.7549307 main.go:205] "Creating API client" host="https://10.43.0.1:443"
2025-01-24T17:27:27.761434634+08:00 stderr F I0124 09:27:27.7613607 main.go:249] "Running in Kubernetes cluster" major="1" minor="27" git="v1.27.8+rke2r1" state="clean" commit="66fee42707cd7f5a89f1987f7cb81b02dd19161c" platform="linux/amd64"
2025-01-24T17:27:27.875345439+08:00 stderr F I0124 09:27:27.8752657 main.go:101] "SSL fake certificate created" file="/etc/ingress-controller/ssl/default-fake-certificate.pem"
2025-01-24T17:27:27.900138016+08:00 stderr F I0124 09:27:27.9000407 ssl.go:536] "loading tls certificate" path="/usr/local/certificates/cert" key="/usr/local/certificates/key"
2025-01-24T17:27:27.913263125+08:00 stderr F I0124 09:27:27.9131817 nginx.go:260] "Starting NGINX Ingress controller"
2025-01-24T17:27:27.950592885+08:00 stderr F I0124 09:27:27.9505007 event.go:298] Event(v1.ObjectReference{Kind:"ConfigMap", Namespace:"starmap", Name:"rke2-ingress-nginx-controller", UID:"92fe7550-ff92-4374-9d0d-e3da135e4269", APIVersion:"v1", ResourceVersion:"22034247", FieldPath:""}): type: 'Normal' reason: 'CREATE' ConfigMap starmap/rke2-ingress-nginx-controller
2025-01-24T17:27:29.017475752+08:00 stderr F I0124 09:27:29.0173727 store.go:440] "Found valid IngressClass" ingress="starmap/ingress-starmap-adminapp" ingressclass="nginx"
2025-01-24T17:27:29.017689454+08:00 stderr F I0124 09:27:29.0175817 event.go:298] Event(v1.ObjectReference{Kind:"Ingress", Namespace:"starmap", Name:"ingress-starmap-adminapp", UID:"7b7a90a8-07b1-460d-98d6-5620b6b91ebb", APIVersion:"networking.k8s.io/v1", ResourceVersion:"155049111", FieldPath:""}): type: 'Normal' reason: 'Sync' Scheduled for sync
2025-01-24T17:27:29.018221017+08:00 stderr F I0124 09:27:29.0181677 store.go:440] "Found valid IngressClass" ingress="starmap/ingress-starmap-adminweb" ingressclass="nginx"
2025-01-24T17:27:29.018345959+08:00 stderr F I0124 09:27:29.0182867 event.go:298] Event(v1.ObjectReference{Kind:"Ingress", Namespace:"starmap", Name:"ingress-starmap-adminweb", UID:"3dbe8680-2246-4f65-a923-471cf68ae8d3", APIVersion:"networking.k8s.io/v1", ResourceVersion:"155049110", FieldPath:""}): type: 'Normal' reason: 'Sync' Scheduled for sync
2025-01-24T17:27:29.115101026+08:00 stderr F I0124 09:27:29.1150177 nginx.go:303] "Starting NGINX process"
2025-01-24T17:27:29.115131435+08:00 stderr F I0124 09:27:29.1150237 leaderelection.go:245] attempting to acquire leader lease starmap/rke2-ingress-nginx-leader...
2025-01-24T17:27:29.115712597+08:00 stderr F F0124 09:27:29.1156577 nginx.go:421] NGINX error: fork/exec /usr/bin/nginx: permission denied
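
Since the fatal error is a plain "permission denied" on /usr/bin/nginx inside the image, it may be worth checking whether execute bits were lost in the containerd snapshots during the copy. A rough sketch comparing the live tree against the backup (the snapshot paths are assumptions based on the default RKE2 layout):

# files that are executable in the backup but no longer executable (or missing) in the live copy
SNAP=/var/lib/rancher/rke2/agent/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots
BAK=/var/lib/rancher/rke2_bak/agent/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots
comm -13 \
  <(find "$SNAP" -type f -perm -u+x -printf '%P\n' 2>/dev/null | sort) \
  <(find "$BAK"  -type f -perm -u+x -printf '%P\n' 2>/dev/null | sort) | head
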
[rke2-ingress-nginx-controller-x2vjt] [describe output for the network plugin failure]
[root@k8s-server-01 ~]# kubectl describe po -n starmap rke2-ingress-nginx-controller-x2vjt 
Name:             rke2-ingress-nginx-controller-x2vjt
Namespace:        starmap
Priority:         0
Service Account:  rke2-ingress-nginx
Node:             k8s-server-02/172.29.71.12
Start Time:       Fri, 24 Jan 2025 17:34:16 +0800
Labels:           app.kubernetes.io/component=controller
                  app.kubernetes.io/instance=rke2-ingress-nginx
                  app.kubernetes.io/managed-by=Helm
                  app.kubernetes.io/name=rke2-ingress-nginx
                  app.kubernetes.io/part-of=rke2-ingress-nginx
                  app.kubernetes.io/version=1.9.3
                  controller-revision-hash=dc8b796d5
                  helm.sh/chart=rke2-ingress-nginx-4.8.200
                  pod-template-generation=6
Annotations:      <none>
Status:           Pending
IP:               
IPs:              <none>
Controlled By:    DaemonSet/rke2-ingress-nginx-controller
Containers:
  rke2-ingress-nginx-controller:
    Container ID:  
    Image:         registry.ibdp.webray.com.cn:51808/rancher/nginx-ingress-controller:nginx-1.9.3-hardened1
    Image ID:      
    Ports:         80/TCP, 443/TCP, 8443/TCP
    Host Ports:    80/TCP, 443/TCP, 0/TCP
    Args:
      /nginx-ingress-controller
      --election-id=rke2-ingress-nginx-leader
      --controller-class=k8s.io/ingress-nginx
      --ingress-class=nginx
      --configmap=$(POD_NAMESPACE)/rke2-ingress-nginx-controller
      --validating-webhook=:8443
      --validating-webhook-certificate=/usr/local/certificates/cert
      --validating-webhook-key=/usr/local/certificates/key
      --watch-ingress-without-class=true
    State:          Waiting
      Reason:       ContainerCreating
    Ready:          False
    Restart Count:  0
    Requests:
      cpu:      100m
      memory:   90Mi
    Liveness:   http-get http://:10254/healthz delay=10s timeout=1s period=10s #success=1 #failure=5
    Readiness:  http-get http://:10254/healthz delay=10s timeout=1s period=10s #success=1 #failure=3
    Environment:
      POD_NAME:       rke2-ingress-nginx-controller-x2vjt (v1:metadata.name)
      POD_NAMESPACE:  starmap (v1:metadata.namespace)
      LD_PRELOAD:     /usr/local/lib/libmimalloc.so
    Mounts:
      /usr/local/certificates/ from webhook-cert (ro)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-cnrwb (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             False 
  ContainersReady   False 
  PodScheduled      True 
Volumes:
  webhook-cert:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  rke2-ingress-nginx-admission
    Optional:    false
  kube-api-access-cnrwb:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   Burstable
Node-Selectors:              kubernetes.io/os=linux
Tolerations:                 node.kubernetes.io/disk-pressure:NoSchedule op=Exists
                             node.kubernetes.io/memory-pressure:NoSchedule op=Exists
                             node.kubernetes.io/not-ready:NoExecute op=Exists
                             node.kubernetes.io/pid-pressure:NoSchedule op=Exists
                             node.kubernetes.io/unreachable:NoExecute op=Exists
                             node.kubernetes.io/unschedulable:NoSchedule op=Exists
Events:
  Type     Reason                  Age               From               Message
  ----     ------                  ----              ----               -------
  Normal   Scheduled               13m                   default-scheduler  Successfully assigned starmap/rke2-ingress-nginx-controller-x2vjt to k8s-server-02
  Warning  FailedCreatePodSandBox  13m                   kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "2459c7418fefbd98f6123bcbbcd8406908ccf06c5bc79d0d70261eaa351e5294": plugin type="multus" name="multus-cni-network" failed (add): Multus: [starmap/rke2-ingress-nginx-controller-x2vjt/f2acc921-df9c-46ab-bdc7-675e8d14722f]: error getting pod: Unauthorized
  Warning  FailedCreatePodSandBox  13m                   kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "effea7e5faca9a93b57207776ba8d59b3324fcf8b233beedddae8c443ffff5e3": plugin type="multus" name="multus-cni-network" failed (add): Multus: [starmap/rke2-ingress-nginx-controller-x2vjt/f2acc921-df9c-46ab-bdc7-675e8d14722f]: error getting pod: Unauthorized
  Warning  FailedCreatePodSandBox  12m                   kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "cec68ff07fb9ba49d11e41b5f7511d72e2357dee3499e5ac7f06fabd8798c732": plugin type="multus" name="multus-cni-network" failed (add): Multus: [starmap/rke2-ingress-nginx-controller-x2vjt/f2acc921-df9c-46ab-bdc7-675e8d14722f]: error getting pod: Get "https://[10.43.0.1]:443/api/v1/namespaces/starmap/pods/rke2-ingress-nginx-controller-x2vjt?timeout=1m0s": dial tcp 10.43.0.1:443: connect: connection timed out
  Warning  FailedCreatePodSandBox  12m                   kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "a2dc15bee830df99350f42e8ba6dd75b526d17ef61be02858a23cdd8ae43fd77": plugin type="multus" name="multus-cni-network" failed (add): Multus: [starmap/rke2-ingress-nginx-controller-x2vjt/f2acc921-df9c-46ab-bdc7-675e8d14722f]: error getting pod: Unauthorized
  Warning  FailedCreatePodSandBox  12m                   kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "9c88c9a811541bc9ae4729f637e567aa834b5ba1de4bad87261f5fde7483711c": plugin type="multus" name="multus-cni-network" failed (add): Multus: [starmap/rke2-ingress-nginx-controller-x2vjt/f2acc921-df9c-46ab-bdc7-675e8d14722f]: error getting pod: Unauthorized
  Warning  FailedCreatePodSandBox  12m                   kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "1ead31a1e2f563a7647a0144735afe11d50fb22190295942eb8dc7999dd0b474": plugin type="multus" name="multus-cni-network" failed (add): Multus: [starmap/rke2-ingress-nginx-controller-x2vjt/f2acc921-df9c-46ab-bdc7-675e8d14722f]: error getting pod: Get "https://[10.43.0.1]:443/api/v1/namespaces/starmap/pods/rke2-ingress-nginx-controller-x2vjt?timeout=1m0s": dial tcp 10.43.0.1:443: connect: connection timed out
  Warning  FailedCreatePodSandBox  11m                   kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "4d268a1a7f902beb34f708589dde576be079c67f0a1f10761601105514703194": plugin type="multus" name="multus-cni-network" failed (add): Multus: [starmap/rke2-ingress-nginx-controller-x2vjt/f2acc921-df9c-46ab-bdc7-675e8d14722f]: error getting pod: Get "https://[10.43.0.1]:443/api/v1/namespaces/starmap/pods/rke2-ingress-nginx-controller-x2vjt?timeout=1m0s": dial tcp 10.43.0.1:443: connect: connection timed out
  Warning  FailedCreatePodSandBox  11m                   kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "c95deb1fba26f07bfe413798d8c97c559fd1f52586117fbecd955e8244ae0316": plugin type="multus" name="multus-cni-network" failed (add): Multus: [starmap/rke2-ingress-nginx-controller-x2vjt/f2acc921-df9c-46ab-bdc7-675e8d14722f]: error getting pod: Unauthorized
  Warning  FailedCreatePodSandBox  11m                   kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "e829cae93364a48e09e464a2b0a6ff43654cee343b0e83c5c3d2a81172163a7f": plugin type="multus" name="multus-cni-network" failed (add): Multus: [starmap/rke2-ingress-nginx-controller-x2vjt/f2acc921-df9c-46ab-bdc7-675e8d14722f]: error getting pod: Unauthorized
  Warning  FailedCreatePodSandBox  3m20s (x31 over 11m)  kubelet            (combined from similar events): Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "dda93492ac045b933557567e322e6cfc5fcc29e26c4cb747c3b578df42ad85b8": plugin type="multus" name="multus-cni-network" failed (add): Multus: [starmap/rke2-ingress-nginx-controller-x2vjt/f2acc921-df9c-46ab-bdc7-675e8d14722f]: error getting pod: Get "https://[10.43.0.1]:443/api/v1/namespaces/starmap/pods/rke2-ingress-nginx-controller-x2vjt?timeout=1m0s": dial tcp 10.43.0.1:443: connect: connection timed out

The network problem was only discovered just now; here are some notes from the troubleshooting so far.

[root@k8s-server-01 ~]# kubectl describe svc  kubernetes
Name:              kubernetes
Namespace:         default
Labels:            component=apiserver
                   provider=kubernetes
Annotations:       <none>
Selector:          <none>
Type:              ClusterIP
IP Family Policy:  SingleStack
IP Families:       IPv4
IP:                10.43.0.1
IPs:               10.43.0.1
Port:              https  443/TCP
TargetPort:        6443/TCP
Endpoints:         172.29.71.11:6443,172.29.71.12:6443,172.29.71.13:6443
Session Affinity:  None
Events:            <none>
[root@k8s-server-01 ~]# ip a | grep 172.29.71.11
    inet 172.29.71.11/24 scope global vClient
[root@k8s-server-01 ~]# netstat -anp | grep 6443
tcp        0      0 172.29.71.11:38886      172.29.71.11:6443       ESTABLISHED 31121/kubelet       
tcp        0      0 127.0.0.1:38080         127.0.0.1:6443          ESTABLISHED 45244/cloud-control 
tcp        0      0 172.29.71.11:38898      172.29.71.11:6443       ESTABLISHED 22043/rke2 server   
tcp        0      0 127.0.0.1:48180         127.0.0.1:6443          ESTABLISHED 32480/kube-schedule 
tcp        0      0 127.0.0.1:38104         127.0.0.1:6443          ESTABLISHED 46850/kube-controll 
tcp        0      0 127.0.0.1:38118         127.0.0.1:6443          ESTABLISHED 46850/kube-controll 
tcp        0      0 127.0.0.1:38090         127.0.0.1:6443          ESTABLISHED 32480/kube-schedule 
tcp        0      0 172.29.71.11:49660      172.29.71.11:6443       ESTABLISHED 51263/kube-proxy    
tcp        0      0 127.0.0.1:48232         127.0.0.1:6443          ESTABLISHED 22043/rke2 server   
tcp        0      0 127.0.0.1:48218         127.0.0.1:6443          ESTABLISHED 22043/rke2 server   
tcp        0      0 127.0.0.1:38068         127.0.0.1:6443          ESTABLISHED 45244/cloud-control 
tcp6       0      0 :::6443                 :::*                    LISTEN      44159/kube-apiserve 
tcp6       0      0 172.29.71.11:6443       172.29.71.11:38886      ESTABLISHED 44159/kube-apiserve 
tcp6       0      0 127.0.0.1:6443          127.0.0.1:38104         ESTABLISHED 44159/kube-apiserve 
tcp6       0      0 172.29.71.11:6443       172.29.71.14:23328      ESTABLISHED 44159/kube-apiserve 
tcp6       0      0 127.0.0.1:6443          127.0.0.1:48180         ESTABLISHED 44159/kube-apiserve 
tcp6       0      0 172.29.71.11:6443       172.29.71.14:22374      ESTABLISHED 44159/kube-apiserve 
tcp6       0      0 127.0.0.1:6443          127.0.0.1:38080         ESTABLISHED 44159/kube-apiserve 
tcp6       0      0 172.29.71.11:6443       172.29.71.11:38898      ESTABLISHED 44159/kube-apiserve 
tcp6       0      0 127.0.0.1:6443          127.0.0.1:48218         ESTABLISHED 44159/kube-apiserve 
tcp6       0      0 127.0.0.1:6443          127.0.0.1:38090         ESTABLISHED 44159/kube-apiserve 
tcp6       0      0 127.0.0.1:6443          127.0.0.1:38068         ESTABLISHED 44159/kube-apiserve 
tcp6       0      0 172.29.71.11:6443       172.29.71.14:23132      ESTABLISHED 44159/kube-apiserve 
tcp6       0      0 ::1:44840               ::1:6443                ESTABLISHED 44159/kube-apiserve 
tcp6       0      0 172.29.71.11:6443       172.29.71.11:49660      ESTABLISHED 44159/kube-apiserve 
tcp6       0      0 127.0.0.1:6443          127.0.0.1:48232         ESTABLISHED 44159/kube-apiserve 
tcp6       0      0 ::1:6443                ::1:44840               ESTABLISHED 44159/kube-apiserve 
tcp6       0      0 127.0.0.1:6443          127.0.0.1:38118         ESTABLISHED 44159/kube-apiserve
[root@k8s-server-01 ~]# kubectl logs --tail=1 -f -n kube-system kube-apiserver-k8s-server-01 
E0124 09:53:15.152215       1 authentication.go:73] "Unable to authenticate the request" err="[invalid bearer token, service account token has expired]"
E0124 09:53:43.111963       1 authentication.go:73] "Unable to authenticate the request" err="[invalid bearer token, service account token has expired]"
E0124 09:53:55.112176       1 authentication.go:73] "Unable to authenticate the request" err="[invalid bearer token, service account token has expired]"
I0124 09:53:58.990544       1 handler.go:232] Adding GroupVersion metrics.k8s.io v1beta1 to ResourceManager
E0124 09:54:46.110695       1 authentication.go:73] "Unable to authenticate the request" err="[invalid bearer token, service account token has expired]"
E0124 09:54:46.148987       1 authentication.go:73] "Unable to authenticate the request" err="[invalid bearer token, service account token has expired]"
E0124 09:54:58.117271       1 authentication.go:73] "Unable to authenticate the request" err="[invalid bearer token, service account token has expired]"
E0124 09:54:58.157002       1 authentication.go:73] "Unable to authenticate the request" err="[invalid bearer token, service account token has expired]"
I0124 09:54:58.990425       1 handler.go:232] Adding GroupVersion metrics.k8s.io v1beta1 to ResourceManager
E0124 09:55:12.110388       1 authentication.go:73] "Unable to authenticate the request" err="[invalid bearer token, service account token has expired]"
E0124 09:55:30.193818       1 authentication.go:73] "Unable to authenticate the request" err="[invalid bearer token, service account token has expired]"
E0124 09:55:45.110400       1 authentication.go:73] "Unable to authenticate the request" err="[invalid bearer token, service account token has expired]"
I0124 09:55:58.987170       1 handler.go:232] Adding GroupVersion metrics.k8s.io v1beta1 to ResourceManager
I0124 09:56:04.575020       1 handler.go:232] Adding GroupVersion k3s.cattle.io v1 to ResourceManager
I0124 09:56:04.575214       1 handler.go:232] Adding GroupVersion configuration.konghq.com v1alpha1 to ResourceManager
I0124 09:56:04.575399       1 handler.go:232] Adding GroupVersion configuration.konghq.com v1 to ResourceManager
I0124 09:56:04.575610       1 handler.go:232] Adding GroupVersion snapshot.storage.k8s.io v1 to ResourceManager
I0124 09:56:04.575953       1 handler.go:232] Adding GroupVersion snapshot.storage.k8s.io v1beta1 to ResourceManager
I0124 09:56:04.576164       1 handler.go:232] Adding GroupVersion crd.projectcalico.org v1 to ResourceManager
I0124 09:56:04.576284       1 handler.go:232] Adding GroupVersion gateway.networking.k8s.io v1alpha2 to ResourceManager
I0124 09:56:04.576355       1 handler.go:232] Adding GroupVersion gateway.networking.k8s.io v1beta1 to ResourceManager
I0124 09:56:04.576462       1 handler.go:232] Adding GroupVersion configuration.konghq.com v1beta1 to ResourceManager
I0124 09:56:04.576716       1 handler.go:232] Adding GroupVersion configuration.konghq.com v1 to ResourceManager
I0124 09:56:04.576829       1 handler.go:232] Adding GroupVersion helm.cattle.io v1 to ResourceManager
I0124 09:56:04.576897       1 handler.go:232] Adding GroupVersion snapshot.storage.k8s.io v1 to ResourceManager
I0124 09:56:04.576984       1 handler.go:232] Adding GroupVersion gateway.networking.k8s.io v1 to ResourceManager
I0124 09:56:04.577063       1 handler.go:232] Adding GroupVersion snapshot.storage.k8s.io v1beta1 to ResourceManager
I0124 09:56:04.577116       1 handler.go:232] Adding GroupVersion monitoring.grafana.com v1alpha2 to ResourceManager
I0124 09:56:04.577156       1 handler.go:232] Adding GroupVersion k8s.cni.cncf.io v1 to ResourceManager
I0124 09:56:04.577268       1 handler.go:232] Adding GroupVersion crd.projectcalico.org v1 to ResourceManager
I0124 09:56:04.577360       1 handler.go:232] Adding GroupVersion k3s.cattle.io v1 to ResourceManager
I0124 09:56:04.577444       1 handler.go:232] Adding GroupVersion configuration.konghq.com v1beta1 to ResourceManager
E0124 09:56:46.189194       1 authentication.go:73] "Unable to authenticate the request" err="[invalid bearer token, service account token has expired]"
I0124 09:56:54.529690       1 handler.go:232] Adding GroupVersion metrics.k8s.io v1beta1 to ResourceManager
I0124 09:56:58.990500       1 handler.go:232] Adding GroupVersion metrics.k8s.io v1beta1 to ResourceManager
E0124 09:57:01.147990       1 authentication.go:73] "Unable to authenticate the request" err="[invalid bearer token, service account token has expired]"

This looks like a token-expiry problem, but I did restart rke2-server when switching the rke2 directory, and the certificate TTL was refreshed:

[root@k8s-server-01 ~]# kubectl get secret -n kube-system rke2-serving -o jsonpath='{.data.tls\.crt}' | base64 -d | openssl x509 -noout -text | grep Not
            Not Before: Jan 16 03:19:12 2024 GMT
            Not After : Oct 17 04:05:29 2025 GMT
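
The "service account token has expired" errors refer to bearer tokens, which are independent of the rke2-serving certificate checked above. To see whether the token multus presents has actually expired, its exp claim can be decoded. A sketch (the multus.kubeconfig location is an assumption about the default RKE2/multus layout; adjust if it lives elsewhere on your nodes):

# locate the kubeconfig multus uses on this node (the path is a guess)
KCFG=$(find /var/lib/rancher/rke2/agent/etc/cni/net.d -name multus.kubeconfig 2>/dev/null | head -1)
TOKEN=$(grep -oP 'token:\s*\K\S+' "$KCFG")
# decode the JWT payload and look at its "exp" claim (seconds since epoch)
python3 -c 'import sys,base64,json; p=sys.argv[1].split(".")[1]; p+="="*(-len(p)%4); print(json.dumps(json.loads(base64.urlsafe_b64decode(p)), indent=2))' "$TOKEN"
date +%s   # compare with "exp" above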

Because there are still unknown permission problems in this environment, I don't dare to casually restart any pod that is currently Running.

Could someone please take a look at this directory backup/restore problem? Much appreciated.