Rancher server cannot add nodes

Rancher Server Setup

  • Rancher version: 2.5.2
  • Installation option (Docker install/Helm Chart):
    • If Helm Chart, provide the Local cluster type (RKE1, RKE2, k3s, EKS, etc.) and version:
  • Online or air-gapped deployment: online

Downstream Cluster Information

  • Kubernetes version: 1.19.4
  • Cluster Type (Local/Downstream):
    • If Downstream, what type of cluster? (Custom/Imported or Hosted, etc.): Custom cluster

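User Information

  • What is the role of the logged-in user? (Admin/Cluster Owner/Cluster Member/Project Owner/Project Member/Custom): Admin
    • If Custom, the custom permission set:

Problem description:
Rancher cannot add nodes. While troubleshooting, I found a large number of errors in the rancher-server log.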

Steps to reproduce:

Result:

Expected result:

Screenshots:



Additional context:

Logs

E0626 17:12:57.964948      40 authentication.go:53] Unable to authenticate the request due to an error: [invalid bearer token, square/go-jose: error in cryptographic primitive]
E0626 17:12:58.127681      40 authentication.go:53] Unable to authenticate the request due to an error: [invalid bearer token, square/go-jose: error in cryptographic primitive]
E0626 17:12:58.128644      40 authentication.go:53] Unable to authenticate the request due to an error: [invalid bearer token, square/go-jose: error in cryptographic primitive]
W0626 17:12:58.289747      40 dispatcher.go:128] Failed calling webhook, failing open rancher.cattle.io: failed calling webhook "rancher.cattle.io": Post https://rancher-webhook.cattle-system.svc:443/v1/webhook/validation?timeout=10s: dial tcp 10.43.225.146:443: connect: connection refused
E0626 17:12:58.289789      40 dispatcher.go:129] failed calling webhook "rancher.cattle.io": Post https://rancher-webhook.cattle-system.svc:443/v1/webhook/validation?timeout=10s: dial tcp 10.43.225.146:443: connect: connection refused
E0626 17:12:58.460280      40 authentication.go:53] Unable to authenticate the request due to an error: [invalid bearer token, square/go-jose: error in cryptographic primitive]
W0626 17:12:58.478162      40 dispatcher.go:128] Failed calling webhook, failing open rancher.cattle.io: failed calling webhook "rancher.cattle.io": Post https://rancher-webhook.cattle-system.svc:443/v1/webhook/validation?timeout=10s: dial tcp 10.43.225.146:443: connect: connection refused
E0626 17:12:58.478207      40 dispatcher.go:129] failed calling webhook "rancher.cattle.io": Post https://rancher-webhook.cattle-system.svc:443/v1/webhook/validation?timeout=10s: dial tcp 10.43.225.146:443: connect: connection refused
W0626 17:12:58.719746      40 dispatcher.go:128] Failed calling webhook, failing open rancher.cattle.io: failed calling webhook "rancher.cattle.io": Post https://rancher-webhook.cattle-system.svc:443/v1/webhook/validation?timeout=10s: dial tcp 10.43.225.146:443: connect: connection refused
E0626 17:12:58.719782      40 dispatcher.go:129] failed calling webhook "rancher.cattle.io": Post https://rancher-webhook.cattle-system.svc:443/v1/webhook/validation?timeout=10s: dial tcp 10.43.225.146:443: connect: connection refused
W0626 17:12:58.985558      40 dispatcher.go:128] Failed calling webhook, failing open rancher.cattle.io: failed calling webhook "rancher.cattle.io": Post https://rancher-webhook.cattle-system.svc:443/v1/webhook/validation?timeout=10s: dial tcp 10.43.225.146:443: connect: connection refused
E0626 17:12:58.985599      40 dispatcher.go:129] failed calling webhook "rancher.cattle.io": Post https://rancher-webhook.cattle-system.svc:443/v1/webhook/validation?timeout=10s: dial tcp 10.43.225.146:443: connect: connection refused
E0626 17:12:59.097278      40 authentication.go:53] Unable to authenticate the request due to an error: [invalid bearer token, square/go-jose: error in cryptographic primitive]
E0626 17:12:59.128893      40 authentication.go:53] Unable to authenticate the request due to an error: [invalid bearer token, square/go-jose: error in cryptographic primitive]
E0626 17:12:59.129767      40 authentication.go:53] Unable to authenticate the request due to an error: [invalid bearer token, square/go-jose: error in cryptographic primitive]
W0626 17:12:59.130682      40 dispatcher.go:128] Failed calling webhook, failing open rancher.cattle.io: failed calling webhook "rancher.cattle.io": Post https://rancher-webhook.cattle-system.svc:443/v1/webhook/validation?timeout=10s: dial tcp 10.43.225.146:443: connect: connection refused
E0626 17:12:59.130709      40 dispatcher.go:129] failed calling webhook "rancher.cattle.io": Post https://rancher-webhook.cattle-system.svc:443/v1/webhook/validation?timeout=10s: dial tcp 10.43.225.146:443: connect: connection refused
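
The log above mixes two different failures: rejected bearer tokens (authentication.go) and refused connections to rancher-webhook (dispatcher.go). A minimal sketch for tallying log lines by source file is shown here. The sample lines are copied from the log with their messages shortened, and the file name sample.log is only a placeholder.

```shell
#!/bin/sh
set -eu

workdir=$(mktemp -d)
logfile="$workdir/sample.log"   # placeholder name for the saved log

# Three sample lines taken from the log above (messages shortened).
cat > "$logfile" <<'EOF'
E0626 17:12:57.964948      40 authentication.go:53] Unable to authenticate the request due to an error
W0626 17:12:58.289747      40 dispatcher.go:128] Failed calling webhook, failing open rancher.cattle.io
E0626 17:12:58.289789      40 dispatcher.go:129] failed calling webhook "rancher.cattle.io"
EOF

# In these lines the fourth field is "file.go:line]"; keep only the file name
# and count how many lines each source file produced.
summary=$(awk '{ split($4, a, ":"); n[a[1]]++ } END { for (f in n) print n[f], f }' "$logfile" | sort -k2)
echo "$summary"
```

Run against the full log, the same tally would show how much of the noise comes from each of the two problems, which may need to be fixed separately.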


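Try creating a user. Does it work, and if not, what error does it report?

I tried the three delete commands from the official documentation to renew the webhook certificate, but rancher-webhook still reports the same errors.
After that I also updated the cattle-webhook-tls certificate directly, and the errors persist.

Then something probably went wrong in the way you renewed it. I don't know how you updated the webhook certificate, so I can't answer for now.

These are the steps I used to update the certificate. Is there another way to update it?

View the details of the original tls.crt certificate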

openssl x509 -in tls.crt -noout -text

Generate the new.key file

openssl genrsa -out new.key 2048

Create new.csr

openssl req -new -key new.key -out new.csr

Generate the new certificate, new.crt

openssl x509 -req -days 3650 -sha256 -CA ca.crt -CAkey ca.key -CAcreateserial -extfile openssl.cnf -extensions v3_req -in new.csr -out new.crt
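
For reference, the steps above can be collected into one script with a verification step at the end. This is only a sketch under stated assumptions, not the procedure from the Rancher documentation. It creates a throwaway CA (ca.cnf, example-ca) so that it runs on its own, whereas in the real case ca.crt and ca.key already exist. The contents of openssl.cnf were not posted, so the v3_req section here is hypothetical, and its SAN entries are guessed from the service name rancher-webhook.cattle-system.svc that appears in the log.

```shell
#!/bin/sh
set -eu

workdir=$(mktemp -d)
cd "$workdir"

# Throwaway CA so the sketch is self-contained (assumption); in the real
# case ca.crt and ca.key are the existing CA files.
cat > ca.cnf <<'EOF'
[ req ]
distinguished_name = dn
x509_extensions = v3_ca
prompt = no
[ dn ]
CN = example-ca
[ v3_ca ]
basicConstraints = critical, CA:TRUE
keyUsage = critical, keyCertSign, cRLSign
EOF
openssl genrsa -out ca.key 2048
openssl req -x509 -new -key ca.key -sha256 -days 3650 -config ca.cnf -out ca.crt

# Hypothetical openssl.cnf: the v3_req section carries the SAN entries.
cat > openssl.cnf <<'EOF'
[ req ]
distinguished_name = dn
prompt = no
[ dn ]
CN = rancher-webhook.cattle-system.svc
[ v3_req ]
basicConstraints = CA:FALSE
keyUsage = digitalSignature, keyEncipherment
extendedKeyUsage = serverAuth
subjectAltName = DNS:rancher-webhook.cattle-system.svc, DNS:rancher-webhook.cattle-system.svc.cluster.local
EOF

# Same three steps as above: key, CSR, signed certificate.
openssl genrsa -out new.key 2048
openssl req -new -key new.key -config openssl.cnf -out new.csr
openssl x509 -req -days 3650 -sha256 -CA ca.crt -CAkey ca.key -CAcreateserial \
  -extfile openssl.cnf -extensions v3_req -in new.csr -out new.crt

# Check that new.crt chains to the CA and covers the service name.
verify_result=$(openssl verify -CAfile ca.crt new.crt)
san_check=$(openssl x509 -in new.crt -noout -checkhost rancher-webhook.cattle-system.svc)
echo "$verify_result"
echo "$san_check"
```

If either check fails on the real files, the certificate placed in cattle-webhook-tls would likely be rejected by clients as well, so it is worth running both before updating the secret.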

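"Webhook certificate expired, causing a certificate error when creating roles in Rancher 2.5.1" - #3, by ksd. Follow that post to update the certificate.

How do I fix this certificate error in cattle-cluster-agent?
I followed the document you sent about changing the Rancher server_url, and I have reached the step of deploying the agent.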

INFO: Environment: CATTLE_ADDRESS=10.42.235.131 CATTLE_CA_CHECKSUM=464e60c58dbefb2ebb10d93ebe3b59659553e6befc93f5334d86273b894caa61 CATTLE_CLUSTER=true CATTLE_FEATURES= CATTLE_INTERNAL_ADDRESS= CATTLE_IS_RKE=false CATTLE_K8S_MANAGED=true CATTLE_NODE_NAME=cattle-cluster-agent-6f78698bb5-r58dp CATTLE_SERVER=https://8.142.134.194:8443
INFO: Using resolv.conf: nameserver 10.43.0.10 search cattle-system.svc.cluster.local svc.cluster.local cluster.local options ndots:5
INFO: https://8.142.134.194:8443/ping is accessible
INFO: Value from https://8.142.134.194:8443/v3/settings/cacerts is an x509 certificate
time="2022-06-28T01:54:39Z" level=info msg="Listening on /tmp/log.sock"
time="2022-06-28T01:54:39Z" level=info msg="Rancher agent version v2.5.2 is starting"
time="2022-06-28T01:54:39Z" level=info msg="Certificate details from https://8.142.134.194:8443"
time="2022-06-28T01:54:39Z" level=info msg="Certificate #0 (https://8.142.134.194:8443)"
time="2022-06-28T01:54:39Z" level=info msg="Subject: CN=*.yunlizhihui.com"
time="2022-06-28T01:54:39Z" level=info msg="Issuer: CN=Encryption Everywhere DV TLS CA - G1,OU=www.digicert.com,O=DigiCert Inc,C=US"
time="2022-06-28T01:54:39Z" level=info msg="IsCA: false"
time="2022-06-28T01:54:39Z" level=info msg="DNS Names: [*.yunlizhihui.com yunlizhihui.com]"
time="2022-06-28T01:54:39Z" level=info msg="IPAddresses: <none>"
time="2022-06-28T01:54:39Z" level=info msg="NotBefore: 2022-04-11 00:00:00 +0000 UTC"
time="2022-06-28T01:54:39Z" level=info msg="NotAfter: 2023-04-12 23:59:59 +0000 UTC"
time="2022-06-28T01:54:39Z" level=info msg="SignatureAlgorithm: SHA256-RSA"
time="2022-06-28T01:54:39Z" level=info msg="PublicKeyAlgorithm: RSA"
time="2022-06-28T01:54:39Z" level=info msg="Certificate #1 (https://8.142.134.194:8443)"
time="2022-06-28T01:54:39Z" level=info msg="Subject: CN=Encryption Everywhere DV TLS CA - G1,OU=www.digicert.com,O=DigiCert Inc,C=US"
time="2022-06-28T01:54:39Z" level=info msg="Issuer: CN=DigiCert Global Root CA,OU=www.digicert.com,O=DigiCert Inc,C=US"
time="2022-06-28T01:54:39Z" level=info msg="IsCA: true"
time="2022-06-28T01:54:39Z" level=info msg="DNS Names: <none>"
time="2022-06-28T01:54:39Z" level=info msg="IPAddresses: <none>"
time="2022-06-28T01:54:39Z" level=info msg="NotBefore: 2017-11-27 12:46:10 +0000 UTC"
time="2022-06-28T01:54:39Z" level=info msg="NotAfter: 2027-11-27 12:46:10 +0000 UTC"
time="2022-06-28T01:54:39Z" level=info msg="SignatureAlgorithm: SHA256-RSA"
time="2022-06-28T01:54:39Z" level=info msg="PublicKeyAlgorithm: RSA"
time="2022-06-28T01:54:39Z" level=info msg="Certificate details for /etc/kubernetes/ssl/certs/serverca"
time="2022-06-28T01:54:39Z" level=info msg="Certificate #0 (/etc/kubernetes/ssl/certs/serverca)"
time="2022-06-28T01:54:39Z" level=info msg="Subject: CN=dynamiclistener-ca,O=dynamiclistener-org"
time="2022-06-28T01:54:39Z" level=info msg="Issuer: CN=dynamiclistener-ca,O=dynamiclistener-org"
time="2022-06-28T01:54:39Z" level=info msg="IsCA: true"
time="2022-06-28T01:54:39Z" level=info msg="DNS Names: <none>"
time="2022-06-28T01:54:39Z" level=info msg="IPAddresses: <none>"
time="2022-06-28T01:54:39Z" level=info msg="NotBefore: 2020-11-25 15:27:23 +0000 UTC"
time="2022-06-28T01:54:39Z" level=info msg="NotAfter: 2030-11-23 15:27:23 +0000 UTC"
time="2022-06-28T01:54:39Z" level=info msg="SignatureAlgorithm: ECDSA-SHA256"
time="2022-06-28T01:54:39Z" level=info msg="PublicKeyAlgorithm: ECDSA"
time="2022-06-28T01:54:39Z" level=error msg="Issuer of last certificate found in chain (CN=DigiCert Global Root CA,OU=www.digicert.com,O=DigiCert Inc,C=US) does not match with CA certificate Issuer (CN=dynamiclistener-ca,O=dynamiclistener-org). Please check if the configured server certificate contains all needed intermediate certificates and make sure they are in the correct order (server certificate first, intermediates after)"
time="2022-06-28T01:54:39Z" level=fatal msg="Server certificate does not contain correct DNS and/or IP address entries in the Subject Alternative Names (SAN). Certificate information is displayed above. error: Get \"https://8.142.134.194:8443\": x509: cannot validate certificate for 8.142.134.194 because it doesn't contain any IP SANs"
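
The fatal line says the agent reached the server by IP address (CATTLE_SERVER=https://8.142.134.194:8443), while the certificate it received lists only DNS names. A small sketch can reproduce that kind of mismatch locally. It uses a throwaway self-signed certificate with DNS-only SANs and the `-checkip` and `-checkhost` options of `openssl x509`. The names example.com, rancher.example.com and the address 192.0.2.10 are placeholders, not values from this thread.

```shell
#!/bin/sh
set -eu

workdir=$(mktemp -d)
cd "$workdir"

# Throwaway certificate whose SAN list has DNS names only, mirroring the
# shape of the wildcard certificate in the log (placeholder names).
cat > san.cnf <<'EOF'
[ req ]
distinguished_name = dn
x509_extensions = v3_ext
prompt = no
[ dn ]
CN = *.example.com
[ v3_ext ]
subjectAltName = DNS:*.example.com, DNS:example.com
EOF
openssl genrsa -out server.key 2048
openssl req -x509 -new -key server.key -sha256 -days 1 -config san.cnf -out server.crt

# Ask openssl whether the certificate covers an IP address and a host name.
ip_check=$(openssl x509 -in server.crt -noout -checkip 192.0.2.10)
host_check=$(openssl x509 -in server.crt -noout -checkhost rancher.example.com)
echo "$ip_check"
echo "$host_check"
```

The IP check reports no match while the host name check reports a match, which suggests the SAN error in the log is tied to using an IP address as the server URL. The separate error about the issuer not matching dynamiclistener-ca is a different check and is not covered by this sketch.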

When did I send a document? Which document was it? And what exactly did you do?