Hi everyone,
Environment:
RKE2 version:
# rke2 -v
rke2 version v1.30.5+rke2r1 (0c83bc82315cd61664880d0b52a7e070e9fbd623)
go version go1.22.6 X:boringcrypto
Rancher version:
v2.9.2
Node CPU architecture, operating system, and version:
uname -a
Linux ml-57-6 4.19.90-23.21.v2101.fortest.ky10.aarch64 #1 SMP Fri Mar 11 11:37:09 CST 2022 aarch64 aarch64 aarch64 GNU/Linux
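The operating system is Rocky Linux 9.5, arm64
Cluster configuration:
Both clusters run the same RKE2 version
Upstream cluster: 1 server
Downstream cluster: 3 servers, no agents
Problem description:
I have two clusters and, following the recommendation on rancher.cn, I am using a certificate signed by a private CA. When I try to import the downstream cluster, cattle-cluster-agent fails with an error during startup, so the downstream cluster cannot be managed.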
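Item | Upstream cluster | Downstream cluster |
---|---|---|
Configuration | Single server node | Three servers, no agents |
Purpose | Runs Rancher | Workload cluster |
Deployment method | Manual air-gapped install | Same as upstream, manual install |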
time="2024-12-09T08:13:56Z" level=error msg="Could not securely connect to https://172.16.57.2: Get \"https://172.16.57.2\": tls: failed to verify certificate: x509: cannot validate certificate for 172.16.57.2 because it doesn't contain any IP SANs"
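The error says that the certificate presented at https://172.16.57.2 has no IP entries in its Subject Alternative Name (SAN) extension, which Go's TLS client requires when the URL uses an IP address. A minimal sketch for checking this from `openssl x509 -noout -text` output is shown here. The helper name `has_ip_san` is hypothetical, and the sketch assumes OpenSSL's usual text layout, where SAN entries appear as `IP Address:<ip>` on the line after the extension header.

```shell
# has_ip_san IP
# Reads `openssl x509 -noout -text` output on stdin and succeeds only if
# the Subject Alternative Name extension lists exactly the given IP.
# (Hypothetical helper; assumes OpenSSL's standard text layout.)
has_ip_san() {
    grep -A1 'X509v3 Subject Alternative Name' \
        | tr ',' '\n' \
        | sed 's/^[[:space:]]*//; s/[[:space:]]*$//' \
        | grep -Fqx "IP Address:$1"
}

# Possible usage against the certificate the server actually presents:
#   openssl s_client -connect 172.16.57.2:443 </dev/null 2>/dev/null \
#     | openssl x509 -noout -text \
#     | has_ip_san 172.16.57.2 && echo "IP SAN present"
```

If the check fails for the served certificate but passes for the generated `tls.crt`, the server is probably presenting a different certificate than the one that was generated.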
Steps to reproduce:
-
Command used to install RKE2:
INSTALL_RKE2_ARTIFACT_PATH=$PWD sh install.sh
Both the upstream and downstream clusters were installed with this command and start normally.
-
Install Rancher on the upstream cluster:
helm repo add rancher-stable https://releases.rancher.com/server-charts/stable
helm fetch rancher-stable/rancher --version=v2.9.2
-
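Generate the private key and certificates with the script from the Rancher documentation page "生成自签名 SSL 证书 | Rancher文档" (Generating a self-signed SSL certificate).
The command used was:
bash generate-key.sh --ssl-trusted-ip=172.16.57.2,172.16.57.9 --ssl-domain=rancher.ml.local --ssl-date=3650
Then import the results into the upstream cluster with:
cd certs
kubectl -n cattle-system create secret tls tls-rancher-ingress \
  --cert=tls.crt \
  --key=tls.key
kubectl -n cattle-system create secret generic tls-ca --from-file=cacerts.pem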
-
Install Rancher on the upstream cluster with the following commands. The installation succeeds and the Rancher UI is reachable:
cd rancher
# rke2 is detected as 1.31, so temporarily remove the version requirement
sed -i "/kubeVersion/d" Chart.yaml
helm upgrade install --namespace cattle-system \
  --set hostname=rancher.ml.local \
  --set rancherImage=172.16.57.5:443/rancher/rancher \
  --set ingress.tls.source=secret \
  --set privateCA=true \
  --set useBundledSystemChart=true .
-
Copy the cluster import command from Rancher and run it on the downstream cluster.
The agent on the downstream cluster fails at startup with:
time="2024-12-09T08:13:56Z" level=error msg="Could not securely connect to https://172.16.57.2: Get \"https://172.16.57.2\": tls: failed to verify certificate: x509: cannot validate certificate for 172.16.57.2 because it doesn't contain any IP SANs"
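The certificate generation step above passes two IPs via `--ssl-trusted-ip` and one domain via `--ssl-domain`, so the resulting certificate would be expected to list all three as SANs. The sketch below only illustrates the `subjectAltName` value that OpenSSL accepts for such a list. It is not taken from `generate-key.sh`, and the helper name `build_san` is hypothetical.

```shell
# build_san DOMAIN IP_LIST
# Prints a subjectAltName value (DNS:...,IP:...,IP:...) built from one
# domain and a comma-separated list of IP addresses.
# (Hypothetical helper for illustration; not part of generate-key.sh.)
build_san() {
    san="DNS:$1"
    old_ifs=$IFS
    IFS=','
    # Unquoted on purpose: IFS=',' splits the list into single addresses.
    for ip in $2; do
        san="$san,IP:$ip"
    done
    IFS=$old_ifs
    printf '%s\n' "$san"
}

# Possible usage with the values from this post:
#   build_san rancher.ml.local 172.16.57.2,172.16.57.9
```

For a throwaway comparison certificate, this value could be passed to `openssl req` through `-addext "subjectAltName=..."` (available in OpenSSL 1.1.1 and newer), and its text dump compared with that of the generated `tls.crt`.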
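Expected result:
The rancher-agent on the downstream cluster starts normally and the cluster is imported into the upstream Rancher
Actual result:
The rancher-agent on the downstream cluster fails with an error at startup
Logs
Startup log of the rancher-agent on the downstream cluster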
INFO: Environment: CATTLE_ADDRESS=10.42.0.50 CATTLE_CA_CHECKSUM=6afa873fddb087f5557bfe42550fde4c3a4b4848467c013268e3e12e09047998 CATTLE_CLUSTER=true CATTLE_CLUSTER_AGENT_PORT=tcp://10.43.137.240:80 CATTLE_CLUSTER_AGENT_PORT_443_TCP=tcp://10.43.137.240:443 CATTLE_CLUSTER_AGENT_PORT_443_TCP_ADDR=10.43.137.240 CATTLE_CLUSTER_AGENT_PORT_443_TCP_PORT=443 CATTLE_CLUSTER_AGENT_PORT_443_TCP_PROTO=tcp CATTLE_CLUSTER_AGENT_PORT_80_TCP=tcp://10.43.137.240:80 CATTLE_CLUSTER_AGENT_PORT_80_TCP_ADDR=10.43.137.240 CATTLE_CLUSTER_AGENT_PORT_80_TCP_PORT=80 CATTLE_CLUSTER_AGENT_PORT_80_TCP_PROTO=tcp CATTLE_CLUSTER_AGENT_SERVICE_HOST=10.43.137.240 CATTLE_CLUSTER_AGENT_SERVICE_PORT=80 CATTLE_CLUSTER_AGENT_SERVICE_PORT_HTTP=80 CATTLE_CLUSTER_AGENT_SERVICE_PORT_HTTPS_INTERNAL=443 CATTLE_CLUSTER_REGISTRY= CATTLE_INGRESS_IP_DOMAIN=sslip.io CATTLE_INSTALL_UUID=c04358c4-8b51-4b3a-88bc-2744edca0bd0 CATTLE_INTERNAL_ADDRESS= CATTLE_IS_RKE=false CATTLE_K8S_MANAGED=true CATTLE_NODE_NAME=cattle-cluster-agent-655ff6f66-5749n CATTLE_RANCHER_PROVISIONING_CAPI_VERSION= CATTLE_RANCHER_WEBHOOK_VERSION=104.0.2+up0.5.2 CATTLE_SERVER=https://172.16.57.2 CATTLE_SERVER_VERSION=v2.9.2
INFO: Using resolv.conf: search cattle-system.svc.cluster.local svc.cluster.local cluster.local nameserver 10.43.0.10 options ndots:5
INFO: https://172.16.57.2/ping is accessible
INFO: Value from https://172.16.57.2/v3/settings/cacerts is an x509 certificate
time="2024-12-09T08:39:31Z" level=info msg="Listening on /tmp/log.sock"
time="2024-12-09T08:39:31Z" level=info msg="Rancher agent version v2.9.2 is starting"
time="2024-12-09T08:39:31Z" level=info msg="Testing connection to https://172.16.57.2 using trusted certificate authorities within: /etc/kubernetes/ssl/certs/serverca"
time="2024-12-09T08:39:31Z" level=error msg="Could not securely connect to https://172.16.57.2: Get \"https://172.16.57.2\": tls: failed to verify certificate: x509: cannot validate certificate for 172.16.57.2 because it doesn't contain any IP SANs"
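The log above includes `CATTLE_CA_CHECKSUM`, which, as far as I understand, is the SHA-256 digest of the CA certificate that Rancher serves at `/v3/settings/cacerts`. Comparing it with the digest of the local `cacerts.pem` could help confirm whether the agent is trusting the CA that was generated earlier. The following is a minimal sketch under that assumption. The helper names `ca_checksum` and `matches_agent_checksum` are hypothetical, and whitespace differences such as a trailing newline would change the digest.

```shell
# ca_checksum FILE
# Prints the SHA-256 hex digest of FILE.
# (Hypothetical helper; relies on the coreutils `sha256sum` tool.)
ca_checksum() {
    sha256sum "$1" | awk '{print $1}'
}

# matches_agent_checksum FILE EXPECTED
# Succeeds if the digest of FILE equals EXPECTED.
matches_agent_checksum() {
    [ "$(ca_checksum "$1")" = "$2" ]
}

# Possible usage with the value from the agent log:
#   matches_agent_checksum certs/cacerts.pem \
#     6afa873fddb087f5557bfe42550fde4c3a4b4848467c013268e3e12e09047998 \
#     && echo "same CA" || echo "different CA (or different whitespace)"
```

A match would suggest the CA side is consistent, leaving the served leaf certificate and its SANs as the more likely place to look.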
Did I do something wrong, or did I miss a step? Thanks in advance for any pointers and help!