Certificate files not found when creating a cluster

Environment information:
RKE2 version:
v1.26.8+rke2r1
When I create the cluster with the following command, the message below appears:
curl --insecure -fL https://10.13.21.18/system-agent-install.sh | sudo sh -s - --server https://10.13.21.18 --label 'cattle.io/os=linux' --token j8pbklttwg82q6zf4ztj7k8zl45mpq9wn6qgkp7spp4nk9kfxjdj44 --ca-checksum 69734a1c2931770854d8635f7efab02b74c03ffbd21f39005013a9390f80e1f1 --etcd --controlplane --worker
Log:
error applying plan -- check rancher-system-agent.service logs on node for more information, waiting for agent to check in and apply initial plan
The logs on that server show the following:
Oct 20 13:22:23 docker03 rancher-system-agent[485951]: time="2023-10-20T13:22:23+08:00" level=error msg="error while appending ca cert to pool for probe kube-scheduler"
Oct 20 13:22:25 docker03 rancher-system-agent[485951]: time="2023-10-20T13:22:25+08:00" level=error msg="error loading CA cert for probe (kube-scheduler) /var/lib/rancher/rke2/server/tls/kube-scheduler/kube-scheduler.crt: open /var/lib/rancher/rke2/server/tls/kube-scheduler/kube-scheduler.crt: no such file or directory"
Oct 20 13:22:25 docker03 rancher-system-agent[485951]: time="2023-10-20T13:22:25+08:00" level=error msg="error while appending ca cert to pool for probe kube-scheduler"
Oct 20 13:22:25 docker03 rancher-system-agent[485951]: time="2023-10-20T13:22:25+08:00" level=error msg="error loading CA cert for probe (kube-controller-manager) /var/lib/rancher/rke2/server/tls/kube-controller-manager/kube-controller-manager.crt: open /var/lib/rancher/rke2/server/tls/kube-controller-manager/kube-controller-manager.crt: no such file or directory"
Oct 20 13:22:25 docker03 rancher-system-agent[485951]: time="2023-10-20T13:22:25+08:00" level=error msg="error while appending ca cert to pool for probe kube-controller-manager"
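I installed offline (air-gapped), with Harbor as the private image registry. An earlier online installation did not have this problem. Does anyone know what is going on?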

The tls folder was not created. Is this a network problem?
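To confirm which of the certificate files named in the log are actually missing, something like the following could be run on the node. This is only a sketch: the function name check_probe_certs is made up for illustration, and the default directory is taken from the paths in the error messages above.

```shell
#!/bin/sh
# check_probe_certs [TLS_DIR]
# Reports whether the two probe certificates mentioned in the
# rancher-system-agent log exist. TLS_DIR defaults to the directory
# that appears in the error messages.
# Returns 0 if both files are present, 1 if at least one is missing.
check_probe_certs() {
    tls_dir="${1:-/var/lib/rancher/rke2/server/tls}"
    missing=0
    for name in kube-scheduler kube-controller-manager; do
        crt="$tls_dir/$name/$name.crt"
        if [ -f "$crt" ]; then
            echo "present: $crt"
        else
            echo "missing: $crt"
            missing=1
        fi
    done
    return "$missing"
}
```

Running `check_probe_certs` with no argument checks the default path. If the whole tls directory is absent, the RKE2 server process probably never got far enough to generate its certificates, so the rke2-server and kubelet logs are the next place to look.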

You can check the kubelet log on the downstream cluster host, /var/lib/rancher/rke2/agent/logs/kubelet.log, to see the specific error.
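The kubelet log can be long, so filtering it for certificate errors may help. A minimal sketch, assuming the log lives at the path given above; the function name x509_errors is hypothetical, and the search string is the standard Go TLS error text for an untrusted CA.

```shell
#!/bin/sh
# x509_errors [LOG_FILE]
# Prints every line of the kubelet log that reports an untrusted
# certificate authority. LOG_FILE defaults to the RKE2 kubelet log path.
# Exit status follows grep: 0 if any line matched, 1 if none did.
x509_errors() {
    log_file="${1:-/var/lib/rancher/rke2/agent/logs/kubelet.log}"
    grep -F 'x509: certificate signed by unknown authority' "$log_file"
}
```

For example, `x509_errors | tail -n 5` shows the most recent matches. The matching lines usually name the host whose certificate was rejected, which tells you whether the failing connection is to the Rancher server or to the private registry.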

tls: failed to verify certificate: x509: certificate signed by unknown authority

This is the error it shows.

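My guess is that the problem is with the certificates. Could you find and post the command you used to install Rancher?
Also take a look at how the certificates are installed in these two articles:
https://forums.rancher.cn/t/rancher-4-lb/1681
https://forums.rancher.cn/t/rancher-cert-manager-tls-4-lb/1688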

docker run -d --restart=unless-stopped \
  -p 80:80 -p 443:443 \
  -e CATTLE_SYSTEM_DEFAULT_REGISTRY=harbor.micia.com:8443 \
  -e CATTLE_SYSTEM_CATALOG=bundled \
  -v /data/registries.yaml:/etc/rancher/k3s/registries.yaml \
  -v /data/cert:/data/cert \
  --privileged \
  harbor.micia.com:8443/rancher/rancher:latest

I just installed a fresh virtual machine and ran the following command, which created the node successfully:
curl --insecure -fL https://10.13.21.18/system-agent-install.sh | sudo sh -s - --server https://10.13.21.18 --label 'cattle.io/os=linux' --token ltljb6vj8mt5b4ncs4vpjrrwzzbl79ncgm9qrznptbdqpnn6qlqhzw --ca-checksum 80b5532325f8ba21eff882a25ddd9394b6a303c1566b5108caf945821c371b9c --etcd --controlplane --worker
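The two registration commands in this thread carry different --ca-checksum values, so it may be worth confirming that the value on a node matches the CA the Rancher server currently presents. A rough sketch, assuming the checksum is the SHA-256 of the CA file; the function name ca_checksum_matches is made up for illustration, and the /cacerts download path in the usage note is an assumption that should be checked against your Rancher version.

```shell
#!/bin/sh
# ca_checksum_matches CA_FILE EXPECTED_SHA256
# Computes the SHA-256 of CA_FILE and compares it with the value that
# was passed to --ca-checksum. Returns 0 on a match, 1 otherwise.
ca_checksum_matches() {
    ca_file="$1"
    expected="$2"
    actual=$(sha256sum "$ca_file" | awk '{print $1}')
    if [ "$actual" = "$expected" ]; then
        return 0
    fi
    # Print both values so the mismatch is easy to see.
    echo "checksum mismatch: expected $expected, got $actual" >&2
    return 1
}
```

Possible usage, with the download path being the assumption mentioned above: `curl --insecure -fsL https://10.13.21.18/cacerts -o /tmp/cacerts.pem`, then `ca_checksum_matches /tmp/cacerts.pem 80b5532325f8ba21eff882a25ddd9394b6a303c1566b5108caf945821c371b9c`. A mismatch would suggest the registration command was generated before the Rancher server certificate was changed.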

Check whether the hostname of the node that had the problem contains an underscore (_). I have run into that before.
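A quick way to run that check on each node, as a small sketch (the function name hostname_has_underscore is hypothetical):

```shell
#!/bin/sh
# hostname_has_underscore NAME
# Returns 0 if NAME contains an underscore, 1 otherwise.
# Underscores are not valid in DNS host labels, which is why they
# tend to cause trouble for Kubernetes node names.
hostname_has_underscore() {
    case "$1" in
        *_*) return 0 ;;
        *)   return 1 ;;
    esac
}
```

For example: `hostname_has_underscore "$(hostname)" && echo "hostname contains an underscore"`.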

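There are no special characters. Installing Rancher online worked before; it was only after I tried the offline method that all sorts of problems appeared.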

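I have not tested K3S. You can refer to the two links above, which are authoritative tutorials by KSD, and compare your Rancher deployment and certificate installation steps against them.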

I could not open those two articles.

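Rancher high-availability installation: TLS certificates issued by Cert-Manager + layer-4 LB
Rancher high-availability installation: self-signed certificates + layer-4 LB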

Have you run into another problem? When I create a deployment, it reports "Deployment does not have minimum availability."

I have not run into that one. You could try setting resource reservations for the pod when creating the deployment.

I do not know what is causing it. It is very strange.

I tried that, and it is still the same.

I get this error when deploying the master node with Rancher 2.7.8:

Configuring bootstrap node(s) custom-c806af88a4eb: error applying plan -- check rancher-system-agent.service logs on node for more information, waiting for agent to check in and apply initial plan

I am willing to pay for a solution to this problem.