Adding a downstream cluster with a self-signed certificate stays Pending: cattle-cluster-agent reports x509: certificate signed by unknown authority

RKE version: v1.4.1

Docker version: 20.10.20

Helm version: v3.8.0

Rancher version: v2.6.9

Operating system and kernel: CentOS Linux release 7.9.2009 (Core)

Host type and provider: Huawei Cloud

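Steps to reproduce:

  1. Offline (air-gapped) installation, with an Nginx layer-7 load balancer terminating TLS.

  2. Generated the self-signed certificate and the CA certificate with the script from the documentation, using this command: sh create_self-signed-cert.sh --ssl-domain=rancher.domainname.com --ssl-size=2048 --ssl-date=3650

  3. Basic RKE installation steps: rke config; edit private_registries and set ignore_docker_version: true and
    enable_cri_dockerd: true; then rke up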

  4. Basic Rancher installation steps:
    ./helm template rancher ./rancher-2.6.9.tgz --output-dir . \
    --no-hooks \
    --namespace cattle-system \
    --set useBundledSystemChart=false \
    --set rancherImage=192.168.4.2:5000/rancher/rancher \
    --set systemDefaultRegistry=192.168.4.2:5000 \
    --set hostname=rancher.domainname.com \
    --set ingress.tls.source=secret \
    --set tls=external

  5. The Nginx configuration follows the official documentation: Chart Install Options | Rancher docs

  6. The Rancher UI is accessible.

  7. Created a second RKE cluster with rke and imported it as an existing cluster in the Rancher UI by running the following command on one node of the downstream cluster: curl --insecure -sfL https://rancher.domainname.com/v3/import/nb7ff9zh7tkrktl9jw2227jjl6s6x96hpxxxxxxxxxxxxxxxt9cbtbhbs_c-m-wkpjpcvd.yaml | kubectl apply -f -

  8. During the import, cattle-cluster-agent on the downstream cluster reported: ERROR: https://rancher.domainname.com/ping is not accessible (Failed to connect to rancher.domainname.com port 443: Connection timed out)
    I resolved this by patching hostAliases into the deployment:
    kubectl -n cattle-system patch deployments cattle-cluster-agent --patch '{ "spec": { "template": { "spec": { "hostAliases": [ { "hostnames": [ "rancher.domainname.com" ], "ip": "192.168.4.2" } ] } } }}'

Result:

  1. cattle-cluster-agent on the downstream cluster fails:
    kubectl get pods -A
    NAMESPACE       NAME                                    READY   STATUS             RESTARTS         AGE
    cattle-system   cattle-cluster-agent-759fc8c486-nbvzd   0/1     CrashLoopBackOff   10 (2m20s ago)   28m
    ...
  2. Container log output:
    kubectl logs cattle-cluster-agent-759fc8c486-nbvzd -n cattle-system
    ...
    time="2023-02-10T10:07:42Z" level=info msg="Listening on /tmp/log.sock"
    time="2023-02-10T10:07:42Z" level=info msg="Rancher agent version v2.6.9 is starting"
    time="2023-02-10T10:07:42Z" level=info msg="Certificate details from https://rancher.domainname.com"
    time="2023-02-10T10:07:42Z" level=info msg="Certificate #0 (https://rancher.domainname.com)"
    time="2023-02-10T10:07:42Z" level=info msg="Subject: CN=rancher.domainname.com,C=CN"
    time="2023-02-10T10:07:42Z" level=info msg="Issuer: CN=cattle-ca,C=CN"
    time="2023-02-10T10:07:42Z" level=info msg="IsCA: false"
    time="2023-02-10T10:07:42Z" level=info msg="DNS Names: [rancher.domainname.com]"
    time="2023-02-10T10:07:42Z" level=info msg="IPAddresses: "
    time="2023-02-10T10:07:42Z" level=info msg="NotBefore: 2023-02-10 09:39:23 +0000 UTC"
    time="2023-02-10T10:07:42Z" level=info msg="NotAfter: 2033-02-07 09:39:23 +0000 UTC"
    time="2023-02-10T10:07:42Z" level=info msg="SignatureAlgorithm: SHA256-RSA"
    time="2023-02-10T10:07:42Z" level=info msg="PublicKeyAlgorithm: RSA"
    time="2023-02-10T10:07:42Z" level=info msg="Certificate #1 (https://rancher.domainname.com)"
    time="2023-02-10T10:07:42Z" level=info msg="Subject: CN=cattle-ca,C=CN"
    time="2023-02-10T10:07:42Z" level=info msg="Issuer: CN=cattle-ca,C=CN"
    time="2023-02-10T10:07:42Z" level=info msg="IsCA: true"
    time="2023-02-10T10:07:42Z" level=info msg="DNS Names: "
    time="2023-02-10T10:07:42Z" level=info msg="IPAddresses: "
    time="2023-02-10T10:07:42Z" level=info msg="NotBefore: 2023-02-10 09:39:23 +0000 UTC"
    time="2023-02-10T10:07:42Z" level=info msg="NotAfter: 2033-02-07 09:39:23 +0000 UTC"
    time="2023-02-10T10:07:42Z" level=info msg="SignatureAlgorithm: SHA256-RSA"
    time="2023-02-10T10:07:42Z" level=info msg="PublicKeyAlgorithm: RSA"
    time="2023-02-10T10:07:42Z" level=fatal msg="Certificate chain is not complete, please check if all needed intermediate certificates are included in the server certificate (in the correct order) and if the cacerts setting in Rancher either contains the correct CA certificate (in the case of using self signed certificates) or is empty (in the case of using a certificate signed by a recognized CA). Certificate information is displayed above. error: Get \"https://rancher.domainname.com\": x509: certificate signed by unknown authority"

It looks as though the container image is missing the CA root certificate. What is the best way to handle this? Thanks!
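To make the symptom concrete, here is a small local sketch. It assumes only that openssl is installed, and it generates throwaway certificates that merely reuse the subject names from the log (cattle-ca and rancher.domainname.com); nothing from the real installation is read or changed. The file names under the temporary directory are illustrative choices. It shows that a certificate issued by a self-signed CA is rejected when only the default trust store is consulted and accepted once the CA file is supplied explicitly, which is roughly the situation the agent appears to be in.

```shell
#!/bin/sh
# Local illustration only: a throwaway CA and server certificate that reuse
# the subject names from the log. No file from the real setup is touched.
workdir=$(mktemp -d)

# Self-signed CA: Subject and Issuer are both CN=cattle-ca.
openssl req -x509 -newkey rsa:2048 -nodes -days 1 \
  -subj "/C=CN/CN=cattle-ca" \
  -keyout "$workdir/cakey.pem" -out "$workdir/cacerts.pem" 2>/dev/null

# Server key and CSR, then a certificate signed by that CA.
openssl req -new -newkey rsa:2048 -nodes \
  -subj "/C=CN/CN=rancher.domainname.com" \
  -keyout "$workdir/tls.key" -out "$workdir/tls.csr" 2>/dev/null
openssl x509 -req -days 1 -in "$workdir/tls.csr" \
  -CA "$workdir/cacerts.pem" -CAkey "$workdir/cakey.pem" -CAcreateserial \
  -out "$workdir/tls.crt" 2>/dev/null

# Case 1: only the default trust store is consulted.
if openssl verify "$workdir/tls.crt" 2>&1 | grep -q ': OK$'; then
  without_ca="trusted"
else
  without_ca="untrusted"
fi

# Case 2: the private CA is supplied explicitly.
if openssl verify -CAfile "$workdir/cacerts.pem" "$workdir/tls.crt" 2>&1 | grep -q ': OK$'; then
  with_ca="trusted"
else
  with_ca="untrusted"
fi

echo "without CA: $without_ca"
echo "with CA:    $with_ca"
```

The first check should report untrusted and the second trusted. If that matches what the agent sees, the likely fix is to make the CA known to Rancher so it can hand it to the agent, rather than to rebuild the agent image.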

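Please provide the complete nginx configuration.

It is configured following the Nginx layer-7 proxy approach from an earlier version of the official documentation.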

upstream rancher_servers_https {
    least_conn;
    server 192.168.4.52:443 max_fails=3 fail_timeout=1s;
    server 192.168.4.103:443 max_fails=3 fail_timeout=1s;
    server 192.168.4.170:443 max_fails=3 fail_timeout=1s;
}

map $http_upgrade $connection_upgrade {
    default Upgrade;
    ''      close;
}

server {
    listen 443 ssl;
    server_name rancher.domainname.com;
    ssl_certificate   "/etc/nginx/cert/tls.crt";
    ssl_certificate_key  "/etc/nginx/cert/tls.key";
    ssl_session_timeout 5m;
    ssl_protocols TLSv1.2 TLSv1.3;
    ssl_ciphers ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-CHACHA20-POLY1305:ECDHE-RSA-CHACHA20-POLY1305;
    ssl_prefer_server_ciphers on;

    location / {
        proxy_redirect off;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_pass https://rancher_servers_https;
        client_max_body_size 20m;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection $connection_upgrade;
    }
    access_log /var/log/nginx/rancher.log;
}

server {
    listen 80;
    server_name rancher.domainname.com;
    return 301 https://$server_name$request_uri;
    access_log /var/log/nginx/rancher.log;
}

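If you use a private certificate, you first need to save the certificate to a Kubernetes secret and then specify --set privateCA=true at install time.

See the description under "Option C: Use your existing certificate" in the documentation.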
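A minimal sketch of that step, under a few assumptions: the CA certificate produced by the script is available as a local file (./cacerts.pem here is a guess at its name), kubectl points at the local cluster where Rancher runs, and create_tls_ca_secret is a hypothetical helper name used only for illustration. The secret name tls-ca and the key cacerts.pem are what the Rancher documentation describes for a private CA.

```shell
#!/bin/sh
# Sketch: store a private CA certificate in the tls-ca secret that the
# Rancher chart reads when privateCA=true is set.
# create_tls_ca_secret is a hypothetical helper name, not a Rancher tool.
create_tls_ca_secret() {
  ca_file="$1"
  if [ ! -s "$ca_file" ]; then
    echo "CA file not found or empty: $ca_file" >&2
    return 1
  fi
  # The key inside the secret has to be named cacerts.pem.
  kubectl -n cattle-system create secret generic tls-ca \
    --from-file=cacerts.pem="$ca_file"
}

# Example (assumes the CA certificate was saved as ./cacerts.pem):
#   create_tls_ca_secret ./cacerts.pem
# Then render the chart again with the original options plus:
#   --set privateCA=true
```

After Rancher picks up the CA, the import manifest for the downstream cluster may need to be fetched and applied again so that the agent is configured against the new CA setting.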

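You have a point. Based on my own experiments, with just these two constraints in place (Nginx layer-7 load balancing plus external TLS termination),
--set tls=external and --set privateCA=true appear to conflict.

--set tls=external means that Rancher and RKE are no longer responsible for validating the SSL certificate for the domain; that is handed over to Nginx, and Rancher uses its own set of certificates internally.
--set privateCA=true requires importing the self-generated CA certificate into the cattle-system namespace of the local cluster.

I will verify the approach you suggested when I get the chance.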

Why does the downstream cluster show the message in the red box? Is something wrong somewhere?