After changing Rancher's port from 443 to 10443 and then to 10444, and running the rke-clean.sh script on two master nodes of the k8s cluster, docker run rancher-agent reports errors

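Rancher Server Setup

  • Rancher version:
  • Installation option (Docker install/Helm Chart):
    • If Helm Chart, provide the Local cluster type (RKE1, RKE2, k3s, EKS, etc.) and version:
  • Online or air-gapped deployment:

Downstream Cluster Information

  • Kubernetes version:
  • Cluster Type (Local/Downstream):
    • If Downstream, what type of cluster? (Custom/Imported or Hosted, etc.):

User Information

  • What is the role of the logged-in user? (Admin/Cluster Owner/Cluster Member/Project Owner/Project Member/Custom):
    • If Custom, the custom permission set:

Host OS:
CentOS 7.6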

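Problem description:
The Rancher server port was changed to 10443 and then changed again to 10444, and the rke-clean.sh script was run on the master. On the master I then ran:

sudo docker run -d --privileged --restart=unless-stopped --net=host -v /etc/kubernetes:/etc/kubernetes -v /var/run:/var/run registry.dlsc.com:18082/rancher/rancher-agent:v2.5.5 --server https://rancher.dlsc.sgcc.com:10443 --token 94hldsdfcjdkznl5mpj7hs9s8tqhkdnt --ca-checksum 1a65053f0820ebe55877f9dcaa ee88df2c671f379d --etcd --controlplane

Running this rancher-agent manually works without problems, but the rancher-agent that is then started automatically gets different arguments from the manually started one, so the subsequent installation of etcd, apiserver and the other components cannot proceed.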

Steps to reproduce:

Result:

Expected result:

Screenshots:

Additional context:

Logs
Logs from the manually started rancher-agent

[yunwei@bj-dljy-k8s-149 ~]$ docker logs -f 37ad33a3b9ce
INFO: Arguments: --server https://rancher.dlsc.sgcc.com:10444 --token REDACTED --ca-checksum 1a65053f0820e 140d784aee88df2c671f379d --etcd --controlplane
INFO: Environment: CATTLE_ADDRESS=192.168.203.149 CATTLE_INTERNAL_ADDRESS= CATTLE_NODE_NAME=bj-dljy-k8s-149 CATTLE_ROLE=,etcd,controlplane CATTLE_SERVER=https://rancher.dlsc.sgcc.com:10444 CATTLE_TOKEN=REDACTED
INFO: Using resolv.conf: nameserver 192.168.203.156 nameserver 192.168.203.157
INFO: https://rancher.dlsc.sgcc.com:10444/ping is accessible
INFO: rancher.dlsc.sgcc.com resolves to 192.168.203.171
INFO: Value from https://rancher.dlsc.sgcc.com:10444/v3/settings/cacerts is an x509 certificate
time="2022-11-26T09:44:30Z" level=info msg="Listening on /tmp/log.sock"
time="2022-11-26T09:44:30Z" level=info msg="Rancher agent version v2.5.5 is starting"
time="2022-11-26T09:44:30Z" level=info msg="Option worker=false"
time="2022-11-26T09:44:30Z" level=info msg="Option requestedHostname=bj-dljy-k8s-149"
time="2022-11-26T09:44:30Z" level=info msg="Option customConfig=map[address:192.168.203.149 internalAddress: label:map roles:[etcd controlplane] taints:]"
time="2022-11-26T09:44:30Z" level=info msg="Option etcd=true"
time="2022-11-26T09:44:30Z" level=info msg="Option controlPlane=true"
time="2022-11-26T09:44:30Z" level=info msg="Connecting to wss://rancher.dlsc.sgcc.com:10444/v3/connect/register with token 646gkhvtzcstmw45k68s7w94hldsdfcjdkznl5mpj7hs9s8tqhkdnt"
time="2022-11-26T09:44:30Z" level=info msg="Connecting to proxy" url="wss://rancher.dlsc.sgcc.com:10444/v3/connect/register"
time="2022-11-26T09:44:32Z" level=info msg="Starting plan monitor, checking every 120 seconds"

Logs from the automatically started rancher-agent
[yunwei@bj-dljy-k8s-149 ~]$ docker logs -f 6ea7f9fe11e3
INFO: Arguments: --server https://rancher.dlsc.sgcc.com:10443 --token REDACTED --ca-checksum 1a65053f0820ebe5 4aee88df2c671f379d --no-register --only-write-certs --node-name bj-dljy-k8s-149
INFO: Environment: CATTLE_ADDRESS=192.168.203.149 CATTLE_AGENT_CONNECT=true CATTLE_INTERNAL_ADDRESS= CATTLE_NODE_NAME=bj-dljy-k8s-149 CATTLE_SERVER=https://rancher.dlsc.sgcc.com:10443 CATTLE_TOKEN=REDACTED CATTLE_WRITE_CERT_ONLY=true
INFO: Using resolv.conf: nameserver 192.168.203.156 nameserver 192.168.203.157
ERROR: https://rancher.dlsc.sgcc.com:10443/ping is not accessible (Failed to connect to rancher.dlsc.sgcc.com port 10443: Connection timed out)
INFO: Arguments: --server https://rancher.dlsc.sgcc.com:10443 --token REDACTED --ca-checksum 1a65053f0820ebe55877f9dcaabb62d9473c2a62140d784aee88df2c671f379d --no-register --only-write-certs --node-name bj-dljy-k8s-149
INFO: Environment: CATTLE_ADDRESS=192.168.203.149 CATTLE_AGENT_CONNECT=true CATTLE_INTERNAL_ADDRESS= CATTLE_NODE_NAME=bj-dljy-k8s-149 CATTLE_SERVER=https://rancher.dlsc.sgcc.com:10443 CATTLE_TOKEN=REDACTED CATTLE_WRITE_CERT_ONLY=true
INFO: Using resolv.conf: nameserver 192.168.203.156 nameserver 192.168.203.157
ERROR: https://rancher.dlsc.sgcc.com:10443/ping is not accessible (Failed to connect to rancher.dlsc.sgcc.com port 10443: Connection timed out)
INFO: Arguments: --server https://rancher.dlsc.sgcc.com:10443 --token REDACTED --ca-checksum 1a65053f0820ebe55877f9dcaabb62d9473c2a62140d784aee88df2c671f379d --no-register --only-write-certs --node-name bj-dljy-k8s-149
INFO: Environment: CATTLE_ADDRESS=192.168.203.149 CATTLE_AGENT_CONNECT=true CATTLE_INTERNAL_ADDRESS= CATTLE_NODE_NAME=bj-dljy-k8s-149 CATTLE_SERVER=https://rancher.dlsc.sgcc.com:10443 CATTLE_TOKEN=REDACTED CATTLE_WRITE_CERT_ONLY=true
INFO: Using resolv.conf: nameserver 192.168.203.156 nameserver 192.168.203.157

Why does the automatically started rancher-agent use a different port from the one I started manually?

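Sorry, I couldn't follow your description.

My hunch is that the problem comes from changing the Rancher IP or port, in which case you can refer to: How to change the Rancher Server IP address in Rancher v2.5 (如何修改 Rancher v2.5 的 Rancher Server IP 地址)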

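Hello, my description may not have been clear, so here are the steps I took. I changed the Rancher port to 10443 by editing the container port of the DaemonSet in the k3s cluster, and then changed the server URL in the Rancher UI to ip:10443. Later I changed both of these to 10444. On the master nodes of the downstream k8s cluster I ran the rke-clean.sh script, which removed all the running containers. I then re-ran this command:

sudo docker run -d --privileged --restart=unless-stopped --net=host -v /etc/kubernetes:/etc/kubernetes -v /var/run:/var/run registry.dlsc.com:18082/rancher/rancher-agent:v2.5.5 --server https://rancher.dlsc.sgcc.com:10443 --token 94hldsdfcjdkznl5mpj7hs9s8tqhkdnt --ca-checksum 1a65053f0820ebe55877f9dcaa ee88df2c671f379d --etcd --controlplane

After the container started, the arguments shown in its log were all normal.

However, this container then starts a second container from the same image, and the arguments shown in the second container's log are wrong.

Because I changed the port from 10443 to 10444, the first rancher-agent, which I started manually with explicit arguments, looks fine in its log. The second container still uses port 10443, so it cannot reach the Rancher instance running in k3s and reports an error, and none of the k8s master components can be installed.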

Yes, the second agent still starts with your original port, so I recommend following the link above to change Rancher's IP and port.
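
To confirm which server URL each agent container was actually started with, the CATTLE_SERVER value can be pulled out of the container logs. This is only a minimal sketch: cattle_servers is a hypothetical helper name, not part of Rancher, and it assumes the agent prints an "INFO: Environment:" line like the ones in the logs above.

```shell
#!/bin/sh
# Hypothetical helper (not a Rancher tool): read agent log text on stdin
# and print each distinct CATTLE_SERVER=... value found in it.
cattle_servers() {
  grep -o 'CATTLE_SERVER=[^ ]*' | sort -u
}
```

With the container IDs from this thread, it could be used as `docker logs 37ad33a3b9ce 2>&1 | cattle_servers` for the manually started agent and `docker logs 6ea7f9fe11e3 2>&1 | cattle_servers` for the automatically started one. In the logs posted above, the former shows port 10444 and the latter port 10443.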