High availability using an LB on a port other than 9345

Rancher Server setup

  • Online or air-gapped deployment:
    Online deployment

Downstream cluster information

  • Kubernetes version:
  • Cluster Type (Local/Downstream):
    • If Downstream, what type of cluster? (Custom/Imported or Hosted, etc.):
      1.31.9-rke2r1

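User information

  • What is the role of the logged-in user? (Admin/Cluster Owner/Cluster Member/Project Owner/Project Member/Custom):
    • If Custom, the custom permission set: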

Host operating system:
CentOS 7.9

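Problem description:

An LB is configured. Port 9345 on the LB is already used by another application, so the LB listens on port 9344 instead.

Master nodes: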
10.7.26.1
10.7.26.2
10.7.26.3
Agent nodes:
10.7.26.4
10.7.26.5
10.7.26.6

  1. LB_IP:9344 → forwards to port 9345 on the backend server nodes
    (10.7.26.1:9345
    10.7.26.2:9345
    10.7.26.3:9345 )

  2. Server configuration:

server: https://LB_IP:9344
token: ***
kube-proxy-arg:
  - "proxy-mode=ipvs"
  3. Agent configuration:
server: https://LB_IP:9344
token: ***
kube-proxy-arg:
  - "proxy-mode=ipvs"

After the server nodes start, the kube-proxy pod logs on the server nodes can be viewed:

I0717 06:38:46.985814       1 server_linux.go:230] "Using ipvs Proxier"

However, viewing the kube-proxy logs on an agent node returns an error:

kubectl logs -n kube-system kube-proxy-dsv3-mall004
Error from server: Get "https://10.7.26.4:10250/containerLogs/kube-system/kube-proxy-dsv3-mall004/kube-proxy": proxy error from 127.0.0.1:9345 while dialing 10.7.26.4:10250, code 502: 502 Bad Gateway

Checking the messages log on the agent node:

Jul 17 16:12:22 dsv3-mall004 rke2: time="2025-07-17T16:12:22+08:00" level=info msg="Connecting to proxy" url="wss://10.7.26.2:9344/v1-rke2/connect"
Jul 17 16:12:22 dsv3-mall004 rke2: time="2025-07-17T16:12:22+08:00" level=info msg="Connecting to proxy" url="wss://10.7.26.1:9344/v1-rke2/connect"
Jul 17 16:12:22 dsv3-mall004 rke2: time="2025-07-17T16:12:22+08:00" level=info msg="Connecting to proxy" url="wss://10.7.26.3:9344/v1-rke2/connect"
Jul 17 16:12:22 dsv3-mall004 rke2: time="2025-07-17T16:12:22+08:00" level=error msg="Failed to connect to proxy. Empty dialer response" error="dial tcp 10.7.26.2:9344: connect: connection refused"
Jul 17 16:12:22 dsv3-mall004 rke2: time="2025-07-17T16:12:22+08:00" level=error msg="Remotedialer proxy error; reconnecting..." error="dial tcp 10.7.26.2:9344: connect: connection refused" url="wss://10.7.26.2:9344/v1-rke2/connect"
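The log lines above suggest that the agent dials each server node's own IP but reuses the port from the `server:` URL (9344). Nothing listens on that port on the nodes themselves, because only the LB maps 9344 to 9345. Below is a minimal Python sketch of that apparent behavior. It is inferred only from these logs, not taken from the RKE2 source, and `tunnel_urls` is a hypothetical helper name:

```python
from urllib.parse import urlsplit

def tunnel_urls(server_url, server_ips):
    """Build the websocket URLs the agent appears to dial, judging by the logs:
    each server node IP combined with the port taken from the server URL."""
    port = urlsplit(server_url).port
    return [f"wss://{ip}:{port}/v1-rke2/connect" for ip in server_ips]

servers = ["10.7.26.1", "10.7.26.2", "10.7.26.3"]

# Registering through the LB on 9344: every derived URL targets port 9344
# on a server node, where (per the logs) the connection is refused.
lb_urls = tunnel_urls("https://LB_IP:9344", servers)

# Registering directly against a server on 9345: derived URLs use 9345.
direct_urls = tunnel_urls("https://10.7.26.1:9345", servers)
```

Under this assumption, any registration URL whose port differs from the port the servers actually listen on would yield tunnel URLs that cannot connect, even though the initial registration through the LB succeeds.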

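Steps to reproduce:

Simply configure the LB to listen on a port other than 9345.

Result:
The kube-proxy logs on the agent nodes cannot be viewed.

Expected result:

The kube-proxy pod logs can be viewed normally.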

Screenshots:

Additional context:

Logs
Jul 17 16:12:19 dsv3-mall004 rke2: time="2025-07-17T16:12:19+08:00" level=info msg="Connecting to proxy" url="wss://10.7.26.3:9344/v1-rke2/connect"
Jul 17 16:12:19 dsv3-mall004 rke2: time="2025-07-17T16:12:19+08:00" level=info msg="Connecting to proxy" url="wss://10.7.26.1:9344/v1-rke2/connect"
Jul 17 16:12:19 dsv3-mall004 rke2: time="2025-07-17T16:12:19+08:00" level=info msg="Connecting to proxy" url="wss://10.7.26.2:9344/v1-rke2/connect"
Jul 17 16:12:19 dsv3-mall004 rke2: time="2025-07-17T16:12:19+08:00" level=error msg="Failed to connect to proxy. Empty dialer response" error="dial tcp 10.7.26.1:9344: connect: connection refused"
Jul 17 16:12:19 dsv3-mall004 rke2: time="2025-07-17T16:12:19+08:00" level=error msg="Remotedialer proxy error; reconnecting..." error="dial tcp 10.7.26.1:9344: connect: connection refused" url="wss://10.7.26.1:9344/v1-rke2/connect"
Jul 17 16:12:19 dsv3-mall004 rke2: time="2025-07-17T16:12:19+08:00" level=error msg="Failed to connect to proxy. Empty dialer response" error="dial tcp 10.7.26.3:9344: connect: connection refused"
Jul 17 16:12:19 dsv3-mall004 rke2: time="2025-07-17T16:12:19+08:00" level=error msg="Remotedialer proxy error; reconnecting..." error="dial tcp 10.7.26.3:9344: connect: connection refused" url="wss://10.7.26.3:9344/v1-rke2/connect"
Jul 17 16:12:19 dsv3-mall004 rke2: time="2025-07-17T16:12:19+08:00" level=error msg="Failed to connect to proxy. Empty dialer response" error="dial tcp 10.7.26.2:9344: connect: connection refused"
Jul 17 16:12:19 dsv3-mall004 rke2: time="2025-07-17T16:12:19+08:00" level=error msg="Remotedialer proxy error; reconnecting..." error="dial tcp 10.7.26.2:9344: connect: connection refused" url="wss://10.7.26.2:9344/v1-rke2/connect"
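The `connection refused` errors can be confirmed independently of RKE2 with a plain TCP check run from the agent host. This is a minimal sketch, assuming Python 3 is available on the node; `port_open` and `check` are hypothetical helper names:

```python
import socket

def port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False

def check(hosts, ports):
    """Map each (host, port) pair to whether a TCP connection succeeds."""
    return {(h, p): port_open(h, p) for h in hosts for p in ports}

# Example call from an agent node (not executed here):
#   check(["10.7.26.1", "10.7.26.2", "10.7.26.3"], [9344, 9345])
```

Given the logs in this post, one would expect port 9344 on the server nodes to be reported as closed and port 9345 as open, but that depends on the actual environment.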

Afterwards I changed the agent configuration file as follows:

server: https://10.7.26.1:9345
# server: https://LB_IP:9344
token: ***
kube-proxy-arg:
  - "proxy-mode=ipvs"

The kube-proxy pod on the agent node is now healthy.
Checking the messages log:

Jul 17 17:07:48 dsv3-mall004 rke2: time="2025-07-17T17:07:48+08:00" level=info msg="Remotedialer connected to proxy" url="wss://10.7.26.3:9345/v1-rke2/connect"
Jul 17 17:07:48 dsv3-mall004 rke2: time="2025-07-17T17:07:48+08:00" level=info msg="Remotedialer connected to proxy" url="wss://10.7.26.2:9345/v1-rke2/connect"
Jul 17 17:07:48 dsv3-mall004 rke2: time="2025-07-17T17:07:48+08:00" level=info msg="Remotedialer connected to proxy" url="wss://10.7.26.1:9345/v1-rke2/connect"

What I don't understand is why the agent is able to discover all the server nodes in the cluster here.

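Yes, that is expected. When you register using the IP of any one master node, once registration succeeds the agent queries the addresses of all API servers in the cluster and stores them in the following configuration: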
