RKE fails to start: is this related to the default iptables policy?

RKE version: v1.2.16

Docker version: 20.10.14

OS and kernel: CentOS 7.9, 3.10.0-1160.el7.x86_64

Host type and provider: internal (on-premises) servers

cluster.yml file:
nodes:
  - address: 192.168.100.54
    user: rancher
    role: [controlplane, worker, etcd]
  - address: 192.168.100.96
    user: rancher
    role: [controlplane, worker, etcd]
  - address: 192.168.100.98
    user: rancher
    role: [controlplane, worker, etcd]

Steps to reproduce:

1. Host name resolution

Edit /etc/hosts so that the three hosts resolve:
192.168.100.54 master
192.168.100.96 worker1
192.168.100.98 worker2
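
Each node name has to resolve to its own address, so a quick check for an IP that appears on more than one line can catch copy-paste mistakes. This is a minimal sketch; the function name `duplicate_host_ips` is made up for illustration.

```shell
# Print every IP address that is mapped on more than one line of a hosts file.
# Comment lines and blank lines are ignored. Default file: /etc/hosts.
duplicate_host_ips() {
    local hosts_file="${1:-/etc/hosts}"
    awk '!/^[ \t]*(#|$)/ { count[$1]++ }
         END { for (ip in count) if (count[ip] > 1) print ip }' "$hosts_file" | sort
}
```

Running `duplicate_host_ips` with no argument checks /etc/hosts; empty output means every address is listed once.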

2. Stop and disable the firewalld service

systemctl stop firewalld
systemctl disable firewalld

3. Disable SELinux

In /etc/selinux/config:
SELINUX=disabled
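
The edit can also be scripted. A minimal sketch, assuming GNU sed (as shipped with CentOS 7); the function name is hypothetical.

```shell
# Set SELINUX=disabled in an SELinux config file (default: /etc/selinux/config).
# SELINUXTYPE= lines are left untouched because the pattern requires "SELINUX=".
disable_selinux_in_config() {
    local config="${1:-/etc/selinux/config}"
    sed -i 's/^SELINUX=.*/SELINUX=disabled/' "$config"
}
# Usage on a node (as root); the file change takes effect after a reboot:
#   disable_selinux_in_config
#   setenforce 0   # switch to permissive mode for the current boot
```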

4. Disable the swap partition
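
No commands were given for this step. One common way to do it is sketched below, assuming GNU sed and a standard /etc/fstab layout; the function name is hypothetical.

```shell
# Comment out every active swap entry in an fstab file (default: /etc/fstab).
# Lines that are already commented are skipped, so the function is safe to re-run.
comment_out_swap() {
    local fstab="${1:-/etc/fstab}"
    sed -i -E '/^[^#].*[[:space:]]swap[[:space:]]/ s/^/#/' "$fstab"
}
# Usage (as root):
#   swapoff -a          # turn swap off for the running system
#   comment_out_swap    # keep it off after the next reboot
```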

5. Adjust kernel parameters

vim /etc/sysctl.d/kubernetes.conf
Add the following settings:
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_forward = 1
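
Editing the file does not apply the values by itself. A minimal sketch of a scripted version follows; it assumes the `br_netfilter` module is available, since the `net.bridge.*` keys only exist once that module is loaded. The function name is hypothetical.

```shell
# Write the three kernel parameters to a sysctl drop-in file
# (default: /etc/sysctl.d/kubernetes.conf).
write_k8s_sysctl() {
    local conf="${1:-/etc/sysctl.d/kubernetes.conf}"
    cat > "$conf" <<'EOF'
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_forward = 1
EOF
}
# Usage (as root):
#   modprobe br_netfilter   # makes the net.bridge.* keys available
#   write_k8s_sysctl
#   sysctl --system         # reload all sysctl configuration files
```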

6. Configure IPVS

None of the hosts has iptables installed by default, so IPVS is used instead.

yum install ipset ipvsadm -y

cat <<EOF > /etc/sysconfig/modules/ipvs.modules
#!/bin/bash
modprobe -- ip_vs
modprobe -- ip_vs_rr
modprobe -- ip_vs_wrr
modprobe -- ip_vs_sh
modprobe -- nf_conntrack_ipv4
EOF

chmod +x /etc/sysconfig/modules/ipvs.modules

/bin/bash /etc/sysconfig/modules/ipvs.modules

lsmod | grep -e ip_vs -e nf_conntrack_ipv4
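
The `lsmod | grep` output has to be read by eye. A small sketch that instead lists whichever of the five modules is missing is shown below; the function name is hypothetical.

```shell
# Read `lsmod` output on stdin and print each required module that is not loaded.
# Note: nf_conntrack_ipv4 exists on this 3.10 kernel; kernels from 4.19 on
# merged it into nf_conntrack, so the list would need adjusting there.
missing_ipvs_modules() {
    local loaded mod
    loaded=$(awk 'NR > 1 { print $1 }')   # skip the "Module Size Used by" header
    for mod in ip_vs ip_vs_rr ip_vs_wrr ip_vs_sh nf_conntrack_ipv4; do
        printf '%s\n' "$loaded" | grep -qx "$mod" || echo "$mod"
    done
}
# Usage: lsmod | missing_ipvs_modules
```

Empty output means all five modules are loaded.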

7. Install kubectl

yum install -y kubectl-1.20.14

8. Configure the SSH server

vim /etc/ssh/sshd_config
Set AllowTcpForwarding to yes.
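
To confirm the setting, a minimal sketch that reads the value back from the file is shown below. It assumes a flat sshd_config without Match blocks and relies on sshd using the first value it finds for a keyword, with `yes` as the default when the keyword is absent. The function name is hypothetical.

```shell
# Print the effective AllowTcpForwarding value from an sshd_config file
# (default: /etc/ssh/sshd_config). Commented-out lines are ignored.
tcp_forwarding_setting() {
    local config="${1:-/etc/ssh/sshd_config}"
    awk 'tolower($1) == "allowtcpforwarding" { print tolower($2); found = 1; exit }
         END { if (!found) print "yes" }' "$config"
}
# Usage: tcp_forwarding_setting
# After changing the file, restart the daemon: systemctl restart sshd
```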

9. Add a rancher user and generate an SSH key so that it can log in without a password
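
The exact commands were not posted. A sketch of one way to do this follows, assuming Docker is already installed (so the `docker` group exists) and that the SSH user must belong to that group for RKE to use Docker. Both function names are hypothetical.

```shell
# Run as root on every node: create the user and allow it to use Docker.
create_rke_user() {
    useradd -m rancher
    usermod -aG docker rancher
}

# Run as rancher on the machine where rke is invoked: create a key pair if
# none exists, then copy the public key to each node given as an argument.
distribute_ssh_key() {
    local node
    mkdir -p "$HOME/.ssh" && chmod 700 "$HOME/.ssh"
    [ -f "$HOME/.ssh/id_rsa" ] || ssh-keygen -t rsa -b 4096 -N '' -f "$HOME/.ssh/id_rsa"
    for node in "$@"; do
        ssh-copy-id "rancher@$node"
    done
}
# Usage: distribute_ssh_key 192.168.100.54 192.168.100.96 192.168.100.98
```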

10. Run rke up

Result:

INFO[0466] [controlplane] Now checking status of node 192.168.100.54, try #1 
ERRO[0491] Host 192.168.100.54 failed to report Ready status with error: [controlplane] Error getting node 192.168.100.54:  "192.168.100.54" not found 
FATA[0566] [controlPlane] Failed to upgrade Control Plane: [[[controlplane] Error getting node 192.168.100.54:  "192.168.100.54" not found]] 

Checking the logs:
docker logs kube-proxy
The log shows:

I0428 09:25:43.794400    4207 proxier.go:874] Syncing iptables rules
I0428 09:25:44.056310    4207 proxier.go:826] syncProxyRules took 262.873803ms
I0428 09:25:44.057076    4207 proxier.go:874] Syncing iptables rules
I0428 09:25:44.177877    4207 proxier.go:826] syncProxyRules took 121.413256ms

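The log shows iptables rules being synced, but the iptables service is not in use on these hosts. I am not sure whether that is related.
If it is, how can the cluster be configured to start with IPVS, without using iptables?
For now I plan to install iptables, open the relevant ports, and try again.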
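
On the IPVS question: as far as I understand, the "Syncing iptables rules" lines come from kube-proxy's iptables proxy mode, which writes netfilter rules into the kernel itself and does not depend on an iptables service running on the host. A minimal sketch, assuming RKE forwards the `services.kubeproxy.extra_args` entries in cluster.yml to kube-proxy as command-line flags, prints a fragment that would select IPVS mode. The function name is hypothetical.

```shell
# Print a cluster.yml fragment that passes --proxy-mode=ipvs to kube-proxy.
# Merge it into cluster.yml by hand; do not append it blindly if the file
# already has a top-level "services:" key.
kubeproxy_ipvs_fragment() {
    cat <<'EOF'
services:
  kubeproxy:
    extra_args:
      proxy-mode: ipvs
EOF
}
# Usage: kubeproxy_ipvs_fragment
```

After merging the fragment, `rke up` would need to be run again for the change to reach the nodes.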

----------------------- Update, 2022/4/29 -----------------------------

Installed the iptables service, opened ports 22, 6443, 2379, 2380, 10250, 443, and 80, and reloaded iptables.
Running rke up again fails with the following errors:

ERRO[0502] Host 192.168.100.54 failed to report Ready status with error: [controlplane] Error getting node 192.168.100.54:  "192.168.100.54" not found 
INFO[0502] [controlplane] Now checking status of node 192.168.100.96, try #1 
ERRO[0527] Host 192.168.100.96 failed to report Ready status with error: [controlplane] Error getting node 192.168.100.96:  "192.168.100.96" not found 
INFO[0527] [controlplane] Now checking status of node 192.168.100.98, try #1 
ERRO[0553] Host 192.168.100.98 failed to report Ready status with error: [controlplane] Error getting node 192.168.100.98:  "192.168.100.98" not found 
INFO[0553] [controlplane] Processing controlplane hosts for upgrade 1 at a time 
INFO[0553] Processing controlplane host 192.168.100.54  
INFO[0553] [controlplane] Now checking status of node 192.168.100.54, try #1 
ERRO[0578] Failed to upgrade hosts: 192.168.100.54 with error [[controlplane] Error getting node 192.168.100.54:  "192.168.100.54" not found] 
FATA[0578] [controlPlane] Failed to upgrade Control Plane: [[[controlplane] Error getting node 192.168.100.54:  "192.168.100.54" not found]] 

Checking the error log with docker logs kubelet shows:

W0429 03:14:13.625155    3024 cni.go:239] Unable to update cni config: no networks found in /etc/cni/net.d
E0429 03:14:13.652102    3024 kubelet.go:2263] node "192.168.100.96" not found
E0429 03:14:13.752441    3024 kubelet.go:2263] node "192.168.100.96" not found
E0429 03:14:13.852770    3024 kubelet.go:2263] node "192.168.100.96" not found
E0429 03:14:13.864717    3024 controller.go:144] failed to ensure lease exists, will retry in 7s, error: leases.coordination.k8s.io "192.168.100.96" is forbidden: User "system:node" cannot get resource "leases" in API group "coordination.k8s.io" in the namespace "kube-node-lease"
E0429 03:14:13.953258    3024 kubelet.go:2263] node "192.168.100.96" not found
I0429 03:14:14.026201    3024 csi_plugin.go:1039] Failed to contact API server when waiting for CSINode publishing: csinodes.storage.k8s.io "192.168.100.96" is forbidden: User "system:node" cannot get resource "csinodes" in API group "storage.k8s.io" at the cluster scope

Could this be because cluster.yml does not specify a network plugin (such as flannel or Canal)?
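
To my knowledge, RKE falls back to canal when cluster.yml has no network section, so a missing entry alone would not normally cause this, and the "no networks found in /etc/cni/net.d" warning usually persists until the CNI plugin has been deployed. To rule it out, the plugin can be set explicitly. A minimal sketch, with a hypothetical function name, prints the fragment:

```shell
# Print a cluster.yml fragment that names the network plugin explicitly.
# Replace "canal" with "flannel" to use flannel; merge into cluster.yml by hand.
network_plugin_fragment() {
    local plugin="${1:-canal}"
    printf 'network:\n  plugin: %s\n' "$plugin"
}
# Usage: network_plugin_fragment flannel
```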

Reference:

https://www.suse.com/zh-cn/support/kb/doc/?id=000020035

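Thank you very much. The cause seems to be that some pods started slowly during rke up, which made the status checks fail. I ran rke up --update-only two more times and it succeeded. Thanks again for your reply.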
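
Since re-running the command was what worked, the repeated attempts can be wrapped in a small retry helper. This is only a sketch; the function name `retry` and the attempt count and pause in the usage line are arbitrary choices, not values from RKE.

```shell
# Run a command up to $1 times, pausing $2 seconds between attempts.
# Returns 0 on the first success and 1 if every attempt fails.
retry() {
    local attempts="$1" pause="$2" i=1
    shift 2
    while [ "$i" -le "$attempts" ]; do
        "$@" && return 0
        echo "attempt $i of $attempts failed" >&2
        [ "$i" -lt "$attempts" ] && sleep "$pause"
        i=$((i + 1))
    done
    return 1
}
# Usage: retry 3 30 rke up --update-only
```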

One more question: I would like to build a high-availability demo for testing.
Progress so far:

helm install rancher rancher-stable/rancher \
 --namespace cattle-system \
 --create-namespace \
 --set hostname=rancher.example.org \
 --set ingress.tls.source=secret \
 --set bootstrapPassword=admin \
 --version 2.6.4 \
 --timeout 20m

NAME: rancher
LAST DEPLOYED: Thu May  5 21:21:30 2022
NAMESPACE: cattle-system
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
Rancher Server has been installed.

NOTE: Rancher may take several minutes to fully initialize. Please standby while Certificates are being issued, Containers are started and the Ingress rule comes up.

Check out our docs at https://rancher.com/docs/

If you provided your own bootstrap password during installation, browse to https://rancher.example.org to get started.

If this is the first time you installed Rancher, get started by running this command and clicking the URL it generates:

echo https://rancher.example.org/dashboard/?setup=$(kubectl get secret --namespace cattle-system bootstrap-secret -o go-template='{{.data.bootstrapPassword|base64decode}}')

To get just the bootstrap password on its own, run:

kubectl get secret --namespace cattle-system bootstrap-secret -o go-template='{{.data.bootstrapPassword|base64decode}}{{ "\n" }}'


[rancher@master ~]$ kubectl -n cattle-system rollout status deploy/rancher
Waiting for deployment "rancher" rollout to finish: 1 of 3 updated replicas are available...
deployment "rancher" successfully rolled out

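How can I open Rancher Server in a browser directly by IP address and port, instead of through the domain name https://rancher.example.com?
Is that possible without adding a certificate?
I am completely new to Kubernetes and Rancher, so any guidance would be much appreciated.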
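
A hedged note on this last question: the chart was installed with `hostname=rancher.example.org`, so the ingress rule it creates matches that host name, and a request addressed to a bare IP would likely not match it. One way to test without DNS is to pin the name to a node IP on the client side. The sketch below assumes the ingress controller listens on port 443 of the node; the function name is hypothetical.

```shell
# Probe Rancher through one node's ingress without DNS and print the HTTP status.
#   -k         accept a self-signed or default ingress certificate
#   --resolve  pin the host name to the given node IP for this request only
probe_rancher() {
    local node_ip="$1" host="${2:-rancher.example.org}"
    curl -k -sS -o /dev/null -w '%{http_code}\n' \
        --resolve "$host:443:$node_ip" "https://$host/"
}
# Usage: probe_rancher 192.168.100.54
```

For a browser, the equivalent would be a line such as `192.168.100.54 rancher.example.org` in the client machine's hosts file; without a trusted certificate the browser will show a warning that has to be accepted.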