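Rancher Server setup
- Rancher version: 2.8.5
- Installation option (Docker install/Helm Chart):
- If installed via Helm Chart, provide the type (RKE1, RKE2, k3s, EKS, etc.) and version of the Local cluster:
- Online or air-gapped deployment:
Docker deployment
Downstream cluster information
- Kubernetes version: 1.28.10
- Cluster Type (Local/Downstream):
- If Downstream, what type of cluster? (Custom/Imported or Hosted, etc.):
User information
- What is the role of the logged-in user? (Admin/Cluster Owner/Cluster Member/Project Owner/Project Member/Custom):
Host operating system:
Red Hat 8.1
Problem description:
The cluster is healthy:
But clicking on the cluster produces the following 500 error: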
ksd
Run kubectl get pod -A against the downstream cluster to check whether any pods failed to start, or try a different browser to rule out a caching problem.
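On a cluster with many pods, the failed ones are easy to miss in the full listing. The following is a minimal sketch of a filter for that check; the helper name unhealthy_pods is hypothetical, and it assumes the default column layout of kubectl get pod -A (NAMESPACE, NAME, READY, STATUS, ...).

```shell
#!/bin/sh
# Hypothetical helper: reads `kubectl get pod -A` output on stdin and prints
# every pod that is neither Completed nor Running with all containers ready.
unhealthy_pods() {
  awk 'NR > 1 {
    if ($4 == "Completed") next          # finished jobs such as helm-install-*
    split($3, ready, "/")                # READY column, e.g. "1/1"
    if ($4 != "Running" || ready[1] != ready[2])
      print $1 "/" $2, $3, $4
  }'
}

# Usage against a live cluster:
#   kubectl get pod -A | unhealthy_pods
```

An empty result means every pod is either Running with all containers ready or Completed.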
ksd
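Then take a look at the local cluster as well. I have not run into this situation myself, but it feels as if some pods did not start successfully.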
ksd
Next, try deleting the cattle-cluster-agent pod in the downstream cluster. Once it is running again, clear the browser cache and try once more.
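The restart suggested above could be scripted roughly as follows. This is a sketch, not an official procedure: the function name restart_cluster_agent and the 180s timeout are arbitrary choices, while the cattle-system namespace and the app=cattle-cluster-agent label are the ones shown in the pod description later in this thread.

```shell
#!/bin/sh
# Sketch: restart the downstream cluster agent and wait for it to come back.
restart_cluster_agent() {
  # Delete the agent pod(s); the owning Deployment recreates them.
  kubectl -n cattle-system delete pod -l app=cattle-cluster-agent || return 1
  # Block until the replacement pod is ready (the timeout is an arbitrary choice).
  kubectl -n cattle-system rollout status deployment/cattle-cluster-agent --timeout=180s
}

# Usage, on a node with kubectl access to the downstream cluster:
#   restart_cluster_agent
```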
]# kubectl get pod -A
NAMESPACE         NAME                                                   READY   STATUS      RESTARTS   AGE
calico-system     calico-kube-controllers-6746776655-98hmw               1/1     Running     0          106m
calico-system     calico-node-qlflv                                      1/1     Running     0          106m
calico-system     calico-typha-84dbd56667-7tvxq                          1/1     Running     0          106m
cattle-system     cattle-cluster-agent-78b5748c86-rdv75                  1/1     Running     0          2m29s
cattle-system     rancher-webhook-5f694d4b4d-2q6bg                       1/1     Running     0          104m
kube-system       cloud-controller-manager-bdp-vm-028                    1/1     Running     0          106m
kube-system       etcd-bdp-vm-028                                        1/1     Running     0          106m
kube-system       helm-install-rke2-calico-crd-2w59l                     0/1     Completed   0          106m
kube-system       helm-install-rke2-calico-hpxvf                         0/1     Completed   2          106m
kube-system       helm-install-rke2-coredns-t84s2                        0/1     Completed   0          106m
kube-system       helm-install-rke2-ingress-nginx-64cdr                  0/1     Completed   0          106m
kube-system       helm-install-rke2-metrics-server-pf9f2                 0/1     Completed   0          106m
kube-system       helm-install-rke2-snapshot-controller-crd-sb9ll        0/1     Completed   0          106m
kube-system       helm-install-rke2-snapshot-controller-qm6br            0/1     Completed   0          106m
kube-system       helm-install-rke2-snapshot-validation-webhook-t2w26    0/1     Completed   0          106m
kube-system       kube-apiserver-bdp-vm-028                              1/1     Running     0          106m
kube-system       kube-controller-manager-bdp-vm-028                     1/1     Running     0          106m
kube-system       kube-proxy-bdp-vm-028                                  1/1     Running     0          107m
kube-system       kube-scheduler-bdp-vm-028                              1/1     Running     0          106m
kube-system       rke2-coredns-rke2-coredns-664bb795cf-vzt86             1/1     Running     0          106m
kube-system       rke2-coredns-rke2-coredns-autoscaler-9d9899867-wzctl   1/1     Running     0          106m
kube-system       rke2-ingress-nginx-controller-jclcd                    1/1     Running     0          105m
kube-system       rke2-metrics-server-64db56557f-8ssd7                   1/1     Running     0          105m
kube-system       rke2-snapshot-controller-f77488d6b-vbnbl               1/1     Running     0          105m
kube-system       rke2-snapshot-validation-webhook-6968bd6c5f-dcltb      1/1     Running     0          105m
tigera-operator   tigera-operator-675cdfb494-29j59                       1/1     Running     0          106m
[root@bdp-vm-028 ~]# kubectl describe po -n cattle-system cattle-cluster-agent-78b5748c86-rdv75
Name:             cattle-cluster-agent-78b5748c86-rdv75
Namespace:        cattle-system
Priority:         0
Service Account:  cattle
Node:             bdp-vm-028/10.0.0.29
Start Time:       Tue, 27 Aug 2024 15:01:10 +0800
Labels:           app=cattle-cluster-agent
                  pod-template-hash=78b5748c86
Annotations:      cni.projectcalico.org/containerID: 173636b914b1467e8a7533953566e348741d31b7e6c607b48489d32c809b0013
                  cni.projectcalico.org/podIP: 10.42.74.149/32
                  cni.projectcalico.org/podIPs: 10.42.74.149/32
Status:           Running
IP:               10.42.74.149
IPs:
  IP:           10.42.74.149
Controlled By:  ReplicaSet/cattle-cluster-agent-78b5748c86
Containers:
  cluster-register:
    Container ID:   containerd://015be5d1c56e8af3e88ad627a383f4c0f2e76e941ff147dd3cf09bbb7f7e80a1
    Image:          linking-harbor-zb.di.bigdata/rancher/rancher-agent:v2.8.5
    Image ID:       linking-harbor-zb.di.bigdata/rancher/rancher-agent@sha256:3f4c1e8923f13eea8d31382b672f846133aca380d957b52011e91b2771f77a7a
    Port:           <none>
    Host Port:      <none>
    State:          Running
      Started:      Tue, 27 Aug 2024 15:01:11 +0800
    Ready:          True
    Restart Count:  0
    Environment:
      CATTLE_FEATURES:           embedded-cluster-api=false,fleet=false,monitoringv1=false,multi-cluster-management=false,multi-cluster-management-agent=true,provisioningv2=false,rke2=false
      CATTLE_IS_RKE:             false
      CATTLE_SERVER:             https://10.0.0.34:8443
      CATTLE_CA_CHECKSUM:        00ffc43d6dad9d4cd8ed848e06a8451a1deb10b965b01333a28faef361662f0f
      CATTLE_CLUSTER:            true
      CATTLE_K8S_MANAGED:        true
      CATTLE_CLUSTER_REGISTRY:   linking-harbor-zb.di.bigdata
      CATTLE_SERVER_VERSION:     v2.8.5
      CATTLE_INSTALL_UUID:       72bc7a5a-36f9-4dc7-8a4a-5ce6b54b31ea
      CATTLE_INGRESS_IP_DOMAIN:  sslip.io
    Mounts:
      /cattle-credentials from cattle-credentials (ro)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-l72xs (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             True
  ContainersReady   True
  PodScheduled      True
Volumes:
  cattle-credentials:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  cattle-credentials-7e479da
    Optional:    false
  kube-api-access-l72xs:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   BestEffort
Node-Selectors:              <none>
Tolerations:                 node-role.kubernetes.io/control-plane:NoSchedule op=Exists
                             node-role.kubernetes.io/controlplane=true:NoSchedule
                             node-role.kubernetes.io/master:NoSchedule op=Exists
                             node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type    Reason     Age    From               Message
  ----    ------     ----   ----               -------
  Normal  Scheduled  2m32s  default-scheduler  Successfully assigned cattle-system/cattle-cluster-agent-78b5748c86-rdv75 to bdp-vm-028
  Normal  Pulled     2m32s  kubelet            Container image "linking-harbor-zb.di.bigdata/rancher/rancher-agent:v2.8.5" already present on machine
  Normal  Created    2m32s  kubelet            Created container cluster-register
  Normal  Started    2m32s  kubelet            Started container cluster-register
After deleting the pod, the newly started pod is healthy, but the page still returns 500.
ksd
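Press F12 and check the errors reported by the front end in the browser developer tools.
Also, is there a load balancer (LB) or something similar in front of Rancher?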
jamper
I ran into the same problem on version 2.7.14 when importing an RKE cluster. I have an F5 configured in front of Rancher, and I suspect it is related.
netlqh
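It was indeed a problem with the proxy in front of Rancher: accessing Rancher directly by IP works fine. Rancher is usually published through a proxy, though, so I will look further into how to make access through the proxy work.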
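To confirm a difference like this from the command line rather than the browser, the same path can be requested once directly and once through the proxy. The following is a rough sketch under stated assumptions: the function names http_status and compare_access are hypothetical, rancher.example.com stands in for whatever hostname the proxy publishes, and the path should be the one that returns the 500 in the browser (an authenticated page may also need a session cookie or token, which this sketch does not send).

```shell
#!/bin/sh
# Print the HTTP status code returned for a URL. -k skips TLS verification,
# which is only acceptable for a quick diagnostic like this one.
http_status() {
  curl -k -s -o /dev/null -w '%{http_code}' "$1"
}

# Request the same path directly and through the proxy, then compare.
# Returns non-zero when the two status codes differ.
compare_access() {
  direct=$(http_status "$1")
  proxied=$(http_status "$2")
  echo "direct=$direct proxied=$proxied"
  [ "$direct" = "$proxied" ]
}

# Usage (rancher.example.com and <failing-path> are placeholders):
#   compare_access "https://10.0.0.34:8443/<failing-path>" \
#                  "https://rancher.example.com/<failing-path>"
```

If the direct request succeeds and the proxied one does not, the proxy configuration is the place to look, which matches the conclusion reached in this thread.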