When creating a cluster, Rancher can pull only the runtime image; other images cannot be pulled

Rancher Server Setup

  • Rancher version: 2.7.10
  • Installation option (Docker install/Helm Chart):
    • If Helm Chart, provide the Local cluster type (RKE1, RKE2, k3s, EKS, etc.) and version:
  • Online or air-gapped deployment:

Downstream Cluster Information

  • Kubernetes version:
  • Cluster Type (Local/Downstream):
    • If Downstream, what type of cluster? (Custom/Imported or Hosted, etc.):

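User Information

  • What is the role of the logged-in user? (Admin/Cluster Owner/Cluster Member/Project Owner/Project Member/Custom):
    • If Custom, the custom permission set:

Host OS:

Problem description:
After running the curl command, two images are pulled, but the remaining images cannot be pulled.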

Steps to reproduce:

Result:

Expected result:

Screenshots:

Additional context:

Logs
time="2024-08-06T08:34:45.076260014+08:00" level=info msg="RunPodSandbox for &PodSandboxMetadata{Name:etcd-node01.aaaa.com,Uid:e032eefb485f30541701137236819e8e,Namespace:kube-system,Attempt:0,}"
time="2024-08-06T08:34:45.102332459+08:00" level=error msg="failed to decode hosts.toml" error="invalid `host` tree"
time="2024-08-06T08:34:45.109748040+08:00" level=info msg="trying next host" error="failed to do request: Head \"https://172.16.8.204:1446/v2/rancher/pause/manifests/3.6\": tls: failed to verify certificate: x509: cannot validate certificate for 172.16.8.204 because it doesn't contain any IP SANs" host="172.16.8.204:1446"
time="2024-08-06T08:34:45.124545298+08:00" level=error msg="RunPodSandbox for &PodSandboxMetadata{Name:etcd-node01.aaaa.com,Uid:e032eefb485f30541701137236819e8e,Namespace:kube-system,Attempt:0,} failed, error" error="failed to get sandbox image \"172.16.8.204:1446/rancher/pause:3.6\": failed to pull image \"172.16.8.204:1446/rancher/pause:3.6\": failed to pull and unpack image \"172.16.8.204:1446/rancher/pause:3.6\": failed to resolve reference \"172.16.8.204:1446/rancher/pause:3.6\": failed to do request: Head \"https://172.16.8.204:1446/v2/rancher/pause/manifests/3.6\": tls: failed to verify certificate: x509: cannot validate certificate for 172.16.8.204 because it doesn't contain any IP SANs"
time="2024-08-06T08:34:45.124587108+08:00" level=info msg="stop pulling image 172.16.8.204:1446/rancher/pause:3.6: active requests=0, bytes read=0"


The containerd configuration file also contains the authentication settings

# File generated by rke2. DO NOT EDIT. Use config.toml.tmpl instead.
version = 2

[plugins."io.containerd.internal.v1.opt"]
  path = "/var/lib/rancher/rke2/agent/containerd"
[plugins."io.containerd.grpc.v1.cri"]
  stream_server_address = "127.0.0.1"
  stream_server_port = "10010"
  enable_selinux = false
  enable_unprivileged_ports = true
  enable_unprivileged_icmp = true
  sandbox_image = "172.16.8.204:1446/rancher/pause:3.6"

[plugins."io.containerd.grpc.v1.cri".containerd]
  snapshotter = "overlayfs"
  disable_snapshot_annotations = true
  



[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc]
  runtime_type = "io.containerd.runc.v2"

[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
  SystemdCgroup = false

[plugins."io.containerd.grpc.v1.cri".registry]
  config_path = "/var/lib/rancher/rke2/agent/etc/containerd/certs.d"




[plugins."io.containerd.grpc.v1.cri".registry.configs."172.16.8.204:1446".auth]
  username = "admin"
  password = "aaaaa"

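There is a problem with the certificate configuration of the private image registry.

Is this setup correct?

One more question: why can the runtime image be pulled?

The runtime image is pulled from Rancher, not from the image registry; you can tell from the image name.

I put all the images in Rancher. Does that mean the runtime image is pulled through Rancher's registry, while the RKE2 cluster images are pulled from the registry configured when the cluster was created?

So my registry configuration should be fine, right? Is it a certificate problem? Also, can 2.7.10 use http when creating a cluster?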

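This is a bug in version 2.7.10, see: https://github.com/rancher/rancher/issues/42373

In affected versions, if you want to use a private image registry over HTTP on a port other than 80, the Registry Hostname must not include the port number; that is, the Registry Hostname differs from the service address in the endpoint.

For example: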

mirrors:
  3.26.98.114:
    endpoint:
      - "http://3.26.98.114:8080"

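May I ask which version fixed this issue?

If you had so much as glanced at the link I posted, you would not be asking that.

Is this how it should be configured? Can the registry authentication part be left out?

Got it, thanks.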