Environment information:
K3s version:
k3s version v1.24.8+k3s1 (648004e4)
go version go1.18.8
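Node CPU architecture, operating system, and version:
Architecture: arm64; operating system: Ubuntu 20.04
Cluster configuration:
1 server, 2 agents
Problem description:
I configured a private image registry through registries.yaml, but for some reason the configuration does not take effect.
Steps to reproduce:
Install k3s v1.24.8+k3s1 offline.
Create registries.yaml in the /etc/rancher/k3s/ directory on every node.
Content of registries.yaml: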
nvidia@node166:~/Downloads/k3sTest$ cat /etc/rancher/k3s/registries.yaml
mirrors:
  docker.io:
    endpoint:
      - "http://192.168.5.130:1119"
  harbor.crrc.com:
    endpoint:
      - "https://192.168.5.130:1119"
configs:
  harbor.crrc.com:
    auth:
      username: admin
      password: Harbor12345
    tls:
      cert_file: /home/nvidia/Downloads/certs/harbor.crrc.com.crt
      key_file: /home/nvidia/Downloads/certs/harbor.crrc.com.key
      ca_file: /home/nvidia/Downloads/certs/ca.crt
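One thing that can be checked offline is whether every mirror endpoint has a matching entry under configs. The sketch below is only an illustration, assuming (as the replies later in this thread suggest) that auth and tls settings are looked up by the endpoint's host:port rather than by the mirror name. The dict is a hand-written copy of the file above, and `endpoints_without_config` is a hypothetical helper, not part of K3s.

```python
from urllib.parse import urlparse

# The registries.yaml above, written out as a plain dict so that no YAML
# library is needed (the tls paths are omitted for brevity).
registries = {
    "mirrors": {
        "docker.io": {"endpoint": ["http://192.168.5.130:1119"]},
        "harbor.crrc.com": {"endpoint": ["https://192.168.5.130:1119"]},
    },
    "configs": {
        "harbor.crrc.com": {
            "auth": {"username": "admin", "password": "Harbor12345"},
        },
    },
}


def endpoints_without_config(cfg):
    """Return sorted endpoint host:port values that have no configs entry.

    Assumption: auth/tls settings are matched on the endpoint's host:port.
    """
    config_keys = set(cfg.get("configs", {}))
    missing = set()
    for mirror in cfg.get("mirrors", {}).values():
        for endpoint in mirror.get("endpoint", []):
            host = urlparse(endpoint).netloc  # e.g. "192.168.5.130:1119"
            if host not in config_keys:
                missing.add(host)
    return sorted(missing)


print(endpoints_without_config(registries))
```

Under that assumption, any host printed here would be contacted without the credentials defined in the file.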
Actual result:
After restarting the k3s service with systemctl restart k3s, I checked with sudo crictl info | grep -A 5 "registry" and got:
"registry": {
"configPath": "",
"mirrors": null,
"configs": null,
"auths": null,
"headers": null
Trying to pull the image with sudo k3s crictl pull harbor.crrc.com:1119/test/registry gives:
E1125 07:11:04.613471 45861 remote_image.go:238] "PullImage from image service failed" err="rpc error: code = Unknown desc = failed to pull and unpack image \"harbor.crrc.com:1119/test/registry:latest\": failed to resolve reference \"harbor.crrc.com:1119/test/registry:latest\": pulling from host harbor.crrc.com:1119 failed with status code [manifests latest]: 401 Unauthorized" image="harbor.crrc.com:1119/test/registry"
FATA[0000] pulling image: rpc error: code = Unknown desc = failed to pull and unpack image "harbor.crrc.com:1119/test/registry:latest": failed to resolve reference "harbor.crrc.com:1119/test/registry:latest": pulling from host harbor.crrc.com:1119 failed with status code [manifests latest]: 401 Unauthorized
The containerd configuration file is as follows:
nvidia@node166:~/Downloads/k3sTest$ sudo cat /var/lib/rancher/k3s/agent/etc/containerd/config.toml
version = 2
[plugins."io.containerd.internal.v1.opt"]
path = "/var/lib/rancher/k3s/agent/containerd"
[plugins."io.containerd.grpc.v1.cri"]
stream_server_address = "127.0.0.1"
stream_server_port = "10010"
enable_selinux = false
enable_unprivileged_ports = true
enable_unprivileged_icmp = true
sandbox_image = "rancher/mirrored-pause:3.6"
[plugins."io.containerd.grpc.v1.cri".containerd]
snapshotter = "overlayfs"
disable_snapshot_annotations = true
default_runtime_name = "nvidia-container-runtime"
[plugins."io.containerd.grpc.v1.cri".cni]
bin_dir = "/var/lib/rancher/k3s/data/03319a42bd191a541dd2fb18e572bf84e43905984afb83f1aca41e70cf220067/bin"
conf_dir = "/var/lib/rancher/k3s/agent/etc/cni/net.d"
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc]
runtime_type = "io.containerd.runc.v2"
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
SystemdCgroup = true
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.nvidia-container-runtime]
runtime_type = "io.containerd.runtime.v1.linux"
runtime_engine = "/usr/bin/nvidia-container-runtime"
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes."nvidia"]
runtime_type = "io.containerd.runc.v2"
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes."nvidia".options]
BinaryName = "/usr/bin/nvidia-container-runtime"
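A quick way to see whether registries.yaml was rendered into this file is to look for registry tables among the TOML headers. The sketch below assumes that this K3s version writes mirrors and configs into config.toml as tables whose names contain `.registry`; that is an assumption about 1.24 behavior, not something shown in this thread. `registry_tables` is a hypothetical helper, and the sample text is an excerpt of the file above.

```python
def registry_tables(config_toml):
    """Return TOML table headers whose name mentions the CRI registry."""
    headers = []
    for line in config_toml.splitlines():
        line = line.strip()
        # Only table headers such as [plugins."...".registry.mirrors] count.
        if line.startswith("[") and line.endswith("]") and ".registry" in line:
            headers.append(line)
    return headers


# Excerpt of the config.toml shown above.
sample = '''
version = 2
[plugins."io.containerd.grpc.v1.cri"]
sandbox_image = "rancher/mirrored-pause:3.6"
[plugins."io.containerd.grpc.v1.cri".containerd]
default_runtime_name = "nvidia-container-runtime"
[plugins."io.containerd.grpc.v1.cri".cni]
conf_dir = "/var/lib/rancher/k3s/agent/etc/cni/net.d"
'''

print(registry_tables(sample))
```

If the assumption holds, an empty list for the full file would be consistent with the registry settings never having been written into the containerd configuration.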
ksd
November 27, 2024, 08:28
2
Try changing it to this:
nvidia@node166:~/Downloads/k3sTest$ cat /etc/rancher/k3s/registries.yaml
mirrors:
  docker.io:
    endpoint:
      - "http://192.168.5.130:1119"
  harbor.crrc.com:
    endpoint:
      - "https://192.168.5.130:1119"
configs:
  "192.168.5.130:1119":
    auth:
      username: admin
      password: Harbor12345
    tls:
      cert_file: /home/nvidia/Downloads/certs/harbor.crrc.com.crt
      key_file: /home/nvidia/Downloads/certs/harbor.crrc.com.key
      ca_file: /home/nvidia/Downloads/certs/ca.crt
I made the change and restarted the k3s cluster, but it still does not work.
nvidia@node166:~/Downloads/k3sTest$ cat /etc/rancher/k3s/registries.yaml
mirrors:
  docker.io:
    endpoint:
      - "http://192.168.5.130:1119"
  harbor.crrc.com:
    endpoint:
      - "https://192.168.5.130:1119"
configs:
  "192.168.5.130:1119":
    auth:
      username: admin
      password: Harbor12345
    tls:
      cert_file: /home/nvidia/Downloads/certs/harbor.crrc.com.crt
      key_file: /home/nvidia/Downloads/certs/harbor.crrc.com.key
      ca_file: /home/nvidia/Downloads/certs/ca.crt
The corresponding pod.yaml is as follows:
apiVersion: v1
kind: Pod
metadata:
  name: testpod
  namespace: default
  labels:
    app: myapp
    environment: dev
spec:
  nodeSelector:
    kubernetes.io/hostname: node166
  containers:
    - name: mycontainer
      image: harbor.crrc.com:1119/test/registry
      imagePullPolicy: Always
The pod deployment log is as follows:
nvidia@node166:~/Downloads/k3sTest$ kubectl describe pods
Name: testpod
Namespace: default
Priority: 0
Node: node166/192.168.5.166
Start Time: Wed, 27 Nov 2024 09:08:13 +0000
Labels: app=myapp
environment=dev
Annotations: <none>
Status: Pending
IP: 10.42.0.30
IPs:
IP: 10.42.0.30
Containers:
mycontainer:
Container ID:
Image: harbor.crrc.com:1119/test/registry
Image ID:
Port: <none>
Host Port: <none>
State: Waiting
Reason: ImagePullBackOff
Ready: False
Restart Count: 0
Environment: <none>
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-v65lk (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
kube-api-access-v65lk:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional: <nil>
DownwardAPI: true
QoS Class: BestEffort
Node-Selectors: kubernetes.io/hostname=node166
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 4s
node.kubernetes.io/unreachable:NoExecute op=Exists for 4s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 3m47s default-scheduler Successfully assigned default/testpod to node166
Normal Pulling 2m20s (x4 over 3m46s) kubelet Pulling image "harbor.crrc.com:1119/test/registry"
Warning Failed 2m19s (x4 over 3m46s) kubelet Failed to pull image "harbor.crrc.com:1119/test/registry": rpc error: code = Unknown desc = failed to pull and unpack image "harbor.crrc.com:1119/test/registry:latest": failed to resolve reference "harbor.crrc.com:1119/test/registry:latest": pulling from host harbor.crrc.com:1119 failed with status code [manifests latest]: 401 Unauthorized
Warning Failed 2m19s (x4 over 3m46s) kubelet Error: ErrImagePull
Warning Failed 2m7s (x6 over 3m46s) kubelet Error: ImagePullBackOff
Normal BackOff 112s (x7 over 3m46s) kubelet Back-off pulling image "harbor.crrc.com:1119/test/registry"
ksd
November 27, 2024, 12:27
4
There are two options; you can try either one:
Option 1:
Change the container image to: harbor.crrc.com/test/registry
Option 2:
Change registries.yaml to:
nvidia@node166:~/Downloads/k3sTest$ cat /etc/rancher/k3s/registries.yaml
mirrors:
  docker.io:
    endpoint:
      - "http://192.168.5.130:1119"
  "harbor.crrc.com:1119":
    endpoint:
      - "https://192.168.5.130:1119"
configs:
  "192.168.5.130:1119":
    auth:
      username: admin
      password: Harbor12345
    tls:
      cert_file: /home/nvidia/Downloads/certs/harbor.crrc.com.crt
      key_file: /home/nvidia/Downloads/certs/harbor.crrc.com.key
      ca_file: /home/nvidia/Downloads/certs/ca.crt
These two options conflict with each other, so apply only one of them.
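The two options differ in which registry name the image reference carries. A rough sketch, assuming the mirror entry is selected by the registry part of the image reference (everything before the first "/", including any port): `registry_of` is a hypothetical helper, and the docker.io fallback for names without a registry prefix follows Docker's usual naming convention.

```python
def registry_of(image):
    """Return the registry host[:port] part of an image reference.

    Common Docker convention: the first path component is a registry only
    if it contains "." or ":" or equals "localhost"; otherwise the image
    is assumed to live on docker.io.
    """
    first, sep, _rest = image.partition("/")
    if sep and ("." in first or ":" in first or first == "localhost"):
        return first
    return "docker.io"


# Mirror names from the original registries.yaml and from option 2.
original_mirrors = {"docker.io", "harbor.crrc.com"}
option2_mirrors = {"docker.io", "harbor.crrc.com:1119"}

for image in ("harbor.crrc.com:1119/test/registry", "harbor.crrc.com/test/registry"):
    reg = registry_of(image)
    print(image, "->", reg,
          "| original mirrors:", reg in original_mirrors,
          "| option 2 mirrors:", reg in option2_mirrors)
```

Under that assumption, the image with the :1119 port only lines up with the mirror name used in option 2, and the image without a port only lines up with the original mirror name, which is why the two changes should not be combined.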
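Thank you for the reply. I tried both approaches and neither had any effect. Checking with sudo crictl info | grep -A 5 "registry" still gives the result below, so it looks as if k3s is not generating the containerd configuration from registries.yaml.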
"registry": {
"configPath": "",
"mirrors": null,
"configs": null,
"auths": null,
"headers": null
},
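Could this be because the k3s version is too old?
ksd
November 28, 2024, 11:56
6
You can ignore that output. Current K3s versions no longer show the registry configuration in crictl info; see: https://github.com/k3s-io/k3s/issues/9626
Thanks for the reply. The k3s version we are using is 1.24.8, an old release from two years ago.
With option 1, the error seems to be an authentication problem (the log reports an x509 certificate mismatch):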
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Killing 5m4s kubelet Container mycontainer definition changed, will be restarted
Normal BackOff 4m37s (x2 over 5m4s) kubelet Back-off pulling image "harbor.crrc.com/test/registry"
Warning Failed 4m37s (x70 over 23h) kubelet Error: ImagePullBackOff
Normal Pulling 3m35s (x4 over 5m4s) kubelet Pulling image "harbor.crrc.com/test/registry"
Warning Failed 3m35s (x4 over 5m4s) kubelet Failed to pull image "harbor.crrc.com/test/registry": rpc error: code = Unknown desc = failed to pull and unpack image "harbor.crrc.com/test/registry:latest": failed to resolve reference "harbor.crrc.com/test/registry:latest": failed to do request: Head "https://harbor.crrc.com/v2/test/registry/manifests/latest": x509: certificate is valid for aefe17bf2dcb830be36eb2742c08eb14.a8faa192c3ae0e3652b28d267adf6952.traefik.default, not harbor.crrc.com
Warning Failed 3m35s (x11 over 23h) kubelet Error: ErrImagePull
With option 2, the error is the same as before, 401 Unauthorized (not logged in):
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 2s default-scheduler Successfully assigned default/testpod to node166
Normal Pulling 1s kubelet Pulling image "harbor.crrc.com:1119/test/registry"
Warning Failed 1s kubelet Failed to pull image "harbor.crrc.com:1119/test/registry": rpc error: code = Unknown desc = failed to pull and unpack image "harbor.crrc.com:1119/test/registry:latest": failed to resolve reference "harbor.crrc.com:1119/test/registry:latest": pulling from host harbor.crrc.com:1119 failed with status code [manifests latest]: 401 Unauthorized
Warning Failed 1s kubelet Error: ErrImagePull
Normal BackOff 1s kubelet Back-off pulling image "harbor.crrc.com:1119/test/registry"
Warning Failed 1s kubelet Error: ImagePullBackOff
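Upstream seems to have changed this behavior only after version 1.26. We also deployed k3s 1.30.x on another cluster, and there the same sudo crictl info | grep -A 5 "registry" command does show the registry configuration: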
"registry": {
"configPath": "/var/lib/rancher/k3s/agent/etc/containerd/certs.d",
"mirrors": null,
"configs": {
"192.168.5.130:1119": {
"auth": {
The registries.yaml on that cluster is configured as follows:
mirrors:
  docker-registry:
    endpoint:
      - "http://registry.cube.local:5000"
  "192.168.5.130:1119":
    endpoint:
      - "http://192.168.5.130:1119"
  "harbor.crrc.com:1119":
    endpoint:
      - "http://192.168.5.130:1119"
configs:
  "192.168.5.130:1119":
    auth:
      username: admin
      password: Harbor12345
    tls:
      cert_file: /home/sgq/Downloads/certs/harbor.crrc.com.crt
      key_file: /home/sgq/Downloads/certs/harbor.crrc.com.key
      ca_file: /home/sgq/Downloads/certs/ca.crt
With this setup, images can be pulled from the private Harbor registry without problems.