Rancher中的kube-scheduler什么时候会开始支持实时资源方式的调度算法

Rancher Server 设置

  • Rancher 版本:
  • 安装选项 (Docker install/Helm Chart):
    • 如果是 Helm Chart 安装,需要提供 Local 集群的类型(RKE1, RKE2, k3s, EKS, 等)和版本:
  • 在线或离线部署:

下游集群信息

  • Kubernetes 版本:
  • Cluster Type (Local/Downstream):
    • 如果 Downstream,是什么类型的集群?(自定义/导入或为托管 等):

用户信息

  • 登录用户的角色是什么? (管理员/集群所有者/集群成员/项目所有者/项目成员/自定义):
    • 如果自定义,自定义权限集:

主机操作系统:

**问题描述:**rancher中的kube-scheduler什么时候会开始支持实时资源方式的调度算法

重现步骤:

结果:

预期结果:

截图:

其他上下文信息:

日志


rancher 部署集群中的kube-scheduler 完全采用上游的 kube-scheduler 方案,何来支持和不支持?

我想给下游集群的kube-scheduler引入新的打分插件之类,应用在哪里添加格式怎么写?

这个没弄过,但你知道在 K8s 中咋配置么?这样我能告诉你在 rancher 中咋配置

比如:在cluster.yaml中指定kube-scheduler镜像,如下图:

master主机的kube-scheduler容器一直重启:

rancher页面有报错:

不知正确格式应该是怎么写?

  1. 原来默认的镜像是什么?你只改了个镜像仓库的名称?
  2. 得看具体的容器日志才知道原因

截图中就是默认的。我想先用默认的在cluster.yml中指定成功后,再下一步。现在用默认镜像就没成功。docker logs kube-schduler输出的不像日志,更像是帮助信息,好长一片,事件也没有。应该还未到日志和事件产生的阶段吧。

docker logs 查不到?

Metrics flags:

      --show-hidden-metrics-for-version string   The previous version for which you want to show hidden metrics. Only the previous minor version is meaningful, other values will not be allowed. The format is <major>.<minor>, e.g.: '1.16'. The purpose of this format is make sure you have the opportunity to notice if the next release hides additional metrics, rather than being surprised when they are permanently removed in the release after that.

Logs flags:

      --logging-format string   Sets the log format. Permitted formats: "json", "text".
                                Non-default formats don't honor these flags: --add_dir_header, --alsologtostderr, --log_backtrace_at, --log_dir, --log_file, --log_file_max_size, --logtostderr, --skip_headers, --skip_log_headers, --stderrthreshold, --vmodule, --log-flush-frequency.
                                Non-default choices are currently alpha and subject to change without warning. (default "text")

Global flags:

      --add-dir-header                   If true, adds the file directory to the header of the log messages
      --alsologtostderr                  log to standard error as well as files
  -h, --help                             help for kube-scheduler
      --log-backtrace-at traceLocation   when logging hits line file:N, emit a stack trace (default :0)
      --log-dir string                   If non-empty, write log files in this directory
      --log-file string                  If non-empty, use this log file
      --log-file-max-size uint           Defines the maximum size a log file can grow to. Unit is megabytes. If the value is 0, the maximum file size is unlimited. (default 1800)
      --log-flush-frequency duration     Maximum number of seconds between log flushes (default 5s)
      --logtostderr                      log to standard error instead of files (default true)
      --skip-headers                     If true, avoid header prefixes in the log messages
      --skip-log-headers                 If true, avoid headers when opening log files
      --stderrthreshold severity         logs at or above this threshold go to stderr (default 2)
  -v, --v Level                          number for the log level verbosity (default 0)
      --version version[=true]           Print version information and quit
      --vmodule moduleSpec               comma-separated list of pattern=N settings for file-filtered logging

+ echo kube-scheduler --kubeconfig=/etc/kubernetes/ssl/kubecfg-kube-scheduler.yaml --address=0.0.0.0 --leader-elect=true --profiling=false --v=2 --image=ctrimages.hzlinks9.net/rancher/hyperkube:v1.19.6-rancher1
+ grep -q cloud-provider=azure
+ '[' kube-scheduler = kubelet ']'
+ exec kube-scheduler --kubeconfig=/etc/kubernetes/ssl/kubecfg-kube-scheduler.yaml --address=0.0.0.0 --leader-elect=true --profiling=false --v=2 --image=ctrimages.hzlinks9.net/rancher/hyperkube:v1.19.6-rancher1
I1130 06:41:23.558965       1 registry.go:173] Registering SelectorSpread plugin
I1130 06:41:23.559070       1 registry.go:173] Registering SelectorSpread plugin
Error: unknown flag: --image
Usage:
  kube-scheduler [flags]

Misc flags:

      --config string            The path to the configuration file. The following flags can overwrite fields in this file:
                                   --address
                                   --port
                                   --use-legacy-policy-config
                                   --policy-configmap
                                   --policy-config-file
                                   --algorithm-provider
      --master string            The address of the Kubernetes API server (overrides any value in kubeconfig)
      --write-config-to string   If set, write the configuration values to this file and exit.

Secure serving flags:

      --bind-address ip                        The IP address on which to listen for the --secure-port port. The associated interface(s) must be reachable by the rest of the cluster, and by CLI/web clients. If blank or an unspecified address (0.0.0.0 or ::), all interfaces will be used. (default 0.0.0.0)
      --cert-dir string                        The directory where the TLS certs are located. If --tls-cert-file and --tls-private-key-file are provided, this flag will be ignored.
      --http2-max-streams-per-connection int   The limit that the server gives to clients for the maximum number of streams in an HTTP/2 connection. Zero means to use golang's default.
      --permit-port-sharing                    If true, SO_REUSEPORT will be used when binding the port, which allows more than one instance to bind on the same address and port. [default=false]
      --secure-port int                        The port on which to serve HTTPS with authentication and authorization. If 0, don't serve HTTPS at all. (default 10259)
      --tls-cert-file string                   File containing the default x509 Certificate for HTTPS. (CA cert, if any, concatenated after server cert). If HTTPS serving is enabled, and --tls-cert-file and --tls-private-key-file are not provided, a self-signed certificate and key are generated for the public address and saved to the directory specified by --cert-dir.
      --tls-cipher-suites strings              Comma-separated list of cipher suites for the server. If omitted, the default Go cipher suites will be used. 
                                               Preferred values: TLS_AES_128_GCM_SHA256, TLS_AES_256_GCM_SHA384, TLS_CHACHA20_POLY1305_SHA256, TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA, TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256, TLS_ECDHE_ECDSA_WITH_AES_256_CBC_SHA, TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384, TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305, TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305_SHA256, TLS_ECDHE_RSA_WITH_3DES_EDE_CBC_SHA, TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA, TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256, TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA, TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384, TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305, TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305_SHA256, TLS_RSA_WITH_3DES_EDE_CBC_SHA, TLS_RSA_WITH_AES_128_CBC_SHA, TLS_RSA_WITH_AES_128_GCM_SHA256, TLS_RSA_WITH_AES_256_CBC_SHA, TLS_RSA_WITH_AES_256_GCM_SHA384. 
                                               Insecure values: TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA256, TLS_ECDHE_ECDSA_WITH_RC4_128_SHA, TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256, TLS_ECDHE_RSA_WITH_RC4_128_SHA, TLS_RSA_WITH_AES_128_CBC_SHA256, TLS_RSA_WITH_RC4_128_SHA.
      --tls-min-version string                 Minimum TLS version supported. Possible values: VersionTLS10, VersionTLS11, VersionTLS12, VersionTLS13
      --tls-private-key-file string            File containing the default x509 private key matching --tls-cert-file.
      --tls-sni-cert-key namedCertKey          A pair of x509 certificate and private key file paths, optionally suffixed with a list of domain patterns which are fully qualified domain names, possibly with prefixed wildcard segments. The domain patterns also allow IP addresses, but IPs should only be used if the apiserver has visibility to the IP address requested by a client. If no domain patterns are provided, the names of the certificate are extracted. Non-wildcard matches trump over wildcard matches, explicit domain patterns trump over extracted names. For multiple key/certificate pairs, use the --tls-sni-cert-key multiple times. Examples: "example.crt,example.key" or "foo.crt,foo.key:*.foo.com,foo.com". (default [])

Insecure serving flags:

      --address string   DEPRECATED: the IP address on which to listen for the --port port (set to 0.0.0.0 for all IPv4 interfaces and :: for all IPv6 interfaces). See --bind-address instead. (default "0.0.0.0")
      --port int         DEPRECATED: the port on which to serve HTTP insecurely without authentication and authorization. If 0, don't serve plain HTTP at all. See --secure-port instead. (default 10251)

Authentication flags:

      --authentication-kubeconfig string                  kubeconfig file pointing at the 'core' kubernetes server with enough rights to create tokenreviews.authentication.k8s.io. This is optional. If empty, all token requests are considered to be anonymous and no client CA is looked up in the cluster.
      --authentication-skip-lookup                        If false, the authentication-kubeconfig will be used to lookup missing authentication configuration from the cluster.
      --authentication-token-webhook-cache-ttl duration   The duration to cache responses from the webhook token authenticator. (default 10s)
      --authentication-tolerate-lookup-failure            If true, failures to look up missing authentication configuration from the cluster are not considered fatal. Note that this can result in authentication that treats all requests as anonymous. (default true)
      --client-ca-file string                             If set, any request presenting a client certificate signed by one of the authorities in the client-ca-file is authenticated with an identity corresponding to the CommonName of the client certificate.
      --requestheader-allowed-names strings               List of client certificate common names to allow to provide usernames in headers specified by --requestheader-username-headers. If empty, any client certificate validated by the authorities in --requestheader-client-ca-file is allowed.
      --requestheader-client-ca-file string               Root certificate bundle to use to verify client certificates on incoming requests before trusting usernames in headers specified by --requestheader-username-headers. WARNING: generally do not depend on authorization being already done for incoming requests.
      --requestheader-extra-headers-prefix strings        List of request header prefixes to inspect. X-Remote-Extra- is suggested. (default [x-remote-extra-])
      --requestheader-group-headers strings               List of request headers to inspect for groups. X-Remote-Group is suggested. (default [x-remote-group])
      --requestheader-username-headers strings            List of request headers to inspect for usernames. X-Remote-User is common. (default [x-remote-user])

Authorization flags:

      --authorization-always-allow-paths strings                A list of HTTP paths to skip during authorization, i.e. these are authorized without contacting the 'core' kubernetes server. (default [/healthz])
      --authorization-kubeconfig string                         kubeconfig file pointing at the 'core' kubernetes server with enough rights to create subjectaccessreviews.authorization.k8s.io. This is optional. If empty, all requests not skipped by authorization are forbidden.
      --authorization-webhook-cache-authorized-ttl duration     The duration to cache 'authorized' responses from the webhook authorizer. (default 10s)
      --authorization-webhook-cache-unauthorized-ttl duration   The duration to cache 'unauthorized' responses from the webhook authorizer. (default 10s)

Deprecated flags:

      --algorithm-provider string                  DEPRECATED: the scheduling algorithm provider to use, this sets the default plugins for component config profiles. Choose one of: ClusterAutoscalerProvider | DefaultProvider
      --contention-profiling                       DEPRECATED: enable lock contention profiling, if profiling is enabled (default true)
      --hard-pod-affinity-symmetric-weight int32   DEPRECATED: RequiredDuringScheduling affinity is not symmetric, but there is an implicit PreferredDuringScheduling affinity rule corresponding to every RequiredDuringScheduling affinity rule. --hard-pod-affinity-symmetric-weight represents the weight of implicit PreferredDuringScheduling affinity rule. Must be in the range 0-100.This option was moved to the policy configuration file (default 1)
      --kube-api-burst int32                       DEPRECATED: burst to use while talking with kubernetes apiserver (default 100)
      --kube-api-content-type string               DEPRECATED: content type of requests sent to apiserver. (default "application/vnd.kubernetes.protobuf")
      --kube-api-qps float32                       DEPRECATED: QPS to use while talking with kubernetes apiserver (default 50)
      --kubeconfig string                          DEPRECATED: path to kubeconfig file with authorization and master location information.
      --lock-object-name string                    DEPRECATED: define the name of the lock object. Will be removed in favor of leader-elect-resource-name (default "kube-scheduler")
      --lock-object-namespace string               DEPRECATED: define the namespace of the lock object. Will be removed in favor of leader-elect-resource-namespace. (default "kube-system")
      --policy-config-file string                  DEPRECATED: file with scheduler policy configuration. This file is used if policy ConfigMap is not provided or --use-legacy-policy-config=true. Note: The scheduler will fail if this is combined with Plugin configs
      --policy-configmap string                    DEPRECATED: name of the ConfigMap object that contains scheduler's policy configuration. It must exist in the system namespace before scheduler initialization if --use-legacy-policy-config=false. The config must be provided as the value of an element in 'Data' map with the key='policy.cfg'. Note: The scheduler will fail if this is combined with Plugin configs
      --policy-configmap-namespace string          DEPRECATED: the namespace where policy ConfigMap is located. The kube-system namespace will be used if this is not provided or is empty. Note: The scheduler will fail if this is combined with Plugin configs (default "kube-system")
      --profiling                                  DEPRECATED: enable profiling via web interface host:port/debug/pprof/ (default true)
      --scheduler-name string                      DEPRECATED: name of the scheduler, used to select which pods will be processed by this scheduler, based on pod's "spec.schedulerName". (default "default-scheduler")
      --use-legacy-policy-config                   DEPRECATED: when set to true, scheduler will ignore policy ConfigMap and uses policy config file. Note: The scheduler will fail if this is combined with Plugin configs

Leader election flags:

      --leader-elect                             Start a leader election client and gain leadership before executing the main loop. Enable this when running replicated components for high availability. (default true)
      --leader-elect-lease-duration duration     The duration that non-leader candidates will wait after observing a leadership renewal until attempting to acquire leadership of a led but unrenewed leader slot. This is effectively the maximum duration that a leader can be stopped before it is replaced by another candidate. This is only applicable if leader election is enabled. (default 15s)
      --leader-elect-renew-deadline duration     The interval between attempts by the acting master to renew a leadership slot before it stops leading. This must be less than or equal to the lease duration. This is only applicable if leader election is enabled. (default 10s)
      --leader-elect-resource-lock string        The type of resource object that is used for locking during leader election. Supported options are 'endpoints', 'configmaps', 'leases', 'endpointsleases' and 'configmapsleases'. (default "endpointsleases")
      --leader-elect-resource-name string        The name of resource object that is used for locking during leader election. (default "kube-scheduler")
      --leader-elect-resource-namespace string   The namespace of resource object that is used for locking during leader election. (default "kube-system")
      --leader-elect-retry-period duration       The duration the clients should wait between attempting acquisition and renewal of a leadership. This is only applicable if leader election is enabled. (default 2s)

Feature gate flags:

      --feature-gates mapStringBool   A set of key=value pairs that describe feature gates for alpha/experimental features. Options are:
                                      APIListChunking=true|false (BETA - default=true)
                                      APIPriorityAndFairness=true|false (ALPHA - default=false)
                                      APIResponseCompression=true|false (BETA - default=true)
                                      AllAlpha=true|false (ALPHA - default=false)
                                      AllBeta=true|false (BETA - default=false)
                                      AllowInsecureBackendProxy=true|false (BETA - default=true)
                                      AnyVolumeDataSource=true|false (ALPHA - default=false)
                                      AppArmor=true|false (BETA - default=true)
                                      BalanceAttachedNodeVolumes=true|false (ALPHA - default=false)
                                      BoundServiceAccountTokenVolume=true|false (ALPHA - default=false)
                                      CPUManager=true|false (BETA - default=true)
                                      CRIContainerLogRotation=true|false (BETA - default=true)
                                      CSIInlineVolume=true|false (BETA - default=true)
                                      CSIMigration=true|false (BETA - default=true)
                                      CSIMigrationAWS=true|false (BETA - default=false)
                                      CSIMigrationAWSComplete=true|false (ALPHA - default=false)
                                      CSIMigrationAzureDisk=true|false (BETA - default=false)
                                      CSIMigrationAzureDiskComplete=true|false (ALPHA - default=false)
                                      CSIMigrationAzureFile=true|false (ALPHA - default=false)
                                      CSIMigrationAzureFileComplete=true|false (ALPHA - default=false)
                                      CSIMigrationGCE=true|false (BETA - default=false)
                                      CSIMigrationGCEComplete=true|false (ALPHA - default=false)
                                      CSIMigrationOpenStack=true|false (BETA - default=false)
                                      CSIMigrationOpenStackComplete=true|false (ALPHA - default=false)
                                      CSIMigrationvSphere=true|false (BETA - default=false)
                                      CSIMigrationvSphereComplete=true|false (BETA - default=false)
                                      CSIStorageCapacity=true|false (ALPHA - default=false)
                                      CSIVolumeFSGroupPolicy=true|false (ALPHA - default=false)
                                      ConfigurableFSGroupPolicy=true|false (ALPHA - default=false)
                                      CustomCPUCFSQuotaPeriod=true|false (ALPHA - default=false)
                                      DefaultPodTopologySpread=true|false (ALPHA - default=false)
                                      DevicePlugins=true|false (BETA - default=true)
                                      DisableAcceleratorUsageMetrics=true|false (ALPHA - default=false)
                                      DynamicKubeletConfig=true|false (BETA - default=true)
                                      EndpointSlice=true|false (BETA - default=true)
                                      EndpointSliceProxying=true|false (BETA - default=true)
                                      EphemeralContainers=true|false (ALPHA - default=false)
                                      ExpandCSIVolumes=true|false (BETA - default=true)
                                      ExpandInUsePersistentVolumes=true|false (BETA - default=true)
                                      ExpandPersistentVolumes=true|false (BETA - default=true)
                                      ExperimentalHostUserNamespaceDefaulting=true|false (BETA - default=false)
                                      GenericEphemeralVolume=true|false (ALPHA - default=false)
                                      HPAScaleToZero=true|false (ALPHA - default=false)
                                      HugePageStorageMediumSize=true|false (BETA - default=true)
                                      HyperVContainer=true|false (ALPHA - default=false)
                                      IPv6DualStack=true|false (ALPHA - default=false)
                                      ImmutableEphemeralVolumes=true|false (BETA - default=true)
                                      KubeletPodResources=true|false (BETA - default=true)
                                      LegacyNodeRoleBehavior=true|false (BETA - default=true)
                                      LocalStorageCapacityIsolation=true|false (BETA - default=true)
                                      LocalStorageCapacityIsolationFSQuotaMonitoring=true|false (ALPHA - default=false)
                                      NodeDisruptionExclusion=true|false (BETA - default=true)
                                      NonPreemptingPriority=true|false (BETA - default=true)
                                      PodDisruptionBudget=true|false (BETA - default=true)
                                      PodOverhead=true|false (BETA - default=true)
                                      ProcMountType=true|false (ALPHA - default=false)
                                      QOSReserved=true|false (ALPHA - default=false)
                                      RemainingItemCount=true|false (BETA - default=true)
                                      RemoveSelfLink=true|false (ALPHA - default=false)
                                      RotateKubeletServerCertificate=true|false (BETA - default=true)
                                      RunAsGroup=true|false (BETA - default=true)
                                      RuntimeClass=true|false (BETA - default=true)
                                      SCTPSupport=true|false (BETA - default=true)
                                      SelectorIndex=true|false (BETA - default=true)
                                      ServerSideApply=true|false (BETA - default=true)
                                      ServiceAccountIssuerDiscovery=true|false (ALPHA - default=false)
                                      ServiceAppProtocol=true|false (BETA - default=true)
                                      ServiceNodeExclusion=true|false (BETA - default=true)
                                      ServiceTopology=true|false (ALPHA - default=false)
                                      SetHostnameAsFQDN=true|false (ALPHA - default=false)
                                      StartupProbe=true|false (BETA - default=true)
                                      StorageVersionHash=true|false (BETA - default=true)
                                      SupportNodePidsLimit=true|false (BETA - default=true)
                                      SupportPodPidsLimit=true|false (BETA - default=true)
                                      Sysctls=true|false (BETA - default=true)
                                      TTLAfterFinished=true|false (ALPHA - default=false)
                                      TokenRequest=true|false (BETA - default=true)
                                      TokenRequestProjection=true|false (BETA - default=true)
                                      TopologyManager=true|false (BETA - default=true)
                                      ValidateProxyRedirects=true|false (BETA - default=true)
                                      VolumeSnapshotDataSource=true|false (BETA - default=true)
                                      WarningHeaders=true|false (BETA - default=true)
                                      WinDSR=true|false (ALPHA - default=false)
                                      WinOverlay=true|false (ALPHA - default=false)
                                      WindowsEndpointSliceProxying=true|false (ALPHA - default=false)

Metrics flags:

估计是因为你加了某些不支持的参数,或者格式导致的 无法启动

中间有一段这个。应该还是格式怎么写的问题吗?

你想干啥

目前在cluster.yml中只加载了kubeconfig和image,如下:

    scheduler:
      extra_args:
        image: 'ctrimages.hzlinks9.net/rancher/hyperkube:v1.19.6-rancher1'
        kubeconfig: /etc/kubernetes/ssl/kubecfg-kube-scheduler.yaml

如图:


加了image这行才出现报错。前kubeconfig没有报错。

在cluster.yml中指定kube-scheduler的镜像。

kube-scheduler 中根本就没有 image 这个参数,所以你根本就起不来啊

还有,你为什么要想替换 kube-scheduler 的镜像?这个镜像和对应的 K8s 版本都是匹配好的。基本没必要去替换。

本意想解决业务集群各计算节点资源使用不均衡问题,严重的导致节点宕机(我已预留system-reserved和 kube-reserved, 仍效果不佳),因为原生的K8S调度器是基于 Request,而计算节点负载并不能反映真实情况。尝试引入实时负载感知调度器,操作时涉及前面这些东西。
所以,请问:社区版rancher,对计算节点资源使用不均衡问题有好的解决方案吗?可以提供参考一下吗?谢谢!

可使用 descheduler 插件,参考:GitHub - kubernetes-sigs/descheduler: Descheduler for Kubernetes

这个我再试试吧。以前这个我也看过,也是是基于 Request,它只负责检查并驱逐资源使用率高的计算节点的POD,调度POD工作还是依靠kube-scheduler,言下之意,kube-schduler 基于Request,仍然会把POD调度到资源使用率高的计算节点上。

另外,我也在RANCHER文档中找到你说的不支持参数,如下图:

非常感谢你认真解答。谢谢!