Notes on Deploying metrics-server on Kubernetes 1.13

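Starting with Kubernetes 1.8, resource usage metrics such as container CPU and memory usage are exposed through the Metrics API. Heapster has also been deprecated by Kubernetes. What follows is a record of getting metrics-server up and running.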

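First, run the command below. It fails because the requested resource cannot be found: no metrics source has been deployed in the cluster yet.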

$ k top node
Error from server (NotFound): the server could not find the requested resource (get services http:heapster:)

Deploying metrics-server

Clone the repository

$ git clone https://github.com/kubernetes-incubator/metrics-server.git
$ cd metrics-server 

Edit the Deployment manifest under deploy/: switch the image to a mirror registry reachable from inside China, and change imagePullPolicy to IfNotPresent.

---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: metrics-server
  namespace: kube-system
---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: metrics-server
  namespace: kube-system
  labels:
    k8s-app: metrics-server
spec:
  selector:
    matchLabels:
      k8s-app: metrics-server
  template:
    metadata:
      name: metrics-server
      labels:
        k8s-app: metrics-server
    spec:
      serviceAccountName: metrics-server
      volumes:
      # mount in tmp so we can safely use from-scratch images and/or read-only containers
      - name: tmp-dir
        emptyDir: {}
      containers:
      - name: metrics-server
        ### changed to a mirror registry in China
        image: registry.cn-shenzhen.aliyuncs.com/shuhui/metrics-server:v0.3.1
        ### changed to IfNotPresent
        imagePullPolicy: IfNotPresent
        ### added: skip kubelet certificate verification, otherwise an error is reported
        command:
        - /metrics-server
        - --kubelet-insecure-tls
        volumeMounts:
        - name: tmp-dir
          mountPath: /tmp

After editing the manifest, deploy metrics-server

$ k create -f  deploy/1.8+/

Check the deployment status

$ k describe deployments.apps -n kube-system metrics-server
Name:                   metrics-server
Namespace:              kube-system
...
  Type    Reason             Age   From                   Message
  ----    ------             ----  ----                   -------
  Normal  ScalingReplicaSet  16m   deployment-controller  Scaled up replica set metrics-server-7c5b49ff5c to 1
  Normal  ScalingReplicaSet  16m   deployment-controller  Scaled down replica set metrics-server-5f5cd8dc65 to 0
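
Besides describing the Deployment, it can help to confirm that the aggregated Metrics API itself is registered and reachable. The following is a minimal sketch, not part of the original notes. It assumes kubectl is on PATH (the k used in this post looks like an alias for it) and that the manifests registered the APIService under the name v1beta1.metrics.k8s.io, as the upstream deploy/1.8+ files do. The function name is hypothetical.

```shell
# Print the status of the "Available" condition of the Metrics APIService.
# Assumption: the APIService is named v1beta1.metrics.k8s.io.
metrics_api_available() {
  kubectl get apiservice v1beta1.metrics.k8s.io \
    -o jsonpath='{.status.conditions[?(@.type=="Available")].status}'
}

# Usage: metrics_api_available
```

A value of True means the API server can reach metrics-server. Anything else suggests looking at the APIService conditions before debugging kubectl top.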

Testing

metrics-server is now deployed, but querying node resource usage again just fails with a different error.

$ k top node
error: metrics not available yet

Check the metrics-server logs

$ k logs -n kube-system metrics-server-879f5ff6d-l2lxr
I0130 08:57:53.278112       1 serving.go:273] Generated self-signed cert (apiserver.local.config/certificates/apiserver.crt, apiserver.local.config/certificates/apiserver.key)
[restful] 2019/01/30 08:57:53 log.go:33: [restful/swagger] listing is available at https://:443/swaggerapi
[restful] 2019/01/30 08:57:53 log.go:33: [restful/swagger] https://:443/swaggerui/ is mapped to folder /swagger-ui/
I0130 08:57:53.942207       1 serve.go:96] Serving securely on [::]:443
E0130 08:57:58.338932       1 reststorage.go:129] unable to fetch node metrics for node "bs-02-yf-hw2288v3-11.wd.local": no metrics known for node

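This points to metrics-server being unable to resolve the hostname bs-02-yf-hw2288v3-11.wd.local from inside the cluster, even though the name resolves fine on the host itself.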

$ ping -c 1 bs-02-yf-hw2288v3-11.wd.local
PING bs-02-yf-hw2288v3-11.wd.local (192.168.50.11) 56(84) bytes of data.
64 bytes from bs-02-yf-hw2288v3-11.wd.local (192.168.50.11): icmp_seq=1 ttl=64 time=0.043 ms

--- bs-02-yf-hw2288v3-11.wd.local ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.043/0.043/0.043/0.000 ms
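
The ping above only shows that the host can resolve the name. To check resolution from inside the cluster, where metrics-server actually runs, one option is a throwaway pod. This is a hedged sketch: the function name and the pod name dns-test are hypothetical, it assumes kubectl is on PATH, and it assumes the busybox:1.28 image can be pulled by the cluster (a mirror may be needed, as with the metrics-server image).

```shell
# Run nslookup for a hostname from a temporary pod, then delete the pod.
# Assumptions: pod name "dns-test" is free, busybox:1.28 is pullable.
resolve_in_cluster() {
  kubectl run dns-test --rm -i --restart=Never --image=busybox:1.28 -- nslookup "$1"
}

# Usage: resolve_in_cluster bs-02-yf-hw2288v3-11.wd.local
```

If the lookup fails here while it succeeds on the host, the cluster DNS does not know the node's hostname.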

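So we work around the DNS problem in metrics-server itself: add --kubelet-preferred-address-types=InternalIP to its command, so that it connects to each kubelet by the node's internal IP rather than by hostname.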

$ k edit -n kube-system deployments.apps metrics-server
<several lines omitted>
       containers:
       - command:
         - /metrics-server
         - --kubelet-insecure-tls
         - --kubelet-preferred-address-types=InternalIP
         image: k8s.gcr.io/metrics-server-amd64:v0.3.1
<several lines omitted>
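
The same change can be made without opening an editor, which is handy for scripting. The following is a sketch under assumptions, not the method used in these notes: it relies on kubectl patch with a JSON Patch, assumes metrics-server is the first container in the pod template and already has a command list (as in the manifest above), and uses hypothetical function names.

```shell
# Build a JSON Patch that appends one flag to the first container's command list.
append_flag_patch() {
  printf '[{"op":"add","path":"/spec/template/spec/containers/0/command/-","value":"%s"}]' "$1"
}

# Apply the patch to the metrics-server Deployment in kube-system.
# Note: running this twice appends the flag twice.
add_metrics_server_flag() {
  kubectl -n kube-system patch deployment metrics-server \
    --type=json -p "$(append_flag_patch "$1")"
}

# Usage: add_metrics_server_flag --kubelet-preferred-address-types=InternalIP
```

Either way, changing the pod template triggers a rollout, so a new metrics-server pod starts with the extra flag.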

Save, then run the following command again

$ k top nodes
NAME                            CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%
bs-02-yf-hw2288v3-11.wd.local   178m         0%     5811Mi          9%

Everything works now. With metrics-server in place, HPA (Horizontal Pod Autoscaler) can be deployed freely.
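
As an illustration of what metrics-server enables, a CPU-based HPA can be created with kubectl autoscale. This is a minimal sketch and not from the original notes: the function name, the 50% target, and the 1 to 5 replica range are illustrative assumptions, and it assumes kubectl is on PATH. The target Deployment's containers need CPU requests, because the HPA computes utilization as a percentage of the requested CPU.

```shell
# Create a CPU-based HPA for a Deployment.
# $1: Deployment name, $2: namespace. Thresholds are illustrative only.
create_cpu_hpa() {
  kubectl -n "$2" autoscale deployment "$1" --cpu-percent=50 --min=1 --max=5
}

# Usage: create_cpu_hpa my-app default
```

Once created, kubectl get hpa shows the current utilization read from the Metrics API next to the target.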

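Why the error occurred: by default metrics-server connects to each node by its hostname, and CoreDNS cannot resolve that hostname to an IP. Besides the flag used above, two other fixes are possible: pass --node-name=$(hostname -i) when initializing the cluster so that the node is registered under its IP, or add a record for the host in the upstream DNS.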

References