Monitoring the Scheduler and Controller Manager with Kube Prometheus

Posted by BlueFat on Sunday, November 29, 2020

Monitoring the scheduler and controller-manager

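Problem: After installing kube-prometheus, most monitoring targets are configured correctly out of the box. Two are not: no scrape targets are discovered for the kube-controller-manager and kube-scheduler system components.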

Solution: On the master node, test connectivity to ports 10257 (kube-controller-manager) and 10259 (kube-scheduler):

curl -ik https://MASTER_IP:10257
curl -ik https://MASTER_IP:10259
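The two checks above can be wrapped in a small helper that only reports whether each port answers. This is a minimal sketch: `probe_port` is a hypothetical function name, and the `/healthz` path is an assumption about what the components serve. Any HTTP status, including 401 or 403, means the port is reachable.

```shell
# probe_port HOST PORT
# Prints the HTTP status code returned on the port, or "unreachable"
# if curl cannot connect within 5 seconds.
probe_port() {
  # -k skips certificate verification, -s silences progress output,
  # -w prints only the HTTP status code.
  if code=$(curl -ks -o /dev/null --max-time 5 -w '%{http_code}' "https://$1:$2/healthz"); then
    echo "port $2: HTTP $code"
  else
    echo "port $2: unreachable"
  fi
}
```

Run it as `probe_port MASTER_IP 10257` and `probe_port MASTER_IP 10259`, replacing MASTER_IP with the node address as in the commands above.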

If the ports are unreachable and the cluster was installed with kubeadm, edit the two YAML files below, changing - --bind-address=127.0.0.1 to - --bind-address=0.0.0.0:

  • /etc/kubernetes/manifests/kube-controller-manager.yaml
  • /etc/kubernetes/manifests/kube-scheduler.yaml

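The files in the manifests directory run in the cluster as static Pods. After you edit a YAML file in that directory, kubelet detects the change and restarts the corresponding component after a short wait.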
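The edit can also be scripted instead of made by hand. The following is a sketch only, assuming GNU sed and the kubeadm default manifest path. `open_bind_address`, `MANIFEST_DIR` and `BACKUP_DIR` are names introduced here for illustration. The backup is written outside the manifests directory so that kubelet does not try to load it as another static Pod.

```shell
# Assumed locations; override them through the environment if your layout differs.
MANIFEST_DIR="${MANIFEST_DIR:-/etc/kubernetes/manifests}"

open_bind_address() {
  # $1: path to a static Pod manifest.
  # Back up the file first, then rewrite the bind address in place.
  cp "$1" "${BACKUP_DIR:-/tmp}/$(basename "$1").bak" &&
    sed -i 's/--bind-address=127\.0\.0\.1/--bind-address=0.0.0.0/' "$1"
}

# Apply the change to both manifests, skipping any file that is absent.
for f in kube-controller-manager.yaml kube-scheduler.yaml; do
  if [ -f "$MANIFEST_DIR/$f" ]; then
    open_bind_address "$MANIFEST_DIR/$f"
  fi
done
```

Run it as root on each master node, then repeat the curl checks once the Pods have restarted.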

Add the kube-scheduler Service

cat << EOF > prometheus-kube-scheduler-svc.yaml
apiVersion: v1
kind: Service
metadata:
  namespace: kube-system
  name: kube-scheduler
  labels:
    app.kubernetes.io/name: kube-scheduler
spec:
  selector:
    component: kube-scheduler
  type: ClusterIP
  clusterIP: None
  ports:
  - name: https-metrics
    port: 10259
    targetPort: 10259
    protocol: TCP
EOF
kubectl apply -f prometheus-kube-scheduler-svc.yaml 

Add the kube-controller-manager Service

cat << EOF > prometheus-kube-controller-manager-svc.yaml
apiVersion: v1
kind: Service
metadata:
  namespace: kube-system
  name: kube-controller-manager
  labels:
    app.kubernetes.io/name: kube-controller-manager
spec:
  selector:
    component: kube-controller-manager
  type: ClusterIP
  clusterIP: None
  ports:
  - name: https-metrics
    port: 10257
    targetPort: 10257
    protocol: TCP
EOF
kubectl apply -f prometheus-kube-controller-manager-svc.yaml 

After a short wait, check the Prometheus targets page. If kube-controller-manager and kube-scheduler appear there, the setup is working. The KubeSchedulerDown and KubeControllerManagerDown alerts will then send Resolved notifications.
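If a target still does not appear, it may help to confirm from the command line that each Service selector matches a Pod and that the Service has endpoints. The sketch below assumes kubectl is configured for the cluster; `check_targets` is a hypothetical helper name.

```shell
# check_targets: for each component, list the Pods matched by the
# Service selector and the endpoints behind the Service.
check_targets() {
  for name in kube-scheduler kube-controller-manager; do
    echo "== $name"
    # Pods carrying the label used in the Service selector.
    kubectl -n kube-system get pods -l "component=$name" -o wide
    # Endpoints populated from those Pods.
    kubectl -n kube-system get endpoints "$name"
  done
}
```

Call `check_targets` after applying both Services. An empty endpoints list usually points to a selector that does not match the Pod labels.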