Monitoring kube-scheduler and kube-controller-manager
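Problem: After installing kube-prometheus, most monitoring targets work out of the box. Two of them, however, have no scrape targets associated with them: the system components kube-controller-manager and kube-scheduler.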
Solution: On the master node, test connectivity to ports 10257 (kube-controller-manager) and 10259 (kube-scheduler):
curl -ik https://MASTER_IP:10257
curl -ik https://MASTER_IP:10259
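The same check can be wrapped in a small helper that reports each port as reachable or not. This is only a sketch: the function name probe_port is made up for illustration, and it assumes the components serve /healthz on their secure ports. Any HTTP status, including 401 or 403, is treated as proof that the listener accepts connections.

```shell
# Hypothetical helper: probe one secure port on a host.
# curl prints 000 as the HTTP code when no connection could be made.
probe_port() {
  host="$1"
  port="$2"
  code=$(curl -sk --max-time 5 -o /dev/null -w '%{http_code}' \
    "https://$host:$port/healthz" 2>/dev/null) || code="000"
  if [ -n "$code" ] && [ "$code" != "000" ]; then
    echo "$host:$port reachable (HTTP $code)"
  else
    echo "$host:$port unreachable"
    return 1
  fi
}
```

For example, `probe_port MASTER_IP 10257` and `probe_port MASTER_IP 10259` cover the two ports used here.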
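If the ports are not reachable and the cluster was installed with kubeadm, edit the two YAML files below and change - --bind-address=127.0.0.1 to - --bind-address=0.0.0.0:
- /etc/kubernetes/manifests/kube-controller-manager.yaml
- /etc/kubernetes/manifests/kube-scheduler.yaml
The components defined in the manifests directory run in the cluster as static Pods. After you edit a YAML file in that directory, the kubelet picks up the change and the corresponding component restarts automatically after a short wait.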
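The edit can also be scripted. The sketch below is one possible way to do it, not the only one: the function name set_bind_address and the backup location are assumptions made for illustration, and it relies on GNU sed (sed -i without a suffix argument), which is what typical Linux nodes ship. Backups are written outside the manifests directory, since the kubelet may try to load stray files placed there.

```shell
# Hypothetical helper: switch --bind-address to 0.0.0.0 in both static Pod manifests.
# $1: manifests directory (defaults to the kubeadm path mentioned in the text)
# $2: backup directory (an assumption; must be outside the manifests directory)
set_bind_address() {
  manifest_dir="${1:-/etc/kubernetes/manifests}"
  backup_dir="${2:-/tmp}"
  for name in kube-controller-manager kube-scheduler; do
    file="$manifest_dir/$name.yaml"
    if [ ! -f "$file" ]; then
      echo "skip: $file not found" >&2
      continue
    fi
    # Keep a copy of the original with its permissions before editing in place.
    cp -p "$file" "$backup_dir/$name.yaml.bak"
    sed -i 's/--bind-address=127\.0\.0\.1/--bind-address=0.0.0.0/' "$file"
  done
}
```

Run it as root on each control-plane node by calling set_bind_address with no arguments, or pass a scratch directory first to try it on copies of the manifests.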
Add the kube-scheduler Service
cat << EOF > prometheus-kube-scheduler-svc.yaml
apiVersion: v1
kind: Service
metadata:
  namespace: kube-system
  name: kube-scheduler
  labels:
    app.kubernetes.io/name: kube-scheduler
spec:
  selector:
    component: kube-scheduler
  type: ClusterIP
  clusterIP: None
  ports:
  - name: https-metrics
    port: 10259
    targetPort: 10259
    protocol: TCP
EOF
kubectl apply -f prometheus-kube-scheduler-svc.yaml
Add the kube-controller-manager Service
cat << EOF > prometheus-kube-controller-manager-svc.yaml
apiVersion: v1
kind: Service
metadata:
  namespace: kube-system
  name: kube-controller-manager
  labels:
    app.kubernetes.io/name: kube-controller-manager
spec:
  selector:
    component: kube-controller-manager
  type: ClusterIP
  clusterIP: None
  ports:
  - name: https-metrics
    port: 10257
    targetPort: 10257
    protocol: TCP
EOF
kubectl apply -f prometheus-kube-controller-manager-svc.yaml
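Wait a moment, then check the Prometheus targets page. If the kube-controller-manager and kube-scheduler targets are listed, monitoring is working.
Afterwards you will receive Resolved notifications for the KubeSchedulerDown and KubeControllerManagerDown alerts.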