
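Adding a custom monitoring target to Prometheus Operator: monitoring an etcd cluster

Workflow for adding a custom monitoring target

  1. Create a ServiceMonitor object.
  2. Create a Service object that exposes the metrics endpoint, and associate it with the ServiceMonitor.
  3. Make sure the metrics data can actually be scraped through the Service.

Configuring the etcd certificates

First, find the certificate paths that etcd was started with: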

kubectl get po -n kube-system
...
etcd-k8s-master 1/1 Running 1 6h28m
...

kubectl get po etcd-k8s-master -n kube-system -o yaml
...
spec:
  containers:
  - command:
    - etcd
    - --advertise-client-urls=https://192.168.229.134:2379
    - --cert-file=/etc/kubernetes/pki/etcd/server.crt
    - --client-cert-auth=true
    - --data-dir=/var/lib/etcd
    - --initial-advertise-peer-urls=https://192.168.229.134:2380
    - --initial-cluster=k8s-master=https://192.168.229.134:2380
    - --key-file=/etc/kubernetes/pki/etcd/server.key
    - --listen-client-urls=https://127.0.0.1:2379,https://192.168.229.134:2379
    - --listen-peer-urls=https://192.168.229.134:2380
    - --name=k8s-master
    - --peer-cert-file=/etc/kubernetes/pki/etcd/peer.crt
    - --peer-client-cert-auth=true
    - --peer-key-file=/etc/kubernetes/pki/etcd/peer.key
    - --peer-trusted-ca-file=/etc/kubernetes/pki/etcd/ca.crt
    - --snapshot-count=10000
    - --trusted-ca-file=/etc/kubernetes/pki/etcd/ca.crt
    image: k8s.gcr.io/etcd:3.3.10
...

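The output shows that all the certificates etcd uses live under /etc/kubernetes/pki/etcd/ on the node that runs etcd. Start by storing the certificates Prometheus needs in the cluster as a Secret object (run the command on that node, since --from-file reads local files):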

kubectl -n monitoring create secret generic etcd-certs \
--from-file=/etc/kubernetes/pki/etcd/healthcheck-client.crt \
--from-file=/etc/kubernetes/pki/etcd/healthcheck-client.key \
--from-file=/etc/kubernetes/pki/etcd/ca.crt
secret/etcd-certs created

Next, reference the newly created etcd-certs Secret in the Prometheus resource object by editing it in place:

kubectl edit prometheus k8s -n monitoring

Add the secrets property to the spec, as shown here:

  nodeSelector:
    kubernetes.io/os: linux
  podMonitorSelector: {}
  replicas: 2
  # Add the following two lines
  secrets:
  - etcd-certs
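
The same change can also be made non-interactively. The following is a minimal sketch, assuming the Prometheus object is named k8s in the monitoring namespace as above; the helper names secrets_patch and patch_prometheus_secrets are made up for illustration. Note that a JSON merge patch replaces the whole secrets list, so every Secret that should stay mounted has to be passed.

```shell
# Build a JSON merge patch that sets spec.secrets on the Prometheus object.
# The function body runs in a subshell so its variables do not leak.
secrets_patch() (
  items=""
  for name in "$@"; do
    # Append each Secret name as a quoted JSON string, comma-separated.
    items="${items:+$items,}\"$name\""
  done
  printf '{"spec":{"secrets":[%s]}}' "$items"
)

# Apply the patch to the Prometheus object "k8s" in the "monitoring" namespace.
# Hypothetical helper: it only wraps `kubectl patch --type merge`.
patch_prometheus_secrets() {
  kubectl patch prometheus k8s -n monitoring --type merge -p "$(secrets_patch "$@")"
}
```

Calling patch_prometheus_secrets etcd-certs should then have the same effect as the manual edit.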

Once the update has been applied, the etcd certificate files created earlier are available inside the Prometheus Pods. First look up the Pod names:

kubectl get po -n monitoring 
NAME READY STATUS RESTARTS AGE
...
prometheus-k8s-0 3/3 Running 1 2m20s
prometheus-k8s-1 3/3 Running 1 3m19s
...

Open a shell in each of the two Pods and check where the certificates are mounted:

kubectl exec -it prometheus-k8s-0 -n monitoring -- /bin/sh
Defaulting container name to prometheus.
Use 'kubectl describe pod/prometheus-k8s-0 -n monitoring' to see all of the containers in this pod.
/prometheus $ ls /etc/prometheus/secrets/etcd-certs/
ca.crt healthcheck-client.crt healthcheck-client.key
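
To run the same check on both replicas without an interactive shell, something like the following could be used. It is only a sketch: list_etcd_certs is a hypothetical helper, and it assumes the container is named prometheus, as the "Defaulting container name" message above suggests.

```shell
# List the mounted etcd certificate files in each Prometheus Pod given as an argument.
# -c prometheus selects the container explicitly instead of relying on the default.
list_etcd_certs() {
  for pod in "$@"; do
    echo "== $pod"
    kubectl exec "$pod" -n monitoring -c prometheus -- ls /etc/prometheus/secrets/etcd-certs/
  done
}
```

For the two replicas listed earlier, the call would be list_etcd_certs prometheus-k8s-0 prometheus-k8s-1.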

Creating the ServiceMonitor

Create the file prometheus-serviceMonitorEtcd.yaml:

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  labels:
    k8s-app: etcd-k8s
  name: etcd-k8s
  namespace: monitoring
spec:
  endpoints:
  - port: port
    interval: 30s
    scheme: https
    tlsConfig:
      caFile: /etc/prometheus/secrets/etcd-certs/ca.crt
      certFile: /etc/prometheus/secrets/etcd-certs/healthcheck-client.crt
      keyFile: /etc/prometheus/secrets/etcd-certs/healthcheck-client.key
      insecureSkipVerify: true
  jobLabel: k8s-app
  namespaceSelector:
    matchNames:
    - kube-system
  selector:
    matchLabels:
      k8s-app: etcd

Create the ServiceMonitor object:

kubectl apply -f prometheus-serviceMonitorEtcd.yaml 
servicemonitor.monitoring.coreos.com/etcd-k8s created

Creating the Service

With the ServiceMonitor in place, a matching Service object is needed. The contents of prometheus-etcdService.yaml are as follows:

apiVersion: v1
kind: Service
metadata:
  labels:
    k8s-app: etcd
  name: etcd-k8s
  namespace: kube-system
spec:
  ports:
  - name: port
    port: 2379
    protocol: TCP
  type: ClusterIP
  clusterIP: None
---
apiVersion: v1
kind: Endpoints
metadata:
  name: etcd-k8s
  namespace: kube-system
  labels:
    k8s-app: etcd
subsets:
- addresses:
  - ip: 192.168.229.134
    nodeName: etcd-master
#  - ip: 192.168.229.135
#    nodeName: etcd02
#  - ip: 192.168.229.136
#    nodeName: etcd03
  ports:
  - name: port
    port: 2379
    protocol: TCP

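etcd is treated here as a service that lives outside the cluster: the Service has no selector, so an Endpoints object has to be defined by hand. The metadata of the Endpoints (name and namespace) must match that of the Service, and the Service's clusterIP is set to None.

Put the etcd addresses in the subsets of the Endpoints. For a multi-member etcd cluster, list each member as an additional entry under addresses.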

Create the Service and Endpoints resources:

kubectl apply -f prometheus-etcdService.yaml 
service/etcd-k8s created
endpoints/etcd-k8s created
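
Before looking for the target in Prometheus, it can help to confirm that the metrics endpoint answers with the client certificate and that the Service resolves to the expected address. The sketch below assumes it is run on the master node, where the certificate files from the Secret step still exist, and that curl is installed there. The helper names are made up for illustration.

```shell
# Directory holding the etcd client certificates on the master node.
ETCD_PKI=/etc/kubernetes/pki/etcd

# Build the metrics URL for one etcd member; port 2379 matches the Endpoints definition.
etcd_metrics_url() {
  printf 'https://%s:2379/metrics' "$1"
}

# Fetch the first few metric lines, presenting the same client certificate
# that Prometheus is configured to use in the ServiceMonitor.
check_etcd_metrics() {
  curl -s --cacert "$ETCD_PKI/ca.crt" \
       --cert "$ETCD_PKI/healthcheck-client.crt" \
       --key "$ETCD_PKI/healthcheck-client.key" \
       "$(etcd_metrics_url "$1")" | head -n 5
}

# Show the addresses currently registered behind the etcd-k8s Service.
show_etcd_endpoints() {
  kubectl get endpoints etcd-k8s -n kube-system -o wide
}
```

For the single-node setup in this article, check_etcd_metrics 192.168.229.134 should print a handful of metric lines if the certificates and address are right.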

Once data is being collected, import the dashboard with ID 3070 into Grafana.