Fully Manual Deployment of Kubernetes Cluster Monitoring

Overall Goals

Judging from the monitoring platform's own business requirements, it should at minimum provide the following monitoring data:

Performance metrics (e.g. CPU, memory, load, disk, network)

  • Performance metrics for containers and Pods
  • Performance metrics for the host nodes
  • Metrics that processes inside containers expose themselves
  • Network performance of applications on k8s, such as HTTP and TCP data

Status metrics

  • Status of k8s resource objects (Deployment, DaemonSet, Pod, etc.)
  • Status of k8s platform components (kube-apiserver, kube-scheduler, etc.)

Beyond collecting the data, the platform also needs to visualize it and raise alerts when anomalies appear.

Mainstream Solutions

The mainstream monitoring solutions for Kubernetes today are roughly the following:

Heapster+InfluxDB+Grafana

The kubelet on every K8S node embeds cAdvisor, which exposes an API; Heapster collects container metrics by querying these endpoints and supports several storage backends, InfluxDB being the most common. The drawbacks of this approach are a single data source, no alerting, and InfluxDB as a single point of failure; moreover, Heapster has been deprecated in newer versions (replaced by metrics-server).

Metrics-Server+InfluxDB+Grafana

Starting with Kubernetes 1.8, resource metrics such as CPU and memory are available through the Metrics API, and users can query them directly with kubectl top. The Metrics API requires Metrics-Server to be deployed.
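
For reference, Metrics-Server registers itself with the apiserver through an APIService object along the lines of the sketch below (based on the upstream metrics-server manifests; exact field values vary between versions), after which kubectl top and the Metrics API become available:

apiVersion: apiregistration.k8s.io/v1
kind: APIService
metadata:
  name: v1beta1.metrics.k8s.io
spec:
  service:
    name: metrics-server        # the metrics-server Service
    namespace: kube-system
  group: metrics.k8s.io
  version: v1beta1
  insecureSkipTLSVerify: true   # common in lab clusters; use proper CA configuration in production
  groupPriorityMinimum: 100
  versionPriority: 100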

各种Exporter+Prometheus+Grafana

Various exporters collect metrics of different kinds and expose them in the Prometheus data format; Prometheus pulls the data on a schedule, Grafana visualizes it, and Alertmanager raises alerts on anomalies. This is the solution described in detail below.

Architecture

The overall approach is as follows:

[architecture diagrams]

Collection

  • cAdvisor collects container and Pod performance metrics and exposes them on a /metrics endpoint for Prometheus to scrape

  • prometheus-node-exporter collects host performance metrics and exposes them on a /metrics endpoint for Prometheus to scrape

  • Applications collect and expose metrics for their own in-container processes themselves (the application implements the exposure and adds the annotations agreed upon with the platform; the platform configures Prometheus scraping based on those annotations)

  • blackbox-exporter collects application network performance (HTTP, TCP, ICMP, etc.) and exposes it on a /metrics endpoint for Prometheus to scrape

  • kube-state-metrics collects the status of k8s resource objects and exposes it on a /metrics endpoint for Prometheus to scrape

  • etcd, kubelet, kube-apiserver, kube-controller-manager, and kube-scheduler expose their own /metrics endpoints, providing cluster-related metrics from each node.

Storage (aggregation): Prometheus pulls and aggregates the monitoring data from all exporters.

Visualization: Grafana displays the monitoring data.
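
For illustration only, assuming Grafana's file-based datasource provisioning is used and that Prometheus is reachable in-cluster at prometheus.monitoring:9090 (matching the Prometheus Service defined later in this article), a datasource definition could look like:

apiVersion: 1
datasources:
  - name: Prometheus
    type: prometheus
    access: proxy                           # Grafana proxies queries to Prometheus
    url: http://prometheus.monitoring:9090  # assumed in-cluster address of the Prometheus Service
    isDefault: true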

Alerting: Alertmanager handles alerting.
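
As a sketch of the shape of an Alertmanager routing configuration (the receiver webhook URL below is a placeholder, not something defined in this article), an alertmanager.yml might look like:

global:
  resolve_timeout: 5m
route:
  receiver: default
  group_by: ['alertname', 'namespace']   # batch related alerts together
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 4h
receivers:
  - name: default
    webhook_configs:
      - url: http://alert-webhook.monitoring:8080/alerts   # placeholder receiver endpoint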

Implementing Metrics Collection

Container and Pod performance metrics — cAdvisor

cAdvisor is an open-source container monitoring tool from Google. It collects container-level performance metrics on each host, from which Pod-level metrics can be derived.

The main metrics cAdvisor provides include:

container_cpu_*	
container_fs_*
container_memory_*
container_network_*
container_spec_*(cpu/memory)
container_start_time_*
container_tasks_state_*

cAdvisor endpoints

cAdvisor is now built into the kubelet, so on every node running a kubelet you can use cAdvisor's metrics endpoint to obtain performance metrics for all containers on that node. Before 1.7.3 the cAdvisor metrics were merged into the kubelet's own metrics; from 1.7.3 onward they are exposed separately, which means Prometheus scrapes them as two separate jobs.

cAdvisor serves on port 4194 by default and provides two main interfaces:

  • Prometheus-format metrics: nodeIP:4194/metrics (or, via the kubelet, nodeIP:10255/metrics/cadvisor);

  • Web UI: nodeIP:4194/containers/

Prometheus is a service for collecting, processing, and storing time series; anything it monitors must expose data over an HTTP API in a data model Prometheus understands. The cAdvisor endpoint (nodeIP:4194/metrics) exposes metrics such as the following:

# HELP cadvisor_version_info A metric with a constant '1' value labeled by kernel version, OS version, docker version, cadvisor version & cadvisor revision.
# TYPE cadvisor_version_info gauge
cadvisor_version_info{cadvisorRevision="",cadvisorVersion="",dockerVersion="1.12.6",kernelVersion="4.9.0-1.2.el7.bclinux.x86_64",osVersion="CentOS Linux 7 (Core)"} 1
# HELP container_cpu_cfs_periods_total Number of elapsed enforcement period intervals.
# TYPE container_cpu_cfs_periods_total counter
container_cpu_cfs_periods_total{container_name="",id="/kubepods/burstable/pod1b0c1f83322defae700f33b1b8b7f572",image="",name="",namespace="",pod_name=""} 7.062239e+06
container_cpu_cfs_periods_total{container_name="",id="/kubepods/burstable/pod7f86ba308f28df9915b802bc48cfee3a",image="",name="",namespace="",pod_name=""} 1.574206e+06
container_cpu_cfs_periods_total{container_name="",id="/kubepods/burstable/podb0c8f695146fe62856bc23709a3e056b",image="",name="",namespace="",pod_name=""} 7.107043e+06
container_cpu_cfs_periods_total{container_name="",id="/kubepods/burstable/podc8cf73836b3caba7bf952ce1ac5a5934",image="",name="",namespace="",pod_name=""} 5.932159e+06
container_cpu_cfs_periods_total{container_name="",id="/kubepods/burstable/podfaa9db59-64b7-11e8-8792-00505694eb6a",image="",name="",namespace="",pod_name=""} 6.979547e+06
container_cpu_cfs_periods_total{container_name="calico-node",id="/kubepods/burstable/podfaa9db59-64b7-11e8-8792-
...

Prometheus configuration

- job_name: 'cadvisor'
# Go through the apiserver over HTTPS and fetch the data via its proxy API
scheme: https
tls_config:
ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token

#Discover targets by Kubernetes role (node, service, pod, endpoints, ingress, etc.)
kubernetes_sd_configs:
# Discover from the cluster's node objects
- role: node

relabel_configs:
# labelmap rewrites the label-name prefix; with no replacement it simply strips the prefix.
# For example, the two lines below turn __meta_kubernetes_node_label_kubernetes_io_hostname
# into kubernetes_io_hostname
- action: labelmap
regex: __meta_kubernetes_node_label_(.+)

# The replacement value overwrites the label named by target_label,
# i.e. __address__ becomes kubernetes.default.svc:443
- target_label: __address__
replacement: kubernetes.default.svc:443

# Capture the value of __meta_kubernetes_node_name
- source_labels: [__meta_kubernetes_node_name]
#Match one or more characters, capturing the source_labels value as ${1}
regex: (.+)
# The replacement value overwrites the label named by target_label,
# i.e. __metrics_path__ becomes /api/v1/nodes/${1}/proxy/metrics/cadvisor,
# where ${1} is the captured __meta_kubernetes_node_name
target_label: __metrics_path__
replacement: /api/v1/nodes/${1}/proxy/metrics/cadvisor

metric_relabel_configs:
- action: replace
source_labels: [id]
regex: '^/machine\.slice/machine-rkt\\x2d([^\\]+)\\.+/([^/]+)\.service$'
target_label: rkt_container_name
replacement: '${2}-${1}'
- action: replace
source_labels: [id]
regex: '^/system\.slice/(.+)\.service$'
target_label: systemd_service_name
replacement: '${1}'

After this, the corresponding cadvisor target shows up on Prometheus's targets page (IP:Port/targets).

PS:

A few notes on the configuration file:

  1. The configuration above follows the official example: the cAdvisor metrics are fetched through the apiserver's proxy API ( https://kubernetes.default.svc:443/api/v1/nodes/k8smaster01/proxy/metrics/cadvisor — the content is identical to what nodeIP:4194/metrics returns) rather than directly from the node. The official rationale: "This means it will work if Prometheus is running out of cluster, or can't connect to nodes for some other reason (e.g. because of firewalling)". A direct-scrape alternative is sketched below.
  2. Running inside the K8S cluster, Prometheus reaches the apiserver via the DNS name https://kubernetes.default.svc to scrape the data.
  3. Prometheus configuration syntax is fairly involved; I have added comments to aid understanding. See the official Prometheus documentation for the full rules.
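
If Prometheus can reach the nodes directly, a simpler static alternative is to scrape the kubelet-exposed cAdvisor endpoint on each node; a minimal sketch (node addresses are placeholders):

- job_name: 'cadvisor-direct'
  metrics_path: /metrics/cadvisor
  static_configs:
    - targets: ['192.168.0.11:10255', '192.168.0.12:10255']   # placeholder node IPs, kubelet read-only port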

A few notes on target labels. Every target always carries these source labels:

  1. __address__ (set manually via targets under static_configs; populated from the apiserver under kubernetes_sd_configs)
  2. __metrics_path__ (defaults to /metrics)
  3. __scheme__ (defaults to http)
  4. job

The remaining source labels are extracted from the labels, annotations, and other metadata of the Kubernetes resource objects, depending on the role set under kubernetes_sd_configs (endpoints, node, service, pod, etc.).
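
To make the relabeling mechanics concrete, here is a small self-contained sketch (the hostname value is illustrative): discovery produces __meta_* source labels, and relabel_configs decides what ends up on the target.

# A node carrying the Kubernetes label kubernetes.io/hostname=k8snode01 is discovered as
#   __meta_kubernetes_node_label_kubernetes_io_hostname = "k8snode01"
relabel_configs:
  # labelmap strips the __meta_kubernetes_node_label_ prefix, leaving
  #   kubernetes_io_hostname="k8snode01" on the scraped series
  - action: labelmap
    regex: __meta_kubernetes_node_label_(.+)
  # replace-style rules copy a captured value into another label, here the scrape path
  - source_labels: [__meta_kubernetes_node_name]
    regex: (.+)
    target_label: __metrics_path__
    replacement: /api/v1/nodes/${1}/proxy/metrics/cadvisor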

Host node performance metrics — node-exporter

The Prometheus community's NodeExporter project monitors the key metrics of a host. Deployed as a Kubernetes DaemonSet, exactly one NodeExporter instance runs on each node, covering host performance metrics. The main metrics node-exporter collects are:

node_cpu_*		
node_disk_*
node_entropy_*
node_filefd_*
node_filesystem_*
node_forks_*
node_intr_total_*
node_ipvs_*
node_load_*
node_memory_*
node_netstat_*
node_network_*
node_nf_conntrack_*
node_scrape_*
node_sockstat_*
node_time_seconds_*
node_timex_*
node_xfs_*

Deployment

node-exporter-daemonset.yaml, node-exporter-svc.yaml

apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
name: prometheus-node-exporter
namespace: monitoring
labels:
k8s-app: prometheus-node-exporter
spec:
template:
metadata:
name: prometheus-node-exporter
labels:
k8s-app: prometheus-node-exporter
spec:
containers:
- image: prom/node-exporter:v0.18.0
imagePullPolicy: IfNotPresent
name: prometheus-node-exporter
ports:
- name: prom-node-exp
containerPort: 9100
hostPort: 9100
livenessProbe:
failureThreshold: 3
httpGet:
path: /
port: 9100
scheme: HTTP
readinessProbe:
failureThreshold: 3
httpGet:
path: /
port: 9100
scheme: HTTP
resources:
limits:
cpu: 20m
memory: 2Gi
requests:
cpu: 10m
memory: 1Gi
dnsPolicy: ClusterFirst
hostNetwork: true
hostPID: true

---
apiVersion: v1
kind: Service
metadata:
annotations:
prometheus.io/scrape: 'true'
prometheus.io/app-metrics: 'true'
prometheus.io/app-metrics-path: '/metrics'
name: prometheus-node-exporter
namespace: monitoring
labels:
k8s-app: prometheus-node-exporter
spec:
ports:
- name: prometheus-node-exporter
port: 9100
protocol: TCP
selector:
k8s-app: prometheus-node-exporter
type: ClusterIP

PS:

1. So that the node-exporter container can read the host's network, PID, and IPC metrics, set hostNetwork: true, hostPID: true, and hostIPC: true so it shares those namespaces with the host (the manifest above sets the first two; add hostIPC: true if IPC metrics are needed).

2. The annotation prometheus.io/scrape: 'true' on the Service marks it as something Prometheus should discover and scrape.

Looking at the metrics endpoint NodeExporter exposes (nodeIP:9100/metrics), the data is output in Prometheus format:

# HELP node_arp_entries ARP entries by device
# TYPE node_arp_entries gauge
node_arp_entries{device="calid63983a5754"} 1
node_arp_entries{device="calid67ce395c9e"} 1
node_arp_entries{device="calid857f2bf9d5"} 1
node_arp_entries{device="calief3a4b64165"} 1
node_arp_entries{device="eno16777984"} 9
# HELP node_boot_time Node boot time, in unixtime.
# TYPE node_boot_time gauge
node_boot_time 1.527752719e+09
# HELP node_context_switches Total number of context switches.
# TYPE node_context_switches counter
node_context_switches 3.1425612674e+10
# HELP node_cpu Seconds the cpus spent in each mode.
# TYPE node_cpu counter
node_cpu{cpu="cpu0",mode="guest"} 0
node_cpu{cpu="cpu0",mode="guest_nice"} 0
node_cpu{cpu="cpu0",mode="idle"} 2.38051096e+06
node_cpu{cpu="cpu0",mode="iowait"} 11904.19
node_cpu{cpu="cpu0",mode="irq"} 0
node_cpu{cpu="cpu0",mode="nice"} 2990.94
node_cpu{cpu="cpu0",mode="softirq"} 8038.3
...

Prometheus configuration

- job_name: 'prometheus-node-exporter'

tls_config:
ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
kubernetes_sd_configs:
#The endpoints role discovers targets from listed endpoints of a service. For each
#endpoint address one target is discovered per port. If the endpoint is backed by
#a pod, all additional container ports of the pod, not bound to an endpoint port,
#are discovered as targets as well
- role: endpoints
relabel_configs:
# Keep only endpoints whose Service is annotated prometheus.io/scrape: 'true' and whose port is named prometheus-node-exporter
- source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scrape, __meta_kubernetes_endpoint_port_name]
regex: true;prometheus-node-exporter
action: keep
# Match regex against the concatenated source_labels. Then, set target_label to replacement,
# with match group references (${1}, ${2}, ...) in replacement substituted by their value.
# If regex does not match, no replacement takes place.
- source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scheme]
action: replace
target_label: __scheme__
regex: (https?)
- source_labels: [__meta_kubernetes_service_annotation_prometheus_io_path]
action: replace
target_label: __metrics_path__
regex: (.+)
- source_labels: [__address__, __meta_kubernetes_service_annotation_prometheus_io_port]
action: replace
target_label: __address__
regex: (.+)(?::\d+);(\d+)
replacement: $1:$2
# Strip the __meta_kubernetes_service_label_ prefix from label names
- action: labelmap
regex: __meta_kubernetes_service_label_(.+)
# Rename __meta_kubernetes_namespace to kubernetes_namespace
- source_labels: [__meta_kubernetes_namespace]
action: replace
target_label: kubernetes_namespace
# Rename __meta_kubernetes_service_name to kubernetes_name
- source_labels: [__meta_kubernetes_service_name]
action: replace
target_label: kubernetes_name

Collecting metrics exposed by a process inside an application instance

Some applications need to expose performance metrics for specific processes inside their containers. The application itself collects and exposes these metrics; the platform aggregates them.

Identifying applications that expose their own metrics, and scraping them

The platform can agree on a set of annotation prefixes that mark a Service as exposing its own metrics. The application adds these agreed annotations, and the platform configures Prometheus scraping based on them.

For example, the application adds the following agreed-upon annotations to its Service:

prometheus.io/scrape: 'true'
prometheus.io/app-metrics: 'true'
prometheus.io/app-metrics-port: '8080'
prometheus.io/app-metrics-path: '/metrics'

Prometheus can then:

  • learn from prometheus.io/scrape: 'true' that the corresponding endpoint should be scraped
  • learn from prometheus.io/app-metrics: 'true' that the endpoint carries metrics exposed by the application's own process
  • learn the metrics port from prometheus.io/app-metrics-port: '8080'
  • learn the metrics path from prometheus.io/app-metrics-path: '/metrics'
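
Put together, an application Service carrying these annotations might look like the following (the name, namespace, and port are made up for illustration):

apiVersion: v1
kind: Service
metadata:
  name: test-app                # hypothetical application
  namespace: test
  annotations:
    prometheus.io/scrape: 'true'
    prometheus.io/app-metrics: 'true'
    prometheus.io/app-metrics-port: '8080'
    prometheus.io/app-metrics-path: '/metrics'
spec:
  selector:
    app: test-app
  ports:
    - name: http
      port: 8080
      targetPort: 8080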

Attaching identifying information to an application and carrying it into Prometheus

Depending on platform and business needs, further annotations prefixed with prometheus.io/app-info- may be added. Prometheus strips the prefix, keeps the remainder as the label key, and carries the value along, which lets the platform attach arbitrary identifiers to an application. For example, the following annotations mark the application's environment, tenant, and name:

prometheus.io/app-info-env: 'test'
prometheus.io/app-info-tenant: 'test-tenant'
prometheus.io/app-info-name: 'test-app'

Prometheus configuration

- job_name: 'kubernetes-app-metrics'
tls_config:
ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
kubernetes_sd_configs:
#The endpoints role discovers targets from listed endpoints of a service. For each
#endpoint address one target is discovered per port. If the endpoint is backed by
#a pod, all additional container ports of the pod, not bound to an endpoint port,
#are discovered as targets as well
- role: endpoints
relabel_configs:
# Keep only endpoints whose Service carries both prometheus.io/scrape: 'true' and prometheus.io/app-metrics: 'true'
- source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scrape, __meta_kubernetes_service_annotation_prometheus_io_app_metrics]
regex: true;true
action: keep
# Replace the default metrics_path with the path the application declared
- source_labels: [__meta_kubernetes_service_annotation_prometheus_io_app_metrics_path]
action: replace
target_label: __metrics_path__
regex: (.+)
# Rebuild __address__ from the pod IP plus the application-declared metrics port so the data can actually be fetched
- source_labels: [__meta_kubernetes_pod_ip, __meta_kubernetes_service_annotation_prometheus_io_app_metrics_port]
action: replace
target_label: __address__
regex: (.+);(.+)
replacement: $1:$2
# Strip the __meta_kubernetes_service_annotation_prometheus_io_app_info_ prefix from label names
- action: labelmap
regex: __meta_kubernetes_service_annotation_prometheus_io_app_info_(.+)

PS:

The last two lines turn an annotation such as prometheus.io/app-info-tenant into a label named tenant.

Collecting application network performance data with blackbox-exporter

blackbox-exporter is a black-box probing tool that can probe services over HTTP, TCP, ICMP, and so on.

Deploying blackbox-exporter

blackbox-exporter-deploy.yml

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
name: prometheus-blackbox-exporter
namespace: monitoring
labels:
k8s-app: prometheus-blackbox-exporter
spec:
selector:
matchLabels:
k8s-app: prometheus-blackbox-exporter
replicas: 1
template:
metadata:
labels:
k8s-app: prometheus-blackbox-exporter
spec:
restartPolicy: Always
containers:
- name: prometheus-blackbox-exporter
image: prom/blackbox-exporter:v0.16.0
imagePullPolicy: IfNotPresent
ports:
- name: blackbox-port
containerPort: 9115
readinessProbe:
tcpSocket:
port: 9115
initialDelaySeconds: 5
timeoutSeconds: 5
resources:
requests:
memory: 50Mi
cpu: 50m
limits:
memory: 60Mi
cpu: 100m
volumeMounts:
- name: config
mountPath: /etc/blackbox_exporter
args:
- --config.file=/etc/blackbox_exporter/blackbox.yml
- --log.level=debug
- --web.listen-address=:9115
volumes:
- name: config
configMap:
name: prometheus-blackbox-exporter

black-exporter-config.yml

apiVersion: v1
kind: ConfigMap
metadata:
labels:
k8s-app: prometheus-blackbox-exporter
name: prometheus-blackbox-exporter
namespace: monitoring
data:
blackbox.yml: |-
modules:
http_2xx:
prober: http
timeout: 10s
http:
valid_http_versions: ["HTTP/1.1", "HTTP/2"]
valid_status_codes: []
method: GET
preferred_ip_protocol: "ip4"
http_post_2xx: # HTTP POST probe module
prober: http
timeout: 10s
http:
valid_http_versions: ["HTTP/1.1", "HTTP/2"]
method: POST
preferred_ip_protocol: "ip4"
tcp_connect:
prober: tcp
timeout: 10s
icmp:
prober: icmp
timeout: 10s
icmp:
preferred_ip_protocol: "ip4"

black-exporter-svc.yml

---
apiVersion: v1
kind: Service
metadata:
labels:
k8s-app: prometheus-blackbox-exporter
name: prometheus-blackbox-exporter
namespace: monitoring
annotations:
prometheus.io/scrape: 'true'
spec:
type: ClusterIP
selector:
k8s-app: prometheus-blackbox-exporter
ports:
- name: blackbox
port: 9115
targetPort: 9115
protocol: TCP

PS:

The blackbox-exporter configuration file is /etc/blackbox_exporter/blackbox.yml. It can be reloaded dynamically at runtime; if the reload fails, the running configuration is left untouched. Reload with: curl -XPOST http://IP:9115/-/reload

Prometheus configuration

Configure the HTTP and TCP probes separately in the Prometheus config file:

- job_name: 'kubernetes-service-http-probe'
tls_config:
ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
kubernetes_sd_configs:
- role: service
# Change metrics_path from the default /metrics to /probe
metrics_path: /probe
# Optional HTTP URL parameters.
# Produces the label __param_module="http_2xx"
params:
module: [http_2xx]
relabel_configs:
# Keep only services annotated with prometheus.io/scrape: 'true' and prometheus.io/http-probe: 'true'
- source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scrape, __meta_kubernetes_service_annotation_prometheus_io_http_probe]
regex: true;true
action: keep
- source_labels: [__meta_kubernetes_service_name, __meta_kubernetes_namespace, __meta_kubernetes_service_annotation_prometheus_io_http_probe_port, __meta_kubernetes_service_annotation_prometheus_io_http_probe_path]
action: replace
target_label: __param_target
regex: (.+);(.+);(.+);(.+)
replacement: $1.$2:$3$4
# (Alternative) build __param_target from __address__, i.e. the in-cluster service address, for blackbox-exporter to probe
#- source_labels: [__address__, __meta_kubernetes_service_annotation_prometheus_io_http_probe_path]
# action: replace
# target_label: __param_target
# regex: (.+);(.+)
# replacement: $1$2
# Replace __address__ with the blackbox-exporter service address "prometheus-blackbox-exporter:9115"
- target_label: __address__
replacement: prometheus-blackbox-exporter:9115
- source_labels: [__param_target]
target_label: instance
# Strip the __meta_kubernetes_service_annotation_prometheus_io_app_info_ prefix from label names
- action: labelmap
regex: __meta_kubernetes_service_annotation_prometheus_io_app_info_(.+)
#- source_labels: [__meta_kubernetes_namespace]
# target_label: kubernetes_namespace
#- source_labels: [__meta_kubernetes_service_name]
# target_label: kubernetes_name
## kubernetes-services and kubernetes-ingresses are blackbox_exporter related

# Example scrape config for probing services via the Blackbox Exporter.
#
# The relabeling allows the actual service scrape endpoint to be configured
# for all or only some services.
- job_name: 'kubernetes-service-tcp-probe'
tls_config:
ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
kubernetes_sd_configs:
- role: service
# Change metrics_path from the default /metrics to /probe
metrics_path: /probe
# Optional HTTP URL parameters.
# Produces the label __param_module="tcp_connect"
params:
module: [tcp_connect]
relabel_configs:
# Keep only services annotated with prometheus.io/scrape: 'true' and prometheus.io/tcp-probe: 'true'
- source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scrape, __meta_kubernetes_service_annotation_prometheus_io_tcp_probe]
regex: true;true
action: keep
- source_labels: [__meta_kubernetes_service_name, __meta_kubernetes_namespace, __meta_kubernetes_service_annotation_prometheus_io_tcp_probe_port]
action: replace
target_label: __param_target
regex: (.+);(.+);(.+)
replacement: $1.$2:$3
# (Alternative) build __param_target from __address__, the in-cluster service address, for blackbox-exporter to probe
#- source_labels: [__address__]
# target_label: __param_target
# Replace __address__ with the blackbox-exporter service address "prometheus-blackbox-exporter:9115"
- target_label: __address__
replacement: prometheus-blackbox-exporter:9115
- source_labels: [__param_target]
target_label: instance
# Strip the __meta_kubernetes_service_annotation_prometheus_io_app_info_ prefix from label names
- action: labelmap
regex: __meta_kubernetes_service_annotation_prometheus_io_app_info_(.+)

Application-side configuration

An application can add the platform-agreed annotations to its Service so that the monitoring platform probes its network endpoints:

HTTP probe

prometheus.io/scrape: 'true'
prometheus.io/http-probe: 'true'
prometheus.io/http-probe-port: '8080'
prometheus.io/http-probe-path: '/healthz'

TCP probe

prometheus.io/scrape: 'true'
prometheus.io/tcp-probe: 'true'
prometheus.io/tcp-probe-port: '80'

From these annotations Prometheus learns that the Service should be probed, whether the probe is HTTP, TCP, or something else, and which port to probe; for HTTP probes it also needs the URL path.
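
For example, a Service that asks the platform to HTTP-probe its /healthz endpoint could look like this (name, namespace, and port are illustrative); with the relabeling above, __param_target becomes test-app.test:8080/healthz and blackbox-exporter probes that address:

apiVersion: v1
kind: Service
metadata:
  name: test-app                # hypothetical application
  namespace: test
  annotations:
    prometheus.io/scrape: 'true'
    prometheus.io/http-probe: 'true'
    prometheus.io/http-probe-port: '8080'
    prometheus.io/http-probe-path: '/healthz'
spec:
  selector:
    app: test-app
  ports:
    - port: 8080
      targetPort: 8080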

Status of resource objects (Deployment, Pod, etc.) — kube-state-metrics

kube-state-metrics collects status information about the various Kubernetes resource objects:

kube_daemonset_* (creation time, phase, desired node count, nodes that should be running the daemon pod, nodes running it that should not, nodes whose pods are ready)
kube_deployment_* (creation time, whether k8s labels are converted into Prometheus labels, phase, whether it is paused and ignored by the deployment controller, desired replicas, max unavailable replicas during a rolling update, generation observed by the controller, actual replicas, available replicas, unavailable replicas, updated replicas)
kube_job_* (whether it completed, creation timestamp, ...)
kube_namespace_*
kube_node_*
kube_persistentvolumeclaim_*
kube_pod_container_*
kube_pod_*
kube_replicaset_*
kube_service_*
kube_statefulset_*

Deployment

kube-state-metrics-deploy.yml

apiVersion: apps/v1
kind: Deployment
metadata:
name: kube-state-metrics
namespace: monitoring
spec:
selector:
matchLabels:
k8s-app: kube-state-metrics
replicas: 1
template:
metadata:
labels:
k8s-app: kube-state-metrics
spec:
serviceAccountName: prometheus
containers:
- name: kube-state-metrics
image: bitnami/kube-state-metrics:1.8.0
ports:
- protocol: TCP
containerPort: 8080
readinessProbe:
httpGet:
path: /healthz
port: 8080
initialDelaySeconds: 5
timeoutSeconds: 5
livenessProbe:
failureThreshold: 3
httpGet:
path: /healthz
port: 8080
scheme: HTTP
initialDelaySeconds: 5
periodSeconds: 10
successThreshold: 1
timeoutSeconds: 5
resources:
limits:
cpu: 50m
memory: 1Gi
requests:
cpu: 20m
memory: 512Mi

kube-state-metrics-svc.yml

apiVersion: v1
kind: Service
metadata:
name: kube-state-metrics
namespace: monitoring
labels:
k8s-app: kube-state-metrics
annotations:
prometheus.io/scrape: 'true'
spec:
ports:
- name: kube-state-metrics
port: 8080
targetPort: 8080
protocol: TCP
selector:
k8s-app: kube-state-metrics

Prometheus configuration

- job_name: 'kube-state-metrics'

tls_config:
ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
kubernetes_sd_configs:

#The endpoints role discovers targets from listed endpoints of a service. For each
#endpoint address one target is discovered per port. If the endpoint is backed by
#a pod, all additional container ports of the pod, not bound to an endpoint port,
#are discovered as targets as well
- role: endpoints
relabel_configs:
# Keep only endpoints whose Service is annotated prometheus.io/scrape: 'true' and whose port is named kube-state-metrics
- source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scrape,__meta_kubernetes_endpoint_port_name]
regex: true;kube-state-metrics
action: keep
- source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scheme]
action: replace
target_label: __scheme__
regex: (https?)
- source_labels: [__meta_kubernetes_service_annotation_prometheus_io_path]
action: replace
target_label: __metrics_path__
regex: (.+)
- source_labels: [__address__, __meta_kubernetes_service_annotation_prometheus_io_port]
action: replace
target_label: __address__
regex: (.+)(?::\d+);(\d+)
replacement: $1:$2
# Strip the __meta_kubernetes_service_label_ prefix from label names
- action: labelmap
regex: __meta_kubernetes_service_label_(.+)
# Rename __meta_kubernetes_namespace to kubernetes_namespace
- source_labels: [__meta_kubernetes_namespace]
action: replace
target_label: kubernetes_namespace
# Rename __meta_kubernetes_service_name to kubernetes_name
- source_labels: [__meta_kubernetes_service_name]
action: replace
target_label: kubernetes_name

Collecting status metrics of the k8s cluster components

etcd, kube-controller-manager, kube-scheduler, kube-proxy, kube-apiserver, and kubelet each expose a standard Prometheus /metrics endpoint, which Prometheus can be configured to read.

Scraping etcd metrics

In a cluster bootstrapped with kubeadm, etcd runs as a static pod, so by default there is no Service (and hence no Endpoints) that an in-cluster Prometheus could use to reach it. First create a Service (Endpoints) that exposes etcd to Prometheus; etcd-svc.yaml is as follows:

apiVersion: v1
kind: Service
metadata:
namespace: kube-system
name: etcd-prometheus-discovery
labels:
component: etcd
annotations:
prometheus.io/scrape: 'true'
spec:
selector:
component: etcd
type: ClusterIP
clusterIP: None
ports:
- name: http-metrics
port: 2379
targetPort: 2379
protocol: TCP

Add the following to the Prometheus scrape configuration:

- job_name: 'etcd'

# Access the apiserver over HTTPS
tls_config:
ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token

#Discover targets by Kubernetes role, e.g. node, service, pod, endpoints, ingress
kubernetes_sd_configs:
# Discover targets from endpoints
- role: endpoints

#relabel_configs allows targets and their labels to be modified before scraping.
relabel_configs:
# Which labels to select
- source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scrape, __meta_kubernetes_namespace, __meta_kubernetes_service_name]
# Their values must match the regex below
regex: true;kube-system;etcd-prometheus-discovery
# Endpoints whose source labels match the regex are kept
action: keep

Scraping kube-proxy metrics

kube-proxy exposes /metrics on port 10249. Following the same approach as for etcd above, kube-proxy-svc.yaml is:

apiVersion: v1
kind: Service
metadata:
namespace: kube-system
name: kube-proxy-prometheus-discovery
labels:
k8s-app: kube-proxy
annotations:
prometheus.io/scrape: 'true'
spec:
selector:
k8s-app: kube-proxy
type: ClusterIP
clusterIP: None
ports:
- name: http-metrics
port: 10249
targetPort: 10249
protocol: TCP

Add the following to the Prometheus scrape configuration:

- job_name: 'kube-proxy'

# Access the apiserver over HTTPS
tls_config:
ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token

#Discover targets by Kubernetes role, e.g. node, service, pod, endpoints, ingress
kubernetes_sd_configs:
# Discover targets from endpoints
- role: endpoints

#relabel_configs allows targets and their labels to be modified before scraping.
relabel_configs:
# Which labels to select
- source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scrape, __meta_kubernetes_namespace, __meta_kubernetes_service_name]
# Their values must match the regex below
regex: true;kube-system;kube-proxy-prometheus-discovery
# Endpoints whose source labels match the regex are kept
action: keep

Scraping kube-scheduler metrics

kube-scheduler exposes /metrics on port 10251. Following the same approach as for etcd, kube-scheduler-svc.yaml is:

apiVersion: v1
kind: Service
metadata:
namespace: kube-system
name: kube-scheduler-prometheus-discovery
labels:
k8s-app: kube-scheduler
annotations:
prometheus.io/scrape: 'true'
spec:
selector:
component: kube-scheduler
type: ClusterIP
clusterIP: None
ports:
- name: http-metrics
port: 10251
targetPort: 10251
protocol: TCP

The corresponding Prometheus configuration:

- job_name: 'kube-scheduler'

# Access the apiserver over HTTPS
tls_config:
ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token

#Discover targets by Kubernetes role, e.g. node, service, pod, endpoints, ingress
kubernetes_sd_configs:
# Discover targets from endpoints
- role: endpoints

#relabel_configs allows targets and their labels to be modified before scraping.
relabel_configs:
# Which labels to select
- source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scrape, __meta_kubernetes_namespace, __meta_kubernetes_service_name]
# Their values must match the regex below
regex: true;kube-system;kube-scheduler-prometheus-discovery
# Endpoints whose source labels match the regex are kept
action: keep

Scraping kube-controller-manager metrics

kube-controller-manager exposes /metrics on port 10252. As with etcd, kube-controller-manager-svc.yaml is:

apiVersion: v1
kind: Service
metadata:
namespace: kube-system
name: kube-controller-manager-prometheus-discovery
labels:
k8s-app: kube-controller-manager
annotations:
prometheus.io/scrape: 'true'
spec:
selector:
component: kube-controller-manager
type: ClusterIP
clusterIP: None
ports:
- name: http-metrics
port: 10252
targetPort: 10252
protocol: TCP

Add the following to the Prometheus scrape configuration:

- job_name: 'kube-controller-manager'

# Access the apiserver over HTTPS
tls_config:
ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token

#Discover targets by Kubernetes role, e.g. node, service, pod, endpoints, ingress
kubernetes_sd_configs:
# Discover targets from endpoints
- role: endpoints

#relabel_configs allows targets and their labels to be modified before scraping.
relabel_configs:
# Which labels to select
- source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scrape, __meta_kubernetes_namespace, __meta_kubernetes_service_name]
# Their values must match the regex below
regex: true;kube-system;kube-controller-manager-prometheus-discovery
# Endpoints whose source labels match the regex are kept
action: keep

With the four components above configured, the corresponding targets appear in the Prometheus web UI.

Scraping kube-apiserver metrics

Unlike the four components above, kube-apiserver already has a default Service named kubernetes (with a matching Endpoints object, also named kubernetes) that serves as the in-cluster entry point to the apiserver. Prometheus can scrape it with the following configuration:

- job_name: 'kube-apiservers'

# Access the apiserver over HTTPS
scheme: https
tls_config:
ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token

#Discover targets by Kubernetes role, e.g. node, service, pod, endpoints, ingress
kubernetes_sd_configs:
# Discover the apiserver from endpoints
- role: endpoints

#relabel_configs allows targets and their labels to be modified before scraping.
relabel_configs:
# Which labels to select
- source_labels: [__meta_kubernetes_namespace, __meta_kubernetes_service_name, __meta_kubernetes_endpoint_port_name]
# Their values must match the regex below
regex: default;kubernetes;https
# Endpoints whose source labels match the regex are kept
action: keep

Scraping kubelet metrics

The kubelet's metrics port defaults to 10255:

  • Prometheus-format metrics endpoint: nodeIP:10255/metrics — Prometheus pulls its data from here
  • The kubelet's stats/summary endpoint: nodeIP:10255/stats/summary — heapster and the newer metrics-server pull their data from here

The main metrics the kubelet reports include:

apiserver_client_certificate_expiration_seconds_bucket
apiserver_client_certificate_expiration_seconds_sum
apiserver_client_certificate_expiration_seconds_count
etcd_helper_cache_entry_count
etcd_helper_cache_hit_count
etcd_helper_cache_miss_count
etcd_request_cache_add_latencies_summary
etcd_request_cache_add_latencies_summary_sum
etcd_request_cache_add_latencies_summary_count
etcd_request_cache_get_latencies_summary
etcd_request_cache_get_latencies_summary_sum
etcd_request_cache_get_latencies_summary_count
kubelet_cgroup_manager_latency_microseconds
kubelet_containers_per_pod_count
kubelet_docker_operations
kubelet_network_plugin_operations_latency_microseconds
kubelet_pleg_relist_interval_microseconds
kubelet_pleg_relist_latency_microseconds
kubelet_pod_start_latency_microseconds
kubelet_pod_worker_latency_microseconds
kubelet_running_container_count
kubelet_running_pod_count
kubelet_runtime_operations*
kubernetes_build_info
process_cpu_seconds_total
reflector*
rest_client_request_*
storage_operation_duration_seconds_*

Looking at the kubelet metrics (nodeIP:10255/metrics):

# HELP apiserver_audit_event_total Counter of audit events generated and sent to the audit backend.
# TYPE apiserver_audit_event_total counter
apiserver_audit_event_total 0
# HELP apiserver_client_certificate_expiration_seconds Distribution of the remaining lifetime on the certificate used to authenticate a request.
# TYPE apiserver_client_certificate_expiration_seconds histogram
apiserver_client_certificate_expiration_seconds_bucket{le="0"} 0
apiserver_client_certificate_expiration_seconds_bucket{le="21600"} 0
apiserver_client_certificate_expiration_seconds_bucket{le="43200"} 0
apiserver_client_certificate_expiration_seconds_bucket{le="86400"} 0
apiserver_client_certificate_expiration_seconds_bucket{le="172800"} 0
apiserver_client_certificate_expiration_seconds_bucket{le="345600"} 0
apiserver_client_certificate_expiration_seconds_bucket{le="604800"} 0
apiserver_client_certificate_expiration_seconds_bucket{le="2.592e+06"} 0
apiserver_client_certificate_expiration_seconds_bucket{le="7.776e+06"} 0
apiserver_client_certificate_expiration_seconds_bucket{le="1.5552e+07"} 0
apiserver_client_certificate_expiration_seconds_bucket{le="3.1104e+07"} 0
apiserver_client_certificate_expiration_seconds_bucket{le="+Inf"} 4161
apiserver_client_certificate_expiration_seconds_sum 1.3091542942737878e+12
apiserver_client_certificate_expiration_seconds_count 4161
...

Since there is exactly one kubelet per node, its metrics can be discovered through the Kubernetes node objects. The Prometheus configuration is:

- job_name: 'kubelet'
# Go through the apiserver over HTTPS and fetch the data via its proxy API
scheme: https
tls_config:
ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token

#Discover targets by Kubernetes role (node, service, pod, endpoints, ingress, etc.)
kubernetes_sd_configs:
# Discover from the cluster's node objects
- role: node
relabel_configs:
# labelmap rewrites the label-name prefix; with no replacement it simply strips the prefix.
# For example, the two lines below turn __meta_kubernetes_node_label_kubernetes_io_hostname
# into kubernetes_io_hostname
- action: labelmap
regex: __meta_kubernetes_node_label_(.+)
# The replacement value overwrites the label named by target_label,
# i.e. __address__ becomes kubernetes.default.svc:443
- target_label: __address__
replacement: kubernetes.default.svc:443
#replacement: 10.142.21.21:6443
# Capture the value of __meta_kubernetes_node_name
- source_labels: [__meta_kubernetes_node_name]
#Match one or more characters, capturing the source_labels value as ${1}
regex: (.+)
# The replacement value overwrites the label named by target_label,
# i.e. __metrics_path__ becomes /api/v1/nodes/${1}/proxy/metrics,
# where ${1} is the captured __meta_kubernetes_node_name
target_label: __metrics_path__
replacement: /api/v1/nodes/${1}/proxy/metrics
#or:
#- source_labels: [__address__]
# regex: '(.*):10250'
# replacement: '${1}:4194'
# target_label: __address__
#- source_labels: [__meta_kubernetes_node_label_role]
# action: replace
# target_label: role

Deploying Prometheus

prometheus-rbac.yml

---
apiVersion: v1
kind: ServiceAccount
metadata:
labels:
k8s-app: prometheus
name: prometheus
namespace: monitoring

---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
name: prometheus
labels:
k8s-app: prometheus
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: cluster-admin
subjects:
- kind: ServiceAccount
name: prometheus
namespace: monitoring
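
Binding cluster-admin works but grants far more than Prometheus needs; a narrower ClusterRole along the following lines (a sketch — adjust to your cluster) is usually enough for the scrape jobs in this article:

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: prometheus
rules:
  - apiGroups: [""]
    resources: ["nodes", "nodes/proxy", "nodes/metrics", "services", "endpoints", "pods"]
    verbs: ["get", "list", "watch"]
  - apiGroups: ["extensions", "networking.k8s.io"]
    resources: ["ingresses"]
    verbs: ["get", "list", "watch"]
  - nonResourceURLs: ["/metrics"]
    verbs: ["get"]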

prometheus-sts.yml

apiVersion: apps/v1
kind: StatefulSet
metadata:
name: prometheus
namespace: monitoring
labels:
k8s-app: prometheus
spec:
serviceName: prometheus
replicas: 1
selector:
matchLabels:
k8s-app: prometheus
template:
metadata:
labels:
k8s-app: prometheus
spec:
serviceAccountName: prometheus
containers:
- name: prometheus
image: "prom/prometheus:v2.14.0"
imagePullPolicy: "IfNotPresent"
command:
- "/bin/prometheus"
args:
- --config.file=/etc/prometheus/config/prometheus.yml
- --storage.tsdb.path=/prometheus
- --storage.tsdb.retention=14d
- --web.enable-lifecycle
ports:
- containerPort: 9090
protocol: TCP
readinessProbe:
httpGet:
path: /-/ready
port: 9090
initialDelaySeconds: 30
timeoutSeconds: 30
livenessProbe:
httpGet:
path: /-/healthy
port: 9090
initialDelaySeconds: 30
timeoutSeconds: 30
# based on 10 running nodes with 30 pods each
resources:
limits:
cpu: 200m
memory: 8Gi
requests:
cpu: 100m
memory: 4Gi
volumeMounts:
- name: prometheus-config
mountPath: /etc/prometheus/config
#mountPath: /etc/prometheus/prometheus.yml
#subPath: prometheus.yml
- name: prometheus-rules
mountPath: /etc/prometheus/rules
- name: prometheus-data
mountPath: /prometheus
volumes:
- name: prometheus-config
configMap:
name: prometheus-config
- name: prometheus-rules
configMap:
name: prometheus-rules
volumeClaimTemplates:
- metadata:
annotations:
volume.beta.kubernetes.io/storage-class: nfs-rw
volume.beta.kubernetes.io/storage-provisioner: flexvolume-huawei.com/fuxinfs
enable: true
name: prometheus-data
spec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 200Gi

A few notes on the args:

  • --storage.tsdb.path: path where the TSDB data is stored

  • --storage.tsdb.retention: how long data is kept; see the storage section of the official docs

  • --config.file: path to the Prometheus config file

  • --web.enable-lifecycle: with this flag, an HTTP POST to /-/reload (curl -XPOST 10.142.232.150:30006/-/reload) makes Prometheus reload its configuration after the config file has changed; see the official docs for details

  • --web.enable-admin-api: this flag exposes database-administration APIs for advanced users, e.g. snapshot backups (curl -XPOST http:///api/v2/admin/tsdb/snapshot); see the TSDB Admin APIs section of the official docs

prometheus-svc.yml

kind: Service
apiVersion: v1
metadata:
name: prometheus
namespace: monitoring
spec:
type: NodePort
ports:
- port: 9090
targetPort: 9090
nodePort: 30026
selector:
k8s-app: prometheus

prometheus-config.yml

apiVersion: v1
kind: ConfigMap
metadata:
name: prometheus-config
namespace: monitoring
data:
prometheus.yml: |
global:
scrape_interval: 15s
evaluation_interval: 15s
scrape_timeout: 10s
alerting:
alertmanagers:
- static_configs:
- targets:
- alertmanager:9093
rule_files:
- "/etc/prometheus/rules/*.rules"

scrape_configs:
- job_name: 'Prometheus'
static_configs:
- targets: ['localhost:9090']

#- job_name: 'federate'
#scrape_interval: 30s
#scrape_timeout: 30s
#honor_labels: true
#metrics_path: '/federate'
#params:
#'match[]':
#- '{job=~".+"}'
##- '{job=~"kubernetes-.*"}'
#static_configs:
#- targets:
#- 'prometheus.prometheus:9090'

- job_name: 'Web'
scrape_interval: 60s
metrics_path: /probe
params:
module: [http_2xx]
static_configs:
- targets: ["https://www.baidu.com"]
labels:
service: baidu
relabel_configs:
- source_labels: [__address__]
target_label: __param_target
- source_labels: [__param_target]
target_label: instance
- target_label: __address__
replacement: localhost:9115

- job_name: 'Ping'
scrape_interval: 30s
metrics_path: /probe
params:
module: [icmp]
static_configs:
- targets: ['192.168.0.11']
labels:
suites: 'test'
relabel_configs:
- source_labels: [__address__]
target_label: __param_target
- source_labels: [__param_target]
target_label: instance
- target_label: __address__
replacement: localhost:9115

#- job_name: 'Mysql'
#scrape_interval: 5s
#static_configs:
#- targets: ['localhost:9104']
#labels:
#instance: cnhwrds01

#- job_name: ECS
#static_configs:
#- targets: ['192.168.0.11:9100']
#labels:
#instance: test

#- job_name: Redis
#static_configs:
#- targets: ['localhost:9121']
#labels:
#instance: test

#- job_name: pushgateway
#scrape_interval: 1m
#static_configs:
#- targets: ['localhost:9091']
#labels:
#instance: pushgateway

- job_name: 'kubernetes-apiservers'
kubernetes_sd_configs:
- role: endpoints
scheme: https
tls_config:
ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
relabel_configs:
- source_labels: [__meta_kubernetes_namespace, __meta_kubernetes_service_name, __meta_kubernetes_endpoint_port_name]
action: keep
regex: default;kubernetes;https

- job_name: 'kubernetes-nodes'
kubernetes_sd_configs:
- role: node
scheme: https
tls_config:
ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
relabel_configs:
- action: labelmap
regex: __meta_kubernetes_node_label_(.+)
- target_label: __address__
replacement: kubernetes.default.svc:443
- source_labels: [__meta_kubernetes_node_name]
regex: (.+)
target_label: __metrics_path__
replacement: /api/v1/nodes/${1}/proxy/metrics


- job_name: 'kubernetes-service-endpoints'
kubernetes_sd_configs:
- role: endpoints
relabel_configs:
- source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scrape]
action: keep
regex: true
- source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scheme]
action: replace
target_label: __scheme__
regex: (https?)
- source_labels: [__meta_kubernetes_service_annotation_prometheus_io_path]
action: replace
target_label: __metrics_path__
regex: (.+)
- source_labels: [__address__, __meta_kubernetes_service_annotation_prometheus_io_port]
action: replace
target_label: __address__
regex: ([^:]+)(?::\d+)?;(\d+)
replacement: $1:$2
- action: labelmap
regex: __meta_kubernetes_service_label_(.+)
- source_labels: [__meta_kubernetes_namespace]
action: replace
target_label: kubernetes_namespace
- source_labels: [__meta_kubernetes_service_name]
action: replace
target_label: kubernetes_name

- job_name: 'kubernetes-services'
kubernetes_sd_configs:
- role: service
metrics_path: /probe
params:
module: [http_2xx]
relabel_configs:
- source_labels: [__meta_kubernetes_service_annotation_prometheus_io_probe]
action: keep
regex: true
- source_labels: [__address__]
target_label: __param_target
- target_label: __address__
replacement: blackbox-exporter.example.com:9115
- source_labels: [__param_target]
target_label: instance
- action: labelmap
regex: __meta_kubernetes_service_label_(.+)
- source_labels: [__meta_kubernetes_namespace]
target_label: kubernetes_namespace
- source_labels: [__meta_kubernetes_service_name]
target_label: kubernetes_name

- job_name: 'kubernetes-ingresses'
kubernetes_sd_configs:
- role: ingress
relabel_configs:
- source_labels: [__meta_kubernetes_ingress_annotation_prometheus_io_probe]
action: keep
regex: true
- source_labels: [__meta_kubernetes_ingress_scheme,__address__,__meta_kubernetes_ingress_path]
regex: (.+);(.+);(.+)
replacement: ${1}://${2}${3}
target_label: __param_target
- target_label: __address__
replacement: blackbox-exporter.example.com:9115
- source_labels: [__param_target]
target_label: instance
- action: labelmap
regex: __meta_kubernetes_ingress_label_(.+)
- source_labels: [__meta_kubernetes_namespace]
target_label: kubernetes_namespace
- source_labels: [__meta_kubernetes_ingress_name]
target_label: kubernetes_name

- job_name: 'kubernetes-pods'
kubernetes_sd_configs:
- role: pod
relabel_configs:
- source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
action: keep
regex: true
- source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
action: replace
target_label: __metrics_path__
regex: (.+)
- source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
action: replace
regex: ([^:]+)(?::\d+)?;(\d+)
replacement: $1:$2
target_label: __address__
- action: labelmap
regex: __meta_kubernetes_pod_label_(.+)
- source_labels: [__meta_kubernetes_namespace]
action: replace
target_label: kubernetes_namespace
- source_labels: [__meta_kubernetes_pod_name]
action: replace
target_label: kubernetes_pod_name

- job_name: 'prometheus-node-exporter'
tls_config:
ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
kubernetes_sd_configs:
#The endpoints role discovers targets from listed endpoints of a service. For each
#endpoint address one target is discovered per port. If the endpoint is backed by
#a pod, all additional container ports of the pod, not bound to an endpoint port,
#are discovered as targets as well
- role: endpoints
relabel_configs:
# Keep only endpoints whose Service is annotated prometheus.io/scrape: 'true' and whose port is named prometheus-node-exporter
- source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scrape, __meta_kubernetes_endpoint_port_name]
regex: true;prometheus-node-exporter
action: keep
# Match regex against the concatenated source_labels. Then, set target_label to replacement,
# with match group references (${1}, ${2}, ...) in replacement substituted by their value.
# If regex does not match, no replacement takes place.
- source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scheme]
action: replace
target_label: __scheme__
regex: (https?)
- source_labels: [__meta_kubernetes_service_annotation_prometheus_io_path]
action: replace
target_label: __metrics_path__
regex: (.+)
- source_labels: [__address__, __meta_kubernetes_service_annotation_prometheus_io_port]
action: replace
target_label: __address__
regex: (.+)(?::\d+);(\d+)
replacement: $1:$2
# Strip the __meta_kubernetes_service_label_ prefix from label names
- action: labelmap
regex: __meta_kubernetes_service_label_(.+)
# Rename __meta_kubernetes_namespace to kubernetes_namespace
- source_labels: [__meta_kubernetes_namespace]
action: replace
target_label: kubernetes_namespace
# Rename __meta_kubernetes_service_name to kubernetes_name
- source_labels: [__meta_kubernetes_service_name]
action: replace
target_label: kubernetes_name

- job_name: 'kubernetes-cadvisor'
kubernetes_sd_configs:
- role: node
scheme: https
tls_config:
ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
relabel_configs:
- action: labelmap
regex: __meta_kubernetes_node_label_(.+)
- target_label: __address__
replacement: kubernetes.default.svc:443
- source_labels: [__meta_kubernetes_node_name]
regex: (.+)
target_label: __metrics_path__
replacement: /api/v1/nodes/${1}/proxy/metrics/cadvisor

- job_name: 'kubernetes-service-http-probe'
tls_config:
ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
kubernetes_sd_configs:
- role: service
metrics_path: /probe
params:
module: [http_2xx]
relabel_configs:
- source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scrape, __meta_kubernetes_service_annotation_prometheus_io_http_probe]
regex: true;true
action: keep
- source_labels: [__meta_kubernetes_service_name, __meta_kubernetes_namespace, __meta_kubernetes_service_annotation_prometheus_io_http_probe_port, __meta_kubernetes_service_annotation_prometheus_io_http_probe_path]
action: replace
target_label: __param_target
regex: (.+);(.+);(.+);(.+)
replacement: $1.$2:$3$4
# (Alternative) build __param_target from __address__, the in-cluster service address, for blackbox-exporter to probe
#- source_labels: [__address__, __meta_kubernetes_service_annotation_prometheus_io_http_probe_path]
# action: replace
# target_label: __param_target
# regex: (.+);(.+)
# replacement: $1$2
# Replace __address__ with the blackbox-exporter service address "prometheus-blackbox-exporter:9115"
- target_label: __address__
replacement: prometheus-blackbox-exporter:9115
- source_labels: [__param_target]
target_label: instance
# Strip the __meta_kubernetes_service_annotation_prometheus_io_app_info_ prefix from label names
- action: labelmap
regex: __meta_kubernetes_service_annotation_prometheus_io_app_info_(.+)
- source_labels: [__meta_kubernetes_namespace]
target_label: kubernetes_namespace
- source_labels: [__meta_kubernetes_service_name]
target_label: kubernetes_name

- job_name: 'kubernetes-service-tcp-probe'
tls_config:
ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
kubernetes_sd_configs:
- role: service
metrics_path: /probe
params:
module: [tcp_connect]
relabel_configs:
- source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scrape, __meta_kubernetes_service_annotation_prometheus_io_tcp_probe]
regex: true;true
action: keep
- source_labels: [__meta_kubernetes_service_name, __meta_kubernetes_namespace, __meta_kubernetes_service_annotation_prometheus_io_tcp_probe_port]
action: replace
target_label: __param_target
regex: (.+);(.+);(.+)
replacement: $1.$2:$3
# (Alternative) build __param_target from __address__, the in-cluster service address, for blackbox-exporter to probe
#- source_labels: [__address__]
# target_label: __param_target
# Replace __address__ with the blackbox-exporter service address "prometheus-blackbox-exporter:9115"
- target_label: __address__
replacement: prometheus-blackbox-exporter:9115
- source_labels: [__param_target]
target_label: instance
# Strip the __meta_kubernetes_service_annotation_prometheus_io_app_info_ prefix from label names
- action: labelmap
regex: __meta_kubernetes_service_annotation_prometheus_io_app_info_(.+)

- job_name: 'kube-state-metrics'
tls_config:
ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
kubernetes_sd_configs:
- role: endpoints
relabel_configs:
# Keep only endpoints whose Service is annotated prometheus.io/scrape: 'true' and whose port is named kube-state-metrics
- source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scrape,__meta_kubernetes_endpoint_port_name]
regex: true;kube-state-metrics
action: keep
# Strip the __meta_kubernetes_service_label_ prefix from label names
- action: labelmap
regex: __meta_kubernetes_service_label_(.+)
# Rename __meta_kubernetes_namespace to kubernetes_namespace
- source_labels: [__meta_kubernetes_namespace]
action: replace
target_label: kubernetes_namespace
# Rename __meta_kubernetes_service_name to kubernetes_name
- source_labels: [__meta_kubernetes_service_name]
action: replace
target_label: kubernetes_name

- job_name: 'kubelet'
# Go through the apiserver over HTTPS and fetch the data via its proxy API
scheme: https
tls_config:
ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
kubernetes_sd_configs:
# Discover from the cluster's node objects
- role: node
relabel_configs:
- action: labelmap
regex: __meta_kubernetes_node_label_(.+)
- target_label: __address__
replacement: kubernetes.default.svc:443
# Capture the value of __meta_kubernetes_node_name
- source_labels: [__meta_kubernetes_node_name]
regex: (.+)
target_label: __metrics_path__
replacement: /api/v1/nodes/${1}/proxy/metrics

- job_name: 'coredns'
# Access the apiserver over HTTPS
#scheme: https
tls_config:
ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
kubernetes_sd_configs:
- role: endpoints
relabel_configs:
- source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scrape, __meta_kubernetes_namespace, __meta_kubernetes_service_name, __meta_kubernetes_endpoint_port_name]
regex: true;kube-system;coredns-prometheus-discovery;http-metrics
action: keep

prometheus-rules.yml

apiVersion: v1
kind: ConfigMap
metadata:
name: prometheus-rules
namespace: monitoring
data:
alertmanager.rules: |+
groups:
- name: alertmanager.rules
rules:
- alert: AlertmanagerReloadFailed
expr: alertmanager_config_last_reload_successful == 0
for: 10m
labels:
severity: warning
annotations:
description: Reloading Alertmanager's configuration has failed for {{ $labels.namespace
}}/{{ $labels.pod}}.
summary: Alertmanager configuration reload has failed
general.rules: |+
groups:
- name: general.rules
rules:
- alert: TargetDown
expr: 100 * (count(up == 0) BY (job) / count(up) BY (job)) > 10
for: 10m
labels:
severity: warning
annotations:
description: '{{ $value }}% or more of {{ $labels.job }} targets are down.'
summary: Targets are down
- alert: DeadMansSwitch
expr: vector(1)
labels:
severity: none
annotations:
description: This is a DeadMansSwitch meant to ensure that the entire Alerting
pipeline is functional.
summary: Alerting DeadMansSwitch
- alert: TooManyOpenFileDescriptors
expr: 100 * (process_open_fds / process_max_fds) > 95
for: 10m
labels:
severity: critical
annotations:
description: '{{ $labels.job }}: {{ $labels.namespace }}/{{ $labels.pod }} ({{
$labels.instance }}) is using {{ $value }}% of the available file/socket descriptors.'
summary: too many open file descriptors
- record: instance:fd_utilization
expr: process_open_fds / process_max_fds
- alert: FdExhaustionClose
expr: predict_linear(instance:fd_utilization[1h], 3600 * 4) > 1
for: 10m
labels:
severity: warning
annotations:
description: '{{ $labels.job }}: {{ $labels.namespace }}/{{ $labels.pod }} ({{
$labels.instance }}) instance will exhaust in file/socket descriptors soon'
summary: file descriptors soon exhausted
- alert: FdExhaustionClose
expr: predict_linear(instance:fd_utilization[10m], 3600) > 1
for: 10m
labels:
severity: critical
annotations:
description: '{{ $labels.job }}: {{ $labels.namespace }}/{{ $labels.pod }} ({{
$labels.instance }}) instance will exhaust in file/socket descriptors soon'
summary: file descriptors soon exhausted
job.rules: |+
groups:
- name: job.rules
rules:
- alert: CronJobRunning
expr: time() -kube_cronjob_next_schedule_time > 3600
for: 1h
labels:
severity: warning
annotations:
description: CronJob {{$labels.namespaces}}/{{$labels.cronjob}} is taking more than 1h to complete
summary: CronJob didn't finish after 1h
- alert: JobCompletion
expr: kube_job_spec_completions - kube_job_status_succeeded > 0
for: 1h
labels:
severity: warning
annotations:
description: Job completion is taking more than 1h to complete
cronjob {{$labels.namespaces}}/{{$labels.job}}
summary: Job {{$labels.job}} didn't finish to complete after 1h
- alert: JobFailed
expr: kube_job_status_failed > 0
for: 5m
labels:
severity: warning
annotations:
description: Job {{$labels.namespaces}}/{{$labels.job}} failed to complete
summary: Job failed
kube-apiserver.rules: |+
groups:
- name: kube-apiserver.rules
rules:
- alert: K8SApiserverDown
expr: absent(up{job="kubernetes-apiservers"} == 1)
for: 5m
labels:
severity: critical
annotations:
description: Prometheus failed to scrape API server(s), or all API servers have
disappeared from service discovery.
summary: API server unreachable
- alert: K8SApiServerLatency
expr: histogram_quantile(0.99, sum(apiserver_request_latencies_bucket{subresource!="log",verb!~"CONNECT|WATCHLIST|WATCH|PROXY"})
WITHOUT (instance, resource)) / 1e+06 > 1
for: 10m
labels:
severity: warning
annotations:
description: 99th percentile Latency for {{ $labels.verb }} requests to the
kube-apiserver is higher than 1s.
summary: Kubernetes apiserver latency is high
kube-state-metrics.rules: |+
groups:
- name: kube-state-metrics.rules
rules:
- alert: DeploymentGenerationMismatch
expr: kube_deployment_status_observed_generation != kube_deployment_metadata_generation
for: 15m
labels:
severity: warning
annotations:
description: Observed deployment generation does not match expected one for
deployment {{$labels.namespace}}/{{$labels.deployment}}
summary: Deployment is outdated
- alert: DeploymentReplicasNotUpdated
expr: ((kube_deployment_status_replicas_updated != kube_deployment_spec_replicas)
or (kube_deployment_status_replicas_available != kube_deployment_spec_replicas))
unless (kube_deployment_spec_paused == 1)
for: 15m
labels:
severity: warning
annotations:
description: Replicas are not updated and available for deployment {{$labels.namespace}}/{{$labels.deployment}}
summary: Deployment replicas are outdated
- alert: DaemonSetRolloutStuck
expr: kube_daemonset_status_number_ready / kube_daemonset_status_desired_number_scheduled
* 100 < 100
for: 15m
labels:
severity: warning
annotations:
description: Only {{$value}}% of desired pods scheduled and ready for daemon
set {{$labels.namespace}}/{{$labels.daemonset}}
summary: DaemonSet is missing pods
- alert: K8SDaemonSetsNotScheduled
expr: kube_daemonset_status_desired_number_scheduled - kube_daemonset_status_current_number_scheduled
> 0
for: 10m
labels:
severity: warning
annotations:
description: A number of daemonsets are not scheduled.
summary: Daemonsets are not scheduled correctly
- alert: DaemonSetsMissScheduled
expr: kube_daemonset_status_number_misscheduled > 0
for: 10m
labels:
severity: warning
annotations:
description: A number of daemonsets are running where they are not supposed
to run.
summary: Daemonsets are not scheduled correctly
- alert: PodFrequentlyRestarting
expr: increase(kube_pod_container_status_restarts_total[1h]) > 5
for: 10m
labels:
severity: warning
annotations:
description: Pod {{$labels.namespace}}/{{$labels.pod}} was restarted {{$value}} times within the last hour
summary: Pod is restarting frequently
- alert: KubeNodeNotReady
expr: kube_node_status_condition{job="kube-state-metrics",condition="Ready",status="true"} == 0
for: 1h
labels:
severity: warning
annotations:
message: '{{ $labels.node }} has been unready for more than an hour'
runbook_url: https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubenodenotready
- alert: K8SManyNodesNotReady
expr: count(kube_node_status_condition{condition="Ready",status="true"} == 0)
> 1 and (count(kube_node_status_condition{condition="Ready",status="true"} ==
0) / count(kube_node_status_condition{condition="Ready",status="true"})) > 0.5
for: 1m
labels:
severity: critical
annotations:
description: '{{ $value }} Kubernetes nodes (more than 50% of the cluster) are in the NotReady state.'
summary: Many Kubernetes nodes are Not Ready
kubelet.rules: |+
groups:
- name: kubelet.rules
rules:
- alert: K8SKubeletDown
expr: count(up{job="kubelet"} == 0) / count(up{job="kubelet"}) > 0.03
for: 1h
labels:
severity: warning
annotations:
description: Prometheus failed to scrape {{ $value }}% of kubelets.
summary: Many Kubelets cannot be scraped
- alert: K8SManyKubeletDown
expr: absent(up{job="kubelet"} == 1) or count(up{job="kubelet"} == 0) / count(up{job="kubelet"})
> 0.5
for: 1h
labels:
severity: critical
annotations:
description: Prometheus failed to scrape {{ $value }}% of kubelets, or all Kubelets
have disappeared from service discovery.
summary: Many Kubelets cannot be scraped
- alert: K8SKubeletTooManyPods
expr: kubelet_running_pod_count > 100
labels:
severity: warning
annotations:
description: Kubelet {{$labels.instance}} is running {{$value}} pods, close
to the limit of 110
summary: Kubelet is close to pod limit
kubernetes.rules: |+
groups:
- name: kubernetes.rules
rules:
- record: cluster_namespace_controller_pod_container:spec_memory_limit_bytes
expr: sum(label_replace(container_spec_memory_limit_bytes{container_name!=""},
"controller", "$1", "pod_name", "^(.*)-[a-z0-9]+")) BY (cluster, namespace,
controller, pod_name, container_name)
- record: cluster_namespace_controller_pod_container:spec_cpu_shares
expr: sum(label_replace(container_spec_cpu_shares{container_name!=""}, "controller",
"$1", "pod_name", "^(.*)-[a-z0-9]+")) BY (cluster, namespace, controller, pod_name,
container_name)
- record: cluster_namespace_controller_pod_container:cpu_usage:rate
expr: sum(label_replace(irate(container_cpu_usage_seconds_total{container_name!=""}[5m]),
"controller", "$1", "pod_name", "^(.*)-[a-z0-9]+")) BY (cluster, namespace,
controller, pod_name, container_name)
- record: cluster_namespace_controller_pod_container:memory_usage:bytes
expr: sum(label_replace(container_memory_usage_bytes{container_name!=""}, "controller",
"$1", "pod_name", "^(.*)-[a-z0-9]+")) BY (cluster, namespace, controller, pod_name,
container_name)
- record: cluster_namespace_controller_pod_container:memory_working_set:bytes
expr: sum(label_replace(container_memory_working_set_bytes{container_name!=""},
"controller", "$1", "pod_name", "^(.*)-[a-z0-9]+")) BY (cluster, namespace,
controller, pod_name, container_name)
- record: cluster_namespace_controller_pod_container:memory_rss:bytes
expr: sum(label_replace(container_memory_rss{container_name!=""}, "controller",
"$1", "pod_name", "^(.*)-[a-z0-9]+")) BY (cluster, namespace, controller, pod_name,
container_name)
- record: cluster_namespace_controller_pod_container:memory_cache:bytes
expr: sum(label_replace(container_memory_cache{container_name!=""}, "controller",
"$1", "pod_name", "^(.*)-[a-z0-9]+")) BY (cluster, namespace, controller, pod_name,
container_name)
- record: cluster_namespace_controller_pod_container:memory_pagefaults:rate
expr: sum(label_replace(irate(container_memory_failures_total{container_name!=""}[5m]),
"controller", "$1", "pod_name", "^(.*)-[a-z0-9]+")) BY (cluster, namespace,
controller, pod_name, container_name, scope, type)
- record: cluster_namespace_controller_pod_container:memory_oom:rate
expr: sum(label_replace(irate(container_memory_failcnt{container_name!=""}[5m]),
"controller", "$1", "pod_name", "^(.*)-[a-z0-9]+")) BY (cluster, namespace,
controller, pod_name, container_name, scope, type)
- record: cluster:memory_allocation:percent
expr: 100 * sum(container_spec_memory_limit_bytes{pod_name!=""}) BY (cluster)
/ sum(machine_memory_bytes) BY (cluster)
- record: cluster:memory_used:percent
expr: 100 * sum(container_memory_usage_bytes{pod_name!=""}) BY (cluster) / sum(machine_memory_bytes)
BY (cluster)
- record: cluster:cpu_allocation:percent
expr: 100 * sum(container_spec_cpu_shares{pod_name!=""}) BY (cluster) / sum(container_spec_cpu_shares{id="/"}
* ON(cluster, instance) machine_cpu_cores) BY (cluster)
- record: cluster_resource_verb:apiserver_latency:quantile_seconds
expr: histogram_quantile(0.99, sum(apiserver_request_latencies_bucket) BY (le,
cluster, job, resource, verb)) / 1e+06
labels:
quantile: "0.99"
- record: cluster_resource_verb:apiserver_latency:quantile_seconds
expr: histogram_quantile(0.9, sum(apiserver_request_latencies_bucket) BY (le,
cluster, job, resource, verb)) / 1e+06
labels:
quantile: "0.9"
- record: cluster_resource_verb:apiserver_latency:quantile_seconds
expr: histogram_quantile(0.5, sum(apiserver_request_latencies_bucket) BY (le,
cluster, job, resource, verb)) / 1e+06
labels:
quantile: "0.5"
node.rules: |+
groups:
- name: node.rules
rules:
- alert: NodeExporterDown
expr: absent(up{job="prometheus-node-exporter"} == 1)
for: 10m
labels:
severity: warning
annotations:
description: Prometheus could not scrape a node-exporter for more than 10m,
or node-exporters have disappeared from discovery.
summary: node-exporter cannot be scraped
- alert: K8SNodeOutOfDisk
expr: kube_node_status_condition{condition="OutOfDisk",status="true"} == 1
labels:
service: k8s
severity: critical
annotations:
description: '{{ $labels.node }} has run out of disk space.'
summary: Node ran out of disk space.
- alert: K8SNodeMemoryPressure
expr: kube_node_status_condition{condition="MemoryPressure",status="true"} == 1
labels:
service: k8s
severity: warning
annotations:
description: '{{ $labels.node }} is under memory pressure.'
summary: Node is under memory pressure.
- alert: K8SNodeDiskPressure
expr: kube_node_status_condition{condition="DiskPressure",status="true"} == 1
labels:
service: k8s
severity: warning
annotations:
description: '{{ $labels.node }} is under disk pressure.'
summary: Node is under disk pressure.
- alert: NodeCPUUsage
expr: (100 - (avg by (instance) (irate(node_cpu_seconds_total{job="prometheus-node-exporter",mode="idle"}[5m])) * 100)) > 90
for: 30m
labels:
severity: warning
annotations:
summary: "{{$labels.instance}}: High CPU usage detected"
description: "{{$labels.instance}}: CPU usage is above 90% (current value is: {{ $value }})"
- alert: NodeMemoryUsage
expr: (((node_memory_MemTotal_bytes-node_memory_MemFree_bytes-node_memory_Cached_bytes)/(node_memory_MemTotal_bytes)*100)) > 90
for: 30m
labels:
severity: warning
annotations:
summary: "{{$labels.instance}}: High memory usage detected"
description: "{{$labels.instance}}: Memory usage is above 90% (current value is: {{ $value }})"
prometheus.rules: |+
groups:
- name: prometheus.rules
rules:
- alert: PrometheusReloadFailed
expr: prometheus_config_last_reload_successful == 0
for: 10m
labels:
severity: warning
annotations:
description: Reloading Prometheus' configuration has failed for {{ $labels.namespace
}}/{{ $labels.pod}}.
summary: Prometheus configuration reload has failed

Visualization with Grafana

Deployment

---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
labels:
k8s-app: grafana
name: prometheus-grafana
namespace: monitoring
spec:
replicas: 1
selector:
matchLabels:
k8s-app: grafana
template:
metadata:
labels:
k8s-app: grafana
spec:
containers:
- image: grafana/grafana:6.5.2
livenessProbe:
failureThreshold: 10
httpGet:
path: /api/health
port: 3000
scheme: HTTP
initialDelaySeconds: 60
periodSeconds: 10
successThreshold: 1
timeoutSeconds: 30
name: grafana
ports:
- containerPort: 3000
name: grafana
protocol: TCP
readinessProbe:
failureThreshold: 3
httpGet:
path: /api/health
port: 3000
scheme: HTTP
periodSeconds: 10
successThreshold: 1
timeoutSeconds: 1
volumeMounts:
- mountPath: /etc/grafana/grafana.ini
name: config
subPath: grafana.ini
- mountPath: /var/lib/grafana
name: storage
volumes:
- configMap:
defaultMode: 420
name: prometheus-grafana
name: config
- emptyDir: {}
name: storage

---
apiVersion: v1
kind: ConfigMap
metadata:
labels:
k8s-app: grafana
name: prometheus-grafana
namespace: monitoring
data:
grafana.ini: |
[server]
root_url = https://www.baidu.com
[log]
mode = console
level = info
[paths]
data = /var/lib/grafana/data
logs = /var/log/grafana
plugins = /var/lib/grafana/plugins
provisioning = /etc/grafana/provisioning
[smtp]
enabled = true
host = user
user = user
password = password
cert_file =
key_file =
skip_verify = false
from_address = user
from_name = user
[emails]
welcome_email_on_sign_up = true

---
apiVersion: v1
kind: Service
metadata:
labels:
k8s-app: grafana
annotations:
prometheus.io/scrape: 'true'
#prometheus.io/tcp-probe: 'true'
#prometheus.io/tcp-probe-port: '80'
name: prometheus-grafana
namespace: monitoring
spec:
type: NodePort
ports:
- port: 3000
targetPort: 3000
nodePort: 30028
selector:
k8s-app: grafana

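Assuming the manifests above are saved to a local file (grafana.yaml below is just an example name), deployment and a quick sanity check could look like this; the NodePort 30028 comes from the Service definition above:

kubectl apply -f grafana.yaml
kubectl -n monitoring get pods -l k8s-app=grafana
# once the pod is Running, the web UI should be reachable on any node at http://<nodeIP>:30028
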
Grafana configuration

  • Add the Prometheus data source (a provisioning sketch follows this list)
  • Add dashboards
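
Instead of adding the data source by hand in the web UI, Grafana can also load it from a provisioning file. The sketch below assumes Prometheus is reachable inside the cluster at http://prometheus.monitoring.svc:9090 (adjust the URL to your own Service); since grafana.ini above sets provisioning = /etc/grafana/provisioning, such a file could be mounted at /etc/grafana/provisioning/datasources/prometheus.yaml via an additional ConfigMap:

apiVersion: 1
datasources:
- name: Prometheus
  type: prometheus
  access: proxy
  # assumed in-cluster address of the Prometheus Service; change to match your deployment
  url: http://prometheus.monitoring.svc:9090
  isDefault: true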

Some useful dashboard templates (a provisioning sketch follows the list):

315: dashboards for the container metrics collected by cAdvisor
1860: dashboards for the host metrics collected by node-exporter
6417: dashboards for the k8s resource object state metrics collected by kube-state-metrics
4859 and 4865: dashboards for the HTTP status metrics probed by blackbox-exporter (the two are essentially the same; pick either one)
5345: dashboards for the network status metrics probed by blackbox-exporter
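
If you prefer not to import these dashboards by hand, Grafana's file-based dashboard provisioning can load them at startup: download the dashboard JSON from grafana.com, mount the files into the container, and point a provider at that directory. A minimal provider sketch (the path /var/lib/grafana/dashboards is only an example):

apiVersion: 1
providers:
- name: default
  orgId: 1
  folder: ''
  type: file
  disableDeletion: false
  options:
    # directory containing the downloaded dashboard JSON files (e.g. 1860, 6417)
    path: /var/lib/grafana/dashboards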

Alerting

The main steps for setting up alerts and notifications are:

  • Install and configure Alertmanager
  • Configure Prometheus to communicate with Alertmanager
  • Create alerting rules in Prometheus

Deployment

apiVersion: apps/v1beta2
kind: Deployment
metadata:
name: alertmanager
namespace: monitoring
spec:
replicas: 1
selector:
matchLabels:
k8s-app: alertmanager
template:
metadata:
labels:
k8s-app: alertmanager
spec:
containers:
- image: prom/alertmanager:v0.19.0
name: alertmanager
args:
- "--config.file=/etc/alertmanager/config/alertmanager.yml"
- "--storage.path=/alertmanager"
- "--data.retention=720h"
volumeMounts:
- mountPath: "/alertmanager"
name: data
- mountPath: "/etc/alertmanager/config"
name: config-volume
- mountPath: "/etc/alertmanager/template"
name: alertmanager-tmpl
resources:
requests:
cpu: 50m
memory: 500Mi
limits:
cpu: 100m
memory: 1Gi
volumes:
- name: data
emptyDir: {}
- name: config-volume
configMap:
name: alertmanager-config
- name: alertmanager-tmpl
configMap:
name: alertmanager-tmpl

---
apiVersion: v1
kind: ConfigMap
metadata:
name: alertmanager-config
namespace: monitoring
data:
alertmanager.yml: |
global:
resolve_timeout: 1m
wechat_api_corp_id: '****'
wechat_api_url: '****'
wechat_api_secret: '****'
smtp_smarthost: '****'
smtp_from: '****'
smtp_auth_username: '****'
smtp_auth_password: '****'
smtp_require_tls: true

templates:
- '/etc/alertmanager/template/*.tmpl'

route:
group_by: ['alertname', 'job']
group_wait: 20s
group_interval: 20s
repeat_interval: 2m
receiver: 'email'
routes:
- receiver: "email"
group_wait: 10s
continue: true
match_re:
severity: critical|error|warning
#- receiver: "wechat"
#group_wait: 10s
#continue: true
#match_re:
#severity: critical|error|warning

receivers:
#- name: "wechat"
#wechat_configs:
#- send_resolved: true
#to_party: '1'
#agent_id: 1000003
#corp_id: '****'
#api_url: '****'
#api_secret: '****'
- name: "email"
email_configs:
- to: 'cumt_gongzhao@163.com'
send_resolved: true

inhibit_rules:
- source_match:
severity: 'critical'
target_match:
severity: 'error'
equal: ['alertname']

---
apiVersion: v1
kind: Service
metadata:
name: alertmanager
namespace: monitoring
labels:
k8s-app: alertmanager
annotations:
prometheus.io/scrape: 'true'
spec:
ports:
- name: web
port: 9093
targetPort: 9093
protocol: TCP
nodePort: 30027
type: NodePort
selector:
k8s-app: alertmanager

---
apiVersion: v1
kind: ConfigMap
metadata:
name: alertmanager-tmpl
namespace: monitoring
data:
wechat.tmpl: |
{{ define "wechat.default.message" }}
{{- if gt (len .Alerts.Firing) 0 -}}{{ range .Alerts }}
@警报
状态: {{ .Status }}
名称: {{ .Labels.alertname }}
级别: {{ .Labels.severity }}
实例: {{ .Labels.instance }}
信息: {{ .Annotations.summary }}
详情: {{ .Annotations.description }}
时间: {{ .StartsAt.Format "2006-01-02 15:04:05" }}
{{ end }}{{ end -}}
{{- if gt (len .Alerts.Resolved) 0 -}}{{ range .Alerts }}
@恢复
状态: {{ .Status }}
名称: {{ .Labels.alertname }}
级别: {{ .Labels.severity }}
实例: {{ .Labels.instance }}
信息: {{ .Annotations.summary }}
详情: {{ .Annotations.description }}
时间: {{ .StartsAt.Format "2006-01-02 15:04:05" }}
恢复: {{ .EndsAt.Format "2006-01-02 15:04:05" }}
{{ end }}{{ end -}}
{{- end }}
email.tmpl: |
{{ define "email.default.html" }}
{{- if gt (len .Alerts.Firing) 0 -}}{{ range .Alerts }}
@警报
状态: {{ .Status }}
名称: {{ .Labels.alertname }}
级别: {{ .Labels.severity }}
实例: {{ .Labels.instance }}
信息: {{ .Annotations.summary }}
详情: {{ .Annotations.description }}
时间: {{ .StartsAt.Format "2006-01-02 15:04:05" }}
{{ end }}{{ end -}}
{{- if gt (len .Alerts.Resolved) 0 -}}{{ range .Alerts }}
@恢复
状态: {{ .Status }}
名称: {{ .Labels.alertname }}
级别: {{ .Labels.severity }}
实例: {{ .Labels.instance }}
信息: {{ .Annotations.summary }}
详情: {{ .Annotations.description }}
时间: {{ .StartsAt.Format "2006-01-02 15:04:05" }}
恢复: {{ .EndsAt.Format "2006-01-02 15:04:05" }}
{{ end }}{{ end -}}
{{ end }}

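Before applying the manifests above, the Alertmanager configuration can be validated locally with amtool, which ships with Alertmanager; alertmanager.yml below refers to a local copy of the config embedded in the ConfigMap, and alertmanager.yaml to a local copy of the manifests:

# validate the configuration file (amtool also reports the templates it finds)
amtool check-config alertmanager.yml
# deploy and check the pod
kubectl apply -f alertmanager.yaml
kubectl -n monitoring get pods -l k8s-app=alertmanager
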
Connecting Prometheus with Alertmanager

In the Prometheus architecture, alerting is split into two independent parts: Prometheus is responsible for generating alerts, while Alertmanager handles everything that happens after an alert fires. Once Alertmanager has been deployed, Prometheus therefore needs to be configured with the Alertmanager's address.

Edit the Prometheus configuration file prometheus.yml and add the following:

alerting:
alertmanagers:
- static_configs:
- targets: ['alertmanager:9093']

After reloading Prometheus, firing alerts can be viewed in the Alertmanager web UI at http://IP:Port (the Service above exposes it on NodePort 30027).
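
How the reload is triggered depends on how Prometheus was started: with the --web.enable-lifecycle flag an HTTP POST is enough, otherwise sending SIGHUP to the Prometheus process has the same effect. A sketch, with the address being an example that should match your Prometheus Service:

# requires Prometheus to be running with --web.enable-lifecycle
curl -X POST http://<prometheusIP>:9090/-/reload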

Setting up alerting rules

See prometheus-rules.yml for details.
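
If it is not already present earlier in the deployment, prometheus.yml also needs a rule_files entry pointing at wherever the rules ConfigMap is mounted, otherwise the rules above are never evaluated. The mount path below is only an example and should match the Prometheus Deployment; the *.rules glob matches the ConfigMap keys shown earlier (general.rules, node.rules, and so on):

rule_files:
# example mount path of the prometheus-rules ConfigMap; adjust to your deployment
- /etc/prometheus/rules/*.rules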

-------------------- End of this article. Thank you for reading. --------------------