
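Docker Security, Prometheus and Grafana (Part 2): Integrating DingTalk Alerts

Configuring DingTalk alerts for Prometheus Alertmanager

Prometheus has no official DingTalk integration, so a third-party plugin, prometheus-webhook-dingtalk, is needed.

GitHub: https://github.com/timonwong/prometheus-webhook-dingtalk/

1 Download dingtalk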

# Download the release archive
wget https://github.com/timonwong/prometheus-webhook-dingtalk/releases/download/v2.0.0/prometheus-webhook-dingtalk-2.0.0.linux-amd64.tar.gz

# Extract the archive and make root the owner
tar xf prometheus-webhook-dingtalk-2.0.0.linux-amd64.tar.gz -C /usr/local/
chown -R root:root /usr/local/prometheus-webhook-dingtalk-2.0.0.linux-amd64/

# Create a version-independent symlink
ln -sv /usr/local/prometheus-webhook-dingtalk-2.0.0.linux-amd64/ /usr/local/prometheus-webhook-dingtalk

2 dingtalk usage help

usage: prometheus-webhook-dingtalk [<flags>]

Flags:
  -h, --help                  Show context-sensitive help (also try --help-long and --help-man).
      --web.listen-address=:8060
                              The address to listen on for web interface.
      --web.enable-ui         Enable Web UI mounted on /ui path
      --web.enable-lifecycle  Enable reload via HTTP request.
      --config.file=config.yml
                              Path to the configuration file.
      --log.level=info        Only log messages with the given severity or above. One of: [debug, info, warn, error]
      --log.format=logfmt     Output format of log messages. One of: [logfmt, json]
      --version               Show application version.
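
The `--web.enable-lifecycle` flag above allows a configuration reload over HTTP instead of a restart. The following is a minimal sketch, not taken from the original article: it assumes the reload endpoint is `POST /-/reload` (the convention used by Prometheus-style services, worth confirming in the project README), that the service listens on the default port 8060 on localhost, and the helper names `reload_url` and `reload_dingtalk` are hypothetical.

```shell
#!/bin/sh
# Assumption: the service runs locally on the default --web.listen-address port.
DINGTALK_ADDR="${DINGTALK_ADDR:-127.0.0.1:8060}"

# Build the reload URL for a given host:port.
# Assumption: the endpoint path is /-/reload; check the project README.
reload_url() {
    printf 'http://%s/-/reload\n' "$1"
}

# Ask a running instance to re-read its config file.
# Only expected to work when started with --web.enable-lifecycle.
reload_dingtalk() {
    curl -fsS -X POST "$(reload_url "$DINGTALK_ADDR")"
}

# Usage (run manually once the service is up):
#   reload_dingtalk
```

Note that the systemd unit later in this article does not pass `--web.enable-lifecycle`, so with that unit a `systemctl restart dingtalk` is the way to apply config changes.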

3 dingtalk configuration file

# Create the file
touch /usr/local/prometheus-webhook-dingtalk/config.yml

# Write the configuration
cat > /usr/local/prometheus-webhook-dingtalk/config.yml << \EOF
## Request timeout
timeout: 5s

## Customizable templates path
templates:
  - contrib/templates/legacy/template.tmpl

## You can also override default template using `default_message`
## The following example to use the 'legacy' template from v0.3.0
default_message:
  title: '{{ template "legacy.title" . }}'
  text: '{{ template "legacy.content" . }}'

## Targets, previously was known as "profiles"
targets:
  webhook1:
    url: https://oapi.dingtalk.com/robot/send?access_token=412c7cef2c39d96e565a54156b8aec88e02ec94ef19c0e096a6821e10f3430b9
    # secret for signature
    secret: secret
  webhook_mention_all:
    url: https://oapi.dingtalk.com/robot/send?access_token=412c7cef2c39d96e565a54156b8aec88e02ec94ef19c0e096a6821e10f3430b9
    secret: secret
    mention:
      all: true
  webhook_mention_users:
    url: https://oapi.dingtalk.com/robot/send?access_token=412c7cef2c39d96e565a54156b8aec88e02ec94ef19c0e096a6821e10f3430b9
    mention:
      mobiles: ['13520642397']
EOF


# Inspect the result
cat /usr/local/prometheus-webhook-dingtalk/config.yml
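
Each key under `targets` becomes a path segment of the webhook service's own HTTP endpoint; the Alertmanager receiver later in this article posts to `/dingtalk/webhook1/send`. The sketch below prints the endpoint for each target defined in the config. The host `127.0.0.1` is an assumption (use the address Alertmanager can reach), the port is the default from the help output, and `target_url` is a hypothetical helper name.

```shell
#!/bin/sh
# Assumption: the service is reachable on localhost at the default port 8060.
DINGTALK_ADDR="${DINGTALK_ADDR:-127.0.0.1:8060}"

# Print the URL that Alertmanager should POST to for a given target name.
target_url() {
    printf 'http://%s/dingtalk/%s/send\n' "$DINGTALK_ADDR" "$1"
}

# The three target names come from the config.yml above.
for target in webhook1 webhook_mention_all webhook_mention_users; do
    target_url "$target"
done
```

Choosing a different target in the Alertmanager `webhook_configs` URL is how one alert route can mention everyone while another mentions only specific phone numbers.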

4 Add the dingtalk.service unit file

# Create the file
touch /lib/systemd/system/dingtalk.service
# Write the unit definition
cat > /lib/systemd/system/dingtalk.service << \EOF
[Unit]
Description=dingtalk
Documentation=https://github.com/timonwong/prometheus-webhook-dingtalk/
After=network.target

[Service]
Restart=on-failure
# The relative template path in config.yml is resolved against this directory
WorkingDirectory=/usr/local/prometheus-webhook-dingtalk
ExecStart=/usr/local/prometheus-webhook-dingtalk/prometheus-webhook-dingtalk --config.file=/usr/local/prometheus-webhook-dingtalk/config.yml

[Install]
WantedBy=multi-user.target
EOF

# Inspect the unit file
cat /lib/systemd/system/dingtalk.service

5 Enable dingtalk at boot

~# systemctl enable dingtalk
Created symlink /etc/systemd/system/multi-user.target.wants/dingtalk.service → /lib/systemd/system/dingtalk.service.
~# systemctl start dingtalk
~# systemctl status dingtalk
● dingtalk.service
Loaded: loaded (/lib/systemd/system/dingtalk.service; disabled; vendor preset: enabled)
Active: active (running) since Wed 2021-12-01 14:29:35 CST; 4s ago
Docs: https://github.com/timonwong/prometheus-webhook-dingtalk/
Main PID: 26590 (prometheus-webh)
Tasks: 7 (limit: 7069)
Memory: 2.5M
CGroup: /system.slice/dingtalk.service
└─26590 /usr/local/prometheus-webhook-dingtalk/prometheus-webhook-dingtalk --config.file=/usr/local/prometheus>

Dec 01 14:29:35 nacos-03 systemd[1]: Started dingtalk.service.
Dec 01 14:29:35 nacos-03 prometheus-webhook-dingtalk[26590]: level=info ts=2021-12-01T06:29:35.918Z caller=main.go:60 msg=">
Dec 01 14:29:35 nacos-03 prometheus-webhook-dingtalk[26590]: level=info ts=2021-12-01T06:29:35.918Z caller=main.go:61 msg=">
Dec 01 14:29:35 nacos-03 prometheus-webhook-dingtalk[26590]: level=info ts=2021-12-01T06:29:35.918Z caller=coordinator.go:8>
Dec 01 14:29:35 nacos-03 prometheus-webhook-dingtalk[26590]: level=info ts=2021-12-01T06:29:35.919Z caller=coordinator.go:9>
Dec 01 14:29:35 nacos-03 prometheus-webhook-dingtalk[26590]: level=info ts=2021-12-01T06:29:35.919Z caller=main.go:98 compo>
Dec 01 14:29:35 nacos-03 prometheus-webhook-dingtalk[26590]: ts=2021-12-01T06:29:35.920Z caller=main.go:114 component=confi>
Dec 01 14:29:35 nacos-03 prometheus-webhook-dingtalk[26590]: level=info ts=2021-12-01T06:29:35.920Z caller=web.go:210 compo>

6 Verify the dingtalk port

~# lsof -i :8060
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
prometheu 26590 root 3u IPv6 100982 0t0 TCP *:8060 (LISTEN)
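
A listening port only shows that the process is up. To check the whole path to DingTalk, one option is to post a hand-written alert to the webhook endpoint. The sketch below is not from the original article: the payload imitates the Alertmanager webhook format (version 4) with made-up alert names and timestamps, the URL assumes the `webhook1` target on localhost, and `build_test_alert` and `send_test_alert` are hypothetical helper names. The configured template may expect fields that this minimal payload leaves out.

```shell
#!/bin/sh
# Assumption: the service runs locally and the webhook1 target is configured.
DINGTALK_URL="${DINGTALK_URL:-http://127.0.0.1:8060/dingtalk/webhook1/send}"

# Print a minimal Alertmanager-style webhook payload (all values are test data).
build_test_alert() {
    cat << 'EOF'
{
  "version": "4",
  "status": "firing",
  "receiver": "dingding.webhook1",
  "groupLabels": {"alertname": "TestAlert"},
  "commonLabels": {"alertname": "TestAlert", "severity": "warning"},
  "commonAnnotations": {"summary": "Manual test alert"},
  "externalURL": "http://localhost:9093",
  "alerts": [
    {
      "status": "firing",
      "labels": {"alertname": "TestAlert", "severity": "warning"},
      "annotations": {"summary": "Manual test alert"},
      "startsAt": "2021-12-01T06:30:00Z",
      "endsAt": "0001-01-01T00:00:00Z",
      "generatorURL": "http://localhost:9090"
    }
  ]
}
EOF
}

# POST the payload to the webhook service; curl exits non-zero on HTTP errors.
send_test_alert() {
    build_test_alert | curl -fsS -H 'Content-Type: application/json' \
        --data-binary @- "$DINGTALK_URL"
}

# Usage (sends a real message to the DingTalk group):
#   send_test_alert
```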


7 Integrate with Alertmanager

Configure Alertmanager by editing alertmanager.yml:

global:
  resolve_timeout: 5m
  # Sender address
  smtp_from: '13520642397@139.com'
  # SMTP server host:port (for example, smtp.qq.com uses port 465 or 587)
  smtp_smarthost: 'smtp.139.com:25'
  # User name
  smtp_auth_username: '13520642397@139.com'
  # Authorization code or password
  smtp_auth_password: 'secret'
  #smtp_auth_secret: 'secret'
  smtp_require_tls: false
  smtp_hello: '139.com'
templates:
  # Notification template files
  - '/etc/alertmanager/template/email.tmpl'
route:
  # Labels to group alerts by (e.g. alerts with alertname=A and alertname=B go into two separate groups)
  group_by: ['alertname']
  # How long to wait before sending the first notification for a new group,
  # so alerts arriving for the same group within 5 seconds are batched together
  group_wait: 5s
  # How long to wait before notifying about new alerts added to a group that has already been notified
  group_interval: 5m
  # How long to wait before re-sending a notification for alerts that are still firing (units: s/m/h)
  repeat_interval: 5m
  # Default receiver: alerts that match no child route are sent here
  receiver: 'keendataMail'
  # Child routes (they inherit all route attributes above and may override them)
  routes:
    # No matcher, so every alert goes to DingTalk; continue: true lets matching carry on to the routes below
    - receiver: dingding.webhook1
      continue: true
    # Alerts whose alarmClassify label is normal are mailed to keendataMail.
    # The label name and value are arbitrary as long as they agree with the labels set in the Prometheus rules.
    - receiver: keendataMail
      match_re:
        alarmClassify: normal
    # Alerts whose alarmClassify label is special are mailed to QQemail
    # (again, the label must agree with the Prometheus rules).
    - receiver: QQemail
      match_re:
        alarmClassify: special
receivers:
  - name: 'dingding.webhook1'
    webhook_configs:
      - url: 'http://192.168.12.218:8060/dingtalk/webhook1/send'
        send_resolved: true
  - name: 'keendataMail'
    email_configs:
      # To send to several recipients, separate the addresses with ','
      - to: 'zhangyichao@keendata.com'
        send_resolved: true
        # Subject of the notification email
        headers: {Subject: "Alertmanager alert email"}
  - name: 'QQemail'
    email_configs:
      # To send to several recipients, separate the addresses with ','
      - to: '758829146@qq.com'
        send_resolved: true
        # Subject of the notification email
        headers: {Subject: "Alertmanager alert email"}
inhibit_rules:
  - source_match:
      severity: 'critical'
    target_match:
      severity: 'warning'
    equal: ['alertname', 'dev', 'instance']
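
The email routes above only fire when an alert carries a matching `alarmClassify` label, and that label has to be set on the Prometheus side. The sketch below writes a rule file in the same heredoc style used earlier. It is an illustration, not the author's rule set: the file path, group name, alert name, and the `up == 0` expression are all assumptions.

```shell
#!/bin/sh
# Hypothetical path; point it at a file covered by rule_files in prometheus.yml.
RULE_FILE="${RULE_FILE:-./dingtalk-demo.rules.yml}"

# The quoted heredoc keeps {{ $labels.instance }} literal for Prometheus to expand.
cat > "$RULE_FILE" << \EOF
groups:
  - name: dingtalk-demo
    rules:
      - alert: InstanceDown
        expr: up == 0
        for: 1m
        labels:
          severity: critical
          # Picked up by the keendataMail route (alarmClassify: normal)
          alarmClassify: normal
        annotations:
          summary: "Instance {{ $labels.instance }} is down"
EOF

# Inspect the result
cat "$RULE_FILE"
```

If `promtool` is installed, `promtool check rules "$RULE_FILE"` can validate the file before Prometheus loads it.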


8 Restart the Alertmanager service
~# systemctl restart alertmanager.service
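
A syntax error in alertmanager.yml will stop the service from coming back after a restart, so it can be worth validating the file first. A minimal sketch, assuming `amtool` (shipped with Alertmanager) is on the PATH and that the config lives at `/etc/alertmanager/alertmanager.yml`; both the path and the function name `restart_alertmanager_checked` are assumptions rather than details from the original setup.

```shell
#!/bin/sh
# Assumption: adjust this path to wherever alertmanager.yml actually lives.
ALERTMANAGER_CONFIG="${ALERTMANAGER_CONFIG:-/etc/alertmanager/alertmanager.yml}"

# Restart Alertmanager only if the config file passes amtool's check.
restart_alertmanager_checked() {
    if amtool check-config "$ALERTMANAGER_CONFIG"; then
        systemctl restart alertmanager.service
    else
        echo "config check failed; not restarting" >&2
        return 1
    fi
}

# Usage:
#   restart_alertmanager_checked
```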

References

https://www.cnblogs.com/wangguishe/p/15629091.html