Prometheus : Alert Settings
2018/12/12

	This section configures alerting for Prometheus. Alerts can be delivered in many ways, including Slack, HipChat, and WeChat; this example configures an Email receiver. For more details on alerting, refer to the official documentation. ⇒ https://prometheus.io/docs/alerting/configuration/
[1]	Email notification requires an SMTP server. This example assumes that an SMTP server is running on localhost.
[2]	Install Alertmanager on the Prometheus server host.

root@dlp:~# apt -y install prometheus-alertmanager

[3]	Configure Prometheus Alert Settings.

root@dlp:~# mv /etc/prometheus/alertmanager.yml /etc/prometheus/alertmanager.yml.org

# configure Alertmanager

root@dlp:~# vi /etc/prometheus/alertmanager.yml

# create new

global:
  # SMTP server to use
  smtp_smarthost: 'localhost:25'
  # require TLS or not
  smtp_require_tls: false
  # notification sender's Email address
  smtp_from: 'Alertmanager <root@dlp.srv.world>'
  # if the SMTP server requires authentication, also set the following
  # smtp_auth_username: 'alertmanager'
  # smtp_auth_password: 'password'

route:
  # Receiver name for notification
  receiver: 'email-notice'
  # grouping definition
  group_by: ['alertname', 'Service', 'Stage', 'Role']
  group_wait:      30s
  group_interval:  5m
  repeat_interval: 4h

receivers:
# receiver name (any name; it must match the receiver set under route)
- name: 'email-notice'
  email_configs:
  # destination Email address
  - to: "root@localhost"
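
Before restarting the services, the new file can be checked for mistakes. The following is a minimal sketch that wraps 'amtool check-config'; it assumes the installed Alertmanager package provides amtool (older packages may not), and the function name check_alertmanager_config is made up for this example.

```shell
# Hypothetical helper: validate an Alertmanager config file with amtool.
# Assumes amtool is installed; returns 2 when it cannot be found.
check_alertmanager_config() {
    config="${1:-/etc/prometheus/alertmanager.yml}"
    if ! command -v amtool >/dev/null 2>&1; then
        echo "amtool not found; skipping check of $config" >&2
        return 2
    fi
    # amtool exits with a non-zero status when validation fails
    amtool check-config "$config"
}
```

Called without an argument, it checks /etc/prometheus/alertmanager.yml; an exit status of 0 means amtool accepted the file.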

# configure Alerting rules

root@dlp:~# vi /etc/prometheus/alert_rules.yml

# create new

# for example, alert when a scrape target such as node-exporter is down

groups:
- name: Instances
  rules:
  - alert: InstanceDown
    expr: up == 0
    for: 5m
    labels:
      severity: critical
    annotations:
      description: '{{ $labels.instance }} of job {{ $labels.job }} has been down for more than 5 minutes.'
      summary: 'Instance {{ $labels.instance }} down'

root@dlp:~# vi /etc/prometheus/prometheus.yml

# add settings for Alert

# rule_files:
  # - "first.rules"
  # - "second.rules"

alerting:
  alertmanagers:
  - static_configs:
    - targets:
      # Alertmanager Server:Port
      - 'localhost:9093'

rule_files:
  # alerting rule file (a relative path is resolved from the directory of prometheus.yml)
  - "alert_rules.yml"
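
The edited configuration can be validated before the restart. This sketch wraps 'promtool check config', which also checks the rule files listed under rule_files; it assumes promtool was installed together with Prometheus, and the function name check_prometheus_config is hypothetical.

```shell
# Hypothetical helper: validate prometheus.yml and its rule files with promtool.
# Assumes promtool is installed; returns 2 when it cannot be found.
check_prometheus_config() {
    config="${1:-/etc/prometheus/prometheus.yml}"
    if ! command -v promtool >/dev/null 2>&1; then
        echo "promtool not found; skipping check of $config" >&2
        return 2
    fi
    # promtool exits with a non-zero status when the config or a rule file is invalid
    promtool check config "$config"
}
```

Called without an argument, it checks /etc/prometheus/prometheus.yml; fix any reported error before restarting the services.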

root@dlp:~# systemctl restart prometheus prometheus-alertmanager
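
After the restart, it is worth confirming that both services actually came up. A minimal sketch using 'systemctl is-active' follows; the helper name check_units_active is an assumption of this example.

```shell
# Hypothetical helper: report whether each given systemd unit is active.
# Returns 1 if at least one unit is not active, 0 otherwise.
check_units_active() {
    failed=0
    for unit in "$@"; do
        if systemctl is-active --quiet "$unit"; then
            echo "$unit: active"
        else
            echo "$unit: NOT active"
            failed=1
        fi
    done
    return "$failed"
}
```

For example: check_units_active prometheus prometheus-alertmanager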

[4]	If node-exporter stays down for more than 5 minutes (the 'for' value in the rule), an Email like the following is sent. (The mail body is HTML.)

root@dlp:~# mail

"/var/mail/root": 1 message 1 new
>N   1 Alertmanager       Wed Dec 12 19:45  88/9628  [FIRING:1] InstanceDown (
? 1
Return-Path: <root@dlp.srv.world>
X-Original-To: root@localhost
Delivered-To: root@localhost
Received: from localhost (localhost [IPv6:::1])
        by dlp.srv.world (Postfix) with ESMTP id EFF0C4027A
        for <root@localhost>; Wed, 12 Dec 2018 19:45:22 +0900 (JST)
Subject: [FIRING:1] InstanceDown (node01.srv.world:9100 node example page)
To: root@localhost
From: Alertmanager <root@dlp.srv.world>
Content-Type: text/html; charset=UTF-8
Date: Wed, 12 Dec 2018 19:45:22 +0900
Message-Id: <20181212024522.EFF0C4027A@dlp.srv.world>

.....
.....
