The Alertmanager handles alerts sent by client applications such as the Prometheus server. It takes care of routing them to the correct receiver through an integration such as email. See Alertmanager in the Prometheus documentation.
Perform the following steps on an OES server where Prometheus is installed:
Download and extract the Alertmanager files from the Prometheus website.
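For example, the release archive can be downloaded from the Alertmanager GitHub releases page (version 0.25.0 is shown here; adjust the version and architecture to match your environment):
wget https://github.com/prometheus/alertmanager/releases/download/v0.25.0/alertmanager-0.25.0.linux-amd64.tar.gz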
tar xvfz alertmanager-0.25.0.linux-amd64.tar.gz
Copy the alert_manager.sh script to the extracted directory and run the script. See alert_manager.sh.
sh ./alert_manager.sh
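To confirm that Alertmanager is up after the script finishes, you can query its health endpoint (this assumes the default Alertmanager port 9093):
curl -s http://localhost:9093/-/healthy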
After installing Alertmanager on a target, update the Prometheus server's static configuration and restart the Prometheus service.
On the Prometheus (Monitoring) server, edit the Prometheus configuration file:
/etc/prometheus/exporter-config.yml.
Update the hostname or IP address of the Alertmanager in the targets section of the alert-manager job (see the example below).
global:
  scrape_interval: 15s

scrape_configs:
  - job_name: Docker Servers
    static_configs:
      - targets: ['localhost:8080']

  - job_name: OES Servers
    static_configs:
      - targets: ['localhost:9100', 'oesnode01:9100', 'oesnode02:9100', 'oesnode03:9100', 'oesnode04.com']

  - job_name: 'alert-manager'
    static_configs:
      - targets: ['localhost:9093']
Restart the service after the configuration file is updated.
systemctl daemon-reload
systemctl restart prometheus.service
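Optionally, verify that Prometheus is now scraping the alert-manager job by querying the targets API on the Prometheus server (the default Prometheus port 9090 is assumed):
curl -s http://localhost:9090/api/v1/targets | grep alert-manager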
Perform the following steps to configure the Alertmanager notification system:
Enter the SMTP server information in the /etc/alertmanager/alertmanager.yml file.
Change the example information according to your requirements.
route:
  group_by: [Alertname]
  group_interval: 30s
  repeat_interval: 30s
  # Send all notifications to me.
  receiver: email-me

receivers:
  - name: email-me
    email_configs:
      - send_resolved: true
        to: admin@email.com
        from: demo@email.com
        smarthost: smtp.email.com:587
        auth_username: demo@email.com
        auth_identity: demo@email.com
        auth_password: <enter_the_password>
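Before restarting, you can validate the syntax of this file with amtool, which is included in the Alertmanager release archive:
amtool check-config /etc/alertmanager/alertmanager.yml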
Create a rule file named prometheus_rules.yml in the /etc/prometheus directory. The example that follows alerts you if any node is unavailable for more than one minute or if a node has 10% or less of its disk space remaining.
groups:
  - name: custom_rules
    rules:
      - record: node_memory_MemFree_percent
        expr: 100 - (100 * node_memory_MemFree_bytes / node_memory_MemTotal_bytes)
      - record: node_filesystem_free_percent
        expr: 100 * node_filesystem_free_bytes{mountpoint="/"} / node_filesystem_size_bytes{mountpoint="/"}

  - name: alert_rules
    rules:
      - alert: InstanceDown
        expr: up == 0
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "Instance [{{ $labels.instance }}] down"
          description: "[{{ $labels.instance }}] of job [{{ $labels.job }}] has been down for more than 1 minute."

      - alert: DiskSpaceFree10Percent
        expr: node_filesystem_free_percent <= 10
        labels:
          severity: warning
        annotations:
          summary: "Instance [{{ $labels.instance }}] has 10% or less Free disk space"
          description: "[{{ $labels.instance }}] has only {{ $value }}% or less free."
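You can check the rule file for syntax errors with promtool, which ships with Prometheus:
promtool check rules /etc/prometheus/prometheus_rules.yml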
For more information about Alertmanager, see Configuration and Alerting Rules on the Prometheus documentation site.
Edit the configuration file (/etc/prometheus/exporter-config.yml) to include the rule file and the alerting configuration for the notification system (see the rule_files and alerting sections in the example below).
global:
  scrape_interval: 15s

scrape_configs:
  - job_name: Docker Servers
    static_configs:
      - targets: ['localhost:8080']

  - job_name: OES Servers
    static_configs:
      - targets: ['localhost:9100', 'oesnode01:9100', 'oesnode02:9100', 'oesnode03:9100', 'oesnode04.com']

  - job_name: 'alert-manager'
    static_configs:
      - targets: ['localhost:9093']

rule_files:
  - "prometheus_rules.yml"

alerting:
  alertmanagers:
    - static_configs:
        - targets:
          # alertmanager:9093
          - localhost:9093
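Before restarting the services, you can validate the full configuration, including the referenced rule file, with promtool:
promtool check config /etc/prometheus/exporter-config.yml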
Restart the service after the configuration file is updated.
systemctl daemon-reload
systemctl restart prometheus.service
systemctl restart alertmanager.service
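To confirm that the alerting rules were loaded and that Alertmanager is reachable, you can query the HTTP APIs of both services (default ports 9090 and 9093 assumed):
curl -s http://localhost:9090/api/v1/rules
curl -s http://localhost:9093/api/v2/alerts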