Custom log monitoring and alerting with Promtail + Loki + Alertmanager

2023-08-15 07:40:14

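Loki is an open-source log aggregation system developed by Grafana Labs, designed to provide efficient log handling for cloud-native architectures.
Promtail: collects log data from applications and the system and ships it to Loki.
Loki: stores the log data and exposes an HTTP API for querying, filtering, and selecting logs.

Alertmanager: manages alert notifications.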
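To make the Promtail-to-Loki hand-off concrete, the sketch below builds the JSON body that Loki's push endpoint (`/loki/api/v1/push`, the URL that appears in the Promtail configuration in this article) accepts: a list of streams, each with a label set and `[timestamp-in-nanoseconds, line]` pairs. The helper name `build_push_payload` and the sample log lines are illustrative assumptions; Promtail does this work itself, so the code is only meant to show the shape of the data.

```python
import json
import time


def build_push_payload(labels, lines, now_ns=None):
    """Build a Loki push-API body: one stream, one entry per log line.

    `labels` is a dict of stream labels; `lines` is a list of log lines.
    Timestamps are nanoseconds since the epoch, encoded as strings.
    """
    if now_ns is None:
        now_ns = time.time_ns()
    # Give each line a distinct, increasing timestamp.
    values = [[str(now_ns + i), line] for i, line in enumerate(lines)]
    return {"streams": [{"stream": dict(labels), "values": values}]}


# Hypothetical sample data; the labels mirror the Promtail config in this article.
payload = build_push_payload(
    {"job": "instance_log", "host": "192.168.92.128"},
    ["error: disk full", "info: retrying"],
    now_ns=1_692_085_214_000_000_000,
)
print(json.dumps(payload, indent=2))
```

Every distinct label set becomes its own stream in Loki, which is why the Promtail configuration attaches only a small, fixed set of labels (`job`, `host`).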

 

Installation and deployment

docker-compose.yaml

version: "3"

services:
  prometheus:
    image: prom/prometheus:latest
    restart: "always"
    ports:
      - 9090:9090
    container_name: "prometheus"
    volumes:
      - /home/test/prometheus/prometheus.yml:/etc/prometheus/prometheus.yml
      - /home/test/rules:/etc/prometheus/rules
      - /home/test/prometheus/data:/prometheus
    command:
      - '--config.file=/etc/prometheus/prometheus.yml'    # config path; must match the volume mounted above
      - '--storage.tsdb.path=/prometheus'                 # data path; must match the volume mounted above

  loki:
    container_name: loki
    image: grafana/loki:latest    # consider pinning a version, e.g. 2.8.4 to match the Promtail release downloaded below
    restart: "always"
    ports:
      - "3100:3100"
    volumes:
      - /home/test/loki:/etc/loki
      - /home/test/loki/rules:/loki/rules
    command: -config.file=/etc/loki/loki-config.yml    # the file created under /home/test/loki, which is mounted at /etc/loki


  # Alerting module
  alertmanager:
    image: prom/alertmanager:latest
    restart: "always"
    ports:
      - 9093:9093
    container_name: "alertmanager"
    volumes:
      - /home/test/alertmanager/alertmanager.yml:/etc/alertmanager/alertmanager.yml

  grafana:
    container_name: grafana
    image: grafana/grafana:latest
    ports:
      - "3000:3000"

Deploying and starting Alertmanager

docker-compose -f docker-compose.yaml up -d alertmanager

Example alertmanager.yml:

vi  /home/test/alertmanager/alertmanager.yml
global:
  resolve_timeout: 5m
  smtp_from: 'your_email@chinatelecom.cn'
  smtp_smarthost: 'smtp.chinatelecom.cn:465'
  smtp_auth_username: 'your_username'
  smtp_auth_password: 'your_pwd'
  smtp_require_tls: false
  smtp_hello: 'chinatelecom'

templates:
  - '/etc/alertmanager/templates/*.tmpl'

route:
  receiver: 'email'
  group_by: ['alertname']
  group_wait: 5s
  group_interval: 5m
  repeat_interval: 15m

  routes:
  - receiver: 'email'
    group_by: [severity]
    match:
      severity: critical

receivers:
- name: 'email'
  email_configs:
  - to: 'email@chinatelecom.cn'

inhibit_rules:
  - source_match:
      severity: 'critical'
    target_match:
      severity: 'warning'
    equal: ['alertname', 'dev', 'instance']
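One way to check the email route without waiting for a real log alert is to hand Alertmanager a test alert through its v2 HTTP API (`POST /api/v2/alerts`). The sketch below builds such an alert; the function names, the alert name `ManualTest`, and the base URL (taken from the `alertmanager_url` in the Loki configuration in this article) are assumptions for illustration. The network call is defined but deliberately not executed here.

```python
import json
import urllib.request
from datetime import datetime, timezone


def build_test_alert(alertname, severity, instance, summary):
    """Return a one-element alert list in the shape Alertmanager's v2 API accepts."""
    return [{
        "labels": {"alertname": alertname, "severity": severity, "instance": instance},
        "annotations": {"summary": summary},
        # RFC 3339 timestamp in UTC
        "startsAt": datetime.now(timezone.utc).isoformat(),
    }]


def post_alerts(alerts, base_url="http://192.168.92.128:9093"):
    """POST the alerts to /api/v2/alerts and return the HTTP status code."""
    req = urllib.request.Request(
        base_url + "/api/v2/alerts",
        data=json.dumps(alerts).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=5) as resp:
        return resp.status


alerts = build_test_alert("ManualTest", "critical", "logs", "Test alert for the email route")
print(json.dumps(alerts, indent=2))
# To actually send it (needs a reachable Alertmanager): post_alerts(alerts)
```

With the configuration above, an alert labelled `severity: critical` matches the child route and is grouped by `severity`; any other severity falls through to the top-level route, which is grouped by `alertname`. Both end up at the `email` receiver.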

Deploying and starting Loki

docker-compose -f docker-compose.yaml up -d loki

Example loki-config.yml:

vi  /home/test/loki/loki-config.yml
auth_enabled: false

server:
  http_listen_port: 3100

common:
  path_prefix: /loki
  storage:
    filesystem:
      chunks_directory: /loki/chunks
      rules_directory: /loki/rules
  replication_factor: 1
  ring:
    kvstore:
      store: inmemory

schema_config:
  configs:
    - from: 2023-01-01
      store: boltdb-shipper
      object_store: filesystem
      schema: v11
      index:
        prefix: index_
        period: 24h

limits_config:
  reject_old_samples: true
  reject_old_samples_max_age: 168h

chunk_store_config:
  max_look_back_period: 168h
table_manager:
  retention_deletes_enabled: true
  retention_period: 168h

ruler:
  alertmanager_url: http://192.168.92.128:9093

# By default, Loki will send anonymous, but uniquely-identifiable usage and configuration
# analytics to Grafana Labs. These statistics are sent to https://stats.grafana.org/
#
# Statistics help us better understand how Loki is used, and they show us performance
# levels for most users. This helps us prioritize features and documentation.
# For more information on what's sent, look at
# https://github.com/grafana/loki/blob/main/pkg/usagestats/stats.go
# Refer to the buildReport method to see what goes into a report.
#
# If you would like to disable reporting, uncomment the following lines:
#analytics:
#  reporting_enabled: false

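Example Loki log alerting rule

vi /home/test/loki/rules/instance/rate-alert.yml
# Note: with auth_enabled: false Loki uses the tenant ID "fake", and the ruler's local
# rule storage is laid out as <rules_directory>/<tenant>/. If this rule is not picked up,
# try placing the file under /home/test/loki/rules/fake/ instead.
groups:
  - name: rate-alerting
    rules:
    - alert: 大量报错日志    # "Too many error logs"
      expr: count_over_time(({host=~"192.168.*"} |~ "error")[2m]) > 100
      for: 5m
      labels:
        severity: warning
        instance: "logs"
      annotations:
        summary: Too many error logs
        description: Too many error logs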
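The rule combines two conditions that are easy to conflate: `count_over_time(...[2m]) > 100` must be true at an evaluation, and `for: 5m` requires it to stay true across evaluations for five minutes before the alert fires. The following is a simplified simulation of that behaviour, not Loki's ruler code; the function names, the 60-second evaluation interval, and the sample data are assumptions.

```python
import re


def count_over_time(entries, pattern, now, window):
    """Count (timestamp, line) entries matching `pattern` within (now - window, now]."""
    regex = re.compile(pattern)
    return sum(1 for ts, line in entries if now - window < ts <= now and regex.search(line))


def alert_fires(entries, eval_times, pattern="error", window=120, threshold=100, hold=300):
    """Return True once the count has exceeded `threshold` continuously for `hold` seconds."""
    pending_since = None
    for now in eval_times:
        if count_over_time(entries, pattern, now, window) > threshold:
            if pending_since is None:
                pending_since = now  # alert enters the pending state
            if now - pending_since >= hold:
                return True  # condition held long enough: alert fires
        else:
            pending_since = None  # condition broke: pending state is reset
    return False


# One error line per second for ten minutes versus a 150-second burst.
steady = [(t, "error: request failed") for t in range(600)]
burst = [(t, "error: request failed") for t in range(150)]
evaluations = range(120, 601, 60)

print("steady errors fire:", alert_fires(steady, evaluations))
print("short burst fires:", alert_fires(burst, evaluations))
```

In this model the sustained error stream fires while the short burst does not, because the burst drops below the threshold before the five-minute hold elapses. If short bursts should also alert, shortening `for` is the usual adjustment.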

 

Deploying and starting Promtail

curl -O -L "https://github.com/grafana/loki/releases/download/v2.8.4/promtail-linux-amd64.zip"

unzip "promtail-linux-amd64.zip"

chmod a+x "promtail-linux-amd64"

# Start the Promtail service
nohup ./promtail-linux-amd64 -config.file=promtail-local-config.yaml &

Example Promtail configuration

promtail-local-config.yaml

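# Port on which Promtail listens
server:
  http_listen_port: 9080
  grpc_listen_port: 0

# File in which Promtail records how far it has read in each log, so that it resumes from that point after a restart
positions:
  filename: /tmp/positions_tmp.yaml

# How Promtail connects to the Loki instance
clients:
  - url: http://192.168.92.128:3100/loki/api/v1/push

scrape_configs:
- job_name: instance_log
  static_configs:
  # Target discovery on the current node.
  # The Prometheus service-discovery code requires this field, but it has little meaning for Promtail, which can only read files on the local machine.
  # It should therefore contain only localhost, or be omitted entirely, in which case Promtail defaults to localhost.
  - targets:
      - localhost
    labels:
      # Labels attached to every log line sent to the push API
      job: instance_log
      host: 192.168.92.128
      # Path of the log files to collect (glob patterns are supported)
      __path__: /home/test/logs/*log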
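The `__path__` value is a glob, so it helps to check which file names it would and would not pick up. The sketch below approximates this with Python's `PurePosixPath.match`, which, like the pattern here, does not let `*` cross directory boundaries. Promtail uses its own glob library, so treat this as an approximation; the candidate file names are made up for illustration.

```python
from pathlib import PurePosixPath

# The pattern from the Promtail configuration above.
PATTERN = "/home/test/logs/*log"


def is_collected(path, pattern=PATTERN):
    """Approximate whether `path` would be matched by the glob `pattern`."""
    return PurePosixPath(path).match(pattern)


# Hypothetical file names to test against the pattern.
candidates = [
    "/home/test/logs/app.log",
    "/home/test/logs/errorlog",
    "/home/test/logs/app.log.1",
    "/home/test/logs/nested/app.log",
]
for candidate in candidates:
    print(candidate, "->", is_collected(candidate))
```

Because the pattern ends in `*log` rather than `*.log`, any file whose name ends in `log` is collected, while rotated files such as `app.log.1` and files in subdirectories are skipped.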

 

 

Results

 

 
