Prometheus: Getting Started and Going Further


Heads-up: this is a long read, so settle in.


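What is a TSDB?

A TSDB (time series database) is software optimized for handling time series data, that is, arrays of values indexed by time.

Characteristics of time series databases

Most operations are writes.
Writes are almost always sequential appends; data usually arrives already ordered by time.
Writes rarely target data from long ago, and updates are rare. In most cases data is written within seconds or minutes of being collected.
Deletes usually remove whole blocks: a historical start time is chosen and the blocks that follow are dropped. Deleting a single point in time, or scattered points, is rare.
The data set is large, usually larger than memory. Queries typically touch only a small, unpredictable part of it, so caching helps very little.
Reads are typically sequential scans in ascending or descending time order.
Highly concurrent reads are common.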

Common time series databases

TSDB project    Website
InfluxDB        https://influxdata.com/
Prometheus      https://prometheus.io/

Other projects are not listed here.

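What is Prometheus?

Prometheus is an open-source monitoring and alerting system with a built-in time series database (TSDB), originally developed at SoundCloud. It is written in Go and was inspired by Borgmon, Google's internal monitoring system.

In 2016 Prometheus joined the Cloud Native Computing Foundation (CNCF), a Linux Foundation project that Google helped launch, as its second hosted project after Kubernetes. The project is very active in the open-source community.

Compared with Heapster (a Kubernetes subproject that collects cluster performance data), Prometheus is more complete and more general. Its performance is also sufficient for clusters of tens of thousands of machines.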

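Features of Prometheus

A multi-dimensional data model.
A flexible query language.
No dependency on distributed storage; each server node is autonomous.
Time series are collected by pulling over HTTP.
Time series can also be pushed through an intermediary gateway.
Targets are found through service discovery or static configuration.
Many graphing and dashboard options are supported, for example Grafana.

Prometheus components

alertmanager
The alert manager, which handles alert notifications.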

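Architecture of Prometheus

The diagram below shows the overall architecture of Prometheus and the role of some components in its ecosystem:

Prometheus works by periodically scraping the state of monitored components over HTTP. Any component can be monitored as long as it exposes a suitable HTTP endpoint; no SDK or other integration work is required. This makes Prometheus a good fit for virtualized environments such as VMs, Docker, and Kubernetes. The HTTP endpoint that exposes a component's metrics is called an exporter. Exporters already exist for most components commonly used at internet companies, such as Varnish, HAProxy, Nginx, MySQL, and Linux system information (disk, memory, CPU, network, and so on).

The Prometheus workflow is roughly as follows:

The Prometheus daemon periodically scrapes metrics from its targets; each target must expose an HTTP endpoint for it to scrape. Targets can be specified through the configuration file, text files, Zookeeper, Consul, DNS SRV lookup, and other mechanisms. Prometheus monitors by pulling: the server pulls data directly from a target, or indirectly from an intermediary gateway that clients push to.
Prometheus stores all scraped data locally, processes it according to configured rules, and stores the results as new time series.
Prometheus exposes the collected data through PromQL and other APIs for visualization. Supported options include Grafana, the bundled PromDash (since deprecated in favor of Grafana), and the built-in template engine. An HTTP API is also available for queries with custom output.
PushGateway lets clients push metrics to it, while Prometheus simply scrapes the gateway on its regular schedule.
Alertmanager is a component independent of Prometheus. It receives the alerts that Prometheus raises from rules written as query expressions, and offers very flexible ways of notifying.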

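When Prometheus fits

Prometheus is very good at recording purely numeric time series. It suits both machine-centric monitoring, such as server hardware metrics, and monitoring of highly dynamic service-oriented architectures. For microservices, its multi-dimensional data collection and its query language are particularly powerful. Prometheus is designed for reliability: when something fails, it helps you locate and diagnose the problem quickly. Setting it up requires no heavy dependencies on hardware or other services.

When Prometheus does not fit

The value of Prometheus lies in reliability: even under bad conditions you can still reach it and view the statistics it holds about your systems. If you need statistics that are 100% accurate, it is not a good choice; for example, it is not suitable for a real-time billing system.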

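Basic concepts

Data model

Time series identity

A time series is identified by a metric name and a set of key/value labels. Samples with the same name and the same labels belong to the same time series.

A metric name consists of ASCII letters, digits, underscores, and colons, and must match the regular expression [a-zA-Z_:][a-zA-Z0-9_:]*. The name should be meaningful and usually denotes something measurable; for example, http_requests_total can represent the total number of HTTP requests.

Labels make Prometheus data richer by distinguishing individual instances of a metric; for example, http_requests_total{method="POST"} represents all HTTP requests that used POST.

A label name consists of ASCII letters, digits, and underscores; names beginning with __ are reserved by Prometheus. A label value may contain any Unicode characters, including Chinese.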
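
The naming rules above can be checked mechanically. The following is a minimal sketch in Go; the function names are hypothetical, and the label-name pattern is an assumption derived from the prose rule (ASCII letters, digits, underscores) rather than something quoted in this article.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

var (
	// Metric name pattern as quoted in the text above.
	metricNameRE = regexp.MustCompile(`^[a-zA-Z_:][a-zA-Z0-9_:]*$`)
	// Assumed label name pattern: like metric names, but without colons.
	labelNameRE = regexp.MustCompile(`^[a-zA-Z_][a-zA-Z0-9_]*$`)
)

// validMetricName reports whether s is a legal metric name.
func validMetricName(s string) bool {
	return metricNameRE.MatchString(s)
}

// validUserLabelName reports whether s is a legal label name that does not
// use the "__" prefix reserved by Prometheus.
func validUserLabelName(s string) bool {
	return labelNameRE.MatchString(s) && !strings.HasPrefix(s, "__")
}

func main() {
	fmt.Println(validMetricName("http_requests_total"))
	fmt.Println(validMetricName("2xx_total"))
	fmt.Println(validUserLabelName("method"))
	fmt.Println(validUserLabelName("__name__"))
}
```

A name such as 2xx_total is rejected because it starts with a digit, and __name__ is rejected as a user label because of the reserved prefix.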

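Samples

The data points collected over time for a time series are called samples. Each sample consists of:

a float64 value
a Unix timestamp with millisecond precision

Notation

Prometheus uses a notation for time series similar to OpenTSDB's: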

<metric name>{<label name>=<label value>, ...}

It contains the metric name and the labels of the time series.
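
To make the notation concrete, here is a small Go sketch that renders a name and a label set in that form. The helper seriesID is hypothetical; sorting the labels is a choice made here so that equal label sets always yield the same string, not something the notation itself requires.

```go
package main

import (
	"fmt"
	"sort"
	"strconv"
	"strings"
)

// seriesID renders a metric name and its labels as name{k="v",...}.
// Labels are sorted by name so the result is deterministic.
func seriesID(name string, labels map[string]string) string {
	if len(labels) == 0 {
		return name
	}
	keys := make([]string, 0, len(labels))
	for k := range labels {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	parts := make([]string, 0, len(keys))
	for _, k := range keys {
		// strconv.Quote adds the double quotes and escapes the value.
		parts = append(parts, k+"="+strconv.Quote(labels[k]))
	}
	return name + "{" + strings.Join(parts, ",") + "}"
}

func main() {
	fmt.Println(seriesID("http_requests_total",
		map[string]string{"method": "POST", "code": "200"}))
}
```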

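The four metric types

Prometheus metrics come in four types: Counter, Gauge, Histogram, and Summary.

Counter

A Counter is a cumulative value that only increases (or resets to zero when the process restarts). It is typically used for totals such as requests served or errors.

For example, http_requests_total in the Prometheus server is the total number of HTTP requests Prometheus has handled. With increase() we can easily get the growth over any time range (delta() is intended for gauges); the PromQL section covers this in more detail.

Gauge

A Gauge is an instantaneous value that can go up or down arbitrarily. It is often used for values such as memory usage or disk usage.

For example, go_goroutines in the Prometheus server is the current number of goroutines.

Histogram

A Histogram consists of _bucket{le="<upper bound>"}, _bucket{le="+Inf"}, _sum, and _count series. It samples observations (usually request durations or response sizes) and counts them in configurable buckets as well as in total; it is usually used to compute quantiles.

For example, prometheus_local_storage_series_chunks_persisted in the Prometheus server describes the number of chunks that have to be stored per time series, and can be used to compute quantiles of the data awaiting persistence.

Summary

A Summary is similar to a Histogram and consists of {quantile="<φ>"}, _sum, and _count series. It also samples observations (usually request durations or response sizes), but it stores the quantile values directly instead of deriving them from bucket counts.

For example, prometheus_target_interval_length_seconds in the Prometheus server.

Histogram vs Summary

Both include _sum and _count.
A Histogram needs its _bucket series to compute a quantile, whereas a Summary stores the quantile values directly.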

Jobs and instances

In Prometheus, any individual scrape target is called an instance. A collection of instances of the same type is called a job. Here is a job with four replicated instances:

- job: api-server
    - instance 1: 1.2.3.4:5670
    - instance 2: 1.2.3.4:5671
    - instance 3: 5.6.7.8:5670
    - instance 4: 5.6.7.8:5671

Automatically generated labels and time series

When scraping a target, Prometheus automatically attaches labels to the scraped time series to identify the target:

job: The configured job name that the target belongs to.
instance: The <host>:<port> part of the target's URL that was scraped.

If either label is already present in the scraped data, the honor_labels setting decides which value wins. See the official documentation: https://prometheus.io/docs/operating/configuration/#%3Cscrape_config%3E

For each instance, Prometheus also stores samples in the following time series:

up{job="<job-name>", instance="<instance-id>"}: 1 if the instance is healthy
up{job="<job-name>", instance="<instance-id>"}: 0 if the scrape failed

scrape_duration_seconds{job="<job-name>", instance="<instance-id>"}: how long the scrape took

scrape_samples_post_metric_relabeling{job="<job-name>", instance="<instance-id>"}: the number of samples remaining after metric relabeling was applied

scrape_samples_scraped{job="<job-name>", instance="<instance-id>"}: the number of samples the target exposed

The up time series is useful for monitoring whether an instance is working.

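PromQL

PromQL basics

PromQL (Prometheus Query Language) is the query DSL developed for Prometheus. It is very expressive, has many built-in functions, and is used both for day-to-day visualization and for alerting rules.

You can enter queries such as the following at http://localhost:9090/graph and inspect the results: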

http_requests_total{code="200"}

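Strings and numbers

Strings: in queries, strings mostly appear as label values in matchers. They follow Go string syntax and may be written with double quotes, single quotes, or backticks, for example: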

"this is a string"
'these are unescaped: \n \\ \t'
`these are not unescaped: \n ' " \t`

Integers and floats: expressions may use integer or floating-point literals, for example:

3
-2.4

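Result types

A PromQL query result has one of three main types:

Instant vector: a set of time series with a single sample each, for example: http_requests_total
Range vector: a set of time series with a range of samples each, for example: http_requests_total[5m]
Scalar: a single number with no time series attached, for example: scalar(count(http_requests_total))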

Selectors

Prometheus stores time series, each identified by a name and a set of labels. The name itself can also be written as a label: http_requests_total is equivalent to {__name__="http_requests_total"}.

A simple query is a filter over labels, for example:

http_requests_total{code="200"} # series named http_requests_total whose code is "200"

Selectors also support negation and regular-expression matching, for example:

http_requests_total{code!="200"}  # code is not "200"
http_requests_total{code=~"2.."} # code is "2xx"
http_requests_total{code!~"2.."} # code is not "2xx"
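
The four matcher operators can be modeled in a few lines of Go. This is an illustrative sketch with hypothetical names, not code from Prometheus; it assumes that regular-expression matchers are fully anchored (so "2.." does not match "2000") and that a missing label behaves like an empty value.

```go
package main

import (
	"fmt"
	"regexp"
)

// matcher models one label matcher such as code=~"2..".
type matcher struct {
	name  string
	op    string // one of "=", "!=", "=~", "!~"
	value string
}

// matches reports whether the label set satisfies the matcher.
func (m matcher) matches(labels map[string]string) (bool, error) {
	got := labels[m.name] // a missing label yields ""
	switch m.op {
	case "=":
		return got == m.value, nil
	case "!=":
		return got != m.value, nil
	case "=~", "!~":
		// Anchor the pattern so it must match the whole value.
		re, err := regexp.Compile("^(?:" + m.value + ")$")
		if err != nil {
			return false, err
		}
		ok := re.MatchString(got)
		if m.op == "!~" {
			ok = !ok
		}
		return ok, nil
	}
	return false, fmt.Errorf("unknown operator %q", m.op)
}

func main() {
	series := map[string]string{"__name__": "http_requests_total", "code": "200"}
	for _, m := range []matcher{
		{"code", "=", "200"},
		{"code", "!=", "200"},
		{"code", "=~", "2.."},
		{"code", "!~", "2.."},
	} {
		ok, err := m.matches(series)
		fmt.Println(m.op, m.value, ok, err)
	}
}
```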

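Operators

PromQL supports the usual expression operators.

Arithmetic operators:

The arithmetic operators are +, -, *, /, %, and ^. For example, http_requests_total * 2 doubles every value of http_requests_total.

Comparison operators:

The comparison operators are ==, !=, >, <, >=, and <=. For example, http_requests_total > 100 keeps only the series of http_requests_total whose value is greater than 100.

Logical operators:

The logical (set) operators are and, or, and unless. For example, http_requests_total == 5 or http_requests_total == 2 returns the series whose value is 5 or 2.

Aggregation operators:

The aggregation operators are sum, min, max, avg, stddev, stdvar, count, count_values, bottomk, topk, and quantile. For example, max(http_requests_total) returns the largest value of http_requests_total.

Note that, as in ordinary arithmetic, operators have precedence: (^) > (*, /, %) > (+, -) > (==, !=, <=, <, >=, >) > (and, unless) > (or).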

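Built-in functions

Prometheus has many built-in functions for querying and shaping data, for example floor and ceil, which round a result down or up to an integer: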

floor(avg(http_requests_total{code="200"}))
ceil(avg(http_requests_total{code="200"}))

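The average per-second rate of http_requests_total over the last 5 minutes:

rate(http_requests_total[5m])

See the function reference in the official documentation for more.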
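
For intuition, a per-second rate over a window of counter samples can be sketched as follows. This is a deliberately simplified model with hypothetical names: it handles counter resets but, unlike the real PromQL function, does not extrapolate to the edges of the window.

```go
package main

import "fmt"

// sample is one counter observation.
type sample struct {
	timestampMs int64
	value       float64
}

// perSecondRate returns the average per-second increase over the samples
// (oldest first). The second result is false when fewer than two samples
// are available or no time has elapsed.
func perSecondRate(samples []sample) (float64, bool) {
	if len(samples) < 2 {
		return 0, false
	}
	elapsed := float64(samples[len(samples)-1].timestampMs-samples[0].timestampMs) / 1000
	if elapsed <= 0 {
		return 0, false
	}
	increase := 0.0
	for i := 1; i < len(samples); i++ {
		delta := samples[i].value - samples[i-1].value
		if delta < 0 {
			// The counter went down, so it was reset: count from zero.
			delta = samples[i].value
		}
		increase += delta
	}
	return increase / elapsed, true
}

func main() {
	window := []sample{{0, 100}, {30000, 160}, {60000, 220}}
	rate, ok := perSecondRate(window)
	fmt.Println(rate, ok)
}
```

The reset handling is the reason counters should be queried through rate-style functions rather than by subtracting raw values.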

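Prometheus configuration

Configuration

When Prometheus starts, the -config.file flag (written --config.file from Prometheus 2.0 on) specifies the configuration file; the default is prometheus.yml.

The configuration file can set global, alerting, rule_files, scrape_configs, remote_write, remote_read, and other sections.

The corresponding Go struct is: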

// Config is the top-level configuration for Prometheus's config files.
type Config struct {
    GlobalConfig   GlobalConfig    `yaml:"global"`
    AlertingConfig AlertingConfig  `yaml:"alerting,omitempty"`
    RuleFiles      []string        `yaml:"rule_files,omitempty"`
    ScrapeConfigs  []*ScrapeConfig `yaml:"scrape_configs,omitempty"`

    RemoteWriteConfigs []*RemoteWriteConfig `yaml:"remote_write,omitempty"`
    RemoteReadConfigs  []*RemoteReadConfig  `yaml:"remote_read,omitempty"`

    // Catches all undefined fields and must be empty after parsing.
    XXX map[string]interface{} `yaml:",inline"`

    // original is the input from which the config was parsed.
    original string
}

The configuration file is structured roughly as follows:

global:
  # How frequently to scrape targets by default.
  [ scrape_interval: <duration> | default = 1m ]

  # How long until a scrape request times out.
  [ scrape_timeout: <duration> | default = 10s ]

  # How frequently to evaluate rules.
  [ evaluation_interval: <duration> | default = 1m ]

  # The labels to add to any time series or alerts when communicating with
  # external systems (federation, remote storage, Alertmanager).
  external_labels:
    [ <labelname>: <labelvalue> ... ]

# Rule files specifies a list of globs. Rules and alerts are read from
# all matching files.
rule_files:
  [ - <filepath_glob> ... ]

# A list of scrape configurations.
scrape_configs:
  [ - <scrape_config> ... ]

# Alerting specifies settings related to the Alertmanager.
alerting:
  alert_relabel_configs:
    [ - <relabel_config> ... ]
  alertmanagers:
    [ - <alertmanager_config> ... ]

# Settings related to the experimental remote write feature.
remote_write:
  [ - <remote_write> ... ]

# Settings related to the experimental remote read feature.
remote_read:
  [ - <remote_read> ... ]

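Global configuration

global holds the global defaults. It has four main properties:

scrape_interval: the default interval between scrapes of a target.
scrape_timeout: the timeout for scraping a target.
evaluation_interval: the interval between rule evaluations.
external_labels: extra labels attached to time series and alerts when communicating with external systems (federation, remote storage, Alertmanager).
The corresponding Go struct is: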

// GlobalConfig configures values that are used across other configuration
// objects.
type GlobalConfig struct {
    // How frequently to scrape targets by default.
    ScrapeInterval model.Duration `yaml:"scrape_interval,omitempty"`
    // The default timeout when scraping targets.
    ScrapeTimeout model.Duration `yaml:"scrape_timeout,omitempty"`
    // How frequently to evaluate rules by default.
    EvaluationInterval model.Duration `yaml:"evaluation_interval,omitempty"`
    // The labels to add to any timeseries that this Prometheus instance scrapes.
    ExternalLabels model.LabelSet `yaml:"external_labels,omitempty"`

    // Catches all undefined fields and must be empty after parsing.
    XXX map[string]interface{} `yaml:",inline"`
}

A typical global section looks like this:

global:
  scrape_interval:     15s # Scrape targets every 15 seconds (default: 1m).
  evaluation_interval: 15s # Evaluate rules every 15 seconds (default: 1m).
  scrape_timeout: 10s # Same as the global default (10s).

  # Attach these labels to any time series or alerts when communicating with
  # external systems (federation, remote storage, Alertmanager).
  external_labels:
    monitor: 'codelab-monitor'

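Alerting configuration

Alertmanager can normally be configured with -alertmanager.xxx command-line flags, but that is inflexible: the settings cannot be reloaded dynamically, and alert labels cannot be rewritten dynamically.

The alerting section addresses this and makes Alertmanager easier to manage. It has two main parameters:

- alert_relabel_configs: rules that dynamically rewrite alert labels.

- alertmanagers: configuration for dynamically discovering Alertmanager instances.

The corresponding Go struct is: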

// AlertingConfig configures alerting and alertmanager related configs.
type AlertingConfig struct {
    AlertRelabelConfigs []*RelabelConfig      `yaml:"alert_relabel_configs,omitempty"`
    AlertmanagerConfigs []*AlertmanagerConfig `yaml:"alertmanagers,omitempty"`

    // Catches all undefined fields and must be empty after parsing.
    XXX map[string]interface{} `yaml:",inline"`
}

The configuration file is structured roughly as follows:

# Alerting specifies settings related to the Alertmanager.
alerting:
  alert_relabel_configs:
    [ - <relabel_config> ... ]
  alertmanagers:
    [ - <alertmanager_config> ... ]

Rule configuration

rule_files lists the rule files to load. It accepts multiple files and glob patterns.

The corresponding Go field is:

RuleFiles      []string        `yaml:"rule_files,omitempty"`

A typical configuration looks like this:

rule_files:
  - "rules/node.rules"
  - "rules2/*.rules"

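Scrape configuration

scrape_configs defines the targets to scrape. Each scrape configuration has the following main parameters:

job_name: the job name
honor_labels: resolves label conflicts; when true the labels from the scraped data win, otherwise the server-side labels win
params: URL parameters sent with each scrape request
scrape_interval: the scrape interval
scrape_timeout: the scrape timeout
metrics_path: the path on the target from which metrics are fetched
scheme: the protocol used for scraping
sample_limit: the maximum number of samples accepted per scrape after metric relabeling; if it is exceeded, the whole scrape is treated as failed and nothing is stored. The default 0 means no limit
relabel_configs: relabeling applied to targets before scraping
metric_relabel_configs: relabeling applied to scraped metrics
The corresponding Go struct is: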

// ScrapeConfig configures a scraping unit for Prometheus.
type ScrapeConfig struct {
    // The job name to which the job label is set by default.
    JobName string `yaml:"job_name"`
    // Indicator whether the scraped metrics should remain unmodified.
    HonorLabels bool `yaml:"honor_labels,omitempty"`
    // A set of query parameters with which the target is scraped.
    Params url.Values `yaml:"params,omitempty"`
    // How frequently to scrape the targets of this scrape config.
    ScrapeInterval model.Duration `yaml:"scrape_interval,omitempty"`
    // The timeout for scraping targets of this config.
    ScrapeTimeout model.Duration `yaml:"scrape_timeout,omitempty"`
    // The HTTP resource path on which to fetch metrics from targets.
    MetricsPath string `yaml:"metrics_path,omitempty"`
    // The URL scheme with which to fetch metrics from targets.
    Scheme string `yaml:"scheme,omitempty"`
    // More than this many samples post metric-relabelling will cause the scrape to fail.
    SampleLimit uint `yaml:"sample_limit,omitempty"`

    // We cannot do proper Go type embedding below as the parser will then parse
    // values arbitrarily into the overflow maps of further-down types.

    ServiceDiscoveryConfig ServiceDiscoveryConfig `yaml:",inline"`
    HTTPClientConfig       HTTPClientConfig       `yaml:",inline"`

    // List of target relabel configurations.
    RelabelConfigs []*RelabelConfig `yaml:"relabel_configs,omitempty"`
    // List of metric relabel configurations.
    MetricRelabelConfigs []*RelabelConfig `yaml:"metric_relabel_configs,omitempty"`

    // Catches all undefined fields and must be empty after parsing.
    XXX map[string]interface{} `yaml:",inline"`
}

The definition above embeds ServiceDiscoveryConfig, which is defined as:

// ServiceDiscoveryConfig configures lists of different service discovery mechanisms.
type ServiceDiscoveryConfig struct {
    // List of labeled target groups for this job.
    StaticConfigs []*TargetGroup `yaml:"static_configs,omitempty"`
    // List of DNS service discovery configurations.
    DNSSDConfigs []*DNSSDConfig `yaml:"dns_sd_configs,omitempty"`
    // List of file service discovery configurations.
    FileSDConfigs []*FileSDConfig `yaml:"file_sd_configs,omitempty"`
    // List of Consul service discovery configurations.
    ConsulSDConfigs []*ConsulSDConfig `yaml:"consul_sd_configs,omitempty"`
    // List of Serverset service discovery configurations.
    ServersetSDConfigs []*ServersetSDConfig `yaml:"serverset_sd_configs,omitempty"`
    // NerveSDConfigs is a list of Nerve service discovery configurations.
    NerveSDConfigs []*NerveSDConfig `yaml:"nerve_sd_configs,omitempty"`
    // MarathonSDConfigs is a list of Marathon service discovery configurations.
    MarathonSDConfigs []*MarathonSDConfig `yaml:"marathon_sd_configs,omitempty"`
    // List of Kubernetes service discovery configurations.
    KubernetesSDConfigs []*KubernetesSDConfig `yaml:"kubernetes_sd_configs,omitempty"`
    // List of GCE service discovery configurations.
    GCESDConfigs []*GCESDConfig `yaml:"gce_sd_configs,omitempty"`
    // List of EC2 service discovery configurations.
    EC2SDConfigs []*EC2SDConfig `yaml:"ec2_sd_configs,omitempty"`
    // List of OpenStack service discovery configurations.
    OpenstackSDConfigs []*OpenstackSDConfig `yaml:"openstack_sd_configs,omitempty"`
    // List of Azure service discovery configurations.
    AzureSDConfigs []*AzureSDConfig `yaml:"azure_sd_configs,omitempty"`
    // List of Triton service discovery configurations.
    TritonSDConfigs []*TritonSDConfig `yaml:"triton_sd_configs,omitempty"`

    // Catches all undefined fields and must be empty after parsing.
    XXX map[string]interface{} `yaml:",inline"`
}

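ServiceDiscoveryConfig is used for target discovery, which falls into two broad categories: static configuration and dynamic discovery.

A complete scrape_configs entry therefore looks roughly like this: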

# The job name assigned to scraped metrics by default.
job_name: <job_name>

# How frequently to scrape targets from this job.
[ scrape_interval: <duration> | default = <global_config.scrape_interval> ]

# Per-scrape timeout when scraping this job.
[ scrape_timeout: <duration> | default = <global_config.scrape_timeout> ]

# The HTTP resource path on which to fetch metrics from targets.
[ metrics_path: <path> | default = /metrics ]

# honor_labels controls how Prometheus handles conflicts between labels that are
# already present in scraped data and labels that Prometheus would attach
# server-side ("job" and "instance" labels, manually configured target
# labels, and labels generated by service discovery implementations).
#
# If honor_labels is set to "true", label conflicts are resolved by keeping label
# values from the scraped data and ignoring the conflicting server-side labels.
#
# If honor_labels is set to "false", label conflicts are resolved by renaming
# conflicting labels in the scraped data to "exported_<original-label>" (for
# example "exported_instance", "exported_job") and then attaching server-side
# labels. This is useful for use cases such as federation, where all labels
# specified in the target should be preserved.
#
# Note that any globally configured "external_labels" are unaffected by this
# setting. In communication with external systems, they are always applied only
# when a time series does not have a given label yet and are ignored otherwise.
[ honor_labels: <boolean> | default = false ]

# Configures the protocol scheme used for requests.
[ scheme: <scheme> | default = http ]

# Optional HTTP URL parameters.
params:
  [ <string>: [<string>, ...] ]

# Sets the `Authorization` header on every scrape request with the
# configured username and password.
basic_auth:
  [ username: <string> ]
  [ password: <string> ]

# Sets the `Authorization` header on every scrape request with
# the configured bearer token. It is mutually exclusive with `bearer_token_file`.
[ bearer_token: <string> ]

# Sets the `Authorization` header on every scrape request with the bearer token
# read from the configured file. It is mutually exclusive with `bearer_token`.
[ bearer_token_file: /path/to/bearer/token/file ]

# Configures the scrape request's TLS settings.
tls_config:
  [ <tls_config> ]

# Optional proxy URL.
[ proxy_url: <string> ]

# List of Azure service discovery configurations.
azure_sd_configs:
  [ - <azure_sd_config> ... ]

# List of Consul service discovery configurations.
consul_sd_configs:
  [ - <consul_sd_config> ... ]

# List of DNS service discovery configurations.
dns_sd_configs:
  [ - <dns_sd_config> ... ]

# List of EC2 service discovery configurations.
ec2_sd_configs:
  [ - <ec2_sd_config> ... ]

# List of OpenStack service discovery configurations.
openstack_sd_configs:
  [ - <openstack_sd_config> ... ]

# List of file service discovery configurations.
file_sd_configs:
  [ - <file_sd_config> ... ]

# List of GCE service discovery configurations.
gce_sd_configs:
  [ - <gce_sd_config> ... ]

# List of Kubernetes service discovery configurations.
kubernetes_sd_configs:
  [ - <kubernetes_sd_config> ... ]

# List of Marathon service discovery configurations.
marathon_sd_configs:
  [ - <marathon_sd_config> ... ]

# List of AirBnB's Nerve service discovery configurations.
nerve_sd_configs:
  [ - <nerve_sd_config> ... ]

# List of Zookeeper Serverset service discovery configurations.
serverset_sd_configs:
  [ - <serverset_sd_config> ... ]

# List of Triton service discovery configurations.
triton_sd_configs:
  [ - <triton_sd_config> ... ]

# List of labeled statically configured targets for this job.
static_configs:
  [ - <static_config> ... ]

# List of target relabel configurations.
relabel_configs:
  [ - <relabel_config> ... ]

# List of metric relabel configurations.
metric_relabel_configs:
  [ - <relabel_config> ... ]

# Per-scrape limit on number of scraped samples that will be accepted.
# If more than this number of samples are present after metric relabelling
# the entire scrape will be treated as failed. 0 means no limit.
[ sample_limit: <int> | default = 0 ]
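
The honor_labels comments in the listing above describe a small conflict-resolution algorithm. Here is a minimal Go sketch of it; mergeLabels is a hypothetical helper written for this article, not the function Prometheus uses internally, and the label values are made up.

```go
package main

import "fmt"

// mergeLabels combines the labels found in the scraped data with the labels
// Prometheus attaches server-side, following the honor_labels semantics.
func mergeLabels(scraped, serverSide map[string]string, honorLabels bool) map[string]string {
	out := make(map[string]string, len(scraped)+len(serverSide))
	for k, v := range scraped {
		out[k] = v
	}
	for k, v := range serverSide {
		if old, conflict := out[k]; conflict {
			if honorLabels {
				// Keep the scraped value, ignore the server-side one.
				continue
			}
			// Preserve the scraped value under an "exported_" prefix.
			out["exported_"+k] = old
		}
		out[k] = v
	}
	return out
}

func main() {
	scraped := map[string]string{"job": "batch", "instance": "worker-1"}
	serverSide := map[string]string{"job": "pushgateway", "instance": "127.0.0.1:9091"}

	fmt.Println(mergeLabels(scraped, serverSide, false))
	fmt.Println(mergeLabels(scraped, serverSide, true))
}
```

With honor_labels false the scraped job survives as exported_job; with honor_labels true the scraped job and instance are kept as they are.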

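Remote write

remote_write configures writable remote storage. Its main parameters are:

url: the endpoint URL
remote_timeout: the request timeout
write_relabel_configs: relabeling applied to scraped samples before they are sent to the remote storage
The corresponding Go struct is: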

// RemoteWriteConfig is the configuration for writing to remote storage.
type RemoteWriteConfig struct {
    URL                 *URL             `yaml:"url,omitempty"`
    RemoteTimeout       model.Duration   `yaml:"remote_timeout,omitempty"`
    WriteRelabelConfigs []*RelabelConfig `yaml:"write_relabel_configs,omitempty"`

    // We cannot do proper Go type embedding below as the parser will then parse
    // values arbitrarily into the overflow maps of further-down types.
    HTTPClientConfig HTTPClientConfig `yaml:",inline"`

    // Catches all undefined fields and must be empty after parsing.
    XXX map[string]interface{} `yaml:",inline"`
}

A complete configuration looks roughly like this:

# The URL of the endpoint to send samples to.
url: <string>

# Timeout for requests to the remote write endpoint.
[ remote_timeout: <duration> | default = 30s ]

# List of remote write relabel configurations.
write_relabel_configs:
  [ - <relabel_config> ... ]

# Sets the `Authorization` header on every remote write request with the
# configured username and password.
basic_auth:
  [ username: <string> ]
  [ password: <string> ]

# Sets the `Authorization` header on every remote write request with
# the configured bearer token. It is mutually exclusive with `bearer_token_file`.
[ bearer_token: <string> ]

# Sets the `Authorization` header on every remote write request with the bearer token
# read from the configured file. It is mutually exclusive with `bearer_token`.
[ bearer_token_file: /path/to/bearer/token/file ]

# Configures the remote write request's TLS settings.
tls_config:
  [ <tls_config> ]

# Optional proxy URL.
[ proxy_url: <string> ]

Note: remote_write is experimental and may change in future releases; use it with caution.

Remote read

remote_read configures readable remote storage. Its main parameters are:

url: the endpoint URL
remote_timeout: the request timeout
The corresponding Go struct is:

// RemoteReadConfig is the configuration for reading from remote storage.
type RemoteReadConfig struct {
    URL           *URL           `yaml:"url,omitempty"`
    RemoteTimeout model.Duration `yaml:"remote_timeout,omitempty"`

    // We cannot do proper Go type embedding below as the parser will then parse
    // values arbitrarily into the overflow maps of further-down types.
    HTTPClientConfig HTTPClientConfig `yaml:",inline"`

    // Catches all undefined fields and must be empty after parsing.
    XXX map[string]interface{} `yaml:",inline"`
}

A complete configuration looks roughly like this:

# The URL of the endpoint to query from.
url: <string>

# Timeout for requests to the remote read endpoint.
[ remote_timeout: <duration> | default = 30s ]

# Sets the `Authorization` header on every remote read request with the
# configured username and password.
basic_auth:
  [ username: <string> ]
  [ password: <string> ]

# Sets the `Authorization` header on every remote read request with
# the configured bearer token. It is mutually exclusive with `bearer_token_file`.
[ bearer_token: <string> ]

# Sets the `Authorization` header on every remote read request with the bearer token
# read from the configured file. It is mutually exclusive with `bearer_token`.
[ bearer_token_file: /path/to/bearer/token/file ]

# Configures the remote read request's TLS settings.
tls_config:
  [ <tls_config> ]

# Optional proxy URL.
[ proxy_url: <string> ]

Note: remote_read is experimental and may change in future releases; use it with caution.

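Service discovery

One of the most important concepts in the Prometheus configuration is the target. Targets are configured either statically or through dynamic discovery, using the following mechanisms:

static_configs: static targets
dns_sd_configs: DNS service discovery
file_sd_configs: file-based service discovery
consul_sd_configs: Consul service discovery
serverset_sd_configs: Serverset service discovery
nerve_sd_configs: Nerve service discovery
marathon_sd_configs: Marathon service discovery
kubernetes_sd_configs: Kubernetes service discovery
gce_sd_configs: GCE service discovery
ec2_sd_configs: EC2 service discovery
openstack_sd_configs: OpenStack service discovery
azure_sd_configs: Azure service discovery
triton_sd_configs: Triton service discovery
For usage details and configuration templates, see the service discovery sections of the official configuration documentation.

The most important and most widely used is probably static_configs; the dynamic mechanisms can be seen as wrappers that generate static target lists for common platforms.

Configuration example

Prometheus has many configuration parameters, but the ones I use most are global, rule_files, scrape_configs, static_configs, and relabel_configs.

My usual configuration file looks roughly like this: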

global:
  scrape_interval:     15s # Scrape targets every 15 seconds (default: 1m).
  evaluation_interval: 15s # Evaluate rules every 15 seconds (default: 1m).

rule_files:
  - "rules/node.rules"

scrape_configs:
  - job_name: 'prometheus'
    scrape_interval: 5s
    static_configs:
      - targets: ['localhost:9090']

  - job_name: 'node'
    scrape_interval: 8s
    static_configs:
      - targets: ['127.0.0.1:9100', '127.0.0.12:9100']

  - job_name: 'mysqld'
    static_configs:
      - targets: ['127.0.0.1:9104']
  - job_name: 'memcached'
    static_configs:
      - targets: ['127.0.0.1:9150']

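Exporter

In Prometheus, the programs that report data are collectively called exporters, and different exporters cover different services. They share a naming convention, xx_exporter; for example, node_exporter collects host information.

The Prometheus community already provides many exporters; see the exporter list in the official documentation.

Text format

Before discussing exporters it is worth introducing the Prometheus text exposition format, because an exporter essentially converts the data it collects into this format and serves it over HTTP.

The text is line oriented, with lines separated by \n. Empty lines are ignored, and the last line must end with a line feed.

Comments

A line beginning with # is normally a comment.

A line beginning with # HELP gives the help text for a metric.
A line beginning with # TYPE declares the metric type: counter, gauge, histogram, summary, or untyped.
Any other comment is for human readers only and is ignored by Prometheus.

Samples

A line that does not begin with # is a sample. It usually follows the TYPE line directly and has the following format: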

metric_name [
  "{" label_name "=" `"` label_value `"` { "," label_name "=" `"` label_value `"` } [ "," ] "}"
] value [ timestamp ]

Here is a complete example:

# HELP http_requests_total The total number of HTTP requests.
# TYPE http_requests_total counter
http_requests_total{method="post",code="200"} 1027 1395066363000
http_requests_total{method="post",code="400"}    3 1395066363000

# Escaping in label values:
msdos_file_access_time_seconds{path="C:\\DIR\\FILE.TXT",error="Cannot find file:\n\"FILE.TXT\""} 1.458255915e9

# Minimalistic line:
metric_without_timestamp_and_labels 12.47

# A weird metric from before the epoch:
something_weird{problem="division by zero"} +Inf -3982045

# A histogram, which has a pretty complex representation in the text format:
# HELP http_request_duration_seconds A histogram of the request duration.
# TYPE http_request_duration_seconds histogram
http_request_duration_seconds_bucket{le="0.05"} 24054
http_request_duration_seconds_bucket{le="0.1"} 33444
http_request_duration_seconds_bucket{le="0.2"} 100392
http_request_duration_seconds_bucket{le="0.5"} 129389
http_request_duration_seconds_bucket{le="1"} 133988
http_request_duration_seconds_bucket{le="+Inf"} 144320
http_request_duration_seconds_sum 53423
http_request_duration_seconds_count 144320

# Finally a summary, which has a complex representation, too:
# HELP rpc_duration_seconds A summary of the RPC duration in seconds.
# TYPE rpc_duration_seconds summary
rpc_duration_seconds{quantile="0.01"} 3102
rpc_duration_seconds{quantile="0.05"} 3272
rpc_duration_seconds{quantile="0.5"} 4773
rpc_duration_seconds{quantile="0.9"} 9001
rpc_duration_seconds{quantile="0.99"} 76656
rpc_duration_seconds_sum 1.7560473e+07
rpc_duration_seconds_count 2693

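Note in particular that if a metric named x is a histogram or a summary, the following conditions must hold:

The sum of the observed values is exposed as x_sum.
The number of observations is exposed as x_count.
Each quantile of a summary is exposed as x{quantile="y"}.
Each bucket count of a histogram is exposed as x_bucket{le="y"}.
A histogram must include x_bucket{le="+Inf"}, whose value equals x_count.
For summaries and histograms, the quantile and le values must appear in increasing order.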
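
The sample-line format and the escaping shown in the msdos_file_access_time_seconds example can be produced with a short helper. The sketch below is an illustration with hypothetical names; it assumes that backslash, double quote, and line feed are the characters that need escaping in label values, and it leaves out timestamps and HELP/TYPE lines.

```go
package main

import (
	"fmt"
	"sort"
	"strconv"
	"strings"
)

// labelValueEscaper escapes the characters that would otherwise break a
// quoted label value.
var labelValueEscaper = strings.NewReplacer(`\`, `\\`, "\n", `\n`, `"`, `\"`)

// formatSample renders one sample line without a timestamp.
func formatSample(name string, labels map[string]string, value float64) string {
	var b strings.Builder
	b.WriteString(name)
	if len(labels) > 0 {
		keys := make([]string, 0, len(labels))
		for k := range labels {
			keys = append(keys, k)
		}
		sort.Strings(keys) // deterministic output

		b.WriteByte('{')
		for i, k := range keys {
			if i > 0 {
				b.WriteByte(',')
			}
			b.WriteString(k)
			b.WriteString(`="`)
			b.WriteString(labelValueEscaper.Replace(labels[k]))
			b.WriteByte('"')
		}
		b.WriteByte('}')
	}
	b.WriteByte(' ')
	b.WriteString(strconv.FormatFloat(value, 'g', -1, 64))
	return b.String()
}

func main() {
	fmt.Println(formatSample("metric_without_timestamp_and_labels", nil, 12.47))
	fmt.Println(formatSample("msdos_file_access_time_seconds",
		map[string]string{"path": `C:\DIR\FILE.TXT`}, 1.458255915e9))
}
```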

Sample Exporter

Since an exporter only has to convert collected data into the text format and serve it over HTTP, it is easy to write one yourself.

A simple exporter

Below is a simple sample_exporter written in Go:

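package main

import (
    "fmt"
    "log"
    "net/http"
)

// handler writes the static metrics payload for every request path,
// including /metrics.
func handler(w http.ResponseWriter, r *http.Request) {
    // Fprint, not Fprintf: exportData is data, not a format string.
    fmt.Fprint(w, exportData)
}

func main() {
    http.HandleFunc("/", handler)
    // ListenAndServe only returns on error, so report it.
    log.Fatal(http.ListenAndServe(":8080", nil))
}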

var exportData string = `# HELP sample_http_requests_total The total number of HTTP requests.
# TYPE sample_http_requests_total counter
sample_http_requests_total{method="post",code="200"} 1027 1395066363000
sample_http_requests_total{method="post",code="400"}    3 1395066363000

# Escaping in label values:
sample_msdos_file_access_time_seconds{path="C:\\DIR\\FILE.TXT",error="Cannot find file:\n\"FILE.TXT\""} 1.458255915e9

# Minimalistic line:
sample_metric_without_timestamp_and_labels 12.47

# A histogram, which has a pretty complex representation in the text format:
# HELP sample_http_request_duration_seconds A histogram of the request duration.
# TYPE sample_http_request_duration_seconds histogram
sample_http_request_duration_seconds_bucket{le="0.05"} 24054
sample_http_request_duration_seconds_bucket{le="0.1"} 33444
sample_http_request_duration_seconds_bucket{le="0.2"} 100392
sample_http_request_duration_seconds_bucket{le="0.5"} 129389
sample_http_request_duration_seconds_bucket{le="1"} 133988
sample_http_request_duration_seconds_bucket{le="+Inf"} 144320
sample_http_request_duration_seconds_sum 53423
sample_http_request_duration_seconds_count 144320

# Finally a summary, which has a complex representation, too:
# HELP sample_rpc_duration_seconds A summary of the RPC duration in seconds.
# TYPE sample_rpc_duration_seconds summary
sample_rpc_duration_seconds{quantile="0.01"} 3102
sample_rpc_duration_seconds{quantile="0.05"} 3272
sample_rpc_duration_seconds{quantile="0.5"} 4773
sample_rpc_duration_seconds{quantile="0.9"} 9001
sample_rpc_duration_seconds{quantile="0.99"} 76656
sample_rpc_duration_seconds_sum 1.7560473e+07
sample_rpc_duration_seconds_count 2693
`

Run the program and open http://localhost:8080/metrics; the response is the metrics text held in exportData.
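
A real exporter reports values that change. As a hedged variation on the idea, the sketch below exposes a counter that is incremented on every scrape. The metric name sample_scrapes_total and the helper names are made up for illustration, and the demo uses httptest so that it terminates on its own; a long-running exporter would call http.ListenAndServe instead.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"sync/atomic"
)

// requestsServed counts how many times the metrics endpoint was scraped.
var requestsServed uint64

// renderMetrics returns the current counter value in the text format.
func renderMetrics() string {
	return fmt.Sprintf("# HELP sample_scrapes_total Number of scrapes served.\n"+
		"# TYPE sample_scrapes_total counter\n"+
		"sample_scrapes_total %d\n", atomic.LoadUint64(&requestsServed))
}

// metricsHandler increments the counter and writes the metrics text.
func metricsHandler(w http.ResponseWriter, r *http.Request) {
	atomic.AddUint64(&requestsServed, 1)
	w.Header().Set("Content-Type", "text/plain; version=0.0.4")
	fmt.Fprint(w, renderMetrics())
}

func main() {
	mux := http.NewServeMux()
	mux.HandleFunc("/metrics", metricsHandler)

	// A throwaway local server keeps the demo self-terminating.
	srv := httptest.NewServer(mux)
	defer srv.Close()

	resp, err := http.Get(srv.URL + "/metrics")
	if err != nil {
		fmt.Println("scrape failed:", err)
		return
	}
	defer resp.Body.Close()
	body, _ := io.ReadAll(resp.Body)
	fmt.Print(string(body))
}
```

For anything beyond an experiment, the official Go client library takes care of metric registration and formatting.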

Integrating with Prometheus

Prometheus can collect the data from sample_exporter through static_configs.

Open prometheus.yml and add the following under scrape_configs:

  - job_name: "sample"
    static_configs:
      - targets: ["127.0.0.1:8080"]

Restart Prometheus (or reload its configuration), then query in the Prometheus console; the sample_exporter data appears.

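Installing and using Node Exporter

For installation, see https://songjiayang.gitbooks.io/prometheus/exporter/nodeexporter.html

Common Node Exporter queries

Once node_exporter data is being collected, PromQL can be used for monitoring queries. Some common ones follow.

Note: the queries below target a single node; to cover all nodes, drop instance="xxx". The metric names are those of node_exporter releases before 0.16; later releases renamed many of them (for example, node_cpu became node_cpu_seconds_total).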

CPU usage

100 - (avg by (instance) (irate(node_cpu{instance="xxx", mode="idle"}[5m])) * 100)

CPU share per mode

avg by (instance, mode) (irate(node_cpu{instance="xxx"}[5m])) * 100

Load average

node_load1{instance="xxx"} # 1-minute load
node_load5{instance="xxx"} # 5-minute load
node_load15{instance="xxx"} # 15-minute load

Memory usage

100 - ((node_memory_MemFree{instance="xxx"}+node_memory_Cached{instance="xxx"}+node_memory_Buffers{instance="xxx"})/node_memory_MemTotal{instance="xxx"}) * 100

Disk usage

100 - node_filesystem_free{instance="xxx",fstype!~"rootfs|selinuxfs|autofs|rpc_pipefs|tmpfs|udev|none|devpts|sysfs|debugfs|fuse.*"} / node_filesystem_size{instance="xxx",fstype!~"rootfs|selinuxfs|autofs|rpc_pipefs|tmpfs|udev|none|devpts|sysfs|debugfs|fuse.*"} * 100

Alternatively, use {fstype="xxx"} directly to select the filesystem type you want to inspect.

Network I/O

# Inbound bandwidth (bytes/s divided by 128 gives Kbit/s)
sum by (instance) (irate(node_network_receive_bytes{instance="xxx",device!~"bond.*?|lo"}[5m])/128)

# Outbound bandwidth (bytes/s divided by 128 gives Kbit/s)
sum by (instance) (irate(node_network_transmit_bytes{instance="xxx",device!~"bond.*?|lo"}[5m])/128)
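
The division by 128 in the two queries above is a unit conversion: a byte is 8 bits, and here a kilobit is taken as 1024 bits, so bytes/s * 8 / 1024 equals bytes/s / 128. A tiny Go sketch (the function name is hypothetical) makes the arithmetic explicit:

```go
package main

import "fmt"

// bytesPerSecToKbit converts a byte rate to kilobits per second,
// using 1 Kbit = 1024 bits as the queries above implicitly do.
func bytesPerSecToKbit(bytesPerSec float64) float64 {
	return bytesPerSec * 8 / 1024
}

func main() {
	fmt.Println(bytesPerSecToKbit(128))
	fmt.Println(bytesPerSecToKbit(1280000))
}
```

If decimal kilobits (1000 bits) are preferred, the divisor becomes 125 instead of 128.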

Packets in/out

# Inbound packets per second
sum by (instance) (rate(node_network_receive_packets{instance="xxx",device!="lo"}[5m]))

# Outbound packets per second
sum by (instance) (rate(node_network_transmit_packets{instance="xxx",device!="lo"}[5m]))

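Other exporters

Besides node_exporter, you will install other exporters or write your own depending on your services. Commonly used ones include:

Memcached exporter: collects Memcached metrics
MySQL server exporter: collects MySQL server metrics
MongoDB exporter: collects MongoDB metrics
InfluxDB exporter: collects InfluxDB metrics
JMX exporter: collects Java virtual machine metrics
For more exporters, see the exporter list in the official documentation.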

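Pushgateway

Introduction to Pushgateway

Pushgateway is an important tool in the Prometheus ecosystem. The main reasons to use it are:

Prometheus pulls, and it may be unable to reach some targets directly because they sit in another subnet or behind a firewall.
When monitoring business data, values from different sources need to be aggregated in one place for Prometheus to collect.

These cases leave little choice but to use Pushgateway, but be aware of its drawbacks first:

When many nodes push to one Pushgateway, its failure affects more than the failure of any single target would.
The up status that Prometheus records only describes the Pushgateway itself, not the individual nodes behind it.
Pushgateway keeps everything that is pushed to it.

Consequently, even after a monitored job has gone away, Prometheus keeps scraping its stale data until you manually delete it from Pushgateway.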

Installing and using Pushgateway

Installation

Binary installation

Pick a version on the download page; v0.4.0 is used here as an example.

Download pushgateway with wget:

cd ~/Download
wget https://github.com/prometheus/pushgateway/releases/download/v0.4.0/pushgateway-0.4.0.linux-amd64.tar.gz

Extract pushgateway-0.4.0.linux-amd64.tar.gz with tar:

cd ~/Prometheus
tar -xvzf ~/Download/pushgateway-0.4.0.linux-amd64.tar.gz
cd pushgateway-0.4.0.linux-amd64

Start pushgateway

Run ./pushgateway -h to list the options and ./pushgateway to start it. Output like the following indicates a successful start:

INFO[0000] Starting pushgateway (version=0.4.0, branch=master, revision=6ceb4a19fa85ac2d6c2d386c144566fb1ede1f6c)  source=main.go:57
INFO[0000] Build context (go=go1.8.3, user=root@87741d1b66a9, date=20170609-12:26:14)  source=main.go:58
INFO[0000] Listening on :9091.                           source=main.go:102

Docker installation

The prom/pushgateway Docker image can be used as well:

docker pull prom/pushgateway
docker run -d -p 9091:9091 prom/pushgateway

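Data management

Normally a client SDK pushes data to pushgateway, but the data can also be managed through the HTTP API. For example:

Add a single sample to the group {job="some_job"}:

echo "some_metric 3.14" | curl --data-binary @- http://pushgateway.example.org:9091/metrics/job/some_job

Add more complex data. The data usually carries an instance label that identifies where it came from: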

cat <<EOF | curl --data-binary @- http://pushgateway.example.org:9091/metrics/job/some_job/instance/some_instance
# TYPE some_metric counter
some_metric{label="val1"} 42
# TYPE another_metric gauge
# HELP another_metric Just an example.
another_metric 2398.283
EOF

Delete all data for one instance within a job:

curl -X DELETE http://pushgateway.example.org:9091/metrics/job/some_job/instance/some_instance

Delete all data for a job:

curl -X DELETE http://pushgateway.example.org:9091/metrics/job/some_job

As these examples show, data in pushgateway is normally grouped by job and instance, so you should always supply both (job is mandatory in the URL path).
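
The curl calls above can also be issued from a script. The following is a minimal Python sketch using only the standard library; the helper names (grouping_url, render_sample, push, delete_group) are hypothetical, the host is the placeholder from the examples, and label-value escaping is deliberately left out, so it assumes simple job, instance, and label values.

```python
import urllib.request
from typing import Optional
from urllib.parse import quote


def grouping_url(base: str, job: str, instance: str = "") -> str:
    """Build <base>/metrics/job/<job>[/instance/<instance>]."""
    url = f"{base.rstrip('/')}/metrics/job/{quote(job, safe='')}"
    if instance:
        url += f"/instance/{quote(instance, safe='')}"
    return url


def render_sample(name: str, value: float, labels: Optional[dict] = None) -> str:
    """Render one sample in the text exposition format, newline-terminated."""
    if labels:
        pairs = ",".join(f'{k}="{v}"' for k, v in sorted(labels.items()))
        return f"{name}{{{pairs}}} {value}\n"
    return f"{name} {value}\n"


def push(url: str, body: str) -> int:
    """POST the metrics body to the group, like curl --data-binary @-."""
    req = urllib.request.Request(url, data=body.encode("utf-8"), method="POST")
    with urllib.request.urlopen(req) as resp:
        return resp.status


def delete_group(url: str) -> int:
    """Delete every metric in the group, like curl -X DELETE."""
    req = urllib.request.Request(url, method="DELETE")
    with urllib.request.urlopen(req) as resp:
        return resp.status


if __name__ == "__main__":
    # Only builds and prints the request parts; call push() or delete_group()
    # yourself once the URL points at a real pushgateway.
    url = grouping_url("http://pushgateway.example.org:9091", "some_job", "some_instance")
    print(url)
    print(render_sample("some_metric", 42, {"label": "val1"}), end="")
```

Calling delete_group(url) when a job is decommissioned is one way to handle the stale-data cleanup mentioned earlier.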

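When pushgateway is configured as a scrape target, Prometheus also assigns it a job and an instance, but these describe only the pushgateway process itself, not the origin of the pushed data. Add honor_labels: true to the pushgateway scrape configuration in Prometheus so that the job and instance labels carried by the pushed data are not overwritten.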
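
To make the effect of honor_labels concrete, here is a small Python simulation of how conflicting labels are resolved. It is an illustration of the documented behaviour, not Prometheus code; the function name and the target labels ("pushgateway", "localhost:9091") are assumptions.

```python
def merge_labels(scraped: dict, target: dict, honor_labels: bool) -> dict:
    """Combine the labels of a scraped sample with the scrape target's labels."""
    merged = dict(scraped)
    for name, value in target.items():
        if name not in scraped:
            merged[name] = value  # no conflict: attach the target label
        elif honor_labels:
            continue  # conflict: keep the label value from the scraped data
        else:
            # conflict: the scraped value moves to exported_<name>,
            # and the target's value takes over the original name
            merged["exported_" + name] = scraped[name]
            merged[name] = value
    return merged


pushed = {"job": "some_job", "instance": "some_instance"}
target = {"job": "pushgateway", "instance": "localhost:9091"}

print(merge_labels(pushed, target, honor_labels=False))
print(merge_labels(pushed, target, honor_labels=True))
```

Without honor_labels, queries on job="some_job" find nothing because the pushed value ends up under exported_job.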

Note: to avoid losing data when pushgateway restarts or crashes, persist it to disk with the -persistence.file and -persistence.interval flags.

Alertmanager

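Alerting in Prometheus consists of two parts:

The Prometheus server evaluates the configured alerting rules and sends the resulting alerts to Alertmanager.
Alertmanager processes the alerts it receives: deduplication, noise reduction, grouping, and routing to notification channels.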

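The main steps for setting up alerting are:

Download and configure Alertmanager.
Point Prometheus at Alertmanager with the -alertmanager.url flag (Prometheus 1.x; Prometheus 2.x uses the alerting section of prometheus.yml instead).
Define alerting rules in Prometheus.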

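What is Alertmanager?

Alertmanager receives the alerts sent by Prometheus. It supports a wide range of notification channels and makes it easy to deduplicate, silence, group, and route alerts, which makes it a modern alert notification system.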
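
As a rough illustration of what deduplication and grouping mean here, the Python sketch below drops alerts whose label set has already been seen and buckets the rest by the labels named in group_by. It is a simplified model under my own assumptions, not Alertmanager's implementation; group_alerts and the sample alerts are hypothetical.

```python
from collections import defaultdict


def group_alerts(alerts: list, group_by: list) -> dict:
    """Group alerts by the values of the group_by labels, dropping duplicates."""
    groups = defaultdict(list)
    seen = set()
    for alert in alerts:
        identity = tuple(sorted(alert["labels"].items()))
        if identity in seen:  # same label set already received: deduplicate
            continue
        seen.add(identity)
        key = tuple(alert["labels"].get(name, "") for name in group_by)
        groups[key].append(alert)
    return dict(groups)


incoming = [
    {"labels": {"alertname": "server_status", "instance": "a:9100"}},
    {"labels": {"alertname": "server_status", "instance": "b:9100"}},
    {"labels": {"alertname": "server_status", "instance": "a:9100"}},  # duplicate
    {"labels": {"alertname": "memory_high", "instance": "a:9100"}},
]

for key, members in group_alerts(incoming, ["alertname"]).items():
    print(key, len(members))
```

Each group would then be sent as a single notification, which is why grouping by alertname keeps one outage from producing one message per machine.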

Installation

Download the installation package with wget:

cd ~/Download
wget https://github.com/prometheus/alertmanager/releases/download/v0.14.0/alertmanager-0.14.0.linux-amd64.tar.gz
cd ~/Prometheus

Extract alertmanager-0.14.0.linux-amd64.tar.gz with tar:

tar -xvzf ~/Download/alertmanager-0.14.0.linux-amd64.tar.gz
cd alertmanager-0.14.0.linux-amd64

After extraction, run ./alertmanager --version to check that the installation succeeded:

alertmanager, version 0.14.0 (branch: HEAD, revision: 30af4d051b37ce817ea7e35b56c57a0e2ec9dbb0)
  build user:       root@37b6a49ebba9
  build date:       20180213-08:16:42
  go version:       go1.9.2

Basic configuration

Run mv simple.yml alertmanager.yml, then edit alertmanager.yml as follows:

global:
  resolve_timeout: 2h

route:
  group_by: ['alertname']
  group_wait: 5s
  group_interval: 10s
  repeat_interval: 1h
  receiver: 'webhook'

receivers:
- name: 'webhook'
  webhook_configs:
  - url: 'http://example.com/xxxx'
    send_resolved: true

Note: this configuration uses Alertmanager's webhook_configs option to deliver notifications. When a new alert arrives, Alertmanager forwards it to the configured url.
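
On the receiving side, the service behind that url gets a JSON document via HTTP POST. The Python sketch below shows one way to turn such a payload into readable lines. The field names (alerts, status, labels, annotations) follow the webhook payload as I understand it from the Alertmanager documentation, so verify them against your version; the summarize function and the example payload are hypothetical.

```python
import json


def summarize(payload: dict) -> list:
    """Return one human-readable line per alert in a webhook notification."""
    lines = []
    for alert in payload.get("alerts", []):
        status = alert.get("status", "unknown")
        name = alert.get("labels", {}).get("alertname", "?")
        summary = alert.get("annotations", {}).get("summary", "")
        lines.append(f"[{status}] {name}: {summary}")
    return lines


# A trimmed-down example body; real notifications carry more fields.
example = json.loads("""
{
  "status": "firing",
  "alerts": [
    {
      "status": "firing",
      "labels": {"alertname": "server_status", "instance": "localhost:9100"},
      "annotations": {"summary": "instance localhost:9100 is down"}
    }
  ]
}
""")

for line in summarize(example):
    print(line)
```

With send_resolved: true, as in the configuration above, the same endpoint also receives a notification once the alert clears, with the alert status set to resolved.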

Receiving Alerts by Email

This section walks through a simple experiment that delivers alerts by email.

Edit the Alertmanager configuration file

The key settings are:
global:
  smtp_smarthost: 'smtp.qq.com:587'
  smtp_from: 'xxx@qq.com'
  smtp_auth_username: 'xxx@qq.com'
  smtp_auth_password: 'your_email_password'

route:
  # If an alert has successfully been sent, wait 'repeat_interval' before resending it.
  repeat_interval: 10s
  # A default receiver
  receiver: team-X-mails

receivers:
  - name: 'team-X-mails'
    email_configs:
    - to: 'team-X+alerts@example.org'

Add an alert.rules file in the prometheus directory
Put the following simple rule in it as an example (this is the Prometheus 1.x rule syntax):
ALERT memory_high
  IF prometheus_local_storage_memory_series >= 0
  FOR 15s
  ANNOTATIONS {
    summary = "Prometheus using more memory than it should {{ $labels.instance }}",
    description = "{{ $labels.instance }} has lots of memory man (current value: {{ $value }}s)",
  }

Edit prometheus.yml
Add the following rule_files entry:
rule_files:
  - "alert.rules"

Start the Alertmanager service
./alertmanager --config.file=alertmanager.yml
Start the prometheus service
./prometheus -alertmanager.url=http://localhost:9093

With the steps above in place, "team-X+alerts@example.org" should now receive alert emails sent from "xxx@qq.com".

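Receiving Alerts through WeChat Work

Alertmanager has supported WeChat Work (企业微信) out of the box since v0.12. Let's try it.

Preparation

step 1: Visit the website to register a WeChat Work account (no enterprise verification is required).

step 2: Go to apps and create a third-party application: click the create-application button, then fill in the application details:

Versions used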

prometheus: 2.0.darwin-amd64
node_exporter: 0.15.0.darwin-amd64
alertmanager: 0.14.darwin-amd64

Detailed configuration:

prometheus configuration:
# Alertmanager configuration
alerting:
  alertmanagers:
  - static_configs:
    - targets:
      - localhost:9093

rule_files:
  - "rules.yml"

scrape_configs:
  - job_name: 'node'
    static_configs:
      - targets: ['localhost:9100']
rules.yml configuration:
groups:
- name: node
  rules:
  - alert: server_status
    expr: up{job="node"} == 0
    for: 15s
    annotations:
      summary: "Instance {{ $labels.instance }} is down"
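
The for: 15s clause in the rule above means the alert fires only after the expression has held continuously for 15 seconds; until then it is pending. The Python sketch below simulates that state machine in simplified form. It is an illustration under my own assumptions (alert_states and the evaluation timeline are hypothetical), not the Prometheus rule evaluator.

```python
def alert_states(samples: list, for_seconds: float) -> list:
    """Return the alert state after each rule evaluation.

    samples holds (timestamp, expr_is_true) pairs, one per evaluation.
    """
    states = []
    active_since = None
    for ts, is_true in samples:
        if not is_true:
            active_since = None  # condition cleared: reset the timer
            states.append("inactive")
            continue
        if active_since is None:
            active_since = ts  # first evaluation at which the condition holds
        held = ts - active_since
        states.append("firing" if held >= for_seconds else "pending")
    return states


# up{job="node"} == 0 starts returning a result at t=10; rules run every 5s.
evaluations = [(0, False), (5, False), (10, True), (15, True),
               (20, True), (25, True), (30, False)]
print(alert_states(evaluations, for_seconds=15))
```

A short for value such as 15s reacts quickly but can fire on brief restarts; a longer value trades latency for fewer false alarms.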
alertmanager configuration:
route:
  group_by: ['alertname']
  receiver: 'wechat'

receivers:
- name: 'wechat'
  wechat_configs:
  - corp_id: 'xxx'
    to_party: '1'
    agent_id: '1000002'
    api_secret: 'xxxx'
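Parameters:

corp_id: the unique ID of the WeChat Work account, shown under My Company (我的企业).
to_party: the department (group) that should receive the messages.
agent_id: the ID of the third-party application, shown on the details page of the application you created.
api_secret: the secret of the third-party application, shown on the details page of the application you created.

See the documentation for details.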

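Verification

When we stop node_exporter, we receive the following alert:

When we start node_exporter again, we receive the following notification:

Conclusion

Setting up WeChat Work, from registration through the alertmanager configuration, has no real pitfalls. Notifications arrive promptly and messages are rarely lost. Give it a try.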