Kubernetes: Building a Monitoring Platform with Prometheus

Welcome to subscribe to the WeChat public account or Toutiao (今日头条) account I maintain, to get new posts as soon as they are shared :).



Reposted from 阳明's technical blog; I found a few problems in the original and have added notes on pitfalls from my own hands-on practice.

Link to the YAML configuration files I maintain


Normally we just watch the resource-usage charts in the Dashboard, but clearly, once we move to production, we need a more automated way to monitor the cluster, Pods, and even individual containers. Kubernetes ships with a built-in monitoring stack: Heapster + InfluxDB + Grafana. However, since our application-level business monitoring already uses Prometheus, here we will use Prometheus for the k8s cluster monitoring as well.


Introduction to Prometheus


Prometheus is an open-source monitoring system from SoundCloud. Its design draws on Google's internal monitoring systems, which makes it a natural fit for Kubernetes, itself born at Google. Compared with the InfluxDB stack it also performs better and comes with built-in alerting. It uses a pull-based collection model designed for large cluster environments: you only need to implement a metrics endpoint in your application and point Prometheus at it, and data collection is done.
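To make the pull model concrete: a metrics endpoint is just plain text in the Prometheus exposition format. Assuming a hypothetical service exposing it on port 8080, inspecting it would look something like this (illustrative output, not from this cluster):

$ curl http://localhost:8080/metrics
# HELP http_requests_total Total number of HTTP requests handled.
# TYPE http_requests_total counter
http_requests_total{code="200",method="GET"} 1027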


Deploying the Prometheus Configuration


First we define Prometheus's configuration file as a ConfigMap, as follows:

apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus-config
  namespace: kube-system
data:
  prometheus.yml: |
    global:
      scrape_interval: 30s
      scrape_timeout: 30s
    scrape_configs:
    - job_name: 'prometheus'
      static_configs:
        - targets: ['localhost:9090']
    - job_name: 'kubernetes-cluster'
      scheme: https
      tls_config:
        insecure_skip_verify: true
      kubernetes_sd_configs:
      - api_servers: 
        - 'http://10.139.15.113:8080'
        role: node
    - job_name: 'kubernetes-nodes-cadvisor'
      tls_config:
        insecure_skip_verify: true
      kubernetes_sd_configs:
      - api_servers:
        - 'http://10.139.15.113:8080'
        in_cluster: true
        role: node
      relabel_configs:
      - action: labelmap
        regex: __meta_kubernetes_node_label_(.+)
      - source_labels: [__meta_kubernetes_role]
        action: replace
        target_label: kubernetes_role
      - source_labels: [__address__]
        regex: '(.*):10250'
        replacement: '${1}:4194'
        target_label: __address__
    - job_name: 'kubernetes-apiserver-cadvisor'
      tls_config:
        insecure_skip_verify: true
      kubernetes_sd_configs:
      - api_servers:
        - 'http://10.139.15.113:8080'
        in_cluster: true
        role: node
      relabel_configs:
      - action: labelmap
        regex: __meta_kubernetes_node_label_(.+)
      - source_labels: [__meta_kubernetes_role]
        action: replace
        target_label: kubernetes_role
      - source_labels: [__address__]
        regex: '(.*):10250'
        replacement: '${1}:10255'
        target_label: __address__
    - job_name: 'kubernetes-node-exporter'
      tls_config:
        insecure_skip_verify: true
      kubernetes_sd_configs:
      - api_servers:
        - 'http://10.139.15.113:8080'
        in_cluster: true
        role: node
      relabel_configs:
      - action: labelmap
        regex: __meta_kubernetes_node_label_(.+)
      - source_labels: [__meta_kubernetes_role]
        action: replace
        target_label: kubernetes_role
      - source_labels: [__address__]
        regex: '(.*):10250'
        replacement: '${1}:31672'
        target_label: __address__

Save the above configuration as prometheus-config.yaml, then run:


$ kubectl create -f prometheus-config.yaml
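You can confirm the ConfigMap landed, and inspect its contents, before wiring it into the Deployment below:

$ kubectl get configmap prometheus-config -n kube-system
$ kubectl describe configmap prometheus-config -n kube-system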

Notes

  • For job_name=kubernetes-apiserver-cadvisor, port 10250 must be replaced with 10255. Port 10255 is the kubelet's read-only metrics endpoint; you can inspect it from a node with curl http://<node-ip>:10255/metrics (see the checks below).
  • For job_name=kubernetes-nodes-cadvisor, port 10250 must be replaced with 4194. Port 4194 is likewise the container-monitoring (cAdvisor) service built into Kubernetes. Before k8s 1.7 port 10255 served this data too, but from 1.7 onwards the cAdvisor metrics are no longer integrated into the kubelet endpoint, so take special care here.
  • For job_name=kubernetes-node-exporter, port 10250 is replaced with 31672, the NodePort exposed by node-exporter; fill this in according to your actual setup.
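Before relying on the relabeling above, it may help to curl each endpoint from a machine that can reach the nodes and confirm it answers in your cluster (replace the IP with one of your node IPs; the 31672 check only works once node-exporter is deployed below):

# kubelet read-only metrics endpoint
$ curl http://10.139.15.113:10255/metrics
# standalone cAdvisor endpoint (k8s 1.7+)
$ curl http://10.139.15.113:4194/metrics
# node-exporter NodePort (deployed in the next section)
$ curl http://10.139.15.113:31672/metrics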


Deploying node-exporter


Deploy node-exporter first. To collect metrics from every node, we deploy its Pods as a DaemonSet:

---
apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
  name: node-exporter
  namespace: kube-system
  labels:
    k8s-app: node-exporter
spec:
  template:
    metadata:
      labels:
        k8s-app: node-exporter
    spec:
      containers:
      - image: prom/node-exporter
        name: node-exporter
        ports:
        - containerPort: 9100
          protocol: TCP
          name: http
---
apiVersion: v1
kind: Service
metadata:
  labels:
    k8s-app: node-exporter
  name: node-exporter
  namespace: kube-system
spec:
  ports:
  - name: http
    port: 9100
    nodePort: 31672
    protocol: TCP
  type: NodePort
  selector:
    k8s-app: node-exporter


Save the above as node-exporter.yaml, then run:

$ kubectl create -f node-exporter.yaml
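As a quick sanity check, there should be one node-exporter Pod per node, and the NodePort should answer on any node IP (substitute your own):

$ kubectl get pods -n kube-system -l k8s-app=node-exporter -o wide
$ curl http://<node-ip>:31672/metrics | head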


Deploying Prometheus


Next we deploy Prometheus itself through a Deployment. The prometheus-deploy.yaml file is as follows:

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  labels:
    k8s-app: prometheus
  name: prometheus
  namespace: kube-system
spec:
  replicas: 1
  template:
    metadata:
      labels:
        k8s-app: prometheus
    spec:
      containers:
      - image: prom/prometheus:v1.0.1
        name: prometheus
        command:
        - "/bin/prometheus"
        args:
        - "-config.file=/etc/prometheus/prometheus.yml"
        - "-storage.local.path=/prometheus"
        - "-storage.local.retention=24h"
        ports:
        - containerPort: 9090
          protocol: TCP
        volumeMounts:
        - mountPath: "/prometheus"
          name: data
          subPath: prometheus
        - mountPath: "/etc/prometheus"
          name: config-volume
        resources:
          requests:
            cpu: 100m
            memory: 100Mi
          limits:
            cpu: 200m
            memory: 1Gi
      volumes:
      - name: data
        emptyDir: {}
      - configMap:
          name: prometheus-config
        name: config-volume

Save the above as prometheus-deploy.yaml, then run:

$ kubectl create -f prometheus-deploy.yaml
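Check that the Pod reaches Running; if it crash-loops, describing the Deployment usually points at a config or volume-mount problem:

$ kubectl get pods -n kube-system -l k8s-app=prometheus
$ kubectl describe deployment prometheus -n kube-system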


Create prometheus-service.yaml:

apiVersion: v1
kind: "Service"
metadata:
  name: prometheus
  labels:
    k8s-app: prometheus
  namespace: kube-system
spec:
  ports:
  - protocol: TCP
    port: 9090
    targetPort: 9090
  selector:
    k8s-app: prometheus


Run the command to create the Prometheus Service:

$ kubectl create -f prometheus-service.yaml
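Since the Service selects on k8s-app: prometheus, it should immediately pick up the Pod created above; an empty ENDPOINTS column means the selector and the Pod labels do not match:

$ kubectl get svc prometheus -n kube-system
$ kubectl get endpoints prometheus -n kube-system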


Next, expose the service so that the Prometheus UI can be reached. You can expose it locally with kubectl port-forward:

$ POD=`kubectl get pod -l k8s-app=prometheus -n kube-system -o go-template --template '{{range .items}}{{.metadata.name}}{{end}}'`
$ kubectl port-forward $POD -n kube-system 9090:9090

Then open http://localhost:9090 in a browser to reach the Prometheus UI.

Here, however, we expose it to the outside world through an Ingress instead.


Deploying Ingress


Deploy the ingress controller; per the official documentation, I enabled the RBAC switch.

  • First create the default backend Deployment:
$ curl -o default-backend.yaml https://raw.githubusercontent.com/kubernetes/ingress-nginx/master/deploy/default-backend.yaml
The default-backend.yaml file is as follows:
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: default-http-backend
  labels:
    app: default-http-backend
  namespace: kube-system
spec:
  replicas: 1
  selector:
    matchLabels:
      app: default-http-backend
  template:
    metadata:
      labels:
        app: default-http-backend
    spec:
      terminationGracePeriodSeconds: 60
      containers:
      - name: default-http-backend
        # Any image is permissible as long as:
        # 1. It serves a 404 page at /
        # 2. It serves 200 on a /healthz endpoint
        image: kirago/defaultbackend:1.4
        livenessProbe:
          httpGet:
            path: /healthz
            port: 8080
            scheme: HTTP
          initialDelaySeconds: 30
          timeoutSeconds: 5
        ports:
        - containerPort: 8080
        resources:
          limits:
            cpu: 10m
            memory: 20Mi
          requests:
            cpu: 10m
            memory: 20Mi
---
apiVersion: v1
kind: Service
metadata:
  name: default-http-backend
  namespace: kube-system
  labels:
    app: default-http-backend
spec:
  ports:
  - port: 80
    targetPort: 8080
  selector:
    app: default-http-backend


  • Run the create command:
$ kubectl create -f default-backend.yaml
  • Create the ingress ConfigMap:
$ curl -o ingress-configmap.yaml https://raw.githubusercontent.com/kubernetes/ingress-nginx/master/deploy/configmap.yaml

The ingress-configmap.yaml file is as follows:

kind: ConfigMap
apiVersion: v1
metadata:
  name: nginx-configuration
  namespace: kube-system
  labels:
    app: ingress-nginx
  • Run the create command:
$ kubectl create -f ingress-configmap.yaml
  • Create tcp-services-configmap.yaml:
$ curl -o tcp-services-configmap.yaml https://raw.githubusercontent.com/kubernetes/ingress-nginx/master/deploy/tcp-services-configmap.yaml
  • The tcp-services-configmap.yaml file is as follows:
kind: ConfigMap
apiVersion: v1
metadata:
  name: tcp-services
  namespace: kube-system
  • Run the create command:
$ kubectl create -f tcp-services-configmap.yaml
  • Create udp-services-configmap.yaml:
$ curl -o udp-services-configmap.yaml https://raw.githubusercontent.com/kubernetes/ingress-nginx/master/deploy/udp-services-configmap.yaml
  • The udp-services-configmap.yaml file is as follows:
kind: ConfigMap
apiVersion: v1
metadata:
  name: udp-services
  namespace: kube-system
  • Run the create command:
$ kubectl create -f udp-services-configmap.yaml
  • Create the ingress-rbac.yaml file:
$ curl -o ingress-rbac.yaml https://raw.githubusercontent.com/kubernetes/ingress-nginx/master/deploy/rbac.yaml
  • The ingress-rbac.yaml file is as follows:
apiVersion: v1
kind: ServiceAccount
metadata:
  name: nginx-ingress-serviceaccount
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRole
metadata:
  name: nginx-ingress-clusterrole
rules:
  - apiGroups:
      - ""
    resources:
      - configmaps
      - endpoints
      - nodes
      - pods
      - secrets
    verbs:
      - list
      - watch
  - apiGroups:
      - ""
    resources:
      - nodes
    verbs:
      - get
  - apiGroups:
      - ""
    resources:
      - services
    verbs:
      - get
      - list
      - watch
  - apiGroups:
      - "extensions"
    resources:
      - ingresses
    verbs:
      - get
      - list
      - watch
  - apiGroups:
      - ""
    resources:
      - events
    verbs:
      - create
      - patch
  - apiGroups:
      - "extensions"
    resources:
      - ingresses/status
    verbs:
      - update
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: Role
metadata:
  name: nginx-ingress-role
  namespace: kube-system
rules:
  - apiGroups:
      - ""
    resources:
      - configmaps
      - pods
      - secrets
      - namespaces
    verbs:
      - get
  - apiGroups:
      - ""
    resources:
      - configmaps
    resourceNames:
      # Defaults to "<election-id>-<ingress-class>"
      # Here: "<ingress-controller-leader>-<nginx>"
      # This has to be adapted if you change either parameter
      # when launching the nginx-ingress-controller.
      - "ingress-controller-leader-nginx"
    verbs:
      - get
      - update
  - apiGroups:
      - ""
    resources:
      - configmaps
    verbs:
      - create
  - apiGroups:
      - ""
    resources:
      - endpoints
    verbs:
      - get
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: RoleBinding
metadata:
  name: nginx-ingress-role-nisa-binding
  namespace: kube-system
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: nginx-ingress-role
subjects:
  - kind: ServiceAccount
    name: nginx-ingress-serviceaccount
    namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
  name: nginx-ingress-clusterrole-nisa-binding
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: nginx-ingress-clusterrole
subjects:
  - kind: ServiceAccount
    name: nginx-ingress-serviceaccount
    namespace: kube-system
  • Run the following command:
$ kubectl create -f ingress-rbac.yaml
  • Create the ingress-with-rbac.yaml file:
$ curl -o ingress-with-rbac.yaml https://raw.githubusercontent.com/kubernetes/ingress-nginx/master/deploy/with-rbac.yaml
  • The ingress-with-rbac.yaml file is as follows:
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: nginx-ingress-controller
  namespace: kube-system 
spec:
  replicas: 1
  selector:
    matchLabels:
      app: ingress-nginx
  template:
    metadata:
      labels:
        app: ingress-nginx
      annotations:
        prometheus.io/port: '10254'
        prometheus.io/scrape: 'true'
    spec:
      serviceAccountName: nginx-ingress-serviceaccount
      containers:
        - name: nginx-ingress-controller
          image: quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.12.0
          args:
            - /nginx-ingress-controller
            - --default-backend-service=$(POD_NAMESPACE)/default-http-backend
            - --configmap=$(POD_NAMESPACE)/nginx-configuration
            - --tcp-services-configmap=$(POD_NAMESPACE)/tcp-services
            - --udp-services-configmap=$(POD_NAMESPACE)/udp-services
            - --annotations-prefix=nginx.ingress.kubernetes.io
          env:
            - name: POD_NAME
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
            - name: POD_NAMESPACE
              valueFrom:
                fieldRef:
                  fieldPath: metadata.namespace
          ports:
          - name: http
            containerPort: 80
          - name: https
            containerPort: 443
          livenessProbe:
            failureThreshold: 3
            httpGet:
              path: /healthz
              port: 10254
              scheme: HTTP
            initialDelaySeconds: 10
            periodSeconds: 10
            successThreshold: 1
            timeoutSeconds: 1
          readinessProbe:
            failureThreshold: 3
            httpGet:
              path: /healthz
              port: 10254
              scheme: HTTP
            periodSeconds: 10
            successThreshold: 1
            timeoutSeconds: 1
  • Run the following create command:
$ kubectl create -f ingress-with-rbac.yaml
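Verify that both the controller and the default backend are running (the labels below are the ones defined in the manifests above):

$ kubectl get pods -n kube-system -l app=ingress-nginx
$ kubectl get pods -n kube-system -l app=default-http-backend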

Deploying the Ingress Resource

  • The YAML file content is as follows:
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: prometheus-ingress
  namespace: kube-system
spec:
  rules:
  - host: prometheus.local  # replace with your own domain
    http:
      paths:
      - path: /
        backend:
          serviceName: prometheus
          servicePort: 9090

Save the above as prometheus-ingress.yaml, then run:

$ kubectl create -f prometheus-ingress.yaml
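Even before DNS is in place you can test the rule with curl by sending the Host header directly, assuming the controller is reachable on a node (e.g. via hostNetwork or a NodePort Service, depending on how you exposed it; substitute the node IP):

$ kubectl get ingress prometheus-ingress -n kube-system
$ curl -H "Host: prometheus.local" http://<ingress-node-ip>/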

Heads up

Since the Ingress is exposed for use from outside the cluster, whatever sits in front of it, for example an F5 hardware load balancer, must be able to resolve the domain exposed by the in-cluster Ingress. I hit exactly this pitfall when setting up my own local environment; once the resolution was configured, access worked. Keep this in mind when testing on your own.
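For a quick local test, the simplest workaround is a hosts-file entry pointing the domain at the address where the ingress controller is reachable (the IP here is illustrative):

$ echo "10.139.15.113 prometheus.local" | sudo tee -a /etc/hosts

After that, opening http://prometheus.local in a browser should reach the Prometheus UI through the Ingress.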

Final result:

kubernetes
prometheus
monitoring

About the Author

Kirago
Personal site: https://kiragoo.github.io/