'Scrap' 태그의 글 목록

Scrap

[ Kubernetes ] Prometheus / 사용자 App의 Metrics를 Scraping 하기 위한 Endpoint 추가 설정 2022.07.08

[ Kubernetes ] Prometheus / 사용자 App의 Metrics를 Scraping 하기 위한 Endpoint 추가 설정

2022. 7. 8. 14:39

Kubernetes 또는 OCP 클러스터에 내가 만든 App을 Deploy하고, 이 App이 Prometheus에서 Scraping하도록 설정하고 싶을 때가 있다.

그럴 때 아래 문서를 따라하면 Metric Exporting, Scraping이 잘 된다.

(단, ServiceMonitor 리소스 설정은 아래 문서에서 살짝 놓친 부분이 있으니까, 이 블로그 페이지의 마지막에 있는 예제를 따를 것)

모니터링 OpenShift Container Platform 4.10 | Red Hat Customer Portal

이 문서에서는 OpenShift Container Platform에서 Prometheus를 구성 및 사용하는 방법에 대한 지침을 설명합니다.

access.redhat.com

위 문서에서 7.2 사용자 정의 프로젝트에 대한 메트릭 컬렉션 설정 부분을 보면 된다.

또는 같은 내용이지만, Page 단위로 분리된 문서를 보고 싶다면 아래 Web Page를 보는 것도 추천 !!!

(다시 말하지만, 위 문서랑 완전히 똑같은 내용이고 Chapter 별로 Page를 분리한 문서 Form이다)

7.2. 사용자 정의 프로젝트에 대한 메트릭 컬렉션 설정 OpenShift Container Platform 4.10 | Red Hat Customer Por

Access Red Hat’s knowledge, guidance, and support through your subscription.

access.redhat.com

주의:
위 문서가 대체로 절차를 잘 설명하고 있지만, ServiceMonitor 리소스에 대한 설정이 이상하다.
그래서 ServiceMonitor 리소스는 내가 별도로 만들었다.

웹 문서의 설명과 다른 부분만 아래와 같이 Comment를 추가했다.

Comment가 붙어 있는 라인만 잘 수정하면, 잘 동작한다.

##
## File name: servicemonitor.yaml
##

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  labels:
    k8s-app: almighty
  name: almighty
  namespace: openshift-monitoring  ## Note: 반드시 이 namespace로 설정해야 한다.
spec:
  endpoints:
  - interval: 15s
    port: web
    scheme: http
    path: /metrics       ## 이 설정을 꼭 넣자.
  selector:
    matchLabels:
      app: almighty      ## 이 부분이 Pod, Service의 label 값과 같은지 꼭 확인해야 한다.
  namespaceSelector: 
    matchNames:
      - almighty         ## Service 리소스의 값과 같은지 확인해야 한다.

servicemonitor.yaml 파일을 작성하면, 아래와 같이 kubectl 명령으로 kubernetes cluster에 적용한다.

$ kubectl apply -f servicemonitor.yaml

그런데 여기서 생각해볼 것이 있다.
만약, Service에 포함된 Pod가 2개 이상일 때는 위 ServiceMonitor처럼 Scrapping을 시도하면
2개의 Pod 중에서 1개의 Pod만 Scrapping에 응답하기 때문에 반쪽 짜리 Scrapping이 되는 문제가 있다.
Pod가 10개라면, 1개 Pod 입장에서 본다면 Scrapping 인터벌(주기)는 10배로 더 길어진다.

따라서 Service에 포함되는 Pod가 2개 이상일 때는 ServiceMonitor 보다 PodMonitor 조건을 사용하는 것이 좋다.
PodMonitor를 사용하면, Prometheus가 각 Pod마다 접근해서 Scrapping한다.

apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
  name: almighty
  labels:
    app: almighty
## FIXME
  namespace: openshift-monitoring
spec:
  selector:
    matchLabels:
## FIXME
      app: almighty
  podMetricsEndpoints:
## FIXME
  - port: web
    path: /metrics
    interval: 7s   ## 테스트를 해보면, 1s 로 설정해도 잘 동작한다.
    scheme: http
  namespaceSelector:
    matchNames:
## FIXME
      - almighty

Prometheus Web UI를 열고, 아래와 같이 Qeury(PromQL)을 입력한다.

## Example 1
http_request_duration_seconds_bucket{code="200",handler="found"}

## Example 2
irate(http_requests_total{namespace="almighty", code="200"}[5m])

위와 같이 챠트가 잘 그려지면, Prometheus의 Scraper가 사용자 App의 Metrics을 잘 Scraping하고 있는 것이다.

참고: Prometheus Metrics 테스트를 위한 Example App

아래 Go Source Code를 이용하는 것이 제일 테스트하기 편하다. (적극 추천)

GitHub - brancz/prometheus-example-app: Go app that exposes metrics about its HTTP handlers.

Go app that exposes metrics about its HTTP handlers. - GitHub - brancz/prometheus-example-app: Go app that exposes metrics about its HTTP handlers.

github.com

Troubleshooting (문제 해결)

위 예시대로 사용하려 했는데 잘 동작하지 않는다면, 아래 설정 작업을 따라해볼 것 !!!

Prometheus의 Cluster Role Binding 설정 작업

만약, Kubernetes 또는 OCP를 초기 구축하고 나서 User App에 대한 Metrics을 Scrapping할 수 있는 Role을 Prometheus ServiceAccount에 부여하지 않았다면, 꼭 Cluster Role과 ClusterRoleBinding을 설정해줘야 한다.

이 Cluster Role Binding 작업이 안 되어 있으면, 위에서 예제로 설명했던 ServiceMonitoring, PodMonitoring을 모두

"endpoints is forbidden" 에러가 발생할 것이다. (아래 Error Log을 참고)

Prometheus Error Log 예시

$ kubectl logs -f -n openshift-monitoring prometheus-k8s-0

ts=2022-08-10T03:29:51.795Z caller=log.go:168 level=error component=k8s_client_runtime func=ErrorDepth msg="github.com/prometheus/prometheus/discovery/kubernetes/kubernetes.go:471: Failed to watch *v1.Pod: failed to list *v1.Pod: pods is forbidden: User \"system:serviceaccount:openshift-monitoring:prometheus-k8s\" cannot list resource \"pods\" in API group \"\" in the namespace \"almighty\""

위 에러를 없애기 위해 아래와 같이 ClusterRole, ClusterRoleBinding 리소스를 만들어야 한다.

##
## Cluster Role 생성을 위한 YAML 파일 작성
##

$ cat prometheus-cluster-role.yaml

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: my-prometheus
rules:
- apiGroups: [""]
  resources:
  - pods
  - services
  - endpoints
  verbs: ["get", "list", "watch"]
- apiGroups: [""]
  resources:
  - configmaps
  verbs: ["get"]
- nonResourceURLs: ["/metrics"]
  verbs: ["get"]


##
## Cluster Role Binding 생성을 위한 YAML 파일 작성
##

$ cat prometheus-cluster-role-binding.yaml

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: my-prometheus
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: clusterlogging-collector-metrics
subjects:
- kind: ServiceAccount
  name: prometheus-k8s
  namespace: openshift-monitoring

##
##
##

$ kubectl apply -f prometheus-cluster-role.yaml

$ kubectl apply -f prometheus-cluster-role-binding.yaml

위 설명에 대한 자세한 정보는 아래 GitHub의 Prometheus RBAC 챕터를 참고.

GitHub - prometheus-operator/prometheus-operator: Prometheus Operator creates/configures/manages Prometheus clusters atop Kubern

Prometheus Operator creates/configures/manages Prometheus clusters atop Kubernetes - GitHub - prometheus-operator/prometheus-operator: Prometheus Operator creates/configures/manages Prometheus clus...

github.com

위readme.md 내용을 화면 캡처한 것!

참고: 다른 블로거의 글

Red Hat 문서에 오류가 있어서 한참 헤메고 있을 때, 아래 블로그 글에서 해법을 찾았다 ^^

쉽게 예제를 만들어서 설명하고 있어서, 이해하기 좋다.

prometheus-operator | loliot

prometheus-operator

wiki.loliot.net

'kubernetes' 카테고리의 다른 글

MetalLB에 대한 Q&A (0)	2022.07.17
OpenShift, OKD Source Code - GitHub Repository (0)	2022.07.08
쉽고 간단하게 Kubernetes SecurityContext 설정 (privileged 권한) (0)	2022.07.06
Kubernetes SRIOV Network Device Plugin에 관해 Code Level 스터디를 하고 싶다면 (0)	2022.07.05
Kubernetes 한글 문서 (0)	2022.07.04

PREV 1 NEXT

sejong.jeonjo@gmail.com