Postgres-Operator with Metrics
When running PostgreSQL in Kubernetes, operators quickly become a topic. Operators are application-specific Kubernetes controllers. Or to put it more simply: Operators are programs that configure and manage other software.
I use the “Zalando postgres-operator”, a Kubernetes operator that manages PostgreSQL clusters on my Kubernetes setup. One major headache is that this operator deploys your clusters, but doesn’t take care of monitoring them. So while the PostgreSQL cluster itself might be up and running as expected, the performance might be terrible, or other problems might be taking place within the database that remain invisible.
For monitoring there is another operator, the prometheus-operator, a Kubernetes controller that manages Prometheus instances, their configuration and their integration with tools like Alertmanager. It provides a Kubernetes-native way to configure scraping by using selectors.
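To give a rough idea of what that looks like in practice, the following is a purely illustrative sketch of a Prometheus CR selecting its scrape configurations via a label selector; the team: backend label is an invented example, not something your setup necessarily uses:

# Illustrative sketch: a Prometheus CR managed by the prometheus-operator
# picks up all PodMonitors carrying a matching label.
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: example
spec:
  podMonitorSelector:
    matchLabels:
      team: backend # invented label, adjust to your environment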
If all of this sounds like too much for you, relax and read a bit more about these concepts before diving in deeper.
Before you start
It’s assumed that you already have the prometheus-operator and a Grafana instance installed. A popular setup, which I also use, is the kube-prometheus-stack.
It’s also assumed that you are familiar with the concepts of selectors, pods, deployments, services and basics of CRDs.
And that you have used helm before.
Installing the Zalando postgres-operator
To install the Zalando postgres-operator, we’ll utilise their helm chart, following the official installation guide.
In order to install the operator using helm, we want to provide a values.yaml that applies some simple modifications to the default configuration of the operator itself:
# values.yaml
---
configGeneral:
  sidecars:
    - name: "exporter"
      image: "quay.io/prometheuscommunity/postgres-exporter:latest"
      ports:
        - name: exporter
          containerPort: 9187
          protocol: TCP
      resources:
        limits:
          cpu: 500m
          memory: 256M
        requests:
          cpu: 100m
          memory: 200M
      env:
        - name: "DATA_SOURCE_URI"
          value: "$(POD_NAME)/postgres?sslmode=require"
        - name: "DATA_SOURCE_USER"
          value: "$(POSTGRES_USER)"
        - name: "DATA_SOURCE_PASS"
          value: "$(POSTGRES_PASSWORD)"
        - name: "PG_EXPORTER_AUTO_DISCOVER_DATABASES"
          value: "true"
This values.yaml file instructs the helm chart to deploy a configuration for the operator that will create all PostgreSQL clusters with a sidecar container. This sidecar container runs the postgres-exporter from the Prometheus community and is configured to connect to the locally installed cluster and collect various metrics about the individual PostgreSQL instance. The $(POD_NAME), $(POSTGRES_USER) and $(POSTGRES_PASSWORD) references point to environment variables that the operator injects into every sidecar container.
To install the operator using these values, you can follow the official install instructions with the addition of the --values parameter and the path to the values file you just created:
# add repo for postgres-operator
helm repo add postgres-operator-charts https://opensource.zalando.com/postgres-operator/charts/postgres-operator
# install the postgres-operator
helm install --namespace postgres-operator --create-namespace --values values.yaml postgres-operator postgres-operator-charts/postgres-operator
Note: You’ll need cluster-admin permissions for this in order to install the CRDs.
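To verify that the installation went through, you can check that the operator pod is running and that its CRDs were registered:

# check the operator pod
kubectl get pods -n postgres-operator
# check that the operator CRDs are registered
kubectl get crd | grep acid.zalan.do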
Deploying a PostgreSQL cluster
With the operator installed, it’s time to create a PostgreSQL cluster. This is done using a Custom Resource (CR) called postgresql. In this example, we will use the minimal cluster from the operator repository, but you can make it as complex as you feel comfortable with. A reference can be found in the operator documentation.
# postgresql.yaml
---
apiVersion: "acid.zalan.do/v1"
kind: postgresql
metadata:
  name: example-minimal-cluster
spec:
  teamId: "example" # needs to be identical with the name prefix (example-)
  volume:
    size: 1Gi
  numberOfInstances: 2
  users:
    example: # database owner
      - superuser
      - createdb
  databases:
    exampledb: example # dbname: owner
  postgresql:
    version: "14"
After storing this YAML as postgresql.yaml, you can apply it using kubectl:
kubectl create namespace example
kubectl apply -n example -f postgresql.yaml
kubectl wait --for=condition=Ready pod/example-minimal-cluster-0 -n example
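Once the pods are ready, you can check the cluster status and, if you want to see the exporter in action, fetch its raw metrics via a port-forward; pg_up is one of the standard metrics exposed by the postgres-exporter:

# check the status of the cluster CR
kubectl get postgresql -n example
# forward the exporter port of the first instance ...
kubectl port-forward pod/example-minimal-cluster-0 9187:9187 -n example
# ... and fetch the metrics in a second terminal
curl -s http://localhost:9187/metrics | grep ^pg_up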
Setting up the Prometheus monitoring
As a last step, it’s time to collect the metrics from all PostgreSQL clusters deployed by the operator. This is done using the PodMonitor CR. Again, a simple YAML manifest file is needed; let’s call it podmonitor.yaml:
# podmonitor.yaml
---
apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
  name: postgresql
spec:
  selector:
    matchLabels:
      application: spilo # (1)
  namespaceSelector:
    any: true # (2)
  podMetricsEndpoints:
    - port: exporter # (3)
      interval: 15s
      scrapeTimeout: 10s
    - targetPort: 8008 # (4)
      interval: 15s
      scrapeTimeout: 10s
  podTargetLabels: # (5)
    - spilo-role
    - cluster-name
    - team
To understand this CR a bit better, let’s explain the various parts of the spec. This CR is deployed in the operator namespace and instructs the Prometheus operator to automatically scrape all PostgreSQL clusters created by the postgres-operator, across the whole Kubernetes cluster, without requiring additional instructions.
1. The selector for this PodMonitor targets all spilo applications. spilo is the image that the postgres-operator uses; it contains PostgreSQL, Patroni and everything needed to run the clustered setup. application: spilo is also the default label set by the operator, so this should find all cluster instances.
2. This namespaceSelector explicitly instructs the PodMonitor to search in all namespaces. Without a namespaceSelector, the PodMonitor would only look in its own namespace. You can also provide a list of namespaces if you prefer to be more selective.
3. This port name comes from the exporter sidecar container explicitly configured in the postgres-operator configuration above, which is now deployed with every postgresql cluster.
4. This port belongs to Patroni and provides additional metrics regarding the cluster status, such as the current leader/replica situation, which should help to debug potential replication problems.
5. podTargetLabels instructs Prometheus to collect the listed Kubernetes pod labels and add them to the metrics collected from the scraped exporters. This is useful for identifying your different clusters in dashboards and general queries.
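The manifest still needs to be applied, for example into the operator namespace. One hint from my side: depending on your kube-prometheus-stack configuration, a PodMonitor might only be picked up if it carries the release label of your Prometheus helm installation (unless you have set podMonitorSelectorNilUsesHelmValues to false); the release name monitoring below is an assumption, adjust it to your setup:

# deploy the PodMonitor into the operator namespace
kubectl apply -n postgres-operator -f podmonitor.yaml
# if your kube-prometheus-stack requires it, add the release label
kubectl label podmonitor postgresql -n postgres-operator release=monitoring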
Wrapping up and further hints
With all this done, you have a monitored PostgreSQL cluster in your Kubernetes setup that is just waiting to be utilised for your next project. Maybe a Mastodon instance?
You can find some good dashboards on the official Grafana website that should give you better insight into your PostgreSQL setup. But even without them, you can already find all the metrics under the pg_ and patroni_ prefixes.
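As a starting point, here are two example PromQL queries; they assume the cluster-name pod label collected via podTargetLabels above, which Prometheus sanitises to cluster_name, and use standard postgres-exporter metrics:

# is the example cluster reachable by the exporter? (1 = up)
pg_up{cluster_name="example-minimal-cluster"}
# active connections per cluster and database
sum by (cluster_name, datname) (pg_stat_activity_count{state="active"})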