Postgres-Operator with Metrics
When running PostgreSQL in Kubernetes, operators quickly become a topic. Operators are application-specific Kubernetes controllers. Or, to put it more simply: operators are programs that configure and manage other software.
I use the “Zalando postgres-operator”, a Kubernetes operator that manages PostgreSQL clusters on my Kubernetes setup. One major headache is that this operator deploys your clusters, but doesn’t take care of monitoring them. So while the PostgreSQL cluster itself might be up and running and working as expected, its performance might be terrible, or other problems might be taking place within the database without being visible.
For monitoring there is another operator, the prometheus-operator, a Kubernetes controller that manages Prometheus instances, their configuration and their integration with tools like Alertmanager. It provides a Kubernetes-native way to configure scraping by using selectors.
If all of this sounds like too much for you, relax and read a bit more about these concepts before diving in deeper.
Before you start
It’s assumed that you already have the prometheus-operator and a Grafana instance installed. A popular setup, which I also use, is the kube-prometheus-stack.
It’s also assumed that you are familiar with the concepts of selectors, pods, deployments and services, and with the basics of CRDs.
And that you have used
Installing the Zalando postgres-operator
To install the Zalando postgres-operator, we’ll utilise their helm chart, following the official installation guide.
In order to install the operator using helm, we want to provide a values.yaml that makes some simple modifications to the default configuration of the operator itself:

    # values.yaml
    ---
    configGeneral:
      sidecars:
        - name: "exporter"
          image: "quay.io/prometheuscommunity/postgres-exporter:latest"
          ports:
            - name: exporter
              containerPort: 9187
              protocol: TCP
          resources:
            limits:
              cpu: 500m
              memory: 256M
            requests:
              cpu: 100m
              memory: 200M
          env:
            - name: "DATA_SOURCE_URI"
              value: "$(POD_NAME)/postgres?sslmode=require"
            - name: "DATA_SOURCE_USER"
              value: "$(POSTGRES_USER)"
            - name: "DATA_SOURCE_PASS"
              value: "$(POSTGRES_PASSWORD)"
            - name: "PG_EXPORTER_AUTO_DISCOVER_DATABASES"
              value: "true"

This values.yaml file instructs the helm chart to deploy a configuration for the operator that creates every PostgreSQL cluster with a sidecar container. This sidecar container runs the postgres-exporter from the Prometheus community and is configured to connect to the PostgreSQL instance running in the same pod and collect various metrics about that individual instance.
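The sidecar definition above uses the latest tag, which means the exporter version may change whenever a pod is recreated. If you prefer reproducible deployments, a minimal sketch of pinning the image could look like the excerpt below. The tag v0.15.0 is only an example and an assumption on my side, not part of the setup described here, so check the registry for a tag that exists and suits your PostgreSQL version.

```yaml
# values.yaml (excerpt)
configGeneral:
  sidecars:
    - name: "exporter"
      # Assumption: v0.15.0 is an example tag; pick one that exists in the registry.
      image: "quay.io/prometheuscommunity/postgres-exporter:v0.15.0"
      # ports, resources and env stay exactly as in the full values.yaml
```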
To install the operator using these values, you can use the official install instructions with the addition of the --values parameter and the path to the values file you just created:

    # add repo for postgres-operator
    helm repo add postgres-operator-charts https://opensource.zalando.com/postgres-operator/charts/postgres-operator

    # install the postgres-operator
    helm install --namespace postgres-operator --create-namespace --values values.yaml postgres-operator postgres-operator-charts/postgres-operator

Note: You’ll require cluster-admin permissions for this, in order to install the CRDs.
Deploying a PostgreSQL cluster
With the operator installed, it’s time to create a PostgreSQL cluster. This is done using a Custom Resource (CR) called postgresql. In this example, we will use the minimal cluster from the operator repository, but you can make it as complex as you feel comfortable with. A reference can be found in the operator documentation.

    # postgresql.yaml
    ---
    apiVersion: "acid.zalan.do/v1"
    kind: postgresql
    metadata:
      name: example-minimal-cluster
    spec:
      teamId: "example"  # needs to be identical with the name prefix (example-)
      volume:
        size: 1Gi
      numberOfInstances: 2
      users:
        example:  # database owner
          - superuser
          - createdb
      databases:
        exampledb: example  # dbname: owner
      postgresql:
        version: "14"

After storing this YAML as postgresql.yaml, you can deploy it with kubectl:

    kubectl create namespace example
    kubectl apply -n example -f postgresql.yaml
    # kubectl wait needs a condition to wait for
    kubectl wait --for=condition=Ready pod/example-minimal-cluster-0 -n example

Setting up the Prometheus monitoring
As a last step, it’s time to collect the metrics from all PostgreSQL clusters deployed by the operator. This is done using the PodMonitor CR. Again, a simple YAML manifest file is needed; let’s call it podmonitor.yaml:

    # podmonitor.yaml
    ---
    apiVersion: monitoring.coreos.com/v1
    kind: PodMonitor
    metadata:
      name: postgresql
    spec:
      selector:
        matchLabels:
          application: spilo  # (1)
      namespaceSelector:
        any: true  # (2)
      podMetricsEndpoints:
        - port: exporter  # (3)
          interval: 15s
          scrapeTimeout: 10s
        - targetPort: 8008  # (4)
          interval: 15s
          scrapeTimeout: 10s
      podTargetLabels:  # (5)
        - spilo-role
        - cluster-name
        - team

To understand this CR a bit better, let’s go through the various parts of the spec. The CR is deployed in the operator namespace and instructs the Prometheus operator to automatically scrape all PostgreSQL clusters created by the postgres-operator across the Kubernetes cluster, without requiring additional instructions.
1. The label selector for this PodMonitor. spilo is the image that the postgres-operator uses; it contains PostgreSQL, Patroni and everything needed to cluster the setup. Since application: spilo is also the label the operator sets by default, this should find all cluster instances.
2. namespaceSelector explicitly instructs the PodMonitor to search in all namespaces. Without it, the PodMonitor would only look in its own namespace. You can also provide a list of namespaces if you prefer to be more selective.
3. This port name comes from the sidecar container explicitly configured in the postgres-operator configuration above, which is now deployed with every PostgreSQL cluster.
4. This port belongs to Patroni and provides additional metrics about the cluster status, such as the current leader/replica situation, which should help to debug potential replication problems.
5. podTargetLabels instructs Prometheus to take the listed Kubernetes pod labels and add them to the metrics collected from the scraped exporters. This is useful to identify your different clusters in dashboards and general queries.
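Two optional tweaks to the PodMonitor are sketched below, in case the targets never show up in Prometheus or you want to be more selective. First, a kube-prometheus-stack installation with default values typically only picks up PodMonitors that carry the label of its Helm release; the release name kube-prometheus-stack used here is an assumption, so replace it with your own. Second, instead of any: true, the namespaceSelector can list specific namespaces through matchNames; the example namespace is the one created earlier. The file name is hypothetical as well.

```yaml
# podmonitor-variant.yaml (hypothetical file name)
---
apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
  name: postgresql
  labels:
    # Assumption: the Helm release of kube-prometheus-stack is named
    # "kube-prometheus-stack"; adjust this to your release name.
    release: kube-prometheus-stack
spec:
  selector:
    matchLabels:
      application: spilo
  namespaceSelector:
    # Only look in the listed namespaces instead of the whole cluster.
    matchNames:
      - example
  podMetricsEndpoints:
    - port: exporter
      interval: 15s
      scrapeTimeout: 10s
    - targetPort: 8008
      interval: 15s
      scrapeTimeout: 10s
  podTargetLabels:
    - spilo-role
    - cluster-name
    - team
```

Whether the release label is needed depends on how the Prometheus resource selects PodMonitors in your installation, so treat it as something to check rather than a requirement.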
Wrapping up and further hints
With all this done, you have a monitored PostgreSQL cluster in your Kubernetes cluster that is just waiting to be used for your next project. Maybe a Mastodon instance?
You can find some good dashboards on the official Grafana website that should give you better insight into your PostgreSQL setup. But you can already find all the metrics under the