1 change: 0 additions & 1 deletion .yamllint
Original file line number Diff line number Diff line change
Expand Up @@ -6,7 +6,6 @@ yaml-files:
ignore:
- '**/chart/templates**'


rules:
anchors: enable
braces: enable
Expand Down
31 changes: 31 additions & 0 deletions adr/0002-configure-distroless-pooler-with-pepr.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,31 @@
# 2. Configure the distroless connection pooler with a Pepr module

Date: 2026-06-04

## Status

Proposed

## Context

The Zalando postgres-operator launches the connection-pooler container with only an image and environment variables — no `command`/`args` — and relies on the image's entrypoint to render `pgbouncer.ini` (and the `userlist.txt` auth file) from those env vars and then exec PgBouncer.

The `unicorn` flavor uses a distroless PgBouncer image (`pgbouncer-fips`) that is the bare binary with no entrypoint script, template, or shell. Launched argument-less it prints usage and exits, so the pooler crash-loops. The operator does not let us set the pooler container's command, volumes, or config, and it reconciles the pooler Deployment, so any manual patch is eventually reverted.

We need a way to supply PgBouncer's config, auth file, and launch command that (a) works with a distroless image, (b) survives operator reconciliation, and (c) is coupled to this package's lifecycle. A one-shot `kubectl`/Job patch was rejected (the operator reverts it on its next write).
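
To make the missing step concrete, here is a minimal sketch of the env-var substitution that a self-configuring image performs before it starts PgBouncer. It is an illustration only: `renderTemplate`, the template text, and the variable names `POOL_MODE` and `LISTEN_PORT` are hypothetical and are not taken from the upstream template.

```typescript
// Hypothetical helper that mimics an envsubst-style rendering step.
// It replaces ${NAME} and $NAME with values from `env`; unset names become "".
function renderTemplate(
  template: string,
  env: Record<string, string | undefined>,
): string {
  return template.replace(
    /\$\{([A-Za-z_][A-Za-z0-9_]*)\}|\$([A-Za-z_][A-Za-z0-9_]*)/g,
    (_match: string, braced: string | undefined, bare: string | undefined) =>
      env[braced ?? bare ?? ""] ?? "",
  );
}

// Illustrative template fragment, not the real pgbouncer.ini.tmpl.
const exampleTemplate = "pool_mode = ${POOL_MODE}\nlisten_port = $LISTEN_PORT\n";

const renderedIni = renderTemplate(exampleTemplate, {
  POOL_MODE: "transaction",
  LISTEN_PORT: "5432",
});

console.log(renderedIni);
```

A distroless image has no shell to run a step like this, which is why the configuration has to be produced outside the container.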

## Decision

We ship a [Pepr](https://github.com/defenseunicorns/pepr) module (`src/pepr`, capability `pgbouncer-pooler`), bundled as a manifest in the `unicorn` component, that:

@zachariahmiller zachariahmiller Jun 5, 2026

Contributor

Assuming there is literally no other way to accomplish this (which I am skeptical of without diving much deeper), I would much rather see whether the other pgbouncer image in Chainguard's catalogue works, or even use the -dev variant, over this approach.

Author

I think a deep dive is critical here. The postgres-operator itself generates the pooler Deployment (one per postgresql CR), and there do not appear to be any templates that can be overridden, so changes will need to be made to the Go code. In the meantime, this mutation addresses the issue.

All pgbouncer images on Chainguard have only the pgbouncer binary as the entrypoint, with no config-file argument. If you inspect the upstream Zalando pgbouncer image, the entrypoint is a script that uses envsubst to create the ini config file and then calls pgbouncer with this ini file as the only argument. This can be verified with:

docker image inspect registry.opensource.zalan.do/acid/pgbouncer:master-32 --format 'Entrypoint: {{.Config.Entrypoint}}{{"\n"}}{{.Config.Cmd}}'
# Entrypoint: [/bin/sh /entrypoint.sh]
# []

docker create --name zalano-pgbouncer registry.opensource.zalan.do/acid/pgbouncer:master-32
docker cp zalano-pgbouncer:/entrypoint.sh .
docker rm zalano-pgbouncer
cat entrypoint.sh
rm entrypoint.sh

entrypoint.sh

#!/bin/sh

set -ex

if [ "$PGUSER" = "postgres" ]; then
    echo "WARNING: pgbouncer will connect with a superuser privileges!"
    echo "You need to fix this as soon as possible."
fi

if [ -z "${CONNECTION_POOLER_CLIENT_TLS_CRT}" ]; then
    openssl req -nodes -new -x509 -subj /CN=spilo.dummy.org \
        -keyout /etc/ssl/certs/pgbouncer.key \
        -out /etc/ssl/certs/pgbouncer.crt
else
    ln -s ${CONNECTION_POOLER_CLIENT_TLS_CRT} /etc/ssl/certs/pgbouncer.crt
    ln -s ${CONNECTION_POOLER_CLIENT_TLS_KEY} /etc/ssl/certs/pgbouncer.key
    if [ ! -z "${CONNECTION_POOLER_CLIENT_CA_FILE}" ]; then
        ln -s ${CONNECTION_POOLER_CLIENT_CA_FILE} /etc/ssl/certs/ca.crt
    fi
fi

envsubst < /etc/pgbouncer/pgbouncer.ini.tmpl > /etc/pgbouncer/pgbouncer.ini
envsubst < /etc/pgbouncer/auth_file.txt.tmpl > /etc/pgbouncer/auth_file.txt

exec /bin/pgbouncer /etc/pgbouncer/pgbouncer.ini

You can verify that any CGR image defaults to just calling pgbouncer --help:

$ docker image inspect cgr.dev/defenseunicorns.com/pgbouncer:latest --format='Entrypoint: {{.Config.Entrypoint}}{{"\n"}}Cmd: {{.Config.Cmd}}'
# Entrypoint: [/usr/bin/pgbouncer]
# Cmd: [--help]

$ docker image inspect cgr.dev/defenseunicorns.com/pgbouncer:latest-dev --format='Entrypoint: {{.Config.Entrypoint}}{{"\n"}}Cmd: {{.Config.Cmd}}'
# Entrypoint: [/usr/bin/pgbouncer]
# Cmd: [--help]

I will move this PR to draft and work with the Zalando team to get the operator itself updated.

@zachariahmiller zachariahmiller Jun 5, 2026

Contributor

Okay. Working with the zalando team and/or chainguard is definitely the correct approach to this problem.

Contributor

If you need something in the meantime, it is probably possible to build an image inside this repo with the changes needed to match the Zalando one (as part of an onCreate action or otherwise) and then use that.

It would be preferable to just get a solution using the upstream providers, but I am providing another alternative if it is time sensitive.

Author

thanks! tbh I prefer to wait for zalando to make it happen, so going to hold off until then


1. reconciles the operator's pooler credential Secret into a derived `pgbouncer-userlist` Secret (the PgBouncer `auth_file`),
2. mutates each pooler Deployment to mount that Secret plus a chart-shipped `pgbouncer-config` ConfigMap at `/etc/pgbouncer` and to set the PgBouncer launch command, and
3. bootstraps pre-existing pooler Deployments on startup so the mutation also applies when the module is installed onto a running cluster.

The static `pgbouncer.ini` is rendered by the `uds-postgres-config` chart. Only the `unicorn` flavor ships the module; `registry1`/`upstream` use self-configuring PgBouncer images and need none of it.
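
The Deployment mutation in step 2 can be sketched on plain objects as shown below. This is a hedged illustration, not the module's actual code: the structural types stand in for real Kubernetes client types, the volume name `pgbouncer-etc` and container name `connection-pooler` are hypothetical, the use of a single projected volume to combine the Secret and the ConfigMap under `/etc/pgbouncer` is an assumption, and the command path assumes the binary lives at `/usr/bin/pgbouncer` and that the pooler is the first container.

```typescript
// Minimal structural types; a real module would use Kubernetes client types.
interface VolumeMount { name: string; mountPath: string; readOnly?: boolean }
interface Container { name: string; command?: string[]; volumeMounts?: VolumeMount[] }
interface ProjectedSource { secret?: { name: string }; configMap?: { name: string } }
interface Volume { name: string; projected?: { sources: ProjectedSource[] } }
interface PodSpec { containers: Container[]; volumes?: Volume[] }

const VOLUME_NAME = "pgbouncer-etc"; // hypothetical volume name

// Returns a copy of the pod spec with the config volume, mount, and command set.
// Re-applying it yields the same result, so repeated operator writes are safe.
function mutatePoolerPodSpec(spec: PodSpec): PodSpec {
  const volume: Volume = {
    name: VOLUME_NAME,
    projected: {
      sources: [
        { secret: { name: "pgbouncer-userlist" } },
        { configMap: { name: "pgbouncer-config" } },
      ],
    },
  };
  const mount: VolumeMount = {
    name: VOLUME_NAME,
    mountPath: "/etc/pgbouncer",
    readOnly: true,
  };

  // Drop any earlier copy of our volume before appending the current one.
  const volumes = [...(spec.volumes ?? []).filter(v => v.name !== VOLUME_NAME), volume];

  // Assumption: the pooler is the first container in the pod.
  const containers = spec.containers.map((c, i) =>
    i !== 0
      ? c
      : {
          ...c,
          command: ["/usr/bin/pgbouncer", "/etc/pgbouncer/pgbouncer.ini"],
          volumeMounts: [
            ...(c.volumeMounts ?? []).filter(m => m.name !== VOLUME_NAME),
            mount,
          ],
        },
  );

  return { ...spec, containers, volumes };
}

const mutatedOnce = mutatePoolerPodSpec({ containers: [{ name: "connection-pooler" }] });
const mutatedTwice = mutatePoolerPodSpec(mutatedOnce);
console.log(JSON.stringify(mutatedTwice, null, 2));
```

Keeping the function idempotent matters here because the admission webhook sees the Deployment again every time the operator writes it.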

## Consequences

The distroless FIPS pooler now starts and proxies correctly. Because the admission mutation is re-applied on every operator write, there is no reconcile-drift window, and `failurePolicy: Ignore` keeps a webhook outage from blocking the operator.

This adds a TypeScript/Node module and a long-lived Pepr controller (admission webhook) to a previously YAML-only package — new build tooling (Node.js, `pepr build`) and an additional component to maintain. The built manifest is committed at `manifests/pepr-module-pgbouncer.yaml`; its shared `pepr-system` Namespace is stripped so package removal does not affect `pepr-uds-core`. The replica pooler is not yet supported (the rendered config targets the primary), and `registry1` would need a similar approach if it ever moves to a distroless pooler image.
5 changes: 5 additions & 0 deletions bundle/uds-bundle.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -51,3 +51,8 @@ packages:
remoteSelector:
cluster-name: pg-cluster
description: "Egress to a non-default pg cluster"
values:
- path: enableConnectionPooler
value: true
- path: enableReplicaConnectionPooler
value: true
38 changes: 38 additions & 0 deletions chart/templates/pgbouncer-config.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,38 @@
# Copyright 2024 Defense Unicorns
# SPDX-License-Identifier: AGPL-3.0-or-later OR LicenseRef-Defense-Unicorns-Commercial

{{- if and .Values.postgresql.enabled (or .Values.postgresql.enableConnectionPooler .Values.postgresql.enableReplicaConnectionPooler) }}
apiVersion: v1
kind: ConfigMap
metadata:
name: pgbouncer-config
namespace: postgres
data:
{{- with .Values.postgresql.configConnectionPooler }}
pgbouncer.ini: |
[databases]
* = host=pg-cluster.postgres.svc.cluster.local port=5432 auth_user=pooler
postgres = host=pg-cluster.postgres.svc.cluster.local port=5432 auth_user=pooler

[pgbouncer]
pool_mode = {{ .connection_pooler_mode | default "transaction" }}
listen_port = {{ .connection_pooler_listen_port | default 5432 }}
listen_addr = *
admin_users = pooler
auth_dbname = postgres
auth_file = /etc/pgbouncer/userlist.txt
auth_query = SELECT * FROM pooler.user_lookup($1)
auth_type = scram-sha-256
server_tls_sslmode = require
log_connections = 0
log_disconnections = 0
max_prepared_statements = 200
default_pool_size = {{ .connection_pooler_default_pool_size | default 20 }}
reserve_pool_size = {{ .connection_pooler_reserve_pool_size | default 10 }}
max_client_conn = {{ .connection_pooler_max_client_conn | default 10000 }}
max_db_connections = {{ .connection_pooler_max_db_connections | default 60 }}
idle_transaction_timeout = 600
server_login_retry = 5
ignore_startup_parameters = extra_float_digits,options
{{- end -}}
{{- end }}
10 changes: 10 additions & 0 deletions chart/templates/postgres-minimal.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -14,6 +14,16 @@ spec:
volume:
size: {{ .Values.postgresql.volume.size | quote }}
numberOfInstances: {{ .Values.postgresql.numberOfInstances }}
{{- if hasKey .Values.postgresql "enableConnectionPooler" }}
enableConnectionPooler: {{ .Values.postgresql.enableConnectionPooler }}
{{- end }}
{{- if hasKey .Values.postgresql "enableReplicaConnectionPooler" }}
enableReplicaConnectionPooler: {{ .Values.postgresql.enableReplicaConnectionPooler }}
{{- end }}
{{- with .Values.postgresql.connectionPooler }}
connectionPooler:
{{- toYaml . | nindent 4 }}
{{- end }}
users:
{{- toYaml .Values.postgresql.users | nindent 4 }} # database owner
databases:
Expand Down
29 changes: 26 additions & 3 deletions chart/templates/uds-package-postgres.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -25,19 +25,19 @@ spec:
- direction: Egress
remoteGenerated: IntraNamespace

{{- if kindIs "slice" .Values.postgresql.ingress -}}
{{- if kindIs "slice" .Values.postgresql.ingress -}}
{{- range .Values.postgresql.ingress }}
- direction: Ingress
selector:
cluster-name: pg-cluster
{{ . | toYaml | nindent 8 }}
{{- end }}
{{- else }}
{{- else }}
- direction: Ingress
selector:
cluster-name: pg-cluster
{{- .Values.postgresql.ingress | toYaml | nindent 8 }}
{{- end }}
{{- end }}

- direction: Ingress
selector:
Expand All @@ -50,4 +50,27 @@ spec:
selector:
cluster-name: pg-cluster
remoteGenerated: KubeAPI

{{- if or (.Values.postgresql.enableConnectionPooler | default false) (.Values.postgresql.enableReplicaConnectionPooler | default false) }}
{{- if kindIs "slice" .Values.postgresql.ingress -}}
{{- range .Values.postgresql.ingress }}
- direction: Ingress
selector:
application: db-connection-pooler
{{ . | toYaml | nindent 8 }}
{{- end }}
{{- else }}
- direction: Ingress
selector:
application: db-connection-pooler
{{- .Values.postgresql.ingress | toYaml | nindent 8 }}
{{- end }}

- direction: Ingress
selector:
application: db-connection-pooler
remoteNamespace: {{ .Release.Namespace }}
remoteSelector:
app.kubernetes.io/name: postgres-operator
{{- end }}
{{- end }}
8 changes: 8 additions & 0 deletions chart/templates/uds-package.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -24,6 +24,14 @@ spec:
selector:
app.kubernetes.io/name: postgres-operator
remoteGenerated: KubeAPI
{{- if or (.Values.postgresql.enableConnectionPooler | default false) (.Values.postgresql.enableReplicaConnectionPooler | default false) }}
- direction: Egress
selector:
app.kubernetes.io/name: postgres-operator
remoteNamespace: postgres
remoteSelector:
application: db-connection-pooler
{{- end }}

# Custom rules for other scenarios (such as connecting to a non-default pg cluster)
{{- range .Values.additionalNetworkAllow }}
Expand Down
24 changes: 24 additions & 0 deletions chart/values.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -17,6 +17,30 @@ postgresql:
additionalVolumes: []
env: []

# Connection pooler options
# Ref: https://opensource.zalando.com/postgres-operator/docs/user.html#connection-pooler
enableConnectionPooler: false
enableReplicaConnectionPooler: false
# Connection pooler configuration. Ref: https://opensource.zalando.com/postgres-operator/docs/reference/operator_parameters.html#connection-pooler-configuration
configConnectionPooler:
# db schema to install lookup function into
connection_pooler_schema: "pooler"
# db user for pooler to use
connection_pooler_user: "pooler"
# docker image
connection_pooler_image: "registry.opensource.zalan.do/acid/pgbouncer:master-32"
# max db connections the pooler should hold
connection_pooler_max_db_connections: 60
# default pooling mode
connection_pooler_mode: "transaction"
# number of pooler instances
connection_pooler_number_of_instances: 2
# default resources
connection_pooler_default_cpu_request: 500m
connection_pooler_default_memory_request: 100Mi
connection_pooler_default_cpu_limit: "1"
connection_pooler_default_memory_limit: 100Mi

# Example values for postgresql
#
# postgresql:
Expand Down
4 changes: 2 additions & 2 deletions common/zarf.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -41,7 +41,7 @@ components:
maxTotalSeconds: 300
cmd: |
if ./zarf tools kubectl get packages.uds.dev postgres -n postgres; then
./zarf tools wait-for packages.uds.dev postgres -n postgres '{.status.phase}'=Ready
./zarf tools wait-for resource packages.uds.dev postgres -n postgres '{.status.phase}'=Ready
fi
- description: Postgres Operator to be Healthy
maxTotalSeconds: 90
Expand All @@ -55,5 +55,5 @@ components:
maxTotalSeconds: 300
cmd: |
if ./zarf tools kubectl get postgresql pg-cluster -n postgres; then
./zarf tools wait-for postgresql pg-cluster -n postgres '{.status.PostgresClusterStatus}'=Running
./zarf tools wait-for resource postgresql pg-cluster -n postgres '{.status.PostgresClusterStatus}'=Running
fi
74 changes: 74 additions & 0 deletions docs/configuration.md
Original file line number Diff line number Diff line change
Expand Up @@ -34,6 +34,80 @@ Postgres Operator is configured through [`acid.zalan.do/v1` `Postgresql` custom
value: <value>
```

## Connection Pooler (Unicorn Flavor)

The Zalando operator can front a cluster with a [PgBouncer connection pooler](https://github.com/zalando/postgres-operator/blob/master/docs/reference/cluster_manifest.md#connection-pooler). On the `registry1` and `upstream` flavors the pooler uses Zalando-derived images that build their own `pgbouncer.ini` at startup, so no extra configuration is required. The `unicorn` flavor instead uses Chainguard's distroless `pgbouncer` image, which has no entrypoint to self-configure and therefore crash-loops on its own. To make it work, the unicorn flavor bundles a Pepr module plus a static `pgbouncer.ini` `ConfigMap` (`pgbouncer-config`) that configure the pooler externally. This behavior applies to the unicorn flavor only.

When `postgresql.poolerConfig.enabled` is `true` (the chart default is `false`; the unicorn flavor's values set it to `true`), the chart renders the `pgbouncer.ini` `ConfigMap` and the bundled Pepr module:
- reconciles the operator-created pooler credential secret into a derived `pgbouncer-userlist` secret (a `userlist.txt` auth file in the form `"pooler" "<password>"`), and
- mutates each pooler `Deployment` (`pg-cluster-pooler`, `pg-cluster-pooler-repl`) to mount `pgbouncer.ini` and the auth file at `/etc/pgbouncer` and set the PgBouncer launch command.
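
As an illustration of the auth file format, the sketch below renders `userlist.txt` lines from user/password pairs. The function names are hypothetical and this is not the module's implementation; it relies only on PgBouncer's documented rule that fields are wrapped in double quotes and that a literal double quote inside a field is escaped by doubling it.

```typescript
// Quote one userlist.txt field; a double quote inside the value is doubled.
function quoteUserlistField(value: string): string {
  return `"${value.replace(/"/g, '""')}"`;
}

// Render one line per entry in the form: "user" "password"
function renderUserlist(entries: Array<{ user: string; password: string }>): string {
  return (
    entries
      .map(e => `${quoteUserlistField(e.user)} ${quoteUserlistField(e.password)}`)
      .join("\n") + "\n"
  );
}

// Placeholder password for illustration only.
const userlistText = renderUserlist([{ user: "pooler", password: 's3"cret' }]);
console.log(userlistText);
```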

Encryption and authentication for the FIPS pooler:
- **App → PgBouncer**: there is no client-side TLS on PgBouncer; in-transit encryption between applications and the pooler is provided by the Istio service mesh (mTLS).
- **PgBouncer → Postgres**: enforced TLS via `server_tls_sslmode = require` (non-SSL connections are rejected by `pg_hba`).
- **Authentication**: `scram-sha-256` with an `auth_query` (`SELECT * FROM pooler.user_lookup($1)`) executed as the `pooler` user.

The pooler is configured statically through `postgresql.poolerConfig.*` chart values (rather than the operator's dynamic sizing):

- `postgresql.poolerConfig.enabled`: whether to render the FIPS pooler `ConfigMap` and enable the Pepr-driven configuration (default `false`; set `true` on the unicorn flavor)
- `postgresql.poolerConfig.listenPort`: the port PgBouncer listens on (default `5432`)
- `postgresql.poolerConfig.poolMode`: the PgBouncer pool mode (default `transaction`)
- `postgresql.poolerConfig.defaultPoolSize`: server connections per user/database pair (default `20`)
- `postgresql.poolerConfig.reservePoolSize`: extra connections allowed when a pool is exhausted (default `10`)
- `postgresql.poolerConfig.maxClientConn`: maximum client connections accepted by PgBouncer (default `10000`)
- `postgresql.poolerConfig.maxDBConnections`: maximum server connections per database (default `60`)
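
To show how these values map onto PgBouncer settings, here is a small sketch that merges overrides with the documented defaults and renders the matching `pgbouncer.ini` lines. The `PoolerConfig` type and `renderPoolerSettings` function are hypothetical; the chart does this with Helm templating, and the sketch only mirrors the defaults listed above.

```typescript
// Hypothetical shape mirroring the postgresql.poolerConfig.* values above.
interface PoolerConfig {
  listenPort?: number;
  poolMode?: string;
  defaultPoolSize?: number;
  reservePoolSize?: number;
  maxClientConn?: number;
  maxDBConnections?: number;
}

// Defaults as documented in the list above.
const POOLER_DEFAULTS: Required<PoolerConfig> = {
  listenPort: 5432,
  poolMode: "transaction",
  defaultPoolSize: 20,
  reservePoolSize: 10,
  maxClientConn: 10000,
  maxDBConnections: 60,
};

// Merge overrides onto the defaults and render the corresponding ini lines.
function renderPoolerSettings(overrides: PoolerConfig = {}): string {
  const c = { ...POOLER_DEFAULTS, ...overrides };
  return [
    `pool_mode = ${c.poolMode}`,
    `listen_port = ${c.listenPort}`,
    `default_pool_size = ${c.defaultPoolSize}`,
    `reserve_pool_size = ${c.reservePoolSize}`,
    `max_client_conn = ${c.maxClientConn}`,
    `max_db_connections = ${c.maxDBConnections}`,
  ].join("\n");
}

const poolerSettings = renderPoolerSettings({ defaultPoolSize: 50 });
console.log(poolerSettings);
```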

> **Limitation — replica pooler:** the rendered `pgbouncer.ini` targets the primary service (`pg-cluster.postgres.svc`). Only the primary pooler (`enableConnectionPooler`) is supported on the FIPS flavor. Do not enable `enableReplicaConnectionPooler` here: the same config would be applied to `pg-cluster-pooler-repl`, routing replica-pooler traffic to the primary. Role-aware (primary/replica) configuration is a follow-up.

### Building and deploying the Pepr module

The Pepr module lives in `src/pepr/` and is built into a Kubernetes manifest at `src/pepr/dist/pepr-module-pgbouncer.yaml`. That `dist/` directory is git-ignored, so the manifest is generated at build time rather than committed.

**It is built and deployed automatically as part of the package — no separate step is required.** The unicorn component in `zarf.yaml` has an `onCreate.before` action that runs `pepr build` whenever the unicorn package is created (`zarf package create --flavor unicorn`, `uds run create-dev-package`, or the release/test CI which call these tasks). The generated manifest is then included as a component `manifests:` entry, and the Pepr controller image (`ghcr.io/defenseunicorns/pepr/private/controller`) is pulled into the package like any other image. Deploying the unicorn package (or a bundle containing it) therefore deploys the module into the `pepr-system` namespace alongside `pepr-uds-core`; there is nothing extra to deploy.

> **Build prerequisite:** the build host (your machine or the CI runner) needs **Node.js 20+** and network access to install dependencies (`npm ci`). This applies only to *creating* the unicorn package, not to *deploying* it in an air-gapped environment — the rendered manifest and the controller image are baked into the package at create time.

For local iteration on the module without creating a full package, run the build directly:

```bash
uds run build-pepr
# equivalent to:
# cd src/pepr && npm ci && npx pepr build --custom-image ghcr.io/defenseunicorns/pepr/private/controller:v1.2.1
```

Unit tests for the module:

```bash
cd src/pepr && npx vitest run
```

> Keep the `--custom-image` tag in the `zarf.yaml` `onCreate` action, the `build-pepr` task, and the component `images:` list in sync (all reference the same Pepr controller image).

### Lifecycle (install and removal)

The Pepr module is bundled inside the postgres-operator package (a `manifests:` entry on the unicorn component), so its lifecycle is coupled to the package:

- **Install:** deploying the unicorn package deploys the module into the existing `pepr-system` namespace (alongside `pepr-uds-core`), as its own Zarf-managed release.
- **Removal:** `uds`/`zarf package remove` of postgres-operator uninstalls the module release, deleting all of its resources — the Deployments/Services/Secrets/RBAC in `pepr-system` **and** the cluster-scoped `pepr-pgbouncer` `ClusterRole`, `ClusterRoleBinding`, and `MutatingWebhookConfiguration`. Nothing dangling is left behind.

The module's manifest deliberately does **not** include the `pepr-system` `Namespace` object (the build strips it via `yq`). That namespace is created and owned by uds-core and shared with `pepr-uds-core`; excluding it ensures removing this package never cascade-deletes the shared namespace (which would tear down uds-core's Pepr). Consequently the package depends on uds-core having already created `pepr-system` at deploy time, which is always the case in a UDS cluster.

Runtime-created resources are cleaned up too: the derived `pgbouncer-userlist` secret carries an `ownerReference` to the operator's pooler credential secret (garbage-collected when the cluster/secret is removed), and the pooler Deployment mutation disappears with the operator-managed pooler Deployment when the cluster is torn down.
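
The garbage-collection link described above can be sketched as follows. This is an assumption-laden illustration rather than the module's code: `SecretLike`, `deriveUserlistSecret`, the source Secret name `example-pooler-credentials`, and the `uid` value are all hypothetical. The `ownerReferences` fields themselves (`apiVersion`, `kind`, `name`, `uid`) are the standard Kubernetes ones.

```typescript
interface OwnerReference { apiVersion: string; kind: string; name: string; uid: string }

// Minimal structural type for a Secret; not the Kubernetes client type.
interface SecretLike {
  apiVersion: "v1";
  kind: "Secret";
  metadata: {
    name: string;
    namespace: string;
    uid?: string;
    ownerReferences?: OwnerReference[];
  };
  stringData?: Record<string, string>;
}

// Build the derived Secret and point its ownerReference at the source Secret,
// so deleting the source lets Kubernetes garbage-collect the derived one.
function deriveUserlistSecret(source: SecretLike, userlist: string): SecretLike {
  if (!source.metadata.uid) {
    throw new Error("source Secret has no uid; it must be read from the cluster");
  }
  return {
    apiVersion: "v1",
    kind: "Secret",
    metadata: {
      name: "pgbouncer-userlist",
      // A namespaced owner and its dependent must live in the same namespace.
      namespace: source.metadata.namespace,
      ownerReferences: [
        {
          apiVersion: source.apiVersion,
          kind: source.kind,
          name: source.metadata.name,
          uid: source.metadata.uid,
        },
      ],
    },
    stringData: { "userlist.txt": userlist },
  };
}

const derivedSecret = deriveUserlistSecret(
  {
    apiVersion: "v1",
    kind: "Secret",
    metadata: {
      name: "example-pooler-credentials", // hypothetical source Secret name
      namespace: "postgres",
      uid: "00000000-0000-0000-0000-000000000001", // placeholder uid
    },
  },
  '"pooler" "<password>"\n',
);
console.log(JSON.stringify(derivedSecret.metadata, null, 2));
```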

### Verifying the pooler (unicorn)

```bash
# pooler pods should be Running (not CrashLoopBackOff / usage-exit)
kubectl -n postgres rollout status deployment/pg-cluster-pooler --timeout=180s
# the pepr module should have created the derived auth secret
kubectl -n postgres get secret pgbouncer-userlist
# the pooler Deployment should carry the injected command + /etc/pgbouncer mount
kubectl -n postgres get deploy pg-cluster-pooler -o jsonpath='{.spec.template.spec.containers[0].command}'; echo
# end-to-end: connect through the pooler service and run a query (use your app/pooler user)
# kubectl -n postgres run pooler-check --rm -it --image=<psql-image> --restart=Never -- \
# psql "host=pg-cluster-pooler.postgres.svc port=5432 dbname=<db> user=<user>" -c 'select 1;'
```

## Secrets Creation

The operator creates credentials secrets in the namespace defined by the `{namespace}.{username}` prefix in `postgresql.users`. See the [Reference Package configuration](https://github.com/uds-packages/reference-package/blob/main/docs/configuration.md#secrets-creation) for an example of how to consume these secrets within an application chart.
Expand Down