From c6cb36c4b9c2fc5bff2ad12ff685739dc623511d Mon Sep 17 00:00:00 2001
From: Vic van Gool <vic@cloud66.com>
Date: Wed, 20 May 2026 07:43:35 +0100
Subject: [PATCH] Add guidance on shared-master vs dedicated-worker topology

Small Maestro K8s clusters start with a single shared master running both
the control plane and application workloads. The HA doc covered multi-
master scaling but never the transition point where a single master is
overloaded and a dedicated worker should be added. Adds a section to
configuring-for-high-availability.mdx with the symptoms to watch for
(slow kubectl, scheduling lag, master-node pod evictions, etcd slow-write
warnings) and a Callout distinguishing this from a full HA topology.

Linear: SUP-993

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---
 .../configuring-for-high-availability.mdx       | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)
diff --git a/build-and-config/configuring-for-high-availability.mdx b/build-and-config/configuring-for-high-availability.mdx
index 0f0abc7..7626910 100644
--- a/build-and-config/configuring-for-high-availability.mdx
+++ b/build-and-config/configuring-for-high-availability.mdx
@@ -11,6 +11,23 @@ In order to achieve high availability for your application, you need multiple re
 Applications running Kubernetes v1.12 or lower do not support the multi-master feature on Cloud 66. If you have deployed an application via Cloud 66 before March 2019, you will need to **redeploy your application "with upgrades" and choose to perform a Kubernetes upgrade** (note that this will incur significant downtime as your cluster will be recreated). All applications deployed after **March 2019** on version v1.13 and above automatically support multi-master clusters.
 </Callout>
 
+## Choosing between a shared master and dedicated workers
+
+A new Maestro Kubernetes cluster starts with a single master node that also runs your application workloads, a "shared master" topology. This is fine for development and small production loads, but it has a hard ceiling: the same node handles both the Kubernetes control plane (the API server, scheduler, controller manager, and etcd) and your app's containers, and the two compete for the same CPU, memory, and disk I/O.
+
+The right time to add a **dedicated worker** (a node that runs *only* application pods, leaving the master to control-plane duties) is when you start seeing any of the following:
+
+- **Slow `kubectl` responses or Dashboard timeline lag** when nothing else is changing. The API server is being starved by application containers.
+- **Scheduling decisions that take noticeably longer than they used to** (new pods sitting in `Pending` for tens of seconds before being placed).
+- **Pod evictions on the master node**: the kubelet evicts application pods because the node itself is under memory or disk pressure. These show up in the cluster's events feed and in the timeline.
+- **etcd warnings in the master's logs** about slow writes (messages containing `took too long`, such as `apply request took too long`). etcd is the most latency-sensitive part of the control plane.
+
+If you're hitting any of those on a single-master cluster, add one dedicated worker before chasing other tuning options; it's usually the fastest path back to a healthy cluster. The procedure is the same as [adding any node](#adding-nodes-to-an-application): pick *Worker* in step 5.
+
+<Callout type="info" title="One worker first, then think about HA">
+Adding the first dedicated worker is a different decision from moving to a full HA topology (three masters plus workers). A cluster with one master and one worker is not HA, because losing the master still takes the cluster down, but it does take the workload pressure off the master and is often enough for small production apps. Move to three masters when you also need the cluster to survive a master node failure.
+</Callout>
+
 ## Adding nodes to an application
 
 To add nodes to an existing application: