Reference · Kubernetes Operator · CNCF
Strimzi — Architecture, Reconciliation, CRD Landscape
A compact reference to the Strimzi Kafka Operator: the three operator processes (Cluster Operator, Topic Operator, User Operator) that watch their respective custom resources, the reconciliation loop that turns a Kafka CR into running broker / controller pods with TLS, and the CRD landscape (Kafka, KafkaNodePool, KafkaTopic, KafkaUser, KafkaConnect, KafkaMirrorMaker2, KafkaBridge, KafkaRebalance) that ties the whole thing together.
1. Architecture
A Strimzi deployment is three operator processes plus the set of CRDs they watch. The Cluster Operator is the central one: it watches every Kafka and KafkaNodePool in the namespaces it is scoped to and, on each change, reconciles the cluster into a set of StrimziPodSet resources (one per node pool, whose role is controller, broker, or mixed) plus Services, ConfigMaps, and TLS Secrets. The Topic Operator watches KafkaTopic CRs and creates, updates, and deletes topics inside the target cluster via Kafka's Admin API. The User Operator does the same for KafkaUser: credentials, ACLs, and quotas. Cruise Control is optional; when it is enabled in the Kafka CR, the Cluster Operator deploys it as a separate Deployment (not a sidecar) and queries it when a KafkaRebalance CR is created.
Figure legend: blue arrows are Kubernetes watches by the operators on their respective CRs; green arrows are resources reconciled into the target namespace. The Cluster Operator runs as its own Deployment. By default it deploys the Topic and User Operators together as containers in a single EntityOperator pod, but both can also run standalone.
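As a rough illustration of what the Cluster Operator watches, the sketch below builds a minimal Kafka CR and two KafkaNodePool CRs as Python dictionaries and prints them as a JSON list, a format `kubectl apply -f` also accepts. The cluster name, namespace, pool names, and sizes are hypothetical, and the field names follow the `kafka.strimzi.io/v1beta2` API as commonly documented; check them against the CRD version installed in your cluster.

```python
import json

CLUSTER = "my-cluster"   # hypothetical cluster name
NAMESPACE = "kafka"      # hypothetical namespace
API_VERSION = "kafka.strimzi.io/v1beta2"  # assumption: adjust to the installed CRD version


def node_pool(name, roles, replicas, size):
    """Build one KafkaNodePool; the strimzi.io/cluster label binds it to the Kafka CR."""
    return {
        "apiVersion": API_VERSION,
        "kind": "KafkaNodePool",
        "metadata": {
            "name": name,
            "namespace": NAMESPACE,
            "labels": {"strimzi.io/cluster": CLUSTER},
        },
        "spec": {
            "replicas": replicas,
            "roles": roles,  # ["controller"], ["broker"], or both for a mixed pool
            "storage": {"type": "persistent-claim", "size": size, "deleteClaim": False},
        },
    }


def kafka_cluster():
    """Build the top-level Kafka CR with two internal listeners and the Entity Operator."""
    return {
        "apiVersion": API_VERSION,
        "kind": "Kafka",
        "metadata": {
            "name": CLUSTER,
            "namespace": NAMESPACE,
            # Older Strimzi releases need these annotations; newer ones default to them.
            "annotations": {
                "strimzi.io/node-pools": "enabled",
                "strimzi.io/kraft": "enabled",
            },
        },
        "spec": {
            "kafka": {
                "listeners": [
                    {"name": "plain", "port": 9092, "type": "internal", "tls": False},
                    {"name": "tls", "port": 9093, "type": "internal", "tls": True},
                ],
            },
            # Empty sections enable the co-located Topic and User Operators.
            "entityOperator": {"topicOperator": {}, "userOperator": {}},
        },
    }


manifests = [
    kafka_cluster(),
    node_pool("controllers", ["controller"], 3, "20Gi"),
    node_pool("brokers", ["broker"], 3, "100Gi"),
]

print(json.dumps({"apiVersion": "v1", "kind": "List", "items": manifests}, indent=2))
```

Piping the output into `kubectl apply -f -` would hand all three resources to the Cluster Operator at once; the operator then derives one StrimziPodSet per pool.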
2. Reconciliation
A typical "create a new cluster" reconciliation, in KRaft mode with the EntityOperator co-located. The Cluster Operator's reconciliation loop runs every STRIMZI_FULL_RECONCILIATION_INTERVAL_MS (default 120000 ms, i.e. 2 minutes) and is also triggered by every CR change.
The reconciliation loop is idempotent and level-triggered: re-running it always converges to the desired state, so a missed event or a transient API failure is recovered on the next tick. The same loop also drives rolling updates. For example, certificate renewal generates new Secrets in steps 3–4 and then rolls pods one by one through steps 5–8, gated on broker readiness probes.
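The level-triggered, idempotent behaviour described above can be sketched with a toy model. This is not Strimzi code: `Cluster`, `reconcile`, and the "certificate generation" counter are hypothetical stand-ins, and readiness gating is deliberately left out. The point is only that each pass compares desired state with observed state, so repeating it is harmless and a lost pod is repaired on the next pass.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    """Toy observed state of one node pool: pod name -> certificate generation."""
    pods: dict = field(default_factory=dict)


def reconcile(desired_replicas, desired_cert_gen, actual, pool="brokers"):
    """Run one level-triggered pass and return the list of actions it took."""
    actions = []
    wanted = [f"{pool}-{i}" for i in range(desired_replicas)]

    # Create pods that should exist but do not.
    for name in wanted:
        if name not in actual.pods:
            actual.pods[name] = desired_cert_gen
            actions.append(f"create {name}")

    # Delete pods that exist but should not.
    for name in sorted(set(actual.pods) - set(wanted)):
        del actual.pods[name]
        actions.append(f"delete {name}")

    # Roll pods still running with an outdated certificate, one at a time.
    for name in wanted:
        if actual.pods[name] != desired_cert_gen:
            actual.pods[name] = desired_cert_gen
            actions.append(f"roll {name}")

    return actions


state = Cluster()
first = reconcile(3, 1, state)    # initial creation
second = reconcile(3, 1, state)   # nothing to do: already converged
del state.pods["brokers-1"]       # simulate a lost pod (a missed event)
third = reconcile(3, 1, state)    # next tick repairs it
fourth = reconcile(3, 2, state)   # certificate renewal rolls every pod

print(first, second, third, fourth, sep="\n")
```

Because the function never depends on which event woke it up, a periodic timer alone is enough to recover from any missed change.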
3. CRD Landscape
- Kafka: top-level cluster CR. Declares the listeners (plain, TLS, and external via load balancer / node port / ingress / route), authentication / authorization, EntityOperator, CruiseControl, JMX exporter, and the (legacy) ZooKeeper or KRaft topology. In modern Strimzi, the node topology lives in the KafkaNodePool resources that point at this CR through the strimzi.io/cluster label.
- KafkaNodePool: one per role group. Holds replicas, storage class / size, JVM options, resources, tolerations / affinity, and the roles (controller, broker, or both for a mixed pool). The Cluster Operator turns each pool into one StrimziPodSet, a StatefulSet-like resource that owns the pool's pods.
- KafkaTopic: managed by the Topic Operator. Declares partitions, replicas, and topic configs; the status reports the state of the corresponding Kafka topic. CR changes are applied to the broker through the Admin API and the outcome is written back to the CR status. Recent Strimzi releases ship a unidirectional Topic Operator that treats the CR as the source of truth, so changes made directly on the broker are not copied back into the CR spec.
- KafkaUser: managed by the User Operator. Declares the authentication type (TLS, SCRAM-SHA-512) and ACLs / quotas. The operator creates a Secret with the user's credentials and registers ACLs via AdminClient.
- KafkaConnect / KafkaConnector: a Kafka Connect cluster with a REST API. Connector instances can be created either declaratively via KafkaConnector CRs or through the Connect REST API directly.
- KafkaMirrorMaker2: a managed MirrorMaker 2.0 deployment for cross-cluster replication. Source / target cluster aliases, replication policies, and connector configs go here.
- KafkaBridge: HTTP REST front-end to a Kafka cluster (producer / consumer over HTTP). Useful when a client cannot speak the Kafka wire protocol directly.
- KafkaRebalance: asks Cruise Control to compute and (optionally) execute a rebalance proposal. The CR's status reports the proposal and its load metrics; the user approves it for execution by annotating the CR with strimzi.io/rebalance=approve.
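
To make the per-CRD descriptions above more concrete, here is a minimal sketch that builds a KafkaTopic, a KafkaUser, and a KafkaRebalance as Python dictionaries, plus the annotation patch used to approve a rebalance proposal. All resource names (`my-cluster`, `orders`, `orders-reader`, `full-rebalance`) are hypothetical. Field names follow the `kafka.strimzi.io/v1beta2` API as commonly documented; in particular the plural `operations` key in ACL rules is an assumption that may not hold on older releases.

```python
import json

CLUSTER = "my-cluster"  # hypothetical cluster name


def strimzi_cr(kind, name, spec):
    """Wrap a spec in the metadata every cluster-scoped Strimzi CR shares."""
    return {
        "apiVersion": "kafka.strimzi.io/v1beta2",  # assumption: check the installed CRDs
        "kind": kind,
        "metadata": {"name": name, "labels": {"strimzi.io/cluster": CLUSTER}},
        "spec": spec,
    }


# Topic Operator input: partitions, replicas, and topic-level configs.
topic = strimzi_cr("KafkaTopic", "orders", {
    "partitions": 6,
    "replicas": 3,
    "config": {"retention.ms": 7 * 24 * 60 * 60 * 1000},  # 7 days in milliseconds
})

# User Operator input: SCRAM credentials plus a read-only ACL on the topic.
user = strimzi_cr("KafkaUser", "orders-reader", {
    "authentication": {"type": "scram-sha-512"},
    "authorization": {
        "type": "simple",
        "acls": [{
            "resource": {"type": "topic", "name": "orders", "patternType": "literal"},
            "operations": ["Read", "Describe"],
        }],
    },
})

# An empty spec asks Cruise Control for a proposal using its default goals.
rebalance = strimzi_cr("KafkaRebalance", "full-rebalance", {})

# Merge patch that approves the proposal once the status shows it is ready.
approve_patch = {"metadata": {"annotations": {"strimzi.io/rebalance": "approve"}}}

for doc in (topic, user, rebalance, approve_patch):
    print(json.dumps(doc, indent=2))
```

The approval step is kept as a separate patch on purpose: the proposal is reviewed in the CR status first, and nothing moves until the annotation is applied.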
4. Strimzi vs Apache Kafka
Strimzi does not replace Apache Kafka — it runs Apache Kafka. The two sit on different layers: Apache Kafka is the data plane (broker / controller processes that own the log and serve produce / fetch), and Strimzi is the Kubernetes-native control plane that creates those processes, manages their TLS material, advances their version, and reacts to declarative CRs. A Strimzi-managed cluster is the exact same Kafka binary you would run by hand — wrapped in operator-driven lifecycle.
| | Apache Kafka (bare) | Strimzi on Kubernetes |
|---|---|---|
| Layer / role | Data plane — the broker / KRaft controller processes themselves. | Control plane — Kubernetes operator that reconciles those processes from CRs. Runs the same broker binary inside the pods it creates. |
| Deployment unit | JVM process on bare metal / VM / container. Started with kafka-server-start.sh. | StrimziPodSet → Pods with PVCs + Services + ConfigMaps + Secrets, all materialised by the Cluster Operator. |
| Configuration model | server.properties / shell scripts. Imperative; usually wrapped in Ansible / Chef / Helm. | Declarative CRs (Kafka, KafkaNodePool, KafkaTopic, KafkaUser, ...). CR change → reconcile. |
| Scaling brokers | Edit configs, start a new broker process, run kafka-reassign-partitions.sh by hand. | Change KafkaNodePool.spec.replicas. Operator creates the pod / PVC / Service; KafkaRebalance + Cruise Control handles partition movement. |
| TLS / certificates | Operator team generates keystores / truststores out-of-band and distributes them. | ClusterCa + ClientsCa objects auto-generate, rotate, and mount per-broker secrets. Renewal is rolling and automatic. |
| Users / ACLs | kafka-acls.sh or AdminClient calls, managed manually. | KafkaUser CR. User Operator creates the credential Secret and registers ACLs via AdminClient. |
| Topics | kafka-topics.sh or AdminClient, managed manually. | KafkaTopic CR. Topic Operator keeps the actual topic in sync with the CR and reports the result in the CR status. |
| Metrics / observability | JMX + (optional) JMX Prometheus exporter sidecar. ServiceMonitor wired by hand. | JmxPrometheusExporter / StrimziMetricsReporter / KafkaExporter wired by the operator. PodMonitor / Grafana dashboards shipped. |
| Rolling upgrades | Manual broker-by-broker restart. Operator team tracks inter-broker protocol bump steps. | KafkaRoller orchestrates the rolling restart with quorum awareness; Kafka.spec.kafka.version bump steps through inter-broker protocol upgrade automatically. |
| Rebalancing | Cruise Control is a separate process you start and call yourself. | Cruise Control is enabled by adding a cruiseControl section to the Kafka CR; a rebalance is a KafkaRebalance CR whose status surfaces the proposal. |
| Mirroring / bridges / Connect | Each is its own daemon / Connect cluster you bring up and configure. | KafkaMirrorMaker2, KafkaBridge, KafkaConnect CRs each create their own managed Deployment. |
| Failure recovery | Whoever is on-call restarts brokers, restores PVs, re-issues certs. | Operator reconciliation loop is level-triggered: lost pod → recreated, lost PVC → reattached, expired cert → rolled. The on-call read of Kafka.status usually replaces the playbook. |
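
The "Scaling brokers" row in the table above can be sketched as two small helpers: one that renders the `kubectl patch` command for a KafkaNodePool replica change, and one that builds a KafkaRebalance asking Cruise Control to move partitions onto the new brokers. The pool name, namespace, and broker IDs are hypothetical, and the `mode: add-brokers` / `brokers` fields are used as I understand the KafkaRebalance schema; verify both against your Strimzi version and look up the real node IDs in the Kafka or KafkaNodePool status.

```python
import json
import shlex


def scale_patch(replicas):
    """JSON merge patch that changes only KafkaNodePool.spec.replicas."""
    return {"spec": {"replicas": replicas}}


def kubectl_patch_cmd(pool, replicas, namespace="kafka"):
    """Render the equivalent kubectl command line (string only, nothing is executed)."""
    patch = json.dumps(scale_patch(replicas), separators=(",", ":"))
    return " ".join([
        "kubectl", "patch", "kafkanodepool", pool,
        "-n", namespace, "--type", "merge", "-p", shlex.quote(patch),
    ])


def add_brokers_rebalance(cluster, new_broker_ids):
    """KafkaRebalance that asks Cruise Control to move load onto the new brokers."""
    return {
        "apiVersion": "kafka.strimzi.io/v1beta2",  # assumption: check the installed CRDs
        "kind": "KafkaRebalance",
        "metadata": {
            "name": f"{cluster}-add-brokers",  # hypothetical naming scheme
            "labels": {"strimzi.io/cluster": cluster},
        },
        "spec": {"mode": "add-brokers", "brokers": list(new_broker_ids)},
    }


cmd = kubectl_patch_cmd("brokers", 5)
rebalance = add_brokers_rebalance("my-cluster", [6, 7])  # hypothetical new node IDs

print(cmd)
print(json.dumps(rebalance, indent=2))
```

The two steps are separate because adding pods does not move any existing partitions: the new brokers stay idle until a rebalance proposal is approved and executed.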
The short version: if you already run Apache Kafka well by hand on Kubernetes, Strimzi is what the operator pattern would have you build anyway. If you do not run Kubernetes at all, Strimzi is the wrong tool — the value is the controller loop, not the broker binary.