
Strimzi — Architecture, Reconciliation, CRD Landscape

A compact reference to the Strimzi Kafka Operator: the three operator processes (Cluster Operator, Topic Operator, User Operator) that watch their respective custom resources, the reconciliation loop that turns a Kafka CR into running broker / controller pods with TLS, and the CRD landscape (Kafka, KafkaNodePool, KafkaTopic, KafkaUser, KafkaConnect, KafkaMirrorMaker2, KafkaBridge, KafkaRebalance) that ties the whole thing together.

1. Architecture

A Strimzi deployment is three operator processes plus the set of CRDs they watch. The Cluster Operator is the central one: it watches every Kafka and KafkaNodePool in the namespaces it is scoped to and, on each change, reconciles the cluster into StrimziPodSet resources (one per KafkaNodePool, whether that pool holds controllers, brokers, or mixed-role nodes) plus Services, ConfigMaps, and TLS Secrets. The Topic Operator watches KafkaTopic CRs and creates, updates, and deletes topics inside the target cluster through Kafka's Admin API. The User Operator does the same for KafkaUser: credentials, ACLs, and quotas. Cruise Control is optional; when it is enabled in the Kafka CR, the Cluster Operator runs it as a separate Deployment next to the brokers and queries it whenever a KafkaRebalance CR is created.

Custom Resources (etcd / kube-apiserver)
  • Kafka: cluster spec, listeners, TLS
  • KafkaNodePool ×N: roles, replicas, storage
  • KafkaTopic ×N
  • KafkaUser ×N
  • KafkaConnect / Connector
  • KafkaMirrorMaker2
  • KafkaBridge
  • KafkaRebalance

Operators
  • Cluster Operator: watches Kafka, KafkaNodePool, KafkaConnect, MM2, Bridge, Rebalance; ClusterCa · ClientsCa management
  • Topic Operator: watches KafkaTopic → AdminClient.createTopics / alter
  • User Operator: watches KafkaUser → AdminClient ACLs · Secret with creds

Reconciliation outputs (in target namespace)
  • StrimziPodSet (controllers / brokers): one per KafkaNodePool, KRaft
  • Pods · PVCs · Services: bootstrap, brokers, external listeners
  • ConfigMaps (per node): server.properties / log4j2.properties
  • Secrets (TLS, SCRAM, JMX): cluster-ca, clients-ca, per-node certs
  • NetworkPolicy · PodDisruptionBudget
  • Routes / Ingress / LoadBalancer (per listener)
  • Cruise Control deployment: optional · driven by KafkaRebalance
  • KafkaConnect / MM2 / Bridge pods: one Deployment per matching CR

Blue arrows: Kubernetes watches that the operators hold on their respective CRs. Green arrows: resources reconciled into the target namespace. The Cluster Operator runs as its own Deployment. The Topic and User Operators normally run as containers in a single Entity Operator pod, which the Cluster Operator deploys when the Kafka CR configures spec.entityOperator, but either can also run stand-alone.
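
As a rough illustration of the split of responsibilities described above, the watch relationships can be written down as a lookup table. This is only a sketch of the figure's content, not Strimzi code: `WATCHES` and `operator_for` are hypothetical names, and the kinds are limited to those the figure attributes to each operator.

```python
# Which operator reacts to which custom resource kind, as listed in the figure.
# Hypothetical helper for illustration only.
WATCHES = {
    "Cluster Operator": {
        "Kafka", "KafkaNodePool", "KafkaConnect",
        "KafkaMirrorMaker2", "KafkaBridge", "KafkaRebalance",
    },
    "Topic Operator": {"KafkaTopic"},
    "User Operator": {"KafkaUser"},
}


def operator_for(kind: str) -> str:
    """Return the operator that reconciles the given CR kind."""
    for operator, kinds in WATCHES.items():
        if kind in kinds:
            return operator
    raise KeyError(f"no operator in this sketch watches {kind!r}")


if __name__ == "__main__":
    for kind in ("Kafka", "KafkaTopic", "KafkaUser", "KafkaRebalance"):
        print(f"{kind:16} -> {operator_for(kind)}")
```

When a CR sits unreconciled, a table like this helps decide which operator's logs to read first.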

2. Reconciliation

The walkthrough below is a typical "create a new cluster" reconciliation in KRaft mode with the Entity Operator co-located. The Cluster Operator's reconciliation loop runs every STRIMZI_FULL_RECONCILIATION_INTERVAL_MS (default 120000 ms, i.e. 2 minutes) and is also triggered by every CR change.

Participants: User (kubectl apply) · kube-apiserver + etcd · Cluster Operator (watch + reconcile) · kubelet + Kafka pods · Topic / User Operator (via Admin API)

  1. apply Kafka + KafkaNodePool CRs
  2. Watch event: ADDED Kafka / KNP
  3. generate ClusterCa + ClientsCa
  4. create cluster-ca-secret · clients-ca-secret · per-node TLS
  5. create StrimziPodSet (controllers, brokers) · ConfigMap · Services
  6. Pod schedule → kubelet starts kafka container
  7. KRaft quorum forms · brokers register
  8. pods Ready · listeners reachable
  9. update Kafka.status (Ready) + listenerStatus[]
  10. Watch events: KafkaTopic / KafkaUser CRs
  11. AdminClient.createTopics / createAcls / alterUserScramCredentials (SCRAM users)
  12. AdminResponse OK · topic / user / ACLs in place
  13. update KafkaTopic.status / KafkaUser.status (Ready)
  14. kubectl get kafka shows READY=True

The reconciliation loop is idempotent and level-triggered: every run compares the full desired state with the full observed state, so re-running it always converges and a missed event or transient API failure is recovered on the next tick. The same loop also drives rolling updates. Certificate renewal, for example, generates new Secrets in steps 3–4 and then rolls pods one at a time through steps 5–8, gated on broker readiness probes.
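
The level-triggered property is easier to see in a toy model. The sketch below is not the Cluster Operator's algorithm; it is a minimal illustration, with hypothetical pod names, of a reconcile function whose decisions depend only on desired and observed state. The periodic timer and the watch events would both call the same function.

```python
def reconcile(desired: set[str], observed: set[str]) -> list[str]:
    """Drive `observed` towards `desired` and return the actions taken.

    Level-triggered: the decision uses only the two full states, never the
    event that woke the loop up, so running it again is always safe.
    """
    actions = []
    for name in sorted(desired - observed):
        observed.add(name)
        actions.append(f"create {name}")
    for name in sorted(observed - desired):
        observed.discard(name)
        actions.append(f"delete {name}")
    return actions


if __name__ == "__main__":
    desired = {"broker-0", "broker-1", "broker-2"}  # derived from the CRs
    observed = {"broker-0"}                          # read from the cluster

    print(reconcile(desired, observed))  # first run creates the missing pods
    print(reconcile(desired, observed))  # second run has nothing left to do

    observed.discard("broker-1")         # a pod disappears and the event is missed
    print(reconcile(desired, observed))  # the next periodic tick repairs it
```

Because the function never asks what changed, a dropped watch event costs at most one reconciliation interval.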

3. CRD Landscape

  • Kafka: top-level cluster CR. Declares the listeners (internal plain / TLS, external loadbalancer / nodeport / ingress / route), authentication / authorization, Entity Operator, Cruise Control, JMX exporter, and the (legacy) ZooKeeper or KRaft topology. In modern Strimzi the node topology lives in KafkaNodePool resources, which point back at this CR through the strimzi.io/cluster label.
  • KafkaNodePool: one per group of identically configured nodes. Holds replicas, storage class / size, JVM options, resources, tolerations / affinity, and the roles list (controller, broker, or both for mixed nodes). The Cluster Operator turns each pool into one StrimziPodSet, Strimzi's replacement for a StatefulSet, which owns that pool's pods.
  • KafkaTopic: managed by the Topic Operator. Declares partitions, replicas, and topic configs; status reports whether the topic in Kafka matches. In KRaft-based releases the Topic Operator is unidirectional: the CR is the source of truth, CR changes are pushed to Kafka through the Admin API, and changes made directly on the brokers are reverted or flagged in the CR status at the next reconciliation rather than copied back into the spec.
  • KafkaUser: managed by the User Operator. Declares the authentication type (tls, scram-sha-512), ACLs, and quotas. The operator creates a Secret with the user's credentials and registers ACLs via AdminClient.
  • KafkaConnect / KafkaConnector: a managed Kafka Connect cluster exposing the Connect REST API. Connectors can be created either declaratively via KafkaConnector CRs or through the REST API directly; once KafkaConnector CRs are enabled for a Connect cluster (strimzi.io/use-connector-resources annotation), the operator treats the CRs as the source of truth and overwrites changes made through REST.
  • KafkaMirrorMaker2: a managed MirrorMaker 2 deployment for cross-cluster replication. Source / target cluster aliases, replication policies, and connector configs go here.
  • KafkaBridge: HTTP REST front-end to a Kafka cluster (produce / consume over HTTP). Useful when a client cannot speak the Kafka wire protocol directly.
  • KafkaRebalance: asks Cruise Control to compute and (optionally) execute a rebalance proposal. The CR's status reports the proposal and its load metrics; the user approves execution by setting the strimzi.io/rebalance annotation to approve.
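
To make the relationships between these CRs concrete, the sketch below assembles a minimal KRaft cluster (one Kafka, two KafkaNodePools, one KafkaTopic, one KafkaUser) as plain dictionaries and prints them as a JSON List that could be piped to `kubectl apply -f -`. Treat it as an illustration under assumptions, not a recommended configuration. The cluster, pool, topic, and user names, the storage sizes, and the replica counts are invented; the `kafka.strimzi.io/v1beta2` API version and the two opt-in annotations depend on the Strimzi release in use.

```python
import json

API_VERSION = "kafka.strimzi.io/v1beta2"  # assumption: adjust to your Strimzi release
CLUSTER = "my-cluster"                     # hypothetical cluster name


def resource(kind, name, spec, cluster_label=True, annotations=None):
    """Build one Strimzi custom resource as a plain dict."""
    metadata = {"name": name}
    if cluster_label:
        # Child resources find their Kafka cluster through this label.
        metadata["labels"] = {"strimzi.io/cluster": CLUSTER}
    if annotations:
        metadata["annotations"] = annotations
    return {"apiVersion": API_VERSION, "kind": kind, "metadata": metadata, "spec": spec}


def node_pool(name, roles, replicas, size):
    """One KafkaNodePool with a single persistent volume per node."""
    volume = {"id": 0, "type": "persistent-claim", "size": size, "deleteClaim": False}
    spec = {"replicas": replicas, "roles": roles,
            "storage": {"type": "jbod", "volumes": [volume]}}
    return resource("KafkaNodePool", name, spec)


kafka = resource(
    "Kafka", CLUSTER,
    {
        "kafka": {
            "listeners": [{"name": "tls", "port": 9093, "type": "internal",
                           "tls": True, "authentication": {"type": "tls"}}],
            "authorization": {"type": "simple"},  # needed for KafkaUser ACLs
        },
        "entityOperator": {"topicOperator": {}, "userOperator": {}},
    },
    cluster_label=False,
    # Some releases need these opt-ins for node pools and KRaft; check yours.
    annotations={"strimzi.io/node-pools": "enabled", "strimzi.io/kraft": "enabled"},
)

topic = resource("KafkaTopic", "orders",
                 {"partitions": 6, "replicas": 3,
                  "config": {"retention.ms": 604800000}})

user = resource("KafkaUser", "orders-reader", {
    "authentication": {"type": "tls"},
    "authorization": {"type": "simple", "acls": [{
        "resource": {"type": "topic", "name": "orders", "patternType": "literal"},
        "operations": ["Read", "Describe"],
        "host": "*",
    }]},
})

MANIFESTS = [
    kafka,
    node_pool("controllers", ["controller"], 3, "20Gi"),
    node_pool("brokers", ["broker"], 3, "100Gi"),
    topic,
    user,
]

if __name__ == "__main__":
    print(json.dumps({"apiVersion": "v1", "kind": "List", "items": MANIFESTS}, indent=2))
```

Every resource except the Kafka CR itself carries the `strimzi.io/cluster` label, which is how the operators associate pools, topics, and users with a cluster.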

4. Strimzi vs Apache Kafka

Strimzi does not replace Apache Kafka — it runs Apache Kafka. The two sit on different layers: Apache Kafka is the data plane (broker / controller processes that own the log and serve produce / fetch), and Strimzi is the Kubernetes-native control plane that creates those processes, manages their TLS material, advances their version, and reacts to declarative CRs. A Strimzi-managed cluster is the exact same Kafka binary you would run by hand — wrapped in operator-driven lifecycle.

| Aspect | Apache Kafka (bare) | Strimzi on Kubernetes |
| --- | --- | --- |
| Layer / role | Data plane: the broker / KRaft controller processes themselves. | Control plane: Kubernetes operator that reconciles those processes from CRs. Runs the same broker binary inside the pods it creates. |
| Deployment unit | JVM process on bare metal / VM / container. Started with kafka-server-start.sh. | StrimziPodSet → Pods with PVCs + Services + ConfigMaps + Secrets, all materialised by the Cluster Operator. |
| Configuration model | server.properties / shell scripts. Imperative; usually wrapped in Ansible / Chef / Helm. | Declarative CRs (Kafka, KafkaNodePool, KafkaTopic, KafkaUser, ...). CR change → reconcile. |
| Scaling brokers | Edit configs, start a new broker process, run kafka-reassign-partitions.sh by hand. | Change KafkaNodePool.spec.replicas. Operator creates the pod / PVC / Service; KafkaRebalance + Cruise Control handles partition movement. |
| TLS / certificates | Operations team generates keystores / truststores out-of-band and distributes them. | ClusterCa + ClientsCa objects auto-generate, rotate, and mount per-broker secrets. Renewal is rolling and automatic. |
| Users / ACLs | kafka-acls.sh or AdminClient calls, managed manually. | KafkaUser CR. User Operator creates the credential Secret and registers ACLs via AdminClient. |
| Topics | kafka-topics.sh or AdminClient, managed manually. | KafkaTopic CR. Topic Operator applies the CR to the cluster and reports the result in the CR status. |
| Metrics / observability | JMX + (optional) JMX Prometheus exporter sidecar. ServiceMonitor wired by hand. | JmxPrometheusExporter / StrimziMetricsReporter / KafkaExporter wired by the operator. PodMonitor / Grafana dashboards shipped. |
| Rolling upgrades | Manual broker-by-broker restart. Operations team tracks inter-broker protocol bump steps. | KafkaRoller orchestrates the rolling restart with quorum awareness; a Kafka.spec.kafka.version bump steps through the inter-broker protocol upgrade automatically. |
| Rebalancing | Cruise Control is a separate process you start and call yourself. | Cruise Control is one stanza (spec.cruiseControl) in the Kafka CR; rebalance is a KafkaRebalance CR whose status surfaces the proposal. |
| Mirroring / bridges / Connect | Each is its own daemon / Connect cluster you bring up and configure. | KafkaMirrorMaker2, KafkaBridge, KafkaConnect CRs each create their own managed Deployment. |
| Failure recovery | Whoever is on-call restarts brokers, restores PVs, re-issues certs. | Operator reconciliation loop is level-triggered: lost pod → recreated, lost PVC → reattached, expired cert → rolled. On-call usually starts from Kafka.status rather than from a playbook. |
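
The "Scaling brokers" and "Rebalancing" rows can be sketched as a small plan: a merge patch for the pool's replica count, a KafkaRebalance CR in add-brokers mode, and the approval command. This is a hedged illustration: `scale_out_plan`, the resource names, and the new broker IDs are assumptions. Which IDs the new brokers actually receive depends on the IDs already in use, so read them from the pool's status before filling them in.

```python
import json


def scale_out_plan(cluster, pool, new_replicas, new_broker_ids):
    """Return (merge patch for the pool, KafkaRebalance manifest, approve command)."""
    patch = {"spec": {"replicas": new_replicas}}
    rebalance = {
        "apiVersion": "kafka.strimzi.io/v1beta2",  # assumption: match your release
        "kind": "KafkaRebalance",
        "metadata": {
            "name": f"{pool}-scale-out",           # hypothetical name
            "labels": {"strimzi.io/cluster": cluster},
        },
        # add-brokers mode moves replicas onto the listed new brokers.
        "spec": {"mode": "add-brokers", "brokers": list(new_broker_ids)},
    }
    approve = (
        f"kubectl annotate kafkarebalance {rebalance['metadata']['name']} "
        "strimzi.io/rebalance=approve"
    )
    return patch, rebalance, approve


if __name__ == "__main__":
    # Assumed starting point: pool "brokers" with IDs 0-2, growing to 5 nodes.
    patch, rebalance, approve = scale_out_plan("my-cluster", "brokers", 5, [3, 4])
    print(f"kubectl patch kafkanodepool brokers --type merge -p '{json.dumps(patch)}'")
    print(json.dumps(rebalance, indent=2))
    print(approve)  # run once the CR status shows a proposal is ready
```

The order matters: patch the pool, wait for the new pods to become Ready, create the KafkaRebalance, inspect the proposal in its status, and only then approve.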

The short version: if you already run Apache Kafka well by hand on Kubernetes, Strimzi is what the operator pattern would have you build anyway. If you do not run Kubernetes at all, Strimzi is the wrong tool — the value is the controller loop, not the broker binary.
