Understanding Controllers and Operators in Kubernetes

Controllers are a core concept of Kubernetes, and the Operator Pattern has become more and more popular as well. I was confused at first about what properties controllers and operators share and how they fit together, so I wrote this summary. I assume some knowledge of basic Kubernetes concepts.

2022-01-16

#kubernetes

Before talking about controllers and operators as actors in the Kubernetes ecosystem, I want to focus on their foundation: State. State is itself an important concept in Kubernetes.

States in Kubernetes

One of Kubernetes’ core concepts is the use of declarative state. This is famously explained as:

You just describe what the cluster should look like, and Kubernetes drives the cluster towards that state.

On a very basic signal theory level, there is a fascinating difference between edge-triggered systems and level-triggered systems.¹ To summarize very superficially: An edge-triggered system reacts to the moment a state changes, while a level-triggered system reacts to the state itself and does not rely on observing the change at a certain point in time. This is a very convenient property for distributed systems, where “observing an event at a certain point in time, then reacting instantly to it” is impeded by network delays, lost messages and components that restart.

So, imagine you have zero Pods running your application. You want to have three. In an edge-triggered system, you would yell at the API server: “Add three Pods!” You might not get a response right away, so you try again: “Add three Pods!” Can you guess what happens? Right, you end up with six Pods because both requests eventually made it to the server. In a level-triggered system, you yell “I want there to be three Pods!” instead. No matter how often you express that wish to the server, there is never any uncertainty about how many Pods you want. (This property is called idempotence.)
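
To make the difference tangible, here is a minimal Python sketch. The two server classes and their method names are made up for illustration and are not part of any Kubernetes API. Both servers receive the same request twice, as in the retry scenario above.

```python
class EdgeTriggeredServer:
    """Treats every request as a change relative to the current count."""

    def __init__(self):
        self.pods = 0

    def add_pods(self, count):
        self.pods += count


class LevelTriggeredServer:
    """Treats every request as the wished-for level, so repeats are harmless."""

    def __init__(self):
        self.pods = 0

    def want_pods(self, count):
        self.pods = count


edge = EdgeTriggeredServer()
level = LevelTriggeredServer()

# The client gets no response and retries, but both requests arrive.
for _ in range(2):
    edge.add_pods(3)
    level.want_pods(3)

print(edge.pods)   # 6
print(level.pods)  # 3
```

The only difference is `+=` versus `=`, and that is the whole point: a request that states the target can be repeated safely, a request that states a change cannot.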

What I described in the previous paragraph is the desired state. It’s what you wish for by telling the API server about it. On the other side, there is the actual state that the cluster is in. In a perfect world, the desired and the actual state would be indistinguishable twins, always in sync. But just like real twins, the two can grow apart. If that happens, we need someone to bring them back together. You could say: we need to reconcile them.

Controllers: Reconciling States

Another core concept of Kubernetes is the control loop. A controller is an entity that observes a certain type of resource in your cluster. To be more precise, it observes two things: the actual state of the resources it is responsible for, and the state that is desired for these resources. If the two match, the controller does nothing. If the two diverge, the controller takes action by reconciling the two states. Specifically, it manipulates the actual state in such a way that it matches the desired state again. This observing-and-taking-action-as-necessary approach is the control loop.
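
A single pass of such a loop can be sketched in a few lines of Python. This is a toy model, not how any real controller is implemented: the `Cluster` class, its fields and the `reconcile` function are names I made up, and "starting a Pod" is just appending a string to a list.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    """Toy stand-in for a cluster: a desired replica count plus running Pods."""
    desired_replicas: int = 0
    running_pods: list = field(default_factory=list)
    _next_id: int = 0

    def start_pod(self):
        self.running_pods.append(f"pod-{self._next_id}")
        self._next_id += 1

    def stop_pod(self):
        self.running_pods.pop()


def reconcile(cluster):
    """Run one pass of the loop. Returns True if the actual state was changed."""
    diff = cluster.desired_replicas - len(cluster.running_pods)
    if diff == 0:
        return False            # states match, nothing to do
    for _ in range(diff):       # too few Pods: start the missing ones
        cluster.start_pod()
    for _ in range(-diff):      # too many Pods: stop the surplus
        cluster.stop_pod()
    return True


cluster = Cluster(desired_replicas=3)
print(reconcile(cluster), cluster.running_pods)  # True ['pod-0', 'pod-1', 'pod-2']
cluster.stop_pod()                               # a Pod disappears
print(reconcile(cluster), cluster.running_pods)  # True ['pod-0', 'pod-1', 'pod-3']
print(reconcile(cluster))                        # False
```

Note that `reconcile` never asks what happened. It only compares the two states, so it does not matter whether a Pod crashed, was deleted by hand, or never started in the first place.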

Kubernetes ships with a lot of built-in controllers that observe and change the state for all the resource types Kubernetes comes with: Pods, Namespaces, ReplicaSets, Endpoints, ServiceAccounts and many more. All of these controllers are run by the kube-controller-manager, which is part of the Kubernetes control plane.²

Controllers in Kubernetes usually rely on the client-go library, which provides some tools for anyone who wants to develop a custom controller. Because of the inherent complexity, dealing directly with these tools (such as Reflectors, Informers and Indexers) can be daunting. Fortunately, there are frameworks that let us work on a higher abstraction level.
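
To give a rough idea of what these tools do, here is a heavily simplified Python sketch. It is not the client-go API: `ToyInformer`, `on_event` and `process` are hypothetical names, and I fold several roles into one class. The idea it tries to capture is that changes are written into a local cache, and only the key of the changed object is put on a work queue.

```python
from collections import deque


class ToyInformer:
    """Keeps a local cache of objects and queues the keys that changed."""

    def __init__(self):
        self.cache = {}        # local store of the latest known objects
        self.queue = deque()   # work queue holding object keys only

    def on_event(self, key, obj):
        """Record a change; obj is None when the object was deleted."""
        if obj is None:
            self.cache.pop(key, None)
        else:
            self.cache[key] = obj
        if key not in self.queue:   # collapse repeated events for one key
            self.queue.append(key)


def process(informer, handle):
    """Drain the queue, handing each key and its latest cached state to handle."""
    while informer.queue:
        key = informer.queue.popleft()
        handle(key, informer.cache.get(key))


informer = ToyInformer()
informer.on_event("default/web", {"replicas": 1})
informer.on_event("default/web", {"replicas": 3})
informer.on_event("default/db", {"replicas": 1})

seen = []
process(informer, lambda key, obj: seen.append((key, obj)))
print(seen)
# [('default/web', {'replicas': 3}), ('default/db', {'replicas': 1})]
```

Two events for `default/web` result in one unit of work that sees only the latest state. That is the level-triggered idea from above again: the worker cares about what the state is now, not about every step that led there.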

Using Kubebuilder to Develop a Custom Controller

Kubebuilder is one of these frameworks. It lets us build a custom controller together with its Custom Resource Definitions (CRDs), so we can create a resource type that our custom controller is responsible for.

Have a look at the Kubebuilder Quick Start Docs. I won’t repeat everything from the Quick Start here: they might update it in the future, and then my version would be outdated and might confuse people. So follow their instructions, everything is explained very well over there!

From Custom Controllers to Operators

The actual purpose of this post was to introduce controllers and operators. So far, we’ve only talked about controllers. But the step towards operators is not that hard. The term was coined when it became feasible to move applications that had previously been operated elsewhere (e.g. on regular servers) to Kubernetes. “Applications” is meant in the broadest sense here: we’re talking databases, stream processors, monitoring systems and so on. To profit from the advantages of Kubernetes, these services are ported to it. This usually requires some specific conditions and intercommunication settings. These prerequisites are dependent on the application and therefore domain-specific. Whenever we want to change the state of a cluster (in order to fulfill these prerequisites), we use controllers.

Therefore, Operator is just another word for a controller that fulfills a few extra requirements, which are the following.

An Operator Manages a CRD

Controllers can operate on any Kubernetes resource, including the default ones. An operator, on the other hand, manages custom resources that are typically packaged and installed together with the operator. (Kubebuilder provides this functionality: we can create CRDs when creating an API.)

An Operator Implements Domain-Specific Knowledge

The operator knows something about the resources that Kubernetes doesn’t. The most famous examples are operators for databases, because they usually implement a specific strategy for how to share data and scale out the database across different nodes.
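
As an illustration of what such knowledge could look like, here is a small Python sketch of one rule a database operator might follow: when scaling down, remove replicas and never the primary. The function name, the role labels and the rule itself are my assumptions for the sake of the example, not taken from any real operator.

```python
def plan_scale_down(members, target):
    """Return the names of members to remove, never touching the primary.

    members maps a member name to its role, "primary" or "replica".
    """
    if target < 1:
        raise ValueError("a cluster needs at least its primary")
    replicas = sorted(name for name, role in members.items() if role == "replica")
    surplus = len(members) - target
    if surplus <= 0:
        return []               # nothing to remove
    return replicas[-surplus:]  # remove the highest-numbered replicas first


members = {"pg-0": "primary", "pg-1": "replica", "pg-2": "replica"}
print(plan_scale_down(members, 2))  # ['pg-2']
print(plan_scale_down(members, 1))  # ['pg-1', 'pg-2']
print(plan_scale_down(members, 3))  # []
```

Kubernetes itself has no notion of "primary" and "replica" here. A generic controller would simply see three Pods, while the operator knows which one must survive.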

An Operator Has Single-Application Focus

Unlike controllers, operators are not universally applicable. Instead of managing generic resources like Pods and Services, operators are in charge of a very specific application: Postgres, Kafka, etcd or Prometheus. This becomes obvious when checking Red Hat’s OperatorHub.³

Another definition for operators is provided by the OperatorHub:

We call this a Kubernetes-native application. With Operators you can stop treating an application as a collection of primitives like Pods, Deployments, Services or ConfigMaps, but instead as a single object that only exposes the knobs that make sense for the application.

Source
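
The "single object" idea can be sketched in Python as follows. Everything here is hypothetical: the `PostgresCluster` kind, the `example.com/v1` API group, the two knobs in `spec` and the `expand` function are invented for illustration, and the returned dictionaries are simplified placeholders rather than valid manifests.

```python
# The one object a user edits: it exposes only the knobs that make sense
# for the application.
postgres = {
    "apiVersion": "example.com/v1",   # hypothetical API group
    "kind": "PostgresCluster",        # hypothetical custom resource kind
    "metadata": {"name": "orders-db"},
    "spec": {"replicas": 3, "version": "16"},
}


def expand(resource):
    """Derive the primitives an operator might manage for one custom resource."""
    name = resource["metadata"]["name"]
    spec = resource["spec"]
    return [
        {"kind": "StatefulSet", "name": name, "replicas": spec["replicas"]},
        {"kind": "Service", "name": f"{name}-primary"},
        {"kind": "Service", "name": f"{name}-replicas"},
        {"kind": "ConfigMap", "name": f"{name}-config",
         "data": {"version": spec["version"]}},
    ]


for obj in expand(postgres):
    print(obj["kind"], obj["name"])
# StatefulSet orders-db
# Service orders-db-primary
# Service orders-db-replicas
# ConfigMap orders-db-config
```

The user only ever touches `replicas` and `version`. Which primitives exist and how they are wired together is the operator’s business.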

There are other ways to build operators apart from Kubebuilder, most notably the operator-sdk. In fact, Red Hat has built an entire product zoo around the notion of Operators, including the aforementioned OperatorHub.

Additional Sources

¹ See Level Triggering and Reconciliation in K8s
² Kubernetes Docs: kube-controller-manager
³ Find available operators at the OperatorHub.