When Do Kubernetes Operators Make Sense?

I remember when everyone started talking about Docker. We kept having the same discussion about whether it was right for the new client project. For a long time, the answer was no. That changed once Docker matured and the ecosystem started to expand.

Why am I telling you this?
Because I think Kubernetes Operators are the same kind of technology.
They are here to stay, and I find myself asking the same questions again.

You are looking at someone who said, “Nah, no one will want to watch 5-minute videos online. This YouTube thing will never catch on.”

As of now, I keep bouncing between Helm and Operators when implementing new components. Like everything else in DevOps, it depends on the project. Are we packaging an on-prem deployment? Are we deploying public “off the rack” services such as Elasticsearch? Or are we trying to deploy proprietary applications?

It seems like more public projects are moving to Operators, and their quality ranges from bad to okay-ish to excellent. If you are looking for an example of a good and widely adopted Operator, check out the Prometheus Operator.

I’ll try to give you a few use cases where Operators do make sense, and where there may be a better alternative as of today.

Operators in a nutshell

One way of thinking of operators is that they are a combination of two things:

An extension of the Kubernetes API - Meaning, a new way to interact with the cluster, using the YAML files we all love.
A controller - It takes these fancy YAML files (or API calls) and performs an action: create a Pod, run a Job, scale a Deployment, change the configuration, etc. Its task is to keep the actual state in line with the state you declared.

The idea is that you, the developer of an application, know its lifecycle and the best way to operate it.
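To make the "maintain a state" part concrete, here is a minimal sketch of what a controller does on every pass: compare what was asked for with what is running, and work out the actions that close the gap. The `DesiredState` fields and the action strings are made up for illustration; a real controller would call the Kubernetes API instead of returning text.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class DesiredState:
    """What the user declared in the custom resource (hypothetical fields)."""
    name: str
    replicas: int


def reconcile(desired: DesiredState, running_pods: list[str]) -> list[str]:
    """Return the actions needed to move the actual state to the desired one."""
    actions: list[str] = []
    diff = desired.replicas - len(running_pods)
    if diff > 0:
        # Too few pods: create the missing ones.
        actions += [f"create pod for {desired.name}"] * diff
    elif diff < 0:
        # Too many pods: delete the extras from the end of the list.
        actions += [f"delete {pod}" for pod in running_pods[diff:]]
    # diff == 0 means the states already match and there is nothing to do.
    return actions


if __name__ == "__main__":
    spec = DesiredState(name="my-app", replicas=3)
    print(reconcile(spec, ["my-app-0"]))
    print(reconcile(spec, ["my-app-0", "my-app-1", "my-app-2"]))
```

The point of the sketch is that the function is safe to run over and over: when the cluster already matches the declared state, it returns no actions.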

Ingress is a good illustration of the pattern. It's not really an operator, but it works similarly:

You have a controller - The Ingress Controller (Nginx, HAProxy, etc.). Usually, this is a Pod that runs the actual proxy.
A YAML resource you can interact with - Kind: Ingress. It has its own settings, and you usually keep it in the context of your applications.
When you apply this YAML, the Ingress controller, which watches the Kubernetes API, picks up the change and updates its own config.

To write an operator of your own, use the Operator SDK, which offers a few “frameworks” to work with:

Go - Write your own code.
Ansible - Run roles and playbooks.
Helm - Use Helm to execute your Operator logic.

These frameworks make it less intimidating to start implementing operators and give you a good idea of what you can do.

#1 When you want to package your Helm Charts or Deployment code inside the cluster

Sometimes it’s a good idea to keep a low footprint of your installation. If you are working on a product that is supposed to run on different on-prem environments, you may want to “hide” most of your logic and allow a more straightforward way of interacting with the system.

It’s a bit like taking all of your Helm Charts and storing them inside the cluster instead of keeping them on a bastion host or a laptop.
When you need to interact with the system, you know that all the logic is right there waiting for your input.
This leaves less room for error for whoever operates the system daily.
It may seem like a small problem, but if you’ve worked with some of the larger financial institutions as a vendor, you know that having a dedicated place to store your configuration is not a simple ask.

#2 Keep the configuration next to your code

The Prometheus Operator got it right with its ServiceMonitor CRD, and you can see that many public Helm charts out there ship one.
Each application can have a ServiceMonitor object that defines how you want to monitor it.

You don’t go and update the Prometheus deployment or update a configuration file - You update the configuration that sits next to your application. This way, you bind the configuration to the application lifecycle, which allows for a cleaner implementation, and puts the power in the hands of the developers.

The developer doesn’t have to know much about the Prometheus implementation, only how to enable monitoring for the application under her responsibility.
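As a rough sketch of what "configuration next to the application" looks like, the snippet below builds a ServiceMonitor manifest for one application and prints it as JSON, which `kubectl apply -f` accepts just like YAML. The app name, namespace, the `app` label, and the `metrics` port name are assumptions for the example; your Service labels and port names will differ, and your Prometheus instance has to be configured to select this ServiceMonitor.

```python
import json


def service_monitor(app: str, namespace: str, port: str = "metrics",
                    interval: str = "30s") -> dict:
    """Build a ServiceMonitor manifest that lives with the application."""
    return {
        "apiVersion": "monitoring.coreos.com/v1",
        "kind": "ServiceMonitor",
        "metadata": {"name": app, "namespace": namespace},
        "spec": {
            # Assumption: the application's Service carries the label app=<app>.
            "selector": {"matchLabels": {"app": app}},
            # Assumption: the Service exposes a named port that serves /metrics.
            "endpoints": [
                {"port": port, "path": "/metrics", "interval": interval}
            ],
        },
    }


if __name__ == "__main__":
    # "orders" and "shop" are placeholder names.
    print(json.dumps(service_monitor("orders", "shop"), indent=2))
```

Nothing here touches the Prometheus deployment itself. The manifest can ship in the same chart or repository as the application it monitors.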

We can take a service such as Kafka, and apply the same logic to it. Instead of making the topic creation part of the Kafka deployment, you can create the topic when deploying the application that uses it.

#3 When you want more of the same

Sometimes you need to deploy the same stack to the cluster multiple times. It may be for running multiple environments or to support multi-tenancy - you are using the same deployment package again and again.

You may not want to repeat the same lifecycle management logic for every one of these deployments, and this is where operators shine.
Take the MinIO operator - it allows you to launch MinIO clusters by applying one YAML. This simplifies and abstracts your configuration: instead of working with a full Helm chart, you get a uniform interface.

The margin of error is lower, and you can enforce an opinionated way of running that specific application stack. I tend to prefer explicit configuration as opposed to what you usually see in public Helm Charts.

But the idea is that you have one component that handles all these installations for you behind the scenes.

#4 There is more to the application lifecycle

For stateful applications, there is more to it than “apply and forget”.
Take databases, which you need to back up regularly and perform maintenance on.
You have to scale, migrate, and handle them in ways that Kubernetes doesn’t support out of the box.

Scheduled jobs are a good example of something you want to manage “hands-free,” meaning the logic runs inside the cluster instead of somewhere external to it.

Another example is scaling an app. Sometimes it’s not enough to just set the replica count. You may want to drain pods, make configuration changes, change application-level settings, etc.
An operator allows you to apply this more sophisticated logic to Kubernetes - You essentially extend the default capabilities.
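Here is a small sketch of the kind of ordering logic an operator can own when scaling a stateful app. The step names are invented for illustration and the pod naming (`pod-0`, `pod-1`, ...) is an assumption; a real operator would perform each step through the Kubernetes API and the application's own admin interface.

```python
from __future__ import annotations


def scale_plan(current: int, target: int) -> list[str]:
    """Return the ordered steps for scaling a hypothetical stateful app."""
    steps: list[str] = []
    if target < current:
        # Scale down: drain each departing pod before the replica count drops.
        for i in range(current - 1, target - 1, -1):
            steps.append(f"drain pod-{i}")
        steps.append(f"set replicas to {target}")
    elif target > current:
        # Scale up: raise the replica count, then register the new members.
        steps.append(f"set replicas to {target}")
        for i in range(current, target):
            steps.append(f"register pod-{i} with the cluster")
    if steps:
        # Any change in size is followed by an application-level update.
        steps.append("update application-level settings")
    return steps


if __name__ == "__main__":
    for step in scale_plan(current=3, target=1):
        print(step)
```

Setting the replica count is only one line in the plan. Everything around it is knowledge about the application, and that is the part the operator encodes.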

#5 You get a “free” API server

I recently worked on a project that required an automated way to launch new deployments. We had to expose an API that allows dynamic provisioning, and my initial thought was to develop a web service and run it as a pod on the cluster.

But then we have to handle authentication, write the code for the app, and eventually run the logic that provisions a deployment using Helm.
Well, that doesn’t make any sense when you can run a Helm- or Ansible-based Operator that gives you all of that out of the box.
Instead of interacting with our custom web API, the client app will just use the Kubernetes API.
I mean, usually, I would jump at the opportunity to sit down and write a new service, but this is unnecessary in this case.
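To show what "just use the Kubernetes API" can mean for the client app, here is a minimal sketch that builds (but does not send) the HTTP request for creating a namespaced custom resource. The group `example.com`, the kind `AppInstance`, the plural `appinstances`, the namespace, and the token are all hypothetical. Only the URL layout, `/apis/<group>/<version>/namespaces/<namespace>/<plural>`, is the standard path for namespaced custom resources.

```python
import json
import urllib.request


def build_create_request(api_server: str, token: str, group: str, version: str,
                         namespace: str, plural: str,
                         body: dict) -> urllib.request.Request:
    """Build a POST request that creates a namespaced custom resource."""
    url = f"{api_server}/apis/{group}/{version}/namespaces/{namespace}/{plural}"
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            # Authentication is handled by the cluster, not by our own service.
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


if __name__ == "__main__":
    # Hypothetical custom resource that the operator would react to.
    resource = {
        "apiVersion": "example.com/v1alpha1",
        "kind": "AppInstance",
        "metadata": {"name": "customer-42"},
        "spec": {"size": "small"},
    }
    req = build_create_request(
        "https://kubernetes.default.svc", "PLACEHOLDER_TOKEN",
        "example.com", "v1alpha1", "tenants", "appinstances", resource,
    )
    # Nothing is sent here; passing req to urllib.request.urlopen would send it.
    print(req.get_method(), req.full_url)
```

Authentication, authorization, and validation are what you would otherwise have to build into the custom web service. Here they come from the API server and the cluster's RBAC rules.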

Can’t wait to create your very first operator?

If you are convinced that Kubernetes Operators are the way to go, I would highly recommend giving the Ansible Operator SDK a try - Red Hat has a tutorial that will get you started.