Today I was given the challenge of providing Kafka as a service to multiple development teams in a way that was consistent and easy to manage. There are a number of challenges to this, from how a service request gets provisioned through to how the running service is monitored and upgraded.

Kafka is a streaming platform designed to be highly available and scalable for building data pipelines, and it is used in production by many companies.

I wanted to manage Kafka centrally, so an operator, deployed once and used to provide Kafka as a service to development teams, was a natural fit. It means that developers can quickly service their own needs, while the central Cloud team stays off their critical path and can focus on providing platform features rather than servicing individual requests.

The cleanest way to provide this type of centrally managed service is to deploy Kafka using an operator. Even though operators are only recently starting to be adopted, I was pleased to discover that the Strimzi project gives us a way to do this. I won't cover what operators are in this article, but if you'd like to find out more about them, take a look at the Introduction to Operators blog post linked at the end of this article. There is also a set of training scenarios available on Katacoda.


Deployment of the Operator

A deployment is formed of the following pieces:
- a project for the operator to be deployed into: amq-operator
- one or more projects that you would like Kafka to be deployed to: project-dev, project-uat and project-prod
- a service account for the operator to run as: system:serviceaccount:amq-operator:strimzi-cluster-operator
- role bindings for the service account: strimzi-cluster-operator-namespaced, strimzi-entity-operator and strimzi-topic-operator

1) Create a new project in your cluster with oc new-project amq-operator

2) Download a zip containing the deployment yaml files and examples from https://access.redhat.com/node/3596931/423/1. If you prefer to run the community Strimzi version, your mileage should be similar with the 0.10 release at https://github.com/strimzi/strimzi-kafka-operator/tree/release-0.10.x/install/cluster-operator

3) Run the following shell command on the RoleBinding files to replace my-project with the namespace that the operator is deployed into (amq-operator in this example).

On Linux, use:
sed -i 's/namespace: .*/namespace: amq-operator/' install/cluster-operator/*RoleBinding*.yaml

On macOS, use:
sed -i '' 's/namespace: .*/namespace: amq-operator/' install/cluster-operator/*RoleBinding*.yaml
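If you would rather not keep two variants of the command, the same substitution can be done without `sed -i` at all. The following is a minimal sketch, not part of the Strimzi install files: the `set_namespace` helper and the cut-down sample RoleBinding are hypothetical and exist only to show the effect on a throwaway file.

```shell
#!/bin/sh
# Rewrite every "namespace:" value in a file without relying on `sed -i`,
# whose syntax differs between GNU sed (Linux) and BSD sed (macOS).
# set_namespace is a hypothetical helper, not something Strimzi ships.
set_namespace() {
  file=$1
  namespace=$2
  sed "s/namespace: .*/namespace: $namespace/" "$file" > "$file.tmp" && mv "$file.tmp" "$file"
}

# Try it on a throwaway, cut-down RoleBinding rather than the real install files.
sample=$(mktemp)
cat > "$sample" <<'EOF'
kind: RoleBinding
metadata:
  name: strimzi-cluster-operator
subjects:
- kind: ServiceAccount
  name: strimzi-cluster-operator
  namespace: my-project
EOF

set_namespace "$sample" amq-operator

# Show the rewritten line.
grep 'namespace:' "$sample"
```

To use the helper on the real files, you could call it in a loop over install/cluster-operator/*RoleBinding*.yaml, passing amq-operator as the second argument.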

4) Update install/cluster-operator/050-Deployment-strimzi-cluster-operator.yaml and change the variable STRIMZI_NAMESPACE to list all projects that you want the operator to watch.

apiVersion: extensions/v1beta1
kind: Deployment
spec:
  template:
    spec:
      serviceAccountName: strimzi-cluster-operator
      containers:
      - name: strimzi-cluster-operator
        image: strimzi/cluster-operator:latest
        imagePullPolicy: IfNotPresent
        env:
        - name: STRIMZI_NAMESPACE
          value: project-dev,project-uat,project-prod

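5) Apply the RoleBindings to each project that you want the AMQ Operator to manage. If you see permission errors later on, the most likely cause is that these roles have not been applied. The commands below use project-dev; repeat them for each project.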

oc apply -f install/cluster-operator/020-RoleBinding-strimzi-cluster-operator.yaml -n project-dev
oc apply -f install/cluster-operator/031-RoleBinding-strimzi-cluster-operator-entity-operator-delegation.yaml -n project-dev
oc apply -f install/cluster-operator/032-RoleBinding-strimzi-cluster-operator-topic-operator-delegation.yaml -n project-dev

Note: these yaml files contain a namespace value. This is the namespace that the operator resides in, so it should be set to amq-operator in our example. If you prefer, you can grant the same roles through the CLI instead:

oc adm policy add-role-to-user strimzi-cluster-operator-namespaced system:serviceaccount:amq-operator:strimzi-cluster-operator -n project-dev
oc adm policy add-role-to-user strimzi-entity-operator system:serviceaccount:amq-operator:strimzi-cluster-operator -n project-dev
oc adm policy add-role-to-user strimzi-topic-operator system:serviceaccount:amq-operator:strimzi-cluster-operator -n project-dev

oc adm policy add-role-to-user strimzi-cluster-operator-namespaced system:serviceaccount:amq-operator:strimzi-cluster-operator -n project-uat
oc adm policy add-role-to-user strimzi-entity-operator system:serviceaccount:amq-operator:strimzi-cluster-operator -n project-uat
oc adm policy add-role-to-user strimzi-topic-operator system:serviceaccount:amq-operator:strimzi-cluster-operator -n project-uat

oc adm policy add-role-to-user strimzi-cluster-operator-namespaced system:serviceaccount:amq-operator:strimzi-cluster-operator -n project-prod
oc adm policy add-role-to-user strimzi-entity-operator system:serviceaccount:amq-operator:strimzi-cluster-operator -n project-prod
oc adm policy add-role-to-user strimzi-topic-operator system:serviceaccount:amq-operator:strimzi-cluster-operator -n project-prod
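The nine commands above follow one pattern (every role, for every project), so they can be generated with a loop. This is a rough sketch rather than an official script: the `OC` variable and the `bind_roles` function are names I have made up for illustration, and `OC` defaults to `echo oc` so that the script only prints the commands until you are ready to run them.

```shell
#!/bin/sh
# Dry run by default: OC expands to "echo oc", so commands are printed, not run.
# Set OC=oc in the environment to execute them against your cluster.
OC=${OC:-echo oc}

OPERATOR_SA="system:serviceaccount:amq-operator:strimzi-cluster-operator"
PROJECTS="project-dev project-uat project-prod"
ROLES="strimzi-cluster-operator-namespaced strimzi-entity-operator strimzi-topic-operator"

# Grant each Strimzi role to the operator service account in each project.
bind_roles() {
  for project in $PROJECTS; do
    for role in $ROLES; do
      $OC adm policy add-role-to-user "$role" "$OPERATOR_SA" -n "$project"
    done
  done
}

bind_roles
```

Keeping the project list in one variable also makes it easier to keep it in step with the STRIMZI_NAMESPACE value from step 4.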

(Screenshot: the roles once deployed.)

6) Deploy the Operator

oc apply -f install/cluster-operator -n amq-operator

(Screenshot: the operator once deployed.)

Check the operator logs to confirm that it has deployed properly.

(Screenshot: operator log output.)

Note: Check that the operator is looking for the projects that we specified earlier. If you see any permission errors in the logs, then you need to check that you gave the operator service account system:serviceaccount:amq-operator:strimzi-cluster-operator the Strimzi operator roles listed above. Giving the operator service account the admin role on the project is not sufficient.


Deploying a Kafka cluster

Now that we have deployed the operator, we can use it to deploy a Kafka cluster. This is as simple as creating a new Kafka object in your project namespace. We're going to deploy the persistent storage example to the project-dev namespace.

apiVersion: kafka.strimzi.io/v1alpha1
kind: Kafka
metadata:
  name: my-cluster
spec:
  kafka:
    replicas: 3
    listeners:
      plain: {}
      tls: {}
    config:
      offsets.topic.replication.factor: 3
      transaction.state.log.replication.factor: 3
      transaction.state.log.min.isr: 2
    storage:
      type: persistent-claim
      size: 1Gi
      deleteClaim: false
  zookeeper:
    replicas: 3
    storage:
      type: persistent-claim
      size: 1Gi
      deleteClaim: false
  entityOperator:
    topicOperator: {}
    userOperator: {}

Paste the example code into the 'Import YAML/JSON' prompt in the OpenShift web console. The console shows a warning at this point. We can accept it, because we are creating a Kafka object and we understand the implications of that.

Once applied, the operator will create the persistent storage and other resources, leaving you with a brand new Kafka cluster.

Note: The persistent storage example assumes that you have dynamic provisioning of storage configured, or some pre-created persistent volumes available in your OpenShift cluster.

(Screenshot: provisioned storage.)
(Screenshot: fully deployed Kafka cluster.)

Adding an additional project to the Operator

If you'd like to add an additional project for the operator to watch after it has been deployed, add it to the comma-separated list in the STRIMZI_NAMESPACE environment variable and trigger a new deployment.

(Screenshot: updating the STRIMZI_NAMESPACE environment variable to watch additional projects.)

Ensure that you also apply the role bindings so that the operator service account has permission to deploy into my-new-project:

oc apply -f install/cluster-operator/020-RoleBinding-strimzi-cluster-operator.yaml -n my-new-project
oc apply -f install/cluster-operator/031-RoleBinding-strimzi-cluster-operator-entity-operator-delegation.yaml -n my-new-project
oc apply -f install/cluster-operator/032-RoleBinding-strimzi-cluster-operator-topic-operator-delegation.yaml -n my-new-project
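The environment variable update can also be made from the command line. The sketch below is only an illustration under a few assumptions: the Deployment is named strimzi-cluster-operator (as the file name in step 4 suggests), `add_namespace` is a hypothetical helper of my own, and `OC` defaults to `echo oc` so that nothing is changed until you set `OC=oc`.

```shell
#!/bin/sh
# Dry run by default; set OC=oc to execute against the cluster.
OC=${OC:-echo oc}

# Append a project to a comma-separated list unless it is already present.
# add_namespace is a hypothetical helper for this example.
add_namespace() {
  current=$1
  new=$2
  case ",$current," in
    *",$new,"*) printf '%s\n' "$current" ;;
    *)          printf '%s\n' "${current:+$current,}$new" ;;
  esac
}

WATCHED="project-dev,project-uat,project-prod"
UPDATED=$(add_namespace "$WATCHED" "my-new-project")

# Changing an environment variable alters the pod template, so the
# Deployment rolls out a new operator pod with the updated list.
$OC set env deployment/strimzi-cluster-operator "STRIMZI_NAMESPACE=$UPDATED" -n amq-operator
```

Bear in mind that if you later re-apply the install yaml files, the value in 050-Deployment-strimzi-cluster-operator.yaml will overwrite this change, so it is worth updating the file as well.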

Troubleshooting

If you run into any issues, such as the Operator crashing, errors in the Operator logs or Kafka not being deployed, go through and check that:

- the project you are trying to deploy to is listed in the STRIMZI_NAMESPACE environment variable of the Operator deployment;
- the correct permissions are applied to the project you are deploying Kafka to; that is, the service account system:serviceaccount:amq-operator:strimzi-cluster-operator has the roles strimzi-cluster-operator-namespaced, strimzi-entity-operator and strimzi-topic-operator in the project where Kafka will be deployed, e.g. project-dev.
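
Both checks in the list above can be scripted. This is a small sketch under stated assumptions: `is_watched` is a hypothetical helper, the watched list is hard-coded here rather than read from the cluster, and `OC` defaults to `echo oc` so the role binding lookup is only printed.

```shell
#!/bin/sh
# Dry run by default; set OC=oc to query the cluster.
OC=${OC:-echo oc}

# Succeeds when project $2 appears in the comma-separated list $1.
# is_watched is a hypothetical helper for this example.
is_watched() {
  case ",$1," in
    *",$2,"*) return 0 ;;
    *)        return 1 ;;
  esac
}

# On a live cluster the current value can be listed with (assuming the
# Deployment is named strimzi-cluster-operator):
#   oc set env deployment/strimzi-cluster-operator --list -n amq-operator
WATCHED="project-dev,project-uat,project-prod"

for project in project-dev project-e2e; do
  if is_watched "$WATCHED" "$project"; then
    echo "$project: watched"
    # List the role bindings so you can confirm the three Strimzi roles are present.
    $OC get rolebindings -n "$project"
  else
    echo "$project: NOT watched"
  fi
done
```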


For official documentation, please refer to: https://access.redhat.com/documentation/en-us/red_hat_amq/7.2/html-single/using_amq_streams_on_openshift_container_platform/index#con-cluster-operator-rbac-deploying-co

Introduction to Operators: https://www.redhat.com/en/blog/introducing-operator-framework-building-apps-kubernetes

Strimzi community Operator: https://strimzi.io

Katacoda training: https://www.katacoda.com/openshift