The K8ssandra v1.2.0 release was published on Wednesday, 06/02/2021. This release adds support for the latest Apache Cassandra™ release, 4.0-RC1, and uses the latest release of Cass Operator, v1.7.1. Here are the details.

Apache Cassandra™ 4.0-RC1

We’ve addressed issues across a number of projects to bring 4.0-RC1 to K8ssandra. The recently released v4.0.0-v0.1.25 of the Management API for Apache Cassandra™ is the key piece that delivers support for 4.0-RC1 within K8ssandra, and that release is now the default. If you configure your deployment for version 4.0.0, you’ll get the latest and greatest version of Apache Cassandra™.
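
For example, a minimal values.yaml override along these lines opts in to 4.0 (this is the same version key used in the fuller example later in this post):

cassandra:
  version: "4.0.0"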

Enhanced Pod Scheduling

The ability to control pod scheduling is a critical aspect of maintaining performance on production systems. To help users manage this aspect of deployment, we’ve added support for the specification of tolerations on various resources within K8ssandra.

A big thanks to community members Maulik Parikh (@mparikhcloudbeds) and Grzegorz Kocur (@gkocur) for their feedback while we were designing and implementing this feature!

Example Usage

Let’s check out an example of putting this into action within K8ssandra.

In our example deployment, we have a series of nodes (node1-8) on which pods can be scheduled. Our goal is to prevent non-Apache Cassandra™ pods from being scheduled on the same nodes as the Apache Cassandra™ pods, which is the recommended practice in production environments.

To do this, we’ll apply taints to our target Apache Cassandra™ nodes that can be used to prevent other pods from being scheduled on them.

kubectl taint nodes node1 app=cassandra:NoSchedule
kubectl taint nodes node2 app=cassandra:NoSchedule
kubectl taint nodes node3 app=cassandra:NoSchedule

We’ve applied a taint, app=cassandra:NoSchedule, which ensures that only pods with a matching toleration for app=cassandra can be scheduled on those nodes.
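
If you’d like to confirm the taints took effect, a quick check like the following works (the grep filter is just a convenience):

kubectl describe node node1 | grep Taints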

Similarly, we also apply a set of complementary taints to other nodes, dictating that only pods with a matching app=non-cassandra toleration can be scheduled there.

kubectl taint nodes node4 app=non-cassandra:NoSchedule
kubectl taint nodes node5 app=non-cassandra:NoSchedule
kubectl taint nodes node6 app=non-cassandra:NoSchedule
kubectl taint nodes node7 app=non-cassandra:NoSchedule

We then add the tolerations configuration to the values.yaml file used to deploy our K8ssandra cluster via Helm.

(Portions of the configuration are omitted to highlight the relevant elements.)

cassandra:
  version: "4.0.0"
  ...
  datacenters:
  - name: dc1
    size: 3
    ...
  tolerations:
    - key: "app"
      operator: "Equal"
      value: "cassandra"
      effect: "NoSchedule"
...
stargate:
  enabled: true
  tolerations:
    - key: "app"
      operator: "Equal"
      value: "non-cassandra"
      effect: "NoSchedule"
  replicas: 3
...
reaper:
  enabled: true
  tolerations:
    - key: "app"
      operator: "Equal"
      value: "non-cassandra"
      effect: "NoSchedule"
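
With the values file in place, deploying the cluster is the usual Helm invocation. Assuming the k8ssandra Helm repo has already been added, something along these lines works; the release name and values file name are placeholders:

helm install k8ssandra k8ssandra/k8ssandra -f values.yaml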

The end result is a deployment that looks something like this:

NAME                                        READY   STATUS    RESTARTS   AGE     IP            NODE                                                
k8ssandra-cass-operator-766b945f65-mm46m    1/1     Running   0          7m21s   10.100.6.15   node8         
k8ssandra-dc1-default-sts-0                 2/2     Running   0          7m6s    10.100.5.3    node1      
k8ssandra-dc1-default-sts-1                 2/2     Running   0          7m6s    10.100.3.3    node2      
k8ssandra-dc1-default-sts-2                 2/2     Running   0          7m6s    10.100.4.3    node3      
k8ssandra-dc1-stargate-54fc755cc8-4wbv8     1/1     Running   0          7m21s   10.100.1.4    node4   
k8ssandra-dc1-stargate-54fc755cc8-7bk4r     1/1     Running   0          7m20s   10.100.0.4    node5   
k8ssandra-dc1-stargate-54fc755cc8-x9krq     1/1     Running   0          7m20s   10.100.2.4    node6   
k8ssandra-reaper-797b7467d-7blk9            0/1     Running   0          59s     10.100.2.5    node7   
k8ssandra-reaper-operator-566cdc787-cptvb   1/1     Running   0          7m21s   10.100.6.16   node8

To learn more about using taints and tolerations, check out the Kubernetes documentation.

We’ll soon be focusing on supporting node affinity within K8ssandra as well, completing the pod scheduling story for the stack.

Cass Operator Migration and v1.7.1 Release

We’re happy to announce the completion of the migration of the Cass Operator project to its new home within the K8ssandra organization. The K8ssandra team was already maintaining and enhancing Cass Operator in the months leading up to the GA release of K8ssandra, so we’re excited to give the project a new home and a renewed focus!

The original code and artifacts have been left in place in the DataStax GitHub org so as not to disturb existing users. Going forward, all maintenance and enhancements will take place in the K8ssandra GitHub repository, with releases distributed through the K8ssandra DockerHub repository.

The first new release of the migrated project, v1.7.0, was made available on May 7th. It included a number of fixes and new features; check those out here. We’ve quickly followed that up with the v1.7.1 patch release, which addresses an upgrade issue initially reported here.

Upgrade Considerations

Cass Operator v1.7.0 introduced a change to the ServiceName field of the StatefulSets it deploys. This was done to better support scenarios like pod-to-pod communication via DNS names across clusters. However, this change cannot be applied automatically during an upgrade, as that field is immutable on a StatefulSet.
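
If you’d like to see the value your existing cluster is using, you can inspect the field directly. The StatefulSet name below matches the example deployment earlier in this post and will vary with your cluster and datacenter names:

kubectl get statefulset k8ssandra-dc1-default-sts -o jsonpath='{.spec.serviceName}'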

This change is not required for all users. In many cases, Cass Operator can be upgraded to v1.7.1 without any additional changes. This includes existing K8ssandra deployments, where Cass Operator will be upgraded automatically when moving to K8ssandra v1.2.0, as well as deployments that were originally created with Cass Operator v1.7.0.
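
One quick way to check which Cass Operator version a K8ssandra deployment is currently running is to look at the operator image tag. The deployment name below comes from the example cluster above and may differ in your environment:

kubectl get deployment k8ssandra-cass-operator -o jsonpath='{.spec.template.spec.containers[0].image}'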

For more info on these changes, and for help deciding what is necessary to upgrade your environment, check out the discussion on our forums here.

What’s Next

In addition to working towards further improving the usability of K8ssandra, supporting incremental repair operations, and other enhancements and fixes, we’ve been busy testing deployments across Google Kubernetes Engine (GKE), Amazon Elastic Kubernetes Service (EKS), and Azure Kubernetes Service (AKS) with an eye towards performance. Be on the lookout for upcoming blog content sharing what we’ve found from our performance benchmarking efforts.