Written by Miles Garnsey, K8ssandra engineer

The K8ssandra team is pleased to announce the release of K8ssandra-operator v1.2. First released in February this year, K8ssandra-operator is the premier solution for running Apache Cassandra® on Kubernetes at global scale.

To support resilience in the face of disaster, performance through geographic proximity, horizontal scalability, and the elusive goal of 100% uptime, many enterprises choose to replicate their data across the globe. Apache Cassandra is the trusted solution to data replication, in use at the majority of Fortune 100 companies.

K8ssandra-operator leverages the power of Kubernetes to ease repairs, disaster recovery, authn/z and cluster configuration for Cassandra. K8ssandra-operator makes it possible to automate day 2 operations with a simple declarative API. 

With the release of 1.2, we have focused on enhancements to resilience, observability, security and interoperability:

  • New data integration options: We now support Change Data Capture (CDC) and output to Apache Pulsar.
  • More customizability: Injection of initContainers, containers, volumes.
  • Enhanced security: Stargate now supports JWT authentication.
  • Better observability: Operators can now filter Prometheus metrics for Cassandra. Reaper now exposes metrics so repairs can be monitored.
  • More robust DR: Operators can now define backup schedules for Medusa, and large restore jobs can now target a persistent volume, enabling restores when local storage is limited.
  • Fixes to v1.1 pain points: We now support cluster name overrides, so that clusters whose names are not RFC 1035 compliant can be managed by the operator.

Let’s take a closer look at each of these features.

Go go gadget Pulsar

The headline feature of this release is the integration of DataStax Change Data Capture (CDC) for Apache Cassandra. 

By integrating the new DataStax change agent for Apache Cassandra into K8ssandra, we now offer the ability to output mutated rows from Cassandra to Apache Pulsar for integration into downstream systems.

This requires no changes to the source application, and opens up a new vista of data integration and processing possibilities. Use cases include output to specialised data stores (e.g. Elasticsearch) as well as on-prem/cloud integrations and migrations.

We’ll be publishing another blog post with more details on how this works. Stand by.

You’re in control

We now offer enhanced options for users to inject their own initContainers, containers and volumes into the Cassandra pods under K8ssandra-operator’s control. This enables more integrations with third party authn/z, backup/restore and networking suites.

Simplified Stargate Security (Seriously)

On the topic of third party authn/z tooling, Stargate now supports JWTs issued by third party IdPs (identity providers) out of the box. Ditch those passwords and move into 2022 with an identity and authorization experience that is smoother, more elegant, and more secure.

Your own observatory

We now expose more options for filtering the metrics output by Cassandra via Metrics Collector for Apache Cassandra, allowing large clusters to trim down the number of metrics being stored and avoid clogging up their observability tooling.
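As a rough sketch of what such filtering might look like, the manifest below denies all per-table metrics and then re-allows a single one. The `telemetry.mcac.metricFilters` field path, the `deny:`/`allow:` rule syntax and the metric names shown are assumptions drawn from memory of the K8ssandra-operator documentation, so check them against the CRD reference for your operator version before relying on them.

```yaml
apiVersion: k8ssandra.io/v1alpha1
kind: K8ssandraCluster
metadata:
  name: test
spec:
  cassandra:
    serverVersion: "4.0.4"
    telemetry:
      mcac:
        # Assumed field name and rule syntax; verify against your CRD version.
        metricFilters:
          # Drop every per-table metric...
          - "deny:org.apache.cassandra.metrics.Table"
          - "deny:org.apache.cassandra.metrics.table"
          # ...then re-allow only the one we want to keep.
          - "allow:org.apache.cassandra.metrics.table.live_ss_table_count"
    datacenters:
      - metadata:
          name: dc1
        size: 1
        storageConfig:
          cassandraDataVolumeClaimSpec:
            storageClassName: standard
            accessModes:
              - ReadWriteOnce
            resources:
              requests:
                storage: 5Gi
```

The deny-then-allow ordering is intended to keep the stored metric set small on large clusters while retaining the few series a dashboard actually uses.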

We’ve also exposed new metrics for Reaper, so that critical repair tasks can be properly monitored.

(We also fixed a bug where we were outputting duplicate metrics from Cassandra, but let’s consign that to history – it was my bad!)

New Disaster Recovery (DR) capabilities

Replication isn’t enough: it doesn’t save you from rogue operators or buggy applications. Medusa is K8ssandra’s DR/backup/restore solution. It now allows users to define backup schedules with the new MedusaBackupSchedule CRD and to restore to persistent volumes.

We’ve also addressed a limitation where Medusa would previously always restore to an emptyDir volume. This caused problems for large backups when a node had little local storage (because emptyDir volumes are backed by the node’s own storage by default). Now users can set which volumes their backups will be restored to.
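To make the scheduling side concrete, here is a minimal sketch of a MedusaBackupSchedule that takes a backup of datacenter dc1 every night at 01:30. The resource name `nightly-backup` is hypothetical, and the field names (`backupSpec`, `cassandraDatacenter`, `backupType`, `cronSchedule`, `disabled`) and the `medusa.k8ssandra.io/v1alpha1` API group are given as we understand them from the operator documentation; treat them as assumptions and confirm them against the CRD installed in your cluster.

```yaml
apiVersion: medusa.k8ssandra.io/v1alpha1
kind: MedusaBackupSchedule
metadata:
  # Hypothetical name, chosen for illustration only.
  name: nightly-backup
spec:
  backupSpec:
    # Datacenter to back up; assumed to match a datacenter in your K8ssandraCluster.
    cassandraDatacenter: dc1
    backupType: differential
  # Standard cron syntax: minute 30, hour 1, every day.
  cronSchedule: "30 1 * * *"
  # Set to true to pause the schedule without deleting it.
  disabled: false
```

This assumes Medusa is already enabled and configured with a storage backend on the K8ssandraCluster that owns dc1.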

Cluster name overrides 

Migrations into K8ssandra have sometimes run into difficulties because cluster names needed to be RFC 1035 compliant, and legacy clusters often had non-compliant names (e.g. with underscores, or camelCase).

We now allow the K8ssandraCluster to have one name while defining another name at the Cassandra layer. This will enable migrations for Cassandra clusters with non-compliant names running outside of Kubernetes to K8ssandra.

For example, if you were dealing with a legacy cluster named __MY_HORRIBLE_CLUSTERNAME___, you might need to override the cluster name as shown below:

apiVersion: k8ssandra.io/v1alpha1
kind: K8ssandraCluster
metadata:
  name: test
spec:
  cassandra:
    serverVersion: "4.0.4"
    clusterName: "__MY_HORRIBLE_CLUSTERNAME___"
    datacenters:
      - metadata:
          name: dc1
        size: 1
        storageConfig:
          cassandraDataVolumeClaimSpec:
            storageClassName: standard
            accessModes:
              - ReadWriteOnce
            resources:
              requests:
                storage: 5Gi

Upgrade Now

We invite all K8ssandra users to upgrade to K8ssandra-operator v1.2 (see our installation documentation).

Check our zero downtime migration blog post when upgrading from Apache Cassandra, cass-operator and K8ssandra v1.x clusters.

Let us know what you think of K8ssandra-operator by joining us on the K8ssandra Discord or K8ssandra Forum today. For exclusive posts on all things data, follow DataStax on Medium.