The K8ssandra team has just released k8ssandra-operator v1.10.0.
In this blog post, we will highlight some of the changes we brought in since v1.6.0, including some strong backup/restore improvements.
Backup/Restore improvements
One of the biggest changes in v1.9.0 is the upgrade to the recently released Medusa v0.16. This version has a major rewrite of the whole storage backend, which makes it much more efficient and resilient when communicating with the object storage systems.
You can read more about this in the Medusa release blog post.
We also improved a few things in k8ssandra-operator in recent versions, based on community feedback:
Restore mapping computation by the operator
Before v1.8.0, the restore mapping was computed by Medusa itself, once for each pod. This created some issues with hostname resolution, and proved both hard to debug and unreliable in some environments.
To generate the mapping more safely, we implemented its computation in the operator itself. In preparation for an upcoming feature where a cluster could be created directly from a backup, we created a separate Medusa pod that handles communication between the operator and Medusa’s storage backend without requiring the cluster to be up. This will also make it possible to recover from a failed restore by pointing to a different backup without having to fully recreate the cluster.
Status fields and printed columns
MedusaBackup objects materialize Medusa backups as local objects in the Kubernetes cluster. Until now, they contained little information about the backup itself, even though that information would be useful for inspecting the backup metadata without having to communicate with Medusa directly.
In v1.9.0, we added the following fields in the MedusaBackup CRD status:
.status.totalNodes: the number of nodes in the datacenter at the time of the backup
.status.finishedNodes: the number of nodes that successfully completed the backup
.status.status: the overall status of the backup (success, error, etc.)
.status.nodes: an array of the nodes existing in the datacenter at the time of the backup, along with the tokens they own and their dc/rack placement
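As a rough illustration, the status of a MedusaBackup taken on a three-node datacenter might look like the sketch below. The four top-level field names come from the list above; the sub-field names under nodes (host, tokens, datacenter, rack) and every value are assumptions made for illustration, so check the CRD of your installed version for the exact schema.

```yaml
# Illustrative MedusaBackup status; all values are made up.
status:
  totalNodes: 3
  finishedNodes: 3
  status: SUCCESS
  nodes:
    # One entry per node present at backup time.
    # The sub-field names below are assumptions, not confirmed by this post.
    - host: test-dc1-default-sts-0
      datacenter: dc1
      rack: default
      tokens:
        - -9123456789012345678
```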
Some of these fields are now printed by default when listing the objects with kubectl or displaying them with Open Lens:
% kubectl get MedusaBackup -A
NAMESPACE          NAME                                STARTED   FINISHED   NODES   COMPLETED   STATUS
dogfood-63gzhx28   medusa-backup-1                     27h       27h        3       3           SUCCESS
dogfood-63gzhx28   medusa-backup-schedule-1695346200   10h       10h        3       3           SUCCESS
The MedusaBackupJob object now has its start and finish time status fields printed in the output as well, to ease tracking backup progress:
% kubectl get MedusaBackupJob -A
NAMESPACE          NAME                                STARTED   FINISHED
dogfood-63gzhx28   medusa-backup-1                     27h       27h
dogfood-63gzhx28   medusa-backup-schedule-1695346200   10h       10h
Restore safety
We’ve had reports that k8ssandra-operator would allow triggering the restore of incomplete or incompatible backups, which would result in broken, crash-looping clusters.
Leveraging the newly added status fields from the MedusaBackup object, we’ve added new safety checks before triggering the restore operation.
Any backup that is incomplete, that was taken on a datacenter with a different name, or that has a different rack layout will now fail the MedusaRestoreJob early, without risking cluster breakage.
We recommend running a sync MedusaTask to populate these new fields on your existing MedusaBackup objects.
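A minimal sketch of such a sync task follows, together with a restore job that would then go through the new checks. The object names, the namespace, and the dc1 datacenter name are placeholders; the spec fields shown are assumed from the medusa.k8ssandra.io/v1alpha1 CRDs and should be verified against your installed version.

```yaml
# Sync task: refreshes the MedusaBackup objects of one datacenter
# from what is found in the storage backend.
apiVersion: medusa.k8ssandra.io/v1alpha1
kind: MedusaTask
metadata:
  name: sync-backups-dc1          # placeholder name
  namespace: k8ssandra-operator   # placeholder namespace
spec:
  cassandraDatacenter: dc1        # CassandraDatacenter targeted by the sync
  operation: sync
---
# Restore job: references a MedusaBackup by name.
# It should fail early if the backup does not pass the safety checks.
apiVersion: medusa.k8ssandra.io/v1alpha1
kind: MedusaRestoreJob
metadata:
  name: restore-medusa-backup-1   # placeholder name
  namespace: k8ssandra-operator   # placeholder namespace
spec:
  cassandraDatacenter: dc1
  backup: medusa-backup-1         # name of an existing MedusaBackup object
```

Applying the sync task first and waiting for it to finish means the restore job evaluates up-to-date status fields.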
Bug fixes
When running a sync task, the operator was missing proper namespace filters, which resulted in the deletion of MedusaBackup objects from datacenters that were not targeted by the MedusaTask.
This was fixed in v1.9.0, and the MedusaTask boundaries are now properly respected.
We also fixed a bug that would prevent scaling up a Cassandra cluster after a restore was done.
The next release will bring further improvements to the backup/restore experience.
Reaper communications through management-api
Reaper communicates with Cassandra through JMX for any interaction. This includes discovering the cluster topology, accessing metrics and triggering repairs.
JMX is often considered an insecure protocol, and some security policies restrict its use. In K8ssandra, we use the management API for operator-to-node communications. On top of the additional security it provides, we can augment its capabilities by accessing the internals of Cassandra through its loaded agent.
Starting with v1.10 and the recently released Reaper v3.4.0, you can enable HTTP communications with the management API using the following CRD settings:
spec:
  reaper:
    httpManagement:
      enabled: true
arm64 support
As arm64 developer laptops (such as Apple Silicon laptops) and cloud VMs (such as EC2 Graviton instances) grew in popularity, it became evident that we needed to support the arm64 architecture across our Docker images.
With this new release, all K8ssandra images are published for both the amd64 and arm64 architectures for seamless installation on multiple platforms.
Support for DC name overrides
k8ssandra-operator already supported cluster name overrides, which allow setting the Cassandra cluster name to a value different from the K8ssandraCluster object name. The object name is constrained in the characters it can use (for example, no spaces, uppercase letters, or special characters other than dashes).
Because the CassandraDatacenter object is a Kubernetes object, we could not have two clusters using the same DC name (dc1 for example) in the same namespace. Nor could we allow a datacenter name that didn’t match the Kubernetes object naming constraints.
k8ssandra-operator now has proper support for datacenter name overrides through the .spec.cassandra.datacenters[].datacenterName field.
This way, you can create a cluster with a dc1 datacenter that has a CassandraDatacenter named cluster1-dc1 using the following manifest:
apiVersion: k8ssandra.io/v1alpha1
kind: K8ssandraCluster
metadata:
  name: cluster1
spec:
  cassandra:
    serverVersion: "4.0.1"
    datacenters:
      - metadata:
          name: cluster1-dc1
        datacenterName: dc1
        k8sContext: kind-k8ssandra-0
        size: 1
        storageConfig:
          cassandraDataVolumeClaimSpec:
            storageClassName: standard
            accessModes:
              - ReadWriteOnce
            resources:
              requests:
                storage: 5Gi
This also allows using spaces and uppercase letters in your DC names (not that we recommend doing so, but it’s a possibility now).
Annotation for secrets injection
We’ve had an annotation available to inject secrets at runtime for a few versions now, and we recently evolved it so that you can attach specific secrets to specific pods in the statefulset easily.
Statefulsets don’t make that easy (or even possible), since all pods are created with the exact same pod template spec, but using different names.
What we added here is a ${POD_NAME} placeholder value that our webhook substitutes with the actual pod name in the secret name. This can be very handy for managing encryption stores, for example, where each node has its own keystore/truststore pair.
With a secret name of ${POD_NAME}-inode-cert-keystore, for example, the following pods would each get a different secret mounted:
test-dc1-default-sts-0 would get the test-dc1-default-sts-0-inode-cert-keystore secret
test-dc1-default-sts-1 would get the test-dc1-default-sts-1-inode-cert-keystore secret
test-dc1-default-sts-2 would get the test-dc1-default-sts-2-inode-cert-keystore secret
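For reference, a pod annotation producing this behavior might look like the partial manifest below. This is only a hedged sketch: the annotation key, the JSON keys (name, path), the placement under the pods metadata, and the mount path are all assumptions for illustration, so check the secrets injection documentation for the exact syntax.

```yaml
# Hypothetical sketch: annotation key, JSON keys and mount path are assumptions.
spec:
  cassandra:
    metadata:
      pods:
        annotations:
          # ${POD_NAME} is replaced by the webhook with the actual pod name,
          # so each pod mounts its own keystore secret.
          k8ssandra.io/inject-secret: '[{"name": "${POD_NAME}-inode-cert-keystore", "path": "/mnt/keystore"}]'
```

The single quotes around the annotation value keep YAML from interpreting the JSON array, and the placeholder is left unexpanded until the webhook processes the pod.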
Metrics improvements
The Cassandra metrics exposed by Vector in the cassandra_metrics component have been augmented with a namespace label. This is necessary to distinguish metrics from clusters sharing the same name but running in separate namespaces.
We added new metrics that expose streaming operations and sstables operations (limited to compactions at the moment):
org_apache_cassandra_metrics_extended_compaction_stats_completed
org_apache_cassandra_metrics_extended_compaction_stats_total
org_apache_cassandra_metrics_extended_streaming_total_files_to_receive
org_apache_cassandra_metrics_extended_streaming_total_files_received
org_apache_cassandra_metrics_extended_streaming_total_size_to_receive
org_apache_cassandra_metrics_extended_streaming_total_size_received
org_apache_cassandra_metrics_extended_streaming_total_files_to_send
org_apache_cassandra_metrics_extended_streaming_total_files_sent
org_apache_cassandra_metrics_extended_streaming_total_size_to_send
org_apache_cassandra_metrics_extended_streaming_total_size_sent
New CassandraTasks
Four new CassandraTasks are now available as part of the upgrade to cass-operator v1.18:
compact
scrub
flush
garbagecollect
They are equivalent to the corresponding nodetool operations.
Scrub brings the following new job arguments:
no_validate
no_snapshot
skip_corrupted
Compact brings the following new job arguments:
split_output
start_token
end_token
These tasks allow orchestrating common maintenance operations that, until now, had to be executed through nodetool. They are also available as K8ssandraTasks for multi-DC orchestration.
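As an example, a compaction using one of the new arguments could be requested with a CassandraTask along these lines. The object names, namespace, and keyspace are placeholders, and the layout (jobs with a command and args, including keyspace_name) is assumed from the cass-operator CassandraTask API; verify it against the CRD shipped with your version.

```yaml
# Sketch of a CassandraTask running a compaction on one datacenter.
apiVersion: control.k8ssandra.io/v1alpha1
kind: CassandraTask
metadata:
  name: compact-dc1                 # placeholder name
  namespace: k8ssandra-operator     # placeholder namespace
spec:
  datacenter:
    name: dc1                       # target CassandraDatacenter
    namespace: k8ssandra-operator
  jobs:
    - name: compact-my-keyspace     # placeholder job name
      command: compact
      args:
        keyspace_name: my_keyspace  # placeholder keyspace
        split_output: true          # one of the new compact arguments
```

The scrub, flush, and garbagecollect commands would follow the same shape, with the scrub-specific arguments (no_validate, no_snapshot, skip_corrupted) placed under args.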
Update now
As usual, we encourage all K8ssandra users to upgrade to v1.10.0 in order to get the latest features and improvements.
Let us know what you think of K8ssandra-operator by joining us on the K8ssandra Discord or K8ssandra Forum today.