mirror of
https://github.com/scylladb/scylladb.git
synced 2026-04-26 19:35:12 +00:00
In theory we shouldn't have empty keys in the database, as we validate all keys that enter the database via CQL with `validation::validate_cql_keys()`, which rejects empty keys. In this context, empty means a single-component key whose only component is empty. Yet recently we've seen empty keys appear in a cluster and wreak havoc on it: they cause the memtable flush to fail, because the sstable summary rejects the empty key. This leads to an infinite loop, where Scylla keeps retrying the memtable flush and failing.

The immediate consequence is that the node cannot be shut down gracefully. The indirect consequence is possible data loss: commitlog files cannot be replayed, because replay just re-inserts the empty key into the memtable and the infinite flush retry cycle starts all over again. A workaround is to move the problematic commitlog files away, allowing the node to start up. This can, however, lead to data loss if multiple replicas had to move away commitlogs that contain the same data.

To prevent the node from getting into an unusable state and the subsequent data loss, extend the existing defenses against invalid (empty) keys to the commitlog replay, which will now ignore them during replay.

Fixes: #6106

* denesb/empty-keys/v5:
  commitlog_replayer: ignore entries with invalid keys
  test: lib/sstable_utils: add make_keys_for_shard
  validation: add is_cql_key_invalid()
  validation: validate_cql_key(): make key parameter a `partition_key_view`
  partition_key_view: add validate method
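
For illustration only, the idea of skipping invalid keys during replay can be sketched as follows. This is a minimal standalone model, not the code in this series: `key`, `entry`, `replay_stats`, `is_empty_key()` and `replay()` are hypothetical names, and the memtable is modelled as a plain vector.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a partition key: an ordered list of components.
struct key {
    std::vector<std::string> components;
};

// "Empty" in the sense used above: exactly one component, and it is empty.
bool is_empty_key(const key& k) {
    return k.components.size() == 1 && k.components.front().empty();
}

// Hypothetical commitlog entry: a key plus an opaque payload.
struct entry {
    key k;
    std::string payload;
};

// Counters so the caller can report how many entries were ignored.
struct replay_stats {
    std::size_t applied = 0;
    std::size_t skipped = 0;
};

// Replays `log` into `memtable`, ignoring entries whose key is empty
// instead of re-inserting them and triggering the failing flush again.
replay_stats replay(const std::vector<entry>& log, std::vector<entry>& memtable) {
    replay_stats stats;
    for (const auto& e : log) {
        if (is_empty_key(e.k)) {
            ++stats.skipped;   // invalid key: drop the entry, keep replaying
            continue;
        }
        memtable.push_back(e);
        ++stats.applied;
    }
    // Every entry is either applied or skipped.
    assert(stats.applied + stats.skipped == log.size());
    return stats;
}
```

In this model a log containing one entry with an empty key and two entries with valid keys yields two applied entries and one skipped one, so the memtable never receives the key that the sstable summary would reject.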