This PR builds upon the PR for checksum validation (#20207) to further enhance scrub's corruption detection capabilities by validating digests as well. The digest (full checksum) is the checksum over the entire data, as opposed to per-chunk checksums, which apply to individual chunks. Until now, digests were not examined on any code path. This PR integrates digest checking into the compressed/checksummed data sources as an optional feature and enables it only through the validation path of the sstable layer (`sstable::validate()`).

The validation path is used by the following tools:
* scrub in validate mode
* `sstable validate`

All other reads, including normal user reads, are unaffected by this change.

The PR consists of:
* Extensions to the compressed and checksummed data sources to support digest checking. The data sources receive the expected digest as a parameter and calculate the actual digest incrementally across multiple get() calls. The check happens on the get() call that reaches EOF and results in an exception if the digest is invalid. A digest check requires reading the whole file range. Therefore, a partial read or skip() is treated as an internal error.
* A new shareable digest component loaded on demand by the validation code. It has no lifecycle management.
* Grouping of old scrub/validate tests for compressed and uncompressed SSTables to reduce code duplication.
* scrub/validate tests for SSTables with valid checksums but invalid digests, and SSTables with no digests at all.
* scrub/validate tests with 3.x Cassandra SSTables to ensure compatibility.

Refs #19058.

New feature, no backport is needed.
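
For illustration only, the incremental digest check described in the first bullet can be sketched roughly as below. This is not the code from this PR: `digest_checking_source`, `adler32_digest`, `digest_mismatch` and `read_and_verify` are hypothetical names, the source reads from an in-memory buffer rather than a file, and Adler-32 is used merely as a stand-in digest algorithm.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string_view>

// Incremental Adler-32 (stand-in digest; an assumption for this sketch).
class adler32_digest {
    uint32_t _a = 1;
    uint32_t _b = 0;
public:
    void update(std::string_view buf) {
        for (unsigned char c : buf) {
            _a = (_a + c) % 65521;
            _b = (_b + _a) % 65521;
        }
    }
    uint32_t value() const { return (_b << 16) | _a; }
};

// Hypothetical exception type thrown when the digest does not match.
struct digest_mismatch : std::runtime_error {
    using std::runtime_error::runtime_error;
};

// Hypothetical data source that hands out the buffer in fixed-size chunks
// and folds every chunk into a running digest.
class digest_checking_source {
    std::string_view _data;
    size_t _pos = 0;
    size_t _chunk_size;
    uint32_t _expected;
    adler32_digest _actual;
    bool _checked = false;
public:
    digest_checking_source(std::string_view data, size_t chunk_size, uint32_t expected_digest)
        : _data(data), _chunk_size(chunk_size), _expected(expected_digest) {
        assert(chunk_size > 0);
    }

    // Returns the next chunk, or an empty view once everything was consumed.
    // The digest is compared on the call that reaches the end of the data.
    std::string_view get() {
        auto chunk = _data.substr(_pos, _chunk_size);
        _pos += chunk.size();
        _actual.update(chunk);
        if (_pos == _data.size() && !_checked) {
            _checked = true;
            if (_actual.value() != _expected) {
                throw digest_mismatch("digest mismatch");
            }
        }
        return chunk;
    }

    // Skipping would leave bytes out of the digest, so it is an internal error.
    void skip(size_t) {
        throw std::logic_error("skip() is not supported with digest checking");
    }
};

// Reads the whole source; returns false if the digest check fails.
bool read_and_verify(std::string_view data, size_t chunk_size, uint32_t expected_digest) {
    digest_checking_source src(data, chunk_size, expected_digest);
    try {
        while (!src.get().empty()) {}
        return true;
    } catch (const digest_mismatch&) {
        return false;
    }
}
```

In this sketch, a caller that stops reading before the end never triggers the comparison, which is why the description above treats partial reads and skip() as internal errors rather than as valid usage.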
Closes scylladb/scylladb#20720

* github.com:scylladb/scylladb:
  test: Test scrub/validate with SSTables from Cassandra
  compaction: Make quarantine optional for perform_sstable_scrub()
  test: Make random schema optional in scrub_test_framework
  test: Add tests for invalid digests
  test: Merge scrub/validate tests for compressed and uncompressed cases
  sstables: Verify digests on validation path
  sstables: Check if digest component exists
  sstables: Add digest in the SSTable components
  sstables: Add digest check in compressed data source
  sstables: Add digest check in checksummed data source
Scylla in-source tests.
For details on how to run the tests, see docs/dev/testing.md
Shared C++ utilities and libraries are in lib/; shared Python code is in pylib/.
alternator - Python tests which connect to a single server and use the DynamoDB API
unit, boost, raft - unit tests in C++
cql-pytest - Python tests which connect to a single server and use CQL
topology* - tests that set up clusters and add/remove nodes
cql - approval tests that use CQL and pre-recorded output
rest_api - tests for Scylla REST API Port 9000
scylla-gdb - tests for scylla-gdb.py helper script
nodetool - tests for C++ implementation of nodetool
If you can use an existing folder, consider adding your test to it. New folders should be used for new large categories/subsystems, or when the test environment is significantly different from some existing suite, e.g. you plan to start scylladb with different configuration, and you intend to add many tests and would like them to reuse an existing Scylla cluster (clusters can be reused for tests within the same folder).
To add a new folder, create a new directory, then copy a suite.ini
from an existing folder into it and edit it.