…played
Currently, if flushing hints falls within the repair cache timeout, then the flush_time is set to batchlog_manager::_last_replay. _last_replay is updated on each replay, even if some batches weren't replayed. Due to that, we risk the data resurrection.
Update _last_replay only if all batches were replayed.
Fixes: https://github.com/scylladb/scylladb/issues/24415.
Needs backport to all live versions.
- (cherry picked from commit 4d0de1126f)
- (cherry picked from commit e3dcb7e827)
Parent PR: #26793Closesscylladb/scylladb#27092
* github.com:scylladb/scylladb:
test: extend test_batchlog_replay_failure_during_repair
db: batchlog_manager: update _last_replay only if all batches were replayed
The API contract in partition_version.hh states that when dealing with
evictable entries, a real cache tracker pointer has to be passed to all
methods that ask for it. The nonpopulating reader violates this, passing
a nullptr to the snapshot. This was observed to cause a crash when a
concurrent cache read accessed the snapshot with the null tracker.
A reproducer is included which fails before and passes after the fix.
Fixes: #26847Closesscylladb/scylladb#28163
(cherry picked from commit a53f989d2f)
Closesscylladb/scylladb#28278
A replaced node may have pending operation on it. The replace operation
will move the node into the 'left' state and the request will never be
completed. More over the code does not expect left node to have a
request. It will try to process the request and will crash because the
node for the request will not be found.
The patch checks is the replaced node has peening request and completes
it with failure. It also changes topology loading code to skip requests
for nodes that are in a left state. This is not strictly needed, but
makes the code more robust.
Fixes#27990Closesscylladb/scylladb#28009
(cherry picked from commit bee5f63cb6)
Closesscylladb/scylladb#28177
Modify test_batchlog_replay_failure_during_repair to also check
that there isn't data resurrection if flushing hints falls within
the repair cache timeout.
(cherry picked from commit e3dcb7e827)
Currently, if flushing hints falls within the repair cache timeout,
then the flush_time is set to batchlog_manager::_last_replay.
_last_replay is updated on each replay, even if some batches weren't
replayed. Due to that, we risk the data resurrection.
Update _last_replay only if all batches were replayed.
Fixes: https://github.com/scylladb/scylladb/issues/24415.
(cherry picked from commit 4d0de1126f)
Fixes#27992
When doing a commit log oversized allocation, we lock out all other writers by grabbing
the _request_controller semaphore fully (max capacity).
We thereafter assert that the semaphore is in fact zero. However, due to how things work
with the bookkeep here, the semaphore can in fact become negative (some paths will not
actually wait for the semaphore, because this could deadlock).
Thus, if, after we grab the semaphore and execution actually returns to us (task schedule),
new_buffer via segment::allocate is called (due to a non-fully-full segment), we might
in fact grab the segment overhead from zero, resulting in a negative semaphore.
The same problem applies later when we try to sanity check the return of our permits.
Fix is trivial, just accept less-than-zero values, and take same possible ltz-value
into account in exit check (returning units)
Added whitebox (special callback interface for sync) unit test that provokes/creates
the race condition explicitly (and reliably).
Closesscylladb/scylladb#27998
(cherry picked from commit a7cdb602e1)
Closesscylladb/scylladb#28097
Call discover_staging_sstables in view_update_generator::start() instead
of in the constructor, because the constructor is called during
initialization before sstables are loaded.
The initialization order was changed in 5d1f74b86a and caused this
regression. It means the view update generator won't discover staging
sstables on startup and view updates won't be generated for them. It
also causes issues in sstable cleanup.
view_update_generator::start() is called in a later stage of the
initialization, after sstable loading, so do the discovery of staging
sstables there.
Fixesscylladb/scylladb#27956
(cherry picked from commit 5077b69c06)
Closesscylladb/scylladb#28091
Currently, we skip batch replay if less than batch_log_timeout passed
from the moment the batch was written. batch_log_timeout value can
be configured. If it is large, it won't be replayed for a long time.
If the tombstone will be GC'd before the batch is replayed, then we
risk the data resurrection.
To ensure safety we can skip only the batches that won't be GC'd.
In this patch we skip replay of the batches for which:
now() < written_at + min(timeout + propagation_delay)
repair_time is set as a start of batchlog replay, so at the moment
of the check we will have:
repair_time <= now()
So we know that:
repair_time < written_at + propagation_delay
With this condition we are sure that GC won't happen.
(cherry picked from commit e1b2180092)
Return a flag determining whether all the batches were sent successfully in
batchlog_manager::replay_all_failed_batches (batches skipped due to being
too fresh are not counted). Throw in repair_flush_hints_batchlog_handler
if not all batches were replayed, to ensure that repair_time isn't updated.
(cherry picked from commit 7f20b66eff)
batchlog_manager::replay_all_failed_batches skips batches that have
unknown or incorrect version. Next round will process these batches
again.
Such batches will probably be skipped everytime, so there is no point
in keeping them. Even if at some point the version becomes correct,
we should not replay the batch - it might be old and this may lead
to data resurrection.
(cherry picked from commit 904183734f)
When draining the view builder, we abort ongoing operations using the
view builder's abort source, which may cause them to fail with
abort_requested_exception or raft::request_aborted exceptions.
Since these failures are expected during shutdown, reduce the log level
in add_new_view from 'error' to 'debug' for these specific exceptions
while keeping 'error' level for unexpected failures.
Closesscylladb/scylladb#26297
(cherry picked from commit 6bc41926e2)
Closesscylladb/scylladb#27537
Fixes#24346
When reading, we check for each entry and each chunk, if advancing there
will hit EOF of the segment. However, IFF the last chunk being read has
the last entry _exactly_ matching the chunk size, and the chunk ending
at _exactly_ segment size (preset size, typically 32Mb), we did not check
the position, and instead complained about not being able to read.
This has literally _never_ happened in actual commitlog (that was replayed
at least), but has apparently happened more and more in hints replay.
Fix is simple, just check the file position against size when advancing
said position, i.e. when reading (skipping already does).
v2:
* Added unit test
Closesscylladb/scylladb#27236
(cherry picked from commit 59c87025d1)
Closesscylladb/scylladb#27343
Batches that fail on the initial send are retired later, until they
succeed. These retires happen with CL=ALL, regardless of what the
original CL of the batch was. This is unnecessarily strict. We tried to
follow Cassandra here, but Cassandra has a big caveat in their use of
CL=ALL for batches. They accept saving just a hint for any/all of the
endpoints, so a batch which was just logged in hints is good enough for
them.
We do not plan on replicating this usage of hints at this time, so as a
middle ground, the CL is changed to EACH_QUORUM.
Fixes: scylladb/scylladb#25432Closesscylladb/scylladb#26304
(cherry picked from commit d9c3772e20)
Closesscylladb/scylladb#26929
ScyllaDB offers the `compression` DDL property for configuring
compression per user table (compression algorithm and chunk size). If
not specified, the default compression algorithm is the LZ4Compressor
with a 4KiB chunk size (refer to the default constructor for
`compression_parameters`). The same default applies to system tables as
well.
Add a new configuration option to allow customizing the default for user
tables. Use the previously hardcoded default as the new option's default
value.
Note that the option has no effect on ALTER TABLE statements. An altered
table either inherits explicit compression options from the CQL
statement, or maintains its existing options.
Signed-off-by: Nikos Dragazis <nikolaos.dragazis@scylladb.com>
(cherry picked from commit e1d9c83406)
SSTable compression is currently configurable only per table, via the
`compression` property in CREATE/ALTER TABLE statements. This is
represented internally via the `compression_parameters` class. We plan
to offer the same options via the configuration as well, to make the
default compression method for user tables configurable.
This patch prepares the ground by making the `compression_parameters`
usable as a `config_file::named_value`, namely:
* Define an extraction operator (required by `boost::program_options`
for parsing the options from command line).
* Define a formatter (required by `named_value::operator()`).
* Define a template specialization for `config_type_for` (required by
`named_value` constructor).
* Define a yaml converter (required for parsing the options from
scylla.yaml).
Signed-off-by: Nikos Dragazis <nikolaos.dragazis@scylladb.com>
(cherry picked from commit a7e46974d4)
If there are pending mutations in the batchlog for a table that
has been dropped, we'll keep attempting to replay them but with
no success -- `db::no_such_column_family` exceptions will be thrown,
and we'll keep trying again and again.
To prevent that, we drop the batch in that case just like we do
in the case of a non-existing keyspace.
A reproducer test has been included in the commit. It fails without
the changes in `db/batchlog_manager.cc`, and it succeeds with them.
Fixesscylladb/scylladb#24806Closesscylladb/scylladb#26057
(cherry picked from commit 35f7d2aec6)
Closesscylladb/scylladb#26201
This change disables caching for raft log table due to the following reasons:
* Immediate reason is a deficiency in handling emerging range tombstones in the cache, which causes stalls.
* Long-term reason is that sequential reads from the raft log do not benefit from the cache, making it better to bypass it to free up space and avoid stalls.
Fixesscylladb/scylladb#26027Closesscylladb/scylladb#26031
(cherry picked from commit 2640b288c2)
Closesscylladb/scylladb#26074
The PRUNE MATERIALIZED VIEW statement is supposed to remove ghost rows from the
view. Ghost rows are rows in the view with no corresponding row in the base table.
Before this patch, only rows whose primary key columns of the base table had
different values than any of the base rows were treated as ghost rows by the PRUNE
statement. However, view rows which have a column in their primary key that's not
in the base primary can also be ghost rows if this column has a different value
than the base row with the same values of remaining primary key columns. That's
because these rows won't be deleted unless we change value of this column in the
base table to this specific value.
In this patch we add a check for this column in the PRUNE MATERIALIZED VIEW logic.
If this column isn't the same in the base table and the view, these rows are also
deleted.
Fixes https://github.com/scylladb/scylladb/issues/25655Closesscylladb/scylladb#25720
(cherry picked from commit 1f9be235b8)
Closesscylladb/scylladb#25956
Consider the following scenario:
- Current replica set is [A, B, C]
- write succeeds on [A, B], and a hint is logged for node C
- before the hint is replayed, D bootstraps and the token migrates from C to D
- hint is replayed to node C while D is pending, but it's too late, since streaming for that token is already done
- C is cleaned up, replayed data is lost, and D has a stale copy until next repair.
In the scenario we effectively fail to send the hint. This scenario is also more likely to happen with tablets,
as it can happen for every tablet migration.
This issue is particularly detrimental to materialized views. View updates use hints by default and a specific
view update may be sent to just one view replica (when a single base replica has a different row state due to
reordering or missed writes). When we lose a hint for such a view update, we can generate a persistent inconsistency
between the base and view - ghost rows can appear due to a lost tombstone and rows may be missing in the view due
to a lost row update. Such inconsistencies can't be fixed neither by repairing the view or the base table.
To handle this, in this patch we add the pending replicas to the list of targets of each hint, even if the original
target is still alive.
This will cause some updates to be redundant. These updates are probably unavoidable for now, but they shouldn't
be too common either. The scenarios for them are:
1. managing to send the hint to the source of a migrating replica before streaming that its token - the write will
arrive on the pending replica anyway in streaming
2. the hint target not being the source of the migration - if we managed to apply the original write of the hint to
the actual source of the migration, the pending replica will get it during streaming
3. sending the same hint to many targets at a similar time - while sending to each target, we'll see the same pending
replica for the hint so we'll send it multiple times
4. possible retries where even though the hint was successfully sent to the main target, we failed to send it to the
pending replica, so we need to retry the entire write
This patch handles both tablet migrations and tablet rebuilds. In the future, for tablet migrations, we can avoid
sending the hint to pending replias if the hint target is not the source fo the migration, which would allow us to
avoid the redundant writes 2 and 3. For rack-aware RF, this will be as simple as checking whether the replicas are
in the same rack.
We also add a test case reproducing the issue.
Co-Authored-By: Raphael S. Carvalho <raphaelsc@scylladb.com>
Fixes https://github.com/scylladb/scylladb/issues/19835Closesscylladb/scylladb#25590
(cherry picked from commit 10b8e1c51c)
Closesscylladb/scylladb#25882
Fixes#25683
Once a table drop is complete, there should be no reason to retain
truncation records for it, as any replay should skip mutations
anyway (no CF), and iff we somehow resurrect a dropped table,
this replay-resurrected data is the least problem anyway.
Adds a prune phase to the startup drop_truncation_rp_records run,
which ignores updating, and instead deletes records for non-existant
tables (which should patch any existing servers with lingering data
as well).
Also does an explicit delete of records on actual table DROP, to
ensure we don't grow this table more than needed even in long
uptime nodes.
Small unit test included.
Closesscylladb/scylladb#25699
(cherry picked from commit bc20861afb)
Closesscylladb/scylladb#25815
Before these changes, the logs in hinted handoff often didn't provide
crucial information like the identifier of the node that hints were
being sent to. Also, some of the logs were misleading and referred to
other places in the code than the one where an exception or some other
situation really occurred.
We modify those logs, extending them by more valuable information
and fixing existing issues. What's more, all of the logs in
`hint_endpoint_manager` and `hint_sender` follow a consistent format
now:
```
<class_name>[<destination host ID>]:<function_name>: <message>
```
This way, we should always have AT LEAST the basic information.
Fixes scylladb/scylladb#25466
Backport:
There is no risk in backporting these changes. They only have
impact on the logs. On the other hand, they might prove helpful
when debugging an issue in hinted handoff.
- (cherry picked from commit 2327d4dfa3)
- (cherry picked from commit d7bc9edc6c)
- (cherry picked from commit 6f1fb7cfb5)
Parent PR: #25470Closesscylladb/scylladb#25538
* github.com:scylladb/scylladb:
db/hints: Add new logs
db/hints: Adjust log levels
db/hints: Improve logs
This PR implements solution proposed in scylladb/scylladb#24481
Instead of terminating connections immediately, the shutdown now proceeds in two stages: first closing the receive (input) side to stop new requests, then waiting for all active requests to complete before fully closing the connections.
The updated shutdown process is as follows:
1. Initial Shutdown Phase
* Close the accept gate to block new incoming connections.
* Abort all accept() calls.
* For all active connections:
* Close only the input side of the connection to prevent new requests.
* Keep the output side open to allow responses to be sent.
2. Drain Phase
* Wait for all in-progress requests to either complete or fail.
3. Final Shutdown Phase
* Fully close all connections.
Fixes scylladb/scylladb#24481
- (cherry picked from commit 122e940872)
- (cherry picked from commit 3848d10a8d)
- (cherry picked from commit 3610cf0bfd)
- (cherry picked from commit 27b3d5b415)
- (cherry picked from commit 061089389c)
- (cherry picked from commit 7334bf36a4)
- (cherry picked from commit ea311be12b)
- (cherry picked from commit 4f63e1df58)
Parent PR: #24499Closesscylladb/scylladb#25519
* github.com:scylladb/scylladb:
test: Set `request_timeout_on_shutdown_in_seconds` to `request_timeout_in_ms`, decrease request timeout.
generic_server: Two-step connection shutdown.
transport: consmetic change, remove extra blanks.
transport: Handle sleep aborted exception in sleep_until_timeout_passes
generic_server: replace empty destructor with `= default`
generic_server: refactor connection::shutdown to use `shutdown_input` and `shutdown_output`
generic_server: add `shutdown_input` and `shutdown_output` functions to `connection` class.
test: Add test for query execution during CQL server shutdown
The gossiper can call `storage_service::on_change` frequently (see scylladb/scylla-enterprise#5613), which may cause high CPU load and even trigger OOMs or related issues.
This PR adds a temporary cache for `system.peers` to resolve host_id -> ip without hitting storage on every call. The cache is short-lived to handle the unlikely case where `system.peers` is updated directly via CQL.
This is a temporary fix; a more thorough solution is tracked in https://github.com/scylladb/scylladb/issues/25620.
Fixes scylladb/scylladb#25660
backport: this patch needs to be backported to all supported versions (2025.1/2/3).
- (cherry picked from commit 91c633371e)
- (cherry picked from commit de5dc4c362)
- (cherry picked from commit 4b907c7711)
Parent PR: #25658Closesscylladb/scylladb#25766
* github.com:scylladb/scylladb:
storage_service: move get_host_id_to_ip_map to system_keyspace
system_keyspace: use peers cache in get_ip_from_peers_table
storage_service: move get_ip_from_peers_table to system_keyspace
Reimplemented the function to use the peers cache. It could be replaced
with get_ip_from_peers_table, but that would create a coroutine frame for
each call.
(cherry picked from commit 4b907c7711)
The storage_service::on_change method can be called quite often
by the gossiper, see scylladb/scylla-enterprise#5613. In this commit
we introduce a temporal cache for system.peers so that we don't have
to go to the storage each time we need to resolve host_id -> ip.
We keep the cache only for a small amount of time to handle the
(unlikely) scenario when the user wants to update system.peers table
from CQL.
Fixesscylladb/scylladb#25660
(cherry picked from commit de5dc4c362)
We plan to add a cache to get_ip_from_peers_table in upcoming commits.
It's more convenient to do this from system_keyspace, since the only two
methods that mutate system.peers (remove_endpoint and update_peers_info)
are already there.
(cherry picked from commit 91c633371e)
Fixes#25682
Refs scylla-enterprise#5580
If the truncation table is large in entries, we might create a
huge parallel execution, quite possibly consuming loads of resources
doing something quite trivial.
Limit concurrency to a small-ish number
Closesscylladb/scylladb#25678
(cherry picked from commit 2eccd17e70)
Closesscylladb/scylladb#25751
We're providing additional information in error messages when throwing
an exception related to data corruption: when a segment is truncated
and when it's content is invalid. That might prove helpful when debugging.
Closesscylladb/scylladb#25190
(cherry picked from commit 408b45fa7e)
Closesscylladb/scylladb#25461
Fixes#25709
If we have large allocations, spanning more than one segment, and
the internal segment references from lead to secondary are the
only thing keeping a segment alive, the implicit drop in
discard_unused_segments and orphan_all can cause a recursive call
to discard_unused_segments, which in turn can lead to vector
corruption/crash, or even double free of segment (iterator confusion).
Need to separate the modification of the vector (_segments) from
actual releasing of objects. Using temporaries is the easiest
solution.
To further reduce recursion, we can also do an early clear of
segment dependencies in callbacks from segment release (cf release).
Closesscylladb/scylladb#25719
(cherry picked from commit cc9eb321a1)
Closesscylladb/scylladb#25756
We're adding new logs in just a few places that may however prove
important when debugging issues in hinted handoff in the future.
(cherry picked from commit 6f1fb7cfb5)
Some of the logs could be clogging Scylla's logs, so we demote their
level to a lower one.
On the other hand, some of the logs would most likely not do that,
and they could be useful when debugging -- we promote them to debug
level.
(cherry picked from commit d7bc9edc6c)
Before these changes, the logs in hinted handoff often didn't provide
crucial information like the identifier of the node that hints were
being sent to. Also, some of the logs were misleading and referred to
other places in the code than the one where an exception or some other
situation really occurred.
We modify those logs, extending them by more valuable information
and fixing existing issues. What's more, all of the logs in
`hint_endpoint_manager` and `hint_sender` follow a consistent format
now:
```
<class_name>[<destination host ID>]:<function_name>: <message>
```
This way, we should always have AT LEAST the basic information.
(cherry picked from commit 2327d4dfa3)
When shutting down in `generic_server`, connections are now closed in two steps.
First, only the RX (receive) side is shut down. Then, after all ongoing requests
are completed, or a timeout happened the connections are fully closed.
Fixesscylladb/scylladb#24481
(cherry picked from commit ea311be12b)
Currently, `peers_table_read_fixup` removes rows with no `host_id`, but
not with null `host_id`. Null host IDs are known to appear in system
tables, for example in `system.cluster_status` after a failed bootstrap.
We better make sure we handle them properly if they ever appear in
`system.peers`.
This commit guarantees that null UUID cannot belong to
`loaded_endpoints` in `storage_service::join_cluster`, which in
particular ensures that we throw a runtime error when a user sets
`recovery_leader` to null UUID during the recovery procedure. This is
handled by the code verifying that `recovery_leader` belongs to
`loaded_endpoints`.
(cherry picked from commit 23f59483b6)
We change the type of the `recovery_leader` config parameter and
`gossip_config::recovery_leader` from sstring to UUID. `recovery_leader`
is supposed to store host ID, so UUID is a natural choice.
After changing the type to UUID, if the user provides an incorrect UUID,
parsing `recovery_leader` will fail early, but the start-up will
continue. Outside the recovery procedure, `recovery_leader` will then be
ignored. In the recovery procedure, the start-up will fail on:
```
throw std::runtime_error(
"Cannot start - Raft-based topology has been enabled but persistent group 0 ID is not present. "
"If you are trying to run the Raft-based recovery procedure, you must set recovery_leader.");
```
(cherry picked from commit 445a15ff45)
This reverts commit 04fb2c026d. 2025.3 got
the reduced threshold, but won't get many of the fixes the warning will
generate, leaving it very noisy. Better to avoid the noise for this release.
Fixes#24384.
Add a test of the batchlog manager replay loop applying failed batches
while some replica is down.
The test reproduces an issue where the batchlog manager tries to replay
a failed batch, doesn't get a response from some replica, and becomes
stuck.
It verifies that the batchlog manager can eventually recover from this
situation and continue applying failed batches.
(cherry picked from commit a9b476e057)
Set a timeout on writes of replayed batches by the batchlog manager.
We want to avoid having infinite timeout for the writes in case it gets
stuck for some unexpected reason.
The timeout is set to be high enough to allow any reasonable write to
complete.
(cherry picked from commit 74a3fa9671)
On shutdown of batchlog manager, abort all writes of replayed batches
by the batchlog manager.
To achieve this we set the appropriate write_type to BATCH, and on
shutdown cancel all write handlers with this type.
(cherry picked from commit 7150632cf2)
When replaying a batch mutation from the batchlog manager and sending it
to all replicas, create the write response handler as cancellable.
To achieve this we define a new wrapper type for batchlog mutations -
batchlog_replay_mutation, and this allows us to overload
create_write_response_handler for this type. This is similar to how it's
done with hint_wrapper and read_repair_mutation.
(cherry picked from commit fc5ba4a1ea)
Although valid for compact tables, non-full (or empty) clustering key prefixes are not handled for row keys when writing sstables. Only the present components are written, consequently if the key is empty, it is omitted entirely.
When parsing sstables, the parsing code unconditionally parses a full prefix.
This mis-match results in parsing failures, as the parser parses part of the row content as a key resulting in a garbage key and subsequent mis-parsing of the row content and maybe even subsequent partitions.
Introduce a new system table: `system.corrupt_data` and infrastructure similar to `large_data_handler`: `corrupt_data_handler` which abstracts how corrupt data is handled. The sstable writer now passes rows such corrupt keys to the corrupt data handler. This way, we avoid corrupting the sstables beyond parsing and the rows are also kept around in system.corrupt_data for later inspection and possible recovery.
Add a full-stack test which checks that rows with bad keys are correctly handled.
Fixes: https://github.com/scylladb/scylladb/issues/24489
The bug is present in all versions, has to be backported to all supported versions.
Closesscylladb/scylladb#24492
* github.com:scylladb/scylladb:
test/boost/sstable_datafile_test: add test for corrupt data
sstables/mx/writer: handler rows with empty keys
test/lib/cql_assertions: introduce columns_assertions
sstables: add corrupt_data_handler to sstables::sstables
tools/scylla-sstable: make large_data_handler a local
db: introduce corrupt_data_handler
mutation: introduce frozen_mutation_fragment_v2
mutation/mutation_partition_view: read_{clustering,static}_row(): return row type
mutation/mutation_partition_view: extract de-ser of {clustering,static} row
idl-compiler.py: generate skip() definition for enums serializers
idl: extract full_position.idl from position_in_partition.idl
db/system_keyspace: add apply_mutation()
db/system_keyspace: introduce the corrupt_data table
When writing large partitions, that is: partitions with size or row count
above a configurable threshold, ScyllaDB outputs a warning to the log:
WARN ... large_data - Writing large partition test/test: (1200031 bytes) to me-3glr_0xkd_54jip2i8oqnl7hk8mu-big-Data.db
This warning contains the information about the size of the partition,
but it does not contain the number of rows written. This can lead to
confusion because in cases where the warning was written because of the
row count being larger than the threshold, but the partition size is below
the threshold, the warning will only contain the partition size in bytes,
leading the user to believe the warning was output because of the
partition size, when in reality it was the row count that triggered the
warning. See #20125
This change adds a size_desc argument to cql_table_large_data_handler::try_record(),
which will contain the description of the size of the object written.
This method is used to output warnings for large partitions, row counts,
row sizes and cell sizes. This change does not modify the warning message
for row and cell sizes, only for partition size and row count.
The warning for large partitions and row counts will now look like this:
WARN ... large_data - Writing large partition test/test: (1200031 bytes/100001 rows) to me-3glr_0xkd_54jip2i8oqnl7hk8mu-big-Data.db
Closesscylladb/scylladb#22010
Similar to large_data_handler, this interface allows sstable writers to
delegate the handling of corrupt data.
Two implementations are provided:
* system_table_corrupt_data_handler - saved corrupt data in
system.corrupt_data, with a TTL=10days (non-configurable for now)
* nop_corrupt_data_handler - drops corrupt data
Allow applying writes in the form of mutations directly to the keyspace.
Allows lower-level mutation API to build writes. Advantageous if writes
can contain large cells that would otherwise possibly cause large
allocation warnings if used via the internal CQL API.