This reverts commit 866c96f536, reversing
changes made to 367633270a.
This change caused all longevities to fail, with a crash in parsing
scylla-metadata. The investigation is still ongoing, with no quick fix
in sight yet.
Fixes: #27496Closesscylladb/scylladb#27518
Large reserves in allocating_section can cause stalls. We already log
reserve increase, but we don't know which table it belongs to:
lsa - LSA allocation failure, increasing reserve in section 0x600009f94590 to 128 segments;
Allocating sections used for updating row cache on memtable flush are
notoriously problematic. Each table has its own row_cache, so its own
allocating_section(s). If we attached table name to those sections, we
could identify which table is causing problems. In some issues we
suspected system.raft, but we can't be sure.
This patch allows naming allocating_sections for the purpose of
identifying them in such log messages. I use abstract_formatter for
this purpose to avoid the cost of formatting strings on the hot path
(e.g. index_reader). And also to avoid duplicating strings which are
already stored elsewhere.
Fixes#25799Closesscylladb/scylladb#27470
This change replaces plain file_writer with crc32_digest_file_writer
for all SSTable components that should be checksummed. The resulting component
digests are stored in the sstable structure and later persisted to disk
as part of the Scylla metadata component during writer::consume_end_of_stream.
Partitions.db uses a piece of the murmur hash of the partition key
internally. The same hash is used to query the bloom filter.
So to avoid computing the hash twice (which involves converting the
key into a hashable linearized form) it would make sense to use
the same `hashed_key` for both purposes.
This is what we do in this patch. We extract the computation
of the `hashed_key` from `make_pk_filter` up to its parent
`sstable_set_impl::create_single_key_sstable_reader`,
and we pass this hash down both to `make_pk_filter` and
to the sstable reader. (And we add a pointer to the `hashed_key`
as a parameter to all functions along the way, to propagate it).
The number of parameters to `mx::make_reader` is getting uncomfortable.
Maybe they should be packed into some structs.
Partitions.db internally uses a piece of the partition key murmur
hash (the same hash which is used to compute the token and the
relevant bits in the bloom filter). Before this patch,
the Partitions.db writer computes the hash internally from the
`sstables::partition_key`.
That's a waste, because this hash is also computed for bloom filter
purposes just before that, in the owning sstable writer.
So in this patch we let the caller pass that hash here instead.
In previous patches we (hopefully) modified all users of
Index and Summary components so that they don't longer
need those components to exist. (And can use Partitions and
Rows components instead).
For efficiency, the cardinality of the bloom filter
(i.e. the number of partition keys which will be written into the sstable)
has to be known before elements are inserted into the filter.
In some cases (e.g. memtables flush) this number is known exactly.
But in others (e.g. repair) it can only be estimated,
and the estimation might be very wrong, leading to an oversized filter.
Because of that, some time ago we added a piece of logic
(ran after the sstable is written, but before it's sealed)
which looks at the actual number of written partitions,
compares it to the initial estimate (on which the size of the bloom
filter was based on), and if the difference is unacceptably large,
it rewrites the bloom filter from partition keys contained in Index.db.
But the idea to rebuild the bloom filters from index files
isn't going to work with BTI indexes, because they don't store
whole partition keys. If we want sstables which don't have Index.db
files, we need some other way to deal with oversized filters.
Partition keys can be recovered from Data.db,
but that would often be way too expensive.
This patch adds another way. We introduce a new component file,
TemporaryHashes. This component, if written at all,
contains the 16-byte murmur hash for every partition key, in order,
and can be used in place of Index to reconstruct the bloom filter.
(Our bloom filters are actually built from the set of murmur hashes of
partition keys. The first step of inserting a partition key into a
filter is hashing the key. Remembering the hashes is sufficient
to build the filter later, without looking at partition keys again.)
As of this patch, if the Index component is not being written,
we don't allocate and populate a bloom filter during the Data.db write.
Instead, we write the murmur hashes to TemporaryHashes, and only
later, after the Data write finishes, we allocate the optimal-size,
bloom filter, we read the hashes back from TemporaryHashes,
and we populate the filter with them.
That is suboptimal.
Writing the hashes to disk (or worse, to S3) and reading
them back is more expensive than building the bloom filter
during the main Data pass.
So ideally it should be avoided in cases where we know
in advance that the partition key count estimate is good enough.
(Which should be the case in flushes and compactions).
But we defer that to a future patch.
(Such a change would involve passing some flag to the sstable writer
if the cardinality estimate is trustworthy, and not creating
TemporaryHashes if the estimate is trustworthy).
In the previous patch we added code responsible
for creating and opening Partitions.db and Rows.db,
but we left those files empty.
In this patch, we populate the files using
`trie::bti_row_index_writer` and `trie::bti_partition_index_writer`.
Note: for the row index, we insert the same clustering blocks to
both indexes. The logic for choosing the size of the blocks
hasn't been changed in any way.
Much of this patch has to do with propagating the current range
tombstone down to all places which can start a new clustering block.
The reason we need that is that, for each clustering block,
BIG indexes store the range tombstone succeeding the block
(i.e. the range tombstone in between the given block and its successor)
BTI indexes store the range tombstone preceding the block,
(i.e. the range tombstone in between the given block and its predecessor).
So before the patch there's no code which looks at the current tombstone
when *starting* the block, only when *ending* the block.
This patch adds an extra copy for each `decorated_key`.
This is mostly unavoidable -- the BTI partition writer just
has to remember the key until its successor appears, to find the
common prefix. (We could avoid the key copy if the BTI isn't used, though.
We don't do that in this patch, we just let the copy happen).
This patch adds code responsible for creation and opening
of BTI index components (Rows.db, Partitions.db) when
BTI index writing is enabled.
(It is enabled if the cluster feature is enabled and the relevant
config entry permits it).
The files are empty for now, and are never read.
We will populate and use them in following patches.
There's a test (boost/sstable_compaction_test.cc::tombstone_purge_test)
which tests the value of `_stats.capped_tombstone_deletion_time`.
Before this patch, for "ms" sstables, `to_deletion_time` would have
be called twice for each written partition tombstone, which would fail
the test.
Since `_pi_write_m.partition_tombstone` always ends up being
converted from `tombstone` to `sstables::deletion_time` anyway,
let's just make it a `sstables::deletion_time` to begin with.
This will ensure that `to_deletion_time` will be able to be
only called once per partition tombstone.
This is yet another part in the BTI index project.
Overarching issue: https://github.com/scylladb/scylladb/issues/19191
Previous part: https://github.com/scylladb/scylladb/pull/25626
Next parts: introducing the new components, Partitions.db and Rows.db
This is the preparatory, uncontroversial part of https://github.com/scylladb/scylladb/pull/26039, which has been split out to a separate PR to make the main part (which, after a revision, will be posted later) smaller.
This series contains several small fixes and changes to BTI-related code added earlier, which either have to be done (i.e. propagating `reader_permit` to IO calls in index reads) or just deserved to be done. There's no single theme for the changes in this PR, refer to the individual commits for details.
The changes are for the sake of new and unreleased code. No backporting should be done.
Closesscylladb/scylladb#26075
* github.com:scylladb/scylladb:
sstables/mx/reader: remove mx::make_reader_with_index_reader
test/boost/bti_index_test: fix indentation
sstables/trie/bti_index_reader: in last_block_offset(), return offset from the beginning of partition, not file
sstables/trie: support reader_permit and trace_state properly
sstables/trie/bti_node_reader: avoid calling into `cached_file` if the target position is already cached
sstables/trie/bti_index_reader: get rid of the seastar::file wrapper in read_row_index_header
sstables/trie/bti_index_reader: support BYPASS CACHE
test/boost/bti_index_test: use read_bti_partitions_db_footer where appropriate
sstables/trie: change the signature of bti_partition_index_writer::finish
sstables/bti_index: improve signatures of special member functions in index writers
streaming/stream_transfer_task: coroutinize `estimate_partitions()`
types/comparable_bytes: add a missing implementation for date_type_impl
sstables: remove an outdated FIXME
storage_service: delete `get_splits()`
sstables/trie: fix some comment typos in bti_index_reader.cc
sstables/mx/writer: rename _pi_write_m.tomb to partition_tombstone
When `mx::make_reader` is used to construct an sstable reader,
it constructs its own index reader internally.
`mx::make_reader_with_index_reader` was originally added
as a variant of `mx::make_reader` which can be used to inject
a custom `index_reader` for testing that the mx Data reader
tolerates inexact indexes.
But now we want the ability to choose between BIG index readers
and BTI index readers if both are present. And at this point,
it seems to me that it makes sense to just construct the index
reader in the caller and pass it via argument to `mx::make_reader`
instead of putting the index selection inside it.
So that's what we do in this patch. And we remove `mx::make_reader_with_index_reader`
because it's no longer different from `mx::make_reader`.
We will use this type as the input to the BTI row index writer.
Since it will be implemented in other translation units,
the definition of the type has to be moved to a header.
A recent commit a0c29055e5 added
some trace printouts which print an std::reference_wrapper<>.
Apparently a formatter for this type was only added to fmt
in version 11.1.0, and it doesn't exist on earlier versions,
such as fmt 11.0.2 on Fedora 41.
Let's avoid requiring shiny-new versions of fmt. The workaround
is easy: just unwrap the reference_wrapper - print pr.get()
instead of just pr, and Scylla returns to building correctly on
Fedora 41.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Closesscylladb/scylladb#25228
Unlike the currently-used sstable index files, BTI indexes don't store the entire partition keys. They only store prefixes of decorated keys, up to the minimum length needed to differentiate a key from its neighbours in the sstable. This saves space.
However, it means that a BTI index query might be off by one partition (on each end of the queried partition range) with respect to the optimal Data position.
For example, if the index stores prefixes `a`, `b`, `c`,
the index has no way to know if the first index entry after key `bb`
is `b` (which might correspond to `ba` as well as `bc`), or `c`.
So the index reader conservatively has to pick the wider Data range, and the Data reader must ignore the superfluous partitions. (And there's no way around that.)
Before this patch, the sstable reader expects the index query to return an exact (optimal) Data range. This patch adjusts the logic of the sstable reader to allow for inexact ranges.
Note: the patch is more complicated that it looks. The logic of the sstable reader was already fairly hard to follow and this adds even more flags, more weird special states and more edge cases. I think I managed to write a decent test and it did find three or four edge cases I wouldn't have noticed otherwise. I think it should cover all the added logic, but I didn't verify code coverage. (Do our scripts for that even work nowadays)? Simplification ideas are welcome.
Preparation for new functionality, no backporting needed.
Closesscylladb/scylladb#25093
* github.com:scylladb/scylladb:
sstables/index_reader: weaken some exactness guarantees in abstract_index_reader
test/boost: add a test for inexact index lookups
sstables/mx/reader: allow passing a custom index reader to the constructor
sstables/index_reader: remove advance_to
sstables/mx/reader: handle inexact lookups in `advance_context()`
sstables/mx/reader: handle inexact lookups in `advance_to_next_partition()`
sstables/index_reader: make the return value of `get_partition_key` optional
sstables/mx/reader: handle "backward jumps" in forward_to
sstables/mx/reader: filter out partitions outside the queried range
sstables/mx/reader: update _pr after `fast_forward_to`
`advance_context()` needs an ability to advance the index to
the partition immediately following the reader's current partition.
For this, it uses `abstract_index_reader::advance_to(dht::ring_position_view)`
But BTI (and any index format which stores only the prefixes of keys
instead of whole keys) can't implement `advance_to` with its current
semantics. The Data position returned by the index for a generic
`advance_to` might be off by one partition.
E.g. if the index stores prefixes `a`, `b`, `c`,
the index has no way to know if the first entry after `bb`
is `b` (which might correspond to `ba` as well as `bc`), or `c`.
However, BTI can be used exactly if the partition is known to
be present in the sstable. (In the above example, if `bb` is known
to be present in the sstable, then it must correspond to `b`.
So the index can reliably advance to `bb` or the first partition after it).
And this is enough for `advance_context()`, because the
current partition is known to be present.
So we can replace the usage of `advance_to` with an equivalent API call
which only works with present keys, but in exchange is implementable
by BTI.
This makes `advance_to` unused, so we remove it.
`advance_to_next_partition()` needs an ability to advance the index to
the partition immediately following the reader's current partition.
For this, it uses `abstract_index_reader::advance_to(dht::ring_position_view)`
But BTI (and any index format which stores only the prefixes of keys
instead of whole keys) can't implement `advance_to` with its current
semantics. The Data position returned by the index for a generic
`advance_to` might be off by one partition.
E.g. if the index stores prefixes `a`, `b`, `c`,
the index has no way to know if the first entry after `bb`
is `b` (which might correspond to `ba` as well as `bc`), or `c`.
However, BTI can be used exactly if the partition is known to
be present in the sstable. (In the above example, if `bb` is known
to be present in the sstable, then it must correspond to `b`.
So the index can reliably advance to `bb` or the first partition after it).
And this is enough for `advance_to_next_partition()`, because the
current partition is known to be present.
So we can replace the usage of `advance_to` with an equivalent API call
which only works with present keys, but in exchange is implementable
by BTI.
BTI indexes only store encoded prefixes of partition keys,
not the whole keys. They can't reliably implement `get_partition_key`.
The index reader interface must be weakened and callers must
be adapted.
A bunch of code assumes that the Data.db stream can only go forward.
But with BTI indexes, if we perform an advance_to, the index can point to a position
which the data reader has already passed, since the index is inexact.
The logic of the data reader ensures that it has stopped
within the last partition range, or just immediately
after it, after reading the next partition key and
noticing that it doesn't belong to the range.
But forward_to can only be used with increasing ranges.
The start of the next range must be greater or equal to the
end of the previous range.
This means that the exact start of the next partition range
must be no earlier than:
1. Before the partition key just read by the data reader,
if the data reader is positioned immediately after a partition key.
2. The start of the first partition after the current data reader
position, if the data reader isn't positioned immediately after a
partition key.
So, if the index returns a position smaller than the current data
reader position, then:
1. If the reader is immediately after a partition key,
we have to reuse this partition key (since we can't go back
in the stream to read it again), and keep reading from
the current position.
2. Otherwise we can safely walk the index to the first partition
that lies no earlier than the current position.
The current index format is exact: it always returns the position of the
first partition in the queried partition range.
But we are about the add an index format where that doesn't have to be the case.
In BTI indexes, the lookup can be off by one partition sometimes. This patch prepares
the reader for that, by skipping the partitions which were read by the
data reader but don't belong to the queried range.
Note: as of this patch, only the "normal path" is ever used.
We add tests exercising these code paths later.
Also note that, as of this patch, actually stepping outside
the queried range would cause the reader to end up in a
state where the underlying parser is positioned right after
partition key immediately following the queried range.
If the reader was forwarded to that key in this state,
it would trip an assert, because the parser can't handle backward
jumps. We will add logic to handle this case in the next patch.
In later patches, we will prepare the reader for inexact index
implementations (ones which can return a Data file range that
includes some partitions before or after the queried range).
For that, we will need to filter out the partitions outside of the
range, and for that we need to remember the range. This is the
goal of this patch.
Note that we are storing a reference to an argument of
`fast_forward_to`. This is okay, because the contract
of `mutation_reader` specifies that the caller must
keep `pr` alive until the next `fast_forward_to`
or until the reader is destroyed.
As requested in #22102, #22103 and #22105 moved the files and fixed other includes and build system.
Moved files:
- clustering_bounds_comparator.hh
- keys.cc
- keys.hh
- clustering_interval_set.hh
- clustering_key_filter.hh
- clustering_ranges_walker.hh
- compound_compat.hh
- compound.hh
- full_position.hh
Fixes: #22102Fixes: #22103Fixes: #22105Closesscylladb/scylladb#25082
This is a refactoring patch in preparation for BTI indexes. It contains no functional changes (or at least it's not intended to).
In this patch, we modify the sstable readers to use index readers through a new virtual `abstract_index_readers` interface.
Later, we will add BTI indexes which will also implement this interface.
This interface contains the methods of `index_reader` which are needed by sstable readers, and leaves out all other methods, such as `current_clustered_cursor`.
Not all methods of this interface will be implementable by a trie-based index later. For example, a trie-based index can't provide a reliable `get_partition_key()`, because — unlike the current index — it only stores partition keys for partitions which have a row index. So the interface will have to be further restricted later. We don't do that in this patch because that will require changes to sstable reader logic, and this patch is supposed to only include cosmetic changes.
No backports needed, this is a preparation for new functionality.
Closesscylladb/scylladb#25000
* github.com:scylladb/scylladb:
sstables: add sstable::make_index_reader() and use where appropriate
sstables/mx: in readers, use abstract_index_reader instead of index_reader
sstables: in validate(), use abstract_index_reader instead of index_reader where possible
test/lib/index_reader_assertions: accept abstract_index_reader instead of index_reader
sstables/index_reader: introduce abstract_index_reader
sstables/index_reader: extract a prefetch_lower_bound() method
If we add multiple index implementations, users of index readers won't
easily know which concrete index reader type is the right one to construct.
We also don't want pieces of code to depend on functionality specific to
certain concrete types, if that's not necessary.
So instead of constructing the readers by themselves, they can use a helper
function, which will return an abstract (virtual) index reader.
This patch adds such a function, as a method of `sstable`.
After we add a second index implementation, we will probably want to
adjust validate() to work with either implementation.
Some validations will be format-specific, but some will be common.
For now, let's use abstract_index_reader for the validations which
can be done through that interface, and let's have downcast-specific
codepaths for the others.
Note: we change a `get_data_file_position()` call to `data_file_positions().start`.
The call happens at the beginning of a partition, and at this points
these two expressions are supposed to be equivalent.
The sstable reader reaches directly for a `clustered_index_cursor`.
But a BTI index reader won't be able to implement
`clustered_index_cursor`, because a BTI index doesn't store
full clustering keys, only some trie-encoded prefixes.
So we want to weaken the dependency. Instead of reaching
for `clustered_index_cursor`, we add a method which expresses
our intent, and we let `index_reader` touch the cursor internally.
Refactor readers and sources to support coroutine usage in
preparation for integration with `make_data_or_index_source`.
Move coroutine-based member initialization out of constructors
where applicable, and defer initialization until first use.
Although valid for compact tables, non-full (or empty) clustering key prefixes are not handled for row keys when writing sstables. Only the present components are written, consequently if the key is empty, it is omitted entirely.
When parsing sstables, the parsing code unconditionally parses a full prefix.
This mis-match results in parsing failures, as the parser parses part of the row content as a key resulting in a garbage key and subsequent mis-parsing of the row content and maybe even subsequent partitions.
Introduce a new system table: `system.corrupt_data` and infrastructure similar to `large_data_handler`: `corrupt_data_handler` which abstracts how corrupt data is handled. The sstable writer now passes rows such corrupt keys to the corrupt data handler. This way, we avoid corrupting the sstables beyond parsing and the rows are also kept around in system.corrupt_data for later inspection and possible recovery.
Add a full-stack test which checks that rows with bad keys are correctly handled.
Fixes: https://github.com/scylladb/scylladb/issues/24489
The bug is present in all versions, has to be backported to all supported versions.
Closesscylladb/scylladb#24492
* github.com:scylladb/scylladb:
test/boost/sstable_datafile_test: add test for corrupt data
sstables/mx/writer: handler rows with empty keys
test/lib/cql_assertions: introduce columns_assertions
sstables: add corrupt_data_handler to sstables::sstables
tools/scylla-sstable: make large_data_handler a local
db: introduce corrupt_data_handler
mutation: introduce frozen_mutation_fragment_v2
mutation/mutation_partition_view: read_{clustering,static}_row(): return row type
mutation/mutation_partition_view: extract de-ser of {clustering,static} row
idl-compiler.py: generate skip() definition for enums serializers
idl: extract full_position.idl from position_in_partition.idl
db/system_keyspace: add apply_mutation()
db/system_keyspace: introduce the corrupt_data table
Although valid for compact tables, non-full (or empty) clustering key
prefixes are not handled for row keys when writing sstables. Only the
present components are written, consequently if the key is empty, it is
omitted entirely.
When parsing sstables, the parsing code unconditionally parses a full
prefix. This mis-match results in parsing failures, as the parser parses
part of the row content as a key resulting in a garbage key and
subsequent mis-parsing of the row content and maybe even subsequent
partitions.
Use the recently introduced corrupt_data_handler to handle rows with
such corrupt keys. This way, we avoid corrupting the sstables beyond
parsing and the rows are also kept around in system.corrupt_data for
later inspection and possible recovery.
So parse errors on corrupt SSTables don't result in crashes, instead
just aborting the read in process.
There are a lot of SCYLLA_ASSERT() usages remaining in sstables/. This
patch tried to focus on those usages which are in the read path. Some
places not only used on the read path may have been converted too, where
the usage of said method is not clear.
Remove `compressor::create()`. This enforces that compressors
are only created through the `sstable_compressor_factory`.
Unlike the synchronous `compressor::create()`, the factory will be able
to create dict-aware compressors.
SSTable readers and writers use `compressor` objects to compress and
decompress chunks of SSTable data files.
`compressor` objects are read-only, so only one of them is needed
for each SSTable. Before this commit, each reader and writer has
its own `compressor` object. This isn't necessary, but it's okay.
But later in this series it will stop being okay, because the creation
of a `compressor` will become an expensive cross-shard
operation (because it might require sharing a compression dictionary
from another shard). So we have to adjust the code so that there is
only once `compressor` per sstable, not one per reader/writer.
We stuff the ownership of this compressor into `sstable::compression`.
To make the ownership clear, we remove `compression_ptr` shared
pointers from readers and writers, and make them access the
compressor via the `sstable::compression` instead.
sstable features indicate that an sstable has some extension, or that
some bug was fixed. They allow us to know if we can rely on certain
properties in a read sstables.
Currently, sstable features are set early in the read path (when we
read the scylla metadata file) and very late in the write path
(when we write the scylla metadata file just before sealing the sstable).
However, we happen to read features before we set them in the write path -
when we resize the bloom filter for a newly written sstable we instantiate
an index reader, and that depends on some features. As a result,
we read a disengaged optional (for the scylla metadata component) as if
it was engaged. This somehow worked so far, but fails with libstdc++
hash table implementation.
Fix it by moving storage of the features to the sstable itself, and
setting it early in the write path.
Fixes#23484Closesscylladb/scylladb#23485
The class in question is a wrapper around output_stream that writes,
flushes and closes the stream in async context. For logging it also
keeps the component filename on board, and now it's good time to patch
it and keep the component_filename instead.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Similarly to previous patches -- mostly the result is used as log
argument. The remaining users include
- scylla sstable tool that dumps component names to json output
- API endpoint that returns component names to user
- tests
these are all good to explicitly convert component_names to strings.
There are few more places that expect strings instead of component name
objects. For now they also use fmt::to_string() explicitly, partially it
will be fixed later, mostly -- as future follow-ups.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Most of the method callers use it as log parameter. There are few more
places that push it to malformed_sstable_exception, which immediately
converts it to string, so this patch makes the exception be constructed
with the component_name either.
And there's one more place that passes this string to file_writer
constructor. For now, convert it to string explicitly, but next patches
will fix that place to use pure component_name too.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
There's a generic sstable::filename(component_type) method that returns
a file name for the given component. For "popular" components, namely
TOC, Data and Index there are dedicated sstable methods to get their
names. Fix existing callers of the generic method to use the former.
It's shorter, nicer and makes further patching simpler.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Drop it from files that obviously don't need it. Also kill some forward
declarations while at it.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Closesscylladb/scylladb#22979
now that we are allowed to use C++23. we now have the luxury of using
`std::ranges::stable_partition`.
in this change, we:
- replace `boost::range::stable_parition()` to
`std::ranges::stable_parition()`
- since `std::ranges::stable_parition()` returns a subrange instead of
an iterator, change the names of variables which were previously used
for holding the return value of `boost::range::stable_partition()`
accordingly for better readability.
- remove unused `#include` of boost headers
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closesscylladb/scylladb#21911
Reads which need sstable index were computing
column_values_fixed_lengths each time. This showed up in perf profile
for a sstable-read heavy workload, and amounted to about 1-2% of time.
Computing it involves type name parsing.
Avoid by using cached per-sstable mapping. There is already
sstable::_column_translation which can be used for this. It caches the
mapping for the least-recently used schema. Since the cursor uses the
mapping only for primary key columns, which are stable, any schema
will do, so we can use the last _column_translation. We only need to
make sure that it's always armed, so sstable loading is augmented with
arming with sstable's schema.
Also, fixes a potential use-after-free on schema in column_translation.
Closesscylladb/scylladb#21347
* github.com:scylladb/scylladb:
sstables: Fix potential use-after-free on column_translation::column_info::name
sstables: Avoid computing column_values_fixed_lengths on each read
Replace manual subrange advancement with the more concise and readable
`subrange.advance()` method. This change:
- Eliminates unnecessary subrange instance creation
- Improves code readability
- Reduces potential for unnecessary object allocation
- Leverages the built-in `advance()` method for cleaner iterator handling
The modification simplifies the iteration logic while maintaining the
same functional behavior.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closesscylladb/scylladb#21865