After a schema change, memtable and cache have to be upgraded to the new schema. Currently, they are upgraded (on the first access after a schema change) atomically, i.e. all rows of the entry are upgraded with one non-preemptible call. This is a one of the last vestiges of the times when partition were treated atomically, and it is a well known source of numerous large stalls.
This series makes schema upgrades gentle (preemptible). This is done by co-opting the existing MVCC machinery.
Before the series, all partition_versions in the partition_entry chain have the same schema, and an entry upgrade replaces the entire chain with a single squashed and upgraded version.
After the series, each partition_version has its own schema. A partition entry upgrade happens simply by adding an empty version with the new schema to the head of the chain. Row entries are upgraded to the current schema on-the-fly by the cursor during reads, and by the MVCC version merge ongoing in the background after the upgrade.
The series:
1. Does some code cleanup in the mutation_partition area.
2. Adds a schema field to partition_version and removes it from its containers (partition_snapshot, cache_entry, memtable_entry).
3. Adds upgrading variants of constructors and apply() for `row` and its wrappers.
4. Prepares partition_snapshot_row_cursor, mutation_partition_v2::apply_monotonically and partition_snapshot::merge_partition_versions for dealing with heterogeneous version chains.
5. Modifies partition_entry::upgrade to perform upgrades by extending the version chain with a new schema instead of squashing it to a single upgraded version.
Fixes#2577Closes#13761
* github.com:scylladb/scylladb:
test: mvcc_test: add a test for gentle schema upgrades
partition_version: make partition_entry::upgrade() gentle
partition_version: handle multi-schema snapshots in merge_partition_versions
mutation_partition_v2: handle schema upgrades in apply_monotonically()
partition_version: remove the unused "from" argument in partition_entry::upgrade()
row_cache_test: prepare test_eviction_after_schema_change for gentle schema upgrades
partition_version: handle multi-schema entries in partition_entry::squashed
partition_snapshot_row_cursor: handle multi-schema snapshots
partiton_version: prepare partition_snapshot::squashed() for multi-schema snapshots
partition_version: prepare partition_snapshot::static_row() for multi-schema snapshots
partition_version: add a logalloc::region argument to partition_entry::upgrade()
memtable: propagate the region to memtable_entry::upgrade_schema()
mutation_partition: add an upgrading variant of lazy_row::apply()
mutation_partition: add an upgrading variant of rows_entry::rows_entry
mutation_partition: switch an apply() call to apply_monotonically()
mutation_partition: add an upgrading variant of rows_entry::apply_monotonically()
mutation_fragment: add an upgrading variant of clustering_row::apply()
mutation_partition: add an upgrading variant of row::row
partition_version: remove _schema from partition_entry::operator<<
partition_version: remove the schema argument from partition_entry::read()
memtable: remove _schema from memtable_entry
row_cache: remove _schema from cache_entry
partition_version: remove the _schema field from partition_snapshot
partition_version: add a _schema field to partition_version
mutation_partition: change schema_ptr to schema& in mutation_partition::difference
mutation_partition: change schema_ptr to schema& in mutation_partition constructor
mutation_partition_v2: change schema_ptr to schema& in mutation_partition_v2 constructor
mutation_partition: add upgrading variants of row::apply()
partition_version: update the comment to apply_to_incomplete()
mutation_partition_v2: clean up variants of apply()
mutation_partition: remove apply_weak()
mutation_partition_v2: remove a misleading comment in apply_monotonically()
row_cache_test: add schema changes to test_concurrent_reads_and_eviction
mutation_partition: fix mixed-schema apply()