scylladb

Author	SHA1	Message	Date
Michał Chojnowski	5ad0846bff	view: fix range tombstone handling on flushes in view_updating_consumer View update routines accept `mutation` objects. But what comes out of staging sstable readers is a stream of mutation_fragment_v2 objects. To build view updates after a repair/streaming, we have to convert the fragment stream into `mutation`s. This is done by piping the stream to mutation_rebuilder_v2. To keep memory usage limited, the stream for a single partition might have to be split into multiple partial `mutation` objects. view_updating_consumer does that, but in an improper way -- when the split/flush happens inside an active range tombstone, the range tombstone isn't closed properly. This is illegal, and triggers an internal error. This patch fixes the problem by closing the active range tombstone (and reopening in the same position in the next `mutation` object). The tombstone is closed just after the last seen clustered position. This is not necessary for correctness -- for example we could delay all processing of the range tombstone until we see its end bound -- but it seems like the most natural semantics. Fixes #14503	2023-07-04 20:33:21 +02:00
Tomasz Grabiec	e48ec6fed3	db, storage_proxy: Drop mutation/frozen_mutation ::shard_of() dht::shard_of() does not use the correct sharder for tablet-based tables. Code which is supposed to work with all kinds of tables should use erm::get_sharder().	2023-06-21 00:58:24 +02:00
Tomasz Grabiec	87b4606cd6	Merge 'atomic_cell: compare value last' from Benny Halevy Currently, when two cells have the same write timestamp and both are alive or expiring, we compare their value first, before checking if either of them is expiring and if both are expiring, comparing their expiration time and ttl value to determine which of them will expire later or was written later. This was based on an early version of Cassandra. However, the Cassandra implementation rightfully changed in `e225c88a65` ([CASSANDRA-14592](https://issues.apache.org/jira/browse/CASSANDRA-14592)), where the cell expiration is considered before the cell value. To summarize, the motivation for this change is threefold: 1. Cassandra compatibility 2. Prevent an edge case where a null value is returned by select query when an expired cell has a larger value than a cell with later expiration. 3. A generalization of the above: value-based reconciliation may cause select query to return a mixture of upserts, if multiple upserts use the same timestamp but have different expiration times. If the cell value is considered before expiration, the select result may contain cells from different inserts, while reconciling based on the expiration times will choose cells consistently from either upsert, as all cells in the respective upsert will carry the same expiration time. Fixes #14182 Also, this series: - updates dml documentation - updates internal documentation - updates and adds unit tests and cql pytest reproducing #14182 Closes #14183 * github.com:scylladb/scylladb: docs: dml: add update ordering section cql-pytest: test_using_timestamp: add tests for rewrites using same timestamp mutation_partition: compare_row_marker_for_merge: consider ttl in case expiry is the same atomic_cell: compare_atomic_cell_for_merge: update and add documentation compare_atomic_cell_for_merge: compare value last for live cells mutation_test: test_cell_ordering: improve debuggability	2023-06-20 12:11:48 +02:00
Benny Halevy	0aa13f70eb	mutation_partition: compare_row_marker_for_merge: consider ttl in case expiry is the same As in compare_atomic_cell_for_merge, we want to consider the row marker ttl for ordering, in case both are expiring and have the same expiration time. This was missed in `a57c087c89` and `a085ef74ff`. With that in mind, add documentation to compare_row_marker_for_merge and a mutual note to both functions about their equivalence. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-06-20 10:10:39 +03:00
Benny Halevy	6717e45ff0	atomic_cell: compare_atomic_cell_for_merge: update and add documentation Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-06-20 10:10:39 +03:00
Benny Halevy	761d62cd82	compare_atomic_cell_for_merge: compare value last for live cells Currently, when two cells have the same write timestamp and both are alive or expiring, we compare their value first, before checking if either of them is expiring and if both are expiring, comparing their expiration time and ttl value to determine which of them will expire later or was written later. This was changed in CASSANDRA-14592 for consistency with the preference for dead cells over live cells, as expiring cells will become tombstones at a future time and then they'd win over live cells with the same timestamp, hence they should win also before expiration. In addition, comparing the cell value before expiration can lead to unintuitive corner cases where rewriting a cell using the same timestamp but different TTL may cause scylla to return the cell with null value if it expired in the meantime. Also, when multiple columns are written using two upserts using the same write timestamp but with different expiration, selecting cells by their value may return a mixed result where each cell is selected individually from either upsert, by picking the cells with the largest values for each column, while using the expiration time to break the tie will lead to a more consistent result where a set of cells from only one of the upserts will be selected. Fixes scylladb/scylladb#14182 Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-06-20 10:10:39 +03:00
Michał Chojnowski	4f73a28174	mutation: mutation_cleaner: add pause() In unit tests, we would want to delay the merging of some MVCC versions to test the transient scenarios with multiple versions present. In many cases this can be done by holding snapshots to all versions. But sometimes (i.e. during schema upgrades) versions are added and scheduled for merge immediately, without a window for the test to grab a snapshot to the new version. This patch adds a pause() method to mutation_cleaner, which ensures that no asynchronous/implicit MVCC version merges happen within the scope of the call. This functionality will be used by a test added in an upcoming patch.	2023-06-19 22:50:43 +02:00
Pavel Emelyanov	66e43912d6	code: Switch to seastar API level 7 In that level no io_priority_class-es exist. Instead, all the IO happens in the context of current sched-group. File API no longer accepts prio class argument (and makes io_intent arg mandatory to impls). So the change consists of - removing all usage of io_priority_class - patching file_impl's inheritors to updated API - priority manager goes away altogether - IO bandwidth update is performed on respective sched group - tune-up scylla-gdb.py io_queues command The first change is huge and was made semi-automatically by: - grep io_priority_class \| default_priority_class - remove all calls, found methods' args and class' fields Patching file_impl-s is smaller, but also mechanical: - replace io_priority_class& argument with io_intent* one - pass intent to lower file (if applicable) Dropping the priority manager is: - git-rm .cc and .hh - sed out all the #include-s - fix configure.py and cmakefile The scylla-gdb.py update is a bit hairy -- it needs to use task queues list for IO classes names and shares, but to detect whether it should, it checks whether the "commitlog" group is present. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes #13963	2023-06-06 13:29:16 +03:00
Tomasz Grabiec	51e3b9321b	Merge 'mvcc: make schema upgrades gentle' from Michał Chojnowski After a schema change, memtable and cache have to be upgraded to the new schema. Currently, they are upgraded (on the first access after a schema change) atomically, i.e. all rows of the entry are upgraded with one non-preemptible call. This is one of the last vestiges of the times when partitions were treated atomically, and it is a well-known source of numerous large stalls. This series makes schema upgrades gentle (preemptible). This is done by co-opting the existing MVCC machinery. Before the series, all partition_versions in the partition_entry chain have the same schema, and an entry upgrade replaces the entire chain with a single squashed and upgraded version. After the series, each partition_version has its own schema. A partition entry upgrade happens simply by adding an empty version with the new schema to the head of the chain. Row entries are upgraded to the current schema on-the-fly by the cursor during reads, and by the MVCC version merge ongoing in the background after the upgrade. The series: 1. Does some code cleanup in the mutation_partition area. 2. Adds a schema field to partition_version and removes it from its containers (partition_snapshot, cache_entry, memtable_entry). 3. Adds upgrading variants of constructors and apply() for `row` and its wrappers. 4. Prepares partition_snapshot_row_cursor, mutation_partition_v2::apply_monotonically and partition_snapshot::merge_partition_versions for dealing with heterogeneous version chains. 5. Modifies partition_entry::upgrade to perform upgrades by extending the version chain with a new schema instead of squashing it to a single upgraded version.
Fixes #2577 Closes #13761 * github.com:scylladb/scylladb: test: mvcc_test: add a test for gentle schema upgrades partition_version: make partition_entry::upgrade() gentle partition_version: handle multi-schema snapshots in merge_partition_versions mutation_partition_v2: handle schema upgrades in apply_monotonically() partition_version: remove the unused "from" argument in partition_entry::upgrade() row_cache_test: prepare test_eviction_after_schema_change for gentle schema upgrades partition_version: handle multi-schema entries in partition_entry::squashed partition_snapshot_row_cursor: handle multi-schema snapshots partiton_version: prepare partition_snapshot::squashed() for multi-schema snapshots partition_version: prepare partition_snapshot::static_row() for multi-schema snapshots partition_version: add a logalloc::region argument to partition_entry::upgrade() memtable: propagate the region to memtable_entry::upgrade_schema() mutation_partition: add an upgrading variant of lazy_row::apply() mutation_partition: add an upgrading variant of rows_entry::rows_entry mutation_partition: switch an apply() call to apply_monotonically() mutation_partition: add an upgrading variant of rows_entry::apply_monotonically() mutation_fragment: add an upgrading variant of clustering_row::apply() mutation_partition: add an upgrading variant of row::row partition_version: remove _schema from partition_entry::operator<< partition_version: remove the schema argument from partition_entry::read() memtable: remove _schema from memtable_entry row_cache: remove _schema from cache_entry partition_version: remove the _schema field from partition_snapshot partition_version: add a _schema field to partition_version mutation_partition: change schema_ptr to schema& in mutation_partition::difference mutation_partition: change schema_ptr to schema& in mutation_partition constructor mutation_partition_v2: change schema_ptr to schema& in mutation_partition_v2 constructor mutation_partition: add 
upgrading variants of row::apply() partition_version: update the comment to apply_to_incomplete() mutation_partition_v2: clean up variants of apply() mutation_partition: remove apply_weak() mutation_partition_v2: remove a misleading comment in apply_monotonically() row_cache_test: add schema changes to test_concurrent_reads_and_eviction mutation_partition: fix mixed-schema apply()	2023-05-24 22:58:43 +02:00
Raphael S. Carvalho	38b226f997	Resurrect optimization to avoid bloom filter checks during compaction Commit `8c4b5e4283` introduced an optimization which only calculates max purgeable timestamp when a tombstone satisfies the grace period. Commit 'repair: Get rid of the gc_grace_seconds' inverted the order, probably under the assumption that getting grace period can be more expensive than calculating max purgeable, as repair-mode GC will look up into history data in order to calculate gc_before. This caused a significant regression on tombstone heavy compactions, where most tombstones are still newer than grace period. A compaction which used to take 5s, now takes 35s. 7x slower. The reason is simple, now calculation of max purgeable happens for every single tombstone (once for each key), even the ones that cannot be GC'ed yet. And each calculation has to iterate through (i.e. check the bloom filter of) every single sstable that doesn't participate in compaction. Flame graph makes it very clear that bloom filter is a heavy path without the optimization: 45.64% 45.64% sstable_compact sstable_compaction_test_g [.] utils::filter::bloom_filter::is_present With its resurrection, the problem is gone. This scenario can easily happen, e.g. after a deletion burst, and tombstones becoming only GC'able after they reach upper tiers in the LSM tree. Before this patch, a compaction can be estimated to have this # of filter checks: (# of keys containing any tombstone) * (# of uncompacting sstable runs[1]) [1] It's # of runs, as each key tends to overlap with only one fragment of each run. After this patch, the estimation becomes: (# of keys containing a GC'able tombstone) * (# of uncompacting runs). With repair mode for tombstone GC, the assumption, that retrieval of gc_before is more expensive than calculating max purgeable, is kept. We can revisit it later. But in the default mode, which is the "timeout" (i.e. gc_grace_seconds) one, we still benefit from the optimization of deferring the calculation until needed. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Closes #13908	2023-05-18 09:01:50 +03:00
Michał Chojnowski	9b0679c140	range_tombstone_change_generator: fix an edge case in flush() range_tombstone_change_generator::flush() mishandles the case when two range tombstones are adjacent and flush(pos, end_of_range=true) is called with pos equal to the end bound of the lesser-position range tombstone. In such case, the start change of the greater-position rtc will be accidentally emitted, and there won't be an end change, which breaks reader assumptions by ending the stream with an unclosed range tombstone, triggering an assertion. This is due to a non-strict inequality used in a place where strict inequality should be used. The modified line was intended to close range tombstones which end exactly on the flush position, but this is unnecessary because such range tombstones are handled by the last `if` in the function anyway. Instead, this line caused range tombstones beginning right after the flush position to be emitted sometimes. Fixes #12462 Closes #13906	2023-05-16 17:54:08 +02:00
Botond Dénes	646396a879	mutation/mutation_partition: append_clustered_row(): use on_internal_error() Instead of simply throwing an exception. With just the exception, it is impossible to find out what went wrong, as this API is very generic and is used in a variety of places. The backtrace printed by `on_internal_error()` will help zero in on the problem. Fixes: #13876 Closes #13883	2023-05-15 20:31:44 +03:00
Avi Kivity	97694d26c4	Merge 'reader_permit: minor improvements to resource consume/release safety' from Botond Dénes This PR contains some small improvements to the safety of consuming/releasing resources to/from the semaphore: * reader_permit: make the low-level `consume()/signal()` API private, making the only user (an RAII class) friend. * reader_resources: split `reset()` into `noexcept` and potentially throwing variant. * reader_resources::reset_to(): try harder to avoid calling `consume()` (when the new resource amount is smaller than the previous one) Closes #13678 * github.com:scylladb/scylladb: reader_permit: resource_units::reset_to(): try harder to avoid calling consume() reader_permit: split resource_units::reset() reader_permit: make consume()/signal() API private	2023-05-14 14:14:23 +03:00
Botond Dénes	ef7b7223d5	mutation/mutation_fragment_stream_validator.cc: rename logger This code inherited its logger variable name from mutation reader, rename it to better match its new context.	2023-05-09 07:55:13 -04:00
Botond Dénes	8681f3e997	readers,mutation: move mutation_fragment_stream_validator to mutation/ The validator classes have their definition in a header located in mutation/, while their implementation is located in a .cc in readers/mutation_reader.cc. This patch fixes this inconsistency by moving the implementation into mutation/mutation_fragment_stream_validator.cc. The only change is that the validator code gets a new logger instance (but the logger variable itself is left unchanged for now).	2023-05-09 07:55:13 -04:00
Nadav Har'El	5f37d43ee6	Merge 'compaction: validate: validate the index too' from Botond Dénes In addition to the data file itself. Currently validation avoids the index altogether, using the crawling reader which only relies on the data file and ignores the index+summary. This is because a corrupt sstable usually has a corrupt index too and using both at the same time might hide the corruption. This patch adds targeted validation of the index, independent of and in addition to the already existing data validation: it validates the order of index entries as well as whether the entry points to a complete partition in the data file. This will usually result in duplicate errors for out-of-order partitions: one for the data file and one for the index file. Fixes: #9611 Closes #11405 * github.com:scylladb/scylladb: test/cql-pytest: add test_sstable_validation.py test/cql-pytest: extract scylla_path,temp_workdir fixtures to conftest.py tools/scylla-sstables: write validation result to stdout sstables/sstable: validate(): delegate to mx validator for mx sstables sstables/mx/reader: add mx specific validator mutation/mutation_fragment_stream_validator: add validator() accessor to validating filter sstables/mx/reader: template data_consume_rows_context_m on the consumer sstables/mx/reader: move row_processing_result to namespace scope sstables/mx/reader: use data_consumer::proceed directly sstables/mx/reader.cc: extend namespace to end-of-file (cosmetic) compaction/compaction: remove now unused scrub_validate_mode_validate_reader() compaction/compaction: move away from scrub_validate_mode_validate_reader() tools/scylla-sstable: move away from scrub_validate_mode_validate_reader() test/boost/sstable_compaction_test: move away from scrub_validate_mode_validate_reader() sstables/sstable: add validate() method compaction/compaction: scrub_sstables_validate_mode(): validate sstables one-by-one compaction: scrub: use error messages from validator 
mutation_fragment_stream_validator: produce error messages in low-level validator	2023-05-08 17:14:26 +03:00
Avi Kivity	204521b9a7	Merge 'mutation/mutation_compactor: validate range tombstone change before it is moved' from Botond Dénes `e2c9cdb576` moved the validation of the range tombstone change to the place where it is actually consumed, so we don't attempt to pass purged or discarded range tombstones to the validator. In doing so however, the validate pass was moved after the consume call, which moves the range tombstone change, the validator having been passed a moved-from range tombstone. Fix this by moving the validation to before the consume call. Refs: #12575 Closes #13749 * github.com:scylladb/scylladb: test/boost/mutation_test: add sanity test for mutation compaction validator mutation/mutation_compactor: add validation level to compaction state query constructor mutation/mutation_compactor: validate range tombstone change before it is moved	2023-05-04 18:15:35 +03:00
Michał Chojnowski	eb5ccb7356	mutation_partition_v2: fix a minor bug in printer Commit `1cb95b8cf` caused a small regression in the debug printer. After that commit, range tombstones are printed to stdout, instead of the target stream. In practice, this causes range tombstones to appear in test logs out of order with respect to other parts of the debug message. Fix that. Closes #13766	2023-05-04 16:56:40 +03:00
Michał Chojnowski	80c8a6d0e6	partition_version: make partition_entry::upgrade() gentle Preceding commits in this patch series have extended the MVCC mechanism to allow for versions with different schemas in the same entry/snapshot, with on-the-fly and background schema upgrades to the most recent version in the chain. Given that, we can perform gentle schema upgrades by simply adding an empty version with the target schema to the front of the entry. This patch is intended to be the first and only behaviour-changing patch in the series. Previous patches added code paths for multi-schema snapshots, but never exercised them, because before this patch two different schemas within a single MVCC chain never happened. This patch makes it happen and thus exercises all the code in the series up until now. Fixes #2577	2023-05-04 03:35:15 +02:00
Michał Chojnowski	fe576f8f29	partition_version: handle multi-schema snapshots in merge_partition_versions Each partition_version is allowed to have a different schema now. As of this patch, all versions reachable from a snapshot/entry always have the same schema, but this will change in an upcoming patch. This commit prepares merge_partition_versions() for that. See code comments added in this patch for a detailed description. The design chosen in this patch requires adding a bit of information to partition_version. Due to alignment, it results in a regrettable waste of 8 bytes per partition. If we want, we can recover that in the future by squeezing the bit into some free bit in other fields, for example the highest or lowest bits of one of the pointers in partition_version. After this patch, MVCC should be prepared for replacing the atomic schema upgrade() of cache/memtable entries with a gentle upgrade().	2023-05-04 03:35:15 +02:00
Michał Chojnowski	152b4cd4c2	mutation_partition_v2: handle schema upgrades in apply_monotonically() To avoid reactor stalls during schema upgrades of memtable and cache entries, we want to do them interruptibly, not atomically. To achieve that, we want to reuse the existing gentle version merging mechanism. If we generalize version merging algorithms to handle `mutation_partition`s with different schemas, a schema upgrade will boil down simply to adding a new empty MVCC version with the new schema. In a previous patch, we already generalized the cursor to upgrade rows on the fly when reading. But we still have to generalize the other MVCC algorithm: the merging of superfluous mutation_partition_v2 objects. This patch modifies the two-version merging algorithm: apply_monotonically(). The next patch will update its caller, merge_partition_versions(), to make use of the updated apply_monotonically() properly.	2023-05-04 03:35:15 +02:00
Michał Chojnowski	0273101890	partition_version: remove the unused "from" argument in partition_entry::upgrade() partition_entry now contains a reference to its schema, so it doesn't have to be supplied by the caller anymore.	2023-05-04 02:37:30 +02:00
Michał Chojnowski	db6a35e3a8	partition_version: handle multi-schema entries in partition_entry::squashed An upcoming patch will enable multiple schemas within a single entry, after the entry is upgraded. partition_entry::squashed isn't prepared for that yet. This patch prepares it.	2023-05-04 02:37:30 +02:00
Michał Chojnowski	f4e853b32d	partiton_version: prepare partition_snapshot::squashed() for multi-schema snapshots When in upcoming patches we allow multiple schema versions within a single snapshot, reads will have to upgrade rows on the fly. This also applies to squashed()	2023-05-04 02:37:29 +02:00
Michał Chojnowski	a2e3cf7463	partition_version: prepare partition_snapshot::static_row() for multi-schema snapshots When in upcoming patches we allow multiple schema versions within a single snapshot, reads will have to upgrade rows on the fly. This also applies to the static row.	2023-05-04 02:37:29 +02:00
Michał Chojnowski	94e4dc3d8d	partition_version: add a logalloc::region argument to partition_entry::upgrade() The argument is currently unused, but will be further propagated to add_version() in an upcoming patch.	2023-05-04 02:37:29 +02:00
Michał Chojnowski	effd1fe70f	mutation_partition: add an upgrading variant of lazy_row::apply() A helper which will be used during upcoming changes to mutation_partition_v2::apply_monotonically(), which will extend it to merging versions with different schemas.	2023-05-04 02:37:29 +02:00
Michał Chojnowski	dce1b3e820	mutation_partition: add an upgrading variant of rows_entry::rows_entry A helper which will be used in upcoming commits.	2023-05-04 02:37:29 +02:00
Michał Chojnowski	2fe25a5aa2	mutation_partition: switch an apply() call to apply_monotonically()	2023-05-04 02:37:29 +02:00
Michał Chojnowski	a34c5e410f	mutation_partition: add an upgrading variant of rows_entry::apply_monotonically() A helper which will be used during upcoming changes to mutation_partition_v2::apply_monotonically(), which will extend it to merging versions with different schemas.	2023-05-04 02:37:29 +02:00
Michał Chojnowski	333e65447c	mutation_fragment: add an upgrading variant of clustering_row::apply() It will be used during upcoming changes in partition_snapshot_row_cursor to prepare it for multi-schema snapshots.	2023-05-04 02:37:29 +02:00
Michał Chojnowski	b488e4d541	mutation_partition: add an upgrading variant of row::row It will be used in upcoming commits. A factory function is used, rather than an actual constructor, because we want to delegate the (easy) case of equal schemas to the existing single-schema constructor. And that's impossible (at least without invoking a copy/move constructor) to do with only constructors.	2023-05-04 02:37:29 +02:00
Michał Chojnowski	caaf0bd6bf	partition_version: remove _schema from partition_entry::operator<< operator<< accepts a schema& and a partition_entry&. But since the latter now contains a reference to its schema inside, the former is redundant. Remove it.	2023-05-04 02:37:29 +02:00
Michał Chojnowski	f6e11c95e2	partition_version: remove the schema argument from partition_entry::read() partition_entry now contains a reference to its schema, so it no longer needs to be supplied by the caller.	2023-05-04 02:37:29 +02:00
Michał Chojnowski	d7d6449a8f	partition_version: remove the _schema field from partition_snapshot After adding a _schema field to each partition version, the field in partition_snapshot is redundant. It can be always recovered from the latest version. Remove it.	2023-05-04 02:37:29 +02:00
Michał Chojnowski	1d01a4a168	partition_version: add a _schema field to partition_version Currently, partition_version does not reference its schema. All partition_versions reachable from an entry/snapshot have the same schema, which is referenced in memtable_entry/cache_entry/partition_snapshot. To enable gentle schema upgrades, we want to use the existing background version merging mechanism. To achieve that, we will move the schema reference into partition_version, and we will allow neighbouring MVCC versions to have different schemas, and we will merge them on-the-fly during reads and persistently during background version merges. This way, an upgrade will boil down to adding a new empty version with the new schema. This patch adds the _schema field to partition_version and propagates the schema pointer to it from the version's containers (entry/snapshot). Subsequent patches will remove the schema references from the containers, because they are now redundant.	2023-05-04 02:37:29 +02:00
Michał Chojnowski	bc6a07a16a	mutation_partition: change schema_ptr to schema& in mutation_partition::difference Cosmetic change. See the preceding commit for details.	2023-05-04 02:37:29 +02:00
Michał Chojnowski	a70c5704df	mutation_partition: change schema_ptr to schema& in mutation_partition constructor Cosmetic change. See the preceding commit for details.	2023-05-04 02:37:29 +02:00
Michał Chojnowski	781514acfe	mutation_partition_v2: change schema_ptr to schema& in mutation_partition_v2 constructor We don't have a convention for when to pass `schema_ptr` and when to pass `const schema&` around. In general, IMHO the natural convention for such a situation is to pass the shared pointer if the callee might extend the lifetime of shared_ptr, and pass a reference otherwise. But we convert between them willy-nilly through shared_from_this(). While passing a reference to a function which actually expects a shared_ptr can make sense (e.g. due to the fact that smart pointers can't be passed in registers), the other way around is rather pointless. This patch takes one occurrence of that and modifies the parameter to a reference. Since enable_shared_from_this makes shared pointer parameters and reference parameters interchangeable, this is a purely cosmetic change.	2023-05-04 02:37:29 +02:00
Michał Chojnowski	021b345832	mutation_partition: add upgrading variants of row::apply() They will be used in upcoming patches which introduce incremental schema upgrades. Currently, these variants always copy cells during upgrade. This could be optimized in the future by adding a way to move them instead.	2023-05-04 02:37:29 +02:00
Michał Chojnowski	4214f8d0de	partition_version: update the comment to apply_to_incomplete() The comment refers to "other", but it means "pe". Fix that. The patch also adds a bit of context to the mutation_partition jargon ("evictability" and "continuity"), by reminding how it relates to the concrete abstractions: memtable and cache.	2023-05-04 02:37:29 +02:00
Michał Chojnowski	49a02b08de	mutation_partition_v2: clean up variants of apply() Most variants of apply() and apply_monotonically() in mutation_partition_v2 are leftovers from mutation_partition, and are unused. Thus they only add confusion and maintenance burden. Since we will be modifying apply_monotonically() in upcoming patches, let's clean them up, lest the variants become stale. This patch removes all unused variants of apply() and apply_monotonically() and "manually inlines" the variants which aren't used often enough to carry their own weight. In the end, we are left with a single apply_monotonically() and two convenience apply() helpers. The single apply_monotonically() accepts two schema arguments. This facility is unimplemented and unused as of this patch - the two arguments are always the same - but it will be implemented and used in later parts of the series.	2023-05-04 02:37:29 +02:00
Michał Chojnowski	88a0871729	mutation_partition: remove apply_weak() apply_weak is just an alias for apply(), and most of its variants are dead code. Get rid of it.	2023-05-04 02:37:29 +02:00
Michał Chojnowski	38d9241c30	mutation_partition_v2: remove a misleading comment in apply_monotonically() The comment suggests that the order of sentinel insertion is meaningful because of the resulting eviction order. But the sentinels are added to the tracker with the two-argument version of insert(), which inserts the second argument into the LRU right before the (more recent) first argument. Thus the eviction order of sentinels is decided explicitly, and it doesn't rely on insertion order.	2023-05-04 02:37:29 +02:00
Michał Chojnowski	fb8ae3cca4	mutation_partition: fix mixed-schema apply() In some mixed-schema apply helpers for tests, the source mutation is accidentally copied with the target schema. Fix that. Nothing seems to be currently affected by this bug; I found it when it was triggered by a new test I was adding.	2023-05-04 02:37:29 +02:00
Botond Dénes	60e1a23864	mutation/mutation_compactor: add validation level to compaction state query constructor Allowing the validation level to be customized by whoever creates the compaction state. Add a default value (the previous hardcoded level) to avoid the churn of updating all call sites.	2023-05-03 04:17:05 -04:00
Botond Dénes	be859db112	mutation/mutation_compactor: validate range tombstone change before it is moved `e2c9cdb576` moved the validation of the range tombstone change to the place where it is actually consumed, so we don't attempt to pass purged or discarded range tombstones to the validator. In doing so however, the validate pass was moved after the consume call, which moves the range tombstone change, the validator having been passed a moved-from range tombstone. Fix this by moving the validation to before the consume call. Refs: #12575	2023-05-03 03:07:31 -04:00
Botond Dénes	a6387477fa	mutation/mutation_fragment_stream_validator: add validator() accessor to validating filter	2023-05-02 09:42:42 -04:00
Botond Dénes	d3749b810a	mutation_fragment_stream_validator: produce error messages in low-level validator Currently, error messages for validation errors are produced in several places: * the high-level validator (which is built on the low-level one) * scrub compaction and validation compaction (scrub in validate mode) * scylla-sstable's validate operation We plan to introduce yet another place which would use the low-level validator and hence would have to produce its own error messages. To cut down all this duplication, centralize the production of error messages in the low-level validator, which now returns a `validation_result` object instead of bool from its validate methods. This object can be converted to bool (so it's backwards compatible) and also contains an error message if validation failed. In the next patches we will migrate all users of the low level validator (be that direct or indirect) to use the error messages provided in this result object instead of coming up with one themselves.	2023-05-02 09:42:41 -04:00
Kefu Chai	f5b05cf981	treewide: use defaulted operator!=() and operator==() in C++20, the compiler generates operator!=() if the corresponding operator==() is already defined, as the language now understands that the comparison is symmetric in the new standard. fortunately, our operator!=() is always equivalent to `! operator==()`, which matches the behavior of the default generated operator!=(). so, in this change, all `operator!=` are removed. in addition to the defaulted operator!=, C++20 also brings to us the defaulted operator==() -- the compiler is able to generate the operator==() as the member-wise lexicographical comparison. under some circumstances, this is exactly what we need. so, in this change, if the operator==() is also implemented as a lexicographical comparison of all member variables of the class/struct in question, it is implemented using the default generated one by removing its body and marking the function as `default`. moreover, if the class happens to have other comparison operators which are implemented using lexicographical comparison, the default generated `operator<=>` is used in place of the defaulted `operator==`. sometimes, we fail to mark the operator== with the `const` specifier, in this change, to fulfil the need of C++ standard, and to be more correct, the `const` specifier is added. also, to generate the defaulted operator==, the operand should be `const class_name&`, but it is not always the case, in the class of `version`, we use `version` as the parameter type, to fulfill the need of the C++ standard, the parameter type is changed to `const version&` instead. this does not change the semantic of the comparison operator. and is a more idiomatic way to pass non-trivial struct as function parameters. please note, because in C++20, both operator== and operator<=> are symmetric, some of the operators in `multiprecision` are removed. they are the symmetric form of another variant. if they were not removed, compiler would, for instance, find ambiguous overloaded operator '=='. this change is a cleanup to modernize the code base with C++20 features. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #13687	2023-04-27 10:24:46 +03:00


73 Commits