scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-05-23 16:22:15 +00:00

Author	SHA1	Message	Date
Botond Dénes	674d41e3e6	readers/mutation_source: s/make_reader_v2/make_mutation_reader/	2025-05-09 07:53:29 -04:00
Botond Dénes	7af0690762	mutation/mutation_compactor: drop v2 from compactor and related names	2025-05-09 07:53:29 -04:00
Botond Dénes	c29c696780	readers: mv from_mutations_v2.hh from_mutations.hh Completely mechanical change.	2025-04-16 04:46:08 -04:00
Botond Dénes	b104862702	tree: s/make_mutation_reader_from_mutations_v2/make_mutation_reader_from_mutations/s Completely mechanical change.	2025-04-16 04:46:07 -04:00
Botond Dénes	d67202972a	mutation/frozen_mutation: frozen_mutation_consumer_adaptor: fix end-of-partition handling This adaptor adapts a mutation reader pausable consumer to the frozen mutation visitor interface. The pausable consumer protocol allows the consumer to skip the remaining parts of the partition and resume the consumption with the next one. To do this, the consumer just has to return stop_iteration::yes from one of the consume() overloads for clustering elements, then return stop_iteration::no from consume_end_of_partition(). Due to a bug in the adaptor, this sequence leads to terminating the consumption completely -- so any remaining partitions are also skipped. This protocol implementation bug has user-visible effects when the only user of the adaptor -- read repair -- happens during a query which has limitations on the amount of content in each partition. There are two such queries: select distinct ... and select ... with partition limit. When converting the repaired mutation to a query result, these queries will trigger the skip sequence in the consumer and, due to the above described bug, will skip the remaining partitions in the results, omitting these from the final query result. This patch fixes the protocol bug, the return value of the underlying consumer's consume_end_of_partition() is now respected. A unit test is also added which reproduces the problem both with select distinct ... and select ... per partition limit. Follow-up work: * frozen_mutation_consumer_adaptor::on_end_of_partition() calls the underlying consumer's on_end_of_stream(), so when consuming multiple frozen mutations, the underlying's on_end_of_stream() is called for each partition. This is incorrect but benign. * Improve documentation of mutation_reader::consume_pausable(). Fixes: #20084 Closes scylladb/scylladb#23657	2025-04-10 13:19:57 +03:00
Botond Dénes	df09b3f970	replica/mutation_dump: don't assume cells are live Currently the dumper unconditionally extracts the value of atomic cells, assuming they are live. This doesn't always hold of course and attempting to get the value of a dead cell will lead to marshalling errors. Fix by checking is_live() before attempting to get the cell value. Fix for both regular and collection cells.	2025-04-08 00:11:36 -04:00
Botond Dénes	c2518cdf1a	mutation/mutation_compactor: copy key passed-in to consume_new_partition() This doesn't introduce additional work for single-partition queries: the key is copied anyway on consume_end_of_stream(). Multi-partition reads and compaction are not that sensitive to additional copy added. This change fixes a bug in the compacting_reader: currently the reader passes _last_uncompacted_partition_start.key() to the compactor's consume_new_partition(). When the compactor emits enough content for this partition, _last_uncompacted_partition_start is moved from to emit the partition start, this makes the key reference passed to the compaction corrupt (refer to moved-from value). This in turn means that subsequent GC checks done by the compactor will be done with a corrupt key and therefore can result in tombstone being garbage-collected while they still cover data elsewhere (data resurrection). The compacting reader is violating the API contract and normally the bug should be fixed there. We make an exception here because doing the fix in the mutation compactor better aligns with our future plans: * The fix simplifies the compactor (gets rid of _last_dk). * Prepares the way to get rid of the consume API used by the compactor.	2025-04-08 00:11:35 -04:00
Botond Dénes	a2d0d7b9a0	mutation: fold FragmentConsumer[V2] into FlattenedConsumer[V2] FragmentConsumer[V2] also has no direct users, so fold it into FlattenedConsumer[V2] as well. With this, FlattenedConsumer[V2] has a nice and simple definition, with a single nesting level required due to the return-type flexibility.	2025-03-18 09:24:49 -04:00
Botond Dénes	8768e2e08e	mutation: fold StreamedMutationConsumer[V2] into FlattenedConsumer[V2] No code uses StreamedMutationConsumer[V2] directly, so let's take this opportunity to reduce the jungle of consumer concepts.	2025-03-18 09:24:44 -04:00
Kefu Chai	a483ff8647	mutation: replace boost::upper_bound with std::ranges::upper_bound Reduces dependencies on boost/range. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#23119	2025-03-04 10:36:57 +03:00
Kefu Chai	6e4cb20a69	tree: implement boost::accumulate with std::ranges library Replace boost::accumulate() calls with std::ranges facilities. This change reduces external dependencies and modernizes the codebase. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#23062	2025-02-26 23:22:02 +02:00
Kefu Chai	6e4df57f97	mutation,test: replace boost::equal with std::ranges::equal to reduce third-party dependencies and modernize the codebase. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#22999	2025-02-26 14:27:42 +03:00
Kefu Chai	3cf0f71420	query-result-writer: reorder initialization to prevent use-after-move Reorder member variable initialization sequence to ensure `pw` is accessed before being moved. While the current use-after-move warning from clang-tidy is a false positive, this change: - Makes the initialization order more logical - Eliminates misleading static analysis warnings - Prevents potential future issues if class structure changes Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#22830	2025-02-17 13:45:35 +03:00
Li Bo	de8de50fb9	Remove redundant code in mutation_partition.cc Use the defined `cdef` variable. Closes scylladb/scylladb#22048	2025-02-15 20:32:22 +02:00
Nadav Har'El	bc7b5926d2	mv: support regular_column_transformation key columns in view In an earlier patch, we introduced regular_column_transformation, a new type of computed column that does a computation on a cell in regular column in the base and returns a potentially transformed cell (value or deletion, timestamp and ttl). In this patch, we wire the materialized view code to support this new kind of computed column that is usable as a materialized-view key column. This new type of computed column is not yet used in this patch - this will come in the next patch, where we will use it for Alternator GSIs. Before this patch, the logic of deciding when the view update needs to create a new row or delete a new one, and which timestamp and ttl to give to the new row, could depend on one (or two - in Alternator) cells read from base-table regular columns. In this patch, this logic is rewritten - the notion of "base table regular columns" is generalized to the notion of "updatable view key columns" - these are view key columns that an update may change - because they really are base regular columns, or a computed function of one (regular_column_transformation). In some sense, the new code is easier to understand - there is no longer a separate "compute_row_marker()" function, rather the top-level generate_update() is now in charge of finding the "updatable view key columns" and calculate the row marker (timestamp and ttl) as part of deciding what needs to be done. But unfortunately the code still has separate code paths for "collection secondary indexing", and also for old-style column_computation (basically, only token_column_computation). Perhaps in the future this can be further simplified. Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2025-02-06 09:59:49 +01:00
Ran Regev	edd56a2c1c	moved cache files to db As requested in #22097, moved the files and fixed other includes and build system. Fixes: #22097 Signed-off-by: Ran Regev <ran.regev@scylladb.com> Closes scylladb/scylladb#22495	2025-02-04 12:21:31 +03:00
Kefu Chai	7215d4bfe9	utils: do not include unused headers these unused includes were identified by clang-include-cleaner. after auditing these source files, all of the reports have been confirmed. please note, because quite a few source files relied on `utils/to_string.hh` to pull in the specialization of `fmt::formatter<std::optional<T>>`, after removing `#include <fmt/std.h>` from `utils/to_string.hh`, we have to include `fmt/std.h` directly. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2025-01-14 07:56:39 -05:00
Kefu Chai	353b522ca0	treewide: migrate from boost::adaptors::reversed to std::views::reverse now that we are allowed to use C++23. we now have the luxury of using `std::views::reverse`. - replace `boost::adaptors::transformed` with `std::views::transform` - remove unused `#include <boost/range/adaptor/reversed.hpp>` this change is part of our ongoing effort to modernize our codebase and reduce external dependencies where possible. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2025-01-07 13:22:00 +02:00
Kefu Chai	f1a0613a39	mutation: remove unused function `prefixed()` is a static function in `mutation_partition_v2.cc`. and this function is not used in this translation unit. so let's remove it. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#22006	2024-12-20 16:12:10 +02:00
Avi Kivity	f3eade2f62	treewide: relicense to ScyllaDB-Source-Available-1.0 Drop the AGPL license in favor of a source-available license. See the blog post [1] for details. [1] https://www.scylladb.com/2024/12/18/why-were-moving-to-a-source-available-license/	2024-12-18 17:45:13 +02:00
Kefu Chai	ce2f80c227	treewide: migrate from boost::make_iterator_range to ranges::subrange Replace boost::make_iterator_range() with std::ranges::subrange. This change improves code modernization and reduces external dependencies: - Replace boost::make_iterator_range() with std::ranges::subrange - Remove boost/range/iterator_range.hpp include - Improve iterator type detection in interval.hh using std::ranges::const_iterator_t<Range> This is part of ongoing efforts to modernize our codebase and minimize external dependencies. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#21787	2024-12-09 21:31:53 +02:00
Kefu Chai	48c8d24345	treewide: drop support for fmt < v10 since fedora 38 is EOL. and fedora 39 comes with fmt v10.0.0, also, we've switched to the build image based on fedora 40, which ships fmt-devel v10.2.1, there is no need to support fmt < 10. in this change, we drop support for fmt < 10. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#21847	2024-12-09 20:42:38 +02:00
Emil Maskovsky	8191e57036	treewide: fix annotations reported by GH checks Clean up the unnecessary includes reported by the GitHub checks that are polluting the PR diffs. The "utils/assert.hh" report should be actually fixed by the #21739, but as the usage of `SEASTAR_ASSERT()` is protected by the `SEASTAR_DEBUG` check it makes sense to include the header conditionally as well. Closes scylladb/scylladb#21817	2024-12-09 13:44:12 +03:00
Kefu Chai	61ae4a1c86	mutation: remove unused "#include"s This commit follows up on commit `f436edfa22`, which initially cleaned up unused #include directives in the "mutation" subdirectory. This change removes additional unused header files that were missed in the previous cleanup. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#21740	2024-12-04 15:36:33 +03:00
Benny Halevy	d5d4307a20	scylla-sstable: dump-summary: print also first and last tokens To help scylla-manager restore to map sstables to nodes or tablets, print also the tokens of the sstable first and last keys. For example, the json output will now look like this: ``` $ build/dev/scylla sstable dump-summary /tmp/scylla-344593/data/ks/t-52a92590afd011ef9b68ba86378ed63b/me-3glp_0tm9_00uv52doobo0bvk2t7-big-Data.db \| jq { "sstables": { "/tmp/scylla-344593/data/ks/t-52a92590afd011ef9b68ba86378ed63b/me-3glp_0tm9_00uv52doobo0bvk2t7-big-Data.db": { "header": { "min_index_interval": 128, "size": 1, "memory_size": 16, "sampling_level": 128, "size_at_full_sampling": 0 }, "positions": [ 4 ], "entries": [ { "key": { "token": "2008715943680221220", "raw": "000400000064", "value": "100" }, "position": 0 } ], "first_key": { "token": "2008715943680221220", "raw": "000400000064", "value": "100" }, "last_key": { "token": "9010454139840013625", "raw": "000400000003", "value": "3" } } } } ``` Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Closes scylladb/scylladb#21735	2024-12-04 10:16:13 +02:00
Kefu Chai	bab12e3a98	treewide: migrate from boost::adaptors::transformed to std::views::transform now that we are allowed to use C++23. we now have the luxury of using `std::views::transform`. in this change, we: - replace `boost::adaptors::transformed` with `std::views::transform` - use `fmt::join()` when appropriate where `boost::algorithm::join()` is not applicable to a range view returned by `std::views::transform`. - use `std::ranges::fold_left()` to accumulate the range returned by `std::views::transform` - use `std::ranges::fold_left()` to get the maximum element in the range returned by `std::views::transform` - use `std::ranges::min()` to get the minimal element in the range returned by `std::views::transform` - use `std::ranges::equal()` to compare the range views returned by `std::views::transform` - remove unused `#include <boost/range/adaptor/transformed.hpp>` - use `std::ranges::subrange()` instead of `boost::make_iterator_range()`, to feed `std::views::transform()` a view range. to reduce the dependency on boost for better maintainability, and leverage standard library features for better long-term support. this change is part of our ongoing effort to modernize our codebase and reduce external dependencies where possible. limitations: there are still a couple of places where we are still using `boost::adaptors::transformed` due to the lack of a C++23 alternative for `boost::join()` and `boost::adaptors::uniqued`. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#21700	2024-12-03 09:41:32 +02:00
Kefu Chai	f436edfa22	mutation: remove unused "#include"s these unused includes are identified by clang-include-cleaner. after auditing the source files, all of the reports have been confirmed. please note, because `mutation/mutation.hh` does not include `seastar/coroutine/maybe_yield.hh` anymore, and quite a few source files were relying on this header to bring in the declaration of `maybe_yield()`, we have to include this header in the places where this symbol is used. the same applies to `seastar/core/when_all.hh`. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2024-11-29 14:01:44 +08:00
Avi Kivity	1c26c8deeb	mutation: mutation_partition_v2.hh: switch from boost ranges to std ranges Consolidate on one range solution. Fallout in mutation_partition_v2.cc and row_cache_test.cc due to interoperability problems is adjusted.	2024-11-15 14:36:28 +02:00
Avi Kivity	de822d3a46	mutation: mutation_partition.hh: switch from boost ranges to std ranges Consolidate on one range solution. Fallout in mutation_partition.cc due to interoperability problems is adjusted.	2024-11-15 14:09:31 +02:00
Kefu Chai	00810e6a01	treewide: include seastar/core/format.hh instead of seastar/core/print.hh The latter includes the former and in addition to `seastar::format()`, `print.hh` also provides helpers like `seastar::fprint()` and `seastar::print()`, which are deprecated and not used by scylladb. Previously, we included `seastar/core/print.hh` for using `seastar::format()`. and in seastar 5b04939e, we extracted `seastar::format()` into `seastar/core/format.hh`. this allows us to include a much smaller header. In this change, we just include `seastar/core/format.hh` in place of `seastar/core/print.hh`. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#21574	2024-11-14 17:45:07 +02:00
Michał Chojnowski	35921eb67e	mvcc_test: fix a benign failure of test_apply_to_incomplete_respects_continuity For performance reasons, mutation_partition_v2::maybe_drop(), and by extension also mutation_partition_v2::apply_monotonically(mutation_partition_v2&&) can evict empty row entries, and hence change the continuity of the merged entry. For checking that apply_to_incomplete respects continuity, test_apply_to_incomplete_respects_continuity obtains the continuity of the partition entry before and after apply_to_incomplete by calling e.squashed().get_continuity(). But squashed() uses apply_monotonically(), so in some circumstances the result of squashed() can have smaller continuity than the argument of squashed(), which messes with the thing that the test is trying to check, and causes spurious failures. This patch changes the method of calculating the continuity set, so that it matches the entry exactly, fixing the test failures. Fixes scylladb/scylladb#13757 Closes scylladb/scylladb#21459	2024-11-08 06:08:39 +01:00
Avi Kivity	ee92784098	serialization: replace boost::type with std::type_identity Recently, seastar rpc started accepting std::type_identity in addition to boost::type as a type marker (while labeling the latter with an ominous deprecation warning). Reduce our dependency on boost by switching to std::type_identity.	2024-11-05 00:43:27 +01:00
Avi Kivity	2531dc2d80	schema_registry: stop including replica/database.hh database.hh is a hotspot that changes often (or its dependencies do). Avoid including it to reduce recompilations. Closes scylladb/scylladb#21407	2024-11-04 13:16:27 +01:00
Avi Kivity	907da210b6	compound_compat: replace use of boost ranges with std ranges To reduce the dependency load, replace use of boost ranges with the std equivalent. Files that lost the indirect boost dependency have it added as a direct dependency.	2024-10-30 19:58:07 +02:00
Kefu Chai	6ead5a4696	treewide: move log.hh into utils/log.hh the log.hh under the root of the tree was created to keep backward compatibility when seastar was extracted into a separate library. so log.hh should belong to the `utils` directory, as it is based solely on seastar, and can be used by all subsystems. in this change, we move log.hh into utils/log.hh so that it is more modularized. and this also improves the readability, when one sees `#include "utils/log.hh"`, it is obvious that this source file needs the logging system, instead of its own log facility -- please note, we do have two other `log.hh` in the tree. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2024-10-22 06:54:46 +03:00
Avi Kivity	d12ba753e0	utils/unconst, mutation_partition: switch to ranges unconst is a small helper that converts a const iterator to a non-const iterator with the help of the container. Currently it is using the boost iterator/range libraries. Convert it to <ranges> as part of an effort to standardize on a single range library. Its only user in mutation_partition is converted as well. Due to more interoperability problems between <ranges> and boost, some calls to boost::adaptors::reversed have to be converted as well.	2024-10-07 17:30:12 +03:00
Avi Kivity	e99426df60	treewide: de-static namespace scope functions in headers 'static inline' is always wrong in headers - if the same header is included multiple times, and the function happens not to be inlined, then multiple copies of it will be generated. Fix by mechanically changing '^static inline' to 'inline'.	2024-10-01 14:02:50 +03:00
Tomasz Grabiec	adf99402c5	Merge 'readers/flat_mutation_reader_v2: call set_close_required() from consume()' from Botond Dénes The `consume()` variants just forward the call to the `_impl` method with the same name. The latter, being a member of `::impl`, will bypass the top level `fill_buffer()`, etc. methods and thus will never call `set_close_required()`. Do this in the top-level `consume()` methods instead, to ensure a reader, on which only `consume()` is called, and then is destroyed, will complain as it should (and abort). Only one place was found in core code, which didn't close the reader: `split_mutation()` in `mutation/mutation.cc` and this reader is the "from-mutation" one which has no real close routine. All other places were in tests. All this is to say, there were no real bugs uncovered by this PR. Fixes #16520 Improvement, no backport required. Closes scylladb/scylladb#16522 * github.com:scylladb/scylladb: readers/flat_mutation_reader_v2: call set_close_required() from consume*() test/boost/sstable_compaction_test: close reader after use test/boost/repair_test: close reader after use mutation/mutation: split_mutation(): close reader after use	2024-09-17 13:21:34 +02:00
Botond Dénes	1a11f9cf95	mutation/mutation: split_mutation(): close reader after use	2024-09-13 06:52:26 -04:00
Botond Dénes	c7c5817808	Merge 'Improve timestamp heuristics for tombstone garbage collection' from Benny Halevy When purging regular tombstone consult the min_live_timestamp, if available. This is safe since we don't need to protect dead data from resurrection, as it is already dead. For shadowable_tombstones, consult the min_memtable_live_row_marker_timestamp, if available, otherwise fallback to the min_live_timestamp. If we see in a view table a shadowable tombstone with time T, then in any row where the row marker's timestamp is higher than T the shadowable tombstone is completely ignored and it doesn't hide any data in any column, so the shadowable tombstone can be safely purged without any effect or risk resurrecting any deleted data. In other words, rows which might cause problems for purging a shadowable tombstone with time T are rows with row markers older or equal T. So to know if a whole sstable can cause problems for shadowable tombstone of time T, we need to check if the sstable's oldest row marker (and not oldest column) is older or equal T. And the same check applies similarly to the memtable. If both extended timestamp statistics are missing, fallback to the legacy (and inaccurate) min_timestamp. 
Fixes scylladb/scylladb#20423 Fixes scylladb/scylladb#20424 > [!NOTE] > no backport needed at this time > We may consider backport later on after given some soak time in master/enterprise > since we do see tombstone accumulation in the field under some materialized views workloads Closes scylladb/scylladb#20446 * github.com:scylladb/scylladb: cql-pytest: add test_compaction_tombstone_gc sstable_compaction_test: add mv_tombstone_purge_test sstable_compaction_test: tombstone_purge_test: test that old deleted data do not inhibit tombstone garbage collection sstable_compaction_test: tombstone_purge_test: add testlog debugging sstable_compaction_test: tombstone_purge_test: make_expiring: use next_timestamp sstable, compaction: add debug logging for extended min timestamp stats compaction: get_max_purgeable_timestamp: use memtable and sstable extended timestamp stats compaction: define max_purgeable_fn tombstone: can_gc_fn: move declaration to compaction_garbage_collector.hh sstables: scylla_metadata: add ext_timestamp_stats compaction_group, storage_group, table_state: add extended timestamp stats getters sstables, memtable: track live timestamps memtable_encoding_stats_collector: update row_marker: do nothing if missing	2024-09-13 08:56:51 +03:00
Kefu Chai	3e84d43f93	treewide: use seastar::format() or fmt::format() explicitly before this change, we relied on `using namespace seastar` to use `seastar::format()` without qualifying the `format()` with its namespace. this worked fine until we changed the type of the format string parameter of `seastar::format()` from `const char*` to `fmt::format_string<...>`. this change practically invited `seastar::format()` to the club of `std::format()` and `fmt::format()`, where all members accept a templated parameter as its `fmt` parameter. and `seastar::format()` is not the best candidate anymore. although argument-dependent lookup (ADL for short) favors the function which is in the same namespace as its parameter, `using namespace` makes `seastar::format()` more competitive, so both `std::format()` and `seastar::format()` are considered as the candidates. that is what is happening in scylladb in quite a few caller sites of `format()`, hence ADL is not able to tell which function is the winner in the name lookup: ``` /__w/scylladb/scylladb/mutation/mutation_fragment_stream_validator.cc:265:12: error: call to 'format' is ambiguous 265 \| return format("{} ({}.{} {})", _name_view, s.ks_name(), s.cf_name(), s.id()); \| ^~~~~~ /usr/bin/../lib/gcc/x86_64-redhat-linux/14/../../../../include/c++/14/format:4290:5: note: candidate function [with _Args = <const std::basic_string_view<char> &, const seastar::basic_sstring<char, unsigned int, 15> &, const seastar::basic_sstring<char, unsigned int, 15> &, const utils::tagged_uuid<table_id_tag> &>] 4290 \| format(format_string<_Args...> __fmt, _Args&&... __args) \| ^ /__w/scylladb/scylladb/seastar/include/seastar/core/print.hh:143:1: note: candidate function [with A = <const std::basic_string_view<char> &, const seastar::basic_sstring<char, unsigned int, 15> &, const seastar::basic_sstring<char, unsigned int, 15> &, const utils::tagged_uuid<table_id_tag> &>] 143 \| format(fmt::format_string<A...> fmt, A&&... 
a) { \| ^ ``` in this change, we change all `format()` to either `fmt::format()` or `seastar::format()` with following rules: - if the caller expects an `sstring` or `std::string_view`, change to `seastar::format()` - if the caller expects an `std::string`, change to `fmt::format()`. because, `sstring::operator std::basic_string` would incur a deep copy. we will need another change to enable scylladb to compile with the latest seastar. namely, to pass the format string as a templated parameter down to helper functions which format their parameters. to minimize the scope of this change, let's include that change when bumping up the seastar submodule. as that change will depend on the seastar change. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2024-09-11 23:21:40 +03:00
Benny Halevy	5849ba83e0	sstable, compaction: add debug logging for extended min timestamp stats Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2024-09-10 19:05:57 +03:00
Benny Halevy	7d893a5ed9	compaction: get_max_purgeable_timestamp: use memtable and sstable extended timestamp stats When purging regular tombstone consult the min_live_timestamp, if available. For shadowable_tombstones, consult the min_memtable_live_row_marker_timestamp, if available, otherwise fallback to the min_live_timestamp. If both are missing, fallback to the legacy (and inaccurate) min_timestamp. Fixes scylladb/scylladb#20423 Fixes scylladb/scylladb#20424 Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2024-09-10 19:05:57 +03:00
Benny Halevy	57e9e9c369	compaction: define max_purgeable_fn Before we add a new, is_shadowable, parameter to it. And define global `can_always_purge` and `can_never_purge` functions, a-la `always_gc` and `never_gc`. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2024-09-10 19:05:57 +03:00
Benny Halevy	b6fabd98c6	tombstone: can_gc_fn: move declaration to compaction_garbage_collector.hh And define `never_gc` globally, same as `always_gc` Before adding a new, is_shadowable parameter to it. Since it is used in the context of compaction it better fits compaction_garbage_collector header rather than tombstone.hh Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2024-09-10 19:05:57 +03:00
Benny Halevy	14d86a3a12	sstables, memtable: track live timestamps When garbage collecting tombstones, we care only about shadowing of live data. However, currently we track min/max timestamp of both live and dead data, but there is no problem with purging tombstones that shadow dead data (expired or shadowed by other tombstones in the sstable/memtable). Also, for shadowable tombstones, we track live row marker timestamps separately since, if the live row marker timestamp is greater than a shadowable tombstone timestamp, then the row marker would shadow the shadowable tombstone thus exposing the cells in that row, even if their timestamp may be smaller than the shadowable tombstone's. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2024-09-10 19:05:49 +03:00
Łukasz Paszkowski	ba2f037af5	mutation_partition: drop reverse parameter in compact_for_query The reverse parameter is no longer used with native reverse reads. The row ranges are provided in native reverse order together with a reversed schema, thus the reverse parameter remains false all the time and can be dropped.	2024-08-13 10:07:12 +02:00
Łukasz Paszkowski	8b5ec0e963	streamed_mutation_freezer: drop the reverse parameter The reverse parameter is no longer used with native reverse reads. A reversed schema is provided and thus the reverse parameter shall remain false all the time.	2024-08-13 10:07:12 +02:00
Łukasz Paszkowski	da95f44adc	readers: Use reversed schema and native reversed slices The reconcilable_result is built as it would be constructed for forward read queries for tables with reversed order. Mutations constructed for reversed queries are consumed forward. Drop overloaded reversed functions that reverse read_command and reconcilable_result directly and keep only those requiring smart pointers. They are not used any more.	2024-08-13 10:03:46 +02:00
Botond Dénes	fb0ab3c1fb	mutation/canonical_mutation: add key() Extracts the partition key without deserializing the entire mutation.	2024-08-11 09:52:37 -04:00
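
Note on d67202972a: the skip-and-resume sequence described in that message can be illustrated with a toy driver. This is only a sketch under stated assumptions: `stop_iteration` here is a local enum standing in for the seastar type, and `partition`, `per_partition_limit_consumer` and `consume_all()` are hypothetical names, not ScyllaDB code. The point is that `stop_iteration::yes` from `consume()` ends the current partition only, while the return value of `consume_end_of_partition()` decides whether the whole stream stops.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Local stand-in for seastar::stop_iteration (assumption, not the real type).
enum class stop_iteration { no, yes };

// Hypothetical, simplified partition: a key plus a list of row values.
struct partition {
    std::string key;
    std::vector<int> rows;
};

// Hypothetical consumer that keeps at most `limit` rows per partition.
struct per_partition_limit_consumer {
    std::size_t limit;
    std::size_t in_partition = 0;
    std::vector<std::string> seen;

    void consume_new_partition(const std::string& key) {
        in_partition = 0;
        seen.push_back(key + ":");
    }
    stop_iteration consume(int row) {
        assert(!seen.empty()); // consume() must follow consume_new_partition()
        seen.back() += std::to_string(row) + ",";
        ++in_partition;
        // "yes" asks the driver to skip the rest of this partition.
        return in_partition >= limit ? stop_iteration::yes : stop_iteration::no;
    }
    // "no" asks the driver to carry on with the next partition.
    stop_iteration consume_end_of_partition() { return stop_iteration::no; }
};

// Toy driver that respects both return values.
template <typename Consumer>
void consume_all(const std::vector<partition>& parts, Consumer& c) {
    for (const auto& p : parts) {
        c.consume_new_partition(p.key);
        for (int row : p.rows) {
            if (c.consume(row) == stop_iteration::yes) {
                break; // skip the remainder of this partition only
            }
        }
        if (c.consume_end_of_partition() == stop_iteration::yes) {
            return; // only this return value ends the whole stream
        }
    }
}
```

A driver that returned from the inner loop instead of breaking would drop every later partition, which is the kind of effect the commit message describes for `select distinct ...` and per-partition-limit queries.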
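
Note on c2518cdf1a: the hazard described there (a key reference that outlives a move of its owner) and the chosen fix (copy the key on `consume_new_partition()`) can be sketched as follows. All types are hypothetical stand-ins using `std::string` as the key; this is not the mutation compactor's actual interface.

```cpp
#include <string>
#include <utility>

// Hypothetical stand-in for a partition-start fragment that owns its key.
struct partition_start {
    std::string key;
};

// Keeps only a pointer to the caller's key. After the owner is moved from,
// the pointed-to string is in a valid but unspecified state, so later checks
// made through this pointer may see the wrong key.
struct compactor_by_ref {
    const std::string* key = nullptr;
    void consume_new_partition(const std::string& k) { key = &k; }
};

// Copies the key, so later checks are unaffected by what the caller does
// with the original object.
struct compactor_by_copy {
    std::string key;
    void consume_new_partition(const std::string& k) { key = k; }
};

// Mimics the reader: hand the key to the compactor, then move the
// partition start away to emit it. Returns the key the compactor still sees.
inline std::string key_seen_after_emit(std::string key) {
    partition_start last_uncompacted{std::move(key)};
    compactor_by_copy c;
    c.consume_new_partition(last_uncompacted.key);
    partition_start emitted = std::move(last_uncompacted); // emit downstream
    (void)emitted;
    return c.key;
}
```

The extra copy is the cost the commit message accepts in exchange for not depending on the caller keeping the key alive.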
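
Note on 3cf0f71420: the general pattern behind "access `pw` before it is moved" can be sketched like this. The class and member names are hypothetical and do not mirror the query-result-writer code; the sketch only relies on the C++ rule that members are initialized in declaration order, not in the order written in the constructor's initializer list.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <utility>

// Hypothetical payload type.
struct writer {
    std::string name;
};

struct result_builder {
    // Declared first, so it is initialized first and reads *pw while
    // pw still owns the writer.
    std::size_t name_len;
    // Declared second, so the move out of pw happens afterwards.
    std::unique_ptr<writer> owned;

    explicit result_builder(std::unique_ptr<writer> pw)
        : name_len(pw->name.size())
        , owned(std::move(pw)) {}
};
```

If the two members were declared in the opposite order, `owned` would take the pointer first and `pw->name` would dereference a null `unique_ptr`.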
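
Note on 3e84d43f93: the ambiguity described there can be reproduced in miniature without seastar or fmt. In this sketch `lib_a` and `lib_b` are hypothetical namespaces: `lib_b::format` is visible through a using-directive (as `seastar::format()` was), while `lib_a::format` is found by argument-dependent lookup because the argument type lives in `lib_a` (as `std::format()` was for `std::string_view` arguments).

```cpp
#include <string>

namespace lib_a {
struct tag {};
template <typename... Args>
std::string format(const char* fmt, Args&&...) {
    return std::string("lib_a:") + fmt;
}
} // namespace lib_a

namespace lib_b {
template <typename... Args>
std::string format(const char* fmt, Args&&...) {
    return std::string("lib_b:") + fmt;
}
} // namespace lib_b

using namespace lib_b;

inline std::string render_a(const lib_a::tag& t) {
    // return format("{}", t); // ambiguous: lib_b via using-directive, lib_a via ADL
    return lib_a::format("{}", t); // qualifying the call removes the ambiguity
}

inline std::string render_b(const lib_a::tag& t) {
    return lib_b::format("{}", t);
}
```

Qualifying every call, as the commit does with `fmt::format()` and `seastar::format()`, makes the choice explicit at each call site.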
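
Note on 7d893a5ed9 and c7c5817808: the fallback order for choosing which timestamp to consult can be sketched as below. The struct and function are hypothetical and model only the selection described in the messages (live row marker timestamp for shadowable tombstones, then the live timestamp, then the legacy minimum); they are not the `get_max_purgeable_timestamp` implementation.

```cpp
#include <cstdint>
#include <optional>

using timestamp_type = std::int64_t;

// Hypothetical bundle of statistics for one sstable or memtable.
struct timestamp_stats {
    timestamp_type min_timestamp;                                // legacy: live and dead data
    std::optional<timestamp_type> min_live_timestamp;            // extended stat, may be absent
    std::optional<timestamp_type> min_live_row_marker_timestamp; // extended stat, may be absent
};

// Picks the timestamp to consult when deciding whether a tombstone is purgeable.
inline timestamp_type purge_reference_timestamp(const timestamp_stats& s, bool is_shadowable) {
    if (is_shadowable && s.min_live_row_marker_timestamp) {
        return *s.min_live_row_marker_timestamp;
    }
    if (s.min_live_timestamp) {
        return *s.min_live_timestamp;
    }
    return s.min_timestamp; // legacy fallback, described as inaccurate
}
```

Because dead data can only lower the legacy minimum, the extended statistics give a reference timestamp that is never older, which fits the tombstone accumulation concern mentioned in the merge message.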

211 Commits