scylladb

Author	SHA1	Message	Date
Benny Halevy	3feb759943	everywhere: use utils::chunked_vector for list of mutations Currently, we use std::vector<*mutation> to keep a list of mutations for processing. This can lead to large allocation, e.g. when the vector size is a function of the number of tables. Use a chunked vector instead to prevent oversized allocations. `perf-simple-query --smp 1` results obtained for fixed 400MHz frequency and PGO disabled: Before (read path): ``` enable-cache=1 Running test with config: {partitions=10000, concurrency=100, mode=read, query_single_key=no, counters=no} Disabling auto compaction Creating 10000 partitions... 89055.97 tps ( 66.1 allocs/op, 0.0 logallocs/op, 14.2 tasks/op, 39417 insns/op, 18003 cycles/op, 0 errors) 103372.72 tps ( 66.1 allocs/op, 0.0 logallocs/op, 14.2 tasks/op, 39380 insns/op, 17300 cycles/op, 0 errors) 98942.27 tps ( 66.1 allocs/op, 0.0 logallocs/op, 14.2 tasks/op, 39413 insns/op, 17336 cycles/op, 0 errors) 103752.93 tps ( 66.1 allocs/op, 0.0 logallocs/op, 14.2 tasks/op, 39407 insns/op, 17252 cycles/op, 0 errors) 102516.77 tps ( 66.1 allocs/op, 0.0 logallocs/op, 14.2 tasks/op, 39403 insns/op, 17288 cycles/op, 0 errors) throughput: mean= 99528.13 standard-deviation=6155.71 median= 102516.77 median-absolute-deviation=3844.59 maximum=103752.93 minimum=89055.97 instructions_per_op: mean= 39403.99 standard-deviation=14.25 median= 39406.75 median-absolute-deviation=9.30 maximum=39416.63 minimum=39380.39 cpu_cycles_per_op: mean= 17435.81 standard-deviation=318.24 median= 17300.40 median-absolute-deviation=147.59 maximum=18002.53 minimum=17251.75 ``` After (read path) ``` enable-cache=1 Running test with config: {partitions=10000, concurrency=100, mode=read, query_single_key=no, counters=no} Disabling auto compaction Creating 10000 partitions... 
59755.04 tps ( 66.2 allocs/op, 0.0 logallocs/op, 14.2 tasks/op, 39466 insns/op, 22834 cycles/op, 0 errors) 71854.16 tps ( 66.1 allocs/op, 0.0 logallocs/op, 14.2 tasks/op, 39417 insns/op, 17883 cycles/op, 0 errors) 82149.45 tps ( 66.1 allocs/op, 0.0 logallocs/op, 14.2 tasks/op, 39411 insns/op, 17409 cycles/op, 0 errors) 49640.04 tps ( 66.1 allocs/op, 0.0 logallocs/op, 14.3 tasks/op, 39474 insns/op, 19975 cycles/op, 0 errors) 54963.22 tps ( 66.1 allocs/op, 0.0 logallocs/op, 14.3 tasks/op, 39474 insns/op, 18235 cycles/op, 0 errors) throughput: mean= 63672.38 standard-deviation=13195.12 median= 59755.04 median-absolute-deviation=8709.16 maximum=82149.45 minimum=49640.04 instructions_per_op: mean= 39448.38 standard-deviation=31.60 median= 39466.17 median-absolute-deviation=25.75 maximum=39474.12 minimum=39411.42 cpu_cycles_per_op: mean= 19267.01 standard-deviation=2217.03 median= 18234.80 median-absolute-deviation=1384.25 maximum=22834.26 minimum=17408.67 ``` `perf-simple-query --smp 1 --write` results obtained for fixed 400MHz frequency and PGO disabled: Before (write path): ``` enable-cache=1 Running test with config: {partitions=10000, concurrency=100, mode=write, query_single_key=no, counters=no} Disabling auto compaction 63736.96 tps ( 59.4 allocs/op, 16.4 logallocs/op, 14.3 tasks/op, 49667 insns/op, 19924 cycles/op, 0 errors) 64109.41 tps ( 59.3 allocs/op, 16.0 logallocs/op, 14.3 tasks/op, 49992 insns/op, 20084 cycles/op, 0 errors) 56950.47 tps ( 59.3 allocs/op, 16.0 logallocs/op, 14.3 tasks/op, 50005 insns/op, 20501 cycles/op, 0 errors) 44858.42 tps ( 59.3 allocs/op, 16.0 logallocs/op, 14.3 tasks/op, 50014 insns/op, 21947 cycles/op, 0 errors) 28592.87 tps ( 59.3 allocs/op, 16.0 logallocs/op, 14.3 tasks/op, 50027 insns/op, 27659 cycles/op, 0 errors) throughput: mean= 51649.63 standard-deviation=15059.74 median= 56950.47 median-absolute-deviation=12087.33 maximum=64109.41 minimum=28592.87 instructions_per_op: mean= 49941.18 standard-deviation=153.76 median= 
50005.24 median-absolute-deviation=73.01 maximum=50027.07 minimum=49667.05 cpu_cycles_per_op: mean= 22023.01 standard-deviation=3249.92 median= 20500.74 median-absolute-deviation=1938.76 maximum=27658.75 minimum=19924.32 ``` After (write path) ``` enable-cache=1 Running test with config: {partitions=10000, concurrency=100, mode=write, query_single_key=no, counters=no} Disabling auto compaction 53395.93 tps ( 59.4 allocs/op, 16.5 logallocs/op, 14.3 tasks/op, 50326 insns/op, 21252 cycles/op, 0 errors) 46527.83 tps ( 59.3 allocs/op, 16.0 logallocs/op, 14.3 tasks/op, 50704 insns/op, 21555 cycles/op, 0 errors) 55846.30 tps ( 59.3 allocs/op, 16.0 logallocs/op, 14.3 tasks/op, 50731 insns/op, 21060 cycles/op, 0 errors) 55669.30 tps ( 59.3 allocs/op, 16.0 logallocs/op, 14.3 tasks/op, 50735 insns/op, 21521 cycles/op, 0 errors) 52130.17 tps ( 59.3 allocs/op, 16.0 logallocs/op, 14.3 tasks/op, 50757 insns/op, 21334 cycles/op, 0 errors) throughput: mean= 52713.91 standard-deviation=3795.38 median= 53395.93 median-absolute-deviation=2955.40 maximum=55846.30 minimum=46527.83 instructions_per_op: mean= 50650.57 standard-deviation=182.46 median= 50731.38 median-absolute-deviation=84.09 maximum=50756.62 minimum=50325.87 cpu_cycles_per_op: mean= 21344.42 standard-deviation=202.86 median= 21334.00 median-absolute-deviation=176.37 maximum=21554.61 minimum=21060.24 ``` Fixes #24815 Improvement for rare corner cases. No backport required Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Closes scylladb/scylladb#24919	2025-07-13 19:13:11 +03:00
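The idea behind replacing a contiguous vector with a chunked one can be sketched as follows. This is a minimal illustration only: `simple_chunked_vector` and its `ChunkSize` parameter are hypothetical names and do not reproduce the real `utils::chunked_vector`. It assumes the goal is simply that no single element allocation grows with the total element count.

```cpp
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical, simplified stand-in for a chunked vector (NOT utils::chunked_vector).
// Elements live in fixed-capacity chunks, so the largest element allocation is
// ChunkSize * sizeof(T) no matter how many elements are stored.
template <typename T, std::size_t ChunkSize = 128>
class simple_chunked_vector {
    // Only this index of chunk pointers is contiguous; it grows by one
    // pointer per ChunkSize elements.
    std::vector<std::unique_ptr<std::vector<T>>> _chunks;
    std::size_t _size = 0;
public:
    void push_back(T value) {
        if (_size % ChunkSize == 0) {
            // The current chunk is full (or there is none yet): start a new one.
            auto chunk = std::make_unique<std::vector<T>>();
            chunk->reserve(ChunkSize);
            _chunks.push_back(std::move(chunk));
        }
        _chunks.back()->push_back(std::move(value));
        ++_size;
    }
    T& operator[](std::size_t i) { return (*_chunks[i / ChunkSize])[i % ChunkSize]; }
    const T& operator[](std::size_t i) const { return (*_chunks[i / ChunkSize])[i % ChunkSize]; }
    std::size_t size() const { return _size; }
    std::size_t chunk_count() const { return _chunks.size(); }
};
```

Indexing costs one extra division and indirection compared with `std::vector`, which is the kind of overhead the before/after measurements in the commit message are meant to quantify.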
Botond Dénes	c8563b9604	readers: mv generating_v2.hh generating.hh Completely mechanical change.	2025-04-16 04:46:08 -04:00
Botond Dénes	dfd7f03463	tree: s/make_generating_reader_v2/make_generating_reader/ Completely mechanical change.	2025-04-16 04:46:08 -04:00
Botond Dénes	c29c696780	readers: mv from_mutations_v2.hh from_mutations.hh Completely mechanical change.	2025-04-16 04:46:08 -04:00
Botond Dénes	b104862702	tree: s/make_mutation_reader_from_mutations_v2/make_mutation_reader_from_mutations/s Completely mechanical change.	2025-04-16 04:46:07 -04:00
Botond Dénes	a9d75c4f9d	readers: mv empty_v2.hh empty.hh Completely mechanical change.	2025-04-16 04:32:56 -04:00
Botond Dénes	05829f98f3	tree: s/make_empty_flat_reader_v2/make_empty_mutation_reader/ Completely mechanical change.	2025-04-16 04:32:56 -04:00
Kefu Chai	7ff0d7ba98	tree: Remove unused boost headers This commit eliminates unused boost header includes from the tree. Removing these unnecessary includes reduces dependencies on the external Boost.Adapters library, leading to faster compile times and a slightly cleaner codebase. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#22857	2025-02-15 20:32:22 +02:00
Takuya ASADA	03461d6a54	test: compile unit tests into a single executable To reduce test executable size and speed up compilation time, compile unit tests into a single executable. Here is a file size comparison of the unit test executable: - Before applying the patch $ du -h --exclude='.o' --exclude='.o.d' build/release/test/boost/ build/debug/test/boost/ 11G build/release/test/boost/ 29G build/debug/test/boost/ - After applying the patch $ du -h --exclude='.o' --exclude='.o.d' build/release/test/boost/ build/debug/test/boost/ 5.5G build/release/test/boost/ 19G build/debug/test/boost/ It reduces executable sizes by 5.5GB on release, and by 10GB on debug. Closes #9155 Closes scylladb/scylladb#21443	2024-12-22 19:14:09 +02:00
Avi Kivity	f3eade2f62	treewide: relicense to ScyllaDB-Source-Available-1.0 Drop the AGPL license in favor of a source-available license. See the blog post [1] for details. [1] https://www.scylladb.com/2024/12/18/why-were-moving-to-a-source-available-license/	2024-12-18 17:45:13 +02:00
Kefu Chai	24d14b601b	treewide: s/boost::adaptors::map_values/std::views::values/ now that we are allowed to use C++23. we now have the luxury of using `std::views::values`. in this change, we: - replace `boost::adaptors::map_values` with `std::views::values` - update affected code to work with `std::views::values` - the places where we use `boost::join()` are not changed, because we cannot use `std::views::concat` yet. this helper is only available in C++26. to reduce the dependency to boost for better maintainability, and leverage standard library features for better long-term support. this change is part of our ongoing effort to modernize our codebase and reduce external dependencies where possible. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#21265	2024-10-27 21:32:45 +02:00
Kefu Chai	5cd619a60c	treewide: s/boost::adaptors::map_keys/std::views::keys/ now that we are allowed to use C++23. we now have the luxury of using `std::views::keys`. in this change, we: - replace `boost::adaptors::map_keys` with `std::views::keys` - update affected code to work with `std::views::keys` to reduce the dependency to boost for better maintainability, and leverage standard library features for better long-term support. this change is part of our ongoing effort to modernize our codebase and reduce external dependencies where possible. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#21198	2024-10-21 12:47:52 +03:00
Avi Kivity	fdc1449392	treewide: rename flat_mutation_reader_v2 to mutation_reader flat_mutation_reader_v2 was introduced in a pair of commits in 2021: `e3309322c3` "Clone flat_mutation_reader related classes into v2 variants" `08b5773c12` "Adapt flat_mutation_reader_v2 to the new version of the API" as a replacement for flat_mutation_reader, using range_tombstone_change instead of range_tombstone to represent range tombstones. See those commits for more information. The transition was incremental; the last use of the original flat_mutation_reader was removed in 2022 in commit `026f8cc1e7` "db: Use mutation_partition_v2 in mvcc" In turn, flat_mutation_reader was introduced in 2017 in commit `748205ca75` "Introduce flat_mutation_reader" to transition from a mutation_reader that nested rows within a partition in a separate stream, to a flat reader that streamed partitions and rows in the same stream. Here, we reclaim the original name and rename the awkward flat_mutation_reader_v2 to mutation_reader. Note that mutation_fragment_v2 remains since we still use the original for compatibility, sometimes. Some notes about the transition: - files were also renamed. In one case (flat_mutation_reader_test.cc), the rename target already existed, so we rename to mutation_reader_another_test.cc. - a namespace 'mutation_reader' with two definitions existed (in mutation_reader_fwd.hh). Its contents were folded into the mutation_reader class. As a result, a few #includes had to be adjusted. Closes scylladb/scylladb#19356	2024-06-21 07:12:06 +03:00
Kefu Chai	222dbf2ce4	test/boost: include test/lib/test_utils.hh this change was created in the same spirit of 505900f18f. because we are deprecating the operator<< for vector and unordered_map in Seastar, some tests do not compile anymore if we disable these operators. so to be prepared for the change disabling them, let's include test/lib/test_utils.hh for accessing the printer dedicated for Boost.test. and also '#include <fmt/ranges.h>' when necessary, because, in order to format the ranges using {fmt}, we need to use fmt/ranges.h. Refs #13245 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2024-05-26 12:32:43 +08:00
Tomasz Grabiec	32a191384a	test: Avoid using deprecated sharded API There is no tablet migration in unit tests, so shard_of() can be safely replaced with shard_for_reads(), even if it's used for writes.	2024-05-16 00:28:47 +02:00
Kefu Chai	a439ebcfce	treewide: include fmt/ranges.h and/or fmt/std.h before this change, we rely on the default-generated fmt::formatter created from operator<<, but fmt v10 dropped the default-generated formatter. in this change, we include `fmt/ranges.h` and/or `fmt/std.h` for formatting the container types, like vector, map, optional and variant using {fmt} instead of the homebrew formatter based on operator<<. with this change, the changes adding fmt::formatter and the changes using ostream formatter explicitly, we are allowed to drop `FMT_DEPRECATED_OSTREAM` macro. Refs scylladb#13245 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2024-04-19 22:56:16 +08:00
Raphael S. Carvalho	41a5c9eaec	test: Reduce mem footprint of test_token_group_based_splitting_mutation_writer Reduces footprint from hundreds of MB to a very few MB. Issue could be reproduced with: ./build/dev/test/boost/mutation_writer_test --run_test=test_token_group_based_splitting_mutation_writer -- -m 500M --smp 1 --random-seed 1848215131 Fixes #17076. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Closes scylladb/scylladb#17187	2024-02-07 09:21:24 +02:00
Avi Kivity	7cb1c10fed	treewide: replace seastar::future::get0() with seastar::future::get() get0() dates back from the days where Seastar futures carried tuples, and get0() was a way to get the first (and usually only) element. Now it's a distraction, and Seastar is likely to deprecate and remove it. Replace with seastar::future::get(), which does the same thing.	2024-02-02 22:12:57 +08:00
Raphael S. Carvalho	c8668b90e3	mutation_writer: Introduce token-group-based mutation segregator Token group is an abstraction that allows us to easily segregate a mutation stream into buckets. Groups share the same properties as compaction groups. Groups follow the ring order and they don't overlap each other. Groups are defined according to a classifier, which returns an id given a token. It's expected that the classifier returns ids in monotonically increasing order. The reasons for this abstraction are: 1) we don't want to make the segregator aware of compaction groups 2) splitting happens before tablet metadata is changed, so the segregator will have to classify based on whether the token belongs to the left (group id 0) or right (group id 1) side of the range to be split. The reason for not extending the sstable writer instead is that today, the writer consumer can only tell the producer to switch to a new writer when consuming the end of a partition, but that would be too late for us, as we have to decide to move to a new writer at partition start instead. It will be wired into compaction when it happens in split mode. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2023-12-17 11:26:32 -03:00
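The classifier-driven splitting described in this commit can be sketched as below. This is an illustration under assumptions, not the code from the commit: `token`, `group_id`, `classifier` and `segregate_by_token_group` are hypothetical names, a plain integer stands in for a ring token, and each input element stands in for the start of a partition.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

using token = std::int64_t;                         // hypothetical stand-in for a ring token
using group_id = std::size_t;                       // id returned by the classifier
using classifier = std::function<group_id(token)>;  // assumed to return non-decreasing ids

// Splits a token-ordered stream into consecutive, non-overlapping groups.
// The decision to switch to a new group is taken when an element (partition
// start) is seen, not after the previous one has been fully consumed.
std::vector<std::vector<token>> segregate_by_token_group(
        const std::vector<token>& stream, const classifier& classify) {
    std::vector<std::vector<token>> groups;
    bool have_current = false;
    group_id current = 0;
    for (token t : stream) {
        const group_id id = classify(t);
        if (!have_current || id != current) {
            groups.emplace_back();  // start a new bucket for the new group id
            current = id;
            have_current = true;
        }
        groups.back().push_back(t);
    }
    return groups;
}
```

For the split case mentioned in the message, a classifier such as `[](token t) { return t < 0 ? 0 : 1; }` (with a hypothetical split point of 0) would yield a left group and a right group.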
Tomasz Grabiec	f88220aeee	stream_transfer_task, multishard_writer: Work with table sharder So that we can use it on tablet-based tables.	2023-07-25 21:08:51 +02:00
Pavel Emelyanov	66e43912d6	code: Switch to seastar API level 7 In that level no io_priority_class-es exist. Instead, all the IO happens in the context of current sched-group. File API no longer accepts prio class argument (and makes io_intent arg mandatory to impls). So the change consists of - removing all usage of io_priority_class - patching file_impl's inheritors to updated API - priority manager goes away altogether - IO bandwidth update is performed on respective sched group - tune-up scylla-gdb.py io_queues command The first change is huge and was made semi-automatically by: - grep io_priority_class \| default_priority_class - remove all calls, found methods' args and class' fields Patching file_impl-s is smaller, but also mechanical: - replace io_priority_class& argument with io_intent* one - pass intent to lower file (if applicable) Dropping the priority manager is: - git-rm .cc and .hh - sed out all the #include-s - fix configure.py and cmakefile The scylla-gdb.py update is a bit hairy -- it needs to use task queues list for IO classes names and shares, but to detect whether it should, it checks whether the "commitlog" group is present. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes #13963	2023-06-06 13:29:16 +03:00
Kefu Chai	3ae11de204	treewide: do not define/capture unused variables these warnings are found by Clang-17 after removing `-Wno-unused-lambda-capture` and '-Wno-unused-variable' from the list of disabled warnings in `configure.py`. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-02-28 21:56:53 +08:00
Avi Kivity	c5e4bf51bd	Introduce mutation/ module Move mutation-related files to a new mutation/ directory. The names are kept in the global namespace to reduce churn; the names are unambiguous in any case. mutation_reader remains in the readers/ module. mutation_partition_v2.cc was missing from CMakeLists.txt; it's added in this patch. This is a step forward towards librarization or modularization of the source base. Closes #12788	2023-02-14 11:19:03 +02:00
Raphael S. Carvalho	3c5afb2d5c	test: Enable Scylla test command line options for boost tests We have enabled the command line options without changing a single line of code, we only had to replace old include with scylla_test_case.hh. Next step is to add x-log-compaction-groups options, which will determine the number of compaction groups to be used by all instantiations of replica::table. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2023-02-01 20:14:51 -03:00
Pavel Emelyanov	6075e01312	test/lib: Remove sstable_utils.hh from simple_schema.hh The latter is a pretty popular test/lib header that disseminates the former one over a whole lot of unit tests. The former, in turn, naturally includes sstables.hh, thus making tons of unrelated tests depend on the sstables class unused by them. However, simple removal doesn't work, because of the local_shard_only bool class definition in sstable_utils.hh used in simple_schema.hh. This thing, in turn, is used in key-making helpers that don't belong to sstable utils, so these are moved into simple_schema as well. When done, this affects the mutation_source_test.hh, which needs the local_shard_only bool class (and helps spreading the sstables.hh throughout more unrelated tests) and a bunch of .cc test sources that used sstable_utils.hh to indirectly include various headers of their demand. After patching, sstables.hh touches 2x fewer tests. As a side effect the sstables_manager.hh also becomes 2x less depended on by tests. Continuation of `9bdea110a6` Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes #12240	2022-12-08 15:37:33 +02:00
Benny Halevy	0627667a06	mutation_partition: compact_for_compaction: get tombstone_gc_state And pass down to `do_compact`. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-09-07 07:43:15 +03:00
Michael Livshin	029508b77c	flat_mutation_reader ist tot Signed-off-by: Michael Livshin <michael.livshin@scylladb.com>	2022-05-31 23:42:34 +03:00
Botond Dénes	f8015d9c26	readers: move combined reader into readers/ Since the combined reader family weighs more than 1K SLOC, it gets its own .cc file.	2022-03-30 15:42:51 +03:00
Botond Dénes	fcf15fda94	readers: generating_reader: use noncopyable_function<> std::function<> requires the functor it wraps to be copyable, which is an unnecessarily strict requirement. To relax this, we use noncopyable_function<> instead. Since the former seems to lack some disambiguation magic of the latter, we add `_v1` and `_v2` postfixes to manually disambiguate.	2022-03-17 06:53:44 +02:00
Benny Halevy	e5538cf52e	test: mutation_write_test: test_timestamp_based_splitting_mutation_writer: no need to downgrade reader to v1 Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Message-Id: <20220315083425.2786228-2-bhalevy@scylladb.com>	2022-03-15 11:41:11 +02:00
Benny Halevy	90edddd7e3	everywhere: use make_flat_mutation_reader_from_mutations_v2 Rather than upgrade_to_v2(make_flat_mutation_reader_from_mutations) Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Message-Id: <20220315083425.2786228-1-bhalevy@scylladb.com>	2022-03-15 11:41:10 +02:00
Mikołaj Sielużycki	1d84a254c0	flat_mutation_reader: Split readers by file and remove unnecessary includes. The flat_mutation_reader files were conflated and contained multiple readers, which were not strictly necessary. Splitting optimizes both iterative compilation times, as touching rarely used readers doesn't recompile large chunks of codebase. Total compilation times are also improved, as the sizes of flat_mutation_reader.hh and flat_mutation_reader_v2.hh have been reduced and those files are included by many files in the codebase. With changes real 29m14.051s user 168m39.071s sys 5m13.443s Without changes real 30m36.203s user 175m43.354s sys 5m26.376s Closes #10194	2022-03-14 13:20:25 +02:00
Botond Dénes	70e95a9cf7	test/boost/mutation_writer_test: test the v2 variant of distribute_reader_and_consume_on_shards() The underlying implementation behind the v1 and v2 variants of said methods is the same, but we want to move to using the v2 variant in the test as the v1 variant is going away soon.	2022-03-02 09:57:24 +02:00
Avi Kivity	fcb8d040e8	treewide: use Software Package Data Exchange (SPDX) license identifiers Instead of lengthy blurbs, switch to single-line, machine-readable standardized (https://spdx.dev) license identifiers. The Linux kernel switched long ago, so there is strong precedent. Three cases are handled: AGPL-only, Apache-only, and dual licensed. For the latter case, I chose (AGPL-3.0-or-later and Apache-2.0), reasoning that our changes are extensive enough to apply our license. The changes we applied mechanically with a script, except to licenses/README.md. Closes #9937	2022-01-18 12:15:18 +01:00
Botond Dénes	e772326b10	mutation_writer: add v2 version of segregate_by_partition() Just a facade using converters behind the scenes. The actual segregator is not worth migrating to v2 while mutation and the flushing readers don't have v2 versions. Still, migrating all users to a v2 API allows the conversion to happen at a single point where more work is necessary, instead of scattered around all the users. We leave the v1 version in place to aid incremental migration to the v2 one.	2022-01-14 08:54:26 +02:00
Botond Dénes	9826b5d732	mutation_writer: migrate timestamp_based_splitting_writer to v2	2022-01-07 13:51:48 +02:00
Asias He	a8ad385ecd	repair: Get rid of the gc_grace_seconds The gc_grace_seconds is a very fragile and broken design inherited from Cassandra. Deleted data can be resurrected if cluster wide repair is not performed within gc_grace_seconds. This design pushes the job of keeping the database consistent to the user. In practice, it is very hard to guarantee repair is performed within gc_grace_seconds all the time. For example, repair workload has the lowest priority in the system which can be slowed down by the higher priority workload, so that there is no guarantee when a repair can finish. A gc_grace_seconds value that used to work might not work after data volume grows in a cluster. Users might want to avoid running repair during a specific period where latency is the top priority for their business. To solve this problem, an automatic mechanism to protect against data resurrection is proposed and implemented. The main idea is to remove the tombstone only after the range that covers the tombstone is repaired. In this patch, a new table option tombstone_gc is added. The option is used to configure tombstone gc mode. For example: 1) GC a tombstone after gc_grace_seconds cqlsh> ALTER TABLE ks.cf WITH tombstone_gc = {'mode':'timeout'} ; This is the default mode. If no tombstone_gc option is specified by the user, the old gc_grace_seconds based gc will be used. 2) Never GC a tombstone cqlsh> ALTER TABLE ks.cf WITH tombstone_gc = {'mode':'disabled'}; 3) GC a tombstone immediately cqlsh> ALTER TABLE ks.cf WITH tombstone_gc = {'mode':'immediate'}; 4) GC a tombstone after repair cqlsh> ALTER TABLE ks.cf WITH tombstone_gc = {'mode':'repair'}; In addition to the 'mode' option, another option 'propagation_delay_in_seconds' is added. It defines the max time a write could possibly delay before it eventually arrives at a node. A new gossip feature TOMBSTONE_GC_OPTIONS is added. The new tombstone_gc option can only be used after the whole cluster supports the new feature. 
A mixed cluster works with no problem. Tests: compaction_test.py, ninja test Fixes #3560 [avi: resolve conflicts vs data_dictionary]	2022-01-04 19:48:14 +02:00
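The four `tombstone_gc` modes listed in this commit can be read as a per-tombstone decision, sketched below. Everything here is hypothetical: `tombstone_gc_mode`, `gc_inputs` and `can_gc` are illustrative names, times are plain seconds since some epoch, and both the boundary comparison in `timeout` mode and the way `propagation_delay` is subtracted from the repair time in `repair` mode are assumptions, since the message does not spell them out.

```cpp
#include <chrono>
#include <optional>

using seconds = std::chrono::seconds;

// Hypothetical mirror of the 'mode' values named in the commit message.
enum class tombstone_gc_mode { timeout, disabled, immediate, repair };

// Hypothetical inputs to the decision; all times are seconds since an epoch.
struct gc_inputs {
    seconds deletion_time;                    // when the tombstone was written
    seconds now;                              // current time
    seconds gc_grace;                         // gc_grace_seconds of the table
    std::optional<seconds> last_repair_time;  // empty if the covering range was never repaired
    seconds propagation_delay;                // propagation_delay_in_seconds
};

// Returns true if the tombstone may be garbage-collected under the given mode.
bool can_gc(tombstone_gc_mode mode, const gc_inputs& in) {
    switch (mode) {
    case tombstone_gc_mode::timeout:
        // Assumption: collectable once gc_grace has fully elapsed.
        return in.deletion_time + in.gc_grace <= in.now;
    case tombstone_gc_mode::disabled:
        return false;
    case tombstone_gc_mode::immediate:
        return true;
    case tombstone_gc_mode::repair:
        if (!in.last_repair_time) {
            return false;  // range covering the tombstone has not been repaired yet
        }
        // Assumption: writes may arrive up to propagation_delay late, so only
        // tombstones older than (repair time - delay) are considered covered.
        return in.deletion_time < *in.last_repair_time - in.propagation_delay;
    }
    return false;
}
```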
Botond Dénes	aeddcf50a1	mutation_reader: convert make_combined_reader() overloads to v2 Just sprinkle the right amount of downgrade_to_v1() and upgrade_to_v2() calls at call sites; no attempt at optimization was made.	2021-12-20 09:29:05 +02:00
Botond Dénes	64bb48855c	flat_mutation_reader: revamp flat_mutation_reader_from_mutations() Add schema parameter so that: * Caller has better control over schema -- especially relevant for reverse reads where it is not possible to follow the convention of passing the query schema which is reversed compared to that of the mutations. * Now that we don't depend on the mutations for the schema, we can lift the restriction on mutations not being empty: this leads to safer code. When the mutations parameter is empty, an empty reader is created. Add "make_" prefix to follow convention of similar reader factory functions. Tests: unit(dev) Signed-off-by: Botond Dénes <bdenes@scylladb.com> Message-Id: <20211115155614.363663-1-bdenes@scylladb.com>	2021-11-15 17:58:46 +02:00
Botond Dénes	e4e369053b	test/boost: mutation_writer_test: harden the partition-based segregator test Test both methods: the "old" disk-based one and the recently added in-memory one, with different configurations and also add additional checks to ensure they don't lose data.	2021-11-02 12:24:37 +02:00
Botond Dénes	74f2290e49	mutation_writer: remove now unused on-disk partition segregator Also removes related tests, including the exception safety test which just spins forever with the memtable method.	2021-11-02 12:24:33 +02:00
Botond Dénes	f2f529855d	compaction,test: use the new in-memory segregator for scrub	2021-11-02 09:00:44 +02:00
Botond Dénes	0d744fd3fa	test: mutation_writer_test: add exception safety test for segregate_by_partition()	2021-10-21 06:50:22 +03:00
Botond Dénes	970fe9a339	mutation_writer: partition_based_splitting_writer: limit number of max buckets Recently we observed an OOM caused by the partition based splitting writer going crazy, creating 1.7K buckets while scrubbing an especially broken sstable. To avoid situations like that in the future, this patch provides a max limit for the number of live buckets. When the number of buckets reaches this limit, the largest bucket is closed and replaced by a new bucket. This will end up creating more output sstables during scrub overall, but now they won't all be written at the same time causing insane memory pressure and possibly OOM. Scrub compaction sets this limit to 100, the same limit the TWCS's timestamp based splitting writer uses (implemented through the classifier - time_window_compaction_strategy::max_data_segregation_window_count). Fixes: #9400 Tests: unit(dev) Closes #9401	2021-09-29 16:31:29 +03:00
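The bucket cap described in this commit (close the largest live bucket and replace it when the limit is reached) can be sketched like this. The names `bucket` and `capped_buckets` are hypothetical, integers stand in for buffered partitions, and "closing" is modeled as moving the bucket to a list of finished outputs; the real writer flushes to sstables instead.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical bucket: just the items buffered for one output stream.
struct bucket {
    std::vector<int> items;
};

// Keeps at most max_buckets live buckets. Opening one more closes the
// largest live bucket and reuses its slot for a fresh, empty bucket.
class capped_buckets {
    std::size_t _max_buckets;
    std::vector<bucket> _live;
    std::vector<bucket> _closed;
public:
    explicit capped_buckets(std::size_t max_buckets)
        : _max_buckets(std::max<std::size_t>(1, max_buckets)) {}

    // Returns the index of a new, empty live bucket.
    std::size_t open_bucket() {
        if (_live.size() < _max_buckets) {
            _live.emplace_back();
            return _live.size() - 1;
        }
        // At the limit: find the largest live bucket, close it, replace it.
        auto largest = std::max_element(_live.begin(), _live.end(),
            [](const bucket& a, const bucket& b) { return a.items.size() < b.items.size(); });
        _closed.push_back(std::move(*largest));
        *largest = bucket{};
        return static_cast<std::size_t>(largest - _live.begin());
    }

    void add(std::size_t index, int item) { _live[index].items.push_back(item); }
    std::size_t live_count() const { return _live.size(); }
    const std::vector<bucket>& closed() const { return _closed; }
};
```

Closing the largest bucket frees the most memory per eviction, at the cost of producing more output streams overall, which matches the trade-off stated in the message.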
Benny Halevy	4476800493	flat_mutation_reader: get rid of timeout parameter Now that the timeout is taken from the reader_permit. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2021-08-24 16:30:51 +03:00
Botond Dénes	2d2b9e7b36	test/boost: migrate off the global test reader semaphore	2021-07-08 16:53:38 +03:00
Avi Kivity	a55b434a2b	treewide: extend copyright statements to present day	2021-06-06 19:18:49 +03:00
Pavel Solodovnikov	fff7ef1fc2	treewide: reduce boost headers usage in scylla header files `dev-headers` target is also ensured to build successfully. Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>	2021-05-20 01:33:18 +03:00
Botond Dénes	a53e6bc6e8	mutation_writer: add segregate_by_partition Add a new segregator which segregates a stream, potentially containing duplicate or even out-of-order partitions, into multiple output streams, such that each output stream has strictly monotonic partitions. This segregator will be used by a new scrub compaction mode which is meant to fix sstables containing duplicate or out-of-order data.	2021-05-05 12:03:42 +03:00
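Splitting a stream with duplicate or out-of-order partitions into strictly monotonic output streams can be sketched with a simple first-fit placement. This is an assumption for illustration, not the algorithm used by `segregate_by_partition`: `segregate_strictly_monotonic` is a hypothetical name and integers stand in for partition keys.

```cpp
#include <vector>

// Distributes keys over buckets so that every bucket is strictly increasing.
// First-fit: a key goes to the first bucket whose last key is smaller;
// if no bucket qualifies (duplicate or out-of-order key), a new one is opened.
std::vector<std::vector<int>> segregate_strictly_monotonic(const std::vector<int>& input) {
    std::vector<std::vector<int>> buckets;
    for (int key : input) {
        bool placed = false;
        for (auto& b : buckets) {
            if (b.back() < key) {  // buckets are never empty once created
                b.push_back(key);
                placed = true;
                break;
            }
        }
        if (!placed) {
            buckets.push_back(std::vector<int>{key});
        }
    }
    return buckets;
}
```

With the input `{1, 3, 2, 3, 5, 4}` this yields two streams, `{1, 3, 5}` and `{2, 3, 4}`, each of which could be written out as a well-ordered sstable.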
Benny Halevy	aa5289f255	test: everywhere: close flat_mutation_reader when done Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2021-04-25 11:35:07 +03:00


66 Commits