scylladb

Author	SHA1	Message	Date
Raphael S. Carvalho	3d9aa9d49e	compaction: Reduce twcs off-strategy space overhead to 10% of free space TWCS off-strategy suffers with 100% space overhead, so a big TWCS table can cause scylla to run out of disk space during node ops. To not penalize TWCS tables, that take a small percentage of disk, with increased write ampl, TWCS off-strategy will be restricted to 10% of free disk space. Then small tables can still compact all disjoint sstables in a single round. Fixes #16514. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> (cherry picked from commit `ace4e5111e`)	2024-06-20 20:41:41 +00:00
Raphael S. Carvalho	ef72075920	compaction: wire storage free space into reshape procedure After this, TWCS reshape procedure can be changed to limit job to 10% of available space. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> (cherry picked from commit `0ce8ee03f1`)	2024-06-20 20:41:41 +00:00
Kefu Chai	40ce52c3cc	test: use generic boost_test_print_type() in this change, we trade the `boost_test_print_type()` overloads for the generic template of `boost_test_print_type()`, except for those in the very small tests, which presumably want to keep themselves relative self-contained. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#18727	2024-05-20 12:56:20 +03:00
Aleksandra Martyniuk	532653f118	replica: replace table::as_table_state Replace table::as_table_state with table::try_get_table_state_with_static_sharding which throws if a table does not use static sharding.	2024-05-10 14:56:38 +02:00
Kefu Chai	a439ebcfce	treewide: include fmt/ranges.h and/or fmt/std.h before this change, we rely on the default-generated fmt::formatter created from operator<<, but fmt v10 dropped the default-generated formatter. in this change, we include `fmt/ranges.h` and/or `fmt/std.h` for formatting the container types, like vector, map optional and variant using {fmt} instead of the homebrew formatter based on operator<<. with this change, the changes adding fmt::formatter and the changes using ostream formatter explicitly, we are allowed to drop `FMT_DEPRECATED_OSTREAM` macro. Refs scylladb#13245 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2024-04-19 22:56:16 +08:00
Ferenc Szili	5c0de3b097	test/boost/sstable_compaction_test.cc Checks if the tombstone_threshold value will be ignored if unchecked_tombstone_compaction is set to true	2024-03-22 11:21:21 +01:00
Botond Dénes	2335f42b2b	test/boost/sstable_compaction_test: add validation test with valid sstable Add a positive test, as it turns out we had some false-positive validation bugs in the validator and we need a regression test for this.	2024-03-12 11:05:18 -04:00
Botond Dénes	8be97884ec	test/boost/sstable_compaction_test: drop write_corrupt_sstable() helper It is not used anymore.	2024-03-12 11:05:18 -04:00
Botond Dénes	da0f4d3a9f	test/boost/sstable_compaction_test: fix indentation	2024-03-12 11:05:18 -04:00
Botond Dénes	c35092aff6	test/boost/sstable_compaction_test: use test_scrub_framework in test_scrub_quarantine_mode_test The test becomes a lot shorter and it now uses random schema and random data. Indentation is left broken, to be fixed in a future patch.	2024-03-12 11:05:18 -04:00
Botond Dénes	3f76aad609	test/boost/sstable_compaction_test: use scrub_test_framework in sstable_scrub_segregate_mode_test The test becomes a lot shorter and it now uses random schema and random data. Indentation is left broken, to be fixed in a future patch.	2024-03-12 11:05:18 -04:00
Botond Dénes	5237e8133b	test/boost/sstable_compaction_test: use scrub_test_framework in sstable_scrub_skip_mode_test The test becomes a lot shorter and it now uses random schema and random data. The test is also split in two: one test for abort mode and one for skip mode. Indentation is left broken, to be fixed in a future patch.	2024-03-12 11:05:18 -04:00
Botond Dénes	76785baf43	test/boost/sstable_compaction_test: use scrub_test_framework in sstable_scrub_validate_mode_test The test becomes a lot shorter and it now uses random schema and random data. Indentation is left broken, to be fixed in a future patch.	2024-03-12 11:05:18 -04:00
Botond Dénes	b6f0c4efa0	test/boost/sstable_compaction_test: introduce scrub_test_framework Scrub tests require a lot of boilerplate code to work. This has a lot of disadvantages: * Tests are long * The "meat" of the test is lost between all the boiler-plate, it is hard to glean what a test actually does * Tests are hard to write, so we have only a few of them and they test multiple things. * The boiler-plate differs slightly from test-to-test. To solve this, this patch introduces a new class, `scrub_test_framework`, which is a central place for all the boiler-plate code needed to write scrub-related tests. In the next patches, we will migrate scrub related tests to this class.	2024-03-12 11:05:18 -04:00
Avi Kivity	605bf6e221	range.hh: retire range.hh was deprecated in `bd794629f9` (2020) since its names conflict with the C++ library concept of an iterator range. The name ::range also mapped to the dangerous wrapping_interval rather than nonwrapping_interval. Complete the deprecation by removing range.hh and replacing all the aliases by the names they point to from the interval library. Note this now exposes uses of wrapping intervals as they are now explicit. The unit tests are renamed and range.hh is deleted. Closes scylladb/scylladb#17428	2024-02-21 00:24:25 +02:00
Kefu Chai	97587a2ea4	test/boost: do not include unused headers these unused includes were identified by clangd. see https://clangd.llvm.org/guides/include-cleaner#unused-include-warning for more details on the "Unused include" warning. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#17139	2024-02-06 13:22:16 +02:00
Avi Kivity	7cb1c10fed	treewide: replace seastar::future::get0() with seastar::future::get() get0() dates back from the days where Seastar futures carried tuples, and get0() was a way to get the first (and usually only) element. Now it's a distraction, and Seastar is likely to deprecate and remove it. Replace with seastar::future::get(), which does the same thing.	2024-02-02 22:12:57 +08:00
Raphael S. Carvalho	ee203f846e	test: Fix segfault when running offstrategy test Observer, that references table_for_test, must of course, not outlive table_for_test. Observer can be called later after the last input sstable is removed from sstable manager. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Closes scylladb/scylladb#16428	2023-12-20 19:04:41 +02:00
Raphael S. Carvalho	d1e6dfadea	sstables: Harden estimate_droppable_tombstone_ratio() interface The interface is fragile because the user may incorrectly use the wrong "gc before". Given that sstable knows how to properly calculate "gc before", let's do it in estimate__d__t__r(), leaving no room for mistakes. sstable_run's variant was also changed to conform to new interface, allowing ICS to properly estimate droppable ratio, using GC before that is calculated using each sstable's range. That's important for upcoming tablets, as we want to query only the range that belongs to a particular tablet in the repair history table. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Closes scylladb/scylladb#15931	2023-12-20 19:04:41 +02:00
Raphael S. Carvalho	b1c5d5dd4e	compaction: Add splitting compaction Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2023-12-17 11:40:08 -03:00
Avi Kivity	8fa2e3ad2a	Merge 'Remove sstables::remove_by_toc_name()' from Pavel Emelyanov The helper in question complicates the logic of sstable_directory::process() by making garbage collection differently for sstables deleted "atomically" and deleted "one-by-one". Also, the code that deletes sstables one-by-one and uses remove_by_toc_name() renders excessive TOC file reading, because there's sstable object at hand and it had all_components() ready for use. Surprisingly, there was no test for the deletion-log functionality. This PR adds one. The test passes before the g.c. and regular unlink fix, and (of course) continues passing after it. Closes scylladb/scylladb#16240 * github.com:scylladb/scylladb: sstables: Drop remove_by_name() sstables/fs_storage: Wipe by recognized+unrecognized components sstable_directory: Enlight deletion log replay sstables: Split remove_by_toc_name() test: Add test case to validate deletion log work sstable_directory: Close dir on exception sstable_directory: Fix indentation after previous patch sstable_directory: Coroutinize delete_with_pending_deletion_log() test: Sstable on_delete() is not necessarily in a thread sstable_directory: Split delete_with_pending_deletion_log()	2023-12-03 17:29:34 +02:00
Yaniv Kaul	c658bdb150	Typos: fix typos in comments Fixes some typos as found by codespell run on the code. In this commit, I was hoping to fix only comments, not user-visible alerts, output, etc. Follow-up commits will take care of them. Refs: https://github.com/scylladb/scylladb/issues/16255 Signed-off-by: Yaniv Kaul <yaniv.kaul@scylladb.com>	2023-12-02 22:37:22 +02:00
Pavel Emelyanov	92f0aa04d0	test: Sstable on_delete() is not necessarily in a thread One of the test cases injects an observer into sstable->unlink() method via its _on_delete() callback. The test's callback assumes that it runs in an async context, but it's a happy coincidence, because deletion via the deletion log runs so. Next patch is changing it and the test case will no longer work. But since it's a test case it can just directly call a libc function for its needs Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-12-01 15:00:38 +03:00
Kefu Chai	687ba9cacc	test/sstable_compaction_test: check every sstable replaced sstable before this change, in sstable_run_based_compaction_test, we check every 4 sstables, to verify that we close the sstable to be replaced in a batch of 4. since the integer-based generation identifier is monotonically incremental, we can assume that the identifiers of sstables are like 0, 1, 2, 3, .... so if the compaction consumes sstable in a batch of 4, the identifier of the first one in the batch should always be the multiple of 4. unfortunately, this test does not work if we use uuid-based identifier. but if we take a closer look at how we create the dataset, we can have following facts: 1. the `compaction_descriptor` returned by `sstable_run_based_compaction_strategy_for_tests` never set `owned_ranges` in the returned descriptor 2. in `compaction::setup_sstable_reader`, `mutation_reader::forward::no` is used, if `_owned_ranges_checker` is empty 3. `mutation_reader_merger` respects the `fwd_mr` passed to its ctor, so it closes current sstable immediately when the underlying mutation reader reaches the end of stream. in other words, we close every sstable once it is fully consumed in sstable_compaction_test. and the reason why the existing test passes is that we just sample the sstables whose generation id is a multiple of 4. what happens when we perform compaction in this test is: 1. replace 5 with 33, closing 5 2. replace 6 with 34, closing 6 3. replace 7 with 35, closing 7 4. replace 8 with 36, closing 8 << let's check here.. good, go on! 5. replace 13 with 37, closing 13 ... 8. replace 16 with 40, closing 16 << let's check here.. also, good, go on! so, in this change, we just check all old sstables, to verify that we close each of them once it is fully consumed. Fixes #16073 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-11-16 16:21:46 +08:00
Kefu Chai	18792fe059	test/sstable_compaction_test: s/old_sstables.front()/old_sstable/ for better readability. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-11-16 16:21:40 +08:00
Nadav Har'El	64d1d5cf62	Merge 'Fix partition estimation with TWCS tables during streaming' from Raphael "Raph" Carvalho TWCS tables require partition estimation adjustment as incoming streaming data can be segregated into the time windows. Turns out we had two problems in this area that leads to suboptimal bloom filters. 1) With off-strategy enabled, data segregation is postponed, but partition estimation was adjusted as if segregation wasn't postponed. Solved by not adjusting estimation if segregation is postponed. 2) With off-strategy disabled, data segregation is not postponed, but streaming didn't feed any metadata into partition estimation procedure, meaning it had to assume the max windows input data can be segregated into (100). Solved by using schema's default TTL for a precise estimation of window count. For the future, we want to dynamically size filters (see https://github.com/scylladb/scylladb/issues/2024), especially for TWCS that might have SSTables that are left uncompacted until they're fully expired, meaning that the system won't heal itself in a timely manner through compaction on a SSTable that had partition estimation really wrong. Fixes https://github.com/scylladb/scylladb/issues/15704. Closes scylladb/scylladb#15938 * github.com:scylladb/scylladb: streaming: Improve partition estimation with TWCS streaming: Don't adjust partition estimate if segregation is postponed	2023-11-14 20:41:36 +02:00
Kefu Chai	5a6c5320de	test/sstable_compaction_test: use BOOST_REQUIRE_EQUAL when appropriate Boost.Test prints the LHS and RHS when the predicate statement passed to BOOST_REQUIRE_EQUAL() macro evaluates to false. so the error message printed by Boost would be more developer friendly when the test fails. in this test, we replace some BOOST_REQUIRE() with BOOST_REQUIRE_EQUAL() when appropriate. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#16047	2023-11-14 13:51:47 +02:00
Pavel Emelyanov	f4696f21a8	test/utils: Drop compaction_manager_test This class only provides a .run() method which allocates a task and calls sstables::test_env::perform_compaction(). This can be done in a helper method, no need for the whole class for it. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-11-13 11:44:51 +03:00
Pavel Emelyanov	0160265c7d	test/env: Add sstables::test_env& to compaction_manager_test::run() Continuation of the previous patch that will also be used further. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-11-13 11:44:51 +03:00
Pavel Emelyanov	393c066f3e	test/utils: Add sstables::test_env& to compact_sstables() Will be used in next patches. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-11-13 11:44:51 +03:00
Pavel Emelyanov	9a9e1fdd7d	test/utils: Squash two compact_sstables() helpers Now the one sitting in utils is only called from its peer in compaction test. Things get simpler if they get merged. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-11-13 11:44:51 +03:00
Pavel Emelyanov	69657a2a97	test/compaction: Use shorter compact_sstables() helper There are several of them spread between the test and utils. One of the test cases can use its local shorter overload for brevity. Also this makes one of the next patches shorter. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-11-13 11:44:51 +03:00
Pavel Emelyanov	aec3fc493a	test/utils: Move compaction_manager_test::propagate_replacement() The purpose of this method is to turn public the private compaction_manager method of the same name. The caller of this method is having sstable_test_env at hand with its test_env_compaction_manager, so the de-private-isation call can be moved. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-11-13 11:44:51 +03:00
Raphael S. Carvalho	b551f4abd2	streaming: Improve partition estimation with TWCS When off-strategy is disabled, data segregation is not postponed, meaning that getting partition estimate right is important to decrease filter's false positives. With streaming, we don't have min and max timestamps at destination, well, we could have extended the RPC verb to send them, but turns out we can deduce easily the amount of windows using default TTL. Given partitioner random nature, it's not absurd to assume that a given range being streamed may overlap with all windows, meaning that each range will yield one sstable for each window when segregating incoming data. Today, we assume the worst of 100 windows (which is the max amount of sstables the input data can be segregated into) due to the lack of metadata for estimating the window count. But given that users are recommended to target a max of ~20 windows, it means partition estimate is being downsized 5x more than needed. Let's improve it by using default TTL when estimating window count, so even on absence of timestamp metadata, the partition estimation won't be way off. Fixes #15704. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2023-11-08 12:10:03 +02:00
Botond Dénes	76ab66ca1f	Merge 'Support state change for S3-backed sstables' from Pavel Emelyanov The sstable currently can move between normal, staging and quarantine state runtime. For S3-backed sstables the state change means maintaining the state itself in the ownership table and updating it accordingly. There's also the upload facility that's implemented as state change too, but this PR doesn't support this part. fixes: #13017 Closes scylladb/scylladb#15829 * github.com:scylladb/scylladb: test: Make test_sstables_excluding_staging_correctness run over s3 too sstables,s3: Support state change (without generation change) system_keyspace: Add state field to system.sstables sstable_directory: Tune up sstables entries processing comment system_keyspace: Tune up status change trace message sstables: Add state string to state enum class convert	2023-11-07 10:45:41 +02:00
Pavel Emelyanov	3173336e97	tests: Use make_sstable_easy() where appropriate There are two test cases out there that make sstable, write it and then load it, but the make_sstable_easy() is for that, so use it there. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-11-02 19:32:43 +03:00
Pavel Emelyanov	731a82869a	test/sstable_compaction_test: Make use of make_table_for_tests() The max_ongoing_compaction_test test case constructs table object by hand. For that it needs tracker, compaction manager and stats. Similarly to previous patch, the test_env::make_table_for_tests() helper does exactly that, so the test case can be simplified as well. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-11-01 14:18:17 +03:00
Pavel Emelyanov	3f354c07a3	test/sstable_compaction_test: Remove unused tracker allocation The sstable_run_based_compaction_test case allocates the tracker but doesn't use it. Probably was left after the case was patched to use make_table_for_tests() helper. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-11-01 14:18:12 +03:00
Pavel Emelyanov	e71409df38	table_for_tests: Get compaction manager from table There's table_for_tests::get_compaction_manager() helper that's excessive as compaction manager reference can be provided by the wrapped table object itself. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-10-31 09:37:22 +03:00
Pavel Emelyanov	cba8f633f1	tests: Split the compaction backlog test case To improve parallelism of embedded test sub-cases. By coincidence, indentation fix is not required. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-10-31 09:27:57 +03:00
Pavel Emelyanov	c88de8f91e	test/compaction: Use shorter make_table_for_tests() overload There's one that doesn't need tempdir path argument since it gets one from the env onboard tempdir anyway Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes scylladb/scylladb#15825	2023-10-30 20:16:29 +02:00
Pavel Emelyanov	cb63d303f0	test: Make test_sstables_excluding_staging_correctness run over s3 too This test checks the way sstable is moved and lives in staging state. Now it passes on S3 as well Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-10-24 19:12:37 +03:00
Pavel Emelyanov	ec94cc9538	Merge 'test: set use_uuid to true by default in sstables::test_env ' from Kefu Chai this series 1. let sstable tests using test_env to use uuid-based sstable identifiers by default 2. let the test who requires integer-based identifier keep using it this should enable us to perform the s3 related test after enforcing the uuid-based identifier for s3 backend, otherwise the s3 related test would fail as it also utilize `test_env`. Closes scylladb/scylladb#14553 * github.com:scylladb/scylladb: test: set use_uuid to true by default in sstables::test_env test: enable test to set uuid_sstable_identifiers	2023-10-19 09:09:38 +03:00
Raphael S. Carvalho	da04fea71e	compaction: Fix key estimation per sstable to produce efficient filters The estimation assumes that size of other components are irrelevant, when estimating the number of partitions for each output sstable. The sstables are split according to the data file size, therefore size of other files are irrelevant for the estimation. With certain data models, like single-row partitions containing small values, the index could be even larger than data. For example, assume index is as large as data, then the estimation would say that 2x more sstables will be generated, and as a result, each sstable are underestimated to have 2x less keys. Fix it by only accounting size of data file. Fixes #15726. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Closes scylladb/scylladb#15727	2023-10-17 11:21:11 +03:00
Kefu Chai	50c8619ed9	test: enable test to set uuid_sstable_identifiers some of the tests are still relying on the integer-based sstable identifier, so let's add a method to test_env, so that the tests relying on this can opt-out. we will change the default setting of sstables::test_env to use uuid-base sstable identifier in the next commit. this change does not change the existing behavior. it just adds a new knob to test_env_config. and let the tests relying on this to customize the test_env_config to disable use_uuid. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-10-07 18:56:47 +08:00
Raphael S. Carvalho	8997fe0625	compaction: Switch to strategy_control::candidates() for regular compaction Now everything is prepared for the switch, let's do it. Now let's wait for ICS to enjoy the set of changes. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2023-09-25 17:18:21 -03:00
Raphael S. Carvalho	761a37022f	tests: Prepare sstable_compaction_test for change in compaction_strategy interface Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2023-09-25 17:18:21 -03:00
Raphael S. Carvalho	02f1f24f27	compaction: Allow strategy to retrieve candidates either as sstables or runs That's needed for upcoming changes that will allow ICS to efficiently retrieve sstable runs. Next patch will remove candidates from compaction_strategy's interface to retrieve candidates using this one instead. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2023-09-25 17:18:21 -03:00
Raphael S. Carvalho	8235889b8a	sstables: tag sstable_run::insert() with nodiscard sstable_run may reject insertion of a sstable if it's going to break the disjoint invariant of the run, but it's important that the caller is aware of it, so it can act on it like generating a new run id for the sstable so it can be inserted in another run. the tag is important to avoid unknown problems in this area. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2023-09-25 17:18:21 -03:00
Raphael S. Carvalho	91efd878d7	test: Verify that off-strategy can do incremental compaction Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2023-09-21 11:15:46 -03:00


308 Commits