TWCS off-strategy suffers from 100% space overhead, so a big TWCS table
can cause scylla to run out of disk space during node ops.
To avoid penalizing TWCS tables that take a small percentage of disk
with increased write amplification, TWCS off-strategy will be restricted
to 10% of free disk space. Small tables can then still compact all
disjoint sstables in a single round.
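A minimal sketch of the 10% cap, not the actual reshape code: the helper name, its signature, and the rule that at least one sstable is always picked are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: returns how many of the (already disjoint) sstables
// fit in one reshape job whose input is capped at 10% of free disk space.
// Assumption: at least one sstable is always picked so reshape makes progress.
inline std::size_t sstables_for_one_reshape_round(const std::vector<uint64_t>& sstable_sizes,
                                                  uint64_t free_space_bytes) {
    const uint64_t budget = free_space_bytes / 10;
    uint64_t used = 0;
    std::size_t picked = 0;
    for (uint64_t size : sstable_sizes) {
        if (picked > 0 && used + size > budget) {
            break; // adding this sstable would exceed the 10% budget
        }
        used += size;
        ++picked;
    }
    return picked;
}
```

With this model, a small table whose sstables together stay under the budget is compacted in a single round, while a big table is split across several rounds.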
Fixes #16514.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
(cherry picked from commit ace4e5111e)
After this, the TWCS reshape procedure can be changed to limit a job
to 10% of available space.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
(cherry picked from commit 0ce8ee03f1)
in this change, we trade the `boost_test_print_type()` overloads
for the generic template of `boost_test_print_type()`, except for
those in the very small tests, which presumably want to keep
themselves relatively self-contained.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closes scylladb/scylladb#18727
before this change, we rely on the default-generated fmt::formatter
created from operator<<, but fmt v10 dropped the default-generated
formatter.
in this change, we include `fmt/ranges.h` and/or `fmt/std.h`
for formatting the container types, like vector, map,
optional and variant, using {fmt} instead of the homebrew
formatter based on operator<<.
with this change, the changes adding fmt::formatter and
the changes using ostream formatter explicitly, we are
allowed to drop `FMT_DEPRECATED_OSTREAM` macro.
Refs scylladb#13245
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
The test becomes a lot shorter and it now uses random schema and random
data. The test is also split in two: one test for abort mode and one for
skip mode.
Indentation is left broken, to be fixed in a future patch.
Scrub tests require a lot of boilerplate code to work. This has a lot of
disadvantages:
* Tests are long
* The "meat" of the test is lost between all the boiler-plate; it is
hard to glean what a test actually does
* Tests are hard to write, so we have only a few of them and they test
multiple things.
* The boiler-plate differs slightly from test-to-test.
To solve this, this patch introduces a new class, `scrub_test_framework`,
which is a central place for all the boiler-plate code needed to write
scrub-related tests. In the next patches, we will migrate scrub related
tests to this class.
range.hh was deprecated in bd794629f9 (2020) since its names
conflict with the C++ library concept of an iterator range. The name
::range also mapped to the dangerous wrapping_interval rather than
nonwrapping_interval.
Complete the deprecation by removing range.hh and replacing all the
aliases by the names they point to from the interval library. Note
this now exposes uses of wrapping intervals as they are now explicit.
The unit tests are renamed and range.hh is deleted.
Closes scylladb/scylladb#17428
get0() dates back to the days when Seastar futures carried tuples, and
get0() was a way to get the first (and usually only) element. Now
it's a distraction, and Seastar is likely to deprecate and remove it.
Replace with seastar::future::get(), which does the same thing.
The observer, which references table_for_test, must of course not
outlive table_for_test. The observer can be called later, after the
last input sstable is removed from the sstable manager.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Closes scylladb/scylladb#16428
The interface is fragile because the user may incorrectly use the
wrong "gc before". Given that sstable knows how to properly calculate
"gc before", let's do it in estimate__d__t__r(), leaving no room
for mistakes.
sstable_run's variant was also changed to conform to new interface,
allowing ICS to properly estimate droppable ratio, using GC before
that is calculated using each sstable's range. That's important for
upcoming tablets, as we want to query only the range that belongs
to a particular tablet in the repair history table.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Closes scylladb/scylladb#15931
The helper in question complicates the logic of sstable_directory::process() by performing garbage collection differently for sstables deleted "atomically" and deleted "one-by-one". Also, the code that deletes sstables one-by-one and uses remove_by_toc_name() reads the TOC file unnecessarily, because there's an sstable object at hand and it has all_components() ready for use.
Surprisingly, there was no test for the deletion-log functionality. This PR adds one. The test passes before the g.c. and regular unlink fix, and (of course) continues passing after it.
Closes scylladb/scylladb#16240
* github.com:scylladb/scylladb:
sstables: Drop remove_by_name()
sstables/fs_storage: Wipe by recognized+unrecognized components
sstable_directory: Enlight deletion log replay
sstables: Split remove_by_toc_name()
test: Add test case to validate deletion log work
sstable_directory: Close dir on exception
sstable_directory: Fix indentation after previous patch
sstable_directory: Coroutinize delete_with_pending_deletion_log()
test: Sstable on_delete() is not necessarily in a thread
sstable_directory: Split delete_with_pending_deletion_log()
Fixes some typos as found by codespell run on the code.
In this commit, I was hoping to fix only comments, not user-visible alerts, output, etc.
Follow-up commits will take care of them.
Refs: https://github.com/scylladb/scylladb/issues/16255
Signed-off-by: Yaniv Kaul <yaniv.kaul@scylladb.com>
One of the test cases injects an observer into the sstable->unlink() method
via its _on_delete() callback. The test's callback assumes that it runs
in an async context, but that's a happy coincidence, because deletion via
the deletion log happens to run so. The next patch changes that and the
test case will no longer work. But since it's a test case, it can just
directly call a libc function for its needs.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
before this change, in sstable_run_based_compaction_test, we check
every 4 sstables, to verify that we close the sstable to be replaced
in a batch of 4.
since the integer-based generation identifier is monotonically
increasing, we can assume that the identifiers of sstables are like
0, 1, 2, 3, .... so if the compaction consumes sstables in
batches of 4, the identifier of the first one in each batch should
always be a multiple of 4. unfortunately, this test does not work
if we use uuid-based identifiers.
but if we take a closer look at how we create the dataset, we can
observe the following facts:
1. the `compaction_descriptor` returned by
`sstable_run_based_compaction_strategy_for_tests` never
sets `owned_ranges` in the returned descriptor
2. in `compaction::setup_sstable_reader`, `mutation_reader::forward::no`
is used, if `_owned_ranges_checker` is empty
3. `mutation_reader_merger` respects the `fwd_mr` passed to its
ctor, so it closes current sstable immediately when the underlying
mutation reader reaches the end of stream.
in other words, we close every sstable once it is fully consumed in
sstable_compaction_test. and the reason why the existing test passes
is that we just sample the sstables whose generation id is a multiple
of 4. what happens when we perform compaction in this test is:
1. replace 5 with 33, closing 5
2. replace 6 with 34, closing 6
3. replace 7 with 35, closing 7
4. replace 8 with 36, closing 8 << let's check here.. good, go on!
5. replace 13 with 37, closing 13
...
8. replace 16 with 40, closing 16 << let's check here.. also, good, go on!
so, in this change, we just check all old sstables, to verify that
we close each of them once it is fully consumed.
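The difference between the two checks can be illustrated with a toy model. This is only a sketch: `toy_compaction`, `sampled_check` and `full_check` are hypothetical names, not the code of the test. It shows that sampling only the generations that are multiples of 4 cannot notice an sstable that was left open, while checking every consumed sstable can.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Toy stand-in for the compaction under test: remembers which
// generations have been closed so far.
struct toy_compaction {
    std::set<int> closed;
    void close(int generation) { closed.insert(generation); }
};

// Old-style check: only looks at generations that are multiples of 4.
inline bool sampled_check(const toy_compaction& c, const std::vector<int>& consumed) {
    for (int gen : consumed) {
        if (gen % 4 == 0 && c.closed.count(gen) == 0) {
            return false;
        }
    }
    return true;
}

// New-style check: every fully consumed sstable must have been closed.
inline bool full_check(const toy_compaction& c, const std::vector<int>& consumed) {
    for (int gen : consumed) {
        if (c.closed.count(gen) == 0) {
            return false;
        }
    }
    return true;
}
```

The full check also does not depend on generations being integers, which is what makes it usable with uuid-based identifiers.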
Fixes #16073
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
TWCS tables require partition estimation adjustment as incoming streaming data can be segregated into the time windows.
Turns out we had two problems in this area that lead to suboptimal bloom filters.
1) With off-strategy enabled, data segregation is postponed, but partition estimation was adjusted as if segregation wasn't postponed. Solved by not adjusting estimation if segregation is postponed.
2) With off-strategy disabled, data segregation is not postponed, but streaming didn't feed any metadata into partition estimation procedure, meaning it had to assume the max windows input data can be segregated into (100). Solved by using schema's default TTL for a precise estimation of window count.
For the future, we want to dynamically size filters (see https://github.com/scylladb/scylladb/issues/2024), especially for TWCS that might have SSTables that are left uncompacted until they're fully expired, meaning that the system won't heal itself in a timely manner through compaction on a SSTable that had partition estimation really wrong.
Fixes https://github.com/scylladb/scylladb/issues/15704.
Closes scylladb/scylladb#15938
* github.com:scylladb/scylladb:
streaming: Improve partition estimation with TWCS
streaming: Don't adjust partition estimate if segregation is postponed
Boost.Test prints the LHS and RHS when the predicate statement passed
to BOOST_REQUIRE_EQUAL() macro evaluates to false. so the error message
printed by Boost would be more developer friendly when the test fails.
in this test, we replace some BOOST_REQUIRE() with BOOST_REQUIRE_EQUAL()
when appropriate.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closes scylladb/scylladb#16047
This class only provides a .run() method which allocates a task and
calls sstables::test_env::perform_compaction(). This can be done in a
helper method, no need for the whole class for it.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Now the one sitting in utils is only called from its peer in compaction
test. Things get simpler if they get merged.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
There are several of them spread between the test and utils. One of the
test cases can use its local shorter overload for brevity. Also this
makes one of the next patches shorter.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
The purpose of this method is to make public the private
compaction_manager method of the same name. The caller of this method
has sstable_test_env at hand with its test_env_compaction_manager, so
the de-private-isation call can be moved.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
When off-strategy is disabled, data segregation is not postponed,
meaning that getting the partition estimate right is important to
decrease the filter's false positives. With streaming, we don't
have min and max timestamps at the destination. We could have
extended the RPC verb to send them, but it turns out we can easily
deduce the number of windows using default TTL. Given the partitioner's
random nature, it's not absurd to assume that a given range being
streamed may overlap with all windows, meaning that each range
will yield one sstable for each window when segregating incoming
data. Today, we assume the worst case of 100 windows (which is the
max number of sstables the input data can be segregated into)
due to the lack of metadata for estimating the window count.
But given that users are recommended to target a max of ~20
windows, it means the partition estimate is being downsized 5x more
than needed. Let's improve it by using default TTL when
estimating window count, so even in the absence of timestamp
metadata, the partition estimation won't be way off.
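The arithmetic can be sketched as follows. This is an illustration under stated assumptions, not the streaming code: the helper names, the round-up of TTL over window size, and the fallback to the worst case when no default TTL is set are all hypothetical. Only the cap of 100 windows comes from the text above.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstdint>

// Upper bound from the message: input data can be segregated into
// at most 100 sstables.
constexpr uint64_t max_windows = 100;

// Hypothetical helper: estimate the number of TWCS windows from the
// schema's default TTL. Assumption: a zero TTL means "unknown", so the
// worst case is used.
inline uint64_t estimate_window_count(std::chrono::seconds default_ttl,
                                      std::chrono::seconds window_size) {
    assert(window_size.count() > 0);
    if (default_ttl.count() <= 0) {
        return max_windows;
    }
    const uint64_t ttl = static_cast<uint64_t>(default_ttl.count());
    const uint64_t win = static_cast<uint64_t>(window_size.count());
    const uint64_t windows = (ttl + win - 1) / win; // round up
    return std::min(windows, max_windows);
}

// Hypothetical helper: partitions expected in each per-window sstable.
inline uint64_t partitions_per_sstable(uint64_t estimated_partitions, uint64_t windows) {
    assert(windows > 0);
    return std::max<uint64_t>(1, estimated_partitions / windows);
}
```

For a 20-day default TTL with 1-day windows this yields 20 windows, so a range estimated at 1,000,000 partitions is sized for 50,000 partitions per sstable rather than the 10,000 obtained with the worst case of 100 windows.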
Fixes #15704.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
The sstable currently can move between normal, staging and quarantine states at runtime. For S3-backed sstables, the state change means maintaining the state itself in the ownership table and updating it accordingly.
There's also the upload facility that's implemented as state change too, but this PR doesn't support this part.
fixes: #13017
Closes scylladb/scylladb#15829
* github.com:scylladb/scylladb:
test: Make test_sstables_excluding_staging_correctness run over s3 too
sstables,s3: Support state change (without generation change)
system_keyspace: Add state field to system.sstables
sstable_directory: Tune up sstables entries processing comment
system_keyspace: Tune up status change trace message
sstables: Add state string to state enum class convert
There are two test cases out there that make an sstable, write it and then
load it, but make_sstable_easy() is for that, so use it there.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
The max_ongoing_compaction_test test case constructs table object by
hand. For that it needs tracker, compaction manager and stats. Similarly
to previous patch, the test_env::make_table_for_tests() helper does
exactly that, so the test case can be simplified as well.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
The sstable_run_based_compaction_test case allocates the tracker but
doesn't use it. Probably was left after the case was patched to use
make_table_for_tests() helper.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
There's table_for_tests::get_compaction_manager() helper that's
excessive as compaction manager reference can be provided by the wrapped
table object itself.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
To improve parallelism of embedded test sub-cases.
By coincidence, indentation fix is not required.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
There's one that doesn't need tempdir path argument since it gets one
from the env onboard tempdir anyway
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Closes scylladb/scylladb#15825
this series
1. lets sstable tests using test_env use uuid-based sstable identifiers by default
2. lets the tests that require integer-based identifiers keep using them
this should enable us to perform the s3 related test after enforcing the uuid-based identifier for s3 backend, otherwise the s3 related test would fail as it also utilizes `test_env`.
Closes scylladb/scylladb#14553
* github.com:scylladb/scylladb:
test: set use_uuid to true by default in sstables::test_env
test: enable test to set uuid_sstable_identifiers
The estimation assumes that the size of other components is relevant
when estimating the number of partitions for each output sstable.
The sstables are split according to the data file size, therefore
the size of other files is irrelevant for the estimation.
With certain data models, like single-row partitions containing small
values, the index could be even larger than data.
For example, assume the index is as large as data: the estimation
would then say that 2x more sstables will be generated and, as a result,
each sstable is estimated to have 2x fewer keys than it will hold.
Fix it by only accounting for the size of the data file.
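The 2x example can be worked through with a small sketch. The helper name and its parameters are hypothetical and only model the arithmetic described above, assuming output sstables are split once the data file reaches a maximum size.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

inline uint64_t ceil_div(uint64_t a, uint64_t b) {
    return (a + b - 1) / b;
}

// Hypothetical helper: keys expected in each output sstable, given the
// number of bytes the estimation accounts for and the size at which
// output sstables are split.
inline uint64_t keys_per_output_sstable(uint64_t total_keys,
                                        uint64_t accounted_bytes,
                                        uint64_t max_sstable_size) {
    assert(max_sstable_size > 0);
    const uint64_t output_sstables =
        std::max<uint64_t>(1, ceil_div(accounted_bytes, max_sstable_size));
    return ceil_div(total_keys, output_sstables);
}
```

With 1000 bytes of data, an index of the same size, a split size of 100 bytes and 1,000,000 keys, accounting for data plus index predicts 20 output sstables of 50,000 keys each, whereas accounting for the data file alone predicts the 10 sstables of 100,000 keys that the split will actually produce.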
Fixes #15726.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Closes scylladb/scylladb#15727
some of the tests are still relying on the integer-based sstable
identifier, so let's add a method to test_env, so that the tests
relying on this can opt-out. we will change the default setting
of sstables::test_env to use uuid-based sstable identifier in the
next commit. this change does not change the existing behavior.
it just adds a new knob to test_env_config, and lets the tests
relying on this customize the test_env_config to disable
use_uuid.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Now everything is prepared for the switch, let's do it.
Now let's wait for ICS to enjoy the set of changes.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
That's needed for upcoming changes that will allow ICS to efficiently
retrieve sstable runs.
Next patch will remove candidates from compaction_strategy's interface
to retrieve candidates using this one instead.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
sstable_run may reject insertion of an sstable if it's going
to break the disjoint invariant of the run, but it's important
that the caller is aware of it, so it can act on it, like
generating a new run id for the sstable so it can be inserted
in another run. The tag is important to avoid unknown
problems in this area.
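A toy model of the idea, assuming the tag in question is the standard `[[nodiscard]]` attribute. The class below is hypothetical and tracks plain integer key ranges instead of real sstables.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <iterator>
#include <map>

// Toy run of sstables with disjoint, inclusive [first, last] key ranges.
class toy_sstable_run {
    std::map<int64_t, int64_t> _ranges; // first key -> last key
public:
    // Returns false, leaving the run untouched, if the new range would
    // overlap an existing one. [[nodiscard]] makes the compiler warn when
    // a caller ignores the result.
    [[nodiscard]] bool insert(int64_t first, int64_t last) {
        assert(first <= last);
        auto next = _ranges.lower_bound(first); // first range starting at or after `first`
        if (next != _ranges.end() && next->first <= last) {
            return false;
        }
        if (next != _ranges.begin()) {
            auto prev = std::prev(next);
            if (prev->second >= first) {
                return false;
            }
        }
        _ranges.emplace(first, last);
        return true;
    }

    std::size_t size() const { return _ranges.size(); }
};
```

A caller that gets `false` back can then react, for example by placing the sstable in a run with a fresh run id.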
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>