There's a database::get_snapshot_details() method that returns collection of all snapshots for all ks.cf out there and there are several *snapshot_details* aux structures around it. This PR keeps only one "details" and cleans up the way it propagates from database up to the respective API calls.
Closesscylladb/scylladb#18317
* github.com:scylladb/scylladb:
snapshot_ctl: Brush up true_snapshots_size() internals
snapshot_ctl: Remove unused details struct
snapshot_ctl: No double recoding of details
database,snapshots: Move database::snapshot_details into snapshot_ctl
database,snapshots: Make database::get_snapshot_details() return map, not vector
table,snapshots: Move table::snapshot_details into snapshot_ctl
Currently database::get_snapshot_details() returns a collection of
snapshots. The snapshot_ctl converts this collection into similarly
looking one with slightly different structures inside. The resulting
collection is converted one more time on the API layer into another
similarly looking map.
This patch removes the intermediate conversion.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
So that it's in-sync with table::get_snapshot_details(). Next patches
will improve this place even further.
Also, there can be many snapshots and vector can grow large, but that's
less of an issue here.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
before this change, we rely on the default-generated fmt::formatter
created from operator<<, but fmt v10 dropped the default-generated
formatter.
in this change, we include `fmt/ranges.h` and/or `fmt/std.h`
for formatting the container types, like vector, map
optional and variant using {fmt} instead of the homebrew
formatter based on operator<<.
with this change, the changes adding fmt::formatter and
the changes using ostream formatter explicitly, we are
allowed to drop `FMT_DEPRECATED_OSTREAM` macro.
Refs scylladb#13245
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
before this change, we rely on the default-generated fmt::formatter
created from operator<<, but fmt v10 dropped the default-generated
formatter. but fortunately, fmt v10 brings the builtin
formatter for classes derived from `std::exception`. but before
switching to {fmt} v10, and after dropping `FMT_DEPRECATED_OSTREAM`
macro, we need to print out `std::runtime_error`. so far, we don't
have a shared place for formatter for `std::runtime_error`. so we
are addressing the needs on a case-by-case basis.
in this change, we just print it using `e.what()`. it's behavior
is identical to what we have now.
Refs #13245
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
This patch changes get_unlimited_query_max_result_size():
* Also set the page-size field, not just the soft/hard limits
* Renames it to get_query_max_result_size()
* Update callers, specifically storage_proxy::get_max_result_size(),
which now has a much simpler common return path and has to drop the
page size on one rare return path.
This is a purely mechanical change, no behaviour is changed.
get0() dates back from the days where Seastar futures carried tuples, and
get0() was a way to get the first (and usually only) element. Now
it's a distraction, and Seastar is likely to deprecate and remove it.
Replace with seastar::future::get(), which does the same thing.
Store schema_ptr in reader permit instead of storing a const pointer to
schema to ensure that the schema doesn't get changed elsewhere when the
permit is holding on to it. Also update the constructors and all the
relevant callers to pass down schema_ptr instead of a raw pointer.
Fixes#16180
Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com>
Closesscylladb/scylladb#16658
Also rephase the messages a bit so they are more uniform.
The goal of this change is to make semaphore mismatches easier to
diagnose, by including the table name and the permit name in the
printout.
While at it, add a test for semaphore mismatch, it didn't have one.
Refs: #15485Closesscylladb/scylladb#15508
In this refactoring commit we remove the db::config::host_id
field, as it's hacky and duplicates token_metadata::get_my_id.
Some tests want specific host_id, we add it to cql_test_config
and use in cql_test_env.
We can't pass host_id to sstables_manager by value since it's
initialized in database constructor and host_id is not loaded yet.
We also prefer not to make a dependency on shared_token_metadata
since in this case we would have to create artificial
shared_token_metadata in many tools and tests where sstables_manager
is used. So we pass a function that returns host_id to
sstables_manager constructor.
Compaction group is the data plane for tablets, so this integration
allows each tablet to have its own storage (memtable + sstables).
A crucial step for dynamic tablets, where each tablet can be worked
on independently.
There are still some inefficiencies to be worked on, but as it is,
it already unlocks further development.
```
INFO 2023-07-27 22:43:38,331 [shard 0] init - loading tablet metadata
INFO 2023-07-27 22:43:38,333 [shard 0] init - loading non-system sstables
INFO 2023-07-27 22:43:38,354 [shard 0] table - Tablet with id 0 present for ks.cf
INFO 2023-07-27 22:43:38,354 [shard 0] table - Tablet with id 2 present for ks.cf
INFO 2023-07-27 22:43:38,354 [shard 0] table - Tablet with id 4 present for ks.cf
INFO 2023-07-27 22:43:38,354 [shard 0] table - Tablet with id 6 present for ks.cf
INFO 2023-07-27 22:43:38,428 [shard 1] table - Tablet with id 1 present for ks.cf
INFO 2023-07-27 22:43:38,428 [shard 1] table - Tablet with id 3 present for ks.cf
INFO 2023-07-27 22:43:38,428 [shard 1] table - Tablet with id 5 present for ks.cf
INFO 2023-07-27 22:43:38,428 [shard 1] table - Tablet with id 7 present for ks.cf
```
Closes#14863
* github.com:scylladb/scylladb:
Kill scylla option to configure number of compaction groups
replica: Wire tablet into compaction group
token_metadata: Add this_host_id to topology config
replica: Switch to chunked_vector for storing compaction groups
replica: Generate group_id for compaction_group on demand
The option was introduced to bootstrap the project. It's still
useful for testing, but that translates into maintaining an
additional option and code that will not be really used
outside of testing. A possible option is to later map the
option in boost tests to initial_tablets, which may yield
the same effect for testing.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Pretty cosmetic change, but it will allow S3 to finally support moving
sstables between states (after this patch it still doesn't)
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
In the next commit, we remove the default value for the
description parameter of migration_manager::announce to avoid
using it in the future. However, many calls to announce in tests
use the default value. We have to change it, but we don't really
care about descriptions in the tests, so we pass the empty string
everywhere.
The migration_manager service is responsible for schema convergence
in the cluster - pushing schema changes to other nodes and pulling
schema when a version mismatch is observed. However, there is also
a part of migration_manager that doesn't really belong there -
creating mutations for schema updates. These are the functions with
prepare_ prefix. They don't modify any state and don't exchange any
messages. They only need to read the local database.
We take these functions out of migration_manager and make them
separate functions to reduce the dependency of other modules
(especially query_processor and CQL statements) on
migration_manager. Since all of these functions only need access
to storage_proxy (or even only replica::database), doing such a
refactor is not complicated. We just have to add one parameter,
either storage_proxy or database and both of them are easily
accessible in the places where these functions are called.
The arguments goes via the db::(drop|truncate)_table_on_all_shards()
pair of calls that start from
- storage_proxy::remote: has its sys.ks reference already
- schema_tables::merge_schema: has sys.ks argument already
- legacy_schema_migrator: the reference was added by previous patch
- tests: run in cql_test_env with sys.ks on board
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
some times we initialize a loop variable like
auto i = 0;
or
int i = 0;
but since the type of `0` is `int`, what we get is a variable of
`int` type, but later we compare it with an unsigned number, if we
compile the source code with `-Werror=sign-compare` option, the
compiler would warn at seeing this. in general, this is a false
alarm, as we are not likely to have a wrong comparison result
here. but in order to prevent issues due to the integer promotion
for comparison in other places. and to prepare for enabling
`-Werror=sign-compare`. let's use unsigned to silence this warning.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
when comparing signed and unsigned numbers, the compiler promotes
the signed number to coomon type -- in this case, the unsigned type,
so they can be compared. but sometimes, it matters. and after the
promotion, the comparison yields the wrong result. this can be
manifested using a short sample like:
```
int main(int argc, char **argv) {
int x = -1;
unsigned y = 2;
fmt::print("{}\n", x < y);
return 0;
}
```
this error can be identified by `-Werror=sign-compare`, but before
enabling this compiling option. let's use `std::cmp_*()` to compare
them.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Split long-runing database mutation tests.
At a trade-off with verbosity, split these sub-tests for the long
running tests database_with_data_in_sstables_is_a_mutation_source_*.
Refs #13905
Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>
Split long running tests
test_database_with_data_in_sstables_is_a_mutation_source_plain and
test_database_with_data_in_sstables_is_a_mutation_source_reverse.
They run with x_log2_compaction_groups of 0 and 1, each one taking from
10 to 15 minutes each in debug mode, for a total of 28 and 22 minutes.
Split the test cases to run with 0 and 1, so test.py can run them in
parallel.
Refs #13905
Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>
Closes#14356
This reverts commit 562087beff.
The regressions introduced by the reverted change have been fixed.
So let's revert this revert to resurrect the
uuid_sstable_identifier_enabled support.
Fixes#10459
dht::shard_of() does not use the correct sharder for tablet-based tables.
Code which is supposed to work with all kinds of tables should use erm::get_sharder().
This reverts commit d1dc579062, reversing
changes made to 3a73048bc9.
Said commit caused regressions in dtests. We need to investigate and fix
those, but in the meanwhile let's revert this to reduce the disruption
to our workflows.
Refs: #14283
before this change, we assume that generation is always integer based.
in order to enable the UUID-based generation identifier if the related
option is set, we should populate this option down to generation generator.
because we don't have access to the cluster features in some places where
a new generation is created, a new accessor exposing feature_service from
sstable manager is added.
Fixes#10459
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
In that level no io_priority_class-es exist. Instead, all the IO happens
in the context of current sched-group. File API no longer accepts prio
class argument (and makes io_intent arg mandatory to impls).
So the change consists of
- removing all usage of io_priority_class
- patching file_impl's inheritants to updated API
- priority manager goes away altogether
- IO bandwidth update is performed on respective sched group
- tune-up scylla-gdb.py io_queues command
The first change is huge and was made semi-autimatically by:
- grep io_priority_class | default_priority_class
- remove all calls, found methods' args and class' fields
Patching file_impl-s is smaller, but also mechanical:
- replace io_priority_class& argument with io_intent* one
- pass intent to lower file (if applicatble)
Dropping the priority manager is:
- git-rm .cc and .hh
- sed out all the #include-s
- fix configure.py and cmakefile
The scylla-gdb.py update is a bit hairry -- it needs to use task queues
list for IO classes names and shares, but to detect it should it checks
for the "commitlog" group is present.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Closes#13963
Several test cases use common operations one files like existence checking, content comparing, etc. with the help of home-brew local helpers. The set makes use of some existing seastar:: ones and generalizes others into test/lib/. The primary intent here is `57 insertions(+), 135 deletions(-)`
Closes#13936
* github.com:scylladb/scylladb:
test: Generalize touch_file() into test_utils.*
test/database: Generalize file/dir touch and exists checks
test/sstables: Use seastar::file_exists() to check
test/sstables: Remove sstdesc
test/sstables: Use compare_files from utils/ in sstable_test
test/sstables: Use compare_files() from utils/ in sstable_3_x_test
test/util: Add compare_file() helpers
With regards to closing the looked-up querier if an exception is thrown. In particular, this requires closing the querier if a semaphore mismatch is detected. Move the table lookup above the line where the querier is looked up, to avoid having to handle the exception from it. As a consequence of closing the querier on the error path, the lookup lambda has to be made a coroutine. This is sad, but this is executed once per page, so its cost should be insignificant when spread over an
entire page worth of work.
Also add a unit test checking that the mismatch is detected in the first place and that readers are closed.
Fixes: #13784Closes#13790
* github.com:scylladb/scylladb:
test/boost/database_test: add unit test for semaphore mismatch on range scans
partition_slice_builder: add set_specific_ranges()
multishard_mutation_query: make reader_context::lookup_readers() exception safe
multishard_mutation_query: lookup_readers(): make inner lambda a coroutine
The tempdir for filesystem-based sstables is {generation}.sstable one.
There are two places that need to know the ".sstable" extention -- the
tempdir creating code and the tempdir garbage-collecting code.
This patch simplifies the sstable class by patching the aforementioned
functions to use newly introduced tempdir_extension string directly,
without the help of static one-line helpers.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
The helper is used to return const char* value of the pending delete
dir. Callers can use it directly.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
database_test is failing sporadically and the cause was traced back
to commit e3e7c3c7e5.
The commit forces a subset of tests in database_test, to run once
for each of predefined x_log2_compaction_group settings.
That causes two problems:
1) test becomes 240% slower in dev mode.
2) queries on system.auth is timing out, and the reason is a small
table being spread across hundreds of compaction groups in each
shard. so to satisfy a range scan, there will be multiple hops,
making the overhead huge. additionally, the compaction group
aware sstable set is not merged yet. so even point queries will
unnecessarily scan through all the groups.
Fixes#13660.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Closes#13851
this is the first step to the uuid-based generation identifier. the goal is to encapsulate the generation related logic in generator, so its consumers do not have to understand the difference between the int64_t based generation and UUID v1 based generation.
this commit should not change the behavior of existing scylla. it just allows us to derive from `generation_generator` so we can have another generator which generates UUID based generation identifier.
Closes#13073
* github.com:scylladb/scylladb:
replica, test: create generation id using generator
sstables: add generation_generator
test: sstables: use generate_n for generating ids for testing
The only reason why it's there (right next to compaction_fwd.hh) is
because the database::table_truncate_state subclass needs the definition
of compaction_manager::compaction_reenabler subclass.
However, the former sub is not used outside of database.cc and can be
defined in .cc. Keeping it outside of the header allows dropping the
compaction_manager.hh from database.hh thus greatly reducing its fanout
over the code (from ~180 indirect inclusions down to ~20).
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Closes#13622
reuse generation_generator for generating generation identifiers for
less repeatings. also, add allow update generator to update its
lastest known generation id.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
so we don't need to keep a `prev_gen` around, this also prepares for
the coming change to use generation generator.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Preparing for #10459, this series defines sstables::generation_type::int_t
as `int64_t` at the moment and use that instead of naked `int64_t` variables
so it can be changed in the future to hold e.g. a `std::variant<int64_t, sstables::generation_id>`.
sstables::new_generation was defined to generation new, unique generations.
Currently it is based on incrementing a counter, but it can be extended in the future
to manufacture UUIDs.
The unit tests are cleaned up in this series to minimize their dependency on numeric generations.
Basically, they should be used for loading sstables with hard coded generation numbers stored under `test/resource/sstables`.
For all the rest, the tests should use existing and mechanisms introduced in this series such as generation_factory, sst_factory and smart make_sstable methods in sstable_test_env and table_for_tests to generate new sstables with a unique generation, and use the abstract sst->generation() method to get their generation if needed, without resorting the the actual value it may hold.
Closes#12994
* github.com:scylladb/scylladb:
everywhere: use sstables::generation_type
test: sstable_test_env: use make_new_generation
sstable_directory::components_lister::process: fixup indentation
sstables: make highest_generation_seen return optional generation
replica: table: add make_new_generation function
replica: table: move sstable generation related functions out of line
test: sstables: use generation_type::int_t
sstables: generation_type: define int_t
Use generation_type rather than generation_type::int_t
where possible and removed the deprecated
functions accepting the int_t.i
Ref #10459
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Convert all users to use sstables::generation_type::int_t.
Further patches will continue to convert most to
using sstables::generation_type instead so we can
abstract the value type.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
And propagate it down to where it is created. This will be used to add
trace points for semaphore related events, but this will come in the
next patches.
Tests should just generate the highest sstable version
available. There is no need to ontinue testing old versions,
in particular partially supported ones like "la".
Use also the default values for sstable::format_types, buffer_size,
etc. if there's no particular need to override them.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
The replayer code needs system keyspace to fetch truncation records
from, thus it needs this explicit dependency. By the time it runs system
keyspace is fully initialized already
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
This method requires callers to remember that the sstable is the collection of files on a filesystem and to know what exact directory they are all in. That's not going to work for object storage, instead, sstable should be moved between more abstract states.
This PR replaces move_to_new_dir() call with the change_state() one that accepts target sub-directory string and moves files around. Currently supported state changes:
* staging -> normal
* upload -> normal | staging
* any -> quarantine
All are pretty straightforward and move files between table basedir subdirectories with the exception that upload -> quarantine should move into upload/quarantine subdirectory. Another thing to keep in mind, that normal state doesn't have its subdir but maps directory to table's base directory.
Closes#12648
* github.com:scylladb/scylladb:
sstable: Remove explicit quarantization call
test: Move move_to_new_dir() method from sstable class
sstable, dist.-loader: Introduce and use pick_up_from_upload() method
sstables, code: Introduce and use change_state() call
distributed_loader: Let make_sstables_available choose target directory
these warnings are found by Clang-17 after removing
`-Wno-unused-lambda-capture` and '-Wno-unused-variable' from
the list of disabled warnings in `configure.py`.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
The call moves the sstable to the specified state.
The change state is translated into the storage driver state change
which is for todays filesystem storage means moving between directories.
The "normal" state maps to the base dir of the table, there's no
dedicated subdir for this state and this brings some trouble into the
play.
The thing is that in order to check if an sstable is in "normal" state
already its impossible to compare filename of its path to any
pre-defined values, as tables' basdirs are dynamic. To overcome this,
the change-state call checks that the sstable is in one of "known"
sub-states, and assumes that it's in normal state otherwise.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Schema related files are moved there. This excludes schema files that
also interact with mutations, because the mutation module depends on
the schema. Those files will have to go into a separate module.
Closes#12858