Some tests use non-threaded do_with_cql_env() and wrap the inner lambda with seastar::async(). The cql env already provides a helper for that
Closes#15305
* github.com:scylladb/scylladb:
cql_query_test: Fix indentation after previous patch
cql_query_test: Use do_with_cql_env_thread() explicitly
Some tests use non-threaded do_with_cql_env() and wrap the inner lambda
with seastar::async(). The cql env already provides a helper for that
Indentation is deliberately left broken until next patch
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Passing the gate_closed_exception to the task promise
ends up with abandoned exception since no-one is waiting
for it.
Instead, enter the gate when the task is made
so it will fail make_task if the gate is already closed.
Fixes scylladb/scylladb#15211
In addition, this series adds a private abort_source for each task_manager module
(chained to the main task_manager::abort_source) and abort is requested on task_manager::module::stop().
gate holding in compaction_manager is hardened
and makes sure to stop compaction_manager and task_manager in sstable_compaction_test cases.
Closes#15213
* github.com:scylladb/scylladb:
compaction_manager: stop: close compaction_state:s gates
compaction_manager: gracefully handle gate close
task_manager: task: start: fixup indentation
task_manager: module: make_task: enter gate when the task is created
task_manaer: module: stop: request abort
task_manager: task::impl: subscribe to module about_source
test: compaction_manager_stop_and_drain_race_test: stop compaction and task managers
test: simple_backlog_controller_test: stop compaction and task managers
"experimental" option was marked "Unused" in 64bc8d2f7d. but we
chose to keep it in hope that the upgrade test does not fail.
despite that the upgrade tests per-se survived the "upgrade",
after the upgrade, the tests exercising the experimental features
are still failing hard. they have not been updated to set the
"experimental-features" option, and are still relying on
"experimental" to enable all the experimental features under
test.
so, in this change, let's just drop the option so that
scylla can fail early at seeing this "experimental" option.
this should help us to identify the tests relying on it
quicker. as the "experimental" features should only be used
in development environment, this change should have no impact
to production.
Refs #15214
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closes#15233
Index caching was disabled by default because it caused performance regressions
for some small-partition workloads. See https://github.com/scylladb/scylladb/issues/11202.
However, it also means that there are workloads which could benefit from the
index cache, but (by default) don't.
As a compromise, we can set a default limit on the memory usage of index cache,
which should be small enough to avoid catastrophic regressions in
small-partition workloads, but big enough to accommodate workloads where
index cache is obviously beneficial.
This series adds such a configurable limit, sets it to to 0.2 of total cache memory by default,
and re-enables index caching by default.
Fixes#15118Closes#14994
* github.com:scylladb/scylladb:
test: boost/cache_algorithm_test: add cache_algorithm_test
sstables: partition_index_cache: deglobalize stats
utils: cached_file: deglobalize cached_file metrics
db: config: enable index caching by default
config: add index_cache_fraction
utils: lru: add move semantics to list links
Move partition_index_cache stats from a thread_local variable
to cache_tracker. After the change, partition_index_cache
receives a reference to the stats via constructor, instead of
referencing a global.
This is needed so that cache_tracker can know the memory usage
of index caches (for cache eviction purposes) without relying on
globals.
But it also makes sense even without that motive.
Move cached_file metrics from a thread_local variable
to cache_tracker.
This is needed so that cache_tracker can know
the memory usage of index caches (for purposes
of cache eviction) without relying on globals.
But it also makes sense even without that motive.
The is_partition_dead() local helper accepts partition key argument and
decorates it. Howerver, its caller gets partition key from decorated key
itself, and can just pass it along
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
It's possible that compaction task is preempted after completion and
before reevaluation, causing pending_tasks to be > 1.
Let's only exit the loop if there are no pending tasks, and also
reduce 100ms sleep which is an eternity for this test.
Fixes#14809.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Closes#15059
local_delection_time (short for ldt) is a timestamp used for the
purpose of purging the tombstone after gc_grace_seconds. if its value
is greater than INT32_MAX, it is capped when being written to sstable.
this is very likely a signal of bad configuration or a even a bug in
scylla. so we keep track of it with a metric named
"scylla_sstables_capped_tombstone_deletion_time".
in this change, a test is added to verify that the metric is updated
upon seeing a tombstone with this abnormal ldt.
because we validate the consistency before and after compaction in
tests, this change adds a parameter to disable this check, otherwise,
because capping the ldt changes the mutation, the validation would
fail the test.
Refs #15015
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Compaction group is the data plane for tablets, so this integration
allows each tablet to have its own storage (memtable + sstables).
A crucial step for dynamic tablets, where each tablet can be worked
on independently.
There are still some inefficiencies to be worked on, but as it is,
it already unlocks further development.
```
INFO 2023-07-27 22:43:38,331 [shard 0] init - loading tablet metadata
INFO 2023-07-27 22:43:38,333 [shard 0] init - loading non-system sstables
INFO 2023-07-27 22:43:38,354 [shard 0] table - Tablet with id 0 present for ks.cf
INFO 2023-07-27 22:43:38,354 [shard 0] table - Tablet with id 2 present for ks.cf
INFO 2023-07-27 22:43:38,354 [shard 0] table - Tablet with id 4 present for ks.cf
INFO 2023-07-27 22:43:38,354 [shard 0] table - Tablet with id 6 present for ks.cf
INFO 2023-07-27 22:43:38,428 [shard 1] table - Tablet with id 1 present for ks.cf
INFO 2023-07-27 22:43:38,428 [shard 1] table - Tablet with id 3 present for ks.cf
INFO 2023-07-27 22:43:38,428 [shard 1] table - Tablet with id 5 present for ks.cf
INFO 2023-07-27 22:43:38,428 [shard 1] table - Tablet with id 7 present for ks.cf
```
Closes#14863
* github.com:scylladb/scylladb:
Kill scylla option to configure number of compaction groups
replica: Wire tablet into compaction group
token_metadata: Add this_host_id to topology config
replica: Switch to chunked_vector for storing compaction groups
replica: Generate group_id for compaction_group on demand
in this series, the "experimental" option is marked `Unused` as it has been marked deprecated for almost 2 years since scylla 4.6. and use `experimental_features` to specify the used experimental features explicitly.
Closes#14948
* github.com:scylladb/scylladb:
config: remove unused namespace alias
config: use std::ranges when appropriate
config: drop "experimental" option
test: disable 'enable_user_defined_functions' if experimental_features does not include udf
test: pylib: specify experimental_features explicitly
The option was introduced to bootstrap the project. It's still
useful for testing, but that translates into maintaining an
additional option and code that will not be really used
outside of testing. A possible option is to later map the
option in boost tests to initial_tablets, which may yield
the same effect for testing.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
The motivation is that token_metadata::get_my_id() is not available
early in the bootstrap process, as raft topology is pulled later
than new tables are registered and created, and this node is added
to topology even later.
To allow creation of compaction groups to retrieve "my id" from
token metadata early, initialization will now feed local id
into topology config which is immutable for each node anyway.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
An sstable can be in one of several states -- normal, quarantined, staging, uploading. Right now this "state" is hard-wired into sstable's path, e.g. quarantined sstable would sit in e.g. /var/lib/data/ks-cf-012345/quarantine/ directory. Respectively, there's a bunch of directory names constexprs in sstables.hh defining each "state". Other than being confusing, this approach doesn't work well with S3 backend. Additionally, there's snapshot subdir that adds to the confusion, because snapshot is not quite a state.
This PR converts "state" from constexpr char* directories names into a enum class and patches the sstable creation, opening and state-changing API to use that enum instead of parsing the path.
refs: #13017
refs: #12707Closes#14152
* github.com:scylladb/scylladb:
sstable/storage: Make filesystem storage with initial state
sstable: Maintain state
sstable: Make .change_state() accept state, not directory string
sstable: Construct it with state
sstables_manager: Remove state-less make_sstable()
table: Make sstables with required state
test: Make sstables with upload state in some cases
tools: Make sstables with normal state
table: Open-code sstables making streaming helpers
tests: Make sstables with normal state by default
sstable_directory: Make sstable with required state
sstable_directory: Construct with state
distributed_loader: Make sstable with desired state when populating
distributed_loader: Make sstable with upload state when uploading
sstable: Introduce state enum
sstable_directory: Merge verify and g.c. calls
distributed_loader: Merge verify and gc invocations
sstable/filesystem: Put underscores to dir members
sstable/s3: Mark make_s3_object_name() const
sstable: Remove filename(dir, ...) method
and use that in compaction_group, rather than
respective accumulators of its own.
bytes_on_disk is implemented by each sstable_set_impl
and is update on insert and erase (whether directly
into the sstable_set_impl or via the sstable_set).
Although compound_sstable_set doesn't implement
insert and erase, it override `bytes_on_disk()` to return
the sum of all the underlying `sstable_set::bytes_on_disk()`.
Also, added respective unit tests for `partitioned_sstable_set`
and `time_series_sstable_set`, that test each type's
bytes_on_disk, including cloning of the set, and the
`compound_sstable_set` bytes_on_disk semantics.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Pretty cosmetic change, but it will allow S3 to finally support moving
sstables between states (after this patch it still doesn't)
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
As was mentione in the previous patch, there are few places in tests
that put sstables in upload/ subdir and they really mean it. Those need
to use sstables manager/directory API directly (already) and specify the
state explicitly (this patch)
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
This is to replace full path sitting on this object eventually. For now
they have to co-exist, but state will be used to make_sstable()-s from
manager with its new API
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
The test uses qualified ks.cf name to find the schema, but it's the only
test case that does it. There's no point in maintaining a dedicated
helper on the cql_test_env just for that
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Surprisingly there's a dedicated helper for the check opposite to the
one fixed in the previous patch. Fix one too
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Same as in previous patch, the cql_test_env::require_table_exists()
helper is exactly the same, but returns future and asserts on failures
for no gain
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
The cql_test_env::require_keyspace_exists() performs exactly the same
check, but is future-returning function for no reason and it assert()s
on failure, that's less informative (not that it ever failed) than
BOOST_REQUIRE
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
It's like in previous patch, and for the same reason, but the change is
a bit more complicated because it uses resolved futures' results in few
places, so it likely deserves separate commit
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Those two use straight .then-s sequences, no point in keeping them that
long. Being threads makes next patches shorter and nicer
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
The `system.group0_history` table provides useful descriptions for each
command committed to Raft group 0. One way of applying a command to
group 0 is by calling `migration_manager::announce`. This function has
the `description` parameter set to empty string by default. Some calls
to `announce` use this default value which causes `null` values in
`system.group0_history`. We want `system.group0_history` to have an
actual description for every command, so we change all default
descriptions to reasonable ones.
Going further, We remove the default value for the `description`
parameter of `migration_manager::announce` to avoid using it in the
future. Thanks to this, all commands in `system.group0_history` will
have a non-null description.
Fixes#13370Closes#14979
* github.com:scylladb/scylladb:
migration_manager: announce: remove the default value of description
test: always pass empty description to migration_manager::announce
migration_manager: announce: provide descriptions for all calls
Currently `sstable_requiring_cleanup` is updated using `compacting_sstable_registration`, but that mechanism is not used by offstrategy compaction, leading to #14304.
This series introduces `compaction_manager::on_compaction_completion` that intercepts the call
to the table::on_compaction_completion. This allows us to update `sstable_requiring_cleanup` right before the compacted sstables are deleted, making sure they are no leaked to `sstable_requiring_cleanup`, which would hold a reference to them until cleanup attempts to clean them up.
`cleanup_incremental_compaction_test` was adjusted to observe the sstables `on_delete` (by adding a new observer event) to detect the case where cleanup attempts to delete the leaked sstables and fails since they were already deleted from the file system by offstrategy compaction. The test fails with the fix and passes with it.
Fixes#14304Closes#14858
* github.com:scylladb/scylladb:
compaction_manager: on_compaction_completion: erase sstables from sstables_requiring_cleanup
compaction/leveled_compaction_strategy: ideal_level_for_input: special case max_sstable_size==0
sstable: add on_delete observer
compaction_manager: add on_compaction_completion
sstable_compaction_test: cleanup_incremental_compaction_test: verify sstables_requiring_cleanup is empty
"experimental" was marked deprecated in 8b917f7c. this change
was included since Scylla 4.6. now that 5.3 has been branched,
this change will be included 5.4. this should be long enough
for the user's turn around if this option is ever used.
the dtests using this option has been audited and updated
accordingly. and the unit testing this option is removed as well.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Erase retired sstable from compaction_state::sstables_requiring_cleanup
also on_compaction_completion (in addition to
compacting_sstable_registration::release_compacting
for offstrategy compaction with piggybacked cleanup
or any other compaction type that doesn't use
compacting_sstable_registration.
Add cleanup_during_offstrategy_incremental_compaction_test
that is modeled after cleanup_incremental_compaction_test to check
that cleanup doesn't attempt to cleanup already-deleted
sstables that were left over by offstrategy compaction
in sstables_requiring_cleanup.
Fixes#14304
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
In the next commit, we remove the default value for the
description parameter of migration_manager::announce to avoid
using it in the future. However, many calls to announce in tests
use the default value. We have to change it, but we don't really
care about descriptions in the tests, so we pass the empty string
everywhere.
While describing materialized view, print `synchronous_updates` option
only if the tag is present in schema's extensions map. Previously if the
key wasn't present, the default (false) value was printed.
Fixes: #14924Closes#14928
we wait for the same condition couple lines before, so no need to
check it again using `BOOST_CHECK_EQUAL()`.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closes#14921
for faster build times and clear inter-module dependencies, we
should not #includes headers not directly used. instead, we should
only #include the headers directly used by a certain compilation
unit.
in this change, the source files under "/compaction" directories
are checked using clangd, which identifies the cases where we have
an #include which is not directly used. all the #includes identified
by clangd are removed, except for "test/lib/scylla_test_case.hh"
as it brings some command line options used by scylla tests.
see also https://clangd.llvm.org/guides/include-cleaner#unused-include-warning
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closes#14922
In branch 5.2 we erase `dc` from `_datacenters` if there are
no more endpoints listed in `_dc_endpoints[dc]`.
This was lost unintentionally in f3d5df5448
and this commit restores that behavior, and fixes test_remove_endpoint.
Fixesscylladb/scylladb#14896
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Closes#14897
The `migration_manager` service is responsible for schema convergence in
the cluster - pushing schema changes to other nodes and pulling schema
when a version mismatch is observed. However, there is also a part of
`migration_manager` that doesn't really belong there - creating
mutations for schema updates. These are the functions with `prepare_`
prefix. They don't modify any state and don't exchange any messages.
They only need to read the local database.
We take these functions out of `migration_manager` and make them
separate functions to reduce the dependency of other modules (especially
`query_processor` and CQL statements) on `migration_manager`. Since all
of these functions only need access to `storage_proxy` (or even only
`replica::database`), doing such a refactor is not complicated. We just
have to add one parameter, either `storage_proxy` or `database` and both
of them are easily accessible in the places where these functions are
called.
This refactor makes `migration_manager` unneeded in a few functions:
- `alternator::executor::create_keyspace`,
- `cql3::statements::alter_type_statement::prepare_announcement_mutations`,
- `cql3::statements::schema_altering_statement::prepare_schema_mutations`,
- `cql3::query_processor::execute_thrift_schema_command:`,
- `thrift::handler::execute_schema_command`.
We remove the `migration_manager&` parameter from all these functions.
Fixes#14339Closes#14875
* github.com:scylladb/scylladb:
cql3: query_processor::execute_thrift_schema_command: remove an unused parameter
cql3: schema_altering_statement::prepare_schema_mutations: remove an unused parameter
cql3: alter_type_statement::prepare_announcement_mutations: change parameters
alternator: executor::create_keyspace: remove an unused parameter
service: migration_manager: change the prepare_ methods to functions
This change makes tablet load balancing more efficient by performing
migrations independently for different tablets, and making new load
balancing plans concurrently with active migrations.
The migration track is interrupted by pending topology change operations.
The coordinator executes the load balancer on edges of tablet state
machine transitions. This allows new migrations to be started as soon
as tablets finish streaming.
The load balancer is also continuously invoked as long as it produces
a non-empty plan. This is in order to saturate the cluster with
streaming. A single make_plan() call is still not saturating, due
to the way algorithm is implemented.
Overload of shards is limited by the fact that load balancer algorithm tracks
streaming concurrency on both source and target shards of active
migrations and takes concurrency limit into account when producing new
migrations.
Closes#14851
* github.com:scylladb/scylladb:
tablets: load_balancer: Remove double logging
tests: tablets: Check that load balancing is interrupted by topology change
tests: tablets: Add test for load balancing with active migrations
tablets: Balance tablets concurrently with active migrations
storage_service, tablets: Extract generate_migration_updates()
storage_service, tablets: Move get_leaving_replica() to tablets.cc
locator: tablets: Move std::hash definition earlier
storage_service: Advance tablets independently
topology_coordinator: Fix missed notification on abort
tablets: Add formatter for tablet_migration_info
Maps related to column families in database are extracted
to a column_families_data class. Access to them is possible only
through methods. All methods which may preempt hold rwlock
in relevant mode, so that the iterators can't become invalid.
Fixes: #13290Closes#13349
* github.com:scylladb/scylladb:
replica: make tables_metadata's attributes private
replica: add methods to get a filtered copy of tables map
replica: add methods to check if given table exists
replica: add methods to get table or table id
replica: api: return table_id instead of const table_id&
replica: iterate safely over tables related maps
replica: pass tables_metadata to phased_barrier_top_10_counts
replica: add methods to safely add and remove table
replica: wrap column families related maps into tables_metadata
replica: futurize database::add_column_family and database::remove
There are three methods in system_keyspace namespace that run queries over `system.scylla_table_schema_history` table. For that they use qctx which's not nice.
Fortunately, all the callers already have the system_keyspace& local variable or argument they can pass to those methods. Since the accessed table belongs to system keyspace, the latter declares the querying methods as "friends" to let them get private `query_processor& _qp` member
Closes#14876
* github.com:scylladb/scylladb:
schema_tables: Extract query_processor from system_keyspace for querying
schema_tables: Add system_keyspace& argument to ..._column_mapping() calls
migration_manager: Add system_keyspace argument to get_schema_mapping()
It is possible that topology will contain nodes that are no longer normal token owners, so they don't need to be sync'ed with.
Fixes scylladb/scylladb#14793
Closes#14798
* github.com:scylladb/scylladb:
storage_service: refresh_sync_nodes: restrict to reachable token owners
storage_service: refresh_sync_nodes: fix log message
locator: topology: node::state: make fine grained
Currently the node::state is coarse grained
so one cannot distinguish between e.g. a leaving
node due to decommission (where the node is used
for reading) vs. due to remove node (where the
node is not used for reading).
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
before this change, after triggering the compaction,
compaction_manager_basic_test waits until the triggered compaction
completes. but since the regular compaction is run in a loop which
does not stop until either the daemon is stopping, or there is no
more sstables to be compacted, or the compaction is disabled.
but we only get the input sstables for compaction after swiching
to the "pending" state, and acquiring the read lock of the
compaction_state, and acquiring the read lock is implemented as
an coroutine, so there is chance that coroutine is suspended,
and the execution switches to the test. in this case, the test
will find that even after the triggered compaction completes,
there are still one or more pending compactions. hence the test
fails.
to address this problem, instead of just waiting for the compaction
to complete, we also wait until the number of pending compaction tasks
is 0. so that even if the test manages to sneak into the time window,
it won't proceed and starting check the compaction manager's stats.
Fixes#14865
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closes#14889