Compaction tasks executors serve two different purposes - as compaction
manager related entity they execute compaction operation and as task
manager related entity they track compaction status.
When one role depends on the other, as it currently is for
compaction_task_impl::done() and compaction_task_executor::compaction_done(),
requirements of both roles need to be satisfied at the same time in each
corner case. Such complexity leads to bugs.
To prevent it, compaction_task_impl::done() of executors no longer depends
on compaction_task_executor::compaction_done().
Fixes: #14912.
Closesscylladb/scylladb#15140
* github.com:scylladb/scylladb:
compaction: warn about compaction_done()
compaction: do not run stopped compaction
compaction: modify lowest compaction tasks' run method
compaction: pass do_throw_if_stopping to compaction_task_executor
This field on the null shard is properly initialized
in maybe_init_schema_commitlog function, until then
we can't make decisions based on its value. This problem
can happen e.g. if add_column_family function is called
with readonly=false before maybe_init_schema_commitlog.
It will call commitlog_for to pass the commitlog to
mark_ready_for_writes and commitlog_for reads _uses_schema_commitlog.
In this commit we add protection against this case - we
trigger internal_error if _uses_schema_commitlog is read
before it is initialized.
maybe_init_schema_commitlog() was added to cql_test_env
to make boost tests work with the new invariant.
We want to switch system.scylla_local table to the
schema commitlog, but load phases hamper here - schema
commitlog is initialized after phase1,
so a table which is using it should be moved to phase2,
but system.scylla_local contains features, and we need
them before schema commitlog initialization for
SCHEMA_COMMITLOG feature.
In this commit we are taking a different approach to
loading system tables. First, we load them all in
one pass in 'readonly' mode. In this mode, the table
cannot be written to and has not yet been assigned
a commit log. To achieve this we've added _readonly bool field
to the table class, it's initialized to true in table's
constructor. In addition, we changed the table constructor
to always assign nullptr to commitlog, and we trigger
an internal error if table.commitlog() property is accessed
while the table is in readonly mode. Then, after
triggering on_system_tables_loaded notifications on
feature_service and sstable_format_selector, we call
system_keyspace::mark_writable and eventually
table::mark_ready_for_writes which selects the
proper commitlog and marks the table as writable.
In sstable_compaction_test we drop several
mark_ready_for_writes calls since they are redundant,
the table has already been made writable in
env.make_table_for_tests call.
The table::commitlog function either returns the current
commitlog or causes an error if the table is readonly. This
didn't work for virtual tables, since they never called
mark_ready_for_writes. In this commit we add this
call to initialize_virtual_tables.
This is a readability refactoring commit without observable changes
in behaviour.
initialize_virtual_tables logically belongs to virtual_tables module,
and it allows to make other functions in virtual_tables.cc
(register_virtual_tables, install_virtual_readers)
local to the module, which simplifies the matters a bit.
all_virtual_tables() is not needed anymore, all the references to
registered virtual tables are now local to virtual_tables module
and can just use virtual_tables variable directly.
In this refactoring commit we remove the db::config::host_id
field, as it's hacky and duplicates token_metadata::get_my_id.
Some tests want specific host_id, we add it to cql_test_config
and use in cql_test_env.
We can't pass host_id to sstables_manager by value since it's
initialized in database constructor and host_id is not loaded yet.
We also prefer not to make a dependency on shared_token_metadata
since in this case we would have to create artificial
shared_token_metadata in many tools and tests where sstables_manager
is used. So we pass a function that returns host_id to
sstables_manager constructor.
Currently starting and stopping of b.m. is spread over main(). Keep it
close to each other.
Another trickery here is that calling b.m.::start() can only be done
after joining the cluster, because this start() spawns replay loop
which, in turn calls token_metadata::count_normal_token_owners() and if
the latter returns zero, the b.m. code uses it as a fraction denominator
and crashes.
With the above in mind, cql_test_env should start batchlog manager after
it "joins the ring" too. For now it doesn't make any difference, but
next patch will make use of it.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Currently minio starts with a bucket that has public anonymous access. Respectively, all tests use unsigned S3 requests. That was done for simplicity, and its better to apply some policy to the bucket and, consequentially, make tests sign their requests.
Other than the obvious benefit that we test requests signing in unit tests, another goal of this PR is to make it possible to simulate and test various error paths locally, e.g. #13745 and #13022Closes#14525
* github.com:scylladb/scylladb:
test/s3: Remove AWS_S3_EXTRA usage
test/s3: Run tests over non-anonymous bucket
test/minio: Create random temp user on start
code: Rename S3_PUBLIC_BUCKET_FOR_TEST
The local node's dc:rack pair is cached on system keyspace on start. However, most of other code don't need it as they get dc:rack from topology or directly from snitch. There are few places left that still mess with sysks cache, but they are easy to patch. So after this patch all the core code uses two sources of dc:rack -- topology / snitch -- instead of three.
Closes#15280
* github.com:scylladb/scylladb:
system_keyspace: Don't require snitch argument on start
system_keyspace: Don't cache local dc:rack pair
system_keyspace: Save local info with explicit location
storage_service: Get endpoint location from snitch, not system keyspace
snitch: Introduce and use get_location() method
repair: Local location variables instead of system keyspace's one
repair: Use full endpoint location instead of datacenter part
Currently minio applies anonymous public policy for the test bucket and
all tests just use unsigned S3 requests. This patch generates a policy
for the temporary minio user and removes the anon public one. All tests
are updated respectively to use the provided key:secret pair.
The use-https bit is off by default as minio still starts with plain
http. That's OK for now, all tests are local and have no secret data
anyway
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
The bucket is going to stop being public, rename the env variable in
advance to make the essential patch smaller
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
On boot system keyspace is kicked to insert local info into system.local
table. Among other things there's dc:rack pair which sys.ks. gets from
its cache which, in turn, should have been previously initialized from
snitch on sys.ks. start. This patch makes the local info updating method
get the dc:rack from caller via argument. Callers, in turn, call snitch
directly, because these are main and cql_test_env startup routines.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
The s.service since d42685d0cb is having on-board query processor ref^w
pointer and can use it to join cluster
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Closes#15236
The cql_test_env has a virtual require_column_has_value() helper that better fits cql_assertions crowd. Also, the helper in question duplicates some existing code, so it can also be made shorter (and one class table helper gets removed afterwards)
Closes#15208
* github.com:scylladb/scylladb:
cql_assertions: Make permit from env
table: Remove find_partition_slow() helper
sstable_compaction_test: Do not re-decorate key
cql_test_env: Move .require_column_has_value
cql_test_env: Use table.find_row() shortcut
To call table::find_row() one needs to provide a permit. Tests have
short and neat helper to create one from cql_test_env
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
The require_column_has_value() finds the cell in three steps -- finds
partition, then row, then cell. The class table already has a method to
facilitate row finding by partition and clustering key
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
The method in question accepts cdc_generation_service ref argument from
main and cql_test_env, but storage service now has local cdcv gen.
service reference, so this argument and its propagation down the stack
can be removed
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
It sort of reverts the 5a97ba7121 commit, because storage service now
uses the cdc generation service to serve raft topo updates which, in
turn, takes the cdc gen. service all over the raft code _just_ to make
it as an argument to storage service topo calls.
Also there's API carrying cdc gen. service for the single call and also
there's an implicit need to kick cdc gen. service on decommission which
also needs storage service to reference cdc gen. after boot is complete
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
in this series, we also print the sstable name and pk when writing a tombstone whose local_deletion_time (ldt for short) is greater than INT32_MAX which cannot be represented by an uint32_t.
Fixes#15015Closes#15107
* github.com:scylladb/scylladb:
sstable/writer: log sstable name and pk when capping ldt
test: sstable_compaction_test: add a test for capped tombstone ldt
local_delection_time (short for ldt) is a timestamp used for the
purpose of purging the tombstone after gc_grace_seconds. if its value
is greater than INT32_MAX, it is capped when being written to sstable.
this is very likely a signal of bad configuration or a even a bug in
scylla. so we keep track of it with a metric named
"scylla_sstables_capped_tombstone_deletion_time".
in this change, a test is added to verify that the metric is updated
upon seeing a tombstone with this abnormal ldt.
because we validate the consistency before and after compaction in
tests, this change adds a parameter to disable this check, otherwise,
because capping the ldt changes the mutation, the validation would
fail the test.
Refs #15015
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
On boot several manipulations with system.local are performed.
1. The host_id value is selected from it with key = local
If not found, system_keyspace generates a new host_id, inserts the
new value into the table and returns back
2. The cluster_name is selected from it with key = local
Then it's system_keyspace that either checks that the name matches
the one from db::config, or inserts the db::config value into the
table
3. The row with key = local is updated with various info like versions,
listen, rpc and bcast addresses, dc, rack, etc. Unconditionally
All three steps are scattered over main, p.1 is called directly, p.2 and
p.3 are executed via system_keyspace::setup() that happens rather late.
Also there's some touch of this table from the cql_test_env startup code.
The proposal is to collect this setup into one place and execute it
early -- as soon as the system.local table is populated. This frees the
system_keyspace code from the logic of selecting host id and cluster
name leaving it to main and keeps it with only select/insert work.
refs: #2795
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Closes#15082
Compaction group is the data plane for tablets, so this integration
allows each tablet to have its own storage (memtable + sstables).
A crucial step for dynamic tablets, where each tablet can be worked
on independently.
There are still some inefficiencies to be worked on, but as it is,
it already unlocks further development.
```
INFO 2023-07-27 22:43:38,331 [shard 0] init - loading tablet metadata
INFO 2023-07-27 22:43:38,333 [shard 0] init - loading non-system sstables
INFO 2023-07-27 22:43:38,354 [shard 0] table - Tablet with id 0 present for ks.cf
INFO 2023-07-27 22:43:38,354 [shard 0] table - Tablet with id 2 present for ks.cf
INFO 2023-07-27 22:43:38,354 [shard 0] table - Tablet with id 4 present for ks.cf
INFO 2023-07-27 22:43:38,354 [shard 0] table - Tablet with id 6 present for ks.cf
INFO 2023-07-27 22:43:38,428 [shard 1] table - Tablet with id 1 present for ks.cf
INFO 2023-07-27 22:43:38,428 [shard 1] table - Tablet with id 3 present for ks.cf
INFO 2023-07-27 22:43:38,428 [shard 1] table - Tablet with id 5 present for ks.cf
INFO 2023-07-27 22:43:38,428 [shard 1] table - Tablet with id 7 present for ks.cf
```
Closes#14863
* github.com:scylladb/scylladb:
Kill scylla option to configure number of compaction groups
replica: Wire tablet into compaction group
token_metadata: Add this_host_id to topology config
replica: Switch to chunked_vector for storing compaction groups
replica: Generate group_id for compaction_group on demand
Currently we hold group0_guard only during DDL statement's execute()
function, but unfortunately some statements access underlying schema
state also during check_access() and validate() calls which are called
by the query_processor before it calls execute. We need to cover those
calls with group0_guard as well and also move retry loop up. This patch
does it by introducing new function to cql_statement class take_guard().
Schema altering statements return group0 guard while others do not
return any guard. Query processor takes this guard at the beginning of a
statement execution and retries if service::group0_concurrent_modification
is thrown. The guard is passed to the execute in query_state structure.
Fixes: #13942
Message-ID: <ZNsynXayKim2XAFr@scylladb.com>
The option was introduced to bootstrap the project. It's still
useful for testing, but that translates into maintaining an
additional option and code that will not be really used
outside of testing. A possible option is to later map the
option in boost tests to initial_tablets, which may yield
the same effect for testing.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
The motivation is that token_metadata::get_my_id() is not available
early in the bootstrap process, as raft topology is pulled later
than new tables are registered and created, and this node is added
to topology even later.
To allow creation of compaction groups to retrieve "my id" from
token metadata early, initialization will now feed local id
into topology config which is immutable for each node anyway.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
An sstable can be in one of several states -- normal, quarantined, staging, uploading. Right now this "state" is hard-wired into sstable's path, e.g. quarantined sstable would sit in e.g. /var/lib/data/ks-cf-012345/quarantine/ directory. Respectively, there's a bunch of directory names constexprs in sstables.hh defining each "state". Other than being confusing, this approach doesn't work well with S3 backend. Additionally, there's snapshot subdir that adds to the confusion, because snapshot is not quite a state.
This PR converts "state" from constexpr char* directories names into a enum class and patches the sstable creation, opening and state-changing API to use that enum instead of parsing the path.
refs: #13017
refs: #12707Closes#14152
* github.com:scylladb/scylladb:
sstable/storage: Make filesystem storage with initial state
sstable: Maintain state
sstable: Make .change_state() accept state, not directory string
sstable: Construct it with state
sstables_manager: Remove state-less make_sstable()
table: Make sstables with required state
test: Make sstables with upload state in some cases
tools: Make sstables with normal state
table: Open-code sstables making streaming helpers
tests: Make sstables with normal state by default
sstable_directory: Make sstable with required state
sstable_directory: Construct with state
distributed_loader: Make sstable with desired state when populating
distributed_loader: Make sstable with upload state when uploading
sstable: Introduce state enum
sstable_directory: Merge verify and g.c. calls
distributed_loader: Merge verify and gc invocations
sstable/filesystem: Put underscores to dir members
sstable/s3: Mark make_s3_object_name() const
sstable: Remove filename(dir, ...) method
There are a few good reasons for this change.
1) compaction_group doesn't have to be aware of # of groups
2) thinking forward to dynamic tablets, # of groups cannot be
statically embedded in group id, otherwise it gets stale.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
This reverts commit 70b5360a73. It generates
a failure in group0_test .test_concurrent_group0_modifications in debug
mode with about 4% probability.
Fixes#15050
It's assumed that sstables are not very specific about which
subdirectory an sstable is, so they can use normal state. Places that
need to move sstables between states will use sstable manager API
explicitly
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
To add a sharded service to the cql_test_env one needs to patch it in 5 or 6 places
- add cql_test_env reference
- add cql_test_env constructor argument
- initialize the reference in initializer list
- add service variable to do_with method
- pass the variable to cql_test_env constructor
- (optionally) export it via cql_test_env public method
Steps 1 through 5 are annoying, things get much simpler if look like
- add cql_test_env variable
- (optionally) export it via cql_test_env public method
This is what this PR does
refs: #2795Closes#15028
* github.com:scylladb/scylladb:
cql_test_env: Drop local *this reference
cql_test_env: Drop local references
cql_test_env: Move most of the stuff in run_in_thread()
cql_test_env: Open-code env start/stop and remove both
cql_test_env: Keep other services as class variables
cql_test_env: Keep services as class variables
cql_test_env: Construct env early
cql_test_env: De-static fdpinger variable
cql_test_env: Define all services' variables early
cql_test_env: Keep group0_client pointer
Currently we hold group0_guard only during DDL statement's execute()
function, but unfortunately some statements access underlying schema
state also during check_access() and validate() calls which are called
by the query_processor before it calls execute. We need to cover those
calls with group0_guard as well and also move retry loop up. This patch
does it by introducing new function to cql_statement class take_guard().
Schema altering statements return group0 guard while others do not
return any guard. Query processor takes this guard at the beginning of a
statement execution and retries if service::group0_concurrent_modification
is thrown. The guard is passed to the execute in query_state structure.
Fixes: #13942
Message-ID: <ZNSWF/cHuvcd+g1t@scylladb.com>
The local auto& foo = env._foo references in run_in_thread() a no longer
needed, the code that uses foo can be switched to use _foo (this->_foo)
instead
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Thw do_with() method is static and cannot just access cql_test_env
variable's fields, using local references instead. To simplify this,
most of the method's content is moved to non-static run_in_thread()
method
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
There are more services on do_with() stack that are not referenced from
the cql_test_env. Move them to be class variables too
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Now they are duplicated -- variables exist on do_with() stack and the
class references some of them. This patch makes is vice-versa -- all the
variables are on the cql_test_env and do_with() references them. The
latter will change soon
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Nowadays they are all scattered along the .do_with() function. Keeping
them in one early place makes it possible to relocate them onto the
cql_test_env later
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
It's now reference, but some time later it won't be able to get
initialized construction-time, so turn it into pointer
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
This check is pointless. The subsequent call to find_column_family()
would call on_internal_error() in case schema is not found, and since
cql_test_env sets abort-on-internal-error to true, this would fail just
like that
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
The `system.group0_history` table provides useful descriptions for each
command committed to Raft group 0. One way of applying a command to
group 0 is by calling `migration_manager::announce`. This function has
the `description` parameter set to empty string by default. Some calls
to `announce` use this default value which causes `null` values in
`system.group0_history`. We want `system.group0_history` to have an
actual description for every command, so we change all default
descriptions to reasonable ones.
Going further, We remove the default value for the `description`
parameter of `migration_manager::announce` to avoid using it in the
future. Thanks to this, all commands in `system.group0_history` will
have a non-null description.
Fixes#13370Closes#14979
* github.com:scylladb/scylladb:
migration_manager: announce: remove the default value of description
test: always pass empty description to migration_manager::announce
migration_manager: announce: provide descriptions for all calls
In the next commit, we remove the default value for the
description parameter of migration_manager::announce to avoid
using it in the future. However, many calls to announce in tests
use the default value. We have to change it, but we don't really
care about descriptions in the tests, so we pass the empty string
everywhere.