Commit Graph

3282 Commits

Author SHA1 Message Date
Avi Kivity
0a5d9532f9 Merge 'Sanitize batchlog manager start/stop' from Pavel Emelyanov
This code is now spread over main and differs in cql_test_env. The PR unifies both places and makes the manager start-stop look standard

refs: #2795

Closes #15375

* github.com:scylladb/scylladb:
  batchlog_manager: Remove start() method
  batchlog_manager: Start replay loop in constructor
  main, cql_test_env: Start-stop batchlog manager in one "block"
  batchlog_manager: Move shard-0 check into batchlog_replay_loop()
  batchlog_manager: Fix drain() reentrability
2023-09-13 18:20:56 +03:00
Kamil Braun
a184b07cbb Merge 'raft topology: make CDC_GENERATIONS_V3 single-partition, timeuuid-sorted' from Patryk Jędrzejczak
We make the `CDC_GENERATIONS_V3` table single-partition and change the
clustering key from `range_end` to `(id, range_end)`. We also change the
type of `id` to `timeuuid` and ensure that a new generation always has
the highest `id`. These changes allow efficient clearing of obsolete CDC
generation data, which we need to prevent Raft-topology snapshots from
endlessly growing as we introduce new generations over time.

All this code is protected by an experimental feature flag. It includes
the definition of `CDC_GENERATIONS_V3`. The table is not created unless
the feature flag is enabled.

Fixes #15163

Closes #15319

* github.com:scylladb/scylladb:
  system_keyspace: rename cdc_generation_id_v2
  system_keyspace: change id to timeuuid in CDC_GENERATIONS_V3
  cdc: generation: remove topology_description_generator
  cdc: do not create uuid in make_new_generation_data
  system_kayspace: make CDC_GENERATIONS_V3 single-partition
  cdc: generation: introduce get_common_cdc_generation_mutations
  cdc: generation: rename get_cdc_generation_mutations
2023-09-13 12:54:49 +02:00
Pavel Emelyanov
d48aff5789 batchlog_manager: Remove start() method
It's now a no-op, can be dropped.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-09-12 16:37:52 +03:00
Pavel Emelyanov
3966a50ed4 batchlog_manager: Start replay loop in constructor
... and sanitize the future used on stop.

The loop in question is now started in .start(), but all callers now
construct the manager late enough, so the loop spawning can be moved.
This also calls for renaming the future member of the class and allows
to make it regular, not shared, future.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-09-12 16:35:53 +03:00
Pavel Emelyanov
9f45778467 batchlog_manager: Move shard-0 check into batchlog_replay_loop()
Currently the only caller of it is the batchlog manager itself. It
checks for the shard-id to be zero, calls the method, then the method
asserts that it's run on shard-0.

Moving the check into the method removes the need for assertion and
makes further patching simpler.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-09-12 16:32:12 +03:00
Pavel Emelyanov
38d0ea0916 batchlog_manager: Fix drain() reentrability
Currently drain() is called twise -- first time from
storage_service::drain() (on shutdown), second via
batchlog_manager::stop(). The routine is unintentinally re-entrable,
because:
- explicit check for not aborting the abort source twise
- breaking semaphore can be done multiple times
- co-await-ing of the _started future works because the future is shared

That's not extremely elegant, better to make the drain() bail out early
if it was already called.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-09-12 16:30:07 +03:00
Patryk Jędrzejczak
92209996b5 system_keyspace: rename cdc_generation_id_v2
Changing the second value of cdc_generation_id_v2 from uuid_type
to timeuuid_type made the name of cdc_generation_id_v2 unsuitable
because it does not match cdc::generation_id_v2 anymore.
2023-09-12 11:43:34 +02:00
Patryk Jędrzejczak
1c58c6336a system_keyspace: change id to timeuuid in CDC_GENERATIONS_V3
We change the type of IDs in CDC_GENERATIONS_V3 to timeuuid to
give them a time-based order. We also change how we initialize
them so that the new CDC generation always has the highest ID.
This is the last step to enabling the efficient clearing of
obsolete CDC generation data.

Additionally, we change the types of current_cdc_generation_uuid,
new_cdc_generation_data_uuid and the second values of the elements
in unpublished_cdc_generations to timeuuid, so that they match id
in CDC_GENERATIONS_V3.
2023-09-12 11:43:34 +02:00
Patryk Jędrzejczak
2cd430ac80 system_kayspace: make CDC_GENERATIONS_V3 single-partition
We make CDC_GENERATIONS_V3 single-partition by adding the key
column and changing the clustering key from range_end to
(id, range_end). This is the first step to enabling the efficient
clearing of obsolete CDC generation data, which we need to prevent
Raft-topology snapshots from endlessly growing as we introduce new
generations over time. The next step is to change the type of the id
column to timeuuid. We do it in the following commits.

After making CDC_GENERATIONS_V3 single-partition, there is no easy
way of preserving the num_ranges column. As it is used only for
sanity checking, we remove it to simplify the implementation.
2023-09-12 09:51:45 +02:00
Patryk Jędrzejczak
ed1c1369d9 cdc: generation: rename get_cdc_generation_mutations
In the following commits, we modify the CDC_GENERATIONS_V3 schema
to enable efficient clearing of obsolete CDC generation data.
These modifications make the current get_cdc_generation_mutations
work only for the CDC_GENERATIONS_V2 schema, and we need a new
function for CDC_GENERATIONS_V3, so we add the "_v2" suffix.
2023-09-11 12:30:21 +02:00
Botond Dénes
b062b245ad Merge 'Don't cache dc:rack on system keyspace local cache' from Pavel Emelyanov
The local node's dc:rack pair is cached on system keyspace on start. However, most of other code don't need it as they get dc:rack from topology or directly from snitch. There are few places left that still mess with sysks cache, but they are easy to patch. So after this patch all the core code uses two sources of dc:rack -- topology / snitch -- instead of three.

Closes #15280

* github.com:scylladb/scylladb:
  system_keyspace: Don't require snitch argument on start
  system_keyspace: Don't cache local dc:rack pair
  system_keyspace: Save local info with explicit location
  storage_service: Get endpoint location from snitch, not system keyspace
  snitch: Introduce and use get_location() method
  repair: Local location variables instead of system keyspace's one
  repair: Use full endpoint location instead of datacenter part
2023-09-11 10:26:26 +03:00
Patryk Jędrzejczak
d404443b54 system_keyspace: load unpublished_cdc_generations to topology
We extend service::topology with the list of unpublished CDC
generations and load its contents from system.topology. This step
is the last one in making unpublished CDC generations accessible
to the topology coordinator.

Note that when we load unpublished_cdc_generations, we don't
perform any sanity checks contrary to current_cdc_generation_uuid.
Every unpublished CDC generation was a current generation once,
and we checked it at that moment.
2023-09-08 09:05:01 +02:00
Patryk Jędrzejczak
5ed9d4db6d raft topology: add unpublished_cdc_generations to system.topology
In the following commits, we replace the
topology::transition_state::publish_cdc_generation state with
a background fiber that continually publishes committed CDC
generations. To make these generations accessible to the
topology coordinator, we store them in the new column of
system.topology -- unpublished_cdc_generations.
2023-09-08 09:05:01 +02:00
Aleksandra Martyniuk
8a65477202 tasks: db: change default task_ttl value
If a test isn't going to use task manager or isn't interested in
statuses of finished tasks, then keeping them in the memory
for some time (currently 10s by default) after they are finished
is a memory waste.

Set default task_ttl value to zero. It can be changed by setting
--task-ttl-in-seconds or through rest api (/task_manager/ttl).

In conf/scylla.yaml set task-ttl-in-seconds to 10.

Closes #15239
2023-09-07 12:42:29 +03:00
Dawid Medrek
c7fe5d7f94 utils/lister: Limit the API of scan_dir() to fs::path
Right now, the function allows for passing the path to a file as a seastar::sstring,
which is then converted to std::filesystem::path -- implicitly to the caller.
However, the function performs I/O, and there is no reason to accept any other type
than std::filesystem::path, especially because the conversion is straightforward.
Callers can perform it on their own.

This commit introduces the more constrained API.

Closes #15266
2023-09-05 20:50:42 +03:00
Pavel Emelyanov
5d52a35e05 system_keyspace: Don't require snitch argument on start
Now system keyspace is finally independent from snitch

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-09-05 12:57:09 +03:00
Pavel Emelyanov
1daa8fa3bb system_keyspace: Don't cache local dc:rack pair
Now no code needs it from system keyspace

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-09-05 12:56:45 +03:00
Pavel Emelyanov
9926917bf5 system_keyspace: Save local info with explicit location
On boot system keyspace is kicked to insert local info into system.local
table. Among other things there's dc:rack pair which sys.ks. gets from
its cache which, in turn, should have been previously initialized from
snitch on sys.ks. start. This patch makes the local info updating method
get the dc:rack from caller via argument. Callers, in turn, call snitch
directly, because these are main and cql_test_env startup routines.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-09-05 12:54:46 +03:00
Kefu Chai
f6cca741ea config: remove "experimental" option
"experimental" option was marked "Unused" in 64bc8d2f7d. but we
chose to keep it in hope that the upgrade test does not fail.
despite that the upgrade tests per-se survived the "upgrade",
after the upgrade, the tests exercising the experimental features
are still failing hard. they have not been updated to set the
"experimental-features" option, and are still relying on
"experimental" to enable all the experimental features under
test.

so, in this change, let's just drop the option so that
scylla can fail early at seeing this "experimental" option.
this should help us to identify the tests relying on it
quicker. as the "experimental" features should only be used
in development environment, this change should have no impact
to production.

Refs #15214
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes #15233
2023-09-05 10:09:04 +03:00
Piotr Smaroń
eb46f1bd17 guardrails: restrict replication factor (RF)
Replacing `minimum_keyspace_rf` config option with 4 config options:
`{minimum,maximum}_replication_factor_{warn,fail}_threshold`, which
allow us to impose soft limits (issue a warning) and hard limits (not
execute CQL) on RF when creating/altering a keyspace.
The reason to rather replace than extend `minimum_keyspace_rf` config
option is to be aligned with Cassandra, which did the same, and has the
same parameters' names.
Only min soft limit is enabled by default and it is set to 3, which means
that we'll generate a CQL warning whenever RF is set to either 1 or 2.
RF's value of 0 is always allowed and means that there will not be any
replicas on a given DC. This was agreed with PM.
Because we don't allow to change guardrails' values when scylla is
running (per PM), there're no tests provided with this PR, and dtests will be
provided separately.
Exceeding guardrails' thresholds will be tracked by metrics.

Resolves #8619
Refs #8892 (the RF part, not the replication-strategy part)

Closes #14262
2023-09-04 19:22:17 +03:00
Avi Kivity
78fc3b5f56 config: rename stream_plan_ranges_percentage to *_fraction
The value is specified as a fraction between 0 and 1, so don't
mislead users into specifying a value between 0 and 100.

Closes #15261
2023-09-03 23:24:29 +03:00
Avi Kivity
9a3d57256a Merge 'config: add index_cache_fraction' from Michał Chojnowski
Index caching was disabled by default because it caused performance regressions
for some small-partition workloads. See https://github.com/scylladb/scylladb/issues/11202.

However, it also means that there are workloads which could benefit from the
index cache, but (by default) don't.

As a compromise, we can set a default limit on the memory usage of index cache,
which should be small enough to avoid catastrophic regressions in
small-partition workloads, but big enough to accommodate workloads where
index cache is obviously beneficial.

This series adds such a configurable limit, sets it to to 0.2 of total cache memory by default,
and re-enables index caching by default.

Fixes #15118

Closes #14994

* github.com:scylladb/scylladb:
  test: boost/cache_algorithm_test: add cache_algorithm_test
  sstables: partition_index_cache: deglobalize stats
  utils: cached_file: deglobalize cached_file metrics
  db: config: enable index caching by default
  config: add index_cache_fraction
  utils: lru: add move semantics to list links
2023-09-03 19:39:31 +03:00
Michał Chojnowski
f00bed9429 sstables: partition_index_cache: deglobalize stats
Move partition_index_cache stats from a thread_local variable
to cache_tracker. After the change, partition_index_cache
receives a reference to the stats via constructor, instead of
referencing a global.

This is needed so that cache_tracker can know the memory usage
of index caches (for cache eviction purposes) without relying on
globals.

But it also makes sense even without that motive.
2023-09-01 22:34:41 +02:00
Michał Chojnowski
c7d9d35030 utils: cached_file: deglobalize cached_file metrics
Move cached_file metrics from a thread_local variable
to cache_tracker.

This is needed so that cache_tracker can know
the memory usage of index caches (for purposes
of cache eviction) without relying on globals.

But it also makes sense even without that motive.
2023-09-01 22:34:41 +02:00
Michał Chojnowski
023accf246 db: config: enable index caching by default
Index caching was disabled by default because it caused performance regressions
for some small-partition workloads. See #11202.

However, it also means that there are workloads which could benefit from the
index cache, but (by default) don't.

As a compromise, we can set a default limit on the memory usage of index cache,
which should be small enough to avoid catastrophical regressions in
small-partition workloads, but big enough to accomodate workloads where
index cache is obviously beneficial.

This patch sets such a limit to 0.2 of total cache memory, and re-enables
index caching by default.
2023-09-01 22:34:23 +02:00
Michał Chojnowski
50b429f255 config: add index_cache_fraction
Adds a configurable upper limit to memory usage by index caches.
See the source code comments added in this patch for more details.

This patch shouldn't change visible behaviour, because the limit is set to 1.0
by default, so it is never triggerred. We will change the default in a future
patch.
2023-09-01 22:34:23 +02:00
Tomasz Grabiec
7b65d4d947 Merge 'Gossiper: provide strong exception safety for endpoint state changes' from Benny Halevy
This series ensures that endpoint state changes (for each single endpoint) are applied to the gossiper endpoint_state_map as a whole and on all shards.
Any failure in the process will keep the existing endpoint state intact.

Note that verbs that modify the endpoint states of multiple endpoints may still succeed to modify some of them before hitting an error and those changes are committed to the endpoint_state_map, so we don't ensure atomicity when updating multiple endpoints' states.

Fixes scylladb/scylladb#14794
Fixes scylladb/scylladb#14799

Closes #15073

* github.com:scylladb/scylladb:
  gossiper: move endpoint_state by value to apply it
  gossiper: replicate: make exception safe
  gms: pass endpoint_state_ptr to endpoint_state change subscribers
  gossiper: modify endpoint state only via replicate
  gossiper: keep and serve shared endpoint_state_ptr in map
  gossiper: get_max_endpoint_state_version: get state by reference
  api/failure_detector: get_all_endpoint_states: reduce allocations
  cdc/generation: get_generation_id_for: get endpoint_state&
  gossiper: add for_each_endpoint_state helpers
  gossiper: add num_endpoints
  gossiper: add my_endpoint_state
2023-09-01 12:23:19 +02:00
Benny Halevy
4f5ffc7719 gossiper: add for_each_endpoint_state helpers
Before changing _endpoint_state_map to hold a
lw_shared_ptr<endpoint_state>, provide synchronous helpers
for users to traverse all endpoint_states with no need
to copy them (as long as the called func does not yield).

With that, gossiper::get_endpoint_states() can be made private.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2023-08-31 08:32:31 +03:00
Dawid Medrek
9afaf39acb Get rid of UB in commitlog.hh
Identifiers starting with an underscore followed by a capital letter
are reserved. They should not be used.

Closes #15227
2023-08-31 00:03:04 +03:00
Benny Halevy
2c54d7a35a view, storage_proxy: carry effective_replication_map along with endpoints
When sending mutation to remote endpoint,
the selected endpoints must be in sync with
the current effective_replication_map.

Currently, the endpoints are sent down the storage_proxy
stack, and later on an effective_replication_map is retrieved
again, and it might not match the target or pending endpoints,
similar to the case seen in https://github.com/scylladb/scylladb/issues/15138

The correct way is to carry the same effective replication map
used to select said endpoints and pass it down the stack.
See also https://github.com/scylladb/scylladb/pull/15141

Fixes scylladb/scylladb#15144
Fixes scylladb/scylladb#14730

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>

Closes #15142
2023-08-29 09:08:42 +03:00
Kamil Braun
93be4c0cb0 Merge 'Base node liveliness consistently on gossiper::is_alive' from Benny Halevy
Currently he gossiper marks endpoint_state objects as alive/dead.
I some cases the endpoint_state::is_alive function is checked but in many other cases
gossiper::is_alive(endpoint) is used to determine if the endpoint is alive.

This series removed the endpoint_state::is_alive state and moves all the logic to gossiper::is_alive
that bases its decision on the endpoint having an endpoint_state and being in the _live_endpoints set.

For that, the _live_endpoints is made sure to be replicated to all shards when changed
and the endpoint_state changes are serialized under lock_endpoint, and also making sure that the
endpoint_state in the _endpoint_states_map is never updated in place, but rather a temporary copy is changed
and then safely replicated using gossiper::replicate

Refs https://github.com/scylladb/scylladb/issues/14794

Closes #14801

* github.com:scylladb/scylladb:
  gossiper: mark_alive: remove local_state param
  endpoint_state: get rid of _is_alive member and methods
  gossiper: is_alive: use _live_endpoints
  gossiper: evict_from_membership: erase endpoint from _live_endpoints
  gossiper: replicate_live_endpoints_on_change: use _live_endpoints_version to detect change
  gossiper: run: no need to replicate live_endpoints
  gossiper: fold update_live_endpoints_version into replicate_live_endpoints_on_change
  gossiper: add mutate_live_and_unreachable_endpoints
  gossiper: reset_endpoint_state_map: clear also shadow endpoint sets
  gossiper: reset_endpoint_state_map: clear live/unreachable endpoints on all shards
  gossiper: functions that change _live_endpoints must be called on shard 0
  gossiper: add lock_endpoint_update_semaphore
  gossiper: make _live_endpoints an unordered_set
  endpoint_state: use gossiper::is_alive externally
2023-08-23 17:18:05 +02:00
Patryk Jędrzejczak
ef2eac9941 raft topology: make every type in request_param a named struct
We make every alternative type in the request_param variant
a named struct to make the code more readable. Additionally, this
change will make extending request parameters easier if we decide
to do so in the future.

Closes #15132
2023-08-23 16:56:00 +02:00
Kamil Braun
169d19e5b0 Merge 'raft topology: support --ignore-dead-nodes in removenode and replace' from Patryk Jędrzejczak
We add support for `--ignore-dead-nodes` in `raft_removenode` and
`--ignore-dead-nodes-for-replace` in `raft_replace`. For now, we allow
passing only host ids of the ignored nodes. Supporting IPs is currently
impossible because `raft_address_map` doesn't provide a mapping from IP
to a host id.

The main steps of the implementation are as follows:
- add the `ignore_nodes` column to `system.topology`,
- set the `ignore_nodes` value of the topology mutation in `raft_removenode` and `raft_replace`,
- extend `service::request_param` with alternative types that allow storing a set of ids of the ignored nodes,
- load `ignore_nodes` from `system.topology` into `request_param` in `system_keyspace::load_topology_state`,
- add `ignore_nodes` to `exclude_nodes` in `topology_coordinator::exec_global_command`,
- pass `ignore_nodes` to `replace_with_repair` and `remove_with_repair` in `storage_service::raft_topology_cmd_handler`.

Additionally, we add `test_raft_ignore_nodes.py` with two tests that verify the added changes.

Fixes #15025

Closes #15113

* github.com:scylladb/scylladb:
  test: add test_raft_ignore_nodes
  test: ManagerClient.remove_node: allow List[HostId] for ignore_dead
  raft topology: pass ignore_nodes to {replace, remove}_with_repair
  raft topology: exec_global_command: add ignore_nodes to exclude_nodes
  raft topology: exec_global_command: change type of exclude_nodes
  topology_state_machine: extend request_param with a set of raft ids
  raft topology: set ignore_nodes in raft_removenode and raft_replace
  utils: introduce split_comma_separated_list
  raft topology: add the ignore_nodes column to system.topology
2023-08-22 18:04:59 +02:00
Kamil Braun
cdc3cd2b79 Merge 'raft: add fencing tests' from Petr Gusev
In this PR a simple test for fencing is added. It exercises the data
plane, meaning if it somehow happens that the node has a stale topology
version, then requests from this node will get an error 'stale
topology'. The test just decrements the node version manually through
CQL, so it's quite artificial. To test a more real-world scenario we
need to allow the topology change fiber to sometimes skip unavailable
nodes. Now the algorithm fails and retries indefinitely in this case.

The PR also adds some logs, and removes one seemingly redundant topology
version increment, see the commit messages for details.

Closes #14901

* github.com:scylladb/scylladb:
  test_fencing: add test_fence_hints
  test.py: output the skipped tests
  test.py: add skip_mode decorator and fixture
  test.py: add mode fixture
  hints: add debug log for dropped hints
  hints: send_one_hint: extend the scope of file_send_gate holder
  pylib: add ScyllaMetrics
  hints manager: add send_errors counter
  token_metadata: add debug logs
  fencing: add simple data plane test
  random_tables.py: add counter column type
  raft topology: don't increment version when transitioning to node_state::normal
2023-08-22 16:28:21 +02:00
Patryk Jędrzejczak
1f57d80ba1 topology_state_machine: extend request_param with a set of raft ids
We add two new alternative types to service::request_param:
removenode_param and replace_param. They allow storing the list
of ignored nodes loaded from the ignore_nodes column of
system.topology. We also remove the raft::server_id type because
it has been only used by the replace operation.
2023-08-22 14:17:37 +02:00
Petr Gusev
439c91851f hints: add debug log for dropped hints
Dropping data is rather important event,
let's log it at least at the debug level.
It'll help in debugging tests.
2023-08-22 15:48:40 +04:00
Petr Gusev
9fd3df13a2 hints: send_one_hint: extend the scope of file_send_gate holder
The problem was that the holder in with_gate
call was released too early. This happened
before the possible call to on_hint_send_failure
in then_wrapped. As a result, the effects of
on_hint_send_failure (segment_replay_failed flag)
were not visible in send_one_file after
ctx_ptr->file_send_gate.close(), so we could decide
that the segment was sent in full and delete
it even if sending of some hints led to errors.

Fixes #15110
2023-08-22 15:48:40 +04:00
Petr Gusev
1b7603af23 hints manager: add send_errors counter
There was no indication of problems
in the hints manager metrics before.
We need this counter for fencing tests
in the later commit, but it seems to be
useful on its own.
2023-08-22 14:31:04 +04:00
Patryk Jędrzejczak
0beabdc6ba utils: introduce split_comma_separated_list
Three places handle comma-separated lists similarly:
- ss::remove_node.set(...) in api::set_storage_service,
- storage_service::parse_node_list,
- storage_service::is_repair_based_node_ops_enabled.
In the next commit, the fourth place that needs the same logic
appears -- storage_service::raft_replace. It needs to load
and parse the --ignore-dead-nodes-for-replace param from config.

Moreover, the code in is_repair_based_node_ops_enabled is
different and doesn't seem right. We swap '\"' and '\'' with ' '
but don't do anything with it afterward.

To avoid code duplication and fix is_repair_based_node_ops_enabled,
we introduce the new function utils::split_comma_separated_list.

This change has a small side effect on logging. For example,
ignore_nodes_strs in storage_service::parse_node_list might be
printed in a slightly different form.
2023-08-22 10:30:36 +02:00
Patryk Jędrzejczak
16f5db8af2 raft topology: add the ignore_nodes column to system.topology
In the following commits, we add support for --ignore-dead-nodes
in raft_removenode and --ignore-dead-nodes-for-replace in
raft_replace. To make these request parameters accessible for the
topology coordinator, we store them in the new ignore_nodes
column of system.topology.
2023-08-22 10:30:12 +02:00
Benny Halevy
97061cc3b8 endpoint_state: use gossiper::is_alive externally
Before we remove endpoint_state:_is_alive to rely
solely on gossipper::_live_endpoints.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2023-08-22 09:06:09 +03:00
Avi Kivity
ce43effc21 Merge "fix rebuild with consistent topology management" From Gleb Natapov
"
The series fixes bogus asserting during topology state load and add a
test that runs rebuild to make sure the code will not regress again.

Fixes #14958
"

* 'gleb/rebuilding_fix_v1' of github.com:scylladb/scylla-dev:
  test: add rebuild test
  system_keyspace: fix assertion for missing transition_state
2023-08-21 16:00:42 +03:00
Pavel Emelyanov
6bc30f1944 system_keyspace: De-bloat .setup() from messing with system.local
On boot several manipulations with system.local are performed.

1. The host_id value is selected from it with key = local

   If not found, system_keyspace generates a new host_id, inserts the
   new value into the table and returns back

2. The cluster_name is selected from it with key = local

   Then it's system_keyspace that either checks that the name matches
   the one from db::config, or inserts the db::config value into the
   table

3. The row with key = local is updated with various info like versions,
   listen, rpc and bcast addresses, dc, rack, etc. Unconditionally

All three steps are scattered over main, p.1 is called directly, p.2 and
p.3 are executed via system_keyspace::setup() that happens rather late.
Also there's some touch of this table from the cql_test_env startup code.

The proposal is to collect this setup into one place and execute it
early -- as soon as the system.local table is populated. This frees the
system_keyspace code from the logic of selecting host id and cluster
name leaving it to main and keeps it with only select/insert work.

refs: #2795

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>

Closes #15082
2023-08-20 21:24:31 +03:00
Kefu Chai
12d6ec5a18 config: respect --log-with-color 1
scylladb overrides some of seastar logging related options with its
own options by applying them with `logging::apply_settings()`. but
we fail to inherit `with_color` from Seastar as we are using the
designated initializer, so the unspecified members are zero initialized.
that's why we always have logging message in black and white even
if scylla is running in a tty and `--log-with-color 1` is specified.

so, make the debugging life more colorful, let's inherit the option
from Seastar, and apply it when setting logging related options.

see also 29e09a3292

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes #15076
2023-08-20 13:47:43 +03:00
Tomasz Grabiec
bd8bb5d4b1 Merge 'Wire tablet into compaction group' from Raphael "Raph" Carvalho
Compaction group is the data plane for tablets, so this integration
allows each tablet to have its own storage (memtable + sstables).
A crucial step for dynamic tablets, where each tablet can be worked
on independently.

There are still some inefficiencies to be worked on, but as it is,
it already unlocks further development.

```
INFO  2023-07-27 22:43:38,331 [shard 0] init - loading tablet metadata
INFO  2023-07-27 22:43:38,333 [shard 0] init - loading non-system sstables
INFO  2023-07-27 22:43:38,354 [shard 0] table - Tablet with id 0 present for ks.cf
INFO  2023-07-27 22:43:38,354 [shard 0] table - Tablet with id 2 present for ks.cf
INFO  2023-07-27 22:43:38,354 [shard 0] table - Tablet with id 4 present for ks.cf
INFO  2023-07-27 22:43:38,354 [shard 0] table - Tablet with id 6 present for ks.cf
INFO  2023-07-27 22:43:38,428 [shard 1] table - Tablet with id 1 present for ks.cf
INFO  2023-07-27 22:43:38,428 [shard 1] table - Tablet with id 3 present for ks.cf
INFO  2023-07-27 22:43:38,428 [shard 1] table - Tablet with id 5 present for ks.cf
INFO  2023-07-27 22:43:38,428 [shard 1] table - Tablet with id 7 present for ks.cf
```

Closes #14863

* github.com:scylladb/scylladb:
  Kill scylla option to configure number of compaction groups
  replica: Wire tablet into compaction group
  token_metadata: Add this_host_id to topology config
  replica: Switch to chunked_vector for storing compaction groups
  replica: Generate group_id for compaction_group on demand
2023-08-18 15:17:17 +02:00
Avi Kivity
1901475598 Merge 'config: mark "experimental" option unused and cleanups' from Kefu Chai
in this series, the "experimental" option is marked `Unused` as it has been marked deprecated for almost 2 years since scylla 4.6. and use `experimental_features` to specify the used experimental features explicitly.

Closes #14948

* github.com:scylladb/scylladb:
  config: remove unused namespace alias
  config: use std::ranges when appropriate
  config: drop "experimental" option
  test: disable 'enable_user_defined_functions' if experimental_features does not include udf
  test: pylib: specify experimental_features explicitly
2023-08-17 20:42:02 +03:00
Kefu Chai
6788903fd6 db: config: mark config class final
in 34c3688017, we added a virtual function
to `config_file`, and we new and delete pointer pointing to a
`db::config` instance with `unique_ptr<>`. this makes the compiler
nervous, as deleting a pointer pointing to an instance of non-final
class with virtual function could lead to leak, if this pointer actually
points to a derived class of this non-final class. so, in order to
silence the warning and to prevent potential problem in future, let's
mark `db::config` final.

the warning from Clang 16 looks like:

```
In file included from /home/kefu/dev/scylladb/test/lib/test_services.cc:10:
In file included from /home/kefu/dev/scylladb/test/lib/test_services.hh:25:
In file included from /usr/bin/../lib/gcc/x86_64-redhat-linux/13/../../../../include/c++/13/memory:78:
/usr/bin/../lib/gcc/x86_64-redhat-linux/13/../../../../include/c++/13/bits/unique_ptr.h:99:2: error: delete called on non-final 'db::config' that has virtual functions but non-virtual destructor [-Werror,-Wdelete-non-abstract-non-virtual-dtor]
        delete __ptr;
        ^
/usr/bin/../lib/gcc/x86_64-redhat-linux/13/../../../../include/c++/13/bits/unique_ptr.h:404:4: note: in instantiation of member function 'std::default_delete<db::config>::operator()' requested here
          get_deleter()(std::move(__ptr));
          ^
/home/kefu/dev/scylladb/test/lib/test_services.cc:189:16: note: in instantiation of member function 'std::unique_ptr<db::config>::~unique_ptr' requested here
    auto cfg = std::make_unique<db::config>();
               ^
```

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes #15071
2023-08-17 13:43:16 +03:00
Raphael S. Carvalho
b578d6643f Kill scylla option to configure number of compaction groups
The option was introduced to bootstrap the project. It's still
useful for testing, but that translates into maintaining an
additional option and code that will not be really used
outside of testing. A possible option is to later map the
option in boost tests to initial_tablets, which may yield
the same effect for testing.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
2023-08-16 18:23:53 -03:00
Piotr Smaroń
34c3688017 db: config: add live_updatable_config_params_changeable_via_cql option
If `live_updatable_config_params_changeable_via_cql` is set to true, configuration parameters defined with `liveness::LiveUpdate` option can be updated in the runtime with CQL, i.e. by updating `system.config` virtual table.
If we don't want any configuration parameter to be changed in the
runtime by updating `system.config` virtual table, this option should be
set to false. This option should be set to false for e.g. cloud users,
who can only perform CQL queries, and should not be able to change
scylla's configuration on the fly.

Current implemenatation is generic, but has a small drawback - messages
returned to the user can be not fully accurate, consider:
```
cqlsh> UPDATE system.config SET value='2' WHERE name='task_ttl_in_seconds';
WriteFailure: Error from server: code=1500 [Replica(s) failed to execute write] message="option is not live-updateable" info={'failures': 1, 'received_responses': 0, 'required_responses': 1, 'consistency': 'ONE'}
```
where `task_ttl_in_seconds` has been defined with
`liveness::LiveUpdate`, but because `live_updatable_config_params_changeable_via_cql` is set to
`false` in `scylla.yaml,` `task_ttl_in_seconds` cannot be modified in the
runtime by updating `system.config` virtual table.

Fixes #14355

Closes #14382
2023-08-16 17:56:27 +03:00
Benny Halevy
8fbcf1ab9f view: start: ignore also abort_requested_exception
We see the abort_requested_exception error from time
to time, instead of sleep_aborted that was expected
and quietly ignored (in debug log level).

Treat abort_requested_exception the same way since
the error is expected on shutdown and to reduce
test flakiness, as seen for example, in
https://jenkins.scylladb.com/job/scylla-master/job/scylla-ci/3033/artifact/logs-full.release.010/1691896356104_repair_additional_test.py%3A%3ATestRepairAdditional%3A%3Atest_repair_schema/node2.log
```
INFO  2023-08-13 03:12:29,151 [shard 0] compaction_manager - Asked to stop
WARN  2023-08-13 03:12:29,152 [shard 0] gossip - failure_detector_loop: Got error in the loop, live_nodes={}: seastar::sleep_aborted (Sleep is aborted)
INFO  2023-08-13 03:12:29,152 [shard 0] gossip - failure_detector_loop: Finished main loop
WARN  2023-08-13 03:12:29,152 [shard 0] cdc - Aborted update CDC description table with generation (2023/08/13 03:12:17, d74aad4b-6d30-4f22-947b-282a6e7c9892)
INFO  2023-08-13 03:12:29,152 [shard 1] compaction_manager - Asked to stop
INFO  2023-08-13 03:12:29,152 [shard 1] compaction_manager - Stopped
INFO  2023-08-13 03:12:29,153 [shard 0] init - Signal received; shutting down
INFO  2023-08-13 03:12:29,153 [shard 0] init - Shutting down view builder ops
INFO  2023-08-13 03:12:29,153 [shard 0] view - Draining view builder
INFO  2023-08-13 03:12:29,153 [shard 1] view - Draining view builder
INFO  2023-08-13 03:12:29,153 [shard 0] compaction_manager - Stopped
ERROR 2023-08-13 03:12:29,153 [shard 0] view - start failed: seastar::abort_requested_exception (abort requested)
ERROR 2023-08-13 03:12:29,153 [shard 1] view - start failed: seastar::abort_requested_exception (abort requested)
```

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>

Closes #15029
2023-08-13 18:39:09 +03:00