Commit Graph

11801 Commits

Author SHA1 Message Date
Kefu Chai
f4706be8a8 test: test_topology_ops: adapt to tablets
in e7d4e080, we reenabled the background writes in this test, but
when running with tablets enabled, background writes are still
disabled because of #17025, which was fixed last week. so we can
enable background writes with tablets.

in this change,

* background writes are enabled with tablets.
* increase the number of nodes by 1 so that we have enough nodes
  to fulfill the needs of tablets, which enforces that the number
  of replicas should always satisfy RF.
* pass rf to `start_writes()` explicitly, so we have less
  magic numbers in the test, and make the data dependencies
  more obvious.

Fixes #17589
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#18707
2024-06-08 17:46:37 +02:00
Gleb Natapov
34cf5c81f6 group0, topology coordinator: run group0 and the topology coordinator in gossiper scheduling group
Currently they both run in streaming group and it may become busy during
repair/mv building and affect group0 functionality. Move it to the
gossiper group where it should have more time to run.

Fixes scylladb/scylladb#18863

Closes scylladb/scylladb#19138
2024-06-07 15:31:44 +02:00
Pavel Emelyanov
b854bf4b83 lang: Don't use db::config to create lua context
Similarly to previous patch, lua context needs db::config for creation.
It's better to get the configurables via lang::manager::config.

One thing to note -- lua config carries updateable_values on board, but
respective db::config options and _not_ LiveUpdate-able, so the lua
config could just use simple data types. This patch keeps updateable
values intact for brevity.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2024-06-07 13:07:05 +03:00
Pavel Emelyanov
783ccc0a74 lang: Don't use db::config to create wasm context
The managerr needs to get two "fuel" configurables from db::config in
order to create context. Instead of carrying db config from callers,
keep the options on existing lang::manager::config and use them.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2024-06-07 13:07:05 +03:00
Pavel Emelyanov
fe7ff7172d wasm: Replace startup_context with wasm_config
The lang::manager starts with the help of a context because it needs to
have std::shared_ptr<> pointg to cross-shard shared wasm engine and
runner thread. For that a context is created in advance, that then helps
sharing the engine and runner across manager instances.

This patch removes the "context" and replaces it with classical
manager::config. With it, it's lang::manager who's now responsible for
initializing itself.

In order to have cross-shard engine and thread pointers, the start()
method uses invoke_on_others() facility to share the pointer.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2024-06-07 12:35:57 +03:00
Pavel Emelyanov
0dad72b736 lang: Add manager::start() method
Just like any other sharded<> service, the lang::manager now starts and
stops in a classical sequence of

  await sharded<manager>::start()
  defer([] { await sharded<manager>::stop() })
  await sharded<manager>::invoke_on_all(&manager::start)

For now the method is no-op, next patches will start using it.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2024-06-07 12:35:57 +03:00
Pavel Emelyanov
f950469af5 lang: Move manager to lang namespace
And, while at it, rename local variable to refer to it to as "manager"
not "wasm". Query processor and database also have getters named
"wasm()", these are not renamed yet to keep patch smaller (and those
getters are going to be reworked further anyway).

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2024-06-07 12:35:57 +03:00
Pavel Emelyanov
1dec79e97d lang: Move wasm::manager to its .cc/.hh files
It's going to become a facade in front of both -- wasm and lua, so keep
it in files with language independent names.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2024-06-07 12:35:57 +03:00
Marcin Maliszkiewicz
c13fea371c cql3: always return created event in create ks/table/type/view statement
In case multiple clients issue concurrently CREATE KEYSPACE IF NOT EXISTS
and later USE KEYSPACE it can happen that schema in driver's session is
out of sync because it synces when it receives special message from
CREATE KEYSPACE response.

Similar situation occurs with other schema change statements.

In this patch we fix only create keyspace/table/type/view statements
by always sending created event. Behavior of any other schema altering
statements remains unchanged.
2024-06-07 10:36:40 +02:00
Piotr Dulikowski
e18aeb2486 Merge 'mv: gossip the same backlog if a different backlog was sent in a response' from Wojciech Mitros
Currently, there are 2 ways of sharing a backlog with other nodes: through
a gossip mechanism, and with responses to replica writes. In gossip, we
check each second if the backlog changed, and if it did we update other
nodes with it. However if the backlog for this node changed on another
node with a write response, the gossiped backlog is currently not updated,
so if after the response the backlog goes back to the value from the previous
gossip round, it will not get sent and the other node will stay with an
outdated backlog - this can be observed in the following scenario:

1. Cluster starts, all nodes gossip their empty view update backlog to one another
2. On node N, `view_update_backlog_broker` (the backlog gossiper) performs an iteration of its backlog update loop, sees no change (backlog has been empty since the start), schedules the next iteration after 1s
3. Within the next 1s, coordinator (different than N) sends a write to N causing a remote view update (which we do not wait for). As a result, node N replies immediately with an increased view update backlog, which is then noted by the coordinator.
4. Still within the 1s, node N finishes the view update in the background, dropping its view update backlog to 0.
5. In the next and following iterations of `view_update_backlog_broker` on N, backlog is empty, as it was in step 2, so no change is seen and no update is sent due to the check
```
auto backlog = _sp.local().get_view_update_backlog();
if (backlog_published && *backlog_published == backlog) {
    sleep_abortable(gms::gossiper::INTERVAL, _as).get();
    continue;
}
```

After this scenario happens, the coordinator keeps an information about an increased view update backlog on N even though it's actually already empty

This patch fixes the issue this by notifying the gossip that a different backlog
was sent in a response, causing it to send an unchanged backlog to other
nodes in the following gossip round.

Fixes: https://github.com/scylladb/scylladb/issues/18461

Similarly to https://github.com/scylladb/scylladb/pull/18646, without admission control (https://github.com/scylladb/scylladb/pull/18334), this patch doesn't affect much, so I'm marking it as backport/none

Tests: manual. Currently this patch only affects the length of MV flow control delay, which is not reliable to base a test on. A proper test will be added when MV admission control is added, so we'll be able to base the test on rejected requests

Closes scylladb/scylladb#18663

* github.com:scylladb/scylladb:
  mv: gossip the same backlog if a different backlog was sent in a response
  node_update_backlog: divide adding and fetching backlogs
2024-06-07 10:20:21 +02:00
Kefu Chai
ad649be1bf treewide: drop thrift support
thrift support was deprecated since ScyllaDB 5.2

> Thrift API - legacy ScyllaDB (and Apache Cassandra) API is
> deprecated and will be removed in followup release. Thrift has
> been disabled by default.

so let's drop it. in this change,

* thrift protocol support is dropped
* all references to thrift support in document are dropped
* the "thrift_version" column in system.local table is
  preserved for backward compatibility, as we could load
  from an existing system.local table which still contains
  this clolumn, so we need to write this column as well.
* "/storage_service/rpc_server" is only preserved for
  backward compatibility with java-based nodetool.
* `rpc_port` and `start_rpc` options are preserved, but
  they are marked as "Unused". so that the new release
  of scylladb can consume existing scylla.yaml configurations
  which might contain these settings. by making them
  deprecated, user will be able get warned, and update
  their configurations before we actually remove them
  in the next major release.

Fixes #3811
Fixes #18416
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2024-06-07 06:44:59 +08:00
Avi Kivity
cd553848c1 Merge 'auth-v2: use a single transaction in auth related statements ' from Marcin Maliszkiewicz
Due to gradual raft introduction into statements code in cases when single statement modified more than one table or mutation producing function was composed out of simpler ones we violated transactional logic and statement execution was not atomic as whole.

This patch changes that, so now either all changes resulting from statement execution are applied or none. Affected statements types are:
- schema modification
- auth modifications
- service levels modifications

Fixes https://github.com/scylladb/scylladb/issues/17738

Closes scylladb/scylladb#17910

* github.com:scylladb/scylladb:
  raft: rename mutations_collector to group0_batch
  raft: rename announce to commit
  cql3: raft: attach description to each mutations collector group
  auth: unify mutations_generator type
  auth: drop redundant 'this' keyword
  auth: remove no longer used code from standard_role_manager::legacy_modify_membership
  cql3: auth: use mutation collector for service levels statements
  cql3: auth: use mutation collector for alter role
  cql3: auth: use mutation collector for grant role and revoke role
  cql3: auth: use mutation collector for drop role and auto-revoke
  auth: add refactored modify_membership func in standard_role_manager
  auth: implement empty revoke_all in allow_all_authorizer
  auth: drop request_execution_exception handling from default_authorizer::revoke_all
  Revert "Introduce TABLET_KEYSPACE event to differentiate processing path of a vnode vs tablets ks"
  cql3: auth: use mutation collector for grant and revoke permissions
  cql3: extract changes_tablets function in alter_keyspace_statement
  cql3: auth: use mutation collector for create role statement
  auth: move create_role code into service
  auth: add a way to announce mutations having only client_state ref
  auth: add collect_mutations common helper
  auth: remove unused header in common.hh
  auth: add class for gathering mutations without immediate announce
  auth: cql3: use auth facade functions consistently on write path
  auth: remove unused is_enforcing function
2024-06-06 17:31:26 +03:00
Marcin Maliszkiewicz
63e6334a64 raft: rename mutations_collector to group0_batch 2024-06-06 13:26:34 +02:00
Kamil Braun
57e810c852 Merge 'Serialize repair with tablet migration' from Tomasz Grabiec
We want to exclude repair with tablet migrations to avoid races
between repair reads and writes with replica movement. Repair is not
prepared to handle topology transitions in the middle.

One reason why it's not safe is that repair may successfully write to
a leaving replica post streaming phase and consider all replicas to be
repaired, but in fact they are not, the new replica would not be
repaired.

Other kinds of races could result in repair failures. If repair writes
to a leaving replica which was already cleaned up, such writes will
fail, causing repair to fail.

Excluding works by keeping effective_replication_map_ptr in a version
which doesn't have table's tablets in transitions. That prevents later
transitions from starting because topology coordinator's barrier will
wait for that erm before moving to a stage later than
allow_write_both_read_old, so before any requests start using the new
topology. Also, if transitions are already running, repair waits for
them to finish.

A blocked tablet migration (e.g. due to down node) will block repair,
whereas before it would fail. Once admin resolves the cause of blocked migration,
repair will continue.

Fixes #17658.
Fixes #18561.

Closes scylladb/scylladb#18641

* github.com:scylladb/scylladb:
  test: pylib: Do not block async reactor while removing directories
  repair: Exclude tablet migrations with tablet repair
  repair_service: Propagate topology_state_machine to repair_service
  main, storage_service: Move topology_state_machine outside storage_service
  storage_srvice, toplogy: Extract topology_state_machine::await_quiesced()
  tablet_scheduler: Make disabling of balancing interrupt shuffle mode
  tablet_scheduler: Log whether balancing is considered as enabled
2024-06-06 11:27:03 +02:00
Kamil Braun
256517b570 Merge 'tablets: Filter-out left nodes in get_natural_endpoints()' from Tomasz Grabiec
The API already promises this, the comment on effective_replication_map says:
"Excludes replicas which are in the left state".

Tablet replicas on the replaced node are rebuilt after the node
already left. We may no longer have the IP mapping for the left node
so we should not include that node in the replica set. Otherwise,
storage_proxy may try to use the empty IP and fail:

  storage_proxy - No mapping for :: in the passed effective replication map

It's fine to not include it, because storage proxy uses keyspace RF
and not replica list size to determine quorum. The node is not coming
up, so noone should need to contact it.

Users which need replica list stability should use the host_id-based API.

Fixes #18843

Closes scylladb/scylladb#18955

* github.com:scylladb/scylladb:
  tablets: Filter-out left nodes in get_natural_endpoints()
  test: pylib: Extract start_writes() load generator utility
2024-06-06 11:23:27 +02:00
Wojciech Mitros
272e80fe0a node_update_backlog: divide adding and fetching backlogs
Currently, we only update the backlogs in node_update_backlog at the
same time when we're fetching them. This is done using storage_proxy's
method get_view_update_backlog, which is confusing because it's a getter
with side-effects. Additionally, we don't always want to update the
backlog when we're reading it (as in gossip which is only on shard 0)
and we don't always want to read it when we're updating it (when we're
not handling any writes but the backlog drops due to background work
finish).

This patch divides the node_view_backlog::add_fetch as well the
storage_proxy::get_view_update_backlog both into two methods; one
for updating and one for reading the backlog. This patch only replaces
the places where we're currently using the view backlog getter, more
situations where we should get/update the backlog should be considered
in a following patch.
2024-06-06 10:45:13 +02:00
Botond Dénes
cd10beb89d Merge 'Don't use db::config by gossiper' from Pavel Emelyanov
All sharded<service>'s a supposed to have their own config and not use global db::config one. The service config, in turn, is to be created by main/cql_test_env/whatever out of db::config and, maybe, other data. Gossiper is almost there, but it still uses db::config in few places.

Closes scylladb/scylladb#19051

* github.com:scylladb/scylladb:
  gossiper: Stop using db::config
  gossiper: Move force_gossip_generation on gossip_config
  gossiper: Move failure_detector_timeout_ms on gossip_config
  main: Fix indentation after previous patch
  main: Make gossiper config a sharded parameter
  main: Add local variable for set of seeds
  main: Add local variable for group0 id
  main: Add local variable for cluster_name
2024-06-06 09:12:51 +03:00
Nadav Har'El
b5fd854c77 cql-pytest: be more forgiving to ancient versions of Scylla
We recently added to cql-pytest tests the ability to check if tablets
are enabled or not (for some tablet-specific tests). When running
tests against Cassandra or old pre-tablet versions of Scylla, this
fact is detected and "False" is returned immediately. However, we
still look at a system table which didn't exist on really ancient
versions of Scylla, and tests couldn't run against such versions.

The fix is trivial: if that system table is missing, just ignore the
error and return False (i.e., no tablets). There were no tablets on
such ancient versions of Scylla.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>

Closes scylladb/scylladb#19098
2024-06-06 08:53:26 +03:00
Tomasz Grabiec
2c3f7c996f test: pylib: Fetch all pages by default in run_async
Fetching only the first page is not the intuitive behavior expected by users.

This causes flakiness in some tests which generate variable amount of
keys depending on execution speed and verify later that all keys were
written using a single SELECT statement. When the amount of keys
becomes larger than page size, the test fails.

Fixes #18774

Closes scylladb/scylladb#19004
2024-06-05 18:07:24 +03:00
Tomasz Grabiec
5ca54a6e88 test: pylib: Do not block async reactor while removing directories
This fixes a problem where suite cleanup schedules lots of uninstall()
tasks for servers started in the suite, which schedules lots of tasks,
which synchronously call rmtree(). These take over a minute to finish,
which blocks other tasks for tests which are still executing.

In particular, this was observed to case
ManagerClient.server_stop_gracefully() to time-out. It has a timeout
of 60 seconds. The server was stopped quickly, but the RESTful API
response was not processed in time and the call timed out when it got
the async reactor.
2024-06-05 16:11:22 +02:00
Tomasz Grabiec
98323be296 repair: Exclude tablet migrations with tablet repair
We want to exclude repair with tablet migrations to avoid races
between repair reads and writes with replica movement. Repair is not
prepared to handle topology transitions in the middle.

One reason why it's not safe is that repair may successfully write to
a leaving replica post streaming phase and consider all replicas to be
repaired, but in fact they are not, the new replica would not be
repaired.

Other kinds of races could result in repair failures. If repair writes
to a leaving replica which was already cleaned up, such writes will
fail, causing repair to fail.

Excluding works by keeping effective_replication_map_ptr in a version
which doesn't have table's tablets in transitions. That prevents later
transitions from starting because topology coordinator's barrier will
wait for that erm before moving to a stage later than
allow_write_both_read_old, so before any requets start using the new
topology. Also, if transitions are already running, repair waits for
them to finish.

Fixes #17658.
Fixes #18561.
2024-06-05 16:11:22 +02:00
Tomasz Grabiec
c45ce41330 main, storage_service: Move topology_state_machine outside storage_service
It will be propagated to repair_service to avoid cyclic dependency:

storage_service <-> repair_service
2024-06-05 16:11:22 +02:00
Kamil Braun
18f5d6fd89 Merge 'Fail bootstrap if ip mapping is missing during double write stage' from Gleb Natapov
If a node restart just before it stores bootstrapping node's IP it will
not have ID to IP mapping for bootstrapping node which may cause failure
on a write path. Detect this and fail bootstrapping if it happens.

Closes scylladb/scylladb#18927

* github.com:scylladb/scylladb:
  raft topology: fix indentation after previous commit
  raft topology: do not add bootstrapping node without IP as pending
  test: add test of bootstrap where the coordinator crashes just before storing IP mapping
  schema_tables: remove unused code
2024-06-05 11:15:15 +02:00
Raphael S. Carvalho
3983f69b2d topology_experimental_raft/test_tablets: restore usage of check_with_down
e7246751b6 incorrectly dropped its usage in
test_tablet_missing_data_repair.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>

Closes scylladb/scylladb#19092
2024-06-05 10:11:02 +02:00
Pavel Emelyanov
dcc083110d gossiper: Stop using db::config
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2024-06-04 20:19:47 +03:00
Marcin Maliszkiewicz
ac0e164a6b raft: rename announce to commit
Old wording was derived from existing code which
originated from schema code. Name commit better
describes what we do here.
2024-06-04 15:43:04 +02:00
Marcin Maliszkiewicz
370a5b547e cql3: raft: attach description to each mutations collector group
This description is readable from raft log table.
Previously single description was provided for the whole
announce call but since it can contain mutations from
various subsystems now description was moved to
add_mutation(s)/add_generator function calls.
2024-06-04 15:43:04 +02:00
Marcin Maliszkiewicz
a88b7fc281 cql3: auth: use mutation collector for service levels statements
This is done to achieve single transaction semantics.
2024-06-04 15:43:04 +02:00
Marcin Maliszkiewicz
97a5da5965 cql3: auth: use mutation collector for alter role
This is done to achieve single transaction semantics.
2024-06-04 15:43:04 +02:00
Marcin Maliszkiewicz
a12c8ebfce cql3: auth: use mutation collector for grant role and revoke role
This is done to achieve single transaction semantics.
2024-06-04 15:43:04 +02:00
Marcin Maliszkiewicz
5ba7d1b116 cql3: auth: use mutation collector for drop role and auto-revoke
The main theme of this commit is executing drop
keyspace/table/aggregate/function statements in a single
transaction together with auth auto-revoke logic.
This is the logic which cleans related permissions after
resource is deleted.

It contains serveral parts which couldn't easily be split
into separate commits mainly because mutation collector related
paths can't be mixed together. It would require holding multiple
guards which we don't support. Another reason is that with mutation
collector the changes are announced in a single place, at the end
of statement execution, if we'd announce something in the middle
then it'd lead to raft concurrent modification infinite loop as it'd
invalidate our guard taken at the begining of statement execution.

So this commit contains:

- moving auto-revoke code to statement execution from migration_listener
 * only for auth-v2 flow, to not break the old one
 * it's now executed during statement execution and not merging schemas,
   which means it produces mutations once as it should and not on each
   node separately
 * on_before callback family wasn't used because I consider it much
   less readable code. Long term we want to remove
   auth_migration_listener.

- adding mutation collector to revoke_all
 * auto-revoke uses this function so it had to be changed,
   auth::revoke_all free function wrapper was added as cql3
   layer should not use underlying_authorizer() directly.

- adding mutation collector to drop_role
 * because it depends on revoke_all and we can't mix old and new flows
 * we need to switch all functions auth::drop_role call uses
 * gradual use of previously introduced modify_membership, otherwise
   we would need to switch even more code in this commit
2024-06-04 15:43:04 +02:00
Marcin Maliszkiewicz
01fb43e35f Revert "Introduce TABLET_KEYSPACE event to differentiate processing path of a vnode vs tablets ks"
This reverts commit 80ed442be2.

This logic was replaced in previous commit by dynamic cast.
Hopefully even this cast will be eliminated in the future.
2024-06-04 15:43:04 +02:00
Marcin Maliszkiewicz
2a6cfbfb33 cql3: auth: use mutation collector for create role statement
This is done to achieve single transaction semantics.

grant_permissions_to_creator is logically part of create role
but its change will be included in following commits
as it spans multiple usages.

Additinally we disabled rollback during create role as
it won't work and is not needed with single transaction logic.
2024-06-04 15:43:04 +02:00
Marcin Maliszkiewicz
7e0a801f53 auth: add class for gathering mutations without immediate announce
To achieve write atomicity across different tables we need to announce
mutations in a single transaction. So instead of each function doing
a separate announce we need to collect mutations and announce them once
at the end.
2024-06-04 15:43:04 +02:00
Botond Dénes
d120f0d7d3 Merge 'tasks: introduce task manager's task folding' from Aleksandra Martyniuk
Task manager's tasks stay in memory after they are finished.
Moreover, even if a child task is unregistered from task manager,
it is still alive since its parent keeps a foreign pointer to it. Also,
when a task has finished successfully there is no point in keeping
all of its descendants in memory.

The patch introduces folding of task manager's tasks. Whenever
a task which has a parent is finished it is unregistered from task
manager and foreign_ptr to it (kept in its parent) is replaced
with its status. Children's statuses of the task are dropped unless
they or one of their descendants failed. So for each operation we
keep a tree of tasks which contains:
- a root task and its direct children (status if they are finished, a task
  otherwise);
- running tasks and their direct children (same as above);
- a statuses path from root to failed tasks.

/task_manager/wait_task/ does not unregister tasks anymore.

Refs: #16694.

- [ ] ** Backport reason (please explain below if this patch should be backported or not) **
Requires backport to 6.0 as task number exploded with tablets.

Closes scylladb/scylladb#18735

* github.com:scylladb/scylladb:
  docs: describe task folding
  test: rest_api: add test for task tree structure
  test: rest_api: modify new_test_module
  tasks: test: modify test_task methods
  api: task_manager: do not unregister task in /task_manager/wait_task/
  tasks: unregister tasks with parents when they are finished
  tasks: fold finished tasks info their parents
  tasks: make task_manager::task::impl::finish_failed noexcept
  tasks: change _children type
2024-06-04 08:43:44 +03:00
Nadav Har'El
95db1c60d6 test/alternator: fix a test failing on Amazon DynamoDB
The test test_table.py::test_concurrent_create_and_delete_table failed
on Amazon DynamoDB because of a silly typo - "false" instead of "False".
A function detecting Scylla tried to return false when noticing this
isn't Scylla - but had a typo, trying to return "false" instead of "False".

This patch fixes this typo, and the test now works on DynamoDB:
test/alternator/run --aws test_table.py::test_concurrent_create_and_delete_table

Signed-off-by: Nadav Har'El <nyh@scylladb.com>

Closes scylladb/scylladb#17799
2024-06-02 22:25:56 +03:00
Tomasz Grabiec
7b1eea794b test: perf: Add test for tablet load balancer effectiveness 2024-06-02 14:23:00 +02:00
Aleksandra Martyniuk
d7e80a6520 test: rest_api: add test for task tree structure
Add test which checks whether the tasks are folded into their parent
as expected.
2024-05-31 10:27:09 +02:00
Aleksandra Martyniuk
fc0796f684 test: rest_api: modify new_test_module
Remove remaining test tasks when a test module is removed, so that
a node could shutdown even if a test fails.
2024-05-31 10:27:09 +02:00
Aleksandra Martyniuk
a82a2f0624 tasks: unregister tasks with parents when they are finished
Unregister children that are finished from task manager. They can be
examined through they parents.
2024-05-31 10:27:09 +02:00
Nadav Har'El
c786621b4c test/cql-pytest: reproduce bug of secondary index used before built
This patch adds a test reproducing for the known issue #7963, where
after adding a secondary-index to a table, queries might immediately
start to use this index - even before it is built - and produce wrong
results.

The issue is still open and unfixed, so the new test is marked "xfail".

Interestingly, even though Cassandra claims to have found and fixed
a similar bug in 2015 (CASSANDRA-8505), this test also fails on
Cassandra - trying a query right after CREATE INDEX and before it
was fully built may cause the query to fail.

Refs #7963

Signed-off-by: Nadav Har'El <nyh@scylladb.com>

Closes scylladb/scylladb#18993
2024-05-31 10:05:00 +03:00
Lakshmi Narayanan Sreethar
3d7d1fa72a db/config.cc: increment components_memory_reclaim_threshold config default
Incremented the components_memory_reclaim_threshold config's default
value to 0.2 as the previous value was too strict and caused unnecessary
eviction in otherwise healthy clusters.

Fixes #18607

Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com>

Closes scylladb/scylladb#18964
2024-05-30 18:03:51 +03:00
Pavel Emelyanov
83d491af02 config: Remove experimental TABLETS feature
... and replace it with boolean enable_tablets option. All the places
in the code are patched to check the latter option instead of the former
feature.

The option is OFF by default, but the default scylla.yaml file sets this
to true, so that newly installed clusters turn tablets ON.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>

Closes scylladb/scylladb#18898
2024-05-30 18:03:51 +03:00
Wojciech Mitros
0de3a5f3ff test mv: remove injection delaying shutdown of a node
In the test_mv_topology_change case, we use an injection to
delay the view updates application, so that the ERMs have
a chance to change in the process. This injection was also
enabled on a new node in the test, which was later decommissioned.
During the shutdown, writes were still being performed, causing
view update generation and delays due to the injection which in
turn delayed the node shutdown, causing the test to timeout.
This patch removes the injection for the node being shut down.
At the same time, the force_gossip_topology_changes=True option
is also removed from its config, but for that option it's enough
to enable on the first node in the cluster and all nodes use it.

Fixes: https://github.com/scylladb/scylladb/issues/18941

Closes scylladb/scylladb#18958
2024-05-29 15:29:55 +02:00
Nadav Har'El
4b04ed1360 test/alternator: be more forgiving on authorizer configuration
The Alternator test suite usually runs on a specific configuration of
Scylla set up by test.py or test/alternator/run. However, we do consider
it an important design goal of this test suite that developers should be
able to run these tests against any DynamoDB-API implementation, including
any version Scylla manually run by the developer in *any way* he or she
pleases.

The recent commit dc80b5dafe changed the way
we retrieve the configured autentication key, which is needed if Scylla is
run with --alternator-enforce-authorization. However, the new code assumed
that Scylla was also run with
     --authenticator PasswordAuthenticator --authorizer CassandraAuthorizer
so that the default role of "cassandra" has a valid, non-null, password
(namely, "cassandra"). If the developer ran Scylla manually without
these options, the test initialization code broke, and all tests in the
suite failed.

This patch fixes this breakage. You can now run the Alternator test
suite against Scylla run manually without any of the aforementioned
options, and everything will work except some tests in test_authorization.py
will fail as expected.

This patch has no affect on the usual test.py or test/alternator/run
runs, as they already run Scylla with all the aforementioned options
and weren't exposed to the problem fixed here.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>

Closes scylladb/scylladb#18957
2024-05-29 16:22:45 +03:00
Tomasz Grabiec
0d596a425c tablets: Filter-out left nodes in get_natural_endpoints()
The API already promises this, the comment on effective_replication_map says:
"Excludes replicas which are in the left state".

Tablet replicas on the replaced node are rebuilt after the node
already left. We may no longer have the IP mapping for the left node
so we should not include that node in the replica set. Otherwise,
storage_proxy may try to use the empty IP and fail:

  storage_proxy - No mapping for :: in the passed effective replication map

It's fine to not include it, because storage proxy uses keyspace RF
and not replica list size to determine quorum. The node is not coming
up, so noone should need to contact it.

Users which need replica list stability should use the host_id-based API.

Fixes #18843
2024-05-29 14:49:49 +02:00
Pavel Emelyanov
e74a4b038f Merge 'tablets: alter keyspace' from Piotr Smaron
This change supports changing replication factor in tablets-enabled keyspaces.
This covers both increasing and decreasing the number of tablets replicas through
first building topology mutations (`alter_keyspace_statement.cc`) and then
tablets/topology/schema mutations (`topology_coordinator.cc`).
For the limitations of the current solution, please see the docs changes attached to this PR.

Fixes: #16129

Closes scylladb/scylladb#16723

* github.com:scylladb/scylladb:
  test: Do not check tablets mutations on nodes that don't have them
  test: Fix the way tablets RF-change test parses mutation_fragments
  test/tablets: Unmark RF-changing test with xfail
  docs: document ALTER KEYSPACE with tablets
  Return response only when tablets are reallocated
  cql-pytest: Verify RF is changes by at most 1 when tablets on
  cql3/alter_keyspace_statement: Do not allow for change of RF by more than 1
  Reject ALTER with 'replication_factor' tag
  Implement ALTER tablets KEYSPACE statement support
  Parameterize migration_manager::announce by type to allow executing different raft commands
  Introduce TABLET_KEYSPACE event to differentiate processing path of a vnode vs tablets ks
  Extend system.topology with 3 new columns to store data required to process alter ks global topo req
  Allow query_processor to check if global topo queue is empty
  Introduce new global topo `keyspace_rf_change` req
  New raft cmd for both schema & topo changes
  Add storage service to query processor
  tablets: tests for adding/removing replicas
  tablet_allocator: make load_balancer_stats_manager configurable by name
2024-05-29 14:17:51 +03:00
Gleb Natapov
27445f5291 test: add test of bootstrap where the coordinator crashes just before storing IP mapping
On the next boot there is no host ID to IP mapping which causes node to
crash again with "No mapping for :: in the passed effective replication map"
assertion.
2024-05-29 11:46:23 +03:00
Tomasz Grabiec
3e1ba4c859 test: pylib: Extract start_writes() load generator utility 2024-05-29 10:02:56 +02:00
Botond Dénes
d37eca0593 test/boost/mutation_reader_test: compacting_reader_next_partition: fix partition order
The test creates two partitions and passes them through the reader, but
the partitions are out-of-order. This is benign but best to fix it
anyway.
Found after bumping validation level inside the compactor.

Closes scylladb/scylladb#18848
2024-05-28 20:41:54 +03:00