Commit Graph

40363 Commits

Author SHA1 Message Date
Pavel Emelyanov
a03755d6d7 test: Add a test that switching between vnodes and tablets is banned
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-12-21 19:57:55 +03:00
Pavel Emelyanov
4de433ac23 cql3/statements: Don't allow switching between vnode and per-table replication strategies
When ALTER-ing a keyspace one may as well change its vnode/tablet
flavor, which is not currently supported, so prohibit this change
explicitly

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-12-21 19:57:00 +03:00
Pavel Emelyanov
299219833b cql3/statements: Keep local keyspace variable in alter_keyspace_statement::validate
For convenience of next patching

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-12-21 19:56:18 +03:00
Raphael S. Carvalho
5e55954f27 replica: Make the storage snapshot survive concurrent compactions
Consider this:
1) file streaming takes storage snapshot = list of sstables
2) concurrent compaction unlinks some of those sstables from the file system
3) file streaming tries to send unlinked sstables, but files other
than data and index cannot be read as only data and index have file
descriptors opened

To fix it, the snapshot now returns a set of files, one per sstable
component, for each sstable.
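
The scenario above relies on a POSIX property: a file that is already open stays readable after it is unlinked. The following is a minimal Python sketch of that idea, not Scylla code. The component names and the `take_snapshot` helper are hypothetical and exist only for illustration.

```python
import os
import tempfile


def take_snapshot(paths):
    """Open every component file up front, so later unlinks cannot hide them."""
    return {path: open(path, "rb") for path in paths}


def demo():
    workdir = tempfile.mkdtemp()
    # Hypothetical component names, for illustration only.
    names = ["Data.db", "Index.db", "Summary.db"]
    paths = []
    for name in names:
        path = os.path.join(workdir, name)
        with open(path, "wb") as f:
            f.write(name.encode())
        paths.append(path)

    snapshot = take_snapshot(paths)

    # Simulate a concurrent compaction removing the files.
    for path in paths:
        os.unlink(path)

    # On POSIX systems the open handles still give access to the contents.
    contents = {os.path.basename(p): f.read() for p, f in snapshot.items()}
    for f in snapshot.values():
        f.close()
    os.rmdir(workdir)
    return contents
```

Holding one handle per component is what makes every component readable, rather than only the ones that happened to be open already.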

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>

Closes scylladb/scylladb#16476
2023-12-21 12:50:28 +02:00
Botond Dénes
e6147c1853 Merge 'Some cleanup in compaction group' from Raphael "Raph" Carvalho
Closes scylladb/scylladb#16448

* github.com:scylladb/scylladb:
  replica: Fix indentation
  replica: Kill unused calculate_disk_space_used_for()
2023-12-21 12:48:38 +02:00
Raphael S. Carvalho
ee203f846e test: Fix segfault when running offstrategy test
The observer, which references table_for_test, must not outlive
table_for_test. The observer can be called later, after the
last input sstable is removed from the sstable manager.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>

Closes scylladb/scylladb#16428
2023-12-20 19:04:41 +02:00
David Garcia
9af6c7e40b docs: add myst parser
Closes scylladb/scylladb#16316
2023-12-20 19:04:41 +02:00
Raphael S. Carvalho
d1e6dfadea sstables: Harden estimate_droppable_tombstone_ratio() interface
The interface is fragile because the user may incorrectly use the
wrong "gc before". Given that sstable knows how to properly calculate
"gc before", let's do it in estimate__d__t__r(), leaving no room
for mistakes.

sstable_run's variant was also changed to conform to new interface,
allowing ICS to properly estimate droppable ratio, using GC before
that is calculated using each sstable's range. That's important for
upcoming tablets, as we want to query only the range that belongs
to a particular tablet in the repair history table.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>

Closes scylladb/scylladb#15931
2023-12-20 19:04:41 +02:00
Botond Dénes
758d9cf005 Merge 'build: cmake: map 'release' to 'RelWithDebInfo'' from Kefu Chai
this preserves the existing behavior of `configure.py` in the CMake
generated `build.ninja`.

* configure.py: map 'release' to 'RelWithDebInfo'
* cmake: rename cmake/mode.Release.cmake to cmake/mode.RelWithDebInfo.cmake
* CMakeLists.txt: s/Release/RelWithDebInfo/

Closes scylladb/scylladb#16479

* github.com:scylladb/scylladb:
  build: cmake: map 'release' to 'RelWithDebInfo'
  build: define BuildType for enclosing build_by_default
2023-12-20 19:04:40 +02:00
Pavel Emelyanov
5866d265c3 Merge ' tools/utils: tool_app_template: handle the case of no args ' from Botond Dénes
Currently, `tool_app_template::run_async()` crashes when invoked with empty argv (with just `argv[0]` populated). This can happen if the tool app is invoked without any further args, e.g. just invoking `scylla nodetool`. The crash happens because of the unconditional dereferencing of `argv[1]` to get the current operation.

To fix, add an early-exit for this case, just printing a usage message and exiting with exit code 2.
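
The early-exit pattern described above can be sketched in Python as follows. This is an illustration of the idea only, not the C++ code in `tool_app_template`. The `run` function and the usage text are assumptions.

```python
import sys

USAGE = "usage: tool <operation> [options]"  # hypothetical usage text


def run(argv):
    """Return a process exit code for the given argument vector."""
    if len(argv) < 2:
        # Only argv[0] is present: there is no operation to dispatch on,
        # so print the usage message and bail out before touching argv[1].
        print(USAGE, file=sys.stderr)
        return 2
    operation = argv[1]
    print(f"running operation: {operation}")
    return 0
```

A caller would typically finish with `sys.exit(run(sys.argv))`.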

Fixes: #16451

Closes scylladb/scylladb#16456

* github.com:scylladb/scylladb:
  test: add regression tests for invoking tools with no args
  tools/utils: tool_app_template: handle the case of no args
  tools/utils: tool_app_template: remove "scylla-" prefix from app name
2023-12-20 19:04:40 +02:00
Kamil Braun
6fcaec75db Merge 'Add maintenance socket' from Mikołaj Grzebieluch
It enables interaction with the node through CQL protocol without authentication. It gives full-permission access.
The maintenance socket is available by Unix domain socket with file permissions `755`, thus it is not accessible from outside of the node and from other POSIX groups on the node.
It is created before the node joins the cluster.

To set up the maintenance socket, use the `maintenance-socket` option when starting the node.

* If set to `ignore` maintenance socket will not be created.
* If set to `workdir` maintenance socket will be created in `<node's workdir>/cql.m`.
* Otherwise maintenance socket will be created in the specified path.

The default value is `ignore`.
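
The three cases above can be summarized with a small Python sketch. The function name `maintenance_socket_path` is hypothetical, and the real option handling lives in Scylla's C++ configuration code.

```python
import os


def maintenance_socket_path(option, workdir):
    """Map the option value to a socket path, or None when disabled."""
    if option == "ignore":
        # The maintenance socket is not created at all.
        return None
    if option == "workdir":
        # The socket lives in the node's working directory.
        return os.path.join(workdir, "cql.m")
    # Any other value is treated as an explicit path.
    return option
```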

* With python driver

```python
from cassandra.cluster import Cluster
from cassandra.connection import UnixSocketEndPoint
from cassandra.policies import HostFilterPolicy, RoundRobinPolicy

socket = "<node's workdir>/cql.m"
cluster = Cluster([UnixSocketEndPoint(socket)],
                  # Driver tries to connect to other nodes in the cluster, so we need to filter them out.
                  load_balancing_policy=HostFilterPolicy(RoundRobinPolicy(), lambda h: h.address == socket))
session = cluster.connect()
```

Merge note: apparently cqlsh does not support unix domain sockets; it
will have to be fixed in a follow-up.

Closes scylladb/scylladb#16172

* github.com:scylladb/scylladb:
  test.py: add maintenance socket test
  test.py: enable maintenance socket in tests by default
  docs: add maintenance socket documentation
  main: add maintenance socket
  main: refactor initialization of cql controller and auth service
  auth/service: don't create system_auth keyspace when used by maintenance socket
  cql_controller: maintenance socket: fix indentation
  cql_controller: add option to start maintenance socket
  db/config: add maintenance_socket_enabled bool class
  auth: add maintenance_socket_role_manager
  db/config: add maintenance_socket variable
2023-12-20 19:04:40 +02:00
Nadav Har'El
7ee55dd03e cdc, tablets: don't allow enabling CDC with tablets
We do not yet support enabling CDC in a keyspace that uses tablets
(Refs #16317). But the problem is that today, if this is attempted,
we get a nasty failure: the CDC code creates the extra CDC log table,
it doesn't get tablets, and Raft gets surprised and croaks with a
message like:

    Raft instance is stopped, reason: "background error,
    std::_Nested_exceptionraft::state_machine_error (State machine error at
    raft/server.cc:1230): std::runtime_error (Tablet map not found for
    table 48ca1620-9ea5-11ee-bd7c-22730ed96b85)

After Raft croaks, Scylla never recovers until it is rebooted.

In this patch, we replace this disaster with a graceful error: a CREATE
TABLE or ALTER TABLE operation with CDC enabled will fail in a clear way,
allowing Scylla to continue operating normally after this failed request.

This fix is important for allowing us to run tests on Scylla with
tablets, and although CDC tests will fail as expected, they won't
fail the other tests that follow (Refs #16473).

Fixes #16318

Signed-off-by: Nadav Har'El <nyh@scylladb.com>

Closes scylladb/scylladb#16474
2023-12-20 10:06:34 +01:00
Kamil Braun
ffb6ae917f Merge 'Add support for tablets in Alternator' from Nadav Har'El
This pull request adds support for tablets in Alternator, and particularly focuses on getting Alternator's GSI and LSI (i.e., materialized views) to work.

After this series, support for tablets in Alternator _mostly_ works, but not completely:
1. CDC doesn't yet work with tablets, and Alternator needs to provide CDC (known as "DynamoDB Streams").
2. Alternator's TTL feature was not tested with tablets, and probably doesn't work because it assumes the replication map belongs to a keyspace.

For these reasons, Alternator does not yet use tablets by default, and they need to be enabled explicitly by adding an experimental tag to the new table. This will allow us to test Alternator with tablets even before it is ready for the limelight.

Fixes #16203
Fixes #16313

Closes scylladb/scylladb#16353

* github.com:scylladb/scylladb:
  mv, tablets, alternator: test for Alternator LSI with tablets
  mv: coroutinize wait code for remote view updates
  mv, test: add injection point to delay remove view update
  alternator: explicitly request synchronous updates for LSI
  alternator: fix view creation when using tablets
  alternator: add experimental method to create a table with tablets
2023-12-20 10:00:31 +01:00
Kamil Braun
1f6460972b Merge 'Fix crash on table drop concurrent with streaming ' from Tomasz Grabiec
The observed crash was in the following piece on "cf" access:

        if (*table_is_dropped) {
            sslog.info("[Stream #{}] Skipped streaming the dropped table {}.{}", si->plan_id, si->cf.schema()->ks_name(), si->cf.schema()->cf_name());

Fixes #16181

Also, add a test case which reproduces the problem by doing table drop during tablet migration. But note that the problem is not tablet-specific.

Closes scylladb/scylladb#16341

* github.com:scylladb/scylladb:
  test: tablets: Add test case which tests table drop concurrent with migration
  tests: tablets: Do read barrier in get_tablet_replicas()
  streaming: Keep table by shared ptr to avoid crash on table drop
2023-12-20 09:57:06 +01:00
Kefu Chai
db9e314965 treewide: apply codespell to the comments in source code
for fewer spelling errors in comments.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#16408
2023-12-20 10:25:03 +02:00
Kefu Chai
fafe9d9c38 build: cmake: map 'release' to 'RelWithDebInfo'
this preserves the existing behavior of `configure.py` in the CMake
generated `build.ninja`.

* configure.py: map 'release' to 'RelWithDebInfo'
* cmake: rename cmake/mode.Release.cmake to cmake/mode.RelWithDebInfo.cmake
* CMakeLists.txt: s/Release/RelWithDebInfo/

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2023-12-20 15:07:43 +08:00
Kefu Chai
72dcb2466d build: define BuildType for enclosing build_by_default
in existing `modes` defined in `configure.py`, "release" is mapped to
"RelWithDebInfo". this behavior matches that of seastar's
`configure.py`, where we also map "release" build mode to
"RelWithDebInfo" CMAKE_BUILD_TYPE.

but in scylladb's existing cmake settings, it maps "release" to
"Release", despite "Release" is listed as one of the typical
CMAKE_BUILD_TYPE values.

so, in this change, to prepare for the mapping, `BuildType` is
introduced to map a build mode to its related settings. the
building settings are still kept in `cmake.${CMAKE_BUILD_TYPE}.cmake`,
but the other settings, like if a build type should be enabled or
its mappings, are stored in `BuildType` in `configure.py`.
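
One way to picture such a mapping is the Python sketch below. It is not the actual `BuildType` from `configure.py`: the field names, the `debug` entry and the `build_by_default` values are assumptions chosen for illustration. Only the "release" to "RelWithDebInfo" mapping comes from the text above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BuildType:
    # Field names are illustrative, not taken from configure.py.
    cmake_build_type: str
    build_by_default: bool


# Hypothetical table of build modes; only the "release" mapping
# is described in the commit message.
MODES = {
    "release": BuildType(cmake_build_type="RelWithDebInfo", build_by_default=True),
    "debug": BuildType(cmake_build_type="Debug", build_by_default=False),
}


def cmake_build_type(mode):
    """Translate a build mode name into its CMAKE_BUILD_TYPE value."""
    return MODES[mode].cmake_build_type
```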

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2023-12-20 15:07:43 +08:00
Nadav Har'El
2e031f2d8e mv, tablets, alternator: test for Alternator LSI with tablets
This patch adds a test (in the topology test framework) for issue #16313 -
the bug where Alternator LSI must use synchronous view updates but didn't.
This test fails with high probability (around 50%) before the previous patch,
which fixed this bug - and passes consistently after the patch (I ran it
100 times and it didn't fail even once).

This is the first test in the topology framework that uses the DynamoDB
API and not CQL. This required a couple of tiny convenience functions,
which are introduced in the only test file that uses them - but if we
want we can later move them out to a library file.

Unfortunately, the standard AWS SDK for Python - boto3 - is *not*
asynchronous, so this test is also not really asynchronous, and will
block the event loop while making requests to Alternator. However,
for now it doesn't matter (we do NOT run multiple tests in the same
event loop), and if it ever matters, I mentioned a couple of options
what we can do in a comment.
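
For illustration, one general way to keep an event loop responsive around a blocking call is `asyncio.to_thread()` (Python 3.9+). This is a sketch of the technique, not necessarily one of the options mentioned in the test's comment. `blocking_request` is a hypothetical stand-in for a synchronous boto3 call.

```python
import asyncio
import time


def blocking_request(delay):
    """Stand-in for a synchronous SDK call such as a boto3 request."""
    time.sleep(delay)
    return "done"


async def main():
    ticks = 0

    async def ticker():
        # Counts how often the loop gets to run while the request is pending.
        nonlocal ticks
        while True:
            ticks += 1
            await asyncio.sleep(0.01)

    task = asyncio.create_task(ticker())
    # Run the blocking call in a worker thread; the event loop stays free.
    result = await asyncio.to_thread(blocking_request, 0.1)
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        pass
    return result, ticks
```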

Because this test uses a 10-node cluster, it is skipped in debug-mode
runs. In a later patch we will replace it by a more efficient - and
more reliable - 2-node test.

Refs #16313

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2023-12-19 15:41:15 +02:00
Avi Kivity
15acceb69f Merge 'commitlog_test::test_commitlog_reader: handle segment_truncation' from Calle Wilund
Fixes #16312

This test replays a segment before it might be closed or even fully flushed, thus it can (with the new semantics) generate a segment_truncation exception if hitting eof earlier than expected. (Note: test does not use pre-allocated segments).

(The first patch coroutinizes the test to make the fix simpler.)

Closes scylladb/scylladb#16368

* github.com:scylladb/scylladb:
  commitlog_test::test_commitlog_reader: handle segment_truncation
  commitlog_test: coroutinize test_commitlog_reader
2023-12-19 15:33:38 +02:00
Botond Dénes
6abdced7b9 test: add regression tests for invoking tools with no args
This was recently found to produce a crash. Add a simple regression
test, to make sure future changes don't re-introduce problems with this
rarely used code-path.
2023-12-19 04:08:48 -05:00
Botond Dénes
76492407ab tools/utils: tool_app_template: handle the case of no args
Currently, tool_app_template::run_async() crashes when invoked with
empty argv (with just argv[0] populated). This can happen if the tool
app is invoked without any further args, e.g. just invoking `scylla
nodetool`. The crash happens because of the unconditional dereferencing of
argv[1] to get the current operation.
To fix, add an early-exit for this case, just printing a usage message
and exiting with exit code 2.
2023-12-19 04:08:33 -05:00
Botond Dénes
975c11a54b tools/utils: tool_app_template: remove "scylla-" prefix from app name
In other words, have all tools pass their name without the "scylla-"
prefix to `tool_app_template::config::name`. E.g., replace
"scylla-nodetool" with just "nodetool".
Patch all usages to re-add the prefix if needed.

The app name is just more flexible this way, some users might want the
name without the "scylla-" prefix (in the next patch).
2023-12-19 04:04:57 -05:00
Botond Dénes
ce317d50bc bytes.hh: correct spelling of delimiter and delimited
Pointed out by the new spellcheck workflow.

Closes scylladb/scylladb#16450
2023-12-18 20:46:21 +02:00
Mikołaj Grzebieluch
ef10b497e1 test.py: add maintenance socket test
Test that when connecting to the maintenance socket, the user has superuser permissions,
even if the authentication is enabled on the regular port.
2023-12-18 17:58:13 +01:00
Mikołaj Grzebieluch
e327478bb5 test.py: enable maintenance socket in tests by default 2023-12-18 17:58:13 +01:00
Mikołaj Grzebieluch
21b3ba4927 docs: add maintenance socket documentation 2023-12-18 17:58:13 +01:00
Mikołaj Grzebieluch
f96d30c2b5 main: add maintenance socket
Add initialization of maintenance_auth_service and cql_maintenance_server_ctl.

Create maintenance socket which enables interaction with the node through
CQL protocol without authentication. The maintenance port is available
by Unix domain socket. It gives full-permission access.
It is created before the node joins the cluster.
2023-12-18 17:58:13 +01:00
Mikołaj Grzebieluch
16ab2c28e4 main: refactor initialization of cql controller and auth service
Move initialization of cql controller and auth service to functions.
It will make it easier to create a new cql controller with a separate auth service,
for example for the maintenance socket.

Make it possible to initialize new services before joining group0.
2023-12-18 17:58:13 +01:00
Mikołaj Grzebieluch
999be1d14b auth/service: don't create system_auth keyspace when used by maintenance socket
The maintenance socket is created before joining the cluster. When maintenance auth service
is started it creates system_auth keyspace if it's missing. It is not synchronized
with other nodes, because this node hasn't joined the group0 yet. Thus a node has
a mismatched schema and is unable to join the cluster.

The maintenance socket doesn't use role management, thus the problem is solved
by not creating system_auth keyspace when maintenance auth service is created.

The logic of the regular CQL port's auth service is not changed. A new, separate
auth service is created for the maintenance socket.
2023-12-18 17:58:13 +01:00
Mikołaj Grzebieluch
2b9a88d17a cql_controller: maintenance socket: fix indentation 2023-12-18 17:58:13 +01:00
Mikołaj Grzebieluch
ac61d0f695 cql_controller: add option to start maintenance socket
Add an option to listen on the maintenance socket. It is set up on a Unix domain socket
and the metrics are disabled.
This enables having an independent authentication mechanism for this socket.

To start the maintenance socket, a new cql_controller has to be created
with
`db::maintenance_socket_enabled::yes` argument.

Creating maintenance socket will raise an exception if
* the path is longer than 107 chars (due to linux limits),
* a file or a directory already exists in the path.
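
The two checks above can be sketched in Python as follows. This is an illustration only: the function name and the exception types are assumptions, and the 107 limit is the figure quoted in the list.

```python
import os

MAX_SOCKET_PATH = 107  # limit quoted in the commit message


def check_socket_path(path):
    """Raise if the path cannot be used for a new Unix domain socket."""
    # The limit applies to the encoded path, so measure bytes, not characters.
    if len(os.fsencode(path)) > MAX_SOCKET_PATH:
        raise ValueError(f"socket path longer than {MAX_SOCKET_PATH} bytes")
    # lexists() also reports dangling symlinks as existing.
    if os.path.lexists(path):
        raise FileExistsError(f"{path} already exists")
    return path
```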

The indentation is fixed in the next commit.
2023-12-18 17:58:13 +01:00
Tomasz Grabiec
84ea8b32b2 test: tablets: Restart cluster in a graceful manner to avoid connection drop in the middle of request serving
After restarting each node, we should wait for other nodes to notice
the node is UP before restarting the next server. Otherwise, the next
node we restart may not send the shutdown notification to the
previously restarted node, if it still sees it as down when we
initiate its shutdown. In this case, the node will learn about the
restart from gossip later, possibly when we already started CQL
requests. When a node learns that some node restarted while it
considers it as UP, it will close connections to that node. This will
fail RPC sent to that node, which will cause CQL request to time-out.
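
The restart order described above can be sketched as a small Python helper. It is not the test framework's API: `restart` and `sees_up` are hypothetical callbacks supplied by the caller, and `wait_until` is a simple polling loop.

```python
import time


def wait_until(predicate, timeout=5.0, interval=0.01):
    """Poll predicate() until it returns True or the timeout expires."""
    deadline = time.monotonic() + timeout
    while not predicate():
        if time.monotonic() > deadline:
            raise TimeoutError("condition not met in time")
        time.sleep(interval)


def rolling_restart(servers, restart, sees_up):
    """Restart servers one by one, waiting for peers to see each one as UP."""
    for server in servers:
        restart(server)
        others = [s for s in servers if s != server]
        # Do not touch the next server until every peer sees this one as UP.
        wait_until(lambda: all(sees_up(o, server) for o in others))
```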

Fixes #14746

Closes scylladb/scylladb#16010
2023-12-18 16:22:02 +01:00
Raphael S. Carvalho
63e4d6c965 test: Enable debug compaction logging for sstable_compaction_test
It will make it easier to understand obscure issues like
https://github.com/scylladb/scylladb/issues/13280.

Refs #13280.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>

Closes scylladb/scylladb#16426
2023-12-18 16:57:46 +03:00
Kefu Chai
db16048761 test/pylib: avoid using asyncio.get_event_loop()
asyncio.get_event_loop() returns the current event loop. but if there
is none, the result of `get_event_loop_policy().get_event_loop()` is
returned. but this behavior is deprecated since Python 3.12, so let's
use asyncio.run() as recommended by
https://docs.python.org/3/library/asyncio-eventloop.html.
asyncio.run() was introduced by Python 3.7, so we should be able to
use it.

this change should silence the warning when running this script
as a stand-alone script with Python 3.12.
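
A minimal sketch of the recommended pattern follows. The `main` coroutine and the `run_script` wrapper are placeholders, not the code in test/pylib.

```python
import asyncio


async def main():
    # Placeholder for the script's real asynchronous work.
    await asyncio.sleep(0)
    return "finished"


def run_script():
    # asyncio.run() creates a fresh event loop, runs main() to completion,
    # and closes the loop afterwards.
    return asyncio.run(main())
```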

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#16385
2023-12-18 16:47:31 +03:00
Raphael S. Carvalho
5fa69b8a67 replica: Fix indentation
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
2023-12-18 10:23:22 -03:00
Raphael S. Carvalho
8a9784d29c replica: Kill unused calculate_disk_space_used_for()
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
2023-12-18 10:22:19 -03:00
Avi Kivity
cd88f9eb76 Update tools/java submodule (native nodetool)
* tools/java 3963c3abf7...b7ebfd38ef (1):
  > Merge 'Add nodetool interposer script' from Botond Dénes
2023-12-18 14:50:25 +02:00
Mikołaj Grzebieluch
cf43787295 db/config: add maintenance_socket_enabled bool class 2023-12-18 11:42:40 +01:00
Mikołaj Grzebieluch
11a2748d7f auth: add maintenance_socket_role_manager
Add `maintenance_socket_role_manager` which will disable all operations
associated with roles to not depend on system_auth keyspace, which may
be not yet created when the maintenance socket starts listening
2023-12-18 11:42:40 +01:00
Mikołaj Grzebieluch
e682e362a3 db/config: add maintenance_socket variable
If set to "ignore", maintenance socket will be disabled.
If set to "workdir", maintenance socket will be opened on <scylla's
workdir>/cql.m.
Otherwise it will be opened on path provided by maintenance_socket
variable.

It is set by default to 'ignore'.
2023-12-18 11:42:05 +01:00
Kamil Braun
3b108f2e31 Merge 'db: config: make consistent_cluster_management mandatory' from Patryk Jędrzejczak
We make `consistent_cluster_management` mandatory in 5.5. This
option will be always unused and assumed to be true.

Additionally, we make `override_decommission` deprecated, as this option
has been supported only with `consistent_cluster_management=false`.

Making `consistent_cluster_management` mandatory also simplifies
the code. Branches that execute only with
`consistent_cluster_management` disabled are removed.

We also update documentation by removing information irrelevant in 5.5.

Fixes scylladb/scylladb#15854

Note about upgrades: this PR does not introduce any more limitations
to the upgrade procedure than there are already. As in
scylladb/scylladb#16254, we can upgrade from the first version of Scylla
that supports the schema commitlog feature, i.e. from 5.1 (or
corresponding Enterprise release) or later. Assuming this PR ends up in
5.5, the documented upgrade support is from 5.4. For corresponding
Enterprise release, it's from 2023.x (based on 5.2), so all requirements
are met.

Closes scylladb/scylladb#16334

* github.com:scylladb/scylladb:
  docs: update after making consistent_cluster_management mandatory
  system_keyspace, main, cql_test_env: fix indendations
  db: config: make consistent_cluster_management mandatory
  test: boost: schema_change_test: replace disable_raft_schema_config
  db: config: make override_decommission deprecated
  db: config: make force_schema_commit_log deprecated
2023-12-18 09:44:52 +01:00
Botond Dénes
a6200e99e6 Merge 'Handle S3 partial read overflows' from Pavel Emelyanov
The test case that validates upload-sink works does this by getting several random ranges from the uploaded object and checks that the content is what it should be. The range boundaries are generated like this:

```
    uint64_t len = random(1, chunk_size);
    uint64_t offset = random(file_size) - len;
```

The second line is not correct: if the random number happens to be less than len, the offset becomes "negative", i.e. a very large 64-bit unsigned value.

Next, this offset:len pair gets into the s3 client's get_object_contiguous() helper, which in turn converts it into the http Range header's bytes-specifier format, which is "first_byte-last_byte". The math here is

```
    first_byte = offset;
    last_byte = offset + len - 1;
```

Here the overflow of the offset thing results in underflow of the last_byte -- it becomes less than the first_byte. According to RFC this range-specifier is invalid and (!) can be ignored by the server. This is what minio does -- it ignores invalid range and returns back full object.

But that's not all. When returning an object portion, the http request status code is PartialContent, but when the range is ignored and the full object is returned, the status is OK. This makes the s3 client's request fail with unexpected_status_error in the middle of the test. Then the object is removed with a deferred action and the actual error is printed into logs. At the end of the day, logs look as if deletion of an object failed with OK status %)
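
The offset arithmetic described above can be reproduced in Python by emulating 64-bit unsigned wraparound. This is a sketch of the arithmetic only. The `u64` and `buggy_range` names are illustrative, and `rand` stands for the value returned by `random(file_size)`.

```python
MASK64 = (1 << 64) - 1


def u64(value):
    """Reduce an integer to its 64-bit unsigned representation."""
    return value & MASK64


def buggy_range(rand, length):
    """Compute (first_byte, last_byte) the way the test originally did."""
    offset = u64(rand - length)           # wraps around when rand < length
    first_byte = offset
    last_byte = u64(offset + length - 1)  # wraps again, landing below first_byte
    return first_byte, last_byte
```

With `rand = 3` and `length = 10`, the first byte is `2**64 - 7` and the last byte is `2`, so the range is inverted and a server is free to ignore it.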

fixes: #16133

Closes scylladb/scylladb#16324

* github.com:scylladb/scylladb:
  test/s3: Avoid object range overflow
  s3/client: Handle GET-with-Range overflows correctly
2023-12-18 10:00:32 +02:00
Avi Kivity
081f30d149 Merge 'Add support to tablet storage splitting' from Raphael "Raph" Carvalho
Support for splitting tablet storage is added.
Until now, tablet storage was composed of a single compaction group, i.e. a group of sstables eligible to be compacted together.

For splitting, tablet storage can now be composed of multiple compaction groups, main, left and right.

Main group stores sstables that require splitting, whereas left and right groups store sstables that were already split according to the tablet's token range.

After table storage is put in splitting mode, new writes will only go to either left or right group, depending on the token.
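
The routing of new writes can be pictured with the Python sketch below. It is illustrative only: the `select_group` name and the choice of the range midpoint as the split point are assumptions, not details taken from this series.

```python
def select_group(token, first_token, last_token):
    """Pick 'left' or 'right' for a token inside the tablet's range."""
    if not first_token <= token <= last_token:
        raise ValueError("token is outside the tablet's range")
    # Assumption: the range is split at its midpoint.
    mid = first_token + (last_token - first_token) // 2
    return "left" if token <= mid else "right"
```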

Once all main groups have completed splitting their sstables, the coordinator can proceed with tablet metadata changes.
The coordination part is not implemented yet. Only the storage part. The former will come next and will be wired into the latter.

Missing:
- splitting monitor (verify whether coordinator asked for splitting and acts accordingly) (will come next)

Closes scylladb/scylladb#16158

* github.com:scylladb/scylladb:
  replica: Introduce storage group splitting
  replica: Add storage_group::memtable_count()
  replica: Add compaction_group::empty()
  replica: Rename compaction_group_manager to storage_group_manager
  replica: Introduce concept of storage group
  compaction: Add splitting compaction task to manager
  compaction: Prepare rewrite_sstables_compaction_task_executor to be reused for splitting
  compaction: remove scrub-specific code from rewrite_sstables_compaction_task_executor
  replica: Allow uncompacted SSTables to be moved into a new set
  compaction: Add splitting compaction
  flat_mutation_reader: Allow interposer consumers to be stacked
  mutation_writer: Introduce token-group-based mutation segregator
  locator: Introduce tablet_map::get_tablet_id_and_range_side(token)
2023-12-17 21:12:01 +02:00
Nadav Har'El
37b5c03865 mv: coroutinize wait code for remote view updates
In the previous patch we added a delay injection point (for testing)
in the view update code. Because the code was using continuation style,
this resulted in increased indentation and ugly repetition of captures.

So in this patch we coroutinize the code that waits for remote view
updates, making it simpler, shorter, and less indented.

Note that this function still uses continuations in one place:
The remote view update is still composed of two steps that need
to happen one after another, but we don't necessarily need to wait
for them to happen. This is easiest to do with chaining continuations,
and then either waiting or not waiting for the resulting future.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2023-12-17 20:15:08 +02:00
Nadav Har'El
bf6848d277 mv, test: add injection point to delay remove view update
It's difficult to write a test (as we plan to do in the next patch)
that verifies that synchronous view updates are indeed synchronous, i.e.,
that write with CL=QUORUM on the base-table write returns only after
CL=QUORUM was also achieved in the view table. The difficulty is that in a
fast test machine, even if the synchronous-view-update is completely buggy,
it's likely that by the time the test reads from the view, all view updates
will have been completed anyway.

So in this patch we introduce an injection point, for testing, named
"delay_before_remote_view_update", which adds a delay before the base
replica sends its update to the remote view replica (in case the view
replica is indeed remote). As usual, this injection point isn't
configurable - when enabled it adds a fixed (0.5 second) delay, on all
view updates on all tables.

The existing code used continuation-style Seastar programming, and the
addition of the injection point in this patch made it even uglier, so
in the next patch we will coroutine-ize this code.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2023-12-17 20:15:08 +02:00
Nadav Har'El
2c0b472f44 alternator: explicitly request synchronous updates for LSI
DynamoDB's *local* secondary index (LSI) allows strongly-consistent
reads from the materialized view, which must be able to read what was
previously written to the base. To support this, we need the view to
use the "synchronous_updates".

Previously, with vnodes, there was no need for using this option
explicitly, because an LSI has the same partition key as the base table
so the base and view replicas are the same, and the local writes are
done synchronously. But with tablets, this changes - there is no longer
a guarantee that the base and view tablets are located on the same node.
So to restore the strong consistency of LSIs when tablets are enabled,
this patch explicitly adds the "synchronous_updates" option to views
created by Alternator LSIs. We do *not* add this option for GSIs - those
do not support strongly-consistent reads.

This fix was tested by a test that will be introduced in the following
patches. The test showed that before this patch, it was possible that
reading with ConsistentRead=True from an LSI right after the base was
written would miss the new changes, but after this patch, it always
sees the new data in the LSI.

Fixes #16313.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2023-12-17 20:14:59 +02:00
Nadav Har'El
d11f5e9625 alternator: fix view creation when using tablets
In commit 88a5ddabce, we fixed materialized
view creation to support tablets. We added to the function called to
create materialized views in CQL, prepare_new_view_announcement()
a missing call to the on_before_create_column_family() notifier that
creates tablets for this new view.

We have the same problem in Alternator when creating a view (GSI or LSI).
The Alternator code does not use prepare_new_view_announcement(), and
instead uses the lower-level function add_table_or_view_to_schema_mutation()
so it didn't get the call to the notifier, so we must add it here too.

Before this patch, creating an Alternator table with tablets (which has
become possible after the previous patch) fails with "Tablet map not found
for table <uuid>". With this patch, it works.

A test for materialized views in Alternator will come in a following
patch, and will test everything together - the CreateTable tag to use
tablets (from the previous patch), the LSI/GSI creation (fixed in this patch)
and the correct consistency of the LSI (fixed in the next patch).

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2023-12-17 19:55:36 +02:00
Nadav Har'El
8e356d8c31 alternator: add experimental method to create a table with tablets
As explained in issue #16203, we cannot yet enable tablets on Alternator
keyspaces by default, because support for some of the features that
Alternator needs, such as CDC, is not yet available.
Nevertheless, to start testing Alternator integration with tablets,
we want to provide a way to enable tablets in Alternator for tests.

In this patch we add support for a tag, 'experimental:initial_tablets',
which if added on a table during creation, uses tablets for its keyspace.
The value of this tag is a numeric string, and it is exactly analogous
to the 'initial_tablets' property we have in CQL's NetworkTopologyStrategy.

We name this tag with the "experimental:" prefix to emphasize that it
is experimental, and the way to enable or disable tablets will probably
change later.

The new tag only has effect when added while *creating* a table.
Adding, deleting or changing it later on an existing table will have
no effect.
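
As an illustration, a CreateTable request carrying this tag might be built as in the Python sketch below. The table name, key schema and helper name are hypothetical. Only the tag key and the numeric-string value come from the text above.

```python
def create_table_request(name, initial_tablets):
    """Build CreateTable parameters that request tablets via the tag."""
    return {
        "TableName": name,
        "BillingMode": "PAY_PER_REQUEST",
        # Hypothetical single-attribute key schema.
        "KeySchema": [{"AttributeName": "p", "KeyType": "HASH"}],
        "AttributeDefinitions": [{"AttributeName": "p", "AttributeType": "S"}],
        # Tag values are strings, so the number is passed as a numeric string.
        "Tags": [
            {"Key": "experimental:initial_tablets", "Value": str(initial_tablets)}
        ],
    }
```

With boto3, such a dictionary could be passed as `client.create_table(**create_table_request("t", 4))`.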

A later patch will have tests that use this tag to test Alternator with
tablets.

Refs #16203.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2023-12-17 19:55:30 +02:00
Kefu Chai
e436856cf7 token_metadata: pass node id when formatting it
before this change, we use the format string of
"Can't replace node {} with itself", but fail to pass the host id as one of seastar::format()'s arguments. this fails the compile-time check of fmt, which is yet to be merged. so, if we really run into this problem, {fmt} would throw before the intended runtime_error is raised -- currently, seastar::log formats the logging messages at runtime, this is not intended.

in this change, we pass `existing_node`, so it can be formatted, and the
intended error message can be printed in log.
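
The same class of mistake can be shown with a Python analogue. It uses `str.format()` rather than {fmt} or seastar::format(), and the function names are illustrative.

```python
MESSAGE = "Can't replace node {} with itself"


def replace_error_buggy(existing_node):
    # The placeholder has no matching argument, so format() itself raises
    # IndexError and the intended RuntimeError is never constructed.
    return RuntimeError(MESSAGE.format())


def replace_error_fixed(existing_node):
    # Passing the node lets the intended message be built.
    return RuntimeError(MESSAGE.format(existing_node))
```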

Refs 11a4908683
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#16342
2023-12-17 19:54:09 +02:00
Evgeniy Naydanov
10eebe3c66 test: use different IP addresses for listen and RPC addresses
Scylla can be configured to use different IPs for the internode communication
and client connections.  This test allocates and configure unique IP addresses
for the client connections (`rpc_address`) for 2-nodes cluster.

Two scenarios are tested:
  1) Change RPC IPs sequentially
  2) Change RPC IPs simultaneously

Closes scylladb/scylladb#15965
2023-12-17 18:00:09 +02:00