Commit Graph

43164 Commits

Author SHA1 Message Date
Gleb Natapov
b4d028f51f test: extend existing test to check that a joining node can map addresses of all pre-existing nodes during join
(cherry picked from commit 9e4cd32096)
2024-09-29 12:52:15 +03:00
Gleb Natapov
35a02eb235 group0: make sure that address map has an entry for each new node in the raft configuration
ID->IP mapping is added to the raft address map when the mapping first
appears in the gossiper, but it is added as expiring entry. It becomes
non expiring when a node is added to raft configuration. But when a node
joins those two events may be distant in time (since the node's request
may sit in the topology coordinator queue for a while) and mappings may
expire already from the map. This patch makes sure to transfer the
mapping from the gossiper for a node that is added to the raft
configuration instead of assuming that the mapping is already there.

(cherry picked from commit bddaf498df)
2024-09-26 21:14:14 +00:00
Nadav Har'El
05ed0a033a alternator: exclude CDC log table from ListTables
The Alternator command ListTables is supposed to list actual tables
created with CreateTable, and should list things like materialized views
(created for GSI or LSI) or CDC log tables.

We already properly excluded materialized views from the list - and
had the tests to prove it - but forgot both the exclusion and the testing
for CDC log tables - so creating a table xyz with streams enable would
cause ListTables to also list "xyz_scylla_cdc_log".

This patch fixes both oversights: It adds the code to exclude CDC logs
from the output of ListTables, add adds a test which reproduces the bug
before this fix, and verifies the fix works.

Fixes #19911.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>

Closes scylladb/scylladb#19914

(cherry picked from commit d293a5787f)
2024-09-26 11:24:57 +03:00
Kefu Chai
b715df1c2e docs: explain precedence of configure options
to explain for instance which setting takes effect if both
command line options and `scylla.yaml` configures the same parameter.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
(cherry picked from commit 1aa030a8cd)

Closes scylladb/scylladb#20776
2024-09-26 10:59:49 +03:00
Botond Dénes
276af11202 Merge '[Backport 6.0] replica: ignore cleanup of deallocated storage group' from Aleksandra Martyniuk
Cleanup of a deallocated tablet throws an exception.
Since failed cleanup is retried, we end up in an infinite loop.

Ignore cleanup of deallocated storage groups.

Fixes: https://github.com/scylladb/scylladb/issues/19752.

Needs to be backported to all branches with tablets (6.0 and later)

(cherry picked from commit 20d6cf55f2)

(cherry picked from commit 2c4b1d6b45)

Refs https://github.com/scylladb/scylladb/pull/20584

Closes scylladb/scylladb#20628

* github.com:scylladb/scylladb:
  test: check if cleanup of deallocated sg is ignored
  replica: ignore cleanup of deallocated storage group
2024-09-26 10:58:01 +03:00
Lakshmi Narayanan Sreethar
24c6fce266 [Backport 6.0] database::get_all_tables_flushed_at: fix return value
The `database::get_all_tables_flushed_at` method returns a variable
without setting the computed all_tables_flushed_at value. This causes
its caller, `maybe_flush_all_tables` to flush all the tables everytime
regardless of when they were last flushed. Fix this by returning
the computed value from `database::get_all_tables_flushed_at`.

Fixes #20301

Closes scylladb/scylladb#20471

* github.com:scylladb/scylladb:
  cql-pytest: add test to verify compaction_flush_all_tables_before_major_seconds config
  database::get_all_tables_flushed_at: fix return value

(cherry picked from commit 0e5b444777)

Backported from #20471 to 6.0.

Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com>

Closes scylladb/scylladb#20580
2024-09-26 10:52:02 +03:00
Kamil Braun
45f01a886f test: fix topology_custom/test_raft_recovery_stuck flakiness
The test performs consecutive schema changes in RECOVERY mode. The
second change relies on the first. However the driver might route the
changes to different servers and we don't have group 0 to guarantee
linearizability. We must rely on the first change coordinator to push
the schema mutations to other servers before returning, but that only
happens when it sees other servers as alive when doing the schema
change. It wasn't guaranteed in the test. Fix this.

Fixes scylladb/scylladb#20791

Should be backported to all branches containing this test to reduce
flakiness.

(cherry picked from commit f390d4020a)

Closes scylladb/scylladb#20810
2024-09-25 15:12:30 +02:00
Abhinav
d8b66cf6ef raft topology: add error for removal of non-normal nodes
In the current scenario, We check if a node being removed is normal
on the node initiating the removenode request. However, we don't have a
similar check on the topology coordinator. The node being removed could be
normal when we initiate the request, but it doesn't have to be normal when
the topology coordinator starts handling the request.
For example, the topology coordinator could have removed this node while handling
another removenode request that was added to the request queue earlier.

This commit intends to fix this issue by adding more checks in the enqueuing phase
and return errors for duplicate requests for node removal.

This PR fixes a bug. Hence we need to backport it.

Fixes: scylladb/scylladb#20271
(cherry picked from commit b25b8dccbd)

Closes scylladb/scylladb#20801
2024-09-25 11:36:02 +02:00
Gleb Natapov
c75d58aef5 test: skip test_lwt_semaphore::test_cas_semaphore in aarch64 debug mode
The test configures write timeout to much smaller value to make the test
run faster since for some writes sleep is inserted to hit the timeout,
but it makes aarch64 debug flaky since timeout happens when it should
not because of a natural slowness.

(cherry picked from commit 71a5b1c6dd)

Closes scylladb/scylladb#20778
2024-09-24 15:20:56 +02:00
Benny Halevy
2e8ad3ec35 time_window_compaction_strategy: get_reshaping_job: restrict sort of multi_window vector to its size
Currently the function calls boost::partial_sort with a middle
iterator that might be out of bound and cause undefined behavior.

Check the vector size, and do a partial sort only if its longer
than `max_sstables`, otherwise sort the whole vector.

Fixes scylladb/scylladb#20608

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
(cherry picked from commit 39ce358d82)

Closes scylladb/scylladb#20664
2024-09-23 16:01:27 +03:00
Botond Dénes
7d421abec4 Merge '[Manual Backport 6.0] generic_server: convert connection tracking to seastar::gate' from Laszlo Ersek
This is a manual backport of #20212 to 6.0, superseding #20346 (which had run into conflicts).

Please see the individual commit messages for backport notes.

Fixes #10305

Closes scylladb/scylladb#20349

* github.com:scylladb/scylladb:
  generic_server: make server::stop() idempotent
  generic_server: coroutinize server::shutdown()
  generic_server: make server::shutdown() idempotent
  test/generic_server: add test case
  configure, cmake: sort the lists of boost unit tests
  generic_server: convert connection tracking to seastar::gate
scylla-6.0.4 scylla-6.0.4-candidate-20240919102642
2024-09-19 09:18:48 +03:00
Tomasz Grabiec
e38b42cedf Merge '[Backport 6.0] tablets: Fix race between repair and split ' from Raphael "Raph" Carvalho
Consider the following:

```
T
0   split prepare starts
1                               repair starts
2   split prepare finishes
3                               repair adds unsplit sstables
4                               repair ends
5   split executes
```
If repair produces sstable after split prepare phase, the replica will not split that sstable later, as prepare phase is considered completed already. That causes split execution to fail as replicas weren't really prepared. This also can be triggered with load-and-stream which shares the same write (consumer) path.

The approach to fix this is the same employed to prevent a race between split and migration. If migration happens during prepare phase, it can happen source misses the split request, but the tablet will still be split on the destination (if needed). Similarly, the repair writer becomes responsible for splitting the data if underlying table is in split mode. That's implemented in replica::table for correctness, so if node crashes, the new sstable missing split is still split before added to the set.

Fixes https://github.com/scylladb/scylladb/issues/19378.
Fixes https://github.com/scylladb/scylladb/issues/19416.

Please replace this line with justification for the backport/* labels added to this PR

(cherry picked from commit 239344ab55)

(cherry picked from commit 74612ad358)

Refs https://github.com/scylladb/scylladb/pull/19427

Closes scylladb/scylladb#20593

* github.com:scylladb/scylladb:
  tablets: Fix race between repair and split
  compaction: Allow "offline" sstable to be split
2024-09-17 13:24:36 +02:00
Gleb Natapov
c04ce4ce64 paxos_state: release semaphore units before checking if a semaphore can be dropped
To drop a semaphore it should not be held by anyone, so we need to
release out units before checking if a semaphore can be dropped.

Fixes: scylladb/scylladb#20602
(cherry picked from commit 9cc54932ae)

Closes scylladb/scylladb#20622
2024-09-16 22:39:03 +03:00
Aleksandra Martyniuk
62593ee85f test: check if cleanup of deallocated sg is ignored
(cherry picked from commit 2c4b1d6b45)
2024-09-16 17:49:59 +02:00
Aleksandra Martyniuk
ae385b29fe replica: ignore cleanup of deallocated storage group
Currently, attempt to cleanup deallocated storage group throws
an exception. Failed tablet cleanup is retried, stucking
in an endless loop.

Ignore cleanup of deallocated storage group.

(cherry picked from commit 20d6cf55f2)
2024-09-16 17:21:14 +02:00
Piotr Dulikowski
977c458555 Merge '[Backport 6.0]: hints: send hints with CL=ALL if target is leaving' from Piotr Dulikowski
Currently, when attempting to send a hint, we might choose its recipients in one of two ways:

- If the original destination is a natural endpoint of the hint, we only send the hint to that node and none other,
- Otherwise, we send the hint to all current replicas of the mutation.

There is a problem when we decommission a node: while data is streamed away from that node, it is still considered to be a natural endpoint of the data that it used to own. Because of that, it might happen that a hint is sent directly to it but streaming will miss it, effectively resulting in the hint being discarded.

As sending the hint _only_ to the leaving replica is a rather bad idea, send the hint to all replicas also in the case when the original destination of the hint is leaving.

Note that this is a conservative fix written only with the decommission + vnode-based keyspaces combo in mind. In general, such "data loss" can occur in other situations where the replica set is changing and we go through a streaming phase, i.e. other topology operations in case of vnodes and tablet load balancing. However, the consistency guarantees of hinted handoff in the face of topology changes are not defined and it is not clear what they should be, if there should be any at all. The picture is further complicated by the fact that hints are used by materialized views, and sending view updates to more replicas than necessary can introduce inconsistencies in the form of "ghost rows". This fix was developed in response to a failing test which checked the hint replay + decommission scenario, and it makes it work again.

Fixes scylladb/scylladb#20558
Fixes scylladb/scylla-dtest#4582
Refs scylladb/scylladb#19835

This is a backport of the original PR without the tests, done avoid the need of resolving merge conflicts in that area.

Closes scylladb/scylladb#20559

* github.com:scylladb/scylladb:
  hints: send hints with CL=ALL if target is leaving
  hints: inline do_send_one_mutation
2024-09-16 10:26:00 +02:00
Raphael S. Carvalho
c67967b65a tablets: Fix race between repair and split
Consider the following:

T
0   split prepare starts
1                               repair starts
2   split prepare finishes
3                               repair adds unsplit sstables
4                               repair ends
5   split executes

If repair produces sstable after split prepare phase, the replica
will not split that sstable later, as prepare phase is considered
completed already. That causes split execution to fail as replicas
weren't really prepared. This also can be triggered with
load-and-stream which shares the same write (consumer) path.

The approach to fix this is the same employed to prevent a race
between split and migration. If migration happens during prepare
phase, it can happen source misses the split request, but the
tablet will still be split on the destination (if needed).
Similarly, the repair writer becomes responsible for splitting
the data if underlying table is in split mode. That's implemented
in replica::table for correctness, so if node crashes, the new
sstable missing split is still split before added to the set.

Fixes #19378.
Fixes #19416.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
(cherry picked from commit 74612ad358)
2024-09-13 21:11:25 -03:00
Abhi
d8a71ca6db raft: Add descriptions for requested abort errors
Fixes: scylladb/scylladb#18902

This PR is intended to make debugging easier, hence backporting it to
previous versions shall be useful while debugging issues there

(cherry picked from commit a616f10)

For fixing the backport,  parentheses () were added after variable captures
in lambdas, absence of which wasn't supported in earlier versions of C++.

Closes scylladb/scylladb#20564
2024-09-13 10:18:18 +03:00
Botond Dénes
2f3f734cfb docs/cql/ddl.rst: fix description of sstable_compression
ScyllaDB doesn't support custom compressors. The available compressors
are the only available ones, not the default ones.
Adjust the text to reflect this.

(cherry picked from commit 08f109724b)

Closes scylladb/scylladb#20525
2024-09-13 10:17:12 +03:00
Gleb Natapov
6bd8c9fae5 db/consistency_level: do not use result from hit weighted load balancer if it contains duplicates
Because of https://github.com/scylladb/scylladb/issues/9285 hit weighted
load balancer may sometimes return same node twice. It may cause wrong
data to be read or unexpected errors to be returned to a client. Since
the original bug is not easy to fix and it is rare lets introduce a
workaround. We will check for duplicates and will use non HWLB one if
one is found.

(cherry picked from commit 807e37502a)

Closes scylladb/scylladb#20470
2024-09-13 10:16:39 +03:00
Anna Stuchlik
3948167fca doc: add a page with ScyllaDB limits
This commit adds a page listing the ScyllDB limits
we know today.
The page can and should be extended when other limits
are confirmed.

Closes scylladb/scylladb#19399

(cherry picked from commit 072542a5cc)
2024-09-12 13:32:50 +03:00
Piotr Dulikowski
c423ae1688 hints: send hints with CL=ALL if target is leaving
Currently, when attempting to send a hint, we might choose its
recipients in one of two ways:

- If the original destination is a natural endpoint of the hint, we only
  send the hint to that node and none other,
- Otherwise, we send the hint to all current replicas of the mutation.

There is a problem when we decommission a node: while data is streamed
away from that node, it is still considered to be a natural endpoint of
the data that it used to own. Because of that, it might happen that a
hint is sent directly to it but streaming will miss it, effectively
resulting in the hint being discarded.

As sending the hint _only_ to the leaving replica is a rather bad idea,
send the hint to all replicas also in the case when the original
destiantion of the hint is leaving.

Note that this is a conservative fix written only with the decommission
+ vnode-based keyspaces combo in mind. In general, such "data loss" can
occur in other situations where the replica set is changing and we go
through a streaming phase, i.e. other topology operations in case of
vnodes and tablet load balancing. However, the consistency guarantees of
hinted handoff in the face of topology changes are not defined and it is
not clear what they should be, if there should be any at all. The
picture is further complicated by the fact that hints are used by
materialized views, and sending view updates to more replicas than
necessary can introduce inconsistencies in the form of "ghost rows".
This fix was developed in response to a failing test which checked the
hint replay + decommission scenario, and it makes it work again.

Fixes scylladb/scylla-dtest#4582
Refs scylladb/scylladb#19835

(cherry picked from commit 61ac0a336d)
2024-09-12 10:58:25 +02:00
Piotr Dulikowski
24e70895d5 hints: inline do_send_one_mutation
It's a small method and it is only used once in send_one_mutation.
Inlining it lets us get rid of its declaration in the header - now, if
one needs to change the variables passed from one function to another,
it is no longer necessary to change the header.

(cherry picked from commit 8abb06ab82)
2024-09-12 10:58:22 +02:00
Nadav Har'El
9c1f4d0953 Merge '[Backport 6.0] cql3: add option to not unify bind variables with the same name' from Avi Kivity
Bind variables in CQL have two formats: positional (?) where a variable is referred to by its relative position in the statement, and named (:var), where the user is expected to supply a name->value mapping.

In 19a6e69001 we identified the case where a named bind variable appears twice in a query, and collapsed it to a single entry in the statement metadata. Without this, a driver using the named variable syntax cannot disambiguate which variable is referred to.

However, it turns out that users can use the positional call form even with the named variable syntax, by using the positional API of the driver. To support this use case, we add a configuration variable to disable the same-variable detection.

Because the detection has to happen when the entire statement is visible, we have to supply the configuration to the parser. We call it the dialect and pass it from all callers. The alternative would be to add a pre-prepare call similar to fill_prepare_context that rewrites all expressions in a statement to deduplicate variables.

A unit test is added.

Fixes https://github.com/scylladb/scylladb/issues/15559

This may be useful to users transitioning from Cassandra, so merits a backport.

(cherry picked from commit f9322799af)

(cherry picked from commit d69bf4f010)

(cherry picked from commit ea8441dfa3)

Refs https://github.com/scylladb/scylladb/pull/19493

Subsumes #20389

Closes scylladb/scylladb#20551

* github.com:scylladb/scylladb:
  cql3: add option to not unify bind variables with the same name
  cql3: introduce dialect infrastructure
  cql3: prepared_statement_cache: drop cache key default constructor
  test: cql-pytest: config_value_context: remove strange ast.literal_eval call
  Merge 'config: round-trip boolean configuration variables' from Avi Kivity
2024-09-12 11:21:34 +03:00
Avi Kivity
ad52caac55 cql3: add option to not unify bind variables with the same name
Bind variables in CQL have two formats: positional (`?`) where a
variable is referred to by its relative position in the statement,
and named (`:var`), where the user is expected to supply a
name->value mapping.

In 19a6e69001 we identified the case where a named bind variable
appears twice in a query, and collapsed it to a single entry in the
statement metadata. Without this, a driver using the named variable
syntax cannot disambiguate which variable is referred to.

However, it turns out that users can use the positional call form
even with the named variable syntax, by using the positional
API of the driver. To support this use case, we add a configuration
variable to disable the same-variable detection.

Because the detection has to happen when the entire statement is
visible, we have to supply the configuration to the parser. We
call it the `dialect` and pass it from all callers. The alternative
would be to add a pre-prepare call similar to fill_prepare_context that
rewrites all expressions in a statement to deduplicate variables.

A unit test is added.

Fixes #15559

(cherry picked from commit ea8441dfa3)
(cherry picked from commit edb3068ecf)
2024-09-11 22:55:22 +03:00
Avi Kivity
aabad7e88f cql3: introduce dialect infrastructure
A dialect is a different way to interpret the same CQL statement.

Examples:
 - how duplicate bind variable names are handled (later in this series)
 - whether `column = NULL` in LWT can return true (as is now) or
   whether it always returns NULL (as in SQL)

Currently, dialect is an empty structure and will be filled in later.
It is passed to query_processor methods that also accept a CQL string,
and from there to the parser. It is part of the prepared statement cache
key, so that if the dialect is changed online, previous parses of the
statement are ignored and the statement is prepared again.

The patch is careful to pick up the dialect at the entry point (e.g.
CQL protocol server) so that the dialect doesn't change while a statement
is parsed, prepared, and cached.

(cherry picked from commit d69bf4f010)
2024-09-11 22:55:22 +03:00
Avi Kivity
f8f958030a cql3: prepared_statement_cache: drop cache key default constructor
It's unnecessary, and interferes with the following patch where
we change the cache key type.

(cherry picked from commit f9322799af)
2024-09-11 22:55:22 +03:00
Avi Kivity
643da8e3d8 test: cql-pytest: config_value_context: remove strange ast.literal_eval call
cql-pytest's config_value_context is used to run a code sequence with
different ScyllaDB configuration applied for a while. When it reads
the original value (in order to restore it later), it applies
ast.literal_eval() to it. This is strange, since the config variable isn't
a Python literal.

It was added in 8c464b2ddb ("guardrails: restrict replication
strategy (RS)"). Presumably, as a workaround for #19604 - it sufficiently
massaged the input we read via SELECT to be acceptable later via UPDATE.

Now that #19604 is fixed, we can remove the call to ast.literal_eval,
but have to fix up the parameters to config_value_context to something
that will be accepted without further massaging.

This is a step towards fixing #15559, where we want to run some tests
with a boolean configuration variable changed, and literal_eval is
transforming the string representation of integers to integers and
confusing the driver.

Closes scylladb/scylladb#19696

(cherry picked from commit d5af86bd8a)
2024-09-11 22:55:22 +03:00
Nadav Har'El
79879be753 Merge 'config: round-trip boolean configuration variables' from Avi Kivity
When you SELECT a boolean from system.config, it reads as true/false, but this isn't accepted
on UPDATE (instead, we accept 1/0). This is surprising and annoying, so accept true/false in
both directions.

Not a regression, so a backport isn't strictly necessary.

Closes scylladb/scylladb#19792

* github.com:scylladb/scylladb:
  config: specialize from-string conversion for bool
  config: wrap boost::lexical_cast<> when converting from strings

(cherry picked from commit 9eb47b3ef0)
2024-09-11 22:55:22 +03:00
Kamil Braun
553174251b test: test_raft_no_quorum: increase raft timeout in debug mode
The test cases in this file use an error injection to reduce raft group
0 timeouts (from the default 1 minute), in order to speed up the tests;
the scenarios expect these timeouts to happen, so we want them to happen
as quick as possible, but we don't want to reduce timeouts so much that
it will make other operations fail when we don't expect them to (e.g.
when the test wants to add a node to the cluster).

Unfortunately the selected 5 seconds in debug mode was not enough and
made the tests flaky: scylladb/scylladb#20111.

Increase it to 10 seconds. This unfortunately will slow down these tests
as they have to sometimes wait for 10 seconds for the timeout to happen.
But better to have this than a flaky test.

Fixes: scylladb/scylladb#20111
(cherry picked from commit 52fdf5b4c9)

Closes scylladb/scylladb#20478
2024-09-10 11:56:21 +03:00
Kefu Chai
cd743a4cfa docs: do not install scylla/ppa repo when perform upgrade
for following reasons:

1. the ppa in question does not provide the build for the latest ubuntu's LTS release. it only builds for trusty, xenial, bionic and jammy. according to https://wiki.ubuntu.com/Releases, the latest LTS release is ubuntu noble at the time of writing.
2. the ppa in question does not provide the packages used in production. it does provides the package for *building* scylla
3. after we introduced the relocatable package, there is no need to provide extra user space dependencies apart from scylla packages.

so, in this change, we remove all references to enabling the Scylla/PPA repository.

Fixes scylladb/scylladb#20449

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
(cherry picked from commit fe0e961856)

Closes scylladb/scylladb#20454
2024-09-10 11:54:09 +03:00
Nadav Har'El
44b1fe628c alternator ttl: fix use-after-free
The Alternator TTL scanning code uses an object "scan_ranges_context"
to hold the scanning context. One of the members of this object is
a service::query_state, and that in turn holds a reference to a
service::client_state. The existing constructor created a temporary
client_state object and saved a reference to it - which can result
in use after free as the temporary object is freed as soon as the
constructor ends.

The fix is to save a client_state in the scan_ranges_context object,
instead of a temporary object.

Fixes #19988

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
(cherry picked from commit 15f8046fcb)

Closes scylladb/scylladb#20437
2024-09-10 11:51:08 +03:00
Kefu Chai
1e35328161 sstables: correct the debugging message printed when removing temp dir
in 372a4d1b79, we introduced a change
which was for debugging the logging message. but the logging message
intended for printing the temp_dir not prints an `optional<int>`. this
is both confusing, and more importantly, it hurts the debuggability.

in this change, the related change is reverted.

Fixes scylladb/scylladb#20408

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
(cherry picked from commit d26bb9ae30)

Closes scylladb/scylladb#20435
2024-09-10 11:50:01 +03:00
Takuya ASADA
46458226bb install.sh: fix more incorrect permission on strict umask
Even after 13caac7, we still have more files incorrect permission, since
we use "cp -r" and creating new file with redirect.

To fix this, we need to replace "cp -r" with "cp -pr", and "chmod <perm>" on
newly created files.

Fixes #14383
Related #19775

(cherry picked from commit 9d7fed40b5)

Closes scylladb/scylladb#20433
2024-09-10 11:47:41 +03:00
Kefu Chai
6fdb124914 dist: drop %pretrans section
before this change, if user does not have `/bin/sh` around, when
installing scylla packages, the script in `%pretrans" is executed,
and fails due to missing `/bin/sh`. per
https://docs.fedoraproject.org/en-US/packaging-guidelines/Scriptlets/#pretrans

> Note that the %pretrans scriptlet will, in the particular case of
> system installation, run before anything at all has been installed.
> This implies that it cannot have any dependencies at all. For this
> reason, %pretrans is best avoided, but if used it MUST (by necessity)
> be written in Lua. See
> https://rpm-software-management.github.io/rpm/manual/lua.html for more
> information.

but we were trying to warn users upgrading from scylla < 1.7.3, which
was released 7 years ago at the time of writing.

in this change, we drop the `%pretrans` section. hopefuly they will
find their way out if they still exist.

Fixes scylladb/scylladb#20321

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
(cherry picked from commit 6970c502c9)

Closes scylladb/scylladb#20386
2024-09-10 11:46:30 +03:00
Avi Kivity
093bff385b docs: cql: document ZstdCompressor for CREATE TABLE
Adjust the wording slightly to be less awkward.

(cherry picked from commit 60acfd8c08)

Closes scylladb/scylladb#20381
2024-09-10 11:45:55 +03:00
Raphael S. Carvalho
a4f6811d5f storage_service: avoid processing same table unnecessarily in split monitor
If there's a token metadata for a given table, and it is in split mode,
it will be registered such that split monitor can look at it, for
example, to start split work, or do nothing if table completed it.

during topology change, e.g. drain, split is stalled since it cannot
take over the state machine.
It was noticed that the log is being spammed with a message saying the
table completed split work, since every tablet metadata update, means
waking up the monitor on behalf of a table. So it makes sense to
demote the logging level to debug. That persists until drain completes
and split can finally complete.

Another thing that was noticed is that during drain, a table can be
submitted for processing faster than the monitor can handle, so the
candidate queue may end up with multiple duplicated entries for same
table, which means unnecessary work. That is fixed by using a
sequenced set, which keeps the current FIFO behavior.

Fixes #20339.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
(cherry picked from commit 26facd807e)

Closes scylladb/scylladb#20344
2024-09-10 11:45:06 +03:00
Botond Dénes
8d5f2d8943 Merge '[Backport 6.0] repair: throw if batchlog manager isn't initialized' from Aleksandra Martyniuk
repair_service::repair_flush_hints_batchlog_handler may access batchlog
manager while it is uninitialized.

Throw if batchlog manager isn't initialized.

Fixes: https://github.com/scylladb/scylladb/issues/20236.

Needs backport to 6.0 and 6.1 as they suffer from the uninitialized bm access.

(cherry picked from commit d8e4393418)

(cherry picked from commit f38bb6483a)

Refs https://github.com/scylladb/scylladb/pull/20251

Closes scylladb/scylladb#20392

* github.com:scylladb/scylladb:
  test: add test to ensure repair won't fail with uninitialized bm
  repair: throw if batchlog manager isn't initialized
2024-09-09 15:13:38 +03:00
Jenkins Promoter
0e5108ed7f Update ScyllaDB version to: 6.0.4 2024-09-04 15:34:01 +03:00
Kamil Braun
96064e9647 Merge '[Backport 6.0] Fix node replace with inter-dc encryption enabled.' from Gleb Natapov
Currently if a coordinator and a node being replaced are in the same DC
while inter-dc encryption is enabled (connections between nodes in the
same DC should not be encrypted) the replace operation will fail. It
fails because a coordinator uses non encrypted connection to push raft
data to the new node, but the new node will not accept such connection
until it knows which DC the coordinator belongs to and for that the raft
data needs to be transferred.

The series adds the test for this scenario and the fix for the
chicken&egg problem above.

The series (or at least the fix itself) needs to be backported because
this is a serious regression.

Fixes: https://github.com/scylladb/scylladb/issues/19025

(cherry picked from commit 84757a4ed3)

(cherry picked from commit b98282a976)

(cherry picked from commit 2f1b1fd45e)

(cherry picked from commit 17f4a151ce)

(cherry picked from commit 32a59ba98f)

Refs https://github.com/scylladb/scylladb/pull/20290

Closes scylladb/scylladb#20399

* github.com:scylladb/scylladb:
  topology coordinator: fix indentation after the last patch
  topology coordinator: do not add replacing node without a ring to topology
  test: add test for replace in clusters with encryption enabled
  test.py: add server encryption support to cluster manager
  .gitignore: fix pattern for resources to match only one specific directory
2024-09-03 12:24:35 +02:00
Gleb Natapov
8400e6947b topology coordinator: fix indentation after the last patch
(cherry picked from commit 32a59ba98f)
2024-09-02 17:04:42 +03:00
Gleb Natapov
8510568eda topology coordinator: do not add replacing node without a ring to topology
When only inter dc encryption is enabled a non encrypted connection
between two nodes is allowed only if both nodes are in the same dc.
If a nodes that initiates the connection knows that dst is in the same
dc and hence use non encrypted connection, but the dst not yet knows the
topology of the src such connection will not be allowed since dst cannot
guaranty that dst is in the same dc.

Currently, when topology coordinator is used, a replacing node will
appear in the coordinator's topology immediately after it is added to the
group0. The coordinator will try to send raft message to the new node
and (assuming only inter dc encryption is enabled and replacing node and
the coordinator are in the same dc) it will try to open regular, non encrypted,
connection to it. But the replacing node will not have the coordinator
in it's topology yet (it needs to sync the raft state for that). so it
will reject such connection.

To solve the problem the patch does not add a replacing node that was
just added to group0 to the topology. It will be added later, when
tokens will be assigned to it. At this point a replacing node will
already make sure that its topology state is up-to-date (since it will
execute a raft barrier in join_node_response_params handler) and it knows
coordinator's topology. This aligns replace behaviour with bootstrap
since bootstrap also does not add a node without a ring to the topology.

The patch effectively reverts b8ee8911ca

Fixes: scylladb/scylladb#19025
(cherry picked from commit 17f4a151ce)
2024-09-02 17:04:42 +03:00
Gleb Natapov
cd324b8513 test: add test for replace in clusters with encryption enabled
(cherry picked from commit 2f1b1fd45e)
2024-09-02 17:04:42 +03:00
Gleb Natapov
d441d93e63 test.py: add server encryption support to cluster manager
(cherry picked from commit b98282a976)
2024-09-02 17:04:42 +03:00
Gleb Natapov
84c47df5e3 .gitignore: fix pattern for resources to match only one specific directory
(cherry picked from commit 84757a4ed3)
2024-09-02 15:21:11 +03:00
Aleksandra Martyniuk
ca3cbae70b test: add test to ensure repair won't fail with uninitialized bm
(cherry picked from commit f38bb6483a)
2024-09-02 10:37:18 +02:00
Aleksandra Martyniuk
3e25eadf12 repair: throw if batchlog manager isn't initialized
repair_service::repair_flush_hints_batchlog_handler may access batchlog
manager while it is uninitialized.

Batchlog manager cannot be initialized before repair as we have the
dependencies chain:
repair_service -> storage_service::join_cluster -> batchlog_manager.

Throw if batchlog manager isn't initialized. That won't cause repair
to fail.

(cherry picked from commit d8e4393418)
2024-08-30 13:55:48 +00:00
Laszlo Ersek
f765591886 generic_server: make server::stop() idempotent
After server::shutdown(), make server::stop() more robust too, by allowing
callers (internal or external) to call it several times (not concurrently
though, just yet; see
<https://github.com/scylladb/scylladb/issues/20309>).

Suggested-by: Benny Halevy <bhalevy@scylladb.com>
Signed-off-by: Laszlo Ersek <laszlo.ersek@scylladb.com>
(cherry picked from commit 49bff3b1ab)
2024-08-30 15:33:00 +02:00
Laszlo Ersek
f17516c4c1 generic_server: coroutinize server::shutdown()
By turning server::shutdown() into a coroutine, we need not dynamically
allocate "nr_conn".

Verified as follows:

(1) In terminal #1:

    build/Dev/scylla --overprovisioned --developer-mode=yes \
        --memory=2G --smp=1 --default-log-level error \
        --logger-log-level cql_server=debug:cql_server_controller=debug

> INFO  [...] cql_server_controller - Starting listening for CQL clients
>                                     on 127.0.0.1:9042 (unencrypted,
>                                     non-shard-aware)
> INFO  [...] cql_server_controller - Starting listening for CQL clients
>                                     on 127.0.0.1:19042 (unencrypted,
>                                     shard-aware)

(2) In terminals #2 and #3:

    tools/cqlsh/bin/cqlsh.py

(3) Press ^C in terminal #1:

> DEBUG [...] cql_server - abort accept nr_total=2
> DEBUG [...] cql_server - abort accept 1 out of 2 done
> DEBUG [...] cql_server - abort accept 2 out of 2 done
> DEBUG [...] cql_server - shutdown connection nr_total=4
> DEBUG [...] cql_server - shutdown connection 1 out of 4 done
> DEBUG [...] cql_server - shutdown connection 2 out of 4 done
> DEBUG [...] cql_server - shutdown connection 3 out of 4 done
> DEBUG [...] cql_server - shutdown connection 4 out of 4 done
> INFO  [...] cql_server_controller - CQL server stopped

This patch is best viewed with "git show --word-diff=color".

Suggested-by: Benny Halevy <bhalevy@scylladb.com>
Signed-off-by: Laszlo Ersek <laszlo.ersek@scylladb.com>
(cherry picked from commit 1138347e7e)
2024-08-30 15:32:55 +02:00
Laszlo Ersek
b053f794d7 generic_server: make server::shutdown() idempotent
Make server::shutdown() more robust by allowing callers (internal or
external) to call it several times (not concurrently though, just yet; see
<https://github.com/scylladb/scylladb/issues/20309>).

Suggested-by: Benny Halevy <bhalevy@scylladb.com>
Signed-off-by: Laszlo Ersek <laszlo.ersek@scylladb.com>
(cherry picked from commit 2216275ebd)
2024-08-30 15:32:49 +02:00