41239 Commits

Author SHA1 Message Date
Nadav Har'El
5d4c60aee3 test/cql-pytest: avoid spurious guardrail warnings
All cql-pytest tests use one node, and unsurprisingly most use RF=1.
By default, as part of the "guardrails" feature, we print a warning
when creating a keyspace with RF=1. This warning gets printed on
every cql-pytest run, which creates a "boy who cried wolf" effect
whereby developers get used to seeing these warnings, and won't care
if new warnings start appearing.

The fix is easy - in run.py start Scylla with
minimum-replication-factor-warn-threshold set to -1 instead of the default 3.

Note that we do have cql-pytest tests for this guardrail, but those don't
rely on the default setting of this variable (they can't, cql-pytest
tests can also be run on a Scylla instance run manually by a developer).
Those tests temporarily set the threshold during the test.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>

Closes scylladb/scylladb#17274
2024-02-13 17:44:20 +02:00
Kefu Chai
b309e42195 collection_mutation: add formatter for collection_mutation_view::printer
before this change, we rely on the default-generated fmt::formatter
created from operator<<, but fmt v10 dropped the default-generated
formatter.

in this change, we define formatters for
`collection_mutation_view::printer`, and drop its
operator<<.

Refs #13245

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#17300
2024-02-13 17:42:25 +02:00
Botond Dénes
120442231f Merge 'row_cache: test cache consistency during multi-partition cache updates' from Michał Chojnowski
Adds a test reproducing https://github.com/scylladb/scylladb/issues/16759, and the instrumentation needed for it.

Closes scylladb/scylladb#17208

* github.com:scylladb/scylladb:
  row_cache_test: test cache consistency during memtable-to-cache merge
  row_cache: use preemption_source in update()
  utils: preempt: add preemption_source
2024-02-13 17:37:06 +02:00
Kefu Chai
54ed65bb50 mutation: s/statics/static content/
codespell reports that "statics" could be a misspelling of
"statistics". but "statics" here means the static column(s). so
replace "statics" with more specific wording.

Refs #589
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#17216
2024-02-13 17:33:21 +02:00
Kefu Chai
9b6a66826c api/storage_service: add more constness to http_context parameter
when we just want to perform read access to `http_context`, there
is no need to use a non-const reference. so let's add `const` specifier
to make this explicit. this should help with readability and
maintainability.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#17219
2024-02-13 17:32:45 +02:00
Lakshmi Narayanan Sreethar
f8f8d64982 test.py: support skipping multiple test patterns
Support skipping multiple patterns by allowing them to be passed via
multiple '--skip' arguments to test.py.

Example : `test.py --skip=topology --skip=sstables`
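
A minimal sketch of how repeated `--skip` options can be collected, assuming an `argparse`-based command line. The function names and the substring matching rule are hypothetical and are not taken from test.py itself:

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="test.py")
    # action="append" stores every occurrence of --skip in one list
    # instead of letting the last occurrence overwrite the earlier ones.
    parser.add_argument("--skip", action="append", dest="skip_patterns",
                        default=[], metavar="PATTERN",
                        help="skip tests matching PATTERN (may be repeated)")
    return parser


def should_skip(test_name: str, skip_patterns) -> bool:
    # Hypothetical rule: skip a test if any pattern is a substring of its name.
    return any(pattern in test_name for pattern in skip_patterns)


args = build_parser().parse_args(["--skip=topology", "--skip=sstables"])
print(args.skip_patterns)
```

With the two options from the example above, `skip_patterns` holds both `topology` and `sstables`, in command-line order.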

Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com>

Closes scylladb/scylladb#17220
2024-02-13 17:32:03 +02:00
Kefu Chai
57d138b80f row_cache: s/fro/reader/
"fro" is short for "from", but the value is an
`optimized_optional<flat_mutation_reader_v2>`. codespell considers
it a misspelling of "for" or "from". neither of them makes sense,
so let's change it to "reader" for better readability, and also to
silence the warning, so that genuine warnings can stand out.
this helps make codespell more useful.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#17221
2024-02-13 17:28:14 +02:00
Kefu Chai
c555af3cd8 raft: add formatter for raft::log
before this change, we rely on the default-generated fmt::formatter
created from operator<<, but fmt v10 dropped the default-generated
formatter.

in this change, we define formatters for `raft::log`, and drop its
operator<<.

Refs #13245

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#17301
2024-02-13 17:17:57 +02:00
Anna Stuchlik
02cd84adbf doc: remove OSS-vs-Ent Matrix from OSS docs
This commit removes the Open Source vs. Enterprise matrix
from the Open Source documentation.

In addition, a redirection is added to prevent a 404 in the OSS docs,
and the removed page is replaced with a link to the same page
in the Enterprise docs.

This commit must be reverted in enterprise.git, because
we want to keep the Matrix in the Enterprise docs.

Fixes https://github.com/scylladb/scylladb/issues/17289

Closes scylladb/scylladb#17295
2024-02-13 17:17:22 +02:00
Yaniv Kaul
d2ef100b60 Typos: more/less then -> more/less than
Fix repeated typos in comments: more then -> more than, less then -> less than

Signed-off-by: Yaniv Kaul <yaniv.kaul@scylladb.com>

Closes scylladb/scylladb#17303
2024-02-13 17:16:15 +02:00
Nadav Har'El
dce47a81b0 alternator, tablets: return error if enabling Streams with tablets
Alternator Streams doesn't yet work on tables using tablets (this is
issue #16317). Before this patch, an attempt to enable it results in
an unsightly InternalServerError, which isn't terrible - but we can
do better.

So in this patch, we make the attempt to enable Streams and tablets
together into a clear error. The error message points to the open issue,
and also suggests how to create a table that uses vnodes, not tablets.

Unfortunately, there are two slightly different code paths and error
messages for two cases: One case is the creation of a new table (where
the validation happens before the keyspace is actually created), and
the other case is an attempt to enable streams on an existing table
with an existing keyspace (which already might or might not be using
tablets).

This patch also adds a test that verifies that trying to enable Streams
with tablets is an error - in both cases (table creation and update).
Obviously, this test - and the validation code - should be removed once
the issue is solved and Alternator Streams begins working with tablets.

Fixes #16497
Refs #16807

Signed-off-by: Nadav Har'El <nyh@scylladb.com>

Closes scylladb/scylladb#17311
2024-02-13 16:42:35 +02:00
Raphael S. Carvalho
54226dddf5 replica: Kill vnode-oriented cleanup handling for multiple compaction groups
With tablets, we don't use vnode-oriented sstable cleanup.
So let's just remove unused code and bail out silently if sharding is
tablet based. The reason for silence is that we don't want to break
tests that might be reused for tablets, and it's not a problem for
sstable cleanup to be ignored with tablets.
This approach is actually already used in the higher level code,
implementing the cleanup API.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>

Closes scylladb/scylladb#17296
2024-02-13 16:35:15 +02:00
Petr Gusev
3722ca0a41 sync_raft_topology_nodes: parallelize system_keyspace update functions
In sync_raft_topology_nodes we execute a system keyspace
update query for each node of the cluster. The system keyspace
tables use schema commitlog which by default enables use_o_dsync.
This means that each write to the commitlog is accompanied by fsync.
For large clusters this can incur hundreds of writes with fsyncs, which
is very expensive. For example, in #17039 for a moderate-size cluster
of 50 nodes sync_raft_topology_nodes took almost 5 seconds.

In this commit we solve this problem by running all such update
queries in parallel. The commitlog should batch them and issue
only one write syscall to the OS.
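
The effect can be illustrated with a toy model, sketched here under the assumption that one flush (standing in for write + fsync) covers every entry queued since the previous flush. `ToyCommitlog` and the helper functions are hypothetical and do not reflect Scylla's actual commitlog code:

```python
import asyncio


class ToyCommitlog:
    """Toy model: one flush covers every entry queued since the last flush."""

    def __init__(self):
        self.flushes = 0
        self._pending = []
        self._flusher = None

    async def write(self, entry):
        done = asyncio.get_running_loop().create_future()
        self._pending.append((entry, done))
        if self._flusher is None:
            # The first writer of a batch schedules the flush.
            self._flusher = asyncio.ensure_future(self._flush())
        await done

    async def _flush(self):
        await asyncio.sleep(0)  # yield once so concurrent writers can queue up
        batch, self._pending = self._pending, []
        self._flusher = None
        self.flushes += 1
        for _entry, done in batch:
            done.set_result(None)


async def update_sequentially(log, nodes):
    for node in nodes:
        await log.write(node)  # each write waits for its own flush


async def update_in_parallel(log, nodes):
    await asyncio.gather(*(log.write(node) for node in nodes))


def count_flushes(update, node_count=50):
    async def run():
        log = ToyCommitlog()
        await update(log, range(node_count))
        return log.flushes
    return asyncio.run(run())
```

In this model, `count_flushes(update_sequentially)` pays one flush per node, while `count_flushes(update_in_parallel)` pays a single flush for the whole batch.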

Closes scylladb/scylladb#17243
2024-02-13 14:44:48 +01:00
Piotr Dulikowski
314fd9a11f test: test_topology_recovery_basic: add missing driver reconnect
Unfortunately, scylladb/python-driver#230 is not fixed yet, so it is
necessary for the sake of our CI's stability to re-create the driver
session after all nodes in the cluster are restarted.

There is one place in test_topology_recovery_basic where all nodes are
restarted but the driver session is not re-created. Even though nodes
are not restarted at once but rather sequentially, we observed a failure
with similar symptoms in a CI run for scylla-enterprise.

Add the missing driver reconnect as a workaround for the issue.

Fixes: scylladb/scylladb#17277

Closes scylladb/scylladb#17278
2024-02-13 12:28:30 +01:00
David Garcia
f45d9d33f1 docs: remove liveness asterisks
Instead of adding an asterisk next to "liveness" linking to the glossary, we will temporarily replace them with a hyperlink pending the implementation of tooltip functionality.

Closes scylladb/scylladb#17244
2024-02-12 20:37:52 +02:00
Avi Kivity
b22db74e6a Regenerate frozen toolchain
For gnutls 3.8.3 and clang clang-16.0.6-4.

Fixes #17285.

Closes scylladb/scylladb#17287
2024-02-12 18:36:11 +02:00
Botond Dénes
3f2d7e8b25 tree: remove unnecessary yields around for_each_tablet()
Commit 904bafd069 consolidated the two
existing for_each_tablet() overloads, to the one which has a future<>
returning callback. It also added yields to the bodies of said
callbacks. This is unnecessary, the loop in for_each_tablet() already
has a yield per tablet, which should be enough to prevent stalls.

This patch is a follow-up to #17118

Closes scylladb/scylladb#17284
2024-02-12 17:10:25 +01:00
Kamil Braun
2e81f045cc Merge 'transport: controller: do_start_server: do not set_cql_read for maintenance port' from Benny Halevy
RPC is not ready yet at this point, so we should not set this application state yet.

Also, simplify add_local_application_state as it contains dead code
that will never generate an internal error after 1d07a596bf.

Fixes #16932

Closes scylladb/scylladb#17263

* github.com:scylladb/scylladb:
  gossiper: add_local_application_state: drop internal error
  transport: controller: do_start_server: do not set_cql_read for maintenance port
2024-02-12 13:26:45 +01:00
Pavel Emelyanov
2b1612aa04 main: Stop lifecycle notifier for real
It wasn't stopped because of the storage service. Now the latter is stopped
(since e6b34527c1), so the former can be stopped too.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>

Closes scylladb/scylladb#17251
2024-02-12 13:59:50 +02:00
Kefu Chai
7baee379de sstable/storage: pass fs::path to storage::create_links()
this change is a follow-up of 637dd730. the goal is to use
std::filesystem::path for manipulating paths, and to avoid
converting between sstring and fs::path back and forth.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#17257
2024-02-12 13:26:11 +02:00
Kefu Chai
7a5cb69e33 storage_service: s/format()/fmt::format/
in the same spirit of e84a0991, let's switch the callers who expect
std::string to fmt::format(). to minimize the impact and to reduce
the risk, the switch will be performed piecemeal.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#17253
2024-02-12 13:24:21 +02:00
Pavel Emelyanov
b9721bd397 test/tablets: Decommissioning node below RF is not allowed
When a node is decommissioned, all tablet replicas need to be moved away
from it. In some cases it may not be possible. If the number of nodes in
the cluster equals the keyspace RF, one cannot decommission any node
because it's not possible to find nodes for every replica.
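
The constraint can be written down as a simple check. This is a simplified sketch that ignores racks and datacenters; `can_decommission` is a hypothetical helper, not a function from the test or from Scylla:

```python
def can_decommission(node_count: int, keyspace_rfs) -> bool:
    """Return True if, after one node leaves, every keyspace can still
    place each of its RF replicas on a distinct node."""
    remaining = node_count - 1
    return all(rf <= remaining for rf in keyspace_rfs)


print(can_decommission(3, [3]))  # node count equals RF
print(can_decommission(4, [3]))  # one spare node
```

With three nodes and RF=3 the check fails, because only two nodes would remain for three replicas. With four nodes it passes.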

The new test case validates this constraint is satisfied.

refs: #16195

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>

Closes scylladb/scylladb#17248
2024-02-12 13:21:47 +02:00
Nadav Har'El
21e7deafeb alternator, mv: fix case of two new key columns in GSI
A materialized view in CQL allows AT MOST ONE view key column that
wasn't a key column in the base table. This is because if there were
two or more of those, the "liveness" (timestamp, ttl) of these different
columns can change at every update, and it's not possible to pick what
liveness to use for the view row we create.

We made an exception for this rule for Alternator: DynamoDB's API allows
creating a GSI whose partition key and range key are both regular columns
in the base table, and we must support this. We claim that the fact that
Alternator allows neither TTL (Alternator's "TTL" is a different feature)
nor user-defined timestamps, does allow picking the liveness for the view
row we create. But we did it wrong!

We claimed in a comment - and implemented in the code before this patch -
that in Alternator we can assume that both GSI key columns will have the
*same* liveness, and in particular timestamp. But this is only true if
one modifies both columns together! In fact, in general it is not true:
We can have two non-key attributes 'a' and 'b' which are the GSI's key
columns, and we can modify *only* b, without modifying a, in which case
the timestamp of the view modification should be b's newer timestamp,
not a's older one. The existing code took a's timestamp, assuming it
will be the same as b's, which is incorrect. The result was that if
we repeatedly modify only b, all view updates will receive the same
timestamp (a's old timestamp), and a deletion will always win over
all the modifications. This patch includes a reproducing test written by
a user (@Zak-Kent) that demonstrates how after a view row is deleted
it doesn't get recreated - because all the modifications use the same
timestamp.

The fix is, as suggested above, to use the *higher* of the two
timestamps of both base-regular-column GSI key columns as the timestamp
for the new view rows or view row deletions. The reproducer that
failed before this patch passes with it. As usual, the reproducer
passes on AWS DynamoDB as well, proving that the test is correct and
should really work.
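
The failure mode can be reproduced in a toy last-write-wins model. This is only a sketch: `ViewRow`, the timestamp pickers and the three-step scenario are hypothetical simplifications, and the rule that a deletion wins over a write with the same timestamp is the assumption the model rests on:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ViewRow:
    """Toy view row: newest write timestamp and newest deletion timestamp."""
    write_ts: Optional[int] = None
    delete_ts: Optional[int] = None

    def write(self, ts: int) -> None:
        if self.write_ts is None or ts > self.write_ts:
            self.write_ts = ts

    def delete(self, ts: int) -> None:
        if self.delete_ts is None or ts > self.delete_ts:
            self.delete_ts = ts

    def is_live(self) -> bool:
        if self.write_ts is None:
            return False
        # A deletion wins over a write with the same or an older timestamp.
        return self.delete_ts is None or self.write_ts > self.delete_ts


def pick_ts_before_patch(ts_a: int, ts_b: int) -> int:
    return ts_a  # assumes both key columns share a timestamp


def pick_ts_after_patch(ts_a: int, ts_b: int) -> int:
    return max(ts_a, ts_b)  # the higher of the two timestamps


def replay(pick_ts) -> bool:
    row = ViewRow()               # view row for one (a, b) key pair
    ts_a = 1                      # 'a' is written once, at t=1
    row.write(pick_ts(ts_a, 1))   # t=1: a and b written together
    row.delete(pick_ts(ts_a, 2))  # t=2: only b changes, old view row deleted
    row.write(pick_ts(ts_a, 3))   # t=3: b changes back, row should reappear
    return row.is_live()


print(replay(pick_ts_before_patch), replay(pick_ts_after_patch))
```

With a's timestamp, every view update is stamped t=1, so the deletion keeps winning and the row stays dead. With the maximum, the re-creation at t=3 beats the deletion at t=2.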

Fixes #17119

Signed-off-by: Nadav Har'El <nyh@scylladb.com>

Closes scylladb/scylladb#17172
2024-02-12 13:17:29 +02:00
Nadav Har'El
341af86167 test/cql-pytest: reproducer for GROUP BY regression
This patch adds a simple reproducer for a regression in Scylla 5.4 caused
by commit 432cb02, breaking LIMIT support in GROUP BY.

Refs #17237

Signed-off-by: Nadav Har'El <nyh@scylladb.com>

Closes scylladb/scylladb#17275
2024-02-12 13:09:52 +02:00
Kefu Chai
57df20eef8 configure.py: use un-deprecated module
PEP 632 deprecates the distutils module, and it is removed from Python 3.12.
we are actually using the one vendored by setuptools, if we are using
3.12. so let's use shutil for finding ninja executable.
see https://peps.python.org/pep-0632/
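
A minimal sketch of an executable lookup based on `shutil.which`, which is part of the standard library. The `find_ninja` helper and the list of candidate names are assumptions for illustration and are not copied from configure.py:

```python
import shutil


def find_ninja(candidates=("ninja", "ninja-build")):
    """Return the full path of the first candidate found on PATH, or None."""
    for name in candidates:
        path = shutil.which(name)  # returns None when the name is not on PATH
        if path is not None:
            return path
    return None


print(find_ninja())
```

`shutil.which` needs neither distutils nor setuptools, so the lookup behaves the same on Python 3.12 and on older versions.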

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#17271
2024-02-12 13:05:35 +02:00
Kamil Braun
7d73c40125 Merge 'test.py: tablets: Fix flakiness of test_tablet_missing_data_repair' from Tomasz Grabiec
Reimplements stop/start sequence using rolling_restart() which is safe
with regards to UP status propagation and not prone to sudden
connection drop which may cause later CQL queries to time out. It also
ensures that CQL is up on all the remaining nodes when the with_down
callback is executed.

The test was observed to fail in CI like this:

```
  cassandra.cluster.NoHostAvailable: ('Unable to complete the operation against any hosts', {<Host: 127.157.135.26:9042 datacenter1>: ConnectionException('Pool for 127.157.135.26:9042 is shutdown')})
  ...
      @pytest.mark.repair
      @pytest.mark.asyncio
      async def test_tablet_missing_data_repair(manager: ManagerClient):
  ...
          for idx in range(0,3):
              s = servers[idx].server_id
              await manager.server_stop_gracefully(s, timeout=120)
  >           await check()
```

Hopefully: Fixes #17107

Closes scylladb/scylladb#17252

* github.com:scylladb/scylladb:
  test: py: tablets: Fix flakiness of test_tablet_missing_data_repair
  test: pylib: manager_client: Wait for driver to catch up in rolling_restart()
  test: pylib: manager_client: Accept callback in rolling_restart() to execute with node down
2024-02-12 11:52:09 +01:00
Botond Dénes
f068d1a6fa query: do not kill unpaged queries when they reach the tombstone-limit
The reason we introduced the tombstone-limit
(query_tombstone_page_limit), was to allow paged queries to return
incomplete/empty pages in the face of large tombstone spans. This works
by cutting the page after the tombstone-limit amount of tombstones were
processed. If the read is unpaged, it is killed instead. This was a
mistake. First, it doesn't really make sense, the reason we introduced
the tombstone limit, was to allow paged queries to process large
tombstone-spans without timing out. It does not help unpaged queries.
Furthermore, the tombstone-limit can kill internal queries done on
behalf of user queries, because all our internal queries are unpaged.
This can cause denial of service.

So in this patch we disable the tombstone-limit for unpaged queries
altogether, they are allowed to continue even after having processed the
configured limit of tombstones.
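
The behavior after this patch can be sketched as follows. This is a toy reader over a flat list of cells; `read_page`, its parameters and the use of `None` as a tombstone marker are hypothetical and only model the rule described above:

```python
TOMBSTONE = None  # toy marker for a deleted cell


def read_page(cells, tombstone_limit, paged):
    """Return (live_cells, cells_consumed).

    A paged read is cut once `tombstone_limit` tombstones were processed,
    possibly returning an empty page. An unpaged read ignores the limit.
    """
    live = []
    tombstones = 0
    consumed = 0
    for cell in cells:
        consumed += 1
        if cell is TOMBSTONE:
            tombstones += 1
            if paged and tombstones >= tombstone_limit:
                break  # cut the page; the next page resumes after this point
        else:
            live.append(cell)
    return live, consumed


data = [TOMBSTONE] * 5 + ["x"]
print(read_page(data, tombstone_limit=3, paged=True))
print(read_page(data, tombstone_limit=3, paged=False))
```

The paged read stops after three tombstones and returns an empty page, while the unpaged read runs to the end and returns the live cell.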

Fixes: #17241

Closes scylladb/scylladb#17242
2024-02-12 12:34:04 +02:00
Kefu Chai
9b85d1aebf configure.py, cmake: do not pass -Wignored-qualifiers explicitly
we recently added -Wextra to configure.py, and this option enables
a bunch of warning options, including `-Wignored-qualifiers`. so
there is no need to enable this specific warning anymore. this change
removes this option from both `configure.py` and the CMake building system.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#17272
2024-02-12 12:32:00 +02:00
Avi Kivity
c14571af16 Update seastar submodule
Because Seastar now defaults to C++23, we downgrade it explicitly to
C++20.

* seastar 289ad5e593...5d3ee98073 (10):
  > Update supported C++ standards to C++23 and C++20 (dropping C++17)
  > docker: install clang-tools-18
  > http: add handler_base::verify_mandatory_params()
  > coroutine/exception: document return_exception_ptr()
  > http: use structured-binding when appropriate
  > test/http: Read full server response before sending next
  > doc/lambda-coroutine-fiasco: fix a syntax error
  > util/source_location-compat: use __cpp_consteval
  > Fix incorrect class name in documentation.
  > Add support for missing HTTP PATCH method.

Closes scylladb/scylladb#17268
2024-02-12 12:21:47 +02:00
Patryk Wrobel
9fccd968d3 test_tablets.py: implement test_tablet_count_metric_per_shard
This change introduces a new test that verifies the
functionality related to tablet_count metric.

It checks if tablet_count metric is correctly reported
and updated when new tables are created, when tables
are dropped and when `move_tablet` is executed.

Refs: scylladb#16131
Signed-off-by: Patryk Wrobel <patryk.wrobel@scylladb.com>

Closes scylladb/scylladb#17165
2024-02-12 11:49:38 +02:00
Kefu Chai
54995fcac0 test/manual: do not include unused headers
these unused includes were identified by clangd. see
https://clangd.llvm.org/guides/include-cleaner#unused-include-warning
for more details on the "Unused include" warning.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#17255
2024-02-12 11:49:38 +02:00
Asias He
a0e46a6b47 repair: Fix rpc::source and rpc::optional parameter order in rpc message
In a mixed cluster (5.4.1-20231231.3d22f42cf9c3 and
5.5.0~dev-20240119.b1ba904c4977), in the rolling upgrade test, we saw
repair never finishing.

The following was observed:

rpc - client 127.0.0.2:65273 msg_id 5524:  caught exception while
processing a message: std::out_of_range (deserialization buffer
underflow)

It turns out the repair rpc message was not compatible between the two
versions. Even with a rpc stream verb, the new rpc parameters must come
after the rpc::source<> parameter. The rpc::source<> parameter is not
special: it does not have to be the last parameter.

For example, it should be:

void register_repair_get_row_diff_with_rpc_stream(
    std::function<future<rpc::sink<repair_row_on_wire_with_cmd>> (
        const rpc::client_info& cinfo, uint32_t repair_meta_id,
        rpc::source<repair_hash_with_cmd> source, rpc::optional<shard_id> dst_cpu_id_opt)>&& func);

not:

void register_repair_get_row_diff_with_rpc_stream(
    std::function<future<rpc::sink<repair_row_on_wire_with_cmd>> (
        const rpc::client_info& cinfo, uint32_t repair_meta_id,
        rpc::optional<shard_id> dst_cpu_id_opt, rpc::source<repair_hash_with_cmd> source)>&& func);
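
Why the order matters can be shown with a toy positional decoder. This is a sketch under the assumption that a receiver reads parameters in declaration order and that an optional parameter is treated as absent only when the buffer is already empty. `unmarshal` and `BufferUnderflow` are hypothetical names, not Seastar APIs:

```python
class BufferUnderflow(Exception):
    """Stands in for std::out_of_range (deserialization buffer underflow)."""


def unmarshal(fields, signature):
    """Decode `fields` positionally against a list of (name, is_optional)."""
    remaining = list(fields)
    result = {}
    for name, is_optional in signature:
        if remaining:
            result[name] = remaining.pop(0)
        elif is_optional:
            result[name] = None  # an old sender did not send this parameter
        else:
            raise BufferUnderflow(f"deserialization buffer underflow at {name!r}")
    return result


# What a node running the old version sends: no dst_cpu_id_opt.
OLD_SENDER_FIELDS = [42, "<source>"]

NEW_PARAM_LAST = [("repair_meta_id", False), ("source", False),
                  ("dst_cpu_id_opt", True)]
NEW_PARAM_BEFORE_SOURCE = [("repair_meta_id", False), ("dst_cpu_id_opt", True),
                           ("source", False)]

print(unmarshal(OLD_SENDER_FIELDS, NEW_PARAM_LAST))
```

With the optional parameter last, the old message decodes and `dst_cpu_id_opt` is simply absent. With the optional parameter placed before the source, the decoder consumes the source's data as `dst_cpu_id_opt` and then underflows when it reaches `source`.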

Fixes #16941

Closes scylladb/scylladb#17156
2024-02-12 09:50:30 +02:00
Nadav Har'El
13e16475fa cql-pytest: fix skipping of tests on Cassandra or old Scylla
Recently we added a trick to allow running cql-pytests either with or
without tablets. A single fixture test_keyspace uses two separate
fixtures test_keyspace_tablets or test_keyspace_vnodes, as requested.

The problem is that even if test_keyspace doesn't use its
test_keyspace_tablets fixture (it doesn't, if the test isn't
parameterized to ask for tablets explicitly), it's still a fixture,
and it causes the test to be skipped. This causes every test to be
skipped when running on Cassandra or old Scylla which doesn't support
tablets.

The fix is simple - the internal fixture test_keyspace_tablets should
yield None instead of skipping. It is the caller, test_keyspace, which
now skips the test if tablets are requested but test_keyspace_tablets
is None.
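
The difference between the two designs can be modeled in plain Python, without pytest. In this sketch `Skipped` stands in for the exception that `pytest.skip()` raises, and all function names and keyspace strings are hypothetical:

```python
class Skipped(Exception):
    """Stands in for the exception raised by pytest.skip()."""


def tablets_fixture_before(server_supports_tablets):
    # Old behavior: the inner fixture itself skips.
    if not server_supports_tablets:
        raise Skipped("tablets are not supported")
    return "ks_tablets"


def tablets_fixture_after(server_supports_tablets):
    # New behavior: report "not available" and let the caller decide.
    return "ks_tablets" if server_supports_tablets else None


def resolve_keyspace(tablets_fixture, server_supports_tablets, wants_tablets):
    # Like pytest, set up every fixture the outer fixture depends on,
    # whether or not its value ends up being used.
    ks_tablets = tablets_fixture(server_supports_tablets)
    ks_vnodes = "ks_vnodes"
    if not wants_tablets:
        return ks_vnodes
    if ks_tablets is None:
        raise Skipped("tablets requested but not supported")
    return ks_tablets


print(resolve_keyspace(tablets_fixture_after, False, False))
```

On a server without tablets, a test that only needs vnodes now gets its keyspace. With `tablets_fixture_before` the same call raises `Skipped` before the caller can choose.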

Fixes #17266

Signed-off-by: Nadav Har'El <nyh@scylladb.com>

Closes scylladb/scylladb#17267
2024-02-11 21:03:25 +02:00
Kefu Chai
f990ea9678 tools/scylla-nodetool: implement describecluster
Refs #15588
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#17240
2024-02-11 20:21:07 +02:00
Avi Kivity
14bf09f447 Merge 'utils: managed_bytes: optimize memory usage for small buffers' from Michał Chojnowski
managed_bytes is implemented as a chain of blob_storage objects.
Each blob_storage contains 24 bytes of metadata. But in the most
common case -- when there is only a single element in the chain --
16 bytes of this metadata is trivial/unused.

This is a regrettable waste because managed_bytes is used for every
database cell in the memtables and cache. It means that every value
of size >= 7 bytes (smaller ones fit in the inline storage of
managed_bytes) receives 16 bytes of useless overhead.

To correct that, this series adds to managed_bytes an alternative storage
layout -- used for buffers small enough to fit in one fragment -- which only
stores the necessary minimum of metadata. (That is: a pointer to the parent,
to facilitate moving the storage during memory defragmentation).

This saves 16 bytes on every cell greater than 15 bytes. Which includes e.g.
every live cell with value bigger than 6 bytes, which likely applies to most cells.

Before:
```
$ build/release/scylla perf-simple-query --duration 10
median 218692.88 tps ( 61.1 allocs/op,  13.1 tasks/op,   41762 insns/op,        0 errors)
$ build/release/scylla perf-simple-query --duration 10 --write
median 173511.46 tps ( 58.3 allocs/op,  13.2 tasks/op,   53258 insns/op,        0 errors)
$ build/release/test/perf/mutation_footprint_test -c1 --row-count=20 --partition-count=100 --data-size=8 --column-count=16
 - in cache:     2580222
 - in memtable:  2549852
```

After:
```
$ build/release/scylla perf-simple-query --duration 10
median 218780.89 tps ( 61.1 allocs/op,  13.1 tasks/op,   41763 insns/op,        0 errors)
$ build/release/scylla perf-simple-query --duration 10 --write
median 173105.78 tps ( 58.3 allocs/op,  13.2 tasks/op,   52913 insns/op,        0 errors)
$ build/release/test/perf/mutation_footprint_test -c1 --row-count=20 --partition-count=100 --data-size=8 --column-count=16
 - in cache:     2068238
 - in memtable:  2037696
```
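
The footprint figures are consistent with a saving of 16 bytes per cell. A quick check, assuming the benchmark creates partition-count x row-count x column-count cells and that every 8-byte value uses the new compact layout (both are inferences from the command line above, not statements from the series):

```python
PARTITIONS, ROWS, COLUMNS = 100, 20, 16  # from the command line above
SAVED_BYTES_PER_CELL = 16

# (before, after) footprints in bytes, as reported above.
FOOTPRINT = {
    "cache": (2580222, 2068238),
    "memtable": (2549852, 2037696),
}


def expected_saving() -> int:
    return PARTITIONS * ROWS * COLUMNS * SAVED_BYTES_PER_CELL


def observed_savings() -> dict:
    return {where: before - after for where, (before, after) in FOOTPRINT.items()}


print(expected_saving())
print(observed_savings())
```

The expected saving is 512000 bytes. The observed differences are 511984 bytes in cache and 512156 bytes in memtable, both within 0.1% of that figure.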

Closes scylladb/scylladb#14263

* github.com:scylladb/scylladb:
  utils: managed_bytes: optimize memory usage for small buffers
  utils: managed_bytes: rewrite managed_bytes methods in terms of managed_bytes_view
2024-02-11 16:43:40 +02:00
Kefu Chai
cfb2c2c758 db: add formatter for gc_clock::time_point
before this change, we rely on the default-generated fmt::formatter
created from operator<<, but fmt v10 dropped the default-generated
formatter.

in this change, we define formatters for `gc_clock::time_point`,
and drop its operator<<.

Refs #13245

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#17254
2024-02-11 16:39:25 +02:00
Kefu Chai
33224cc10b sstables/storage: avoid unnecessary type cast
the type of `_dir` was changed to fs::path back in 637dd730, there
is no need to cast `_dir` to fs::path anymore.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#17256
2024-02-11 16:37:05 +02:00
Benny Halevy
2ed29e31db gms: inet_address: make constructors explicit
In particular, `inet_address(const sstring& addr)` is
dangerous, since a function like
`topology::get_datacenter(inet_address ep)`
might accidentally convert a `sstring` argument
into an `inet_address` (which would most likely
throw an obscure std::invalid_argument if the datacenter
name does not look like an inet_address).

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>

Closes scylladb/scylladb#17260
2024-02-11 15:44:13 +02:00
Benny Halevy
136df58cbc data_value: delete data_value(T*) constructor
Currently, since the data_value(bool) ctor
is implicit, pointers of any kind are implicitly
convertible to data_value via intermediate conversion
to `bool`.

This is error prone, since it allows unsafe comparison
between e.g. an `sstring` with `some*` by implicit
conversion of both sides to `data_value`.

For example:
```
    sstring name = "dc1";
    struct X {
        sstring s;
    };
    X x(name);
    auto p = &x;
    if (name == p) {}
```

Refs #17261

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>

Closes scylladb/scylladb#17262
2024-02-11 15:42:55 +02:00
Benny Halevy
f86a5072d6 gossiper: add_local_application_state: drop internal error
After 1d07a596bf that
dropped before_change notifications there is no sense
in getting the local endpoint_state_ptr twice: before
and after the notifications and call on_internal_error
if the state isn't found after the notifications.

Just throw the runtime_error if the endpoint state is not
found, otherwise, use it.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2024-02-11 13:33:26 +02:00
Benny Halevy
ac83df4875 transport: controller: do_start_server: do not set_cql_read for maintenance port
RPC is not ready yet at this point, so we should not
set this application state yet.

This is indicated by the following warning from
`gossiper::add_local_application_state`:
```
WARN  2024-01-22 23:40:53,978 [shard 0:stmt] gossip - Fail to apply application_state: std::runtime_error (endpoint_state_map does not contain endpoint = 127.227.191.13, application_states = {{RPC_READY -> Value(1,1)}})
```

That should really be an internal error, but
it can't because of this bug.

Fixes #16932

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2024-02-11 11:49:52 +02:00
Kefu Chai
d7a404e1ec alternator: add formatter for alternator::calculate_value_caller
before this change, we rely on the default-generated fmt::formatter
created from operator<<, but fmt v10 dropped the default-generated
formatter.

in this change, we define formatters for `alternator::calculate_value_caller`,
and drop its operator<<.

Refs #13245

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#17259
2024-02-11 11:49:46 +02:00
Michał Chojnowski
5a3e4a1cc0 utils: managed_bytes: optimize memory usage for small buffers
managed_bytes is implemented as a chain of blob_storage objects.
Each blob_storage contains 24 bytes of metadata. But in the most
common case -- when there is only a single element in the chain --
16 bytes of this metadata is trivial/unused.

This is a regrettable waste because managed_bytes is used for every
database cell in the memtables and cache. It means that every value
of size >= 7 bytes (smaller ones fit in the inline storage of
managed_bytes) receives 16 bytes of useless overhead.

To correct that, this patch adds to managed_bytes an alternative storage
layout -- used for buffers small enough to fit in one contiguous
fragment -- which only stores the necessary minimum of metadata.
(That is: a pointer to the parent, to facilitate moving the storage during
memory defragmentation).
2024-02-09 20:56:20 +01:00
Tomasz Grabiec
1eedc85990 test: py: tablets: Fix flakiness of test_tablet_missing_data_repair
Reimplement stop/start sequence using rolling_restart() which is safe
with regards to UP status propagation and not prone to sudden
connection drop which may cause later CQL queries to time out. It also
ensures that CQL is up on all the remaining nodes when the with_down
callback is executed.

Hopefully: Fixes #17107
2024-02-09 20:37:06 +01:00
Tomasz Grabiec
27ed2d94fc test: pylib: manager_client: Wait for driver to catch up in rolling_restart()
For sanity of the developers who want to execute CQL queries after
rolling restarts.
2024-02-09 20:35:41 +01:00
Tomasz Grabiec
3ce4ec796a test: pylib: manager_client: Accept callback in rolling_restart() to execute with node down
2024-02-09 20:35:41 +01:00
Pavel Emelyanov
7a710425f0 streaming: Open-code on-stack lambda
It just wraps one if, no benefit in keeping it this way

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>

Closes scylladb/scylladb#17250
2024-02-09 20:31:09 +01:00
Petr Gusev
4554653ad9 storage_proxy: add a test for stop_remote
This patch adds a reproducer test for issue #16382.
See scylladb/seastar#2044 for details of the problem.

The test is enabled only in dev mode since it requires
error injection mechanism. The patch adds a new injection
into storage_proxy::handle_read to simulate the problem
scenario - the node is shutting down and there are some
unfinished pending replica requests.

Closes scylladb/scylladb#16776
2024-02-09 17:23:13 +01:00
Michał Chojnowski
277a31f0ae utils: managed_bytes: rewrite managed_bytes methods in terms of managed_bytes_view
Some methods of managed_bytes contain the logic needed to read/write the
contents of managed_bytes, even though this logic is already present in
managed_bytes_{,mutable}_view.

Reimplementing those methods by using the views as intermediates allows us to
remove some code and makes the responsibilities cleaner -- after the change,
managed_bytes contains the logic of allocating and freeing the storage,
while views provide read/write access to the storage.

This change will simplify the next patch which changes the internals of
managed_bytes.
2024-02-09 17:00:33 +01:00
Botond Dénes
ba89b86913 Update tools/java submodule
* tools/java c75ce2c1...5e11ed17 (1):
  > bin/nodetool-wrapper: pass all args to nodetool for testings its ability
2024-02-09 16:34:47 +01:00