Commit Graph

43236 Commits

Author SHA1 Message Date
Kefu Chai
9f0b60c7a0 rust: disable incremental build for release build
so that the release build is reproducible. a reproducible build helps
developers perform postmortem debugging.

Fixes #19225
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#19374
2024-06-20 12:01:14 +03:00
Anna Stuchlik
680405b465 doc: separate Entrprise- from OSS-only content
This commit adds files that contain Open Source-specific information
and includes these files with the .. scylladb_include_flag:: directive.
The files include a) a link and b) Table of Contents.

The purpose of this update is to enable adding
Open Source/Enterprise-specific information in the Reference section.

Closes scylladb/scylladb#19362
2024-06-20 11:58:32 +03:00
Piotr Dulikowski
75441ee120 Merge 'mv: fix value of the gossiped view update backlog' from Wojciech Mitros
Currently, when calculating the view update backlog for gossip,
we start with `db::view::update_backlog()` and compare it to backlogs
from all shards. However, this backlog can't be compared to other
backlogs - it has size 0 and we compare the fraction current/size
when comparing backlogs, causing us to compare with `NaN`.
This patch fixes it by starting the comparisons with an empty backlog.

The patch introducing this issue (f70f774e40) wasn't backported, so this one doesn't need to be either

Closes scylladb/scylladb#19247

* github.com:scylladb/scylladb:
  mv: make the view update backlog unmofidiable
  mv: fix value of the gossiped view update backlog
2024-06-20 06:27:11 +02:00
Piotr Dulikowski
78a40dbe2c Merge 'cql: remove global_req_id from schema_altering_statement' from Marcin Maliszkiewicz
Such field is no longer needed as the information comes
directly from group0_batch.

Fixes scylladb/scylladb#19365

Backport: no, we don't backport code cleanups

Closes scylladb/scylladb#19366

* github.com:scylladb/scylladb:
  cql: remove global_req_id from schema_altering_statement
  cql: switch alter keyspace prepare_schema_mutations to use group0_batch
2024-06-20 06:21:48 +02:00
Dawid Medrek
c56de90a26 test/boost/hint_test.cc: Add missing parse() callback
Before these changes, compilation was failing with the following
error:

In file included from test/boost/hint_test.cc:12:
/usr/include/fmt/ranges.h:298:7: error: no member named 'parse' in 'fmt::formatter<db::hints::sync_point::host_id_or_addr>'
  298 |     f.parse(ctx);
      |     ~ ^

We add the missing callback.

Closes scylladb/scylladb#19375
2024-06-19 23:19:33 +02:00
Wojciech Mitros
cde14a5788 mv: make the view update backlog unmofidiable
Currently, a view update backlog may reach an invalid state, when
its max is 0 and its relative_size() is NaN as a result. This can
be achieved either by constructing the backlog with a 0 max or by
modifying the max of an existing backlog. In particular, this
happens when creating the backlog using the default constructor.

In this patch the default constructor is deleted and a check that
the max is different from 0 is added to its constructor - if the
check fails, we construct an empty backlog instead, to handle the
possibility of getting an invalid backlog sent from a node with a
version that's missing this check.
Additionally, we make the backlog's members private, exposing them
only through const getters.
2024-06-19 19:44:57 +02:00
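The invariant described in cde14a5788 can be sketched in Python. This is only an illustration, not the C++ code from the patch: the class name `UpdateBacklog`, its field names, and the choice of `sys.maxsize` as the max of an "empty" backlog are assumptions made for the example. Note also that Python raises `ZeroDivisionError` on integer division by zero, where the original C++ code produced `NaN`.

```python
import sys
from dataclasses import dataclass


@dataclass(frozen=True)  # frozen: fields cannot be modified after construction
class UpdateBacklog:
    """Hypothetical stand-in for a view update backlog (names are assumptions)."""
    current: int
    max: int

    def __post_init__(self):
        # A zero max would make relative_size() undefined, so fall back to an
        # "empty" backlog instead of keeping the invalid value.
        if self.max == 0:
            object.__setattr__(self, "current", 0)
            object.__setattr__(self, "max", sys.maxsize)

    def relative_size(self) -> float:
        return self.current / self.max


# An invalid backlog, e.g. one received from a node without the check,
# is replaced by an empty one.
received = UpdateBacklog(current=10, max=0)
print(received.relative_size())
```

Because the dataclass has no field defaults, `UpdateBacklog()` with no arguments fails, which plays the role of the deleted default constructor in this sketch.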
Pavel Emelyanov
5fe4290f66 gitattributes: Mark swagger .js files as binary
The goal is the same as in 29768a2d02 (gitattributes: Mark *.svg as
binary) -- prevent grep from searching patterns in those files.

Although those files are, in fact, javascript code, the way they are
formatted is not suitable for human reading, so it's unlikely that anyone
would be interested in grep-ing patterns in them. At the same time, those
files consist of very long lines, so if a grep finds a pattern in one
of those, the output is spoiled.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>

Closes scylladb/scylladb#19357
2024-06-19 15:07:56 +03:00
Botond Dénes
9d1fa828be Merge 'utils/large_bitset: replace reserve_partial with utils::reserve_gently' from Lakshmi Narayanan Sreethar
Replace the reserve_partial loop in large_bitset constructor with a new function - reserve_gently() that can reserve memory without stalling by repeatedly calling reserve_partial() method of the passed container.

Closes scylladb/scylladb#19361

* github.com:scylladb/scylladb:
  utils/large_bitset: replace reserve_partial with utils::reserve_gently
  utils/stall_free: introduce reserve_gently
2024-06-19 14:31:59 +03:00
Piotr Dulikowski
7567b87e72 Merge 'auth: reuse roles select query during cache population' from Marcin Maliszkiewicz
With a big number of shards in the cluster (e.g. 500+), the periodic
cache refresh puts a high load on the role_permissions table
(e.g. 1k op/s). The load on the roles table is amplified because
populating a single entry in the cache takes several selects on the
roles table. Some of this can't be avoided because roles are arranged
in a tree-like structure where permissions can be inherited.

This patch tries to reuse queries which are simply duplicated. It should
reduce the load on roles table by up to 50%.

Fixes scylladb/scylladb#19299

Closes scylladb/scylladb#19300

* github.com:scylladb/scylladb:
  auth: reuse roles select query during cache population
  auth: coroutinize service::get_uncached_permissions
  auth: coroutinize service::has_superuser
2024-06-19 07:53:47 +02:00
Marcin Maliszkiewicz
56707e2965 cql: remove global_req_id from schema_altering_statement
Such field is no longer needed as the information comes
directly from group0_batch.

Fixes scylladb/scylladb#19365
2024-06-18 20:26:09 +02:00
Lakshmi Narayanan Sreethar
9ad800cfb9 utils/large_bitset: replace reserve_partial with utils::reserve_gently
Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com>
2024-06-18 23:36:30 +05:30
Lakshmi Narayanan Sreethar
31414f54c6 utils/stall_free: introduce reserve_gently
Add reserve_gently() that can reserve memory without stalling by
repeatedly calling reserve_partial() method of the passed container.
Update the comments of existing reserve_partial() methods to mention
this newly introduced reserve_gently() wrapper.
Also, add test to verify the functionality.

Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com>
2024-06-18 23:36:30 +05:30
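The idea behind reserve_gently() in 31414f54c6 (call reserve_partial() repeatedly and yield between calls) can be sketched as below. The real code is C++ on Seastar; this Python version is an analogy only, and the `ChunkedBuffer` class, its chunk size, and the use of `asyncio.sleep(0)` as the yield point are all assumptions for illustration.

```python
import asyncio


class ChunkedBuffer:
    """Hypothetical container that grows its capacity one chunk at a time."""
    CHUNK = 4

    def __init__(self):
        self.capacity = 0

    def reserve_partial(self, n: int) -> None:
        # Grow by at most one chunk per call so a single call stays cheap.
        self.capacity = min(n, self.capacity + self.CHUNK)


async def reserve_gently(container: ChunkedBuffer, n: int) -> int:
    """Reserve capacity n, yielding to the event loop between steps."""
    steps = 0
    while container.capacity < n:
        container.reserve_partial(n)
        steps += 1
        await asyncio.sleep(0)  # give other tasks a chance to run
    return steps


buf = ChunkedBuffer()
steps = asyncio.run(reserve_gently(buf, 10))
print(buf.capacity, steps)
```

Wrapping the loop in one helper means callers such as a bitset constructor do not each have to repeat the reserve-and-yield loop themselves.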
Marcin Maliszkiewicz
685aecde61 cql: switch alter keyspace prepare_schema_mutations to use group0_batch
This is needed to simplify the code in the following commit.
2024-06-18 19:54:55 +02:00
Pavel Emelyanov
f7d5d4877c Merge '[test.py] Fix several issues in log gathering' from Andrei Chekun
Related: https://github.com/scylladb/scylladb/issues/17851

Fix the issue that test logs were not deleted
Fix the issue that the URL to the failed test directory was incorrectly shown even when artifacts_dir_url option was not provided
Fix the issue that there were no node logs when it failed to join the cluster

Closes scylladb/scylladb#19115

* github.com:scylladb/scylladb:
  [test.py] Fix logs had multiplication of lines
  [test.py] Fix log not deleted
  [test.py] Fix log for failed node was nod added to failed directory
  [test.py] Fix URl for failed logs directory in CI
2024-06-18 15:37:29 +03:00
Botond Dénes
2123b22526 Merge 'doc: add 6.x.y to 6.x.z and remove 5.x.y to 5.x.z upgrade guide' from Anna Stuchlik
This PR removes the 5.x.y to 5.x.z upgrade guide and adds the 6.x.y to 6.x.z upgrade guide.

The previous maintenance upgrade guides, such as from 5.x.y to 5.x.z, consisted of several documents - separate for each platform.
The new 6.x.y to 6.x.z upgrade guide is one document - there are tabs to include platform-specific information (we've already done it for other upgrade guides as one generic document is more convenient to use and maintain).

I did not modify the procedures. At some point, they have been reviewed for previous upgrade guides.

Fixes https://github.com/scylladb/scylladb/issues/19322

-  This PR must be backported to branch-6.0, as it adds 6.x specific content.

Closes scylladb/scylladb#19340

* github.com:scylladb/scylladb:
  doc: remove the 5.x.y to 5.x.z upgrade guide
  doc: add the 6.x.y to 6.x.z upgrade guide-6
2024-06-18 14:24:38 +03:00
Wojciech Mitros
1de5566cfa mv: fix value of the gossiped view update backlog
Currently, when calculating the view update backlog for gossip,
we start with `db::view::update_backlog()` and compare it to backlogs
from all shards. However, this backlog can't be compared to other
backlogs - it has size 0 and we compare the fraction current/size
when comparing backlogs, causing us to compare with `NaN`.
This patch fixes it by starting the comparisons with an empty backlog.
2024-06-18 13:15:18 +02:00
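The failure mode described in 1de5566cfa comes from IEEE 754 semantics: every ordered comparison with `NaN` is false. A minimal Python sketch, assuming the backlogs are reduced with a running "keep the larger fraction" loop (how ScyllaDB actually writes the comparison is not shown in the commit message, and the function names here are hypothetical):

```python
import math


def relative_size(current: float, size: float) -> float:
    """Fraction current/size; 0/0 is modelled as NaN, as in IEEE 754."""
    if size == 0:
        return math.nan if current == 0 else math.inf
    return current / size


def largest(start: float, shard_fractions: list) -> float:
    """Running maximum over the shard fractions, beginning from `start`."""
    result = start
    for fraction in shard_fractions:
        if result < fraction:  # always False when result is NaN
            result = fraction
    return result


shards = [relative_size(5, 100), relative_size(40, 100)]

# Starting from a size-0 backlog: NaN never compares less than anything,
# so the real shard backlogs are ignored.
bad = largest(relative_size(0, 0), shards)

# Starting from an empty but valid backlog gives the expected maximum.
good = largest(relative_size(0, 100), shards)

print(math.isnan(bad), good)
```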
Kefu Chai
87247c6542 .github: add workflow to build with latest seastar
so we know whether scylla builds with seastar master HEAD,
and can prepare if a build failure is found.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#19135
2024-06-18 13:34:43 +03:00
Andrei Chekun
6a4b441bf2 [test.py] Fix logs had multiplication of lines
Since the test name was not unique across the run, using the --repeat option produced several handlers for the same file. With this change the test name, and accordingly the log name, is different for each repeat of the same test. Remove the mode from the test name since it is already in the mode directory.
2024-06-18 11:14:07 +02:00
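The "multiplication of lines" fixed in 6a4b441bf2 is a general property of Python's `logging` module: each handler attached to a logger emits every record, so attaching one handler per repeat to the same logger repeats every line. The sketch below reproduces the effect in isolation; the logger names and the `<test>.<run_id>` naming scheme are assumptions, not the names test.py uses.

```python
import io
import logging


def attach_handler(logger_name: str, stream: io.StringIO) -> logging.Logger:
    """Attach a new handler writing to `stream`, as a test log setup might."""
    logger = logging.getLogger(logger_name)
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep records away from parent loggers
    logger.addHandler(logging.StreamHandler(stream))
    return logger


# Same name for every repeat: handlers pile up on one logger.
shared = io.StringIO()
for _ in range(3):
    log = attach_handler("repeat_demo_shared", shared)
log.info("hello")
duplicated = shared.getvalue().count("hello")

# Unique name per repeat: each logger has exactly one handler.
unique = io.StringIO()
for run_id in range(1, 4):
    log = attach_handler(f"repeat_demo_unique.{run_id}", unique)
log.info("hello")
single = unique.getvalue().count("hello")

print(duplicated, single)
```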
Andrei Chekun
b01a5f9bd9 [test.py] Fix log not deleted
One of the created log files was never deleted because there was no delete command. The unlink is now done explicitly at a later stage, after removing the handler that writes to this file, to avoid the possibility that something is written after the file is removed.
2024-06-18 11:14:01 +02:00
Kefu Chai
0a74d45425 build: cmake: add commitlog_cleanup_test
in 94cdfcaa94, we added commitlog_cleanup_test to `configure.py`,
but didn't add it to the CMake building system.

in this change, let's add it to the CMake building system.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#19314
2024-06-18 12:12:28 +03:00
Kefu Chai
68ef7dda79 config: correct the comment on printable_to_json()
seastar::format() does not use operator<< under the hood, it uses
{fmt}, so update the comment accordingly.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#19315
2024-06-18 12:08:59 +03:00
Nadav Har'El
2ec1e0f0d5 test/cql-pytest: tests verifying UUID sort order
In issue #15561 some doubts were raised regarding the way ScyllaDB sorts
UUID values. This patch adds a heavily-commented cql-pytest test that
helps understand, and verify that understanding of, the way Scylla sorts
UUIDs, and shows there is some reason in the madness (in particular,
Version 1 UUIDs (time uuids) are sorted like timeuuids, and not as byte
arrays).

The new tests check the different cases (see the comments in the test),
and as usual for cql-pytest tests - they pass also on Cassandra, which
allows us to confirm that the sort order we used is identical to the one
used by Cassandra and not something that Scylla mis-implemented.

Having this test in our suite will also ensure that the UUID ordering
never changes accidentally in the future. If it ever changes, it can
break access to existing tables that use UUID clustering keys, so
it shouldn't change.

Fixes #15561

Signed-off-by: Nadav Har'El <nyh@scylladb.com>

Closes scylladb/scylladb#19343
2024-06-18 12:05:30 +03:00
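The point made in 2ec1e0f0d5 about Version 1 UUIDs can be seen with Python's standard `uuid` module: a version 1 UUID stores the low bits of its timestamp first, so sorting by raw bytes does not give chronological order. This sketch shows only that timestamp-versus-bytes difference; it does not reproduce ScyllaDB's or Cassandra's full comparison rules, and the field values are arbitrary.

```python
import uuid


def make_v1(time_low: int, time_mid: int) -> uuid.UUID:
    """Build a version 1 UUID from raw timestamp fields (other fields fixed)."""
    # fields: time_low, time_mid, time_hi_version, clock_seq_hi_variant,
    #         clock_seq_low, node
    return uuid.UUID(fields=(time_low, time_mid, 0x1000, 0x80, 0x00, 0x0123456789AB))


earlier = make_v1(time_low=0xFFFFFFFF, time_mid=0)  # timestamp 0xFFFFFFFF
later = make_v1(time_low=0x00000000, time_mid=1)    # timestamp 0x100000000

by_time = sorted([later, earlier], key=lambda u: u.time)
by_bytes = sorted([later, earlier], key=lambda u: u.bytes)

print(by_time == [earlier, later])   # chronological order
print(by_bytes == [later, earlier])  # byte order puts the later UUID first
```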
Pavel Emelyanov
147552c34a Merge 'configurable maintenance (streaming) semaphore count resource limit' from Botond Dénes
Make the count resources on the maintenance (streaming) semaphore live-updatable via config. This will allow us to improve repair speed on mixed-shard clusters, where we suspect that reader thrashing -- due to the combination of a high number of readers on each shard and a very conservative reader count limit (10) -- is the main cause of the slowness.
Making this count limit configurable allows us to start experimenting with this fix, without committing to a count limit increase (or removal), addressing the pain in the field.

Refs: #18269

No OSS backport needed.

Closes scylladb/scylladb#19248

* github.com:scylladb/scylladb:
  replica/database: wire in maintenance_reader_concurrency_semaphore_count_limit
  db/config: introduce maintenance_reader_concurrency_semaphore_count_limit
  reader_concurrency_semaphore: make count parameter live-update
2024-06-18 12:02:24 +03:00
Gleb Natapov
fb764720d3 topology coordinator: add more trace level logging for debugging
Add more logging that provides more visibility into what happens during
topology loading.

Message-ID: <ZnE5OAmUbExVZMWA@scylladb.com>
2024-06-18 10:34:03 +02:00
Botond Dénes
1acc57e19d Merge 'schema: Make "describe" use extensions to string' from Calle Wilund
Fixes #19334

Current impl uses hardcoded printing of a few extensions.
Instead, use extension options to string and print all.

Note: required to make enterprise CI happy again.

Closes scylladb/scylladb#19337

* github.com:scylladb/scylladb:
  schema: Make "describe" use extensions to string
  schema_extensions: Add an option to string method
2024-06-18 11:28:11 +03:00
Botond Dénes
495f7160da Update tools/jmx submodule
* tools/jmx 53696b13...3328a229 (1):
  > scylla-apiclient: add missing license for SBOM report
2024-06-18 11:11:57 +03:00
Andrei Chekun
3c921d5712 Add allure pytest adaptor to the toolchain
Add allure-pytest pip dependency to be able to use it for generating the allure report later.
Main benefits of the allure report:
1. Group test failures
2. Possibility to attach log files to the test itself
3. Timeline of test run
4. Test description on the report
5. Search by test name or tag

[avi: regenerate toolchain]

Closes scylladb/scylladb#19335
2024-06-17 23:17:01 +03:00
Nadav Har'El
4faceeaa33 Merge 'treewide: drop thrift support' from Kefu Chai
thrift support has been deprecated since ScyllaDB 5.2

> Thrift API - legacy ScyllaDB (and Apache Cassandra) API is
> deprecated and will be removed in followup release. Thrift has
> been disabled by default.

so let's drop it. in this change,

* thrift protocol support is dropped
* all references to thrift support in document are dropped
* the "thrift_version" column in system.local table is preserved for backward compatibility, as we could load from an existing system.local table which still contains this column, so we need to write this column as well.
* "/storage_service/rpc_server" is only preserved for backward compatibility with java-based nodetool.

Fixes #3811
Fixes #18416
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

- [x] not a fix, no need to backport

Closes scylladb/scylladb#18453

* github.com:scylladb/scylladb:
  config: expand on rpc_keepalive's description
  api: s/rpc/thrift/
  db/system_keyspace: drop thrift_version from system.local table
  transport: do not return client_type from cql_server::connection::make_client_key()
  treewide: drop thrift support
2024-06-17 22:36:49 +03:00
Andrei Chekun
8845978ec5 [test.py] Unbreak cql-pytest and alternator
Provide possibility to run pytest without explicitly providing mode parameter

Closes scylladb/scylladb#19342
2024-06-17 21:41:09 +03:00
Piotr Dulikowski
85128c5b10 Merge 'cql3: always return created event in create keyspace statement' from Marcin Maliszkiewicz
cql3: always return created event in create ks/table/type/view statement

In case multiple clients concurrently issue CREATE KEYSPACE IF NOT EXISTS
and later USE KEYSPACE, it can happen that the schema in the driver's
session is out of sync, because it syncs when it receives a special
message from the CREATE KEYSPACE response.

Similar situation occurs with other schema change statements.

In this patch we fix only create keyspace/table/type/view statements
by always sending created event. Behavior of any other schema altering
statements remains unchanged.

Fixes https://github.com/scylladb/scylladb/issues/16909

**backport: no, it's not a regression**

Closes scylladb/scylladb#18819

* github.com:scylladb/scylladb:
  cql3: always return created event in create ks/table/type/view statement
  cql3: auth: move auto-grant closer to resource creation code
  cql3: extract create ks/table/type/view event code
2024-06-17 19:58:38 +02:00
Anna Stuchlik
ea35982764 doc: remove the 5.x.y to 5.x.z upgrade guide
This commit removes the upgrade guide from 5.x.y to 5.x.z.
It is redundant in version 6.x.
2024-06-17 17:28:39 +02:00
Anna Stuchlik
ead201496d doc: add the 6.x.y to 6.x.z upgrade guide-6
This commit adds the upgrade guide from 6.x.y to 6.x.z.
2024-06-17 17:23:00 +02:00
Marcin Maliszkiewicz
95673907ca auth: reuse roles select query during cache population
With a big number of shards in the cluster (e.g. 500+), the periodic
cache refresh puts a high load on the role_permissions table
(e.g. 1k op/s). The load on the roles table is amplified because
populating a single entry in the cache takes several selects on the
roles table. Some of this can't be avoided because roles are arranged
in a tree-like structure where permissions can be inherited.

This patch tries to reuse queries which are simply duplicated. It should
reduce the load on roles table by up to 50%.

Fixes scylladb/scylladb#19299
2024-06-17 16:46:33 +02:00
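The kind of saving described in 95673907ca, reusing the result of a query that would otherwise be issued twice, can be sketched as follows. Everything here is hypothetical: the function names, the row layout, and the two checks are invented for illustration and do not reflect the actual auth service code.

```python
def make_counting_query(table: dict):
    """Return a fake 'select from roles' function plus a call counter."""
    calls = {"n": 0}

    def query(role: str) -> dict:
        calls["n"] += 1
        return table[role]

    return query, calls


def populate_entry(role: str, query):
    """Hypothetical cache population that needs the same row for two checks."""
    fetched = {}

    def select_once(r: str) -> dict:
        # Reuse the row if this population pass already fetched it.
        if r not in fetched:
            fetched[r] = query(r)
        return fetched[r]

    row_for_superuser_check = select_once(role)
    row_for_login_check = select_once(role)  # served from `fetched`
    return row_for_superuser_check["is_superuser"], row_for_login_check["can_login"]


roles = {"alice": {"is_superuser": False, "can_login": True}}
query, calls = make_counting_query(roles)
result = populate_entry("alice", query)
print(result, calls["n"])
```

In this toy case two identical selects collapse into one, which is the situation in which the "up to 50%" reduction mentioned in the commit message would apply.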
Marcin Maliszkiewicz
547eb6d59b auth: coroutinize service::get_uncached_permissions
2024-06-17 16:46:28 +02:00
Marcin Maliszkiewicz
00a24507cb auth: coroutinize service::has_superuser
2024-06-17 16:46:22 +02:00
Kefu Chai
a5a5ca0785 auth: do not include unused headers
these unused includes were identified by clangd. see
https://clangd.llvm.org/guides/include-cleaner#unused-include-warning
for more details on the "Unused include" warning.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#19312
2024-06-17 17:33:55 +03:00
Yaniv Michael Kaul
9b0eb82175 dist/common/scripts/scylla_coredump_setup: fix typo
Does not able -> Unable

Signed-off-by: Yaniv Kaul <yaniv.kaul@scylladb.com>

Closes scylladb/scylladb#19328
2024-06-17 17:33:46 +03:00
Kefu Chai
b64126fe1c db: remove unused operator<<
since we've switched almost all callers of the operator<< to {fmt},
let's drop the unused operator<<:s.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#19313
2024-06-17 17:33:31 +03:00
Calle Wilund
73abc56d79 schema: Make "describe" use extensions to string
Fixes #19334

Current impl uses hardcoded printing of a few extensions.
Instead, use extension options to string and print all.
2024-06-17 13:30:24 +00:00
Calle Wilund
d27620e146 schema_extensions: Add an option to string method
Allow an extension to describe itself as the CQL property
string that created it (and is serialized to schema tables)

Only paxos extension requires override.
2024-06-17 13:30:10 +00:00
Kefu Chai
7e9550e9f9 test/py/minio_server.py: do not reference non-existent old_env
in 51c53d8db6, we check `self.old_env[env]` for None, but there
are chances `self.old_env` does not contain a value with `env`.
in that case, we'd have the following failure:

```
Traceback (most recent call last):
  File "/home/kefu/dev/scylladb/test/pylib/minio_server.py", line 307, in <module>
    asyncio.run(main())
  File "/usr/lib64/python3.12/asyncio/runners.py", line 194, in run
    return runner.run(main)
           ^^^^^^^^^^^^^^^^
  File "/usr/lib64/python3.12/asyncio/runners.py", line 118, in run
    return self._loop.run_until_complete(task)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib64/python3.12/asyncio/base_events.py", line 687, in run_until_complete
    return future.result()
           ^^^^^^^^^^^^^^^
  File "/home/kefu/dev/scylladb/test/pylib/minio_server.py", line 304, in main
    await server.stop()
  File "/home/kefu/dev/scylladb/test/pylib/minio_server.py", line 274, in stop
    self._unset_environ()
  File "/home/kefu/dev/scylladb/test/pylib/minio_server.py", line 211, in _unset_environ
    if self.old_env[env] is not None:
       ~~~~~~~~~~~~^^^^^
KeyError: 'S3_CONFFILE_FOR_TEST'
```

this happens if we run `pylib/minio_server.py` as a standalone
application.

in this change, instead of getting the value with index, we use
`dict.get()`, so that it does not throw when the dict does not
have the given key.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#19291
2024-06-17 12:42:43 +03:00
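The difference that 7e9550e9f9 relies on is standard Python `dict` behaviour: indexing a missing key raises `KeyError`, while `dict.get()` returns `None`. A minimal standalone sketch follows; the variable names mirror the traceback above, but this is not the code from minio_server.py.

```python
old_env = {}                  # nothing was saved, as in the standalone run
env = "S3_CONFFILE_FOR_TEST"

try:
    old_env[env]              # indexing a missing key raises KeyError
    raised = False
except KeyError:
    raised = True

saved = old_env.get(env)      # .get() returns None instead of raising
has_saved_value = saved is not None

print(raised, saved, has_saved_value)
```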
Andrei Chekun
293cf355df [test.py] Fix log for failed node was nod added to failed directory
If something goes wrong while a node is being added to the cluster, the node is not registered as part of the cluster. As a result, logs for such a node are missing during log gathering.
2024-06-17 11:16:55 +02:00
Andrei Chekun
7bbb8d9260 [test.py] Fix URl for failed logs directory in CI
Incorrectly passing the artifacts_dir_url parameter from test.py to pytest leads to None being passed as a string, so pytest generates an incorrect URL.
2024-06-17 11:16:48 +02:00
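The class of bug described in 7bbb8d9260 is easy to reproduce: formatting `None` into a string yields the text "None", which then looks like a real value. The sketch below is illustrative only; the function names and the `failed_test` path component are hypothetical and not taken from test.py.

```python
from typing import Optional


def report_url_naive(artifacts_dir_url) -> str:
    # str(None) is "None", so a missing option still produces a "URL".
    return f"{artifacts_dir_url}/failed_test"


def report_url(artifacts_dir_url: Optional[str]) -> Optional[str]:
    # Only build a URL when the option was actually provided.
    if artifacts_dir_url is None:
        return None
    return f"{artifacts_dir_url}/failed_test"


print(report_url_naive(None))
print(report_url(None))
print(report_url("https://ci.example.invalid/artifacts"))
```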
Aleksandra Martyniuk
fb3153d253 api: task_manager: delete module from full_task_status
Delete module field from full_task_status as it is unused.

Closes scylladb/scylladb#18853
2024-06-17 09:03:19 +03:00
Nadav Har'El
9fc70a28ca test: unflake test test_alternator_ttl_scheduling_group
This test in topology_experimental_raft/test_alternator.py wants to
check that during Alternator TTL's expiration scans, ALL of the CPU was
used in the "streaming" scheduling group and not in the "statement"
scheduling group. But to allow for some fluke requests (e.g., from the
driver), the test actually allows work in the statement group to be
up to 1% of the work.

Unfortunately, in one test run - a very slow debug+aarch64 run - we
saw the work on the statement group reach 1.4%, failing the test.
I don't know exactly where this work comes from, perhaps the driver,
but before this bug was fixed we saw more than 58% of the work in the
wrong scheduling group, so neither 1% nor 1.4% is a sign that the bug
came back. In fact, let's just change the threshold in the test to 10%,
which is also much lower than the pre-fix value of 58%, so is still a
valid regression test.

Fixes #19307

Signed-off-by: Nadav Har'El <nyh@scylladb.com>

Closes scylladb/scylladb#19323
2024-06-17 08:39:38 +03:00
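The reasoning in 9fc70a28ca about the threshold can be checked with simple arithmetic. In this sketch the millisecond figures are made up so that the shares match the percentages quoted in the commit message, and the way the share is computed (statement time over total time) is an assumption about the test, not taken from its source.

```python
def statement_share(statement_ms: float, streaming_ms: float) -> float:
    """Fraction of CPU time spent in the 'statement' scheduling group."""
    return statement_ms / (statement_ms + streaming_ms)


OLD_THRESHOLD = 0.01
NEW_THRESHOLD = 0.10

slow_run = statement_share(14, 986)   # 1.4%, like the slow debug+aarch64 run
regressed = statement_share(58, 42)   # 58%, like the pre-fix behaviour

# The slow run fails the old threshold but passes the new one,
# while a real regression still fails the new threshold.
print(slow_run < OLD_THRESHOLD, slow_run < NEW_THRESHOLD, regressed < NEW_THRESHOLD)
```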
Yaron Kaikov
996be2e235 dbuild: update toolchain to get latest scylla-api-client
a new scylla-api-client was released so that we get proper license
information in our SBOM report.

Refs: https://github.com/scylladb/scylla-jmx/issues/237

Closes scylladb/scylladb#19324
2024-06-17 08:37:49 +03:00
Dawid Medrek
670830091c db/hints: Use dedicated functions to lock a shared mutex
Seastar has functions implementing locking a `seastar::shared_mutex`.
We should use those now instead of reimplementing them in Scylla.

Closes scylladb/scylladb#19253
2024-06-14 20:31:37 +02:00
Kamil Braun
bbb424a757 Merge '[test.py] Add uniqueness to the test name' from Andrei Chekun
In CI, tests are always executed with the option --repeat=3, which generates 3 test results with the same name. The JUnit plugin in CI cannot correctly distinguish between these results. When we have two passes and one failure, the link to the test result will sometimes redirect to the wrong one because the test name is the same. To fix this, a ReportPlugin is added that modifies the test case name during JUnit report generation, adding the mode and run id to the test name.

Fixes: https://github.com/scylladb/scylladb/issues/17851

Fixes: https://github.com/scylladb/scylladb/issues/15973

Closes scylladb/scylladb#19235

* github.com:scylladb/scylladb:
  [test.py] Add uniqueness to the test name
  [test.py] Refactor alternator, nodetool, rest_api
2024-06-14 17:59:07 +02:00
Botond Dénes
5b87fa4cea Merge 'doc: document keyspace and table for nodetool ring' from Kefu Chai
these two arguments are critical when tablets are enabled.

Fixes https://github.com/scylladb/scylladb/issues/19296

---

6.0 is the first release with tablets support. and `nodetool ring` is an important tool to understand the data distribution. so we need to backport this document change to 6.0

Closes scylladb/scylladb#19297

* github.com:scylladb/scylladb:
  doc: document `keyspace` and `table` for `nodetool ring`
  doc: replace tab with space
2024-06-14 16:04:23 +03:00
Kefu Chai
ea3b8c5e4f doc: document keyspace and table for nodetool ring
these two arguments are critical when tablets are enabled.

Fixes #19296
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2024-06-14 21:01:14 +08:00