Commit Graph

47991 Commits

Author SHA1 Message Date
Andrzej Jackowski
e3f052d6fb test: dtest: copied get_node_ip from dtests to scylla_cluster.py
Co-authored-by: Marcin Maliszkiewicz <marcinmal@scylladb.com>
2025-06-05 08:20:09 +02:00
Andrzej Jackowski
40e71ad1e6 test: dtest: copy run_rest_api from dtests to cluster.py
Co-authored-by: Marcin Maliszkiewicz <marcinmal@scylladb.com>
2025-06-05 08:20:09 +02:00
Andrzej Jackowski
3da86f04a5 test: dtest: copy run_in_parallel from dtests to data.py
Co-authored-by: Marcin Maliszkiewicz <marcinmal@scylladb.com>
2025-06-05 08:19:54 +02:00
Andrzej Jackowski
a1b1d810f9 test: audit: copy unmodified audit_test.py from dtests
Copied the entire audit_test.py from scylladb/scylla-dtest, to remove
the entire file from scylla-dtest after this patch series is merged.
The motivation is to move entire audit testing to from dtests,
to make it easier to maintain and more reliable.

Changed suite.yaml, to prevent audit_test.py from running because
audit_test.py needs improvement before it starts passing.

Co-authored-by: Marcin Maliszkiewicz <marcinmal@scylladb.com>
2025-06-05 08:19:44 +02:00
Nadav Har'El
6cbcabd100 alternator: hide internal tags from users
The "tags" mechanism in Alternator is a convenient way to attach metadata
to Alternator tables. Recently we have started using it more and more for
internal metadata storage:

  * UpdateTimeToLive stores the attribute in a tag system:ttl_attribute
  * CreateTable stores provisioned throughput in tags
    system:provisioned_rcu and system:provisioned_wcu
  * CreateTable stores the table's creation time in a tag called
    system:table_creation_time.

We do not want any of these internal tags to be visible to a
ListTagsOfResource request, because if they are visible (as before this
patch), systems such as Terraform can get confused when they suddenly
see a tag which they didn't set - and may even attempt to delete it
(as reported in issue #24098).

Moreover, we don't want any of these internal tags to be writable
with TagResource or UntagResource: If a user wants to change the TTL
setting they should do it via UpdateTimeToLive - not by writing
directly to tags.

So in this patch we forbid read or write to *any* tag that begins
with the "system:" prefix, except one: "system:write_isolation".
That tag is deliberately intended to be writable by the user, as
a configuration mechanism, and is never created internally by
Scylla. We should have perhaps chosen a different prefix for
configurable vs. internal tags, or chosen more unique prefixes -
but let's not change these historic names now.

This patch also adds regression tests for the internal tags features,
failing before this patch and passing after:
1. internal tags, specifically system:ttl_attribute, are not visible
   in ListTagsOfResource, and cannot be modified by TagResource or
   UntagResource.
2. system:write_isolation is not internal, and be written by either
   TagResource or UntagResource, and read with ListTagsOfResource.

This patch also fixes a bug in the test where we added more checks
for system:write_isolation - test_tag_resource_write_isolation_values.
This test forgot to remove the system:write_isolation tags from
test_table when it ended, which would lead to other tests that run
later to run with a non-default write isolation - something which we
never intended.

Fixes #24098.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>

Closes scylladb/scylladb#24299
2025-06-03 20:40:50 +03:00
Pavel Emelyanov
37e6ff1a3c Merge 'test.py: cql: run tests using bare pytest command' from Evgeniy Naydanov
Create a custom pytest test collector for .cql files and move CQL test execution logic from `CQLApprovalTest` class and `pylib/cql_repl/cql_repl.py` file to `CqlTest.runtest()` method.

In result, the only difference between CQLApproval and Python suite types is suffixes of test files.

Also there is a separate commit to remove dead code:

There is `write_junit_failure_report()` method in Test class which was used to generate a JUnitXML report.  But it became a dead code after removal of `write_junit_report()` function in 1e1d213592 to avoid duplication of error reporting in Jenkins (see https://github.com/scylladb/scylladb/issues/23220.)  This commit removes this method and all its implementations in subclasses.

Closes scylladb/scylladb#24301

* github.com:scylladb/scylladb:
  test.py: cql: don't exit from pytest session on failed CQL
  test.py: cql: run tests using bare pytest command
  test.py: python: set test.id according to --run_id argument
  test.py: python: pass --tmpdir from test.py to all Python tests
  test.py: remove dead code after removing of write_junit_report()
2025-06-03 19:32:06 +03:00
Pavel Emelyanov
24f430c6d2 Merge 'test.py: dtest: port next_gating tests from auth_roles_test.py' from Evgeniy Naydanov
Copy `auth_roles_test.py` from scylla-dtest test suite, remove all not next_gating tests from it, and make it works with `test.py`

As a part of the porting process, copy missed utility functions from scylla-dtest, remove unused imports and markers.

Enable the test in `suite.yaml` (run in dev mode only.)

Closes scylladb/scylladb#24343

* github.com:scylladb/scylladb:
  test.py: dtest: make auth_roles_test.py run using test.py
  test.py: dtest: add wait_for_any_log() to tools/log_utils.py
  test.py: dtest: add part of tools/assertions.py
  test.py: dtest: pickup latest code for retrying.py from dtest
  test.py: dtest: copy unmodified auth_roles_test.py
2025-06-03 18:54:47 +03:00
Patryk Jędrzejczak
8756c233e0 test: test_raft_recovery_user_data: disable hinted handoff
The test is currently flaky, writes can fail with "Too many in flight
hints: 10485936". See scylladb/scylladb#23565 for more details.

We suspect that scylladb/scylladb#23565 is caused by an infrastructure
issue - slow disks on some machines we run CI jobs on.

Since the test fails often and investigation doesn't seem to be easy,
we first deflake the test in this patch by disabling hinted handoff.

For replacing nodes, we provide `cfg` because there should have been
`cfg` in the first place. The test was correct anyway because:
- `tablets_mode_for_new_keyspaces` is set to `true` by default in
  test/cluster/suite.yaml,
- `endpoint_snitch` is set to `GossipingPropertyFileSnitch` by default
  if the property file is provided in `ScyllaServer.__init__`.

Ref scylladb/scylladb#23565

We should backport this patch to 2025.2 because this test is also flaky
on CI jobs using 2025.2. Older branches don't have this test.

Closes scylladb/scylladb#24364
2025-06-03 17:48:42 +02:00
Nadav Har'El
ac70e34de9 test/alternator: verify that DeleteItem returns an empty object
A user on StackOverflow (https://stackoverflow.com/questions/79650278)
reported that DeleteItem returns the apropriate response (an empty
object) on DynamoDB, but doesn't on "DynamoDB Local" (Amazon's local
mock of DynamoDB). I wrote the test in this patch to make sure that
Alternator doesn't have this bug, and indeed it doesn't: When DeleteItem
is used without any option that asks for additional output, its reponse
is, as expected, an empty object.

As usual, the new test passes on both Alternator and AWS DynamoDB.
(I didn't actually test on DynamoDB Local, I have some problems with
running that, but it doesn't matter, we have no intention of testing
DynamoDB Local).

Signed-off-by: Nadav Har'El <nyh@scylladb.com>

Closes scylladb/scylladb#24359
2025-06-03 18:47:34 +03:00
Avi Kivity
744015cf26 test.py: allow cmake configuration and ./configure.py configuration to coexist
Cmake emits its build.ninja into build/, while configure.py emits
build.ninja into ./. test.py uses this difference to choose the directory
structure to test.

The problem is that vscode will randomly call cmake to understand the
directory structure, so we end up with both build.ninja set up.

Invert the logic to look for ./build.ninja to determine the mode (instead
of build/build.ninja which can exist even if the user uses traditional
configuration).

It can still happen that a stray ./build.ninja exists (for example due
to switching branches), but that is rarer than having vscode auto-create
it.

Closes scylladb/scylladb#24269
2025-06-03 16:46:41 +03:00
Piotr Dulikowski
f6669422e1 Merge 'test.py: refactor test facades for better error handling' from Andrei Chekun
Switching to f-string formatting to simplify the code and to unify it with a general approach for formatting strings.
If the log file absent or empty test fails with an error regarding a missing boost log file, however, it's not helpful since it's not a root cause of the fail. Adding logic to log this issue as a warning in a pytest's log file and continue with providing results to the pytest itself.

Closes scylladb/scylladb#24307

* github.com:scylladb/scylladb:
  test.py: enhance boost_facade missing log file handling
  test.py: switch using f-string instead format in facades
2025-06-03 14:03:07 +02:00
Pavel Emelyanov
96029c7c93 Update seastar submodule
* seastar d7ff58f2...26badcb1 (22):
  > http/client: Skip HEAD reply body processing
  > httpd: Remove unused connection::_req member
  > httpd: Don't write body for HEAD replies
  > http: Move trailing chunk write into reply.cc
  > http_client: Add ECONNRESET to retryable errors
  > stall_detector: no backtrace if exception
  > http: Add test for "aborted" client
  > http: in the client, fix malforming of requests with zero-sized bodies
  > http: Track bytes read from a response
  > http: Add test for improper client handling of aborted requests
  > aio_storage_context: Rename iocb_pool::_iocb_pool to _all_iocbs
  > resource: Add some debug-level logging to memory allocation
  > resource: Rework sysconf memory fallback
  > resource: Indentation fix after previous patch
  > resource: Calculate available memory from NUMA nodes
  > resource: Move NUMA nodes vector evaluation up
  > reactor: Drop _reuseport boolean
  > reactor: Simplify network stack creation and initialization
  > reactor: Remove write-only _thread_id
  > reactor: Keep task-queues in std::array instead of static_vector
  > reactor: Mark _id and task_queue::_id const
  > memory: Report oversized alloc count as metric

scylla-gdb update included:

The reactor::_task_queues can be std::array or unique ptrs. Also check
the tq_ptr for being nullptr, as array doesn't have "size" only
"capacity" and can have non-registered groups.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>

Closes scylladb/scylladb#24294
2025-06-03 13:47:05 +03:00
Evgeniy Naydanov
f0d283afd7 test.py: cql: don't exit from pytest session on failed CQL
There is the fixture in `test/cql/conftest.py` which checks
CQL connection after each test and exit from pytest session if
the connection was failed.  For CQL tests it's simply no
difference what to use: pytest.exit() or pytest.fail() because
tests are executing one-by-one in separate pytest sessions.

Change it to pytest.fail() for future integration into a single
pytest session.
2025-06-03 07:54:51 +00:00
Evgeniy Naydanov
cdc4b520da test.py: cql: run tests using bare pytest command
Create a custom pytest test collector for .cql files and
move CQL test execution logic from `CQLApprovalTest` class
and `pylib/cql_repl/cql_repl.py` file to `CqlTest.runtest()`
method.

In result, the only difference between CQLApproval and Python
suite types is suffixes of test files.
2025-06-03 07:54:51 +00:00
Evgeniy Naydanov
0fba0df4f6 test.py: python: set test.id according to --run_id argument
test.py uses `Test.id` attribute to distinguish repeated tests
in one run and pass it as `--run_id` CLI argument to pytest.
Use this argument to set the test's `id` attribute inside pytest
session to fix problem with paths to some test artifacts.
2025-06-03 07:54:51 +00:00
Michał Chojnowski
ea4d251ad2 compress: fix a use-after-free in dictionary_holder::get_recommended_dict()
The function calls copy() on a foreign_ptr
(stored in a map) which can be destroyed
(erased from the map) before the copy() completes.
This is illegal.

One way to fix this would be to apply an rwlock
to the map. Another way is to wrap the `foreign_ptr`
in a `lw_shared_ptr` and extend its lifetime over
the `copy()` call. This patch does the latter.

Fixes scylladb/scylladb#24165
Fixes scylladb/scylladb#24174

Closes scylladb/scylladb#24175
2025-06-03 10:42:38 +03:00
Piotr Dulikowski
f5b18d275b Merge 'test/boost: Adjust tests to RF-rack-valid keyspaces' from Dawid Mędrek
This PR adjusts existing Boost tests so they respect the invariant
introduced by enabling `rf_rack_valid_keyspaces` configuration option.
We disable it explicitly in more problematic tests. After that, we
enable the option by default in the whole test suite.

Fixes scylladb/scylladb#23958

Backport: backporting to 2025.1 and 2025.2 to be able to test the implementation there too.

Closes scylladb/scylladb#23802

* github.com:scylladb/scylladb:
  test/lib/cql_test_env.cc: Enable rf_rack_valid_keyspaces by default
  test/boost/tablets_test.cc: Explicitly disable rf_rack_valid_keyspaces in problematic tests
  test/boost/tablets_test.cc: Fix indentation in test_load_balancing_with_random_load
  test/boost/tablets_test.cc: Adjust test_load_balancing_with_random_load to RF-rack-validity
  test/boost/tablets_test.cc: Adjust test_load_balancing_works_with_in_progress_transitions to RF-rack-validity
  test/boost/tablets_test.cc: Adjust test_load_balancing_resize_requests to RF-rack-validity
  test/boost/tablets_test.cc: Adjust test_load_balancing_with_two_empty_nodes to RF-rack-validity
  test/boost/tablets_test.cc: Adjust test_load_balancer_shuffle_mode to RF-rack-validity
2025-06-03 08:43:34 +02:00
Evgeniy Naydanov
ac972231fa test.py: python: pass --tmpdir from test.py to all Python tests
`--tmpdir` CLI argument is used to point to the directory with logs
and other test artifacts.  It has default values both in test.py
and pytest (`test/conftest.py`). These values are the same.  But for
non-default values it's required to pass it from test.py to pytest
explicitly.  This done for Topology tests, but not for all Python test
suites.  The commit fixes the problem by adding the argument in
`_prepare_pytest_command()` method of the base `PythonTest` class.
2025-06-03 05:45:05 +00:00
Evgeniy Naydanov
17401aaf31 test.py: remove dead code after removing of write_junit_report()
There is `write_junit_failure_report()` method in Test class which
was used to generate a JUnitXML report.  But it became a dead code
after removal of `write_junit_report()` function in
1e1d213592 to avoid duplication of
error reporting in Jenkins (see #23220.)  This commit removes this
method and all its implementations in subclasses.
2025-06-03 02:28:41 +00:00
Andrei Chekun
738cbc07b5 test.py: enhance boost_facade missing log file handling
If the log file absent or empty test fails with an error regarding a missing boost log file, however, it's not helpful since it's not a root cause of the fail. Adding logic to log this issue as a warning in a pytest's log file and continue with providing results to the pytest itself.
2025-06-02 12:17:10 +02:00
Andrei Chekun
5f6740c1fa test.py: switch using f-string instead format in facades
Switching to f-string formatting to simplify the code and to unify it with a general approach for formatting strings.
2025-06-02 12:16:47 +02:00
Pavel Emelyanov
7fef2c4f61 Merge 'test.py: fix metrics gathering' from Andrei Chekun
Move of the run_process done in https://github.com/scylladb/scylladb/pull/24091 was not fully correct. The method run_process was not overridden in the class ResourceGatherOn, so no metrics are collected at all.
Additionally, fix metrics DB location second time.

Closes scylladb/scylladb#24306

* github.com:scylladb/scylladb:
  test.py: fix metrics DB location
  test.py: fix the possibility to gather resource metrics for test
2025-06-02 13:12:42 +03:00
Botond Dénes
e82b0dff3e Merge 'Move mutation_fragment_v2::kind into mutation_fragment_v2::data, mutation_fragment::kind into mutation_fragment::data' from Radosław Cybulski
Move mutation_fragment_v2::kind field into mutation_fragment_v2::data.
Move mutation_fragment::kind field into mutation_fragment::data.

In both cases the move reduces size of the object by half (to 8 bytes).

On top of testsuite this patch was tested manually. First patched scylla was run. A keyspace and a table was created, with columns TEXT, INT, DOUBLE, BOOLEAN and TIMESTAMP. One row was inserted, `select *` was executed to make sure it's there. Then scylla was terminated and non-patched scylla was run, another row was inserted and `select *` was run to verify both rows exist. After this patched scylla was against started, third row was inserted and final `select *` was done to verify all three rows are there.

This is partial fix to https://github.com/scylladb/scylla-enterprise/issues/5288 issue.

Closes scylladb/scylladb#23452

* github.com:scylladb/scylladb:
  Move mutation_fragment::kind into data object
  Make mutation_fragment::kind enum 1 byte size
  Move mutation_fragment_v2::kind into data object
  Make mutation_fragment_v2::kind enum 1 byte size
2025-06-02 10:57:17 +03:00
Evgeniy Naydanov
e780164a67 test.py: dtest: make auth_roles_test.py run using test.py
As a part of the porting process, remove unused imports and
markers, remove non-next_gating tests, and code for old
ScyllaDB versions.

Enable the test in suite.yaml (run in dev mode only)
2025-06-02 05:14:41 +00:00
Evgeniy Naydanov
145c2fed97 test.py: dtest: add wait_for_any_log() to tools/log_utils.py
Copy wait_for_any_log() function from dtest tools/log_utils.py
with few modifications:

 - Add type hints;
 - Change timeout for node.watch_log_for() calls from 0 to 0.1
   because dtest shim's implementation uses asyncio.timeout()
   and 0 means not "one time" but "never run";
 - Use set() instead of list() for `ret` variable;
 - Remove redundant `found` variable.
 - Remove `remaining` variable and use shallow copies to make
   the code more correct.  As a side effect this makes the
   TimeoutError message more correct too;
 - Use f-string formatting for TimeoutError message;
2025-06-02 05:14:41 +00:00
Evgeniy Naydanov
ff2aea7e5b test.py: dtest: add part of tools/assertions.py
Copy few assertion functions from dtest tools/assertions.py:

 - assertion_exception()
 - assertion_invalid()
 - assertion_one()
 - assertion_all()
2025-06-02 05:14:41 +00:00
Evgeniy Naydanov
9d70b6307b test.py: dtest: pickup latest code for retrying.py from dtest
Sync retrying.py with dtest.
2025-06-02 05:14:41 +00:00
Evgeniy Naydanov
40464faef3 test.py: dtest: copy unmodified auth_roles_test.py
The test is disabled in suite.yaml
2025-06-02 05:14:41 +00:00
Jenkins Promoter
7d562c24b1 Update pgo profiles - aarch64 2025-06-01 04:45:06 +03:00
Jenkins Promoter
75cf16afa2 Update pgo profiles - x86_64 2025-06-01 04:31:56 +03:00
Botond Dénes
c52aec3d2f Merge 'tablets: fix missing data after tablet merge ' from Raphael Raph Carvalho
Consider the following scenario:

1) let's assume tablet 0 has range [1, 5] (pre merge)
2) tablet merge happens, tablet 0 has now range [1, 10]
3) tablet_sstable_set isn't refreshed, so holds a stale state, thinks tablet 0 still has range [1, 5]
4) during a full scan, forward service will intersect the full range with tablet ranges and consume one tablet at a time
5) replica service is asked to consume range [1, 10] of tablet 0 (post merge)

We have two possible outcomes:

With cache bypass:

1) cache reader is bypassed
2) sstable reader is created on range [1, 10]
3) unrefreshed tablet_sstable_set holds stale state, but select correctly all sstables intersecting with range [1, 10]

With cache:

1) cache reader is created
2) finds partition with token 5 is cached
3) sstable reader is created on range [1, 4] (later would fast forward to range [6, 10]; also belongs to tablet 0)
4) incremental selector consumes the pre-merge sstable spanning range [1, 5]
4.1) since the partitioned_sstable_set pre-merge contains only that sstable, EOS is reached
4.2) since EOS is reached, the fast forward to range [6, 10] is not allowed.
So with the set refreshed, sstable set is aligned with tablet ranges, and no premature EOS is signalled, otherwise preventing fast forward to from happening and all data from being properly captured in the read.

This change fixes the bug and triggers a mutation source refresh whenever the number of tablets for the table has changed, not only when we have incoming tablets.

Additionally, includes a fix for range reads that span more than one tablet, which can happen during split execution.

Fixes: https://github.com/scylladb/scylladb/issues/23313

This change needs to be backported to all supported versions which implement tablet merge.

Closes scylladb/scylladb#24287

* github.com:scylladb/scylladb:
  replica: Fix range reads spanning sibling tablets
  test: add reproducer and test for mutation source refresh after merge
  tablets: trigger mutation source refresh on tablet count change
2025-05-30 15:37:29 +03:00
Anna Stuchlik
28cb5a1e02 doc: add OS support for ScyllaDB 2025.2
This commit adds the information about support for platforms
in ScyllaDB version 20252.

Fixes https://github.com/scylladb/scylladb/issues/24180

Closes scylladb/scylladb#24263
2025-05-30 12:23:59 +03:00
Calle Wilund
942477ecd9 encryption/utils: Move encryption httpclient to "general" REST client
Fixed #24296

While the HTTP client used for REST calls in AWS/GCP KMS integration (EAR)
is not general enough to be called a HTTP client as such, it is general
enough to be called a REST client (limited to stateless, single-op REST
calls).

Other code, like general auth integrations (hello Azure) and similar
could reuse this to lessen code duplication.

This patch simply moves the httpclient class from encryption to "rest"
namespace, and explicitly "limits" it to such usage. Making an alias
in encryption to avoid touching more files than needed.

Closes scylladb/scylladb#24297
2025-05-30 12:21:51 +03:00
Pavel Emelyanov
a65ffdd0df test/result_utils: Do not assume map_reduce reducing order
When map_reduce is called on a collection, one shouldn't expect that it
processes the elements of the collection in any specific order.

Current test of map-reduce over boost outcome assumes that if reduce
function is the string concatenation, then it would concatenate the
given vector of strings in the order they are listed. That requirement
should be relaxed, and the result may have reversed concatentation.

Fixes scylladb/scylladb#24321

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>

Closes scylladb/scylladb#24325
2025-05-30 09:38:59 +02:00
Michael Litvak
3a1be33143 test_cdc_generation_publishing: fix to read monotonically
The test test_multiple_unpublished_cdc_generations reads the CDC
generation timestamps to verify they are published in the correct order.
To do so it issues reads in a loop with a short sleep period and checks
the differences between consecutive reads, assuming they are monotonic.

However the assumption that the reads are monotonic is not valid,
because the reads are issued with consistency_level=ONE, thus we may read
timestamps {A,B} from some node, then read timestamps {A} from another
node that didn't apply the write of the new timestamp B yet. This will
trigger the assert in the test and fail.

To ensure the reads are monotonic we change the test to use consistency
level ALL for the reads.

Fixes scylladb/scylladb#24262

Closes scylladb/scylladb#24272
2025-05-30 08:35:56 +02:00
Pavel Emelyanov
086777e5de Merge 'test.py: python: run tests using bare pytest command' from Evgeniy Naydanov
Main change is splitting logic of `PythonTest.run()` method into `PythonTest.run_ctx()` context manager and `PythonTest.run()` method itself and add the `host` fixture which uses `PythonTest.run_ctx()` context manager to setup and teardown ScyllaDB node if `--test-py-init` argument is used. Otherwise, this fixture returns a value of `--host` CLI argument. Use dynamic scope provided by `testpy_test_fixture_scope()` function instead of `session` to maintain compatibility with `test.py` and `./run` scripts.

Other related changes:

* Add utility `get_testpy_test()` function to `pylib.suite.base` which combines all required steps to create an instance of `Test` class and rework `testpy_test` fixture to use it.

* Switch to use dynamic fixture scope controlled by `--test-py-init` CLI argument to improve compatibility with test.py.  And because in test.py mode the scope is `session`, also change default event loop scope to `session`.

* Convert `get_valid_alternator_role()` to fixture to have more control on the scope of the cache used. Additionally, function `new_dynamodb_session()` was also converted to a fixture, because it uses `get_valid_alternator_role()`.

* Replace dups of `cql` and `this_dc` fixtures in `rest_api` and `pylib/cql_repl` with imports from `cqlpy`.

* Change `build_mode` fixture to return "unknown" if no --mode arguments provided (this is mainly for alternator and cqlpy tests)

* Create a parent directory for a test log file just before opening this file in `run_test()` function instead of having this as a side effect in `Test.__init__()`.

And changes that remove pytest CLI argument duplicates to be able to run tests from different test suites in one pytest session:

* Add 3 supplementary functions to `test.pylib.suite.python`: `add_host_option()` (which adds `--host` options to pytest session), `add_cql_connection_options()` (which adds `--port`, and `--ssl`), and `--add-s3-options` (which adds options related to S3 connection.) Each function decorated with `@cache` decorator to be executed once per pytest session and avoid CLI options duplication for runs which executes `alternator`, `cqlpy`, `rest_api`, or `broadcast_tables` in one pytest session.

* Move `--auth_username` and `--auth_password` options from `cluster/conftest.py` to add_scylla_cql_connection_options() and slightly rework `cql` fixture to support these options.

* Remove `--input`, `--output`, and `--keep-tmp` pytest CLI opionts from `cluster/object_store/conftest.py` because they are not used in these suite.

* Remove `--omit-scylla-output` CLI option from pytest argparser. Instead, remove it from `sys.argv` in `cqlpy/run.py`.  Also, no need to check this option in `alternator/run`.

Closes scylladb/scylladb#23849

* github.com:scylladb/scylladb:
  test.py: python: run tests using bare pytest command
  test.py: rework testpy_test fixture
  test.py: alternator: convert get_valid_alternator_role() to fixture
  test.py: python: split logic of PythonTest.run()
  test.py: add credentials options to add_cql_connection_options()
  test.py: python: remove dups of cql and this_dc fixtures
  test.py: remove duplication of pytest CLI options
  test.py: remove unused CLI options
  test.py: remove `--omit-scylla-output` from pytest argparser
  test.py: set build_mode to "unknown" if no --mode argument
  test.py: create directory for test log in run_test()
2025-05-30 08:48:43 +03:00
Botond Dénes
7db956965e mutation/mutation_compactor: cache regular/shadowable max-purgable in separate members
Max purgeable has two possible values for each partition: one for
regular tombstones and one for shadowable ones. Yet currently a single
member is used to cache the max-purgeable value for the partition, so
whichever kind of tombstone is checked first, its max-purgeable will
become sticky and apply to the other kind of tombstones too. E.g. if the
first can_gc() check is for a regular tombstone, its max-purgeable will
apply to shadowable tombstones in the partition too, meaning they might
not be purged, even though they are purgeable, as the shadowable
max-purgeable is expected to be more lenient. The other way around is
worse, as it will result in regular tombstone being incorrectly purged,
permitted by the more lenient shadowable tombstone max-purgeable.
Fix this by caching the two possible values in two separate members.
A reproducer unit test is also added.

Fixes: scylladb/scylladb#23272

Closes scylladb/scylladb#24171
2025-05-29 22:52:08 +03:00
Avi Kivity
f0ec9dd8f2 Merge 'utils/logalloc: enforce the max contiguous allocation size limit' from Michał Chojnowski
This series fixes the only known violation of logalloc's allocation size limits (in `chunked_managed_vector`), and then it make those limits hard.

Before the series, LSA handles overly-large allocations by forwarding them to the standard allocator. After the series, an attempt to do an overly large allocations via LSA will trigger an `on_internal_error` instead.

We do this because the allocator fallback logic turned out to have subtle and problematic accounting bugs.
We could fix them, or we can remove the mechanism altogether.
It's hard to say which choice is better. This PR arbitrarily makes the choice to remove the mechanism.
This makes the logic simpler, at the risk of escalating some allocation size bugs to crashes.

See the descriptions of individual commits for more details.

Fixes scylladb/scylladb#23850
Fixes scylladb/scylladb#23851
Fixes scylladb/scylladb#23854

I'm not sure if any of this should be backported or not.

The `chunked_managed_vector` fix could be backported, because it's a bugfix. It's an old bug, though, and we have never observed problems related to it.

The changes to `logalloc` aren't supposed to be fixing any observable problem, so a backport probably has more risk than benefit in this case.

Closes scylladb/scylladb#23944

* github.com:scylladb/scylladb:
  utils/logalloc: enforce LSA allocation size limits
  utils/lsa/chunked_managed_vector: fix the calculation of max_chunk_capacity()
2025-05-29 22:11:41 +03:00
Szymon Malewski
18d237a393 alternator/executor: Added checks in batch_write_item
This patch adds checks validating 'BatchWriteItem' requests mostly to avoid ugly fallback message.
It changes request's behaviour in case of an empty array of WriteRequests - previously such an array was ignored and whole request might succeed, now it raises ValidationException, following the documentation and behaviour of DynamoDB.
Patch includes tests in test_manual_requests (`test_batch_write_item_invalid_payload`, `test_batch_write_item_empty_request_list`) testing with several offending cases.

Fixes #23233

Closes scylladb/scylladb#23878
2025-05-29 20:33:57 +03:00
Patryk Jędrzejczak
c21692f3a6 Merge 'token_range_vector: fragment' from Avi Kivity
token_range_vector is a sequence of intervals of tokens. It is used
to describe vnodes or token ranges owned by shards.

Since tokens are bloated (16 bytes instead of 8), and intervals are bloated
(40 byte of overhead instead of 8), and since we have plenty of token ranges,
such vectors can exceed our allocation unit of 128 kB and cause allocation stalls.

This series fixes that by first generalizing some helpers and then changing
token_range_vector to use chunked_vector.

Although this touches IDL, there is no compatibility problem since the encoding
for vector and chunked_vector are identical.

There is no performance concern since token_range_vector is never used on
any hot path (hot paths always contain a partition key).

Fixes #3335.
Fixes #24115.

No backport: minor performance fix that isn't a regression.

Closes scylladb/scylladb#24205

* https://github.com/scylladb/scylladb:
  dht: fragment token_range_vector
  partition_range_compat: generalize wrap/unwrap helpers
2025-05-29 18:45:13 +02:00
Robert Bindar
c570941692 Add nodetool refresh --scope option
This change adds the --scope option to nodetool refresh.
Like in the case of nodetool restore, you can pass either of:
* node - On the local node.
* rack - On the local rack.
* dc - In the datacenter (DC) where the local node lives.
* all (default) - Everywhere across the cluster.
as scope.

The feature is based on the existing load_and_stream paths, so it
requires passing --load-and-stream to the refresh command.
Also, it is not compatible with the --primary-replica-only option.

Signed-off-by: Robert Bindar <robert.bindar@scylladb.com>

Closes scylladb/scylladb#23861
2025-05-29 16:12:09 +03:00
Evgeniy Naydanov
0ee0e3f14d test.py: python: run tests using bare pytest command
Add the `host` fixture which uses `PythonTest.run_ctx()` context manager
to setup and teardown ScyllaDB node if `--test-py-init` argument is used.
Otherwise, this fixture returns a value of `--host` CLI argument.

Use dynamic scope provided by `testpy_test_fixture_scope()` function
instead of `session` to maintain compatibility with test.py and ./run
scripts.
2025-05-29 12:33:41 +00:00
Evgeniy Naydanov
b67048f3ee test.py: rework testpy_test fixture
Add utility `get_testpy_test()` function to `pylib.suite.base` which
combines all required steps to create an instance of `Test` class.

Remove redundant `testpy_testsuite` fixture.

Switch to use dynamic fixture scope controlled by `--test-py-init` CLI
argument to improve compatibility with test.py.  And because in test.py
mode the scope is `session`, also change default event loop scope to
`session`.

The fixture is None for test.py mode.

test.py runs tests file-by-file as separate pytest sessions, so, `session`
scope is effectively close to be the same as `module` (can be a difference
in the order.)  In case of running tests with bare pytest command, we need
to use `module` scope to maintain same behavior as test.py, since we run
all tests in one pytest session.
2025-05-29 12:15:28 +00:00
Evgeniy Naydanov
b65cb517b8 test.py: alternator: convert get_valid_alternator_role() to fixture
Convert `get_valid_alternator_role()` to fixture to have more control
on the scope of the cache used.

Additionally, function `new_dynamodb_session()` was also converted to
a fixture, because it uses `get_valid_alternator_role()`.
2025-05-29 12:15:28 +00:00
Evgeniy Naydanov
1f94a9c052 test.py: python: split logic of PythonTest.run()
Split logic of `PythonTest.run()` method into `PythonTest.run_ctx()`
context manager and `PythonTest.run()` method itself.

Done this to reuse setup/teardown code with bare pytest command runs.
2025-05-29 12:15:28 +00:00
Evgeniy Naydanov
27cbfc77fb test.py: add credentials options to add_cql_connection_options()
Move `--auth_username` and `--auth_password` options from
`cluster/conftest.py` to add_cql_connection_options() and slightly
rework `cql` fixture to support these options.
2025-05-29 12:15:28 +00:00
Evgeniy Naydanov
2bba4acdea test.py: python: remove dups of cql and this_dc fixtures
Replace dups of `cql` and `this_dc` fixtures in `rest_api` and
`pylib/cql_repl` with imports from `cqlpy`.
2025-05-29 12:15:28 +00:00
Evgeniy Naydanov
6780461df8 test.py: remove duplication of pytest CLI options
Add 3 supplementary functions to `test.pylib.suite.python`:
`add_host_option()` (which adds `--host` options to pytest session),
`add_cql_connection_options()` (which adds `--port`, and `--ssl`),
and `--add-s3-options` (which adds options related to S3 connection.)
Each function decorated with `@cache` decorator to be executed once per
pytest session and avoid CLI options duplication for runs which
executes `alternator`, `cqlpy`, `rest_api`, or `broadcast_tables`
in one pytest session.
2025-05-29 12:15:28 +00:00
Evgeniy Naydanov
056c5db829 test.py: remove unused CLI options
Remove `--input`, `--output`, and `--keep-tmp` pytest CLI opionts
from `cluster/object_store/conftest.py` because they are not used
in these suite.
2025-05-29 12:15:28 +00:00
Evgeniy Naydanov
b7b68355ef test.py: remove --omit-scylla-output from pytest argparser
Remove `--omit-scylla-output` CLI option from pytest argparser.
Instead, remove it from `sys.argv` in `cqlpy/run.py`.  Also, no need
to check this option in `alternator/run`.
2025-05-29 12:15:28 +00:00