Commit Graph

48109 Commits

Author SHA1 Message Date
Benny Halevy
b00b805da6 test: sstable_resharding_test::sstable_resharding_over_s3_test: use default use_uuid in config
Which is `true` by default anyhow.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2025-06-18 11:30:29 +03:00
Benny Halevy
f644c5896f test: sstable_datafile_test: compound_sstable_set_basic_test: use uuid sstable generation
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2025-06-18 11:30:29 +03:00
Benny Halevy
bfa0bb78f9 test: sstable_compaction_test: always use uuid sstable generation
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2025-06-18 11:30:29 +03:00
Michał Chojnowski
27f66fb110 test/boost/mutation_reader_test: fix a use-after-free in test_fast_forwarding_combined_reader_is_consistent_with_slicing
The contract in mutation_reader.hh says:

```
// pr needs to be valid until the reader is destroyed or fast_forward_to()
// is called again.
    future<> fast_forward_to(const dht::partition_range& pr) {
```

`test_fast_forwarding_combined_reader_is_consistent_with_slicing` violates
this by passing a temporary to `fast_forward_to`.

Fix that.

Fixes scylladb/scylladb#24542

Closes scylladb/scylladb#24543
2025-06-17 19:30:50 +03:00
Anna Stuchlik
648d8caf27 doc: add support for z3 GCP
This commit adds support for z3-highmem-highlssd instance types to
Cloud Instance Recommendations for GCP.

Fixes https://github.com/scylladb/scylladb/issues/24511

Closes scylladb/scylladb#24533
2025-06-17 13:50:46 +03:00
Robert Bindar
1dd37ba47a Add dev documentation for manipulating s3 data manually
This patch intends to give an overview of where, when and how we store
data in S3 and provide a quick set of commands
which help gain local access to the data in case there is a need for
manual intervention.

The patch also collects in the same place links/descriptions for all
formats we use in S3.

Fixes #22438

Signed-off-by: Robert Bindar <robert.bindar@scylladb.com>

Closes scylladb/scylladb#24323
2025-06-17 13:21:30 +03:00
Pavel Emelyanov
b0766d1e73 Merge 's3_client: Refactor range class for state validation' from Ernest Zaslavsky
Revamped the `range` class to actively manage its state by enforcing validation on all modifications. This prevents overflow, invalid states, and ensures the object size does not exceed the 5TiB limit in S3. This should address and prevent future problems related to this issue https://github.com/minio/minio/issues/21333

No backport needed since this problem related only to this change https://github.com/scylladb/scylladb/pull/23880

Closes scylladb/scylladb#24312

* github.com:scylladb/scylladb:
  s3_client: headers cleanup
  s3_client: Refactor `range` class for state validation
2025-06-17 10:34:55 +03:00
Ernest Zaslavsky
e398576795 s3_client: Fix hang in get() on EOF by signaling condition variable
* Ensure _get_cv.signal() is called when an empty buffer received
* Prevents `get()` from stalling indefinitely while waiting on EOF
* Found when testing https://github.com/scylladb/scylladb/pull/23695

Closes scylladb/scylladb#24490
2025-06-17 10:33:19 +03:00
Calle Wilund
4a98c258f6 http: Add missing thread_local specifier for static
Refs #24447

Patch adding this somehow managed to leave out the thread_local
specifier. While gnutls cert object can be shared across shards
just fine, the actual shared_ptr here cannot, thus we could
cause memory errors.

Closes scylladb/scylladb#24514
2025-06-17 10:23:52 +03:00
Avi Kivity
cd79a8fc25 Revert "Merge 'Atomic in-memory schema changes application' from Marcin Maliszkiewicz"
This reverts commit 0b516da95b, reversing
changes made to 30199552ac. It breaks
cluster.random_failures.test_random_failures.test_random_failures
in debug mode (at least).

Fixes #24513
2025-06-16 22:38:12 +03:00
Ernest Zaslavsky
1b20e0be4a s3_client: headers cleanup 2025-06-16 16:02:30 +03:00
Ernest Zaslavsky
9ad7a456fe s3_client: Refactor range class for state validation
Revamped the `range` class to actively manage its state by enforcing validation on all modifications. This prevents overflow, invalid states, and ensures the object size does not exceed the 5TiB limit in S3.
2025-06-16 16:02:24 +03:00
Pavel Emelyanov
5c2e5890a6 Merge 'test.py: Integrate pytest c++ test execution to test.py' from Andrei Chekun
With current changes, pytest executes boost tests. Gathering metrics added to the pytest BoostFacade and UnitFacade to have the possibility to get them for C++ test as previously.
Since boost, raft, unit, and ldap directories aren't executed by test.py, suite.yaml files are renamed to test_config.yaml to preserve the old way of test configuration and removing them from execution by test.py
Pytest executes all modes by itself, JUnit report for the C++ test will be one for the run. That means that there is no possibility to output them in testlog in different folders. So testlog/report directory is used to store all kinds of reports generated during tests. JUnit reports should be testlog/report/junit, Allure reports should be in testlog/report/allure.

**Breaking changes:**
1. Terminal output changed. test.py will run pytest for the next directories: `test/boost`, `test/ldap`, `test/raft`, `test/unit`. `test.py` will blindly translate the output of the pytest to the terminal. Then when all these tests are finished, `test.py` will continue to show previous output for the rest of the test.
2. The format of execution of C++ test directories mentioned above has been changed. Now it will be a simple path to the file with extension. For example, instead of `boost/aggregate_fcts_test` now you need to use `test/boost/aggregate_fcts_test.cc`
3. This PR creates a spike in test amount. The previous logic was to consolidate the boost results from different runs and different modes to one report. So for the three repeats and three modes (nine test results) in CI was shown one result. Now it shows nine results, with differentiating them by mode and run.

**Note:**
Pytest uses pytest-xdist module to run tests in parallel. The Frozen toolchain has this dependency installed, for the local use, please install it manually.

Changes for CI https://github.com/scylladb/scylla-pkg/pull/4949. It will be merged after the current PR will be in master. Short disruption is expected, while PR in scylla-pkg will not be merged.

Fixes: https://github.com/scylladb/qa-tasks/issues/1777

Closes scylladb/scylladb#22894

* github.com:scylladb/scylladb:
  test.py: clean code that isn't used anymore
  test.py: switch off C++ tests from test.py discovery
  test.py: Integrate pytest c++ test execution to test.py
2025-06-16 16:01:37 +03:00
Pavel Emelyanov
0b6532a895 api: Shorten get_simple_states() handler
The one collects map<ip, state> then converts it to a jsonable vector of
helper objects with key and value members. This patch removes the
intermediate map and creates the vector instantly. With that change the
handler makes less data manipulations and behaves like the
get_all_endpoint_states one.

Very similar change was done in 12420dc644 with get_host_to_id_map
handler.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>

Closes scylladb/scylladb#24456
2025-06-16 15:21:27 +03:00
Tomasz Grabiec
cdb1499898 Merge 'interval: reduce memory footprint' from Avi Kivity
The interval class's memory footprint isn't important for single objects,
but intervals are frequently held in moderately sized collections. In #3335 this
caused a stall. Therefore reducing interval's memory footprint and reduce
allocation pressure.

This series does this by consolidating badly-padded booleans in the object tree
spanned by interval into 5 booleans that are consecutive in memory. This
reduces the space required by these booleans from 40 bytes to 8 bytes.

perf-simple-query report (with refresh-pgo-profiles.sh for each measurement):

before:

252127.60 tps ( 66.1 allocs/op,   0.0 logallocs/op,  14.1 tasks/op,   37128 insns/op,   18147 cycles/op,        0 errors)
INFO  2025-06-07 21:00:34,010 [shard 0:main] group0_tombstone_gc_handler - Setting reconcile time to   1749319231 (min id=4dbed2f4-43c9-11f0-cbc6-87d1a08b4ca4)
246492.37 tps ( 66.1 allocs/op,   0.0 logallocs/op,  14.1 tasks/op,   37153 insns/op,   18411 cycles/op,        0 errors)
253633.11 tps ( 66.1 allocs/op,   0.0 logallocs/op,  14.1 tasks/op,   37127 insns/op,   17941 cycles/op,        0 errors)
254029.93 tps ( 66.1 allocs/op,   0.0 logallocs/op,  14.1 tasks/op,   37155 insns/op,   17951 cycles/op,        0 errors)
254465.76 tps ( 66.1 allocs/op,   0.0 logallocs/op,  14.1 tasks/op,   37123 insns/op,   17906 cycles/op,        0 errors)
throughput:
	mean=   252149.75 standard-deviation=3282.75
	median= 253633.11 median-absolute-deviation=1880.17
	maximum=254465.76 minimum=246492.37
instructions_per_op:
	mean=   37137.24 standard-deviation=15.71
	median= 37127.54 median-absolute-deviation=14.45
	maximum=37155.24 minimum=37122.79
cpu_cycles_per_op:
	mean=   18071.19 standard-deviation=212.25
	median= 17950.62 median-absolute-deviation=130.10
	maximum=18411.50 minimum=17906.13

after:

252561.26 tps ( 66.1 allocs/op,   0.0 logallocs/op,  14.1 tasks/op,   37039 insns/op,   18075 cycles/op,        0 errors)
256876.44 tps ( 66.1 allocs/op,   0.0 logallocs/op,  14.1 tasks/op,   37022 insns/op,   17785 cycles/op,        0 errors)
257084.38 tps ( 66.1 allocs/op,   0.0 logallocs/op,  14.1 tasks/op,   37030 insns/op,   17840 cycles/op,        0 errors)
257305.35 tps ( 66.1 allocs/op,   0.0 logallocs/op,  14.1 tasks/op,   37042 insns/op,   17804 cycles/op,        0 errors)
258088.53 tps ( 66.1 allocs/op,   0.0 logallocs/op,  14.1 tasks/op,   37028 insns/op,   17778 cycles/op,        0 errors)
throughput:
	mean=   256383.19 standard-deviation=2185.22
	median= 257084.38 median-absolute-deviation=922.16
	maximum=258088.53 minimum=252561.26
instructions_per_op:
	mean=   37032.17 standard-deviation=8.06
	median= 37030.46 median-absolute-deviation=6.44
	maximum=37041.83 minimum=37021.93
cpu_cycles_per_op:
	mean=   17856.60 standard-deviation=124.70
	median= 17804.16 median-absolute-deviation=71.24
	maximum=18075.50 minimum=17777.95

A small improvement is observed in instructions_per_op. It could be random fluctuations in the compiler performance, or maybe the default constructor/destructor of interval are meaningful even in this simple test.

Small performance improvement, so not a backport candidate.

Closes scylladb/scylladb#24232

* github.com:scylladb/scylladb:
  interval: reduce sizeof
  interval: change start()/end() not to return references to data members
  interval: rename start_ref() back to start() (and end_ref() etc).
  interval: rename start() to start_ref() (and end() etc).
  test: wrapping_interval_test: add more tests for intervals
2025-06-16 09:23:56 +02:00
Botond Dénes
898ce98500 db/batchlog_manager: remove unused member _total_batches_replayed
And its getter. There are no users for either.

Closes scylladb/scylladb#24416
2025-06-16 09:37:00 +03:00
Nadav Har'El
847d9c0911 alternator: update documentation that ttl with tablets does work
Our documentation docs/alternator/new-apis.md claims that Alternator TTL
does not work with tablets, due to issue #16567. However, we fixed that
issue in commit de96c28625. So let's drop
the outdated statement that it doesn't work.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>

Closes scylladb/scylladb#24427
2025-06-16 09:36:11 +03:00
Ernest Zaslavsky
2b300c8eb9 s3_client: Improve reporting of S3 client statistics
Revise how we report statistics for `chunked_download_source`. Ensure
metrics for downloaded but unconsumed data are visible, as they do not
contribute to read amplification, which is tracked separately.

Closes scylladb/scylladb#24491
2025-06-16 09:33:57 +03:00
Pavel Emelyanov
9aaa33c15a Merge 'main.cc: fix group0 shutdown order' from Petr Gusev
Applier fiber needs local storage, so before shutting down local storage we need to make sure that group0 is stopped.

We also improve the logs for the case when `gate_closed_exception` is thrown while a mutation is being written.

Fixes [scylladb/scylladb#24401](https://github.com/scylladb/scylladb/issues/24401)

Backport: no backport -- not safe and the problem is minor.

Closes scylladb/scylladb#24418

* github.com:scylladb/scylladb:
  storage_service: test_group0_apply_while_node_is_being_shutdown
  main.cc: fix group0 shutdown order
  storage_proxy: log gate_closed_exception
2025-06-16 09:32:34 +03:00
Amnon Heiman
55b21b01ee alternator/stats.cc, metrics-config.yml: docs fix per-table metrics
This patch updates alternator/stats.cc and the get_description.py
configuration (metrics-config.yml) to restore compatibility with
per-table alternator metrics in the documentation generation process.

Previously, the group name for metrics was selected using an inline
expression like (has_table)? "alternator_table" : "alternator", which
made it difficult to maintain a straightforward mapping in the
configuration file.  With this change, the group name is now assigned to
a variable in alternator/stats.cc, allowing metrics-config.yml to map
group names directly. This makes the configuration easier to maintain
and enables get_description.py to document both global and per-table
metrics correctly.

This is a minimal, targeted fix to get the documentation working again
with the new per-table metrics format.

Signed-off-by: Amnon Heiman <amnon@scylladb.com>

Closes scylladb/scylladb#24509
2025-06-15 18:06:36 +03:00
Jenkins Promoter
1b5eee6a12 Update pgo profiles - aarch64 2025-06-15 04:57:59 +03:00
Jenkins Promoter
e0c2d591c7 Update pgo profiles - x86_64 2025-06-15 04:44:13 +03:00
Avi Kivity
42d7ae1082 interval: reduce sizeof
An interval object stores five booleans: start()->is_inclusive(),
a boolean since start() itself is an std::optional, two more for
end(), and is_singular(). Due to bad packing, these five booleans
occupy 8 bytes each, for a total of 40 bytes.

Re-pack the interval class by storing those booleans explicitly
close by. Since we lose std::optional's ability to store
a maybe-constructed object, we re-implement it using anonymous
unions and therefore have to implement the 5 special methods.

This helps saves space when vectors of intervals are used, as
seen in #3335 for example.
2025-06-14 21:29:43 +03:00
Avi Kivity
f3dccc2215 interval: change start()/end() not to return references to data members
We'd like to change the data layout of `interval` to save space.
As a result, start() and end() which return references to data
members must return objects (not references). Since we'd like
to maintain zero-copy for these functions, we change them to
return objects containing references (rather than references
to objects), avoiding copying of potentially expensive objects.

We repurpose the interval_bound class to hold references (by
instantiating it with `const T&` instead of `T`) and provide
converting constructors. To make transform_bounds() retain
zero-copy, we add start() and end() that take *this by
rvalue reference.
2025-06-14 21:26:17 +03:00
Avi Kivity
16fb68bb5e interval: rename start_ref() back to start() (and end_ref() etc).
To reduce noise, rename start_ref() back to its original name start(),
after it was changed in the previous patch to force an audit of all calls.
2025-06-14 21:26:16 +03:00
Avi Kivity
3363bc41e2 interval: rename start() to start_ref() (and end() etc).
We are about to change start() to return a proxy object rather
than a `const interval_bound<T>&`. This is generally transparent,
except in one case: `auto x = i.start()`. With the current implementation,
we'll copy object referred to and assign it to x. With the planned
implementation, the proxy object will be assigned to `x`, but it
will keep referring to `i`.

To prevent such problems, rename start() to start_ref() and end()
to end_ref(). This forces us to audit all calls, and redirect calls
that will break to new start_copy() and end_copy() methods.
2025-06-14 21:26:16 +03:00
Avi Kivity
674118fd2e test: wrapping_interval_test: add more tests for intervals
In this series, we will make interval manage its memory directly,
specifically it will directly construct and destroy T values that
it contains rather than let std::optional<T> manage those values
itself.

Add tests that expose bugs encountered during development (actually,
review) of this series. The tests pass before the series, fail
with series as it was before fixing, and pass with the series as
it is now.

The tests use a class maybe_throwing_interval_payload that can
be set to throw at strategic locations and exercise all the interesting
interval shapes.
2025-06-14 21:26:14 +03:00
Patryk Jędrzejczak
c4cf95aeb3 Merge 'raft: simplify voter handler code to not pass node references around' from Emil Maskovsky
Refactor the voter handler logic to only pass around node IDs (`raft::server_id`), instead of pairs of IDs and node descriptor references. Node descriptors can always be efficiently retrieved from the original nodes map, which remains valid throughout the calculation.

This change reduces unnecessary reference passing and simplifies the code. All node detail lookups are now performed via the central nodes map as needed.

Additional cleanup has been done:
* removing redundant comments (that just repeat what the code does)
* use explicit comparators for the datacenter and rack information priorities (instead of the comparison operator) to be more explicit about the prioritization

Fixes: scylladb/scylladb#24035

No backport: This change does not fix any bug and doesn't change the behavior, just cleans up the code in master, therefore no backport is needed.

Closes scylladb/scylladb#24452

* https://github.com/scylladb/scylladb:
  raft: simplify voter handler code to not pass node references around
  raft: reformat voter handler for consistent indentation
  raft: use explicit priority comparators for datacenters and racks
  raft: clean up voter handler by removing redundant comments
2025-06-13 19:02:07 +02:00
Anna Stuchlik
e2b7302183 doc: extend 2025.2 upgrade with a note about consistent topology updates
This commit adds a note that the user should enable consistent topology updates before upgrading
to 2025.2 if they didn't do it (for some reason) when previously upgrading to version 2025.1.

Fixes https://github.com/scylladb/scylladb/issues/24467

Closes scylladb/scylladb#24468
2025-06-13 13:54:59 +03:00
Piotr Dulikowski
238fc24800 Merge 'test: dtest: move audit_test.py to test.py' from Andrzej Jackowski
Copied the entire audit_test.py from scylladb/scylla-dtest, to remove the entire file from scylla-dtest after this patch series is merged. The motivation is to move entire audit testing to from dtests, to make it easier to maintain and more reliable.

After audit_test.py was moved from dtests to test.py, some issues that require fixing arose due to differences between the frameworks.

No backport, moving audit_test.py to test.py is a new testing effort.

Closes scylladb/scylladb#24231

* github.com:scylladb/scylladb:
  test: audit: filter out LOGIN and USE audit logs
  test: audit: remove require mark
  test: audit: wait until raft state is applied in test_permissions
  test: audit: fix problems in audit_test.py
  test: dtest: add dict support to populate in scylla_cluster.py
  test: dtest: copied get_node_ip from dtests to scylla_cluster.py
  test: dtest: copy run_rest_api from dtests to cluster.py
  test: dtest: copy run_in_parallel from dtests to data.py
  test: audit: copy unmodified audit_test.py from dtests
2025-06-12 09:03:45 +02:00
Andrei Chekun
570aaa2ecb test.py: clean code that isn't used anymore
Clean code that is not used anymore
2025-06-11 18:29:26 +02:00
Andrei Chekun
9dca7719b1 test.py: switch off C++ tests from test.py discovery
Switch off C++ tests from test.py discovery. With this change, test.py loses
the ability to directly see and run the C++ tests. Instead, it'll delegate all
things to the pytest.
Since boost, raft, unit, and ldap directories aren't executed by test.py,
suite.yaml files are renamed to test_config.yaml
to preserve the old way of test configuration and removing them from execution
by test.py
Before this patch boost test were visible by test.py and pytest. So if the
test.py will be invoked without test name, it will execute boost tests twice:
with test.py executor and with pytest executor. Depending on the test name
according executor will be used. For example, if test name is
test/boost/aggregate_fcts_test.cc it will be executed by pytest, but if the
boost/aggregate_fcts_test it will be executed by test.py executor.
2025-06-11 18:29:26 +02:00
Andrei Chekun
42d9dbe66a test.py: Integrate pytest c++ test execution to test.py
With current changes pytest executes boost tests. Gathering metrics added to the pytest BoostFacade and UnitFacade
to have the possibility to get them for C++ test as previously.
Since pytest executes all modes by itself JUnit report for the C++ test will be one for the run. That means that there
is no possibility to output them in testlog in different folders. So testlog/report directory is used to store all kinds
of reports generated during tests. JUnit reports should be testlog/report/junit, Allure reports should be in
testlog/report/allure.
**Breaking changes: **
1. Terminal output changed. test.py will run pytest for next directories: test/boost, test/ldap, test/raft, test/unit.
test.py will blindly translate the output of the pytest to the terminal. Then when all these tests are finished, test.py
will continue to show previous output for the rest of the test.
2. The format of execution of C++ test directories mentioned above has been changed. Now it will be a simple path to the
file with extension. For example, instead of boost/aggregate_fcts_test now you need to use test/boost/aggregate_fcts_test.cc
3. This PR creates a spike in test amount. The previous logic was to consolidate the boost results from different runs
and different modes to one report. So for the three repeats and three modes (nine test results) in CI was shown one result.
Now it shows nine results with differentiating them by mode and run.

Note:
Pytest uses pytest-xdist module to run tests in parallel. Frozen toolchain has this dependency installed, for the local
use, please install it manually.
2025-06-11 18:29:23 +02:00
Tomasz Grabiec
eabc1fa6ff Merge 'tablets: deallocate storage state on end_migration' from Michael Litvak
When a tablet is migrated and cleaned up, deallocate the tablet storage
group state on `end_migration` stage, instead of `cleanup` stage:

* When the stage is updated from `cleanup` to `end_migration`, the
  storage group is removed on the leaving replica.
* When the table is initialized, if the tablet stage is `end_migration`
  then we don't allocate a storage group for it. This happens for
  example if the leaving replica is restarted during tablet migration.
  If it's initialized in `cleanup` stage then we allocate a storage
  group, and it will be deallocated when transitioning to
  `end_migration`.

This guarantees that the storage group is always deallocated on the
leaving replica by `end_migration`, and that it is always allocated if
the tablet wasn't cleaned up fully yet.

It is a similar case also for the pending replica when the migration is
aborted. We deallocate the state on `revert_migration` which is the
stage following `cleanup_target`.

Previously the storage group would be allocated when the tablet is
initialized on any of the tablet replicas - also on the leaving replica,
and when the tablet stage is `cleanup` or `end_migration`, and
deallocated during `cleanup`.

This fixes the following issue:

1. A migrating tablet enters cleanup stage
2. the tablet is cleaned up successfuly
3. The leaving replica is restarted, and allocates storage group
4. tablet cleanup is not called because it's already cleaned up
5. the storage group remains allocated on the leaving replica after the
   migration is completed - it's not cleaned up properly.

Fixes https://github.com/scylladb/scylladb/issues/23481

backport to all relevant releases since it's a bug that results in a crash

Closes scylladb/scylladb#24393

* github.com:scylladb/scylladb:
  test/cluster/test_tablets: test restart during tablet cleanup
  test: tablets: add get_tablet_info helper
  tablets: deallocate storage state on end_migration
2025-06-11 17:37:02 +02:00
Andrzej Jackowski
e23d79cb62 test: audit: filter out LOGIN and USE audit logs
LOGIN entries can appear at many points during testing, for example,
when a driver creates a new session. Similarly, `USE ks` statements
can appear unexpectedly, especially when the python-driver calls
`set_keyspace_async` for new connections.

To avoid test checks failures,
this commit filters out LOGIN and USE entries in tests that are
not intended to verify these two types of audit logs.
2025-06-11 09:43:51 +02:00
Andrzej Jackowski
876eaf459b test: audit: remove require mark
After moving audit tests to dtests, require marks are no longer
needed because the tests and the code are in the same repository.
2025-06-11 09:43:51 +02:00
Marcin Maliszkiewicz
111cccf8ba test: audit: wait until raft state is applied in test_permissions
Otherwise test is flaky, expecting permissions to be
enforced before they get applied.
2025-06-11 09:43:51 +02:00
Andrzej Jackowski
6c6234979c test: audit: fix problems in audit_test.py
After audit_test.py was moved from dtests to test.py, the
following issues arose due to differences between the frameworks:
  - Some imports were unnecessary or broken
  - The @pytest.mark.dtest_full decorator was no longer needed
  - The `issue_open` attribute in `xmark` is not supported
  - Support for sending SIGHUP is encapsulated
    by `server_update_config` in test.py`
  - A workaround for scylladb#24473 was required

Moreover, suite.yaml was changed to start running audit_test.py
in dev mode.

Ref. scylladb#24473

Co-authored-by: Marcin Maliszkiewicz <marcinmal@scylladb.com>
2025-06-11 09:43:44 +02:00
Michał Chojnowski
0ade15df33 transport/server: silence the oversized allocation warning in snappy_compress
It has been observed to generate ~200 kiB allocations.

Since we have already been made aware of that, we can silence the warning
to clean up the logs.

Closes scylladb/scylladb#24360
2025-06-10 19:13:26 +03:00
Petr Gusev
b1050944a3 storage_service: test_group0_apply_while_node_is_being_shutdown 2025-06-10 17:25:03 +02:00
Petr Gusev
6b85ab79d6 main.cc: fix group0 shutdown order
group0 persistence relies on local storage, so before
shutting down local storage we need to make sure that
group0 is stopped.

Fixes scylladb/scylladb#24401
2025-06-10 16:06:22 +02:00
Wojciech Mitros
5eb4466789 Return correct creation date time in describe table
Add system:table_creation_time tag with value - timestamp in milliseconds of creation table.
If the tag is present, it will used to fill creation timestamp value (when CreateTable or DescribeTable is called).
If the tag is missing, value 0 for timestamp will be substituted (in other words table was created on 1th january of 1970).
Update test to change how we make sure timestamp is actually used - we create two tables one after another and make sure their creation timestamp is in correct order.
Update tests, that work with tags to filter system tags out.

Fixes #5013

Closes scylladb/scylladb#24007
2025-06-10 15:25:57 +03:00
Nadav Har'El
ed3a0a81d6 test/cqlpy: add some more tests of secondary index system tables
This patch adds a couple of basic tests for system tables related to
secondary indexes - system."IndexInfo" and system_schema.indexes.

I wanted to understand these system tables better when writing
documentation for them - so wrote these tests. These tests can also
serve as regression tests that verify that we don't accidentally lose
support for these system tables. I checked that these tests also pass
in Cassandra 3, 4 and 5.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>

Closes scylladb/scylladb#24137
2025-06-10 15:00:51 +03:00
Tomasz Grabiec
0b516da95b Merge 'Atomic in-memory schema changes application' from Marcin Maliszkiewicz
This change is preparing ground for state update unification for raft bound subsystems. It introduces schema_applier which in the future will become generic interface for applying mutations in raft.

Pulling `database::apply()` out of schema merging code will allow to batch changes to subsystems. Future generic code will first call `prepare()` on all implementations, then single  `database::apply()` and then `update()` on all implementations, then on each shard it will call `commit()` for all implementations, without preemption so that the change is observed as atomic across all subsystems, and then `post_commit()`.

Backport: no, it's a new feature

Fixes: https://github.com/scylladb/scylladb/issues/19649

Closes scylladb/scylladb#20853

* github.com:scylladb/scylladb:
  storage_service: always wake up load balancer on update tablet metadata
  db: schema_applier: call destroy also when exception occurs
  db: replica: simplify seeding ERM during shema change
  db: remove cleanup from add_column_family
  db: abort on exception during schema commit phase
  db: make user defined types changes atomic
  replica: db: make keyspace schema changes atomic
  db: atomically apply changes to tables and views
  replica: make truncate_table_on_all_shards get whole schema from table_shards
  service: split update_tablet_metadata into two phases
  service: pull out update_tablet_metadata from migration_listener
  db: service: add store_service dependency to schema_applier
  service: simplify load_tablet_metadata and update_tablet_metadata
  db: don't perform move on tablet_hint reference
  replica: split add_column_family_and_make_directory into steps
  replica: db: split drop_table into steps
  db: don't move map references in merge_tables_and_views()
  db: introduce commit_on_shard function
  db: access types during schema merge via special storage
  replica: make non-preemptive keyspace create/update/delete functions public
  replica: split update keyspace into two phases
  replica: split creating keyspace into two functions
  db: rename create_keyspace_from_schema_partition
  db: decouple functions and aggregates schema change notification from merging code
  db: store functions and aggregates change batch in schema_applier
  db: decouple tables and views schema change notifications from merging code
  db: store tables and views schema diff in schema_applier
  db: decouple user type schema change notifications from types merging code
  service: unify keyspace notification functions arguments
  db: replica: decouple keyspace schema change notifications to a separate function
  db: add class encapsulating schema merging
2025-06-10 13:45:32 +02:00
Ernest Zaslavsky
30199552ac s3_client: Mitigate connection exhaustion in download_source
The existing `download_source` implementation optimizes performance
by keeping the connection to S3 open and draining data directly from
the socket. While this eliminates the overhead (60-100ms) of repeatedly
establishing new connections, it leads to rapid exhaustion of client-
side connections.

On a single shard, two `mx_readers` for load and stream are enough to
trigger this issue. Since each client typically holds two connections,
readers keeping index and data sources open can cause deadlocks where
processes stall due to unavailable connections.

Introduce `chunked_download_source`, a new S3 download method built on
`download_source`, to dynamically manage connections:

- Buffers data in 5MiB chunks using a producer-consumer model
- Closes connections once buffers reach capacity, returning them to
  the pool for other clients
- Uses a filling fiber that resumes fetching once buffers are
  consumed from the queue

Performance remains comparable to `download_source`, achieving
95MiB/s for sequential 1GiB downloads from S3. However, preloading
large chunks may cause read amplification.

Fixes: https://github.com/scylladb/scylladb/issues/23785

Closes scylladb/scylladb#23880
2025-06-10 12:58:24 +03:00
Anna Stuchlik
b0ced64c88 doc: remove the limitation for disabling CDC
This commit removes the instruction to stop all writes before disabling CDC with ALTER.

Fixes https://github.com/scylladb/scylla-docs/issues/4020

Closes scylladb/scylladb#24406
2025-06-10 12:53:09 +03:00
Robert Bindar
ca1a9c8d01 Add support for nodetool refresh --skip-reshape
This patch adds the new option in nodetool, patches the
load_new_ss_tables REST request with a new parameter and
skips the reshape step in refresh if this flag is passed.

Signed-off-by: Robert Bindar <robert.bindar@scylladb.com>

Closes scylladb/scylladb#24409
Fixes: #24365
2025-06-10 12:52:13 +03:00
David Garcia
62fdebfe78 chore: exclude OS and ENT from google
Closes scylladb/scylladb#24417
2025-06-10 12:50:37 +03:00
Emil Maskovsky
b7e0a01fcc raft: simplify voter handler code to not pass node references around
Refactor the voter handler logic to only pass around node IDs
(`raft::server_id`), instead of pairs of IDs and node descriptor
references. Node descriptors can always be efficiently retrieved from
the original nodes map, which remains valid throughout the calculation.

This change reduces unnecessary reference passing and simplifies
the code. All node detail lookups are now performed via the central
nodes map as needed.

Fixes: scylladb/scylladb#24035
2025-06-10 11:04:56 +02:00
Emil Maskovsky
839e0bf40d raft: reformat voter handler for consistent indentation
Reformatted the voter handler implementation to comply with clang-format
automatic formatting rules. No functional changes.
2025-06-10 11:04:56 +02:00