Commit Graph

45936 Commits

Author SHA1 Message Date
Yaron Kaikov
0fc7e786dd .github/scripts/auto-backport.py: fix wrong username param
In 2e6755ecca I have added a comment when PR has conflicts so the assignee can get a notification about it. There was a problem with the user mention param (a missing `.login`)

Fixing it

Closes scylladb/scylladb#22036
2024-12-25 20:41:34 +02:00
Avi Kivity
465449e4a1 test: combined_test: relicense
Was inadvertantly released under the AGPL.
2024-12-25 13:53:54 +02:00
Avi Kivity
3ffe93b6ae Merge 'Enhance load-and-stream with "scope"' from Pavel Emelyanov
The main purpose of this change is to enhance the restore from object storage usage.

Currently, restore uses the load-and-stream facility. When triggered, the restoring task opens the provided list of sstables directory from the remote bucket and then feeds the list of sstables to load_and_stream() method. The method, in turn, iterates over this list, reads mutations and for each mutation decides where to send one by checking the replication map (it's pretty much the same for both vnodes and tablets, but for tablets that are "fully contained" by a range there's the plan to stream faster).

As described above, restore is governed by a single node and this single node reads all sstables from the object store, which can be very slow. This PR allows speeding things up. For that, the load-and-stream code is equipped with the "scope" filter which limits where mutations can be streamed to. There are four options for that -- all, dc, rack and node. The "all" is how things work currently, "dc" and "rack" filter out target nodes that don't belong to this node's dc/rack respectively. The "node" scope only streams mutations to local node.

With the "node" scope it's possible to make all nodes in the cluster load mutations that belong to them in parallel, without re-sending them to peers. The last patch in this PR is the test that shows how it can be possible.

Closes scylladb/scylladb#21169

* github.com:scylladb/scylladb:
  test: Add scope-streaming test (for restore from backup)
  api: New "scope" API param to load-and-stream calls
  sstables_loader: Propagate scope from API down
  sstables_loader: Filter tablets based on scope
  streamer: Disable scoped streaming of primary replica only
  sstables_loader: Introduce streaming scope
  sstables_loader: Wrap get_endpoints()
2024-12-25 13:52:51 +02:00
Nadav Har'El
23213e8696 Merge 'Make get_built_indexes REST API endpoint be consistent with system."IndexInfo" table' from Pavel Emelyanov
It turned out that aforementioned APIs use slightly different sources of information about view build progress/status which sometimes results in different reporting of whether an index is built. It's good to make those two APIs consistent. Also add a test for the REST API endpoint (system table test was addressed by #21677).

Closes scylladb/scylladb#21814

* github.com:scylladb/scylladb:
  test: Add tests for MVs and indexes reporting by API endpoint(s)
  api: Use built_views table in get_built_indexes API
2024-12-25 11:47:03 +02:00
Avi Kivity
f9c3ab03a3 Merge 'Sort by proximity: shuffle equal-distance replicas' from Benny Halevy
This series re-implements locator::topology::sort_by_proximity
and adds some randomization to shuffle equal-distance replicas for improving load-balancing
when reading with 1 < consistency level < replication factor.

This change also adds a manual test for benchmarking sort_by_proximity,
as it's not exercised by the single-node perf-simple-query.

The benchmark shows performance improvement of over 20% (from about 71 ns to 56 ns
per call for 3 nodes vectors), mainly due to "calculate distance only once" which
pre-calculates the distance from the reference node for each replica once, rather than
each time to comparator is called by std::sort

* Improvement.  No backport needed

Closes scylladb/scylladb#21958

* github.com:scylladb/scylladb:
  locator/topology: do_sort_by_proximity: shuffle equal-distance replicas
  locator/topology: sort_by_proximity: calculate distance only once
  utils: small_vector: expose internal_capacity()
  storage_proxy: sort_endpoints_by_proximity: lookup my_id only if cannot sort by proximity
  test/perf: add perf_sort_by_proximity benchmark
  locator: refactor sort_by_proximity
2024-12-24 17:37:48 +02:00
Pavel Emelyanov
644d36996d test: Add tests for MVs and indexes reporting by API endpoint(s)
So far there's the /column_family/built_indexes one that reports the
index names similar to how system.IndexInfo does, but it's not tested.
This patch adds tests next to existing system. table ones.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2024-12-24 16:18:32 +03:00
Pavel Emelyanov
5eb3278d9e api: Use built_views table in get_built_indexes API
Somehow system."IndexInfo" table and column_family/built_indexes REST
API endpoint declare an index "built" at slightly different times:

The former a virtual table which declares an index completely built
when it appears on the system.built_views table.

The latter uses different data -- it takes the list of indexes in
the schema and eliminates indexes which are still listed in the
system.scylla_views_builds_in_progress table.

The mentioned system. tables are updated at different times, so API
notices the change a bit later. It's worth improving the consistency
of these two APIs by making the REST API endpoint piggy-back the
load_built_views() instead of load_view_build_progress(). With that
change the filtering of indexes should be negated.

Fixes #21587

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2024-12-24 16:18:00 +03:00
Benny Halevy
d1490bb7bf locator/topology: do_sort_by_proximity: shuffle equal-distance replicas
To improve balancing when reading in 1 < CL < ALL

This implementation has a moderate impact on
the function performance in contrast to full
std::shuffle of the vector before stable_sort:ing it
(especially with large number of nodes to sort).

Before:
test                                               iterations      median         mad         min         max      allocs       tasks        inst      cycles
sort_by_proximity_topology.perf_sort_by_proximity    25541973    39.225ns     0.114ns    38.966ns    39.339ns       0.000       0.000       588.5       116.6

After:
sort_by_proximity_topology.perf_sort_by_proximity    19689561    50.195ns     0.119ns    50.076ns    51.145ns       0.000       0.000       622.5       150.6

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2024-12-24 13:00:17 +02:00
Benny Halevy
0fe8bdd0db locator/topology: sort_by_proximity: calculate distance only once
And use a temporary vector to use the precalculated distances.
A later patch will add some randomization to shuffle nodes
at the same distance from the reference node.

This improves the function performance by 50% for 3 replicas,
from 77.4 ns to 39.2 ns, larger replica sets show greater improvement
(over 4X for 15 nodes):

Before:
test                                               iterations      median         mad         min         max      allocs       tasks        inst      cycles
sort_by_proximity_topology.perf_sort_by_proximity    12808773    77.368ns     0.062ns    77.300ns    77.873ns       0.000       0.000      1194.2       231.6

After:
sort_by_proximity_topology.perf_sort_by_proximity    25541973    39.225ns     0.114ns    38.966ns    39.339ns       0.000       0.000       588.5       116.6

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2024-12-24 12:27:03 +02:00
Benny Halevy
4af522f61e utils: small_vector: expose internal_capacity()
So we can use it for defining other small_vector
deriving their internal capacity from another small_vector
type.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2024-12-24 12:19:20 +02:00
Benny Halevy
3a3df43799 storage_proxy: sort_endpoints_by_proximity: lookup my_id only if cannot sort by proximity
topology::sort_by_proximity already sorts the local node
address first, if present, so look it up only when
using SimpleSnitch, where sort_by_proximity() is a no-op.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2024-12-24 12:19:20 +02:00
Benny Halevy
75da99ce8b test/perf: add perf_sort_by_proximity benchmark
benchmark sort_by_proximity

Baseline results on my desktop for sorting 3 nodes:

single run iterations:    0
single run duration:      1.000s
number of runs:           5
number of cores:          1
random seed:              20241224

test                                               iterations      median         mad         min         max      allocs       tasks        inst      cycles
sort_by_proximity_topology.perf_sort_by_proximity    12808773    77.368ns     0.062ns    77.300ns    77.873ns       0.000       0.000      1194.2       231.6

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2024-12-24 12:18:24 +02:00
Pavel Emelyanov
a19ad3c655 Merge 'install-dependencies.sh: cleanups to silence shellcheck' from Kefu Chai
this changeset includes two changes to silence the warnings reported by shellcheck. This changeset has no functional impact and serves as a proactive code improvement.

---

it's a cleanup, hence no need to backport.

Closes scylladb/scylladb#21756

* github.com:scylladb/scylladb:
  install-dependencies.sh: quote array to avoid re-splitting
  install-dependencies.sh: define local variable using "local -A"
2024-12-24 10:27:27 +03:00
Pavel Emelyanov
972ff80fad test: Add scope-streaming test (for restore from backup)
- create
  - a cluster with given topology
  - keyspace with tablets and given rf value
  - table with some data
- backup
  - flush all nodes
  - kick backup API on every node
- re-create keyspace and table
  - drop it first
  - create again with the same parameters and schema, but don't
    populate table with data
- restore
  - collect nodes to contact and corresponding list of TOCs
    according to the preferred "scope"
  - ask selected nodes to restore, limiting its streaming scope
    and providing the specific list of sstables
- check
  - select mutation fragments from all nodes for random keys
  - make sure that the number of non-empty responses equals the
    expected rf value

Specific topologies, RFs and stream scopes used are:

    rf = 1, nodes = 3, racks = 1, dcs = 1, scope = node
    rf = 3, nodes = 5, racks = 1, dcs = 1, scope = node
    rf = 1, nodes = 4, racks = 2, dcs = 1, scope = rack
    rf = 3, nodes = 6, racks = 2, dcs = 1, scope = rack
    rf = 3, nodes = 6, racks = 3, dcs = 1, scope = rack
    rf = 2, nodes = 8, racks = 4, dcs = 2, scope = dc

nodes and racks are evenly distributed in racks and dcs respectively
in the last topo RF effectively becomes 4 (2 in each dc)

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2024-12-23 19:28:05 +03:00
Pavel Emelyanov
a24dc02255 api: New "scope" API param to load-and-stream calls
There are two of those -- the POST /storage_service/keyspace that loads
and streams new sstables from /upload and POST /storage_service/restore
that does the same, but gets sstables from object store.

The new optional parameter allow users to tun the streaming phase
behavior. The test/pylib client part is also updated here.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2024-12-23 19:28:05 +03:00
Pavel Emelyanov
960041d4b4 sstables_loader: Propagate scope from API down
Semi-mechanical change that adds newly introduced "scope" parameter to
all the functions between API methods and the low-level streamer object.
No real functional changes. API methods set it to "all" to keep existing
behavior.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2024-12-23 19:28:05 +03:00
Pavel Emelyanov
e8201a7897 sstables_loader: Filter tablets based on scope
Loading and streaming tablets has pre-filtering loop that walks the
tablet map sorts sstables into three lists:
- fully contained in one of map ranges
- partially overlapping with the map
- not intersecting with the map

Sstables from the 3rd list is immediately dropped from the process and
for the remaining two core load-and-stream happens.

This filtering deserves more care from the newly introduced scope. When
a tablet replica set doesn't get in the scope, the whole entry can be
disregarded, because load-and-stream will only do its "load" part anyway
and all mutations from it will be ignored.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2024-12-23 19:28:05 +03:00
Pavel Emelyanov
93aed22cd5 streamer: Disable scoped streaming of primary replica only
There's been some discussions of how primary replica only streaming
schould interact with the scope. There are two options how to consider
this combination:

- find where the primary replica is and handle it if it's within the
  requested sope
- within the requested scope find the primary replica for that subset of
  nodes, then handle it

There's also some itermediate solution: suppoer "primary replica in DC"
and reject all other combinations.

Until decided which way is correct, let's disable this configuration.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2024-12-23 19:19:17 +03:00
Pavel Emelyanov
30aac0d1da sstables_loader: Introduce streaming scope
Currently load-and-stream sends mutations to whatever node is considered
to be a "replica" for it. One exception is the "primary-replica-only"
flag that can be requested by the user.

This patch introduces a "scope" parameter that limits streaming part in
where it can stream the data to with 4 options:

- all -- current way of doing things, stream to wherever needed
- dc -- only stream to nodes that live in the same datacenter
- rack -- only stream to nodes that live in the same rack
- node -- only "stream" to current node

It's not yet configurable and streamer object initializes itself with
"all" mode. Will be changed later.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2024-12-23 19:19:17 +03:00
Pavel Emelyanov
7c1eaa427e sstables_loader: Wrap get_endpoints()
Preparational patch. Next will add more code to get_endpoints() that
will need to work for both if/else branches, this change helps having
less churn later.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2024-12-23 19:19:17 +03:00
Benny Halevy
68b0b442fd locator: refactor sort_by_proximity
Extract can_sort_by_proximity() out so it can be used
later by storage_proxy, and introduce do_sort_by_proximity
that sorts unconditionally.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2024-12-23 16:42:55 +02:00
Kefu Chai
cd2a2bd021 repair: correct misspelling of "corespondent"
replace "corespondent" with "corresponding" in a logging message.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#22003
2024-12-23 11:29:58 +02:00
Takuya ASADA
03461d6a54 test: compile unit tests into a single executable
To reduce test executable size and speed up compilation time, compile unit
tests into a single executable.

Here is a file size comparison of the unit test executable:

- Before applying the patch
$ du -h --exclude='*.o' --exclude='*.o.d' build/release/test/boost/ build/debug/test/boost/
11G	build/release/test/boost/
29G	build/debug/test/boost/

- After applying the patch
du -h --exclude='*.o' --exclude='*.o.d' build/release/test/boost/ build/debug/test/boost/
5.5G	build/release/test/boost/
19G	build/debug/test/boost/

It reduces executable sizes 5.5GB on release, and 10GB on debug.

Closes #9155

Closes scylladb/scylladb#21443
2024-12-22 19:14:09 +02:00
Piotr Smaron
200f0bb219 alternator: use get_datacenters() in get_network_topology_options()
Currently, `get_network_topology_options()` is using gossip data
and iterates over topology using IPs and not host IDs, which may
result in operating on inconsistent data.
This method's implemenations has been changed to instead use
`get_datacenters()`, which should always return consistent data.

Fixes: scylladb/scylladb#21490

Closes scylladb/scylladb#21940
2024-12-22 18:57:10 +02:00
Avi Kivity
f8ce49ebe9 cql3: implement NOT IN
Where the grammar supports IN, we add NOT IN. This includes the WHERE
clause and LWT IF clause.

Evaluation of NOT IN follows from IN.

In statement_restrictions analysis, they are different, as NOT IN
doesn't enable any clever query plan and must filter.

Some tests are added. An error message was changed ('in' changed to 'IN'),
so some tests are adjusted.

Closes scylladb/scylladb#21992
2024-12-22 15:15:23 +02:00
Kefu Chai
10c79a4d47 test/pylib: do not check for self.cmd when tearing down ScyllaServer
we already check `self.cmd` for null at the very beginning of the
`ScyllaServer.stop()`, and in the `try` block, we don't reset
`self.cmd`, hence there is no need to check it again.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#21936
2024-12-20 16:21:40 +02:00
Avi Kivity
eb62593f2c treewide: use angle brackets when including seastar headers
We treat Seastar as a "system" library, and those are included
with angle brackets.

Closes scylladb/scylladb#21959
2024-12-20 16:16:28 +02:00
Kefu Chai
f1a0613a39 mutation: remove unused function
`prefixed()` is a static function in `mutation_partition_v2.cc`.
and this function is not used in this translation unit. so let's
remove it.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#22006
2024-12-20 16:12:10 +02:00
Yaniv Michael Kaul
dbe4ac7465 LICENSE-ScyllaDB-Source-Available.md: fix markdown
Codespell complained.

Signed-off-by: Yaniv Kaul <yaniv.kaul@scylladb.com>

Closes scylladb/scylladb#21980
2024-12-20 16:11:39 +02:00
Aleksandra Martyniuk
1c29726477 replica: do not set tablet_task_info if it isn't valid
Currently, in tablet_map_to_mutation, repair's and migration's
tablet_task_info is always set.

Do not set the tablet_task_info if there is no running operation.

Closes scylladb/scylladb#22005
2024-12-20 16:10:53 +02:00
Kefu Chai
2a9f34bb85 test/pytest.ini: put repair marker declaration back
During the consolidation of per-suite pytest.ini files (commit 8bf62a086f),
the 'repair' marker was inadvertently dropped. This led to pytest warnings
for tests using the @pytest.mark.repair decorator.

This patch restores the marker declaration to eliminate the distracting
PytestUnknownMarkWarning:

```
test/topology_experimental_raft/test_tablets.py:396
  /home/kefu/dev/scylladb/test/topology_experimental_raft/test_tablets.py:396: PytestUnknownMarkWarning: Unknown pytest.mark.repair - is this a typo?  You can register custom marks to avoid this warning - for details, see https://docs.pytest.org/en/stable/how-to/mark.html
    @pytest.mark.repair

```

Restoring the marker allows tests to use the 'repair' mark without
generating warnings.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#21931
2024-12-20 14:04:50 +02:00
Botond Dénes
42d24b2a8a Merge 'Retire topology::sort_by_proximity and compare_endpoints flavors using gms::inet_address' from Benny Halevy
This series converts the call site using compare_endpoints with gms::inet_address.
With that both flavors of compare_endpoints and sort_by_proximity for inet_address
can be retired as no other uses remain.

Also, add a unit test for topology::sort_by_proximity before further changes
to it are considered.

* Code cleanup, no backport is needed

Closes scylladb/scylladb#21976

* github.com:scylladb/scylladb:
  test: network_topology_strategy_test: add test_topology_sort_by_proximity
  locator/topology: retire sort_by_proximity/compare_endpoints for inet_address
  test: test_topology_compare_endpoints: use host_id:s
2024-12-20 13:34:55 +02:00
Yaron Kaikov
74c5aabd23 build_docker: add option for building container based on Ubuntu Pro
Today our container is based on ubuntu:22.04, we need to build another container based on Ubuntu Pro for FIPS support (currently the latest one is 20.04)

The default docker build process doesn't change, if FIPS is required I have added `--type pro` to build a supported container.

To enable FIPS there is a need to attach an Ubuntu Pro subscription (it will be done as part of https://github.com/scylladb/scylla-pkg/issues/4186)

Closes scylladb/scylladb#21974
2024-12-20 13:09:24 +02:00
Asias He
0141906c4a repair: Enable small table optimization for RBNO rebuild
Similar to 9ace191616 (repair: Enable
small table optimization for RBNO bootstrap and decommission), this
patch enables small table optimization for RBNO rebuild.

This is useful for rebuild ops which is used for building an empty DC.

Fixes: #21951

Closes scylladb/scylladb#21952
2024-12-20 13:03:34 +02:00
Kefu Chai
24283d9dd0 test/topology: rename manager_internal to manager_client
instead of reusing the variable name and overriding the parameter,
use a new name for the return value of `manager_internal()` for better
readability.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#21932
2024-12-20 13:01:45 +02:00
Kefu Chai
6914892a1b repair: do not include unused headers
these unused includes are identified by clang-include-cleaner. after
auditing the source files, all of the reports have been confirmed.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#21837
2024-12-20 08:55:56 +02:00
Botond Dénes
d4129ddaa6 Merge 'sstables_manager: do not reclaim unlinked sstables' from Lakshmi Narayanan Sreethar
When an sstable is unlinked, it remains in the _active list of the
sstable manager. Its memory might be reclaimed and later reloaded,
causing issues since the sstable is already unlinked. This patch updates
the on_unlink method to reclaim memory from the sstable upon unlinking,
remove it from memory tracking, and thereby prevent the issues described
above.

Added a testcase to verify the fix.

Fixes #21887

This is a bug fix in the bloom filter reload/reclaim mechanism and should be backported to older versions.

Closes scylladb/scylladb#21895

* github.com:scylladb/scylladb:
  sstables_manager: reclaim memory from sstables on unlink
  sstables_manager: introduce reclaim_memory_and_stop_tracking_sstable()
  sstables: introduce disable_component_memory_reload()
  sstables_manager: log sstable name when reclaiming components
2024-12-19 15:18:16 +02:00
Kefu Chai
16397d8cba message: do not include unused header
In commit bfee93c7, repair verbs were moved to IDL. During this refactoring,
the `gc_clock.hh` header became unused as its references were relocated.
`clang-include-cleaner` helped identify this unnecessary include, which is
now removed to clean up the codebase.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#21919
2024-12-19 15:16:34 +02:00
Michał Chojnowski
f6ebd445e4 test_tablets.py: limit concurrency in test_tablet_storage_freeing
Apparently the python driver can't deal with the current concurrency sometimes.
Lower it from 1000 to 100.

Fixes scylladb/scylladb#20489

Closes scylladb/scylladb#20494
2024-12-19 15:14:41 +02:00
Kefu Chai
df36985fc3 raft: do not include unused headers
these unused includes are identified by clang-include-cleaner. after
auditing the source files, all of the reports have been confirmed.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#21838
2024-12-19 14:57:22 +02:00
Kefu Chai
93be8f3a0c db,sstables: migate boost::range::stable_partition to std library
now that we are allowed to use C++23. we now have the luxury of using
`std::ranges::stable_partition`.

in this change, we:

- replace `boost::range::stable_parition()` to
  `std::ranges::stable_parition()`
- since `std::ranges::stable_parition()` returns a subrange instead of
  an iterator, change the names of variables which were previously used
  for holding the return value of `boost::range::stable_partition()`
  accordingly for better readability.
- remove unused `#include` of boost headers

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#21911
2024-12-19 14:56:07 +02:00
Avi Kivity
a4440392d7 build: update dependencies for features to be ported from enterprise
ldap/slapd/toxiproxy/cyrus-sasl - for ldap authentication and authorization
git-lfs/bolt - for profile-guided optimization
lz4-static - for dictionary based network compression
jwt - for Oauth/GCP connectivity (for key management)
openkmip - for kmip testing
fipscheck - for FIPS validation

Frozen toolchain regenerated, with optimized clang from

  https://devpkg.scylladb.com/clang/clang-18.1.8-Fedora-40-aarch64.tar.gz
  https://devpkg.scylladb.com/clang/clang-18.1.8-Fedora-40-x86_64.tar.gz
2024-12-19 14:26:31 +02:00
Wojciech Mitros
37a25d3af4 mv: avoid stalls when calculating affected clustering ranges
Currently, when finishing db::view::calculate_affected_clustering_ranges
we deoverlap, transform and copy all ranges prepared before. This
is all done within a single continuation and can cause stalls.

We fix this by adding yields after each transform and moving elements
to the final vector one by one instead of copying them all at the end.

After this change, the longest continuation in this code will be
deoverlapping the initial ranges (and one transform). While it has
a relatively high computational complexity (we sort all ranges), it
should execute quickly because we're operating on views there and
we don't need to copy the actual bytes. If we encounter a stall there,
we'll need to implement an asynchronous `deoverlap` method.

Fixes scylladb/scylladb#21843

Closes scylladb/scylladb#21846
2024-12-19 12:50:30 +01:00
Kamil Braun
91cddcc17f Merge 'Do not reset quarantine list in non raft mode' from Gleb Natapov
The series contains small fixes to the gossiper one of which fixes #21930. Others I noticed while debugged the issue.

Fixes: scylladb/scylladb#21930

Closes scylladb/scylladb#21956

* github.com:scylladb/scylladb:
  gossiper: do not reset _just_removed_endpoints in non raft mode
  gossiper: do not send echo message to yourself
  gossiper: do not call apply for the node's old state
2024-12-19 11:03:35 +01:00
Pavel Emelyanov
bb094cc099 Merge 'Make restore task abortable' from Calle Wilund
Fixes #20717

Enables abortable interface and propagates abort_source to all s3 objects used for reading the restore data.

Note: because restore is done on each shard, we have to maintain a per-shard abort source proxy for each, and do a background per-shard abort on abort call. This is synced at the end of "run()".

Abort source is added as an optional parameter to s3 storage and the s3 path in distributed loader.

There is no attempt to "clean up" an aborted restore. As we read on a mutation level from remote sstables, we should not cause incomplete sstables as such, even though we might end up of course with partial data restored.

Closes scylladb/scylladb#21567

* github.com:scylladb/scylladb:
  test_backup: Add restore abort test case
  sstables_loader: Make restore task abortable
  distributed_loader: Add optional abort_source to get_sstables_from_object_store
  s3_storage: Add optional abort_source to params/object
  s3::client: Make "readable_file" abortable
2024-12-19 12:23:33 +03:00
Benny Halevy
67b7015ced test: network_topology_strategy_test: add test_topology_sort_by_proximity
Before further changes are made to sort_by_proximity
add a unit test for it.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2024-12-19 09:45:02 +02:00
Benny Halevy
1c5b0eca41 locator/topology: retire sort_by_proximity/compare_endpoints for inet_address
Those are not used anymore now that the last call
site for compare_endpoints by inet_address is converted
to use host_id.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2024-12-19 09:44:41 +02:00
Benny Halevy
dcdc60fffd test: test_topology_compare_endpoints: use host_id:s
This is the last call site requiring the compare_endpoints
flavour for inet_address.

Once this test is converted to use host_id:s instead,
compare_endpoints and sort_by_proximity can be simplified
to support only host_id:s.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2024-12-19 09:44:26 +02:00
Kefu Chai
2a31a82ae2 .github: Ensure header generation before include analysis
When running clang-include-cleaner, the tool performs static analysis by
"compiling" specified source files. Previously, non-existent included headers
caused the tool to skip source files, reducing the effectiveness of unused
include detection.

Problem:
- Header files like 'rust/wasmtime_bindings.hh' were not pre-generated
- Compilation errors led to skipping source file analysis

  ```
  /__w/scylladb/scylladb/lang/wasm.hh:15:10: fatal error: 'rust/wasmtime_bindings.hh' file not found
     15 | #include "rust/wasmtime_bindings.hh"
        |          ^~~~~~~~~~~~~~~~~~~~~~~~~~~
  Skipping file /__w/scylladb/scylladb/lang/wasm.hh due to compiler errors. clang-include-cleaner expects to work on compilable source code.
  1 error generated.
```

- This significantly reduced clang-include-cleaner's coverage

Solution:
- Build the `wasmtime_bindings` target to generate required header files
- Ensure all necessary headers are created before running static analysis
- Enable full source file checking for unused includes

By generating headers before analysis, we prevent skipping of source files
and improve the comprehensiveness of our include cleaner workflow.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#21739
2024-12-19 09:41:46 +02:00
Ferenc Szili
dc375b8cd3 test: enable test_truncate_with_coordinator_crash
This test was added in PR #19789 but was disabled with xfail because of
the bug with way truncate saved the commit log replay positions. More
specifically, the replay positions for shards that had no mutations were
saved to system.truncated with shard_id == 0, regardless for which shard
it was actually saved for (see #21719).
The bug was fixed in #21722, so this change removes the xfail tag from
the test.

Closes scylladb/scylladb#21902
2024-12-18 18:02:52 +01:00