Commit Graph

42180 Commits

Author SHA1 Message Date
Pavel Emelyanov
90593f4e82 view_builder: Generalize mark_as_built(view_ptr) method
Marking is performed in two places and they can be generalized

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2024-04-05 19:56:12 +03:00
Pavel Emelyanov
3c3f2cd337 view_builder: Move mark_existing_views_as_built from storage service
Now it's in the correct component

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2024-04-05 19:56:11 +03:00
Pavel Emelyanov
895391fb4b storage_service: Add view_builder& reference
Storage service will need to drain v.b. on its drain. Also, on cluster
join it marks existing views as built, while that is v.b.'s job.
Both will be fixed by the next patches, and this one is a prerequisite.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2024-04-05 19:55:07 +03:00
Pavel Emelyanov
f00f1f117b main,cql_test_env: Move view_builder start up (and make unconditional)
Just starting sharded<view_builder> is lightweight: its constructor does
nothing but initialize on-board variables. The real work starts in
view_builder::start(), which is not moved.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2024-04-05 19:53:33 +03:00
Tomasz Grabiec
0c74c2c12f Merge 'Extend tablet_transition_kind::rebuild to rebuild tablet to new replica' from Pavel Emelyanov
When altering rf for a keyspace, all tablets in this ks will get more replicas. Part of this process is rebuilding tablets onto new node(s). This PR extends the tablet transition code to support rebuilding a tablet on a new replica.

fixes: #18030

Closes scylladb/scylladb#18082

* github.com:scylladb/scylladb:
  test: Check data presence as well
  test: Test how tablets are copied between nodes
  test: Add sanity test for tablet migration
  api: Add method to add replica to a tablet
  tablet: Make leaving replica optional
2024-04-05 12:51:10 +02:00
Pavel Emelyanov
639cc1f576 compaction: Replace formatted_sstables_list with fmt:: facilities
The formatted_sstables_list is an auxiliary class that collects a bunch of
sstables::to_string(shared_sstable)-generated strings. One bad side
effect of this helper is that it allocates memory for the vector of
strings.

This patch achieves the same goal with the help of fmt::join() combined
with the boost transformed adaptor.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>

Closes scylladb/scylladb#18160
2024-04-05 09:17:15 +03:00
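The idea behind commit 639cc1f576 (and the similar 1c1004d1bd) is to convert each element while it is being printed, rather than first collecting the converted strings into a vector. The patch itself uses fmt::join() with a boost adaptor; the sketch below swaps in a plain loop over a standard stream so that it needs nothing beyond the standard library. `fake_sstable`, `to_name`, `print_names` and `names_to_string` are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for an sstable; only its name matters here.
struct fake_sstable {
    std::string filename;
};

// Hypothetical per-element conversion, playing the role of a to_string() helper.
inline const std::string& to_name(const fake_sstable& sst) {
    return sst.filename;
}

// Streams "[a, b, c]" directly into `out`, converting each element on the fly.
// No intermediate std::vector<std::string> is allocated.
inline void print_names(std::ostream& out, const std::vector<fake_sstable>& ssts) {
    out << '[';
    bool first = true;
    for (const auto& sst : ssts) {
        if (!first) {
            out << ", ";
        }
        out << to_name(sst);
        first = false;
    }
    out << ']';
}

// Convenience wrapper that returns the formatted list as a string.
inline std::string names_to_string(const std::vector<fake_sstable>& ssts) {
    std::ostringstream os;
    print_names(os, ssts);
    return os.str();
}
```

The only allocation left is the output buffer itself; the per-element strings are never copied into a temporary container.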
Kefu Chai
ff43628b44 gms: do not include unused headers
these unused includes were identified by clangd. see
https://clangd.llvm.org/guides/include-cleaner#unused-include-warning
for more details on the "Unused include" warning.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#18194
2024-04-05 08:48:17 +03:00
Pavel Emelyanov
2a98e95cd0 api: Coroutinize API get_snapshot_details handler
Now it's possible to understand what it does

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>

Closes scylladb/scylladb#18190
2024-04-04 22:20:28 +03:00
Pavel Emelyanov
c7908c319f test: Check data presence as well
Other than making sure that system.tablets is updated with correct
replica set, it's also good to check that the data is present on the
respective nodes.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2024-04-04 18:01:24 +03:00
Raphael S. Carvalho
9f93dd9fa3 replica: Use flat_hash_map for tablet storage
The reason that we want to switch to flat_hash_map is that only a small
subset of tablets will be allocated on any given shard, therefore it's
wasteful to use a sparse array, and iterations are slow.
Also, the map gives greater development flexibility as one doesn't have
to worry about empty entries.

perf result:

-- reads

scylla_with_chunked_vector-read-no-tablets.txt
median 73223.28 tps ( 62.3 allocs/op,  13.3 tasks/op,   41932 insns/op,        0 errors)
median 74952.87 tps ( 62.3 allocs/op,  13.3 tasks/op,   41969 insns/op,        0 errors)
median 73016.37 tps ( 62.3 allocs/op,  13.3 tasks/op,   41934 insns/op,        0 errors)
median 74078.14 tps ( 62.3 allocs/op,  13.3 tasks/op,   41938 insns/op,        0 errors)
median 75323.07 tps ( 62.3 allocs/op,  13.3 tasks/op,   41944 insns/op,        0 errors)

scylla_with_hash_map-read-no-tablets.txt
median 74963.30 tps ( 62.3 allocs/op,  13.3 tasks/op,   41926 insns/op,        0 errors)
median 74032.09 tps ( 62.3 allocs/op,  13.3 tasks/op,   41918 insns/op,        0 errors)
median 74850.09 tps ( 62.3 allocs/op,  13.3 tasks/op,   41937 insns/op,        0 errors)
median 74239.37 tps ( 62.3 allocs/op,  13.3 tasks/op,   41921 insns/op,        0 errors)
median 74798.14 tps ( 62.3 allocs/op,  13.3 tasks/op,   41925 insns/op,        0 errors)

scylla_with_chunked_vector-read-tablets-1.txt
median 74234.27 tps ( 62.1 allocs/op,  13.3 tasks/op,   41903 insns/op,        0 errors)
median 75775.98 tps ( 62.1 allocs/op,  13.3 tasks/op,   41910 insns/op,        0 errors)
median 76481.56 tps ( 62.1 allocs/op,  13.2 tasks/op,   41874 insns/op,        0 errors)
median 74056.67 tps ( 62.1 allocs/op,  13.3 tasks/op,   41894 insns/op,        0 errors)
median 75287.68 tps ( 62.1 allocs/op,  13.3 tasks/op,   41894 insns/op,        0 errors)

scylla_with_hash_map-read-tablets-1.txt
median 75613.63 tps ( 62.1 allocs/op,  13.2 tasks/op,   41990 insns/op,        0 errors)
median 74819.51 tps ( 62.1 allocs/op,  13.2 tasks/op,   41973 insns/op,        0 errors)
median 75648.41 tps ( 62.1 allocs/op,  13.3 tasks/op,   42025 insns/op,        0 errors)
median 74170.89 tps ( 62.1 allocs/op,  13.2 tasks/op,   42002 insns/op,        0 errors)
median 75447.72 tps ( 62.1 allocs/op,  13.3 tasks/op,   41952 insns/op,        0 errors)

scylla_with_chunked_vector-read-tablets-128.txt
median 73788.57 tps ( 62.1 allocs/op,  13.2 tasks/op,   41956 insns/op,        0 errors)
median 76563.63 tps ( 62.1 allocs/op,  13.3 tasks/op,   42006 insns/op,        0 errors)
median 75536.12 tps ( 62.1 allocs/op,  13.2 tasks/op,   42005 insns/op,        0 errors)
median 74679.17 tps ( 62.1 allocs/op,  13.3 tasks/op,   41958 insns/op,        0 errors)
median 75380.95 tps ( 62.1 allocs/op,  13.2 tasks/op,   41946 insns/op,        0 errors)

scylla_with_hash_map-read-tablets-128.txt
median 75459.99 tps ( 62.1 allocs/op,  13.3 tasks/op,   42055 insns/op,        0 errors)
median 74280.11 tps ( 62.1 allocs/op,  13.3 tasks/op,   42085 insns/op,        0 errors)
median 74502.61 tps ( 62.1 allocs/op,  13.3 tasks/op,   42063 insns/op,        0 errors)
median 74692.27 tps ( 62.1 allocs/op,  13.3 tasks/op,   41994 insns/op,        0 errors)
median 75402.64 tps ( 62.1 allocs/op,  13.3 tasks/op,   42015 insns/op,        0 errors)

-- writes

scylla_with_chunked_vector-write-no-tablets.txt
median 68635.17 tps ( 58.4 allocs/op,  13.3 tasks/op,   52709 insns/op,        0 errors)
median 68716.36 tps ( 58.4 allocs/op,  13.3 tasks/op,   52691 insns/op,        0 errors)
median 68512.76 tps ( 58.4 allocs/op,  13.3 tasks/op,   52721 insns/op,        0 errors)
median 68606.14 tps ( 58.4 allocs/op,  13.3 tasks/op,   52696 insns/op,        0 errors)
median 68619.25 tps ( 58.4 allocs/op,  13.3 tasks/op,   52697 insns/op,        0 errors)

scylla_with_hash_map-write-no-tablets.txt
median 67678.10 tps ( 58.4 allocs/op,  13.3 tasks/op,   52723 insns/op,        0 errors)
median 67966.06 tps ( 58.4 allocs/op,  13.3 tasks/op,   52736 insns/op,        0 errors)
median 67881.47 tps ( 58.4 allocs/op,  13.3 tasks/op,   52743 insns/op,        0 errors)
median 67856.81 tps ( 58.4 allocs/op,  13.3 tasks/op,   52730 insns/op,        0 errors)
median 67812.58 tps ( 58.4 allocs/op,  13.3 tasks/op,   52740 insns/op,        0 errors)

scylla_with_chunked_vector-write-tablets-1.txt
median 67741.83 tps ( 58.4 allocs/op,  13.3 tasks/op,   53425 insns/op,        0 errors)
median 68014.20 tps ( 58.4 allocs/op,  13.3 tasks/op,   53455 insns/op,        0 errors)
median 68228.48 tps ( 58.4 allocs/op,  13.3 tasks/op,   53447 insns/op,        0 errors)
median 67950.96 tps ( 58.4 allocs/op,  13.3 tasks/op,   53443 insns/op,        0 errors)
median 67832.69 tps ( 58.4 allocs/op,  13.3 tasks/op,   53462 insns/op,        0 errors)

scylla_with_hash_map-write-tablets-1.txt
median 66873.70 tps ( 58.4 allocs/op,  13.3 tasks/op,   53548 insns/op,        0 errors)
median 67568.23 tps ( 58.4 allocs/op,  13.3 tasks/op,   53547 insns/op,        0 errors)
median 67653.70 tps ( 58.4 allocs/op,  13.3 tasks/op,   53525 insns/op,        0 errors)
median 67389.21 tps ( 58.4 allocs/op,  13.3 tasks/op,   53536 insns/op,        0 errors)
median 67437.91 tps ( 58.4 allocs/op,  13.3 tasks/op,   53537 insns/op,        0 errors)

scylla_with_chunked_vector-write-tablets-128.txt
median 67115.41 tps ( 58.3 allocs/op,  13.3 tasks/op,   53341 insns/op,        0 errors)
median 66836.07 tps ( 58.3 allocs/op,  13.3 tasks/op,   53342 insns/op,        0 errors)
median 67214.07 tps ( 58.3 allocs/op,  13.3 tasks/op,   53303 insns/op,        0 errors)
median 67198.25 tps ( 58.3 allocs/op,  13.3 tasks/op,   53347 insns/op,        0 errors)
median 67368.78 tps ( 58.3 allocs/op,  13.3 tasks/op,   53374 insns/op,        0 errors)

scylla_with_hash_map-write-tablets-128.txt
median 66273.50 tps ( 58.3 allocs/op,  13.3 tasks/op,   53400 insns/op,        0 errors)
median 66564.89 tps ( 58.3 allocs/op,  13.3 tasks/op,   53432 insns/op,        0 errors)
median 66568.52 tps ( 58.3 allocs/op,  13.3 tasks/op,   53408 insns/op,        0 errors)
median 66368.00 tps ( 58.3 allocs/op,  13.3 tasks/op,   53441 insns/op,        0 errors)
median 66293.55 tps ( 58.3 allocs/op,  13.3 tasks/op,   53408 insns/op,        0 errors)

Fixes #18010.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>

Closes scylladb/scylladb#18093
2024-04-04 16:25:48 +03:00
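The rationale given in commit 9f93dd9fa3 (a shard hosts only a small subset of a table's tablets, so a sparse array wastes slots and slows iteration) can be illustrated with a small sketch. It is not the replica code: `storage_group`, `make_sparse`, `make_map` and `slots_visited` are hypothetical, and std::unordered_map stands in for the flat_hash_map named in the commit.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <unordered_map>
#include <vector>

// Hypothetical per-tablet state owned by one shard.
struct storage_group {
    std::size_t tablet_id;
};

// Sparse layout: one slot per tablet of the table, mostly null on any given shard.
using sparse_storage = std::vector<std::unique_ptr<storage_group>>;

// Map layout: only tablets hosted on this shard have an entry.
// std::unordered_map is a stand-in for flat_hash_map.
using map_storage = std::unordered_map<std::size_t, std::unique_ptr<storage_group>>;

// Builds the sparse layout for `tablet_count` tablets, `local` of which live here.
inline sparse_storage make_sparse(std::size_t tablet_count, const std::vector<std::size_t>& local) {
    sparse_storage s(tablet_count);
    for (auto id : local) {
        s[id] = std::make_unique<storage_group>(storage_group{id});
    }
    return s;
}

// Builds the map layout holding only the local tablets.
inline map_storage make_map(const std::vector<std::size_t>& local) {
    map_storage m;
    for (auto id : local) {
        m.emplace(id, std::make_unique<storage_group>(storage_group{id}));
    }
    return m;
}

// Number of slots a full iteration has to touch in the sparse layout.
inline std::size_t slots_visited(const sparse_storage& s) {
    std::size_t visited = 0;
    for (const auto& slot : s) {
        (void)slot;  // empty slots are visited too
        ++visited;
    }
    return visited;
}

// Number of entries a full iteration has to touch in the map layout.
inline std::size_t slots_visited(const map_storage& m) {
    std::size_t visited = 0;
    for (const auto& entry : m) {
        assert(entry.second != nullptr);  // there are no empty entries to skip
        (void)entry;
        ++visited;
    }
    return visited;
}
```

With 128 tablets and two of them local, iterating the sparse layout touches 128 slots while the map touches two, and callers of the map never have to handle a null entry.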
Yaniv Kaul
2ce2649ec1 Typo: you -> your
Signed-off-by: Yaniv Kaul <yaniv.kaul@scylladb.com>

Closes scylladb/scylladb#17806
2024-04-04 14:55:46 +03:00
Nadav Har'El
c24bc3b57a alternator: do not use tablets on new Alternator tables
A few months ago, in merge d3c1be9107,
we decided that if Scylla has the experimental "tablets" feature enabled,
new Alternator tables should use this feature by default - exactly like
this is the default for new CQL tables.

Sadly, it has now been decided to reverse this decision: we do not yet trust
LWT on tablets enough, and since Alternator often (if not always) relies
on LWT, we want Alternator tables to continue to use vnodes - not tablets.

The fix is trivial - just changing the default. No existing test needed to
change, because all Alternator tests already work correctly on Scylla with the
tablets experimental feature disabled. I added a new test to enshrine
the fact that Alternator does not use tablets.

An unfortunate result of this patch will be that Alternator tables
created on versions with this patch (e.g., Scylla 6.0) will not use
tablets and will continue to not use tablets even if Scylla is upgraded
(currently, the use of tablets is decided at table creation time, and
there is no way to "upgrade" a vnode-based table to be tablet based).

This patch should be reverted as soon as LWT support matures on tablets.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>

Closes scylladb/scylladb#18157
2024-04-04 12:11:29 +03:00
Pavel Emelyanov
1c1004d1bd sstables_loader: Format list of sstables' filenames in place
Loader wants to print set of sstables' names. For that it collects names
into a dedicated vector, then prints it using fmt/ranges facility.

There's a way to achieve the same goal without allocating extra vector
with names -- use fmt::format() and pass it a range converting sstables
into their names.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>

Closes scylladb/scylladb#18159
2024-04-04 12:09:52 +03:00
Ferenc Szili
f1cc6252fd logging: Don't log PK/CK in large partition/row/cell warning
Currently, Scylla logs a warning when it writes a cell, row or partition that is larger than a configured size. These warnings contain the partition key and, in the case of rows and cells, also the clustering key, which allow the large row or partition to be identified. However, these keys can contain user-private, sensitive information. The information that identifies the partition/row/cell is also inserted into the tables system.large_partitions, system.large_rows and system.large_cells, respectively.

This change removes the partition and clustering keys from the log messages, but still inserts them into the system tables.

The logged data will look like this:

Large cells:
WARN  2024-04-02 16:49:48,602 [shard 3:  mt] large_data - Writing large cell ks_name/tbl_name: cell_name (SIZE bytes) to sstable.db

Large rows:
WARN  2024-04-02 16:49:48,602 [shard 3:  mt] large_data - Writing large row ks_name/tbl_name: (SIZE bytes) to sstable.db

Large partitions:
WARN  2024-04-02 16:49:48,602 [shard 3:  mt] large_data - Writing large partition ks_name/tbl_name: (SIZE bytes) to sstable.db

Fixes #18041

Closes scylladb/scylladb#18166
2024-04-04 12:06:31 +03:00
Kefu Chai
3b50c39a83 scylla-gdb: access io_queue::_streams and io_queue::_fgs with static_vector
in seastar's b28342fa5a301de3facf5e83dc691524a6b20604, we switched
* `io_queue::_streams` from
  `boost::container::small_vector<fair_queue, 2>` to
  `boost::container::static_vector<fair_queue, 2>`
* `io_queue::_fgs` from
  `std::vector<std::unique_ptr<fair_group>>` to
  `boost::container::static_vector<fair_group, 2>`

so we need to update the gdb script accordingly to reflect this
change, and to avoid the nested try-except blocks, we switch to
a `while` statement to simplify the code structure.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#18165
2024-04-04 11:39:10 +03:00
Anna Stuchlik
994f807bf6 docs: add the latest image info to GCP and Azure pages
This commit adds image information for the latest patch release
to the GCP and Azure deployment page.
The information now replaces the reference to the Download Center
so that the user doesn't have to jump to another website.

Fixes https://github.com/scylladb/scylladb/issues/18144

Closes scylladb/scylladb#18168
2024-04-04 11:24:39 +03:00
Kefu Chai
64b8bb239f api/storage_service: throw if table is not found when move tablets
`database::find_column_family()` throws no_such_column_family
if an unknown ks.cf is fed to it. and we call into this function
without checking for the existence of ks.cf first. since
"/storage_service/tablets/move" is a public interface, we should
translate this error to a better http error.

in this change, we check for the existence of the given ks.cf, and
throw an exception so that it can be caught by seastar::httpd::routers,
and converted to an HTTP error.

Fixes #17198
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#17217
2024-04-04 11:23:52 +03:00
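The pattern described in commit 64b8bb239f is to validate the request up front and throw an exception type that the HTTP layer knows how to translate, instead of letting a lower-level lookup error escape. The sketch below is a self-contained illustration only: `bad_param_error`, `table_catalogue`, `validate_table`, `dispatch` and the status codes are assumptions, not Seastar's or Scylla's actual API.

```cpp
#include <cassert>
#include <exception>
#include <set>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical exception type that the HTTP layer maps to a client error.
struct bad_param_error : std::runtime_error {
    using std::runtime_error::runtime_error;
};

// Hypothetical catalogue of known (keyspace, table) pairs.
using table_catalogue = std::set<std::pair<std::string, std::string>>;

// Checks that ks.cf exists before the handler does any real work.
inline void validate_table(const table_catalogue& known,
                           const std::string& ks, const std::string& cf) {
    if (known.count({ks, cf}) == 0) {
        throw bad_param_error("Can't find table " + ks + "." + cf);
    }
}

// Hypothetical stand-in for the router: runs the handler and converts
// exceptions into a status code (the codes are illustrative).
template <typename Handler>
int dispatch(Handler&& handler) {
    try {
        handler();
        return 200;
    } catch (const bad_param_error&) {
        return 400;  // the caller supplied an unknown table
    } catch (const std::exception&) {
        return 500;  // anything else is reported as a server error
    }
}
```

The point of the explicit check is that a typo in the request is reported as the caller's mistake rather than as an internal failure.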
Pavel Emelyanov
590f0329ae test: Test how tablets are copied between nodes
This patches the previously introduced test by introducing the 'action'
test parameter and tweaking the final assertions about the tablet
replicas read from system.tablets

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2024-04-04 09:22:57 +03:00
Pavel Emelyanov
28964ba5fe test: Add sanity test for tablet migration
It just checks that after an API call to move_tablet the resulting replica
is in the expected state. This test will later be expanded to check the
rebuild transition.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2024-04-04 09:22:31 +03:00
Pavel Emelyanov
79ad760e95 api: Add method to add replica to a tablet
The new API submits rebuild transition with new replicas set to be old
(current) replicas plus the provided one. It looks and acts like the
move_tablet API call with several changes:

- lacks the "source" replica argument
- submits "rebuild" transition kind
- cross-rack checks are not performed

The 'force' argument is inherited from move_tablet, but is unused now
and is left for future use.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2024-04-04 09:22:16 +03:00
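Commit 79ad760e95 describes the request it submits as a "rebuild" transition whose new replica set is the current set plus the provided replica. A minimal sketch of that computation follows; `tablet_replica`, `transition_kind`, `transition_request` and `make_add_replica_request` are hypothetical types and names, and plain integers stand in for host and shard identifiers.

```cpp
#include <cassert>
#include <vector>

// Hypothetical replica descriptor; integers stand in for real identifiers.
struct tablet_replica {
    int host;
    int shard;
};

// Hypothetical subset of transition kinds.
enum class transition_kind { migration, rebuild };

// Hypothetical request describing the transition to submit.
struct transition_request {
    transition_kind kind;
    std::vector<tablet_replica> next_replicas;
};

// New replica set = current replicas + the provided one, submitted as "rebuild".
// Unlike a move, there is no source replica to remove.
inline transition_request make_add_replica_request(const std::vector<tablet_replica>& current,
                                                   tablet_replica added) {
    transition_request req{transition_kind::rebuild, current};
    req.next_replicas.push_back(added);
    return req;
}
```

Because nothing is removed, the resulting set is always one replica larger than the current one.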
Tomasz Grabiec
1a839bcb36 main: Skip tablet metadata loading in maintenance mode
If system.tablets is corrupted, the node would not boot in maintenance
mode, which is needed to fix system.tablets.

Closes scylladb/scylladb#17990
2024-04-04 09:20:09 +03:00
Pavel Emelyanov
b0cba57e29 tablet: Make leaving replica optional
When getting the leaving replica from tablet info and transition info,
the getter code assumes that this replica always exists. It's not going
to be the case soon, so make the return value be optional.

There are four places that mess with leaving replica:

- stream tablet handler: this place checks that the leaving replica is
  _not_ current host. If leaving replica is missing, the check should
  pass

- cleanup tablet handler: this place checks that the leaving replica
  _is_ current host. If leaving replica is missing, the check should
  fail as well

- topology coordinator: it gets leaving replica to call cleanup on. If
  leaving replica is missing, the cleanup call is short-circuited to
  succeed immediately

- load-stats calculator: it checks if the leaving replica is self. This
  check is not patched as it's automatically satisfied by std::optional
  comparison operator overload for wrapped type

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2024-04-04 09:03:36 +03:00
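The last bullet of commit b0cba57e29 relies on how std::optional compares with a plain value: an empty optional is never equal to a value and is always unequal to it. The sketch below shows that behaviour with a hypothetical `replica` type; `leaving_is` and `leaving_is_not` are illustrative helpers, not functions from the patch.

```cpp
#include <cassert>
#include <optional>

// Hypothetical replica descriptor.
struct replica {
    int host;
    int shard;
};

inline bool operator==(const replica& a, const replica& b) {
    return a.host == b.host && a.shard == b.shard;
}

inline bool operator!=(const replica& a, const replica& b) {
    return !(a == b);
}

// "Is the leaving replica this one?" An empty optional yields false.
inline bool leaving_is(const std::optional<replica>& leaving, const replica& self) {
    return leaving == self;
}

// "Is the leaving replica someone else (or absent)?" An empty optional yields true.
inline bool leaving_is_not(const std::optional<replica>& leaving, const replica& self) {
    return leaving != self;
}
```

These results line up with the outcomes listed in the commit message: an "is not this host" check passes and an "is this host" check fails when there is no leaving replica.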
Michał Chojnowski
8147ab69ac row_cache_test: avoid a throw in external_updater
In test_exception_safety_of_update_from_memtable, we have a potential
throw from external_updater.

external_updater is supposed to be infallible.
Scylla currently aborts when an external_updater throws, so a throw from
there just fails the test.

This isn't intended. We aren't testing external_updater in this test.

Fixes #18163

Closes scylladb/scylladb#18171
2024-04-03 23:22:08 +02:00
Piotr Dulikowski
baae811142 Merge 'auth: keep auth version in scylla_local' from Marcin Maliszkiewicz
Before the patch, selection of the auth version depended
on the consistent topology feature. During the raft recovery
procedure this feature is disabled, so we need to persist
the version somewhere to avoid switching back to v1, which
is not supported.

During recovery auth works in read-only mode, so writes
will fail.

Fixes https://github.com/scylladb/scylladb/issues/17736

Closes scylladb/scylladb#18039

* github.com:scylladb/scylladb:
  auth: keep auth version in scylla_local
  auth: coroutinize service::start
2024-04-03 12:25:56 +02:00
Kefu Chai
e2f3fed373 service: qos: fix a typo
s/accesor/accessor/

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#18124
2024-04-03 10:33:54 +02:00
Raphael S. Carvalho
12714a4123 locator: Avoid tablet map lookup on every write for getting replicas
We can cache tablet map in erm, to avoid looking it up on every write for
getting write replicas. We do that in tablet_sharder, but not in tablet
erm. Tablet map is immutable in the context of a given erm, so the
address of the map is stable during erm lifetime.

This caught my attention when looking at perf diff output
(comparing tablet and vnode modes).

It also helps when erm is called again on write completion for
checking locality, used for forwarding info to the driver if needed.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>

Closes scylladb/scylladb#18158
2024-04-03 10:28:04 +02:00
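Commit 12714a4123 caches the tablet map in the erm because the map is immutable for the lifetime of a given erm, so its address can be resolved once. A minimal sketch of that caching pattern, assuming hypothetical `tablet_metadata`, `tablet_map` and `cached_erm` types (integers stand in for host identifiers, and a lookup counter is added purely to make the effect observable):

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

using replica_set = std::vector<int>;  // hypothetical host ids

// Hypothetical tablet map: replicas indexed by tablet id.
struct tablet_map {
    std::vector<replica_set> replicas_by_tablet;
};

// Hypothetical metadata: table name -> tablet map, counting lookups.
struct tablet_metadata {
    std::map<std::string, tablet_map> maps;
    mutable std::size_t lookups = 0;

    const tablet_map& get(const std::string& table) const {
        ++lookups;
        return maps.at(table);
    }
};

// Hypothetical erm: resolves the map once at construction and keeps a pointer.
// This is only valid because the metadata is not modified while the erm lives.
class cached_erm {
    const tablet_map* _map;
public:
    cached_erm(const tablet_metadata& md, const std::string& table)
        : _map(&md.get(table)) {}

    // Per-write path: no metadata lookup, only an index into the cached map.
    const replica_set& replicas_for(std::size_t tablet) const {
        return _map->replicas_by_tablet.at(tablet);
    }
};
```

However many writes go through `replicas_for`, the metadata lookup happens once per erm rather than once per write.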
Botond Dénes
d43670046b test/lib: random_schema: disallow boolean_type in keys
They result in poor distribution and poor cardinality, interfering with
tests which want to generate N partitions or rows.

Fixes: #17821

Closes scylladb/scylladb#17856
2024-04-03 09:52:36 +03:00
Botond Dénes
2cb5dcabf7 docs/dev/maintainer.md: document another exception to rule no.0
Maintainers are also allowed to commit their own backport PR. They are
allowed to backport their own code; opening a PR to get a CI run for a
backport doesn't change this.

Closes scylladb/scylladb#17727
2024-04-03 09:51:19 +03:00
Piotr Dulikowski
3ba7a4ead2 Merge 'api: upgrade_to_raft topology: add logging' from Benny Halevy
Upgrading raft topology is an important api call
that should be logged.

When it fails, it is also important to log the
exception to get better visibility into why
the call failed.

Closes scylladb/scylladb#18143

* github.com:scylladb/scylladb:
  api: storage_service: upgrade_to_raft_topology: fixup indentation
  api: storage_service: upgrade_to_raft_topology: add logging
2024-04-03 07:00:10 +02:00
Pavel Emelyanov
8550a38a8b cql: Reserve vector of column definitions in advance
The vector in question is populated from the content of another map, so
its size is known in advance

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>

Closes scylladb/scylladb#18155
2024-04-02 22:35:10 +03:00
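The optimization in commit 8550a38a8b is the usual reserve-before-fill pattern: when a vector is filled with one element per entry of an existing container, its final size is known before the loop starts. A minimal sketch, with a hypothetical `column_definition` type and `to_definitions` helper:

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical column definition.
struct column_definition {
    std::string name;
    std::string type;
};

// Builds one definition per map entry, reserving the storage up front so the
// vector does not have to grow while it is being filled.
inline std::vector<column_definition> to_definitions(const std::map<std::string, std::string>& columns) {
    std::vector<column_definition> defs;
    defs.reserve(columns.size());  // final size is known: one element per entry
    for (const auto& [name, type] : columns) {
        defs.push_back(column_definition{name, type});
    }
    assert(defs.size() == columns.size());
    return defs;
}
```

Without the reserve() call the result is the same, but the vector may reallocate and move its elements several times as it grows.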
Marcin Maliszkiewicz
562caaf6c6 auth: keep auth version in scylla_local
Before the patch, selection of the auth version depended
on the consistent topology feature. During the raft recovery
procedure this feature is disabled, so we need to persist
the version somewhere to avoid switching back to v1, which
is not supported.

During recovery auth works in read-only mode, so writes
will fail.
2024-04-02 19:04:21 +02:00
Benny Halevy
1272d736c0 api: storage_service: upgrade_to_raft_topology: fixup indentation
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2024-04-02 20:02:51 +03:00
Benny Halevy
31026ae27f api: storage_service: upgrade_to_raft_topology: add logging
Upgrading raft topology is an important api call
that should be logged.

When it fails, it is also important to log the
exception to get better visibility into why
the call failed.

Indentation will be fixed in the next patch.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2024-04-02 20:02:49 +03:00
Kefu Chai
15d59db98b cql3: select_statement: include <ranges>
we should include the headers we use, to avoid compilation failures like:
```
cql3/statements/select_statement.cc:229:79: error: no member named 'filter' in namespace 'std::ranges::views'
        for (const auto& used_function : used_functions | std::ranges::views::filter(not_native)) {
                                                          ~~~~~~~~~~~~~~~~~~~~^
1 error generated.
```
if one of the included headers drops its own `#include <ranges>`.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#18145
2024-04-02 18:47:54 +03:00
Botond Dénes
2179bfc40d Merge 'Relax initialization of virtual tables' from Pavel Emelyanov
Initialization now happens in initialize_virtual_tables(), but this function is split into sub-calls and iterates over the virtual tables map several times to do its work. This PR squashes it into straightforward code that is shorter and, hopefully, easier to read.

Closes scylladb/scylladb#18133

* github.com:scylladb/scylladb:
  virtual_tables: Open-code install_virtual_readers_and_writers()
  virtual_tables: Move readers setup loop into add_table()
  virtual_tables: Move tables creation loop into add_table()
  virtual_tables: Make add_tablet() a coroutine
  virtual_tables: Open-code register_virtual_tables()
2024-04-02 13:39:26 +03:00
Botond Dénes
469ff4f290 Merge 'repair: Load repair history in background' from Asias He
Currently, we load the repair history during boot up. If the number of
repair history entries is high, it might take a while to load them.

In my test, to load 10M entries, it took around 60 seconds.

It is not a must to load the entries during boot up. It is better to
load them in the background to speed up the boot time.

Fixes #17993

Closes scylladb/scylladb#17994

* github.com:scylladb/scylladb:
  repair: Load repair history in background
  repair: Abort load_history process in shutdown
2024-04-02 10:53:10 +03:00
Botond Dénes
fd12052c89 Update tools/java/ submodule
* tools/java/ d61296dc...b810e8b0 (1):
  > do not include {dclocal_,}read_repair_chance if not enabled
2024-04-02 10:47:57 +03:00
Yaron Kaikov
fcdb80773e github: sync-labels: run only in scylladb oss repo
We currently support the sync-label workflow only in OSS. Since
Scylla-enterprise gets all the commits from the OSS repo, sync-label runs
there and fails during checkout (it's a private repo and needs a different
configuration).

For now, let's limit the workflow to the OSS repo.

Closes scylladb/scylladb#18142
2024-04-02 10:45:17 +03:00
Botond Dénes
ffdd47c2b1 Merge 'Track and limit memory used by bloom filters' from Lakshmi Narayanan Sreethar
Added support to track and limit the memory usage by sstable components. A reclaimable component of an SSTable is one from which memory can be reclaimed. SSTables and their managers now track such reclaimable memory and limit the component memory usage accordingly. A new configuration variable defines the memory reclaim threshold. If the total memory of the reclaimable components exceeds this limit, memory will be reclaimed to keep the usage under the limit. This PR considers only the bloom filters as reclaimable and adds support to track and limit them as required.

The feature can be manually verified by doing the following:
1. run a single-node single-shard 1GB cluster
2. create a table with bloom-filter-false-positive-chance of 0.001 (to intentionally cause large bloom filter)
3. populate with tiny partitions
4. watch the bloom filter metrics get capped at 100MB

The default value of the `components_memory_reclaim_threshold` config variable which controls the reclamation process is `.1`. This can also be reduced further during manual tests to easily hit the threshold and verify the feature.

Fixes #17747

Closes scylladb/scylladb#17771

* github.com:scylladb/scylladb:
  test_bloom_filter.py: disable reclaiming memory from components
  sstable_datafile_test: add tests to verify auto reclamation of components
  test/lib: allow overriding available memory via test_env_config
  sstables_manager: support reclaiming memory from components
  sstables_manager: store available memory size
  sstables_manager: add variable to track component memory usage
  db/config: add a new variable to limit memory used by table components
  sstable_datafile_test: add testcase to verify reclamation from sstables
  sstables: support reclaiming memory from components
2024-04-02 10:40:52 +03:00
Amnon Heiman
803d414896 get_description.py: Make the Script a library
This patch makes the get_description.py script easier to use by the
documentation automation:
1. The script is now a library.
2. You can choose the output format of the script; pipe and yml are
   currently supported.

You can still call the script from the command line, like before, but you
can also call it from another Python script.

For example, the following Python script would generate the documentation
for the metrics description of the ./alternator/ttl.cc file.
```

import get_description

metrics = get_description.get_metrics_from_file("./alternator/ttl.cc", "scylla", get_description.get_metrics_information("metrics-config.yml"))
get_description.write_metrics_to_file("out.yaml", metrics, "yml")
```

Signed-off-by: Amnon Heiman <amnon@scylladb.com>

Closes scylladb/scylladb#18136
2024-04-02 10:07:11 +03:00
Botond Dénes
ea8478a3e7 scripts/open-coredump.sh: introduce --ci
Coredumps coming from CI are produced by a commit, which is not
available in the scylla.git repository, as CI runs on a merge commit
between the main branch (master or enterprise) and the tested PR branch.
Currently the script will attempt to checkout this commit and will fail
as the commit hash is unrecognized.
To work around this, add a --ci flag, which when used, will force the
main branch to be checked out, instead of the commit hash.

Closes scylladb/scylladb#18023
2024-04-02 09:27:52 +03:00
Kefu Chai
55d0ea48bd test: randomized_nemesis_test: remove fmt::formatter for seastar::timed_out_error
This reverts commit 97b203b1af.

since Seastar provides the formatter, it's not necessary to vendor it in
scylladb anymore.

Refs #13245

Closes scylladb/scylladb#18114
2024-04-02 09:25:51 +03:00
Benny Halevy
d5ac0c06b3 test_sstable_reversing_reader_random_schema: drop workaround for #9352
Issue #9352 was fixed about a year and a half ago
so this workaround should not be needed anymore.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>

Closes scylladb/scylladb#18121
2024-04-02 09:25:06 +03:00
Raphael S. Carvalho
29f9f7594f replica: Kill table::storage_group_id_for_token()
storage_group_id_for_token() was only needed from within
tablet_storage_group_manager, so we can kill
table::storage_group_id_for_token().

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>

Closes scylladb/scylladb#18134
2024-04-02 09:23:23 +03:00
Asias He
99b7ccfa8b repair: Load repair history in background
Currently, we load the repair history during boot up. If the number of
repair history entries is high, it might take a while to load them.

In my test, to load 10M entries, it took around 60 seconds.

It is not a must to load the entries during boot up. It is better to
load them in the background to speed up the boot time.

Fixes #17993
2024-04-02 09:24:35 +08:00
Asias He
523895145d repair: Abort load_history process in shutdown
If the node is shutting down, there is no point in continuing to load the
repair history.

Refs #17993
2024-04-02 09:24:35 +08:00
Lakshmi Narayanan Sreethar
d86505e399 test_bloom_filter.py: disable reclaiming memory from components
Disabled reclaiming memory from sstable components in the testcase as it
interferes with the false positive calculation.

Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com>
2024-04-02 01:37:47 +05:30
Lakshmi Narayanan Sreethar
d261f0fbea sstable_datafile_test: add tests to verify auto reclamation of components
Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com>
2024-04-02 01:37:47 +05:30
Lakshmi Narayanan Sreethar
169629dd40 test/lib: allow overriding available memory via test_env_config
Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com>
2024-04-02 01:37:47 +05:30
Lakshmi Narayanan Sreethar
a36965c474 sstables_manager: support reclaiming memory from components
Reclaim memory from the SSTable that has the most reclaimable memory if
the total reclaimable memory has crossed the threshold. Only the bloom
filter memory is considered reclaimable for now.

Fixes #17747

Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com>
2024-04-02 01:37:47 +05:30
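The policy described in commit a36965c474 (reclaim from the SSTable with the most reclaimable memory once the total crosses the threshold) can be sketched as follows. This is an illustration under stated assumptions, not the sstables_manager code: `sstable_info`, `total_reclaimable` and `reclaim_until_under` are hypothetical, the threshold is taken as a fraction of available memory (the merge description gives a default of `.1`), and looping until the total is back under the limit is an assumption about how the reclaim is repeated.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical per-sstable accounting: bytes held by its bloom filter.
struct sstable_info {
    std::size_t bloom_filter_bytes;
};

// Sum of reclaimable memory across all tracked sstables.
inline std::size_t total_reclaimable(const std::vector<sstable_info>& ssts) {
    std::size_t total = 0;
    for (const auto& s : ssts) {
        total += s.bloom_filter_bytes;
    }
    return total;
}

// While the total exceeds threshold_fraction * available_memory, drops the
// filter of the sstable holding the most reclaimable memory.
// Returns the number of bytes reclaimed.
inline std::size_t reclaim_until_under(std::vector<sstable_info>& ssts,
                                       std::size_t available_memory,
                                       double threshold_fraction) {
    const auto limit = static_cast<std::size_t>(available_memory * threshold_fraction);
    std::size_t reclaimed = 0;
    while (total_reclaimable(ssts) > limit) {
        auto biggest = std::max_element(ssts.begin(), ssts.end(),
            [](const sstable_info& a, const sstable_info& b) {
                return a.bloom_filter_bytes < b.bloom_filter_bytes;
            });
        // total > limit >= 0 guarantees a non-empty filter exists,
        // so every iteration makes progress.
        assert(biggest != ssts.end() && biggest->bloom_filter_bytes > 0);
        reclaimed += biggest->bloom_filter_bytes;
        biggest->bloom_filter_bytes = 0;
    }
    return reclaimed;
}
```

For example, with 1000 bytes available, a fraction of 0.1 and filters of 60, 50 and 30 bytes, the total of 140 exceeds the limit, so the 60-byte filter is dropped and the remaining 80 bytes fit under it.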