Commit Graph

1037 Commits

Author SHA1 Message Date
Benny Halevy
4dcb8c19bd scylla-sstable: correctly dump sharding_metadata
This patch fixes 2 issues at one go:

First, Currently sstables::load clears the sharding metadata
(via open_data()), and so scylla-sstable always prints
an empty array for it.

Second, printing token values would generate invalid json
as they are currently printed as binary bytes, and they
should be printed simply as numbers, as we do elsewhere,
for example, for the first and last keys.

Fixes #26982

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>

Closes scylladb/scylladb#26991

(cherry picked from commit f9ce98384a)

Closes scylladb/scylladb#27030
2025-11-16 16:05:14 +02:00
Botond Dénes
694fb53aad tools/scylla-sstable: fix doc links
The doc links in scylla-sstable help output are static, so they always
point to the documentation of the latest stable release, not to the
documentation of the release the tool binary is from. On top of that,
the links point to old open-source documentation, which is now EOL.
Fix both problems: point link at the new source-available documentation
pages and make them version aware.

(cherry picked from commit fe73c90df9)
2025-10-07 10:22:29 +03:00
Botond Dénes
4f01659eda tools/scylla-nodetool: remove trailing " from doc urls
They are accidental leftover from a previous way of storing command
descriptions.

(cherry picked from commit 5a69838d06)
2025-10-03 14:27:27 +00:00
Botond Dénes
f453b5bfa3 Merge '[Backport 2025.1] sstables: Fix quadratic space complexity in partitioned_sstable_set' from Scylladb[bot]
Interval map is very susceptible to quadratic space behavior when it's flooded with many entries overlapping all (or most of) intervals, since each such entry will have presence on all intervals it overlaps with.

A trigger we observed was memtable flush storm, which creates many small "L0" sstables that spans roughly the entire token range.

Since we cannot rely on insertion order, solution will be about storing sstables with such wide ranges in a vector (unleveled).

There should be no consequence for single-key reads, since upper layer applies an additional filtering based on token of key being queried.
And for range scans, there can be an increase in memory usage, but not significant because the sstables span an wide range and would have been selected in the combined reader if the range of scan overlaps with them.

Anyway, this is a protection against storm of memtable flushes and shouldn't be the common scenario.

It works both with tablets and vnodes, by adjusting the token range spanned by compaction group accordingly.

Fixes #23634.

We can backport this into 2024.2, 2025.1, but we should let this cook in master for 1 month or so.

- (cherry picked from commit 494ed6b887)

- (cherry picked from commit 59dad2121f)

- (cherry picked from commit 21d1e78457)

- (cherry picked from commit c77f710a0c)

- (cherry picked from commit d5bee4c814)

Parent PR: #23806

Closes scylladb/scylladb#24012

* github.com:scylladb/scylladb:
  test: Verify partitioned set store split and unsplit correctly
  sstables: Fix quadratic space complexity in partitioned_sstable_set
  compaction: Wire table_state into make_sstable_set()
  compaction: Introduce token_range() to table_state
  dht: Add overlap_ratio() for token range
2025-08-06 09:56:43 +03:00
Aleksandra Martyniuk
3f93cdc61b nodetool: repair: skip tablet keyspaces
Currently, nodetool repair command repairs both vnode and tablet keyspaces
if no keyspace is specified. We should use this command to repair
only vnode keyspaces, but this isn't easily accessible - we have to
explicitly run repair only on vnode keyspaces.

nodetool repair skips tablet keyspaces unless a tablet keyspace
is explicitely passed as an argument.

Fixes: #24040.

Closes scylladb/scylladb#24042

(cherry picked from commit 6f8b378e80)

Closes scylladb/scylladb#25152
2025-07-24 16:32:37 +03:00
Avi Kivity
477cad246f tools: optimized_clang: make it work in the presence of a scylladb profile
optimized_clang.sh trains the compiler using profile-guided optimization
(pgo). However, while doing that, it builds scylladb using its own profile
stored in pgo/profiles and decompressed into build/profile.profdata. Due
to the funky directory structure used for training the compiler, that
path is invalid during the training and the build fails.

The workaround was to build on a cloud machine instead of a workstation -
this worked because the cloud machine didn't have git-lfs installed, and
therefore did not see the stored profile, and the whole mess was averted.

To make this work on a machine that does have access to stored profiles,
disable use of the stored profile even if it exists.

Fixes #22713

Closes scylladb/scylladb#24571

(cherry picked from commit 52f11e140f)

Closes scylladb/scylladb#24620
2025-07-13 21:43:49 +03:00
Raphael S. Carvalho
63bdbebdef sstables: Fix quadratic space complexity in partitioned_sstable_set
Interval map is very susceptible to quadratic space behavior when
it's flooded with many entries overlapping all (or most of)
intervals, since each such entry will have presence on all
intervals it overlaps with.

A trigger we observed was memtable flush storm, which creates many
small "L0" sstables that spans roughly the entire token range.

Since we cannot rely on insertion order, solution will be about
storing sstables with such wide ranges in a vector (unleveled).

There should be no consequence for single-key reads, since upper
layer applies an additional filtering based on token of key being
queried.
And for range scans, there can be an increase in memory usage,
but not significant because the sstables span an wide range and
would have been selected in the combined reader if the range of
scan overlaps with them.

Anyway, this is a protection against storm of memtable flushes
and shouldn't be the common scenario.

It works both with tablets and vnodes, by adjusting the token
range spanned by compaction group accordingly.

Fixes #23634.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
(cherry picked from commit c77f710a0c)
2025-07-11 10:05:30 -03:00
Raphael S. Carvalho
be71400fb2 compaction: Introduce token_range() to table_state
This provides a way for compaction layer to know compaction group's
token range. It will be important for sstable set impl to know
the token range of underlying group.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
(cherry picked from commit 59dad2121f)
2025-07-11 09:21:40 -03:00
Botond Dénes
00402cb4c5 sstables: add corrupt_data_handler to sstables::sstables
Similar to how large_data_handler is handled, propagate through
sstables::sstables_manager and store its owner: replica::database.
Tests and tools are also patched. Mostly mechanical changes, updating
constructors and patching callers.

(cherry picked from commit ebd9420687)
2025-07-02 14:04:09 +03:00
Botond Dénes
135727fd07 tools/scylla-sstable: make large_data_handler a local
No reason for it to be a global, not even convenience.

(cherry picked from commit 46ff7f9c12)
2025-07-02 14:03:03 +03:00
Wojciech Mitros
7422796845 view_info: set base info on construction
Currently, the base_info may or may not be set in view schemas.
Even when it's set, it may be modified. This necessitates extra
checks when handling view schemas, as well as potentially causing
errors when we forget to set it at some point.

Instead, we want to make the base info an immutable member of view
schemas (inside view_info). The first step towards that is making
sure that all newly created schemas have the base info set.
We achieve that by requiring a base schema when constructing a view
schema. Unfortunately, this adds complexity each time we're making
a view schema - we need to get the base schema as well.
In most cases, the base schema is already available. The most
problematic scenario is when we create a schema from mutations:
- when parsing system tables we can get the schema from the
database, as regular tables are parsed before views
- when loading a view schema using the schema loader tool, we need
to load the base additionally to the view schema, effectively
doubling the work
- when pulling the schema from another node - in this case we can
only get the current version of the base schema from the local
database

Additionally, we need to consider the base schema version - when
we generate view updates the version of the base schema used for
reads should match the version of the base schema in view's base
info.
This is achieved by selecting the correct (old or new) schema in
`db::schema_tables::merge_tables_and_views` and using the stored
base schema in the schema_registry.

(cherry picked from commit 900687c818)
2025-05-27 21:40:22 +02:00
Łukasz Paszkowski
a1082de797 tools/scylla-nodetool: fix crash when rows_merged cells contain null
Any empty object of the json::json_list type has its internal
_set variable assigned to false which results in such objects
being skipped by the json::json_builder.

Hence, the json returned by the api GET//compaction_manager/compaction_history
does not contain the field `rows_merged` if a cell in the
system.compaction_history table is null or an empty list.

In such cases, executing the command `nodetool compactionhistory`
will result in a crash with the following error message:
`error running operation: rjson::error (JSON assert failed on condition 'false'`

The patch fixes it by checking if the json object contains the
`rows_merged` element before processing. If the element does
not exist, the nodetool will now produce an empty list.

Fixes https://github.com/scylladb/scylladb/issues/23540

Closes scylladb/scylladb#23514

(cherry picked from commit 113647550f)

Closes scylladb/scylladb#24113
2025-05-20 08:30:33 +03:00
Botond Dénes
a0c01964cd tools/scylla-nodetool: status: handle negative load sizes
Negative load sizes don't make sense, but we've seen a case in
production, where a negative number was returned by ScyllaDB REST API,
so be prepared to handle these too.

Fixes: scylladb/scylladb#24134

Closes scylladb/scylladb#24135

(cherry picked from commit 700a5f86ed)

Closes scylladb/scylladb#24167
2025-05-15 17:39:36 +03:00
Yaron Kaikov
502c62d91d install-dependencies.sh: update node_exporter to 1.9.0
Update node_exporter to 1.9.0 to resolve the following CVE's
https://github.com/advisories/GHSA-49gw-vxvf-fc2g
https://github.com/advisories/GHSA-8xfx-rj4p-23jm
https://github.com/advisories/GHSA-crqm-pwhx-j97f
https://github.com/advisories/GHSA-j7vj-rw65-4v26

Fixes: https://github.com/scylladb/scylladb/issues/22884

regenerate frozen toolchain with optimized clang from
* https://devpkg.scylladb.com/clang/clang-18.1.8-Fedora-40-aarch64.tar.gz
* https://devpkg.scylladb.com/clang/clang-18.1.8-Fedora-40-x86_64.tar.gz

Closes scylladb/scylladb#22987

(cherry picked from commit e6227f9a25)

Closes scylladb/scylladb#23021
2025-04-21 23:12:28 +03:00
Botond Dénes
f7761729cc Merge '[Backport 2025.1] nodetool: cluster repair: add a command to repair tablet keyspaces' from Scylladb[bot]
Add a new nodetool cluster super-command. Add nodetool
cluster repair command to repair tablet keyspaces.
It uses the new /storage_service/tablets/repair API.

The nodetool cluster repair command allows you to specify
the keyspace and tables to be repaired. A cluster repair of many
tables will request /storage_service/tablets/repair and wait for
the result synchronously for each table.

The nodetool repair command, which was previously used to repair
keyspaces of any type, now repairs only vnode keyspaces.

Fixes: https://github.com/scylladb/scylladb/issues/22409.

Needs backport to 2025.1 that introduces the new tablet repair API

- (cherry picked from commit cbde835792)

- (cherry picked from commit b81c81c7f4)

- (cherry picked from commit aa3973c850)

- (cherry picked from commit 8bbc5e8923)

- (cherry picked from commit 02fb71da42)

- (cherry picked from commit 9769d7a564)

Parent PR: #22905

Closes scylladb/scylladb#23672

* github.com:scylladb/scylladb:
  docs: nodetool: update repair and add tablet-repair docs
  test: nodetool: add tests for cluster repair command
  nodetool: add cluster repair command
  nodetool: repair: extract getting hosts and dcs to functions
  nodetool: repair: warn about repairing tablet keyspaces
  nodetool: repair: move keyspace_uses_tablets function
2025-04-11 10:53:03 +03:00
Botond Dénes
ec7da3d785 tools/scylla-nodetool: s/GetInt()/GetInt64()/
GetInt() was observed to fail when the integer JSON value overflows the
int32_t type, which `GetInt()` uses for storage. When this happens,
rapidjson will assign a distinct 64 bit integer type to the value, and
attempting to access it as 32 bit integer triggers the wrong-type error,
resulting in assert failure. This was hit on the field where invoking
nodetool netstats resulted in nodetool crashing when the streamed bytes
amounts were higher than maxint.

To avoid such bugs in the future, replace all usage of GetInt() in
nodetool of GetInt64(), just to be sure.

A reproducer is added to the nodetool netstats crash.

Fixes: scylladb/scylladb#23394

Closes scylladb/scylladb#23395

(cherry picked from commit bd8973a025)

Closes scylladb/scylladb#23476
2025-04-10 10:05:18 +03:00
Aleksandra Martyniuk
c5c631f175 nodetool: add cluster repair command
Add a new nodetool cluster repair command that repairs tablet keyspaces.

Users may specify keyspace and tables that they want to repair.
If the keyspace and tables are not specified, all tablet keyspaces
are repaired.

The command calls the new tablet repair API /storage_service/tablets/repair.

(cherry picked from commit 8bbc5e8923)
2025-04-09 14:03:28 +00:00
Aleksandra Martyniuk
8453d4f987 nodetool: repair: extract getting hosts and dcs to functions
(cherry picked from commit aa3973c850)
2025-04-09 14:03:28 +00:00
Aleksandra Martyniuk
a1b8ae57d9 nodetool: repair: warn about repairing tablet keyspaces
Warn about an attempt to repair tablet keysapce with nodetool repair.

A nodetool cluster repair command to repair tablet keyspaces will
be added in the following patches.

(cherry picked from commit b81c81c7f4)
2025-04-09 14:03:28 +00:00
Aleksandra Martyniuk
b500fa498d nodetool: repair: move keyspace_uses_tablets function
(cherry picked from commit cbde835792)
2025-04-09 14:03:28 +00:00
Botond Dénes
38bd74b2d4 tools/scylla-nodetool: netstats: don't assume both senders and receivers
The code currently assumes that a session has both sender and receiver
streams, but it is possible to have just one or the other.
Change the test to include this scenario and remove this assumption from
the code.

Fixes: #22770

Closes scylladb/scylladb#22771

(cherry picked from commit 87e8e00de6)

Closes scylladb/scylladb#22874
2025-02-17 14:34:36 +02:00
Botond Dénes
889fb9c18b Update tools/java submodule
* tools/java 807e991d...6dfe728a (1):
  > dist: support smooth upgrade from enterprise to source availalbe

Fixes: scylladb/scylladb#22820
2025-02-14 11:14:07 +02:00
Botond Dénes
d05b3897a2 Merge '[Backport 2025.1] api: task_manager: do not unregister finish task when its status is queried' from Scylladb[bot]
Currently, when the status of a task is queried and the task is already finished,
it gets unregistered. Getting the status shouldn't be a one-time operation.

Stop removing the task after its status is queried. Adjust tests not to rely
on this behavior. Add task_manager/drain API and nodetool tasks drain
command to remove finished tasks in the module.

Fixes: https://github.com/scylladb/scylladb/issues/21388.

It's a fix to task_manager API, should be backported to all branches

- (cherry picked from commit e37d1bcb98)

- (cherry picked from commit 18cc79176a)

Parent PR: #22310

Closes scylladb/scylladb#22598

* github.com:scylladb/scylladb:
  api: task_manager: do not unregister tasks on get_status
  api: task_manager: add /task_manager/drain
2025-02-13 09:38:12 +02:00
Avi Kivity
75320c9a13 Update tools/cqlsh submodule (driver update, upgradability)
* tools/cqlsh 52c6130...02ec7c5 (18):
  > chore(deps): update dependency scylla-driver to v3.28.2
  > dist: support smooth upgrade from enterprise to source availalbe
  > github action: fix downloading of artifacts
  > chore(deps): update docker/setup-buildx-action action to v3
  > chore(deps): update docker/login-action action to v3
  > chore(deps): update docker/build-push-action action to v6
  > chore(deps): update docker/setup-qemu-action action to v3
  > chore(deps): update peter-evans/dockerhub-description action to v4
  > upload actions: update the usage for multiple artifacts
  > chore(deps): update actions/download-artifact action to v4.1.8
  > chore(deps): update dependency scylla-driver to v3.28.0
  > chore(deps): update pypa/cibuildwheel action to v2.22.0
  > chore(deps): update actions/checkout action to v4
  > chore(deps): update python docker tag to v3.13
  > chore(deps): update actions/upload-artifact action to v4
  > github actions: update it to work
  > add option to output driver debug
  > Add renovate.json (#107)

Fixes: https://github.com/scylladb/scylladb/issues/22420
2025-02-09 18:07:55 +02:00
Avi Kivity
7f350558c2 Update tools/python3 (smooth upgrade from enterprise)
* tools/python3 8415caf...91c9531 (1):
  > dist: support smooth upgrade from enterprise to source availalbe

Ref #22420
2025-02-09 14:22:33 +02:00
Aleksandra Martyniuk
43f2e5f86b nodetool: tasks: print empty string for start_time/end_time if unspecified
If start_time/end_time is unspecified for a task, task_manager API
returns epoch. Nodetool prints the value in task status.

Fix nodetool tasks commands to print empty string for start_time/end_time
if it isn't specified.

Modify nodetool tasks status docs to show empty end_time.

Fixes: #22373.

Closes scylladb/scylladb#22370

(cherry picked from commit 477ad98b72)

Closes scylladb/scylladb#22601
2025-02-06 10:05:07 +02:00
Aleksandra Martyniuk
1f52ced2ff api: task_manager: add /task_manager/drain
In the following patches, get_status won't be unregistering finished
tasks. However, tests need a functionality to drop a task, so that
they could manipulate only with the tasks for operations that were
invoked by these tests.

Add /task_manager/drain/{module} to unregister all finished tasks
from the module. Add respective nodetool command.

(cherry picked from commit e37d1bcb98)
2025-01-31 08:21:03 +00:00
Botond Dénes
2428f22d3e Update tools/python3 submodule
* tools/python3 fbf12d02...8415caf4 (1):
  > dist: Support FIPS mode
2025-01-17 09:17:29 +02:00
Botond Dénes
686a997c04 Merge 'Complete implementation of configuring IO bandwidth limits' from Pavel Emelyanov
In Scylla there are two options that control IO bandwidth limit -- the /storage_service/(compaction|stream)_throughput REST API endpoints. The endpoints are partially implemented and have no counterparts in the nodetool.

This set implements the missing bits and adds tests for new functionality.

Closes scylladb/scylladb#21877

* github.com:scylladb/scylladb:
  nodetool: Implement [gs]etstreamthroughput commands
  nodetool: Implement [gs]etcompationthroughput commands
  test: Add validation of how IO-updating endpoints work
  api: Implement /storage_service/(stream|compaction)_throughput endpoints
  api: Disqualify const config reference
  api: Implement /storage_service/stream_throughput endpoint
  api: Move stream throughput set/get endpoints from storage service block
  api: Move set_compaction_throughput_mb_per_sec to config block
  util: Include fmt/ranges.h in config_file.hh
2025-01-14 07:56:38 -05:00
Botond Dénes
f899f0e411 tools/scylla-sstable: dump-statistics: fix handling of {min,max}_column_names
Said fields in statistics are of type
`disk_array<uint32_t, disk_string<uint16_t>>` and currently are handled
as array of regular strings. However these fields store exploded
clustering keys, so the elements store binary data and converting to
string can yield invalid UTF-8 characters that certain JSON parsers (jq,
or python's json) can choke on. Fix this by treating them as binary and
using `to_hex()` to convert them to string. This requires some massaging
of the json_dumper: passing field offset to all visit() methods and
using a caller-provided disk-string to sstring converter to convert disk
strings to sstring, so in the case of statistics, these fields can be
intercepted and properly handled.

While at it, the type of these fields is also fixed in the
documentation.

Before:

    "min_column_names": [
      "��Z���\u0011�\u0012ŷ4^��<",
      "�2y\u0000�}\u007f"
    ],
    "max_column_names": [
      "��Z���\u0011�\u0012ŷ4^��<",
      "}��B\u0019l%^"
    ],

After:

    "min_column_names": [
      "9dd55a92bc8811ef12c5b7345eadf73c",
      "80327900e2827d7f"
    ],
    "max_column_names": [
      "9dd55a92bc8811ef12c5b7345eadf73c",
      "7df79242196c255e"
    ],

Fixes: #22078

Closes scylladb/scylladb#22225
2025-01-13 09:19:04 +03:00
Botond Dénes
a21ecc3253 tools/scylla-sstable: also try reading scylla.yaml from /etc/scylla
scylla-sstable tries to read scylla.yaml via the following sequence:
1) Use user-provided location is provided (--scylla-yaml-file parameter)
2) Use the environment variables SCYLLA_HOME and/or SCYLLA_CONF if set
3) Use the default location ./conf/scylla.yaml

Step 3 is fine on dev machines, where the binaries are usually invoked
from scylla.git, which does have conf/scylla.yaml, but it doesn't work
on production machines, where the default location for scylla.yaml is
/etc/scylla/scylla.yaml. To reduce friction when used on production
machines, add another fallback in case (3) fails, which tries to read
scylla.yaml from /etc/scylla/scylla.yaml location.

Fixes: scylladb/scylladb#22202

Closes scylladb/scylladb#22241
2025-01-13 09:11:29 +03:00
Avi Kivity
814942505f Merge 'Introduce Encryption-at-Rest (EAR) for sstables and commitlog' from Calle Wilund
Fixes https://github.com/scylladb/scylla-enterprise/issues/5016#issuecomment-2558464631

EAR - encryption at rest. Allows on-disk file encryption of sstables and commitlog data.
Introduces OpenSSL based file level encrypted storage, managed via a set of providers
ranging from local files to cloud KMS providers.

For a more comprehensive explanation, see the included docs (or if possible, original
source tree).

Manual bulk merge of EAR feature from enterprise repo to main scylla repo.

Breaks some features apart, but main EAR is still a humongous commit, because to separate this
I would have to mess with code incrementally, adding time and risk.

This PR includes the local file gen tool, tests and also p11 validation.

Note: CI will not execute the full tests unless master CI is set to provide the same environment
as the enterprise one. Not sure about the status of this ATM.

Note: Includes code to compile against cryptsoft kmipc SDK, but not the SDK. If you happen to
check out this tree in the scylla folder and configure, it will be linked against and KMIP functionality
will be enabled, otherwise not.

Closes scylladb/scylladb#22233

* github.com:scylladb/scylladb:
  docs: Add EAR docs
  main/build: Add p11-kit and initialize
  tools: Add local-file-key-generator tool
  tests: Add EAR tests
  tmpdir: shorten test tempdir path
  EAR: port the ear feature from enterprise
  cql_test_env: Add optional query timeout
  schema/migration_manager: Add schema validate
  sstables: add get_shared_components accessor
  config/config_file: Add exports and definitions of config_type_for<>
2025-01-12 16:10:46 +02:00
Yaron Kaikov
6f30d26f2a Update tools/cqlsh submodule
* tools/cqlsh b09bc793...52c61306 (3):
  > cleanup: remove un-used Dockerfiles
  > .github/workflows/build-push.yml: update to newer macos images
  > cython: fix the usage of cython

Closes scylladb/scylladb#22250
2025-01-12 16:06:30 +02:00
Calle Wilund
f901beec87 tools: Add local-file-key-generator tool
For generating key files for local provider
2025-01-09 10:40:47 +00:00
Wojciech Mitros
d04f376227 mv: add an experimental feature for creating views using tablets
We still have a number of issues to be solved for views with tablets.
Until they are fixed, we should prevent users from creating them,
and use the vnode-based views instead.

This patch prepares the feature for enabling views with tablets. The
feature is disabled by default, but currently it has no effect.
After all tests are adjusted to use the feature, we should depend
on the feature for deciding whether we can create materialized views
in tablet-enabled keyspaces.

The unit tests are adjusted to enable this feature explicitly, and it's
also added to the scylla sstable tool config - this tool treats all
tables as if they were tablet-based (surprisingly, with SimpleStrategy),
so for it to work on views, the new feature must be enabled.

Refs scylladb/scylladb#21832

Closes scylladb/scylladb#21833
2025-01-07 15:52:36 +01:00
Avi Kivity
748d30a34d tools: toolchain: simplify non-emulated build procedure
Avoid using temporary names and instead treat the final image tag
as a temporary.

The new procedure is more or less

   remote-final := local-x86_64
   local-aarch64 += remote-final
   remote-final := local-aarch64 (which now contains the x86_64 image too)

Closes scylladb/scylladb#21981
2025-01-07 16:17:29 +02:00
Kefu Chai
353b522ca0 treewide: migrate from boost::adaptors::reversed to std::views::reverse
now that we are allowed to use C++23. we now have the luxury of using
`std::views::reverse`.

- replace `boost::adaptors::transformed` with `std::views::transform`
- remove unused `#include <boost/range/adaptor/reversed.hpp>`

this change is part of our ongoing effort to modernize our codebase
and reduce external dependencies where possible.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2025-01-07 13:22:00 +02:00
Botond Dénes
173fad296a tools/schema_loader.cc: remove duplicate include of short_streams.hh
Closes scylladb/scylladb#21982
2025-01-07 13:03:17 +02:00
Kefu Chai
e4463b11af treewide: replace boost::algorithm::join() with fmt::join()
Replace usages of `boost::algorithm::join()` with `fmt::join()` to improve
performance and reduce dependency on Boost. `fmt::join()` allows direct
formatting of ranges and tuples with custom separators without creating
intermediate strings.

When formatting comma-separated values into another string, fmt::join()
avoids the overhead of temporary string creation that
`boost::algorithm::join()` requires. This change also helps streamline
our dependencies by leveraging the existing fmt library instead of
Boost.Algorithm.

To avoid the ambiguity, some caller sites were updated to call
`seastar::format()` explicitly.

See also

- boost::algorithm::join():
  https://www.boost.org/doc/libs/1_87_0/doc/html/string_algo/reference.html#doxygen.join_8hpp
- fmt::join():
  https://fmt.dev/11.0/api/#ranges-api

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#22082
2025-01-07 12:45:05 +02:00
Pavel Emelyanov
a24dc02255 api: New "scope" API param to load-and-stream calls
There are two of those -- the POST /storage_service/keyspace that loads
and streams new sstables from /upload and POST /storage_service/restore
that does the same, but gets sstables from object store.

The new optional parameter allow users to tun the streaming phase
behavior. The test/pylib client part is also updated here.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2024-12-23 19:28:05 +03:00
Avi Kivity
a4440392d7 build: update dependencies for features to be ported from enterprise
ldap/slapd/toxiproxy/cyrus-sasl - for ldap authentication and authorization
git-lfs/bolt - for profile-guided optimization
lz4-static - for dictionary based network compression
jwt - for Oauth/GCP connectivity (for key management)
openkmip - for kmip testing
fipscheck - for FIPS validation

Frozen toolchain regenerated, with optimized clang from

  https://devpkg.scylladb.com/clang/clang-18.1.8-Fedora-40-aarch64.tar.gz
  https://devpkg.scylladb.com/clang/clang-18.1.8-Fedora-40-x86_64.tar.gz
2024-12-19 14:26:31 +02:00
Avi Kivity
f3eade2f62 treewide: relicense to ScyllaDB-Source-Available-1.0
Drop the AGPL license in favor of a source-available license.
See the blog post [1] for details.

[1] https://www.scylladb.com/2024/12/18/why-were-moving-to-a-source-available-license/
2024-12-18 17:45:13 +02:00
Pavel Emelyanov
3081ce24cd nodetool: Implement [gs]etstreamthroughput commands
They exist in the original documentation, but are not yet implemented.
Now it's possible to do it.

It slightly more complex that its compaction counterpart in a sense than
get method reports megabits/s by default and has an option to convert to
MiBs.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2024-12-13 14:39:47 +03:00
Pavel Emelyanov
67089fd5a1 nodetool: Implement [gs]etcompationthroughput commands
They exist in the original documentation, but are not yet implemented.
Now it's possible to do it.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2024-12-13 14:39:47 +03:00
Avi Kivity
1bac6b75dc Merge 'Reserve IOCBs for tool applications' from Botond Dénes
Artifact tests have been failing since the switch to the native nodetool, because ScyllaDB doesn't leave any IOCBs for tools. On some setups it will consume all of them and then nodetool and any other native app will refuse to start because it will fail to allocate IOCBs.
This PR fixes this by making use of the freshly introduced `--reserve-io-control-blocks` seastar option, to reserve IOCBs for tool applications. Since the `linux-aio` and `epoll` reactor backends require quite a bit of these, we enable the `io_uring` reactor backend and switch tools to use this backend instead. The `io_uring` reactor backend needs just 2 IOCBs to function, so the reserve of 10 IOCBs set up in this PR is good for running 5 tool applications in parallel, which should be more than enough.

Fixes: https://github.com/scylladb/scylladb/issues/19185

The problem this PR fixes has a manual workaround (and is rare to begin with), no backport needed.

Closes scylladb/scylladb#21527

* github.com:scylladb/scylladb:
  main: configure a reserve IOCB for scylla-nodetool and friends
  configure: enable the io_uring backend
  main: use configure seastar defaults via app_template::seastar_options
2024-12-09 19:22:19 +02:00
Botond Dénes
f55dc71c3f Merge 'Use checksummed input streams in validate_checksums()' from Nikos Dragazis
With commits ed7d352e7d and bb1867c7c7, we now have input streams for both compressed and uncompressed SSTables that provide seamless checksum and digest checking. The code for these was based on `validate_checksums()`, which implements its own validation logic over raw streams. This has led to some duplicate code.

This PR deduplicates the uncompressed case by modifying `validate_checksums()` to use a checksummed input stream instead of a raw stream. The same cannot be done for compressed SSTables though. The reason is that `validate_checksums()` needs to examine the whole data file, even if an invalid chunk is encountered. In the checksummed case we support that by offloading the error handling logic from the data source via a function parameter. In the compressed data source we cannot do that because it needs to return decompressed data and decompression may fail if the data are invalid.

This PR also enables `validate_checksums()` to partially verify SSTables with just the per-chunk checksums if the digest is missing.

In more detail, this PR consists of:
* Port of some integrity checks from `do_validate_uncompressed()` to the checksummed data source. It should now be able to detect corruption due to truncated or appended chunks (expected number of chunks is retrieved from the CRC component).
* Introduction of `error_handler` parameter in checksummed data source and `data_stream()`.
* Refactoring of `validate_checksums()`. The JSON response of `sstable validate-checksums` was also modified to report a missing digest.
*  Tests for `validate_checksums()` against SSTables with truncated data, appended data, invalid digests, or no digest.

Refs #19058.

This PR is a hybrid of cleanup and feature. No backport is needed.

Closes scylladb/scylladb#20933

* github.com:scylladb/scylladb:
  tools/scylla-sstable: Rename valid_checksums -> valid
  test: Check validate_checksums() with missing digest
  sstables: Allow validate_checksums() to report missing digests
  sstables: Refactor validate_checksums() to use checksummed data stream
  sstables: Add error_handler parameter to data_stream()
  sstables: Add error handler in checksummed data source
  sstables: Check for excessive chunks in checksummed data source
  sstables: Check for premature EOF in checksummed data source
  test: test_validate_checksums: Check SSTable with invalid digest
  test: test_validate_checksums: Check SSTable with appended data
  test: test_validate_checksums: Complement test for truncated SSTable
2024-12-04 10:46:18 +02:00
Benny Halevy
d5d4307a20 scylla-sstable: dump-summary: print also first and last tokens
To help scylla-manager restore to map sstables to
nodes or tablets, print also the tokens of the
sstable first and last keys.

For example, the json output will now look like this:
```
$ build/dev/scylla sstable dump-summary /tmp/scylla-344593/data/ks/t-52a92590afd011ef9b68ba86378ed63b/me-3glp_0tm9_00uv52doobo0bvk2t7-big-Data.db | jq
{
  "sstables": {
    "/tmp/scylla-344593/data/ks/t-52a92590afd011ef9b68ba86378ed63b/me-3glp_0tm9_00uv52doobo0bvk2t7-big-Data.db": {
      "header": {
        "min_index_interval": 128,
        "size": 1,
        "memory_size": 16,
        "sampling_level": 128,
        "size_at_full_sampling": 0
      },
      "positions": [
        4
      ],
      "entries": [
        {
          "key": {
            "token": "2008715943680221220",
            "raw": "000400000064",
            "value": "100"
          },
          "position": 0
        }
      ],
      "first_key": {
        "token": "2008715943680221220",
        "raw": "000400000064",
        "value": "100"
      },
      "last_key": {
        "token": "9010454139840013625",
        "raw": "000400000003",
        "value": "3"
      }
    }
  }
}
```

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>

Closes scylladb/scylladb#21735
2024-12-04 10:16:13 +02:00
Botond Dénes
ca956c0180 configure: enable the io_uring backend
To be used by the tool apps -- also change the backend selected in
tools::utils::configure_tool_mode().
We keep using the more mature AIO backend in ScyllaDB itself, so main.cc
sets the linux_aio backend as the default one (the user can still change
this, same as before).
2024-12-04 02:55:31 -05:00
Avi Kivity
841481c202 Merge "move storage proxy and adjacent services to identify hosts by ids" from Gleb
"
This rather large patch series moves storage proxy and some adjacent
services (like migration manager) to use host ids to identify nodes rather
than ips. Messaging service gains a capability to address nodes by host
ids (which allows dropping translations from topology coordinator code
that worked on host ids already) and also makes sure that a node with
incorrect host id will reject a message (can happen during address
changes).

The series gets rid of the raft address map completely and replaces it with
the gossiper address map which is managed by the gossiper since translation
is now done in the layer below raft.

Fixes: scylladb/scylladb#6403

perf-simple-query -- smp 1 -m 1G output

Before:

enable-cache=1
Running test with config: {partitions=10000, concurrency=100, mode=read, frontend=cql, query_single_key=no, counters=no}
Disabling auto compaction
Creating 10000 partitions...
64336.82 tps ( 63.1 allocs/op,   0.0 logallocs/op,  14.1 tasks/op,   41291 insns/op,   24485 cycles/op,        0 errors)
62669.58 tps ( 63.1 allocs/op,   0.0 logallocs/op,  14.1 tasks/op,   41277 insns/op,   24695 cycles/op,        0 errors)
69172.12 tps ( 63.1 allocs/op,   0.0 logallocs/op,  14.2 tasks/op,   41326 insns/op,   24463 cycles/op,        0 errors)
56706.60 tps ( 63.1 allocs/op,   0.0 logallocs/op,  14.1 tasks/op,   41143 insns/op,   24513 cycles/op,        0 errors)
56416.65 tps ( 63.1 allocs/op,   0.0 logallocs/op,  14.1 tasks/op,   41186 insns/op,   24851 cycles/op,        0 errors)

         throughput: mean=61860.35 standard-deviation=5395.48 median=62669.58 median-absolute-deviation=5153.75 maximum=69172.12 minimum=56416.65
instructions_per_op: mean=41244.62 standard-deviation=76.90 median=41276.94 median-absolute-deviation=58.55 maximum=41326.19 minimum=41142.80
  cpu_cycles_per_op: mean=24601.35 standard-deviation=167.39 median=24512.64 median-absolute-deviation=116.65 maximum=24851.45 minimum=24462.70

After:

enable-cache=1
Running test with config: {partitions=10000, concurrency=100, mode=read, frontend=cql, query_single_key=no, counters=no}
Disabling auto compaction
Creating 10000 partitions...
65237.35 tps ( 63.1 allocs/op,   0.0 logallocs/op,  14.2 tasks/op,   40733 insns/op,   23145 cycles/op,        0 errors)
59283.09 tps ( 63.1 allocs/op,   0.0 logallocs/op,  14.1 tasks/op,   40624 insns/op,   23948 cycles/op,        0 errors)
70851.03 tps ( 63.1 allocs/op,   0.0 logallocs/op,  14.1 tasks/op,   40625 insns/op,   23027 cycles/op,        0 errors)
70549.61 tps ( 63.1 allocs/op,   0.0 logallocs/op,  14.1 tasks/op,   40650 insns/op,   23266 cycles/op,        0 errors)
68634.96 tps ( 63.1 allocs/op,   0.0 logallocs/op,  14.1 tasks/op,   40622 insns/op,   22935 cycles/op,        0 errors)

         throughput: mean=66911.21 standard-deviation=4814.60 median=68634.96 median-absolute-deviation=3638.40 maximum=70851.03 minimum=59283.09
instructions_per_op: mean=40650.89 standard-deviation=47.55 median=40624.60 median-absolute-deviation=27.11 maximum=40733.37 minimum=40622.33
  cpu_cycles_per_op: mean=23264.16 standard-deviation=402.12 median=23145.29 median-absolute-deviation=237.63 maximum=23947.96 minimum=22934.59

CI: https://jenkins.scylladb.com/job/scylla-master/job/scylla-ci/13531/
SCT (longevity-100gb-4h with nemesis_selector: ['topology_changes']): https://jenkins.scylladb.com/view/staging/job/scylla-staging/job/gleb/job/move-to-host-id/3/

Tested mixed cluster manually.
"

* 'gleb/move-to-host-id-v2' of github.com:scylladb/scylla-dev: (55 commits)
  group0: drop unused field from replace_info struct
  test: rename raft_address_map_test to address_map_test and move if from raft tests
  raft_address_map: remove raft address map
  topology coordinator: do not modify expire state for left/new nodes any more in raft address map
  topology coordinator: drop expiring entries in gossiper address map on error injections since raft one is no longer used
  group0: drop raft address map dependency from raft_rpc
  group0: move raft_ticker_type definition from raft_address_map.hh
  storage_service: do not update raft address map on gossiper events
  group0: drop raft address map dependency from raft_server_with_timeouts
  group0: move group0 upgrade code to host ids
  repair: drop raft address map dependency
  group0: remove unused raft address map getter from raft_group0
  group0: drop raft address map from group0_state_machine dependency since it is not used there any more
  group0: remove dependency on raft address map from group0_state_id_handler
  gossiper: add get_application_state_ptr that searches by host_id
  gossiper: change get_live_token_owners to return host ids
  view: move view building to host id
  hints: use host id to send hints
  storage_proxy: remove id_vector_to_addr since it is no longer used
  db: consistency_level: change is_sufficient_live_nodes to work on host ids
  ...
2024-12-03 18:18:48 +02:00
Kefu Chai
bab12e3a98 treewide: migrate from boost::adaptors::transformed to std::views::transform
now that we are allowed to use C++23. we now have the luxury of using
`std::views::transform`.

in this change, we:

- replace `boost::adaptors::transformed` with `std::views::transform`
- use `fmt::join()` when appropriate where `boost::algorithm::join()`
  is not applicable to a range view returned by `std::view::transform`.
- use `std::ranges::fold_left()` to accumulate the range returned by
  `std::view::transform`
- use `std::ranges::fold_left()` to get the maximum element in the
  range returned by `std::view::transform`
- use `std::ranges::min()` to get the minimal element in the range
  returned by `std::view::transform`
- use `std::ranges::equal()` to compare the range views returned
  by `std::view::transform`
- remove unused `#include <boost/range/adaptor/transformed.hpp>`
- use `std::ranges::subrange()` instead of `boost::make_iterator_range()`,
  to feed `std::views::transform()` a view range.

to reduce the dependency to boost for better maintainability, and
leverage standard library features for better long-term support.

this change is part of our ongoing effort to modernize our codebase
and reduce external dependencies where possible.

limitations:

there are still a couple places where we are still using
`boost::adaptors::transformed` due to the lack of a C++23 alternative
for `boost::join()` and `boost::adaptors::uniqued`.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#21700
2024-12-03 09:41:32 +02:00