Commit Graph

185 Commits

Author SHA1 Message Date
Avi Kivity
0ae22a09d4 LICENSE: Update to version 1.1
Updated terms of non-commercial use (must be a never-customer).
2026-04-12 19:46:33 +03:00
Pavel Emelyanov
65032877d4 api: Move /storage_service/toppartitions from storage_service.cc to column_family.cc
The endpoint URL remains intact. Having it next to another toppartitions
endpoint (the /column_family/toppartitions one) is natural.

This endpoint only needs sharded<replica::database>&, grabs it from
http_context and doesn't use any other service. In column_family.cc the
database reference is already available as a parameter. Once more user
of http_context.db is gone.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Closes scylladb/scylladb#28996
2026-03-20 09:52:33 +02:00
Michał Chojnowski
1cfce430f1 replica/table: keep track of total pre-compression file size
Every table and sstable set keeps track of the total file size
of contained sstables.

Due to a feature request, we also want to keep track of the hypothetical
file size if Data files were uncompressed, to add a metric that
shows the compression ratio of sstables.

We achieve this by replacing the relevant `uint_64 bytes_on_disk`
counters everywhere with a struct that contains both the actual
(post-compression) size and the hypothetical pre-compression size.

This patch isn't supposed to change any observable behavior.
In the next patch, we will use these changes to add a new metric.
2025-11-13 00:49:57 +01:00
Pavel Emelyanov
f77f9db96c api: Remove system_keyspace ref from column_family API block
This reference was only needed to facilitate get_built_indexes handler
to work. Now it's gone and the sys.ks. reference is no longer needed.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2025-10-03 13:50:22 +03:00
Pavel Emelyanov
95b616d0e5 api: Move get_built_indexes from column_family to view_builder
The handler effectively works with the view_builder and should be
registerd in the block that has this service captured.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2025-10-03 13:49:33 +03:00
Botond Dénes
1999d8e3d3 compaction: remove using namespace {compaction,sstables}
Some files in compaction/ have using namespace {compaction,sstables}
clauses, some even in headers. This is considered bad practice and
muddies the namespace use. Remove them.
2025-09-25 15:03:57 +03:00
Botond Dénes
86ed627fc4 compaction: move code to namespace compaction
The namespace usage in this directory is very inconsistent, with files
and classes scattered in:
* global namespace
* namespace compaction
* namespace sstables

With cases, where all three used in the same file. This code used to
live in sstables/ and some of it still retains namespace sstables as a
heritage of that time. The mismatch between the dir (future module) and
the namespace used is confusing, so finish the migration and move all
code in compaction/ to namespace compaction too.

This patch, although large, is mechanic and only the following kind of
changes are made:
* replace namespace sstable {} with namespace compaction {}
* add namespace compaction {}
* drop/add sstables::
* drop/add compaction::
* move around forward-declarations so they are in the correct namespace
  context

This refactoring revealed some awkward leftover coupling between
sstables and compaction, in sstables/sstable_set.cc, where the
make_sstable_set() methods of compaction strategies are implemented.
2025-09-25 15:03:56 +03:00
Botond Dénes
1ac7b4c35e treewide: move away from accessing httpd::request::query_parameters
Acecssing this member directly is deprecated, migrate code to use
{get,set}_query_param() and friends instead.

Fixes: https://github.com/scylladb/scylladb/issues/26023
2025-09-24 11:52:15 +03:00
Pavel Emelyanov
88a01308e7 api: Move /storage_service/keyspaces handler to database module
The handler uses database service, not storage_service, and should
belong to the corresponding API module from column_family.cc

Once moved, the handler can use captured sharded<database> reference and
forget about http_context::db.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>

Closes scylladb/scylladb#25834
2025-09-10 17:01:11 +02:00
Pavel Emelyanov
840cdab627 api: Move /load and /metrics/load handlers code to column_family.cc
Both handlers need database to proceed and thus need to be registered
(and unregistered) in a group that captures database for its handlers.

Once moved, the used get_cf_stats() method can be marked local to
column_family.cc file.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>

Closes scylladb/scylladb#25671
2025-09-01 08:11:00 +02:00
Pavel Emelyanov
2510a7b488 api/column_family: Capture sharded<database> to call get_cf_stats()
Update more handlers not to get databse from context, but to capture it
directly on handlers' lambdas.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2025-08-18 12:04:34 +03:00
Pavel Emelyanov
dc31b68451 api: Patch get_cf_stats to get sharded<database>& argument
Now it accepts http context and immediately gets the database from it to
pass to map_reduce_cf. Callers are updated to pass database from where
the context they already have.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2025-08-18 12:03:27 +03:00
Pavel Emelyanov
ffd25f0c16 api: Patch callers of map_reduce_cf(_raw)? to use sharded<database>
There are some of them left that still pass http_context. These handlers
will eventually get their captured sharded database reference, but for
now make them explicitly use one from context. This will allow to
de-templatize map_reduce_cf... helpers making the code simpler.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2025-08-18 11:43:55 +03:00
Pavel Emelyanov
7e0726d55b api: Use captured sharded<database> reference in handlers
Not all of them can switch from ctx to database, so in few places both,
the database and ctx, are captured. However, the ctx.db reference is no
longer used by the column_family handlers.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2025-08-18 11:00:55 +03:00
Pavel Emelyanov
720a8fef4b api/column_family: Make map_reduce_cf_time_histogram() use sharded<database>
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2025-08-18 11:00:55 +03:00
Pavel Emelyanov
49cb81fb56 api/column_famliy: Make sum_sstable() use sharded<database>
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2025-08-18 11:00:55 +03:00
Pavel Emelyanov
3595ea7f49 api/column_family: Make get_cf_unleveled_sstables() use sharded<database>
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2025-08-18 11:00:55 +03:00
Pavel Emelyanov
d32ac35f60 api/column_famliy: Make get_cf_stats_count() use sharded<database>
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2025-08-18 11:00:55 +03:00
Pavel Emelyanov
cde39d3fc7 api/column_family: Make get_cf_rate_and_histogram() use sharded<database>
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2025-08-18 11:00:54 +03:00
Pavel Emelyanov
edc9e302e3 api/column_family: Make get_cf_histogram() use sharded<database>
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2025-08-18 11:00:54 +03:00
Pavel Emelyanov
422debbee2 api/column_family: Make get_cf_stats_sum() use sharded<database>
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2025-08-18 11:00:54 +03:00
Pavel Emelyanov
1c1fabc578 api/column_family: Make set_tables_tombstone_gc() use sharded<database>
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2025-08-18 11:00:54 +03:00
Pavel Emelyanov
0158743f5e api/column_family: Make set_tables_autocompaction() use sharded<database>
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2025-08-18 11:00:54 +03:00
Pavel Emelyanov
f52eb7cae2 api/column_family: Make for_tables_on_all_shards() use sharded<database>
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2025-08-18 11:00:54 +03:00
Pavel Emelyanov
818a41ccdb api: Capture sharded<database> for set_server_column_family()
Similarly to other API handlers, instead of using a database from http
context, patch the setting methods to capture the database from main
code and pass it around to handlers.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2025-08-18 11:00:54 +03:00
Pavel Emelyanov
bdb7c2b014 api: Make map_reduce_cf_time_histogram() file-local
It's not used outside of api/column_family.cc

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2025-08-18 11:00:53 +03:00
Pavel Emelyanov
b0db83575c api: Remove unused ctx argument from run_toppartitions_query()
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2025-08-18 11:00:53 +03:00
Botond Dénes
8498bd6376 Merge 'Replace container_to_vec with std::ranges' from Pavel Emelyanov
The helper in question converts an iterable collection to a vector of fmt::to_string()-s of the collection elements.
Patch the caller to use standard library and remove the helper.

Closes scylladb/scylladb#24357

* github.com:scylladb/scylladb:
  api: Drop no longer used container_to_vec helper
  api: Use std::ranges to stringify collections
  api: Use std::ranges to convert std::set<sstring> to std::vector<string>
  api: Use db::config::data_file_directories()' vector directly
  api: Coroutinize get_live_endpoint()
2025-06-06 10:57:06 +03:00
Pavel Emelyanov
428edd41f5 api: Make us of datablse::get_all_keyspaces()
There are two places in the API that want to get the list of keyspace
names. For that they call database::get_keyspaces() and then extract
keys from the returned name to class keyspace map.

There's a database::get_all_keyspaces() method that does exactly that.

Remove the map_keys helper from the api/api.hh that becomes unused.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>

Closes scylladb/scylladb#24353
2025-06-06 10:53:09 +03:00
Pavel Emelyanov
b943902ff7 api: Use std::ranges to convert std::set<sstring> to std::vector<string>
The column_family/get_sstables_for_key endpoint collects a set of
sstable names and converts it to vector of strings using homebrew
helper. The std::ranges convertor works just as nice.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2025-06-02 20:09:28 +03:00
Botond Dénes
542b2ed0de Merge 'Remove req_params facility from API' from Pavel Emelyanov
The class was introduced to facilitate path and query parameters parsing from requests, but in fact it's mostly dead code.

First, the class introduces the concept of  "mandatory" parameters which are seastar path params. If missing, the parameter validation throws, but in all cases where this option is used in scylla it's impossible to get empty path param -- if the parameter is missing seastar returns 404 (not found) before calling handler.

Second, the req_params::get<T>() doesn't work for anything but string argument (or types such that optional<T> can be implicitly casted to optional<sstring>). And it's in fact only used to get sstrings, so it compiles and works so far.

The remaining ability to parse bool from string is partially duplicated by the validate_bool() method. Using plain method to parse string to bool is less code than req_params introduce.

One (arguably) useful thing req_params do it validate the incoming request _not_ to contain unknown query parameters. However, quite a few endpoints use this, most of them just cherry-pick parameters they want and ignore the others. There's already a comprehensive description of accepted parameters for each endpoint in api-doc/ and req_params duplicate it. Good validation code should rely on api-doc/, not on its partial copy.

Having said that, this PR introduces validate_bool_x() helper to do req_params-like parsing of strings to bools, patches existing handlers to use existing parameters parsing facilities (such as validate_keyspace() and parse_table_infos()) and drops the req_params.

Closes scylladb/scylladb#24159

* github.com:scylladb/scylladb:
  api: Drop class req_params
  api: Stop using req_params in parse_scrub_options
  api: Stop using req_params in tasks::force_keyspace_compaction_async
  api: Stop using req_params in ss::force_keyspace_compaction
  api: Stop using req_params in ss::force_compaction
  api: Stop using req_params in cf::force_major_compaction
  api: Add validate_bool_x() helper
2025-05-27 14:29:05 +03:00
Pavel Emelyanov
a320550bd1 api: Stop using req_params in cf::force_major_compaction
The mandatory "name" parameter can be picked directly from request path
params, as described in the PR description.

The "split_output" is placeholder and is just checked for being there at
all, without any parsing.

Other parameters are query ones too, and are parsed with the help of
recently introduced validate_bool_x helper.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2025-05-15 11:07:52 +03:00
Pavel Emelyanov
2e83b0367f api: Use structured bindings in get_built_indexes() code
Shorter this way

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>

Closes scylladb/scylladb#24155
2025-05-14 19:03:13 +03:00
Nadav Har'El
5fd2eabd48 Merge 'Generalize the diversity of parse_table_infos() callers in API' from Pavel Emelyanov
The helper in question is used in several different ways -- by handlers directly (most of the callers), as a part of wrap_ks_cf() helper and by one of its overloads that unpack the "cf" query parameter from request. This PR generalizes most of the described callers thus reducing the number differently-looking of ways API handlers parse "keyspace" and "cf" request parameters.

Continuation of #22742

Closes scylladb/scylladb#23368

* github.com:scylladb/scylladb:
  api: Squash two parse_table_infos into one
  api: Generalize keyspaces:tables parsing a little bit more
  api: Provide general pair<keyspace, vector<table>> parsing
  api: Remove ks_cf_func and related code
2025-04-22 15:40:06 +03:00
Pavel Emelyanov
dc3455bc55 api: Provide general pair<keyspace, vector<table>> parsing
Lots of API handlers get "keyspace" path parameter and parse the "cf"
query one into a vector of table_infos. Generalize those places.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2025-03-19 15:51:57 +03:00
Pavel Emelyanov
1ba91e28cb sstables: Make get_filename() return component_name
Similarly to previous patches -- mostly the result is used as log
argument. The remaining users include

- scylla sstable tool that dumps component names to json output
- API endpoint that returns component names to user
- tests

these are all good to explicitly convert component_names to strings.

There are few more places that expect strings instead of component name
objects. For now they also use fmt::to_string() explicitly, partially it
will be fixed later, mostly -- as future follow-ups.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2025-03-19 13:03:29 +03:00
Pavel Emelyanov
c084de1406 api: Generalize disk space counting for table and system
Now when the bodies of both map-reduce reducers are the same, they can
be generalized with each other.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2025-03-04 19:56:16 +03:00
Pavel Emelyanov
4e2abba5a1 api: Use map_reduce_cf_raw() overload with table name
The existing helper that counds disk space usage for a table map-reduces
the table object "by hand". Its peer that counts the usage for all
tables uses the map_reduce_cf_raw() helper. The latter exists for
specific table as well, so the first counter can benefit from using it.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2025-03-04 19:55:05 +03:00
Pavel Emelyanov
b43e2390db api: Don't collect sstables map to count disk space usage
All the API calls that collect disk usage of sstables accumulate
map<sstable name, disk size>, then merges shard maps into one, then
counts the "disk size" values and drops the map itself on the floor.
This is waste of CPU cycles, disk usage can be just summed up along
cf/sstables iterations, no need to accumulate map with names for that.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2025-03-04 19:53:42 +03:00
Pavel Emelyanov
ac989f7c30 api: Remove get_uuid() local helper
This helper now fully duplicates the validate_table() one, so it
can be removed. Two callers are updated respectively.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2025-02-17 11:42:33 +03:00
Kefu Chai
7ff0d7ba98 tree: Remove unused boost headers
This commit eliminates unused boost header includes from the tree.

Removing these unnecessary includes reduces dependencies on the
external Boost.Adapters library, leading to faster compile times
and a slightly cleaner codebase.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#22857
2025-02-15 20:32:22 +02:00
Nadav Har'El
c2b870ee54 Merge 'De-duplicate validation of tables in some column_family API endpoints' from Pavel Emelyanov
In column_family.cc and storage_service.cc there exist a bunch of helpers that parse and/or validate ks/cf names, and different endpoints use different combinations of those, duplicating the functionality of each other and generating some mess. This PR cleans the endpoints from column_family.cc that parse and validate fully qualified table name (the '$ks:$cf' string).

A visible "improvement" is that `validate_table()` helper usage in the api/ directory is narrowed down to storage_service.cc file only (with the intent to remove that helper completely), and the aforementioned `for_tables_on_all_shards()` helper becomes shorter and tiny bit faster, because it doesn't perform some re-lookups of tables, that had been performed by validation sanity checks before it.

There's more to be done in those helpers, this PR wraps only one part of this mess.

Below is the list of endpoints this PR affects and the tests that validate the changes:

|endpoint|test|
|-|-|
|column_family/autocompaction|rest_api/test_column_family::test_column_family_auto_compaction_table|
|column_family/tombstone_gc|rest_api/test_column_family::test_column_family_tombstone_gc_api|
|column_family/compaction_strategy|rest_api/test_column_family/test_column_family_compaction_strategy|
|compaction_manager/stop_keyspace_compaction/|rest_api/test_compaction_manager::{test_compaction_manager_stop_keyspace_compaction,test_compaction_manager_stop_keyspace_compaction_tables}|

Closes scylladb/scylladb#21533

* github.com:scylladb/scylladb:
  api: Hide parse_tables() helper
  api: Use parse_table_infos() in stop_keyspace_compaction handler
  api: Re-use parse_table_info() in column_family API
  api: Make get_uuid() return table_info (and rename)
  api: Remove keyspace argument from for_table_on_all_shards()
  api: Switch for_table_on_all_shards() to use table_info-s
  api: Hide validate_table() helper
  api: Tables vector is never empty now in for_table_on_all_shards()
  api: Move vectors of tables, not copy
  api: Add table validation to set_compaction_strategy_class endpoint
  api: Use get_uuid() to validate_table() in column family API
  api: Use parse_table_infos() in column family API
2025-02-06 17:28:08 +01:00
Pavel Emelyanov
fb09a645b8 api: Re-use parse_table_info() in column_family API
Several places call parse_fully_qualified_cf_name() and get_uuid()
helpers one after another. Previous patch introduced the
parse_table_info() one that wraps both.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2025-01-13 11:32:07 +03:00
Pavel Emelyanov
789f468f39 api: Make get_uuid() return table_info (and rename)
The method gets "fully qualified" table name, which is 'ks:cf' string
and returns back the resolved table_id value. Some callers will benefit
from knowing the parsed 'cf' part of it (see next patch).

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2025-01-13 11:32:07 +03:00
Pavel Emelyanov
b55c05c9d0 api: Remove keyspace argument from for_table_on_all_shards()
This argument is needed to find table by ks:cf prair. The "table" part
is taken from the vector of table_info-s, but table_info-s have table_id
value onboard, and the table can be found by this id. So keyspace is not
needed any longer.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2025-01-13 11:32:07 +03:00
Pavel Emelyanov
84ad9fe82b api: Switch for_table_on_all_shards() to use table_info-s
All callers of it already have one. Next patch will make even more use
of those passed table_info-s.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2025-01-13 11:32:07 +03:00
Pavel Emelyanov
5a038fba39 api: Tables vector is never empty now in for_table_on_all_shards()
Callers of this method provide vectors of two kinds:
- explicitly single-entry one from endpoints that work on single table
- vector returned by parse_table_infos()

The latter helper, if it gets empty list of tables from user, populates
its return value with all tables from the given keyspace.

The removed check became obsolete after recent changes. Prior to those,
the 2nd case provided vector from another helper called parse_tables(),
which could return empty result.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2025-01-13 11:32:07 +03:00
Pavel Emelyanov
2016b12252 api: Move vectors of tables, not copy
The set_tables_...() helper called here accept vector by value, so the
existing code copies it. It's better to move, all the more so next
changes will make this place pass vectors with more data onboard.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2025-01-13 11:32:07 +03:00
Pavel Emelyanov
bf715ca614 api: Add table validation to set_compaction_strategy_class endpoint
This handler doesn't check if the requested table exists. If it doesn't
it will throw later anyway, but most of other endpoints that work with
tables check table early. This early check allows throwing bad-param
exception on missing table, not internal-server-error one.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2025-01-13 11:32:07 +03:00
Pavel Emelyanov
e35245de36 api: Use get_uuid() to validate_table() in column family API
This helper returns uuid, but also "Validates" the table exists by
calling db.find_uuid() and throwing bad_param exception on error.
This change will allow making for_table_on_all_shards() smaller a bit
later.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2025-01-13 11:32:07 +03:00