scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-04-28 04:06:59 +00:00

Author	SHA1	Message	Date
Nadav Har'El	5fd2eabd48	Merge 'Generalize the diversity of parse_table_infos() callers in API' from Pavel Emelyanov The helper in question is used in several different ways -- by handlers directly (most of the callers), as a part of wrap_ks_cf() helper and by one of its overloads that unpack the "cf" query parameter from request. This PR generalizes most of the described callers thus reducing the number of differently-looking ways API handlers parse "keyspace" and "cf" request parameters. Continuation of #22742 Closes scylladb/scylladb#23368 * github.com:scylladb/scylladb: api: Squash two parse_table_infos into one api: Generalize keyspaces:tables parsing a little bit more api: Provide general pair<keyspace, vector<table>> parsing api: Remove ks_cf_func and related code	2025-04-22 15:40:06 +03:00
Pavel Emelyanov	dc3455bc55	api: Provide general pair<keyspace, vector<table>> parsing Lots of API handlers get "keyspace" path parameter and parse the "cf" query one into a vector of table_infos. Generalize those places. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2025-03-19 15:51:57 +03:00
Pavel Emelyanov	1ba91e28cb	sstables: Make get_filename() return component_name Similarly to previous patches -- mostly the result is used as log argument. The remaining users include - scylla sstable tool that dumps component names to json output - API endpoint that returns component names to user - tests these are all good to explicitly convert component_names to strings. There are few more places that expect strings instead of component name objects. For now they also use fmt::to_string() explicitly, partially it will be fixed later, mostly -- as future follow-ups. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2025-03-19 13:03:29 +03:00
Pavel Emelyanov	c084de1406	api: Generalize disk space counting for table and system Now when the bodies of both map-reduce reducers are the same, they can be generalized with each other. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2025-03-04 19:56:16 +03:00
Pavel Emelyanov	4e2abba5a1	api: Use map_reduce_cf_raw() overload with table name The existing helper that counts disk space usage for a table map-reduces the table object "by hand". Its peer that counts the usage for all tables uses the map_reduce_cf_raw() helper. The latter exists for specific table as well, so the first counter can benefit from using it. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2025-03-04 19:55:05 +03:00
Pavel Emelyanov	b43e2390db	api: Don't collect sstables map to count disk space usage All the API calls that collect disk usage of sstables accumulate map<sstable name, disk size>, then merges shard maps into one, then counts the "disk size" values and drops the map itself on the floor. This is waste of CPU cycles, disk usage can be just summed up along cf/sstables iterations, no need to accumulate map with names for that. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2025-03-04 19:53:42 +03:00
Pavel Emelyanov	ac989f7c30	api: Remove get_uuid() local helper This helper now fully duplicates the validate_table() one, so it can be removed. Two callers are updated respectively. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2025-02-17 11:42:33 +03:00
Kefu Chai	7ff0d7ba98	tree: Remove unused boost headers This commit eliminates unused boost header includes from the tree. Removing these unnecessary includes reduces dependencies on the external Boost.Adapters library, leading to faster compile times and a slightly cleaner codebase. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#22857	2025-02-15 20:32:22 +02:00
Nadav Har'El	c2b870ee54	Merge 'De-duplicate validation of tables in some column_family API endpoints' from Pavel Emelyanov In column_family.cc and storage_service.cc there exist a bunch of helpers that parse and/or validate ks/cf names, and different endpoints use different combinations of those, duplicating the functionality of each other and generating some mess. This PR cleans the endpoints from column_family.cc that parse and validate fully qualified table name (the '$ks:$cf' string). A visible "improvement" is that `validate_table()` helper usage in the api/ directory is narrowed down to storage_service.cc file only (with the intent to remove that helper completely), and the aforementioned `for_tables_on_all_shards()` helper becomes shorter and tiny bit faster, because it doesn't perform some re-lookups of tables, that had been performed by validation sanity checks before it. There's more to be done in those helpers, this PR wraps only one part of this mess. Below is the list of endpoints this PR affects and the tests that validate the changes: \|endpoint\|test\| \|-\|-\| \|column_family/autocompaction\|rest_api/test_column_family::test_column_family_auto_compaction_table\| \|column_family/tombstone_gc\|rest_api/test_column_family::test_column_family_tombstone_gc_api\| \|column_family/compaction_strategy\|rest_api/test_column_family/test_column_family_compaction_strategy\| \|compaction_manager/stop_keyspace_compaction/\|rest_api/test_compaction_manager::{test_compaction_manager_stop_keyspace_compaction,test_compaction_manager_stop_keyspace_compaction_tables}\| Closes scylladb/scylladb#21533 * github.com:scylladb/scylladb: api: Hide parse_tables() helper api: Use parse_table_infos() in stop_keyspace_compaction handler api: Re-use parse_table_info() in column_family API api: Make get_uuid() return table_info (and rename) api: Remove keyspace argument from for_table_on_all_shards() api: Switch for_table_on_all_shards() to use table_info-s api: Hide validate_table() helper api: Tables vector is never empty now in for_table_on_all_shards() api: Move vectors of tables, not copy api: Add table validation to set_compaction_strategy_class endpoint api: Use get_uuid() to validate_table() in column family API api: Use parse_table_infos() in column family API	2025-02-06 17:28:08 +01:00
Pavel Emelyanov	fb09a645b8	api: Re-use parse_table_info() in column_family API Several places call parse_fully_qualified_cf_name() and get_uuid() helpers one after another. Previous patch introduced the parse_table_info() one that wraps both. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2025-01-13 11:32:07 +03:00
Pavel Emelyanov	789f468f39	api: Make get_uuid() return table_info (and rename) The method gets "fully qualified" table name, which is 'ks:cf' string and returns back the resolved table_id value. Some callers will benefit from knowing the parsed 'cf' part of it (see next patch). Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2025-01-13 11:32:07 +03:00
Pavel Emelyanov	b55c05c9d0	api: Remove keyspace argument from for_table_on_all_shards() This argument is needed to find table by ks:cf pair. The "table" part is taken from the vector of table_info-s, but table_info-s have table_id value onboard, and the table can be found by this id. So keyspace is not needed any longer. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2025-01-13 11:32:07 +03:00
Pavel Emelyanov	84ad9fe82b	api: Switch for_table_on_all_shards() to use table_info-s All callers of it already have one. Next patch will make even more use of those passed table_info-s. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2025-01-13 11:32:07 +03:00
Pavel Emelyanov	5a038fba39	api: Tables vector is never empty now in for_table_on_all_shards() Callers of this method provide vectors of two kinds: - explicitly single-entry one from endpoints that work on single table - vector returned by parse_table_infos() The latter helper, if it gets empty list of tables from user, populates its return value with all tables from the given keyspace. The removed check became obsolete after recent changes. Prior to those, the 2nd case provided vector from another helper called parse_tables(), which could return empty result. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2025-01-13 11:32:07 +03:00
Pavel Emelyanov	2016b12252	api: Move vectors of tables, not copy The set_tables_...() helper called here accept vector by value, so the existing code copies it. It's better to move, all the more so next changes will make this place pass vectors with more data onboard. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2025-01-13 11:32:07 +03:00
Pavel Emelyanov	bf715ca614	api: Add table validation to set_compaction_strategy_class endpoint This handler doesn't check if the requested table exists. If it doesn't it will throw later anyway, but most of other endpoints that work with tables check table early. This early check allows throwing bad-param exception on missing table, not internal-server-error one. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2025-01-13 11:32:07 +03:00
Pavel Emelyanov	e35245de36	api: Use get_uuid() to validate_table() in column family API This helper returns uuid, but also "Validates" the table exists by calling db.find_uuid() and throwing bad_param exception on error. This change will allow making for_table_on_all_shards() smaller a bit later. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2025-01-13 11:32:07 +03:00
Pavel Emelyanov	6ab5bade21	api: Use parse_table_infos() in column family API The one is the same as parse_tables(), but returns back name:id pairs. This change will allow making for_table_on_all_shards() smaller a bit later, as well as removing the parse_tables() code eventually. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2025-01-13 11:32:07 +03:00
Kefu Chai	41de3a17e1	api: move histogram data into future to avoid deep copying Previously, we created a vector<utils_json::histogram> and returned it by copying into a future. Since histogram is a JSON representation of ihistogram, it can be heavyweight, making the vector copy overhead significant. Now we move the vector into the returned future instead of copying it, eliminating the deep copy overhead. The APIs backed by this function are marked deprecated, so this performance improvement is not that important. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#22004	2025-01-13 09:08:15 +03:00
Pavel Emelyanov	5eb3278d9e	api: Use built_views table in get_built_indexes API Somehow system."IndexInfo" table and column_family/built_indexes REST API endpoint declare an index "built" at slightly different times: The former a virtual table which declares an index completely built when it appears on the system.built_views table. The latter uses different data -- it takes the list of indexes in the schema and eliminates indexes which are still listed in the system.scylla_views_builds_in_progress table. The mentioned system. tables are updated at different times, so API notices the change a bit later. It's worth improving the consistency of these two APIs by making the REST API endpoint piggy-back the load_built_views() instead of load_view_build_progress(). With that change the filtering of indexes should be negated. Fixes #21587 Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2024-12-24 16:18:00 +03:00
Avi Kivity	f3eade2f62	treewide: relicense to ScyllaDB-Source-Available-1.0 Drop the AGPL license in favor of a source-available license. See the blog post [1] for details. [1] https://www.scylladb.com/2024/12/18/why-were-moving-to-a-source-available-license/	2024-12-18 17:45:13 +02:00
Avi Kivity	9024e4940c	counters.hh: drop unused boost includes Re-add them to source files that need them. Closes scylladb/scylladb#21738	2024-12-05 12:27:41 +02:00
Kefu Chai	bab12e3a98	treewide: migrate from boost::adaptors::transformed to std::views::transform now that we are allowed to use C++23. we now have the luxury of using `std::views::transform`. in this change, we: - replace `boost::adaptors::transformed` with `std::views::transform` - use `fmt::join()` when appropriate where `boost::algorithm::join()` is not applicable to a range view returned by `std::view::transform`. - use `std::ranges::fold_left()` to accumulate the range returned by `std::view::transform` - use `std::ranges::fold_left()` to get the maximum element in the range returned by `std::view::transform` - use `std::ranges::min()` to get the minimal element in the range returned by `std::view::transform` - use `std::ranges::equal()` to compare the range views returned by `std::view::transform` - remove unused `#include <boost/range/adaptor/transformed.hpp>` - use `std::ranges::subrange()` instead of `boost::make_iterator_range()`, to feed `std::views::transform()` a view range. to reduce the dependency to boost for better maintainability, and leverage standard library features for better long-term support. this change is part of our ongoing effort to modernize our codebase and reduce external dependencies where possible. limitations: there are still a couple places where we are still using `boost::adaptors::transformed` due to the lack of a C++23 alternative for `boost::join()` and `boost::adaptors::uniqued`. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#21700	2024-12-03 09:41:32 +02:00
Pavel Emelyanov	b158ca7346	api: Remove param field from req_param The req_param class is used to help parsing http request parameters from strings into exact types (typically some simple types like strings, integrals or boolean). On it there are three fields: - name -- the parameter name - param -- the parameter string value - value -- the parameter value of desired type The `param` thing is not really needed, it's only used by few places that print it into logs, but they may as well just print the `value` thing itself. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes scylladb/scylladb#21502	2024-11-11 17:47:55 +02:00
Pavel Emelyanov	d6169630a4	api: Rename set_tables -> for_tables_on_all_shards The former name is not extremely descriptive, hopefully the latter one is better in this sense. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2024-11-01 12:15:01 +03:00
Pavel Emelyanov	822758dffd	api: Remove foreach_column_family() helper There's a whole lot of helpers and wrappers in api/ that help handlers manipulate keyspaces and tables. One of those is foreach_column_family which calls the provided callable on a table on each shard. There's exactly the same (but a bit more flexible) set_table() helper nearby. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2024-11-01 12:13:35 +03:00
Avi Kivity	907da210b6	compound_compat: replace use of boost ranges with std ranges To reduce the dependency load, replace use of boost ranges with the std equivalent. Files that lost the indirect boost dependency have it added as a direct dependency.	2024-10-30 19:58:07 +02:00
Lakshmi Narayanan Sreethar	84d06a13c7	api: compaction: add `consider_only_existing_data` option Added a new parameter `consider_only_existing_data` to major compaction API endpoints. When enabled, major compaction will: - Force-flush all tables. - Force a new active segment in the commit log. - Compact all existing SSTables and garbage-collect tombstones by only checking the SSTables being compacted. Memtables, commit logs, and other SSTables not part of the compaction will not be checked, as they will only contain newer data that arrived after the compaction started. The `consider_only_existing_data` is passed down to the compaction descriptor's `gc_check_only_compacting_sstables` option to ensure that only the existing data is considered for garbage collection. The option is also passed to the `maybe_flush_commitlog` method to make sure all the tables are flushed and a new active segment is created in the commit log. Fixes #19728 Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com>	2024-09-05 17:25:45 +05:30
Avi Kivity	aa1270a00c	treewide: change assert() to SCYLLA_ASSERT() assert() is traditionally disabled in release builds, but not in scylladb. This hasn't caused problems so far, but the latest abseil release includes a commit [1] that causes a 1000 insn/op regression when NDEBUG is not defined. Clearly, we must move towards a build system where NDEBUG is defined in release builds. But we can't just define it blindly without vetting all the assert() calls, as some were written with the expectation that they are enabled in release mode. To solve the conundrum, change all assert() calls to a new SCYLLA_ASSERT() macro in utils/assert.hh. This macro is always defined and is not conditional on NDEBUG, so we can later (after vetting Seastar) enable NDEBUG in release mode. [1] `66ef711d68` Closes scylladb/scylladb#20006	2024-08-05 08:23:35 +03:00
Raphael S. Carvalho	ad5c5bca5f	replica: get rid of fragile compaction group intrusive list It was added to make integration of storage groups easier, but it's complicated since it's another source of truth and we could have problems if it becomes inconsistent with the group map. Fixes #18506. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2024-07-09 16:53:35 -03:00
Pavel Emelyanov	31d05925cc	api,database: Move auto-compaction toggle guard Toggling per-table auto-compaction enabling bit is guarded with on-database boolean and raii guard. It's only used by a single api/column_family.cc file, so it can live there. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2024-05-16 14:42:51 +03:00
Pavel Emelyanov	a43b178f72	api: Move some table manipulation helpers from storage_service Continuation of the previous patch -- helpers toggling tombstone_gc and auto_compaction on tables should live in the same file that uses them. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2024-05-16 14:42:50 +03:00
Pavel Emelyanov	862fcd7bc7	api: Move table-related calls from storage_service domain The storage_service/(enable\|disable)_(tombstone_gc\|auto_compaction) endpoints are not handled by storage_service _service_ and should rather live in the column_family/ domain which is handled by replica::database. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2024-05-16 14:42:50 +03:00
Pavel Emelyanov	ba53283d21	api: Reimplement some endpoints using existing helpers The (enable\|disable)_(tombstone_gc\|auto_compaction) endpoints living in column_family domain can benefit from the helpers that do the same in the storage_service domain. The "difference" is that c.f. endpoints do it per-table, while s.s. ones operate on a vector of tables, so the former is a corner case of the latter. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2024-05-16 14:42:50 +03:00
Pavel Emelyanov	231ffa623c	api: Lost unset of tombstone-gc endpoints On stop all endpoints must be unregistered, these three are lost Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2024-05-16 14:42:50 +03:00
Nadav Har'El	1aacfdf460	REST API: stop using deprecated, buggy, path parameter The API req->param["name"] to access parameters in the path part of the URL was buggy - it forgot to do URL decoding and the result of our use of it in Scylla was bugs like #5883 - where special characters in certain REST API requests got botched up (encoded by the client, then not decoded by the server). The solution is to replace all uses of req->param["name"] by the new req->get_path_param("name"), which does the decoding correctly. Unfortunately we needed to change 104 (!) callers in this patch, but the transformation is mostly mechanical and there is no functional changes in this patch. Another set of changes was to bring req, not req->param, to a few functions that want to get the path param. This patch avoids the numerous deprecation warnings we had before, and more importantly, it fixes #5883. Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2024-05-02 12:33:46 +03:00
Pavel Emelyanov	186b36165e	snapshot: Move per-table snap API to other snapshot endpoints So that they are collected in one place and to facilitate next patch that's going to use snapshot-ctl for per-table API too Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2024-04-25 10:05:01 +03:00
Pavel Emelyanov	ceac65be1e	api: Reserve vectors in advance Some endpoints in api/column_family fill vectors with data obtained from database and return them back. Since the amount of data is known in advance, it's good to reserve the vector. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2024-02-20 19:13:05 +03:00
Pavel Emelyanov	f3e58cb806	api: Use range-loop to iterate keyspaces The code uses standard for (;;) loop, but range version is nicer Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2024-02-20 19:12:12 +03:00
Kefu Chai	9afec2e3e7	api, compaction: promote flush_mode so that this enum type can be shared by other task(s) as well. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2024-02-01 11:25:53 +08:00
Kefu Chai	ffb5ad494f	api: do not include unused headers these unused includes were identified by clangd. see https://clangd.llvm.org/guides/include-cleaner#unused-include-warning for more details on the "Unused include" warning. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#16973	2024-01-25 11:28:02 +03:00
Benny Halevy	b12b142232	api: add /storage_service/compact For major compacting all tables in the database. The advantage of this api is that `commitlog->force_new_active_segment` happens only once in `database::flush_all_tables` rather than once per keyspace (when `nodetool compact` translates to a sequence of `/storage_service/keyspace_compaction` calls). Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-11-28 16:37:42 +02:00
Benny Halevy	1fd85bd37b	api: compaction: add flush_memtables option When flushing is done externally, e.g. by running `nodetool flush` prior to `nodetool compact`, flush_memtables=false can be passed to skip flushing of tables right before they are major-compacted. This is useful to prevent creation of small sstables due to excessive memtable flushing. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-11-28 16:37:42 +02:00
Aleksandra Martyniuk	e072a2341d	replica: api: return table_id instead of const table_id& Return table_id instead of const table_id& from database::find_uuid as copying table_id does not cause much overhead and simplifies methods signature.	2023-07-25 17:13:24 +02:00
Aleksandra Martyniuk	cdbfa0b2f5	replica: iterate safely over tables related maps Loops over _column_families and _ks_cf_to_uuid which may preempt are protected by reader mode of rwlock so that iterators won't get invalid.	2023-07-25 17:13:04 +02:00
Aleksandra Martyniuk	52afd9d42d	replica: wrap column families related maps into tables_metadata As a preparation for ensuring access safety for column families related maps, add tables_metadata, access to members of which would be protected by rwlock.	2023-07-25 16:13:00 +02:00
Aleksandra Martyniuk	61dc98b276	api: prevent non-owner cpu access to shared_ptr In get_sstables_for_key in api/column_family.cc a set of lw_shared_ptrs to sstables is passed to reducer of map_reduce0. Reducer then accesses these shared pointers. As reducer is invoked on the same shard map_reduce0 is called, we have an illegal access to shared pointer on non-owner cpu. A set of shared pointers to sstables is transformed in map function, which is guaranteed to be invoked on a shard associated with the service. Fixes: #14515. Closes #14532	2023-07-09 23:09:59 +03:00
Pavel Emelyanov	198bca98ec	table: Return shared sstable from get_sstables_by_partition_key() The call is generic enough not to drop the sstable itself on return so that callers can do whatever they need with it. The only today's caller is API which will convert sstables to filenames on its own Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-06-07 15:04:48 +03:00
Aleksandra Martyniuk	f48b57e7b9	compaction: use table_info in compaction tasks Task manager compaction tasks need table names for logs. Thus, compaction tasks store table infos instead of table ids. get_table_ids function is deleted as it isn't used anywhere.	2023-05-30 09:58:55 +02:00
Raphael S. Carvalho	abc1eae1c2	Add API to disable tombstone GC in compaction Adding new APIs /column_family/tombstone_gc and /storage_service/tombstone_gc. Mimics existing APIs /column_family/autocompaction and /storage_service/autocompaction. column_family variant must specify a single table only, following existing convention. whereas the storage_service one can specify an entire keyspace, or a subset of tables in a keyspace. column_family API usage ----- The table name must be in keyspace:name format Get status: curl -s -X GET "http://127.0.0.1:10000/column_family/tombstone_gc/ks:cf" Enable GC curl -s -X POST "http://127.0.0.1:10000/column_family/tombstone_gc/ks:cf" Disable GC curl -s -X DELETE "http://127.0.0.1:10000/column_family/tombstone_gc/ks:cf" storage_service API usage ----- Tables can be specified using a comma-separated list. Enable GC on keyspace curl -s -X POST "http://127.0.0.1:10000/storage_service/tombstone_gc/ks" Disable GC on keyspace curl -s -X DELETE "http://127.0.0.1:10000/storage_service/tombstone_gc/ks" Enable GC on a subset of tables curl -s -X POST "http://127.0.0.1:10000/storage_service/tombstone_gc/ks?cf=table1,table2" Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2023-05-12 10:34:38 -03:00
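
Several commits in this list refer to two request conventions: a fully qualified table name in `keyspace:name` form (a path parameter) and a comma-separated `cf` query parameter naming a subset of tables. The following is a minimal sketch of how such strings could be split, not ScyllaDB's actual implementation. The function names `split_qualified_name` and `split_table_list` are hypothetical, and the real helpers (`parse_table_info()`, `parse_table_infos()`) also resolve table ids and validate that the tables exist, which this sketch does not attempt.

```cpp
#include <optional>
#include <string>
#include <string_view>
#include <utility>
#include <vector>

// Hypothetical helper (not from the ScyllaDB tree): splits "ks:cf" at the
// first ':'. Returns std::nullopt if the separator is missing or either
// side of it is empty.
std::optional<std::pair<std::string, std::string>>
split_qualified_name(std::string_view name) {
    const auto pos = name.find(':');
    if (pos == std::string_view::npos || pos == 0 || pos + 1 == name.size()) {
        return std::nullopt;
    }
    return std::make_pair(std::string(name.substr(0, pos)),
                          std::string(name.substr(pos + 1)));
}

// Hypothetical helper: splits a "table1,table2" query value on commas,
// skipping empty items. An empty input yields an empty vector.
std::vector<std::string> split_table_list(std::string_view cf_param) {
    std::vector<std::string> tables;
    while (!cf_param.empty()) {
        const auto pos = cf_param.find(',');
        const auto item = cf_param.substr(0, pos);
        if (!item.empty()) {
            tables.emplace_back(item);
        }
        if (pos == std::string_view::npos) {
            break;
        }
        cf_param.remove_prefix(pos + 1);
    }
    return tables;
}
```

According to the 5a038fba39 message, `parse_table_infos()` treats an empty table list from the user as "all tables in the keyspace"; a caller of this sketch would have to apply that rule itself when `split_table_list` returns an empty vector.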

152 Commits