scylladb

Author	SHA1	Message	Date
Aleksandra Martyniuk	a468138716	api: storage_service: do not log the exception that is passed to user The exceptions that are thrown by the tasks started with API are propagated to users. Hence, there is no need to log it. Remove the logs about exception in user started tasks. Fixes: https://github.com/scylladb/scylladb/issues/16732. Closes scylladb/scylladb#25153 (cherry picked from commit `e607ef10cd`) Closes scylladb/scylladb#25296	2025-08-06 09:36:07 +03:00
Gleb Natapov	ece8a8b3bc	api: unregister raft_topology_get_cmd_status on shutdown In `c8ce9d1c60` we introduced raft_topology_get_cmd_status REST api but the commit forgot to unregister the handler during shutdown. Fixes #24910 Closes scylladb/scylladb#24911 (cherry picked from commit `89f2edf308`) Closes scylladb/scylladb#24922	2025-07-14 11:39:42 +02:00
Gleb Natapov	ad91198417	topology coordinator: add REST endpoint to query the status of ongoing topology cmd rpc The topology coordinator executes several topology cmd rpc against some nodes during a topology change. A topology operation will not proceed unless rpc completes (successfully or not), but sometimes it appears that it hangs and it is hard to tell on which nodes it did not complete yet. Introduce new REST endpoint that can help with debugging such cases. If executed on the topology coordinator it returns currently running topology rpc (if any) and a list of nodes that did not reply yet. (cherry picked from commit `c8ce9d1c60`)	2025-07-08 06:23:48 +00:00
Gleb Natapov	c644526bf9	api: return error from get_host_id_map if gossiper is not enabled yet. Token metadata api is initialized before gossiper is started. get_host_id_map REST endpoint cannot function without the fully initialized gossiper though. The gossiper is started deep in the join_cluster call chain, but if we move token_metadata api initialization after the call it means that no api will be available during bootstrap. This is not what we want. Make a simple fix by returning an error from the api if the gossiper is not initialized yet. Fixes: #24479 Closes scylladb/scylladb#24575 (cherry picked from commit `e364995e28`) Closes scylladb/scylladb#24587	2025-06-24 10:00:48 +03:00
Robert Bindar	a926cba476	Add support for nodetool refresh --skip-reshape This patch adds the new option in nodetool, patches the load_new_ss_tables REST request with a new parameter and skips the reshape step in refresh if this flag is passed. Signed-off-by: Robert Bindar <robert.bindar@scylladb.com> Closes scylladb/scylladb#24409 Fixes: #24365 (cherry picked from commit `ca1a9c8d01`) Closes scylladb/scylladb#24472	2025-06-13 14:06:19 +03:00
Pavel Emelyanov	c59327950b	api: Introduce skip_cleanup query parameter Just copy the load_and_stream and primary_replica_only logic, this new option is the same in this sense. Throw if it's specified with the load_and_stream one. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> (cherry picked from commit `1b1f653699`)	2025-06-05 17:48:35 +03:00
Pavel Emelyanov	4a7ddbfe07	code: Push bool skip_cleanup flag around Just put the boolean into the callstack between API and distributed loader to reduce the churn in the next patches. No functional changes, flag is false and unused. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> (cherry picked from commit `4ab049ac8d`)	2025-06-05 17:44:40 +03:00
Robert Bindar	b62264e1d9	Add nodetool refresh --scope option This change adds the --scope option to nodetool refresh. Like in the case of nodetool restore, you can pass either of: * node - On the local node. * rack - On the local rack. * dc - In the datacenter (DC) where the local node lives. * all (default) - Everywhere across the cluster. as scope. The feature is based on the existing load_and_stream paths, so it requires passing --load-and-stream to the refresh command. Also, it is not compatible with the --primary-replica-only option. Signed-off-by: Robert Bindar <robert.bindar@scylladb.com> Closes scylladb/scylladb#23861 (cherry picked from commit `c570941692`)	2025-06-04 11:59:17 +03:00
Nadav Har'El	5fd2eabd48	Merge 'Generalize the diversity of parse_table_infos() callers in API' from Pavel Emelyanov The helper in question is used in several different ways -- by handlers directly (most of the callers), as a part of wrap_ks_cf() helper and by one of its overloads that unpack the "cf" query parameter from request. This PR generalizes most of the described callers thus reducing the number of different-looking ways API handlers parse "keyspace" and "cf" request parameters. Continuation of #22742 Closes scylladb/scylladb#23368 * github.com:scylladb/scylladb: api: Squash two parse_table_infos into one api: Generalize keyspaces:tables parsing a little bit more api: Provide general pair<keyspace, vector<table>> parsing api: Remove ks_cf_func and related code	2025-04-22 15:40:06 +03:00
Gleb Natapov	6f53611337	gossiper: move force_remove_endpoint to work on host id Since the gossiper works on host ids now it is incorrect to leave this function to work on ip. It makes it impossible to delete outdated entry since the "gossiper.get_host_id(endpoint) != id" check will always be false for such entries (get_host_id() always returns the most up-to-date mapping).	2025-04-06 18:39:24 +03:00
Avi Kivity	882f405eed	Merge "Convert gossiper's endpoint state map to be host id based" from Gleb " The series makes endpoint state map in the gossiper addressable by host id instead of ips. The transition has implication outside of the gossiper as well. Gossiper based topology operations are affected by this change since they assume that the mapping is ip based. On wire protocol is not affected by the change as maps that are sent by the gossiper protocol remain ip based. If old node sends two different entries for the same host id the one with newer generation is applied. If new node has two ids that are mapped to the same ip the newer one is added to the outgoing map. Interoperability was verified manually by running mixed cluster. The series concludes the conversion of the system to be host id based. " * 'gleb/gossipper-endpoint-map-to-host-id-v2' of github.com:scylladb/scylla-dev: gossiper: make examine_gossiper private gossiper: rename get_nodes_with_host_id to get_node_ip treewide: drop id parameter from gossiper::for_each_endpoint_state treewide: move gossiper to index nodes by host id gossiper: drop ip from replicate function parameters gossiper: drop ip from apply_new_states parameters gossiper: drop address from handle_major_state_change parameter list gossiper: pass rpc::client_info to gossiper_shutdown verb handler gossiper: add try_get_host_id function gossiper: add ip to endpoint_state serialization: fix std::map de-serializer to not invoke value's default constructor gossiper: drop template from wait_alive_helper function gossiper: move get_supported_features and its users to host id storage_service: make candidates_for_removal host id based gossiper: use peers table to detect address change storage_service: use std::views::keys instead of std::views::transform that returns a key gossiper: move _pending_mark_alive_endpoints to host id gossiper: do not allow to assassinate endpoint in raft topology mode gossiper: fix indentation after previous patch gossiper: do not allow to assassinate non existing endpoint	2025-04-02 12:30:00 +03:00
Michał Chojnowski	a19d6d95f7	api: add the estimate_compression_ratios API call Add an API call which estimates the effectiveness of possible compression config changes. This can be used to make an informed decision about whether to change the compression method, without actually recompressing any SSTables.	2025-04-01 00:07:30 +02:00
Michał Chojnowski	58ae278d10	api: add the retrain_dict API call Add an API call which will retrain the SSTable compression dictionary for a given table. Currently, it needs all nodes to be alive to succeed. We can relax this later.	2025-04-01 00:07:29 +02:00
Michał Chojnowski	dd932ebb2f	compress: add hidden dictionary options Before this commit, "compression options" written into CompressionInfo.db (and used to construct a decompressor) have a 1:1 correspondence to "compression options" specified in the schema. But we want to add a new "compression option" -- the compression dictionary -- which will be written into CompressionInfo.db and used to construct decompressors, but won't be specified in the schema. To reconcile that, in this commit we introduce the notion of a "hidden option". If an option name in `CompressionInfo.db` begins with a dot, then this option will be used to construct decompressors, but won't be visible for other uses. (I.e. for the `sstable_info` API call and for recovering a fake `schema` from `CompressionInfo.db` in the `scylla sstable` tool). Then, we introduce the hidden `.dictionary.{0,1,2,..}` options, which hold the contents of the dictionary blob for this SSTable. (The dictionary is split into several parts because the SSTable format limits the length of a single option value to 16 bits, and dictionaries usually have a length greater than that). This commit only introduces helpers which translate dictionary blobs into "options" for CompressionInfo.db, and vice-versa, but it doesn't use those helpers yet. They will be used in later commits.	2025-04-01 00:07:28 +02:00
Michał Chojnowski	006c631642	sstables/compress: remove get_sstable_compressor() Following up on the previous commit, we avoid constructing a compressor in the `sstable_info` API call, and we instead read the compression options from the `sstable::compression`.	2025-04-01 00:07:28 +02:00
Michał Chojnowski	f4ca94d13b	compress.hh: switch compressor::name() from an instance member to a virtual call Before this patch, `compressor` is designed to be a proper abstract class, where the creator of a compressor doesn't even know what he's creating -- he passes a name, and it gets turned into a `compressor` behind the scenes. But later, when creation of compressors will involve looking up dictionaries, this abstraction will only get in the way. So we give up on keeping `compressor` abstract, and instead of using "opaque" names we turn to an explicit enum of possible compressor types. The main point of this patch is to add the `algorithm` enum and the `algorithm_to_name()` function. The rest of the patch switches the `compressor::name()` function to use `algorithm_to_name()` instead of the passed-by-constructor `compressor::_name`, to keep a single source of truth for the names.	2025-04-01 00:07:27 +02:00
Gleb Natapov	28fb84117d	treewide: drop id parameter from gossiper::for_each_endpoint_state We have it in endpoint_state anyway, so no need to pass both.	2025-03-31 16:50:50 +03:00
Gleb Natapov	4609bbbbb2	treewide: move gossiper to index nodes by host id This patch changes gossiper to index nodes by host ids instead of ips. The main data structure that changes is _endpoint_state_map, but this results in a lot of changes since everything that uses the map directly or indirectly has to be changed. The big victim of this outside of the gossiper itself is topology over gossiper code. It works on IPs and assumes the gossiper does the same and both need to be changed together. Changes to other subsystems are much smaller since they already mostly work on host ids anyway.	2025-03-31 16:50:50 +03:00
Botond Dénes	ea55eed037	Merge 'Snapshot several tables at once in scrub API handler' from Pavel Emelyanov The scrub API handler may want to snapshot several tables. For that, it calls snapshot-ctl method to snapshot a single table for each table in the list. That's excessive, snapshot-ctl has a method to snapshot a bunch of tables at once, just what the scrub handler needs. It's an improvement, so no need to backport Closes scylladb/scylladb#23472 * github.com:scylladb/scylladb: snapshot-ctl: Remove unused snapshot-single-table method api: Snapshot all tables at once in scrub handler	2025-03-31 13:00:32 +03:00
Pavel Emelyanov	0077acd1bb	api: Properly validate table in tablet add\|del replica handlers The handlers in question just go and call database.find_column_family, in case the table in question doesn't exist, the no_such_column_family exception would be thrown, which is not nice. Proper behavior is to throw bad_param one and there's a helper that does it. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes scylladb/scylladb#23389	2025-03-31 10:03:17 +02:00
Pavel Emelyanov	5162f75d0b	api: Snapshot all tables at once in scrub handler The handler walks the list of tables and snapshots each one individually (if needed). That's not very optimal, each such call starts a "snapshot modification operation", which is switching to shard-0 for a lock, then calls the snapshot of multiple tables giving it vector of a single name. There's a method of snapshot-ctl that snapshots several tables at once, no need to open-code it here. One thing to care about -- the take_column_family_snapshot() throws when the vector of table names is empty, so need an explicit skipping check. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2025-03-28 10:44:47 +03:00
Pavel Emelyanov	6e7d6b06f0	api: Squash two parse_table_infos into one There are currently three of them: - one that works on query parameter value - one that works on query parameters map - one that works on the request itself The second one is not used any longer by anyone but the third one, so squash them together. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2025-03-19 15:53:38 +03:00
Pavel Emelyanov	851bd38953	api: Generalize keyspaces:tables parsing a little bit more Continuation of the previous patch -- there's one caller that uses "non standard" name for the tables query parameter. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2025-03-19 15:52:54 +03:00
Pavel Emelyanov	dc3455bc55	api: Provide general pair<keyspace, vector<table>> parsing Lots of API handlers get "keyspace" path parameter and parse the "cf" query one into a vector of table_infos. Generalize those places. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2025-03-19 15:51:57 +03:00
Pavel Emelyanov	722f282748	api: Remove ks_cf_func and related code The type in question is used by two endpoint handlers that are called with validated keyspace name and parsed vector of table_info-s. Both handlers can parse what they need on their own, all the more so next patches will make this parsing even more simpler. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2025-03-19 15:49:55 +03:00
Pavel Emelyanov	1ba91e28cb	sstables: Make get_filename() return component_name Similarly to previous patches -- mostly the result is used as log argument. The remaining users include - scylla sstable tool that dumps component names to json output - API endpoint that returns component names to user - tests these are all good to explicitly convert component_names to strings. There are few more places that expect strings instead of component name objects. For now they also use fmt::to_string() explicitly, partially it will be fixed later, mostly -- as future follow-ups. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2025-03-19 13:03:29 +03:00
Botond Dénes	fda3486770	Merge 'Remove some excessive ks:cf -> table_id conversions in API and schema_tables' from Pavel Emelyanov Actually, the main goal of this PR was to remove parse_tables() helpers from api/ in favor of more flexible (yet same complex) parse_table_infos(), but it turned out that it also saves some lookups in database maps. There are several places in API and schema_tables that have table_id at hand, but at some point drop it and carry keyspace and table names over to a place that maps ks:cf back to table_id and then uses it to find the table object. This PR keeps the table_id with the help of table_info struct in those places. This change allows removing the aforementioned parse_table() helpers from api/ and also saves few lookups in database maps. Removing the parse_tables() from api/ is the continuation of previous effort that reduces the set of helpers in api/ code that help handlers "parse" keyspaces and tables names see #22742 #21533 Closes scylladb/scylladb#23216 * github.com:scylladb/scylladb: api: Remove the remaining parse_tables() overload database: Sanitize flush_tables_on_all_shards() schema_tables: Remove all_table_names() database: Make tables flushing helper use table_info-s, not names api: Make keyspace flush endpoint use parse_table_infos() (and a bit more) schema_tables,client_state: Switch to using all_table_infos() schema_tables: Tune up some methods to benefit from table_infos schema_tables: Introduce all_table_infos()	2025-03-17 15:40:41 +02:00
Avi Kivity	4416b0c732	treewide: use angle brackets for including seastar headers Seastar is an external library, so we use angle brackets to include its interfaces. Closes scylladb/scylladb#23301	2025-03-17 10:03:06 +02:00
Gleb Natapov	24d30073f9	messaging_service: pass host id to remove_rpc_client in down notification Do not iterate over all clients indexed by host id to search for those with given IP. Look up by host id directly since now we know it in down notification. In cases host id is not known look it up by ip.	2025-03-11 12:09:22 +02:00
Gleb Natapov	eb59205caf	gossiper: drop deprecated unsafe_assassinate_endpoint operation It was always deprecated.	2025-03-11 12:09:21 +02:00
Gleb Natapov	0e3dcb7954	treewide: move everyone to use host id based gossiper::is_alive and drop ip based one	2025-03-11 12:09:21 +02:00
Gleb Natapov	e47f251178	gossiper: move _live_endpoints and _unreachable_endpoints endpoint to host_id Index live and dead endpoints by host id. It also allows to simplify some code that does a translation.	2025-03-11 12:09:21 +02:00
Gleb Natapov	c4a0fbae16	gossiper: check id match inside force_remove_endpoint Before calling force_remove_endpoint (which works on ip) the code checks that the ip maps to the correct id (so as not to remove a new node that inherited this ip by mistake). Move the check to the function itself.	2025-03-11 12:09:20 +02:00
Pavel Emelyanov	db70c7bbf7	api: Remove the remaining parse_tables() overload There's only one caller of it left -- the scrub handler. It can use the parse_table_infos() one and get table names from it. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2025-03-10 13:14:10 +03:00
Pavel Emelyanov	89f3c1a91e	database: Sanitize flush_tables_on_all_shards() Previous patch left this method with few uglinesses - the vector<table_id> argument is named table_names - the sstring keyspace argument is unused - the keyspace argument is captured for no use Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2025-03-10 13:13:10 +03:00
Pavel Emelyanov	c2d23d7948	database: Make tables flushing helper use table_info-s, not names The database::flush_tables_on_all_shards() method accepts a keyspace name and a vector of table names. Then it converts ks:cf pair for each of the table name into a table-id and flushes the table with the ID. All the callers of that method already have or can easily get the vector of table_id-s, not just names, so make use of this. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2025-03-10 13:11:32 +03:00
Pavel Emelyanov	e94dce1725	api: Make keyspace flush endpoint use parse_table_infos() (and a bit more) Currently the handler in question calls parse_tables() which returns empty list of tables if the "cf" parameter is missing, or the table names if it's present. In the former case the handler will call flush_keyspace_on_all_shards() that just gets all table names from the keyspace and flushes them all. This change makes the handler use parse_table_infos() which is different -- when the "cf" parameter is missing, it gets all tables from the keyspace. So the handler no longer needs to call the keyspace flush, it can always call the "flush the list of tables" helper. With that change one of the parse_tables() helpers becomes unused, so remove it. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2025-03-10 13:06:55 +03:00
Pavel Emelyanov	c084de1406	api: Generalize disk space counting for table and system Now when the bodies of both map-reduce reducers are the same, they can be generalized with each other. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2025-03-04 19:56:16 +03:00
Pavel Emelyanov	4e2abba5a1	api: Use map_reduce_cf_raw() overload with table name The existing helper that counts disk space usage for a table map-reduces the table object "by hand". Its peer that counts the usage for all tables uses the map_reduce_cf_raw() helper. The latter exists for specific table as well, so the first counter can benefit from using it. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2025-03-04 19:55:05 +03:00
Pavel Emelyanov	b43e2390db	api: Don't collect sstables map to count disk space usage All the API calls that collect disk usage of sstables accumulate map<sstable name, disk size>, then merges shard maps into one, then counts the "disk size" values and drops the map itself on the floor. This is waste of CPU cycles, disk usage can be just summed up along cf/sstables iterations, no need to accumulate map with names for that. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2025-03-04 19:53:42 +03:00
Nadav Har'El	ea19b79fe2	Merge 'De-duplicate API's table name to table ID conversion' from Pavel Emelyanov This is continuation of #21533 There are two almost identical helpers in api/ -- validate_table(ks, cf) and get_uuid(ks, cf). Both check if the ks:cf table exists, throwing bad_param_exception if it doesn't. There's slight difference in their usage, namely -- callers of the latter one get the table_id found and make use of it, while the former helper is void and its callers need to re-search for the uuid again if the need (spoiler: they do). This PR merges two helpers together, so there's less code to maintain. As a nice side effect, the existing validate_table() callers save one re-lookup of the ks:cf pair in database mappings. Affected endpoints are validated by existing tests: * column_family/{autocompation\|tombstone_gc\|compaction_strategy}, validated by the tests described in #21533 * /storage_service/{range_to_endpoint_map\|describe_ring\|ownership}, validated by nodetool tests * /storage_service/tablets/{move\|repair}, validated by tablets move and repair tests Closes scylladb/scylladb#22742 * github.com:scylladb/scylladb: api: Remove get_uuid() local helper api: Make use of validate_table()'s table_id api: Make validate_table() helper return table_id after validation api: Change validate_table()'s ctx argument to database	2025-03-03 13:39:50 +02:00
Asias He	3f59a89e85	repair: Fix return type for storage_service/tablets/repair API The API returns the repair task UUID. For example: {"tablet_task_id":"3597e990-dc4f-11ef-b961-95d5ead302a7"} Fixes #23032 Closes scylladb/scylladb#23050	2025-02-27 12:38:12 +02:00
Gleb Natapov	914c9f1711	treewide: include build_mode.hh for SCYLLA_BUILD_MODE_RELEASE where it is missing Fixes: #22914 Closes scylladb/scylladb#22915	2025-02-20 10:50:04 +03:00
Pavel Emelyanov	ac989f7c30	api: Remove get_uuid() local helper This helper now fully duplicates the validate_table() one, so it can be removed. Two callers are updated respectively. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2025-02-17 11:42:33 +03:00
Pavel Emelyanov	a4cbc4db55	api: Make use of validate_table()'s table_id There are several places that validate_table() and then call database::find_column_family(ks, cf) which goes and repeats the search done by validate_table() before that. To remove the unneeded work, re-use the table_id found by validate_table() helper. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2025-02-17 11:42:33 +03:00
Pavel Emelyanov	e698259557	api: Make validate_table() helper return table_id after validation This helper calls database::find_column_family() and ignores the result. The intention of this is just to check if the c.f. in question exists. The find_column_family() in turn calls find_uuid() and then finds the c.f. object using the uuid found. The latter search is not supposed to fail, if it does, the on_internal_error() is called. Said that, replacing find_column_family() with find_uuid() is idempotent. And returning the found table_id will be used by next patch. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2025-02-17 11:42:32 +03:00
Pavel Emelyanov	1991512826	api: Change validate_table()'s ctx argument to database This is to be in-sync with another get_uuid() helper from API. This, in turn, is to ease the unification of those two, because they are effectively identical (see next patches) Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2025-02-17 11:42:32 +03:00
Botond Dénes	3439d015cb	Merge 'repair: Introduce Host and DC filter support' from Aleksandra Martyniuk Currently, the tablet repair scheduler repairs all replicas of a tablet. It does not support hosts or DCs selection. It should be enough for most cases. However, users might still want to limit the repair to certain hosts or DCs in production. https://github.com/scylladb/scylladb/pull/21985 added the preparation work to add the config options for the selection. This patch adds the hosts or DCs selection support. Fixes https://github.com/scylladb/scylladb/issues/22417 New feature. No backport is needed. Closes scylladb/scylladb#22621 * github.com:scylladb/scylladb: test: add test to check dcs and hosts repair filter test: add repair dc selection to test_tablet_metadata_persistence repair: Introduce Host and DC filter support docs: locator: update the docs and formatter of tablet_task_info	2025-02-17 10:04:09 +02:00
Kefu Chai	7ff0d7ba98	tree: Remove unused boost headers This commit eliminates unused boost header includes from the tree. Removing these unnecessary includes reduces dependencies on the external Boost.Adapters library, leading to faster compile times and a slightly cleaner codebase. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#22857	2025-02-15 20:32:22 +02:00
Asias He	5545289bfa	repair: Introduce Host and DC filter support Currently, the tablet repair scheduler repairs all replicas of a tablet. It does not support hosts or DCs selection. It should be enough for most cases. However, users might still want to limit the repair to certain hosts or DCs in production. #21985 added the preparation work to add the config options for the selection. This patch adds the hosts or DCs selection support. Fixes #22417	2025-02-14 09:13:11 +01:00

1170 Commits