scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-04-27 20:05:10 +00:00

Author	SHA1	Message	Date
Botond Dénes	572003c469	Merge 'Cleanup the way snapshot details are propagated via API' from Pavel Emelyanov There's a database::get_snapshot_details() method that returns collection of all snapshots for all ks.cf out there and there are several snapshot_details aux structures around it. This PR keeps only one "details" and cleans up the way it propagates from database up to the respective API calls. Closes scylladb/scylladb#18317 * github.com:scylladb/scylladb: snapshot_ctl: Brush up true_snapshots_size() internals snapshot_ctl: Remove unused details struct snapshot_ctl: No double recoding of details database,snapshots: Move database::snapshot_details into snapshot_ctl database,snapshots: Make database::get_snapshot_details() return map, not vector table,snapshots: Move table::snapshot_details into snapshot_ctl	2024-04-23 16:28:25 +03:00
Pavel Emelyanov	e8f10be12e	snapshot_ctl: No double recoding of details Currently database::get_snapshot_details() returns a collection of snapshots. The snapshot_ctl converts this collection into similarly looking one with slightly different structures inside. The resulting collection is converted one more time on the API layer into another similarly looking map. This patch removes the intermediate conversion. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2024-04-19 20:04:32 +03:00
Kefu Chai	a439ebcfce	treewide: include fmt/ranges.h and/or fmt/std.h before this change, we rely on the default-generated fmt::formatter created from operator<<, but fmt v10 dropped the default-generated formatter. in this change, we include `fmt/ranges.h` and/or `fmt/std.h` for formatting the container types, like vector, map, optional and variant, using {fmt} instead of the homebrew formatter based on operator<<. with this change, the changes adding fmt::formatter and the changes using ostream formatter explicitly, we are allowed to drop `FMT_DEPRECATED_OSTREAM` macro. Refs scylladb#13245 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2024-04-19 22:56:16 +08:00
Kamil Braun	eb9ba914a3	Merge 'Set dc and rack in gossiper when loaded from system.peers and load the ignored nodes state for replace' from Benny Halevy The problem this series solves is correctly ignoring DOWN nodes state when replacing a node. When a node is replaced and there are other nodes that are down, the replacing node is told to ignore those DOWN nodes using the `ignore_dead_nodes_for_replace` option. Since the replacing node is bootstrapping it starts with an empty system.peers table so it has no notion about any node state and it learns about all other nodes via gossip shadow round done in `storage_service::prepare_replacement_info`. Normally, since the DOWN nodes to ignore already joined the ring, the remaining node will have their endpoint state already in gossip, but if the whole cluster was restarted while those DOWN nodes did not start, the remaining nodes will only have a partial endpoint state from them, which is loaded from system.peers. Currently, the partial endpoint state contains only `HOST_ID` and `TOKENS`, and in particular it lacks `STATUS`, `DC`, and `RACK`. The first part of this series loads also `DC` and `RACK` from system.peers to make them available to the replacing node as they are crucial for building a correct replication map with network topology replication strategy. But still, without a `STATUS` those nodes are not considered as normal token owners yet, and they do not go through handle_state_normal which adds them to the topology and token_metadata. The second part of this series uses the endpoint state retrieved in the gossip shadow round to explicitly add the ignored nodes' state to topology (including dc and rack) and token_metadata (tokens) in `prepare_replacement_info`. If there are more DOWN nodes that are not explicitly ignored replace will fail (as it should). 
Fixes scylladb/scylladb#15787 Closes scylladb/scylladb#15788 * github.com:scylladb/scylladb: storage_service: join_token_ring: load ignored nodes state if replacing storage_service: replacement_info: return ignore_nodes state locator: host_id_or_endpoint: keep value as variant gms: endpoint_state: add getters for host_id, dc_rack, and tokens storage_service: topology_state_load: set local STATUS state using add_saved_endpoint gossiper: add_saved_endpoint: set dc and rack gossiper: add_saved_endpoint: fixup indentation gossiper: add_saved_endpoint: make host_id mandatory gossiper: add load_endpoint_state gossiper: start_gossiping: log local state	2024-04-16 10:27:36 +02:00
Pavel Emelyanov	8bad828208	api: Add method to delete replica from tablet Copied from the add_replica counterpart TODO: Generalize common parts of move_tablet and add_\|del_tablet_replica Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2024-04-15 16:31:07 +03:00
Benny Halevy	7c2bd8dc34	locator: host_id_or_endpoint: keep value as variant Rather than allowing to keep both host_id and endpoint, keep only one of them and provide resolve functions that use the token_metadata to resolve the host_id into an inet_address or vice versa. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2024-04-14 15:25:50 +03:00
Tomasz Grabiec	0c74c2c12f	Merge 'Extend tablet_transition_kind::rebuild to rebuild tablet to new replica' from Pavel Emelyanov When altering rf for a keyspace, all tablets in this ks will get more replicas. Part of this process is rebuilding tablets onto new node(s). This PR extends the tablets transition code to support rebuilding of tablet on new replica. fixes: #18030 Closes scylladb/scylladb#18082 * github.com:scylladb/scylladb: test: Check data presense as well test: Test how tablets are copied between nodes test: Add sanity test for tablet migration api: Add method to add replica to a tablet tablet: Make leaving replica optional	2024-04-05 12:51:10 +02:00
Pavel Emelyanov	2a98e95cd0	api: Coroutinize API get_snapshot_details handler Now it's possible to understand what it does Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes scylladb/scylladb#18190	2024-04-04 22:20:28 +03:00
Kefu Chai	64b8bb239f	api/storage_service: throw if table is not found when move tablets `database::find_column_family()` throws no_such_column_family if an unknown ks.cf is fed to it. and we call into this function without checking for the existence of ks.cf first. since "/storage_service/tablets/move" is a public interface, we should translate this error to a better http error. in this change, we check for the existence of the given ks.cf, and throw an exception so that it can be caught by seastar::httpd::routers, and converted to an HTTP error. Fixes #17198 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#17217	2024-04-04 11:23:52 +03:00
Pavel Emelyanov	79ad760e95	api: Add method to add replica to a tablet The new API submits rebuild transition with new replicas set to be old (current) replicas plus the provided one. It looks and acts like the move_tablet API call with several changes: - lacks the "source" replica argument - submits "rebuild" transition kind - cross racks checks are not performed The 'force' argument is inherited from move_tablet, but is unused now and is left for future. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2024-04-04 09:22:16 +03:00
Benny Halevy	1272d736c0	api: storage_service: upgrade_to_raft_topology: fixup indentation Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2024-04-02 20:02:51 +03:00
Benny Halevy	31026ae27f	api: storage_service: upgrade_to_raft_topology: add logging Upgrading raft topology is an important api call that should be logged. When failed, it is also important to log the exception to get better visibility into why the call failed. Indentation will be fixed in the next patch. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2024-04-02 20:02:49 +03:00
Patryk Wrobel	9eb91b5526	storage_service/ownership: discard get_ownership() requests when tablets enabled This change introduces a logic, that is responsible for checking if tablets are enabled for any of keyspaces when get_ownership() is invoked. Without it, the result would be calculated based solely on sorted_tokens() which was invalid. Refs: scylladb#17342 Signed-off-by: Patryk Wrobel <patryk.wrobel@scylladb.com>	2024-03-11 09:52:25 +01:00
Patryk Wrobel	51da80da7d	storage_service/ownership/{keyspace}: handle requests when tablets are enabled Before this change, when user tried to utilize 'storage_service/ownership/{keyspace}' API with keyspace parameter that uses tablets, then internal error was thrown. The code was calling a function, that is intended for vnodes: get_vnode_effective_replication_map(). This commit introduces graceful handling of such scenario and extends the API to allow passing 'cf' parameter that denotes table name. Now, when keyspace uses tablets and cf parameter is not passed a descriptive error message is returned via BAD_REQUEST. Users cannot query ownership for keyspace that uses tablets, but they can query ownership for a table in a given keyspace that uses tablets. Also, new tests have been added to test/rest_api/test_storage_service.py and to test/topology_experimental_raft/test_tablets.py in order to verify the behavior with and without tablets enabled. Refs: scylladb#17342 Signed-off-by: Patryk Wrobel <patryk.wrobel@scylladb.com>	2024-03-11 09:52:23 +01:00
Botond Dénes	050c6dcad7	api: storage_service/keyspaces: add replication filter Allows filtering the returned keyspaces by the replication they use: tablets or vnodes. The filter can be disabled by omitting the parameter or passing "all". The default is "all". Fixes: #16509 Closes scylladb/scylladb#17319	2024-02-20 09:04:41 +01:00
Patryk Wrobel	3842bf18a7	storage_service/range_to_endpoint_map: allow API to properly handle tablets This API endpoint was failing when tablets were enabled because of usage of get_vnode_effective_replication_map(). Moreover, it was providing an error message that was not user-friendly. This change extends the handler to properly service the incoming requests. Furthermore, it introduces two new test cases that verify the behavior of storage_service/range_to_endpoint_map API. It also adjusts the test case of this endpoint for vnodes to succeed when tablets are enabled by default. The new logic is as follows: - when tablets are disabled then users may query endpoints for a keyspace or for a given table in a keyspace - when tablets are enabled then users have to provide table name, because effective replication map is per-table When user does not provide table name when tablets are enabled for a given keyspace, then BAD_REQUEST is returned with a meaningful error message. Fixes: scylladb#17343 Signed-off-by: Patryk Wrobel <patryk.wrobel@scylladb.com> Closes scylladb/scylladb#17372	2024-02-18 19:21:53 +02:00
Kefu Chai	9b6a66826c	api/storage_service: add more constness to http_context parameter when we just want to perform read access to `http_context`, there is no need to use a non-const reference. so let's add `const` specifier to make this explicit. this should help with the readability and maintainability. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#17219	2024-02-13 17:32:45 +02:00
Kamil Braun	e9e24f47ec	Merge 'raft topology: implement upgrade and recovery procedure' from Piotr Dulikowski This PR implements a procedure that upgrades existing clusters to use raft-based topology operations. The procedure does not start automatically, it must be triggered manually by the administrator after making sure that no topology operations are currently running. Upgrade is triggered by sending a `POST /storage_service/raft_topology/upgrade` request. This starts the topology coordinator, which drives the rest of the process: it builds the `system.topology` state based on information observed in gossip and tells all nodes to switch to raft mode. Then, topology coordinator runs normally. Upgrade progress is tracked in a new static column `upgrade_state` in `system.topology`. The procedure also serves as an extension to the current recovery procedure on raft. The current recovery procedure requires restarting nodes in a special mode which disables raft, performing `nodetool removenode` on the dead nodes, cleaning up some state on the nodes and restarting them so that they automatically rebuild the group 0. Raft topology fits into existing procedure by falling back to legacy topology operations after disabling raft. After rebuilding the group 0, upgrade needs to be triggered again. Because upgrade is manual and it might not be convenient for administrators to run it right after upgrading the cluster, we allow the cluster to operate in legacy topology operations mode until upgrade, which includes allowing new nodes to join. In order to allow it, nodes now ask the cluster about the mode they should use to join before proceeding by using a new `JOIN_NODE_QUERY` RPC. The procedure is explained in more detail in `topology-over-raft.md`. 
Fixes: https://github.com/scylladb/scylladb/issues/15008 Closes scylladb/scylladb#17077 * github.com:scylladb/scylladb: test/topology_custom: upgrade/recovery tests for topology on raft cdc/generation_service: in legacy mode, fall back to raft tables system_keyspace: add read_cdc_generation_opt cdc/generation_service: turn off gossip notifications in raft topo mode cql_test_env: move raft_topology_change_enabled var earlier group0_state_machine: pull snapshot after raft topology feature enabled storage_service: disable persistent feature enabler on upgrade storage_service: replicate raft features to system.peers storage_service: gossip tokens and cdc generation in raft topology mode API: add api for triggering and monitoring topology-on-raft upgrade storage_service: infer which topology operations to use on startup storage_service: set the topology kind value based on group 0 state raft_group0: expose link to the upgrade doc in the header feature_service: fall back to checking legacy features on startup storage_service: add fiber for tracking the topology upgrade progress gms: feature_service: add SUPPORTS_CONSISTENT_TOPOLOGY_CHANGES topology_coordinator: implement core upgrade logic topology_coordinator: extract top-level error handling logic storage_service: initialize discovery leader's state earlier topology_coordinator: allow for custom sharding info in prepare_and_broadcast_cdc_generation_data topology_coordinator: allow for custom sharding info in prepare_new_cdc_generation_data topology_coordinator: remove outdated fixme in prepare_new_cdc_generation_data topology_state_machine: introduce upgrade_state storage_service: disallow topology ops when upgrade is in progress raft_group0_client: add in_recovery method storage_service: introduce join_node_query verb raft_group0: make discover_group0 public raft_group0: filter current node's IP in discover_group0 raft_group0: remove my_id arg from discover_group0 storage_service: make _raft_topology_change_enabled 
more advanced docs: document raft topology upgrade and recovery	2024-02-09 11:54:53 +01:00
Kefu Chai	c1c96bbc16	api/storage_service: drop /storage_service/describe_ring/ API per its description, "`/storage_service/describe_ring/`" returns the token ranges of an arbitrary keyspace. actually, it returns the first keyspace which is of non-local-vnode-based-strategy. this API is not used by nodetool, neither is it exercised in dtest. scylla-manager has a wrapper for this API though, but that wrapper is not used anywhere. in this change, this API is dropped. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#17197	2024-02-09 12:49:21 +02:00
Piotr Dulikowski	a672383c2a	API: add api for triggering and monitoring topology-on-raft upgrade Implements the /storage_service/raft_topology/upgrade route. The route supports two methods: POST, which triggers the cluster-wide upgrade to topology-on-raft, and GET which reports the status of the upgrade.	2024-02-08 19:12:28 +01:00
Botond Dénes	35da9551fb	Merge 'storage_service: Add describe_ring support for tablet table' from Asias He The table query param is added to get the describe_ring result for a given table. Both vnode table and tablet table can use this table param, so it is easier for users to use. If the table param is not provided by user and the keyspace contains tablet table, the request will be rejected. E.g., curl "http://127.0.0.1:10000/storage_service/describe_ring/system_auth?table=roles" curl "http://127.0.0.1:10000/storage_service/describe_ring/ks1?table=standard1" Refs #16509 Closes scylladb/scylladb#17118 * github.com:scylladb/scylladb: tablets: Convert to use the new version of for_each_tablet storage_service: Add describe_ring support for tablet table storage_service: Mark host2ip as const tablets: Add for_each_tablet_gently	2024-02-07 10:41:36 +02:00
Tomasz Grabiec	448e117e7d	Merge 'service: validate replication strategy constraints in tablet-moving API' from Aleksandra Martyniuk Validate replication strategy constraints in /storage_service/tablets/move API: - replicas are not on the same node - replicas don't move across DC (violates RF in each DC) - availability is not reduced due to rack overloading Add flag to force tablet move even though dc/rack constraints aren't fulfilled. Test for the change: https://github.com/scylladb/scylla-dtest/pull/3911. Fixes: #16379. Closes scylladb/scylladb#16648 * github.com:scylladb/scylladb: api: service: add force param to move_tablet api service: validate replication strategy constraints	2024-02-05 20:07:21 +01:00
Asias He	04773bd1df	storage_service: Add describe_ring support for tablet table The table query param is added to get the describe_ring result for a given table. Both vnode table and tablet table can use this table param, so it is easier for users to use. If the table param is not provided by user and the keyspace contains tablet table, the request will be rejected. E.g., curl "http://127.0.0.1:10000/storage_service/describe_ring/system_auth?table=roles" curl "http://127.0.0.1:10000/storage_service/describe_ring/ks1?table=standard1" Refs #16509	2024-02-05 18:11:07 +08:00
Aleksandra Martyniuk	89c683f51a	api: service: add force param to move_tablet api Force flag is added to /storage_service/tablets/move. If force is set to true, replication strategy constraints regarding racks and dcs can be broken.	2024-02-02 19:08:01 +01:00
Avi Kivity	7cb1c10fed	treewide: replace seastar::future::get0() with seastar::future::get() get0() dates back from the days where Seastar futures carried tuples, and get0() was a way to get the first (and usually only) element. Now it's a distraction, and Seastar is likely to deprecate and remove it. Replace with seastar::future::get(), which does the same thing.	2024-02-02 22:12:57 +08:00
Kefu Chai	5e0b3671d3	storage_service: fall back to local cleanup in cleanup_all before this change, if no keyspaces are specified, scylla-nodetool just enumerates all non-local keyspaces, and calls "/storage_service/keyspace_cleanup" on them one after another. this is not efficient, as each of these RESTful API calls forces a new active commitlog segment, and flushes all tables. so, if the target node of this command has N non-local keyspaces, it would repeat the steps above N times. this is not necessary. and after a topology change, we would like to run a global "nodetool cleanup" without specifying the keyspace, so this is a typical use case which we do care about. to address this performance issue, in this change, we improve an existing RESTful API call "/storage_service/cleanup_all", so if the topology coordinator is not enabled, we fall back to a local cleanup to cleanup all non-local keyspaces. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2024-02-01 11:25:53 +08:00
Kefu Chai	b39cc01bb3	compaction_manager: flush all tables before cleanup according to the document "nodetool cleanup" > Triggers removal of data that the node no longer owns currently, scylla performs cleanup by rewriting the sstables. but commitlog segments may still contain the mutations to the tables which are dropped during sstable rewriting. when scylla server restarts, the dirty mutations are replayed to the memtable. if any of these dirty mutations changes the tables cleaned up, the stale data are reapplied. this would lead to data resurrection. so, in this change we follow the same model of major compaction: 1. force new active segment, 2. flush all tables 3. perform cleanup using compaction, which rewrites the sstables of specified tables because we already `flush()` all tables in `cleanup_keyspace_compaction_task_impl::run()`, there is no need to call `flush()` again, in `table::perform_cleanup_compaction()`, so the `flush()` call is dropped in this function, and the tests using this function are updated to call `flush()` manually to preserve the existing behavior. there are two callers of `cleanup_keyspace_compaction_task_impl`, * one is `storage_service::sstable_cleanup_fiber()`, which listens for the events fired by topology_state_machine, which is in turn driven by, for instance, "/storage_service/cleanup_all" API. which cleans up all keyspaces one after another. * another is "/storage_service/keyspace_cleanup", which cleans up the specified keyspace. in the first use case, we can force a new active segment for a single time, so another parameter to the ctor of `cleanup_keyspace_compaction_task_impl` is introduced to specify if the `db.flush_all_tables()` call should be skipped. please note, there are two possible optimizations, 1. force new active segment only if the mutations in it touch the tables being cleaned up 2. 
after forcing new active segment, only flush the (mem)tables mutated by the non-active segments but let's leave them for following-up changes. this change is a minimal fix for data resurrection issue. Fixes #16757 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2024-02-01 11:25:53 +08:00
Kefu Chai	9afec2e3e7	api, compaction: promote flush_mode so that this enum type can be shared by other task(s) as well. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2024-02-01 11:25:53 +08:00
Pavel Emelyanov	7c5c89ba8d	Revert "Merge 'Use utils::directories instead of db::config to get dirs' from Patryk Wróbel" This reverts commit `370fbd346c`, reversing changes made to `0912d2a2c6`. This makes scylla-manager mis-interpret the data_file_directories somehow, issue #17078	2024-01-31 15:08:14 +03:00
Patryk Wrobel	5ac3d0f135	api/storage_service.cc: use utils::directories to get paths of dirs This change replaces usage of db::config with usage of utils::directories in api/storage_service.cc in order to get the paths of directories. Refs: scylladb#5626 Signed-off-by: Patryk Wrobel <patryk.wrobel@scylladb.com>	2024-01-29 13:11:33 +01:00
Mikołaj Grzebieluch	c530756837	storage_service: add MAINTENANCE option to mode enum join_cluster and start_maintenance_mode are incompatible. To make sure that only one is called when the node starts, add the MAINTENANCE option. start_maintenance_mode sets _operation_mode to MAINTENANCE. join_cluster sets _operation_mode to STARTING. set_mode will result in an internal error if: * it tries to set MAINTENANCE mode when the _operation_mode is other than NONE, i.e. start_maintenance_mode is called after join_cluster (or it is called during the drain, but it also shouldn't happen). * it tries to set STARTING mode when the mode is set to MAINTENANCE, i.e. join_cluster is called after start_maintenance_mode.	2024-01-25 15:27:53 +01:00
Kefu Chai	ffb5ad494f	api: do not include unused headers these unused includes were identified by clangd. see https://clangd.llvm.org/guides/include-cleaner#unused-include-warning for more details on the "Unused include" warning. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#16973	2024-01-25 11:28:02 +03:00
Benny Halevy	e277ec6aef	force_keyspace_cleanup: skip keyspaces that do not require or support cleanup Local keyspaces do not need cleanup, and keyspaces configured with tablets, where their replication strategy is per-table do not support cleanup. In both cases, just skip their cleanup via the api. Fixes #16738 Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Closes scylladb/scylladb#16785	2024-01-16 15:01:49 +03:00
Kefu Chai	ece2bd2f6e	service: do not include unused headers these unused includes were identified by clangd. see https://clangd.llvm.org/guides/include-cleaner#unused-include-warning for more details on the "Unused include" warning. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#16764	2024-01-15 13:29:33 +02:00
Gleb Natapov	97ab3f6622	storage_service: topology_coordinator: introduce cleanup REST API integrated with the topology coordinator Introduce new REST API "/storage_service/cleanup_all" that, when triggered, instructs the topology coordinator to initiate cluster wide cleanup on all dirty nodes. It is done by introducing new global command "global_topology_request::cleanup".	2024-01-14 15:45:53 +02:00
Avi Kivity	22b77edef3	Merge 'scylla-nodetool: implement the scrub command' from Botond Dénes On top of the capabilities of the java-nodetool command, the following additional functionality is implemented: * Expose quarantine-mode option of the scrub_keyspace REST API * Exit with error and print a message, when scrub finishes with abort or validation_errors return code The command comes with tests and all tests pass with both the new and the current nodetool implementations. Refs: #15588 Refs: #16208 Closes scylladb/scylladb#16391 * github.com:scylladb/scylladb: tools/scylla-nodetool: implement the scrub command test/nodetool: rest_api_mock.py: add missing "f" to error message f string api: extract scrub_status into its own header	2023-12-12 22:22:35 +02:00
Botond Dénes	8064d17f78	api: extract scrub_status into its own header So it can be shared with scylla-nodetool code.	2023-12-12 09:33:39 -05:00
Botond Dénes	885a807c71	Merge 'api: storage_service: api for starting async compaction' from Aleksandra Martyniuk For all compaction types which can be started with api, add an asynchronous version of api, which returns task_id of the corresponding task manager task. With the task_id a user can check task status, abort, or wait for it, using task manager api. Closes scylladb/scylladb#15092 * github.com:scylladb/scylladb: test: use async api in test_not_created_compaction_task_abort test: test compaction task started asynchronously api: tasks: api for starting async compaction api: compaction: pass pointer to top level compaction tasks	2023-12-12 12:06:52 +02:00
Asias He	5f20e33e15	api: Reject unsupported http api options for repair If an option is not supported, reject the request instead of silently ignoring the unsupported options. This prevents the user from thinking an option is supported when it is in fact ignored by scylla core. Fixes #16299 Closes scylladb/scylladb#16300	2023-12-12 09:18:00 +02:00
Aleksandra Martyniuk	b485897704	api: tasks: api for starting async compaction For all compaction types which can be started with api, add an asynchronous version of api, which returns task_id of the corresponding task manager task. With the task_id a user can check task status, abort, or wait for it, using task manager api.	2023-12-11 11:39:33 +01:00
Aleksandra Martyniuk	ceec5577d8	api: compaction: pass pointer to top level compaction tasks As a preparation for asynchronous compaction api, from which we cannot take values by reference, top level compaction tasks get pointers which need to be set to nullptr when they are not needed (like in async api).	2023-12-11 11:36:10 +01:00
Avi Kivity	9c0f05efa1	Merge 'Track tablet streaming under global sessions to prevent side-effects of failed streaming' from Tomasz Grabiec Tablet streaming involves asynchronous RPCs to other replicas which transfer writes. We want side-effects from streaming only within the migration stage in which the streaming was started. This is currently not guaranteed on failure. When streaming master fails (e.g. due to RPC failing), it can be that some streaming work is still alive somewhere (e.g. RPC on wire) and will have side-effects at some point later. This PR implements tracking of all operations involved in streaming which may have side-effects, which allows the topology change coordinator to fence them and wait for them to complete if they were already admitted. The tracking and fencing is implemented by using global "sessions", created for streaming of a single tablet. Session is globally identified by UUID. The identifier is assigned by the topology change coordinator, and stored in system.tablets. Sessions are created and closed based on group0 state (tablet metadata) by the barrier command sent to each replica, which we already do on transitions between stages. Also, each barrier waits for sessions which have been closed to be drained. The barrier is blocked only if there is some session with work which was left behind by unsuccessful streaming. In which case it should not be blocked for long, because streaming process checks often if the guard was left behind and stops if it was. This mechanism of tracking is fault-tolerant: session id is stored in group0, so coordinator can make progress on failover. The barriers guarantee that session exists on all replicas, and that it will be closed on all replicas. 
Closes scylladb/scylladb#15847 * github.com:scylladb/scylladb: test: tablets: Add test for failed streaming being fenced away error_injection: Introduce poll_for_message() error_injection: Make is_enabled() public api: Add API to kill connection to a particular host range_streamer: Do not block topology change barriers around streaming range_streamer, tablets: Do not keep token metadata around streaming tablets: Fail gracefully when migrating tablet has no pending replica storage_service, api: Add API to disable tablet balancing storage_service, api: Add API to migrate a tablet storage_service, raft topology: Run streaming under session topology guard storage_service, tablets: Use session to guard tablet streaming tablets: Add per-tablet session id field to tablet metadata service: range_streamer: Propagate topology_guard to receivers streaming: Always close the rpc::sink storage_service: Introduce concept of a topology_guard storage_service: Introduce session concept tablets: Fix topology_metadata_guard holding on to the old erm docs: Document the topology_guard mechanism	2023-12-07 16:29:02 +02:00
Tomasz Grabiec	d1c1b59236	storage_service, api: Add API to disable tablet balancing Load balancing needs to be disabled before making a series of manual migrations so that we don't fight with the load balancer. Also will be used in tests to ensure tablets stick to expected locations.	2023-12-06 18:36:17 +01:00
Tomasz Grabiec	1f57d1ea28	storage_service, api: Add API to migrate a tablet Will be used in tests, or for hot fixes in production.	2023-12-06 18:36:17 +01:00
Kefu Chai	f483309165	compaction, api: drop unused functions run_on_existing_tables() is not used at all. and we have two of them. in this change, let's drop them. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#16304	2023-12-06 14:31:08 +02:00
Benny Halevy	e5d3c6741f	api: use locator::topology rather than fb_utilities Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-12-05 08:42:49 +02:00
Yaniv Kaul	c658bdb150	Typos: fix typos in comments Fixes some typos as found by codespell run on the code. In this commit, I was hoping to fix only comments, not user-visible alerts, output, etc. Follow-up commits will take care of them. Refs: https://github.com/scylladb/scylladb/issues/16255 Signed-off-by: Yaniv Kaul <yaniv.kaul@scylladb.com>	2023-12-02 22:37:22 +02:00
Benny Halevy	b12b142232	api: add /storage_service/compact For major compacting all tables in the database. The advantage of this api is that `commitlog->force_new_active_segment` happens only once in `database::flush_all_tables` rather than once per keyspace (when `nodetool compact` translates to a sequence of `/storage_service/keyspace_compaction` calls). Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-11-28 16:37:42 +02:00
Benny Halevy	1b576f358b	api: add /storage_service/flush For flushing all tables in the database. The advantage of this api is that `commitlog->force_new_active_segment` happens only once in `database::flush_all_tables` rather than once per keyspace (when `nodetool flush` translates to a sequence of `/storage_service/keyspace_flush` calls). Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-11-28 16:37:42 +02:00
Benny Halevy	1fd85bd37b	api: compaction: add flush_memtables option When flushing is done externally, e.g. by running `nodetool flush` prior to `nodetool compact`, flush_memtables=false can be passed to skip flushing of tables right before they are major-compacted. This is useful to prevent creation of small sstables due to excessive memtable flushing. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-11-28 16:37:42 +02:00
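
Several of the commits above describe REST routes under /storage_service. As an illustration only, the following Python sketch builds request URLs for two of them: describe_ring with the `table` query parameter (commit 04773bd1df) and the raft topology upgrade route (commit a672383c2a). The helper names `describe_ring_url` and `raft_topology_upgrade_request` are hypothetical and not part of scylladb; the base address is taken from the curl examples in the commit message and may differ on a real node. Nothing here sends a request or assumes any response format.

```python
from urllib.parse import quote, urlencode

# Assumption: the REST API listens on 127.0.0.1:10000, as in the curl
# examples quoted in the describe_ring commit message.
BASE_URL = "http://127.0.0.1:10000"


def describe_ring_url(keyspace, table=None, base_url=BASE_URL):
    """Hypothetical helper: build the describe_ring URL for a keyspace.

    Per the commit message, a request without the table parameter is
    rejected when the keyspace contains a tablet table.
    """
    url = f"{base_url}/storage_service/describe_ring/{quote(keyspace, safe='')}"
    if table is not None:
        url += "?" + urlencode({"table": table})
    return url


def raft_topology_upgrade_request(trigger, base_url=BASE_URL):
    """Hypothetical helper: return (method, url) for the upgrade route.

    Per the commit message, POST triggers the cluster-wide upgrade and
    GET reports the status of the upgrade.
    """
    method = "POST" if trigger else "GET"
    return method, f"{base_url}/storage_service/raft_topology/upgrade"


if __name__ == "__main__":
    print(describe_ring_url("ks1", table="standard1"))
    print(raft_topology_upgrade_request(trigger=False))
```

The URLs are built without issuing any HTTP call, so the sketch can be checked against the curl examples in the log without a running node.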


392 Commits