scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-05-01 21:55:50 +00:00

Author	SHA1	Message	Date
Kefu Chai	819fc95a67	reader: do not include unused headers these unused includes were identified by clangd. see https://clangd.llvm.org/guides/include-cleaner#unused-include-warning for more details on the "Unused include" warning. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#17036	2024-01-29 16:21:42 +02:00
Kefu Chai	43094d2023	db: add formatter for db::read_repair_decision before this change, we rely on the default-generated fmt::formatter created from operator<<, but fmt v10 dropped the default-generated formatter. in this change, we define formatters for `db::read_repair_decision`, and drop its operator<<. Refs #13245 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#17033	2024-01-29 15:43:51 +02:00
Botond Dénes	d202d32f81	Merge 'Add an API to trigger snapshot in Raft servers' from Kamil Braun This allows the user of `raft::server` to cause it to create a snapshot and truncate the Raft log (leaving no trailing entries; in the future we may extend the API to specify number of trailing entries left if needed). In a later commit we'll add a REST endpoint to Scylla to trigger group 0 snapshots. One use case for this API is to create group 0 snapshots in Scylla deployments which upgraded to Raft in version 5.2 and started with an empty Raft log with no snapshot at the beginning. This causes problems, e.g. when a new node bootstraps to the cluster, it will not receive a snapshot that would contain both schema and group 0 history, which would then lead to inconsistent schema state and trigger assertion failures as observed in scylladb/scylladb#16683. In 5.4 the logic of initial group 0 setup was changed to start the Raft log with a snapshot at index 1 (`ff386e7a44`) but a problem remains with these existing deployments coming from 5.2, we need a way to trigger a snapshot in them (other than performing 1000 arbitrary schema changes). Another potential use case in the future would be to trigger snapshots based on external memory pressure in tablet Raft groups (for strongly consistent tables). The PR adds the API to `raft::server` and a HTTP endpoint that uses it. In a follow-up PR, we plan to modify group 0 server startup logic to automatically call this API if it sees that no snapshot is present yet (to automatically fix the aforementioned 5.2 deployments once they upgrade.) Closes scylladb/scylladb#16816 * github.com:scylladb/scylladb: raft: remove `empty()` from `fsm_output` test: add test for manual triggering of Raft snapshots api: add HTTP endpoint to trigger Raft snapshots raft: server: add `trigger_snapshot` API raft: server: track last persisted snapshot descriptor index raft: server: framework for handling server requests raft: server: inline `poll_fsm_output` raft: server: fix indentation raft: server: move `io_fiber`'s processing of `batch` to a separate function raft: move `poll_output()` from `fsm` to `server` raft: move `_sm_events` from `fsm` to `server` raft: fsm: remove constructor used only in tests raft: fsm: move trace message from `poll_output` to `has_output` raft: fsm: extract `has_output()` raft: pass `max_trailing_entries` through `fsm_output` to `store_snapshot_descriptor` raft: server: pass `*_aborted` to `set_exception` call	2024-01-29 15:06:04 +02:00
Beni Peled	8009170d3a	docs: update the installation instructions with the new gpg 2024 key Closes scylladb/scylladb#17019	2024-01-29 14:37:25 +02:00
Kefu Chai	6f55d68dd9	.git: add more skip words these words are either * shortened words: strategy => strat, read_from_primary => fro * or acronyms: node_or_data => nd before we rename them with better names, let's just add them to the ignore word list. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#17002	2024-01-29 14:37:03 +02:00
Kefu Chai	0cbf8f75f0	db: add formatter for dht::decorated_key and repair_sync_boundary before this change, we rely on the default-generated fmt::formatter created from operator<<, but fmt v10 dropped the default-generated formatter. in this change, we define formatters for dht::decorated_key and repair_sync_boundary. please note, before this change, repair_sync_boundary was using the operator<< based formatter of `dht::decorated_key`, so we are updating both of them in a single commit. because we still use the homebrew generic formatter of vector<> in to format vector<repair_sync_boundary> and vector<dht::decorated_key>, so their operator<< are preserved. Refs #13245 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#16994	2024-01-29 11:11:41 +02:00
Tzach Livyatan	06a9a925a5	Update link to sizing / pricing calc Closes scylladb/scylladb#17015	2024-01-29 11:07:20 +02:00
Kefu Chai	b5ff098f28	thrift: add formatter for cassandra::ConsistencyLevel::type before this change, we rely on the default-generated fmt::formatter created from operator<<, but fmt v10 dropped the default-generated formatter. in this change, we define formatters for cassandra::ConsistencyLevel::type. please note, the operator<< for `cassandra::ConsistencyLevel::type` is generated using `thrift` command line tool, which does not emit specialization for fmt::formatter yet, so we need to use `fmt::ostream_formatter` to implement the formatter for this type. Refs #13245 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#17013	2024-01-29 10:10:35 +02:00
Pavel Emelyanov	3abdb3c7ee	tablets: Remove tablet_aware_replication_strategy::parse_initial_tablets It's now unused, string with initial tablets its parsed elsewhere Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes scylladb/scylladb#17010	2024-01-29 10:03:38 +02:00
Kefu Chai	912c588975	thrift: do not include unused headers these unused includes were identified by clangd. see https://clangd.llvm.org/guides/include-cleaner#unused-include-warning for more details on the "Unused include" warning. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#17012	2024-01-29 10:02:30 +02:00
Kefu Chai	abb12979f8	raft: do not include unused headers these unused includes were identified by clangd. see https://clangd.llvm.org/guides/include-cleaner#unused-include-warning for more details on the "Unused include" warning. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#17011	2024-01-29 10:00:56 +02:00
Kefu Chai	8f38bd5376	commitlog: add formatter for db::replay_position before this change, we rely on the default-generated fmt::formatter created from operator<<, but fmt v10 dropped the default-generated formatter. in this change, we define formatters for `db::replay_position`, and drop its operator<<. Refs #13245 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#17014	2024-01-29 09:59:30 +02:00
Botond Dénes	d3c1be9107	Merge 'alternator: enable tablets by default if experimental feature is enabled' from Nadav Har'El This series does a similar change to Alternator as was done recently to CQL: 1. If the "tablets" experimental feature in enabled, new Alternator tables will use tablets automatically, without requiring an option on each new table. A default choice of initial_tablets is used. These choices can still be overridden per-table if the user wants to. 3. In particular, all test/alternator tests will also automatically run with tablets enabled 4. However, some tests will fail on tablets because they use features that haven't yet been implemented with tablets - namely Alternator Streams (Refs #16317) and Alternator TTL (Refs #16567). These tests will - until those features are implemented with tablets - continue to be run without tablets. 5. An option is added to the test/alternator/run to allow developers to manually run tests without tablets enabled, if they wish to (this option will be useful in the short term, and can be removed later). Fixes #16355 Closes scylladb/scylladb#16900 * github.com:scylladb/scylladb: test/alternator: add "--vnodes" option to run script alternator: use tablets by default, if available test/alternator: run some tests without tablets	2024-01-29 09:22:13 +02:00
Kefu Chai	cb5453d534	.git: only allow codespell to run on master branch so that non-master branches are not read by 3rd-party tools unless they are audited. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#16999	2024-01-29 09:04:20 +02:00
Kefu Chai	f96d25a0a7	tool: check for existence of keyspace before getting it in general, user should save output of `DESC foo.bar` to a file, and pass the path to the file as the argument of `--schema-file` option of `scylla sstable` commands. the CQL statement generated from `DESC` command always include the keyspace name of the table. but in case user create the CQL statement manually and misses the keyspace name. he/she would have following assertion failure ``` scylla: cql3/statements/cf_statement.cc:49: virtual const sstring &cql3::statements::raw::cf_statement::keyspace() const: Assertion `_cf_name->has_keyspace()' failed. ``` this is not a great user experience. so, in this change, we check for the existence of keyspace before looking it up. and throw a runtime error with a better error mesage. so when the CQL statement does not have the keyspace name, the new error message would look like: ``` error processing arguments: could not load schema via schema-file: std::runtime_error (tools::do_load_schemas(): CQL statement does not have keyspace specified) ``` since this check is only performed by `do_load_schemas()` which care about the existence of keyspace, and it only expects the CQL statement to create table/keyspace/type, we just override the new `has_keyspace()` method of the corresponding types derived from `cf_statement`. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#16981	2024-01-29 09:02:01 +02:00
Anna Stuchlik	dfa88ccc28	doc: document nodetool resetlocalschema This adds the documentation for the nodetool resetlocalschema command. The syntax description is based on the description for Cassandra and the ScyllaDB help for nodetool. Fixes https://github.com/scylladb/scylladb/issues/16286 Closes scylladb/scylladb#16790	2024-01-28 21:09:02 +01:00
Kefu Chai	fe3bc00045	topology_coordinator: fix misspellings in log these misspellings are identified by codespell. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#17006	2024-01-26 16:50:39 +02:00
Dawid Medrek	b92fb3537a	main: Postpone start-up of hint manager In this commit, we postpone the start-up of the hint manager until we obtain information about other nodes in the cluster. When we start the hint managers, one of the things that happen is creating endpoint managers -- structures managed by db::hints::manager. Whether we create an instance of endpoint manager depends on the value returned by host_filter::can_hint_for, which, in turn, may depend on the current state of locator::topology. If locator::topology is incomplete, some endpoint managers may not be started even though they should (because the target node IS part of the cluster and we SHOULD send hints to it if there are some). The situation like that can happen because we start the hint managers too early. This commit aims to solve that problem. We only start the hint managers when we've gathered information about the other nodes in the cluster and created the locator::topology using it. Hinted Handoff is not negatively affected by these changes since in between the previous point of starting the hint managers and the current one, all of the mutations performed by service::storage_proxy target the local node, so no hints would need to be generated anyway. Fixes scylladb/scylladb#11870 Closes scylladb/scylladb#16511	2024-01-26 12:49:40 +01:00
Botond Dénes	c6fd4dffbb	Merge 'Remove anonymous namespaces from headers' from Patryk Wróbel Anonymous namespace implies internal linkage for its members. When it is defined in a header, then each translation unit, which includes such header defines its own unique instance of members of the unnamed namespace that are ODR-used within that translation unit. This can lead to unexpected results including code bloat or undefined behavior due to ODR violations. This PR removes unnamed namespaces from header files. References: - [CppCoreGuidelines: "SF.21: Don’t use an unnamed (anonymous) namespace in a header"](https://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines#sf21-dont-use-an-unnamed-anonymous-namespace-in-a-header) - [SEI CERT C++: "DCL59-CPP. Do not define an unnamed namespace in a header file"](https://wiki.sei.cmu.edu/confluence/display/cplusplus/DCL59-CPP.+Do+not+define+an+unnamed+namespace+in+a+header+file) Closes scylladb/scylladb#16998 * github.com:scylladb/scylladb: utils/config_file_impl.hh: remove anonymous namespace from header mutation/mutation.hh: remove anonymous namespace from header	2024-01-26 13:20:17 +02:00
Kefu Chai	a9d781d70f	test/nodetool: only test "storage_service/cleanup_all" with scylla this RESTful API is a scylla specific extension and is only used by scylla-nodetool. currently, the java-based nodetool does not use it at all, so mark it with "scylla_only". one can verify this change with: ``` pytest --mode=debug --nodetool=cassandra test_cleanup.py::test_cleanup ``` Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#17001	2024-01-26 13:19:15 +02:00
Botond Dénes	582ddc70ec	Merge 'test/nodetool: return a randomized address if not running with unshare' from Kefu Chai we should allow user to run nodetool tests without `test.py`. but there are good chance that the host could be reused by multiple tests or multiple users who could be using port 12345. by randomizing the IP and port, they would have better chance to complete the test without running into used port problem. Closes scylladb/scylladb#16996 * github.com:scylladb/scylladb: test/nodetool: return a randomized address if not running with unshare test/nodetool: return an address from loopback_network fixture	2024-01-26 13:15:58 +02:00
Kefu Chai	9ee6c00c84	docs: fix misspellings these misspellings are identified by codespell. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#17005	2024-01-26 13:14:21 +02:00
Kefu Chai	72cec22932	repair: do not include unused headers these unused includes were identified by clangd. see https://clangd.llvm.org/guides/include-cleaner#unused-include-warning for more details on the "Unused include" warning. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#16993	2024-01-26 13:12:38 +02:00
Kamil Braun	4f736894e1	Merge 'Add maintenance mode' from Mikołaj Grzebieluch In this mode, the node is not reachable from the outside, i.e. * it refuses all incoming RPC connections, * it does not join the cluster, thus * all group0 operations are disabled (e.g. schema changes), * all cluster-wide operations are disabled for this node (e.g. repair), * other nodes see this node as dead, * cannot read or write data from/to other nodes, * it does not open Alternator and Redis transport ports and the TCP CQL port. The only way to make CQL queries is to use the maintenance socket. The node serves only local data. To start the node in maintenance mode, use the `--maintenance-mode true` flag or set `maintenance_mode: true` in the configuration file. REST API works as usual, but some routes are disabled: * authorization_cache * failure_detector * hinted_hand_off_manager This PR also updates the maintenance socket documentation: * add cqlsh usage to the documentation * update the documentation to use `WhiteListRoundRobinPolicy` Fixes #5489. Closes scylladb/scylladb#15346 * github.com:scylladb/scylladb: test.py: add test for maintenance mode test.py: generalize usage of cluster_con test.py: when connecting to node in maintenance mode use maintenance socket docs: add maintenance mode documentation main: add maintenance mode main: move some REST routes initialization before joining group0 message_service: add sanity check that rpc connections are not created in the maintenance mode raft_group0_client: disable group0 operations in the maintenance mode service/storage_service: add start_maintenance_mode() method storage_service: add MAINTENANCE option to mode enum service/maintenance_mode: add maintenance_mode_enabled bool class service/maintenance_mode: move maintenance_socket_enabled definition to seperate file db/config: add maintenance mode flag docs: add cqlsh usage to maintenance socket documentation docs: update maintenance socket documentation to use WhiteListRoundRobinPolicy	2024-01-26 11:02:34 +01:00
Kefu Chai	637dd73079	sstable/storage: use fs::path to represent _dir and _temp_dir they are directories, and we are concating strings to build the paths to the sstable components. so it would be more elegant to use fs::path for manipulating paths. this change was inspired by the discussion on passing the relative path to sstable to `scylla sstables`, where we use the `path::parent_path()` as the dir of sstable, and then concatenate it with the filename component. but if the `parent_path()` method returns an empty string, we end up with a path like "/me-42-big-TOC.txt", which is not reachable. what we should be reading is "me-42-big-TOC.txt". so, we should better off either using `fs::path` or enforcing the absolute path. since we already using "/" as separator, and concatenating strings, this is an opportunity to switch over to `fs::path` to address the problem and to avoid the string concatenating. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#16982	2024-01-26 09:54:41 +02:00
Patryk Wrobel	6faa178f10	utils/config_file_impl.hh: remove anonymous namespace from header Anonymous namespace implies internal linkage for its members. When it is defined in a header, then each translation unit, which includes such header defines its own unique instance of members of the unnamed namespace that are ODR-used within that translation unit. This can lead to unexpected results including code bloat or undefined behavior due to ODR violations. This change aligns the code with the following guidelines: - CppCoreGuidelines: "SF.21: Don’t use an unnamed (anonymous) namespace in a header" - SEI CERT C++: "DCL59-CPP. Do not define an unnamed namespace in a header file" Signed-off-by: Patryk Wrobel <patryk.wrobel@scylladb.com>	2024-01-26 08:44:44 +01:00
Patryk Wrobel	c218333afb	cql3/type_json.cc: move stringstream content instead of copying it C++20 introduced a new overload of std::ofstringstream::str() that is selected when the mentioned member function is called on r-value. The new overload returns a string, that is move-constructed from the underlying string instead of being copy-constructed. This change applies std::move() on stringstream objects before calling str() member function to avoid copying of the underlying buffer. Signed-off-by: Patryk Wrobel <patryk.wrobel@scylladb.com> Closes scylladb/scylladb#16990	2024-01-26 09:41:09 +02:00
Kefu Chai	36e81f93d2	.git: do not apply codespell to licenses we should keep the licenses as they are, even with misspellings. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#16992	2024-01-26 09:39:27 +02:00
Patryk Wrobel	ba488b10ec	mutation/mutation.hh: remove anonymous namespace from header Anonymous namespace implies internal linkage for its members. When it is defined in a header, then each translation unit, which includes such header defines its own unique instance of members of the unnamed namespace that are ODR-used within that translation unit. This can lead to unexpected results including code bloat or undefined behavior due to ODR violations. This change aligns the code with the following guidelines: - CppCoreGuidelines: "SF.21: Don’t use an unnamed (anonymous) namespace in a header" - SEI CERT C++: "DCL59-CPP. Do not define an unnamed namespace in a header file" Signed-off-by: Patryk Wrobel <patryk.wrobel@scylladb.com>	2024-01-26 08:38:39 +01:00
Kefu Chai	01727a5399	test/nodetool: return a randomized address if not running with unshare we should allow user to run nodetool tests without `test.py`. but there are good chance that the host could be reused by multiple tests or multiple users who could be using port 12345. by randomizing the IP and port, they would have better chance to complete the test without running into used port problem. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2024-01-26 13:32:47 +08:00
Kefu Chai	358d30fd29	test/nodetool: return an address from loopback_network fixture * rename "maybe_setup_loopback_network" to "server_address" * return an address from the fixture this change prepares for bringing back the randomized IP and port, in case users run this test without test.py, by randomizing the IP and port, they would have better chance to complete the test without running into used port problem. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2024-01-26 13:20:37 +08:00
Avi Kivity	03313d359e	Merge ' db: commitlog_replayer: ignore mutations affected by (tablet) cleanups ' from Michał Chojnowski To avoid data resurrection, mutations deleted by cleanup operations should be skipped during commitlog replay. This series implements the above for tablet cleanups, by using a new system table which holds records of cleanup operations. Fixes #16752 Closes scylladb/scylladb#16888 * github.com:scylladb/scylladb: test: test_tablets: add a test for cleanup after migration test: pylib: add ScyllaCluster.wipe_sstables test: boost: add commitlog_cleanup_test db: commitlog_replayer: ignore mutations affected by (tablet) cleanups replica: table: garbage-collect irrelevant system.commitlog_cleanups records db: commitlog: add min_position() replica: table: populate system.commitlog_cleanups on tablet cleanup db: system_keyspace: add system.commitlog_cleanups replica: table: refresh compound sstable set after tablet cleanup	2024-01-25 20:51:03 +02:00
Patryk Wrobel	a858daf038	service/client_state.cc: remove redundant copying db::schema_tables::all_table_names() returns std::vector<sstring>. Usage of range-for loop without reference results in copying each of the elements of the traversed container. Such copying is redundant. This change introduces usage of const reference to avoid copying. Signed-off-by: Patryk Wrobel <patryk.wrobel@scylladb.com> Closes scylladb/scylladb#16983	2024-01-25 20:35:05 +02:00
Kamil Braun	543ad0987a	Merge 'raft topology: send barrier_and_drain to a decommissioning node' from Patryk Jędrzejczak We didn't send the `barrier_and_drain` command to a decommissioning node that could still be coordinating requests. It could happen that a decommissioning node sent a request with an old topology version after normal nodes received the new fence version. Then, the request would fail on replicas with the stale topology exception. This PR fixes this problem by modifying `exec_global_command`. From now on, it sends `barrier_and_drain` to a decommissioning node. We also stop filtering stale topology exceptions in `test_topology_ops`. We added this filter after detecting the bug fixed by this PR. Fixes scylladb/scylladb#15804 Fixes scylladb/scylladb#16579 Fixes scylladb/scylladb#16642 Closes scylladb/scylladb#16797 * github.com:scylladb/scylladb: test: test_topology_ops: remove failed mutations filter raft topology: send barrier_and_drain to a decommissioning node raft topology: ensure at most one transitioning node	2024-01-25 16:09:02 +01:00
Kefu Chai	ee28cf2285	test.py: s/defalt/default/ this typo was identified by codespell Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#16980	2024-01-25 16:54:07 +02:00
Botond Dénes	6d5ee6d48a	Merge 'test/nodetool: run nodetool tests using "unshare"' from Kefu Chai before this change, we use a random address when launching rest_api_mock server, but there are chances that the randomly picked address conflicts with an already-used address on the host. and the subprocess fails right away with the returncode of 1 upon this failure, but we just continue on and check the readiness of the already-dead server. actually, we've seen test failures caused by the EADDRINUSE failure, and when we checked the readiness of the rest_api_mock by sending HTTP request and reading the response, what we had is not a JSON encoded response but a webpage, which was likely the one returned by a minio server. in this change, we * specify the "launcher" option of nodetool test suite to "unshare", so that all its tests are launched in separated namespaces. * do not use a random address for the mock server, as the network namespaces are separated. Fixes #16542 Closes scylladb/scylladb#16773 * github.com:scylladb/scylladb: test/nodetool: run nodetool tests using "unshare" test.py: add "launcher" option support	2024-01-25 16:53:49 +02:00
Mikołaj Grzebieluch	763911af5b	test.py: add test for maintenance mode The test checks that in maintenance mode server A is not available for other nodes and for clients. It is possible to connect by the maintenance socket to server A and perform local CQL operations.	2024-01-25 15:27:53 +01:00
Mikołaj Grzebieluch	ca35e352f5	test.py: generalize usage of cluster_con Add option to pass load_balancing policy. Change hosts type to list of IPs or cassandra.Endpoint.	2024-01-25 15:27:53 +01:00
Mikołaj Grzebieluch	77a656bfd6	test.py: when connecting to node in maintenance mode use maintenance socket A node in the maintenance socket hasn't an opened regular CQL port. To connect to the node, the scylla cluster needs to use the node's maintenance socket.	2024-01-25 15:27:53 +01:00
Mikołaj Grzebieluch	9c07a189e8	docs: add maintenance mode documentation	2024-01-25 15:27:53 +01:00
Mikołaj Grzebieluch	0bdbd6e8f5	main: add maintenance mode In maintenance mode: * Group0 doesn't start and the node doesn't join the token ring to behave as a dead node to others, * Group0 operations are disabled and result in an error, * Only the maintenance socket listens for CQL requests, * The storage service initialises token_metadata with the local node as the only node on the token ring. Maintenance mode is enabled by passing the --maintenance-mode flag. Maintenance mode starts before the group0 is initialised.	2024-01-25 15:27:53 +01:00
Mikołaj Grzebieluch	617adde9c9	main: move some REST routes initialization before joining group0 Move REST endpoints that don't need connection with other nodes, before joining the group0. This way, they can be initialized in the maintenance mode. Move `snapshot_ctl` along with routes because of snapshots API and tasks API. Its constructor is a noop, so it is safe to move it.	2024-01-25 15:27:53 +01:00
Mikołaj Grzebieluch	d8de209dcf	message_service: add sanity check that rpc connections are not created in the maintenance mode In maintenance mode, a node shouldn't be able to communicate with other nodes. To make sure this does not happen, the sanity check is added.	2024-01-25 15:27:53 +01:00
Mikołaj Grzebieluch	c08266cfe5	raft_group0_client: disable group0 operations in the maintenance mode In maintenance mode, the node doesn't communicate with other nodes, so it doesn't start or apply group0 operations. Users can still try to start it, e.g. change the schema, and the node can't allow it. Init _upgrade_state with recovery in the maintenance mode. Throw an error if the group0 operation is started in maintenance mode.	2024-01-25 15:27:53 +01:00
Mikołaj Grzebieluch	97641f646a	service/storage_service: add start_maintenance_mode() method In the maintenance mode, other nodes won't be available thus we disabled joining the token ring and the token metadata won't be populated with the local node's endpoint. When a CQL query is executed it checks the `token_metadata` structure and fails if it is empty. Add a method that initialises `token_metadata` with the local node as the only node in the token ring.	2024-01-25 15:27:53 +01:00
Mikołaj Grzebieluch	c530756837	storage_service: add MAINTENANCE option to mode enum join_cluster and start_maintenance_mode are incompatible. To make sure that only one is called when the node starts, add the MAINTENANCE option. start_maintenance_mode sets _operation_mode to MAINTENANCE. join_cluster sets _operation_mode to STARTING. set_mode will result in an internal error if: * it tries to set MAINTENANCE mode when the _operation_mode is other than NONE, i.e. start_maintenance_mode is called after join_cluster (or it is called during the drain, but it also shouldn't happen). * it tries to set STARTING mode when the mode is set to MAINTENANCE, i.e. join_cluster is called after start_maintenance_mode.	2024-01-25 15:27:53 +01:00
Mikołaj Grzebieluch	d4c22fc86c	service/maintenance_mode: add maintenance_mode_enabled bool class	2024-01-25 15:27:53 +01:00
Mikołaj Grzebieluch	8b2f0e38d9	service/maintenance_mode: move maintenance_socket_enabled definition to seperate file	2024-01-25 15:27:53 +01:00
Mikołaj Grzebieluch	e6a83b9819	db/config: add maintenance mode flag	2024-01-25 15:27:53 +01:00
Mikołaj Grzebieluch	81ef9fc91e	docs: add cqlsh usage to maintenance socket documentation After https://github.com/scylladb/scylla-cqlsh/pull/67, the user can use cqlsh to connect to the node by maintenance socket.	2024-01-25 15:27:53 +01:00

1 2 3 4 5 ...

40928 Commits