scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-05-01 13:45:53 +00:00

Author	SHA1	Message	Date
Pavel Emelyanov	dbaca825ec	storage_service: Remove _initialized and is_initialized() This bit is hairy. First, it indicates that the storage service entered the init_server() method. But, once the node is up and running it also indicates whether the gossiper is enabled or not via the API call. To rely on the operation mode, first, the NONE mode is introduced at which the server starts. Then in init_server() it switches to STARTING. Second change is to stop using the bit in enable/disable gossiper API call, instead -- check the gossiper.is_enabled() itself. To keep the is_initialized API call compatible, when the operation mode is NORMAL it would return true/false according to the status of the gossiper. This change is simple because storage service API handlers already have the gossiper instance hanging around. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-03-07 13:29:47 +03:00
Pavel Emelyanov	ffbfa3b542	storage_service: Remove _joined and is_joined() The is_joined() status can be obtained with get_operation_mode(). Since it indicates that the operation mode is JOINING, NORMAL or anything above, the operation mode enum class should be shuffled to get the simple >= comparison. Another needed change is to set the mode a few steps earlier than it happens now to cover the non-bootstrap startup case. And the third change is to partially revert the `d49aa7ab` that made the .is_joined() method be future-less. Nowadays the is_joined() is called only from the API which is happy with being future-full in all other storage service state checks. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-03-07 13:29:47 +03:00
Pavel Emelyanov	ca03fd3145	storage_service: Replace is_starting() with get_operation_mode() This is a trivial change, since the only user is in API and the get_operation_mode + mode values are at hand. One thing to pay attention to -- the new method checks the mode to be <= STARTING, not for equality. For now this is an equivalent change, but the next patch will introduce NONE mode that should be reported as is_starting() too. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-03-07 13:29:47 +03:00
Pavel Emelyanov	c385fe7d79	storage_service: Make get_operation_mode() return mode itself Now it reports back formatted mode. For future convenience it's needed to return the raw value, all the more so the mode enum class is already public. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-03-07 13:29:47 +03:00
Benny Halevy	5a63026932	api: storage_service: scrub: validate parameters Validate all parameters, rejecting unsupported parameters. Refs #10087 Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-02-16 17:01:46 +02:00
Benny Halevy	16afde46e7	api: storage_service: refactor parse_tables Prepare for string-based parsing and validation. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-02-16 16:53:18 +02:00
Benny Halevy	cce6810615	api: storage_service: refactor validate_keyspace Prepare for string-based validation. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-02-16 16:53:18 +02:00
Benny Halevy	fc2e9abeba	api: storage_service: scrub: throw httpd::bad_param_exception for invalid param values Throwing std::runtime_error results in http status 500 (internal_server_error), but the problem is with the request parameters, not with the server. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-02-16 15:39:17 +02:00
Benny Halevy	f6431824a7	api: add keyspace_offstrategy_compaction Perform offstrategy compaction via the REST API with a new `keyspace_offstrategy_compaction` option. This is useful for performing offstrategy compaction post repair, after repairing all token ranges. Otherwise, offstrategy compaction will only be auto-triggered after a 5 minutes idle timeout. Like major compaction, the api call returns the offstrategy compaction task future, so it's waited on. The `long` result counts the number of tables that required offstrategy compaction. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-01-30 20:40:39 +02:00
Avi Kivity	fcb8d040e8	treewide: use Software Package Data Exchange (SPDX) license identifiers Instead of lengthy blurbs, switch to single-line, machine-readable standardized (https://spdx.dev) license identifiers. The Linux kernel switched long ago, so there is strong precedent. Three cases are handled: AGPL-only, Apache-only, and dual licensed. For the latter case, I chose (AGPL-3.0-or-later and Apache-2.0), reasoning that our changes are extensive enough to apply our license. The changes were applied mechanically with a script, except to licenses/README.md. Closes #9937	2022-01-18 12:15:18 +01:00
Benny Halevy	4db57267a6	repair: move tracker-dependent free functions to repair_service These functions are called from the api layer. Continue to hide the repair tracker from the caller but use the repair_service already available at the api layer to invoke the respective high-level methods without requiring `the_repair_tracker()`. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-01-10 11:40:09 +02:00
Avi Kivity	bbad8f4677	replica: move ::database, ::keyspace, and ::table to replica namespace Move replica-oriented classes to the replica namespace. The main classes moved are ::database, ::keyspace, and ::table, but a few ancillary classes are also moved. There are certainly classes that should be moved but aren't (like distributed_loader) but we have to start somewhere. References are adjusted treewide. In many cases, it is obvious that a call site should not access the replica (but the data_dictionary instead), but that is left for separate work. scylla-gdb.py is adjusted to look for both the new and old names.	2022-01-07 12:04:38 +02:00
Avi Kivity	ae3a360725	database: Move database, keyspace, table classes to replica/ directory The database, keyspace, and table classes represent the replica-only part of the objects after which they are named. Reading from a table doesn't give you the full data, just the replica's view, and it is not consistent since reconciliation is applied on the coordinator. As a first step in acknowledging this, move the related files to a replica/ subdirectory.	2022-01-06 17:07:30 +02:00
Benny Halevy	85f10138f0	api: storage_service: validate_keyspace: improve exception error message Generate the error message using the no_such_keyspace(ks_name) exception. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2021-12-09 14:40:21 +02:00
Benny Halevy	522a32f19f	api: storage_service: expose validate_keyspace and parse_tables To be used by the compaction_manager api in a following patch. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2021-12-09 14:25:53 +02:00
Benny Halevy	ff63ad9f6e	api: storage_service: add parse_tables Split and validate the cf parameter, containing an optional comma-separated list of table names. If any table is not found and a no_such_column_family exception is thrown, wrap it in a `bad_param_exception` so it will translate to `reply::status_type::bad_request` rather than `reply::status_type::internal_server_error`. With that, hide the split_cf function from api/api.hh since it was used only from api/storage_service and new use sites should use validate_tables instead. Fixes #9754 Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2021-12-08 16:42:40 +02:00
Benny Halevy	cc122984d6	compaction: scrub: add quarantine_mode option Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2021-12-05 18:29:04 +02:00
Benny Halevy	60ff28932c	compaction_manager: perform_sstable_scrub: get the whole compaction_type_options::scrub So we can pass additional options on top of the scrub mode. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2021-12-05 18:21:37 +02:00
Avi Kivity	03755b362a	Merge 'compaction_manager api: stop ongoing compactions' from Benny Halevy This series extends `compaction_manager::stop_ongoing_compaction` so it can be used from the api layer for: - table::disable_auto_compaction - compaction_manager::stop_compaction Fixes #9313 Fixes #9695 Test: unit(dev) Closes #9699 * github.com:scylladb/scylla: compaction_manager: stop_compaction: wait for ongoing compactions to stop compaction_manager: stop_ongoing_compactions: log Stopping 0 tasks at debug level compaction_manager: unify stop_ongoing_compactions implementations compaction_manager: stop_ongoing_compactions: add compaction_type option compaction_manager: get_compactions: get a table* parameter table: disable_auto_compaction: stop ongoing compactions compaction_manager: make stop_ongoing_compactions public table: futurize disable_auto_compactions	2021-11-30 19:08:14 +02:00
Raphael S. Carvalho	0d5ac845e1	compaction: Make cleanup withstand better disk pressure scenario It's not uncommon for cleanup to be issued against an entire keyspace, which may be composed of tons of tables. To increase chances of success if low on space, cleanup will now start from smaller tables first, such that bigger tables will have more space available, once they're reached, to satisfy their space requirement. parallel_for_each() is dropped and wasn't needed given that manager performs per-shard serialization of cleanup jobs. Refs #9504. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <20211130133712.64517-1-raphaelsc@scylladb.com>	2021-11-30 16:15:24 +02:00
Benny Halevy	b60d697084	table: futurize disable_auto_compactions So it can stop ongoing compaction and wait for them to complete. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2021-11-30 08:33:04 +02:00
Botond Dénes	a51529dd15	protocol_servers: strengthen guarantees of listen_addresses() In early versions of the series which proposed protocol servers, the interface had two methods answering pretty much the same question of whether the server is running or not: * listen_addresses(): empty list -> server not running * is_server_running() To reduce redundancy and to avoid possible inconsistencies between the two methods, `is_server_running()` was scrapped, but re-added by a follow-up patch because `listen_addresses()` proved to be unreliable as a source for whether the server is running or not. This patch restores the previous state of having only `listen_addresses()` with two additional changes: * rephrase the comment on `listen_addresses()` to make it clear that implementations must return empty list when the server is not running; * those implementations that have a reliable source of whether the server is running or not, use it to force-return an empty list when the server is not running Tests: dtest(nodetool_additional_test.py) Signed-off-by: Botond Dénes <bdenes@scylladb.com> Message-Id: <20211117062539.16932-1-bdenes@scylladb.com>	2021-11-19 11:09:09 +03:00
Benny Halevy	9d4262e264	protocol_server: add per-protocol is_server_running method Change `b0a2a9771f` broke the generic api implementation of is_native_transport_running that relied on the addresses list being empty after the server is stopped. To fix that, this change introduces a pure virtual method: protocol_server::is_server_running that can be implemented by each derived class. Test: unit(dev) DTest: nodetool_additional_test.py:TestNodetool.binary_test Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Message-Id: <20211114135248.588798-1-bhalevy@scylladb.com>	2021-11-14 16:01:31 +02:00
Pavel Emelyanov	82509c9e74	storage_service, database: Move flush-on-drain code Flushing all CFs on shutdown is now fully managed in storage service and it looks weird. Some better place for it seems to be the database itself. Moving the flushing code also implies moving the drain_progress thing and patching the relevant API call. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-11-09 19:11:49 +03:00
Botond Dénes	134fa98ff4	transport: controller: implement the protocol_server interface	2021-11-05 15:42:41 +02:00
Botond Dénes	bda0d0ccba	thrift: controller: implement the protocol_server interface	2021-11-05 15:42:41 +02:00
Asias He	f5f5714aa6	repair: Return HTTP 400 when repair id is not found There are two APIs for checking the repair status and they behave differently in case the id is not found. ``` {"host": "192.168.100.11:10001", "method": "GET", "uri": "/storage_service/repair_async/system_auth?id=999", "duration": "1ms", "status": 400, "bytes": 49, "dump": "HTTP/1.1 400 Bad Request\r\nContent-Length: 49\r\nContent-Type: application/json\r\nDate: Wed, 03 Nov 2021 10:49:33 GMT\r\nServer: Seastar httpd\r\n\r\n{\"message\": \"unknown repair id 999\", \"code\": 400}"} {"host": "192.168.100.11:10001", "method": "GET", "uri": "/storage_service/repair_status?id=999&timeout=1", "duration": "0ms", "status": 500, "bytes": 49, "dump": "HTTP/1.1 500 Internal Server Error\r\nContent-Length: 49\r\nContent-Type: application/json\r\nDate: Wed, 03 Nov 2021 10:49:33 GMT\r\nServer: Seastar httpd\r\n\r\n{\"message\": \"unknown repair id 999\", \"code\": 500}"} ``` The correct status code is 400 as this is a parameter error and should not be retried. Returning status code 500 makes smarter http clients retry the request in hopes of the server recovering. After this patch: curl -X PGET 'http://127.0.0.1:10000/storage_service/repair_async/system_auth?id=9999' {"message": "unknown repair id 9999", "code": 400} curl -X GET 'http://127.0.0.1:10000/storage_service/repair_status?id=9999' {"message": "unknown repair id 9999", "code": 400} Fixes #9576 Closes #9578	2021-11-03 17:15:40 +02:00
Benny Halevy	a2fc3345bd	storage_service: futurize storage_service::describe_ring Convert storage_service::describe_ring to a coroutine to prevent reactor stalls as seen in #9280. Fixes #9280 Closes #9282 Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2021-10-28 16:51:57 +03:00
Pavel Emelyanov	f0b5ab1c61	storage_service, api: Move set-tables-autocompaction back into API The global autocompaction toggle is no longer tied to the storage service. It naturally belongs to the database, but is small and tidy enough not to pollute database methods and can be placed into the api/ dir itself. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-10-11 11:13:59 +03:00
Pavel Emelyanov	c53c74258a	api: Remove storage service from new APIs The APIs that had been recently switched to using relevant services no longer need the storage service reference capture, so remove it. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-10-11 11:11:52 +03:00
Pavel Emelyanov	c504361c15	view_builder: Accept view_build_statuses The code itself is already in relevant .cc file, now move it to the relevant class. The only significant change is where to get token metadata from. In its old location tokens were provided by the storage service itself, now when it's in the view builder there's no "native" place to get them from, however the rest of the view building code gets tokens from global storage proxy, so do the same here. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-10-11 11:11:40 +03:00
Pavel Emelyanov	540c6fa5ae	api, storage_service: Keep view builder API handlers separate There's the 'storage_service/view_build_statuses' endpoint. Its handler code sits in the storage_service, but the functionality belongs purely to view_builder. Same as with sstables loader, detach the endpoint's API set/unset code, next patches will fix the handler to use view_builder. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-10-11 11:09:07 +03:00
Pavel Emelyanov	68ecec0197	sstables_loader: Accept the sstables loading code The code was moved in the relevant .cc file by previous patch, now make it sit in the relevant class. One "significant" change is that the messaging service is available by local reference already, not by the sharded one. Other dependencies are already satisfied by the patch that introduced the sstables_loader class. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-10-11 11:08:21 +03:00
Pavel Emelyanov	7e49359720	storage_service, api: Keep sstables loading API handlers separate Right now the handlers sit in one boat with the rest of the storage service APIs. Next patches will switch this particular endpoint to use previously introduced sstables_loader, before doing so here's the respective API set/unset stubs. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-10-11 11:05:45 +03:00
Raphael S. Carvalho	342bfbd65a	compaction: Make major compaction on keyspace resilient if low on space Let's major compact the smallest tables first, increasing chances of success if low on disk space. parallel_for_each() didn't have any effect on space requirement as compaction_manager serializes major compaction in a shard. As parallel_for_each() is no longer used, find_column_family() is now used before each compact_all_sstables() to avoid a race with table drop. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <20211005135257.31931-1-raphaelsc@scylladb.com>	2021-10-05 17:04:34 +03:00
Avi Kivity	148a12f3da	Merge "Keep storage_service less aware of cdc internals" from Pavel E " The storage_service is involved in the cdc_generation_service guts more than needed. - the bool _for_testing bit is cdc-only - there's API-only cdc_generation_service getter - cdc_g._s. startup code partially sits in s._s. one This patch cleans most of the above leaving only the startup _cdc_gen_id on board. tests: unit(dev) refs: #2795 " * 'br-storage-service-vs-cdc-2' of https://github.com/xemul/scylla: api: Use local sharded<cdc::generation_service> reference main: Push cdc::generation_service via API storage_service: Ditch for_testing boolean cdc: Replace db::config with generation_service::config cdc: Drop db::config from description_generator cdc: Remove all arguments from maybe_rewrite_streams_descriptions cdc: Move maybe_rewrite_streams_descriptions into after_join cdc: Squash two methods into one cdc: Turn make_new_cdc_generation a service method cdc: Remove ring-delay arg from make_new_cdc_generation cdc: Keep database reference on generation_service	2021-10-04 14:56:05 +03:00
Pavel Emelyanov	037135316e	api: Use local sharded<cdc::generation_service> reference And remove the getter from storage_service. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-09-30 16:04:12 +03:00
Pavel Emelyanov	5d8e05e7ae	main: Push cdc::generation_service via API This is not to mess with storage service in this API call. Next patch will make use of the passed reference. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-09-30 16:04:12 +03:00
Pavel Emelyanov	beb345c00a	code: Rename get_local_host_id() into load_...() There will appear the future-less method which better deserves the get_ prefix, so give the existing method the load_ one. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-09-30 10:33:57 +03:00
Raphael S. Carvalho	acba3bd3c4	sstables: give a more descriptive name to compaction_options the name compaction_options is confusing as it overlaps in meaning with compaction_descriptor. hard to reason what the exact differences between them are, without digging into the implementation. compaction_options is intended to only carry options specific to a given compaction type, like a mode for scrub, so let's rename it to compaction_type_options to make it clearer for the readers. [avi: adjust for scrub changes] Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <20210908003934.152054-1-raphaelsc@scylladb.com>	2021-09-12 11:21:33 +03:00
Avi Kivity	9fb9299d95	api: remove use of get_local_gossiper() Pass down gossiper from main, converting it to a shard-local instance in calls to register_api() (which is the point that broadcasts the endpoint registration across shards). This helps remove gossiper as a global variable.	2021-09-07 15:53:39 +03:00
Botond Dénes	c1203618eb	api: storage_service: validate_keyspace -> scrub_keyspace (validate mode) Fold validate keyspace into scrub keyspace (validate mode).	2021-08-05 07:36:45 +03:00
Botond Dénes	5f6468d7d7	compaction/compaction_manager: hide perform_sstable_validation() We are folding validation compaction into scrub (at least on the interface level), so remove the validation entry point accordingly and have users go through `perform_sstable_scrub()` instead.	2021-08-05 07:36:44 +03:00
Pavel Emelyanov	df285fca7a	api: Capture and use sharded<storage_service>& in handlers The reference in question is already there, handlers that need storage service can capture it and use. These handlers are not yet stopped, but neither is the storage service itself, so the potentially dangling reference is not being set up here. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-07-29 05:12:36 +03:00
Pavel Emelyanov	2e50ba7079	api: Carry sharded<storage_service>& down to some handlers Both set_server_storage_service and set_server_storage_proxy set up API handlers that need storage service to work. Now they all call for global storage service instance, but it's better if they receive one from main. This patch carries the sharded storage service reference down to handlers setting function, next patch will make use of it. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-07-29 05:12:36 +03:00
Avi Kivity	331eb57e17	Revert "compression: define 'class' attribute for compression and deprecate 'sstable_compression'" This reverts commit `5571ef0d6d`. It causes rolling upgrade failures. Fixes #9055. Reopens #8948.	2021-07-28 14:14:22 +03:00
Juliusz Stasiewicz	a8b741efe2	endpoint_details: store `_host` as `gms::inet_address` In an upcoming commit I will add "system.describe_ring" table which uses endpoint's inet address as a part of CK and, therefore, needs to keep them sorted with `inet_addr_type::less`.	2021-07-20 14:00:54 +02:00
Botond Dénes	b0ef57c833	api: storage_service: expose validation compaction	2021-07-12 10:25:15 +03:00
Raphael S. Carvalho	1924e8d2b6	treewide: Move compaction code into a new top-level compaction dir Since compaction is layered on top of sstables, let's move all compaction code into a new top-level directory. This change will give me extra motivation to remove all layer violations, like sstable calling compaction-specific code, and compaction entanglement with other components like table and storage service. Next steps: - remove all layer violations - move compaction code in sstables namespace into a new one for compaction. - move compaction unit tests into its own file Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <20210707194058.87060-1-raphaelsc@scylladb.com>	2021-07-07 23:21:51 +03:00
Avi Kivity	5571ef0d6d	compression: define 'class' attribute for compression and deprecate 'sstable_compression' Cassandra 3.0 deprecated the 'sstable_compression' attribute and added 'class' as a replacement. Follow by supporting both. The SSTABLE_COMPRESSION variable is renamed to SSTABLE_COMPRESSION_DEPRECATED to detect all uses and prevent future misuse. To prevent old-version nodes from seeing the new name, the compression_parameters class preserves the key name when it is constructed from an options map, and emits the same key name when asked to generate an options map. Existing unit tests are modified to use the new name, and a test is added to ensure the old name is still supported. Fixes #8948. Closes #8949	2021-07-07 19:15:20 +02:00


251 Commits