scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-04-27 20:05:10 +00:00

Author	SHA1	Message	Date
Kamil Braun	38f65e5a2e	main: start direct failure detector service We add the new direct failure detector to the list of services started in the Scylla process. To start the service, we need an implementation of `pinger` and `clock`. `pinger` is implemented using existing GOSSIP_ECHO verb. The gossip echo message requires the node's gossip generation number. We handle this by embedding the pinger implementation inside `gossiper`, and making `gossiper` update the generation number (cached inside the pinger class) periodically. `clock` is a simple implementation which uses `std::chrono::steady_clock` and `seastar::sleep_until` underneath. Translating `steady_clock` durations to `direct_failure_detector::clock` durations happens by taking the number of ticks. The service is currently not used, just initialized; no endpoints are added and no listeners are registered yet, but the following commits change that.	2022-05-09 13:14:42 +02:00
Kamil Braun	9551256e81	messaging_service: abortable version of `send_gossip_echo` Use the new `send_message_abortable` function to implement an abortable version of `send_gossip_echo`. These echo messages will be used for direct failure detection.	2022-05-09 13:14:41 +02:00
Kamil Braun	f2548fc3fa	message: abortable version of `send_message` I want to be able to timeout `send_message`, but not through the existing `send_message_timeout` API which forces me to use a particular clock/duration/timepoint type. Introduce a more general `send_message_abortable` API which gets an `abort_source&`, subscribes to it, and uses the `rpc::cancellable` interface to cancel the RPC on abort. The function is 90% copy-pasta from `send_message{_timeout}`, only the abort part is new.	2022-05-09 13:14:41 +02:00
Kamil Braun	c15f3a9698	test: raft: randomized_nemesis_test: remove old failure_detector No longer used. Split from the previous commit for a better diff.	2022-05-09 13:14:41 +02:00
Kamil Braun	915d329f1f	test: raft: randomized_nemesis_test: use `direct_failure_detector::failure_detector` Until now the nemesis test used its own failure detector implementation which used one-way heartbeats. Switch it to use the new direct failure detection service, which will also be used in production code. Integrating it does require some work however as we need to implement the `pinger` and `clock` interfaces for the failure detector. The service is sharded, but for simplicity of implementation we implement rpcs and sleeps by routing the requests to shard 0, where logical timers and network live.	2022-05-09 13:14:41 +02:00
Kamil Braun	e5fc0681d9	test: raft: randomized_nemesis_test: ping all shards on each tick Right now the test is running entirely on shard 0, but we want to introduce a sharded service to the test. The initial naive attempt of doing that failed because the test would time out (reach the tick limit) before any work distributed to other shards could even start. The solution in this commit solves that by synchronizing the shards on each tick. When the test is ran with smp=1, the behavior is as before.	2022-05-09 13:14:41 +02:00
Kamil Braun	e4f85cf425	test: unit test for new failure detector service	2022-05-09 13:14:41 +02:00
Kamil Braun	666e5a414d	direct_failure_detector: introduce new failure detector service The new service performs failure detection by periodically pinging endpoints. The set of pinged endpoints can be dynamically extended and shrinked. To learn about liveness of endpoints, user of the service registers a listener and chooses a threshold - a duration of time which has to pass since the last successful ping in order to mark an endpoint as dead. When an endpoint responds it's immediately marked as alive. Endpoints are identified using abstract integer identifiers. The method of performing a ping is a dependency of the service provided by the user through the `pinger` interface. The implementation of `pinger` is responsible for translating the abstract endpoint IDs to 'real' addresses. For example, production implementation may map endpoint IDs to IP addresses and use TCP/IP to perform the ping, while a test/simulation implementation may use a simulated network that also operates on abstract identifiers. Similarly, the method of measuring time is a dependency provided by the user using the `clock` interface. The service operates on abstract time intervals and timepoints. So, for example, in a production implementation time can be measured using a stopwatch, while in test/simulation we can use a logical clock. The service distributes work across different shards. When an endpoint is added to the set of detected endpoints, the service will choose a shard with the smallest amount of workers and create a worker that is responsible for periodically pinging this endpoint on that shard and sending notifications to listeners. Endpoints can be added or removed only through the shard 0 instance of the service and shard 0 is responsible for coordinating the endpoint workers. Listeners can be registered on any shard.	2022-05-09 13:14:40 +02:00
Botond Dénes	9623589c77	Merge 'Futurize data_read_resolver::resolve and to_data_query_result' from Benny Halevy This series futurizes two synchronous functions used for data reconciliation: `data_read_resolver::resolve` and `to_data_query_result` and does so by introducing lower-level asynchronous infrastructure: `mutation_partition_view::accept_gently`, `frozen_mutation::unfreeze_gently` and `frozen_mutation::consume_gently`, and `mutation::consume_gently`. This trades some cycles on this cold path to prevent known reactor stalls. Fixes #2361 Fixes #10038 Closes #10482 * github.com:scylladb/scylla: mutation: add consume_gently frozen_mutation: add consume_gently query: coroutinize to_data_query_result frozen_mutation: add unfreeze_gently mutation_partition_view: add accept_gently methods storage_proxy: futurize data_read_resolver::resolve	2022-05-06 10:23:02 +03:00
Piotr Sarna	eeec502aee	Merge 'gms: feature_service: reduce boilerplate to add a cluster feature' from Avi Kivity Currently, adding a cluster feature requires editing several files and repeating the new feature name several times. This series reduces the boilerplate to a single line (for non-experimental features), and perhaps three for experimental features. Closes #10488 * github.com:scylladb/scylla: gms: feature_service: remove variable/helper function duplication gms: feature: make `operator bool` implicit gms: feature_service: remove feature variable duplication in enable() gms: feature_service: remove feature variable declaration/definition duplication gms: features: de-quadruplicate active feature names gms: features: de-quadruplicate deprecated feature names gms: feature_service: avoid duplicating feature names when listing known features	2022-05-05 12:43:15 +02:00
Benny Halevy	ca1b616092	mutation: add consume_gently Allow yielding when consuming a mutation, and use in to_data_query_result. Fixes #10038 Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-05-05 13:32:25 +03:00
Benny Halevy	09fb2c983a	frozen_mutation: add consume_gently Allow yielding when consuming a frozen_mutation, and use in to_data_query_result. Refs #10038 Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-05-05 13:32:25 +03:00
Benny Halevy	c9612855c7	query: coroutinize to_data_query_result Reduce stalls by maybe yielding in-between partitions, and by awaiting unfreeze_gently where possible. Refs #10038 Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-05-05 13:32:25 +03:00
Benny Halevy	e12454f175	frozen_mutation: add unfreeze_gently And use in data_read_resolver::resolve Fixes #2361 Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-05-05 13:32:25 +03:00
Benny Halevy	4963eb73b5	mutation_partition_view: add accept_gently methods Allow yielding when consuming mutation_partition_view. To be used in later patches by a new unfreeze_gently function and frozen_mutation::consume. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-05-05 13:32:25 +03:00
Benny Halevy	f02c25f2c3	storage_proxy: futurize data_read_resolver::resolve Allow yielding in data_read_resolver::resolve to prevent reactor stalls. TODO: unfreeze_gently, to prevent stalls due to large partitions. Refs #2361 Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-05-05 13:32:25 +03:00
Avi Kivity	19ab3edd77	gms: feature_service: remove variable/helper function duplication Each feature has a private variable and a public accessor. Since the accessor effectively makes the variable public, avoid the intermediary and make the variable public directly. To ease mechanical translation, the variable name is chosen as the function name (without the cluster_supports_ prefix). References throughout the codebase are adjusted.	2022-05-04 18:59:56 +03:00
Avi Kivity	435b46cd52	gms: feature: make `operator bool` implicit Features are usually used as booleans, so forcing allowing them to implicitly decay to bool is not a mistake. In fact a bunch of helper functions exist to cast feature variables to bool. Prepare to reduce this boilerplate by allowing automatic conversion to bool.	2022-05-04 18:58:24 +03:00
Avi Kivity	81ad595f61	gms: feature_service: remove feature variable duplication in enable() We have a list of all feature variables in enable(), but the list is also available programatically in _registered_features, so use that instead.	2022-05-04 18:44:28 +03:00
Avi Kivity	f0f4759163	gms: feature_service: remove feature variable declaration/definition duplication Feature variables are both declared and defined. Make that happen in one place, reducing boilerplate.	2022-05-04 18:24:56 +03:00
Avi Kivity	0f95258577	gms: features: de-quadruplicate active feature names Active feature names are present four or five times in the code: a delaration in feature.hh, a definition and initialization (two copies) in feature_service.cc, a use in feature_service.cc, and a possible reference in feature_service.cc if the feature is conditionally enabled. Switch to just one copy or two, using the "foo"sv operator (and "foo"s) to generate a string_view (string) as before. Note that a few features had different external and C++ names; we preserve the external name. This patch does cause literal strings to be present in two places, making them vulnerable to misspellings. But since feature names are immutable, there is little risk that one will change without the other.	2022-05-04 18:12:53 +03:00
Avi Kivity	980b109adb	gms: features: de-quadruplicate deprecated feature names Deprecated features are unused, but are present four times in the code: a delaration in feature.hh, a definition and initialization (two copies) in feature_service.cc, and a use in feature_service.cc. Switch to just one copy, using the "foo"sv operator to generate a string_view as before. Note that a few features had different external and C++ names; we preserve the external name.	2022-05-04 17:54:05 +03:00
Avi Kivity	ebe5ce2870	gms: feature_service: avoid duplicating feature names when listing known features We already have the registered features in a data structure, collect them from there instead of repeating.	2022-05-04 16:19:42 +03:00
Calle Wilund	78350a7e1b	cdc: Ensure columns removed from log table are registered as dropped If we are redefining the log table, we need to ensure any dropped columns are registered in "dropped_columns" table, otherwise clients will not be able to read data older than now. Includes unit test. Should probably be backported to all CDC enabled versions. Fixes #10473 Closes #10474	2022-05-04 14:19:39 +02:00
Michał Radwański	29e09a3292	db/config: command line arguments logger_stdout_timestamps and logger_ostream_type are no longer ignored Closes #10452	2022-05-04 14:40:52 +03:00
Pavel Emelyanov	063d26bc9e	system_keyspace/config: Swallow string->value cast exception When updating an updateable value via CQL the new value comes as a string that's then boost::lexical_cast-ed to the desired value. If the cast throws the respective exception is printed in logs which is very likely uncalled for. fixes: #10394 tests: manual Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Message-Id: <20220503142942.8145-1-xemul@scylladb.com>	2022-05-04 08:35:12 +03:00
Pavel Emelyanov	b26a3da584	gossiper: Coroutinize wait_for_gossip_to_settle() Looks notably shorter this way tests: unit(dev) Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Message-Id: <20220422093000.24407-1-xemul@scylladb.com>	2022-05-03 15:58:04 +03:00
Botond Dénes	4440d4b41a	Merge "De-globalize gossiper" from Pavel Emelyanov " - Alternator gets gossiper for its proxy dependency - Forward service method that takes global gossiper can re-use proxy method (forward -> proxy reference is already there) - Table code is patched to require gossiper argument - Snitch gets a dependency reference on snitch_ptr and some extra care for snitch driver vs snitch-ptr interaction and gossip test - Cql test env should carry gossiper reference on-board - Few places can re-use the existing local gossiper reference - Scylla-gdb needs to get gossiper from debug namespace and needs _not_ to get feature service from gossiper " * 'br-gossiper-deglobal-2' of https://github.com/xemul/scylla: code: De-globalize gossiper scylla-gdb, main: Get feature service without gossiper help test: Use cql-test-env gossiper cql test env: Keep gossiper reference on board code: Use gossiper reference where possible snitch: Use local gossiper in drivers snitch: Keep gossiper reference test: Remove snitch from manual gossip test gossiper: Use container() instead of the global pointer main, cql_test_env: Start snitch later snitch: Move snitch_base::get_endpoint_info() forward service: Re-use proxy's helper with duplicated code table: Don't use global gossiper alternator: Don't use global gossiper	2022-05-03 15:56:07 +03:00
Nadav Har'El	6fb762630b	cql-pytest: translate Cassandra's tests for SELECT operations This is a translation of Cassandra's CQL unit test source file validation/operations/SelectTest.java into our our cql-pytest framework. This large test file includes 78 tests for various types of SELECT operations. Four additional tests require UDF in Java syntax, and were skipped. All 78 tests pass on Cassandra. 25 of the tests fail on Scylla reproducing 3 already known Scylla issues and 8 previously-unknown issues: Previously known issues: Refs #2962: Collection column indexing Refs #4244: Add support for mixing token, multi- and single-column restrictions Refs #8627: Cleanly reject updates with indexed values where value > 64k Newly-discovered issues: Refs #10354: SELECT DISTINCT should allow filter on static columns, not just partition keys Refs #10357: Spurious static row returned from query with filtering, despite not matching filter Refs #10358: Comparison with UNSET_VALUE should produce an error Refs #10359: "CONTAINS NULL" and "CONTAINS KEY NULL" restrictions should match nothing Refs #10361: Null or UNSET_VALUE subscript should generate an invalid request error Refs #10366: Enforce Key-length limits during SELECT Refs #10443: SELECT with IN and ORDER BY orders rows per partition instead of for the entire response Refs #10448: The CQL token() function should validate its parameters Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #10449	2022-05-03 11:45:05 +03:00
Pavel Emelyanov	e80adbade3	code: De-globalize gossiper No code uses global gossiper instance, it can be removed. The main and cql-test-env code now have their own real local instances. This change also requires adding the debug:: pointer and fixing the scylle-gdb.py to find the correct global location. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-05-03 10:57:40 +03:00
Pavel Emelyanov	89ee15b05b	scylla-gdb, main: Get feature service without gossiper help This is needed not to mess with removed global gossiper in the next patch. Other than this, it's better to access services by their own debug:: pointers, not via under-the-good dependencies chains. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-05-03 10:57:40 +03:00
Pavel Emelyanov	b25fc29801	test: Use cql-test-env gossiper There's yet another -test-env -- the alternator- one -- which needs gossiper. It now uses global reference, but can grab gossiper reference from the cql-test-env which partitipates in initialization. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-05-03 10:57:40 +03:00
Pavel Emelyanov	b0544ba7bd	cql test env: Keep gossiper reference on board The reference is already available at the env initialization, but it's not kept on the env instance itself. Will be used by the next patch. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-05-03 10:57:40 +03:00
Pavel Emelyanov	4bea0b7491	code: Use gossiper reference where possible Some places in the code has function-local gossiper reference but continue to use global instance. Re-use the local reference (it's going to become sharded<> instance soon). Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-05-03 10:57:40 +03:00
Pavel Emelyanov	e502047c74	snitch: Use local gossiper in drivers Each driver has a pointer to this shard snitch_ptr which, in turn, has the reference on gossiper. This lets drivers stop using the global gossiper instance. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-05-03 10:57:40 +03:00
Pavel Emelyanov	38c77d0d85	snitch: Keep gossiper reference The reference is put on the snitch_ptr because this is the sharded<> thing and because gossiper reference is the same for different snitch drivers. Also, getting gossiper from snitch_ptr by driver will look simpler than getting it from any base class. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-05-03 10:57:40 +03:00
Pavel Emelyanov	52fc4d6b22	test: Remove snitch from manual gossip test It's not in use out there Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-05-03 10:57:40 +03:00
Pavel Emelyanov	7a0ca3fedc	gossiper: Use container() instead of the global pointer Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-05-03 10:57:40 +03:00
Pavel Emelyanov	2d32c47d0d	main, cql_test_env: Start snitch later Snitch depends on gossiper and system keyspace, so it needs to be started after those two do. fixes #10402 Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-05-03 10:57:32 +03:00
Pavel Emelyanov	f85e12ffa5	snitch: Move snitch_base::get_endpoint_info() This method is only needed by production_snitch_base inheritants Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-05-03 10:34:52 +03:00
Pavel Emelyanov	282a1880a5	forward service: Re-use proxy's helper with duplicated code The get_live_endpoints matches the same method on the proxy side. Since the forward service carries proxy reference, it can use its method (which needs to be made public for that sake). Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-05-03 10:34:51 +03:00
Pavel Emelyanov	11c99fc41b	table: Don't use global gossiper The table::get_hit_rate needs gossiper to get hitrates state from. There's no way to carry gossiper reference on the table itself, so it's up to the callers of that method to provide it. Fortunately, there's only one caller -- the proxy -- but the call chain to carry the reference it not very short ... oh, well. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-05-03 10:33:08 +03:00
Pavel Emelyanov	7a5c2cdbe6	alternator: Don't use global gossiper There's proxy at hand which can provide local gossiper reference Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-05-03 10:33:07 +03:00
Avi Kivity	a2901a376d	Merge 'Coroutinize some `storage_service` member functions' from Pavel Solodovnikov These trivial changes are mostly intended to reduce the use of `seastar::async`. Closes #10416 * github.com:scylladb/scylla: service: storage_service: coroutinize `start_gossiping()` service: storage_service: coroutinize `node_ops_cmd_heartbeat_updater()` service: storage_service: coroutinize `node_ops_abort_thread()` service: storage_service: coroutinize `node_ops_abort()` service: storage_service: coroutinize `node_ops_done()` service: storage_service: coroutinize `node_ops_update_heartbeat()` service: storage_service: coroutinize `force_remove_completion()` service: storage_service: coroutinize `start_leaving()` service: storage_service: coroutinize `start_sys_dist_ks()` service: storage_service: coroutinize `prepare_to_join()` service: storage_service: coroutinize `removenode_add_ranges()` service: storage_service: coroutinize `unbootstrap()` service: storage_service: coroutinize `get_changed_ranges_for_leaving()`	2022-05-02 12:59:36 +03:00
Botond Dénes	53c66fe24a	Merge "Make LCS reshape and major more efficient by picking the ideal output level" from Raphael S. Carvalho " Today, both operations are picking the highest level as the ideal level for placing the output, but the size of input should be used instead. The formula for calculating the ideal level is: ceil(log base(fan_out) of (total_input_size / max_fragment_size)) where fan_out = 10 by default, total_input_size = total size of input data and max_fragment_size = maximum size for fragment (160M by default) such that 20 fragments will be placed at level 2, as level 1 capacity is 10 fragments only. By placing the output in the incorrect level, tons of backlog will be generated for LCS because it will either have to promote or demote fragments until the levels are properly balanced. " * 'optimize_lcs_major_and_reshape/v2' of https://github.com/raphaelsc/scylla: compaction: LCS: avoid needless work post major compaction completion compaction: LCS: avoid needless work post reshape completion compaction: LCS: extract calculation of ideal level for input compaction: LCS: Fix off-by-one in formula used to calculate ideal level	2022-05-02 10:16:09 +03:00
Pavel Solodovnikov	47834313d8	repair: avoid infinite recursion on stringifying unknown node_ops_cmd Cast the cmd representation to underlying type and avoid infinite recursion in the `operator <<(node_ops_cmd)`. Tests: unit(dev) Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com> Message-Id: <20220430180701.1012190-1-pa.solodovnikov@scylladb.com>	2022-05-02 10:08:34 +03:00
Avi Kivity	5169ce40ef	Merge 'loading_cache: force minimum size of unprivileged ' from Piotr Grabowski This series enforces a minimum size of the unprivileged section when performing `shrink()` operation. When the cache is shrunk, we still drop entries first from unprivileged section (as before this commit), however, if this section is already small (smaller than `max_size / 2`), we will drop entries from the privileged section. This is necessary, as before this change the unprivileged section could be starved. For example if the cache could store at most 50 entries and there are 49 entries in privileged section, after adding 5 entries (that would go to unprivileged section) 4 of them would get evicted and only the 5th one would stay. This caused problems with BATCH statements where all prepared statements in the batch have to stay in cache at the same time for the batch to correctly execute. To correctly check if the unprivileged section might get too small after dropping an entry, `_current_size` variable, which tracked the overall size of cache, is changed to two variables: `_unprivileged_section_size` and `_privileged_section_size`, tracking section sizes separately. New tests are added to check this new behavior and bookkeeping of the section sizes. A test is added, that sets up a CQL environment with a very small prepared statement cache, reproduces issue in #10440 and stresses the cache. Fixes #10440. Closes #10456 * github.com:scylladb/scylla: loading_cache_test: test prepared stmts cache loading_cache: force minimum size of unprivileged loading_cache: extract dropping entries to lambdas loading_cache: separately track size of sections loading_cache: fix typo in 'privileged'	2022-05-01 19:36:35 +03:00
Avi Kivity	325eb9b4d2	Merge 'compaction: get rid of reader v1' from Benny Halevy This series gets rid of the remaining usage of flat_mutation_reader v1 in compaction Test: sstable_compaction_test Closes #10454 * github.com:scylladb/scylla: compaction: sanitize headers from flat_mutation_reader v1 flat_mutation_reader: get rid of class filter compaction: cleanup_compaction: make_partition_filter: return flat_mutation_reader_v2::filter	2022-05-01 19:29:10 +03:00
Avi Kivity	ab72dbc93e	Merge 'Make internal statements caching more coherent' from Eliran Sinvani There was some doubts about which internal prepared statements are cached and which aren't. In addition some queries that should have been cached (IMO), weren't. This PR adds some verbosity to the caching enabling parameter as well as adding caching to some queries. As a followup I would suggest to have internal queries as a compile time strings that have a compile time hash, this will make the cache lookup not be dependent on the query textual length as it is today, this makes sense given that the queries are static even today. Closes #10465 Fixes #10335. * github.com:scylladb/scylla: internal queries: add caching to some queries query_processor: remove default internal query caching behavior query_processor: make execute_internal caching parameter more verbose	2022-05-01 19:27:06 +03:00
Eliran Sinvani	a16b4e407d	internal queries: add caching to some queries Some of the internal queries didn't have caching enabled even though there are chances of the query executing in large bursts or relatively often, example of the former is `default_authorized::authorize` and for the later is `system_distributed_keyspace::get_service_levels`. Fixes #10335 Signed-off-by: Eliran Sinvani <eliransin@scylladb.com>	2022-05-01 13:30:02 +03:00

1 2 3 4 5 ...

31077 Commits