scylladb

Author	SHA1	Message	Date
Pavel Emelyanov	98ff779676	batchlog_manager: Add drain and stop logging Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-07-04 13:42:46 +03:00
Pavel Emelyanov	e2007cd317	batchlog_manager: Coroutinize drain and stop This is not identical change, if drain() resolves with exception we end up skipping the gate closing, but since it's stop why bother Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-07-04 13:42:46 +03:00
Pavel Emelyanov	8a03683671	batchlog_manager: Drain it with shared future The .drain() method can be called from several places, each needs to wait for its completion. Now this is achieved with the help of a gate, but there's a simpler way Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-07-04 13:42:45 +03:00
Piotr Dulikowski	e6beab3106	storage_proxy: add allow rate limit flag to mutate/mutate_result Now, mutate/mutate_result accept a flag which decides whether the write should be rate limited or not. The new parameter is mandatory and all call sites were updated.	2022-06-22 20:16:49 +02:00
Avi Kivity	528ab5a502	treewide: change metric calls from make_derive to make_counter make_derive was recently deprecated in favor of make_counter, so make the change throughout the codebase. Closes #10564	2022-05-14 12:53:55 +02:00
Avi Kivity	5937b1fa23	treewide: remove empty comments in top-of-files After `fcb8d040` ("treewide: use Software Package Data Exchange (SPDX) license identifiers"), many dual-licensed files were left with empty comments on top. Remove them to avoid visual noise. Closes #10562	2022-05-13 07:11:58 +02:00
Michael Livshin	00ed4ac74c	batchlog_manager: warn when a batch fails to replay Only for reasons other than "no such KS", i.e. when the failure is presumed transient and the batch in question is not deleted from batchlog and will be retried in the future. (Would info be more appropriate here than warning?) Signed-off-by: Michael Livshin <michael.livshin@scylladb.com> Closes #10556	2022-05-12 13:34:03 +03:00
Eliran Sinvani	e0c7178e75	query_processor: remove default internal query caching behavior When executing internal queries, it is important that the developer will decide if to cache the query internally or not since internal queries are cached indefinitely. Also important is that the programmer will be aware if caching is going to happen or not. The code contained two "groups" of `query_processor::execute_internal`, one group has caching by default and the other doesn't. Here we add overloads to eliminate default values for caching behaviour, forcing an explicit parameter for the caching values. All the call sites were changed to reflect the original caching default that was there. Signed-off-by: Eliran Sinvani <eliransin@scylladb.com>	2022-05-01 08:33:55 +03:00
Benny Halevy	795d4a0bad	batchlog_manager: batchlog_replay_loop: ignore broken_semaphore if abort_requested drain() breaks _sem, causing do_batch_log_replay to throw broken_semaphore. Ignore this error in batchlog_replay_loop as it's expected on shutdown. https://jenkins.scylladb.com/job/scylla-master/job/dtest-debug/1073/testReport/junit/thrift_tests/TestCompactStorageThriftAccesses/test_get/ ``` E AssertionError: Unexpected errors found: [('node1', ['ERROR 2022-02-14 06:55:44,263 [shard 0] batchlog_manager - Exception in batch replay: seastar::broken_semaphore (Semaphore broken)'])] ``` Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Message-Id: <20220214090607.1213740-1-bhalevy@scylladb.com>	2022-02-14 11:34:16 +02:00
Avi Kivity	fcb8d040e8	treewide: use Software Package Data Exchange (SPDX) license identifiers Instead of lengthy blurbs, switch to single-line, machine-readable standardized (https://spdx.dev) license identifiers. The Linux kernel switched long ago, so there is strong precedent. Three cases are handled: AGPL-only, Apache-only, and dual licensed. For the latter case, I chose (AGPL-3.0-or-later and Apache-2.0), reasoning that our changes are extensive enough to apply our license. The changes were applied mechanically with a script, except to licenses/README.md. Closes #9937	2022-01-18 12:15:18 +01:00
Raphael S. Carvalho	426450dc04	treewide: remove useless include of database.hh Wrote a script based on cpp-include to find places that needlessly included database.hh, which is expensive to process during build time. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <20220104204359.168895-1-raphaelsc@scylladb.com>	2022-01-05 10:15:19 +02:00
Benny Halevy	d344765ec6	get rid of the global batchlog_manager Now that it's unused. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2021-11-23 08:27:30 +02:00
Benny Halevy	744275df73	batchlog_manager: get_batch_log_mutation_for: move to storage_proxy And rename to get_batchlog_mutation_for while at it, as it's about the batchlog, not batch_log. This resolves a circular dependency between the batchlog_manager and the storage_proxy that required it in the case. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2021-11-23 08:27:30 +02:00
Benny Halevy	55967a8597	batchlog_manager: endpoint_filter: move to gossiper There's nothing in this function that actually requires the batchlog manager instance. It uses a random number engine that's moved along with it to class gossiper. This resolves a circular dependency between the batchlog_manager and storage_proxy. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2021-11-23 08:27:30 +02:00
Benny Halevy	85d0bbb4fc	batchlog_manager: do_batch_log_replay: use lambda coroutine Simplify the function implementation and error handling by invoking a lambda coroutine on shard 0 that keeps a gate holder and semaphore units on its stack, for RAII-style unwinding. It then may invoke a function on another shard, using the peered service container() to do the replay on the destination shard. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2021-11-23 08:27:30 +02:00
Benny Halevy	691afe1c4d	batchlog_manager: derive from peering_sharded_service So that do_batch_log_replay can get the sharded batchlog_manager as container(). Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2021-11-23 08:27:30 +02:00
Benny Halevy	03039e8f8a	main: allow setting the global batchlog_manager As a prerequisite to globalizing the batchlog_manager, allow setting a global pointer to it and instantiate the sharded<db::batchlog_manager> on the main/cql_test_env stack. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2021-11-23 08:27:30 +02:00
Pavel Emelyanov	598841a5dd	code: Expel gossiper.hh from other headers This needs to add forward declarations of the gossiper class and re-include some other headers here and there. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-09-22 13:13:06 +03:00
Benny Halevy	5165780d81	batchlog_manager: refactor drain out of stop drain() aborts the replay loop fiber and returns its future. It's grabbing _gate so stop() will wait on it. The intention is to call stop_replay_loop from storage_service::decommission and do_drain rather than stop, so we can stop the batchlog manager once, using a deferred action in main. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2021-07-20 20:23:06 +03:00
Benny Halevy	c47fbda076	batchlog_manager: stop: break _sem on shard 0 Abort do_batch_log_replay if waiting on the semaphore. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2021-07-20 19:35:23 +03:00
Benny Halevy	deef1b4f59	batchlog_manager: stop: use abort_source to abort batchlog_replay_loop Harden start/stop by using an abort_source to abort from the replay loop. Extract the loop into batchlog_replay_loop() coroutine, with the _stop abort source as a stop condition, plus use it for sleep_abortable to be able to promptly stop while sleeping. start() stores batchlog_replay_loop's future in a newly added _started member, which is waited on in stop() to synchronize with the start process at any stage. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2021-07-20 19:32:55 +03:00
Benny Halevy	976b517f55	batchlog_manager: do_batch_log_replay: hold _gate So we can wait on do_batch_log_replay on stop(). Note that do_batch_log_replay is called both from batchlog_replay_loop and from the storage_service. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2021-07-20 19:30:55 +03:00
Avi Kivity	4d70f3baee	storage_proxy: change unordered_set<inet_address> to small_vector in write path The write paths in storage_proxy pass replica sets as std::unordered_set<gms::inet_address>. This is a complex type, with N+1 allocations for N members, so we change it to a small_vector (via inet_address_vector_replica_set) which requires just one allocation, and even zero when up to three replicas are used. This change is more nuanced than the corresponding change to the read path `abe3d7d7` ("Merge 'storage_proxy: use small_vector for vectors of inet_address' from Avi Kivity"), for two reasons: - there is a quadratic algorithm in abstract_write_response_handler::response(): it searches for a replica and erases it. Since this happens for every replica, it happens N^2/2 times. - replica sets for writes always include all datacenters, while reads usually involve just one datacenter. So, a write to a keyspace that has 5 datacenters will invoke 15*(15-1)/2 = 105 compares. We could remove this by sending the index of the replica in the replica set to the replica and ask it to include the index in the response, but I think that this is unnecessary. Those 105 compares need to be only 105/15 = 7 times cheaper than the corresponding unordered_set operation, which they surely will. Handling a response after a cross-datacenter round trip surely involves L3 cache misses, and a small_vector reduces these to a minimum compared to an unordered_set with its bucket table, linked list walking and management, and table rehashing. Tests using perf_simple_query --write --smp 1 --operations-per-shard 1000000 --task-quota-ms show two allocations removed (as expected) and a nice reduction in instructions executed. before: median 204842.54 tps ( 54.2 allocs/op, 13.2 tasks/op, 49890 insns/op) after: median 206077.65 tps ( 52.2 allocs/op, 13.2 tasks/op, 49138 insns/op) Closes #8847	2021-06-17 13:46:40 +03:00
Avi Kivity	a55b434a2b	treewide: extend copyright statements to present day	2021-06-06 19:18:49 +03:00
Pavel Solodovnikov	e0749d6264	treewide: some random header cleanups Eliminate not used includes and replace some more includes with forward declarations where appropriate. Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>	2021-06-06 19:18:49 +03:00
Asias He	5a410cb6e3	token_metadata: Get rid of get_all_endpoints_count It is now only a wrapper for count_normal_token_owners. Refs #8534	2021-05-06 15:36:20 +08:00
Benny Halevy	3fab0f8694	storage_proxy: convert to shared_token_metadata get() the latest token_metadata_ptr from the shared_token_metadata before each use. expose get_token_metadata_ptr() rather than get_token_metadata() so that caller can keep it across continuations. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2020-11-11 14:20:23 +02:00
Pavel Solodovnikov	5ff5df1afd	storage_proxy: un-hardcode force sync flag for `mutate_locally(mutation)` overload Corresponding overload of `storage_proxy::mutate_locally` was hardcoded to pass `db::commitlog::force_sync::no` to the `database::apply`. Unhardcode it and substitute `force_sync::no` to all existing call sites (as it were before). `force_sync::yes` will be used later for paxos learn writes when trying to apply mutations upgraded from an obsolete schema version (similar to the current case when applying locally a `frozen_mutation` stored in accepted proposal). Tests: unit(dev) Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com> Message-Id: <20200716124915.464789-1-pa.solodovnikov@scylladb.com>	2020-07-16 16:38:48 +03:00
Piotr Sarna	92aadb94e5	treewide: propagate trace state to write path In order to add tracing to places where it can be useful, e.g. materialized view updates and hinted handoff, tracing state is propagated to all applicable call sites.	2020-05-18 16:05:23 +02:00
Rafael Ávila de Espíndola	c5795e8199	everywhere: Replace engine().cpu_id() with this_shard_id() This is a bit simpler and might allow removing a few includes of reactor.hh. Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com> Message-Id: <20200326194656.74041-1-espindola@scylladb.com>	2020-03-27 11:40:03 +03:00
Pavel Emelyanov	7cdfd94207	batchlog: Use token_metadata from proxy This kills the second global reference on storage_service from batchlog code and breaks the dependency loop between these two. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2020-02-10 20:54:32 +03:00
Pavel Emelyanov	b4e66ddf1d	batchlog: Use in-config ring-delay This kills the first (out of two) global reference on storage_service Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2020-02-10 20:54:32 +03:00
Pavel Emelyanov	d361894b9d	batchlog_manager: Speed up token_metadata endpoints counting a bit In this place we only need to know the number of endpoints, while current code additionally shuffles them before counting. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2019-12-23 14:22:45 +02:00
Botond Dénes	fddd9a88dd	treewide: silence discarded future warnings for legit discards This patch silences those future discard warnings where it is clear that discarding the future was actually the intent of the original author, and they did the necessary precautions (handling errors). The patch also adds some trivial error handling (logging the error) in some places, which were lacking this, but otherwise look ok. No functional changes.	2019-08-26 18:54:44 +03:00
Gleb Natapov	6a4207f202	Pass service permit to storage_proxy Current cql transport code acquires a permit before processing a query and releases it when the query gets a reply, but some queries leave work behind. If the work is allowed to accumulate without any limit a server may eventually run out of memory. To prevent that the permit system should account for the background work as well. The patch is a first step in this direction. It passes a permit down to storage proxy where it will be later held by background work.	2019-08-12 10:20:43 +03:00
Gleb Natapov	95c6d19f6c	batchlog_manager: fix array out of bound access endpoint_filter() function assumes that each bucket of std::unordered_multimap contains elements with the same key only, so its size can be used to know how many elements with a particular key are there. But this is not the case, elements with multiple keys may share a bucket. Fix it by counting keys in other way. Fixes #3229 Message-Id: <20190501133127.GE21208@scylladb.com>	2019-05-01 17:30:11 +03:00
Asias He	af579a055b	gossip: Get rid of the gms::get_local_failure_detector static object Store the failure_detector object inside gossiper object. - No more the global object sharded<failure_detector> - No need to initialize sharded<failure_detector> manually which simplifies the code in tests/cql_test_env.cc and init.cc.	2019-03-22 09:08:51 +08:00
Duarte Nunes	fa2b0384d2	Replace std::experimental types with C++17 std version. Replace stdx::optional and stdx::string_view with the C++ std counterparts. Some instances of boost::variant were also replaced with std::variant, namely those that called seastar::visit. Scylla now requires GCC 8 to compile. Signed-off-by: Duarte Nunes <duarte@scylladb.com> Message-Id: <20190108111141.5369-1-duarte@scylladb.com>	2019-01-08 13:16:36 +02:00
Avi Kivity	30745eeb72	query_processor: replace sharded<database> with the local shard query_processor uses storage_proxy to access data, and the local database object to access replicated metadata. While it seems strange that the database object is not used to access data, it is logical when you consider that a sharded<database> only contains this node's data, not the cluster data. Take advantage of this to replace sharded<database> with a single database shard.	2018-12-29 11:02:15 +02:00
Avi Kivity	89be47e291	batchlog_manager: remove dependency on db::config Extract configuration into a new struct batchlog_manager_config and have the callers populate it using db::config. This reduces dependencies on global objects.	2018-12-09 20:11:38 +02:00
Avi Kivity	d77e044cde	db: convert sprint() to format() sprint() recently became more strict, throwing on sprint("%s", 5). Replace with the more modern format(). Mechanically converted with https://github.com/avikivity/unsprint.	2018-11-01 13:16:17 +00:00
Tomasz Grabiec	cd201d1987	db/batchlog_manager: Do not return a value from timer callback Timer callbacks are std::function<void()>. Exposed by changing callback_t to noncopyable_function<>. Message-Id: <1536138045-29209-1-git-send-email-tgrabiec@scylladb.com>	2018-09-05 12:32:21 +03:00
Nadav Har'El	25bd139508	cross-tree: clean up use of std::random_device() std::random_device() uses the relatively slow /dev/urandom, and we rarely if ever intend to use it directly - we normally want to use it to seed a faster random_engine (a pseudo-random number generator). In many places in the code, we first created a random_device variable, and then using it created a random_engine variable. However, this practice created the risk of a programmer accidentally using the random_device object, instead of the random_engine object, because both have the same API; this hurts performance. This risk materialized in just two places in the code, utils/uuid.cc and gms/gossiper.cc. A patch for uuid.cc was sent previously by Pawel and is not included in this patch, and the fix for gossiper.{cc,hh} is included here. To avoid risking the same mistake in the future, this patch switches across the code to an idiom where the random_device object is not named, so cannot be accidentally used. We use the following idiom: std::default_random_engine _engine{std::random_device{}()}; Here std::random_device{}() creates the random device (/dev/urandom) and pulls a random integer from it. It then uses this seed to create the random_engine (the pseudo-random number generator). The std::random_device{} object is temporary and unnamed, and cannot be unintentionally used directly. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20180726154958.4405-1-nyh@scylladb.com>	2018-07-26 16:54:58 +01:00
Avi Kivity	512baf536f	storage_proxy: implement write timeouts Require a timeout parameter for storage_proxy::mutate_begin() and all its callers (all the way to thrift and cql modification_statement and batch_statement). This should fix spurious debug-mode test failures, where overcommit and general debug slowness result in the default timeouts being exceeded. Since the tests use infinite timeouts, they should not time out any more. Tests: unit (release), with an extra patch that aborts when a non-infinite timeout is detected. Message-Id: <20180707204424.17116-1-avi@scylladb.com>	2018-07-08 10:27:03 +01:00
Avi Kivity	7c01e66d53	cql3: query_processor: store and use just local shard reference of storage_proxy Since storage_proxy provides access to the entire cluster, a local shard reference is sufficient. Adjust query_processor to store a reference to just the local shard, rather than a seastar::sharded<storage_proxy> and adjust callers. This simplifies the code a little. Message-Id: <20180415142656.25370-3-avi@scylladb.com>	2018-04-16 10:20:50 +02:00
José Guilherme Vanz	380bc0aa0d	Swap arguments order of mutation constructor Swap arguments in the mutation constructor keeping the same standard from the constructor variants. Refs #3084 Signed-off-by: José Guilherme Vanz <guilherme.sft@gmail.com> Message-Id: <20180120000154.3823-1-guilherme.sft@gmail.com>	2018-01-21 12:58:42 +02:00
Avi Kivity	e44517851e	untyped_result_set: reduce dependencies Forward-declare untyped_result_set and untyped_result_set_row, and remove the include from query_processor.hh. Message-Id: <20170916170859.27612-3-avi@scylladb.com>	2017-09-18 15:15:15 +02:00
Avi Kivity	ebaeefa02b	Merge seastar upstream (seastar namespace) - introduced "seastarx.hh" header, which does a "using namespace seastar"; - 'net' namespace conflicts with seastar::net, renamed to 'netw'. - 'transport' namespace conflicts with seastar::transport, renamed to cql_transport. - "logger" global variables now conflict with logger global type, renamed to xlogger. - other minor changes	2017-05-21 12:26:15 +03:00
Duarte Nunes	9e88b60ef5	mutation: Set cell using clustering_key_prefix Change the clustering key argument in mutation::set_cell from exploded_clustering_prefix to clustering_key_prefix, which allows for some overall code simplification and fewer copies. This mostly affects the cql3 layer. Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2017-05-04 15:59:50 +02:00
Vlad Zolotarov	a9f6e5f8da	db::batchlog_manager: move collectd registration to the metrics registration layer Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>	2017-01-10 16:24:54 -05:00

82 Commits