scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-04-24 02:20:37 +00:00

Author	SHA1	Message	Date
Gleb Natapov	ac88935baa	Provide available memory size to storage_proxy object during creation	2018-06-11 15:34:13 +03:00
Piotr Sarna	f12fdcffdb	storage_proxy: restore optional hinted handoff Since hinted handoff for materialized views is now a separate entity, regular hinted handoff can go back to being optional.	2018-06-04 09:46:06 +02:00
Piotr Sarna	a6aae369da	storage_proxy: add hints manager for views This commit adds a separate hints manager that serves only failed materialized view updates.	2018-06-04 09:46:06 +02:00
Piotr Sarna	ef40f7e628	hints: move send limiter to resource manager Send limiting semaphore is moved from hints manager to resource manager. In consequence, hints manager now keeps a reference to its resource manager.	2018-06-04 09:35:58 +02:00
Piotr Sarna	ffe52681ea	storage_proxy: add mv stats to write handler Previous patch for issue 3416 did not cover passing write stats to write response handler, which results in some write stats being incorrectly counted as user write stats, while they belong to materialized views. This one fixes that by passing correct write stats reference to write response handler constructor. Also at: https://github.com/psarna/scylla/commits/fix_3416_again Closes #3416 Message-Id: <53ef3cc96ccadfdad8992d92ed6a41473419eb0a.1527510473.git.sarna@scylladb.com>	2018-05-28 17:50:49 +01:00
Piotr Sarna	1d590b3ca4	storage_proxy: decouple write_stats from stats This commit extracts metrics related to writes from stats structure, so it can be easily replaced later, e.g. for materialized view metrics. References #3385 References #3416	2018-05-22 16:52:58 +02:00
Piotr Sarna	f5d6326ced	storage_proxy: enable hinted handoff for materialized views This commit initializes and enables hinted handoff for materialized views, even if HH is not explicitly turned on in config. User writes still use hinted handoff only if it is explicitly enabled, while materialized views are allowed to use it unconditionally in order to store failed replica updates somewhere. Fixes #3383	2018-05-21 17:09:27 +02:00
Duarte Nunes	a23bda3393	Merge 'Implement separate timeout for range queries' from Avi " This patchset implements separate timeouts for range queries, and lays the foundations for separate timeouts for other query types. While the feature in itself is worthy, the real motivation is to have the timeouts decided by the caller, instead of storage_proxy. This in turn is required to disentangle each layer behaving differently depending on whether the query is internal or not; instead, the goal is to have each caller declare its needs in terms of consistency level and timeouts, and have the lower layers implement its requirements instead of making their own decisions. Fixes #3013. Tests: unit (release) " * tag '3013/v1.1' of https://github.com/avikivity/scylla: storage_proxy: remove default_query_timeout() storage_proxy: don't use default timeouts query_options: augment with timeout_config thrift: configure thrift transport and handler with a timeout_config transport: configure native transport with a timeout_config cql3: define and populate timeout_config_selector timeout_config: introduce timeout configuration	2018-05-13 20:05:50 +02:00
Botond Dénes	ddd70dc113	Use dht::token_range alias for last/preferred replicas Use the pre-existing type alias instead of fully spelling out the type everywhere.	2018-05-10 06:22:39 +03:00
Botond Dénes	52affa2a61	storage_proxy::coordinator_query_result: merge constructors into one w/ default params	2018-05-10 06:22:39 +03:00
Vlad Zolotarov	48c96d09d6	db::hints::manager: drain hints when the node is decommissioned/removed When a node is decommissioned/removed it will drain all its hints and all remote nodes that have hints to it will drain their hints to this node. What does "drain" mean? - The node that "drains" hints to a specific destination will ignore failures and will continue sending hints till the end of the current segment, erase it and move to the next one till there are no more segments left. After all hints are drained the corresponding hints directory is removed. Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>	2018-05-08 22:29:21 +01:00
Avi Kivity	c8a6fe3044	storage_proxy: remove default_query_timeout() No longer used.	2018-04-30 13:19:53 +03:00
Avi Kivity	d8dd7e05a7	storage_proxy: don't use default timeouts Require all callers to supply timeouts instead of relying on defaults. Since all callers now have the timeouts set up, they can easily supply them.	2018-04-30 13:19:53 +03:00
Tomasz Grabiec	52c61df930	Relax includes To avoid unnecessary recompilations. Message-Id: <1522168295-994-1-git-send-email-tgrabiec@scylladb.com>	2018-03-28 10:49:07 +03:00
Duarte Nunes	fb54c09e0b	service/storage_proxy: Pass pending endpoints to send_to_endpoint() This will allow us to minimize the number of mutation copies in mutate_MV(). Signed-off-by: Duarte Nunes <duarte@scylladb.com> Message-Id: <20180325121412.76844-1-duarte@scylladb.com>	2018-03-25 15:45:21 +03:00
Botond Dénes	eee9bda85b	Make the read-repair decision only once Make the read-repair decision on the first page of a paged-query and use it for all the remaining pages. This helps querier-cache hit-rates as reads to nodes will be sent consistently throughout the query.	2018-03-19 16:29:43 +02:00
Botond Dénes	2e2abf6edb	storage_proxy: add coordinator_query_options and coordinator_query_result As yet more parameters and return-values are about to be added to all storage_proxy::query_* methods we need a way that scales better than changing the signatures every time. To this end we aggregate all non-mandatory query parameters into `coordinator_query_options` and all return values into `coordinator_query_result`. This way new fields can be simply added to the respective structs while the signatures of the methods themselves and their client code can remain unchanged.	2018-03-19 15:17:35 +02:00
Botond Dénes	aaf67bcbaa	Consider preferred replicas when choosing endpoints for query_singular() Propagate the preferred_replicas to db::filter_for_query() and consider them when selecting the endpoints. The algorithm for selecting the endpoints is as follows: * Compute the intersection of the endpoint candidates and the preferred endpoints. * If this yields a set of endpoints that already satisfies the CL requirements use this set. * Otherwise select the remaining endpoints according to the load-balancing strategy, just like before.	2018-03-13 10:34:34 +02:00
Botond Dénes	eac597d726	Add preferred and last replicas to the signature of query() preferred_replicas are added to the parameters and last_replicas are added to the return type. The preferred replicas will be used as a hint for the selection of the replicas to send the read requests to. The last replicas (returned) are the replicas actually selected for the read. This will allow queries to consistently hit the same replicas for each page thus reusing readers created on these replicas. For convenience a query() overload is provided that doesn't take or return the preferred and last replicas. This patch only adds the parameters and propagates them down to query_singular() and query_partition_key_range(). The code to actually use these preferred-replicas will be added in later patches. The reason for separating this is to reduce noise and improve reviewability for those functional changes later.	2018-03-13 10:34:34 +02:00
Duarte Nunes	440ea56010	message/messaging_service: Specify algorithm when requesting digest While not strictly needed, specify which algorithm to use when requesting a digest from a remote node. This is more flexible than relying on a cluster wide feature, although that's what we'll do in subsequent patches. It also makes the verb more consistent with the data request. Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2018-02-01 01:02:50 +00:00
Duarte Nunes	6b4b429883	query-result: Introduce class result_options Introduce class result_options to carry result options through the request pipeline, which at this point mean the result type and the digest algorithm. This class allows us to encapsulate the concrete digest algorithm to use. Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2018-02-01 00:22:50 +00:00
Glauber Costa	08a0c3714c	allow request-specific read timeouts in storage proxy reads Timeouts are a global property. However, for tables in keyspaces like the system keyspace, we don't want to uphold that timeout--in fact, we want no timeout there at all. We already apply such configuration for requests waiting in the queued sstable queue: system keyspace requests won't be removed. However, the storage proxy will insert its own timeouts in those requests, causing them to fail. This patch changes the storage proxy read layer so that the timeout is applied based on the column family configuration, which is in turn inherited from the keyspace configuration. This matches our usual way of passing db parameters down. In terms of implementation, we can either move the timeout inside the abstract read executor or keep it external. The former is a bit cleaner, the latter has the nice property that all executors generated will share the exact same timeout point. In this patch, we chose the latter. We are also careful to propagate the timeout information to the replica. So even if we are talking about the local replica, when we add the request to the concurrency queue, we will do it in accordance with the timeout specified by the storage proxy layer. After this patch, Scylla is able to start just fine with very low timeouts--since read timeouts in the system keyspace are now ignored. Fixes #2462 Implementation notes, and general comments about open discussion in 2462: * Because we are not bypassing the timeout, just setting it high enough, I consider the concerns about the batchlog moot: if we fail for any other reason that will be propagated. Last case, because the timeout is per-CF, we could do what we do for the dirty memory manager and move the batchlog alone to use a different timeout setting. * Storage proxy likes specifying its timeouts as a time_point, whereas when we get low enough as to deal with the read_concurrency_config, we are talking about deltas. So at some point we need to convert time_points to durations. We do that in the database query functions. v2: - use per-request instead of per-table timeouts. Signed-off-by: Glauber Costa <glauber@scylladb.com>	2018-01-12 07:43:21 -05:00
Nadav Har'El	73aad5736f	Fix compilation of tests/cql_test_env.cc In commit `1f4f71e619`, an stdx::optional<std::vector<sstring>> parameter was added to storage_proxy's constructor. However, this parameter was not made optional, and tests/cql_test_env.cc failed to compile because it didn't provide this parameter. This patch makes this parameter optional (if missing, it's like an empty stdx::optional) so cql_test_env.cc compiles. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20171218132121.18782-1-nyh@scylladb.com>	2017-12-18 15:32:54 +02:00
Vlad Zolotarov	1f4f71e619	main + storage_service: wire up hints generation Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>	2017-12-14 15:08:11 -05:00
Gleb Natapov	ddf117535a	storage_proxy: add counters for speculative reads Fixes #3030 Message-Id: <20171206143611.8756-1-gleb@scylladb.com>	2017-12-06 16:38:16 +02:00
Gleb Natapov	16964de1f3	storage_proxy: fail read/write requests early if it cannot be completed due to errors If errors make reaching CL impossible a request can be aborted earlier without waiting for timeout.	2017-12-05 16:46:25 +02:00
Gleb Natapov	d0d8bdf615	storage_proxy: remove unused parameter from get_restricted_ranges() function Message-Id: <20170911084653.GH24167@scylladb.com>	2017-09-11 11:58:44 +02:00
Duarte Nunes	ec75eac37d	ring_position_exponential_vector_sharder: Take ranges by rvalue Avoids some copies. Signed-off-by: Duarte Nunes <duarte@scylladb.com> Message-Id: <20170814093310.29200-1-duarte@scylladb.com>	2017-08-14 12:55:43 +03:00
Gleb Natapov	3b7d8c8767	storage_proxy: add capability to read data/digest for non singular ranges Currently only mutation_data read supports non singular ranges. This patch extends data/digest reads to support them too.	2017-08-03 10:35:09 +03:00
Gleb Natapov	69c5526301	messaging_service: return cache hit ratio as part of data read	2017-06-13 09:57:14 +03:00
Paweł Dziepak	cfde2ad5b4	storage_proxy: make mutate() an execution stage	2017-03-09 09:27:43 +00:00
Paweł Dziepak	00b42c477f	storage_proxy: count counter updates for which the node was a leader	2017-03-02 09:05:12 +00:00
Paweł Dziepak	277501f42f	db: propagate tracing state for counter writes	2017-03-02 09:05:10 +00:00
Paweł Dziepak	25173f8095	db: propagate timeout for counter writes	2017-03-02 09:05:10 +00:00
Paweł Dziepak	426345e1d4	storage_proxy: avoid excessive mutation freezes	2017-03-01 16:33:36 +00:00
Calle Wilund	0a4edca756	counters/cql: allow wormholing actual counter values (with shards) via cql Adds yet another magic function "SCYLLA_COUNTER_SHARD_LIST", indicating that argument value, which must be a list of tuples <int, UUID, long, long>, should be inserted as an actual counter value, not update. This of course to allow counters to be read from sstable loader. Note that we also need to allow timestamps for counter mutations, as well as convince the counter code itself to treat the data as already baked. So ugly wormhole galore. v2: * Changed flag names * More explicit wormholing, bypassing normal counter path, to avoid read-before-write etc * throw exceptions on unhandled shard types in marshalling v3: * Added counter id ordering check * Added batch statement check for mixing normal and raw counter updates Message-Id: <1487683665-23426-2-git-send-email-calle@scylladb.com>	2017-02-22 09:19:46 +00:00
Nadav Har'El	f2fd81ece0	materialized views: function to send a mutation to endpoint Add a function for sending one mutation to one remote replica owning this mutation. This is needed for materialized views, where each base replica sends each view mutation to one particular view replica. Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2017-02-06 13:36:45 +01:00
Gleb Natapov	3c372525ed	storage_proxy: use storage_proxy clock instead of explicit lowres_clock Merge commit `45b6070832` used butchered version of storage_proxy patch to adjust to rpc timer change instead the one I've sent. This patch fixes the differences. Message-Id: <20170206095237.GA7691@scylladb.com>	2017-02-06 12:51:36 +02:00
Paweł Dziepak	1e8814f5ce	storage_proxy: support counter updates	2017-02-02 10:35:14 +00:00
Paweł Dziepak	c14c6b753b	storage_proxy: add get_live_endpoints()	2017-02-02 10:35:14 +00:00
Amnon Heiman	45b6070832	Merge seastar upstream * seastar 397685c...c1dbd89 (13): > lowres_clock: drop cache-line alignment for _timer > net/packet: add missing include > Merge "Adding histogram and description support" from Amnon > reactor: Fix the error: cannot bind 'std::unique_ptr' lvalue to 'std::unique_ptr&&' > Set the option '--server' of tests/tcp_sctp_client to be required > core/memory: Remove superfluous assignment > core/memory: Remove dead code > core/reactor: Use logger instead of cerr > fix inverted logic in overprovision parameter > rpc: fix timeout checking condition > rpc: use lowres_clock instead of high resolution one > semaphore: make semaphore's clock configurable > rpc: detect timedout outgoing packets earlier Includes treewide change to accommodate rpc changing its timeout clock to lowres_clock. Includes fixup from Amnon: collectd api should use the metrics getters As part of the preparation for the change in the metrics layer, this changes the way the collectd api uses the metrics value to use the getters instead of calling the member directly. This will be important when the internal implementation changes from union to variant. Signed-off-by: Amnon Heiman <amnon@scylladb.com> Message-Id: <1485457657-17634-1-git-send-email-amnon@scylladb.com>	2017-02-01 14:39:08 +02:00
Gleb Natapov	64660397fc	storage_proxy: move operation type information from counter's name to a label Makes it much more flexible to view the data in various ways in Grafana. Message-Id: <20170126102746.GL11469@scylladb.com>	2017-01-26 12:38:29 +02:00
Gleb Natapov	ccee01f352	storage_proxy: put datacenter name into a label instead of counter's name Having datacenter name as a label makes it possible to create Prometheus board for the counters. Message-Id: <20170124132051.GX11469@scylladb.com>	2017-01-24 15:27:34 +02:00
Gleb Natapov	76aed548e3	storage_proxy: add replica side counters for data read Message-Id: <20170112085907.GN11469@scylladb.com>	2017-01-12 11:41:04 +02:00
Paweł Dziepak	1a52569f7d	storage_proxy: pass maximum result size to replicas We may want to change the default individual result size limit in the future. If it is provided by the coordinator and not hardcoded in the replicas this can be done without causing data query digest mismatches or wasteful mutation query results.	2016-12-22 17:16:23 +01:00
Asias He	937f28d2f1	Convert to use dht::partition_range_vector and dht::token_range_vector	2016-12-19 14:08:50 +08:00
Asias He	e5485f3ea6	Get rid of query::partition_range Use dht::partition_range instead	2016-12-19 08:09:25 +08:00
Duarte Nunes	c2072c7dc9	storage_proxy: Decrease limits when retrying command This patch changes a read_command's limits when retrying it, so that we don't ask for more rows than necessary. Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2016-12-15 10:41:06 +00:00
Duarte Nunes	9572c19dc6	storage_proxy: Don't fetch superfluous partitions This patch ensures we keep track of how many partitions we've queried so we don't ask for more than the number we need. Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2016-12-15 10:27:46 +00:00
Gleb Natapov	a05516f14c	storage_proxy: wire up range_slice_timeouts, range_slice_unavailables and read_unavailables counters Message-Id: <20161206105154.GL1866@scylladb.com>	2016-12-08 11:42:52 +02:00
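
The endpoint-selection steps listed in commit aaf67bcbaa (intersect candidates with preferred replicas, use that set if it satisfies the CL, otherwise fill up via the load-balancing strategy) can be sketched roughly as below. This is an illustrative Python sketch, not Scylla's C++ code: the function name `select_endpoints`, the `required` count standing in for the CL requirement, and the `order_by_load` callable standing in for the load-balancing strategy are all hypothetical.

```python
from typing import Callable, List, Sequence


def select_endpoints(
    candidates: Sequence[str],
    preferred: Sequence[str],
    required: int,
    order_by_load: Callable[[List[str]], List[str]],
) -> List[str]:
    """Pick read endpoints, favouring preferred replicas (hypothetical sketch).

    candidates    -- live replicas that could serve the read
    preferred     -- replicas used for the previous page, if any
    required      -- assumed stand-in for how many replicas the CL needs
    order_by_load -- assumed stand-in for the load-balancing strategy
    """
    preferred_set = set(preferred)

    # Step 1: intersection of the candidates and the preferred endpoints.
    chosen = [ep for ep in candidates if ep in preferred_set]

    # Step 2: if the intersection already satisfies the CL, use it as is.
    if len(chosen) >= required:
        return chosen

    # Step 3: otherwise fill the remaining slots using the load-balancing
    # order over the non-preferred candidates.
    remaining = [ep for ep in candidates if ep not in preferred_set]
    for ep in order_by_load(remaining):
        if len(chosen) == required:
            break
        chosen.append(ep)
    return chosen
```

With `sorted` as a trivial placeholder ordering, `select_endpoints(["a", "b", "c"], ["c"], 2, sorted)` keeps the preferred replica `"c"` and tops up with one more candidate, which is the behaviour that lets consecutive pages of a query keep hitting the same replicas.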

154 Commits