Commit Graph

285 Commits

Asias He
0379d0c031 repair: Send reason for node operations
Since 956b092012 (Merge "Repair based node
operation" from Asias), repair is used by other node operations like
bootstrap, decommission and so on.

Send the reason for the repair, so that we can handle the materialized
view update correctly according to the reason for the operation. We want
to trigger the view update only if the repair is run as an actual repair
operation. Otherwise, the view table would be handled twice: 1) when the
view table is synced using repair, and 2) when the base table is synced
using repair and the view table update is triggered.
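
A minimal sketch of the idea, assuming a hypothetical stream_reason enum
and helper (the real names and set of operations may differ):

#include <cstdint>

enum class stream_reason : uint8_t {
    unspecified,
    bootstrap,
    decommission,
    removenode,
    rebuild,
    repair,
};

// Trigger materialized view updates only for an actual repair; during
// node operations the view tables are themselves synced with repair, so
// generating updates here would apply them twice.
inline bool should_trigger_view_update(stream_reason reason) {
    return reason == stream_reason::repair;
}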

Fixes #5930
Fixes #5998

(cherry picked from commit 066934f7c4)
2020-04-16 10:06:17 +03:00
Gleb Natapov
121cd383fa lwt: remove entries from system.paxos table after successful learn stage
The learning stage of the PAXOS protocol leaves behind an entry in the
system.paxos table with the last learned value (which can be large). In
case not all participants learned it successfully, the next round on the
same key may complete the learning using this info. But if all nodes
learned the value, the entry no longer serves any useful purpose.

The patch adds another round, "prune", which is executed in the background
(limited to 1000 simultaneous instances) and removes the entry in
case all nodes replied successfully to the "learn" round.  It uses the
ballot's timestamp to do the deletion, so as not to interfere with the
next round. Since the deletion happens very close to the previous writes,
it will likely happen in the memtable and never reach an sstable, which
reduces memtable flush and compaction overhead.
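
A minimal sketch of the background prune, assuming a hypothetical
send_prune_rpc() helper and using Seastar's semaphore to cap concurrency
(types and names illustrative):

#include <seastar/core/semaphore.hh>
#include <seastar/core/future.hh>

// At most 1000 prune rounds may run in the background simultaneously.
static thread_local seastar::semaphore prune_limiter{1000};

void schedule_prune(dht::token token, utils::UUID ballot) {
    // Started in the background; callers do not wait for the result.
    (void)seastar::with_semaphore(prune_limiter, 1, [token, ballot] {
        // Delete the system.paxos entry at the ballot's timestamp so the
        // deletion cannot clobber a newer round on the same key.
        return send_prune_rpc(token, ballot);
    }).handle_exception([] (std::exception_ptr) {
        // Prune is best effort: a failed prune only leaves the entry behind.
    });
}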

Fixes #5779

Message-Id: <20200330154853.GA31074@scylladb.com>
(cherry picked from commit 8a408ac5a8)
2020-04-02 15:36:52 +02:00
Gleb Natapov
5753ab7195 lwt: drop invoke_on in paxos_state prepare and accept
Since LWT requests now run on the owning shard, there is no longer a need
to make a cross-shard call at the paxos_state level. RPC calls may still
arrive at the wrong shard, so we need to make the cross-shard call there.
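A sketch of the resulting division of labour (request/response types and
helpers hypothetical): the hop to the owning shard, if needed, happens once
at the RPC layer, and paxos_state runs shard-locally:

#include <seastar/core/smp.hh>

future<prepare_response> handle_prepare_rpc(prepare_request req) {
    unsigned shard = shard_for_token(req.token);   // hypothetical helper
    // smp::submit_to() runs the function inline when already on `shard`.
    return seastar::smp::submit_to(shard, [req = std::move(req)] () mutable {
        return paxos_state::prepare(std::move(req));   // no invoke_on inside
    });
}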
2020-01-13 10:26:02 +02:00
Benny Halevy
9ec98324ed messaging_service: unregister_handler: return rpc unregister_handler future
Now that seastar returns it.

Fixes https://github.com/scylladb/scylla/issues/5228

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Message-Id: <20191212143214.99328-1-bhalevy@scylladb.com>
2019-12-12 16:38:36 +02:00
Benny Halevy
105c8ef5a9 messaging_service: wait on unregister_handler
Prepare for returning future<> from seastar rpc
unregister_handler.

Refs https://github.com/scylladb/scylla/issues/5228

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Message-Id: <20191208153924.1953-1-bhalevy@scylladb.com>
2019-12-11 14:17:41 +02:00
Piotr Dulikowski
adfa7d7b8d messaging_service: don't move unsigned values in handlers
Performing std::move on integral types is pointless. This commit gets
rid of moves of values of `unsigned` type in rpc handlers.
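An illustration of the point: for integral types a move is just a copy, so
the std::move only adds noise.

#include <utility>

void take(unsigned v);

void handler(unsigned shard) {
    take(std::move(shard));  // generates exactly the same code as...
    take(shard);             // ...the plain copy, minus the noise
}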
2019-12-05 00:58:31 +01:00
Piotr Dulikowski
2e802ca650 hh: add HINT_MUTATION verb
Introduce a new verb dedicated for receiving and sending hints:
HINT_MUTATION. It is handled on the streaming connection, which is
separate from the one used for handling mutations sent by coordinator
during a write.

The intent of using a separate connection is to increase fairness when
handling hints and user requests - this way we avoid a situation in which
one type of request saturates the connection, negatively impacting the
other one.
2019-12-05 00:51:49 +01:00
Vladimir Davydov
bf5f864d80 paxos: piggyback result query on prepare response
Current LWT implementation uses at least three network round trips:
 - first, execute PAXOS prepare phase
 - second, query the current value of the updated key
 - third, propose the change to participating replicas

(there's also a learn phase, but we don't wait for it to complete).

The idea behind the optimization implemented by this patch is simple:
piggyback the current value of the updated key on the prepare response
to eliminate one round trip.

To generate less network traffic, only the replica closest to the
coordinator sends data, while the other participating replicas send
digests, which are used to check data consistency.

Note, this patch changes the API of some RPC calls used by PAXOS, but
this should be okay as long as the feature is in an early development
stage and marked experimental.
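
A rough sketch of the shape of the piggybacked response (names
illustrative; the real definition lives in the IDL):

#include <optional>
#include <variant>

struct prepare_response {
    // ...the usual promise fields from the prepare phase...
    // Piggybacked read: the replica closest to the coordinator fills this
    // with the full row data, the other replicas with only a digest, which
    // lets the coordinator skip the separate read round trip.
    std::optional<std::variant<query::result, query::result_digest>> data_or_digest;
};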

To assess the impact of this optimization on LWT performance, I ran a
simple benchmark that starts a number of concurrent clients each of
which updates its own key (uncontended case) stored in a cluster of
three AWS i3.2xlarge nodes located in the same region (us-west-1) and
measures the aggregate bandwidth and latency. The test uses shard-aware
gocql driver. Here are the results:

                latency 99% (ms)    bandwidth (rq/s)    timeouts (rq/s)
    clients     before  after       before  after       before  after
          1          2      2          626    637            0      0
          5          4      3         2616   2843            0      0
         10          3      3         4493   4767            0      0
         50          7      7        10567  10833            0      0
        100         15     15        12265  12934            0      0
        200         48     30        13593  14317            0      0
        400        185     60        14796  15549            0      0
        600        290     94        14416  15669            0      0
        800        568    118        14077  15820            2      0
       1000        710    118        13088  15830            9      0
       2000       1388    232        13342  15658           85      0
       3000       1110    363        13282  15422          233      0
       4000       1735    454        13387  15385          329      0

That is, this optimization improves max LWT bandwidth by about 15%
and allows running 3-4x more clients while maintaining the same level
of system responsiveness.
2019-11-24 11:35:29 +02:00
Vladimir Davydov
3d1d4b018f paxos: remove unnecessary move constructor invocations
invoke_on() guarantees that captured objects won't be destroyed until the
future returned by the invoked function resolves, so there's no need
to move the key, token, and proposal when calling the paxos_state::*_impl
helpers.
2019-11-24 11:35:29 +02:00
Gleb Natapov
8d6201a23b lwt: Add RPC verbs needed for paxos implementation
The Paxos protocol has three stages: prepare, accept, and learn. This
patch adds an rpc verb for each of those stages. To be terminology-compatible
with Cassandra, the patch calls those stages prepare, propose, and commit.
2019-10-27 23:21:51 +03:00
Avi Kivity
ba64ec78cf messaging_service: use rpc::tuple instead of variadic futures for rpc
Since variadic future<> is deprecated, switch to rpc::tuple for multiple
return values in rpc calls. This is a more or less mechanical translation.
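The translation pattern, roughly (foo/bar are placeholder types for
illustration):

#include <seastar/core/future.hh>
#include <seastar/rpc/rpc_types.hh>

struct foo {};
struct bar {};

// Before: the deprecated variadic form, future<foo, bar> handle_verb();
// After: a single rpc::tuple return value carries both results.
seastar::future<seastar::rpc::tuple<foo, bar>> handle_verb() {
    return seastar::make_ready_future<seastar::rpc::tuple<foo, bar>>(
            seastar::rpc::tuple<foo, bar>(foo{}, bar{}));
}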
2019-09-26 12:09:31 +02:00
Gleb Natapov
73e3d0a283 messaging_service: enable reuseaddr on messaging service rpc
Fixes #4943

Message-Id: <20190918152405.GV21540@scylladb.com>
2019-09-19 11:43:03 +03:00
Gleb Natapov
9e9f64d90e messaging_service: configure different streaming domain for each rpc server
A streaming domain identifies a server across shards. Each server should
have a different one.
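
The gist, sketched against Seastar's rpc server options (the domain
values are illustrative):

seastar::rpc::server_options tcp_opts, tls_opts;
// Each rpc server gets its own domain, so a stream connection can be
// matched to the right per-shard server instance.
tcp_opts.streaming_domain = seastar::rpc::streaming_domain_type(1);
tls_opts.streaming_domain = seastar::rpc::streaming_domain_type(2);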

Fixes: #4953

Message-Id: <20190908085327.GR21540@scylladb.com>
2019-09-08 14:05:40 +03:00
Botond Dénes
7adc764b6e messaging_service: add canonical_support to schema pull and push verbs
The verbs are:
* DEFINITIONS_UPDATE (push)
* MIGRATION_REQUEST (pull)

Support was added in a backward-compatible way. The push verb sends
both the old frozen mutation parameter and the new optional canonical
mutation parameter. It is expected that new nodes will use the latter,
while old nodes will fall back to the former. The pull verb has a new
optional `options` parameter, which for now contains a single flag:
`remote_supports_canonical_mutation_retval`. This flag, if set, means
that the remote node supports the new canonical mutation return value,
so the old frozen mutations return value can be left empty.
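The compatibility pattern, sketched with hypothetical signatures:

// Push (DEFINITIONS_UPDATE): both representations travel on the wire; new
// nodes use the canonical mutations, old nodes fall back to the frozen ones.
future<> definitions_update(std::vector<frozen_mutation> fm,
        rpc::optional<std::vector<canonical_mutation>> cm);

// Pull (MIGRATION_REQUEST): the optional options struct tells the remote
// that the frozen-mutation part of the return value may be left empty.
struct schema_pull_options {
    bool remote_supports_canonical_mutation_retval = true;
};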
2019-09-04 10:32:44 +03:00
Botond Dénes
fddd9a88dd treewide: silence discarded future warnings for legit discards
This patch silences those future-discard warnings where it is clear that
discarding the future was actually the intent of the original author,
*and* they took the necessary precautions (handling errors). The patch
also adds some trivial error handling (logging the error) in a few
places that were lacking it but otherwise look OK. No functional
changes.
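The shape of a legitimate discard, sketched (the logger name and
do_background_work() are illustrative):

// Background work whose future is knowingly dropped; failures are logged
// rather than silently swallowed.
(void)do_background_work().handle_exception([] (std::exception_ptr ep) {
    mylog.error("background work failed: {}", ep);
});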
2019-08-26 18:54:44 +03:00
Asias He
49a73aa2fc streaming: Move stream_mutation_fragments_cmd to a new file (#4812)
Avoid including the lengthy stream_session.hh in messaging_service.

More importantly, fix the build, because currently messaging_service.cc
and messaging_service.hh do not include stream_mutation_fragments_cmd.
I am not sure why it builds on my machine. Spotted this when backporting
the "streaming: Send error code from the sender to receiver" patch to
the 3.0 branch.

Refs: #4789
2019-08-07 14:59:46 +02:00
Asias He
bac987e32a streaming: Send error code from the sender to receiver
In case of an error on the sender side, the sender does not propagate the
error to the receiver; it simply closes the stream. As a result,
the receiver gets nullopt from the source in
get_next_mutation_fragment and passes a mutation_fragment_opt with no
value to the generating_reader. In turn, the generating_reader generates
end of stream. However, the last element the generating_reader has
generated can be any type of mutation_fragment. This makes the sstable
that consumes the generating_reader violate the mutation_fragment
stream rule.

To fix this, we need to propagate the error. However, RPC streaming does
not support propagating the error within the framework; the user has to
send an error code explicitly.
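
A sketch of the explicit error signalling (the enum mirrors the
stream_mutation_fragments_cmd mentioned in a later commit; the values
are illustrative):

#include <cstdint>

enum class stream_mutation_fragments_cmd : uint8_t {
    error,                    // sender hit an error; receiver must abort
    mutation_fragment_data,   // a regular fragment follows
    end_of_stream,            // clean end of the fragment stream
};

// Receiver side: only a clean end_of_stream may terminate the reader; an
// explicit error frame aborts the sstable write instead of sealing it.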

Fixes: #4789
2019-08-06 16:54:56 +02:00
Asias He
5d3e4d7b73 messaging_service: Check if messaging_service is stopped before get_rpc_client
get_rpc_client assumes the messaging_service is not stopped. We should check
is_stopping() before we call get_rpc_client.

We already do such a check in existing code, e.g., send_message and
friends. Do the same check in the newly introduced
make_sink_and_source_for_stream_mutation_fragments() and friends for row
level repair.

Fixes: #4767
2019-07-31 11:44:57 +03:00
Calle Wilund
c540e36fe2 gms::inet_address: Make serialization ipv6 aware
Because inet_address was initially hardcoded to
ipv4, its wire format is not very forward compatible.
Since we potentially need to communicate with older-version nodes, we
manually define the new serialization format for inet_address to be:

ipv4: 4  bytes address
ipv6: 4  bytes marker 0xffffffff (invalid address)
      16 bytes data -> address
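
A sketch of the described format (all helper names hypothetical):

void serialize(const gms::inet_address& addr, bytes_ostream& out) {
    if (addr.is_ipv4()) {
        write_be<uint32_t>(out, addr.ipv4_value());   // 4 bytes: the address
    } else {
        write_be<uint32_t>(out, 0xffffffff);          // 4-byte ipv6 marker
        write_bytes(out, addr.ipv6_bytes(), 16);      // 16 bytes: the address
    }
}

An old node reading the first 4 bytes sees either a valid ipv4 address or
the invalid-address marker, which keeps the format backward compatible.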
2019-07-08 14:13:09 +00:00
Calle Wilund
e9816efe06 Remove usage of inet_address::raw_addr() 2019-07-08 14:13:09 +00:00
Calle Wilund
4ef940169f Replace use of "ipv4_addr" with socket_address
Allows the various sockets to use ipv6 address binding if so configured.
2019-07-08 14:13:09 +00:00
Asias He
37b3de4ea0 messaging_service: Add REPAIR_GET_FULL_ROW_HASHES_WITH_RPC_STREAM support
It is used by row level repair.
2019-07-02 21:18:55 +08:00
Asias He
a7c7ba9765 messaging_service: Add REPAIR_PUT_ROW_DIFF_WITH_RPC_STREAM support
It is used by row level repair.
2019-07-02 21:18:55 +08:00
Asias He
dc92bda93b messaging_service: Add REPAIR_GET_ROW_DIFF_WITH_RPC_STREAM support 2019-07-02 21:18:55 +08:00
Asias He
f312c95b74 messaging_service: Add do_make_sink_source helper
It is used by the row level repair rpc stream verbs to make sink and
source objects.
2019-07-02 21:18:55 +08:00
Asias He
bc295a00a6 messaging_service: Add rpc stream verb for row level repair
- REPAIR_GET_ROW_DIFF_WITH_RPC_STREAM

Get repair rows from follower nodes

- REPAIR_PUT_ROW_DIFF_WITH_RPC_STREAM

Put repair rows to follower nodes

- REPAIR_GET_FULL_ROW_HASHES_WITH_RPC_STREAM

Get full hashes from follower nodes
2019-07-02 21:18:55 +08:00
Asias He
3db136f81e repair: Use the same schema version for repair master and followers
Before this patch, the repair master and followers each independently used
their own schema version at the point repair started. The schemas can
differ due to schema changes. Repair uses the schema to serialize
mutation_fragments and deserialize the mutation_fragments received from
peer nodes. Using different schema versions to serialize and deserialize
causes undefined behaviour.

To fix this, we use the schema that the repair master decides on for all
the repair nodes involved.

On top of this patch, we could take another step to make sure all nodes
have the latest schema. But let's do that in a separate patch.

Fixes: #4549
Backports: 3.1
2019-06-18 18:27:21 +08:00
Asias He
b463d7039c repair: Introduce get_combined_row_hash_response
Currently, the REPAIR_GET_COMBINED_ROW_HASH RPC verb returns only the
repair_hash object. In the future, we will use a set reconciliation
algorithm to decode the full row hashes in the working row buf. It is
useful to return the number of rows inside the working row buf, in
addition to the combined row hashes, to make sure the decode is successful.

It is also better to use a wrapper class for the verb response so we can
more easily extend the return values later with IDL.
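
A sketch of the wrapper (illustrative; the real definition lives in the
IDL):

struct get_combined_row_hash_response {
    // Combined hash over the rows in the working row buf.
    repair_hash working_row_buf_combined_csum;
    // The row count lets the caller verify that a future
    // set-reconciliation decode recovered every row.
    uint64_t working_row_buf_num_rows;
    // New fields can be appended later without breaking old nodes,
    // thanks to IDL's extension rules.
};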

Fixes #4526
Message-Id: <93be47920b523f07179ee17e418760015a142990.1559771344.git.asias@scylladb.com>
2019-06-12 13:51:29 +03:00
Gleb Natapov
1d851a3892 messaging: catch an error that sending of CLIENT_ID may return
Avoid a warning about an unhandled exception.

Message-Id: <20190506122718.GL21208@scylladb.com>
2019-05-06 18:13:51 +03:00
Paweł Dziepak
d47ea66ec6 messaging_service: add lz4_fragmented RPC compressor
Seastar now supports two RPC compression algorithms: the original LZ4 one
and LZ4_FRAGMENTED. The latter uses the lz4 streaming interface, which
allows it to process large messages without fully linearising them. Since
RPC requests used by Scylla often contain user-provided data that could
potentially be very large, LZ4_FRAGMENTED is a better choice for the
default compression algorithm.

Message-Id: <20190417144318.27701-1-pdziepak@scylladb.com>
2019-04-18 19:07:14 +03:00
Gleb Natapov
1abc50ad8a messaging_service: make sure a client is unique for a destination
The function messaging_service::get_rpc_client() is supposed to either
return an existing client or create one and return it. The function is
supposed to be atomic, so after checking that the requested client does
not exist, it is safe to assume emplace() will succeed. But we saw bugs
that made the function non-atomic. Let's add an assert that will help
catch such bugs more easily if they happen in the future.
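
The invariant, sketched (member and helper names illustrative):

#include <cassert>

auto it = _clients[idx].find(id);
if (it != _clients[idx].end()) {
    return it->second.rpc_client;   // reuse the existing client
}
auto [new_it, inserted] = _clients[idx].emplace(id, make_client(id));
// No yield may happen between find() and emplace(), so the insertion
// must succeed; a failure here means the function stopped being atomic.
assert(inserted);
return new_it->second.rpc_client;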

Message-Id: <20190326115741.GX26144@scylladb.com>
2019-03-26 14:19:08 +02:00
Gleb Natapov
bb93d990ad messaging_service: keep shared pointer to an rpc connection while opening mutation fragment stream
The current code captures a reference to rpc::client in a continuation, but
there is no guarantee that the reference will still be valid when the
continuation runs. Capture a shared pointer to the rpc::client instead.
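
The fix pattern, sketched (wrapper and helper names illustrative):

// Before: [&client] left a dangling reference if the client was dropped
// before the continuation ran.
// After: the captured shared pointer keeps the client alive.
shared_ptr<rpc_client_wrapper> client = get_rpc_client(verb, id);
return client->make_stream_sink().then([client] (auto sink) {
    return setup_stream(std::move(sink));
});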

Fixes #4350.

Message-Id: <20190314135538.GC21521@scylladb.com>
2019-03-21 12:46:01 -03:00
Gleb Natapov
a70374d982 messaging_service: do not forget to close stream when sending it to another side failed
Fixes #4124

Message-Id: <20190131091857.GC3172@scylladb.com>
2019-01-31 12:01:56 +02:00
Duarte Nunes
fa2b0384d2 Replace std::experimental types with C++17 std version.
Replace stdx::optional and stdx::string_view with the C++ std
counterparts.

Some instances of boost::variant were also replaced with std::variant,
namely those that called seastar::visit.

Scylla now requires GCC 8 to compile.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Message-Id: <20190108111141.5369-1-duarte@scylladb.com>
2019-01-08 13:16:36 +02:00
Avi Kivity
c96fc1d585 Merge "Introduce row level repair" from Asias
"
=== How the partition level repair works

- The repair master decides which ranges to work on.
- The repair master splits the ranges into sub-ranges, each containing
around 100 partitions.
- The repair master computes the checksum of the 100 partitions and asks the
related peers to compute the checksum of the same 100 partitions.
- If the checksums match, the data in this sub-range is synced.
- If the checksums mismatch, the repair master fetches the data from all the
peers and sends the merged data back to the peers.

=== Major problems with partition level repair

- A mismatch of a single row in any of the 100 partitions causes all 100
partitions to be transferred. A single partition can be very large, not to
mention the size of 100 partitions.

- Checksum (finding the mismatch) and streaming (fixing the mismatch) read
the same data twice.

=== Row level repair

Row level checksum and synchronization: detect row level mismatch and transfer
only the mismatch

=== How the row level repair works

- To solve the problem of reading data twice

Read the data only once for both checksum and synchronization between nodes.

We work on a small range which contains only a few megabytes of rows.
We read all the rows within the small range into memory, find the
mismatch, and send the mismatched rows between peers.

We need to find a sync boundary among the nodes which contains only N bytes of
rows.

- To solve the problem of sending unnecessary data.

We need to find the mismatched rows between nodes and only send the delta.
This is called the set reconciliation problem, a common problem in
distributed systems.

For example:
Node1 has set1 = {row1, row2, row3}
Node2 has set2 = {      row2, row3}
Node3 has set3 = {row1, row2, row4}

To repair:
Node1 fetches nothing from Node2 (set2 - set1), fetches row4 (set3 - set1) from Node3.
Node1 sends row1 and row4 (set1 + set2 + set3 - set2) to Node2.
Node1 sends row3 (set1 + set2 + set3 - set3) to Node3.

=== How to implement repair with set reconciliation

- Step A: Negotiate sync boundary

class repair_sync_boundary {
    dht::decorated_key pk;
    position_in_partition position;
};

Read rows from disk into row buffers until the size is larger than N
bytes. Return the repair_sync_boundary of the last mutation_fragment we
read from disk. The smallest repair_sync_boundary among all nodes is
set as the current_sync_boundary.

- Step B: Get missing rows from peer nodes so that repair master contains all the rows

Request combined hashes from all nodes between last_sync_boundary and
current_sync_boundary. If the combined hashes from all nodes are identical,
the data is synced; go to Step A. If not, request the full hashes from
the peers.

At this point, the repair master knows exactly which rows are missing, and
requests the missing rows from the peer nodes.

Now the local node contains all the rows.

- Step C: Send missing rows to the peer nodes

Since the local node also knows what the peer nodes own, it sends the
missing rows to the peer nodes.
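
One repair round on the master, sketched with the step names above (all
helpers hypothetical; written as a coroutine for brevity):

future<> repair_round() {
    co_await negotiate_sync_boundary();        // Step A
    if (co_await combined_row_hashes_match()) {
        co_return;                             // range already in sync
    }
    co_await request_full_row_hashes();        // Step B: locate the mismatch
    co_await request_row_diff();               // Step B: pull missing rows
    co_await send_row_diff();                  // Step C: push missing rows
}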

=== What the RPC API looks like

- repair_range_start()

Step A:
- request_sync_boundary()

Step B:
- request_combined_row_hashes()
- request_full_row_hashes()
- request_row_diff()

Step C:
- send_row_diff()

- repair_range_stop()

=== Performance evaluation

We created a cluster of 3 Scylla nodes on AWS using i3.xlarge instances. We
created a keyspace with a replication factor of 3 and inserted 1 billion
rows into each of the 3 nodes. Each node has 241 GiB of data.
We tested the 3 cases below.

1) 0% synced: one of the nodes has zero data. The other two nodes have 1 billion identical rows.

Time to repair:
   old = 87 min
   new = 70 min (rebuild took 50 minutes)
   improvement = 19.54%

2) 100% synced: all of the 3 nodes have 1 billion identical rows.
Time to repair:
   old = 43 min
   new = 24 min
   improvement = 44.18%

3) 99.9% synced: each node has 1 billion identical rows and 1 billion * 0.1% distinct rows.

Time to repair:
   old: 211 min
   new: 44 min
   improvement: 79.15%

Bytes sent on wire for repair:
   old: tx = 162 GiB,  rx = 90 GiB
   new: tx = 1.15 GiB, rx = 0.57 GiB
   improvement: tx = 99.29%, rx = 99.36%

It is worth noting that row level repair sends and receives exactly the
number of rows needed in theory.

In this test case, the repair master needs to receive 2 million rows and
send 4 million rows. Here are the details: each node has 1 billion *
0.1% distinct rows, that is, 1 million rows. So the repair master receives
1 million rows from repair slave 1 and 1 million rows from repair slave 2.
The repair master sends its own 1 million rows plus the 1 million rows
received from repair slave 1 to repair slave 2, and sends its own 1
million rows plus the 1 million rows received from repair slave 2 to
repair slave 1.

In the results, we saw that the rows on the wire were as expected.

tx_row_nr  = 1000505 + 999619 + 1001257 + 998619 (4 shards, the numbers are for each shard) = 4'000'000
rx_row_nr  =  500233 + 500235 +  499559 + 499973 (4 shards, the numbers are for each shard) = 2'000'000

Fixes: #3033

Tests: dtests/repair_additional_test.py
"

* 'asias/row_level_repair_v7' of github.com:cloudius-systems/seastar-dev: (51 commits)
  repair: Enable row level repair
  repair: Add row_level_repair
  repair: Add docs for row level repair
  repair: Add repair_init_messaging_service_handler
  repair: Add repair_meta
  repair: Add repair_writer
  repair: Add repair_reader
  repair: Add repair_row
  repair: Add fragment_hasher
  repair: Add decorated_key_with_hash
  repair: Add get_random_seed
  repair: Add get_common_diff_detect_algorithm
  repair: Add shard_config
  repair: Add suportted_diff_detect_algorithms
  repair: Add repair_stats to repair_info
  repair: Introduce repair_stats
  flat_mutation_reader:  Add make_generating_reader
  storage_service: Introduce ROW_LEVEL_REPAIR feature
  messaging_service: Add RPC verbs for row level repair
  repair: Export the repair logger
  ...
2018-12-25 13:13:00 +02:00
Duarte Nunes
ede5742f9b service/storage_proxy: Send view update backlog from replicas
Change the inter-node protocol so we can propagate the view update
backlog from a base replica to the coordinator through the
mutation_done and mutation_failed verbs.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2018-12-19 22:38:30 +00:00
Botond Dénes
1865e5da41 treewide: remove include database.hh from headers where possible
Many headers don't really need to include database.hh, the include can
be replaced by forward declarations and/or including the actually needed
headers directly. Some headers don't need this include at all.

Each header was verified to be compilable on its own after the change,
by including it into an empty `.cc` file and compiling it. `.cc` files
that used to get `database.hh` through headers that no longer include it
were changed to include it themselves.
2018-12-14 08:03:57 +02:00
Asias He
acc9ff8dce messaging_service: Add RPC verbs for row level repair
This patch adds the RPC verbs that are needed by row level repair.
The usage of those verbs is in the following patches.

All the verbs for row level repair are sent by the repair master.
The repair master asks repair slaves to create repair meta objects, a.k.a.
repair_meta objects, to store the repair metadata needed by the row level
repair algorithm. The repair meta object is identified by the IP address
of the repair master and a uint32 number, repair_meta_id, chosen by the
repair master. When the repair master restarts or leaves the cluster,
repair slaves will detect it and remove all existing repair_meta for that
repair master. When a repair slave restarts, the existing repair_meta on
the slave is gone.

The sync boundary used in the verbs is the position_in_partition of the
last mutation_fragment. In each repair round, peers work on
(last_sync_boundary, current_sync_boundary]
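
A sketch of how slaves key the per-repair state (names illustrative):

struct repair_meta_key {
    gms::inet_address repair_master_ip;
    uint32_t repair_meta_id;   // chosen by the repair master
    bool operator==(const repair_meta_key&) const = default;
};

// Slave-side registry; entries for a master are dropped when that master
// restarts or leaves the cluster, and the whole map is lost if the slave
// itself restarts.
std::unordered_map<repair_meta_key, lw_shared_ptr<repair_meta>,
        repair_meta_key_hash> repair_metas;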
2018-12-12 16:49:01 +08:00
Asias He
063dfcda26 messaging_service: Add constructor for msg_addr
Which takes the ip address and shard id.
2018-12-12 16:49:01 +08:00
Avi Kivity
775b7e41f4 Update seastar submodule
* seastar d59fcef...b924495 (2):
  > build: Fix protobuf generation rules
  > Merge "Restructure files" from Jesse

Includes fixup patch from Jesse:

"
Update Seastar `#include`s to reflect restructure

All Seastar header files are now prefixed with "seastar" and the
configure script reflects the new locations of files.

Signed-off-by: Jesse Haber-Kucharsky <jhaberku@scylladb.com>
Message-Id: <5d22d964a7735696fb6bb7606ed88f35dde31413.1542731639.git.jhaberku@scylladb.com>
"
2018-11-21 00:01:44 +02:00
Gleb Natapov
d144e6ceac messaging_service: enable port load balancing algorithm for RPC server
In a homogeneous cluster this will reduce the number of internal cross-shard
hops per request, since RPC calls will arrive at the correct shard.
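
The gist, sketched against Seastar's listen options:

seastar::listen_options lo;
// Distribute incoming connections across shards by client source port, so
// a shard-aware peer can land its connection on the matching shard.
lo.lba = seastar::server_socket::load_balancing_algorithm::port;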

Message-Id: <20181118150817.GF2062@scylladb.com>
2018-11-20 16:15:12 +00:00
Asias He
7f826d3343 streaming: Expose reason for streaming
On receiving a mutation_fragment or a mutation triggered by a streaming
operation, we pass an enum stream_reason to notify the receiver what
the streaming is used for, so the receiver can decide on further
operations, e.g., sending view updates, beyond applying the streamed
data to disk.

Fixes #3276
Message-Id: <f15ebcdee25e87a033dcdd066770114a499881c0.1539498866.git.asias@scylladb.com>
2018-10-15 22:03:28 +01:00
Calle Wilund
3cb50c861d messaging_service: Make rpc streaming sink respect tls connection
Fixes #3787

The messaging service streaming sink was created using a direct call to
rpc::client::make_sink. This in turn needs a new socket, which it
created while completely ignoring what underlying transport is active for
the client in question.

Fix by retaining the tls credentials pointer in the client wrapper and
using it in a sink method to determine whether to create a new tls
socket or just go ahead with a plain one.
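
The fix pattern, sketched (the wrapper shape is illustrative; the tls
connect call is Seastar's):

seastar::future<seastar::connected_socket> connect_for_sink(
        seastar::socket_address addr,
        seastar::shared_ptr<seastar::tls::certificate_credentials> creds) {
    if (creds) {
        // The client runs over TLS: the sink connection must too.
        return seastar::tls::connect(creds, addr);
    }
    return seastar::connect(addr);   // plain transport
}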

Message-Id: <20181010003249.30526-1-calle@scylladb.com>
2018-10-10 12:55:28 +03:00
Avi Kivity
4553238653 messaging: fix unbounded allocation in TLS RPC server
The non-TLS RPC server has an rpc::resource_limits configuration that limits
its memory consumption, but the TLS server does not. That means a many-node
TLS configuration can OOM if all nodes gang up on a single replica.

Fix by passing the limits to the TLS server too.
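
The fix, sketched with Seastar's rpc resource limits (the budget value
and server factory helpers are illustrative):

seastar::rpc::resource_limits limits;
limits.bloat_factor = 1;
limits.basic_request_size = 1000;
limits.max_memory = memory_budget;   // e.g. a fixed share of shard memory

// The same limits now bound both servers, so the TLS path can no longer
// buffer unbounded request data.
_server = make_rpc_server(addr, limits);
_tls_server = make_tls_rpc_server(addr, creds, limits);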

Fixes #3757.
Message-Id: <20180907192607.19802-1-avi@scylladb.com>
2018-09-10 12:11:16 +01:00
Duarte Nunes
a025bf6a7d Merge seastar upstream
Seastar introduced a "compat" namespace, which conflicts with Scylla's
own "compat" namespaces. The merge thus includes changes to scope
uses of Scylla's "compat" namespaces.

* seastar 8ad870f...9bb1611  (5):
  > util/variant_utils: Ensure variant_cast behaves well with rvalues
  > util/std-compat: Fix infinite recursion
  > doc/tutorial: Undo namespace changes
  > util/variant_utils: Add cast_variant()
  > Add compatbility with C++17's library types

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2018-08-14 13:07:09 +01:00
Avi Kivity
c4013f6fe1 messaging: categorize more streaming/repair verbs as streaming
Since the messaging service will assign a scheduling group based
on the client index, it's more important now to get the verbs categorized
correctly.

Re-categorize REPLICATION_FINISHED, REPAIR_CHECKSUM_RANGE, and most
importantly STREAM_MUTATION_FRAGMENTS to the repair/streaming oriented
connections so we get the correct scheduling.
2018-07-15 15:44:10 +03:00
Avi Kivity
ff3d7839ab messaging: remove default when computing rpc client index
A default means that when adding new verbs, we may forget to
categorize a verb correctly.  Without the default, the compiler
will complain due to -Wswitch.
2018-07-15 15:40:29 +03:00
Avi Kivity
fe2db68be8 messaging: convert do_get_rpc_client_idx into a switch
A switch is more readable for multiple choice with no
clearly preferred choice.
2018-07-15 15:26:50 +03:00
Avi Kivity
3b1e04091c messaging: choose connection index via a look-up table
Looking up is faster than a bunch of if()s.
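The pattern, sketched: keep the switch from the two commits above (so
-Wswitch still flags a forgotten verb) but evaluate it once per verb into
a table, making the hot-path lookup one indexed load. This assumes the
enum has a LAST sentinel and the switch helper is constexpr:

static constexpr auto client_idx_table = [] {
    std::array<uint8_t, static_cast<size_t>(messaging_verb::LAST)> tbl{};
    for (size_t i = 0; i != tbl.size(); ++i) {
        tbl[i] = do_get_rpc_client_idx(messaging_verb(i));  // the switch
    }
    return tbl;
}();

unsigned get_rpc_client_idx(messaging_verb verb) {
    return client_idx_table[static_cast<size_t>(verb)];
}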
2018-07-15 15:21:06 +03:00
Avi Kivity
8ee807321f Merge "scylla streaming with rpc streaming" from Asias
"
This work is on top of Gleb's rpc streaming which is merged recently.

What this series does is to replace scylla streaming service's data plane to
use the new rpc streaming instead of the old rpc verb to send the mutations for
scylla streaming. Other parts of scylla streaming, the control plane, are not
changed.

In my test, to bootstrap a new node to the existing one node cluster, smp 2,
scylla stores data on ramdisk to minimize disk io impact.

I saw x2 improvment in streaming bandwidth.

Before:
   [shard 0] stream_session - [Stream #2ae92320-5fc8-11e8-911a-000000000000]
   Streaming plan for Bootstrap-ks3-index-0 succeeded, peers={127.0.0.1}, tx=0 KiB, 0.00 KiB/s, rx=1570312 KiB, 109521.02 KiB/s
   [shard 0] range_streamer - Bootstrap with 127.0.0.1 for keyspace=ks3 succeeded, took 14.338 seconds

After:
   [shard 0] stream_session - [Stream #e5589ac0-5fc7-11e8-b463-000000000000]
   Streaming plan for Bootstrap-ks3-index-0 succeeded, peers={127.0.0.1}, tx=0 KiB, 0.00 KiB/s, rx=1546875 KiB, 220415.36 KiB/s
   [shard 0] range_streamer - Bootstrap with 127.0.0.1 for keyspace=ks3 succeeded, took 7.018 seconds

Tests: dtest update_cluster_layout_tests.py

Fixes: #3591
"

* tag 'asias/scylla_streaming_with_rpc_streaming_v8' of github.com:scylladb/seastar-dev:
  streaming: Add rpc streaming support
  storage_service: Introduce STREAM_WITH_RPC_STREAM feature
  streaming: Add estimate_partitions to send_info
  messaging_service: Add streaming with rpc streaming support
  messaging_service: Add streaming_domain
  database: Add add_sstable_and_update_cache
  database: Add make_streaming_sstable_for_write
2018-07-15 12:36:52 +03:00