scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-05-30 19:46:48 +00:00

Author	SHA1	Message	Date
Duarte Nunes	aaa76d58ba	query: Move to_partition_range to dht namespace This patch moves to_partition_range, from the query namespace to the dht namespace, where it is a more natural fit. Signed-off-by: Duarte Nunes <duarte@scylladb.com> Message-Id: <1468498060-19251-1-git-send-email-duarte@scylladb.com>	2016-07-15 10:41:52 +02:00
Tomasz Grabiec	7227c537ce	Merge branch 'pdziepak/streamed-mutations-hashing/v5' from seastar-dev.git From Paweł: This is another episode in the "convert X to streamed mutations" series. Hashing mutations (mainly for repair) is converted so that it doesn't need to rebuild the whole mutation. The first part of the series changes the way streamed mutations deal with range tombstones. Since it is not necessary to make sure we write disjoint tombstones to sstables, there is no longer any need for streamed mutations to produce disjoint tombstones and, consequently, no need for range tombstones to be split into range_tombstone_begin and range_tombstone_end. The second part is the actual hashing implementation. However, to ensure that the hash depends only on the contents of the mutation and not the way it is stored in different data sources, range tombstones have to be made disjoint before they are hashed. This series also ensures that any changes caused by streamed mutations to hashing and streaming do not break repair during upgrade.	2016-07-13 11:24:00 +02:00
Paweł Dziepak	3fe1aec29d	streaming: avoid word "ERROR" in non-error messages Some tools (e.g. ccm) get confused and treat messages containing the word "ERROR" as error-level messages irrespective of their actual severity level. Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com> Message-Id: <1468399752-5228-1-git-send-email-pdziepak@scylladb.com>	2016-07-13 12:06:33 +03:00
Paweł Dziepak	e779e2f0c9	streaming: do not fragment mutations in mixed cluster The receiving side needs to handle fragmented mutations properly so that isolation guarantees are not broken. If the receiving node may be an old one do not fragment mutations. Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>	2016-07-13 09:51:23 +01:00
Paweł Dziepak	d9eb4d8028	streaming: use fragment_and_freeze() to send mutations Commit `206955e4` "streaming: Reduce memory usage when sending mutations" moved streaming mutation limiter from do_send_mutations() to send_mutations(). The reason for that was that send_mutation() did full mutation copies. That's no longer the case and streaming limiter should be moved back to do_send_mutation() in order to provide back pressure to fragment_and_freeze(). Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>	2016-07-07 12:18:36 +01:00
Paweł Dziepak	32a5de7a1f	db: handle receiving fragmented mutations If mutations are fragmented during streaming, special care must be taken so that isolation guarantees are not broken. Mutations received with the "fragmented" flag set are applied to a memtable that is used only by that particular streaming task, and the sstables created by flushing such memtables are not made visible until the task is complete. Also, if the streaming fails, all data is dropped. This means that fragmented mutations cannot benefit from coalescing of writes from multiple streaming plans, hence the separate way of handling them, so that there is no loss of performance for small partitions. Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>	2016-07-07 12:18:35 +01:00
Paweł Dziepak	f2ae31711e	streaming: inform CF when streaming fails Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>	2016-07-07 12:18:35 +01:00
Paweł Dziepak	4031c0ed8f	streaming: pass plan_id to column family for apply and flush plan_id is needed to keep track of the origin of mutations so that if they are fragmented all fragments are made visible at the same time, when that particular streaming plan_id completes. Basically, each streaming plan that sends big (fragmented) mutations is going to have its own memtables and a list of sstables which will get flushed and made visible when that plan completes (or dropped if it fails). Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>	2016-07-07 12:18:35 +01:00
Paweł Dziepak	737eb73499	mutation_reader: make readers return streamed_mutations Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>	2016-06-20 21:29:50 +01:00
Asias He	94c9211b0e	streaming: Switch log level to warn instead of error dtest takes error level log as serious error. It is not a serious error for streaming to fail to send a verb and fail a streaming session, for example, the peer node is gone or stopped. Switch to use log level warn instead of level error. Fixes repair_additional_test.py:RepairAdditionalTest.repair_kill_3_test Fixes: #1335 Message-Id: <0149d30044e6e4d80732f1a20cd20593de489fc8.1465979288.git.asias@scylladb.com>	2016-06-15 13:01:22 +03:00
Asias He	96463cc17c	streaming: Fix indention in do_send_mutations Message-Id: <bc8cfa7c7b29f08e70c0af6d2fb835124d0831ac.1464857352.git.asias@scylladb.com>	2016-06-02 11:56:03 +03:00
Asias He	206955e47c	streaming: Reduce memory usage when sending mutations Limit disk bandwidth to 5MB/s to emulate a slow disk: echo "8:0 5000000" > /cgroup/blkio/limit/blkio.throttle.write_bps_device echo "8:0 5000000" > /cgroup/blkio/limit/blkio.throttle.read_bps_device Start scylla node 1 with low memory: scylla -c 1 -m 128M --auto-bootstrap false Run c-s: taskset -c 7 cassandra-stress write duration=5m cl=ONE -schema 'replication(factor=1)' -pop seq=1..100000 -rate threads=20 limit=2000/s -node 127.0.0.1 Start scylla node 2 with low memory: scylla -c 1 -m 128M --auto-bootstrap true Without this patch, I saw std::bad_alloc during streaming ERROR 2016-06-01 14:31:00,196 [shard 0] storage_proxy - exception during mutation write to 127.0.0.1: std::bad_alloc (std::bad_alloc) ... ERROR 2016-06-01 14:31:10,172 [shard 0] database - failed to move memtable to cache: std::bad_alloc (std::bad_alloc) ... To fix: 1. Apply the streaming mutation limiter before we read the mutation into memory to avoid wasting memory holding the mutation which we can not send. 2. Reduce the parallelism of sending streaming mutations. Before we send each range in parallel, after we send each range one by one. before: nr_vnode * nr_shard * (send_info + cf.make_reader memory usage) after: nr_shard * (send_info + cf.make_reader memory usage) We can at least save memory usage by the factor of nr_vnode, 256 by default. In my setup, fix 1) alone is not enough, with both fix 1) and 2), I saw no std::bad_alloc. Also, I did not see streaming bandwidth dropped due to 2). In addition, I tested grow_cluster_test.py:GrowClusterTest.test_grow_3_to_4, as described: https://github.com/scylladb/scylla/issues/1270#issuecomment-222585375 With this patch, I saw no std::bad_alloc any more. Fixes: #1270 Message-Id: <7703cf7a9db40e53a87f0f7b5acbb03fff2daf43.1464785542.git.asias@scylladb.com>	2016-06-02 11:01:58 +03:00
Piotr Jastrzebski	dcba6f5c45	Pass clustering_row_ranges to mutation readers. This will allow readers to reduce the amount of data read. Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2016-05-16 14:36:57 +02:00
Pekka Enberg	38a54df863	Fix pre-ScyllaDB copyright statements People keep tripping over the old copyrights and copy-pasting them to new files. Search and replace "Cloudius Systems" with "ScyllaDB". Message-Id: <1460013664-25966-1-git-send-email-penberg@scylladb.com>	2016-04-08 08:12:47 +03:00
Asias He	62d443a07d	streaming: Fix log of plan_id and session address in stream_session They got swapped. Fix it up. Spotted by looking at the log. Message-Id: <d163d71e9a96d1a45c3a4c529519790eeff7c486.1459172778.git.asias@scylladb.com>	2016-03-29 09:01:06 +03:00
Asias He	6fd6e57e80	streaming: Harden keep alive timer - Do nothing in case the session is closed, to prevent firing up the timer again - Print log info when no progress has been made when the timer expires; it is very useful for debugging an idle session - Grab a reference while the keep alive timer is running Message-Id: <9f2cc3164696905a6a39c0d072a980765d598dfd.1458782956.git.asias@scylladb.com>	2016-03-24 11:58:54 +02:00
Asias He	fe263e5436	Revert "Revert "streaming: Start to send mutations after PREPARE_DONE_MESSAGE"" This reverts commit `1f29a698d5`.	2016-03-24 08:43:17 +08:00
Asias He	a6dd6e6d55	Revert "Revert "streaming: Simplify session completion logic"" This reverts commit `354fca9d56`.	2016-03-24 07:48:27 +08:00
Asias He	c2eff7e824	streaming: Complete receive task after the flush A STREAM_MUTATION_DONE message will signal the receiver that the sender has completed sending stream mutations. When the receiver finds it has zero tasks to send and zero tasks to receive, it will finish the stream_session, and in turn finish the stream_plan if all the stream_sessions are finished. We should call receive_task_completed only after the flush finishes so that when stream_plan is finished all the data is on disk. Fixes repair_disjoint_data_test issue with Glauber's "[PATCH v4 0/9] Make sure repairs do not cripple incoming load" series ====================================================================== FAIL: repair_disjoint_data_test (repair_additional_test.RepairAdditionalTest) ---------------------------------------------------------------------- Traceback (most recent call last): File "scylla-dtest/repair_additional_test.py", line 102, in repair_disjoint_data_test self.check_rows_on_node(node1, 3000) File "scylla-dtest/repair_additional_test.py", line 33, in check_rows_on_node self.assertEqual(len(result), rows, len(result)) AssertionError: 2461	2016-03-23 09:40:49 -04:00
Glauber Costa	5fa866223d	streaming: add incoming streaming mutations to a different sstable Keeping the mutations coming from the streaming process as mutations like any other have a number of advantages - and that's why we do it. However, this makes it impossible for Seastar's I/O scheduler to differentiate between incoming requests from clients, and those who are arriving from peers in the streaming process. As a result, if the streaming mutations consume a significant fraction of the total mutations, and we happen to be using the disk at its limits, we are in no position to provide any guarantees - defeating the whole purpose of the scheduler. To implement that, we'll keep a separate set of memtables that will contain only streaming mutations. We don't have to do it this way, but doing so makes life a lot easier. In particular, to write an SSTable, our API requires (because the filter requires), that a good estimate on the number of partitions is informed in advance. The partitions also need to be sorted. We could write mutations directly to disk, but the above conditions couldn't be met without significant effort. In particular, because mutations can be arriving from multiple peer nodes, we can't really sort them without keeping a staging area anyway. Signed-off-by: Glauber Costa <glauber@scylladb.com>	2016-03-23 09:13:00 -04:00
Glauber Costa	10c8ca6ace	priority manager: separate streaming reads from writes Streaming has currently one class, that can be used to contain the read operations being generated by the streaming process. Those reads come from two places: - checksums (if doing repair) - reading mutations to be sent over the wire. Depending on the amount of data we're dealing with, that can generate a significant chunk of data, with seconds worth of backlog, and if we need to have the incoming writes intertwined with those reads, those can take a long time. Even if one node is only acting as a receiver, it may still read a lot for the checksums - if we're talking about repairs, those are coming from the checksums. However, in more complicated failure scenarios, it is not hard to imagine a node that will be both sending and receiving a lot of data. The best way to guarantee progress on both fronts, is to put both kinds of operations into different classes. This patch introduces a new write class, and rename the old read class so it can have a more meaningful name. Signed-off-by: Glauber Costa <glauber@scylladb.com>	2016-03-23 09:12:59 -04:00
Pekka Enberg	354fca9d56	Revert "streaming: Simplify session completion logic" This reverts commit `208b7fa7ba`. It breaks Glauber's upcoming repair series.	2016-03-22 20:37:50 +02:00
Pekka Enberg	1f29a698d5	Revert "streaming: Start to send mutations after PREPARE_DONE_MESSAGE" This reverts commit `4c06221766`. It breaks Glauber's upcoming repair series.	2016-03-22 20:37:22 +02:00
Asias He	4c06221766	streaming: Start to send mutations after PREPARE_DONE_MESSAGE Below are 3 possible cases in a stream session, after commit `208b7fa7ba` (streaming: Simplify session completion logic). We might close the session before the exchange of the PREPARE_DONE_MESSAGE message in case 1). To fix, we defer the sending of mutations until after PREPARE_DONE_MESSAGE is sent at the initiator node. 1) Initiator Follower tx rx tx rx 1 0 0 1 send prepare send back prepare recv prepare send mutations (close the session before prepare_done msg is sent) recv mutations (close session before prepare_done msg is received) send prepare_done recv prepare_done and send no mutations 2) Initiator Follower tx rx tx rx 0 1 1 0 send prepare send back prepare recv prepare nothing to send send prepare_done recv prepare_done and send mutations (close session) recv mutations (close session) 3) Initiator Follower tx rx tx rx 1 1 1 1 send prepare send back prepare recv prepare send mutations recv mutations, can not close session since we have mutations to send send prepare_done recv prepare_done and send mutations (close session) recv mutations (close session) Message-Id: <d6510b558565db23202164fa491b883ef3796e58.1458634037.git.asias@scylladb.com>	2016-03-22 15:05:57 +02:00
Asias He	208b7fa7ba	streaming: Simplify session completion logic Both the initiator and follower of a stream session know, in the preparation phase, how many transfer tasks and receive tasks the stream session contains. They use the _transfers and _receivers maps to track the tasks, like below: std::map<UUID, stream_transfer_task> _transfers; std::map<UUID, stream_receive_task> _receivers; A stream_transfer_task will send the STREAM_MUTATION verb to transfer data with frozen_mutation. When all the STREAM_MUTATIONs are sent, it will send STREAM_MUTATION_DONE to tell the peer the stream_transfer_task is completed and remove the stream_transfer_task from the _transfers map. The peer will remove the corresponding stream_receive_task from _receivers. We do not really need the COMPLETE_MESSAGE verb to notify the peer we have completed sending. It makes the session completion logic much simpler and cleaner if we do not depend on the COMPLETE_MESSAGE verb. However, to be compatible with older versions, we always send a COMPLETE_MESSAGE message, do nothing in the COMPLETE_MESSAGE handler, and reply with a ready future even if the stream_session is closed already. This way, a node with an older version will get a COMPLETE_MESSAGE message and manage to send a COMPLETE_MESSAGE message to the new node as before. Message-Id: <1458540564-34277-2-git-send-email-asias@scylladb.com>	2016-03-21 16:58:03 +02:00
Asias He	28ccd866e2	streaming: Move ranges in stream_plan The ranges are not used afterwards. We can move instead of copy. Message-Id: <1458540564-34277-1-git-send-email-asias@scylladb.com>	2016-03-21 10:10:09 +01:00
Glauber Costa	e52b869b25	fix small typo will sent -> will send Signed-off-by: Glauber Costa <glauber@scylladb.com> Message-Id: <20eaf0cea6fe14b03332547b7c4a3b85e9b619e7.1458325926.git.glauber@scylladb.com>	2016-03-18 20:34:22 +02:00
Glauber Costa	a3ebf640c6	stream_session: print debug message for STREAM_MUTATION For this verb(), we don't call get_session - and it doesn't look like we will. We currently have no debug message for this one, which makes it harder to debug the stream of messages. Print it. Signed-off-by: Glauber Costa <glauber@scylladb.com>	2016-03-16 22:09:46 -04:00
Glauber Costa	0ab4275893	stream_session: remove duplicated debug message Whenever we call get_session, that will print a debug message about the arrival of this new verb. Because we also print that explicitly in PREPARE_DONE, that message gets duplicated. That confuses poor developers who are, for a while, left wondering why it is that the sender is sending the message twice. Signed-off-by: Glauber Costa <glauber@scylladb.com>	2016-03-16 22:04:25 -04:00
Asias He	2d50c71ca3	streaming: Handle cf is deleted after the deletion check The cf can be deleted after the cf deletion check. Handle this case as well. Use "warn" level to log if cf is missing. Although we can handle the case, it is good to distinguish whether the receiver of streaming applied all the stream mutations or not. We believe that the cf is missing because it was dropped, but it could be missing because of a bug or something we didn't anticipate here. Related patch: "streaming: Handle cf is deleted when sending STREAM_MUTATION_DONE" Fixes simple_add_new_node_while_schema_changes_test failure. Message-Id: <c4497e0500f50e0a3422efb37e73130765c88c57.1458090598.git.asias@scylladb.com>	2016-03-16 09:46:41 +01:00
Asias He	f747df2aff	streaming: Fix rethrow in stream_transfer_task Fix bootstrap_test.py:TestBootstrap.failed_bootstap_wiped_node_can_join_test Logs on node 1: INFO 2016-03-11 15:53:43,287 [shard 0] gossip - FatClient 127.0.0.2 has been silent for 30000ms, removing from gossip INFO 2016-03-11 15:53:43,287 [shard 0] stream_session - stream_manager: Close all stream_session with peer = 127.0.0.2 in on_remove WARN 2016-03-11 15:53:43,498 [shard 0] stream_session - [Stream #4e411ba0-e75e-11e5-81f8-000000000000] stream_transfer_task: Fail to send STREAM_MUTATION_DONE to 127.0.0.2:0: std::runtime_error ([Stream #4e411ba0-e75e-11e5-81f8-000000000000] GOT STREAM_ MUTATION_DONE 127.0.0.1: Can not find stream_manager) terminate called without an active exception Backtrace on node 1: #0 0x00007fb74723da98 in raise () from /lib64/libc.so.6 #1 0x00007fb74723f69a in abort () from /lib64/libc.so.6 #2 0x00007fb74ab84aed in __gnu_cxx::__verbose_terminate_handler() () from /lib64/libstdc++.so.6 #3 0x00007fb74ab82936 in ?? () from /lib64/libstdc++.so.6 #4 0x00007fb74ab82981 in std::terminate() () from /lib64/libstdc++.so.6 #5 0x00007fb74ab82be9 in __cxa_rethrow () from /lib64/libstdc++.so.6 #6 0x0000000000f3521e in streaming::stream_transfer_task::<lambda()>::<lambda(auto:44)>::operator()<std::__exception_ptr::exception_ptr> (ep=..., __closure=0x7ffce74d8630) at streaming/stream_transfer_task.cc:169 #7 do_void_futurize_apply<const streaming::stream_transfer_task::start()::<lambda()>::<lambda(auto:44)>&, std::__exception_ptr::exception_ptr> (func=...) at /home/asias/src/cloudius-systems/scylla/seastar/core/future.hh:1142 #8 futurize<void>::apply<const streaming::stream_transfer_task::start()::<lambda()>::<lambda(auto:44)>&, std::__exception_ptr::exception_ptr> (func=...) 
at /home/asias/src/cloudius-systems/scylla/seastar/core/future.hh:1190 #9 future<>::<lambda(auto:7&&)>::operator()<future<> > ( fut=fut@entry=<unknown type in /home/asias/src/cloudius-systems/scylla/build/release/scylla, CU 0xec84d00, DIE 0xee2561d>, __closure=__closure@entry=0x7ffce74d8630) at /home/asias/src/cloudius-systems/scylla/seastar/core/future.hh:1014 Message-Id: <1457684884-4776-2-git-send-email-asias@scylladb.com>	2016-03-11 11:14:05 +02:00
Asias He	a9ec752939	streaming: Reduce STREAM_MUTATION error logging There might be a large number of STREAM_MUTATIONs in flight. Log one error per column_family per range to avoid spamming the log.	2016-03-10 10:56:48 +08:00
Asias He	7c4c99d7c7	streaming: Fix a log level in get_column_family_stores It is supposed to be debug level instead of info level.	2016-03-10 10:56:48 +08:00
Asias He	d9ead889f3	streaming: Handle cf is deleted when sending STREAM_MUTATION_DONE In the preparation phase of streaming, we check that the remote node has all the cf_ids which are needed for the entire streaming process, including the cf_ids which the local node will send to the remote node and vice versa. So, at a later time, if the cf_id is missing, it must be that the cf_id was deleted. It is fine to ignore the no_such_column_family exception. In this patch, we change the code to ignore it at the server side to avoid sending the exception back, so that we do not have to handle the exception in an IDL-compatible way. One thing we can improve is that the sender might learn that the cf is deleted later than the receiver does. In this case, the sender will send some more mutations than it would if we sent the no_such_column_family exception back to the sender. However, since we do not throw exceptions in the receiver stream mutation handler, it will not cause a lot of overhead; the receiver will just ignore the mutations received. Fixes #979	2016-03-09 16:50:38 +08:00
Asias He	efa74dbae0	streaming: Do not send if the cf is deleted It is possible that a cf is deleted after we make the cf reader. Avoid sending them to avoid the unnecessary overhead to send them on the wire and the peer node to drop the received mutations.	2016-03-09 16:50:38 +08:00
Asias He	dca9e594cc	streaming: Remove the unused test code It is introduced in the early development of streaming. We have dtest for streaming now, drop it. Message-Id: <1457499303-21163-1-git-send-email-asias@scylladb.com>	2016-03-09 10:31:42 +02:00
Asias He	1f3928c321	streaming: Hook streaming with gossip callback If the peer node of a stream_session is restarted or removed we should abort the streaming. It is better to hook the gossip callback in the stream manager than in each stream_session.	2016-03-09 07:35:20 +08:00
Asias He	50bf65db8d	streaming: Fix keep alive timer progress checking The first time the keep alive timer fires, _last_stream_bytes will be zero since it has not been updated yet. The keep alive timer will be rearmed and fire again. The second time, if we find there is no progress, we close the session. The total idle time will be 2 * keep alive timer. To make the idle time before closing the session more precise, we reduce the interval at which we check progress and close the session based on the last time progress was made. Message-Id: <c959cffce0cc738a3d73caaf71d2adb709d46863.1456831616.git.asias@scylladb.com>	2016-03-01 16:46:08 +02:00
Asias He	fd5f3cff47	streaming: Fix stream_manager progress api For each stream_session, we pretend we are sending/receiving one file, to make it compatible with nodetool. For receiving_files, the file name is "rxnofile". For sending_files, the file name is "txnofile". stream_manager::update_all_progress_info is introduced to update the progress info of all the stream_sessions in the node. We need this because streaming mutations are received on all the cores, but the stream_session object is only on one of the cores. It adds overhead if we update progress info in stream_session object whenever we receive a streaming mutation. So, what we do now is when we really need the progress info, we update the progress info in stream_session object. With http://127.0.0.$i:10000/stream_manager/, it looks like below when decommission node 3 in a 3 nodes cluster. =========== GET NODE 1 [{"plan_id": "935a2cc0-dc6b-11e5-bdbf-000000000000", "description": "Unbootstrap", "sessions": [{"receiving_files": [{"value": {"direction": "IN", "file_name": "rxnofile", "session_index": 0, "total_bytes": 16876296, "peer": "127.0.0.3", "current_bytes": 16876296}, "key": "rxnofile"}], "receiving_summaries": [{"files": 1, "total_size": 0, "cf_id": "869d8630-dc6b-11e5-bdbf-000000000000"}], "session_index": 0, "state": "PREPARING", "connecting": "127.0.0.3", "peer": "127.0.0.3"}]}] =========== GET NODE 2 [{"plan_id": "935a2cc0-dc6b-11e5-bdbf-000000000000", "description": "Unbootstrap", "sessions": [{"receiving_files": [{"value": {"direction": "IN", "file_name": "rxnofile", "session_index": 0, "total_bytes": 16755552, "peer": "127.0.0.3", "current_bytes": 16755552}, "key": "rxnofile"}], "receiving_summaries": [{"files": 1, "total_size": 0, "cf_id": "869d8630-dc6b-11e5-bdbf-000000000000"}], "session_index": 0, "state": "PREPARING", "connecting": "127.0.0.3", "peer": "127.0.0.3"}]}] =========== GET NODE 3 [{"plan_id": "935a2cc0-dc6b-11e5-bdbf-000000000000", "description": "Unbootstrap", "sessions": 
[{"sending_files": [{"value": {"direction": "OUT", "file_name": "txnofile", "session_index": 0, "total_bytes": 16876296, "peer": "127.0.0.1", "current_bytes": 16876296}, "key": "txnofile"}], "sending_summaries": [{"files": 1, "total_size": 0, "cf_id": "869d8630-dc6b-11e5-bdbf-000000000000"}], "session_index": 0, "state": "PREPARING", "connecting": "127.0.0.1", "peer": "127.0.0.1"},{"sending_files": [{"value": {"direction": "OUT", "file_name": "txnofile", "session_index": 0, "total_bytes": 16755552, "peer": "127.0.0.2", "current_bytes": 16755552}, "key": "txnofile"}], "sending_summaries": [{"files": 1, "total_size": 0, "cf_id": "869d8630-dc6b-11e5-bdbf-000000000000"}], "session_index": 0, "state": "PREPARING", "connecting": "127.0.0.2", "peer": "127.0.0.2"}]}]	2016-02-26 17:38:37 +08:00
Asias He	37f52d632f	streaming: Remove unused progress() function	2016-02-26 17:38:37 +08:00
Asias He	8060b97d67	streaming: Log number of bytes sent and received when stream_plan completes It is useful for test code to verify number of bytes sent/received. It looks like below in the log. /tmp/out1:INFO [shard 0] stream_session - \ [Stream #1f3e23f0-db9e-11e5-9cfb-000000000000] bytes_sent = 0, bytes_received = 15760704 /tmp/out2:INFO [shard 0] stream_session - \ [Stream #1f3e23f0-db9e-11e5-9cfb-000000000000] bytes_sent = 0, bytes_received = 18203964 /tmp/out3:INFO [shard 0] stream_session - \ [Stream #1f3e23f0-db9e-11e5-9cfb-000000000000] bytes_sent = 33964668, bytes_received = 0	2016-02-26 17:38:37 +08:00
Asias He	9dede89e07	streaming: Add get_progress_on_all_shards for plan_id Get stream_bytes for a specific plan_id.	2016-02-26 17:38:37 +08:00
Asias He	d146045bc5	Revert "Revert "streaming: Send mutations on all shards"" This brings back streaming on all shards. The bug in locator/abstract_replication_strategy is now fixed. This reverts commit `9f3061ade8`. Message-Id: <a79ce9cdd6f4af1c6088b89e1911b4b2ed1c10ae.1455589460.git.asias@scylladb.com>	2016-02-16 11:16:51 +02:00
Avi Kivity	9f3061ade8	Revert "streaming: Send mutations on all shards" This reverts commit `31d439213c`. Fixes #894. Conflicts: streaming/stream_manager.cc (may have undone part of `63a5aa6122`)	2016-02-09 18:26:14 +02:00
Calle Wilund	2ffd7d7b99	stream_manager: Change construction to make gcc 4.9 happy gcc 4.9 complains about the type{ val, val } construction of type with implicit default constructor, i.e. member = initial declarations. gcc 5 does not (and possibly rightly so). However, we still (implicitly) claim to support gcc 4.9 so why not just change this particular instance. Message-Id: <1454921328-1106-1-git-send-email-calle@scylladb.com>	2016-02-08 10:54:48 +02:00
Asias He	31d439213c	streaming: Send mutations on all shards Currently, only the shard where the stream_plan is created will send streaming mutations. To utilize all the available cores, we can make each shard send the mutations it is responsible for. On the receiver side, we do not forward the mutations to the shard where the stream_session is created, so that we can avoid unnecessary forwarding. Note: the downside is that it is now harder to: 1) track the number of bytes sent and received 2) update the keep alive timer upon receipt of the STREAM_MUTATION To fix, we now store the sent/received bytes info on all shards. When the keep alive timer expires, we check if any progress has been made. Hopefully, this patch will make the streaming much faster and in turn make the repair/decommission/adding a node faster. Refs: https://github.com/scylladb/scylla/issues/849 Tested with decommission/repair dtest. Message-Id: <96b419ab11b736a297edd54a0b455ffdc2511ac5.1454645370.git.asias@scylladb.com>	2016-02-07 10:57:51 +02:00
Gleb Natapov	63a5aa6122	prevent superfluous frozen_mutation copying Sometimes frozen_mutation is copied while it can be moved instead. Fix those cases. Message-Id: <20160204165708.GI6705@scylladb.com>	2016-02-07 10:54:16 +02:00
Asias He	c67538009c	streaming: Fix assert in update_progress The problem is that on the follower side, we set up _session_info too late, after received PREPARE_DONE_MESSAGE message. The initiator can send STREAM_MUTATION before sending PREPARE_DONE_MESSAGE message. To fix, we set up _session_info after we received the prepare_message on both initiator and follower. Fixes #869 scylla: streaming/session_info.cc:44: void streaming::session_info::update_progress(streaming::progress_info): Assertion `peer == new_progress.peer' failed. Message-Id: <6d945ba1e8c4fc0949c3f0a72800c9448ba27761.1454476876.git.asias@scylladb.com>	2016-02-03 10:15:45 +02:00
Asias He	c618c699b3	streaming: Increase mutation_send_limiter The idea behind the current limit of 10 stream_mutations per core is to keep streaming from overwhelming the TCP connection and starving normal cql verbs if the streaming mutations are big and take a long time to complete. Now that we use a standalone connection for streaming verbs, we can increase the limit. Hopefully, this will fix #849.	2016-02-01 11:01:56 +08:00
Asias He	f07cd30c81	streaming: Remove unused create_message_for_retry	2016-01-29 16:31:07 +08:00

377 Commits