scylladb

Author	SHA1	Message	Date
Duarte Nunes	aaa76d58ba	query: Move to_partition_range to dht namespace This patch moves to_partition_range, from the query namespace to the dht namespace, where it is a more natural fit. Signed-off-by: Duarte Nunes <duarte@scylladb.com> Message-Id: <1468498060-19251-1-git-send-email-duarte@scylladb.com>	2016-07-15 10:41:52 +02:00
Paweł Dziepak	e779e2f0c9	streaming: do not fragment mutations in mixed cluster The receiving side needs to handle fragmented mutations properly so that isolation guarantees are not broken. If the receiving node may be an old one do not fragment mutations. Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>	2016-07-13 09:51:23 +01:00
Paweł Dziepak	d9eb4d8028	streaming: use fragment_and_freeze() to send mutations Commit `206955e4` "streaming: Reduce memory usage when sending mutations" moved streaming mutation limiter from do_send_mutations() to send_mutations(). The reason for that was that send_mutation() did full mutation copies. That's no longer the case and streaming limiter should be moved back to do_send_mutation() in order to provide back pressure to fragment_and_freeze(). Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>	2016-07-07 12:18:36 +01:00
Paweł Dziepak	32a5de7a1f	db: handle receiving fragmented mutations If mutations are fragmented during streaming, special care must be taken so that isolation guarantees are not broken. Mutations received with flag "fragmented" set are applied to a memtable that is used only by that particular streaming task and the sstables created by flushing such memtables are not made visible until the task is complete. Also, in case the streaming fails all data is dropped. This means that fragmented mutations cannot benefit from coalescing of writes from multiple streaming plans, hence the separate way of handling them so that there is no loss of performance for small partitions. Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>	2016-07-07 12:18:35 +01:00
Paweł Dziepak	737eb73499	mutation_reader: make readers return streamed_mutations Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>	2016-06-20 21:29:50 +01:00
Asias He	94c9211b0e	streaming: Switch log level to warn instead of error dtest takes error level log as serious error. It is not a serious error for streaming to fail to send a verb and fail a streaming session, for example, the peer node is gone or stopped. Switch to use log level warn instead of level error. Fixes repair_additional_test.py:RepairAdditionalTest.repair_kill_3_test Fixes: #1335 Message-Id: <0149d30044e6e4d80732f1a20cd20593de489fc8.1465979288.git.asias@scylladb.com>	2016-06-15 13:01:22 +03:00
Asias He	96463cc17c	streaming: Fix indention in do_send_mutations Message-Id: <bc8cfa7c7b29f08e70c0af6d2fb835124d0831ac.1464857352.git.asias@scylladb.com>	2016-06-02 11:56:03 +03:00
Asias He	206955e47c	streaming: Reduce memory usage when sending mutations Limit disk bandwidth to 5MB/s to emulate a slow disk: echo "8:0 5000000" > /cgroup/blkio/limit/blkio.throttle.write_bps_device echo "8:0 5000000" > /cgroup/blkio/limit/blkio.throttle.read_bps_device Start scylla node 1 with low memory: scylla -c 1 -m 128M --auto-bootstrap false Run c-s: taskset -c 7 cassandra-stress write duration=5m cl=ONE -schema 'replication(factor=1)' -pop seq=1..100000 -rate threads=20 limit=2000/s -node 127.0.0.1 Start scylla node 2 with low memory: scylla -c 1 -m 128M --auto-bootstrap true Without this patch, I saw std::bad_alloc during streaming ERROR 2016-06-01 14:31:00,196 [shard 0] storage_proxy - exception during mutation write to 127.0.0.1: std::bad_alloc (std::bad_alloc) ... ERROR 2016-06-01 14:31:10,172 [shard 0] database - failed to move memtable to cache: std::bad_alloc (std::bad_alloc) ... To fix: 1. Apply the streaming mutation limiter before we read the mutation into memory to avoid wasting memory holding the mutation which we can not send. 2. Reduce the parallelism of sending streaming mutations. Before we send each range in parallel, after we send each range one by one. before: nr_vnode * nr_shard * (send_info + cf.make_reader memory usage) after: nr_shard * (send_info + cf.make_reader memory usage) We can at least save memory usage by the factor of nr_vnode, 256 by default. In my setup, fix 1) alone is not enough, with both fix 1) and 2), I saw no std::bad_alloc. Also, I did not see streaming bandwidth dropped due to 2). In addition, I tested grow_cluster_test.py:GrowClusterTest.test_grow_3_to_4, as described: https://github.com/scylladb/scylla/issues/1270#issuecomment-222585375 With this patch, I saw no std::bad_alloc any more. Fixes: #1270 Message-Id: <7703cf7a9db40e53a87f0f7b5acbb03fff2daf43.1464785542.git.asias@scylladb.com>	2016-06-02 11:01:58 +03:00
Piotr Jastrzebski	dcba6f5c45	Pass clustering_row_ranges to mutation readers. This will allow readers to reduce the amount of data read. Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2016-05-16 14:36:57 +02:00
Pekka Enberg	38a54df863	Fix pre-ScyllaDB copyright statements People keep tripping over the old copyrights and copy-pasting them to new files. Search and replace "Cloudius Systems" with "ScyllaDB". Message-Id: <1460013664-25966-1-git-send-email-penberg@scylladb.com>	2016-04-08 08:12:47 +03:00
Glauber Costa	10c8ca6ace	priority manager: separate streaming reads from writes Streaming has currently one class, that can be used to contain the read operations being generated by the streaming process. Those reads come from two places: - checksums (if doing repair) - reading mutations to be sent over the wire. Depending on the amount of data we're dealing with, that can generate a significant chunk of data, with seconds worth of backlog, and if we need to have the incoming writes intertwined with those reads, those can take a long time. Even if one node is only acting as a receiver, it may still read a lot for the checksums - if we're talking about repairs, those are coming from the checksums. However, in more complicated failure scenarios, it is not hard to imagine a node that will be both sending and receiving a lot of data. The best way to guarantee progress on both fronts, is to put both kinds of operations into different classes. This patch introduces a new write class, and rename the old read class so it can have a more meaningful name. Signed-off-by: Glauber Costa <glauber@scylladb.com>	2016-03-23 09:12:59 -04:00
Asias He	f747df2aff	streaming: Fix rethrow in stream_transfer_task Fix bootstrap_test.py:TestBootstrap.failed_bootstap_wiped_node_can_join_test Logs on node 1: INFO 2016-03-11 15:53:43,287 [shard 0] gossip - FatClient 127.0.0.2 has been silent for 30000ms, removing from gossip INFO 2016-03-11 15:53:43,287 [shard 0] stream_session - stream_manager: Close all stream_session with peer = 127.0.0.2 in on_remove WARN 2016-03-11 15:53:43,498 [shard 0] stream_session - [Stream #4e411ba0-e75e-11e5-81f8-000000000000] stream_transfer_task: Fail to send STREAM_MUTATION_DONE to 127.0.0.2:0: std::runtime_error ([Stream #4e411ba0-e75e-11e5-81f8-000000000000] GOT STREAM_ MUTATION_DONE 127.0.0.1: Can not find stream_manager) terminate called without an active exception Backtrace on node 1: #0 0x00007fb74723da98 in raise () from /lib64/libc.so.6 #1 0x00007fb74723f69a in abort () from /lib64/libc.so.6 #2 0x00007fb74ab84aed in __gnu_cxx::__verbose_terminate_handler() () from /lib64/libstdc++.so.6 #3 0x00007fb74ab82936 in ?? () from /lib64/libstdc++.so.6 #4 0x00007fb74ab82981 in std::terminate() () from /lib64/libstdc++.so.6 #5 0x00007fb74ab82be9 in __cxa_rethrow () from /lib64/libstdc++.so.6 #6 0x0000000000f3521e in streaming::stream_transfer_task::<lambda()>::<lambda(auto:44)>::operator()<std::__exception_ptr::exception_ptr> (ep=..., __closure=0x7ffce74d8630) at streaming/stream_transfer_task.cc:169 #7 do_void_futurize_apply<const streaming::stream_transfer_task::start()::<lambda()>::<lambda(auto:44)>&, std::__exception_ptr::exception_ptr> (func=...) at /home/asias/src/cloudius-systems/scylla/seastar/core/future.hh:1142 #8 futurize<void>::apply<const streaming::stream_transfer_task::start()::<lambda()>::<lambda(auto:44)>&, std::__exception_ptr::exception_ptr> (func=...) 
at /home/asias/src/cloudius-systems/scylla/seastar/core/future.hh:1190 #9 future<>::<lambda(auto:7&&)>::operator()<future<> > ( fut=fut@entry=<unknown type in /home/asias/src/cloudius-systems/scylla/build/release/scylla, CU 0xec84d00, DIE 0xee2561d>, __closure=__closure@entry=0x7ffce74d8630) at /home/asias/src/cloudius-systems/scylla/seastar/core/future.hh:1014 Message-Id: <1457684884-4776-2-git-send-email-asias@scylladb.com>	2016-03-11 11:14:05 +02:00
Asias He	a9ec752939	streaming: Reduce STREAM_MUTATION error logging There might be a large number of STREAM_MUTATION messages in flight. Log one error per column_family per range to avoid spamming the log.	2016-03-10 10:56:48 +08:00
Asias He	d9ead889f3	streaming: Handle cf is deleted when sending STREAM_MUTATION_DONE In the preparation phase of streaming, we check that remote node has all the cf_id which are needed for the entire streaming process, including the cf_id which local node will send to remote node and vice versa. So, at later time, if the cf_id is missing, it must be that the cf_id is deleted. It is fine to ignore no_such_column_family exception. In this patch, we change the code to ignore at server side to avoid sending the exception back, to avoid handling the exception in an IDL compatible way. One thing we can improve is that the sender might know the cf is deleted later than the receiver does. In this case, the sender will send some more mutations if we send back the no_such_column_family back to the sender. However, since we do not throw exceptions in the receiver stream mutation handler, it will not cause a lot of overhead, the receiver will just ignore the mutation received. Fixes #979	2016-03-09 16:50:38 +08:00
Asias He	efa74dbae0	streaming: Do not send if the cf is deleted It is possible that a cf is deleted after we make the cf reader. Avoid sending them to avoid the unnecessary overhead to send them on the wire and the peer node to drop the received mutations.	2016-03-09 16:50:38 +08:00
Asias He	d146045bc5	Revert "Revert "streaming: Send mutations on all shards"" This brings back streaming on all shards. The bug in locator/abstract_replication_strategy is now fixed. This reverts commit `9f3061ade8`. Message-Id: <a79ce9cdd6f4af1c6088b89e1911b4b2ed1c10ae.1455589460.git.asias@scylladb.com>	2016-02-16 11:16:51 +02:00
Avi Kivity	9f3061ade8	Revert "streaming: Send mutations on all shards" This reverts commit `31d439213c`. Fixes #894. Conflicts: streaming/stream_manager.cc (may have undone part of `63a5aa6122`)	2016-02-09 18:26:14 +02:00
Asias He	31d439213c	streaming: Send mutations on all shards Currently, only the shard where the stream_plan is created on will send streaming mutations. To utilize all the available cores, we can make each shard send mutations which it is responsible for. On the receiver side, we do not forward the mutations to the shard where the stream_session is created, so that we can avoid unnecessary forwarding. Note: the downside is that it is now harder to: 1) track number of bytes sent and received 2) update the keep alive timer upon receive of the STREAM_MUTATION To fix, we now store the sent/received bytes info on all shards. When the keep alive timer expires, we check if any progress has been made. Hopefully, this patch will make the streaming much faster and in turn make the repair/decommission/adding a node faster. Refs: https://github.com/scylladb/scylla/issues/849 Tested with decommission/repair dtest. Message-Id: <96b419ab11b736a297edd54a0b455ffdc2511ac5.1454645370.git.asias@scylladb.com>	2016-02-07 10:57:51 +02:00
Gleb Natapov	63a5aa6122	prevent superfluous frozen_mutation copying Sometimes frozen_mutation is copied while it can be moved instead. Fix those cases. Message-Id: <20160204165708.GI6705@scylladb.com>	2016-02-07 10:54:16 +02:00
Asias He	2f48d402e2	streaming: Remove unused commented code	2016-01-29 16:31:07 +08:00
Asias He	91e245edac	streaming: Initialize total_size in stream_transfer_task Also rename the private member to _total_size and _files	2016-01-29 16:31:07 +08:00
Asias He	c4bdb6f782	streaming: Wire up session progress The progress info is needed by JMX api.	2016-01-29 16:31:07 +08:00
Asias He	03aced39c4	streaming: Account number of bytes sent and received per session The API will consume it soon.	2016-01-27 18:16:58 +08:00
Asias He	e8b8b454df	streaming: Flatten streaming messages class namespace There are only two messages: prepare_message and outgoing_file_message. Actually only the prepare_message is the message we send on wire. Flatten the namespace.	2016-01-26 13:04:29 +08:00
Asias He	51fa717b8e	streaming: Get rid of file_message_header Again, we do not send sstable files, thus neither header info for sstables files. TODO: Estimate mutation size we sent.	2016-01-25 17:56:43 +08:00
Asias He	bdd6a69af7	streaming: Drop unused parameters - int connections_per_host Scylla does not create connections per stream_session, instead it uses rpc, thus connections_per_host is not relevant to scylla. - bool keep_ss_table_level - int repaired_at Scylla does not stream sstable files. They are not relevant to scylla.	2016-01-25 11:38:13 +08:00
Asias He	88e99e89d6	streaming: Add more debug info - Add debug for the peer address info - Add debug in stream_transfer_task and stream_receive_task - Add debug when cancel the keep_alive timer - Add debug for has_active_sessions in stream_result_future::maybe_complete	2016-01-22 07:43:16 +08:00
Asias He	2345cda42f	messaging_service: Rename shard_id to msg_addr Use shard_id as the destination of the messaging_service is confusing, since shard_id is used in the context of cpu id. Message-Id: <8c9ef193dc000ef06f8879e6a01df65cf24635d8.1452155241.git.asias@scylladb.com>	2016-01-07 10:36:35 +02:00
Asias He	22d0525bc0	streaming: Get rid of the _from_ parameter Get this from cinfo.retrieve_auxiliary inside the rpc handler.	2015-12-31 11:25:08 +01:00
Asias He	89b79d44de	streaming: Get rid of the _connecting_ parameter messaging_service will use private ip address automatically to connect a peer node if possible. There is no need for the upper level like streaming to worry about it. Drop it simplifies things a bit.	2015-12-31 11:25:08 +01:00
Avi Kivity	827a4d0010	Merge "streaming: Invalidate cache upon receiving of stream" from Asias "When a node gain or regain responsibility for certain token ranges, streaming will be performed, upon receiving of the stream data, the row cache is invalidated for that range. Refs #484."	2015-12-28 10:24:46 +02:00
Asias He	c971fad618	streaming: Introduce keep alive timer for each stream_session If the session is idle for 10 minutes, close the session. This can detect the following hangs: 1) if the sending node is gone, the receiving peer will wait forever 2) if the node which should send COMPLETE_MESSAGE to the peer node is gone, the peer node will wait forever Fixes simple_kill_streaming_node_while_bootstrapping_test.	2015-12-24 20:34:44 +08:00
Asias He	2d32195c32	streaming: Invalidate cache upon receiving of stream When a node gain or regain responsibility for certain token ranges, streaming will be performed, upon receiving of the stream data, the row cache is invalidated for that range. Refs #484.	2015-12-21 14:44:13 +08:00
Asias He	242e5ea291	streaming: Ignore remote no_such_column_family for stream_transfer_task When we start to sending mutations for cf_id to remote node, remote node might do not have the cf_id anymore due to dropping of the cf for instance. We should not fail the streaming if this happens, since the cf does not exist anymore there is no point streaming it. Fixes #566	2015-11-18 15:12:23 +02:00
Asias He	6ac54a27dc	streaming: Skip non-exist cf for stream_transfer_task Skip sending the mutation if the cf is dropped after we call make_local_reader in stream_session::add_transfer_ranges(). Fix #550.	2015-11-16 16:48:35 +01:00
Asias He	860c7aff37	streaming: Print plan_id in logger	2015-11-10 15:39:34 +08:00
Avi Kivity	d5cf0fb2b1	Add license notices	2015-09-20 10:43:39 +03:00
Avi Kivity	b22a598efb	mutation_reader: make noncopyable Many mutation_reader implementations capture 'this', which, if copied, becomes invalid. Protect against this error by making mutation_reader a non-copyable object. Fix inadvertent copies around the code base.	2015-08-25 15:49:08 +03:00
Asias He	fd1c0e0bb3	streaming: Fix iterate and delete The problem is that in start_streaming_files we iterate the _transfers map, however in task.start() we can delete the task from _transfers: stream_transfer_task::start() -> stream_transfer_task::complete -> stream_session::task_completed -> _transfers.erase(completed_task.cf_id) To fix, we advance the iterator before we start the task. std::_Rb_tree_increment(std::_Rb_tree_node_base const*) () from /lib64/libstdc++.so.6 /usr/include/c++/5.1.1/bits/stl_tree.h:205 (this=this@entry=0x6000000dc290) at streaming/stream_transfer_task.cc:55 streaming::stream_session::start_streaming_files (this=this@entry=0x6000000ab500) at streaming/stream_session.cc:526 (this=0x6000000ab500, requests=std::vector of length 1, capacity 1 = {...}, summaries=std::vector of length 1, capacity 1 = {...}) at streaming/stream_session.cc:356 streaming/stream_session.cc:83	2015-08-17 11:00:30 +08:00
Asias He	d2e826d6e6	streaming: Log STREAM_MUTATION_DONE before sending it It is useful for debug.	2015-08-17 11:00:30 +08:00
Asias He	0f1f710b27	streaming: Introduce transfer_task_completed	2015-08-17 11:00:30 +08:00
Asias He	651200c123	streaming: Log exception It is easier to tell what is going wrong.	2015-08-17 10:52:30 +08:00
Asias He	aa012ba374	streaming: Send STREAM_MUTATION in parallel At the moment, when local node send a mutation to remote node, it will wait for remote node to apply the mutation and send back a response, then it will send the next mutation. This means the sender are sending mutations one by one. To optimize, we can make the sender send more mutations in parallel without waiting for the response. In order to apply back pressure from remote node, a per shard mutation send limiter is introduced so that the sender will not overwhelm the receiver.	2015-08-17 10:52:30 +08:00
Asias He	e13d93b2ff	streaming: Improve error handling in stream_transfer_task::complete	2015-08-10 14:49:34 +08:00
Asias He	c7c33a9f44	streaming: Add error handling for STREAM_MUTATION sending	2015-08-10 14:44:25 +08:00
Asias He	be4d9c63b1	streaming: Drop do_with in stream_transfer_task::start We can copy id instead, it is cheap.	2015-08-10 14:13:15 +08:00
Asias He	f9109c33ba	streaming: Implement stream_transfer_task completion logic	2015-07-21 16:12:54 +08:00
Asias He	f2960a7cb0	streaming: Send plan_id for STREAM_MUTATION We need this to find session associated with this frozen_mutation.	2015-07-21 16:12:54 +08:00
Asias He	ccb32ceec5	streaming: Add stream_transfer_task::complete	2015-07-21 16:12:54 +08:00
Asias He	8561315cf2	streaming: de-thread_local-ize logger	2015-07-21 16:12:54 +08:00


62 Commits