scylladb

Author	SHA1	Message	Date
Avi Kivity	f3eade2f62	treewide: relicense to ScyllaDB-Source-Available-1.0 Drop the AGPL license in favor of a source-available license. See the blog post [1] for details. [1] https://www.scylladb.com/2024/12/18/why-were-moving-to-a-source-available-license/	2024-12-18 17:45:13 +02:00
Kefu Chai	f86a5ae87a	streaming: do not include unused headers these unused includes were identified by clangd. see https://clangd.llvm.org/guides/include-cleaner#unused-include-warning for more details on the "Unused include" warning. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#16947	2024-01-23 19:38:30 +02:00
Benny Halevy	314e45d957	streaming: define plan_id as a strong tagged_uuid type Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-08-22 19:45:30 +03:00
Benny Halevy	c1fc0672a5	streaming: add forward declarations in stream_fwd.hh To be used for defining streaming::plan_id in the next patcvh. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-08-21 16:00:02 +03:00
Benny Halevy	257d74bb34	schema, everywhere: define and use table_id as a strong type Define table_id as a distinct utils::tagged_uuid modeled after raft tagged_id, so it can be differentiated from other uuid-class types, in particular from table_schema_version. Fixes #11207 Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-08-08 08:09:41 +03:00
Asias He	953af38281	streaming: Allow drop table during streaming Currently, if a table is dropped during streaming, the streaming would fail with no_such_column_family error. Since the table is dropped anyway, it makes more sense to ignore the streaming result of the dropped table, whether it is successful or failed. This allows users to drop tables during node operations, e.g., bootstrap or decommission a node. This is especially useful for the cloud users where it is hard to coordinate between a node operation by admin and user cql change. This patch also fixes a possible user after free issue by not passing the table reference object around. Fixes #10395 Closes #10396	2022-04-24 17:43:20 +03:00
Avi Kivity	fcb8d040e8	treewide: use Software Package Data Exchange (SPDX) license identifiers Instead of lengthy blurbs, switch to single-line, machine-readable standardized (https://spdx.dev) license identifiers. The Linux kernel switched long ago, so there is strong precedent. Three cases are handled: AGPL-only, Apache-only, and dual licensed. For the latter case, I chose (AGPL-3.0-or-later and Apache-2.0), reasoning that our changes are extensive enough to apply our license. The changes we applied mechanically with a script, except to licenses/README.md. Closes #9937	2022-01-18 12:15:18 +01:00
Avi Kivity	a55b434a2b	treewide: extent copyright statements to present day	2021-06-06 19:18:49 +03:00
Avi Kivity	84be89eb3b	streaming: drop unused fields	2021-05-21 21:03:23 +03:00
Pavel Solodovnikov	2f442f28af	treewide: add const qualifiers throughout the code base	2019-11-26 02:24:49 +03:00
Asias He	a9dab60b6c	streaming: One cf per time on sender In the case there are large number of column families, the sender will send all the column families in parallel. We allow 20% of shard memory for streaming on the receiver, so each column family will have 1/N, N is the number of in-flight column families, memory for memtable. Large N causes a lot of small sstables to be generated. It is possible there are multiple senders to a single receiver, e.g., when a new node joins the cluster, the maximum in-flight column families is number of peer node. The column families are sent in the order of cf_id. It is not guaranteed that all peers has the same speed so they are sending the same cf_id at the same time, though. We still have chance some of the peers are sending the same cf_id. Fixes #3065 Message-Id: <46961463c2a5e4f1faff232294dc485ac4f1a04e.1513159678.git.asias@scylladb.com>	2017-12-13 12:32:41 +02:00
Avi Kivity	85a6a2b3cb	streaming: remove unneeded includes	2017-09-12 10:43:39 +03:00
Asias He	937f28d2f1	Convert to use dht::partition_range_vector and dht::token_range_vector	2016-12-19 14:08:50 +08:00
Asias He	e5485f3ea6	Get rid of query::partition_range Use dht::partition_range instead	2016-12-19 08:09:25 +08:00
Asias He	d1178fa299	Convert to use dht::token_range	2016-12-19 08:04:29 +08:00
Asias He	ba54654af3	streaming: Use interval_set to sort and merge ranges So that the ranges are sorted and have no overlaps. We can have less ranges to deal with and it can help the mutation readers to optimize. Here is an exmaple of ranges generated by repair: Before: INFO 2016-12-07 17:44:21,185 [shard 0] stream_session - cf_id = dec9fa90-bc3b-11e6-af78-000000000001, before ranges = {(-3383928698815274642, -3376937163195039606], (-7260764223708720005, -7251657821052234309], (-4767213984179237293, -4747032371925842389], (-7645879646119667643, -7589962743703481776], (-2340199306656526861, -2320523117224780931], (-576028861239229331, -560973674020019962], (-4070378863644120252, -3987599893827407860], (-2551584407739673151, -2498779102482524711], (-5416061903556353312, -5354212455975869358], (37594980457713898, 67885601051654285], (3083778975065200884, 3091232478835418439], (3131345970514528877, 3187922544267434961], (5765437476661317163, 5778671293583720541], (5960610072466058818, 5972289771228014343], (7749618183851698485, 7758080813117351135], (-3987599893827407860, -3899198931034439776], (-7251657821052234309, -7131649010279865221], (-3576581915808403133, -3383928698815274642], (-417850207760366422, -327959672080599465], (-2671876682129336880, -2551584407739673151], (-1305178847032904465, -1137497074548854552], (8540448858050275827, 8610171849752115483], (-560973674020019962, -417850207760366422], (-2498779102482524711, -2340199306656526861], (2394447940525988167, 2523396860109747637], (-6703329224557608009, -6517757811218772762], (-3675103288021821677, -3576581915808403133], (-5622185785296846551, -5416061903556353312], (8610171849752115483, 8742605005068551458], (8068079250973315241, 8185655671734937642], (560264964510741191, 790641981923757238], (5581202487214475094, 5765437476661317163], (8742605005068551458, 8923908282731801645], (-6038176423022601107, -5622185785296846551], (5778671293583720541, 5960610072466058818], (-3899198931034439776, -3675103288021821677], (8356739976149429222, 8540448858050275827], (-6517757811218772762, -6038176423022601107], (-8052600134279395253, -7645879646119667643], (-327959672080599465, 37594980457713898], (7758080813117351135, 8019254284118543066], (4781565016737645510, 5067070718000527886], (2523396860109747637, 3083778975065200884], (-5354212455975869358, -4767213984179237293], (6784138025918878582, 7190719703944308372], (67885601051654285, 447405341661896387], (-2190610927722759275, -1305178847032904465], (-4747032371925842389, -4070378863644120252]}, size=48 After: INFO 2016-12-07 17:44:21,185 [shard 0] stream_session - cf_id = dec9fa90-bc3b-11e6-af78-000000000001, after ranges = {(-8052600134279395253, -7589962743703481776], (-7260764223708720005, -7131649010279865221], (-6703329224557608009, -3376937163195039606], (-2671876682129336880, -2320523117224780931], (-2190610927722759275, -1137497074548854552], (-576028861239229331, 447405341661896387], (560264964510741191, 790641981923757238], (2394447940525988167, 3091232478835418439], (3131345970514528877, 3187922544267434961], (4781565016737645510, 5067070718000527886], (5581202487214475094, 5972289771228014343], (6784138025918878582, 7190719703944308372], (7749618183851698485, 8019254284118543066], (8068079250973315241, 8185655671734937642], (8356739976149429222, 8923908282731801645]}, size=15	2016-12-12 11:09:26 +08:00
Asias He	1987264beb	streaming: Make streaming reader with ranges Now that we have the new interface to make readers with ranges, we can simplify the code a lot. 1) Less readers are needed before: number of ranges of readers after: smp::count readers at most 2) No foreign_ptr is needed There is no need to forward to a shard to make the foreign_ptr for send_info in the first phase and forward to that shard to execute the send_info in the second phase. 3) No do_with is needed in send_mutations since si now is a lw_shared_ptr 4) Fix possible user after free of 'si' in do_send_mutations We need to take a reference of 'si' when sending the mutation with send_stream_mutation rpc call, otherwise: msg1 got exception si->mutations_done.broken() si is freed msg2 got exception si is used again The issue is introduced in `dc50ce0ce5` (streaming: Make the mutation readers when streaming starts) which is master only, branch 1.5 is not affected.	2016-12-12 09:04:21 +08:00
Asias He	dc50ce0ce5	streaming: Make the mutation readers when streaming starts Currenlty we make the mutation readers for streaming at different time point, i.e., do_for_each(_ranges.begin(), _ranges.end(), [] (auto range) { make a mutation reader for this range read mutations from the reader and send }) If there are write workload in the background, we will stream extra data, since the later the reader is made the more data we need to send. Fix it by making all the readers before starting to stream. Fixes #1815 Message-Id: <1479341474-1364-2-git-send-email-asias@scylladb.com>	2016-11-17 12:41:53 +02:00
Avi Kivity	a35136533d	Convert ring_position and token ranges to be nonwrapping Wrapping ranges are a pain, so we are moving wrap handling to the edges. Since cql can't generate wrapping ranges, this means thrift and the ring maintenance code; also range->ring transformations need to merge the first and last ranges. Message-Id: <1478105905-31613-1-git-send-email-avi@scylladb.com>	2016-11-02 21:04:11 +02:00
Asias He	576e15532f	streaming: Add append_ranges for stream_transfer_task Allow to append more ranges to transfer for a stream transfer task.	2016-09-26 06:28:50 +08:00
Pekka Enberg	38a54df863	Fix pre-ScyllaDB copyright statements People keep tripping over the old copyrights and copy-pasting them to new files. Search and replace "Cloudius Systems" with "ScyllaDB". Message-Id: <1460013664-25966-1-git-send-email-penberg@scylladb.com>	2016-04-08 08:12:47 +03:00
Asias He	d146045bc5	Revert "Revert "streaming: Send mutations on all shards"" This brings back streaming on all shards. The bug in locator/abstract_replication_strategy is now fixed. This reverts commit `9f3061ade8`. Message-Id: <a79ce9cdd6f4af1c6088b89e1911b4b2ed1c10ae.1455589460.git.asias@scylladb.com>	2016-02-16 11:16:51 +02:00
Avi Kivity	9f3061ade8	Revert "streaming: Send mutations on all shards" This reverts commit `31d439213c`. Fixes #894. Conflicts: streaming/stream_manager.cc (may have undone part of `63a5aa6122`)	2016-02-09 18:26:14 +02:00
Asias He	31d439213c	streaming: Send mutations on all shards Currently, only the shard where the stream_plan is created on will send streaing mutations. To utilize all the available cores, we can make each shard send mutations which it is responsbile for. On the receiver side, we do not forward the mutations to the shard where the stream_session is created, so that we can avoid unnecessary forwarding. Note: the downside is that it is now harder to: 1) to track number of bytes sent and received 2) to update the keep alive timer upon receive of the STREAM_MUTATION To fix, we now store the sent/recieved bytes info on all shards. When the keep alive timer expires, we check if any progress has been made. Hopefully, this patch will make the streaming much faster and in turn make the repair/decommission/adding a node faster. Refs: https://github.com/scylladb/scylla/issues/849 Tested with decommission/repair dtest. Message-Id: <96b419ab11b736a297edd54a0b455ffdc2511ac5.1454645370.git.asias@scylladb.com>	2016-02-07 10:57:51 +02:00
Asias He	f07cd30c81	streaming: Remove unused create_message_for_retry	2016-01-29 16:31:07 +08:00
Asias He	2f48d402e2	streaming: Remove unused commented code	2016-01-29 16:31:07 +08:00
Asias He	91e245edac	streaming: Initialize total_size in stream_transfer_task Also rename the private member to _total_size and _files	2016-01-29 16:31:07 +08:00
Asias He	e8b8b454df	streaming: Flatten streaming messages class namespace There are only two messages: prepare_message and outgoing_file_message. Actually only the prepare_message is the message we send on wire. Flatten the namespace.	2016-01-26 13:04:29 +08:00
Asias He	2d32195c32	streaming: Invalidate cache upon receiving of stream When a node gain or regain responsibility for certain token ranges, streaming will be performed, upon receiving of the stream data, the row cache is invalidated for that range. Refs #484.	2015-12-21 14:44:13 +08:00
Avi Kivity	d5cf0fb2b1	Add license notices	2015-09-20 10:43:39 +03:00
Asias He	f15c98cbca	streaming: Fix create_message_for_retry It makes no sense to create an empty message to retry, we must retry a existing message.	2015-08-17 10:52:30 +08:00
Asias He	09aae35baa	streaming: Add move constructor for stream_transfer_task Otherwise code in stream_session.cc _transfers.emplace(cf_id, stream_transfer_task(shared_from_this(), cf_id)).first; will copy the stream_transfer_task and in turn copy the outgoing_file_message which is not copyable due to the semaphore member. Thanks avi for figuring this out.	2015-08-17 10:52:30 +08:00
Asias He	ccb32ceec5	streaming: Add stream_transfer_task::complete	2015-07-21 16:12:54 +08:00
Asias He	e82bdf2995	streaming: Swith to use shared_ptr from std::shared_ptr Since our shared_ptr works with incomplete types now, switch to it.	2015-07-14 20:41:14 +08:00
Asias He	14ae9e66ae	streaming: Use shared_ptr to track back to stream_session I tried our lw_shared_ptr, the compiler complained endless usage of incomplete type stream_session. I can not include stream_session.hh everywhere due to circular dependency. For now, I'm using std::shared_ptr which works fine.	2015-07-14 20:41:14 +08:00
Asias He	3256a21556	streaming: Use frozen_mutation to send mutations Each outgoing_file_message might contain multiple mutations. Send them one mutation per RPC call (using frozen_mutation), instead of one big outgoing_file_message per one RPC call.	2015-07-09 15:52:28 +08:00
Asias He	ad3692f666	streaming: Implement stream_session::add_transfer_ranges Given keyspace names, ranges and column_families names, figure out mutation_readers to transfer.	2015-07-09 15:52:27 +08:00
Asias He	e8d9e7f3c5	streaming: Add stream_session::retry	2015-06-26 08:31:28 +08:00
Asias He	c5729ab0c3	streaming: Add stream_session::received	2015-06-26 08:31:28 +08:00
Asias He	442f68af19	streaming: Implement virtual functions in stream_transfer_task	2015-06-24 16:13:30 +08:00
Asias He	4c9af76261	streaming: Move add_transfer_file to source file Reduce dependency to stream_session	2015-06-24 16:13:30 +08:00
Asias He	bb40fed991	streaming: Add add_transfer_file to stream_transfer_task	2015-06-24 16:13:23 +08:00
Asias He	334b1f81fc	streaming: Convert StreamTransferTask.java to C++	2015-06-18 14:55:07 +08:00

43 Commits