Define table_id as a distinct utils::tagged_uuid modeled after raft
tagged_id, so it can be differentiated from other uuid-class types,
in particular from table_schema_version.
Fixes#11207
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Currently, if a table is dropped during streaming, the streaming would
fail with no_such_column_family error.
Since the table is dropped anyway, it makes more sense to ignore the
streaming result of the dropped table, whether it is successful or
failed.
This allows users to drop tables during node operations, e.g., bootstrap
or decommission a node.
This is especially useful for the cloud users where it is hard to
coordinate between a node operation by admin and user cql change.
This patch also fixes a possible user after free issue by not passing
the table reference object around.
Fixes#10395Closes#10396
Instead of lengthy blurbs, switch to single-line, machine-readable
standardized (https://spdx.dev) license identifiers. The Linux kernel
switched long ago, so there is strong precedent.
Three cases are handled: AGPL-only, Apache-only, and dual licensed.
For the latter case, I chose (AGPL-3.0-or-later and Apache-2.0),
reasoning that our changes are extensive enough to apply our license.
The changes we applied mechanically with a script, except to
licenses/README.md.
Closes#9937
In the case there are large number of column families, the sender will
send all the column families in parallel. We allow 20% of shard memory
for streaming on the receiver, so each column family will have 1/N, N is
the number of in-flight column families, memory for memtable. Large N
causes a lot of small sstables to be generated.
It is possible there are multiple senders to a single receiver, e.g.,
when a new node joins the cluster, the maximum in-flight column families
is number of peer node. The column families are sent in the order of
cf_id. It is not guaranteed that all peers has the same speed so they
are sending the same cf_id at the same time, though. We still have
chance some of the peers are sending the same cf_id.
Fixes#3065
Message-Id: <46961463c2a5e4f1faff232294dc485ac4f1a04e.1513159678.git.asias@scylladb.com>
Now that we have the new interface to make readers with ranges, we can
simplify the code a lot.
1) Less readers are needed
before: number of ranges of readers
after: smp::count readers at most
2) No foreign_ptr is needed
There is no need to forward to a shard to make the foreign_ptr for
send_info in the first phase and forward to that shard to execute the
send_info in the second phase.
3) No do_with is needed in send_mutations since si now is a
lw_shared_ptr
4) Fix possible user after free of 'si' in do_send_mutations
We need to take a reference of 'si' when sending the mutation with
send_stream_mutation rpc call, otherwise:
msg1 got exception
si->mutations_done.broken()
si is freed
msg2 got exception
si is used again
The issue is introduced in dc50ce0ce5 (streaming: Make the mutation
readers when streaming starts) which is master only, branch 1.5 is not
affected.
Currenlty we make the mutation readers for streaming at different
time point, i.e.,
do_for_each(_ranges.begin(), _ranges.end(), [] (auto range) {
make a mutation reader for this range
read mutations from the reader and send
})
If there are write workload in the background, we will stream extra
data, since the later the reader is made the more data we need to send.
Fix it by making all the readers before starting to stream.
Fixes#1815
Message-Id: <1479341474-1364-2-git-send-email-asias@scylladb.com>
Wrapping ranges are a pain, so we are moving wrap handling to the edges.
Since cql can't generate wrapping ranges, this means thrift and the ring
maintenance code; also range->ring transformations need to merge the first
and last ranges.
Message-Id: <1478105905-31613-1-git-send-email-avi@scylladb.com>
Currently, only the shard where the stream_plan is created on will send
streaing mutations. To utilize all the available cores, we can make each
shard send mutations which it is responsbile for. On the receiver side,
we do not forward the mutations to the shard where the stream_session is
created, so that we can avoid unnecessary forwarding.
Note: the downside is that it is now harder to:
1) to track number of bytes sent and received
2) to update the keep alive timer upon receive of the STREAM_MUTATION
To fix, we now store the sent/recieved bytes info on all shards. When
the keep alive timer expires, we check if any progress has been made.
Hopefully, this patch will make the streaming much faster and in turn
make the repair/decommission/adding a node faster.
Refs: https://github.com/scylladb/scylla/issues/849
Tested with decommission/repair dtest.
Message-Id: <96b419ab11b736a297edd54a0b455ffdc2511ac5.1454645370.git.asias@scylladb.com>
There are only two messages: prepare_message and outgoing_file_message.
Actually only the prepare_message is the message we send on wire.
Flatten the namespace.
When a node gain or regain responsibility for certain token ranges,
streaming will be performed, upon receiving of the stream data, the
row cache is invalidated for that range.
Refs #484.
Otherwise code in stream_session.cc
_transfers.emplace(cf_id, stream_transfer_task(shared_from_this(), cf_id)).first;
will copy the stream_transfer_task and in turn copy the
outgoing_file_message which is not copyable due to the semaphore member.
Thanks avi for figuring this out.
I tried our lw_shared_ptr, the compiler complained endless usage of
incomplete type stream_session. I can not include stream_session.hh
everywhere due to circular dependency.
For now, I'm using std::shared_ptr which works fine.
Each outgoing_file_message might contain multiple mutations. Send them
one mutation per RPC call (using frozen_mutation), instead of one big
outgoing_file_message per one RPC call.