This patch serves as an example of how we can add wrappers for
ms.send_message and ms.register_handler.
Once all of their users are converted, messaging_service.hh will no
longer need to include rpc.hh.
Two ring_positions are equal if their tokens and keys are equal, or if
their tokens are equal and one or both of them do not specify a key. So
a ring_position without a key is a wildcard that equals any
ring_position with the same token.
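
A minimal sketch of the comparison rule above, assuming a simplified
ring_position with an integer token and an optional string key. The
struct layout and the free function equal() are hypothetical stand-ins,
not the actual classes:

```cpp
#include <cstdint>
#include <optional>
#include <string>

// Hypothetical stand-in types; the real token and key classes are richer.
struct ring_position {
    int64_t token;
    std::optional<std::string> key;  // empty means "no key specified"
};

// Equal when the tokens match and either side lacks a key (wildcard),
// or when the tokens match and both keys match.
inline bool equal(const ring_position& a, const ring_position& b) {
    if (a.token != b.token) {
        return false;
    }
    if (!a.key || !b.key) {
        return true;
    }
    return *a.key == *b.key;
}
```

Note that a relation defined this way is not transitive: a keyless
position equals two positions with different keys on the same token,
while those two do not equal each other.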
"Fixes for ORDER BY clauses" from Pawel.
"The patches fix several issues in CQL3 frontend related to ORDER BY
clauses. Also, column component indexes are now handled properly and it
is possible to create tables with more than one column in the clustering key."
When passing tokens corresponding to the 129th key in the sstable to
read_range_rows(), it failed with heap-buffer-overflow pointing to:

    return make_ready_future<uint64_t>(index_list[min_index_idx].position);

The scenario is as follows. We pass the lower bound token, which
corresponds to the first partition of some (not first) summary
page. That token will compare less than any entry in that page (even
less than the key we took it from, because we want all partitions with
that token), so min_idx will point to the previous summary page
(correct). Then this code tries to locate the position in the previous
page:

    auto m = adjust_binary_search_index(this->binary_search(index_list, minimum_key(), min_token));
    auto min_index_idx = m >= 0 ? m : 0;

binary_search() will return (-index_list.size() - 1), because the
token is greater than anything in that page. So "m" and
"min_index_idx" will be (index_list.size() - 1) after adjusting.
Then the code tried this:

    auto candidate = key_view(bytes_view(index_list[min_index_idx]));
    auto tcandidate = dht::global_partitioner().get_token(candidate);
    if (tcandidate < min_token) {
        min_index_idx++;
    }

The last key also compared less than the token, so min_index_idx was
bumped up to index_list.size(). The code then used this out-of-range
index on index_list, which caused the buffer overflow.
We clearly need to return the first position of the next page in this
case, and this change does it indirectly by calling
data_end_position(), which also handles edge cases such as there being
no next summary page.
I reimplemented the logic top-down, and found that the last special
casing for tcandidate was not needed, so I removed it.
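
The index arithmetic described above can be reproduced with a small
sketch. It is not the code from this patch: index_entry, search(),
unchecked_min_index() and min_position() are hypothetical names, tokens
are plain integers, and page_end stands in for whatever
data_end_position() would supply.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for an index entry; real entries hold a key.
struct index_entry {
    int64_t token;
    uint64_t position;  // offset of the partition in the data file
};

// Returns the match index, or -(insertion point) - 1 on a miss.
inline int search(const std::vector<index_entry>& page, int64_t token) {
    auto it = std::lower_bound(page.begin(), page.end(), token,
        [](const index_entry& e, int64_t t) { return e.token < t; });
    int pos = static_cast<int>(it - page.begin());
    return (it != page.end() && it->token == token) ? pos : -pos - 1;
}

// Index computed by the steps described above; may equal page.size().
inline std::size_t unchecked_min_index(const std::vector<index_entry>& page,
                                       int64_t min_token) {
    int m = search(page, min_token);
    if (m < 0) {
        m = (-m - 1) - 1;  // last entry sorting before min_token, or -1
    }
    std::size_t idx = m >= 0 ? static_cast<std::size_t>(m) : 0;
    if (idx < page.size() && page[idx].token < min_token) {
        ++idx;  // every entry up to idx sorts before min_token
    }
    return idx;
}

// Bounds-checked variant: falls back to page_end when no entry qualifies.
inline uint64_t min_position(const std::vector<index_entry>& page,
                             int64_t min_token, uint64_t page_end) {
    std::size_t idx = unchecked_min_index(page, min_token);
    return idx < page.size() ? page[idx].position : page_end;
}
```

With a three-entry page and a token greater than all of them,
unchecked_min_index() yields 3, which is one past the last valid index.
min_position() returns page_end in that case instead of indexing the
page.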
The order of columns that belong to partition key or clustering key
needs to be preserved.
Signed-off-by: Paweł Dziepak <pdziepak@cloudius-systems.com>
result_set may contain more columns than user requested (for instance if
some of them were needed to properly sort output). The additional columns
are always the last ones and are not included in metadata::column_count().
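
As an illustration only, a consumer could drop the trailing helper
columns as sketched below. result_set_sketch and visible_rows() are
hypothetical and much simpler than the real result_set and metadata
classes; column_count here plays the role of metadata::column_count().

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical simplified result set; the real classes differ.
struct result_set_sketch {
    std::size_t column_count;                    // columns the user requested
    std::vector<std::vector<std::string>> rows;  // may carry extra trailing columns
};

// Copies only the requested columns, dropping trailing helper columns.
inline std::vector<std::vector<std::string>> visible_rows(const result_set_sketch& rs) {
    std::vector<std::vector<std::string>> out;
    out.reserve(rs.rows.size());
    for (const auto& row : rs.rows) {
        auto n = static_cast<std::ptrdiff_t>(std::min(rs.column_count, row.size()));
        out.emplace_back(row.begin(), row.begin() + n);
    }
    return out;
}
```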
Signed-off-by: Paweł Dziepak <pdziepak@cloudius-systems.com>
The logger class constructor registers itself with the logger registry,
in order to enable dynamically setting log levels. However, since
thread_local variables may be (and are) initialized at the time of first
use, when the program starts up no loggers are registered.
Fix by making loggers global, not thread_local. This requires that the
registry use locking to prevent registration happening on different threads
from corrupting the registry.
Note that technically global variables can also be initialized at the
point of first use, and there is no portable way for classes to self-register.
However, this is the best we can do.
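
A minimal sketch of the self-registration pattern with a locked
registry follows. All names (registry_sketch, logger_sketch,
registry()) are hypothetical and the real logger carries far more
state. The function-local static for the registry is an assumption of
this sketch, chosen so that the registry exists before the first logger
registers.

```cpp
#include <cstddef>
#include <map>
#include <mutex>
#include <string>
#include <utility>

class logger_sketch;

// Hypothetical registry; a mutex guards the map against concurrent registration.
class registry_sketch {
    std::mutex _mutex;
    std::map<std::string, logger_sketch*> _loggers;
public:
    void register_logger(const std::string& name, logger_sketch* l) {
        std::lock_guard<std::mutex> guard(_mutex);
        _loggers[name] = l;
    }
    void unregister_logger(const std::string& name) {
        std::lock_guard<std::mutex> guard(_mutex);
        _loggers.erase(name);
    }
    std::size_t size() {
        std::lock_guard<std::mutex> guard(_mutex);
        return _loggers.size();
    }
};

// Function-local static, so the registry is constructed on first use.
inline registry_sketch& registry() {
    static registry_sketch r;
    return r;
}

// Registers itself on construction and unregisters on destruction.
class logger_sketch {
    std::string _name;
public:
    explicit logger_sketch(std::string name) : _name(std::move(name)) {
        registry().register_logger(_name, this);
    }
    ~logger_sketch() { registry().unregister_logger(_name); }
    logger_sketch(const logger_sketch&) = delete;
    logger_sketch& operator=(const logger_sketch&) = delete;
};
```

A logger declared at namespace scope, for example
logger_sketch my_log("component");, would then register once per
process rather than once per thread on first use.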
"With this series, we can now create and start a stream_plan, ask it to stream a
user-created table. The negotiation messages can be exchanged and the mutations
can be sent and received.
I'm still trying to find a way to verify that the mutations reach the sstable
correctly on the peer node."
Now stream_result_future can create a stream_coordinator if one is not
provided.
So:
- On the sending side, stream_coordinator is created by stream_plan
- On the receiving side, stream_coordinator is created by stream_result_future
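
The "create if not provided" behaviour can be sketched as below. The
class names and the constructor signature are hypothetical
simplifications, not the actual streaming classes:

```cpp
#include <memory>
#include <utility>

struct stream_coordinator_sketch {};  // hypothetical stand-in

class stream_result_future_sketch {
    std::shared_ptr<stream_coordinator_sketch> _coordinator;
public:
    // Uses the caller's coordinator when given one, otherwise creates its own.
    explicit stream_result_future_sketch(
            std::shared_ptr<stream_coordinator_sketch> c = nullptr)
        : _coordinator(c ? std::move(c)
                         : std::make_shared<stream_coordinator_sketch>()) {}

    const std::shared_ptr<stream_coordinator_sketch>& coordinator() const {
        return _coordinator;
    }
};
```

In this sketch the sending side would pass in the coordinator it
already owns, while the receiving side would construct the object with
no argument and let it allocate one.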