Port the Murmur hash functions from Java to C++. This is ironic,
because these functions actually originated in C code, which the
Cassandra guys converted to Java :-)
The Murmur hash is used in Cassandra for several things, including the
bloom filter (which is part of each sstable), and the default
data partitioner (Murmur3Partitioner).
I verified with example string input that all three methods (hash32, hash2_64
and hash3_x64_128) produce exactly the same output in the new C++ code as
they do in the original Java functions. This is important, because we want
the C++ node to be able to run alongside Java nodes, so they have to agree
on the same hash function.
Signed-off-by: Nadav Har'El <nyh@cloudius-systems.com>
[avi: update configure.py]
The constructor from "const char_type *" wouldn't really work when
char_type != char, because strlen() won't work on such pointers.
It is more convenient to have a constructor from an ordinary const char *
(e.g., a C string literal), and solve the type problem with an ugly cast.
This only makes sense when sizeof(char_type)==1 (i.e., it is char, unsigned
char, or signed char), but I think we make this assumption in other places
as well (e.g., in determining the size we need to allocate).
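
A minimal sketch of such a constructor, using a hypothetical mini_string
template as a stand-in for basic_sstring<> (the class name and the
std::vector storage are assumptions for illustration only). The
static_assert spells out the sizeof(char_type)==1 assumption:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical stand-in for basic_sstring<>; not the real class.
template <typename char_type>
class mini_string {
    static_assert(sizeof(char_type) == 1,
                  "char_type must be a one-byte character type");
    std::vector<char_type> _data;
public:
    // Take an ordinary C string so that strlen() is usable, then
    // reinterpret the bytes as char_type (the "ugly cast").
    mini_string(const char* s) {
        const std::size_t len = std::strlen(s);
        const char_type* p = reinterpret_cast<const char_type*>(s);
        _data.assign(p, p + len);
    }
    std::size_t size() const { return _data.size(); }
    const char_type* data() const { return _data.data(); }
};

inline void mini_string_example() {
    mini_string<signed char> s("abc");  // constructed from a C string literal
    assert(s.size() == 3);
    assert(s.data()[0] == 'a');
}
```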
Signed-off-by: Nadav Har'El <nyh@cloudius-systems.com>
std::string has an operator[] for accessing (reading or modifying) a single
character in the string. This patch adds the same operator to our
sstring.
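
For illustration, a minimal sketch of the operator pair on a hypothetical
tiny_string wrapper (the real sstring stores its data differently; only the
shape of the two overloads is the point here):

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical minimal wrapper; the real sstring differs internally.
class tiny_string {
    std::string _s;
public:
    explicit tiny_string(const char* s) : _s(s) {}
    // Non-const overload: allows modifying a character in place.
    char& operator[](std::size_t pos) { return _s[pos]; }
    // Const overload: read-only access.
    const char& operator[](std::size_t pos) const { return _s[pos]; }
    std::size_t size() const { return _s.size(); }
};

inline void tiny_string_example() {
    tiny_string s("cat");
    s[0] = 'b';                 // modify through the returned reference
    const tiny_string& cs = s;
    assert(cs[0] == 'b');       // read through the const overload
    assert(cs[2] == 't');
}
```

As with std::string, this sketch does no bounds checking.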
Signed-off-by: Nadav Har'El <nyh@cloudius-systems.com>
The basic_sstring<> template allows picking char_type - which instead of
being just "char" could be chosen to be something similar, like "unsigned char"
or "signed char".
Unfortunately some hard-coded uses of "char" were left in the code. This
did not cause any problems for existing code because of implicit conversions,
but it does cause problems once I try to implement operator[] when char_type
is not char. This results in an attempt to return a reference to the result
of a conversion, which is not allowed.
So this patch changes the leftover uses of "char" to "char_type".
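
A small sketch of the problem, using a hypothetical fixed_buf template
instead of the real basic_sstring<>. With char_type = signed char, an
operator[] declared to return "char&" cannot bind to the stored element, so
the return type has to be char_type&:

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <utility>

// Hypothetical buffer type, for illustration only.
template <typename char_type>
struct fixed_buf {
    char_type data[4] = {};
    // Declaring this as "char& operator[]" would not compile for
    // char_type = signed char: a char& cannot bind to a signed char
    // object, and a converted value would only be a temporary.
    char_type& operator[](std::size_t pos) { return data[pos]; }
    const char_type& operator[](std::size_t pos) const { return data[pos]; }
};

// The element reference has exactly the chosen character type.
static_assert(
    std::is_same<decltype(std::declval<fixed_buf<signed char>&>()[0]),
                 signed char&>::value,
    "operator[] yields char_type&");

inline void fixed_buf_example() {
    fixed_buf<unsigned char> b;
    b[1] = 200;             // representable because the element is unsigned char
    assert(b[1] == 200);
}
```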
Signed-off-by: Nadav Har'El <nyh@cloudius-systems.com>
Supports variadic logging with placeholders, e.g.

    logger.error("what happened? x = {}, y = {}", x, y);

Instantiate loggers as static thread_local, e.g.

    class foo {
        static thread_local logging::logger logger;
    };
    thread_local logging::logger foo::logger{logging::logger_for<foo>};
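
To show how "{}" placeholders can be filled from a variadic argument list,
here is a rough sketch. format_message and format_into are hypothetical
helpers written for this example; they are not the logger's actual
implementation, which may work differently:

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Base case: no arguments left, emit the rest of the format string.
inline void format_into(std::ostringstream& out, const char* fmt) {
    out << fmt;
}

// Recursive case: copy text up to the next "{}", print one argument,
// then continue with the remaining format string and arguments.
template <typename T, typename... Rest>
void format_into(std::ostringstream& out, const char* fmt,
                 const T& first, const Rest&... rest) {
    for (; *fmt; ++fmt) {
        if (fmt[0] == '{' && fmt[1] == '}') {
            out << first;
            format_into(out, fmt + 2, rest...);
            return;
        }
        out << *fmt;
    }
    // More arguments than placeholders: the extras are ignored here.
}

template <typename... Args>
std::string format_message(const char* fmt, const Args&... args) {
    std::ostringstream out;
    format_into(out, fmt, args...);
    return out.str();
}

inline void format_example() {
    assert(format_message("x = {}, y = {}", 1, "two") == "x = 1, y = two");
}
```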
Add partial support for get_slice(). With this, the cassandra-stress read
test can complete in thrift mode (yielding 78krps on my desktop).
Reviewed-by: Pekka Enberg <penberg@cloudius-systems.com>
If the tcb is destroyed (by, say, the connection being closed and an RST),
then any continuation launched from it would see it destroyed when it
executes.
Fix by protecting the tcb using a shared pointer reference.
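
The idea can be sketched with standard library types. The tcb struct, the
schedule_ack() method and the queue of std::function below are hypothetical
stand-ins, not the actual TCP code: the continuation captures a shared
pointer instead of a raw "this", so the object outlives its owner until the
continuation has run.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <vector>

// Hypothetical stand-in for the TCP control block.
struct tcb : std::enable_shared_from_this<tcb> {
    int bytes_acked = 0;

    // Schedule deferred work. Capturing a shared_ptr (not "this") keeps
    // the tcb alive until the continuation has been run and destroyed.
    void schedule_ack(std::vector<std::function<void()>>& queue, int n) {
        auto self = shared_from_this();
        queue.push_back([self, n] { self->bytes_acked += n; });
    }
};

inline void tcb_example() {
    std::vector<std::function<void()>> queue;
    auto conn = std::make_shared<tcb>();
    std::weak_ptr<tcb> watch = conn;
    conn->schedule_ack(queue, 5);
    conn.reset();               // connection closed: owner drops its reference
    assert(!watch.expired());   // still alive, the continuation holds a reference
    queue.front()();            // safe to run
    queue.clear();
    assert(watch.expired());    // destroyed once the continuation is gone
}
```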
Remove pre-poll-mode code, from Gleb:
"This series moves most of eventfd users to use other form of notification
which can be polled without entering the kernel and moving epoll in its own
poller which is enabled only if there is an fd that needs to be polled."
[avi: add -lrt to linker command line]
Cassandra allows even regular columns to be treated as a sorted map
(column name -> value), accessing it with get_slice(), so sort the column
names to support this.
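
As a rough illustration of why sorted names make slicing easy, the sketch
below keeps columns in a std::map and returns an inclusive name range. The
row alias and the slice() helper are assumptions made for this example, not
the database's real data structures or the thrift get_slice() handler:

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical row type: column name -> value, ordered by name.
using row = std::map<std::string, std::string>;

// Return the columns whose names fall in [start, finish], in name order.
inline std::vector<std::pair<std::string, std::string>>
slice(const row& r, const std::string& start, const std::string& finish) {
    std::vector<std::pair<std::string, std::string>> out;
    for (auto it = r.lower_bound(start);
         it != r.end() && it->first <= finish; ++it) {
        out.emplace_back(it->first, it->second);
    }
    return out;
}

inline void slice_example() {
    row r{{"d", "4"}, {"a", "1"}, {"c", "3"}, {"b", "2"}};
    auto cols = slice(r, "b", "c");
    assert(cols.size() == 2);
    assert(cols[0].first == "b");
    assert(cols[1].first == "c");
}
```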
The code attempts to use scattered_message::append_static() to gain zero
copy, claiming that _output protects the data's lifetime, but that's a
false claim - it will remain in the transmit queue waiting for an ACK
while we process (and respond to) the next request.
Fix by just writing the output naturally. We may add zero-copy support later,
though it is doubtful it will help thrift, as the code is full of copies
and atomics.
sstring's std::string conversion uses c_str() to construct the value,
but the conversion is broken if the value contains NUL - both sstring and
std::string can contain NULs, but C strings use them as a terminator.
Fix by using the pointer+length std::string constructor.
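
The difference between the two std::string constructors can be seen in a
short sketch. The helper names to_string_broken and to_string_fixed are made
up for this example; only the std::string constructors themselves are real:

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Broken: treats the buffer as a C string, so it stops at the first NUL.
inline std::string to_string_broken(const char* data, std::size_t /*len*/) {
    return std::string(data);
}

// Fixed: the (pointer, length) constructor copies exactly len bytes.
inline std::string to_string_fixed(const char* data, std::size_t len) {
    return std::string(data, len);
}

inline void nul_example() {
    const char raw[] = {'a', '\0', 'b'};
    assert(to_string_broken(raw, 3).size() == 1);  // truncated at the NUL
    assert(to_string_fixed(raw, 3).size() == 3);   // embedded NUL preserved
}
```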
The size of the sstring _ascii_prefix should also be added when computing
the item footprint. Without this change, the reclaimer would end up evicting
more than needed.
Signed-off-by: Raphael S. Carvalho <raphaelsc@cloudius-systems.com>
This patchset extends the database and thrift layers sufficiently to
pass the cassandra-stress insert test in thrift mode. I see 70krps
on my desktop (posix stack), likely limited by the client.
Using sstring can lead to confusion with UTF8 strings.
The Java byte type is signed, so make bytes' internal type be signed as
well (even though Cassandra tries to treat it as unsigned).
While we should use int8_t, sstring is not perfectly compatible with this
yet, so add a FIXME and use char instead.
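
To make the signed/unsigned distinction concrete, the sketch below compares
the same two byte values both ways. The helper names are hypothetical, and
the narrowing cast to int8_t assumes a two's complement platform (guaranteed
only from C++20 on, but true of common compilers before that):

```cpp
#include <cassert>
#include <cstdint>

// Compare two bytes the way a signed byte type (like Java's) would.
inline bool less_signed(std::uint8_t a, std::uint8_t b) {
    return static_cast<std::int8_t>(a) < static_cast<std::int8_t>(b);
}

// Compare the same two bytes as unsigned values.
inline bool less_unsigned(std::uint8_t a, std::uint8_t b) {
    return a < b;
}

inline void byte_order_example() {
    // 0x80 is -128 when signed but 128 when unsigned, so the order flips.
    assert(less_signed(0x80, 0x7f));
    assert(!less_unsigned(0x80, 0x7f));
}
```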