Booting a bootstrapped node had a race condition between setting and
advertising the state as bootstrapped (call to
storage_service::bootstrap) and setting and advertising the
state as normal (call to storage_service::set_tokens). As a result, a node
could get "stuck" in bootstrap mode.
With this patch, you must wait 5 seconds for the cluster to reach
a stable state.
Signed-off-by: Shlomi Livne <shlomi@cloudius-systems.com>
The size-tiered strategy basically consists of creating buckets of sstables
of nearly the same size.
Afterwards, it finds the most interesting bucket, whose sstable count must be
between the min threshold and the max threshold. The bucket with the smallest
average size is the most interesting one.
Bucket hotness is also considered when finding the most interesting bucket,
but we don't support this yet.
We are also missing some code that discards sstables based on their coldness,
i.e. sstables that are hardly read.
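A minimal sketch of the bucketing and selection described above, not the
code in this patch. The names make_buckets and most_interesting, the use of
plain byte sizes instead of sstable objects, and the 0.5/1.5 tolerance
factors are assumptions made for illustration only.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using bucket = std::vector<uint64_t>;  // sstable sizes in bytes (illustrative)

// Average size of the sstables in a bucket; 0 for an empty bucket.
static double average(const bucket& b) {
    double sum = 0;
    for (auto s : b) {
        sum += static_cast<double>(s);
    }
    return b.empty() ? 0.0 : sum / b.size();
}

// Walk the sizes in ascending order. A size joins the first bucket whose
// current average it is within [low, high] times of; otherwise it starts
// a new bucket. The low/high defaults are assumed values.
std::vector<bucket> make_buckets(std::vector<uint64_t> sizes,
                                 double low = 0.5, double high = 1.5) {
    std::sort(sizes.begin(), sizes.end());
    std::vector<bucket> buckets;
    for (auto size : sizes) {
        bool placed = false;
        for (auto& b : buckets) {
            double avg = average(b);
            if (size >= avg * low && size <= avg * high) {
                b.push_back(size);
                placed = true;
                break;
            }
        }
        if (!placed) {
            buckets.push_back(bucket{size});
        }
    }
    return buckets;
}

// Among buckets whose sstable count lies in [min_threshold, max_threshold],
// return the one with the smallest average size, or nullptr if none qualifies.
const bucket* most_interesting(const std::vector<bucket>& buckets,
                               std::size_t min_threshold,
                               std::size_t max_threshold) {
    const bucket* best = nullptr;
    for (const auto& b : buckets) {
        if (b.size() < min_threshold || b.size() > max_threshold) {
            continue;
        }
        if (!best || average(b) < average(*best)) {
            best = &b;
        }
    }
    return best;
}
```

Hotness and coldness are deliberately left out of the sketch, matching the
limitations listed above.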
Signed-off-by: Raphael S. Carvalho <raphaelsc@cloudius-systems.com>
compact_all_sstables selects all available sstables for compaction
and runs the compaction code on them.
This compaction code was moved to a more generic function,
compact_sstables, which compacts a given list of sstables.
Signed-off-by: Raphael S. Carvalho <raphaelsc@cloudius-systems.com>
Returns the sum of the sizes of all sstable components.
It will be used by the size-tiered strategy.
Signed-off-by: Raphael S. Carvalho <raphaelsc@cloudius-systems.com>
When running with boost unit-tests, cpu threads can be destroyed, causing
static object destructors to fail when they try to free data which was
allocated on a cpu thread that no longer exists.
Leak the memory instead.
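One common way to leak a static object on purpose is sketched below. The
registry() function and its element type are hypothetical and only show the
pattern; the patch itself may do this differently.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical process-wide registry. The object is heap-allocated once and
// never deleted, so no destructor runs at static-destruction time and nothing
// tries to free memory that belonged to a thread that is already gone.
std::vector<std::string>& registry() {
    static auto* instance = new std::vector<std::string>();
    return *instance;
}
```

The cost is a small, bounded leak at process exit, which the operating
system reclaims anyway.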
By using free functions that take the serializer as a parameter (rather than
'this'), we can add serialization functions without changing the serializer
class.
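A small sketch of the idea, assuming a hypothetical serializer that just
appends bytes to a buffer. The write() overloads and the endpoint type are
invented for illustration and are not the rpc API.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical serializer: owns only the output buffer.
struct serializer {
    std::string buf;
};

// Free function: writes a 32-bit value, least significant byte first.
inline void write(serializer& s, uint32_t v) {
    for (int i = 0; i < 4; ++i) {
        s.buf.push_back(static_cast<char>((v >> (8 * i)) & 0xff));
    }
}

// Free function: writes a length prefix followed by the string bytes.
inline void write(serializer& s, const std::string& v) {
    write(s, static_cast<uint32_t>(v.size()));
    s.buf.append(v);
}

// A type added later. Supporting it needs only a new overload; the
// serializer class itself is untouched.
struct endpoint {
    std::string host;
    uint32_t port;
};

inline void write(serializer& s, const endpoint& e) {
    write(s, e.host);
    write(s, e.port);
}
```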
Switch rpc from using futurized serialization/deserialization to
in-memory, atomic serialization. This is much faster, especially to
compile, and results in no loss of functionality, as it was illegal
for serialization to defer (except due to the connection being blocked),
because that would result in further rpc requests being stalled.
rpc currently allows serializers and deserializers to defer, because
the input and output stream may not be ready. They may not, however,
defer on behalf of the object being serialized or deserialized (i.e.
you cannot serialize to disk or deserialize from disk) because that
causes the tcp connection to block until serialization/deserialization is
complete. So in practice messages must be small enough to fit in memory,
and there is nothing gained by the complexity.
To simplify things, switch to non-deferring serialization. Add a frame
header to messages that specifies the buffer size, which allows rpc to
use read_exactly() to consume the message, and thereafter deserialize it
immediately.
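The framing can be sketched as follows. This is a synchronous stand-in, not
the rpc code: byte_stream, frame() and read_frame() are hypothetical names,
the 4-byte header width is an assumption, and the header is written in host
byte order for brevity.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <stdexcept>
#include <string>

// Hypothetical synchronous stand-in for a connection's input stream.
struct byte_stream {
    std::string data;
    std::size_t pos = 0;

    // Return exactly n bytes or throw; never a partial read.
    std::string read_exactly(std::size_t n) {
        if (data.size() - pos < n) {
            throw std::runtime_error("short read");
        }
        std::string out = data.substr(pos, n);
        pos += n;
        return out;
    }
};

// Serialize fully in memory, then prepend a header holding the payload size.
std::string frame(const std::string& payload) {
    uint32_t len = static_cast<uint32_t>(payload.size());
    std::string out(sizeof(len), '\0');
    std::memcpy(&out[0], &len, sizeof(len));  // host byte order (assumption)
    out += payload;
    return out;
}

// Read the header, then consume the whole message in one exact read. The
// returned buffer can be deserialized immediately, without deferring.
std::string read_frame(byte_stream& in) {
    std::string header = in.read_exactly(sizeof(uint32_t));
    uint32_t len = 0;
    std::memcpy(&len, header.data(), sizeof(len));
    return in.read_exactly(len);
}
```

Because the receiver knows the size up front, deserialization only ever runs
on a complete buffer.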
The result is significantly simpler, which should help with compile time.
"In the beginning, we needed to set fields in the schema, so we had set_
functions. Then the schema_builder came, and it acquired a lot of set_
functions.
Because its role is actually to build the schema, those functions clearly
belong there. In this sense, keeping them in schema as well is just
duplication.
Some of the functions were already orphans in this sense. No callers left.
But some others had callers. This patchset fixes them so they build through
the builder. And once this is done, schema will now consistently have getters
only.
This will become more pressing once I introduce the compact storage changes -
which is the reason I am doing this now. For those, the process of computing
the right value is a bit complicated - definitely not just val = x, and we
definitely don't need the code living in two places. It would be much
better if the existing users, especially the system tables, were already
fixed."
Make all callers go through the builder. Current callers, many of which are
in the system tables code, are patched.
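A reduced sketch of the resulting split, with setters only on the builder
and getters only on the schema. The two fields and the exact signatures are
simplified assumptions, not the real schema or schema_builder classes.

```cpp
#include <cassert>
#include <string>
#include <utility>

class schema_builder;

// Simplified schema: read-only once built.
class schema {
    std::string _comment;
    int _gc_grace_seconds = 0;
    friend class schema_builder;
    schema() = default;  // only the builder can create one from scratch
public:
    const std::string& comment() const { return _comment; }
    int gc_grace_seconds() const { return _gc_grace_seconds; }
};

// Simplified builder: the only place where fields can be set.
class schema_builder {
    schema _s;
public:
    schema_builder& set_comment(std::string c) {
        _s._comment = std::move(c);
        return *this;
    }
    schema_builder& set_gc_grace_seconds(int seconds) {
        _s._gc_grace_seconds = seconds;
        return *this;
    }
    schema build() const { return _s; }
};
```

A caller would then write something like
schema_builder().set_comment("...").set_gc_grace_seconds(0).build()
instead of mutating a schema in place.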
Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
gcc version 5.1.1 20150618 (Red Hat 5.1.1-4) (GCC)
[1/6] CXX build/release/log.o
FAILED: g++ -MMD -MT build/release/log.o -MF build/release/log.o.d
-std=gnu++1y -g -Wall -Werror -fvisibility=hidden -pthread -I.
-U_FORTIFY_SOURCE -I/usr/include/jsoncpp/ -Wno-maybe-uninitialized
-DHAVE_XEN -DHAVE_HWLOC -DHAVE_NUMA -O2 -I build/release/gen -c -o
build/release/log.o log.cc
log.cc: In member function ‘void
logging::logger::really_do_log(logging::log_level, const char*,
logging::logger::stringer**, size_t)’:
log.cc:86:19: error: no match for ‘operator<<’ (operand types are
‘std::ostream {aka std::basic_ostream<char>}’ and ‘std::ostringstream
{aka std::basic_ostringstream<char>}’)
std::cout << out;
^
log.cc:86:19: note: candidate: operator<<(int, int) <built-in>
log.cc:86:19: note: no known conversion for argument 2 from
‘std::ostringstream {aka std::basic_ostringstream<char>}’ to ‘int’
In file included from /usr/include/c++/5.1.1/iostream:39:0,
from core/sstring.hh:31,
from log.hh:8,
from log.cc:5:
calculate_natural_endpoints needs to limit the list of returned
addresses to the replication_factor. The current implementation returns
all nodes in the cluster.
For reference, Origin's code:
while (endpoints.size() < replicas && iter.hasNext())
{
    InetAddress ep = metadata.getEndpoint(iter.next());
    if (!endpoints.contains(ep))
        endpoints.add(ep);
}
}
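A C++ sketch of the same loop, for illustration only. The natural_endpoints
name, the use of strings for addresses, and the ring_order parameter (the
endpoints owning successive tokens, in ring order) are assumptions and not
the signatures used in the tree.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Collect distinct endpoints in ring order and stop as soon as `replicas`
// of them have been found, instead of returning every node in the cluster.
std::vector<std::string> natural_endpoints(
        const std::vector<std::string>& ring_order, std::size_t replicas) {
    std::vector<std::string> endpoints;
    for (auto it = ring_order.begin();
         endpoints.size() < replicas && it != ring_order.end(); ++it) {
        // Skip endpoints that were already selected via an earlier token.
        if (std::find(endpoints.begin(), endpoints.end(), *it) == endpoints.end()) {
            endpoints.push_back(*it);
        }
    }
    return endpoints;
}
```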
Signed-off-by: Shlomi Livne <shlomi@cloudius-systems.com>
This is meant to allow std::moving the returned object when needed.
Otherwise std::move(s.get_vector()) will be degraded to copying.
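The effect can be reproduced with a toy type, assuming the getter used to
return a const reference. The holder struct and the get_vector_const name
are hypothetical and exist only to show both variants side by side.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical holder with both flavours of getter.
struct holder {
    std::vector<std::string> _v;

    // std::move() on this result yields a const rvalue, which can only bind
    // to the copy constructor, so the vector is copied.
    const std::vector<std::string>& get_vector_const() const { return _v; }

    // std::move() on this result yields a non-const rvalue, so the move
    // constructor is selected and the contents are stolen.
    std::vector<std::string>& get_vector() { return _v; }
};
```

With these getters, std::move(h.get_vector_const()) leaves h._v intact,
while std::move(h.get_vector()) used to construct a new vector leaves
h._v empty.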
Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
Tomek pointed out that we shouldn't be passing a reference to commitlog every
time we use the add_column_family interface, because that will at times pass a
reference to a null object.
Test that, and pass no_commitlog if there is none.
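A rough sketch of the check, using stand-in types. Only the names commitlog,
no_commitlog and add_column_family come from the message above; the
column_family layout and every signature here are assumptions.

```cpp
#include <cassert>
#include <memory>

struct commitlog {};     // stand-in for the real commitlog
struct no_commitlog {};  // tag type meaning "there is no commitlog"

// Stand-in column family that may or may not be attached to a commitlog.
struct column_family {
    commitlog* cl;
    explicit column_family(commitlog& c) : cl(&c) {}
    explicit column_family(no_commitlog) : cl(nullptr) {}
    bool has_commitlog() const { return cl != nullptr; }
};

// Dereference the commitlog only after checking that it exists; otherwise
// take the explicit no_commitlog path instead of binding a reference to a
// null object.
column_family add_column_family(const std::unique_ptr<commitlog>& cl) {
    if (cl) {
        return column_family(*cl);
    }
    return column_family(no_commitlog{});
}
```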
Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
The patch introduces reconciliation code. The same code is supposed to
work for both range and single-key queries. Handling of raw_limit,
short reads and read repairs is still very much missing.
--
v1->v2:
- call live_row_count() only once.