Remove inclusions from header files (primary offender is fb_utilities.hh)
and introduce new messaging_service_fwd.hh to reduce rebuilds when the
messaging service changes.
Message-Id: <1475584615-22836-1-git-send-email-avi@scylladb.com>
Try to emulate Origin's behaviour for batch replay. Origin uses an
explicit write handler, combining
1.) Hinting to all known dead endpoints
2.) Sending to all presumed live endpoints, requiring an ack from all
3.) Hinting to endpoints to which the send failed.
We don't have hints, so try to work around by doing send with
cl=ALL, and if send fails (wholly or partially), retain the
batch in the log.
This is still a slight behavioural difference, and we also risk
filling up the batch log in extreme cases (though probably not
in any real environment).
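The workaround described above can be sketched roughly as follows. This is
a minimal illustration, not the actual batchlog_manager code: the batch
struct, send_fn and replay_and_collect_retained are hypothetical names, and
the send callback is assumed to report success only when every replica
acknowledged the write (cl=ALL).

```cpp
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a batch stored in the batch log.
struct batch {
    std::string id;
};

// Hypothetical send function: returns true only if every replica acknowledged
// the write (cl=ALL); returns false or throws on partial or total failure.
using send_fn = std::function<bool(const batch&)>;

// Replays each batch and returns the ones that must stay in the log.
inline std::vector<batch> replay_and_collect_retained(std::vector<batch> log,
                                                      const send_fn& send_with_cl_all) {
    std::vector<batch> retained;
    for (auto& b : log) {
        bool acked_by_all = false;
        try {
            acked_by_all = send_with_cl_all(b);
        } catch (...) {
            acked_by_all = false;  // treat any send error as a failed replay
        }
        if (!acked_by_all) {
            retained.push_back(std::move(b));  // keep for a later replay attempt
        }
    }
    return retained;
}
```

Batches that stay in the returned vector are the ones that would otherwise
have been covered by hints, which is where the risk of the log growing
comes from.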
Refs #1222
Message-Id: <1466444170-23797-1-git-send-email-calle@scylladb.com>
During decommission, the storage_service::unbootstrap() needs to
initiate a batchlog replay operation. To sync the replay operation
initiated by the timer in batchlog_manager and storage_service, a
semaphore is introduced. To simplify the semaphore locking, the
management code now always runs on shard zero, but the real work is
distributed to all shards.
We need to be able to replay mutations created using older versions of
the table's schema. frozen_mutation can be only read using the version
it was serialized with, and there is no guarantee that the node will
know this version at the time of replay. Currently versions are kept
in-memory, so a node forgets all past versions when it restarts.
To solve this, let's store canonical_mutations which, like data in
sstables, can be read using any later schema version of given table.
The ring delay is hard-coded as 30 seconds at the moment.
Usage:
$ scylla --ring-delay-ms 5000
Time a node waits to hear from other nodes before joining the ring in
milliseconds.
Same as -Dcassandra.ring_delay_ms in cassandra.
Replace db_clock::now_in_usec() and db_clock::now() * 1000 accesses
where the intent is to create a new auto-generated cell timestamp with
a call to new_timestamp(). Now the knowledge of how to create timestamps
is in a single place.
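As an illustration of the idea, a single helper can own the timestamp
resolution. The sketch below is an assumption, not the real new_timestamp():
it uses std::chrono::system_clock instead of db_clock, and the sketch
namespace and timestamp_type alias are made up for the example. The
microsecond resolution follows from the now_in_usec() and now() * 1000
call sites being replaced.

```cpp
#include <chrono>
#include <cstdint>

namespace sketch {

using timestamp_type = std::int64_t;

// Hypothetical single point of truth: microseconds since the
// system_clock epoch. Callers no longer scale clock values themselves.
inline timestamp_type new_timestamp() {
    using namespace std::chrono;
    return static_cast<timestamp_type>(
        duration_cast<microseconds>(system_clock::now().time_since_epoch()).count());
}

} // namespace sketch
```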
Since bytes is a very generic value that is returned from many calls,
it is easy to pass it by mistake to a function expecting a data_value,
and to get a wrong result. It is impossible for the data_value constructor
to know if the argument is a genuine bytes variable, a data_value of another
type, but serialized, or some other serialized data type.
To prevent misuse, make the data_value(bytes) constructor
(and the complementary data_value(optional<bytes>)) explicit.
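A small sketch of the effect, using simplified stand-ins rather than the
real types: bytes is assumed to be a std::string alias here, data_value is
reduced to an optional payload, and serialized_size is a hypothetical
consumer. With explicit constructors, passing a bytes value where a
data_value is expected no longer compiles silently.

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical stand-in; the real bytes type is richer.
using bytes = std::string;

class data_value {
    std::optional<bytes> _value;
public:
    // explicit: a bytes value is never converted to data_value implicitly.
    explicit data_value(bytes b) : _value(std::move(b)) {}
    explicit data_value(std::optional<bytes> b) : _value(std::move(b)) {}
    bool is_null() const { return !_value.has_value(); }
    const bytes& get() const { return *_value; }
};

// Hypothetical consumer that expects a data_value.
inline std::size_t serialized_size(const data_value& v) {
    return v.is_null() ? 0 : v.get().size();
}

// serialized_size(some_bytes) is now a compile error; the caller has to
// spell out the conversion: serialized_size(data_value(some_bytes)).
static_assert(!std::is_convertible<bytes, data_value>::value,
              "bytes must not convert to data_value implicitly");
```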
Since replay is a "node global" operation, we should not attempt to
do it in parallel on each shard. It will just overlap/interfere.
We could just run this on cpu 0, but since this _could_ be a lengthy
operation, each timer callback is round-robined across shards just in case...
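The round-robin choice can be sketched as below. This is only the
shard-selection part, without any Seastar calls; round_robin_shard_picker
is a hypothetical name, and the real code would submit the replay to the
picked shard from the timer callback.

```cpp
#include <cassert>

// Hypothetical helper: hands out shard ids 0, 1, ..., n-1, 0, 1, ...
class round_robin_shard_picker {
    unsigned _shard_count;
    unsigned _next = 0;
public:
    explicit round_robin_shard_picker(unsigned shard_count)
        : _shard_count(shard_count) {
        assert(shard_count > 0);  // at least one shard is required
    }
    // Returns the shard for this timer tick and advances to the next one.
    unsigned pick() {
        unsigned shard = _next;
        _next = (_next + 1) % _shard_count;
        return shard;
    }
};
```

Only one shard runs the replay per tick, so replays never overlap, while
the cost of a long replay is spread over all shards across ticks.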
Align with the rest of the file (for better or worse). This allows calls
from entities without a query_processor handy (i.e. storage_proxy).
Added "minimal" setup method for the "global" state, to facilitate
tests. Doing a full setup either in cql_test_env or after it is created
breaks badly (not sure why), hence this quick workaround.
Updated the call sites of the two current users (batchlog_manager and
commitlog_replayer) to conform.
config.hh changes rapidly, so don't force lots of recompiles by including it.
Need to place seed_provider_type in namespace scope, so we can forward
declare it for that.
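The reason for the move: a type nested inside a class cannot be declared
without the enclosing class's definition, whereas a namespace-scope type
can be forward declared on its own. A sketch, assuming the type lives in
namespace db; the members shown and the seed_count function are made up
for illustration:

```cpp
#include <cstddef>
#include <string>
#include <vector>

// --- what a lightweight header can contain ---
namespace db {
struct seed_provider_type;  // forward declaration; no config.hh include needed
}

// Declarations may mention the incomplete type by reference or pointer.
std::size_t seed_count(const db::seed_provider_type& provider);

// --- what only the implementation file needs to see ---
namespace db {
// Hypothetical members, for illustration only.
struct seed_provider_type {
    std::string class_name;
    std::vector<std::string> seeds;
};
}

std::size_t seed_count(const db::seed_provider_type& provider) {
    return provider.seeds.size();
}
```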
The logger class constructor registers itself with the logger registry,
in order to enable dynamically setting log levels. However, since
thread_local variables may be (and are) initialized at the time of first
use, when the program starts up no loggers are registered.
Fix by making loggers global, not thread_local. This requires that the
registry use locking to prevent registration happening on different threads
from corrupting the registry.
Note that technically global variables can also be initialized at the
point of first use, and there is no portable way for classes to self-register.
However this is the best we can do.
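A minimal sketch of the arrangement described above, not the actual logger
implementation: logger_registry, registry(), log_level and example_log are
simplified, hypothetical stand-ins. The registry guards its map with a
mutex, and the logger is a plain global so its constructor runs (and
registers) during static initialization rather than on first use in some
thread.

```cpp
#include <atomic>
#include <map>
#include <mutex>
#include <string>
#include <utility>

enum class log_level { error, warn, info, debug, trace };

class logger;

// Hypothetical registry; the mutex serializes registration from any thread.
class logger_registry {
    std::mutex _mutex;
    std::map<std::string, logger*> _loggers;
public:
    void register_logger(const std::string& name, logger* l) {
        std::lock_guard<std::mutex> guard(_mutex);
        _loggers[name] = l;
    }
    void unregister_logger(const std::string& name) {
        std::lock_guard<std::mutex> guard(_mutex);
        _loggers.erase(name);
    }
    // Returns false if no logger with that name is registered.
    bool set_level(const std::string& name, log_level level);
};

// Function-local static: constructed on first use, so it exists before any
// global logger registers itself.
inline logger_registry& registry() {
    static logger_registry r;
    return r;
}

class logger {
    std::string _name;
    std::atomic<log_level> _level{log_level::info};
public:
    explicit logger(std::string name) : _name(std::move(name)) {
        registry().register_logger(_name, this);
    }
    ~logger() { registry().unregister_logger(_name); }
    logger(const logger&) = delete;
    logger& operator=(const logger&) = delete;
    log_level level() const { return _level.load(); }
    void set_level(log_level l) { _level.store(l); }
};

inline bool logger_registry::set_level(const std::string& name, log_level level) {
    std::lock_guard<std::mutex> guard(_mutex);
    auto it = _loggers.find(name);
    if (it == _loggers.end()) {
        return false;
    }
    it->second->set_level(level);
    return true;
}

// Global rather than thread_local, so it registers during static initialization.
static logger example_log("example");
```

With this layout, registry().set_level("example", log_level::debug) can
find the logger even if no thread has logged through it yet.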
Somewhat simplified version of the Origin code, since from what I
can see, there is less need for us to do explicit query sends in
the BLM itself, instead we can just go through storage_proxy.
I could be wrong though.