That's what we're trying to standardize on.
This patch also fixes an issue with current query::result::serialize()
not being const-qualified, because it modifies the
buffer. messaging_service did a const cast to work this around, which
is not safe.
This patch adds the implementation to the get_version.
After this patch the following url will be available:
messaging_service/version?addr=127.0.0.1
Signed-off-by: Amnon Heiman <amnon@scylladb.com>
"This series allows the compaction manager to be used by the nodetool as a stub implementation.
It has two changes:
* Add to the compaction manager API a method that returns a compaction info
object
* Stub all the compaction method so that it will create an unimplemented
warning but will not fail, the API implementation will be reverted when the
work on compaction will be completed."
This patch adds the column family API that return the snapshot size.
The changes in the swagger definition file follo origin so the same API will be used for the metric and the
column_family.
The implementation is based on the get_snapshot_details in the
column_family.
This fix:
425
Signed-off-by: Amnon Heiman <amnon@scylladb.com>
Backport: CASSANDRA-10330
ae4cd69 Print versions for gossip states in gossipinfo
For instance, the version for each state, which can be useful for
diagnosing the reason for any missing states. Also instead of just
omitting the TOKENS state, let's indicate whether the state was actually
present or not.
With
Node 1 (Seed node, Port 7000 is opened, 10.184.9.144)
Node 2 (Port 7000 is opened, 10.184.9.145)
Node 3 (Port 7000 is blocked by firewall)
On Node 3, we saw the following error which was very confusing: Node 3
saw Node 1 and Node 3 but it complained it can not contact any seeds.
The message "Node 10.184.9.144 is now part of the cluster" and friends
are actually messages printed during the gossip shadow round where Node
3 connects to Node 1's port 7000 and Node 1 returns all info it knows to
Node 3, so that Node 3 knows Node 1 and Node 2 and we see the "Node
10.184.9.144/145 is now part of the cluster" message.
However, during the normal gossip round, Node 3 will not mark Node 1 and
Node 2 UP until the Seed node initiates a gossip round to Node 3, (note
port 7000 on node 3 is blocked in this case). So Node 3 will not mark
Node 1 and Node 2 UP and we see the "Unable to contact any seeds" error.
[shard 0] storage_service - Loading persisted ring state
[shard 0] gossip - Node 10.184.9.144 is now part of the cluster
[shard 0] gossip - inet_address 10.184.9.144 is now UP
[shard 0] gossip - Node 10.184.9.145 is now part of the cluster
[shard 0] gossip - inet_address 10.184.9.145 is now UP
[shard 0] storage_service - Starting up server gossip
scylla_run[12479]: Start gossiper service ...
[shard 0] storage_service - JOINING: waiting for ring information
[shard 0] storage_service - JOINING: schema complete, ready to bootstrap
[shard 0] storage_service - JOINING: waiting for pending range calculation
[shard 0] storage_service - JOINING: calculation complete, ready to bootstrap
[shard 0] storage_service - JOINING: getting bootstrap token
[shard 0] storage_service - JOINING: sleeping 5000 ms for pending range setup
scylla_run[12479]: Exiting on unhandled exception of type 'std::runtime_error': Unable to contact any seeds!
Backported: CASSANDRA-8336 and CASSANDRA-9871
84b2846 remove redundant state
b2c62bb Add shutdown gossip state to prevent timeouts during rolling restarts
8f9ca07 Cannot replace token does not exist - DN node removed as Fat Client
Fixes:
When X is shutdown, X sends SHUTDOWN message to both Y and Z, but for
some reason, only Y receives the message and Z does not receive the
message. If Z has a higher gossip version for X than Y has for
X, Z will initiate a gossip with Y and Y will mark X alive again.
X ------> Y
\ /
\ /
Z
Fixes: #593
"Changes the parser/replayer to treat data corruption as non-fatal,
skipping as little as possible to get the most data out of a segment,
but keeping track of, and reporting, the amount corrupted.
Replayer handles this and reports any non-fatal errors on replay finish.
Also added tests for corruption cases.
This patch series contains a cleanup-patch for commitlog_tests that was
previously submitted, but got lost."
If something bad happens between write request handler creation and
request execution the request handler have to be destroyed. Currently
code tries to do that explicitly in all places where request may be
abandoned, but it misses some (at least one). This patch replaces this
by introducing unique_response_handler object that will remove the handler
automatically if request is not executed for some reason.
Rename antlr3-tool to antlr3 (same as distribution package), and use distribution version if it's available
Signed-off-by: Takuya ASADA <syuu@scylladb.com>
"Before this change, populations could race with update from flushed
memtable, which might result in cache being populated with older
data. Populations started before the flush are not considering the
memtable nor its sstable.
The fix employed here is to make update wait for populations which
were started before the flushed memtable's sstable was added to the
undrelying data source. All populatinos started after that are
guaranteed to see the new data. The update() call will wait only for
current populating reads to complete, it will not wait for readers to
get advanced by the consumer for instance."
To avoid a race where natural endpoint was updated to contain node A,
but A was not yet removed from pending endpoints.
This fixes the root cause of commit d9d8f87c1 (storage_proxy: filter out
natural endpoints from pending endpoint). This patch alone fixes#539,
but we still want commit d9d8f87c1 to be safe.
When other bootstrapping/leaving/moving nodes are found during
bootstrap, instead of throwing immediately, sleep and try again for one
minute, hoping other nodes will finish the operation soon.
Since we are retrying using shadow gossip round more than once, we need
to put the gossip state back to shadow round after each shadow round, to
make shadow round works correctly.
This is useful when starting an empty cluster for testing. E.g,
$ scylla --listen-address 127.0.0.1
$ sleep 3
$ scylla --listen-address 127.0.0.2
$ sleep 3
$ scylla --listen-address 127.0.0.3
Without this patch, node 3 will hit the check.
TIME STATUS
-----------------------
Node 1:
32:00 Starts
32:00 In NORMAL status
Node 2:
32:03 Starts
32:04 In BOOT status
32:10 In NORMAL status
Node 3:
32:06 Starts
32:06 Found node 2 in BOOT status, hit the check, sleep and try again
32:11 Found node 2 in NORMAL status, can keep going now
32:12 In BOOT status
32:18 In NORMAL status
When other bootstrapping/leaving/moving nodes are found during
bootstrap, instead of throwing immediately, sleep and try again for one
minute, hoping other nodes will finish the operation soon.
This is useful when starting an empty cluster for testing. E.g,
$ scylla --listen-address 127.0.0.1
$ scylla --listen-address 127.0.0.2
$ scylla --listen-address 127.0.0.3
Without this patch, node 3 will hit the check.
TIME STATUS
-----------------------
Node 1:
25:19 Starts
25:20 In NORMAL status
Node 2:
25:19 Starts
25:23 In BOOT status
25:28 In NORMAL status
Node 3:
25:19 Starts
25:24 Found node 2 in BOOT status, hit the check, sleep and try again
25:29 Found node 2 in NORMAL status, can keep going now
25:29 In BOOT status
25:34 In NORMAL status
Before this change, populations could race with update from flushed
memtable, which might result in cache being populated with older
data. Populations started before the flush are not considering the
memtable nor its sstable.
The fix employed here is to make update wait for populations which
were started before the flushed memtable's sstable was added to the
undrelying data source. All populatinos started after that are
guaranteed to see the new data.