While scylla-ami-setup.service is running, login message says "run systemctl status scylla-server" to see status, but it actually never launched yet.
This patch fixes the message to notice RAID construction is running, and 'systemctl status scylla-ami-setup' is the correct way to see status.
Fixes#1035
Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <1460660628-10103-2-git-send-email-syuu@scylladb.com>
From Glauber:
There are current some outstanding issues with the throttling code. It's
easier to see them with the streaming code, but at least one of them is general.
One of them is related to situations in which the amount of memory available
leaves only one memtable fitting in memory. That would only happen with the
general code if we set the memtable cleanup threshold to 100 % - and I don't
even know if it is valid - but will happen quite often with the streaming code.
If that happens, we'll start throttling when that memtable is being written,
but won't be able to put anything else in its place - leading to unnecessary
throttling.
The second, and more serious, happens when we start throttling and the amount
of available memory is not at least 1MB. This can deadlock the database in
the sense that it will prevent any request from continuing, and in turn causing
a flush due to memtable size. It is a good practice anyway to always guarantee
progress.
Fixes#1144
Broken by f15c380a4f.
This resulted in empty collection being returned in the results
instead of no collection.
Fixes org.apache.cassandra.cql3.validation.entities.CollectionsTest
from cassandra-unit-tests.
* seastar 2aeb9dd...2185f37 (15):
> reactor: avoid issuing systemwide memory barriers in parallel
> Revert "Use sys_membarrier() when available"
> Merge "Various exception-safety fixes" from Tomasz
> future-util: make map reduce exception safe
> collectd: do not give up after a failure
> future-util: make repeat_until_value exception safe
> rpc: do not block connection when unknown verbs is received
> rpc: do not wait for a reply after timeout
> rpc: move connection stats to base class
> core/reactor: Handle io_submit failures inside flush_pending_aio
> apps/iotune: add --fs-check option to use iotune for kernel version check
> Merge "Some exception safety patches" from Paweł
> tls: Fix conversion of dh_params::level to gnutls_sec_param_t
> core: posix_thread: Mark start_routine as noexcept
> fair_queue: better overflow protection
"If we compact sstables A, B into a new sstable C we must either delete both
A and B, or none of them. This is because a tombstone in B may delete data
in A, and during compaction, both the tombstone and the data are removed.
If only B is deleted, then the data gets resurrected.
Non-atomic deletion occurs because the filesystem does not support atomic
deletion of multiple files; but the window for that is small and is not
addressed in this patchset. Another case is when A is shared across
multiple shards (as is the case when changing shard count, or migrating
from existing Cassandra sstables). This case is covered by this patchset.
Fixes #1181."
Our current throttling code releases one requests per 1MB of memory available
that we have. If we are below the memory limit, but not by 1MB or more, then
we will keep getting to unthrottle, but never really do anything.
If another memtable is close to the flushing point, those requests may be
exactly the ones that would make it flush. Without them, we'll freeze the
database.
In general, we need to always release at least one request to make sure that
progress is always achieved.
This fixes#1144
Signed-off-by: Glauber Costa <glauber@scylladb.com>
Since commit c1cffd06 logger catch errors internally, so no need to
catch most of them at the top level. Only those that can happen during
parameter evaluation can reach here. Change parameters to not throw
too.
query::result transformation to printable form is very heavy operation
that allocates memory and thus can fail. Add a class to query::result that
can be used with logger to push to string conversion when output is
performed.
This is usually not a problem for the main memtable list - although it can be,
depending on settings, but shows up easily for the streaming memtables list.
We would like to have at least two memtables, even if we have to cut it short.
If we don't do that, one memtable will have use all available memory and we'll
force throttling until the memtable gets totally flushed.
Signed-off-by: Glauber Costa <glauber@scylladb.com>
throttle_state is currently a nested member of database, but there is no
particular reason - aside from the fact that it is currently only ever
referenced by the database for us to do so.
We'll soon want to have some interaction between this and the column family, to
allow us to flush during throttle. To make that easier, let's unnest it.
Signed-off-by: Glauber Costa <glauber@scylladb.com>
This is a preparation patch so we can move the throttling infrastructure inside
the memtable_list. To do that, the region group will have to be passed to the
throttler so let's just go ahead and store it.
In consequence of that, all that the CF has to tell us is what is the current
schema - no longer how to create a new memtable.
Also, with a new parameter to be passed to the memtable_list the creation code
gets quite big and hard to follow. So let's move the creation functions to a
helper.
Signed-off-by: Glauber Costa <glauber@scylladb.com>
If sstables A, B are compacted, A and B must be deleted atomically.
Otherwise, if A has data that is covered by a tombstone in B, and that
tombstone is deleted, and if B is deleted while A is not, then the data
in A is resurrected.
Fixes#1181.
A shared sstable must be compacted by all shards before it can be deleted.
Since we're stoping, that's not going to happen. Cancel those pending
deletions to let anyone waiting on them to continue.
When we compact a set of sstables, we have to remove the set atomically,
otherwise we can resurrect data if the following happens:
insert data to sstable A
insert tombstone to sstable B
compact A+B -> C (removing both data and tombstone)
delete B only
read data from A
Since an sstable may be shared by multiple shard, and each shard performs
compaction at a different time, we need to defer deletion of an sstable
set until all shards agree that the set can be deleted.
An additional atomicity issue exists because posix does not provide a way
to atomically delete multiple files. This issue is not addressed by this
patch.
"With this series, we support all the 3 nodetool removenode commands, e.g.,
$ nodetool removenode 778948bf-6709-4eb5-80fe-bee911e9c3bf
$ nodetool removenode status
RemovalStatus: Removing token (-8969872965815280276). Waiting for
replication confirmation from [127.0.0.3,127.0.0.1].
$ nodetool removenode force
RemovalStatus: No token removals in process.
Tested with:
1)
- start 3 nodes
- inject data with
cassandra-stress write no-warmup cl=TWO n=2000000 -schema 'replication(factor=2)'
- kill -9 node2
- wait for node2 to be in DOWN state
- run nodetool removenode host2_host_id on node1
2)
- start 3 nodes
- inject data with
cassandra-stress write no-warmup cl=TWO n=2000000 -schema 'replication(factor=2)'
- kill -9 node2
- wait for node2 to be in DOWN state
- run nodetool removenode host2_host_id on node1
- kill -9 node3
- nodetool removenode will wait forever since node3 is gonne, node3
will never send the replication confirmation to node1
- run nodetool removenode force on node1
nodetool removenode completes with the following error:
$ nodetool removenode 31690b82-ebb0-4594-8bcf-1ce82b6e0f6e
nodetool: Scylla API server HTTP POST to URL
'/storage_service/remove_node' failed: nodetool removenode force is called by user
nodetool removenode force completes sucessfully
$ nodetool removenode force
RemovalStatus: Removing token (-9171569494049085776). Waiting for
replication confirmation from [127.0.0.3,127.0.0.1].
Fixes #1135."
Make the Docker image more user-friendly by starting up JMX proxy in the
background and install Scylla tools in the image. Also add a welcome
banner like we have with our AMI so that users have pointers to nodetool
and cqlsh, as well as our documentation.
Message-Id: <1460376059-3678-1-git-send-email-penberg@scylladb.com>