From Pekka:
This patch series implements support for CQL DROP TABLE. It uses the newly
added truncate infrastructure under the hood. After this series, the
test_table CQL test in dtest passes:
[penberg@nero urchin-dtest]$ nosetests -v cql_tests.py:TestCQL.table_test
table_test (cql_tests.TestCQL) ... ok
----------------------------------------------------------------------
Ran 1 test in 23.841s
OK
We are generating huge output XML files with the --jenkins flag. Change
the printout level from all to test_suite to reduce the size while still
including the info we need.
Error messages and failed assertions are still printed.
Signed-off-by: Shlomi Livne <shlomi@cloudius-systems.com>
When we query schema tables after we have applied a delete mutation, the
dropped table does not exist in the "after" result set. Fix the
merge_tables() algorithm to take that into account.
Make merge_tables() actually call database::drop_column_family() when
a table is dropped.
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
For drop_column_family(), we want to first remove the column_family from
lookup tables and truncate after that to avoid races. Introduce a
truncate() variant that takes keyspace and column_family references.
Signed-off-by: Pekka Enberg <penberg@scylladb.com>
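The ordering described above can be sketched as follows. This is a simplified, single-threaded model rather than the Scylla code: the keyspace, column_family, and database types and their members are hypothetical stand-ins, and only the order of operations (unregister first, then truncate through the reference-taking variant) comes from the commit message.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <vector>

// Hypothetical, heavily simplified stand-ins for the real classes.
struct column_family {
    std::vector<std::string> rows;
};

struct keyspace {
    std::string name;
};

struct database {
    keyspace ks{"ks"};
    std::map<std::string, std::shared_ptr<column_family>> tables;

    // Variant that takes references, so it still works after the table
    // has been removed from the lookup map.
    void truncate(keyspace&, column_family& cf) {
        cf.rows.clear();
    }

    // Unregister first so that new lookups can no longer find the table,
    // then truncate the now-unreachable data.
    void drop_column_family(const std::string& cf_name) {
        auto it = tables.find(cf_name);
        assert(it != tables.end());
        std::shared_ptr<column_family> cf = it->second; // keep it alive
        tables.erase(it);
        truncate(ks, *cf);
    }
};
```

Truncating by name after the erase would fail the lookup, which is presumably why a variant taking references is needed.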
We need to capture the "is_local_only" boolean by value because it's an
argument to the function. Fixes an annoying bug where we failed to update
schema version because we pass "true" accidentally. Spotted by ASan.
Signed-off-by: Pekka Enberg <penberg@scylladb.com>
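A minimal illustration of this class of bug, not the actual Scylla code: make_announcer and deferred are hypothetical names. A lambda that runs after its enclosing function has returned must copy a by-value parameter; capturing it by reference leaves a dangling reference.

```cpp
#include <cassert>
#include <functional>

// Hypothetical stand-in for a continuation that runs after the
// enclosing function has returned.
using deferred = std::function<bool()>;

// Correct: "is_local_only" is a function argument, so it is destroyed when
// make_announcer() returns. Capturing by value copies it into the lambda.
deferred make_announcer(bool is_local_only) {
    return [is_local_only] { return is_local_only; };
}

// Buggy variant, shown only as a comment because calling it is undefined
// behavior:
//   return [&is_local_only] { return is_local_only; };
// The reference dangles once the function returns, so the lambda may
// observe an arbitrary value such as "true".
```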
"The control over backups is now moved to the CF itself, from the storage
service. That allows us to simplify the code (while making it correct) for cases
in which the storage service is not available.
With this change, we no longer need the database config passed down to the
storage_service object. So that patch is reverted."
Currently, we control incremental backups behavior from the storage service.
This creates some very concrete problems, since the storage service is not
always available and initialized.
The solution is to move it to the column family (and to the keyspace so we can
properly propagate the conf file value). When we change this from the api, we will
have to iterate over all of them, changing the value accordingly.
Signed-off-by: Glauber Costa <glommer@scylladb.com>
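A rough sketch of what "iterate over all of them" could look like. The types, member names, and the set_enable_incremental_backups function are assumptions made for illustration; the real classes are sharded and considerably more involved.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical simplified model: the flag lives on each column family,
// and on the keyspace so that the configured value can be propagated.
struct column_family {
    bool incremental_backups = false;
};

struct keyspace {
    bool incremental_backups = false;
    std::map<std::string, column_family> column_families;
};

struct database {
    std::map<std::string, keyspace> keyspaces;

    // Called when the value is changed through the API: walk every
    // keyspace and every column family and update the flag.
    void set_enable_incremental_backups(bool value) {
        for (auto& ks_entry : keyspaces) {
            keyspace& ks = ks_entry.second;
            ks.incremental_backups = value;
            for (auto& cf_entry : ks.column_families) {
                cf_entry.second.incremental_backups = value;
            }
        }
    }
};
```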
We will need to change some properties of the keyspace / cf, so we need an
accessor that is not marked as const.
Signed-off-by: Glauber Costa <glommer@scylladb.com>
"This series adds the functionality required for nodetool cfstats to
work.
It completes the histogram support for read and write latency and adds stubs for
functionality that is needed but not supported yet."
This adds the implementation that returns the estimated total latency of
reads and of writes.
First, the method that sums the counts was renamed to get_cf_stats_count,
and a method named get_cf_stats_sum was added to sum the estimated
latencies.
Signed-off-by: Amnon Heiman <amnon@cloudius-systems.com>
The histograms that are used typically only sample the data, so to get an
estimate of the actual sum, we multiply the estimated mean by the
actual count.
Signed-off-by: Amnon Heiman <amnon@cloudius-systems.com>
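The estimate can be sketched like this. The sampled_histogram struct and its members are hypothetical and only illustrate the arithmetic: the mean comes from the recorded samples, while the count covers every event.

```cpp
#include <cassert>
#include <cstdint>
#include <numeric>
#include <vector>

// Hypothetical sampled histogram: "count" tracks every event, while
// "samples" holds only the subset of latencies that were recorded.
struct sampled_histogram {
    std::uint64_t count = 0;
    std::vector<double> samples;

    // Mean of the recorded samples only.
    double estimated_mean() const {
        if (samples.empty()) {
            return 0.0;
        }
        double total = std::accumulate(samples.begin(), samples.end(), 0.0);
        return total / static_cast<double>(samples.size());
    }

    // Estimated sum over all events: estimated mean times actual count.
    double estimated_sum() const {
        assert(count >= samples.size());
        return estimated_mean() * static_cast<double>(count);
    }
};
```

For example, 10 events of which two were sampled at 2.0 and 4.0 give an estimated mean of 3.0 and an estimated sum of 30.0.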
This patch contains two changes to the histogram implementation. It uses
a simpler method to calculate the estimated mean (simply dividing the
estimated sum by the number of samples), and, to make sure that there
will always be values in the histogram, it starts by taking a sample
when there are no samples yet and only then uses the mask to decide whether
to sample.
Signed-off-by: Amnon Heiman <amnon@cloudius-systems.com>
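A minimal sketch of the two changes, assuming a power-of-two-minus-one mask applied to the event counter. The latency_sampler struct, its members, and the default mask value are hypothetical, not the Scylla histogram class.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical sketch of mask-based sampling.
struct latency_sampler {
    std::uint64_t count = 0;         // every event seen
    std::uint64_t sample_count = 0;  // events actually recorded
    double sample_sum = 0.0;         // sum of the recorded latencies
    std::uint64_t mask;              // sample when (count & mask) == 0

    explicit latency_sampler(std::uint64_t sample_mask = 0x7f)
        : mask(sample_mask) {}

    void mark(double latency) {
        ++count;
        // Always record the first event so the histogram is never empty;
        // after that, let the mask decide.
        if (sample_count == 0 || (count & mask) == 0) {
            sample_sum += latency;
            ++sample_count;
        }
    }

    // Simple estimated mean: recorded sum divided by number of samples.
    double estimated_mean() const {
        return sample_count ? sample_sum / static_cast<double>(sample_count) : 0.0;
    }
};
```

With a mask of 0x3, the first event and then every fourth event are recorded.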
This patch contains the following changes. In the definition of the read
and write latency histograms, it removes the mask value so that the
default value is used.
To support gathering the read latency histogram, the query method
cannot be const, as it modifies the histogram statistics.
The read statistic is sample based and should have no real impact on
performance. If there is an impact, we can always change it in the
future to a lower sampling rate.
Signed-off-by: Amnon Heiman <amnon@cloudius-systems.com>
This adds the API definition, with a stub implementation, that allows
nodetool cfstats to run.
After this patch the nodetool cfstats command works, but with a stub
implementation.
Signed-off-by: Amnon Heiman <amnon@cloudius-systems.com>
This patch adds some missing definitions for CAS read and write. The API
definition is for completeness only, as we do not support CAS yet.
It also moves part of the definition from storage_service to
storage_proxy, where it belongs.
Signed-off-by: Amnon Heiman <amnon@cloudius-systems.com>
Only tables that arise from flushes are backed up. Compacted tables are not.
Therefore, the place for that to happen is right after our flush.
Note that due to our sharded architecture, it is possible that when the
value changes some shards will back up sstables while others won't.
This is, in theory, possible to mitigate through a rwlock. However, this
doesn't differ from the situation where all tables are coming from a single
shard and the toggle happens in the middle of them.
The code as is guarantees that we'll never partially back up a single sstable,
so that is enough of a guarantee.
Signed-off-by: Glauber Costa <glommer@scylladb.com>
Query and set the state of incremental backups. The initial value comes from
the configuration file through the local db reference. Later on, it can be
changed through the interface.
Signed-off-by: Glauber Costa <glommer@scylladb.com>
When we convert exceptions into CQL server errors, type information is
not preserved. Therefore, improve exception error messages to make
debugging dtest failures, for example, slightly easier.
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
For a lot of users, running Scylla on a filesystem that does not support
O_DIRECT is quite frustrating: it will fail at some point, with random error messages
that aren't really meaningful.
We should check for that and fail with a good error message. Also, since our
performance claims won't really hold on anything other than XFS, we should warn the user
if that is not the setup we encounter.
Fixes #409
Signed-off-by: Glauber Costa <glommer@scylladb.com>
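One way such checks could be written on Linux, as a sketch rather than the code in this patch. The function names, the probe file name, and the decision to treat any open() failure as "unsupported" are assumptions; only open(2) with O_DIRECT and statfs(2) are real interfaces.

```cpp
#ifndef _GNU_SOURCE
#define _GNU_SOURCE // O_DIRECT is a GNU/Linux extension
#endif
#include <cassert>
#include <fcntl.h>
#include <string>
#include <sys/vfs.h>
#include <unistd.h>

// Superblock magic reported by statfs() for XFS ("XFSB").
constexpr long xfs_super_magic = 0x58465342;

// Returns true if the directory lives on an XFS filesystem.
bool is_xfs(const std::string& dir) {
    struct statfs buf;
    if (statfs(dir.c_str(), &buf) != 0) {
        return false;
    }
    return static_cast<long>(buf.f_type) == xfs_super_magic;
}

// Probes O_DIRECT support by creating a temporary file in the directory.
// Any failure is treated as "unsupported" here; a real check would
// inspect errno to report a precise error message.
bool supports_o_direct(const std::string& dir) {
    std::string path = dir + "/.o_direct_probe"; // hypothetical probe name
    int fd = open(path.c_str(), O_CREAT | O_RDWR | O_DIRECT, 0600);
    if (fd < 0) {
        return false;
    }
    close(fd);
    unlink(path.c_str());
    return true;
}
```

A startup routine could refuse to run when supports_o_direct() is false and print a warning when is_xfs() is false.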
Lazy digest calculation code introduced a bug in background read repair.
The problem is that digest_read_resolver::resolve() destroys one data
result (it is moved to a caller to be sent as a reply), so during
background digest match there is no value to calculate a digest from.
Copying data to the caller would be the most elegant solution, but also the
slowest one, so let's just special-case the situation where only one
target is queried and skip digest calculation there, since we know
digest_match() will do nothing.
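The special case can be modelled roughly as below. The digest_resolver_sketch type, its members, and the use of std::hash as a digest are all hypothetical; the sketch only shows why the single-target path must not touch the data once resolve() has handed it to the caller.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical model of a digest resolver; not the Scylla class.
struct digest_resolver_sketch {
    std::size_t targets;               // number of replicas queried
    std::vector<std::size_t> digests;  // digests received from other replicas
    std::string data;                  // the single full data result
    bool data_moved;

    digest_resolver_sketch(std::size_t n, std::string d)
        : targets(n), data(std::move(d)), data_moved(false) {}

    // Hands the data result to the caller; it is gone afterwards.
    std::string resolve() {
        data_moved = true;
        return std::move(data);
    }

    // Background digest check.
    bool digests_match() const {
        if (targets == 1) {
            return true; // nothing to compare against, so skip hashing
        }
        assert(!data_moved); // hashing moved-from data would be the bug
        std::size_t local = std::hash<std::string>()(data);
        for (std::size_t d : digests) {
            if (d != local) {
                return false;
            }
        }
        return true;
    }
};
```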
"First iteration implementation of CQL truncate, transposed from
Origin.
Includes a workable impl. of snapshots, since that is sort of an integral
part of the origin code.
Note: This is still incomplete/incorrect in two ways:
1.) Since we have no way to ensure sstables are finished writing,
the flush-snapshots are unreliable. Needs basically the same
fix as correct commitlog management, namely flush queues and
the ability to wait-force "active" flushes to finish before
continuing.
2.) System table truncation record saving does not handle sharding.
This means we basically save the "last" RP from any of the shards
truncating, and consequently if we have a crash and do commitlog
replay, we could resurrect truncated data.
Fix is to have truncation records be per cf+shard just as RP:s
are per shard.
However, since some people are waiting for at least a semi-functional
truncate, I'm submitting this without fixing the two above issues,
since they can be dealt with in subsequent patches."
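The fix proposed in point 2 could look roughly like this. It is a sketch under assumptions: replay_position is reduced to a single integer, truncation_records and its methods are hypothetical, and replay is assumed to skip mutations written at or before the truncation point of their own shard.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <utility>

// Simplified replay position; the real one carries more than an integer.
using replay_position = std::uint64_t;

// Hypothetical record store keyed by (column family, shard), so that one
// shard's truncation point never overwrites another's.
struct truncation_records {
    std::map<std::pair<std::string, unsigned>, replay_position> records;

    void save(const std::string& cf, unsigned shard, replay_position rp) {
        records[std::make_pair(cf, shard)] = rp;
    }

    // During commitlog replay, skip a mutation if it was written at or
    // before the truncation point recorded for its own shard.
    bool should_skip(const std::string& cf, unsigned shard,
                     replay_position rp) const {
        auto it = records.find(std::make_pair(cf, shard));
        return it != records.end() && rp <= it->second;
    }
};
```

With a single record per column family, whichever shard saved last would decide the cutoff for all shards, which is how truncated data could be resurrected.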