From Pekka:
This patch series implements support for CQL DROP TABLE. It uses the newly
added truncate infrastructure under the hood. After this series, the
table_test CQL test in dtest passes:
[penberg@nero urchin-dtest]$ nosetests -v cql_tests.py:TestCQL.table_test
table_test (cql_tests.TestCQL) ... ok
----------------------------------------------------------------------
Ran 1 test in 23.841s
OK
Currently, we control incremental backup behavior from the storage service.
This creates concrete problems, since the storage service is not always
available and initialized.
The solution is to move the setting to the column family (and to the keyspace,
so we can properly propagate the conf file value). When we change it from the
API, we have to iterate over all of them, changing the value accordingly.
Signed-off-by: Glauber Costa <glommer@scylladb.com>
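A minimal sketch of the layout described above, assuming hypothetical
stand-in types (column_family, keyspace and database here are simplified
illustrations, not the real classes): the flag lives on each column family,
the keyspace carries the value read from the conf file, and an API change
walks all of them.

```cpp
#include <string>
#include <unordered_map>

// Hypothetical stand-ins; the real classes carry far more state.
struct column_family {
    bool incremental_backups = false;
};

struct keyspace {
    bool incremental_backups = false;  // assumed to be seeded from the conf file
    std::unordered_map<std::string, column_family> column_families;

    // A newly added column family inherits the keyspace-level setting.
    column_family& add_column_family(const std::string& name) {
        column_family& cf = column_families[name];
        cf.incremental_backups = incremental_backups;
        return cf;
    }
};

struct database {
    std::unordered_map<std::string, keyspace> keyspaces;

    // What an API handler could do: update every keyspace and column family.
    void set_incremental_backups(bool enabled) {
        for (auto& ks : keyspaces) {
            ks.second.incremental_backups = enabled;
            for (auto& cf : ks.second.column_families) {
                cf.second.incremental_backups = enabled;
            }
        }
    }
};
```

Keeping a copy on the keyspace means column families created later still pick
up the current value without consulting the storage service.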
Query and set the state of incremental backups. The initial value comes from
the configuration file through the local db reference. Later on, it can be
changed through the interface.
Signed-off-by: Glauber Costa <glommer@scylladb.com>
Lazy digest calculation code introduced a bug in background read repair.
The problem is that digest_read_resolver::resolve() destroys one data
result (it is moved to a caller to be sent as a reply), so during
background digest match there is no value to calculate a digest from.
Copying data to the caller would be the most elegant solution, but also
the slowest one, so let's just special-case the situation where only one
target is queried and skip digest calculation there, since we know
digest_match() will do nothing.
The digest resolver is broken in a way that prevents read completion from
being reported if data arrives after enough digests for cl were already
received. This happens because the code tried to save on state and used
_cl_responses as an indicator that completion was already reported, but
this is incorrect since there can be enough responses for cl, but no
data yet. Fix by introducing a special state to track completion reporting.
Fixes #331
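The idea can be sketched as follows. This is a simplified illustration, not
the actual resolver: the class name, its members and the assumption that a
data response also counts toward cl are all hypothetical. The point is that
"completion was reported" is its own flag rather than something inferred
from the response counter.

```cpp
#include <cstddef>

// Hypothetical simplification of a digest read resolver.
class digest_resolver_sketch {
    std::size_t _cl_required;
    std::size_t _responses = 0;
    bool _data_received = false;
    bool _completion_reported = false;  // dedicated state, not derived from counters
    int _completions = 0;               // how many times completion was signalled

    // Report completion once both conditions hold, and only once.
    void maybe_report() {
        if (!_completion_reported && _data_received && _responses >= _cl_required) {
            _completion_reported = true;
            ++_completions;
        }
    }

public:
    explicit digest_resolver_sketch(std::size_t cl_required)
        : _cl_required(cl_required) {}

    void add_digest() { ++_responses; maybe_report(); }
    void add_data() { ++_responses; _data_received = true; maybe_report(); }
    int completions() const { return _completions; }
};
```

With this shape, data arriving after enough digests still triggers the
report, and further responses cannot trigger it a second time.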
A connection drop during a read operation is not an error and should not be
reported as such. Furthermore, disconnects are already reported by
gossip, so there is no need to report them again for each ongoing read.
Fixes #320
"This series enables nodetool info by completing the missing APIs.
The main change is returning a fixed value for storage_service
is_rpc_server_running, is_native_transport_running and get_exception_count.
After this series it will be possible to run:
nodetool info (while the jmx is running) and to get the results without errors
or crashes."
All database code was converted to it when storage_proxy was made
distributed, but then new code was written to use storage_proxy& again.
Passing a distributed<> object is safer since it can be passed between
shards safely. There was a patch to fix one such case yesterday; I found
one more while converting.
rpc_server
This patch changes the behaviour of is_native_transport_running and
is_rpc_server_running to return true instead of failing; we assume that
they are running. This should be changed when an API to start and stop
them is added.
get_exception_count will return 0. Its definition in origin is the number
of exceptions that were not caught in a thread.
We should rethink what it means in our implementation; meanwhile,
returning 0, for no exceptions, is a reasonable approach.
Signed-off-by: Amnon Heiman <amnon@cloudius-systems.com>
Route request to CPU 0. _operation_mode is not replicated to other CPUs.
Without this:
$ curl -X GET --header "Accept: application/json"
"http://127.0.0.1:10000/storage_service/operation_mode"
returns "NORMAL" and "STARTING" randomly.
Only the CPU 0 instance of gossip has the correct information, so route the
request to CPU 0.
Fix a bug where
$ curl -X GET --header "Accept: application/json"
"http://172.31.5.77:10000/storage_service/gossiping"
returns true and false randomly.
If several mutations in a batch throw exceptions, have_cl.broken() will be
called more than once. Fix this by dropping the ad hoc have_cl and using
parallel_for_each(), which does the same thing the current code is doing.
Fixes #297
"This series deals with copies and moves of mutation. The former are dealt
with by adding std::move() and missing 'mutable' (in case of lambdas). The
latter are improved by storing mutation_partition externally thus removing
the need for moving mutation_partition each time mutation is moved.
Storing mutation_partition externally is obviously trading the cost of
move constructor for the cost of allocation which shows in perf_mutation
results since mutations aren't moved in that test.
perf_mutation (-c 1):
before: 3289520.06 tps
after: 3183023.37 tps
diff: -3.24%
perf_simple_query (read):
before: 526954.05 tps
after: 577225.16 tps
diff: +9.54%
perf_simple_query (write):
before: 731832.70 tps
after: 734923.60 tps
diff: +0.42%
Fixes #150 (well, not completely)."
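The trade-off can be illustrated with a small sketch, assuming hypothetical
types (partition_sketch and mutation_sketch are not the real
mutation_partition and mutation): holding the partition behind a
std::unique_ptr makes a move a pointer swap, at the price of one heap
allocation per constructed object.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a potentially large partition object.
struct partition_sketch {
    std::vector<std::string> rows;
};

// Hypothetical mutation that stores its partition externally.
class mutation_sketch {
    std::unique_ptr<partition_sketch> _p;

public:
    // One allocation per mutation: this is the cost seen in perf_mutation.
    mutation_sketch() : _p(std::make_unique<partition_sketch>()) {}

    // Moving only transfers the pointer; the partition itself is not moved.
    mutation_sketch(mutation_sketch&&) noexcept = default;
    mutation_sketch& operator=(mutation_sketch&&) noexcept = default;

    partition_sketch& partition() { return *_p; }
    const partition_sketch* partition_address() const { return _p.get(); }
};
```

After a move, the destination refers to the very same partition object, which
is why workloads that move mutations often gain while workloads that mostly
construct them pay for the extra allocation.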
If an exception happens in the query path, we'll never know about it.
Exceptions are currently being ignored.
Investigating this, I found out that this is because the readers in
storage_proxy.cc handle them but don't log them anywhere.
This patch introduces such logging. The error() function takes an sstring, not
an exception_ptr: this is so we can reuse it in the future to also log problems
from other hosts (currently not done).
We have a separate helper to extract the message from the current exception
before we pass it to error().
Fixes #110
Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>