I'm not sure what happened. We have the same commented code in both .hh
and .cc. It is very confusing when enabling some of the code. Let's
remove the duplicated code in .cc and leave it in .hh only.
"I.e. implement storage_proxy::mutate_atomically, which in turn means
roughly the same as mutate, with write/remove from the batchlog table
intermixed.
This patch restructures some stuff in storage_proxy to avoid too much code
duplication, with the assumption (amongst others) that dead nodes will be few
etc."
Our thrift code performs an elaborate dance to convert a result/exception
reported in a future<> to the cob/exn_cob flow required by the thrift
library. However, if the exception is thrown before the first continuation,
no one will catch it and it will be leaked, eventually resulting in a crash.
Fix by replacing the complete() infrastructure, which took a future as a
parameter, with a with_cob() helper that instead takes a function to
execute. This allows it to catch both exceptions thrown directly and
exceptions reported via the future.
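
A minimal sketch of the idea, not the actual thrift glue code. It assumes
C++17, and simple_future is a hypothetical stand-in for a future<> that
already holds either a value or an exception. The signature of with_cob()
here is illustrative only. The point is that func() is invoked inside the
try block, so a throw that happens before any future exists is routed to
exn_cob just like a failure stored in the future.

```cpp
#include <cassert>
#include <exception>
#include <optional>
#include <utility>

// Hypothetical stand-in for a resolved future<>: holds a value or an error.
template <typename T>
struct simple_future {
    std::optional<T> value;
    std::exception_ptr error;

    // Returns the value, or rethrows the stored exception.
    T get() {
        if (error) {
            std::rethrow_exception(error);
        }
        assert(value.has_value());
        return *value;
    }
};

// Takes a function to execute instead of a future, so both kinds of
// failure pass through the same catch block.
template <typename Cob, typename ExnCob, typename Func>
void with_cob(Cob&& cob, ExnCob&& exn_cob, Func&& func) {
    using result_type = decltype(func().get());
    std::optional<result_type> result;
    try {
        auto fut = func();    // may throw directly
        result = fut.get();   // may rethrow an exception reported via the future
    } catch (...) {
        exn_cob(std::current_exception());
        return;
    }
    cob(*result);             // success path, outside the try block
}
```

Calling cob outside the try block is a design choice in this sketch: an
exception thrown by the success callback itself is not reported a second
time through exn_cob.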
Fixes #133.
What we implement is ka, not la. Since the summary is the one element that
actually changed in the 2.2 implementation, it is particularly important that
we get this one right. I have previously missed this.
Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
Currently, each column family creates a fiber to handle compaction requests
in parallel to the system. If there are N column families, N compactions
could be running in parallel, which is definitely horrible.
To solve that problem, a per-database compaction manager is introduced here.
Compaction manager is a feature used to service compaction requests from N
column families. Parallelism is made available by creating more than one
fiber to service the requests. That being said, N compaction requests will
be served by M fibers.
A compaction request being submitted will go to a job queue shared between
all fibers, and the fiber with the lowest amount of pending jobs will be
signalled.
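
The dispatch rule above can be sketched as follows. This is a simplified,
single-threaded model, not the compaction manager itself: the type and
member names (compaction_manager_sketch, worker_sketch, submit, run_one)
are hypothetical, jobs are represented by column family names only, and
"signalling" a fiber is modelled by returning its index.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <deque>
#include <string>
#include <utility>
#include <vector>

// Hypothetical bookkeeping for one of the M fibers.
struct worker_sketch {
    std::size_t pending = 0;   // jobs this fiber has been signalled for
};

struct compaction_manager_sketch {
    std::deque<std::string> queue;        // job queue shared between all fibers
    std::vector<worker_sketch> workers;   // M fibers serving N column families

    explicit compaction_manager_sketch(std::size_t m) : workers(m) {
        assert(m > 0);
    }

    // Enqueue a request and return the index of the fiber to signal:
    // the one with the fewest pending jobs (first one wins on ties).
    std::size_t submit(std::string column_family) {
        queue.push_back(std::move(column_family));
        auto it = std::min_element(workers.begin(), workers.end(),
            [](const worker_sketch& a, const worker_sketch& b) {
                return a.pending < b.pending;
            });
        ++it->pending;
        return static_cast<std::size_t>(it - workers.begin());
    }

    // A signalled fiber takes the oldest job from the shared queue.
    std::string run_one(std::size_t idx) {
        assert(workers[idx].pending > 0 && !queue.empty());
        std::string job = std::move(queue.front());
        queue.pop_front();
        --workers[idx].pending;
        return job;
    }
};
```

Because the queue is shared, the signalled fiber does not necessarily run
the job that triggered the signal. It runs whatever is at the head of the
queue.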
Signed-off-by: Raphael S. Carvalho <raphaelsc@cloudius-systems.com>
We need to also catch exceptions in top-level connection::process() so
that they are converted to proper CQL protocol errors.
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
With the change in column_family stats, the API needs to get the counter
from the read and write histogram.
It also adds the implementation for the read and write latency histogram.
Signed-off-by: Amnon Heiman <amnon@cloudius-systems.com>
This change makes sure that when there are no results (i.e. all the
histograms that are summed are empty) the returned result will be a zeroed
histogram and not an empty object.
Signed-off-by: Amnon Heiman <amnon@cloudius-systems.com>
With the use of a sparse histogram, the read and write counters in the
column_family stats can be used.
The total impact on performance should be negligible.
Signed-off-by: Amnon Heiman <amnon@cloudius-systems.com>
This adds the latency histogram to the column_family swagger
definitions.
The definitions are based on the ColumnFamilyMetrics.
It adds the following commands:
get_read_latency_histogram
get_all_read_latency_histogram
get_write_latency_histogram
get_all_write_latency_histogram
Signed-off-by: Amnon Heiman <amnon@cloudius-systems.com>
The histogram object is used both as a general counter for the number of
events and for statistics and sampling.
This changes the histogram implementation so that it supports sparse
sampling while keeping the total number of events accurate.
The implementation includes the following:
Remove the template nature of the histogram, as it is used only for
timers, and use the name ihistogram instead.
If in the future we need a histogram for other types, we can use the
histogram name for it.
A total counter was added that counts the number of events that are part
of the statistics calculation.
Helper methods were added to ihistogram to handle the latency counter
object.
According to the sample mask, it marks the latency object as started
if the counter and the mask are non-zero. Its mark method accepts the
latency object: if the latency was not started, it is not added, and
only the 'count' counter, which counts the total number of events, is
incremented.
This should reduce the impact of the latency calculation to a negligible
level.
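
A rough sketch of the sampling scheme, under stated assumptions. The names
(latency_counter_sketch, ihistogram_sketch, set_latency) are hypothetical,
timestamps are passed in explicitly instead of reading a clock, and the
sampling condition is one reading of "the counter and the mask are non
zero", namely (count & sample_mask) != 0. The real ihistogram may differ
on all of these points.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical latency object: remembers whether timing was started.
struct latency_counter_sketch {
    bool started = false;
    std::int64_t start_ns = 0;

    void start(std::int64_t now_ns) {
        started = true;
        start_ns = now_ns;
    }
    bool is_start() const { return started; }
};

struct ihistogram_sketch {
    std::int64_t count = 0;    // every event, sampled or not
    std::int64_t total = 0;    // events that are part of the statistics
    std::int64_t sum_ns = 0;   // sum of the sampled latencies
    std::int64_t sample_mask;

    explicit ihistogram_sketch(std::int64_t mask) : sample_mask(mask) {
        assert(mask >= 0);
    }

    // Start the latency object only for events selected by the mask.
    void set_latency(latency_counter_sketch& lc, std::int64_t now_ns) const {
        if ((count & sample_mask) != 0) {
            lc.start(now_ns);
        }
    }

    // Always count the event; record the latency only if it was started.
    void mark(const latency_counter_sketch& lc, std::int64_t now_ns) {
        ++count;
        if (lc.is_start()) {
            ++total;
            sum_ns += now_ns - lc.start_ns;
        }
    }
};
```

With this reading, a mask of 1 times every second event while count still
advances on every call to mark.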
Signed-off-by: Amnon Heiman <amnon@cloudius-systems.com>
When doing a sparse latency check, it is required to know if a latency
object was started.
This returns true if the start timer was set.
Signed-off-by: Amnon Heiman <amnon@cloudius-systems.com>
storage_service is a singleton, and wants a database for initialization.
On the other hand, database is a proper object that is created and
destroyed for each test. As a result storage_service ends up using
a destroyed object.
Work around this by:
- leaking the database object so that storage_service has something
to play with
- doing the second phase of storage_service initialization only once
(endsize / (1024*1024)) is an integer calculation, so if endsize is
lower than 1024^2, the result would be 0.
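
To illustrate the truncation, a small sketch. The function names are
hypothetical, and converting to double before dividing is only one possible
remedy. It is an assumption here, not necessarily what this patch does.

```cpp
#include <cassert>
#include <cstdint>

// Integer division truncates: any size below 1 MiB yields 0.
inline std::uint64_t size_in_mb_truncated(std::uint64_t endsize) {
    return endsize / (1024 * 1024);
}

// Assumed remedy: divide in floating point to keep the fractional part.
inline double size_in_mb(std::uint64_t endsize) {
    return static_cast<double>(endsize) / (1024.0 * 1024.0);
}

// Shows the difference for a 512 KiB size.
inline void demo_small_size() {
    const std::uint64_t endsize = 512 * 1024;
    assert(size_in_mb_truncated(endsize) == 0);
    assert(size_in_mb(endsize) == 0.5);
}
```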
Signed-off-by: Raphael S. Carvalho <raphaelsc@cloudius-systems.com>
I made the mistake of running scylla on a spinning disk. Since a disk
can serve about 100 reads/second, that set the tone for the whole benchmark.
Fix by improving cache preload when flushing a memtable. If we can detect
that a mutation is not part of any sstable (other than the one we just wrote),
we can insert it into the cache.
After this, running a mixed cassandra-stress returns the expected results,
even on a spinning disk.
Add a FIXME about something I'm unsure about - does repair only need to
repair this node, or should it also make an effort to repair the other nodes
(or more accurately, their specific token-ranges being repaired) if we're
already communicating with them?
Signed-off-by: Nadav Har'El <nyh@cloudius-systems.com>