Commit Graph

19 Commits

Author SHA1 Message Date
Raphael S. Carvalho
18c792c174 compaction: fix throughput calculation
(endsize / (1024*1024)) is an integer calculation, so if endsize is
lower than 1024^2, the result would be 0.

Signed-off-by: Raphael S. Carvalho <raphaelsc@cloudius-systems.com>
2015-08-10 13:18:11 +03:00
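The truncation described in 18c792c174 can be reproduced in isolation. The following is a minimal sketch, not the code from the commit: the helper names `megabytes_truncated`, `megabytes`, and `throughput_mb_per_sec` are hypothetical, and the zero-duration guard is an added assumption.

```cpp
#include <cstdint>

// Hypothetical helpers for illustration; the commit's actual code is not shown here.

// Integer division: any endsize below 1024*1024 bytes truncates to 0.
inline uint64_t megabytes_truncated(uint64_t endsize) {
    return endsize / (1024 * 1024);
}

// Converting to double before dividing keeps the fractional part.
inline double megabytes(uint64_t endsize) {
    return static_cast<double>(endsize) / (1024.0 * 1024.0);
}

// Throughput in MB/s. Returns 0 when no time has elapsed (an assumption of
// this sketch) so that the division is always well defined.
inline double throughput_mb_per_sec(uint64_t endsize, double seconds) {
    return seconds > 0.0 ? megabytes(endsize) / seconds : 0.0;
}
```

With a 512 KiB output, `megabytes_truncated` yields 0 while `megabytes` yields 0.5, which is the difference the fix is about.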
Raphael S. Carvalho
3ddb9be984 db: fix compaction on an empty column family
When forcing a compaction on a column family with no sstables, an
assert will fail because there are no sstables to be compacted.
This problem is fixed by ignoring a compaction request when no
sstable is provided.

Fixes #61.

Signed-off-by: Raphael S. Carvalho <raphaelsc@cloudius-systems.com>
Reviewed-by: Nadav Har'El <nyh@cloudius-systems.com>
2015-08-05 14:04:22 +03:00
Raphael S. Carvalho
477a3586d7 compaction: add missing information to compaction log
Duration and throughput weren't being calculated.

closes #54.

Signed-off-by: Raphael S. Carvalho <raphaelsc@cloudius-systems.com>
2015-08-02 19:15:57 +03:00
Raphael S. Carvalho
c9fdc7dc5d compaction: get rid of invalid FIXME comment
Signed-off-by: Raphael S. Carvalho <raphaelsc@cloudius-systems.com>
2015-07-28 19:22:26 +03:00
Avi Kivity
2e745bebad Merge "use compaction strategy options" from Raphael
2015-07-27 17:06:43 +03:00
Raphael S. Carvalho
70770c261b sstables: remove double percentage symbol from compaction log message
Signed-off-by: Raphael S. Carvalho <raphaelsc@cloudius-systems.com>
2015-07-24 10:21:38 +02:00
Raphael S. Carvalho
634d00511b compaction: use compaction options in strategy
Support for compaction strategy options was recently added.
Previously, we were using default values in compaction strategy for
options, but now we can use the options defined in the schema.
Currently, we only support size-tiered strategy, so let's start
with it.

Signed-off-by: Raphael S. Carvalho <raphaelsc@cloudius-systems.com>
2015-07-23 15:26:47 -03:00
Raphael S. Carvalho
e57fe36249 compaction: get compaction threshold from schema instead
Get values from cf->schema instead of using hardcoded threshold
values. In addition, move DEFAULT_MIN_COMPACTION_THRESHOLD and
DEFAULT_MAX_COMPACTION_THRESHOLD to schema.hh so as not to have
knowledge duplicated.

Signed-off-by: Raphael S. Carvalho <raphaelsc@cloudius-systems.com>
2015-07-22 18:03:23 +03:00
Raphael S. Carvalho
63b41cc068 sstables: log compaction activity
There is some missing information in the last log printout, because
it's currently hard to generate such information.
Anyway, this patch is a good start towards providing the same log
messages as origin.

Addresses issue #12

Signed-off-by: Raphael S. Carvalho <raphaelsc@cloudius-systems.com>
2015-07-22 09:15:18 -03:00
Raphael S. Carvalho
8faa202e98 sstables: add function to return candidates using size-tiered strategy
That's helpful for the purpose of testing, and leveled compaction may
also end up using size-tiered compaction strategy for selecting
candidates.

Signed-off-by: Raphael S. Carvalho <raphaelsc@cloudius-systems.com>
2015-07-20 12:27:33 -03:00
Raphael S. Carvalho
25f24c0748 sstables: fix size-tiered strategy
If the old average is equal to the new average, then we would remove
the new average entry. That's totally wrong.

Signed-off-by: Raphael S. Carvalho <raphaelsc@cloudius-systems.com>
2015-07-20 12:26:56 -03:00
Raphael S. Carvalho
a99c92f1b6 sstable compaction: add initial support to size-tiered strategy
Size-tiered strategy basically consists of creating buckets with sstables
of nearly the same size.
Afterwards, it will find the most interesting bucket, whose size must be
between min threshold and max threshold. The bucket with the smallest average
size is the most interesting one.

Bucket hotness is also considered when finding the most interesting bucket,
but we don't support this yet.
We are also missing some code that discards an sstable based on its coldness,
i.e. when it is hardly read.

Signed-off-by: Raphael S. Carvalho <raphaelsc@cloudius-systems.com>
2015-07-20 10:08:14 -03:00
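The bucketing described in a99c92f1b6 can be sketched as follows. This is an illustration under stated assumptions, not the code from the commit: sstables are reduced to their sizes, the names `bucket`, `make_buckets`, and `most_interesting_bucket` are hypothetical, and the 0.5/1.5 similarity bounds are assumed defaults. Hotness and coldness are ignored, as in the commit message.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <vector>

// Hypothetical stand-in: an sstable is represented only by its size in bytes.
struct bucket {
    std::vector<uint64_t> sizes;
    // Average size of the bucket; buckets are never empty in this sketch.
    double average() const {
        return std::accumulate(sizes.begin(), sizes.end(), 0.0) / sizes.size();
    }
};

// Group sizes so that each one joins the first bucket whose average it is
// "near", i.e. within [avg * low, avg * high]. The bounds are assumptions.
inline std::vector<bucket> make_buckets(std::vector<uint64_t> sizes,
                                        double low = 0.5, double high = 1.5) {
    std::sort(sizes.begin(), sizes.end());
    std::vector<bucket> buckets;
    for (uint64_t s : sizes) {
        bool placed = false;
        for (auto& b : buckets) {
            const double avg = b.average();
            if (s >= avg * low && s <= avg * high) {
                b.sizes.push_back(s);
                placed = true;
                break;
            }
        }
        if (!placed) {
            buckets.push_back(bucket{{s}});
        }
    }
    return buckets;
}

// Among buckets holding between min_threshold and max_threshold sstables,
// return the one with the smallest average size (empty if none qualifies).
inline std::vector<uint64_t> most_interesting_bucket(
        const std::vector<bucket>& buckets,
        std::size_t min_threshold, std::size_t max_threshold) {
    const bucket* best = nullptr;
    for (const auto& b : buckets) {
        if (b.sizes.size() < min_threshold || b.sizes.size() > max_threshold) {
            continue;
        }
        if (!best || b.average() < best->average()) {
            best = &b;
        }
    }
    return best ? best->sizes : std::vector<uint64_t>{};
}
```

For sizes {100, 110, 90, 105, 1000, 1100, 5000} this produces three buckets, and with thresholds 2 and 4 the bucket of four small sstables is selected.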
Raphael S. Carvalho
719898d0e5 introduce automatic compaction
As the name implies, this patch introduces the concept of automatic
compaction for sstables.

Compaction task is triggered whenever a new sstable is written.
Concurrent compaction on the same column family isn't supported, so
compaction may be postponed if there is an ongoing compaction.
In addition, seastar::gate is used both to prevent a new compaction
from starting and to wait for an ongoing compaction to finish, when
the system is asked for a shutdown.

This patch also introduces an abstract class for compaction strategy,
which is really useful for supporting multiple strategies.
Currently, null and major compaction strategies are supported.
As the name implies, null compaction strategy does nothing.
Major compaction strategy is about compacting all sstables into one.
This strategy may end up being helpful when adding support for major
compaction via nodetool.

Signed-off-by: Raphael S. Carvalho <raphaelsc@cloudius-systems.com>
2015-07-16 12:00:12 +03:00
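The abstract strategy class mentioned in 719898d0e5 might look roughly like the sketch below. All names here (`sstable_list`, `compaction_strategy`, `select`, `make_strategy`) are hypothetical and chosen for illustration; the gate-based shutdown handling and the triggering on sstable writes are left out.

```cpp
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in: sstables are identified by name only.
using sstable_list = std::vector<std::string>;

// Abstract interface: a strategy decides which sstables to compact next.
class compaction_strategy {
public:
    virtual ~compaction_strategy() = default;
    virtual sstable_list select(const sstable_list& candidates) const = 0;
};

// Null strategy: never selects anything, so no compaction happens.
class null_compaction_strategy : public compaction_strategy {
public:
    sstable_list select(const sstable_list&) const override {
        return {};
    }
};

// Major strategy: selects every sstable so they are compacted into one.
class major_compaction_strategy : public compaction_strategy {
public:
    sstable_list select(const sstable_list& candidates) const override {
        return candidates;
    }
};

enum class strategy_type { null, major };

// Factory so that callers depend only on the abstract interface.
inline std::unique_ptr<compaction_strategy> make_strategy(strategy_type t) {
    if (t == strategy_type::major) {
        return std::unique_ptr<compaction_strategy>(new major_compaction_strategy());
    }
    return std::unique_ptr<compaction_strategy>(new null_compaction_strategy());
}
```

Adding another strategy then means adding one more subclass and one more factory branch, without touching the callers.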
Raphael S. Carvalho
c3372c36a2 sstables: keep track of compacted sstable's ancestors
In C*, every compacted sstable keeps track of its ancestors in the
statistics file. Supposedly, that info is used to discard sstable
files from ancestors which for some odd reason weren't deleted.

Signed-off-by: Raphael S. Carvalho <raphaelsc@cloudius-systems.com>
2015-06-30 18:09:01 +03:00
Raphael S. Carvalho
92054f8391 sstables: fix typo in compaction code
s/estimated_parititions/estimated_partitions

Signed-off-by: Raphael S. Carvalho <raphaelsc@cloudius-systems.com>
2015-06-25 17:45:06 +03:00
Raphael S. Carvalho
b54d35dcbb sstables: fix use-after-free at the end of compaction
Signed-off-by: Raphael S. Carvalho <raphaelsc@cloudius-systems.com>
Reviewed-by: Nadav Har'El <nyh@cloudius-systems.com>
2015-06-25 08:39:10 +03:00
Nadav Har'El
27c238d6b7 sstable: fix load of new sstable
Apparently, after writing a new sstable, with write_components(), it
is necessary to load() it. I'm not sure why, but we get a crash on
an aio to a closed file descriptor if we don't.

Signed-off-by: Nadav Har'El <nyh@cloudius-systems.com>
2015-06-24 16:44:25 +03:00
Nadav Har'El
0b297b9f6c sstable compaction: simplify compact_sstables() function
Instead of requiring the user to subclass a "sstable_creator" class to
specify how to create a new sstable (or in the future, several of them),
switch to an std::function.

In practice, it is much easier to specify a lambda than a class, especially
since C++11 made it easy to capture variables into lambdas - but not into
local classes.

The "commit()" function is also unnecessary. The intention there was to
provide a function to "commit" the new sstables (i.e., rename them).
But the caller doesn't need to supply this function - it can just wait
for the future of the end of compaction, and run its own committing code
right then.

Signed-off-by: Nadav Har'El <nyh@cloudius-systems.com>
2015-06-24 16:44:11 +03:00
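The switch to std::function described in 0b297b9f6c can be illustrated with a small synchronous sketch. The names `sstable`, `sstable_creator`, and `compact_sketch` are hypothetical, and unlike the real function this one does not return a future or merge any data; it only shows how a capturing lambda replaces a creator subclass.

```cpp
#include <functional>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for an sstable: just a name.
struct sstable {
    std::string name;
};

// The creator is any callable returning a new sstable, instead of a subclass.
using sstable_creator = std::function<std::shared_ptr<sstable>()>;

// Sketch only: asks the creator for one output sstable when there is input.
// The real compaction would stream the merged rows into that output.
inline std::vector<std::shared_ptr<sstable>> compact_sketch(
        const std::vector<std::shared_ptr<sstable>>& inputs,
        const sstable_creator& creator) {
    std::vector<std::shared_ptr<sstable>> outputs;
    if (inputs.empty()) {
        return outputs;
    }
    outputs.push_back(creator());
    return outputs;
}
```

A caller can pass a lambda such as `[&generation] { return std::make_shared<sstable>(sstable{"sst-" + std::to_string(generation++)}); }`, capturing local state directly, which a local class could not do as easily.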
Nadav Har'El
f26dae3bf9 sstable: basic compaction function
This patch adds the basic compaction function sstables::compact_sstables,
which takes a list of input sstables, and creates several (currently one)
merged sstable. This implementation is pretty simple once we have all
the infrastructure in place (combining reader, writer, and a pipe between
them to reduce context switches).

This is already a working compaction, but not quite complete: We'll need
to add compaction strategies (which sstables to compact, and when),
better cardinality estimator, sstable management and renaming, and a lot
of other details, and we'll probably still need to change the API.
But we can already write a test for compacting existing sstables (see
the next patch), and I wanted to get this patch out of the way, so we can
start working on applying compaction in a real use case.

Signed-off-by: Nadav Har'El <nyh@cloudius-systems.com>
2015-06-23 09:48:58 +03:00
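The merge at the heart of f26dae3bf9 can be pictured with the toy sketch below. It is not the commit's implementation: the `row` and `run` types and `merge_runs` are hypothetical, the inputs are in-memory vectors rather than sstables read through a combining reader and a pipe, and resolving duplicate keys by the newest timestamp is an assumption made for the example.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical row: a key, a write timestamp, and a value.
struct row {
    std::string key;
    int64_t timestamp;
    std::string value;
};

// Hypothetical stand-in for the contents of one input sstable.
using run = std::vector<row>;

// Merge several inputs into one output sorted by key. When the same key
// appears more than once, the row with the highest timestamp is kept
// (an assumption of this sketch).
inline run merge_runs(const std::vector<run>& inputs) {
    std::map<std::string, row> merged;
    for (const auto& input : inputs) {
        for (const auto& r : input) {
            auto it = merged.find(r.key);
            if (it == merged.end()) {
                merged.emplace(r.key, r);
            } else if (r.timestamp > it->second.timestamp) {
                it->second = r;
            }
        }
    }
    run output;
    output.reserve(merged.size());
    for (const auto& kv : merged) {
        output.push_back(kv.second);
    }
    return output;
}
```

Here every input is folded into a single ordered map, so the output comes out sorted by key regardless of the order in which the inputs are visited.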