Commit Graph

30 Commits

Author SHA1 Message Date
Raphael S. Carvalho
6b6bb38f38 compaction_manager: stop manager after storage io error
Manager will stop itself if a compaction fails due to storage io
error, which unconditionally results in stop of transportation
services.

Fixes #2147.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <20170316054538.23423-1-raphaelsc@scylladb.com>
2017-03-16 10:37:47 +02:00
Vlad Zolotarov
00e37c389b sstables::compaction_manager: move collectd metrics registration to the metrics registration layer
Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>
2017-01-10 16:24:54 -05:00
Glauber Costa
56df53f51e compaction_manager: fix shutdown sequence
By the time we are able to acquire this semaphore, we may be stopped
already. So we need to test it before we go ahead. I can see shutdown
hangs before this patch that are fixed with it applied.

Signed-off-by: Glauber Costa <glauber@scylladb.com>
Message-Id: <e5b378893128d086d584ffbb2acd3fb687648e5c.1481655433.git.glauber@scylladb.com>
2016-12-14 09:26:24 +01:00
Glauber Costa
5803957ab5 compaction: fix build
Commit 732ee275 moved tracking of one statistics value inside a lambda
without capturing this in that lambda. Compilation fails as a result.

Signed-off-by: Glauber Costa <glauber@scylladb.com>
Reviewed-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <68860640f4533dd43e43f341f1620e25464b700b.1481313455.git.glauber@scylladb.com>
2016-12-10 09:00:20 +02:00
Raphael S. Carvalho
732ee275f8 compaction: fix running compaction counter when splitting sstables
The counter was being increased before taking the semaphore, so
every pending split would count as a running compaction which
misleads the user as a result.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <f2050cc3599cee7af29d4579368a154708b37731.1481248048.git.raphaelsc@scylladb.com>
2016-12-09 15:01:43 +02:00
Raphael S. Carvalho
e86de40b49 compaction_manager: inform about compaction cancelled by shutdown
After some changes in compaction manager, user no longer is informed
that compaction was cancelled in event of shutdown. That's because
we only ignore ready future when compaction manager was asked to
stop.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <02ca29b5a93fe3a558896598f325b0dce069e82c.1478277317.git.raphaelsc@scylladb.com>
2016-11-14 16:37:33 +02:00
Raphael S. Carvalho
56a50784f8 compaction_manager: make registration of sstables and weight exception safe
Compacting sstables and weight could be left unregistered in event of an
exception. Let's make it safe by using a RAII approach.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <f2cf9d0c12f22046293bd2185ef14ede3f4d63d4.1469114161.git.raphaelsc@scylladb.com>
2016-07-22 07:02:48 +01:00
Raphael S. Carvalho
ed5e7e6842 compaction: refactor compaction manager
Previously, same function was used to handle both regular compaction
and cleanup requests. That's bad because a lot of conditions were
added for both compaction types to live in the same function.
Now, cleanup and regular compaction will live in different functions.
They share a lot of code, so helper functions were introduced.
This change is also important for user-initiated compaction that
will go through compaction manager in the future.
Code is also a lot easier to read now.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
2016-07-08 16:37:53 -03:00
Raphael S. Carvalho
da6a2b429d compaction: add functions to register and deregister compacting sstables
Reviewed-by: Nadav Har'El <nyh@scylladb.com>
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
2016-07-08 16:00:51 -03:00
Raphael S. Carvalho
4d6dce8ec9 compaction: add helper function to get candidates for strategy
Reviewed-by: Nadav Har'El <nyh@scylladb.com>
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
2016-07-08 15:06:14 -03:00
Raphael S. Carvalho
bfc5376548 compaction: remove gate from compaction manager task
There is no longer a need to use gate for regular termination of
fiber that runs compaction. Now, we only set task->stopping to
true, ask for compaction termination, and wait for its future to
resolve. Code is simplified a lot with this change.

Reviewed-by: Nadav Har'El <nyh@scylladb.com>
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
2016-07-08 15:05:10 -03:00
Avi Kivity
2a46410f4a Change sstable_list from a map to a set
sstable_list is now a map<generation, sstable>; change it to a set
in preparation for replacing it with sstable_set.  The change simplifies
a lot of code; the only casualty is the code that computes the highest
generation number.
2016-07-03 10:26:57 +03:00
Nadav Har'El
721f7d1d4f Rewrite shared sstables soon after startup
Several shards may share the same sstable - e.g., when re-starting scylla
with a different number of shards, or when importing sstables from an
external source. Sharing an sstable is fine, but it can result in excessive
disk space use because the shared sstable cannot be deleted until all
the shards using it have finished compacting it. Normally, we have no idea
when the shards will decide to compact these sstables - e.g., with size-
tiered-compaction a large sstable will take a long time until we decide
to compact it. So what this patch does is to initiate compaction of the
shared sstables - on each shard using it - so that a soon as possible after
the restart, we will have the original sstable is split into separate
sstables per shard, and the original sstable can be deleted. If several
sstables are shared, we serialize this compaction process so that each
shard only rewrites one sstable at a time. Regular compactions may happen
in parallel, but they will not not be able to choose any of the shared
sstables because those are already marked as being compacted.

Commit 3f2286d0 increased the need for this patch, because since that
commit, if we don't delete the shared sstable, we also cannot delete
additional sstables which the different shards compacted with it. For one
scylla user, this resulted in so much excessive disk space use, that it
literally filled the whole disk.

After this patch commit 3f2286d0, or the discussion in issue #1318 on how
to improve it, is no longer necessary, because we will never compact a shared
sstable together with any other sstable - as explained above, the shared
sstables are marked as "being compacted" so the regular compactions will
avoid them.

Fixes #1314.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <1465406235-15378-1-git-send-email-nyh@scylladb.com>
Reviewed-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
2016-06-08 15:44:29 -04:00
Raphael S. Carvalho
1b8e170254 compaction: retry compaction until strategy is satisfied
Previously, we were using a stat to decide if compaction should be
retried, but that's not efficient. The information is also lost
after node is restarted.

After these changes, compaction will be retried until strategy is
satisfied, i.e. there is nothing to compact.
We will now be doing the following in a loop:
Get compaction job from compaction strategy.
	If cannot run, finish the loop.
	Otherwise, compact this column family.
Go back to start of the loop.

By the way, pending_compactions stat will be deprecated after this
commit. Previously, it was increased to indicate the want for
compaction and decreased when compaction finished. Now, we can
compact more than we asked for, so it would be decreased below 0.
Also, it's the strategy that will tell the want for compaction.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <899df0d8d807f6b5d9bb8600d7c63b4e260cc282.1465398243.git.raphaelsc@scylladb.com>
2016-06-08 11:31:56 -04:00
Raphael S. Carvalho
588ce915d6 compaction: disable parallel compaction for leveled strategy
It was discussed that leveled strategy may not benefit from parallel
compaction feature because almost all compaction jobs will have similar
size. It was also found that leveled strategy wasn't working correctly
with it because two overlapping sstable (targetting the same level)
could be created in parallel by two ongoing compaction.

Fixes #1293.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <60fe165d611c0283ca203c6d3aa2662ab091e363.1464883077.git.raphaelsc@scylladb.com>
2016-06-05 18:20:00 +03:00
Raphael S. Carvalho
d80d194873 compaction_manager: stop compaction tasks in parallel
Purpose is to speed up shutdown.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <a8db3492f1ceeea2a886d3920e5effa841ea155f.1462838670.git.raphaelsc@scylladb.com>
2016-05-10 10:03:35 +03:00
Raphael S. Carvalho
3ac22bc0d7 compaction_manager: simplify code that waits for cleanup termination
Now that a task is created on demand, it's possible to wait for
termination of cleanup without extra machinery.
However, shared_future<> is now used because we may have more
than one fiber waiting for completion of task.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <209de365c7782742dc2876a66f9d0784998cae53.1462599296.git.raphaelsc@scylladb.com>
2016-05-08 11:26:36 +03:00
Raphael S. Carvalho
b8277979ef compaction_manager: fix indentation
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <82c6b93b24cbcc97f5eff3f91b05d4c1b415ecee.1462412927.git.raphaelsc@scylladb.com>
2016-05-05 10:06:56 +03:00
Raphael S. Carvalho
5aeeb0b3e8 compaction: add support to parallel compaction on the same column family
It was noticed that small sstables will accumulate for a column family because
scylla was limited to two compaction per shard, and a column family could have
at most one compaction running at a given shard. With the number of sstables
increasing rapidly, read performance is degraded.

At the moment, our compaction manager works by running two compaction task
handlers that run in parallel to the rest of the system. Each task handler
gets to run when needed, gets a column family from compaction manager queue,
runs compaction on it, and goes to sleep again. That's basically its cycle.
Compaction manager only allows one instance of a column family to be on its
queue, meaning that it's impossible for a column family to be compacted in
parallel. One compaction starts after another for a given column family.

To solve the problem described, we want to concurrently run compaction jobs
of a column family that have different "size tier" (or "weight").
For those unfamiliar, compaction job contains a list of sstables that will be
compacted together.
The "size tier" of a compaction job is the log of the total size of the input
sstables. So a compaction job only gets to run if its "size tier" is not the
same of an ongoing compaction. There is no point in compacting concurrently at
the same "size tier", because that slows down both compactions.

We will no longer queue column families in compaction manager. Instead, we
create a new fiber to run compaction on demand.
This fiber that runs asynchronously will do the following:
1) Get a compaction job from compaction strategy.
2) Calculate "size tier" of compaction job.
3) Run compaction job if its "size tier" is not the same of an ongoing
compaction for the given column family.
As before, it may decide to re-compact a column family based on a stat stored
in column family object.

Ran all compaction-related dtests.

Fixes #1216.

Reviewed-by: Nadav Har'El <nyh@scylladb.com>
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <d30952ff136192a522bde4351926130addec8852.1462311908.git.raphaelsc@scylladb.com>
2016-05-04 11:46:09 +03:00
Pekka Enberg
38a54df863 Fix pre-ScyllaDB copyright statements
People keep tripping over the old copyrights and copy-pasting them to
new files. Search and replace "Cloudius Systems" with "ScyllaDB".

Message-Id: <1460013664-25966-1-git-send-email-penberg@scylladb.com>
2016-04-08 08:12:47 +03:00
Raphael S. Carvalho
822759eee0 compaction_manager: update stat pending_tasks properly
Size of both _cfs_to_cleanup and _cfs_to_compact must be added when
calculating a new value to _stats.pending_tasks.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <b601e24d0631922798575f39d00fb54fe00d4971.1457093016.git.raphaelsc@scylladb.com>
2016-03-07 17:36:03 +01:00
Raphael S. Carvalho
b1cc0490f5 sstables: make compaction manager shutdown less verbose
before:

^CINFO  [shard 0] compaction_manager - Asked to stop
INFO  [shard 0] compaction_manager - compaction task handler stopped due to shutdown
INFO  [shard 0] compaction_manager - compaction task handler stopped due to shutdown
INFO  [shard 1] compaction_manager - Asked to stop
INFO  [shard 2] compaction_manager - Asked to stop
INFO  [shard 1] compaction_manager - compaction task handler stopped due to shutdown
INFO  [shard 2] compaction_manager - compaction task handler stopped due to shutdown
INFO  [shard 3] compaction_manager - Asked to stop
INFO  [shard 1] compaction_manager - compaction task handler stopped due to shutdown
INFO  [shard 2] compaction_manager - compaction task handler stopped due to shutdown
INFO  [shard 3] compaction_manager - compaction task handler stopped due to shutdown
INFO  [shard 3] compaction_manager - compaction task handler stopped due to shutdown

after:

^CINFO  [shard 0] compaction_manager - Asked to stop
INFO  [shard 0] compaction_manager - Stopped
INFO  [shard 1] compaction_manager - Asked to stop
INFO  [shard 2] compaction_manager - Asked to stop
INFO  [shard 3] compaction_manager - Asked to stop
INFO  [shard 1] compaction_manager - Stopped
INFO  [shard 2] compaction_manager - Stopped
INFO  [shard 3] compaction_manager - Stopped

`compaction_manager - compaction task handler stopped due to shutdown` is still printed
in debug level

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <535d5ad40102571a3d5d36257342827989e8f0f4.1455835407.git.raphaelsc@scylladb.com>
2016-02-21 11:55:17 +02:00
Raphael S. Carvalho
a53cfc8127 compaction manager: add support to wait for termination of cleanup
'nodetool cleanup' must wait for termination of cleanup, however,
cleanup is handled asynchronously. To solve that, a mechanism is
added here to wait for termination of a cleanup. This mechanism is
about using promise to notificate waiter of cleanup completion.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <6dc0a39170f3f51487fb8858eb443573548d8bce.1455655016.git.raphaelsc@scylladb.com>
2016-02-18 17:01:18 +02:00
Raphael S. Carvalho
59bbe98c21 sstables: keep track of compacting sstables in compacton manager itself
Avi says:
"Something like unordered_set<unsigned long> is error prone, because ints
tend to mix up (also, need to use a sized type, unsigned long varies among
machines)."

With that in mind, it's better if we keep track of compacting sstables in
a unordered_set<shared_sstable>.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <249f0fd4cfcf786cf3c37a79978f7743d07f48ad.1455120811.git.raphaelsc@scylladb.com>
2016-02-15 18:35:43 +02:00
Raphael S. Carvalho
ed61fe5831 sstables: make compaction stop report user-friendly
When scylla stopped an ongoing compaction, the event was reported
as an error. This patch introduces a specialized exception for
compaction stop so that the event can be handled appropriately.

Before:
ERROR [shard 0] compaction_manager - compaction failed: read exception:
std::runtime_error (Compaction for keyspace1/standard1 was deliberately
stopped.)

After:
INFO  [shard 0] compaction_manager - compaction info: Compaction for
keyspace1/standard1 was stopped due to shutdown.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <1f85d4e5c24d23a1b4e7e0370a2cffc97cbc6d44.1455034236.git.raphaelsc@scylladb.com>
2016-02-11 12:16:53 +02:00
Raphael S. Carvalho
4041f8cffc compaction: stop all ongoing compaction during shutdown
Currently, we wait for ongoing compaction during shutdown, but
that may take 'forever' if compacting huge sstables with a slow
disk. Compaction of huge sstables will take a considerable amount
of time even with fast disks. Therefore, all ongoing compaction
should be stopped during shutdown.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <3370f17ce4274df417ea60651f33fc5d4de91199.1454441286.git.raphaelsc@scylladb.com>
2016-02-03 10:18:51 +02:00
Raphael S. Carvalho
cf22c827f9 compaction_manager: fix assertion when stopping task
Task is stopped by closing gate and forcing it to exit via gate
exception. The problem is that task->compacting_cf may be set to
the column family being compacted, and compaction_manager::remove
would see it and try to stop the same task again, which would
lead to problems. The fix is to clean task->compacting_cf when
stopping task.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <3473e93c1a107a619322769d65fa020529b5501b.1454441286.git.raphaelsc@scylladb.com>
2016-02-03 10:18:15 +02:00
Raphael S. Carvalho
bb909798bc compaction_manager: introduce can_submit
Purpose is to reuse code and also make it easier to read.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
2016-01-21 15:42:23 -02:00
Raphael S. Carvalho
653a07d75d compaction_manager: introduce signal_less_busy_task
Purpose is to reuse code.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
2016-01-21 15:31:44 -02:00
Raphael S. Carvalho
2164aa8d5b move compaction manager from /utils to /sstables
Compaction manager was initially created at utils because it was
more generic, and wasn't only intended for compaction.
It was more like a task handler based on futures, but now it's
only intended to manage compaction tasks, and thus should be
moved elsewhere. /sstables is where compaction code is located.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
2016-01-21 15:23:05 -02:00