Commit Graph

22 Commits

Author SHA1 Message Date
Avi Kivity
b6ebe2e20b Merge "Avoid avalanche of tasks after memtable flush" from Tomasz
"Before, the logic for releasing writes blocked on dirty worked like this:

  1) When region group size changes and it is not under pressure and there
     are some requests blocked, then schedule request releasing task

  2) request releasing task, if no pressure, runs one request and if there are
     still blocked requests, schedules next request releasing task

If requests don't change the size of the region group, then either some request
executes or there is a request releasing task scheduled. The amount of scheduled
tasks is at most 1, there is a single releasing thread.

However, if requests themselves would change the size of the group, then each
such change would schedule yet another request releasing thread, growing the task
queue size by one.

The group size can also change when memory is reclaimed from the groups (e.g.
when contains sparse segments). Compaction may start many request releasing
threads due to group size updates.

Such behavior is detrimental for performance and stability if there are a lot
of blocked requests. This can happen on 1.5 even with modest concurrency
because timed out requests stay in the queue. This is less likely on 1.6 where
they are dropped from the queue.

The releasing of tasks may start to dominate over other processes in the
system. When the amount of scheduled tasks reaches 1000, polling stops and
server becomes unresponsive until all of the released requests are done, which
is either when they start to block on dirty memory again or run out of blocked
requests. It may take a while to reach pressure condition after memtable flush
if it brings virtual dirty much below the threshold, which is currently the
case for workloads with overwrites producing sparse regions.

I saw this happening in a write workload from issue #2021 where the number of
request releasing threads grew into thousands.

Fix by ensuring there is at most one request releasing thread at a time. There
will be one releasing fiber per region group which is woken up when pressure is
lifted. It executes blocked requests until pressure occurs."

* tag 'tgrabiec/lsa-single-threaded-releasing-v2' of github.com:cloudius-systems/seastar-dev:
  tests: lsa: Add test for reclaimer starting and stopping
  tests: lsa: Add request releasing stress test
  lsa: Avoid avalanche releasing of requests
  lsa: Move definitions to .cc
  lsa: Simplify hard pressure notification management
  lsa: Do not start or stop reclaiming on hard pressure
  tests: lsa: Adjust to take into account that reclaimers are run synchronously
  lsa: Document and annotate reclaimer notification callbacks
  tests: lsa: Use with_timeout() in quiesce()

(cherry picked from commit 7a00dd6985)
2017-02-02 22:19:25 +01:00
Avi Kivity
7faf2eed2f build: support for linking statically with boost
Remove assumptions in the build system about dynamically linked boost unit
tests.  Includes seastar update which would have otherwise broken the
build.
2016-10-26 08:51:21 +03:00
Glauber Costa
7f29cb8aba tests: add logalloc tests for pressure notification
tests to make sure varios scenarios of pressure notification for active
asynchronous reclaim work.

Signed-off-by: Glauber Costa <glauber@scylladb.com>
2016-06-20 18:58:39 -04:00
Glauber Costa
8f5047fc5f tests: add tests to new region_group throttle interface
Signed-off-by: Glauber Costa <glauber@scylladb.com>
2016-06-20 18:51:00 -04:00
Pekka Enberg
38a54df863 Fix pre-ScyllaDB copyright statements
People keep tripping over the old copyrights and copy-pasting them to
new files. Search and replace "Cloudius Systems" with "ScyllaDB".

Message-Id: <1460013664-25966-1-git-send-email-penberg@scylladb.com>
2016-04-08 08:12:47 +03:00
Tomasz Grabiec
529c8b8858 logalloc: Rename tracker::occupancy() to region_occupancy() 2016-03-22 14:56:44 +01:00
Benoît Canet
1fb9a48ac5 exception: Optionally shutdown communication on I/O errors.
I/O errors cannot be fixed by Scylla the only solution
is to shutdown the database communications.

Signed-off-by: Benoît Canet <benoit@scylladb.com>
Message-Id: <1458154098-9977-1-git-send-email-benoit@scylladb.com>
2016-03-17 15:02:52 +02:00
Paweł Dziepak
13849fd129 tests/lsa: add test for region groups
Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2016-03-17 11:20:22 +00:00
Paweł Dziepak
ed53784cb6 tests/lsa: do not leak memory in large allocation test
Large allocations test, unsurprisingly, allocates a lot of memory. Do
not leak it so that any tests that are going to be run afterwards have
still some memory left.

Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2016-03-17 11:19:13 +00:00
Paweł Dziepak
9d482532f4 tests/lsa: reduce the size of large allocation
Originally, large allocation test case attempted to allocate an object
as big as halft of the space used by the lsa. That failed when the test
was executed with lower amount of memory available mainly due to the
memory fragmentation caused by previous test cases.

This patches reduces the size of the large allocation to 3/8 of the
total space used by the lsa which is still a lot but seems to make the
test pass even with as little memory as 64MB per shard.

Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2015-12-10 13:16:43 +01:00
Paweł Dziepak
63bdf52803 tests/lsa: add large allocations test
Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2015-12-08 23:56:46 +01:00
Paweł Dziepak
83b004b2fb lsa: avoid fragmenting memory
Originally, lsa allocated each segment independently what could result
in high memory fragmentation. As a result many compaction and eviction
passes may be needed to release a sufficiently big contiguous memory
block.

These problems are solved by introduction of segment zones, contiguous
groups of segments. All segments are allocated from zones and the
algorithm tries to keep the number of zones to a minimum. Moreover,
segments can be migrated between zones or inside a zone in order to deal
with fragmentation inside zone.

Segment zones can be shrunk but cannot grow. Segment pool keeps a tree
containing all zones ordered by their base addresses. This tree is used
only by the memory reclamer. There is also a list of zones that have
at least one free segments that is used during allocation.

Segment allocation doesn't have any preferences which segment (and zone)
to choose. Each zone contains a free list of unused segments. If there
are no zones with free segments a new one is created.

Segment reclamation migrates segments from the zones higher in memory
to the ones at lower addresses. The remaining zones are shrunk until the
requested number of segments is reclaimed.

Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2015-12-08 19:31:40 +01:00
Avi Kivity
d5cf0fb2b1 Add license notices 2015-09-20 10:43:39 +03:00
Tomasz Grabiec
b4835756fd tests: Fix compilation error
Introdued in 920fe4278a
2015-09-08 12:52:30 +02:00
Tomasz Grabiec
920fe4278a Cleanup leftovers after compaction_counter to reclaim_counter rename 2015-09-08 10:19:19 +02:00
Tomasz Grabiec
870e9e5729 lsa: Replace compaction_lock with broader reclaim_lock
Disabling compaction of a region is currently done in order to keep
the references valid. But disabling only compaction is not enough, we
also need to disable eviction, as it also invalidates
references. Rather than introducing another type of lock, compaction
and eviction are controlled together, generalized as "reclaiming"
(hence the reclaim_lock).
2015-09-01 17:29:04 +03:00
Tomasz Grabiec
3115a1aaa0 tests: logalloc_test: Disable test_compaction_lock with default allocator
It relies on the fact that the process has a fixed amount of memory
assigned and std::bad_alloc is thrown in a timely manner when it fills
up, which is the case for seastar's allocator, but not with the
default allocator. With the latter the OOM killer kills the process.
2015-09-01 15:17:43 +03:00
Tomasz Grabiec
2d6d15308e tests: logalloc_test: Add test for compaction_lock 2015-08-31 21:50:17 +02:00
Tomasz Grabiec
110a55886c lsa: Introduce region::compaction_counter() 2015-08-31 13:58:42 +02:00
Tomasz Grabiec
bceeb301b7 tests: lsa: Add test for region merging 2015-08-08 09:59:24 +02:00
Avi Kivity
a1543dc4f9 tests: mark fake variable as unused in logalloc_test
So that gcc 5.1 doesn't complain.
2015-08-07 21:32:09 +03:00
Tomasz Grabiec
658c21a060 tests: Add LSA tests 2015-08-06 14:05:16 +02:00