12785 Commits

Author SHA1 Message Date
Avi Kivity
ba2e170e4b compaction: fix return in leveled compaction droppable tombstones loop
If the loop ever terminates, we need to return something.

Message-Id: <20170719133508.13374-1-avi@scylladb.com>
2017-08-01 13:33:02 +03:00
Takuya ASADA
a998b7b3eb dist/ami: follow scylla-tools package name change on RedHat variants
Since scylla-tools generates two .rpm packages, we need to copy them to our AMI.

Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <20170722090002.9850-1-syuu@scylladb.com>
2017-07-31 18:57:12 +03:00
Avi Kivity
7c8dea088a Merge seastar upstream
* seastar 54e940f...fc937b8 (2):
  > configure.py: Always ensure tmp directory exists
  > coding-style.md: introduce
2017-07-31 18:06:09 +03:00
Duarte Nunes
a85232dd82 Fix compilation errors on GCC 6
GCC 6 inconsistently requires explicitly calling a member function
through "this->" for lambda functions capturing "this".

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Message-Id: <20170731143755.21970-1-duarte@scylladb.com>
2017-07-31 17:40:44 +03:00
Benoît Canet
b44ba11e4c transport: Count the number of unpaged queries
Queries with a page size equal to or smaller than
zero are unpaged queries.

Count these queries and expose the count as a metric,
since they can ruin the performance of the system.

Message-Id: <20170731130004.25807-2-benoit@scylladb.com>
2017-07-31 16:01:45 +03:00
Avi Kivity
3fe6731436 Merge "Reduce the effect of the latency metrics" from Amnon
"This series reduces that effect in two ways:
1. Remove the latency counters from the system keyspaces
2. Reduce the histogram size by limiting the maximum number of buckets and
   stop the last bucket."

Fixes #2650.

* 'amnon/remove_cf_latency_v2' of github.com:cloudius-systems/seastar-dev:
  database: remove latency from the system table
  estimated histogram: return a smaller histogram
2017-07-31 15:58:30 +03:00
Paweł Dziepak
402799fcc0 mutation_reader: drop move_and_clear()
Since the discovery of std::exchange(x, {}), move_and_clear has become
obsolete. Besides, the name was wrong: it did not clear the vector but
recreated it, meaning that any allocated memory wasn't reused (not that
it mattered in the existing usages).

Message-Id: <20170731123549.10887-1-pdziepak@scylladb.com>
2017-07-31 15:51:19 +03:00
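The idiom mentioned in the commit above can be sketched as follows. This is a minimal illustration, not the code from the patch: `take_all` is a hypothetical helper name, and only the `std::exchange(x, {})` call itself comes from the commit message.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical helper (not from the patch): hands the contents of `v` to the
// caller and leaves `v` as a freshly constructed, empty vector.
template <typename T>
std::vector<T> take_all(std::vector<T>& v) {
    // std::exchange assigns the new value ({} becomes an empty vector)
    // and returns the old value by move.
    return std::exchange(v, {});
}
```

As the commit message notes for the old helper, the source vector is recreated rather than cleared, so its previous allocation travels with the returned value instead of being reused.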
Gleb Natapov
87bc3f7e7f configure.py: use user provided compiler flags when checking for features
User-provided compiler flags may change the outcome of the test.

Message-Id: <20170724111520.GA18230@scylladb.com>
2017-07-31 15:33:06 +03:00
Avi Kivity
f4b2a1ef4e Merge "Optimise combined_mutation_reader" from Paweł
"These patches optimise combined_mutation_reader for cases where the majority
of mutation_readers is disjoint.

perf_fast_forward:
Results are medians of 3 runs, in fragments/s, as reported by perf_fast_forward.

Command:
perf_fast_forward -c1 --enable-cache

small: small-partition-skips (read=1, skip=0)
large: large-partition-skips (read=1, skip=0)

          before        after      diff
small     195753      238196       +22%
large    1244325     1359096        +9%

perf_simple_query:

Results are medians of 10 runs, in reads/s, as reported by perf_simple_query.

Command:
perf_simple_query -c1

before   98651.40
after   104554.85
diff          +6%"

* tag 'avoid-merge_mutations/v1' of https://github.com/pdziepak/scylla:
  combined_mutation_reader: avoid unnecessary merge_mutations()
  combined_mutation_reader: do not pop mutation with different key
2017-07-31 15:14:42 +03:00
Avi Kivity
178b54e790 Merge "memtable flush: Fixes and improvements" from Duarte
"This series ensures that when we retry a memtable flush, we re-acquire the
flush permit that was previously released. It also ensures we don't hold
the sstable read lock for the duration of the sleep leading to the retry.

To achieve that cleanly we refactor the way the permit lifecycle is managed
by employing a RAII-based approach.

We also improve the latency of writes blocked on virtual dirty by releasing
the flush permit before fsyncing the sstables. There are additional avenues
for performance improvements on top of this one."

* 'memtable-flush-additional-fixes/v4' of github.com:duarten/scylla:
  column_family: Re-acquire flush permit in case of error
  column_family: Don't hold sstable read lock when retrying flush
  sstables: Release the flush permit before fsyncing
  sstables: Introduce write_monitor
  database: Extract out dirty_memory_manager
  dirty_memory_manager: Refactor flush permit lifetime management
  dirty_memory_manager: Invert permit acquisition order
  memtable_list: Register different seal functions for each behaviour
2017-07-31 14:57:19 +03:00
Paweł Dziepak
2b53a560c8 combined_mutation_reader: avoid unnecessary merge_mutations()
Merging mutations is quite an expensive operation. The creation of
streamed mutation merger involves several allocations (mostly coming
from various std::vector) and then all mutation_fragments need to go
through a heap.

All this is completely unnecessary if there is only one mutation, so
let's skip a call to merge_mutations() in such cases. This also means
that we can reuse memory allocated by _current vector if merge is not
required.
2017-07-31 12:35:40 +01:00
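The fast path described in the commit above might look roughly like the sketch below. It is an illustration under stated assumptions: `run`, `merge_runs` and `emit` are hypothetical names, sorted integer vectors stand in for mutations, and `merge_runs` stands in for `merge_mutations()`.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <utility>
#include <vector>

using run = std::vector<int>;  // hypothetical stand-in for one mutation

// Hypothetical stand-in for merge_mutations(): merges sorted runs into one.
run merge_runs(std::vector<run>& current) {
    run merged;
    for (auto& r : current) {
        run tmp;
        std::merge(merged.begin(), merged.end(), r.begin(), r.end(),
                   std::back_inserter(tmp));
        merged = std::move(tmp);
    }
    return merged;
}

run emit(std::vector<run>& current) {
    run result;
    if (current.size() == 1) {
        // Fast path: a single input needs no merging, so just move it out.
        result = std::move(current.front());
    } else {
        result = merge_runs(current);
    }
    // clear() drops the elements; implementations typically keep the
    // capacity, so the outer vector's allocation can be reused.
    current.clear();
    return result;
}
```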
Paweł Dziepak
f78f2b3c92 combined_mutation_reader: do not pop mutation with different key
Originally, the loop inside combined_mutation_reader::next() popped
mutations from the heap, and when it encountered one with a different
decorated key, it pushed that mutation back, then merged and emitted the
ones accumulated so far. In other words, every time the reader progressed
to the next mutation it did needless pop and push operations on the
heap.

This patch rearranges the code so that the key of the next mutation is
compared before it is popped from the heap.
2017-07-31 12:35:40 +01:00
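The rearrangement described in the commit above, comparing the key at the top of the heap before popping, can be sketched with a standard priority queue. This is an assumption-laden illustration: `min_heap` and `next_group` are hypothetical names, and plain integers stand in for mutations and their decorated keys.

```cpp
#include <cassert>
#include <functional>
#include <queue>
#include <vector>

// Min-heap of integer keys; a hypothetical stand-in for the reader's heap.
using min_heap = std::priority_queue<int, std::vector<int>, std::greater<int>>;

// Collects every element that shares the smallest key. The top of the heap is
// inspected before it is popped, so an element with a different key is never
// removed and pushed back.
std::vector<int> next_group(min_heap& heap) {
    std::vector<int> group;
    if (heap.empty()) {
        return group;
    }
    const int key = heap.top();
    while (!heap.empty() && heap.top() == key) {
        group.push_back(heap.top());
        heap.pop();
    }
    return group;
}
```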
Duarte Nunes
c81431ad16 column_family: Re-acquire flush permit in case of error
If we fail to flush an sstable, after creating the flush_reader, then
we will have released the flush permit when we retry the flush. Ensure
that when retrying, we re-acquire the flush permit.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2017-07-31 12:40:19 +02:00
Duarte Nunes
9162e016da column_family: Don't hold sstable read lock when retrying flush
Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2017-07-31 12:40:19 +02:00
Duarte Nunes
1a33cc6847 sstables: Release the flush permit before fsyncing
This allows a queued flush to start while we fsync the current
sstable, which helps reduce the overall time new writes are blocked on
dirty memory.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2017-07-31 12:40:19 +02:00
Duarte Nunes
784a078e72 sstables: Introduce write_monitor
The write_monitor provides callbacks to inform an observer of the
state of the ongoing sstable write.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2017-07-31 12:40:19 +02:00
Duarte Nunes
d2b0a5a0a6 database: Extract out dirty_memory_manager
Needed so that the flush_permit can be propagated to the sstables layer.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2017-07-31 12:40:19 +02:00
Duarte Nunes
a2b732c156 dirty_memory_manager: Refactor flush permit lifetime management
This patch refactors how the flush permit lifetime is managed,
dropping the current hash table in favour of a RAII approach.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2017-07-31 12:40:19 +02:00
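A RAII approach to permit lifetime, as named in the commit above, could be sketched as below. This is not the class from the patch: `permit_pool` and this `flush_permit` are hypothetical, and the sketch only counts permits, leaving out any waiting for a permit to become available.

```cpp
#include <cassert>
#include <utility>

// Hypothetical counter standing in for the pool that hands out flush permits.
struct permit_pool {
    int available = 1;
};

// Hypothetical RAII permit: taken in the constructor, returned in the
// destructor, so every exit path (including exceptions) releases it.
class flush_permit {
    permit_pool* _pool;
public:
    explicit flush_permit(permit_pool& p) : _pool(&p) { --_pool->available; }
    // Moving transfers ownership; the moved-from permit releases nothing.
    flush_permit(flush_permit&& o) noexcept
        : _pool(std::exchange(o._pool, nullptr)) {}
    flush_permit(const flush_permit&) = delete;
    flush_permit& operator=(const flush_permit&) = delete;
    flush_permit& operator=(flush_permit&&) = delete;
    ~flush_permit() {
        if (_pool) {
            ++_pool->available;
        }
    }
};
```

Tying the release to the destructor removes the need for a side table that tracks which permits are outstanding, which appears to be the motivation stated in the commit message.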
Duarte Nunes
f647f5b14a dirty_memory_manager: Invert permit acquisition order
For an upcoming fix it is required to invert the permit acquisition
order: first we acquire the background work permit and then the single
flush permit.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2017-07-31 12:40:19 +02:00
Duarte Nunes
e371accac8 memtable_list: Register different seal functions for each behaviour
Instead of passing a flush_behaviour to the seal function, use two
different functions for each of the behaviours.

This will be important in the forthcoming patches, which will require
the signatures of those functions to differ.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2017-07-31 12:40:19 +02:00
Paweł Dziepak
e970630272 tests/serialized_action: add missing forced defers
serialized_action_tests depends on the first part of the
serialized_action being executed at certain points (at which it reads a
global variable that is later updated by the main thread).
This worked well in the release mode before ready continuations were
inlined and run immediately, but not in the debug mode since inlining
was not happening and the main seastar::thread was missing some yield
points.
Message-Id: <20170731103013.26542-1-pdziepak@scylladb.com>
2017-07-31 11:35:24 +01:00
Duarte Nunes
4e3232fc29 utils/log_histogram: Fix typo when calculating number of buckets
We weren't correctly calculating the number of buckets due to
returning the wrong variable.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Message-Id: <20170731094733.7746-1-duarte@scylladb.com>
2017-07-31 12:49:11 +03:00
Avi Kivity
e855a28fae Revert "Merge "memtable flush: Fixes and improvements" from Duarte"
This reverts commit 733a64a1df, reversing
changes made to e11e66723a.

Breaks sstable_test and perf_fast_forward.
2017-07-31 12:44:28 +03:00
Avi Kivity
85056f3611 log_histogram: fix constexpr-ness of log_histogram_options
1. assert() is not constexpr.
2. can't use static_assert(), because the constructor may be called in a non-constexpr
   environment; moved to log_histogram
3. pow2_rank() uses count_leading_zeros() which is not constexpr; split
   into constexpr and non-constexpr versions
4. duplicated number_of_buckets() because bucket_of() can't be constexpr due to pow2_rank
Message-Id: <20170726105444.32698-1-avi@scylladb.com>
2017-07-31 09:11:40 +01:00
Avi Kivity
733a64a1df Merge "memtable flush: Fixes and improvements" from Duarte
"This series ensures that when we retry a memtable flush, we re-acquire the
flush permit that was previously released. It also ensures we don't hold
the sstable read lock for the duration of the sleep leading to the retry.

To achieve that cleanly we refactor the way the permit lifecycle is managed
by employing a RAII-based approach.

We also improve the latency of writes blocked on virtual dirty by releasing
the flush permit before fsyncing the sstables. There are additional avenues
for performance improvements on top of this one."

* 'memtable-flush-additional-fixes/v3' of github.com:duarten/scylla:
  column_family: Re-acquire flush permit in case of error
  column_family: Don't hold sstable read lock when retrying flush
  sstables: Release the flush permit before fsyncing
  sstables: Introduce write_monitor
  database: Extract out dirty_memory_manager
  dirty_memory_manager: Refactor flush permit lifetime management
  dirty_memory_manager: Invert permit acquisition order
  memtable_list: Register different seal functions for each behaviour
  main: Don't catch polymorphic exceptions by value
2017-07-31 10:32:26 +03:00
Duarte Nunes
e11e66723a main: Don't catch polymorphic exceptions by value
GCC trunk complains due to exception slicing.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Message-Id: <20170727163021.8000-1-duarte@scylladb.com>
2017-07-31 10:12:13 +03:00
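The slicing problem behind the commit above can be illustrated with a small sketch. The names `describe`, `fail` and `succeed` are hypothetical and not taken from main; the point is only that the handler binds the exception by const reference.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

void fail() { throw std::runtime_error("disk full"); }
void succeed() {}

// Catching by const reference keeps the dynamic type of the exception, so
// what() reports the derived class's message. Catching std::exception by
// value would copy only the base subobject (slicing), which is what newer
// GCC versions warn about.
std::string describe(void (*action)()) {
    try {
        action();
    } catch (const std::exception& e) {
        return e.what();
    }
    return "ok";
}
```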
Avi Kivity
fc683c3f3e Merge seastar upstream
* seastar a14d667...54e940f (8):
  > Merge "Prometheus to use output stream" from Amnon
  > http_test: Fix an http output stream test
  > build: harden try_compile_and_link output temporary file
  > configure: disable exception scalability hack on debug build
  > build: don't perform test compiles to /dev/null
  > Provide workaround for non scaleable c++ exception runtime
  > Merge "Add output stream to http message reply" from Amnon
  > configure.py: use user provided compiler flags when checking for features
2017-07-31 10:09:48 +03:00
Avi Kivity
c1718dd5e3 Update scylla-ami submodule
* dist/ami/files/scylla-ami 2bd1481...b41e5eb (1):
  > Fix incorrect scylla-server sysconfig file edit for i3 memflush controller
2017-07-31 09:41:24 +03:00
Takuya ASADA
714540cd4c dist/debian: refuse upgrade if current scylla < 1.7.3 && commitlog remains
Commitlog replay fails when upgrading from <1.7.3 to 2.0, so we need to refuse
the package update if the current scylla is < 1.7.3 and a commitlog remains.

Note: The problem is in the scylla-server package, but to prevent the
scylla-conf package upgrade, %pretrans should be defined on scylla-conf.

Fixes #2551

Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <1501187555-4629-1-git-send-email-syuu@scylladb.com>
2017-07-31 09:08:40 +03:00
Avi Kivity
e5d2e28df9 Merge "Backport exception scalability fix from gcc-7" from Gleb
"This patch series backports the scalability fix for _Unwind_Find_FDE and modifies
our CentOS package to use our libgcc and libstdc++, which are needed to make
use of the fix, instead of the locally installed ones."

Ref #2646 (fixes on RHEL 7 and related only)

* 'gleb/exception-gcc-fix-v2' of github.com:cloudius-systems/seastar-dev:
  dist/redhat: Make scylla rpm depend on scylla-libgcc and scylla-libstdc++ and use it instead of locally installed one
  dist/redhat: Backport scalability fix of _Unwind_Find_FDE to our gcc
2017-07-30 19:31:03 +03:00
Gleb Natapov
8fe875cc79 dist/redhat: Make scylla rpm depend on scylla-libgcc and scylla-libstdc++ and use it instead of locally installed one
2017-07-30 16:03:25 +03:00
Gleb Natapov
1cf7e72c68 dist/redhat: Backport scalability fix of _Unwind_Find_FDE to our gcc
2017-07-30 16:03:10 +03:00
Paweł Dziepak
e62403190b Merge "Introduce perf_cache_eviction test" from Tomasz
Runs appending writes to a single partition, at full speed, and a reader
which selects the head of the partition, with 100ms delay between reads.
Prints latency percentiles and some stats.

Intended to test performance at the transition from non-evicting to
evicting modes.

Currently we can see that after the transition, the whole partition gets
evicted and reads constantly miss.

Sample output:

    rd/s: 10, wr/s: 135947, ev/s: 0, pmerge/s: 1, miss/s: 0, cache: 708/778 [MB], LSA: 820/910 [MB], std free: 82 [MB]

    reads : min: 149   , 50%: 179   , 90%: 1331  , 99%: 1331  , 99.9%: 1331  , max: 6866   [us]
    writes: min: 3     , 50%: 4     , 90%: 4     , 99%: 5     , 99.9%: 258   , max: 51012  [us]

    rd/s: 7, wr/s: 93354, ev/s: 9, pmerge/s: 1, miss/s: 3, cache: 0/0 [MB], LSA: 107/128 [MB], std free: 82 [MB]

    reads : min: 179   , 50%: 179   , 90%: 73457 , 99%: 73457 , 99.9%: 73457 , max: 105778 [us]
    writes: min: 3     , 50%: 4     , 90%: 4     , 99%: 5     , 99.9%: 258   , max: 105778 [us]

* tag 'tgrabiec/row-eviction-perf-test' of github.com:scylladb/seastar-dev:
  tests: Introduce perf_cache_eviction
  tests: simple_schema: Add getter for DDL statement
  estimated_histogram: Implement percentile()
  utils: estimated_histogram: Make printable
2017-07-28 09:49:22 +01:00
Duarte Nunes
0f1bd81523 column_family: Re-acquire flush permit in case of error
If we fail to flush an sstable, after creating the flush_reader, then
we will have released the flush permit when we retry the flush. Ensure
that when retrying, we re-acquire the flush permit.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2017-07-27 21:09:18 +02:00
Duarte Nunes
2f4cffc7f6 column_family: Don't hold sstable read lock when retrying flush
Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2017-07-27 21:09:18 +02:00
Duarte Nunes
5e64839e85 sstables: Release the flush permit before fsyncing
This allows a queued flush to start while we fsync the current
sstable, which helps reduce the overall time new writes are blocked on
dirty memory.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2017-07-27 21:09:18 +02:00
Duarte Nunes
a737577881 sstables: Introduce write_monitor
The write_monitor provides callbacks to inform an observer of the
state of the ongoing sstable write.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2017-07-27 21:09:18 +02:00
Duarte Nunes
121f967b30 database: Extract out dirty_memory_manager
Needed so that the flush_permit can be propagated to the sstables layer.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2017-07-27 21:09:18 +02:00
Duarte Nunes
ef1275e9dd dirty_memory_manager: Refactor flush permit lifetime management
This patch refactors how the flush permit lifetime is managed,
dropping the current hash table in favour of a RAII approach.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2017-07-27 21:09:18 +02:00
Duarte Nunes
cfc8fae33f dirty_memory_manager: Invert permit acquisition order
For an upcoming fix it is required to invert the permit acquisition
order: first we acquire the background work permit and then the single
flush permit.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2017-07-27 21:09:18 +02:00
Duarte Nunes
7e68e4677d memtable_list: Register different seal functions for each behaviour
Instead of passing a flush_behaviour to the seal function, use two
different functions for each of the behaviours.

This will be important in the forthcoming patches, which will require
the signatures of those functions to differ.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2017-07-27 21:09:18 +02:00
Duarte Nunes
7502401652 main: Don't catch polymorphic exceptions by value
GCC trunk complains due to exception slicing.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2017-07-27 21:09:18 +02:00
Duarte Nunes
143f4fd861 Merge "Prevent pull requests from accumulating" from Tomasz
If schema merging completes at a lower rate than pull requests arrive,
then merge processes will accumulate and needlessly request and hold schema mutations.

In rare cases, when there are constant schema changes, they may even
overflow memory. This was seen in dtest:

  concurrent_schema_changes_test.py:TestConcurrentSchemaChanges.create_lots_of_schema_churn_test

Allowing only one active and one queued pull request per remote
endpoint is enough.

* tag 'tgrabiec/dont-accumulate-schema-pulls-v2' of github.com:scylladb/seastar-dev:
  migration_manager: Log schema pulls
  migration_manager: Prevent pull requests from accumulating
  utils: Introduce serialized_action
2017-07-27 21:01:38 +02:00
Tomasz Grabiec
e09220dbff migration_manager: Log schema pulls
2017-07-27 20:08:25 +02:00
Tomasz Grabiec
350d98d4e1 migration_manager: Prevent pull requests from accumulating
If schema merging completes at a lower rate than pull requests arrive,
then merge processes will accumulate and needlessly request and hold schema mutations.

In rare cases, when there are constant schema changes, they may even
overflow memory. This was seen in dtest:

  concurrent_schema_changes_test.py:TestConcurrentSchemaChanges.create_lots_of_schema_churn_test

Allowing only one active and one queued pull request per remote
endpoint is enough.
2017-07-27 20:08:25 +02:00
Tomasz Grabiec
6a3703944b utils: Introduce serialized_action
2017-07-27 20:08:21 +02:00
Paweł Dziepak
f02bef7917 streamed_mutation: do not call fill_buffer() ahead of time
consume_mutation_fragments_until() allows consuming mutation fragments
until a specified condition happens. This patch reorganises its
implementation so that we avoid situations when fill_buffer() is called
with stop condition being true.
Message-Id: <20170727122218.7703-1-pdziepak@scylladb.com>
2017-07-27 17:47:57 +02:00
Tomasz Grabiec
ac7e6ef1bc tests: Introduce perf_cache_eviction
2017-07-27 17:19:07 +02:00
Tomasz Grabiec
2d2e7ef6fb tests: simple_schema: Add getter for DDL statement
2017-07-27 17:19:07 +02:00
Tomasz Grabiec
5602be72fa estimated_histogram: Implement percentile()
2017-07-27 17:19:07 +02:00