Commit Graph

14193 Commits

Author SHA1 Message Date
Duarte Nunes
83e983d4d0 mutation_partition: Remove unused operator==()
Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Message-Id: <20180115013546.67260-1-duarte@scylladb.com>
2018-01-15 11:16:35 +02:00
Duarte Nunes
9d1d9883ff mutation_partition: Remove unused for_each_cell() overload
Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Message-Id: <20180115013618.67351-1-duarte@scylladb.com>
2018-01-15 11:16:34 +02:00
Duarte Nunes
b607662d2e collection_type_impl: Make for_each_cell static
Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Message-Id: <20180115013532.67200-1-duarte@scylladb.com>
2018-01-15 11:16:33 +02:00
Avi Kivity
fe788e0a5d mutation_reader: adjust FragmentProducer concept for timeout
forward_to() no accepts a timeout parameter, and the concept should
reflect it, or it breaks the build when concepts are enabled.
2018-01-14 18:09:37 +02:00
Avi Kivity
90dc409c83 Merge "Support for MIN/MAX aggregation functions over date-types" from Dan
"Added support for min/max functions over date/timestamp/timeuuid.

There was one issue with Scylla's type system internals: no C++ type
was mapped to these types. So special "native_types" were added for them.
It required some changes to native functions because these types don't support
the same operations as their real native counterparts.

Fixes #3104."

* 'danfiala/3104-v1' of https://github.com/hagrid-the-developer/scylla:
  tests: Tests for min/max aggregate functions over date/timestamp and timeuuid.
  functions: Added min/max functions for date/timestamp/timeuuid.
  types: Added native types for timestamp and timeuuid.
  Advertise compatibility with CQL Version 3.3.2, since CAST functions are supported.
2018-01-14 17:26:27 +02:00
Daniel Fiala
1d0d419693 tests: Tests for min/max aggregate functions over date/timestamp and timeuuid.
Signed-off-by: Daniel Fiala <daniel@scylladb.com>
2018-01-14 13:17:09 +01:00
Daniel Fiala
5bad03b5a6 functions: Added min/max functions for date/timestamp/timeuuid.
Signed-off-by: Daniel Fiala <daniel@scylladb.com>
2018-01-14 13:13:36 +01:00
Daniel Fiala
0d71194da6 types: Added native types for timestamp and timeuuid.
Signed-off-by: Daniel Fiala <daniel@scylladb.com>
2018-01-14 13:11:36 +01:00
Mika Eloranta' via ScyllaDB development
bc1248e62a build: rpm build script --xtrace option
Enables bash "set -o xtrace" printing of full executed command lines for
debugging purposes.

Signed-off-by: Mika Eloranta <mel@aiven.io>
Message-Id: <20180113212944.86008-1-mel@aiven.io>
2018-01-14 12:32:32 +02:00
Mika Eloranta' via ScyllaDB development
7266446227 build: fix rpm build script --jobs N handling
Fixes argument misquoting at $SRPM_OPTS expansion for the mock commands
and makes the --jobs argument work as supposed.

Signed-off-by: Mika Eloranta <mel@aiven.io>
Message-Id: <20180113212904.85907-1-mel@aiven.io>
2018-01-14 12:30:19 +02:00
Raphael S. Carvalho
fd2b4a7eb3 mutation_reader_test: remove schema left over from dummy selector
it now lives in base class, and this one is useless.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <20180114032943.28228-1-raphaelsc@scylladb.com>
2018-01-14 10:59:48 +02:00
Raphael S. Carvalho
16f8150916 tests: mutation_reader_test: Fix test_combined_reader_slicing_with_overlapping_range_tombstones
Test fails after fa5a26f12d because generated sstable doesn't contain data for the
shard it was created at, so sharding metadata is empty, resulting in exception
added in the aforementioned commit. That's fixed by using the new make_local_key()
to generate data that belongs to current shard.

make_local_keys(), from which make_local_key() is built on top of, will be useful
to make sstable test work again with any smp count.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <20180114032025.26739-1-raphaelsc@scylladb.com>
2018-01-14 10:59:29 +02:00
Tomasz Grabiec
9c391970b8 Merge 'per-request timeouts' from Glauber
Timeouts are a global property. However, for tables in keyspaces like
the system keyspace, we don't want to uphold that timeout--in fact, we
wan't no timeout there at all.

We already apply such configuration for requests waiting in the queued
sstable queue: system keyspace requests won't be removed. However, the
storage proxy will insert its own timeouts in those requests, causing
them to fail.

This patch changes the storage proxy read layer so that the timeout is
applied based on the column family configuration, which is in turn
inherited from the keyspace configuration. This matches our usual
way of passing db parameters down.

In terms of implementation, we can either move the timeout inside the
abstract read executor or keep it external. The former is a bit cleaner,
the the latter has the nice property that all executors generated will
share the exact same timeout point. In this patch, we chose the latter.

We are also careful to propagate the timeout information to the replica.
So even if we are talking about the local replica, when we add the
request to the concurrency queue, we will do it in accordance with the
timeout specified by the storage proxy layer.

After this patch, Scylla is able to start just fine with very low
timeouts--since read timeouts in the system keyspace are now ignored.

Fixes #2462

* git@github.com:glommer/scylla.git timeouts-v8.1:
  database: delete unused function
  consolidate timeout_clock
  mutation_query: add a timeout to the mutation query path
  flat_mutation_reader: pass timeout down to consume()
  add a timeout to fill_buffer
  add a timeout to fast forward to
  restricted_mutation_reader: don't pass timeouts through the config
    structure
  allow request-specific read timeouts in storage proxy reads
2018-01-12 17:06:27 +01:00
Glauber Costa
08a0c3714c allow request-specific read timeouts in storage proxy reads
Timeouts are a global property. However, for tables in keyspaces like
the system keyspace, we don't want to uphold that timeout--in fact, we
wan't no timeout there at all.

We already apply such configuration for requests waiting in the queued
sstable queue: system keyspace requests won't be removed. However, the
storage proxy will insert its own timeouts in those requests, causing
them to fail.

This patch changes the storage proxy read layer so that the timeout is
applied based on the column family configuration, which is in turn
inherited from the keyspace configuration. This matches our usual
way of passing db parameters down.

In terms of implementation, we can either move the timeout inside the
abstract read executor or keep it external. The former is a bit cleaner,
the the latter has the nice property that all executors generated will
share the exact same timeout point. In this patch, we chose the latter.

We are also careful to propagate the timeout information to the replica.
So even if we are talking about the local replica, when we add the
request to the concurrency queue, we will do it in accordance with the
timeout specified by the storage proxy layer.

After this patch, Scylla is able to start just fine with very low
timeouts--since read timeouts in the system keyspace are now ignored.

Fixes #2462

Implementation notes, and general comments about open discussion in 2462:

* Because we are not bypassing the timeout, just setting it high enough,
  I consider the concerns about the batchlog moot: if we fail for any
  other reason that will be propagated. Last case, because the timeout
  is per-CF, we could do what we do for the dirty memory manager and
  move the batchlog alone to use a different timeout setting.

* Storage proxy likes specifying its timeouts as a time_point, whereas
  when we get low enough as to deal with the read_concurrency_config,
  we are talking about deltas. So at some point we need to convert time_points
  to durations. We do that in the database query functions.

v2:
- use per-request instead of per-table timeouts.

Signed-off-by: Glauber Costa <glauber@scylladb.com>
2018-01-12 07:43:21 -05:00
Glauber Costa
3c9eeea4cf restricted_mutation_reader: don't pass timeouts through the config structure
This patch enables passing a timeout to the restricted_mutation_reader
through the read path interface -- using fill_buffer and friends. This
will serve as a basis for having per-timeout requests.

The config structure still has a timeout, but that is so far only used
to actually pass the value to the query interface. Once that starts
coming from the storage proxy layer (next patch) we will remove.

The query callers are patched so that we pass the timeout down. We patch
the callers in database.cc, but leave the streaming ones alone. That can
be safely done because the default for the query path is now no_timeout,
and that is what the streaming code wants. So there is no need to
complicate the interface to allow for passing a timeout that we intend
to disable.

Signed-off-by: Glauber Costa <glauber@scylladb.com>
2018-01-12 07:43:21 -05:00
Glauber Costa
5140aaea00 add a timeout to fast forward to
In the last patch, we enabled per-request timeouts, we enable timeouts
in fill_buffer. There are many places, though, in which we
fast_forward_to before we fill_buffer, so in order to make that
effective we need to propagate the timeouts to fast_forward_to as well.

In the same way as fill_buffer, we make the argument optional wherever
possible in the high level callers, making them mandatory in the
implementations.

Signed-off-by: Glauber Costa <glauber@scylladb.com>
2018-01-12 07:43:19 -05:00
Glauber Costa
d965af42b0 add a timeout to fill_buffer
As part of the work to enable per-request timeouts, we enable timeouts
in fill_buffer.

The argument is made optional at the main classes, but mandatory in all
the ::impl versions. This way we'll make sure we didn't forget anything.

At this point we're still mostly passing that information around and
don't have any entity that will act on those timeouts. In the next patch
we will wire that up.

Signed-off-by: Glauber Costa <glauber@scylladb.com>
2018-01-11 12:07:41 -05:00
Glauber Costa
54d3ebde4e flat_mutation_reader: pass timeout down to consume()
We pass the timeout that we received from data_query/mutation_query
down to consume, which is responsible for actually reading the data.

To make those timeouts actionable, though, we'll have to patch
fill_buffer(). This will happen in the next patch.

Signed-off-by: Glauber Costa <glauber@scylladb.com>
2018-01-11 12:07:41 -05:00
Glauber Costa
8433702c90 mutation_query: add a timeout to the mutation query path
data_query and mutation_query are patched so that they start accepting a
per-query timeout. We will default to no timeout, and then no callers
will be changed yet.

Signed-off-by: Glauber Costa <glauber@scylladb.com>
2018-01-11 12:07:41 -05:00
Glauber Costa
80c4a211d8 consolidate timeout_clock
At the moment, various different subsystems use their different
ideas of what a timeout_clock is. This makes it a bit harder to pass
timeouts between them because although most are actually a lowres_clock,
that is not guaranteed to be the case. As a matter of fact, the timeout
for restricted reads is expressed as nanoseconds, which is not a valid
duration in the lowres_clock.

As a first step towards fixing this, we'll consolidate all of the
existing timeout_clocks in one, now called db::timeout_clock. Other
things that tend to be expressed in terms of that clock--like the fact
that the maximum time_point means no timeout and a semaphore that
wait()s with that resolution are also moved to the common header.

In the upcoming patch we will fix the restricted reader timeouts to
be expressed in terms of the new timeout_clock.

Signed-off-by: Glauber Costa <glauber@scylladb.com>
2018-01-11 12:07:41 -05:00
Glauber Costa
40c428dc19 database: delete unused function
no in-tree users.

Signed-off-by: Glauber Costa <glauber@scylladb.com>
2018-01-11 12:07:41 -05:00
Paweł Dziepak
bd6fa8b331 configure.py: add dependency seastar/configure.py
Scylla's configure.py calls seastar/configure.py and uses seastar.pc
that it produces to generate Scylla's build.ninja. However, there is no
appropriate dependency in build.ninja and changes to
seastar/configure.py alone do not trigger regeneration of Scylla's
build.ninja. This patch remedies that problem.

Message-Id: <20180111144237.5259-1-pdziepak@scylladb.com>
2018-01-11 16:48:06 +02:00
Takuya ASADA
b68ee98310 dist/debian: make pbuilder works on Debian 9
On Debian 9, 'pbuilder create' fails because of lack of GPG key for
3rdparty repo, so we need --allow-untrusted on 'pbuilder create' and
'pbuilder update'.

Also, apt-key adv --fetch-keys does not works correctly on it, but we can use
"curl <URL> | apt-key add -" as workaround.

Fixes #3088

Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <1513797714-18067-1-git-send-email-syuu@scylladb.com>
2018-01-11 15:02:05 +02:00
Takuya ASADA
420b61b466 dist/debian: follow renaming of gcc-7.2 packages on Debian 8
Now we applied our scylla-$(pkg)$(ver) style package naming on gcc-7.2,
so switch to it.

Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <1515522920-8266-1-git-send-email-syuu@scylladb.com>
2018-01-11 15:02:04 +02:00
Duarte Nunes
cbbdfde979 sstables/compaction_backlog_tracker: Constify backlog()
Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Message-Id: <20180111004914.25796-1-duarte@scylladb.com>
2018-01-11 13:20:57 +02:00
Duarte Nunes
43ad5bd182 sstables/compaction_backlog_manager: Fix user-after-free
If the compaction_backlog_manager's lifetime ends before the linked
compaction_backlog_tracker's, the latter's _manager pointer not being
cleared, can lead to a use-after-free error when running
~compaction_backlog_tracker(), as evidenced by unit-tests failed.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Message-Id: <20180111004914.25796-2-duarte@scylladb.com>
2018-01-11 13:20:55 +02:00
Amnon Heiman
372b02676a register the cache API before gossip settle
cache service API does not need to wait for the gossip to settle.

Fixes: #2075

Signed-off-by: Amnon Heiman <amnon@scylladb.com>
Message-Id: <20180103094757.13270-1-amnon@scylladb.com>
2018-01-11 10:27:52 +01:00
Paweł Dziepak
b4a4c04bab combined_reader: optimise for disjoint partition streams
The legacy mutation_reader/streamed_mutation design allowed very easily
to skip the partition merging logic if there was only one underlying
reader that has emitted it.

That optimisation was lost after conversion to flat mutation readers
which has impacted the performance. This patch mostly recovers it by
bypassing most of mutation_reader_merger logic if there is only a single
active reader for a given partition.

The performance regression was introduced in
8731c1bc66 "Flatten the implementation of
combined_mutation_reader".

perf_simple_query -c4 read results (medians of 60):

original regression
             before 8731c1     after 8731c1   diff
 read            326241.02        300244.09  -8.0%

this patch
                    before            after  diff
 read            313882.59        325148.05  3.6%
Message-Id: <20180103121019.764-1-pdziepak@scylladb.com>
2018-01-11 10:21:17 +01:00
Duarte Nunes
891c22904b partition_snapshot_reader: Don't push empty static rows
This patch fixes a regression introduced in 259f6759b4, which pushed
static row fragments regardless of them being empty.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Message-Id: <20180110222936.23085-1-duarte@scylladb.com>
2018-01-11 10:05:51 +01:00
Pekka Enberg
92b2e56211 Merge "Revive round-robin coordinator load balancing" from Vlad
"This series revives the round-robin load balancing added by Pekka back in 2015.

 If somebody tries to enable it with the current master it would quite quickly
 lead to a crash due to a few unresolved issues in the corresponding code.

 Fixes #2351
 Fixes #3118"

* 'fix-round-robin-balancing-v2' of github.com:vladzcloudius/scylla:
  transport::server::process_request(): avoid extra copy of the client_state
  service::cql_server::connection::process_request: use client_state "request copy" constructor
  service::client_state: introduce "request copy" copy-constructor
  service::storage_service: add the get_local_auth_service() accessor
  service::client_state: remove the unused _tracing_session_id field
2018-01-11 09:02:13 +02:00
Daniel Fiala
ef3324129a Advertise compatibility with CQL Version 3.3.2, since CAST functions are supported.
Fixes #3103.

Signed-off-by: Daniel Fiala <daniel@scylladb.com>
2018-01-10 15:01:22 +01:00
Avi Kivity
56801d1b8c Update scylla-ami submodule
* dist/ami/files/scylla-ami 3366c93...3aa87a7 (1):
  > Move to kernel-ml kernel stream
2018-01-10 11:58:27 +02:00
Vlad Zolotarov
26a9aa5157 transport::server::process_request(): avoid extra copy of the client_state
Don't use submit_to(...) when we are going to handle the request on a local
shard. Otherwise there is a not needed copy of the _client_state in the submit_to(...)
lambda capture list.

Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>
2018-01-09 14:00:04 -05:00
Vlad Zolotarov
0b88c52639 service::cql_server::connection::process_request: use client_state "request copy" constructor
Create a cross-shard copy of the client_state object and give it to the single request handling
function and give it a timestamp generated by the original client_state instance (which is promised
to be monotonous).

Fixes #3118

Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>
2018-01-09 14:00:04 -05:00
Vlad Zolotarov
430d172040 service::client_state: introduce "request copy" copy-constructor
A new constructor creates a copy of the current client_status to be
used in the context of the handling of a single request.

The copy may take place at a shard different from the one where the
request has been received.

In order to ensure the monotonicity of the timestamps used by the request handled
on the same connection the created copy of the client_state is going to use the same timestamp provided by the
caller instead of generating it.

It's the caller's responsibility to ensure the monotonicity of given timestamps.

Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>
2018-01-09 14:00:03 -05:00
Duarte Nunes
c142b6d0ee atomic_cell: Remove revert flag
Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Message-Id: <20180109184420.7556-1-duarte@scylladb.com>
2018-01-09 19:54:51 +01:00
Duarte Nunes
259f6759b4 partition_snapshot_reader: Use static_row() to read static_row
Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Message-Id: <20180109162815.5811-2-duarte@scylladb.com>
2018-01-09 19:17:02 +01:00
Duarte Nunes
16c975edcc partition_version: Return static_row fragment from static_row()
Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Message-Id: <20180109162815.5811-1-duarte@scylladb.com>
2018-01-09 19:17:02 +01:00
Avi Kivity
7e898d2745 Merge seastar upstream
* seastar 6972a1e...a7a3e6f (1):
  > Update dpdk submodule
2018-01-09 18:17:33 +02:00
Tomasz Grabiec
5a32cf9008 tests: Make bad_alloc from test_concurrent_reads_and_eviction less likely
With -m1G, the test failed sporadically, because too many large
mutations were accumulated in memory. Avoid by limiting backlog.

Message-Id: <1515486430-4778-1-git-send-email-tgrabiec@scylladb.com>
2018-01-09 13:52:38 +02:00
Tomasz Grabiec
40ea74a934 tests: Drop unconditional mutation printing from assertions
sprint() may need to allocate significant amount of memory if mutation
is large, and cause bad_alloc in
row_cache_test::test_concurrent_reads_and_eviction.

Message-Id: <1515486454-4913-1-git-send-email-tgrabiec@scylladb.com>
2018-01-09 13:52:19 +02:00
Avi Kivity
d340a03e81 Merge seastar upstream
* seastar b0f5591...6972a1e (8):
  > Merge NOWAIT AIO from Avi
  > configure: Allow overriding protoc compiler path
  > Tutorial: fix default of --reserve-memory
  > future-util: optimise parallel_for_each()
  > future-utils: avoid defining a template with its default template parameter
  > fix socket_address output stream operator
  > test: fix spelling of "abort_source_test"
  > Make dependencies and doc more arch-friendly
2018-01-09 12:33:40 +02:00
Piotr Jastrzebski
3bddf3415f flat_mutation_reader: Add test for make_forwardable
Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
Message-Id: <71c8b195e25c3c5c5b97f12e2d7b2f011c0d3162.1515490058.git.piotr@scylladb.com>
2018-01-09 10:46:04 +01:00
Piotr Jastrzebski
945f45f490 Fix fast_forward_to(partition_range&) in forwardable flat reader.
Making sure fast_forward_to(const partition_range&) sets _current
correctly.

Fixes #3089

Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
Message-Id: <6c29cf273f191da0e21035bcbe1592042ecffc70.1515490058.git.piotr@scylladb.com>
2018-01-09 10:46:04 +01:00
Asias He
774307b3a7 streaming: Do send failed message for uninitialized session
The uninitialized session has no peer associated with it yet. There is
no point sending the failed message when abort the session. Sending the
failed message in this case will send to a peer with uninitialized
dst_cpu_id which will casue the receiver to pass a bogus shard id to
smp::submit_to which cases segfault.

In addition, to be safe, initialize the dst_cpu_id to zero. So that
uninitialized session will send message to shard zero instead of random
bogus shard id.

Fixes the segfault issue found by
repair_additional_test.py:RepairAdditionalTest.repair_abort_test

Fixes #3115
Message-Id: <9f0f7b44c7d6d8f5c60d6293ab2435dadc3496a9.1515380325.git.asias@scylladb.com>
2018-01-08 15:04:06 +02:00
Raphael S. Carvalho
4610e994e1 sstables: cure our blindness on sstable read failure
After 611774b, we're blind again on which sstable caused a compaction
to fail, leaving us with cryptic message as follow:
compaction_manager - compaction failed: std::runtime_error (compressed
chunk failed checksum)

After this change, now both read failure in compaction or regular read
will report the guilty sstable, see:
compaction_manager - compaction failed: std::runtime_error (SSTable reader
found an exception when reading sstable ./data/.../keyspace1-standard1
ka-1-Data.db : std::runtime_error(compressed chunk failed checksum))

Fixes #3006.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <20180102230752.14701-1-raphaelsc@scylladb.com>
2018-01-08 13:43:13 +02:00
Avi Kivity
72c673fcc3 Merge "I/O Controller for memtables and compactions" from Glauber
"This patchset implements the compaction controller for I/O shares. The
goal is to automatic adjust compaction shares based on a
strategy-specific backlog. A higher backlog will translate into higher
shares.

As compaction progresses, that reduces the backlog. As new data is
flushed, that increases the backlog. The goal of the controler is to
keep the backlog constant at a certain rate, so that we don't go neither
too fast or too slow.

Tracking reads and writes:
==========================

Tracking of reads and writes happen through the read_monitor and the
write_monitor. The write monitor is an existing interface that has the
purpose of releasing the write permit at particular points of the write
process. We enhance it so to get a reference to an instance that tracks
the current offset inside the sstables::file_writer. This way the
backlog tracker can always know for sure what's the offset of the
current write.

A similar thing is done for reads. The data_consumer already tracks the
position of the current read, and we isolate that into a structure to
which we can get a reference. A read_monitor allows us to connect the
compaction to that reference.

Lifetime management:
====================

In general, tracking objects will be owned by their callers and passed
down as references. The compaction object will own the read monitors and
the compaction write monitors and the memtable flush write monitor will
be kept alive in a do_with block around the flush itself.

The backlog_{write,read}_progress_manager needs to be kept alive until
the SSTable is no longer in progress. For writes, that means until we
are able to add the SSTable charges in full, and for reads (compaction)
that means until we are able to remove the charges in full.

It is important to do that to avoid spikes in the graph. If we remove
the progress managers in a different operation than updating the SSTable
list we will be left in a temporary state where charges appear or
disappear abruptly, to be fixed when the final
add_sstable/remove_sstable happens. So we want those things to happen
together.

The compaction_backlog_tracker is kept alive until the strategy changes,
for example, through ALTER TABLE. Current charges are transferred to the
new strategy's compaction_backlog_tracker object when we do that. If the
type of strategy changes, the current read charges are forgotten. We can
do that because those running compaction will not really contribute to
decrease the backlog of the new compaction strategy.

Tranfer of Charges
==================

When ALTER TABLE happens, we need to transfer ongoing writes to the new
backlog manager. Ongoing reads will still be tracked by the
backlog_manager that originated them.

The rationale for that is that reads still belong to the current
compaction, with the strategy that generated them. But new Tables being
written will add to the backlog of the new strategy.

Note that ALTER TABLE operations not necessarily cause a change of
Strategy. We can be using the same strategy but just changing
properties. If that is the case, we expect no discontinuity in the
backlog graph (tested).

Resharding
==========

Resharding compactions are more complex than normal compactions because
the SSTables are created in one shard and later sent to another shard.
It is better, then, to track resharding compactions separately and let
them have their own backlog tracker, which will insert backlog in
proportion to the amount of data to be resharded.

Memtable Flush I/O Controller
=============================

With the current infrastructure it becomes trivial to add a new
controller, for either I/O or CPU. This patchset then adds an I/O
controller for memtable flushes, using the same backlog algorithm that
we already used for CPU."

* 'compaction-controller-io-v5' of github.com:glommer/scylla:
  database: add a controller for I/O on memtable flushes.
  document the compaction controller
  compaction: adjust shares for compactions
  backlog_controllers: implement generic I/O controller
  factor out some of the controller code
  io shares: multiply all shares by 10
  compaction_strategy: implement backlog manager for the SizeTiered strategy
  infrastructure for backlog estimator for compaction work.
  sstables: notify about end of data component write
  sstables: add read_monitor_generator
  sstables: add read_monitor
  sstables: enhance data consumer with a position tracker
  sstables: enhance the file_writer with an offset tracker
  sstables: pass references instead of pointers for write_monitor
  compaction: control destruction of readers
2018-01-07 15:00:10 +02:00
Avi Kivity
375ed938b4 Merge "Fix potential infinite recursion in leveled compaction" from Raphael
'"The issue is triggered by compaction of sstables of level higher than 0.

The problem happens when interval map of partitioned sstable set stores
intervals such as follow:
[-9223362900961284625 : -3695961740249769322 ]
(-3695961740249769322 : -3695961103022958562 ]

When selector is called for first interval above, the exclusive lower
bound of the second interval is returned as next token, but the
inclusivess info is not returned.
So reader_selector was returning that there *were* new readers when
the current token was -3695961740249769322 because it was stored in
selector position field as inclusive, but it's actually exclusive.

This false positive was leading to infinite recursion in combined
reader because sstable set's incremental selector itself knew that
there were actually *no* new readers, and therefore *no* progress
could be made."

Fixes #2908.'

* 'high_level_compaction_infinite_recursion_fix_v4' of github.com:raphaelsc/scylla:
  tests: test for infinite recursion bug when doing high-level compaction
  Fix potential infinite recursion when combining mutations for leveled compaction
  dht: make it easier to create ring_position_view from token
  dht: introduce is_min/max for ring_position
2018-01-07 13:22:17 +02:00
Vlad Zolotarov
f0d5619634 service::storage_service: add the get_local_auth_service() accessor
Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>
2018-01-05 18:00:11 -05:00
Vlad Zolotarov
1d978b9caa service::client_state: remove the unused _tracing_session_id field
Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>
2018-01-05 18:00:11 -05:00