Commit Graph

9771 Commits

Author SHA1 Message Date
Paweł Dziepak
4acf77d755 sstables: drop unused data_stream_at()
Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2016-07-04 18:17:43 +01:00
Paweł Dziepak
2cdf498bbd sstables: close input stream in sstable::data_read()
Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2016-07-04 18:17:42 +01:00
Paweł Dziepak
8931b939a1 sstables: use finally() to close input streams
Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2016-07-04 18:17:42 +01:00
Paweł Dziepak
e6ececce7f Merge seastar upstream
Submodule seastar a47f893..c82c36f:
  > reactor: fix build error
  > util: lazy_eval: fix compilation errors related to operator<<()s
    definitions
2016-07-04 18:14:05 +01:00
Duarte Nunes
41843b32c5 thrift: Correctly mark a CF as dense
And store whether the comparator is a composite type in the case of
dynamic CFs.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Message-Id: <1467307688-11059-1-git-send-email-duarte@scylladb.com>
2016-07-04 17:40:53 +02:00
Nadav Har'El
c4e871ea2d Work around unexpected data_value constructor
If someone tried to naively use utf8_type->decompose("18wX"), this would
mysteriously fail, returning an empty key.

decompose takes a data_value, so the compiler looked for an implict
conversion from the string constant (const char*) to data_value. We did
not have such a conversion, only conversion from sstring. But the compiler
chose (backed by the C++ standard, no doubt) to implicitly convert the
const char* to a bool (!), and then use data_value(bool). It did not
convert the const char* to an sstring, nor did it warn about the possible
ambiguity.

So this patch adds a data_value(const char*) constructor, so people will
not fall into the same trap that I fell into...

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <1467643462-6349-1-git-send-email-nyh@scylladb.com>
2016-07-04 17:50:53 +03:00
Avi Kivity
e22517bafc Merge "Optimize reads from leveled sstables"
In a leveled column family, there can be many thousands of sstables, since
each sstable is limited to a relatively small size (160M by default).
With the current approach of reading from all sstables in parallel, cpu
quickly becomes a bottleneck as we need to check the bloom filter for each
of these sstables.

This patch addresses the problem by introducing a
compaction-strategy-specific data structure for holding sstables.  This
data structure has a method to obtain the sstables used for a read.

For leveled compaction strategy, this data structure is an interval map,
which can be efficiently used to select the right sstables.
2016-07-04 16:00:35 +03:00
Asias He
610a0f7ef0 storage_service: Skip feature check for seed node for now
When a seed node boots up with more than one node in the seed list, it
will fail to talk to the other seed node which is not up yet.
This fails the feature check, so the seed node will not boot.

Skip the feature check for seed node for now, util we have a proper solution.

Fixes recent dtest failure due to fail to boot the seed node.

Message-Id: <e1d4110f96817e45f81dc0bc948dd14600fc5333.1467251799.git.asias@scylladb.com>
2016-07-04 15:09:57 +03:00
Avi Kivity
28fab55e6e Merge "Convert sstable writes to streamed mutations" from Paweł
"This series converts sstable writers (including compaction) to streamed
mutations and makes them use consumer-style interface.

Code related to sstable writes and compaction is converted to consumers
that can be used with consume_flattened_in_thread() (which is a variant
of consume_flattened() intended to be run inside a thread).
compac_for_query is improved so that it can be reused by sstable
compaction."
2016-07-04 15:07:47 +03:00
Avi Kivity
171054e87b Merge seastar upstream
* seastar d4d9e16...a47f893 (1):
  > Merge "overprovisioning support"
2016-07-04 13:46:03 +03:00
Paweł Dziepak
5d0de2179a Merge "Adding scylla version API" from Amnon
Amnon says:

The API that returns the version, currently returns the compatibility
version
(e.g. the version the compatible origin version - currently 2.1.8).

The check version functionality need to know what is the current running
version of scylla.  For that a new API was added that return the current
version.

The result is equivalent of running scylla --version.

After this series a call to:

$ curl -X GET
"http://localhost:10000/storage_service/scylla_release_version"
"666.development-20160703.72f0d4d"

Which is the json representation of:

$ ./build/release/scylla --version
666.development-20160703.72f0d4d
2016-07-04 10:52:44 +01:00
Asias He
f6a2672be0 storage_service: Modify log to match config option of scylla
We currently log as follow:

May  9 00:09:13 node3.nl scylla[2546]:  [shard 0] storage_service - This
node was decommissioned and will not rejoin the ring unless
cassandra.override_decommission=true has been set,or all existing data
is removed and the node is bootstrapped again

Howerver, user should use

   override_decommission:true

instead of

   cassandra.override_decommission:true

in scylla.yaml where the cassandra prefix is stripped.

Fixes #1240
Message-Id: <b0c9424c6922431ad049ab49391771e07ca6fbde.1467079190.git.asias@scylladb.com>
2016-07-04 10:47:49 +02:00
Avi Kivity
76cc0c0cf9 auth: fix performance problem when looking up permissions
data_resource lookup uses data_resource::name(), which uses sprint(), which
uses (indirectly) locale, which takes a global lock.  This is a bottleneck
on large machines.

Fix by not using name() during lookup.

Fixes #1419
Message-Id: <1467616296-17645-1-git-send-email-avi@scylladb.com>
2016-07-04 10:26:18 +02:00
Yoav Kleinberger
49cba035ea tools/scyllatop: leave terminal in a functioning state when user quits with CTRL-C
closes issue #1417.

Signed-off-by: Yoav Kleinberger <yoav@scylladb.com>
Message-Id: <1467556769-11851-1-git-send-email-yoav@scylladb.com>
2016-07-03 17:43:46 +03:00
Amnon Heiman
e66a1cd705 API: Add implementation for the scylla release version
This adds the implementation to the scylla release version API.

After this patch a call to:

curl -X GET "http://localhost:10000/storage_service/scylla_release_version"

Will return the current scylla release version.

Signed-off-by: Amnon Heiman <amnon@scylladb.com>
2016-07-03 16:29:09 +03:00
Amnon Heiman
56ea8c943e API: add scylla release version API
This adds a definition to the scylla release version. The API already
return the compatibility version (ie. the compatible origin version)

This definition returns the scylla version, a call to the API should
return the same result as running scylla --version.

Signed-off-by: Amnon Heiman <amnon@scylladb.com>
2016-07-03 16:26:21 +03:00
Avi Kivity
68e613b313 Rebuild _column_family::_sstables when changing compaction_strategy
The concrete sstable_set type depends on the compaction strategy, so
ask the compaction_strategy to create a new sstable_set object and populate
it.
2016-07-03 13:42:10 +03:00
Avi Kivity
44a6cef4e1 sstable mutation readers: use sstable_set::select()
Apply compaction strategy specific logic to narrow down the set of sstables
used for a query; can speed up reads using LeveledCompactionStrategy
significantly.

Fixes #1185.
2016-07-03 10:50:58 +03:00
Avi Kivity
4cb7618601 Convert column_family::_sstables to sstable_set
Using sstable_set will allow us to filter sstables during a query before
actually creating a reader (this is left to the next patch; here we just
convert the users of the _sstables field).
2016-07-03 10:32:27 +03:00
Avi Kivity
c8237fc262 compaction_strategy: introduce make_sstable_set()
Allow compaction_strategy to create a container for sstables that is
optimized for the strategy.

Most compaction_strategies return bag_sstable_set; leveled compaction
returns the specialized partitioned_sstable_set.
2016-07-03 10:27:01 +03:00
Avi Kivity
168696c558 Introduce partitioned_sstable_set
partitioned_sstable_set assumes that sstable are mostly partitioned along
the token range: only a few sstables will be needed to access a particular
token.  It is implemented as an interval_map.
2016-07-03 10:27:00 +03:00
Avi Kivity
64e4357461 Introduce bag_sstable_set
bag_sstable_set is a generic sstable_set implementation: it assumes nothing
about the sstables.  It is implemented as a vector, and any select will
return the entire sstable set.
2016-07-03 10:27:00 +03:00
Avi Kivity
85e9cf4616 Introduce sstable_set
sstable_set abstracts the notion of a container of sstables, allowing
different compaction strategies to supply their own implementation.  The
intended user is leveled compaction strategy; since it partitions sstables,
it can quickly restrict the number of sstables that participate in a query
by looking at the min/max partition key.

sstable_set also maintains an internal lw_shared_ptr<sstable_list>,
in parallel with the abstract container.  This is to support
column_family::get_sstable(), which returns a lw_shared_ptr<sstable_list>
which must be anchored somewhere if it is not saved at the caller side,
as it isn't in most current callers.
2016-07-03 10:27:00 +03:00
Avi Kivity
c1815abd15 Introduce compatible_ring_position
ring_position is built for modern code that does not require default
constructors or stateless comparators.  But not all code is modern, so
supply a compatible_ring_position that works with old code, at the cost
of some extra storage.  Intended user is boost's interval container
library.
2016-07-03 10:27:00 +03:00
Avi Kivity
2a46410f4a Change sstable_list from a map to a set
sstable_list is now a map<generation, sstable>; change it to a set
in preparation for replacing it with sstable_set.  The change simplifies
a lot of code; the only casualty is the code that computes the highest
generation number.
2016-07-03 10:26:57 +03:00
Duarte Nunes
386c0dd4b2 storage_proxy: Correctly calculate new limit
This patch fixes a bug where we would always return query::max_rows
when calculating the new limit for a retry read command.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Message-Id: <1467289746-18177-1-git-send-email-duarte@scylladb.com>
2016-06-30 14:49:56 +02:00
Paweł Dziepak
b150720361 sstable: enable read ahead
Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2016-06-30 13:18:24 +01:00
Paweł Dziepak
4513f8b52c sstables: add compressed_file_data_source_impl::close()
compressed_file_data_source_impl should close the underlying data source
properly when asked to.

Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2016-06-30 13:07:07 +01:00
Paweł Dziepak
55a6911d7a sstables: close input_stream<> properly
If read ahead is going to be enabled it is important to close
input_stream<> properly (and wait for completion) before destroying it.

Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2016-06-30 11:39:01 +01:00
Paweł Dziepak
e44e12c74a sstables: drop no longer needed code
Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2016-06-30 11:39:01 +01:00
Paweł Dziepak
c2f0ee9b5f sstables: add consumer-style sstable compactor
This patch moves compaction logic to a consumer that can be used with
consume_flattened_in_thread(). Internally, sstable_writer is used to
write individual sstables.

Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2016-06-30 11:39:01 +01:00
Paweł Dziepak
18a9ee105f sstables: add consumer-style sstable writer
sstable_writer encapsulates all logic related to writing sstable.
Previously introduced component_writer is used to write actual
mutations. sstable_writer is intended to be used with
consume_flattened_in_thread(). Its purpose is to be used by higher-level
consumer that needs to write possibly more than one sstable (sstable
compaction is an example of such consumer).

Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2016-06-30 11:39:01 +01:00
Paweł Dziepak
0e8b8463ba sstables: introduce consumer-style components writer
This patch rewrites do_write_components() so that it can use
consume_flattened_in_thread(). All components-writing code is moved to a
new consumer: component_writer.

Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2016-06-30 11:39:01 +01:00
Paweł Dziepak
0287e0c9ac mutation_reader: add consume_flattened_in_thread()
This is a version of consume_flattened() intended to be run inside a
thread. All consumer code is going to be invoked in the same thread
context.

Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2016-06-30 11:39:01 +01:00
Paweł Dziepak
7a95847014 mutation_compactor: prepare for sstable compaction
compact_mutation code is going to be shared among queries and sstable
compaction. There are some differences though. Queries don't provide
_max_purgeable and sstable compaction don't need any limits.

Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2016-06-30 11:39:01 +01:00
Paweł Dziepak
00bcc05d36 mutation_compactor: _max_purgeable depends on the decorated key
_max_perguable can be different for each partition, since it is computed
using sstables in which that partition is present (or likely to be
present).

Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2016-06-30 11:39:01 +01:00
Paweł Dziepak
4133cc7a53 mutation_reader: make consume_flattened() produce decorated keys
Since decorated keys are already computed it is better to pass more
information than less. Consumers interested just in partition key can
just drop token and the ones requiring full decorated key don't need to
recompute it.

Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2016-06-30 11:39:00 +01:00
Paweł Dziepak
fe4b739828 mutation_compactor: rename compact_for_query to compact_mutation
Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2016-06-30 11:37:54 +01:00
Paweł Dziepak
3e86f9ab73 mutation_partition: extract compact_for_query to a separate header
The compacting logic inside compact_for_query is going to be shared with
sstable compaction.

Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2016-06-30 11:37:54 +01:00
Paweł Dziepak
9b14c93677 streamed_mutation: return reference to decorated key
Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2016-06-30 11:37:54 +01:00
Paweł Dziepak
3c08ffb275 query: add full_slice
query::full_slice is a partiton slice which has full clustering row
ranges for all partition keys and no per-partition row limit.
Options and columns are not set.

It is used as a helper object in cases when a reference to
partition_slice is needed but the user code needs just all data there is
(an example of such case would be sstable compaction).

Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2016-06-30 11:37:54 +01:00
Paweł Dziepak
599ed7f1ed sstables: restore indentation
Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2016-06-30 11:37:54 +01:00
Paweł Dziepak
e7ff20b3bb sstables: run compaction code inside a thread
Currently, each sstable write has its separate thread. However, the goal
is to have compaction use consume_flattened() with a consumer that
creates and writes the sstables. consume_flattened() needs to be executed
inside a thread, since sstable writer may defer.

This patch is a first step in preparations and it just makes whole
compaction logic run inside a thread. That makes little sense now, since
all sstable writes spawn their own threads but that's going to change
in the following patches.

Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2016-06-30 11:37:54 +01:00
Duarte Nunes
0ae6eafadd query: Make partition_limit last parameter
The partition_limit should have been added to the end of the ctor
argument list, as its current placement causes some callers to pass it
the timestamp instead of the limit.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Message-Id: <1467239360-6853-3-git-send-email-duarte@scylladb.com>
2016-06-30 12:31:11 +02:00
Gleb Natapov
8bf82cc31c put additional info into cql timeout exception
Fixes #1397

Message-Id: <20160628101829.GR14658@scylladb.com>
2016-06-30 12:03:48 +02:00
Paweł Dziepak
b70bf086b7 frozen_mutation: handle reversed streams properly
Freezing streamed_mutations assumed that mutation fragments are streamed
in the order they appear in the frozen mutation. That is not true for
reversed streams.

Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
Message-Id: <1467277069-18702-1-git-send-email-pdziepak@scylladb.com>
2016-06-30 11:26:45 +02:00
Avi Kivity
9ac730dcc9 mutation_reader: make restricting_mutation_reader even more restricting
While limiting the number of concurrently executing sstable readers reduces
our memory load, the queued readers, although consuming a small amount of
memory, can still grow without bounds.

To limit the damage, add two limits on the queue:
 - a timeout, which is equal to the read timeout
 - a queue length limit, which is equal to 2% of the shard memory divided
   by an estimate of the queued request size (1kb)

Together, these limits bound the amount of memory needed by queued disk
requests in case the disk can't keep up.
Message-Id: <1467206055-30769-1-git-send-email-avi@scylladb.com>
2016-06-29 15:17:35 +02:00
Raphael S. Carvalho
85cb2a6d35 database: trigger compaction on boot
At the moment, we only trigger compaction after creating a new
sstable as a result of memtable flush, or some other event such
as changing compaction strategy of a column family.
However, it's important to trigger compaction on boot too.
That will happen after loading all column families.

Fixes #1404.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <54d38a418157454eec97aaba6b8a6b6e51484db4.1467135349.git.raphaelsc@scylladb.com>
2016-06-29 13:47:42 +03:00
Amnon Heiman
610fe274fd services: Make scylla-jmx service depends on scylla-server
The scylla-jmx no longer shutdown itself. A better setup would be that
the it would be started when the scylla-server starts and that it would
shutdown when the scylla-server shutdown.

This patch do the scylla-server part of the change.

The scylla-server definition would Want the scylla-jmx.service so there
is no need to enable the scylla-jmx.service.

A patch to the scylla-jmx would cause it to shutdown when the scylla-jmx
shutsdown.

Signed-off-by: Amnon Heiman <amnon@scylladb.com>
Message-Id: <1467184502-4358-1-git-send-email-amnon@scylladb.com>
2016-06-29 11:36:04 +03:00
Avi Kivity
2c4501f317 Merge seastar upstream
* seastar c15055c...d4d9e16 (4):
  > semaphore: switch to chunked_fifo
  > fair_queue: add missing include
  > chunked_fifo: implement back()
  > Chunked FIFO queue
2016-06-28 19:30:29 +03:00