Commit Graph

16241 Commits

Author SHA1 Message Date
Avi Kivity
f258df099a Update ami submodule
* dist/ami/files/scylla-ami d53834f...c7e5a70 (1):
  > ds2_configure.py: uncomment 'cluster_name' when it's commented out
2018-07-31 09:34:33 +03:00
Avi Kivity
e7ae4beef0 main: run prometheus and API servers under streaming group
Both the Prometheus and the API servers are used for maintenance
operations, similarly to streaming. Run them under the streaming
scheduling group to prevent them from impacting normal operations,
and rename the streaming scheduling group to reflect the more
generic role.

This helps to prevent spikes from Prometheus or API requests from
interfering with the normal workload. Using an existing group is
preferable to creating a new group because in the worst case, all
the non-main-workload groups compete with the main workload.
Consolidating them allows us to give them significant shares in
total without increasing competition in the worst case.

The group's label is unchanged to preserve compatibility with
dashboards.

A nice side effect is that repair, which is initiated by API calls,
gets placed into the maintenance group naturally. Compaction tasks
which are run by compaction manager are not changed.
Message-Id: <20180714160723.23655-1-avi@scylladb.com>
2018-07-30 15:07:33 +01:00
Avi Kivity
a4282c2c6e tracing: move tracing code to cold path
Most queries run without tracing (and those that run with tracing
are not sensitive to a few cycles), so mark the tracing paths as
cold.
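
A minimal sketch of what moving tracing to a cold path can look like, assuming GCC or Clang. The names `trace_state`, `record_event` and `record_event_slow` are hypothetical and are not taken from the patch; only the general technique (an out-of-line function marked cold behind a cheap check) is illustrated.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical per-request tracing state; not Scylla's actual class.
struct trace_state {
    std::vector<std::string> events;
};

// Out-of-line slow path. [[gnu::cold]] (GCC/Clang) marks the function as
// rarely executed, so the compiler optimises it for size and keeps it away
// from hot code; [[gnu::noinline]] keeps it out of the caller's body.
[[gnu::cold]] [[gnu::noinline]]
void record_event_slow(trace_state& ts, const char* msg) {
    ts.events.emplace_back(msg);
}

// Hot path: a single, usually-false pointer check when tracing is off.
inline void record_event(trace_state* ts, const char* msg) {
    if (__builtin_expect(ts != nullptr, false)) {
        record_event_slow(*ts, msg);
    }
}
```

With this shape, an untraced query pays only for the null check, while a traced query takes the extra call.
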
Message-Id: <20180723133000.30482-1-avi@scylladb.com>
2018-07-30 15:05:57 +01:00
Rafi Einstein
123f2c2a1c Add a counter for reverse queries
Fixes #3492

Tests: dtest(cql_additional_tests.py)
Message-Id: <20180729202615.22459-1-rafie@scylladb.com>
2018-07-30 12:34:43 +03:00
Takuya ASADA
032b26deeb dist/common/scripts/scylla_ntp_setup: fix typo
Comments in Python start with "#", not "//".

Fixes #3629

Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <20180730091022.4512-1-syuu@scylladb.com>
2018-07-30 12:30:53 +03:00
Avi Kivity
04d88e8ff7 scripts: add a script to compute optimal number of compile jobs
This will allow continuous integration to use the optimal number
of compiler jobs, without having to resort to complex calculations
from its scripting environment.

Message-Id: <20180722172050.13148-1-avi@scylladb.com>
2018-07-30 10:15:11 +03:00
Avi Kivity
a4c9330bfc Merge "Optimise paged queries" from Paweł
"
This series adds some optimisations to the paging logic that attempt to
close the performance gap between paged and non-paged queries. The
former are more complex, so they are always going to be slower, but the
performance loss was unacceptably large.

Fixes #3619.

Performance with paging:
        ./perf_paging_before  ./perf_paging_after   diff
 read              271246.13            312815.49  15.3%

Without paging:
        ./perf_nopaging_before  ./perf_nopaging_after   diff
 read                343732.17              342575.77  -0.3%

Tests: unit(release), dtests(paging_test.py, paging_additional_test.py)
"

* tag 'optimise-paging/v1' of https://github.com/pdziepak/scylla:
  cql3: select statement: don't copy metadata if not needed
  cql3: query_options: make simple getter inlineable
  cql3: metadata: avoid copying column information
  query_pager: avoid visiting result_view if not needed
  query::result_view: add get_last_partition_and_clustering_key()
  query::result_reader: fix const correctness
  tests/uuid: add more tests including make_randm_uuid()
  utils: uuid: don't use std::random_device()
2018-07-26 19:24:03 +03:00
Nadav Har'El
25bd139508 cross-tree: clean up use of std::random_device()
std::random_device() uses the relatively slow /dev/urandom, and we rarely if
ever intend to use it directly - we normally want to use it to seed a faster
random_engine (a pseudo-random number generator).

In many places in the code, we first created a random_device variable and then
used it to create a random_engine variable. However, this practice created the
risk of a programmer accidentally using the random_device object instead of the
random_engine object, because both have the same API; doing so hurts performance.

This risk materialized in just two places in the code, utils/uuid.cc and
gms/gossiper.cc. A patch for uuid.cc was sent previously by Pawel and is
not included in this patch; the fix for gossiper.{cc,hh} is included here.

To avoid risking the same mistake in the future, this patch switches across the
code to an idiom where the random_device object is not *named*, so cannot be
accidentally used. We use the following idiom:

   std::default_random_engine _engine{std::random_device{}()};

Here std::random_device{}() creates the random device (/dev/urandom) and pulls
a random integer from it. It then uses this seed to create the random_engine
(the pseudo-random number generator). The std::random_device{} object is
temporary and unnamed, and cannot be unintentionally used directly.
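
The idiom above can be sketched in a small class as follows. The class name `id_generator`, its `_dist` member and the 0..999 range are assumptions made for illustration; only the `_engine` initialiser is taken from the commit message.

```cpp
#include <cassert>
#include <cstdint>
#include <random>

// Hypothetical class used only to illustrate the idiom.
class id_generator {
    // The std::random_device temporary is called once to produce a seed and
    // is destroyed at the end of the initialiser; there is no named device
    // that later code could call by mistake.
    std::default_random_engine _engine{std::random_device{}()};
    std::uniform_int_distribution<uint32_t> _dist{0, 999};
public:
    // Every draw goes through the pseudo-random engine.
    uint32_t next() { return _dist(_engine); }
};
```

Because the only named random source in the class is `_engine`, a later edit cannot reach the slow device by accident.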

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20180726154958.4405-1-nyh@scylladb.com>
2018-07-26 16:54:58 +01:00
Takuya ASADA
8e4d1350c9 dist/common/scripts/scylla_ntp_setup: ignore ntpdate error
Even if ntpdate fails to adjust the clock, ntpd may be able to recover it later,
so ignore the ntpdate error and keep running the script.

Fixes #3629

Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <20180726080206.28891-1-syuu@scylladb.com>
2018-07-26 14:44:53 +03:00
Paweł Dziepak
3e32245bb8 cql3: select statement: don't copy metadata if not needed
2018-07-26 12:37:20 +01:00
Paweł Dziepak
15775c958a cql3: query_options: make simple getter inlineable
2018-07-26 12:37:06 +01:00
Paweł Dziepak
ef0c999742 cql3: metadata: avoid copying column information
The column-related metadata is shared by all requests done with the same
prepared query. However, the metadata class also contains some additional
flags and paging state which may differ. This patch allows sharing
column information among multiple instances of the metadata class.
2018-07-26 12:17:04 +01:00
Paweł Dziepak
757d9e3b5d query_pager: avoid visiting result_view if not needed
query::result_view provides get_last_partition_and_clustering_key(),
which allows getting those without iterating through the whole result.
Moreover, the row count may be precomputed in the result; if it isn't,
query::result_view::count_partitions_and_rows() can be used to get it.
2018-07-26 12:14:48 +01:00
Paweł Dziepak
9b6dc52255 query::result_view: add get_last_partition_and_clustering_key()
Paging needs to get last partition and clustering key (if the latter
exists). Previously, this was done by result_view visitor but that is
suboptimal. Let's add a direct getter for those.
2018-07-26 12:12:08 +01:00
Paweł Dziepak
b5ed4c8806 query::result_reader: fix const correctness
2018-07-26 12:11:27 +01:00
Paweł Dziepak
495df277f9 tests/uuid: add more tests including make_randm_uuid()
2018-07-26 12:03:37 +01:00
Paweł Dziepak
b485deb124 utils: uuid: don't use std::random_device()
std::random_device() is extremely slow. This patch modifies
make_rand_uuid() so that it requires only two invocations of the PRNG.
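
A minimal sketch of building a random UUID from exactly two PRNG invocations. This is not the actual make_rand_uuid(): the function name `make_random_uuid_bits`, the choice of `std::mt19937_64` and the thread-local engine are assumptions. The version and variant bit positions follow RFC 4122.

```cpp
#include <cassert>
#include <cstdint>
#include <random>
#include <utility>

// Returns the two 64-bit halves (most significant, least significant) of a
// random version-4 UUID. Illustrative only; not the actual make_rand_uuid().
inline std::pair<uint64_t, uint64_t> make_random_uuid_bits() {
    // One engine per thread, seeded once from an unnamed std::random_device.
    static thread_local std::mt19937_64 engine{std::random_device{}()};
    uint64_t msb = engine();   // first PRNG invocation
    uint64_t lsb = engine();   // second PRNG invocation
    // Set the version field (bits 12-15 of msb) to 4.
    msb = (msb & ~uint64_t(0xF000)) | uint64_t(0x4000);
    // Set the variant field (top two bits of lsb) to binary 10.
    lsb = (lsb & ~(uint64_t(3) << 62)) | (uint64_t(2) << 62);
    return {msb, lsb};
}
```

Each 64-bit engine output fills one half of the UUID, so the slow random device is touched only once per thread, for seeding.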
2018-07-26 12:02:32 +01:00
Avi Kivity
b167647bf6 dist: redhat: fix up bad file ownership of rpms/srpms
mock outputs files owned by root. This causes attempts
by scripts that want to junk the working directory (typically
continuous integration) to fail on permission errors.

Fix up those permissions after the fact.
Message-Id: <20180719163553.5186-1-avi@scylladb.com>
2018-07-26 08:20:42 +03:00
Avi Kivity
bea1f715dc storage_proxy: count cross-shard operations
Count operations which were started on one shard and
were performed on another, due to a non-shard-aware driver
and/or RPC.
Message-Id: <20180723155118.8545-1-avi@scylladb.com>
2018-07-25 16:21:04 +01:00
Avi Kivity
d6ef74fe36 Merge "Fix JSON string quoting" from Piotr
"

This mini-series covers a regression caused by the newest versions
of the jsoncpp library, which changed the way UTF-8 strings are quoted.

Tests: unit (release)
"

* 'add_json_quoting_3' of https://github.com/psarna/scylla:
  tests: add JSON unit test
  types: use value_to_quoted_string in JSON quoting
  json: add value_to_quoted_string helper function

Ref #3622.
Reviewed-by: Nadav Har'El <nyh@scylladb.com>
2018-07-25 17:49:55 +03:00
Piotr Sarna
b367cff05d tests: add JSON unit test
Since value_to_quoted_string now has an internal implementation,
a unit test is provided to check if strings are quoted
and escaped properly.
2018-07-25 13:16:06 +02:00
Piotr Sarna
d307b5712c types: use value_to_quoted_string in JSON quoting
In order to avoid regressions caused by external libraries,
our own value_to_quoted_string implementation is used.

Fixes #3622
2018-07-25 13:16:06 +02:00
Piotr Sarna
783762a958 json: add value_to_quoted_string helper function
After the open-source-parsers/jsoncpp@42a161f commit, jsoncpp's version
of valueToQuotedString no longer fits our needs, because too many
UTF-8 characters are unnecessarily escaped. To remedy that,
this commit provides our own string quoting implementation.
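
A minimal sketch of JSON string quoting that leaves UTF-8 bytes untouched. The function name `quote_json_string` is hypothetical and the escaping rules shown (quote, backslash and control characters only) are an assumption based on the JSON grammar, not a copy of the actual value_to_quoted_string helper.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Quotes s as a JSON string. Only '"', '\\' and control characters below
// 0x20 are escaped; bytes >= 0x80 (UTF-8 multi-byte sequences) are copied
// unchanged. Illustrative sketch, not the actual value_to_quoted_string().
inline std::string quote_json_string(const std::string& s) {
    std::string out;
    out.reserve(s.size() + 2);
    out.push_back('"');
    for (unsigned char c : s) {
        switch (c) {
        case '"':  out += "\\\""; break;
        case '\\': out += "\\\\"; break;
        case '\b': out += "\\b"; break;
        case '\f': out += "\\f"; break;
        case '\n': out += "\\n"; break;
        case '\r': out += "\\r"; break;
        case '\t': out += "\\t"; break;
        default:
            if (c < 0x20) {
                // Remaining control characters use the \u00XX form.
                char buf[7];
                std::snprintf(buf, sizeof(buf), "\\u%04x", static_cast<unsigned>(c));
                out += buf;
            } else {
                out.push_back(static_cast<char>(c));
            }
        }
    }
    out.push_back('"');
    return out;
}
```

Keeping the implementation local means a change in an external library's escaping policy cannot alter the output.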

Reported-by: Nadav Har'El <nyh@scylladb.com>

Refs #3622
2018-07-25 13:16:00 +02:00
Piotr Sarna
f66aace685 cql3: fix INSERT JSON grammar
Previously, the CQL grammar wrongly required INSERT JSON queries
to provide a list of columns, even though they are already
present in the JSON itself.
Unfortunately, tests were written with this false assumption as well,
so they are updated.
Message-Id: <33b496cba523f0f27b6cbf5539a90b6feb20269e.1532514111.git.sarna@scylladb.com>
2018-07-25 11:36:59 +01:00
Avi Kivity
b443a9b930 compaction: demote compaction start/end messages to DEBUG level
Compactions start and end all the time, especially with many shards,
and don't contribute much to understanding what is going on these
days. Compaction throughput is available through the metrics and
other information is available via the compaction history table.

Demote compaction start and end messages to DEBUG level to keep
the log clean. Cleaning and resharding compactions are kept as
INFO, at least for now, since they are manual operations and
therefore rarer.
Message-Id: <20180724132859.14109-1-avi@scylladb.com>
2018-07-25 09:53:39 +01:00
Takuya ASADA
58f094e06d dist/debian: fix ImportError on pystache
It seems that pystache is not provided as a dependency, so it needs to be
installed in build_deb.sh.

Fixes #3627

Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <20180724164852.16094-1-syuu@scylladb.com>
2018-07-25 07:42:19 +03:00
Avi Kivity
e2ad45c3db Merge "Add clustering prefix logic to indexes and filtering" from Piotr
"
This series follows up on the ALLOW FILTERING support series and depends on
this one: https://groups.google.com/d/msg/scylladb-dev/Qxt3_MP03jI/5ZhRTJ3gBwAJ

The following optimizations regarding clustering key prefix and filtering are
applied:
 * if clustering key restrictions require filtering, but they still
   contain any part of the prefix, this prefix can be used to narrow
   down the query by using it in computing clustering bounds
 * if an indexed query has partition key restrictions and any clustering
   key restrictions that form a prefix, then from now on this prefix
   will be used to narrow down the index query

"

Ref #3611.

* 'use_prefix_with_filtering_and_si_4' of https://github.com/psarna/scylla:
  tests: add prefix cases to indexed filtered queries tests
  cql3: use ck prefix in filtered queries
  cql3: use clustering key prefix in index queries
  cql3: add conversion to ck longest prefix restrictions
  cql3: add prefix_size method to ck restrictions
2018-07-23 15:28:50 +03:00
Piotr Sarna
517a5b66ba tests: add prefix cases to indexed filtered queries tests
More cases related to querying clustering key prefix in an indexed
query are added to secondary index test suite.
2018-07-23 14:10:52 +02:00
Piotr Sarna
8523c24576 cql3: use ck prefix in filtered queries
If a filtering query has restrictions that include any clustering
prefix, the longest prefix will be used to narrow down the query.

Fixes #3611
2018-07-23 14:10:52 +02:00
Piotr Sarna
6cc8ccc771 cql3: use clustering key prefix in index queries
If an indexed query has partition+clustering key restrictions as well
and at least some of these restrictions create a prefix, this prefix
is used in the index query to narrow down the number of rows read.

Refs #3611
2018-07-23 14:10:52 +02:00
Piotr Sarna
ab74f75727 cql3: add conversion to ck longest prefix restrictions
For optimization purposes it's sometimes useful to extract
the longest prefix of clustering key restrictions in order
to narrow down queries.
2018-07-23 14:10:52 +02:00
Piotr Sarna
2e4c493870 cql3: add prefix_size method to ck restrictions
Clustering key restrictions are usually set for at least part
of the clustering key prefix. A method of extracting the longest
prefix's size is added.
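
A simplified sketch of computing the longest restricted prefix size. The function name `longest_prefix_size` and the `std::vector<bool>` representation are assumptions for illustration; the real method works on restriction objects and may also take the kind of each restriction into account.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// restricted[i] is true when the i-th clustering column (in schema order)
// has a restriction. The longest usable prefix ends at the first gap.
// Hypothetical model; not the actual prefix_size() implementation.
inline std::size_t longest_prefix_size(const std::vector<bool>& restricted) {
    std::size_t n = 0;
    while (n < restricted.size() && restricted[n]) {
        ++n;
    }
    return n;
}
```

For a clustering key (c1, c2, c3, c4) with restrictions on c1, c2 and c4, the prefix size is 2: c4 lies beyond the gap at c3 and cannot narrow the clustering bounds.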
2018-07-23 14:10:52 +02:00
Vladimir Krivopalov
ec7f853f49 sstables: Do not pass liveness_info to consume_row_end().
The liveness_info is unconditionally added to the _in_progress_row as of
commit cbfc741d70, so there is no need to pass it to consume_row_end() and
add it conditionally.

Signed-off-by: Vladimir Krivopalov <vladimir@scylladb.com>
Message-Id: <7cd3e599817cbd4b857c3295153602cd2b9a6ef1.1532311852.git.vladimir@scylladb.com>
2018-07-23 13:10:36 +03:00
Avi Kivity
bb79eccf55 tests: sstable_mutation_test: hack around leak during sstable close
sstable close is an asynchronous operation launched in the background,
so we can't wait for it. If the test ends before all operations are
complete, the background operations are detected as leaks.

We need either a proper close(), or maybe a sstables::quiesce() that
waits until there are no sstables alive on the shard, but until then,
a hack.
2018-07-23 12:40:46 +03:00
Avi Kivity
af6ce47082 Merge "Support filtering and fast-forwarding with SSTables 3.x" from Piotr and Vladimir
"
This patchset authored by Piotr fixes ck filtering and fast forwarding in SSTables 3.x.
For now only clustering rows are supported and range tombstones will come next.

Test: unit {release}
"

* 'projects/sstables-30/filtering/v5' of https://github.com/argenet/scylla:
  sstables: Minor clean-up and renaming to clustering_ranges_walker.
  sstables: Add test for filtering and forwarding
  sstables: Fix schema for static row tests
  sstables: Fix ck filtering and fast forwarding
  sstables: Introduce mutation_fragment_filter
2018-07-22 21:11:51 +03:00
Avi Kivity
761931659a Merge "Do not linearise incoming CQL3 requests" from Paweł
"
This series changes the native CQL3 protocol layer so that it works with
fragmented buffers instead of a single temporary_buffer per request.
The main part is fragmented_temporary_buffer, which represents a
fragmented buffer consisting of multiple temporary_buffers. It provides
helpers for reading a fragmented buffer from an input_stream and for
interpreting the data in the fragmented buffer, as well as a view that
satisfies the FragmentRange concept.
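
A minimal sketch of interpreting data that straddles fragment boundaries without linearising it first. The class `fragmented_reader` and its use of `std::vector<std::string>` as the fragment list are hypothetical; this is not the fragmented_temporary_buffer API. The big-endian integer layout matches the CQL native protocol.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical reader over a sequence of fragments; not the actual
// fragmented_temporary_buffer API.
class fragmented_reader {
    const std::vector<std::string>& _fragments;
    std::size_t _frag = 0;   // index of the current fragment
    std::size_t _pos = 0;    // offset inside the current fragment

    uint8_t read_byte() {
        // Skip exhausted (or empty) fragments.
        while (_frag < _fragments.size() && _pos == _fragments[_frag].size()) {
            ++_frag;
            _pos = 0;
        }
        if (_frag == _fragments.size()) {
            throw std::out_of_range("fragmented_reader: buffer underflow");
        }
        return static_cast<uint8_t>(_fragments[_frag][_pos++]);
    }
public:
    explicit fragmented_reader(const std::vector<std::string>& fragments)
        : _fragments(fragments) {}

    // Reads a big-endian 32-bit integer that may straddle fragments.
    uint32_t read_be32() {
        uint32_t v = 0;
        for (int i = 0; i < 4; ++i) {
            v = (v << 8) | read_byte();
        }
        return v;
    }
};
```

The reader never copies the fragments into one contiguous buffer; it only advances a cursor across them.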

There are still situations where a fragmented buffer is linearised. That
includes decompressing client requests (this uses reusable buffers in a
similar way to the code that sends compressed responses), CQL statement
restrictions and values that are hard-coded in prepared statements
(hopefully, the values in those cases will be small), value validation
in some cases (blobs are not validated, irrelevant for many fixed-size
small types, but may be a problem for large text cells) as well as
operations on collections.

Tests: unit(release), dtests(cql_prepared_test.py, cql_tests.py, cql_additional_tests.py)
"

* tag 'fragmented-cql3-receive/v1' of https://github.com/pdziepak/scylla: (23 commits)
  types: bytes_view: override fragmented validate()
  cql3: value_view: switch to fragmented_temporary_buffer::view
  types: add validate that accepts fragmented_temporary_buffer::view
  cql3 query_options: add linearize()
  cql3: query_options: use bytes_ostream for temporaries
  cql3: operation: make make_cell accept fragmented_temporary_buffer::view
  atomic_cell: accept fragmented_temporary_buffer::view values
  cql3: avoid ambiguity in a call to update_parameters::make_cell()
  transport: switch to fragmented_temporary_buffer
  transport: extract compression buffers from response class
  tests/reusable_buffer: test fragmented_temporary_buffer support
  utils: reusable_buffer: support fragmented_temporary_buffer
  tests: add test for fragmented_temporary_buffer
  util fragment_range: add general linearisation functions
  utils: add fragmented_temporary_buffer
  tests: add basic test for transport requests and responses
  tests/random-utils: print seed
  tests/random-utils: generate sstrings
  cql3: add value_view printer and equality comparison
  transport: move response outside of cql_server class
  ...
2018-07-22 19:40:37 +03:00
Avi Kivity
30cddd4531 Merge "Support reading promoted index from SSTables 3.x" from Vladimir and Piotr
"
This patchset adds support for reading Index.db files written in
SSTables 3.x ('mc') format.

Note that the offsets map introduced in SSTables 3.x is neither used nor
read yet. It is located last in the promoted index, so current parsers
just ignore it for the time being.

Later it should be used to perform a binary search for a desired promoted
index block in a large partition, thus reducing the complexity from linear
to logarithmic.
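
A simplified sketch of the planned lookup, assuming the first position of every block is already available in a sorted array. The function `find_block` and the use of plain `int` keys are hypothetical; the real offsets map stores byte offsets of blocks, and a real implementation would decode a block at each probe.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// first_keys[i] is the first position covered by block i, in ascending
// order, and must not be empty. Returns the index of the block that may
// contain `key`: the last block whose first position is <= key, or block 0
// if key precedes all blocks. Hypothetical model of the lookup.
inline std::size_t find_block(const std::vector<int>& first_keys, int key) {
    auto it = std::upper_bound(first_keys.begin(), first_keys.end(), key);
    if (it == first_keys.begin()) {
        return 0;
    }
    return static_cast<std::size_t>(it - first_keys.begin()) - 1;
}
```

std::upper_bound performs O(log n) comparisons on random-access iterators, compared with the O(n) of a linear scan over the blocks.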

Tests: unit {release}
"

* 'projects/sstables-30/index_reader/v5' of https://github.com/argenet/scylla:
  sstables: Add getter for end_open_marker to index_reader.
  tests: Add test reading index for a partition comprised of RT markers of boundary types.
  tests: Add test for reading index of a partition comprised of only range tombstones.
  tests: Use std::adjacent_find in index_reader_assertions::has_monotonic_positions()
  tests: Read rows only index
  sstables: Do not seek through the promoted index for static row positions.
  sstables: Read promoted index stored in SSTables 3.x ('mc') format.
  sstables: Make promoted_index_block support clustering keys for both ka/la and mc formats.
  utils: Add overloaded_functor helper.
  position_in_partition: Add a constructor from range_tag_t{}, bound_kind and clustering_key_prefix.
  sstables: Support reading signed vints in continuous_data_consumer.
  sstables: Factor out the code building a vector of fixed clustering values lengths.
  sstables: Remove unused includes from index_entry.hh
  tests: Add test for reading SSTables 3.x index file with empty promoted index.
  tests: Rename sstable_assertions.hh -> tests/index_reader_assertions.hh
  sstables: Support parsing index entries from SSTables 3.x format.
  sstables: move bound_kind_m to header
2018-07-22 16:15:41 +03:00
Vladimir Krivopalov
df1a151f75 sstables: Minor clean-up and renaming to clustering_ranges_walker.
- Renamed _current to _current_range to better reflect its nature as
  there are other similarly named members (_current_start and
  _current_end).

- Don't use a temporary variable for incrementing the change counter.

Signed-off-by: Vladimir Krivopalov <vladimir@scylladb.com>
2018-07-20 16:34:37 -07:00
Piotr Jastrzebski
01611f2083 sstables: Add test for filtering and forwarding
Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2018-07-20 16:34:37 -07:00
Piotr Jastrzebski
3466dc2368 sstables: Fix schema for static row tests
2018-07-20 16:34:37 -07:00
Piotr Jastrzebski
abf3fc1b98 sstables: Fix ck filtering and fast forwarding
Both were broken before this change.

Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
Signed-off-by: Vladimir Krivopalov <vladimir@scylladb.com>
2018-07-20 16:34:37 -07:00
Piotr Jastrzebski
564bcfa4d0 sstables: Introduce mutation_fragment_filter
This class encapsulates the logic related to
clustering key filtering and fast forwarding.

Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
Signed-off-by: Vladimir Krivopalov <vladimir@scylladb.com>
2018-07-20 16:19:07 -07:00
Vladimir Krivopalov
4d3467d793 sstables: Add getter for end_open_marker to index_reader.
Signed-off-by: Vladimir Krivopalov <vladimir@scylladb.com>
2018-07-20 13:51:13 -07:00
Vladimir Krivopalov
c7285abc9e tests: Add test reading index for a partition comprised of RT markers of boundary types.
Signed-off-by: Vladimir Krivopalov <vladimir@scylladb.com>
2018-07-20 13:51:13 -07:00
Vladimir Krivopalov
91f96d7d2b tests: Add test for reading index of a partition comprised of only range tombstones.
Signed-off-by: Vladimir Krivopalov <vladimir@scylladb.com>
2018-07-20 13:51:13 -07:00
Vladimir Krivopalov
fc051954c2 tests: Use std::adjacent_find in index_reader_assertions::has_monotonic_positions()
Not only is this easier to read and understand, but it also doesn't
force the promoted_index_block class to support copying, which is
heavyweight and otherwise not needed.
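
A minimal sketch of a monotonicity check built on std::adjacent_find over a move-only element type. The `block` struct is a hypothetical stand-in for promoted_index_block, and the ordering rule (each block starts at or after the end of the previous one) is an assumption about what "monotonic positions" means here.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-in for promoted_index_block: move-only, identified by
// start and end positions.
struct block {
    int start;
    int end;
    block(int s, int e) : start(s), end(e) {}
    block(const block&) = delete;
    block(block&&) = default;
};

// True when every block starts at or after the end of the previous one.
// std::adjacent_find hands the predicate references to neighbouring
// elements, so nothing is copied.
inline bool has_monotonic_positions(const std::vector<block>& blocks) {
    auto out_of_order = [](const block& prev, const block& next) {
        return next.start < prev.end;
    };
    return std::adjacent_find(blocks.begin(), blocks.end(), out_of_order) == blocks.end();
}
```

A hand-written loop that remembers the previous element by value would need a copy constructor; the predicate form only ever borrows both neighbours.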

Signed-off-by: Vladimir Krivopalov <vladimir@scylladb.com>
2018-07-20 13:51:13 -07:00
Vladimir Krivopalov
d4e0fa96e3 tests: Read rows only index
Signed-off-by: Vladimir Krivopalov <vladimir@scylladb.com>
2018-07-20 13:51:13 -07:00
Vladimir Krivopalov
5561c713d9 sstables: Do not seek through the promoted index for static row positions.
Signed-off-by: Vladimir Krivopalov <vladimir@scylladb.com>
2018-07-20 13:51:13 -07:00
Vladimir Krivopalov
917528c427 sstables: Read promoted index stored in SSTables 3.x ('mc') format.
Signed-off-by: Vladimir Krivopalov <vladimir@scylladb.com>
2018-07-20 13:51:13 -07:00
Vladimir Krivopalov
86d14f8166 sstables: Make promoted_index_block support clustering keys for both ka/la and mc formats.
This is a pre-requisite for parsing promoted index blocks written in
SSTables 'mc' format.

Signed-off-by: Vladimir Krivopalov <vladimir@scylladb.com>
2018-07-20 13:51:13 -07:00