Commit Graph

17080 Commits

Author SHA1 Message Date
Piotr Sarna
4f5ee3dfcd cql3: add counting dropped rows in filtering pager
Counter for dropped rows is added to the filtering pager.
This metric can be used later to implement applying LIMIT
to filtering queries properly.
Dropped rows are returned on visitor::accept_partition_end.
2018-11-29 14:06:59 +01:00
Avi Kivity
de17150cb2 Update seastar submodule
* seastar 1fbb633...132e6cd (2):
  > scripts: json2code: port to Python 3
  > docker/dev/Dockerfile: add c-ares-devel to docker setup
2018-11-28 19:05:21 +02:00
Duarte Nunes
a589dade07 Merge 'Fix checking for multi-column restrictions in filtering' from Piotr
"
This series fixes #3891 by amending the way restrictions
are checked for filtering. The previous implementation, which returned
false from need_filtering() when multi-column restrictions
were present, was incorrect.
Now, the error is going to be returned from the restrictions filter layer,
and once multi-column support is implemented for filtering, it will
require no further changes.

Tests: unit (release)
"

* 'fix_multi_column_filtering_check_3' of https://github.com/psarna/scylla:
  tests: add multi-column filtering check
  cql3: remove incorrect multi-column check
  cql3: check filtering restrictions only if applicable
  cql3: add pk/ck_restrictions_need_filtering()
2018-11-28 15:36:37 +00:00
Piotr Sarna
ae0ffa6575 tests: add multi-column filtering check
Multi-column restrictions filtering is not supported yet,
so a simple case to ensure that is added.
2018-11-28 13:58:16 +01:00
Piotr Sarna
0013929782 cql3: remove incorrect multi-column check
need_filtering() incorrectly returned false if multi-column restrictions
were present. Instead, these restrictions should be allowed to need
filtering.

Fixes #3891
2018-11-28 13:58:16 +01:00
Piotr Sarna
65f21cc518 cql3: check filtering restrictions only if applicable
Primary key restrictions should be checked only when they need
filtering - otherwise it's superfluous, since they were already
applied on query level.
2018-11-28 13:58:16 +01:00
Piotr Sarna
f59ddcab52 cql3: add pk/ck_restrictions_need_filtering()
These functions return true if partition/clustering key restriction
parts of statement restrictions require filtering.
2018-11-28 13:58:16 +01:00
Duarte Nunes
d09d4bbd91 Merge 'Fix checking if system tables need view updates' from Piotr
"
This miniseries ensures that system tables are not checked
for having view updates, because they never do.
What's more, a distributed system table is used in the process,
so it's unsafe to query the table while streaming it.

Tests: unit (release), dtest(update_cluster_layout_tests.py:TestUpdateClusterLayout.simple_decommission_node_2_test)
"

* 'fix_checking_if_system_tables_need_view_updates_3' of https://github.com/psarna/scylla:
  streaming: don't check view building of system tables
  database: add is_internal_keyspace
  streaming: remove unused sstable_is_staging bool class
2018-11-28 10:00:34 +00:00
Piotr Sarna
8e6021dfa1 streaming: don't check view building of system tables
System tables will never need view building, and, what's more,
are actually used in the process of view build checking.
So, checking whether system tables need a view update path
is simplified to returning 'false'.
2018-11-28 09:21:56 +01:00
Piotr Sarna
1336b9ee31 database: add is_internal_keyspace
Similarly to is_system_keyspace, it will allow checking if a keyspace
is created for internal use.
2018-11-28 09:21:56 +01:00
Piotr Sarna
6ad2c39f88 streaming: remove unused sstable_is_staging bool class
sstable_is_staging bool class is not used anywhere in the code anymore,
so it's removed.
2018-11-28 09:21:56 +01:00
Duarte Nunes
9f639edaa2 Merge 'storage_proxy: fix some bugs in early (due to errors) request completion' from Gleb
"
The series fixed #3565 and #3566
"

* 'gleb/write_failure_fixes' of github.com:scylladb/seastar-dev:
  storage_proxy: store hint for CL=ANY if all nodes replied with failure
  storage_proxy: complete write request early if all replicas replied with success or failure
  storage_proxy: check that write failure response comes from recognized replica
  storage_proxy: move code executed on write timeout into separate function
2018-11-27 21:44:01 +00:00
Takuya ASADA
52f030806f install-dependencies.sh: fix dependency issues on Debian variants
Sync Debian variants dependencies with dist/debian/control.mustache
(before merging relocatable), use scylla 3rdparty packages.

Since we use 3rdparty repo on seastar/install-dependencies.sh, drop repo
setup part from this script.

Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <20181031122800.11802-1-syuu@scylladb.com>
2018-11-27 21:44:01 +00:00
Gleb Natapov
17197fb005 storage_proxy: store hint for CL=ANY if all nodes replied with failure
Current code assumes that request failed if all replicas replied with
failure, but this is not true for CL=ANY requests. Take it into account.

Fixed: #3565
2018-11-27 15:06:37 +02:00
Gleb Natapov
d1d04eae3c storage_proxy: complete write request early if all replicas replied with success or failure
Currently if write request reaches CL and all replicas replied, but some
replied with failures, the request will wait for timeout to be retired.
Detect this case and retire request immediately instead.

Fixes #3566
2018-11-27 14:49:37 +02:00
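The early-retirement condition described in d1d04eae3c can be sketched roughly as follows. This is only an illustration in Python, not the storage_proxy code: the `WriteState` class, its field names, and both helper functions are hypothetical and chosen for readability.

```python
from dataclasses import dataclass


@dataclass
class WriteState:
    """Hypothetical bookkeeping for one in-flight write request."""
    total_replicas: int   # replicas the write was sent to
    required_acks: int    # acknowledgements needed to satisfy the consistency level
    acks: int = 0         # replicas that replied with success
    failures: int = 0     # replicas that replied with failure


def can_retire_now(state: WriteState) -> bool:
    """True once every replica has replied, whether with success or failure.

    At that point no further reply can arrive, so waiting for the
    timeout would only keep the request alive for no reason.
    """
    return state.acks + state.failures >= state.total_replicas


def outcome(state: WriteState) -> str:
    """Classify a fully answered request by comparing acks with the CL."""
    return "success" if state.acks >= state.required_acks else "failure"


# Three replicas, CL satisfied by two acks, one replica failed.
state = WriteState(total_replicas=3, required_acks=2, acks=2, failures=1)
pending = WriteState(total_replicas=3, required_acks=2, acks=2, failures=0)
```

Here `state` can be retired immediately even though one replica failed, while `pending` still has an outstanding reply and has to keep waiting.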
Gleb Natapov
76ab3d716b storage_proxy: check that write failure response comes from recognized replica
Before accounting for a failure response we need to make sure it comes from a
replica that participates in the request.
2018-11-27 14:44:49 +02:00
Rafael Ávila de Espíndola
777ea893e6 Delete data_consume_rows_at_once.
As far as I can tell the old sstable reading code required reading the
data into a contiguous buffer. The function data_consume_rows_at_once
implemented the old behavior, and code was incrementally moved away
from it.

Right now the only use is in two tests. The sstables used in those
tests are already used in other tests with data_consume_rows.

Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
Message-Id: <20181127024319.18732-2-espindola@scylladb.com>
2018-11-27 14:11:50 +02:00
Avi Kivity
1ff6b8fb96 Merge "Don't binary compare compressed sstables in test_write_many_partitions_* tests" from Piotr
"
Compression is not deterministic so instead of binary comparing the sstable files we just read data back
and make sure everything that was written down is still present.

Tests: unit(release)
"

* 'haaawk/binary-compare-of-compressed-sstables/v3' of github.com:scylladb/seastar-dev:
  sstables: Remove compressed parameter from get_write_test_path
  sstables: Remove unused sstable test files
  sstables: Ensure compare_sstables isn't used for compressed files
  sstables: Don't binary compare compressed sstables
  sstables: Remove debug printout from test_write_many_partitions
2018-11-27 14:01:20 +02:00
Duarte Nunes
098dd90bd2 Merge 'Reduce dependencies around consistency_level.hh' from Avi
"
consistency_level.hh is rather heavyweight in both its contents and what it
includes. Reduce the number of inclusion sites and split the file to reduce
dependencies.
"

* tag 'cl-header/v2' of https://github.com/avikivity/scylla:
  consistency_level: simplify validation API
  Split consistency_level.hh header
  database: remove unneeded consistency_level.hh include
  cql: remove unneeded includes of consistency_level.hh
2018-11-27 11:59:34 +00:00
Piotr Jastrzebski
4366302c4c sstables: Extract mp_row_consumer_m::check_schema_mismatch
This method will contain common logic used in multiple places
and reduce code duplication.

Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
Message-Id: <bbda2f4ea4f9325055f096dc549f63b1bb03d3b6.1543311990.git.piotr@scylladb.com>
2018-11-27 12:45:12 +01:00
Avi Kivity
4676e07400 consistency_level: simplify validation API
Remove unused parameters, replace refcounted pointers by references.
2018-11-27 13:41:49 +02:00
Avi Kivity
2c08bff8d5 Split consistency_level.hh header
It has two unrelated users: cql for validation, and storage_proxy for
complicated calculations. Split the simple stuff into a new header to reduce
dependencies.
2018-11-27 13:32:10 +02:00
Avi Kivity
b015f41344 database: remove unneeded consistency_level.hh include
2018-11-27 13:30:56 +02:00
Gleb Natapov
7bc68aa0eb storage_proxy: move code executed on write timeout into separate function
Currently the callback is in a lambda, but we will want to call the code
not only during timer expiration.
2018-11-27 13:23:30 +02:00
Avi Kivity
9201d22c06 cql: remove unneeded includes of consistency_level.hh
Move the includes to .cc to reduce include pollution.
2018-11-27 13:18:33 +02:00
Raphael S. Carvalho
626afa6973 database: conditionally release sstable references from compaction manager
Not all compaction operations submitted through compaction manager set a callback
for releasing references of exhausted sstables in compaction manager itself.
That callback lives in compaction descriptor which is passed to table::compaction().
Let's make the call conditional to avoid bad function call exceptions.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <20181126235616.10452-1-raphaelsc@scylladb.com>
2018-11-27 12:10:43 +01:00
Avi Kivity
2eaeb3e4eb Update swagger-ui submodule
Updates to version 2.2.10 with a local change (from Amnon) to support our location.

Fixes #3942.
2018-11-27 13:01:02 +02:00
Tomasz Grabiec
17a8a9d13d gdb: Properly parse unique_ptr in 'scylla lsa'
There's no _M_t._M_head_impl any more in the standard library.

We now have std_unique_ptr wrapper which abstracts this fact away so
use that.

Message-Id: <20181126174837.11542-1-tgrabiec@scylladb.com>
2018-11-27 12:32:41 +02:00
Tomasz Grabiec
eecda72175 gdb: Adjust 'scylla lsa' for removal of emergency reserve
There's no _emergency_reserve any more. Show _free_segments instead.

Message-Id: <20181126174837.11542-2-tgrabiec@scylladb.com>
2018-11-27 12:32:37 +02:00
Avi Kivity
5e759b0c07 Merge "Optimize checksum computation for the MC sstable format" from Tomek
"
One part of the improvement comes from replacing zlib's CRC32 with the one
from libdeflate, which is optimized for modern architectures and utilizes the
PCLMUL instruction.

perf_checksum test was introduced to measure performance of various
checksumming operations.

Results for 514 B (relevant for writing with compression enabled):

    test                                      iterations      median         mad         min         max
    crc_test.perf_deflate_crc32_combine            58414    16.711us     3.483ns    16.708us    16.725us
    crc_test.perf_adler_combine                165788278     6.059ns     0.031ns     6.027ns     7.519ns
    crc_test.perf_zlib_crc32_combine               59546    16.767us    26.191ns    16.741us    16.801us
    ---
    crc_test.perf_deflate_crc32_checksum        12705072    83.267ns     4.580ns    78.687ns    98.964ns
    crc_test.perf_adler_checksum                 3918014   206.701ns    23.469ns   183.231ns   258.859ns
    crc_test.perf_zlib_crc32_checksum            2329682   428.787ns     0.085ns   428.702ns   510.085ns

Results for 64 KB (relevant for writing with compression disabled):

    test                                      iterations      median         mad         min         max
    crc_test.perf_deflate_crc32_combine            25364    38.393us    17.683ns    38.375us    38.545us
    crc_test.perf_adler_combine                169797143     5.842ns     0.009ns     5.833ns     6.901ns
    crc_test.perf_zlib_crc32_combine               26067    38.663us    95.094ns    38.546us    40.523us
    ---
    crc_test.perf_deflate_crc32_checksum          202821     4.937us    14.426ns     4.912us     5.093us
    crc_test.perf_adler_checksum                   44684    22.733us   206.263ns    22.492us    25.258us
    crc_test.perf_zlib_crc32_checksum              18839    53.049us    36.117ns    53.013us    53.274us

The new CRC32 implementation (deflate_crc32) doesn't provide a fast
checksum_combine() yet, it delegates to zlib so it's as slow as the latter.

Because for CRC32 checksum_combine() is several orders of magnitude slower
than checksum(), we avoid calling checksum_combine() completely for this
checksummer. We still do it for adler32, which has combine() which is faster
than checksum().

SStable write performance was evaluated by running:

  perf_fast_forward --populate --data-directory /tmp/perf-mc \
     --rows=10000000 -c1 -m4G --datasets small-part

Below is a summary of the average frag/s for a memtable flush. Each result is
an average of about 20 flushes with stddev of about 4k.

Before:

 [1] MC,lz4: 330'903
 [2] LA,lz4: 450'157
 [3] MC,checksum: 419'716
 [4] LA,checksum: 459'559

After:

 [1'] MC,lz4: 446'917 ([1] + 35%)
 [2'] LA,lz4: 456'046 ([2] + 1.3%)
 [3'] MC,checksum: 462'894 ([3] + 10%)
 [4'] LA,checksum: 467'508 ([4] + 1.7%)

After this series, the performance of the MC format writer is similar to that
of the LA format before the series.

There seems to be a small but consistent improvement for LA too. I'm not sure
why.
"

* tag 'improve-mc-sstable-checksum-libdeflate-v3' of github.com:tgrabiec/scylla:
  tests: perf: Introduce perf_checksum
  tests: Add test for libdeflate CRC32 implementation
  sstables: compress: Use libdeflate for crc32
  sstables: compress: Rename crc32_utils to zlib_crc32_checksummer
  licenses: Add libdeflate license
  Integrate libdeflate with the build system
  Add libdeflate submodule
  sstables: Avoid checksum_combine() for the crc32 checksummer
  sstables: compress: Avoid unnecessary checksum_combine()
  sstables: checksum_utils: Add missing include
2018-11-26 20:10:46 +02:00
Tomasz Grabiec
f1a35b654a tests: perf: Introduce perf_checksum
2018-11-26 18:59:43 +01:00
Tomasz Grabiec
5b6e3fb5ed tests: Add test for libdeflate CRC32 implementation
2018-11-26 18:59:42 +01:00
Tomasz Grabiec
bf0164cdaf sstables: compress: Use libdeflate for crc32
Improves memtable flush performance by 10% in a CPU-bound case.

Unlike the zlib implementation, libdeflate is optimized for modern
CPUs. It utilizes the PCLMUL instruction.
2018-11-26 18:59:42 +01:00
Tomasz Grabiec
0ac1905f4f sstables: compress: Rename crc32_utils to zlib_crc32_checksummer
2018-11-26 18:59:42 +01:00
Tomasz Grabiec
ba141a4852 licenses: Add libdeflate license
2018-11-26 18:59:41 +01:00
Tomasz Grabiec
048d569b45 Integrate libdeflate with the build system
2018-11-26 18:59:09 +01:00
Tomasz Grabiec
f704f7bc19 Add libdeflate submodule
2018-11-26 18:57:51 +01:00
Tomasz Grabiec
743cf43847 sstables: Avoid checksum_combine() for the crc32 checksummer
checksum_combine() is much slower than re-feeding the buffer to
checksum() for the zlib CRC32 checksummer.

Introduce Checksum::prefer_combine() to determine this and select
more optimal behavior for given checksummer.

Improves performance of memtable flush with compression enabled by 30%.
2018-11-26 18:57:33 +01:00
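The "re-feed the buffer to checksum()" approach from 743cf43847 can be illustrated with a minimal sketch. It uses Python's standard `zlib.crc32`, which accepts a running value as its second argument; this is not Scylla's checksummer interface, and the `checksum_chunks` helper and the sample chunks are made up for the example.

```python
import zlib


def checksum_chunks(chunks):
    """Compute one CRC32 over all chunks by feeding each buffer into
    the running checksum, without ever combining two finished CRCs."""
    crc = 0
    for chunk in chunks:
        # The second argument is the checksum accumulated so far.
        crc = zlib.crc32(chunk, crc)
    return crc


chunks = [b"partition-1", b"partition-2", b"partition-3"]
running = checksum_chunks(chunks)
one_shot = zlib.crc32(b"".join(chunks))
```

Both `running` and `one_shot` hold the same value, so a writer that already has each buffer in hand can keep a full-file checksum up to date without a separate combine step.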
Avi Kivity
b351a9fee7 db/repair_decision.hh: add missing #include
Message-Id: <20181126154948.2453-1-avi@scylladb.com>
2018-11-26 18:49:08 +01:00
Tomasz Grabiec
88cf1c61ba sstables: compress: Avoid unnecessary checksum_combine()
2018-11-26 14:31:38 +01:00
Tomasz Grabiec
8372cf7bcc sstables: checksum_utils: Add missing include
2018-11-26 14:31:38 +01:00
Avi Kivity
c6d700279b class_registry: introduce a non-static variant of class_registry
class_registry's staticness has the usual problem of
static classes (loss of dependency information) and prevents us
from librarifying Scylla since all objects that define a registration
must be linked in.

Take a first step against this staticness by defining a nonstatic
variant. The static class_registry is then redefined in terms of the
nonstatic class. After all uses have been converted, the static
variant can be retired.
Message-Id: <20181126130935.12837-1-avi@scylladb.com>
2018-11-26 13:30:21 +00:00
Paweł Dziepak
62ea153629 Merge "Check for schema mismatch after dropping dead cells" from Piotr
"
Previously we were checking for schema incompatibility between current schema and sstable
serialization header before reading any data. This isn't the best approach because
data in sstable may be already irrelevant due to column drop for example.

This patchset moves the check to after the actual data is read and verified to have
a timestamp new enough to classify it as non-obsolete.

Fixes #3924
"

* 'haaawk/3924/v3' of github.com:scylladb/seastar-dev:
  sstables: Enable test_schema_change for MC format
  sstables3: Throw error on schema mismatch only for live cells
  sstables: Pass column_info to consume_*_column
  sstables: Add schema_mismatch to column_info
  sstables: Store column data type in column_info
  sstables: Remove code duplication in column_translation
2018-11-26 13:10:18 +00:00
Avi Kivity
9a46ee69d4 doc: fix BYPASS CACHE documentation
The BYPASS CACHE documentation mistakenly described an earlier version of the patch.
Correct it to document the committed version.
Message-Id: <20181126125810.9344-1-avi@scylladb.com>
2018-11-26 13:04:52 +00:00
Piotr Jastrzebski
dec48dd1e2 sstables: Remove compressed parameter from get_write_test_path
This parameter is no longer used.

Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2018-11-26 13:46:23 +01:00
Piotr Jastrzebski
92ffccd636 sstables: Remove unused sstable test files
Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2018-11-26 13:35:15 +01:00
Piotr Jastrzebski
a29c9189cb sstables: Ensure compare_sstables isn't used for compressed files
Binary comparing compressed sstables is wrong because compression
is not deterministic.

Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2018-11-26 13:35:15 +01:00
Piotr Jastrzebski
7e263208f0 sstables: Don't binary compare compressed sstables
This family of test_write_many_partitions_* tests writes
sstables down from memtable using different compressions.
Then it compares the resulting file with a blueprint file
and reads the data back to check everything is there.

Compression is not deterministic so this patch makes the
tests not compare resulting compressed sstable file with blueprint
file and instead only read data back.

Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2018-11-26 13:35:03 +01:00
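The idea behind 7e263208f0, comparing what can be read back instead of the compressed bytes, can be sketched as below. Python's `zlib` stands in for the sstable compressors purely as an assumption for the example, and `same_logical_content` and the sample rows are hypothetical.

```python
import zlib


def same_logical_content(blob_a: bytes, blob_b: bytes) -> bool:
    """Compare two compressed blobs by the data they decode to,
    not by their on-disk bytes."""
    return zlib.decompress(blob_a) == zlib.decompress(blob_b)


# Some repetitive sample data standing in for written rows.
rows = b"".join(b"key-%d:value-%d;" % (i, i) for i in range(1000))

# The same input compressed with two different settings. The encoded
# bytes are not guaranteed to match, but the decoded data must.
fast = zlib.compress(rows, 1)
small = zlib.compress(rows, 9)
```

A check such as `same_logical_content(fast, small)` stays valid regardless of how the compressor chose to encode the data, which is the property the reworked tests rely on.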
Piotr Jastrzebski
5c86294a56 sstables: Enable test_schema_change for MC format
Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2018-11-26 13:25:23 +01:00
Piotr Jastrzebski
4bdb86c712 sstables3: Throw error on schema mismatch only for live cells
Previously we were throwing an exception during the creation of
column_translation. This wasn't always correct because sometimes
the column for which the mismatch appeared was already dropped and
the data present in the sstable should be ignored anyway.

Fixes #3924

Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2018-11-26 13:25:10 +01:00