Counter for dropped rows is added to the filtering pager.
This metric can be used later to implement applying LIMIT
to filtering queries properly.
Dropped rows are returned on visitor::accept_partition_end.
"
This series fixes #3891 by amending the way restrictions
are checked for filtering. The previous implementation, which returned
false from need_filtering() when multi-column restrictions
were present, was incorrect.
Now, the error is going to be returned from restrictions filter layer,
and once multi-column support is implemented for filtering, it will
require no further changes.
Tests: unit (release)
"
* 'fix_multi_column_filtering_check_3' of https://github.com/psarna/scylla:
tests: add multi-column filtering check
cql3: remove incorrect multi-column check
cql3: check filtering restrictions only if applicable
cql3: add pk/ck_restrictions_need_filtering()
need_filtering() incorrectly returned false if multi-column restrictions
were present. Instead, these restrictions should be allowed to need
filtering.
Fixes #3891
"
This miniseries ensures that system tables are not checked
for having view updates, because they never do.
What's more, distributed system table is used in the process,
so it's unsafe to query the table while streaming it.
Tests: unit (release), dtest(update_cluster_layout_tests.py:TestUpdateClusterLayout.simple_decommission_node_2_test)
"
* 'fix_checking_if_system_tables_need_view_updates_3' of https://github.com/psarna/scylla:
streaming: don't check view building of system tables
database: add is_internal_keyspace
streaming: remove unused sstable_is_staging bool class
System tables will never need view building, and, what's more,
are actually used in the process of view build checking.
So, checking whether system tables need a view update path
is simplified to returning 'false'.
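A minimal sketch of the short-circuit described above, not the actual patch. The signature of is_internal_keyspace(), the keyspace names listed, and the needs_view_update_path() helper are assumptions made for illustration only:

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <string_view>

// Hypothetical list of internal keyspaces; the real set lives in the database code.
inline bool is_internal_keyspace(std::string_view ks) {
    assert(!ks.empty());
    static constexpr std::array<std::string_view, 3> internal = {
        "system", "system_schema", "system_distributed"};
    return std::find(internal.begin(), internal.end(), ks) != internal.end();
}

// Hypothetical helper: the expensive check (which itself reads system tables)
// is only run for non-internal keyspaces.
template <typename Check>
bool needs_view_update_path(std::string_view ks, Check&& expensive_check) {
    if (is_internal_keyspace(ks)) {
        return false;  // system tables never have views to update
    }
    return expensive_check();
}
```

The point of the early return is that the expensive check is never invoked for a table it would itself need to query.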
"
This series fixes #3565 and #3566
"
* 'gleb/write_failure_fixes' of github.com:scylladb/seastar-dev:
storage_proxy: store hint for CL=ANY if all nodes replied with failure
storage_proxy: complete write request early if all replicas replied with success or failure
storage_proxy: check that write failure response comes from recognized replica
storage_proxy: move code executed on write timeout into separate function
Sync Debian variants dependencies with dist/debian/control.mustache
(before merging relocatable), use scylla 3rdparty packages.
Since we use 3rdparty repo on seastar/install-dependencies.sh, drop repo
setup part from this script.
Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <20181031122800.11802-1-syuu@scylladb.com>
Current code assumes that request failed if all replicas replied with
failure, but this is not true for CL=ANY requests. Take it into account.
Fixed: #3565
Currently if write request reaches CL and all replicas replied, but some
replied with failures, the request will wait for timeout to be retired.
Detect this case and retire request immediately instead.
Fixes #3566
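The condition being detected can be sketched as follows. This is a simplified model with hypothetical names (write_tracker, can_retire_now), not the storage_proxy response handler itself:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical bookkeeping for a single write request.
struct write_tracker {
    std::size_t replicas;       // number of replicas the write was sent to
    std::size_t required_acks;  // successes needed to satisfy the consistency level
    std::size_t acks = 0;
    std::size_t failures = 0;

    void on_success() { assert(!all_replied()); ++acks; }
    void on_failure() { assert(!all_replied()); ++failures; }

    bool cl_reached() const { return acks >= required_acks; }
    bool all_replied() const { return acks + failures == replicas; }

    // Once every replica has replied, with success or failure, no further
    // response can arrive, so there is no reason to wait for the timeout.
    bool can_retire_now() const { return all_replied(); }
};
```

With three replicas and two required acks, two successes reach CL but do not allow retiring; a subsequent failure from the third replica does.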
As far as I can tell the old sstable reading code required reading the
data into a contiguous buffer. The function data_consume_rows_at_once
implemented the old behavior and incrementally code was moved away
from it.
Right now the only use is in two tests. The sstables used in those
tests are already used in other tests with data_consume_rows.
Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
Message-Id: <20181127024319.18732-2-espindola@scylladb.com>
"
Compression is not deterministic so instead of binary comparing the sstable files we just read data back
and make sure everything that was written down is still present.
Tests: unit(release)
"
* 'haaawk/binary-compare-of-compressed-sstables/v3' of github.com:scylladb/seastar-dev:
sstables: Remove compressed parameter from get_write_test_path
sstables: Remove unused sstable test files
sstables: Ensure compare_sstables isn't used for compressed files
sstables: Don't binary compare compressed sstables
sstables: Remove debug printout from test_write_many_partitions
"
consistency_level.hh is rather heavyweight in both its contents and what it
includes. Reduce the number of inclusion sites and split the file to reduce
dependencies.
"
* tag 'cl-header/v2' of https://github.com/avikivity/scylla:
consistency_level: simplify validation API
Split consistency_level.hh header
database: remove unneeded consistency_level.hh include
cql: remove unneeded includes of consistency_level.hh
It has two unrelated users: cql for validation, and storage_proxy for
complicated calculations. Split the simple stuff into a new header to reduce
dependencies.
Not all compaction operations submitted through compaction manager set a callback
for releasing references of exhausted sstables in compaction manager itself.
That callback lives in compaction descriptor which is passed to table::compaction().
Let's make the call conditional to avoid bad function call exceptions.
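A sketch of the guard, assuming the callback is stored as a std::function. The descriptor layout, the member name release_exhausted, and the helper notify_exhausted are hypothetical stand-ins, not the real types:

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Hypothetical, reduced descriptor: only the optional callback is modelled.
struct compaction_descriptor {
    std::function<void(const std::vector<std::string>&)> release_exhausted;
};

// Calls the callback only if one was set. Invoking an empty std::function
// would throw std::bad_function_call. Returns whether the callback ran.
inline bool notify_exhausted(const compaction_descriptor& desc,
                             const std::vector<std::string>& exhausted) {
    assert(!exhausted.empty());
    if (!desc.release_exhausted) {
        return false;
    }
    desc.release_exhausted(exhausted);
    return true;
}
```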
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <20181126235616.10452-1-raphaelsc@scylladb.com>
There's no _M_t._M_head_impl any more in the standard library.
We now have std_unique_ptr wrapper which abstracts this fact away so
use that.
Message-Id: <20181126174837.11542-1-tgrabiec@scylladb.com>
"
One part of the improvement comes from replacing zlib's CRC32 with the one
from libdeflate, which is optimized for modern architecture and utilizes the
PCLMUL instruction.
perf_checksum test was introduced to measure performance of various
checksumming operations.
Results for 514 B (relevant for writing with compression enabled):
test                                  iterations      median         mad         min         max
crc_test.perf_deflate_crc32_combine        58414    16.711us     3.483ns    16.708us    16.725us
crc_test.perf_adler_combine            165788278     6.059ns     0.031ns     6.027ns     7.519ns
crc_test.perf_zlib_crc32_combine           59546    16.767us    26.191ns    16.741us    16.801us
---
crc_test.perf_deflate_crc32_checksum    12705072    83.267ns     4.580ns    78.687ns    98.964ns
crc_test.perf_adler_checksum             3918014   206.701ns    23.469ns   183.231ns   258.859ns
crc_test.perf_zlib_crc32_checksum        2329682   428.787ns     0.085ns   428.702ns   510.085ns
Results for 64 KB (relevant for writing with compression disabled):
test                                  iterations      median         mad         min         max
crc_test.perf_deflate_crc32_combine        25364    38.393us    17.683ns    38.375us    38.545us
crc_test.perf_adler_combine            169797143     5.842ns     0.009ns     5.833ns     6.901ns
crc_test.perf_zlib_crc32_combine           26067    38.663us    95.094ns    38.546us    40.523us
---
crc_test.perf_deflate_crc32_checksum      202821     4.937us    14.426ns     4.912us     5.093us
crc_test.perf_adler_checksum               44684    22.733us   206.263ns    22.492us    25.258us
crc_test.perf_zlib_crc32_checksum          18839    53.049us    36.117ns    53.013us    53.274us
The new CRC32 implementation (deflate_crc32) doesn't provide a fast
checksum_combine() yet, it delegates to zlib so it's as slow as the latter.
Because for CRC32 checksum_combine() is several orders of magnitude slower
than checksum(), we avoid calling checksum_combine() completely for this
checksummer. We still do it for adler32, which has combine() which is faster
than checksum().
SStable write performance was evaluated by running:
perf_fast_forward --populate --data-directory /tmp/perf-mc \
--rows=10000000 -c1 -m4G --datasets small-part
Below is a summary of the average frag/s for a memtable flush. Each result is
an average of about 20 flushes with stddev of about 4k.
Before:
[1] MC,lz4: 330'903
[2] LA,lz4: 450'157
[3] MC,checksum: 419'716
[4] LA,checksum: 459'559
After:
[1'] MC,lz4: 446'917 ([1] + 35%)
[2'] LA,lz4: 456'046 ([2] + 1.3%)
[3'] MC,checksum: 462'894 ([3] + 10%)
[4'] LA,checksum: 467'508 ([4] + 1.7%)
After this series, the performance of the MC format writer is similar to that
of the LA format before the series.
There seems to be a small but consistent improvement for LA too. I'm not sure
why.
"
* tag 'improve-mc-sstable-checksum-libdeflate-v3' of github.com:tgrabiec/scylla:
tests: perf: Introduce perf_checksum
tests: Add test for libdeflate CRC32 implementation
sstables: compress: Use libdeflate for crc32
sstables: compress: Rename crc32_utils to zlib_crc32_checksummer
licenses: Add libdeflate license
Integrate libdeflate with the build system
Add libdeflate submodule
sstables: Avoid checksum_combine() for the crc32 checksummer
sstables: compress: Avoid unnecessary checksum_combine()
sstables: checksum_utils: Add missing include
Improves memtable flush performance by 10% in a CPU-bound case.
Unlike the zlib implementation, libdeflate is optimized for modern
CPUs. It utilizes the PCLMUL instruction.
checksum_combine() is much slower than re-feeding the buffer to
checksum() for the zlib CRC32 checksummer.
Introduce Checksum::prefer_combine() to determine this and select
more optimal behavior for given checksummer.
Improves performance of memtable flush with compression enabled by 30%.
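The selection can be sketched as a compile-time dispatch on prefer_combine(). The checksummer below is a toy byte-sum, not CRC32 or Adler-32, and toy_checksummer and extend_full_checksum are hypothetical names. Only prefer_combine(), checksum() and checksum_combine() are taken from the description above, and their exact signatures are assumed:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Toy checksum (sum of bytes modulo 2^32) standing in for a real checksummer.
template <bool PreferCombine>
struct toy_checksummer {
    static constexpr bool prefer_combine() { return PreferCombine; }
    static std::uint32_t init_checksum() { return 0; }
    static std::uint32_t checksum(std::uint32_t init, const char* buf, std::size_t len) {
        assert(buf != nullptr || len == 0);
        for (std::size_t i = 0; i < len; ++i) {
            init += static_cast<unsigned char>(buf[i]);
        }
        return init;
    }
    static std::uint32_t checksum_combine(std::uint32_t first, std::uint32_t second,
                                          std::size_t /*len_second*/) {
        return first + second;
    }
};

// Extends the running full-file checksum with a chunk whose own checksum is
// already known: combine the two checksums if that is cheaper for this
// checksummer, otherwise re-feed the chunk bytes.
template <typename Checksum>
std::uint32_t extend_full_checksum(std::uint32_t full, std::uint32_t chunk_checksum,
                                   const char* chunk, std::size_t len) {
    if constexpr (Checksum::prefer_combine()) {
        return Checksum::checksum_combine(full, chunk_checksum, len);
    } else {
        return Checksum::checksum(full, chunk, len);
    }
}
```

Both branches must produce the same value; the flag only picks the cheaper way to get it.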
class_registry's staticness has the usual problem of
static classes (loss of dependency information) and prevents us
from librarifying Scylla since all objects that define a registration
must be linked in.
Take a first step against this staticness by defining a nonstatic
variant. The static class_registry is then redefined in terms of the
nonstatic class. After all uses have been converted, the static
variant can be retired.
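The layering can be sketched roughly as below. This is an illustration under assumptions: the real class_registry has a different interface, and the member names register_class() and create() here are hypothetical:

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>

// Nonstatic variant: an ordinary object that can be owned and passed around.
template <typename Base>
class nonstatic_class_registry {
    std::map<std::string, std::function<std::unique_ptr<Base>()>> _creators;
public:
    void register_class(std::string name, std::function<std::unique_ptr<Base>()> creator) {
        assert(creator);  // an empty creator could never be invoked
        _creators.emplace(std::move(name), std::move(creator));
    }
    std::unique_ptr<Base> create(const std::string& name) const {
        auto it = _creators.find(name);
        if (it == _creators.end()) {
            throw std::out_of_range("unknown class: " + name);
        }
        return it->second();
    }
};

// Static variant, kept for existing users and defined in terms of the
// nonstatic one, so it can be retired once all callers are converted.
template <typename Base>
class class_registry {
    static nonstatic_class_registry<Base>& registry() {
        static nonstatic_class_registry<Base> instance;
        return instance;
    }
public:
    static void register_class(std::string name, std::function<std::unique_ptr<Base>()> creator) {
        registry().register_class(std::move(name), std::move(creator));
    }
    static std::unique_ptr<Base> create(const std::string& name) {
        return registry().create(name);
    }
};
```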
Message-Id: <20181126130935.12837-1-avi@scylladb.com>
"
Previously we checked for schema incompatibility between the current schema and the sstable
serialization header before reading any data. This isn't the best approach because
data in the sstable may already be irrelevant, for example due to a column drop.
This patchset moves the check to after the data is read and verified to have
a timestamp new enough to classify it as non-obsolete.
Fixes #3924
"
* 'haaawk/3924/v3' of github.com:scylladb/seastar-dev:
sstables: Enable test_schema_change for MC format
sstables3: Throw error on schema mismatch only for live cells
sstables: Pass column_info to consume_*_column
sstables: Add schema_mismatch to column_info
sstables: Store column data type in column_info
sstables: Remove code duplication in column_translation
BYPASS CACHE was mistakenly documenting an earlier version of the patch.
Correct it to document the committed version.
Message-Id: <20181126125810.9344-1-avi@scylladb.com>
This family of test_write_many_partitions_* tests writes
sstables down from memtable using different compressions.
Then it compares the resulting file with a blueprint file
and reads the data back to check everything is there.
Compression is not deterministic so this patch makes the
tests not compare resulting compressed sstable file with blueprint
file and instead only read data back.
Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
Previously we were throwing exception during the creation of
column_translation. This wasn't always correct because sometimes
column for which the mismatch appeared was already dropped and
data present in sstable should be ignored anyway.
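A rough sketch of the deferred check. The column_info fields shown (in particular dropped_at) and the check_cell() helper are assumptions for illustration; only the idea of recording schema_mismatch in column_info comes from this series:

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <stdexcept>

// Hypothetical, reduced column_info.
struct column_info {
    bool schema_mismatch = false;            // sstable header type differs from current schema
    std::optional<std::int64_t> dropped_at;  // drop timestamp, if the column was dropped
};

// Returns true if the cell should be consumed and false if it is obsolete and
// must be skipped. The mismatch is reported only for cells that are still live.
inline bool check_cell(const column_info& col, std::int64_t cell_timestamp) {
    const bool obsolete = col.dropped_at && cell_timestamp <= *col.dropped_at;
    if (obsolete) {
        return false;  // data of a dropped column is ignored, mismatch or not
    }
    if (col.schema_mismatch) {
        throw std::runtime_error("column type in sstable does not match schema");
    }
    return true;
}
```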
Fixes #3924
Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>