Commit Graph

17921 Commits

Author SHA1 Message Date
Gleb Natapov
ecc5230de5 storage_proxy: remove old get_restricted_ranges() interface
It is not used any more.
2019-02-11 14:45:43 +02:00
Gleb Natapov
0cd9bbb71d cql3/statements/select_statement: convert index query interface to new query_ranges_to_vnodes_generator interface 2019-02-11 14:45:43 +02:00
Gleb Natapov
e6208b1cde tests: convert storage_proxy test to new query_ranges_to_vnodes_generator interface 2019-02-11 14:45:43 +02:00
Gleb Natapov
2735a85c8e storage_proxy: convert range query path to new query_ranges_to_vnodes_generator interface 2019-02-11 14:45:43 +02:00
Gleb Natapov
692a0bd000 storage_proxy: introduce new query_ranges_to_vnode_generator interface
get_restricted_ranges() takes the key ranges provided by a query
and splits them on vnode boundaries. It iterates over all ranges and
calculates all vnodes, but its users are usually interested in only
one vnode, since that will most likely be enough to populate a page. If
it is not enough, they will ask for more. This patch introduces a new
interface that replaces the function and generates vnode ranges
on demand instead of precalculating all of them.
2019-02-11 14:45:43 +02:00
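
The on-demand idea described in 692a0bd000 can be illustrated with a small Python generator. This is only a sketch under stated assumptions, not the storage_proxy code: tokens are plain integers, ranges are half-open (start, end) pairs, and the name split_on_vnodes is hypothetical.

```python
from bisect import bisect_right

def split_on_vnodes(ranges, boundaries):
    """Lazily yield sub-ranges of `ranges`, cut at each vnode boundary.

    `ranges` is a list of half-open (start, end) token pairs and
    `boundaries` is a sorted list of vnode boundary tokens (both are
    simplifying assumptions made for this sketch).
    """
    for start, end in ranges:
        cur = start
        # Index of the first boundary strictly greater than the range start.
        i = bisect_right(boundaries, cur)
        while i < len(boundaries) and boundaries[i] < end:
            yield (cur, boundaries[i])
            cur = boundaries[i]
            i += 1
        yield (cur, end)

pages = split_on_vnodes([(0, 100)], [10, 50, 90])
# Only the first vnode range is computed here; the rest is produced
# only if the caller asks for more.
print(next(pages))  # (0, 10)
```

A caller that fills a page from the first vnode never pays for computing the remaining ranges.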
Tomasz Grabiec
7184289015 Merge "Various fixes and improvements for sstables statistics" from Paweł
This series contains several fixes and improvements as well as new tests
for sstable code dealing with statistics.

 * https://github.com/pdziepak/scylla.git sstable-stats-fixes/v1-rebased:
  sstables: compaction: don't access moved-from vector of sstables
  memtable: move encoding_stats_collector implementation out of header
  sstables: seal_statistics(): pass encoding_stats by constant reference
  sstables/mc/writer: don't assume all schema columns are present
  tests/sstable3: improvements to file compare
  tests: extract mutation data model
  tests/data_model: add support for expiring atomic cells
  tests/data_model: allow specifying timestamp for row markers
  tests/memtable: test column tracking for encoding stats
  sstables: use correct source of statistics in
    get_encoding_stats_for_compaction()
  utils/extremum_tracking: preserve "not-set" status on merge
  sstables/metadata_collector: move the default values to the global
    tracker
  tests/sstables: test for reading serialisation header
  tests/sstables: pass encoding stats to write_components()
  tests/sstable: test merging encoding_stats

Fixes #4202.
2019-02-07 12:35:29 +01:00
Paweł Dziepak
67252de195 tests/sstable: test merging encoding_stats 2019-02-07 10:17:06 +00:00
Paweł Dziepak
e25603fbf7 tests/sstables: pass encoding stats to write_components()
By default write_components() uses a safe default for encoding_stats
which indicates that all columns are present. This may hide some bugs, so
let's pass the real thing in the tests where this may matter.
2019-02-07 10:17:06 +00:00
Paweł Dziepak
d44d5ebf86 tests/sstables: test for reading serialisation header 2019-02-07 10:17:06 +00:00
Paweł Dziepak
ebf667fb9c sstables/metadata_collector: move the default values to the global tracker
column_stats is a per-partition tracker, while metadata_collector is the
global one. The statistics gathered by column_stats are merged into the
metadata_collector. In order to ensure that we get proper default values
in case no value of particular kind (e.g. no TTLs) was seen they need to
be set on the global tracker, not the per-partition one.
2019-02-07 10:16:50 +00:00
Paweł Dziepak
2680022df0 utils/extremum_tracking: preserve "not-set" status on merge
extremum_tracker allows choosing a default value that's going to be used
only if no "real" values were provided. Since it is never compared with
the actual input values it can be anything. For instance, if the minimum
tracker default value is 0 and there was one update with the value 1 the
detected minimum is going to be 1 (the default is ignored).

However, this doesn't work when the trackers are merged since that
process always leaves the destination tracker in the "set" state
regardless of whether any of the merged trackers has ever seen any value.

This is fixed by this patch, by properly preserving _is_set state on
merge.
2019-02-07 10:16:50 +00:00
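
The merge behaviour described in 2680022df0 can be sketched in Python as follows. The class MinTracker and its method names are hypothetical stand-ins for illustration; the real extremum_tracker is C++ and may differ in detail.

```python
class MinTracker:
    """Tracks a minimum, falling back to a default if nothing was seen."""

    def __init__(self, default):
        self._default = default
        self._value = None
        self._is_set = False

    def update(self, value):
        # The default is never compared with real input values.
        if not self._is_set or value < self._value:
            self._value = value
            self._is_set = True

    def merge(self, other):
        # Only take the other tracker's value if it has actually seen one,
        # so merging two empty trackers leaves this one "not set".
        if other._is_set:
            self.update(other._value)

    def get(self):
        return self._value if self._is_set else self._default
```

With this shape, merging an empty tracker into another empty tracker and then updating with 1 yields a minimum of 1 rather than the default of 0.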
Paweł Dziepak
84d8ee35d4 sstables: use correct source of statistics in get_encoding_stats_for_compaction()
The sstable class is responsible for many more things than it should be. In
particular, it takes care of both writing and reading sstables. The
problem that it causes is that it is very easy to confuse those two.

This is what has happened in get_encoding_stats_for_compaction().
Originally, it was using _c_stats as a source of the statistics, which
is used only during the write and per-partition. Needless to say, the
returned encoding_stats were bogus.

The correct source of those statistics is get_stats_metadata().
2019-02-07 10:16:50 +00:00
Paweł Dziepak
e315448d0a tests/memtable: test column tracking for encoding stats 2019-02-07 10:16:50 +00:00
Paweł Dziepak
591d5195a9 tests/data_model: allow specifying timestamp for row markers 2019-02-07 10:16:50 +00:00
Paweł Dziepak
b07cba6a89 tests/data_model: add support for expiring atomic cells 2019-02-07 10:16:50 +00:00
Paweł Dziepak
aab0b7360f tests: extract mutation data model 2019-02-07 10:16:50 +00:00
Paweł Dziepak
fa216be260 tests/sstable3: improvements to file compare
This patch introduces some improvements to file comparison:
 - exception flags are set so that any error triggers an exception, which
   guarantees that errors are not silently ignored
 - std::ios_base::binary flag is passed to open()
 - istreambuf_iterator is used instead of istream_iterator. It is better
   suited for comparing binary data.
2019-02-07 10:16:50 +00:00
Paweł Dziepak
bc61471132 sstables/mc/writer: don't assume all schema columns are present
The writer constructor prepares lists of present static and regular
columns, those should be used for any further checks.
2019-02-07 10:16:50 +00:00
Paweł Dziepak
0132bcc035 sstables: seal_statistics(): pass encoding_stats by constant reference 2019-02-07 10:16:50 +00:00
Paweł Dziepak
341f186933 memtable: move encoding_stats_collector implementation out of header 2019-02-07 10:16:50 +00:00
Paweł Dziepak
6d5c1a9813 sstables: compaction: don't access moved-from vector of sstables 2019-02-07 10:16:50 +00:00
Paweł Dziepak
a8a45a243b tests/cql_test_env: don't override tmpdir::path
The interface tmpdir::path isn't properly encapsulated and its users can
modify the path even though they really shouldn't. This can happen
accidentally, in cql_test_env a reference to tmpdir::path was created
and later assigned to in one of the code paths. This caused tmpdir
destructor to remove wrong directory at program exit.

This patch solves the problem by avoiding referencing tmpdir::path, a
copy is perfectly acceptable considering that this is tests-only code.

Message-Id: <20190206173046.26801-1-pdziepak@scylladb.com>
2019-02-06 20:55:40 +02:00
Takuya ASADA
96b1cb97ba dist/ami: don't cleanup build dir
rm -rf build/* was meant to start rpm building from a clean state, but it also
deletes the built scylla binaries, so it was not a good idea.

Instead of rm -rf build/*, we can check for file existence in the cloned
directory, and if it looks good we can reuse it.
We also need to run git pull on each package repo since it may not
include the latest commit.

Fixes #4189

Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <20190206101755.2056-1-syuu@scylladb.com>
2019-02-06 15:33:09 +02:00
Nadav Har'El
3e7dc7230d build_deb.sh: fix error message
The error message was apparently copied from the RPM script. Fix it.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20190205162148.20698-1-nyh@scylladb.com>
2019-02-05 18:22:36 +02:00
Avi Kivity
54748ad15b Merge "Allow non-key IN restrictions" from Piotr
"
Fixes #4193
Fixes #3795

This series enables handling IN restrictions for regular columns,
which is needed by both filtering and indexing mechanisms.

Tests: unit (release)
"

* 'allow_non_key_in_restrictions' of https://github.com/psarna/scylla:
  tests: add filtering with IN restriction test
  cql3: remove unused can_have_only_one_value function
  cql3: allow non-key IN restrictions
2019-02-05 17:30:35 +02:00
Piotr Sarna
45db5da51b tests: add filtering with IN restriction test
Test case for filtering regular columns with IN restriction is added.
2019-02-05 16:04:17 +01:00
Piotr Sarna
36609d1376 cql3: remove unused can_have_only_one_value function 2019-02-05 16:04:17 +01:00
Piotr Sarna
c178ed8b16 cql3: allow non-key IN restrictions
Restricting a regular column with IN restriction is a perfectly
valid case for filtering and indexing, so it should be allowed.

Fixes #4193
Fixes #3795
2019-02-05 15:50:17 +01:00
Rafael Ávila de Espíndola
84542dadfa sstables: delete_atomically: don't drop futures
We still allow the delete of rows from system.large_partition to run
in parallel with the sstable deletion, but now we return a future that
waits for both.

Tests: unit (release)

Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
Message-Id: <20190205001526.68774-1-espindola@scylladb.com>
2019-02-05 16:47:58 +02:00
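
The pattern in 84542dadfa (run two operations in parallel, but hand back a single future that waits for both) can be sketched with Python's asyncio. The coroutine names below are hypothetical placeholders, and asyncio.gather stands in for whatever Seastar primitive the actual patch uses.

```python
import asyncio

async def delete_sstable_files(log):
    await asyncio.sleep(0)  # placeholder for the real file deletion
    log.append("files deleted")

async def delete_large_partition_rows(log):
    await asyncio.sleep(0)  # placeholder for the system table cleanup
    log.append("rows deleted")

async def delete_atomically(log):
    # gather() runs both coroutines concurrently and completes only when
    # both have finished, so neither result is silently dropped.
    await asyncio.gather(
        delete_sstable_files(log),
        delete_large_partition_rows(log),
    )

log = []
asyncio.run(delete_atomically(log))
```

The caller awaits one object, and any exception raised by either operation propagates to it.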
Calle Wilund
ba6a8ef35b tls: Use a default prio string disabling TLS1.0 forcing min 128bits
Fixes #4010

Unless the user sets this explicitly, we should try to explicitly avoid
deprecated protocol versions. While gnutls should do this for
connections initiated thusly, clients such as drivers etc might
use obsolete versions.

Message-Id: <20190107131513.30197-1-calle@scylladb.com>
2019-02-05 15:34:18 +02:00
Avi Kivity
6c71eae63f Merge "API: Stream compaction history records" from Amnon
"
get_compaction_history can return a lot of records which will add up to a
big http reply.

This series makes sure it will not create large allocations when
returning the results.

It adds an api to the query_processor to use paged queries with a
consumer function that returns a future, this way we can use the http
stream after each record.

This implementation will prevent large allocations and stalls.

Fixes #4152
"

* 'amnon/compaction_history_stream_v7' of github.com:scylladb/seastar-dev:
  tests/query_processor_test: add query_with_consumer_test
  system_keyspace, api: stream get_compaction_history
  query_processor: query and for_each_cql_result with future
2019-02-05 14:16:36 +02:00
Avi Kivity
ebf179318c Merge "SI: Add virtual columns to underlying MV" from Duarte
"
Virtual columns are MV-specific columns that contribute to the
liveness of view rows. However, we were not adding those columns when
creating an index's underlying MV, causing indexes to miss base rows.

Fixes #4144
Branches: master, branch-3.0
"

Reviewed-by: Nadav Har'El <nyh@scylladb.com>

* 'sec-index/virtual-columns/v1' of https://github.com/duarten/scylla:
  tests/secondary_index_test: Add reproducer for #4144
  index/secondary_index_manager: Add virtual columns to MV
2019-02-05 13:26:45 +02:00
Avi Kivity
367ef8d318 Merge "provide our own, relocatable, python3 interpreter" from Glauber
"

We would like to deploy Scylla in constrained environments where
internet access is not permitted. In those environments it is not
possible to acquire the dependencies of Scylla from external repos and
the packages have to be sent alongside with its dependencies.

In older distributions, like CentOS7 there isn't a python3 interpreter
available. And while we can package one from EPEL this tends to break in
practice when installing the software in older patchlevels (for
instance, installing into RHEL7.3 when the latest is RHEL7.5).

The reason for that, as we saw in practice, is that EPEL may
not respect RHEL patchlevels and have the python interpreter depending
on newer versions of some system libraries.

virtualenv can be used to create isolated python environments, but it is
not designed for full isolation and I hit at least two roadblocks in
practice:

1) It doesn't copy the files, linking some instead. There is an
   --always-copy option but it is broken (for years) in some
   distributions.
2) Even when the above works, it still doesn't copy some files, relying
   on the system files instead (one sad example was the subprocess
   module that was just kept in the system and not moved to the
   virtualenv)

This patch solves that problem by creating a python3 environment in a
directory with the modules that Scylla uses, and nothing else. It is
essentially doing what virtualenv should do but doesn't. Once this
environment is assembled the binaries are then made relocatable the same
way the Scylla binary is.

One difference (for now) between the Scylla binary relocation process
and ours is that we steer away from LD_LIBRARY_PATH: the environment
variable is inherited by any child process stemming from the caller,
which means that we are unable to use the subprocess module to call
system binaries like mkfs (which our scripts do a lot). Instead, we rely
on RUNPATH to tell the binary where to search for its libraries.

Once we generate an archive with the python3 interpreter, we then
package it as an rpm with barely any dependencies. The dependencies listed
are:

$ rpm -qpR scylla-relocatable-python3-3.6.7-1.el7.x86_64.rpm
rpmlib(CompressedFileNames) <= 3.0.4-1
rpmlib(FileDigests) <= 4.6.0-1
rpmlib(PartialHardlinkSets) <= 4.0.4-1
rpmlib(PayloadFilesHavePrefix) <= 4.0-1
rpmlib(PayloadIsXz) <= 5.2-1

And the total size of that rpm, with all modules scylla needs is 20MB.

The Scylla rpm now has a much more modest dependency list:

$ rpm -qpR scylla-server-666.development-0.20190121.80b7c7953.el7.x86_64.rpm | sort | uniq
/bin/sh
curl
file
hwloc
kernel >= 3.10.0-514
mdadm
pciutils
rpmlib(CompressedFileNames) <= 3.0.4-1
rpmlib(FileDigests) <= 4.6.0-1
rpmlib(PayloadFilesHavePrefix) <= 4.0-1
rpmlib(PayloadIsXz) <= 5.2-1
scylla-conf
scylla-relocatable-python3 <== our python3 package.
systemd-libs
util-linux
xfsprogs

I have tested this end to end by generating RPMs from our master branch,
then installing them in a clean CentOS7.3 installation without even
using yum, just rpm -Uhv <package_list>

Then I called scylla_setup to make sure all python scripts were working
and started Scylla successfully.
"

* 'scylla-python3-v5' of github.com:glommer/scylla:
  Create a relocatable python3 interpreter
  spec file: fix python3 dependency list.
  fixup scripts before installing them to their final location
  automatically relocate python scripts
  make scyllatop relocatable
  use relative paths for installing scylla and iotune binaries
2019-02-05 12:53:34 +02:00
Amnon Heiman
c96c3ce9e8 tests/query_processor_test: add query_with_consumer_test
This patch adds a unit test for querying with a consumer function.

Querying with a consumer uses paging. The test covers the scenarios where
the number of rows is below and above the page size; it also tests the
option to stop in the middle of reading.

Signed-off-by: Amnon Heiman <amnon@scylladb.com>
2019-02-05 12:35:53 +02:00
Amnon Heiman
6c7742d616 system_keyspace, api: stream get_compaction_history
get_compaction_history can return a big chunk of data.

To prevent large memory allocations, get_compaction_history now reads
each compaction_history record and uses the http stream to send it.

Fixes #4152

Signed-off-by: Amnon Heiman <amnon@scylladb.com>
2019-02-05 11:14:53 +02:00
Amnon Heiman
c0e3b7673d query_processor: query and for_each_cql_result with future
query and for_each_cql_result accept a function that reads a row and
returns a stop_iterator.

This implementation of those functions gets a function that returns a
future stop_iterator allowing preemption between calls.

Signed-off-by: Amnon Heiman <amnon@scylladb.com>
2019-02-05 11:14:53 +02:00
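
The consumer-with-future idea from c0e3b7673d can be sketched in Python with asyncio. Everything here is an assumption made for illustration: fetch_page is a hypothetical stand-in for a paged query, and the consumer returns True to stop where the real code returns a stop_iterator wrapped in a future.

```python
import asyncio

async def fetch_page(rows, offset, page_size):
    # Hypothetical stand-in for fetching one page of query results.
    await asyncio.sleep(0)
    return rows[offset:offset + page_size]

async def query_with_consumer(rows, page_size, consumer):
    """Feed rows page by page to an async consumer until it asks to stop."""
    offset = 0
    while True:
        page = await fetch_page(rows, offset, page_size)
        if not page:
            return
        for row in page:
            # Awaiting the consumer lets other tasks run between rows,
            # e.g. while the row is being written to an HTTP stream.
            if await consumer(row):
                return
        offset += len(page)
```

Because only one page is held at a time, the memory used does not grow with the total number of rows.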
Glauber Costa
afed2cddae Create a relocatable python3 interpreter
We would like to deploy Scylla in constrained environments where
internet access is not permitted. In those environments it is not
possible to acquire the dependencies of Scylla from external repos and
the packages have to be sent alongside with its dependencies.

In older distributions, like CentOS7 there isn't a python3 interpreter
available. And while we can package one from EPEL this tends to break in
practice when installing the software in older patchlevels (for
instance, installing into RHEL7.3 when the latest is RHEL7.5).

The reason for that, as we saw in practice, is that EPEL may
not respect RHEL patchlevels and have the python interpreter depending
on newer versions of some system libraries.

virtualenv can be used to create isolated python environments, but it is
not designed for full isolation and I hit at least two roadblocks in
practice:

1) It doesn't copy the files, linking some instead. There is an
  --always-copy option but it is broken (for years) in some
  distributions.
2) Even when the above works, it still doesn't copy some files, relying
   on the system files instead (one sad example was the subprocess
   module that was just kept in the system and not moved to the
   virtualenv)

This patch solves that problem by creating a python3 environment in a
directory with the modules that Scylla uses, and nothing else. It is
essentially doing what virtualenv should do but doesn't. Once this
environment is assembled the binaries are then made relocatable the same
way the Scylla binary is.

One difference (for now) between the Scylla binary relocation process
and ours is that we steer away from LD_LIBRARY_PATH: the environment
variable is inherited by any child process stemming from the caller,
which means that we are unable to use the subprocess module to call
system binaries like mkfs (which our scripts do a lot). Instead, we rely
on RUNPATH to tell the binary where to search for its libraries.

In terms of the python interpreter, PYTHONPATH does not need to be set
for this to work as the python interpreter will include the lib
directory in its PYTHONPATH. To confirm this, we executed the following
code:

    bin/python3 -c "import sys; print('\n'.join(sys.path))"

with the interpreter unpacked to  both /home/centos/glaubertmp/test/ and
/tmp. It yields respectively:

    /home/centos/glaubertmp/test/lib64/python36.zip
    /home/centos/glaubertmp/test/lib64/python3.6
    /home/centos/glaubertmp/test/lib64/python3.6/lib-dynload
    /home/centos/glaubertmp/test/lib64/python3.6/site-packages

and

    /tmp/python/lib64/python36.zip
    /tmp/python/lib64/python3.6
    /tmp/python/lib64/python3.6/lib-dynload
    /tmp/python/lib64/python3.6/site-packages

This was tested by moving the .tar.gz generated on my Fedora28 laptop to
a CentOS machine without python3 installed. I could then invoke
./scylla_python_env/python3 and use the interpreter to call 'ls' through
the subprocess module.

I have also tested that we can successfully import all the modules we listed
for installation and that we can read a sample yaml file (since PyYAML depends
on the system's libyaml, we know that this works)

Time to build:
real	0m15.935s
user	0m15.198s
sys	0m0.382s

Final archive size (uncompressed): 81MB
Final archive size (compressed)  : 25MB

Signed-off-by: Glauber Costa <glauber@scylladb.com>
--
v3:
- rewrite in python3
- do not use temporary directories, add directly to the archive. Only the python binary
  has to be materialized
- Use --cacheonly for repoquery, and also repoquery --list in a second step to grab the file list
v2:
- do not use yum, resolve dependencies from installed packages instead
- move to scripts as Avi wants this not only for old offline CentOS
2019-02-04 18:02:40 -05:00
Glauber Costa
f757b42ba7 spec file: fix python3 dependency list.
The dependency list as it was did not reflect the fact that scyllatop is
now written in python3.

Some packages, like urwid, should use the python3 version. CentOS
doesn't really have an urwid package for python3, not even in EPEL. So
this officially marks the point in which we can't build packages that
will install in CentOS7 anyway.

Luckily, we will soon be providing our own python3 interpreter. But for
now, as a first step, simplify the dependency list by removing the
CentOS/Fedora conditional and listing the full python3 list

Signed-off-by: Glauber Costa <glauber@scylladb.com>
2019-02-04 18:02:40 -05:00
Glauber Costa
7052028752 fixup scripts before installing them to their final location
Before installing python files to their final location in install.sh,
replace them with a thunk so that they can work with our python3
interpreter.  The way the thunk works, they will also work without our
python3 interpreter so unconditionally fixing them up is always safe.

In this patch I opt for fixing up only at install time to simplify
developers' lives, since they won't have to worry about this at all.

Note about the rpm .spec file: since we are relying on specific format
for the shebangs, we shouldn't let rpmbuild mess with them. Therefore,
we need to disable a global variable that controls that behavior (by
default, Fedora rpmbuild will rewrite all shebangs to /usr/bin/python3)

Signed-off-by: Glauber Costa <glauber@scylladb.com>
2019-02-04 18:02:40 -05:00
Glauber Costa
3869628429 automatically relocate python scripts
Given a python script at $DIR/script.py, this copies the script to
$DIR/libexec/script.py.bin, fixes its shebang to use /usr/bin/env instead
of an absolute path for the interpreter and replaces the original script
with a thunk that calls into that script.

PYTHONPATH is adjusted so that the original directory containing the script
can also serve as a source of modules, as would be originally intended.

Signed-off-by: Glauber Costa <glauber@scylladb.com>
2019-02-04 18:02:39 -05:00
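
The relocation steps described in 3869628429 can be sketched as below. This is a rough illustration, not the script from the patch: the function name relocate, the exact thunk contents, and the decision to drop interpreter options from the shebang are all assumptions.

```python
import os

def relocate(script_path):
    """Move a script to libexec/<name>.bin and leave a thunk in its place."""
    directory, name = os.path.split(os.path.abspath(script_path))
    libexec = os.path.join(directory, "libexec")
    os.makedirs(libexec, exist_ok=True)
    relocated = os.path.join(libexec, name + ".bin")

    with open(script_path) as src:
        lines = src.readlines()

    # Rewrite an absolute interpreter path so it goes through /usr/bin/env.
    if lines and lines[0].startswith("#!"):
        parts = lines[0][2:].split()
        if parts:
            if os.path.basename(parts[0]) == "env" and len(parts) > 1:
                interpreter = parts[1]
            else:
                interpreter = os.path.basename(parts[0])
            lines[0] = "#!/usr/bin/env " + interpreter + "\n"

    with open(relocated, "w") as dst:
        dst.writelines(lines)
    os.chmod(relocated, 0o755)

    # Hypothetical thunk: keep the original directory importable via
    # PYTHONPATH, then hand over to the relocated script.
    thunk = (
        "#!/bin/sh\n"
        'here="$(dirname "$(readlink -f "$0")")"\n'
        'export PYTHONPATH="$here${PYTHONPATH:+:$PYTHONPATH}"\n'
        'exec "$here/libexec/' + name + '.bin" "$@"\n'
    )
    with open(script_path, "w") as dst:
        dst.write(thunk)
    os.chmod(script_path, 0o755)
    return relocated
```

Going through /usr/bin/env means the relocated script picks up whichever python3 comes first in PATH, which is what allows a bundled interpreter to be used.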
Glauber Costa
1bb65a0888 make scyllatop relocatable
Right now the binary we distribute with scyllatop calls into
/usr/lib/scylla/scyllatop/scyllatop.py unconditionally. Calling that is
all that this binary does.

This poses a problem to our relocatable process, since we don't want
to be referring to absolute paths (And moreover, that is calling python
whereas it should be calling python3)

The scyllatop.py file includes a python3 shebang and is executable.
Therefore, it is best to just create a link to that file and execute it
directly

Signed-off-by: Glauber Costa <glauber@scylladb.com>
2019-02-04 16:12:46 -05:00
Glauber Costa
e890b8af09 use relative paths for installing scylla and iotune binaries
The answer is yes: if we install them in $root/opt, we should link
to $root/opt

Signed-off-by: Glauber Costa <glauber@scylladb.com>
2019-02-04 14:33:51 -05:00
Piotr Jastrzebski
834bec5cc9 Read shard awareness columns as dropped
Without this, a new version of Scylla won't be able to
start with system tables inherited from an older version
that had shard awareness columns.

Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
Message-Id: <cb62f20fc0c98f532c6f4ad5e08b3794951e85bd.1549289050.git.piotr@scylladb.com>
2019-02-04 18:43:11 +02:00
Rafael Ávila de Espíndola
bbd9dfcba7 Add a --split-dwarf option to configure.py
It is off by default as it conflicts with distcc.

Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
Message-Id: <20190204002706.15540-1-espindola@scylladb.com>
2019-02-04 18:42:16 +02:00
Benny Halevy
a9e1e0233a Add a dev build mode to test.py
Message-Id: <20190204162112.7471-2-espindola@scylladb.com>
2019-02-04 18:38:23 +02:00
Rafael Ávila de Espíndola
6243443591 Add a dev build mode
The build times I got with a clean ccache were:

ninja dev      10806.89s user  678.29s system 2805% cpu  6:49.33 total
ninja release  28906.37s user 1094.53s system 2378% cpu 21:01.27 total
ninja debug    18611.17s user 1405.66s system 2310% cpu 14:26.52 total

With this version -gz is not passed to seastar's configure. It should
probably be seastar's configure responsibility to do that and I will
send a separate patch to do it.

Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
Message-Id: <20190204162112.7471-1-espindola@scylladb.com>
2019-02-04 18:38:22 +02:00
Calle Wilund
9cadbaa96f commitlog_replayer: Bugfix: finding truncation positions uses local var ref
"uuid" was captured by reference in a continuation. This works 99.9% of
the time because the continuation is not actually delayed (and assuming
we begin the checks with non-truncated (system) cf:s it works).
But if the continuation is delayed, the resulting cf map will be
borked.

Fixes #4187.

Message-Id: <20190204141831.3387-1-calle@scylladb.com>
2019-02-04 16:51:13 +02:00
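
The class of bug fixed in 9cadbaa96f has a loose analogue in Python closures, shown below. This is an analogy only: the C++ bug involves a reference to a local variable, while Python closures bind loop variables late. The function names are hypothetical.

```python
def schedule_by_reference(uuids):
    callbacks = []
    for uuid in uuids:
        # The lambda refers to the loop variable itself, not to the value
        # it held when the lambda was created.
        callbacks.append(lambda: uuid)
    return callbacks

def schedule_by_value(uuids):
    callbacks = []
    for uuid in uuids:
        # A default argument copies the current value into the callback.
        callbacks.append(lambda uuid=uuid: uuid)
    return callbacks

# Running the callbacks only after the loop has finished plays the role
# of a delayed continuation.
print([cb() for cb in schedule_by_reference(["a", "b", "c"])])  # ['c', 'c', 'c']
print([cb() for cb in schedule_by_value(["a", "b", "c"])])      # ['a', 'b', 'c']
```

If each callback ran immediately inside the loop, both versions would appear correct, which is why this kind of bug can stay hidden for a long time.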
Rafael Ávila de Espíndola
15a515a39b build: Don't link utils/gz/gen_crc_combine_table with seastar
It doesn't use seastar, so there is no point in linking with it.

Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
Message-Id: <20190203214145.43009-1-espindola@scylladb.com>
2019-02-04 15:43:16 +02:00
Botond Dénes
2a67355ded multishard_combining_reader: better shard selection algorithm
The multishard reader has to combine the output of all shards into a
single fragment stream. To do that, each time a `partition_start` is
read it has to check if there is another partition, from another shard,
that has to be emitted before this partition. Currently for this it
uses the partitioner. At every partition start fragment it checks if the
token falls into the current shard sub-range. The shard sub-range is the
continuous range of tokens, where each token belongs to the same shard.
If the partition doesn't belong to the current shard sub-range, the
multishard reader assumes the following shard sub-range of the next shard
will have data and moves over to it. This assumption, however, only
holds on very dense tables, and fails miserably on less dense
tables, resulting in the multishard reader effectively iterating over
the shard sub-ranges (4096 in the worst case), only to find data in just
a few of them. This resulted in high user-perceived latency when
scanning a sparse table.

This patch replaces this algorithm with one based on a shard heap. The
shards are now organized into a min-heap, by the next token they have
data for. When a partition start fragment is read from the current
shard, its token is compared to the smallest token in the shard heap. If
smaller, we continue to read from the current shard. Otherwise we move
to the shard with the smallest token. When constructing the reader, or
after fast-forwarding we don't know what first token each reader will
produce. To avoid reading in a partition from each reader, we assume
each reader will produce the first token from the first shard sub-range
that overlaps with the query range. This algorithm performs much better
on sparse tables, while also being slightly better on dense tables.

I did only a very rough measurement using CQL tracing. I populated a
table with four rows on a 64 shards machine, then scanned the entire
table.
Time to scan the table (microseconds):
before 27'846
after   5'248

Fixes: #4125

Signed-off-by: Botond Dénes <bdenes@scylladb.com>
Message-Id: <d559f887b650ab8caa79ad4d45fa2b7adc39462d.1548846019.git.bdenes@scylladb.com>
2019-02-04 14:10:23 +02:00
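
The shard-heap selection described in 2a67355ded can be sketched in Python with heapq. This is a simplified model, not the reader's implementation: each shard is a sorted list of integer tokens, and the heap is seeded by reading the first token of every shard, whereas the patch estimates it from the shard sub-ranges to avoid that read.

```python
import heapq

def merge_shards(shard_tokens):
    """Yield (token, shard) in token order from per-shard sorted lists."""
    iters = [iter(tokens) for tokens in shard_tokens]
    heap = []
    for shard, it in enumerate(iters):
        first = next(it, None)
        if first is not None:
            heap.append((first, shard))
    heapq.heapify(heap)

    while heap:
        token, shard = heapq.heappop(heap)
        while True:
            yield token, shard
            token = next(iters[shard], None)
            if token is None:
                break  # this shard has no more data
            if heap and heap[0][0] < token:
                # Another shard has a smaller token: park this shard in
                # the heap and switch to the smallest one.
                heapq.heappush(heap, (token, shard))
                break
            # Otherwise keep reading from the current shard.

print(list(merge_shards([[1, 4, 9], [2, 3], [], [10]])))
# [(1, 0), (2, 1), (3, 1), (4, 0), (9, 0), (10, 3)]
```

Shards with no data never enter the heap, so a sparse table costs work proportional to the shards that actually hold partitions rather than to the number of shard sub-ranges.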
Piotr Sarna
11e6d88ca7 tests: supplement filtering collections with more cases
Filtering test cases for collections are supplemented with
checking whether CONTAINS works correctly for sets and maps.

Message-Id: <4a684152cdcdb65e1415ba5859699cb324312c2b.1548837150.git.sarna@scylladb.com>
2019-02-03 17:19:30 +02:00