Commit Graph

2975 Commits

Author SHA1 Message Date
Avi Kivity
84465c23c4 Merge "Add multi-column restrictions filtering" from Piotr
"
Fixes #3574

This series adds missing multi-column restrictions filtering to CQL.
The underlying infrastructure already allows checking multi-column
restrictions in a reasonable way, so this series consists of mostly
adding simple interfaces and parameters.
Also, unit test cases for multi-column restrictions are provided.

Tests: unit (dev)
"

* 'add_multi_column_restrictions_filtering_3' of https://github.com/psarna/scylla:
  tests: add multi-column filtering tests
  cql3: add multi-column restrictions filtering
  cql3: add specified is_satisfied_by to multi-column restriction
  cql3: rewrite raw loop in is_satisfied_by to boost::any_of
  cql3: fix is_satisfied_by for multi-column restrictions
  cql3: add missing include to multi-column restriction
2019-02-19 14:42:14 +02:00
Piotr Sarna
9432937816 tests: add multi-column filtering tests
Refs #3574
2019-02-19 13:24:25 +01:00
Avi Kivity
934ba7ccb2 Merge "tests: introduce test environment and cleanup sstable tests" from Benny
"
As part of implementing the sstables manager and fixing an issue related
to updating large_data_handler on all delete paths, we want to funnel
all sstable creations, loading, and deletions through a manager.

The patchset lays out test infrastructure to funnel these operations
through class sstables::test_env.

In the process, it cleans up numerous call sites in the existing
unit tests that evolved over time.

Refs #4198
Refs #4149

Tests: unit (dev)
"

* 'projects/test_env/v3' of https://github.com/bhalevy/scylla:
  tests: introduce sstables::test_env
  tests: perf_sstable: rename test_env
  tests: sstable_datafile_test: use useable_sst
  tests: sstable_test: add write_and_validate_sst helper
  tests: sstable_test: add test_using_reusable_sst helper
  tests: sstable_test: use reusable_sst where possible
  tests: sstable_test: add test_using_working_sst helper
  tests: sstable_3_x_test: make_test_sstable
  tests: run_sstable_resharding_test: use default parameters to make_sstable
  tests: sstables::test::make_test_sstable: reorder params
  tests: test_setup: do_with_test_directory is unused
  tests: move sstable_resharding_strategy_tests to sstable_reharding_test
  tests: move create_token_from_key helpers to test_services
  tests: move column_family_for_tests to test_services
  dht: move declaration of default_partitioner from sstable_datafile_test to i_partitioner.hh
2019-02-19 11:26:42 +02:00
Duarte Nunes
6e83457b1b Merge 'Add PER PARTITION LIMIT' from Piotr
"
This series introduces PER PARTITION LIMIT to CQL.
Protocol and storage are already capable of applying per-partition limits,
so for nonpaged queries the changes are superficial - a variable is parsed
and passed down.
For paged queries and filtering the situation is a little bit more complicated
due to corner cases: results for one partition can be split over 2 or more pages,
filtering may drop rows, etc. To solve these, another variable is added to paging
state - the number of rows already returned from last served partition.
Note that the "last" partition may stretch over any number of pages, not just the
last one, which is especially likely when filtering is involved.
As a result, per-partition-limiting queries are not eligible for page generator
optimization, because they may need to have their results locally filtered
for extraneous rows (e.g. when the next page asks for per-partition limit 5,
but we already received 4 rows from the last partition, so we need just 1 more
from the last partition key, but 5 from all the following ones).

Tests: unit (dev)

Fixes #2202
"

* 'add_per_partition_limit_3' of https://github.com/psarna/scylla:
  tests: remove superficial ignore_order from filtering tests
  tests: add filtering with per partition key limit test
  tests: publish extract_paging_state and count_rows_fetched
  tests: fix order of parameters in with_rows_ignore_order
  cql3,grammar: add PER PARTITION LIMIT
  idl,service: add persistent last partition row count
  cql3: prevent page generator usage for per-partition limit
  cql3: add checking for previous partition count to filtering
  pager: add adjusting per-partition row limit
  cql3: obey per partition limit for filtering
  cql3: clean up unneeded limit variables
  cql3: obey per partition limit for select statement
  cql3: add get_per_partition_limit
  cql3: add per_partition_limit to CQL statement
2019-02-18 14:47:11 +00:00
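
The row accounting described in the PER PARTITION LIMIT message above can be illustrated with a minimal sketch. It is not the code from the series: `paging_state_sketch`, its field, and `remaining_partition_rows` are hypothetical names chosen for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical slice of the paging state: how many rows of the last served
// partition were already returned on earlier pages.
struct paging_state_sketch {
    uint32_t rows_returned_from_last_partition = 0;
};

// Rows still allowed for a partition on the current page.
// `continues_last_partition` is true when the partition is the one the
// previous page ended in; every other partition gets the full limit.
inline uint32_t remaining_partition_rows(const paging_state_sketch& state,
                                         uint32_t per_partition_limit,
                                         bool continues_last_partition) {
    assert(per_partition_limit > 0);
    if (!continues_last_partition) {
        return per_partition_limit;
    }
    // Clamp so that a state at or above the limit yields 0 instead of underflowing.
    return per_partition_limit
        - std::min(per_partition_limit, state.rows_returned_from_last_partition);
}
```

With a limit of 5 and 4 rows already returned, the continued partition may contribute 1 more row, while every following partition may contribute 5.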
Piotr Sarna
5ad5221ce1 tests: remove superficial ignore_order from filtering tests
Tests for filtering with LIMIT used the with_rows_ignore_order function,
while it's better to use the simpler with_rows.
2019-02-18 11:06:44 +01:00
Piotr Sarna
5f67a501ec tests: add filtering with per partition key limit test 2019-02-18 11:06:44 +01:00
Piotr Sarna
a84e237177 tests: publish extract_paging_state and count_rows_fetched
These local lambda functions will be reused, so they are promoted
to static functions.
2019-02-18 11:06:44 +01:00
Piotr Sarna
824e9dc352 tests: fix order of parameters in with_rows_ignore_order
When reporting a failure, expected rows were mixed up with received
rows. Also, the message assumed it received more rows, but there may just
as well be fewer, so now it reports a "different number" of rows.
2019-02-18 11:06:44 +01:00
Avi Kivity
a1567b0997 Merge "replace get_restricted_ranges() function with generator interface" from Gleb
"
get_restricted_ranges() is inefficient since it calculates all
vnodes that cover the requested key ranges in advance, but callers often
use only the first one.  Replace the function with a generator interface
that generates the requested number of vnodes on demand.
"

* 'gleb/query_ranges_to_vnodes_generator' of github.com:scylladb/seastar-dev:
  storage_proxy: limit amount of precalculated ranges by query_ranges_to_vnodes_generator
  storage_proxy: remove old get_restricted_ranges() interface
  cql3/statements/select_statement: convert index query interface to new query_ranges_to_vnodes_generator interface
  tests: convert storage_proxy test to new query_ranges_to_vnodes_generator interface
  storage_proxy: convert range query path to new query_ranges_to_vnodes_generator interface
  storage_proxy: introduce new query_ranges_to_vnode_generator interface
2019-02-18 10:33:54 +02:00
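
The on-demand idea in the message above can be sketched as follows. This is a simplified illustration, not the storage_proxy code: tokens are plain integers, ranges are half-open and non-wrapping, and the class name `ranges_to_vnodes_generator_sketch` is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

using token = int64_t;
using token_range = std::pair<token, token>; // half-open [first, second)

// Splits one query range at vnode boundaries, but only as far as the
// caller asks for; nothing is computed ahead of time.
class ranges_to_vnodes_generator_sketch {
    std::vector<token> _boundaries; // sorted vnode boundaries (assumed input)
    token _pos;
    token _end;
    size_t _next = 0;
public:
    ranges_to_vnodes_generator_sketch(std::vector<token> sorted_boundaries, token_range query)
        : _boundaries(std::move(sorted_boundaries)), _pos(query.first), _end(query.second) {}

    bool empty() const { return _pos >= _end; }

    // Produce at most n further sub-ranges.
    std::vector<token_range> operator()(size_t n) {
        std::vector<token_range> out;
        while (out.size() < n && !empty()) {
            // Skip boundaries at or before the current position.
            while (_next < _boundaries.size() && _boundaries[_next] <= _pos) {
                ++_next;
            }
            const bool boundary_inside = _next < _boundaries.size() && _boundaries[_next] < _end;
            const token stop = boundary_inside ? _boundaries[_next] : _end;
            out.emplace_back(_pos, stop);
            _pos = stop;
        }
        return out;
    }
};
```

A caller that needs only the first vnode range calls the generator with 1 and pays for one split; the remaining splits are computed only if it asks again.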
Avi Kivity
03531c2443 fragmented_temporary_buffer: fix read_exactly() during premature end-of-stream
read_exactly(), when given a stream that does not contain the amount of data
requested, will loop endlessly, allocating more and more memory as it does, until
it fails with an exception (at which point it will release the memory).

Fix by returning an empty result, like input_stream::read_exactly() (which it
replaces). Add a test case that fails without the fix.

Affected callers are the native transport, commitlog replay, and internal
deserialization.

Fixes #4233.

Branches: master, branch-3.0
Tests: unit(dev)
Message-Id: <20190216150825.14841-1-avi@scylladb.com>
2019-02-16 17:06:19 +00:00
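
The behaviour described in the read_exactly() message above can be shown with a synchronous sketch. The real function is asynchronous and works on Seastar streams; `chunk_source` and `read_exactly_sketch` are hypothetical stand-ins that only illustrate the "empty result on premature end-of-stream" rule.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for an input stream: hands out queued chunks,
// then an empty chunk to signal end-of-stream.
struct chunk_source {
    std::deque<std::string> chunks;
    std::string read() {
        if (chunks.empty()) {
            return {};
        }
        std::string c = std::move(chunks.front());
        chunks.pop_front();
        return c;
    }
};

// Collect exactly n bytes as a list of fragments, or return an empty
// result if the stream ends first.
inline std::vector<std::string> read_exactly_sketch(chunk_source& in, size_t n) {
    std::vector<std::string> fragments;
    size_t have = 0;
    while (have < n) {
        std::string c = in.read();
        if (c.empty()) {
            // Premature end-of-stream: give up instead of looping forever.
            return {};
        }
        if (c.size() > n - have) {
            c.resize(n - have); // the sketch drops the surplus; a real stream would keep it
        }
        have += c.size();
        fragments.push_back(std::move(c));
    }
    assert(have == n);
    return fragments;
}
```

The early return is the point of the sketch: without it, a loop of this shape would keep asking an exhausted stream for more data.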
Rafael Ávila de Espíndola
9cd14f2602 Don't write to system.large_partition during shutdown
The included testcase used to crash because during database::stop() we
would try to update system.large_partition.

There doesn't seem to be an order we can stop the existing services in
cql_test_env that makes this possible.

This patch then adds another step when shutting down a database: first
stop updating system.large_partition.

This means that during shutdown any memtable flush, compaction or
sstable deletion will not be reflected in system.large_partition. This
is hopefully not too bad since the data in the table is TTLed.

This seems to impact only tests, since main.cc calls _exit directly.

Tests: unit (release,debug)

Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
Message-Id: <20190213194851.117692-1-espindola@scylladb.com>
2019-02-15 10:49:10 +01:00
Benny Halevy
c8f239ff2b tests: introduce sstables::test_env
In preparation for adding sstables_manager we want
to establish an environment for testing sstables.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2019-02-14 22:37:41 +02:00
Benny Halevy
f9546b23b7 tests: perf_sstable: rename test_env
test_env is going to be a class in sstables namespace

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2019-02-14 22:22:15 +02:00
Benny Halevy
d6cfc1fae5 tests: sstable_datafile_test: use useable_sst
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2019-02-14 22:22:14 +02:00
Benny Halevy
2a6b5a7622 tests: sstable_test: add write_and_validate_sst helper
In preparation for sstables::test_env

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2019-02-14 22:22:14 +02:00
Benny Halevy
255f05e6c8 tests: sstable_test: add test_using_reusable_sst helper
In preparation for sstables::test_env

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2019-02-14 22:22:14 +02:00
Benny Halevy
e11e29a1fc tests: sstable_test: use reusable_sst where possible
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2019-02-14 22:22:14 +02:00
Benny Halevy
9d4989f2e8 tests: sstable_test: add test_using_working_sst helper
In preparation for sstables::test_env

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2019-02-14 22:22:14 +02:00
Benny Halevy
55aac22b37 tests: sstable_3_x_test: make_test_sstable
Reused for making sstables for test cases.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2019-02-14 22:22:14 +02:00
Benny Halevy
3bc1b8b9ff tests: run_sstable_resharding_test: use default parameters to make_sstable
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2019-02-14 22:22:14 +02:00
Benny Halevy
b0f3f8d766 tests: sstables::test::make_test_sstable: reorder params
In preparation for providing a default large_data_handler in
a test-standard way.

buffer_size parameter reordered and now has a default value
same as make_sstable()'s.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2019-02-14 22:21:36 +02:00
Benny Halevy
bcd3f36a8a tests: test_setup: do_with_test_directory is unused
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2019-02-14 22:21:32 +02:00
Benny Halevy
b39c7bc4ae tests: move sstable_resharding_strategy_tests to sstable_reharding_test
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2019-02-14 22:21:32 +02:00
Benny Halevy
8801a6da1f tests: move create_token_from_key helpers to test_services
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2019-02-14 22:21:32 +02:00
Benny Halevy
815fd76c25 tests: move column_family_for_tests to test_services
And unify multiple copies of column_family_test_config().

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2019-02-14 22:21:10 +02:00
Benny Halevy
b6ad61d2e5 dht: move declaration of default_partitioner from sstable_datafile_test to i_partitioner.hh
So it can be used by other tests

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2019-02-14 22:16:52 +02:00
Nadav Har'El
43c42d608d materialized views: forbid using "virtual" columns in restrictions
For fixing issue #3362 we added in materialized views, in some cases,
"virtual columns" for columns which were not selected into the view.
Although these columns nominally exist in the view's schema, they must
not be visible to the user, and in commit
3f3a76aa8f we prevented a user from being
able to SELECT these columns.

In this patch we also prevent the user from being able to use these
column names (which shouldn't exist in the view) in WHERE restrictions.

Fixes #4216

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20190212162014.18778-1-nyh@scylladb.com>
2019-02-14 16:08:41 +02:00
Nadav Har'El
fec562ec8f Materialized views: limit size of row batching during bulk view building
The bulk materialized-view building process (when adding a materialized
view to a table with existing data) currently reads the base table in
batches of 128 (view_builder::batch_size) rows. This is clearly better
than reading entire partitions (which may be huge), but still, 128 rows
may grow pretty large when we have rows with large strings or blobs,
and there is no real reason to buffer 128 rows when they are large.

Instead, when the rows we read so far exceed some size threshold (in this
patch, 1MB), we can operate on them immediately instead of waiting for
128.

As a side-effect, this patch also solves another bug: In the worst case, all
the base rows of one batch may be written into one output view partition,
in one mutation. But there is a hard limit on the size of one mutation
(commitlog_segment_size_in_mb, by default 32MB), so we cannot allow the
batch size to exceed this limit. By not batching further after 1MB,
we avoid reaching this limit when individual rows do not reach it but
128 of them would.

Fixes #4213.

This patch also includes a unit test reproducing #4213, and demonstrating
that it is now solved.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20190214093424.7172-1-nyh@scylladb.com>
2019-02-14 12:04:40 +02:00
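
The batching rule from the message above (stop at 128 rows or once the accumulated size passes a threshold, whichever comes first) can be sketched as below. The function name `batch_row_counts` and the representation of rows as byte sizes are assumptions made for illustration; the actual view builder consumes rows from a reader.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Given the size in bytes of each base row, return how many rows end up in
// each batch when a batch is closed at max_rows rows or at max_bytes bytes.
inline std::vector<size_t> batch_row_counts(const std::vector<size_t>& row_sizes,
                                            size_t max_rows = 128,
                                            size_t max_bytes = 1024 * 1024) {
    assert(max_rows > 0);
    std::vector<size_t> batches;
    size_t rows = 0;
    size_t bytes = 0;
    for (size_t sz : row_sizes) {
        ++rows;
        bytes += sz;
        if (rows >= max_rows || bytes >= max_bytes) {
            batches.push_back(rows); // process this batch now
            rows = 0;
            bytes = 0;
        }
    }
    if (rows > 0) {
        batches.push_back(rows); // trailing partial batch
    }
    return batches;
}
```

Small rows still fill batches of 128, while rows of 600KB each are processed two at a time, which keeps a batch far below the 32MB mutation limit mentioned above.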
Calle Wilund
2e320a456c truncation_migration_test: Add rudimentary test 2019-02-13 09:08:12 +00:00
Calle Wilund
b253757b17 cql_test_env: Add separate config w. feature disable 2019-02-13 09:08:12 +00:00
Calle Wilund
859a1d8f36 cql_test_env: Add truncation migration to init 2019-02-13 09:08:12 +00:00
Calle Wilund
fbcbe529ad cql_assertions: Add null/non-null tests 2019-02-13 09:08:12 +00:00
Botond Dénes
9000626647 shard_reader: auto pause readers after being used
Previously it was the responsibility of the layer above (multishard
combining reader) to pause readers, which happened via an explicit
`pause()` call. This proved to be a very bad design as we kept finding
spots where the multishard reader should have paused the reader to avoid
potential deadlocks (due to starved reader concurrency semaphores), but
didn't.

This commit moves the responsibility of pausing the reader into the
shard reader. The reader is now kept in a paused state, except when it
is actually used (a `fill_buffer()` or `fast_forward_to()` call is
executing). This is fully transparent to the layer above.
As a side note, the shard reader now also hides when the reader is
created. This also used to be the responsibility of the multishard
reader, and although it caused no problems so far, it can be considered
a leak of internal details. The shard reader now automatically creates
the remote reader the first time it is used.

The code has been reorganized, such that there is now a clear separation
of responsibilities. The multishard combining reader handles the
combining of the output of the shard readers, as well as issuing
read-aheads. The shard reader handles read-ahead and creating the
remote reader when needed, as well as transferring the results of remote
reads to the "home" shard. The remote reader
(`shard_reader::remote_reader`, new in this patch) handles
pausing-resuming as well as recreating the reader after it was evicted.
Layers don't access each other's internals (like they used to).

After this commit, the reader passed to `destroy_reader()` will always
be in paused state.
2019-02-12 16:20:51 +02:00
Botond Dénes
37006135dc shard_reader: make reader creation sync
Reader creation happens through the `reader_lifecycle_policy` interface,
which offers a `create_reader()` method. This method accepts a shard
parameter (among others) and returns a future. Its implementation is
expected to go to the specified shard and then return with the created
reader. The method is expected to be called from the shard where the
shard reader (and consequently the multishard reader) lives. This API,
while reasonable enough, has a serious flaw. It doesn't make batching
possible. For example, if the shard reader issues a call to the remote
shard to fill the remote reader's buffer, but finds that it was evicted
while paused, it has to come back to the local shard just to issue the
recreate call. This makes the code both convoluted and slow.
Change the reader creation API to be synchronous, that is, callable from
the shard where the reader has to be created, allowing for simple call
sites and batching.
This change requires that implementations of the lifecycle policy update
any per-reader data-structure they have from the remote shard. This is
not a problem however, as these data-structures are usually partitioned,
such that they can be accessed safely from a remote shard.
Another, very pleasant, consequence of this change is that now all
methods of the lifecycle interface are sync and thus calls to them
cannot overlap anymore.

This patch also removes the
`test_multishard_combining_reader_destroyed_with_pending_create_reader`
unit test, which is not useful anymore.

For now just emulate the old interface inside shard reader. We will
overhaul the shard reader after some further changes to minimize
noise.
2019-02-12 16:20:51 +02:00
Botond Dénes
57d1f6589c shard_reader: use semaphore directly to pause-resume
The shard reader relies on the `reader_lifecycle_policy` for pausing and
resuming the remote reader. The lifecycle policy's API was designed to
be as general as possible, allowing for any implementation of
pause/resume. However, in practice, we have a single implementation of
pause/resume: registering/unregistering the reader with the relevant
`reader_concurrency_semaphore`, and we don't expect any new
implementations to appear in the future.
Thus, the generic API of the lifecycle policy is needlessly abstract,
making its implementations needlessly complex. We can instead make this
very concrete and have the lifecycle policy just return the relevant
semaphore, removing the need for every implementor of the lifecycle
policy interface to have a duplicate implementation of the very same
logic.

For now just emulate the old interface inside shard reader. We will
overhaul the shard reader after some further changes to minimize noise.
2019-02-12 16:20:51 +02:00
Nadav Har'El
85e5791710 tests/view_schema_test: fix flakiness caused by missing eventually()
All tests that involve writing to a base table and then reading from the
view table must use the eventually() function to account for the fact that
the view update is asynchronous, and may be visible only some time after
writing the base table. Forgetting an eventually() can cause the test
to become flaky and sometimes fail because the expected data is not *yet*
in the view. Botond noticed these failures in practice in two subtests
(test_partition_key_filtering_with_slice and
test_clustering_key_in_restrictions).

This patch fixes both tests, and I also reviewed the entire source file
view_schema_test.cc and found additional places missing an eventually()
(and also places that unnecessarily used eventually() to read from the
base table), and fixed those as well.

Fixes #4212

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20190212121140.14679-1-nyh@scylladb.com>
2019-02-12 16:10:30 +02:00
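
The role of eventually() in the message above can be illustrated with a small retry helper. This is a sketch of the general pattern, not the helper from the test suite: `eventually_sketch`, its parameters, and the doubling delay are assumptions.

```cpp
#include <cassert>
#include <chrono>
#include <stdexcept>
#include <thread>

// Run `check` until it stops throwing, sleeping between attempts.
// The last failure is rethrown once max_attempts is exhausted.
template <typename Func>
void eventually_sketch(Func&& check,
                       int max_attempts = 10,
                       std::chrono::milliseconds delay = std::chrono::milliseconds(1)) {
    assert(max_attempts > 0);
    for (int attempt = 1; ; ++attempt) {
        try {
            check();
            return; // the expected data is visible now
        } catch (...) {
            if (attempt >= max_attempts) {
                throw; // still failing: report the last error
            }
        }
        std::this_thread::sleep_for(delay);
        delay *= 2; // back off before the next attempt
    }
}
```

A test would wrap only the read from the view in such a helper; reads from the base table see the write immediately and need no retry.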
Paweł Dziepak
eb03cf00f5 sstable: write_components: drop default for encoding stats
There is no value in having a default value for the encoding_stats parameter
of write_components(). If anything it weakens the tests by encouraging
not using the real encoding stats which is not what the actual sstable
write path in Scylla does.

This patch removes the default value and makes most of the tests provide
real encoding statistics. The ones that do not are those that have no
easy way of obtaining those (and those stats are not that important for
the test itself) or there is a reason for not using those
(sstable_3_x_test::test_sstable_write_large_row uses row size thresholds
based on size with default-constructed encoding_stats).

Message-Id: <20190212124356.14878-1-pdziepak@scylladb.com>
2019-02-12 16:08:24 +02:00
Gleb Natapov
e6208b1cde tests: convert storage_proxy test to new query_ranges_to_vnodes_generator interface 2019-02-11 14:45:43 +02:00
Duarte Nunes
d2d885fb93 Merge 'Fix misdetection of remote counter shards' from Paweł
"
The code reading counter cells from sstables verifies that there are no
unsupported local or remote shards. The latter are detected by checking
if all shards are present in the counter cell header (only remote shards
do not have entries there). However, the logic responsible for doing
that was incorrectly computing the total number of counter shards in a
cell if the header was larger than a single counter shard. This resulted
in incorrect complaints that remote shards are present.

Fixes #4206

Tests: unit(release)
"

* tag 'counter-header-fix/v1' of https://github.com/pdziepak/scylla:
  tests/sstables: test counter cell header with large number of shards
  sstables/counters: fix remote counter shard detection
2019-02-10 12:16:31 +02:00
Paweł Dziepak
4eeb8eeed5 tests/sstables: test counter cell header with large number of shards
The logic responsible for reading counters from sstables was getting
confused by large headers. The size of the header depends directly on
the number of shards. This test checks that we can handle cells with
a large number of counter shards properly.
2019-02-08 17:06:31 +00:00
Paweł Dziepak
64b1a2caf9 tests: modernise tmpdir
tmpdir is a helper class representing a temporary directory.
Unfortunately, it suffers from some problems such as lack of proper
encapsulation and weak typing. This has caused bugs in the past when the
user code accidentally modified the member variable with the path to the
directory.

This patch modernises tmpdir and updates its users. The path is stored
in a std::filesystem::path and available read-only to the class users.
mkdtemp and boost are replaced by a standard solution.

The users are updated to use path more (when it didn't involve too many
changes to their code) and stop using lw_shared_ptr to store the tmpdir
when it wasn't necessary.

tmpdir intentionally doesn't provide any helpers for getting the path as
a string in order to discourage weak types.

Message-Id: <20190207145727.491-1-pdziepak@scylladb.com>
2019-02-07 20:18:14 +02:00
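
The encapsulation described in the tmpdir message above can be sketched with C++17 std::filesystem. The class below is an illustration under stated assumptions, not the patched tmpdir: the name `tmpdir_sketch`, the directory name prefix, and the random-suffix scheme are hypothetical.

```cpp
#include <cassert>
#include <filesystem>
#include <random>
#include <string>
#include <system_error>

namespace fs = std::filesystem;

// A temporary directory whose path is strongly typed and read-only to users.
class tmpdir_sketch {
    fs::path _path;
public:
    tmpdir_sketch() {
        std::random_device rd;
        // create_directory() returns false if the directory already exists,
        // so retry on the unlikely name collision.
        do {
            _path = fs::temp_directory_path() / ("tmpdir-sketch-" + std::to_string(rd()));
        } while (!fs::create_directory(_path));
    }
    tmpdir_sketch(const tmpdir_sketch&) = delete;
    tmpdir_sketch& operator=(const tmpdir_sketch&) = delete;
    ~tmpdir_sketch() {
        std::error_code ec;
        fs::remove_all(_path, ec); // error_code overload: never throw from a destructor
    }
    // Read-only access: callers cannot redirect what the destructor removes.
    const fs::path& path() const noexcept { return _path; }
};
```

Because path() returns a const reference, user code cannot accidentally reassign the member, which is the class of bug the message refers to.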
Paweł Dziepak
67252de195 tests/sstable: test merging encoding_stats 2019-02-07 10:17:06 +00:00
Paweł Dziepak
e25603fbf7 tests/sstables: pass encoding stats to write_components()
By default write_components() uses a safe default for encoding_stats
which indicates that all columns are present. This may hide some bugs, so
let's pass the real thing in the tests where this may matter.
2019-02-07 10:17:06 +00:00
Paweł Dziepak
d44d5ebf86 tests/sstables: test for reading serialisation header 2019-02-07 10:17:06 +00:00
Paweł Dziepak
e315448d0a tests/memtable: test column tracking for encoding stats 2019-02-07 10:16:50 +00:00
Paweł Dziepak
591d5195a9 tests/data_model: allow specifying timestamp for row markers 2019-02-07 10:16:50 +00:00
Paweł Dziepak
b07cba6a89 tests/data_model: add support for expiring atomic cells 2019-02-07 10:16:50 +00:00
Paweł Dziepak
aab0b7360f tests: extract mutation data model 2019-02-07 10:16:50 +00:00
Paweł Dziepak
fa216be260 tests/sstable3: improvements to file compare
This patch introduces some improvements to file comparison:
 - exception flags are set so that any error triggers an exception,
   which guarantees that errors are not silently ignored
 - std::ios_base::binary flag is passed to open()
 - istreambuf_iterator is used instead of istream_iterator. It is better
   suited for comparing binary data.
2019-02-07 10:16:50 +00:00
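
The three points listed in the file-compare message above can be combined in a short sketch. The function name `files_equal` is hypothetical and the code is not taken from the test suite; it only shows the techniques named in the list.

```cpp
#include <algorithm>
#include <cassert>
#include <fstream>
#include <iterator>
#include <string>

// Byte-wise comparison of two files. Throws if either file cannot be opened.
inline bool files_equal(const std::string& a, const std::string& b) {
    std::ifstream fa;
    std::ifstream fb;
    // Turn stream errors into exceptions so they cannot be silently ignored.
    fa.exceptions(std::ios_base::failbit | std::ios_base::badbit);
    fb.exceptions(std::ios_base::failbit | std::ios_base::badbit);
    fa.open(a, std::ios_base::in | std::ios_base::binary);
    fb.open(b, std::ios_base::in | std::ios_base::binary);
    // istreambuf_iterator reads raw characters; istream_iterator<char> would
    // use formatted extraction and skip whitespace bytes by default.
    std::istreambuf_iterator<char> ia(fa);
    std::istreambuf_iterator<char> ib(fb);
    std::istreambuf_iterator<char> end;
    // The four-iterator overload also detects a length mismatch.
    return std::equal(ia, end, ib, end);
}
```

Reaching end-of-file through istreambuf_iterator does not set failbit on the stream, so the exception mask does not fire on a normal end of input.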
Paweł Dziepak
a8a45a243b tests/cql_test_env: don't override tmpdir::path
The interface tmpdir::path isn't properly encapsulated and its users can
modify the path even though they really shouldn't. This can happen
accidentally: in cql_test_env a reference to tmpdir::path was created
and later assigned to in one of the code paths. This caused the tmpdir
destructor to remove the wrong directory at program exit.

This patch solves the problem by avoiding referencing tmpdir::path, a
copy is perfectly acceptable considering that this is tests-only code.

Message-Id: <20190206173046.26801-1-pdziepak@scylladb.com>
2019-02-06 20:55:40 +02:00