3181 Commits

Author SHA1 Message Date
Tomasz Grabiec
3e30a33e31 Merge "Introduce tests::random_schema" from Botond
Most of our tests use overly simplistic schemas (`simple_schema`) or
very specialized ones that focus on exercising a specific area of the
tested code. This is fine in most places as not all code is schema
dependent, however practice has shown that there can be nasty bugs
hiding in dark corners that only appear with a schema that has a
specific combination of types.

This series introduces `tests::random_schema`, a utility class for
generating random schemas and random data for them. An important goal is
to make using random schemas in tests as simple and convenient as
possible, therefore fostering the appearance of tests using random
schemas.

Random schema was developed to help test code I'm currently working
on, which segregates data by time windows. As I wasn't confident in my
ability to think of every possible combination of types that can break
my code, I came up with random-schema to help me find these corner
cases. So far I consider it a success: it already found bugs in my code
that I'm not sure I would have found if I had relied on specific
schemas. It also found bugs in unrelated areas of the code which proves
my point in the first paragraph.

* https://github.com/denesb/scylla.git random_schema/v5:
  tests/data_model: approximate to the modeled data structures
  data_value: add ascii constructor
  tests/random-utils.hh: add stepped_int_distribution
  tests/random-utils.hh: get_int() add overloads that accept external
    rand engine
  tests/random-utils.hh: add get_real()
  tests: introduce random_schema
2019-06-26 18:10:20 +02:00
Avi Kivity
06a9596491 tests: cql_test_env: disable commitlog O_DSYNC
O_DSYNC causes commitlog to pre-allocate each commitlog segment by writing
zeroes into it. In normal operation, this is amortized over the many
times the segment will be reused. In tests, this is wasteful, but under
the default workstation configuration with /tmp using tmpfs, no actual
writes occur.

However on a non-default configuration with /tmp mounted on a real disk,
this causes huge disk I/O and eventually a crash (observed in
schema_change_test). The crash is likely only caused indirectly, as the
extra I/O (exacerbated by many tests running in parallel) causes timeouts.

I reproduced this problem by running 15 copies of schema_change_test in
parallel with /tmp mounted on a real filesystem. Without this change, I
usually observe one or two of the copies crashing, with the change they
complete (and much more quickly, too).
2019-06-26 12:15:53 +02:00
Avi Kivity
fc629bb14f Merge "cql3: lift infinite bound check" from Benny & Piotr
"
If the database supports infinite bound range deletions,
CQL layer will no longer throw an error indicating that both ranges
need to be specified.

Fixes #432

Update test_range_deletion_scenarios unit test accordingly.
"

* 'cql3-lift-infinite-bound-check' of https://github.com/bhalevy/scylla:
  cql3: lift infinite bound check if it's supported
  service: enable infinite bound range deletions with mc
  database: add flag for infinite bound range deletions
2019-06-25 19:05:29 +03:00
Botond Dénes
d00cb4916c tests: introduce random_schema
random_schema is a utility class that provides methods for generating
random schemas as well as generating data (mutations) for them. The aim
is to make using random schemas in tests as simple and convenient as
is using `simple_schema`. For this reason the interface of
`random_schema` follows closely that of `simple_schema` to the extent
that it makes sense. An important difference is that `random_schema`
relies on `data_model` to actually build mutations. So all its
mutation-related operations work with `data_model::mutation_description`
instead of actual `mutation` objects. Once the user has arrived at the
desired mutation description they can generate an actual mutation via
`data_model::mutation_description::build()`.

In addition to the `random_schema` class, the `random_schema.hh` header
exposes the generic utility classes for generating types and values
that it internally uses.

random_schema is fully deterministic. Using the same seed and the same
set of operations is guaranteed to result in generating the same schema
and data.
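
The determinism property can be illustrated with a minimal sketch. It does not use the `random_schema` interface; `toy_schema` and `make_toy_schema` are hypothetical names, and the only assumption is that all randomness is drawn from a single engine seeded by the caller.

```cpp
#include <cstddef>
#include <cstdint>
#include <random>
#include <string>
#include <vector>

// Hypothetical stand-in for a generated schema: a list of column type names.
using toy_schema = std::vector<std::string>;

// Builds a toy schema from a seed. Every random decision comes from one
// engine seeded by the caller, so the same seed yields the same schema.
toy_schema make_toy_schema(std::uint32_t seed) {
    static const std::vector<std::string> types = {
        "int", "text", "boolean", "timestamp", "uuid"};
    std::mt19937 engine(seed);
    // Raw engine output modulo a bound keeps the result identical across
    // standard libraries; the slight bias does not matter for a sketch.
    const std::size_t column_count = 1 + engine() % 8;
    toy_schema schema;
    schema.reserve(column_count);
    for (std::size_t i = 0; i < column_count; ++i) {
        schema.push_back(types[engine() % types.size()]);
    }
    return schema;
}
```

Calling `make_toy_schema(42)` twice returns equal vectors, which is what lets a failing run be replayed from its seed.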
2019-06-25 12:01:33 +03:00
Botond Dénes
070d72ee23 tests/random-utils.hh: add get_real() 2019-06-25 12:01:33 +03:00
Botond Dénes
2d9f6c3b63 tests/random-utils.hh: get_int() add overloads that accept external rand engine 2019-06-25 12:01:33 +03:00
Botond Dénes
2a7710129e tests/random-utils.hh: add stepped_int_distribution 2019-06-25 12:01:33 +03:00
Botond Dénes
1bd8b77770 tests/data_model: approximate to the modeled data structures
Make the data modelling structures model their "real" counterparts
more closely, allowing the user greater control on the produced data.
The changes:
* Add timestamp to atomic_value (which is now a struct, not just an
    alias to bytes).
* Add tombstone to collection.
* Add row_tombstone to row.
* Add bound kinds and tombstone to range_tombstone.

Great care was taken to preserve backward compatibility, to avoid
unnecessary changes in existing code.
2019-06-25 12:01:33 +03:00
Piotr Sarna
add40d4e59 cql3: lift infinite bound check if it's supported
If the database supports infinite bound range deletions,
CQL layer will no longer throw an error indicating that both ranges
need to be specified.

[bhalevy] Update test_range_deletion_scenarios unit test accordingly.

Fixes #432

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2019-06-24 15:58:34 +03:00
Piotr Sarna
b668ee2b2d tests: add indexing+paging test case for clustering keys
Indexing a non-prefix part of the clustering key has a separate
code path (see issue #3405), so it deserves a separate test case.
2019-06-24 14:51:17 +02:00
Piotr Sarna
3d9a37f28f tests: add indexing + paging + aggregation test case
Indexed queries used to erroneously return partial per-page results
for aggregation queries. This test case used to reproduce the problem
and now ensures that there would be no regressions.

Refs #4540
2019-06-24 14:06:42 +02:00
Piotr Sarna
60cafcc39c tests: add query_options to cquery_nofail
The cquery_nofail utility is extended, so it can accept custom
query options, just like execute_cql does.
2019-06-24 14:06:41 +02:00
Avi Kivity
779b378785 Merge "Fix partitioned_sstable_set by making it self sufficient" from Raphael & Benny
"
partitioned_sstable_set is not self sufficient because it relies on
compatible_ring_position_view, which in turn relies on lifetime of
sstable object. This leads to use-after-free. Fix this problem by
introducing compatible_ring_position and using it in p__s__s.

Fixes #4572.

Test: unit (dev), compaction dtests (dev)
"

* 'projects/fix_partitioned_sstable_set/v4' of ssh://github.com/bhalevy/scylla:
  tests: Test partitioned sstable set's self-sufficiency
  sstables: Fix partitioned_sstable_set by making it self sufficient
  Introduce compatible_ring_position and compatible_ring_position_or_view
2019-06-23 17:17:18 +03:00
Raphael S. Carvalho
14fa7f6c02 tests: Test partitioned sstable set's self-sufficiency
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2019-06-23 16:29:13 +03:00
Rafael Ávila de Espíndola
3bd5dd7570 Add a few more tests of data_value::to_string
I found that no tests covered this code while refactoring it.

Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
Message-Id: <20190620183449.64779-2-espindola@scylladb.com>
2019-06-23 16:03:06 +03:00
Piotr Sarna
b8cadc928c tests: add test case for finishing index paging
The test case makes sure that paging indexes does not result
in an infinite loop.

Refs #4569
2019-06-19 14:10:13 +02:00
Rafael Ávila de Espíndola
26c0814a88 Add test large collection warning
This was already working, but we were not testing for it.

Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
Message-Id: <20190617181706.66490-1-espindola@scylladb.com>
2019-06-18 10:27:55 +02:00
Nadav Har'El
6aab1a61be Fix deciding whether a query uses indexing
The code that decides whether a query should use indexing was buggy - a partition key index might have influenced the decision even if the whole partition key was passed in the query (which effectively means that indexing it is not necessary).

Fixes #4539

Closes https://github.com/scylladb/scylla/pull/4544

Merged from branch 'fix_deciding_whether_a_query_uses_indexing' of git://github.com/psarna/scylla
  tests: add case for partition key index and filtering
  cql3: fix deciding if a query uses indexing
2019-06-18 01:01:14 +03:00
Tomasz Grabiec
f798f724c8 frozen_mutation: Guard against unfreezing using wrong schema
Currently, calling unfreeze() using the wrong version of the schema
results in undefined behavior. That can cause hard-to-debug
problems. Better to throw in such cases.
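
A minimal sketch of such a guard, assuming the frozen object records the version of the schema it was frozen with. The types `toy_schema` and `toy_frozen_mutation` and both functions are hypothetical stand-ins, not the actual `frozen_mutation` interface.

```cpp
#include <cstdint>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical stand-ins; the real schema and frozen_mutation are far richer.
struct toy_schema {
    std::uint64_t version;
};

struct toy_frozen_mutation {
    std::uint64_t schema_version; // version of the schema used when freezing
    std::string payload;          // serialized data, opaque without the schema
};

// Records the schema version next to the serialized payload.
toy_frozen_mutation freeze(const toy_schema& s, std::string payload) {
    return toy_frozen_mutation{s.version, std::move(payload)};
}

// Refuses to interpret the payload with a schema other than the one it was
// frozen with, instead of silently misreading it.
std::string unfreeze(const toy_frozen_mutation& fm, const toy_schema& s) {
    if (fm.schema_version != s.version) {
        throw std::runtime_error("unfreeze: schema version mismatch");
    }
    return fm.payload;
}
```

The check costs one comparison and turns a silent misinterpretation into an exception the caller can see.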

Refs #4549.

Tests:
  - unit (dev)
Message-Id: <1560459022-23786-1-git-send-email-tgrabiec@scylladb.com>
2019-06-17 15:23:24 +03:00
Benny Halevy
4ad06c7eeb tests/perf: provide random-seed option
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Message-Id: <20190613114307.31038-2-bhalevy@scylladb.com>
2019-06-13 14:45:49 +03:00
Benny Halevy
43e4631e6a tests: random-utils: use seastar::testing::local_random_engine
To provide test reproducibility use the seastar local_random_engine.

To reproduce a run, use the --random-seed command line option
with the seed printed accordingly.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Message-Id: <20190613114307.31038-1-bhalevy@scylladb.com>
2019-06-13 14:45:48 +03:00
Benny Halevy
fe2d629e20 mutation_reader_test: test_multishard_combining_reader_reading_empty_table: fix non-atomic sharing of shards_touched
It needs to be a std::vector<std::atomic<bool>>
otherwise threads step on each other in shared memory.
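
As a rough illustration of the pattern, here is a sketch in plain standard C++ rather than the test's actual code. `mark_all_shards` is a hypothetical name, and `std::thread` workers stand in for shards. Each worker writes only its own flag, and the atomics make those concurrent writes well defined, which would not hold for a packed `std::vector<bool>`, for example.

```cpp
#include <atomic>
#include <cstddef>
#include <thread>
#include <vector>

// Each worker marks its own slot; the caller counts the set slots after
// all workers have been joined.
std::size_t mark_all_shards(std::size_t shard_count) {
    // std::atomic is neither copyable nor movable, so the vector is sized
    // up front and never resized.
    std::vector<std::atomic<bool>> shards_touched(shard_count);
    for (auto& flag : shards_touched) {
        flag.store(false);
    }

    std::vector<std::thread> workers;
    workers.reserve(shard_count);
    for (std::size_t shard = 0; shard < shard_count; ++shard) {
        workers.emplace_back([&shards_touched, shard] {
            shards_touched[shard].store(true, std::memory_order_relaxed);
        });
    }
    // join() synchronizes with the end of each worker, so the loads below
    // observe every store.
    for (auto& worker : workers) {
        worker.join();
    }

    std::size_t touched = 0;
    for (const auto& flag : shards_touched) {
        touched += flag.load() ? 1 : 0;
    }
    return touched;
}
```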

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Message-Id: <20190613112359.21884-1-bhalevy@scylladb.com>
2019-06-13 14:44:43 +03:00
Piotr Sarna
2c2122e057 tests: add a test case for filtering clustering key
The test case makes sure that clustering key restriction
columns are fetched for filtering if they form a clustering key prefix,
but not a primary key prefix (partition key columns are missing).

Ref #4541
Message-Id: <3612dc1c6c22c59ac9184220a2e7f24e8d18407c.1560410018.git.sarna@scylladb.com>
2019-06-13 10:38:56 +03:00
Piotr Sarna
adeea0a022 cql3: fix fetching clustering key columns for filtering
When a column is not present in the select clause, but used for
filtering, it usually needs to be fetched from replicas.
Sometimes it can be avoided, e.g. if primary key columns form a valid
prefix - then, they will be optimized out before filtering itself.
However, clustering key prefix can only be qualified for this
optimization if the whole partition key is restricted - otherwise
the clustering columns still need to be present for filtering.

This commit also fixes tests in cql_query_test suite, because they now
expect more values - columns fetched for filtering will be present as
well (only internally, the clients receive only data they asked for).

Fixes #4541
Message-Id: <f08ebae5562d570ece2bb7ee6c84e647345dfe48.1560410018.git.sarna@scylladb.com>
2019-06-13 10:38:37 +03:00
Dejan Mircevski
a52a56bfc0 utils: Add like_matcher
A utility for matching text with LIKE patterns, and a battery of
tests.
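
To make the idea concrete, here is a minimal sketch of LIKE matching. It is not the `like_matcher` interface from this commit; `like_match` is a hypothetical free function, and it assumes the usual SQL wildcards (`%` for any run of characters, `_` for exactly one) with no escape handling.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical helper, not the like_matcher class added by the commit.
// '%' matches any run of characters (including none), '_' matches exactly
// one character. Escape sequences are deliberately not handled.
bool like_match(std::string_view pattern, std::string_view text) {
    std::size_t p = 0;
    std::size_t t = 0;
    std::size_t star_p = std::string_view::npos; // pattern position after the last '%'
    std::size_t star_t = 0;                      // text position where that '%' started
    while (t < text.size()) {
        if (p < pattern.size() && pattern[p] == '%') {
            // Remember the wildcard and first try matching it against nothing.
            star_p = ++p;
            star_t = t;
        } else if (p < pattern.size() &&
                   (pattern[p] == '_' || pattern[p] == text[t])) {
            ++p;
            ++t;
        } else if (star_p != std::string_view::npos) {
            // Mismatch: let the last '%' absorb one more character and retry.
            p = star_p;
            t = ++star_t;
        } else {
            return false;
        }
    }
    // Only trailing '%' wildcards may remain unmatched.
    while (p < pattern.size() && pattern[p] == '%') {
        ++p;
    }
    return p == pattern.size();
}
```

With this sketch, `like_match("a%", "abc")` and `like_match("a_c", "abc")` return true, while `like_match("a_c", "abbc")` returns false.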

Tests: unit(dev,debug)

Signed-off-by: Dejan Mircevski <dejan@scylladb.com>
2019-06-12 13:14:53 +03:00
Piotr Sarna
7b2de7ac5b tests: add case for partition key index and filtering
The test ensures that partition key index does not influence
filtering decisions for regular columns.

Ref #4539
2019-06-12 11:53:02 +02:00
Juliana Oliveira
43f92ae6d5 cql: functions: add min/max/count for boolean type
Explicitly add min/max/count functions and tests for
boolean type.

Tests: unit (release)

Signed-off-by: Juliana Oliveira <juliana@scylladb.com>
Message-Id: <20190612015215.GA2618@shenzou.localdomain>
2019-06-12 10:11:08 +03:00
Juliana Oliveira
fd83f61556 Add a warning for partitions with too many rows
This patch adds a warning option for users, for situations where the
row count may get bigger than initially designed. Through the
warning, users can be aware of possible data modeling problems.

The threshold is initially set to '100,000'.

Tests: unit (dev)

Message-Id: <20190528075612.GA24671@shenzou.localdomain>
2019-06-06 19:48:57 +03:00
Botond Dénes
2ccd8ee47c queue_reader: use the reader's buffer as the queue
The queue reader currently uses two buffers, a `_queue` that the
producer pushes fragments into and its internal `_buffer` where these
fragments eventually end up being served to the consumer from.
This double buffering is not necessary. Change the reader to allow the
producer to push fragments directly into the internal `_buffer`. This
complicates the code a little bit, as the producer logic of
`seastar::queue` has to be folded into the queue reader. On the other
hand this introduces proper memory consumption management, as well as
reduces the amount of consumed memory and eliminates the possibility of
outside code meddling with the queue. Another big advantage of the
change is that there is now an explicit way to communicate the EOS
condition, no need to push a disengaged `mutation_fragment_opt`.

The producer of the queue reader now pushes the fragments into the
reader via an opaque `queue_reader_handle` object, which has the
producer methods of `seastar::queue`.

Existing users of queue readers are refactored to use the new interface.

Since the code is more complex now, unit tests are added as well.
2019-06-04 13:39:26 +03:00
Paweł Dziepak
899ebe483a Merge "Fix empty counters handling in MC" from Piotr
"
Before this patchset empty counters were incorrectly persisted for
MC format. No value was written to disk for them. The correct way
is to still write a header that informs the counter is empty.

We also need to make sure that reading wrongly persisted empty
counters works because customers may have sstables with wrongly
persisted empty counters.

Fixes #4363
"

* 'haaawk/4363/v3' of github.com:scylladb/seastar-dev:
  sstables: add test for empty counters
  docs: add CorrectEmptyCounters to sstable-scylla-format
  sstables: Add a feature for empty counters in Scylla.db.
  sstables: Write header for empty counters
  sstables: Remove unused variables in make_counter_cell
  sstables: Handle empty counter value in read path
2019-05-23 13:05:53 +01:00
Piotr Jastrzebski
fdbf4f6f53 sstables: add test for empty counters
Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2019-05-23 10:10:24 +02:00
Dejan Mircevski
09acb32d35 tests/cql: Replace equery() with cquery_nofail()
Signed-off-by: Dejan Mircevski <dejan@scylladb.com>
2019-05-21 23:38:09 -04:00
Dejan Mircevski
a9849ecba7 tests: Add cquery_nofail() utility
Most tests await the result of cql_test_env::execute_cql().  Most
would also benefit from reporting errors with top-level location
included.

Signed-off-by: Dejan Mircevski <dejan@scylladb.com>
2019-05-21 23:28:14 -04:00
Dejan Mircevski
1d8bfc4173 tests: Drop redundant function
make_predicate_for_exception_message_fragment() is redundant now that
exception_utils has landed.

Signed-off-by: Dejan Mircevski <dejan@scylladb.com>
2019-05-21 23:28:14 -04:00
Avi Kivity
5a276d44af Merge "row_cache: Make invalidate() preemptible" from Tomasz
"
This patchset fixes reactor stalls caused by cache invalidation not being preemptible.
This becomes a problem when there are a lot of partitions in cache inside the invalidated range.

This affects high-level operations like nodetool refresh, table
truncation, repair and streaming.

Fixes #2683

The improvement on stalls was measured using tests/perf_row_cache_update:

  Before:

    Small partitions, no overwrites:
    invalidation: 339.420624 [ms], preemption: {count: 2, 99%: 0.008239 [ms], max: 339.422144 [ms]}
    Small partition with a few rows:
    invalidation: 191.855331 [ms], preemption: {count: 2, 99%: 0.008239 [ms], max: 191.856816 [ms]}
    Large partition, lots of small rows:
    invalidation: 0.959328 [ms], preemption: {count: 2, 99%: 0.008239 [ms], max: 0.961453 [ms]}

  After:

    Small partitions, no overwrites:
    invalidation: 400.505554 [ms], preemption: {count: 843, 99%: 0.545791 [ms], max: 0.502340 [ms]}
    Small partition with a few rows:
    invalidation: 306.352600 [ms], preemption: {count: 644, 99%: 0.545791 [ms], max: 0.506464 [ms]}
    Large partition, lots of small rows:
    invalidation: 0.963660 [ms], preemption: {count: 2, 99%: 0.009887 [ms], max: 0.963264 [ms]}

The maximum scheduling latency went down from 339 ms to 0.5 ms (task quota).

Tests:
  - unit (dev)
"

* tag 'cache-preemptible-invalidation-v2' of github.com:tgrabiec/scylla:
  row_cache: Make invalidate() preemptible
  row_cache: Switch _prev_snapshot_pos to be a ring_position_ext
  dht: Introduce ring_position_ext
  dht: ring_position_view: Take key by const pointer
  tests: perf_row_cache_update: Rename 'stall' to 'preemption' to avoid confusion
  tests: perf_row_cache_update: Report stalls around invalidation
2019-05-19 10:47:46 +03:00
Avi Kivity
8e19121e98 Merge "Implement simple selection alongside aggregation" from Dejan
"
Although CQL allows SELECT statements with both simple and aggregate
selectors, Scylla disallows them.  This patch removes that restriction
and ensures that mixed simple/aggregate selection works as specified
both with and without GROUP BY.

Tests: unit (dev)
"

* 'aggregate-and-simple-select-together' of https://github.com/dekimir/scylla:
  cql: Fix mixed selection with GROUP BY
  cql: Allow mixing of aggregate and simple selectors
2019-05-14 20:03:58 +03:00
Dejan Mircevski
f9b00a4318 cql: Fix mixed selection with GROUP BY
GROUP BY is currently supported by simple_selection, the class used
when all selectors are simple.  But when selectors are mixed, we use
selection_with_processing, which does not yet support GROUP BY.  This
patch fixes that.

It also adapts one testcase in filtering_test to the new behavior of
simple_selector.  The test currently expects the last value seen, but
simple_selector now outputs the first value seen.

(More details: the WHERE clause implicitly selects the columns it
references, and unit tests are forced to provide expected values for
these columns.  The user-visible result is unchanged in the test;
users never see the WHERE column values due to filtering in
cql::transport, outside unit tests.)

Signed-off-by: Dejan Mircevski <dejan@scylladb.com>
2019-05-14 12:50:39 -04:00
Dejan Mircevski
06e3b36164 cql: Allow mixing of aggregate and simple selectors
Scylla currently rejects SELECT statements with both simple and
aggregate selectors, but Cassandra allows them.  This patch brings
parity to Scylla.

Fixes #4447.

Tests: unit (dev)

Signed-off-by: Dejan Mircevski <dejan@scylladb.com>
2019-05-14 10:34:02 -04:00
Tomasz Grabiec
6159d5522d tests: Add test which verifies that schema digest stays the same
(cherry picked from commit 8019634dba)
2019-05-14 10:43:06 +02:00
Tomasz Grabiec
815295547d tests: Add sstables for the schema digest test
Generated by running test_schema_digest_does_not_change with
regenerate set to true.

(cherry picked from commit 1f2995c8c5)
2019-05-14 10:43:06 +02:00
Tomasz Grabiec
1530224377 dht: Introduce ring_position_ext
It's an owning version of ring_position_view.

Note that ring_position has a narrower domain than the
ring_position_view for historical reasons, so we cannot use that.
2019-05-13 19:30:50 +02:00
Tomasz Grabiec
ed697306be tests: perf_row_cache_update: Rename 'stall' to 'preemption' to avoid confusion 2019-05-13 19:18:20 +02:00
Tomasz Grabiec
b516e5fdbf tests: perf_row_cache_update: Report stalls around invalidation 2019-05-13 10:47:03 +02:00
Avi Kivity
fdace36fa5 Merge "Fixes for GCC9 build" from Paweł
"
This series contains fixes for GCC9 build, mostly corrections needed
after changes in libstdc++. With this series and a workaround for
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90415 (not included)
Scylla builds and passes unit tests with GCC9 (tested on Fedora 30,
development mode only).

Tests: unit(dev with gcc8 and gcc9).
"

* tag 'gcc9-fixes/v1' of https://github.com/pdziepak/scylla:
  tests/imr: add missing noexcept
  counters: bytes_view::pointer is not const pointer
  imr/fundamental: use bytes_view::const_pointer for const pointer
2019-05-09 21:51:24 +03:00
Paweł Dziepak
96eec203bd tests/imr: add missing noexcept
The concepts require that serialisers passed to the IMR are noexcept.
GCC9 started verifying that.
2019-05-09 17:38:24 +01:00
Dejan Mircevski
e4ec89473e tests: Cover indexing errors in frozen collections
Add new test cases:
- disallow creating a non-FULL index on frozen collections
- disallow repeated creation of a FULL index on frozen collections
- disallow FULL indexes on non-frozen collections
- disallow referencing frozen-map entries in the WHERE clause

Also add error-message expectations to existing test cases.

Fixes #3654.

Tests: unit (dev)

Signed-off-by: Dejan Mircevski <dejan@scylladb.com>
Message-Id: <20190509025806.124499-1-dejan@scylladb.com>
2019-05-09 15:25:11 +03:00
Dejan Mircevski
4eeec4a452 tests: drop util.hh
The file tests/util.hh was somehow committed despite `git mv`ing it to
tests/exception_utils.hh.

Tests: unit (dev)

Signed-off-by: Dejan Mircevski <dejan@scylladb.com>
Message-Id: <20190508210203.106295-1-dejan@scylladb.com>
2019-05-09 14:45:33 +03:00
Avi Kivity
a86fdeb02b Merge "Implement GROUP BY" from Dejan
"
Cassandra has supported GROUP BY in SELECT statements since 2016
(v3.10), while ScyllaDB currently treats it as a syntax error.  To
achieve parity with Cassandra in this important bit of functionality,
this patch adds full support for GROUP BY, from parsing to validation
to implementation to testing.
"

* 'groupby-implPP' of https://github.com/dekimir/scylla:
  Implement grouping in selection processing
  Propagate GROUP BY indices to result_set_builder
  Process GROUP BY columns into select_statement
  Parse GROUP BY clause, store column identifiers
2019-05-08 18:35:12 +03:00
Dejan Mircevski
d51e4a589d Implement grouping in selection processing
Make result_set_builder obey its _group_by_cell_indices by recognizing
group boundaries and resetting the selectors.

Also make simple_selectors work correctly when grouping.
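
The boundary-and-reset idea can be sketched on a toy row type. This is not `result_set_builder`; `toy_row`, `toy_group_result` and `group_rows` are hypothetical. It assumes rows arrive already ordered by group key and that a simple selector keeps the first value seen in each group, as a later commit in this log describes.

```cpp
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical row: one grouping column and one selected column.
struct toy_row {
    int group_key;
    int value;
};

// One output row per group: a "simple" selection (first value seen)
// alongside an aggregate (row count).
struct toy_group_result {
    int group_key;
    int first_value;
    std::size_t count;
};

// Assumes rows arrive ordered by group_key, so a change in the key marks
// a group boundary.
std::vector<toy_group_result> group_rows(const std::vector<toy_row>& rows) {
    std::vector<toy_group_result> out;
    std::optional<toy_group_result> current;
    for (const toy_row& row : rows) {
        if (current && current->group_key != row.group_key) {
            // Boundary reached: flush the finished group, reset the selectors.
            out.push_back(*current);
            current.reset();
        }
        if (!current) {
            current = toy_group_result{row.group_key, row.value, 0};
        }
        ++current->count;
    }
    if (current) {
        out.push_back(*current); // flush the last group
    }
    return out;
}
```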

Fixes #2206.

Signed-off-by: Dejan Mircevski <dejan@scylladb.com>
2019-05-08 11:05:36 -04:00
Dejan Mircevski
274a77f45e Process GROUP BY columns into select_statement
Validate raw GROUP BY identifiers and translate them into
a select_statement member.

Signed-off-by: Dejan Mircevski <dejan@scylladb.com>
2019-05-08 10:10:10 -04:00