Commit Graph

308 Commits

Author SHA1 Message Date
Piotr Dulikowski
6c5c745e25 cdc: add cdc log schema test 2020-03-21 07:33:35 +01:00
Pekka Enberg
6b2cd1bd7d Revert "db::commitlog: Don't write trailing zero block unless needed"
This reverts commit 0b34d88957. According
to Rafael Avila de Espindola:

"I have bisected the recent failures [in commitlog_test] on next to this
 patch."
2020-03-20 22:30:58 +02:00
Piotr Sarna
62c34a9085 cql: fix qualifying indexed columns for filtering
When qualifying columns to be fetched for filtering, we also check
if the target column is not used as an index - in which case there's
no need of fetching it. However, the check was incorrectly assuming
that any restriction is eligible for indexing, while it's currently
only true for EQ. The fix makes a more specific check and contains
many dynamic casts, but these will hopefully we gone once our
long planned "restrictions rewrite" is done.
This commit comes with a test.

Fixes #5708
Tests: unit(dev)
2020-03-19 10:34:16 +02:00
Rafael Ávila de Espíndola
3c2851aafc test: Make sure auth_service is always stopped
An exception thrown after the start of auth_service and before
init_server_without_the_messaging_service_part returns would cause the
sharded<auth_service> destructor to assert.

Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
Message-Id: <20200317002051.117832-2-espindola@scylladb.com>
2020-03-18 20:17:55 +01:00
Avi Kivity
c766f50491 Merge "Split some unit tests into smaller pieces" from Pavel E
"
The debug mode unit tests take ~half-an-hour to complete. Here's
the tests run-times top list

Test:					Time (seconds):
            ... steady tail goes here ...
test/boost/user_function_test		496
test/boost/row_cache_test		502
test/boost/view_schema_test		932
test/boost/cql_query_test		997
test/boost/mutation_reader_test		1048
test/boost/sstable_mutation_test	1417
test/boost/secondary_index_test		1468

Splitting the spike (top-5) is the primary goal. However, the
distribution of test-cases in 3 of those tests is also _very_
non-uniform, so just cutting it into equal parts doesn't work.
For example, the test_index_with_paging from the slowest one
takes ~14 minutes on its own and is the slowest test-case out
there.

So the set does this:

- moves the champion test_index_with_paging into separate file
- detaches the most heavy parts from sstable_mutation_test and
  mutation_reader_test into own tests too. The resulting split
  is still non-uniform, but it's 4 tests that run notably less
  than the 14 minutes record each
- splits the cql_query_test and view_schema_test into several
  parts in a wildcard manner to run out of the 14 min threshold
- moves some shared code into lib/

As the result, the debug mode test run takes 14.5 minutes =)
which is almost 2 times faster than it was. The dev mode run
time is not affected noticeably.

Test: well, unit(debug) and unit(dev)
"

* 'br-split-unit-tests-3-next' of https://github.com/xemul/scylla:
  test: Split view_schema_test
  test: Split cql_query_test
  test: Split mutation_reader_test
  test: Split sstable_mutation_test
  test: Split secondary_index test
2020-03-18 12:19:32 +02:00
Tomasz Grabiec
488482c55a Merge "lwt: ensure unqualified SELECT works with SERIAL cl" from Kostja
Ensure unqualified SELECT throws an appropriate exception with
SERIAL consistency level.
Since such query touches multiple partitions, we don't support it
in SERIAL mode.

Branch URL:
https://github.com/kostja/scylla/tree/gh-6016-crash-lwt-select
2020-03-17 17:24:06 +01:00
Konstantin Osipov
4978bb513d test: add a test case for SERIAL read consistency
Pass custom query options to execute_prepared and
add a test case for custom SERIAL consistency.
2020-03-17 18:58:12 +03:00
Pavel Emelyanov
86c712a340 test: Split view_schema_test
Detach *partition_key* and *clustering_key* ones into own files.
The resultint 2 tests run ~4 minutes each, the leftover ones
complete within 11 minutes. The same -- the goal to run out of
14 minutes is reached, further splitting needs more thinking
than just wildcarding.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2020-03-16 20:27:45 +03:00
Pavel Emelyanov
e848d63510 test: Split cql_query_test
This detaches *like_operator*, *group_by*, *functions*
and *large* cases into own files. The split is not
uniform -- the resulting 4 tests run less that 3 minutes
each,  what's left in the origin runs ~11 minutes. But
since the goal was to get out of 14 minutes threshold
and this file contains 126 cases (the champion) so I
just did "wildcard" selection that worked.

It also required moving require_rows() helpers into a
local header.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2020-03-16 20:27:45 +03:00
Pavel Emelyanov
3fbd88b226 test: Split mutation_reader_test
Detach test_multishard_combining_reader_as_mutation_source into
individual file.

This particular test runs ~13 minutes. What's left in the origin
completes a bit faster.

The split also requires moving the reader_lifecycle_policy and
the dummy_partitioner into lib/

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2020-03-16 20:27:44 +03:00
Pavel Emelyanov
3577fa2bb8 test: Split sstable_mutation_test
Detach test_schema_changes and test_sstable_conforms_to_mutation_source
into individual files. These two take ~10 minutes each, what's left in
origin finishes within 4 minutes alltogether.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2020-03-16 20:26:34 +03:00
Pavel Emelyanov
5b86f4be9a test: Split secondary_index test
Detach test_index_with_paging into individual file.

This particular test-case is the longest one in the sute,
it takes ~14 minutes to run, further splitting of this
test is pointless (for now) and all subsequent splits in
this set just make the resulting times less than this.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2020-03-16 20:26:34 +03:00
Avi Kivity
342c967b6a Merge "Introduce compacting reader" from Botond
"
Allow adding compacting to any reader pipeline. The intended users are
streaming and repair, with the goal to prevent wasting transfer
bandwidth with data that is purgeable.
No current user in the tree.

Tests: unit(dev), mutation_reader_test.compacting_reader_*(debug)
"

* 'compacting-reader/v3' of https://github.com/denesb/scylla:
  test: boost/mutation_reader_test: add unit test for compacting_reader
  test: lib/flat_mutation_reader_assertions: be more lenient about empty mutations
  test: lib/mutation_source_test: make data compaction friendly
  test: random_mutation_generator: add generate_uncompactable mode
  mutation_reader: introduce compacting_reader
2020-03-16 16:41:50 +02:00
Botond Dénes
837b79c265 test: boost/mutation_reader_test: add unit test for compacting_reader 2020-03-16 13:58:13 +02:00
Botond Dénes
3b482af33d test: lib/flat_mutation_reader_assertions: be more lenient about empty mutations
When expecting a mutation that compacts to an empty one, allow it to be
not produced at all. After all, compaction normally doesn't even emits
empty partitions.
2020-03-16 13:58:13 +02:00
Botond Dénes
1ab45e15a0 test: lib/mutation_source_test: make data compaction friendly
Currently the mutation source test suite may generate data that is
compactable. This poses a problem for the next patch, where we want to
use it to test `compacting_reader` a reader which compacts data as it
reads it. When the input is compactable, this will introduce artificial
differences, failing the tests.
To allow also testing such readers, make sure data is not compactable,
i.e. compacting it will not change it.
The goal of the mutation source test suite is not to exercise compaction
logic, so this will not take anything away from its value.
2020-03-16 13:58:13 +02:00
Botond Dénes
c4fab16723 test: random_mutation_generator: add generate_uncompactable mode
The random mutation generator currently generates data and tombstones
with random timestamps selected from a pre-determined range. This
results in mutations where tombstones often cover each other and data.
There is nothing wrong with this, as this is how real data is too.
However for certain tests this is problematic, as compacting the
mutations will result in a different mutations. To cater for these users
too, introduce a `generate_uncompactable` option. When set to `yes`, the
generated mutations will be uncompactable, i.e. no tombstone will cover
lower-level tombstones and no tombstone will cover data. The mutations
will not change after compacted.
2020-03-16 13:58:13 +02:00
Nadav Har'El
35d95d6887 merge: Add postimage implementation
Merged pull request https://github.com/scylladb/scylla/pull/5996 from
Calle Wilund:

Fixes #4992

Implements post-image support by synthesizing it from
pre-image + delta.

Post-image data differs from the delta data in two ways:

1.) It merges non-atomics into an actual result value
2.) It contains all columns of the row, not just
those affected by the update.

For a non-atomic field, the post-image value of a column
is either the pre-image or the delta (maybe null)

Tested by adding post-image checks to pre-image test
and collection/udt tests
2020-03-16 13:42:07 +02:00
Calle Wilund
0a3383c090 cdc: Add postimage implementation
Fixes #4992

Implements post-image support by synthesizing it from
pre-image + delta.

Post-image data differs from the delta data in two ways:

1.) It merges non-atomics into an actual result value
2.) It contains _all_ columns of the row, not just
    those affected by the update.

For a non-atomic field, the post-image value of a column
is either the pre-image or the delta (maybe null)

Tested by adding post-image checks to pre-image test
and collection/udt tests
2020-03-16 09:21:06 +00:00
Avi Kivity
ee9df91a76 Merge "Allow setting partitioner per table" from Piotr
"
This PR makes it possible to enable the usage of different partitioner for each table. If no table-specific partitioner is set for a given table then a default partitioner is used.

The PR is composed of the following parts:

 - Introduction of schema::get_partitioner that still returns dht::global_partitioner
 - Replacement of all the usage of dht::global_partitioner with schema::get_partitioner
 - Making it possible to set table-specific partitioner in a schema_builder
 - Remove all the places that were setting default partitioner except for main.cc (mostly tests)
 - Move default partitioner from i_partitioner to schema.cc and hide it from the rest of the codebase
 - Remove dht::global_partitioner

After this PR there's no such thing as global partitioner at all. There is only a default partitioner but it still has to be accessed through schema::get_partitioner.

There are some intermediate states in which i_partitioner is stored as shared_ptr in the schema but the final version keeps it by const&.

The PR does not enable per table partitioner end-to-end. Just the internals of the single node are covered. I still have to deal with:

 - Making sure a table has the same partitioner on each node
 - Allowing user to set up a table-specific partitioner on table
 - Signal driver about what partitioner is used by a given table
 - Persist partitioner info for each table that does not use default partitioner.
Fixes #5493

Tests: unit(dev, release, debug), dtest(byo)
"

* 'per_table_partitioner' of https://github.com/haaawk/scylla:
  schema: drop optional from _partitioner field
  make_multishard_combining_reader: stop taking partitioner
  split_range_to_single_shard: stop taking partitioner as argument
  tests: remove unused murmur3 includes
  partitioner: move default_partitioner to schema.cc
  partitioner: hide dht::default_partitioner
  schema: include partitioner name in scylla tables mutation
  schema: make it possible to set custom partitioner
  scylla_tables: add partitioner column
  schema_features: add PER_TABLE_PARTITIONERS feature
  features: add PER_TABLE_PARTITIONERS feature
2020-03-16 11:13:47 +02:00
Rafael Ávila de Espíndola
69874f4330 feature_service: Remove default constructor
This makes user that feature_config_from_db_config is used for both
tests and main.cc.

Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
Message-Id: <20200312153453.37282-2-espindola@scylladb.com>
2020-03-16 11:01:15 +02:00
Piotr Jastrzebski
924ed7bb1c make_multishard_combining_reader: stop taking partitioner
The function already takes schema so there's no need
for it to take partitioner. It can be obtained using
schema::get_partitioner

Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2020-03-15 10:25:20 +01:00
Piotr Jastrzebski
4b7fb323c3 split_range_to_single_shard: stop taking partitioner as argument
The function already takes schema so we don't need partitioner.

Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2020-03-15 10:25:20 +01:00
Piotr Jastrzebski
f99fd35f53 tests: remove unused murmur3 includes
Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2020-03-15 10:25:20 +01:00
Piotr Jastrzebski
7064f6b831 partitioner: hide dht::default_partitioner
Remove last usage of this global outside i_partitioner.cc
and hide it inside the compilation unit.

Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2020-03-15 10:25:20 +01:00
Nadav Har'El
635e6d887c materialized views: fix corner case of view updates used by Alternator
While CQL does not allow creation of a materialized view with more than one
base regular column in the view's key, in Alternator we do allow this - both
partition and clustering key may be a base regular column. We had a bug in
the logic handling this case:

If the new base row is missing a value for *one* of the view key columns,
we shouldn't create a view row. Similarly, if the existing base row was
missing a value for *one* of the view key columns, a view row does not
exist and doesn't need to be deleted.  This was done incorrectly, and made
decisions based on just one of the key columns, and the logic is now
fixed (and I think, simplified) in this patch.

With this patch, the Alternator test which previously failed because of
this problem now passes. The patch also includes new tests in the existing
C++ unit test test_view_with_two_regular_base_columns_in_key. This tests
was already supposed to be testing various cases of two-new-key-columns
updates, but missed the cases explained above. These new tests failed
badly before this patch - some of them had clean write errors, others
caused crashes. With this patch, they pass.

Fixes #6008.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20200312162503.8944-1-nyh@scylladb.com>
2020-03-15 07:57:33 +01:00
Avi Kivity
07ddbf6e54 Merge "Reduce our dependence on sstring" from Rafael
"
It doesn't look like we will be able to switch to std::string just
yet, but when it is not too inconvenient we can try to reduce our
dependence so that attempting the switch again in the future is
easier.
"

* 'espindola/sstring-api' of https://github.com/espindola/scylla:
  redis: Use scattered_message::append(std::string_view)
  everywhere: Use uninitialized_string instead of sstring::initialized_later
  compressor: Add an explicit cast to const sstring&
  everywhere: Be more explicit that we don't want std::make_shared
  cql3: Don't use sstring::reset
  everywhere: Don't assume sstring::begin() and sstring::end() are pointers
2020-03-14 16:29:42 +02:00
Piotr Sarna
8d2555673f test: add a simple test for synchronous local view updates
With synchronous local view updates enabled, local materialized views
can be queried right after base table insertions, without the risk
of reading stale values.
2020-03-11 09:15:57 +01:00
Rafael Ávila de Espíndola
80d969ce31 everywhere: Use uninitialized_string instead of sstring::initialized_later
This is just a trivial wrapper over initialized_later when using
sstring, but also works when std::string is used.

Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
2020-03-10 13:17:49 -07:00
Rafael Ávila de Espíndola
caef2ef903 everywhere: Don't assume sstring::begin() and sstring::end() are pointers
If we switch to using std::string we have to handle begin and end
returning iterators.

Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
2020-03-10 13:13:48 -07:00
Juliusz Stasiewicz
3cc3233281 test/cdc: test that LWT generates CDC logs
Tests #5952
Refs #5869
2020-03-10 08:33:49 +01:00
Raphael S. Carvalho
899bb230e2 sstable_resharding_test: fix sstable_resharding_strategy_tests with odd smp count
leveled_compaction_strategy_strategy::get_resharding_jobs() returns compaction
jobs, each containing at most smp::count ssts, so calculation is wrong if
smp count is an odd number.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Acked-by: Benny Halevy <bhalevy@scylladb.com>
Message-Id: <20200305161003.14424-1-raphaelsc@scylladb.com>
2020-03-09 17:52:53 +02:00
Avi Kivity
8af6dabbf0 Merge "Decouple cql_config from storage_service" from Pavel E
"
The cql_configu is needed by storage_service to feed it to
thrift/transport servers. These servers, in turn, put the
config onto query_options. The final goal of this config
reference is the guts of query_processor (but currently it's
only used by restrictions)

This way is rather long and confusing. It seems more natural
to keep the cql_config on it's main "user" -- query processor.

This patch set does so. However, in order to push the config
into its current usage places a huge refactoring is needed --
most of the classes in cql3/statements and cql3/restrictions.
It's much more handy to contunue keeping it via query_options,
so the query_processor is equipped with the method to return
the reference on the config to those initializing query_options.

Tests: unit(debug)
"

* 'br-clean-client-services-from-cql-config-2' of https://github.com/xemul/scylla:
  storage_service: Forget cql_config
  transport: Forget cql_config
  thrift: Forget cql_config
  query_processor: Carry reference on cql_config
2020-03-09 15:06:59 +02:00
Piotr Dulikowski
5f652e58c1 cdc: allow dropping manually created tables with cdc log suffix
The is_log_for_some_table function incorrectly assumed that
database::find_schema would return a null pointer in case the queried
schema does not exist. This patch fixes that, and now this function
checks for existence of the schema using database::has_schema.

Tests: unit(dev)
2020-03-09 12:17:13 +01:00
Pavel Emelyanov
0298a6270e storage_service: Forget cql_config
It needs the config purely to feed one into thrift/transport
server, since the latter two no longer needs one, neither does
the former.

As a nice side effect -- some tests no longer have to carry
the cql_config on board.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2020-03-09 11:58:06 +03:00
Pavel Emelyanov
0a9a5a2dd7 query_processor: Carry reference on cql_config
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2020-03-09 11:57:28 +03:00
Konstantin Osipov
ac6f64a885 locator: correctly select endpoints if RF=0
SimpleStrategy creates a list of endpoints by iterating over the set of
all configured endpoints for the given token, until we reach keyspace
replication factor.
There is a trivial coding bug when we first add at least one endpoint
to the list, and then compare list size and replication factor.
If RF=0 this never yields true.
Fix by moving the RF check before at least one endpoint is added to the
list.
Cassandra never had this bug since it uses a less fancy while()
loop.

Fixes #5962
Message-Id: <20200306193729.130266-1-kostja@scylladb.com>
2020-03-08 16:53:01 +02:00
Calle Wilund
0b34d88957 db::commitlog: Don't write trailing zero block unless needed
Fixes #5899

When terminating (closing) a segment, we write a trailing block
of zero so reader can have an empty region after last used chunk
as end marker. This is due to using recycled, pre-allocated
segments with potentially non-zero data extending over the point
where we are ending the segment (i.e. we are not fully filling
the segment due to a huge mutation or similar).

However, if we reach end of segment writing the final block
(typically many small mutations), the file will end naturally
after the data written, and any trailing zero block would in fact
just extend the file further. While this will only happen once per
segment recycled (independent on how many times it is recycled),
it is still both slightly breaking the disk usage contract and
also potentially causing some disk stalls due to metadata changes
(though of course very infrequent).

We should only write trailing zero if we are below the max_size
file size when terminating

Adds a small size check to commitlog test to verify size bounds.
(Which breaks without the patch)

Message-Id: <20200226121601.15347-2-calle@scylladb.com>
2020-03-08 16:51:53 +02:00
Konstantin Osipov
b4b08be0e1 test: add a test case for rare replication configurations
Introduce a test which checks how different CQL features (DML, LWT,
MV) work when no replicas are available (e.g. because
they are all in an unavailable data center).
Specifically the test checks that when we SELECT with IN clause
and there are no available replicas, there is no crash (#5935).

Message-Id: <20200306192521.73486-3-kostja@scylladb.com>
2020-03-08 15:11:08 +02:00
Nadav Har'El
6febd4199e merge: cdc: on row delete, show the whole row as preimage
Merged pull request https://github.com/scylladb/scylla/pull/5980 by
Piotr Jastrzębski, based on https://github.com/scylladb/scylla/pull/5976
by Juliusz Stasiewicz:

"If base mutation has at least one row tombstone, its preimage log entry
displays all the base columns."

Fixes #5709

Tests: unit(dev)
2020-03-08 14:54:59 +02:00
Juliusz Stasiewicz
49f1a24472 tests/cdc: test preimage on row delete
Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2020-03-08 13:27:49 +01:00
Piotr Dulikowski
0e413efb48 cdc: correct static row preimage for case with no clustering row
In case a static and a clustering row is written at the same time, but
a clustering row with given key was not present, the preimage query was
incorrectly configured and no rows were returned. This resulted in an
empty preimage, while a preimage for static row should be present.

This patch fixes this and now the static row is correctly written to cdc
log in the case above.

Tests: unit(dev)
2020-03-08 09:25:45 +01:00
Piotr Sarna
395c7eeb98 Merge ' cdc: disallow creating nested cdc logs' from Piotr
This change disallows creating CDC log tables for already existing
CDC log tables. CDC logs nested in that way are not really useful
and do not work at the moment, therefore disallowing their creation
prevents confusion.

Fixes #5967
Tests: unit(dev)

* piodul/5967-disallow-nested-cdc-logs:
  cdc: disallow creating nested CDC logs
  cql_repl: register schema extensions
2020-03-08 09:22:59 +01:00
Piotr Sarna
be293523bd Merge 'Replace dht::global_partitioner() calls with...
... schema::get_partitioner and make schema::get_partitioner
return const&' from Piotr

Partitioners returned from get_partitioner are shared
and not supposed to be changed so let's use the type system
to enforce that.

dht::global_partitioner() is deprecated and will be removed
as soon as custom partitioners are implemented so it's best
to replace it with schema::get_partitioner.

Tests: unit(dev)

* hawk/global_partitioner_cleanup:
  schema: get_partitioner return const&
  compaction_manager: stop calling dht::global_partitioner()
  sstable_datafile_test: stop calling dht::global_partitioner()
2020-03-06 14:36:03 +01:00
Piotr Jastrzebski
54d24553bb schema: get_partitioner return const&
Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2020-03-06 13:33:53 +01:00
Piotr Jastrzebski
08ebf1f69d sstable_datafile_test: stop calling dht::global_partitioner()
Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2020-03-06 13:33:53 +01:00
Piotr Dulikowski
f317283578 cdc: disallow creating nested CDC logs
This change disallows creating CDC log tables for already existing CDC
log tables. CDC logs nested in that way are not really useful and do not
work at the moment, therefore disallowing their creation prevents
confusion.
2020-03-06 10:47:13 +01:00
Piotr Dulikowski
75284eb2a5 cql_repl: register schema extensions
Alternator and CDC, apart from enabling their experimental features,
need to have their schema extensions registered. This patch adds missing
registration of schema extensions to cql_repl, so that cql tests written
with Alternator or CDC in mind will properly work.
2020-03-06 10:31:07 +01:00
Piotr Sarna
d1db198211 Merge ' Allow repeated LIKE on same column' from Dejan
Fixes #5902 by making the LIKE restriction keep a vector
of matchers and apply them all to the column value.

Tests: unit (dev)

* dekimir/multiple-likes:
  cql3: Allow repeated LIKE on same column
  cql3: Forbid calling LIKE::values()
  cql3: Move LIKE::_last_pattern to matcher
2020-03-06 09:55:54 +01:00
Piotr Sarna
22798f7b7b locator: fix validating replication factor
In order to properly validate not only network topology strategy,
but also other strategies, the checks are moved straight to
validate_replication_factor().
Also, the test case is extended with a too long integer
and a check for SimpleStrategy replication factor.

Fixes #3801
Tests: unit(dev)

Message-Id: <e0c3c3c36c589e1d440c9708a6dce820c111b8da.1583483602.git.sarna@scylladb.com>
2020-03-06 10:39:34 +02:00