Commit Graph

280 Commits

Author SHA1 Message Date
Piotr Jastrzebski
7064f6b831 partitioner: hide dht::default_partitioner
Remove last usage of this global outside i_partitioner.cc
and hide it inside the compilation unit.

Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2020-03-15 10:25:20 +01:00
Piotr Sarna
8d2555673f test: add a simple test for synchronous local view updates
With synchronous local view updates enabled, local materialized views
can be queried right after base table insertions, without the risk
of reading stale values.
2020-03-11 09:15:57 +01:00
Juliusz Stasiewicz
3cc3233281 test/cdc: test that LWT generates CDC logs
Tests #5952
Refs #5869
2020-03-10 08:33:49 +01:00
Raphael S. Carvalho
899bb230e2 sstable_resharding_test: fix sstable_resharding_strategy_tests with odd smp count
leveled_compaction_strategy_strategy::get_resharding_jobs() returns compaction
jobs, each containing at most smp::count ssts, so calculation is wrong if
smp count is an odd number.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Acked-by: Benny Halevy <bhalevy@scylladb.com>
Message-Id: <20200305161003.14424-1-raphaelsc@scylladb.com>
2020-03-09 17:52:53 +02:00
Avi Kivity
8af6dabbf0 Merge "Decouple cql_config from storage_service" from Pavel E
"
The cql_configu is needed by storage_service to feed it to
thrift/transport servers. These servers, in turn, put the
config onto query_options. The final goal of this config
reference is the guts of query_processor (but currently it's
only used by restrictions)

This way is rather long and confusing. It seems more natural
to keep the cql_config on it's main "user" -- query processor.

This patch set does so. However, in order to push the config
into its current usage places a huge refactoring is needed --
most of the classes in cql3/statements and cql3/restrictions.
It's much more handy to contunue keeping it via query_options,
so the query_processor is equipped with the method to return
the reference on the config to those initializing query_options.

Tests: unit(debug)
"

* 'br-clean-client-services-from-cql-config-2' of https://github.com/xemul/scylla:
  storage_service: Forget cql_config
  transport: Forget cql_config
  thrift: Forget cql_config
  query_processor: Carry reference on cql_config
2020-03-09 15:06:59 +02:00
Piotr Dulikowski
5f652e58c1 cdc: allow dropping manually created tables with cdc log suffix
The is_log_for_some_table function incorrectly assumed that
database::find_schema would return a null pointer in case the queried
schema does not exist. This patch fixes that, and now this function
checks for existence of the schema using database::has_schema.

Tests: unit(dev)
2020-03-09 12:17:13 +01:00
Pavel Emelyanov
0298a6270e storage_service: Forget cql_config
It needs the config purely to feed one into thrift/transport
server, since the latter two no longer needs one, neither does
the former.

As a nice side effect -- some tests no longer have to carry
the cql_config on board.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2020-03-09 11:58:06 +03:00
Pavel Emelyanov
0a9a5a2dd7 query_processor: Carry reference on cql_config
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2020-03-09 11:57:28 +03:00
Konstantin Osipov
ac6f64a885 locator: correctly select endpoints if RF=0
SimpleStrategy creates a list of endpoints by iterating over the set of
all configured endpoints for the given token, until we reach keyspace
replication factor.
There is a trivial coding bug when we first add at least one endpoint
to the list, and then compare list size and replication factor.
If RF=0 this never yields true.
Fix by moving the RF check before at least one endpoint is added to the
list.
Cassandra never had this bug since it uses a less fancy while()
loop.

Fixes #5962
Message-Id: <20200306193729.130266-1-kostja@scylladb.com>
2020-03-08 16:53:01 +02:00
Calle Wilund
0b34d88957 db::commitlog: Don't write trailing zero block unless needed
Fixes #5899

When terminating (closing) a segment, we write a trailing block
of zero so reader can have an empty region after last used chunk
as end marker. This is due to using recycled, pre-allocated
segments with potentially non-zero data extending over the point
where we are ending the segment (i.e. we are not fully filling
the segment due to a huge mutation or similar).

However, if we reach end of segment writing the final block
(typically many small mutations), the file will end naturally
after the data written, and any trailing zero block would in fact
just extend the file further. While this will only happen once per
segment recycled (independent on how many times it is recycled),
it is still both slightly breaking the disk usage contract and
also potentially causing some disk stalls due to metadata changes
(though of course very infrequent).

We should only write trailing zero if we are below the max_size
file size when terminating

Adds a small size check to commitlog test to verify size bounds.
(Which breaks without the patch)

Message-Id: <20200226121601.15347-2-calle@scylladb.com>
2020-03-08 16:51:53 +02:00
Konstantin Osipov
b4b08be0e1 test: add a test case for rare replication configurations
Introduce a test which checks how different CQL features (DML, LWT,
MV) work when no replicas are available (e.g. because
they are all in an unavailable data center).
Specifically the test checks that when we SELECT with IN clause
and there are no available replicas, there is no crash (#5935).

Message-Id: <20200306192521.73486-3-kostja@scylladb.com>
2020-03-08 15:11:08 +02:00
Nadav Har'El
6febd4199e merge: cdc: on row delete, show the whole row as preimage
Merged pull request https://github.com/scylladb/scylla/pull/5980 by
Piotr Jastrzębski, based on https://github.com/scylladb/scylla/pull/5976
by Juliusz Stasiewicz:

"If base mutation has at least one row tombstone, its preimage log entry
displays all the base columns."

Fixes #5709

Tests: unit(dev)
2020-03-08 14:54:59 +02:00
Juliusz Stasiewicz
49f1a24472 tests/cdc: test preimage on row delete
Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2020-03-08 13:27:49 +01:00
Piotr Dulikowski
0e413efb48 cdc: correct static row preimage for case with no clustering row
In case a static and a clustering row is written at the same time, but
a clustering row with given key was not present, the preimage query was
incorrectly configured and no rows were returned. This resulted in an
empty preimage, while a preimage for static row should be present.

This patch fixes this and now the static row is correctly written to cdc
log in the case above.

Tests: unit(dev)
2020-03-08 09:25:45 +01:00
Piotr Sarna
395c7eeb98 Merge ' cdc: disallow creating nested cdc logs' from Piotr
This change disallows creating CDC log tables for already existing
CDC log tables. CDC logs nested in that way are not really useful
and do not work at the moment, therefore disallowing their creation
prevents confusion.

Fixes #5967
Tests: unit(dev)

* piodul/5967-disallow-nested-cdc-logs:
  cdc: disallow creating nested CDC logs
  cql_repl: register schema extensions
2020-03-08 09:22:59 +01:00
Piotr Sarna
be293523bd Merge 'Replace dht::global_partitioner() calls with...
... schema::get_partitioner and make schema::get_partitioner
return const&' from Piotr

Partitioners returned from get_partitioner are shared
and not supposed to be changed so let's use the type system
to enforce that.

dht::global_partitioner() is deprecated and will be removed
as soon as custom partitioners are implemented so it's best
to replace it with schema::get_partitioner.

Tests: unit(dev)

* hawk/global_partitioner_cleanup:
  schema: get_partitioner return const&
  compaction_manager: stop calling dht::global_partitioner()
  sstable_datafile_test: stop calling dht::global_partitioner()
2020-03-06 14:36:03 +01:00
Piotr Jastrzebski
54d24553bb schema: get_partitioner return const&
Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2020-03-06 13:33:53 +01:00
Piotr Jastrzebski
08ebf1f69d sstable_datafile_test: stop calling dht::global_partitioner()
Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2020-03-06 13:33:53 +01:00
Piotr Dulikowski
f317283578 cdc: disallow creating nested CDC logs
This change disallows creating CDC log tables for already existing CDC
log tables. CDC logs nested in that way are not really useful and do not
work at the moment, therefore disallowing their creation prevents
confusion.
2020-03-06 10:47:13 +01:00
Piotr Dulikowski
75284eb2a5 cql_repl: register schema extensions
Alternator and CDC, apart from enabling their experimental features,
need to have their schema extensions registered. This patch adds missing
registration of schema extensions to cql_repl, so that cql tests written
with Alternator or CDC in mind will properly work.
2020-03-06 10:31:07 +01:00
Piotr Sarna
d1db198211 Merge ' Allow repeated LIKE on same column' from Dejan
Fixes #5902 by making the LIKE restriction keep a vector
of matchers and apply them all to the column value.

Tests: unit (dev)

* dekimir/multiple-likes:
  cql3: Allow repeated LIKE on same column
  cql3: Forbid calling LIKE::values()
  cql3: Move LIKE::_last_pattern to matcher
2020-03-06 09:55:54 +01:00
Piotr Sarna
22798f7b7b locator: fix validating replication factor
In order to properly validate not only network topology strategy,
but also other strategies, the checks are moved straight to
validate_replication_factor().
Also, the test case is extended with a too long integer
and a check for SimpleStrategy replication factor.

Fixes #3801
Tests: unit(dev)

Message-Id: <e0c3c3c36c589e1d440c9708a6dce820c111b8da.1583483602.git.sarna@scylladb.com>
2020-03-06 10:39:34 +02:00
Piotr Sarna
6df132436f cql3: disallow range deletions for specific columns
Range deletions of specific columns are not well-defined
(range tombstones cover entire rows) and are forbidden
in Cassandra, so we follow suit.
This commit comes with a simple test.

Fixes #5728
Tests: unit(dev)
Message-Id: <896264f5f5790b9f96fcc18655ac3248a6abf37a.1583424131.git.sarna@scylladb.com>
2020-03-06 10:04:05 +02:00
Piotr Sarna
5b7a35e02b network_topology_strategy: validate integers
In order to prevent users from creating a network topology
strategy instance with invalid inputs, it's not enough to use
std::stol() on the input: a string "3abc" still returns the number '3',
but will later confuse cqlsh and other drivers, when they ask for
topology strategy details.
The error message is now more human readable, since for incorrect
numeric inputs it used to return a rather cryptic message:
    ServerError: stol()
This commit fixes the issue and comes with a simple test.

Fixes #3801
Tests: unit(dev)
Message-Id: <7aaae83d003738f047d28727430ca0a5cec6b9c6.1583478000.git.sarna@scylladb.com>
2020-03-06 09:50:33 +02:00
Piotr Dulikowski
38b7f1ad45 unit tests: register cdc extension before tests
In the following commits, using cdc in tests will require registering
cdc extension explicitly in db config.
2020-03-05 16:11:20 +01:00
Piotr Dulikowski
6895b0e395 db::extensions: add shorthands for add_schema_extension
This abstract away a pattern used everywhere when adding a schema
extension.
2020-03-05 16:09:44 +01:00
Tomasz Grabiec
d5557023f6 Merge "Stop using BOOST_TEST_MESSAGE() in unit tests" from Kostja
Stop using BOOST_TEST_MESSAGE() in unit tests, it bloats test XML
output. Use Scylla logger instead.

Test: unit (debug, dev, release)
2020-03-05 13:27:30 +01:00
Kamil Braun
3200d415da cdc: use a single timeuuid value for a batch of changes
If a batch update is performed with a sequence of changes with a single
timestamp, they will now show up in CDC with a single timeuuid in the
`time` column, distinguished by different `batch_seq_no` values.

Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2020-03-05 12:32:57 +01:00
Konstantin Osipov
ac0717fb64 test: consistently use a global testlog object in all tests
Use test/lib/log.hh in all tests now that we have it.
2020-03-05 13:34:24 +03:00
Piotr Sarna
f21bd57058 Merge "cdc: log static rows correctly" from Piotr
Currently, writes to a static row in a base table are not reflected
at all in the corresponding cdc log. This patch causes such writes
to be properly logged.

Fixes: #5744
Tests: unit(dev)

* piodul/5744-handle-static-row-correctly-in-cdc:
  cdc_test: add tests for handling static row
  cdc: fix indentation in transformer::transform
  cdc: handle static rows separately in transformer::transform
  cdc: move process_cells higher (and fix captured variables)
  cdc: reduce dependencies on captured variables in process_cells
  cdc: fix preimage query for static rows
2020-03-05 10:42:15 +01:00
Konstantin Osipov
ff3f9cb7cf test: stop using BOOST_TEST_MESSAGE() for logging
We use boost test logging primarily to generate nice XML xunit
files used in Jenkins. These XML files can be bloated
with messages from BOOST_TEST_MESSAGE(), hundreds of megabytes
of build archives, on every build.

Let's use seastar logger for test logging instead, reserving
the use of boost log facilities for boost test markup information.
2020-03-05 11:38:11 +03:00
Piotr Dulikowski
204e204586 cdc: do not attempt to log empty mutations
It is possible to produce an empty mutation using CQL. For example, the
following query:

DELETE FROM ks.tbl WHERE pk = 0 AND ck < 1 AND ck > 2;

will attempt to delete from an empty range of rows. This is translated
to the following mutation:

{ks.tbl {key: pk{000400000000}, token:-3485513579396041028}
 {mutation_partition:
  static: cont=1 {row: },
  clustered: {}}}

Such mutation does not contain any timestamp, therefore it is difficult
to determine what timestamp was used while making the query. This is
problematic for CDC, because an entry in CDC log should be written with
the same timestamp as a part of the mutation.

Because an empty mutation does not modify the table in any way, we can
safely skip logging such mutations in CDC and still preserve the
ability to reconstruct the current state of the base table from full
CDC log.

Tests: unit(dev)
2020-03-05 08:32:54 +01:00
Piotr Dulikowski
e6751fad62 cdc_test: add tests for handling static row 2020-03-05 00:16:17 +01:00
Botond Dénes
8b908a9aba test: lib/mutation_source_test: log the name of the test-method
Most test-methods log a message with their names upon entering them.
This helps in identifying the test-method a failure happened in in the
logs. Two methods were missing this log line, so add it.

Signed-off-by: Botond Dénes <bdenes@scylladb.com>
Message-Id: <20200304155235.46170-1-bdenes@scylladb.com>
2020-03-04 18:16:21 +02:00
Nadav Har'El
f67a402c48 merge: Remove treewide dependency on boost/multiprecision
Merged patch series from Avi Kivity:

boost/multiprecision is a heavyweight library, pulling in 20,000 lines of code into
each header that depends on it. It is used by converting_mutation_partition_applier
and types.hh. While the former is easy to put out-of-line, the latter is not.

All we really need is to forward-declare boost::multiprecision::cpp_int, but that
is not easy - it is a template taking several parameters, among which are non-type
template parameters also defined in that header. So it's quite difficult to
disentangle, and fragile wrt boost changes.

This patchset introduces a wrapper type utils::multiprecision_int which _can_
be forward declared, and together with a few other small fixes, manages to
uninclude boost/multiprecision from most of the source files. The total reduction
in number of lines compiled over a full build is 324 * 23,227 or around 7.5
million.

Tests: unit (dev)
Ref #1

https://github.com/avikivity/scylla uninclude-boost-multiprecision/v1

Avi Kivity (5):
  converting_mutation_partition_applier: move to .cc file
  utils: introduce multiprecision_int
  tests: cdc_test: explicitly convert from cdc::operation to uint8_t
  treewide: use utils::multiprecision_int for varint implementation
  types: forward-declare multiprecision_int

 configure.py                             |   2 +
 concrete_types.hh                        |   2 +-
 converting_mutation_partition_applier.hh | 163 ++-------------
 types.hh                                 |  12 +-
 utils/big_decimal.hh                     |   3 +-
 utils/multiprecision_int.hh              | 256 +++++++++++++++++++++++
 converting_mutation_partition_applier.cc | 188 +++++++++++++++++
 cql3/functions/aggregate_fcts.cc         |  10 +-
 cql3/functions/castas_fcts.cc            |  28 +--
 cql3/type_json.cc                        |   2 +-
 lua.cc                                   |  38 ++--
 mutation_partition_view.cc               |   2 +
 test/boost/cdc_test.cc                   |   6 +-
 test/boost/cql_query_test.cc             |  16 +-
 test/boost/json_cql_query_test.cc        |  12 +-
 test/boost/types_test.cc                 |  58 ++---
 test/boost/user_function_test.cc         |   2 +-
 test/lib/random_schema.cc                |  14 +-
 types.cc                                 |  20 +-
 utils/big_decimal.cc                     |   4 +-
 utils/multiprecision_int.cc              |  37 ++++
 21 files changed, 627 insertions(+), 248 deletions(-)
 create mode 100644 utils/multiprecision_int.hh
 create mode 100644 converting_mutation_partition_applier.cc
 create mode 100644 utils/multiprecision_int.cc
2020-03-04 15:13:42 +02:00
Avi Kivity
3c772757c0 treewide: use utils::multiprecision_int for varint implementation
The goal is to forward-declare utils::multiprecision_int, something
beyond my capabilities for boost::multiprecision::cpp_int, to reduce
compile time bloat.

The patch is mostly search-and-replace, with a few casts added to
disambiguate conversions the compiler had trouble with.
2020-03-04 13:28:16 +02:00
Avi Kivity
874f65c58c tests: cdc_test: explicitly convert from cdc::operation to uint8_t
After the varint data type starts using the new multiprecision_int type,
this code fails to compile. I expect that somehow the conversion from enum
class to cpp_int was allowed to succeed, and we ended up with a data_value
of type varint. The tests succeeded because the serialized representation
happened to be the same.
2020-03-04 13:28:16 +02:00
Piotr Jastrzebski
354e3c34c8 cdc log: merge stream_id columns into a single column
Previously we had stream_id_1 and stream_id_2 columns
of type long each. They were forming a partition key.

In a new format we want a single stream_id column that
forms a partition key. To be able to still store two
longs, the new column will have type blob and its value
will be concatenated bytes of two longs that
partition key is composed of.

We still want partition key to logically be two longs
because those two values will be used by a custom partitioner
later once we implement it.

Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2020-03-04 13:27:48 +02:00
Tomasz Grabiec
477dadc062 Merge "cql_test_env: Drop a few shared_ptr<sharded<...>>" from Rafael
I found that a few variables in cql_test_env were wrapping sharded in
shared_ptr for no apparent reason. These patches convert them to plain
sharded<...>.
2020-03-04 11:31:52 +01:00
Avi Kivity
906784639d Merge "Clean sstables from using global objects" from Pavel E
"
This set cleans sstable_writer_config and surrounding sstables
code from using global storage_ and feature_ service-s and database
by moving the configuration logic onto sstables_manager (that
was supposed to do it since eebc3701a5).

Most of the complexity is hidden around sstable_writer_config
creation, this set makes the sstables_manager create this object
with an explicit call. All the rest are consequences of this change.

Tests: unit(debug), manual start-stop
"

* 'br-clean-sstables-manager-2' of https://github.com/xemul/scylla:
  sstables: Move get_highest_supported_format
  sstables: Remove global get_config() helper
  sstables: Use manager's config() in .new_sstable_component_file()
  sstable_writer_config: Extend with more db::config stuff
  sstables_manager: Don't use global helper to generate writer config
  sstable_writer_config: Sanitize out some features fields initialization
  sstable_writer_config: Factor out some field initialization
  sstables: Generate writer config via manager only
  sstables: Keep reference on manager
  test: Re-use existing global sstables_manager
  table: Pass sstable_writer_config into write_memtable_to_sstable
2020-03-03 18:33:01 +02:00
Kamil Braun
5de9b5b566 cdc: add change splitting test
Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2020-03-03 13:31:19 +01:00
Kamil Braun
5c4a237c12 cdc: split the mutation before passing it into transform
If the mutation contains separate logical changes (e.g. with different
timestamps and/or ttls), it will be split into multiple mutations, each
passed into transform.
2020-03-03 13:17:51 +01:00
Nadav Har'El
359b32fb63 merge: CDC: implement new column format and naming
Merged pull request https://github.com/scylladb/scylla/pull/5910 by
Calle Wilund:

Rename metadata and data columns according to new spec

Also use transformation methods for names in all code + tests
to make switching again easier

Break up data column tuple

Data column is now pure frozen original type.

    If column is deleted (set to null), a metadata column cdc$deleted_ is set to true, to distinguish null column == not involved in row operation
    For non-atomic collections, a cdc$deleted_elements_ column is added, and when removing elements from collection this is where they are shown.
    For non-atomic assign, the "cdc$deleted_" is true, and is set to new value.

column_op removed.
2020-03-03 12:36:16 +02:00
Calle Wilund
ed0d1c5fe2 cdc: Break up data column tuple
According to "new" spec:

Data column is now pure frozen original type.

If column is deleted (set to null), a metadata column
cdc$deleted_<name> is set to true, to distinguish
null column == not involved in row operation

For non-atomic collections, a cdc$deleted_elements_<name>
column is added, and when removing elements from collection
this is where they are shown.

For non-atomic assign, the "cdc$deleted_<name>" is true,
and <name> is set to new value.

column_op removed.
2020-03-03 08:52:20 +00:00
Rafael Ávila de Espíndola
28e59566a8 cql_test_env: Don't use a shared_ptr for token_metadata
Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
2020-03-02 13:52:23 -08:00
Rafael Ávila de Espíndola
47f8a63279 cql_test_env: Don't use a shared_ptr for migration_notifier
Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
2020-03-02 13:51:45 -08:00
Rafael Ávila de Espíndola
ed0c4d2801 cql_test_env: Don't use a shared_ptr for view_update_generator
Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
2020-03-02 13:51:25 -08:00
Rafael Ávila de Espíndola
ff2edd15d4 cql_test_env: Don't use a shared_ptr for view_builder
Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
2020-03-02 13:50:48 -08:00
Rafael Ávila de Espíndola
9375478803 cql_test_env: Don't use a shared_ptr for feature_service
Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
2020-03-02 13:50:25 -08:00
Rafael Ávila de Espíndola
5e87562f33 cql_test_env: Don't use a shared_ptr for database
Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
2020-03-02 13:50:08 -08:00