3901 Commits

Author SHA1 Message Date
Petr Gusev
0d443dfd16 modification_statement: fix LWT insert crash if clustering key is null
PR #9314 fixed a similar issue with regular insert statements
but missed the LWT code path.

It's expected behaviour of
modification_statement::create_clustering_ranges to return an
empty range in this case, since possible_lhs_values it
uses explicitly returns empty_value_set if it evaluates rhs
to null, and it has a comment about it (All NULL
comparisons fail; no column values match.) On the other hand,
all components of the primary key are required to be set,
this is checked at the prepare phase, in
modification_statement::process_where_clause. So the only
problem was that modification_statement::execute_with_condition
was not expecting an empty clustering_range in case of
a null clustering key.

Fixes: #11954
2022-11-22 16:45:16 +04:00
Botond Dénes
f3eecb47f6 Merge 'Optimize cleanup compaction get ranges for invalidation' from Benny Halevy
Take advantage of the facts that both the owned ranges
and the initial non_owned_ranges (derived from the set of sstables)
are deoverlapped and sorted by start token to turn
the calculation of the final non_owned_ranges from
quadratic to linear.
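
As a rough illustration of the linear pass described above, here is a minimal Python sketch. It models ranges as half-open `(start, end)` integer pairs, which is an assumption made for brevity; it is not the actual `dht::subtract_ranges` code, whose signature and token types are not shown in this log.

```python
def subtract_ranges(ranges, to_subtract):
    """Return the parts of `ranges` not covered by `to_subtract`.

    Both inputs must be sorted by start and non-overlapping. Ranges are
    half-open (start, end) integer pairs (an illustrative simplification).
    """
    result = []
    j = 0  # index into to_subtract; it never moves backwards
    for start, end in ranges:
        cur = start
        # Skip subtracted ranges that end before the current range starts.
        while j < len(to_subtract) and to_subtract[j][1] <= cur:
            j += 1
        k = j
        while k < len(to_subtract) and to_subtract[k][0] < end:
            s, e = to_subtract[k]
            if s > cur:
                result.append((cur, s))  # uncovered gap before this subtraction
            cur = max(cur, e)
            if e > end:
                break  # this subtracted range may also cover the next range
            k += 1
        if cur < end:
            result.append((cur, end))
        j = k
    return result
```

Because `j` only ever advances, the total work is proportional to the combined length of the two lists, instead of comparing every range against every subtracted range.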

Fixes #11922

Closes #11903

* github.com:scylladb/scylladb:
  dht: optimize subtract_ranges
  compaction: refactor dht::subtract_ranges out of get_ranges_for_invalidation
  compaction_manager: needs_cleanup: get first/last tokens from sstable decorated keys
2022-11-22 06:45:01 +02:00
Avi Kivity
bf2e54ff85 Merge 'Move deletion log code to sstable_directory.cc' from Pavel Emelyanov
In order to support different storage kinds for sstable files (e.g. S3), all the places that manipulate files on a POSIX filesystem need to be localized so that a custom storage can implement them in its own way. This set moves the deletion log manipulations to sstable_directory.cc, which already "knows" that it works over a directory.

Closes #12020

* github.com:scylladb/scylladb:
  sstables: Delete log file in replay_pending_delete_log()
  sstables: Move deletion log manipulations to sstable_directory.cc
  sstables: Open-code delete_sstables() call
  sstables: Use fs::path in replay_pending_delete_log()
  sstables: Indentation fix after previous patch
  sstables: Coroutinize replay_pending_delete_log
  sstables: Read pending delete log with one line helper
  sstables: Dont write pending log with file_writer
2022-11-21 21:22:59 +02:00
Benny Halevy
8b81635d95 compaction: refactor dht::subtract_ranges out of get_ranges_for_invalidation
The algorithm is generic and can be used elsewhere.

Add a unit test for the function before it gets
optimized in the following patch.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-11-21 15:48:26 +02:00
Pavel Emelyanov
bdc47b7717 sstables: Move deletion log manipulations to sstable_directory.cc
The deletion log concept uses the fact that files are on a POSIX
filesystem. Support for another storage type will have to reimplement
this place, so keep the FS-specific code in _directory.cc file.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-11-21 13:16:21 +03:00
Nadav Har'El
757d2a4c02 test/alternator: un-xfail a test which passes on modern Python
We had an xfailing test that reproduced a case where Alternator tried
to report an error when the request was too long, but the boto library
didn't see this error and threw a "Broken Pipe" error instead. It turns
out that this wasn't a Scylla bug but rather a bug in urllib3, which
overzealously reported a "Broken Pipe" instead of trying to read the
server's response. It turns out this issue was already fixed in
   https://github.com/urllib3/urllib3/pull/1524

and now, on modern installations, the test that used to fail now passes
and reports "XPASS".

So in this patch we remove the "xfail" tag, and skip the test if
running an old version of urllib3.

Fixes #8195

Closes #12038
2022-11-21 08:10:10 +02:00
Avi Kivity
994603171b Merge 'Add validator to the mutation compactor' from Botond Dénes
Fragment reordering and fragment dropping bugs have been plaguing us since forever. To fight them we added a validator to the sstable write path to prevent really messed up sstables from being written.
This series adds validation to the mutation compactor. This will cover reads and compaction among others, hopefully ridding us of such bugs on the read path too.
This series fixes some benign looking issues found by unit tests after the validator was added, although how benign a producer emitting two partition-ends is depends entirely on how the consumer reacts to it, so no such bug is actually benign.

Fixes: https://github.com/scylladb/scylladb/issues/11174

Closes #11532

* github.com:scylladb/scylladb:
  mutation_compactor: add validator
  mutation_fragment_stream_validator: add a 'none' validation level
  test/boost/mutation_query_test: test_partition_limit: sort input data
  querier: consume_page(): use partition_start as the sentinel value
  treewide: use ::for_partition_end() instead of ::end_of_partition_tag_t{}
  treewide: use ::for_partition_start() instead of ::partition_start_tag_t{}
  position_in_partition: add for_partition_{start,end}()
2022-11-20 20:33:26 +02:00
Avi Kivity
779b01106d Merge 'cql3: expr: add unit tests for prepare_expression' from Jan Ciołek
Adds unit tests for the function `expr::prepare_expression`.

Three minor bugs were found by these tests, all fixed in this PR.
1. When preparing a map, the type for tuple constructor was taken from an unprepared tuple, which has `nullptr` as its type.
2. Preparing an empty nonfrozen list or set resulted in `null`, but preparing a map didn't. Fixed this inconsistency.
3. Preparing a `bind_variable` with `nullptr` receiver was allowed. The `bind_variable` ended up with a `nullptr` type, which is incorrect. Changed it to throw an exception.

Closes #11941

* github.com:scylladb/scylladb:
  test preparing expr::usertype_constructor
  expr_test: test that prepare_expression checks style_type of collection_constructor
  expr_test: test preparing expr::collection_constructor for map
  prepare_expr: make preparing nonfrozen empty maps return null
  prepare_expr: fix a bug in map_prepare_expression
  expr_test: test preparing expr::collection_constructor for set
  expr_test: test preparing expr::collection_constructor for list
  expr_test: test preparing expr::tuple_constructor
  expr_test: test preparing expr::untyped_constant
  expr_test_utils: add make_bigint_raw/const
  expr_test_utils: add make_tinyint_raw/const
  expr_test: test preparing expr::bind_variable
  cql3: prepare_expr: forbid preparing bind_variable without a receiver
  expr_test: test preparing expr::null
  expr_test: test preparing expr::cast
  expr_test_utils: add make_receiver
  expr_test_utils: add make_smallint_raw/const
  expr_test: test preparing expr::token
  expr_test: test preparing expr::subscript
  expr_test: test preparing expr::column_value
  expr_test: test preparing expr::unresolved_identifier
  expr_test_utils: mock data_dictionary::database
2022-11-20 20:03:54 +02:00
Nadav Har'El
2ba8b8d625 test/cql-pytest: remove "xfail" from passing test testIndexOnFrozenCollectionOfUDT
We had a test that used to fail because of issue #8745. But this issue
was already fixed, and we forgot to remove the "xfail" marker. The test
now passes, so let's remove the xfail marker.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>

Closes #12039
2022-11-20 19:54:59 +02:00
Avi Kivity
2f9c53fbe4 Merge 'test/pylib: scylla_cluster: use server ID to name workdir and log file, not IP address' from Kamil Braun
The framework recently started using a separate set of unique IDs to
identify servers, but the log file and workdir are still named using the
last part of the IP address.

This is confusing: the test logs sometimes don't provide the IP addr
(only the ID), and even if they do, the reader of the test log may not
know that they need to look at the last part of the IP to find the
node's log/workdir.

Also using ID will be necessary if we want to reuse IP addresses (e.g.
during node replace, or simply not to run out of IP addresses during
testing).

So use the ID instead to name the workdir and log file.

Also, when starting a test case, print the used cluster. This will make
it easier to map server IDs to their IP addresses when browsing through
the test logs.

Closes #12018

* github.com:scylladb/scylladb:
  test/pylib: manager_client: print used cluster when starting test case
  test/pylib: scylla_cluster: use server ID to name workdir and log file, not IP address
2022-11-20 16:56:19 +02:00
Tomasz Grabiec
c8e983b4aa test: flat_mutation_reader_assertions: Use fatal BOOST_REQUIRE_EQUAL instead of BOOST_CHECK_EQUAL
BOOST_CHECK_EQUAL is a weaker form of assertion, it reports an error
and will cause the test case to fail but continues. This makes the
test harder to debug because there's no obvious way to catch the
failure in GDB and the test output is also flooded with things which
happen after the failed assertion.

Message-Id: <20221119171855.2240225-1-tgrabiec@scylladb.com>
2022-11-20 16:14:26 +02:00
Nadav Har'El
2d2034ea28 Merge 'cql3: don't ignore other restrictions when a multi column restriction is present during filtering' from Jan Ciołek
When filtering with a multi column restriction present, all other restrictions were ignored.
So a query like:
`SELECT * FROM WHERE pk = 0 AND (ck1, ck2) < (0, 0) AND regular_col = 0 ALLOW FILTERING;`
would ignore the restriction `regular_col = 0`.

This was caused by a bug in the filtering code:
2779a171fc/cql3/selection/selection.cc (L433-L449)

When multi column restrictions were detected, the code checked if they are satisfied and returned immediately.
This is fixed by returning only when these restrictions are not satisfied. When they are satisfied the other restrictions are checked as well to ensure all of them are satisfied.
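
The difference between the two behaviours can be sketched in Python as follows. This is only an illustration of the control flow described above, not the code in `cql3/selection/selection.cc`; the function names and the representation of restrictions as predicates are assumptions.

```python
def row_matches_buggy(row, multi_column_checks, other_checks):
    # Before the fix: the multi-column result was returned right away,
    # so other_checks were never consulted when multi_column_checks existed.
    if multi_column_checks:
        return all(check(row) for check in multi_column_checks)
    return all(check(row) for check in other_checks)


def row_matches_fixed(row, multi_column_checks, other_checks):
    # After the fix: return early only on failure, otherwise keep checking.
    if multi_column_checks and not all(check(row) for check in multi_column_checks):
        return False
    return all(check(row) for check in other_checks)


# A row that satisfies (ck1, ck2) < (0, 0) but not regular_col = 0.
row = {"pk": 0, "ck1": -1, "ck2": 5, "regular_col": 7}
multi = [lambda r: (r["ck1"], r["ck2"]) < (0, 0)]
others = [lambda r: r["regular_col"] == 0]
```

With this row, the buggy variant accepts it while the fixed variant rejects it, which mirrors the ignored `regular_col = 0` restriction in the query above.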

This code was introduced back in 2019, when fixing #3574.
Perhaps back then it was impossible to mix multi column and regular columns and this approach was correct.

Fixes: #6200
Fixes: #12014

Closes #12031

* github.com:scylladb/scylladb:
  cql-pytest: add a reproducer for #12014, verify that filtering multi column and regular restrictions works
  boost/restrictions-test: uncomment part of the test that passes now
  cql-pytest: enable test for filtering combined multi column and regular column restrictions
  cql3: don't ignore other restrictions when a multi column restriction is present during filtering
2022-11-20 11:50:38 +02:00
Jan Ciolek
286f182a8c cql-pytest: add a reproducer for #12014, verify that filtering multi column and regular restrictions works
In issue #12014 a user has encountered an instance of #6200.
When filtering a WHERE clause which contained
both multi-column and regular restrictions,
the regular restrictions were ignored.

Add a test which reproduces the issue
using a reproducer provided by the user.

This problem is tested in another similar test,
but this one reproduces the issue in the exact
way it was found by the user.

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2022-11-18 15:27:42 +01:00
Jan Ciolek
63fb2612c3 boost/restrictions-test: uncomment part of the test that passes now
A part of the test was commented out due to #6200.
Now #6200 has been fixed and it can be uncommented.

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2022-11-18 15:14:32 +01:00
Jan Ciolek
99e1032e34 cql-pytest: enable test for filtering combined multi column and regular column restrictions
The test test_multi_column_restrictions_and_filtering was marked as xfail,
because issue #6200 wasn't fixed. Now that filtering
multi column and other restrictions together has been fixed
the test passes.

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2022-11-18 15:14:32 +01:00
Petr Gusev
41629e97de test.py: handle --markers parameter
Some tests may take longer than a few seconds to run. We want to
mark such tests in some way, so that we can run them selectively.
This patch proposes to use pytest markers for this. The markers
from the test.py command line are passed to pytest
as is via the -m parameter.

By default, the marker filter is not applied and all tests
will be run without exception. To exclude e.g. slow tests
you can write --markers 'not slow'.
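
A minimal sketch of the pass-through, assuming a hypothetical helper `build_pytest_args` (not the actual test.py code): the marker expression is forwarded unchanged to pytest's `-m` option, and nothing is added when no expression is given.

```python
def build_pytest_args(test_path, markers=None):
    """Build a pytest command line; `markers` is passed as-is via -m."""
    args = ["pytest", test_path]
    if markers:
        # e.g. markers = "not slow" deselects tests decorated with
        # @pytest.mark.slow
        args += ["-m", markers]
    return args
```

With no marker expression the filter is simply absent, which matches the default of running all tests.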

The --markers parameter is currently only supported
by Python tests, other tests ignore it. We intend to
support this parameter for other types of tests in the future.

Another possible improvement is not to run suites for which
all tests have been filtered out by markers. The markers are
currently handled by pytest, which means that the logic in
test.py (e.g., running a scylla test cluster) will be run
for such suites.

Closes #11713
2022-11-18 12:36:20 +01:00
Kamil Braun
d7649a86c4 Merge 'Build up to support of dynamic IP address changes in Raft' from Konstantin Osipov
We plan to stop storing IP addresses in Raft configuration, and instead
use the information disseminated through gossip to locate Raft peers.

Implement patches that are building up to that:
* improve Raft API of configuration change notifications
* disseminate raft host id in Gossip
* avoid using Raft addresses from Raft configuration, and instead
  consistently use the translation layer between raft server id <-> IP
  address

Closes #11953

* github.com:scylladb/scylladb:
  raft: persist the initial raft address map
  raft: (upgrade) do not use IP addresses from Raft config
  raft: (and gossip) begin gossiping raft server ids
  raft: change the API of conf change notifications
2022-11-18 11:38:19 +01:00
Botond Dénes
437fcdeeda Merge 'Make use of enum_set in directory lister' from Pavel Emelyanov
The lister accepts sort of a filter -- what kind of entries to list, regular, directories or both. It currently uses unordered_set, but enum_set is shorter and better describes the intent.

Closes #12017

* github.com:scylladb/scylladb:
  lister: Make lister::dir_entry_types an enum_set
  database: Avoid useless local variable
2022-11-18 12:15:26 +02:00
Jan Ciolek
77d68153f1 test preparing expr::usertype_constructor
Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2022-11-17 20:41:10 +01:00
Jan Ciolek
eb92fb4289 expr_test: test that prepare_expression checks style_type of collection_constructor
Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2022-11-17 20:41:10 +01:00
Jan Ciolek
77c63a6b92 expr_test: test preparing expr::collection_constructor for map
Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2022-11-17 20:41:09 +01:00
Jan Ciolek
a656fdfe9a expr_test: test preparing expr::collection_constructor for set
Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2022-11-17 20:22:37 +01:00
Jan Ciolek
76f587cfe7 expr_test: test preparing expr::collection_constructor for list
Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2022-11-17 20:22:37 +01:00
Jan Ciolek
44b55e6caf expr_test: test preparing expr::tuple_constructor
Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2022-11-17 20:22:37 +01:00
Jan Ciolek
265100a638 expr_test: test preparing expr::untyped_constant
Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2022-11-17 20:22:37 +01:00
Jan Ciolek
f6b9100cd2 expr_test_utils: add make_bigint_raw/const
Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2022-11-17 20:22:37 +01:00
Jan Ciolek
f9ff131f86 expr_test_utils: add make_tinyint_raw/const
Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2022-11-17 20:22:36 +01:00
Jan Ciolek
76b6161386 expr_test: test preparing expr::bind_variable
Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2022-11-17 20:22:36 +01:00
Pavel Emelyanov
a396c27efc Merge 'message: messaging_service: fix topology_ignored for pending endpoints in get_rpc_client' from Kamil Braun
`get_rpc_client` calculates a `topology_ignored` field when creating a
client which says whether the client's endpoint had topology information
when this client was created. This is later used to check if that client
needs to be dropped and replaced with a new client which uses the
correct topology information.

The `topology_ignored` field was incorrectly calculated as `true` for
pending endpoints even though we had topology information for them. This
would lead to unnecessary drops of RPC clients later. Fix this.

Remove the default parameter for `with_pending` from
`topology::has_endpoint` to avoid similar bugs in the future.
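
The idea can be sketched in Python as below. The class and function here are simplified stand-ins, not the real `topology` or `messaging_service` code; the only points taken from the description are that pending endpoints count as known topology and that `with_pending` has no default value.

```python
class Topology:
    def __init__(self, normal, pending):
        self._normal = set(normal)
        self._pending = set(pending)

    def has_endpoint(self, endpoint, with_pending):
        # No default for with_pending: every caller must choose explicitly.
        if endpoint in self._normal:
            return True
        return with_pending and endpoint in self._pending


def topology_ignored(topology, endpoint):
    # A pending endpoint already has topology information, so it must not
    # be reported as "topology ignored".
    return not topology.has_endpoint(endpoint, with_pending=True)


topo = Topology(normal={"10.0.0.1"}, pending={"10.0.0.2"})
```

Forcing the argument to be spelled out at each call site makes it harder to forget pending endpoints by accident.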

Apparently this fixes #11780. The verbs used by decommission operation
use RPC client index 1 (see `do_get_rpc_client_idx` in
message/messaging_service.cc). From local testing with additional
logging I found that by the time this client is created (i.e. the first
verb in this group is used), we already know the topology. The node is
pending at that point - hence the bug would cause us to assume we don't
know the topology, leading us to dropping the RPC client later, possibly
in the middle of a decommission operation.

Fixes: #11780

Closes #11942

* github.com:scylladb/scylladb:
  message: messaging_service: check for known topology before calling is_same_dc/rack
  test: reenable test_topology::test_decommission_node_add_column
  test/pylib: util: configurable period in wait_for
  message: messaging_service: fix topology_ignored for pending endpoints in get_rpc_client
  message: messaging_service: topology independent connection settings for GOSSIP verbs
2022-11-17 20:14:32 +03:00
Jan Ciolek
42e01cc67f expr_test: test preparing expr::null
Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2022-11-17 17:30:05 +01:00
Jan Ciolek
45b3fca71c expr_test: test preparing expr::cast
Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2022-11-17 17:30:05 +01:00
Jan Ciolek
498c9bfa0d expr_test_utils: add make_receiver
Add a convenience function which creates receivers.

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2022-11-17 17:30:04 +01:00
Jan Ciolek
6873a21fbd expr_test_utils: add make_smallint_raw/const
Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2022-11-17 17:30:04 +01:00
Jan Ciolek
488056acb7 expr_test: test preparing expr::token
Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2022-11-17 17:30:04 +01:00
Jan Ciolek
7958f77a40 expr_test: test preparing expr::subscript
Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2022-11-17 17:30:04 +01:00
Jan Ciolek
569bd61c6c expr_test: test preparing expr::column_value
Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2022-11-17 17:30:04 +01:00
Jan Ciolek
26174e29c6 expr_test: test preparing expr::unresolved_identifier
It's interesting that prepare_expression
for column identifiers doesn't require a receiver.
I hope this won't break validation in the future.

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2022-11-17 17:30:04 +01:00
Jan Ciolek
c719a923bb expr_test_utils: mock data_dictionary::database
Add a function which creates a mock instance
of data_dictionary::database.

prepare_expression requires a data_dictionary::database
as an argument, so unit tests for it need something
to pass there. make_data_dictionary_database can
be used to create an instance that is sufficient for tests.

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2022-11-17 17:30:00 +01:00
Kamil Braun
8e8c32befe test/pylib: manager_client: print used cluster when starting test case
It will be easier to map server IDs to their IP addresses when browsing
through the test logs.
2022-11-17 17:14:23 +01:00
Pavel Emelyanov
bc62ca46d4 lister: Make lister::dir_entry_types an enum_set
This type is currently an unordered_set, but only consists of at most
two elements. Making it an enum_set renders it into a size_t variable
and better describes the intention.
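
As a loose analogy only (Scylla's `enum_set` is a C++ template and is not shown here), Python's `enum.Flag` illustrates the same idea of storing a small set of enum values as bits in a single integer. The names below are hypothetical.

```python
import enum


class DirEntryType(enum.Flag):
    REGULAR = enum.auto()    # bit 0
    DIRECTORY = enum.auto()  # bit 1


def wanted(entry_type, filter_mask):
    """Check membership with a single bitwise AND."""
    return bool(entry_type & filter_mask)


both = DirEntryType.REGULAR | DirEntryType.DIRECTORY
```

Membership tests become a bitwise operation on one integer instead of a hash lookup.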

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-11-17 19:01:45 +03:00
Kamil Braun
b83234d8aa test/pylib: scylla_cluster: use server ID to name workdir and log file, not IP address
The framework recently started using a separate set of unique IDs to
identify servers, but the log file and workdir are still named using the
last part of the IP address.

This is confusing: the test logs sometimes don't provide the IP addr
(only the ID), and even if they do, the reader of the test log may not
know that they need to look at the last part of the IP to find the
node's log/workdir.

Also using ID will be necessary if we want to reuse IP addresses (e.g.
during node replace, or simply not to run out of IP addresses during
testing).
2022-11-17 16:55:12 +01:00
Konstantin Osipov
990c7a209f raft: change the API of conf change notifications
Pass a change diff into the notification callback,
rather than add or remove servers one by one, so that
if we need to persist the state, we can do it once per
configuration change, not for every added or removed server.
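
A hedged Python sketch of why a diff-style callback helps with persistence. The class and method names are hypothetical and do not reflect the actual Raft library API; the sketch only shows that state is written once per notification rather than once per server.

```python
class AddressMapStore:
    """Toy listener that persists its state after each notification."""

    def __init__(self):
        self.servers = set()
        self.persist_calls = 0

    def _persist(self):
        # Stand-in for a write to durable storage.
        self.persist_calls += 1

    def on_configuration_change(self, added, removed):
        # The whole diff arrives at once, so one write covers all of it.
        self.servers |= set(added)
        self.servers -= set(removed)
        self._persist()


store = AddressMapStore()
store.on_configuration_change(added={1, 2, 3}, removed=set())
```

With per-server callbacks, the same change would have triggered three separate writes.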

For now still pass added and removed entries in two separate calls
per a single configuration change. This is done mainly to fulfill the
library contract that it never sends messages to servers
outside the current configuration. The group0 RPC
implementation doesn't need the two calls, since it simply
marks the removed servers as expired: they are not removed immediately
anyway, and messages can still be delivered to them.
However, there may be test/mock implementations of RPC which
could benefit from this contract, so we decided to keep it.
2022-11-17 12:07:31 +03:00
Nadav Har'El
e393639114 test/cql-pytest: reproducer for crash in LWT with null key
This patch adds a reproducer for issue #11954: Attempting an
"IF NOT EXISTS" (LWT) write with a null key crashes Scylla,
instead of producing a simple error message (like happens
without the "IF NOT EXISTS" after #7852 was fixed).

The test passes on Cassandra, but crashes Scylla. Because of this
crash, we can't just mark the test "xfail" and it's temporarily
marked "skip" instead.

Refs #11954.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>

Closes #11982
2022-11-17 07:31:13 +02:00
Avi Kivity
3497891cf9 utils: spell "barrett" correctly
As P. T. Barnoom famously said, "write what you like but spell my name
correctly". Following that, we correct the spelling of Barrett's name
in the source tree.

Closes #11989
2022-11-16 16:30:38 +02:00
Kamil Braun
9b2449d3ea test: reenable test_topology::test_decommission_node_add_column
Also improve the test to increase the probability of reproducing #11780
by injecting sleeps in appropriate places.

Without the fix for #11780 from the earlier commit, the test reproduces
the issue in roughly half of all runs in dev build on my laptop.
2022-11-16 14:01:50 +01:00
Kamil Braun
0f49813312 test/pylib: util: configurable period in wait_for
2022-11-16 14:01:50 +01:00
Nadav Har'El
2f2f01b045 materialized views: fix view writes after base table schema change
When we write to a materialized view, we need to know some information
defined in the base table such as the columns in its schema. We have
a "view_info" object that tracks each view and its base.

This view_info object has a couple of mutable attributes which are
used to lazily-calculate and cache the SELECT statement needed to
read from the base table. If the base-table schema ever changes -
and the code calls set_base_info() at that point - we need to forget
this cached statement. If we don't (as before this patch), the SELECT
will use the wrong schema and writes will no longer work.
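
The caching problem can be sketched in Python as follows. This is a simplified model, not the C++ `view_info` class: the attribute names and the use of a plain string for the SELECT statement are assumptions made for illustration.

```python
class ViewInfo:
    def __init__(self, base_columns):
        self._base_columns = list(base_columns)
        self._select_statement = None  # lazily built cache

    def select_statement(self):
        # Build the statement on first use and reuse it afterwards.
        if self._select_statement is None:
            cols = ", ".join(self._base_columns)
            self._select_statement = f"SELECT {cols} FROM base"
        return self._select_statement

    def set_base_info(self, base_columns):
        self._base_columns = list(base_columns)
        # Forget the cached statement; it was built from the old schema.
        self._select_statement = None
```

Without the reset in `set_base_info`, `select_statement()` would keep returning the statement built from the old column list.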

This patch also includes a reproducing test that failed before this
patch, and passes afterwards. The test creates a base table with a
view that has a non-trivial SELECT (it has a filter on one of the
base-regular columns), makes a benign modification to the base table
(just a silly addition of a comment), and then tries to write to the
view - and before this patch it fails.

Fixes #10026
Fixes #11542
2022-11-16 13:58:21 +02:00
Botond Dénes
bd1fcbc38f Merge 'Introduce reverse vector_deserializer.' from Michał Radwański
As indicated in #11816, we'd like to enable deserializing vectors in reverse.
The forward deserialization is achieved by reading from an input_stream. The
input stream internally is a singly linked list with complicated logic. In order to
allow for going through it in reverse, instead when creating the reverse vector
initializer, we scan the stream and store substreams to all the places that are a
starting point for a next element. The iterator itself just deserializes elements
from the remembered substreams, this time in reverse.
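
The approach can be sketched in Python as below. The wire format (a 4-byte little-endian length prefix per element) and the function names are assumptions for illustration; the real code works on Scylla's input streams and remembers substreams rather than byte offsets.

```python
import struct


def encode(items):
    # Assumed format: each element is preceded by its 4-byte length.
    return b"".join(struct.pack("<I", len(x)) + x for x in items)


def reverse_items(buf):
    # Forward scan: remember where each element starts.
    offsets = []
    pos = 0
    while pos < len(buf):
        offsets.append(pos)
        (size,) = struct.unpack_from("<I", buf, pos)
        pos += 4 + size
    # Deserialize from the remembered positions, last element first.
    for start in reversed(offsets):
        (size,) = struct.unpack_from("<I", buf, start)
        yield buf[start + 4 : start + 4 + size]
```

The forward scan only reads the length prefixes, so the element payloads are decoded once, in reverse order.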

Fixes #11816

Closes #11956

* github.com:scylladb/scylladb:
  test/boost/serialization_test.cc: add test for reverse vector deserializer
  serializer_impl.hh: add reverse vector serializer
  serializer_impl: remove unneeded generic parameter
2022-11-16 07:37:24 +02:00
Nadav Har'El
e4dba6a830 test/cql-pytest: add test for when MV requires IS NOT NULL
As noted in issue #11979, Scylla inconsistently (and unlike Cassandra)
requires "IS NOT NULL" on some but not all materialized-view key
columns. Specifically, Scylla does not require "IS NOT NULL" on the
base's partition key, while Cassandra does.

This patch is a test which demonstrates this inconsistency. It currently
passes on Cassandra and fails on Scylla, so is marked xfail.

Refs #11979

Signed-off-by: Nadav Har'El <nyh@scylladb.com>

Closes #11980
2022-11-15 14:21:48 +01:00
Botond Dénes
34f29c8d67 Merge 'Use with_sstable_directory() helper in tests' from Pavel Emelyanov
The helper is already widely used; one (last) test case can benefit from using it too.

Closes #11978

* github.com:scylladb/scylladb:
  test: Indentation fix after previous patch
  test: Use with_sstable_directory() helper
2022-11-15 14:21:48 +01:00