Commit Graph

233 Commits

Author SHA1 Message Date
Calle Wilund
78350a7e1b cdc: Ensure columns removed from log table are registered as dropped
If we are redefining the log table, we need to ensure any dropped
columns are registered in "dropped_columns" table, otherwise clients will not
be able to read data older than now.
Includes unit test.

Should probably be backported to all CDC enabled versions.

Fixes #10473
Closes #10474
2022-05-04 14:19:39 +02:00
Nadav Har'El
6fb762630b cql-pytest: translate Cassandra's tests for SELECT operations
This is a translation of Cassandra's CQL unit test source file
    validation/operations/SelectTest.java into our our cql-pytest framework.

    This large test file includes 78 tests for various types of SELECT
    operations. Four additional tests require UDF in Java syntax,
    and were skipped.

    All 78 tests pass on Cassandra. 25 of the tests fail on Scylla
    reproducing 3 already known Scylla issues and 8 previously-unknown
    issues:

    Previously known issues:

      Refs #2962:  Collection column indexing
      Refs #4244:  Add support for mixing token, multi- and single-column
                   restrictions
      Refs #8627:  Cleanly reject updates with indexed values where
                   value > 64k

    Newly-discovered issues:

      Refs #10354: SELECT DISTINCT should allow filter on static columns,
                   not just partition keys
      Refs #10357: Spurious static row returned from query with filtering,
                   despite not matching filter
      Refs #10358: Comparison with UNSET_VALUE should produce an error
      Refs #10359: "CONTAINS NULL" and "CONTAINS KEY NULL" restrictions
                   should match nothing
      Refs #10361: Null or UNSET_VALUE subscript should generate an
                   invalid request error
      Refs #10366: Enforce Key-length limits during SELECT
      Refs #10443: SELECT with IN and ORDER BY orders rows per partition
                   instead of for the entire response
      Refs #10448: The CQL token() function should validate its parameters

Signed-off-by: Nadav Har'El <nyh@scylladb.com>

Closes #10449
2022-05-03 11:45:05 +03:00
Nadav Har'El
fbb2a41246 expressions: don't dereference invalid map subscript in filter
If we have the filter expression "WHERE m[?] = 2", the existing code
simply assumed that the subscript is an object of the right type.
However, while it should indeed be the right type (we already have code
that verifies that), there are two more options: It can also be a NULL,
or an UNSET_VALUE. Either of these cases causes the existing code to
dereference a non-object as an object, leading to bizarre errors (as
in issue #10361) or even crashes (as in issue #10399).

Cassandra returns a invalid request error in these cases: "Unsupported
unset map key for column m" or "Unsupported null map key for column m".
We decided to do things differently:

 * For NULL, we consider m[NULL] to result in NULL - instead of an error.
   This behavior is more consistent with other expressions that contain
   null - for example NULL[2] and NULL<2 both result in NULL as well.
   Moreover, if in the future we allow more complex expressions, such
   as m[a] (where a is a column), we can find the subscript to be null
   for some rows and non-null for other rows - and throwing an "invalid
   query" in the middle of the filtering doesn't make sense.

 * For UNSET_VALUE, we do consider this an error like Cassandra, and use
   the same error message as Cassandra. However, the current implementation
   checks for this error only when the expression is evaluated - not
   before. It means that if the scan is empty before the filtering, the
   error will not be reported and we'll silently return an empty result
   set. We currently consider this ok, but we can also change this in the
   future by binding the expression only once (today we do it on every
   evaluation) and validating it once after this binding.

Fixes #10361
Fixes #10399

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2022-04-24 16:05:34 +03:00
Nadav Har'El
808a93d29b expressions: fix invalid dereference in map subscript evaluation
When we have an filter such as "WHERE m[2] = 3" (where m is a map
column), if a row had a null value for m, our expression evaluation
code incorrectly dereferences an unset optional, and continued
processing the result of this dereference which resulted in undefined
behavior - sometimes we were lucky enough to get "marshaling error"
but other times Scylla crashed.

The fix is trivial - just check before dereferencing the optional value
of the map. We return null in that case, which means that we consider
the result of null[2] to be null. I think this is a reasonable approach
and fits our overall approach of making null dominate expressions (e.g.,
the value of "null < 2" is also null).

The test test_filtering.py::test_filtering_null_map_with_subscript,
which used to frequently fail with marshaling errors or crashes, now
passes every time so its "xfail" mark is removed.

Fixes #10417

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2022-04-24 14:58:56 +03:00
Nadav Har'El
189b8845fe test/cql-pytest: improve tests for map subscripts and nulls
The test test_null.py::test_map_subscript_null turned out to reproduce
multiple bugs related to using map subscripts in filtering expressions.
One was issue #10361 (m[null] resulted in a bizarre error) or #10399
(m[null] resulted in a crash), and a different issue was #10401 (m[2]
resulted in a bizarre error or a crash if m itself was null). Moreover,
the same test uncovered different bugs depending how it was run - alone
or with other tests - because it was using a shared table.

In this patch we introduce two separate tests in test_filtering.py
which are designed to reproduce these separate bugs instead of mixing
them into one test. The new tests also cover a few more corners which
the previous test (which focused on nulls) missed - such as UNSET_VALUE.

The two new tests (and the old test_map_subscript_null) pass on
Cassandra so still assume that the Cassandra behavior - that m[null]
should be an error - is the correct behavior. We may want to change
the desired behavior (e.g., to decide that m[null] be null, not an
error), and change the tests accordingly later - but for now the
tests follow Cassandra's behavior exactly, and pass on Cassandra
and fail on Scylla (so are marked xfail).

The bugs reproduced by these tests involve randomness or reading
uninitialized memory, so these tests sometimes pass, sometimes fail,
and sometimes even crash (as reported in #10399 and #10401). So to
reproduce these bugs run the tests multiple times. For example:

    test/cql-pytest/run --count 100 --runxfail
        test_filtering.py::test_filtering_null_map_with_subscript

Refs #10361
Refs #10399
Refs #10401

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2022-04-24 13:26:26 +03:00
Nadav Har'El
cc40685c28 test/cql-pytest: add test for filtering with IN restriction
It turns out that Cassandra does not allow IN restrictions together with
filtering, except, curiously, when the restriction is on a clustering key.
There is no real reason for this limitation - the error message even says
it is not *yet* supported.

Scylla, on the other hand, does support this case. Of course it's not
enough that we support it - we need to support it correctly... But we don't
have a full regression test that this support is correct - in
filtering_test.cc we test it with clustering and regular columns - but not
partition key columns.

So this patch adds a simple cql-pytest test that this sort of filtering
works in Scylla correctly for partition, clustering and regular columns
(and also confirms that these cases don't work, yet, on Cassandra).

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20220420075553.1008062-1-nyh@scylladb.com>
2022-04-20 09:56:22 +02:00
Nadav Har'El
6cafffe281 test/cql-pytest: reproduce internal server error on null subscript
The restriction "WHERE m[NULL] = 2" should result in an invalid request
error, but currently results in an ugly internal server error.
This test reproduces it, and since the bug is still in the code - is
marked as xfail.

Refs #10361

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20220412134118.829671-1-nyh@scylladb.com>
2022-04-13 08:49:48 +03:00
Nadav Har'El
ae0e1574dc test/cql-pytest: reproducer for CONTAINS NULL bug
This is a reproducer for issue #10359 that a "CONTAINS NULL" and
"CONTAINS KEY NULL" restrictions should not match any set, but currently
do match non-empty or all sets.

The tests currently fail on Scylla, so marked xfail. They also fails on
Cassandra because Cassandra considers such a request an error, which
we consider a mistake (see #4776) - so the tests are marked "cassandra_bug".

Refs #10359.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20220412130914.823646-1-nyh@scylladb.com>
2022-04-13 08:49:23 +03:00
Nadav Har'El
5d87ead9f1 test/cql-pytest: add more tests comparing against NULL
We already have a test showing that WHERE v=NULL ALLOW FILTERING is
allowed in Scylla (unlike Cassandra), and matches nothing. Here
we add two further tests that confirm that:

1. Not only is v=NULL allowed - v<NULL, v<=NULL, and so on, is also
   allowed and matches nothing.

2. The ALLOW FILTERING is required in in those requests. Without it,
   both Scylla and Cassandra generate the same "ALLOW FILTERING is
   required" error.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20220411214503.770413-1-nyh@scylladb.com>
2022-04-13 08:48:55 +03:00
Nadav Har'El
d9ec5ed46c test/cql-pytest: add test for blobAsInt() et al for various blob lengths
Recently I added a test that verified that blobAsInt() accepts a zero-
byte blob and return an "empty" integer. I was asked by one of the
reviewers - what happens if we try to pass a *three* byte blob to
blobAsInt()? Here is a new test that demonstrates that the answer is:
Besides the 0-byte blob, blobAsInt() only allows a 4-byte blob. Trying
3 or 5 bytes will result in an invalid query error being returned.

The test passes on both Cassandra and Scylla, confirming their behavior
is the same. The test checks all fixed-sized integer types - int (4
bytes), bigint (8 bytes), smallint (2 bytes) and tinyint (1 byte).

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20220411093803.651881-1-nyh@scylladb.com>
2022-04-11 12:44:22 +03:00
Nadav Har'El
3456cbcfcf test/cql-pytest: split test_null.py into test_null and test_empty
We had in test_null.py a mixture of tests for null values and the
"null" CQL keyword - and tests for empty values. Null and empty
values are *not* the same thing, and there is no reason to keep the
tests for the two things in the same file and further confuse these
two distinct concepts.

This patch just moves code from test_null.py into a new test_empty.py -
there are no functional changes.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20220407090348.137583-2-nyh@scylladb.com>
2022-04-11 09:54:54 +03:00
Nadav Har'El
cf79d84efa test/cql-pytest: add regression test for "empty" integer
In https://github.com/scylladb/scylla-rust-driver/issues/278 we noted
that beyond the concept of a null integer value (which has size -1),
there is also an empty integer value (size 0). This patch adds a test
that it works as expected. And we see that it does - Scylla stores such
a value fine, and the Python driver retrieves it the same as a null
(arguably, this is fine - the important point is to see that we don't
get a crash or an error).

The test passes - I just added it as a regression test for the future.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20220407090348.137583-1-nyh@scylladb.com>
2022-04-11 09:54:53 +03:00
Piotr Sarna
97c9729487 test: add test cases for keyspace storage options
The test cases check if it's possible to set and/or alter
storage options for keyspaces with CQL, and whether the changes
are reflected in the schema tables.
2022-04-08 09:17:01 +02:00
Piotr Sarna
3272b4826f db: add keyspace-storage-options experimental feature
Specifying non-standard keyspace options is experimental, so it's
going to be protected by a configuration flag.
2022-04-08 09:17:01 +02:00
Piotr Sarna
2683b54402 Merge 'CQL3: Optional FINALFUNC and INITCOND for UDA' from Michał Jadwiszczak
Makes final function and initial condition to be optional while
creating UDA. No final function means UDA returns final state
and default initial condition is `null`.
Both items were optional in cql's grammar but they were treated as required in code.

Additionally I've added check if state function returns state.

Fixes #10324

Closes #10331

* github.com:scylladb/scylla:
  CQL3: check sfunc return type in UDA
  cql-pytest: UDA no final_func/initcond tests
  cql3: allow no final_func and no initcond in UDA
2022-04-06 16:04:47 +02:00
Nadav Har'El
0f3cd6ad18 test/cql-pytest: fix fails_without_raft tests on Cassandra
We had a Python typo ("false" instead of "False") which prevented
tests with the fails_without_raft marker for running on Cassandra.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20220405170337.36321-1-nyh@scylladb.com>
2022-04-06 11:20:25 +03:00
Jadw1
b560286ffe CQL3: check sfunc return type in UDA
Thre return type of state function is now checked while creating UDA.
Appropriate test added to cql-pytest.
2022-04-06 09:25:17 +02:00
Jadw1
977d6ac8b0 cql-pytest: UDA no final_func/initcond tests
Cql-pytests to check if UDA works properly without final function
or initial condition.
2022-04-06 09:25:12 +02:00
Nadav Har'El
cfe04e6437 test/cql-pytest: nicer error message if a test can't find nodetool
When testing Scylla, cql-pytest does *not* need an external nodetool
command - it uses the REST API instead because it is much faster and
there is no need to install anything. However, if cql-pytest is run
against Cassandra, the tests do want to use the "nodetool" utility and
want to know what it is. The tests use either the NODETOOL environment
variable, or if that doesn't exist, look for "nodetool" in the path.

If nodetool wasn't found in that way, before this patch, we got an ugly
error message with long irrelevant Python backtraces. It wasn't easy
to understand that what actually happened was that the user forgot
to set the NODETOOL environment variable.

This patch cleans up this error handling. Now, if nodetool cannot be
found, every test that tries to run nodetool will report just a one-
line error message, clearly explaining what went wrong and how to
fix it:

        Error: Can't find nodetool. Please set the NODETOOL
        environment variable to the path of the nodetool utility.

To reiterate, when testing Scylla, nodetool is *not* needed even after
this patch. These errors will not happen even if you don't have the
nodetool utility. You only need nodetool if you plan to test Cassandra.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20220405171835.43992-1-nyh@scylladb.com>
2022-04-05 20:29:02 +03:00
Lukasz Sojka
5727f196e3 Add big batch logs tests
Tests for warning and error lines in logfile when user executes
big batch (above preconfigured thresholds in scylla.yaml).

Signed-off-by: Lukasz Sojka <lukasz.sojka@scylladb.com>

Closes #10232
2022-04-04 17:25:13 +03:00
Avi Kivity
5fc093ad42 Merge 'wasm: manage memory using exports from the client' from Wojciech Mitros
This patch adds importing the `malloc` and `free` method from the wasm client, and using them for allocating wasm memory for UDF arguments and freeing its result. When the methods are not exported, the old behaviour is used instead. To make that possible, this patch also includes a fix to the usage of pages in wasm memory (methods `size` and `grow`) that were used for allocating memory for arguments until now. (The source codes  for the examples didn't work on my machine in their original form, so when updating paging I've also added small unrelated modifications)

Tests:unit(dev)

Closes #10234

* github.com:scylladb/scylla:
  wasm: add wasm ABI version 2
  wasm: add WASI handling
  wasm: add documentation
  wasm: add _scylla_abi export for specifying abi for wasm udfs
  wasm: update ABI for passing parameters to wasm UDFs
  wasm: move common code to a separate function
  wasm: use wasm pages for wasm memory
2022-03-31 12:33:55 +03:00
Wojciech Mitros
56c5459c50 wasm: add null handling for wasm udf
As the name suggests, for UDFs defined as RETURNS NULL ON NULL
INPUT, we sometimes want to return nulls. However, currently
we do not return nulls. Instead, we fail on the null check in
init_arg_visitor. Fix by adding null handling before passing
arguments, same as in lua.

Signed-off-by: Wojciech Mitros <wojciech.mitros@scylladb.com>

Closes #10298
2022-03-31 12:27:38 +03:00
Piotr Sarna
c0fd53a9d7 cql3: fix qualifying restrictions with IN for indexing
When a query contains IN restriction on its partition key,
it's currently not eligible for indexing. It was however
erroneously qualified as such, which lead to fetching incorrect
results. This commit fixes the issue by not allowing such queries
to undergo indexing, and comes with a regression test.

Fixes #10300

Closes #10302
2022-03-31 11:04:17 +03:00
Nadav Har'El
3eaafbbdf7 test/cql-pytest: mark a test failing on Cassandra with cassandra_bug
We have a test for the LIKE restriction with ALLOW FILTERING.

Cassandra does not yet support this combination (it only supports LIKE
with SASI indexes), so this test fails on Cassandra, suggesting either
the test is wrong, or Cassandra is wrong. In this case, Cassandra is
wrong - they have an issue requesting this to be fixed -
https://issues.apache.org/jira/browse/CASSANDRA-17198, and even an
implementation which is being reviewed.

So let's mark this test with "cassandra_bug", meaning it is expected
to fail (xfail) when running against Cassandra. When CASSANDRA-17198
is fixed, we can remove the cassandra_bug mark.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20220330211734.4103691-1-nyh@scylladb.com>
2022-03-31 09:47:44 +02:00
Wojciech Mitros
93b082e8d9 wasm: add _scylla_abi export for specifying abi for wasm udfs
Different languages may require different ABIs for passing
parameters, etc. This patch adds a requirement for all wasm
UDFs to export an _scylla_abi symbol, that is an 32-bit integer
with a value specifying the ABI version.

Signed-off-by: Wojciech Mitros <wojciech.mitros@scylladb.com>
2022-03-30 19:37:11 +02:00
Wojciech Mitros
a7ee3ccf52 wasm: update ABI for passing parameters to wasm UDFs
WebAssembly uses 32-bit address space, while also
having 64-bit integers as it native types. As a result,
when passing size of an object in memory and its address,
it can be combined into one 64-bit value. As a bonus,
if the object is null, we can signal it by passing -1 as
its size.

This patch implements handling of this new ABI and adjusts
expamples in test_wasm.py.

Signed-off-by: Wojciech Mitros <wojciech.mitros@scylladb.com>
2022-03-30 17:13:25 +02:00
Wojciech Mitros
62761a7cf3 wasm: use wasm pages for wasm memory
The memory.grow and memory.size wasm methods return
the memory size in pages, and memory.size takes its
argument in the number of pages. A WebAssembly page
has a size of 64KiB, so during memory allocation
we have to divide our desired size in bytes by page
size and round up. Similarly, when reading memory
size we need to multiply the result by 64KiB to
get the size in bytes.

The change affects current naive allocator for
arguments when calling wasm UDFs and the examples
in wasm_test.py - both commented code and compiled
wasm in text representation.

Signed-off-by: Wojciech Mitros <wojciech.mitros@scylladb.com>
2022-03-30 17:13:13 +02:00
Nadav Har'El
06fdce82aa test/*/run: clearer error message on wrong SCYLLA setting
The test runners cql-pytest/run et al. try to automatically find the
last-compile Scylla executable, but this decision can be overriden by
the SCYLLA environment variable. If the user sets by mistake SCYLLA to
something which is not a valid path of an executable, the result was a
long and obscure Python stack trace.

So after this patch, if SCYLLA points to something which is not an
executable, a clear error is produced immediately, directing the user
to set it this variable to a correct executable

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20220323164427.3301828-1-nyh@scylladb.com>
2022-03-23 19:01:17 +02:00
Jadw1
1438be5311 CQL3: Bloom filter efficacy test
Added CQL pytest to check bloom filter efficacy by
inserting `N` rows and reading `M` non-existing keys.

Added `bloom_filter_false_positives` method to Python `nodetool`
module. Method gets fp number by calling Scylla's API.

Fixes: #1055

Closes #10186
2022-03-23 16:51:50 +02:00
Nadav Har'El
f76f6dbccb secondary index: avoid special characters in default index names
In CQL, table names are limited to so-called word characters (letters,
numbers and underscores), but column names don't have such a limitation.
When we create a secondary index, its default name is constructed from
the column name - so can contain problematic characters. It can include
even the "/" character. The problem is that the index name is then used,
like a table name, to create a directory with that name.

The test included in this patch demonstrates that before this patch, this
can be misused to create subdirectories anywhere in the filesystem, or to
crash Scylla when it fails to create a directory (which it considers an
unrecoverable I/O error).

In this patch we do what Cassandra does - remove all non-word
characters from the indexed column name before constructing the default
index name. In the included test - which can run on both Scylla and
Cassandra - we verify that the constructed index name is the same as
in Cassandra, which is useful to know (e.g., because knowing the index
name is needed to DROP the index).

Also, this patch adds a second line of defense against the security problem
described above: It is now an error to create a schema with a slash or
null (the two characters not allowed in Unix filenames) in the keyspace
or table names. So if the first line of defense (CQL checking the validity
of its commands) fails, we'll have that second line of defense. I verified
that if I revert the default-index-name fix, the second line of defense
kicks in, and the index creation is aborted and cannot create files in
the wrong place to crash Scylla.

Fixes #3403

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20220320162543.3091121-1-nyh@scylladb.com>
2022-03-20 18:33:48 +02:00
Nadav Har'El
189ff5414f test/cql-pytest: implement test_tools.py without run-script cooperation
In commit afab1a97c6, we added
test_tools.py - tests for the various tools embedded in the Scylla
executable. These tests need to know where the Scylla executable is,
and also where its sstables are stored. For this, the commit added two
test parameters - "--scylla-path" and "--workdir" - with which the
"run" script communicated this knowledge to the test.

However, that implementation meant that these tests only work if the
test was run via the test/cql-pytest/run script - they won't work if
the user ran Scylla/pytest manually, or through some other script not
passing these options.

This patch drops the "--scylla-path" and "--workdir" parameters, and
instead the test figures out this information on its own:

1. To find the Scylla executable, we begin by looking (using the
   local_process_id(cql) function from the previous patch) for a
   local process which listens to our CQL connection, and then find
   the executable's path using /proc.

2. To find the Scylla data directory (which is what we really need, not
   workdir which is just a shortcut to set all directories!), we
   retrieve this configuration from the system.config table through CQL.

I tested that test_tools.py now works not only through test/cql-pytest/run
but also if I run Scylla manually and then run "pytest test_tools.py"
without any extra parameters.

Fixes #10209

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20220314151125.2737815-2-nyh@scylladb.com>
2022-03-14 20:25:22 +02:00
Nadav Har'El
8ed0909cc3 test/cql-pytest: add mechanism and example of testing Scylla log messages
Generally, cql-pytest tests do not, and *should not* rely on looking up
messages in the Scylla log file: Relying on such messages makes it
impossible to run the same test against Cassandra or even a remotely-
installed Scylla, and the tests tend to break when logging (which is not
considered part of our API) changes. Moreover, usually what our dtests
achieve by looking at the log - e.g., figuring out when some event has
happened - can be achieved through official CQL APIs, and this is what
normal users do anyway (users don't normally dig through the log to
figure out when their operation completed).

However, sometimes we do want to write a test to confirm that during a
certain operation, a certain log message gets written to Scylla's log.
A desire to do this was raised by @fruch and @soyacz, so in this patch
I provide a mechanism to do this, and a trivial example - which checks
that a "Creating ..." message appears on the log whenever a table is
created, and "Dropping ..." when the table is deleted.

As is explained in detail in patches in the comment, Scylla's log file
is found automatically, without relying on Scylla's runner (such as
the script test/cql-pytest/run) communicating to the test where the log
file is. If the log file can't be found - e.g., we're testing a remote
Scylla, or if this isn't Scylla, the tests are skipped.

I would like all logfile-testing tests to be in the same file,
test_logs.py. As I explained above, I think it is a mistake for general
tests to check the log file just because they can. I think that the only
tests that should use the log file are tests deliberately written to
check what gets logged - and those can be collected in the same file.

As part of this patch, we add the utility function local_process_id(cql)
to find (if we can) the local process which listens to the connection
"cql". This utility function will later be useful in more places - for
example test_tools.py needs to find Scylla's executable.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20220314151125.2737815-1-nyh@scylladb.com>
2022-03-14 20:25:20 +02:00
Lukasz Sojka
c65f1c3b47 test/cql-pytest: add warnings test
cql client should return warnings when batch exceedes certain size.
This test verifies if response contains them.

Test covers issue: https://github.com/scylladb/scylla/issues/10196

Signed-off-by: Lukasz Sojka <lukasz.sojka@scylladb.com>

Closes #10197
2022-03-14 19:49:06 +02:00
Nadav Har'El
383aa326cc cql-pytest: translate Cassandra's tests for BATCH operations
This is a translation of Cassandra's CQL unit test source file
validation/operations/BatchTest.java into our our cql-pytest framework.

This test file includes 13 tests for various types of BATCH operations.
All tests pass on Scylla - no known or new bugs were reproduced.

Two of the tests involve very slow testing of TTLs, so after verifying
they work I marked them "skip" for now (we can always turn them on later,
perhaps after reducing the length or number of the sleeps).

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20220313121634.2611423-1-nyh@scylladb.com>
2022-03-14 09:43:02 +01:00
Nadav Har'El
397dd64dea test/cql-pytest: avoid "run" warnings caused by pytest bug
This patch gets rid annoying pytest configuration warnings when running
test/cql-pytest/run. These started to happen after commit
afab1a97c6, due to a pytest bug:

In that commit, we added new "--scylla-path" and "--workdir" parameters
to our pytest tests, and test/cql-pytest/run started passing them,
and test/cql-pytest/run sometest runs pytest as:

    pytest --host something --workdir somedir --scylla-path somepath sometest

Pytest wants to find a configuration file (pytest.ini or tox.ini) in the
directory where the tests live, but its logic to find that directory is
buggy: It (_pytest/config/findpaths.py::determine_setup()) looks at
the command line for directory names, and looks for config files in
these directories or any of their parents. It ignores parameters
beginning with "-", but in our case the various arguments like
"--scylla-path" are each followed by another option, and this one is
not ignored! So instead of looking for the config file in sometest's
parent directories (and finding test/cql-pytest/pytest.ini), pytest
sees the directory given after "scylla-path", and finds the completely
irrelevant tox.ini there - and uses that, which (depending what you
have installed) can generate warnings.

The solution is to change the run script to use "--scylla-path=..."
as one parameter instead of "--scylla-path ..." as two parameters.
When it's just one parameter, the pytest determine_setup() logic skips
it entirely, and finds just the actual test directory.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20220309132726.2311721-1-nyh@scylladb.com>
2022-03-09 15:37:08 +02:00
Piotr Sarna
05b66102e9 test: add a case for LIKE operator on a descending order column
This case is a regression test for issue #10181, where it turned out
that a clustering column with descending order is not properly
recognized as a string.
This test case used to fail with:
cassandra.InvalidRequest:
  Error from server: code=2200 [Invalid query]
  message="LIKE is allowed only on string types, which b is not"
...until it got fixed by the previous commit.
2022-03-09 08:56:22 +01:00
Nadav Har'El
c8152e78d7 Merge 'CQL3: fromJson accepts string as bool' from Jadw1
The problem was incompatibility with cassandra, which accepts bool
as a string in `fromJson()` UDF. The difference between Cassandra and
Scylla now is Scylla accepts whitespaces around word in string,
Cassandra don't. Both are case insensitive.

Fixes: https://github.com/scylladb/scylla/issues/7915

Closes #10134

* github.com:scylladb/scylla:
  CQL3/pytest: Updating test_json
  CQL3: fromJson accepts string as bool
2022-03-08 16:27:40 +02:00
Nadav Har'El
ef43531fb6 materialized views: allow empty strings in views and indexes
Although Cassandra generally does not allow empty strings as partition
keys (note they are allowed as clustering keys!), it *does* allow empty
strings in regular columns to be indexed by a secondary index, or to
become an empty partition-key column in a materialized view. As noted in
issues #9375 and #9364 and verified in a few xfailing cql-pytest tests,
Scylla didn't allow these cases - and this patch fixes that.

The patch mostly *removes* unnecessary code: In one place, code
prevented an sstable with an empty partition key from being written.
Another piece of removed code was a function is_partition_key_empty()
which the materialized-view code used to check whether the view's
row will end up with an empty partition key, which was supposedly
forbidden. But in fact, should have been allowed like they are allowed
in Cassandra and required for the secondary-index implementation, and
the entire function wasn't necessary.

Note that the removed function is_partition_key_empty() was *NOT* required
for the "IS NOT NULL" feature of materialized views - this continues to
work as expected after this patch, and we add another test to confirm it.
Being null and being an empty string are two different things.

This patch also removes a part of a unit test which enshrined the
wrong behavior.

After this patch we are left with one interesting difference from
Cassandra: Though Cassandra allows a user to create a view row with an
empty-string partition key, and this row is fully visible in when
scanning the view, this row can *not* be queried individually because
"WHERE v=''" is forbidden when v is the partition key (of the view).
Scylla does not reproduce this anomaly - and such point query does work
in Scylla after this patch. We add a new test to check this case, and mark
it "cassandra_bug", i.e., it's a Cassandra behavior which we consider
wrong and don't want to emulate.

This patch relies on #9352 and #10178 having been fixed in previous patches,
otherwise the WHERE v='' does not work when reading from sstables.
We add to the already existing tests we had for empty materialized-views
keys a lookup with WHERE v='' which failed before fixing those two issues.

Fixes #9364
Fixes #9375

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2022-03-08 15:34:26 +02:00
Jadw1
213dace26e CQL3/pytest: Updating test_json
Referring to issue #7915, cassandra also works with unprepared
statement. There was missing `fromJson()`, the test was inserting
string into boolean column.
2022-03-04 14:18:42 +01:00
Nadav Har'El
3d0bd523b5 Merge 'CQL3: fromJson out of range integer cause as error' from Jadw1
Passing integer which exceeds corresponding type's bounds to
`fromJson()` was causing silent overflow, e.g. inserting
`fromJson('2147483648')` to `int` coulmn stored `-2147483648`.

Now, this will cause marshal_exception. All integer types are testing agains their bounds.

Tests referring issue https://github.com/scylladb/scylla/issues/7914 in `test/cql-pytest/cassandra_tests/validation/entities/json_test.py` won't pass because the expected error's messages differ from the thrown ones. I was wondering what the message should be, because expected messages in tests aren't consistent, for instance:
- bigint overflow expects `Expected a bigint value, but got a` message
- short overflow expects `Unable to make short from` message

For now the message is `Value {} out of bound`.

Fixes: https://github.com/scylladb/scylla/issues/7914

Closes #10145

* github.com:scylladb/scylla:
  CQL3/pytest: Updating test_json
  CQL3: fromJson out of range integer cause as error
2022-03-03 13:46:16 +02:00
Jadw1
742efc4992 CQL3/pytest: Updating test_json
Added test for bigint overflow.
2022-03-02 15:36:09 +01:00
Nadav Har'El
b650ff5808 test/cql-pytest: test another corner-case of scientific-notation integers
In a previous patch, we added a test for the case of Scylla trying to
assign the JSON value 1e6 into an integer - which should be allowed
because 1e6 is indeed a whole number, in the range of int.

We already fixed that in commit efe7456f0a,
but this patch adds another test which demonstrates that an even more
esoteric problem remains:

If we are reading a JSON value into a bigint (CQL's 64-bit integer),
*and* if the number is between 2^53 and 2^63-1 *and* if the number
is written using scientific notation, e.g., 922337203685477580.7e1
(which is 2^63-1), then the bigint is set incorrectly, with some
digits being lost. The problem is that RapidJSON reads this integer
into the "double" type, which only keeps 53 significant bits.

Because this is an open issue (#10137), the test included here is
marked as expected failure (xfail). The test is also known to
fail in Cassandra - which doesn't allow scientific notation for
JSON integers at all despite the JSON standard - so the test is
also marked "cassandra_bug".

Refs #10137

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2022-02-28 13:52:56 +02:00
Nadav Har'El
d1b4cbfbc3 test/cql-pytest: add reproducer for LWT bug with static-column conditions
This patch adds a reproducing test for issue #10081. That issue is about
a conditional (LWT) UPDATE operation that chose a non-existent row via WHERE,
and its condition refers to both static and regular columns: In that case,
the code incorrectly assumes that because it didn't read any row, all columns
are null - and forgets that the static column is *not* null.

The test, test_lwt.py::test_lwt_missing_row_with_static
passes on Cassandra but fails on Scylla, so is marked xfail.

Refs #10081

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20220215215243.660087-1-nyh@scylladb.com>
2022-02-25 07:26:11 +02:00
Nadav Har'El
364bd00136 test/cql-pytest: confirm that table names cannot include non-Latin letters
In CQL table names must be composed only of letters, digits, or underscores,
but some Cassandra documentation is unclear whether these "letters" refer only
to the Latin alphabet, or maybe UTF-8 names composed of letters in other
alphabets should be allowed too.

This patch adds a test that confirms that both Scylla and Cassandra only
accept the Latin alphabet in table names, and for example UTF-8 names
with French or Hebrew letters are rejected.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20220222134220.972413-1-nyh@scylladb.com>
2022-02-22 20:58:25 +03:00
Nadav Har'El
1a940a1003 test/cql-pytest: remove "xfail" mark from scientific-notation tests that now pass
After issue #10100 was fixed, the two tests reproducing it now pass,
so remove their "xfail" marker.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20220222131809.970592-1-nyh@scylladb.com>
2022-02-22 20:58:25 +03:00
Botond Dénes
3aa05f7f03 Merge "Make system.clients table virtual" from Pavel Emelyanov
"
The table lists connected clients. For this the clients are
stored in real table when they connect, update their statuses
when needed and remove^w tombstone themselves when they
disconnect. On start the whole table is cleared.

This looks weird. Here's another approach (inspired by the
hackathon project) that makes this table a pure virtual one.
The schema is preserved so is the data returned.

The benefits of doing it virtual are

- no on-disk updates while processing clients
- no potentially failing updates on non-failing disconnect
- less usage of the global qctx thing
- less calls to global storage_proxy
- simpler support for thrift and alternator clients (today's
  table implementation doesn't track them)
- the need to make virtual tables reg/unreg dynamic

branch: https://github.com/xemul/scylla/tree/br-clients-virtual-table-4
tests: manual(dev), unit(dev)

The manual test used 80-shards node and 1M connections from
1k different IP addresses.
"

* 'br-clients-virtual-table-4' of https://github.com/xemul/scylla:
  test: Add cql-pytest sanity test for system.clients table
  client_data: Sanitize connection_notifier
  transport: Indentation fix after previous patch
  code: Remove old on-disk version of system.clients table
  system_keyspace: Add clients_v virtual table
  protocol_server: Add get_client_data call
  transport: Track client state for real
  transport: Add stringifiers to client_data class
  generic_server: Gentle iterator
  generic_server: Type alias
  docs: Add system.clients description
2022-02-22 20:58:25 +03:00
Nadav Har'El
7181a6757a test/cql-pytest: add a couple of tests for static columns
This patch adds two tests for two interesting edge cases in the behavior
of static columns in Scylla. We already have a lot of tests for static
columns in other frameworks (C++ unit tests, cql and dtest), but the two
cases here are issues where specifically we weren't sure how Cassandra
behaves in those cases - and this can most easily be checked in the
test/cql-pytest framework.

The first test, test_static_not_selected, is a reproducer for issue #10091.
This issue was reported by a user @aohotnik, who was surprised by the
fact that Scylla returns empty values, instead of nothing, when selecting
regular columns of a non-existent row if the partition has a static
column set. The test demonstrates a difference between Scylla and
Cassandra, so it is marked "xfail" - it passes on Cassandra and fails on
Scylla. If later we decide that both Scylla's and Cassandra's behaviours
are reasonable and both can be considered "correct", we can change this
test to except Scylla's result as well and it will beging to pass.

The second test, test_missing_row_with_static, shows that SELECT of a
non-existent row returns nothing - even if the partition has a static
column. The behavior in this case is identical in Scylla and Cassandra,
so this test passes. This contrasts with the analogous situation in LWT
UPDATE from issue #10081, where the IF condition is expected to see the
static column value.

Refs #10081
Refs #10091

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20220220120418.831540-1-nyh@scylladb.com>
2022-02-21 16:04:57 +02:00
Nadav Har'El
6476a64185 test/cql-pytest: reproducer for JSON scientific-notation integer problem
The JSON standard specifies numbers without making a distinction of what
is "an integer" and what is "floating point". The value 1e6 is a valid
number, and although it is customary in C that 1e6 is a floating-point
constant, as a JSON constant there is nothing inherently "non-integer" about
it - it is a whole number. This is why I believe CQL commands such as

    CREATE TABLE t(pk int PRIMARY KEY, v int);
    INSERT INTO t JSON '{"pk": 1, "v": 1e6}';

should be allowed, as 1e6 is a whole number and fits in the range of
Scylla's int.

The included tests show that, unfortunately, 1e6 is *not* currently
allowed to be assigned to an integer. The test currently fail on both
Scylla and Cassandra - and we believe this failure to be a bug in both,
so the test is marked with xfail (known to fail) and cassandra-bug
(known failure on Cassandra considered to be a bug).

Refs #10100

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20220220141602.843783-1-nyh@scylladb.com>
2022-02-20 17:01:22 +02:00
Pavel Emelyanov
9c06897ec3 test: Add cql-pytest sanity test for system.clients table
Check that SELECT {columns} FROM system.clients returns back only local
connection of cql type (because there are no others during the test).

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-02-18 15:02:26 +03:00
Botond Dénes
1e038b40cf test/cql-pytest: add tests for scylla-sstable's dump commands
The tests are smoke-tests: they mostly check that scylla doesn't crash
while dumping and it produces *some* output. When dumping json, the test
checks that it is valid json.
2022-02-17 15:24:24 +02:00