scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-04-29 12:47:02 +00:00

Author	SHA1	Message	Date
Kamil Braun	858803cc2c	test/pylib: pool: replace `steal` with `put(is_dirty=True)` The pool usage was kind of awkward previously: if the user of a pool decided that a previously borrowed object should no longer be used, it was their responsibility to destroy the object (releasing associated resources and so on) and then call `steal()` on the pool to free space for a new object. Change the interface. Now the `Pool` constructor obtains a `destroy` function in addition to the `build` function. The user calls the function `put` to return both objects that are still usable and those that aren't. For the latter, they set `is_dirty=True`. The pool will 'destroy' the object with the provided function, which could mean e.g. releasing associated resources. For example, instead of: ``` if self.cluster.is_dirty: self.clusters.stop() self.clusters.release_ips() self.clusters.steal() else: self.clusters.put(self.cluster) ``` we can now use: ``` self.clusters.put(self.cluster, is_dirty=self.cluster.is_dirty) ``` (assuming that `self.clusters` is a pool constructed with a `destroy` function that stops the cluster and releases its IPs.) Also extend the interface of the context manager obtained by `instance()` - the user must now pass a flag `dirty_on_exception`. If the context manager exits due to an exception and that flag was `True`, the object will be considered dirty. The dirty flag can also be set manually on the context manager. For example: ``` async with (cm := pool.instance(dirty_on_exception=True)) as server: cm.dirty = await run_test(test, server) # It will also be considered dirty if run_test throws an exception ```	2023-01-26 11:58:00 +01:00
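The interface described in the commit above can be illustrated with a minimal sketch. This is not the actual `test/pylib` implementation: the class bodies, the `max_size` parameter, and the internal attribute names are assumptions made for illustration. Only the names `build`, `destroy`, `put(is_dirty=...)`, `instance(dirty_on_exception=...)` and the `dirty` attribute come from the commit message.

```python
import asyncio


class Pool:
    """Hypothetical bounded async object pool (illustrative sketch only)."""

    def __init__(self, max_size, build, destroy):
        self.max_size = max_size      # assumed upper bound on live objects
        self.build = build            # async callable creating a new object
        self.destroy = destroy        # async callable releasing an object
        self.free = []                # idle objects ready to be borrowed
        self.total = 0                # live objects (idle + borrowed)
        self.cond = asyncio.Condition()

    async def get(self):
        async with self.cond:
            await self.cond.wait_for(
                lambda: self.free or self.total < self.max_size)
            if self.free:
                return self.free.pop()
            self.total += 1           # reserve a slot before building
        try:
            return await self.build()
        except BaseException:
            async with self.cond:     # building failed: give the slot back
                self.total -= 1
                self.cond.notify()
            raise

    async def put(self, obj, is_dirty):
        if is_dirty:
            await self.destroy(obj)   # release resources before freeing the slot
        async with self.cond:
            if is_dirty:
                self.total -= 1
            else:
                self.free.append(obj)
            self.cond.notify()

    def instance(self, dirty_on_exception):
        return Instance(self, dirty_on_exception)


class Instance:
    """Context manager that borrows an object and returns it on exit."""

    def __init__(self, pool, dirty_on_exception):
        self.pool = pool
        self.dirty_on_exception = dirty_on_exception
        self.dirty = False            # may also be set manually by the user
        self.obj = None

    async def __aenter__(self):
        self.obj = await self.pool.get()
        return self.obj

    async def __aexit__(self, exc_type, exc, tb):
        if exc_type is not None and self.dirty_on_exception:
            self.dirty = True
        await self.pool.put(self.obj, is_dirty=self.dirty)
```

In this sketch a dirty object is destroyed before its slot is released, so a waiter on the pool never builds a replacement while the old object still holds resources.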
Alexey Novikov	ce96b472d3	prevent populating cache with expired rows from sstables change row purge condition for compacting_reader to remove all expired rows to avoid read performance problems when there are many expired tombstones in row cache Refs #2252 Closes #12565	2023-01-25 12:59:40 +01:00
Kamil Braun	5bc7f0732e	Merge 'test.py: manual cluster pool handling for Python suite' from Alecco From reviews of https://github.com/scylladb/scylladb/pull/12569, avoid using `async with` and access the `Pool` of clusters with `get()`/`put()`. Closes #12612 * github.com:scylladb/scylladb: test.py: manual cluster handling for PythonSuite test.py: stop cluster if PythonSuite fails to start test.py: minor fix for failed PythonSuite test	2023-01-24 17:37:55 +01:00
Nadav Har'El	55558e1bd7	test/alternator: check operation on invalid TableName Issue #12538 suggested that maybe Alternator shouldn't bother reporting an invalid table name in item operations like PutItem, and that it's enough to report that the table doesn't exist. But the test added in this patch shows that DynamoDB, like Alternator, reports the invalid table name in this case - not just that the table doesn't exist. That should make us think twice before acting on issue #12538. If we do what this issue recommended, this test will need to be fixed (e.g., to accept as correct both types of errors). Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #12608	2023-01-24 14:14:39 +02:00
Alejo Sanchez	f236d518c6	test.py: manual cluster handling for PythonSuite Instead of complex async with logic, use manual cluster pool handling. Revert the discard() logic in Pool from a recent commit. Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>	2023-01-24 11:38:17 +01:00
Nadav Har'El	ccc2c6b5dd	Merge 'test/pylib: scylla_cluster: improve server startup check' from Kamil Braun Don't use a range scan, which is very inefficient, to perform a query for checking CQL availability. Improve logging when waiting for server startup times out. Provide details about the failure: whether we managed to obtain the Host ID of the server and whether we managed to establish a CQL connection. Closes #12588 * github.com:scylladb/scylladb: test/pylib: scylla_cluster: better logging for timeout on server startup test/pylib: scylla_cluster: use less expensive query to check for CQL availability	2023-01-23 17:00:52 +02:00
Kamil Braun	8a1ea6c49f	test/pylib: scylla_cluster: better logging for timeout on server startup Waiting for server startup is a multi-step procedure: after we start the actual process, we will: - try to obtain the Host ID (by querying a REST API endpoint) - then try to connect a CQL session - then try to perform a CQL query The steps are repeated every .1 second until we reach a timeout (the Host ID step is skipped if we previously managed to obtain it). On timeout we'd only get a generic "failed to start server" message, it wouldn't say what we managed to do and what not. For example, on one of the failed jobs on Jenkins I observed this timeout error. Looking at the logs of the server, it turned out that the server printed the "initialization completed" message more than 2 minutes before the actual timeout happened. So for 2 minutes, the test framework either couldn't obtain the Host ID, or couldn't establish a CQL connection, or couldn't perform a CQL query, but I wasn't able to determine fully which one of these was the case. Improve the code by printing whether we managed to get the Host ID of the server and if so - whether we managed to connect to CQL.	2023-01-23 15:59:42 +01:00
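The multi-step startup check described in the commit above could look roughly like the following sketch. The three probe callables (`get_host_id`, `connect_cql`, `run_query`) and the function name are hypothetical stand-ins for the REST and CQL calls. The real framework code is not shown in this log, and the third error case (query failed after connecting) goes slightly beyond what the commit message says is reported.

```python
import asyncio
import time


async def wait_for_startup(get_host_id, connect_cql, run_query,
                           timeout, interval=0.1):
    """Poll the server until a CQL query succeeds or the timeout expires.

    All three probes are assumed to be async callables supplied by the caller:
    get_host_id() returns a string or None, the other two return a bool.
    """
    host_id = None
    connected = False
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if host_id is None:               # skipped once the Host ID is known
            host_id = await get_host_id()
        if host_id is not None:
            connected = await connect_cql()
            if connected and await run_query():
                return host_id
        await asyncio.sleep(interval)
    # On timeout, say how far we got instead of a generic message.
    if host_id is None:
        detail = "failed to obtain Host ID"
    elif not connected:
        detail = f"obtained Host ID {host_id} but failed to connect to CQL"
    else:
        detail = f"obtained Host ID {host_id} and connected to CQL, but the query failed"
    raise TimeoutError(f"failed to start server: {detail}")
```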
Kamil Braun	0e591606a5	test/pylib: scylla_cluster: use less expensive query to check for CQL availability The previous CQL query used a range scan which is very inefficient, even for local tables. Also add a comment explaining why we need this query.	2023-01-23 15:59:05 +01:00
Nadav Har'El	54f174a1f4	Merge 'test.py: handle broken clusters for Python suite' from Alecco If the after test check fails (is_after_test_ok is False), discard the cluster and raise exception so context manager (pool) does not recycle it. Ignore exception re-raised by the context manager. Fixes #12360 Closes #12569 * github.com:scylladb/scylladb: test.py: handle broken clusters for Python suite test.py: Pool discard method	2023-01-22 19:58:12 +02:00
Botond Dénes	7f9b39009c	reader_concurrency_semaphore_test: leak test: relax iteration limit This test creates random dummy reads and simulates a query with them. The test works in terms of iteration (tick), advancing each simulating read in each iteration. To prevent infinite runtime an iteration limit of 100 was added to detect a non-converging test and kill it. This limit proved too strict however and in this patch we bump it to 1000 to prevent some unlucky seed making this test fail, as seen recently in CI. Closes #12580	2023-01-20 15:39:13 +02:00
Nadav Har'El	3d78dbd9f2	test/cql-pytest: regression tests for null lookup in local SI We noticed that old branches of Scylla had problems with looking up a null value in a local secondary index - hanging or crashing. This patch includes tests to reproduce these bugs. The tests pass on current master - apparently this bug has already been fixed, but we didn't have a regression test for it. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #12570	2023-01-19 23:58:33 +02:00
Alejo Sanchez	c886a05b37	test.py: Pool discard method Add a context manager discard() method to tell it to discard the object. Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>	2023-01-19 21:43:45 +01:00
Kamil Braun	2f84e820fd	test/pylib: scylla_cluster: return error details from test framework endpoints If an endpoint handler throws an exception, the details of the exception are not returned to the client. Normally this is desirable so that information is not leaked, but in this test framework we do want to return the details to the client so it can log a useful error message. Do it by wrapping every handler into a catch clause that returns the exception message. Also modify a bit how HTTPErrors are rendered so it's easier to discern the actual body of the error from other details (such as the params used to make the request etc.) Before: ``` E test.pylib.rest_client.HTTPError: HTTP error 500: 500 Internal Server Error E E Server got itself in trouble, params None, json None, uri http+unix://api/cluster/before-test/test_stuff ``` After: ``` E test.pylib.rest_client.HTTPError: HTTP error 500, uri: http+unix://api/cluster/before-test/test_stuff, params: None, json: None, body: E Failed to start server at host 127.155.129.1. E Check the log files: E /home/kbraun/dev/scylladb/testlog/test.py.dev.log E /home/kbraun/dev/scylladb/testlog/dev/scylla-1.log ``` Closes #12563	2023-01-19 17:47:13 +02:00
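The "wrap every handler into a catch clause" idea from the commit above can be sketched independently of any HTTP library. The decorator name, the `(status, body)` return convention and the sample handler are assumptions for illustration; the actual framework endpoints are not reproduced here.

```python
import asyncio
import functools


def return_error_details(handler):
    """Wrap an async handler so exceptions become a 500 response with details."""
    @functools.wraps(handler)
    async def wrapper(*args, **kwargs):
        try:
            return 200, await handler(*args, **kwargs)
        except Exception as exc:
            # Test-only behavior: expose the message so the client can log it.
            return 500, f"{type(exc).__name__}: {exc}"
    return wrapper


@return_error_details
async def before_test(name):
    # Hypothetical handler that always fails, for demonstration.
    raise RuntimeError(f"Failed to start server for {name}")
```

Calling `asyncio.run(before_test("test_stuff"))` in this sketch returns a 500 status together with the exception text rather than a bare "Internal Server Error".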
Kamil Braun	3ed3966f13	test/pylib: scylla_cluster: release cluster IPs when stopping ScyllaClusterManager When we obtained a new cluster for a test case after the previous test case left a dirty cluster, we would release the old cluster's used IP addresses (`_before_test` function). However, we would not release the last cluster's IP after the last test case. We would run out of IPs with sufficiently many test files or `--repeat` runs. Fix this. Also reorder the operations a bit: stop the cluster (and release its IPs) before freeing up space in the cluster pool (i.e. call `self.cluster.stop()` before `self.clusters.steal()`). This reduces concurrency a bit - fewer Scyllas running at the same time, which is good (the pool size gives a limit on the desired max number of concurrently running clusters). Killing a cluster is quick so it won't make a significant difference for the next guy waiting on the pool. Closes #12564	2023-01-19 17:46:46 +02:00
Nadav Har'El	18be50582d	test/cql-pytest: add tests for behavior of unset values Recently, commit `0b418fa` made the checking for "unset" values more centralized and more robust, and as the tests added in this patch show, the situation is good (and in particular, that #10358 is solved). The tests in this patch check that the behavior of "unset" values in the CQL v4 protocol matches Cassandra's behavior and its documentation, and how it compares to our wishes of how we want unset values to behave. One of these tests fails on Cassandra (we consider this a Cassandra bug). One test fails on Scylla because it doesn't yet support arithmetic expressions (Refs #2693). Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #12534	2023-01-19 15:48:07 +02:00
Nadav Har'El	9433108158	Merge 'Allow transient list values to contain NULLs' from Avi Kivity The CQL protocol and specification call for lists with NULLs in some places. For example, the statement: ```cql UPDATE tab SET x = 3 IF y IN (1, 2, NULL) WHERE pk = 4 ``` has a list `(1, 2, NULL)` that contains NULL. Although the syntax is tuple-like, the value is a list; consider the same statement as a prepared statement: ```cql UPDATE tab SET x = :x IF y IN :y_values WHERE pk = :pk ``` `:y_values` must have a list type, since the number of elements is unknown. Currently, this is done with special paths inside LWT that bypass normal evaluation, but if we want to unify those paths, we must allow NULLs in lists (except in storage). This series does that. Closes #12411 * github.com:scylladb/scylladb: test: materialized view: add test exercising synthetic empty-type columns cql3: expr: relax evaluate_list() to allow allow NULL elements types: allow lists with NULL test: relax NULL check test predicate cql3, types: validate listlike collections (sets, lists) for storage types: make empty type deserialize to non-null value	2023-01-19 15:15:16 +02:00
Botond Dénes	d661d03057	Merge 'main, test: integrate perf tools into scylla' from Kefu Chai following tests are integrated into scylla executable - perf_fast_forward - perf_row_cache_update - perf_simple_query - perf_row_cache_update - perf_sstable before this change ```console $ size build/release/scylla text data bss dec hex filename 82284664 288960 335897 82909521 4f11951 build/release/scylla $ ls -l build/release/scylla -rwxrwxr-x 1 kefu kefu 1719672112 Jan 19 17:51 build/release/scylla ``` after this change ```console $ size build/release/scylla text data bss dec hex filename 84349449 289424 345257 84984130 510c142 build/release/scylla $ ls -l build/release/scylla -rwxrwxr-x 1 kefu kefu 1774204800 Jan 19 17:52 build/release/scylla ``` Fixes #12484 Closes #12558 * github.com:scylladb/scylladb: main: move perf_sstable into scylla main: move perf_row_cache_update into scylla test: perf_row_cache_update: add static specifier to local functions main: move perf_fast_forward into scylla main: move perf_simple_query into scylla test: extract debug::the_database out main: shift the args when checking exec_name main: extract lookup_main_func() out	2023-01-19 15:01:30 +02:00
Kamil Braun	147dd73996	test/pylib: scylla_cluster: mark cluster as dirty if it fails to boot If a cluster fails to boot, it saves the exception in `self.start_exception` variable; the exception will be rethrown when a test tries to start using this cluster. As explained in `before_test`: ``` def before_test(self, name) -> None: """Check that the cluster is ready for a test. If there was a start error, throw it here - the server is running when it's added to the pool, which can't be attributed to any specific test, throwing it here would stop a specific test.""" ``` It's arguable whether we should blame some random test for a failure that it didn't cause, but nevertheless, there's a problem here: the `start_exception` will be rethrown and the test will fail, but then the cluster will be simply returned to the pool and the next test will attempt to use it... and so on. Prevent this by marking the cluster as dirty the first time we rethrow the exception. Closes #12560	2023-01-19 14:26:57 +02:00
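A minimal sketch of the fix described in the commit above, assuming a simplified cluster object. The class name and constructor are hypothetical; `start_exception`, `is_dirty` and `before_test` are the names used in the commit message.

```python
class ClusterSketch:
    """Simplified stand-in for a test cluster (illustrative only)."""

    def __init__(self):
        self.start_exception = None   # set if the cluster failed to boot
        self.is_dirty = False

    def before_test(self, name):
        """Check that the cluster is ready for a test."""
        if self.start_exception is not None:
            # Mark the cluster dirty before rethrowing, so the pool destroys
            # it instead of handing it to the next test.
            self.is_dirty = True
            raise self.start_exception
```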
Avi Kivity	9029b8dead	test: disable commitlog O_DSYNC, preallocation Commitlog O_DSYNC is intended to make Raft and schema writes durable in the face of power loss. To make O_DSYNC performant, we preallocate the commitlog segments, so that the commitlog writes only change file data and not file metadata (which would require the filesystem to commit its own log). However, in tests, this causes each ScyllaDB instance to write 384MB of commitlog segments. This overloads the disks and slows everything down. Fix this by disabling O_DSYNC (and therefore preallocation) during the tests. They can't survive power loss, and run with --unsafe-bypass-fsync anyway. Closes #12542	2023-01-19 11:14:05 +01:00
Kefu Chai	7f5bb19d1f	main: move perf_sstable into scylla * configure.py: - include `test/perf/perf_sstable` and its dependencies in scylla_perfs * test/perf/perf_sstable.cc: change `main()` to `perf::scylla_sstable_main()` * test/perf/entry_point.hh: add `perf::scylla_sstable_main()` * main.cc: - dispatch "perf-sstable" subcommand to `perf::scylla_sstable_main` before this change, we have a tool at `test/perf/perf_sstable` for running performance tests by exercising sstable related operations. after this change, the `test/perf/perf_sstable` is integrated into `scylla` as a subcommand. so we can run `scylla perf-sstable [options, ...]` to perform the same tests previously driven by the tool. Fixes #12484 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-01-19 17:42:52 +08:00
Kefu Chai	240f2c6f00	main: move perf_row_cache_update into scylla * configure.py: - include `test/perf/perf_row_cache_update.cc` in scylla_perfs * main.cc: - dispatch "perf-row-cache-update" subcommand to `perf::scylla_row_cache_update_main` * test/perf/perf_fast_forward.cc: change `main()` to `perf::scylla_row_cache_update_main()` * test/perf/entry_point.hh: add `perf::scylla_row_cache_update_main()` before this change, we have a tool at `test/perf/perf_row_cache_update` for running performance tests by updating row cache. after this change, the `test/perf/perf_row_cache_update` is integrated into `scylla` as a subcommand. so we can run `scylla perf-row-cache-update [options, ...]` to perform the same tests previously driven by the tool. Fixes #12484 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-01-19 17:42:46 +08:00
Kefu Chai	4e390b9a05	test: perf_row_cache_update: add static specifier to local functions now that these functions are only used by the same compiling unit, they don't need external linkage. so let's hide them using `static`. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-01-19 17:42:46 +08:00
Kefu Chai	228ccdc1c7	main: move perf_fast_forward into scylla * configure.py: - include `test/perf/perf_simple_query.cc` in scylla_perfs * main.cc: - dispatch "perf-fast-forward" subcommand to `perf::scylla_fast_forward_main` * test/perf/perf_fast_forward.cc: change `main()` to `perf::scylla_simple_query_main()` * test/perf/entry_point.hh: add `perf::scylla_simple_query_main()` before this change, we have a tool at `test/perf/perf_fast_forward` for running performance tests by fast forwarding the reader. after this change, the `test/perf/perf_fast_forward` is integrated into `scylla` as a subcommand. so we can run `scylla perf-fast-forward [options, ...]` to perform the same tests previously driven by the tool. Fixes #12484 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-01-19 17:42:40 +08:00
Kefu Chai	09de031cab	main: move perf_simple_query into scylla * configure.py: - include scylla_perfs in scylla - move 'test/lib/debug.cc' down scylla_perfs, as the latter uses `debug::the_database` - link `scylla` against seastar_testing_libs also. because we use the helpers in `test/lib/random_utils.hh` for generating random numbers / sequences in `perf_simple_query.cc`, and `random_utils.hh` references `seastar::testing::local_random_engine` as a local RNG. but `seastar::testing::local_random_engine` is included in `libseastar_testing.a` or `libseastar_perf_testing.a`. since we already have the rules for linking against `libseastar_testing.a`, let's just reuse them, and link `scylla` against this new dependency. * main.cc: - dispatch "perf-simple-query" subcommand to `perf::scylla_simple_query_main` * test/perf/perf_simple_query.cc: change `main()` to `perf::scylla_simple_query_main()` * test/perf/entry_point.hh: define the main function entries so `main.cc` can find them. it's quite like how we collect the entries in `tools/entry_point.hh` before this change, we have a tool at `test/perf/perf_simple_query` for running performance test by sending simple query to a single-node cluster. after this change, the `test/perf/perf_simple_query` is integrated into `scylla` as a subcommand. so we can run `scylla perf-simple-query [options, ...]` to perform the same tests previously driven by the tool. Fixes #12484 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-01-19 17:42:30 +08:00
Kefu Chai	c65692a13a	test: extract debug::the_database out we want to integrate some perf test into scylla executable, so we can run them on a regular basis. but `test/lib/cql_test_env.cc` shares `debug::the_database` with `main.cc`, so we cannot just compile them into a single binary without changing them. before this change, both `test/lib/cql_test_env.cc` and `main.cc` define `debug::the_database`. after this change, `debug::the_database` is extracted into `debug.cc`, so it compiles into a separate compiling unit. and scylla and tests using seastar testing framework are linked against `debug.cc` via `scylla_core` respectively. this paves the road to integrating scylla with the tests linking against `test/lib/cql_test_env.cc`. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-01-19 17:42:23 +08:00
Nadav Har'El	0ff0c80496	test/cql-pytest: un-xfail tests for UNSET values Commit `0b418fa` improved the error detection of unset values in inappropriate CQL statements, and some of the unit tests translated from Cassandra started to pass, so this patch removes their "xfail" mark. In a couple of places Scylla's error message is worded differently from Cassandra, so the test was modified to look for a shorter string common to both implementations. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #12553	2023-01-19 07:47:08 +02:00
Kefu Chai	6a3b19b53d	test/perf: replace "std::cout <<" with fmt::print() for better readability Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #12559	2023-01-19 07:45:13 +02:00
Avi Kivity	aab5954cfb	Merge 'reader_concurrency_semaphore: add more layers of defense against OOM' from Botond Dénes The reader concurrency semaphore has no mechanism to limit the memory consumption of already admitted reads. Once the collective memory consumption of all the admitted reads is above the limit, all it can do is to not admit any more. Sometimes this is not enough and the memory consumption of the already admitted reads balloons to the point of OOMing the node. This pull-request offers a solution to this: it introduces two more layers of defense above this: a soft and a hard limit. Both are multipliers applied on the semaphore's normal memory limit. When the soft limit threshold is surpassed, all readers but one are blocked via a new blocking `request_memory()` call which is used by the `tracking_file_impl`. The reader to be allowed to proceed is chosen at random, it is the first reader which happens to request memory after the limit is surpassed. This is both very simple and should avoid situations where the algorithm choosing the reader to be allowed to proceed chooses a reader which will then always time out. When the hard limit threshold is surpassed, `reader_concurrency_semaphore::consume()` starts throwing `std::bad_alloc`. This again will result in eliminating whichever reader was unlucky enough to request memory at the right moment. With this, the semaphore is now effectively enforcing an upper bound for memory consumption, defined by the hard limit.
Refs: https://github.com/scylladb/scylladb/issues/11927 Closes #11955 * github.com:scylladb/scylladb: test: reader_concurrency_semaphore_test: add tests for semaphore memory limits reader_permit: expose operator<<(reader_permit::state) reader_permit: add id() accessor reader_concurrency_semaphore: add foreach_permit() reader_concurrency_semaphore: document the new memory limits reader_concurrency_semaphore: add OOM killer reader_concurrency_semaphore: make consume() and signal() private test: stop using reader_concurrency_semaphore::{consume,signal}() directly reader_concurrency_semaphore: move consume() out-of-line reader_permit: consume(): make it exception-safe reader_permit: resource_units::reset(): only call consume() if needed reader_concurrency_semaphore: tracked_file_impl: use request_memory() reader_concurrency_semaphore: add request_memory() reader_concurrency_semaphore: wrap wait list reader_concurrency_semaphore: add {serialize,kill}_limit_multiplier parameters test/boost/reader_concurrency_semaphore_test: dummy_file_impl: don't use hardoced buffer size reader_permit: add make_new_tracked_temporary_buffer() reader_permit: add get_state() accessor reader_permit: resource_units: add constructor for already consumed res reader_permit: resource_units: remove noexcept qualifier from constructor db/config: introduce reader_concurrency_semaphore_{serialize,kill}_limit_multiplier scylla-gdb.py: scylla-memory: extract semaphore stats formatting code scylla-gdb.py: fix spelling of "graphviz"	2023-01-18 17:02:55 +02:00
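The soft and hard limits described in the merge above can be modeled with a small toy class. This is a Python illustration of the policy only, not the C++ `reader_concurrency_semaphore`. The class name, the synchronous `True`/`False` return (where the real call would block), the exact threshold comparisons, and the use of `MemoryError` in place of `std::bad_alloc` are all assumptions.

```python
class MemoryLimitModel:
    """Toy model of a memory limit with serialize (soft) and kill (hard) multipliers."""

    def __init__(self, limit, serialize_multiplier, kill_multiplier):
        self.soft_limit = limit * serialize_multiplier
        self.hard_limit = limit * kill_multiplier
        self.consumed = 0
        self.privileged = None    # the one reader allowed past the soft limit

    def request_memory(self, reader, amount):
        """Return True if the reader may proceed, False if it would block."""
        if self.consumed + amount > self.hard_limit:
            # Hard limit: the unlucky requester is eliminated.
            raise MemoryError(f"reader {reader!r} exceeded the hard limit")
        if self.consumed + amount > self.soft_limit:
            # Soft limit: the first reader to ask becomes the only one served.
            if self.privileged is None:
                self.privileged = reader
            if self.privileged != reader:
                return False
        self.consumed += amount
        return True

    def release(self, amount):
        self.consumed -= amount
        if self.consumed <= self.soft_limit:
            self.privileged = None    # back under the soft limit: unblock all
```

With `limit=100`, `serialize_multiplier=2` and `kill_multiplier=4`, this model serves every reader up to 200 units, serves a single reader between 200 and 400, and raises beyond 400.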
Avi Kivity	9a54cb5deb	Merge 'cql3/expr: make it possible to prepare binary_operator' from Jan Ciołek `prepare_expression` takes an unprepared CQL expression straight from the parser output and prepares it. Preparation consists of various type checks that are needed to ensure that the expression is correct and to reason about it. While `prepare_expression` supports a number of different types of expressions, until now it was impossible to prepare a `binary_operator`. Eventually we would like to be able to prepare all kinds of expressions, so this PR adds the missing support for `binary_operator`. Closes #12550 * github.com:scylladb/scylladb: expr_test: test preparing binary_operator with NULL RHS expr_test: test preparing IS NOT NULL binary_operator expr_test: test preparing binary_operator with LIKE expr_test: test preparing binary_operator with CONTAINS KEY expr_test: test preparing binary_operator with CONTAINS expr_test: test preparing binary_operator with IN expr_test: test preparing binary_operator with =, !=, <, <=, >, >= expr_test: use make__untyped function in existing tests expr_test_utils: add utilities to create untyped_constant expr_test_utils: add make_float_ and make_double_* cql3: expr: make it possible to prepare binary_operator using prepare_expression cql3/expr: check that RHS of IS NOT NULL is a null value when preparing binary operators cql3: expr: pass non-empty keyspace name in prepare_binary_operator cql3: expr: take reference to schema in prepare_binary_operator	2023-01-18 16:55:18 +02:00
Jan Ciolek	ae0e955b90	expr_test: test preparing binary_operator with NULL RHS Make sure that preparing binary_operator works properly when the RHS is NULL. Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>	2023-01-18 12:04:46 +01:00
Jan Ciolek	65b8a09409	expr_test: test preparing IS NOT NULL binary_operator Add a unit test which checks that preparing binary_operators which represent IS NOT NULL works as expected. Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>	2023-01-18 12:04:46 +01:00
Jan Ciolek	5b3e6769f1	expr_test: test preparing binary_operator with LIKE Add a unit test which checks that preparing binary_operators with the LIKE operation works as expected. Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>	2023-01-18 12:04:45 +01:00
Jan Ciolek	e876496f7f	expr_test: test preparing binary_operator with CONTAINS KEY Add a unit test which checks that preparing binary_operators with the CONTAINS KEY operation works as expected. Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>	2023-01-18 12:04:45 +01:00
Jan Ciolek	c6d2e1a03e	expr_test: test preparing binary_operator with CONTAINS Add a unit test which checks that preparing binary_operators with the CONTAINS operation works as expected. Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>	2023-01-18 12:04:45 +01:00
Jan Ciolek	6b147ecaea	expr_test: test preparing binary_operator with IN Add a unit test which checks that preparing binary_operators with the IN operation works as expected. Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>	2023-01-18 12:04:45 +01:00
Jan Ciolek	669d791250	expr_test: test preparing binary_operator with =, !=, <, <=, >, >= Add a unit test which checks that preparing binary_operators with basic comparison operations works as expected. Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>	2023-01-18 12:04:44 +01:00
Jan Ciolek	60803d12a9	expr_test: use make_*_untyped function in existing tests Use the newly introduced convenience methods that create untyped_constant in existing tests. This will make the code more readable by removing visual clutter that came with the previous overly verbose code. Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>	2023-01-18 12:04:44 +01:00
Jan Ciolek	819390f9fe	expr_test_utils: add utilities to create untyped_constant expression tests often need to create instances of untyped_constant. Creating them by hand is tedious because the required code is overly verbose. Having convenience functions for it speeds up test writing. Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>	2023-01-18 12:04:44 +01:00
Jan Ciolek	362bf7f534	expr_test_utils: add make_float_* and make_double_* Add utilities to create float and double values in tests. Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>	2023-01-18 12:04:44 +01:00
Nadav Har'El	48e2d6a541	Merge 'utils: throw error on malformed input in base64 decode' from Marcin Maliszkiewicz Several cases were fixed in these patches, all related to processing of malformed base64 data. The main purpose was to bring the Alternator implementation closer to what DynamoDB does. We now: - Throw error when padding is missing during base64 decoding - Throw error when base64 data is malformed - In alternator when invalid base64 data is fetched from DB (as opposed to being part of user's request) we now exclude such row during filtering Additionally some small code quality improvements: - avoid unnecessary type conversions in calls to rjson:from_strings functions - avoid some copy constructions in calls to rjson:from_strings functions Fixes https://github.com/scylladb/scylladb/issues/6487 Closes #11944 * github.com:scylladb/scylladb: alternator: evaluate expressions as false for stored malformed binary data rjson: avoid copy constructors in from_string calls when possible alternator: remove unused parameters from describe_items func utils: throw error on malformed input in base64 decode utils: throw error on missing padding in base64 decode	2023-01-18 12:40:57 +02:00
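The behavior described in the merge above (reject malformed or unpadded base64 in requests, treat malformed stored data as a non-match during filtering) can be illustrated with Python's standard library. This is an analogy, not Scylla's C++ decoder; both function names are hypothetical.

```python
import base64


def strict_b64decode(text):
    """Decode base64, rejecting missing padding and non-alphabet characters."""
    try:
        # validate=True makes the decoder reject characters outside the
        # base64 alphabet instead of silently discarding them.
        return base64.b64decode(text, validate=True)
    except ValueError as exc:     # binascii.Error is a ValueError subclass
        raise ValueError(f"malformed base64 input: {exc}") from exc


def stored_value_matches(stored, expected):
    """Filter predicate: malformed stored data evaluates as False, not an error."""
    try:
        return strict_b64decode(stored) == expected
    except ValueError:
        return False
```

Here `strict_b64decode("YQ==")` yields `b"a"`, while `"YQ"` (missing padding) and `"Y!Q="` (invalid character) both raise `ValueError`.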
Avi Kivity	561f4ca057	test: materialized view: add test exercising synthetic empty-type columns Materialized views inject synthetic empty-type columns in some conditions. Since we just touched empty-type serialization/deserialization, add a test to exercise it and make sure it still works.	2023-01-18 10:38:24 +02:00
Avi Kivity	04925a7b29	cql3: expr: relax evaluate_list() to allow allow NULL elements Tests are similarly relaxed. A test is added in lwt_test to show that insertion of a list with NULL is still rejected, though we allow NULLs in IF conditions. One test is changed from a list of longs to a list of ints, to prevent churn in the test helper library.	2023-01-18 10:38:24 +02:00
Avi Kivity	390a0ca47b	types: allow lists with NULL Allow transient lists that contain NULL throughout the evaluation machinery. This makes it possible to evaluate things like `IF col IN (1, 2, NULL)` without hacks, once LWT conditions are converted to expressions. A few tests are relaxed to accommodate the new behavior: - cql_query_test's test_null_and_unset_in_collections is relaxed to allow `WHERE col IN ?`, with the variable bound to a list containing NULL; now it's explicitly allowed - expr_test's evaluate_bind_variable_validates_no_null_in_list was checking generic lists for NULLs, and was similarly relaxed (and renamed) - expr_test's evaluate_bind_variable_validates_null_in_lists_recursively was similarly relaxed to allow NULLs.	2023-01-18 10:38:24 +02:00
Avi Kivity	00145f9ada	test: relax NULL check test predicate When we start allowing NULL in lists in some contexts, the exact location where an error is raised (when it's disallowed) will change. To prepare for that, relax the exception check to just ensure the word NULL is there, without caring about the exact wording.	2023-01-18 10:38:24 +02:00
Avi Kivity	da4abccf89	types: make empty type deserialize to non-null value The empty type is used internally to implement CQL sets on top of multi-cell maps. The map's key (an atomic cell) represents the set value, and the map's value is discarded. Since it's unneeded we use an internal "empty" type. Currently, it is deserialized into a `data_value` object representing a NULL. Since it's discarded, it really doesn't matter. However, with the impending change to allow NULLs in lists, it does matter: 1. the coordinator sets the 'collections_as_maps' flag for LWT requests since it wants list indexes (this affects sets too). 2. the replica responds by serializing a set as a map. 3. since we start allowing NULL collection values, we now serialize those NULLs as NULLs. 4. the coordinator deserializes the map, and complains about NULL values, since those are not supported. The solution is simple: deserialize the empty value as a non-NULL object. We create an empty empty_type_representation and add the scaffolding needed. Serialization and deserialization are already coded, they were just never called for NULL values (which were serialized with size 0, in collections, rather than size -1, luckily). A unit test is added.	2023-01-18 10:38:24 +02:00
Tomasz Grabiec	563998b69a	Merge 'raft: improve group 0 reconfiguration failure handling' from Kamil Braun Make it so that failures in `removenode`/`decommission` don't lead to reduced availability, and any leftovers in group 0 can be removed by `removenode`: - In `removenode`, make the node a non-voter before removing it from the token ring. This removes the possibility of having a group 0 voting member which doesn't correspond to a token ring member. We can still be left with a non-voter, but that doesn't reduce the availability of group 0. - As above but for `decommission`. - Make it possible to remove group 0 members that don't correspond to token ring members from group 0 using `removenode`. - Add an API to query the current group 0 configuration. Fixes #11723. Closes #12502 * github.com:scylladb/scylladb: test: test_topology: test for removing garbage group 0 members test/pylib: move some utility functions to util.py db: system_keyspace: add a virtual table with raft configuration db: system_keyspace: improve system.raft_snapshot_config schema service: storage_service: better error handling in `decommission` service: storage_service: fix indentation in removenode service: storage_service: make `removenode` work for group 0 members which are not token ring members service/raft: raft_group0: perform read_barrier in wait_for_raft service: storage_service: make leaving node a non-voter before removing it from group 0 in decommission/removenode test: test_raft_upgrade: remove test_raft_upgrade_with_node_remove service/raft: raft_group0: link to Raft docs where appropriate service/raft: raft_group0: more logging service/raft: raft_group0: separate function for checking and waiting for Raft	2023-01-17 21:23:15 +01:00
Kamil Braun	d134c458e5	test/pylib: increase timeout when waiting for cluster before test Increase the timeout from default 5 minutes to 10 minutes. Sent as a workaround for #12546 to unblock next promotions. Closes #12547	2023-01-17 21:03:09 +02:00
Kamil Braun	4f1c317bdc	test: test_raft_upgrade: stop servers gracefully in test_recovery_after_majority_loss This test is frequently failing due to a timeout when we try to restart one of the nodes. The shutdown procedure apparently hangs when we try to stop the `hints_manager` service, e.g.: ``` INFO 2023-01-13 03:18:02,946 [shard 0] hints_manager - Asked to stop INFO 2023-01-13 03:18:02,946 [shard 0] hints_manager - Stopped INFO 2023-01-13 03:18:02,946 [shard 0] hints_manager - Asked to stop INFO 2023-01-13 03:18:02,946 [shard 1] hints_manager - Asked to stop INFO 2023-01-13 03:18:02,946 [shard 1] hints_manager - Stopped INFO 2023-01-13 03:18:02,946 [shard 1] hints_manager - Asked to stop INFO 2023-01-13 03:18:02,946 [shard 1] hints_manager - Stopped INFO 2023-01-13 03:22:56,997 [shard 0] hints_manager - Stopped ``` observe the 5 minute delay at the end. There is a known issue about `hints_manager` stop hanging: #8079. Now, for some reason, this is the only test case that is hitting this issue. We don't completely understand why. There is one significant difference between this test case and others: this is the only test case which kills 2 (out of 3) servers in the cluster and then tries to gracefully shutdown the last server. There's a hypothesis that the last server gets stuck trying to send hints to the killed servers. We weren't able to prove/falsify it yet. But if it's true, then this patch will: - unblock next promotions, - give us some important information when we see that the issue stops appearing. In the patch we shutdown all servers gracefully instead of killing them, like we do in the other test cases. Closes #12548	2023-01-17 20:51:09 +02:00
Kamil Braun	5545547d07	test: test_topology: test for removing garbage group 0 members Verify that `removenode` can remove group 0 members which are not token ring members.	2023-01-17 12:28:00 +01:00
Kamil Braun	c959ec455a	test/pylib: move some utility functions to util.py They were used in test_raft_upgrade, but we want to use them in other test files too.	2023-01-17 12:28:00 +01:00

4151 Commits