scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-04-29 04:37:00 +00:00

Author	SHA1	Message	Date
Alejo Sanchez	cf3b8d7edc	pytest/topology: check snapshot transfer Test snapshot transfer by reducing the snapshot threshold on initial servers (3 and 1 trailing). Then creates a table, and does 3 extra schema changes (add column), triggering at least 2 snapshots. Then brings a new server to the cluster, which will get the schema through a snapshot. Then the test stops the initial servers and verifies the table schema is up to date on the new server. Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>	2023-02-07 16:09:07 +01:00
Alejo Sanchez	9ceb6aba81	test/pylib: one-shot error injection helper Existing helper with async context manager only worked for non one-shot error injections. Fix it and add another helper for one-shot without a context manager. Fix tests using the previous helper. Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>	2023-02-02 16:37:21 +01:00
Kamil Braun	40142a51d0	test: topology: wait for token ring/group 0 consistency after decommission There was a check for immediate consistency after a decommission operation has finished in one of the tests, but it turns out that also after decommission it might take some time for token ring to be updated on other nodes. Replace the check with a wait. Also do the wait in another test that performs a sequence of decommissions. We won't attempt to start another decommission until every node learns that the previously decommissioned node has left. Closes #12686	2023-02-01 16:49:22 +02:00
Nadav Har'El	132af20057	Merge 'test/pylib: scylla_cluster: ensure there's space in the cluster pool when running a sequence of tests' from Kamil Braun `ScyllaClusterManager` is used to run a sequence of test cases from a single test file. Between two consecutive tests, if the previous test left the cluster 'dirty', meaning the cluster cannot be reused, it would free up space in the pool (using `steal`), stop the cluster, then get a new cluster from the pool. Between the `steal` and the `get`, a concurrent test run (with its own instance of `ScyllaClusterManager`) would start, because there was free space in the pool. This resulted in undesirable behavior when we ran tests with `--repeat X` for a large `X`: we would start with e.g. 4 concurrent runs of a test file, because the pool size was 4. As soon as one of the runs freed up space in the pool, we would start another concurrent run. Soon we'd end up with 8 concurrent runs. Then 16 concurrent runs. And so on. We would have a large number of concurrent runs, even though the original 4 runs didn't finish yet. All of these concurrent runs would compete waiting on the pool, and waiting for space in the pool would take longer and longer (the duration is linear w.r.t number of concurrent competing runs). Tests would then time out because they would have to wait too long. Fix that by using the new `replace_dirty` function introduced to the pool. This function frees up space by returning a dirty cluster and then immediately takes it away to be used for a new cluster. Thanks to this, we will only have at most as many concurrent runs as the pool size. For example with --repeat 8 and pool size 4, we would run 4 concurrent runs and start the 5th run only when one of the original 4 runs finishes, then the 6th run when a second run finishes and so on. 
The fix is preceded by a refactor that replaces `steal` with `put(is_dirty=True)` and a `destroy` function passed to the pool (now the pool is responsible for stopping the cluster and releasing its IPs). Fixes #11757 Closes #12549 * github.com:scylladb/scylladb: test/pylib: scylla_cluster: ensure there's space in the cluster pool when running a sequence of tests test/pylib: pool: introduce `replace_dirty` test/pylib: pool: replace `steal` with `put(is_dirty=True)`	2023-02-01 12:37:39 +02:00
Nadav Har'El	681a066923	test/pylib: put UNIX-domain socket in /tmp The "cluster manager" used by the topology test suite uses a UNIX-domain socket to communicate between the cluster manager and the individual tests. The socket is currently located in the test directory but there is a problem: In Linux the length of the path used as a UNIX-domain socket address is limited to just a little over 100 bytes. In Jenkins run, the test directory names are very long, and we sometimes go over this length limit and the result is that test.py fails creating this socket. In this patch we simply put the socket in /tmp instead of the test directory. We only need to do this change in one place - the cluster manager, as it already passes the socket path to the individual tests (using the "--manager-api" option). Tested by cloning Scylla in a very long directory name. A test like ./test.py --mode=dev test_concurrent_schema fails before this patch, and passes with it. Fixes #12622 Closes #12678	2023-02-01 12:37:35 +03:00
Nadav Har'El	f873884b50	test/alternator: unskip test which works on modern Scylla We had one test test_gsi.py::test_gsi_identical that didn't work on KA/LA sstables due to #6157, so it was skipped. Today, Scylla no longer supports writing these old sstable formats, so the test can never find itself running on these versions, so should pass. And indeed it does, and the "skip" marker can be removed. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #12651	2023-01-27 14:10:07 +02:00
Botond Dénes	d358d4d9e9	Merge 'Configure sstable_test_env with tempdir' from Pavel Emelyanov Today's sstable_test_env starts with a default-configured db::config and, thus, sstables_manager. Test cases that run in this env always create a tempdir to store sstable files in on their own. Next patching makes sstable-manager and friends fully control the data-dir path in order to support object storage for sstables in a nice way, and this behavior of tests upsets this ongoing work. Said that, this PR configures sstable_test_env with a tempdir and pins down the cases using it to stick to that directory, rather than to the custom one. Closes #12641 * github.com:scylladb/scylladb: test: Use tempdir from sstable_test_env test: Add tmpdir to sstable test env test: Keep db::config as unique pointer	2023-01-27 13:59:12 +02:00
Kamil Braun	fa9cf81af2	test: topology: verify that group 0 and token ring are consistent After topology changes like removing a node, verify that the set of group 0 members and token ring members is the same. Modify `get_token_ring_host_ids` to only return NORMAL members. The previous version which used the `/storage_service/host_id` endpoint might have returned non-NORMAL members as well. Fixes: #12153 Closes #12619	2023-01-27 14:21:14 +03:00
Botond Dénes	d7ed92bb42	Merge 'Reduce the number of table::make_sstable() overloads' from Pavel Emelyanov There are several helpers to make an sstable for the table and two with most of the arguments are only used by tests. This PR leaves table with just one arg-less call thus making it easier to patch further. Closes #12636 * github.com:scylladb/scylladb: table: Shrink sstables making API tests: Use sstables manager to make sstables distributed_loader: Add helpers to make sstables for reshape/reshard	2023-01-26 14:25:21 +02:00
Kamil Braun	5eadea301e	Merge 'pytest: start after ungraceful stop' from Alecco If a server is stopped suddenly (i.e. not graceful), schema tables might be in inconsistent state. Add a test case and enable Scylla configuration option (force_schema_commit_log) to handle this. Fixes #12218 Closes #12630 * github.com:scylladb/scylladb: pytest: test start after ungraceful stop test.py: enable force_schema_commit_log	2023-01-26 12:08:33 +01:00
Kamil Braun	3eabe04f5d	test/pylib: scylla_cluster: ensure there's space in the cluster pool when running a sequence of tests `ScyllaClusterManager` is used to run a sequence of test cases from a single test file. Between two consecutive tests, if the previous test left the cluster 'dirty', meaning the cluster cannot be reused, it would put the old cluster to the pool with `is_dirty=True`, then get a new cluster from the pool. Between the `put` and the `get`, a concurrent test run (with its own instance of `ScyllaClusterManager`) would start, because there was free space in the pool. This resulted in undesirable behavior when we ran tests with `--repeat X` for a large `X`: we would start with e.g. 4 concurrent runs of a test file, because the pool size was 4. As soon as one of the runs freed up space in the pool, we would start another concurrent run. Soon we'd end up with 8 concurrent runs. Then 16 concurrent runs. And so on. We would have a large number of concurrent runs, even though the original 4 runs didn't finish yet. All of these concurrent runs would compete waiting on the pool, and waiting for space in the pool would take longer and longer (the duration is linear w.r.t number of concurrent competing runs). Tests would then time out because they would have to wait too long. Fix that by using the new `replace_dirty` function introduced to the pool. This function frees up space by returning a dirty cluster and then immediately takes it away to be used for a new cluster. Thanks to this, we will only have at most as many concurrent runs as the pool size. For example with --repeat 8 and pool size 4, we would run 4 concurrent runs and start the 5th run only when one of the original 4 runs finishes, then the 6th run when a second run finishes and so on. Fixes #11757	2023-01-26 11:58:00 +01:00
Kamil Braun	b5ef57ecc2	test/pylib: pool: introduce `replace_dirty` Used to atomically return a dirty object to the pool and then use the space freed by this object to get another object. Unlike `put(is_dirty=True)` followed by `get`, a concurrent waiter cannot take away our space from us. A piece of `get` was refactored to a private function `_build_and_get`, this piece is also used in `replace_dirty`.	2023-01-26 11:58:00 +01:00
Kamil Braun	858803cc2c	test/pylib: pool: replace `steal` with `put(is_dirty=True)` The pool usage was kind of awkward previously: if the user of a pool decided that a previously borrowed object should no longer be used, it was their responsibility to destroy the object (releasing associated resources and so on) and then call `steal()` on the pool to free space for a new object. Change the interface. Now the `Pool` constructor obtains a `destroy` function in addition to the `build` function. The user calls the function `put` to return both objects that are still usable and those that aren't. For the latter, they set `is_dirty=True`. The pool will 'destroy' the object with the provided function, which could mean e.g. releasing associated resources. For example, instead of: ``` if self.cluster.is_dirty: self.clusters.stop() self.clusters.release_ips() self.clusters.steal() else: self.clusters.put(self.cluster) ``` we can now use: ``` self.clusters.put(self.cluster, is_dirty=self.cluster.is_dirty) ``` (assuming that `self.clusters` is a pool constructed with a `destroy` function that stops the cluster and releases its IPs.) Also extend the interface of the context manager obtained by `instance()` - the user must now pass a flag `dirty_on_exception`. If the context manager exits due to an exception and that flag was `True`, the object will be considered dirty. The dirty flag can also be set manually on the context manager. For example: ``` async with (cm := pool.instance(dirty_on_exception=True)) as server: cm.dirty = await run_test(test, server) # It will also be considered dirty if run_test throws an exception ```	2023-01-26 11:58:00 +01:00
Pavel Emelyanov	dd307d8a42	test: Use tempdir from sstable_test_env The test cases in sstable_directory_test use a temporary directory that differs from the one sstables manager starts over. Fix that. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-01-26 11:47:06 +03:00
Pavel Emelyanov	0c3799db71	test: Add tmpdir to sstable test env This adds the test/lib's tmpdir instance _and_ configures the data_file_directories with this path. This makes sure sstables manager and the rest of the test use the same directory for sstables. For now it doesn't change anything, but helps next patching. (A neat side effect of this change is that sstable_test_env is now configured the same way as cql_test_env does) Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-01-26 11:47:06 +03:00
Pavel Emelyanov	fd559f3b81	tests: Use sstables manager to make sstables This test uses two many-args helpers from table class to create sstables with desired parameters. The table API in question is not used by any other code but these few places, so it's better to open-code it. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-01-26 10:47:39 +03:00
Pavel Emelyanov	9ccae1be18	test: Keep db::config as unique pointer The goal is to make it possible to make config with custom-initialized options in test_env::impl's constructor initializer list (next patch). Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-01-25 19:38:47 +03:00
Kamil Braun	a0ff33e777	test/pylib: scylla_cluster: don't leak server if stopping it fails `ScyllaCluster.server_stop` had this piece of code: ``` server = self.running.pop(server_id) if gracefully: await server.stop_gracefully() else: await server.stop() self.stopped[server_id] = server ``` We observed `stop_gracefully()` failing due to a server hanging during shutdown. We then ended up in a state where neither `self.running` nor `self.stopped` had this server. Later, when releasing the cluster and its IPs, we would release that server's IP - but the server might have still been running (all servers in `self.running` are killed before releasing IPs, but this one wasn't in `self.running`). Fix this by popping the server from `self.running` only after `stop_gracefully`/`stop` finishes. Make an analogous fix in `server_start`: put `server` into `self.running` before we actually start it. If the start fails, the server will be considered "running" even though it isn't necessarily, but that is OK - if it isn't running, then trying to stop it later will simply do nothing; if it is actually running, we will kill it (which we should do) when clearing after the cluster; and we don't leak it. Closes #12613	2023-01-25 16:58:02 +02:00
Alejo Sanchez	878cb45c24	pytest: test start after ungraceful stop Test case for a start of a server after it was stopped suddenly (instead of gracefully). This could cause commitlog flush issues. Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>	2023-01-25 14:49:27 +01:00
Alejo Sanchez	ccbd89f0cd	test.py: enable force_schema_commit_log To handle start after ungraceful stop, enable separate schema commit log from server start. Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>	2023-01-25 14:49:27 +01:00
Alexey Novikov	ce96b472d3	prevent populating cache with expired rows from sstables change row purge condition for compacting_reader to remove all expired rows to avoid read performance problems when there are many expired tombstones in row cache Refs #2252 Closes #12565	2023-01-25 12:59:40 +01:00
Kamil Braun	5bc7f0732e	Merge 'test.py: manual cluster pool handling for Python suite' from Alecco From reviews of https://github.com/scylladb/scylladb/pull/12569, avoid using `async with` and access the `Pool` of clusters with `get()`/`put()`. Closes #12612 * github.com:scylladb/scylladb: test.py: manual cluster handling for PythonSuite test.py: stop cluster if PythonSuite fails to start test.py: minor fix for failed PythonSuite test	2023-01-24 17:37:55 +01:00
Nadav Har'El	55558e1bd7	test/alternator: check operation on invalid TableName Issue #12538 suggested that maybe Alternator shouldn't bother reporting an invalid table name in item operations like PutItem, and that it's enough to report that the table doesn't exist. But the test added in this patch shows that DynamoDB, like Alternator, reports the invalid table name in this case - not just that the table doesn't exist. That should make us think twice before acting on issue #12538. If we do what this issue recommended, this test will need to be fixed (e.g., to accept as correct both types of errors). Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #12608	2023-01-24 14:14:39 +02:00
Alejo Sanchez	f236d518c6	test.py: manual cluster handling for PythonSuite Instead of complex async with logic, use manual cluster pool handling. Revert the discard() logic in Pool from a recent commit. Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>	2023-01-24 11:38:17 +01:00
Nadav Har'El	ccc2c6b5dd	Merge 'test/pylib: scylla_cluster: improve server startup check' from Kamil Braun Don't use a range scan, which is very inefficient, to perform a query for checking CQL availability. Improve logging when waiting for server startup times out. Provide details about the failure: whether we managed to obtain the Host ID of the server and whether we managed to establish a CQL connection. Closes #12588 * github.com:scylladb/scylladb: test/pylib: scylla_cluster: better logging for timeout on server startup test/pylib: scylla_cluster: use less expensive query to check for CQL availability	2023-01-23 17:00:52 +02:00
Kamil Braun	8a1ea6c49f	test/pylib: scylla_cluster: better logging for timeout on server startup Waiting for server startup is a multi-step procedure: after we start the actual process, we will: - try to obtain the Host ID (by querying a REST API endpoint) - then try to connect a CQL session - then try to perform a CQL query The steps are repeated every .1 second until we reach a timeout (the Host ID step is skipped if we previously managed to obtain it). On timeout we'd only get a generic "failed to start server" message, it wouldn't say what we managed to do and what not. For example, on one of the failed jobs on Jenkins I observed this timeout error. Looking at the logs of the server, it turned out that the server printed the "initialization completed" message more than 2 minutes before the actual timeout happened. So for 2 minutes, the test framework either couldn't obtain the Host ID, or couldn't establish a CQL connection, or couldn't perform a CQL query, but I wasn't able to determine fully which one of these was the case. Improve the code by printing whether we managed to get the Host ID of the server and if so - whether we managed to connect to CQL.	2023-01-23 15:59:42 +01:00
Kamil Braun	0e591606a5	test/pylib: scylla_cluster: use less expensive query to check for CQL availability The previous CQL query used a range scan which is very inefficient, even for local tables. Also add a comment explaining why we need this query.	2023-01-23 15:59:05 +01:00
Nadav Har'El	54f174a1f4	Merge 'test.py: handle broken clusters for Python suite' from Alecco If the after test check fails (is_after_test_ok is False), discard the cluster and raise exception so context manager (pool) does not recycle it. Ignore exception re-raised by the context manager. Fixes #12360 Closes #12569 * github.com:scylladb/scylladb: test.py: handle broken clusters for Python suite test.py: Pool discard method	2023-01-22 19:58:12 +02:00
Botond Dénes	7f9b39009c	reader_concurrency_semaphore_test: leak test: relax iteration limit This test creates random dummy reads and simulates a query with them. The test works in terms of iteration (tick), advancing each simulating read in each iteration. To prevent infinite runtime an iteration limit of 100 was added to detect a non-converging test and kill it. This limit proved too strict however and in this patch we bump it to 1000 to prevent some unlucky seed making this test fail, as seen recently in CI. Closes #12580	2023-01-20 15:39:13 +02:00
Nadav Har'El	3d78dbd9f2	test/cql-pytest: regression tests for null lookup in local SI We noticed that old branches of Scylla had problems with looking up a null value in a local secondary index - hanging or crashing. This patch includes tests to reproduce these bugs. The tests pass on current master - apparently this bug has already been fixed, but we didn't have a regression test for it. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #12570	2023-01-19 23:58:33 +02:00
Alejo Sanchez	c886a05b37	test.py: Pool discard method Add a context manager discard() method to tell it to discard the object. Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>	2023-01-19 21:43:45 +01:00
Kamil Braun	2f84e820fd	test/pylib: scylla_cluster: return error details from test framework endpoints If an endpoint handler throws an exception, the details of the exception are not returned to the client. Normally this is desirable so that information is not leaked, but in this test framework we do want to return the details to the client so it can log a useful error message. Do it by wrapping every handler into a catch clause that returns the exception message. Also modify a bit how HTTPErrors are rendered so it's easier to discern the actual body of the error from other details (such as the params used to make the request etc.) Before: ``` E test.pylib.rest_client.HTTPError: HTTP error 500: 500 Internal Server Error E E Server got itself in trouble, params None, json None, uri http+unix://api/cluster/before-test/test_stuff ``` After: ``` E test.pylib.rest_client.HTTPError: HTTP error 500, uri: http+unix://api/cluster/before-test/test_stuff, params: None, json: None, body: E Failed to start server at host 127.155.129.1. E Check the log files: E /home/kbraun/dev/scylladb/testlog/test.py.dev.log E /home/kbraun/dev/scylladb/testlog/dev/scylla-1.log ``` Closes #12563	2023-01-19 17:47:13 +02:00
Kamil Braun	3ed3966f13	test/pylib: scylla_cluster: release cluster IPs when stopping ScyllaClusterManager When we obtained a new cluster for a test case after the previous test case left a dirty cluster, we would release the old cluster's used IP addresses (`_before_test` function). However, we would not release the last cluster's IP after the last test case. We would run out of IPs with sufficiently many test files or `--repeat` runs. Fix this. Also reorder the operations a bit: stop the cluster (and release its IPs) before freeing up space in the cluster pool (i.e. call `self.cluster.stop()` before `self.clusters.steal()`). This reduces concurrency a bit - fewer Scyllas running at the same time, which is good (the pool size gives a limit on the desired max number of concurrently running clusters). Killing a cluster is quick so it won't make a significant difference for the next guy waiting on the pool. Closes #12564	2023-01-19 17:46:46 +02:00
Nadav Har'El	18be50582d	test/cql-pytest: add tests for behavior of unset values Recently, commit `0b418fa` made the checking for "unset" values more centralized and more robust, but as the tests added in this patch show, the situation is good (and in particular, that #10358 is solved). The tests in this patch check that the behavior of "unset" values in the CQL v4 protocol matches Cassandra's behavior and its documentation, and how it compares to our wishes of how we want unset values to behave. One of these tests fails on Cassandra (we consider this a Cassandra bug). One test fails on Scylla because it doesn't yet support arithmetic expressions (Refs #2693). Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #12534	2023-01-19 15:48:07 +02:00
Nadav Har'El	9433108158	Merge 'Allow transient list values to contain NULLs' from Avi Kivity The CQL protocol and specification call for lists with NULLs in some places. For example, the statement: ```cql UPDATE tab SET x = 3 IF y IN (1, 2, NULL) WHERE pk = 4 ``` has a list `(1, 2, NULL)` that contains NULL. Although the syntax is tuple-like, the value is a list; consider the same statement as a prepared statement: ```cql UPDATE tab SET x = :x IF y IN :y_values WHERE pk = :pk ``` `:y_values` must have a list type, since the number of elements is unknown. Currently, this is done with special paths inside LWT that bypass normal evaluation, but if we want to unify those paths, we must allow NULLs in lists (except in storage). This series does that. Closes #12411 * github.com:scylladb/scylladb: test: materialized view: add test exercising synthetic empty-type columns cql3: expr: relax evaluate_list() to allow allow NULL elements types: allow lists with NULL test: relax NULL check test predicate cql3, types: validate listlike collections (sets, lists) for storage types: make empty type deserialize to non-null value	2023-01-19 15:15:16 +02:00
Botond Dénes	d661d03057	Merge 'main, test: integrate perf tools into scylla' from Kefu Chai following tests are integrated into scylla executable - perf_fast_forward - perf_row_cache_update - perf_simple_query - perf_row_cache_update - perf_sstable before this change ```console $ size build/release/scylla text data bss dec hex filename 82284664 288960 335897 82909521 4f11951 build/release/scylla $ ls -l build/release/scylla -rwxrwxr-x 1 kefu kefu 1719672112 Jan 19 17:51 build/release/scylla ``` after this change ```console $ size build/release/scylla text data bss dec hex filename 84349449 289424 345257 84984130 510c142 build/release/scylla $ ls -l build/release/scylla -rwxrwxr-x 1 kefu kefu 1774204800 Jan 19 17:52 build/release/scylla ``` Fixes #12484 Closes #12558 * github.com:scylladb/scylladb: main: move perf_sstable into scylla main: move perf_row_cache_update into scylla test: perf_row_cache_update: add static specifier to local functions main: move perf_fast_forward into scylla main: move perf_simple_query into scylla test: extract debug::the_database out main: shift the args when checking exec_name main: extract lookup_main_func() out	2023-01-19 15:01:30 +02:00
Kamil Braun	147dd73996	test/pylib: scylla_cluster: mark cluster as dirty if it fails to boot If a cluster fails to boot, it saves the exception in `self.start_exception` variable; the exception will be rethrown when a test tries to start using this cluster. As explained in `before_test`: ``` def before_test(self, name) -> None: """Check that the cluster is ready for a test. If there was a start error, throw it here - the server is running when it's added to the pool, which can't be attributed to any specific test, throwing it here would stop a specific test.""" ``` It's arguable whether we should blame some random test for a failure that it didn't cause, but nevertheless, there's a problem here: the `start_exception` will be rethrown and the test will fail, but then the cluster will be simply returned to the pool and the next test will attempt to use it... and so on. Prevent this by marking the cluster as dirty the first time we rethrow the exception. Closes #12560	2023-01-19 14:26:57 +02:00
Avi Kivity	9029b8dead	test: disable commitlog O_DSYNC, preallocation Commitlog O_DSYNC is intended to make Raft and schema writes durable in the face of power loss. To make O_DSYNC performant, we preallocate the commitlog segments, so that the commitlog writes only change file data and not file metadata (which would require the filesystem to commit its own log). However, in tests, this causes each ScyllaDB instance to write 384MB of commitlog segments. This overloads the disks and slows everything down. Fix this by disabling O_DSYNC (and therefore preallocation) during the tests. They can't survive power loss, and run with --unsafe-bypass-fsync anyway. Closes #12542	2023-01-19 11:14:05 +01:00
Kefu Chai	7f5bb19d1f	main: move perf_sstable into scylla * configure.py: - include `test/perf/perf_sstable` and its dependencies in scylla_perfs * test/perf/perf_sstable.cc: change `main()` to `perf::scylla_sstable_main()` * test/perf/entry_point.hh: add `perf::scylla_sstable_main()` before this change, we have a tool at `test/perf/perf_sstable` for running performance tests by exercising sstable related operations. after this change, the `test/perf/perf_sstable` is integrated into `scylla` as a subcommand. so we can run `scylla perf-sstable [options, ...]` to perform the same tests previously driven by the tool. Fixes #12484 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-01-19 17:42:52 +08:00
Kefu Chai	240f2c6f00	main: move perf_row_cache_update into scylla * configure.py: - include `test/perf/perf_row_cache_update.cc` in scylla_perfs * main.cc: - dispatch "perf-row-cache-update" subcommand to `perf::scylla_row_cache_update_main` * test/perf/perf_fast_forward.cc: change `main()` to `perf::scylla_row_cache_update_main()` * test/perf/entry_point.hh: add `perf::scylla_row_cache_update_main()` before this change, we have a tool at `test/perf/perf_row_cache_update` for running performance tests by updating row cache. after this change, the `test/perf/perf_row_cache_update` is integrated into `scylla` as a subcommand. so we can run `scylla perf-row-cache-update [options, ...]` to perform the same tests previously driven by the tool. Fixes #12484 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-01-19 17:42:46 +08:00
Kefu Chai	4e390b9a05	test: perf_row_cache_update: add static specifier to local functions now that these functions are only used by the same compiling unit, they don't need external linkage. so let's hide them using `static`. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-01-19 17:42:46 +08:00
Kefu Chai	228ccdc1c7	main: move perf_fast_forward into scylla * configure.py: - include `test/perf/perf_simple_query.cc` in scylla_perfs * main.cc: - dispatch "perf-fast-forward" subcommand to `perf::scylla_fast_forward_main` * test/perf/perf_fast_forward.cc: change `main()` to `perf::scylla_simple_query_main()` * test/perf/entry_point.hh: add `perf::scylla_simple_query_main()` before this change, we have a tool at `test/perf/perf_fast_forward` for running performance tests by fast forwarding the reader. after this change, the `test/perf/perf_fast_forward` is integrated into `scylla` as a subcommand. so we can run `scylla perf-fast-forward [options, ...]` to perform the same tests previously driven by the tool. Fixes #12484 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-01-19 17:42:40 +08:00
Kefu Chai	09de031cab	main: move perf_simple_query into scylla * configure.py: - include scylla_perfs in scylla - move 'test/lib/debug.cc' down scylla_perfs, as the latter uses `debug::the_database` - link `scylla` against seastar_testing_libs also. because we use the helpers in `test/lib/random_utils.hh` for generating random numbers / sequences in `perf_simple_query.cc`, and `random_utils.hh` references `seastar::testing::local_random_engine` as a local RNG. but `seastar::testing::local_random_engine` is included in `libseastar_testing.a` or `libseastar_perf_testing.a`. since we already have the rules for linking against `libseastar_testing.a`, let's just reuse them, and link `scylla` against this new dependency. * main.cc: - dispatch "perf-simple-query" subcommand to `perf::scylla_simple_query_main` * test/perf/perf_simple_query.cc: change `main()` to `perf::scylla_simple_query_main()` * test/perf/entry_point.hh: define the main function entries so `main.cc` can find them. it's quite like how we collect the entries in `tools/entry_point.hh` before this change, we have a tool at `test/perf/perf_simple_query` for running performance test by sending simple query to a single-node cluster. after this change, the `test/perf/perf_simple_query` is integrated into `scylla` as a subcommand. so we can run `scylla perf-simple-query [options, ...]` to perform the same tests previously driven by the tool. Fixes #12484 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-01-19 17:42:30 +08:00
Kefu Chai	c65692a13a	test: extract debug::the_database out we want to integrate some perf test into scylla executable, so we can run them on a regular basis. but `test/lib/cql_test_env.cc` shares `debug::the_database` with `main.cc`, so we cannot just compile them into a single binary without changing them. before this change, both `test/lib/cql_test_env.cc` and `main.cc` define `debug::the_database`. after this change, `debug::the_database` is extracted into `debug.cc`, so it compiles into a separate compiling unit. and scylla and tests using seastar testing framework are linked against `debug.cc` via `scylla_core` respectively. this paves the road to integrating scylla with the tests linking against `test/lib/cql_test_env.cc`. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-01-19 17:42:23 +08:00
Nadav Har'El	0ff0c80496	test/cql-pytest: un-xfail tests for UNSET values Commit `0b418fa` improved the error detection of unset values in inappropriate CQL statements, and some of the unit tests translated from Cassandra started to pass, so this patch removes their "xfail" mark. In a couple of places Scylla's error message is worded differently from Cassandra, so the test was modified to look for a shorter string common to both implementations. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #12553	2023-01-19 07:47:08 +02:00
Kefu Chai	6a3b19b53d	test/perf: replace "std::cout <<" with fmt::print() for better readability Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #12559	2023-01-19 07:45:13 +02:00
Avi Kivity	aab5954cfb	Merge 'reader_concurrency_semaphore: add more layers of defense against OOM' from Botond Dénes The reader concurrency semaphore has no mechanism to limit the memory consumption of already admitted reads. Once the collective memory consumption of all the admitted reads is above the limit, all it can do is to not admit any more. Sometimes this is not enough and the memory consumption of the already admitted reads balloons to the point of OOMing the node. This pull-request offers a solution to this: it introduces two more layers of defense above this: a soft and a hard limit. Both are multipliers applied on the semaphore's normal memory limit. When the soft limit threshold is surpassed, all readers but one are blocked via a new blocking `request_memory()` call which is used by the `tracking_file_impl`. The reader to be allowed to proceed is chosen at random, it is the first reader which happens to request memory after the limit is surpassed. This is both very simple and should avoid situations where the algorithm choosing the reader to be allowed to proceed chooses a reader which will then always time out. When the hard limit threshold is surpassed, `reader_concurrency_semaphore::consume()` starts throwing `std::bad_alloc`. This again will result in eliminating whichever reader was unlucky enough to request memory at the right moment. With this, the semaphore is now effectively enforcing an upper bound for memory consumption, defined by the hard limit. 
Refs: https://github.com/scylladb/scylladb/issues/11927 Closes #11955 * github.com:scylladb/scylladb: test: reader_concurrency_semaphore_test: add tests for semaphore memory limits reader_permit: expose operator<<(reader_permit::state) reader_permit: add id() accessor reader_concurrency_semaphore: add foreach_permit() reader_concurrency_semaphore: document the new memory limits reader_concurrency_semaphore: add OOM killer reader_concurrency_semaphore: make consume() and signal() private test: stop using reader_concurrency_semaphore::{consume,signal}() directly reader_concurrency_semaphore: move consume() out-of-line reader_permit: consume(): make it exception-safe reader_permit: resource_units::reset(): only call consume() if needed reader_concurrency_semaphore: tracked_file_impl: use request_memory() reader_concurrency_semaphore: add request_memory() reader_concurrency_semaphore: wrap wait list reader_concurrency_semaphore: add {serialize,kill}_limit_multiplier parameters test/boost/reader_concurrency_semaphore_test: dummy_file_impl: don't use hardoced buffer size reader_permit: add make_new_tracked_temporary_buffer() reader_permit: add get_state() accessor reader_permit: resource_units: add constructor for already consumed res reader_permit: resource_units: remove noexcept qualifier from constructor db/config: introduce reader_concurrency_semaphore_{serialize,kill}_limit_multiplier scylla-gdb.py: scylla-memory: extract semaphore stats formatting code scylla-gdb.py: fix spelling of "graphviz"	2023-01-18 17:02:55 +02:00
Avi Kivity	9a54cb5deb	Merge 'cql3/expr: make it possible to prepare binary_operator' from Jan Ciołek `prepare_expression` takes an unprepared CQL expression straight from the parser output and prepares it. Preparation consists of various type checks that are needed to ensure that the expression is correct and to reason about it. While `prepare_expression` supports a number of different types of expressions, until now it was impossible to prepare a `binary_operator`. Eventually we would like to be able to prepare all kinds of expressions, so this PR adds the missing support for `binary_operator`. Closes #12550 * github.com:scylladb/scylladb: expr_test: test preparing binary_operator with NULL RHS expr_test: test preparing IS NOT NULL binary_operator expr_test: test preparing binary_operator with LIKE expr_test: test preparing binary_operator with CONTAINS KEY expr_test: test preparing binary_operator with CONTAINS expr_test: test preparing binary_operator with IN expr_test: test preparing binary_operator with =, !=, <, <=, >, >= expr_test: use make_*_untyped function in existing tests expr_test_utils: add utilities to create untyped_constant expr_test_utils: add make_float_* and make_double_* cql3: expr: make it possible to prepare binary_operator using prepare_expression cql3/expr: check that RHS of IS NOT NULL is a null value when preparing binary operators cql3: expr: pass non-empty keyspace name in prepare_binary_operator cql3: expr: take reference to schema in prepare_binary_operator	2023-01-18 16:55:18 +02:00
Jan Ciolek	ae0e955b90	expr_test: test preparing binary_operator with NULL RHS Make sure that preparing binary_operator works properly when the RHS is NULL. Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>	2023-01-18 12:04:46 +01:00
Jan Ciolek	65b8a09409	expr_test: test preparing IS NOT NULL binary_operator Add a unit test which checks that preparing binary_operators which represent IS NOT NULL works as expected. Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>	2023-01-18 12:04:46 +01:00
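
The soft and hard limits described in the `reader_concurrency_semaphore` merge (aab5954cfb) can be illustrated with a small toy model. This is only a sketch under stated assumptions, not Scylla's implementation: the class `ToySemaphore`, its method names, and its synchronous return values are hypothetical. The real code is asynchronous C++ and throws `std::bad_alloc`, for which Python's `MemoryError` stands in here. The sketch also simplifies by comparing the total after the request against each limit.

```python
class ToySemaphore:
    """Toy, synchronous model of the soft/hard memory limits (hypothetical names)."""

    def __init__(self, memory_limit, serialize_multiplier, kill_multiplier):
        # Both limits are multipliers applied to the normal memory limit.
        self.soft_limit = memory_limit * serialize_multiplier
        self.hard_limit = memory_limit * kill_multiplier
        self.consumed = 0
        # The single reader allowed to proceed while above the soft limit.
        self.allowed_reader = None

    def request_memory(self, reader_id, amount):
        """Return True if memory is granted, False if the reader would block."""
        total = self.consumed + amount
        if total > self.hard_limit:
            # Stand-in for std::bad_alloc: the requesting reader is eliminated.
            raise MemoryError(f"reader {reader_id} exceeded the hard limit")
        if total > self.soft_limit:
            # The first reader to ask for memory past the soft limit is chosen.
            if self.allowed_reader is None:
                self.allowed_reader = reader_id
            if self.allowed_reader != reader_id:
                return False  # every other reader is blocked
        self.consumed = total
        return True

    def release(self, amount):
        """Give memory back; drop the chosen reader once below the soft limit."""
        self.consumed -= amount
        if self.consumed <= self.soft_limit:
            self.allowed_reader = None


# Usage: normal limit 100, soft limit 200, hard limit 400.
sem = ToySemaphore(memory_limit=100, serialize_multiplier=2, kill_multiplier=4)
sem.request_memory("a", 150)              # granted, below the soft limit
sem.request_memory("b", 100)              # crosses the soft limit, "b" is chosen
blocked = not sem.request_memory("a", 10)  # "a" must wait while "b" proceeds
```

In this model the hard limit is the effective upper bound: no request that would push consumption past it is ever granted.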

4170 Commits