scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-06-09 16:33:35 +00:00

Author	SHA1	Message	Date
Raphael S. Carvalho	2d2460046b	test: memtable_test: Fix it with multiple compaction groups With compaction groups, automatic flushing may not pick the user table. Fix it by using explicit flush. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2023-02-01 20:14:51 -03:00
Botond Dénes	34cdcaffae	reader_concurrency_semaphore: un-bless permits when they become inactive When the memory consumption of the semaphore reaches the configured serialize threshold, all but the blessed permit is blocked from consuming any more memory. This ensures that past this limit, only one permit at a time can consume memory. Such a blessed permit can be registered inactive. Before this patch, it would still retain its blessed status when doing so. This could result in this permit being re-queued for admission if it was evicted in the meanwhile, potentially resulting in a complete deadlock of the semaphore: * admission queue permits cannot be admitted because there is no memory * admitter permits are all queued on memory, as none of them are blessed This patch strips the blessed status from the permit when it is registered as inactive. It also adds a unit test to verify this happens. Fixes: #12603 Closes #12694	2023-02-01 21:02:17 +02:00
Wojciech Mitros	86c61828e6	udt: disallow dropping a user type used in a user function Currently, nothing prevents us from dropping a user type used in a user function, even though doing so may make us unable to use the function correctly. This patch prevents this behavior by checking all function argument and return types when executing a drop type statement and preventing it from completing if the type is referenced by any of them. Closes #12680	2023-02-01 18:53:29 +02:00
Jan Ciolek	ed568f3f70	cql-pytest: test filtering using list with bind variable Add tests which test filtering using IN restriction with a list which contains a bind variable. There are other cql-pytest tests which test IN lists with a bind variable, but it looks like they don't do filtering. IN restrictions on primary key columns are handled in a special way to generate the right ranges. These tests hit a different code path as filtering uses `expr::evaluate()`. Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>	2023-02-01 16:30:09 +01:00
Jan Ciolek	9eb6746a67	test/expr_test: test <int_value> IN (123, ?, 456) Add tests which test evaluating the IN restriction with a list which contains a bind variable. Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>	2023-02-01 16:29:32 +01:00
Kamil Braun	40142a51d0	test: topology: wait for token ring/group 0 consistency after decommission There was a check for immediate consistency after a decommission operation has finished in one of the tests, but it turns out that also after decommission it might take some time for token ring to be updated on other nodes. Replace the check with a wait. Also do the wait in another test that performs a sequence of decommissions. We won't attempt to start another decommission until every node learns that the previously decommissioned node has left. Closes #12686	2023-02-01 16:49:22 +02:00
Nadav Har'El	132af20057	Merge 'test/pylib: scylla_cluster: ensure there's space in the cluster pool when running a sequence of tests' from Kamil Braun `ScyllaClusterManager` is used to run a sequence of test cases from a single test file. Between two consecutive tests, if the previous test left the cluster 'dirty', meaning the cluster cannot be reused, it would free up space in the pool (using `steal`), stop the cluster, then get a new cluster from the pool. Between the `steal` and the `get`, a concurrent test run (with its own instance of `ScyllaClusterManager`) would start, because there was free space in the pool. This resulted in undesirable behavior when we ran tests with `--repeat X` for a large `X`: we would start with e.g. 4 concurrent runs of a test file, because the pool size was 4. As soon as one of the runs freed up space in the pool, we would start another concurrent run. Soon we'd end up with 8 concurrent runs. Then 16 concurrent runs. And so on. We would have a large number of concurrent runs, even though the original 4 runs didn't finish yet. All of these concurrent runs would compete waiting on the pool, and waiting for space in the pool would take longer and longer (the duration is linear w.r.t number of concurrent competing runs). Tests would then time out because they would have to wait too long. Fix that by using the new `replace_dirty` function introduced to the pool. This function frees up space by returning a dirty cluster and then immediately takes it away to be used for a new cluster. Thanks to this, we will only have at most as many concurrent runs as the pool size. For example with --repeat 8 and pool size 4, we would run 4 concurrent runs and start the 5th run only when one of the original 4 runs finishes, then the 6th run when a second run finishes and so on. The fix is preceded by a refactor that replaces `steal` with `put(is_dirty=True)` and a `destroy` function passed to the pool (now the pool is responsible for stopping the cluster and releasing its IPs). Fixes #11757 Closes #12549 * github.com:scylladb/scylladb: test/pylib: scylla_cluster: ensure there's space in the cluster pool when running a sequence of tests test/pylib: pool: introduce `replace_dirty` test/pylib: pool: replace `steal` with `put(is_dirty=True)`	2023-02-01 12:37:39 +02:00
Nadav Har'El	681a066923	test/pylib: put UNIX-domain socket in /tmp The "cluster manager" used by the topology test suite uses a UNIX-domain socket to communicate between the cluster manager and the individual tests. The socket is currently located in the test directory but there is a problem: In Linux the length of the path used as a UNIX-domain socket address is limited to just a little over 100 bytes. In Jenkins runs, the test directory names are very long, and we sometimes go over this length limit and the result is that test.py fails to create this socket. In this patch we simply put the socket in /tmp instead of the test directory. We only need to do this change in one place - the cluster manager, as it already passes the socket path to the individual tests (using the "--manager-api" option). Tested by cloning Scylla in a very long directory name. A test like ./test.py --mode=dev test_concurrent_schema fails before this patch, and passes with it. Fixes #12622 Closes #12678	2023-02-01 12:37:35 +03:00
Botond Dénes	71ad0dff2b	test/lib/sstable_utils: remove now unused token_generation_for_shard() and friends	2023-01-30 05:03:42 -05:00
Botond Dénes	a03c11234d	test/lib/simple_schema: remove now unused make_keys() and friends	2023-01-30 05:03:42 -05:00
Botond Dénes	4ad3ba52b0	test: migrate to tests::generate_partition_key[s]() Use the newly introduced key generation facilities, instead of the old inflexible alternatives and hand-rolled code. Most of the migrations are mechanical, but there are two tests that were tricky to migrate: * sstable_compaction_test.sstable_run_based_compaction_test * sstable_mutation_test.test_key_count_estimation These two tests seem to depend on generated keys all being of the same size. This makes some sense in the case of the key count estimation test, but makes no sense at all to me in the case of the sstable run test.	2023-01-30 05:03:42 -05:00
Botond Dénes	84c94881b3	test/lib/test_services: add table_for_tests::make_default_schema() Creating the default schema, used in the default constructor of table_for_tests. Allows for getting the default schema without creating an instance first.	2023-01-30 05:03:42 -05:00
Botond Dénes	61f28d3ab2	test/lib: add key_utils.hh Contains methods to generate partition and clustering keys. In the case of the former, one can specify the shard to generate keys for. We currently have some methods to generate these but they are not generic. Therefore the tests are littered by open-coded variants. The methods introduced here are completely generic: they can generate keys for any schema.	2023-01-30 05:03:42 -05:00
Botond Dénes	04ca710a95	test/lib/random_schema.hh: value_generator: add min_size_in_bytes Allow caller to specify the minimum size in bytes of the generated value. Only really works with string-like types (and collections of these). Also fixed max size enforcement for strings: before this patch, the provided max size was divided by wide string size, instead of the char width of the actual string type the value is generated for.	2023-01-30 01:11:31 -05:00
Tomasz Grabiec	c9c476afd7	test: mvcc: Extend some scenarios with exhaustive consistency checks on eviction	2023-01-27 21:56:31 +01:00
Tomasz Grabiec	80de99cb1b	test: mvcc: Extract mvcc_container::allocate_in_region()	2023-01-27 21:56:31 +01:00
Tomasz Grabiec	f2832046e9	test: mvcc: Avoid copies of mutation under failure injection Speeds up the test a bit because we avoid the copy when converting to mutation_partition_v2 in apply().	2023-01-27 21:56:31 +01:00
Tomasz Grabiec	b8980f68f0	test: mvcc: Add missing logalloc::reclaim_lock to test_apply_is_atomic	2023-01-27 21:56:31 +01:00
Tomasz Grabiec	bc35fa7696	Pass is_evictable to apply()	2023-01-27 21:56:31 +01:00
Tomasz Grabiec	2b5e7a684b	tests: mutation_partition_v2: Introduce test_external_memory_usage_v2 mirroring the test for v1	2023-01-27 21:56:31 +01:00
Tomasz Grabiec	81b1b2ee55	tests: mutation: Fix test_external_memory_usage() to not measure mutation object footprint The test measured copying of the mutation object, but verified the measurement against mutation_partition::external_memory_usage(). So anything allocated on the mutation object level would cause the test to (incorrectly) fail. Fix that by copying only the mutation_partition part. Currently not a problem, because the partition_key is stored in the in-line storage. Would become a problem once inline storage is reduced.	2023-01-27 21:56:31 +01:00
Tomasz Grabiec	f172336b32	tests: mutation_partition_v2: Add test for exception safety of mutation merging	2023-01-27 21:56:31 +01:00
Tomasz Grabiec	919ff433d1	tests: Add tests for the mutation_partition_v2 model	2023-01-27 21:56:31 +01:00
Tomasz Grabiec	026f8cc1e7	db: Use mutation_partition_v2 in mvcc This patch switches memtable and cache to use mutation_partition_v2, and all affected algorithms accordingly. The memtable reader was changed to use the same cursor implementation which cache uses, for improved code reuse and reducing risk of bugs due to discrepancy of algorithms which deal with MVCC. Range tombstone eviction in cache has now fine granularity, like with rows. Fixes #2578 Fixes #3288 Fixes #10587	2023-01-27 21:56:28 +01:00
Tomasz Grabiec	40719c600c	test: memtable_test: Relax test_segment_migration_during_flush Partition version merging can now insert sentinels, which may temporarily increase unspooled memory. It is no longer true that unspooled monotonically decreases, which the test verified. Relax it, and only verify that unspooled is smaller than real dirty.	2023-01-27 19:15:39 +01:00
Tomasz Grabiec	31bcc3b861	test: cache_flat_mutation_reader: Avoid timestamp clash api::new_timestamp() is not monotonic. In test_single_row_and_tombstone_not_cached_single_row_range1, we generate a deletion and an insertion in the deleted range. If they get the same timestamp, the inserted row will be covered. This will surface after cache starts to compact rows with range tombstones.	2023-01-27 19:15:38 +01:00
Tomasz Grabiec	25683449e4	test: cache_flat_mutation_reader_test: Use monotonic timestamps when inserting rows When inserting range tombstones, the test uses api::new_timestamp(), but when inserting rows, it uses a fixed timestamp of 1. This will be problematic when rows get compacted with range tombstone, all rows would get compacted away, which is not expected by the test. To fix this, let's use the same timestamp source as range tombstones. This way rows will get a later timestamp.	2023-01-27 19:15:38 +01:00
Tomasz Grabiec	71057412ed	test: mvcc: Fix sporadic failures due to compact_for_compaction() compact_for_compaction() will perform cell expiration based on gc_clock::now(), which introduces sporadic mismatches due to expiry status of a row marker. Drop this, we can rely on compaction done by is_equal_to_compacted()	2023-01-27 19:15:38 +01:00
Tomasz Grabiec	f908713290	test: lib: random_mutation_generator: Produce partition tombstone less often This tombstone has a high chance of obliterating all data, which will make tests which involve partition version merging not very interesting. The result will be an empty partition with a tombstone. Reduce its frequency, so that in MVCC there is a significant chance of having live data in the combined entry where individual versions come from the generator.	2023-01-27 19:15:38 +01:00
Tomasz Grabiec	3bf8052be4	test: lib: random_utils: Introduce with_probability()	2023-01-27 19:15:38 +01:00
Tomasz Grabiec	c386874e18	test: lib: Improve error message in has_same_continuity()	2023-01-27 19:15:38 +01:00
Tomasz Grabiec	08f68c5f20	test: mvcc: mvcc_container: Avoid UB in tracker() getter when there is no tracker	2023-01-27 19:15:38 +01:00
Tomasz Grabiec	5aa8cb56a8	test: mvcc: Insert entries in the tracker evictable snapshots must have all entries added to the tracker. Partition version merging assumes this. Before this was benign, but will start to trigger asserts in mutation_partition_v2.	2023-01-27 19:15:38 +01:00
Tomasz Grabiec	9d38997971	test: mvcc_test: Do not set dummy::no on non-clustering rows This will trigger an assert in apply_monotonically() later in the series, where this row would be merged with a dummy at the same position. This row must not be marked as non-dummy, there is an assumption that non-clustering positions are all dummies. There can't be two entries with the same position and a different dummy status.	2023-01-27 19:15:38 +01:00
Nadav Har'El	f873884b50	test/alternator: unskip test which works on modern Scylla We had one test test_gsi.py::test_gsi_identical that didn't work on KA/LA sstables due to #6157, so it was skipped. Today, Scylla no longer supports writing these old sstable formats, so the test can never find itself running on these versions, so should pass. And indeed it does, and the "skip" marker can be removed. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #12651	2023-01-27 14:10:07 +02:00
Botond Dénes	d358d4d9e9	Merge 'Configure sstable_test_env with tempdir' from Pavel Emelyanov Today's sstable_test_env starts with a default-configured db::config and, thus, sstables_manager. Test cases that run in this env always create a tempdir to store sstable files in on their own. Next patching makes sstable-manager and friends fully control the data-dir path in order to support object storage for sstables in a nice way, and this behavior of tests upsets this ongoing work. That said, this PR configures sstable_test_env with a tempdir and pins down the cases using it to stick to that directory, rather than to the custom one. Closes #12641 * github.com:scylladb/scylladb: test: Use tempdir from sstable_test_env test: Add tmpdir to sstable test env test: Keep db::config as unique pointer	2023-01-27 13:59:12 +02:00
Kamil Braun	fa9cf81af2	test: topology: verify that group 0 and token ring are consistent After topology changes like removing a node, verify that the set of group 0 members and token ring members is the same. Modify `get_token_ring_host_ids` to only return NORMAL members. The previous version which used the `/storage_service/host_id` endpoint might have returned non-NORMAL members as well. Fixes: #12153 Closes #12619	2023-01-27 14:21:14 +03:00
Botond Dénes	d7ed92bb42	Merge 'Reduce the number of table::make_sstable() overloads' from Pavel Emelyanov There are several helpers to make an sstable for the table and two with most of the arguments are only used by tests. This PR leaves table with just one arg-less call thus making it easier to patch further. Closes #12636 * github.com:scylladb/scylladb: table: Shrink sstables making API tests: Use sstables manager to make sstables distributed_loader: Add helpers to make sstables for reshape/reshard	2023-01-26 14:25:21 +02:00
Kamil Braun	5eadea301e	Merge 'pytest: start after ungraceful stop' from Alecco If a server is stopped suddenly (i.e. not graceful), schema tables might be in inconsistent state. Add a test case and enable Scylla configuration option (force_schema_commit_log) to handle this. Fixes #12218 Closes #12630 * github.com:scylladb/scylladb: pytest: test start after ungraceful stop test.py: enable force_schema_commit_log	2023-01-26 12:08:33 +01:00
Kamil Braun	3eabe04f5d	test/pylib: scylla_cluster: ensure there's space in the cluster pool when running a sequence of tests `ScyllaClusterManager` is used to run a sequence of test cases from a single test file. Between two consecutive tests, if the previous test left the cluster 'dirty', meaning the cluster cannot be reused, it would put the old cluster to the pool with `is_dirty=True`, then get a new cluster from the pool. Between the `put` and the `get`, a concurrent test run (with its own instance of `ScyllaClusterManager`) would start, because there was free space in the pool. This resulted in undesirable behavior when we ran tests with `--repeat X` for a large `X`: we would start with e.g. 4 concurrent runs of a test file, because the pool size was 4. As soon as one of the runs freed up space in the pool, we would start another concurrent run. Soon we'd end up with 8 concurrent runs. Then 16 concurrent runs. And so on. We would have a large number of concurrent runs, even though the original 4 runs didn't finish yet. All of these concurrent runs would compete waiting on the pool, and waiting for space in the pool would take longer and longer (the duration is linear w.r.t number of concurrent competing runs). Tests would then time out because they would have to wait too long. Fix that by using the new `replace_dirty` function introduced to the pool. This function frees up space by returning a dirty cluster and then immediately takes it away to be used for a new cluster. Thanks to this, we will only have at most as many concurrent runs as the pool size. For example with --repeat 8 and pool size 4, we would run 4 concurrent runs and start the 5th run only when one of the original 4 runs finishes, then the 6th run when a second run finishes and so on. Fixes #11757	2023-01-26 11:58:00 +01:00
Kamil Braun	b5ef57ecc2	test/pylib: pool: introduce `replace_dirty` Used to atomically return a dirty object to the pool and then use the space freed by this object to get another object. Unlike `put(is_dirty=True)` followed by `get`, a concurrent waiter cannot take away our space from us. A piece of `get` was refactored to a private function `_build_and_get`, this piece is also used in `replace_dirty`.	2023-01-26 11:58:00 +01:00
Kamil Braun	858803cc2c	test/pylib: pool: replace `steal` with `put(is_dirty=True)` The pool usage was kind of awkward previously: if the user of a pool decided that a previously borrowed object should no longer be used, it was their responsibility to destroy the object (releasing associated resources and so on) and then call `steal()` on the pool to free space for a new object. Change the interface. Now the `Pool` constructor obtains a `destroy` function additionally to the `build` function. The user calls the function `put` to return both objects that are still usable and those that aren't. For the latter, they set `is_dirty=True`. The pool will 'destroy' the object with the provided function, which could mean e.g. releasing associated resources. For example, instead of: ``` if self.cluster.is_dirty: self.clusters.stop() self.clusters.release_ips() self.clusters.steal() else: self.clusters.put(self.cluster) ``` we can now use: ``` self.clusters.put(self.cluster, is_dirty=self.cluster.is_dirty) ``` (assuming that `self.clusters` is a pool constructed with a `destroy` function that stops the cluster and releases its IPs.) Also extend the interface of the context manager obtained by `instance()` - the user must now pass a flag `dirty_on_exception`. If the context manager exits due to an exception and that flag was `True`, the object will be considered dirty. The dirty flag can also be set manually on the context manager. For example: ``` async with (cm := pool.instance(dirty_on_exception=True)) as server: cm.dirty = await run_test(test, server) # It will also be considered dirty if run_test throws an exception ```	2023-01-26 11:58:00 +01:00
Pavel Emelyanov	dd307d8a42	test: Use tempdir from sstable_test_env The test cases in sstable_directory_test use a temporary directory that differs from the one sstables manager starts over. Fix that. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-01-26 11:47:06 +03:00
Pavel Emelyanov	0c3799db71	test: Add tmpdir to sstable test env This adds the test/lib's tmpdir instance _and_ configures the data_file_directories with this path. This makes sure sstables manager and the rest of the test use the same directory for sstables. For now it doesn't change anything, but helps next patching. (A neat side effect of this change is that sstable_test_env is now configured the same way as cql_test_env does) Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-01-26 11:47:06 +03:00
Pavel Emelyanov	fd559f3b81	tests: Use sstables manager to make sstables This test uses two many-args helpers from table class to create sstables with desired parameters. The table API in question is not used by any other code but these few places, so it's better to open-code it. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-01-26 10:47:39 +03:00
Pavel Emelyanov	9ccae1be18	test: Keep db::config as unique pointer The goal is to make it possible to make config with custom-initialized options in test_env::impl's constructor initializer list (next patch). Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-01-25 19:38:47 +03:00
Kamil Braun	a0ff33e777	test/pylib: scylla_cluster: don't leak server if stopping it fails `ScyllaCluster.server_stop` had this piece of code: ``` server = self.running.pop(server_id) if gracefully: await server.stop_gracefully() else: await server.stop() self.stopped[server_id] = server ``` We observed `stop_gracefully()` failing due to a server hanging during shutdown. We then ended up in a state where neither `self.running` nor `self.stopped` had this server. Later, when releasing the cluster and its IPs, we would release that server's IP - but the server might have still been running (all servers in `self.running` are killed before releasing IPs, but this one wasn't in `self.running`). Fix this by popping the server from `self.running` only after `stop_gracefully`/`stop` finishes. Make an analogous fix in `server_start`: put `server` into `self.running` before we actually start it. If the start fails, the server will be considered "running" even though it isn't necessarily, but that is OK - if it isn't running, then trying to stop it later will simply do nothing; if it is actually running, we will kill it (which we should do) when clearing after the cluster; and we don't leak it. Closes #12613	2023-01-25 16:58:02 +02:00
Alejo Sanchez	878cb45c24	pytest: test start after ungraceful stop Test case for a start of a server after it was stopped suddenly (instead of gracefully). This could cause commitlog flush issues. Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>	2023-01-25 14:49:27 +01:00
Alejo Sanchez	ccbd89f0cd	test.py: enable force_schema_commit_log To handle start after ungraceful stop, enable separate schema commit log from server start. Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>	2023-01-25 14:49:27 +01:00
Alexey Novikov	ce96b472d3	prevent populating cache with expired rows from sstables change row purge condition for compacting_reader to remove all expired rows to avoid read performance problems when there are many expired tombstones in row cache Refs #2252 Closes #12565	2023-01-25 12:59:40 +01:00
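
Commits 858803cc2c and b5ef57ecc2 above describe a pool with `build` and `destroy` callbacks, `put(is_dirty=True)`, and an atomic `replace_dirty`. The following is a minimal sketch of that idea, not the actual test/pylib code: the class layout, the `total`/`free` fields, and the `demo` helper are assumptions made for illustration. Only the names `Pool`, `build`, `destroy`, `get`, `put`, `replace_dirty` and `_build_and_get` come from the commit messages.

```python
import asyncio
from typing import Awaitable, Callable, Generic, List, TypeVar

T = TypeVar("T")


class Pool(Generic[T]):
    """Illustrative bounded pool (hypothetical, not the test/pylib implementation)."""

    def __init__(self, max_size: int,
                 build: Callable[[], Awaitable[T]],
                 destroy: Callable[[T], Awaitable[None]]) -> None:
        self.max_size = max_size
        self.build = build
        self.destroy = destroy
        self.total = 0              # slots in use: built objects or builds in progress
        self.free: List[T] = []     # clean objects ready for reuse
        self.cond = asyncio.Condition()

    async def get(self) -> T:
        async with self.cond:
            # Block until a clean object exists or there is room to build one.
            await self.cond.wait_for(
                lambda: bool(self.free) or self.total < self.max_size)
            if self.free:
                return self.free.pop()
            self.total += 1         # reserve the slot before building
        return await self._build_and_get()

    async def _build_and_get(self) -> T:
        try:
            return await self.build()
        except Exception:
            async with self.cond:   # building failed: give the slot back
                self.total -= 1
                self.cond.notify()
            raise

    async def put(self, obj: T, is_dirty: bool = False) -> None:
        if is_dirty:
            await self.destroy(obj)
        async with self.cond:
            if is_dirty:
                self.total -= 1     # the freed slot becomes visible to waiters
            else:
                self.free.append(obj)
            self.cond.notify()

    async def replace_dirty(self, obj: T) -> T:
        # The slot count never drops, so a concurrent waiter cannot take the space.
        await self.destroy(obj)
        return await self._build_and_get()


async def demo():
    counter = 0
    destroyed = []

    async def build() -> int:
        nonlocal counter
        counter += 1
        return counter

    async def destroy(obj: int) -> None:
        destroyed.append(obj)
        await asyncio.sleep(0)      # yield so a waiter gets a chance to run

    pool = Pool(1, build, destroy)
    first = await pool.get()
    waiter = asyncio.create_task(pool.get())
    await asyncio.sleep(0)          # let the waiter start waiting on the pool
    second = await pool.replace_dirty(first)
    waiter_still_blocked = not waiter.done()
    await pool.put(second)          # a clean return finally unblocks the waiter
    third = await waiter
    return first, second, waiter_still_blocked, third, destroyed


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

In this sketch, `put(obj, is_dirty=True)` followed by `get()` would let the waiting task claim the freed slot first, which is the growth in concurrent runs that commit 3eabe04f5d describes. `replace_dirty` keeps the slot reserved throughout.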

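
Commit a0ff33e777 above fixes a leak by removing a server from `self.running` only after the stop call has finished. A rough sketch of that ordering follows. The `FakeServer` and `Cluster` classes and the `demo` helper are hypothetical stand-ins written for illustration, not the real `ScyllaCluster` code; only `server_stop`, `stop_gracefully`, `stop`, `running` and `stopped` are names taken from the commit message.

```python
import asyncio
from typing import Dict


class FakeServer:
    """Hypothetical stand-in for a managed server whose graceful stop may fail."""

    def __init__(self, fail_on_stop: bool = False) -> None:
        self.fail_on_stop = fail_on_stop
        self.is_running = True

    async def stop_gracefully(self) -> None:
        if self.fail_on_stop:
            raise RuntimeError("server hung during shutdown")
        self.is_running = False

    async def stop(self) -> None:
        self.is_running = False


class Cluster:
    """Hypothetical cluster tracking servers by id."""

    def __init__(self) -> None:
        self.running: Dict[int, FakeServer] = {}
        self.stopped: Dict[int, FakeServer] = {}

    async def server_stop(self, server_id: int, gracefully: bool) -> None:
        server = self.running[server_id]    # look up, but do not pop yet
        if gracefully:
            await server.stop_gracefully()
        else:
            await server.stop()
        # Reached only if stopping succeeded, so a failed stop leaves the
        # server in `running`, where later cleanup can still find and kill it.
        self.stopped[server_id] = self.running.pop(server_id)


async def demo():
    cluster = Cluster()
    cluster.running[1] = FakeServer(fail_on_stop=True)
    cluster.running[2] = FakeServer()
    try:
        await cluster.server_stop(1, gracefully=True)
    except RuntimeError:
        pass                                # server 1 is still tracked as running
    await cluster.server_stop(2, gracefully=True)
    return sorted(cluster.running), sorted(cluster.stopped)


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

With the original ordering (pop first, then stop), server 1 would end up in neither dictionary after the failed stop, which is the state the commit message reports.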

11801 Commits