test/scylla-gdb tests Scylla's gdb debugging tools, and cannot work if
Scylla was compiled without debug information (i.e., the "dev" build mode).
In the past, test/scylla-gdb/run detected this case and printed a clear error:
Scylla executable was compiled without debugging information (-g)
so cannot be used to test gdb. Please set SCYLLA environment variable.
Unfortunately, this detection recently started failing, because even when
Scylla is compiled without debug information we link into it a library
(libwasmtime.a) which has *some* debug information. As a result, instead
of one clear error message, we get all scylla-gdb tests running -
and each of them failing separately. This is ugly and unhelpful.
Each of the tests fails because our "gdb" test fixture tries to load
scylla-gdb.py and fails when the symbols it needs (e.g., "size_t")
cannot be found. So in this patch, we check once for the existence
of this symbol - and if missing we exit pytest instead of failing each
individual test.
Moreover, if loading scylla-gdb.py fails for some other unexpected
reason, let's exit the test as well, instead of failing each individual
test.
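The check described above can be sketched like this (illustrative only, not the actual test/scylla-gdb fixture code; the `ptype size_t` probe and function names are assumptions):

```python
# Illustrative sketch, not the actual test/scylla-gdb fixture: decide once,
# from gdb's reply to a probe like "ptype size_t", whether the executable
# carries usable debug info, and abort the whole session if it does not.
import pytest

def has_debug_info(gdb_output: str) -> bool:
    # gdb answers 'No symbol "size_t" in current context.' when the
    # executable lacks the symbols scylla-gdb.py needs.
    return "No symbol" not in gdb_output

def check_debug_info(gdb_output: str) -> None:
    if not has_debug_info(gdb_output):
        # pytest.exit() stops the entire run with one clear message,
        # instead of letting every individual test fail the same way.
        pytest.exit("Scylla executable was compiled without debugging "
                    "information (-g) so cannot be used to test gdb. "
                    "Please set SCYLLA environment variable.")
```

The key point is session-wide abort (`pytest.exit`) rather than per-test failure.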
Fixes #10863.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Closes #10937
Closes #10930
* github.com:scylladb/scylla:
test: perf_row_cache_update: Flush std output after each line
test: perf_row_cache_update: Drain background cleaner before starting the test
test: perf_row_cache_update: Measure memtable filling time
test: perf_row_cache_update: Respect preemption when applying mutations
test: perf_row_cache_update: Drop unused pk variable
Before this patch, the test cql-pytest/test_tools.py left behind
a temporary file in /tmp. It used pytest's "tmp_path_factory" feature,
but it doesn't remove temporary files it creates.
This patch removes the temporary file when the fixture using it ends,
but moreover, it puts the temporary file not in /tmp but rather next
to Scylla's data directory. That directory will be eventually removed
entirely, so even if we accidentally leave a file there, it will
eventually be deleted.
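The fixture pattern described above can be sketched as follows (illustrative names, not the exact cql-pytest code):

```python
# Illustrative sketch of the fixture pattern, not the exact cql-pytest code:
# put the temporary file next to the data directory (which is removed
# wholesale at the end of the run), and clean it up when the fixture ends.
import os
import tempfile
from contextlib import contextmanager

@contextmanager
def temporary_file_in(data_dir: str):
    fd, path = tempfile.mkstemp(dir=data_dir)  # not in /tmp
    os.close(fd)
    try:
        yield path
    finally:
        # Teardown cleanup; even if this is skipped (e.g., a crash), the
        # data directory is eventually deleted entirely, taking the file
        # with it.
        if os.path.exists(path):
            os.unlink(path)
```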
Fixes #10924
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Closes #10929
There is a bug introduced in e74c3c8 (4.6.0) which makes the memtable
reader skip a range tombstone for a certain pattern of deletions
and under a certain sequence of events.
_rt_stream contains the result of deoverlapping range tombstones which
had the same position, which were skipped from all the versions. The
result of deoverlapping may produce a range tombstone which starts
later, at the same position as a more recent tombstone which has not
been skipped from the partition version yet. If we consume the old
range tombstone from _rt_stream and then refresh the iterators, the
refresh will skip over the newer tombstone.
The fix is to drop the logic which drains _rt_stream so that
_rt_stream is always merged with partition versions.
For the problem to trigger, there have to be multiple MVCC versions
(at least 2) which contain deletions of the following form:
[a, c] @ t0
[a, b) @ t1, [b, d] @ t2
c > b
The proper sequence for such versions is (assuming d > c):
[a, b) @ t1,
[b, d] @ t2
Due to the bug, the reader will produce:
[a, b) @ t1,
[b, c] @ t0
The reader also needs to be preempted right before processing [b, d] @
t2 and iterators need to get invalidated so that
lsa_partition_reader::do_refresh_state() is called and it skips over
[b, d] @ t2. Otherwise, the reader will emit [b, d] @ t2 later. If it
does emit the proper range tombstone, it's possible that it will violate
fragment order in the stream if _rt_stream accumulated remainders
(possible with 3 MVCC versions).
The problem goes away once MVCC versions merge.
Fixes #10913
Fixes #10830
Closes #10914
The commits here were extracted from PR https://github.com/scylladb/scylla/pull/10835 which implements upgrade procedure for Raft group 0.
They are mostly refactors which don't affect the behavior of the system, except one: the commit 4d439a16b3 causes all schema changes to be bounced to shard 0. Previously, they would only be bounced when the local Raft feature was enabled. I do that because:
1. eventually, we want this to be the default behavior
2. in the upgrade PR I remove the `is_raft_enabled()` function - the function was basically created with the mindset "Raft is either enabled or not" - which was right when we didn't support upgrade, but will be incorrect when we introduce intermediate states (when we upgrade from non-raft-based to raft-based operations); the upgrade PR introduces another mechanism to dispatch based on the upgrade state, but for the case of bouncing to shard 0, dispatching is simply not necessary.
Closes #10864
* github.com:scylladb/scylla:
service/raft: raft_group_registry: add assertions when fetching servers for groups
service/raft: raft_group_registry: remove `_raft_support_listener`
service/raft: raft_group0: log adding/removing servers to/from group 0 RPC map
service/raft: raft_group0: move group 0 RPC handlers from `storage_service`
service/raft: messaging: extract raft_addr/inet_addr conversion functions
service: storage_service: initialize `raft_group0` in `main` and pass a reference to `join_cluster`
treewide: remove unnecessary `migration_manager::is_raft_enabled()` calls
test/boost: memtable_test: perform schema operations on shard 0
test/boost: cdc_test: remove test_cdc_across_shards
message: rename `send_message_abortable` to `send_message_cancellable`
message: change parameter order in `send_message_oneway_timeout`
There are effectively several test cases in this test, each of which
calls scylla_sstable() to prepare, and thus each creates a type in the
same Scylla instance. The 2nd attempt ends up with the "already exists" error:
E cassandra.InvalidRequest: Error from server: code=2200 [Invalid query] message="A user type of name cql_test_1656396925652.type1 already exists"
tests: unit(dev)
https://jenkins.scylladb.com/job/releng/job/Scylla-CI/1075/
Fixes: #10872
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Message-Id: <20220628081459.12791-1-xemul@scylladb.com>
A number of improvements in test.py as requested by maintainers:
* don't capture pytest output
* stick to the specific server in control connections
* support --log-level option and pass it to logging module
* when checking if CQL is up, ignore timeout errors
* no longer force schema migration when starting the server
* use test uname, not id, in log output
* improve logging of ScyllaServer
* log what cluster is used for a test
* extend xml output with logs
On the same token, remove mypy warnings and make linter pass on test.py, as well as add some type checking.
Fixes #10871
Fixes #10785
Closes #10902
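One of the items listed above, the --log-level option passed to the logging module, might be plumbed roughly like this (a hedged sketch, not the actual test.py code):

```python
# Hedged sketch of --log-level plumbing (not the actual test.py code):
# accept the option on the command line and hand it to the logging module.
import argparse
import logging

def setup_logging(argv):
    parser = argparse.ArgumentParser()
    parser.add_argument("--log-level", default="INFO",
                        choices=["DEBUG", "INFO", "WARNING", "ERROR"])
    args = parser.parse_args(argv)
    # Map the option string to the logging module's numeric level.
    logging.basicConfig(level=getattr(logging, args.log_level))
    return args.log_level
```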
* github.com:scylladb/scylla:
test.py: extend xml output with logs
test.py: log what cluster is used for a test
test.py: improve logging of ScyllaServer
test.py: use test uname, not id, in log output
test.py: support --log-level option and pass it to logging module
test.py: make ScyllaServer more reliable and fast
test.py: don't capture pytest output
test.py: add type annotations
test.py: convert log_filename to pathlib
test.py: please linter
test.py: remove mypy warnings
Currently, for users who have permissions_cache configs set to very high
values (and thus can't wait for the configured times to pass) having to restart
the service every time they make a change related to permissions or
prepared_statements cache (e.g. Adding a user and changing their permissions)
can become pretty annoying.
This patch series makes permissions_validity_in_ms, permissions_update_interval_in_ms
and permissions_cache_max_entries live updateable so that restarting the
service is not necessary anymore for these cases.
It also adds an API for flushing the cache to make it easier for users who
don't want to modify their permissions_cache config.
branch: https://github.com/igorribeiroduarte/scylla/tree/make_permissions_cache_live_updateable
CI: https://jenkins.scylladb.com/job/releng/job/Scylla-CI/1005/
dtests: https://github.com/igorribeiroduarte/scylla-dtest/tree/test_permissions_cache
* https://github.com/igorribeiroduarte/scylla/make_permissions_cache_live_updateable:
loading_cache_test: Test loading_cache::reset and loading_cache::update_config
api: Add API for resetting authorization cache
authorization_cache: Make permissions cache and authorized prepared statements cache live updateable
auth_prep_statements_cache: Make auth_prep_statements_cache accept a config struct
utils/loading_cache.hh: Add update_config method
utils/loading_cache.hh: Rename permissions_cache_config to loading_cache_config and move it to loading_cache.hh
utils/loading_cache.hh: Add reset method
Validate that the size of the cache is zero after calling the
reset method and that the config is being updated correctly
after calling update_config.
Signed-off-by: Igor Ribeiro Barbosa Duarte <igor.duarte@scylladb.com>
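What that test validates could be sketched like this in Python (illustrative only; the real cache is the C++ template in utils/loading_cache.hh, and these names are assumptions):

```python
# Illustrative Python sketch, not the Scylla C++ implementation: a cache
# whose config can be swapped at runtime, shrinking immediately if needed.
from collections import OrderedDict
from dataclasses import dataclass

@dataclass
class LoadingCacheConfig:
    max_entries: int

class LoadingCache:
    def __init__(self, config):
        self._config = config
        self._entries = OrderedDict()

    def put(self, key, value):
        self._entries[key] = value
        self._evict()

    def update_config(self, config):
        # Live update: apply the new limits without restarting the service.
        self._config = config
        self._evict()

    def reset(self):
        # Flush everything, e.g. after a permissions change.
        self._entries.clear()

    def _evict(self):
        while len(self._entries) > self._config.max_entries:
            self._entries.popitem(last=False)  # drop the oldest entry
```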
For cases where we have very high values set to permissions_cache validity and
update interval (E.g.: 1 day), whenever a change to permissions is made it's
necessary to update scylla config and decrease these values, since waiting for
all this time to pass wouldn't be viable.
This patch adds an API for resetting the authorization cache so that changing
the config won't be mandatory for these cases.
Usage:
$ curl -X POST http://localhost:10000/authorization_cache/reset
Signed-off-by: Igor Ribeiro Barbosa Duarte <igor.duarte@scylladb.com>
Currently, for users who have permissions_cache configs set to very high
values (and thus can't wait for the configured times to pass) having to restart
the service every time they make a change related to permissions or
prepared_statements cache (e.g., adding a user) can become pretty annoying.
This patch makes permissions_validity_in_ms, permissions_update_interval_in_ms
and permissions_cache_max_entries live updateable so that restarting the
service is not necessary anymore for these cases.
Signed-off-by: Igor Ribeiro Barbosa Duarte <igor.duarte@scylladb.com>
This patch makes authorized_prepared_statements_cache accept a config struct,
similarly to permissions_cache. This will make it easier to make this cache
live updateable on the next patch.
Signed-off-by: Igor Ribeiro Barbosa Duarte <igor.duarte@scylladb.com>
This patch adds an update_config method in order to allow live updating the
config for permissions_cache. This method is going to be used in the next
patches after making permissions_cache config live updateable.
Signed-off-by: Igor Ribeiro Barbosa Duarte <igor.duarte@scylladb.com>
This patch renames the permissions_cache_config struct to loading_cache_config
and moves it to utils/loading_cache.hh. This will make it easier to handle
config updates to the authorization caches on the next patches
Signed-off-by: Igor Ribeiro Barbosa Duarte <igor.duarte@scylladb.com>
Change tests to use async mode and add helpers and tests for schema changes.
These test series will be expanded with topology changes.
Closes #10550
* github.com:scylladb/scylla:
test.py topology: repro for issue #1207
test.py: port fixture fails_without_raft
test.py topology: table methods to add/remove index
test.py topology: add/drop table column helpers
test.py topology: insert sequential row
test.py: remove deprecated test test_null
test.py: managed random tables
test.py: test_keyspace fixture async
test.py: rename fixture test_keyspace to keyspace
test.py topology: test with asyncio
This PR adds necessary modifications to perf_simple_query so that it can be used to test performance of the timeout handling path. With an appropriate combination of flags, it is possible to consistently trigger timeouts on every operation.
The following flags are added:
- `--stop-on-error` - if true (which is the default), the test stops after encountering the first exception and reports it; otherwise it causes errors to be counted and reported at the end.
- `--timeout <x>` - allows using `USE TIMEOUT <x>` in the benchmark query/statement.
- `--bypass-cache` - uses `BYPASS CACHE` in the benchmark query (relevant only to reads).
Examples:
```
./build/release/test/perf/perf_simple_query --smp=1 --operations-per-shard=1000000 --write
131023.65 tps ( 56.2 allocs/op, 13.2 tasks/op, 49784 insns/op, 0 errors)
./build/release/test/perf/perf_simple_query --smp=1 --operations-per-shard=1000000 --write --stop-on-error=false --timeout=0s
97163.73 tps ( 53.1 allocs/op, 5.1 tasks/op, 78687 insns/op, 1000000 errors)
./build/release/test/perf/perf_simple_query --smp=1 --operations-per-shard=1000000
154060.36 tps ( 63.1 allocs/op, 12.1 tasks/op, 42998 insns/op, 0 errors)
./build/release/test/perf/perf_simple_query --smp=1 --operations-per-shard=1000000 --stop-on-error=false --flush --bypass-cache --timeout=0s
30127.43 tps ( 48.2 allocs/op, 14.3 tasks/op, 312416 insns/op, 1000000 errors)
```
Refs: #2363
Closes #10899
* github.com:scylladb/scylla:
test: perf: add bypass cache argument
test: perf: add timeout argument
test: perf: count errors and report the count in results
test: perf: add stop-on-error argument
test: perf: coroutinize run_worker()
test: perf: fix crash on exception in time_parallel_ex
This fixes a quadratic behavior in case lots of snapshots with range
tombstones are queued for merging. Before the change, new snapshots
were inserted at the front, which is also where the worker looks
at. Merging a version has a linear component in complexity function
which depends on the number of range tombstones. If we merge snapshots
starting from the latest to oldest then the whole process becomes
quadratic because the version which is merged accumulates an
increasing amount of tombstones, ones which were already merged
before. We should instead merge starting from the oldest snapshots;
this way each tombstone is applied exactly once during merge.
This bug got worse after 4bd4aa2e88,
which makes merging tombstones more expensive.
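A toy cost model (not Scylla code) makes the asymptotics concrete: with n snapshots of roughly equal tombstone count, newest-first merging re-scans already-merged tombstones and costs O(n^2), while oldest-first touches each tombstone once:

```python
# Toy cost model, not Scylla code: count how many times range tombstones
# are touched when merging a queue of snapshot versions.
def cost_newest_first(sizes):
    # New snapshots were inserted at the front, where the worker looks, so
    # the version being merged accumulates tombstones already merged before.
    touched, accumulated = 0, 0
    for s in sizes:
        accumulated += s        # the merged version keeps growing
        touched += accumulated  # each merge re-scans what it accumulated
    return touched

def cost_oldest_first(sizes):
    # Merging from the oldest snapshot applies each tombstone exactly once.
    return sum(sizes)
```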
Closes #10916
When the run scripts for tests of cql-pytest, alternator, redis, etc.,
run Scylla, they should set the UBSAN_OPTIONS and ASAN_OPTIONS so that
if the executable is built with sanitizers enabled, it will ignore false
positives that we know about, and fail on real errors.
The change in this patch affects all test/*/run scripts which use this
shared Scylla-starting code. test.py already had the same settings,
and it affected the tests that it knows to run directly (unit tests,
cql-pytest, etc.).
Fixes #10904
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Closes #10915
Add test and server logs, as well as the unidiff, to
XML output. This makes Jenkins reports nicer.
While on it, debug & fix bugs in handling of flaky tests:
- the reset would reset a flaky test even after the last attempt
fails, so it would be impossible to see what happened to it
- the args needed to be reset as well, since execution modifies
them
- we would say that we're going to retry the flaky test when in
fact it was the last attempt to run it and no more retries were
planned
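The corrected retry logic can be sketched like this (a hedged sketch with illustrative names, not the actual test.py code):

```python
# Hedged sketch of the corrected flaky-test retry loop (illustrative, not
# the actual test.py code): rebuild args before every attempt, announce a
# retry only when one is actually planned, and leave the state of the final
# failed attempt intact for inspection.
def run_flaky(test_fn, make_args, max_attempts=3, log=lambda msg: None):
    for attempt in range(1, max_attempts + 1):
        args = make_args()  # execution modifies args, so rebuild each time
        if test_fn(args):
            return True
        if attempt < max_attempts:
            # Only reset and announce a retry when another attempt is
            # planned; after the last attempt, keep everything as-is.
            log("flaky test failed, retrying (attempt %d of %d)"
                % (attempt, max_attempts))
    return False
```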
1) Stick to the specific server in control connections.
It could happen that, when starting a cluster and checking
if a specific node is up, the check would actually execute
against an already running node. Prevent this from happening
by setting a white list connection balancing policy for control
connections.
2) When checking if CQL is up, ignore timeout errors
Scylla in debug mode can easily time out on a DDL query,
and the timeout error at start up would lead to the entire cluster
marked as broken. This is too harsh, allow timeouts at start.
3) No longer force schema migration when starting the server
By default, Raft is on, so the nodes are getting schema
through Raft leader. Schema migration significantly slows
down cluster start in debug mode (60 seconds -> 100 seconds),
and even though it was a great test that helped discover
several bugs in Scylla, it shouldn't be part of normal
cluster boot, so disable it.
Repro for a bug in concurrent schema changes when many tables and
indexing are involved.
Alter tables by doing, in parallel, new table creation, altering a table
(_alter), and indexing other tables (_index).
Original repro had sets of 20 of those and slept for 20 seconds to
settle. This repro does it for Scylla with just 1 set and 1 second.
This issue goes away once Raft is enabled.
https://github.com/scylladb/scylla/issues/1207
Originally at https://issues.apache.org/jira/browse/CASSANDRA-10250
Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>
Port fails_without_raft to higher level conftest file for future use in
topology pytests.
While there, make it async.
Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>
For each table keep a counter and insert rows with sequential values
generated correspondingly by each column's type.
Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>
Helpers to create keyspace and manage randomized tables.
Fixture drops all created tables still active after the test finishes.
Includes helper methods to verify schema consistency.
These helpers will be used in Raft schema changes tests coming later.
Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>
Run test async using a wrapper for Cassandra python driver's future.
The wrapper was suggested by a user and brought forward by @fruch.
It's based on https://stackoverflow.com/a/49351069 .
Redefine pytest event_loop fixture to avoid issues with fixtures with
scope bigger than function (like keyspace).
See https://github.com/pytest-dev/pytest-asyncio/issues/68
Convert sample test_null to async. More useful test cases will come
afterwards.
Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>
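The wrapper pattern referenced above (based on the Stack Overflow answer; a sketch, not the exact test code) bridges the driver's callback-style future into an awaitable asyncio future:

```python
# Sketch of the wrapper pattern: bridge a driver-style future exposing
# add_callbacks(callback, errback) into an awaitable asyncio future.
import asyncio

def wrap_future(response_future, loop):
    aio_future = loop.create_future()
    # Driver callbacks may fire on the driver's own threads, so hop back
    # to the event loop thread before touching the asyncio future.
    response_future.add_callbacks(
        lambda result: loop.call_soon_threadsafe(
            aio_future.set_result, result),
        lambda exc: loop.call_soon_threadsafe(
            aio_future.set_exception, exc))
    return aio_future
```

With the Cassandra driver, `response_future` would be the object returned by `session.execute_async()`.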
- Use `sstables::generation_type` in more places
- Enforce conceptual separation of `sstables::generation_type` and `int64_t`
- Fix `extremum_tracker` so that `sstables::generation_type` can be non-default-constructible
Fixes #10796.
Closes #10844
* github.com:scylladb/scylla:
sstables: make generation_type an actual separate type
sstables: use generation_type more soundly
extremum_tracker: do not require default-constructible value types
Fixes #9367
The CL counters pending_allocations and requests_blocked_memory are
exposed in Grafana (etc.) and often referred to as metrics on whether
we are blocking on commit log. But they don't really show this, as
they only measure whether or not we are blocked on the memory bandwidth
semaphore that provides rate back pressure (fixed num bytes/s - sortof).
However, actual tasks in allocation or segment wait is not exposed, so
if we are blocked on disk IO or waiting for segments to become available,
we have no visible metrics.
While the "old" counters certainly are valid, I have yet to ever see them
be non-zero in modern life.
Closes #9368