scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-06-01 20:46:56 +00:00

Author	SHA1	Message	Date
Benny Halevy	e5538cf52e	test: mutation_write_test: test_timestamp_based_splitting_mutation_writer: no need to downgrade reader to v1 Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Message-Id: <20220315083425.2786228-2-bhalevy@scylladb.com>	2022-03-15 11:41:11 +02:00
Benny Halevy	90edddd7e3	everywhere: use make_flat_mutation_reader_from_mutations_v2 Rather than upgrade_to_v2(make_flat_mutation_reader_from_mutations) Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Message-Id: <20220315083425.2786228-1-bhalevy@scylladb.com>	2022-03-15 11:41:10 +02:00
Nadav Har'El	189ff5414f	test/cql-pytest: implement test_tools.py without run-script cooperation In commit `afab1a97c6`, we added test_tools.py - tests for the various tools embedded in the Scylla executable. These tests need to know where the Scylla executable is, and also where its sstables are stored. For this, the commit added two test parameters - "--scylla-path" and "--workdir" - with which the "run" script communicated this knowledge to the test. However, that implementation meant that these tests only work if the test was run via the test/cql-pytest/run script - they won't work if the user ran Scylla/pytest manually, or through some other script not passing these options. This patch drops the "--scylla-path" and "--workdir" parameters, and instead the test figures out this information on its own: 1. To find the Scylla executable, we begin by looking (using the local_process_id(cql) function from the previous patch) for a local process which listens to our CQL connection, and then find the executable's path using /proc. 2. To find the Scylla data directory (which is what we really need, not workdir which is just a shortcut to set all directories!), we retrieve this configuration from the system.config table through CQL. I tested that test_tools.py now works not only through test/cql-pytest/run but also if I run Scylla manually and then run "pytest test_tools.py" without any extra parameters. Fixes #10209 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20220314151125.2737815-2-nyh@scylladb.com>	2022-03-14 20:25:22 +02:00
Nadav Har'El	8ed0909cc3	test/cql-pytest: add mechanism and example of testing Scylla log messages Generally, cql-pytest tests do not, and should not rely on looking up messages in the Scylla log file: Relying on such messages makes it impossible to run the same test against Cassandra or even a remotely-installed Scylla, and the tests tend to break when logging (which is not considered part of our API) changes. Moreover, usually what our dtests achieve by looking at the log - e.g., figuring out when some event has happened - can be achieved through official CQL APIs, and this is what normal users do anyway (users don't normally dig through the log to figure out when their operation completed). However, sometimes we do want to write a test to confirm that during a certain operation, a certain log message gets written to Scylla's log. A desire to do this was raised by @fruch and @soyacz, so in this patch I provide a mechanism to do this, and a trivial example - which checks that a "Creating ..." message appears in the log whenever a table is created, and "Dropping ..." when the table is deleted. As is explained in detail in comments in the patch, Scylla's log file is found automatically, without relying on Scylla's runner (such as the script test/cql-pytest/run) communicating to the test where the log file is. If the log file can't be found - e.g., we're testing a remote Scylla, or if this isn't Scylla, the tests are skipped. I would like all logfile-testing tests to be in the same file, test_logs.py. As I explained above, I think it is a mistake for general tests to check the log file just because they can. I think that the only tests that should use the log file are tests deliberately written to check what gets logged - and those can be collected in the same file. As part of this patch, we add the utility function local_process_id(cql) to find (if we can) the local process which listens to the connection "cql". This utility function will later be useful in more places - for example test_tools.py needs to find Scylla's executable. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20220314151125.2737815-1-nyh@scylladb.com>	2022-03-14 20:25:20 +02:00
Lukasz Sojka	c65f1c3b47	test/cql-pytest: add warnings test cql client should return warnings when a batch exceeds a certain size. This test verifies that the response contains them. Test covers issue: https://github.com/scylladb/scylla/issues/10196 Signed-off-by: Lukasz Sojka <lukasz.sojka@scylladb.com> Closes #10197	2022-03-14 19:49:06 +02:00
Benny Halevy	37dc31c429	api: storage_service: force_keyspace_compaction: compact one table at a time To make major compaction more resilient to low-disk-space conditions, `342bfbd65a` sorted the tables based on their live disk space used. However, each shard still makes progress at its own pace. This change serializes major compaction between tables so we still compact in parallel on all shards, but one (distributed) table at a time. As a follow-up, we can consider serializing even at the single shard level when disk space is critically low, so we can't even risk parallel compaction across all shards. Refs scylladb/scylla-dtest#2653 Test: unit(dev) Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Message-Id: <20220313153814.2203660-1-bhalevy@scylladb.com>	2022-03-14 15:39:23 +02:00
Raphael S. Carvalho	1a2332a0ba	compaction: Move release_exhausted out of the compaction descriptor With compact_sstables() now living in compaction_manager::task, release_exhausted no longer has to live inside compaction_descriptor, which is a good direction because implementation detail is being removed from the interface. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <20220311023410.250149-2-raphaelsc@scylladb.com>	2022-03-14 15:39:23 +02:00
Raphael S. Carvalho	fce9d869b4	compaction: Move table::compact_sstables() into compaction manager Table submits compaction request into manager, which in turn calls back table to run the compaction when the time has come, i.e.: table -> compaction manager -> table -> execute compaction But manager should not rely on table to run compaction, as compaction execution procedure sits one layer below the manager and should be accessed directly by it, i.e: table -> compaction manager -> execute compaction This makes code easier to understand and update_compaction_history() can now be noop for unit tests using table_state. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <20220311023410.250149-1-raphaelsc@scylladb.com>	2022-03-14 15:39:23 +02:00
Botond Dénes	964d9e033d	Merge "raft_group_registry: drain_on_shutdown" from Benny Halevy " This series hardens raft_group_registry::stop_servers and uses it to drain_on_shutdown, called before the database is stopped in cql_test_env. (Not needed for main). raft_group_registry deferred_stop is introduced right after the service is started to make sure it's properly stopped even if there's an exception at any point while starting. Test: unit(dev) " * tag 'raft_group_registry-drain_on_shutdown-v1' of https://github.com/bhalevy/scylla: cql_test_env: raft_group_registry::drain_on_shutdown before stopping the database raft_group_registry: harden stop_servers raft_group_registry: delete unused _shutdown_gate	2022-03-14 14:22:46 +02:00
Avi Kivity	e7fb71020b	Merge 'replica: Optimize empty_flat_reader out of the hot path' from Michał Chojnowski When row_cache::make_reader() and memtable::make_flat_reader() see that the query result is empty, they return empty_flat_reader, which is a trivial implementation of flat_mutation_reader. Even though empty_flat_reader doesn't do anything meaningful, it still needs to be created, handled in merging_reader and destroyed. Turns out this is costly. This patch series replaces hot path uses of empty_flat_reader with an empty optional. Performance effects: `perf_simple_query --smp 1` TPS: 138k -> 168k allocs/op: 80.2 -> 71.1 insns/op: 49.9k -> 45.1k `perf_simple_query --smp 1 --enable-cache=1 --flush` TPS: 125k -> 150k allocs/op: 79.2 -> 71.1 insns/op: 51.7k -> 47.2k For a cassandra-stress benchmark (localhost, 100% cache reads) this translates to a TPS increase from ~42k to ~48k per hyperthread. Note that this optimization is effective for single-partition reads where the queried partition is only in cache/sstables or only in memtables. Other queries (e.g. where the partition is in both cache and memtables and needs to be merged) are unaffected. Closes #10204 * github.com:scylladb/scylla: replica: Prefer row_cache::make_reader_opt() to row_cache::make_reader() row_cache: Add row_cache::make_reader_opt() replica: Prefer memtable::make_flat_reader_opt() to memtable::make_flat_reader() memtable: Add memtable::make_flat_reader_opt() [avi: adjust #include for readers/ split]	2022-03-14 14:07:00 +02:00
Mikołaj Sielużycki	1d84a254c0	flat_mutation_reader: Split readers by file and remove unnecessary includes. The flat_mutation_reader files were conflated and contained multiple readers, which were not strictly necessary. Splitting optimizes both iterative compilation times, as touching rarely used readers doesn't recompile large chunks of codebase. Total compilation times are also improved, as the sizes of flat_mutation_reader.hh and flat_mutation_reader_v2.hh have been reduced and those files are included by many files in the codebase. With changes: real 29m14.051s user 168m39.071s sys 5m13.443s Without changes: real 30m36.203s user 175m43.354s sys 5m26.376s Closes #10194	2022-03-14 13:20:25 +02:00
Benny Halevy	26b1be0b8f	test: lib: random_mutation_generator: accept optional random seed Provide an easy way to instrument a particular test case to use a given random number seed (that's currently already printed to the test log). Refs #5349 Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Message-Id: <20210907114537.3464004-1-bhalevy@scylladb.com>	2022-03-14 13:09:36 +02:00
Michał Chojnowski	83efb508d6	replica: Prefer row_cache::make_reader_opt() to row_cache::make_reader() The former is significantly cheaper when there is nothing to be read.	2022-03-14 12:02:49 +01:00
Michał Chojnowski	6c6519a909	row_cache: Add row_cache::make_reader_opt()	2022-03-14 12:02:49 +01:00
Michał Chojnowski	f211ef9d71	replica: Prefer memtable::make_flat_reader_opt() to memtable::make_flat_reader() The former is significantly cheaper when there is nothing to be read.	2022-03-14 12:02:49 +01:00
Michał Chojnowski	218f2b6e98	memtable: Add memtable::make_flat_reader_opt() When there is nothing to read, make_flat_reader() returns an empty (no-op) reader. But it turns out that constructing, combining and destroying that empty reader is quite costly. As an optimization, add an alternative version which returns an empty optional instead.	2022-03-14 12:02:49 +01:00
Benny Halevy	8481852c91	cql_test_env: raft_group_registry::drain_on_shutdown before stopping the database We're currently stopping raft_gr before shutting the database down, but we fail to do that if anything goes wrong before that, e.g. if distributed_loader::init_non_system_keyspaces fails. This change splits drain_on_shutdown out of stop() to stop the raft groups before the database is stopped and does the rest in a deferred_stop placed right after the raft_gr registry is started. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-03-14 11:49:44 +02:00
Benny Halevy	ac307d6a62	raft_group_registry: harden stop_servers stop_servers should never fail since it's called on the shutdown path. Use a local gate in stop_servers() to wait on all background raft group server aborts. Also, handle theoretical exceptions from server::abort() to guarantee success. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-03-14 11:49:44 +02:00
Benny Halevy	ab30feb71d	raft_group_registry: delete unused _shutdown_gate Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-03-14 11:49:44 +02:00
Piotr Dulikowski	2415a1d169	abstract_read_resolver: bring back cancelling timeout timer on read failure Recent PR #10092 (propagating read timeouts on coordinator without throwing) accidentally removed a line which cancelled `abstract_read_resolver`'s `_timeout` timer after a read failure. Because of that, it might happen that after a read failure the timer is triggered and the `_done_promise` is set twice which triggers an assert in seastar. This commit brings back the line which cancels the timeout timer. Fixes: #10193 Closes #10206	2022-03-14 09:43:32 +01:00
Nadav Har'El	383aa326cc	cql-pytest: translate Cassandra's tests for BATCH operations This is a translation of Cassandra's CQL unit test source file validation/operations/BatchTest.java into our cql-pytest framework. This test file includes 13 tests for various types of BATCH operations. All tests pass on Scylla - no known or new bugs were reproduced. Two of the tests involve very slow testing of TTLs, so after verifying they work I marked them "skip" for now (we can always turn them on later, perhaps after reducing the length or number of the sleeps). Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20220313121634.2611423-1-nyh@scylladb.com>	2022-03-14 09:43:02 +01:00
Piotr Sarna	83ec505fab	cql3: add tracing indexed aggregate queries Commit `1c99ed6ced` added tracing logs about the index chosen for the query, but aggregate queries have a separate code path, which wasn't taken into account. After this patch, tracing for aggregate queries also includes this additional information. Closes #10195	2022-03-11 15:27:03 +02:00
Raphael S. Carvalho	67a7b7a3f4	compaction: rename interrupt() to a descriptive name interrupt() makes it sound like it's interrupting the compaction, but it's actually called on interrupt, to handle the interrupt scenario. Let's rename it to on_interrupt(). Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <20220311000128.189840-1-raphaelsc@scylladb.com>	2022-03-11 10:16:34 +02:00
Michał Sala	c8413631af	forward_service: change implicit lambda capture list to explicit one Changing the capture list of a lambda in forward_service::execute_on_this_shard from [&] to an explicit one enables greater readability and prevents potential bugs. Closes #10191	2022-03-10 17:30:06 +02:00
Botond Dénes	e9ba8ad43a	Merge "Configure gossiper the "classical" way" from Pavel Emelyanov The services' configuration should be performed with the help of service-specific config that's filled by the service creator. This is not the case for gossiper that grabs the db::config and keeps reference on it throughout its lifetime. This set brings the gossiper configuration to the described form by putting the needed config bits onto gossip_config (that already exists and is partially used for gossiper configuration). And two live-updateable options need extra care. tests: unit(dev), dtest.simple_boot_shutdown(dev) * 'br-gossiper-no-db-config' of https://github.com/xemul/scylla: gossiper: Remove db::config reference from gossiper gossiper: Keep live-updateable options on gossiper gossiper: Keep immutable options on gossip_config	2022-03-10 16:35:41 +02:00
Botond Dénes	ab440e1a07	mutation_writer: drop now unused v1 variants of bucket_writer feed_writer() Signed-off-by: Botond Dénes <bdenes@scylladb.com> Message-Id: <20220302145945.189607-2-bdenes@scylladb.com>	2022-03-10 15:20:07 +02:00
Botond Dénes	108d921fc9	mutation_writer: partition_based_splitting_writer: convert implementation to v2 Although its API was long converted to v2, its implementation stayed v1 because the memtable and mutation API were still v1. Now that the memtable flush returns a v2 reader we can have a second look at converting this. While the mutation API still uses v1, this can easily be worked around by going through `mutation_rebuilder_v2`. Signed-off-by: Botond Dénes <bdenes@scylladb.com> Message-Id: <20220302145945.189607-1-bdenes@scylladb.com>	2022-03-10 15:20:07 +02:00
Botond Dénes	7e0b51ff23	Merge 'Overhaul compaction_manager::task' from Benny Halevy The series overhauls the compaction_manager::task design and implementation by properly layering the functionality between the compaction_manager that deals with generic task execution, and the per-task business logic that is defined in a set of classes derived from the generic task class. While at it, the series introduces `task::state` and a set of helper functions to manage it to prevent leaks in the statistics, fixing #9974. Two more stats counters were exposed: `completed_tasks` and a new `postponed_tasks`. Test: sstable_compaction_test Dtest: compaction_test.py compaction_additional_test.py Fixes #9974 Closes #10122 * github.com:scylladb/scylla: compaction_manager: use coroutine::switch_to compaction_manager::task: drop _compaction_running compaction_manager: move per-type logic to derived task compaction_manager: task: add state enum compaction_manager: task: add maybe_retry compaction_manager: reevaluate_postponed_compactions: mark as noexcept compaction_manager: define derived task types compaction_manager: register_metrics: expose postponed_compactions compaction_manager: register_metrics: expose failed_compactions compaction_manager: register_metrics: expose _stats.completed_tasks compaction: add documentation for compaction_type to string conversions compaction: expose to_string(compaction_type) compaction_manager: task: standardize task description in log messages compaction_manager: refactor can_proceed compaction_manager: pass compaction_manager& to task ctor compaction_manager: use shared_ptr<task> rather than lw_shared_ptr compaction_manager: rewrite_sstables: acquire _maintenance_ops_sem once compaction_manager: use compaction_state::lock only to synchronize major and regular compaction	2022-03-10 13:33:56 +02:00
Benny Halevy	5e1fda7e1d	compaction_manager: use coroutine::switch_to Saving an allocation for running the functor as a task in the switched-to scheduling group. Also, switch to the desired scheduling group at the beginning of the task so that the higher level logic, like getting the list of sstables to compact will be performed under the desired scheduling group, not only the compaction code itself. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-03-10 12:20:01 +02:00
Benny Halevy	8c66916652	compaction_manager::task: drop _compaction_running Replace the _compaction_running boolean member by calculating _state == state::active now that setup_new_compaction switches state to `active` Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-03-10 12:20:01 +02:00
Benny Halevy	a2a5e530f0	compaction_manager: move per-type logic to derived task Move the business logic into the task specific classes. Separating initialization during task construction, from the compaction_done task, moved into a do_run() method, and in some cases moving a lambda function that was called per table (as in rewrite_sstables) into a private method of the derived class. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-03-10 12:20:01 +02:00
Benny Halevy	2e6ce43a97	compaction_manager: task: add state enum Add an enum class representing the task state machine and a switch_state function to transition between the states and update the corresponding compaction_manager stats counters. Refs #9974 Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-03-10 12:19:59 +02:00
Mikołaj Sielużycki	5920349357	row_cache: Make row_cache reader from sstables compacting. Reading data from sstables without compacting first puts unnecessary pressure on the cache. The mutation streams need to be resolved anyway before passing to subsequent consumers, so it's better to do it as close to the source as possible. Fixes: #3568 Closes #10188	2022-03-10 11:40:10 +02:00
Benny Halevy	9c59d66b7e	compaction_manager: task: add maybe_retry Replacing and combining compaction_manager methods: maybe_stop_on_error and put_task_to_sleep. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-03-10 11:35:37 +02:00
Benny Halevy	ee32be3aa5	compaction_manager: reevaluate_postponed_compactions: mark as noexcept To simplify error handling in following patches that will coroutinize task logic. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-03-10 11:35:37 +02:00
Benny Halevy	72162ed653	compaction_manager: define derived task types Turn task into a class, defining a clear hierarchy of private, protected, and public methods. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-03-10 11:35:35 +02:00
Avi Kivity	2f967f84d4	Merge "Migrate sstable writer to v2" from Botond " This patch-set converts the sstable writer to v2, then prepares the ground for users actually being able to use the v2 variant. Finally it converts all users to do so and then decommissions the v1 variant. For users to be able to use the v2 writer API, we first have to add a v2 output to the compactor, as some users write to sstables via the compactor. Tests: unit(dev, release) " * 'sstable-writer-v2/v2' of https://github.com/denesb/scylla: sstables/sstable: remove now unused v1 write_components() variant mutation_compactor: remove now unused compact_for_compaction test/boost/mutation_test: migrate to compact_for_mutation_v2 streaming: migrate to v2 variant of sstable writer API memtable-sstable: migrate to v2 variant of sstable writer API test: migrate to the v2 variant of the sstable writer API sstables/sstable: expose v2 variant of write_components() sstables: convert mx writer to v2 sstables/metadata_collector: use position_in_partition for min/max keys test/boost/mutation_test: test_compactor_range_tombstone_spanning_many_pages extend to check v2 output too mutation_reader: convert compacting reader v2 mutation_compactor: add v2 output mutation_compactor: make _last_clustering_pos track last input range_tombstone_change: add set_tombstone() test/lib/mutation_source_test: log name of each run_mutation_source()	2022-03-10 09:45:57 +02:00
Botond Dénes	2e0610e459	sstables/sstable: remove now unused v1 write_components() variant Supplanted by the v2 variant.	2022-03-10 09:16:33 +02:00
Botond Dénes	4e97477281	mutation_compactor: remove now unused compact_for_compaction	2022-03-10 09:16:33 +02:00
Botond Dénes	32e9809e9c	test/boost/mutation_test: migrate to compact_for_mutation_v2	2022-03-10 09:16:33 +02:00
Botond Dénes	06e6bb6ec9	streaming: migrate to v2 variant of sstable writer API	2022-03-10 09:16:33 +02:00
Botond Dénes	d8fec08468	memtable-sstable: migrate to v2 variant of sstable writer API	2022-03-10 09:16:33 +02:00
Botond Dénes	959483a2dc	test: migrate to the v2 variant of the sstable writer API	2022-03-10 09:16:33 +02:00
Benny Halevy	37694422dc	compaction_manager: register_metrics: expose postponed_compactions Provide a metric counting the number of tables with postponed compaction. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-03-10 08:39:18 +02:00
Benny Halevy	089d4442d8	compaction_manager: register_metrics: expose failed_compactions Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-03-10 08:39:18 +02:00
Benny Halevy	8081f951d0	compaction_manager: register_metrics: expose _stats.completed_tasks Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-03-10 08:39:18 +02:00
Benny Halevy	ffc314d506	compaction: add documentation for compaction_type to string conversions Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-03-10 08:39:18 +02:00
Benny Halevy	28a74a2e90	compaction: expose to_string(compaction_type) To be used in the next patch to generate a string description from the compaction_type. In theory, we could use compaction_name() but the latter returns the compaction type in all-upper case and that is very different from what we print to the log today. The all-upper strings are used for the api layer, e.g. to stop tasks of a particular compaction type. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-03-10 08:39:18 +02:00
Benny Halevy	20a8609392	compaction_manager: task: standardize task description in log messages Define task::describe and use it via operator<< to print the task metadata to the log in a standard way. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-03-10 08:39:18 +02:00
Benny Halevy	59863b317f	compaction_manager: refactor can_proceed Move the task-internal parts of can_proceed to a respective compaction_manager::task method, preparing for turning it into a class with a proper hierarchy of access to private members. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-03-10 08:39:17 +02:00


30501 Commits