scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-06-07 07:23:15 +00:00

Author	SHA1	Message	Date
Avi Kivity	86bbf1763d	Merge "reader concurrency semaphore: dump permit diagnostics on timeout or queue overflow" from Botond " The reader concurrency semaphore timing out or its queue overflowing are fairly common events both in production and in testing. At the same time, it is a hard-to-diagnose problem that often has a benign cause (especially during testing), but it is equally possible that it points to something serious. So when this error starts to appear in logs, we usually want to investigate, and the investigation is lengthy: it involves looking at metrics, coredumps, or both. This patch intends to jumpstart this process by dumping diagnostics on semaphore timeout or queue overflow. The diagnostics are printed to the log at debug level to avoid excessive spamming. They contain a histogram of all the permits associated with the problematic semaphore, organized by table, operation and state. Example: DEBUG 2020-10-08 17:05:26,115 [shard 0] reader_concurrency_semaphore - Semaphore _read_concurrency_sem: timed out, dumping permit diagnostics: Permits with state admitted, sorted by memory memory count name 3499M 27 ks.test:data-query 3499M 27 total Permits with state waiting, sorted by count count memory name 1 0B ks.test:drain 7650 0B ks.test:data-query 7651 0B total Permits with state registered, sorted by count count memory name 0 0B total Total: permits: 7678, memory: 3499M This allows determining several things at a glance: * What are the tables involved * What are the operations involved * Where is the memory This can speed up a follow-up investigation greatly, or it can even be enough on its own to determine that the issue is benign.
Tests: unit(dev, debug) " * 'dump-diagnostics-on-semaphore-timeout/v2' of https://github.com/denesb/scylla: reader_concurrency_semaphore: dump permit diagnostics on timeout or queue overflow utils: add to_hr_size() reader_concurrency_semaphore: link permits into an intrusive list reader_concurrency_semaphore: move expiry_handler::operator()() out-of-line reader_concurrency_semaphore: move constructors out-of-line reader_concurrency_semaphore: add state to permits reader_concurrency_semaphore: name permits querier_cache_test: test_immediate_evict_on_insert: use two permits multishard_combining_reader: reader_lifecycle_policy: add permit param to create_reader() multishard_combining_reader: add permit parameter multishard_combining_reader: shard_reader: use multishard reader's permit	2020-10-13 12:44:23 +03:00
Botond Dénes	ff623e70b3	reader_concurrency_semaphore: name permits Require a schema and an operation name to be given to each permit when created. The schema is that of the table the read is executed against, and the operation name identifies the operation the permit is part of. Ideally this name should be different for each site the permit is created at, to be able to discern not only different kinds of reads, but also the different code paths the read took. As not all reads can be associated with one schema, the schema is allowed to be null. The name will be used for debugging purposes, both for coredump debugging and runtime logging of permit-related diagnostics.	2020-10-13 12:32:13 +03:00
Botond Dénes	40c5474022	querier_cache_test: test_immediate_evict_on_insert: use two permits The test currently uses a single permit shared between two simulated reads (to wait admission twice). This is not a supported way of using a permit and will stop working soon as we make the states the permit is in more pronounced.	2020-10-12 15:56:56 +03:00
Botond Dénes	307cdf1e0d	multishard_combining_reader: reader_lifecycle_policy: add permit param to create_reader() Allow the evictable reader managing the underlying reader to pass its own permit to it when creating it, making sure they share the same permit. Note that the two parts can still end up using different permits, when the underlying reader is kept alive between two pages of a paged read and thus keeps using the permit received on the previous page. Also adjust the `reader_context` in multishard_mutation_query.cc to use the passed-in permit instead of creating a new one when creating a new reader.	2020-10-12 15:56:56 +03:00
Botond Dénes	e09ab09fff	multishard_combining_reader: add permit parameter Don't create an own permit, take one as a parameter, like all other readers do, so the permit can be provided by the higher layer, making sure all parts of the logical read use the same permit.	2020-10-12 15:56:56 +03:00
Gleb Natapov	9d7c81c1b8	raft: fix boost/raft_fsm_test compilation Message-Id: <20201011063802.GA2628121@scylladb.com>	2020-10-12 12:09:21 +02:00
Nadav Har'El	977da3567f	Merge 'Alternator streams: Fix shard lengths, parenting, expiration, filter useless ones and improve paging' from Calle Wilund The remains of the defunct #7246. Fixes #7344 Fixes #7345 Fixes #7346 Fixes #7347 Shard ID length is now within limits. Shard end sequence number should be set when appropriate. Shard parent is selected a bit more carefully (sorting) Shards are filtered by time to exclude cdc generations we cannot get data from (too old) Shard paging improved Closes #7348 * github.com:scylladb/scylla: test_streams: Add some more sanity asserts alternator::streams: Set dynamodb data TTL explicitly in cdc options alternator::streams: Improve paging and fix parent-child calculation alternator::streams: Remove table from shard_id alternator::streams: Filter our cdc streams older than data/table alternator::error: Add a few dynamo exception types	2020-10-12 09:43:12 +03:00
Avi Kivity	4d6739c2e6	Merge "Use max_concurrent_for_each" from Benny " max_concurrent_for_each was added to seastar for replacing sstable_directory::parallel_for_each_restricted by using more efficient concurrency control that doesn't create an unlimited number of continuations. The series replaces the use of sstable_directory::parallel_for_each_restricted with max_concurrent_for_each and exposes sstable_directory::do_for_each_sstable via a static method. This method is used here by table::snapshot to limit the concurrency of snapshot operations, which suffer from the same unbounded concurrency problem that sstable_directory solved. In addition, sstable_directory::_load_semaphore, which was used across calls to do_for_each_sstable, was replaced by a static per-shard semaphore that caps concurrency across all calls to `do_for_each_sstable` on that shard. This makes sense since the disk is a shared resource. In the future, we may want to have a load semaphore per device rather than a single global one. We should experiment with that. Test: unit(dev) " * tag 'max_concurrent_for_each-v5' of github.com:bhalevy/scylla: table: snapshot: use max_concurrent_for_each sstable_directory: use a external load_semaphore test: sstable_directory_test: extract sstable_directory creation into with_sstable_directory distributed_loader: process_upload_dir: use initial_sstable_loading_concurrency sstables: sstable_directory: use max_concurrent_for_each	2020-10-12 09:43:12 +03:00
Avi Kivity	610fa83f28	test: database_test: fix threading confusion database_test contains several instances of calling do_with_cql_test_env() with a function that expects to be called in a thread. This mostly works because there is an internal thread in do_with_cql_test_env(), but is not guaranteed to. Fix by switching to the more appropriate do_with_cql_test_env_thread(). Closes #7333	2020-10-11 17:44:30 +03:00
Avi Kivity	58e02c216a	test: sstable_datafile_test: sstable_run_based_compaction_test: prevent use of uninitialized variable observer The variable 'observer' (an std::optional) may be left uninitialized if 'incremental_enabled' is false. However, it is used afterwards with a call to disconnect, accessing garbage. Fix by accessing it via the optional wrapper. A call to optional::reset() destroys the observable, which in turn calls disconnect(). Closes #7380	2020-10-11 17:36:08 +03:00
Avi Kivity	15ab6a3feb	test: cql_repl: use boost::regex instead of std::regex to avoid stack overflow libstdc++'s std::regex uses recursion[1], with a depth controlled by the input. Together with clang's debug mode, this overflows the stack. Use boost::regex instead, which is immune to the problem. [1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86164 Closes #7378	2020-10-11 17:12:21 +03:00
Avi Kivity	882ed2017a	test: network_topology_strategy_test: fix overflow in d2t() d2t() scales a fraction in the range [0, 1] to the range of a biased token (same as unsigned long). But x86 doesn't support conversion to unsigned, only signed, so this is a truncating conversion. Clang's ubsan correctly warns about it. Fix by reducing the range before converting, and expanding it afterwards. Closes #7376	2020-10-11 16:05:02 +03:00
Alejo Sanchez	5d408082b6	raft: log failed test case name Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>	2020-10-09 15:50:47 +02:00
Alejo Sanchez	664b3eddb1	raft: test add hasher So far, the values seen by nodes were summed, but this does not guarantee that the order of these values was respected. Use a digest to check the output, implicitly checking the order. On the other hand, a sum or a simple positional checksum like Fletcher's is easier to debug, as the rolling sum is evident. Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>	2020-10-09 15:50:42 +02:00
Alejo Sanchez	670824c6fa	raft: declarative tests To make writing Raft tests more convenient, use declarative structures. Servers are set up and initialized and then updates are processed. For now, updates are just adding entries to the leader and changing the leader. Updates and leader changes can be specified to run after the initial test setup. Below is an example test for 3 nodes, with node 0 starting as leader having two entries 0 and 1 for term 1, and with current term 2, then adding 12 entries, changing the leader to node 1, and adding 12 more entries. The test will automatically add more entries to the last leader until the test limit of total_values (default 100) is reached. {.name = "test_name", .nodes = 3, .initial_term = 2, .initial_states = {{.le = {{1,0},{1,1}}}}, .updates = {entries{12},new_leader{1},entries{12}},}, The leader is isolated before the change via is_leader returning false. The initial leader (default server 0) will be set with this method, too. Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>	2020-10-09 15:50:31 +02:00
Alejo Sanchez	7d4b33d834	raft: test make app return proper exit int value Seastar app returns int result exit value. Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>	2020-10-09 15:50:24 +02:00
Alejo Sanchez	093bc8fbb3	raft: test add support for disconnected server Failure detector support of disconnected servers with a global set of addresses. Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>	2020-10-09 15:50:02 +02:00
Alejo Sanchez	21d7686766	raft: tests use custom server ids for easier debugging Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>	2020-10-09 15:49:57 +02:00
Alejo Sanchez	56683ae689	raft: test remove unnecessary header Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>	2020-10-09 15:49:45 +02:00
Alejo Sanchez	1bff357816	raft: fix typo snaphot snapshot Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>	2020-10-09 15:49:39 +02:00
Benny Halevy	57cc5f6ae1	sstable_directory: use a external load_semaphore Although each sstable_directory limits concurrency using max_concurrent_for_each, there could be a large number of calls to do_for_each_sstable running in parallel (e.g per keyspace X per table in the distributed_loader). To cap parallelism across sstable_directory instances and concurrent calls to do_for_each_sstable, start a sharded<semaphore> and pass a shared semaphore& to the sstable_directory:s. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2020-10-08 11:57:06 +03:00
Benny Halevy	dc46aaa3fd	test: sstable_directory_test: extract sstable_directory creation into with_sstable_directory Use common code to create, start, and stop the sharded<sstable_directory> for each test. This will be used in the next patch for creating a sharded semaphore and passing it to the sstable_directory. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2020-10-08 11:57:06 +03:00
Gleb Natapov	0bff15a976	raft: Send multiple entries in one append_entry rpc Send more than one entry in a single append_entry message, but limit the size of one packet according to the append_request_threshold parameter. Message-Id: <20201007142602.GA2496906@scylladb.com>	2020-10-07 16:43:33 +02:00
Calle Wilund	349c5ee21a	test_streams: Add some more sanity asserts Checking validity of returned shard sets etc.	2020-10-07 08:43:39 +00:00
Calle Wilund	3cdd7fe191	alternator::streams: Remove table from shard_id Fixes #7344 The table is not really needed there, as shard_id:s are not required to be unique across streams, and removing it also helps with the length limit on the shard_id text representation. As a side effect, the shard iterator carries the stream ARN instead.	2020-10-07 08:43:39 +00:00
Avi Kivity	c6a3fa5a49	Merge "querier_cache: use the querier's permit for memory accounting" from Botond " The querier cache has a memory based eviction mechanism, which starts evicting freshly inserted queriers once their collective memory consumption goes above the configured limit. For determining the memory consumption of individual queriers, the querier cache uses `flat_mutation_reader::buffer_size()`. But we now have a much more comprehensive accounting of the memory used by queriers: the reader permit, which also happens to be available in each querier. So use this to determine the querier's memory consumption instead. Tests: unit(dev) " * 'querier-cache-use-permit-for-memory-accounting/v1' of https://github.com/denesb/scylla: flat_mutation_reader: de-virtualize buffer_size() querier_cache: use the reader permit for memory accounting querier_cache_test: use local semaphore not the test global one reader_permit: add consumed_resources() accessor	2020-10-06 16:52:44 +03:00
Tomasz Grabiec	46b7ba8809	Merge "Bring memory footprint test back to work" from Pavel Emelyanov The test was broken by recent sstables manager rework. In the middle the sstables::test_env is destroyed without being closed which leads to broken _closing assertion inside ~sstables_manager(). Fix is to use the test_env::do_with helper. tests: perf.memory_footprint * https://github.com/xemul/scylla/tree/br-memory-footprint-test-fix: test/perf/memory_footprint: Fix indentation after previous patch test/perf/memory_footprint: Don't forget to close sstables::test_env after usage	2020-10-06 11:49:03 +02:00
Pavel Emelyanov	8bceb916ea	test/perf/memory_footprint: Fix indentation after previous patch Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2020-10-06 11:08:09 +03:00
Pavel Emelyanov	3e4de0f748	test/perf/memory_footprint: Don't forget to close sstables::test_env after usage After the recent sstables manager rework, the sstables::test_env must be .close()d after usage, otherwise ~sstables_manager() hits the _closing assertion. Do it with the help of .do_with(). The execution context is already seastar::async in this place, so .get() it explicitly. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2020-10-06 11:06:35 +03:00
Botond Dénes	dd372c8457	flat_mutation_reader: de-virtualize buffer_size() The main user of this method, the one which required this method to return the collective buffer size of the entire reader tree, is now gone. The remaining two users just use it to check the size of the reader instance they are working with. So de-virtualize this method and reduce its responsibility to just returning the buffer size of the current reader instance.	2020-10-06 08:22:56 +03:00
Botond Dénes	f7eea06f61	querier_cache_test: use local semaphore not the test global one In the mutation source, which creates the reader for this test, the global test semaphore's permit was passed to the created reader (`tests::make_permit()`). This caused reader resources to be accounted on the global test semaphore, instead of the local one the test creates. Just forward the permit passed to the mutation sources to the reader to fix this.	2020-10-06 08:22:56 +03:00
Nadav Har'El	421f0c729d	merge: counters: Avoid signed integer overflow Merged patch series by Tomasz Grabiec: UBSAN complains in debug mode when the counter value overflows: counters.hh:184:16: runtime error: signed integer overflow: 1 + 9223372036854775807 cannot be represented in type 'long int' Aborting on shard 0. Overflow is supposed to be supported. Let's silence it by using casts. Fixes #7330. Tests: - build/debug/test/tools/cql_repl --input test/cql/counters_test.cql Tomasz Grabiec (2): counters: Avoid signed integer overflow test: cql: counters: Add tests reproducing signed integer overflow in debug mode counters.hh \| 2 +- test/cql/counters_test.cql \| 9 ++++++++ test/cql/counters_test.result \| 48 +++++++++++++++++++++++++++++++++++++++++++ 3 files changed, 58 insertions(+), 1 deletion(-)	2020-10-05 21:43:19 +03:00
Tomasz Grabiec	f01ffe063a	test: cql: counters: Add tests reproducing signed integer overflow in debug mode Reproduces #7330	2020-10-05 20:06:34 +02:00
Alejo Sanchez	bb67d15e2f	Raft: disable boost tests for now Disable raft fsm boost tests until raft is part of build. Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>	2020-10-02 14:03:01 +02:00
Alejo Sanchez	4e26dad3a0	Raft: Remove tests for now Remove raft C++ tests until raft is included in build process. Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>	2020-10-02 12:26:05 +02:00
Tomasz Grabiec	ca7f0c61f0	Merge "raft: initial implementation" from Gleb This is the beginning of the raft protocol implementation. It only supports log replication and the voter state machine. The main difference between this one and the RFC (besides having the voter state machine) is that the approach taken here is to implement raft as a deterministic state machine and move all the IO processing away from the main logic. To do that, some changes to the RPC interface were required: all verbs are now one-way, meaning that sending a request does not wait for a reply and the reply arrives as a separate message (or not at all; it is safe to drop packets). * scylla-dev/raft-v4: raft: add a short readme file raft: compile raft tests raft: add raft tests raft: Implement log replication and leader election raft: Introduce raft interface header	2020-10-01 17:09:52 +02:00
Gleb Natapov	4959609589	raft: add raft tests Add tests for the currently implemented raft features. replication_test tests replication functionality with various initial log configurations. raft_fsm_test tests voting state machine functionality.	2020-10-01 14:30:59 +03:00
Etienne Adam	98dc0dc03a	redis: only create required keyspaces/tables The 'redis_database_count' option already existed, but was not used when initializing the keyspaces. This patch merely uses it. I think it's better that way; it seems cleaner not to create 15 x 5 tables when we use only one redis database. Also change a test to test with a higher max number of databases. Signed-off-by: Etienne Adam <etienne.adam@gmail.com> Message-Id: <20200930210256.4439-1-etienne.adam@gmail.com>	2020-10-01 10:27:03 +03:00
Avi Kivity	fd1dd0eac7	Merge "Track the memory consumption of reader buffers" from Botond " The last major untracked area of the reader pipeline is the reader buffers. These scale with the number of readers as well as with the size and shape of data, so their memory consumption is unpredictable and varies wildly. For example, many small rows will trigger larger buffers allocated within the `circular_buffer<mutation_fragment>`, while a few larger rows will consume a lot of external memory. This series covers this area by tracking the memory consumption of both the buffer and its content. This is achieved by passing a tracking allocator to `circular_buffer<mutation_fragment>` so that each allocation it makes is tracked. Additionally, we now track the memory consumption of each and every mutation fragment through its whole lifetime. Initially I contemplated just tracking the `_buffer_size` of `flat_mutation_reader::impl`, but concluded that as our reader trees are typically quite deep, this would result in a lot of unnecessary `signal()`/`consume()` calls, whose number scales with the number of mutation fragments and hence adds to the already considerable per-mutation-fragment overhead. The solution chosen in this series is to instead track the memory consumption of the individual mutation fragments, with the observation that these are typically moved and very rarely copied, so the number of `signal()`/`consume()` calls will be minimal. This additional tracking introduces an interesting dilemma however: readers will now have significant memory on their account even before being admitted. So it may happen that they can prevent their own admission via this memory consumption. To prevent this, memory consumption is only forwarded to the semaphore upon admission. This might be solved when the semaphore is moved to the front -- before the cache. 
Another consequence of this additional, more complete tracking is that evictable readers now consume memory even when the underlying reader is evicted. So it may happen that even though no reader is currently admitted, all memory is consumed from the semaphore. To prevent any such deadlocks, the semaphore now admits a reader unconditionally if no reader is admitted -- that is, if all count resources are available. Refs: #4176 Tests: unit(dev, debug, release) " * 'track-reader-buffers/v2' of https://github.com/denesb/scylla: (37 commits) test/manual/sstable_scan_footprint_test: run test body in statement sched group test/manual/sstable_scan_footprint_test: move test main code into separate function test/manual/sstable_scan_footprint_test: sprinkle some thread::maybe_yield():s test/manual/sstable_scan_footprint_test: make clustering row size configurable test/manual/sstable_scan_footprint_test: document sstable related command line arguments mutation_fragment_test: add exception safety test for mutation_fragment::mutate_as_*() test: simple_schema: add make_static_row() reader_permit: reader_resources: add operator== mutation_fragment: memory_usage(): remove unused schema parameter mutation_fragment: track memory usage through the reader_permit reader_permit: resource_units: add permit() and resources() accessors mutation_fragment: add schema and permit partition_snapshot_row_cursor: row(): return clustering_row instead of mutation_fragment mutation_fragment: remove as_mutable_end_of_partition() mutation_fragment: s/as_mutable_partition_start/mutate_as_partition_start/ mutation_fragment: s/as_mutable_range_tombstone/mutate_as_range_tombstone/ mutation_fragment: s/as_mutable_clustering_row/mutate_as_clustering_row/ mutation_fragment: s/as_mutable_static_row/mutation_as_static_row/ flat_mutation_reader: make _buffer a tracked buffer mutation_reader: extract the two fill_buffer_result into a single one ...	2020-09-29 16:08:16 +03:00
Piotr Sarna	9e5ce5a93c	counters: remove unused 1.7.4 counter order code After cleaning up old cluster features (`253a7640e3`) the code for special handling of 1.7.4 counter order was effectively only used in its own tests, so it can be safely removed. Closes #7289	2020-09-29 12:16:58 +03:00
Botond Dénes	2ee026f26f	test/manual/sstable_scan_footprint_test: run test body in statement sched group So that queries are processed in said scheduling group and thus they use the user read concurrency semaphore.	2020-09-28 11:27:49 +03:00
Botond Dénes	272a54b81c	test/manual/sstable_scan_footprint_test: move test main code into separate function	2020-09-28 11:27:49 +03:00
Botond Dénes	29861b068e	test/manual/sstable_scan_footprint_test: sprinkle some thread::maybe_yield():s To avoid stalls.	2020-09-28 11:27:49 +03:00
Botond Dénes	daa9fa72f1	test/manual/sstable_scan_footprint_test: make clustering row size configurable So that large-row workloads can be simulated too.	2020-09-28 11:27:49 +03:00
Botond Dénes	2ff326a41a	test/manual/sstable_scan_footprint_test: document sstable related command line arguments	2020-09-28 11:27:49 +03:00
Botond Dénes	ceb308411c	mutation_fragment_test: add exception safety test for mutation_fragment::mutate_as_*()	2020-09-28 11:27:49 +03:00
Botond Dénes	ceb0b02ee8	test: simple_schema: add make_static_row()	2020-09-28 11:27:49 +03:00
Botond Dénes	256140a033	mutation_fragment: memory_usage(): remove unused schema parameter The memory usage is now maintained and updated on each change to the mutation fragment, so it needs not be recalculated on a call to `memory_usage()`, hence the schema parameter is unused and can be removed.	2020-09-28 11:27:47 +03:00
Botond Dénes	6ca0464af5	mutation_fragment: add schema and permit We want to start tracking the memory consumption of mutation fragments. For this we need the schema and permit during construction, and on each modification, so the memory consumption can be recalculated and passed to the permit. In this patch we just add the new parameters and go through the insane churn of updating all call sites. They will be used in the next patch.	2020-09-28 11:27:23 +03:00
Botond Dénes	4f5ccf82cb	mutation_fragment: s/as_mutable_clustering_row/mutate_as_clustering_row/ We will soon want to update the memory consumption of mutation fragment after each modification done to it, to do that safely we have to forbid direct access to the underlying data and instead have callers pass a lambda doing their modifications. Uses where this method was just used to move the fragment away are converted to use `as_clustering_row() &&`.	2020-09-28 10:53:56 +03:00
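
The lambda-based mutation pattern described in the last entry above (forbid direct mutable access, have callers pass a lambda, then refresh memory accounting) can be illustrated with a minimal sketch. This is not the ScyllaDB implementation: `toy_permit`, `toy_fragment` and `mutate()` are hypothetical names invented for illustration, and the memory estimate is deliberately crude.

```cpp
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical stand-in for a reader permit: only a running byte counter.
struct toy_permit {
    std::size_t consumed = 0;
    void consume(std::size_t n) { consumed += n; }
    void signal(std::size_t n) { consumed -= n; }
};

// Hypothetical fragment that never hands out a mutable reference to its
// payload, so every modification has to go through mutate().
class toy_fragment {
    toy_permit& _permit;
    std::string _payload;
    std::size_t _accounted = 0;

    // Return the previously accounted bytes and account the current size.
    void update_accounting() {
        _permit.signal(_accounted);
        _accounted = sizeof(*this) + _payload.size();
        _permit.consume(_accounted);
    }

public:
    toy_fragment(toy_permit& permit, std::string payload)
        : _permit(permit), _payload(std::move(payload)) {
        update_accounting();
    }
    toy_fragment(const toy_fragment&) = delete;
    toy_fragment& operator=(const toy_fragment&) = delete;
    ~toy_fragment() { _permit.signal(_accounted); }

    // Read-only access is always safe for the accounting.
    const std::string& payload() const { return _payload; }
    std::size_t memory_usage() const { return _accounted; }

    // Callers pass a lambda; accounting is refreshed after it returns.
    // Note: this sketch does not refresh accounting if fn throws.
    template <typename Func>
    void mutate(Func&& fn) {
        std::forward<Func>(fn)(_payload);
        update_accounting();
    }
};
```

A caller would write something like `f.mutate([](std::string& s) { s += "more"; });`, after which the permit's counter reflects the new size without the caller touching the accounting.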

889 Commits