scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-04-19 16:15:07 +00:00

Author	SHA1	Message	Date
Botond Dénes	dadc0c32e1	reader_concurrency_semaphore: execution_loop(): move maybe_admit_waiters() to the inner loop Now that the CPU concurrency limit is configurable, new reads might be ready to execute right after the current one was executed. So move the poll for admitting new reads into the inner loop, to prevent the situation where the inner loop yields and a concurrent do_wait_admission() finds that there are waiters (queued because at the time they arrived at the semaphore, the _ready_list was not empty) but it is possible to admit a new read. When this happens the semaphore will dump diagnostics to help debug the apparent contradiction, which can generate a lot of log spam. Moving the poll into the inner loop prevents the false-positive contradiction detection from firing. Refs: scylladb/scylladb#19017 Closes scylladb/scylladb#19600 (cherry picked from commit `155acbb306`)	2024-07-08 08:13:40 +03:00
Botond Dénes	abc4a9b635	reader_concurrency_semaphore: wire in the configurable cpu concurrency Before this patch, the semaphore was hard-wired to stop admission if there was even a single permit in the need_cpu state, thereby keeping the CPU concurrency at 1. This patch makes use of the new cpu_concurrency parameter, which was wired in by the previous patches, allowing for a configurable amount of concurrent need_cpu permits. This is to address workloads where some small subset of reads are expected to be slow, and can hold up faster reads behind them in the semaphore queue. (cherry picked from commit `07c0a8a6f8`)	2024-07-08 08:12:34 +03:00
Botond Dénes	052cef2621	reader_concurrency_semaphore: add cpu_concurrency constructor parameter In the case of the user semaphore, this receives the new reader_concurrency_semaphore_cpu_limit config item. Not used yet. (cherry picked from commit `59faa6d4ff`)	2024-07-08 08:12:20 +03:00
Botond Dénes	3c813fbb99	reader_concurrency_semaphore: add range param to evict_inactive_reads_for_table() When the new optional parameter has a value, evict only inactive reads whose ranges overlap with the provided range. The range for the inactive read is provided in `register_inactive_read()`. If the inactive read has no range, overlap is assumed and the read is evicted. This will be used to evict all inactive reads that could potentially use a cleaned-up tablet.	2024-04-30 01:31:08 -04:00
Botond Dénes	9e7a957ffb	reader_concurrency_semaphore: allow storing a range with the inactive reader This allows specifying the range the inactive read is reading from. To be used in the next patch to selectively evict inactive reads whose range overlaps with a certain (tablet) range.	2024-04-30 01:31:08 -04:00
Botond Dénes	67684308d1	reader_concurrency_semaphore: avoid detach() in inactive_read_handle::abandon() inactive_read_handle::abandon() evicts and destroys the inactive read, so it is not left behind. Currently, while doing so, it triggers the inactive_read's own version of abandon(): detach(). The two interact badly when the inactive_read_handle stores the last permit instance, causing a (so far benign) use-after-free. Prevent triggering detach() to avoid this bad interaction altogether.	2024-04-30 01:31:08 -04:00
Kefu Chai	168ade72f8	treewide: replace formatter<std::string_view> with formatter<string_view> in {fmt} before v10, it provides the specialization of `fmt::formatter<..>` for `std::string_view` as well as the specialization of `fmt::formatter<..>` for `fmt::string_view` which is an implementation builtin in {fmt} for compatibility of pre-C++17. and this type is used even if the code is compiled with a C++ standard greater than or equal to C++17. also, before v10, the `fmt::formatter<std::string_view>::format()` is defined so it accepts `std::string_view`. after v10, `fmt::formatter<std::string_view>` still exists, but it is now defined using `format_as()` machinery, so its `format()` method does not actually accept `std::string_view`, it accepts `fmt::string_view`, as the former can be converted to `fmt::string_view`. this is why we can inherit from `fmt::formatter<std::string_view>` and use `formatter<std::string_view>::format(foo, ctx);` to implement the `format()` method with {fmt} v9, but we cannot do this with {fmt} v10, and we would have the following compilation failure: ``` FAILED: service/CMakeFiles/service.dir/RelWithDebInfo/topology_state_machine.cc.o /home/kefu/.local/bin/clang++ -DFMT_DEPRECATED_OSTREAM -DFMT_SHARED -DSCYLLA_BUILD_MODE=release -DSEASTAR_API_LEVEL=7 -DSEASTAR_LOGGER_COMPILE_TIME_FMT -DSEASTAR_LOGGER_TYPE_STDOUT -DSEASTAR_SCHEDULING_GROUPS_COUNT=16 -DSEASTAR_SSTRING -DXXH_PRIVATE_API -DCMAKE_INTDIR=\"RelWithDebInfo\" -I/home/kefu/dev/scylladb -I/home/kefu/dev/scylladb/build/gen -I/home/kefu/dev/scylladb/seastar/include -I/home/kefu/dev/scylladb/build/seastar/gen/include -I/home/kefu/dev/scylladb/build/seastar/gen/src -ffunction-sections -fdata-sections -O3 -g -gz -std=gnu++20 -fvisibility=hidden -Wall -Werror -Wextra -Wno-error=deprecated-declarations -Wimplicit-fallthrough -Wno-c++11-narrowing -Wno-deprecated-copy -Wno-mismatched-tags -Wno-missing-field-initializers -Wno-overloaded-virtual -Wno-unsupported-friend 
-Wno-enum-constexpr-conversion -Wno-unused-parameter -ffile-prefix-map=/home/kefu/dev/scylladb=. -march=westmere -mllvm -inline-threshold=2500 -fno-slp-vectorize -U_FORTIFY_SOURCE -Werror=unused-result -MD -MT service/CMakeFiles/service.dir/RelWithDebInfo/topology_state_machine.cc.o -MF service/CMakeFiles/service.dir/RelWithDebInfo/topology_state_machine.cc.o.d -o service/CMakeFiles/service.dir/RelWithDebInfo/topology_state_machine.cc.o -c /home/kefu/dev/scylladb/service/topology_state_machine.cc /home/kefu/dev/scylladb/service/topology_state_machine.cc:254:41: error: no matching member function for call to 'format' 254 \| return formatter<std::string_view>::format(it->second, ctx); \| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~ /usr/include/fmt/core.h:2759:22: note: candidate function template not viable: no known conversion from 'seastar::basic_sstring<char, unsigned int, 15>' to 'const fmt::basic_string_view<char>' for 1st argument 2759 \| FMT_CONSTEXPR auto format(const T& val, FormatContext& ctx) const \| ^ ~~~~~~~~~~~~ ``` because the inherited `format()` method actually comes from `fmt::formatter<fmt::string_view>`. to reduce the confusion, in this change, we just inherit from `fmt::format<string_view>`, where `string_view` is actually `fmt::string_view`. this follows the document at https://fmt.dev/latest/api.html#formatting-user-defined-types, and since there is less indirection under the hood -- we do not use the specialization created by `FMT_FORMAT_AS` which inherit from `formatter<fmt::string_view>`, hopefully this can improve the compilation speed a little bit. also, this change addresses the build failure with {fmt} v10. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#18299	2024-04-19 07:44:07 +03:00
Botond Dénes	c6cff53771	reader_concurrency_semaphore: use variable reference for metrics Instead of a functor, for those metrics that just return the value of an existing member variable. This is ever so slightly more efficient than a functor. Closes scylladb/scylladb#17726	2024-03-11 20:47:04 +02:00
Kefu Chai	38ae52d5cd	add fmt::formatter for reader_permit::state and reader_resources before this change, we rely on the default-generated fmt::formatter created from operator<<, but fmt v10 dropped the default-generated formatter. in this change, we define formatters for * reader_permit::state * reader_resources Refs #13245 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#17707	2024-03-11 09:55:51 +02:00
Patryk Wrobel	1b6ab65c51	reader_concurrency_semaphore.cc: move stringstream content instead of copying it C++20 introduced a new overload of std::stringstream::str() that is selected when the mentioned member function is called on r-value. The new overload returns a string, that is move-constructed from the underlying string instead of being copy-constructed. This change applies std::move() on stringstream objects before calling str() member function to avoid copying of the underlying buffer. Signed-off-by: Patryk Wrobel <patryk.wrobel@scylladb.com> Closes scylladb/scylladb#17064	2024-01-31 09:31:50 +02:00
Michał Jadwiszczak	49544c47a1	reader_concurrency_semaphore: add name of semaphore in tracing messages	2024-01-23 10:25:34 +01:00
Lakshmi Narayanan Sreethar	76f0d5e35b	reader_permit: store schema_ptr instead of raw schema pointer Store schema_ptr in reader permit instead of storing a const pointer to schema to ensure that the schema doesn't get changed elsewhere when the permit is holding on to it. Also update the constructors and all the relevant callers to pass down schema_ptr instead of a raw pointer. Fixes #16180 Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com> Closes scylladb/scylladb#16658	2024-01-11 08:37:56 +02:00
Avi Kivity	7fce057cda	database, reader_concurrency_semaphore: deduplicate reader_concurrency_semaphore metrics reader_concurrency_semaphore metrics are triplicated: each metric is registered for streaming, user, and system classes. To fix, just move the metrics registration from database to reader_concurrency_semaphore, so each reader_concurrency_semaphore instantiated will register its metrics (if its creator asked for it). Adjust the names given to reader_concurrency_semaphore so we don't change the labels. scylla-gdb is adjusted to support the new names.	2023-12-13 09:16:18 -05:00
Botond Dénes	e1b30f50be	reader_concurrency_semaphore: add register_metrics constructor parameter To be used in the next patch to control whether the semaphore registers and exports metrics or not. We want to move metric registration to the semaphore but we don't want all semaphores to export metrics. The decision on whether a semaphore should or shouldn't export metrics should be made on a case-by-case basis so this new parameter has no default value (except for the for_tests constructor).	2023-12-13 06:25:45 -05:00
Botond Dénes	6829eaad39	reader_concurrency_semaphore: use utils::memory_limit_reached exception When the kill limit is triggered.	2023-09-27 10:27:32 -04:00
Raphael S. Carvalho	914cbc11cf	reader_concurrency_semaphore: Fix stop() in face of evictable reads becoming inactive Scylla can crash due to a complicated interaction of service level drop, evictable readers, inactive read registration path. 1) service level drop invoke stop of reader concurrency semaphore, which will wait for in flight requests 2) turns out it stops first the gate used for closing readers that will become inactive. 3) proceeds to wait for in-flight reads by closing the reader permit gate. 4) one of evictable reads take the inactive read registration path, and finds the gate for closing readers closed. 5) flat mutation reader is destroyed, but finds the underlying reader was not closed gracefully and triggers the abort. By closing permit gate first, evictable readers becoming inactive will be able to properly close underlying reader, therefore avoiding the crash. Fixes #15534. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Closes scylladb/scylladb#15535	2023-09-25 08:55:50 +03:00
Michał Chojnowski	2000a09859	reader_concurrency_semaphore: fix a deadlock between stop() and execution_loop() Permits added to `_ready_list` remain there until executed by `execution_loop()`. But `execution_loop()` exits when `_stopped == true`, even though nothing prevents new permits from being added to `_ready_list` after `stop()` sets `_stopped = true`. Thus, if there are reads concurrent with `stop()`, it's possible for a permit to be added to `_ready_list` after `execution_loop()` has already quit. Such a permit will never be destroyed, and `stop()` will forever block on `_permit_gate.close()`. A natural solution is to dismiss `execution_loop()` only after it's certain that `_ready_list` won't receive any new permits. This is guaranteed by `_permit_gate.close()`. After this call completes, it is certain that no permits exist. After this patch, `execution_loop()` no longer looks at `_stopped`. It only exits when `_ready_list_cv` breaks, and this is triggered by `stop()` right after `_permit_gate.close()`. Fixes #15198 Closes #15199	2023-08-29 08:18:49 +03:00
Kefu Chai	bab16eb30e	treewide: remove #includes not used directly for faster build times and clear inter-module dependencies, we should not #include headers not directly used. instead, we should only #include the headers directly used by a certain compilation unit. in this change, the source files under "/compaction" directories are checked using clangd, which identifies the cases where we have an #include which is not directly used. all the #includes identified by clangd are removed. because some source files rely on the incorrectly included header file, those ones are updated to #include the header file they directly use. if a forward declaration suffices, the declaration is added instead. see also https://clangd.llvm.org/guides/include-cleaner#unused-include-warning Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-07-18 17:36:31 +08:00
Botond Dénes	c4faa05888	reader_concurrency_semaphore: s/description/operation/ in diagnostics dumps "description" is not what the respective column contains, so fix the header.	2023-06-07 14:21:48 +03:00
Pavel Emelyanov	66e43912d6	code: Switch to seastar API level 7 In that level no io_priority_class-es exist. Instead, all the IO happens in the context of current sched-group. File API no longer accepts prio class argument (and makes io_intent arg mandatory to impls). So the change consists of - removing all usage of io_priority_class - patching file_impl's inheritants to updated API - priority manager goes away altogether - IO bandwidth update is performed on respective sched group - tune-up scylla-gdb.py io_queues command The first change is huge and was made semi-automatically by: - grep io_priority_class \| default_priority_class - remove all calls, found methods' args and class' fields Patching file_impl-s is smaller, but also mechanical: - replace io_priority_class& argument with io_intent* one - pass intent to lower file (if applicable) Dropping the priority manager is: - git-rm .cc and .hh - sed out all the #include-s - fix configure.py and cmakefile The scylla-gdb.py update is a bit hairy -- it needs to use the task queues list for IO class names and shares, but to detect whether it should, it checks whether the "commitlog" group is present. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes #13963	2023-06-06 13:29:16 +03:00
Avi Kivity	97694d26c4	Merge 'reader_permit: minor improvements to resource consume/release safety' from Botond Dénes This PR contains some small improvements to the safety of consuming/releasing resources to/from the semaphore: * reader_permit: make the low-level `consume()/signal()` API private, making the only user (an RAII class) friend. * reader_resources: split `reset()` into `noexcept` and potentially throwing variant. * reader_resources::reset_to(): try harder to avoid calling `consume()` (when the new resource amount is smaller than the previous one) Closes #13678 * github.com:scylladb/scylladb: reader_permit: resource_units::reset_to(): try harder to avoid calling consume() reader_permit: split resource_units::reset() reader_permit: make consume()/signal() API private	2023-05-14 14:14:23 +03:00
Botond Dénes	b790f14456	reader_concurrency_semaphore: execution_loop(): trigger admission check when _ready_list is empty The execution loop consumes permits from the _ready_list and executes them. The _ready_list usually contains a single permit. When the _ready_list is not empty, new permits are queued until it becomes empty. The execution loop relies on admission checks triggered by the read releasing resources, to bring in any queued read into the _ready_list, while it is executing the current read. But in some cases the current read might not free any resources and thus fail to trigger an admission check and the currently queued permits will sit in the queue until another source triggers an admission check. I don't yet know how this situation can occur, if at all, but it is reproducible with a simple unit test, so it is best to cover this corner-case in the off-chance it happens in the wild. Add an explicit admission check to the execution loop, after the _ready_list is exhausted, to make sure any waiters that can be admitted with an empty _ready_list are admitted immediately and execution continues. Fixes: #13540 Closes #13541	2023-05-08 17:11:41 +03:00
Botond Dénes	c1e8e86637	reader_concurrency_semaphore: reader_permit: clean-up after failed memory requests When requesting memory via `reader_permit::request_memory()`, the requested amount is added to `_requested_memory` member of the permit impl. This is because multiple concurrent requests may be blocked and waiting at the same time. When the requests are fulfilled, the entire amount is consumed and individual requests track their requested amount with `resource_units` to release later. There is a corner-case related to this: if a reader permit is registered as inactive while it is waiting for memory, its active requests are killed with `std::bad_alloc`, but the `_requested_memory` field is not cleared. If the read survives because the killed requests were part of a non-vital background read-ahead, a later memory request will also include the amount from the failed requests. This extra amount will not be released and hence will cause a resource leak when the permit is destroyed. Fix by detecting this corner case and clearing the `_requested_memory` field. Modify the existing unit test for the scenario of a permit waiting on memory being registered as inactive, to also cover this corner case, reproducing the bug. Fixes: #13539 Closes #13679	2023-05-07 14:06:51 +03:00
Avi Kivity	f125a3e315	Merge 'tree: finish the reader_permit state renames' from Botond Dénes In https://github.com/scylladb/scylladb/pull/13482 we renamed the reader permit states to more descriptive names. That PR, however, covered only the states themselves and their usages, as well as the documentation in `docs/dev`. This PR is a followup to said PR, completing the name changes: renaming all symbols, names, comments etc, so all is consistent and up-to-date. Closes #13573 * github.com:scylladb/scylladb: reader_concurrency_semaphore: misc updates w.r.t. recent permit state name changes reader_concurrency_semaphore: update permit members w.r.t. recent permit state name changes reader_concurrency_semaphore: update RAII state guard classes w.r.t. recent permit state name changes reader_concurrency_semaphore: update API w.r.t. recent permit state name changes reader_concurrency_semaphore: update stats w.r.t. recent permit state name changes	2023-05-04 18:29:04 +03:00
Kefu Chai	48387a5a9a	reader_concurrency_semaphore: fix signed/unsigned comparison a signed/unsigned comparison can overflow. and GCC-13 rightly points this out. so let's use `std::cmp_greater_equal()` when comparing unsigned and signed for greater-or-equal. ``` /home/kefu/dev/scylladb/reader_concurrency_semaphore.cc:931:76: error: comparison of integer expressions of different signedness: ‘long int’ and ‘uint64_t’ {aka ‘long unsigned int’} [-Werror=sign-compare] 931 \| if (_resources.memory <= 0 && (consumed_resources().memory + r.memory) >= get_kill_limit()) [[unlikely]] { \| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~ ``` Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-04-29 17:02:25 +08:00
Botond Dénes	88c19b23dc	reader_permit: resource_units::reset_to(): try harder to avoid calling consume() Currently, the `reset_to()` implementation calls `consume(new_amount)` (if not zero), then calls `signal(old_amount)`. This means that even if `reset_to()` is a net reduction in the amount of resources, there is a call to `consume()` which can now potentially throw. Add a special case for when the new amount of resources is strictly smaller than the old amount. In this case, just call `signal()` with the difference. This not only avoids a potential `std::bad_alloc`, but also helps relieve memory pressure when this is most needed, by not failing calls to release memory.	2023-04-26 07:41:57 -04:00
Botond Dénes	2449b714df	reader_permit: split resource_units::reset() Into reset_to() and reset_to_zero(). The latter replaces `reset()` with the default 0 resources argument, which was often called from noexcept contexts. Splitting it out from `reset()` allows for a specialized implementation that is guaranteed to be `noexcept` indeed and thus peace of mind.	2023-04-26 07:41:57 -04:00
Botond Dénes	ecbb118d32	reader_concurrency_semaphore: misc updates w.r.t. recent permit state name changes Update comments, test names, etc. that are still using the old terminology for permit state names, bringing them up to date with the recent state name changes.	2023-04-19 05:31:27 -04:00
Botond Dénes	e71d6566ab	reader_concurrency_semaphore: update permit members w.r.t. recent permit state name changes They are still using the old terminology for permit state names, bring them up to date with the recent state name changes.	2023-04-19 05:20:44 -04:00
Botond Dénes	804403f618	reader_concurrency_semaphore: update RAII state guard classes w.r.t. recent permit state name changes They are still using the old terminology for permit state names, bring them up to date with the recent state name changes.	2023-04-19 05:20:42 -04:00
Botond Dénes	89328ce447	reader_concurrency_semaphore: update API w.r.t. recent permit state name changes It is still using the old terminology for permit state names, bring it up to date with the recent state name changes.	2023-04-19 05:18:13 -04:00
Botond Dénes	3919effe2d	reader_concurrency_semaphore: update stats w.r.t. recent permit state name changes It is still using the old terminology for permit state names, bring it up to date with the recent state name changes.	2023-04-19 05:17:34 -04:00
Botond Dénes	943ae7fc69	reader_permit: give better names to active* states The names of these states have been the source of confusion ever since they were introduced. Give them names which better reflect their true meaning and give less room for misinterpretation. The changes are: * active/unused -> active * active/used -> active/need_cpu * active/blocked -> active/await Hopefully the new names do a better job at conveying what these states really mean: * active - a regular admitted permit, which is active (as opposed to an inactive permit). * active/need_cpu - an active permit which was marked as needing CPU for the read to make progress. This permit prevents admission of new permits while it is in this state. * active/await - a former active/need_cpu permit, which has to wait on I/O or a remote shard. While in this state, it doesn't block the admission of new permits (pending other criteria such as resource availability).	2023-04-14 08:40:46 -04:00
Botond Dénes	bd57471e54	reader_concurrency_semaphore: don't evict inactive readers needlessly Inactive readers should only be evicted to free up resources for waiting readers. Evicting them when waiters are not admitted for any other reason than resources is wasteful and leads to extra load later on when these evicted readers have to be recreated and requeued. This patch changes the logic on both the registering path and the admission path to not evict inactive readers unless there are readers actually waiting on resources. A unit-test is also added, reproducing the overly-aggressive eviction and checking that it doesn't happen anymore. Fixes: #11803 Closes #13286	2023-04-13 15:20:18 +03:00
Botond Dénes	d5488dba69	reader_permit: set_trace_state(): emit trace message linking to previous page This method is called on the start of each page, updating the trace state stored on the permit to that of the current page. When doing so, emit a trace message, containing the session id of the previous page, so the per-page sessions can be stitched together later. Note that this message is only emitted if the cached read survived between the pages. Example: Tracing session: dcfc1570-ca3c-11ed-88d0-24443f03a8bb activity \| timestamp \| source \| source_elapsed \| client ---------------------------------------------------------------------------------------------------------------------------------------+----------------------------+-----------+----------------+----------- Execute CQL3 query \| 2023-03-24 08:10:27.271000 \| 127.0.0.1 \| 0 \| 127.0.0.1 Parsing a statement [shard 0] \| 2023-03-24 08:10:27.271864 \| 127.0.0.1 \| -- \| 127.0.0.1 Processing a statement [shard 0] \| 2023-03-24 08:10:27.271958 \| 127.0.0.1 \| 94 \| 127.0.0.1 Creating read executor for token 3274692326281147944 with all: {127.0.0.1} targets: {127.0.0.1} repair decision: NONE [shard 0] \| 2023-03-24 08:10:27.271995 \| 127.0.0.1 \| 132 \| 127.0.0.1 read_data: querying locally [shard 0] \| 2023-03-24 08:10:27.271998 \| 127.0.0.1 \| 135 \| 127.0.0.1 Start querying singular range {{3274692326281147944, pk{00026b73}}} [shard 0] \| 2023-03-24 08:10:27.272003 \| 127.0.0.1 \| 140 \| 127.0.0.1 [reader concurrency semaphore] admitted immediately [shard 0] \| 2023-03-24 08:10:27.272006 \| 127.0.0.1 \| 143 \| 127.0.0.1 [reader concurrency semaphore] executing read [shard 0] \| 2023-03-24 08:10:27.272014 \| 127.0.0.1 \| 150 \| 127.0.0.1 Querying cache for range {{3274692326281147944, pk{00026b73}}} and slice {(-inf, +inf)} [shard 0] \| 2023-03-24 08:10:27.272022 \| 127.0.0.1 \| 159 \| 127.0.0.1 Page stats: 1 partition(s), 0 static row(s) (0 live, 0 dead), 3 clustering row(s) (3 live, 0 dead) and 0 range 
tombstone(s) [shard 0] \| 2023-03-24 08:10:27.272076 \| 127.0.0.1 \| 212 \| 127.0.0.1 Caching querier with key ab928e0d-b815-46b7-9a02-1fa2d9549477 [shard 0] \| 2023-03-24 08:10:27.272084 \| 127.0.0.1 \| 221 \| 127.0.0.1 Querying is done [shard 0] \| 2023-03-24 08:10:27.272087 \| 127.0.0.1 \| 224 \| 127.0.0.1 Done processing - preparing a result [shard 0] \| 2023-03-24 08:10:27.272106 \| 127.0.0.1 \| 242 \| 127.0.0.1 Request complete \| 2023-03-24 08:10:27.271259 \| 127.0.0.1 \| 259 \| 127.0.0.1 Tracing session: dd3092f0-ca3c-11ed-88d0-24443f03a8bb activity \| timestamp \| source \| source_elapsed \| client ---------------------------------------------------------------------------------------------------------------------------------------+----------------------------+-----------+----------------+----------- Execute CQL3 query \| 2023-03-24 08:10:27.615000 \| 127.0.0.1 \| 0 \| 127.0.0.1 Parsing a statement [shard 0] \| 2023-03-24 08:10:27.615223 \| 127.0.0.1 \| -- \| 127.0.0.1 Processing a statement [shard 0] \| 2023-03-24 08:10:27.615310 \| 127.0.0.1 \| 87 \| 127.0.0.1 Creating read executor for token 3274692326281147944 with all: {127.0.0.1} targets: {127.0.0.1} repair decision: NONE [shard 0] \| 2023-03-24 08:10:27.615346 \| 127.0.0.1 \| 124 \| 127.0.0.1 read_data: querying locally [shard 0] \| 2023-03-24 08:10:27.615349 \| 127.0.0.1 \| 126 \| 127.0.0.1 Start querying singular range {{3274692326281147944, pk{00026b73}}} [shard 0] \| 2023-03-24 08:10:27.615352 \| 127.0.0.1 \| 130 \| 127.0.0.1 Found cached querier for key ab928e0d-b815-46b7-9a02-1fa2d9549477 and range(s) {{{3274692326281147944, pk{00026b73}}}} [shard 0] \| 2023-03-24 08:10:27.615358 \| 127.0.0.1 \| 135 \| 127.0.0.1 Reusing querier [shard 0] \| 2023-03-24 08:10:27.615362 \| 127.0.0.1 \| 139 \| 127.0.0.1 Continuing paged query, previous page's trace session is dcfc1570-ca3c-11ed-88d0-24443f03a8bb [shard 0] \| 2023-03-24 08:10:27.615364 \| 127.0.0.1 \| 141 \| 127.0.0.1 [reader concurrency semaphore] 
executing read [shard 0] \| 2023-03-24 08:10:27.615371 \| 127.0.0.1 \| 148 \| 127.0.0.1 Page stats: 1 partition(s), 0 static row(s) (0 live, 0 dead), 1 clustering row(s) (1 live, 0 dead) and 0 range tombstone(s) [shard 0] \| 2023-03-24 08:10:27.615385 \| 127.0.0.1 \| 163 \| 127.0.0.1 Querying is done [shard 0] \| 2023-03-24 08:10:27.615583 \| 127.0.0.1 \| 360 \| 127.0.0.1 Done processing - preparing a result [shard 0] \| 2023-03-24 08:10:27.615730 \| 127.0.0.1 \| 507 \| 127.0.0.1 Request complete \| 2023-03-24 08:10:27.615518 \| 127.0.0.1 \| 518 \| 127.0.0.1 See the message: Continuing paged query, previous page's trace session is dcfc1570-ca3c-11ed-88d0-24443f03a8bb [shard 0] \| 2023-03-24 08:10:27.615364 \| 127.0.0.1 \| 141 \| 127.0.0.1 This is a follow-up to #13255 Refs: #12781 Closes #13318	2023-03-26 18:41:21 +03:00
Botond Dénes	ff87f95a26	reader_concurrency_semaphore: add trace points for important events Notably, to admission execution and eviction. Registering/unregistering the permit as inactive is not traced, as this happens on every buffer-fill for range scans. Semaphore trace messages have a "[reader_concurrency_semaphore]" prefix to allow them to be clearly associated with the semaphore.	2023-03-22 04:58:18 -04:00
Botond Dénes	1f51f752cc	reader_permit: refresh trace_state on new pages To make sure all tracing done on a certain page will make its way into the appropriate trace session. This is a continuation of the previous patch (which added trace pointer to the permit).	2023-03-22 04:58:10 -04:00
Botond Dénes	156e5d346d	reader_permit: keep trace_state pointer on permit And propagate it down to where it is created. This will be used to add trace points for semaphore related events, but this will come in the next patches.	2023-03-22 04:58:01 -04:00
Botond Dénes	d6583cad0a	reader_concurrency_semaphore: do_dump_reader_permit_diagnostics(): print the stats Print the semaphore stats below the permit listing and remove the currently redundant "Total: " line. Some of the stats printed here are already exported as metrics, but instead of trying to cherry-pick and risk some metrics falling through the cracks, just print everything, there aren't that many anyway.	2023-03-17 03:15:41 -04:00
Botond Dénes	7b701ac52e	reader_concurrency_semaphore: add stats to record reason for queueing permits When diagnosing problems, knowing why permits were queued is very valuable. Record the reason in a new stats, one for each reason a permit can be queued.	2023-03-17 03:15:41 -04:00
Botond Dénes	bb00405818	reader_concurrency_semaphore: can_admit_read(): also return reason for rejection So caller can bump the appropriate counters or log the reason why the request cannot be admitted.	2023-03-17 03:15:40 -04:00
Botond Dénes	3f0b3489a2	reader_concurrency_semaphore: handle reader blocked on memory becoming inactive Kill said read's memory requests with std::bad_alloc and dequeue it from the memory wait list, then evict it on the spot. Now that `_inactive_reads` just stores permits, we can do this easily.	2023-03-13 08:07:53 -04:00
Botond Dénes	d1bc5f9293	reader_permit: evict inactive read on timeout If the read is inactive when the timeout clock fires, evict it. Now that `_inactive_reads` just stores permits, we can do this easily.	2023-03-13 08:07:53 -04:00
Botond Dénes	6181c08191	reader_concurrency_semaphore: move inactive_read to .cc It is not used in the header anymore and moving it to the .cc allows us to remove the dependency on flat_mutation_reader_v2.hh.	2023-03-13 08:07:53 -04:00
Botond Dénes	e56ec9373d	reader_concurrency_semaphore: store permits in _inactive_reads Add a member of type `inactive_read` to reader permit, and store permit instances in `_inactive_reads`. This list is now just another intrusive list the permit can be linked into, depending on its state. Inactive read handles now just store a reader permit pointer.	2023-03-13 08:07:53 -04:00
Botond Dénes	d11f9efbfe	reader_concurrency_semaphore: inactive_read: de-inline more methods They will soon need to access reader_permit::impl internals, only available in the .cc file.	2023-03-13 08:07:53 -04:00
Botond Dénes	8e296e8e05	reader_concurrency_semaphore: make _ready_list intrusive Following the same scheme we used to make the wait lists intrusive. Permits are added to the intrusive ready list while waiting to be executed and moved back to the _permit_list when de-queued from this list. We now use a condition variable for signaling when there are permits ready to be executed.	2023-03-13 08:07:53 -04:00
Botond Dénes	11dde4b80b	reader_permit: add wait_for_execution state Used while the permit is in the _ready_list, waiting for the execution loop to pick it up. This is just acknowledging the existence of this wait-state. This state will now show up in permit diagnostics printouts and we can now determine whether a permit is waiting for execution, without checking which queue it is in.	2023-03-09 07:11:51 -05:00
Botond Dénes	6229f8b1a6	reader_concurrency_semaphore: make wait lists intrusive Instead of using expiring_fifo to store queued permits, use the same intrusive list mechanism we use to keep track of all permits. Permits are now moved between the _permit_list and the wait queues, depending on which state they are in. This means _permit_list is now not the definitive list containing all permits, instead it is the list containing all permits that are not in a more specialized queue at the moment. Code wishing to iterate over all permits should now use foreach_permits(). For outside code, this was already the only way and internal users are already patched. Making the wait lists intrusive allows us to dequeue a permit from any position, with nothing but a permit reference at hand. It also means the wait queues don't have any additional memory requirements, other than the memory for the permit itself. Timeout while being queued is now handled by the permit's on_timeout() callback.	2023-03-09 07:11:49 -05:00
Botond Dénes	9ea9a48dbc	reader_concurrency_semaphore: move most wait_queue methods out-of-line They will soon depend on the definition of the reader_permit::impl, which is only available in the .cc file.	2023-03-09 06:53:11 -05:00
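
The `reset_to()` special case described in commit 88c19b23dc can be illustrated with a small standalone sketch. This is not the ScyllaDB implementation: `toy_semaphore` and `toy_units` are hypothetical stand-ins for the semaphore and `resource_units`, and the only behaviour taken from the commit message is "consume the new amount, then signal the old one" in the general case versus "just signal the difference" when the new amount is strictly smaller.

```cpp
#include <cstdint>
#include <new>

// Hypothetical stand-in for the semaphore: consume() may throw, signal() never does.
struct toy_semaphore {
    std::int64_t available;

    void consume(std::int64_t n) {
        if (n > available) {
            throw std::bad_alloc();
        }
        available -= n;
    }
    void signal(std::int64_t n) noexcept { available += n; }
};

// Hypothetical RAII holder, loosely modelled on the resource_units described above.
class toy_units {
    toy_semaphore& _sem;
    std::int64_t _amount = 0;
public:
    toy_units(toy_semaphore& sem, std::int64_t amount) : _sem(sem) {
        _sem.consume(amount);
        _amount = amount;
    }
    toy_units(const toy_units&) = delete;
    toy_units& operator=(const toy_units&) = delete;
    ~toy_units() { _sem.signal(_amount); }

    std::int64_t amount() const noexcept { return _amount; }

    void reset_to(std::int64_t new_amount) {
        if (new_amount < _amount) {
            // Net reduction: only release the difference, so nothing can throw.
            _sem.signal(_amount - new_amount);
        } else {
            // General path from the commit message: consume new, then release old.
            if (new_amount != 0) {
                _sem.consume(new_amount); // may throw; _amount is left untouched
            }
            _sem.signal(_amount);
        }
        _amount = new_amount;
    }
};
```

With a semaphore holding 10 units and a holder of 8, shrinking to 3 succeeds on the fast path even though only 2 units are free, whereas the general path would have tried to consume 3 first and thrown.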

202 Commits