scylladb

Author	SHA1	Message	Date
Michał Chojnowski	c5c19e90ac	logalloc: add hold_reserve mutation_partition_v2::apply_monotonically() needs to perform some allocations in a destructor, to ensure that the invariants of the data structure are restored before returning. But it is usually called with reclaiming disabled, so the allocations might fail even in a perfectly healthy node with plenty of reclaimable memory. This patch adds a mechanism which allows reserving some LSA memory (by asking the allocator to keep it unused) and making it available for allocation right when we need to guarantee allocation success. (cherry picked from commit `7b3f55a65f`)	2024-07-10 08:36:11 +00:00
Michał Chojnowski	985f5a50f6	logalloc: generalize refill_emergency_reserve() In the next patch, we will want to do the same thing as refill_emergency_reserve() does, just with a quantity different than _emergency_reserve_max. So we split off the shareable part to a new function, and use it to implement refill_emergency_reserve(). (cherry picked from commit `f784be6a7e`)	2024-07-10 08:36:11 +00:00
Benny Halevy	e5ca65f78b	test/perf: report also log_allocations/op Currently perf-simple-query --write ignores log allocations that happen on the memtable apply path. This change adds tracking and accounting of the number of log allocations, and reporting thereof. For reference, here's the output of build/release/scylla perf-simple-query --write --default-log-level=error --random-seed=1 -c 1 ``` random-seed=1 enable-cache=1 Running test with config: {partitions=10000, concurrency=100, mode=write, frontend=cql, query_single_key=no, counters=no} Disabling auto compaction 78073.55 tps ( 59.4 allocs/op, 16.3 logallocs/op, 14.3 tasks/op, 52991 insns/op, 0 errors) 77263.59 tps ( 59.3 allocs/op, 16.0 logallocs/op, 14.3 tasks/op, 53282 insns/op, 0 errors) 79913.07 tps ( 59.3 allocs/op, 16.0 logallocs/op, 14.3 tasks/op, 53295 insns/op, 0 errors) 79554.32 tps ( 59.3 allocs/op, 16.0 logallocs/op, 14.3 tasks/op, 53284 insns/op, 0 errors) 79151.53 tps ( 59.3 allocs/op, 16.0 logallocs/op, 14.3 tasks/op, 53289 insns/op, 0 errors) median 79151.53 tps ( 59.3 allocs/op, 16.0 logallocs/op, 14.3 tasks/op, 53289 insns/op, 0 errors) median absolute deviation: 761.54 maximum: 79913.07 minimum: 77263.59 ``` Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2024-05-02 18:42:41 +03:00
Kefu Chai	fcf7ca5675	utils/logalloc: do not allocate memory in reclaim_timer::report() before this change, `reclaim_timer::report()` calls ```c++ fmt::format(", at {}", current_backtrace()) ``` which allocates a `std::string` on heap, so it can fail and throw. in that case, `std::terminate()` is called. but at that moment, the reason why `reclaim_timer::report()` gets called is that we fail to reclaim memory for the caller. so we are more likely to run into this issue. anyway, we should not allocate memory in this path. in this change, a dedicated printer is created so that we don't format to a temporary `std::string`, and instead write directly to the buffer of logger. this avoids the memory allocation. Fixes #18099 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#18100	2024-04-01 11:01:52 +03:00
Kefu Chai	3d9054991b	utils/logalloc: add fmt::formatter for occupancy_stats before this change, we rely on the default-generated fmt::formatter created from operator<<, but fmt v10 dropped the default-generated formatter. in this change, we define formatters for `occupancy_stats`, and drop its operator<<. Refs #13245 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2024-02-23 11:32:41 +08:00
Yaniv Kaul	ae2ab6000a	Typos: fix typos in code Fixes some more typos as found by codespell run on the code. In this commit, there are more user-visible errors. Refs: https://github.com/scylladb/scylladb/issues/16255	2023-12-05 15:18:11 +02:00
Yaniv Kaul	c658bdb150	Typos: fix typos in comments Fixes some typos as found by codespell run on the code. In this commit, I was hoping to fix only comments, not user-visible alerts, output, etc. Follow-up commits will take care of them. Refs: https://github.com/scylladb/scylladb/issues/16255 Signed-off-by: Yaniv Kaul <yaniv.kaul@scylladb.com>	2023-12-02 22:37:22 +02:00
Avi Kivity	ee9cc450d4	logalloc: report increases of reserves The log-structured allocator maintains memory reserves so that operations using log-structured allocator memory can have some working memory and can allocate. The reserves start small and are increased if allocation failures are encountered. Before starting an operation, the allocator first frees memory to satisfy the reserves. One problem is that if the reserves are set to a high value and we encounter a stall, then, first, we have no idea what value the reserves are set to, and second, we have no idea what operation caused the reserves to be increased. We fix this problem by promoting the log reports of reserve increases from DEBUG level to INFO level and by attaching a stack trace to those reports. This isn't optimal since the messages are used for debugging, not for informing the user about anything important for the operation of the node, but I see no other way to obtain the information. Ref #13930. Closes scylladb/scylladb#15153	2023-10-23 13:37:50 +02:00
Botond Dénes	a0c5dee2aa	utils/logalloc: introduce logalloc::bad_alloc This new exception type inherits from std::bad_alloc and allows logalloc code to add additional information about why the allocation failed. We currently have 3 different throw sites for std::bad_alloc in logalloc.cc and when investigating a coredump produced by --abort-on-lsa-bad-alloc, it is impossible to determine, which throw-site activated last, triggering the abort. This patch fixes that by disambiguating the throw-sites and including it in the error message printed, right before abort. Refs: #15373 Closes scylladb/scylladb#15503	2023-09-21 17:43:53 +03:00
Pavel Emelyanov	30959fc9b1	lsa, test: Extend memory footprint test with per-type total sizes When memory footprint test is over it prints total size taken by row cache, memtable and sstables as well as individual objects' sizes. It's also nice to know the details on the row-cache's individual objects. This patch extends the printing with total size of allocated object types according to migrator_fn types. Sample output: mutation footprint: - in cache: 11040928 - in memtable: 9142424 - in sstable: mc: 2160000 md: 2160000 me: 2160000 - frozen: 540 - canonical: 827 - query result: 342 sizeof(cache_entry) = 64 sizeof(memtable_entry) = 64 sizeof(bptree::node) = 288 sizeof(bptree::data) = 72 -- sizeof(decorated_key) = 32 -- sizeof(mutation_partition) = 96 -- -- sizeof(_static_row) = 8 -- -- sizeof(_rows) = 24 -- -- sizeof(_row_tombstones) = 40 sizeof(rows_entry) = 144 sizeof(evictable) = 24 sizeof(deletable_row) = 72 sizeof(row) = 16 radix_tree::inner_node::node_sizes = 48 80 144 272 528 1040 radix_tree::leaf_node::node_sizes = 120 216 416 816 3104 sizeof(atomic_cell_or_collection) = 16 btree::linear_node_size(1) = 24 btree::inner_node_size = 216 btree::leaf_node_size = 120 LSA stats: N18compact_radix_tree4treeI13cell_and_hashjE9leaf_nodeE: 360 N5bplus4dataIl15intrusive_arrayI11cache_entryEN3dht25raw_token_less_comparatorELm16ELNS_10key_searchE0ELNS_10with_debugE0EEE: 5040 N5bplus4nodeIl15intrusive_arrayI11cache_entryEN3dht25raw_token_less_comparatorELm16ELNS_10key_searchE0ELNS_10with_debugE0EEE: 19296 17partition_version: 952416 N11intrusive_b4nodeI10rows_entryXadL_ZNS1_5_linkEEENS1_11tri_compareELm12ELm20ELNS_10key_searchE0ELNS_10with_debugE0EEE: 317472 10rows_entry: 1429056 12blob_storage: 254 Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes scylladb/scylladb#15434	2023-09-18 11:23:18 +02:00
Avi Kivity	2303f08eea	utils: logalloc: correct asan_interface.h location It's a system header, so it deserves angle brackets. Closes #14036	2023-05-29 23:03:25 +03:00
Kefu Chai	3ae11de204	treewide: do not define/capture unused variables these warnings are found by Clang-17 after removing `-Wno-unused-lambda-capture` and '-Wno-unused-variable' from the list of disabled warnings in `configure.py`. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-02-28 21:56:53 +08:00
Kefu Chai	9520acb1a1	logalloc: mark segment_store_backend's virtual before this change, `seastar_memory_segment_store_backend` is class with virtual method, but it does not have a virtual dtor. but we do use a unique_ptr<segment_store_backend> to manage the lifecycle of an instance of its derived class. to enable the compiler to call the right dtor, we should mark the base class's dtor as virtual. this should address following warnings from Clang-17: ``` /home/kefu/.local/bin/../lib/gcc/x86_64-pc-linux-gnu/13.0.1/../../../../include/c++/13.0.1/bits/unique_ptr.h:100:2: error: delete called on non-final 'logalloc::seastar_memory_segment_store_backend' that has virtual functions but non-virtual destructor [-Werror,-Wdelete-non-abstract-non-virtual-dtor] delete __ptr; ^ /home/kefu/.local/bin/../lib/gcc/x86_64-pc-linux-gnu/13.0.1/../../../../include/c++/13.0.1/bits/unique_ptr.h:405:4: note: in instantiation of member function 'std::default_delete<logalloc::seastar_memory_segment_store_backend>::operator()' requested here get_deleter()(std::move(__ptr)); ^ /home/kefu/dev/scylladb/utils/logalloc.cc:812:20: note: in instantiation of member function 'std::unique_ptr<logalloc::seastar_memory_segment_store_backend>::~unique_ptr' requested here : _backend(std::make_unique<seastar_memory_segment_store_backend>()) ^ ``` and ``` /home/kefu/.local/bin/../lib/gcc/x86_64-pc-linux-gnu/13.0.1/../../../../include/c++/13.0.1/bits/unique_ptr.h:100:2: error: delete called on 'logalloc::segment_store_backend' that is abstract but has non-virtual destructor [-Werror,-Wdelete-abstract-non-virtual-dtor] delete __ptr; ^ /home/kefu/.local/bin/../lib/gcc/x86_64-pc-linux-gnu/13.0.1/../../../../include/c++/13.0.1/bits/unique_ptr.h:405:4: note: in instantiation of member function 'std::default_delete<logalloc::segment_store_backend>::operator()' requested here get_deleter()(std::move(__ptr)); ^ /home/kefu/dev/scylladb/utils/logalloc.cc:811:5: note: in instantiation of member function 'std::unique_ptr<logalloc::segment_store_backend>::~unique_ptr' requested here contiguous_memory_segment_store() ^ ``` Fixes #12872 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #12873	2023-02-16 19:05:48 +02:00
Avi Kivity	a2d43bb851	logalloc: disambiguate types and non-type members logalloc::tracker has some members with the same names as types from namespace scope. gcc (rightfully) complains that this changes the meaning of the name. Qualify the types to disambiguate.	2022-11-28 21:58:30 +02:00
Michał Chojnowski	4563cbe595	logalloc: prevent false positives in reclaim_timer reclaim_timer uses a coarse clock, but does not account for the measurement error introduced by that -- it can falsely report reclaims as stalls, even if they are shorter by a full coarse clock tick from the requested threshold (blocked-reactor-notify-ms). Notably, if the stall threshold happens to be smaller or equal to coarse clock resolution, Scylla's log gets spammed with false stall reports. The resolution of coarse clocks in Linux is 1/CONFIG_HZ. This is typically equal to 1 ms or 4 ms, and stall thresholds of this order can occur in practice. Eliminate false positives by requiring the measured reclaim duration to be at least 1 clock tick longer than the configured threshold for it to be considered a stall. Fixes #10981 Closes #11680	2022-10-02 13:41:40 +03:00
Avi Kivity	2cec417426	Merge 'tools: use the standard allocator' from Botond Dénes Tools want to be as little disrupting to the environment they run in as possible, because they might be run in a production environment, next to a running scylladb production server. As such, the usual behavior of seastar applications w.r.t. memory is an anti-pattern for tools: they don't want to reserve most of the system memory, in fact they don't want to reserve any amount, instead consuming as much as needed on-demand. To achieve this, tools want to use the standard allocator. To achieve this they need a seastar option to instruct seastar to not configure and use the seastar allocator and they need LSA to cooperate with the standard allocator. The former is provided by https://github.com/scylladb/seastar/pull/1211. The latter is solved by introducing the concept of a `segment_store_backend`, which abstracts away how the memory arena for segments is acquired and managed. We then refactor the existing segment store so that the seastar allocator specific parts are moved to an implementation of this backend concept, then we introduce another backend implementation appropriate to the standard allocator. Finally, tools configure seastar with the newly introduced option to use the standard allocator and similarly configure LSA to use the standard allocator appropriate backend. Refs: https://github.com/scylladb/scylladb/issues/9882 This is the last major code piece in scylla for making tools production ready.
Closes #11510 * github.com:scylladb/scylladb: test/boost: add alternative variant of logalloc test tools: use standard allocator utils/logalloc: add use_standard_allocator_segment_pool_backend() utils/logalloc: introduce segment store backend for standard allocator utils/logalloc: rebase release segment-store on segment-store-backend utils/logalloc: introduce segment_store_backend utils/logalloc: push segment alloc/dealloc to segment_store test/boost/logalloc_test: make test_compaction_with_multiple_regions exception-safe	2022-09-20 12:59:34 +03:00
Botond Dénes	a55903c839	utils/logalloc: add use_standard_allocator_segment_pool_backend() Creating a standard-memory-allocator backend for the segment store. This is targeted towards tools, which want to configure LSA with a segment store backend that is appropriate for the standard allocator (which they want to use). We want to be able to use this in both release and debug mode. The former will be used by tools and the latter will be used to run the logalloc tests with this new backend, making sure it works and doesn't regress. For this latter, we have to allow the release and debug stores to coexist in the same build and for the debug store to be able to delegate to the release store when the standard allocator backend is used.	2022-09-16 13:02:40 +03:00
Botond Dénes	c1c74005b7	utils/logalloc: introduce segment store backend for standard allocator To be used by tools, this store backend is compatible with the standard allocator as it acquires the memory arena for segments via mmap().	2022-09-16 12:16:57 +03:00
Botond Dénes	d2a7ebbe66	utils/logalloc: rebase release segment-store on segment-store-backend Rebase the seastar allocator based segment store implementation on the recently introduced segment store backend which now abstracts away how memory for segments is obtained. This patch also introduces an explicit `segment_npos` to be used for cases when a segment -> index mapping fails (segment doesn't belong to the store). Currently the seastar allocator based store simply doesn't handle this case, while the standard allocator based store uses 0 as the implicit invalid index.	2022-09-16 12:16:57 +03:00
Botond Dénes	3717f7740d	utils/logalloc: introduce segment_store_backend We want to make it possible to select the segment-store to be used for LSA -- the seastar allocator based one or the standard allocator based one -- at runtime. Currently this choice is made at compile time via preprocessor switches. The current standard memory based store is specialized for debug build, we want something more similar to the seastar standard memory allocator based one. So we introduce a segment store backend for the current seastar allocator based store, which abstracts how the backing memory for all segments is allocated/freed, while keeping the segment <-> index mapping common. In the next patches we will rebase the current seastar allocator based segment store on this backend and later introduce another backend for standard allocator, targeted for release builds.	2022-09-16 12:16:57 +03:00
Botond Dénes	5ea4d7fb39	utils/logalloc: push segment alloc/dealloc to segment_store Currently the actual alloc/dealloc of memory for segments is located outside the segment stores. We want to abstract away how segments are allocated, so we move this logic too into the segment store. For now this results in duplicate code in the two segment store implementations, but this will soon be gone.	2022-09-16 12:16:57 +03:00
Avi Kivity	d3b8c0c8a6	logalloc: don't crash while reporting reclaim stalls if --abort-on-seastar-bad-alloc is specified The logger is proof against allocation failures, except if --abort-on-seastar-bad-alloc is specified. If it is, it will crash. The reclaim stall report is likely to be called in low memory conditions (reclaim's job is to alleviate these conditions after all), so we're likely to crash here if we're reclaiming a very low memory condition and have a large stall simultaneously (AND we're running in a debug environment). Prevent all this by disabling --abort-on-seastar-bad-alloc temporarily. Fixes #11549 Closes #11555	2022-09-15 19:24:39 +02:00
Michał Chojnowski	c61b901828	utils: logalloc: prefer memory::free_memory() to memory::stats().free_memory The former is a shortcut that does not involve a copy of all stats. This saves some instructions in the hot path. Closes #11495	2022-09-08 14:12:20 +03:00
Botond Dénes	5bc499080d	utils/logalloc: remove reclaim_timer:: globals One of them (_active_timer) is moved to shard tracker, the other is made a simple local in reclaim_timer.	2022-08-23 10:38:58 +03:00
Botond Dénes	5f8971173e	utils/logalloc: make s_sanitizer_report_backtrace global a member of tracker We want to consolidate all the logalloc state into a single object: the shard tracker. Replacing this global with a member in said object is part of this effort.	2022-08-23 10:38:58 +03:00
Botond Dénes	499b9a3a7c	utils/logalloc: tracker_reclaimer_lock: get shard tracker via constructor arg	2022-08-23 10:38:58 +03:00
Botond Dénes	7d17d675af	utils/logalloc: move global stat accessors to tracker These are pretend free functions, accessing globals in the background, make them a member of the tracker instead, which has everything needed locally to compute them. Callers still have to access these stats through the global tracker instance, but this can be changed to happen through a local instance. Soon....	2022-08-23 10:38:58 +03:00
Botond Dénes	f406151a86	utils/logalloc: allocating_section: don't use the global tracker Instead, get the tracker instance from the region. This requires adding a `region&` parameter to `with_reserve()`. This brings us one step closer to eliminating the global tracker.	2022-08-23 10:38:58 +03:00
Botond Dénes	e968866fa1	utils/logalloc: pass down tracker::impl reference to segment_pool To get rid of some usages of `shard_tracker()`.	2022-08-23 10:38:58 +03:00
Botond Dénes	3bd94e41bf	utils/logalloc: move segment pool into tracker Instead of a separate global segment pool instance, make it a member of the already global tracker. Most users are inside the tracker instance anyway. Outside users can access the pool through the global tracker instance.	2022-08-23 10:38:58 +03:00
Botond Dénes	5b86dfc35a	utils/logalloc: add tracker member to basic_region_impl For now this member is initialized from the global tracker instance. But it allows the members of region impl to be detached from said global, making a step towards removing it.	2022-08-23 10:38:58 +03:00
Botond Dénes	f4056bd344	utils/logalloc: make segment independent of segment pool segment has some members, which simply forward the call to a segment_pool method, via the global segment_pool instance. Remove these and make the callers use the segment pool directly instead.	2022-08-23 10:38:58 +03:00
Benny Halevy	1d9862dab3	logalloc: region_impl: add moved method Don't open-code calling the region_impl _listeners->moved() in region move-constructor and move-assignment op. The other._impl->_region might be different than &other post region::merge so let the region_impl decide which region* is moved from. The new_region is also set to region_impl->_region so no need to open-code that either in the said call sites. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-07-28 10:49:49 +03:00
Benny Halevy	cd4dbb1cae	logalloc: region: merge: optimize getting other impl The other _impl is presumed to be engaged already, so just call other.get_impl() once for both use cases. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-07-28 10:49:36 +03:00
Benny Halevy	a547cb79e8	logalloc: region: merge: call region_impl::unlisten We can't be sure that the other_impl->_region == &other since it could be a result of a previous merge, so don't decide for it which region to unlisten to, let it use its current _region. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-07-28 10:49:27 +03:00
Benny Halevy	003216de59	logalloc: region: call unlisten rather than open coding it Current ~region and region::operator= open-code region_impl::unlisten. Just call it so it will be easier to maintain. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-07-28 10:49:11 +03:00
Benny Halevy	cff953535c	logalloc: region move-ctor: initialize _impl There's no need to default-initialize it and then move-assign it. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-07-28 10:49:05 +03:00
Benny Halevy	c7d77e4076	logalloc: region: get_impl might be called on disengaged _impl when moved First check if _impl is engaged before accessing it to set its _region = this in the move constructor and move assignment operator. Add unit test for these odd corner cases. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-07-28 10:48:58 +03:00
Benny Halevy	6e961ead3b	logalloc: mark free functions noexcept Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-07-27 13:43:17 +03:00
Benny Halevy	705b42efe2	logalloc: allocating_section: mark functions noexcept Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-07-27 13:43:17 +03:00
Benny Halevy	f9db708376	logalloc: allocating_section: guard: mark constructor noexcept Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-07-27 13:43:17 +03:00
Benny Halevy	95b0e41abb	logalloc: tracker_reclaimer_lock: mark constructor noexcept Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-07-27 13:43:17 +03:00
Benny Halevy	ed9e036509	logalloc: mark shard_tracker noexcept Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-07-27 13:43:17 +03:00
Benny Halevy	d6e6ffc741	logalloc: region: mark functions const/noexcept Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-07-27 13:43:17 +03:00
Benny Halevy	3ba85c3bbd	logalloc: region_impl: mark functions noexcept Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-07-27 13:43:17 +03:00
Benny Halevy	3f96818c03	logalloc: region_impl: object_descriptor: mark functions noexcept Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-07-27 13:43:17 +03:00
Benny Halevy	0866548b27	logalloc: region_group: mark functions noexcept Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-07-27 13:43:17 +03:00
Benny Halevy	fe50c76dbc	logalloc: tracker: mark functions const/noexcept Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-07-27 13:40:50 +03:00
Benny Halevy	71c21a83ad	logalloc: tracker::impl: make region_occupancy and friends const Now that they don't modify the tracker impl. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-07-27 13:40:18 +03:00
Benny Halevy	1c0c01cc24	logalloc: tracker::impl: occupancy: get rid of reclaiming_lock It was added in `d20fae96a2` as a precaution not to invalidate iterators while traversing _regions. However it is not required as no allocation is done on this synchronous path - therefore there is no point in preventing reclaim. This will allow making the respective functions const as they merely return stats and do not modify the tracker impl. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-07-27 13:39:18 +03:00
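
Commit 9520acb1a1 above describes deleting a derived object through a `unique_ptr` to its base class. A minimal sketch of why the base destructor has to be virtual in that situation follows. The names `backend`, `memory_backend` and `derived_dtor_runs` are hypothetical stand-ins chosen for illustration, not the actual logalloc types.

```cpp
#include <memory>

// Hypothetical stand-in for an abstract backend interface.
struct backend {
    // Virtual destructor: deleting through a backend* runs the derived destructor.
    virtual ~backend() = default;
    virtual int segments() const = 0;
};

// Hypothetical derived class that records when its destructor runs.
struct memory_backend final : backend {
    explicit memory_backend(bool& destroyed) : _destroyed(destroyed) {}
    ~memory_backend() override { _destroyed = true; }
    int segments() const override { return 4; }
private:
    bool& _destroyed;
};

// Returns true if the derived destructor ran when the owning base pointer was reset.
bool derived_dtor_runs() {
    bool destroyed = false;
    std::unique_ptr<backend> p = std::make_unique<memory_backend>(destroyed);
    p.reset();
    return destroyed;
}
```

Without the `virtual` on `~backend()`, deleting the object through the base pointer would be undefined behavior, which is what the Clang diagnostics quoted in that commit warn about.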


325 Commits