Commit Graph

38 Commits

Author SHA1 Message Date
Michał Chojnowski
cb02d47b10 utils/logalloc: enforce LSA allocation size limits
In order to guarantee a decent upper limit on fragmentation,
LSA only handles allocations smaller than 0.1 of a segment.

Allocations larger than this limit are permitted, but they are
not placed in LSA segments. Instead, they are forwarded to
the standard allocator.

We don't really have any use case for this "fallback".
As far as I can tell, it only exists for "historical"
reasons, from times where there were some data structures
which weren't fully adapted to LSA yet.

We don't want the fallback to be used.
Long-lived standard allocations are undesirable.
They have higher internal fragmentation than LSA
allocations, and they can cause external fragmentation
in the standard allocator. So we want to eliminate them all.

The only reason to keep the fallback is to soften the impact
if some bug results in limit-exceeding LSA allocations happening
in production. In principle, the fallback turns a crash
(or something similarly drastic) into just a performance problem.

However, it turns out that the fallback is buggy.
Recently we had a bug which caused limit-exceeding LSA allocations
to happen.
And then it turned out that LSA reclaim doesn't deal fully correctly
with evictable non-LSA allocations, and the dirty_memory_manager
accounting for non-LSA allocations is completely wrong.
This resulted in subtle, serious, and hard to understand stability
problems in production.

Arguably the biggest problem is that the "fallback" allocations
weren't reported in any way. They were happening in some tests,
but they were silently permitted, so nobody noticed that they
should be eliminated. If we just had a rate-limited error log
that reports fallback allocations, they would have never got
into a release.

So maybe we could fix the fallback, add more tests for it,
add a warning for when it's used, and keep it.

But this PR instead opts for removing the fallback mechanism
altogether and failing fast. After the patch, if a non-conforming
allocation happens, it will trigger an `on_internal_error`.
With this, we risk a greater impact if some non-conforming allocations
happen in production, but we make the system simpler.

It's hard to say if it's a good tradeoff.
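
A minimal sketch of the fail-fast check described above. The names, the 128 KiB segment size, and the throwing stand-in for `on_internal_error` are all assumptions made for illustration; this is not the actual logalloc code.

```cpp
#include <cstddef>
#include <stdexcept>

// Assumed values: a 128 KiB segment and the "0.1 of a segment" limit.
constexpr std::size_t segment_size = 128 * 1024;
constexpr std::size_t max_lsa_object_size = segment_size / 10;

// Hypothetical stand-in for on_internal_error(); here it simply throws.
[[noreturn]] inline void report_internal_error(const char* msg) {
    throw std::logic_error(msg);
}

// Fail fast on an oversized request instead of silently forwarding it
// to the standard allocator.
inline void check_lsa_allocation_size(std::size_t size) {
    if (size > max_lsa_object_size) {
        report_internal_error("LSA allocation exceeds the size limit");
    }
}
```

With a check like this, a non-conforming allocation surfaces immediately in tests rather than degrading silently in production.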
2025-05-29 13:05:08 +02:00
Kefu Chai
7215d4bfe9 utils: do not include unused headers
these unused includes were identified by clang-include-cleaner. after
auditing these source files, all of the reports have been confirmed.

please note, because quite a few source files relied on
`utils/to_string.hh` to pull in the specialization of
`fmt::formatter<std::optional<T>>`, after removing
`#include <fmt/std.h>` from `utils/to_string.hh`, we have to
include `fmt/std.h` directly.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2025-01-14 07:56:39 -05:00
Avi Kivity
f3eade2f62 treewide: relicense to ScyllaDB-Source-Available-1.0
Drop the AGPL license in favor of a source-available license.
See the blog post [1] for details.

[1] https://www.scylladb.com/2024/12/18/why-were-moving-to-a-source-available-license/
2024-12-18 17:45:13 +02:00
Michał Chojnowski
7b3f55a65f logalloc: add hold_reserve
mutation_partition_v2::apply_monotonically() needs to perform some allocations
in a destructor, to ensure that the invariants of the data structure are
restored before returning. But it is usually called with reclaiming disabled,
so the allocations might fail even in a perfectly healthy node with plenty of
reclaimable memory.

This patch adds a mechanism that reserves some LSA memory (by
asking the allocator to keep it unused) and makes it available for allocation
right when we need to guarantee allocation success.
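
The idea can be sketched with a toy segment pool and an RAII guard. `segment_pool` and `reserve_holder` are hypothetical names invented for this example; the real `hold_reserve` interface may look quite different.

```cpp
#include <cstddef>
#include <new>

// Toy pool that counts free segments; not the logalloc implementation.
class segment_pool {
    std::size_t _free;      // segments currently unused
    std::size_t _held = 0;  // unused segments set aside by active reserves
public:
    explicit segment_pool(std::size_t free_segments) : _free(free_segments) {}

    // Ordinary allocations may not consume held segments.
    void allocate_segment() {
        if (_free == _held) {
            throw std::bad_alloc();
        }
        --_free;
    }
    // Sets `n` free segments aside; fails if too few are free.
    void hold(std::size_t n) {
        if (_free - _held < n) {
            throw std::bad_alloc();
        }
        _held += n;
    }
    // Returns `n` held segments to the generally available set.
    void release(std::size_t n) noexcept { _held -= n; }
    std::size_t available() const noexcept { return _free - _held; }
};

// Keeps `n` segments unused until release() is called or the guard dies.
class reserve_holder {
    segment_pool& _pool;
    std::size_t _n;
public:
    reserve_holder(segment_pool& pool, std::size_t n) : _pool(pool), _n(n) {
        _pool.hold(n);
    }
    reserve_holder(const reserve_holder&) = delete;
    reserve_holder& operator=(const reserve_holder&) = delete;
    ~reserve_holder() { release(); }
    // Makes the reserved segments allocatable, e.g. right before the
    // allocations that must not fail.
    void release() noexcept {
        _pool.release(_n);
        _n = 0;
    }
};
```

The guard is created while memory is plentiful and released just before the critical allocations, so those allocations find free segments even with reclaiming disabled.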
2024-07-08 16:08:27 +02:00
Yaniv Kaul
c658bdb150 Typos: fix typos in comments
Fixes some typos as found by codespell run on the code.
In this commit, I was hoping to fix only comments, not user-visible alerts, output, etc.
Follow-up commits will take care of them.

Refs: https://github.com/scylladb/scylladb/issues/16255
Signed-off-by: Yaniv Kaul <yaniv.kaul@scylladb.com>
2023-12-02 22:37:22 +02:00
Pavel Emelyanov
30959fc9b1 lsa, test: Extend memory footprint test with per-type total sizes
When the memory footprint test finishes, it prints the total size taken by
the row cache, memtable and sstables, as well as individual objects' sizes.
It's also nice to know the details of the row cache's individual objects.
This patch extends the printing with total size of allocated object
types according to migrator_fn types.

Sample output:

    mutation footprint:
     - in cache:     11040928
     - in memtable:  9142424
     - in sstable:
       mc:   2160000
       md:   2160000
       me:   2160000
     - frozen:       540
     - canonical:    827
     - query result: 342

     sizeof(cache_entry) = 64
     sizeof(memtable_entry) = 64
     sizeof(bptree::node) = 288
     sizeof(bptree::data) = 72
     -- sizeof(decorated_key) = 32
     -- sizeof(mutation_partition) = 96
     -- -- sizeof(_static_row) = 8
     -- -- sizeof(_rows) = 24
     -- -- sizeof(_row_tombstones) = 40

     sizeof(rows_entry) = 144
     sizeof(evictable) = 24
     sizeof(deletable_row) = 72
     sizeof(row) = 16
     radix_tree::inner_node::node_sizes =  48 80 144 272 528 1040
     radix_tree::leaf_node::node_sizes =  120 216 416 816 3104
     sizeof(atomic_cell_or_collection) = 16
     btree::linear_node_size(1) = 24
     btree::inner_node_size = 216
     btree::leaf_node_size = 120
    LSA stats:
      N18compact_radix_tree4treeI13cell_and_hashjE9leaf_nodeE: 360
      N5bplus4dataIl15intrusive_arrayI11cache_entryEN3dht25raw_token_less_comparatorELm16ELNS_10key_searchE0ELNS_10with_debugE0EEE: 5040
      N5bplus4nodeIl15intrusive_arrayI11cache_entryEN3dht25raw_token_less_comparatorELm16ELNS_10key_searchE0ELNS_10with_debugE0EEE: 19296
      17partition_version: 952416
      N11intrusive_b4nodeI10rows_entryXadL_ZNS1_5_linkEEENS1_11tri_compareELm12ELm20ELNS_10key_searchE0ELNS_10with_debugE0EEE: 317472
      10rows_entry: 1429056
      12blob_storage: 254

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>

Closes scylladb/scylladb#15434
2023-09-18 11:23:18 +02:00
Pavel Emelyanov
063da81ab7 code: Convert nothrow construction assertions into concepts
The small_vector also has an N>0 constraint, which is converted as well.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-02-24 19:44:50 +03:00
Avi Kivity
fcb8d040e8 treewide: use Software Package Data Exchange (SPDX) license identifiers
Instead of lengthy blurbs, switch to single-line, machine-readable
standardized (https://spdx.dev) license identifiers. The Linux kernel
switched long ago, so there is strong precedent.

Three cases are handled: AGPL-only, Apache-only, and dual licensed.
For the latter case, I chose (AGPL-3.0-or-later and Apache-2.0),
reasoning that our changes are extensive enough to apply our license.

The changes were applied mechanically with a script, except to
licenses/README.md.

Closes #9937
2022-01-18 12:15:18 +01:00
Avi Kivity
daf028210b build: enable -Winconsistent-missing-override warning
This warning can catch a virtual function that thinks it
overrides another, but doesn't, because the two functions
have different signatures. This isn't very likely since most
of our virtual functions override pure virtuals, but it's
still worth having.

Enable the warning and fix numerous violations.

Closes #9347
2021-09-15 12:55:54 +03:00
Avi Kivity
a55b434a2b treewide: extend copyright statements to present day 2021-06-06 19:18:49 +03:00
Pavel Emelyanov
9216a5bc08 allocation_strategy, code: Conceptualize dynamic objects
Usually lsa allocation is performed with the construct() helper that
allocates a sizeof(T) slot and constructs it in place. Some rare
objects have dynamic size, so they are created by alloc()ating a
slot of some specific size and (!) must provide the correct overload
of size_for_allocation_strategy that reports back the relevant
storage size.

This "must provide" is not enforced; if it is missed, a default sizer is
instantiated, but it won't work properly. This patch makes all users
of alloc() conform to the DynamicObject concept, which requires the
presence of a .storage_size() method to report that size.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-05-19 09:23:49 +03:00
Pavel Emelyanov
b8a4f32b48 allocation_strategy: Hide migrators
After the IMR rework, the only lsa-migrating functionality is the standard
one that calls move constructors on lsa slots. Hide the whole thing inside
allocation-strategy.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-05-19 09:23:49 +03:00
Pavel Emelyanov
28f01aadc9 allocation_strategy, code: Simplify alloc()
Today's alloc() accepts migrate-fn, size and alignment. All the callers
don't really need to provide anything special for the migrate-fn and
are just happy with default alignof() for alignment. The simplification
is in providing alloc() that only accepts size arg and does the rest
itself.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-05-19 09:23:49 +03:00
Pavel Emelyanov
fdfcda97d7 allocation_strategy: Mark size_for_allocation_strategy noexcept
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-05-19 09:23:49 +03:00
Avi Kivity
4cfaab208e allocation_strategy: set preferred max contiguous allocation to 128k for standard allocations
Now that managed_bytes and its users do not assume that a managed_bytes
instance allocated using standard_allocation_strategy is non-fragmented,
we can set the preferred max contiguous allocation to 128k. This causes
managed_bytes to fragment instances that are larger than this size.

Note that managed_bytes is the only user.

Closes #7943
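
As a simplified illustration of what fragmenting at this limit means, the helper below splits a payload size into chunks no larger than 128k. `fragment_sizes()` is a hypothetical function; the real managed_bytes layout (for example, any per-fragment headers) is not modelled.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// The 128k limit named in the commit title.
constexpr std::size_t preferred_max_contiguous_allocation = 128 * 1024;

// Splits `total` bytes into fragment sizes, each at most `max_fragment`.
inline std::vector<std::size_t> fragment_sizes(
        std::size_t total,
        std::size_t max_fragment = preferred_max_contiguous_allocation) {
    std::vector<std::size_t> sizes;
    while (total > 0) {
        const std::size_t n = std::min(total, max_fragment);
        sizes.push_back(n);
        total -= n;
    }
    return sizes;
}
```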
2021-01-21 11:15:13 +02:00
Piotr Jastrzebski
f2b98b0aad Replace disable_failure_guard with scoped_critical_alloc_section
scoped_critical_alloc_section was recently introduced to replace
disable_failure_guard and made the old class deprecated.

This patch replaces all occurrences of disable_failure_guard with
scoped_critical_alloc_section.

Without this patch the build prints many warnings like:
warning: 'disable_failure_guard' is deprecated: Use scoped_critical_section instead [-Wdeprecated-declarations]

Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
Message-Id: <ca2a91aaf48b0f6ed762a6aa687e6ac5e936355d.1605621284.git.piotr@scylladb.com>
2020-11-17 16:01:25 +02:00
Pavel Emelyanov
b2ce3b197e allocation_strategy: Fix standard_migrator initialization
This is the continuation of 30722b8c8e, so let me re-cite Rafael:

    The constructors of these global variables can allocate memory. Since
    the variables are thread_local, they are initialized at first use.

    There is nothing we can do if these allocations fail, so use
    disable_failure_guard.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Message-Id: <20201028140553.21709-1-xemul@scylladb.com>
2020-10-28 16:22:23 +02:00
Pavel Emelyanov
2c4a94aeab allocation_strategy: Mark returning lambda as noexcept
It just calls current_allocator().destroy(), which is noexcept

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2020-07-09 14:41:23 +03:00
Pavel Emelyanov
a497dfdd0b allocation_strategy: Mark trivial noexcept methods as such
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2020-07-09 14:41:03 +03:00
Paweł Dziepak
c6c5accd19 lsa: provide migrator with the object size
While the migration function should have enough information to obtain
the object size itself, the LSA logic needs to compute it as well.
IMR is going to make calculating object sizes more expensive, so by
providing the information to the migrator we can avoid some needless
operations.
2018-05-09 16:52:26 +01:00
Paweł Dziepak
884888dc11 lsa: add free() that does not require object size
It is non-trivial to get the size of an IMR object. However, the
standard allocator doesn't really need it and LSA can compute it itself
by asking the migrator.
2018-05-09 16:52:26 +01:00
Paweł Dziepak
b3699f286d lsa: make migrators table thread-local
Migrators can be registered and deregistered at any time. If the table
is not thread-local we risk race conditions.
2018-05-09 16:10:46 +01:00
Michael Munday
18c0ab539e utils/allocation_strategy: force alignment to be at least sizeof(void*)
The alignment of packed structs can be 1. The system¹ posix_memalign
function will return EINVAL when passed this alignment. This fix
forces the alignment to be at least sizeof(void*).

¹ The seastar implementation of posix_memalign does not appear to
  have this limitation currently.
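
A minimal sketch of the fix, assuming a POSIX system and a GCC/Clang-style packed attribute. `alloc_aligned()` and `packed_record` are hypothetical names used only for this example.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <new>
#include <stdlib.h>  // posix_memalign, free (POSIX)

// Packed struct whose alignment is 1 (compiler extension, GCC/Clang).
struct __attribute__((packed)) packed_record {
    char tag;
    int value;
};

// Allocates `size` bytes. posix_memalign requires the alignment to be a
// power of two and a multiple of sizeof(void*), so smaller alignments
// are raised to sizeof(void*) first.
inline void* alloc_aligned(std::size_t size, std::size_t alignment) {
    alignment = std::max(alignment, sizeof(void*));
    void* ret = nullptr;
    if (::posix_memalign(&ret, alignment, size) != 0) {
        throw std::bad_alloc();
    }
    return ret;
}
```

Memory obtained this way is released with `free()`.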
2017-12-08 10:12:41 -05:00
Tomasz Grabiec
34ccf234ea Integrate with allocation failure injection framework 2017-11-07 15:33:24 +01:00
Tomasz Grabiec
87be474c19 lsa: Move reclaim counter concept to allocation_strategy level
So that generic code can detect invalidation of references. Also, to
allow reusing the same mechanism for signalling external reference
invalidation.
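
One way to picture the mechanism: the allocator bumps a counter whenever references may have been invalidated, and generic code compares a saved value against the current one. `reclaim_tracker` and `reference_validity_checker` are hypothetical names; the actual allocation_strategy interface may differ.

```cpp
#include <cstdint>

// Hypothetical allocator-side counter, bumped whenever objects may have
// moved or an external event invalidated references.
class reclaim_tracker {
    std::uint64_t _invalidate_counter = 0;
public:
    std::uint64_t invalidate_counter() const noexcept {
        return _invalidate_counter;
    }
    void invalidate_references() noexcept { ++_invalidate_counter; }
};

// Remembers the counter value at construction. References obtained at
// that point remain usable only while valid() returns true.
class reference_validity_checker {
    const reclaim_tracker& _tracker;
    std::uint64_t _snapshot;
public:
    explicit reference_validity_checker(const reclaim_tracker& t) noexcept
        : _tracker(t), _snapshot(t.invalidate_counter()) {}
    bool valid() const noexcept {
        return _snapshot == _tracker.invalidate_counter();
    }
};
```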
2017-09-13 17:38:08 +02:00
Tomasz Grabiec
69e2eccf68 allocating_strategy: Introduce alloc_strategy_unique_ptr<> 2017-05-22 19:30:02 +02:00
Avi Kivity
6d9e18fd61 logalloc: reduce descriptor overhead
Every lsa-allocated object is prefixed by a header that contains information
needed to free or migrate it.  This includes its size (for freeing) and
an 8-byte migrator (for migrating).  Together with some flags, the overhead
is 14 bytes (16 bytes if the default alignment is used).

This patch reduces the header size to 1 byte (8 bytes if the default alignment
is used).  It uses the following techniques:

 - ULEB128-like encoding (actually more like ULEB64) so a live object's header
   can typically be stored using 1 byte
 - indirection, so that migrators can be encoded in a small index pointing
   to a migrator table, rather than using an 8-byte pointer; this exploits
   the fact that only a small number of types are stored in LSA
 - moving the responsibility for determining an object's size to its
   migrator, rather than storing it in the header; this exploits the fact
   that the migrator stores type information, and object size is in fact
   information about the type
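
To illustrate the first technique, here is a textbook ULEB128-style variable-length integer codec. It shows why small values fit in a single byte; it is not the descriptor format used by logalloc, which the message itself describes only as "ULEB128-like".

```cpp
#include <cstddef>
#include <cstdint>

// Writes `value` in 7-bit groups, least significant group first. The high
// bit of each byte signals that another byte follows. `out` must have room
// for up to 10 bytes. Returns the number of bytes written.
inline std::size_t encode_varint(std::uint64_t value, std::uint8_t* out) {
    std::size_t n = 0;
    do {
        std::uint8_t byte = value & 0x7f;
        value >>= 7;
        if (value != 0) {
            byte |= 0x80;  // continuation bit
        }
        out[n++] = byte;
    } while (value != 0);
    return n;
}

// Reads a value written by encode_varint() and stores the number of bytes
// consumed in `*consumed`. Assumes well-formed input.
inline std::uint64_t decode_varint(const std::uint8_t* in, std::size_t* consumed) {
    std::uint64_t value = 0;
    unsigned shift = 0;
    std::size_t n = 0;
    std::uint8_t byte;
    do {
        byte = in[n++];
        value |= static_cast<std::uint64_t>(byte & 0x7f) << shift;
        shift += 7;
    } while (byte & 0x80);
    *consumed = n;
    return value;
}
```

Any value below 128 takes one byte, which matches the observation that a small migrator index can typically be stored in a 1-byte header.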

The patch improves the results of memory_footprint_test as follows:

Before:

 - in cache:     976
 - in memtable:  947

After:

mutation footprint:
 - in cache:     880
 - in memtable:  858

A reduction of about 10%.  Further reductions are possible by reducing the
alignment of lsa objects.

logalloc_test was adjusted to free more objects, since with the lower
footprint, rounding errors (to full segments) are different and caused
false errors to be detected.

Missing: adjustments to scylla-gdb.py; will be done after we agree on the
new descriptor's format.
2017-04-24 12:23:12 +02:00
Glauber Costa
28e3f2f6ee LSA: export information about object memory footprint
We allocate objects of a certain size, but we use a bit more memory to hold
them.  To get a clearer picture of how much memory an object will cost us, we
need help from the allocator. This patch exports an interface that allows users
to query a specific allocator to get that information.

Signed-off-by: Glauber Costa <glauber@scylladb.com>
2016-10-04 10:39:10 -04:00
Pekka Enberg
38a54df863 Fix pre-ScyllaDB copyright statements
People keep tripping over the old copyrights and copy-pasting them to
new files. Search and replace "Cloudius Systems" with "ScyllaDB".

Message-Id: <1460013664-25966-1-git-send-email-penberg@scylladb.com>
2016-04-08 08:12:47 +03:00
Avi Kivity
2437fc956c allocation_strategy: expose preferred allocation size limit
Our premier allocation_strategy, lsa, prefers to limit allocations below
a tenth of the segment size so they can be moved around; larger allocations
are pinned and can cause memory fragmentation.

Provide an API so that objects can query for this preferred size limit.

For now, lsa is not updated to expose its own limit; this will be done
after the full stack is updated to make use of the limit, or intermediate
steps will not work correctly.
2015-12-06 16:23:42 +02:00
Avi Kivity
f3afe3e876 allocation_strategy: constify migrate_fn
Since abstract_type will be providing our migrate_fn, they must be const,
and indeed a migration does not change the migration function.
2015-11-13 17:13:07 +02:00
Avi Kivity
16006949d0 logalloc: make migrator an object, not a function pointer
The migrator tells lsa how to move an object when it is compacted.
Currently it is a function pointer, which means we must know how to move
the object at compile time.  Making it an object allows us to build the
migration function at runtime, making it suitable for runtime-defined types
(such as tuples and user-defined types).

In the future, we may also store the size there for fixed-size types,
reducing lsa overhead.

C++ variable templates would have made this patch smaller, but unfortunately
they are only supported on gcc 5+.
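
The difference can be sketched as follows. `migrator`, `typed_migrator` and `bytes_migrator` are hypothetical names chosen for the example, and the interface shown is an assumption rather than the actual logalloc one.

```cpp
#include <cstddef>
#include <cstring>
#include <new>
#include <utility>

// A migrator object tells the allocator how to move one object from
// `src` to `dst` during compaction.
class migrator {
public:
    virtual ~migrator() = default;
    virtual void migrate(void* src, void* dst) const noexcept = 0;
};

// Compile-time case: the type is known, so move-construct and destroy.
template <typename T>
class typed_migrator final : public migrator {
public:
    void migrate(void* src, void* dst) const noexcept override {
        T* s = static_cast<T*>(src);
        new (dst) T(std::move(*s));
        s->~T();
    }
};

// Runtime case: the layout is known only at run time (here, just a byte
// count), which a plain function pointer could not capture.
class bytes_migrator final : public migrator {
    std::size_t _size;
public:
    explicit bytes_migrator(std::size_t size) noexcept : _size(size) {}
    void migrate(void* src, void* dst) const noexcept override {
        std::memcpy(dst, src, _size);
    }
};
```

Because `bytes_migrator` carries state, it can be built when a runtime-defined type is created, which is the flexibility the message is after.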
2015-10-21 11:24:56 +02:00
Avi Kivity
d5cf0fb2b1 Add license notices 2015-09-20 10:43:39 +03:00
Tomasz Grabiec
f404a238bb allocation_strategy: Make construct() exception-safe 2015-09-06 21:24:59 +02:00
Tomasz Grabiec
d20fae96a2 lsa: Make reclaimer run synchronously with allocations
The goal is to make allocation less likely to fail. With async
reclaimer there is an implicit bound on the amount of memory that can
be allocated between deferring points. This bound is difficult to
enforce though. Sync reclaimer lifts this limitation.

Also, allocations which could not be satisfied before because of
fragmentation now will have higher chances of succeeding, although
depending on how much memory is fragmented, that could involve
evicting a lot of segments from cache, so we should still avoid them.

The downside of sync reclaiming is that now references into regions may be
invalidated not only across deferring points but at any allocation
site. compaction_lock can be used to pin data, preferably just
temporarily.
2015-08-31 21:50:18 +02:00
Avi Kivity
932ddc328c logalloc: optimize current_allocation_strategy()
This heavily used function shows up in many places in the profile (as part
of other functions), so it's worth optimizing by eliminating the special
case for the standard allocator.  Use a statically allocated object instead.

(a non-thread-local object is fine since it has no data members).
2015-08-17 16:51:10 +03:00
Tomasz Grabiec
5dc58a7cd4 allocation_strategy: Leak the standard strategy
Some code may attempt to use it during finalization after "instance"
was destroyed.

Reported by Pekka:

/usr/include/c++/4.9.2/bits/unique_ptr.h:291:14: runtime error:
reference binding to null pointer of type 'struct
standard_allocation_strategy'
./utils/allocation_strategy.hh:105:13: runtime error: reference
binding to null pointer of type 'struct standard_allocation_strategy'
./utils/allocation_strategy.hh:118:35: runtime error: reference
binding to null pointer of type 'struct allocation_strategy'
./utils/managed_bytes.hh:59:45: runtime error: member call on null
pointer of type 'struct allocation_strategy'
./utils/allocation_strategy.hh:82:9: runtime error: member access
within null pointer of type 'struct allocation_strategy'
2015-08-07 18:35:20 +03:00
Tomasz Grabiec
581eceea41 utils: Introduce allocation_strategy 2015-08-06 12:52:43 +02:00