Commit Graph

15 Commits

Author SHA1 Message Date
Kefu Chai
7215d4bfe9 utils: do not include unused headers
these unused includes were identifier by clang-include-cleaner. after
auditing these source files, all of the reports have been confirmed.

please note, because quite a few source files relied on
`utils/to_string.hh` to pull in the specialization of
`fmt::formatter<std::optional<T>>`, after removing
`#include <fmt/std.h>` from `utils/to_string.hh`, we have to
include `fmt/std.h` directly.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2025-01-14 07:56:39 -05:00
Avi Kivity
f3eade2f62 treewide: relicense to ScyllaDB-Source-Available-1.0
Drop the AGPL license in favor of a source-available license.
See the blog post [1] for details.

[1] https://www.scylladb.com/2024/12/18/why-were-moving-to-a-source-available-license/
2024-12-18 17:45:13 +02:00
Avi Kivity
fcb8d040e8 treewide: use Software Package Data Exchange (SPDX) license identifiers
Instead of lengthy blurbs, switch to single-line, machine-readable
standardized (https://spdx.dev) license identifiers. The Linux kernel
switched long ago, so there is strong precedent.

Three cases are handled: AGPL-only, Apache-only, and dual licensed.
For the latter case, I chose (AGPL-3.0-or-later and Apache-2.0),
reasoning that our changes are extensive enough to apply our license.

The changes we applied mechanically with a script, except to
licenses/README.md.

Closes #9937
2022-01-18 12:15:18 +01:00
Pavel Solodovnikov
76bea23174 treewide: reduce header interdependencies
Use forward declarations wherever possible.

Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>

Closes #8813
2021-06-07 15:58:35 +03:00
Avi Kivity
a55b434a2b treewide: extent copyright statements to present day 2021-06-06 19:18:49 +03:00
Glauber Costa
2ba08178ca large_bitset: be more accurate with memory usage
We are slightly underestimating the amount of memory we use. Now that
the chunked vector can exports its internal memory usage we can use that
directly.

Signed-off-by: Glauber Costa <glauber@scylladb.com>
2018-05-15 11:22:21 -04:00
Glauber Costa
b2f9958071 large_bitset: use a chunked_vector internally and simplify API
save and load functions for the large_bitset were introduced by Avi with
d590e327c0.

In that commit, Avi says:

"... providing iterator-based load() and save() methods.  The methods
support partial load/save so that access to very large bitmaps can be
split over multiple tasks."

The only user of this interface is SSTables. And turns out we don't really
split the access like that. What we do instead is to create a chunked vector
and then pass its begin() method with position = 0 and let it write everything.

The problem here is that this require the chunked vector to be fully
initialized, not just reserved. If the bitmap is large enough that in itself
can take a long time without yielding (up to 16ms seen in my setup).

We can simplify things considerably by moving the large_bitset to use a
chunked vector internally: it already uses a poor man's version of it
by allocating chunks internally (it predates the chunked_vector).

By doing that, we can turn save() into a simple copy operation, and do
away with load altogether by adding a new constructor that will just
copy an existing chunked_vector.

Fixes #3341
Tests: unit (release)

Signed-off-by: Glauber Costa <glauber@scylladb.com>
Message-Id: <20180409234726.28219-1-glauber@scylladb.com>
2018-04-10 10:25:06 +03:00
Glauber Costa
7fd31088f2 large_bitset/bloom filter: add preemption points in loops
SSTables that contain many keys - a common case with small partitions in
long lived nodes - can generate filters that are quite large.

I have seen stalls over 80ms when reading a filter that was the result
of a 6h write load of very small keys after nodetool compact (filter was
in the 100s of MB)

Similar care should be taken when creating the filter, as if the
estimated number of partitions is big, the resulting large_bitset can be
quite big as well.

If we treat the i_filter.hh and large_bitset.hh interfaces as truly
generic, then maybe we should have an in_thread version along with a
common version. But the bloom filter is the only user for both and even
if that changes in the future, it is still a good idea to run something
with a massive loop in a thread.

So for simplicity, I am just asserting that we are on a thread to avoid
surprises, and inserting preemption points in the loops.

Signed-off-by: Glauber Costa <glauber@scylladb.com>
2018-03-15 12:24:15 -04:00
Pekka Enberg
38a54df863 Fix pre-ScyllaDB copyright statements
People keep tripping over the old copyrights and copy-pasting them to
new files. Search and replace "Cloudius Systems" with "ScyllaDB".

Message-Id: <1460013664-25966-1-git-send-email-penberg@scylladb.com>
2016-04-08 08:12:47 +03:00
Glauber Costa
cbedd9ee41 Export Bloom Filter's memory size
Do it so we can estimate how much memory it is being used by the filters. This
estimate is not 100 % correct: the implementation of the bloom_filter class
uses a thread-local variable that is common to all filters. We won't include
that in the estimate. But aside from that, it should be quite accurate.

Signed-off-by: Glauber Costa <glommer@scylladb.com>
2015-09-28 16:43:06 +02:00
Avi Kivity
d5cf0fb2b1 Add license notices 2015-09-20 10:43:39 +03:00
Tomasz Grabiec
a4536c3186 utils/large_bitset: Fix buffer overflow in load()/save()
Also fixes https://github.com/cloudius-systems/seastar/issues/54

==5658==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x6250006b7848 at pc 0x1413e02 bp 0x7fff7cd7f1e0 sp 0x7fff7cd7f1d8
WRITE of size 8 at 0x6250006b7848 thread T0
    #0 0x1413e01 in unsigned long* std::__copy_move<false, false, std::random_access_iterator_tag>::__copy_m<std::_Deque_iterator<unsigned long, unsigned long&, unsigned long*>, unsigned long*>(std::_Deque_iterator<unsigned long, unsigned long&, unsigned long*>, std::_Deque_iterator<unsigned long, unsigned long&, unsigned long*>, unsigned long*) /usr/include/c++/4.9/bits/stl_algobase.h:336
    #1 0x1413c59 in unsigned long* std::__copy_move_a<false, std::_Deque_iterator<unsigned long, unsigned long&, unsigned long*>, unsigned long*>(std::_Deque_iterator<unsigned long, unsigned long&, unsigned long*>, std::_Deque_iterator<unsigned long, unsigned long&, unsigned long*>, unsigned long*) /usr/include/c++/4.9/bits/stl_algobase.h:396
    #2 0x1413aea in unsigned long* std::__copy_move_a2<false, std::_Deque_iterator<unsigned long, unsigned long&, unsigned long*>, unsigned long*>(std::_Deque_iterator<unsigned long, unsigned long&, unsigned long*>, std::_Deque_iterator<unsigned long, unsigned long&, unsigned long*>, unsigned long*) /usr/include/c++/4.9/bits/stl_algobase.h:434
    #3 0x14138df in unsigned long* std::copy<std::_Deque_iterator<unsigned long, unsigned long&, unsigned long*>, unsigned long*>(std::_Deque_iterator<unsigned long, unsigned long&, unsigned long*>, std::_Deque_iterator<unsigned long, unsigned long&, unsigned long*>, unsigned long*) /usr/include/c++/4.9/bits/stl_algobase.h:466
    #4 0x1413545 in unsigned long* std::__copy_n<std::_Deque_iterator<unsigned long, unsigned long&, unsigned long*>, unsigned long, unsigned long*>(std::_Deque_iterator<unsigned long, unsigned long&, unsigned long*>, unsigned long, unsigned long*, std::random_access_iterator_tag) /usr/include/c++/4.9/bits/stl_algo.h:779
    #5 0x1412d44 in unsigned long* std::copy_n<std::_Deque_iterator<unsigned long, unsigned long&, unsigned long*>, unsigned long, unsigned long*>(std::_Deque_iterator<unsigned long, unsigned long&, unsigned long*>, unsigned long, unsigned long*) /usr/include/c++/4.9/bits/stl_algo.h:804
    #6 0x14112b3 in unsigned long large_bitset::load<std::_Deque_iterator<unsigned long, unsigned long&, unsigned long*> >(std::_Deque_iterator<unsigned long, unsigned long&, unsigned long*>, std::_Deque_iterator<unsigned long, unsigned long&, unsigned long*>, unsigned long) utils/large_bitset.hh:81
    #7 0x13fcfc9 in _ZZZN8sstables7sstable11read_filterEvENKUlRT_E_clINS_6filterEEEDaS2_ENKUlvE_clEv (/home/tgrabiec/src/urchin/build/debug/scylla+0x13fcfc9)
    #8 0x1400a50 in apply /home/tgrabiec/src/urchin/seastar/core/apply.hh:34
    #9 0x1400afb in apply<sstables::sstable::read_filter()::<lambda(auto:25&)> [with auto:25 = sstables::filter]::<lambda()> > /home/tgrabiec/src/urchin/seastar/core/apply.hh:42
    #10 0x1400bb2 in apply<sstables::sstable::read_filter()::<lambda(auto:25&)> [with auto:25 = sstables::filter]::<lambda()> > /home/tgrabiec/src/urchin/seastar/core/future.hh:1062
    #11 0x140f1b7 in _ZZN6futureIIEE4thenIZZN8sstables7sstable11read_filterEvENKUlRT_E_clINS2_6filterEEEDaS5_EUlvE_S0_EET0_OT_ENUlOS4_E_clI12future_stateIIEEEEDaSC_ (/home/tgrabiec/src/urchin/build/debug/scylla+0x140f1b7)
    #12 0x140f350 in run /home/tgrabiec/src/urchin/seastar/core/future.hh:359
    #13 0x426e2c in reactor::run_tasks(circular_buffer<std::unique_ptr<task, std::default_delete<task> >, std::allocator<std::unique_ptr<task, std::default_delete<task> > > >&, unsigned long) core/reactor.cc:1093
    #14 0x429cb1 in reactor::run() core/reactor.cc:1190
    #15 0x72bc69 in app_template::run_deprecated(int, char**, std::function<void ()>&&) core/app-template.cc:122
    #16 0xa119bc in main /home/tgrabiec/src/urchin/main.cc:279
    #17 0x7ffc1b6beec4 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x21ec4)
    #18 0x412558 (/home/tgrabiec/src/urchin/build/debug/scylla+0x412558)

0x6250006b7848 is located 0 bytes to the right of 8008-byte region [0x6250006b5900,0x6250006b7848)
allocated by thread T0 here:
    #0 0x7ffc1cf6c7df in __interceptor_malloc (/usr/lib/x86_64-linux-gnu/libasan.so.1+0x547df)
    #1 0x7ffc204eef17 in operator new(unsigned long) (/usr/lib/x86_64-linux-gnu/libstdc++.so.6+0x8df17)
    #2 0xfa5d4f in large_bitset::large_bitset(unsigned long) utils/large_bitset.cc:15
    #3 0x13fcec6 in _ZZZN8sstables7sstable11read_filterEvENKUlRT_E_clINS_6filterEEEDaS2_ENKUlvE_clEv (/home/tgrabiec/src/urchin/build/debug/scylla+0x13fcec6)
    #4 0x1400a50 in apply /home/tgrabiec/src/urchin/seastar/core/apply.hh:34
    #5 0x1400afb in apply<sstables::sstable::read_filter()::<lambda(auto:25&)> [with auto:25 = sstables::filter]::<lambda()> > /home/tgrabiec/src/urchin/seastar/core/apply.hh:42
    #6 0x1400bb2 in apply<sstables::sstable::read_filter()::<lambda(auto:25&)> [with auto:25 = sstables::filter]::<lambda()> > /home/tgrabiec/src/urchin/seastar/core/future.hh:1062
    #7 0x140f1b7 in _ZZN6futureIIEE4thenIZZN8sstables7sstable11read_filterEvENKUlRT_E_clINS2_6filterEEEDaS5_EUlvE_S0_EET0_OT_ENUlOS4_E_clI12future_stateIIEEEEDaSC_ (/home/tgrabiec/src/urchin/build/debug/scylla+0x140f1b7)
    #8 0x140f350 in run /home/tgrabiec/src/urchin/seastar/core/future.hh:359
    #9 0x426e2c in reactor::run_tasks(circular_buffer<std::unique_ptr<task, std::default_delete<task> >, std::allocator<std::unique_ptr<task, std::default_delete<task> > > >&, unsigned long) core/reactor.cc:1093
    #10 0x429cb1 in reactor::run() core/reactor.cc:1190
    #11 0x72bc69 in app_template::run_deprecated(int, char**, std::function<void ()>&&) core/app-template.cc:122
    #12 0xa119bc in main /home/tgrabiec/src/urchin/main.cc:279
    #13 0x7ffc1b6beec4 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x21ec4)

SUMMARY: AddressSanitizer: heap-buffer-overflow /usr/include/c++/4.9/bits/stl_algobase.h:336 unsigned long* std::__copy_move<false, false, std::random_access_iterator_tag>::__copy_m<std::_Deque_iterator<unsigned long, unsigned long&, unsigned long*>, unsigned long*>(std::_Deque_iterator<unsigned long, unsigned long&, unsigned long*>, std::_Deque_iterator<unsigned long, unsigned long&, unsigned long*>, unsigned long*)
Shadow bytes around the buggy address:
  0x0c4a800ceeb0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c4a800ceec0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c4a800ceed0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c4a800ceee0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c4a800ceef0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
=>0x0c4a800cef00: 00 00 00 00 00 00 00 00 00[fa]fa fa fa fa fa fa
  0x0c4a800cef10: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c4a800cef20: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c4a800cef30: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c4a800cef40: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c4a800cef50: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07
  Heap left redzone:       fa
  Heap right redzone:      fb
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack partial redzone:   f4
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Contiguous container OOB:fc
  ASan internal:           fe
==5658==ABORTING
2015-09-10 09:33:59 +03:00
Tomasz Grabiec
3b7dfbcc85 Fix whitespace errors 2015-09-08 14:10:56 +02:00
Avi Kivity
d590e327c0 large_bitmap: support for loading and saving the bitmap to raw ints
Single-bit accessors are very slow, especially because we don't support
setting a bit to a value (just set to 1 and clear to 0).  This causes
loading and retrieving the contents of a bitmap to be painfully slow.

Fix by providing iterator-based load() and save() methods.  The methods
support partial load/save so that access to very large bitmaps can be
split over multiple tasks.
2015-09-08 14:09:59 +02:00
Avi Kivity
e928bcaf19 utils: introduce large_bitset
Like boost::dynamic_bitset, but less capable.  On the other hand it avoids
very large allocations, which are incurred by the bloom filter's bitset
on even moderately sized sstables.
2015-08-23 12:22:49 +03:00