Commit Graph

1754 Commits

Author SHA1 Message Date
Kefu Chai
519b4a2934 utils/s3: include used header
when building the tree with clang-19 and libstdc++ shipped along with
GCC 14.2.1, we have

```
clang++ -MD -MT build/release/utils/s3/aws_error.o -MF build/release/utils/s3/aws_error.o.d -std=c++23 -I/home/kefu/dev/scylladb/master/seastar/include -I/home/kefu/dev/scylladb/master/build/release/seastar/gen/include -Werror=unused-result -DSEASTAR_API_LEVEL=7 -DSEASTAR_SSTRING -DSEASTAR_LOGGER_COMPILE_TIME_FMT -DSEASTAR_SCHEDULING_GROUPS_COUNT=16 -DSEASTAR_LOGGER_TYPE_STDOUT -DFMT_SHARED -I/usr/include/p11-kit-1 -DWITH_GZFILEOP -ffile-prefix-map=/home/kefu/dev/scylladb/master=. -march=westmere -ffunction-sections -fdata-sections  -O3 -mllvm -inline-threshold=2500 -fno-slp-vectorize -DSCYLLA_BUILD_MODE=release -g -gz -Xclang -fexperimental-assignment-tracking=disabled -iquote. -iquote build/release/gen -std=gnu++23  -ffile-prefix-map=/home/kefu/dev/scylladb/master=. -march=westmere -DBOOST_ALL_DYN_LINK    -fvisibility=hidden -isystem abseil -Wall -Werror -Wextra -Wimplicit-fallthrough -Wno-mismatched-tags -Wno-c++11-narrowing -Wno-overloaded-virtual -Wno-unused-parameter -Wno-unsupported-friend -Wno-missing-field-initializers -Wno-deprecated-copy -Wno-psabi -Wno-error=deprecated-declarations -DXXH_PRIVATE_API -DSEASTAR_TESTING_MAIN  -c -o build/release/utils/s3/aws_error.o utils/s3/aws_error.cc
utils/s3/aws_error.cc:33:21: error: no member named 'make_unique' in namespace 'std'
   33 |     auto doc = std::make_unique<rapidxml::xml_document<>>();
      |                ~~~~~^
utils/s3/aws_error.cc:33:57: error: expected '(' for function-style cast or type construction
   33 |     auto doc = std::make_unique<rapidxml::xml_document<>>();
      |                                 ~~~~~~~~~~~~~~~~~~~~~~~~^
utils/s3/aws_error.cc:33:59: error: expected expression
   33 |     auto doc = std::make_unique<rapidxml::xml_document<>>();
      |                                                           ^
3 errors generated.
ninja: build stopped: subcommand failed.
```

in order to address the build failure, let's include the used header.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#21064
2024-10-13 18:32:34 +03:00
Pavel Emelyanov
7163fbcef5 Merge 'utils: replace dependency on boost ranges with <ranges>' from Avi Kivity
To avoid depending on two similar libraries (boost ranges and std \<ranges), replace
uses of the former with the latter. This series tackles the utils/ directory.

Code cleanup, no backport.

Closes scylladb/scylladb#20997

* github.com:scylladb/scylladb:
  utils: logalloc: replace boost with std
  utils: lsa: chunked_managed_vector: replace boost with std
  utils: config_file: replace boost with std
  utils: loading_cache: replace boost with std
  utils: fragment_range: replace boost with std
  utils: error_injector: replace boost with std
  utils: crc: replace boost for_each with built-in range for
  utils: class_registrator: replace boost with std
  utils: chunked_vector: replace boost with std
  utils: observable: replace boost with std
2024-10-09 16:04:48 +03:00
Pavel Emelyanov
17ec416178 Merge 'Make sure S3 upload completion parses possible error' from Ernest Zaslavsky
fixes #20517
Adds `aws_error` which possibly can contain errors from the S3 response body. Adds to the multipart upload completion a check for possible error and issues a retry if the error is retryable

Closes scylladb/scylladb#20518

* github.com:scylladb/scylladb:
  test: add complete_multipart_upload completion tests
  code: s3 client error handling
  code: add response parsing and error handling to the complete_multipart_upload
  code: Introduce AWS errors parsing
2024-10-09 12:01:27 +03:00
Avi Kivity
656dc438ab utils: logalloc: replace boost with std 2024-10-08 12:07:14 +03:00
Avi Kivity
84b25a51f5 utils: lsa: chunked_managed_vector: replace boost with std 2024-10-08 12:03:30 +03:00
Avi Kivity
fa772701be utils: config_file: replace boost with std 2024-10-08 12:03:15 +03:00
Avi Kivity
b62fadae5f utils: loading_cache: replace boost with std
Unfortunately, the replacement for boost::range::join(),
std::views::concat(), is in C++26 (and not implemented in libstdc++ 14).
We use array/transform/join to simulate it.
2024-10-08 11:54:34 +03:00
Avi Kivity
72a39b84b0 utils: fragment_range: replace boost with std 2024-10-07 21:32:16 +03:00
Avi Kivity
c8a68c4cf7 utils: error_injector: replace boost with std 2024-10-07 21:28:36 +03:00
Avi Kivity
c560686d92 utils: crc: replace boost for_each with built-in range for
Simpler.
2024-10-07 21:19:14 +03:00
Avi Kivity
44419fc5ec utils: class_registrator: replace boost with std 2024-10-07 21:16:03 +03:00
Avi Kivity
adb92a6c16 utils: chunked_vector: replace boost with std 2024-10-07 21:11:23 +03:00
Avi Kivity
b259389a3e utils: observable: replace boost with std 2024-10-07 21:11:07 +03:00
Avi Kivity
d12ba753e0 utils/unconst, mutation_partition: switch to ranges
unconst is a small help that converts a const iterator to a non-const
iterator with the help of the container. Currently it is using the
boost iterator/range libraries.

Convert it to <ranges> as part of an effort to standardize on a single
range library. Its only user in mutation_partition is converted as well.

Due to more iteroperability problems between <range> and boost, some
calls to boost::adaptors::reversed have to be converted as well.
2024-10-07 17:30:12 +03:00
Avi Kivity
75f4ea1b68 utils: intrusive_btree: improve conformity with iterator requirements
The <ranges> library checks that an iterator's operator++() returns
a reference to the same type. intrusive_btree's iterator do not; instead
they return some base type and rely on implicit conversion to the real
iterator type. This causes interoperatibility problems with <range>.

Fix by using the CRTP pattern to inform iterator_base about what
type we really are, and cast to it. Enforce it with static_assert.

Note we can't static_assert in class scope since it is checked too
early and fails. Checking in function scope delays the check.
2024-10-07 17:26:01 +03:00
Pavel Emelyanov
8ccb4a1045 Merge 'db: remove unused includes ' from Kefu Chai
these unused includes are identified by clang-include-cleaner.
after auditing the source files, all of the reports have been
confirmed.

please note, since we have `using seastar::shared_ptr` in
`seastarx.h`, this renders `#include <seastar/core/shared_ptr.hh>`
unnecessary if we don't need the full definition of `seastar::shared_ptr`.

so, in this change, all the unused includes are removed.

---

it's a cleanup, hence no need to backport.

Closes scylladb/scylladb#20963

* github.com:scylladb/scylladb:
  .github: add db to iwyu's CLEANER_DIR
  db: remove unused includes
2024-10-07 10:55:48 +03:00
Avi Kivity
946bb870f3 utils: hashers: include <memory>
hashers.hh uses std::unique_ptr, so include its header.

Closes scylladb/scylladb#20974
2024-10-07 10:52:36 +03:00
Kefu Chai
960aa38cf3 utils/i_filter: include used header
when compiling with clang-19 and the standard library from GCC-14.2,
we have:

```
/usr/bin/cmake -E __run_co_compile --tidy="clang-tidy;--checks=-*,bugprone-use-after-move;--extra-arg-before=--driver-mode=g++" --source=/__w/scylladb/scylladb/utils/bloom_filter.cc -- /usr/bin/clang++ -DBOOST_REGEX_DYN_LINK -DBOOST_REGEX_NO_LIB -DFMT_SHARED -DSCYLLA_BUILD_MODE=release -DSEASTAR_API_LEVEL=7 -DSEASTAR_LOGGER_COMPILE_TIME_FMT -DSEASTAR_LOGGER_TYPE_STDOUT -DSEASTAR_SCHEDULING_GROUPS_COUNT=16 -DSEASTAR_SSTRING -DXXH_PRIVATE_API -I/__w/scylladb/scylladb -I/__w/scylladb/scylladb/seastar/include -I/__w/scylladb/scylladb/build/seastar/gen/include -I/__w/scylladb/scylladb/build/seastar/gen/src -ffunction-sections -fdata-sections -O3 -g -gz -std=gnu++23 -fvisibility=hidden -Wall -Werror -Wextra -Wno-error=deprecated-declarations -Wimplicit-fallthrough -Wno-c++11-narrowing -Wno-deprecated-copy -Wno-mismatched-tags -Wno-missing-field-initializers -Wno-overloaded-virtual -Wno-unsupported-friend -Wno-enum-constexpr-conversion -Wno-unused-parameter -ffile-prefix-map=/__w/scylladb/scylladb/build=. -march=wes
Error: /__w/scylladb/scylladb/utils/bloom_filter.cc:81:1: error: unknown type name 'filter_ptr' [clang-diagnostic-error]
   81 | filter_ptr create_filter(int hash, large_bitset&& bitset, filter_format format) {
      | ^
Error: /__w/scylladb/scylladb/utils/bloom_filter.cc:82:12: error: no viable conversion from returned value of type '__detail::__unique_ptr_t<murmur3_bloom_filter>' (aka 'unique_ptr<utils::filter::murmur3_bloom_filter>') to function return type 'int' [clang-diagnostic-error]
   82 |     return std::make_unique<murmur3_bloom_filter>(hash, std::move(bitset), format);
      |            ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Error: /__w/scylladb/scylladb/utils/bloom_filter.cc:85:1: error: unknown type name 'filter_ptr' [clang-diagnostic-error]
   85 | filter_ptr create_filter(int hash, int64_t num_elements, int buckets_per, filter_format format) {
      | ^
Error: /__w/scylladb/scylladb/utils/bloom_filter.cc:86:12: error: no viable conversion from returned value of type '__detail::__unique_ptr_t<murmur3_bloom_filter>' (aka 'unique_ptr<utils::filter::murmur3_bloom_filter>') to function return type 'int' [clang-diagnostic-error]
   86 |     return std::make_unique<murmur3_bloom_filter>(hash, large_bitset(get_bitset_size(num_elements, buckets_per)), format);
      |            ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Error: /__w/scylladb/scylladb/utils/bloom_filter.hh:93:1: error: unknown type name 'filter_ptr' [clang-diagnostic-error]
   93 | filter_ptr create_filter(int hash, large_bitset&& bitset, filter_format format);
      | ^
Error: /__w/scylladb/scylladb/utils/bloom_filter.hh:94:1: error: unknown type name 'filter_ptr' [clang-diagnostic-error]
   94 | filter_ptr create_filter(int hash, int64_t num_elements, int buckets_per, filter_format format);
      | ^
Error: /__w/scylladb/scylladb/utils/i_filter.hh:17:25: error: no template named 'unique_ptr' in namespace 'std' [clang-diagnostic-error]
   17 | using filter_ptr = std::unique_ptr<i_filter>;
      |                    ~~~~~^
Error: /__w/scylladb/scylladb/utils/i_filter.hh:54:12: error: unknown type name 'filter_ptr' [clang-diagnostic-error]
   54 |     static filter_ptr get_filter(int64_t num_elements, double max_false_pos_prob, filter_format format);
      |            ^
4 warnings and 8 errors generated.
```

apparently, the definition of `std::unique_ptr` is missing where it is
used. so let's include `<memory>`, so that `i_filter.hh` is more
self-contained.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#20971
2024-10-06 14:20:41 +03:00
Michał Chojnowski
5884c9d2fc utils/rjson.cc: correct a comment about assert()
Commit aa1270a00c changed most uses
of `assert` in the codebase to `SCYLLA_ASSERT`.

But the comment fixed in this patch is talking specifically about
`assert`, and shouldn't have been changed. It doesn't make sense
after the change.

Closes scylladb/scylladb#20967
2024-10-06 12:47:51 +03:00
Michał Chojnowski
882a3c60e4 utils/cached_file: reduce latency (and increase overhead) of partially-cached reads
Currently, `cached_file::stream` (currently used only by index_reader,
to read index pages), works as follows.

Assume that the caller requested a read of the range [pos, pos + size).
Then:

- If the first page of the requested range is uncached,
  the entire [pos, pos + size) range is read from disk (even if some
  later pieces of it are cached), the resulting pages are added to the cache,
  and the read completes (most likely) from the cached pages.
- If the first page of the read is cached, then the rest of the read
  is handled page-by-page, in a sequential loop, serving each page
  either from cache (if present) or from disk.

For example, assume that pages 0, 1, 2, 3, 4 are requested.

If exactly pages 1, 2 are cached, then `stream` will read the entire [0, 4] range
from disk and insert the missing 0, 3, 4, and then it will continue serving the
read from cache.

If exactly pages 0 and 3 are cached, then it will serve 0 from cache,
then it will read 1 from disk and insert it into cache,
then it will read 2 from disk and insert it into cache,
then it will serve 3 from cache,
then it will read 4 from disk and insert it into cache.

If exactly the first page is cached, a 128 kiB read turns
into 31 I/O sequential read ops.

This is weird, and doesn't look intended. In one case, we are reading even pages
we already have, just to avoid fragmenting the read, and in the other case
we are reading pages one-by-one (sequentially!) even if they are neighbours.

I'm not sure if cached_file should minimize IOPS or byte throughput,
but the current state is surely suboptimal. Even if its read strategy
is somehow optimal, it should still at least coalesce contiguous reads
and perform the non-contiguous reads in parallel.

This patch leans into minimizing IOPS. After the patch, we serve
as many front pages from the cache as we can, but when we see
an uncached page, we read the entire remainder of the read from disk.

As if we trimmed the read request by the longest cached prefix,
and then performed the rest using the logic from before the patch.

For example, if exactly pages 0 and 3 are cached,
then we serve 0 from cache,
then we read [1, 4] from disk and insert everything into cache.

For partially-cached files, this will result in more bytes read
from disk, but less IOPS. This might be a bad thing. But if so,
then we should lean the other way in a more explicit and efficient
way than we currently do.

Closes scylladb/scylladb#20935
2024-10-04 17:39:38 +02:00
Kefu Chai
ee36358a60 db: remove unused includes
these unused includes are identified by clang-include-cleaner.
after auditing the source files, all of the reports have been
confirmed.

please note, since we have `using seastar::shared_ptr` in
`seastarx.h`, this renders `#include <seastar/core/shared_ptr.hh>`
unnecessary if we don't need the full definition of `seastar::shared_ptr`.

so, in this change, all the unused includes are removed. but there are
some headers which are actually used, while still being identified by
this tool. these includes are marked with "IWYU pragma: keep".

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2024-10-04 20:48:18 +08:00
Avi Kivity
e99426df60 treewide: de-static namespace scope functions in headers
'static inline' is always wrong in headers - if the same header is
included multiple times, and the function happens not to be inlined,
then multiple copies of it will be generated.

Fix by mechanically changing '^static inline' to 'inline'.
2024-10-01 14:02:50 +03:00
Ernest Zaslavsky
3be6052786 code: s3 client error handling
Handle the `finalize_upload` possible exception to abort the upload (which also can throw) and show the right error originated from the `finalize_upload`
2024-10-01 09:06:24 +03:00
Ernest Zaslavsky
6be2433b5a code: add response parsing and error handling to the complete_multipart_upload
Instead of ignoring the response for multipart upload completion start parsing it and look for a possible errors in the response body. If the error is found throw an exception
2024-10-01 09:06:24 +03:00
Ernest Zaslavsky
826cf5cd4a code: Introduce AWS errors parsing
Add a simple utility class to parse (possible) error response from AWS S3. Stay as close as possible to aws-sdk-cpp ErrorMarshaler https://github.com/aws/aws-sdk-cpp/blob/main/src/aws-cpp-sdk-core/source/client/AWSErrorMarshaller.cpp logic
Also, add a tester for this new class
2024-10-01 09:06:24 +03:00
Avi Kivity
fb8743b2d6 Merge 'sstables: Fix use-after-free on page cache buffer when parsing promoted index entries across pages' from Tomasz Grabiec
This fixes a use-after-free bug when parsing clustering key across
pages.

Also includes a fix for allocating section retry, which is potentially not safe (not in practice yet).

Details of the first problem:

Clustering key index lookup is based on the index file page cache. We
do a binary search within the index, which involves parsing index
blocks touched by the algorithm. Index file pages are 4 KB chunks
which are stored in LSA.

To parse the first key of the block, we reuse clustering_parser, which
is also used when parsing the data file. The parser is stateful and
accepts consecutive chunks as temporary_buffers. The parser is
supposed to keep its state across chunks.

In 93482439, the promoted index cursor was optimized to avoid
fully page copy when parsing index blocks. Instead, parser is
given a temporary_buffer which is a view on the page.

A bit earlier, in b1b5bda, the parser was changed to keep shared
fragments of the buffer passed to the parser in its internal state (across pages)
rather than copy the fragments into a new buffer. This is problematic
when buffers come from page cache because LSA buffers may be moved
around or evicted. So the temporary_buffer which is a view on the LSA
buffer is valid only around the duration of a single consume() call to
the parser.

If the blob which is parsed (e.g. variable-length clustering key
component) spans pages, the fragments stored in the parser may be
invalidated before the component is fully parsed. As a result, the
parsed clustering key may have incorrect component values. This never
causes parsing errors because the "length" field is always parsed from
the current buffer, which is valid, and component parsing will end at
the right place in the next (valid) buffer.

The problematic path for clustering_key parsing is the one which calls
primitive_consumer::read_bytes(), which is called for example for text
components. Fixed-size components are not parsed like this, they store
the intermediate state by copying data.

This may cause incorrect clustering keys to be parsed when doing
binary search in the index, diverting the search to an incorrect
block.

Details of the solution:

We adapt page_view to a temporary_buffer-like API. For this, a new concept
is introduced called ContiguousSharedBuffer. We also change parsers so that
they can be templated on the type of the buffer they work with (page_view vs
temporary_buffer). This way we don't introduce indirection to existing algorithms.

We use page_view instead of temporary_buffer in the promoted
index parser which works with page cache buffers. page_view can be safely
shared via share() and stored across allocating sections. It keeps hold to the
LSA buffer even across allocating sections by the means of cached_file::page_ptr.

Fixes #20766

Closes scylladb/scylladb#20837

* github.com:scylladb/scylladb:
  sstables: bsearch_clustered_cursor: Add trace-level logging
  sstables: bsearch_clustered_cursor: Move definitions out of line
  test, sstables: Verify parsing stability when allocating section is retried
  test, sstables: Verify parsing stability when buffers cross page boundary
  sstables: bsearch_clustered_cursor: Switch parsers to work with page_view
  cached_file: Adapt page_view to ContiguousSharedBuffer
  cached_file: Change meaning of page_view::_size to be relative to _offset rather than page start
  sstables, utils: Allow parsers to work with different buffer types
  sstables: promoted_index_block_parser: Make reset() always bring parser to initial state
  sstables: bsearch_clustered_cursor: Switch read_block_offset() to use the read() method
  sstables: bsearch_clustered_cursor: Fix parsing when allocating section is retried
2024-10-01 00:02:55 +03:00
Kefu Chai
faec71e666 directories: mark verification_error() with [[noreturn]]
this helps the compiler or static analyzers do make the right decision.
for instance, clang-tidy thinks a parameter like `std::move(path)`
could be reused after being moved away. with this attribute, this tool
should be able to tell that this never happens.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2024-09-30 12:07:15 +08:00
Kefu Chai
0ef72475fc directories: pass const ref of path to verification_error()
before this change, we pass a `path` to `verification_error()` by
moving away from the original `path`. this works fine in the sense
that it is correct and does not incur potential performance issues.

but clang-tidy considers it a used-after-move, because it cannot tell
`verification_error()` does not return at all, and believes that `path`
could be accessed again after being moved away. so it warns like:

```
Warning: /__w/scylladb/scylladb/utils/directories.cc:132:52: warning: 'path' used after it was moved [bugprone-use-after-move]
  132 |         bool can_access = co_await file_accessible(path.string(), access_flags::read | access_flags::write | access_flags::execute);
      |                                                    ^
/__w/scylladb/scylladb/utils/directories.cc:121:28: note: move occurred here
  121 |         verification_error(std::move(path), "File not owned by current euid: {}. Owner is: {}", geteuid(), sd.uid);
      |                            ^
```

in this change, instead of passing `fs::path` to `verification_error()`,
we pass a `const fs::path&` to this function. because
`verification_error()` is not coroutine, neither does it not pass `path` to
another continuation to be scheduled. so it's perfectly fine to pass
`path` to it.

this change address the false alarms from clang-tidy.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2024-09-30 12:07:15 +08:00
Tomasz Grabiec
c09fa0cb98 test, sstables: Verify parsing stability when buffers cross page boundary 2024-09-27 01:25:15 +02:00
Tomasz Grabiec
7670ee701a sstables: bsearch_clustered_cursor: Switch parsers to work with page_view
This fixes a use-after-free bug when parsing clustering key across
pages.

Clustering key index lookup is based on the index file page cache. We
do a binary search within the index, which involves parsing index
blocks touched by the algorithm. Index file pages are 4 KB chunks
which are stored in LSA.

To parse the first key of the block, we reuse clustering_parser, which
is also used when parsing the data file. The parser is stateful and
accepts consecutive chunks as temporary_buffers. The parser is
supposed to keep its state across chunks.

In b1b5bda, the parser was changed to keep shared fragments of the
buffer passed to the parser in its internal state (across pages)
rather than copy the fragments into a new buffer. This is problematic
when buffers come from page cache because LSA buffers may be moved
around or evicted. So the temporary_buffer which is a view on the LSA
buffer is valid only around the duration of a single consume() call to
the parser.

If the blob which is parsed (e.g. variable-length clustering key
component) spans pages, the fragments stored in the parser may be
invalidated before the component is fully parsed. As a result, the
parsed clustering key may have incorrect component values. This never
causes parsing errors because the "length" field is always parsed from
the current buffer, which is valid, and component parsing will end at
the right place in the next (valid) buffer.

The problematic path for clustering_key parsing is the one which calls
primitive_consumer::read_bytes(), which is called for example for text
components. Fixed-size components are not parsed like this, they store
the intermediate state by copying data.

This may cause incorrect clustering keys to be parsed when doing
binary search in the index, diverting the search to an incorrect
block.

The solution is to use page_view instead of temporary_buffer, which
can be safely shared via share() and stored across allocating
section. The page_view maintains its hold to the LSA buffer even
across allocating sections.

Fixes #20766
2024-09-27 01:25:15 +02:00
Tomasz Grabiec
c15145b71d cached_file: Adapt page_view to ContiguousSharedBuffer 2024-09-27 01:25:15 +02:00
Tomasz Grabiec
29498a97ae cached_file: Change meaning of page_view::_size to be relative to _offset rather than page start
Will be easier to implement ContiguousSharedBuffer API as the buffer
size will be equal to _size.
2024-09-27 01:25:15 +02:00
Tomasz Grabiec
c0fa49bab5 sstables, utils: Allow parsers to work with different buffer types
Currently, parsers work with temporary_buffer<char>. This is unsafe
when invoked by bsearch_clustered_cursor, which reuses some of the
parsers, and passes temporary_buffer<char> which is a view onto LSA
buffer which comes from the index file page cache. This view is stable
only around consume(). If parsing requires more than one page, it will
continue with a different input buffer. The old buffer will be
invalid, and it's unsafe for the parser to store and access
it. Unfortunetly, the temporary_buffer API allows sharing the buffer
via the share() method, which shares the underlying memory area. This
is not correct when the underlying is managed by LSA, because storage
may move. Parser uses this sharing when parsing blobs, e.g. clustering
key components. When parsing resumes in the next page, parser will try
to access the stored shared buffers pointing to the previous page,
which may result in use-after-free on the memory area.

In prearation for fixing the problem, parametrize parsers to work with
different kinds of buffers. This will allow us to instantiate them
with a buffer kind which supports sharing of LSA buffers properly in a
safe way.

It's not purely mechanical work. Some parts of the parsing state
machine still works with temporary_buffer<char>, and allocate buffers
internally, when reading into linearized destination buffer. They used
to store this destination in _read_bytes vector, same field which is
used to store the shared buffers. Now it's not possible, since shared
buffer type may be different than temporary_buffer<char>. So those
paths were changed to use a new field: _read_bytes_buf.
2024-09-27 01:24:54 +02:00
Kefu Chai
d5b348460f config: do not provide default value for set_value() and friends
before this change, `config_file::set_value()` and
`config_file::set_value_on_all_shards()` provide default value for
`config_source`. but the default value is never used -- we alway
specify the `source_source` when calling `set_value_on_all_shards()`.

so in hope to improve the readability, the default value is removed.
so, for example, one can figure out when `config_source::Internal` is
used with less efforts. despite that `config_file::set_value()` is not
used in the tree. for the sake of completeness, its default value is
also dropped.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#20728
2024-09-25 15:45:42 +03:00
Kefu Chai
cb1670b79b Update seastar submodule
* seastar ec5da7a6...69f88e2f (38):
  > build: s/Sanitizers_COMPILER_OPTIONS/Sanitizers_COMPILE_OPTIONS
  > test: Update httpd test with request/reply body writing sugar
  > http: Add sugar to request and response body writers
  > utils: Add util::write_to_stream() helper
  > seastar-addr2line: adjust llvm termination regex
  > README.md: add Crimson project
  > rpc: conditionally use fmt::runtime() based on SEASTAR_LOGGER_COMPILE_TIME_FMT
  > build: check the combination of Sanitizers
  > tls: clear session ticket before releasing
  > print: remove dead code
  > doc/lambda-coroutine-fiasco: reword for better readability
  > rpc: fix compilation error caused by fmt::runtime()
  > tutorial: explain the use case of rethrow_exception and coroutine::exception
  > reactor: print more informative error when io_submit fails
  > README.md: note GitHub discussions
  > prometheus: `fmt::print` to stringstream directly
  > doc: add document for testing with seastar
  > seastar/testing: only include used headers
  > test: Add abortable http client test cases
  > http/client: Add abortable make_request() API method
  > http/client: Abort established connections
  > http/client: Handle abort source in pool wait
  > http/client: Add abort source to factory::make() method
  > http/client: Pass abort_source here and there
  > http/client: Idnentation fix after previous patch
  > http/client: Merge some continuations explicitly
  > signal: add seastar signal api
  > httpd: remove unused prometheus structs
  > print: use fmtlib's fmt::format_string in format()
  > rpc: do not use seastar::format() in rpc logger
  > treewide: s/format/seastar::format/
  > prometheus: sanitize label value for text protocol
  > tests: unit test prometheus wire format
  > io-tester: Introduce batches to rate-based submission
  > io-tester: Generalize issueing request and collecting its result
  > io-tester: Cancel intent once
  > io-tester: Dont carry rps/parallelism variables over lambdas
  > io-tester: Simplify in-flight management

The breaking changes in the seastar submodule necessitate corresponding
modifications in our code. These changes must be implemented together in
a single commit to maintain consistency. So that each commit is buildable.

following changes are included in addition to seastar submodule update:
* instead of passing a `const char*` for the format string, pass a
  templated `fmt::format_string<...>`, this depends on the
  `seastar::format()` change in seastar.
* explicitly call `fmt::runtime()` if the format string is not a
  consteval expression. this depends on the `seastar::format()` change
  in seastar. as `seastar::format()` does not accept a plain
  `const char*` which is not constexpr anymore.
* pass abort_source to `dns_connection_factory::make()`. this depends on
  the change in seastar, which added a `abort_source*` argument to
  the pure virtual member function of `connection_factory::make()`.
* call call {fmt,seastar}::format() explicitly. this is a follow up of
  3e84d43f, which takes care of all places where we should call
  `fmt::format()` and `seastar::format()` explicitly to disambiguate the
  `format()` call. but more `format()` call made their way into the source
   tree after 3e84d43f. so we need fix them as well.
* include used header in tests

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Update seastar submodule

 Please enter the commit message for your changes. Lines starting

Closes scylladb/scylladb#20649
2024-09-18 13:59:22 +03:00
Pavel Emelyanov
ebfa73e004 s3/client: Don't move file from write_body's lambda
Requests sent by S3 are retriable, so when request.write_body() is
called, it should keep everything intact in case http client will call
it again.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>

Closes scylladb/scylladb#20579
2024-09-17 09:48:09 +03:00
Botond Dénes
4fb194117e Merge 'Generalize multipart upload implementations in S3 client' from Pavel Emelyanov
There are two currently -- upload_sink_base and do_upload_file. This PR merges as much code as possible (spoiler: it's already mostly copy-n-pase-d, so squashing is pretty straightforward)

Closes scylladb/scylladb#20568

* github.com:scylladb/scylladb:
  s3/client: Reuse class multipart_upload in do_upload_file
  s3/client: Split upload_sink_base class into two
2024-09-13 10:35:10 +03:00
Pavel Emelyanov
17e7d3145c s3/client: Reuse class multipart_upload in do_upload_file
Uploading a file is implemented by the do_upload_file class. This class
re-implements a big portion of what's currently in multipart_upload one.
This patch makes the former class inherit from the latter and removes
all the duplication from it.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2024-09-12 18:38:16 +03:00
Pavel Emelyanov
14b741afc9 s3/client: Split upload_sink_base class into two
This class implements two facilities -- multipart upload protocol itself
plus some common parts of upload_sink_impl (in fact -- only close() and
plugs put(packet)).

This patch aplits those two facilities into two classes. One of them
will be re-used later.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2024-09-12 18:00:19 +03:00
Kefu Chai
197451f8c9 utils/rjson.cc: include the function name in exception message
recently, we are observing errors like:

```
stderr: error running operation: rjson::error (JSON SCYLLA_ASSERT failed on condition 'false', at: 0x60d6c8e 0x4d853fd 0x50d3ac8 0x518f5cd 0x51c4a4b 0x5fad446)
```

we only passed `false` to the `RAPIDJSON_ASSERT()` macro, so what we
have is but the type of the error (rjson::error) and a backtrace.
would be better if we can have more information without recompiling
or fetching the debug symbols for decipher the backtrace.

Refs scylladb/scylladb#20533
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#20539
2024-09-12 15:22:49 +03:00
Kefu Chai
3e84d43f93 treewide: use seastar::format() or fmt::format() explicitly
before this change, we rely on `using namespace seastar` to use
`seastar::format()` without qualifying the `format()` with its
namespace. this works fine until we changed the parameter type
of format string `seastar::format()` from `const char*` to
`fmt::format_string<...>`. this change practically invited
`seastar::format()` to the club of `std::format()` and `fmt::format()`,
where all members accept a templated parameter as its `fmt`
parameter. and `seastar::format()` is not the best candidate anymore.
despite that argument-dependent lookup (ADT for short) favors the
function which is in the same namespace as its parameter, but
`using namespace` makes `seastar::format()` more competitive,
so both `std::format()` and `seastar::format()` are considered
as the condidates.

that is what is happening scylladb in quite a few caller sites of
`format()`, hence ADT is not able to tell which function the winner
in the name lookup:

```
/__w/scylladb/scylladb/mutation/mutation_fragment_stream_validator.cc:265:12: error: call to 'format' is ambiguous
  265 |     return format("{} ({}.{} {})", _name_view, s.ks_name(), s.cf_name(), s.id());
      |            ^~~~~~
/usr/bin/../lib/gcc/x86_64-redhat-linux/14/../../../../include/c++/14/format:4290:5: note: candidate function [with _Args = <const std::basic_string_view<char> &, const seastar::basic_sstring<char, unsigned int, 15> &, const seastar::basic_sstring<char, unsigned int, 15> &, const utils::tagged_uuid<table_id_tag> &>]
 4290 |     format(format_string<_Args...> __fmt, _Args&&... __args)
      |     ^
/__w/scylladb/scylladb/seastar/include/seastar/core/print.hh:143:1: note: candidate function [with A = <const std::basic_string_view<char> &, const seastar::basic_sstring<char, unsigned int, 15> &, const seastar::basic_sstring<char, unsigned int, 15> &, const utils::tagged_uuid<table_id_tag> &>]
  143 | format(fmt::format_string<A...> fmt, A&&... a) {
      | ^
```

in this change, we

change all `format()` to either `fmt::format()` or `seastar::format()`
with following rules:
- if the caller expects an `sstring` or `std::string_view`, change to
  `seastar::format()`
- if the caller expects an `std::string`, change to `fmt::format()`.
  because, `sstring::operator std::basic_string` would incur a deep
  copy.

we will need another change to enable scylladb to compile with the
latest seastar. namely, to pass the format string as a templated
parameter down to helper functions which format their parameters.
to miminize the scope of this change, let's include that change when
bumping up the seastar submodule. as that change will depend on
the seastar change.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2024-09-11 23:21:40 +03:00
Pavel Emelyanov
7742b90cb1 directory_lister: Add noexcept default move-constructor
It's required to make it possible to push lister into with_closeable().
Its requiremenent of nothrow-move-constructible doesn't accept
default-generated one.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2024-09-05 11:10:21 +03:00
Pavel Emelyanov
86bc5b11fe s3-client: Add support for lister::filter
Directory lister comes with a filter function that tells lister which
entries to skip by its .get() method. For uniformity, add the same to
S3 bucket_lister.

After this change the lister reports shorter name in the returned
directory entry (with the prefix cut), so also need to tune up the unit
test respectively.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2024-08-27 16:15:40 +03:00
Pavel Emelyanov
113d2449f8 utils: Introduce abstract (directory) lister
This patch hides directory_lister and bucket_lister behind a common
facade. The intention is to provide a uniform API for sstable_directory
that it could use to list sstables' components wherever they are.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2024-08-27 16:15:40 +03:00
Botond Dénes
5c0f6d4613 Merge 'Make Summary support histogram with infinite bucket vlaues' from Amnon Heiman
This series fixes an issue where histogram Summaries return an infinite value.

It updated the quantile calculation logic to address cases where values fall into the infinite bucket of a histogram.
Now, instead of returning infinite (max int), the calculation will return the last bucket limit, ensuring finite outputs in all cases.

The series adds a test for summaries with a specific test case for this scenario.

Fixes #20255
Need backport to 6.0, 6.1 and 2023.1 and above

Closes scylladb/scylladb#20257

* github.com:scylladb/scylladb:
  test/estimated_histogram_test Add summary tests
  utils/histogram.hh: Make summary support inifinite bucket.
2024-08-27 10:33:54 +03:00
Botond Dénes
15fdc3f6cc Merge 'Add ability to list S3 bucket contents' from Pavel Emelyanov
This is prerequisite for "restore from object storage" feature. In order to collect the sstables in bucket one would need to list the bucket contents with the given prefix. The ListObjectsV2 provides a way for it and here's the respective s3::client extension.

Closes scylladb/scylladb#20120

* github.com:scylladb/scylladb:
  test: Add test for s3::client::bucket_lister
  s3_client: Add bucket lister
  s3_client: Encode query parameter value for query-string
2024-08-23 10:16:07 +03:00
Amnon Heiman
011aa91a8c utils/histogram.hh: Make summary support inifinite bucket.
This patch handles an edge cases related to The infinite bucket  
limit.

Summaries are the P50, P95, and P99 quantiles.

The quantiles are calculated from a histogram; we find the bucket and
return its upper limit.

In classic histograms, there is a notion of the infinite bucket;
anything that does not fall into the last bucket is considered to be
infinite;

with quantile, it does not make sense. So instead of reporting infinite
we'll report the bucket lower limit.

Fixes #20255

Signed-off-by: Amnon Heiman <amnon@scylladb.com>
2024-08-22 23:34:24 +03:00
Kamil Braun
5c9efdff50 Merge 'raft: store_snapshot_descriptor to use actually preserved items number when truncating the local log table' from Sergey Zolotukhin
io_fiber/store_snapshot_descriptor now gets the actual number of items
preserved when the log is truncated, fixing extra entries remained after
log snapshot creation. Also removes incorrect check for the number of
truncated items in the
raft_sys_table_storage::store_snapshot_descriptor.

Minor change: Added error_injection test API for changing snapshot thresholds settings.

Fixes scylladb/scylladb#16817
Fixes scylladb/scylladb#20080

Closes scylladb/scylladb#20095

* github.com:scylladb/scylladb:
  raft:  Ensure const correctness in applier_fiber.
  raft: Invoke store_snapshot_descriptor with actually preserved items.
  raft: Use raft_server_set_snapshot_thresholds in tests.
  raft: Fix indentation in server.cc
  raft: Add a test to check log size after truncation.
  raft: Add raft_server_set_snapshot_thresholds injection.
  utils: Ensure const correctness of injection_handler::get().
2024-08-20 18:15:30 +02:00
Tomasz Grabiec
ff52527c54 Merge 'repair: do_rebuild_replace_with_repair: use source_dc only when safe' from Benny Halevy
It is unsafe to restrict the sync nodes for repair to the source data center if it has too low replication factor in network_topology_replication_strategy, or if other nodes in that DC are ignored.

Also, this change restricts the usage of source_dc to `network_topology` and `everywhere_topology`
strategies, as with simple replication strategy
there is no guarantee that there would be any
more replicas in that data center.

Fixes #16826

Reproducer submitted as https://github.com/scylladb/scylla-dtest/pull/3865
It fails without this fix and passes with it.

* Requires backport to live versions.  Issue hit in the filed with 2022.2.14

Closes scylladb/scylladb#16827

* github.com:scylladb/scylladb:
  repair: do_rebuild_replace_with_repair: use source_dc only when safe
  repair: replace_with_repair: pass the replace_node downstream
  repair: replace_with_repair: pass ignore_nodes as a set of host_id:s
  repair: replace_rebuild_with_repair: pass ks_erms from caller
  nodetool: rebuild: add force option
  Add and use utils::optional_param to pass source_dc
2024-08-20 16:13:23 +02:00
Sergey Zolotukhin
c5da0775f2 utils: Ensure const correctness of injection_handler::get().
Make utils::error_injection::injection_handler::get() method 'const' as it does not mutate object's state.
2024-08-20 14:15:50 +02:00