Commit Graph

180 Commits

Author SHA1 Message Date
Kefu Chai
372a4d1b79 treewide: do not define FMT_DEPRECATED_OSTREAM
since we do not rely on FMT_DEPRECATED_OSTREAM to define the
fmt::formatter for us anymore, let's stop defining `FMT_DEPRECATED_OSTREAM`.

in this change,

* utils: drop the range formatters in to_string.hh and to_string.c, as
  we don't use them anymore. and the tests for them in
  test/boost/string_format_test.cc are removed accordingly.
* utils: use fmt to print chunk_vector and small_vector. as
  we are not able to print the elements using operator<< anymore
  after switching to {fmt} formatters.
* test/boost: specialize fmt::details::is_std_string_like<bytes>
  due to a bug in {fmt} v9, {fmt} fails to format a range whose
  element type is `basic_sstring<uint8_t>`, as it considers it
  as a string-like type, but `basic_sstring<uint8_t>`'s char type
  is signed char, not char. this issue does not exist in {fmt} v10,
  so, in this change, we add a workaround to explicitly specialize
  the type trait to assure that {fmt} format this type using its
  `fmt::formatter` specialization instead of trying to format it
  as a string. also, {fmt}'s generic ranges formatter calls the
  pair formatter's `set_brackets()` and `set_separator()` methods
  when printing the range, but operator<< based formatter does not
  provide these method, we have to include this change in the change
  switching to {fmt}, otherwise the change specializing
  `fmt::details::is_std_string_like<bytes>` won't compile.
* test/boost: in tests, we use `BOOST_REQUIRE_EQUAL()` and its friends
  for comparing values. but without the operator<< based formatters,
  Boost.Test would not be able to print them. after removing
  the homebrew formatters, we need to use the generic
  `boost_test_print_type()` helper to do this job. so we are
  including `test_utils.hh` in tests so that we can print
  the formattable types.
* treewide: add "#include "utils/to_string.hh" where
  `fmt::formatter<optional<>>` is used.
* configure.py: do not define FMT_DEPRECATED_OSTREAM
* cmake: do not define FMT_DEPRECATED_OSTREAM

Refs #13245

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2024-04-19 22:57:36 +08:00
Kefu Chai
168ade72f8 treewide: replace formatter<std::string_view> with formatter<string_view>
in in {fmt} before v10, it provides the specialization of `fmt::formatter<..>`
for `std::string_view` as well as the specialization of `fmt::formatter<..>`
for `fmt::string_view` which is an implementation builtin in {fmt} for
compatibility of pre-C++17. and this type is used even if the code is
compiled with C++ stadandard greater or equal to C++17. also, before v10,
the `fmt::formatter<std::string_view>::format()` is defined so it accepts
`std::string_view`. after v10, `fmt::formatter<std::string_view>` still
exists, but it is now defined using `format_as()` machinery, so it's
`format()` method does not actually accept `std::string_view`, it
accepts `fmt::string_view`, as the former can be converted to
`fmt::string_view`.

this is why we can inherit from `fmt::formatter<std::string_view>` and
use `formatter<std::string_view>::format(foo, ctx);` to implement the
`format()` method with {fmt} v9, but we cannot do this with {fmt} v10,
and we would have following compilation failure:

```
FAILED: service/CMakeFiles/service.dir/RelWithDebInfo/topology_state_machine.cc.o
/home/kefu/.local/bin/clang++ -DFMT_DEPRECATED_OSTREAM -DFMT_SHARED -DSCYLLA_BUILD_MODE=release -DSEASTAR_API_LEVEL=7 -DSEASTAR_LOGGER_COMPILE_TIME_FMT -DSEASTAR_LOGGER_TYPE_STDOUT -DSEASTAR_SCHEDULING_GROUPS_COUNT=16 -DSEASTAR_SSTRING -DXXH_PRIVATE_API -DCMAKE_INTDIR=\"RelWithDebInfo\" -I/home/kefu/dev/scylladb -I/home/kefu/dev/scylladb/build/gen -I/home/kefu/dev/scylladb/seastar/include -I/home/kefu/dev/scylladb/build/seastar/gen/include -I/home/kefu/dev/scylladb/build/seastar/gen/src -ffunction-sections -fdata-sections -O3 -g -gz -std=gnu++20 -fvisibility=hidden -Wall -Werror -Wextra -Wno-error=deprecated-declarations -Wimplicit-fallthrough -Wno-c++11-narrowing -Wno-deprecated-copy -Wno-mismatched-tags -Wno-missing-field-initializers -Wno-overloaded-virtual -Wno-unsupported-friend -Wno-enum-constexpr-conversion -Wno-unused-parameter -ffile-prefix-map=/home/kefu/dev/scylladb=. -march=westmere -mllvm -inline-threshold=2500 -fno-slp-vectorize -U_FORTIFY_SOURCE -Werror=unused-result -MD -MT service/CMakeFiles/service.dir/RelWithDebInfo/topology_state_machine.cc.o -MF service/CMakeFiles/service.dir/RelWithDebInfo/topology_state_machine.cc.o.d -o service/CMakeFiles/service.dir/RelWithDebInfo/topology_state_machine.cc.o -c /home/kefu/dev/scylladb/service/topology_state_machine.cc
/home/kefu/dev/scylladb/service/topology_state_machine.cc:254:41: error: no matching member function for call to 'format'
  254 |     return formatter<std::string_view>::format(it->second, ctx);
      |            ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~
/usr/include/fmt/core.h:2759:22: note: candidate function template not viable: no known conversion from 'seastar::basic_sstring<char, unsigned int, 15>' to 'const fmt::basic_string_view<char>' for 1st argument
 2759 |   FMT_CONSTEXPR auto format(const T& val, FormatContext& ctx) const
      |                      ^      ~~~~~~~~~~~~
```

because the inherited `format()` method actually comes from
`fmt::formatter<fmt::string_view>`. to reduce the confusion, in this
change, we just inherit from `fmt::format<string_view>`, where
`string_view` is actually `fmt::string_view`. this follows
the document at
https://fmt.dev/latest/api.html#formatting-user-defined-types,
and since there is less indirection under the hood -- we do not
use the specialization created by `FMT_FORMAT_AS` which inherit
from `formatter<fmt::string_view>`, hopefully this can improve
the compilation speed a little bit. also, this change addresses
the build failure with {fmt} v10.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#18299
2024-04-19 07:44:07 +03:00
Botond Dénes
a19a2d76c9 sstablex/mx/reader: validate(): print trace message when finishing the PI block 2024-03-12 11:05:18 -04:00
Botond Dénes
677be168c4 sstablex/mx/reader: validate(): make index-data PI position check message consistent
The message says "index-data" but when printing the position, the data
position is printed first, causing confusion. Fix this and while at it,
also print the position of the partition start.
2024-03-12 11:05:18 -04:00
Botond Dénes
5bff7c40d3 sstablex/mx/reader: validate(): only load the next PI block if current is exhausted
The validate() consumes the content of partitions in a consume-loop.
Every time the consumer asks for a "break", the next PI block is loaded
and set on the validator, so it can validate that further clustering
elements are indeed from this block.
This loop assumed the consumer would only request interruption when the
current clustering block is finished. This is wrong, the consumer can
also request interruption when yielding is needed. When this is the
case, the next PI block doesn't have to be loaded yet, the current one
is not exhausted yet. Check this condition, before loading the next PI
block, to prevent false positive errors, due to mismatched PI block
and clustering elements from the sstable.
2024-03-12 11:05:18 -04:00
Botond Dénes
e073df1dbb sstablex/mx/reader: validate(): reset the current PI block on partition-start
It is possible that the next partition has no PI and thus there won't be
a new PI block to overwrite the old one. This will result in
false-positive messages about rows being outside of the finished PI
block.
2024-03-12 11:05:18 -04:00
Botond Dénes
2737899c21 sstablex/mx/reader: validate(): consume_range_tombstone(): check for finished clustering blocked
Promoted index entries can be written on any clustering elements,
icluding range tombstones. So the validating consumer also has the check
whether the current expected clustering block is finished, when
consuming a range tombstone. If it is, consumption has to be
interrupted, so that the outer-loop can load up the next promoted index
block, before moving on to the next clustering element.
2024-03-12 11:05:18 -04:00
Botond Dénes
f46b458f0d sstablex/mx/reader: validate(): fix validator for range tombstone end bounds
For range tombstone end-bounds, the validate_fragment_order() should be
passed a null tombstone, not a disengaged optional. The latter means no
change in the current tombstone. This caused the end bound of range
tombstones to not make it to the validator and the latter complained
later on partition-end that the partition has unclosed range tombstone.
2024-03-12 11:05:18 -04:00
Michał Chojnowski
f9e97fa632 sstables: fix a use-after-free in key_view::explode()
key_view::explode() contains a blatant use-after-free:
unless the input is already linearized, it returns a view to a local temporary buffer.

This is rare, because partition keys are usually not large enough to be fragmented.
But for a sufficiently large key, this bug causes a corrupted partition_key down
the line.

Fixes #17625

Closes scylladb/scylladb#17626
2024-03-07 09:07:07 +02:00
Kefu Chai
9a32029a8f sstables: add fmt::formatter for sstables::bound_kind_m
before this change, we rely on the default-generated fmt::formatter
created from operator<<, but fmt v10 dropped the default-generated
formatter.

in this change, we define formatters for `sstables::bound_kind_m`,
drop its operator<<.

Refs #13245

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2024-02-23 13:55:22 +08:00
Kefu Chai
0b13de52de sstable/mx: add fmt::formatter for cached_promoted_index::promoted_index_block
before this change, we rely on the default-generated fmt::formatter
created from operator<<, but fmt v10 dropped the default-generated
formatter.

in this change, we define formatters for
`cached_promoted_index::promoted_index_block`, and drop its
operator<<.

Refs #13245

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#17415
2024-02-20 09:00:32 +02:00
Avi Kivity
7cb1c10fed treewide: replace seastar::future::get0() with seastar::future::get()
get0() dates back from the days where Seastar futures carried tuples, and
get0() was a way to get the first (and usually only) element. Now
it's a distraction, and Seastar is likely to deprecate and remove it.

Replace with seastar::future::get(), which does the same thing.
2024-02-02 22:12:57 +08:00
Avi Kivity
8ee75ae8f4 sstables: writer: don't require effective_replication_map for sharding metadata
Currently, we pass an effective_replication_map_ptr to sstable_writer,
so that we can get a stable dht::sharder for writing the sharding metadata.
This is needed because with tablets, the sharder can change dynamically.

However, this is both bad and unnecessary:
 - bad: holding on to an effective_replication_map_ptr is a barrier
   for topology operations, preventing tablet migrations (etc) while
   an sstable is being written
 - unnecessary: tablets don't require sharding metadata at all, since
   two tablets cannot overlap (unlike two sstables from different shards in
   the same node). So the first/last key is sufficient to determine the
   shard/tablet ownership.

Given that, just pass the sharder for vnode sstables, and don't generate
sharding metadata for tablet sstables.
2024-01-23 22:23:08 +02:00
Kefu Chai
a6152cb87b sstables: do not include unused headers
these unused includes were identified by clangd. see
https://clangd.llvm.org/guides/include-cleaner#unused-include-warning
for more details on the "Unused include" warning.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#16666
2024-01-09 11:45:44 +02:00
Yaniv Kaul
c658bdb150 Typos: fix typos in comments
Fixes some typos as found by codespell run on the code.
In this commit, I was hoping to fix only comments, not user-visible alerts, output, etc.
Follow-up commits will take care of them.

Refs: https://github.com/scylladb/scylladb/issues/16255
Signed-off-by: Yaniv Kaul <yaniv.kaul@scylladb.com>
2023-12-02 22:37:22 +02:00
Aleksandra Martyniuk
7b3e0ab1f2 compaction: sstables: monitor validation scrub with compaction_read_generator
Validation scrub bypasses the usual compaction machinery, though it
still needs to be tracked with compaction_progress_monitor so that
we could reach its progress from compaction task executor.

Track sstable scrub in validate mode with read monitors.
2023-10-12 17:03:46 +02:00
Kefu Chai
9c24be05c3 sstable/writer: log sstable name and pk when capping ldt
when the local_deletion_time is too large and beyond the
epoch time of INT32_MAX, we cap it to INT32_MAX - 1.
this is a signal of bad configuration or a bug in scylla.
so let's add more information in the logging message to
help track back to the source of the problem.

Fixes #15015
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2023-08-21 19:25:32 +08:00
Raphael S. Carvalho
8829ff02c5 Revert "sstables: Close SSTable reader if index exhaustion is detected in fast forward call"
This reverts commit 1fefe597e6.

Can be reverted after auto-closed reader.

Refs #12998.
2023-07-12 10:48:28 -03:00
Avi Kivity
1545ae2d3b Merge 'Make SSTable cleanup more efficient by fast forwarding to next owned range' from Raphael "Raph" Carvalho
Today, SSTable cleanup skips to the next partition, one at a time, when it finds that the current partition is no longer owned by this node.

That's very inefficient because when a cluster is growing in size, existing nodes lose multiple sequential tokens in its owned ranges. Another inefficiency comes from fetching index pages spanning all unowned tokens, which was described in https://github.com/scylladb/scylladb/issues/14317.

To solve both problems, cleanup will now use multi range reader, to guarantee that it will only process the owned data and as a result skip unowned data. This results in cleanup scanning an owned range and then fast forwarding to the next one, until it's done with them all. This reduces significantly the amount of data in the index caching, as index will only be invoked at each range boundary instead.

Without further ado,

before:

`INFO  2023-07-01 07:10:26,281 [shard 0] compaction - [Cleanup keyspace2.standard1 701af580-17f7-11ee-8b85-a479a1a77573] Cleaned 1 sstables to [./tmp/1/keyspace2/standard1-b490ee20179f11ee9134afb16b3e10fd/me-3g7a_0s8o_06uww24drzrroaodpv-big-Data.db:level=0]. 2GB to 1GB (~50% of original) in 26248ms = 81MB/s. ~9443072 total partitions merged to 4750028.`

after:

`INFO  2023-07-01 07:07:52,354 [shard 0] compaction - [Cleanup keyspace2.standard1 199dff90-17f7-11ee-b592-b4f5d81717b9] Cleaned 1 sstables to [./tmp/1/keyspace2/standard1-b490ee20179f11ee9134afb16b3e10fd/me-3g7a_0s4m_5hehd2rejj8w15d2nt-big-Data.db:level=0]. 2GB to 1GB (~50% of original) in 17424ms = 123MB/s. ~9443072 total partitions merged to 4750028.`

Fixes #12998.
Fixes #14317.

Closes #14469

* github.com:scylladb/scylladb:
  test: Extend cleanup correctness test to cover more cases
  compaction: Make SSTable cleanup more efficient by fast forwarding to next owned range
  sstables: Close SSTable reader if index exhaustion is detected in fast forward call
  sstables: Simplify sstable reader initialization
  compaction: Extend make_sstable_reader() interface to work with mutation_source
  test: Extend sstable partition skipping test to cover fast forward using token
2023-07-11 23:28:15 +03:00
Raphael S. Carvalho
1fefe597e6 sstables: Close SSTable reader if index exhaustion is detected in fast forward call
When wiring multi range reader with cleanup, I found that cleanup
wouldn't be able to release disk space of input SSTables earlier.

The reason is that multi range reader fast forward to the next range,
therefore it enables mutation_reader::forwarding, and as a result,
combined reader cannot release readers proactively as it cannot tell
for sure that the underlying reader is exhausted. It may have reached
EOS for the current range, but it may have data for the next one.

The concept of EOS actually only applies to the current range being
read. A reader that returned EOS will actually get out of this
state once the combined reader fast forward to the next range.

Therefore, only the underlying reader, i.e. the sstable reader,
can for certain know that the data source is completely exhausted,
given that tokens are read in monotonically increasing order.

For reversed reads, that's not true but fast forward to range
is not actually supported yet for it.

Today, the SSTable reader already knows that the underlying SSTable
was exhausted in fast_forward_to(), after it call index_reader's
advance_to(partition_range), therefore it disables subsequent
reads. We can take a step further and also check that the index
was exhausted, i.e. reached EOF.

So if the index is exhausted, and there's no partition to read
after the fast_forward_to() call, we know that there's nothing
left to do in this reader, and therefore the reader can be
closed proactively, allowing the disk space of SSTable to be
reclaimed if it was already deleted.

We can see that the combined reader, under multi range reader,
will incrementally find a set of disjoint SSTable exhausted,
as it fast foward to owned ranges

1:
INFO  2023-07-05 10:51:09,570 [shard 0] mutation_reader - flat_multi_range_mutation_reader(): fast forwarding to range [{-4525396453480898112, start},{-4525396453480898112, end}]
INFO  2023-07-05 10:51:09,570 [shard 0] sstable - sstable /tmp/scylla-9831a31a-66f3-4541-8681-000ac8e21bbb/me-1-big-Data.db, start == *end, eof ? true
INFO  2023-07-05 10:51:09,570 [shard 0] sstable - closing reader 0x60100029d800 for /tmp/scylla-9831a31a-66f3-4541-8681-000ac8e21bbb/me-1-big-Data.db
INFO  2023-07-05 10:51:09,570 [shard 0] sstable - sstable /tmp/scylla-9831a31a-66f3-4541-8681-000ac8e21bbb/me-3-big-Data.db, start == *end, eof ? false
INFO  2023-07-05 10:51:09,570 [shard 0] sstable - sstable /tmp/scylla-9831a31a-66f3-4541-8681-000ac8e21bbb/me-4-big-Data.db, start == *end, eof ? false
INFO  2023-07-05 10:51:09,570 [shard 0] sstable - sstable /tmp/scylla-9831a31a-66f3-4541-8681-000ac8e21bbb/me-5-big-Data.db, start == *end, eof ? false
INFO  2023-07-05 10:51:09,570 [shard 0] sstable - sstable /tmp/scylla-9831a31a-66f3-4541-8681-000ac8e21bbb/me-6-big-Data.db, start == *end, eof ? false
INFO  2023-07-05 10:51:09,570 [shard 0] sstable - sstable /tmp/scylla-9831a31a-66f3-4541-8681-000ac8e21bbb/me-7-big-Data.db, start == *end, eof ? false
INFO  2023-07-05 10:51:09,570 [shard 0] sstable - sstable /tmp/scylla-9831a31a-66f3-4541-8681-000ac8e21bbb/me-8-big-Data.db, start == *end, eof ? false
INFO  2023-07-05 10:51:09,570 [shard 0] sstable - sstable /tmp/scylla-9831a31a-66f3-4541-8681-000ac8e21bbb/me-9-big-Data.db, start == *end, eof ? false
INFO  2023-07-05 10:51:09,570 [shard 0] sstable - sstable /tmp/scylla-9831a31a-66f3-4541-8681-000ac8e21bbb/me-10-big-Data.db, start == *end, eof ? false

2:
INFO  2023-07-05 10:51:09,572 [shard 0] mutation_reader - flat_multi_range_mutation_reader(): fast forwarding to range [{-2253424581619911583, start},{-2253424581619911583, end}]
INFO  2023-07-05 10:51:09,572 [shard 0] sstable - sstable /tmp/scylla-9831a31a-66f3-4541-8681-000ac8e21bbb/me-2-big-Data.db, start == *end, eof ? true
INFO  2023-07-05 10:51:09,572 [shard 0] sstable - closing reader 0x60100029d400 for /tmp/scylla-9831a31a-66f3-4541-8681-000ac8e21bbb/me-2-big-Data.db
INFO  2023-07-05 10:51:09,572 [shard 0] sstable - sstable /tmp/scylla-9831a31a-66f3-4541-8681-000ac8e21bbb/me-4-big-Data.db, start == *end, eof ? false
INFO  2023-07-05 10:51:09,572 [shard 0] sstable - sstable /tmp/scylla-9831a31a-66f3-4541-8681-000ac8e21bbb/me-5-big-Data.db, start == *end, eof ? false
INFO  2023-07-05 10:51:09,572 [shard 0] sstable - sstable /tmp/scylla-9831a31a-66f3-4541-8681-000ac8e21bbb/me-6-big-Data.db, start == *end, eof ? false
INFO  2023-07-05 10:51:09,572 [shard 0] sstable - sstable /tmp/scylla-9831a31a-66f3-4541-8681-000ac8e21bbb/me-7-big-Data.db, start == *end, eof ? false
INFO  2023-07-05 10:51:09,572 [shard 0] sstable - sstable /tmp/scylla-9831a31a-66f3-4541-8681-000ac8e21bbb/me-8-big-Data.db, start == *end, eof ? false
INFO  2023-07-05 10:51:09,572 [shard 0] sstable - sstable /tmp/scylla-9831a31a-66f3-4541-8681-000ac8e21bbb/me-9-big-Data.db, start == *end, eof ? false
INFO  2023-07-05 10:51:09,572 [shard 0] sstable - sstable /tmp/scylla-9831a31a-66f3-4541-8681-000ac8e21bbb/me-10-big-Data.db, start == *end, eof ? false

And so on.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
2023-07-11 13:56:24 -03:00
Raphael S. Carvalho
f08a4eaacb sstables: Simplify sstable reader initialization
It's odd that we see things like:

    if (!is_initialized()) {
        return initialize().then([this] {
            if (!is_initialized()) {

    and

    return ensure_initialized().then([this, &pr] {
        if (!is_initialized()) {

One might think initialize will actually initialize the reader by
setting up context, and ensure_initialized() will even have stronger
guarantees, meaning that the reader must be initialized by it.

But none are true.

In the context of single-partition read, it can happen initialize()
will not set up context, meaning is_initialized() returns false,
which is why initialization must be checked even after we call
ensure_initialized().

Let's merge ensure_initialized() and initialize() into a
maybe_initialize() which returns a boolean saying if the reader
is initialized.

It makes the code initializing the reader easier to understand.
2023-07-11 13:56:23 -03:00
Avi Kivity
0cabf4eeb9 build: disable implicit fallthrough
Prevent switch case statements from falling through without annotation
([[fallthrough]]) proving that this was intended.

Existing intended cases were annotated.

Closes #14607
2023-07-10 19:36:06 +02:00
Tomasz Grabiec
17d6163548 sstables: Generate sharding metadata using sharder from erm when writing
We need to keep sharding metadata consistent with tablet mapping to
shards in order for node restart to detect that those sstables belong
to a single shard and that resharding is not necessary. Resharding of
sstables based on tablet metadata is not implemented yet and will
abort after this series.

Keeping sharding metadata accurate for tablets is only necessary until
compaction group integration is finished. After that, we can use the
sstable token range to determine the owning tablet and thus the owning
shard. Before that, we can't, because a single sstable may contain
keys from different tablets, and the whole key range may overlap with
keys which belong to other shards.
2023-06-21 00:58:24 +02:00
Pavel Emelyanov
c68c154fb6 code: Reduce tracing/*hh fanout
There are some headers that include tracing/*.hh ones despite all they
need is forward-declared trace_state_ptr

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>

Closes #14155
2023-06-07 19:19:22 +03:00
Pavel Emelyanov
66e43912d6 code: Switch to seastar API level 7
In that level no io_priority_class-es exist. Instead, all the IO happens
in the context of current sched-group. File API no longer accepts prio
class argument (and makes io_intent arg mandatory to impls).

So the change consists of
- removing all usage of io_priority_class
- patching file_impl's inheritants to updated API
- priority manager goes away altogether
- IO bandwidth update is performed on respective sched group
- tune-up scylla-gdb.py io_queues command

The first change is huge and was made semi-autimatically by:
- grep io_priority_class | default_priority_class
- remove all calls, found methods' args and class' fields

Patching file_impl-s is smaller, but also mechanical:
- replace io_priority_class& argument with io_intent* one
- pass intent to lower file (if applicatble)

Dropping the priority manager is:
- git-rm .cc and .hh
- sed out all the #include-s
- fix configure.py and cmakefile

The scylla-gdb.py update is a bit hairry -- it needs to use task queues
list for IO classes names and shares, but to detect it should it checks
for the "commitlog" group is present.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>

Closes #13963
2023-06-06 13:29:16 +03:00
Benny Halevy
3771d48488 sstables: mx: validate: close consumer context
data_consume_rows keeps an input_stream member that must be closed.
In particular, on the error path, when we destroy it possibly
with readaheads in flight.

Fixes #13836

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>

Closes #13840
2023-05-10 17:05:43 +03:00
Botond Dénes
d941d38759 sstables/mx/reader: add mx specific validator
Working with the low-level sstable parser and index reader, this
validator also cross-checks the index with the data file, making sure
all partitions are located at the position and in the order the index
describes. Furthermore, if the index also has promoted index, the order
and position of clustering elements is checked against it.
This is above the usual fragment kind order, partition key order and
clustering order checks that we already had with the reader-level
validator.
2023-05-04 03:13:03 -04:00
Botond Dénes
d79db676b1 sstables/mx/reader: template data_consume_rows_context_m on the consumer
Sadly this means all accesses of base-class members have to be qualified
with `this->`.
2023-05-02 09:42:42 -04:00
Botond Dénes
06fb48362a sstables/mx/reader: move row_processing_result to namespace scope
Reduce `data_consume_rows_context_m`'s dependency on the
`mp_row_consumer_m` symbol, preparing the way to make the former
templated on the consumer.
2023-05-02 09:42:42 -04:00
Botond Dénes
00362754a0 sstables/mx/reader: use data_consumer::proceed directly
Currently mp_row_consumer_m creates an alias to data_consumer::proceed.
Code in the rest of the file uses both unqualified name and
mp_row_consumer_m::proceed. Remove the alias and just use
`data_consumer::proceed` directly everywhere, leads to cleaner code.
2023-05-02 09:42:42 -04:00
Botond Dénes
388e7ddc03 sstables/mx/reader.cc: extend namespace to end-of-file (cosmetic) 2023-05-02 09:42:42 -04:00
Kefu Chai
f5b05cf981 treewide: use defaulted operator!=() and operator==()
in C++20, compiler generate operator!=() if the corresponding
operator==() is already defined, the language now understands
that the comparison is symmetric in the new standard.

fortunately, our operator!=() is always equivalent to
`! operator==()`, this matches the behavior of the default
generated operator!=(). so, in this change, all `operator!=`
are removed.

in addition to the defaulted operator!=, C++20 also brings to us
the defaulted operator==() -- it is able to generated the
operator==() if the member-wise lexicographical comparison.
under some circumstances, this is exactly what we need. so,
in this change, if the operator==() is also implemented as
a lexicographical comparison of all memeber variables of the
class/struct in question, it is implemented using the default
generated one by removing its body and mark the function as
`default`. moreover, if the class happen to have other comparison
operators which are implemented using lexicographical comparison,
the default generated `operator<=>` is used in place of
the defaulted `operator==`.

sometimes, we fail to mark the operator== with the `const`
specifier, in this change, to fulfil the need of C++ standard,
and to be more correct, the `const` specifier is added.

also, to generate the defaulted operator==, the operand should
be `const class_name&`, but it is not always the case, in the
class of `version`, we use `version` as the parameter type, to
fulfill the need of the C++ standard, the parameter type is
changed to `const version&` instead. this does not change
the semantic of the comparison operator. and is a more idiomatic
way to pass non-trivial struct as function parameters.

please note, because in C++20, both operator= and operator<=> are
symmetric, some of the operators in `multiprecision` are removed.
they are the symmetric form of the another variant. if they were
not removed, compiler would, for instance, find ambiguous
overloaded operator '=='.

this change is a cleanup to modernize the code base with C++20
features.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes #13687
2023-04-27 10:24:46 +03:00
Kefu Chai
3738fcbe05 keys: specialize fmt::formatter<partition_key> and friends
this is a part of a series to migrating from `operator<<(ostream&, ..)`
based formatting to fmtlib based formatting. the goal here is to enable
fmtlib to print following classes without the help of `operator<<`.

- partition_key_view
- partition_key
- partition_key::with_schema_wrapper
- key_with_schema
- clustering_key_prefix
- clustering_key_prefix::with_schema_wrapper

the corresponding `operator<<()` are dropped dropped in this change,
as all its callers are now using fmtlib for formatting now. the helper
of `print_key()` is removed, as its only caller is
`operator<<(std::ostream&, const
clustering_key_prefix::with_schema_wrapper&)`.

the reason why all these operators are replaced in one go is that
we have a template function of `key_to_str()` in `db/large_data_handler.cc`.
this template function is actually the caller of operator<< of
`partition_key::with_schema_wrapper` and
`clustering_key_prefix::with_schema_wrapper`.
so, in order to drop either of these two operator<<, we need to remove
both of them, so that we can switch over to `fmt::to_string()` in this
template function.

Refs scylladb#13245

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2023-04-14 13:21:30 +08:00
Pavel Emelyanov
ac1e56c9d9 sstable, storage: Virtualize data sink making for Data and Index
Add the make_data_or_index_sink() virtual method and its implementation for
filesystem_storage.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-04-10 16:43:01 +03:00
Pavel Emelyanov
1d4fcce5dd sstable/writer: Shuffle writer::init_file_writers()
The method needs to create two data sinks -- for Data and for Index
files -- and then wrap it with more stuff (compression, checksums,
streams, etc.). With S3 backend using file-output-stream won't work,
becase S3 storage cannot provide writable file API (it has data_sink
instead).

This patch extracts file_data_sink creation so that it could be
virtualized with storage API later.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-04-10 16:43:01 +03:00
Raphael S. Carvalho
213eaab246 sstables: Fix use-after-move when making reader in reverse mode
static report:
sstables/mx/reader.cc:1705:58: error: invalid invocation of method 'operator*' on object 'schema' while it is in the 'consumed' state [-Werror,-Wconsumed]
            legacy_reverse_slice_to_native_reverse_slice(*schema, slice.get()), pc, std::move(trace_state), fwd, fwd_mr, monitor);

Fixes #13394.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
2023-03-31 08:39:11 -03:00
Avi Kivity
c3a2ec9d3c Merge 'use fmt::join() for printing ranges' from Kefu Chai
this series intends to deprecate `::join()`, as it always materializes a range into a concrete string. but what we always want is to print the elements in the given range to stream, or to a seastar logger, which is backed by fmtlib. also, because fmtlib offers exactly the same set of features implemented by to_string.hh, this change would allow us to use fmtlib to replace to_string.hh for better maintainability, and potentially better performance. as fmtlib is lazy evaluated, and claims to be performant under most circumstances.

Closes #13163

* github.com:scylladb/scylladb:
  utils: to_string: move join to namespace utils
  treewide: use fmt::join() when appropriate
  row_cache: pass "const cache_entry" to operator<<
2023-03-19 15:16:02 +02:00
Kefu Chai
c37f4e5252 treewide: use fmt::join() when appropriate
now that fmtlib provides fmt::join(). see
https://fmt.dev/latest/api.html#_CPPv4I0EN3fmt4joinE9join_viewIN6detail10iterator_tI5RangeEEN6detail10sentinel_tI5RangeEEERR5Range11string_view
there is not need to revent the wheel. so in this change, the homebrew
join() is replaced with fmt::join().

as fmt::join() returns an join_view(), this could improve the
performance under certain circumstances where the fully materialized
string is not needed.

please note, the goal of this change is to use fmt::join(), and this
change does not intend to improve the performance of existing
implementation based on "operator<<" unless the new implementation is
much more complicated. we will address the unnecessarily materialized
strings in a follow-up commit.

some noteworthy things related to this change:

* unlike the existing `join()`, `fmt::join()` returns a view. so we
  have to materialize the view if what we expect is a `sstring`
* `fmt::format()` does not accept a view, so we cannot pass the
  return value of `fmt::join()` to `fmt::format()`
* fmtlib does not format a typed pointer, i.e., it does not format,
  for instance, a `const std::string*`. but operator<<() always print
  a typed pointer. so if we want to format a typed pointer, we either
  need to cast the pointer to `void*` or use `fmt::ptr()`.
* fmtlib is not able to pick up the overload of
  `operator<<(std::ostream& os, const column_definition* cd)`, so we
  have to use a wrapper class of `maybe_column_definition` for printing
  a pointer to `column_definition`. since the overload is only used
  by the two overloads of
  `statement_restrictions::add_single_column_parition_key_restriction()`,
  the operator<< for `const column_definition*` is dropped.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2023-03-16 20:34:18 +08:00
Botond Dénes
2694aa1078 reader_permit: use check_abort() for timeout
Instead of having callers use get_timeout(), then compare it against the
current time, set up a timeout timer in the permit, which assigned a new
`_ex` member (a `std::exception_ptr`) to the appropriate exception type
when it fires.
Callers can now just poll check_abort() which will throw when `_ex`
is not null. This is more natural and allows for more general reasons
for aborting reads in the future.
This prepares the ground for timeouts being managed inside the permit,
instead of by the semaphore. Including timing out while in a wait queue.
2023-03-09 06:53:09 -05:00
Petr Gusev
64427b9164 flat_mutation_reader_v2: drop forward_buffer_to
This is just a strange method I came across.
It effectively does nothing but clear_buffer().
2023-02-28 23:00:02 +04:00
Pavel Emelyanov
0959739216 sstables: Remove always-false sstable_writer_config::leave_unsealed
It was used in sstables streaming code up until e5be3352 (database,
streaming, messaging: drop streaming memtables) or nearby, then the
whole feature was reworked.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>

Closes #12967
2023-02-23 12:50:06 +01:00
Kefu Chai
cca9b7c4cd sstables/mx/partition_reversing_data_source: remove defaulted move ctor
as partition_reversing_data_source_impl has indirectly a member variable which
a member of reference type. this should addres following warning from
Clang:

```
/home/kefu/dev/scylladb/sstables/mx/partition_reversing_data_source.cc:476:43: error: explicitly defaulted move assignment operator is implicitly deleted [-Werror,-Wdefaulted-function-deleted]
    partition_reversing_data_source_impl& operator=(partition_reversing_data_source_impl&&) noexcept = default;
                                          ^
/home/kefu/dev/scylladb/sstables/mx/partition_reversing_data_source.cc:365:19: note: move assignment operator of 'partition_reversing_data_source_impl' is implicitly deleted because field '_schema' is of reference type 'const schema &'
    const schema& _schema;
                  ^
```

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2023-02-19 12:57:40 +08:00
Kefu Chai
45f0449ccf sstables: mx/writer: remove defaulted move ctor
because its base class of `writer_impl` has a member variable
`_validator`, which has its copy ctor deleted. let's just
drop the defaulted move ctor, as compiler is not able to
generate one for us.

```
/home/kefu/dev/scylladb/sstables/mx/writer.cc:805:5: error: explicitly defaulted move constructor is implicitly deleted [-Werror,-Wdefaulted-function-deleted]
    writer(writer&& o) = default;
    ^
/home/kefu/dev/scylladb/sstables/mx/writer.cc:528:16: note: move constructor of 'writer' is implicitly deleted because base class 'sstable_writer::writer_impl' has a deleted move constructor
class writer : public sstable_writer::writer_impl {
               ^
/home/kefu/dev/scylladb/sstables/writer_impl.hh:29:48: note: copy constructor of 'writer_impl' is implicitly deleted because field '_validator' has a deleted copy constructor
    mutation_fragment_stream_validating_filter _validator;
                                               ^
/home/kefu/dev/scylladb/mutation/mutation_fragment_stream_validator.hh:188:5: note: 'mutation_fragment_stream_validating_filter' has been explicitly marked deleted here
    mutation_fragment_stream_validating_filter(const mutation_fragment_stream_validating_filter&) = delete;
    ^
```

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes #12877
2023-02-15 23:06:10 +02:00
Kefu Chai
0cb842797a treewide: do not define/capture unused variables
these warnings are found by Clang-17 after removing
`-Wno-unused-lambda-capture` and '-Wno-unused-variable' from
the list of disabled warnings in `configure.py`.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2023-02-15 22:57:18 +02:00
Avi Kivity
69a385fd9d Introduce schema/ module
Schema related files are moved there. This excludes schema files that
also interact with mutations, because the mutation module depends on
the schema. Those files will have to go into a separate module.

Closes #12858
2023-02-15 11:01:50 +02:00
Avi Kivity
c5e4bf51bd Introduce mutation/ module
Move mutation-related files to a new mutation/ directory. The names
are kept in the global namespace to reduce churn; the names are
unambiguous in any case.

mutation_reader remains in the readers/ module.

mutation_partition_v2.cc was missing from CMakeLists.txt; it's added in this
patch.

This is a step forward towards librarization or modularization of the
source base.

Closes #12788
2023-02-14 11:19:03 +02:00
Botond Dénes
2acfa950d7 sstables: wire in the reader_permit's sstable read count tracking
Hook in the relevant methods when creating and destroying sstable
readers.
2023-01-03 09:37:29 -05:00
Raphael S. Carvalho' via ScyllaDB development
a7c4a129cb sstables: Bump row_reads metrics for mx version
Metric was always 0 despite a row was processed by mx reader.

Fixes #12406.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <20221227220202.295790-1-raphaelsc@scylladb.com>
2022-12-30 18:38:30 +01:00
Avi Kivity
7c7eb81a66 Merge 'Encapsulate filesystem access by sstable into filesystem_storage subsclass' from Pavel Emelyanov
This is to define the API sstable needs from underlying storage. When implementing object-storage backend it will need to implement those. The API looks like

        future<> snapshot(const sstable& sst, sstring dir, absolute_path abs) const;
        future<> quarantine(const sstable& sst, delayed_commit_changes* delay);
        future<> move(const sstable& sst, sstring new_dir, generation_type generation, delayed_commit_changes* delay);
        void open(sstable& sst, const io_priority_class& pc); // runs in async context
        future<> wipe(const sstable& sst) noexcept;

        future<file> open_component(const sstable& sst, component_type type, open_flags flags, file_open_options options, bool check_integrity);

It doesn't have "list" or alike, because it's not a method of an individual sstable, but rather the one from sstables_manager. It will come as separate PR.

Closes #12217

* github.com:scylladb/scylladb:
  sstable, storage: Mark dir/temp_dir private
  sstable: Remove get_dir() (well, almost)
  sstable: Add quarantine() method to storage
  sstable: Use absolute/relative path marking for snapshot()
  sstable: Remove temp_... stuff from sstable
  sstable: Move open_component() on storage
  sstable: Mark rename_new_sstable_component_file() const
  sstable: Print filename(type) on open-component error
  sstable: Reorganize new_sstable_component_file()
  sstable: Mark filename() private
  sstable: Introduce index_filename()
  tests: Disclosure private filename() calls
  sstable: Move wipe_storage() on storage
  sstable: Remove temp dir in wipe_storage()
  sstable: Move unlink parts into wipe_storage
  sstable: Remove get_temp_dir()
  sstable: Move write_toc() to storage
  sstable: Shuffle open_sstable()
  sstable: Move touch_temp_dir() to storage
  sstable: Move move() to storage
  sstable: Move create_links() to storage
  sstable: Move seal_sstable() to storage
  sstable: Tossing internals of seal_sstable()
  sstable: Move remove_temp_dir() to storage
  sstable: Move create_links_common() to storage
  sstable: Move check_create_links_replay() to storage
  sstable: Remove one of create_links() overloads
  sstable: Remove create_links_and_mark_for_removal()
  sstable: Indentation fix after prevuous patch
  sstable: Coroutinize create_links_common()
  sstable: Rename create_links_common()'s "dir" argument
  sstable: Make mark_for_removal bool_class
  sstable, table: Add sstable::snapshot() and use in table::take_snapshot
  sstable: Move _dir and _temp_dir on filesystem_storage
  sstable: Use sync_directory() method
  test, sstable: Use component_basename in test
  sstables: Move read_{digest|checksum} on sstable
2022-12-18 17:29:35 +02:00
Botond Dénes
8f8284783a Merge 'Fix handling of non-full clustering keys in the read path' from Tomasz Grabiec
This PR fixes several bugs related to handling of non-full
clustering keys.

One is in trim_clustering_row_ranges_to(), which is broken for non-full keys in reverse
mode. It will trim the range to position_in_partition_view::after_key(full_key) instead of
position_in_partition_view::before_key(key), hence it will include the
key in the resulting range rather than exclude it.

Fixes #12180

after_key() was creating a position which is after all keys prefixed
by a non-full key, rather than a position which is right after that
key.

This will issue will be caught by cql_query_test::test_compact_storage
in debug mode when mutation_partition_v2 merging starts inserting
sentinels at position after_key() on preemption.

It probably already causes problems for such keys as after_key() is used
in various parts in the read path.

Refs #1446

Closes #12234

* github.com:scylladb/scylladb:
  position_in_partition: Make after_key() work with non-full keys
  position_in_partition: Introduce before_key(position_in_partition_view)
  db: Fix trim_clustering_row_ranges_to() for non-full keys and reverse order
  types: Fix comparison of frozen sets with empty values
2022-12-15 10:47:12 +02:00