Commit Graph

162 Commits

Author SHA1 Message Date
Botond Dénes
bc779ed00c multishard_mutation_query.cc: use shard_for_reads() instead of shard_of()
The latter is deprecated.
2024-05-16 00:28:47 +02:00
Kefu Chai
168ade72f8 treewide: replace formatter<std::string_view> with formatter<string_view>
in in {fmt} before v10, it provides the specialization of `fmt::formatter<..>`
for `std::string_view` as well as the specialization of `fmt::formatter<..>`
for `fmt::string_view` which is an implementation builtin in {fmt} for
compatibility of pre-C++17. and this type is used even if the code is
compiled with C++ stadandard greater or equal to C++17. also, before v10,
the `fmt::formatter<std::string_view>::format()` is defined so it accepts
`std::string_view`. after v10, `fmt::formatter<std::string_view>` still
exists, but it is now defined using `format_as()` machinery, so it's
`format()` method does not actually accept `std::string_view`, it
accepts `fmt::string_view`, as the former can be converted to
`fmt::string_view`.

this is why we can inherit from `fmt::formatter<std::string_view>` and
use `formatter<std::string_view>::format(foo, ctx);` to implement the
`format()` method with {fmt} v9, but we cannot do this with {fmt} v10,
and we would have following compilation failure:

```
FAILED: service/CMakeFiles/service.dir/RelWithDebInfo/topology_state_machine.cc.o
/home/kefu/.local/bin/clang++ -DFMT_DEPRECATED_OSTREAM -DFMT_SHARED -DSCYLLA_BUILD_MODE=release -DSEASTAR_API_LEVEL=7 -DSEASTAR_LOGGER_COMPILE_TIME_FMT -DSEASTAR_LOGGER_TYPE_STDOUT -DSEASTAR_SCHEDULING_GROUPS_COUNT=16 -DSEASTAR_SSTRING -DXXH_PRIVATE_API -DCMAKE_INTDIR=\"RelWithDebInfo\" -I/home/kefu/dev/scylladb -I/home/kefu/dev/scylladb/build/gen -I/home/kefu/dev/scylladb/seastar/include -I/home/kefu/dev/scylladb/build/seastar/gen/include -I/home/kefu/dev/scylladb/build/seastar/gen/src -ffunction-sections -fdata-sections -O3 -g -gz -std=gnu++20 -fvisibility=hidden -Wall -Werror -Wextra -Wno-error=deprecated-declarations -Wimplicit-fallthrough -Wno-c++11-narrowing -Wno-deprecated-copy -Wno-mismatched-tags -Wno-missing-field-initializers -Wno-overloaded-virtual -Wno-unsupported-friend -Wno-enum-constexpr-conversion -Wno-unused-parameter -ffile-prefix-map=/home/kefu/dev/scylladb=. -march=westmere -mllvm -inline-threshold=2500 -fno-slp-vectorize -U_FORTIFY_SOURCE -Werror=unused-result -MD -MT service/CMakeFiles/service.dir/RelWithDebInfo/topology_state_machine.cc.o -MF service/CMakeFiles/service.dir/RelWithDebInfo/topology_state_machine.cc.o.d -o service/CMakeFiles/service.dir/RelWithDebInfo/topology_state_machine.cc.o -c /home/kefu/dev/scylladb/service/topology_state_machine.cc
/home/kefu/dev/scylladb/service/topology_state_machine.cc:254:41: error: no matching member function for call to 'format'
  254 |     return formatter<std::string_view>::format(it->second, ctx);
      |            ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~
/usr/include/fmt/core.h:2759:22: note: candidate function template not viable: no known conversion from 'seastar::basic_sstring<char, unsigned int, 15>' to 'const fmt::basic_string_view<char>' for 1st argument
 2759 |   FMT_CONSTEXPR auto format(const T& val, FormatContext& ctx) const
      |                      ^      ~~~~~~~~~~~~
```

because the inherited `format()` method actually comes from
`fmt::formatter<fmt::string_view>`. to reduce the confusion, in this
change, we just inherit from `fmt::format<string_view>`, where
`string_view` is actually `fmt::string_view`. this follows
the document at
https://fmt.dev/latest/api.html#formatting-user-defined-types,
and since there is less indirection under the hood -- we do not
use the specialization created by `FMT_FORMAT_AS` which inherit
from `formatter<fmt::string_view>`, hopefully this can improve
the compilation speed a little bit. also, this change addresses
the build failure with {fmt} v10.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#18299
2024-04-19 07:44:07 +03:00
Botond Dénes
8213e66815 replica/database: use include page-size in max-result-size
This patch changes get_unlimited_query_max_result_size():
* Also set the page-size field, not just the soft/hard limits
* Renames it to get_query_max_result_size()
* Update callers, specifically storage_proxy::get_max_result_size(),
  which now has a much simpler common return path and has to drop the
  page size on one rare return path.

This is a purely mechanical change, no behaviour is changed.
2024-02-27 02:27:55 -05:00
Kefu Chai
6800810dba interval, multishard_mutation_query: fix typos in comments
these misspellings were identified by codespell.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#17491
2024-02-23 09:06:24 +02:00
Botond Dénes
ce472b33b8 multishard_mutation_query: add tablets support
When reading a list of ranges with tablets, we don't need a multishard
reader. Instead, we intersect the range list with the local nodes tablet
ranges, then read each range from the respective shard.
The individual ranges are read sequentially, with
database::query[_mutations](), merging the results into a single
instance. This makes the code simple. For tablets,
multishard_mutation_query.cc is no longer on the hot paths, range scans
on tables with tablets fork off to a different code-path in the
coordinator. The only code using multishard_mutation_query.cc are
forced, replica-local scans, like those used by SELECT * FROM
MUTATION_FRAGMENTS(). These are mainly used for diagnostics and tests,
so we optimize for simplicity, not performance.
2024-02-21 02:08:48 -05:00
Botond Dénes
d160a179ee multishard_mutation_query: remove compaction-state from result-builder factory
This param was used by the query-result builder, to set the
last-position on end-of-stream. Instead, do this via a new ResultBuilder
method, maybe_set_last_position(), which is called from read_page(),
which has access to the compaction-state.
With this, the ResultBuilder can be created without a compaction-state
at hand. This will be important in the next patch.
2024-02-21 02:08:48 -05:00
Botond Dénes
95bc0cb1c0 multishard_mutation_query: do_query(): return foreign_ptr<lw_shared_ptr<result>>
Makes future patching easier.
2024-02-21 02:08:48 -05:00
Kefu Chai
34cc245da5 gms: add formatter for read_context::dismantle_buffer_stats
before this change, we rely on the default-generated fmt::formatter
created from operator<<, but fmt v10 dropped the default-generated
formatter.

in this change, we define formatters for
`read_context::dismantle_buffer_stats`, and drop its operator<<.

Refs #13245

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#17389
2024-02-19 09:43:53 +02:00
Avi Kivity
7cb1c10fed treewide: replace seastar::future::get0() with seastar::future::get()
get0() dates back from the days where Seastar futures carried tuples, and
get0() was a way to get the first (and usually only) element. Now
it's a distraction, and Seastar is likely to deprecate and remove it.

Replace with seastar::future::get(), which does the same thing.
2024-02-02 22:12:57 +08:00
Lakshmi Narayanan Sreethar
76f0d5e35b reader_permit: store schema_ptr instead of raw schema pointer
Store schema_ptr in reader permit instead of storing a const pointer to
schema to ensure that the schema doesn't get changed elsewhere when the
permit is holding on to it. Also update the constructors and all the
relevant callers to pass down schema_ptr instead of a raw pointer.

Fixes #16180

Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com>

Closes scylladb/scylladb#16658
2024-01-11 08:37:56 +02:00
Kefu Chai
34259a03d0 treewide: use consteval string as format string when formatting log message
seastar::logger is using the compile-time format checking by default if
compiled using {fmt} 8.0 and up. and it requires the format string to be
consteval string, otherwise we have to use `fmt::runtime()` explicitly.

so adapt the change, let's use the consteval string when formatting
logging messages.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#16612
2024-01-02 19:08:47 +02:00
Michał Jadwiszczak
a5fc53aa11 querier_cache: check semaphore mismatch during querier lookup
Previously semaphore mismatch was checked only in multi-partition
queries and if happened, an internal error was thrown.

This commit pushed the check down to `querier_cache`, so each
`lookup_*_querier` method will check for the mismatch.

What's more, if semaphore mismatch occurs, check whether both semaphores belong
to user. If so, log a warning and drop cached reader instead of
throwing an error.

The mismatch can happen if user's scheduling group changed during
a query. We don't want to throw an error then, but drop and reset
cached reader.
2023-07-21 19:05:50 +02:00
Tomasz Grabiec
fb0bdcec0c storage_proxy: Avoid multishard reader for tablets
Currently, the coordinator splits the partition range at vnode (or
tablet) boundaries and then tries to merge adjacent ranges which
target the same replica. This is an optimization which makes less
sense with tablets, which are supposed to be of substantial size. If
we don't merge the ranges, then with tablets we can avoid using the
multishard reader on the replica side, since each tablet lives on a
single shard.

The main reason to avoid a multishard reader is avoiding its
complexity, and avoiding adapting it to work with tablet
sharding. Currently, the multishard reader implementation makes
several assumptions about shard assignment which do not hold with
tablets. It assumes that shards are assigned in a round-robin fashion.
2023-06-21 00:58:24 +02:00
Tomasz Grabiec
d92287f997 db: multishard: Obtain sharder from erm
This is not strictly necessary, as the multishard reader will be later
avoided altogether for tablet-based tables, but it is a step towards
converting all code to use the erm->get_sharder() instead of
schema::get_sharder().
2023-06-21 00:58:24 +02:00
Pavel Emelyanov
66e43912d6 code: Switch to seastar API level 7
In that level no io_priority_class-es exist. Instead, all the IO happens
in the context of current sched-group. File API no longer accepts prio
class argument (and makes io_intent arg mandatory to impls).

So the change consists of
- removing all usage of io_priority_class
- patching file_impl's inheritants to updated API
- priority manager goes away altogether
- IO bandwidth update is performed on respective sched group
- tune-up scylla-gdb.py io_queues command

The first change is huge and was made semi-autimatically by:
- grep io_priority_class | default_priority_class
- remove all calls, found methods' args and class' fields

Patching file_impl-s is smaller, but also mechanical:
- replace io_priority_class& argument with io_intent* one
- pass intent to lower file (if applicatble)

Dropping the priority manager is:
- git-rm .cc and .hh
- sed out all the #include-s
- fix configure.py and cmakefile

The scylla-gdb.py update is a bit hairry -- it needs to use task queues
list for IO classes names and shares, but to detect it should it checks
for the "commitlog" group is present.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>

Closes #13963
2023-06-06 13:29:16 +03:00
Botond Dénes
4ba7810f60 multishard_mutation_query: make reader_context::lookup_readers() exception safe
With regards to closing the looked-up querier if an exception is thrown.
In particular, this requires closing the querier if a semaphore mismatch
is detected. Move the table lookup above the line where the querier is
looked up, to avoid having to handle the exception from it.
2023-05-08 07:35:39 -04:00
Botond Dénes
227b0d3f08 multishard_mutation_query: lookup_readers(): make inner lambda a coroutine
Needed by the next patch. Sad, but it runs once/shard/page, so it
shouldn't be noticable.
2023-05-08 07:35:33 -04:00
Botond Dénes
0aa03f85a3 readers/multishard: reader_lifecycle_policy: add get_read_range()
Allows retrieving the current read-range for the reader on the given
shard (where the method is called).
2023-03-24 08:40:11 -04:00
Botond Dénes
1f51f752cc reader_permit: refresh trace_state on new pages
To make sure all tracing done on a certain page will make its way into
the appropriate trace session.
This is a contination of the previous patch (which added trace pointer
to the permit).
2023-03-22 04:58:10 -04:00
Botond Dénes
156e5d346d reader_permit: keep trace_state pointer on permit
And propagate it down to where it is created. This will be used to add
trace points for semaphore related events, but this will come in the
next patches.
2023-03-22 04:58:01 -04:00
Kefu Chai
0cb842797a treewide: do not define/capture unused variables
these warnings are found by Clang-17 after removing
`-Wno-unused-lambda-capture` and '-Wno-unused-variable' from
the list of disabled warnings in `configure.py`.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2023-02-15 22:57:18 +02:00
Avi Kivity
69a385fd9d Introduce schema/ module
Schema related files are moved there. This excludes schema files that
also interact with mutations, because the mutation module depends on
the schema. Those files will have to go into a separate module.

Closes #12858
2023-02-15 11:01:50 +02:00
Pavel Emelyanov
9cd1f777a5 database.hh: Remove unused headers
Use forward declarations when needed

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>

Closes #11667
2022-10-04 09:01:38 +03:00
Botond Dénes
7730419f5c query-result-writer: stop when tombstone-limit is reached
The query result writer now counts tombstones and cuts the page (marking
it as a short one) when the tombstone limit is reached. This is to avoid
timing out on large span of tombstones, especially prefixes.
In the case of unpaged queries, we fail the read instead, similarly to
how we do with max result size.
If the limit is 0, the previous behaviour is used: tombstones are not
taken into consideration at all.
2022-08-10 06:03:38 +03:00
Benny Halevy
c71ef330b2 query-request, everywhere: define and use query_id as a strong type
Define query_id as a tagged_uuid
So it can be differentiated from other uuid-class types.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-08-08 08:13:28 +03:00
Botond Dénes
70b4158ce0 mutation_compactor: detach_state(): make it no-op if partition was exhausted
detach_state() allows the user to resume a compaction process later,
without having to keep the compactor object alive. This happens by
generating and returning the mutation fragments the user has to re-feed
to a newly constructed compactor to bring it into the exact same state
the current compactor was at the point of stopping the compaction.
This state includes the partition-header (partition-start and static-row
if any) and the currently active range tombstone.
Detaching the state is pointless however when the compaction was stopped
such that the currently compacted partition was completely exhausted.
Allowing the state to be detached in this case seems benign but it
caused a subtle bug in the main user of this feature: the partition
range scan algorithm, where the fragments included in the detached state
were pushed back into the reader which produced them. If the partition
happened to be exhausted -- meaning the next fragment in the reader was
a partition-start or EOS -- this resulted in the partition being
re-emitted later without a partition-end, resulting in corrupt
query-result being generated, in turn resulting in an obscure "IDL frame
truncated" error.

This patch solves this seemingly benign but sinister bug by making the
return value of `detach_state()` an std::optional and returning a
disengaged optional when the partition was exhausted.
2022-08-02 06:43:24 +03:00
Botond Dénes
cdd3a364cb querier: use full_position in shard_mutation_querier
Instead of a separate partition key and position-in-partition.
This continues the recently started effort to standardize storing of
full positions on `full_position`.

This patch is also a hidden preparation for read_context::save_readers()
multishard_mutation_query.cc) no longer being able to get partition key
from compaction state in the future.
2022-08-02 06:43:24 +03:00
Avi Kivity
00cec159d6 Revert "Merge 'multishard_mutation_query: don't unpop partition header of spent partition' from Botond Dénes"
This reverts commit c3bad157e5, reversing
changes made to e66809d051. The checks it
adds are triggered by some dtests. While it's possible the check is
triggered due to an existing problem, better to investigate it out-of-tree.

Fixes #11169.
2022-07-31 15:24:33 +03:00
Botond Dénes
f119554106 mutation_compactor: detach_state(): make it no-op if partition was exhausted
detach_state() allows the user to resume a compaction process later,
without having to keep the compactor object alive. This happens by
generating and returning the mutation fragments the user has to re-feed
to a newly constructed compactor to bring it into the exact same state
the current compactor was at the point of stopping the compaction.
This state includes the partition-header (partition-start and static-row
if any) and the currently active range tombstone.
Detaching the state is pointless however when the compaction was stopped
such that the currently compacted partition was completely exhausted.
Allowing the state to be detached in this case seems benign but it
caused a subtle bug in the main user of this feature: the partition
range scan algorithm, where the fragments included in the detached state
were pushed back into the reader which produced them. If the partition
happened to be exhausted -- meaning the next fragment in the reader was
a partition-start or EOS -- this resulted in the partition being
re-emitted later without a partition-end, resulting in corrupt
query-result being generated, in turn resulting in an obscure "IDL frame
truncated" error.

This patch solves this seemingly benign but sinister bug by making the
return value of `detach_state()` an std::optional and returning a
disengaged optional when the partition was exhausted.
2022-07-28 09:02:26 +03:00
Botond Dénes
afa694a20c querier: use full_position in shard_mutation_querier
Instead of a separate partition key and position-in-partition.
This continues the recently started effort to standardize storing of
full positions on `full_position`.

This patch is also a hidden preparation for read_context::save_readers()
multishard_mutation_query.cc) no longer being able to get partition key
from compaction state in the future.
2022-07-28 08:19:23 +03:00
Botond Dénes
ac9935b645 multishard_mutation_query: remove now pointless compact_for_result_state typedef
No need to switch on the now defunct emit_only_live_rows.
2022-07-12 08:44:33 +03:00
Botond Dénes
4d2ce5c304 mutation_compactor: remove emit_only_live_rows template parameter
Now that we use emit_only_live_rows::no everywhere we can remove this
template parameters. Only the template parameter is removed, the
internal logic around it is left in place (will be removed in a next
patch), by hard-wiring `only_live()`.
2022-07-12 08:43:49 +03:00
Botond Dénes
bedc82e52c tree: use emit_only_live_rows::no
emit_only_live_rows is a convenience so downstream consumers of the
mutation compactors don't have to check the `bool is_live` already
passed to them. This convenience however causes a template parameter and
additional logic for the compactor. As the most prominent of these
consumers (the query result builder) will soon have to switch to
emit_only_live_rows::no for other reasons anyway (it will want to count
tombstones), we take the opportunity to switch everybody to ::no. This
can be done with very little additional complexity to these consumer --
basically an additional if or two.
This prepares the ground for removing this template parameter and the
associate logic from the compactor.
2022-07-12 08:41:51 +03:00
Botond Dénes
742dc10185 querier: querier_cache: de-override insert() methods
Soon, the currently two distinct types of queriers will be merged, as
the template parameter differentiating them will be gone. This will make
using type based overload for insert() impossible, as 2 out of the 3
types will be the same. Use different names instead.
2022-07-12 08:41:48 +03:00
Botond Dénes
fd5f8f2275 query: have replica provide the last position
Use the recently introduced query-result facility to have the replica
set the position where the query should continue from. For now this is
the same as what the implicit position would have been previously (last
row in result), but it opens up the possibility to stop the query at a
dead row.
2022-06-23 13:36:24 +03:00
Botond Dénes
7b6b7a49cd mutlishard_mutation_query: propagate compaction state to result builder
Not used in this patch, facilitates further patching.
2022-06-23 13:36:24 +03:00
Botond Dénes
738cb99c53 multishard_mutation_query: defer creating result builder until needed
Currently the result builder is created two frames above the method in
which actually needed. Push down a factory method instead and create it
where actually used. This allows us to pass it arguments that are
present only in the method which uses it.
2022-06-23 13:36:24 +03:00
Botond Dénes
58d53b66c1 querier: rely on compactor for position tracking
For some time now the compactor track its own position. The querier can
make use of this instead of duplicating this effort.
2022-06-23 13:36:24 +03:00
Benny Halevy
5babc609c6 multishard_mutation_query: do_query: couroutinize save_readers lambda
To keep it simple. It is unlikely to throw.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-06-08 09:31:17 +03:00
Benny Halevy
921092955b multishard_mutation_query: do_query: prevent exceptions using coroutine::as_future
Optimize error handling by preventing
exception try/catch using coroutine::as_future.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-06-08 09:31:17 +03:00
Benny Halevy
7a76ba4038 multishard_mutation_query: read_page: prevent exceptions using coroutine::as_future
Optimize error handling by preventing
exception try/catch using coroutine::as_future
to get query::consume_page's result.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-06-08 09:31:15 +03:00
Benny Halevy
817a0f316a multishard_mutation_query: save_readers: fixup indentation
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-06-08 09:23:14 +03:00
Benny Halevy
804d727b8b multishard_mutation_query: coroutinize save_readers
And use smp::invoke_on_all rather than a home-brewed
version of parallel_for_each over all shard ids.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-06-08 09:23:14 +03:00
Benny Halevy
22e5352cc2 multishard_mutation_query: lookup_readers: make noexcept
Sot it can be co_awaited efficiently using coroutine::as_future,
othwise, any exceptions will escape `as_future`.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-06-08 09:23:14 +03:00
Benny Halevy
ea3935507e multishard_mutation_query: optimize lookup_readers
No need to call _db.invoke_on inside a parallel_for_each
loop over all shards.  Just use _db.invoke_on_all instead.

Besides that, there's no need for a .then continuation
for assigning the per-shard reader in _readers[shard].
It can be done by the functor running on each db shard.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-06-08 09:23:14 +03:00
Benny Halevy
055141fc2e multishard_mutation_query: do_query: stop ctx if lookup_readers fails
lookup_readers might fail after populating some readers
and those better be closed before returning the exception.

Fixes #10351

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>

Closes #10425
2022-04-26 11:11:52 +03:00
Botond Dénes
d0ea895671 readers: move multishard reader & friends to reader/multishard.cc
Since the multishard reader family weighs more than 1K SLOC, it gets
its own .cc file.
2022-03-30 15:42:51 +03:00
Botond Dénes
0b5217052d querier: switch to v2 compactor output
The change is mostly mechanical: update all compactor instances to the
_v2 variant and update all call-sites, of which there is not that many.
As a consequence of this patch, queries -- both single-partition and
range-scans -- now do the v2->v1 conversion in the consumers, instead of
in the compactor.
2022-03-11 09:24:05 +02:00
Pavel Emelyanov
063da81ab7 code: Convert nothrow construction assertions into concepts
The small_vector also has N>0 constraint that's also converted

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-02-24 19:44:50 +03:00
Botond Dénes
f1e9e3b3b7 compact_mutation: drop support for v1 input 2022-02-21 12:29:24 +02:00