scylladb

Author	SHA1	Message	Date
Botond Dénes	bc779ed00c	multishard_mutation_query.cc: use shard_for_reads() instead of shard_of() The latter is deprecated.	2024-05-16 00:28:47 +02:00
Kefu Chai	168ade72f8	treewide: replace formatter<std::string_view> with formatter<string_view> in {fmt} before v10, it provides the specialization of `fmt::formatter<..>` for `std::string_view` as well as the specialization of `fmt::formatter<..>` for `fmt::string_view` which is an implementation builtin in {fmt} for compatibility of pre-C++17. and this type is used even if the code is compiled with C++ standard greater or equal to C++17. also, before v10, the `fmt::formatter<std::string_view>::format()` is defined so it accepts `std::string_view`. after v10, `fmt::formatter<std::string_view>` still exists, but it is now defined using `format_as()` machinery, so its `format()` method does not actually accept `std::string_view`, it accepts `fmt::string_view`, as the former can be converted to `fmt::string_view`. this is why we can inherit from `fmt::formatter<std::string_view>` and use `formatter<std::string_view>::format(foo, ctx);` to implement the `format()` method with {fmt} v9, but we cannot do this with {fmt} v10, and we would have the following compilation failure: ``` FAILED: service/CMakeFiles/service.dir/RelWithDebInfo/topology_state_machine.cc.o /home/kefu/.local/bin/clang++ -DFMT_DEPRECATED_OSTREAM -DFMT_SHARED -DSCYLLA_BUILD_MODE=release -DSEASTAR_API_LEVEL=7 -DSEASTAR_LOGGER_COMPILE_TIME_FMT -DSEASTAR_LOGGER_TYPE_STDOUT -DSEASTAR_SCHEDULING_GROUPS_COUNT=16 -DSEASTAR_SSTRING -DXXH_PRIVATE_API -DCMAKE_INTDIR=\"RelWithDebInfo\" -I/home/kefu/dev/scylladb -I/home/kefu/dev/scylladb/build/gen -I/home/kefu/dev/scylladb/seastar/include -I/home/kefu/dev/scylladb/build/seastar/gen/include -I/home/kefu/dev/scylladb/build/seastar/gen/src -ffunction-sections -fdata-sections -O3 -g -gz -std=gnu++20 -fvisibility=hidden -Wall -Werror -Wextra -Wno-error=deprecated-declarations -Wimplicit-fallthrough -Wno-c++11-narrowing -Wno-deprecated-copy -Wno-mismatched-tags -Wno-missing-field-initializers -Wno-overloaded-virtual -Wno-unsupported-friend 
-Wno-enum-constexpr-conversion -Wno-unused-parameter -ffile-prefix-map=/home/kefu/dev/scylladb=. -march=westmere -mllvm -inline-threshold=2500 -fno-slp-vectorize -U_FORTIFY_SOURCE -Werror=unused-result -MD -MT service/CMakeFiles/service.dir/RelWithDebInfo/topology_state_machine.cc.o -MF service/CMakeFiles/service.dir/RelWithDebInfo/topology_state_machine.cc.o.d -o service/CMakeFiles/service.dir/RelWithDebInfo/topology_state_machine.cc.o -c /home/kefu/dev/scylladb/service/topology_state_machine.cc /home/kefu/dev/scylladb/service/topology_state_machine.cc:254:41: error: no matching member function for call to 'format' 254 \| return formatter<std::string_view>::format(it->second, ctx); \| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~ /usr/include/fmt/core.h:2759:22: note: candidate function template not viable: no known conversion from 'seastar::basic_sstring<char, unsigned int, 15>' to 'const fmt::basic_string_view<char>' for 1st argument 2759 \| FMT_CONSTEXPR auto format(const T& val, FormatContext& ctx) const \| ^ ~~~~~~~~~~~~ ``` because the inherited `format()` method actually comes from `fmt::formatter<fmt::string_view>`. to reduce the confusion, in this change, we just inherit from `fmt::formatter<string_view>`, where `string_view` is actually `fmt::string_view`. this follows the document at https://fmt.dev/latest/api.html#formatting-user-defined-types, and since there is less indirection under the hood -- we do not use the specialization created by `FMT_FORMAT_AS` which inherits from `formatter<fmt::string_view>`, hopefully this can improve the compilation speed a little bit. also, this change addresses the build failure with {fmt} v10. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#18299	2024-04-19 07:44:07 +03:00
Botond Dénes	8213e66815	replica/database: use include page-size in max-result-size This patch changes get_unlimited_query_max_result_size(): * Also set the page-size field, not just the soft/hard limits * Renames it to get_query_max_result_size() * Update callers, specifically storage_proxy::get_max_result_size(), which now has a much simpler common return path and has to drop the page size on one rare return path. This is a purely mechanical change, no behaviour is changed.	2024-02-27 02:27:55 -05:00
Kefu Chai	6800810dba	interval, multishard_mutation_query: fix typos in comments these misspellings were identified by codespell. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#17491	2024-02-23 09:06:24 +02:00
Botond Dénes	ce472b33b8	multishard_mutation_query: add tablets support When reading a list of ranges with tablets, we don't need a multishard reader. Instead, we intersect the range list with the local node's tablet ranges, then read each range from the respective shard. The individual ranges are read sequentially, with database::query[_mutations](), merging the results into a single instance. This makes the code simple. For tablets, multishard_mutation_query.cc is no longer on the hot paths, range scans on tables with tablets fork off to a different code-path in the coordinator. The only code using multishard_mutation_query.cc are forced, replica-local scans, like those used by SELECT * FROM MUTATION_FRAGMENTS(). These are mainly used for diagnostics and tests, so we optimize for simplicity, not performance.	2024-02-21 02:08:48 -05:00
Botond Dénes	d160a179ee	multishard_mutation_query: remove compaction-state from result-builder factory This param was used by the query-result builder, to set the last-position on end-of-stream. Instead, do this via a new ResultBuilder method, maybe_set_last_position(), which is called from read_page(), which has access to the compaction-state. With this, the ResultBuilder can be created without a compaction-state at hand. This will be important in the next patch.	2024-02-21 02:08:48 -05:00
Botond Dénes	95bc0cb1c0	multishard_mutation_query: do_query(): return foreign_ptr<lw_shared_ptr<result>> Makes future patching easier.	2024-02-21 02:08:48 -05:00
Kefu Chai	34cc245da5	gms: add formatter for read_context::dismantle_buffer_stats before this change, we rely on the default-generated fmt::formatter created from operator<<, but fmt v10 dropped the default-generated formatter. in this change, we define formatters for `read_context::dismantle_buffer_stats`, and drop its operator<<. Refs #13245 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#17389	2024-02-19 09:43:53 +02:00
Avi Kivity	7cb1c10fed	treewide: replace seastar::future::get0() with seastar::future::get() get0() dates back from the days where Seastar futures carried tuples, and get0() was a way to get the first (and usually only) element. Now it's a distraction, and Seastar is likely to deprecate and remove it. Replace with seastar::future::get(), which does the same thing.	2024-02-02 22:12:57 +08:00
Lakshmi Narayanan Sreethar	76f0d5e35b	reader_permit: store schema_ptr instead of raw schema pointer Store schema_ptr in reader permit instead of storing a const pointer to schema to ensure that the schema doesn't get changed elsewhere when the permit is holding on to it. Also update the constructors and all the relevant callers to pass down schema_ptr instead of a raw pointer. Fixes #16180 Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com> Closes scylladb/scylladb#16658	2024-01-11 08:37:56 +02:00
Kefu Chai	34259a03d0	treewide: use consteval string as format string when formatting log message seastar::logger is using the compile-time format checking by default if compiled using {fmt} 8.0 and up. and it requires the format string to be consteval string, otherwise we have to use `fmt::runtime()` explicitly. so adapt the change, let's use the consteval string when formatting logging messages. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#16612	2024-01-02 19:08:47 +02:00
Michał Jadwiszczak	a5fc53aa11	querier_cache: check semaphore mismatch during querier lookup Previously semaphore mismatch was checked only in multi-partition queries and if it happened, an internal error was thrown. This commit pushed the check down to `querier_cache`, so each `lookup_*_querier` method will check for the mismatch. What's more, if semaphore mismatch occurs, check whether both semaphores belong to the user. If so, log a warning and drop cached reader instead of throwing an error. The mismatch can happen if user's scheduling group changed during a query. We don't want to throw an error then, but drop and reset cached reader.	2023-07-21 19:05:50 +02:00
Tomasz Grabiec	fb0bdcec0c	storage_proxy: Avoid multishard reader for tablets Currently, the coordinator splits the partition range at vnode (or tablet) boundaries and then tries to merge adjacent ranges which target the same replica. This is an optimization which makes less sense with tablets, which are supposed to be of substantial size. If we don't merge the ranges, then with tablets we can avoid using the multishard reader on the replica side, since each tablet lives on a single shard. The main reason to avoid a multishard reader is avoiding its complexity, and avoiding adapting it to work with tablet sharding. Currently, the multishard reader implementation makes several assumptions about shard assignment which do not hold with tablets. It assumes that shards are assigned in a round-robin fashion.	2023-06-21 00:58:24 +02:00
Tomasz Grabiec	d92287f997	db: multishard: Obtain sharder from erm This is not strictly necessary, as the multishard reader will be later avoided altogether for tablet-based tables, but it is a step towards converting all code to use the erm->get_sharder() instead of schema::get_sharder().	2023-06-21 00:58:24 +02:00
Pavel Emelyanov	66e43912d6	code: Switch to seastar API level 7 In that level no io_priority_class-es exist. Instead, all the IO happens in the context of current sched-group. File API no longer accepts prio class argument (and makes io_intent arg mandatory to impls). So the change consists of - removing all usage of io_priority_class - patching file_impl's inheritors to updated API - priority manager goes away altogether - IO bandwidth update is performed on respective sched group - tune-up scylla-gdb.py io_queues command The first change is huge and was made semi-automatically by: - grep io_priority_class \| default_priority_class - remove all calls, found methods' args and class' fields Patching file_impl-s is smaller, but also mechanical: - replace io_priority_class& argument with io_intent* one - pass intent to lower file (if applicable) Dropping the priority manager is: - git-rm .cc and .hh - sed out all the #include-s - fix configure.py and cmakefile The scylla-gdb.py update is a bit hairy -- it needs to use task queues list for IO classes names and shares, but to detect whether it should, it checks whether the "commitlog" group is present. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes #13963	2023-06-06 13:29:16 +03:00
Botond Dénes	4ba7810f60	multishard_mutation_query: make reader_context::lookup_readers() exception safe With regards to closing the looked-up querier if an exception is thrown. In particular, this requires closing the querier if a semaphore mismatch is detected. Move the table lookup above the line where the querier is looked up, to avoid having to handle the exception from it.	2023-05-08 07:35:39 -04:00
Botond Dénes	227b0d3f08	multishard_mutation_query: lookup_readers(): make inner lambda a coroutine Needed by the next patch. Sad, but it runs once/shard/page, so it shouldn't be noticeable.	2023-05-08 07:35:33 -04:00
Botond Dénes	0aa03f85a3	readers/multishard: reader_lifecycle_policy: add get_read_range() Allows retrieving the current read-range for the reader on the given shard (where the method is called).	2023-03-24 08:40:11 -04:00
Botond Dénes	1f51f752cc	reader_permit: refresh trace_state on new pages To make sure all tracing done on a certain page will make its way into the appropriate trace session. This is a continuation of the previous patch (which added trace pointer to the permit).	2023-03-22 04:58:10 -04:00
Botond Dénes	156e5d346d	reader_permit: keep trace_state pointer on permit And propagate it down to where it is created. This will be used to add trace points for semaphore related events, but this will come in the next patches.	2023-03-22 04:58:01 -04:00
Kefu Chai	0cb842797a	treewide: do not define/capture unused variables these warnings are found by Clang-17 after removing `-Wno-unused-lambda-capture` and '-Wno-unused-variable' from the list of disabled warnings in `configure.py`. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-02-15 22:57:18 +02:00
Avi Kivity	69a385fd9d	Introduce schema/ module Schema related files are moved there. This excludes schema files that also interact with mutations, because the mutation module depends on the schema. Those files will have to go into a separate module. Closes #12858	2023-02-15 11:01:50 +02:00
Pavel Emelyanov	9cd1f777a5	database.hh: Remove unused headers Use forward declarations when needed Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes #11667	2022-10-04 09:01:38 +03:00
Botond Dénes	7730419f5c	query-result-writer: stop when tombstone-limit is reached The query result writer now counts tombstones and cuts the page (marking it as a short one) when the tombstone limit is reached. This is to avoid timing out on large span of tombstones, especially prefixes. In the case of unpaged queries, we fail the read instead, similarly to how we do with max result size. If the limit is 0, the previous behaviour is used: tombstones are not taken into consideration at all.	2022-08-10 06:03:38 +03:00
Benny Halevy	c71ef330b2	query-request, everywhere: define and use query_id as a strong type Define query_id as a tagged_uuid So it can be differentiated from other uuid-class types. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-08-08 08:13:28 +03:00
Botond Dénes	70b4158ce0	mutation_compactor: detach_state(): make it no-op if partition was exhausted detach_state() allows the user to resume a compaction process later, without having to keep the compactor object alive. This happens by generating and returning the mutation fragments the user has to re-feed to a newly constructed compactor to bring it into the exact same state the current compactor was at the point of stopping the compaction. This state includes the partition-header (partition-start and static-row if any) and the currently active range tombstone. Detaching the state is pointless however when the compaction was stopped such that the currently compacted partition was completely exhausted. Allowing the state to be detached in this case seems benign but it caused a subtle bug in the main user of this feature: the partition range scan algorithm, where the fragments included in the detached state were pushed back into the reader which produced them. If the partition happened to be exhausted -- meaning the next fragment in the reader was a partition-start or EOS -- this resulted in the partition being re-emitted later without a partition-end, resulting in corrupt query-result being generated, in turn resulting in an obscure "IDL frame truncated" error. This patch solves this seemingly benign but sinister bug by making the return value of `detach_state()` an std::optional and returning a disengaged optional when the partition was exhausted.	2022-08-02 06:43:24 +03:00
Botond Dénes	cdd3a364cb	querier: use full_position in shard_mutation_querier Instead of a separate partition key and position-in-partition. This continues the recently started effort to standardize storing of full positions on `full_position`. This patch is also a hidden preparation for read_context::save_readers() (multishard_mutation_query.cc) no longer being able to get partition key from compaction state in the future.	2022-08-02 06:43:24 +03:00
Avi Kivity	00cec159d6	Revert "Merge 'multishard_mutation_query: don't unpop partition header of spent partition' from Botond Dénes" This reverts commit `c3bad157e5`, reversing changes made to `e66809d051`. The checks it adds are triggered by some dtests. While it's possible the check is triggered due to an existing problem, better to investigate it out-of-tree. Fixes #11169.	2022-07-31 15:24:33 +03:00
Botond Dénes	f119554106	mutation_compactor: detach_state(): make it no-op if partition was exhausted detach_state() allows the user to resume a compaction process later, without having to keep the compactor object alive. This happens by generating and returning the mutation fragments the user has to re-feed to a newly constructed compactor to bring it into the exact same state the current compactor was at the point of stopping the compaction. This state includes the partition-header (partition-start and static-row if any) and the currently active range tombstone. Detaching the state is pointless however when the compaction was stopped such that the currently compacted partition was completely exhausted. Allowing the state to be detached in this case seems benign but it caused a subtle bug in the main user of this feature: the partition range scan algorithm, where the fragments included in the detached state were pushed back into the reader which produced them. If the partition happened to be exhausted -- meaning the next fragment in the reader was a partition-start or EOS -- this resulted in the partition being re-emitted later without a partition-end, resulting in corrupt query-result being generated, in turn resulting in an obscure "IDL frame truncated" error. This patch solves this seemingly benign but sinister bug by making the return value of `detach_state()` an std::optional and returning a disengaged optional when the partition was exhausted.	2022-07-28 09:02:26 +03:00
Botond Dénes	afa694a20c	querier: use full_position in shard_mutation_querier Instead of a separate partition key and position-in-partition. This continues the recently started effort to standardize storing of full positions on `full_position`. This patch is also a hidden preparation for read_context::save_readers() (multishard_mutation_query.cc) no longer being able to get partition key from compaction state in the future.	2022-07-28 08:19:23 +03:00
Botond Dénes	ac9935b645	multishard_mutation_query: remove now pointless compact_for_result_state typedef No need to switch on the now defunct emit_only_live_rows.	2022-07-12 08:44:33 +03:00
Botond Dénes	4d2ce5c304	mutation_compactor: remove emit_only_live_rows template parameter Now that we use emit_only_live_rows::no everywhere we can remove this template parameter. Only the template parameter is removed, the internal logic around it is left in place (will be removed in a next patch), by hard-wiring `only_live()`.	2022-07-12 08:43:49 +03:00
Botond Dénes	bedc82e52c	tree: use emit_only_live_rows::no emit_only_live_rows is a convenience so downstream consumers of the mutation compactors don't have to check the `bool is_live` already passed to them. This convenience however causes a template parameter and additional logic for the compactor. As the most prominent of these consumers (the query result builder) will soon have to switch to emit_only_live_rows::no for other reasons anyway (it will want to count tombstones), we take the opportunity to switch everybody to ::no. This can be done with very little additional complexity to these consumers -- basically an additional if or two. This prepares the ground for removing this template parameter and the associated logic from the compactor.	2022-07-12 08:41:51 +03:00
Botond Dénes	742dc10185	querier: querier_cache: de-override insert() methods Soon, the currently two distinct types of queriers will be merged, as the template parameter differentiating them will be gone. This will make using type based overload for insert() impossible, as 2 out of the 3 types will be the same. Use different names instead.	2022-07-12 08:41:48 +03:00
Botond Dénes	fd5f8f2275	query: have replica provide the last position Use the recently introduced query-result facility to have the replica set the position where the query should continue from. For now this is the same as what the implicit position would have been previously (last row in result), but it opens up the possibility to stop the query at a dead row.	2022-06-23 13:36:24 +03:00
Botond Dénes	7b6b7a49cd	multishard_mutation_query: propagate compaction state to result builder Not used in this patch, facilitates further patching.	2022-06-23 13:36:24 +03:00
Botond Dénes	738cb99c53	multishard_mutation_query: defer creating result builder until needed Currently the result builder is created two frames above the method in which it is actually needed. Push down a factory method instead and create it where actually used. This allows us to pass it arguments that are present only in the method which uses it.	2022-06-23 13:36:24 +03:00
Botond Dénes	58d53b66c1	querier: rely on compactor for position tracking For some time now the compactor tracks its own position. The querier can make use of this instead of duplicating this effort.	2022-06-23 13:36:24 +03:00
Benny Halevy	5babc609c6	multishard_mutation_query: do_query: coroutinize save_readers lambda To keep it simple. It is unlikely to throw. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-06-08 09:31:17 +03:00
Benny Halevy	921092955b	multishard_mutation_query: do_query: prevent exceptions using coroutine::as_future Optimize error handling by preventing exception try/catch using coroutine::as_future. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-06-08 09:31:17 +03:00
Benny Halevy	7a76ba4038	multishard_mutation_query: read_page: prevent exceptions using coroutine::as_future Optimize error handling by preventing exception try/catch using coroutine::as_future to get query::consume_page's result. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-06-08 09:31:15 +03:00
Benny Halevy	817a0f316a	multishard_mutation_query: save_readers: fixup indentation Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-06-08 09:23:14 +03:00
Benny Halevy	804d727b8b	multishard_mutation_query: coroutinize save_readers And use smp::invoke_on_all rather than a home-brewed version of parallel_for_each over all shard ids. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-06-08 09:23:14 +03:00
Benny Halevy	22e5352cc2	multishard_mutation_query: lookup_readers: make noexcept So it can be co_awaited efficiently using coroutine::as_future, otherwise, any exceptions will escape `as_future`. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-06-08 09:23:14 +03:00
Benny Halevy	ea3935507e	multishard_mutation_query: optimize lookup_readers No need to call _db.invoke_on inside a parallel_for_each loop over all shards. Just use _db.invoke_on_all instead. Besides that, there's no need for a .then continuation for assigning the per-shard reader in _readers[shard]. It can be done by the functor running on each db shard. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-06-08 09:23:14 +03:00
Benny Halevy	055141fc2e	multishard_mutation_query: do_query: stop ctx if lookup_readers fails lookup_readers might fail after populating some readers and those better be closed before returning the exception. Fixes #10351 Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Closes #10425	2022-04-26 11:11:52 +03:00
Botond Dénes	d0ea895671	readers: move multishard reader & friends to reader/multishard.cc Since the multishard reader family weighs more than 1K SLOC, it gets its own .cc file.	2022-03-30 15:42:51 +03:00
Botond Dénes	0b5217052d	querier: switch to v2 compactor output The change is mostly mechanical: update all compactor instances to the _v2 variant and update all call-sites, of which there are not that many. As a consequence of this patch, queries -- both single-partition and range-scans -- now do the v2->v1 conversion in the consumers, instead of in the compactor.	2022-03-11 09:24:05 +02:00
Pavel Emelyanov	063da81ab7	code: Convert nothrow construction assertions into concepts The small_vector also has N>0 constraint that's also converted Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-02-24 19:44:50 +03:00
Botond Dénes	f1e9e3b3b7	compact_mutation: drop support for v1 input	2022-02-21 12:29:24 +02:00

162 Commits