scylladb

Author	SHA1	Message	Date
Kefu Chai	372a4d1b79	treewide: do not define FMT_DEPRECATED_OSTREAM since we do not rely on FMT_DEPRECATED_OSTREAM to define the fmt::formatter for us anymore, let's stop defining `FMT_DEPRECATED_OSTREAM`. in this change, * utils: drop the range formatters in to_string.hh and to_string.c, as we don't use them anymore. and the tests for them in test/boost/string_format_test.cc are removed accordingly. * utils: use fmt to print chunk_vector and small_vector. as we are not able to print the elements using operator<< anymore after switching to {fmt} formatters. * test/boost: specialize fmt::details::is_std_string_like<bytes> due to a bug in {fmt} v9, {fmt} fails to format a range whose element type is `basic_sstring<uint8_t>`, as it considers it as a string-like type, but `basic_sstring<uint8_t>`'s char type is signed char, not char. this issue does not exist in {fmt} v10, so, in this change, we add a workaround to explicitly specialize the type trait to assure that {fmt} format this type using its `fmt::formatter` specialization instead of trying to format it as a string. also, {fmt}'s generic ranges formatter calls the pair formatter's `set_brackets()` and `set_separator()` methods when printing the range, but operator<< based formatter does not provide these method, we have to include this change in the change switching to {fmt}, otherwise the change specializing `fmt::details::is_std_string_like<bytes>` won't compile. * test/boost: in tests, we use `BOOST_REQUIRE_EQUAL()` and its friends for comparing values. but without the operator<< based formatters, Boost.Test would not be able to print them. after removing the homebrew formatters, we need to use the generic `boost_test_print_type()` helper to do this job. so we are including `test_utils.hh` in tests so that we can print the formattable types. * treewide: add "#include "utils/to_string.hh" where `fmt::formatter<optional<>>` is used. * configure.py: do not define FMT_DEPRECATED_OSTREAM * cmake: do not define FMT_DEPRECATED_OSTREAM Refs #13245 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2024-04-19 22:57:36 +08:00
Kefu Chai	a439ebcfce	treewide: include fmt/ranges.h and/or fmt/std.h before this change, we rely on the default-generated fmt::formatter created from operator<<, but fmt v10 dropped the default-generated formatter. in this change, we include `fmt/ranges.h` and/or `fmt/std.h` for formatting the container types, like vector, map optional and variant using {fmt} instead of the homebrew formatter based on operator<<. with this change, the changes adding fmt::formatter and the changes using ostream formatter explicitly, we are allowed to drop `FMT_DEPRECATED_OSTREAM` macro. Refs scylladb#13245 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2024-04-19 22:56:16 +08:00
Avi Kivity	26c8470f65	treewide: use #include <seastar/...> for seastar headers We treat Seastar as an external library, so fix the few places that didn't do so to use angle brackets. Closes #14037	2023-06-06 08:36:09 +03:00
Kefu Chai	94c6df0a08	treewide: use fmtlib when printing UUID this change tries to reduce the number of callers using operator<<() for printing UUID. they are found by compiling the tree after commenting out `operator<<(std::ostream& out, const UUID& uuid)`. but this change alone is not enough to drop all callers, as some callers are using `operator<<(ostream&, const unordered_map&)` and other overloads to print ranges whose elements contain UUID. so in order to limit the scope of the change, we are not changing them here. Refs #13245 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-03-20 15:38:45 +08:00
Kefu Chai	c37f4e5252	treewide: use fmt::join() when appropriate now that fmtlib provides fmt::join(). see https://fmt.dev/latest/api.html#_CPPv4I0EN3fmt4joinE9join_viewIN6detail10iterator_tI5RangeEEN6detail10sentinel_tI5RangeEEERR5Range11string_view there is not need to revent the wheel. so in this change, the homebrew join() is replaced with fmt::join(). as fmt::join() returns an join_view(), this could improve the performance under certain circumstances where the fully materialized string is not needed. please note, the goal of this change is to use fmt::join(), and this change does not intend to improve the performance of existing implementation based on "operator<<" unless the new implementation is much more complicated. we will address the unnecessarily materialized strings in a follow-up commit. some noteworthy things related to this change: * unlike the existing `join()`, `fmt::join()` returns a view. so we have to materialize the view if what we expect is a `sstring` * `fmt::format()` does not accept a view, so we cannot pass the return value of `fmt::join()` to `fmt::format()` * fmtlib does not format a typed pointer, i.e., it does not format, for instance, a `const std::string`. but operator<<() always print a typed pointer. so if we want to format a typed pointer, we either need to cast the pointer to `void` or use `fmt::ptr()`. * fmtlib is not able to pick up the overload of `operator<<(std::ostream& os, const column_definition* cd)`, so we have to use a wrapper class of `maybe_column_definition` for printing a pointer to `column_definition`. since the overload is only used by the two overloads of `statement_restrictions::add_single_column_parition_key_restriction()`, the operator<< for `const column_definition*` is dropped. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-03-16 20:34:18 +08:00
Avi Kivity	6aa91c13c5	Merge 'Optimize topology::compare_endpoints' from Benny Halevy The code for compare_endpoints originates at the dawn of time (`bc034aeaec`) and is called on the fast path from storage_proxy via `sort_by_proximity`. This series considerably reduces the function's footprint by: 1. carefully coding the many comparisons in the function so to reduce the number of conditional banches (apparently the compiler isn't doing a good enough job at optimizing it in this case) 2. avoid sstring copy in topology::get_{datacenter,rack} Closes #12761 * github.com:scylladb/scylladb: topology: optimize compare_endpoints to_string: add print operators for std::{weak,partial}_ordering utils: to_sstring: deinline std::strong_ordering print operator move to_string.hh to utils/ test: network_topology: add test_topology_compare_endpoints	2023-03-07 15:17:19 +02:00
Kefu Chai	0cb842797a	treewide: do not define/capture unused variables these warnings are found by Clang-17 after removing `-Wno-unused-lambda-capture` and '-Wno-unused-variable' from the list of disabled warnings in `configure.py`. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-02-15 22:57:18 +02:00
Benny Halevy	25ebc63b82	move to_string.hh to utils/ Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-02-15 11:09:04 +02:00
Avi Kivity	69a385fd9d	Introduce schema/ module Schema related files are moved there. This excludes schema files that also interact with mutations, because the mutation module depends on the schema. Those files will have to go into a separate module. Closes #12858	2023-02-15 11:01:50 +02:00
Avi Kivity	c5e4bf51bd	Introduce mutation/ module Move mutation-related files to a new mutation/ directory. The names are kept in the global namespace to reduce churn; the names are unambiguous in any case. mutation_reader remains in the readers/ module. mutation_partition_v2.cc was missing from CMakeLists.txt; it's added in this patch. This is a step forward towards librarization or modularization of the source base. Closes #12788	2023-02-14 11:19:03 +02:00
Avi Kivity	2739ac66ed	treewide: drop cql_serialization_format Now that we don't accept cql protocol version 1 or 2, we can drop cql_serialization format everywhere, except when in the IDL (since it's part of the inter-node protocol). A few functions had duplicate versions, one with and one without a cql_serialization_format parameter. They are deduplicated. Care is taken that `partition_slice`, which communicates the cql_serialization_format across nodes, still presents a valid cql_serialization_format to other nodes when transmitting itself and rejects protocol 1 and 2 serialization\ format when receiving. The IDL is unchanged. One test checking the 16-bit serialization format is removed.	2023-01-03 19:54:13 +02:00
Tomasz Grabiec	23e4c83155	position_in_partition: Make after_key() work with non-full keys This fixes a long standing bug related to handling of non-full clustering keys, issue #1446. after_key() was creating a position which is after all keys prefixed by a non-full key, rather than a position which is right after that key. This will issue will be caught by cql_query_test::test_compact_storage in debug mode when mutation_partition_v2 merging starts inserting sentinels at position after_key() on preemption. It probably already causes problems for such keys.	2022-12-14 14:47:33 +01:00
Tomasz Grabiec	536c0ab194	db: Fix trim_clustering_row_ranges_to() for non-full keys and reverse order trim_clustering_row_ranges_to() is broken for non-full keys in reverse mode. It will trim the range to position_in_partition_view::after_key(full_key) instead of position_in_partition_view::before_key(key), hence it will include the key in the resulting range rather than exclude it. Fixes #12180 Refs #1446	2022-12-08 13:41:28 +01:00
Botond Dénes	8c0dd99f7c	query: result_merger::get() don't reset last-pos on short-reads and last pages When merging multiple query-results, we use the last-position of the last result in the combined one as the combined result's last position. This only works however if said last result was included fully. Otherwise we have to discard the last-position included with the result and the pager will use the position of the last row in the combined result as the last position. The commit introducing the above logic mistakenly discarded the last position when the result is a short read or a page is not full. This is not necessary and even harmful as it can result in an empty combined result being delivered to the pager, without a last-position.	2022-08-10 06:01:49 +03:00
Botond Dénes	014c5b56a3	query-result: move last_pos up to query::result query_result was the wrong place to put last position into. It is only included in data-responses, but not on digest-responses. If we want to support empty pages from replicas, both data and digest responses have to include the last position. So hoist up the last position to the parent structure: query::result. This is a breaking change inter-node ABI wise, but it is fine: the current code wasn't released yet. Closes #11072	2022-07-20 13:28:09 +03:00
Jadw1	29a0be75da	forward_service: support UDA and native aggregate parallelization Enables parallelization of UDA and native aggregates. The way the query is parallelized is the same as in #9209. Separate reduction type for `COUNT(*)` is left for compatibility reason.	2022-07-18 15:25:41 +02:00
Botond Dénes	fd5f8f2275	query: have replica provide the last position Use the recently introduced query-result facility to have the replica set the position where the query should continue from. For now this is the same as what the implicit position would have been previously (last row in result), but it opens up the possibility to stop the query at a dead row.	2022-06-23 13:36:24 +03:00
Botond Dénes	009d2fe2f7	idl/query: add last_position to query_result To be used to allow the replica to specify the last position in the stream, where the query was left off. Currently this is always the same as the implicit position -- the last row in the result-set -- but this requires only stopping the read on a live row, which is a requirement we want to lift: we want to be able to stop on a tombstone. As tombstones are not included in the query result, we have to allow the replica to overwrite the last seen position explicitly. This patch introduces the new field in the query-result IDL but it is not written to yet, nor is it read, that is left for the next patches.	2022-06-23 13:36:24 +03:00
Michał Sala	538cff651e	query: do not assert in `operator<<(ostream&, const forward_result::printer&)` Printing invalid forward_result should not cause Scylla to stop.	2022-03-09 14:58:11 +01:00
Michał Sala	51362e4e5e	query: transform asserts into on_internal_error in forward_result::merge It was done to show more context in case of forward_result::merge arguments size mismatch and also to prevent aborts caused by another nodes sending malformed data.	2022-03-09 14:58:11 +01:00
Michał Sala	fff454761a	messaging_service: add verb for count() request forwarding Except for the verb addition, this commit also defines forward_request and forward_result structures, used as an argument and result of the new rpc. forward_request is used to forward information about select statement that does count() (or other aggregating functions such as max, min, avg in the future). Due to the inability to serialize cql3::statements::select_statement, I chose to include query::read_command, dht::partition_range_vector and some configuration options in forward_request. They can be serialized and are sufficient enough to allow creation of service::pager::query_pagers::pager.	2022-02-01 21:14:41 +01:00
Avi Kivity	fcb8d040e8	treewide: use Software Package Data Exchange (SPDX) license identifiers Instead of lengthy blurbs, switch to single-line, machine-readable standardized (https://spdx.dev) license identifiers. The Linux kernel switched long ago, so there is strong precedent. Three cases are handled: AGPL-only, Apache-only, and dual licensed. For the latter case, I chose (AGPL-3.0-or-later and Apache-2.0), reasoning that our changes are extensive enough to apply our license. The changes we applied mechanically with a script, except to licenses/README.md. Closes #9937	2022-01-18 12:15:18 +01:00
Tomasz Grabiec	3226c5bf9d	Merge 'sstables: mx: enable position fast-forwarding in reverse mode' from Kamil Braun Most of the machinery was already implemented since it was used when jumping between clustering ranges of a query slice. We need only perform one additional thing when performing an index skip during fast-forwarding: reset the stored range tombstone in the consumer (which may only be stored in fast-forwarding mode, so it didn't matter that it wasn't reset earlier). Comments were added to explain the details. As a preparation for the change, we extend the sstable reversing reader random schema test with a fast-forwarding test and include some minor fixes. Fixes #9427. Closes #9484 * github.com:scylladb/scylla: query-request: add comment about clustering ranges with non-full prefix key bounds sstables: mx: enable position fast-forwarding in reverse mode test: sstable_conforms_to_mutation_source_test: extend `test_sstable_reversing_reader_random_schema` with fast-forwarding test: sstable_conforms_to_mutation_source_test: fix `vector::erase` call test: mutation_source_test: extract `forwardable_reader_to_mutation` function test: random_schema: fix clustering column printing in `random_schema::cql`	2021-11-29 16:01:53 +01:00
Kamil Braun	ea6310961c	test: sstable_conforms_to_mutation_source_test: extend `test_sstable_reversing_reader_random_schema` with fast-forwarding The test would check whether the forward and reverse readers returned consistent results when created in non-forwarding mode with slicing. Do the same but using fast-forwarding instead of slicing. To do this we require a vector of `position_range`s. We also need a vector of `clustering_range`s for the existing test. We modify the existing `random_ranges` function to return `position_range`s instead of `clustering_range`s since `position_range`s are easier to reason about, especially when we consider non-full clustering key prefixes. A function is introduced to convert a `position_range` to a `clustering_range` for the existing test.	2021-11-29 11:10:46 +01:00
Botond Dénes	15af80800a	partition_slice: init all fields in copy ctor _partition_row_limit_high_bits was left out for some reason, corrupting the per-partition row limit.	2021-11-23 14:21:50 +02:00
Botond Dénes	c372b9676d	partition_slice: operator<<: print the entire partition row limit Not just the low bits.	2021-11-23 14:21:50 +02:00
Michał Radwański	dac2509a7f	query: reverse clustering_range	2021-10-05 16:47:04 +02:00
Kamil Braun	4bd601c6fd	query-request: introduce `half_reverse_slice` A utility function for converting between forward and half-reversed (or 'legacy'-reversed) slices to be used in the next commit.	2021-09-28 17:03:57 +03:00
Botond Dénes	6b5936812f	reverse_slice(): toggle reversed bit instead of setting it reverse_slice() works in both direction: reverse <-> forward, so it cannot unconditionally set the reversed bit, instead it should toggle it.	2021-09-13 18:05:11 +03:00
Botond Dénes	5d33d76cfd	query: add slice reversing functions	2021-09-09 14:18:32 +03:00
Tomasz Grabiec	9fe3e86368	db: Print more fields of read_command Message-Id: <20210810143752.420988-1-tgrabiec@scylladb.com>	2021-08-17 12:24:40 +03:00
Tomasz Grabiec	e177cd382b	db: Remove superfluous } from read_command printout Message-Id: <20210810131429.407903-1-tgrabiec@scylladb.com>	2021-08-10 17:32:34 +03:00
Avi Kivity	a55b434a2b	treewide: extent copyright statements to present day	2021-06-06 19:18:49 +03:00
Wojciech Mitros	45215746fe	increase the maximum size of query results to 2^64 Currently, we cannot select more than 2^32 rows from a table because we are limited by types of variables containing the numbers of rows. This patch changes these types and sets new limits. The new limits take effect while selecting all rows from a table - custom limits of rows in a result stay the same (2^32-1). In classes which are being serialized and used in messaging, in order to be able to process queries originating from older nodes, the top 32 bits of new integers are optional and stay at the end of the class - if they're absent we assume they equal 0. The backward compatibility was tested by querying an older node for a paged selection, using the received paging_state with the same select statement on an upgraded node, and comparing the returned rows with the result generated for the same query by the older node, additionally checking if the paging_state returned by the upgraded node contained new fields with correct values. Also verified if the older node simply ignores the top 32 bits of the remaining rows number when handling a query with a paging_state originating from an upgraded node by generating and sending such a query to an older node and checking the paging_state in the reply(using python driver). Fixes #5101.	2020-08-03 17:32:49 +02:00
Botond Dénes	c364c7c6a2	result_memory_limiter: add unlimited_result_size constant To be used as the max result size for internal queries.	2020-07-28 18:00:29 +03:00
Konstantin Osipov	191acec7ab	schema: rename column_mask to column_set Since it contains a precise set of columns, it's more accurate to call it a set, not a mask. Besides, the name column_mask is already used for column options on storage level.	2019-11-13 11:41:30 +03:00
Konstantin Osipov	c0f0ab5edd	lwt: introduce column mask Introduce a bitset container which can be used to compute all columns used in a query. Add a partition_slice constructor which uses the bitset.	2019-10-16 22:40:55 +03:00
Botond Dénes	4cb873abfe	query::trim_clustering_row_ranges_to(): fix handling of non-full prefix keys Non-full prefix keys are currently not handled correctly as all keys are treated as if they were full prefixes, and therefore they represent a point in the key space. Non-full prefixes however represent a sub-range of the key space and therefore require null extending before they can be treated as a point. As a quick reminder, `key` is used to trim the clustering ranges such that they only cover positions >= then key. Thus, `trim_clustering_row_ranges_to()` does the equivalent of intersecting each range with (key, inf). When `key` is a prefix, this would exclude all positions that are prefixed by key as well, which is not desired. Fixes: #4839 Signed-off-by: Botond Dénes <bdenes@scylladb.com> Message-Id: <20190819134950.33406-1-bdenes@scylladb.com>	2019-08-20 00:24:51 +02:00
Botond Dénes	87973498a1	query: refactor trim_clustering_row_ranges_to() Allow expressing `pos` in term of a `position_in_partition_view`, which allows finer control of the exact position, allowing specifying position before, at or after a certain key. The previous overload is kept for backward compatibility, invoking the new overload behind the curtains.	2019-08-13 09:47:55 +03:00
Botond Dénes	181bf64858	query: add trim_clustering_row_ranges_to() This algorithm was already duplicated in two places (service/pager/query_pagers.cc and mutation_reader.cc). Soon it will be used in a third place. Instead of triplicating, move it into a function that everybody can use.	2019-02-08 16:30:17 +02:00
Paweł Dziepak	9024187222	partition_slice: use small_vector for column_ids	2018-12-06 14:21:04 +00:00
Avi Kivity	a71ab365e3	toplevel: convert sprint() to format() sprint() recently became more strict, throwing on sprint("%s", 5). Replace with the more modern format(). Mechanically converted with https://github.com/avikivity/unsprint.	2018-11-01 13:16:17 +00:00
Duarte Nunes	baeec0935f	Replace query::full_slice with schema::full_slice() query::full_slice doesn't select any regular or static columns, which is at odds with the expectations of its users. This patch replaces it with the schema::full_slice() version. Refs #2885 Signed-off-by: Duarte Nunes <duarte@scylladb.com> Message-Id: <1507732800-9448-2-git-send-email-duarte@scylladb.com>	2017-10-17 11:25:53 +02:00
Duarte Nunes	3b9a9b7321	query-result: Send row and partition count over the wire To avoid calculating them on the coordinator side. Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2017-08-14 10:29:06 +02:00
Duarte Nunes	d7bab684ea	query::result: Optimize calculate_counts() Now that range queries go through the normal digest path, we rely on query::result::calculate_counts() to count the amount of partitions and rows returned. This patch makes it a bit faster. Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2017-08-14 10:28:29 +02:00
Tomasz Grabiec	0073df30aa	query: Introduce full_clustering_range	2017-02-23 18:50:53 +01:00
Duarte Nunes	21d1bbb527	view: Add may_be_affected_by function This patch adds the may_be_affected_by() function to the view class, which is responsible to determine whether an update to a base class affects one of its views. Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2017-02-06 13:35:30 +01:00
Paweł Dziepak	a7d694654a	query: make result_memory_limiter constants available for linker	2016-12-22 13:35:04 +01:00
Paweł Dziepak	38ee69dee0	idl: allow writers to use any output stream Original IDL generated code was hardcoded to always use bytes_ostream. This patch makes the output stream a template parameter so that any valid output stream can be used. Unfortunately, making IDL writers generic requires updates in the code that uses them, this is fixed in C++17 which would be able to deduce the parameter in most cases.	2016-12-22 13:35:04 +01:00
Asias He	e5485f3ea6	Get rid of query::partition_range Use dht::partition_range instead	2016-12-19 08:09:25 +08:00

1 2

96 Commits