scylladb

Author	SHA1	Message	Date
Avi Kivity	c5e4bf51bd	Introduce mutation/ module Move mutation-related files to a new mutation/ directory. The names are kept in the global namespace to reduce churn; the names are unambiguous in any case. mutation_reader remains in the readers/ module. mutation_partition_v2.cc was missing from CMakeLists.txt; it's added in this patch. This is a step forward towards librarization or modularization of the source base. Closes #12788	2023-02-14 11:19:03 +02:00
Nadav Har'El	b28818db06	Merge 'Make regexes in types.cc static and remove unnecessary tolower transform' from Marcin Maliszkiewicz - makes all regexes static If making regex compilation static for uuid_type_impl and timeuuid_type_impl helps then it should also help for timestamp_type and simple_date_type. - remove unnecessary tolower transform in simple_date_type_impl::from_sstring Following function uses only decimal and '-' characters (see date_re). They are not affected by tolower call in any way. Additionally std::strtoll supports "0x" prefixes but also accepts upper case version "0X" so it's also not affected by tolower call. get_simple_date_time only casts strings to integer types using boost::lexical_cast so also not affected by tolower. Finally, serialize only uses str to include it in an exception text so tolower doesn't affect it in a positive way. It's even better that input is displayed to the user as it was, not converted to lower case. Closes #12621 * github.com:scylladb/scylladb: types: remove unnecessary tolower transform in simple_date_type_impl::from_sstring types: make all regexes static	2023-01-24 16:13:59 +02:00
Marcin Maliszkiewicz	f4de64957b	types: remove unnecessary tolower transform in simple_date_type_impl::from_sstring Following function uses only decimal and '-' characters (see date_re). They are not affected by tolower call in any way. Additionally std::strtoll supports "0x" prefixes but also accepts upper case version "0X" so it's also not affected by tolower call. get_simple_date_time only casts strings to integer types using boost::lexical_cast so also not affected by tolower. Finally, serialize only uses str to include it in an exception text so tolower doesn't affect it in a positive way. It's even better that input is displayed to the user as it was, not converted to lower case.	2023-01-24 10:50:13 +01:00
Marcin Maliszkiewicz	76c1d0e5d3	types: make all regexes static If making regex compilation static for uuid_type_impl and timeuuid_type_impl helps then it should also help for timestamp_type and simple_date_type.	2023-01-23 20:37:32 +01:00
Botond Dénes	ebc100f74f	types: is_tuple(): handle reverse types Currently reverse types match the default case (false), even though they might be wrapping a tuple type. One user-visible effect of this is that a schema, which has a reversed<frozen<UDT>> clustering key component, will have this component incorrectly represented in the schema cql dump: the UDT will lose the frozen attribute. When attempting to recreate this schema based on the dump, it will fail as only frozen UDTs are allowed in primary key components. Fixes: #12576 Closes #12579	2023-01-20 15:50:58 +02:00
Avi Kivity	390a0ca47b	types: allow lists with NULL Allow transient lists that contain NULL throughout the evaluation machinery. This makes it possible to evaluate things like `IF col IN (1, 2, NULL)` without hacks, once LWT conditions are converted to expressions. A few tests are relaxed to accommodate the new behavior: - cql_query_test's test_null_and_unset_in_collections is relaxed to allow `WHERE col IN ?`, with the variable bound to a list containing NULL; now it's explicitly allowed - expr_test's evaluate_bind_variable_validates_no_null_in_list was checking generic lists for NULLs, and was similarly relaxed (and renamed) - expr_test's evaluate_bind_variable_validates_null_in_lists_recursively was similarly relaxed to allow NULLs.	2023-01-18 10:38:24 +02:00
Avi Kivity	00145f9ada	test: relax NULL check test predicate When we start allowing NULL in lists in some contexts, the exact location where an error is raised (when it's disallowed) will change. To prepare for that, relax the exception check to just ensure the word NULL is there, without caring about the exact wording.	2023-01-18 10:38:24 +02:00
Avi Kivity	5f8540ecfa	cql3, types: validate listlike collections (sets, lists) for storage Lists allow NULL in some contexts (bind variables for LWT "IN ?" conditions), but not in most others. Currently, the implementation just disallows NULLs in list values, and the cases where it is allowed are hacked around. To reduce the special cases, we'll allow lists to have NULLs, and just restrict them for storage. This is similar to how scalar values can be NULL, but not when they are part of a partition key. To prepare for the transition, identify the locations where lists (and sets, which share the same storage) are stored as frozen values and add a NULL check there. Non-frozen lists already have the check. Since sets share the same format as lists, apply the same to them. No actual checks are done yet, since NULLs are impossible. This is just a stub.	2023-01-18 10:38:24 +02:00
Avi Kivity	da4abccf89	types: make empty type deserialize to non-null value The empty type is used internally to implement CQL sets on top of multi-cell maps. The map's key (an atomic cell) represents the set value, and the map's value is discarded. Since it's unneeded we use an internal "empty" type. Currently, it is deserialized into a `data_value` object representing a NULL. Since it's discarded, it really doesn't matter. However, with the impending change to change lists to allow NULLs, it does matter: 1. the coordinator sets the 'collections_as_maps' flag for LWT requests since it wants list indexes (this affects sets too). 2. the replica responds by serializing a set as a map. 3. since we start allowing NULL collection values, we now serialize those NULLs as NULLs. 4. the coordinator deserializes the map, and complains about NULL values, since those are not supported. The solution is simple, deserialize the empty value as a non-NULL object. We create an empty empty_type_representation and add the scaffolding needed. Serialization and deserialization is already coded, it was just never called for NULL values (which were serialized with size 0, in collections, rather than size -1, luckily). A unit test is added.	2023-01-18 10:38:24 +02:00
Michał Chojnowski	9e17564c70	types: add some missing explicit instantiations Some functions defined by a template in types.cc are used in other translation units (via `cql3/untyped_result_set.hh`), but aren't explicitly instantiated. Therefore their linking can fail, depending on inlining decisions. (I experienced this when playing with compiler options). Fix that. Closes #12539	2023-01-17 10:46:01 +02:00
Avi Kivity	2739ac66ed	treewide: drop cql_serialization_format Now that we don't accept cql protocol version 1 or 2, we can drop cql_serialization_format everywhere, except when in the IDL (since it's part of the inter-node protocol). A few functions had duplicate versions, one with and one without a cql_serialization_format parameter. They are deduplicated. Care is taken that `partition_slice`, which communicates the cql_serialization_format across nodes, still presents a valid cql_serialization_format to other nodes when transmitting itself and rejects protocol 1 and 2 serialization format when receiving. The IDL is unchanged. One test checking the 16-bit serialization format is removed.	2023-01-03 19:54:13 +02:00
Botond Dénes	8f8284783a	Merge 'Fix handling of non-full clustering keys in the read path' from Tomasz Grabiec This PR fixes several bugs related to handling of non-full clustering keys. One is in trim_clustering_row_ranges_to(), which is broken for non-full keys in reverse mode. It will trim the range to position_in_partition_view::after_key(full_key) instead of position_in_partition_view::before_key(key), hence it will include the key in the resulting range rather than exclude it. Fixes #12180 after_key() was creating a position which is after all keys prefixed by a non-full key, rather than a position which is right after that key. This issue will be caught by cql_query_test::test_compact_storage in debug mode when mutation_partition_v2 merging starts inserting sentinels at position after_key() on preemption. It probably already causes problems for such keys as after_key() is used in various parts in the read path. Refs #1446 Closes #12234 * github.com:scylladb/scylladb: position_in_partition: Make after_key() work with non-full keys position_in_partition: Introduce before_key(position_in_partition_view) db: Fix trim_clustering_row_ranges_to() for non-full keys and reverse order types: Fix comparison of frozen sets with empty values	2022-12-15 10:47:12 +02:00
Michał Jadwiszczak	29ad5a08a8	implement `keyspace_element` interface This patch implements `data_dictionary::keyspace_element` interface in: `keyspace_metadata`, `user_type_impl`, `user_function`, `user_aggregate` and schema.	2022-12-10 12:34:09 +01:00
Tomasz Grabiec	232ce699ab	types: Fix comparison of frozen sets with empty values A frozen set can be part of the clustering key, and with compact storage, the corresponding key component can have an empty value. Comparison was not prepared for this, the iterator attempts to deserialize the item count and will fail if the value is empty. Fixes #12242	2022-12-08 13:41:11 +01:00
Benny Halevy	79000bc02e	bit_cast: use std::bit_cast Now that scylla requires c++20 there's no need to define our own implementation in utils/bit_cast.hh Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-08-08 08:02:27 +03:00
Botond Dénes	5ea6700e23	types: publish timestamp_from_string() It looks like it is a better option for timestamp parsing than anything current C++ stdlib can offer. What a pity.	2022-08-02 10:33:01 +03:00
Benny Halevy	751eceb2e6	types: time_point_to_string: use numeric formatting rather than chrono-format specifiers As reported in #10867, newer versions of the fmt library format %Y using 4-characters width, 0-padding the prefix when needed, while older versions don't do that. This change moves away from using %Y and friends fmt specifiers to using explicit numeric-based formatting conforming to ISO 8601 and making sure the year field has at least 4 digits and is zero padded. When negative, the width is upped to 5 so it would show as -0001 rather than -001. The unit test was updated respectively. Fixes #10867 Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Closes #10870	2022-06-27 08:28:56 +03:00
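The year-padding rule described in 751eceb2e6 can be sketched as follows. This is a minimal illustration only, not the code from that commit: `format_year` is a hypothetical helper, and it uses `std::snprintf` rather than the fmt library that the commit refers to.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper (not ScyllaDB's time_point_to_string): formats a year
// with at least 4 zero-padded digits. The printf width includes the sign,
// so negative years use width 5 to keep four digits after the '-'.
inline std::string format_year(long year) {
    char buf[32];
    const int width = year < 0 ? 5 : 4;
    std::snprintf(buf, sizeof(buf), "%0*ld", width, year);
    return std::string(buf);
}
```

With this sketch, `format_year(1)` yields `0001` and `format_year(-1)` yields `-0001`, matching the `-0001` rather than `-001` behavior the commit message describes.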
Benny Halevy	87ee6c3722	types: time_point_to_string: harden against out of range timestamps The time point is multiplied by an adjustment factor of 1000 for boost::posix_time::time_duration::ticks_per_second() = 1000000 when calling boost::posix_time::milliseconds(count). That may lead to integer overflow as reported by the UndefinedBehaviorSanitizer. See https://github.com/scylladb/scylla/issues/10830#issuecomment-1158899187 This change uses gmtime_r to convert seconds since unix epoch to std::tm and the fmt library to format the iso representation of the time_point to avoid exceptions and undefined behavior. gmtime_r may still detect an overflow "when the year does not fit into an integer" (see ctime(3)). In this case we return a backward compatible representation of "{count} milliseconds (out of range)". Refs #10830 Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-06-21 08:08:57 +03:00
Nadav Har'El	0040e9e7f4	Merge 'cql: Add proper validation for null and unset inside collections send as bound values' from Jan Ciołek Let's say we have a query like: ```cql INSERT INTO ks.t (list_column) VALUES (?); ``` And the driver sends a list with null inside as the bound value, something like `[1, 2, null, 4]`. In such case we should throw `invalid_request_exception` because `nulls` are not allowed inside collections. Currently when a query like this gets executed Scylla throws an ugly marshalling error. This is because the validation code reads size of the next element, interprets it as an unsigned integer and tries to read this much. In case of `null` element the size is `-1`, which when converted to unsigned `size_t` gives 18446744073709551615 and it fails to read this much. This PR adds proper validation checks to make the error message better. I also added some tests. I originally tried to write them in python, but python driver really doesn't like sending invalid values. Trying to send `[1, None, 2]` results in a list with empty value instead of null. Trying to send `[1, UNSET_VALUE, 2]` Fails before query even leaves the driver. Fixes #10580 Closes #10599 * github.com:scylladb/scylla: cql3: Add tests for null and unset inside collections cql3: Add null and unset checks in collection validation	2022-05-19 11:25:24 +03:00
cvybhu	7adc572ec6	cql3: Add tests for null and unset inside collections Add a bunch of tests that test what happens when there is a null or unset value inside collections. They are not allowed so every such attempt should end with invalid_request_exception with proper message. I had to write a new function for collection serialization. I tried to use data_value and its methods, but it's impossible to create a data_value that represents an unset value. Signed-off-by: cvybhu <jan.ciolek@scylladb.com>	2022-05-19 00:15:17 +02:00
cvybhu	345e89756b	cql3: Add null and unset checks in collection validation Validating a collection should ensure that there are no null or unset values inside the collection. The validation already fails in case of such values, but it does so in an ugly way. Length of null and unset value is negative but is cast to unsigned size_t. Then it tries to read a really large value and fails with marshalling error. The new checks are a better way to handle this. Signed-off-by: cvybhu <jan.ciolek@scylladb.com>	2022-05-18 11:05:14 +02:00
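The failure mode described in 345e89756b and its merge can be illustrated with a small sketch. The function names below are hypothetical and the exception type is a stand-in for `invalid_request_exception`; the real validation code in ScyllaDB is structured differently.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>

// Unchecked conversion: a null/unset element has wire length -1, which
// becomes the largest size_t value (18446744073709551615 with a 64-bit
// size_t), so the subsequent read fails with a marshalling error.
inline std::size_t unchecked_element_size(std::int32_t wire_len) {
    return static_cast<std::size_t>(wire_len);
}

// Checked conversion (assumed shape of the fix): reject negative lengths
// up front with a clear message instead of trying to read a huge value.
inline std::size_t checked_element_size(std::int32_t wire_len) {
    if (wire_len < 0) {
        throw std::invalid_argument(
            "null or unset value is not allowed inside a collection");
    }
    return static_cast<std::size_t>(wire_len);
}
```

The point of the check is only to fail earlier with a readable error; the request is rejected in both cases.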
Michał Sala	f6bdc4d694	cql3: expr: add printer for expression expression::printer is used to print CQL expressions in a pretty way that allows them to be parsed back to the same representation. There is a bunch of things that need to be changed when compared to the current implementation of operator<<(expression) to output something parsable. column names should be printed without 'unresolved_identifier()' and sometimes they need to be quoted to preserve case sensitivity. I needed to write new code for printing constant values because the current one did debug printing (e.g. a set was printed as '1; 2; 3'). A list of IN values should be printed inside () instead of [], but because it is internally represented as a list it is by default printed with []. To fix this a temporary tuple_constructor is created and printed. Signed-off-by: cvybhu <jan.ciolek@scylladb.com>	2022-05-16 18:17:58 +02:00
Piotr Sarna	0a068cddb1	types: fix is_string for reversed types Checking if the type is string is subtly broken for reversed types, and these types will not be recognized as strings, even though they are. As a result, if somebody creates a column with DESC order and then tries to use operator LIKE on it, it will fail because the type would not be recognized as a string.	2022-03-09 08:18:33 +01:00
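The reversed-type problem in 0a068cddb1 (and the similar is_tuple() fix in ebc100f74f) comes down to a predicate that must look through the reversed wrapper. The following is a toy model under stated assumptions: `kind`, `type_sketch` and this `is_string` are invented for illustration and do not mirror ScyllaDB's `abstract_type` hierarchy.

```cpp
#include <memory>

// Toy type model (hypothetical): a reversed type wraps an underlying type.
enum class kind { utf8, int32, reversed };

struct type_sketch {
    kind k;
    std::shared_ptr<const type_sketch> underlying; // set only for reversed
};

// A predicate that only checked `k == kind::utf8` would answer false for
// reversed<utf8>. Unwrapping the reversed type first avoids that.
inline bool is_string(const type_sketch& t) {
    if (t.k == kind::reversed) {
        return is_string(*t.underlying);
    }
    return t.k == kind::utf8;
}
```

In this model a DESC-ordered text column corresponds to `type_sketch{kind::reversed, <utf8>}`, for which `is_string` returns true.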
Avi Kivity	fcb8d040e8	treewide: use Software Package Data Exchange (SPDX) license identifiers Instead of lengthy blurbs, switch to single-line, machine-readable standardized (https://spdx.dev) license identifiers. The Linux kernel switched long ago, so there is strong precedent. Three cases are handled: AGPL-only, Apache-only, and dual licensed. For the latter case, I chose (AGPL-3.0-or-later and Apache-2.0), reasoning that our changes are extensive enough to apply our license. The changes were applied mechanically with a script, except to licenses/README.md. Closes #9937	2022-01-18 12:15:18 +01:00
Nadav Har'El	6ebf32f4d7	types: deinline template throw_with_backtrace<marshal_exception, sstring> When a template is instantiated in a header file which is included by many source files, the compiler needs to compile it again and again. ClangBuildAnalyzer helps find the worst cases of this happening, and one of the worst happens to be seastar::throw_with_backtrace<marshal_exception, sstring> This specific template function takes (according to ClangBuildAnalyzer) 362 milliseconds to instantiate, and this is done 312 (!) times, because it reaches virtually every Scylla source file via either types.hh or compound.hh which use this idiom. Unfortunately, C++ as it exists today does not have a mechanism to avoid compiling a specific template instantiation if this was already done in some other source file. But we can do this manually using the C++11 feature of "extern template": 1. For a specific template instance, in this case seastar::throw_with_backtrace<marshal_exception, sstring>, all source files except one specify it as "extern template". This means that the code for it will NOT be built in this source file, and the compiler assumes the linker will eventually supply it. 2. At the same time, one source file instantiates this template instance once regularly, without "extern". The numbers from ClangBuildAnalyzer suggest that this patch should reduce total build time by 1% (in dev build mode), but this is hard to measure in practice because the very long build time (210 CPU minutes on my laptop) usually fluctuates by more than 1% in consecutive runs. However, we've seen in the past that a good estimate of build time is the total produced object size (du -bc build/dev/*/.o). This patch indeed reduces this total object size (in dev build mode) by exactly 1%. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20220105171453.308821-1-nyh@scylladb.com>	2022-01-05 19:23:40 +02:00
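The "extern template" idiom from 6ebf32f4d7 can be sketched in a single file. The template below is a hypothetical stand-in for `seastar::throw_with_backtrace`, and `std::runtime_error` / `std::string` stand in for `marshal_exception` / `sstring`; in a real code base the two marked parts would live in a header and in exactly one source file.

```cpp
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical stand-in for a widely used helper template.
template <typename Exc, typename Arg>
[[noreturn]] void throw_with_context(Arg arg) {
    throw Exc(std::move(arg));
}

// Part 1 (would go in the shared header): explicit instantiation
// declaration. Translation units that see it do not instantiate this
// specialization themselves and rely on the linker to supply it.
extern template void throw_with_context<std::runtime_error, std::string>(std::string);

// Part 2 (would go in exactly one .cc file): explicit instantiation
// definition, which emits the code once.
template void throw_with_context<std::runtime_error, std::string>(std::string);
```

Having both the declaration and the definition in one translation unit is allowed as long as the definition follows the declaration, which is why this sketch compiles as a single file.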
Nadav Har'El	6012f6f2b6	build performance: do not include <seastar/net/ip.hh> In a previous patch, we noticed that the header file <gm/inet_address.hh>, which is included, directly or indirectly, by most source files, includes <seastar/net/ip.hh> which is very slow to compile, and replaced it by the much faster-to-include <seastar/net/ipv[46]_address.hh>. However, we also included <seastar/net/ip.hh> in types.hh - and that too is included by almost every file, so the actual saving from the above patch was minimal. So in this patch we replace this include too. After this patch Scylla does not include <seastar/net/ip.hh> at all. According to ClangBuildAnalyzer, this reduces the average time to include types.hh (multiply this by 312 times!) from 4 seconds to 1.8 seconds, and reduces total build time (dev mode) by about 3%. Some of the source files were now missing some include directives, that were previously included in ip.hh - so we need to add those explicitly. Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2022-01-05 17:29:21 +02:00
Avi Kivity	df73d12272	types: remove recursive constraint in deserialize_value deserialize_value() has a constraint that depends on another deserialize_value() implementation. Apparently gcc wants to instantiate the deserialize_value() instance we're constraining while evaluating the constraint, leading to a loop. Since this deserialize_value() is just an internal helper, drop the constraint rather than fighting it.	2021-10-10 18:16:50 +03:00
Jan Ciolek	e9f24edc9b	cql3: types: Optimize abstract_type::contains_collection contains_collection() and contains_set_or_map() used to be calculated on each call(). Now the result is calculated only once during type creation. Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>	2021-09-24 13:45:38 +02:00
Jan Ciolek	5589f348e7	cql3: expr: Implement evaluate(expr::bind_variable) Implement evaluating a bind_variable. To be able to evaluate a bind_variable we need to know the type of the bound value. This is why a data_type has been added to the bind_variable struct. There are some quirks when evaluating a bind_variable. The first problem occurs when the variable has been sent with an older cql serialization format and contains collections. In that case the value has to be reserialized to use the newest cql serialization format. The second problem occurs when there is a set or a map in the value. The set value sent by the driver might not have the elements in the correct order, contain duplicates etc. When a set or map is detected in the value it is reserialized as well. collection_type_impl::reserialize doesn't work for this purpose, because it uses data_value which does not perform sorting or removal. New code corresponds to old bind() of lists::marker in cql3/lists.cc, sets::marker etc. Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>	2021-09-24 11:05:53 +02:00
Jan Ciolek	e621cbaa32	cql3: Add contains_collection/set_or_map to abstract_type Sometimes we need to know whether some type contains some collection, set, or map inside. Introduce two functions that provide this information. Information about collection is useful for reserializing values with old serialization format. Information about set/map is useful for reserializing sets and maps to remove duplicates. Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>	2021-09-24 11:05:53 +02:00
Botond Dénes	183ac6981a	types: add reversed(data_type) Reversing the sort order of a type.	2021-09-09 11:49:05 +03:00
Avi Kivity	0909e3c17d	treewide: remove redundant "x <=> 0" compares If x is of type std::strong_ordering, then "x <=> 0" is equivalent to x. These no-ops were inserted during #1449 fixes, but are now unnecessary. They have potential for harm, since they can hide an accidental conversion of the type of x to an arithmetic type, so remove them. Ref #1449.	2021-07-28 13:30:32 +03:00
Avi Kivity	e52ebe2da5	types: convert abstract_type::compare and related to std::strong_ordering Change comparators around types to std::strong_ordering. Ref #1449.	2021-07-28 13:19:24 +03:00
Avi Kivity	b7160b74ea	types: reduce boilerplate when comparing empty value Some types have boilerplate code to check if one or both values are empty. Consolidate it in a helper to reduce noise.	2021-07-28 13:19:09 +03:00
Avi Kivity	9059514335	build, treewide: enable -Wpessimizing-move warning This warning prevents using std::move() where it can hurt - on an unnamed temporary or a named automatic variable being returned from a function. In both cases the value could be constructed directly in its final destination, but std::move() prevents it. Fix the handful of cases (all trivial), and enable the warning. Closes #8992	2021-07-08 17:52:34 +03:00
Avi Kivity	a55b434a2b	treewide: extend copyright statements to present day	2021-06-06 19:18:49 +03:00
Avi Kivity	789757a692	Merge 'cql3: represent lists as chunked_vector instead of std::vector' from Michał Chojnowski The cql3 layer manipulates lists as `std::vector`s (of `managed_bytes_opt`). Since lists can be arbitrarily large, let's use chunked vectors there to prevent potentially large contiguous allocations. Closes #8668 * github.com:scylladb/scylla: cql3: change the internal type of tuples::in_value from std::vector to chunked_vector cql3: change the internal type of lists::value from std::vector to chunked_vector cql3: in multi_item_terminal, return the vector of items by value	2021-05-24 17:19:45 +03:00
Avi Kivity	50f3bbc359	Merge "treewide: various header cleanups" from Pavel S " The patch set is an assorted collection of header cleanups, e.g: * Reduce number of boost includes in header files * Switch to forward declarations in some places A quick measurement was performed to see if these changes provide any improvement in build times (ccache cleaned and existing build products wiped out). The results are posted below (`/usr/bin/time -v ninja dev-build`) for 24 cores/48 threads CPU setup (AMD Threadripper 2970WX). Before: Command being timed: "ninja dev-build" User time (seconds): 28262.47 System time (seconds): 824.85 Percent of CPU this job got: 3979% Elapsed (wall clock) time (h:mm:ss or m:ss): 12:10.97 Average shared text size (kbytes): 0 Average unshared data size (kbytes): 0 Average stack size (kbytes): 0 Average total size (kbytes): 0 Maximum resident set size (kbytes): 2129888 Average resident set size (kbytes): 0 Major (requiring I/O) page faults: 1402838 Minor (reclaiming a frame) page faults: 124265412 Voluntary context switches: 1879279 Involuntary context switches: 1159999 Swaps: 0 File system inputs: 0 File system outputs: 11806272 Socket messages sent: 0 Socket messages received: 0 Signals delivered: 0 Page size (bytes): 4096 Exit status: 0 After: Command being timed: "ninja dev-build" User time (seconds): 26270.81 System time (seconds): 767.01 Percent of CPU this job got: 3905% Elapsed (wall clock) time (h:mm:ss or m:ss): 11:32.36 Average shared text size (kbytes): 0 Average unshared data size (kbytes): 0 Average stack size (kbytes): 0 Average total size (kbytes): 0 Maximum resident set size (kbytes): 2117608 Average resident set size (kbytes): 0 Major (requiring I/O) page faults: 1400189 Minor (reclaiming a frame) page faults: 117570335 Voluntary context switches: 1870631 Involuntary context switches: 1154535 Swaps: 0 File system inputs: 0 File system outputs: 11777280 Socket messages sent: 0 Socket messages received: 0 Signals delivered: 0 Page size (bytes): 4096 Exit status: 0 The observed improvement is about 5% of total wall clock time for `dev-build` target. Also, all commits make sure that headers stay self-sufficient, which would help to further improve the situation in the future. " * 'feature/header_cleanups_v1' of https://github.com/ManManson/scylla: transport: remove extraneous `qos/service_level_controller` includes from headers treewide: remove evidently unneeded storage_proxy includes from some places service_level_controller: remove extraneous `service/storage_service.hh` include sstables/writer: remove extraneous `service/storage_service.hh` include treewide: remove extraneous database.hh includes from headers treewide: reduce boost headers usage in scylla header files cql3: remove extraneous includes from some headers cql3: various forward declaration cleanups utils: add missing <limits> header in `extremum_tracking.hh`	2021-05-24 14:24:20 +03:00
Michał Chojnowski	65be64d0fe	types: don't linearize values in abstract_type::hash Yet another patch aiming to prevent potentially large allocations. abstract_type::hash somehow evaded the anti-linearization patches until now. Fix that. Note that decimals and varints are still linearized, but we leave it be, under the assumption that nobody inserts 128KiB-large varints into a database. Refs: #8120 Closes #8689	2021-05-23 12:11:53 +03:00
Michał Chojnowski	ebe485953a	types: fix a case of type punning via union Type punning via unions is legal in C, but illegal (undefined behaviour) in C++. Use the legal bit_cast instead. Closes #8685	2021-05-23 10:12:56 +03:00
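The replacement of union-based type punning described in ebe485953a can be sketched with a memcpy-based cast. `bit_cast_sketch` and `float_bits` are hypothetical names, not the contents of utils/bit_cast.hh; since C++20, `std::bit_cast` from `<bit>` provides the same operation (see 79000bc02e).

```cpp
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical memcpy-based cast: copies the object representation of
// `from` into a new object of type To. Unlike reading an inactive union
// member, this is well-defined in C++.
template <typename To, typename From>
To bit_cast_sketch(const From& from) noexcept {
    static_assert(sizeof(To) == sizeof(From), "sizes must match");
    static_assert(std::is_trivially_copyable<To>::value &&
                  std::is_trivially_copyable<From>::value,
                  "types must be trivially copyable");
    To to;
    std::memcpy(&to, &from, sizeof(To));
    return to;
}

// Example use: view the bits of a float as a 32-bit integer
// (the resulting value assumes IEEE 754 single precision).
inline std::uint32_t float_bits(float f) {
    return bit_cast_sketch<std::uint32_t>(f);
}
```

Compilers typically optimize the `memcpy` away, so the well-defined form should cost nothing at runtime.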
Pavel Solodovnikov	fff7ef1fc2	treewide: reduce boost headers usage in scylla header files `dev-headers` target is also ensured to build successfully. Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>	2021-05-20 01:33:18 +03:00
Michał Chojnowski	dcbc053ecd	cql3: change the internal type of lists::value from std::vector to chunked_vector Lists can grow very big. Let's use a chunked vector to prevent large contiguous allocations.	2021-05-17 17:09:55 +02:00
Michał Chojnowski	ba53c85829	cdc: log: rewrite collection merge to use managed_bytes instead of bytes	2021-04-08 10:16:21 +02:00
Michał Chojnowski	42acdc4d09	cdc: log: don't linearize collections in get_preimage_col_value	2021-04-08 10:16:21 +02:00
Michał Chojnowski	458878a414	cql3: optimize the deserialization of collections Before this patch, deserializing a collection from a (prepared) CQL request involved deserializing every element and serializing it again. Originally this was a hacky method of validation, and it was also needed to reserialize nested frozen collections from the CQLv2 format to the CQLv3 format. But since then we started doing validation separately (before calls to from_serialized) and CQLv2 became irrelevant, making reserialization of elements (which, among other things, involves a memory allocation for every element) pure waste. This patch adds a faster path for collections in the v3 format, which does not involve linearizing or reserializing the elements (since v3 is the same as our internal format). After this patch, the path from prepared CQL statements to atomic_cell_or_collection is almost completely linearization-free. The last remaining place is collection_mutation_description, where map keys are linearized.	2021-04-01 10:44:21 +02:00
Michał Chojnowski	0bb959e890	cql3: don't linearize elements of lists, tuples, and user types This patch switches the type used to store collection elements inside the intermediate form used in lists::value, tuples::value etc. from bytes to managed_bytes. After this patch, tuple and list elements are only linearized in from_serialized, which will be corrected soon. This commit introduces some additional copies in expression.cc, which will be dealt with in a future commit.	2021-04-01 10:44:21 +02:00
Michał Chojnowski	e9c05582a4	types: add write_collection_{value,size} for managed_bytes_mutable_view We will use them to avoid linearization when going from the intermediate std::vector<bytes> form in cql3/ to the atomic_cell format, by outputting managed_bytes instead of bytes in get_with_protocol_version.	2021-04-01 10:44:21 +02:00
Wojciech Mitros	f57fa935a2	types: remove linearization from abstract_type::compare To avoid high latencies caused by large contiguous allocations needed by linearizing, work on fragmented buffers instead. Signed-off-by: Wojciech Mitros <wojciech.mitros@scylladb.com>	2021-03-31 06:35:10 +02:00
Wojciech Mitros	daa31be37f	types: replace buffers in tuple_deserializing_iterator with fragmented ones In preparation for removing linearization from abstract_type::compare, add options to avoid linearization in tuple_deserializing_iterator. Signed-off-by: Wojciech Mitros <wojciech.mitros@scylladb.com>	2021-03-31 06:35:09 +02:00
Wojciech Mitros	823d4c7529	types: make tuple_type_impl::split work with any FragmentedViews We may want to store a tuple in a fragmented buffer. To split it into a vector of optional bytes, tuple_type_impl::split can be used. To split a contiguous buffer(bytes_view), simply pass single_fragmented_view(bytes_view). Signed-off-by: Wojciech Mitros <wojciech.mitros@scylladb.com>	2021-03-31 06:34:37 +02:00

503 Commits