scylladb

Author	SHA1	Message	Date
Jan Ciolek	a7d1dab066	statement_restrictions_test: tests for extracting column restrictions Add unit tests for the function extract_single_column_restrictions_for_column() Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>	2021-08-02 15:43:42 +02:00
Jan Ciolek	43ab3d6831	expression: add a function to extract restrictions for a column Add a function, which given an expression and a column, extracts all restrictions involving this column. Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>	2021-08-02 15:43:33 +02:00
Avi Kivity	a38b1006d1	cql3: expression: update find_atom, count_if for function_call, cast, field_selection The combination of the new types and these functions cannot happen yet, but as they are generic functions it is better to implement them in case it becomes possible later.	2021-07-27 20:16:43 +03:00
Avi Kivity	2b7b9bb469	cql3: expressions: fix printing of nested expressions Now that we eliminated cql3::selectable::raw, we can print nested expressions.	2021-07-27 20:16:29 +03:00
Avi Kivity	98c4f0dfb3	cql3: selection: replace selectable::raw with expression Now that all selectable::raw subclasses have been converted to cql3::selectable::with_expression::raw, the class structure is just a wrapper around expressions. Peel it, converting the virtual member functions to free functions, and replacing object instances with expression or nested_expression as the case allows.	2021-07-27 20:16:15 +03:00
Avi Kivity	979010a1e5	cql3: expression: convert selectable::with_field_selection::raw to expression Add a field_selection variant element to expression. Like function_call and cast, the structure from which a field is selected cannot yet be an expression, since not all selectable::raw:s are converted. This will be done in a later pass. This is also why printing a field selection now does not print the selected expression; this will also be corrected later.	2021-07-27 20:16:12 +03:00
Avi Kivity	714b812212	cql3: expression: convert selectable::with_cast::raw to expression Add a cast variant element to expression. Like function_call, the argument being converted cannot yet be an expression, since not all selectable::raw:s are converted. This will be done in a later pass. This is also why printing a cast now does not print the casted expression; this will also be corrected later.	2021-07-27 20:14:52 +03:00
Avi Kivity	5adae5837e	cql3: expression: convert selectable::with_anonymous_function::raw to expression Rather than creating a new variant element in expression, we extend function_call to handle both named and anonymous functions, since most of the processing is the same.	2021-07-27 20:13:55 +03:00
Avi Kivity	3e392d2513	cql3: expression: convert selectable::with_function_call::raw to expressions Add a function_call variant element to hold function calls. Note that because not all selectables are yet converted, function call arguments are still of type selectable::raw. They will be converted to expressions later. This is also why printing a function now does not print its arguments; this will also be corrected later.	2021-07-27 20:13:51 +03:00
Avi Kivity	ff65c54316	cql3: expressions: convert writetime_or_ttl::raw to expression Create a new element in the expression variant, column_mutation_attribute, signifying we're picking up an attribute of a column mutation (not a column value!). We use an enum rather than a bool to choose between writetime and ttl (the two mutation attributes) for increased explicitness. Although there can only be one type for the column we're operating on (it must be an unresolved_identifier), we use a nested_expression. This is because we'll later need to also support a column_value as the column type after we prepare it. This is somewhat similar to the address-of operator in C, which syntactically takes any expression but semantically operates only on lvalues.	2021-07-27 20:10:52 +03:00
Avi Kivity	294f0f35b1	cql3: expression: add convenience constructor from expression element to nested expression It is convenient to initialize a nested_expression variable from one of the types that compose the expression variant, but C++ doesn't allow it. Add a constructor that does this. Use the new variant_element concept to constrain the input to be one of the variant's elements.	2021-07-27 20:08:48 +03:00
Avi Kivity	ac3b093e3c	cql3: expression: use nested_expression in binary_operator binary_operator::lhs is implementing the pattern in nested_expression. Use nested_expression instead to reduce code size.	2021-07-27 20:08:34 +03:00
Avi Kivity	b07a0867b3	cql3: expression: introduce nested_expression class The expression type cannot be a member of a struct that is an element of the expression variant. This is because it would then be required to contain itself. So introduce a nested_expression type to indirectly hold an expression, but keep the value semantics we expect from expressions: it is copyable and a copy has separate identity and storage. In fact binary_operator had to resort to this trick, so it's converted to nested_expression in the next patch.	2021-07-27 20:08:21 +03:00
Avi Kivity	8a518e9c78	Convert column_identifier_raw's use as selectable to expressions Introduce unresolved_identifier as an unprepared counterpart to column_value. column_identifier_raw no longer inherits from selectable::raw, but keeps its methods for now to reduce churn.	2021-07-27 20:08:15 +03:00
Jan Ciolek	51ee9adeec	expression: Add replace_token function Adds replace_token function which takes an expression and replaces all left-hand-side occurrences of token() with the given column definition. Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>	2021-07-21 12:25:12 +02:00
Piotr Sarna	1e0880e345	cql3,expr: unify get_value Now that there's only one helper function for getting values, the call can be inlined instead.	2021-07-13 10:40:08 +02:00
Piotr Sarna	95002bb8d4	cql3,expr: purge mutation-based is_satisfied_by The interface is now unified, and all callers use the original CQL3-backed API.	2021-07-13 10:40:08 +02:00
Avi Kivity	3c21833aac	cql3: expr: make column_value (and similar) a first-class expression Currently, column names can only appear in a boolean binary expression, but not on their own. This means that in the statement SELECT a FROM tab WHERE a > 3; We can represent the WHERE clause as an expression, but not the selector. To pave the way for using expressions in selector contexts, we promote the elements of binary_operator::lhs (column_value, column_value_tuple, token) to be expressions in their own right. binary_operator::lhs becomes an expression (wrapped in unique_ptr, because variants can't contain themselves). Note that all three new possibilities make sense in a selector: SELECT column FROM tab SELECT token(pk) FROM tab SELECT function_that_accepts_a_tuple((col1, col2)) FROM tab There is some fallout from this: - because binary_operator contains a unique_ptr, it is no longer copyable. We add a copy constructor and assignment operator to compensate. - often, the new elements don't make sense when evaluating a boolean expression, which is the only context we had before. We call on_internal_error in these cases. The parser right now prevents such cases from being constructed in the first place (this is equivalent to if (some_struct_value) in C). - in statement_restrictions.cc, we need to evaluate the lhs in the context of the full binary operator. I introduced with_current_binary_operator() for this; an alternative approach is to create a new sub-visitor. Closes #8797	2021-06-17 10:08:58 +03:00
Pavel Solodovnikov	76bea23174	treewide: reduce header interdependencies Use forward declarations wherever possible. Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com> Closes #8813	2021-06-07 15:58:35 +03:00
Avi Kivity	a55b434a2b	treewide: extend copyright statements to present day	2021-06-06 19:18:49 +03:00
Avi Kivity	8a4abe9895	cql3: expression: don't copy expression in has_supporting_index() std::bind() copies the bound parameters for safekeeping. Here this includes expr, which can be quite heavyweight. Use std::ref() to prevent copying. This is safe since the bound expression is executed and discarded before has_supporting_index() returns. Closes #8791	2021-06-06 19:18:49 +03:00
Pavel Solodovnikov	fff7ef1fc2	treewide: reduce boost headers usage in scylla header files `dev-headers` target is also ensured to build successfully. Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>	2021-05-20 01:33:18 +03:00
Avi Kivity	bd16e98019	expr: give a name to a tuple of columns Right now, binary_operator::lhs is a variant<column_value, std::vector<column_value>, token>. The role of the second branch (a vector of column values) is to represent a tuple of columns (e.g. "WHERE (a, b, c) = ?"), but this is not clear from the type name. Introduce a wrapper type around the vector, column_value_tuple, to make it clear we're dealing with tuples of CQL references (a column_value is really a column_ref, since it doesn't actually contain any value). Closes #8208	2021-04-12 09:40:16 +02:00
Michał Chojnowski	979666075f	cql3: expression: use managed_bytes instead of bytes where possible	2021-04-01 10:44:21 +02:00
Michał Chojnowski	6e7e795dfd	cql3: expr: expression: make the argument of to_range a forwarding reference Make to_range able to handle rvalues. We will pass managed_bytes&& to it in the next patch to avoid pointless copying. The public declaration of to_range is changed to a concrete function to avoid having to explicitly instantiate to_range for all possible reference types of clustering_key_prefix.	2021-04-01 10:44:21 +02:00
Michał Chojnowski	0bb959e890	cql3: don't linearize elements of lists, tuples, and user types This patch switches the type used to store collection elements inside the intermediate form used in lists::value, tuples::value etc. from bytes to managed_bytes. After this patch, tuple and list elements are only linearized in from_serialized, which will be corrected soon. This commit introduces some additional copies in expression.cc, which will be dealt with in a future commit.	2021-04-01 10:44:21 +02:00
Michał Chojnowski	b9322a6b71	cql3: switch users of cql3::raw_value_view to internals-independent API We want to change the internals of cql3::raw_value{_view}. However, users of cql3::raw_value and cql3::raw_value_view often use them by extracting the internal representation, which will be different after the planned change. This commit prepares us for the change by making all accesses to the value inside cql3::raw_value(_view) be done through helper methods which don't expose the internal representation publicly. After this commit we are free to change the internal representation of raw_value_{view} without messing up their users.	2021-04-01 10:42:04 +02:00
Dejan Mircevski	8db24fc03b	cql3/expr: Handle `IN ?` bound to null Previously, we crashed when the IN marker is bound to null. Throw invalid_request_exception instead. Fixes #8265 Tests: unit (dev) Signed-off-by: Dejan Mircevski <dejan@scylladb.com> Closes #8287	2021-03-17 09:59:22 +02:00
Dejan Mircevski	992d5c6184	cql3/expr: Improve column printing Before this change, we would print an expression like this: ((ColumnDefinition{name=c, type=org.apache.cassandra.db.marshal.Int32Type, kind=CLUSTERING_COLUMN, componentIndex=0, droppedAt=-9223372036854775808}) = 0000007b) Now, we print the same expression like this: (c = 0000007b) Tests: unit (dev) Signed-off-by: Dejan Mircevski <dejan@scylladb.com> Closes #8285	2021-03-17 09:59:22 +02:00
Dejan Mircevski	8dac132581	cql3/expr: Add is_multi_column() It will come in handy when we start using expressions to calculate the clustering slice. Signed-off-by: Dejan Mircevski <dejan@scylladb.com>	2021-03-10 21:25:43 -05:00
Dejan Mircevski	1f591bd16e	cql3/expr: Add more operators to needs_filtering Omitting these operators didn't cause bugs, because needs_filtering() is never invoked on them. But that will likely change in the future, so add them now to prevent problems down the road. Signed-off-by: Dejan Mircevski <dejan@scylladb.com>	2021-03-10 21:25:43 -05:00
Dejan Mircevski	c0c93982d0	cql3: Replace CK-bound mode with comparison_order Instead of defining this enum in multi_column_restriction::slice, put it in the expr namespace and add it to binary_operator. We will need it when we switch bounds calculation from multi_column_restriction to expr classes. Signed-off-by: Dejan Mircevski <dejan@scylladb.com>	2021-03-10 21:25:43 -05:00
Dejan Mircevski	7dfe471b5a	cql3/expr: Make to_range globally visible It will be used in statement_restrictions for calculating clustering bounds. And it will come in handy elsewhere in the future, I'm sure. Signed-off-by: Dejan Mircevski <dejan@scylladb.com>	2021-03-10 21:25:43 -05:00
Botond Dénes	ba7a9d2ac3	imr: switch back to open-coded description of structures Commit `aab6b0ee27` introduced the controversial new IMR format, which relied on a very template-heavy infrastructure to generate serialization and deserialization code via template meta-programming. The promise was that this new format, beyond solving the problems the previous open-coded representation had (working on linearized buffers), will speed up migrating other components to this IMR format, as the IMR infrastructure reduces code bloat, makes the code more readable via declarative type descriptions as well as safer. However, the results were almost the opposite. The template meta-programming used by the IMR infrastructure proved very hard to understand. Developers don't want to read or modify it. Maintainers don't want to see it being used anywhere else. In short, nobody wants to touch it. This commit does a conceptual revert of `aab6b0ee27`. A verbatim revert is not possible because related code evolved a lot since the merge. Also, going back to the previous code would mean we regress as we'd revert the move to fragmented buffers. So this revert is only conceptual, it changes the underlying infrastructure back to the previous open-coded one, but keeps the fragmented buffers, as well as the interface of the related components (to the extent possible). Fixes: #5578	2021-02-16 23:43:07 +01:00
Avi Kivity	60f5ec3644	Merge 'managed_bytes: switch to explicit linearization' from Michał Chojnowski This is a revival of #7490. Quoting #7490: The managed_bytes class now uses implicit linearization: outside LSA, data is never fragmented, and within LSA, data is linearized on-demand, as long as the code is running within with_linearized_managed_bytes() scope. We would like to stop linearizing managed_bytes and keep it fragmented at all times, since linearization can require large contiguous chunks. Large contiguous allocations are hard to satisfy and cause latency spikes. As a first step towards that, we remove all implicitly linearizing accessors and replace them with an explicit linearization accessor, with_linearized(). Some of the linearization happens long before use, by creating a bytes_view of the managed_bytes object and passing it onwards, perhaps storing it for later use. This does not work with with_linearized(), which creates a temporary linearized view, and does not work towards the longer term goal of never linearizing. As a substitute a managed_bytes_view class is introduced that acts as a view for managed_bytes (for interoperability it can also be a view for bytes and is compatible with bytes_view). By the end of the series, all linearizations are temporary, within the scope of a with_linearized() call and can be converted to fragmented consumption of the data at leisure. This has limited practical value directly, as current uses of managed_bytes are limited to keys (which are limited to 64k). However, it enables converting the atomic_cell layer back to managed_bytes (so we can remove IMR) and the CQL layer to managed_bytes/managed_bytes_view, removing contiguous allocations from the coordinator. 
Closes #7820 * github.com:scylladb/scylla: test: add hashers_test memtable: fix accounting of managed_bytes in partition_snapshot_accounter test: add managed_bytes_test utils: fragment_range: add a fragment iterator for FragmentedView keys: update comments after changes and remove an unused method mutation_test: use the correct preferred_max_contiguous_allocation in measuring_allocator row_cache: more indentation fixes utils: remove unused linearization facilities in `managed_bytes` class misc: fix indentation treewide: remove remaining `with_linearized_managed_bytes` uses memtable, row_cache: remove `with_linearized_managed_bytes` uses utils: managed_bytes: remove linearizing accessors keys, compound: switch from bytes_view to managed_bytes_view sstables: writer: add write_* helpers for managed_bytes_view compound_compat: transition legacy_compound_view from bytes_view to managed_bytes_view types: change equal() to accept managed_bytes_view types: add parallel interfaces for managed_bytes_view types: add to_managed_bytes(const sstring&) serializer_impl: handle managed_bytes without linearizing utils: managed_bytes: add managed_bytes_view::operator[] utils: managed_bytes: introduce managed_bytes_view utils: fragment_range: add serialization helpers for FragmentedMutableView bytes: implement std::hash using appending_hash utils: mutable_view: add substr() utils: fragment_range: add compare_unsigned utils: managed_bytes: make the constructors from bytes and bytes_view explicit utils: managed_bytes: introduce with_linearized() utils: managed_bytes: constrain with_linearized_managed_bytes() utils: managed_bytes: avoid internal uses of managed_bytes::data() utils: managed_bytes: extract do_linearize_pure() thrift: do not depend on implicit conversion of keys to bytes_view clustering_bounds_comparator: do not depend on implicit conversion of keys to bytes_view cql3: expression: linearize get_value_from_mutation() earlier bytes: add to_bytes(bytes) cql3: expression: mark do_get_value() as static	2021-01-18 11:01:28 +02:00
Dejan Mircevski	3aa80f47fe	abstract_type: Rework unreversal methods Replace two methods for unreversal (`as` and `self_or_reversed`) with a new one (`without_reversed`). More flexible and better named. Tests: unit (dev) Signed-off-by: Dejan Mircevski <dejan@scylladb.com> Closes #7889	2021-01-10 19:30:12 +02:00
Dejan Mircevski	4515a49d4d	cql3: Fix `IN ?` for unset values When the right-hand side of IN is an unset value, we must report an error, like Cassandra does. This fixes testListWithUnsetValues, so re-enable it. Signed-off-by: Dejan Mircevski <dejan@scylladb.com>	2021-01-07 13:22:20 +02:00
Avi Kivity	1dd6d7029a	cql3: expression: linearize get_value_from_mutation() earlier do_get_value() is careful to return a fragmented view, but its only caller get_value_from_mutation() linearizes it immediately afterwards. Linearize it sooner; this prevents mixing in fragmented values from cells (now via IMR) and fragmented values from partition/clustering keys. It only works now because keys are not fragmented outside LSA, and value_view has a special case for single-fragment values. This helps when keys become fragmented.	2020-12-20 15:14:44 +01:00
Avi Kivity	28126257c2	cql3: expression: mark do_get_value() as static It is used only later in this file.	2020-12-20 15:14:44 +01:00
Piotr Sarna	2015988373	Merge 'types: get rid of linearization in deserialize()' from Michał Chojnowski Citing #6138: > In the past few years we have converted most of our codebase to work in terms of fragmented buffers, instead of linearised ones, to help avoid large allocations that put large pressure on the memory allocator. > One prominent component that still works exclusively in terms of linearised buffers is the types hierarchy, more specifically the de/serialization code to/from CQL format. Note that for most types, this is the same as our internal format, notable exceptions are non-frozen collections and user types. > > Most types are expected to contain reasonably small values, but texts, blobs and especially collections can get very large. Since the entire hierarchy shares a common interface we can either transition all or none to work with fragmented buffers. This series gets rid of intermediate linearizations in deserialization. The next steps are removing linearizations from serialization, validation and comparison code. Series summary: - Fix a bug in `fragmented_temporary_buffer::view::remove_prefix`. (Discovered while testing. Since it wasn't discovered earlier, I guess it doesn't occur in any code path in master.) - Add a `FragmentedView` concept to allow uniform handling of various types of fragmented buffers (`bytes_view`, `temporary_fragmented_buffer::view`, `ser::buffer_view` and likely `managed_bytes_view` in the future). - Implement `FragmentedView` for relevant fragmented buffer types. - Add helper functions for reading from `FragmentedView`. - Switch `deserialize()` and all its helpers from `bytes_view` to `FragmentedView`. - Remove `with_linearized()` calls which just became unnecessary. - Add an optimization for single-fragment cases. The addition of `FragmentedView` might be controversial, because another concept meant for the same purpose - `FragmentRange` - is already used. Unfortunately, it lacks the functionality we need. 
The main (only?) thing we want to do with a fragmented buffer is to extract a prefix from it and `FragmentRange` gives us no way to do that, because it's immutable by design. We can work around that by wrapping it into a mutable view which will track the offset into the immutable `FragmentRange`, and that's exactly what `linearizing_input_stream` is. But it's wasteful. `linearizing_input_stream` is a heavy type, unsuitable for passing around as a view - it stores a pair of fragment iterators, a fragment view and a size (11 words) to conform to the iterator-based design of `FragmentRange`, when one fragment iterator (4 words) already contains all needed state, just hidden. I suggest we replace `FragmentRange` with `FragmentedView` (or something similar) altogether. Refs: #6138 Closes #7692 * github.com:scylladb/scylla: types: collection: add an optimization for single-fragment buffers in deserialize types: add an optimization for single-fragment buffers in deserialize cql3: tuples: don't linearize in in_value::from_serialized cql3: expr: expression: replace with_linearize with linearized cql3: constants: remove unneeded uses of with_linearized cql3: update_parameters: don't linearize in prefetch_data_builder::add_cell cql3: lists: remove unneeded use of with_linearized query-result-set: don't linearize in result_set_builder::deserialize types: remove unneeded collection deserialization overloads types: switch collection_type_impl::deserialize from bytes_view to FragmentedView cql3: sets: don't linearize in value::from_serialized cql3: lists: don't linearize in value::from_serialized cql3: maps: don't linearize in value::from_serialized types: remove unused deserialize_aux types: deserialize: don't linearize tuple elements types: deserialize: don't linearize collection elements types: switch deserialize from bytes_view to FragmentedView types: deserialize tuple types from FragmentedView types: deserialize set type from FragmentedView types: deserialize map type from 
FragmentedView types: deserialize list type from FragmentedView types: add FragmentedView versions of read_collection_size and read_collection_value types: deserialize varint type from FragmentedView types: deserialize floating point types from FragmentedView types: deserialize decimal type from FragmentedView types: deserialize duration type from FragmentedView types: deserialize IP address types from FragmentedView types: deserialize uuid types from FragmentedView types: deserialize timestamp type from FragmentedView types: deserialize simple date type from FragmentedView types: deserialize time type from FragmentedView types: deserialize boolean type from FragmentedView types: deserialize integer types from FragmentedView types: deserialize string types from FragmentedView types: remove unused read_simple_opt types: implement read_simple* versions for FragmentedView utils: fragmented_temporary_buffer: implement FragmentedView for view utils: fragment_range: add single_fragmented_view serializer: implement FragmentedView for buffer_view utils: fragment_range: add linearized and with_linearized for FragmentedView utils: fragment_range: add FragmentedView utils: fragmented_temporary_buffer: fix view::remove_prefix	2020-12-04 09:46:20 +01:00
Michał Chojnowski	68177a6721	cql3: expr: expression: replace with_linearize with linearized with_linearized creates an additional internal `bytes` when the input is fragmented. linearized copies the data directly to the output `bytes`, so it's more efficient.	2020-12-04 09:19:39 +01:00
Dejan Mircevski	7f8ed811c1	cql3/expr: Clarify multi-column doesn't use indexing Although not currently used, the old code was wrong and confusing to readers. Signed-off-by: Dejan Mircevski <dejan@scylladb.com>	2020-11-25 10:59:13 -05:00
Avi Kivity	6be9f49380	cql3: expression: switch from range_bound to interval_bound to avoid clang class template argument deduction woes Clang does not implement P1814R0 (class template argument deduction for alias templates), so it can't deduce the template arguments for range_bound, but it can for interval_bound, so switch to that. Using the modern name rather than the compatibility alias is preferred anyway. Closes #7422	2020-11-01 13:19:44 +02:00
Dejan Mircevski	40adf38915	cql3/expr: Use Boost concept assert In `bd6855e`, we reverted to Boost ranges and commented out the concept check. But Boost has its own concept check, which this patch enables. Tests: unit (dev) Signed-off-by: Dejan Mircevski <dejan@scylladb.com> Closes #7471	2020-10-22 17:24:49 +03:00
Avi Kivity	bd6855ed62	cql3: expression: drop <ranges> Clang has trouble with some parts of <ranges>. Replace with boost range adaptors for now.	2020-10-19 10:23:30 +03:00
Dejan Mircevski	df3ea2443b	cql3: Drop all uses_function methods No one seems to call them except for other uses_function methods. Tests: unit (dev) Signed-off-by: Dejan Mircevski <dejan@scylladb.com>	2020-09-04 17:27:30 +02:00
Dejan Mircevski	cbf8186a12	cql3/expr: Drop make_column_op() Instantiating binary_operator directly is more readable. Tests: unit (dev) Signed-off-by: Dejan Mircevski <dejan@scylladb.com>	2020-08-25 11:10:36 +03:00
Dejan Mircevski	fb6c011b52	everywhere: Insert space after `switch` Quoth @avikivity: "switch is not a function, and we celebrate that by putting a space after it like other control-flow keywords." https://github.com/scylladb/scylla/pull/7052#discussion_r471932710 Tests: unit (dev) Signed-off-by: Dejan Mircevski <dejan@scylladb.com>	2020-08-18 14:31:04 +03:00
Dejan Mircevski	1aa326c93b	cql3: Drop operator_type entirely Since no live code uses it anymore, it can be safely removed. Signed-off-by: Dejan Mircevski <dejan@scylladb.com>	2020-08-18 12:27:01 +02:00
Dejan Mircevski	d97605f4f8	cql3: Drop operator_type from the parser Replace operator_type with the nicer-behaved oper_t in CQL parser and, consequently, in the relation hierarchy and column_condition. After this, no references to operator_type remain in live code. Signed-off-by: Dejan Mircevski <dejan@scylladb.com>	2020-08-18 12:27:00 +02:00
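
Commit 8a4abe9895 above notes that std::bind() copies its bound arguments and that wrapping the argument in std::ref() avoids the copy. The following is a minimal sketch of that effect in isolation. It is not ScyllaDB code: heavy_expr, node_count and the two copies_when_* helpers are hypothetical names invented for this illustration, and the copy counter exists only to make the difference observable.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical stand-in for a heavyweight expression; it counts its own copies.
struct heavy_expr {
    std::vector<int> nodes;
    static inline int copies = 0;
    explicit heavy_expr(std::vector<int> n) : nodes(std::move(n)) {}
    heavy_expr(const heavy_expr& o) : nodes(o.nodes) { ++copies; }
};

// Hypothetical predicate that only reads the expression.
std::size_t node_count(const heavy_expr& e) { return e.nodes.size(); }

// Binding the object directly: std::bind stores its own copy of e.
int copies_when_bound_by_value() {
    heavy_expr e(std::vector<int>{1, 2, 3});
    heavy_expr::copies = 0;
    auto bound = std::bind(node_count, e);
    assert(bound() == 3u);
    return heavy_expr::copies;
}

// Binding through std::ref: only a std::reference_wrapper is stored.
// This is safe only while e outlives every call of the bound object.
int copies_when_bound_by_ref() {
    heavy_expr e(std::vector<int>{1, 2, 3});
    heavy_expr::copies = 0;
    auto bound = std::bind(node_count, std::ref(e));
    assert(bound() == 3u);
    return heavy_expr::copies;
}
```

The by-reference variant relies on the same lifetime argument the commit message gives: the bound callable is executed and discarded before the referenced object goes out of scope.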

53 Commits