scylladb

Author	SHA1	Message	Date
Dejan Mircevski	8db24fc03b	cql3/expr: Handle `IN ?` bound to null Previously, we crashed when the IN marker is bound to null. Throw invalid_request_exception instead. Fixes #8265 Tests: unit (dev) Signed-off-by: Dejan Mircevski <dejan@scylladb.com> Closes #8287	2021-03-17 09:59:22 +02:00
Dejan Mircevski	992d5c6184	cql3/expr: Improve column printing Before this change, we would print an expression like this: ((ColumnDefinition{name=c, type=org.apache.cassandra.db.marshal.Int32Type, kind=CLUSTERING_COLUMN, componentIndex=0, droppedAt=-9223372036854775808}) = 0000007b) Now, we print the same expression like this: (c = 0000007b) Tests: unit (dev) Signed-off-by: Dejan Mircevski <dejan@scylladb.com> Closes #8285	2021-03-17 09:59:22 +02:00
Dejan Mircevski	8dac132581	cql3/expr: Add is_multi_column() It will come in handy when we start using expressions to calculate the clustering slice. Signed-off-by: Dejan Mircevski <dejan@scylladb.com>	2021-03-10 21:25:43 -05:00
Dejan Mircevski	1f591bd16e	cql3/expr: Add more operators to needs_filtering Omitting these operators didn't cause bugs, because needs_filtering() is never invoked on them. But that will likely change in the future, so add them now to prevent problems down the road. Signed-off-by: Dejan Mircevski <dejan@scylladb.com>	2021-03-10 21:25:43 -05:00
Dejan Mircevski	c0c93982d0	cql3: Replace CK-bound mode with comparison_order Instead of defining this enum in multi_column_restriction::slice, put it in the expr namespace and add it to binary_operator. We will need it when we switch bounds calculation from multi_column_restriction to expr classes. Signed-off-by: Dejan Mircevski <dejan@scylladb.com>	2021-03-10 21:25:43 -05:00
Dejan Mircevski	7dfe471b5a	cql3/expr: Make to_range globally visible It will be used in statement_restrictions for calculating clustering bounds. And it will come in handy elsewhere in the future, I'm sure. Signed-off-by: Dejan Mircevski <dejan@scylladb.com>	2021-03-10 21:25:43 -05:00
Botond Dénes	ba7a9d2ac3	imr: switch back to open-coded description of structures Commit `aab6b0ee27` introduced the controversial new IMR format, which relied on a very template-heavy infrastructure to generate serialization and deserialization code via template meta-programming. The promise was that this new format, beyond solving the problems the previous open-coded representation had (working on linearized buffers), will speed up migrating other components to this IMR format, as the IMR infrastructure reduces code bloat, makes the code more readable via declarative type descriptions as well as safer. However, the results were almost the opposite. The template meta-programming used by the IMR infrastructure proved very hard to understand. Developers don't want to read or modify it. Maintainers don't want to see it being used anywhere else. In short, nobody wants to touch it. This commit does a conceptual revert of `aab6b0ee27`. A verbatim revert is not possible because related code evolved a lot since the merge. Also, going back to the previous code would mean we regress as we'd revert the move to fragmented buffers. So this revert is only conceptual, it changes the underlying infrastructure back to the previous open-coded one, but keeps the fragmented buffers, as well as the interface of the related components (to the extent possible). Fixes: #5578	2021-02-16 23:43:07 +01:00
Avi Kivity	60f5ec3644	Merge 'managed_bytes: switch to explicit linearization' from Michał Chojnowski This is a revival of #7490. Quoting #7490: The managed_bytes class now uses implicit linearization: outside LSA, data is never fragmented, and within LSA, data is linearized on-demand, as long as the code is running within with_linearized_managed_bytes() scope. We would like to stop linearizing managed_bytes and keep it fragmented at all times, since linearization can require large contiguous chunks. Large contiguous allocations are hard to satisfy and cause latency spikes. As a first step towards that, we remove all implicitly linearizing accessors and replace them with an explicit linearization accessor, with_linearized(). Some of the linearization happens long before use, by creating a bytes_view of the managed_bytes object and passing it onwards, perhaps storing it for later use. This does not work with with_linearized(), which creates a temporary linearized view, and does not work towards the longer term goal of never linearizing. As a substitute a managed_bytes_view class is introduced that acts as a view for managed_bytes (for interoperability it can also be a view for bytes and is compatible with bytes_view). By the end of the series, all linearizations are temporary, within the scope of a with_linearized() call and can be converted to fragmented consumption of the data at leisure. This has limited practical value directly, as current uses of managed_bytes are limited to keys (which are limited to 64k). However, it enables converting the atomic_cell layer back to managed_bytes (so we can remove IMR) and the CQL layer to managed_bytes/managed_bytes_view, removing contiguous allocations from the coordinator. 
Closes #7820
* github.com:scylladb/scylla:
  test: add hashers_test
  memtable: fix accounting of managed_bytes in partition_snapshot_accounter
  test: add managed_bytes_test
  utils: fragment_range: add a fragment iterator for FragmentedView
  keys: update comments after changes and remove an unused method
  mutation_test: use the correct preferred_max_contiguous_allocation in measuring_allocator
  row_cache: more indentation fixes
  utils: remove unused linearization facilities in `managed_bytes` class
  misc: fix indentation
  treewide: remove remaining `with_linearized_managed_bytes` uses
  memtable, row_cache: remove `with_linearized_managed_bytes` uses
  utils: managed_bytes: remove linearizing accessors
  keys, compound: switch from bytes_view to managed_bytes_view
  sstables: writer: add write_* helpers for managed_bytes_view
  compound_compat: transition legacy_compound_view from bytes_view to managed_bytes_view
  types: change equal() to accept managed_bytes_view
  types: add parallel interfaces for managed_bytes_view
  types: add to_managed_bytes(const sstring&)
  serializer_impl: handle managed_bytes without linearizing
  utils: managed_bytes: add managed_bytes_view::operator[]
  utils: managed_bytes: introduce managed_bytes_view
  utils: fragment_range: add serialization helpers for FragmentedMutableView
  bytes: implement std::hash using appending_hash
  utils: mutable_view: add substr()
  utils: fragment_range: add compare_unsigned
  utils: managed_bytes: make the constructors from bytes and bytes_view explicit
  utils: managed_bytes: introduce with_linearized()
  utils: managed_bytes: constrain with_linearized_managed_bytes()
  utils: managed_bytes: avoid internal uses of managed_bytes::data()
  utils: managed_bytes: extract do_linearize_pure()
  thrift: do not depend on implicit conversion of keys to bytes_view
  clustering_bounds_comparator: do not depend on implicit conversion of keys to bytes_view
  cql3: expression: linearize get_value_from_mutation() eariler
  bytes: add to_bytes(bytes)
  cql3: expression: mark do_get_value() as static	2021-01-18 11:01:28 +02:00
Dejan Mircevski	3aa80f47fe	abstract_type: Rework unreversal methods Replace two methods for unreversal (`as` and `self_or_reversed`) with a new one (`without_reversed`). More flexible and better named. Tests: unit (dev) Signed-off-by: Dejan Mircevski <dejan@scylladb.com> Closes #7889	2021-01-10 19:30:12 +02:00
Dejan Mircevski	4515a49d4d	cql3: Fix `IN ?` for unset values When the right-hand side of IN is an unset value, we must report an error, like Cassandra does. This fixes testListWithUnsetValues, so re-enable it. Signed-off-by: Dejan Mircevski <dejan@scylladb.com>	2021-01-07 13:22:20 +02:00
Avi Kivity	1dd6d7029a	cql3: expression: linearize get_value_from_mutation() eariler do_get_value() is careful to return a fragmented view, but its only caller get_value_from_mutation() linearizes it immediately afterwards. Linearize it sooner; this prevents mixing in fragmented values from cells (now via IMR) and fragmented values from partition/clustering keys. It only works now because keys are not fragmented outside LSA, and value_view has a special case for single-fragment values. This helps when keys become fragmented.	2020-12-20 15:14:44 +01:00
Avi Kivity	28126257c2	cql3: expression: mark do_get_value() as static It is used only later in this file.	2020-12-20 15:14:44 +01:00
Piotr Sarna	2015988373	Merge 'types: get rid of linearization in deserialize()' from Michał Chojnowski Citing #6138: > In the past few years we have converted most of our codebase to work in terms of fragmented buffers, instead of linearised ones, to help avoid large allocations that put large pressure on the memory allocator. > One prominent component that still works exclusively in terms of linearised buffers is the types hierarchy, more specifically the de/serialization code to/from CQL format. Note that for most types, this is the same as our internal format, notable exceptions are non-frozen collections and user types. > > Most types are expected to contain reasonably small values, but texts, blobs and especially collections can get very large. Since the entire hierarchy shares a common interface we can either transition all or none to work with fragmented buffers. This series gets rid of intermediate linearizations in deserialization. The next steps are removing linearizations from serialization, validation and comparison code. Series summary: - Fix a bug in `fragmented_temporary_buffer::view::remove_prefix`. (Discovered while testing. Since it wasn't discovered earlier, I guess it doesn't occur in any code path in master.) - Add a `FragmentedView` concept to allow uniform handling of various types of fragmented buffers (`bytes_view`, `temporary_fragmented_buffer::view`, `ser::buffer_view` and likely `managed_bytes_view` in the future). - Implement `FragmentedView` for relevant fragmented buffer types. - Add helper functions for reading from `FragmentedView`. - Switch `deserialize()` and all its helpers from `bytes_view` to `FragmentedView`. - Remove `with_linearized()` calls which just became unnecessary. - Add an optimization for single-fragment cases. The addition of `FragmentedView` might be controversial, because another concept meant for the same purpose - `FragmentRange` - is already used. Unfortunately, it lacks the functionality we need. 
The main (only?) thing we want to do with a fragmented buffer is to extract a prefix from it and `FragmentRange` gives us no way to do that, because it's immutable by design. We can work around that by wrapping it into a mutable view which will track the offset into the immutable `FragmentRange`, and that's exactly what `linearizing_input_stream` is. But it's wasteful. `linearizing_input_stream` is a heavy type, unsuitable for passing around as a view - it stores a pair of fragment iterators, a fragment view and a size (11 words) to conform to the iterator-based design of `FragmentRange`, when one fragment iterator (4 words) already contains all needed state, just hidden. I suggest we replace `FragmentRange` with `FragmentedView` (or something similar) altogether.
Refs: #6138
Closes #7692
* github.com:scylladb/scylla:
  types: collection: add an optimization for single-fragment buffers in deserialize
  types: add an optimization for single-fragment buffers in deserialize
  cql3: tuples: don't linearize in in_value::from_serialized
  cql3: expr: expression: replace with_linearize with linearized
  cql3: constants: remove unneeded uses of with_linearized
  cql3: update_parameters: don't linearize in prefetch_data_builder::add_cell
  cql3: lists: remove unneeded use of with_linearized
  query-result-set: don't linearize in result_set_builder::deserialize
  types: remove unneeded collection deserialization overloads
  types: switch collection_type_impl::deserialize from bytes_view to FragmentedView
  cql3: sets: don't linearize in value::from_serialized
  cql3: lists: don't linearize in value::from_serialized
  cql3: maps: don't linearize in value::from_serialized
  types: remove unused deserialize_aux
  types: deserialize: don't linearize tuple elements
  types: deserialize: don't linearize collection elements
  types: switch deserialize from bytes_view to FragmentedView
  types: deserialize tuple types from FragmentedView
  types: deserialize set type from FragmentedView
  types: deserialize map type from FragmentedView
  types: deserialize list type from FragmentedView
  types: add FragmentedView versions of read_collection_size and read_collection_value
  types: deserialize varint type from FragmentedView
  types: deserialize floating point types from FragmentedView
  types: deserialize decimal type from FragmentedView
  types: deserialize duration type from FragmentedView
  types: deserialize IP address types from FragmentedView
  types: deserialize uuid types from FragmentedView
  types: deserialize timestamp type from FragmentedView
  types: deserialize simple date type from FragmentedView
  types: deserialize time type from FragmentedView
  types: deserialize boolean type from FragmentedView
  types: deserialize integer types from FragmentedView
  types: deserialize string types from FragmentedView
  types: remove unused read_simple_opt
  types: implement read_simple* versions for FragmentedView
  utils: fragmented_temporary_buffer: implement FragmentedView for view
  utils: fragment_range: add single_fragmented_view
  serializer: implement FragmentedView for buffer_view
  utils: fragment_range: add linearized and with_linearized for FragmentedView
  utils: fragment_range: add FragmentedView
  utils: fragmented_temporary_buffer: fix view::remove_prefix	2020-12-04 09:46:20 +01:00
Michał Chojnowski	68177a6721	cql3: expr: expression: replace with_linearize with linearized with_linearized creates an additional internal `bytes` when the input is fragmented. linearized copies the data directly to the output `bytes`, so it's more efficient.	2020-12-04 09:19:39 +01:00
Dejan Mircevski	7f8ed811c1	cql3/expr: Clarify multi-column doesn't use indexing Although not currently used, the old code was wrong and confusing to readers. Signed-off-by: Dejan Mircevski <dejan@scylladb.com>	2020-11-25 10:59:13 -05:00
Avi Kivity	6be9f49380	cql3: expression: switch from range_bound to interval_bound to avoid clang class template argument deduction woes Clang does not implement P1814R0 (class template argument deduction for alias templates), so it can't deduce the template arguments for range_bound, but it can for interval_bound, so switch to that. Using the modern name rather than the compatibility alias is preferred anyway. Closes #7422	2020-11-01 13:19:44 +02:00
Dejan Mircevski	40adf38915	cql3/expr: Use Boost concept assert In `bd6855e`, we reverted to Boost ranges and commented out the concept check. But Boost has its own concept check, which this patch enables. Tests: unit (dev) Signed-off-by: Dejan Mircevski <dejan@scylladb.com> Closes #7471	2020-10-22 17:24:49 +03:00
Avi Kivity	bd6855ed62	cql3: expression: drop <ranges> Clang has trouble with some parts of <ranges>. Replace with boost range adaptors for now.	2020-10-19 10:23:30 +03:00
Dejan Mircevski	df3ea2443b	cql3: Drop all uses_function methods No one seems to call them except for other uses_function methods. Tests: unit (dev) Signed-off-by: Dejan Mircevski <dejan@scylladb.com>	2020-09-04 17:27:30 +02:00
Dejan Mircevski	cbf8186a12	cql3/expr: Drop make_column_op() Instantiating binary_operator directly is more readable. Tests: unit (dev) Signed-off-by: Dejan Mircevski <dejan@scylladb.com>	2020-08-25 11:10:36 +03:00
Dejan Mircevski	fb6c011b52	everywhere: Insert space after `switch` Quoth @avikivity: "switch is not a function, and we celebrate that by putting a space after it like other control-flow keywords." https://github.com/scylladb/scylla/pull/7052#discussion_r471932710 Tests: unit (dev) Signed-off-by: Dejan Mircevski <dejan@scylladb.com>	2020-08-18 14:31:04 +03:00
Dejan Mircevski	1aa326c93b	cql3: Drop operator_type entirely Since no live code uses it anymore, it can be safely removed. Signed-off-by: Dejan Mircevski <dejan@scylladb.com>	2020-08-18 12:27:01 +02:00
Dejan Mircevski	d97605f4f8	cql3: Drop operator_type from the parser Replace operator_type with the nicer-behaved oper_t in CQL parser and, consequently, in the relation hierarchy and column_condition. After this, no references to operator_type remain in live code. Signed-off-by: Dejan Mircevski <dejan@scylladb.com>	2020-08-18 12:27:00 +02:00
Dejan Mircevski	71c921111d	cql3/expr: Replace operator_type with an enum operator_type is awkward because it's not copyable or assignable. Replace it in expression representation with a new enum class, oper_t. Signed-off-by: Dejan Mircevski <dejan@scylladb.com>	2020-08-18 12:27:00 +02:00
Dejan Mircevski	8cae61ee6b	cql3: Move #include from .hh to .cc restrictions.hh included fmt/ostream.h, which is expensive due to its transitive #includes. Replace it with fmt/core.h, which transitively includes only standard C++ headers. As requested by #5763 feedback: https://github.com/scylladb/scylla/pull/5763#discussion_r443210634 Tests: unit (dev) Signed-off-by: Dejan Mircevski <dejan@scylladb.com>	2020-08-08 21:37:08 +03:00
Dejan Mircevski	df20854963	cql3: Move expressions to their own namespace Move the classes representing CQL expressions (and utility functions on them) from the `restrictions` namespace to a new namespace `expr`. Most of the restriction.hh content was moved verbatim to expression.hh. Similarly, all expression-related code was moved from statement_restrictions.cc verbatim to expression.cc. As suggested in #5763 feedback https://github.com/scylladb/scylla/pull/5763#discussion_r443210498 Tests: dev (unit) Signed-off-by: Dejan Mircevski <dejan@scylladb.com>	2020-08-08 21:03:26 +03:00

26 Commits