The way that this detection works is a bit clunky, but it does its job
given the simplest cases e.g. "SELECT COUNT(*) FROM ks.t". It fails when
there are multiple selectors, or when there is a column name specified
("SELECT COUNT(column_name) FROM ks.t").
Instead of lengthy blurbs, switch to single-line, machine-readable
standardized (https://spdx.dev) license identifiers. The Linux kernel
switched long ago, so there is strong precedent.
Three cases are handled: AGPL-only, Apache-only, and dual licensed.
For the latter case, I chose (AGPL-3.0-or-later and Apache-2.0),
reasoning that our changes are extensive enough to apply our license.
The changes we applied mechanically with a script, except to
licenses/README.md.
Closes#9937
Stop using database (and including database.hh) for schema related
purposes and use data_dictionary instead.
data_dictionary::database::real_database() is called from several
places, for these reasons:
- calling yet-to-be-converted code
- callers with a legitimate need to access data (e.g. system_keyspace)
but with the ::database accessor removed from query_processor.
We'll need to find another way to supply system_keyspace with
data access.
- to gain access to the wasm engine for testing whether used
defined functions compile. We'll have to find another way to
do this as well.
The change is a straightforward replacement. One case in
modification_statement had to change a capture, but everything else
was just a search-and-replace.
Some files that lost "database.hh" gained "mutation.hh", which they
previously had access to through "database.hh".
Move selectable_column to selectable.cc (and to the cql3::selection
namespace). This cleans up column_identifier.hh so it is now a pure
vocabulary header.
column_identifier serves two purposes: one is as a general value type
that names a value, for example in column_specification. The other
is as a `selectable` derived class specializing in selecting columns
from a base table. Obviously, to select a column from a table you
need to know its name, but to name some value (which might not
be a table column!) you don't need the ability to select it from
a table.
The mix-up stands in the way of unifying the select-clause
(`selectable`) and where-clause (previously known as `term`)
expression prepare paths. This is because the already existing
where-clause result, `expr::column_value`, is implemented as
`column_definition*`, while the select clause equivalent,
`column_identifier`, can't contain a column_definition because
not all uses of column_identifier name a schema column.
To fix this, split column_identifier into two: column_identifier
retains the original use case of naming a value, while a new class
`selectable_column` has the additional ability of selecting a
column in a select clause. It still doesn't use column_definition,
that will be adjusted later.
sprint() is obsolete. Note some calls where to helper functions that
use sprint(), not to sprint() directly, so both the helpers and
the callers were modified.
This allows us to forward-declare raw_selector, which in turn reduces
indirect inclusions of expression.hh from 147 to 58, reducing rebuilds
when anything in that area changes.
Includes that were lost due to the change are restored in individual
translation units.
Closes#9434
Now that expression can be nested in its component types
directly, we can remove nested_expression. Most of the patch
adjusts uses to drop the dereference that was needed for
nested_expression.
Simple wrappers for std::get, std::get_if, std::holds_alternative.
The new names are shorter and IMO more readable.
Call sites are updated.
We will later replace the implementation.
The new expr::visit() is just a wrapper around std::visit(),
but has better constraints. A call to expr::visit() with a
visitor that misses an overload will produce an error message
that points at the missing type. This is done using the new
invocable_on_expression concept. Note it lists the expression
types one by one rather than using template magic, since
otherwise we won't get the nice messages.
Later, we will change the implementation when expression becomes
our own type rather than std::variant.
Call sites are updated.
Adds constant to the expression variant:
struct constant {
raw_value value;
data_type type;
};
This struct will be used to represent constant values with known bytes and type.
This corresponds to the terminal from current design.
bool is removed from expression, now constant is used instead.
Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
This reverts commit e9343fd382, reversing
changes made to 27138b215b. It causes a
regression in v2 serialization_format support:
collection_serialization_with_protocol_v2_test fails with: marshaling error: read_simple_bytes - not enough bytes (requested 1627390306, got 3)
Fixes#9360
This warning can catch a virtual function that thinks it
overrides another, but doesn't, because the two functions
have different signatures. This isn't very likely since most
of our virtual functions override pure virtuals, but it's
still worth having.
Enable the warning and fix numerous violations.
Closes#9347
Adds constant to the expression variant:
struct constant {
raw_value value;
data_type type;
};
This struct will be used to represent constant values with known bytes and type.
This corresponds to the terminal from current design.
bool is removed from expression, now constant is used instead.
Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
column_value_tuple overlaps both column_value and tuple_constructor
(in different respects) and can be replaced by a combination: a
tuple_constructor of column_value. The replacement is more expressive
(we can have a tuple of column_value and other expression types), though
the code (especially grammar) do not allow it yet.
So remove column_value_tuple and replace it everywhere with
tuple_constructor. Visitors get the merged behavior of the existing
tuple_constructor and column_value_tuple, which is usually trivial
since tuple_constructor and column_value_tuple came from different
hierarchies (term::raw and relation), so usually one of the types
just calls on_internal_error().
The change results in awkwards casts in two areas: WHERE clause
filtering (equal() and related), and clustering key range evaluations
(limits() and related). When equal() is replaced by recursive
evaluate(), the casts will go way (to be replaced by the evaluate())
visitor. Clustering key range extraction will remain limited
to tuples of column_value, so the prepare phase will have to vet
the expressions to ensure the casts don't fail (and use the
filtering path if they will).
Tests: unit (dev)
Closes#9274
Convert the user_types::literal raw to a new expression type
usertype_constructor. I used "usertype" to convey that is is a
((user type) constructor), not a (user (type constructor)).
We have this dependency now:
column_identifier -> selectable -> expression
and want to introduce this:
expression -> user types -> column_identifier
This leads to a loop, since expression is not (yet) forward
declarable.
Fix by moving any mention of expression from selectable.hh to a new
header selection-expr.hh.
database.cc lost access to timeout_config, so adjust its includes
to regain it.
Introduce a collection_constructor (similar to C++'s std::initializer_list)
to hold subexpressions being gathered into a list. Since sets, maps, and
lists construction share some attributes (all elements must be of the
same type) collection_constructor will be used for all of them, so it
also holds an enum. I used "style" for the enum since it's a weak
attribute - an empty set is also an empty map. I chose collection_constructor
rather than plain 'collection' to highlight that it's not the only way
to get a collection (selecting a collection column is another, as an
example) and to hint at what it does - construct a collection from
more primitive elements.
Introduce tuple_constructor (not a literal, since (?, ?) and (column_value,
column_value) are not literals) to represent a tuple constructed from
subexpressions. In the future we can replace column_value_tuple
with tuple_constructor(column_value, column_value, ...), but this is
not done now.
I chose the name 'tuple_constructor' since other expressions can represent
tuples (e.g. my_tuple_column, :bind_variable_of_tuple_type,
func_returning_tuple()). It also explains what the expression does.
Introduce a new expression untyped_constant that corresponds to
constants::literal, which is removed. untyped_constant is rather
ugly in that it won't exist post-prepare. We should probably instead
replace it with typed constants that use the widest possible type
(decimal and varint), and select a narrower type during the prepare
phase when we perform type inference. The conversion itseld is
straightforward.
Convert the four forms of abstract_marker to expr::bind_variable (the
name was chosen since variable is the role of the thing, while "marker"
refers more to the grammar). Having four variants is unnecessary, but
this patch doesn't do anything about that.
The cast expression has two operands: the subexpression to cast and the
type to cast to. Since prepared and unprepared expressions are the
same type, we don't have to do anything, but prepared and unprepared
types are different. So add a variant to be able to support both.
The reason the selectable->expression transformation did not need to
do this is that casts in a selector cannot accept a user defined type.
Note those casts also have different syntax and different execution,
so we'll have to choose whether to unify the two semantics, or whether
to keep them separate. This patch does not force anything (but does hint
at unification by not including any discriminant beyond the type's
rawness). The string representation matches the part of the grammar
it was derived from (or conversion back to CQL will yield wrong
results).
Since expressions can nest, and since we won't covert everything at once,
add a way to store a term::raw as an expression. We can now have a
term::raw that is internally an expression, and an expression that is
implemented as term::raw.
Function call selectors correctly checked if their arguments
are required to run in threaded context, but forgot to check
the function itself - which is now done.
Now that all selectable::raw subclasses have been converted to
cql3::selectable::with_expression::raw, the class structure is
just a wrapper around expressions. Peel it, converting the
virtual member functions to free functions, and replacing
object instances with expression or nested_expression as the
case allows.
Add a field_selection variant element to expression. Like function_call
and cast, the structure from which a field is selectewd cannot yet be
an expression, since not all seletable::raw:s are converted. This will
be done in a later pass. This is also why printing a field selection now
does not print the selected expression; this will also be corrected later.
Add a cast variant element to expression. Like function_call, the
argument being converted cannot yet be an expression, since not
all seletable::raw:s are converted. This will be done in a later
pass. This is also why printing a cast now does not print the
casted expression; this will also be corrected later.
Rather than creating a new variant element in expression, we extend
function_call to handle both named and anonymous functions, since
most of the processing is the same.
Add a function_call variant element to hold function calls. Note
that because not all selectables are yet converted, function call
arguments are still of type selectable::raw. They will be converted
to expressions later. This is also why printing a function now
does not print its arguments; this will also be corrected later.
As temporary scaffolding while we're converting selectable::raw
subclasses to expressions, we'll need expressions to refer to
selectable::raw (specifically, function call arguments, which will
end up as expressions as well). To avoid a #include loop, make
selectable::raw forward-declarable by moving it to namespace scope.
Create a new element in the expression variant, column_mutation_attribute,
signifying we're picking up an attribute of a column mutation (not a
column value!). We use an enum rather than a bool to choose between
writetime and ttl (the two mutation attributes) for increased
explicitness.
Although there can only be one type for the column we're operating
on (it must be an unresolved_identifer), we use a nested_expression.
This is because we'll later need to also support a column_value
as the column type after we prepare it. This is somewhat similar
to the address of operator in C, which syntactically takes any
expression but semantically operates only on lvalues.
Introduce unresolved_identifer as an unprepared counterpart to column_value.
column_identifier_raw no longer inherits from selectable::raw, but
methods for now to reduce churn.
Prepare to migrate selectable::raw sub-classes to expressions by
creating a bridge betweet the two types. with_expression::raw
is a selectable::raw and implements all its methods (right now,
trivially), and its contents is an expression. The methods are
implemented using the usual visitor pattern.
Eliminate not used includes and replace some more includes
with forward declarations where appropriate.
Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>
We may want to store a tuple in a fragmented buffer. To split it
into a vector of optional bytes, tuple_type_impl::split can be used.
To split a contiguous buffer(bytes_view), simply pass
single_fragmented_view(bytes_view).
Signed-off-by: Wojciech Mitros <wojciech.mitros@scylladb.com>
This patch enables select cql statements where collection columns are
selected columns in queries where clustering column is restricted by
"IN" cql operator. Such queries are accepted by cassandra since v4.0.
The internals actually provide correct support for this feature already,
this patch simply removes relevant cql query check.
Tests: cql-pytest (testInRestrictionWithCollection)
Fixes#7743Fixes#4251
Signed-off-by: Vojtech Havel <vojtahavel@gmail.com>
Message-Id: <20210104223422.81519-1-vojtahavel@gmail.com>
Move the classes representing CQL expressions (and utility functions
on them) from the `restrictions` namespace to a new namespace `expr`.
Most of the restriction.hh content was moved verbatim to
expression.hh. Similarly, all expression-related code was moved from
statement_restrictions.cc verbatim to expression.cc.
As suggested in #5763 feedback
https://github.com/scylladb/scylla/pull/5763#discussion_r443210498
Tests: dev (unit)
Signed-off-by: Dejan Mircevski <dejan@scylladb.com>
Currently, we cannot select more than 2^32 rows from a table because we are limited by types of
variables containing the numbers of rows. This patch changes these types and sets new limits.
The new limits take effect while selecting all rows from a table - custom limits of rows in a result
stay the same (2^32-1).
In classes which are being serialized and used in messaging, in order to be able to process queries
originating from older nodes, the top 32 bits of new integers are optional and stay at the end
of the class - if they're absent we assume they equal 0.
The backward compatibility was tested by querying an older node for a paged selection, using the
received paging_state with the same select statement on an upgraded node, and comparing the returned
rows with the result generated for the same query by the older node, additionally checking if the
paging_state returned by the upgraded node contained new fields with correct values. Also verified
if the older node simply ignores the top 32 bits of the remaining rows number when handling a query
with a paging_state originating from an upgraded node by generating and sending such a query to
an older node and checking the paging_state in the reply(using python driver).
Fixes#5101.
Instead of `restriction` class methods, use the new free functions.
Specific replacement actions are listed below.
Note that class `restrictions` (plural) remains intact -- both its
methods and its type hierarchy remain intact for now.
Ensure full test coverage of the replacement code with new file
test/boost/restrictions_test.cc and some extra testcases in
test/cql/*.
Drop some existing tests because they codify buggy behaviour
(reference #6369, #6382). Drop others because they forbid relation
combinations that are now allowed (eg, mixing equality and
inequality, comparing to NULL, etc.).
Here are some specific categories of what was replaced:
- restriction::is_foo predicates are replaced by using the free
function find_if; sometimes it is used transitively (see, eg,
has_slice)
- restriction::is_multi_column is replaced by dynamic casts (recall
that the `restrictions` class hierarchy still exists)
- utility methods is_satisfied_by, is_supported_by, to_string, and
uses_function are replaced by eponymous free functions; note that
restrictions::uses_function still exists
- restriction::apply_to is replaced by free function
replace_column_def
- when checking infinite_bound_range_deletions, the has_bound is
replaced by local free function bounded_ck
- restriction::bounds and restriction::value are replaced by the more
general free function possible_lhs_values
- using free functions allows us to simplify the
multi_column_restriction and token_restriction hierarchies; their
methods merge_with and uses_function became identical in all
subclasses, so they were moved to the base class
- single_column_primary_key_restrictions<clustering_key>::needs_filtering
was changed to reuse num_prefix_columns_that_need_not_be_filtered,
which uses free functions
Fixes#5799.
Fixes#6369.
Fixes#6371.
Fixes#6372.
Fixes#6382.
Signed-off-by: Dejan Mircevski <dejan@scylladb.com>
This patch changes the signatures of `test_assignment` and
`test_all` functions to accept `cql3::column_specification` by
const reference instead of shared pointer.
Mostly a cosmetic change reducing overall shared_ptr bloat in
cql3 code.
Tests: unit(dev, debug)
Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>
Message-Id: <20200529195249.767346-1-pa.solodovnikov@scylladb.com>