In preparation of the relaxation of the grammar to return any expression,
change the whereClause production to return an expression rather than
terms. Note that the expression is still constrained to be a conjunction
of relations, and our filtering code isn't prepared for more.
Before the patch, if the WHERE clause was optional, the grammar would
pass an empty vector of expressions (which is exactly correct). After
the patch, it would pass a default-constructed expression. Now that
happens to be an empty conjunction, which is exactly what's needed, but
it is too accidental, so the patch changes optional WHERE clauses to
explicitly generate an empty conjunction if the WHERE clause wasn't
specified.
The grammar now checks that UPDATEs don't clash (for example,
updates to the same column). The checks are good, but the grammar
isn't the right place for them - better to concentrate all the checks
in the prepare() code so it's easy to see all the checks.
Move the checks to raw::update_statement::prepare_internal(). This
exposes that the checks are quadratic, so add a comment. It could be
fixed with a stable_sort() first, but that is left to later.
Closes#10820
An expr::constant is an expression that happens to represent a constant,
so it's too heavyweight to be used for evaluation. Right now the extra
weight is just a type (which causes extra work by having to maintain
the shared_ptr reference count), but it will grow in the future to include
source location (for error reporting) and maybe other things.
Prior to e9b6171b5 ("Merge 'cql3: expr: unify left-hand-side and
right-hand-side of binary_operator prepares' from Avi Kivity"), we had
to use expr::constant since there was not enough type infomation in
expressions. But now every expression carries its type (in programming
language terms, expressions are now statically typed), so carrying types
in values is not needed.
So change evaluate() to return cql3::raw_value. The majority of the
patch just changes that. The rest deals with some fallout:
- cql3::raw_value gains a view() helper to convert to a raw_value_view,
and is_null_or_unset() to match with expr::constant and reduce further
churn.
- some helpers that worked on expr::constant and now receive a
raw_value now need the type passed via an additional argument. The
type is computed from the expression by the caller.
- many type checks during expression evaluation were dropped. This is
a consequence of static typing - we must trust the expression prepare
phase to perform full type checking since values no longer carry type
information.
Closes#10797
Currently prepare_expression is never used where a schema is needed -
it is called for the right-hand-side of binary operators (where we
don't accept columns) or for attributes like WRITETIME or TTL. But
when we unify expression preparation it will need to handle columns
too, and these need the schema to look up the column.
So pass the schema as a parameter. It is optional (a pointer) since
not all contexts will have a schema (for example CREATE AGGREGATE).
Functionality of the relation class has been replaced by
expr::to_restriction.
Relation and all classes deriving from it can now be removed.
Signed-off-by: cvybhu <jan.ciolek@scylladb.com>
Parser used to output the where clause as a vector of relations,
but now we can change it to a vector of expressions.
Cql.g needs to be modified to output expressions instead
of relations.
The WHERE clause is kept in a few places in the code that
need to be changed to vector<expression>.
Finally relation->to_restriction is replaced by expr::to_restriction
and the expressions are converted to restrictions where required.
The relation class isn't used anywhere now and can be removed.
Signed-off-by: cvybhu <jan.ciolek@scylladb.com>
After fcb8d040 ("treewide: use Software Package Data Exchange
(SPDX) license identifiers"), many dual-licensed files were
left with empty comments on top. Remove them to avoid visual
noise.
Closes#10562
Instead of lengthy blurbs, switch to single-line, machine-readable
standardized (https://spdx.dev) license identifiers. The Linux kernel
switched long ago, so there is strong precedent.
Three cases are handled: AGPL-only, Apache-only, and dual licensed.
For the latter case, I chose (AGPL-3.0-or-later and Apache-2.0),
reasoning that our changes are extensive enough to apply our license.
The changes we applied mechanically with a script, except to
licenses/README.md.
Closes#9937
Add the missing partition-key validation in INSERT JSON statements.
Scylla, following the lead of Cassandra, forbids an empty-string partition
key (please note that this is not the same as a null partition key, and
that null clustering keys *are* allowed).
Trying to INSERT, UPDATE or DELETE a partition with an empty string as
the partition key fails with a "Key may not be empty". However, we had a
loophole - you could insert such empty-string partition keys using an
"INSERT ... JSON" statement.
The problem was that the partition key validation was done in one place -
`modification_statement::build_partition_keys()`. The INSERT, UPDATE and
DELETE statements all inherited this same method and got the correct
validation. But the INSERT JSON statement - insert_prepared_json_statement
overrode the build_partition_keys() method and this override forgot to call
the validation function. So in this patch we add the missing validation.
Note that the validation function checks for more than just empty strings -
there is also a length limit for partition keys.
This patch also adds a cql-pytest reproducer for this bug. Before this
patch, the test passed on Cassandra but failed on Scylla.
Reported by @FortTell
Fixes#9853.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20220116085216.21774-1-nyh@scylladb.com>
Stop using database (and including database.hh) for schema related
purposes and use data_dictionary instead.
data_dictionary::database::real_database() is called from several
places, for these reasons:
- calling yet-to-be-converted code
- callers with a legitimate need to access data (e.g. system_keyspace)
but with the ::database accessor removed from query_processor.
We'll need to find another way to supply system_keyspace with
data access.
- to gain access to the wasm engine for testing whether used
defined functions compile. We'll have to find another way to
do this as well.
The change is a straightforward replacement. One case in
modification_statement had to change a capture, but everything else
was just a search-and-replace.
Some files that lost "database.hh" gained "mutation.hh", which they
previously had access to through "database.hh".
prepare_term now takes an expression and returns a prepared expression.
It should be renamed to prepare_expression.
Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
prepare_term is now the only function that uses terms.
Change it so that it returns expression instead of term
and remove all occurences of expr::to_expression(prepare_term(...))
Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
Replace all uses of term with expression in cql3/statements/update_statement
There was some trouble with extracting values from json.
The original code worked this way on a map example:
> There is a json string to parse: {'b': 1, 'a': 2, 'b': 3}
> The code parses the json and creates bytes where this map is serialized
but without removing duplicates, sorting etc.
> Then a maps::delayed_value is created from these bytes.
During creation map elements are extracted, sorted and duplicates are removed.
This map value is then used in setter
Now when maps::delayed_value is changed to expr::constant the step where elements are sorted is lost.
Because of this we need to do this earlier, the best place is during original json parsing.
Additionally I suspect that removing duplicated elements used to work only on the first level, in case of map of maps it wouldn't work.
Now it will work no matter how many layers of maps there are.
Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
expr::constant is always serialized using the internal cql serialization format,
but currently the code keeps values in the cache in other format.
As preparation for moving from term to expression change
so that values kept in the cache are serialized using
the internal format.
Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
Simple wrappers for std::get, std::get_if, std::holds_alternative.
The new names are shorter and IMO more readable.
Call sites are updated.
We will later replace the implementation.
constant is now ready to replace terminal as a final value representation.
Replace bind() with evaluate and shared_ptr<terminal> with constant.
We can't get rid of terminal yet. Sometimes terminal is converted back
to term, which constant can't do. This won't be a problem once we
replace term with expression.
Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
Start using evaluate_to_raw_value instead of bind_and_get.
This is a step towards using only evaluate.
Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
To convert a terminal to expr::constant we need know the value type.
Implement getting value type for terminals in constants.hh.
Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
This reverts commit e9343fd382, reversing
changes made to 27138b215b. It causes a
regression in v2 serialization_format support:
collection_serialization_with_protocol_v2_test fails with: marshaling error: read_simple_bytes - not enough bytes (requested 1627390306, got 3)
Fixes#9360
constant is now ready to replace terminal as a final value representation.
Replace bind() with evaluate and shared_ptr<terminal> with constant.
We can't get rid of terminal yet. Sometimes terminal is converted back
to term, which constant can't do. This won't be a problem once we
replace term with expression.
Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
Start using evaluate_to_raw_value instead of bind_and_get.
This is a step towards using only evaluate.
Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
To convert a terminal to expr::constant we need know the value type.
Implement getting value type for terminals in constants.hh.
Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
Introduce a new expression untyped_constant that corresponds to
constants::literal, which is removed. untyped_constant is rather
ugly in that it won't exist post-prepare. We should probably instead
replace it with typed constants that use the widest possible type
(decimal and varint), and select a narrower type during the prepare
phase when we perform type inference. The conversion itseld is
straightforward.
Convert the four forms of abstract_marker to expr::bind_variable (the
name was chosen since variable is the role of the thing, while "marker"
refers more to the grammar). Having four variants is unnecessary, but
this patch doesn't do anything about that.
The class is repurposed to be more generic and also be able
to hold additional metadata related to function calls within
a CQL statement. Rename all methods appropriately.
Visitor functions in AST nodes (`collect_marker_specification`)
are also renamed to a more generic `fill_prepare_context`.
The name `prepare_context` designates that this metadata
structure is a byproduct of `stmt::raw::prepare()` call and
is needed only for "prepare" step of query execution.
Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>
The parser accepts INSERT JSON ... IF NOT EXISTS but we later
ignore it. This is a bug (#8682). Note it down and shut down
a compiler error that will result when we enable
-Wunused-private-field.
The operation::make_*cell functions are useless aliases to methods of
update_parameters, and are used interchangeably with them throughout the code.
Remove them.
Also, remove the now-unused update_parameters::make_cell version for
fragmented_temporary_buffer::view.
We want to change the internals of cql3::raw_value{_view}.
However, users of cql3::raw_value and cql3::raw_value_view often
use them by extracting the internal representation, which will be different
after the planned change.
This commit prepares us for the change by making all accesses to the value
inside cql3::raw_value(_view) be done through helper methods which don't expose
the internal representation publicly.
After this commit we are free to change the internal representation of
raw_value_{view} without messing up their users.
"
operator_type is awkward because it's not copyable or assignable. Replace it with a new enum class.
Tests: unit(dev)
"
* dekimir-operator-type:
cql3: Drop operator_type entirely
cql3: Drop operator_type from the parser
cql3/expr: Replace operator_type with an enum
Replace operator_type with the nicer-behaved oper_t in CQL parser and,
consequently, in the relation hierarchy and column_condition.
After this, no references to operator_type remain in live code.
Signed-off-by: Dejan Mircevski <dejan@scylladb.com>
C++20 introduced `contains` member functions for maps and sets for
checking whether an element is present in the collection. Previously
`count` function was often used in various ways.
`contains` does not only express the intend of the code better but also
does it in more unified way.
This commit replaces all the occurences of the `count` with the
`contains`.
Tests: unit(dev)
Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
Message-Id: <b4ef3b4bc24f49abe04a2aba0ddd946009c9fcb2.1597314640.git.piotr@scylladb.com>
seastar::make_shared has a constructor taking a T&&. There is no such
constructor in std::make_shared:
https://en.cppreference.com/w/cpp/memory/shared_ptr/make_shared
This means that we have to move from
make_shared(T(...)
to
make_shared<T>(...)
If we don't want to depend on the idiosyncrasies of
seastar::make_shared.
Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
In order to eventually switch to a single JSON library,
most of the libjsoncpp usage is dropped in favor of rjson.
Unfortunately, one usage still remains:
test/utils/test_repl utility heavily depends on the *exact textual*
format of its output JSON files, so replacing a library results
in all tests failing because of differences in formatting.
It is possible to force rjson to print its documents in the exact
matching format, but that's left for later, since the issue is not
critical. It would be nice though if our test suite compared
JSON documents with a real JSON parser, since there are more
differences - e.g. libjsoncpp keeps children of the object
sorted, while rapidjson uses an unordered data structure.
This change should cause no change in semantics, it strives
just to replace all usage of libjsoncpp with rjson.
* Mark cql3::attributes::raw class as final
* Change every occurrence of ::shared_ptr<attributes::raw>
to std::unique_ptr<...>
* Mark all methods in cql3::attributes::raw as const
* Remove redundant "_attrs" ptr copy in insert_json_statement,
use one from raw::modification_statement
* Fix odd indentation in cql3/statements/update_statement.cc
Tests: unit-tests (dev, debug)
Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>
Message-Id: <20200301223708.99883-1-pa.solodovnikov@scylladb.com>
* Pass raw::select_statement::parameters as lw_shared_ptr
* Some more const cleanups here and there
* lists,maps,sets::equals now accept const-ref to *_type_impl
instead of shared_ptr
* Remove unused `get_column_for_condition` from modification_statement.hh
* More methods now accept const-refs instead of shared_ptr
Every call site where a shared_ptr was required as an argument
has been inspected to be sure that no dangling references are
possible.
Tests: unit(dev, debug)
Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>
Message-Id: <20200220153204.279940-1-pa.solodovnikov@scylladb.com>
De-pointerize cql3 code APIs further: change some call sites
to pass `schema` as const-ref instead of `shared_ptr`.
Affected functions known to be expecting always non-null
pointer to schema and don't store or pass the pointer somewhere
else, assuming it's safe to give them just a reference.
Tests: unit(dev, debug)
Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>
Message-Id: <20200218142338.69824-1-pa.solodovnikov@scylladb.com>
and replace all dht::global_partitioner().decorate_key
with dht::decorate_key
It is an improvement because dht::decorate_key takes schema
and uses it to obtain partitioner instead of using global
partitioner as it was before.
Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
`parsed_statement::get_bound_variables` is assumed to always
return a nonnull pointer to `variable_specifications` instance.
In this case using a pointer is superfluous and can be safely
replaced by a plain reference.
Also add a default ctor and a utility method `set_bound_variables`
to the `variable_specifications` class to actually reset the
contents of the class instance.
Tests: unit(dev, debug)
Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>
Message-Id: <20200120195839.164296-1-pa.solodovnikov@scylladb.com>
Instances of `variable_specifications` are passed around as
shared_ptr's, which are redundant in this case since the class
is marked as `final`. Use `lw_shared_ptr` instead since we know
for sure it's not a polymorphic pointer.
Tests: unit(debug)
Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>
Message-Id: <20191225232853.45395-1-pa.solodovnikov@scylladb.com>
cql_statement is a class representing a prepared statement in Scylla.
It is used concurrently during execution, so it is important that its
change is not changed by execution.
Add const qualifier to the execution methods family, throghout the
cql hierarchy.
Mark a few places which do mutate prepared statement state during
execution as mutable. While these are not affecting production today,
as code ages, they may become a source of latent bugs and should be
moved out of the prepared state or evaluated at prepare eventually:
cf_property_defs::_compaction_strategy_class
list_permissions_statement::_resource
permission_altering_statement::_resource
property_definitions::_properties
select_statement::_opts
Merged patch series from Juliusz Stasiewicz:
Welcome to my first PR to Scylla!
The task was intended as a warm-up ("noob") exercise; its description is
here: #4182 Sorry, I also couldn't help it and did some scouting: edited
descriptions of some metrics and shortened few annoyingly long LoC.