A user-defined aggregate is represented as an aggregate
which calls its state function on each input row
and then finalizes its execution by calling its final function
on the final state, after all rows were already processed.
These include the following:
* `currenttimestamp()`
* `currenttime()`
* `currentdate()`
* `currenttimeuuid()`
Previously, they were incorrectly marked as pure, meaning
that the function is executed at "prepare" step.
For functions that possibly depend on timing and random seed,
this is clearly a bug. Cassandra doesn't have a notion of pure
functions, so they are lazily evaluated.
Make Scylla to match Cassandra behavior for these functions.
Add a unit-test for a fix (excluding `currentdate()` function,
because there is no way to use synthetic clock with query
processor and sleeping for a whole day to demonstrate correct
behavior is clearly not an option).
Tests: unit(dev, debug)
Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>
And reuse these values when handling `bounce_to_shard` messages.
Otherwise such a function (e.g. `uuid()`) can yield a different
value when a statement re-executed on the other shard.
It can lead to an infinite number of `bounce_to_shard` messages
sent in case the function value is used to calculate partition
key ranges for the query. Which, in turn, will cause crashes
since we don't support bouncing more than one time and the second
hop will result in a crash.
Caching works only for LWT statements and only for the function
calls that affect partition key range computation for the query.
`variable_specifications` class is renamed to `prepare_context`
and generalized to record information about each `function_call`
AST node and modify them, as needed:
* Check whether a given function call is a part of partition key
statement restriction.
* Assign ids for caching if above is true and the call is a part
of an LWT statement.
There is no need to include any kind of statement identifier
in the cache key since `query_options` (which holds the cache)
is limited to a single statement, anyway.
Note that `function_call::raw` AST nodes are not created
for selection clauses of a SELECT statement hence they
can only accept only one of the following things as parameters:
* Other function calls.
* Literal values.
* Parameter markers.
In other words, only parameters that can be immediately reduced
to a byte buffer are allowed and we don't need to handle
database inputs to non-pure functions separately since they
are not possible in this context. Anyhow, we don't even have
a single non-pure function that accepts arguments, so precautions
are not needed at the moment.
Tests: unit(dev, debug)
Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>
The class is repurposed to be more generic and also be able
to hold additional metadata related to function calls within
a CQL statement. Rename all methods appropriately.
Visitor functions in AST nodes (`collect_marker_specification`)
are also renamed to a more generic `fill_prepare_context`.
The name `prepare_context` designates that this metadata
structure is a byproduct of `stmt::raw::prepare()` call and
is needed only for "prepare" step of query execution.
Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>
This warning prevents using std::move() where it can hurt
- on an unnamed temporary or a named automatic variable being
returned from a function. In both cases the value could be
constructed directly in its final destination, but std::move()
prevents it.
Fix the handful of cases (all trivial), and enable the warning.
Closes#8992
Eliminate not used includes and replace some more includes
with forward declarations where appropriate.
Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>
Every time db/config.hh is modified (e.g., to add a new configuration
option), 110 source files need to be recompiled. Many of those 110 didn't
really care about configuration options, and just got the dependency
accidentally by including some other header file.
In this patch, I remove the include of "db/config.hh" from all header
files. It is only needed in source files - and header files only
need forward declarations. In some cases, source files were missing
certain includes which they got incidentally from db/config.hh, so I
had to add these includes explicitly.
After this patch, the number of source files that get recompiled after a
change to db/config.hh goes down from 110 to 45.
It also means that 65 source files now compile faster because they don't
include db/config.hh and whatever it included.
Additionally, this patch also eliminates a few unnecessary inclusions
of database.hh in other header files, which can use a forward declaration
or database_fwd.hh. Some of the source files including one of those
header files relied on one of the many header files brought in by
database.hh, so we need to include those explicitly.
In view_update_generator.hh something interesting happened - it *needs*
database.hh because of code in the header file, but only included
database_fwd.hh, and the only reason this worked was that the files
including view_update_generator.hh already happened to unnecessarily
include database.hh. So we fix that too.
Refs #1
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20210505102111.955470-1-nyh@scylladb.com>
A follow up for the patch for #7611. This change was requested
during review and moved out of #7611 to reduce its scope.
The patch switches UUID_gen API from using plain integers to
hold time units to units from std::chrono.
For one, we plan to switch the entire code base to std::chrono units,
to ensure type safety. Secondly, using std::chrono units allows to
increase code reuse with template metaprogramming and remove a few
of UUID_gen functions that beceme redundant as a result.
* switch get_time_UUID(), unix_timestamp(), get_time_UUID_raw(), switch
min_time_UUID(), max_time_UUID(), create_time_safe() to
std::chrono
* remove unused variant of from_unix_timestamp()
* remove unused get_time_UUID_bytes(), create_time_unsafe(),
redundant get_adjusted_timestamp()
* inline get_raw_UUID_bytes()
* collapse to similar implementations of get_time_UUID()
* switch internal constants to std::chrono
* remove unnecessary unique_ptr from UUID_gen::_instance
Message-Id: <20210406130152.3237914-2-kostja@scylladb.com>
We want to change the internals of cql3::raw_value{_view}.
However, users of cql3::raw_value and cql3::raw_value_view often
use them by extracting the internal representation, which will be different
after the planned change.
This commit prepares us for the change by making all accesses to the value
inside cql3::raw_value(_view) be done through helper methods which don't expose
the internal representation publicly.
After this commit we are free to change the internal representation of
raw_value_{view} without messing up their users.
Commit aab6b0ee27 introduced the
controversial new IMR format, which relied on a very template-heavy
infrastructure to generate serialization and deserialization code via
template meta-programming. The promise was that this new format, beyond
solving the problems the previous open-coded representation had (working
on linearized buffers), will speed up migrating other components to this
IMR format, as the IMR infrastructure reduces code bloat, makes the code
more readable via declarative type descriptions as well as safer.
However, the results were almost the opposite. The template
meta-programming used by the IMR infrastructure proved very hard to
understand. Developers don't want to read or modify it. Maintainers
don't want to see it being used anywhere else. In short, nobody wants to
touch it.
This commit does a conceptual revert of
aab6b0ee27. A verbatim revert is not
possible because related code evolved a lot since the merge. Also, going
back to the previous code would mean we regress as we'd revert the move
to fragmented buffers. So this revert is only conceptual, it changes the
underlying infrastructure back to the previous open-coded one, but keeps
the fragmented buffers, as well as the interface of the related
components (to the extent possible).
Fixes: #5578
Rather than asserting, as seen in #7977.
This shouldn't crash the server in production.
Add unit test that reproduces this scenario
and verifies the internal error exception.
Fixes#7977
Test: unit(release)
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Message-Id: <20210201163051.1775536-1-bhalevy@scylladb.com>
As reproduced in cql-pytest/test_json.py and reported in issue #7911,
failing fromJson() calls should return a FUNCTION_FAILURE error, but
currently produce a generic SERVER_ERROR, which can lead the client
to think the server experienced some unknown internal error and the
query can be retried on another server.
This patch adds a new cassandra_exception subclass that we were missing -
function_execution_exception - properly formats this error message (as
described in the CQL protocol documentation), and uses this exception
in two cases:
1. Parse errors in fromJson()'s parameters are converted into a
function_execution_exception.
2. Any exceptions during the execute() of a native_scalar_function_for
function is converted into a function_execution_exception.
In particular, fromJson() uses a native_scalar_function_for.
Note, however, that functions which already took care to produce
a specific Cassandra error, this error is passed through and not
converted to a function_execution_exception. An example is
the blobAsText() which can return an invalid_request error, so
it is left as such and not converted. This also happens in Cassandra.
All relevant tests in cql-pytest/test_json.py now pass, and are
no longer marked xfail. This patch also includes a few more improvements
to test_json.py.
Fixes#7911
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20210118140114.4149997-1-nyh@scylladb.com>
The min/max aggregators use aggregate_type_for comparators, and the
aggregate_type_for<timeuuid> is regular uuid. But that yields wrong
results; timeuuids should be compared as timestamps.
Fix it by changing aggregate_type_for<timeuuid> from uuid to timeuuid,
so aggregators can distinguish betwen the two. Then specialize the
aggregation utilities for timeuuid.
Add a cql-pytest and change some unit tests, which relied on naive
uuid comparators.
Fixes#7729.
Tests: unit (dev, debug)
Signed-off-by: Dejan Mircevski <dejan@scylladb.com>
Closes#7910
scoped_critical_alloc_section was recently introduced to replace
disable_failure_guard and made the old class deprecated.
This patch replaces all occurences of disable_failure_guard with
scoped_critical_alloc_section.
Without this patch the build prints many warnings like:
warning: 'disable_failure_guard' is deprecated: Use scoped_critical_section instead [-Wdeprecated-declarations]
Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
Message-Id: <ca2a91aaf48b0f6ed762a6aa687e6ac5e936355d.1605621284.git.piotr@scylladb.com>
If initialization of a TLS variable fails there is nothing better to
do than call std::unexpected.
This also adds a disable_failure_guard to avoid errors when using
allocation error injection.
With init() being noexcept, we can also mark clear_functions.
Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
Message-Id: <20200804180550.96150-1-espindola@scylladb.com>
seastar::make_shared has a constructor taking a T&&. There is no such
constructor in std::make_shared:
https://en.cppreference.com/w/cpp/memory/shared_ptr/make_shared
This means that we have to move from
make_shared(T(...)
to
make_shared<T>(...)
If we don't want to depend on the idiosyncrasies of
seastar::make_shared.
Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
impl_count_function doesn't explicitly initialize _count, so its correctness
depends on default initialization. Let's explicitly initialize _count to
make the code future proof.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <20200714162604.64402-1-raphaelsc@scylladb.com>
For collections and UDTs the `MIN()` and `MAX()` functions are
generated on the fly. Until now they worked by comparing just the
byte representations of arguments.
This patch uses specific per-type comparators to provide semantically
sensible, dynamically created aggregates.
Fixes#6768
In order to eventually switch to a single JSON library,
most of the libjsoncpp usage is dropped in favor of rjson.
Unfortunately, one usage still remains:
test/utils/test_repl utility heavily depends on the *exact textual*
format of its output JSON files, so replacing a library results
in all tests failing because of differences in formatting.
It is possible to force rjson to print its documents in the exact
matching format, but that's left for later, since the issue is not
critical. It would be nice though if our test suite compared
JSON documents with a real JSON parser, since there are more
differences - e.g. libjsoncpp keeps children of the object
sorted, while rapidjson uses an unordered data structure.
This change should cause no change in semantics, it strives
just to replace all usage of libjsoncpp with rjson.
It seems that the following functions are never used, delete them:
* `function::has_reference_to`
* `functions::get_overload_count`
* `to_identifiers` in column_identifier.hh
* `single_column_relation::get_map_key`
Tests: unit(dev, debug)
Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>
Message-Id: <20200606115149.1770453-1-pa.solodovnikov@scylladb.com>
Add Paxos error injections before/after save promise, proposal, decision,
paxos_response_handler, delete decision.
Adds a method to inject an error providing a lambda while avoiding to add
a continuation when the error injection is disabled.
For this provide error exception and enter() to allow flow control (i.e. return)
on simple error injections without lambdas.
Also includes Pavel's patch for CQL API for error injections, updated to
current error injection API and added one_shot support. Also added some
basic CQL API boost tests.
For CQL API there's a limitation of the current grammar not supporting
f(<terminal>) so values have to be inserted in a table until this is
resolved. See #5411
* https://github.com/alecco/scylla/tree/error_injection_v11:
paxos: fix indentation
paxos: add error injections
utils: add timeout error injection with lambda
utils: error injection add enter() for control flow
utils: error injections provide error exceptions
failure_injector: implement CQL API for failure injector class
lwt: fix disabled error injection templates
This patch changes the signatures of `test_assignment` and
`test_all` functions to accept `cql3::column_specification` by
const reference instead of shared pointer.
Mostly a cosmetic change reducing overall shared_ptr bloat in
cql3 code.
Tests: unit(dev, debug)
Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>
Message-Id: <20200529195249.767346-1-pa.solodovnikov@scylladb.com>
The following UDFs are defined to control failure injector API usage:
* enable_injection(name, args)
* disable_injection(name)
All arguments have string type.
As currently function(terminal) is not supported by the parser,
the arguments must come from selected rows.
Added boost test for CQL API.
Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>
Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>
This produces more compact code and avoids the anti-pattern of
building a map with statically known values. If the values are given
to GCC via a switch statement it can do a much better job at compile
time than libstdc++ can at runtime.
Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
Message-Id: <20200422224905.198794-1-espindola@scylladb.com>
Casts only depend on their operands, so a plain function pointer is
sufficient. This allows replacing all the make_castas_* functions that
return a lambda with plain castas_* functions that do the casting.
Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
Message-Id: <20200413162014.23884-2-espindola@scylladb.com>
Casting a type to itself doesn't make sense, but it is harmless so allow
it instead of reporting a confusing error message that makes even less
sense:
InvalidRequest: Error from server: code=2200 [Invalid query]
message="org.apache.cassandra.db.marshal.BooleanType cannot be cast
to org.apache.cassandra.db.marshal.BooleanType"
Note that some types already supported self-casting, this patch just
extends this to all types in a forward compatible way.
Fixes: #5102
Tests: unit(dev), manual test casting boolean to boolean.
Signed-off-by: Botond Dénes <bdenes@scylladb.com>
Message-Id: <20200408135041.854981-1-bdenes@scylladb.com>
The goal is to forward-declare utils::multiprecision_int, something
beyond my capabilities for boost::multiprecision::cpp_int, to reduce
compile time bloat.
The patch is mostly search-and-replace, with a few casts added to
disambiguate conversions the compiler had trouble with.
and replace all calls to dht::global_partitioner().get_token
dht::get_token is better because it takes schema and uses it
to obtain partitioner instead of using a global partitioner.
Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
i_partitioner::token_to_bytes is just a call to
token::data and does not depend on partitioner
at all. It is possible to convert token to bytes
without having access to partitioner.
Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
Add "const" attributes to `assignment_testable::test_assignment`
and `term::raw::prepare` methods. These should have been marked as
"const" even before the change but for some reason were missing
these qualifiers.
Mark other supplementary methods with "const" attributes as
necessary.
Tests: unit(dev, debug)
Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>
Message-Id: <20200127213215.494000-1-pa.solodovnikov@scylladb.com>
Make sure that the timestamp argument does not overflow
60 bits when converted to units of 100 nanos since
epoch, like with writetime() that returns microseconds since epoch
in contrast to other time functions like
unixtimestampof that return millis since epoch.
Fixes#5552
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
`parsed_statement::get_bound_variables` is assumed to always
return a nonnull pointer to `variable_specifications` instance.
In this case using a pointer is superfluous and can be safely
replaced by a plain reference.
Also add a default ctor and a utility method `set_bound_variables`
to the `variable_specifications` class to actually reset the
contents of the class instance.
Tests: unit(dev, debug)
Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>
Message-Id: <20200120195839.164296-1-pa.solodovnikov@scylladb.com>