Previously, we crashed when the IN marker is bound to null. Throw
invalid_request_exception instead.
Fixes#8265
Tests: unit (dev)
Signed-off-by: Dejan Mircevski <dejan@scylladb.com>
Closes#8287
Before this change, we would print an expression like this:
((ColumnDefinition{name=c, type=org.apache.cassandra.db.marshal.Int32Type, kind=CLUSTERING_COLUMN, componentIndex=0, droppedAt=-9223372036854775808}) = 0000007b)
Now, we print the same expression like this:
(c = 0000007b)
Tests: unit (dev)
Signed-off-by: Dejan Mircevski <dejan@scylladb.com>
Closes#8285
After previous patches some places in cql3 code take a
long path to get database reference:
query processor -> storage proxy -> database
The query processor can provide the database reference
by itself, so take this chance.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
There are few schema altering statements that need to have
the query processor inside lambda continuations. Fortunately,
they all are continuations of make_ready_future<>()s, so the
query processor can be simply captured by reference and used.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
This statement needs the query processor one step below the
stack from its .announce_migration method. So here's the
dedicated patch for it.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Most of the schema altering statements implementations can now
stop calling for global migration manager instance and get it
from the query processor.
Here are the trivial cases when the query processor is just
avaiable at the place where it's needed.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
The query processor sits upper than the migration manager,
in the services layering, it's started after and (will be)
stopped before the migration manager.
The migration manager is needed in schema altering statements
which are called with query processor argument. They will
later get the migration manager from the query processor.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Now when the only call to .announce_migration gas the
query processor at hands -- pass it to the real statements.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
The schema altering statements are all inherited from the same
base class which delcares a pure virtual .announce_migration()
method. All the real statements are called with storage proxy
argument, while the need the migration manager. So like in the
previous patch -- replace storage proxy with query processor.
While doing the replacement also get the database instance from
the querty processor, not from proxy.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Currently the statement's execute() method accepts storage
proxy as the first argument. This is enough for all of them
but schema altering ones, because the latter need to call
migration manager's announce.
To provide the migration manager to those who need it it's
needed to have some higher-level service that the proxy. The
query processor seems to be good candidate for it.
Said that -- all the .execute()s now accept the querty
processor instead of the proxy and get the proxy itself from
the query processor.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Use expressions instead of _clustering_columns_restrictions. This is
a step towards replacing the entire restrictions class hierarchy with
expressions.
Update some expected results in unit tests to reflect the new code.
These new results are equivalent to the old ones in how
storage_proxy::query() will process them (details:
bound_view::from_range() returns the same result for an empty-prefix
singular as for (-inf,+inf)).
Signed-off-by: Dejan Mircevski <dejan@scylladb.com>
Omitting these operators didn't cause bugs, because needs_filtering()
is never invoked on them. But that will likely change in the future,
so add them now to prevent problems down the road.
Signed-off-by: Dejan Mircevski <dejan@scylladb.com>
Instead of defining this enum in multi_column_restriction::slice, put
it in the expr namespace and add it to binary_operator. We will need
it when we switch bounds calculation from multi_column_restriction to
expr classes.
Signed-off-by: Dejan Mircevski <dejan@scylladb.com>
It will be used in statement_restrictions for calculating clustering
bounds. And it will come in handy elsewhere in the future, I'm sure.
Signed-off-by: Dejan Mircevski <dejan@scylladb.com>
Add statement_restrictions::_clustering_prefix_restrictions and fill
it with relevant expressions. Explain how to find all such
expressions in the WHERE clause.
Signed-off-by: Dejan Mircevski <dejan@scylladb.com>
Refs #7961Fixes#8014
Instead of doing a deep copy of input, we keep assume ownership and build
rows of the views therein, potentially retaining fragmented data as-is
avoiding premature linearization.
Note that this is not all sugar and flowers though. Any data access will
by nature be more expensive, and the view collections we create are
potentially just as expensive as copying for small cells.
Otoh, it allows writing code using this that avoids data copying,
depending on destination.
v2:
* Fixed wrong collection reserved in visitor
* Changed row index from shared ptr to ref
* Moved typedef
* Removed non-existing constructors
* Added const ref to index build
* Fixed raft usage after rebase
v3:
* Changed shared_ptr to unique
Accidentally introduced in 9eed26ca3d, it can never be true due to
code above it.
Tests: unit (dev)
Signed-off-by: Dejan Mircevski <dejan@scylladb.com>
Closes#8201
Refs #8093
Refs /scylladb/scylla-tools-java#218
Adds keyword that can preface value tuples in (a, b, c) > (1, 2, 3)
expressions, forcing the restriction to bypass column sort order
treatment, and instead just create the raw ck bounds accordningly.
This is a very limited, and simple version, but since we only need
to cover this above exact syntax, this should be sufficient.
v2:
* Add small cql test
v3:
* Added comment in multi_column_restriction::slice, on what "mode" means and is for
* Added small document of our internal CQL extension keywords, including this.
v4:
* Added a few more cases to tests to verify multi-column restrictions
* Reworded docs a bit
v5:
* Fixed copy-paste error in comment
v6:
* Added negative (error) test cases
v7:
* Added check + reject of trying to combine SCYLLA_CLUST... slice and
normal one
Closes#8094
Timeout config is now stored in each connection, so there's no point
in tracking it inside each query as well. This patch removes
timeout_config from query_options and follows by removing now
unnecessary parameters of many functions and constructors.
The `query_processor::query` method allowed internal paged queries.
However, it was quite limited, hardcoding a number of parameters:
consistency level, timeout config, page size.
This commit does the following improvements:
1. Rename `query` to `query_internal` to make it obvious that this API
is supposed to be used for internal queries only
2. Extend the method to take consistency level, timeout config, and page
size as parameters
3. Remove unused overloads of `query_internal`
4. Fix a bunch of typos / grammar issues in the docstring
Commit aab6b0ee27 introduced the
controversial new IMR format, which relied on a very template-heavy
infrastructure to generate serialization and deserialization code via
template meta-programming. The promise was that this new format, beyond
solving the problems the previous open-coded representation had (working
on linearized buffers), will speed up migrating other components to this
IMR format, as the IMR infrastructure reduces code bloat, makes the code
more readable via declarative type descriptions as well as safer.
However, the results were almost the opposite. The template
meta-programming used by the IMR infrastructure proved very hard to
understand. Developers don't want to read or modify it. Maintainers
don't want to see it being used anywhere else. In short, nobody wants to
touch it.
This commit does a conceptual revert of
aab6b0ee27. A verbatim revert is not
possible because related code evolved a lot since the merge. Also, going
back to the previous code would mean we regress as we'd revert the move
to fragmented buffers. So this revert is only conceptual, it changes the
underlying infrastructure back to the previous open-coded one, but keeps
the fragmented buffers, as well as the interface of the related
components (to the extent possible).
Fixes: #5578
Currently there are places that call
keyspace_element_name::get_keyspace() without checking that _ks_name is
engaged. Fix those places.
Message-Id: <20210216085545.54753-1-gleb@scylladb.com>
Rather than asserting, as seen in #7977.
This shouldn't crash the server in production.
Add unit test that reproduces this scenario
and verifies the internal error exception.
Fixes#7977
Test: unit(release)
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Message-Id: <20210201163051.1775536-1-bhalevy@scylladb.com>
As reproduced in cql-pytest/test_json.py and reported in issue #7911,
failing fromJson() calls should return a FUNCTION_FAILURE error, but
currently produce a generic SERVER_ERROR, which can lead the client
to think the server experienced some unknown internal error and the
query can be retried on another server.
This patch adds a new cassandra_exception subclass that we were missing -
function_execution_exception - properly formats this error message (as
described in the CQL protocol documentation), and uses this exception
in two cases:
1. Parse errors in fromJson()'s parameters are converted into a
function_execution_exception.
2. Any exceptions during the execute() of a native_scalar_function_for
function is converted into a function_execution_exception.
In particular, fromJson() uses a native_scalar_function_for.
Note, however, that functions which already took care to produce
a specific Cassandra error, this error is passed through and not
converted to a function_execution_exception. An example is
the blobAsText() which can return an invalid_request error, so
it is left as such and not converted. This also happens in Cassandra.
All relevant tests in cql-pytest/test_json.py now pass, and are
no longer marked xfail. This patch also includes a few more improvements
to test_json.py.
Fixes#7911
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20210118140114.4149997-1-nyh@scylladb.com>
Merged patch series by Konstantin Osipov:
"These series improve uniqueness of generated timeuuids and change
list append/prepend logic to use client/LWT timestamp in timeuuids
generated for list keys. Timeuuid compare functions are
optimized.
The test coverage is extended for all of the above."
uuid: add a comment warning against UUID::operator<
uuid: replace slow versions of timeuiid compare with optimized/tested versions.
test: add tests for legacy uuid compare & msb monotonicity
test: add a test case for append/prepend limit
test: add a test case for monotonicity of timeuuid least significant bits
uuid: implement optimized timeuuid compare
test: add a test case for list prepend/append with custom timestamp
lists: rewrite list prepend to use append machinery
lists: use query timestamp for list cell values during append
uuid: fill in UUID node identifier part of UUID
test: add a CQL test for list append/prepend operations
Rewrite list prepend to use the same machinery
as append, and thus produce correct results when used in LWT.
After this patch, list prepend begins to honor user supplied timestamps.
If a user supplied timestamp for prepend is less than 2010-01-01 00:00:00
an exception is thrown.
Fixes#7611
Scylla list cells are represented internally as a map of
timeuuid => value. To append a new value to a list
the coordinator generates a timeuuid reflecting the current time as key
and adds a value to the map using this key.
Before this patch, Scylla always generated a timeuuid for a new
value, even if the query had a user supplied or LWT timestamp.
This could break LWT linearizability. User supplied timestamps were
ignored.
This is reported as https://github.com/scylladb/scylla/issues/7611
A statement which appended multiple values to a list or a BATCH
generated an own microsecond-resolution timeuuid for each value:
BEGIN BATCH
UPDATE ... SET a = a + [3]
UPDATE ... SET a = a + [4]
APPLY BATCH
UPDATE ... SET a = a + [3, 4]
To fix the bug, it's necessary to preserve monotonicity of
timeuuids within a batch or multi-value append, but make sure
they all use the microsecond time, as is set by LWT or user.
To explain the fix, it's first necessary to recall the structure
of time-based UUIDs:
60 bits: time since start of GMT epoch, year 1582, represented
in 100-nanosecond units
4 bits: version
14 bits: clock sequence, a random number to avoid duplicates
in case system clock is adjusted
2 bits: type
48 bits: MAC address (or other hardware address)
The purpose of clockseq bits is as defined in
https://tools.ietf.org/html/rfc4122#section-4.1.5
is to reduce the probability of UUID collision in case clock
goes back in time or node id changes. The implementation should reset it
whenever one of these events may occur.
Since LWT microsecond time is guaranteed to be
unique by Paxos, the RFC provisioning for clockseq and MAC
slots becomes excessive.
The fix thus changes timeuuid slot content in the following way:
- time component now contains the same microsecond time for all
values of a statement or a batch. The time is unique and monotonic in
case of LWT. Otherwise it's most always monotonic, but may not be
unique if two timestamps are created on different coordinators.
- clockseq component is used to store a sequence number which is
unique and monotonic for all values within the statement/batch.
- to protect against time back-adjustments and duplicates
if time is auto-generated, MAC component contains a random (spoof)
MAC address, re-created on each restart. The address is different
at each shard.
The change is made for all sources of time: user, generated, LWT.
Conditioning the list key generation algorithm on the source of
time would unnecessarily complicate the code while not increase
quality (uniqueness) of created list keys.
Since 14 bits of clockseq provide us with only 16383 distinct slots
per statement or batch, 3 extra bits in nanosecond part of the time
are used to extend the range to 131071 values per statement/batch.
If the rang is exceeded beyond the limit, an exception is produced.
A twist on the use of clockseq to extend timeuuid uniqueness is
that Scylla, like Cassandra, uses int8 compare to compare lower
bits of timeuuid for ordering. The patch takes this into account
and sign-complements the clockseq value to make it monotonic
according to the legacy compare function.
Fixes#7611
test: unit (dev)
This is a revival of #7490.
Quoting #7490:
The managed_bytes class now uses implicit linearization: outside LSA, data is never fragmented, and within LSA, data is linearized on-demand, as long as the code is running within with_linearized_managed_bytes() scope.
We would like to stop linearizing managed_bytes and keep it fragmented at all times, since linearization can require large contiguous chunks. Large contiguous allocations are hard to satisfy and cause latency spikes.
As a first step towards that, we remove all implicitly linearizing accessors and replace them with an explicit linearization accessor, with_linearized().
Some of the linearization happens long before use, by creating a bytes_view of the managed_bytes object and passing it onwards, perhaps storing it for later use. This does not work with with_linearized(), which creates a temporary linearized view, and does not work towards the longer term goal of never linearizing. As a substitute a managed_bytes_view class is introduced that acts as a view for managed_bytes (for interoperability it can also be a view for bytes and is compatible with bytes_view).
By the end of the series, all linearizations are temporary, within the scope of a with_linearized() call and can be converted to fragmented consumption of the data at leisure.
This has limited practical value directly, as current uses of managed_bytes are limited to keys (which are limited to 64k). However, it enables converting the atomic_cell layer back to managed_bytes (so we can remove IMR) and the CQL layer to managed_bytes/managed_bytes_view, removing contiguous allocations from the coordinator.
Closes#7820
* github.com:scylladb/scylla:
test: add hashers_test
memtable: fix accounting of managed_bytes in partition_snapshot_accounter
test: add managed_bytes_test
utils: fragment_range: add a fragment iterator for FragmentedView
keys: update comments after changes and remove an unused method
mutation_test: use the correct preferred_max_contiguous_allocation in measuring_allocator
row_cache: more indentation fixes
utils: remove unused linearization facilities in `managed_bytes` class
misc: fix indentation
treewide: remove remaining `with_linearized_managed_bytes` uses
memtable, row_cache: remove `with_linearized_managed_bytes` uses
utils: managed_bytes: remove linearizing accessors
keys, compound: switch from bytes_view to managed_bytes_view
sstables: writer: add write_* helpers for managed_bytes_view
compound_compat: transition legacy_compound_view from bytes_view to managed_bytes_view
types: change equal() to accept managed_bytes_view
types: add parallel interfaces for managed_bytes_view
types: add to_managed_bytes(const sstring&)
serializer_impl: handle managed_bytes without linearizing
utils: managed_bytes: add managed_bytes_view::operator[]
utils: managed_bytes: introduce managed_bytes_view
utils: fragment_range: add serialization helpers for FragmentedMutableView
bytes: implement std::hash using appending_hash
utils: mutable_view: add substr()
utils: fragment_range: add compare_unsigned
utils: managed_bytes: make the constructors from bytes and bytes_view explicit
utils: managed_bytes: introduce with_linearized()
utils: managed_bytes: constrain with_linearized_managed_bytes()
utils: managed_bytes: avoid internal uses of managed_bytes::data()
utils: managed_bytes: extract do_linearize_pure()
thrift: do not depend on implicit conversion of keys to bytes_view
clustering_bounds_comparator: do not depend on implicit conversion of keys to bytes_view
cql3: expression: linearize get_value_from_mutation() eariler
bytes: add to_bytes(bytes)
cql3: expression: mark do_get_value() as static
The min/max aggregators use aggregate_type_for comparators, and the
aggregate_type_for<timeuuid> is regular uuid. But that yields wrong
results; timeuuids should be compared as timestamps.
Fix it by changing aggregate_type_for<timeuuid> from uuid to timeuuid,
so aggregators can distinguish betwen the two. Then specialize the
aggregation utilities for timeuuid.
Add a cql-pytest and change some unit tests, which relied on naive
uuid comparators.
Fixes#7729.
Tests: unit (dev, debug)
Signed-off-by: Dejan Mircevski <dejan@scylladb.com>
Closes#7910
The option can only take integer values >= 0, since negative
TTL is meaningless and is expected to fail the query when used
with `USING TTL` clause.
It's better to fail early on `CREATE TABLE` and `ALTER TABLE`
statement with a descriptive message rather than catch the
error during the first lwt `INSERT` or `UPDATE` while trying
to insert to system.paxos table with the desired TTL.
Tests: unit(dev)
Fixes: #7906
Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>
Message-Id: <20210111202942.69778-1-pa.solodovnikov@scylladb.com>
Cassandra constructs `QueryOptions.SpecificOptions` in the same
way that we do (by not providing `serial_constency`), but they
do have a user-defined constructor which does the following thing:
this.serialConsistency = serialConsistency == null ? ConsistencyLevel.SERIAL : serialConsistency;
This effectively means that DEFAULT `SpecificOptions` always
have `SerialConsistency` set to `SERIAL`, while we leave this
`std::nullopt`, since we don't have a constructor for
`specific_options` which does this.
Supply `db::consistency_level::SERIAL` explicitly to the
`specific_options::DEFAULT` value.
Tests: unit(dev)
Fixes: #7850
Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>
Message-Id: <20201231104018.362270-1-pa.solodovnikov@scylladb.com>
Replace two methods for unreversal (`as` and `self_or_reversed`) with
a new one (`without_reversed`). More flexible and better named.
Tests: unit (dev)
Signed-off-by: Dejan Mircevski <dejan@scylladb.com>
Closes#7889
The keys classes (partition_key et al) already use managed_bytes,
but they assume the data is not fragmented and make liberal use
of that by casting to bytes_view. The view classes use bytes_view.
Change that to managed_bytes_view, and adjust return values
to managed_bytes/managed_bytes_view.
The callers are adjusted. In some places linearization (to_bytes())
is needed, but this isn't too bad as keys are always <= 64k and thus
will not be fragmented when out of LSA. We can remove this
linearization later.
The serialize_value() template is called from a long chain, and
can be reached with either bytes_view or managed_bytes_view.
Rather than trace and adjust all the callers, we patch it now
with constexpr if.
operator bytes_view (in keys) is converted to operator
managed_bytes_view, allowing callers to defer or avoid
linearization.
Unset values for key and value were not handled. Handle them in a
manner matching Cassandra.
This fixes all cases in testMapWithUnsetValues, so re-enable it (and
fix a comment typo in it).
Signed-off-by: Dejan Mircevski <dejan@scylladb.com>
When the right-hand side of IN is an unset value, we must report an
error, like Cassandra does.
This fixes testListWithUnsetValues, so re-enable it.
Signed-off-by: Dejan Mircevski <dejan@scylladb.com>
Make the bind() operation of the scalar marker handle the unset-value
case (which it previously didn't).
Signed-off-by: Dejan Mircevski <dejan@scylladb.com>
Avoid crash described in #7740 by ignoring the update when the
element-to-remove is UNSET_VALUE.
Tests: unit (dev)
Signed-off-by: Dejan Mircevski <dejan@scylladb.com>
When the clustering order is reversed on a map column, the column type
is reversed_type_impl, not map_type_impl. Therefore, we have to check
for both reversed type and map type in some places.
This patch handles reverse types in enough places to make
test_clustering_key_reverse_frozen_map pass. However, it leaves
other places (invocations of is_map() and *_cast<map_type_impl>())
as they currently are; some are protected by callers from being
invoked on reverse types, but some are quite possibly bugs untriggered
by existing tests.
Signed-off-by: Dejan Mircevski <dejan@scylladb.com>
When the clustering order is reversed on a list column, the column type
is reversed_type_impl, not list_type_impl. Therefore, we have to check
for both reversed type and list type in some places.
This patch handles reverse types in enough places to make
test_clustering_key_reverse_frozen_list pass. However, it leaves
other places (invocations of is_list() and *_cast<list_type_impl>())
as they currently are; some are protected by callers from being
invoked on reverse types, but some are quite possibly bugs untriggered
by existing tests.
Signed-off-by: Dejan Mircevski <dejan@scylladb.com>