"
When a view's partition key contains only columns from the base's partition
key (and not an additional one), the liveness - existance or disappearance -
of a view-table row is tied to the liveness of the base table row. And
that, in turn, depends not only on selected columns (base-table columns
SELECTed to also appear in the view) but also on unselected columns.
This means that we may need to keep a view row alive even without data,
just because some unselected column is alive in the base table. Before this
patch set we tried to build a single "row marker" in the view column which
tried to summarize the liveness information in all unselected columns.
But this proved unworkable, as explained in issue #3362 and as will be
demonstrated in unit tests at the end of this series.
Because we can't replace several unselected cells by one row marker, what
we do in this series is to add for each for the unselected cells a "virtual
cell" which contains the cell's liveness information (timestamp, deletion,
ttl) but not its value. For collections, we can't represent the entire
collection by one virtual cell, and rather need a collection of virtual
cells.
Fixes#3362
"
* 'virtual-cols-v3' of https://github.com/nyh/scylla:
Materialized Views: test that virtual columns are not visible
Materialized Views: unit test reproducing fixed issue #3362
Materialized Views: no need for elaborate row marker calculations
Materialized Views: add unselected columns as virtual columns
Materialized Views: fill virtual columns
Do not allow selecting a virtual column
schema: persist "view virtual" columns to a separate system table
schema: add "view virtual" flag to schema's column_definition
Add "empty" type name to CQL parser, but only for internal parsing
This patch fixes a regression introduced in
9e88b60ef5, which broke the lookup for
prefetched values of lists when a clustering key is specified.
This is the code that was removed from some list operations:
std::experimental::optional<clustering_key> row_key;
if (!column.is_static()) {
row_key = clustering_key::from_clustering_prefix(*params._schema, prefix);
}
...
auto&& existing_list = params.get_prefetched_list(m.key().view(), row_key, column);
Put it back, in the form of common code in the update_parameters class.
Fixes#3703
Signed-off-by: Duarte Nunes <duarte@scylladb.com>
The constraint is no longer relevant, since Casandra removed
it in version 2.2. In addition the mechanism for handling this
case is already implemented and is identical in case of
clustering keys with single column EQ,= and IN relations.
(Cartesian product of singular ranges).
A unit test for this test case was added.
Fixes#1735
Tests:
1. Unit Tests.
2. Manual testing with the case described in the issue.
3. dtest: ql_additional_tests.py:TestCQL.composite_row_key_test
Signed-off-by: Eliran Sinvani <eliransin@scylladb.com>
Message-Id: <83b43fdc1ca0e0cc287f66f11816fc71b8bd2925.1534430405.git.eliransin@scylladb.com>
LIMIT should restrict the output result and not the query whose result
set is aggregated. when using aggregate the output is guarantied to
be only one row long. since LIMIT accepts only none negative numbers,
it has no effect and can be ignored.
Fixes#2028
Tests: The issue described Testcase , UnitTests.
Signed-off-by: Eliran Sinvani <eliransin@scylladb.com>
Message-Id: <6c235376c81f052020e2ed23d0a3d071b36d4415.1534416997.git.eliransin@scylladb.com>
When a view's partition key contains only columns from the base's partition
key (and not an additional one), the liveness (existance or disappearance)
of a view-table row is tied to the liveness of the base table row - and
that depends not only on selected columns (base-table columns SELECTed to
also appear in the view) but also on unselected columns.
This means that we may need to keep a view row alive even without data,
just because some unselected column is alive in the base table. Before this
patch we tried to build a single "row marker" in the view column which
summarizes the liveness information in all unselected columns, but this
proved unworkable, as explained in issue #3362 and as will be demonstrated
in unit tests in a later patch.
Because we can't replace several unselected cells by one row marker, what
we do in this patch is to add for each for the unselected cell a "virtual
cell" which contains the cell's liveness information (timestamp, deletion,
ttl) but not its value. For collections, we can't represent the entire
collection by one virtual cell, and rather need a collection of virtual
cells.
This patch just adds the virtual columns to the view schema. Code in
the previous patch, when it notices the virtual columns in the view's
schema, added the appropriate content into these columns.
We may need to add virtual columns to a view when first created, but also
when an unselected column is added to the base table with "ALTER TABLE",
so both are supported in this patch.
Fixes#3362.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
For issue #3362, we will need to add to a materialized view also unselected
base-table columns as "virtual columns". We need these columns to exist
to keep view rows alive, but we don't want the user to be able to see
them.
In this patch we prevent SELECTing the virtual columns of the view,
and also exclude the virtual columns from a "SELECT *" on a view.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Even before this patch, Scylla supported the "empty" type (a column with
no content) but only internally - i.e., in code but not in CQL syntax.
The "empty" type was used in dense tables without regular columns, and a
special optimization in db::cql_type_parser::parse() allowed this type
name to be parsed when reading the schema tables, without allowing the
"empty" type to be used by users in CQL statements.
However, parse() only supported "empty" itself, and more complex types
like list<empty> were not recognized by parse(). In the following patches,
we plan to add to virtual columns to materialized views, with types empty,
list<empty> or map<something, empty>. We need all these types to work, and
before this patch, they don't. But we want all of these types to only work
internally - when Scylla's code creates these hidden columns; we do not
want to add the "empty" type to CQL's syntax.
This is what we do in this patch: The CQL parser's comparator_type rule
now has a parameter, "internal", used to differenciate internal calls
via db::cql_type_parser::parse() from calls from CQL query parsing.
If a user tries something like:
CREATE TABLE e (pk empty PRIMARY KEY);
He will get the error:
Invalid (reserved) user type name empty
Note that here, as usual, unknown types are treated as "user types",
and "empty" is not allowed as a user type name - we "reserve" it in case
one day in the future we will want to allow users a direct syntax to
create empty columns. We already have, following Cassandra, a bunch of
other names reserved from being user type names, including "byte",
"complex", and others (see _reserved_type_names()), and using "empty"
as a type name will result in a similar error message.
Just like all other type names, the name "empty" is not a reserved
keyword in other senses: a user can create a table or a column with
the name "empty", just like he can create one with the name "int".
Refs #3362.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
_value_views is the authoritative data structure for the
client-specified values. Indeed, the ctor called
transport::request::read_options() leaves _values completely empty.
In query_options::prepare() we were, however, using _values to
associated values to the client-specified column names, and not
_value_views. Fix this by using _value_views instead.
As for the reasons we didn't see this bug earlier, I assume it's
because very few drivers set the 0x04 query options flag, which means
column names are omitted. This is the right thing to do since most
drivers have enough information to correctly position the values.
Fixes#3688
Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Message-Id: <20180814234605.14775-1-duarte@scylladb.com>
A raw value can be in one of three states: a valid value, an unset
value, a null value. When translating raw_values to their views, we
were treating both unset and null values are null raw_value_views.
Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Message-Id: <20180814231051.14385-1-duarte@scylladb.com>
We need to validate before calling query_options::prepare() whether
the set of prepared statement values sent in the query matches the
amount of names we need to bind, otherwise we risk an out-of-bounds
access if the client also specified names together with the values.
Refs #3688
Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Message-Id: <20180814225607.14215-1-duarte@scylladb.com>
When the list of values in the IN list of a single column contains
duplicates, multiple executors are activated since the assumption
is that each value in the IN list corresponds to a different partition.
this results in the same row appearing in the result number times
corresponding to the duplication of the partition value.
Added queries for the in restriction unitest and fixed with a bad result check.
Fixes#2837
Tests: Queries as in the usecase from the GitHub issue in both forms ,
prepared and plain (using python driver),Unitest.
Signed-off-by: Eliran Sinvani <eliransin@scylladb.com>
Message-Id: <ad88b7218fa55466be7bc4303dc50326a3d59733.1534322238.git.eliransin@scylladb.com>
"
This miniseries fixes ALLOW FILTERING support for prepared statements
by passing correct query options to the filter instead of empty ones.
"
* 'pass_query_options_to_restrictions_filter' of https://github.com/psarna/scylla:
tests: add testing prepared statements with ALLOW FILTERING
cql3: pass query options to restrictions filter
"
This series addresses SELECT/INSERT JSON support issues, namely
handling null values properly and parsing decimals from strings.
It also comes with updated cql tests.
Tests: unit (release)
"
* 'json_fixes_3' of https://github.com/psarna/scylla:
cql3: remove superfluous null conversions in to_json_string
tests: update JSON cql tests
cql3: enable parsing decimal JSON values from string
cql3: add missing return for dead cells
cql3: simplify parsing optional JSON values
cql3: add handling null value in to_json
cql3: provide to_json_string for optional bytes argument
Query options may contain bound values needed for checking filtering
restrictions. Previously, empty query_options{} were used, which
caused prepared statements to fail.
Fixes#3677
"
This series adds some optimisations to the paging logic, that attempt to
close the performance gap between paged and not paged queries. The
former are more complex so always are going to be slower, but the
performance loss was unacceptably large.
Fixes#3619.
Performance with paging:
./perf_paging_before ./perf_paging_after diff
read 271246.13 312815.49 15.3%
Without paging:
./perf_nopaging_before ./perf_nopaging_after diff
read 343732.17 342575.77 -0.3%
Tests: unit(release), dtests(paging_test.py, paging_additional_test.py)
"
* tag 'optimise-paging/v1' of https://github.com/pdziepak/scylla:
cql3: select statement: don't copy metadata if not needed
cql3: query_options: make simple getter inlineable
cql3: metadata: avoid copying column information
query_pager: avoid visiting result_view if not needed
query::result_view: add get_last_partition_and_clustering_key()
query::result_reader: fix const correctness
tests/uuid: add more tests including make_randm_uuid()
utils: uuid: don't use std::random_device()
The column-related metadata is shared by all requests done with the same
perpared query. However, metadata class contains also some additional
flags and paging state which may differ. This patch allows sharing
column information among multiple instances of the metadata class.
Previously CQL grammar wrongfully required INSERT JSON queries
to provide a list of columns, even though they are already
present in JSON itself.
Unfortunately, tests were written with this false assumption as well,
so they're are updated.
Message-Id: <33b496cba523f0f27b6cbf5539a90b6feb20269e.1532514111.git.sarna@scylladb.com>
If an indexed query has partition+clustering key restrictions as well
and at least some of these restrictions create a prefix, this prefix
is used in the index query to narrow down the number of rows read.
Refs #3611
"
This series changes the native CQL3 protocl layer so that it works with
fragmented buffers instead of a single temporary_buffer per request.
The main part is fragmented_temporary_buffer which represents a
fragmented buffer consisting of multiple temporary_buffers. It provides
helpers for reading fragmented buffer from an input_stream, interpreting
the data in the fragmented buffer as well as view that satisfy
FragmentRange concept.
There are still situations where a fragmented buffer is linearised. That
includes decompressing client requests (this uses reusable buffers in a
similar way to the code that sends compressed responses), CQL statement
restrictions and values that are hard-coded in prepared statements
(hopefully, the values in those cases will be small), value validation
in some cases (blobs are not validated, irrelevant for many fixed-size
small types, but may be a problem for large text cells) as well as
operations on collections.
Tests: unit(release), dtests(cql_prepared_test.py, cql_tests.py, cql_additional_tests.py)
"
* tag 'fragmented-cql3-receive/v1' of https://github.com/pdziepak/scylla: (23 commits)
types: bytes_view: override fragmented validate()
cql3: value_view: switch to fragmented_temporary_buffer::view
types: add validate that accepts fragmented_temporary_buffer::view
cql3 query_options: add linearize()
cql3: query_options: use bytes_ostream for temporaries
cql3: operation: make make_cell accept fragmented_temporary_buffer::view
atomic_cell: accept fragmented_temporary_buffer::view values
cql3: avoid ambiguity in a call to update_parameters::make_cell()
transport: switch to fragmented_temporary_buffer
transport: extract compression buffers from response class
tests/reusable_buffer: test fragmented_temporary_buffer support
utils: reusable_buffer: support fragmented_temporary_buffer
tests: add test for fragmented_temporary_buffer
util fragment_range: add general linearisation functions
utils: add fragmented_temporary_buffer
tests: add basic test for transport requests and responses
tests/random-utils: print seed
tests/random-utils: generate sstrings
cql3: add value_view printer and equality comparison
transport: move response outside of cql_server class
...
If full partition key (or full primary key) is used in an indexed
query, it should not require filtering, because queries like that
can be efficiently narrowed down with stricter index restrictions.
If both index and partition key is used in a query, it should not
require filtering, because indexed query can be narrowed down
with partition key information. This commit appends partition key
restrictions to index query.
Some code in the CQL3 layer requires bytes_view and it is fairly
reasonable to assume that it won't deal with large buffers (e.g.
statement restrictions). query_options already has make_temporary()
which takes ownership of a cql3::raw_value so that the rest of the code
can use cql3::raw_value_view. This patch adds similar linearize()
function which, if necessary, linearises a cql3::raw_value_view and
returns a bytes_view with lifetime tied to the life or query_options.
bytes_ostream is going to be more efficient than
std::vector<std::vector<char>> since it can put multiple small values in
a single buffer thus reducing the number of memory allocations.
Using initializer lists in calls like foo({}) is ambiguous if foo() has
multiple overloads with more than one accepting a type that is
default-constructible. update_parameters::make_cell() is about to get an
overload that accepts fragmented_temporary_buffer::view as a value, so
let's make sure its call site won't be ambiguous.
ALLOW FILTERING support caused index-related restrictions to possibly
have more values. In order to remain correct, only those restrictions
which match the indexed columns should be used.
In order to extract value from a restriction for just one column,
value_for(column_name, options) method is implemented.
It's needed because once ALLOW FILTERING support was introduced,
index-related restrictions may contain more than 1 value.
In order to prevent future compilation errors, externally defined
class methods from single column primary key restrictions are explicitly
marked inline.
Conditions that detect if restrictions need an indexed query weren't
specific enough to work properly with mixed index-filtering queries,
because they would overly eager assume that partition/clustering key
restrictions have a backing index.
Original series that introduced filtering logged a warning
when collection restrictions appeared. Instead, an exception
should be thrown until collection restrictions are supported
for ALLOW FILTERING clauses.
Message-Id: <ddaf342d4d6766fadb756f66e5afa0b99ce054f8.1531220558.git.sarna@scylladb.com>