1380 Commits

Author SHA1 Message Date
Duarte Nunes
79d796e710 Merge 'Materialized Views: row liveness correction' from Nadav
"
When a view's partition key contains only columns from the base's partition
key (and not an additional one), the liveness - existence or disappearance -
of a view-table row is tied to the liveness of the base table row. And
that, in turn, depends not only on selected columns (base-table columns
SELECTed to also appear in the view) but also on unselected columns.

This means that we may need to keep a view row alive even without data,
just because some unselected column is alive in the base table. Before this
patch set we tried to build a single "row marker" in the view column which
tried to summarize the liveness information in all unselected columns.
But this proved unworkable, as explained in issue #3362 and as will be
demonstrated in unit tests at the end of this series.

Because we can't replace several unselected cells by one row marker, what
we do in this series is to add for each of the unselected cells a "virtual
cell" which contains the cell's liveness information (timestamp, deletion,
ttl) but not its value. For collections, we can't represent the entire
collection by one virtual cell, and rather need a collection of virtual
cells.

Fixes #3362
"

* 'virtual-cols-v3' of https://github.com/nyh/scylla:
  Materialized Views: test that virtual columns are not visible
  Materialized Views: unit test reproducing fixed issue #3362
  Materialized Views: no need for elaborate row marker calculations
  Materialized Views: add unselected columns as virtual columns
  Materialized Views: fill virtual columns
  Do not allow selecting a virtual column
  schema: persist "view virtual" columns to a separate system table
  schema: add "view virtual" flag to schema's column_definition
  Add "empty" type name to CQL parser, but only for internal parsing
2018-08-29 14:32:38 +01:00
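The virtual-cell idea described in the merge above can be sketched roughly as follows. This is an illustrative model only, not Scylla's code: the names `virtual_cell`, `is_live` and `row_is_live` are hypothetical, and the rule "the row is alive while any cell is alive" is a simplifying assumption that ignores row markers and collections.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical model of a "virtual cell": liveness metadata only, no value.
struct virtual_cell {
    std::int64_t timestamp;              // write timestamp of the base cell
    bool deleted;                        // true if the base cell is a tombstone
    std::optional<std::int64_t> expiry;  // absolute expiry time, if a TTL was set
};

// A cell counts as live if it is not deleted and has not expired at `now`.
inline bool is_live(const virtual_cell& c, std::int64_t now) {
    return !c.deleted && (!c.expiry || *c.expiry > now);
}

// Assumption: the view row stays alive while at least one cell is live.
inline bool row_is_live(const std::vector<virtual_cell>& cells, std::int64_t now) {
    for (const auto& c : cells) {
        if (is_live(c, now)) {
            return true;
        }
    }
    return false;
}
```

Keeping one entry per unselected column, instead of folding them into a single marker, lets each column's timestamp, deletion and TTL be evaluated independently.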
Piotr Sarna
fa72422baa cql3: fix handling multi-column partition key in INSERT JSON
Multi-column partition keys were previously handled incorrectly.
The implementation is now based on from_exploded instead of
from_singular.

Fixes #3687
Message-Id: <09e0bdb0f1c18d49b9e67c21777d93ba1545a13c.1534171422.git.sarna@scylladb.com>
2018-08-28 11:34:11 +03:00
Piotr Sarna
465045368f cql3: add proper setting of empty collections in INSERT JSON
Previously empty collections were incorrectly added as dead cells,
which resulted in serialization errors later.

Fixes #3664
Message-Id: <a9c90d66c6737641cafe40edb779df490ada0309.1534848313.git.sarna@scylladb.com>
2018-08-23 11:22:05 +03:00
Duarte Nunes
05731cb5ad cql3/lists: Fix multi-cell static list updates in the presence of ckeys
This patch fixes a regression introduced in
9e88b60ef5, which broke the lookup for
prefetched values of lists when a clustering key is specified.

This is the code that was removed from some list operations:

std::experimental::optional<clustering_key> row_key;
if (!column.is_static()) {
  row_key = clustering_key::from_clustering_prefix(*params._schema, prefix);
}
...
auto&& existing_list = params.get_prefetched_list(m.key().view(), row_key, column);

Put it back, in the form of common code in the update_parameters class.

Fixes #3703

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2018-08-20 21:39:37 +01:00
Eliran Sinvani
f5f6cf2096 cql3: remove rejection of an IN relation if not on last partition KEY
The constraint is no longer relevant, since Cassandra removed
it in version 2.2. In addition the mechanism for handling this
case is already implemented and is identical in case of
clustering keys with single column EQ,= and IN relations.
(Cartesian product of singular ranges).

A unit test for this test case was added.

Fixes #1735
Tests:
1. Unit Tests.
2. Manual testing with the case described in the issue.
3. dtest: ql_additional_tests.py:TestCQL.composite_row_key_test

Signed-off-by: Eliran Sinvani <eliransin@scylladb.com>
Message-Id: <83b43fdc1ca0e0cc287f66f11816fc71b8bd2925.1534430405.git.eliransin@scylladb.com>
2018-08-16 19:32:43 +01:00
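The "Cartesian product of singular ranges" mentioned in the commit above can be illustrated with a small sketch. It assumes each partition key column is restricted to a list of values (one value for EQ, several for IN) and produces every full key combination. The type alias `key` and the function `cartesian_product` are hypothetical names, not Scylla's API.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical representation: one string per partition key column.
using key = std::vector<std::string>;

// per_column[i] holds the allowed values of the i-th partition key column.
// Returns every combination, one full partition key per element.
inline std::vector<key> cartesian_product(
        const std::vector<std::vector<std::string>>& per_column) {
    std::vector<key> result{key{}};  // start with a single empty prefix
    for (const auto& values : per_column) {
        std::vector<key> next;
        for (const auto& prefix : result) {
            for (const auto& v : values) {
                key k = prefix;
                k.push_back(v);
                next.push_back(std::move(k));
            }
        }
        result = std::move(next);
    }
    return result;
}
```

With two columns restricted to two values each, this yields four singular keys. A column with an empty value list yields no keys at all.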
Eliran Sinvani
d743ceae76 cql3: ignore LIMIT in select statement with aggregate
LIMIT should restrict the output result and not the query whose result
set is aggregated. When using an aggregate, the output is guaranteed to
be only one row long. Since LIMIT accepts only non-negative numbers,
it has no effect and can be ignored.

Fixes #2028
Tests: the test case described in the issue, unit tests.

Signed-off-by: Eliran Sinvani <eliransin@scylladb.com>
Message-Id: <6c235376c81f052020e2ed23d0a3d071b36d4415.1534416997.git.eliransin@scylladb.com>
2018-08-16 19:31:56 +01:00
Nadav Har'El
30f721afab Materialized Views: add unselected columns as virtual columns
When a view's partition key contains only columns from the base's partition
key (and not an additional one), the liveness (existence or disappearance)
of a view-table row is tied to the liveness of the base table row - and
that depends not only on selected columns (base-table columns SELECTed to
also appear in the view) but also on unselected columns.

This means that we may need to keep a view row alive even without data,
just because some unselected column is alive in the base table. Before this
patch we tried to build a single "row marker" in the view column which
summarizes the liveness information in all unselected columns, but this
proved unworkable, as explained in issue #3362 and as will be demonstrated
in unit tests in a later patch.

Because we can't replace several unselected cells by one row marker, what
we do in this patch is to add for each unselected cell a "virtual
cell" which contains the cell's liveness information (timestamp, deletion,
ttl) but not its value. For collections, we can't represent the entire
collection by one virtual cell, and rather need a collection of virtual
cells.

This patch just adds the virtual columns to the view schema. Code in
the previous patch, when it notices the virtual columns in the view's
schema, added the appropriate content into these columns.

We may need to add virtual columns to a view when first created, but also
when an unselected column is added to the base table with "ALTER TABLE",
so both are supported in this patch.

Fixes #3362.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2018-08-16 15:42:22 +03:00
Nadav Har'El
3f3a76aa8f Do not allow selecting a virtual column
For issue #3362, we will need to add to a materialized view also unselected
base-table columns as "virtual columns". We need these columns to exist
to keep view rows alive, but we don't want the user to be able to see
them.

In this patch we prevent SELECTing the virtual columns of the view,
and also exclude the virtual columns from a "SELECT *" on a view.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2018-08-16 15:34:22 +03:00
Nadav Har'El
b4fc711903 Add "empty" type name to CQL parser, but only for internal parsing
Even before this patch, Scylla supported the "empty" type (a column with
no content) but only internally - i.e., in code but not in CQL syntax.
The "empty" type was used in dense tables without regular columns, and a
special optimization in db::cql_type_parser::parse() allowed this type
name to be parsed when reading the schema tables, without allowing the
"empty" type to be used by users in CQL statements.

However, parse() only supported "empty" itself, and more complex types
like list<empty> were not recognized by parse(). In the following patches,
we plan to add virtual columns to materialized views, with types empty,
list<empty> or map<something, empty>. We need all these types to work, and
before this patch, they don't. But we want all of these types to only work
internally - when Scylla's code creates these hidden columns; we do not
want to add the "empty" type to CQL's syntax.

This is what we do in this patch: The CQL parser's comparator_type rule
now has a parameter, "internal", used to differentiate internal calls
via db::cql_type_parser::parse() from calls from CQL query parsing.
If a user tries something like:

    CREATE TABLE e (pk empty PRIMARY KEY);

He will get the error:

    Invalid (reserved) user type name empty

Note that here, as usual, unknown types are treated as "user types",
and "empty" is not allowed as a user type name - we "reserve" it in case
one day in the future we will want to allow users a direct syntax to
create empty columns. We already have, following Cassandra, a bunch of
other names reserved from being user type names, including "byte",
"complex", and others (see _reserved_type_names()), and using "empty"
as a type name will result in a similar error message.

Just like all other type names, the name "empty" is not a reserved
keyword in other senses: a user can create a table or a column with
the name "empty", just like he can create one with the name "int".

Refs #3362.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2018-08-16 15:12:27 +03:00
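The effect of the "internal" parameter described above can be approximated by the following sketch. It is not the ANTLR grammar rule itself: `type_kind`, `reserved_type_names` and `parse_type_name` are hypothetical names, native types such as int are left out, and the reserved set lists only the names mentioned in the commit message.

```cpp
#include <cassert>
#include <set>
#include <stdexcept>
#include <string>

enum class type_kind { empty_type, user_type };

// Hypothetical stand-in for the reserved-name list; only the names quoted
// in the commit message are included here.
inline const std::set<std::string>& reserved_type_names() {
    static const std::set<std::string> names = {"byte", "complex", "empty"};
    return names;
}

// `internal` is true when parsing schema tables, false for user CQL.
inline type_kind parse_type_name(const std::string& name, bool internal) {
    if (internal && name == "empty") {
        return type_kind::empty_type;  // allowed only for internal callers
    }
    if (reserved_type_names().count(name) != 0) {
        throw std::invalid_argument("Invalid (reserved) user type name " + name);
    }
    return type_kind::user_type;  // unknown names are treated as user types
}
```

A call with `internal == false` and the name "empty" throws with the same message text quoted in the commit, while the internal path accepts it.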
Duarte Nunes
a4355fe7e7 cql3/query_options: Use _value_views in prepare()
_value_views is the authoritative data structure for the
client-specified values. Indeed, the ctor called
transport::request::read_options() leaves _values completely empty.

In query_options::prepare() we were, however, using _values to
associate values with the client-specified column names, and not
_value_views. Fix this by using _value_views instead.

As for the reasons we didn't see this bug earlier, I assume it's
because very few drivers set the 0x04 query options flag, which means
column names are omitted. This is the right thing to do since most
drivers have enough information to correctly position the values.

Fixes #3688

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Message-Id: <20180814234605.14775-1-duarte@scylladb.com>
2018-08-15 10:38:09 +01:00
Duarte Nunes
8751a58a2b cql3/query_options: Preserve unset values when building value_views
A raw value can be in one of three states: a valid value, an unset
value, a null value. When translating raw_values to their views, we
were treating both unset and null values as null raw_value_views.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Message-Id: <20180814231051.14385-1-duarte@scylladb.com>
2018-08-15 10:37:29 +01:00
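A minimal sketch of the three-state idea from the commit above, assuming a simplified value type. The names `value_state`, `raw_value`, `raw_value_view` and `make_view` are illustrative and do not mirror the real cql3 classes; the point is only that the conversion carries the state through instead of collapsing unset into null.

```cpp
#include <cassert>
#include <string>
#include <string_view>

// The three states named in the commit message.
enum class value_state { value, null_value, unset_value };

struct raw_value {
    value_state state;
    std::string data;  // meaningful only when state == value
};

struct raw_value_view {
    value_state state;
    std::string_view data;  // non-owning view into raw_value::data
};

// Preserve the state as-is; only a real value carries data.
inline raw_value_view make_view(const raw_value& v) {
    if (v.state == value_state::value) {
        return {v.state, std::string_view(v.data)};
    }
    return {v.state, std::string_view()};
}
```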
Duarte Nunes
805ce6e019 cql3/query_processor: Validate presence of statement values timeously
We need to validate before calling query_options::prepare() whether
the set of prepared statement values sent in the query matches the
number of names we need to bind, otherwise we risk an out-of-bounds
access if the client also specified names together with the values.

Refs #3688

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Message-Id: <20180814225607.14215-1-duarte@scylladb.com>
2018-08-15 10:37:13 +01:00
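The ordering requirement in the commit above (validate first, then bind) can be sketched as follows. The helper `bind_named_values` and its error text are made up for illustration; the real check lives in the query processor and operates on different types.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: pairs bind-variable names with client-supplied values.
// The size check runs before any indexing, so values[i] is never out of bounds.
inline std::map<std::string, std::string> bind_named_values(
        const std::vector<std::string>& names,
        const std::vector<std::string>& values) {
    if (names.size() != values.size()) {
        // Illustrative message, not the one Scylla reports.
        throw std::invalid_argument("number of values does not match number of names");
    }
    std::map<std::string, std::string> bound;
    for (std::size_t i = 0; i < names.size(); ++i) {
        bound.emplace(names[i], values[i]);
    }
    return bound;
}
```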
Eliran Sinvani
d734d316a6 cql3: ensure repeated values in IN clauses don't return repeated rows
When the list of values in the IN list of a single column contains
duplicates, multiple executors are activated, since the assumption
is that each value in the IN list corresponds to a different partition.
This results in the same row appearing in the result as many times as
the partition value is duplicated.

Added queries to the IN restriction unit test and fixed a bad result check.

Fixes #2837
Tests: queries as in the use case from the GitHub issue, in both forms
(prepared and plain, using the Python driver), unit test.

Signed-off-by: Eliran Sinvani <eliransin@scylladb.com>
Message-Id: <ad88b7218fa55466be7bc4303dc50326a3d59733.1534322238.git.eliransin@scylladb.com>
2018-08-15 10:21:22 +01:00
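One way to avoid the duplicated rows described above is to drop repeated IN values before creating one executor per value. The sketch below shows that idea on plain integers. Whether the actual fix keeps the original order or sorts the values is not stated in the commit message, so the order-preserving behaviour here is an assumption, and `unique_in_values` is a hypothetical name.

```cpp
#include <cassert>
#include <unordered_set>
#include <vector>

// Returns the IN-list values with duplicates removed, keeping first occurrences
// in their original order.
inline std::vector<int> unique_in_values(const std::vector<int>& values) {
    std::unordered_set<int> seen;
    std::vector<int> result;
    for (int v : values) {
        if (seen.insert(v).second) {  // true only the first time v is seen
            result.push_back(v);
        }
    }
    return result;
}
```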
Piotr Sarna
310d0a74b9 cql3: throw proper request exception for INSERT JSON
JSON code is amended in order to return proper
"Missing mandatory PRIMARY KEY part" message instead of generic
"Attempt to access value of a disengaged optional object".

Fixes #3665
Message-Id: <69157d659d51ce5a2d408614ce3ba7bf8e3a5d88.1534161127.git.sarna@scylladb.com>
2018-08-13 23:57:37 +01:00
Duarte Nunes
1521dc56ae Merge 'Pass query options to restrictions filter' from Piotr
"
This miniseries fixes ALLOW FILTERING support for prepared statements
by passing correct query options to the filter instead of empty ones.
"

* 'pass_query_options_to_restrictions_filter' of https://github.com/psarna/scylla:
  tests: add testing prepared statements with ALLOW FILTERING
  cql3: pass query options to restrictions filter
2018-08-09 18:15:18 +01:00
Duarte Nunes
95677877c2 Merge 'JSON support fixes' from Piotr
"
This series addresses SELECT/INSERT JSON support issues, namely
handling null values properly and parsing decimals from strings.
It also comes with updated cql tests.

Tests: unit (release)
"

* 'json_fixes_3' of https://github.com/psarna/scylla:
  cql3: remove superfluous null conversions in to_json_string
  tests: update JSON cql tests
  cql3: enable parsing decimal JSON values from string
  cql3: add missing return for dead cells
  cql3: simplify parsing optional JSON values
  cql3: add handling null value in to_json
  cql3: provide to_json_string for optional bytes argument
2018-08-09 18:05:34 +01:00
Piotr Sarna
f962b85fa3 cql3: add missing return for dead cells
Fixes #3664
2018-08-09 18:07:12 +02:00
Piotr Sarna
cdbeed4e3b cql3: simplify parsing optional JSON values
With new to_json_string implementation that accepts bytes_opt,
parsing optional values can be simplified to remove explicit
branching.
2018-08-09 18:07:12 +02:00
Piotr Sarna
e4396e17cb cql3: add handling null value in to_json
Previously to_json function would fail with null passed as a parameter.

Fixes #3667
2018-08-09 18:07:12 +02:00
Piotr Sarna
8c18aaa511 cql3: pass query options to restrictions filter
Query options may contain bound values needed for checking filtering
restrictions. Previously, empty query_options{} were used, which
caused prepared statements to fail.

Fixes #3677
2018-08-09 17:44:45 +02:00
Eliran Sinvani
3f2bb07599 cql3: Count unpaged select queries
If the counter goes up this can be a possible reason for slowdown in
queries (since it means that potentially a large amount of data will
be sent to the client at once).

Fixes #2478
Tests: cqlsh with PAGING OFF and ON and validating with a print.

Signed-off-by: Eliran Sinvani <eliransin@scylladb.com>
Message-Id: <01253cee0b8c1110aaee3da41d1f434ca798b430.1533817568.git.eliransin@scylladb.com>
2018-08-09 13:53:44 +01:00
Rafi Einstein
123f2c2a1c Add a counter for reverse queries
Fixes #3492

Tests: dtest(cql_additional_tests.py)
Message-Id: <20180729202615.22459-1-rafie@scylladb.com>
2018-07-30 12:34:43 +03:00
Avi Kivity
a4c9330bfc Merge "Optimise paged queries" from Paweł
"
This series adds some optimisations to the paging logic, that attempt to
close the performance gap between paged and non-paged queries. The
former are more complex, so they are always going to be slower, but the
performance loss was unacceptably large.

Fixes #3619.

Performance with paging:
        ./perf_paging_before  ./perf_paging_after   diff
 read              271246.13            312815.49  15.3%

Without paging:
        ./perf_nopaging_before  ./perf_nopaging_after   diff
 read                343732.17              342575.77  -0.3%

Tests: unit(release), dtests(paging_test.py, paging_additional_test.py)
"

* tag 'optimise-paging/v1' of https://github.com/pdziepak/scylla:
  cql3: select statement: don't copy metadata if not needed
  cql3: query_options: make simple getter inlineable
  cql3: metadata: avoid copying column information
  query_pager: avoid visiting result_view if not needed
  query::result_view: add get_last_partition_and_clustering_key()
  query::result_reader: fix const correctness
  tests/uuid: add more tests including make_randm_uuid()
  utils: uuid: don't use std::random_device()
2018-07-26 19:24:03 +03:00
Paweł Dziepak
3e32245bb8 cql3: select statement: don't copy metadata if not needed
2018-07-26 12:37:20 +01:00
Paweł Dziepak
15775c958a cql3: query_options: make simple getter inlineable
2018-07-26 12:37:06 +01:00
Paweł Dziepak
ef0c999742 cql3: metadata: avoid copying column information
The column-related metadata is shared by all requests done with the same
prepared query. However, metadata class contains also some additional
flags and paging state which may differ. This patch allows sharing
column information among multiple instances of the metadata class.
2018-07-26 12:17:04 +01:00
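The sharing described in the commit above might look roughly like this: immutable column information sits behind a shared pointer, while per-request state stays in each instance. The classes `column_info` and `metadata` below are simplified stand-ins, not the real cql3::metadata.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical column description shared by all executions of one prepared query.
struct column_info {
    std::vector<std::string> names;
};

class metadata {
    std::shared_ptr<const column_info> _columns;  // shared and immutable
public:
    bool has_more_pages = false;  // per-request state, not shared

    explicit metadata(std::shared_ptr<const column_info> columns)
        : _columns(std::move(columns)) {}

    const column_info& columns() const { return *_columns; }
};
```

Copying a `metadata` instance then copies a pointer and a flag instead of the whole column list.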
Piotr Sarna
f66aace685 cql3: fix INSERT JSON grammar
Previously CQL grammar wrongfully required INSERT JSON queries
to provide a list of columns, even though they are already
present in JSON itself.
Unfortunately, tests were written with this false assumption as well,
so they are updated too.
Message-Id: <33b496cba523f0f27b6cbf5539a90b6feb20269e.1532514111.git.sarna@scylladb.com>
2018-07-25 11:36:59 +01:00
Piotr Sarna
8523c24576 cql3: use ck prefix in filtered queries
If a filtering query has restrictions that include any clustering
prefix, the longest prefix will be used to narrow down the query.

Fixes #3611
2018-07-23 14:10:52 +02:00
Piotr Sarna
6cc8ccc771 cql3: use clustering key prefix in index queries
If an indexed query has partition+clustering key restrictions as well
and at least some of these restrictions create a prefix, this prefix
is used in the index query to narrow down the number of rows read.

Refs #3611
2018-07-23 14:10:52 +02:00
Piotr Sarna
ab74f75727 cql3: add conversion to ck longest prefix restrictions
For optimization purposes it's sometimes useful to extract
the longest prefix of clustering key restrictions in order
to narrow down queries.
2018-07-23 14:10:52 +02:00
Piotr Sarna
2e4c493870 cql3: add prefix_size method to ck restrictions
Clustering key restrictions are usually set for at least part
of the clustering key prefix. A method of extracting the longest
prefix's size is added.
2018-07-23 14:10:52 +02:00
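As a rough illustration of what "longest prefix size" means here, the sketch below counts the leading clustering key columns that carry a restriction. It is deliberately simplified: the function name is borrowed from the commit title, but the boolean-vector input is an assumption, and the real method may also take the kind of restriction into account.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// restricted[i] is true if the i-th clustering key column has a restriction.
// Returns the number of leading restricted columns, i.e. the usable prefix length.
inline std::size_t prefix_size(const std::vector<bool>& restricted) {
    std::size_t n = 0;
    while (n < restricted.size() && restricted[n]) {
        ++n;
    }
    return n;
}
```

For restrictions on columns 0, 1 and 3 of a four-column clustering key, the prefix size is 2, because the gap at column 2 ends the prefix.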
Avi Kivity
761931659a Merge "Do not linearise incoming CQL3 requests" from Paweł
"
This series changes the native CQL3 protocol layer so that it works with
fragmented buffers instead of a single temporary_buffer per request.
The main part is fragmented_temporary_buffer which represents a
fragmented buffer consisting of multiple temporary_buffers. It provides
helpers for reading a fragmented buffer from an input_stream and
interpreting the data in it, as well as a view that satisfies the
FragmentRange concept.

There are still situations where a fragmented buffer is linearised. That
includes decompressing client requests (this uses reusable buffers in a
similar way to the code that sends compressed responses), CQL statement
restrictions and values that are hard-coded in prepared statements
(hopefully, the values in those cases will be small), value validation
in some cases (blobs are not validated, irrelevant for many fixed-size
small types, but may be a problem for large text cells) as well as
operations on collections.

Tests: unit(release), dtests(cql_prepared_test.py, cql_tests.py, cql_additional_tests.py)
"

* tag 'fragmented-cql3-receive/v1' of https://github.com/pdziepak/scylla: (23 commits)
  types: bytes_view: override fragmented validate()
  cql3: value_view: switch to fragmented_temporary_buffer::view
  types: add validate that accepts fragmented_temporary_buffer::view
  cql3 query_options: add linearize()
  cql3: query_options: use bytes_ostream for temporaries
  cql3: operation: make make_cell accept fragmented_temporary_buffer::view
  atomic_cell: accept fragmented_temporary_buffer::view values
  cql3: avoid ambiguity in a call to update_parameters::make_cell()
  transport: switch to fragmented_temporary_buffer
  transport: extract compression buffers from response class
  tests/reusable_buffer: test fragmented_temporary_buffer support
  utils: reusable_buffer: support fragmented_temporary_buffer
  tests: add test for fragmented_temporary_buffer
  util fragment_range: add general linearisation functions
  utils: add fragmented_temporary_buffer
  tests: add basic test for transport requests and responses
  tests/random-utils: print seed
  tests/random-utils: generate sstrings
  cql3: add value_view printer and equality comparison
  transport: move response outside of cql_server class
  ...
2018-07-22 19:40:37 +03:00
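To make "linearising" concrete, here is a toy sketch in which a fragmented buffer is modelled as a vector of strings and linearisation copies all fragments into one contiguous buffer. The alias `fragmented_buffer` and the free function `linearize` are illustrative only and unrelated to the actual fragmented_temporary_buffer interface.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Stand-in for a fragmented buffer: a list of separately allocated chunks.
using fragmented_buffer = std::vector<std::string>;

// Copies all fragments into a single contiguous buffer.
// This is the allocation and copy that the series tries to avoid where possible.
inline std::string linearize(const fragmented_buffer& fragments) {
    std::size_t total = 0;
    for (const auto& f : fragments) {
        total += f.size();
    }
    std::string out;
    out.reserve(total);  // one allocation for the whole payload
    for (const auto& f : fragments) {
        out.append(f);
    }
    return out;
}
```

The cost is one contiguous allocation of the full request size, which is why the series limits linearisation to cases where values are expected to be small.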
Piotr Sarna
0c85bdcdc2 cql3: make index+primary key restrictions filtering-independent
If full partition key (or full primary key) is used in an indexed
query, it should not require filtering, because queries like that
can be efficiently narrowed down with stricter index restrictions.
2018-07-18 18:45:08 +02:00
Piotr Sarna
2542630a18 cql3: use primary key restrictions in filtering index queries
If both an index and a partition key are used in a query, it should not
require filtering, because the indexed query can be narrowed down
with partition key information. This commit appends partition key
restrictions to the index query.
2018-07-18 18:45:08 +02:00
Piotr Sarna
27590816f0 cql3: add is_all_eq to primary key restrictions
is_all_eq is later needed to decide if restrictions can be used
in an indexed query.
2018-07-18 18:45:08 +02:00
Piotr Sarna
20a349777e cql3: add explicit conversion between key restrictions
Partition and clustering key restrictions sometimes need to be converted
and this commit provides a way to do that.
2018-07-18 18:45:08 +02:00
Piotr Sarna
f1357defd6 cql3: add apply_to() method to single column restriction
This method allows copying single column restriction,
possibly with a new column definition.
2018-07-18 18:44:38 +02:00
Piotr Sarna
30f9924ad5 cql3: make primary key restrictions' values unambiguous
A using directive must be used to disambiguate the overridden method.
2018-07-18 13:28:37 +02:00
Paweł Dziepak
0b9eed72f4 cql3: value_view: switch to fragmented_temporary_buffer::view
2018-07-18 12:28:06 +01:00
Paweł Dziepak
8f4cb36ef2 cql3 query_options: add linearize()
Some code in the CQL3 layer requires bytes_view and it is fairly
reasonable to assume that it won't deal with large buffers (e.g.
statement restrictions). query_options already has make_temporary()
which takes ownership of a cql3::raw_value so that the rest of the code
can use cql3::raw_value_view. This patch adds a similar linearize()
function which, if necessary, linearises a cql3::raw_value_view and
returns a bytes_view with lifetime tied to the lifetime of query_options.
2018-07-18 12:28:06 +01:00
Paweł Dziepak
3810045f8f cql3: query_options: use bytes_ostream for temporaries
bytes_ostream is going to be more efficient than
std::vector<std::vector<char>> since it can put multiple small values in
a single buffer thus reducing the number of memory allocations.
2018-07-18 12:28:06 +01:00
Paweł Dziepak
dff6cd3e2f cql3: operation: make make_cell accept fragmented_temporary_buffer::view
2018-07-18 12:28:06 +01:00
Paweł Dziepak
7d7910aa4d cql3: avoid ambiguity in a call to update_parameters::make_cell()
Using initializer lists in calls like foo({}) is ambiguous if foo() has
multiple overloads with more than one accepting a type that is
default-constructible. update_parameters::make_cell() is about to get an
overload that accepts fragmented_temporary_buffer::view as a value, so
let's make sure its call site won't be ambiguous.
2018-07-18 12:28:06 +01:00
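The overload ambiguity described above can be reproduced with two toy types. Both `bytes_view_like` and `fragmented_view_like` are hypothetical default-constructible stand-ins, and `make_cell` here is a free function returning a tag, not the real update_parameters::make_cell().

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Two default-constructible parameter types (hypothetical stand-ins).
struct bytes_view_like {
    const char* data = nullptr;
    std::size_t size = 0;
};

struct fragmented_view_like {
    std::vector<bytes_view_like> fragments;
};

inline int make_cell(bytes_view_like) { return 1; }
inline int make_cell(fragmented_view_like) { return 2; }

// make_cell({});  // would not compile: {} converts equally well to either type

// Naming the type at the call site removes the ambiguity.
inline int make_empty_contiguous_cell() { return make_cell(bytes_view_like{}); }
inline int make_empty_fragmented_cell() { return make_cell(fragmented_view_like{}); }
```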
Paweł Dziepak
46acd76cc8 cql3: add value_view printer and equality comparison
BOOST_CHECK_*() expect compared objects to be equality-comparable and
printable.
2018-07-18 12:28:06 +01:00
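What "equality-comparable and printable" amounts to in C++ is an `operator==` and a stream `operator<<` for the type. A minimal sketch on a made-up `value_view_like` type (not the real cql3 value view) follows; the `to_string` helper exists only to make the printer easy to exercise.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical simplified value view: either null or a string payload.
struct value_view_like {
    bool is_null;
    std::string data;
};

// Two views are equal if both are null, or both hold the same data.
inline bool operator==(const value_view_like& a, const value_view_like& b) {
    return a.is_null == b.is_null && (a.is_null || a.data == b.data);
}

// Printer used by test frameworks to report mismatching operands.
inline std::ostream& operator<<(std::ostream& os, const value_view_like& v) {
    if (v.is_null) {
        return os << "null";
    }
    return os << '"' << v.data << '"';
}

// Convenience wrapper around the printer.
inline std::string to_string(const value_view_like& v) {
    std::ostringstream ss;
    ss << v;
    return ss.str();
}
```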
Piotr Sarna
7d9715db27 cql3: use single restriction value in index creation
ALLOW FILTERING support caused index-related restrictions to possibly
have more values. In order to remain correct, only those restrictions
which match the indexed columns should be used.
2018-07-11 18:06:21 +02:00
Piotr Sarna
1d75035672 cql3: add secondary index condition to need_filtering
A query that restricts a partition key and an indexed column
needs filtering (after reading an index) and it wasn't
properly detected before.
2018-07-11 18:06:21 +02:00
Piotr Sarna
80ce9b72a1 cql3: add value_for method
In order to extract value from a restriction for just one column,
value_for(column_name, options) method is implemented.
It's needed because once ALLOW FILTERING support was introduced,
index-related restrictions may contain more than 1 value.
2018-07-11 18:06:21 +02:00
Piotr Sarna
c1ad28f28e cql3: add missing inline declarations to restrictions
In order to prevent future compilation errors, externally defined
class methods from single column primary key restrictions are explicitly
marked inline.
2018-07-11 18:06:21 +02:00
Piotr Sarna
02811d8996 cql3: make index detection more specific
Conditions that detect if restrictions need an indexed query weren't
specific enough to work properly with mixed index-filtering queries,
because they would too eagerly assume that partition/clustering key
restrictions have a backing index.
2018-07-11 18:06:21 +02:00
Piotr Sarna
aadbfc6b84 cql3: throw instead of log for collection filtering
Original series that introduced filtering logged a warning
when collection restrictions appeared. Instead, an exception
should be thrown until collection restrictions are supported
for ALLOW FILTERING clauses.

Message-Id: <ddaf342d4d6766fadb756f66e5afa0b99ce054f8.1531220558.git.sarna@scylladb.com>
2018-07-10 14:44:29 +03:00