"
Fixes#4256
This miniseries fixes a problem with inserting NULL values through
INSERT JSON interface.
Tests: unit (dev)
"
* 'fix_insert_json_with_null' of https://github.com/psarna/scylla:
tests: add test for INSERT JSON with null values
cql3: add missing value erasing to json parser
(cherry picked from commit 5520fc37ba)
"
This series adds DEFAULT UNSET and DEFAULT NULL keyword support
to INSERT JSON statement, as stated in #3909.
Tests: unit (release)
"
* 'add_json_default_unset_2' of https://github.com/psarna/scylla:
tests: add DEFAULT UNSET case to JSON cql tests
tests: split JSON part of cql query test
cql3: add DEFAULT UNSET to INSERT JSON
(cherry picked from commit 447f953a2c)
"
Before this series the limit was applied per page instead
of globally, which might have resulted in returning too many
rows.
To fix that:
1. restrictions filter now has a 'remaining' parameter
in order to stop accepting rows after enough of them
have already been accepted
2. pager passes its row limit to restrictions filter,
so no more rows than necessary will be served to the client
3. results no longer need to be trimmed on select_statement
level
Tests: unit (release)
"
Fixes#4100
* 'fix_filtering_limit_with_paging_3' of https://github.com/psarna/scylla:
tests: add filtering+limit+paging test case
tests: allow null paging state in filtering tests
cql3: fix filtering with LIMIT with regard to paging
(cherry picked from commit 7505815013)
This series adds a generic test for schema changes that generates
various schema and data before and after an ALTER TABLE operation. It is
then used to check correctness of mutation::upgrade() and sstable
readers and lead to the discovery of #3924 and #3925.
Fixes#3925.
* https://github.com/pdziepak/scylla.git schema-change-test/v3.1
schema_builder: make member function names less confusing
converting_mutation_partition_applier: fix collection type changes
converting_mutation_partition_applier: do not emit empty collections
sstable: use format() instead of sprint()
tests/random-utils: make functions and variables inline
tests: add models for schemas and data
tests: generate schema changes
tests/mutation: add test for schema changes
tests/sstable: add test for schema changes
(cherry picked from commit 564b328b2e)
Cassandra supports a "CREATE CUSTOM INDEX" to create a secondary index
with a custom implementation. The only custom implementation that Cassandra
supports is SASI. But Scylla doesn't support this, or any other custom
index implementation. If a CREATE CUSTOM INDEX statement is used, we
shouldn't silently ignore the "CUSTOM" tag, we should generate an error.
This patch also includes a regression test that "CREATE CUSTOM INDEX"
statements with valid syntax fail (before this patch, they succeeded).
Fixes#3977
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20181211224545.18349-2-nyh@scylladb.com>
(cherry picked from commit a0379209e6)
"
This series adds proper handling of filtering queries with LIMIT.
Previously the limit was erroneously applied before filtering,
which leads to truncated results.
To avoid that, paged filtering queries now use an enhanced pager,
which remembers how many rows dropped and uses that information
to fetch for more pages if the limit is not yet reached.
For unpaged filtering queries, paging is done internally as in case
of aggregations to avoid returning keeping huge results in memory.
Also, previously, all limited queries used the page size counted
from max(page size, limit). It's not good for filtering,
because with LIMIT 1 we would then query for rows one-by-one.
To avoid that, filtered queries ask for the whole page and the results
are truncated if need be afterwards.
Tests: unit (release)
"
* 'fix_filtering_with_limit_2' of https://github.com/psarna/scylla:
tests: add filtering with LIMIT test
tests: split filtering tests from cql_query_test
cql3: add proper handling of filtering with LIMIT
service/pager: use dropped_rows to adjust how many rows to read
service/pager: virtualize max_rows_to_fetch function
cql3: add counting dropped rows in filtering pager
(cherry picked from commit 1afda28cf3)
After this patch, the Materialized Views and Secondary Index features
are considered generally-available and no longer require passing an
explicit "--experimental=on" flag to Scylla.
The "--experimental=on" flag and the db::config::check_experimental()
function remain unused, as we graduated the only two features which used
this flag. However, we leave the support for experimental features in
the code, to make it easier to add new experimental features in the future.
Another reason to leave the command-line parameter behind is so existing
scripts that still use it will not break.
Fixes#3917
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20181115144456.25518-1-nyh@scylladb.com>
(cherry picked from commit 78ed7d6d0c)
"
This patchset fixes#3803. When a select statement with filtering
is executed and the column that is needed for the filtering is not
present in the select clause, rows that should have been filtered out
according to this column will still be present in the result set.
Tests:
1. The testcase from the issue.
2. Unit tests (release) including the
newly added test from this patchset.
"
* 'issues/3803/v10' of https://github.com/eliransin/scylla:
unit test: add test for filtering queries without the filtered column
cql3 unit test: add assertion for the number of serialized columns
cql3: ensure retrieval of columns for filtering
cql3: refactor find_idx to be part of statement restrictions object
cql3: add prefix size common functionality to all clustering restrictions
cql3: rename selection metadata manipulation functions
(cherry picked from commit 3fe92663d4)
Commit 1d34ef38a8 "cql3: make pagers use
time_point instead of duration" has unintentionally altered the timeout
semantics for aggregate queries. Such requests fetch multiple pages before
sending a response to the client. Originally, each of those fetches had
a timeout-duration to finish, after the problematic commit the whole
request needs to complete in a single timeout-duration. This,
unsurprisingly, makes some queries that were successful before fail with
a timeout. This patch restores the original behaviour.
Fixes#3877.
Message-Id: <20181022125318.4384-1-pdziepak@scylladb.com>
(cherry picked from commit c94d2b6aa6)
A materialized views can provide a filter so as to pick up only a subset
of the rows from the base table. Usually, the filter operates on columns
from the base table's primary key. If we use a filter on regular (non-key)
columns, things get hairy, and as issue #3430 showed, wrong: merely updating
this column in the base table may require us to delete, or resurrect, the
view row. But normally we need to do the above when the "new view key column"
was updated, when there is one. We use shadowable tombstones with one
timestamp to do this, so it cannot take into account the two timestamp from
those two columns (the filtered column and the new key column).
So in the current code, filtering by a non-key column does not work correctly.
In this patch we provide two test cases (one involving TTLs, and one involves
only normal updates), which demonstrate vividly that it does *not* work
correctly. With normal updates, trying to resurect a view row that has
previously disappeared, fails. With TTLs, things are even worse, and the view
row fails to disappear when the filtered column is TTLed.
In Cassandra, the same thing doesn't work correctly as well (see
CASSANDRA-13798 and CASSANDRA-13832) so they decided to refuse creating
a materialized view filtering a non-key column. In this patch we also
do this - fail the creation of such an unsupported view. For this reason,
the two tests mentioned above are commented out in a "#if", with, instead,
a trivial test verifying a failure to create such a view.
Note that as explained above, when the filtered column and new view key
column are *different* we have a problem. But when they are the *same* - namely
we filter by a non-key base column which actually *is* a key in the view -
we are actually fine. This patch includes additional test cases verifying
that this case is really fine and provides correct results. Accordingly,
this case is *not* forbidden in the view creation code.
Fixes#3430.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20181008185633.24616-1-nyh@scylladb.com>
(cherry picked from commit b8668dc0f8)
Right now, with specialized execute() that takes primary keys
for indexed_table_select_statement, the original execute()
method implemented in select_statement is not used anywhere,
so it's removed.
Base queries that are part of index queries are allowed to be short,
which can result in wasted work - e.g. when we query all replicas
in parallel, but have to discard most of the result, since the first
one (in token order) resulted in a short read.
Thus, we start by quering 1 range, check if the read is short,
and if not, continue by querying 2x more ranges than before.
Refs #2960
Searching for index view schema for an indexed statement can be done
once in prepare stage, so it's moved to indexed_table_select_statement
prepare method.
Instead of returning a coordinator result and making a caller parse it
later, read_posting_list now extracts rows by itself.
This change is later needed when querying is replaced with a pager.
A standard way for passing a timeout parameter is specifying
a time_point, while pagers used to take a duration in order
to compute time points on the fly. This patch adds a timeout
parameter, which is a time_point, to fetch_page().
There is a bad interaction between may_need_paging() and query result
size limiter. The former is trying to avoid the complexity of paged
queries when the number of returned rows is going to be smaller than the
page size. The latter uses the fact that paged queries need not return
all requested rows to limit the size of a query results. Since
may_need_paging() may turn a paged query into non-paged one as a side
effect it disables the oversized result protection.
This patch limits the cases when may_need_paging() disables paging to
the situations when we know for sure that query result size limiter
won't be needed, i.e.: the result is not going to contain more than one
row. If the client knows for sure that the paging is not needed and
the performance impact is worthwhile it can disable paging on its side.
Otherwise, let's default to the safer behaviour.
Fixes#3620.
Message-Id: <20180925134431.24329-1-pdziepak@scylladb.com>
"
When a view's partition key contains only columns from the base's partition
key (and not an additional one), the liveness - existance or disappearance -
of a view-table row is tied to the liveness of the base table row. And
that, in turn, depends not only on selected columns (base-table columns
SELECTed to also appear in the view) but also on unselected columns.
This means that we may need to keep a view row alive even without data,
just because some unselected column is alive in the base table. Before this
patch set we tried to build a single "row marker" in the view column which
tried to summarize the liveness information in all unselected columns.
But this proved unworkable, as explained in issue #3362 and as will be
demonstrated in unit tests at the end of this series.
Because we can't replace several unselected cells by one row marker, what
we do in this series is to add for each for the unselected cells a "virtual
cell" which contains the cell's liveness information (timestamp, deletion,
ttl) but not its value. For collections, we can't represent the entire
collection by one virtual cell, and rather need a collection of virtual
cells.
Fixes#3362
"
* 'virtual-cols-v3' of https://github.com/nyh/scylla:
Materialized Views: test that virtual columns are not visible
Materialized Views: unit test reproducing fixed issue #3362
Materialized Views: no need for elaborate row marker calculations
Materialized Views: add unselected columns as virtual columns
Materialized Views: fill virtual columns
Do not allow selecting a virtual column
schema: persist "view virtual" columns to a separate system table
schema: add "view virtual" flag to schema's column_definition
Add "empty" type name to CQL parser, but only for internal parsing
LIMIT should restrict the output result and not the query whose result
set is aggregated. when using aggregate the output is guarantied to
be only one row long. since LIMIT accepts only none negative numbers,
it has no effect and can be ignored.
Fixes#2028
Tests: The issue described Testcase , UnitTests.
Signed-off-by: Eliran Sinvani <eliransin@scylladb.com>
Message-Id: <6c235376c81f052020e2ed23d0a3d071b36d4415.1534416997.git.eliransin@scylladb.com>
When a view's partition key contains only columns from the base's partition
key (and not an additional one), the liveness (existance or disappearance)
of a view-table row is tied to the liveness of the base table row - and
that depends not only on selected columns (base-table columns SELECTed to
also appear in the view) but also on unselected columns.
This means that we may need to keep a view row alive even without data,
just because some unselected column is alive in the base table. Before this
patch we tried to build a single "row marker" in the view column which
summarizes the liveness information in all unselected columns, but this
proved unworkable, as explained in issue #3362 and as will be demonstrated
in unit tests in a later patch.
Because we can't replace several unselected cells by one row marker, what
we do in this patch is to add for each for the unselected cell a "virtual
cell" which contains the cell's liveness information (timestamp, deletion,
ttl) but not its value. For collections, we can't represent the entire
collection by one virtual cell, and rather need a collection of virtual
cells.
This patch just adds the virtual columns to the view schema. Code in
the previous patch, when it notices the virtual columns in the view's
schema, added the appropriate content into these columns.
We may need to add virtual columns to a view when first created, but also
when an unselected column is added to the base table with "ALTER TABLE",
so both are supported in this patch.
Fixes#3362.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
"
This miniseries fixes ALLOW FILTERING support for prepared statements
by passing correct query options to the filter instead of empty ones.
"
* 'pass_query_options_to_restrictions_filter' of https://github.com/psarna/scylla:
tests: add testing prepared statements with ALLOW FILTERING
cql3: pass query options to restrictions filter
"
This series addresses SELECT/INSERT JSON support issues, namely
handling null values properly and parsing decimals from strings.
It also comes with updated cql tests.
Tests: unit (release)
"
* 'json_fixes_3' of https://github.com/psarna/scylla:
cql3: remove superfluous null conversions in to_json_string
tests: update JSON cql tests
cql3: enable parsing decimal JSON values from string
cql3: add missing return for dead cells
cql3: simplify parsing optional JSON values
cql3: add handling null value in to_json
cql3: provide to_json_string for optional bytes argument
Query options may contain bound values needed for checking filtering
restrictions. Previously, empty query_options{} were used, which
caused prepared statements to fail.
Fixes#3677
If an indexed query has partition+clustering key restrictions as well
and at least some of these restrictions create a prefix, this prefix
is used in the index query to narrow down the number of rows read.
Refs #3611
"
This series changes the native CQL3 protocl layer so that it works with
fragmented buffers instead of a single temporary_buffer per request.
The main part is fragmented_temporary_buffer which represents a
fragmented buffer consisting of multiple temporary_buffers. It provides
helpers for reading fragmented buffer from an input_stream, interpreting
the data in the fragmented buffer as well as view that satisfy
FragmentRange concept.
There are still situations where a fragmented buffer is linearised. That
includes decompressing client requests (this uses reusable buffers in a
similar way to the code that sends compressed responses), CQL statement
restrictions and values that are hard-coded in prepared statements
(hopefully, the values in those cases will be small), value validation
in some cases (blobs are not validated, irrelevant for many fixed-size
small types, but may be a problem for large text cells) as well as
operations on collections.
Tests: unit(release), dtests(cql_prepared_test.py, cql_tests.py, cql_additional_tests.py)
"
* tag 'fragmented-cql3-receive/v1' of https://github.com/pdziepak/scylla: (23 commits)
types: bytes_view: override fragmented validate()
cql3: value_view: switch to fragmented_temporary_buffer::view
types: add validate that accepts fragmented_temporary_buffer::view
cql3 query_options: add linearize()
cql3: query_options: use bytes_ostream for temporaries
cql3: operation: make make_cell accept fragmented_temporary_buffer::view
atomic_cell: accept fragmented_temporary_buffer::view values
cql3: avoid ambiguity in a call to update_parameters::make_cell()
transport: switch to fragmented_temporary_buffer
transport: extract compression buffers from response class
tests/reusable_buffer: test fragmented_temporary_buffer support
utils: reusable_buffer: support fragmented_temporary_buffer
tests: add test for fragmented_temporary_buffer
util fragment_range: add general linearisation functions
utils: add fragmented_temporary_buffer
tests: add basic test for transport requests and responses
tests/random-utils: print seed
tests/random-utils: generate sstrings
cql3: add value_view printer and equality comparison
transport: move response outside of cql_server class
...
If both index and partition key is used in a query, it should not
require filtering, because indexed query can be narrowed down
with partition key information. This commit appends partition key
restrictions to index query.
ALLOW FILTERING support caused index-related restrictions to possibly
have more values. In order to remain correct, only those restrictions
which match the indexed columns should be used.
Require a timeout parameter for storage_proxy::mutate_begin() and
all its callers (all the way to thrift and cql modification_statement
and batch_statement).
This should fix spurious debug-mode test failures, where overcommit
and general debug slowness result in the default timeouts being
exceeded. Since the tests use infinite timeouts, they should not
time out any more.
Tests: unit (release), with an extra patch that aborts
when a non-infinite timeout is detected.
Message-Id: <20180707204424.17116-1-avi@scylladb.com>