705 Commits

Author SHA1 Message Date
Rafael Ávila de Espíndola
de6d6c46a1 types: Remove collection_type_impl::kind
All uses have been switched to abstract_type::kind.

Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
2019-08-14 10:02:00 -07:00
Tomasz Grabiec
64ff1b6405 cql: alter type: Format field name as text instead of hex
Fixes #4841

Message-Id: <1565702635-26214-1-git-send-email-tgrabiec@scylladb.com>
2019-08-13 16:25:48 +03:00
Gleb Natapov
6a4207f202 Pass service permit to storage_proxy
Current cql transport code acquires a permit before processing a query and
releases it when the query gets a reply, but some queries leave work behind.
If that work is allowed to accumulate without any limit, a server may
eventually run out of memory. To prevent that, the permit system should
account for the background work as well. The patch is a first step in
this direction. It passes a permit down to storage proxy, where it will
later be held by background work.
2019-08-12 10:20:43 +03:00
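A minimal Python sketch of the direction the commit above describes: a permit that stays accounted for until both the request and any background work have let go of it. The names `Limiter` and `ServicePermit` are hypothetical and do not mirror ScyllaDB's C++ classes.

```python
class Limiter:
    """Hypothetical counter of permits currently accounted for."""

    def __init__(self):
        self.in_use = 0


class ServicePermit:
    """Hypothetical permit: its unit is released only when the last holder lets go."""

    def __init__(self, limiter):
        self._limiter = limiter
        self._holders = 0

    def hold(self):
        # The first holder charges the limiter; later holders only keep it alive.
        if self._holders == 0:
            self._limiter.in_use += 1
        self._holders += 1
        return self

    def release(self):
        self._holders -= 1
        if self._holders == 0:
            self._limiter.in_use -= 1


limiter = Limiter()
permit = ServicePermit(limiter).hold()  # acquired by the transport layer
permit.hold()                           # also held by background work
permit.release()                        # reply sent to the client
in_use_after_reply = limiter.in_use     # background work is still accounted for
permit.release()                        # background work finished
in_use_after_background = limiter.in_use
```

With this shape, memory used by leftover work stays visible to the limiter until that work completes.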
Nadav Har'El
759752947b drop_index_statement: fix column_family()
All statement objects which derive from cf_statement, including
drop_index_statement, have a column_family() returning the name of the
column family involved in this statement. For most statements this is
known at the time of construction, because it is part of the statement,
but for "DROP INDEX", the user doesn't specify the table's name - just
the index name. So we need to override column_family() to find the
table name.

The existing implementation assert()ed that we can always find such
a table, but this is not true - for example, in a DROP INDEX with
"IF EXISTS", it is perfectly fine for no such table to exist. In this
case we don't want a crash, and not even an exception - it's fine that
we just return an empty table name.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20190716180104.15985-1-nyh@scylladb.com>
2019-07-17 09:44:47 +03:00
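The behaviour described in the commit above can be sketched as a lookup that returns an empty name instead of asserting. This is an illustrative Python analogue; `column_family_for_index` and the `schemas` mapping are assumptions, not the real API.

```python
def column_family_for_index(schemas, index_name):
    """Return the table that owns index_name, or "" if no table does.

    schemas is a hypothetical mapping of table name -> set of index names.
    """
    for table, indexes in schemas.items():
        if index_name in indexes:
            return table
    # DROP INDEX IF EXISTS on a missing index: no crash, no exception.
    return ""


schemas = {"cf": {"cf_a_idx"}}
found = column_family_for_index(schemas, "cf_a_idx")
missing = column_family_for_index(schemas, "no_such_idx")
```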
Kamil Braun
c0915c40eb Warn user about using SimpleStrategy with Multi DC deployment
If the user creates a keyspace with the 'SimpleStrategy' replication class
in a multi-datacenter environment, they will receive a warning in the CQL shell
and in the server logs.
Resolves #4481.

Signed-off-by: Kamil Braun <kbraun@scylladb.com>
2019-07-05 09:25:03 +02:00
Avi Kivity
fc629bb14f Merge "cql3: lift infinite bound check" from Benny & Piotr
"
If the database supports infinite bound range deletions,
CQL layer will no longer throw an error indicating that both ranges
need to be specified.

Fixes #432

Update test_range_deletion_scenarios unit test accordingly.
"

* 'cql3-lift-infinite-bound-check' of https://github.com/bhalevy/scylla:
  cql3: lift infinite bound check if it's supported
  service: enable infinite bound range deletions with mc
  database: add flag for infinite bound range deletions
2019-06-25 19:05:29 +03:00
Piotr Sarna
add40d4e59 cql3: lift infinite bound check if it's supported
If the database supports infinite bound range deletions,
CQL layer will no longer throw an error indicating that both ranges
need to be specified.

[bhalevy] Update test_range_deletion_scenarios unit test accordingly.

Fixes #432

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2019-06-24 15:58:34 +03:00
Piotr Sarna
fe18638de3 cql3: make DEFAULT_COUNT_PAGE_SIZE constant public
The constant will be later used in test scenarios.
2019-06-24 13:21:37 +02:00
Piotr Sarna
bb08af7e68 cql3: add proper aggregation to paged indexing
Aggregated and paged filtering needs to aggregate the results
from all pages in order to avoid returning partial per-page
results. It's a little bit more complicated than regular aggregation,
because each paging state needs to be translated between the base
table and the underlying view. The routine keeps fetching pages
from the underlying view, which are then used to fetch base rows,
which go straight to the result set builder.

Fixes #4540
2019-06-24 13:21:32 +02:00
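A rough Python sketch of why aggregation has to span pages, as the commit above explains: the aggregate is accumulated across every page rather than emitted per page. `make_pager` and `count_over_pages` are hypothetical helpers; the real code also translates paging state between the view and the base table, which is omitted here.

```python
def make_pager(rows, page_size):
    """Build a hypothetical fetch_page(state) -> (page_rows, next_state) function."""
    def fetch_page(state):
        start = state or 0
        page = rows[start:start + page_size]
        nxt = start + page_size
        # next_state is None once the data is exhausted.
        return page, (nxt if nxt < len(rows) else None)
    return fetch_page


def count_over_pages(fetch_page):
    """Keep fetching pages and fold them into a single COUNT result."""
    total = 0
    state = None
    while True:
        rows, state = fetch_page(state)
        total += len(rows)
        if state is None:
            return total


total = count_over_pages(make_pager([1, 2, 3, 4, 5], page_size=2))
```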
Piotr Sarna
7a8b243ce4 cql3: split execute_base_query implementation
In order to handle aggregation queries correctly, the function that
returns base query results is split into two, so it's possible to
access raw query results, before they're converted into end-user
CQL message.
2019-06-24 12:57:03 +02:00
Benny Halevy
fae4ca756c cql3: select_statement: provide default initializer for parameters::_bypass_cache
Fixes #4503

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Message-Id: <20190521143300.22753-1-bhalevy@scylladb.com>
2019-05-21 20:06:40 +03:00
Avi Kivity
a86fdeb02b Merge "Implement GROUP BY" from Dejan
"
Cassandra has supported GROUP BY in SELECT statements since 2016
(v3.10), while ScyllaDB currently treats it as a syntax error.  To
achieve parity with Cassandra in this important bit of functionality,
this patch adds full support for GROUP BY, from parsing to validation
to implementation to testing.
"

* 'groupby-implPP' of https://github.com/dekimir/scylla:
  Implement grouping in selection processing
  Propagate GROUP BY indices to result_set_builder
  Process GROUP BY columns into select_statement
  Parse GROUP BY clause, store column identifiers
2019-05-08 18:35:12 +03:00
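As an illustration of the grouping step in the series above, the following Python sketch counts rows per group given GROUP BY column indices. It assumes rows arrive already ordered so that rows of one group are adjacent; `count_per_group` is a hypothetical name and not part of result_set_builder.

```python
from itertools import groupby


def count_per_group(rows, group_by_indices):
    """Return (group_key, row_count) pairs for consecutive rows sharing a key."""
    def key(row):
        return tuple(row[i] for i in group_by_indices)
    return [(k, sum(1 for _ in grp)) for k, grp in groupby(rows, key=key)]


rows = [("a", 1, 10), ("a", 1, 11), ("a", 2, 12), ("b", 1, 13)]
groups = count_per_group(rows, group_by_indices=[0, 1])
```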
Dejan Mircevski
c3929aee3a Propagate GROUP BY indices to result_set_builder
Ensure that the indices recorded in select_statement are passed to
result_set_builder when one is created for processing the cell values.

Signed-off-by: Dejan Mircevski <dejan@scylladb.com>
2019-05-08 10:10:10 -04:00
Dejan Mircevski
274a77f45e Process GROUP BY columns into select_statement
Validate raw GROUP BY identifiers and translate them into
a select_statement member.

Signed-off-by: Dejan Mircevski <dejan@scylladb.com>
2019-05-08 10:10:10 -04:00
Dejan Mircevski
e1fb414805 Parse GROUP BY clause, store column identifiers
Extend the grammar file with GROUP BY, collect the column identifiers,
and store them in raw::select_statement.

Signed-off-by: Dejan Mircevski <dejan@scylladb.com>
2019-05-08 10:09:22 -04:00
Nadav Har'El
a45b6e41a0 materialized views and secondary index: sometimes allow dropping base columns
Until this patch, dropping columns from a table was completely forbidden
if this table has any materialized views or secondary indexes. However,
this is excessively harsh, and not compatible with Cassandra which does
allow dropping columns from a base table which has a secondary index on
*other* columns. This incompatibility was raised in the following
Stackoverflow question:
https://stackoverflow.com/questions/55757273/error-while-dropping-column-from-a-table-with-secondary-index-scylladb/55776490

In this patch, we allow dropping a base table column if none of its
materialized views *needs* this column. Columns selected by a view
(as regular or key columns) are needed by it, of course, but when
virtual columns are used (namely, there is a view with same key columns
as the base), *all* columns are needed by the view, so unfortunately none
of the columns may be dropped.

After this patch, when a base-table column cannot be dropped because one
of the materialized views needs it, the error message will look like:

   exceptions::invalid_request_exception: Cannot drop column a from base
   table ks.cf: a materialized view cf_a_idx_index needs this column.

This patch also includes extensive testing for the cases where dropping
columns is now allowed, and not allowed. The secondary-index tests are
especially interesting, because they demonstrate that now usually (when
a non-key column is being indexed) dropping columns will be allowed,
which is what originally bothered the Stackoverflow user.

Fixes #4448.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20190429214805.2972-1-nyh@scylladb.com>
2019-04-30 12:13:10 +01:00
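The rule in the commit above can be sketched in Python as a check over the columns each view needs. The `views` mapping and `check_can_drop` are assumptions made for illustration; the error text follows the message quoted in the commit.

```python
def check_can_drop(table, column, views):
    """Raise if any view needs the column.

    views is a hypothetical mapping of view name -> set of needed base columns.
    """
    for view, needed in views.items():
        if column in needed:
            raise ValueError(
                f"Cannot drop column {column} from base table {table}: "
                f"a materialized view {view} needs this column."
            )


views = {"cf_a_idx_index": {"a"}}
check_can_drop("ks.cf", "b", views)  # allowed: no view needs column b
try:
    check_can_drop("ks.cf", "a", views)
    error = None
except ValueError as exc:
    error = str(exc)
```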
Paweł Dziepak
85409c1a16 Merge "Validate elements of collections" from Piotr
"
Previously we weren't validating elements of collections, so it
was possible to add a non-UTF-8 string to a column with type
list<text>.

Tests: unit(release)

Fixes #4009
"

* 'haaawk/4009/v5' of github.com:scylladb/seastar-dev:
  types: Test correct map validation
  types: Test correct in clause validation
  types: Test correct tuple validation
  types: Test correct set validation
  types: Test correct list validation
  types: Add test_tuple_elements_validation
  types: Add test_in_clause_validation
  types: Add test_map_elements_validation
  types: Add test_set_elements_validation
  types: Add test_list_elements_validation
  types: Validate input when tuples
  types: Validate input when parsing a set
  types: Validate input when parsing a map
  types: Validate input when parsing a list
  types: Implement validation for tuple
  types: Implement validation for set
  types: Implement validation for map
  types: Implement validation for list
  types: Add cql_serialization_format parameter to validate
2019-04-18 19:07:14 +03:00
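A small Python sketch of element-level validation for a list<text> value, in the spirit of the series above. `is_valid_text_list` is a hypothetical helper; the real validation works on serialized CQL values and takes a cql_serialization_format.

```python
def is_valid_text_list(elements):
    """Return True only if every element is well-formed UTF-8."""
    try:
        for raw in elements:
            raw.decode("utf-8")  # raises UnicodeDecodeError on invalid bytes
    except UnicodeDecodeError:
        return False
    return True


ok = is_valid_text_list([b"abc", "zażółć".encode("utf-8")])
bad = is_valid_text_list([b"abc", b"\xff"])
```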
Glauber Costa
c01ed239a3 fix typo in create table statement error message
specifed -> specified

Fixes #4434

Signed-off-by: Glauber Costa <glauber@scylladb.com>
Message-Id: <20190415125206.2993-1-glauber@scylladb.com>
2019-04-15 16:51:13 +03:00
Piotr Jastrzebski
f5f6367674 types: Add cql_serialization_format parameter to validate
Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2019-04-09 16:58:22 +02:00
Avi Kivity
a7520c0ba9 Merge "Turn cql3_type into a trivial wrapper over data_type" from Rafael
"
Both cql3_type and abstract_type are normally used inside
shared_ptr. This creates a problem when an abstract_type needs to refer
to a cql3_type as that creates a cycle.

To avoid warnings from asan, we were using a std::unordered_map to
store one of the edges of the cycle. This avoids the warning, but
wastes even more memory.

Even before this series cql3_type was a fairly light weight
structure. This patch pushes in that direction and now cql3_type is a
struct with a single member variable, a data_type.

This avoids the reference cycle and is easier to understand IMHO.

The one corner case is varchar. In the old system cql3_type::varchar
and cql3_type::text don't compare equal, but they both map to the same
data_type.

In the new system they would compare equal, so we avoid the confusion
by just removing the cql3_type::varchar variable.

Tests: unit (dev)
"

* 'espindola/merge-cq3-type-and-type-v3' of https://github.com/espindola/scylla:
  Turn cql3_type into a trivial wrapper over data_type
  Delete cql3_type::varchar
  Simplify db::cql_type_parser::parse
  Add a test for the varchar column representation
2019-03-25 15:03:16 +02:00
Avi Kivity
a9cf07369f Merge "Add local indexes" from Piotr
"
This series adds support for local indexing, i.e. when the index table
resides on the same partition as base data.
It addresses the performance issue of having an indexed query
that also specifies a partition key - index will be queried
locally.
"

* 'add_local_indexing_11' of https://github.com/psarna/scylla: (30 commits)
  tests: add cases for local index prefix optimization
  tests: add create/drop local index test case
  tests: add non-standard names cases to local index tests
  tests: add multi pk case for local index tests
  tests: add test for malformed local index definitions
  tests: add local index paging test
  tests: add local indexing test
  cql3: add CREATE INDEX syntax for local indexes
  cql3: use serialization function to create index target string
  index: add serialization function for index targets
  index: use proper local index target when adding index
  index: add parsing target column name from local index targets
  db: add checking for local index in schema tables
  index: add checking if serialized target implies local index
  index: enable parsing multi-key targets
  index: move target parser code to .cc file
  json: add non-throwing overload for to_json_value
  cql3: add checking for local indexes in has_supporting_index()
  cql3: move finding index restrictions to prepare stage
  cql3: add picking an index by score
  ...
2019-03-21 12:46:00 -03:00
Nadav Har'El
561c640ed1 materialized views: allow view without clustering columns
When a materialized view was created, the verification code artificially
forbade creating a view without a clustering key column. However, there
is no real reason to forbid this. In the trivial case, the original base
table might not have had a clustering key, and the view might want to use
the exact same key. In a more complex case, a view may want to have all the
primary key columns as *partition* key columns, and that should be fine.

The patch also includes a regression test, which failed before this patch,
and succeeds with it (we test that we can create materialized views in both
aforementioned scenarios, and these materialized views work as expected).

Duarte raised the opinion that the "trivial" case of a view table with
a key identical to that of the base should be disallowed. However, this
should be done, if at all (I think it shouldn't), in a follow-up patch,
which will implement the non-triviality requirement consistently (e.g.,
require view primary key to be different from base's, regardless of
the existence or non-existence of clustering columns).

Fixes #4340.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>

Message-Id: <20190320122925.10108-1-nyh@scylladb.com>
2019-03-21 12:45:52 -03:00
Rafael Ávila de Espíndola
53ab298957 Turn cql3_type into a trivial wrapper over data_type
Both cql3_type and abstract_type are normally used inside
shared_ptr. This creates a problem when an abstract_type needs to refer
to a cql3_type as that creates a cycle.

To avoid warnings from asan, we were using a std::unordered_map to
store one of the edges of the cycle. This avoids the warning, but
wastes even more memory.

Even before this patch cql3_type was a fairly light weight
structure. This patch pushes in that direction and now cql3_type is a
struct with a single member variable, a data_type.

This avoids the reference cycle and is easier to understand IMHO.

Tests: unit (dev)

Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
2019-03-20 14:10:28 -07:00
Piotr Sarna
1fd61c5ac4 cql3: use serialization function to create index target string
Instead of building the string manually, a serialization function
is called to create a string out of index target list.
2019-03-20 10:51:27 +01:00
Piotr Sarna
87f6e37caa cql3: move finding index restrictions to prepare stage
Index restrictions that match a given index were recomputed
during execution stage, which is redundant and prone to errors.
Now, used index restrictions are cached in a prepare statement.
2019-03-20 10:20:22 +01:00
Piotr Sarna
2f173f7ed8 cql3: add handling paging state for local indexes
When computing paging state for local indexes, the partition
and clustering keys are different than with global ones:
 - partition key is the same as base's
 - clustering key starts with the indexed column
2019-03-20 10:20:02 +01:00
Piotr Sarna
75dd964751 cql3: add handling partition slices for local indexes
For local indexes, a slice will consist of the indexed column
followed by base clustering columns.
2019-03-20 10:20:01 +01:00
Piotr Sarna
b12162c8f5 cql3: add returning correct partition ranges for local indexes
Local indexes always share the partition range with their base.
2019-03-20 09:51:46 +01:00
Piotr Sarna
da8e8f18b3 cql3: make read_posting_list a member function
It already accepts several arguments that can be extracted from 'this',
and more will be added in the future.
New parameters include lambdas prepared during prepare stage
that define how to extract partition/clustering key ranges depending
on which index is used, so keeping it a static function will result
in unbounded number of parameters with complex types, which will
in turn make the function header almost illegible for a reader.
Hence, read_posting_list becomes a member function with easy access
to any data prepared during prepare stage.
2019-03-20 09:51:46 +01:00
Piotr Sarna
85017c5ad4 cql3: look for indexed column definition only once
There's no need to look for the column definition inside a loop.
2019-03-20 09:51:46 +01:00
Piotr Sarna
8002471c81 cql3: allow index target to keep multiple columns
Instead of having just one column definition, index target is now
a variant of either single column definition or a vector of them.
The vector is expected to be used when part of a target definition
is enclosed in parentheses:
 $ CREATE INDEX ON t((p),v);
or
 $ CREATE INDEX ON t((p1,p2), v);
etc.

This feature will allow providing (possibly composite) base partition key
to CREATE INDEX statement, which will result in creating a local index.
2019-03-20 09:51:46 +01:00
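One way to picture the target representation from the commit above: each target is either a single column name or a list of names for a parenthesized group. The Python sketch below is an assumption about how such targets could be classified; `is_local_index_target` is a hypothetical name.

```python
def is_local_index_target(targets):
    """Treat a leading parenthesized group (a list) as the base partition key."""
    return bool(targets) and isinstance(targets[0], list)


# CREATE INDEX ON t((p1,p2), v);
local = is_local_index_target([["p1", "p2"], "v"])
# CREATE INDEX ON t(v);
global_ = is_local_index_target(["v"])
```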
Piotr Sarna
90d47ca183 schema: add is_local_index cached value to index metadata
In order to quickly distinguish global indexes from local ones,
a cached boolean value is introduced.
2019-03-20 09:51:46 +01:00
Tomasz Grabiec
c584f48c32 Merge "transport: sort bound ranges in read reques in order to conform to cql definitions" from Eliran
According to the cql definitions, if no ORDER BY clause is present,
records should be returned ordered by the clustering keys. Since the
backend returns the ranges according to their order of appearance
in the request, the bounds should be sorted before sending it to the
backend. This kind of sorting is needed in queries that generate more
than one bound to be read. Examples of such queries are:
1. a SELECT query with an IN clause.
2. a SELECT query on a mixed order tuple of columns (see #2050).
The assumption this commit makes is the correctness of the bounds
list, that is, the bounds are non-overlapping. If this weren't true, multiple
occurrences of the same record could be returned for certain queries.

Tests:
1. Unit tests release
2. All dtests that require #2050 and #2029

Fixes #2029
2019-03-05 21:07:15 +01:00
Piotr Sarna
e9bc2a7912 cql3: fix error message for lack of primary keys in JSON
When any primary key part is not present in INSERT JSON statement,
proper error message will be presented to the client.

Tests: unit (dev) 
Message-Id: <3aa99703523c45056396a0b6d97091da30206dab.1551797502.git.sarna@scylladb.com>
2019-03-05 16:54:46 +02:00
Eliran Sinvani
7df0c873aa transport: sort bound ranges in read reques in order to conform to cql definitions
According to the cql definitions, if no ORDER BY clause is present,
records should be returned ordered by the clustering keys. Since the
backend returns the ranges according to their order of appearance
in the request, the bounds should be sorted before sending it to the
backend. This kind of sorting is needed in queries that generate more
than one bound to be read. Examples of such queries are:
1. a SELECT query with an IN clause.
2. a SELECT query on a mixed order tuple of columns (see #2050).
The assumption this commit makes is the correctness of the bounds
list, that is, the bounds are non-overlapping. If this weren't true, multiple
occurrences of the same record could be returned for certain queries.

Tests:
1. Unit tests release
2. All dtests that require #2050 and #2029

Fixes #2029

Signed-off-by: Eliran Sinvani <eliransin@scylladb.com>
2019-03-05 13:51:17 +02:00
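The sorting step described above can be sketched in Python as follows, assuming each bound is a (start, end) pair and the bounds do not overlap, which is the same assumption the commit states. `sort_bounds` is a hypothetical helper.

```python
def sort_bounds(bounds, reverse=False):
    """Order non-overlapping (start, end) bounds by their start value."""
    return sorted(bounds, key=lambda bound: bound[0], reverse=reverse)


# A query such as "... WHERE ck IN (5, 1, 3)" yields bounds in request order.
request_order = [(5, 5), (1, 1), (3, 3)]
clustering_order = sort_bounds(request_order)
```

Because the bounds do not overlap, sorting by the start value alone is enough to fix their relative order.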
Piotr Sarna
c743617236 cql3: unify max value for row limit and per-partition limit
Limits are stored as uint32_t everywhere, but in some places
int32_t was used, which created inconsistencies when comparing
the value to std::numeric_limits<Type>::max().
In order to solve inconsistencies, the types are unified to uint32_t,
and instead of explicitly calling numeric limit max,
an already existing constant value query::max_rows is utilized.

Fixes #4253

Message-Id: <4234712ff61a0391821acaba63455a34844e489b.1550683120.git.sarna@scylladb.com>
2019-02-21 13:56:02 +02:00
Piotr Sarna
6618191e49 cql3: add missing value erasing to json parser
When inserting a null value through INSERT JSON, the column
was erroneously not removed from the 'not used' list of columns.

Fixes #4256
2019-02-21 11:23:44 +01:00
Duarte Nunes
6e83457b1b Merge 'Add PER PARTITION LIMIT' from Piotr
"
This series introduces PER PARTITION LIMIT to CQL.
Protocol and storage are already capable of applying per-partition limits,
so for nonpaged queries the changes are superficial - a variable is parsed
and passed down.
For paged queries and filtering the situation is a little bit more complicated
due to corner cases: results for one partition can be split over 2 or more pages,
filtering may drop rows, etc. To solve these, another variable is added to paging
state - the number of rows already returned from last served partition.
Note that the "last" partition may be stretched over any number of pages, not just the
last one, which is especially the case when filtering is involved.
As a result, per-partition-limiting queries are not eligible for page generator
optimization, because they may need to have their results locally filtered
for extraneous rows (e.g. when the next page asks for per-partition limit 5,
but we already received 4 rows from the last partition, so need just 1 more
from last partition key, but 5 from all next ones).

Tests: unit (dev)

Fixes #2202
"

* 'add_per_partition_limit_3' of https://github.com/psarna/scylla:
  tests: remove superficial ignore_order from filtering tests
  tests: add filtering with per partition key limit test
  tests: publish extract_paging_state and count_rows_fetched
  tests: fix order of parameters in with_rows_ignore_order
  cql3,grammar: add PER PARTITION LIMIT
  idl,service: add persistent last partition row count
  cql3: prevent page generator usage for per-partition limit
  cql3: add checking for previous partition count to filtering
  pager: add adjusting per-partition row limit
  cql3: obey per partition limit for filtering
  cql3: clean up unneeded limit variables
  cql3: obey per partition limit for select statement
  cql3: add get_per_partition_limit
  cql3: add per_partition_limit to CQL statement
2019-02-18 14:47:11 +00:00
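The corner case from the series description (limit 5, with 4 rows already returned from the last partition) can be sketched in Python. `apply_per_partition_limit` and its parameters are hypothetical; they stand in for the extra counter carried in the paging state.

```python
def apply_per_partition_limit(rows, limit, last_pk=None, last_count=0):
    """Drop rows beyond `limit` per partition.

    rows: (partition_key, value) pairs in partition order.
    last_pk, last_count: partition and row count carried over from the previous page.
    Returns the kept rows plus the state to carry into the next page.
    """
    kept = []
    pk, count = last_pk, last_count
    for row in rows:
        if row[0] != pk:
            pk, count = row[0], 0  # a new partition starts with a fresh count
        if count < limit:
            kept.append(row)
            count += 1
    return kept, pk, count


page = [("a", i) for i in range(3)] + [("b", i) for i in range(6)]
kept, pk, count = apply_per_partition_limit(page, limit=5, last_pk="a", last_count=4)
```

Here only one more row is taken from partition "a", while partition "b" contributes the full five.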
Piotr Sarna
3a2b004f02 cql3: prevent page generator usage for per-partition limit
Paged queries that induce per-partition limits cannot use
page generator optimization, as sometimes the results need
to be filtered for extraneous rows on page breaks.
2019-02-18 11:06:44 +01:00
Piotr Sarna
1dadae212a cql3: add checking for previous partition count to filtering
Filtering now needs to take into account per partition limits as well,
and for that it's essential to be able to compare partition keys
and decide which rows should be dropped - if previous page(s) contained
rows with the same partition key, these need to be taken into
consideration too.
2019-02-18 11:06:43 +01:00
Piotr Sarna
b965c3778f cql3: obey per partition limit for filtering
Filtering queries now take into account the limit of rows
per single partition provided by the user.
2019-02-18 10:29:34 +01:00
Piotr Sarna
b3aa939cde cql3: clean up unneeded limit variables
Some places extracted a `limit` variable to be captured by lambdas,
but they were not used inside them.
2019-02-18 10:29:34 +01:00
Piotr Sarna
cfb6e9c79c cql3: obey per partition limit for select statement
Select statement now takes into account the limit of rows
per single partition provided by the user.
2019-02-18 10:29:34 +01:00
Piotr Sarna
41b466246e cql3: add get_per_partition_limit
2019-02-18 10:29:34 +01:00
Piotr Sarna
93786a9148 cql3: add per_partition_limit to CQL statement
Select statements can now accept per_partition_limit variable.
2019-02-18 10:29:34 +01:00
Avi Kivity
a1567b0997 Merge "replace get_restricted_ranges() function with generator interface" from Gleb
"
get_restricted_ranges() is inefficient since it calculates all
vnodes that cover the requested key ranges in advance, but callers often
use only the first one.  Replace the function with a generator interface
that generates the requested number of vnodes on demand.
"

* 'gleb/query_ranges_to_vnodes_generator' of github.com:scylladb/seastar-dev:
  storage_proxy: limit amount of precaclulated ranges by query_ranges_to_vnodes_generator
  storage_proxy: remove old get_restricted_ranges() interface
  cql3/statements/select_statement: convert index query interface to new query_ranges_to_vnodes_generator interface
  tests: convert storage_proxy test to new query_ranges_to_vnodes_generator interface
  storage_proxy: convert range query path to new query_ranges_to_vnodes_generator interface
  storage_proxy: introduce new query_ranges_to_vnode_generator interface
2019-02-18 10:33:54 +02:00
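The idea of producing vnode ranges on demand can be illustrated with a Python generator. This is a sketch under simplified assumptions (integer tokens, sorted boundaries, a half-open key range); `ranges_to_vnodes` is a hypothetical name, not the storage_proxy interface.

```python
from itertools import islice


def ranges_to_vnodes(key_range, token_boundaries):
    """Lazily split (start, end) at each sorted boundary that falls inside it."""
    start, end = key_range
    for boundary in token_boundaries:
        if boundary <= start:
            continue
        if boundary >= end:
            break
        yield (start, boundary)
        start = boundary
    yield (start, end)


boundaries = [10, 20, 30, 200]
# A caller that needs only the first vnode range never computes the rest.
first_only = list(islice(ranges_to_vnodes((0, 100), boundaries), 1))
all_ranges = list(ranges_to_vnodes((0, 100), boundaries))
```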
Calle Wilund
e70286a849 db/extensions: Allow schema extensions to turn themselves off
Fixes #4222

Iff an extension creation callback returns null (not exception)
we treat this as "I'm not needed" and simply ignore it.

Message-Id: <20190213124311.23238-1-calle@scylladb.com>
2019-02-13 14:50:51 +02:00
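A Python analogue of the rule in the commit above: a creation callback that returns nothing is skipped, while an exception still propagates. `create_extensions`, the factory names, and the option layout are all hypothetical.

```python
def create_extensions(factories, options):
    """Instantiate extensions, ignoring any factory that returns None."""
    extensions = {}
    for name, factory in factories.items():
        ext = factory(options.get(name))
        if ext is None:
            continue  # the extension says "I'm not needed"
        extensions[name] = ext
    return extensions


factories = {
    "hypothetical_on": lambda opts: {"opts": opts},
    "hypothetical_off": lambda opts: None,
}
created = create_extensions(factories, {"hypothetical_on": "x"})
```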
Gleb Natapov
0cd9bbb71d cql3/statements/select_statement: convert index query interface to new query_ranges_to_vnodes_generator interface
2019-02-11 14:45:43 +02:00
Nadav Har'El
5a695b8029 Materialized views: fix three error messages
Three error messages were supposed to include a column name, but a "{}"
was missing in the format so the given column name didn't actually appear
in the error message. So this patch adds the missing {}'s.

Fixes #4183.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20190203112100.13031-1-nyh@scylladb.com>
2019-02-03 12:23:29 +01:00
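The class of bug fixed above is easy to reproduce in Python, whose `str.format` uses the same "{}" placeholder and silently ignores unused positional arguments. The message text below is made up for illustration and is not one of the three messages from the patch.

```python
column = "a"

# Placeholder missing: the argument is accepted but never appears in the text.
broken = "Cannot use column in a materialized view".format(column)

# Placeholder present: the column name shows up as intended.
fixed = "Cannot use column {} in a materialized view".format(column)
```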
Piotr Jastrzebski
fe8dfc8fdc Stop including types/set.hh into cql3/sets.hh
Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2019-01-24 09:57:19 +01:00