scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-04-23 10:00:35 +00:00

Author	SHA1	Message	Date
Avi Kivity	fdfc347595	cql: make select_statement execution_stage scheduling aware Inherit scheduling from the caller, preventing a fall back into the main group.	2018-06-18 18:30:21 +03:00
Piotr Sarna	70ba8c8317	cql3: update token order comments Comments about token order were outdated with token column patches and they are now up to date. Fixes #3423	2018-06-06 09:02:37 +02:00
Avi Kivity	b70febe246	cql: cql_statement: remove execute_internal() With no callers, it can be safely removed.	2018-05-27 12:40:27 +03:00
Avi Kivity	eb19798f99	cql: select_statement: make execute() and execute_internal() equivalent execute_internal(), for some code paths, differs from execute() by the following: 1. it uses CL_ONE unconditionally 2. it has no query timeout 3. it doesn't use execution stages. For other code paths, it just calls execute(). As preparation for getting rid of execute_internal(), unify the two code paths. Commit `4859b759b9` caused the consistency level and timeouts to be provided by the caller, so using the caller provided parameters instead of overriding them does not change behavior.	2018-05-27 12:36:02 +03:00
Nadav Har'El	1b29dd44f7	secondary index: fix wrong results returned in certain cases The current secondary-index search code, in indexed_table_select_statement::do_execute(), begins by fetching a list of partitions, and then the content of these partitions from the base table. However, in some cases, when the table has clustering columns and not searching on the first one of them, doing this work in partition granularity is wrong, and yields wrong results as demonstrated in issue #3405. So in this patch, we recognize the cases where we need to work in clustering row granularity, and in those cases use the new functions introduced in the previous patches - find_index_clustering_rows() and the execute() variant taking a list of primary-keys of rows. Fixes #3405. Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2018-05-24 15:56:03 +03:00
Nadav Har'El	adf6d742be	secondary index: method for fetching list of rows from base table We add a new variant of select_statement::execute() which allows selecting an arbitrary list of clustering rows. The existing execute() variant can't do that - it can only take a list of partitions, and read the same clustering rows from all of them. The new select variant is not needed for regular CQL queries (which do not have a syntax allowing reading a list of rows with arbitrary primary keys), but we will need it for secondary index search, for solving issue #3405. Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2018-05-24 15:54:36 +03:00
Nadav Har'El	a096a82adc	secondary index: method for fetching list of rows from index We already have a method find_index_partition_ranges(), to fetch a list of partition keys from the secondary index. However, as we shall see in the following patches (and see also issue #3405), getting a list of entire partitions is not always enough - the secondary index actually holds a list of primary keys, which includes clustering keys, and in some queries we can't just ignore them. So this patch provides a new method find_index_clustering_rows(), to query the secondary index and get a list of matching clustering keys. Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2018-05-24 15:53:29 +03:00
Nadav Har'El	083b2ae573	select_statement.cc: refactor find_index_partition_ranges() The function find_index_partition_ranges() is used in secondary index searches for fetching a list of matching partitions. In a following patch, we want to add a similar function for getting a list of rows. To avoid duplicate code, in this patch we split parts of find_index_partition_ranges() into two new functions: 1. get_index_schema() returns a pointer to the index view's schema. 2. read_posting_list() reads from this view the posting list (i.e., list of keys) for the current searched value. Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2018-05-24 15:50:45 +03:00
Nadav Har'El	7dc9b77682	select_statement.cc: fix variable lifetime errors do_with() provides code a reference to an object which will be kept alive. It is a mistake to make a copy of this object or of parts of it, because then the lifetime of this copy will have to be maintained as well. In particular, it is a mistake to do do_with(..., [] (auto x) { ... }) - note how "auto x" appears instead of the correct "auto& x". This causes the object to be copied, and its lifetime not maintained. This patch fixes several cases where this rule was broken in select_statement.cc. I could not reproduce actual crashes caused by these mistakes, but in theory they could have happened. Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2018-05-24 15:46:12 +03:00
Piotr Sarna	40bf5d671b	cql: add secondary index metrics This commit adds basic secondary index metrics to cql_stats: * total number of indexes created * total number of indexes dropped * total number of reads from a secondary index * total number of rows read from a secondary index References #3384 Reviewed-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <d5eda7a343cee547c921dd4d289ecb1ac1c2bf24.1526374243.git.sarna@scylladb.com>	2018-05-15 17:59:53 +03:00
Nadav Har'El	f5536d607e	secondary index: fix multiple appearance of rows This patch fixes a bug where queries using a secondary index would, in some cases, produce the same rows multiple times. The problem was that the code begins by finding a list of primary keys that match the search, and then works on the partitions containing them. If multiple rows matched in the same partition, the partition was considered multiple times, and the same rows were output multiple times. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20180510203141.17157-1-nyh@scylladb.com>	2018-05-13 20:08:14 +02:00
Duarte Nunes	a23bda3393	Merge 'Implement separate timeout for range queries' from Avi " This patchset implements separate timeouts for range queries, and lays the foundations for separate timeouts for other query types. While the feature in itself is worthy, the real motivation is to have the timeouts decided by the caller, instead of storage_proxy. This in turn is required to disentangle each layer behaving differently depending on whether the query is internal or not; instead, the goal is to have each caller declare its needs in terms of consistency level and timeouts, and have the lower layers implement its requirements instead of making their own decisions. Fixes #3013. Tests: unit (release) " * tag '3013/v1.1' of https://github.com/avikivity/scylla: storage_proxy: remove default_query_timeout() storage_proxy: don't use default timeouts query_options: augment with timeout_config thrift: configure thrift transport and handler with a timeout_config transport: configure native transport with a timeout_config cql3: define and populate timeout_config_selector timeout_config: introduce timeout configuration	2018-05-13 20:05:50 +02:00
Avi Kivity	d8dd7e05a7	storage_proxy: don't use default timeouts Require all callers to supply timeouts instead of relying on defaults. Since all callers now have the timeouts set up, they can easily supply them.	2018-04-30 13:19:53 +03:00
Avi Kivity	49fdf01b5d	cql3: define and populate timeout_config_selector Determine which timeout we need to apply at prepare time. We don't know the numerical value (since it depends on whoever is executing the query, not just the statement type), but we know which member of timeout_config we need, so determine and remember that.	2018-04-30 13:19:49 +03:00
Nadav Har'El	8012f231ca	materialized views: fix another case-sensitivity bug We had another case-sensitivity bug in materialized views, where if a case-sensitive (quoted) column name was listed explicitly on "SELECT" (instead of implicitly, e.g., in "SELECT *") the column name was incorrectly folded to lower-case and inserts would fail. This patch fixes the code, where a "SELECT" statement was built using the desired column names, but column names that needed quoting were not being quoted. The bug was in a helper function build_select_statement() which took column name strings and failed to quote them. We clean up this function to take column definitions instead of strings - and take care of the quoting itself. It also needs to quote the table's name in the select statement being built. Fixes #3391. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20180429221857.6248-6-nyh@scylladb.com>	2018-04-30 00:27:23 +02:00
Nadav Har'El	a0bc0d2d11	secondary index: fix support for compound partition key In the current code, if the base table has a compound partition key (i.e., multiple partition-key columns), searching its secondary indexes didn't work. There is no real reason for this; it was just a bug in preparing the second query: Every SI query is converted to two queries. The first queries the associated materialized view, to find a list of primary keys. Those we need to use in a second query, of the base table. The second query needs to list, as restrictions, the keys found above. When a partition key is compound, its components build one key and one restriction. But in the buggy code, we incorrectly used each component as a separate (improperly formatted) key and restriction, and obviously this didn't work. This patch also adds a test that reproduces this problem and confirms its fix. In the fixed code I also found another incorrect use of to_cql_string() (which could break case-sensitive primary key column names) and changed it to to_string(). Fixes #3210. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20180429124138.24406-1-nyh@scylladb.com>	2018-04-29 14:40:13 +01:00
Piotr Sarna	000ce24306	cql3: solve JSON case-sensitivity issues This commit fixes two closely related issues with handling case-sensitive column names in JSON: * according to doc, case-sensitive names should be wrapped with additional pair of double quotes during JSON SELECT * logic error in parse_json() prevented INSERT JSON from working properly on case-sensitive column names This commit is followed by updated cql_query_test, which checks case-sensitive cases as well. Message-Id: <82d9d5e193a656e99bc86b297c00662a6fb808a0.1524576066.git.sarna@scylladb.com>	2018-04-24 16:30:55 +03:00
Nadav Har'El	1ec5688b0b	Materialized Views: fix incorrect limitations on row filtering This patch fixes several cases where it was disallowed to create a materialized view with a filter ("where ..."), for no good reason. After this patch, these cases will be allowed. Fixes #2367. In ordinary SELECT queries, certain types of filtering which are known to be deceptively inefficient are not allowed. For example, trying to query a range of partition keys cannot be done without reading the entire database (because the murmur3 tokenizer randomizes the order of partitions). Restricting two partition key components also cannot be done without reading excessive amount of the entire partition. So Scylla, following Cassandra, chooses to disallow such SELECT queries, and give an error message. However, the same SELECT statements should be allowed when defining a materialized view. In this case, the filter is just used to check an individual row - not to search for one - so there is no performance concern. Unfortunately the existing code did these validations while building the SELECT statement's "restrictions", in code shared by both uses of SELECT (query and MV definition). It was easy to move one of the validations to later code which runs after the restriction has already been built (and knows if it is working for query or MV), but because of the way the "restrictions" objects (translated from Cassandra 2's code) hide what they contain, many of the checks are harder to perform after having built the restrictions object. So instead, we add in strategic places in the restriction-handling code a new "allow_filtering" flag. If restrictions are built with allow_filtering=true, the extra performance-oriented tests on the filtering restrictions are not done. Materialized views set allow_filtering=true. The allow_filtering flag will also be useful later when we want to support the "ALLOW FILTERING" query option which is currently not supported properly (we have several open issues on that). However note that this patch doesn't complete that support: I left a FIXME in the spot where we set allow_filtering in the Materialized Views case, but in the future we also need to set it if the user specified "ALLOW FILTERING" in the query. This patch also enables several unit tests written by Duarte which used to fail because of this bug, and now pass. These tests verify that the restrictions are now allowed and filter the view as desired; but I also added test code to verify that the same restrictions are still forbidden, as before, when used in ordinary SELECT queries. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20180423124343.17591-1-nyh@scylladb.com>	2018-04-23 14:08:04 +01:00
Avi Kivity	f7b102238a	cql3: change cql_statement methods to accept a local storage_proxy The storage_proxy represents the entire cluster, so there's never a need to access it on a remote shard; the local shard instance will contact remote shard or remote nodes as needed. Simplify the API by passing storage_proxy references instead of seastar::sharded<storage_proxy> references. query_processor and other callers are adjusted to call seastar::sharded::local() first. Message-Id: <20180415142656.25370-2-avi@scylladb.com>	2018-04-16 10:18:28 +02:00
Avi Kivity	dc0c458c12	Merge "First series on JSON support in CQL" from Piotr " This series introduces 'SELECT JSON' clause support for CQL. Things implemented: * expanding CQL grammar with JSON keyword * converting values to JSON format * serving 'SELECT JSON ' clauses tests for 'SELECT JSON' " * 'json_ops' of https://github.com/psarna/scylla: tests: add cql unit tests for SELECT JSON cql3: Add JSON token to CQL grammar cql3: add support for SELECT JSON clause cql3: add to_json_string function to types	2018-04-11 18:26:53 +03:00
Piotr Sarna	15545da572	cql3: add support for SELECT JSON clause This commit adds the implementation of SELECT JSON clause which returns rows in JSON format. Each returned row has a single '[json]' column. References #2058	2018-04-11 17:12:02 +02:00
Piotr Sarna	a5b6047ffa	cql3: add row-wise read statistics Database read metrics is now extended by total number of rows read, exported through cql_rows_read field. Closes #3146 Message-Id: <02f0816c509f3d7fea06da22869eea61548284e2.1522919708.git.sarna@scylladb.com>	2018-04-05 13:39:08 +03:00
Botond Dénes	2e2abf6edb	storage_proxy: add coordinator_query_options and coordinator_query_result As yet more parameters and return-values are about to be added to all storage_proxy::query_* methods we need a way that scales better than changing the signatures every time. To this end we aggregate all non-mandatory query parameters into `coordinator_query_options` and all return values into `coordinator_query_result`. This way new fields can be simply added to the respective structs while the signatures of the methods themselves and their client code can remain unchanged.	2018-03-19 15:17:35 +02:00
Botond Dénes	eac597d726	Add preferred and last replicas to the signature of query() preferred_replicas are added to the parameters and last_replicas are added to the return type. The preferred replicas will be used as a hint for the selection of the replicas to send the read requests to. The last replicas (returned) are the replicas actually selected for the read. This will allow queries to consistently hit the same replicas for each page thus reusing readers created on these replicas. For convenience a query() overload is provided that doesn't take or return the preferred and last replicas. This patch only adds the parameters and propagates them down to query_singular() and query_partition_key_range(). The code to actually use these preferred-replicas will be added in later patches. The reason for separating this is to reduce noise and improve reviewability for those functional changes later.	2018-03-13 10:34:34 +02:00
Nadav Har'El	fa284f6307	Add query UUID to read command This patch adds the parameter to read_command which is needed for caching of readers across multiple pages of a paged query, which we will introduce in the next patches. The query_uuid is a UUID of a previously saved reader, which the replica is now asked to recall and resume (if this saved reader is no longer in the cache, it is fine, a new reader will be started). Additionally a helper flag is_first_page is added so that the replica can avoid doing any cache lookups (and incrementing miss counters) for the first page. Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2018-03-13 10:34:34 +02:00
Duarte Nunes	ac6abf8021	Merge 'CQL clustering column secondary indexing support' from Pekka "This patch series adds support for clustering column secondary indexing. Fixes #2961 Tests: unit-tests (release)" * 'penberg/cql-2i-clustering-key-indexing/v2' of github.com:penberg/scylla: tests/cql_query_test: Add indexed clustering key query test cql3: Fix clustering column secondary indexing cql3/statements: Add values() helper to restrictions cql3/restrictions: Fix multi_column_restriction::values() cql3/restrictions: Fix single_column_primary_key_restrictions::values()	2018-02-12 18:49:34 +00:00
Paweł Dziepak	b635fec9bf	cql3/select_statement: do not capture stack variables by reference Default capture by reference considered harmful in async code.	2018-02-08 14:46:10 +00:00
Pekka Enberg	0128f802ed	cql3: Fix clustering column secondary indexing Fix clustering column indexing by lifting the limitation of only considering non-primary key restrictions in select_statement::find_index_partition_ranges().	2018-02-06 16:57:27 +02:00
Glauber Costa	08a0c3714c	allow request-specific read timeouts in storage proxy reads Timeouts are a global property. However, for tables in keyspaces like the system keyspace, we don't want to uphold that timeout--in fact, we want no timeout there at all. We already apply such configuration for requests waiting in the queued sstable queue: system keyspace requests won't be removed. However, the storage proxy will insert its own timeouts in those requests, causing them to fail. This patch changes the storage proxy read layer so that the timeout is applied based on the column family configuration, which is in turn inherited from the keyspace configuration. This matches our usual way of passing db parameters down. In terms of implementation, we can either move the timeout inside the abstract read executor or keep it external. The former is a bit cleaner, the latter has the nice property that all executors generated will share the exact same timeout point. In this patch, we chose the latter. We are also careful to propagate the timeout information to the replica. So even if we are talking about the local replica, when we add the request to the concurrency queue, we will do it in accordance with the timeout specified by the storage proxy layer. After this patch, Scylla is able to start just fine with very low timeouts--since read timeouts in the system keyspace are now ignored. Fixes #2462 Implementation notes, and general comments about open discussion in 2462: * Because we are not bypassing the timeout, just setting it high enough, I consider the concerns about the batchlog moot: if we fail for any other reason that will be propagated. Last case, because the timeout is per-CF, we could do what we do for the dirty memory manager and move the batchlog alone to use a different timeout setting. * Storage proxy likes specifying its timeouts as a time_point, whereas when we get low enough as to deal with the read_concurrency_config, we are talking about deltas. So at some point we need to convert time_points to durations. We do that in the database query functions. v2: - use per-request instead of per-table timeouts. Signed-off-by: Glauber Costa <glauber@scylladb.com>	2018-01-12 07:43:21 -05:00
Vladimir Krivopalov	41eb278899	Only allow DISTINCT SELECT queries with partition key restrictions. Fixes #2049 Signed-off-by: Vladimir Krivopalov <vladimir@scylladb.com> Message-Id: <75e69626d797e63fb1e93a9120f135d4959fad1c.1512162540.git.vladimir@scylladb.com>	2017-12-03 11:59:11 +02:00
Pekka Enberg	9048f741ad	cql3: Secondary-index backed select statements This patch adds support for secondary-index backed select statements. Current select_statement class is split into two separate classes: primary_key_select_statement that retains regular query behavior and indexed_table_select_statement that introduces the new secondary-index backed query logic. One of the two behaviors is selected at query preparation time to minimize overhead for non-indexed queries.	2017-11-03 10:12:58 +02:00
Amnon Heiman	08c81427b9	Add paging for internal queries Usually, internal queries are used for short queries. Sometimes though, like in the case of get compaction history, there could be a large amount of results. Without paging it will overload the system. This patch adds the ability to use paging internally. Paging must be requested explicitly; all the relevant information is stored in an internal_query_state, which holds both the paging state and the query so that consecutive calls can be made. To use paging, call the query method with a function. Besides a statement and its parameters, the method takes a function that is called for each of the returned rows. For example, if qp is a query_processor: qp.query("SELECT * from system.compaction_history", [] (const cql3::untyped_result_set::row& row) { .... // do something with row ... return stop_iteration::no; // keep on reading }); will run the function on each of the compaction history table rows. To stop the iteration, the function can return stop_iteration::yes.	2017-07-20 17:43:51 +03:00
Duarte Nunes	6ac73b57fb	cql3/statements/select_statement: Remove dead code Signed-off-by: Duarte Nunes <duarte@scylladb.com> Message-Id: <20170522100230.17393-1-duarte@scylladb.com>	2017-05-22 14:32:12 +03:00
Avi Kivity	ebaeefa02b	Merge seastar upstream (seastar namespace) - introduced "seastarx.hh" header, which does a "using namespace seastar"; - 'net' namespace conflicts with seastar::net, renamed to 'netw'. - 'transport' namespace conflicts with seastar::transport, renamed to cql_transport. - "logger" global variables now conflict with logger global type, renamed to xlogger. - other minor changes	2017-05-21 12:26:15 +03:00
Duarte Nunes	d7701087af	cql3/restrictions/statement_restrictions: Consider statement type Now that update_statement uses statement_restrictions, we need our validation logic to take the statement type into account, in particular to deal with insertion statements which only set static columns but specify clustering values. Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2017-05-10 19:54:42 +02:00
Pekka Enberg	dfee4d2bb0	cql3: Fix partition key bind indices for prepared statements Fix the CQL front-end to populate the partition key bind index array in result message prepared metadata, which is needed for CQL binary protocol v4 to function correctly. Fixes #2355. Message-Id: <1494247871-3148-1-git-send-email-penberg@scylladb.com>	2017-05-08 16:33:17 +03:00
Vlad Zolotarov	ff55b76562	cql3::query_processor: use weak_ptr for passing the prepared statements around Use seastar::checked_ptr<weak_ptr<prepared_statement>> instead of shared_ptr for passing prepared statements around. This allows easy tracking and handling of statement invalidation. This implementation will throw an exception every time an invalidated statement reference is dereferenced. Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>	2017-04-12 12:24:03 -04:00
Avi Kivity	27c42359bc	Merge seastar upstream * seastar 6b21197...2ebe842 (6): > Merge "Various improvements to execution stages" from Paweł > app-template: allow apps to specify a name for help message > bool_class: avoid initializing object of incomplete type > app-template: make sure we can still get help with required options > prometheus: Http handler that returns prometheus 0.4 protobuf or text format > Update DPDK to 17.02 Includes patch from Pawel to adjust to updated execution_stage interface.	2017-03-26 10:50:21 +03:00
Duarte Nunes	bfb8a3c172	materialized views: Replace db::view::view class The write path uses a base schema at a particular version, and we want it to use the materialized views at the corresponding version. To achieve this, we need to map the state currently in db::view::view to a particular schema version, which this patch does by introducing the view_info class to hold the state previously in db::view::view, and by having a view schema directly point to it. The changes in the patch are thus: 1) Introduce view_info to hold the extra view state; 2) Point to the view_info from the schema; 3) Make the functions in the now stateless db::view::view non-member; 4) Remove the db::view::view class. All changes are structural and don't affect current behavior. Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2017-03-15 15:50:05 +01:00
Paweł Dziepak	d005b20071	cql3: make select statement an execution stage	2017-03-09 09:27:43 +00:00
Pekka Enberg	2bd560118e	cql3/statements/select_statement: Unset value support	2017-01-27 09:24:36 +02:00
Tomasz Grabiec	bc6486b304	Use gc_clock instead of db_clock where possible Some code paths were obtaining a db_clock timestamp only to convert it to gc_clock later. Avoid this. In the future we could make gc_clock cheaper because it has low precision. Message-Id: <1482401190-2035-1-git-send-email-tgrabiec@scylladb.com>	2016-12-22 13:27:55 +02:00
Duarte Nunes	124802e196	cql3: Add function to build view's select statement This patch adds a utility function that creates a raw select statement from a set of columns and a where clause. It is intended to be used to create the prepared select statement used by the view class. Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2016-12-20 13:06:11 +00:00
Duarte Nunes	088dfdb108	select_statement: Consider materialized views This patch considers materialized views in select_statement::check_access(). Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2016-12-20 13:06:11 +00:00
Duarte Nunes	8792fed651	create_view_statement: Complete implementation Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2016-12-20 13:06:11 +00:00
Duarte Nunes	a9c17b0a52	select_statement: Propagate for_view argument This patch propagates the for_view argument, used by statement_restrictions to ensure IS NOT NULL can be used when creating a materialized view. Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2016-12-20 13:06:11 +00:00
Asias He	937f28d2f1	Convert to use dht::partition_range_vector and dht::token_range_vector	2016-12-19 14:08:50 +08:00
Asias He	e5485f3ea6	Get rid of query::partition_range Use dht::partition_range instead	2016-12-19 08:09:25 +08:00
Duarte Nunes	7ce859799b	select_statement: Don't always trim result set Trimming the result set is only needed when the query contains an "IN" relation, an ORDER BY clause, and defines a limit, which is the case where we query different ranges concurrently. We don't use the result_merger to trim since we first need to reorder the rows. Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2016-12-15 11:00:46 +00:00
Duarte Nunes	fee0b7fa48	query_result_merger: Limit rows This patch makes the row limit enforced by the storage_proxy layer. It adds a row limit to the query_result_merger, useful when merging results for concurrent queries. More importantly, it provides guarantees that upper layers may be relying on implicitly (e.g., the paging code). Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2016-12-15 11:00:36 +00:00
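
Note on commit f5536d607e (secondary index: fix multiple appearance of rows): the bug described there arises when several matching primary keys share one partition and that partition is then processed once per key. The following is a minimal sketch of the general idea of collapsing matching keys to distinct partitions, written in plain standard C++. The names primary_key, unique_partitions and example are hypothetical and used only for illustration; they are not Scylla types or functions, and the actual patch may solve the problem differently.

```cpp
#include <cassert>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for a primary key read from an index:
// a partition key plus a clustering key (not a Scylla type).
struct primary_key {
    std::string partition;
    std::string clustering;
};

// Collapse the matching primary keys to the partitions that contain them,
// keeping each partition once, in first-seen order. Without this step a
// partition holding several matching rows would be visited several times.
std::vector<std::string> unique_partitions(const std::vector<primary_key>& keys) {
    std::unordered_set<std::string> seen;
    std::vector<std::string> out;
    for (const auto& k : keys) {
        // insert().second is true only the first time a partition is seen.
        if (seen.insert(k.partition).second) {
            out.push_back(k.partition);
        }
    }
    return out;
}

// Usage: two matches in partition "p1" and one in "p2" yield two partitions.
void example() {
    std::vector<primary_key> hits = {{"p1", "a"}, {"p1", "b"}, {"p2", "a"}};
    std::vector<std::string> parts = unique_partitions(hits);
    assert(parts.size() == 2);
    assert(parts[0] == "p1" && parts[1] == "p2");
}
```

Preserving first-seen order is a choice made for this sketch so that results stay in the order the index returned them; a real implementation would also have to respect token order and paging.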

112 Commits