scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-04-28 12:17:02 +00:00

Author	SHA1	Message	Date
Duarte Nunes	1953c5fa61	Merge 'Fix filtering with LIMIT' from Piotr " This series adds proper handling of filtering queries with LIMIT. Previously the limit was erroneously applied before filtering, which leads to truncated results. To avoid that, paged filtering queries now use an enhanced pager, which remembers how many rows were dropped and uses that information to fetch more pages if the limit is not yet reached. For unpaged filtering queries, paging is done internally, as in the case of aggregations, to avoid keeping huge results in memory. Also, previously, all limited queries used the page size counted from max(page size, limit). It's not good for filtering, because with LIMIT 1 we would then query for rows one-by-one. To avoid that, filtered queries ask for the whole page and the results are truncated if need be afterwards. Tests: unit (release) " * 'fix_filtering_with_limit_2' of https://github.com/psarna/scylla: tests: add filtering with LIMIT test tests: split filtering tests from cql_query_test cql3: add proper handling of filtering with LIMIT service/pager: use dropped_rows to adjust how many rows to read service/pager: virtualize max_rows_to_fetch function cql3: add counting dropped rows in filtering pager (cherry picked from commit `1afda28cf3`)	2018-12-02 12:07:46 +02:00
Duarte Nunes	522a48a244	Merge 'Fix for a select statement with filtered columns' from Eliran " This patchset fixes #3803. When a select statement with filtering is executed and the column that is needed for the filtering is not present in the select clause, rows that should have been filtered out according to this column will still be present in the result set. Tests: 1. The testcase from the issue. 2. Unit tests (release) including the newly added test from this patchset. " * 'issues/3803/v10' of https://github.com/eliransin/scylla: unit test: add test for filtering queries without the filtered column cql3 unit test: add assertion for the number of serialized columns cql3: ensure retrieval of columns for filtering cql3: refactor find_idx to be part of statement restrictions object cql3: add prefix size common functionality to all clustering restrictions cql3: rename selection metadata manipulation functions (cherry picked from commit `3fe92663d4`)	2018-10-24 09:44:46 +03:00
Duarte Nunes	e6630c627b	Merge 'Add secondary index paging' from Piotr " Indexed select statement consists of two queries - the view query used to extract base keys and the base query that uses those keys to return base rows. The main idea of this series is to replace raw proxy.query() call during the view query to one that uses a pager. Additionally, paging info from the view query needs to be returned to the client, in order to be used later for requesting new pages. " * 'paging_indexes_7' of https://github.com/psarna/scylla: tests: add test for secondary index with paging cql3: remove execute(primary_keys) from select statement cql3: add incremental base queries to index query storage_proxy: make get_restricted_ranges public cql3: add base query handling function to indexed statement cql3: add generating base key from index keys cql3: add paging state generation function cql3: move getting index view schema to prepare stage pager: make state() defined for exhausted pagers cql3: add maybe_set_paging_state function cql3: rename set_has_more_pages to set_paging_state pager: add setters for partition/clustering keys cql3: add paging to read_posting_list cql3: add non-const get_result_metadata method cql3: make find_index_* functions return paging state cql3: make read_posting_list return future<rows> cql3: make pagers use time_point instead of duration	2018-10-01 10:42:21 +01:00
Duarte Nunes	5e7bb20c8a	cql3/selection/selector: Unwrap types when validating assignment When validating assignment between two types, it's possible one of them is wrapped in a reverse_type, if it comes, for example, from the type associated with a clustering column. When checking for weak assignment the types are correctly unwrapped, but not when checking for an exact match, which this patch fixes. Technically, the receiver is never a reversed_type for the current callers, but this is the morally correct implementation, as the type being reversed or not plays no role in assignment. Tests: unit(release) Fixes #3789 Signed-off-by: Duarte Nunes <duarte@scylladb.com> Message-Id: <20180927223201.28152-1-duarte@scylladb.com>	2018-09-28 07:08:19 +03:00
Piotr Sarna	b83aa69a2e	cql3: add non-const get_result_metadata method	2018-09-27 15:18:06 +02:00
Piotr Sarna	5b5c9f2707	cql3: fix a 'pratition_key' typo partition_key got misspelled with 'pratition_key' typo in the original series. Message-Id: <de59fe6161df5442b19d8ba4336e2f828b7ede32.1535981852.git.sarna@scylladb.com>	2018-09-04 16:05:09 +03:00
Nadav Har'El	3f3a76aa8f	Do not allow selecting a virtual column For issue #3362, we will need to add to a materialized view also unselected base-table columns as "virtual columns". We need these columns to exist to keep view rows alive, but we don't want the user to be able to see them. In this patch we prevent SELECTing the virtual columns of the view, and also exclude the virtual columns from a "SELECT *" on a view. Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2018-08-16 15:34:22 +03:00
Piotr Sarna	8c18aaa511	cql3: pass query options to restrictions filter Query options may contain bound values needed for checking filtering restrictions. Previously, empty query_options{} were used, which caused prepared statements to fail. Fixes #3677	2018-08-09 17:44:45 +02:00
Piotr Sarna	aadbfc6b84	cql3: throw instead of log for collection filtering Original series that introduced filtering logged a warning when collection restrictions appeared. Instead, an exception should be thrown until collection restrictions are supported for ALLOW FILTERING clauses. Message-Id: <ddaf342d4d6766fadb756f66e5afa0b99ce054f8.1531220558.git.sarna@scylladb.com>	2018-07-10 14:44:29 +03:00
Piotr Sarna	77aa97f62a	cql3: fix ALLOW FILTERING iterator In original series cell iterator for regular cells was erroneously taken by copy instead of by reference, which will result in iterating over the first value indefinitely. Also, the same iterator was not updated for collections, which is fixed too. Message-Id: <83297adf8121de4fd37257c87f250d61ea9ec80b.1530892191.git.sarna@scylladb.com>	2018-07-06 17:23:12 +01:00
Piotr Sarna	a08fba19e3	cql3: optimize filtering partition keys and static rows If any restriction on partition key or static row part fails, it will be so for every row that belongs to a partition. Hence, full check of the rest of the rows is skipped.	2018-07-05 10:50:43 +02:00
Piotr Sarna	2a0b720102	cql3: add filtering visitor In order to filter results of an 'ALLOW FILTERING' query, a visitor that can take optional filter for result_builder is provided. It defaults to nop_filter, which accepts all rows.	2018-07-05 10:50:43 +02:00
Piotr Sarna	1cf5653f89	cql3: move result_set_builder functions to header Moving function definitions to header is a preparation step before turning result_set_builder into a template.	2018-07-05 10:50:43 +02:00
Paweł Dziepak	3f1184d16d	cql3: selection: add is_trivial() cql3::result_generator supports only trivial selections.	2018-06-25 09:21:47 +01:00
Paweł Dziepak	4704c4efab	query::result: avoid copying and linearising cell value query::result_view already operates on views of a serialised query::result. However, until now the value of a cell was always linearised and copied. This patch makes use of ser::buffer_view to avoid that.	2018-06-25 09:21:47 +01:00
Piotr Sarna	15545da572	cql3: add support for SELECT JSON clause This commit adds the implementation of SELECT JSON clause which returns rows in JSON format. Each returned row has a single '[json]' column. References #2058	2018-04-11 17:12:02 +02:00
Vladimir Krivopalov	fb7d46fc2e	Allow COUNT(*) and COUNT(1) to be queried with other aggregations or columns Fixes #2218 Signed-off-by: Vladimir Krivopalov <vladimir@scylladb.com> Message-Id: <c387d34969d5bcfb8b2bf42806e6e05a9b8a067c.1511487356.git.vladimir@scylladb.com>	2017-11-24 10:01:21 +00:00
Daniel Fiala	7fe653f08c	cql3/selectable: Add selectable::with_cast for CAST AS functions. Signed-off-by: Daniel Fiala <daniel@scylladb.com>	2017-10-07 21:04:40 +02:00
Calle Wilund	6c8b5fc09d	schema_tables: Use v3 schema tables and formats Switches system/schema_* for system_schema/*, updates schema/schema builder and uses to hold/expect v3 style info (i.e. types & dropped).	2017-05-10 16:44:48 +00:00
Paweł Dziepak	fce6e0987f	cql3: selection: do not panic when seeing counters At this stage counters cells are already long_type values, so no special handling is necessary.	2017-02-02 10:35:14 +00:00
Tomasz Grabiec	bc6486b304	Use gc_clock instead of db_clock where possible Some code paths were obtaining db_clock timestamp to only convert it to gc_clock later. Avoid this. In the future we could make gc_clock cheaper cause it has low precision. Message-Id: <1482401190-2035-1-git-send-email-tgrabiec@scylladb.com>	2016-12-22 13:27:55 +02:00
Pekka Enberg	e1e8ca2788	cql3: Fix selecting same column multiple times Under the hood, the selectable::add_and_get_index() function deliberately filters out duplicate columns. This causes simple_selector::get_output_row() to return a row with all duplicate columns filtered out, which triggers an assertion because of row mismatch with metadata (which contains the duplicate columns). The fix is rather simple: just make selection::from_selectors() use selection_with_processing if the number of selectors and column definitions doesn't match -- like Apache Cassandra does. Fixes #1367 Message-Id: <1477989740-6485-1-git-send-email-penberg@scylladb.com>	2016-11-01 09:09:01 +00:00
Duarte Nunes	cb0516a76c	schema: Remove compact_column concept This is a confusing one, and can be replaced by the fact that dense schemas have a single regular column. Ref #1542 Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2016-08-03 17:21:41 +00:00
Duarte Nunes	529c3a3ae6	column_kind: Drop compact_column A compact column is a dense schema's single regular column. The fact that it is a different column_kind has led to various bugs (#1535), derived by the schema being dense and the column being regular. Fixes #1542 Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2016-08-03 17:21:37 +00:00
Avi Kivity	0135b4d5cd	cql3: constify metadata users Metadata usually doesn't change after it is created; make that visible in the code, allowing further optimizations to be applied later. Message-Id: <1464334638-7971-3-git-send-email-avi@scylladb.com>	2016-05-31 09:12:11 +03:00
Gleb Natapov	f3b515052b	udt: fix error generation if accessed type is not udt Fixes #1198 Message-Id: <1460884314-3717-2-git-send-email-gleb@scylladb.com>	2016-04-18 12:45:03 +03:00
Duarte Nunes	ece89069dd	udt: Implement to_string() for selectable Signed-off-by: Duarte Nunes <duarte@scylladb.com> Message-Id: <1460884314-3717-1-git-send-email-gleb@scylladb.com>	2016-04-18 12:44:48 +03:00
Pekka Enberg	38a54df863	Fix pre-ScyllaDB copyright statements People keep tripping over the old copyrights and copy-pasting them to new files. Search and replace "Cloudius Systems" with "ScyllaDB". Message-Id: <1460013664-25966-1-git-send-email-penberg@scylladb.com>	2016-04-08 08:12:47 +03:00
Tomasz Grabiec	63006e5dd2	query: Serialize collection cells using CQL format We want the format of query results to be eventually defined in the IDL and be independent of the format we use in memory to represent collections. This change is a step in this direction. The change decouples format of collection cells in query results from our in-memory representation. We currently use collection_mutation_view, after the change we will use CQL binary protocol format. We use that because it requires less transformations on the coordinator side. One complication is that some list operations need to retrieve keys used in list cells, not only values. To satisfy this need, new query option was added called "collections_as_maps" which will cause lists and sets to be reinterpreted as maps matching their underlying representation. This allows the coordinator to generate mutations referencing existing items in lists.	2016-02-15 17:05:55 +01:00
Tomasz Grabiec	9d11968ad8	Rename serialization_format to cql_serialization_format	2016-02-15 16:53:56 +01:00
Tomasz Grabiec	916a91c913	query: Split send_timestamp_and_expiry into two separate options It's cleaner that way. They don't need to come together.	2016-02-15 16:53:56 +01:00
Paweł Dziepak	ed7d9d4996	schema: change has_collections() to has_multi_column_collections() All users of schema::has_collections() aren't really interested in frozen ones. Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>	2016-01-05 10:46:42 +01:00
Paweł Dziepak	3287022000	cql3: do not assume that clustering key is full In case of schemas that use compact storage it is possible that trailing components of clustering keys are not set. Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>	2015-12-10 05:46:26 +01:00
Avi Kivity	79f7431a03	db: change collection_mutation::{one,view} not to use nested classes Nested classes cannot be forward-declared, so change the naming not to use them. Follows atomic_cell{,_view}.	2015-11-13 17:13:07 +02:00
Calle Wilund	4a1a17defc	cql3::selection: Move result set building visitor to result_set_builder Allows its use (and partial override - hint hint) in more place than one.	2015-11-10 13:12:33 +01:00
Calle Wilund	23b6240dad	cql3::selection: Fix some constness correctness	2015-11-10 13:12:33 +01:00
Avi Kivity	d5cf0fb2b1	Add license notices	2015-09-20 10:43:39 +03:00
Paweł Dziepak	f6a93be655	cql3: skip compact value columns with no name Signed-off-by: Paweł Dziepak <pdziepak@cloudius-systems.com>	2015-08-14 14:53:35 +02:00
Paweł Dziepak	a6d0ed205b	cql3: use api::missing_timestamp for missing timestamps A missing timestamp is a missing one, not the smallest one. Signed-off-by: Paweł Dziepak <pdziepak@cloudius-systems.com>	2015-07-16 16:23:03 +02:00
Pekka Enberg	11b633208d	cql3: Remove Java imports from C++ files Remove left-over Java imports from files that are already translated to C++. Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>	2015-05-12 16:41:12 +03:00
Pekka Enberg	d50139351f	cql3: Use pragma once everywhere There's no benefit to using C include guards so switch to pragma once everywhere for consistency. Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>	2015-05-12 16:32:56 +03:00
Tomasz Grabiec	5ba1486ae7	db: Rename "ttl" to "expiry" when it's used as time point To avoid confusion with "ttl" the duration.	2015-05-06 17:27:22 +02:00
Avi Kivity	6290dee438	db: const correctness for abstract_type and friends Types are immutable.	2015-04-29 15:40:38 +03:00
Tomasz Grabiec	731a63e371	schema: Embed raw_schema inside schema Public fields got encapsulated.	2015-04-24 18:01:01 +02:00
Avi Kivity	f841a05475	cql3: convert selectable::with_field_selection to C++ Due to circular dependencies (selectable::with_field_selection -> column_identifier -> selectable) a new header file was created.	2015-04-20 16:15:34 +03:00
Avi Kivity	fa961f1e5e	cql3: convert field_selector to C++	2015-04-20 16:15:34 +03:00
Avi Kivity	3d38708434	cql3: pass a database& instance to most foo::raw::prepare() variants To prepare a user-defined type, we need to look up its name in the keyspace. While we get the keyspace name as an argument to prepare(), it is useless without the database instance. Fix the problem by passing a database reference along with the keyspace. This percolates through the class structure, so most cql3 raw types end up receiving this treatment. Origin gets along without it by using a singleton. We can't do this due to sharding (we could use a thread-local instance, but that's ugly too). Hopefully the transition to a visitor will clean this up.	2015-04-20 16:15:34 +03:00
Tomasz Grabiec	ee906471ab	cql3: Move method implementations to .cc	2015-04-15 20:44:59 +02:00
Tomasz Grabiec	00f99cefd4	db: split query.hh to reduce header dependencies	2015-04-15 20:44:59 +02:00
Tomasz Grabiec	878a740b9d	db: Write query results in serialized form This gives about 30% increase in tps in: build/release/tests/perf/perf_simple_query -c1 --query-single-key This patch switches query result format from a structured one to a serialized one. The problems with structured format are: - high level of indirection (vector of vectors of vectors of blobs), which is not CPU cache friendly - high allocation rate due to fine-grained object structure On replica side, the query results are probably going to be serialized in the transport layer anyway, so this change only subtracts work. There is no processing of the query results on replica other than concatenation in case of range queries. If query results are collected in serialized form from different cores, we can concatenate them without copying by simply appending the fragments into the packet. This optimization is not implemented yet. On coordinator side, the query results would have to be parsed from the transport layer buffers anyway, so this also doesn't add work, but again saves allocations and copying. The CQL server doesn't need complex data structures to process the results, it just goes over it linearly consuming it. This patch provides views, iterators and visitors for consuming query results in serialized form. Currently the iterators assume that the buffer is contiguous but we could easily relax this in future so that we can avoid linearization of data received from seastar sockets. The coordinator side could be optimized even further for CQL queries which do not need processing (eg. select * from cf where ...) we could make the replica send the query results in the format which is expected by the CQL binary protocol client. So in the typical case the coordinator would just pass the data using zero-copy to the client, prepending a header. 
We do need structure for prefetched rows (needed by list manipulations), and this change adds query result post-processing which converts serialized query result into a structured one, tailored particularly for prefetched rows needs. This change also introduces partition_slice options. In some queries (maybe even in typical ones), we don't need to send partition or clustering keys back to the client, because they are already specified in the query request, and not queried for. The query results hold now keys as optional elements. Also, meta-data like cell timestamp and ttl is now also optional. It is only needed if the query has writetime() or ttl() functions in it, which it typically won't have.	2015-04-15 20:44:50 +02:00

73 Commits