scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-04-21 00:50:35 +00:00

Author	SHA1	Message	Date
Vlad Zolotarov	baa6496816	service::storage_proxy: READ instrumentation: store trace state object in abstract_read_executor Having a trace_state_ptr in the storage_proxy level is needed to trace code bits in this level. Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>	2016-07-19 18:21:59 +03:00
Vlad Zolotarov	54a758dfff	cql3::select_statement: simplify the tracing code by using a tracing::make_trace_info() helper Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>	2016-07-19 18:21:58 +03:00
Vlad Zolotarov	a5022a09a4	tracing: use 'write' instead of 'flush' and 'store' for consistency with seastar's API In names of functions and variables: s/flush_/write_/ s/store_/write_/ In a i_tracing_backend_helper: s/flush()/kick()/ Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>	2016-07-19 18:21:57 +03:00
Duarte Nunes	f013425bb5	query: Ensure timestamp is last param in read_command Since the timestamp is not serialized, it must always be the last parameter of query::read_command. This patch reorders it with the partition_limit parameters and updates callers that specified a timestamp argument. Signed-off-by: Duarte Nunes <duarte@scylladb.com> Message-Id: <1468312334-10623-1-git-send-email-duarte@scylladb.com>	2016-07-12 10:41:54 +01:00
Pekka Enberg	f64c25a495	cql3/statements/select_statement: Unify coding style The coding style in select_statement.cc is very inconsistent which makes the code hard to read. Clean that up. Message-Id: <1464871790-21031-1-git-send-email-penberg@scylladb.com>	2016-06-02 16:17:21 +02:00
Vlad Zolotarov	4c17a422e0	cql3: instrument a SELECT query to send tracing info Instrument a coordinator of a SELECT query to send tracing session info to the corresponding replica Nodes. Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>	2016-06-01 20:17:25 +03:00
Vlad Zolotarov	6e26909b02	query::read_command: add an optional trace_info field Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>	2016-06-01 20:17:19 +03:00
Avi Kivity	0135b4d5cd	cql3: constify metadata users Metadata usually doesn't change after it is created; make that visible in the code, allowing further optimizations to be applied later. Message-Id: <1464334638-7971-3-git-send-email-avi@scylladb.com>	2016-05-31 09:12:11 +03:00
Avi Kivity	25b3d74f45	cql3: Split select_statement::raw_statement into raw namespace cql3::select_statement::raw_statement -> cql3::raw::select_statement Message-Id: <1464609556-3756-4-git-send-email-avi@scylladb.com>	2016-05-31 09:09:30 +03:00
Avi Kivity	caf8d4f0e6	cql3: separate parsed_statement and parsed_statement::prepared cql3::statements::parsed_statement -> cql3::statements::raw::parsed_statement cql3::statements::parsed_statement::prepared -> cql3::statements::prepared_statement Message-Id: <1464609556-3756-2-git-send-email-avi@scylladb.com>	2016-05-31 09:09:10 +03:00
Gleb Natapov	7f6b12c97a	query: add user provided timestamp to read_command If read query supplies timestamp move it to read_command to be used later otherwise get local timestamp.	2016-05-24 15:19:35 +03:00
Calle Wilund	3906dc9f0d	cql3::statements: Change check_access to future<> + implement	2016-04-19 11:49:05 +00:00
Pekka Enberg	38a54df863	Fix pre-ScyllaDB copyright statements People keep tripping over the old copyrights and copy-pasting them to new files. Search and replace "Cloudius Systems" with "ScyllaDB". Message-Id: <1460013664-25966-1-git-send-email-penberg@scylladb.com>	2016-04-08 08:12:47 +03:00
Tomasz Grabiec	1ecf9a7427	query: result_view: Introduce do_with() Encapsulates linearization. Abstracts away the fact that result_view can't work with discontiguous storage yet.	2016-02-26 12:26:13 +01:00
Pekka Enberg	dfcc48d82a	transport: Add result metadata to PREPARED message The gocql driver assumes that there's a result metadata section in the PREPARED message. Technically, Scylla is not at fault here as the CQL specification explicitly states in Section 4.2.5.4. ("Prepared") that the section may be empty: - <result_metadata> is defined exactly as <metadata> but correspond to the metadata for the resultSet that execute this query will yield. Note that <result_metadata> may be empty (have the No_metadata flag and 0 columns, See section 4.2.5.2) and will be for any query that is not a Select. There is in fact never a guarantee that this will non-empty so client should protect themselves accordingly. The presence of this information is an However, Cassandra always populates the section so let's do that as well. Fixes #912. Message-Id: <1456317082-31688-1-git-send-email-penberg@scylladb.com>	2016-02-24 14:43:24 +02:00
Tomasz Grabiec	09dc79f245	cql3: select_statement: Set desired serialization format	2016-02-15 17:05:55 +01:00
Tomasz Grabiec	9d11968ad8	Rename serialization_format to cql_serialization_format	2016-02-15 16:53:56 +01:00
Erich Keane	4197ceeedb	raw_statement::is_reversed rewrite to avoid VLA The is_reversed function uses a variable length array, which isn't spec-abiding C++. Additionally, the Clang compiler doesn't allow them with non-POD types, so this function wouldn't compile. After reading through the function it seems that the array wasn't necessary as the check could be calculated inline rather than separately. This version should be more performant (since it no longer requires the VLA lookup performance hit) while taking up less memory in all but the smallest of edge-cases (when the clustering_key_size * sizeof(optional<bool>) < sizeof(size_type) - sizeof(uint32_t) + sizeof(bool)). This patch uses relation_order_unsupported to assure that the exception order is consistent with the previous version. The throw would otherwise be moved into the initial for-loop. There are two deviations in behavior: The first is the initial assert. It however should not change the apparent behavior besides causing orderings() to be looked up 2x in debug situations. The second is the conversion of is_reversed_ from an optional to a bool. The result is that the final return value is now well-defined to be false in the release-condition where orderings().size() == 0, rather than be the ill-defined *is_reversed_ that was there previously. Signed-off-by: Erich Keane <erich.keane@verizon.net> Message-Id: <1454546285-16076-4-git-send-email-erich.keane@verizon.net>	2016-02-07 10:38:17 +02:00
Calle Wilund	e935c9cd34	select_statement: Make sure all aggregate queries use paging Mainly to make sure we respect row limits. Since normal result generation does not for aggregates. Fixes #752 Message-Id: <1452681048-30171-2-git-send-email-calle@scylladb.com>	2016-01-14 19:03:37 +02:00
Tomasz Grabiec	04eb58159a	query: Add schema_version field to read_command	2016-01-11 10:34:51 +01:00
Pekka Enberg	d7db5e91b6	cql3: Move select_statement implementation to source file	2015-12-18 12:59:22 +02:00
Calle Wilund	fdc549cd47	select_statement: Handle aggregate queries Fixes #549. Being clinically absent-minded, aggregate query support (i.e. count(...)) was left out of the "paging" change set. This adds repeated paged querying to do aggregate queries (similar to origin). Uses "batched" paging.	2015-11-11 18:41:47 +01:00
Calle Wilund	ecd7674867	select_statement: Paging Check if paging might be needed, and if so, use a paging object. Similar to origin, but without all the filters.	2015-11-10 13:16:06 +01:00
Calle Wilund	4a1a17defc	cql3::selection: Move result set building visitor to result_set_builder Allows its use (and partial override - hint hint) in more place than one.	2015-11-10 13:12:33 +01:00
Avi Kivity	2c3591cbd9	data_value de-any-fication We use boost::any to convert to and from database values (stored in serialized form) and native C++ values. boost::any captures information about the data type (how to copy/move/delete etc.) and stores it inside the boost::any instance. We later retrieve the real value using boost::any_cast. However, data_value (which has a boost::any member) already has type information as a data_type instance. By teaching data_type instances about the corresponding native type, we can eliminate the use of boost::any. While boost::any is evil and eliminating it improves efficiency somewhat, the real goal is growing native type support in data_type. We will use that later to store native types in the cache, enabling O(log n) access to collections, O(1) access to tuples, and more efficient large blob support.	2015-10-30 17:38:51 +01:00
Pekka Enberg	1890d276b9	cql3: Add depends_on_{keyspace|column_family} helper to cql_statement Signed-off-by: Pekka Enberg <penberg@scylladb.com>	2015-10-15 09:18:52 +03:00
Avi Kivity	d5cf0fb2b1	Add license notices	2015-09-20 10:43:39 +03:00
Paweł Dziepak	1402125bd8	cql3: reverse order of bounds for reversed selects Because of the reverse flag in partition slice rows inside bounds will be returned in reversed order, however, we still have to make sure that the bounds are in the expected order. Signed-off-by: Paweł Dziepak <pdziepak@cloudius-systems.com>	2015-08-13 11:08:20 +02:00
Paweł Dziepak	7a7919a62e	cql3: set properly partition_slice for SELECT DISTINCT Signed-off-by: Paweł Dziepak <pdziepak@cloudius-systems.com>	2015-08-04 15:39:54 +02:00
Avi Kivity	bb3aef7fd9	Merge "Basic schema handling of compact strategy" from Glauber "With this patchset, cqlsh's describe table command now work"	2015-07-23 16:47:29 +03:00
Glauber Costa	d1496944d9	sstables: handle compaction strategy Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>	2015-07-23 00:02:11 -04:00
Paweł Dziepak	281a8d97a5	cql3: make sure enough rows are returned in queries with IN clause In case of queries with IN on partition key, ORDER BY and LIMIT it is not known until after post-query sort which rows should be included in the result set. To make sure that the output is correct each partition specified in IN clause is queried for LIMIT rows and the excess data is trimmed after the results are sorted. Signed-off-by: Paweł Dziepak <pdziepak@cloudius-systems.com>	2015-07-23 02:38:33 +02:00
Paweł Dziepak	b14aa8760a	cql3: support reversed order in select statements When partition_slice option reversed is set the query will already return rows in desired (i.e. reversed) order. That's not true, however, for statements using IN restriction on partition keys. In such case post-query ordering is still needed. Signed-off-by: Paweł Dziepak <pdziepak@cloudius-systems.com>	2015-07-23 02:38:30 +02:00
Gleb Natapov	c58aea07c3	remove storage_proxy::query_local() Use storage_proxy::query() instead.	2015-07-20 12:06:00 +02:00
Paweł Dziepak	f829cf373e	cql3: selection::raw_selector::alias may be null Signed-off-by: Paweł Dziepak <pdziepak@cloudius-systems.com>	2015-07-14 19:34:27 +02:00
Tomasz Grabiec	ad99e84505	storage_proxy: Take schema_ptr in query() It will be needed for reconciliation.	2015-07-12 12:54:38 +02:00
Tomasz Grabiec	9724b84bb3	db: Fix query of partitions with no live clustered rows When partition has no live regular rows, but has some data live in the static row, then it should appear in the results, even though we didn't select any static column. To reproduce: create table cf (k blob, c blob, v blob, s1 blob static, primary key (k, c)); update cf set s1 = 0x01 where k = 0x01; update cf set s1 = 0x02 where k = 0x02; select k from cf; The "select" statement should return 2 rows, but was returning 0. The following query worked fine, because static columns were included: select * from cf; The data query should contain only live data, so we shouldn't write a partition entry if it's supposed to be absent from the results. We can't tell that though until we've processed all the data. To solve this problem, query result writer is using an optimistic approach, where the partition header will be retracted from the buffer (cheaply), if it turns out there's no live data in it.	2015-07-09 19:55:00 +02:00
Calle Wilund	333af5b61a	Implement select_statement::execute_internal	2015-07-06 08:21:15 +02:00
Paweł Dziepak	290a7ca1bf	query: add timestamp to read_command Read command needs a timestamp in order to determine which cells have already expired. Signed-off-by: Paweł Dziepak <pdziepak@cloudius-systems.com>	2015-07-02 17:01:19 +02:00
Gleb Natapov	a338407e29	make storage_proxy object distributed storage_proxy holds per cpu state now to track clustering, so it has to be distributed otherwise smp setup does not work.	2015-06-17 15:14:06 +02:00
Gleb Natapov	b7155ad862	pass partitions_ranges separately from read_command partitions_ranges will be manipulated upon to be split for different destination, so provide it separately from read_command to not copy the latter for each destination.	2015-06-11 15:18:07 +03:00
Calle Wilund	1631ce132e	Add "storage_proxy&" argument to cql_statement::validate To make db, schemas etc reachable	2015-06-03 10:13:52 +02:00
Calle Wilund	1d30b85ac6	Unify column_definition::column_kind and ::column_kind enums	2015-06-02 11:22:41 +02:00
Avi Kivity	db46ced43c	cql3: fix shared_ptr misuse in select_statement A shared_ptr is mutable, so it must be thread_local, not static.	2015-06-01 17:34:00 +02:00
Tomasz Grabiec	731a63e371	schema: Embed raw_schema inside schema Public fields got encapsulated.	2015-04-24 18:01:01 +02:00
Avi Kivity	3d38708434	cql3: pass a database& instance to most foo::raw::prepare() variants To prepare a user-defined type, we need to look up its name in the keyspace. While we get the keyspace name as an argument to prepare(), it is useless without the database instance. Fix the problem by passing a database reference along with the keyspace. This percolates through the class structure, so most cql3 raw types end up receiving this treatment. Origin gets along without it by using a singleton. We can't do this due to sharding (we could use a thread-local instance, but that's ugly too). Hopefully the transition to a visitor will clean this up.	2015-04-20 16:15:34 +03:00
Tomasz Grabiec	ee906471ab	cql3: Move method implementations to .cc	2015-04-15 20:44:59 +02:00
Tomasz Grabiec	00f99cefd4	db: split query.hh to reduce header dependencies	2015-04-15 20:44:59 +02:00
Tomasz Grabiec	878a740b9d	db: Write query results in serialized form This gives about 30% increase in tps in: build/release/tests/perf/perf_simple_query -c1 --query-single-key This patch switches query result format from a structured one to a serialized one. The problems with structured format are: - high level of indirection (vector of vectors of vectors of blobs), which is not CPU cache friendly - high allocation rate due to fine-grained object structure On replica side, the query results are probably going to be serialized in the transport layer anyway, so this change only subtracts work. There is no processing of the query results on replica other than concatenation in case of range queries. If query results are collected in serialized form from different cores, we can concatenate them without copying by simply appending the fragments into the packet. This optimization is not implemented yet. On coordinator side, the query results would have to be parsed from the transport layer buffers anyway, so this also doesn't add work, but again saves allocations and copying. The CQL server doesn't need complex data structures to process the results, it just goes over it linearly consuming it. This patch provides views, iterators and visitors for consuming query results in serialized form. Currently the iterators assume that the buffer is contiguous but we could easily relax this in future so that we can avoid linearization of data received from seastar sockets. The coordinator side could be optimized even further for CQL queries which do not need processing (eg. select * from cf where ...) we could make the replica send the query results in the format which is expected by the CQL binary protocol client. So in the typical case the coordinator would just pass the data using zero-copy to the client, prepending a header. We do need structure for prefetched rows (needed by list manipulations), and this change adds query result post-processing which converts serialized query result into a structured one, tailored particularly for prefetched rows needs. This change also introduces partition_slice options. In some queries (maybe even in typical ones), we don't need to send partition or clustering keys back to the client, because they are already specified in the query request, and not queried for. The query results hold now keys as optional elements. Also, meta-data like cell timestamp and ttl is now also optional. It is only needed if the query has writetime() or ttl() functions in it, which it typically won't have.	2015-04-15 20:44:50 +02:00
Tomasz Grabiec	f9afd82231	cql3: Remove unnecessary indirection to result_set_builder	2015-04-15 20:33:49 +02:00
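
The approach described in commit 281a8d97a5 (query every partition named in the IN clause for LIMIT rows, sort the combined rows, then trim the excess) can be illustrated with a minimal, self-contained sketch. This is not ScyllaDB code: `row`, `query_partition` and `select_in_ordered` are hypothetical names invented for this example, and an in-memory vector stands in for the replicas.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical row type for illustration only; not a ScyllaDB type.
struct row {
    std::string partition_key;
    int clustering_key;
};

// Orders rows by clustering key (the assumed ORDER BY column).
inline bool by_clustering_key(const row& a, const row& b) {
    return a.clustering_key < b.clustering_key;
}

// Hypothetical stand-in for a single-partition read: returns at most
// `limit` rows of partition `pk`, sorted by clustering key.
std::vector<row> query_partition(const std::vector<row>& table,
                                 const std::string& pk,
                                 std::size_t limit) {
    std::vector<row> out;
    for (const auto& r : table) {
        if (r.partition_key == pk) {
            out.push_back(r);
        }
    }
    std::sort(out.begin(), out.end(), by_clustering_key);
    if (out.size() > limit) {
        out.resize(limit);
    }
    return out;
}

// Emulates SELECT ... WHERE pk IN (in_keys) ORDER BY ck LIMIT limit.
// Each partition is asked for `limit` rows, because any one partition
// might supply every row of the final result; the surplus is dropped
// only after the post-query sort.
std::vector<row> select_in_ordered(const std::vector<row>& table,
                                   const std::vector<std::string>& in_keys,
                                   std::size_t limit) {
    std::vector<row> merged;
    for (const auto& pk : in_keys) {
        std::vector<row> part = query_partition(table, pk, limit);
        merged.insert(merged.end(), part.begin(), part.end());
    }
    std::stable_sort(merged.begin(), merged.end(), by_clustering_key);
    if (merged.size() > limit) {
        merged.resize(limit);
    }
    return merged;
}
```

Applying the limit only once, to the concatenated unsorted rows, could drop rows that belong in the final result; that is why the sketch fetches up to `limit` rows per partition before sorting.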

57 Commits