scylladb

Author	SHA1	Message	Date
Tomasz Grabiec	0c84f00b16	query: Fix invalid initialization of _memory_tracker by moving-from-self Fixes the following UBSAN warning: core/semaphore.hh:293:74: runtime error: reference binding to misaligned address 0x0000006c55d7 for type 'struct basic_semaphore', which requires 8 byte alignment Since the field was not initialized properly, probably also fixes some user-visible bug. Message-Id: <1488368222-32009-1-git-send-email-tgrabiec@scylladb.com>	2017-03-01 11:38:28 +00:00
Avi Kivity	4667641f5f	result_memory_tracker: fix too-short short reads 1.6 truncates paged queries early to avoid overrunning server memory with too-large query results, but in the case of partition range queries, this terminates too early due to an uninitialized variable holding the maximum result size. This results in slow performance due to additional round trips. Fix by initializing the maximum result size from the result_memory_tracker running on the coordinating shard. Fixes #1995. Message-Id: <20170105103915.10633-1-avi@scylladb.com>	2017-01-05 10:51:55 +00:00
Paweł Dziepak	e6d27ac529	query: introduce result_memory_accounter::foreign_state Range queries used to be performed sequentially and the shard performing part of the read was reading state of the merger's memory accounter directly. Now, they may be performed in parallel so it is safer to just pass relevant data by value to the interested shards so that they are not reading something that another shard is modifying at the same time. Since query is done in parallel there is a chance of overread. However, the parallelism is high only in sparsely populated tables and that's when the overread is a less serious problem.	2016-12-22 17:16:24 +01:00
Paweł Dziepak	a0523df8d6	result_memory_limiter: add accounter for digest reads Digest reads differ from data reads in a way that they do not really consume any memory. We still want them to stop in the same place that data reads would, but the per-shard semaphore shouldn't be updated by them.	2016-12-22 13:35:04 +01:00
Paweł Dziepak	aa083d3d85	result_memory_limiter: split new_read() to new_{data, mutation}_read() For data queries it is very important that all replicas get limited in the same place (this includes replicas returning only digest). That's why they shouldn't be affected by per-shard result memory limit. Moreover, we should make sure that individual memory limits are the same, making the coordinator provide it for replicas which allow to safely change it in the future. Mutation queries are not as sensitive but it is still beneficial to make sure that all replicas use the same individual limit.	2016-12-22 13:35:04 +01:00
Duarte Nunes	9572c19dc6	storage_proxy: Don't fetch superfluous partitions This patch ensures we keep track of how many partitions we've queried so we don't ask for more than the number we need. Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2016-12-15 10:27:46 +00:00
Duarte Nunes	93be8d7cef	query::result: Add partition count This patch adds a partition count to query::result, filled by the query::result::builder. The partition count is present whenever the result carries data, being absent only for the case where the result contains only a digest. We also ensure that counts are present for an empty query::result. Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2016-12-15 10:27:46 +00:00
Paweł Dziepak	cfd4d0f680	db: add metrics for short reads and memory used for results Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>	2016-12-14 14:28:36 +00:00
Paweł Dziepak	15de8de9e5	reconcilable_result: keep result_memory_tracker object Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>	2016-12-14 14:10:02 +00:00
Paweł Dziepak	ee89d80d5c	query: add result size limiter This patch introduces an infrastructure for limiting result size. There is a shard-local limit which makes sure that all results combined do not use more than 10% of the shard memory. There is also an individual limit which restricts a result to 4 MB. In order to avoid sending tiny results there is a minimum guaranteed size (4 kB), which the query needs to reserve before it starts producing the result. Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>	2016-12-14 14:10:02 +00:00
Paweł Dziepak	da7ca85040	query: allow short reads When paging is used the cluster is allowed to return fewer rows than the client asked for. However, if such possibility is used we need a way of telling that to the coordinator and the paging implementation so that they can differentiate between short reads caused by the replica running out of data to send and short reads caused by any other means. Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>	2016-12-14 14:10:01 +00:00
Paweł Dziepak	cb2a557cf7	query::result: reduce chunk count Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>	2016-08-22 09:31:33 +01:00
Gleb Natapov	1e6f64f4ab	query: add latest modification timestamp to result structure	2016-05-24 13:27:34 +03:00
Gleb Natapov	db322d8f74	query: put live row count into query::result The patch calculates row count during result building and while merging. If one of results that are being merged does not have row count the merged result will not have one either.	2016-05-02 15:10:15 +03:00
Gleb Natapov	15ebe5e4e5	query: add calculate_row_count function to query::result	2016-04-14 19:26:00 +03:00
Gleb Natapov	f47b2dad18	query: add lazy printer to query::result query::result transformation to printable form is a very heavy operation that allocates memory and thus can fail. Add a class to query::result that can be used with logger to push to string conversion when output is performed.	2016-04-14 19:26:00 +03:00
Pekka Enberg	38a54df863	Fix pre-ScyllaDB copyright statements People keep tripping over the old copyrights and copy-pasting them to new files. Search and replace "Cloudius Systems" with "ScyllaDB". Message-Id: <1460013664-25966-1-git-send-email-penberg@scylladb.com>	2016-04-08 08:12:47 +03:00
Paweł Dziepak	21e2ebcf8c	query: build only result, only digest or both Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>	2016-03-11 18:27:13 +00:00
Paweł Dziepak	46079f763b	query: add keys and tombstones to result digest Query result digest is used to verify that all replicas have the same data. Therefore, it needs to contain more information than the query result itself in order to ensure proper detection of disagreements. Generally, adding clustering keys to the digest regardless of whether the client asked for them will guarantee correctness. However, adding tombstones as well improves the chances of early detection of nodes containing stale data. Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>	2016-03-11 18:27:13 +00:00
Paweł Dziepak	3efb10bd08	result.idl: keep digest together with result Result digest is going to be computed in query result builder and require information not available in the query result. That's why the digest now needs to be sent to the other nodes together with the result as they won't be able to compute it on their own. Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>	2016-03-11 18:27:13 +00:00
Paweł Dziepak	bdc23ae5b5	remove db/serializer.hh includes Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>	2016-03-02 09:07:09 +00:00
Tomasz Grabiec	6cec131432	query: Switch to IDL-generated views and writers The query result footprint for cassandra-stress mutation as reported by tests/memory-footprint increased by 18% from 285 B to 337 B. perf_simple_query shows slight regression in throughput (-8%): build/release/tests/perf/perf_simple_query -c4 -m1G --partitions 100000 Before: ~433k tps After: ~400k tps	2016-02-26 12:26:13 +01:00
Tomasz Grabiec	916a91c913	query: Split send_timestamp_and_expiry into two separate options It's cleaner that way. They don't need to come together.	2016-02-15 16:53:56 +01:00
Gleb Natapov	b4b560e0fc	change result_digest to hold std::array instead of a std::vector Digest size is fixed, so no need to use std::vector to hold it. Message-Id: <20160203102530.GU6705@scylladb.com>	2016-02-03 12:27:39 +02:00
Gleb Natapov	31bb194c21	Remove old result_digest serializer	2016-02-02 12:15:50 +02:00
Gleb Natapov	10cd4d948c	Move result_digest to idl	2016-02-02 12:15:50 +02:00
Gleb Natapov	ab6703f9bc	Remove old query::result serializer	2016-01-24 12:45:41 +02:00
Tomasz Grabiec	dd51ff0410	query: Make query::result movable	2015-12-16 18:06:54 +01:00
Tomasz Grabiec	d64db98943	query: Convert serialization of query::result to use db::serializer<> That's what we're trying to standardize on. This patch also fixes an issue with current query::result::serialize() not being const-qualified, because it modifies the buffer. messaging_service did a const cast to work this around, which is not safe.	2015-12-03 09:19:11 +01:00
Avi Kivity	d5cf0fb2b1	Add license notices	2015-09-20 10:43:39 +03:00
Gleb Natapov	ab5f52fde3	storage_proxy: lazily calculate digest from data results during query Do not calculate digest from data on arrival, do it during digest matching check, also skip it entirely if there is only one digest to match.	2015-09-16 17:40:22 +03:00
Tomasz Grabiec	d9e6f0d1da	query: Introduce query::result::pretty_print()	2015-07-28 11:31:08 +02:00
Tomasz Grabiec	a03fc3549b	query-result: Add missing include	2015-07-09 18:53:03 +02:00
Tomasz Grabiec	f46b7a815e	query: Fix typos in comments	2015-07-02 13:25:46 +02:00
Gleb Natapov	4b9661c608	initial read clustering code Works only if all replicas (participating in CL) have the same live data. Does not detect mismatch in tombstones (no infrastructure yet). Does not report timeout yet.	2015-07-01 13:36:30 +03:00
Gleb Natapov	730170ff1a	serialize data structures needed for read clustering	2015-07-01 13:36:28 +03:00
Gleb Natapov	3d3d3a8627	implement query::result::digest()	2015-07-01 13:35:57 +03:00
Tomasz Grabiec	5ba1486ae7	db: Rename "ttl" to "expiry" when it's used as time point To avoid confusion with "ttl" the duration.	2015-05-06 17:27:22 +02:00
Tomasz Grabiec	8d6b93d787	query: Document intention behind query results format	2015-04-19 10:07:02 +03:00
Tomasz Grabiec	00f99cefd4	db: split query.hh to reduce header dependencies	2015-04-15 20:44:59 +02:00

40 Commits