scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-04-21 17:10:35 +00:00

Author	SHA1	Message	Date
Avi Kivity	c05f60387b	i_partitioner: remove unused function Found by clang.	2017-04-17 23:03:15 +03:00
Tomasz Grabiec	d4b6e430ed	dht: Introduce ring_position_view	2017-03-28 18:10:39 +02:00
Tomasz Grabiec	55a7cceef5	dht: Move comparison logic from ring_position::tri_compare() to ring_position_comparator It will soon define common ordering for many objects, not just ring_position.	2017-03-28 18:10:39 +02:00
Tomasz Grabiec	65a8920b25	dht: Make min/max tokens capturable by reference So that they can be later used in views.	2017-03-28 18:10:39 +02:00
Avi Kivity	141048e0e5	dht: improve token hash function For a small token, we can just return it, since it already is a hash. We hash large tokens using murmur3, which is supposedly a good hash.	2017-01-20 11:24:14 +02:00
Avi Kivity	a1cafed370	storage_proxy: handle range scans of sparsely populated tables When murmur3_partitioner_ignore_msb_bits = 12 (which we'd like to be the default), a scan range can be split into a large number of subranges, each going to a separate shard. With the current implementation, subranges were queried sequentially, resulting in very long latency when the table was empty or nearly empty. Switch to an exponential retry mechanism, where the number of subranges queried doubles each time, dropping the latency from O(number of subranges) to O(log(number of subranges)). If, during an iteration of a retry, we read at most one range from each shard, then partial results are merged by concatentation. This optimizes for the dense(r) case, where few partial results are required. If, during an iteration of a retry, we need more than one range per shard, then we collapse all of a shard's ranges into just one range, and merge partial results by sorting decorated keys. This reduces the number of sstable read creations we need to make, and optimizes for the sparse table case, where we need many partial results, most of which are empty. We don't merge subranges that come from different partition ranges, because those need to be sorted in request order, not decorated key order. [tgrabiec: trivial conflicts] Message-Id: <20161220170532.25173-1-avi@scylladb.com>	2016-12-20 18:32:29 +01:00
Asias He	937f28d2f1	Convert to use dht::partition_range_vector and dht::token_range_vector	2016-12-19 14:08:50 +08:00
Asias He	85034c1b57	Convert to use dht::partition_range	2016-12-19 08:04:30 +08:00
Asias He	d1178fa299	Convert to use dht::token_range	2016-12-19 08:04:29 +08:00
Asias He	463cc4fbde	dht: Introduce split_ranges_to_shards Split a ranges into shard ranges map with ring_position_range_sharder helper.	2016-12-12 09:04:21 +08:00
Asias He	044c4ff44c	dht: Introduce split_range_to_shards Split a range into shard ranges map with ring_position_range_sharder helper.	2016-12-12 09:04:21 +08:00
Duarte Nunes	ada2f1092e	dht: Make i_partitioner::tri_compare pure virtual This patch makes the i_partitioner::tri_compare() function pure virtual as it is overridden by all partitioners. Signed-off-by: Duarte Nunes <duarte@scylladb.com> Message-Id: <20161211172037.16496-1-duarte@scylladb.com>	2016-12-11 19:29:37 +02:00
Duarte Nunes	bb66b051ed	dht: Make i_partitioner::tri_compare memory safe This patch fixes a typo in i_partitioner::tri_compare() where we were using std::max instead of std::min, thus avoiding accessing random memory and getting random results. Signed-off-by: Duarte Nunes <duarte@scylladb.com> Message-Id: <20161211165043.17816-1-duarte@scylladb.com>	2016-12-11 18:58:10 +02:00
Avi Kivity	07d5a20bae	Wire up sharding ignore msb parameter to configuration We might have used a fancy map<sstring, any> to pass the parameters, but that's overkill for now.	2016-11-22 22:40:47 +02:00
Duarte Nunes	66f6a367a4	ring_position_range_sharder: Avoid copying eagerly Signed-off-by: Duarte Nunes <duarte@scylladb.com> Message-Id: <20161104115632.15974-1-duarte@scylladb.com>	2016-11-13 11:42:23 +02:00
Avi Kivity	7202b94183	dht: introduce a sharder for vectors of partition ranges Building on the single-range sharder, add a sharder for vectors of partition ranges. This helps with wrapped ranges, which are translated into a vector containing two shards.	2016-11-03 19:10:20 +02:00
Avi Kivity	43a2380899	dht: add a generator for shard/range pairs Divides a ring_position range into a sequence of shard/range pairs. This allows sequential iteration over shards in ring order. The current multi-partition query executes on all shards in parallel, but this is very wasteful, as most of the data will be thrown away if it is not included in the page. With the generator, we can switch to sequential execution.	2016-11-03 19:10:17 +02:00
Avi Kivity	6320181b97	partitioner: const correctness for comparators	2016-11-03 11:27:40 +02:00
Avi Kivity	a35136533d	Convert ring_position and token ranges to be nonwrapping Wrapping ranges are a pain, so we are moving wrap handling to the edges. Since cql can't generate wrapping ranges, this means thrift and the ring maintenance code; also range->ring transformations need to merge the first and last ranges. Message-Id: <1478105905-31613-1-git-send-email-avi@scylladb.com>	2016-11-02 21:04:11 +02:00
Duarte Nunes	aaa76d58ba	query: Move to_partition_range to dht namespace This patch moves to_partition_range, from the query namespace to the dht namespace, where it is a more natural fit. Signed-off-by: Duarte Nunes <duarte@scylladb.com> Message-Id: <1468498060-19251-1-git-send-email-duarte@scylladb.com>	2016-07-15 10:41:52 +02:00
Asias He	f4389349e4	config: Enable partitioner option Enable --partitioner option so that user can choose partitioner other than the default Murmur3Partitioner. Currently, only Murmur3Partitioner and ByteOrderedPartitioner are supported. When non-supported partitioner is specifed, error will be propogated to user.	2016-07-08 17:44:55 +08:00
Pekka Enberg	38a54df863	Fix pre-ScyllaDB copyright statements People keep tripping over the old copyrights and copy-pasting them to new files. Search and replace "Cloudius Systems" with "ScyllaDB". Message-Id: <1460013664-25966-1-git-send-email-penberg@scylladb.com>	2016-04-08 08:12:47 +03:00
Gleb Natapov	775cc93880	remove unused range and token serializers	2016-02-02 12:15:49 +02:00
Gleb Natapov	043d132ba9	Remove no longer used serializers.	2016-01-24 12:45:41 +02:00
Asias He	6259dd73b6	token: Print token using the partitioner defined method Make nodetool ring output align with c* output.	2015-10-29 15:53:47 +02:00
Raphael S. Carvalho	936575efd2	dht: introduce comparator for token Signed-off-by: Raphael S. Carvalho <raphaelsc@cloudius-systems.com>	2015-10-15 19:34:50 -03:00
Avi Kivity	d5cf0fb2b1	Add license notices	2015-09-20 10:43:39 +03:00
Glauber Costa	e1968c389e	dht: use tri_compare for token comparisons Loading data from memory tends to be the most expensive part of the comparison operations. Because we don't have a tri_compare function for tokens, we end up having to do an equality test, which will load the token's data in memory, and then, because all we know is that they are not equal, we need to do another one. Having two dereferences is harmful, and shows up in my simple benchmark. This is because before writing to sstables, we must order the keys in decorated key order, which is heavy on the comparisons. The proposed change speeds up index write benchmark by 8.6%: Before: 41458.14 +- 1.49 partitions / sec (30 runs) After: 45020.81 +- 3.60 partitions / sec (30 runs) Parameters: --smp 6 --partitions 500000 Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>	2015-08-12 09:23:42 -05:00
Tomasz Grabiec	593fa725e9	dht: Relax dependencies on bytes const& In preparation to switching to a bytes container which is not "bytes" switch to bytes_view.	2015-08-06 14:05:16 +02:00
Avi Kivity	f915ff1fcd	dht: introduce i_partitioner::shard_of() and implement msb sharding Make sharding partitioner-specific, since different partitioners interpret the byte content differently. Implement it by extracting the shard from the most significant bits, which can be used to minimize cross shard traffic for range queries, and reduces sstable sharing.	2015-08-03 20:17:40 +03:00
Tomasz Grabiec	e5feff5d71	dht: ring_position: Switch to total ordering range::is_wrap_around() and range::contains() rely on total ordering on values to work properly. Current ring_position_comparator was only imposing a weak ordering (token positions equal to all key positions with that token). range::before() and range::after() can't work for weak ordering. If the bound is exclusive, we don't know if user-provided token position is inside or outside. Also, is_wrap_around() can't properly detect wrap around in all cases. Consider this case: (1) ]A; B] (2) [A; B] For A = (tok1) and B = (tok1, key1), (1) is a wrap around and (2) is not. Without total ordering between A and B, range::is_wrap_around() can't tell that. I think the simplest soution is to define a total ordering on ring_position by making token positions positioned either before or after all keys with that token.	2015-07-24 16:08:41 +02:00
Tomasz Grabiec	2e845140e2	dht: ring_position: Implement less_compare() and equals() using tri_compare()	2015-07-24 16:08:41 +02:00
Tomasz Grabiec	7b30e8fcff	dht: ring_position: Move definitions out of line	2015-07-24 16:08:41 +02:00
Asias He	243dbd3bfd	dht: Reuse token serializer in ring_position	2015-07-23 09:08:15 +08:00
Asias He	1761be4dc4	dht: Implement serializer interface for token It is needed by query::range<token>.	2015-07-23 09:08:15 +08:00
Gleb Natapov	253ba71747	add comparison functions for ring_position Two ring_positions are equal if tokens and keys are equal or tokens are equal and one or both of them do not specify key. So ring_positions without a key is a wildcard that equals any ring_positions with the same token.	2015-07-15 12:41:31 +03:00
Gleb Natapov	49fb10d640	improve token printout Add min/max token printout	2015-07-14 12:21:42 +03:00
Paweł Dziepak	351b113913	dht: allow configuration file to choose partitioner Signed-off-by: Paweł Dziepak <pdziepak@cloudius-systems.com>	2015-07-12 15:14:53 +02:00
Tomasz Grabiec	d035c499b8	db: Move database::shard_of() to dht::shard_of()	2015-07-07 16:56:25 +02:00
Gleb Natapov	730170ff1a	serialize data structures needed for read clustering	2015-07-01 13:36:28 +03:00
Tomasz Grabiec	ac333f04e2	dht: Make decorated_key comparable with ring_position	2015-06-25 18:45:12 +02:00
Tomasz Grabiec	9525464f74	Move query::ring_position to dht::ring_position	2015-06-25 18:45:12 +02:00
Avi Kivity	f8e2c13933	dht: rename midpoint() to midpoint_unsigned() midpoint()'s algorithm only works for unsigned tokens, so rename it accordingly.	2015-06-25 14:54:40 +03:00
Avi Kivity	d8946d07ed	dht: fix token wraparound in midpoint() midpoint(l, r) where l > r needs to wrap around the end of the ring. Adjust the midpoint() function to do this. Note this is still broken for the murmur3 partitioner, since it doesn't treat tokens as unsigned.	2015-06-25 13:55:46 +03:00
Asias He	9649414011	dht: Align token print Before: token=df 96 79 87 21 b2 ed 80 token=5c 98 e a0 4f 5e 28 6b After: token=df 96 79 87 21 b2 ed 80 token=5c 98 0e a0 4f 5e 28 6b	2015-06-04 17:12:10 +08:00
Glauber Costa	6a8049dce1	dht: add maximum_token Analogous to minimum_token. Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>	2015-05-21 16:27:35 -04:00
Tomasz Grabiec	aec740f895	db: Make decorated_key have ordering compatible with Origin	2015-04-30 12:02:39 +02:00
Tomasz Grabiec	71041eb0d6	dht: Implement operator<< for decorated_key	2015-04-24 18:01:01 +02:00
Tomasz Grabiec	841a13da93	dht: Implement operator!= for decorated_key	2015-04-24 18:01:01 +02:00
Gleb Natapov	02fb270fbe	token operator<<	2015-04-19 10:15:14 +03:00

1 2

53 Commits