scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-04-26 19:35:12 +00:00

Author	SHA1	Message	Date
Gleb Natapov	6723748aff	Implement speculating read and always speculating read executors	2015-08-23 15:26:49 +03:00
Gleb Natapov	54e1628928	Get configured speculative retry type for read	2015-08-23 15:26:48 +03:00
Gleb Natapov	cf10416786	Implement new_read_repair_decision() function.	2015-08-23 15:26:48 +03:00
Gleb Natapov	5de6759f40	Do not check targets size before calling make_digest_requests() If there are not enough targets make_digest_requests() will return ready future immediately.	2015-08-23 15:26:48 +03:00
Gleb Natapov	6b1669468a	Fix short read problem. Fix https://issues.apache.org/jira/browse/CASSANDRA-2643 same way Origin does it: if short read is detected retry with bigger limit and check again.	2015-08-17 18:11:26 +03:00
Avi Kivity	7a14bcd66e	Merge "API: add get estimated row size histogram to column family" from Amnon "This series cleans the streaming_histogram and the estimated histogram that were imported from origin, it then uses it to get the estimated min and max row estimation in the API."	2015-08-16 17:31:23 +03:00
Gleb Natapov	6f9cc6efe4	fix query::read_command lifetime issue in mutation_result_merger mutation_result_merger can outlive query::read_command, so it has to hold a shared pointer to it instead of a reference. The bug was introduced by `89e36541c3`	2015-08-16 10:59:43 +03:00
Gleb Natapov	89e36541c3	Correctly enforce row limit in mutation_result_merger Currently limit is enforced only on partition boundary, so real result can contain 2*row_limit - 1 rows in the worst case. Fix it by trimming rows from a mutation if only part of its rows fit the requested limit.	2015-08-13 18:28:30 +03:00
Gleb Natapov	987bf33865	storage_proxy: cleanup commented origin code Remove code that was already reimplemented. Makes file navigation much easier.	2015-08-12 16:50:57 +03:00
Gleb Natapov	ea2632e15b	Move overloaded_exception to exceptions.hh	2015-08-12 16:12:09 +03:00
Gleb Natapov	0b3d2de2f1	Fix mutation write timeout exception reporting Make it compatible with CQL specification	2015-08-12 14:58:48 +03:00
Amnon Heiman	c0a52a28bc	Adding the read latency support to the storage proxy This adds the latency histogram support to the storage_proxy. It uses the latency object to mark the operation latency; if there is an impact on performance, it can be changed from all operations to a sample of the operations. Signed-off-by: Amnon Heiman <amnon@cloudius-systems.com>	2015-08-12 13:10:18 +03:00
Gleb Natapov	36c7c2ac5f	Provide correct data_present for read_timeout_exception Fix fixmes.	2015-08-11 19:45:59 +03:00
Gleb Natapov	6046316352	Untemplatize continuation in storage_proxy::query Nothing wrong with it besides that it crashes my Eclipse indexer for some reason.	2015-08-11 19:45:59 +03:00
Calle Wilund	b7c7c97295	StorageProxy: implement mutate_atomically Atomically == add to batch log before doing actual mutate	2015-08-11 17:10:17 +02:00
Nadav Har'El	b6fd3dd623	storage_proxy::make_local_reader: support wrap-around ranges If the requested range wraps around the end of the ring, make_local_reader() ends up trying to create an sstable reader with that wrapping range, which is not supported. Also our shard-looping code is wrong in a wrapping range. The solution is simple: split the wrap-around range into two (from the start of the range until the end of the ring, and then from the beginning of the ring until the end of the range), and read each of these subranges normally. This feature is needed to allow streaming a range that wraps around, because streaming currently uses make_local_reader(). Signed-off-by: Nadav Har'El <nyh@cloudius-systems.com>	2015-08-07 11:38:03 +03:00
Nadav Har'El	6af8fb1d4a	storage_proxy::make_local_reader: simplify lifetime of range parameter make_local_reader takes a partition-range parameter, passed by reference. The meaning of this passing-by-reference is not documented: when make_local_reader returns, which of the following two statements holds? 1. make_local_reader still holds this reference, so the caller is forced to keep the range object alive. For how long? or 2. make_local_reader reads the range before returning, and when it returns, the caller continues to "own" the range object, and is allowed to do anything, including delete it. The principle of least surprise suggests that #2 is better, unless there is a very good reason to choose #1, and unless this requirement to keep the argument alive (and for how long) was clearly documented. But neither is the case here - the overhead of copying the argument is negligible compared to the other overheads of make_local_reader, and nothing was documented. But it turns out the current code did #1 - the address of the range was passed around after make_local_reader returns - the different readers on different cpus continue to use it to eventually find the right sstable byte range to iterate over. It is very easy to forget this and call make_local_reader on an on-stack range variable, and the result is a hard-to-debug use-after-free mess. This patch switches us to situation #2: Before make_local_reader returns, the range is copied to whoever needs to hold it after the return, namely each individual shard_reader. The patch to do this is trivial (one character removed). Signed-off-by: Nadav Har'El <nyh@cloudius-systems.com>	2015-08-07 11:38:03 +03:00
Pekka Enberg	99a80050e3	db: Rename legacy_schema_tables to schema_tables There's nothing legacy about it so rename legacy_schema_tables to schema_tables. The naming comes from a Cassandra 3.x development branch which is not relevant for us in the near future. Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>	2015-08-05 13:56:47 +03:00
Avi Kivity	8d050b679a	db: improve make_local_reader() Instead of merging shard data using make_combined_reader(), take advantage of the fact that shard data is disjoint, and use make_joining_reader(). This removes the need to sort the partitions as they are being read.	2015-08-04 17:11:39 +03:00
Shlomi Livne	b8bcf6d6e7	Fix a bug in mutation response processing in smp Current code passes the storage_proxy instance of the shard on which the message was received instead of using the storage_proxy instance of the shard that sent the request. Signed-off-by: Shlomi Livne <shlomi@cloudius-systems.com>	2015-08-04 11:23:49 +03:00
Tomasz Grabiec	51b3bf38bb	storage_proxy: Introduce make_local_reader()	2015-08-03 15:21:40 +02:00
Tomasz Grabiec	19851fb8db	storage_proxy: Move stop() implementation to source file	2015-08-03 11:52:23 +02:00
Glauber Costa	438a3f8619	storage_proxy: move speculative_retry type to schema It will be stored in the schema, so move there, where it belongs. We'll need to do more than just store a type, so provide a class that encapsulates it. Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>	2015-07-29 18:44:59 -04:00
Avi Kivity	34131b22dd	Merge "Debuggability improvements" from Tomasz	2015-07-28 12:47:33 +03:00
Tomasz Grabiec	c2664c0d46	storage_proxy: Add query logging on log_level::trace	2015-07-28 11:31:08 +02:00
Pekka Enberg	3803e0ed5b	exceptions: Move request_timeout_exception to exceptions.hh Now that consistency level dependency issues are sorted out, move request_timeout_exception to exceptions.hh. Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>	2015-07-28 10:06:18 +03:00
Pekka Enberg	0b8c67ed79	exceptions: Move unavailable_exception to exceptions.hh Move unavailable_exception to exceptions.hh where other CQL transport level exceptions are defined in. Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>	2015-07-28 10:06:18 +03:00
Pekka Enberg	a4225fa7d0	service/storage_proxy: Use proper read_timeout_exception Use the shiny new read_timeout_exception that handles CQL transport protocol encoding properly. Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>	2015-07-27 17:49:52 +03:00
Amnon Heiman	893410b08d	storage_proxy: setting the write timers This patch sets the write timers: histogram, timeout and unavailable. For the histogram a latency is needed. For that the latency object is used. Signed-off-by: Amnon Heiman <amnon@cloudius-systems.com>	2015-07-26 10:57:40 +03:00
Tomasz Grabiec	e5feff5d71	dht: ring_position: Switch to total ordering range::is_wrap_around() and range::contains() rely on total ordering on values to work properly. Current ring_position_comparator was only imposing a weak ordering (token positions equal to all key positions with that token). range::before() and range::after() can't work for weak ordering. If the bound is exclusive, we don't know if user-provided token position is inside or outside. Also, is_wrap_around() can't properly detect wrap around in all cases. Consider this case: (1) ]A; B] (2) [A; B] For A = (tok1) and B = (tok1, key1), (1) is a wrap around and (2) is not. Without total ordering between A and B, range::is_wrap_around() can't tell that. I think the simplest solution is to define a total ordering on ring_position by making token positions positioned either before or after all keys with that token.	2015-07-24 16:08:41 +02:00
Gleb Natapov	e41880fc66	drop reference to storage_proxy from read executors Now when storage_proxy can be accessed globally there is no need to waste memory to save its pointer in each read executor.	2015-07-23 12:32:21 +03:00
Gleb Natapov	00d4974cec	storage_proxy: add logger Uncomment existing logging points.	2015-07-23 12:32:21 +03:00
Gleb Natapov	f122ee39b9	storage_proxy: return proper error codes to transport layer Transport layer expects to get error code in an exception of type exceptions::cassandra_exception. Fix code to use it as a base for all user visible exceptions and put correct error code there.	2015-07-23 12:32:21 +03:00
Avi Kivity	8870bf1bf8	Merge "Handling of non-full partition range queries" from Tomasz	2015-07-22 15:18:02 +03:00
Gleb Natapov	150c28e941	storage_proxy: do not ignore connection errors Do nothing about them for now. Read will eventually fail on timeout.	2015-07-22 13:44:47 +03:00
Gleb Natapov	98fae1a010	storage_proxy: handle read timeout	2015-07-22 13:44:46 +03:00
Avi Kivity	a547b881a7	Merge "Schema pull" from Pekka "This series enables the schema pull functionality. It's used to synchronize schema at node startup for schema changes that happened in the cluster while the node was down. Node 1: # Node 2 is not running. [penberg@nero apache-cassandra-2.1.7]$ ./bin/cqlsh --no-color 127.0.0.1 Connected to Test Cluster at 127.0.0.1:9042. [cqlsh 5.0.1 \| Cassandra 2.2.0 \| CQL spec 3.2.0 \| Native protocol v3] Use HELP for help. cqlsh> CREATE KEYSPACE keyspace3 WITH REPLICATION = { 'class' : 'SimpleStrategy', 'replication_factor' : 1 }; cqlsh> SELECT * FROM system.schema_keyspaces; keyspace_name \| durable_writes \| strategy_class \| strategy_options ---------------+----------------+--------------------------------------------+---------------------------- keyspace3 \| True \| SimpleStrategy \| {"replication_factor":"1"} system \| True \| org.apache.cassandra.locator.LocalStrategy \| {} (2 rows) cqlsh> SELECT key, schema_version FROM system.local; key \| schema_version -------+-------------------------------------- local \| c3a18ddc-80c5-3a25-b82d-57178a318771 (1 rows) Node 2: # Node 2 is started. [penberg@nero apache-cassandra-2.1.7]$ ./bin/cqlsh --no-color 127.0.0.2 Connected to Test Cluster at 127.0.0.2:9042. [cqlsh 5.0.1 \| Cassandra 2.2.0 \| CQL spec 3.2.0 \| Native protocol v3] Use HELP for help. cqlsh> SELECT * FROM system.schema_keyspaces; keyspace_name \| durable_writes \| strategy_class \| strategy_options ---------------+----------------+--------------------------------------------+---------------------------- keyspace3 \| True \| SimpleStrategy \| {"replication_factor":"1"} system \| True \| org.apache.cassandra.locator.LocalStrategy \| {} (2 rows) cqlsh> SELECT key, schema_version FROM system.local; key \| schema_version -------+-------------------------------------- local \| c3a18ddc-80c5-3a25-b82d-57178a318771 (1 rows)"	2015-07-22 13:32:47 +03:00
Gleb Natapov	b737a85f08	storage_proxy: fix get_live_sorted_endpoints() remove_if() does not really remove anything. Fixes #33	2015-07-22 12:14:40 +02:00
Pekka Enberg	d4d2e2fa0e	service/storage_proxy: add get_storage_proxy() helpers Make storage proxy a singleton and add helpers to look up a reference. This makes it easier to convert code from Origin in areas such as storage service. Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>	2015-07-22 11:57:00 +03:00
Pekka Enberg	d81b04f356	service/storage_proxy: Wire up MIGRATION_REQUEST messages Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>	2015-07-22 11:57:00 +03:00
Tomasz Grabiec	6882e6ca50	storage_proxy: Fix range splitting when split point is excluded by lower bound If we had a range (x; ...] then x is excluded, but token iterator was initialized with x. The splitting loop would exit prematurely because it would detect that the token is outside the range. The fix is to teach ring_range() to recognize this and always give tokens which are not smaller than the range's lower bound.	2015-07-22 10:27:48 +02:00
Tomasz Grabiec	c219c3676b	storage_proxy: Avoid unnecessary copy of a partition_range	2015-07-22 10:27:48 +02:00
Tomasz Grabiec	abdb36a0f6	storage_proxy: Fix ordering comparator used during merging and reconciliation	2015-07-22 10:27:48 +02:00
Tomasz Grabiec	9b52f5bf2b	storage_proxy: Make range splitting never produce a wrap around range Origin has no notion of a maximum token so a range without upper bound is represented as (x; min]. The splitting code is supposed to produce only non-wrapping ranges, but (x; min] looks like a wrapping range, so database code which consumes it would have to special-case for it. A simpler solution is to change the splitting code to never produce a wrapping range.	2015-07-22 10:27:48 +02:00
Tomasz Grabiec	f557951554	storage_proxy: Encapsulate access to range tokens The wrappers also take care of cases when each bound is undefined, and return minimum or maximum token respectively, which fixes undefined behavior in get_restricted_ranges(). Alternative solution would be to make partition_range use a different type of range, the one where bounds are always specified. However it's not worth introducing a new range type just for those few users.	2015-07-22 10:27:48 +02:00
Tomasz Grabiec	5587d2ce9a	storage_proxy: Simplify access to storage_proxy::response_id_type storage_proxy::storage_proxy::response_id_type -> storage_proxy::response_id_type	2015-07-22 10:27:48 +02:00
Tomasz Grabiec	0b0ea04958	range: Remove start_value() and end_value() It's easy to miss that they may be undefined. start() and end(), which return optional<bound> const&, make it clear.	2015-07-22 10:27:47 +02:00
Gleb Natapov	c58aea07c3	remove storage_proxy::query_local() Use storage_proxy::query() instead.	2015-07-20 12:06:00 +02:00
Gleb Natapov	640e65b947	storage_proxy: Reconciliation. The beginning The patch introduces reconciliation code. The same code is supposed to work for both range and single key queries. Handling of raw_limit, short reads and read repairs is still very much missing. -- v1->v2: - call live_row_count() only once.	2015-07-16 19:23:58 +03:00
Avi Kivity	c74e36c30e	Merge branch 'master' of github.com:cloudius-systems/urchin into db Conflicts: message/messaging_service.cc message/messaging_service.hh	2015-07-16 12:51:19 +03:00

125 Commits