Since these two functions call each other, convert
to coroutines and eliminate the dependency on `seastar::async`
for both of them at the same time.
Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>
Since `real_mark_alive` does not require `seastar::async`
now, we can eliminate the wrapping async call, as well.
Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>
Snapshot-ctl methods fetch information about snapshots from
column family objects. The problem with this is that we get rid
of these objects once the table gets dropped, while the snapshots
might still be present (the auto_snapshot option is specifically
made to create this kind of situation). This commit switches from
relying on the column family interface to scanning every datadir
that the database knows of in search of "snapshots" folders.
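The datadir scan described above can be sketched in plain C++ with `std::filesystem` (a minimal illustration, assuming the usual `<datadir>/<keyspace>/<table>/snapshots` layout; the real implementation uses Seastar's asynchronous file APIs):

```cpp
#include <cassert>
#include <filesystem>
#include <string>
#include <vector>

// Hypothetical sketch: instead of asking live table objects, walk every
// known data directory and collect any "snapshots" subdirectories found
// under the per-table directories. This survives table drops, since a
// dropped table leaves its snapshot directories on disk.
std::vector<std::filesystem::path>
find_snapshot_dirs(const std::vector<std::filesystem::path>& datadirs) {
    std::vector<std::filesystem::path> found;
    for (const auto& datadir : datadirs) {
        if (!std::filesystem::exists(datadir)) {
            continue;
        }
        for (const auto& entry :
             std::filesystem::recursive_directory_iterator(datadir)) {
            if (entry.is_directory() &&
                entry.path().filename() == "snapshots") {
                found.push_back(entry.path());
            }
        }
    }
    return found;
}
```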
This PR is a rebased version of #9539 (and slightly cleaned-up, cosmetically)
and so it replaces the previous PR.
Fixes#3463
Closes#7122
Closes#9884
* github.com:scylladb/scylla:
snapshots: Fix snapshot-ctl to include snapshots of dropped tables
table: snapshot: add debug messages
Following the advice in the FIXME note, helper functions for parsing
expressions are now based on string views to avoid a few unnecessary
conversions to std::string.
Tests: unit(dev)
Closes#10013
In commit d72465531e we fixed the building
of relocatable packages of submodules (tools/java, etc.) to use the
top-level Scylla's version. However, if on an active working directory
Scylla's version changes - as we just did from 4.7 to 5.0 - these
relocatable packages are not rebuilt with the new version number, and as
a result some of our scripts (such as the docker build) can't find them.
Because the build-submodule-reloc rule depends on the files
build/SCYLLA-{PRODUCT,VERSION,RELEASE}-FILE (which is what the
aforementioned commit did), in this patch we add those files as a
dependency whenever build-submodule-reloc is used. This means that if
any of these files change, we rebuild the relocatable packages and
anything depending on them (e.g., Debian packages).
Fixes#10018.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20220202131248.1610678-1-nyh@scylladb.com>
which is currently unhandled at multiple call sites, leading to the following warning
as seen in https://jenkins.scylladb.com/view/master/job/scylla-master/job/dtest-release/1094/artifact/logs-all.release.2/1643794928169_materialized_views_test.py%3A%3ATestInterruptBuildProcess%3A%3Atest_interrupt_build_process_and_resharding_half_to_max_test/node2.log
```
Scylla version 5.0.dev-0.20220201.a026b4ef4 with build-id cebf6dca8edd8df843a07e0f01a1573f1d0a6dfc starting ...
WARN 2022-02-02 09:31:56,616 [shard 2] seastar - Exceptional future ignored: seastar::sleep_aborted (Sleep is aborted), backtrace: 0x463b65e 0x463bb50 0x463be58 0x426c165 0x230c744 0x42adad4 0x42aeea7 0x42cdb55 0x4281a2a /jenkins/workspace/scylla-master/dtest-release/scylla/.ccm/scylla-repository/a026b4ef490074df0d31d4b0ed9189d0cfaa745e/scylla/libreloc/libpthread.so.0+0x9298 /jenkins/workspace/scylla-master/dtest-release/scylla/.ccm/scylla-repository/a026b4ef490074df0d31d4b0ed9189d0cfaa745e/scylla/libreloc/libc.so.6+0x100352
--------
seastar::continuation<seastar::internal::promise_base_with_type<void>, seastar::future<void>::finally_body<serialized_action::trigger(bool)::{lambda()#2}, false>, seastar::future<void>::then_wrapped_nrvo<seastar::future<void>, seastar::future<void>::finally_body<serialized_action::trigger(bool)::{lambda()#2}, false> >(seastar::future<void>::finally_body<serialized_action::trigger(bool)::{lambda()#2}, false>&&)::{lambda(seastar::internal::promise_base_with_type<void>&&, seastar::future<void>::finally_body<serialized_action::trigger(bool)::{lambda()#2}, false>&, seastar::future_state<seastar::internal::monostate>&&)#1}, void>
```
Decoded:
```
void seastar::backtrace(seastar::current_backtrace_tasklocal()::$_3&&) at ./build/release/seastar/./seastar/include/seastar/util/backtrace.hh:59
(inlined by) seastar::current_backtrace_tasklocal() at ./build/release/seastar/./seastar/src/util/backtrace.cc:86
seastar::current_tasktrace() at ./build/release/seastar/./seastar/src/util/backtrace.cc:137
seastar::current_backtrace() at ./build/release/seastar/./seastar/src/util/backtrace.cc:170
seastar::report_failed_future(std::__exception_ptr::exception_ptr const&) at ./build/release/seastar/./seastar/src/core/future.cc:210
(inlined by) seastar::report_failed_future(seastar::future_state_base::any&&) at ./build/release/seastar/./seastar/src/core/future.cc:218
seastar::future_state_base::any::check_failure() at ././seastar/include/seastar/core/future.hh:567
(inlined by) seastar::future_state::clear() at ././seastar/include/seastar/core/future.hh:609
(inlined by) ~future_state at ././seastar/include/seastar/core/future.hh:614
(inlined by) ~future at ././seastar/include/seastar/core/scheduling.hh:43
(inlined by) void seastar::futurize >::satisfy_with_result_of::then_wrapped_nrvo, seastar::future::finally_body >(seastar::future::finally_body&&)::{lambda(seastar::internal::promise_base_with_type&&, serialized_action::trigger(bool)::{lambda()#2}&, seastar::future_state&&)#1}::operator()(seastar::internal::promise_base_with_type, seastar::internal::promise_base_with_type&&, seastar::future_state::finally_body&&::monostate>) const::{lambda()#1}>(seastar::internal::promise_base_with_type, seastar::future::finally_body&&) at ././seastar/include/seastar/core/future.hh:2120
(inlined by) operator() at ././seastar/include/seastar/core/future.hh:1667
(inlined by) seastar::continuation, seastar::future::finally_body, seastar::future::then_wrapped_nrvo, serialized_action::trigger(bool)::{lambda()#2}>(serialized_action::trigger(bool)::{lambda()#2}&&)::{lambda(seastar::internal::promise_base_with_type&&, serialized_action::trigger(bool)::{lambda()#2}&, seastar::future_state&&)#1}, void>::run_and_dispose() at ././seastar/include/seastar/core/future.hh:767
seastar::reactor::run_tasks(seastar::reactor::task_queue&) at ./build/release/seastar/./seastar/src/core/reactor.cc:2344
(inlined by) seastar::reactor::run_some_tasks() at ./build/release/seastar/./seastar/src/core/reactor.cc:2754
seastar::reactor::do_run() at ./build/release/seastar/./seastar/src/core/reactor.cc:2923
operator() at ./build/release/seastar/./seastar/src/core/reactor.cc:4128
(inlined by) void std::__invoke_impl(std::__invoke_other, seastar::smp::configure(seastar::smp_options const&, seastar::reactor_options const&)::$_100&) at /usr/lib/gcc/x86_64-redhat-linux/11/../../../../include/c++/11/bits/invoke.h:61
(inlined by) std::enable_if, void>::type std::__invoke_r(seastar::smp::configure(seastar::smp_options const&, seastar::reactor_options const&)::$_100&) at /usr/lib/gcc/x86_64-redhat-linux/11/../../../../include/c++/11/bits/invoke.h:111
(inlined by) std::_Function_handler::_M_invoke(std::_Any_data const&) at /usr/lib/gcc/x86_64-redhat-linux/11/../../../../include/c++/11/bits/std_function.h:291
std::function::operator()() const at /usr/lib/gcc/x86_64-redhat-linux/11/../../../../include/c++/11/bits/std_function.h:560
(inlined by) seastar::posix_thread::start_routine(void*) at ./build/release/seastar/./seastar/src/core/posix.cc:60
```
This series adds exception handling to serialized action triggers
that currently don't handle exceptions.
Test: unit(dev)
* tag 'handle-serialized_action-trigger-exception-v1' of https://github.com/bhalevy/scylla:
migration_manager: passive_announce(version): handle exception
view_builder: do_build_step: handle unexpected exceptions
storage_service: no need to include utils/serialized_action.hh
Fixes#10020
The previous fix 445e1d3 tried to close one double invocation, but added
another, since it failed to ensure that all potential nullings of the
optional shared_future happened before a new allocation could reset it.
This simplifies the code by making clearing the shared_future a
prerequisite for resolving its contents (as read by waiters).
It also removes any need for try-catch blocks.
Closes#10024
If the Docker startup script is passed both "--alternator-port" and
"--alternator-https-port", a combination which is supposed to be
allowed, it passes to Scylla the "--alternator-address" option twice.
This isn't necessary, and worse - not allowed.
So this patch fixes the scyllasetup.py script to only pass this
parameter once.
Fixes#10016.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20220202165814.1700047-1-nyh@scylladb.com>
Exceptions are handled by do_build_step in principle,
yet if an unhandled exception escapes handling
(e.g. get_units(_sem, 1) fails on a broken semaphore),
we should warn about it, since the _build_step.trigger() calls
do not handle exceptions.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
This series adds methods to perform offstrategy compaction, if needed, returning a future<bool>
so the caller can wait on it until compaction completes.
The returned value is true iff offstrategy compaction was needed.
The added keyspace_offstrategy_compaction calls perform_offstrategy_compaction on the specified keyspace and tables, returning the number of tables that required offstrategy compaction.
A respective unit test was added to the rest_api pytest.
This PR replaces https://github.com/scylladb/scylla/pull/9095, which suggested adding an option to `keyspace_compaction`,
since offstrategy compaction triggering logic is different enough from major compaction to merit a new API.
Test: unit (dev)
Closes#9980
* github.com:scylladb/scylla:
test: rest_api: add unit tests for keyspace_offstrategy_compaction api
api: add keyspace_offstrategy_compaction
compaction_manager: get rid of submit_offstrategy
table: add perform_offstrategy_compaction
compaction_manager: perform_offstrategy: print ks.cf in log messages
compaction_manager: allow waiting on offstrategy compaction
A recent restructuring of the startup of Alternator (and also other
protocol servers) led to incorrect error-handling behavior during
startup: If an error was detected on one of the shards of the sharded
service (in alternator/server.cc), the sharded service itself was never
stopped (in alternator/controller.cc), leading to an assertion failure
instead of the desired error message.
A common example of this problem is when the requested port for the
server was already taken (this was issue #9914).
So in this patch, exception handling is removed from server.cc - the
exception will propagate to the code in controller.cc, which will
properly stop the server (including the sharded services) before
returning.
Fixes#9914.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20220130131709.1166716-1-nyh@scylladb.com>
Snapshot-ctl methods fetch information about snapshots from
column family objects. The problem with this is that we get rid
of these objects once the table gets dropped, while the snapshots
might still be present (the auto_snapshot option is specifically
made to create this kind of situation). This commit switches from
relying on the column family interface to scanning every datadir
that the database knows of in search of "snapshots" folders.
Fixes#3463
Closes#7122
Closes#9884
Signed-off-by: Piotr Wojtczak <piotr.m.wojtczak@gmail.com>
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Added test that checks if a SELECT COUNT(*) query was transformed and
processed in a parallel way. Checking is done by looking at the cql
statistics and comparing subsequent counts of parallelized aggregation
SELECT query executions.
Coordinators processed each vnode sequentially on shards when executing
a `forward_request` sent by super-coordinator. This commit changes this
behavior and parallelizes execution of `forward_request` across shards.
It does that by adding an additional layer of dispatching to
`forward_service`. When a coordinator receives a `forward_request`, it
forwards it to each of its shards. Shards slice `forward_request`'s
partition ranges so that they will only query data that is owned by
them. Implementation of slicing partition ranges was based on @nyh's
`token_ranges_owned_by_this_shard` from `alternator/ttl.cc`.
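The per-shard slicing can be illustrated with a toy ownership function (a sketch only; the real ownership comes from Scylla's sharder and `token_ranges_owned_by_this_shard`, not a modulo):

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Toy illustration of the slicing idea: each shard keeps only the tokens
// it owns out of the request's range, so running the sliced requests on
// all shards covers the original range exactly once, with no overlap.
std::vector<uint64_t>
slice_tokens_for_shard(const std::vector<uint64_t>& tokens,
                       unsigned shard, unsigned smp_count) {
    std::vector<uint64_t> owned;
    for (auto t : tokens) {
        if (t % smp_count == shard) {   // stand-in ownership function
            owned.push_back(t);
        }
    }
    return owned;
}
```

Summing the per-shard slices over all shards reproduces the full token set, which is the covering property the commit relies on.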
Detect whether a statement is a count(*) query at prepare time. If so,
instantiate a new `select_statement` subclass -
`parallelized_select_statement`. This subclass has different execution
logic that enables it to distribute count(*) queries across a cluster.
Also, a new counter was added - `select_parallelized` that counts the
number of parallelized aggregation SELECT query executions.
The new service is responsible for:
* spreading forward_request execution across multiple nodes in the cluster
* collecting forward_request execution results and merging them
The `forward_service::dispatch` method takes a forward_request as an
argument and forwards its execution to a group of other nodes (using the rpc
verb added in previous commits). Each node (in the group chosen by the
dispatch method) is provided with a forward_request, which is no different
from the original argument except for changed partition ranges. They are
changed so that the vnodes contained in them are owned by the recipient node.
Executing forward_request is realized in `forward_service::execute`
method, that is registered to be called on FORWARD_REQUEST verb receipt.
The process of executing a forward_request consists of mocking a few
non-serializable objects (such as `cql3::selection`) in order to create
`service::pager::query_pagers::pager` and `cql3::selection::result_set_builder`.
After pager and result_set_builder creation, execution process resembles
what might be seen in select_statement's execution path.
Except for the verb addition, this commit also defines forward_request
and forward_result structures, used as an argument and result of the new
rpc. forward_request is used to forward information about select
statement that does count(*) (or other aggregating functions such as
max, min, avg in the future). Due to the inability to serialize
cql3::statements::select_statement, I chose to include
query::read_command, dht::partition_range_vector and some configuration
options in forward_request. They can be serialized and are sufficient
to allow creation of a service::pager::query_pagers::pager.
The way that this detection works is a bit clunky, but it does its job
for the simplest cases, e.g. "SELECT COUNT(*) FROM ks.t". It fails when
there are multiple selectors, or when there is a column name specified
("SELECT COUNT(column_name) FROM ks.t").
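The shape of that check can be sketched on the query string (a rough stand-in only; the real detection inspects the prepared selection, not raw text):

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <string>

// Very rough stand-in for the prepare-time check described above: only
// the single-selector COUNT(*) form qualifies for parallelization;
// COUNT(column) or multiple selectors do not.
bool is_parallelizable_count(std::string q) {
    std::transform(q.begin(), q.end(), q.begin(),
                   [](unsigned char c) { return std::toupper(c); });
    auto sel_begin = q.find("SELECT ");
    auto sel_end = q.find(" FROM ");
    if (sel_begin != 0 || sel_end == std::string::npos) {
        return false;
    }
    std::string selectors = q.substr(7, sel_end - 7);
    return selectors == "COUNT(*)";   // exactly one selector, star form
}
```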
The system.config table allows changing config parameters, but this
change doesn't survive restarts and is considered to be dangerous
(sometimes). Add an option to disable the table updates. The option
is LiveUpdate and can be set to false via CQL too (once).
fixes#9976
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Message-Id: <20220201121114.32503-1-xemul@scylladb.com>
Currently, cloud-related code has cross-dependencies between
scylla and scylla-machine-image.
This is not a good way to implement it, and a single change can break both
packages.
To resolve the issue, we need to move all cloud-related code to
scylla-machine-image and remove it from the scylla repository.
Change list:
- move cloud part of scylla_util.py to scylla-machine-image
- move cloud part of scylla_io_setup to scylla-machine-image
- move scylla_ec2_check to scylla-machine-image
- move cloud part of scylla_bootparam_setup to scylla-machine-image
Closes#9957
When forwarding a reconfiguration request from follower to a leader in
`modify_config`, there is no reason to wait for the follower's commit
index to be updated. The only useful information is that the leader
committed the configuration change - so `modify_config` should return as
soon as we know that.
There is a reason *not* to wait for the follower's commit index to be
updated: if the configuration change removes the follower, the follower
will never learn about it, so a local waiter will never be resolved.
`execute_modify_config` - the part of `modify_config` executed on the
leader - is thus modified to finish when the configuration change is
fully complete (including the dummy entry appended at the end), and
`modify_config` - which does the forwarding - no longer creates a local
waiter, but returns as soon as the RPC call to the leader confirms that
the entry was committed on the leader.
We still return an `entry_id` from `execute_modify_config` but that's
just an artifact of the implementation.
Fixes#9981.
A regression test was also added in randomized_nemesis_test.
* kbr/modify-config-finishes-v1:
test: raft: randomized_nemesis_test: regression test for #9981
raft: server: don't create local waiter in `modify_config`
The call to `raft::server::add_entry` in `announce_with_raft` may fail
e.g. due to a leader change happening when we try to commit the entry.
In cases like this it makes sense to retry the command so we don't
prematurely report an error to the client.
This may result in double application of the command. Fortunately, the schema
change command is idempotent thanks to the group 0 state ID mechanism
(originally used to prevent conflicting concurrent changes from happening).
Indeed, once a command passes the state ID check, it changes the group 0
history last state ID, causing all later applications of that same
command to fail the check. Similarly, once a command fails the state ID
check, it means that the last state ID is different than the one
observed when the command was being constructed, so all further
applications of the command will also fail the check (it is not possible
for the last state ID to change from X to Y then back to X).
Note that this reasoning only works for commands with `prev_state_id`
engaged, such as the ones which we're using in
`migration_manager::announce_with_raft`. It would not work with
"unconditional commands" where `prev_state_id` is `nullopt` - for those
commands no state ID check is performed. It could still be safe to retry
those commands if they are idempotent for a different reason.
(Note: actually, our schema commands are already idempotent even without
the state ID check, because they simply apply a set of mutations, and
applying the same mutations twice is the same as applying them once.)
Message-Id: <20220131152926.18087-1-kbraun@scylladb.com>
Currently, --swap-size cannot specify an exact file size because
the option takes its parameter only in GB.
To fix the limitation, let's add --swap-size-bytes to specify swap size
in bytes.
We need this to implement swapfile preallocation while building IaaS
images.
see scylladb/scylla-machine-image#285
Closes#9971
Refs #9896
Found by @eliransin. The call to new_segment was wrapped in with_timeout.
This means that if the primary caller timed out, we would leave new_segment
calls running, but potentially issue new ones for the next caller.
This could lead to the reserve segment queue being read simultaneously,
which is not what we want.
Change all callers to always use the shared_future wait, and clear it
only on a result (exception or segment).
Closes#10001
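The intended invariant can be sketched in plain C++ with `std::shared_future` (hypothetical names; the real code uses Seastar's shared_future and asynchronous segment allocation):

```cpp
#include <cassert>
#include <future>
#include <optional>

// Sketch of the pattern above: every caller waits on the same
// shared_future for the next segment, and the future is cleared only
// once it resolves, never on a caller's timeout. A caller that gives up
// waiting does not cancel or restart the underlying allocation, so the
// reserve segment queue is never read by two allocations at once.
struct reserve_queue {
    std::optional<std::shared_future<int>> next_segment;
    int allocations = 0;

    std::shared_future<int> get_segment_future() {
        if (!next_segment) {
            ++allocations;              // only one allocation in flight
            std::promise<int> p;
            p.set_value(allocations);   // stand-in for the real allocation
            next_segment = p.get_future().share();
        }
        return *next_segment;           // late callers share this future
    }

    void on_resolved() {
        next_segment.reset();           // cleared only on a result
    }
};
```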
DynamoDB protocol specifies that when getting items in a batch
failed only partially, unprocessed keys can be returned so that
the user can perform a retry.
Alternator used to fail the whole request if any of the reads failed,
but right now it instead produces the list of unprocessed keys
and returns them to the user, as long as at least one read was
successful.
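The policy reduces to a small decision function (a sketch with hypothetical types; Alternator's real code works on DynamoDB JSON requests):

```cpp
#include <cassert>
#include <string>
#include <vector>

// Collect the keys whose reads failed into UnprocessedKeys, but only
// report success if at least one read went through; if everything
// failed, surface an error instead of a success response with all keys
// listed as "unprocessed".
struct batch_result {
    bool ok = false;
    std::vector<std::string> unprocessed_keys;
};

batch_result finish_batch(const std::vector<std::string>& keys,
                          const std::vector<bool>& read_succeeded) {
    batch_result r;
    size_t successes = 0;
    for (size_t i = 0; i < keys.size(); ++i) {
        if (read_succeeded[i]) {
            ++successes;
        } else {
            r.unprocessed_keys.push_back(keys[i]);
        }
    }
    r.ok = successes > 0;   // total failure => error, not UnprocessedKeys
    return r;
}
```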
This series comes with a test based on Scylla's error injection mechanism, and thus is only useful in modes which come with error injection compiled in. In release mode, expect to see the following message:
SKIPPED (Error injection not enabled in Scylla - try compiling in dev/debug/sanitize mode)
Fixes#9984
Closes#9986
* github.com:scylladb/scylla:
test: add total failure case for GetBatchItem
test: add error injection case for GetBatchItem
test: add a context manager for error injection to alternator
alternator: add error injection to BatchGetItem
alternator: fill UnprocessedKeys for failed batch reads
The test verifies that if all reads from a batch operation
failed, the result is an error, and not a success response
with UnprocessedKeys parameter set to all keys.
With the new context manager it's now easier to request an error
to be injected via REST API. Note that error injection is only
enabled in certain build modes (dev, debug, sanitize)
and the test case will be skipped if it's not possible to use
this mechanism.
Schema changes on top of Raft do not allow concurrent changes.
If two changes are attempted concurrently, one of them gets
`group0_concurrent_modification` exception.
Catch the exception in CQL DDL statement execution function and retry.
In addition, improve the description of CQL DDL statements
in group 0 history table.
Add a test which checks that group 0 history grows iff a schema change does
not throw `group0_concurrent_modification`. Also check that the retry
mechanism works as expected.
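A minimal sketch of such a retry loop, with a stand-in exception type (the real code lives in `schema_altering_statement` and retries the full DDL execution):

```cpp
#include <cassert>
#include <functional>
#include <stdexcept>

// Losing the race with a concurrent schema change is transient, so the
// DDL execution path catches the concurrent-modification error and
// simply retries the whole change.
struct group0_concurrent_modification : std::runtime_error {
    group0_concurrent_modification()
        : std::runtime_error("concurrent group 0 modification") {}
};

// Returns the number of attempts it took; throws if the retry budget is
// exhausted.
int execute_with_retry(const std::function<void()>& ddl, int max_attempts) {
    for (int attempt = 1; attempt <= max_attempts; ++attempt) {
        try {
            ddl();
            return attempt;
        } catch (const group0_concurrent_modification&) {
            // lost the race with a concurrent change; retry
        }
    }
    throw std::runtime_error("too many concurrent schema changes");
}
```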
* kbr/ddl-retry-v1:
test: unit test for group 0 concurrent change protection and CQL DDL retries
cql3: statements: schema_altering_statement: automatically retry in presence of concurrent changes
Raft randomized nemesis test was improved by adding some more
chaos: randomizing the network delay, server configuration,
ticking speed of servers.
This allowed us to catch a serious bug, which is fixed in the first patch.
The patchset also fixes bugs in the test itself and adds quality of life
improvements such as better diagnostics when inconsistency is detected.
* kbr/nemesis-random-v1:
test: raft: randomized_nemesis_test: print state of each state machine when detecting inconsistency
test: raft: randomized_nemesis_test: print details when detecting inconsistency
test: raft: randomized_nemesis_test: print snapshot details when taking/loading snapshots in `impure_state_machine`
test: raft: randomized_nemesis_test: keep server id in impure_state_machine
test: raft: randomized_nemesis_test: frequent snapshotting configuration
test: raft: randomized_nemesis_test: tick servers at different speeds in generator test
test: raft: randomized_nemesis_test: simplify ticker
test: raft: randomized_nemesis_test: randomize network delay
test: raft: randomized_nemesis_test: fix use-after-free in `environment::crash()`
test: raft: randomized_nemesis_test: fix use-after-free in two-way rpc functions
test: raft: randomized_nemesis_test: rpc: don't propagate `gate_closed_exception` outside
test: raft: randomized_nemesis_test: fix obsolete comment
raft: fsm: print configuration entries appearing in the log
raft: `operator<<(ostream&, ...)` implementation for `server_address` and `configuration`
raft: server: abort snapshot applications before waiting for rpc abort
raft: server: logging fix
raft: fsm: don't advance commit index beyond matched entries
Refs #9919
In a6202ae, throw_commitlog_add_error was added to ensure we had more
info on errors generated writing to the commit log.
However, several call sites catch timed_out_error explicitly, not
checking for nested exceptions etc.
97bb1be and 868b572 tried to deal with it, by using check routines.
It turns out there are call sites left, and while these should be
changed, it is safer and quicker for now to just ensure that
iff we have a timed_out_error, we throw yet another timed_out_error.
Closes#10002
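The normalization described above can be sketched as follows (exception types are stand-ins for Seastar's; the real helper also inspects nested exceptions):

```cpp
#include <cassert>
#include <exception>
#include <stdexcept>

// If the error produced while adding to the commitlog is a timeout,
// rethrow it as a plain timed_out_error, so call sites that catch
// timed_out_error explicitly keep working; any other error passes
// through unchanged.
struct timed_out_error : std::runtime_error {
    timed_out_error() : std::runtime_error("timed out") {}
};

bool is_timeout(std::exception_ptr ep) {
    try {
        std::rethrow_exception(ep);
    } catch (const timed_out_error&) {
        return true;
    } catch (...) {
        return false;
    }
}

void rethrow_commitlog_error(std::exception_ptr ep) {
    if (is_timeout(ep)) {
        throw timed_out_error{};   // the exact type callers catch
    }
    std::rethrow_exception(ep);
}
```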