scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-06-04 14:03:06 +00:00

Author	SHA1	Message	Date
Avi Kivity	ed6c01a9fa	test: increase timeout to account for flat_mutation_reader_v2 tests Since `fce124bd90` ("Merge "Introduce flat_mutation_reader_v2" from Tomasz") tests involving mutation_reader are a lot slower due to the new API testing. On slower machines it's enough to time out. Work underway to improve the situation, and it will also revert back to the original timing once the flat_mutation_reader_v2 work is done, but meanwhile, increase the timeout. Closes #9046	2021-07-15 12:33:43 +03:00
Avi Kivity	1643549d08	Merge 'Coroutinize the sstable reader' from Wojciech Mitros This patch applies the same changes to both kl and mx sstable readers, but because the kl reader is old, we'll focus on the newer one. This patch makes the main sstable reader process a coroutine, which allows simplifying it by: - using the state saved in the coroutine instead of most of the states saved in the _state variable - removing the switch statement and moving the code of former switch cases, resulting in reduced number of jumps in code - removing repetitive ifs for read statuses, by adding them to the coroutine implementation The coroutine is saved in a new class ```processing_result_generator```, which works like a generator: using its ```generate()``` method, one can order the coroutine to continue until it yields a data_consumer::processing_result value, which was achieved previously by calling the function that is now the coroutine (```do_process_state()```). Before the patch, the main processing method had 558 lines. The patch reduces this number to 345 lines. However, usage of C++ coroutines has a non-negligible effect on the performance of the sstable reader. In the test cases from ```perf_fast_forward``` the new sstable reader performs up to 2% more instructions (per fragment) than the former implementation, and this loss occurs in cases where we're reading many subsequent rows, without any skips. Thanks to finding an optimization during the development of the patch, the loss is mitigated when we do skip rows, and for some cases, we can even observe an improvement. 
You can see the full results in attached files: [old_results.txt](https://github.com/scylladb/scylla/files/6793139/old_results.txt), [new_results.txt](https://github.com/scylladb/scylla/files/6793140/new_results.txt) Test: unit(dev) Refs: #7952 Closes #9002 * github.com:scylladb/scylla: mx sstable reader: reduce code blocks mx sstable reader: make ifs consistent sstable readers: make awaiter for read status mx sstable reader: don't yield if the data buffer is not empty mx sstable reader: combine FLAGS and FLAGS_2 states mx sstable reader: reduce placeholder state usage mx sstable reader: replace non_consuming states with a bool mx sstable reader: reduce placeholder state usage mx sstable reader: replace unnecessary states with a placeholder mx sstable reader: remove false if case mx sstable reader: remove row_body_missing_columns_label mx sstable reader: remove row_body_deletion_label mx sstable reader: remove column_end_label mx sstable reader: remove column_cell_path_label mx sstable reader: remove column_ttl_label mx sstable reader: remove column_deletion_time_label mx sstable reader: remove complex_column_2_label mx sstable reader: remove row_body_missing_columns_read_columns_label mx sstable reader: remove row_body_marker_label mx sstable reader: remove row_body_shadowable_deletion_label mx sstable reader: remove row_body_prev_size_label mx sstable reader: remove ck_block_label mx sstable reader: remove ck_block2_label mx sstable reader: remove clustering_row_label and complex_column_label mx sstable reader: remove labels with only one goto mx sstable reader: replace the switch cases with gotos and a new label mx sstable reader: remove states only reached consecutively or from goto mx sstable reader: remove switch breaks for consecutive states mx sstable reader: convert readers main method into a coroutine kl sstable reader: replace states for ending with one state, simplify non_consuming kl sstable reader: remove unnecessary states kl sstable reader: remove 
unnecessary yield kl sstable reader: remove unnecessary blocks kl sstable reader: fix indentation kl sstable reader: replace switch with standard flow control kl sstable reader: remove state::CELL case kl sstable reader: move states code only reachable from one place kl sstable reader: remove states only reached consecutively kl sstable reader: remove switch breaks for consecutive states kl sstable reader: remove unreachable case kl sstable reader: move testing hack for fragmented buffers outside the coroutine kl sstable reader: convert readers main method into a coroutine sstable readers: create a generator class for coroutines	2021-07-15 12:06:14 +03:00
Wojciech Mitros	45058776c2	mx sstable reader: reduce code blocks Some blocks of code were surrounded by curly braces, because a variable was declared inside a switch case. After changes, some of the variable declarations are in if/else/while cases, and no longer need to be in separate code blocks, while other blocks can be extended to entire labels for simplicity.	2021-07-14 20:50:30 +02:00
Wojciech Mitros	9b333908e4	mx sstable reader: make ifs consistent In several places we're checking the return value of our consumers' consume_* calls. Because the behaviour in all cases is the same, let us use the same notation as well.	2021-07-14 20:50:30 +02:00
Wojciech Mitros	dc38605f75	sstable readers: make awaiter for read status After each read* call of the primitive_consumer we need to check if the entire primitive was in our current buffer. We can check it in the proceed_generator object by yielding the returned read status: if the yielded status is ready, the yield_value method returns a structure whose await_ready() method returns true. Otherwise it returns false. The returned structure is co_awaited by the coroutine (due to co_yield), and if await_ready() returns true, the coroutine isn't stopped; conversely, if it returns false (technical: and because its await_suspend method returns void), the coroutine stops, and a proceed::yes value is saved, indicating that we need more buffers.	2021-07-14 20:50:30 +02:00
Wojciech Mitros	09a0cd7c05	mx sstable reader: don't yield if the data buffer is not empty The skip() method returns a skip_bytes object if we want to skip the entire buffer, otherwise it returns a proceed::yes and trims the buffer. If the buffer is only trimmed we don't need to interrupt the coroutine, we simply continue instead.	2021-07-14 20:50:30 +02:00
Wojciech Mitros	5dc64532bd	mx sstable reader: combine FLAGS and FLAGS_2 states We don't differentiate between FLAGS and FLAGS_2 in verify_end_state(), so we can merge them into one state.	2021-07-14 20:50:30 +02:00
Wojciech Mitros	ab1e6f4211	mx sstable reader: reduce placeholder state usage After the changes to non_consuming states, we can remove some state::OTHER assignments again.	2021-07-14 20:50:30 +02:00
Wojciech Mitros	c904ab12c8	mx sstable reader: replace non_consuming states with a bool The non_consuming() method is only used after assuring that !primitive_consumer::active() (in continuous_data_consumer::process()) so we don't need states where primitive_consumer::active(), which is most of them. We still need to make sure that the states change when they need to, so we replace all the concerned states with the placeholder state, and for the few states from the non_consuming() OR, where the primitive_consumer::active() returns true, we set the value of _consuming to false, changing it back when the state is no longer non_consuming.	2021-07-14 20:50:30 +02:00
Wojciech Mitros	b05d3eefed	mx sstable reader: reduce placeholder state usage We can remove state assignments that we know are changing a state to itself. Similarly, if a state is changed in the same way in an if and an else, it can be changed before the if/else instead.	2021-07-14 20:50:30 +02:00
Wojciech Mitros	b2e3fbffd0	mx sstable reader: replace unnecessary states with a placeholder After removing the switch, the state is only used for verify_end_state() and non_consuming(), so we can replace states that are not used there with a single one, so that the state still stops being one of the appearing states when it needs to.	2021-07-14 20:50:30 +02:00
Wojciech Mitros	9a7a8fa86c	mx sstable reader: remove false if case consume_row_marker_and_tombstone does not return proceed::no in the mp_row_consumer_m implementation, and even if it did, we would most likely want to yield proceed::no in that case as well.	2021-07-14 20:50:30 +02:00
Wojciech Mitros	2262aac11a	mx sstable reader: remove row_body_missing_columns_label row_body_missing_columns_label is only reached from one goto, or consecutively, so the code omitted by goto can be omitted by an if instead (or else).	2021-07-14 20:50:30 +02:00
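The label-removal pattern used in 2262aac11a and the similar commits below can be illustrated with a small sketch. The function names and the "trace" strings are invented for the illustration and do not come from the reader code; the point is only that a forward goto which skips a block is equivalent to wrapping that block in an if.

```cpp
#include <cassert>
#include <string>

// Before: the label is reached either by falling through or by one forward goto.
std::string with_goto(bool has_deletion) {
    std::string trace = "flags;";
    if (!has_deletion) {
        goto marker_label;      // skip the optional step
    }
    trace += "deletion;";
marker_label:
    trace += "marker;";
    return trace;
}

// After: the skipped code sits inside an if, and the label is gone.
std::string with_if(bool has_deletion) {
    std::string trace = "flags;";
    if (has_deletion) {
        trace += "deletion;";
    }
    trace += "marker;";
    return trace;
}
```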
Wojciech Mitros	99b5a332db	mx sstable reader: remove row_body_deletion_label row_body_deletion_label is only reached from one goto, or consecutively, so the code omitted by goto can be omitted by an if instead (or else).	2021-07-14 20:50:30 +02:00
Wojciech Mitros	cbce22a88b	mx sstable reader: remove column_end_label column_end_label is only reached from one goto, or consecutively, so the code omitted by goto can be omitted by an if instead (or else).	2021-07-14 20:50:30 +02:00
Wojciech Mitros	925d921cb4	mx sstable reader: remove column_cell_path_label column_cell_path_label is only reached from two gotos, both at the end of an if/else block, or consecutively, so the code after the if/else block can be omitted by an if instead (or else).	2021-07-14 20:50:30 +02:00
Wojciech Mitros	e85987a439	mx sstable reader: remove column_ttl_label column_ttl_label is only reached from two gotos, both at the end of an if/else block, or consecutively, so the code after the if/else block can be omitted by an if instead (or else).	2021-07-14 20:50:30 +02:00
Wojciech Mitros	4b3607e97b	mx sstable reader: remove column_deletion_time_label column_deletion_time_label is only reached from one goto, or consecutively, so the code omitted by goto can be omitted by an if instead (or else).	2021-07-14 20:50:30 +02:00
Wojciech Mitros	8cf23c3b01	mx sstable reader: remove complex_column_2_label complex_column_2_label is only reached from one goto, or consecutively, so the code omitted by goto can be omitted by an if instead (or else).	2021-07-14 20:50:30 +02:00
Wojciech Mitros	fbe28d18f3	mx sstable reader: remove row_body_missing_columns_read_columns_label row_body_missing_columns_read_columns_label is only reached consecutively, or from a goto after the label. This is changed to a while loop starting at the label and ending at the goto. The code executed in the only case we do not reach the goto (so when exiting the loop) is moved after the while.	2021-07-14 20:50:30 +02:00
Wojciech Mitros	3b512ea2c2	mx sstable reader: remove row_body_marker_label row_body_marker_label is only reached from one goto inside an else case, or consecutively, so the code omitted by goto can be moved inside the corresponding if case.	2021-07-14 20:50:30 +02:00
Wojciech Mitros	0bcde69319	mx sstable reader: remove row_body_shadowable_deletion_label row_body_shadowable_deletion_label is only reached from one goto, or consecutively, so the code omitted by goto can be omitted by an if instead (or else).	2021-07-14 20:50:30 +02:00
Wojciech Mitros	3d0fdf9f3b	mx sstable reader: remove row_body_prev_size_label row_body_prev_size_label is only reached consecutively, or from a goto not far after the label. This is changed to a while loop starting at the label and ending at the goto.	2021-07-14 20:50:30 +02:00
Wojciech Mitros	b27166c36f	mx sstable reader: remove ck_block_label ck_block_label is only reached consecutively, or from a few gotos not far after the label. This is changed to a while loop with the gotos replaced by continue statements.	2021-07-14 20:50:30 +02:00
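A minimal sketch of the backward-goto case from b27166c36f (and the neighbouring loop conversions): a label that is only reached by falling into it or by gotos placed after it is a loop header, so the gotos become `continue`. The block-summing logic below is invented for the illustration and is unrelated to the actual clustering-key parsing.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Before: block_label is reached consecutively and from two backward gotos.
int sum_blocks_goto(const std::vector<int>& blocks) {
    std::size_t i = 0;
    int sum = 0;
block_label:
    if (i == blocks.size()) {
        return sum;
    }
    if (blocks[i] < 0) {        // treat negative entries as "empty" blocks
        ++i;
        goto block_label;
    }
    sum += blocks[i];
    ++i;
    goto block_label;
}

// After: the label becomes a while loop and the early goto becomes continue.
int sum_blocks_loop(const std::vector<int>& blocks) {
    std::size_t i = 0;
    int sum = 0;
    while (i != blocks.size()) {
        if (blocks[i] < 0) {
            ++i;
            continue;
        }
        sum += blocks[i];
        ++i;
    }
    return sum;
}
```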
Wojciech Mitros	ec6c2f0e07	mx sstable reader: remove ck_block2_label ck_block2_label is only reached from one goto, or consecutively, so the code omitted by goto can be omitted by an if instead (or else).	2021-07-14 20:50:30 +02:00
Wojciech Mitros	1e59e249ec	mx sstable reader: remove clustering_row_label and complex_column_label clustering_row_label is only reached from one goto, or consecutively, so the code omitted by goto can be omitted by an if instead (or else). Also remove complex_column_label because it is next to its only goto.	2021-07-14 20:50:30 +02:00
Wojciech Mitros	440aba61a9	mx sstable reader: remove labels with only one goto If a case is reached only after jumping with a single goto, that goto may be replaced with the target code.	2021-07-14 20:50:30 +02:00
Wojciech Mitros	65f7eb5ada	mx sstable reader: replace the switch cases with gotos and a new label Because the number of remaining cases is moderately low, and after finishing a case we always enter another one, the switch is removed completely, and the last remaining cases are handled by 3 additional gotos and 1 new label.	2021-07-14 20:50:30 +02:00
Wojciech Mitros	0398c68797	mx sstable reader: remove states only reached consecutively or from goto If a state is never reached from the top of the switch, but only by continuing from the previous case, we don't need to have a case: for it. Similarly, if there is a label that we goto, we don't need the switch case.	2021-07-14 20:50:30 +02:00
Wojciech Mitros	f87b27b9e4	mx sstable reader: remove switch breaks for consecutive states If _state at the end of a switch case has the same value as the next case, instead of breaking the switch, we can just fall through.	2021-07-14 20:50:30 +02:00
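The fall-through step in f87b27b9e4 can be sketched on a toy state machine. The states and arithmetic are hypothetical; the `dispatches` counter is added only to make the saved jumps visible. When a case ends by setting the state to the very next case, breaking out and re-entering the switch does the same work as falling through, just with an extra dispatch.

```cpp
#include <cassert>

enum class state { HEADER, BODY, DONE };

// Before: every case breaks, so the loop re-dispatches on the state each time.
int run_with_breaks(int& dispatches) {
    state s = state::HEADER;
    int out = 0;
    for (;;) {
        ++dispatches;
        switch (s) {
        case state::HEADER:
            out += 1;
            s = state::BODY;
            break;
        case state::BODY:
            out += 10;
            s = state::DONE;
            break;
        case state::DONE:
            return out;
        }
    }
}

// After: the state at the end of each case equals the next case, so fall through.
int run_with_fallthrough(int& dispatches) {
    state s = state::HEADER;
    int out = 0;
    for (;;) {
        ++dispatches;
        switch (s) {
        case state::HEADER:
            out += 1;
            s = state::BODY;
            [[fallthrough]];
        case state::BODY:
            out += 10;
            s = state::DONE;
            [[fallthrough]];
        case state::DONE:
            return out;
        }
    }
}
```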
Wojciech Mitros	32b996aca5	mx sstable reader: convert readers main method into a coroutine (same as in kl sstable reader) The function is converted to a coroutine simply by adding an infinite loop around the switch, and starting another iteration after yielding a value, instead of returning. Because the coroutine resume() function does not take any arguments, a new member is introduced to remember the "data" buffer, that was previously an argument to the method.	2021-07-14 20:50:30 +02:00
Wojciech Mitros	4816e8120b	kl sstable reader: replace states for ending with one state, simplify non_consuming After removing the switch, the only use for states in the sstable reader are methods non_consuming() and verify_end_state(). The non_consuming() method is only used after assuring that !primitive_consumer::active() (in continuous_data_consumer::process()) so we don't need states where primitive_consumer::active() for this method, which is actually all of them. We don't differentiate between ATOM_START and ATOM_START_2 in verify_end_state(), so we can just merge them into one. While we need to remember times when we enter states used in verify_end_state(), we also need to remember when we exit them. For that reason we introduce a new state "NOT_CLOSING", that fails all comparisons in verify_end_state(), and replaces all states that aren't used in verify_end_state()	2021-07-14 20:50:30 +02:00
Wojciech Mitros	0c284a8b5e	kl sstable reader: remove unnecessary states After removing the switch, the state is only used for verify_end_state() and non_consuming(), so we can remove states that are not used there (and which do not change them).	2021-07-14 20:50:30 +02:00
Wojciech Mitros	35c30e6178	kl sstable reader: remove unnecessary yield We don't need to yield row_consumer::proceed::yes if we are not parsing a primitive using primitive_consumer, we can just continue execution.	2021-07-14 20:50:30 +02:00
Wojciech Mitros	97c7b5fe76	kl sstable reader: remove unnecessary blocks Some blocks of code were surrounded by curly braces, because a variable was declared inside a switch case. With standard flow control, it's no longer needed.	2021-07-14 20:50:30 +02:00
Wojciech Mitros	914e4f27e9	kl sstable reader: fix indentation To simplify review, the code moved in previous commits didn't change its indentation. This commit fixes it.	2021-07-14 20:50:30 +02:00
Wojciech Mitros	7a6729159f	kl sstable reader: replace switch with standard flow control We get rid of the switch by using the infinite loop around the switch for jumping to the first case, adding an infinite loop around the second case (one break from the switch with the state of the first case becomes a break of the new while), and adding an if around the first case (because we never break in the first case).	2021-07-14 20:50:30 +02:00
Wojciech Mitros	cfe6a46a60	kl sstable reader: remove state::CELL case The CELL state is only set in the if/else block immediately before the CELL case, so we don't need to have a case for it.	2021-07-14 20:50:30 +02:00
Wojciech Mitros	c41f49d2e5	kl sstable reader: move states code only reachable from one place If a case is reached only after exiting a certain other case (or goto) its code may as well be moved to that place.	2021-07-14 20:50:30 +02:00
Wojciech Mitros	5f27413c1f	kl sstable reader: remove states only reached consecutively If a state is never reached from the top of the switch, but only by continuing from the previous case, we don't need to have a case: for it.	2021-07-14 20:50:30 +02:00
Wojciech Mitros	e226fc12c9	kl sstable reader: remove switch breaks for consecutive states If _state at the end of a switch case has the same value as the next case, instead of breaking the switch, we can just fall through.	2021-07-14 20:50:30 +02:00
Wojciech Mitros	bc7ed3f596	kl sstable reader: remove unreachable case The STOP_THEN_ATOM_START is never reached, so it can be removed altogether.	2021-07-14 20:50:30 +02:00
Wojciech Mitros	63d1a44d12	kl sstable reader: move testing hack for fragmented buffers outside the coroutine The testing hack can't be done inside the coroutine, because we don't have the original "data" buffer	2021-07-14 20:50:30 +02:00
Wojciech Mitros	6fff9aed3c	kl sstable reader: convert readers main method into a coroutine The function is converted to a coroutine simply by adding an infinite loop around the switch, and starting another iteration after yielding a value, instead of returning. Because the coroutine resume() function does not take any arguments, a new member is introduced to remember the "data" buffer, that was previously an argument to the method.	2021-07-14 20:50:30 +02:00
Wojciech Mitros	01c2f406df	sstable readers: create a generator class for coroutines The data_consume_rows_context and data_consume_rows_context_m are classes that use primitive_consumer read* methods to get primitives from a streamed sstable and, using their corresponding consumers' (mp_row_consumer_k_l and mp_row_consumer_m) consume* methods, fill the buffer of the corresponding flat_mutation_reader. The main procedure where we decide which read* and consume* methods to call is do_process_state. We save the current state of the procedure in the _state variable, to remember where to continue in the next call. For each call, the do_process_state method returns information about whether we can keep filling the buffer using more buffers from the stream (proceed::yes), or not (proceed::no). The saved state can be (mostly) removed by using a generator coroutine, whose state is saved when its execution is halted, and which yields the values that do_process_state would return before. The processing_result_generator is a class for managing a generator coroutine. When the coroutine halts, the proceed_generator saves the value yielded by the coroutine, and returns it to the caller.	2021-07-14 20:50:27 +02:00
Piotr Sarna	3d816b7c16	Merge 'Move the reader concurrency semaphore in front of the cache' from Botond This patchset combines two important changes to the way reader permits are created and admitted: 1) It switches admission to be up-front. 2) It changes the admission algorithm. (1) Currently permits are created before the read is started, but they only wait for admission when going to the disk. This leaves the resources consumption of cache and memtables reads unbounded, possibly leading to OOM (rare but happens). This series changes this so that permits are admitted at the moment they are created, making admission up-front -- at least for those reads that pass admission at all (some don't). (2) Admission currently is based on availability of resources. We have a certain amount of memory available, which is derived from the memory available to the shard, as well as a hardcoded count resource. Reads are admitted when a count and a certain amount (base cost) of memory is available. This patchset adds a new aspect to this admission process beyond the existing resource availability: the number of used/blocked reads. Namely it only admits new reads if in addition to the necessary amount of resources being available, all currently used readers are blocked. In other words we only admit new reads if all currently admitted reads require something other than CPU to progress. They are either waiting on I/O, a remote shard, or attention from their consumers (not used currently). The reason for making these two changes at the same time is that up-front admission means cache reads now need to obtain a permit too. For cache reads the optimal concurrency is 1. Anything above that just increases latency (without increasing throughput). So we want to make sure that if a cache reader hits it doesn't get any competition for CPU and it can run to completion. We admit new reads only if the read misses and has to go to disk. 
A side effect of these changes is that the execution stages from the replica-side read path are replaced with the reader concurrency semaphore as an execution stage. This is necessary due to bad interaction between said execution stages and up-front admission. This has an important consequence: read timeouts are more strictly enforced because the execution stage doesn't have a timeout so it can execute already timed-out reads too. This is not the case with the semaphore's queue which will drop timed-out reads. Another consequence is that data and mutation reads now share the same execution stage, which increases its effectiveness; on the other hand, system and user reads no longer do. Fixes: #4758 Fixes: #5718 Tests: unit(dev, release, debug) * 'reader-concurrency-semaphore-in-front-of-the-cache/v5.3' of https://github.com/denesb/scylla: (54 commits) test/boost/reader_concurrency_semaphore_test: add used/blocked test test/boost/reader_concurrency_semaphore_test: add admission test reader_permit: add operator<< for reader_resources reader_concurrency_semaphore: add reads_{admitted,enqueued} stats table: make_sstable_reader(): fix indentation table: clean up make_sstable_reader() database: remove now unused query execution stages mutation_reader: remove now unused restricting_reader sstables: sstable_set: remove now unused make_restricted_range_sstable_reader() reader_permit: remove now unused wait_admission() reader_concurrency_semaphore: remove now unused obtain_permit_nowait() reader_concurrency_semaphore: admission: flip the switch database: increase semaphore max queue size test: index_with_paging_test: increase semaphore's queue size reader_concurrency_semaphore: add set_max_queue_size() test: mutation_reader_test: remove restricted reader tests reader_concurrency_semaphore: remove now unused make_permit() test: reader_concurrency_semaphore_test: move away from make_permit() test: move away from make_permit() treewide: use make_tracking_only_permit() ...	2021-07-14 16:22:56 +02:00
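The admission rule described in 3d816b7c16 (resources available and every used read blocked) can be modelled with a few counters. This is a simplified sketch and not the reader_concurrency_semaphore API: `semaphore_model`, its fields and `try_admit()` are hypothetical, there is no queue or timeout, and the model assumes that an admitted read immediately counts as used.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical model of the admission decision only.
struct semaphore_model {
    int available_count;            // remaining "count" resource
    std::size_t available_memory;   // remaining memory in bytes
    int used = 0;                   // admitted reads currently marked as used
    int blocked = 0;                // used reads waiting on I/O or a remote shard

    bool can_admit(std::size_t base_cost) const {
        const bool has_resources = available_count >= 1 && available_memory >= base_cost;
        const bool all_used_blocked = used == blocked;  // nobody is competing for CPU
        return has_resources && all_used_blocked;
    }

    // Returns false where a real semaphore would enqueue the read instead.
    bool try_admit(std::size_t base_cost) {
        if (!can_admit(base_cost)) {
            return false;
        }
        --available_count;
        available_memory -= base_cost;
        ++used;   // assumption: a freshly admitted read is immediately "used"
        return true;
    }
};
```

In this model a second read is refused while the first is still running on the CPU (for example serving a cache hit) and is admitted once the first one is marked blocked because it went to disk.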
Botond Dénes	e2dfb2df71	test/boost/reader_concurrency_semaphore_test: add used/blocked test Make sure that releasing a bunch of used/blocked guards in random order doesn't break the permit state.	2021-07-14 17:19:02 +03:00
Botond Dénes	0337d3ea4a	test/boost/reader_concurrency_semaphore_test: add admission test Checking every conceivable admission scenario (hopefully).	2021-07-14 17:19:02 +03:00
Botond Dénes	b81f39cec9	reader_permit: add operator<< for reader_resources And use it in tests, it results in actually useful error messages.	2021-07-14 17:19:02 +03:00
Botond Dénes	1666ad078a	reader_concurrency_semaphore: add reads_{admitted,enqueued} stats Primarily for tests, but we could also export these, should we want to.	2021-07-14 17:19:02 +03:00


27412 Commits