Those methods are never used so it's better not to keep a dead code
around.
Tests: unit(dev)
Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
Closes#9188
Rename the old version to `sstables::make_reader_v1()`, to have a
nicely searcheable eradication target.
Signed-off-by: Michael Livshin <michael.livshin@scylladb.com>
"
Now reshape can be aborted on either boot or refresh.
The workflow is:
1) reshape starts
2) user notices it's taking too long
3) nodetool stop RESHAPE
the good thing is that completed reshape work isn't lost, allowing
table to enjoy the benefits of all reshaping done up to the abortion
point.
Fixes#7738.
"
* 'abort_reshape_v1' of https://github.com/raphaelsc/scylla:
compaction: Allow reshape to be aborted
api: make compaction manager api available earlier
Now reshape can be aborted on either boot or refresh.
The workflow is:
1) reshape starts
2) user notices it's taking too long
3) nodetool stop RESHAPE
the good thing is that completed reshape work isn't lost, allowing
table to enjoy the benefits of all reshaping done up to the abortion
point.
Fixes#7738.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
When perfing cleanup, merging reader showed up as significant.
Given that cleanup is performed on a single sstable at a time,
merging reader becomes an extra layer doing useless work.
1.71% 1.71% scylla scylla [.] merging_reader<mutation_reader_merger>::fill_buffer(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >)::{lambda()#1}::operator()
mutation compactor, to get rid of purgeable expired data
and so on, still consumes the data retrieved by sstable
reader, so no semantic change is done.
With the overhead removed, cleanup becomes ~9% faster, see:
BEFORE
real 1m15.240s
user 0m2.648s
sys 0m0.128s
240MB to 237MB (~98% of original) in 3301ms = 71MB/s.
719MB to 719MB (~99% of original) in 9761ms = 73MB/s.
1GB to 1GB (~100% of original) in 15372ms = 73MB/s.
1GB to 1GB (~100% of original) in 15343ms = 74MB/s.
1GB to 1GB (~100% of original) in 15329ms = 74MB/s.
1GB to 1GB (~100% of original) in 15360ms = 73MB/s.
AFTER
real 1m9.154s
user 0m2.428s
sys 0m0.123s
240MB to 237MB (~98% of original) in 3010ms = 78MB/s.
719MB to 719MB (~99% of original) in 8997ms = 79MB/s.
1GB to 1GB (~100% of original) in 14114ms = 80MB/s.
1GB to 1GB (~100% of original) in 14145ms = 80MB/s.
1GB to 1GB (~100% of original) in 14106ms = 80MB/s.
1GB to 1GB (~100% of original) in 14053ms = 80MB/s.
With 1TB set, ~20m would had been reduced instead.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <20210730190713.462135-1-raphaelsc@scylladb.com>
NOTE: this series depends on a Seastar submodule update, currently queued in next: 0ed35c6af052ab291a69af98b5c13e023470cba3
In order to avoid needless throwing, exceptions are passed
directly wherever possible. Two mechanisms which help with that are:
1. `make_exception_future<>` for futures
2. `co_return coroutine::exception(...)` for coroutines
which return `future<T>` (the mechanism does not work for `future<>`
without parameters, unfortunately)
Tests: unit(release)
Closes#9079
* github.com:scylladb/scylla:
system_keyspace: pass exceptions without throwing
sstables: pass exceptions without throwing
storage_proxy: pass exceptions without throwing
multishard_mutation_query: pass exceptions without throwing
client_state: pass exceptions without throwing
flat_mutation_reader: pass exceptions without throwing
table: pass exceptions without throwing
commitlog: pass exceptions without throwing
compaction: pass exceptions without throwing
database: pass exceptions without throwing
Convert all known tri-compares that return an int to return std::strong_ordering.
Returning an int is dangerous since the caller can treat it as a bool, and indeed
this series uncovered a minor bug (#9103).
Test: unit (dev)
Fixes#1449Closes#9106
* github.com:scylladb/scylla:
treewide: remove redundant "x <=> 0" compares
test: mutation_test: convert internal tri-compare to std::strong_ordering
utils: int_range: change to std::strong_ordering
test: change some internal comparators to std::strong_ordering
utils: big_decimal: change to std::strong_ordering
utils: fragment_range: change to std::strong_ordering
atomic_cell: change compare_atomic_cell_for_merge() to std::strong_ordering
types: drop scaffolding erected around lexicographical_tri_compare
sstables: keys: change to std::strong_ordering internally
bytes: compare_unsigned(): change to std::strong_ordering
uuid: change comparators to std::strong_ordering
types: convert abstract_type::compare and related to std::strong_ordering
types: reduce boilerplate when comparing empty value
serialized_tri_compare: change to std::strong_ordering
compound_compat: change to std::strong-ordering
types: change lexicographical_tri_compare, prefix_equality_tri_compare to std::strong_ordering
"
This series changes the behavior of the system when executing reads
annotated with "bypass cache" clause in CQL. Such reads will not
use nor populate the sstable partition index cache and sstable index page cache.
"
* 'bypass-cache-in-sstable-index-reads' of github.com:tgrabiec/scylla:
sstables: Do not populate page cache when searching in promoted index for "bypass cache" reads
sstables: Do not populate partition index cache for "bypass cache" reads
If x is of type std::strong_ordering, then "x <=> 0" is equivalent to
x. These no-ops were inserted during #1449 fixes, but are now unnecessary.
They have potential for harm, since they can hide an accidental of the
type of x to an arithmetic type, so remove them.
Ref #1449.
The signature already returned std::strong_ordering, but an
internal comparator returned int. Switch it, so it now uses
the strong_ordering overload of lexicographicall_tri_compare().
Ref #1449.
This patch follows #9002, further reducing the complexity of the sstable readers.
The split between row consumer interfaces and implementations has been first added in 2015, and there is no reason to create new implementations anymore. By merging those classes, we achieve a sizeable reduction in sstable reader length and complexity.
Refs #7952
Tests: unit(dev)
Closes#9073
* github.com:scylladb/scylla:
sstables: merge row_consumer into mp_row_consumer_k_l
sstables: move kl row_consumer
sstables: merge consumer_m into mp_row_consumer_m
sstables: move mp_row_consumer_m
Prevent accidental conversions to bool from yielding the wrong results.
Unprepared users (that converted to bool, or assigned to int) are adjusted.
Ref #1449
Test: unit (dev)
Closes#9088
In order to avoid needless throwing, exceptions are passed
directly wherever possible. Two mechanisms which help with that are:
1. make_exception_future<> for futures
2. co_return coroutine::exception(...) for coroutines
which return future<T> (the mechanism does not work for future<>
without parameters, unfortunately)
The row_consumer interface has only one implementation:
mp_row_consumer_k_l; and we're not planning other ones,
so to reduce the number of inheritances, and the number
of lines in the sstable reader, these classes may be
combined.
Signed-off-by: Wojciech Mitros <wojciech.mitros@scylladb.com>
In preparation for the next patch combining row_consumer and
mp_row_consumer_k_l, move row_consumer next to row_consumer.
Because row_consumer is going to be removed, we retire some
old tests for different implementations of the row_consumer
interface; as a result, we don't need to expose internal
types of kl sstable reader for tests, so all classes from
reader_impl.hh are moved to reader.cc, and the reader_impl.hh
file is deleted, and the reader.cc file has an analogous
structure to the reader.cc file in sstables/mx directory.
Signed-off-by: Wojciech Mitros <wojciech.mitros@scylladb.com>
The consumer_m interface has only one implementation:
mp_row_consumer_m; and we're not planning other ones,
so to reduce the number of inheritances, and the number
of lines in the sstable reader, these classes may be
combined.
Signed-off-by: Wojciech Mitros <wojciech.mitros@scylladb.com>
To make next patch combining consumer_m and mp_row_consumer_m
more readable, move mp_row_consumer_m next to consumer_m.
Signed-off-by: Wojciech Mitros <wojciech.mitros@scylladb.com>
When skipping bytes at the end of a continuous_data_consumer range,
the position of the consumer is moved after the skipped bytes, but
the position of the underlying input_stream is not.
This patch adds skipping of the underlying input_stream, to make
its position consistent with the position of the consumer.
Fixes#9024
Signed-off-by: Wojciech Mitros <wojciech.mitros@scylladb.com>
Closes#9039
* github.com:scylladb/scylla:
tests: add test for skipping bytes at end of consumer
continuous_data_consumer: properly skip bytes at the end of a range
When skipping bytes at the end of a continuous_data_consumer range,
the position of the consumer is moved after the skipped bytes, but
the position of the underlying input_stream is not.
This patch adds skipping of the underlying input_stream, to make
its position consistent with the position of the consumer.
Fixes#9024
Signed-off-by: Wojciech Mitros <wojciech.mitros@scylladb.com>
We need a permit to initialize said object which makes the semaphore
used and hence trigger an error if an exception is thrown in the
constructor. Move the initialization to the end of the constructor to
prevent this.
Signed-off-by: Botond Dénes <bdenes@scylladb.com>
Message-Id: <20210719040449.9202-1-bdenes@scylladb.com>
Reads which bypass cache will use a private temporary instance of cached_file
which dies together with the index cursor.
The cursor still needs a cached_file with cachig layer. Binary searching needs
caching for performance, some of the pages will be reused. Another reason to still
use cached_file is to work with a common interface, and reusing it requires minimal changes.
Index cursor for reads which bypass cache will use a private temporary
instance of the partition index cache.
Promoted index scanner (ka/la format) will not go through the page cache.
This patch applies the same changes to both kl and mx sstable readers, but because the kl reader is old, we'll focus on the newer one.
This patch makes the main sstable reader process a coroutine,
allowing to simplify it, by:
- using the state saved in the coroutine instead of most of the states saved in the _state variable
- removing the switch statement and moving the code of former switch cases, resulting in reduced number of jumps in code
- removing repetitive ifs for read statuses, by adding them to the coroutine implementation
The coroutine is saved in a new class ```processing_result_generator```, which works like a generator: using its ```generate()``` method, one can order the coroutine to continue until it yields a data_consumer::processing_result value, which was achieved previously by calling the function that is now the coroutine(```do_process_state()```).
Before the patch, the main processing method had 558 lines. The patch reduces this number to 345 lines.
However, usage of c++ coroutines has a non-negligible effect on the performance of the sstable reader.
In the test cases from ```perf_fast_forward``` the new sstable reader performs up to 2% more instructions (per fragment) than the former implementation, and this loss is achieved for cases where we're reading many subsequent rows, without any skips.
Thanks to finding an optimization during the development of the patch, the loss is mitigated when we do skip rows, and for some cases, we can even observe an improvement.
You can see the full results in attached files: [old_results.txt](https://github.com/scylladb/scylla/files/6793139/old_results.txt), [new_results.txt](https://github.com/scylladb/scylla/files/6793140/new_results.txt)
Test: unit(dev)
Refs: #7952Closes#9002
* github.com:scylladb/scylla:
mx sstable reader: reduce code blocks
mx sstable reader: make ifs consistent
sstable readers: make awaiter for read status
mx sstable reader: don't yield if the data buffer is not empty
mx sstable reader: combine FLAGS and FLAGS_2 states
mx sstable reader: reduce placeholder state usage
mx sstable reader: replace non_consuming states with a bool
mx sstable reader: reduce placeholder state usage
mx sstable reader: replace unnecessary states with a placeholder
mx sstable reader: remove false if case
mx sstable reader: remove row_body_missing_columns_label
mx sstable reader: remove row_body_deletion_label
mx sstable reader: remove column_end_label
mx sstable reader: remove column_cell_path_label
mx sstable reader: remove column_ttl_label
mx sstable reader: remove column_deletion_time_label
mx sstable reader: remove complex_column_2_label
mx sstable reader: remove row_body_missing_columns_read_columns_label
mx sstable reader: remove row_body_marker_label
mx sstable reader: remove row_body_shadowable_deletion_label
mx sstable reader: remove row_body_prev_size_label
mx sstable reader: remove ck_block_label
mx sstable reader: remove ck_block2_label
mx sstable reader: remove clustering_row_label and complex_column_label
mx sstable reader: remove labels with only one goto
mx sstable reader: replace the switch cases with gotos and a new label
mx sstable reader: remove states only reached consecutively or from goto
mx sstable reader: remove switch breaks for consecutive states
mx sstable reader: convert readers main method into a coroutine
kl sstable reader: replace states for ending with one state, simplify non_consuming
kl sstable reader: remove unnecessary states
kl sstable reader: remove unnecessary yield
kl sstable reader: remove unnecessary blocks
kl sstable reader: fix indentation
kl sstable reader: replace switch with standard flow control
kl sstable reader: remove state::CELL case
kl sstable reader: move states code only reachable from one place
kl sstable reader: remove states only reached consecutively
kl sstable reader: remove switch breaks for consecutive states
kl sstable reader: remove unreachable case
kl sstable reader: move testing hack for fragmented buffers outside the coroutine
kl sstable reader: convert readers main method into a coroutine
sstable readers: create a generator class for coroutines
Some blocks of code were surrounded by curly braces, because
a variable was declared inside a switch case. After changes,
some of the variable declarations are in if/else/while cases,
and no longer need to be in separate code blocks, while other
blocks can be extended to entire labels for simplicity.
In several places we're checking the return value of our
consumers' consume_* calls. Because the behaviour in all cases
is the same, let us use the same notation as well.
After each read* call of the primitive_consumer we need to check
if the entire primitive was in our current buffer. We can check it
in the proceed_generator object by yielding the returned read status:
if the yielded status is ready, the yield_value method returns
a structure whose await_ready() method returns true. Otherwise it
returns false.
The returned structure is co_awaited by the coroutine (due to co_yield),
and if await_ready() returns true, the coroutine isn't stopped,
conversely, if it returns false, (technical: and because its await_suspend
methods returns void) the coroutine stops, and a proceed::yes value
is saved, indicating that we need more buffers.
The skip() method returns a skip_bytes object if we want to
skip the entire buffer, otherwise it returns a proceed::yes
and trims the buffer.
If the buffer is only trimmed we don't need to interrupt
the coroutine, we simply continue instead.
The non_consuming() method is only used after assuring that
primitive_consumer::active() (in continuous_data_consumer::process())
so we don't need states where primitive_consumer::active(), which
is most of them.
We still need to make sure that the states change when they need to,
so we replace all the concerned states with the placeholder state,
and for the few states from the non_consuming() OR, where the
primitive_consumer::active() returns true, we set the value of
_consuming to false, changing it back when the state is no longer
non_consuming.
We can remove state assignments that we know are
changing a state to itself.
Similarily, if a state is changed in the same way
in an if and an else, it can be changed before the
if/else instead.
After removing the switch, the state is only used for
verify_end_state() and non_consuming(), so we can
replace states that are not used there with a single
one, so that the state still stops being one of the
appearing states when it needs to.
consume_row_marker_and_tombstone does not return proceed::no in the
mp_row_consumer_m implementation, and even if it did, we would most
likely want to yield proceed::no in that case as well.
column_cell_path_label is only reached from two goto, both
at the end of an if/else block, or consecutively, so the code
after the if/else block can be ommited by an if instead (or else).
column_ttl_label is only reached from two goto, both
at the end of an if/else block, or consecutively, so the code
after the if/else block can be ommited by an if instead (or else).
row_body_missing_columns_read_columns_label is only reached
consecutively, or from a goto after the label. This is changed to a
while loop starting at the label and ending at the goto.
The code executed in the only case we do not reach the goto (so
when exiting the loop) is moved after the while.
row_body_marker_label is only reached from one goto inside an else
case, or consecutively, so the code omitted by goto can be moved
inside the corresponding if case.
row_body_shadowable_deletion_label is only reached from one
goto, or consecutively, so the code omitted by goto can be
ommited by an if instead (or else).
row_body_prev_size_label is only reached consecutively, or from
a goto not far after the label. This is changed to a while loop
starting at the label and ending at the goto.
ck_block_label is only reached consecutively, or from
a few gotos not far after the label. This is changed
to a while loop with gotos replaced with continue's.
clustering_row_label is only reached from one goto, or consecutively,
so the code omitted by goto can be ommited by an if instead (or else).
Also remove complex_column_label because it is next to
its only goto.