A helper which will be used during upcoming changes to
mutation_partition_v2::apply_monotonically(), which will extend it to merging
versions with different schemas.
A helper which will be used during upcoming changes to
mutation_partition_v2::apply_monotonically(), which will extend it to merging
versions with different schemas.
It will be used in upcoming commits.
A factory function is used, rather than an actual constructor,
because we want to delegate the (easy) case of equal schemas
to the existing single-schema constructor.
And that's impossible (at least without invoking a copy/move
constructor) to do with only constructors.
operator<< accepts a schema& and a partition_entry&. But since the latter
now contains a reference to its schema inside, the former is redundant.
Remove it.
After adding a _schema field to each partition version,
the field in partition_snapshot is redundant. It can be always recovered
from the latest version. Remove it.
Currently, partition_version does not reference its schema.
All partition_version reachable from a entry/snapshot have the same schema,
which is referenced in memtable_entry/cache_entry/partition_snapshot.
To enable gentle schema upgrades, we want to use the existing background
version merging mechanism. To achieve that, we will move the schema reference
into partition_version, and we will allow neighbouring MVCC versions to have
different schemas, and we will merge them on-the-fly during reads and
persistently during background version merges.
This way, an upgrade will boil down to adding a new empty version with
the new schema.
This patch adds the _schema field to partition_version and propagates
the schema pointer to it from the version's containers (entry/snapshot).
Subsequent patches will remove the schema references from the containers,
because they are now redundant.
We don't have a convention for when to pass `schema_ptr` and and when to pass
`const schema&` around.
In general, IMHO the natural convention for such a situation is to pass the
shared pointer if the callee might extend the lifetime of shared_ptr,
and pass a reference otherwise. But we convert between them willy-nilly
through shared_from_this().
While passing a reference to a function which actually expects a shared_ptr
can make sense (e.g. due to the fact that smart pointers can't be passed in
registers), the other way around is rather pointless.
This patch takes one occurence of that and modifies the parameter to a reference.
Since enable_shared_from_this makes shared pointer parameters and reference
parameters interchangeable, this is a purely cosmetic change.
They will be used in upcoming patches which introduce incremental schema upgrades.
Currently, these variants always copy cells during upgrade.
This could be optimized in the future by adding a way to move them instead.
The comment refers to "other", but it means "pe". Fix that.
The patch also adds a bit of context to the mutation_partition jargon
("evictability" and "continuity"), by reminding how it relates
to the concrete abstractions: memtable and cache.
Most variants of apply() and apply_monotonically() in mutation_partition_v2
are leftovers from mutation_partition, and are unused. Thus they only
add confusion and maintenance burden. Since we will be modifying
apply_monotonically() in upcoming patches, let's clean them up, lest
the variants become stale.
This patch removes all unused variants of apply() and apply_monotonically()
and "manually inlines" the variants which aren't used often enough to carry
their own weight.
In the end, we are left with a single apply_monotonically() and two convenience
apply() helpers.
The single apply_monotonically() accepts two schema arguments. This facility
is unimplemented and unused as of this patch - the two arguments are always
the same - but it will be implemented and used in later parts of the series.
The comment suggests that the order of sentinel insertion is meaningful because
of the resulting eviction order. But the sentinels are added to the tracker
with the two-argument version of insert(), which inserts the second argument
into the LRU right before the (more recent) first argument.
Thus the eviction order of sentinels is decided explicitly, and it doesn't
rely on insertion order.
In some mixed-schema apply helpers for tests, the source mutation
is accidentally copied with the target schema. Fix that.
Nothing seems to be currently affected by this bug; I found it when it
was triggered by a new test I was adding.
in C++20, compiler generate operator!=() if the corresponding
operator==() is already defined, the language now understands
that the comparison is symmetric in the new standard.
fortunately, our operator!=() is always equivalent to
`! operator==()`, this matches the behavior of the default
generated operator!=(). so, in this change, all `operator!=`
are removed.
in addition to the defaulted operator!=, C++20 also brings to us
the defaulted operator==() -- it is able to generated the
operator==() if the member-wise lexicographical comparison.
under some circumstances, this is exactly what we need. so,
in this change, if the operator==() is also implemented as
a lexicographical comparison of all memeber variables of the
class/struct in question, it is implemented using the default
generated one by removing its body and mark the function as
`default`. moreover, if the class happen to have other comparison
operators which are implemented using lexicographical comparison,
the default generated `operator<=>` is used in place of
the defaulted `operator==`.
sometimes, we fail to mark the operator== with the `const`
specifier, in this change, to fulfil the need of C++ standard,
and to be more correct, the `const` specifier is added.
also, to generate the defaulted operator==, the operand should
be `const class_name&`, but it is not always the case, in the
class of `version`, we use `version` as the parameter type, to
fulfill the need of the C++ standard, the parameter type is
changed to `const version&` instead. this does not change
the semantic of the comparison operator. and is a more idiomatic
way to pass non-trivial struct as function parameters.
please note, because in C++20, both operator= and operator<=> are
symmetric, some of the operators in `multiprecision` are removed.
they are the symmetric form of the another variant. if they were
not removed, compiler would, for instance, find ambiguous
overloaded operator '=='.
this change is a cleanup to modernize the code base with C++20
features.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closes#13687
this is a part of a series to migrating from `operator<<(ostream&, ..)`
based formatting to fmtlib based formatting. the goal here is to enable
fmtlib to print `range_tombstone_list` and `range_tombstone_entry`
without the help of `operator<<`.
the corresponding `operator<<()` for `range_tombstone_entry` is moved
into test, where it is used. and the other one is dropped in this change,
as all its callers are now using fmtlib for formatting now.
Refs #13245
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closes#13627
this is a part of a series to migrating from `operator<<(ostream&, ..)`
based formatting to fmtlib based formatting. the goal here is to enable
fmtlib to print `apply_resume` without the help of `operator<<`.
the corresponding `operator<<()` are dropped dropped in this change,
as all its callers are now using fmtlib for formatting now.
Refs #13245
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closes#13584
this is a part of a series to migrating from `operator<<(ostream&, ..)`
based formatting to fmtlib based formatting. the goal here is to enable
fmtlib to print `tombstone` and `shadowable_tombstone` without the
help of `operator<<`.
in this change, only `operator<<(ostream&, const shadowable_tombstone&)`
is dropped, and all its callers are now using fmtlib for formatting the
instances of `shadowable_tombstone` now.
`operator<<(ostream&, const tombstone&)` is preserved. as it is still
used by Boost::test for printing the operands in case the comparing tests
fail.
please note, before this change we were using a concrete string
for indent. after this change, some of the places are changed to
using fmtlib for indent.
Refs scylladb#13245
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Every tracker insertion has to have a corresponding removal or eviction,
(otherwise the number of rows in the tracker will be misaccounted).
If we add the row to the tracker before adding it to the tree,
and the tree insertion fails (with bad_alloc), this contract will be violated.
Fix that.
Note: the problem is currently irrelevant because an exception during
sentinel insertion will abort the program anyway.
Closes#13336
this is a part of a series to migrating from `operator<<(ostream&, ..)`
based formatting to fmtlib based formatting. the goal here is to enable
fmtlib to print
- position_in_partition
- position_in_partition_view
- position_in_partition_view::printer
without the help of fmt::ostream. their `operator<<(ostream,..)` are
reimplemented using fmtlib accordingly to ease the review.
Refs #13245
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
this is a part of a series to migrating from `operator<<(ostream&, ..)`
based formatting to fmtlib based formatting. the goal here is to enable
fmtlib to print `partition_region` with the help of fmt::ostream.
to help with the review process, the corresponding `to_string()` is
dropped, and its callers now switch over to `fmt::to_string()` in
this change as well. to use `fmt::to_string()` helps with consolidating
all places to use fmtlib for printing/formatting.
Refs #13245
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
The purpose of `_stop` is to remember whether the consumption of the
last partition was interrupted or it was consumed fully. In the former
case, the compactor allows retreiving the compaction state for the given
partition, so that its compaction can be resumed at a later point in
time.
Currently, `_stop` is set to `stop_iteration::yes` whenever the return
value of any of the `consume()` methods is also `stop_iteration::yes`.
Meaning, if the consuming of the partition is interrupted, this is
remembered in `_stop`.
However, a partition whose consumption was interrupted is not always
continued later. Sometimes consumption of a partitions is interrputed
because the partition is not interesting and the downstream consumer
wants to stop it. In these cases the compactor should not return an
engagned optional from `detach_state()`, because there is not state to
detach, the state should be thrown away. This was incorrectly handled so
far and is fixed in this patch, but overwriting `_stop` in
`consume_partition_end()` with whatever the downstream consumer returns.
Meaning if they want to skip the partition, then `_stop` is reset to
`stop_partition::no` and `detach_state()` will return a disengaged
optional as it should in this case.
Fixes: #12629Closes#13365
this is a part of a series to migrating from `operator<<(ostream&, ..)`
based formatting to fmtlib based formatting. the goal here is to enable
fmtlib to print range_tombstone_change without using ostream<<.
Refs #13245
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
this is a part of a series to migrating from `operator<<(ostream&, ..)`
based formatting to fmtlib based formatting. the goal here is to enable
fmtlib to print range_tombstone.
Refs #13245
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
now that fmtlib provides fmt::join(). see
https://fmt.dev/latest/api.html#_CPPv4I0EN3fmt4joinE9join_viewIN6detail10iterator_tI5RangeEEN6detail10sentinel_tI5RangeEEERR5Range11string_view
there is not need to revent the wheel. so in this change, the homebrew
join() is replaced with fmt::join().
as fmt::join() returns an join_view(), this could improve the
performance under certain circumstances where the fully materialized
string is not needed.
please note, the goal of this change is to use fmt::join(), and this
change does not intend to improve the performance of existing
implementation based on "operator<<" unless the new implementation is
much more complicated. we will address the unnecessarily materialized
strings in a follow-up commit.
some noteworthy things related to this change:
* unlike the existing `join()`, `fmt::join()` returns a view. so we
have to materialize the view if what we expect is a `sstring`
* `fmt::format()` does not accept a view, so we cannot pass the
return value of `fmt::join()` to `fmt::format()`
* fmtlib does not format a typed pointer, i.e., it does not format,
for instance, a `const std::string*`. but operator<<() always print
a typed pointer. so if we want to format a typed pointer, we either
need to cast the pointer to `void*` or use `fmt::ptr()`.
* fmtlib is not able to pick up the overload of
`operator<<(std::ostream& os, const column_definition* cd)`, so we
have to use a wrapper class of `maybe_column_definition` for printing
a pointer to `column_definition`. since the overload is only used
by the two overloads of
`statement_restrictions::add_single_column_parition_key_restriction()`,
the operator<< for `const column_definition*` is dropped.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Said method can now throw `std::bad_alloc` since aab5954. All call-sites should have been adapted in the series introducing the throw, but some managed to slip through because the oom unit test didn't run in debug mode. This series fixes the remaining unpatched call-sites and makes sure the test runs in debug mode too, so leaks like this are detected.
Fixes: #12767Closes#12756
* github.com:scylladb/scylladb:
test/boost/reader_concurreny_semaphore_test: run oom protection tests in debug mode
treewide: adapt to throwing reader_concurrency_semaphore::consume()
they are part of the CQL type system, and are "closer" to types.
let's move them into "types" directory.
the building systems are updated accordingly.
the source files referencing `types.hh` were updated using following
command:
```
find . -name "*.{cc,hh}" -exec sed -i 's/\"types.hh\"/\"types\/types.hh\"/' {} +
```
the source files under sstables include "types.hh", which is
indeed the one located under "sstables", so include "sstables/types.hh"
instea, so it's more explicit.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closes#12926
Said method can now throw `std::bad_alloc` since aab5954. All call-sites
should have been adapted in the series introducing the throw, but some
managed to slip through because the oom unit test didn't run in debug
mode. In this commit the remaining unpatched call-sites are fixed.
these warnings are found by Clang-17 after removing
`-Wno-unused-lambda-capture` and '-Wno-unused-variable' from
the list of disabled warnings in `configure.py`.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Schema related files are moved there. This excludes schema files that
also interact with mutations, because the mutation module depends on
the schema. Those files will have to go into a separate module.
Closes#12858
Move mutation-related files to a new mutation/ directory. The names
are kept in the global namespace to reduce churn; the names are
unambiguous in any case.
mutation_reader remains in the readers/ module.
mutation_partition_v2.cc was missing from CMakeLists.txt; it's added in this
patch.
This is a step forward towards librarization or modularization of the
source base.
Closes#12788