It used to be used by `compression_parameters` validation logic
to ask the created `compressor` for compressor-specific option names.
Since we no longer delegate this to `compressor`, but we just
put the knowledge of those options directly into
`compressor_parameters`, it's dead code now.
The class in question is a wrapper around output_stream that writes,
flushes and closes the stream in async context. For logging it also
keeps the component filename on board, and now it's good time to patch
it and keep the component_filename instead.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
now that we are allowed to use C++23. we now have the luxury of using
`std::views::transform`.
in this change, we:
- replace `boost::adaptors::transformed` with `std::views::transform`
- use `fmt::join()` when appropriate where `boost::algorithm::join()`
is not applicable to a range view returned by `std::view::transform`.
- use `std::ranges::fold_left()` to accumulate the range returned by
`std::view::transform`
- use `std::ranges::fold_left()` to get the maximum element in the
range returned by `std::view::transform`
- use `std::ranges::min()` to get the minimal element in the range
returned by `std::view::transform`
- use `std::ranges::equal()` to compare the range views returned
by `std::view::transform`
- remove unused `#include <boost/range/adaptor/transformed.hpp>`
- use `std::ranges::subrange()` instead of `boost::make_iterator_range()`,
to feed `std::views::transform()` a view range.
to reduce the dependency to boost for better maintainability, and
leverage standard library features for better long-term support.
this change is part of our ongoing effort to modernize our codebase
and reduce external dependencies where possible.
limitations:
there are still a couple places where we are still using
`boost::adaptors::transformed` due to the lack of a C++23 alternative
for `boost::join()` and `boost::adaptors::uniqued`.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closesscylladb/scylladb#21700
To reduce the dependency load, replace use of boost ranges
with the std equivalent.
Files that lost the indirect boost dependency have it added as a
direct dependency.
Uncompressed SSTables store their checksums in a separate CRC.db file.
Add this in the list of SSTable components.
Since this component is used only for validation, load the component
on-demand for validation tasks and delete it when all validation tasks
finish. In more detail:
- Make the checksum component shareable and weakly referencable.
Also, add a constructor since it is no longer an aggregate.
- Use a weak pointer to store a non-owning reference in the components
and a shared pointer to keep the object alive while validation runs.
Once validation finishes, the component should be cleaned up
automatically.
Signed-off-by: Nikos Dragazis <nikolaos.dragazis@scylladb.com>
`sstables::write()` has multiple overloads, which are defined in
`sstables/writer.hh`. two of these overloads are template functions,
which have a template parameter named `W`, which has a type constraint
requiring it to fulfill the `Writer` concept. but in `types.hh`, when
the compiler tries to instantiate the template function with signature
of `write(sstable_version_types v, W& out, const T& t)` with
`file_writer` as the template parameter of `w`, `file_writer` is only
forward-declared using `class file_writer` in the same header file, so
this type is still an incomplete type at that moment. that's why the
compiler is not able to determine if `file_writer` fulfills the
constraint or not. actually, the declaration of `file_writer` is located
in `sstables/writer.hh`, which in turn includes `types.hh`. so they
form a cyclic dependency.
in this change, in order to break this cycle, we extract file_writer out
into a separate header file, so that both `sstables/writer.hh` and
`sstables/types.hh` can include it. this address the build failure.
Fixesscylladb/scylladb#19667
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closesscylladb/scylladb#19669
before this change, `checksummed_file_data_sink_impl` just inherits the
`data_sink_impl::flush()` from its parent class. but as a wrapper around
the underlying `_out` data_sink, this is not only an unusual design
decision in a layered design of an I/O system, but also could be
problematic. to be more specific, the typical user of `data_sink_impl`
is a `data_sink`, whose `flush()` member function is called when
the user of `data_sink` want to ensure that the data sent to the sink
is pushed to the underlying storage / channel.
this in general works, as the typical user of `data_sink` is in turn
`output_stream`, which calls `data_sink.flush()` before closing the
`data_sink` with `data_sink.close()`. and the operating system will
eventually flush the data after application closes the corresponding
fd. to be more specific, almost none of the popular local filesystem
implements the file_operations.op, hence, it's safe even if the
`output_stream` does not flush the underlying data_sink after writing
to it. this is the use case when we write to sstables stored on local
filesystem. but as explained above, if the data_sink is backed by a
network filesystem, a layered filesystem or a storage connected via
a buffered network device, then it is crucial to flush in a timely
manner, otherwise we could risk data lost if the application / machine /
network breaks when the data is considerered persisted but they are
_not_!
but the `data_sink` returned by `client::make_upload_jumbo_sink` is
a little bit different. multipart upload is used under the hood, and
we have to finalize the upload once all the parts are uploaded by
calling `close()`. but if the caller fails / chooses to close the
sink before flushing it, the upload is aborted, and the partially
uploaded parts are deleted.
the default-implemented `checksummed_file_data_sink_impl::flush()`
breaks `upload_jumbo_sink` which is the `_out` data_sink being
wrapped by `checksummed_file_data_sink_impl`. as the `flush()`
calls are shortcircuited by the wrapper, the `close()` call
always aborts the upload. that's why the data and index components
just fail to upload with the S3 backend.
in this change, we just delegate the `flush()` call to the
wrapped class.
Fixes#15079
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closes#15134
The sstable::write_toc() gets TOC filename from file writer, while it
can get it from itself. This makes the file_writer::get_filename()
private and actually improves logging, as the writer is not required
to have the filename onboard, while sstable always has it.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Writing into sstable component output stream should be done with care.
In particular -- flushing can happen only once right before closing the
stream. Flushing the stream in between several writes is not going to
work, because file stream would step on unaligned IO and S3 upload
stream would send completion message to the server and would lose any
subsequent write.
Having said that, it's better to remove the flush() ability from the
component writer not to tempt the developers.
refs: #13320
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
they are part of the CQL type system, and are "closer" to types.
let's move them into "types" directory.
the building systems are updated accordingly.
the source files referencing `types.hh` were updated using following
command:
```
find . -name "*.{cc,hh}" -exec sed -i 's/\"types.hh\"/\"types\/types.hh\"/' {} +
```
the source files under sstables include "types.hh", which is
indeed the one located under "sstables", so include "sstables/types.hh"
instea, so it's more explicit.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closes#12926
Now that scylla requries c++20 there's no
need to define our own implementation in utils/bit_cast.hh
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Compaction ancestors is only available in versions older than mx,
therefore we can make it optional in seal_statistics(). The motivation
is that mx writer will no longer call sstable::compaction_ancestors()
which return type will be soon changed to type generation_type, so the
returned value can be something other than an integer, e.g. uuid.
We could kill compaction_ancestors in seal_statistics interface, but
given that most generic write functions still work for older versions,
if there were still a writer for them, I decided to not do it now.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
The bytes_stream param is passed by value from `write_promoted_index`
(since 0d8463aba5)
causing an uneeded copy.
This can lead to OOM if the promoted index is extremely large.
Pass the bytes_ostream by reference instead to prevent this copy.
Fixes#10569
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Closes#10570
(that is, instances of `std::optional`).
The ME sstable format includes optional originating host id in stats
metadata. We know how to write and parse uuids, but not how to write
and parse optionals.
The format is (used by C* in this case, and also happens to be
consistent with how booleans are serialized): first a boolean
indicating whether the contents are present (0 or 1, as a byte), then
the contents (if any).
Signed-off-by: Michael Livshin <michael.livshin@scylladb.com>
Instead of lengthy blurbs, switch to single-line, machine-readable
standardized (https://spdx.dev) license identifiers. The Linux kernel
switched long ago, so there is strong precedent.
Three cases are handled: AGPL-only, Apache-only, and dual licensed.
For the latter case, I chose (AGPL-3.0-or-later and Apache-2.0),
reasoning that our changes are extensive enough to apply our license.
The changes we applied mechanically with a script, except to
licenses/README.md.
Closes#9937
This warning can catch a virtual function that thinks it
overrides another, but doesn't, because the two functions
have different signatures. This isn't very likely since most
of our virtual functions override pure virtuals, but it's
still worth having.
Enable the warning and fix numerous violations.
Closes#9347
The checksum sink carries another sink on board and forwards
the put buffers lower, so there's no point in making these
two have different buffer sizes. This is what really happens
now, but this change makes this more explicit and makes the
checksumming code conform to the new output stream API.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
We have an integral and a non-integral overload, each constrained
with enable_if. We use std::integral to constrain the integral
overload and leave the other unconstrained, as C++ will choose the
more constrained version when applicable.
There are three variants: integral, enum, and self-describing
(currently expressed as not integral and not enum). Convert to
concepts by using the standard concepts or the new self_describing
concept.
In some places we use the `*reinterpret_cast<const net::packed<T>*>(&x)`
pattern to reinterpret memory. This is a violation of C++'s aliasing rules,
which invokes undefined behaviour.
The blessed way to correctly reinterpret memory is to copy it into a new
object. Let's do that.
Note: the reinterpret_cast way has no performance advantage. Compilers
recognize the memory copy pattern and optimize it away.
Commit aab6b0ee27 introduced the
controversial new IMR format, which relied on a very template-heavy
infrastructure to generate serialization and deserialization code via
template meta-programming. The promise was that this new format, beyond
solving the problems the previous open-coded representation had (working
on linearized buffers), will speed up migrating other components to this
IMR format, as the IMR infrastructure reduces code bloat, makes the code
more readable via declarative type descriptions as well as safer.
However, the results were almost the opposite. The template
meta-programming used by the IMR infrastructure proved very hard to
understand. Developers don't want to read or modify it. Maintainers
don't want to see it being used anywhere else. In short, nobody wants to
touch it.
This commit does a conceptual revert of
aab6b0ee27. A verbatim revert is not
possible because related code evolved a lot since the merge. Also, going
back to the previous code would mean we regress as we'd revert the move
to fragmented buffers. So this revert is only conceptual, it changes the
underlying infrastructure back to the previous open-coded one, but keeps
the fragmented buffers, as well as the interface of the related
components (to the extent possible).
Fixes: #5578
Compaction needs access to the sstable's ancestors so we need to
keep the ancestors for the sstable separately from the metadata collector
as the latter is about to be moved to the sstable writer.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Correct counter order is supported for over 2 years and upgrades are only
allowed from versions which already have the support, so the checks
are hereby dropped.
With relatively big summaries, reactor can be stalled for a couple
of milliseconds.
This patch:
a. allocates positions upfront to avoid excessive reallocation.
b. returns a future from seal_summary() and uses `seastar::do_for_each`
to iterate over the summary entries so the loop can yield if necessary.
Fixes#7108.
Based on 2470aad5a389dfd32621737d2c17c7e319437692 by Raphael S. Carvalho <raphaelsc@scylladb.com>
Test: unit(dev)
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Message-Id: <20200826091337.28530-1-bhalevy@scylladb.com>
"
This series adds support for the "md" sstable format.
Support is based on the following:
* do not use clustering based filtering in the presence
of static row, tombstones.
* Disabling min/max column names in the metadata for
formats older than "md".
* When updating the metadata, reset and disable min/max
in the presence of range tombstones (like Cassandra does
and until we process them accurately).
* Fix the way we maintain min/max column names by:
keeping whole clustering key prefixes as min/max
rather than calculating min/max independently for
each component, like Cassandra does in the "md" format.
Fixes#4442
Tests: unit(dev), cql_query_test -t test_clustering_filtering* (debug)
md migration_test dtest from git@github.com:bhalevy/scylla-dtest.git migration_test-md-v1
"
* tag 'md-format-v4' of github.com:bhalevy/scylla: (27 commits)
config: enable_sstables_md_format by default
test: cql_query_test: add test_clustering_filtering unit tests
table: filter_sstable_for_reader: allow clustering filtering md-format sstables
table: create_single_key_sstable_reader: emit partition_start/end for empty filtered results
table: filter_sstable_for_reader: adjust to md-format
table: filter_sstable_for_reader: include non-scylla sstables with tombstones
table: filter_sstable_for_reader: do not filter if static column is requested
table: filter_sstable_for_reader: refactor clustering filtering conditional expression
features: add MD_SSTABLE_FORMAT cluster feature
config: add enable_sstables_md_format
database: add set_format_by_config
test: sstable_3_x_test: test both mc and md versions
test: Add support for the "md" format
sstables: mx/writer: use version from sstable for write calls
sstables: mx/writer: update_min_max_components for partition tombstone
sstables: metadata_collector: support min_max_components for range tombstones
sstable: validate_min_max_metadata: drop outdated logic
sstables: rename mc folder to mx
sstables: may_contain_rows: always true for old formats
sstables: add may_contain_rows
...
Rather than using a constant sstable_version_types::mc.
In preparation to supporting sstable_version_types::md.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Otherwise we may trip the following
assert(_closing_state == state::closed);
in ~append_challenged_posix_file_impl
when the output_stream is destructed.
Example stack trace:
non-virtual thunk to seastar::append_challenged_posix_file_impl::~append_challenged_posix_file_impl() at /jenkins/slave/workspace/scylla-3.2/build/scylla/seastar/include/seastar/core/future.hh:944
seastar::shared_ptr_count_for<checked_file_impl>::~shared_ptr_count_for() at crtstuff.c:?
seastar::shared_ptr<seastar::file_impl>::~shared_ptr() at /jenkins/slave/workspace/scylla-3.2/build/scylla/seastar/include/seastar/core/future.hh:944
(inlined by) seastar::file::~file() at /jenkins/slave/workspace/scylla-3.2/build/scylla/seastar/include/seastar/core/file.hh:155
(inlined by) seastar::file_data_sink_impl::~file_data_sink_impl() at /jenkins/slave/workspace/scylla-3.2/build/scylla/seastar/src/core/fstream.cc:312
(inlined by) seastar::file_data_sink_impl::~file_data_sink_impl() at /jenkins/slave/workspace/scylla-3.2/build/scylla/seastar/src/core/fstream.cc:312
seastar::output_stream<char>::~output_stream() at crtstuff.c:?
sstables::sstable::write_crc(sstables::checksum const&) [clone .cold] at sstables.cc:?
sstables::mc::writer::close_data_writer() at crtstuff.c:?
sstables::mc::writer::consume_end_of_stream() at crtstuff.c:?
sstables::sstable::write_components(flat_mutation_reader, unsigned long, seastar::lw_shared_ptr<schema const>, sstables::sstable_writer_config const&, encoding_stats, seastar::io_priority_class const&)::{lambda()#1}::operator()() at sstables.cc:?
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Unify common code for file creation and file_writer construction
for sstable components.
It is defined as noexcept based on `new_sstable_component_file`
and makes sure the file is closed on error by using `file_writer::make`
that guarantees that.
Will be used for auto-closing the writer as a RAII object.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
checksummed_file_writer cannot be moved, so we can't have a
checksummed_file_writer::make that returns a future. So instead we
pass in a data_sink and let the callers call make_file_data_sink.
This is in preparation for make_file_data_sink returning a future in
the seastar api v3.
Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
For now it always returns a ready future. This is in preparation for
using seastar v3 api where make_file_output_stream returns a future.
Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
Seastar recently lost support for the experimental Concepts Technical
Specification (TS) and gained support for C++20 concepts. Re-enable
concepts in Scylla by updating our use of concepts to the C++20
standard.
This change:
- peels off uses of the GCC6_CONCEPT macro
- removes inclusions of <seastar/gcc6-concepts.hh>
- replaces function-style concepts (no longer supported) with
equation-style concepts
- semicolons added and removed as needed
- deprecated std::is_pod replaced by recommended replacement
- updates return type constraints to use concepts instead of
type names (either std::same_as or std::convertible_to, with
std::same_as chosen when possible)
No attempt is made to improve the concepts; this is a specification
update only.
Message-Id: <20200531110254.2555854-1-avi@scylladb.com>
The enable_sstable_key_validation and summary_bytes_cost are used
in sstables writing code, keeping them on sstable_writer_config
removes more calls to global get_config().
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>