The method in question suffers from scylladb/seastar#1298. The PR fixes it and makes a bit shorter along the way
Closes#13776
* github.com:scylladb/scylladb:
sstable: Close file at the end
sstables: Use read_entire_stream_cont() helper
One is unused, the other one is not really required in public
Closes#13771
* github.com:scylladb/scylladb:
file_writer: Remove static make() helper
sstable: Use toc_filename() to print TOC file path
The thing is than when closing file input stream the underlying file is
not .close()-d (see scylladb/seastar#1298). The remove_by_toc_name() is
buggy in this sense. Using with_closeable() fixes it and makes the code
shorter.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
The remove_by_toc_name() wants to read the whole stream into a sstring.
There's a convenience helper to facilitate that.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
In https://github.com/scylladb/scylladb/pull/13482 we renamed the reader permit states to more descriptive names. That PR however only covered only the states themselves and their usages, as well as the documentation in `docs/dev`.
This PR is a followup to said PR, completing the name changes: renaming all symbols, names, comments etc, so all is consistent and up-to-date.
Closes#13573
* github.com:scylladb/scylladb:
reader_concurrency_semaphore: misc updates w.r.t. recent permit state name changes
reader_concurrency_semaphore: update permit members w.r.t. recent permit state name changes
reader_concurrency_semaphore: update RAII state guard classes w.r.t. recent permit state name changes
reader_concurrency_semaphore: update API w.r.t. recent permit state name changes
reader_concurrency_semaphore: update stats w.r.t. recent permit state name changes
Current S3 client was tested over minio and it takes few more touches to work with amazon S3.
The main challenge here is to support singed requests. The AWS S3 server explicitly bans unsigned multipart-upload requests, which in turn is the essential part of the sstables S3 backend, so we do need signing. Signing a request has many options and requirements, one of them is -- request _body_ can be or can be not included into signature calculations. This is called "(un)signed payload". Requests sent over plain HTTP require payload signing (i.e. -- request body should be included into signature calculations), which can a bit troublesome, so instead the PR uses unsigned payload (i.e. -- doesn't include the request body into signature calculation, only necessary headers and query parameters), but thus also needs HTTPS.
So what this set does is makes the existing S3 client code sign requests. In order to sign the request the code needs to get AWS key and secret (and region) from somewhere and this somewhere is the conf/object_storage.yaml config file. The signature generating code was previously merged (moved from alternator code) and updated to suit S3 client needs.
In order to properly support HTTPS the PR adds special connection factory to be used with seastar http client. The factory makes DNS resolving of AWS endpoint names and configures gnutls systemtrust.
fixes: #13425Closes#13493
* github.com:scylladb/scylladb:
doc: Add a document describing how to configure S3 backend
s3/test: Add ability to run boost test over real s3
s3/client: Sign requests if configured
s3/client: Add connection factory with DNS resolve and configurable HTTPS
s3/client: Keep server port on config
s3/client: Construct it with config
s3/client: Construct it with sstring endpoint
sstables: Make s3_storage with endpoint config
sstables_manager: Keep object storage configs onboard
code: Introduce conf/object_storage.yaml configuration file
The sstable::write_toc() gets TOC filename from file writer, while it
can get it from itself. This makes the file_writer::get_filename()
private and actually improves logging, as the writer is not required
to have the filename onboard, while sstable always has it.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Commit ecbd112979
`distributed_loader: reshard: consider sstables for cleanup`
caused a regression in loading new sstables using the `upload`
directory, as seen in e.g. https://jenkins.scylladb.com/view/master/job/scylla-master/job/dtest-daily-release/230/testReport/migration_test/TestMigration/Run_Dtest_Parallel_Cloud_Machines___FullDtest___full_split000___test_migrate_sstable_without_compression_3_0_md_/
```
query = "SELECT COUNT(*) FROM cf"
statement = SimpleStatement(query)
s = self.patient_cql_connection(node, 'ks')
result = list(s.execute(statement))
> assert result[0].count == expected_number_of_rows, \
"Expected {} rows. Got {}".format(expected_number_of_rows, list(s.execute("SELECT * FROM ks.cf")))
E AssertionError: Expected 1 rows. Got []
E assert 0 == 1
E +0
E -1
```
The reason for the regression is that the call to `do_for_each_sstable`
in `collect_all_shared_sstables` to search for sstables that need
cleanup caused the list of sstables in the sstable directory to be
moved and cleared.
parallel_for_each_restricted moves the container passed to it
into a `do_with` continuation. This is required for
parallel_for_each_restricted.
However, moving the container is destructive and so,
the decision whether to move or not needs to be the
caller's, not the callee.
This patch changes the signature of parallel_for_each_restricted
to accept a lvalue reference to the container rather than a rvalue reference,
allowing the callers to decide whether to move or not.
Most callers are converted to move the container, as they effectively do
today, and a new method, `filter_sstables` was added for the
`collect_all_shared_sstables` us case, that allows the `func` that
processes each sstable to decide whether the sstable is kept
in `_unshared_local_sstables` or not.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Similar to previous patch -- extent the s3::client constructor to get
the endpoint config value next to the endpoint string. For now the
configs are likely empty, but they are yet unused too.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Currently the client is constructed with socket_address which's prepared
by the caller from the endpoint string. That's not flexible engouh,
because s3 client needs to know the original endpoint string for two
reasons.
First, it needs to lookup endpoint config for potential AWS creds.
Second, it needs this exact value as Host: header in its http requests.
So this patch just relaxes the client constructor to accept the endpoint
string and hard-code the 9000 port. The latter is temporary, this is how
local tests' minio is started, but next patch will make it configurable.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Continuation of the previous patch. The sstables::s3_storage gets the
endpoint config instance upon creation.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
The user sstables manager will need to provide endpoint config for
sstables' storage drivers. For that it needs to get it from db::config
and keep in-sync with its updates.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
in this series, we try to use `generation_type` as a proxy to hide the consumers from its underlying type. this paves the road to the UUID based generation identifier. as by then, we cannot assume the type of the `value()` without asking `generation_type` first. better off leaving all the formatting and conversions to the `generation_type`. also, this series changes the "generation" column of sstable registry table to "uuid", and convert the value of it to the original generation_type when necessary, this paves the road to a world with UUID based generation id.
Closes#13652
* github.com:scylladb/scylladb:
db: use uuid for the generation column in sstable registry table
db, sstable: add operator data_value() for generation_type
db, sstable: print generation instead of its value
* change the "generation" column of sstable registry table from
bigint to uuid
* from helper to convert UUID back to the original generation
in the long run, we encourage user to use uuid based generation
identifier. but in the transition period, both bigint based and uuid
based identifiers are used for the generation. so to cater both
needs, we use a hackish way to store the integer into UUID. to
differentiate the was-integer UUID from the geniune UUID, we
check the UUID's most_significant_bits. because we only support
serialize UUID v1, so if the timestamp in the UUID is zero,
we assume the UUID was generated from an integer when converting it
back to a generation identififer.
also, please note, the only use case of using generation as a
column is the sstable_registry table, but since its schema is fixed,
we cannot store both a bigint and a UUID as the value of its
`generation` column, the simpler way forward is to use a single type
for the generation. to be more efficient and to preserve the type of
the generation, instead of using types like ascii string or bytes,
we will always store the generation as a UUID in this table, if the
generation's identifier is a int64_t, the value of the integer will
be used as the least significant bits of the UUID.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
The s3_storage leaks client when sstable gets destoryed. So far this
came unnoticed, but debug-mode unit test ran over minio captured it. So
here's the fix.
When sstable is destroyed it also kicks the storage to do whatever
cleanup is needed. In case of s3 storage the cleanup is in closing the
on-boarded client. Until #13458 is fixed each sstable has its own
private version of the client and there's no other place where it can be
close()d in co_await-able mannter.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
No point in going through the vector<mutation> entry-point
just to discover in run time that it was called
with a single-element vector, when we know that
in advance.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Closes#13733
this change silences the warning of `-Wmissing-braces` from
clang. in general, we can initialize an object without constructor
with braces. this is called aggregate initialization.
but the standard does allow us to initialize each element using
either copy-initialization or direct-initialization. but in our case,
neither of them applies, so the clang warns like
```
suggest braces around initialization of subobject [-Werror,-Wmissing-braces]
options.elements.push_back({bytes(k.begin(), k.end()), bytes(v.begin(), v.end())});
^~~~~~~~~~~~~~~~~~~~~~~~~
{ }
```
in this change,
also, take the opportunity to use structured binding to simplify the
related code.
Closes#13705
* github.com:scylladb/scylladb:
build: reenable -Wmissing-braces
treewide: add braces around subobject
cql3/stats: use zero-initialization
so we can apply `execute_cql()` on `generation_type` directly without
extracting its value using `generation.value()`. this paves the road to
adding UUID based generation id to `generation_type`. as by then, we
will have both UUID based and integer based `generation_type`, so
`generation_type::value()` will not be able to represent its value
anymore. and this method will be replaced by `operator data_value()` in
this use case.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
this change prepares for the change to use `variant<UUID, int64_t>`
as the value of `generation_type`. as after this change, the "value"
of a generation would be a UUID or an integer, and we don't want to
expose the variant in generation's public interface. so the `value()`
method would be changed or removed by then.
this change takes advantage of the fact that the formatter of
`generation_type` always prints its value. also, it's better to
reuse `generation_type` formatter when appropriate.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
this change helps to silence the warning of `-Wmissing-braces` from
clang. in general, we can initialize an object without constructor
with braces. this is called aggregate initialization.
but the standard does allow us to initialize each element using
either copy-initialization or direct-initialization. but in our case,
neither of them applies, so the clang warns like
```
suggest braces around initialization of subobject [-Werror,-Wmissing-braces]
options.elements.push_back({bytes(k.begin(), k.end()), bytes(v.begin(), v.end())});
^~~~~~~~~~~~~~~~~~~~~~~~~
{ }
```
in this change,
also, take the opportunity to use structured binding to simplify the
related code.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
bytes_on_disk is the sum of all sstable components.
As read_simple() fetches the file size before parsing the component,
bytes_on_disk can be added incrementally rather than an additional
step after all components were already parsed.
Likewise, write_simple() tracks the offset for each new component,
and therefore bytes_on_disk can also be added incrementally.
This simplifies s3 life as it no longer have to care about feeding
a bytes_on_disk, which is currently limited to data and index
sizes only.
Refs #13649.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
With all parsing of simple components going through do_read_simple(),
common infrastructure can be reused (exception handling, debug logging,
etc), and also statistics spanning all components can be easily added.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
With all writing of simple components going through do_write_simple(),
common infrastructure can be reused (exception handling, debug logging,
etc), and also statistics spanning all components can be easily added.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
For S3, filter size is currently set to zero, as we want to avoid
"fstat-ing" each file.
On-disk representation of bloom filter is similar to the in-memory
one, therefore let's use memory footprint in filter_size().
User of filter_size() is API implementing "nodetool cfstats" and
it cares about the size of bloom filter data (that's how it's
described).
This way, we provide the filter data size regardless of the
underlying storage type.
Refs #13649.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
in C++20, compiler generate operator!=() if the corresponding
operator==() is already defined, the language now understands
that the comparison is symmetric in the new standard.
fortunately, our operator!=() is always equivalent to
`! operator==()`, this matches the behavior of the default
generated operator!=(). so, in this change, all `operator!=`
are removed.
in addition to the defaulted operator!=, C++20 also brings to us
the defaulted operator==() -- it is able to generated the
operator==() if the member-wise lexicographical comparison.
under some circumstances, this is exactly what we need. so,
in this change, if the operator==() is also implemented as
a lexicographical comparison of all memeber variables of the
class/struct in question, it is implemented using the default
generated one by removing its body and mark the function as
`default`. moreover, if the class happen to have other comparison
operators which are implemented using lexicographical comparison,
the default generated `operator<=>` is used in place of
the defaulted `operator==`.
sometimes, we fail to mark the operator== with the `const`
specifier, in this change, to fulfil the need of C++ standard,
and to be more correct, the `const` specifier is added.
also, to generate the defaulted operator==, the operand should
be `const class_name&`, but it is not always the case, in the
class of `version`, we use `version` as the parameter type, to
fulfill the need of the C++ standard, the parameter type is
changed to `const version&` instead. this does not change
the semantic of the comparison operator. and is a more idiomatic
way to pass non-trivial struct as function parameters.
please note, because in C++20, both operator= and operator<=> are
symmetric, some of the operators in `multiprecision` are removed.
they are the symmetric form of the another variant. if they were
not removed, compiler would, for instance, find ambiguous
overloaded operator '=='.
this change is a cleanup to modernize the code base with C++20
features.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closes#13687
When moving to the base directory, the printout currently looks broken:
```
INFO 2023-04-16 09:15:58,631 [shard 0] sstable - Moving sstable .../data/ks/cf-4c1bb670dc3711ed96733daf102e4aab/upload/md-1-big-Data.db to in ".../data/ks/cf-4c1bb670dc3711ed96733daf102e4aab/"
```
Since `path` already contains `to`, the message can be just simplified
and `to` need not be printed explicitly.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Closes#13525
this is the first step to the uuid-based generation identifier. the goal is to encapsulate the generation related logic in generator, so its consumers do not have to understand the difference between the int64_t based generation and UUID v1 based generation.
this commit should not change the behavior of existing scylla. it just allows us to derive from `generation_generator` so we can have another generator which generates UUID based generation identifier.
Closes#13073
* github.com:scylladb/scylladb:
replica, test: create generation id using generator
sstables: add generation_generator
test: sstables: use generate_n for generating ids for testing
to prepare for the uuid-based generation identifier, where we
will generate uuid-based generation idenfier if corresponding
option is enabled, otherwise an integer based id. to reduce the
repeatings, generation_generator is extracted out so it can be reused.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
this the standard library offers
`std::lexicographical_compare_threeway()`, and we never uses the
last two addition parameters which are not provided by
`std::lexicographical_compare_threeway()`. there is no need to have
the homebrew version of trichotomic compare function.
in this change,
* all occurrences of `lexicographical_tri_compare()` are replaced
with `std::lexicographical_compare_threeway()`.
* ``lexicographical_tri_compare()` is dropped.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closes#13615
C++20 compiler is able to generate defaulted operator== and
operator!=. and the default generated operators behaves exactly
the same as the ones crafted by us. so let's it do its job.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closes#13614
this is a part of a series to migrating from `operator<<(ostream&, ..)`
based formatting to fmtlib based formatting. the goal here is to enable
fmtlib to print `component_type` without the help of `operator<<`.
the corresponding `operator<<()` are dropped dropped in this change,
as all its callers are now using fmtlib for formatting now.
also, please note, to enable fmtlib to format `std::set<component_type>`
in `test/boost/sstable_3_x_test.cc` , we need to include
`<fmt/ranges.h>` in that source file.
Refs #13245
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closes#13598
Currently, the reader might dereference a null pointer
if the input stream reaches eof prematurely,
and read_exactly returns an empty temporary_buffer.
Detect this condition before dereferencing the buffer
and sstables::malformed_sstable_exception.
Fixes#13599
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Closes#13600
this is a part of a series to migrating from `operator<<(ostream&, ..)` based formatting to fmtlib based formatting. the goal here is to enable fmtlib to print following classes without the help of `operator<<`.
- partition_key_view
- partition_key
- partition_key::with_schema_wrapper
- key_with_schema
- clustering_key_prefix
- clustering_key_prefix::with_schema_wrapper
the corresponding `operator<<()` are dropped dropped in this change,
as all its callers are now using fmtlib for formatting now. the helper
of `print_key()` is removed, as its only caller is
`operator<<(std::ostream&, const
clustering_key_prefix::with_schema_wrapper&)`.
the reason why all these operators are replaced in one go is that
we have a template function of `key_to_str()` in `db/large_data_handler.cc`.
this template function is actually the caller of operator<< of
`partition_key::with_schema_wrapper` and
`clustering_key_prefix::with_schema_wrapper`.
so, in order to drop either of these two operator<<, we need to remove
both of them, so that we can switch over to `fmt::to_string()` in this
template function.
Refs scylladb#13245
Closes#13513
* github.com:scylladb/scylladb:
keys: consolidate the formatter for partition_keys
keys: specialize fmt::formatter<partition_key> and friends
enable_optimized_twcs_queries is specific to TWCS, therefore it
belongs to TWCS, not replica::table.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Closes#13489
this is a part of a series to migrating from `operator<<(ostream&, ..)`
based formatting to fmtlib based formatting. the goal here is to enable
fmtlib to print following classes without the help of `operator<<`.
- partition_key_view
- partition_key
- partition_key::with_schema_wrapper
- key_with_schema
- clustering_key_prefix
- clustering_key_prefix::with_schema_wrapper
the corresponding `operator<<()` are dropped dropped in this change,
as all its callers are now using fmtlib for formatting now. the helper
of `print_key()` is removed, as its only caller is
`operator<<(std::ostream&, const
clustering_key_prefix::with_schema_wrapper&)`.
the reason why all these operators are replaced in one go is that
we have a template function of `key_to_str()` in `db/large_data_handler.cc`.
this template function is actually the caller of operator<< of
`partition_key::with_schema_wrapper` and
`clustering_key_prefix::with_schema_wrapper`.
so, in order to drop either of these two operator<<, we need to remove
both of them, so that we can switch over to `fmt::to_string()` in this
template function.
Refs scylladb#13245
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
If bloom filter is not loaded, it means that an always-present filter
is used, which translates into the SSTable being opened on every
single read.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
This series extends sstable cleanup to resharding and other (offstrategy, major, and regular) compaction types so to:
* cleanup uploaded sstables (#11933)
* cleanup staging sstables after they are moved back to the main directory and become eligible for compaction (#9559)
When perform_cleanup is called, all sstables are scanned, and those that require cleanup are marked as such, and are added for tracking to table_state::cleanup_sstable_set. They are removed from that set once released by compaction.
Along with that sstables set, we keep the owned_ranges_ptr used by cleanup in the table_state to allow other compaction types (offstrategy, major, or regular) to cleanup those sstables that are marked as require_cleanup and that were skipped by cleanup compaction for either being in the maintenance set (requiring offstrategy compaction) or in staging.
Resharding is using a more straightforward mechanism of passing the owned token ranges when resharding uploaded sstables and using it to detect sstable that require cleanup, now done as piggybacked on resharding compaction.
Closes#12422
* github.com:scylladb/scylladb:
table: discard_sstables: update_sstable_cleanup_state when deleting sstables
compaction_manager: compact_sstables: retrieve owned ranges if required
sstables: add a printer for shared_sstable
compaction_manager: keep owned_ranges_ptr in compaction_state
compaction_manager: perform_cleanup: keep sstables in compaction_state::sstables_requiring_cleanup
compaction: refactor compaction_state out of compaction_manager
compaction: refactor compaction_fwd.hh out of compaction_descriptor.hh
compaction_manager: compacting_sstable_registration: keep a ref to the compaction_state
compaction_manager: refactor get_candidates
compaction_manager: get_candidates: mark as const
table, compaction_manager: add requires_cleanup
sstable_set: add for_each_sstable_until
distributed_loader: reshard: update sstable cleanup state
table, compaction_manager: add update_sstable_cleanup_state
compaction_manager: needs_cleanup: delete unused schema param
compaction_manager: perform_cleanup: disallow empty sorted_owened_ranges
distributed_loader: reshard: consider sstables for cleanup
distributed_loader: process_upload_dir: pass owned_ranges_ptr to reshard
distributed_loader: reshard: add optional owned_ranges_ptr param
distributed_loader: reshard: get a ref to table_state
distributed_loader: reshard: capture creator by ref
distributed_loader: reshard: reserve num_jobs buckets
compaction: move owned ranges filtering to base class
compaction: move owned_ranges into descriptor