scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-04-26 19:35:12 +00:00

Author	SHA1	Message	Date
Botond Dénes	b704698ba5	Merge 'Close toc file in remove_by_toc_name()' from Pavel Emelyanov The method in question suffers from scylladb/seastar#1298. The PR fixes it and makes it a bit shorter along the way Closes #13776 * github.com:scylladb/scylladb: sstable: Close file at the end sstables: Use read_entire_stream_cont() helper	2023-05-05 11:33:05 +03:00
Botond Dénes	0cccf9f1cc	Merge 'Remove some file_writer public methods' from Pavel Emelyanov One is unused, the other one is not really required in public Closes #13771 * github.com:scylladb/scylladb: file_writer: Remove static make() helper sstable: Use toc_filename() to print TOC file path	2023-05-05 10:48:46 +03:00
Raphael S. Carvalho	1f69c46889	sstables: use version_types received from parser or writer This is only a cosmetic change, no change in semantics Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Closes #13779	2023-05-05 10:32:14 +03:00
Pavel Emelyanov	75e7187e1a	sstable: Close file at the end The thing is that when closing a file input stream the underlying file is not .close()-d (see scylladb/seastar#1298). The remove_by_toc_name() is buggy in this sense. Using with_closeable() fixes it and makes the code shorter. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-05-04 20:37:48 +03:00
Pavel Emelyanov	334383beb5	sstables: Use read_entire_stream_cont() helper The remove_by_toc_name() wants to read the whole stream into a sstring. There's a convenience helper to facilitate that. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-05-04 20:37:09 +03:00
Avi Kivity	f125a3e315	Merge 'tree: finish the reader_permit state renames' from Botond Dénes In https://github.com/scylladb/scylladb/pull/13482 we renamed the reader permit states to more descriptive names. That PR however covered only the states themselves and their usages, as well as the documentation in `docs/dev`. This PR is a followup to said PR, completing the name changes: renaming all symbols, names, comments etc, so all is consistent and up-to-date. Closes #13573 * github.com:scylladb/scylladb: reader_concurrency_semaphore: misc updates w.r.t. recent permit state name changes reader_concurrency_semaphore: update permit members w.r.t. recent permit state name changes reader_concurrency_semaphore: update RAII state guard classes w.r.t. recent permit state name changes reader_concurrency_semaphore: update API w.r.t. recent permit state name changes reader_concurrency_semaphore: update stats w.r.t. recent permit state name changes	2023-05-04 18:29:04 +03:00
Avi Kivity	1d351dde06	Merge 'Make S3 client work with real S3' from Pavel Emelyanov Current S3 client was tested over minio and it takes a few more touches to work with amazon S3. The main challenge here is to support signed requests. The AWS S3 server explicitly bans unsigned multipart-upload requests, which in turn is the essential part of the sstables S3 backend, so we do need signing. Signing a request has many options and requirements, one of them is -- request _body_ may or may not be included into signature calculations. This is called "(un)signed payload". Requests sent over plain HTTP require payload signing (i.e. -- request body should be included into signature calculations), which can be a bit troublesome, so instead the PR uses unsigned payload (i.e. -- doesn't include the request body into signature calculation, only necessary headers and query parameters), but thus also needs HTTPS. So what this set does is make the existing S3 client code sign requests. In order to sign the request the code needs to get AWS key and secret (and region) from somewhere and this somewhere is the conf/object_storage.yaml config file. The signature generating code was previously merged (moved from alternator code) and updated to suit S3 client needs. In order to properly support HTTPS the PR adds special connection factory to be used with seastar http client. The factory makes DNS resolving of AWS endpoint names and configures gnutls systemtrust.
fixes: #13425 Closes #13493 * github.com:scylladb/scylladb: doc: Add a document describing how to configure S3 backend s3/test: Add ability to run boost test over real s3 s3/client: Sign requests if configured s3/client: Add connection factory with DNS resolve and configurable HTTPS s3/client: Keep server port on config s3/client: Construct it with config s3/client: Construct it with sstring endpoint sstables: Make s3_storage with endpoint config sstables_manager: Keep object storage configs onboard code: Introduce conf/object_storage.yaml configuration file	2023-05-04 18:08:54 +03:00
Pavel Emelyanov	c4394a059c	file_writer: Remove static make() helper It's simply unused Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-05-04 16:55:41 +03:00
Pavel Emelyanov	eaf534cc4b	sstable: Use toc_filename() to print TOC file path The sstable::write_toc() gets TOC filename from file writer, while it can get it from itself. This makes the file_writer::get_filename() private and actually improves logging, as the writer is not required to have the filename onboard, while sstable always has it. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-05-04 16:54:21 +03:00
Benny Halevy	205daf49fd	sstable_directory: coroutinize parallel_for_each_restricted Using a coroutine simplifies the function and reduces the number of moves it performs. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-05-04 11:46:59 +03:00
Benny Halevy	e4acc44814	sstable_directory: parallel_for_each_restricted: use std::ranges for template definition We'd like the container to be a std::ranges::range. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-05-04 11:44:24 +03:00
Benny Halevy	e2023877f2	sstable_directory: parallel_for_each_restricted: do not move container Commit `ecbd112979` `distributed_loader: reshard: consider sstables for cleanup` caused a regression in loading new sstables using the `upload` directory, as seen in e.g. https://jenkins.scylladb.com/view/master/job/scylla-master/job/dtest-daily-release/230/testReport/migration_test/TestMigration/Run_Dtest_Parallel_Cloud_Machines___FullDtest___full_split000___test_migrate_sstable_without_compression_3_0_md_/ ``` query = "SELECT COUNT() FROM cf" statement = SimpleStatement(query) s = self.patient_cql_connection(node, 'ks') result = list(s.execute(statement)) > assert result[0].count == expected_number_of_rows, \ "Expected {} rows. Got {}".format(expected_number_of_rows, list(s.execute("SELECT FROM ks.cf"))) E AssertionError: Expected 1 rows. Got [] E assert 0 == 1 E +0 E -1 ``` The reason for the regression is that the call to `do_for_each_sstable` in `collect_all_shared_sstables` to search for sstables that need cleanup caused the list of sstables in the sstable directory to be moved and cleared. parallel_for_each_restricted moves the container passed to it into a `do_with` continuation. This is required for parallel_for_each_restricted. However, moving the container is destructive and so, the decision whether to move or not needs to be the caller's, not the callee's. This patch changes the signature of parallel_for_each_restricted to accept an lvalue reference to the container rather than an rvalue reference, allowing the callers to decide whether to move or not. Most callers are converted to move the container, as they effectively do today, and a new method, `filter_sstables` was added for the `collect_all_shared_sstables` use case, that allows the `func` that processes each sstable to decide whether the sstable is kept in `_unshared_local_sstables` or not. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-05-04 11:36:25 +03:00
Pavel Emelyanov	85f06ca556	s3/client: Construct it with config Similar to previous patch -- extend the s3::client constructor to get the endpoint config value next to the endpoint string. For now the configs are likely empty, but they are yet unused too. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-05-03 20:19:43 +03:00
Pavel Emelyanov	caf9e357c8	s3/client: Construct it with sstring endpoint Currently the client is constructed with socket_address which is prepared by the caller from the endpoint string. That's not flexible enough, because s3 client needs to know the original endpoint string for two reasons. First, it needs to lookup endpoint config for potential AWS creds. Second, it needs this exact value as Host: header in its http requests. So this patch just relaxes the client constructor to accept the endpoint string and hard-code the 9000 port. The latter is temporary, this is how local tests' minio is started, but next patch will make it configurable. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-05-03 20:19:43 +03:00
Pavel Emelyanov	711514096a	sstables: Make s3_storage with endpoint config Continuation of the previous patch. The sstables::s3_storage gets the endpoint config instance upon creation. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-05-03 20:19:43 +03:00
Pavel Emelyanov	bd1e3c688f	sstables_manager: Keep object storage configs onboard The user sstables manager will need to provide endpoint config for sstables' storage drivers. For that it needs to get it from db::config and keep in-sync with its updates. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-05-03 20:19:43 +03:00
Botond Dénes	48b9f31a08	Merge 'db, sstable: use generation_type instead of its value when appropriate' from Kefu Chai in this series, we try to use `generation_type` as a proxy to hide the consumers from its underlying type. this paves the road to the UUID based generation identifier. as by then, we cannot assume the type of the `value()` without asking `generation_type` first. better off leaving all the formatting and conversions to the `generation_type`. also, this series changes the "generation" column of sstable registry table to "uuid", and convert the value of it to the original generation_type when necessary, this paves the road to a world with UUID based generation id. Closes #13652 * github.com:scylladb/scylladb: db: use uuid for the generation column in sstable registry table db, sstable: add operator data_value() for generation_type db, sstable: print generation instead of its value	2023-05-03 09:04:54 +03:00
Kefu Chai	74e9e6dd1a	db: use uuid for the generation column in sstable registry table * change the "generation" column of sstable registry table from bigint to uuid * from helper to convert UUID back to the original generation in the long run, we encourage user to use uuid based generation identifier. but in the transition period, both bigint based and uuid based identifiers are used for the generation. so to cater to both needs, we use a hackish way to store the integer into UUID. to differentiate the was-integer UUID from the genuine UUID, we check the UUID's most_significant_bits. because we only support serializing UUID v1, so if the timestamp in the UUID is zero, we assume the UUID was generated from an integer when converting it back to a generation identifier. also, please note, the only use case of using generation as a column is the sstable_registry table, but since its schema is fixed, we cannot store both a bigint and a UUID as the value of its `generation` column, the simpler way forward is to use a single type for the generation. to be more efficient and to preserve the type of the generation, instead of using types like ascii string or bytes, we will always store the generation as a UUID in this table, if the generation's identifier is an int64_t, the value of the integer will be used as the least significant bits of the UUID. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-05-02 19:23:22 +08:00
Pavel Emelyanov	81a1416ebf	sstables: Add and call storage::destroy() The s3_storage leaks client when sstable gets destroyed. So far this went unnoticed, but debug-mode unit test ran over minio captured it. So here's the fix. When sstable is destroyed it also kicks the storage to do whatever cleanup is needed. In case of s3 storage the cleanup is in closing the on-boarded client. Until #13458 is fixed each sstable has its own private version of the client and there's no other place where it can be close()d in co_await-able manner. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-05-02 11:15:44 +03:00
Pavel Emelyanov	3e0c3346a8	sstables: Coroutinize sstable::destroy() To simplify patching by next patch Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-05-02 11:15:11 +03:00
Benny Halevy	707bd17858	everywhere: optimize calls to make_flat_mutation_reader_from_mutations_v2 with single mutation No point in going through the vector<mutation> entry-point just to discover in run time that it was called with a single-element vector, when we know that in advance. Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Closes #13733	2023-05-02 07:58:34 +03:00
Botond Dénes	f527b28174	Merge 'treewide: reenable -Wmissing-braces' from Kefu Chai this change silences the warning of `-Wmissing-braces` from clang. in general, we can initialize an object without constructor with braces. this is called aggregate initialization. but the standard does allow us to initialize each element using either copy-initialization or direct-initialization. but in our case, neither of them applies, so the clang warns like ``` suggest braces around initialization of subobject [-Werror,-Wmissing-braces] options.elements.push_back({bytes(k.begin(), k.end()), bytes(v.begin(), v.end())}); ^~~~~~~~~~~~~~~~~~~~~~~~~ { } ``` in this change, also, take the opportunity to use structured binding to simplify the related code. Closes #13705 * github.com:scylladb/scylladb: build: reenable -Wmissing-braces treewide: add braces around subobject cql3/stats: use zero-initialization	2023-04-28 16:00:14 +03:00
Kefu Chai	ba8402067f	db, sstable: add operator data_value() for generation_type so we can apply `execute_cql()` on `generation_type` directly without extracting its value using `generation.value()`. this paves the road to adding UUID based generation id to `generation_type`. as by then, we will have both UUID based and integer based `generation_type`, so `generation_type::value()` will not be able to represent its value anymore. and this method will be replaced by `operator data_value()` in this use case. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-04-28 20:39:12 +08:00
Kefu Chai	ae9aa9c4bd	db, sstable: print generation instead of its value this change prepares for the change to use `variant<UUID, int64_t>` as the value of `generation_type`. as after this change, the "value" of a generation would be a UUID or an integer, and we don't want to expose the variant in generation's public interface. so the `value()` method would be changed or removed by then. this change takes advantage of the fact that the formatter of `generation_type` always prints its value. also, it's better to reuse `generation_type` formatter when appropriate. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-04-28 20:39:12 +08:00
Kefu Chai	eb7c41767b	treewide: add braces around subobject this change helps to silence the warning of `-Wmissing-braces` from clang. in general, we can initialize an object without constructor with braces. this is called aggregate initialization. but the standard does allow us to initialize each element using either copy-initialization or direct-initialization. but in our case, neither of them applies, so the clang warns like ``` suggest braces around initialization of subobject [-Werror,-Wmissing-braces] options.elements.push_back({bytes(k.begin(), k.end()), bytes(v.begin(), v.end())}); ^~~~~~~~~~~~~~~~~~~~~~~~~ { } ``` in this change, also, take the opportunity to use structured binding to simplify the related code. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-04-28 16:59:29 +08:00
Raphael S. Carvalho	2dbae856f8	sstable: Piggyback on sstable parser and writer to provide bytes_on_disk bytes_on_disk is the sum of all sstable components. As read_simple() fetches the file size before parsing the component, bytes_on_disk can be added incrementally rather than an additional step after all components were already parsed. Likewise, write_simple() tracks the offset for each new component, and therefore bytes_on_disk can also be added incrementally. This simplifies s3 life as it no longer has to care about feeding a bytes_on_disk, which is currently limited to data and index sizes only. Refs #13649. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2023-04-27 12:06:48 -03:00
Raphael S. Carvalho	4d02821094	sstable: restore indentation in read_digest() and read_checksum() Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2023-04-27 12:06:48 -03:00
Raphael S. Carvalho	75dc7b799e	sstable: make all parsing of simple components go through do_read_simple() With all parsing of simple components going through do_read_simple(), common infrastructure can be reused (exception handling, debug logging, etc), and also statistics spanning all components can be easily added. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2023-04-27 12:06:48 -03:00
Raphael S. Carvalho	71cd8e6b51	sstable: Add missing pragma once to random_access_reader.hh Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2023-04-27 12:06:48 -03:00
Raphael S. Carvalho	b783bddbdf	sstable: make all writing of simple components go through do_write_simple() With all writing of simple components going through do_write_simple(), common infrastructure can be reused (exception handling, debug logging, etc), and also statistics spanning all components can be easily added. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2023-04-27 12:06:46 -03:00
Raphael S. Carvalho	dcee5c4fae	sstable: Restore indentation in read_simple() Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2023-04-27 12:04:52 -03:00
Raphael S. Carvalho	253d9e787b	sstable: Coroutinize read_simple() Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2023-04-27 12:04:52 -03:00
Raphael S. Carvalho	0dcdec6a55	sstable: Use filter memory footprint in filter_size() For S3, filter size is currently set to zero, as we want to avoid "fstat-ing" each file. On-disk representation of bloom filter is similar to the in-memory one, therefore let's use memory footprint in filter_size(). User of filter_size() is API implementing "nodetool cfstats" and it cares about the size of bloom filter data (that's how it's described). This way, we provide the filter data size regardless of the underlying storage type. Refs #13649. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2023-04-27 12:04:52 -03:00
Kefu Chai	f5b05cf981	treewide: use defaulted operator!=() and operator==() in C++20, the compiler generates operator!=() if the corresponding operator==() is already defined, the language now understands that the comparison is symmetric in the new standard. fortunately, our operator!=() is always equivalent to `! operator==()`, this matches the behavior of the default generated operator!=(). so, in this change, all `operator!=` are removed. in addition to the defaulted operator!=, C++20 also brings to us the defaulted operator==() -- it is able to generate the operator==() as the member-wise lexicographical comparison. under some circumstances, this is exactly what we need. so, in this change, if the operator==() is also implemented as a lexicographical comparison of all member variables of the class/struct in question, it is implemented using the default generated one by removing its body and marking the function as `default`. moreover, if the class happens to have other comparison operators which are implemented using lexicographical comparison, the default generated `operator<=>` is used in place of the defaulted `operator==`. sometimes, we fail to mark the operator== with the `const` specifier, in this change, to fulfil the need of C++ standard, and to be more correct, the `const` specifier is added. also, to generate the defaulted operator==, the operand should be `const class_name&`, but it is not always the case, in the class of `version`, we use `version` as the parameter type, to fulfill the need of the C++ standard, the parameter type is changed to `const version&` instead. this does not change the semantic of the comparison operator. and is a more idiomatic way to pass non-trivial struct as function parameters. please note, because in C++20, both operator== and operator<=> are symmetric, some of the operators in `multiprecision` are removed. they are the symmetric form of another variant.
if they were not removed, compiler would, for instance, find ambiguous overloaded operator '=='. this change is a cleanup to modernize the code base with C++20 features. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #13687	2023-04-27 10:24:46 +03:00
Pavel Emelyanov	4f93b440a5	sstables: Remove lost eptr variable from do_write_simple() Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes #13684	2023-04-27 07:37:15 +03:00
Benny Halevy	87d9c4d7f8	sstables: filesystem_storage::change_state: simplify log message When moving to the base directory, the printout currently looks broken: ``` INFO 2023-04-16 09:15:58,631 [shard 0] sstable - Moving sstable .../data/ks/cf-4c1bb670dc3711ed96733daf102e4aab/upload/md-1-big-Data.db to in ".../data/ks/cf-4c1bb670dc3711ed96733daf102e4aab/" ``` Since `path` already contains `to`, the message can be just simplified and `to` need not be printed explicitly. Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Closes #13525	2023-04-24 13:43:48 +03:00
Botond Dénes	1750bb34b7	Merge 'sstables, replica: add generation generator' from Kefu Chai this is the first step to the uuid-based generation identifier. the goal is to encapsulate the generation related logic in generator, so its consumers do not have to understand the difference between the int64_t based generation and UUID v1 based generation. this commit should not change the behavior of existing scylla. it just allows us to derive from `generation_generator` so we can have another generator which generates UUID based generation identifier. Closes #13073 * github.com:scylladb/scylladb: replica, test: create generation id using generator sstables: add generation_generator test: sstables: use generate_n for generating ids for testing	2023-04-24 09:31:08 +03:00
Kefu Chai	6e82aa42d5	sstables: add generation_generator to prepare for the uuid-based generation identifier, where we will generate uuid-based generation identifier if corresponding option is enabled, otherwise an integer based id. to reduce the repetition, generation_generator is extracted out so it can be reused. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-04-21 21:51:13 +08:00
Kefu Chai	a2aa133822	treewide: use std::lexicographical_compare_threeway the standard library offers `std::lexicographical_compare_threeway()`, and we never use the last two additional parameters which are not provided by `std::lexicographical_compare_threeway()`. there is no need to have the homebrew version of trichotomic compare function. in this change, * all occurrences of `lexicographical_tri_compare()` are replaced with `std::lexicographical_compare_threeway()`. * `lexicographical_tri_compare()` is dropped. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #13615	2023-04-21 14:28:18 +03:00
Kefu Chai	51fc0bc698	sstables: use default generated operator== C++20 compiler is able to generate defaulted operator== and operator!=. and the default generated operators behave exactly the same as the ones crafted by us. so let it do its job. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #13614	2023-04-21 14:25:39 +03:00
Kefu Chai	c5fa1ac9f7	sstable: specialize fmt::formatter<component_type> this is a part of a series migrating from `operator<<(ostream&, ..)` based formatting to fmtlib based formatting. the goal here is to enable fmtlib to print `component_type` without the help of `operator<<`. the corresponding `operator<<()` are dropped in this change, as all its callers are now using fmtlib for formatting. also, please note, to enable fmtlib to format `std::set<component_type>` in `test/boost/sstable_3_x_test.cc` , we need to include `<fmt/ranges.h>` in that source file. Refs #13245 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #13598	2023-04-21 09:49:24 +03:00
Benny Halevy	77b70dbdb7	sstables: compressed_file_data_source_impl: get: throw malformed_sstable_exception on premature eof Currently, the reader might dereference a null pointer if the input stream reaches eof prematurely, and read_exactly returns an empty temporary_buffer. Detect this condition before dereferencing the buffer and throw sstables::malformed_sstable_exception. Fixes #13599 Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Closes #13600	2023-04-21 07:56:58 +03:00
Botond Dénes	804403f618	reader_concurrency_semaphore: update RAII state guard classes w.r.t. recent permit state name changes They are still using the old terminology for permit state names; bring them up to date with the recent state name changes.	2023-04-19 05:20:42 -04:00
Botond Dénes	4c37dc5507	Merge 'keys: specialize fmt::formatter<partition_key> and friends' from Kefu Chai this is a part of a series migrating from `operator<<(ostream&, ..)` based formatting to fmtlib based formatting. the goal here is to enable fmtlib to print following classes without the help of `operator<<`. - partition_key_view - partition_key - partition_key::with_schema_wrapper - key_with_schema - clustering_key_prefix - clustering_key_prefix::with_schema_wrapper the corresponding `operator<<()` are dropped in this change, as all its callers are now using fmtlib for formatting. the helper of `print_key()` is removed, as its only caller is `operator<<(std::ostream&, const clustering_key_prefix::with_schema_wrapper&)`. the reason why all these operators are replaced in one go is that we have a template function of `key_to_str()` in `db/large_data_handler.cc`. this template function is actually the caller of operator<< of `partition_key::with_schema_wrapper` and `clustering_key_prefix::with_schema_wrapper`. so, in order to drop either of these two operator<<, we need to remove both of them, so that we can switch over to `fmt::to_string()` in this template function. Refs scylladb#13245 Closes #13513 * github.com:scylladb/scylladb: keys: consolidate the formatter for partition_keys keys: specialize fmt::formatter<partition_key> and friends	2023-04-17 10:27:31 +03:00
Raphael S. Carvalho	a47bac931c	Move TWCS option from table into TWCS itself enable_optimized_twcs_queries is specific to TWCS, therefore it belongs to TWCS, not replica::table. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Closes #13489	2023-04-14 08:28:16 +03:00
Kefu Chai	3738fcbe05	keys: specialize fmt::formatter<partition_key> and friends this is a part of a series migrating from `operator<<(ostream&, ..)` based formatting to fmtlib based formatting. the goal here is to enable fmtlib to print following classes without the help of `operator<<`. - partition_key_view - partition_key - partition_key::with_schema_wrapper - key_with_schema - clustering_key_prefix - clustering_key_prefix::with_schema_wrapper the corresponding `operator<<()` are dropped in this change, as all its callers are now using fmtlib for formatting. the helper of `print_key()` is removed, as its only caller is `operator<<(std::ostream&, const clustering_key_prefix::with_schema_wrapper&)`. the reason why all these operators are replaced in one go is that we have a template function of `key_to_str()` in `db/large_data_handler.cc`. this template function is actually the caller of operator<< of `partition_key::with_schema_wrapper` and `clustering_key_prefix::with_schema_wrapper`. so, in order to drop either of these two operator<<, we need to remove both of them, so that we can switch over to `fmt::to_string()` in this template function. Refs scylladb#13245 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-04-14 13:21:30 +08:00
Raphael S. Carvalho	17261369ea	sstables: Allow SSTable loading to discard bloom filter If bloom filter is not loaded, it means that an always-present filter is used, which translates into the SSTable being opened on every single read. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2023-04-13 11:34:22 -03:00
Raphael S. Carvalho	1427a5ce98	sstables: Allow sstable_directory user to feed custom sstable open config This will be used by load-and-stream to load SSTables in its own customized way. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2023-04-13 11:34:16 -03:00
Raphael S. Carvalho	86516f4cef	sstables: Move sstable_open_info into open_info.hh So sstable_directory can access its definition without having to include sstables.hh. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2023-04-13 11:31:14 -03:00
Botond Dénes	f1bbf705f9	Merge 'Cleanup sstables in resharding and other compaction types' from Benny Halevy This series extends sstable cleanup to resharding and other (offstrategy, major, and regular) compaction types so to: * cleanup uploaded sstables (#11933) * cleanup staging sstables after they are moved back to the main directory and become eligible for compaction (#9559) When perform_cleanup is called, all sstables are scanned, and those that require cleanup are marked as such, and are added for tracking to table_state::cleanup_sstable_set. They are removed from that set once released by compaction. Along with that sstables set, we keep the owned_ranges_ptr used by cleanup in the table_state to allow other compaction types (offstrategy, major, or regular) to cleanup those sstables that are marked as require_cleanup and that were skipped by cleanup compaction for either being in the maintenance set (requiring offstrategy compaction) or in staging. Resharding is using a more straightforward mechanism of passing the owned token ranges when resharding uploaded sstables and using it to detect sstable that require cleanup, now done as piggybacked on resharding compaction. 
Closes #12422 * github.com:scylladb/scylladb: table: discard_sstables: update_sstable_cleanup_state when deleting sstables compaction_manager: compact_sstables: retrieve owned ranges if required sstables: add a printer for shared_sstable compaction_manager: keep owned_ranges_ptr in compaction_state compaction_manager: perform_cleanup: keep sstables in compaction_state::sstables_requiring_cleanup compaction: refactor compaction_state out of compaction_manager compaction: refactor compaction_fwd.hh out of compaction_descriptor.hh compaction_manager: compacting_sstable_registration: keep a ref to the compaction_state compaction_manager: refactor get_candidates compaction_manager: get_candidates: mark as const table, compaction_manager: add requires_cleanup sstable_set: add for_each_sstable_until distributed_loader: reshard: update sstable cleanup state table, compaction_manager: add update_sstable_cleanup_state compaction_manager: needs_cleanup: delete unused schema param compaction_manager: perform_cleanup: disallow empty sorted_owened_ranges distributed_loader: reshard: consider sstables for cleanup distributed_loader: process_upload_dir: pass owned_ranges_ptr to reshard distributed_loader: reshard: add optional owned_ranges_ptr param distributed_loader: reshard: get a ref to table_state distributed_loader: reshard: capture creator by ref distributed_loader: reshard: reserve num_jobs buckets compaction: move owned ranges filtering to base class compaction: move owned_ranges into descriptor	2023-04-11 14:52:29 +03:00


3109 Commits