scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-05-01 21:55:50 +00:00

Author	SHA1	Message	Date
Raphael S. Carvalho	75dc7b799e	sstable: make all parsing of simple components go through do_read_simple() With all parsing of simple components going through do_read_simple(), common infrastructure can be reused (exception handling, debug logging, etc), and also statistics spanning all components can be easily added. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2023-04-27 12:06:48 -03:00
Raphael S. Carvalho	71cd8e6b51	sstable: Add missing pragma once to random_access_reader.hh Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2023-04-27 12:06:48 -03:00
Raphael S. Carvalho	b783bddbdf	sstable: make all writing of simple components go through do_write_simple() With all writing of simple components going through do_write_simple(), common infrastructure can be reused (exception handling, debug logging, etc), and also statistics spanning all components can be easily added. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2023-04-27 12:06:46 -03:00
Raphael S. Carvalho	dcee5c4fae	sstable: Restore indentation in read_simple() Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2023-04-27 12:04:52 -03:00
Raphael S. Carvalho	253d9e787b	sstable: Coroutinize read_simple() Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2023-04-27 12:04:52 -03:00
Raphael S. Carvalho	0dcdec6a55	sstable: Use filter memory footprint in filter_size() For S3, filter size is currently set to zero, as we want to avoid "fstat-ing" each file. On-disk representation of bloom filter is similar to the in-memory one, therefore let's use memory footprint in filter_size(). User of filter_size() is API implementing "nodetool cfstats" and it cares about the size of bloom filter data (that's how it's described). This way, we provide the filter data size regardless of the underlying storage type. Refs #13649. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2023-04-27 12:04:52 -03:00
Kefu Chai	f5b05cf981	treewide: use defaulted operator!=() and operator==() in C++20, the compiler generates operator!=() if the corresponding operator==() is already defined, as the language now understands that the comparison is symmetric in the new standard. fortunately, our operator!=() is always equivalent to `! operator==()`, which matches the behavior of the default generated operator!=(). so, in this change, all `operator!=` are removed. in addition to the defaulted operator!=, C++20 also brings us the defaulted operator==() -- the compiler is able to generate operator==() as a member-wise lexicographical comparison. under some circumstances, this is exactly what we need. so, in this change, if the operator==() is also implemented as a lexicographical comparison of all member variables of the class/struct in question, it is implemented using the default generated one by removing its body and marking the function as `default`. moreover, if the class happens to have other comparison operators which are implemented using lexicographical comparison, the default generated `operator<=>` is used in place of the defaulted `operator==`. sometimes, we fail to mark the operator== with the `const` specifier; in this change, to fulfill the need of the C++ standard, and to be more correct, the `const` specifier is added. also, to generate the defaulted operator==, the operand should be `const class_name&`, but this is not always the case: in the class `version`, we use `version` as the parameter type, so to fulfill the need of the C++ standard, the parameter type is changed to `const version&` instead. this does not change the semantics of the comparison operator, and is a more idiomatic way to pass a non-trivial struct as a function parameter. please note, because in C++20 both operator== and operator<=> are symmetric, some of the operators in `multiprecision` are removed. they are the symmetric form of another variant. 
if they were not removed, the compiler would, for instance, find an ambiguous overloaded operator '=='. this change is a cleanup to modernize the code base with C++20 features. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #13687	2023-04-27 10:24:46 +03:00
Pavel Emelyanov	4f93b440a5	sstables: Remove lost eptr variable from do_write_simple() Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes #13684	2023-04-27 07:37:15 +03:00
Benny Halevy	87d9c4d7f8	sstables: filesystem_storage::change_state: simplify log message When moving to the base directory, the printout currently looks broken: ``` INFO 2023-04-16 09:15:58,631 [shard 0] sstable - Moving sstable .../data/ks/cf-4c1bb670dc3711ed96733daf102e4aab/upload/md-1-big-Data.db to in ".../data/ks/cf-4c1bb670dc3711ed96733daf102e4aab/" ``` Since `path` already contains `to`, the message can be just simplified and `to` need not be printed explicitly. Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Closes #13525	2023-04-24 13:43:48 +03:00
Botond Dénes	1750bb34b7	Merge 'sstables, replica: add generation generator' from Kefu Chai this is the first step to the uuid-based generation identifier. the goal is to encapsulate the generation related logic in generator, so its consumers do not have to understand the difference between the int64_t based generation and UUID v1 based generation. this commit should not change the behavior of existing scylla. it just allows us to derive from `generation_generator` so we can have another generator which generates UUID based generation identifier. Closes #13073 * github.com:scylladb/scylladb: replica, test: create generation id using generator sstables: add generation_generator test: sstables: use generate_n for generating ids for testing	2023-04-24 09:31:08 +03:00
Kefu Chai	6e82aa42d5	sstables: add generation_generator to prepare for the uuid-based generation identifier, where we will generate a uuid-based generation identifier if the corresponding option is enabled, and an integer-based id otherwise. to reduce the repetition, generation_generator is extracted out so it can be reused. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-04-21 21:51:13 +08:00
Kefu Chai	a2aa133822	treewide: use std::lexicographical_compare_three_way the standard library offers `std::lexicographical_compare_three_way()`, and we never use the last two additional parameters, which are not provided by `std::lexicographical_compare_three_way()`. there is no need to have the homebrew version of the trichotomic compare function. in this change, * all occurrences of `lexicographical_tri_compare()` are replaced with `std::lexicographical_compare_three_way()`. * `lexicographical_tri_compare()` is dropped. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #13615	2023-04-21 14:28:18 +03:00
Kefu Chai	51fc0bc698	sstables: use default generated operator== a C++20 compiler is able to generate defaulted operator== and operator!=, and the default generated operators behave exactly the same as the ones crafted by us. so let it do its job. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #13614	2023-04-21 14:25:39 +03:00
Kefu Chai	c5fa1ac9f7	sstable: specialize fmt::formatter<component_type> this is a part of a series to migrate from `operator<<(ostream&, ..)` based formatting to fmtlib based formatting. the goal here is to enable fmtlib to print `component_type` without the help of `operator<<`. the corresponding `operator<<()` is dropped in this change, as all its callers are now using fmtlib for formatting. also, please note, to enable fmtlib to format `std::set<component_type>` in `test/boost/sstable_3_x_test.cc`, we need to include `<fmt/ranges.h>` in that source file. Refs #13245 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #13598	2023-04-21 09:49:24 +03:00
Benny Halevy	77b70dbdb7	sstables: compressed_file_data_source_impl: get: throw malformed_sstable_exception on premature eof Currently, the reader might dereference a null pointer if the input stream reaches eof prematurely, and read_exactly returns an empty temporary_buffer. Detect this condition before dereferencing the buffer and throw sstables::malformed_sstable_exception. Fixes #13599 Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Closes #13600	2023-04-21 07:56:58 +03:00
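The guard described in this commit can be sketched in plain C++ as follows. Everything here is a simplified stand-in: `malformed_data_error` plays the role of `sstables::malformed_sstable_exception`, a `std::vector<char>` plays the role of the `temporary_buffer` returned by `read_exactly`, and `first_byte_checked` is a hypothetical helper, not a function from the ScyllaDB sources.

```cpp
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for sstables::malformed_sstable_exception.
struct malformed_data_error : std::runtime_error {
    using std::runtime_error::runtime_error;
};

// Hypothetical helper: `buf` models the buffer handed back by a
// read_exactly-style call, which is empty when the stream hit EOF early.
inline char first_byte_checked(const std::vector<char>& buf) {
    // Check for the empty buffer before touching its contents, so a
    // truncated input becomes a clear error rather than an invalid access.
    if (buf.empty()) {
        throw malformed_data_error("premature end of stream");
    }
    return buf.front();
}
```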
Botond Dénes	4c37dc5507	Merge 'keys: specialize fmt::formatter<partition_key> and friends' from Kefu Chai this is a part of a series to migrate from `operator<<(ostream&, ..)` based formatting to fmtlib based formatting. the goal here is to enable fmtlib to print the following classes without the help of `operator<<`. - partition_key_view - partition_key - partition_key::with_schema_wrapper - key_with_schema - clustering_key_prefix - clustering_key_prefix::with_schema_wrapper the corresponding `operator<<()` are dropped in this change, as all their callers are now using fmtlib for formatting. the helper of `print_key()` is removed, as its only caller is `operator<<(std::ostream&, const clustering_key_prefix::with_schema_wrapper&)`. the reason why all these operators are replaced in one go is that we have a template function of `key_to_str()` in `db/large_data_handler.cc`. this template function is actually the caller of operator<< of `partition_key::with_schema_wrapper` and `clustering_key_prefix::with_schema_wrapper`. so, in order to drop either of these two operator<<, we need to remove both of them, so that we can switch over to `fmt::to_string()` in this template function. Refs scylladb#13245 Closes #13513 * github.com:scylladb/scylladb: keys: consolidate the formatter for partition_keys keys: specialize fmt::formatter<partition_key> and friends	2023-04-17 10:27:31 +03:00
Raphael S. Carvalho	a47bac931c	Move TWCS option from table into TWCS itself enable_optimized_twcs_queries is specific to TWCS, therefore it belongs to TWCS, not replica::table. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Closes #13489	2023-04-14 08:28:16 +03:00
Kefu Chai	3738fcbe05	keys: specialize fmt::formatter<partition_key> and friends this is a part of a series to migrate from `operator<<(ostream&, ..)` based formatting to fmtlib based formatting. the goal here is to enable fmtlib to print the following classes without the help of `operator<<`. - partition_key_view - partition_key - partition_key::with_schema_wrapper - key_with_schema - clustering_key_prefix - clustering_key_prefix::with_schema_wrapper the corresponding `operator<<()` are dropped in this change, as all their callers are now using fmtlib for formatting. the helper of `print_key()` is removed, as its only caller is `operator<<(std::ostream&, const clustering_key_prefix::with_schema_wrapper&)`. the reason why all these operators are replaced in one go is that we have a template function of `key_to_str()` in `db/large_data_handler.cc`. this template function is actually the caller of operator<< of `partition_key::with_schema_wrapper` and `clustering_key_prefix::with_schema_wrapper`. so, in order to drop either of these two operator<<, we need to remove both of them, so that we can switch over to `fmt::to_string()` in this template function. Refs scylladb#13245 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-04-14 13:21:30 +08:00
Raphael S. Carvalho	17261369ea	sstables: Allow SSTable loading to discard bloom filter If bloom filter is not loaded, it means that an always-present filter is used, which translates into the SSTable being opened on every single read. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2023-04-13 11:34:22 -03:00
Raphael S. Carvalho	1427a5ce98	sstables: Allow sstable_directory user to feed custom sstable open config This will be used by load-and-stream to load SSTables in its own customized way. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2023-04-13 11:34:16 -03:00
Raphael S. Carvalho	86516f4cef	sstables: Move sstable_open_info into open_info.hh So sstable_directory can access its definition without having to include sstables.hh. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2023-04-13 11:31:14 -03:00
Botond Dénes	f1bbf705f9	Merge 'Cleanup sstables in resharding and other compaction types' from Benny Halevy This series extends sstable cleanup to resharding and other (offstrategy, major, and regular) compaction types so to: * cleanup uploaded sstables (#11933) * cleanup staging sstables after they are moved back to the main directory and become eligible for compaction (#9559) When perform_cleanup is called, all sstables are scanned, and those that require cleanup are marked as such, and are added for tracking to table_state::cleanup_sstable_set. They are removed from that set once released by compaction. Along with that sstables set, we keep the owned_ranges_ptr used by cleanup in the table_state to allow other compaction types (offstrategy, major, or regular) to cleanup those sstables that are marked as require_cleanup and that were skipped by cleanup compaction for either being in the maintenance set (requiring offstrategy compaction) or in staging. Resharding is using a more straightforward mechanism of passing the owned token ranges when resharding uploaded sstables and using it to detect sstable that require cleanup, now done as piggybacked on resharding compaction. 
Closes #12422 * github.com:scylladb/scylladb: table: discard_sstables: update_sstable_cleanup_state when deleting sstables compaction_manager: compact_sstables: retrieve owned ranges if required sstables: add a printer for shared_sstable compaction_manager: keep owned_ranges_ptr in compaction_state compaction_manager: perform_cleanup: keep sstables in compaction_state::sstables_requiring_cleanup compaction: refactor compaction_state out of compaction_manager compaction: refactor compaction_fwd.hh out of compaction_descriptor.hh compaction_manager: compacting_sstable_registration: keep a ref to the compaction_state compaction_manager: refactor get_candidates compaction_manager: get_candidates: mark as const table, compaction_manager: add requires_cleanup sstable_set: add for_each_sstable_until distributed_loader: reshard: update sstable cleanup state table, compaction_manager: add update_sstable_cleanup_state compaction_manager: needs_cleanup: delete unused schema param compaction_manager: perform_cleanup: disallow empty sorted_owened_ranges distributed_loader: reshard: consider sstables for cleanup distributed_loader: process_upload_dir: pass owned_ranges_ptr to reshard distributed_loader: reshard: add optional owned_ranges_ptr param distributed_loader: reshard: get a ref to table_state distributed_loader: reshard: capture creator by ref distributed_loader: reshard: reserve num_jobs buckets compaction: move owned ranges filtering to base class compaction: move owned_ranges into descriptor	2023-04-11 14:52:29 +03:00
Botond Dénes	355583066e	Merge 'Reduce memory footprint of SSTable index summary' from Raphael "Raph" Carvalho SSTable summary is one of the components fully loaded into memory that may have a significant footprint. This series reduces the summary footprint by reducing the amount of token information that we need to keep in memory for each summary entry. Of course, the benefit of this size optimization is proportional to the amount of summary entries, which in turn is proportional to the number of partitions in a SSTable. Therefore we can say that this optimization will benefit the most tables which have tons of small-sized partitions, which will result in big summaries. Results: ``` BEFORE [1000000 pkeys] data size: 4035888890, summary -> memory footprint: 5843232, entries: 88158 [10000000 pkeys] data size: 40368888890, summary -> memory footprint: 55787128, entries: 844925 AFTER [1000000 pkeys] data size: 4035888890, summary -> memory footprint: 4351536, entries: 88158 [10000000 pkeys] data size: 40368888890, summary -> memory footprint: 42211984, entries: 844925 ``` That shows a 25% reduction in footprint, for both 1 and 10 million pkeys. Closes #13447 * github.com:scylladb/scylladb: sstables: Store raw token into summary entries sstables: Don't store token data into summary's memory pool	2023-04-11 08:29:11 +03:00
Botond Dénes	05b381bfa2	Merge 'Simple S3 storage for sstables' from Pavel Emelyanov The PR adds sstables storage backend that keeps all component files as S3 objects and system.sstables_registry ownership table that keeps track of what sstables objects belong to local node and their names. When a keyspace is configured with 'STORAGE = { 'type': 'S3' }' the respective class table object eventually gets the storage_options instance pointing to the target S3 endpoint and bucket. All the sstables created for that table attach the S3 storage implementation that maintains components' files as S3 objects. Writing to and reading from components is handled by the S3 client facilities from utils/. Changing the sstable state, which is -- moving between normal, staging and quarantine states -- is not yet implemented, but would eventually happen by updating entries in the sstables registry. To keep track of which node owns which objects, to provide bucket-wide uniqueness of object names and to maintain sstable state the storage driver keeps records in the system.sstables_registry ownership table. The table maps sstable location and generation to the object format, version, status-state () and (!) unique identifier (some time soon this identifier is supposed to be replaced with UUID sstables generations). The component object name is thus s3://bucket/uuid/component_basename. The registry is also used on boot. The distributed loader picks up sstables from all the tables found in schema and for S3-backed keyspaces it lists entries in the registry to a) identify those and b) get their unique S3-side identifiers to open by name. () About sstable's status and state. The state field is the part of today's sstable path on disk -- staging, quarantine, normal (root table data dir), etc. Since S3 doesn't have the renaming facility, moving sstable between those states is only possible by updating the entry in the registry. 
This is not yet implemented in this set (#13017) The status field tracks the sstable's transition through its creation-deletion. It first starts with 'creating' status which corresponds to today's TemporaryTOC file. After being created and written to, the sstable moves into 'sealed' state which corresponds to today's normal sstable with the TOC file. To delete an sstable atomically it first moves into 'removing' state which is equivalent to being in the deletion-log for the on-disk sstable. Once removed from the bucket, the entry is removed from the registry. To play with:
1. Start minio (installed by install-dependencies.sh)
```
export MINIO_ROOT_USER=${root_user}
export MINIO_ROOT_PASSWORD=${root_pass}
mkdir -p ${root_directory}
minio server ${root_directory}
```
2. Configure minio CLI, create anonymous bucket
```
mc config host rm local
mc config host add local http://127.0.0.1:9000 ${root_user} ${root_pass}
mc mb local/sstables
mc anonymous set public local/sstables
```
3. Start Scylla with object-storage feature enabled
```
scylla ... --experimental-features=keyspace-storage-options --workdir ${as_usual}
```
4. Create KS with S3 storage
```
create keyspace ... storage = { 'type': 'S3', 'endpoint': '127.0.0.1:9000', 'bucket': 'sstables' };
```
The S3 client has a logger named "s3"; it is useful to run with it set to `trace` verbosity.
Closes #12523 * github.com:scylladb/scylladb: test: Add object-storage test distributed_loader: Print storage type when populating sstable_directory: Add ownership table components lister sstable_directory: Make components_lister and API sstable_directory: Create components lister based on storage options sstables: Add S3 storage implementation system_keyspace: Add ownership table system_keyspace: Plug to user sstables manager too sstable: Make storage instance based on storage options sstable_directory: Keep storage_options aboard sstable: Virtualize the helper that gets on-disk stats for sstable sstable, storage: Virtualize data sink making for small components sstable, storage: Virtualize data sink making for Data and Index sstable/writer: Shuffle writer::init_file_writers() sstable: Make storage an API utils: Add S3 readable file impl for random reads utils: Add S3 data sink for multipart upload utils: Add S3 client with basic ops cql-pytest: Add option to run scylla over stable directory test.py: Equip it with minio server sstables: Detach write_toc() helper	2023-04-11 08:17:25 +03:00
Benny Halevy	9105f9800c	sstables: add a printer for shared_sstable Refactor the printing logic in compaction::formatted_sstables_list out to sstables::to_string(const shared_sstable&, bool include_origin) and operator<<(const shared_sstable) on top of it. So that we can easily print std::vector<shared_sstable> from compaction_manager in the next patch. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-04-10 23:31:35 +03:00
Benny Halevy	d765686491	sstable_set: add for_each_sstable_until Calls a function on all sstables or until the function returns stop_iteration::yes. Change the sstable_set_impl interface to expose only for_each_sstable_until and let sstable_set::for_each_sstable use that, wrapping the void-returning function passed to it. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-04-10 23:11:58 +03:00
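The interface shape described in this commit, one early-exit primitive plus a void-returning convenience wrapper built on top of it, might look roughly like the sketch below. `stop_iteration` here is a local stand-in for Seastar's type, and `int_set` with its `int` elements is a hypothetical simplification of `sstable_set` and `shared_sstable`.

```cpp
#include <functional>
#include <utility>
#include <vector>

// Local stand-in for seastar::stop_iteration (assumption for this sketch).
enum class stop_iteration { no, yes };

// Hypothetical simplified container; the real sstable_set holds sstables.
class int_set {
    std::vector<int> _items;
public:
    explicit int_set(std::vector<int> items) : _items(std::move(items)) {}

    // Primitive: visit items until the callback returns stop_iteration::yes.
    stop_iteration for_each_until(const std::function<stop_iteration(int)>& fn) const {
        for (int x : _items) {
            if (fn(x) == stop_iteration::yes) {
                return stop_iteration::yes;
            }
        }
        return stop_iteration::no;
    }

    // Wrapper: adapts a void-returning callback so iteration never stops early.
    void for_each(const std::function<void(int)>& fn) const {
        for_each_until([&fn](int x) {
            fn(x);
            return stop_iteration::no;
        });
    }
};
```

Exposing only the early-exit form in the implementation interface means each concrete set writes its traversal once, and the full traversal comes for free.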
Pavel Emelyanov	8b9e9671de	distributed_loader: Print storage type when populating On boot it's very useful to know which storage a table comes from, so add the respective info to existing log messages. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-04-10 16:44:29 +03:00
Pavel Emelyanov	f04c6cdf9a	sstable_directory: Add ownership table components lister When sstables are stored on object storage, they are "registered" in the system.sstables_registry ownership table. The sstable_directory is supposed to list sstables from this table, so here's the respective components lister. The lister is created by sstables_manager; by the time it's requested, the system keyspace is already plugged. The lister only handles "sealed" sstables. Dangling ones are still ignored, this is to be fixed later. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-04-10 16:44:29 +03:00
Pavel Emelyanov	8bd9f7accf	sstable_directory: Make components_lister and API Now the lister is filesystem-specific. There will soon come another one for S3, so the sstable_directory should be prepared for that by making the lister an abstract class. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-04-10 16:44:29 +03:00
Pavel Emelyanov	5f7f0117e1	sstable_directory: Create components lister based on storage options The directory's lister is storage-specific and should be created differently for different storage options. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-04-10 16:44:29 +03:00
Pavel Emelyanov	950ee0efe8	sstables: Add S3 storage implementation The driver puts all components into s3://bucket/uuid/component_name objects where 'bucket' is the keyspace options configuration parameter, and the 'uuid' is the value obtained from the ownership table. E.g. s3://test_bucket/d0a743b0-ad38-11ed-85b5-39b6b0998182/Data.db The life-time is straightforward. Until sealed, the sstable has 'creating' status in the table, then it's updated to be 'sealed'. Prior to removing the objects the status is set to 'deleting' thus allowing the distributed loader to pick up the dangling objects and re-load (not yet implemented). Finally, the entry is deleted from the table. It needs the PR #12648 not to generate empty ks/cf directories on the local filesystem. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-04-10 16:44:29 +03:00
Pavel Emelyanov	e34b86dd61	system_keyspace: Plug to user sstables manager too The sharded<sys_ks> instances are plugged to large data handler and compaction manager to maintain the circular dependency between these components via the interposing database instance. Do the same for user sstables manager, because S3 driver will need to update the local ownership table. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-04-10 16:43:01 +03:00
Pavel Emelyanov	4bb885b759	sstable: Make storage instance based on storage options This patch adds storage options lw-ptr to sstables_manager::make_sstable and makes the storage instance creation depend on the options. For local it just creates the filesystem storage instance, for S3 -- throws, but next patch will fix that. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-04-10 16:43:01 +03:00
Pavel Emelyanov	df026e2cb5	sstable_directory: Keep storage_options aboard The class in question will need to know the table's storage it will need to list sstables from. For that -- construct it with the storage options taken from table. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-04-10 16:43:01 +03:00
Pavel Emelyanov	c060f3a52f	sstable: Virtualize the helper that gets on-disk stats for sstable When opening an existing (or just sealed) sstable its components are stat()-ed to get the on-disk sizes and a bit more. Stat-ing a file by name on S3 is not (yet) implemented and doing it file-by-file can be quite terrible. So add a method to return sstable stats in a storage-specific manner. For S3 this can be implemented by getting the info from the ownership table (in the future). Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-04-10 16:43:01 +03:00
Pavel Emelyanov	0ddd27cb29	sstable, storage: Virtualize data sink making for small components This time sstable needs to create a data sink for a component without having the file at hand. That's pretty much the same as in the previous patch, but the method declaration differs slightly. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-04-10 16:43:01 +03:00
Pavel Emelyanov	ac1e56c9d9	sstable, storage: Virtualize data sink making for Data and Index Add the make_data_or_index_sink() virtual method and its implementation for filesystem_storage. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-04-10 16:43:01 +03:00
Pavel Emelyanov	1d4fcce5dd	sstable/writer: Shuffle writer::init_file_writers() The method needs to create two data sinks -- for Data and for Index files -- and then wrap them with more stuff (compression, checksums, streams, etc.). With the S3 backend, using file-output-stream won't work, because S3 storage cannot provide a writable file API (it has data_sink instead). This patch extracts file_data_sink creation so that it could be virtualized with the storage API later. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-04-10 16:43:01 +03:00
Pavel Emelyanov	525a261a4e	sstable: Make storage an API Currently sstable carries a filesystem_storage instance on board. Next patches will make it possible to use some other storage with different data accessing methods. This patch makes sstable carry abstract storage interface and make the existing filesystem_storage implement it. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-04-10 16:43:01 +03:00
Pavel Emelyanov	93c8b4b46b	sstables: Detach write_toc() helper When sstable is opened it generates a certain content into TOC file. In filesystem storage this first gets into TemporaryTOC one. Future S3 driver will need the same to put into TOC object. Not to produce duplicate code detach the content generation into a helper. Next patches will make use of it. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-04-10 16:43:00 +03:00
Raphael S. Carvalho	01466be7b9	sstables: Store raw token into summary entries Scylla stores a dht::token into each summary entry, for convenience. But that costs us 16 bytes for each summary entry. That's because dht::token has a kind field in addition to data, both 64 bits. With 1M partitions, each averaging 4k bytes, summary may end up with ~90k summary entries. So dht::token alone will add ~1.5M to the memory footprint of summary. We know summary samples index keys, therefore all tokens in all summary entries cannot have any token kind other than 'key'. Therefore, we can save 8 bytes for each summary entry by storing a 64-bit raw token and converting it back into token whenever needed. Memory footprint of summary entries in a summary goes from sizeof(summary_entry) * entries.size(): 1771520 to sizeof(summary_entry) * entries.size(): 1417216 which is explained by the 8 bytes reduction per summary entry. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2023-04-10 10:26:04 -03:00
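The 8-byte saving can be made concrete with a small layout sketch. The types below are hypothetical stand-ins (the real `dht::token` and `summary_entry` have different members); they only model a token as a 64-bit kind plus 64-bit data, as the message describes, next to an entry that keeps just the raw 64-bit value and rebuilds the token on demand. As a cross-check on the quoted numbers, 1771520 / 40 and 1417216 / 32 both equal 44288, which is consistent with an entry shrinking from 40 to 32 bytes.

```cpp
#include <cstdint>

// Hypothetical stand-ins for illustration; not the real ScyllaDB definitions.
enum class token_kind : std::int64_t { before_all_keys, key, after_all_keys };

struct full_token {
    token_kind kind;     // 8 bytes
    std::int64_t data;   // 8 bytes
};

// Entry that embeds the whole token (kind + data).
struct entry_with_token {
    full_token token;
    std::uint64_t position;
};

// Entry that stores only the raw value; the kind is implied to be `key`
// because summary entries sample index keys.
struct entry_with_raw_token {
    std::int64_t raw_token;
    std::uint64_t position;

    full_token token() const { return full_token{token_kind::key, raw_token}; }
};

static_assert(sizeof(full_token) == 16);
static_assert(sizeof(entry_with_token) - sizeof(entry_with_raw_token) == 8);
```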
Raphael S. Carvalho	6b5cd9ac7b	sstables: Don't store token data into summary's memory pool summary has a memory pool, which is implemented as a set of contiguous buffers of exponentially increasing size, with the max size of 128k. This pool served for both storing keys of summary entries and their respective tokens. The summary entry itself just stores a string_view, which points to the actual data in the memory pool. Since the series `31593e1451`, which removed token_view, summary_entry stores the actual token, not just the view. Therefore, memory is being wasted, as SSTable loader / writer is unnecessarily storing the token data into the pool. With 11k summary entries, the footprint drops from 756004 to 624932. An 18% reduction. Of course, the reduction depends on factors like key size, where the key size can significantly outweigh this waste. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2023-04-10 09:59:11 -03:00
Nadav Har'El	d26bb8c12d	Merge 'tree: migrate from std::regex to boost::regex' from Botond Dénes Except for where usage of `std::regex` is required by 3rd party library interfaces. As demonstrated countless times, std::regex's practice of using recursion for pattern matching can result in stack overflow, especially on AARCH64. The most recent incident happened after merging https://github.com/scylladb/scylladb/pull/13075, which (indirectly) uses `sstables::make_entry_descriptor()` to test whether a certain path is a valid scylla table path in a trial-and-error manner. This resulted in stacks blowing up in AARCH64. To prevent this, use the already tried and tested method of switching from `std::regex` to `boost::regex`. Don't wait until each of the `std::regex` sites explode, replace them all preemptively. Refs: https://github.com/scylladb/scylladb/issues/13404 Closes #13452 * github.com:scylladb/scylladb: test: s/std::regex/boost::regex/ utils: s/std::regex/boost::regex/ db/commitlog: s/std::regex/boost::regex/ types: s/std::regex/boost::regex/ index: s/std::regex/boost::regex/ duration.cc: s/std::regex/boost::regex/ cql3: s/std::regex/boost::regex/ thrift: s/std::regex/boost::regex/ sstables: use s/std::regex/boost::regex/	2023-04-09 18:47:41 +03:00
Botond Dénes	ba031ad181	sstables: use s/std::regex/boost::regex/ The former is prone to producing stack overflows as it uses recursion in its match implementation. The migration is entirely mechanical.	2023-04-06 09:50:12 -04:00
Tomasz Grabiec	9802bb6564	Merge 'Remove explicit flush() from sstable component writer' from Pavel Emelyanov Writing into sstable component output stream should be done with care. In particular -- flushing can happen only once right before closing the stream. Flushing the stream in between several writes is not going to work, because file stream would step on unaligned IO and S3 upload stream would send completion message to the server and would lose any subsequent write. Most of the file_writer users already obey that and flush the writer once right before closing it. The do_write_simple() is extra careful about exceptions handling, but it's an overkill (see first patch). It's better to make file_writer API explicitly lack the ability to flush itself by flushing the stream when closing the writer. Closes #13338 * github.com:scylladb/scylladb: sstables: Move writer flush into close (and remove it) sstables: Relax exception handling in do_write_simple	2023-04-05 12:09:31 +02:00
Botond Dénes	36e53d571c	Merge 'Treewide use-after-move bug fixes' from Raphael "Raph" Carvalho That's courtesy of `153813d3b8`, which annotates Seastar smart pointer classes with Clang's consumed attributes, to help Clang to statically spot use-after-move bugs. Closes #13386 * github.com:scylladb/scylladb: replica: Fix use-after-move in table::make_streaming_reader index/built_indexes_virtual_reader.hh: Fix use-after-move db/view/build_progress_virtual_reader: Fix use-after-move sstables: Fix use-after-move when making reader in reverse mode	2023-04-03 06:57:54 +03:00
Raphael S. Carvalho	213eaab246	sstables: Fix use-after-move when making reader in reverse mode static report: sstables/mx/reader.cc:1705:58: error: invalid invocation of method 'operator' on object 'schema' while it is in the 'consumed' state [-Werror,-Wconsumed] legacy_reverse_slice_to_native_reverse_slice(schema, slice.get()), pc, std::move(trace_state), fwd, fwd_mr, monitor); Fixes #13394. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2023-03-31 08:39:11 -03:00
Botond Dénes	207dcbb8fa	Merge 'sstables: prepare for uuid-based generation_type' from Benny Halevy Preparing for #10459, this series defines sstables::generation_type::int_t as `int64_t` at the moment and uses that instead of naked `int64_t` variables so it can be changed in the future to hold e.g. a `std::variant<int64_t, sstables::generation_id>`. sstables::new_generation was defined to generate new, unique generations. Currently it is based on incrementing a counter, but it can be extended in the future to manufacture UUIDs. The unit tests are cleaned up in this series to minimize their dependency on numeric generations. Basically, they should be used for loading sstables with hard coded generation numbers stored under `test/resource/sstables`. For all the rest, the tests should use existing mechanisms and those introduced in this series such as generation_factory, sst_factory and smart make_sstable methods in sstable_test_env and table_for_tests to generate new sstables with a unique generation, and use the abstract sst->generation() method to get their generation if needed, without resorting to the actual value it may hold. Closes #12994 * github.com:scylladb/scylladb: everywhere: use sstables::generation_type test: sstable_test_env: use make_new_generation sstable_directory::components_lister::process: fixup indentation sstables: make highest_generation_seen return optional generation replica: table: add make_new_generation function replica: table: move sstable generation related functions out of line test: sstables: use generation_type::int_t sstables: generation_type: define int_t	2023-03-30 17:05:07 +03:00
Pavel Emelyanov	886a1392a8	sstables: Move writer flush into close (and remove it) Writing into sstable component output stream should be done with care. In particular -- flushing can happen only once right before closing the stream. Flushing the stream in between several writes is not going to work, because file stream would step on unaligned IO and S3 upload stream would send completion message to the server and would lose any subsequent write. Having said that, it's better to remove the flush() ability from the component writer not to tempt the developers. refs: #13320 Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-03-30 09:34:04 +03:00
Pavel Emelyanov	77169e2647	sstables: Relax exception handling in do_write_simple This effectively reverts `000514e7cc` (sstable: close file_writer if an exception in thrown) because it became obsoleted by `60873d2360` (sstable: file_writer: auto-close in destructor). The change is in fact idempotent. Before the patch writer was closed regardless of write/flush failing or not. After the patch writer will close itself in destructor for sure. Before the patch an exception from write/flush was caught, then close was called and regardless of close failed or not the former exception was re-thrown. After the patch an exception from write/flush will result in writer destruction that would ignore close exception (if any). Before the patch throwing close after successful write/flush re-threw the close exception. After the patch writer will be closed "by hand" and any exception will be reported. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-03-30 09:32:56 +03:00

3081 Commits