scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-06-09 08:23:29 +00:00

Author	SHA1	Message	Date
Petr Gusev	8877641b0f	token_metadata_test: check read_endpoints when bootstrapping first node	2023-05-21 13:17:42 +04:00
Petr Gusev	e9a6fcc8e1	token_metadata_test: refactor tests, extract create_erm No logical changes, just tidied up	2023-05-21 13:17:42 +04:00
Petr Gusev	87307781c4	effective_replication_map: use new get_pending_endpoints and get_endpoints_for_reading We already use the new pending_endpoints from erm though the get_pending_ranges virtual function, in this commit we update all the remaining places to use the new implementation in erm, as well as remove the old implementation in token_metadata.	2023-05-21 13:17:42 +04:00
Petr Gusev	d4f004f5c7	token_metadata_test.cc: create token_metadata and replication_strategy as shared pointers We want to switch token_metadata_test to the new implementation of pending_endpoints and read_endpoints in erm. To do this, it is convenient to have token_metadata and replication_strategy as shared pointers, as it fits better with the signature of calculate_effective_replication_map. In this commit we don't change the logic of the tests, we just migrate them to use pointers.	2023-05-21 13:17:42 +04:00
Petr Gusev	51e80691ef	token_metadata: replace set_topology_transition_state with set_read_new This helps isolate topology::transition_state dependencies, token_metadata doesn't need the entire enum, just this boolean flag.	2023-05-19 19:04:43 +04:00
Botond Dénes	c2aee26278	Merge 'Keep sstables garbage collection in sstable_directory' from Pavel Emelyanov Currently temporary directories with incomplete sstables and pending deletion log are processed by distributed loader on start. That's not nice, because for s3 backed sstables this code makes no sense (and is currently a no-op because of incomplete implementation). This garbage collecting should be kept in sstable_directory where it can off-load this work onto lister component that is storage-aware. Once g.c. code moved, it allows to clean the class sstable list of static helpers a bit. refs: #13024 refs: #13020 refs: #12707 Closes #13767 * github.com:scylladb/scylladb: sstable: Toss tempdir extension usage sstable: Drop pending_delete_dir_basename() sstable: Drop is_pending_delete_dir() helper sstable_directory: Make garbage_collect() non-static sstable_directory: Move deletion log exists check distributed_loader: Move garbage collecting into sstable_directory distributed_loader: Collect garbace collecting in one call sstable: Coroutinize remove_temp_dir() sstable: Coroutinize touch_temp_dir() sstable: Use storage::temp_dir instead of hand-crafted path	2023-05-19 08:50:13 +03:00
Raphael S. Carvalho	38b226f997	Resurrect optimization to avoid bloom filter checks during compaction Commit `8c4b5e4283` introduced an optimization which only calculates max purgeable timestamp when a tombstone satisfies the grace period. Commit 'repair: Get rid of the gc_grace_seconds' inverted the order, probably under the assumption that getting grace period can be more expensive than calculating max purgeable, as repair-mode GC will look up into history data in order to calculate gc_before. This caused a significant regression on tombstone heavy compactions, where most tombstones are still newer than grace period. A compaction which used to take 5s, now takes 35s. 7x slower. The reason is simple, now calculation of max purgeable happens for every single tombstone (once for each key), even the ones that cannot be GC'ed yet. And each calculation has to iterate through (i.e. check the bloom filter of) every single sstable that doesn't participate in compaction. Flame graph makes it very clear that bloom filter is a heavy path without the optimization: 45.64% 45.64% sstable_compact sstable_compaction_test_g [.] utils::filter::bloom_filter::is_present With its resurrection, the problem is gone. This scenario can easily happen, e.g. after a deletion burst, and tombstones becoming only GC'able after they reach upper tiers in the LSM tree. Before this patch, a compaction can be estimated to have this # of filter checks: (# of keys containing any tombstone) * (# of uncompacting sstable runs[1]) [1] It's # of runs, as each key tends to overlap with only one fragment of each run. After this patch, the estimation becomes: (# of keys containing a GC'able tombstone) * (# of uncompacting runs). With repair mode for tombstone GC, the assumption, that retrieval of gc_before is more expensive than calculating max purgeable, is kept. We can revisit it later. But in the default mode, which is the "timeout" (i.e. gc_grace_seconds) one, we still benefit from the optimization of deferring the calculation until needed. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Closes #13908	2023-05-18 09:01:50 +03:00
Pavel Emelyanov	ed50fda1fe	sstable: Toss tempdir extension usage The tempdir for filesystem-based sstables is {generation}.sstable one. There are two places that need to know the ".sstable" extension -- the tempdir creating code and the tempdir garbage-collecting code. This patch simplifies the sstable class by patching the aforementioned functions to use newly introduced tempdir_extension string directly, without the help of static one-line helpers. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-05-17 15:19:38 +03:00
Pavel Emelyanov	e8c0ae28b5	sstable: Drop pending_delete_dir_basename() The helper is used to return const char* value of the pending delete dir. Callers can use it directly. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-05-17 15:17:33 +03:00
Kefu Chai	6cd745fd8b	build: cmake: add missing test string_format_test was added in `1b5d5205c8`, so let's add it to CMake building system as well. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #13912	2023-05-17 09:51:51 +03:00
Benny Halevy	302a89488a	test: sstable_3_x_test: add test_compression_premature_eof Reproduces #13599 and verifies the fix. Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Closes #13903	2023-05-17 09:00:44 +03:00
Kamil Braun	5a8e2153a0	Merge 'Fix heart_beat_state::force_highest_possible_version_unsafe' from Benny Halevy It turns out that numeric_limits defines an implicit implementation for std::numeric_limits<utils::tagged_integer<Tag, ValueType>> which apparently returns a default-constructed tagged_integer for min() and max(), and this broke `gms::heart_beat_state::force_highest_possible_version_unsafe()` since [gms: heart_beat_state: use generation_type and version_type](`4cdad8bc8b`) (merged in [Merge 'gms: define and use generation and version types'...](`7f04d8231d`)) Implementing min/max correctly Fixes #13801 Closes #13880 * github.com:scylladb/scylladb: storage_service: handle_state_normal: on_internal_error on "owns no tokens" utils: tagged_integer: implement std::numeric_limits::{min,max} test: add tagged_integer_test	2023-05-16 13:59:41 +02:00
Avi Kivity	3c54d5ec5e	test: string_format_test: don't compare std::string with sstring For unknown reasons, clang 16 rejects equality comparison (operator==) where the left-hand-side is an std::string and the right-hand-side is an sstring. gcc and older clang versions first convert the left-hand-side to an sstring and then call the symmetric equality operator. I was able to hack sstring to support this asymmetric comparison, but the solution is quite convoluted, and it may be that it's clang at fault here. So instead this patch eliminates the three cases where it happened. With this applied, we can build with clang 16. Closes #13893	2023-05-16 08:56:16 +03:00
Benny Halevy	a70b53b6e7	utils: tagged_integer: implement std::numeric_limits::{min,max} Add a respective unit test. It turns out that numeric_limits defines an implicit implementation for std::numeric_limits<utils::tagged_integer<Tag, ValueType>> which apparently returns a default-constructed tagged_integer for min() and max(), and this broke `gms::heart_beat_state::force_highest_possible_version_unsafe()` since `4cdad8bc8b` (merged in `7f04d8231d`) Implementing min/max correctly Fixes #13801 Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-05-15 10:19:39 +03:00
Botond Dénes	6c27297406	Merge 'test: sstable_test: use generator to create new generations' from Kefu Chai in this series, instead of hardwiring to integer, we switch to generation generator for creating new generations. this should help us to migrate to a generation identifier which can also be represented by a UUID. and potentially can help to improve the testing coverage once we switch over to UUID-based generation identifier. will need to parameterize these tests by then, for sure. Closes #13863 github.com:scylladb/scylladb: test: sstable: use generator to generate generations test: sstable: pass generation_type in helper functions test: sstable: use generator to generate generations	2023-05-15 10:04:30 +03:00
Botond Dénes	20ff122a84	Merge 'Delete S3 sstables without the help of deletion log' from Pavel Emelyanov There are two layers of sstable deletion -- delete-atomically and wipe. The former is in fact the "API" method, it's called by table code when the specific sstable(s) are no longer needed. It's called "atomically" because it's expected to fail in the middle in a safe manner so that subsequent boot would pick the dangling parts and proceed. The latter is a low-level removal function that can fail in the middle, but it's not of _its_ care. Currently the atomic deletion is implemented with the help of sstable_directory::delete_atomically() method that commits sstables files names into deletion log, then calls wipe (indirectly), then drops the deletion log. On boot all found deletion logs are replayed. The described functionality is used regardless of the sstable storage type, even for S3, though deletion log is an overkill for S3, it would be better implemented with the help of ownership table. In fact, S3 storage already implements atomic deletion in its wipe method thus being overly careful. So this PR - makes atomic deletion be storage-specific - makes S3 wipe non-atomic fixes: #13016 note: Replaying sstables deletion from ownership table on boot is not here, see #13024 Closes #13562 * github.com:scylladb/scylladb: sstables: Implement atomic deleter for s3 storage sstables: Get atomic deleter from underlying storage sstables: Move delete_atomically to manager and rename	2023-05-15 08:57:47 +03:00
Benny Halevy	1b5d5205c8	test: add tagged_integer_test Add basic test for tagged+integer arithmetic operations. Remove const qualifier from `tagged_integer::operator[+-]=` as these are add/sub-assign operators that need to modify the value in place. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-05-14 23:26:58 +03:00
Avi Kivity	31e820e5a1	Merge 'Allow tombstone GC in compaction to be disabled on user request' from Raphael "Raph" Carvalho Adding new APIs /column_family/tombstone_gc and /storage_service/tombstone_gc, that will allow for disabling tombstone garbage collection (GC) in compaction. Mimics existing APIs /column_family/autocompaction and /storage_service/autocompaction. column_family variant must specify a single table only, following existing convention. whereas the storage_service one can specify an entire keyspace, or a subset of tables in a keyspace. column_family API usage ----- ``` The table name must be in keyspace:name format Get status: curl -s -X GET "http://127.0.0.1:10000/column_family/tombstone_gc/ks:cf" Enable GC curl -s -X POST "http://127.0.0.1:10000/column_family/tombstone_gc/ks:cf" Disable GC curl -s -X DELETE "http://127.0.0.1:10000/column_family/tombstone_gc/ks:cf" ``` storage_service API usage ----- ``` Tables can be specified using a comma-separated list. Enable GC on keyspace curl -s -X POST "http://127.0.0.1:10000/storage_service/tombstone_gc/ks" Disable GC on keyspace curl -s -X DELETE "http://127.0.0.1:10000/storage_service/tombstone_gc/ks" Enable GC on a subset of tables curl -s -X POST "http://127.0.0.1:10000/storage_service/tombstone_gc/ks?cf=table1,table2" ``` Closes #13793 * github.com:scylladb/scylladb: test: Test new API for disabling tombstone GC test: rest_api: extract common testing code into generic functions Add API to disable tombstone GC in compaction api: storage_service: restore indentation api: storage_service: extract code to set attribute for a set of tables tests: Test new option for disabling tombstone GC in compaction compaction_strategy: bypass tombstone compaction if tombstone GC is disabled table: Allow tombstone GC in compaction to be disabled on user request	2023-05-14 14:16:16 +03:00
Tomasz Grabiec	a91e83fad6	Merge "issue raft read barrier before pulling schema" from Gleb Schema pull may fail because the pull does not contain everything that is needed to instantiate a schema pointer. For instance it does not contain a keyspace. This series changes the code to issue raft read barrier before the pull which will guarantee that the keyspace is created before the actual schema pull is performed.	2023-05-14 14:14:24 +03:00
Raphael S. Carvalho	a7ceb987f5	test: Fix sporadic failures of database_test database_test is failing sporadically and the cause was traced back to commit `e3e7c3c7e5`. The commit forces a subset of tests in database_test, to run once for each of predefined x_log2_compaction_group settings. That causes two problems: 1) test becomes 240% slower in dev mode. 2) queries on system.auth are timing out, and the reason is a small table being spread across hundreds of compaction groups in each shard. so to satisfy a range scan, there will be multiple hops, making the overhead huge. additionally, the compaction group aware sstable set is not merged yet. so even point queries will unnecessarily scan through all the groups. Fixes #13660. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Closes #13851	2023-05-14 14:14:24 +03:00
Avi Kivity	97694d26c4	Merge 'reader_permit: minor improvements to resource consume/release safety' from Botond Dénes This PR contains some small improvements to the safety of consuming/releasing resources to/from the semaphore: * reader_permit: make the low-level `consume()/signal()` API private, making the only user (an RAII class) friend. * reader_resources: split `reset()` into `noexcept` and potentially throwing variant. * reader_resources::reset_to(): try harder to avoid calling `consume()` (when the new resource amount is smaller than the previous one) Closes #13678 * github.com:scylladb/scylladb: reader_permit: resource_units::reset_to(): try harder to avoid calling consume() reader_permit: split resource_units::reset() reader_permit: make consume()/signal() API private	2023-05-14 14:14:23 +03:00
Pavel Emelyanov	5985f00da9	sstables: Move delete_atomically to manager and rename This is to let manager decide which storage driver to call for atomic sstables deletion in the next patch. While at it -- rename the sstable_directory's method into something more descriptive (to make compiler catch all callers of it). Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-05-12 17:52:12 +03:00
Raphael S. Carvalho	6c32148751	tests: Test new option for disabling tombstone GC in compaction Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2023-05-12 10:14:28 -03:00
Kefu Chai	e89e0d4b28	test: sstable: use generator to generate generations instead of assuming the integer-based generation id, let's use the generation generator for creating a new generation id. this helps us to improve the testing coverage once we migrate to the UUID-based generation identifier. this change uses generator to generate generations for `make_sstable_for_all_shards()`. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-05-12 13:22:32 +08:00
Kefu Chai	e3d6dd46b7	test: sstable: pass generation_type in helper functions always avoid using generation_type if possible. this helps us to hide the underlying type of generation identifier, which could also be a UUID in future. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-05-12 13:22:32 +08:00
Kefu Chai	e788bfbb43	test: sstable: use generator to generate generations instead of assuming the integer-based generation id, let's use the generation generator for creating a new generation id. this helps us to improve the testing coverage once we migrate to the UUID-based generation identifier. this change uses generator to create generations for `make_sstable_for_this_shard()`. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-05-12 13:22:30 +08:00
Gleb Natapov	091ec285fe	serialized_action: make serialized_action abortable Add an ability to abort waiting for a result of a specific trigger() invocation.	2023-05-11 16:31:23 +03:00
Botond Dénes	24cb351655	Merge 'test: sstable_test: avoid using helper using generation_type::int_t ' from Kefu Chai the series drops some of the callers using SSTable generation as integer. as the generation of SSTable is but an identifier, we should not use it as an integer out of generation_type's implementation. Closes #13845 github.com:scylladb/scylladb: test: drop unused helper functions test: sstable_mutation_test: avoid using helper using generation_type::int_t test: sstable_move_test: avoid using helper using generation_type::int_t test: sstable_*test: avoid using helper using generation_type::int_t test: sstable_3_x_test: do not use reuseable_sst() accepting integer	2023-05-11 10:17:02 +03:00
Kefu Chai	b036d2b50c	test: sstable_mutation_test: avoid using helper using generation_type::int_t this change is one of the series which drops most of the callers using SSTable generation as integer. as the generation of SSTable is but an identifier, we should not use it as an integer out of generation_type's implementation. so, in this change, instead of using `generation_type::int_t` in the helper functions, we just pass `generation_type` in place of integer. also, since `generate_clustered()` is only used by functions in the same compilation unit, let's take the opportunity to mark it `static`. and there is no need to pass generation as a template parameter, we just pass it as a regular parameter. we will divert other callers of `reusable_sst(..., generation_type::int)` in follow-up changes in different ways. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-05-11 12:32:22 +08:00
Kefu Chai	689e1e99d6	test: sstable_move_test: avoid using helper using generation_type::int_t this change is one of the series which drops most of the callers using SSTable generation as integer. as the generation of SSTable is but an identifier, we should not use it as an integer out of generation_type's implementation. so, in this change, instead of using `generation_type::int_t` in helper functions, we just use `generation_type`. please note, despite that we'd prefer generating the generations using generator, the SSTables used by the tests modified by this change are stored in the repo, to ensure that the tests are always able to find the SSTable files, we keep them unchanged instead of using generation_generator, or a random generation for the testing. we will divert other callers of `reusable_sst(..., generation_type::int)` in follow-up changes in different ways. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-05-11 12:32:22 +08:00
Kefu Chai	bfd6caffbb	test: sstable_*test: avoid using helper using generation_type::int_t this change is one of the series which drops most of the callers using SSTable generation as integer. as the generation of SSTable is but an identifier, we should not use it as an integer out of generation_type's implementation. so, in this change, instead of using the helper accepting int, we switch to the one which accepts generation_type by offering a default parameter, which is a generation created using 1. this preserves the existing behavior. we will divert other callers of `reusable_sst(..., generation_type::int)` in follow-up changes in different ways. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-05-11 12:32:22 +08:00
Kefu Chai	ab8efbf1ab	test: sstable_3_x_test: do not use reuseable_sst() accepting integer this change is one of the series which drops most of the callers using SSTable generation as integer. as the generation of SSTable is but an identifier, we should not use it as an integer out of generation_type's implementation. so, in this change, instead of using the helper accepting int, we switch to the one which accepts generation_type. also, as no callers are using the last parameter of `make_test_sstable()`, let's drop it. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-05-11 12:32:21 +08:00
Nadav Har'El	e57252092c	Merge 'cql3: result_set, selector: change value type to managed_bytes_opt' from Avi Kivity CQL evolved several expression evaluation mechanisms: WHERE clause, selectors (the SELECT clause), and the LWT IF clause are just some examples. Most now use expressions, which use managed_bytes_opt as the underlying value representation, but selectors still use bytes_opt. This poses two problems: 1. bytes_opt generates large contiguous allocations when used with large blobs, impacting latency 2. trying to use expressions with bytes_opt will incur a copy, reducing performance To solve the problem, we harmonize the data types to managed_bytes_opt (#13216 notwithstanding). This is somewhat difficult since the source of the values are views into a bytes_ostream. However, luckily bytes_ostream and managed_bytes_view are mostly compatible so with a little effort this can be done. The series is neutral wrt performance: before: ``` 222118.61 tps ( 61.1 allocs/op, 12.1 tasks/op, 43092 insns/op, 0 errors) 224250.14 tps ( 61.1 allocs/op, 12.1 tasks/op, 43094 insns/op, 0 errors) 224115.66 tps ( 61.1 allocs/op, 12.1 tasks/op, 43092 insns/op, 0 errors) 223508.70 tps ( 61.1 allocs/op, 12.1 tasks/op, 43107 insns/op, 0 errors) 223498.04 tps ( 61.1 allocs/op, 12.1 tasks/op, 43087 insns/op, 0 errors) ``` after: ``` 220708.37 tps ( 61.1 allocs/op, 12.1 tasks/op, 43118 insns/op, 0 errors) 225168.99 tps ( 61.1 allocs/op, 12.1 tasks/op, 43081 insns/op, 0 errors) 222406.00 tps ( 61.1 allocs/op, 12.1 tasks/op, 43088 insns/op, 0 errors) 224608.27 tps ( 61.1 allocs/op, 12.1 tasks/op, 43102 insns/op, 0 errors) 225458.32 tps ( 61.1 allocs/op, 12.1 tasks/op, 43098 insns/op, 0 errors) ``` Though I expect with some more effort we can eliminate some copies. Closes #13637 * github.com:scylladb/scylladb: cql3: untyped_result_set: switch to managed_bytes_view as the cell type cql3: result_set: switch cell data type from bytes_opt to managed_bytes_opt cql3: untyped_result_set: always own data types: abstract_type: add mixed-type versions of compare() and equal() utils/managed_bytes, serializer: add conversion between buffer_view<bytes_ostream> and managed_bytes_view utils: managed_bytes: add bidirectional conversion between bytes_opt and managed_bytes_opt utils: managed_bytes: add managed_bytes_view::with_linearized() utils: managed_bytes: mark managed_bytes_view::is_linearized() const	2023-05-10 15:01:45 +03:00
Kamil Braun	7d9ab44e81	Merge 'token_metadata: read remapping for write_both_read_new' from Gusev Petr When new nodes are added or existing nodes are deleted, the topology state machine needs to shunt reads from the old nodes to the new ones. This happens in the `write_both_read_new` state. The problem is that previously this state was not handled in any way in `token_metadata` and the read nodes were only changed when the topology state machine reached the final 'owned' state. To handle `write_both_read_new` an additional `interval_map` inside `token_metadata` is maintained similar to `pending_endpoints`. It maps the ranges affected by the ongoing topology change operation to replicas which should be used for reading. When topology state sm reaches the point when it needs to switch reads to a new topology, it passes `request_read_new=true` in a call to `update_pending_ranges`. This forces `update_pending_ranges` to compute the ranges based on new topology and store them to the `interval_map`. On the data plane, when a read on coordinator needs to decide which endpoints to use, it first consults this `interval_map` in `token_metadata`, and only if it doesn't contain a range for current token it uses normal endpoints from `effective_replication_map`. Closes #13376 * github.com:scylladb/scylladb: storage_proxy, storage_service: use new read endpoints storage_proxy: rename get_live_sorted_endpoints->get_endpoints_for_reading token_metadata: add unit test for endpoints_for_reading token_metadata: add endpoints for reading sequenced_set: add extract_set method token_metadata_impl: extract maybe_migration_endpoints helper function token_metadata_impl: introduce migration_info token_metadata_impl: refactor update_pending_ranges token_metadata: add unit tests token_metadata: fix indentation token_metadata_impl: return unique_ptr from clone functions	2023-05-10 10:03:30 +02:00
Petr Gusev	15fe4d8d69	token_metadata: add unit test for endpoints_for_reading	2023-05-09 18:42:03 +04:00
Botond Dénes	287ccce1cc	Merge 'sstables: extract storage out ' from Kefu Chai this change extracts the storage class and its derived classes out into their own source files. for a couple of reasons: - for better readability. the sstables.hh is over 1005 lines. and sstables.cc 3602 lines. it's a little bit difficult to figure out how the different parts in these sources interact with each other. for instance, with this change, it's clear some of helper functions are only used by file_system_storage. - probably less inter-source dependency. by extracting the source files out, they can be compiled individually, so changing one .cc file does not impact others. this could speed up the compilation time. Closes #13785 * github.com:scylladb/scylladb: sstables: storage: coroutinize idempotent_link_file() sstables: extract storage out	2023-05-09 14:03:40 +03:00
Petr Gusev	3120cabf56	token_metadata: add unit tests We are going to refactor update_pending_ranges, so in this commit we add some simple unit tests to ensure we don't break it.	2023-05-09 13:56:06 +04:00
Kefu Chai	2eefcb37eb	sstables: extract storage out this change extracts the storage class and its derived classes out into storage.cc and storage.hh. for a couple of reasons: - for better readability. the sstables.hh is over 1005 lines. and sstables.cc 3602 lines. it's a little bit difficult to figure out how the different parts in these sources interact with each other. for instance, with this change, it's clear some of helper functions are only used by file_system_storage. - probably less inter-source dependency. by extracting the source files out, they can be compiled individually, so changing one .cc file does not impact others. this could speed up the compilation time. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-05-09 16:47:00 +08:00
Botond Dénes	20f620feb9	Merge 'replica, sstable: replace generation_type::value() with generation_type::as_int()' from Kefu Chai this series prepares for the UUID based generation by replacing the general `value()` function with the function with more specific name: `as_int()`. Closes #13796 * github.com:scylladb/scylladb: test: drop a reusable_sst() variant which accepts int as generation treewide: replace generation_type::value() with generation_type::as_int()	2023-05-09 07:30:54 +03:00
Nadav Har'El	5f37d43ee6	Merge 'compaction: validate: validate the index too' from Botond Dénes In addition to the data file itself. Currently validation avoids the index altogether, using the crawling reader which only relies on the data file and ignores the index+summary. This is because a corrupt sstable usually has a corrupt index too and using both at the same time might hide the corruption. This patch adds targeted validation of the index, independent of and in addition to the already existing data validation: it validates the order of index entries as well as whether the entry points to a complete partition in the data file. This will usually result in duplicate errors for out-of-order partitions: one for the data file and one for the index file. Fixes: #9611 Closes #11405 * github.com:scylladb/scylladb: test/cql-pytest: add test_sstable_validation.py test/cql-pytest: extract scylla_path,temp_workdir fixtures to conftest.py tools/scylla-sstables: write validation result to stdout sstables/sstable: validate(): delegate to mx validator for mx sstables sstables/mx/reader: add mx specific validator mutation/mutation_fragment_stream_validator: add validator() accessor to validating filter sstables/mx/reader: template data_consume_rows_context_m on the consumer sstables/mx/reader: move row_processing_result to namespace scope sstables/mx/reader: use data_consumer::proceed directly sstables/mx/reader.cc: extend namespace to end-of-file (cosmetic) compaction/compaction: remove now unused scrub_validate_mode_validate_reader() compaction/compaction: move away from scrub_validate_mode_validate_reader() tools/scylla-sstable: move away from scrub_validate_mode_validate_reader() test/boost/sstable_compaction_test: move away from scrub_validate_mode_validate_reader() sstables/sstable: add validate() method compaction/compaction: scrub_sstables_validate_mode(): validate sstables one-by-one compaction: scrub: use error messages from validator mutation_fragment_stream_validator: produce error messages in low-level validator	2023-05-08 17:14:26 +03:00
Botond Dénes	b790f14456	reader_concurrency_semaphore: execution_loop(): trigger admission check when _ready_list is empty The execution loop consumes permits from the _ready_list and executes them. The _ready_list usually contains a single permit. When the _ready_list is not empty, new permits are queued until it becomes empty. The execution loop relies on admission checks triggered by the read releasing resources, to bring in any queued read into the _ready_list, while it is executing the current read. But in some cases the current read might not free any resources and thus fail to trigger an admission check and the currently queued permits will sit in the queue until another source triggers an admission check. I don't yet know how this situation can occur, if at all, but it is reproducible with a simple unit test, so it is best to cover this corner-case in the off-chance it happens in the wild. Add an explicit admission check to the execution loop, after the _ready_list is exhausted, to make sure any waiters that can be admitted with an empty _ready_list are admitted immediately and execution continues. Fixes: #13540 Closes #13541	2023-05-08 17:11:41 +03:00
Botond Dénes	ab5fd0f750	Merge 's3: Provide timestamps in the s3 file implementation' from Raphael "Raph" Carvalho SSTable relies on st.st_mtime for providing creation time of data file, which in turn is used by features like tombstone compaction. Therefore, let's implement it. Fixes https://github.com/scylladb/scylladb/issues/13649. Closes #13713 * github.com:scylladb/scylladb: s3: Provide timestamps in the s3 file implementation s3: Introduce get_object_stats() s3: introduce get_object_header()	2023-05-08 11:43:41 +03:00
Raphael S. Carvalho	57661f0392	s3: Introduce get_object_stats() get_object_stats() will be used for retrieving content size and also last modified. The latter is required for filling st_mtim, etc, in the s3::client::readable_file::stat() method. Refs #13649. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2023-05-07 19:51:10 -03:00
Avi Kivity	42a1ced73b	cql3: result_set: switch cell data type from bytes_opt to managed_bytes_opt The expression system uses managed_bytes_opt for values, but result_set uses bytes_opt. This means that processing values from the result set in expressions requires a copy. Out of the two, managed_bytes_opt is the better choice, since it prevents large contiguous allocations for large blobs. So we switch result_set to use managed_bytes_opt. Users of the result_set API are adjusted. The db::function interface is not modified to limit churn; instead we convert the types on entry and exit. This will be adjusted in a following patch.	2023-05-07 17:17:36 +03:00
Botond Dénes	c1e8e86637	reader_concurrency_semaphore: reader_permit: clean-up after failed memory requests When requesting memory via `reader_permit::request_memory()`, the requested amount is added to `_requested_memory` member of the permit impl. This is because multiple concurrent requests may be blocked and waiting at the same time. When the requests are fulfilled, the entire amount is consumed and individual requests track their requested amount with `resource_units` to release later. There is a corner-case related to this: if a reader permit is registered as inactive while it is waiting for memory, its active requests are killed with `std::bad_alloc`, but the `_requested_memory` field is not cleared. If the read survives because the killed requests were part of a non-vital background read-ahead, a later memory request will also include the amount from the failed requests. This extra amount will not be released and hence will cause a resource leak when the permit is destroyed. Fix by detecting this corner case and clearing the `_requested_memory` field. Modify the existing unit test for the scenario of a permit waiting on memory being registered as inactive, to also cover this corner case, reproducing the bug. Fixes: #13539 Closes #13679	2023-05-07 14:06:51 +03:00
Kefu Chai	bd3e8d0460	test: drop a reusable_sst() variant which accepts int as generation this is one of the changes to reduce the usage of integer based generation test. in future, we will need to expand the test to exercise the UUID based generation, or at least to be neutral to the underlying generation's identifier type. so, removing the helpers which only accept `generation_type::int_t` would help us to make this happen. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-05-06 18:24:48 +08:00
Kefu Chai	9b35faf485	treewide: replace generation_type::value() with generation_type::as_int() * replace generation_type::value() with generation_type::as_int() * drop generation_value() because we will switch over to UUID based generation identifier, the member function or the free function generation_value() cannot fulfill the needs anymore. so, in this change, they are consolidated and are replaced by "as_int()", whose name is more specific, and will also work and won't be misleading even after switching to UUID based generation identifier. as `value()` would be confusing by then: it could be an integer or a UUID. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-05-06 18:24:45 +08:00
Botond Dénes	687a8bb2f0	Merge 'Sanitize test::filename(sstable) API' from Pavel Emelyanov There are two of them currently with slightly different declaration. Better to leave only one. Closes #13772 * github.com:scylladb/scylladb: test: Deduplicate test::filename() static overload test: Make test::filename return fs::path	2023-05-05 11:36:08 +03:00
Pavel Emelyanov	ac305076bd	test: Split test_twcs_interposer_on_memtable_flush naturally The test case consists of two internal sub-test-cases. Making them explicit kills three birds with one stone - improves parallelism - removes env's tempdir wiping - fixes code indentation refs: #12707 Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes #13768	2023-05-05 10:42:30 +03:00
Avi Kivity	f125a3e315	Merge 'tree: finish the reader_permit state renames' from Botond Dénes In https://github.com/scylladb/scylladb/pull/13482 we renamed the reader permit states to more descriptive names. That PR, however, covered only the states themselves and their usages, as well as the documentation in `docs/dev`. This PR is a followup to said PR, completing the name changes: renaming all symbols, names, comments etc, so all is consistent and up-to-date. Closes #13573 * github.com:scylladb/scylladb: reader_concurrency_semaphore: misc updates w.r.t. recent permit state name changes reader_concurrency_semaphore: update permit members w.r.t. recent permit state name changes reader_concurrency_semaphore: update RAII state guard classes w.r.t. recent permit state name changes reader_concurrency_semaphore: update API w.r.t. recent permit state name changes reader_concurrency_semaphore: update stats w.r.t. recent permit state name changes	2023-05-04 18:29:04 +03:00

2513 Commits