scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-04-25 11:00:35 +00:00

Author	SHA1	Message	Date
Benny Halevy	bda3705974	test/lib: test_reader_conversions: always close reader read_mutation_from_flat_mutation_reader might throw so we need to close the reader returned from ms.make_fragment_v1_stream also on the error path to avoid the internal error abort when the reader is destroyed while opened. Fixes #14098 Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Closes #14099	2023-05-31 17:49:38 +02:00
Kefu Chai	82cac8e7cf	treewide: s/std::source_location/seastar::compat::source_location/ CWG 2631 (https://cplusplus.github.io/CWG/issues/2631.html) reports an issue on how the default argument is evaluated. this problem is more obvious when it comes to how `std::source_location::current()` is evaluated as a default argument. but not all compilers have the same behavior, see https://godbolt.org/z/PK865KdG4. notably, clang-15 evaluates the default argument at the callee site. so we need to check the capability of the compiler and fall back to the one defined by util/source_location-compat.hh if the compiler suffers from CWG 2631. and clang-16 implemented CWG2631 in https://reviews.llvm.org/D136554. But unfortunately, this change was not backported to clang-15. before switching over to clang-16, for using std::source_location::current() as the default parameter and expecting the behavior defined by CWG2631, we have to use the compatibility layer provided by Seastar. otherwise we always end up having the source_location at the callee side, which is not interesting under most circumstances. so in this change, all places using the idiom of passing std::source_location::current() as the default parameter are changed to use seastar::compat::source_location::current(). despite that we have `#include "seastarx.h"` for opening the seastar namespace, to disambiguate the "namespace compat" defined somewhere in scylladb, the fully qualified name of `seastar::compat::source_location::current()` is used. see also `09a3c63345`, where we used std::source_location as an alias of std::experimental::source_location if it was available. but this does not apply to the settings of our current toolchain, where we have GCC-12 and Clang-15. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #14086	2023-05-30 15:10:12 +03:00
Pavel Emelyanov	44b811ce19	test: Don't create directory for system tables in cql_test_env The distributed_loader::init_system_keyspaces() does it when called a few lines above this place Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-05-26 17:58:46 +03:00
Botond Dénes	57758ec3e1	Merge 'Put streaming sched group onto stream manager' from Pavel Emelyanov The manager is in charge of updating IO bandwidth on the respective prio class. Nowadays it uses global priority-manager, but unifying sched classes effort will require it to use non-global streaming sched group. After the patch the sched class field is unused, but it's a preparation towards huge (really huge) "switch to seastar API level 7" patch ref: #13963 Closes #13997 * github.com:scylladb/scylladb: stream_manager: Add streaming sched group copy cql_test_env: Move sched groups initialization up	2023-05-24 09:27:30 +03:00
Botond Dénes	313ae4ddac	Merge 'Generalize some file accessing helpers in test/' from Pavel Emelyanov Several test cases use common operations on files like existence checking, content comparing, etc. with the help of home-brew local helpers. The set makes use of some existing seastar:: ones and generalizes others into test/lib/. The primary intent here is `57 insertions(+), 135 deletions(-)` Closes #13936 * github.com:scylladb/scylladb: test: Generalize touch_file() into test_utils.* test/database: Generalize file/dir touch and exists checks test/sstables: Use seastar::file_exists() to check test/sstables: Remove sstdesc test/sstables: Use compare_files from utils/ in sstable_test test/sstables: Use compare_files() from utils/ in sstable_3_x_test test/util: Add compare_file() helpers	2023-05-24 08:43:41 +03:00
Avi Kivity	3956e01640	Merge 'Clean index_reader API' from Pavel Emelyanov The way index_reader maintains io_priority_class can be relaxed a bit. The main intent is to shorten the #13963 final patch a bit, as a side effect index_reader gets its portion of API polishing. ref: #13963 Closes #13992 * github.com:scylladb/scylladb: index_reader: Introduce and use default arguments to constructor index_reader: Use _pc field in get_file_input_stream_options() directly index_reader: Move index_reader::get_file_input_stream_options to private: block	2023-05-23 18:46:26 +03:00
Pavel Emelyanov	678f8fb1b7	stream_manager: Add streaming sched group copy The manager in question is responsible for maintaining the streaming class IO bandwidth update. Nowadays it does it via priority manager's global streaming IO priority class field, but it will need to switch to streaming sched group. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-05-23 14:31:23 +03:00
Pavel Emelyanov	ff9d65f6ad	cql_test_env: Move sched groups initialization up The streaming manager will need to keep its copy of streaming/maintenance group, so groups should be created early. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-05-23 14:31:23 +03:00
Pavel Emelyanov	2bb024c948	index_reader: Introduce and use default arguments to constructor Most creators of index_reader construct it with default prio class, null trace pointer and use_caching::yes. Assigning implicit defaults to constructor arguments keeps the code shorter and easier to read. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-05-23 11:29:04 +03:00
Pavel Emelyanov	9bdc0d3f44	test: Generalize touch_file() into test_utils.* Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-05-23 10:40:55 +03:00
Pavel Emelyanov	1f4c3be50c	test/util: Add compare_file() helpers To be used later Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-05-23 10:37:08 +03:00
Jan Ciolek	d2ef55b12c	test: use NetworkTopologyStrategy in all unit tests As described in https://github.com/scylladb/scylladb/issues/8638, we're moving away from `SimpleStrategy`, in the future it will become deprecated. We should remove all uses of it and replace them with `NetworkTopologyStrategy`. This change replaces `SimpleStrategy` with `NetworkTopologyStrategy` in all unit tests, or at least in the ones where it was reasonable to do so. Some of the tests were written explicitly to test the `SimpleStrategy` strategy, or changing the keyspace from `SimpleStrategy` to `NetworkTopologyStrategy`. These tests were left intact. It's still a feature that is supported, even if it's slowly getting deprecated. The typical way to use `NetworkTopologyStrategy` is to specify a replication factor for each datacenter. This could be a bit cumbersome, we would have to fetch the list of datacenters, set the repfactors, etc. Luckily there is another way - we can just specify a replication factor to use for each existing datacenter, like this: ```cql CREATE KEYSPACE {} WITH REPLICATION = {'class' : 'NetworkTopologyStrategy', 'replication_factor' : 1}; ``` This makes the change rather straightforward - just replace all instances of `'SimpleStrategy'` with `'NetworkTopologyStrategy'`. Refs: https://github.com/scylladb/scylladb/issues/8638 Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com> Closes #13990	2023-05-23 08:52:56 +03:00
Botond Dénes	3b424e391b	Merge 'perform_cleanup: wait until all candidates are cleaned up' from Benny Halevy cleanup_compaction should resolve only after all sstables that require cleanup are cleaned up. Since it is possible that some of them are in staging and therefore cannot be cleaned up, retry once a second until they become eligible. Timeout if there is no progress within 5 minutes to prevent hanging due to a view building bug. Fixes #9559 Closes #13812 * github.com:scylladb/scylladb: table: signal compaction_manager when staging sstables become eligible for cleanup compaction_manager: perform_cleanup: wait until all candidates are cleaned up compaction_manager: perform_cleanup: perform_offstrategy if needed compaction_manager: perform_cleanup: update_sstables_cleanup_state in advance sstable_set: add for_each_sstable_gently* helpers	2023-05-19 12:35:59 +03:00
Kamil Braun	13df85ea11	Merge 'Cut feature_service -> system_keyspace dependency' from Pavel Emelyanov This implicit link is pretty bad, because feature service is a low-level one which lots of other services depend on. System keyspace is the opposite -- a high-level one that needs e.g. query processor and database to operate. This inverse dependency is created by the feature service's need to commit enabled features' names into system keyspace on cluster join. And it uses the qctx thing for that in a best-effort manner (not doing anything if it's null). The dependency can be cut. The only place where enabled features are committed is when gossiper enables features on join or by receiving state changes from other nodes. By that time the sharded<system_keyspace> is up and running and can be used. Although gossiper already has a system keyspace dependency, it's better not to overload it with the need to mess with enabling and persisting features. Instead, the feature_enabler instance is equipped with needed dependencies and takes care of it. Eventually the enabler is also moved to feature_service.cc where it naturally belongs.
Fixes: #13837 Closes #13172 * github.com:scylladb/scylladb: gossiper: Remove features and sysks from gossiper system_keyspace: De-static save_local_supported_features() system_keyspace: De-static load_\|save_local_enabled_features() system_keyspace: Move enable_features_on_startup to feature_service (cont) system_keyspace: Move enable_features_on_startup to feature_service feature_service: Open-code persist_enabled_feature_info() into enabler gms: Move feature enabler to feature_service.cc gms: Move gossiper::enable_features() to feature_service::enable_features_on_join() gms: Persist features explicitly in features enabler feature_service: Make persist_enabled_feature_info() return a future system_keyspace: De-static load_peer_features() gms: Move gossiper::do_enable_features to persistent_feature_enabler::enable_features() gossiper: Enable features and register enabler from outside gms: Add feature_service and system_keyspace to feature_enabler	2023-05-18 18:21:06 +02:00
Benny Halevy	bb59687116	table: signal compaction_manager when staging sstables become eligible for cleanup perform_cleanup may be waiting for those sstables to become eligible for cleanup so signal it when table::move_sstables_from_staging detects an sstable that requires cleanup. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-05-17 11:33:22 +03:00
Botond Dénes	0cff0ffa08	Merge 'alternator,config: make alternator_timeout_in_ms live-updateable' from Kefu Chai before this change, alternator_timeout_in_ms is not live-updatable, as after setting executor's default timeout right before creating sharded executor instances, they never get updated with this option anymore. but many users would like to set the driver timers based on server timers. we need to enable them to configure timeout even when the server is still running. in this change, * `alternator_timeout_in_ms` is marked as live-updateable * `executor::_s_default_timeout` is changed to a thread_local variable, so it can be updated by a per-shard updateable_value. and it is now an updateable_value, so its variable name is updated accordingly. this value is set in the ctor of executor, and it is disconnected from the corresponding named_value<> option in the dtor of executor. * alternator_timeout_in_ms is passed to the constructor of executor via sharded_parameter, so `executor::_timeout_in_ms` can be initialized on per-shard basis * `executor::set_default_timeout()` is dropped, as we already pass the option to executor in its ctor. Fixes #12232 Closes #13300 * github.com:scylladb/scylladb: alternator: split the param list of executor ctor into multi lines alternator,config: make alternator_timeout_in_ms live-updateable	2023-05-15 10:16:29 +03:00
Avi Kivity	31e820e5a1	Merge 'Allow tombstone GC in compaction to be disabled on user request' from Raphael "Raph" Carvalho Adding new APIs /column_family/tombstone_gc and /storage_service/tombstone_gc, that will allow for disabling tombstone garbage collection (GC) in compaction. Mimics existing APIs /column_family/autocompaction and /storage_service/autocompaction. column_family variant must specify a single table only, following existing convention. whereas the storage_service one can specify an entire keyspace, or a subset of tables in a keyspace. column_family API usage ----- ``` The table name must be in keyspace:name format Get status: curl -s -X GET "http://127.0.0.1:10000/column_family/tombstone_gc/ks:cf" Enable GC curl -s -X POST "http://127.0.0.1:10000/column_family/tombstone_gc/ks:cf" Disable GC curl -s -X DELETE "http://127.0.0.1:10000/column_family/tombstone_gc/ks:cf" ``` storage_service API usage ----- ``` Tables can be specified using a comma-separated list. Enable GC on keyspace curl -s -X POST "http://127.0.0.1:10000/storage_service/tombstone_gc/ks" Disable GC on keyspace curl -s -X DELETE "http://127.0.0.1:10000/storage_service/tombstone_gc/ks" Enable GC on a subset of tables curl -s -X POST "http://127.0.0.1:10000/storage_service/tombstone_gc/ks?cf=table1,table2" ``` Closes #13793 * github.com:scylladb/scylladb: test: Test new API for disabling tombstone GC test: rest_api: extract common testing code into generic functions Add API to disable tombstone GC in compaction api: storage_service: restore indentation api: storage_service: extract code to set attribute for a set of tables tests: Test new option for disabling tombstone GC in compaction compaction_strategy: bypass tombstone compaction if tombstone GC is disabled table: Allow tombstone GC in compaction to be disabled on user request	2023-05-14 14:16:16 +03:00
Avi Kivity	0a78995e2b	Merge 'Share s3 clients between sstables' from Pavel Emelyanov Currently s3::client is created for each sstable::storage. It's later shared between sstable's files and upload sink(s). Also foreign_sstable_open_info can produce a file from a handle making a new standalone client. Coupled with the seastar's http client spawning connections on demand, this makes it impossible to control the amount of opened connections to object storage server. In order to put some policy on top of that (as well as apply workload prioritization) s3 clients should be collected in one place and then shared by users. Since s3::client uses seastar::http::client under the hood which, in turn, can generate many connections on demand, it's enough to produce a single s3::client per configured endpoint on each shard and then share it between all the sstables, files and sinks. There's one difficulty however, solving which is most of what this PR does. The file handle, that's used to transfer sstable's file across shards, should keep aboard all it needs to re-create a file on another shard. Since there's a single s3::client per shard, creation of a file out of a handle should grab that shard's client somehow. The meaningful shard-local object that can help is the sstables_manager and there are three ways to make use of it. All deal with the fact that sstables_manager-s are not sharded<> services, but are owned by the database independently on each shard. 1. walk the client -> sst.manager -> database -> container -> database -> sst.manager -> client chain by keeping its first half on the handle and unrolling the second half to produce a file 2. keep sharded peering service referenced by the sstables_manager that's initialized in main and passed through the database constructor down to sstables_manager(s) 3. equip file_handle::to_file with the "context" argument and teach sstables foreign info opener to push sstables_manager down to s3 file ...
somehow This PR chooses the 2nd way and introduces the sstables::storage_manager main-local sharded peering service that maintains all the s3::clients. "While at it" the new manager gets the object_storage_config updating facilities from the database (it's overloaded even without it already). Later the manager will also be in charge of collecting and exporting S3 metrics. In order to limit the number of S3 connections it also needs a patch to seastar http::client; there's a PR already doing that, once (if) merged there'll come one more fix on top. refs: #13458 refs: #13369 refs: scylladb/seastar#1652 Closes #13859 * github.com:scylladb/scylladb: s3: Pick client from manager via handle s3: Generalize s3 file handle s3: Live-update clients' configs sstables: Keep clients shared across sstables storage_manager: Rewrap config map sstables, database: Move object storage config maintenance onto storage_manager sstables: Introduce sharded<storage_manager>	2023-05-14 14:14:23 +03:00
Raphael S. Carvalho	6c32148751	tests: Test new option for disabling tombstone GC in compaction Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2023-05-12 10:14:28 -03:00
Raphael S. Carvalho	3b28c26c77	table: Allow tombstone GC in compaction to be disabled on user request If tombstone GC was disabled, compaction will ensure that fully expired sstables won't be bypassed and that no expired tombstones will be purged. Changing the value takes immediate effect even on ongoing compactions. Not wired into an API yet. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2023-05-12 10:14:28 -03:00
Pavel Emelyanov	a59096aa70	sstables, database: Move object storage config maintenance onto storage_manager Right now the map<endpoint, config> sits on the sstables manager and its update is governed by database (because it's peering and can kick other shards to update it as well). Having the sharded<storage_manager> at hand lets freeing database from the need to update configs and keeps sstables_manager a bit smaller. Also this will allow keeping s3 clients shared between sstables via this map by next patch. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-05-11 19:39:00 +03:00
Pavel Emelyanov	2153751d45	sstables: Introduce sharded<storage_manager> The manager in question keeps track of whatever sstables_manager needs to work with the storage (spoiler: only S3 one). It's main-local sharded peering service, so that container() call can be used by next patches. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-05-11 19:36:01 +03:00
Botond Dénes	24cb351655	Merge 'test: sstable_test: avoid using helper using generation_type::int_t ' from Kefu Chai the series drops some of the callers using SSTable generation as integer. as the generation of SSTable is but an identifier, we should not use it as an integer out of generation_type's implementation. Closes #13845 * github.com:scylladb/scylladb: test: drop unused helper functions test: sstable_mutation_test: avoid using helper using generation_type::int_t test: sstable_move_test: avoid using helper using generation_type::int_t test: sstable_*test: avoid using helper using generation_type::int_t test: sstable_3_x_test: do not use reusable_sst() accepting integer	2023-05-11 10:17:02 +03:00
Kefu Chai	29284d64a5	test: drop unused helper functions all users of these two helpers have switched to their alternatives, so there is no need to keep them. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-05-11 12:32:37 +08:00
Kefu Chai	bfd6caffbb	test: sstable_*test: avoid using helper using generation_type::int_t this change is one of the series which drops most of the callers using SSTable generation as integer. as the generation of SSTable is but an identifier, we should not use it as an integer out of generation_type's implementation. so, in this change, instead of using the helper accepting int, we switch to the one which accepts generation_type by offering a default parameter, which is a generation created using 1. this preserves the existing behavior. we will divert other callers of `reusable_sst(..., generation_type::int)` in following-up changes in different ways. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-05-11 12:32:22 +08:00
Nadav Har'El	e57252092c	Merge 'cql3: result_set, selector: change value type to managed_bytes_opt' from Avi Kivity CQL evolved several expression evaluation mechanisms: WHERE clause, selectors (the SELECT clause), and the LWT IF clause are just some examples. Most now use expressions, which use managed_bytes_opt as the underlying value representation, but selectors still use bytes_opt. This poses two problems: 1. bytes_opt generates large contiguous allocations when used with large blobs, impacting latency 2. trying to use expressions with bytes_opt will incur a copy, reducing performance To solve the problem, we harmonize the data types to managed_bytes_opt (#13216 notwithstanding). This is somewhat difficult since the source of the values are views into a bytes_ostream. However, luckily bytes_ostream and managed_bytes_view are mostly compatible so with a little effort this can be done. The series is neutral wrt performance: before: ``` 222118.61 tps ( 61.1 allocs/op, 12.1 tasks/op, 43092 insns/op, 0 errors) 224250.14 tps ( 61.1 allocs/op, 12.1 tasks/op, 43094 insns/op, 0 errors) 224115.66 tps ( 61.1 allocs/op, 12.1 tasks/op, 43092 insns/op, 0 errors) 223508.70 tps ( 61.1 allocs/op, 12.1 tasks/op, 43107 insns/op, 0 errors) 223498.04 tps ( 61.1 allocs/op, 12.1 tasks/op, 43087 insns/op, 0 errors) ``` after: ``` 220708.37 tps ( 61.1 allocs/op, 12.1 tasks/op, 43118 insns/op, 0 errors) 225168.99 tps ( 61.1 allocs/op, 12.1 tasks/op, 43081 insns/op, 0 errors) 222406.00 tps ( 61.1 allocs/op, 12.1 tasks/op, 43088 insns/op, 0 errors) 224608.27 tps ( 61.1 allocs/op, 12.1 tasks/op, 43102 insns/op, 0 errors) 225458.32 tps ( 61.1 allocs/op, 12.1 tasks/op, 43098 insns/op, 0 errors) ``` Though I expect with some more effort we can eliminate some copies. 
Closes #13637 * github.com:scylladb/scylladb: cql3: untyped_result_set: switch to managed_bytes_view as the cell type cql3: result_set: switch cell data type from bytes_opt to managed_bytes_opt cql3: untyped_result_set: always own data types: abstract_type: add mixed-type versions of compare() and equal() utils/managed_bytes, serializer: add conversion between buffer_view<bytes_ostream> and managed_bytes_view utils: managed_bytes: add bidirectional conversion between bytes_opt and managed_bytes_opt utils: managed_bytes: add managed_bytes_view::with_linearized() utils: managed_bytes: mark managed_bytes_view::is_linearized() const	2023-05-10 15:01:45 +03:00
Avi Kivity	42a1ced73b	cql3: result_set: switch cell data type from bytes_opt to managed_bytes_opt The expression system uses managed_bytes_opt for values, but result_set uses bytes_opt. This means that processing values from the result set in expressions requires a copy. Out of the two, managed_bytes_opt is the better choice, since it prevents large contiguous allocations for large blobs. So we switch result_set to use managed_bytes_opt. Users of the result_set API are adjusted. The db::function interface is not modified to limit churn; instead we convert the types on entry and exit. This will be adjusted in a following patch.	2023-05-07 17:17:36 +03:00
Kefu Chai	bd3e8d0460	test: drop a reusable_sst() variant which accepts int as generation this is one of the changes to reduce the usage of integer based generation test. in future, we will need to expand the test to exercise the UUID based generation, or at least to be neutral to the underlying generation's identifier type. so, removing the helpers which only accept `generation_type::int_t` would help us to make this happen. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-05-06 18:24:48 +08:00
Kefu Chai	05a172c7e7	build: cmake: link against Boost::unit_test_framework we introduced the linkage to Boost::unit_test_framework in `fe70333c19`, this library is used by test/lib/test_utils.cc, so update CMake accordingly. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #13781	2023-05-05 13:55:00 +03:00
Botond Dénes	687a8bb2f0	Merge 'Sanitize test::filename(sstable) API' from Pavel Emelyanov There are two of them currently with slightly different declaration. Better to leave only one. Closes #13772 * github.com:scylladb/scylladb: test: Deduplicate test::filename() static overload test: Make test::filename return fs::path	2023-05-05 11:36:08 +03:00
Avi Kivity	1d351dde06	Merge 'Make S3 client work with real S3' from Pavel Emelyanov Current S3 client was tested over minio and it takes a few more touches to work with amazon S3. The main challenge here is to support signed requests. The AWS S3 server explicitly bans unsigned multipart-upload requests, which in turn is the essential part of the sstables S3 backend, so we do need signing. Signing a request has many options and requirements, one of them is -- request _body_ may or may not be included into signature calculations. This is called "(un)signed payload". Requests sent over plain HTTP require payload signing (i.e. -- request body should be included into signature calculations), which can be a bit troublesome, so instead the PR uses unsigned payload (i.e. -- doesn't include the request body into signature calculation, only necessary headers and query parameters), but thus also needs HTTPS. So what this set does is make the existing S3 client code sign requests. In order to sign the request the code needs to get AWS key and secret (and region) from somewhere and this somewhere is the conf/object_storage.yaml config file. The signature generating code was previously merged (moved from alternator code) and updated to suit S3 client needs. In order to properly support HTTPS the PR adds special connection factory to be used with seastar http client. The factory makes DNS resolving of AWS endpoint names and configures gnutls system trust.
fixes: #13425 Closes #13493 * github.com:scylladb/scylladb: doc: Add a document describing how to configure S3 backend s3/test: Add ability to run boost test over real s3 s3/client: Sign requests if configured s3/client: Add connection factory with DNS resolve and configurable HTTPS s3/client: Keep server port on config s3/client: Construct it with config s3/client: Construct it with sstring endpoint sstables: Make s3_storage with endpoint config sstables_manager: Keep object storage configs onboard code: Introduce conf/object_storage.yaml configuration file	2023-05-04 18:08:54 +03:00
Pavel Emelyanov	56dfc21ba0	test: Deduplicate test::filename() static overload There are two of them currently, both returning fs::path for sstable components. One is static and can be dropped, callers are patched to use the non-static one making the code tiny bit shorter. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-05-04 17:16:00 +03:00
Pavel Emelyanov	3f30a253be	test: Make test::filename return fs::path The sstable::filename() is private and is not supposed to be used as a path to open any files. However, tests are different and they sometimes know it is. For that they use test wrapper that has access to private members and may make assumptions about meaning of sstable::filename(). That said, the test::filename() should return fs::path, not sstring. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-05-04 17:14:04 +03:00
Tomasz Grabiec	e385ce8a2b	Merge "fix stack use after free during shutdown" from Gleb storage_service uses raft_group0 but during shutdown the latter is destroyed before the former is stopped. This series moves raft_group0 destruction to be after storage_service is stopped already. For the move to work some existing dependencies of raft_group0 are dropped since they are not really needed during the object creation. Fixes #13522	2023-05-04 15:14:18 +02:00
Pavel Emelyanov	fe70333c19	test: Auto-skip object-storage test cases if run from shell In case an sstable unit test case is run individually, it would fail with exception saying that S3_... environment is not set. It's better to skip the test-case rather than fail. If someone wants to run it from shell, it will have to prepare S3 server (minio/AWS public bucket) and provide proper environment for the test-case. refs: #13569 Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes #13755	2023-05-04 14:15:18 +03:00
Gleb Natapov	dc6c3b60b4	init: move raft_group0 creation before storage_service storage_service uses raft_group0 so the latter needs to exist until the former is stopped.	2023-05-04 13:03:18 +03:00
Gleb Natapov	e9fb885e82	service/raft: raft_group0: drop dependency on cdc::generation_service raft_group0 does not really depend on cdc::generation_service, it needs it only transiently, so pass it to appropriate methods of raft_group0 instead of during its creation.	2023-05-04 13:03:07 +03:00
Pavel Emelyanov	3bec5ea2ce	s3/client: Keep server port on config Currently the code temporarily assumes that the endpoint port is 9000. This is what tests' local minio is started with. This patch keeps the port number on endpoint config and makes test get the port number from minio starting code via environment. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-05-03 20:19:43 +03:00
Pavel Emelyanov	2f6aa5b52e	code: Introduce conf/object_storage.yaml configuration file In order to access real S3 bucket, the client should use signed requests over https. Partially this is due to security considerations, partially this is unavoidable, because multipart-uploading is banned for unsigned requests on the S3. Also, signed requests over plain http require signing the payload as well, which is a bit troublesome, so it's better to stick to secure https and keep payload unsigned. To prepare signed requests the code needs to know three things: - aws key - aws secret - aws region name The latter could be derived from the endpoint URL, but it's simpler to configure it explicitly, all the more so there's an option to use S3 URLs without region name in them we could want to use some time. To keep the described configuration the proposed place is the object_storage.yaml file with the format endpoints: - name: a.b.c port: 443 aws_key: 12345 aws_secret: abcdefghijklmnop ... When loaded, the map gets into db::config and later will be propagated down to sstables code (see next patch). Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-05-03 20:19:15 +03:00
Nadav Har'El	b5f28e2b55	Merge 'Add S3 support to sstables::test_env' from Pavel Emelyanov Currently there are only 2 tests for S3 -- the pure client test and compound object_store test that launches scylla, creates s3-backed table and CQL-queries it. At the same time there's a whole lot of small unit tests for sstables functionality, part of which can run over S3 storage too. This PR adds this support and patches several test cases to use it. More test cases are to come later on demand. fixes: #13015 Closes #13569 * github.com:scylladb/scylladb: test: Make resharding test run over s3 too test: Add lambda to fetch bloom filter size test: Tune resharding test use of sstable::test_env test: Make datafile test case run over s3 too test: Propagate storage options to table_for_test test: Add support for s3 storage_options in config test: Outline sstables::test_env::do_with_async() test: Keep storage options on sstable_test_env config sstables: Add and call storage::destroy() sstables: Coroutinize sstable::destroy()	2023-05-02 21:48:05 +03:00
Pavel Emelyanov	f7df238545	test: Propagate storage options to table_for_test Teach table_for_tests to use any storage options, not just local one. For now the only user that passes non-local options is sstables::test_env. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-05-02 11:30:03 +03:00
Pavel Emelyanov	fa1de16f30	test: Add support for s3 storage_options in config When the sstable test case wants to run over S3 storage it needs to specify that in test config by providing the S3 storage options. So first thing this patch adds is the helper that makes these options based on the env left by minio launcher from test.py. Next, in order to make sstables_manager work with S3 it needs the plugged system keyspace which, in turn, needs query processor, proxy, database, etc. All this stuff lives in cql_test_env, so the test case running with S3 options will run in a sstables::test_env nested inside cql_test_env. The latter would also need to plug its system keyspace to the former's sstables manager and turn the experimental feature ON. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-05-02 11:30:03 +03:00
Pavel Emelyanov	1e03733e8c	test: Outline sstables::test_env::do_with_async() It's growing larger, better to keep it in .cc file Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-05-02 11:15:45 +03:00
Pavel Emelyanov	f223f5357d	test: Keep storage options on sstable_test_env config So that it could be set to s3 by the test case on demand. Default is local storage which uses env's tempdir or explicit path argument. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-05-02 11:15:45 +03:00
Benny Halevy	ba883859c7	utils: to_string: get rid of to_string(const Range&) Use fmt::to_string instead. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-05-02 10:48:46 +03:00
Botond Dénes	022465d673	Merge 'Tone down offstrategy log message' from Benny Halevy In many cases we trigger offstrategy compaction opportunistically, even when there's nothing to do. In this case we still print lots of info-level messages to the log and call `run_offstrategy_compaction`, which wastes more cpu cycles on learning that it has nothing to do. This change bails out early if the maintenance set is empty and prints a "Skipping off-strategy compaction" message in debug level instead. Fixes #13466 Also, add a group_id class and return it from compaction_group and table_state. Use that to identify the compaction_group / table_state by "ks_name.cf_name compaction_group=idx/total" in log messages. Fixes #13467 Closes #13520 * github.com:scylladb/scylladb: compaction_manager: print compaction_group id compaction_group, table_state: add group_id member compaction_manager: offstrategy compaction: skip compaction if no candidates are found	2023-05-02 08:05:18 +03:00
Raphael S. Carvalho	2dbae856f8	sstable: Piggyback on sstable parser and writer to provide bytes_on_disk bytes_on_disk is the sum of the sizes of all sstable components. As read_simple() fetches the file size before parsing the component, bytes_on_disk can be accumulated incrementally rather than in an additional step after all components were already parsed. Likewise, write_simple() tracks the offset for each new component, and therefore bytes_on_disk can also be accumulated incrementally. This simplifies s3 life as it no longer has to care about feeding a bytes_on_disk, which is currently limited to data and index sizes only. Refs #13649. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2023-04-27 12:06:48 -03:00
Raphael S. Carvalho	bc486b05fa	test: sstable_utils: reuse set_values() Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2023-04-27 12:04:52 -03:00
Kamil Braun	30cc07b40d	Merge 'Introduce tablets' from Tomasz Grabiec This PR introduces an experimental feature called "tablets". Tablets are a way to distribute data in the cluster, which is an alternative to the current vnode-based replication. Vnode-based replication strategy tries to evenly distribute the global token space shared by all tables among nodes and shards. With tablets, the aim is to start from a different side. Divide resources of replica-shard into tablets, with a goal of having a fixed target tablet size, and then assign those tablets to serve fragments of tables (also called tablets). This will allow us to balance the load in a more flexible manner, by moving individual tablets around. Also, unlike with vnode ranges, tablet replicas live on a particular shard on a given node, which will allow us to bind raft groups to tablets. Those goals are not yet achieved with this PR, but it lays the ground for this. Things achieved in this PR: - You can start a cluster and create a keyspace whose tables will use tablet-based replication. This is done by setting `initial_tablets` option: ``` CREATE KEYSPACE test WITH replication = {'class': 'NetworkTopologyStrategy', 'replication_factor': 3, 'initial_tablets': 8}; ``` All tables created in such a keyspace will be tablet-based. Tablet-based replication is a trait, not a separate replication strategy. Tablets don't change the spirit of replication strategy, it just alters the way in which data ownership is managed. In theory, we could use it for other strategies as well like EverywhereReplicationStrategy. Currently, only NetworkTopologyStrategy is augmented to support tablets. 
- You can create and drop tablet-based tables (no DDL language changes) - DML / DQL work with tablet-based tables Replicas for tablet-based tables are chosen from tablet metadata instead of token metadata Things which are not yet implemented: - handling of views, indexes, CDC created on tablet-based tables - sharding is done using the old method, it ignores the shard allocated in tablet metadata - node operations (topology changes, repair, rebuild) are not handling tablet-based tables - not integrated with compaction groups - tablet allocator piggy-backs on tokens to choose replicas. Eventually we want to allocate based on current load, not statically Closes #13387 * github.com:scylladb/scylladb: test: topology: Introduce test_tablets.py raft: Introduce 'raft_server_force_snapshot' error injection locator: network_topology_strategy: Support tablet replication service: Introduce tablet_allocator locator: Introduce tablet_aware_replication_strategy locator: Extract maybe_remove_node_being_replaced() dht: token_metadata: Introduce get_my_id() migration_manager: Send tablet metadata as part of schema pull storage_service: Load tablet metadata when reloading topology state storage_service: Load tablet metadata on boot and from group0 changes db, migration_manager: Notify about tablet metadata changes via migration_listener::on_update_tablet_metadata() migration_notifier: Introduce before_drop_keyspace() migration_manager: Make prepare_keyspace_drop_announcement() return a future<> test: perf: Introduce perf-tablets test: Introduce tablets_test test: lib: Do not override table id in create_table() utils, tablets: Introduce external_memory_usage() db: tablets: Add printers db: tablets: Add persistence layer dht: Use last_token_of_compaction_group() in split_token_range_msb() locator: Introduce tablet_metadata dht: Introduce first_token() dht: Introduce next_token() storage_proxy: Improve trace-level logging locator: token_metadata: Fix confusing comment on ring_range() 
dht, storage_proxy: Abstract token space splitting Revert "query_ranges_to_vnodes_generator: fix for exclusive boundaries" db: Exclude keyspace with per-table replication in get_non_local_strategy_keyspaces_erms() db: Introduce get_non_local_vnode_based_strategy_keyspaces() service: storage_proxy: Avoid copying keyspace name in write handler locator: Introduce per-table replication strategy treewide: Use replication_strategy_ptr as a shorter name for abstract_replication_strategy::ptr_type locator: Introduce effective_replication_map locator: Rename effective_replication_map to vnode_effective_replication_map locator: effective_replication_map: Abstract get_pending_endpoints() db: Propagate feature_service to abstract_replication_strategy::validate_options() db: config: Introduce experimental "TABLETS" feature db: Log replication strategy for debugging purposes db: Log full exception on error in do_parse_schema_tables() db: keyspace: Remove non-const replication strategy getter config: Reformat	2023-04-27 09:40:18 +02:00
Pavel Emelyanov	9bb4ee160f	gossiper: Remove features and sysks from gossiper Now gossiper doesn't need those two as its dependencies, they can be removed making code shorter and dependencies graph simpler. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-04-25 17:06:06 +03:00

929 Commits