scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-04-29 04:37:00 +00:00

Author	SHA1	Message	Date
Pavel Emelyanov	f953fb2f52	schema_change_test: Use proxy from cql_test_env There's one place where a test case calls for the storage proxy and currently does it via a global reference. Time to switch it to cql_test_env's one. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-04-21 14:18:00 +03:00
Botond Dénes	1426c623eb	Merge 'Tune up S3 unit tests environment usage (and a bit more)' from Pavel Emelyanov The tests in question use the MINIO_SERVER_ADDRESS environment variable to export the minio server address from pylib to test cases. They also use a hard-coded public bucket name. Both play badly with AWS S3, the former due to MINIO_... in its name and the latter because the public bucket name can be anything. So this PR puts the address and public bucket name into S3_..._FOR_TEST environment variables and fixes output stream closure on failure while at it. Detached from #13493 Closes #13546 * github.com:scylladb/scylladb: s3/test: Rename MINIO_SERVER_ADDRESS environment variable s3/test: Keep public bucket name in environment s3/test: Fix upload stream closure test/lib: Add getenv_safe() helper	2023-04-20 18:01:12 +03:00
Pavel Emelyanov	a77ca69360	s3/test: Rename MINIO_SERVER_ADDRESS environment variable The pylib minio code uses it to export the minio address to tests. This creates unneeded WTFs when running the tests over AWS S3, so it's better to rename the variable so that it doesn't mention MINIO at all. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-04-19 12:51:12 +03:00
Pavel Emelyanov	12c4e7d605	s3/test: Keep public bucket name in environment Local test.py runs minio with the public 'testbucket' bucket and all test cases know that. This series adds an ability to run tests over real S3 so the bucket name should be configurable. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-04-19 12:51:12 +03:00
Pavel Emelyanov	91674da982	s3/test: Fix upload stream closure If multipart upload fails for some reason, the output stream remains unclosed and the resulting assertion masks the original failure. Fix that by closing the stream in all cases. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-04-19 12:51:12 +03:00
Avi Kivity	7a42927a3d	treewide: stop using 'using namespace std' in namespace scope Such namespace-wide imports can create conflicts between names that are the same in seastar and std, such as {std,seastar}::future and {std,seastar}::format, since we also have 'using namespace seastar'. Replace the namespace imports with explicit qualification, or with specific name imports. Closes #13528	2023-04-17 14:08:37 +03:00
Botond Dénes	6c889213bf	Merge 'Topology add node exception safety' from Benny Halevy Currently if index_node throws when trying to add an already indexed node, pop_node might unindex the existing node instead of the new one. With this change, unindex_node looks up the node by its pointer and removes it from the index map only if it is found there, so as to clean up safely after index_node throws (at any stage). Add a unit test to verify that. In addition, added a unit test to reproduce #13502 and test the fix. Closes #13512 * github.com:scylladb/scylladb: test: locator_topology: add test_update_node topology: add_node, unindex_node: make exception safe	2023-04-17 11:02:15 +03:00
Botond Dénes	4c37dc5507	Merge 'keys: specialize fmt::formatter<partition_key> and friends' from Kefu Chai this is part of a series migrating from `operator<<(ostream&, ..)` based formatting to fmtlib based formatting. the goal here is to enable fmtlib to print the following classes without the help of `operator<<`. - partition_key_view - partition_key - partition_key::with_schema_wrapper - key_with_schema - clustering_key_prefix - clustering_key_prefix::with_schema_wrapper the corresponding `operator<<()` are dropped in this change, as all their callers now use fmtlib for formatting. the helper `print_key()` is removed, as its only caller is `operator<<(std::ostream&, const clustering_key_prefix::with_schema_wrapper&)`. the reason why all these operators are replaced in one go is that we have a template function `key_to_str()` in `db/large_data_handler.cc`. this template function is actually the caller of operator<< of `partition_key::with_schema_wrapper` and `clustering_key_prefix::with_schema_wrapper`. so, in order to drop either of these two operator<<, we need to remove both of them, so that we can switch over to `fmt::to_string()` in this template function. Refs scylladb#13245 Closes #13513 * github.com:scylladb/scylladb: keys: consolidate the formatter for partition_keys keys: specialize fmt::formatter<partition_key> and friends	2023-04-17 10:27:31 +03:00
Benny Halevy	e18eb71fa3	test: locator_topology: add test_update_node Reproduces issue fixed in PR #13502 Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-04-14 17:51:07 +03:00
Benny Halevy	e29994b2aa	topology: add_node, unindex_node: make exception safe Currently if index_node throws when trying to add an already indexed node, pop_node might unindex the existing node instead of the new one. With this change, unindex_node looks up the node by its pointer and removes it from the index map only if it is found there, so as to clean up safely after index_node throws (at any stage). Add a unit test to verify that. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-04-14 17:51:05 +03:00
Raphael S. Carvalho	a47bac931c	Move TWCS option from table into TWCS itself enable_optimized_twcs_queries is specific to TWCS, therefore it belongs to TWCS, not replica::table. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Closes #13489	2023-04-14 08:28:16 +03:00
Kefu Chai	3738fcbe05	keys: specialize fmt::formatter<partition_key> and friends this is part of a series migrating from `operator<<(ostream&, ..)` based formatting to fmtlib based formatting. the goal here is to enable fmtlib to print the following classes without the help of `operator<<`. - partition_key_view - partition_key - partition_key::with_schema_wrapper - key_with_schema - clustering_key_prefix - clustering_key_prefix::with_schema_wrapper the corresponding `operator<<()` are dropped in this change, as all their callers now use fmtlib for formatting. the helper `print_key()` is removed, as its only caller is `operator<<(std::ostream&, const clustering_key_prefix::with_schema_wrapper&)`. the reason why all these operators are replaced in one go is that we have a template function `key_to_str()` in `db/large_data_handler.cc`. this template function is actually the caller of operator<< of `partition_key::with_schema_wrapper` and `clustering_key_prefix::with_schema_wrapper`. so, in order to drop either of these two operator<<, we need to remove both of them, so that we can switch over to `fmt::to_string()` in this template function. Refs scylladb#13245 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-04-14 13:21:30 +08:00
Botond Dénes	bd57471e54	reader_concurrency_semaphore: don't evict inactive readers needlessly Inactive readers should only be evicted to free up resources for waiting readers. Evicting them when waiters are not admitted for any other reason than resources is wasteful and leads to extra load later on when these evicted readers have to be recreated and requeued. This patch changes the logic on both the registering path and the admission path to not evict inactive readers unless there are readers actually waiting on resources. A unit test is also added, reproducing the overly aggressive eviction and checking that it doesn't happen anymore. Fixes: #11803 Closes #13286	2023-04-13 15:20:18 +03:00
Kefu Chai	87170bf07a	build: cmake: add more tests this change should add the remaining tests under boost/ Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #13494	2023-04-13 14:57:00 +03:00
Botond Dénes	f1bbf705f9	Merge 'Cleanup sstables in resharding and other compaction types' from Benny Halevy This series extends sstable cleanup to resharding and other (offstrategy, major, and regular) compaction types so as to: * cleanup uploaded sstables (#11933) * cleanup staging sstables after they are moved back to the main directory and become eligible for compaction (#9559) When perform_cleanup is called, all sstables are scanned, and those that require cleanup are marked as such, and are added for tracking to table_state::cleanup_sstable_set. They are removed from that set once released by compaction. Along with that sstables set, we keep the owned_ranges_ptr used by cleanup in the table_state to allow other compaction types (offstrategy, major, or regular) to cleanup those sstables that are marked as require_cleanup and that were skipped by cleanup compaction for either being in the maintenance set (requiring offstrategy compaction) or in staging. Resharding uses a more straightforward mechanism of passing the owned token ranges when resharding uploaded sstables and using them to detect sstables that require cleanup, now piggybacked on resharding compaction. 
Closes #12422 * github.com:scylladb/scylladb: table: discard_sstables: update_sstable_cleanup_state when deleting sstables compaction_manager: compact_sstables: retrieve owned ranges if required sstables: add a printer for shared_sstable compaction_manager: keep owned_ranges_ptr in compaction_state compaction_manager: perform_cleanup: keep sstables in compaction_state::sstables_requiring_cleanup compaction: refactor compaction_state out of compaction_manager compaction: refactor compaction_fwd.hh out of compaction_descriptor.hh compaction_manager: compacting_sstable_registration: keep a ref to the compaction_state compaction_manager: refactor get_candidates compaction_manager: get_candidates: mark as const table, compaction_manager: add requires_cleanup sstable_set: add for_each_sstable_until distributed_loader: reshard: update sstable cleanup state table, compaction_manager: add update_sstable_cleanup_state compaction_manager: needs_cleanup: delete unused schema param compaction_manager: perform_cleanup: disallow empty sorted_owened_ranges distributed_loader: reshard: consider sstables for cleanup distributed_loader: process_upload_dir: pass owned_ranges_ptr to reshard distributed_loader: reshard: add optional owned_ranges_ptr param distributed_loader: reshard: get a ref to table_state distributed_loader: reshard: capture creator by ref distributed_loader: reshard: reserve num_jobs buckets compaction: move owned ranges filtering to base class compaction: move owned_ranges into descriptor	2023-04-11 14:52:29 +03:00
Kefu Chai	86b66a9875	build: cmake: drop test_table.CC this change mirrors the corresponding change in `configure.py` in `4b5b6a9010` . Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #13461	2023-04-11 09:42:58 +03:00
Botond Dénes	05b381bfa2	Merge 'Simple S3 storage for sstables' from Pavel Emelyanov The PR adds sstables storage backend that keeps all component files as S3 objects and system.sstables_registry ownership table that keeps track of which sstable objects belong to the local node and their names. When a keyspace is configured with 'STORAGE = { 'type': 'S3' }' the respective class table object eventually gets the storage_options instance pointing to the target S3 endpoint and bucket. All the sstables created for that table attach the S3 storage implementation that maintains components' files as S3 objects. Writing to and reading from components is handled by the S3 client facilities from utils/. Changing the sstable state, which is -- moving between normal, staging and quarantine states -- is not yet implemented, but would eventually happen by updating entries in the sstables registry. To keep track of which node owns which objects, to provide bucket-wide uniqueness of object names and to maintain sstable state the storage driver keeps records in the system.sstables_registry ownership table. The table maps sstable location and generation to the object format, version, status-state (*) and (!) unique identifier (some time soon this identifier is supposed to be replaced with UUID sstables generations). The component object name is thus s3://bucket/uuid/component_basename. The registry is also used on boot. The distributed loader picks up sstables from all the tables found in schema and for S3-backed keyspaces it lists entries in the registry to a) identify those and b) get their unique S3-side identifiers to open by name. (*) About sstable's status and state. The state field is the part of today's sstable path on disk -- staging, quarantine, normal (root table data dir), etc. Since S3 doesn't have the renaming facility, moving sstable between those states is only possible by updating the entry in the registry. 
This is not yet implemented in this set (#13017) The status field tracks the sstable's transition through its creation-deletion. It first starts with 'creating' status which corresponds to today's TemporaryTOC file. After being created and written to, the sstable moves into 'sealed' state which corresponds to today's normal sstable with the TOC file. To delete an sstable atomically it first moves into 'removing' state which is equivalent to being in the deletion-log for the on-disk sstable. Once removed from the bucket, the entry is removed from the registry. To play with: 1. Start minio (installed by install-dependencies.sh) ``` export MINIO_ROOT_USER=${root_user} export MINIO_ROOT_PASSWORD=${root_pass} mkdir -p ${root_directory} minio server ${root_directory} ``` 2. Configure minio CLI, create anonymous bucket ``` mc config host rm local mc config host add local http://127.0.0.1:9000 ${root_user} ${root_pass} mc mb local/sstables mc anonymous set public local/sstables ``` 3. Start Scylla with object-storage feature enabled ``` scylla ... --experimental-features=keyspace-storage-options --workdir ${as_usual}``` 4. Create KS with S3 storage ``` create keyspace ... storage = { 'type': 'S3', 'endpoint': '127.0.0.1:9000', 'bucket': 'sstables' };``` The S3 client has a logger named "s3", it's useful to turn it on with `trace` verbosity. 
Closes #12523 * github.com:scylladb/scylladb: test: Add object-storage test distributed_loader: Print storage type when populating sstable_directory: Add ownership table components lister sstable_directory: Make components_lister and API sstable_directory: Create components lister based on storage options sstables: Add S3 storage implementation system_keyspace: Add ownership table system_keyspace: Plug to user sstables manager too sstable: Make storage instance based on storage options sstable_directory: Keep storage_options aboard sstable: Virtualize the helper that gets on-disk stats for sstable sstable, storage: Virtualize data sink making for small components sstable, storage: Virtualize data sink making for Data and Index sstable/writer: Shuffle writer::init_file_writers() sstable: Make storage an API utils: Add S3 readable file impl for random reads utils: Add S3 data sink for multipart upload utils: Add S3 client with basic ops cql-pytest: Add option to run scylla over stable directory test.py: Equip it with minio server sstables: Detach write_toc() helper	2023-04-11 08:17:25 +03:00
Benny Halevy	9105f9800c	sstables: add a printer for shared_sstable Refactor the printing logic in compaction::formatted_sstables_list out to sstables::to_string(const shared_sstable&, bool include_origin) and operator<<(const shared_sstable) on top of it. So that we can easily print std::vector<shared_sstable> from compaction_manager in the next patch. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-04-10 23:31:35 +03:00
Benny Halevy	1baca96de1	compaction_manager: needs_cleanup: delete unused schema param It isn't needed. The sstable already has a schema. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-04-10 23:03:53 +03:00
Benny Halevy	09df04c919	compaction: move owned_ranges into descriptor Move the owned_ranges_ptr, currently used only by cleanup and upgrade compactions, to the generic compaction descriptor so we apply cleanup in other compaction types. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-04-10 22:52:12 +03:00
Pavel Emelyanov	fd817e199c	Merge 'auth: replace operator<<(..) with fmt formatter' from Kefu Chai this is part of a series migrating from `operator<<(ostream&, ..)` based formatting to fmtlib based formatting. the goal here is to enable fmtlib to print `auth::auth_authentication_options` and `auth::resource_kind` without the help of fmt::ostream. and their `operator<<(ostream,..)` are dropped, as there are no users of them anymore. Refs #13245 Closes #13460 * github.com:scylladb/scylladb: auth: remove unused operator<<(.., resource_kind) auth: specialize fmt::formatter<resource_kind> auth: remove unused operator<<(.., authentication_option) auth: specialize fmt::formatter<authentication_option>	2023-04-10 17:05:09 +03:00
Pavel Emelyanov	4bb885b759	sstable: Make storage instance based on storage options This patch adds storage options lw-ptr to sstables_manager::make_sstable and makes the storage instance creation depend on the options. For local it just creates the filesystem storage instance, for S3 -- throws, but next patch will fix that. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-04-10 16:43:01 +03:00
Pavel Emelyanov	df026e2cb5	sstable_directory: Keep storage_options aboard The class in question will need to know the table's storage it will need to list sstables from. For that -- construct it with the storage options taken from table. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-04-10 16:43:01 +03:00
Pavel Emelyanov	033fa107f8	utils: Add S3 readable file impl for random reads Sometimes an sstable is used for random read, sometimes -- for streamed read using the input stream. For both cases the file API can be provided, because S3 API allows random reads of arbitrary lengths. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-04-10 16:43:01 +03:00
Pavel Emelyanov	a4a64149a6	utils: Add S3 data sink for multipart upload Putting a large object into S3 using plain PUT is a bad choice -- one needs to collect the whole object in memory, then send it as a content-length request with plain body. Multipart upload puts less stress on memory, but it has its limitation -- each part should be at least 5Mb in size. For that reason using file API doesn't work -- file IO API operates with external memory buffers and the file impl would only have raw pointers to it. In order to collect 5Mb of chunk in RAM the impl would have to copy the memory which is not good. Unlike the file API, the data_sink API is more flexible, as it has temporary buffers at hand and can cache them in zero-copy manner. Having said that, the S3 data_sink implementation is like this: * put(buffer): move the buffer into local cache, once the local cache grows above 5Mb send out the part * flush: send out whatever is in cache, then send upload completion request * close: check that the upload finished (in flush), abort the upload otherwise User of the API may (actually should) wrap the sink with output_stream and use it as any other output_stream. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-04-10 16:43:01 +03:00
Pavel Emelyanov	3745b5c715	utils: Add S3 client with basic ops Those include -- HEAD to get size, PUT to upload object in one go, GET to read the object as contiguous buffer and DELETE to drop one. The client uses http client from seastar and just implements the S3 protocol using it. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-04-10 16:43:01 +03:00
Nadav Har'El	d26bb8c12d	Merge 'tree: migrate from std::regex to boost::regex' from Botond Dénes Except for where usage of `std::regex` is required by 3rd party library interfaces. As demonstrated countless times, std::regex's practice of using recursion for pattern matching can result in stack overflow, especially on AARCH64. The most recent incident happened after merging https://github.com/scylladb/scylladb/pull/13075, which (indirectly) uses `sstables::make_entry_descriptor()` to test whether a certain path is a valid scylla table path in a trial-and-error manner. This resulted in stacks blowing up in AARCH64. To prevent this, use the already tried and tested method of switching from `std::regex` to `boost::regex`. Don't wait until each of the `std::regex` sites explode, replace them all preemptively. Refs: https://github.com/scylladb/scylladb/issues/13404 Closes #13452 * github.com:scylladb/scylladb: test: s/std::regex/boost::regex/ utils: s/std::regex/boost::regex/ db/commitlog: s/std::regex/boost::regex/ types: s/std::regex/boost::regex/ index: s/std::regex/boost::regex/ duration.cc: s/std::regex/boost::regex/ cql3: s/std::regex/boost::regex/ thrift: s/std::regex/boost::regex/ sstables: use s/std::regex/boost::regex/	2023-04-09 18:47:41 +03:00
Kefu Chai	9d5fbe226e	auth: remove unused operator<<(.., resource_kind) since the only user of operator<<(..., resource_kind) is now `auth_resource_test`, let's just move it into this test. and there is no need to keep this operator in the header file where `resource_kind` is defined. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-04-07 20:32:28 +08:00
Botond Dénes	452cb1a712	test: s/std::regex/boost::regex/ The former is prone to producing stack-overflow as it uses recursion in its match implementation. The migration is entirely mechanical.	2023-04-06 09:51:32 -04:00
Botond Dénes	0a46a574e6	Merge 'Topology: introduce nodes' from Benny Halevy As a first step towards using host_id to identify nodes instead of ip addresses this series introduces a node abstraction, kept in topology, indexed by both host_id and endpoint. The revised interface also allows callers to handle cases where nodes are not found in the topology more gracefully by introducing `find_node()` functions that look up nodes by host_id or inet_address and also get a `must_exist` parameter that, if false (the default parameter value) would return nullptr if the node is not found. If true, `find_node` throws an internal error, since this indicates a violation of an internal assumption that the node must exist in the topology. Callers that may handle missing nodes, should use the more permissive flavor and handle the !find_node() case gracefully. Closes #11987 * github.com:scylladb/scylladb: topology: add node state topology: remove dead code locator: add class node topology: rename update_endpoint to add_or_update_endpoint topology: define get_{rack,datacenter} inline shared_token_metadata: mutate_token_metadata: replicate to all shards locator: endpoint_dc_rack: refactor default_location locator: endpoint_dc_rack: define default operator== test: storage_proxy_test: provide valid endpoint_dc_rack	2023-04-06 13:47:22 +03:00
Tomasz Grabiec	bbabf07f69	Merge 'test/boost/multishard_mutation_query: use random schema' from Botond Dénes This test currently uses `test/lib/test_table.hh` to generate data for its test cases. This data generation facility is used by no other tests. Worse, it is redundant as we already have a random data generator with fixed schema, in `test/lib/mutation_source_test.hh`. So in this series, we migrate the test cases in said test file to random schema and its random data generation facilities. These are used by several other test cases and using random schema allows us to cover a wider (quasi-infinite) number of possibilities. After migrating all tests away from it, `test/lib/test_table.hh` is removed. This series also reduces the runtime of `fuzzy_test` drastically. It should now run in a few minutes or even in seconds (depending on the machine). Fixes: #12944 Closes #12574 * github.com:scylladb/scylladb: test/lib: rm test_table.hh test/boos/multishard_mutation_query_test: migrate other tests to random schema test/boost/multishard_mutation_query_test: use ks keyspace test/boost/multishard_mutation_query_test: improve test pager test/boost/multishard_mutation_query_test: refactor fuzzy_test test/boost: add multishard_mutation_query_test more memory types/user: add get_name() accessor test/lib/random_schema: add create_with_cql() test/lib/random_schema: fix udt handling test/lib/random_schema: type_generator(): also generate frozen types test/lib/random_schema: type_generator(): make static column generation conditional test/lib/random_schema: type_generator(): don't generate duration_type for keys test/lib/random_schema: generate_random_mutations(): add overload with seed test/lib/random_schema: generate_random_mutations(): respect range tombstone count param test/lib/random_schema: generate_random_mutations(): add yields test/lib/random_schema: generate_random_mutations(): fix indentation test/lib/random_schema: generate_random_mutations(): coroutinize method 
test/lib/random_schema: generate_random_mutations(): expand comment	2023-04-05 10:32:58 +02:00
Benny Halevy	c17df1759e	topology: add node state Add a simple node state model with: `joining`, `normal`, `leaving`, and `left` states to help manage nodes during replace with the same ip address. Later on, this could also help prevent nodes that were decommissioned, removed, or replaced from rejoining the cluster. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-04-02 20:18:31 +03:00
Benny Halevy	f3d5df5448	locator: add class node And keep per node information (idx, host_id, endpoint, dc_rack, is_pending) in node objects, indexed by topology on several indices like: idx, host_id, endpoint, current/pending, per dc, per dc/rack. The node index is a shorthand identifier for the node. node* and index are valid while the respective topology instance is valid. To be used, the caller must hold on to the topology / token_metadata object (e.g. via a token_metadata_ptr or effective_replication_map) Refs #6403 Signed-off-by: Benny Halevy <bhalevy@scylladb.com> topology: add node idx Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-04-02 20:13:02 +03:00
Benny Halevy	006e02410f	topology: rename update_endpoint to add_or_update_endpoint To reflect what it does. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-04-02 20:08:03 +03:00
Benny Halevy	5874a0d0ca	test: storage_proxy_test: provide valid endpoint_dc_rack Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-04-02 19:13:05 +03:00
Kefu Chai	e107b31d23	test: sstable: remove unused class in sstable test generation_for_sharded_test is not used by any of these sstable tests, so let's drop it. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #13388	2023-03-31 08:02:22 +03:00
Pavel Emelyanov	7d6ab5c84d	code: Remove some headers from query_processor.hh The forward_service.hh and raft_group0_client.hh can be replaced with forward declarations. Few other files need their previously indirectly included headers back. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes #13384	2023-03-31 07:08:41 +03:00
Botond Dénes	207dcbb8fa	Merge 'sstables: prepare for uuid-based generation_type' from Benny Halevy Preparing for #10459, this series defines sstables::generation_type::int_t as `int64_t` at the moment and uses that instead of naked `int64_t` variables so it can be changed in the future to hold e.g. a `std::variant<int64_t, sstables::generation_id>`. sstables::new_generation was defined to generate new, unique generations. Currently it is based on incrementing a counter, but it can be extended in the future to manufacture UUIDs. The unit tests are cleaned up in this series to minimize their dependency on numeric generations. Basically, they should be used for loading sstables with hard coded generation numbers stored under `test/resource/sstables`. For all the rest, the tests should use existing mechanisms and those introduced in this series such as generation_factory, sst_factory and smart make_sstable methods in sstable_test_env and table_for_tests to generate new sstables with a unique generation, and use the abstract sst->generation() method to get their generation if needed, without resorting to the actual value it may hold. Closes #12994 * github.com:scylladb/scylladb: everywhere: use sstables::generation_type test: sstable_test_env: use make_new_generation sstable_directory::components_lister::process: fixup indentation sstables: make highest_generation_seen return optional generation replica: table: add make_new_generation function replica: table: move sstable generation related functions out of line test: sstables: use generation_type::int_t sstables: generation_type: define int_t	2023-03-30 17:05:07 +03:00
Pavel Emelyanov	92318fdeae	Merge 'Initialize Wasm together with query_processor' from Wojciech Mitros The wasm engine is moved from replica::database to the query_processor. The wasm instance cache and compilation thread runner were already there, but now they're also initialized in the query_processor constructor. By moving the initialization to the constructor, we can now be certain that all wasm-related objects (wasm instance cache, compilation thread runner, and wasm engine, which was already passed in the constructor) are initialized when we try to use them because we have to use the query processor to access them anyway. The change is also motivated by the fact that we're planning to take Wasm UDFs out of experimental, after which they should stop getting special treatment. Closes #13311 * github.com:scylladb/scylladb: wasm: move wasm initialization to query_processor constructor wasm: return wasm instance cache as a reference instead of a pointer wasm: move wasm engine to query_processor	2023-03-30 14:30:23 +03:00
Avi Kivity	472b155d76	Merge 'Allow each compaction group to have its own compaction strategy state' from Raphael "Raph" Carvalho This is important for multiple compaction groups, as they cannot share state that must span a single SSTable set. The solution is about: 1) Decoupling compaction strategy from its state; making compaction_strategy a pure stateless entity 2) Each compaction group storing its own compaction strategy state 3) Compaction group feeds its state into compaction strategy whenever needed Closes #13351 * github.com:scylladb/scylladb: compaction: TWCS: wire up compaction_strategy_state compaction: LCS: wire up compaction_strategy_state compaction: Expose compaction_strategy_state through table_state replica: Add compaction_strategy_state to compaction group compaction: Introduce compaction_strategy_state compaction: add table_state param to compaction_strategy::notify_completion() compaction: LCS: extract state into a separate struct compaction: TWCS: prepare for stateless strategy compaction: TWCS: extract state into a separate struct compaction: add const-qualifier to a few compaction_strategy methods	2023-03-29 18:57:11 +03:00
Botond Dénes	bae62f899d	mutation/mutation_compactor: consume_partition_end(): reset _stop The purpose of `_stop` is to remember whether the consumption of the last partition was interrupted or it was consumed fully. In the former case, the compactor allows retrieving the compaction state for the given partition, so that its compaction can be resumed at a later point in time. Currently, `_stop` is set to `stop_iteration::yes` whenever the return value of any of the `consume()` methods is also `stop_iteration::yes`. Meaning, if the consuming of the partition is interrupted, this is remembered in `_stop`. However, a partition whose consumption was interrupted is not always continued later. Sometimes consumption of a partition is interrupted because the partition is not interesting and the downstream consumer wants to stop it. In these cases the compactor should not return an engaged optional from `detach_state()`, because there is no state to detach, the state should be thrown away. This was incorrectly handled so far and is fixed in this patch, by overwriting `_stop` in `consume_partition_end()` with whatever the downstream consumer returns. Meaning if they want to skip the partition, then `_stop` is reset to `stop_iteration::no` and `detach_state()` will return a disengaged optional as it should in this case. Fixes: #12629 Closes #13365	2023-03-29 17:48:45 +03:00
Wojciech Mitros	c9b701b516	wasm: return wasm instance cache as a reference instead of a pointer In an incoming change, the wasm instance cache will be modified to be owned by the query_processor - it will hold an optional instead of a raw pointer to the cache, so we should stop returning the raw pointer from the getter as well. Consequently, the cache is also stored as a reference in wasm::cache, as it gets the reference from the query_processor. For consistency with the wasm engine and the wasm alien thread runner, the name of the getter is also modified to follow the same pattern.	2023-03-28 18:18:48 +02:00
Raphael S. Carvalho	989afbf83b	compaction: TWCS: wire up compaction_strategy_state TWCS no longer keeps internal state, and will now rely on state managed by each compaction group through compaction::table_state. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2023-03-28 08:48:15 -03:00
Raphael S. Carvalho	233fe6d3dc	compaction: LCS: wire up compaction_strategy_state LCS no longer keeps internal state, and will now rely on state managed by each compaction group through compaction::table_state. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2023-03-28 08:48:15 -03:00
Botond Dénes	b6c022a142	Merge 'cmake: sync with `configure.py` (15/n)' from Kefu Chai this is the 15th changeset of a series which tries to give an overhaul to the CMake building system. this series has two goals: - to enable developers to use CMake for building scylla, so they can use tools (CLion for instance) with CMake integration for a better developer experience - to enable us to tweak the dependencies in a simpler way. a well-defined cross module / subsystem dependency is a prerequisite for building this project with the C++20 modules. this changeset includes the following changes: - build: cmake: add two missing tests - build: cmake: port more cxxflags from configure.py Closes #13262 * github.com:scylladb/scylladb: build: cmake: add missing source files to idl and service build: cmake: port more cxxflags from configure.py build: cmake: add two missing tests	2023-03-28 09:16:38 +03:00
Botond Dénes	ab61704c54	Merge 'mutation: replace operator<<(.., const range_tombstone&) with fmt formatter' from Kefu Chai this is a part of a series migrating from `operator<<(ostream&, ..)` based formatting to fmtlib based formatting. the goal here is to enable fmtlib to print `range_tombstone` and `range_tombstone_change` without using ostream<<. also, this change removes all existing callers of `operator<<(ostream, const range_tombstone &)` and `operator<<(ostream, const range_tombstone_change &)`, and then removes these two `operator<<`s. Refs #13245 Closes #13260 * github.com:scylladb/scylladb: mutation: drop operator<<(ostream, const range_tombstone{_change,} &) mutation: use fmtlib to print range_tombstone{_change,} mutation: mutation_fragment_v2: specialize fmt::formatter<range_tombstone_change> mutation: range_tombstone: specialize fmt::formatter<range_tombstone>	2023-03-27 11:38:59 +03:00
Botond Dénes	4b5b6a9010	test/lib: rm test_table.hh No users left.	2023-03-27 02:00:44 -04:00
Botond Dénes	3a43574b39	test/boost/multishard_mutation_query_test: migrate other tests to random schema Create a local method called create_test_table that has the same signature as test::create_test_table, but uses random schema behind the scenes to generate the schema and the data, then migrate all the test cases to use it instead. To accommodate the randomness added by the random schema and random data, the unreliable querier cache population checks were replaced with more reliable lookup and miss checks, to prevent test flakiness. Querier cache population checks worked well with a fixed and simple schema and a fixed table population; they don't work that well with random data. With this, there are no more uses of test_table.hh in this test and the include can be removed.	2023-03-27 02:00:44 -04:00
Botond Dénes	56a9968817	test/boost/multishard_mutation_query_test: use ks keyspace This keyspace exists by default and thus we don't have to create a new one for each test. Also use `get_name()` to pass the test case's name as table name, instead of hard-coding it. We already had some copy-pasta creep in: two tests used the same table name. This is an error, as each test runs in its own env, but it is confusing to see another test case's name in the logs.	2023-03-27 02:00:44 -04:00
Botond Dénes	ad313d8eef	test/boost/multishard_mutation_query_test: improve test pager Propagate the page size to the result builder, so it can determine when a page is short and thus it is the last page, instead of asking for more pages until an empty one turns up. This will make tests more reliable when dealing with random datasets. Also change how the page counter is bumped: bump it after the current page is executed, at which point we know whether there will be a next page or not. This fixes an off-by-one seen in some cases.	2023-03-27 02:00:44 -04:00
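
The pager change described in ad313d8eef can be illustrated with a minimal, self-contained sketch. This is not the code from the test: `read_page`, `paging_result` and `read_all_paged` are hypothetical names, and a plain `std::vector<int>` stands in for the table. It only shows the two ideas from the commit message: a page shorter than the page size is treated as the last page, and the page counter is bumped after each page is executed.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for one paged read: returns up to page_size rows
// starting at offset (fewer if the data runs out).
std::vector<int> read_page(const std::vector<int>& rows, std::size_t offset, std::size_t page_size) {
    const std::size_t begin = std::min(offset, rows.size());
    const std::size_t end = std::min(begin + page_size, rows.size());
    return std::vector<int>(rows.begin() + static_cast<std::ptrdiff_t>(begin),
                            rows.begin() + static_cast<std::ptrdiff_t>(end));
}

// Hypothetical result type: all rows read plus the number of pages executed.
struct paging_result {
    std::vector<int> rows;
    std::size_t pages = 0;
};

// Reads everything page by page. A short page (fewer rows than page_size)
// is taken as the last page, so no extra request is issued after it.
paging_result read_all_paged(const std::vector<int>& rows, std::size_t page_size) {
    paging_result res;
    if (page_size == 0) {
        return res; // nothing sensible to read with a zero page size
    }
    bool more = true;
    while (more) {
        const auto page = read_page(rows, res.rows.size(), page_size);
        res.rows.insert(res.rows.end(), page.begin(), page.end());
        ++res.pages;                      // bump only after the page was executed
        more = page.size() == page_size;  // a full page may be followed by another
    }
    return res;
}
```

With five rows and a page size of two, this sketch executes three pages, and the short third page ends the loop. When the row count is an exact multiple of the page size, one trailing empty page is still needed, since a full page cannot be told apart from a non-final one.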

2390 Commits