scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-04-25 11:00:35 +00:00

Author	SHA1	Message	Date
Robert Bindar	27f2d64725	Remove object storage config credentials provider During development of #22428 we decided that we have no need for `object-storage.yaml`, and we'd rather store the endpoints in `scylla.yaml` and get a REST API to expose the endpoints for free. This patch removes the credentials provider used to read the AWS keys from this yaml file. Followup work will remove the `object-storage.yaml` file altogether and move the endpoints to `scylla.yaml`. Signed-off-by: Robert Bindar <robert.bindar@scylladb.com> Closes scylladb/scylladb#22951	2025-03-07 10:40:58 +03:00
Kefu Chai	a43072a21e	cql3,test: replace boost::range::adjacent_find with std::ranges to reduce third-party dependencies and modernize the codebase. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#22998	2025-03-04 10:08:02 +02:00
Pavel Emelyanov	e4e15a00b7	Merge 'reader_concurrency_semaphore: register_inactive_read(): handle aborted permit' from Botond Dénes It is possible that the permit handed in to register_inactive_read() is already aborted (currently only possible if permit timed out). If the permit also happens to have wait for memory, the current code will attempt to call promise<>::set_exception() on the permit's promise to abort its waiters. But if the permit was already aborted via timeout, this promise will already have an exception and this will trigger an assert. Add a separate case for checking if the permit is aborted already. If so, treat it as immediate eviction: close the reader and clean up. Fixes: scylladb/scylladb#22919 Bug is present in all live versions, backports are required. Closes scylladb/scylladb#23044 * github.com:scylladb/scylladb: reader_concurrency_semaphore: register_inactive_read(): handle aborted permit test/boost/reader_concurrency_semaphore_test: move away from db::timeout_clock::now()	2025-03-04 10:40:28 +03:00
Kefu Chai	5571b537b5	tree: Make values mutable to enable move semantics Previously, variables were marked as const, causing std::move() calls to be redundant as reported by GCC warnings. This change either removes const qualifiers or marks related lambdas as mutable, allowing the compiler to properly utilize move constructors for better performance. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#23066	2025-03-03 13:53:02 +03:00
Evgeniy Naydanov	cb0e0ebcf7	test.py: extract prepare dirs and S3 mock steps to test/conftest.py As a part of the moving to bare pytest we need to extract the required test environment preparation steps into pytest's hooks/fixtures. Do this for S3 mock stuff (MinioServer, MockS3Server, and S3ProxyServer) and for directories with test artifacts. For compatibility reason add --test-py-init CLI option for bare pytest test runner: need to add it to pytest command if you need test.py stuff in your tests (boost, topology, etc.) Also, postpone initialization of TestSuite.artifacts and TestSuite.hosts from import-time to runtime. Closes scylladb/scylladb#23087	2025-03-03 13:24:37 +03:00
Botond Dénes	7ba29ec46c	reader_concurrency_semaphore: register_inactive_read(): handle aborted permit It is possible that the permit handed in to register_inactive_read() is already aborted (currently only possible if permit timed out). If the permit also happens to have wait for memory, the current code will attempt to call promise<>::set_exception() on the permit's promise to abort its waiters. But if the permit was already aborted via timeout, this promise will already have an exception and this will trigger an assert. Add a separate case for checking if the permit is aborted already. If so, treat it as immediate eviction: close the reader and clean up. Fixes: scylladb/scylladb#22919	2025-02-28 01:32:46 -05:00
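The failure mode and the added check can be illustrated with a minimal, synchronous sketch. Everything here is an assumption made for illustration: `one_shot_promise`, `permit_sketch` and `register_inactive_read_sketch` are hypothetical stand-ins, not Seastar's `promise<>` or ScyllaDB's `reader_permit`, and the stand-in throws where the real code would hit an assert.

```cpp
#include <exception>
#include <stdexcept>
#include <utility>

// Stand-in for a promise that may be failed only once. The real promise
// asserts on a second set_exception(); this sketch throws instead so the
// misuse is observable.
class one_shot_promise {
    std::exception_ptr _ex;
public:
    void set_exception(std::exception_ptr ex) {
        if (_ex) {
            throw std::logic_error("promise already has an exception");
        }
        _ex = std::move(ex);
    }
    bool failed() const { return static_cast<bool>(_ex); }
};

// Hypothetical permit state relevant to the bug.
struct permit_sketch {
    one_shot_promise memory_waiters;
    bool aborted = false;             // set when the permit timed out
    bool waiting_for_memory = false;
};

enum class register_result { evicted_immediately, registered };

register_result register_inactive_read_sketch(permit_sketch& permit) {
    if (permit.aborted) {
        // Added case: the promise already carries an exception, so do not
        // touch it again. Treat as immediate eviction and clean up.
        return register_result::evicted_immediately;
    }
    if (permit.waiting_for_memory) {
        // Pre-existing path: abort the waiters by failing the promise.
        permit.memory_waiters.set_exception(
            std::make_exception_ptr(std::runtime_error("evicted")));
        permit.aborted = true;
        return register_result::evicted_immediately;
    }
    return register_result::registered;
}
```

Without the first branch, a permit that had already timed out while waiting for memory would reach the second branch and fail the promise a second time.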
Botond Dénes	4d8eb02b8d	test/boost/reader_concurrency_semaphore_test: move away from db::timeout_clock::now() Unless the test in question actually wants to test timeouts. Timeouts will have more pronounced consequences soon and thus using db::timeout_clock::now() becomes a sure way to make tests flaky. To avoid this, use db::no_timeout in the tests that don't care about timeouts.	2025-02-28 01:31:33 -05:00
Kefu Chai	6e4df57f97	mutation,test: replace boost::equal with std::ranges::equal to reduce third-party dependencies and modernize the codebase. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#22999	2025-02-26 14:27:42 +03:00
Avi Kivity	6e70e69246	test/lib: mutation_assertions: deinline While generally better to reduce inline code, here we get rid of the clustering_interval_set.hh dependency, which in turns depends on boost interval_set, a large dependency. incremental_compaction_test.cc is adjusted for a missing header. Closes scylladb/scylladb#22957	2025-02-25 11:40:54 +01:00
Kefu Chai	9fdbe0e74b	tree: Remove unused boost headers This commit eliminates unused boost header includes from the tree. Removing these unnecessary includes reduces dependencies on the external Boost.Adapters library, leading to faster compile times and a slightly cleaner codebase. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#22997	2025-02-25 10:32:32 +03:00
Tomasz Grabiec	1a7023c85a	config, tablets: Allow tablets_initial_scale_factor to be a fraction We may want fewer than 1 tablets per shard in large clusters. The per-table option is a fraction, so for consistency, this should be too.	2025-02-19 16:29:08 +01:00
Tomasz Grabiec	2b2fa0203e	test: tablets_test: Test scaling when creating lots of tables	2025-02-19 16:29:08 +01:00
Tomasz Grabiec	0e111990a1	test: tablets_test: Test tablet count changes on per-table option and config changes	2025-02-19 16:29:08 +01:00
Tomasz Grabiec	5e471c6f1b	test: tablets_test: Add support for auto-split mode rebalance_tablets() was performing migrations and merges automatically but not splits, because splits need to be acked by replicas via load_stats. It's inconvenient in tests which want to rebalance to the equilibrium point. This patch changes rebalance_tablets() to split automatically by default, can be disabled for tests which expect differently. shared_load_stats was introduced to provide a stable holder of load_stats which can be reused across rebalance_tablets() calls.	2025-02-19 16:29:08 +01:00
Tomasz Grabiec	f1bda8d4c1	tablets: load_balancer: Scale down tablet count to respect per-shard tablet count goal The limit is enforced by controlling average per-shard tablet replica count in a given DC, which is controlled by per-table tablet count. This is effective in respecting the limit on individual shards as long as tablet replicas are distributed evenly between shards. There is no attempt to move tablets around in order to enforce limits on individual shards in case of imbalance between shards. If the average per-shard tablet count exceeds the limit, all tables which contribute to it (have replicas in the DC) are scaled down by the same factor. Due to rounding up to the nearest power of 2, we may overshoot the per-shard goal by at most a factor of 2. If different DCs want different scale factors of a given table, the lowest scale factor is chosen for a given table. The limit is configurable. It's a global per-cluster config which controls how many tablet replicas per shard in total we consider to be still ok. It controls tablet allocator behavior, when choosing initial tablet count. Even though it's a per-node config, we don't support different limits per node. All nodes must have the same value of that config. It's similar in that regard to other scheduler config items like tablets_initial_scale_factor and target_tablet_size_in_bytes.	2025-02-19 16:29:07 +01:00
Tomasz Grabiec	94b5165ac7	tablets: Use scheduler's make_sizing_plan() to decide about tablet count of a new table This makes decisions made by the scheduler consistent with decisions made on table creation, with regard to tablet count. We want to avoid over-allocation of tablets when table is created, which would then be reduced by the scheduler's scaling logic. Not just to avoid wasteful migrations post table creation, but to respect the per-shard goal. To respect the per-shard goal, the algorithm will no longer be as simple as looking at hints, and we want to share the algorithm between the scheduler and initial tablet allocator. So invoke the scheduler to get the tablet count when table is created.	2025-02-19 14:40:07 +01:00
Tomasz Grabiec	9d600dd783	tablets: load_balancer: Drop test_mode tablets_test is now creating proper schema in the database, so test_mode is no longer needed.	2025-02-19 14:38:48 +01:00
Botond Dénes	3928851ab0	Merge 'encryption_at_rest_test/encryption: Add some verbosity etc to help diagnose test run issues' from Calle Wilund Refs #22628 Adds exception handler + cleanup for the case where we have a bad config/env vars (hint minio) or similar, such that we fail with exception during setting up the EAR context. In a normal startup, this is ok. We will report the exception, and then do an exit(1). In tests, however, we don't exit, and the active context is instead destroyed normally, in which case we need to call stop to ensure we don't crash on shared pointer destruction on the wrong shard. Such a crash would hide the real issue from whoever runs the test. Adds some verbosity to track issues with the network proxy used to test EAR connector difficulties. Also adds an earlier close in input stream to help network usage. Note: This is a diagnostic helper. Still cannot repro the issue above. Closes scylladb/scylladb#22810 * github.com:scylladb/scylladb: gcp/aws kms: Promote service_error to recoverable + use malformed_response_error encryption_at_rest_test: Add verbosity + earlier stream close to proxy encryption: Add exception handler to context init (for tests)	2025-02-18 10:29:30 +02:00
Avi Kivity	30a38e61d4	Merge 'sstables_manager: trigger reclaim/reload on `components_memory_reclaim_threshold` update' from Lakshmi Narayanan Sreethar The config variable `components_memory_reclaim_threshold` limits the memory available to the sstable bloom filters. Any change to its value is not immediately propagated to the sstable manager, despite it being a LiveUpdate variable. The updated value takes effect only when a new sstable is created or deleted. This PR first refactors the reclaim and reload logic into a single background fiber. It then updates the sstable manager to subscribe to changes in the `components_memory_reclaim_threshold` configuration value and immediately triggers the reclaim/reload fiber when a change is detected. Fixes #21947 This is an improvement and does not need to be backported. Closes scylladb/scylladb#22725 * github.com:scylladb/scylladb: sstables_manager: trigger reclaim/reload on `components_memory_reclaim_threshold` update sstables_manager: maybe_reclaim_components: yield between iterations sstables_manager: rename `increment_total_reclaimable_memory_and_maybe_reclaim()` sstables_manager: move reclaim logic into `components_reclaim_reload_fiber()` sstables_manager: rename `_sstable_deleted_event` condition variable sstables_manager: rename `components_reloader_fiber()` sstables_manager: fix `maybe_reclaim_components()` indentation sstables_manager: reclaim components memory until usage falls below threshold sstables_manager: introduce `get_components_memory_reclaim_threshold()` sstables_manager: extract `maybe_reclaim_components()` sstables_manager: fix `maybe_reload_components()` indentation sstables_manager: extract out `maybe_reload_components()`	2025-02-17 22:33:33 +02:00
Lakshmi Narayanan Sreethar	064bf2fd85	sstables_manager: trigger reclaim/reload on `components_memory_reclaim_threshold` update The config variable `components_memory_reclaim_threshold` limits the memory available to the sstable bloom filters. Any change to its value is not immediately propagated to the sstable manager, despite it being a LiveUpdate variable. The updated value takes effect only when a new sstable is created or deleted. This patch updates the sstable manager to subscribe to any changes in the above mentioned config value and immediately trigger the reclaim/reload fiber when a change occurs. Also, adds a testcase to verify the fix. Fixes #21947 Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com>	2025-02-17 20:55:45 +05:30
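The subscribe-and-trigger pattern this commit describes can be sketched with a plain observer. All names here (`live_value_sketch`, `manager_sketch`, `reclaim_runs`) are hypothetical, and the sketch is synchronous: it counts triggers where the real manager would wake a background fiber.

```cpp
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical live-updatable config value that notifies observers on change.
template <typename T>
class live_value_sketch {
    T _value;
    std::vector<std::function<void(const T&)>> _observers;
public:
    explicit live_value_sketch(T v) : _value(std::move(v)) {}
    void observe(std::function<void(const T&)> cb) {
        _observers.push_back(std::move(cb));
    }
    void set(T v) {
        _value = std::move(v);
        for (auto& cb : _observers) {
            cb(_value);  // propagate the new value immediately
        }
    }
    const T& get() const { return _value; }
};

// Hypothetical manager: subscribes at construction and reacts to each update
// instead of waiting for the next sstable creation or deletion.
class manager_sketch {
    double _threshold;
    std::size_t _reclaim_runs = 0;
public:
    explicit manager_sketch(live_value_sketch<double>& cfg) : _threshold(cfg.get()) {
        // The callback captures `this`, so the manager must outlive the
        // config value's observer list and must not be copied or moved.
        cfg.observe([this](const double& v) {
            _threshold = v;
            ++_reclaim_runs;  // stand-in for signalling the reclaim/reload fiber
        });
    }
    manager_sketch(const manager_sketch&) = delete;
    manager_sketch& operator=(const manager_sketch&) = delete;
    double threshold() const { return _threshold; }
    std::size_t reclaim_runs() const { return _reclaim_runs; }
};
```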
Calle Wilund	5905c19ab4	encryption_at_rest_test: Add verbosity + earlier stream close to proxy Refs #22628 Adds some verbosity to track issues with the network proxy used to test EAR connector difficulties. Also adds an earlier close in input stream to help network usage. Note: This is a diagnostic helper. Still cannot repro the issue above.	2025-02-17 13:49:43 +00:00
Botond Dénes	3439d015cb	Merge 'repair: Introduce Host and DC filter support' from Aleksandra Martyniuk Currently, the tablet repair scheduler repairs all replicas of a tablet. It does not support hosts or DCs selection. It should be enough for most cases. However, users might still want to limit the repair to certain hosts or DCs in production. https://github.com/scylladb/scylladb/pull/21985 added the preparation work to add the config options for the selection. This patch adds the hosts or DCs selection support. Fixes https://github.com/scylladb/scylladb/issues/22417 New feature. No backport is needed. Closes scylladb/scylladb#22621 * github.com:scylladb/scylladb: test: add test to check dcs and hosts repair filter test: add repair dc selection to test_tablet_metadata_persistence repair: Introduce Host and DC filter support docs: locator: update the docs and formatter of tablet_task_info	2025-02-17 10:04:09 +02:00
Kefu Chai	aa8c27b872	db: prevent accidental copies of result_set_row by making it move-only result_set_row is a heavyweight object containing multiple cell types: regular columns, partition keys, and static values. To prevent expensive accidental copies, delete the copy constructor and replace it with: 1. A move constructor for efficient vector reallocation 2. An explicit copy() method when copies are actually needed This change reduces overhead in some non-hot paths by eliminating implicit deep copies. Please note that previously, in `create_view_from_mutation()`, we kept a copy of `result_set_row`, and then reused `table_rs` for holding the mutation for `scylla_tables`. Because we don't copy the `result_set_row` in this change, in order to avoid invalidating the `row` after reusing `table_rs` in the outer scope, we define a new `table_rs` shadowing the one in the outer scope. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#22741	2025-02-17 09:48:08 +02:00
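The move-only pattern described in this commit looks roughly like the sketch below. `row_sketch` is a hypothetical stand-in with a single vector of cells; it is not the real `result_set_row`.

```cpp
#include <string>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical stand-in for a heavyweight row type.
class row_sketch {
    std::vector<std::string> _cells;
public:
    explicit row_sketch(std::vector<std::string> cells) : _cells(std::move(cells)) {}

    // Implicit copies are disabled so an accidental deep copy fails to compile.
    row_sketch(const row_sketch&) = delete;
    row_sketch& operator=(const row_sketch&) = delete;

    // Moves stay cheap and noexcept, so std::vector can relocate rows by move.
    row_sketch(row_sketch&&) noexcept = default;
    row_sketch& operator=(row_sketch&&) noexcept = default;

    // Copies remain possible, but only when spelled out at the call site.
    row_sketch copy() const { return row_sketch(_cells); }

    const std::vector<std::string>& cells() const { return _cells; }
};

static_assert(!std::is_copy_constructible_v<row_sketch>);
static_assert(std::is_nothrow_move_constructible_v<row_sketch>);
```

A caller that really needs a duplicate writes `auto dup = row.copy();`, which makes the cost visible in review.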
Kefu Chai	7ff0d7ba98	tree: Remove unused boost headers This commit eliminates unused boost header includes from the tree. Removing these unnecessary includes reduces dependencies on the external Boost.Adapters library, leading to faster compile times and a slightly cleaner codebase. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#22857	2025-02-15 20:32:22 +02:00
Raphael S. Carvalho	d78f57e94a	service: Don't use new tablet_resize_finalization state until supported In a rolling upgrade, nodes that weren't upgraded yet will not recognize the new tablet_resize_finalization state, that serves both split and merges, leading to a crash. To fix that, coordinator will pick the old tablet_split_finalization state for serving split finalization, until the cluster agrees on merge, so it can start using the new generic state for resize finalization introduced in merge series. Regression was introduced in `e00798f`. Fixes #22840. Reported-by: Tomasz Grabiec <tgrabiec@scylladb.com> Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Closes scylladb/scylladb#22845	2025-02-15 20:32:22 +02:00
Lakshmi Narayanan Sreethar	7f0f839d6d	sstables_manager: move reclaim logic into `components_reclaim_reload_fiber()` Move the sstable reclaim logic into `components_reclaim_reload_fiber()` in preparation for the fix for #21947. This also simplifies the overall reclaim/reload logic by preventing multiple fibers from attempting to reclaim/reload component memory concurrently. Also, update the existing test cases to adapt to this change. Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com>	2025-02-14 22:11:04 +05:30
Aleksandra Martyniuk	1c8a41e2dd	test: add repair dc selection to test_tablet_metadata_persistence	2025-02-14 09:13:11 +01:00
Kefu Chai	481397317d	sstables, test: migrate from boost::copy() to std::ranges::copy() Replace boost::copy() with the standard library's std::ranges::copy() to reduce external dependencies and simplify the codebase. This change eliminates the requirement for boost::range and makes the implementation more maintainable. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#22789	2025-02-11 14:55:25 +03:00
Botond Dénes	51a273401c	Merge 'test: tablets_test: Create proper schema in load balancer tests' from Tomasz Grabiec This PR converts boost load balancer tests in preparation for load balancer changes which add per-table tablet hints. After those changes, load balancer consults with the replication strategy in the database, so we need to create proper schema in the database. To do that, we need proper topology for replication strategies which use RF > 1, otherwise keyspace creation will fail. Topology is created in tests via group0 commands, which is abstracted by the new `topology_builder` class. Tests cannot modify token_metadata only in memory now as it needs to be consistent with the schema and on-disk metadata. That's why modifications to tablet metadata are now made under group0 guard and save back metadata to disk. Closes scylladb/scylladb#22648 * github.com:scylladb/scylladb: test: tablets: Drop keyspace after do_test_load_balancing_merge_colocation() scenario tests: tablets: Set initial tablets to 1 to exit growing mode test: tablets_test: Create proper schema in load balancer tests test: lib: Introduce topology_builder test: cql_test_env: Expose topology_state_machine topology_state_machine: Introduce lock transition	2025-02-10 16:08:41 +02:00
Nadav Har'El	a492e239e3	Merge 'test.py: Add the possibility to run boost and unit tests with pytest ' from Andrei Chekun Add the possibility to run boost and unit tests with pytest. test.py should follow this paradigm: the ability to run all test cases sequentially with ONE pytest command. With this paradigm, to get better performance, we can split this one command into 2, 3, 4, 5, 100, 200... whatever we want. It's new functionality that does not touch the test.py way of executing the boost and unit tests. It supports the main features of the test.py way of execution: automatic discovery of modes, repeats. There is an additional requirement to execute tests in parallel: pytest-xdist. To install it, execute `pip install pytest-xdist`. To run tests with pytest, execute `pytest test/boost`. To execute only one file, provide the path to the file: `pytest test/boost/aggregate_fcts_test.cc`. Since it's a normal path, autocompletion will work in the terminal. To provide a specific mode, use the parameter `--mode dev`; if the parameter is not provided, pytest will try to use `ninja mode_list` to find out the compiled modes. Parallel execution is controlled by pytest-xdist and the parameter `-n 12`. A useful command to discover the tests in a file or directory is `pytest --collect-only -q --mode dev test/boost/aggregate_fcts_test.cc`. That will return all test functions in the file. To execute only one function from the test, you can reuse the output of the previous command, but the suffix for the mode should be dropped. For example, the output will be `test/boost/aggregate_fcts_test.cc::test_aggregate_avg.dev`, so to execute this specific test function, use the command `pytest --mode dev test/boost/aggregate_fcts_test.cc::test_aggregate_avg`. There is a parameter `--repeat` that is used to repeat the test case several times in the same way as test.py did. 
It's not possible to run both boost and unit tests directories with one command, so we need to provide explicitly which directory should be executed. Like this `pytest --mode dev test/unit` or `pytest --mode dev test/boost` Fixes: https://github.com/scylladb/qa-tasks/issues/1775 Closes scylladb/scylladb#21108 * github.com:scylladb/scylladb: test.py: Add possibility to run ldap tests from pytest test.py: Add the possibility to run unit tests from pytest test.py: Add the possibility to run boost test from pytest test.py: Add discovery for C++ tests for pytest test.py: Modify s3 server mock test.py: Add method to get environment variables from MinIO wrapper test.py: Move get configured modes to common lib	2025-02-09 11:56:24 +01:00
Avi Kivity	9712390336	Merge 'Add per-table tablet options in schema' from Benny Halevy This series extends the table schema with per-table tablet options. The options are used as hints for initial tablet allocation on table creation and later for resize (split or merge) decisions, when the table size changes. * New feature, no backport required Closes scylladb/scylladb#22090 * github.com:scylladb/scylladb: tablets: resize_decision: get rid of initial_decision tablet_allocator: consider tablet options for resize decision tablet_allocator: load_balancer: table_size_desc: keep target_tablet_size as member network_topology_strategy: allocate_tablets_for_new_table: consider tablet options network_topology_strategy: calculate_initial_tablets_from_topology: precalculate shards per dc using for_each_token_owner network_topology_strategy: calculate_initial_tablets_from_topology: set default rf to 0 cql3: data_dictionary: format keyspace_metadata: print "enabled":true when initial_tablets=0 cql3/create_keyspace_statement: add deprecation warning for initial tablets test: cqlpy: test_tablets: add tests for per-table tablet options schema: add per-table tablet options feature_service: add TABLET_OPTIONS cluster schema feature	2025-02-08 20:32:19 +02:00
Avi Kivity	9db9b0963f	Merge 'reader_concurrency_semaphore: set_notify_handler(): disable timeout' from Botond Dénes `set_notify_handler()` is called after a querier was inserted into the querier cache. It has two purposes: set a callback for eviction and set a TTL for the cache entry. The latter was not disabling the pre-existing timeout of the permit (if any) and this would lead to premature eviction of the cache entry if the timeout was shorter than TTL (which is typical). Disable the timeout before setting the TTL to prevent premature eviction. Fixes: https://github.com/scylladb/scylladb/issues/22629 Backport required to all active releases, they are all affected. Closes scylladb/scylladb#22701 * github.com:scylladb/scylladb: reader_concurrency_semaphore: set_notify_handler(): disable timeout reader_permit: mark check_abort() as const	2025-02-08 20:05:03 +02:00
Andrei Chekun	8ef840a1c5	test.py: Add the possibility to run boost test from pytest Add the possibility to run boost test from pytest. Boost facade based on code from https://github.com/pytest-dev/pytest-cpp, but enhanced and rewritten to suit better.	2025-02-07 21:40:25 +01:00
Tomasz Grabiec	1854ea2165	test: tablets: Drop keyspace after do_test_load_balancing_merge_colocation() scenario This scenario is invoked in a loop in the test_load_balancing_merge_colocation_with_random_load test case, which will cause accumulation of tablet maps making each reload slower in subsequent iterations. It wasn't a problem before because we overwrote tablet_metadata in each iteration to contain only tablets for the current table, but now we need to keep it consistent with the schema and don't do that.	2025-02-07 17:13:52 +01:00
Tomasz Grabiec	58460a8863	tests: tablets: Set initial tablets to 1 to exit growing mode After tablet hints, there is no notion of leaving growing mode and tablet count is sustained continuously by initial tablet option, so we need to lower it for merge to happen.	2025-02-07 17:13:52 +01:00
Tomasz Grabiec	ca6159fbe2	test: tablets_test: Create proper schema in load balancer tests This is in preparation for load balancer changes needed to respect per-table tablet hints and respecting per-shard tablet count goal. After those changes, load balancer consults with the replication strategy in the database, so we need to create proper schema in the database. To do that, we need proper topology for replication strategies which use RF > 1, otherwise keyspace creation will fail.	2025-02-07 17:13:52 +01:00
Botond Dénes	9174f27cc8	reader_concurrency_semaphore: set_notify_handler(): disable timeout set_notify_handler() is called after a querier was inserted into the querier cache. It has two purposes: set a callback for eviction and set a TTL for the cache entry. The latter was not disabling the pre-existing timeout of the permit (if any) and this would lead to premature eviction of the cache entry if the timeout was shorter than TTL (which is typical). Disable the timeout before setting the TTL to prevent premature eviction. Fixes: scylladb/scylladb#22629	2025-02-07 02:31:01 -05:00
Avi Kivity	861fb58e14	Merge 'vector: add support for vector type' from Dawid Pawlik This pull request is an implementation of vector data type similar to one used by Apache Cassandra. The patch contains: - implementation of vector_type_impl class - necessary functionalities similar to other data types - support for serialization and deserialization of vectors - support for Lua and JSON format - valid CQL syntax for `vector<>` type - `type_parser` support for vectors - expression adjustments such as: - add `collection_constructor::style_type::vector` - rename `collection_constructor::style_type::list` to `collection_constructor::style_type::list_or_vector` - vector type encoding (for drivers) - unit tests - cassandra compatibility tests - necessary documentation Co-authored-by: @janpiotrlakomy Fixes https://github.com/scylladb/scylladb/issues/19455 Closes scylladb/scylladb#22488 * github.com:scylladb/scylladb: docs: add vector type documentation cassandra_tests: translate tests covering the vector type type_codec: add vector type encoding boost/expr_test: add vector expression tests expression: adjust collection constructor list style expression: add vector style type test/boost: add vector type cql_env boost tests test/boost: add vector type_parser tests type_parser: support vector type cql3: add vector type syntax types: implement vector_type_impl	2025-02-06 20:36:50 +02:00
Benny Halevy	20c6ca2813	tablet_allocator: consider tablet options for resize decision Do not merge tablets if that would drop the tablet_count below the minimum provided by hints. Split tablets if the current tablet_count is less than the minimum tablet count calculated using the table's tablet options. TODO: override min_tablet_count if the tablet count per shard is greater than the maximum allowed. In this case the tables tablet counts should be scaled down proportionally. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2025-02-06 18:43:35 +02:00
Pavel Emelyanov	951625ca13	Merge 's3 client: add aws credentials providers' from Ernest Zaslavsky This update introduces four types of credential providers: 1. Environment variables 2. Configuration file 3. AWS STS 4. EC2 Metadata service The first two providers should only be used for testing and local runs. They must NEVER be used in production. The last two providers are intended for use on real EC2 instances: - AWS STS: Preferred method for obtaining temporary credentials using IAM roles. - EC2 Metadata Service: Should be used as a last resort. Additionally, a simple credentials provider chain is created. It queries each provider sequentially until valid credentials are obtained. If all providers fail, it returns an empty result. fixes: #21828 Closes scylladb/scylladb#21830 * github.com:scylladb/scylladb: docs: update the `object_storage.md` and `admin.rst` aws creds: add STS and Instance Metadata service credentials providers aws creds: add env. and file credentials providers s3 creds: move credentials out of endpoint config	2025-02-06 11:12:37 +03:00
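The provider chain described in this merge can be sketched as a simple sequential lookup. This is a synchronous simplification under stated assumptions: `aws_credentials_sketch`, `credentials_provider` and `query_chain` are hypothetical names, and each provider is modelled as a callable returning an optional value rather than the asynchronous providers the S3 client actually uses.

```cpp
#include <functional>
#include <optional>
#include <string>
#include <vector>

// Hypothetical credentials record; field names are illustrative.
struct aws_credentials_sketch {
    std::string access_key_id;
    std::string secret_access_key;
};

// A provider either yields credentials or reports that it has none.
using credentials_provider =
    std::function<std::optional<aws_credentials_sketch>()>;

// Query each provider in order and stop at the first one that returns usable
// credentials. If every provider fails, the result is empty.
std::optional<aws_credentials_sketch> query_chain(
        const std::vector<credentials_provider>& providers) {
    for (const auto& provider : providers) {
        if (auto creds = provider(); creds && !creds->access_key_id.empty()) {
            return creds;
        }
    }
    return std::nullopt;
}
```

The order of the vector encodes the preference between providers, so putting a test-only provider first or last is a deployment decision rather than something the chain itself enforces.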
Benny Halevy	32c2f7579f	network_topology_strategy: allocate_tablets_for_new_table: consider tablet options Use the keyspace initial_tablets for min_tablet_count, if the latter isn't set, then take the maximum of the option-based tablet counts: - min_tablet_count - and expected_data_size_in_gb / target_tablet_size - min_per_shard_tablet_count (via calculate_initial_tablets_from_topology) If none of the hints produce a positive tablet_count, fall back to calculate_initial_tablets_from_topology * initial_scale. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2025-02-06 08:59:32 +02:00
Benny Halevy	c5668d99c9	schema: add per-table tablet options Unlike with vnodes, each tablet is served only by a single shard, and it is associated with a memtable that, when flushed, creates sstables whose token range is confined to the tablet owning them. On one hand, this allows for far better agility and elasticity since migration of tablets between nodes or shards does not require rewriting most if not all of the sstables, as required with vnodes (at the cleanup phase). On the other hand, having too few tablets might limit performance due to not being served by all shards or due to imbalance between shards caused by quantization. The number of tablets per table has to be a power of 2 with the current design, and when divided by the number of shards, some shards will serve N tablets, while others may serve N+1, and when N is small (N+1)/N may be significantly larger than 1. For example, with N=1, some shards will serve 2 tablet replicas and some will serve only 1, causing an imbalance of 100%. Now, simply allocating a lot more tablets for each table may theoretically address this problem, but practically: a. Each tablet has memory overhead and having too many tablets in the system with many tables and many tablets for each of them may overwhelm the system and cause out-of-memory errors. b. Too-small tablets cause a proliferation of small sstables that are less efficient to access, have higher metadata overhead (due to per-sstable overhead), and might exhaust the system's open file descriptor limit. The options introduced in this change can help the user tune the system in two ways: 1. Sizing the table to prevent unnecessary tablet splits and migrations. This can be done when the table is created, or later on, using ALTER TABLE. 2. Controlling min_per_shard_tablet_count to improve tablet balancing, for hot tables. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2025-02-06 08:55:51 +02:00
Tomasz Grabiec	3bb19e9ac9	locator: network_topology_strategy: Ignore leaving nodes when computing capacity for new tables For example, nodes which are being decommissioned should not be considered as available capacity for new tables. We don't allocate tablets on such nodes. Counting them would result in a higher per-shard load than planned. Closes scylladb/scylladb#22657	2025-02-05 23:59:41 +02:00
Kefu Chai	9a20fb43ab	tree: replace boost::min_element() with std::ranges::min_element() in order to reduce the external header dependency, let's switch to the standardized std::ranges::min_element(). Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#22572	2025-02-05 21:54:01 +02:00
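The quantization argument in this commit message can be made concrete with a small calculation. The function name `shard_imbalance` is hypothetical, and the sketch assumes tablet replicas are spread as evenly as possible over the shards, so each shard holds either N or N+1 of them.

```cpp
#include <cassert>
#include <cstdint>

// Relative imbalance between the most and least loaded shard when
// tablet_replicas are spread as evenly as possible over shard_count shards.
// Returns (most - least) / least, e.g. 1.0 means a 100% imbalance.
double shard_imbalance(std::uint64_t tablet_replicas, std::uint64_t shard_count) {
    assert(shard_count > 0 && tablet_replicas >= shard_count);
    const std::uint64_t least = tablet_replicas / shard_count;  // N
    // Some shards take one extra replica when the division is not exact.
    const std::uint64_t most =
        least + (tablet_replicas % shard_count != 0 ? 1 : 0);   // N or N+1
    return static_cast<double>(most - least) / static_cast<double>(least);
}
```

With 3 replicas on 2 shards (N=1) the function returns 1.0, the 100% imbalance quoted in the message. With 129 replicas on 8 shards (N=16) it returns 0.0625, which is why a larger `min_per_shard_tablet_count` improves balance for hot tables.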
Tomasz Grabiec	e22e3b21b1	locator: network_topology_strategy: Fix SIGSEGV when creating a table when there is a rack with no normal nodes In that case, new_racks will be used, but when we discover no candidates, we try to pop from existing_racks. Fixes #22625 Closes scylladb/scylladb#22652	2025-02-05 20:13:05 +02:00
Raphael S. Carvalho	ce65164315	test: Use linux-aio backend again on seastar-based tests Since mid December, tests started failing with ENOMEM while submitting I/O requests. Logs of failed tests show IO uring was used as backend, but we never deliberately switched to IO uring. Investigation pointed to it happening accidentally in commit `1bac6b75dc`, which turned on IO uring for allowing native tool in production, and picked linux-aio backend explicitly when initializing Scylla. But it missed that seastar-based tests would pick the default backend, which is io_uring once enabled. There's a reason we never made io_uring the default, which is that it's not stable enough, and it turns out we made the right choice back then, as it apparently continues to be unstable, causing flakiness in the tests. Let's undo that accidental change in tests by explicitly picking the linux-aio backend for seastar-based tests. This should hopefully bring back stability. Refs #21968. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Closes scylladb/scylladb#22695	2025-02-05 15:19:24 +02:00
Ernest Zaslavsky	dee4fc7150	aws creds: add STS and Instance Metadata service credentials providers This commit introduces two new credentials providers: STS and Instance Metadata Service. The S3 client's provider chain has been updated to incorporate these new providers. Additionally, unit tests have been added to ensure coverage of the new functionality.	2025-02-05 14:57:19 +02:00
Ernest Zaslavsky	d534051bea	aws creds: add env. and file credentials providers This commit entirely removes credentials from the endpoint configuration. It also eliminates all instances of manually retrieving environment credentials. Instead, the construction of file and environment credentials has been moved to their respective providers. Additionally, a new aws_credentials_provider_chain class has been introduced to support chaining of multiple credential providers.	2025-02-05 14:57:19 +02:00
Botond Dénes	f2d5819645	reader_concurrency_semaphore: with_permit(): proper clean-up after queue overload with_permit() creates a permit, with a self-reference, to avoid attaching a continuation to the permit's run function. This self-reference is used to keep the permit alive, until the execution loop processes it. This self reference has to be carefully cleared on error-paths, otherwise the permit will become a zombie, effectively leaking memory. Instead of trying to handle all loose ends, get rid of this self-reference altogether: ask caller to provide a place to save the permit, where it will survive until the end of the call. This makes the call-site a little bit less nice, but it gets rid of a whole class of possible bugs. Fixes: #22588 Closes scylladb/scylladb#22624	2025-02-04 21:27:16 +02:00
Ernest Zaslavsky	c911fc4f34	s3 creds: move credentials out of endpoint config This commit refactors the way AWS credentials are managed in Scylla. Previously, credentials were included in the endpoint configuration. However, since credentials and endpoint configurations serve different purposes and may have different lifetimes, it’s more logical to manage them separately. Moving forward, credentials will be completely removed from the endpoint_config to ensure clear separation of concerns.	2025-02-04 16:45:23 +02:00

3752 Commits