scylladb

Author	SHA1	Message	Date
Avi Kivity	7984925059	Merge 'Use coroutine::switch_to() in table::try_flush_memtable_to_sstable' from Pavel Emelyanov The method was coroutinized by `6df07f7ff7`. Back then the coroutine::switch_to() wasn't available, and the code used with_scheduling_group() to call coroutinized lambdas. Those lambdas were implemented as on-stack variables to solve the capture list lifetime problems. As a result, the code looks like ``` auto flush = [] { ... // do the flushing auto post_flush = [] { ... // do the post-flushing } co_return co_await with_scheduling_group(group_b, post_flush); }; co_return co_await with_scheduling_group(group_a, flush); ``` which is a bit clumsy. Now we have switch_to() and can make the code flow of this method more readable, like this ``` co_await switch_to(group_a); ... // do the flushing co_await switch_to(group_b); ... // do the post-flushing ``` Code cleanup, not backporting Closes scylladb/scylladb#28430 * github.com:scylladb/scylladb: table: Fix indentation after previous patch table: Use coroutine::switch_to() in try_flush_memtable_to_sstable()	2026-01-29 18:12:35 +02:00
Pavel Emelyanov	56e212ea8d	table: Fix indentation after previous patch Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2026-01-29 15:02:33 +03:00
Pavel Emelyanov	258a1a03e3	table: Use coroutine::switch_to() in try_flush_memtable_to_sstable() It allows dropping the local lambdas passed into with_scheduling_group() calls. Overall the code flow becomes more readable. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2026-01-29 15:01:27 +03:00
Pavel Emelyanov	937d008d3c	Merge 'Clean up partition_snapshot_reader' from Botond Dénes Move to `replica/`, drop `flat` from name and drop unused usages as well as unused includes. Code cleanup, no backport Closes scylladb/scylladb#28353 * github.com:scylladb/scylladb: replica/partition_snapshot_reader: remove unused includes partition_snapshot_reader: remove "flat" from name mv partition_snapshot_reader.hh -> replica/	2026-01-29 11:22:15 +03:00
Botond Dénes	482ffe06fd	Merge 'Improve load shedding on the replica side' from Łukasz Paszkowski When reads arrive, they have to wait for admission on the reader concurrency semaphore. If the node is overloaded, the reads will be queued. They can time out while in the queue, but will not time out once admitted. Once the shard is sufficiently loaded, it is possible that most queued reads will time out, because the average time it takes for a queued read to be admitted is around that of the timeout. If a read times out, any work we already did, or are about to do on it, is wasted effort. Therefore, the patch tries to prevent it by checking if an admitted read has a chance to complete in time and aborting it if not. It uses the following criterion: if the read's remaining time <= the read's timeout when it arrived at the semaphore * the live-updatable preemptive_abort_factor, the read is rejected and the next one from the wait list is considered. Fixes https://github.com/scylladb/scylladb/issues/14909 Fixes: SCYLLADB-353 Backport is not needed. Better to first observe its impact. Closes scylladb/scylladb#21649 * github.com:scylladb/scylladb: reader_concurrency_semaphore: Check during admission if read may timeout permit_reader::impl: Replace break with return after evicting inactive permit on timeout reader_concurrency_semaphore: Add preemptive_abort_factor to constructors config: Add parameters to control reads' preemptive_abort_factor permit_reader: Add a new state: preemptive_aborted reader_concurrency_semaphore: validate waiters counter when dequeueing a waiting permit reader_concurrency_semaphore: Remove cpu_concurrency's default value	2026-01-29 08:27:22 +02:00
Łukasz Paszkowski	fde09fd136	reader_concurrency_semaphore: Add preemptive_abort_factor to constructors The new parameter parametrizes the factor used to reject a read during admission. Its value shall be between 0.0 and 1.0 where + 0.0 means a read will never get rejected during admission + 1.0 means a read will immediately get rejected during admission Although passing values outside the interval is possible, they will have the exact same effects as if they were clamped to [0.0, 1.0].	2026-01-28 14:20:01 +01:00
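The admission check described in the 'Improve load shedding on the replica side' merge, combined with the clamping described in this commit, can be sketched as follows. This is a minimal standalone illustration, not the code from the patch: the helper name `should_preemptively_abort` and the use of `std::chrono::duration<double>` are assumptions made for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>

// Seconds as a floating-point duration, so that scaling by a factor is exact enough
// for a sketch. The real code may use a different clock and representation.
using duration = std::chrono::duration<double>;

// Hypothetical helper (name and signature are assumptions, not ScyllaDB's API).
// Returns true if a read should be rejected at admission time because it is
// unlikely to finish before its timeout.
inline bool should_preemptively_abort(duration remaining,
                                      duration timeout_at_arrival,
                                      double preemptive_abort_factor) {
    assert(timeout_at_arrival.count() >= 0.0);
    // Values outside [0.0, 1.0] behave as if they were clamped to that interval.
    const double factor = std::clamp(preemptive_abort_factor, 0.0, 1.0);
    // Criterion from the commit message:
    //   remaining time <= timeout when the read arrived * preemptive_abort_factor
    return remaining <= timeout_at_arrival * factor;
}
```

With a factor of 0.0 the right-hand side is zero, so a read with any time left is admitted; with 1.0 the right-hand side equals the full timeout, so every read is rejected.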
Avi Kivity	47315c63dc	treewide: include Seastar headers with angle brackets Seastar is a "system" library from our point of view, so should be included with angle brackets. Closes scylladb/scylladb#28395	2026-01-28 10:33:06 +02:00
Łukasz Paszkowski	8829098e90	reader_concurrency_semaphore: Remove cpu_concurrency's default value The commit `59faa6d` introduces a new parameter called cpu_concurrency and sets its default value to 1, which violates the commit `fbb83dd` that removes all default values from constructors but the one used by the unit tests. The patch removes the default value of the cpu_concurrency parameter and alters tests to use the test-dedicated reader_concurrency_semaphore constructor wherever possible.	2026-01-27 15:40:11 +01:00
Botond Dénes	755e8319ee	replica/partition_snapshot_reader: remove unused includes	2026-01-26 16:52:46 +02:00
Botond Dénes	756837c5b4	partition_snapshot_reader: remove "flat" from name The "flat" migration is long done, this distinction is no longer meaningful.	2026-01-26 16:52:46 +02:00
Botond Dénes	9d1933492a	mv partition_snapshot_reader.hh -> replica/ The partition snapshot lives in mutation/, however mutation/ is a lower level concept than a mutation reader. The next best place for this reader is the replica/ directory, where the memtable, its main user, also lives. Also move the code to the replica namespace. test/boost/mvcc_test.cc includes this header but doesn't use anything from it. Instead of updating the include path, just drop the unused include.	2026-01-26 16:52:08 +02:00
Piotr Dulikowski	fe9237fdc9	Merge 'alternator: don't require rf_rack flag for indexes, validate instead' from Michael Litvak In `8df61f6d99` we changed the requirements for creating materialized views and MV-based indexes - instead of requiring the rf_rack_valid_keyspaces flag to be set, we now require the keyspace to be RF-rack-valid at the time of creation, and it is enforced to remain RF-rack-valid while the MV exists. This validation is done in the cql create view/index statements. The same should be done also for alternator - when creating a table with GSI or LSI, or when adding a GSI to an existing table, previously we required the flag rf_rack_valid_keyspaces to be set. Now we change it to instead check if the keyspace is RF-rack-valid, and if not the operation fails with an appropriate error. Fixes https://github.com/scylladb/scylladb/issues/28214 backport to 2025.4 to add RF-rack-valid enforcements in alternator Closes scylladb/scylladb#28154 * github.com:scylladb/scylladb: locator: document the exception type of assert_rf_rack_valid_keyspace alternator: don't require rf_rack flag for indexes, validate instead	2026-01-23 11:49:02 +01:00
Patryk Jędrzejczak	4e984139b2	Merge 'strongly consistent tables: basic implementation' from Petr Gusev In this PR we add a basic implementation of the strongly-consistent tables: * generate raft group id when a strongly-consistent table is created * persist it into system.tables table * start raft groups on replicas when a strongly-consistent tablet_map reaches them * add strongly-consistent version of the storage_proxy, with the `query` and `mutate` methods * the `mutate` method submits a command to the tablets raft group, the query method reads the data with `raft.read_barrier()` * strongly-consistent versions of the `select_statement` and `modification_statement` are added * a basic `test_strong_consistency.py/test_basic_write_read` is added to check that we can write and read data in a strongly consistent fashion. Limitations: * for now the strongly consistent tables can have tablets only on shard zero. This is because we (ab/re) use the existing raft system tables which live only on shard0. In the next PRs we'll create separate tables for the new tablets raft groups. * No Scylla-side proxying - the test has to figure out who is the leader and submit the command to the right node. This will be fixed separately. * No tablet balancing -- migration/split/merges require separate complicated code. The new behavior is hidden behind `STRONGLY_CONSISTENT_TABLES` feature, which is enabled when the `STRONGLY_CONSISTENT_TABLES` experimental feature flag is set. Requirements, specs and general overview of the feature can be found [here](https://scylladb.atlassian.net/wiki/spaces/RND/pages/91422722/Strong+Consistency). 
Short term implementation plan is [here](https://docs.google.com/document/d/1afKeeHaCkKxER7IThHkaAQlh2JWpbqhFLIQ3CzmiXhI/edit?tab=t.0#heading=h.thkorgfek290) One can check the strongly consistent writes and reads locally via cqlsh: scylla.yaml: ``` experimental_features: - strongly-consistent-tables ``` cqlsh: ``` CREATE KEYSPACE IF NOT EXISTS my_ks WITH replication = {'class': 'NetworkTopologyStrategy', 'replication_factor': 1} AND tablets = {'initial': 1} AND consistency = 'local'; CREATE TABLE my_ks.test (pk int PRIMARY KEY, c int); INSERT INTO my_ks.test (pk, c) VALUES (10, 20); SELECT * FROM my_ks.test WHERE pk = 10; ``` Fixes SCYLLADB-34 Fixes SCYLLADB-32 Fixes SCYLLADB-31 Fixes SCYLLADB-33 Fixes SCYLLADB-56 backport: no need Closes scylladb/scylladb#27614 * https://github.com/scylladb/scylladb: test_encryption: capture stderr test/cluster: add test_strong_consistency.py raft_group_registry: disable metrics for non-0 groups strong consistency: implement select_statement::do_execute() cql: add select_statement.cc strong consistency: implement coordinator::query() cql: add modification_statement cql: add statement_helpers strong consistency: implement coordinator::mutate() raft.hh: make server::wait_for_leader() public strong_consistency: add coordinator modification_statement: make get_timeout public strong_consistency: add groups_manager strong_consistency: add state_machine and raft_command table: add get_max_timestamp_for_tablet tablets: generate raft group_id-s for new table tablet_replication_strategy: add consistency field tablets: add raft_group_id modification_statement: remove virtual where it's not needed modification_statement: inline prepare_statement() system_keyspace: disable tablet_balancing for strongly_consistent_tables cql: rename strongly_consistent statements to broadcast statements	2026-01-23 09:52:33 +01:00
Michael Litvak	d5009882c6	locator: document the exception type of assert_rf_rack_valid_keyspace The function assert_rf_rack_valid_keyspace uses the exception type std::invalid_argument when the RF-rack validation fails. Document it and change all callers to catch this specific exception type when checking for RF-rack validation failures, so that other exception types can be propagated properly.	2026-01-22 16:11:35 +01:00
Pavel Emelyanov	cb6ee05391	Merge 'Extend snapshot manifest.json with tablet-aware metadata' from Benny Halevy This series extends the json manifest file we create when taking snapshots. It adds the following metadata: - manifest version and scope - snapshot name - created_at and expires_at timestamps (#24061) - node metadata (host_id, dc, rack) - keyspace and table metadata - tablet_count (#26352) - per-sstable metadata (#26352) Fixes [SCYLLADB-189](https://scylladb.atlassian.net/browse/SCYLLADB-189) Fixes [SCYLLADB-195](https://scylladb.atlassian.net/browse/SCYLLADB-195) Fixes [SCYLLADB-196](https://scylladb.atlassian.net/browse/SCYLLADB-196) * Enhancement, no backport needed Closes scylladb/scylladb#27945 * github.com:scylladb/scylladb: snapshot: keep per-sstable metadata in manifest.json snapshot: add table info and tablet_count to manifest.json snapshot: add basic support for snapshot ttl in manifest.json table: snapshot_on_all_shards: take snapshot_options db: snapshot_ctl: move skip_flush to struct snapshot_options snapshot: add snapshot name in manifest.json test: lib: cql_test_env: apply db::config::tablets_mode_for_new_keyspaces snapshot: add node info to manifest.json snapshot: add manifest info to manifest.json test: database_test: snapshot_works: add validate_manifest	2026-01-22 15:19:11 +03:00
Botond Dénes	7d2e6c0170	Merge 'config: add enforce_rack_list option' from Aleksandra Martyniuk Add enforce_rack_list option. When the option is set to true, all tablet keyspaces have rack list replication factor. When the option is on: - CREATE STATEMENT always auto-extends rf to rack lists; - ALTER STATEMENT fails when there is numeric rf in any DC. The flag is set to false by default and a node needs to be restarted in order to change its value. Starting a node with enforce_rack_list option will fail, if there are any tablet keyspaces with numeric rf in any DC. enforce_rack_list is a per-node option and a user needs to ensure that no tablet keyspace is altered or created while nodes in the cluster don't have the consistent value. Mark rf_rack_valid_keyspaces as deprecated. Fixes: https://github.com/scylladb/scylladb/issues/26399. New feature; no backport needed Closes scylladb/scylladb#28084 * github.com:scylladb/scylladb: test: add test for enforce_rack_list option db: mark rf_rack_valid_keyspaces as deprecated config: add enforce_rack_list option Revert "alternator: require rf_rack_valid_keyspaces when creating index"	2026-01-22 10:27:35 +02:00
Benny Halevy	d6557764b9	snapshot: keep per-sstable metadata in manifest.json Adds a "sstables" array member to manifest.json. For each sstable, keep the following metadata: id - a uuid for the sstable (the sstable identifier if the use-sstable-identifier option was used, otherwise the sstable uuid generation) toc_name - the name of the TOC.txt file data_size and index_size - in bytes first_token and last_token - of the sstable first and last keys. Fixes: SCYLLADB-196 Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2026-01-22 09:42:52 +02:00
Benny Halevy	dc9093303d	snapshot: add table info and tablet_count to manifest.json Add a table member to manifest.json with the keyspace_name, table_name, table_id, tablets_type, and, for tablets-enabled tables, get tablet_count on each shard and write the minimum to manifest.json. For vnodes-based tables, tablet_count=0. For now, `tablets_type` may be either `none` for vnodes tables, or `powof2` for tablets tables. In the future, when we support arbitrary tablet boundaries, this will be reflected here, and it is likely we would back up the whole tablets map separately to get all tablet boundaries. Fixes SCYLLADB-195 Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2026-01-22 09:36:52 +02:00
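The rule for the `tablets_type` and `tablet_count` fields described in this commit can be sketched as below. The struct and function names are hypothetical and only illustrate the rule (minimum across shards for tablets tables, zero for vnodes tables); they are not taken from the patch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical container for the two manifest fields discussed in the commit message.
struct table_manifest_info {
    std::string tablets_type;   // "none" for vnodes tables, "powof2" for tablets tables
    std::size_t tablet_count;   // 0 for vnodes tables
};

// Hypothetical helper: derives the manifest fields from per-shard tablet counts.
inline table_manifest_info make_table_manifest_info(
        bool uses_tablets,
        const std::vector<std::size_t>& per_shard_tablet_counts) {
    if (!uses_tablets) {
        // Vnodes-based tables report tablets_type "none" and tablet_count 0.
        return {"none", 0};
    }
    assert(!per_shard_tablet_counts.empty());
    // Each shard reports its own view of tablet_count; the minimum is written.
    const auto min_it = std::min_element(per_shard_tablet_counts.begin(),
                                         per_shard_tablet_counts.end());
    return {"powof2", *min_it};
}
```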
Benny Halevy	91df129e21	snapshot: add basic support for snapshot ttl in manifest.json Store the snapshot `created_at` time and an optional `expires_at` time. Fixes SCYLLADB-189 Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2026-01-22 09:12:56 +02:00
Benny Halevy	5e90fbb9d2	table: snapshot_on_all_shards: take snapshot_options And keep the options for now in the local_snapshot_writer. The options will be used by following patches to pass extra metadata like the snapshot creation time, expiration time, etc. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2026-01-22 09:12:56 +02:00
Benny Halevy	49a3e0914d	db: snapshot_ctl: move skip_flush to struct snapshot_options So we can easily extend it and add more options. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2026-01-22 09:12:56 +02:00
Benny Halevy	d9fc3b1c11	snapshot: add snapshot name in manifest.json Store the snapshot tag in the manifest file. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2026-01-22 09:12:56 +02:00
Benny Halevy	0d82e56078	snapshot: add node info to manifest.json Add metadata about the node: host_id, datacenter, and rack. This enables dc- or rack- aware restore. Today this information is "encoded" into the snapshot hierarchy prefixes, but if all manifest files would be stored in a flat directory, we'd need to encode that metadata in the object name, but it'd be better for the manifest contents to be self descriptive. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2026-01-22 09:12:56 +02:00
Benny Halevy	24040efc54	snapshot: add manifest info to manifest.json Add metadata about the manifest itself: A version and the manifest scope (currently "node", but in the future, may also be "shard", or "tablet") Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2026-01-22 09:12:56 +02:00
Petr Gusev	a8350b274e	table: add get_max_timestamp_for_tablet Strongly consistent writes require knowing the maximum timestamp of locally applied mutations to guarantee monotonically increasing timestamps for subsequent writes. This commit adds a function that returns the maximum timestamp for a given tablet. Why it is safe to use this function with deleted cells: * Tombstones are included in memtable.get_max_timestamp() calculations. * The maximum timestamp of a memtable is used to initialize the maximum timestamp of the resulting sstable. * During compaction, a new sstable’s maximum timestamp is initialized as the maximum of the contributing sstables.	2026-01-21 14:56:00 +01:00
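The propagation argument in this commit (memtable maximum, then sstable maximum, then compaction taking the maximum of its inputs) amounts to taking a maximum over all sources. The sketch below shows that, plus one possible way a writer could derive a strictly larger timestamp. Both function names, the sentinel value, and the `max(now, max_applied + 1)` rule are assumptions made for illustration; the commit message only states that the maximum is needed to keep timestamps monotonically increasing.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <limits>
#include <vector>

using timestamp_type = std::int64_t;

// Sentinel used by this sketch when no data has been applied yet (an assumption).
constexpr timestamp_type missing_timestamp = std::numeric_limits<timestamp_type>::min();

// Hypothetical helper: maximum over the per-memtable and per-sstable maxima of a tablet.
// Tombstone timestamps are assumed to be already folded into each input value.
inline timestamp_type max_timestamp_for_tablet(
        const std::vector<timestamp_type>& memtable_maxima,
        const std::vector<timestamp_type>& sstable_maxima) {
    timestamp_type result = missing_timestamp;
    for (timestamp_type ts : memtable_maxima) {
        result = std::max(result, ts);
    }
    for (timestamp_type ts : sstable_maxima) {
        result = std::max(result, ts);
    }
    return result;
}

// Hypothetical rule: pick a timestamp strictly greater than anything applied locally,
// while never going below the current clock reading.
inline timestamp_type next_write_timestamp(timestamp_type now, timestamp_type max_applied) {
    if (max_applied == missing_timestamp) {
        return now;
    }
    assert(max_applied < std::numeric_limits<timestamp_type>::max());
    return std::max(now, max_applied + 1);
}
```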
Petr Gusev	53f93eb830	tablets: add raft_group_id Add a `raft_group_id` column to `system.tablets` and to the `tablet_map` class. The column is populated only when the `strongly_consistent_tables` feature is enabled. This feature is currently disabled by default and is enabled only when the user sets the `STRONGLY_CONSISTENT_TABLES` experimental flag. The `raft_group_id` column is added to `system.tablets` only when this flag is set. This allows the schema to evolve freely while the feature is experimental, without requiring complex migrations.	2026-01-21 14:56:00 +01:00
Avi Kivity	c7dda5500c	database: simplify apply_counter_update exception handling Use coroutine::try_future to exit the coroutine immediately on error instead of explicit checks. Closes scylladb/scylladb#28257	2026-01-20 11:13:49 +02:00
Aleksandra Martyniuk	761ace4f05	config: add enforce_rack_list option Add enforce_rack_list option. When the option is set to true, all tablet keyspaces have rack list replication factor. When the option is on: - CREATE STATEMENT always auto-extends rf to rack lists; - ALTER STATEMENT fails when there is numeric rf in any DC. The flag is set to false by default and a node needs to be restarted in order to change its value. Starting a node with enforce_rack_list option will fail, if there are any tablet keyspaces with numeric rf in any DC. enforce_rack_list is a per-node option and a user needs to ensure that no tablet keyspace is altered or created while nodes in the cluster don't have the consistent value.	2026-01-20 09:58:51 +01:00
Avi Kivity	874322f95e	multishard_query: simplify do_query() coroutine/continuation complexity do_query() is a coroutine but uses some continuations to take advantage of exceptions being propagated via future::then() without being thrown. We can accomplish the same thing with a nested coroutine and coroutine::try_future(), simplifying the code. While this area isn't performance intensive, we're not adding allocations. The coroutine frame may add an allocation, but since read_page() certainly does not return immediately, the following then() will allocate as well. Since we eliminated that then(), the change is at least neutral allocation-wise. Closes scylladb/scylladb#28258	2026-01-20 10:45:10 +02:00
Botond Dénes	19efd7f6f9	Merge 'The system_replicated_keys should be mark as a system keyspace' from Amnon Heiman This PR marks system_replicated_keys as a system keyspace. It was missing when the keyspace was added. A side effect of that is that metrics that are not supposed to be reported are. Fixes #27903 Closes scylladb/scylladb#27954 * github.com:scylladb/scylladb: distributed_loader: system_replicated_keys as system keyspace replicated_key_provider: make KSNAME public	2026-01-19 09:37:41 +02:00
Botond Dénes	1f6f7ceb68	replica/table: don't throw exceptions on the read-path Use coroutine::as_future() to avoid exceptions taking flight and triggering expensive stack-unwinding. Especially bad for common exceptions like timeouts. Not using coroutine::try_future(), because on the error path, the querier has to be closed.	2026-01-13 10:47:57 +02:00
Botond Dénes	131489fe48	multishard_mutation_query: fix indentation Left broken by previous patch.	2026-01-13 10:47:57 +02:00
Botond Dénes	7eeb7fcfba	multishard_mutation_query: don't throw exceptions on the read-path Use coroutine::try_future() to avoid exceptions taking flight and triggering expensive stack-unwinding. Especially bad for common exceptions like timeouts.	2026-01-13 10:47:57 +02:00
Botond Dénes	04b8f72946	Merge 'repair: Implement auto repair for tablet repair' from Asias He repair: Implement auto repair for tablet repair This patch implements the basic auto repair support for tablet repair. It was decided to add no per table configuration for the initial implementation, so two scylla yaml config options are introduced to set the default auto repair configs for all the tablet tables. - auto_repair_enabled_default Set true to enable auto repair for tablet tables by default. The value will be overridden by the per keyspace or per table configuration which is not implemented yet. - auto_repair_threshold_default_in_seconds Set the default time in seconds for the auto repair threshold for tablet tables. If the time since last repair is bigger than the configured time, the tablet is eligible for auto repair. The value will be overridden by the per keyspace or per table configuration which is not implemented yet. The following metrics are added: - auto_repair_needs_repair_nr The number of tablets with auto repair enabled that need repair - auto_repair_enabled_nr The number of tablets with auto repair enabled The metrics are useful to tell if auto repair is falling behind. In the future, more auto repair scheduling will be added, e.g., scheduling based on the repaired and unrepaired sstable set size, tombstone ratio and so on, in addition to the time based scheduling. Fixes SCYLLADB-99 New feature. No backport. Closes scylladb/scylladb#27534 * github.com:scylladb/scylladb: topology_coordinator: Add metrics for tablet repair repair: Implement auto repair for tablet repair	2026-01-12 14:16:01 +02:00
Asias He	7ba7b25bdd	repair: Implement auto repair for tablet repair This patch implements the basic auto repair support for tablet repair. It was decided to add no per table configuration for the initial implementation, so two scylla yaml config options are introduced to set the default auto repair configs for all the tablet tables. - auto_repair_enabled_default Set true to enable auto repair for tablet tables by default. The value will be overridden by the per keyspace or per table configuration which is not implemented yet. - auto_repair_threshold_default_in_seconds Set the default time in seconds for the auto repair threshold for tablet tables. If the time since last repair is bigger than the configured time, the tablet is eligible for auto repair. The value will be overridden by the per keyspace or per table configuration which is not implemented yet. The following metrics are added: - auto_repair_needs_repair_nr The number of tablets with auto repair enabled that need repair - auto_repair_enabled_nr The number of tablets with auto repair enabled The metrics are useful to tell if auto repair is falling behind. In the future, more auto repair scheduling will be added, e.g., scheduling based on the repaired and unrepaired sstable set size, tombstone ratio and so on, in addition to the time based scheduling. Fixes SCYLLADB-99	2026-01-09 16:11:39 +08:00
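The time-based eligibility rule and the two counters described in this commit can be sketched as follows. The types and function names are hypothetical; only the rule (eligible when the time since the last repair is bigger than the threshold) and the meaning of the two counters come from the commit message.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

using seconds = std::chrono::seconds;

// Hypothetical per-tablet view used only by this sketch.
struct tablet_repair_state {
    bool auto_repair_enabled;
    seconds since_last_repair;
};

// Mirrors the two metrics named in the commit message.
struct auto_repair_counters {
    std::size_t enabled_nr = 0;       // tablets with auto repair enabled
    std::size_t needs_repair_nr = 0;  // enabled tablets that are past the threshold
};

// A tablet is eligible when auto repair is enabled and the time since its last
// repair is strictly bigger than the configured threshold.
inline bool needs_auto_repair(const tablet_repair_state& t, seconds threshold) {
    return t.auto_repair_enabled && t.since_last_repair > threshold;
}

inline auto_repair_counters count_auto_repair(
        const std::vector<tablet_repair_state>& tablets, seconds threshold) {
    auto_repair_counters counters;
    for (const auto& t : tablets) {
        if (t.auto_repair_enabled) {
            ++counters.enabled_nr;
        }
        if (needs_auto_repair(t, threshold)) {
            ++counters.needs_repair_nr;
        }
    }
    return counters;
}
```

A `needs_repair_nr` that stays close to `enabled_nr` over time would suggest that auto repair is falling behind, which is the use the commit message gives for these metrics.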
Botond Dénes	60570d7114	Merge 'topology coordinator: restrict node join/remove to preserve RF-rack validity' from Michael Litvak Allow creating materialized views and secondary indexes in a tablets keyspace only if it's RF-rack-valid, and enforce RF-rack-validity while the keyspace has views by restricting some operations: * Altering a keyspace's RF if it would make the keyspace RF-rack-invalid * Adding a node in a new rack * Removing / Decommissioning the last node in a rack Previously the config option `rf_rack_valid_keyspaces` was required for creating views. We now remove this restriction - it's not needed because we always maintain RF-rack-validity for keyspaces with views. The restrictions are relevant only for keyspaces with numerical RF. Keyspaces with rack-list-based RF are always RF-rack-valid. Fixes scylladb/scylladb#23345 Fixes https://github.com/scylladb/scylladb/issues/26820 backport to relevant versions for materialized views with tablets since it depends on rf-rack validity Closes scylladb/scylladb#26354 * github.com:scylladb/scylladb: docs: update RF-rack restrictions cql3: don't apply RF-rack restrictions on vector indexes cql3: add warning when creating mv/index with tablets about rf-rack service/tablet_allocator: always allow tablet merge of tables with views locator: extend rf-rack validation for rack lists test: test rf-rack validity when creating keyspace during node ops locator: fix rf-rack validation during node join/remove test: test topology restrictions for views with tablets test: add test_topology_ops_with_rf_rack_valid topology coordinator: restrict node join/remove to preserve RF-rack validity topology coordinator: add validation to node remove locator: extend rf-rack validation functions view: change validate_view_keyspace to allow MVs if RF=Racks db: enforce rf-rack-validity for keyspaces with views replica/db: add enforce_rf_rack_validity_for_keyspace helper db: remove enforce parameter from check_rf_rack_validity test: adjust test to not break rf-rack validity	2026-01-09 10:01:23 +02:00
Benny Halevy	93b827c185	database: truncate_table_on_all_shards: drop outdated TODO comment The comment was added in `83323e155e`. Since then, table::seal_active_memtable was improved to guarantee waiting on outstanding flushes on success (see `d55a2ac762`), so we can remove this TODO comment (it is also not covered by any issue, so nobody is planning to work on it). Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2026-01-08 09:49:46 +02:00
Benny Halevy	2a803d2261	database: truncate_table_on_all_shards: consider can_flush on all shards can_flush might return a different value for each shard, so check it right before deciding whether to flush or clear a memtable shard. Note that under normal conditions can_flush would always return true now that it checks only the presence of the seal memtable function rather than checking memtable_list::empty(). Fixes #27639 Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2026-01-08 09:49:46 +02:00
Benny Halevy	02ee341a03	memtable_list: unify can_flush and may_flush Now that we have a unit test proving that it's safe to flush an empty memtable list there is no need to distinguish between may_flush and can_flush. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2026-01-08 09:49:46 +02:00
Benny Halevy	0342a24ee0	test: database_test: add test_flush_empty_table_waits_on_outstanding_flush Test that table::flush waits on outstanding flushes, even if the active memtable is empty Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2026-01-08 09:49:45 +02:00
Benny Halevy	5be6b80936	replica: table, storage_group, compaction_group: add needs_flush Table needs flush if not all its memtable lists are empty. To be used in the next patch for a unit test. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2026-01-08 09:41:22 +02:00
Avi Kivity	0df85c8ae8	Revert "Merge 'Unify configuration of object storage endpoints' from Pavel Emelyanov" This reverts commit `1bb897c7ca`, reversing changes made to `954f2cbd2f`. It makes incompatible changes to the object storage configuration format, breaking tests [1]. It's likely that it doesn't break any production configuration, but we can't be sure. Fixes #27966 Closes scylladb/scylladb#27969	2026-01-05 08:53:41 +02:00
Benny Halevy	4d46674d03	table: get_snapshot_details: use relative-path based file_stat With the additional file_stat overload introduced in `3e9b071838`, use the opened directory for more efficient, relative-path based stat. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2026-01-04 11:05:56 +02:00
Benny Halevy	2d2177d2c9	table: get_snapshot_details: fix warning in exists_in_dir The functor is called both on the data directory as well as on the staging directory, so the warning printed if the found file is not the same inode should print the given path, not datadir / name (as was copy and pasted). Refs #27635 Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2026-01-04 11:05:56 +02:00
Benny Halevy	240b32a87a	table: get_snapshot_details: fix staging dir calculation Staging is based off of datadir, not snapshot_dir. The issue was introduced in `f5ca3657e2`. Refs #27635 Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2026-01-04 11:05:56 +02:00
Amnon Heiman	c6d1c63ddb	distributed_loader: system_replicated_keys as system keyspace This patch adds system_replicated_keys to the list of known system keyspaces. Signed-off-by: Amnon Heiman <amnon@scylladb.com>	2026-01-02 16:41:47 +02:00
Benny Halevy	3e9b071838	Update seastar submodule * seastar f0298e40...4dcd4df5 (29): > file: provide a default implementation for file_impl::statat > util: Genralize memory_data_sink > defer: Replace static_assert() with concept > treewide: drop the support of fmtlib < 9.0.0 > test: Improve resilience of netsed scheduling fairness test > Merge 'file: Use query_device_alignment_info in blkdev_alignments ' from Kefu Chai file: Put alignment helpers in anonymous namespace file: Use query_device_alignment_info in blkdev_alignments > Merge 'file: Query physical block size and minimum I/O size' from Kefu Chai file: Apply physical_block_size override to filesystem files file: Use designated initializers in xfs_alignments iotune: Add physical block size detection disk_params: Add support for physical_block_size overrides from io_properties.yaml block_device: Query alignment requirements separately for memory and I/O > Merge 'json: formatter: fix formatting of std:string_view' from Benny Halevy json: formatter: fix formatting of std:string_view json: formatter: make sure std::string_view conforms to is_string_like Fixes #27887 > demos:improve the output of demo_with_io_intent() in file_demo > test: Add accept() vs accept_abort() socket test > file: Refine posix_file_impl alignments initialization > Add file::statat and a corresponding file_stat overload > cmake: don't compile memcached app for API < 9 > Merge 'Revert to ~old lifetime semantics for lvalues passed to then()-alikes' from Travis Downs future: adjust lifetime for lvalue continuations future: fix value class operator() > pollable_fd: Unfriend everything > Merge 'file: experimental_list_directory: use buffered generator' from Benny Halevy file: experimental_list_directory: use buffered generator file: define list_directory_generator_type > Merge 'Make datagram API use temporary_buffer<>-s' from Pavel Emelyanov net: Deprecate datagram::get_data() returning packet memcache: Fix indentation after previous patch 
memcache: Use new datagram::get_buffers() API dns: Use new datagram::get_buffers() API tests: Use new datagram::get_buffers() API demo: Use new datagram::get_buffers() API udp: Make datagram implementations return span of temporary_buffer-s > Merge 'Remove callback from timer_set::complete()' from Pavel Emelyanov reactor: Fix indentation after previous patch timers: Remove enabling callback from timer_set::complete() > treewide: avoid 'static sstring' in favor of 'constexpr string_view' > resource: Hide hwloc from public interface > Merge 'Fix handle_exception_type for lvalues' from Travis Downs futures_test: compile-time tests function_traits: handle reference_wrapper > posix_data_sink_impl: Assert to guard put UB > treewide: fix build with `SEASTAR_SSTRING` undefined > avoid deprecation warnings for json_exception > `util/variant_utils`: correct type deduction for `seastar::visit` > net/dns: fixed socket concurrent access > treewide: add missing headers > Merge 'Remove posix file helper file_read_state class' from Pavel Emelyanov file: Remove file_read_state test: Add a test for posix_file_impl::do_dma_read_bulk() > membarrier: simplify locking Adjust scylla to the following changes in seastar: - file_stat became polymorphic - needs explicit inference in table::snapshot_exists, table::get_snapshot_details - file::experimental_list_directory now returns list_directory_generator_type Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Closes scylladb/scylladb#27916	2025-12-30 19:37:13 +03:00
Avi Kivity	853f3dadda	Merge 'treewide: fix some spelling errors' from Piotr Smaron Irritated by prevailing spellchecker comments attached to every PR, I aim to fix them all. No need to backport, just cosmetic changes. Closes scylladb/scylladb#27897 * github.com:scylladb/scylladb: treewide: fix some spelling errors codespell: ignore `iif` and `tread`	2025-12-29 20:45:31 +02:00
Pavel Emelyanov	d892140655	Merge 'Reduce allocations when traversing compaction_groups' from Benny Halevy - table, storage_group: add compaction_group_count - And use to reserve vector capacity before adding an item per compaction_group - table: reduce allocations by using for_each_compaction_group rather than compaction_groups() - compaction_groups() may allocate memory, but when called from a synchronous call site, the caller can use for_each_compaction_group instead. * Improvement, no backport needed Closes scylladb/scylladb#27479 * github.com:scylladb/scylladb: table: reduce allocations by using for_each_compaction_group rather than compaction_groups() replica: storage_group: rename compaction_groups to compaction_groups_immediate	2025-12-29 16:26:33 +03:00
Piotr Smaron	fb4d89f789	treewide: fix some spelling errors	2025-12-29 13:53:56 +01:00


1856 Commits