scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-05-23 00:02:37 +00:00

Author	SHA1	Message	Date
Avi Kivity	611918056a	Merge 'repair: Add tablet incremental repair support' from Asias He The central idea of incremental repair is to allow repair participants to select and repair only a portion of the dataset to speed up the repair process. All repair participants must use an identical selection method so that they repair and synchronize the same selected dataset. There are two primary selection methods: time-based and file-based. The time-based method selects data within a specified time frame. It is versatile but less efficient, because it requires reading the entire dataset and omitting data beyond the time frame. The file-based method selects data from unrepaired SSTables and is more efficient because it allows an entire SSTable to be omitted. This patch implements the file-based selection method. Incremental repair will only be supported for tablet tables; it will not be supported for vnode tables. For one, legacy vnode tables are less important to support. For another, incremental repair for vnodes is much harder to implement. With vnodes, an SSTable could contain data for multiple vnode ranges. When a given vnode range is repaired, only a portion of the SSTable is repaired. This complicates the manipulation of SSTables significantly during both repair and compaction. With tablets, an entire tablet is repaired, so an SSTable is either fully repaired or not repaired at all, which is a huge simplification. This patch uses the repaired_at field from the sstables::statistics component to mark an SSTable as repaired. It uses a virtual clock as the repair timestamp, i.e., a monotonically increasing number for the repaired_at field of an SSTable and for the sstables_repaired_at column in the system.tablets table. Note that when an SSTable is not repaired, the repaired_at field is set to the default value 0. The being_repaired in-memory field of an SSTable is used to explicitly mark that the SSTable is being selected. 
The following variables are used for incremental repair:
The repaired_at on disk field of a SSTable is used.
  - A 64-bit number increases sequentially
The sstables_repaired_at is added to the system.tablets table.
  - repaired_at <= sstables_repaired_at means the sstable is repaired
The being_repaired in memory field of a SSTable is added.
  - A repair UUID tells which sstable has participated in the repair
Initial test results:
1) Medium dataset results
  Node amount: 3
  Instance type: i4i.2xlarge
  Disk usage per node: ~500GB
  Cluster pre-populated with ~500GB of data before starting repairs job.
  Results for Repair Timings:
  The regular repair run took 210 mins.
  Incremental repair 1st run took 183 mins, 2nd and 3rd runs took around 48s
  The speedup is: 183 mins / 48s = 228X
2) Small dataset results
  Node amount: 3
  Instance type: i4i.2xlarge
  Disk usage per node: ~167GB
  Cluster pre-populated with ~167GB of data before starting the repairs job.
  Regular repair 1st run took 110s, 2nd and 3rd runs took 110s.
  Incremental repair 1st run took 110 seconds, 2nd and 3rd run took 1.5 seconds.
  The speedup is: 110s / 1.5s = 73X
3) Large dataset results
  Node amount: 6
  Instance type: i4i.2xlarge, 3 racks
  50% of base load, 50% read/write
  Dataset == Sum of data on each node
  Dataset    Non-incremental repair (minutes)
  1.3 TiB    31:07
  3.5 TiB    25:10
  5.0 TiB    19:03
  6.3 TiB    31:42
  Dataset    Incremental repair (minutes)
  1.3 TiB    24:32
  3.0 TiB    13:06
  4.0 TiB    5:23
  4.8 TiB    7:14
  5.6 TiB    3:58
  6.3 TiB    7:33
  7.0 TiB    6:55
Fixes #22472
Closes scylladb/scylladb#24291
* github.com:scylladb/scylladb:
  replica: Introduce get_compaction_reenablers_and_lock_holders_for_repair
  compaction: Move compaction_reenabler to compaction_reenabler.hh
  topology_coordinator: Make rpc::remote_verb_error to warning level
  repair: Add metrics for sstable bytes read and skipped from sstables
  test.py: Disable incremental for test_tombstone_gc_for_streaming_and_repair
  test.py: Add tests for tablet incremental repair
  repair: Add tablet incremental repair support
  compaction: Add tablet incremental repair support
  feature_service: Add TABLET_INCREMENTAL_REPAIR feature
  tablet_allocator: Add tablet_force_tablet_count_increase and decrease
  repair: Add incremental helpers
  sstable: Add being_repaired to sstable
  sstables: Add set_repaired_at to metadata_collector
  mutation_compactor: Introduce add operator to compaction_stats
  tablet: Add sstables_repaired_at to system.tablets table
  test: Fix drain api in task_manager_client.py	2025-08-19 13:13:22 +03:00
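The file-based selection rule from the incremental repair merge above can be modeled in a few lines of Python. This is only an illustrative sketch, not ScyllaDB code: the `SSTable` dataclass and the `is_repaired` and `select_unrepaired` helpers are hypothetical names. The `repaired_at <= sstables_repaired_at` comparison and the default value 0 for unrepaired SSTables come from the commit message; treating 0 as "never repaired" inside the comparison is an assumption inferred from that default.

```python
from dataclasses import dataclass


@dataclass
class SSTable:
    """Hypothetical stand-in for an SSTable's repair metadata."""
    name: str
    repaired_at: int = 0  # 0 is the default for an unrepaired SSTable


def is_repaired(sst: SSTable, sstables_repaired_at: int) -> bool:
    # Assumption: 0 always means "not repaired", so it is excluded explicitly.
    return sst.repaired_at != 0 and sst.repaired_at <= sstables_repaired_at


def select_unrepaired(sstables, sstables_repaired_at):
    """Keep only the SSTables an incremental repair would still need to read."""
    return [s for s in sstables if not is_repaired(s, sstables_repaired_at)]


# The tablet's sstables_repaired_at acts as a virtual clock value.
tablet_repaired_at = 2
sstables = [SSTable("a", 1), SSTable("b", 2), SSTable("c"), SSTable("d", 3)]
selected = [s.name for s in select_unrepaired(sstables, tablet_repaired_at)]
print(selected)
```

Whole SSTables are skipped without being read, which is the reason the commit message gives for the file-based method being more efficient than the time-based one.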
Dawid Mędrek	6a71461e53	treewide: Fix spelling errors The errors were spotted by our GitHub Actions. Closes scylladb/scylladb#24822	2025-08-19 13:07:43 +03:00
Anna Stuchlik	841ba86609	doc: document support for new z3 instance types This commit adds new z3 instances we now support to the list of GCP instance types. Fixes https://github.com/scylladb/scylladb/issues/25438 Closes scylladb/scylladb#25446	2025-08-14 10:59:45 +02:00
Anna Stuchlik	1e5659ac30	doc: add the information about ScyllaDB C# Driver This commit adds the driver to the list of ScyllaDB drivers, including the information about: - CDC integration (not available) - Tablets (supported) Fixes https://github.com/scylladb/scylladb/issues/25495 Closes scylladb/scylladb#25498	2025-08-14 11:29:52 +03:00
Pavel Emelyanov	eaec7c9b2e	Merge 'cql3: add default replication strategy to `create_keyspace_statement`' from Dario Mirovic When creating a new keyspace, both replication strategy and replication factor must be stated. For example: `CREATE KEYSPACE ks WITH REPLICATION = { 'class' : 'NetworkTopologyStrategy', 'replication_factor' : 3 };` This syntax is verbose, and in all but some testing scenarios `NetworkTopologyStrategy` is used. This patch allows skipping replication strategy name, filling it with `NetworkTopologyStrategy` when that happens. The following syntax is now valid: `CREATE KEYSPACE ks WITH REPLICATION = { 'replication_factor' : 3 };` and will give the same result as the previous, more explicit one. Fixes https://github.com/scylladb/scylladb/issues/16029 Backport is not needed. This is an enhancement for future releases. Closes scylladb/scylladb#25236 * github.com:scylladb/scylladb: docs/cql: update documentation for default replication strategy test/cqlpy: add keyspace creation default strategy test cql3: add default replication strategy to `create_keyspace_statement`	2025-08-14 11:18:36 +03:00
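The defaulting behavior described in the merge above amounts to filling in a missing `class` key. A minimal Python sketch, assuming the replication options are held in a plain dictionary; `with_default_strategy` is a hypothetical helper written for illustration and is not the `create_keyspace_statement` implementation:

```python
DEFAULT_STRATEGY = "NetworkTopologyStrategy"


def with_default_strategy(replication: dict) -> dict:
    """Return a copy of the options with 'class' filled in when it is absent."""
    opts = dict(replication)
    opts.setdefault("class", DEFAULT_STRATEGY)
    return opts


# The short form and the explicit form should resolve to the same options.
short = with_default_strategy({"replication_factor": 3})
explicit = with_default_strategy(
    {"class": "NetworkTopologyStrategy", "replication_factor": 3}
)
# An explicitly chosen strategy is left untouched.
simple = with_default_strategy({"class": "SimpleStrategy", "replication_factor": 1})
print(short == explicit, simple["class"])
```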
Dario Mirovic	2ac37b4fde	docs/cql: update documentation for default replication strategy Update create-keyspace-statement section of ddl.rst since `class` is no longer mandatory. Add an example for keyspace creation without specifying `class`. Refs: #16029	2025-08-13 01:52:00 +02:00
Wojciech Przytuła	7600ccfb20	Fix link to ScyllaDB manual The link would point to outdated OS docs. I fixed it to point to up-to-date Enterprise docs. Closes scylladb/scylladb#25328	2025-08-12 10:33:06 +03:00
Tomasz Grabiec	9fd312d157	Merge 'row_cache: add memtable overlap checks elision optimization for tombstone gc' from Botond Dénes https://github.com/scylladb/scylladb/issues/24962 introduced memtable overlap checks to cache tombstone GC. This was observed to be very strict and to greatly reduce the effectiveness of tombstone GC in the cache, especially for MV workloads, which regularly recycle old timestamps into new writes, so the memtable often has a smaller min live timestamp than the timestamp of the tombstones in the cache. When creating a new memtable, save a snapshot of the tombstone gc state. This snapshot is used later to exclude this memtable from overlap checks for tombstones whose tokens have an expiry time larger than that of the tombstone, meaning: all writes in this memtable were produced at a point in time when the current tombstone had already expired. This has the following implications: * The partition the tombstone is part of was already repaired at the time the memtable was created. * All writes in the memtable were produced after this tombstone's expiry time, so these writes cannot possibly be relevant for this tombstone. Based on this, such memtables are excluded from the overlap checks. With adequately frequent memtable flushes -- so that the tombstone gc state snapshot is refreshed -- most memtables should be excluded from overlap checks, greatly helping the cache's tombstone GC efficiency. 
Fixes: https://github.com/scylladb/scylladb/issues/24962 Fixes a regression introduced by https://github.com/scylladb/scylladb/pull/23255 which was backported to all releases, needs backport to all releases as well Closes scylladb/scylladb#25033 * github.com:scylladb/scylladb: docs/dev/tombstone.md: document the memtable overlap check elision optimization test/boost/row_cache_test: add test for memtable overlap check elision db/cache_mutation_reader: obtain gc-before and min-live-ts lazily mutation/mutation_compactor: use max_purgeable::can_purge and max_purgeable::purge_result db/cache_mutation_reader: use max_purgeable::can_purge() replica/table: get_max_purgeable_fn_for_cache_underlying_reader(): use max_purgable::combine() replica/database: memtable_list::get_max_purgeable(): set expiry-treshold compaction/compaction_garbage_collector: max_purgeable: add expiry_treshold replica/table: propagate gc_state to memtable_list replica/memtable_list: add tombstone_gc_state* member replica/memtable: add tombstone_gc_state_snapshot tombstone_gc: introduce tombstone_gc_state_snapshot tombstone_gc: extract shared state into shared_tombstone_gc_state tombstone_gc: per_table_history_maps::_group0_gc_time: make it a value tombstone_gc: fold get_group0_gc_time() into its caller tombstone_gc: fold get_or_create_group0_gc_time() into update_group0_refresh_time() tombstone_gc: fold get_or_create_repair_history_for_table() into update_repair_time() tombstone_gc: refactor get_or_greate_repair_history_for_table() replica/memtable_list: s/min_live_timestamp()/get_max_purgeable()/ db/read_context: return max_purgeable from get_max_purgeable() compaction/compaction_garbage_collector: add formatter for max_purgeable mutation: move definition of gc symbols to compaction.cc compaction/compaction_garbage_collector: refactor max_purgeable into a class test/boost/row_cache_test: refactor test_populating_reader_tombstone_gc_with_data_in_memtable test: rewrite 
test_compacting_reader_tombstone_gc_with_data_in_memtable in C++ test/boost/row_cache_test: refactor cache tombstone GC with memtable overlap tests	2025-08-11 23:54:59 +02:00
Botond Dénes	660ea9202a	docs/dev/tombstone.md: document the memtable overlap check elision optimization	2025-08-11 17:20:12 +03:00
Anna Stuchlik	1322f301f6	doc: add support for RHEL 10 This commit adds RHEL 10 to the list of supported platforms. Fixes https://github.com/scylladb/scylladb/issues/25436 Closes scylladb/scylladb#25437	2025-08-11 13:13:37 +02:00
Patryk Jędrzejczak	7b77c6cc4a	docs: Raft recovery procedure: recommend verifying participation in Raft recovery This instruction adds additional safety. The faster we notice that a node didn't restart properly, the better. The old gossip-based recovery procedure had a similar recommendation to verify that each restarting node entered `RECOVERY` mode. Fixes #25375 This is a documentation improvement. We should backport it to all branches with the new recovery procedure, so 2025.2 and 2025.3. Closes scylladb/scylladb#25376	2025-08-11 09:21:29 +03:00
Asias He	5377f87e5a	tablet: Add sstables_repaired_at to system.tablets table It is used to store the repaired_at for each tablet.	2025-08-11 10:10:07 +08:00
Anna Stuchlik	f3d9d0c1c7	doc: add new and removed metrics to the 2025.3 upgrade guide This commit adds the list of new and removed metrics to the already existing upgrade guide from 2025.2 to 2025.3. Fixes https://github.com/scylladb/scylladb/issues/24697 Closes scylladb/scylladb#25385	2025-08-08 13:25:51 +02:00
Botond Dénes	70aa81990b	Merge 'Alternator - add the ability to write, not just read, system tables' from Nadav Har'El In commit `44a1daf` we added the ability to read Scylla system tables with Alternator. This feature is useful, among other things, in tests that want to read Scylla's configuration through the system table system.config. But tests often want to modify system.config, e.g., to temporarily reduce some threshold to make tests shorter. Until now, this was not possible. This series adds support for writing to system tables through Alternator, and examples of tests using this capability (and utility functions to make it easy). Because the ability to write to system tables may have non-obvious security consequences, it is turned off by default and needs to be enabled with a new configuration option "alternator_allow_system_table_write". No backports are necessary - this feature is only intended for tests. We may later decide to backport if we want to backport new tests, but I think the probability we'll want to do this is low. Fixes #12348 Closes scylladb/scylladb#19147 * github.com:scylladb/scylladb: test/alternator: utility functions for changing configuration alternator: add optional support for writing to system table test/alternator: reduce duplicated code	2025-08-08 09:13:15 +03:00
Pavel Emelyanov	0616407be5	Merge 'rest_api: add endpoint which drops all quarantined sstables' from Taras Veretilnyk Added a new POST endpoint `/storage_service/drop_quarantined_sstables` to the REST API. This endpoint allows dropping all quarantined SSTables either globally or for a specific keyspace and tables. Optional query parameters `keyspace` and `tables` (comma-separated table names) can be provided to limit the scope of the operation. Fixes scylladb/scylladb#19061 Backport is not required, it is new functionality Closes scylladb/scylladb#25063 * github.com:scylladb/scylladb: docs: Add documentation for the nodetool dropquarantinedsstables command nodetool: add command for dropping quarantine sstables rest_api: add endpoint which drops all quarantined sstables	2025-08-06 11:55:15 +03:00
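A request to the endpoint described in the merge above could be assembled as in the following Python sketch. The endpoint path and the `keyspace` and `tables` query parameters come from the commit message; the base address `http://localhost:10000` and the helper name `drop_quarantined_request` are assumptions made for illustration. The request is only constructed here, not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

ENDPOINT = "/storage_service/drop_quarantined_sstables"


def drop_quarantined_request(base_url, keyspace=None, tables=None):
    """Build (but do not send) the POST request for the endpoint."""
    params = {}
    if keyspace is not None:
        params["keyspace"] = keyspace
    if tables:
        # The commit message describes `tables` as comma-separated table names.
        params["tables"] = ",".join(tables)
    url = base_url.rstrip("/") + ENDPOINT
    if params:
        # safe="," keeps the comma separator readable in the query string.
        url += "?" + urlencode(params, safe=",")
    return Request(url, method="POST")


# Assumed base address; adjust to wherever the node's REST API listens.
req = drop_quarantined_request("http://localhost:10000", "ks", ["t1", "t2"])
print(req.get_method(), req.full_url)
```

Omitting both parameters yields a request with no query string, which corresponds to the global form of the operation mentioned in the commit message.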
Taras Veretilnyk	bcb90c42e4	docs: Sort commands list in nodetool.rst Fixes scylladb/scylladb#25330 Closes scylladb/scylladb#25331	2025-08-06 11:20:53 +03:00
Nadav Har'El	a896e2dbb9	alternator: add optional support for writing to system table In commit `44a1daf` we added the ability to read system tables through the DynamoDB API (actually, the Scan and Query requests only). This ability is useful for tests, and can also be useful to users who want to read information that is only available through system tables. This patch also adds support for writing to system tables. This will be useful for Alternator tests, where we want to temporarily change some live-updatable configuration option - something we haven't been able to do so far, unlike in some cql-pytest tests. For reasons explained in issue #23218, only superuser roles are allowed to write to system tables - it is not enough for the role to be granted MODIFY permissions on the system table or on ALL KEYSPACES. Moreover, the ability to modify system tables carries special risks, so this patch only allows writes to the system tables if a new configuration option "alternator_allow_system_table_write" is turned on. This option is turned off by default. This patch also includes a test for this new configuration-writing capability. The test scripts test/alternator/run and test.py now run Scylla with alternator_allow_system_table_write turned on, but the new test can also run without this option, and will be skipped in that case (to allow running the test suite against some manually-run instance of Scylla). Fixes: #12348 Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2025-08-06 10:00:04 +03:00
Avi Kivity	4c785b31c7	Merge 'List Alternator clients in system.clients virtual table' from Nadav Har'El Before this series, the "system.clients" virtual table listed active connections (and their various properties, like client address, logged in username and client version) only for CQL requests. This series also adds Alternator clients to system.clients. One of the interesting use cases of this new feature is understanding exactly which SDK a user is using, without inspecting their application code. Different SDKs pass different "User-Agent" headers in requests, and that User-Agent will be visible in the system.clients entries for Alternator requests as the "driver_name" field. Unlike CQL, where logged in username, driver name, etc. apply to a complete connection, in the Alternator API different requests can theoretically be signed by different users and carry different headers but still arrive over the same HTTP connection. So instead of listing the currently open Alternator connections, we will list the currently active requests. The first three patches introduce utilities that will be useful in the implementation. The fourth patch is the implementation itself (which is quite simple with the utility introduced in the second patch), and the fifth patch is a regression test for the new feature. The sixth patch adds documentation, the seventh patch refactors generic_server to use the newly introduced utility class and reduce code duplication, and the eighth patch adds a small check to an existing test of CQL's system.clients. Fixes #24993 This patch adds a new feature, so doesn't require a backport. Nevertheless, if we want it to get to existing customers more quickly to allow us to better understand their use case by reading the system.clients table, we may want to consider backporting this patch to existing branches. 
There is some risk involved in this patch, because it adds code that gets run on every Alternator request, so a bug in it can cause problems for every Alternator request. Closes scylladb/scylladb#25178 * github.com:scylladb/scylladb: test/cqlpy: slightly strengthen test for system.clients generic_server: use utils::scoped_item_list docs/alternator: document the system.clients system table in Alternator alternator: add test for Alternator clients in system.clients alternator: list active Alternator requests in system.clients utils: unit test for utils::scoped_item_list utils: add a scoped_item_list utility class utils: add "fatal" version of utils::on_internal_error()	2025-08-05 15:55:41 +03:00
Pavel Emelyanov	5fcdf948d9	doc: Update system.clients schema with scheduling_group cell It was added by `9319d65971` (db/virtual_tables: add scheduling group column to system.clients) recently. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes scylladb/scylladb#25294	2025-08-05 10:16:20 +03:00
Piotr Dulikowski	ec7832cc84	Merge 'Raft-based recovery procedure: simplify rolling restart with recovery_leader' from Patryk Jędrzejczak The following steps are performed in sequence as part of the Raft-based recovery procedure: - set `recovery_leader` to the host ID of the recovery leader in `scylla.yaml` on all live nodes, - send the `SIGHUP` signal to all Scylla processes to reload the config, - perform a rolling restart (with the recovery leader being restarted first). These steps are not intuitive and more complicated than they could be. In this PR, we simplify these steps. From now on, we will be able to simply set `recovery_leader` on each node just before restarting it. Apart from making necessary changes in the code, we also update all tests of the Raft-based recovery procedure and the user-facing documentation. Fixes scylladb/scylladb#25015 The Raft-based procedure was added in 2025.2. This PR makes the procedure simpler and less error-prone, so it should be backported to 2025.2 and 2025.3. Closes scylladb/scylladb#25032 * github.com:scylladb/scylladb: docs: document the option to set recovery_leader later test: delay setting recovery_leader in the recovery procedure tests gossip: add recovery_leader to gossip_digest_syn db: system_keyspace: peers_table_read_fixup: remove rows with null host_id db/config, gms/gossiper: change recovery_leader to UUID db/config, utils: allow using UUID as a config option	2025-08-04 08:29:32 +02:00
Taras Veretilnyk	15e3980693	docs: Add documentation for the nodetool dropquarantinedsstables command Fixes scylladb/scylladb#19061	2025-08-01 11:46:33 +02:00
Nadav Har'El	70c94ac9dd	docs/alternator: document the system.clients system table in Alternator Add to docs/alternator/new-apis.md a full description of the `system.clients` support in Alternator that was added in the previous patches. Although arguably all Scylla system tables should work on Alternator and do not need to be individually documented, I believe that this specific table is interesting to document. This is because some of the attributes in this table have non-obvious and Alternator-specific meanings. Moreover, there's even a difference in what each individual item in the table represents (it represents active requests, not entire connections as in CQL). While editing the system tables section of new-apis.md, this patch also slightly improves its formatting. Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2025-08-01 02:15:05 +03:00
Nadav Har'El	22f845b128	docs/alternator: mention missing ShardFilter support Add in docs/alternator/compatibility.md a mention of the ShardFilter option which we don't support in Alternator Streams. This option was only introduced to DynamoDB a week ago, so it's not surprising we don't yet support it :-) Refs #25160 Closes scylladb/scylladb#25161	2025-07-29 14:37:24 +03:00
Andrei Chekun	a6a3d119e8	docs: update documentation with new way of running C++ tests The documentation had outdated information on how to run C++ tests. Additionally, some information was added about gathered test metrics. Closes scylladb/scylladb#25180	2025-07-29 14:36:19 +03:00
Anna Stuchlik	b67bb641bc	doc: add OS support for ScyllaDB 2025.3 This commit adds the information about support for platforms in ScyllaDB version 2025.3. Fixes https://github.com/scylladb/scylladb/issues/24698 Closes scylladb/scylladb#25220	2025-07-29 14:33:12 +03:00
Anna Stuchlik	8365219d40	doc: add the upgrade guide from 2025.2 to 2025.3 This PR adds the upgrade guide from version 2025.2 to 2025.3. Also, it removes the upgrade guide existing for the previous version that is irrelevant in 2025.2 (upgrade from 2025.1 to 2025.2). Note that the new guide does not include the "Enable Consistent Topology Updates" page and note, as users upgrading to 2025.3 have consistent topology updates already enabled. Fixes https://github.com/scylladb/scylladb/issues/24696 Closes scylladb/scylladb#25219	2025-07-29 14:32:31 +03:00
Anna Stuchlik	18b4d4a77c	doc: add tablets support information to the Drivers table This commit: - Extends the Drivers support table with information on which driver supports tablets and since which version. - Adds the driver support policy to the Drivers page. - Reorganizes the Drivers page to accommodate the updates. In addition: - The CPP-over-Rust driver is added to the table. - The information about Serverless (which we don't support) is removed and replaced with tablets to correctly describe the contents of the table. Fixes https://github.com/scylladb/scylladb/issues/19471 Refs https://github.com/scylladb/scylladb-docs-homepage/issues/69 Closes scylladb/scylladb#24635	2025-07-29 08:11:42 +03:00
Nadav Har'El	b4fc3578fc	Merge 'LWT: enable for tablet-based tables' from Petr Gusev This PR enables LWT (Lightweight Transactions) support for tablet-based tables by leveraging colocated tables. Currently, storing Paxos state in system tables causes two major issues: * Loss of Paxos state during tablet migration or base table rebuilds * When a tablet is migrated or the base table is rebuilt, system tables don't retain Paxos state. * This breaks LWT correctness in certain scenarios. * Failing test cases demonstrating this: * test_lwt_state_is_preserved_on_tablet_migration * test_lwt_state_is_preserved_on_rebuild * Shard misalignment and performance overhead * Tablets may be placed on arbitrary shards by the tablet balancer. * Accessing Paxos state in system tables could require a shard jump, degrading performance. We move Paxos state into a dedicated Paxos table, colocated with the base table: * Each base table gets its own Paxos state table. * This table is lazily created on the first LWT operation. * Its tablets are colocated with those of the base table, ensuring: * Co-migration during tablet movement * Co-rebuilding with the base table * Shard alignment for local access to Paxos state Some reasoning for why this is sufficient to preserve LWT correctness is discussed in [2]. This PR addresses two issues from the "Why doesn't it work for tablets" section in [1]: * Tablet migration vs LWT correctness * Paxos table sharding Other issues ("bounce to shard" and "locking for intranode_migration") have already been resolved in previous PRs. 
References [1] - [LWT over tablets design](https://docs.google.com/document/d/1CPm0N9XFUcZ8zILpTkfP5O4EtlwGsXg_TU4-1m7dTuM/edit?tab=t.0#heading=h.goufx7gx24yu) [2] - [LWT: Paxos state and tablet balancer](https://docs.google.com/document/d/1-xubDo612GGgguc0khCj5ukmMGgLGCLWLIeG6GtHTY4/edit?tab=t.0) [3] - [Colocated tables PR](https://github.com/scylladb/scylladb/pull/22906#issuecomment-3027123886) [4] - [Possible LWT consistency violations after a topology change](https://github.com/scylladb/scylladb/issues/5251) Backport: not needed because this is a new feature. Closes scylladb/scylladb#24819 * github.com:scylladb/scylladb: create_keyspace: fix warning for tablets docs: fix lwt.rst docs: fix tablets.rst alternator: enable LWT random_failures: enable execute_lwt_transaction test_tablets_lwt: add test_paxos_state_table_permissions test_tablets_lwt: add test_lwt_for_tablets_is_not_supported_without_raft test_tablets_lwt: test timeout creating paxos state table test_tablets_lwt: add test_lwt_concurrent_base_table_recreation test_tablets_lwt: add test_lwt_state_is_preserved_on_rebuild test_tablets_lwt: migrate test_lwt_support_with_tablets test_tablets_lwt: add test_lwt_state_is_preserved_on_tablet_migration test_tablets_lwt: add simple test for LWT check_internal_table_permissions: handle Paxos state tables client_state: extract check_internal_table_permissions paxos_store: handle base table removal database: get_base_table_for_tablet_colocation: handle paxos state table paxos_state: use node_local_only mode to access paxos state query_options: add node_local_only mode storage_proxy: handle node_local_only in query storage_proxy: handle node_local_only in mutate storage_proxy: introduce node_local_only flag abstract_replication_strategy: remove unused using storage_proxy: add coordinator_mutate_options storage_proxy: rename create_write_response_handler -> make_write_response_handler storage_proxy: simplify mutate_prepare paxos_state: lazily create paxos state table 
migration_manager: add timeout to start_group0_operation and announce paxos_store: use non-internal queries qp: make make_internal_options public paxos_store: conditional cf_id filter paxos_store: coroutinize feature_service: add LWT_WITH_TABLETS feature paxos_state: inline system_keyspace functions into paxos_store paxos_state: extract state access functions into paxos_store	2025-07-28 13:19:23 +03:00
Taras Veretilnyk	6b6622e07a	docs: fix typo in command name enbleautocompaction -> enableautocompaction Renamed the file and updated all references from 'enbleautocompaction' to the correct 'enableautocompaction'. Fixes scylladb/scylladb#25172 Closes scylladb/scylladb#25175	2025-07-28 12:49:26 +03:00
Botond Dénes	837424f7bb	Merge 'Add Azure Key Provider for Encryption at Rest' from Nikos Dragazis This PR introduces a new Key Provider to support Azure Key Vault as a Key Management System (KMS) for Encryption at Rest. The core design principle is the same as in the AWS and GCP key providers - an externally provided Vault key that is used to protect local data encryption keys (a process known as "key wrapping"). In more detail, this patch series consists of: * Multiple Azure credential sources, offering a variety of authentication options (Service Principals, Managed Identities, environment variables, Azure CLI). * The Azure host - the Key Vault endpoint bridge. * The Azure Key Provider - the interface for the Azure host. * Unit tests using real Azure resources (credentials and Vault keys). * Log filtering logic to not expose sensitive data in the logs (plaintext keys, credentials, access tokens). This is part of the overall effort to support Azure deployments. Testing done: * Unit tests. * Manual test on an Azure VM with a Managed Identity. * Manual test with credentials from Azure CLI. * Manual test of `--azure-hosts` cmdline option. * Manual test of log filtering. Remaining items: - [x] Create necessary Azure resources for CI. - [x] Merge pipeline changes (https://github.com/scylladb/scylla-pkg/pull/5201). Closes https://github.com/scylladb/scylla-enterprise/issues/1077. New feature. No backport is needed. 
Closes scylladb/scylladb#23920 * github.com:scylladb/scylladb: docs: Document the Azure Key Provider test: Add tests for Azure Key Provider pylib: Add mock server for Azure Key Vault encryption: Define and enable Azure Key Provider encryption: azure: Delegate hosts to shard 0 encryption: Add Azure host cache encryption: Add config options for Azure hosts encryption: azure: Add override options encryption: azure: Add retries for transient errors encryption: azure: Implement init() encryption: azure: Implement get_key_by_id() encryption: azure: Add id-based key cache encryption: azure: Implement get_or_create_key() encryption: azure: Add credentials in Azure host encryption: azure: Add attribute-based key cache encryption: azure: Add skeleton for Azure host encryption: Templatize get_{kmip,kms,gcp}_host() encryption: gcp: Fix typo in docstring utils: azure: Get access token with default credentials utils: azure: Get access token from Azure CLI utils: azure: Get access token from IMDS utils: azure: Get access token with SP certificate utils: azure: Get access token with SP secret utils: rest: Add interface for request/response redaction logic utils: azure: Declare all Azure credential types utils: azure: Define interface for Azure credentials utils: Introduce base64url_{encode,decode}	2025-07-25 10:45:32 +03:00
Petr Gusev	1f5d9ace93	docs: fix lwt.rst Add a new section about Paxos state tables. Update all references to system.paxos in the text to refer to this section.	2025-07-24 20:04:43 +02:00
Petr Gusev	69017fb52a	docs: fix tablets.rst LWT and Alternator are now supported with tablets.	2025-07-24 20:04:43 +02:00
Petr Gusev	abab025d4f	alternator: enable LWT	2025-07-24 20:04:43 +02:00
Patryk Jędrzejczak	f408d1fa4f	docs: document the option to set recovery_leader later In one of the previous commits, we made it possible to set `recovery_leader` on each node just before restarting it. Here, we update the corresponding documentation.	2025-07-23 15:36:57 +02:00
Ran Regev	3d82b9485e	docs: update nodetool restore documentation for --sstables-file-list Fixes: #25128 A leftover from #25077 Closes scylladb/scylladb#25129	2025-07-22 14:43:35 +02:00
Nikos Dragazis	88554b7c7a	docs: Document the Azure Key Provider Extend the EaR ops guide to incorporate the new Azure Key Provider. Document its options and provide instructions on how to configure it. Signed-off-by: Nikos Dragazis <nikolaos.dragazis@scylladb.com>	2025-07-16 23:06:11 +03:00
Botond Dénes	2d3965c76e	Merge 'Reduce Alternator table name length limit to 192 and fix crash when adding stream to table with very long name' from Nadav Har'El Before this series, it was possible to crash Scylla (due to an I/O error) by creating an Alternator table close to the maximum name length of 222, and then enabling Alternator Streams. This series fixes this bug in two ways: 1. On a pre-existing table whose name might be up to 222 characters, enabling Streams will check if the resulting name is too long, and if it is, fail with a clear error instead of crashing. This case will affect pre-existing tables whose name has between 207 and 222 characters (207 is `222 - strlen("_scylla_cdc_log")`) - for such tables enabling Streams will fail, but no longer crash. 2. For new tables, the table name length limit is lowered from 222 to 192. The new limit is still high enough, but ensures it will be possible to enable Streams on any new table. It will also always be possible to add a GSI for such a table with name up to 29 characters (if the table name is shorter, the GSI name can be longer - the sum can be up to 221 characters). No need to backport, Alternator Streams is still an experimental feature and this patch just improves the unlikely situation of extremely long table names. Fixes #24598 Closes scylladb/scylladb#24717 * github.com:scylladb/scylladb: alternator: lower maximum table name length to 192 alternator: don't crash when adding Streams to long table name alternator: split length limit for regular and auxiliary tables alternator: avoid needlessly validating table name	2025-07-15 06:57:04 +03:00
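The length arithmetic quoted in the merge above can be checked with a short Python sketch. The numbers 222, 192, 221 and the `_scylla_cdc_log` suffix are taken from the commit message; the constant and variable names are hypothetical and chosen only for this illustration.

```python
OLD_TABLE_NAME_LIMIT = 222      # previous maximum Alternator table name length
NEW_TABLE_NAME_LIMIT = 192      # limit for newly created tables
CDC_LOG_SUFFIX = "_scylla_cdc_log"
MAX_TABLE_PLUS_GSI = 221        # stated maximum for table name + GSI name

# 222 - strlen("_scylla_cdc_log"), the threshold quoted for pre-existing tables
streams_threshold = OLD_TABLE_NAME_LIMIT - len(CDC_LOG_SUFFIX)

# Longest derived name when Streams is enabled on a table at the new limit
longest_new_log_name = NEW_TABLE_NAME_LIMIT + len(CDC_LOG_SUFFIX)

# Room left for a GSI name when the table name uses the full new limit
gsi_room_at_new_limit = MAX_TABLE_PLUS_GSI - NEW_TABLE_NAME_LIMIT

print(streams_threshold, longest_new_log_name, gsi_room_at_new_limit)
```

With a 192-character table name the derived log name stays below the old 222-character maximum, which matches the claim that Streams can be enabled on any new table.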
Botond Dénes	1f9f43d267	Merge 'kms_host: Support external temporary security credentials' from Nikos Dragazis This PR extends the KMS host to support temporary AWS security credentials provided externally via the Scylla configuration file, environment variables, or the AWS credentials file. The KMS host already supports: * Temporary credentials obtained automatically from the EC2 instance metadata service or via IAM role assumption. * Long-term credentials provided externally via configuration, environment, or the AWS credentials file. This PR is about temporary credentials that are external, i.e., not generated by Scylla. Such credentials may be issued, for example, through identity federation (e.g., Okta + gimme-aws-creds). External temporary credentials are useful for short-lived tasks like local development, debugging corrupted SSTables with `scylla-sstable`, or other local testing scenarios. These credentials are temporary and cannot be refreshed automatically, so this method is not intended for production use. Documentation has been updated to mention these additional credential sources. Fixes #22470. New feature, no backport is needed. Closes scylladb/scylladb#22465 * github.com:scylladb/scylladb: doc: Expose new `aws_session_token` option for KMS hosts kms_host: Support authn with temporary security credentials encryption_config: Mention environment in credential sources for KMS	2025-07-15 06:45:39 +03:00
Pawel Pery	eadbf69d6f	vector_store_client: implement ANN API This patch is a part of the vector_store_client sharded service implementation for communication with the vector-store service. It implements the functionality for an ANN search request to a vector-store service. It sends the request, receives the response, and after parsing it returns the list of primary keys. It adds json parsing functionality specific to the HTTP ANN API. It adds a hardcoded http request timeout for retrieving the response from the Vector Store service. It also adds an automatic boost test of the ANN search interface, which uses a mockup http server in the background to simulate the vector-store service. It adds documentation for the HTTP API protocol used for ANN functionality. Fixes: VS-47	2025-07-09 11:54:51 +02:00
Piotr Dulikowski	ea35302617	Merge 'test: audit: enable syslog audit tests' from Andrzej Jackowski Several audit test issues caused test failures, and as a result, almost all of the audit syslog tests were marked with xfail. This patch series enables the syslog audit tests, which should finally pass after the following fixes are introduced: - bring back commas to audit syslog (scylladb#24410 fix) - synchronize audit syslog server - fix parsing of syslog messages - generate unique uuid for each line in syslog audit - allow audit logging from multiple nodes Fixes: scylladb/scylladb#24410 Test improvements, no backport required. Closes scylladb/scylladb#24553 * github.com:scylladb/scylladb: test: audit: use automatic comparators in AuditEntry test: audit: enable syslog audit tests test: audit: sort new audit entries before comparing with expected ones test: audit: check audit logging from multiple nodes test: audit: generate unique uuid for each line in syslog audit test: audit: fix parsing of syslog messages test: audit: synchronize audit syslog server docs: audit: update syslog audit format to the current one audit: bring back commas to audit syslog	2025-07-07 12:45:44 +02:00
Nadav Har'El	18b6c4d3c5	alternator: lower maximum table name length to 192 Currently, Alternator allows creating a table with a name up to 222 (max_table_name_length) characters in length. But if you do create a table with such a long name, you can have some difficulties later: You will not be able to add Streams or GSI or LSI to that table, because 222 is also the absolute maximum length Scylla tables can have and the auxiliary tables we want to create (CDC log, materialized views) will go over this absolute limit (max_auxiliary_table_name_length). This is not nice. DynamoDB users assume that after successfully creating a table, they can later - perhaps much later - decide to add Streams or GSI to it, and today if they chose extremely long names, they won't be able to do this. So in this patch, we lower max_table_name_length from 222 to 192. A user will not be able to create tables with longer names, but the good news is that once successfully creating a table, it will always be possible to enable Streams on it (the CDC log table has an extra 15 bytes in its name, and 192 + 15 is less than 222), and it will be possible to add GSIs with short enough names (if the GSI name is 29 or less, 192 + 29 + 1 = 222). This patch is a trivial one-line code change, but also includes the corrected documentation of the limits, and a fix for one test that previously checked that a table name with length 222 was allowed - and now needs to check 192 because 222 is no longer allowed. Note that if a user has existing tables and upgrades Scylla, it is possible that some pre-existing Alternator tables might have lengths over 192 (up to 222). 
This is fine - in the previous patches we made sure that even in this case, all operations will still work correctly on these old tables (by not validating the name!), and we also made sure that attempting to enable Streams may fail when the name is too long (we do not remove those old checks in this patch, and don't plan to remove them in the foreseeable future). Note that the limit we chose - 192 characters - is identical to the table name limit we recently chose in CQL. It's nicer that we don't need to memorize two different limits for Alternator and CQL. Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2025-07-07 11:58:21 +03:00
dependabot[bot]	59cc496757	build(deps): bump sphinx-scylladb-theme from 1.8.6 to 1.8.7 in /docs Bumps [sphinx-scylladb-theme](https://github.com/scylladb/sphinx-scylladb-theme) from 1.8.6 to 1.8.7. - [Release notes](https://github.com/scylladb/sphinx-scylladb-theme/releases) - [Commits](https://github.com/scylladb/sphinx-scylladb-theme/compare/1.8.6...1.8.7) --- updated-dependencies: - dependency-name: sphinx-scylladb-theme dependency-version: 1.8.7 dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Closes scylladb/scylladb#24805	2025-07-03 12:04:24 +03:00
Patryk Jędrzejczak	fa982f5579	docs: handling-node-failures: fix typo Replacing "from" is incorrect. The typo comes from recently merged #24583. Fixes #24732 Requires backport to 2025.2 since #24583 has been backported to 2025.2. Closes scylladb/scylladb#24733	2025-07-02 12:22:01 +03:00
Nikos Dragazis	fbc9ead182	doc: Expose new `aws_session_token` option for KMS hosts Signed-off-by: Nikos Dragazis <nikolaos.dragazis@scylladb.com>	2025-07-02 12:04:40 +03:00
Avi Kivity	dfaed80f55	Merge 'types: add byte-comparable format support for native cql3 types' from Lakshmi Narayanan Sreethar This PR introduces a new `comparable_bytes` class to add byte-comparable format support for all the [native cql3 data types](https://opensource.docs.scylladb.com/stable/cql/types.html#native-types) except `counter` type as that is not comparable. The byte-comparable format is a pre-requisite for implementing the trie based index format for our sstables (https://github.com/scylladb/scylladb/issues/19191). This implementation adheres to the byte-comparable format specification in https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/utils/bytecomparable/ByteComparable.md Note that support for composite data types like lists, maps, and sets has not been implemented yet and will be made available in a separate PR. Refs https://github.com/scylladb/scylladb/issues/19407 New feature - backport not required. Closes scylladb/scylladb#23541 * github.com:scylladb/scylladb: types/comparable_bytes: add testcase to verify compatibility with cassandra types/comparable_bytes: support variable-length natively byte-ordered data types types/comparable_bytes: support decimal cql3 types types/comparable_bytes: introduce count_digits() method types/comparable_bytes: support uuid and timeuuid cql3 types types/comparable_bytes: support varint cql3 type types/comparable_bytes: support skipping sign byte write in decode_signed_long_type types/comparable_bytes: introduce encode/decode_varint_length types/comparable_bytes: support float and double cql3 types types/comparable_bytes: support date, time and timestamp cql3 types types/comparable_bytes: support bigint cql3 type types/comparable_bytes: support fixed length signed integers types/comparable_bytes: support boolean cql3 type types: introduce comparable_bytes class bytes_ostream: overload write() to support writing from FragmentedView docs: fix minor typo in docs/dev/cql3-type-mapping.md	2025-07-02 11:58:32 +03:00
Lakshmi Narayanan Sreethar	068e74b457	docs: fix minor typo in docs/dev/cql3-type-mapping.md Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com>	2025-07-01 22:19:07 +05:30
Tomasz Grabiec	97679002ee	Merge 'Co-locate tablets of different tables' from Michael Litvak Add the option to co-locate tablets of different tables. For example, a base table and its CDC table, or a local index. main changes and ideas: * "table group" - a set of one or more tables that should be co-located. (Example: base table and CDC table). A group consists of one base table and zero or more children tables. * new column `base_table` in `system.tablets`: when creating a new table, it can be set to point to a base table, which the new table's tablets will be co-located with. when it's set, the tablet map information should be retrieved from the base table map. the child map doesn't contain per-tablet information. * co-located tables always have the same tablet count and the same tablet replicas. each tablet operation - migration, resize, repair - is applied on all tablets in a synchronized manner by the topology coordinator. * resize decision for a group is made by combining the per-table hints and comparing the average tablet size (over all tablets in the group) with the target tablet size. * the tablets load balancer works with the base table as a representative of the group. it represents a single migration unit with some `group_size` that is taken into account. * view tablets are co-located with base tablets when the partition keys match. Fixes https://github.com/scylladb/scylladb/issues/17043 backport is not needed. this is preliminary work for support of MVs and CDC with tablets. 
Closes scylladb/scylladb#22906 * github.com:scylladb/scylladb: tablets: validate no clustering row mutations on co-located tables raft_group0_client: extend validate_change to mixed_change type docs: topology-over-raft: document co-located tables tablet-mon.py: visual indication for co-located tablets tablet-mon.py: handle co-located tablets test/boost/view_schema_test.cc: fix race in wait_until_built boost/tablets_test: test load balancing and resize of co-located tablets test/tablets: test tablets colocation tablets: co-locate view tablets with base when the partition keys match test/pylib/tablets: common get_tablet_count api test_mv_tablets: use get_tablet_replicas from common tablets api test/pylib/tablets: fix test api to read tablet replicas from base table tablets: allocator: create co-located tables in a single operation alternator: prepare all new tables in a single announcement migration_manager: add notification for creating multiple tables tablets: read_tablet_transition_stage: read from base table storage service: allow repair request only on base tables tablets: keyspace_rf_change: apply on base table storage service: generate tablet migration updates on base tables tablets: replace all_tables method tablets: split when all co-located tablets are ready tablets: load balancer: sizing plan for table groups tablets: load balancer: handle co-located tablets tablets: allocate co-located tablets tablets: handle migration of co-located tablets storage service: add repair colocated tablets rpc tablets: save and read tablet metadata of co-located tables tablets: represent co-located tables in tablet metadata tablets: add base_table column to system.tablets docs: update system.tablets schema	2025-07-01 16:02:30 +02:00
Botond Dénes	37ef9efb4e	docs: cql/types.rst: remove reference to frozen-only UDTs ScyllaDB supports non-frozen UDTs since 3.2, no need to keep referencing this limitation in the current docs. Replace the description of the limitation with general description of frozen semantics for UDTs. Fixes: #22929 Closes scylladb/scylladb#24763	2025-07-01 16:19:18 +03:00
Michael Litvak	6fa5d2f7c8	docs: topology-over-raft: document co-located tables	2025-07-01 13:20:19 +03:00
Anna Stuchlik	9234e5a4b0	doc: add the SBOM page and the download link This commit migrates the Software Bill Of Materials (SBOM) page added to the Enterprise docs with https://github.com/scylladb/scylla-enterprise/pull/5067. The only difference is the link to the SBOM files - it was Enterprise SBOM in the Enterprise docs, while here it is a link to the ScyllaDB SBOM. It's a follow-up of the migration to Source Available and should be backported to all Source Available versions - 2025.1 and later. Fixes https://github.com/scylladb/scylladb/issues/24730 Closes scylladb/scylladb#24735	2025-07-01 11:33:19 +03:00
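
The name-length arithmetic described in commits 2d3965c76e and 18b6c4d3c5 above can be checked with a small sketch. The constants 222, 192, the `_scylla_cdc_log` suffix, and the one extra character between table and GSI name come from those commit messages. The function names are hypothetical and are not Alternator's API, and treating the 222 limit as inclusive is an assumption.

```python
# Limits quoted in the commit messages; this is an illustration, not Scylla code.
MAX_SCYLLA_TABLE_NAME = 222         # absolute maximum length of a Scylla table name
MAX_ALTERNATOR_TABLE_NAME = 192     # new limit for newly created Alternator tables
CDC_LOG_SUFFIX = "_scylla_cdc_log"  # 15 characters added to form the CDC log table name


def can_enable_streams(table_name: str) -> bool:
    """Hypothetical check: does the CDC log table name fit the absolute limit?

    Assumes the limit is inclusive (a name of exactly 222 characters is allowed).
    """
    return len(table_name) + len(CDC_LOG_SUFFIX) <= MAX_SCYLLA_TABLE_NAME


def max_gsi_name_length(table_name: str) -> int:
    """Hypothetical helper: longest GSI name that still fits.

    One extra character is reserved between table and GSI name,
    matching the "192 + 29 + 1 = 222" arithmetic in the commit message.
    """
    return MAX_SCYLLA_TABLE_NAME - len(table_name) - 1


if __name__ == "__main__":
    longest_new = "t" * MAX_ALTERNATOR_TABLE_NAME
    print(len(CDC_LOG_SUFFIX))                # length of the CDC log suffix
    print(can_enable_streams(longest_new))    # any new table leaves room for Streams
    print(max_gsi_name_length(longest_new))   # GSI name budget for a 192-character table
    print(can_enable_streams("t" * 222))      # a pre-existing 222-character name does not fit
```

Under these assumptions, every table created within the 192-character limit leaves room for the CDC log suffix, while a pre-existing 222-character name does not.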


1739 Commits