scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-06-03 05:26:58 +00:00

Author	SHA1	Message	Date
Taras Veretilnyk	ca2b8352ac	db: add large_row_fail_threshold_mb config option Reject writes targeting a row whose on-disk size already exceeds this threshold (MB). Code default is 0 (disabled) for existing clusters; scylla.yaml ships 20 for new deployments.	2026-05-13 13:47:32 +02:00
Taras Veretilnyk	64fc53cb7c	db: add rows_count_fail_threshold config option Reject writes targeting a partition whose on-disk row count already exceeds this threshold. Code default is 0 (disabled) for existing clusters; scylla.yaml ships 200000 for new deployments.	2026-05-13 13:47:17 +02:00
Taras Veretilnyk	c12b16603d	db: add large_partition_fail_threshold_mb config option Reject writes targeting a partition whose on-disk size already exceeds this threshold (MB). Code default is 0 (disabled) for existing clusters; scylla.yaml ships 2000 for new deployments.	2026-05-13 13:47:02 +02:00
Taras Veretilnyk	72c472a3ae	replica: introduce large_data_exception Add large_data_exception to the replica exception hierarchy so that write-path guardrails can reject mutations that target partitions already known to exceed configured size limits. Wire it through exception_variant / IDL so it propagates from replica to coordinator, where storage_proxy re-throws it as a mutation_write_failure.	2026-05-13 12:53:34 +02:00
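The four guardrail commits above can be read together as "compare a known on-disk measure against a threshold, where 0 disables the check, and raise a dedicated error". The sketch below is a minimal illustration under that reading only. `Guardrails`, `LargeDataError` and `check_write` are hypothetical names, not ScyllaDB APIs, and how the replica actually learns the on-disk sizes is not described in the commit messages.

```python
from dataclasses import dataclass


class LargeDataError(Exception):
    """Hypothetical stand-in for the replica-side large_data_exception."""


@dataclass
class Guardrails:
    # 0 means "disabled", mirroring the code defaults in the commit messages.
    large_partition_fail_threshold_mb: int = 0
    large_row_fail_threshold_mb: int = 0
    rows_count_fail_threshold: int = 0


def exceeds(value, threshold):
    """A threshold of 0 never triggers; otherwise the value must exceed it."""
    return threshold > 0 and value > threshold


def check_write(cfg, partition_mb, row_mb, row_count):
    """Raise LargeDataError if any already-known on-disk measure is over its limit."""
    if exceeds(partition_mb, cfg.large_partition_fail_threshold_mb):
        raise LargeDataError("partition size exceeds large_partition_fail_threshold_mb")
    if exceeds(row_mb, cfg.large_row_fail_threshold_mb):
        raise LargeDataError("row size exceeds large_row_fail_threshold_mb")
    if exceeds(row_count, cfg.rows_count_fail_threshold):
        raise LargeDataError("row count exceeds rows_count_fail_threshold")


# Values that scylla.yaml ships for new deployments, per the commit messages.
shipped = Guardrails(
    large_partition_fail_threshold_mb=2000,
    large_row_fail_threshold_mb=20,
    rows_count_fail_threshold=200000,
)
```

With the default `Guardrails()` every write passes, which matches the "disabled for existing clusters" behavior; with `shipped`, a write to a 2500 MB partition would be rejected.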
Anna Stuchlik	a7b7019f90	doc: update the node size limit This commit increases the node size limit from 256 to 4096 CPUs based on `be1f566488` Fixes SCYLLADB-1676 Closes scylladb/scylladb#29602	2026-05-11 16:38:53 +03:00
Nadav Har'El	f1b2b9bd52	Merge 'Register `fulltext_index` custom index type' from Dawid Pawlik This PR adds the `fulltext_index` custom index class, laying the groundwork for full-text search in ScyllaDB. It focuses on the CQL-facing layer - schema validation, option parsing, and metadata - without implementing the search backend itself. Users can now write: ```cql CREATE CUSTOM INDEX ON t(content) USING 'fulltext_index' WITH OPTIONS = {'analyzer': 'english', 'positions': 'false'}; ``` The implementation follows the same custom index pattern established by vector search: a `custom_index` subclass registered in the factory map, with no backing materialized view. This keeps the door open for a CDC-based indexing pipeline similar to the one vector search uses. As part of this work, the option validation helpers (`validate_enumerated_option`, `validate_positive_option`, `validate_factor_option`) were extracted from `vector_index.cc` into a shared header so both index types can reuse them. The `custom_index` base class also gained a virtual `index_type_name()` method, giving each subclass a self-describing name for error messages without hardcoding strings in shared code. The PR is split into three commits: 1. Extract shared validation utilities and add `index_type_name()` to `custom_index` 2. Implement `fulltext_index` with column type and option validation 3. Integration tests covering creation, validation, describe, and metadata Fixes: SCYLLADB-1517 Fixes: SCYLLADB-1510 References: SCYLLADB-1516 Closes scylladb/scylladb#29658 * github.com:scylladb/scylladb: test/cqlpy: add integration tests for `fulltext_index` index: unify custom index description index: add `fulltext_index` custom index implementation index: extract option validation helpers	2026-05-11 16:16:58 +03:00
Nadav Har'El	fcfad51284	Merge 'cql3/selection: require EXECUTE on UDA REDUCEFUNC at SELECT time' from Marcin Maliszkiewicz selection::used_functions() pushed the UDA, its SFUNC and its FINALFUNC, but never the REDUCEFUNC. The reducefunc is invoked by the distributed aggregation path in service::mapreduce_service, so a user could cause it to run server-side without holding EXECUTE on it as long as the query took the mapreduce path. Also push agg.state_reduction_function so select_statement::check_access requires EXECUTE on it too. Fixes https://scylladb.atlassian.net/browse/SCYLLADB-1756 Backport: no, it's a minor fix and UDFs are an experimental feature in Scylla Closes scylladb/scylladb#29717 * github.com:scylladb/scylladb: test/cqlpy: add test for EXECUTE permission on UDA sub-functions cql3/selection: require EXECUTE on UDA REDUCEFUNC at SELECT time	2026-05-11 16:14:38 +03:00
Botond Dénes	cf37f541a0	Merge 'sstables_loader: ensure upload directory is empty when load_and_stream returns' from Taras Veretilnyk After `load_and_stream` (e.g. via `nodetool refresh --load-and-stream`) returns success, source sstable files in the `upload/` directory may still be on disk. `mark_for_deletion()` only sets an in-memory flag; the actual file deletion runs lazily when the last `shared_sstable` reference drops. This leaves a window between API success and physical deletion in which a follow-up scan of the upload directory can detect sstables that are about to be deleted. Processing them may then fail because the files are wiped while they are being processed. The fix: force the unlink to complete before `stream()` returns, so the upload directory is in a consistent state by the time the API reports success. For tablet streaming, partially-contained sstables participate in multiple per-tablet batches; eagerly unlinking after each batch would break the next batch that still needs to read the file. A `defer_unlinking` flag on the streamer postpones the explicit unlink until after all batches complete (called once at the end of `tablet_sstable_streamer::stream()`). Vnode streaming unlinks eagerly at the end of `stream_sstable_mutations`. Fixes https://scylladb.atlassian.net/browse/SCYLLADB-1647 Backport is required, as this fixes a bug that was introduced in `517a4dc4df`. Closes scylladb/scylladb#29599 * github.com:scylladb/scylladb: sstables_loader: synchronously unlink streamed sstables before returning sstables: make sstable::unlink() idempotent	2026-05-11 14:43:46 +03:00
Asias He	0204372156	repair: Reject repair requests where start and end tokens are equal When a user calls the repair API with identical startToken and endToken values, the code creates a wrapping interval (T, T]. This causes unwrap() to split it into (-inf, T] and (T, +inf), covering the entire token ring and triggering a full repair. Reject such requests early with an error message matching Cassandra's behavior: "Start and end tokens must be different." Fixes: https://scylladb.atlassian.net/browse/CUSTOMER-358 Closes scylladb/scylladb#29821	2026-05-11 14:08:20 +03:00
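To make the wrap-around problem in the repair commit above concrete, here is a deliberately simplified model of a start-exclusive, end-inclusive token range. `unwrap` and `validate_repair_range` are illustrative names, `None` stands for an infinite bound, and none of this is the actual ScyllaDB interval code.

```python
def unwrap(start, end):
    """Split a (start, end] token range into non-wrapping pieces.

    In this simplified model a range wraps around the ring when start >= end.
    None represents -inf on the left side and +inf on the right side.
    """
    if start >= end:
        return [(None, end), (start, None)]
    return [(start, end)]


def validate_repair_range(start, end):
    """Reject the degenerate case before it can expand to the whole ring."""
    if start == end:
        raise ValueError("Start and end tokens must be different.")
    return unwrap(start, end)


# Without validation, equal tokens expand to two pieces that cover everything.
full_ring = unwrap(42, 42)
```

Here `full_ring` is `[(None, 42), (42, None)]`, that is (-inf, 42] plus (42, +inf), which is why an unvalidated request turns into a full repair.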
Botond Dénes	ad7ac62835	Merge 'Add a node_owner column (locator::host_id) to system.sstables and make it part of the partition key' from Dimitrios Symonidis Add a node_owner column (locator::host_id) to system.sstables and make it part of the partition key, so the primary key becomes PRIMARY KEY ((table_id, node_owner), generation). This is the first step toward moving the sstables registry into system_distributed: once distributed, each node's startup scan must read only the rows it owns, which requires the owning node to be part of the partition key. Partitioning by (table_id, node_owner) turns that scan into a single-partition read of exactly the local node's rows. Fixes: https://scylladb.atlassian.net/browse/SCYLLADB-1562 No need to backport this, keyspace over object storage is an experimental feature Closes scylladb/scylladb#29659 * github.com:scylladb/scylladb: db, sstables: add node_owner to sstables registry primary key db, sstables: rename sstables registry column owner to table_id	2026-05-11 14:08:19 +03:00
Marcin Maliszkiewicz	fa9d15d31a	test/cqlpy: add test for EXECUTE permission on UDA sub-functions Verify that SELECT of a UDA requires EXECUTE on its SFUNC, FINALFUNC, and REDUCEFUNC individually. If any one permission is missing, the query must be rejected at planning time (even on an empty table). The test is parameterized over the three sub-functions and uses Lua on Scylla or Java on Cassandra, so it runs on both backends. The REDUCEFUNC case is skipped on Cassandra since REDUCEFUNC is a Scylla extension. Refs SCYLLADB-1756	2026-05-11 10:23:39 +02:00
copilot-swe-agent[bot]	9e7d67612c	docs: fix typo in materialized views docs - "columns are" instead of "is" The MV Select Statement description was missing the word "columns" and used incorrect verb agreement, making the sentence grammatically broken and ambiguous. docs/cql/mv.rst: "which of the base table is included" → "which of the base table columns are included" Fixes #29662 Closes #29663 Co-authored-by: annastuchlik <37244380+annastuchlik@users.noreply.github.com>	2026-05-11 11:15:25 +03:00
Botond Dénes	eae15f4fdd	Merge 'Share timeout_config between services' from Pavel Emelyanov The timeout_config (more exactly -- updateable_timeout_config) is used by alternator/controller and transport/controller. Both create a local copy of that object by constructing one out of db::config. Also some options from this config are needed by storage_proxy, but since it doesn't have access to any timeout_config-s, it just uses db::config by getting it from the database. This PR introduces top-level sharded<updateable_timeout_config>, initializes it from db::config values and makes existing users plus storage_proxy use it where required. Motivation -- remove more replica::database::get_config() users. A side effect -- timeout_config is not duplicated by transport and alternator controllers. Components' dependencies cleanup, not backporting. Closes scylladb/scylladb#29636 * github.com:scylladb/scylladb: storage_proxy: Use shared updateable_timeout_config for CAS contention timeout alternator: Use shared updateable_timeout_config by reference cql_transport: Use shared updateable_timeout_config by reference storage_proxy: Use shared updateable_timeout_config by reference main: Introduce sharded<updateable_timeout_config> storage_proxy: Keep own updateable_timeout_config	2026-05-11 11:12:01 +03:00
Botond Dénes	9b2dfab2e5	Merge 'Don't use database.get_config() to fetch calculate_view_update_throttling_delay option' from Pavel Emelyanov This option is used in two places -- proxy and view-update-generator both need it to calculate the calculate_view_update_throttling_delay() value. This PR moves the option onto view_update_backlog top-level service, makes the calculating helper a method of that class and patches the callers to use it. This eliminates more places that abuse database as db::config accessor. Code dependencies refactoring, not backporting Closes scylladb/scylladb#29635 * github.com:scylladb/scylladb: view: Turn calculate_view_update_throttling_delay into node_update_backlog member view: Place view_flow_control_delay_limit_in_ms on node_update_backlog view: Add node_update_backlog reference to view_update_generator	2026-05-11 10:30:24 +03:00
Pavel Emelyanov	f39cbb1ec6	storage_proxy: Move maintenance_mode onto storage_proxy::config Stop reading maintenance_mode through replica::database's db::config. Add a properly typed maintenance_mode_enabled field to storage_proxy::config, populate it in main.cc from cfg->maintenance_mode() (same as messaging_service::config), and use a cached member in storage_proxy instead of db.local().get_config().maintenance_mode(). Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Closes scylladb/scylladb#29637	2026-05-11 10:11:20 +03:00
Yaniv Michael Kaul	631f1e1654	compaction: set_skip_when_empty() for validation_errors metric Add .set_skip_when_empty() to compaction_manager::validation_errors. This metric only increments when scrubbing encounters out-of-order or invalid mutation fragments in SSTables, indicating data corruption. It is almost always zero and creates unnecessary reporting overhead. AI-Assisted: yes Signed-off-by: Yaniv Kaul <yaniv.kaul@scylladb.com> Closes scylladb/scylladb#29349	2026-05-11 09:12:40 +03:00
Yaniv Michael Kaul	b8a150e22c	build: add -ftime-trace support for compilation profiling Add a --time-trace flag to configure.py and a Scylla_TIME_TRACE CMake option that enable Clang's -ftime-trace on all C++ compilations. When enabled, each .o file produces a companion .json trace that can be analyzed with ClangBuildAnalyzer or loaded in chrome://tracing to identify slow headers and costly template instantiations. This is the first step toward data-driven build speed improvements. Refs #1 Usage: configure.py: ./configure.py --time-trace --mode dev CMake: cmake -DScylla_TIME_TRACE=ON -DCMAKE_BUILD_TYPE=Dev .. Closes scylladb/scylladb#29462	2026-05-11 08:55:33 +03:00
Dmitry Kropachev	85d0011b3c	gitignore: add missing rust build artifacts rust/**/target and Cargo.lock files under rust/inc/ and rust/wasmtime_bindings/ were not ignored, nor was test/resource/wasm/rust/target/. Closes scylladb/scylladb#28943	2026-05-11 07:06:26 +03:00
Botond Dénes	3f72852d8c	Merge 'Fix missing format string placeholders across the codebase (33 bugs across 14 modules )' from Yaniv Kaul Fix 28 format string bugs plus 5 related format argument bugs across 14 modules where `{}` placeholders were missing or arguments were wrong, causing arguments to be silently dropped or misleading output from the `{fmt}` library. Inspired by https://github.com/scylladb/scylladb/pull/29143 (which fixed a single instance in `replica/table.cc`), a comprehensive audit of the entire codebase was performed to find all similar issues. - Missing `{}` placeholder (21 instances): format string simply lacks `{}` for a passed argument, e.g. `format("msg for table {}", group_id, table_id)` -- `group_id` is silently dropped - Spurious comma breaking C++ string literal concatenation (2 instances): a comma after a string literal prevents adjacent-literal concatenation, turning the continuation into a format argument instead of part of the format string - Printf-style `%s` in fmtlib context (4 instances): `%s` has no meaning in fmtlib and appears as literal text while the argument is silently ignored - Extra spurious argument (1 instance): an extraneous `t.tomb()` argument inserted between correct arguments, causing wrong values in the wrong slots - Wrong variable in error message (4 instances in `types/map.hh`): error messages for oversized map keys/values reported `map_size` (total entry count) instead of the actual `elem.first.size()` or `elem.second.size()` that exceeded the limit - Swapped argument order (1 instance in `data_dictionary/data_dictionary.cc`): format string says `"Extraneous options for {type}: {values}"` but the values and type arguments were passed in reverse order \| Module \| Bugs Fixed \| Files \| \|--------\|:---------:\|-------\| \| `replica/` \| 1 \| `table.cc` \| \| `service/` \| 4 \| `raft_group0.cc`, `storage_service.cc` \| \| `db/` \| 6 \| `heat_load_balance.cc`, `commitlog_replayer.cc`, `view_update_generator.cc`, 
`view_building_worker.cc`, `row_locking.cc` \| \| `cql3/` \| 2 \| `prepare_expr.cc`, `statement_restrictions.cc` \| \| `transport/` \| 4 \| `event_notifier.cc` \| \| `sstables/` \| 3 \| `partition_reversing_data_source.cc`, `reader.cc` \| \| `alternator/` \| 1 \| `conditions.cc` \| \| `cdc/` \| 1 \| `split.cc` \| \| `raft/` \| 1 \| `server.cc` \| \| `utils/` \| 2 \| `gcp/object_storage.cc`, `s3/client.cc` \| \| `mutation/` \| 1 \| `mutation_partition.hh` \| \| `ent/` \| 2 \| `kmip_host.cc`, `kms_host.cc` \| \| `types/` \| 4 \| `map.hh` \| \| `data_dictionary/` \| 1 \| `data_dictionary.cc` \| The `{fmt}` library's compile-time checker validates that each `{}` placeholder references a valid argument, but does not verify the reverse -- that every argument has a corresponding placeholder. Extra arguments are silently ignored at both compile time and runtime. Build verified with `dbuild ninja build/dev/scylla` -- compiles cleanly. --- Note: Commits were amended to fix the author name from "Yaniv Michael Kaul" to "Yaniv Kaul". 
Closes scylladb/scylladb#29448 * github.com:scylladb/scylladb: data_dictionary: fix swapped arguments in extraneous options error types: fix wrong variable in map key/value size error messages ent: fix missing format placeholders in encryption error/log messages mutation: fix spurious argument in shadowable_tombstone formatter utils: fix missing format placeholders in object storage log messages raft: fix missing format placeholder in server ostream operator cdc: fix missing format placeholder in error message alternator: fix missing format placeholder in error message sstables: fix missing format placeholders in error messages transport: fix printf-style format specifiers in fmtlib log calls cql3: fix missing format placeholders in error messages db: fix missing format placeholders in log and error messages service: fix missing format placeholders in log messages replica: fix missing format placeholder in cleanup log message	2026-05-11 07:04:42 +03:00
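The bug classes listed in the merge above are easy to reproduce by analogy in Python, because `str.format` also ignores surplus positional arguments and treats `%s` as literal text. The sketch below is only an analogy and does not use the `{fmt}` library; `render` and `count_placeholders` are hypothetical helpers, and the message strings are made up for illustration.

```python
import string


def render(fmt, *args):
    """Format like a logger call would: first argument is the format string."""
    return fmt.format(*args)


def count_placeholders(fmt):
    """Count replacement fields so they can be compared with the argument count."""
    return sum(1 for _, field, _, _ in string.Formatter().parse(fmt) if field is not None)


# Missing placeholder: two arguments, one slot, the surplus argument vanishes.
missing = render("msg for table {}", "group-1", "table-1")

# Printf-style specifier: "%s" stays literal and the argument is ignored.
printf_style = render("%s called", "on_create_keyspace")

# Spurious comma: the continuation literal becomes the first format argument.
intended = render("node {} rejected: " "reason {}", "n1", "bad version")
broken = render("node {} rejected: ", "reason {}", "n1", "bad version")

# A simple audit: flag calls whose argument count differs from the slot count.
suspicious = count_placeholders("msg for table {}") != 2
```

In this run `missing` is `"msg for table group-1"`, `printf_style` is `"%s called"`, and `broken` is `"node reason {} rejected: "`, none of which raises an error. A placeholder-versus-argument count such as `count_placeholders` is one cheap way to audit for the reverse direction that the compile-time checker does not cover.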
Yaron Kaikov	5694c93c12	build: add collect-dist target to organize build artifacts Build artifacts are currently scattered across build/dist/$mode/redhat/, tools/python3/build/, tools/cqlsh/build/, etc. with unpredictable names. Add a new 'collect-dist' ninja target that gathers all distributable artifacts into a well-known structure: build/$mode/dist/rpm/ -- all binary RPMs (no SRPMs) build/$mode/dist/deb/ -- all .deb packages build/$mode/dist/tar/ -- relocatable tarballs (already here) The collection is done via a reusable 'collect_pkgs' ninja rule defined directly in configure.py, which knows all the source paths. No external script is needed. Fixes: SCYLLADB-75 Closes scylladb/scylladb#29475	2026-05-11 06:54:29 +03:00
Michael Litvak	274024a76b	configure.py: update compile_commands.json if stale configure.py creates compile_commands.json in the root directory as a symbolic link to the file in one of the build directories. If the file already exists, it does nothing. However, the link may exist while its target does not. For example, if the build directory is removed and the project is then built in a different mode, the file remains as a stale symbolic link. To address this, when the file exists, also check whether it is a valid symbolic link. If not, recreate it with a valid target. Closes scylladb/scylladb#29680	2026-05-10 22:17:16 +03:00
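The stale-link check described above can be sketched with the standard library alone. This is an assumption about the shape of the fix, not the code in configure.py; `ensure_symlink` is a hypothetical helper name.

```python
import os


def ensure_symlink(link_path, target):
    """Create link_path -> target unless link_path already resolves to something.

    os.path.exists() follows symlinks, so it is False for a dangling link,
    while os.path.islink() inspects the link itself. Returns True when the
    link was (re)created.
    """
    if os.path.exists(link_path):
        return False  # regular file or valid link: leave it alone
    if os.path.islink(link_path):
        os.unlink(link_path)  # dangling link left over from a removed build dir
    os.symlink(target, link_path)
    return True
```

A call such as `ensure_symlink("compile_commands.json", "build/dev/compile_commands.json")` would leave a working link untouched and replace one whose target has disappeared; the paths here are examples only.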
Piotr Szymaniak	459c1dc32f	test/alternator: stop avoiding tablets in Streams tests Alternator Streams now supports tablets, so stop skipping the TTL Streams test in tablet mode and stop forcing vnodes in the Streams audit test. Refs SCYLLADB-463 Closes scylladb/scylladb#29697	2026-05-10 22:13:15 +03:00
Nadav Har'El	df8c9b17b8	Merge 'alternator: Graduate Alternator Streams from experimental' from Piotr Szymaniak As a final step for https://scylladb.atlassian.net/browse/SCYLLADB-461 we need to graduate Alternator Streams from experimental. So let's remove `--experimental-features=alternator-streams` and map the obsolete config string to `UNUSED` for backward compatibility. Also, remove the related gating of the feature. Finally, stop providing the config flag in test configs. Fixes SCYLLADB-1680 Fixes #16367 The documentation work, tracked by https://scylladb.atlassian.net/browse/SCYLLADB-462, still remains. This PR needs to hit 2026.2, so (only) if it branches before the PR is merged to `master`, we'd need to backport. Closes scylladb/scylladb#29604 * github.com:scylladb/scylladb: test: Stop providing alternator-streams experimental flag alternator: Graduate Alternator Streams from experimental	2026-05-10 22:10:03 +03:00
Nadav Har'El	34136d3bc2	Merge 'vector_search: test: migrate CQL tests for vector search from C++/Boost to pytest' from Karol Nowacki Migrate vector search (ANN ordered select query) CQL tests from C++/Boost suite to pytest. This migration includes: - New pytest tests in `test/cqlpy/test_vector_search_with_vector_store_mock.py` - VectorStoreMock server as pytest fixture to simulate vector store responses The benefits of this migration are: - Extended test coverage to verify CQL protocol serialization and driver - Reduced overall test time (no compilation required for pytest) Fixes SCYLLADB-695 No backport needed as this is a refactoring. Closes scylladb/scylladb#29593 * github.com:scylladb/scylladb: vector_search: test: migrate paging warnings tests to Python vector_search: test: migrate local_vector_index to Python vector_search: test: migrate vector_index_with_additional_filtering_column to Python vector_search: test: migrate cql_error_contains_http_error_description to Python vector_search: test: migrate pk in restriction test to Python	2026-05-10 22:09:17 +03:00
Nadav Har'El	d4aa528834	Merge 'load_balancer: fix tablet allocator dropped table' from Ferenc Szili - Handle dropped tables gracefully in the tablet load balancer's `get_schema_and_rs()` instead of aborting with `on_internal_error` - The load balancer operates on a token metadata snapshot but accesses the live schema for table lookups. A DROP TABLE applied by another fiber between coroutine yield points can remove a table from the live schema while it still exists in the snapshot, causing an abort. `get_schema_and_rs()` now returns `std::optional` and logs a warning in debug log level instead of aborting when a table is missing. All callers skip dropped tables: - `make_sizing_plan`: skips to next table - `make_resize_plan`: skips to next table (merge suppression is moot) - `check_constraints`: returns `skip_info{}` with empty viable targets - `get_rs`: returns `nullptr`, checked by `check_constraints` The call chain is: `make_plan` → `make_internode_plan` → `check_constraints` → `get_rs` → `get_schema_and_rs`. The `make_internode_plan` coroutine has multiple `co_await` yield points (`maybe_yield`, `pick_candidate`) between building the candidate tablet list and checking replication constraints. A DROP TABLE schema mutation applied during any of these yields removes the table from `_db.get_tables_metadata()` while the candidate list still references it. Added `test_load_balancing_with_dropped_table` which simulates the race by capturing a token metadata snapshot, dropping the table, then calling `balance_tablets` with the stale snapshot. Fixes: SCYLLADB-1664 This fix needs to be backported to versions: 2025.4, 2026.1 Closes scylladb/scylladb#29585 * github.com:scylladb/scylladb: test: verify load balancer handles dropped tables gracefully tablet_allocator: handle dropped tables gracefully in get_schema_and_rs	2026-05-10 22:07:51 +03:00
Nadav Har'El	63927e07ea	Merge 'alternator/streams: keep disabled streams usable and purge on re-enable' from Piotr Szymaniak When an Alternator stream is disabled, the data should continue to be accessible so that consumers can finish reading. When the stream is later re-enabled, a new StreamArn is produced and only then the old data is purged. On disable, the existing CDC options (including preimage and postimage) are preserved so that DescribeStream can still report StreamViewType. All stream APIs continue to work on the disabled stream, with all shards reported as closed (EndingSequenceNumber set). No new CDC records are written; existing data expires via TTL after 24 hours. On re-enable, the old CDC log table is dropped as a separate Raft group0 schema change and a fresh one is created with a new UUID, giving a new StreamArn. This is Alternator-specific — CQL CDC keeps reusing the log table. Re-enabling is the only way to immediately purge old stream data. Old stream data is removed immediately upon re-enable (a discrepancy with DynamoDB, which keeps it readable for 24 hours through the old StreamArn). Tests updated to cover the new disable and re-enable behavior. Fixes #7239 Fixes SCYLLADB-523 Closes scylladb/scylladb#29413 * github.com:scylladb/scylladb: alternator/streams: remove dead next_iter in get_records test/alternator: fix stream wait timeouts to use wall-clock time docs/alternator: document stream disable/re-enable behavior alternator/streams: keep disabled streams usable and purge on re-enable	2026-05-10 22:04:35 +03:00
Nadav Har'El	e277f747bd	Merge 'Make collection unfreezing more efficient' from Botond Dénes Introduce `read_from_collection_cell_view()` which reads a `collection_mutation` directly from the IDL representation of a collection (`ser::collection_cell_view`). This cuts down the number of allocations required drastically compared to the current method of: IDL -> collection_mutation_description -> collection_mutation Reduces the number of allocations to unfreeze a collection from O(collection_cell_count) to O(1) (actually, due to buffer fragmentation, it is O(collection_size)). The new method is used when unfreezing frozen mutations and frozen mutation fragments. This is on the hot path: all writes with collections benefit. Add a `--collection` flag to `perf-simple-query` to allow measuring the performance improvement of this PR. With `dbuild -it -- build/release/scylla perf-simple-query --collection=16 -c1 -m2G --default-log-level=error --write` the number of allocations drops from ~123 to 102, which is a significant amount of allocations shaved off. 
Refs: https://github.com/scylladb/scylladb/issues/3602 (solves one use-case out of the many listed therein) Fixes: SCYLLADB-1046 Fixes: SCYLLADB-1077 Backport: this is an optimization so normally not a backport candidate, but we may have to backport to relieve certain customers Closes scylladb/scylladb#29033 * github.com:scylladb/scylladb: test/perf/perf_simple_query: add --collection=N test/boost/frozen_mutation_test: add freeze/unfreeze test for large collections mutation/mutation_partition_view: use read_from_collection_cell_view() to read collections mutation/collection_mutation: introduce read_from_collection_cell_view() mutation/atomic_cell: atomic_cell_type: add write() and serialized_size() mutation/collection_mutation: generalize serialize_collection_mutation mutation/mutation_partition_view: avoid copying collection mutation/mutation_partition_view: accept collection_mutation in the consume API partition_builder: add move variant of accept_*_cell() collection overloads	2026-05-10 20:39:08 +03:00
Yaniv Kaul	a6cf45f9e2	data_dictionary: fix swapped arguments in extraneous options error The format string says "Extraneous options for {type}: {values}" but the arguments were passed in the wrong order (values first, type second), producing misleading error messages like "Extraneous options for bucket,endpoint: S3" instead of "Extraneous options for S3: bucket,endpoint". Signed-off-by: Yaniv Kaul <yaniv.kaul@scylladb.com>	2026-05-10 17:51:20 +03:00
Yaniv Kaul	a13da94308	types: fix wrong variable in map key/value size error messages Four error messages for oversized map keys/values reported map_size (the total number of entries) instead of the actual key or value size that exceeded the limit. The condition checks elem.first.size() or elem.second.size(), but the error message printed map_size. This affects both the bytes and managed_bytes serialization overloads. Signed-off-by: Yaniv Kaul <yaniv.kaul@scylladb.com>	2026-05-10 17:51:20 +03:00
Yaniv Kaul	bf1d59ad95	ent: fix missing format placeholders in encryption error/log messages Fix two format string bugs: - kmip_host.cc: cmd_in was passed as an argument to a trace log but had no {} placeholder, so the command was silently dropped. - kms_host.cc: the XML node name (what) was passed to the error message but had no {} placeholder, so the error never showed which XML node was missing. Signed-off-by: Yaniv Kaul <yaniv.kaul@scylladb.com>	2026-05-10 17:51:20 +03:00
Yaniv Kaul	a76774f8f9	mutation: fix spurious argument in shadowable_tombstone formatter The formatter for shadowable_tombstone had a spurious t.tomb() argument between the timestamp and deletion_time arguments. This caused t.tomb() (the whole tombstone) to be formatted into the deletion_time={} slot, while the actual deletion_time count was silently dropped. Remove the extra argument. Signed-off-by: Yaniv Kaul <yaniv.kaul@scylladb.com>	2026-05-10 17:51:19 +03:00
Yaniv Kaul	700b0b4c28	utils: fix missing format placeholders in object storage log messages Fix two format string bugs: - gcp/object_storage.cc: _session_path was passed but the format string had empty parentheses () instead of ({}), so the session path was silently dropped from the debug output. - s3/client.cc: part_number was passed as an argument but had no {} placeholder. The upload_id ended up in the etag slot and was silently dropped. Add {} for all three values. Signed-off-by: Yaniv Kaul <yaniv.kaul@scylladb.com>	2026-05-10 17:51:19 +03:00
Yaniv Kaul	358f6fba9f	raft: fix missing format placeholder in server ostream operator The FSM state was passed as an argument but the format string had empty parentheses () instead of ({}), causing the FSM state to be silently dropped from the output. Signed-off-by: Yaniv Kaul <yaniv.kaul@scylladb.com>	2026-05-10 17:51:19 +03:00
Yaniv Kaul	605455f82d	cdc: fix missing format placeholder in error message The collection type name was passed as an argument but the format string only had a trailing colon without a {} placeholder, so the type name was silently dropped from the error message. Signed-off-by: Yaniv Kaul <yaniv.kaul@scylladb.com>	2026-05-10 17:51:19 +03:00
Yaniv Kaul	0c88ff6a40	alternator: fix missing format placeholder in error message The values count was passed as an argument but had no {} placeholder, so it was silently dropped. The analogous BETWEEN check on the line above correctly uses {} -- apply the same pattern here. Signed-off-by: Yaniv Kaul <yaniv.kaul@scylladb.com>	2026-05-10 17:51:19 +03:00
Yaniv Kaul	e29f59347b	sstables: fix missing format placeholders in error messages Fix three format string bugs: - partition_reversing_data_source.cc: _row_start was passed as an argument but had no {} placeholder in the invariant error message. Add {} for all three values to show the full diagnostic. - reader.cc: two "Invalid boundary type" error messages passed the type value as an argument but had no {} placeholder, so the actual invalid type was never shown. Signed-off-by: Yaniv Kaul <yaniv.kaul@scylladb.com>	2026-05-10 17:51:19 +03:00
Yaniv Kaul	413497c9ce	transport: fix printf-style format specifiers in fmtlib log calls Four logger calls used %s (printf-style) instead of {} (fmtlib-style), causing __func__ to be silently ignored and the literal text "%s" to appear in the log output. The same file already uses {} correctly in the on_create_function and on_create_aggregate handlers. Signed-off-by: Yaniv Kaul <yaniv.kaul@scylladb.com>	2026-05-10 17:51:19 +03:00
Yaniv Kaul	cfb568b5b5	cql3: fix missing format placeholders in error messages Fix two format string bugs where arguments were silently dropped: - prepare_expr.cc: the bad argument to count() was passed but had no {} placeholder, so users never saw what was actually passed. - statement_restrictions.cc: the unsupported multi-column relation was passed but the trailing colon had no {} placeholder. Signed-off-by: Yaniv Kaul <yaniv.kaul@scylladb.com>	2026-05-10 17:51:19 +03:00
Yaniv Kaul	fdebed5746	db: fix missing format placeholders in log and error messages Fix six format string bugs where arguments were silently dropped: - heat_load_balance.cc: pp value was passed but had no {} placeholder. - commitlog_replayer.cc: column_family_id was passed but table= had no {} placeholder. - view_update_generator.cc: _sstables_with_tables.size() was passed but had no {} placeholder. - view_building_worker.cc: exception pointer was passed but the trailing colon had no {} placeholder. - row_locking.cc: partition key and clustering key were passed in error messages but had no {} placeholders. Signed-off-by: Yaniv Kaul <yaniv.kaul@scylladb.com>	2026-05-10 17:49:50 +03:00
Yaniv Kaul	4ee81f9b32	service: fix missing format placeholders in log messages Fix four format string bugs: - raft_group0.cc: the exception from sleep_and_abort was passed as an argument but had no {} placeholder, so it was silently dropped. - storage_service.cc: loading topology trace was missing a placeholder for the cleanup field (9 args but only 8 placeholders). - storage_service.cc: two join-rejection warnings had a spurious comma after the first string literal, breaking C++ string concatenation. This caused the continuation string to be treated as a separate format argument instead of being part of the format string, and params.host_id was silently dropped. Signed-off-by: Yaniv Kaul <yaniv.kaul@scylladb.com>	2026-05-10 17:49:50 +03:00
Yaniv Kaul	f75248a734	replica: fix missing format placeholder in cleanup log message The log message for tablet cleanup invalidation was missing a {} placeholder for the table name (cf_name), causing it to be silently dropped from the output. Add {}.{} to show both keyspace and table name, consistent with the convention used elsewhere in the file. Signed-off-by: Yaniv Kaul <yaniv.kaul@scylladb.com>	2026-05-10 17:49:50 +03:00
Piotr Dulikowski	bc482bfdea	database: add missing co_await on lock in create_local_system_table The function database::create_local_system_table calls get_tables_metadata().hold_write_lock(), but does not co_await the returned future. Effectively, this code does not guarantee mutual exclusion because it does not wait for the lock to be acquired and does not guarantee that the lock is held long enough. Fix this by adding the co_await that was missing. Found by manual inspection. This code is not known to have caused any problems so far, but it's clearly wrong - hence the fix. Closes scylladb/scylladb#29806	2026-05-10 15:36:21 +03:00
Avi Kivity	5a887362e3	Merge 'Remove legacy tables creation code' from Gleb Natapov Drop the creation code for the `service_levels` and `cdc_generation_descriptions_v2` tables since it is no longer needed. Old clusters will still have these tables because they were created earlier. The series also contains a small improvement around group0 creation. No backport needed since this removes functionality. Closes scylladb/scylladb#29482 * github.com:scylladb/scylladb: db/system_distributed_keyspace: remove system_distributed_everywhere since it is unused db/system_distributed_keyspace: drop CDC_TOPOLOGY_DESCRIPTION and CDC_GENERATIONS_V2 db/system_distributed_keyspace: remove unused code db/system_distributed_keyspace: drop old cdc_generation_descriptions_v2 table db/system_distributed_keyspace: drop old service_levels table fix indent after the previous patch group0: call setup_group0 only when needed	2026-05-10 14:46:21 +03:00
Botond Dénes	67226e6f1b	scylla-gdb.py: interval_printer: update for new layout interval switched from std::optional<> to union + bools for bound storage in `42d7ae1082`. Update the printer to work with the new layout. Keep the code backwards compatible: 2025.1 still uses optionals and is still supported. Closes scylladb/scylladb#29738	2026-05-10 14:28:24 +03:00
Avi Kivity	ece4e0738f	Merge 'docs/cql: fix syntax errors in CQL examples' from Yaniv Kaul Fix 4 genuine CQL syntax errors in documentation examples, found by automated extraction and execution of doc code blocks against a live ScyllaDB instance. - insert.rst: `USING TTL 86400 IF NOT EXISTS` → `IF NOT EXISTS USING TTL 86400` (wrong clause order produces syntax error) - ddl.rst: Missing opening quote in ALTER KEYSPACE example (`dc2'` → `'dc2'`) - ddl.rst: Hyphenated column names need double-quoting; also fix PRIMARY KEY referencing non-existent `customer_id` instead of `cust_id` - types.rst: UDT `address` contains nested collections, so it must be `frozen<address>` when used as a column type Built a CQL extractor that parses `.. code-block:: cql` blocks from RST docs, then executed all 194 extracted statements against ScyllaDB 2026.2.0-rc0. These 4 are confirmed syntax/semantic errors in the documentation. Closes scylladb/scylladb#29765 * github.com:scylladb/scylladb: test/cqlpy: add tests for hyphenated column names docs/cql: fix UDT example to use frozen<address> docs/cql: fix CREATE TABLE example with hyphenated column names docs/cql: fix missing opening quote in ALTER KEYSPACE example docs/cql: fix INSERT example clause order (IF NOT EXISTS before USING)	2026-05-10 14:23:30 +03:00
Anna Stuchlik	61d1cbfd20	doc: add the upgrade guide from 2026.1 to 2026.2 This commit adds the upgrade guide, including the updated metrics. Fixes https://scylladb.atlassian.net/browse/SCYLLADB-1746 Fixes https://scylladb.atlassian.net/browse/SCYLLADB-1765 Closes scylladb/scylladb#29694	2026-05-10 14:20:09 +03:00
Benny Halevy	a797c9f10b	table: delete sstables atomically per compaction group during truncate Prepare for truncate of tables on object storage, where we want to limit the atomic deletion batches to produce smaller batch mutations. This is safe since truncate does not really need to delete all sstables in the table atomically — it is already non-atomic since each node and each shard deletes its own sstables. The atomic deletion mechanism is used for convenience. Previously, discard_sstables collected all sstables from all compaction groups on the shard into a single vector and issued one atomic delete for all of them. Change to track removed sstables per compaction group and issue separate atomic deletes per group using coroutine::parallel_for_each, allowing concurrent deletion across groups. Closes scylladb/scylladb#29789	2026-05-10 14:08:10 +03:00
Botond Dénes	d0813769ec	sstables/trie: add preemption points in trie_writer The BTI partition index trie writer flushes all buffered nodes at the end of each SSTable via complete_until_depth(0), called from bti_partition_index_writer_impl::finish(). This is a tight synchronous loop that writes trie nodes through file_writer::write(), which uses a buffered output_stream: individual writes that fit in the buffer are plain memcpy operations returning a ready future, so .get() never yields. As a result the reactor can stall for several milliseconds on large SSTables. The entire call chain runs inside seastar::async() (via sstable::write_components()), so seastar::thread::maybe_yield() is safe to call here. Add it at the top of both tight loops: - complete_until_depth(), which iterates over trie depth - lay_out_children(), which iterates over child branches per node Fixes SCYLLADB-1885 Closes scylladb/scylladb#29798	2026-05-10 11:30:59 +03:00
Marcin Maliszkiewicz	fb55bef0ac	cql3/selection: require EXECUTE on UDA REDUCEFUNC at SELECT time selection::used_functions() pushed the UDA, its SFUNC and its FINALFUNC, but never the REDUCEFUNC. The reducefunc is invoked by the distributed aggregation path in service::mapreduce_service, so a user could cause it to run server-side without holding EXECUTE on it as long as the query took the mapreduce path. Also push agg.state_reduction_function so select_statement::check_access requires EXECUTE on it too. Fixes SCYLLADB-1756	2026-05-08 16:37:52 +02:00
Botond Dénes	8ca0f2dd54	Merge 'raft: do not throw commit_status_unknown from add_entry when possible' from Patryk Jędrzejczak Previously, when a snapshot load subsumed a committed entry before apply() was called locally, add_entry would throw commit_status_unknown -- even though the entry was known to be committed and included in the snapshot. This was overly pessimistic. Normal state machine implementations shouldn't care whether an entry was applied via apply() or via a snapshot load. Unnecessary commit_status_unknown caused flakiness of test_frequent_snapshotting and unnecessary retries in group0. Raft groups from strongly consistent tables couldn't hit unnecessary commit_status_unknown's because they use wait_type::committed and `enable_forwarding == false`. Three sites are changed: 1. wait_for_entry (truncation case): the snapshot-term match optimization that proved the entry was committed now applies to both wait_type::committed and wait_type::applied, not just committed. 2. wait_for_entry (snapshot covers entry): instead of throwing commit_status_unknown when the snapshot index >= entry index, return successfully. The entry's effects are included in the state machine's state via the snapshot. 3. drop_waiters: when called from load_snapshot, pass the snapshot term. Waiters whose term matches the snapshot term are resolved successfully (set_value) instead of failing with commit_status_unknown, since the Log Matching Property guarantees they were committed and included. This deflakes test_frequent_snapshotting: the test uses aggressive snapshot settings (snapshot_threshold=1) causing wait_for_entry to occasionally find the snapshot covering its entry. Previously this threw commit_status_unknown, failing the test. With this fix, wait_for_entry returns success. Note that apply() is never actually skipped in this test -- the leader always applies entries locally before taking a snapshot. 
The nemesis test is updated to handle the new behavior: call() detects when add_entry succeeded but the output channel was not written (apply() skipped locally) and returns apply_skipped instead of hanging. The linearizability checker in basic_generator_test counts skipped applies separately from failures. basic_generator_test exercises this path: skipped_applies > 0 occurs in some runs. Fixes: SCYLLADB-1264 No backport: the changes are quite risky and the test being fixed fails very rarely. Closes scylladb/scylladb#29685 * github.com:scylladb/scylladb: test/raft: fix duplicate check in connected::operator() test/raft: add tests for add_entry snapshot interactions raft: do not throw commit_status_unknown from add_entry when possible raft: change drop_waiters parameter from index to snapshot descriptor raft: server: fix a typo	2026-05-08 16:39:52 +03:00
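
The format-string fixes listed above (413497c9ce, cfb568b5b5, fdebed5746, 4ee81f9b32, f75248a734) share one root cause: the number of `{}` placeholders did not match the number of arguments, so the extra arguments were silently dropped. A minimal sketch of that mismatch follows. `count_placeholders` and `args_match` are hypothetical helpers written for illustration, not ScyllaDB or fmtlib APIs, and the message strings are invented; the sketch only recognizes bare `{}` placeholders, not ones with format specs.

```cpp
#include <cstddef>

// Hypothetical helper (not part of ScyllaDB or fmtlib): counts bare "{}"
// placeholders in a format string. Escaped braces "{{" and "}}" are skipped.
constexpr std::size_t count_placeholders(const char* fmt) {
    std::size_t n = 0;
    for (std::size_t i = 0; fmt[i] != '\0'; ++i) {
        if (fmt[i] == '{' && fmt[i + 1] == '{') {
            ++i;  // escaped "{{": a literal brace, not a placeholder
        } else if (fmt[i] == '}' && fmt[i + 1] == '}') {
            ++i;  // escaped "}}"
        } else if (fmt[i] == '{' && fmt[i + 1] == '}') {
            ++n;  // one "{}" placeholder
            ++i;
        }
    }
    return n;
}

// Hypothetical helper: true when every argument has a placeholder.
template <typename... Args>
constexpr bool args_match(const char* fmt, const Args&...) {
    return count_placeholders(fmt) == sizeof...(Args);
}

// printf-style "%s" is not a "{}" placeholder, so the argument has no slot.
static_assert(!args_match("%s: keyspace created", "fn"), "mismatch expected");
static_assert(args_match("{}: keyspace created", "fn"), "match expected");

// Spurious comma after the first literal: the second literal becomes an
// argument instead of being concatenated into the format string.
static_assert(!args_match("rejected join from {}, ", "host_id {}", "10.0.0.1", 42),
              "mismatch expected");
// Without the comma, adjacent literals concatenate and both values have slots.
static_assert(args_match("rejected join from {}, " "host_id {}", "10.0.0.1", 42),
              "match expected");
```

Because the helpers are `constexpr`, the checks run at compile time; a mismatch of this kind would fail the build rather than surface as a truncated log line.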

53740 Commits