scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-05-30 19:46:48 +00:00

Author	SHA1	Message	Date
Wojciech Mitros	c0ea98f922	strong_consistency: classify reads by consistency level Introduce a read_type enum (linearizable vs non_linearizable) and transform the existing "validate" function into a "parse" method: instead of only checking whether the consistency level is one of the accepted ones, it now also returns the corresponding read type for strong consistency. The "parse" function maps CQL consistency levels to the following read types: - CL=(LOCAL_)QUORUM -> linearizable (this is the default CL) - CL=(LOCAL_)ONE -> non_linearizable - all others -> throw The classification is performed in the CQL layer (select_statement) to keep the coordinator free of CL concepts.	2026-05-23 11:35:37 +02:00
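The mapping described in c0ea98f922 can be illustrated with a small Python sketch. It is not the C++ code from the commit: `ReadType`, `parse_read_consistency` and the choice of `ValueError` are hypothetical names and behaviour picked for illustration, and only the CL-to-read-type table comes from the commit message.

```python
from enum import Enum


class ReadType(Enum):
    LINEARIZABLE = "linearizable"
    NON_LINEARIZABLE = "non_linearizable"


# Mapping taken from the commit message; every name here is illustrative.
_CL_TO_READ_TYPE = {
    "QUORUM": ReadType.LINEARIZABLE,
    "LOCAL_QUORUM": ReadType.LINEARIZABLE,
    "ONE": ReadType.NON_LINEARIZABLE,
    "LOCAL_ONE": ReadType.NON_LINEARIZABLE,
}


def parse_read_consistency(cl: str) -> ReadType:
    """Return the read type for a CQL consistency level, or raise if unsupported."""
    try:
        return _CL_TO_READ_TYPE[cl.upper()]
    except KeyError:
        # "all others -> throw" in the commit message.
        raise ValueError(
            f"consistency level {cl} is not supported for strongly consistent reads"
        ) from None
```

Returning the read type from the same function that validates the level means the caller cannot forget one of the two steps.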
Wojciech Mitros	1f91524547	strong_consistency: add begin_read() to raft_server Add begin_read() method to raft_server that checks leadership for read operations. Unlike begin_mutate(), it does not need to compute a timestamp or interact with leader_info. It simply checks current_leader() and returns one of three dispositions: - ok: this node is the leader, proceed with read_barrier() locally - raft::not_a_leader: redirect to the indicated leader - need_wait_for_leader: leader unknown, caller must wait and retry This will be used by the read forwarding logic in subsequent commits.	2026-05-23 11:35:36 +02:00
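The three dispositions listed in 1f91524547 could be modelled roughly as follows. This is a hedged Python sketch, not the raft_server code: the `Ok`, `NotALeader` and `NeedWaitForLeader` classes and the `begin_read(my_id, current_leader)` signature are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class Ok:
    """This node is the leader; the caller may run read_barrier() locally."""


@dataclass(frozen=True)
class NotALeader:
    """Another node is the leader; the caller should redirect to it."""
    leader: str


@dataclass(frozen=True)
class NeedWaitForLeader:
    """No leader is known yet; the caller must wait and retry."""


def begin_read(
    my_id: str, current_leader: Optional[str]
) -> Union[Ok, NotALeader, NeedWaitForLeader]:
    # Only the current leader is inspected; no timestamp is computed,
    # matching the contrast with begin_mutate() drawn in the commit message.
    if current_leader is None:
        return NeedWaitForLeader()
    if current_leader == my_id:
        return Ok()
    return NotALeader(leader=current_leader)
```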
Andrzej Jackowski	f8156702de	tree: add missing -present to copyright headers ~2076 files used "Copyright (C) YYYY-present ScyllaDB" while ~88 files used "Copyright (C) YYYY ScyllaDB". This inconsistency leads to unnecessary code review discussions and gradual spread of the less common format. Standardize all ScyllaDB copyright headers to use -present. Fixes SCYLLADB-1984 Closes scylladb/scylladb#29876	2026-05-21 10:57:42 +02:00
Wojciech Mitros	13c043903d	strong_consistency: cache leader location for non-replica nodes When a non-replica node handles a strongly consistent write, it must forward the request to a replica. If the closest replica is not the leader, the request gets redirected again, causing an extra roundtrip. Add a leader location cache in groups_manager, keyed by raft group_id. After a write request is forwarded, the CQL transport layer records the final node as the leader in the cache. Subsequent write requests from the same node for the same group are forwarded directly to the cached leader, eliminating the extra roundtrip. The cache is only used for writes. Reads can be served by any replica, so they skip the cache and use proximity-based routing instead. Cache entries are validated at use time: if the cached leader is no longer a replica (e.g. after tablet migration), the entry is evicted and the normal closest-replica path is taken. This prevents a scenario where two nodes keep redirecting to each other because both think that the other is the leader but actually both are non-replicas; such a loop is broken as soon as the tablet maps are updated. On token_metadata updates, entries for groups that no longer exist (e.g. table dropped, tablet merged) are evicted. Entries for groups that still exist are kept; use-time validation handles staleness. An on_node_resolved callback is propagated through the redirect/bounce path so the transport layer can update the cache generically without coupling to the strong-consistency coordinator. The coordinator creates the callback only for writes (capturing the groups_manager and group_id) and attaches it to the bounce message; the transport layer invokes it once the final node is known, keeping the forwarding infrastructure subsystem-agnostic. We also add a test which verifies that after the initial redirect, subsequent requests to the same node avoid the extra redirect and are forwarded directly to the leader. 
Fixes: SCYLLADB-1064 Closes scylladb/scylladb#29392	2026-05-21 10:32:56 +02:00
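The cache behaviour described in 13c043903d (record on resolve, validate at use time, evict entries for vanished groups, skip the cache for reads) can be sketched in Python. The class and function names below are hypothetical and the real implementation lives in groups_manager in C++; this only mirrors the rules stated in the commit message.

```python
class LeaderCache:
    """Maps a raft group id to the node last observed as its leader."""

    def __init__(self):
        self._leaders = {}

    def record(self, group_id, node):
        # Called once the final node of a forwarded write is known.
        self._leaders[group_id] = node

    def lookup(self, group_id, replicas):
        """Return the cached leader, evicting it if it is no longer a replica."""
        leader = self._leaders.get(group_id)
        if leader is None:
            return None
        if leader not in replicas:
            del self._leaders[group_id]
            return None
        return leader

    def retain_groups(self, live_groups):
        # Topology update: drop entries for groups that no longer exist.
        for gid in [g for g in self._leaders if g not in live_groups]:
            del self._leaders[gid]


def pick_target(cache, group_id, replicas_by_proximity, is_write):
    """Writes prefer the cached leader; reads always use the closest replica."""
    if is_write:
        cached = cache.lookup(group_id, replicas_by_proximity)
        if cached is not None:
            return cached
    return replicas_by_proximity[0]
```

Validating at lookup time, rather than eagerly on every topology change, keeps stale entries harmless: the worst case is one fall-back to the closest replica.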
Gleb Natapov	cc034f84c5	schema: ensure committed_by_group0 is set for all non-system tables on boot Tables created before the GROUP0_SCHEMA_VERSIONING feature was enabled have committed_by_group0 = null in system_schema.scylla_tables. This causes maybe_delete_schema_version() to delete their version cell, forcing the legacy hash-based schema version computation path. Add ensure_committed_by_group0() which runs on boot and fixes up any non-system tables where committed_by_group0 is not true (null or false): 1. Queries system_schema.scylla_tables for rows where committed_by_group0 is null or false, skipping system keyspaces (system, system_schema). 2. Takes a group0 guard 3. Re-checks after the raft barrier in case another node already fixed it. 4. For each table needing fixup, creates a mutation writing the version cell (from the in-memory schema). The committed_by_group0 = true flag is stamped by add_committed_by_group0_flag() inside announce(). 5. Announces via raft group0. 6. Retries with a small random delay on group0_concurrent_modification. On other nodes, schema_applier will detect these as "altered" tables (scylla_tables mutation changed), but since the actual table definition is unchanged, update_column_family is effectively a no-op. This is a prerequisite for eventually removing the legacy hash-based schema versioning code path. Closes scylladb/scylladb#29911	2026-05-21 10:22:07 +02:00
Patryk Jędrzejczak	cbadc3d675	test: fix flaky test_raft_snapshot_truncation by waiting for async log truncation Snapshot creation and raft log truncation happen asynchronously in the IO fiber after a schema change completes. The test was querying system.raft immediately after the schema change returned, racing with the IO fiber's store_snapshot_descriptor call. Replace immediate assertions with wait_for polling loops: - log_size == 0: wait for log truncation after drop keyspace - new_snap_id != original_snap_id: wait for new snapshot to be persisted Fixes: SCYLLADB-2120 Closes scylladb/scylladb#29967	2026-05-21 10:50:00 +03:00
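The fix in cbadc3d675 replaces one-shot assertions with polling. A minimal, self-contained Python sketch of such a loop is shown below; it is not the helper used by the ScyllaDB test suite, and the `wait_for(predicate, deadline, period)` signature here is an assumption made for illustration.

```python
import asyncio
import time


async def wait_for(predicate, deadline, period=0.1):
    """Poll an async predicate until it returns a non-None value.

    `deadline` is an absolute time.monotonic() value. The predicate is
    always evaluated at least once, even if the deadline has passed.
    """
    while True:
        result = await predicate()
        if result is not None:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError("condition not met before the deadline")
        await asyncio.sleep(period)
```

A test would pass a predicate that queries system.raft and returns a value only once the log size is 0 or the snapshot id has changed, so an assertion is never evaluated against a state the IO fiber has not yet persisted.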
Artsiom Mishuta	2259307c2e	test.py: remove redundant pytest.mark.asyncio decorators Fixes: SCYLLADB-1935	2026-05-21 10:36:47 +03:00
Botond Dénes	f8ac8540bd	Merge 'logstor: compare records by timestamp and segment sequence number' from Michael Litvak Add the record timestamp. The timestamp is extracted from the row marker of the mutation when we write it. When inserting a record to index, we compare it with the existing record, and insert it only if it has newer timestamp. Add a segment sequence number that is a global (per-shard) increasing number that is allocated when getting a new segment for write, and is written in buffer headers in the segment. It is used to distinguish between buffers written to different generations of a segment, and for recovery to break ties by keeping the record from the newest segment. Refs https://scylladb.atlassian.net/browse/SCYLLADB-770 no backport - logstor is a new feature Closes scylladb/scylladb#29933 * github.com:scylladb/scylladb: test: logstor: add basic delete test logstor: rewrite segment seq num from streaming logstor: add segment sequence number logstor: get_segment helper logstor: compare records by timestamp	2026-05-21 08:44:18 +03:00
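The ordering rule from the logstor series (newer timestamp wins, segment sequence number breaks ties) can be expressed as a tuple comparison. The Python sketch below is illustrative only: `RecordLocation`, `is_newer` and `insert` are hypothetical names, and folding the recovery tie-break into the same comparison as the index insert is an assumption, not something the merge message states.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RecordLocation:
    timestamp: int    # taken from the mutation's row marker
    segment_seq: int  # per-shard, increasing with each newly allocated segment
    offset: int


def is_newer(candidate: RecordLocation, existing: RecordLocation) -> bool:
    """Compare by timestamp first, then by segment sequence number."""
    return (candidate.timestamp, candidate.segment_seq) > (
        existing.timestamp,
        existing.segment_seq,
    )


def insert(index: dict, key, candidate: RecordLocation) -> bool:
    """Insert into the index only if the candidate supersedes the current entry."""
    existing = index.get(key)
    if existing is None or is_newer(candidate, existing):
        index[key] = candidate
        return True
    return False
```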
Michał Jadwiszczak	eac9449967	test/test_mv_building: ensure nodes see each other after restart In SCYLLADB-2058 we observed a timeout exception while querying the base table after restarting nodes 2 and 3. Unfortunately, the logs don't give us much useful information about the root cause. This patch adds basic checks that the nodes see each other after the restart and that the CQL connection sees the restarted node. It doesn't guarantee that the error won't occur again: in the logs from SCYLLADB-2058 we see that each node sees the others via gossip after part of the cluster is restarted. In case the error occurs again, this commit also raises the logging level of `cql_server` and `storage_proxy`. Refs SCYLLADB-2058 Closes scylladb/scylladb#29951	2026-05-20 14:11:41 +02:00
Marcin Maliszkiewicz	83823149e9	Merge 'audit: implement audit_rules config' from Andrzej Jackowski This patch series adds `audit_rules`, a new audit configuration option for fine-grained, role-aware audit filtering with per-rule sink routing. Rules can be configured in `scylla.yaml` or updated live through `system.config` without restarting the node. Each rule specifies target sinks (`table`, `syslog`), statement categories, qualified table name patterns, and role patterns. Table and role patterns use POSIX `fnmatch` with extended glob syntax. For table-scoped categories (`DML`, `DDL`, `QUERY`), a rule matches only when the category, role, and qualified table name all match. For table-independent categories (`AUTH`, `ADMIN`, `DCL`), the table filter is ignored. Empty category or role lists match nothing; an empty table list matches nothing only for table-scoped categories. The new rules are additive with the existing `audit_categories`, `audit_keyspaces`, and `audit_tables` settings: both mechanisms are evaluated for each audit event, and the final sink set is the union of all matches. To avoid evaluating glob patterns on every audit event, audit rules use a preprocessed cache of known roles and tables. The cache is kept in sync through group0 role/table snapshots, role-change notifications, and schema migration notifications. For known entities, rule matching uses precomputed role/table rule sets; unknown entities fall back to direct rule evaluation. When `audit_rules` is empty, per-event rule matching returns immediately and does not evaluate glob patterns. Audit still keeps known role/table metadata in sync while audit is enabled, so rules can be enabled later through live configuration updates without restarting the node. Performance Measured with `perf-simple-query --smp 1 --duration 100` against a null syslog socket. 
Results show no regression when audit is disabled, and audit-rules performance has at most 1% more instructions than legacy config for equivalent workloads:
```
==============================================================================================================================================
Configuration                                | Binary   | throughput (tps) | insns/op        | cpu_cycles/op   | alloc/op | logal/op | task/op
==============================================================================================================================================
audit=none [1]                               | baseline | 206922.4         | 36591.6         | 15348.3         | 58.1     | 0.0      | 14.1
audit=none [1]                               | this PR  | 207856.4 (+0.5%) | 36544.9 (-0.1%) | 15274.0 (-0.5%) | 58.1     | 0.0      | 14.1
----------------------------------------------------------------------------------------------------------------------------------------------
audit=syslog keyspaces=ks [2]                | baseline | 94871.8          | 54163.0         | 27172.4         | 72.0     | 0.0      | 24.0
audit=syslog keyspaces=ks [2]                | this PR  | 96138.4 (+1.3%)  | 54072.3 (-0.2%) | 26699.3 (-1.7%) | 72.0     | 0.0      | 24.0
audit=syslog audit-rules=ks [3]              | this PR  | 95142.1 (+0.3%)  | 54457.8 (+0.5%) | 26953.8 (-0.8%) | 72.0     | 0.0      | 24.0
----------------------------------------------------------------------------------------------------------------------------------------------
audit=syslog keyspaces=ks-non-existent [4]   | baseline | 213997.8         | 36735.6         | 14848.1         | 58.1     | 0.0      | 14.1
audit=syslog keyspaces=ks-non-existent [4]   | this PR  | 219297.2 (+2.5%) | 36667.3 (-0.2%) | 14500.1 (-2.3%) | 58.1     | 0.0      | 14.1
audit=syslog audit-rules=ks-non-existent [5] | this PR  | 211038.7 (-1.4%) | 36999.7 (+0.7%) | 15048.6 (+1.4%) | 58.1     | 0.0      | 14.1
==============================================================================================================================================
[1] ./scylla perf-simple-query --smp 1 --duration 100 --audit "none"
[2] ./scylla perf-simple-query --smp 1 --duration 100 --audit "syslog" --audit-keyspaces "ks" --audit-categories "DCL,DDL,AUTH,DML,QUERY" --audit-unix-socket-path "/tmp/audit-null.sock"
[3] ./scylla perf-simple-query --smp 1 --duration 100 --audit "syslog" --audit-rules '[{"sinks":["syslog"],"categories":["DCL","DDL","AUTH","DML","QUERY"],"qualified_table_names":["ks."],"roles":[""]}]' --audit-unix-socket-path "/tmp/audit-null.sock"
[4] ./scylla perf-simple-query --smp 1 --duration 100 --audit "syslog" --audit-keyspaces "ks-non-existent" --audit-categories "DCL,DDL,AUTH,DML,QUERY" --audit-unix-socket-path "/tmp/audit-null.sock"
[5] ./scylla perf-simple-query --smp 1 --duration 100 --audit "syslog" --audit-rules '[{"sinks":["syslog"],"categories":["DCL","DDL","AUTH","DML","QUERY"],"qualified_table_names":["ks-non-existent."],"roles":[""]}]' --audit-unix-socket-path "/tmp/audit-null.sock"
audit-null.sock was created with `socat -u UNIX-RECV:/tmp/audit-null.sock,type=2 OPEN:/dev/null`
```
Fixes: SCYLLADB-1430 No backport: new feature Closes scylladb/scylladb#29267 * github.com:scylladb/scylladb: test: alternator: audit: rules filtering and batch bypass test: perf: add --audit-rules option to perf-simple-query docs: add audit rules section to the auditing guide test: audit: cover role and schema cache notifications test: audit: cover audit rules cluster behavior audit: rebuild rule caches on group0 snapshot and role changes audit: refresh rule caches on schema, role, and config changes audit: route matching rules to configured sinks test: cover preprocessed audit rule cache audit: add preprocessed rule matching cache audit: pass sink targets to storage helpers test: audit: cover rule matching semantics audit: add rule matching 
and sink helpers test: audit: cover audit_rules configuration config: add live audit_rules option test: cover audit rule parsing and validation audit: define audit_rule type with parsing and validation	2026-05-20 14:10:45 +02:00
Gleb Natapov	c2cc7ebf39	test: fix test_cas_semaphore flakiness due to paxos state table creation timeout The test was starting Scylla with --write-request-timeout-in-ms=500 on the command line. This tight timeout also applied to paxos state table creation, which goes through raft and can take longer than 500ms on slow platforms (e.g. aarch64/dev). When the first batch of CAS requests triggered paxos state table creation under error injection, the raft schema change could still be in-flight when the second batch fired, causing spurious WriteTimeout failures unrelated to the semaphore bug being tested. Fix by changing the write timeout at runtime via the REST API: lower it to 500ms only for the error-injection CAS phase (after table creation is done), then restore it to 10000ms before the second batch that must succeed. Fixes https://scylladb.atlassian.net/browse/SCYLLADB-2104 Closes scylladb/scylladb#29969	2026-05-20 13:06:17 +02:00
Avi Kivity	6df04c9e5b	Update seastar submodule Changed seastar::http::experimental to seastar::http to reflect graduation of the seastar http API. Changed calls to seastar::rename_file() (in sstables/storage.cc, sstables/sstable_directory.cc, sstables/sstables.cc and db/hints/internal/hint_storage.cc) to reflect the new default parameter. Updated scylla_gdb test helper get_task() to work with the updated accept loop in Seastar. This is just test code (attempts to find a task to operate on), not used in real scylla-gdb.py work, but nevertheless the adjustment keeps backward compatibility. Fixes https://scylladb.atlassian.net/browse/SCYLLADB-1798 Fixes https://scylladb.atlassian.net/browse/SCYLLADB-2043 * seastar 485a62b2...510f3148 (43): > reactor_backend: fix iocb double-free and shutdown hang during AIO teardown > file: fix default DMA alignment > http: add to_reply() to redirect_exception with extra-header support > core: propagate syscall errors via `coroutine::exception` > file: assert dma alignments are powers of two > doc: Document undocumented io_tester features and fix output example > backtrace: print the build_id along with the backtrace > reactor: default to oneline backtraces > Merge 'json: formatter: support types with user-defined conversion to sstring' from Benny Halevy tests: json_formatter: test formatter::write with string types json: formatter: support types with user-defined conversion to sstring > httpd_test: fix build failure with Seastar_SSTRING=OFF > net/tls: introduce ssl_call wrapper for SSL I/O > build: disable unused command line argument error for C++ module > coroutine/generator: fix setup of generator's waiting task > tests/tls: set 1000-day validity for self-signed CA cert > net: tls: openssl: disable certificate compression > reactor: reduce steady_clock::now() calls per scheduling quantum > fair_queue: remove notify_request_finished() > loop: use small_vector for parallel_for_each_state incomplete futures > dodge false sharing in spinlock > 
Merge 'Handle nowait support for reads and writes independently' from Pavel Emelyanov file: Change nowait_works mode detection file: Introduce read-only nowait_mode filesystem: Make nowait_works bit a enum class too file: Make nowait_works bit a enum class > Merge 'net/tls: improve OpenSSL error queue hygiene' from Gellért Peresztegi-Nagy net/tls: assert clean error queue before SSL operations net/tls: clear error queue after successful SSL operations net/tls: clear error queue after successful SSL_CTX_new net/tls: drain error queue on unexpected error codes net/tls: use make_openssl_error for BIO creation failure > vla.hh: add missing includes > Merge 'smp: make smp::count non-static' from Avi Kivity smp: convert all smp::count usages to instance-aware alternatives smp: add per-instance shard_count and this_smp() infrastructure disk_params: document pre-init smp::count access with explicit 0 reactor_backend: document pre-init smp::count access with explicit 0 tests: alien_test: pass shard count to alien thread explicitly > build: fix cmake missing ninja on Ubuntu 26.04 > rpc: Fix uint64 wraparound of expired timeout in send_entry() > Merge 'Generalize some RPC tests' from Pavel Emelyanov tests: Generalize async connection-based scheduling RPC tests tests: Generalize sync connection-based scheduling RPC tests tests: Remove redundant variadic/nonvariadic RPC tuple tests tests: Generalize max timeout RPC tests > net: tls: openssl: Share BIO ptrs across shards > http: fix compilation on clang 22 with c++26 > build: openssl tools needed for test cert generation > reactor: support rename2 > future: fix forwarding of reference types > Merge 'Zero-copy http chunked data sink' from Pavel Emelyanov http: Make chunked data sink zero-copy tests/prometheus_http: Rewrite on top of http::client tests/httpd: Rewrite content_length_limit on top of http::client > tests: Replace ad-hoc http_consumer with production HTTP parser > Merge 'co_return to accept same expressions and types 
as return' from Alexey Bashtanov tests/unit/{coroutines,futures}: strict types on co_return and set_value api: introduce version 10: core/{coroutine,future}: make `co_return` more strict with types core/{coroutine,future}: preparations to fix `co_return` type semantics > Merge 'Perftune.py: add special handling for mlx5 rss queues number calculation' from Vladislav Zolotarov perftune.py: NetPerfTuner: enhance RSS (a.k.a. "Rx") queues accounting for mlx5 devices perftune.py: update docstring of NetPerfTuner.__get_rps_cpus() method perftune.py: add a method that parses and models the output of the 'ethtool -l' command for a given interface > httpd: rewrite do_accepts/do_accept_one as coroutines > file: add mmap support to file > http: Move client code out of experimental namespace > file: add hugetlbfs support to file system detection > tests: Replace test_source_impl with util::as_input_stream > tests: Replace buf_source_impl with util::as_input_stream > Merge 'rpc_tester: expose throuput for rpc tester' from Marcin Szopa rpc_tester: remove unused payload size variable from job_rpc_streaming class rpc_tester: add start time tracking for throughput calculation, print throughput and msg/s for job_rpc rpc_tester: refactor result emission to use dedicated functions for messages and throughput > iostream: cast first argument of `std::min` to `size_t` Closes scylladb/scylladb#29952	2026-05-20 13:47:12 +03:00
Artsiom Mishuta	ff51e9c620	test/pylib: make pytest logging config robust when ini is missing Make the pytest logging config robust when the ini file is missing, so that pytest does not crash at the configuration stage when a wrong tests path is provided. Fixes: SCYLLADB-1998 Closes scylladb/scylladb#29941	2026-05-20 13:16:01 +03:00
Dawid Pawlik	4c2ce1928c	types/vector: avoid unnecessary copies during vector reserialization When reserialize_value() is called on a vector type (which happens only when the vector's element type contains sets or maps), the old code materialized all elements via split_fragmented() into a std::vector<managed_bytes>, then iterated them calling reserialize_value() on each — discarding the intermediate copy. Use split_fragmented_view() to obtain zero-copy views of elements, and pass those directly to reserialize_value(). This avoids one managed_bytes allocation per element. Additionally, wrap the call with with_simplified() so that when the input is a single contiguous fragment (the common case), the compiler receives a single_fragmented_view and can eliminate fragment-boundary checks at compile time. Also generalize build_value_fragmented() to accept any forward range of FragmentedView elements (not just managed_bytes), and write directly into the output buffer via with_linearized instead of going through an intermediate read_simple_bytes copy. This benefits all callers including evaluate_vector() on the INSERT path for vector<float, N>. The with_simplified() dispatch instantiates reserialize_value with single_fragmented_view, which in turn instantiates partially_deserialize_listlike and partially_deserialize_map with that type. Add explicit template instantiations in types/types.cc since those function templates are defined there and only previously instantiated for managed_bytes_view and fragmented_temporary_buffer::view. Note: the reserialization path is only exercised for vectors whose element type contains sets or maps (e.g. vector<frozen<map<int,int>>, N>). The common vector<float, N> case never enters reserialize_value() because bound_value_needs_to_be_reserialized() returns false at the call site. However, the build_value_fragmented() improvement applies to all vector INSERTs. 
References: SCYLLADB-471 Fixes: SCYLLADB-1799 Closes scylladb/scylladb#28559	2026-05-20 12:22:19 +03:00
Andrzej Jackowski	1f37bf21cf	test: alternator: audit: rules filtering and batch bypass The audit_rules path was not covered at all by alternator tests. Add focused coverage that single-table operations respect audit_rules qualified_table_names filtering, and that cross-table batches bypass the table filter because the audit path receives an empty keyspace for multi-table batch operations. Refs SCYLLADB-1430	2026-05-20 06:55:15 +02:00
Andrzej Jackowski	f39c48648a	test: perf: add --audit-rules option to perf-simple-query Allow perf-simple-query to compare audit-rule matching with the category/keyspace/table audit filters under the same workload. Register a hardcoded "tester" role with the audit cache so rules targeting that role exercise the preprocessed fast path. The new option was used to measure audit-rules performance against the category/keyspace/table audit config. The results are as follows:
==============================================================================================================================================
Configuration                                | Binary   | throughput (tps) | insns/op        | cpu_cycles/op   | alloc/op | logal/op | task/op
==============================================================================================================================================
audit=none [1]                               | baseline | 206922.4         | 36591.6         | 15348.3         | 58.1     | 0.0      | 14.1
audit=none [1]                               | this PR  | 207856.4 (+0.5%) | 36544.9 (-0.1%) | 15274.0 (-0.5%) | 58.1     | 0.0      | 14.1
----------------------------------------------------------------------------------------------------------------------------------------------
audit=syslog keyspaces=ks [2]                | baseline | 94871.8          | 54163.0         | 27172.4         | 72.0     | 0.0      | 24.0
audit=syslog keyspaces=ks [2]                | this PR  | 96138.4 (+1.3%)  | 54072.3 (-0.2%) | 26699.3 (-1.7%) | 72.0     | 0.0      | 24.0
audit=syslog audit-rules=ks [3]              | this PR  | 95142.1 (+0.3%)  | 54457.8 (+0.5%) | 26953.8 (-0.8%) | 72.0     | 0.0      | 24.0
----------------------------------------------------------------------------------------------------------------------------------------------
audit=syslog keyspaces=ks-non-existent [4]   | baseline | 213997.8         | 36735.6         | 14848.1         | 58.1     | 0.0      | 14.1
audit=syslog keyspaces=ks-non-existent [4]   | this PR  | 219297.2 (+2.5%) | 36667.3 (-0.2%) | 14500.1 (-2.3%) | 58.1     | 0.0      | 14.1
audit=syslog audit-rules=ks-non-existent [5] | this PR  | 211038.7 (-1.4%) | 36999.7 (+0.7%) | 15048.6 (+1.4%) | 58.1     | 0.0      | 14.1
==============================================================================================================================================
[1] ./scylla perf-simple-query --smp 1 --duration 100 --audit "none"
[2] ./scylla perf-simple-query --smp 1 --duration 100 --audit "syslog" --audit-keyspaces "ks" --audit-categories "DCL...
[3] ./scylla perf-simple-query --smp 1 --duration 100 --audit "syslog" --audit-rules '[{"sinks":["syslog"],"categorie...
[4] ./scylla perf-simple-query --smp 1 --duration 100 --audit "syslog" --audit-keyspaces "ks-non-existent" --audit-ca...
[5] ./scylla perf-simple-query --smp 1 --duration 100 --audit "syslog" --audit-rules '[{"sinks":["syslog"],"categorie...
audit-null.sock was created with `socat -u UNIX-RECV:/tmp/audit-null.sock,type=2 OPEN:/dev/null` Refs SCYLLADB-1430	2026-05-20 06:55:15 +02:00
Andrzej Jackowski	f4acf91949	docs: add audit rules section to the auditing guide Operators need a reference for the new rule schema, its relationship to audit_categories/audit_tables/ audit_keyspaces, and the live-update path so they can adopt the feature without reading the source. Refs SCYLLADB-1430	2026-05-20 06:55:15 +02:00
Andrzej Jackowski	f03398fdba	test: audit: cover role and schema cache notifications Verify on a multi-node cluster that role creation/alter/ drop and table/materialized-view create/drop trigger updates to the preprocessed audit-rules cache on every node, and that a matching DML on the newly created table is audited via the cache. Refs SCYLLADB-1430	2026-05-20 06:55:15 +02:00
Andrzej Jackowski	7f61d7662d	test: audit: cover audit rules cluster behavior Cluster-level tests should validate rule matching, live updates, sink routing, role filtering, and error handling without rerunning the broader audit suite. Add audit_rules to LIVE_AUDIT_KEYS so the test framework tracks it as a live-updatable config key. Test that rules with empty categories or roles match nothing, that DML rules coexist with legacy audit config, AUTH rules fire on login events, CQL and REST API update paths reject invalid JSON, per-rule sink routing works for table and syslog, role-based filtering works across sessions, and sink mismatch produces a warning in server logs. Refs SCYLLADB-1430	2026-05-20 06:55:15 +02:00
Andrzej Jackowski	c810bb48f4	audit: rebuild rule caches on group0 snapshot and role changes Nodes can join or reload snapshots after roles and tables already exist, so the cache cannot rely only on incremental notifications. Bulk-load all known roles and tables into the rule cache on Raft state reload and snapshot transfer. Detect incremental role creates and drops in reload_modules() by comparing the loaded roles against the auth cache, and forward the changes to every shard. Each shard rebuilds the fnmatch cache locally from its own rules to avoid cross-shard races when rules are updated concurrently with entity sync. Refs SCYLLADB-1430	2026-05-20 06:55:15 +02:00
Andrzej Jackowski	78bd361919	audit: refresh rule caches on schema, role, and config changes Schema, role, and config changes must refresh the preprocessed rule cache, otherwise the fast path serves stale matches after reconfiguration or metadata changes. Register a migration listener for table/view create/drop. Observe audit_rules config changes through a serialized action so concurrent rebuilds collapse. Add hooks for role create/drop and a set_known_entities() bulk-load method. Implement real cleanup in shutdown() (previously a no-op) and roll back cleanly on start failure. Refs SCYLLADB-1430	2026-05-20 06:55:15 +02:00
Andrzej Jackowski	465f8f4d8d	audit: route matching rules to configured sinks Rule-based routing must coexist with legacy category/keyspace/table filtering so operators who have not opted into rules keep their existing behavior. Merge rule-matched sinks into the event's sink set alongside legacy matches. Add a username parameter to should_log_login/sinks_for_login so rules can match the authenticated role. Use a conservative over-approximation for the fast will-log check since the role is not yet known at that call site. Log an error at startup when rules reference sinks not enabled globally. Log a warning when rules are configured but audit is disabled. Refs SCYLLADB-1430	2026-05-20 06:55:15 +02:00
Andrzej Jackowski	7afb90aa6f	test: cover preprocessed audit rule cache The rule cache is the fast path for matching, so its hit, fallback, refresh, and category-bypass behavior needs focused unit coverage. Test transparent hash consistency, cached and uncached lookup paths, incremental entity add/remove, rule refresh, and empty-rules short circuit. Refs SCYLLADB-1430	2026-05-20 06:55:15 +02:00
Andrzej Jackowski	97fb2f01ff	audit: add preprocessed rule matching cache Running fnmatch on every audit event would hurt hot-path latency. Precompute per-role and per-table bitsets and intersect them at query time. Rebuild from snapshots with a generation counter to avoid partial state after yielding. Unknown roles/tables fall back to linear fnmatch until metadata notifications populate the cache. Refs SCYLLADB-1430	2026-05-20 06:55:15 +02:00
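The idea in 97fb2f01ff (precompute one bitset per known role and per known table, intersect at query time, fall back to direct matching for unknown names) can be sketched in Python. Several things here are assumptions: `RuleCache` and `rule_mask` are hypothetical names, rules are reduced to `(role_patterns, table_patterns)` pairs, and Python's `fnmatch.fnmatchcase` stands in for POSIX `fnmatch` with extended globs, which it does not fully reproduce. The generation counter mentioned in the commit is omitted.

```python
from fnmatch import fnmatchcase


def rule_mask(name, patterns_per_rule):
    """Bit i is set when `name` matches any glob pattern of rule i."""
    mask = 0
    for i, patterns in enumerate(patterns_per_rule):
        if any(fnmatchcase(name, p) for p in patterns):
            mask |= 1 << i
    return mask


class RuleCache:
    def __init__(self, rules, known_roles, known_tables):
        # rules: list of (role_patterns, table_patterns) tuples.
        self._role_patterns = [roles for roles, _ in rules]
        self._table_patterns = [tables for _, tables in rules]
        # Glob evaluation happens once per known entity, not per event.
        self._by_role = {r: rule_mask(r, self._role_patterns) for r in known_roles}
        self._by_table = {t: rule_mask(t, self._table_patterns) for t in known_tables}

    def matching_rules(self, role, table):
        """Return a bitmask of rule indexes matching both role and table."""
        role_mask = self._by_role.get(role)
        if role_mask is None:  # unknown role: fall back to direct matching
            role_mask = rule_mask(role, self._role_patterns)
        table_mask = self._by_table.get(table)
        if table_mask is None:  # unknown table: same fallback
            table_mask = rule_mask(table, self._table_patterns)
        return role_mask & table_mask
```

With the masks precomputed, the hot path for known entities is two dictionary lookups and one bitwise AND.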
Andrzej Jackowski	6354daa8d7	audit: pass sink targets to storage helpers Per-rule routing needs each audit event to carry its target sinks so storage helpers can self-filter without duplicating writes. Replace should_log() with sinks_for() returning an audit_sink_set and add sinks_for_login() for the login path. Move the early-return filtering check from the static inspect() caller into audit::log() so it uses the new sinks_for() directly. Pass the sink set to storage_helper::write() so each helper only fires when its sink is included. Rename parse_audit_modes to parse_audit_sinks. Refs SCYLLADB-1430	2026-05-20 06:55:15 +02:00
Andrzej Jackowski	67ecdba456	test: audit: cover rule matching semantics Rule matching is reused by both the preprocessed cache and the fallback path -- unit-test it separately so coupling failures do not mask matching bugs. Cover category bitmask, glob patterns for tables and roles, AUTH/ADMIN/DCL table bypass, empty-keyspace batch bypass, and sink bitmask conversion. Refs SCYLLADB-1430	2026-05-20 06:55:15 +02:00
Andrzej Jackowski	65dd103f74	audit: add rule matching and sink helpers Rule matching must be shared between the preprocessed cache and the fallback path to avoid divergent semantics. Introduce audit_sink enum and audit_sink_set bitmask for routing. Match categories via bitmask, tables and roles via fnmatch with extended globs. AUTH/ADMIN/DCL bypass table matching. Empty category or role lists match nothing. Empty keyspace (e.g. cross-table batches) bypasses table matching for table-scoped categories. Convert validated sink names to an audit_sink_set bitmask for routing. Refs SCYLLADB-1430	2026-05-20 06:55:15 +02:00
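The matching semantics listed in 65dd103f74 can be summarized in a short Python sketch. It is an approximation under stated assumptions: `Sink`, `Rule`, `rule_matches` and `sinks_for` are illustrative names rather than the C++ API, `fnmatch.fnmatchcase` replaces POSIX `fnmatch` with extended globs, and the empty-keyspace bypass is assumed to apply even when a rule's table list is empty.

```python
from dataclasses import dataclass
from enum import Flag
from fnmatch import fnmatchcase


class Sink(Flag):
    NONE = 0
    TABLE = 1
    SYSLOG = 2


# Categories for which the table filter is ignored.
TABLE_INDEPENDENT = frozenset({"AUTH", "ADMIN", "DCL"})


@dataclass
class Rule:
    sinks: Sink
    categories: frozenset
    tables: tuple
    roles: tuple


def rule_matches(rule, category, role, keyspace, table):
    if category not in rule.categories:  # empty category list matches nothing
        return False
    if not any(fnmatchcase(role, p) for p in rule.roles):  # empty roles: nothing
        return False
    if category in TABLE_INDEPENDENT or not keyspace:
        # AUTH/ADMIN/DCL, or an empty keyspace (e.g. a cross-table batch),
        # bypass table matching.
        return True
    qualified = f"{keyspace}.{table}"
    return any(fnmatchcase(qualified, p) for p in rule.tables)


def sinks_for(rules, category, role, keyspace="", table=""):
    """Union of the sinks of every matching rule."""
    result = Sink.NONE
    for rule in rules:
        if rule_matches(rule, category, role, keyspace, table):
            result |= rule.sinks
    return result
```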
Andrzej Jackowski	762fd5d455	test: audit: cover audit_rules configuration Audit rules enter through three paths (YAML, CQL, CLI), each with its own parsing and tracking -- cover all entry points before routing can depend on them. Test loading from YAML, live update via CQL and server API, CLI parsing, invalid value rejection at each path, and observer notification on live update. Refs SCYLLADB-1430	2026-05-20 06:55:14 +02:00
Andrzej Jackowski	f3a7e2e3dc	config: add live audit_rules option Operators need to configure audit rules through YAML, CQL, and CLI with live-update support so routing can be reconfigured without restart. Add audit_rules as a LiveUpdate config option with YAML decoding, JSON parsing for CQL updates, CLI --audit-rules flag, and a custom serializer that avoids double-quoting the JSON array. Refs SCYLLADB-1430	2026-05-20 06:55:14 +02:00
Andrzej Jackowski	3cc55dd6eb	test: cover audit rule parsing and validation Parsing and validation are the first consumer-visible surface of audit rules -- cover them before building higher layers. Test JSON parsing (valid, malformed, missing fields), rule validation (unknown sinks, invalid categories), and JSON round-trip serialization. Refs SCYLLADB-1430	2026-05-20 06:55:14 +02:00
Andrzej Jackowski	32cfa778f7	audit: define audit_rule type with parsing and validation Audit rules provide more granular control over which statements are audited, filtering by tables, roles, and categories. Typos in sink or category names should be caught at parse time rather than silently disabling rules at runtime. Define the audit_rule struct with JSON parsing, validation of sink and category names, serialization, and fmt support. Move statement_category, category_set, and category_to_string out of audit.hh/audit.cc so the rule type is self-contained. Refs SCYLLADB-1430	2026-05-20 06:55:14 +02:00
Tomasz Grabiec	f0dea67a87	Merge 'transport: add per-service-level transport request latency histogram' from Piotr Smaron and Marcin Maliszkiewicz Add a per-scheduling-group latency histogram on the transport level that measures the full CQL request lifetime: from fetching the request buffer until the response is written to the socket. Today latencies are accounted only on the storage proxy level, leaving the time spent in the transport layer (response queue wait + actual I/O) unaccounted. Having both transport and storage proxy latencies allows operators to tell where latency accumulates. The metric is exposed as scylla_transport_cql_request_latency_histogram with the scheduling_group_name label, following the cql_ prefix convention of all other per-SG transport metrics. Fixes: SCYLLADB-1691 New feature, no backport. Closes scylladb/scylladb#29878 * github.com:scylladb/scylladb: test/cluster: add test for per-service-level transport request latency histogram transport: add per-service-level transport request latency histogram	2026-05-20 01:12:14 +02:00
Nadav Har'El	a91d8aeb63	Merge '`CREATE INDEX` validation for Fulltext Indexes' from Dawid Pawlik This PR adds the schema-level validation required for `CREATE INDEX` and `DROP INDEX` on fulltext indexes, mirroring what vector indexes already enforce. Fulltext indexes are viewless custom indexes (no backing materialized view) that rely on CDC for change tracking. The validation ensures these prerequisites are met at index creation time and cannot be violated afterwards via `ALTER TABLE`. Tablet storage: Fulltext indexes require the keyspace to use tablet storage. Creation is rejected otherwise. CDC requirements: Fulltext indexes need a CDC log with a minimum TTL of 24 hours and either `delta = 'full'` or `postimage = true`. The PR enforces this in three places: - `CREATE INDEX` rejects creation when existing CDC options don't meet the requirements. - auto-enables CDC for tables with a fulltext index (same as vector indexes) and validates CDC options on schema updates. - `ALTER TABLE` blocks disabling CDC while a fulltext index exists. Viewless index generalization: The `vector_index`-specific checks in `create_index_statement` (rejecting `WITH` view properties, name-based duplicate detection for issue #26672) are replaced with a generic `is_viewless_custom_class()` helper that queries the index factory. This automatically covers both vector and fulltext indexes without duplicating logic. DROP INDEX reuses the existing path with no changes needed - the standard drop logic works for viewless indexes as-is. Added tests covering all validation paths above. All existing tests are updated to require the `skip_without_tablets` fixture. 
Fixes: SCYLLADB-1516 Closes scylladb/scylladb#29739 * github.com:scylladb/scylladb: external_index: fix require CDC options for disabled CDC test/cqlpy: add duplicate and view tests for fulltext index cql3: generalize viewless index handling in CREATE INDEX statement test/cqlpy: add CDC validation tests for fulltext index fulltext_index: enforce CDC requirements for fulltext indexes test/cqlpy: add tablet requirement test for fulltext index fulltext_index: require tablet storage for fulltext indexes index: introduce `external_index` base class for VS/FTS indexes	2026-05-20 01:10:56 +03:00
Piotr Szymaniak	f6d4d8abc0	test/cluster: don't advance to SERVING before CQL/Alternator connections are up Don't honor sd_notify SERVING until CQL/Alternator ports are verified reachable. Fixes a race introduced in `af03f0e8c4` (PR #29758). Refs: #29929 Fixes: SCYLLADB-2065 Closes scylladb/scylladb#29964	2026-05-19 21:06:04 +03:00
Michael Litvak	eecbead541	test: wait for others_not_see_server before exclude Between stopping a server and excluding it, wait for other nodes to see the server as down, otherwise exclude may see the server as alive and fail. Fixes SCYLLADB-2110 Closes scylladb/scylladb#29966	2026-05-19 19:36:54 +02:00
Piotr Smaron	810ed6eedc	test/cluster: add test for per-service-level transport request latency histogram Verify that the new scylla_transport_cql_request_latency_histogram metric correctly records transport-level request latencies per service level. Uses error injection to pause a request mid-flight and verifies that the histogram is not updated while the request is paused (since the response has not been written yet), and is updated after the request completes. Co-authored-by: Marcin Maliszkiewicz <marcinmal@scylladb.com>	2026-05-19 16:07:33 +02:00
Piotr Smaron	f90a3296cf	transport: add per-service-level transport request latency histogram Add a per-scheduling-group latency histogram that tracks the full transport-level CQL request lifetime: from fetching the request buffer until the response is written to the socket. Today latencies are accounted only on the storage proxy level, which leaves the time spent in the transport layer unaccounted. The time spent by a response waiting to be sent out can be significant. Having both the transport and the storage proxy latencies allows operators to tell where latency is accumulated. The histogram uses utils::time_estimated_histogram (range 0.5ms to 33s) and is exposed as scylla_transport_cql_request_latency_histogram with the scheduling_group_name label, following the cql_ prefix convention used by all other per-scheduling-group transport metrics. The start time is captured at the beginning of process_request(). The latency is recorded after the response is successfully written to the socket, ensuring the measurement covers processing time, response queue wait time, and actual I/O time. Co-authored-by: Marcin Maliszkiewicz <marcinmal@scylladb.com>	2026-05-19 16:07:33 +02:00
Patryk Jędrzejczak	b7fc661fa9	Merge 'raft: fix send_snapshot abort_source lifetime' from Emil Maskovsky Fix a lifetime bug where `send_snapshot()` captured `abort_source` by reference and the referenced object could be destroyed before the continuation ran. Use a gate-tracked background coroutine for each snapshot transfer: - keep abort_source on the coroutine frame (stable lifetime) - store a raw abort_source* in _snapshot_transfers for synchronous abort - erase transfer slots immediately on abort to allow same-batch reuse - close _snapshot_gate during abort() to wait for all in-flight transfers This removes the need for extra aborted-transfer bookkeeping and makes snapshot transfer shutdown and ownership semantics explicit. Fixes: SCYLLADB-1234 Refs: https://github.com/scylladb/scylladb/pull/29092 No backport: the abort source parameter is not actually used yet, so this doesn't cause any problems in the current and older branches (use of the abort source parameter will eventually be implemented on master). Closes scylladb/scylladb#29913 * https://github.com/scylladb/scylladb: raft: fix send_snapshot abort_source lifetime raft: fix parameter name mismatch in `send_snapshot()`	2026-05-19 10:15:13 +02:00
Szymon Malewski	6b2fce03f9	alternator: optional stripping of http response headers In Alternator's HTTP API, response headers can dominate bandwidth for small payloads. The Server, Date, and Content-Type headers were sent on every response but many clients never use them. This patch introduces three Alternator config options: - alternator_http_response_server_header, - alternator_http_response_disable_date_header, - alternator_http_response_disable_content_type_header, which allow customizing or suppressing the respective HTTP response headers. All three options support live update (no restart needed). The Server header is no longer sent by default; the Date and Content-Type defaults preserve the existing behavior. The Server and Date header suppression uses Seastar's set_server_header() and set_generate_date_header() APIs added in https://github.com/scylladb/seastar/pull/3217. This patch also fixes deprecation warnings from older Seastar HTTP APIs. Tests are in test/alternator/test_http_headers.py. Fixes https://scylladb.atlassian.net/browse/SCYLLADB-70 Closes scylladb/scylladb#28288	2026-05-19 10:47:13 +03:00
Benny Halevy	97e03762c5	test/cluster/test_keyspace_rf: extend test_create_keyspace_with_default_replication_factor for tablets rack lists Add more racks to dc2 to verify that the default replication factor covers all available racks (rather than e.g. limited to 3). With tablets and rf_rack_valid_keyspaces, verify also the automatically selected rack list. Restrict the extension to non-debug build modes to prevent running out of memory with --repeat=100. Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Closes scylladb/scylladb#29931	2026-05-19 10:44:24 +03:00
Nadav Har'El	cd61a44ab8	test/alternator: test response compression of tiny responses This patch adds to the existing collection of tests for Alternator response compression another test with a tiny response being compressed. This test serves two purposes: 1. It verifies that setting alternator_response_compression_threshold_in_bytes to a tiny number like 1 really means that tiny responses are compressed. 2. It verifies that our compression code, which has a special code path for the small chunk at the end of the compression, works correctly. The original motivation for writing this test was a false alarm by Claude Code which claimed that Alternator's response compression code has a serious, exploitable, memory overrun bug, because it set the wrong size limit on that last chunk. Claude was wrong, there is no such bug. We did set an oversized limit on the last chunk (so this patch fixes this typo), but it didn't matter - because the code used deflateBound - the guaranteed maximum size of the compressed data - for the buffer's size, so the buffer was unconditionally big enough, and no matter which avail_out limit we passed to deflate() it could never overflow. The included test passes even before this patch, even with ASAN enabled to detect memory overflows - no overflow was happening. It also passes after the typo correction in this patch. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes scylladb/scylladb#29718	2026-05-19 10:02:26 +03:00
Dawid Pawlik	a631123c06	external_index: fix require CDC options for disabled CDC Since we want to stop rejecting tables with "explicitly disabled" CDC when creating an external index (#29894), we still need to check that the other required CDC parameters are set properly. Before this commit, once we auto-enabled CDC that had been "explicitly disabled", `check_cdc_options()` would never run. This patch makes the check run even when CDC is not marked as enabled.	2026-05-19 08:53:15 +02:00
Dawid Pawlik	6387c61506	test/cqlpy: add duplicate and view tests for fulltext index Verify that fulltext indexes, which have no backing materialized view, correctly reject duplicate index creation and respect IF NOT EXISTS semantics. Named indexes must not be created twice under the same name; unnamed indexes on the same column must be detected as duplicates. IF NOT EXISTS must silently succeed rather than create a second index, including the known edge cases where the same name is reused across different tables or columns in the same keyspace (VECTOR-641).	2026-05-19 08:52:47 +02:00
Dawid Pawlik	232b1a3725	cql3: generalize viewless index handling in CREATE INDEX statement Replace the `vector_index`-specific checks in `create_index_statement` with a generic `is_viewless_custom_class()` helper that queries the index factory to determine whether an index type creates a backing materialized view. This covers both existing (`vector_index`) and new (`fulltext_index`) viewless index types: - Reject view properties (WITH clause) for any viewless index - Use name-based duplicate detection for named viewless indexes, since they have no backing view table for `has_schema()` to find (issue #26672)	2026-05-19 08:52:47 +02:00
Dawid Pawlik	215a1e3f00	test/cqlpy: add CDC validation tests for fulltext index Verify that fulltext index creation and ALTER TABLE enforce the CDC requirements: creation is rejected when TTL is below the 24-hour minimum, or when the delta mode is neither 'full' nor compensated by postimage. Also verify that enabling postimage or full delta mode allows index creation to succeed, that DROP INDEX works, and that ALTER TABLE cannot disable CDC while a fulltext index is present.	2026-05-19 08:52:47 +02:00
Dawid Pawlik	9e02e11ea8	fulltext_index: enforce CDC requirements for fulltext indexes Fulltext indexes rely on CDC to track changes for asynchronous index building. Enforce the following CDC constraints during CREATE INDEX: - CDC TTL must be at least 86400 seconds (24 hours) - CDC delta mode must be 'full' or postimage must be enabled Add `has_fulltext_index()` and `check_cdc_options()` so that other modules can detect fulltext indexes and validate CDC settings: - include fulltext indexes in `cdc_enabled()` so the CDC log is auto-created, and validate CDC options in `on_before_update_column_family()` - block `ALTER TABLE ... WITH cdc = {'enabled': false}` when a fulltext index exists on the table	2026-05-19 08:52:47 +02:00
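The two CDC constraints named in the commit above can be expressed as a small validation function. This is a hedged Python sketch, not the C++ `check_cdc_options()` from the patch: the function name `cdc_option_errors`, its parameters, and the error strings are hypothetical, and it does not model any special TTL values that CDC may define.

```python
# Minimum CDC TTL stated in the commit message: 24 hours.
MIN_CDC_TTL_SECONDS = 86400


def cdc_option_errors(ttl_seconds, delta, postimage):
    """Return a list of violated fulltext-index CDC requirements."""
    errors = []
    if ttl_seconds < MIN_CDC_TTL_SECONDS:
        errors.append(f"CDC TTL must be at least {MIN_CDC_TTL_SECONDS} seconds")
    # Either full delta mode or an enabled postimage satisfies the rule.
    if delta != "full" and not postimage:
        errors.append("CDC delta mode must be 'full' or postimage must be enabled")
    return errors
```

Returning every violation at once, rather than stopping at the first, is a design choice of this sketch; the commit message does not say how the real check reports errors.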
Dawid Pawlik	558de64773	test/cqlpy: add tablet requirement test for fulltext index Add `test_create_fulltext_index_requires_tablets` to verify that creating a fulltext index on a keyspace with tablets disabled is rejected.	2026-05-19 08:52:47 +02:00
Dawid Pawlik	69dc62c373	fulltext_index: require tablet storage for fulltext indexes Fulltext indexes, like vector indexes, require the base table's keyspace to use tablets. Add `check_uses_tablets()` validation to `fulltext_index::validate()` that rejects index creation when the keyspace does not use tablet storage. Also add `skip_without_tablets` fixture to all existing fulltext index tests so they are skipped in environments where tablets are not available.	2026-05-19 08:52:47 +02:00
Dawid Pawlik	61d658106a	index: introduce `external_index` base class for VS/FTS indexes Add `external_index` as a common base for `vector_index` and `fulltext_index`, both of which are backed by an external Vector Store engine and share CDC requirements.	2026-05-19 08:52:47 +02:00
Emil Maskovsky	e0f58d1e81	raft: fix send_snapshot abort_source lifetime Fix a lifetime bug where `send_snapshot()` captured `abort_source` by reference and the referenced object could be destroyed before the continuation ran. Use a gate-tracked background coroutine for each snapshot transfer: - keep abort_source on the coroutine frame (stable lifetime) - store a raw abort_source* in _snapshot_transfers for synchronous abort - erase transfer slots immediately on abort to allow same-batch reuse - close _snapshot_gate during abort() to wait for all in-flight transfers This removes the need for extra aborted-transfer bookkeeping and makes snapshot transfer shutdown and ownership semantics explicit. Fixes: SCYLLADB-1234	2026-05-18 21:49:37 +00:00

53998 Commits