scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-05-22 07:42:16 +00:00

Author	SHA1	Message	Date
Petr Gusev	9e3209e4a3	cql: refactor add_tablet_info to take tablet_routing_info directly Change add_tablet_info() to accept locator::tablet_routing_info instead of destructured (tablet_replica_set, token_range) pair. This simplifies all three call sites. Remove the empty-replicas guard inside add_tablet_info(): the only producer of tablet_routing_info is tablet ERM's check_locality(), which returns either nullopt (correctly routed) or info with replicas copied from tablet_info — a tablet always has replicas. All callers already check for nullopt before calling add_tablet_info(), so by the time we enter the function replicas are guaranteed non-empty.	2026-05-15 12:28:33 +02:00
Petr Gusev	738b7b4a86	cql: fix UB dereference of nullopt tablet_info in execute_with_condition When check_locality() returns nullopt (correctly routed LWT), the optional tablet_info was unconditionally dereferenced in the lambda capture list: tablet_info->tablet_replicas, tablet_info->token_range. The code previously masked this by initializing tablet_info with an empty-but-present value, so the dereference happened to work but only because the empty tablet_replicas made add_tablet_info() a no-op. After check_locality() overwrites it with nullopt, the dereference is UB. Fix by initializing tablet_info as empty (nullopt) and guarding the dereference.	2026-05-15 11:56:14 +02:00
Petr Gusev	167a3c9c50	cql: fix missing TABLETS_ROUTING_V1 payload after CAS shard bounce After an internal CAS shard bounce, check_locality() was evaluating against this_shard_id() of the post-bounce shard — which is the correct tablet shard — so it returned nullopt, and LWT/SERIAL responses omitted the tablets-routing-v1 custom payload. The client never learned the correct tablet map. Fix by recording the original entry shard in client_state (initialized to this_shard_id() at construction, preserved across shard bounces via client_state_for_another_shard) and passing it to check_locality() so it compares against the client's actual routing decision. No host_id tracking or forwarded_client_state IDL changes are needed because CAS shard bounces are always intra-node. Fixes SCYLLADB-2041	2026-05-15 11:56:14 +02:00
Piotr Dulikowski	f3ac35f9d2	Merge 'strong_consistency: wait for raft servers to start in create table' from Michael Litvak When creating a strongly consistent table, wait for the table's raft servers to start and be ready to serve queries before completing the operation. We want the create table operation to absorb the delay of starting the raft groups instead of the first queries. The create table coordinator commits and applies the schema statement, then it waits for all hosts that have a tablet replica to create and start the raft groups for the table's tablets. It does this by sending an RPC to all the relevant hosts that executes a group0 barrier, in order to ensure the table and raft groups are created, then waits for all raft groups on the host to finish starting and be ready. Fixes SCYLLADB-807 no backport - strong consistency is still experimental Closes scylladb/scylladb#28843 * github.com:scylladb/scylladb: strong_consistency: wait for leader when starting a group strong_consistency: change wait for groups to start on startup strong_consistency: optimize wait_for_groups_to_start strong_consistency: wait for raft servers to start in create table	2026-05-13 16:42:05 +02:00
Piotr Dulikowski	dc05bd35bb	Merge 'strong_consistency: limit available consistency levels in strong consistent requests' from Michał Jadwiszczak Strongly consistent requests take a different path than EC requests, and consistency levels don't map well. We should limit the consistency levels available in SC requests to avoid silently ignoring them, which may confuse users. For writes, there is only one option: - QUORUM/LOCAL_QUORUM (multi-DC is not supported yet, so both CLs have the same effect) - we need a quorum of replicas to successfully commit new mutations to the Raft log. For reads, there are 2 options: - QUORUM/LOCAL_QUORUM - if the user wants to be sure they see the latest data and the query needs to execute `read_barrier()`, which requires a quorum of replicas - ONE/LOCAL_ONE - if the user just wants to read data from one replica without synchronization All tests were updated to use LOCAL_QUORUM for both reads and writes. Fixes SCYLLADB-1766 SC is in experimental phase and this patch is an improvement, no backport needed. Closes scylladb/scylladb#29691 * github.com:scylladb/scylladb: strong_consistency: allow QUORUM/LOCAL_QUORUM and ONE/LOCAL_ONE for reads strong_consistency: allow only QUORUM/LOCAL_QUORUM CL for writes	2026-05-13 16:31:05 +02:00
Michael Litvak	5a5c7c6241	strong_consistency: wait for raft servers to start in create table When creating a strongly consistent table, wait for the table's raft servers to start and be ready to serve queries before completing the operation. We want the create table operation to absorb the delay of starting the raft groups instead of the first queries. The create table coordinator commits and applies the schema statement, then it waits for all hosts that have a tablet replica to create and start the raft groups for the table's tablets. It does this by sending an RPC to all the relevant hosts that executes a group0 barrier, in order to ensure the table and raft groups are created, then waits for all raft groups on the host to finish starting and be ready. Fixes SCYLLADB-807	2026-05-13 08:43:24 +02:00
Michał Jadwiszczak	d073097ebf	strong_consistency: allow QUORUM/LOCAL_QUORUM and ONE/LOCAL_ONE for reads We can execute strongly consistent read queries in 2 ways: - with QUORUM/LOCAL_QUORUM CL - this path executes `read_barrier()` before reading the data, which synchronizes the Raft log with the leader. To execute it, we need a quorum of replicas - with ONE/LOCAL_ONE CL - this path just reads data from one replica without any synchronization (not implemented yet)	2026-05-12 23:20:07 +02:00
Michał Jadwiszczak	68f0cf6fac	strong_consistency: allow only QUORUM/LOCAL_QUORUM CL for writes To successfully write data to a strongly consistent table, a quorum of replicas is needed to save the data to the Raft log. So the only reasonable consistency level is QUORUM/LOCAL_QUORUM (currently SC doesn't support multi-DC).	2026-05-12 23:20:03 +02:00
Piotr Dulikowski	129f193116	Merge 'strong_consistency: implement basic coordinator metrics' from Michał Jadwiszczak Add per-shard metrics for strong consistency coordinator operations (latency, timeouts, bounces, status unknown) under the `"strong_consistency_coordinator"` category. These are analogous to the eventual consistency metrics in `storage_proxy_stats`, enabling direct performance comparison between the two consistency modes. The metrics are simplified compared to `storage_proxy_stats` — no breakdown by table, tablet, scheduling group, or DC, only per-shard. Fixes SCYLLADB-1343 Strong consistency is still in experimental phase, no need to backport. Closes scylladb/scylladb#29318 * github.com:scylladb/scylladb: test/strong_consistency: verify metrics strong_consistency: wire up metrics to operations strong_consistency: add stats struct and metrics registration	2026-05-12 16:15:51 +02:00
Avi Kivity	f5ffbd3c3e	cql3: restrictions: reindent statement_restrictions.cc `6165124fcc` has left statement_restrictions.cc scarred and deformed. Restore it to standard 4-space indentation. This patch contains only whitespace changes. Closes scylladb/scylladb#29598	2026-05-11 17:02:14 +03:00
Piotr Smaron	959f67b345	cql: verify tuples length in multi-column IN restriction When a multi-column IN restriction contains tuples with a different number of elements than the number of restricted columns (e.g. `(b, c, d) IN ((1, 2), (2, 1, 4))`), Scylla would either produce an inconsistent error message or, for over-sized tuples, an internal type-mismatch error referencing the list literal representation. Validate each tuple's arity against the number of restricted columns while building the IN restriction and raise a clear "Expected N elements in value tuple, but got M" error in both the under- and over-sized cases. Fixes #13241 Co-authored-by: Alexander Turetskiy <someone.tur@gmail.com> Closes scylladb/scylladb#18407	2026-05-11 16:55:09 +03:00
Nadav Har'El	fcfad51284	Merge 'cql3/selection: require EXECUTE on UDA REDUCEFUNC at SELECT time' from Marcin Maliszkiewicz selection::used_functions() pushed the UDA, its SFUNC and its FINALFUNC, but never the REDUCEFUNC. The reducefunc is invoked by the distributed aggregation path in service::mapreduce_service, so a user could cause it to run server-side without holding EXECUTE on it as long as the query took the mapreduce path. Also push agg.state_reduction_function so select_statement::check_access requires EXECUTE on it too. Fixes https://scylladb.atlassian.net/browse/SCYLLADB-1756 Backport: no, it's a minor fix and UDFs are experimental feature in Scylla Closes scylladb/scylladb#29717 * github.com:scylladb/scylladb: test/cqlpy: add test for EXECUTE permission on UDA sub-functions cql3/selection: require EXECUTE on UDA REDUCEFUNC at SELECT time	2026-05-11 16:14:38 +03:00
Yaniv Kaul	cfb568b5b5	cql3: fix missing format placeholders in error messages Fix two format string bugs where arguments were silently dropped: - prepare_expr.cc: the bad argument to count() was passed but had no {} placeholder, so users never saw what was actually passed. - statement_restrictions.cc: the unsupported multi-column relation was passed but the trailing colon had no {} placeholder. Signed-off-by: Yaniv Kaul <yaniv.kaul@scylladb.com>	2026-05-10 17:51:19 +03:00
Marcin Maliszkiewicz	fb55bef0ac	cql3/selection: require EXECUTE on UDA REDUCEFUNC at SELECT time selection::used_functions() pushed the UDA, its SFUNC and its FINALFUNC, but never the REDUCEFUNC. The reducefunc is invoked by the distributed aggregation path in service::mapreduce_service, so a user could cause it to run server-side without holding EXECUTE on it as long as the query took the mapreduce path. Also push agg.state_reduction_function so select_statement::check_access requires EXECUTE on it too. Fixes SCYLLADB-1756	2026-05-08 16:37:52 +02:00
Dario Mirovic	918130befd	utils: loading_cache: add insert() that is a no-op when caching is disabled When the cache is constructed with expiry == 0 the underlying storage is never instantiated and get_ptr() asserts via caching_enabled(). This is fine for callers that need a handle into the cache, but it makes get_ptr() unusable for write-only insertions on caches whose expiry is configurable at runtime (e.g. caches driven by a LiveUpdate config option that the operator may set to 0). Add a new insert(k, load) method on loading_cache that returns a future<> and is a no-op when caching is disabled, otherwise forwards to get_ptr(k, load) and discards the resulting handle. This completes the disabled-mode safety contract of the cache for the write side, mirroring the fallback that get() already provides for the read side. Switch authorized_prepared_statements_cache::insert() from get_ptr().discard_result() to the new insert(), which fixes the crash 'Assertion caching_enabled() failed' in authorized_prepared_statements_cache::insert() that occurs when permissions_validity_in_ms is set to 0 and a prepared statement is executed under authentication. Fixes SCYLLADB-1699	2026-04-30 16:51:23 +02:00
Marcin Maliszkiewicz	3df951bc9c	Merge 'audit: set audit_info for native-protocol BATCH messages' from Andrzej Jackowski Commit `16b56c2451` ("Audit: avoid dynamic_cast on a hot path") moved audit info into batch_statement via set_audit_info(), but only wired it for the CQL-text BATCH path (raw::batch_statement::prepare()). Native-protocol BATCH messages (opcode 0x0D), handled by process_batch_internal in transport/server.cc, construct a batch_statement without setting audit_info. This causes audit to silently skip the entire batch. Set audit_info on the batch_statement so these batches are audited. Fixes SCYLLADB-1652 No backport - bug introduced recently. Closes scylladb/scylladb#29570 * github.com:scylladb/scylladb: test/audit: add reproducer for native-protocol batch not being audited audit: set audit_info for native-protocol BATCH messages test/audit: rename internal test methods to avoid CI misdetection	2026-04-22 18:56:28 +02:00
Michał Jadwiszczak	f77c258c8e	strong_consistency: wire up metrics to operations Track write and read latency using latency_counter in coordinator::mutate() and coordinator::query(). Count commit_status_unknown errors in coordinator::mutate(). Count node and shard bounces in redirect_statement(), passing the coordinator's stats from both modification_statement and select_statement.	2026-04-22 08:59:59 +02:00
Tomasz Grabiec	cddde464ca	Merge 'service: Support adding/removing a datacenter with tablets by changing RF' from Aleksandra Martyniuk With this change, you can add or remove one or more DCs in a single ALTER KEYSPACE statement. It requires the keyspace to use a rack list replication factor. In the existing approach, all tablet replicas are rebuilt at once during an RF change. This is no longer the case. In global_topology_request::keyspace_rf_change the request is added to ongoing_rf_changes - a new column in the system.topology table. In a new column in system_schema.keyspaces - next_replication - we keep the target RF. In make_rf_change_plan, the load balancer schedules the necessary migrations, considering the load of nodes and other pending tablet transitions. Requests from ongoing_rf_changes are processed concurrently, independently from one another. In each request, racks are processed concurrently. No tablet replica will be removed until all required replicas are added. While adding replicas to each rack we always start with base tables and won't proceed with views until they are done (while removing - the other way around). The intermediary steps aren't reflected in the schema. When the RF change is finished: - in system_schema.keyspaces: - next_replication is cleared; - new keyspace properties are saved; - the request is removed from ongoing_rf_changes; - the request is marked as done in system.topology_requests. Until the request is done, DESCRIBE KEYSPACE shows the replication_v2. If a request hasn't started to remove replicas, it can be aborted using the task manager. system.topology_requests::error is set (but the request isn't marked as done) and next_replication = replication_v2. This is interpreted by the load balancer, which starts the rollback of the request. After the rollback is done, we set the relevant system.topology_requests entry as done (failed), clear the request id from system.topology::ongoing_rf_changes, and remove next_replication. Fixes: SCYLLADB-567.
No backport needed; new feature. Closes scylladb/scylladb#24421 * github.com:scylladb/scylladb: service: fix indentation docs: update documentation test: test multi RF changes service: tasks: allow aborting ongoing RF changes cql3: allow changing RF by more than one when adding or removing a DC service: handle multi_rf_change service: implement make_rf_change_plan service: add keyspace_rf_change_plan to migration_plan service: extend tablet_migration_info to handle rebuilds service: split update_node_load_on_migration service: rearrange keyspace_rf_change handler db: add columns to system_schema.keyspaces db: service: add ongoing_rf_changes to system.topology gms: add keyspace_multi_rf_change feature	2026-04-22 01:46:11 +02:00
Andrzej Jackowski	f5bb9b6282	audit: set audit_info for native-protocol BATCH messages Commit `16b56c2451` ("Audit: avoid dynamic_cast on a hot path") moved audit info into batch_statement via set_audit_info(), but only wired it for the CQL-text BATCH path (raw::batch_statement::prepare()). Native-protocol BATCH messages (opcode 0x0D), handled by process_batch_internal in transport/server.cc, construct a batch_statement without setting audit_info. This causes audit to silently skip the entire batch. Set audit_info on the batch_statement so these batches are audited. Fixes SCYLLADB-1652	2026-04-21 21:52:26 +02:00
Nadav Har'El	6165124fcc	Merge 'cql3: statement_restrictions: analyze during prepare time' from Avi Kivity The statement_restrictions code is responsible for analyzing the WHERE clause, deciding on the query plan (which index to use), and extracting the partition and clustering keys to use for the index. Currently, it suffers from repetition in making its decisions: there are 15 calls to expr::visit in statement_restrictions.cc, and 14 find_binop calls. This reduces to 2 visits (one nested in the other) and 6 find_binop calls. The analysis of binary operators is done once, then reused. The key data structure introduced is the predicate. While an expression takes inputs from the row evaluated, constants, and bind variables, and produces a boolean result, predicates ask which values for a column (or a number of columns) are needed to satisfy (part of) the WHERE clause. The WHERE clause is then expressed as a conjunction of such predicates. The analyzer uses the predicates to select the index, then uses the predicates to compute the partition and clustering keys. The refactoring is composed of these parts (but patches from different parts are interspersed): 1. an exhaustive regression test is added as the first commit, to ensure behavior doesn't change 2. move computation from query time to prepare time 3. introduce, gradually enrich, and use predicates to implement the statement_restrictions API Major refactoring, and no bugs fixed, so definitely not backporting. 
Closes scylladb/scylladb#29114 * github.com:scylladb/scylladb: cql3: statement_restrictions: replace has_eq_restriction_on_column with precomputed set cql3: statement_restrictions: replace multi_column_range_accumulator_builder with direct predicate iteration cql3: statement_restrictions: use predicate fields in build_get_clustering_bounds_fn cql3: statement_restrictions: remove extract_single_column_restrictions_for_column cql3: statement_restrictions: use predicate vectors in prepare_indexed_local cql3: statement_restrictions: use predicate vector size for clustering prefix length cql3: statement_restrictions: replace do_find_idx and is_supported_by with predicate-based versions cql3: statement_restrictions: remove expression-based has_supporting_index and index_supports_some_column cql3: statement_restrictions: replace multi-column and PK index support checks with predicate-based versions cql3: statement_restrictions: add predicate-based index support checking cql3: statement_restrictions: use pre-built single-column maps for index support checks cql3: statement_restrictions: build clustering-prefix restrictions incrementally cql3: statement_restrictions: build partition-range restrictions incrementally cql3: statement_restrictions: build clustering-key single-column restrictions map incrementally cql3: statement_restrictions: build partition-key single-column restrictions map incrementally cql3: statement_restrictions: build non-primary-key single-column restrictions map incrementally cql3: statement_restrictions: use tracked has_mc_clustering for _has_multi_column cql3: statement_restrictions: track has-token state incrementally cql3: statement_restrictions: track partition-key-empty state incrementally cql3: statement_restrictions: track first multi-column predicate incrementally cql3: statement_restrictions: track last clustering column incrementally cql3: statement_restrictions: track clustering-has-slice incrementally cql3: statement_restrictions: track 
has-multi-column-clustering incrementally cql3: statement_restrictions: track clustering-empty state incrementally cql3: statement_restrictions: replace restr bridge variable with pred.filter cql3: statement_restrictions: convert single-column branch to use predicate properties cql3: statement_restrictions: convert multi-column branch to use predicate properties cql3: statement_restrictions: convert constructor loop to iterate over predicates cql3: statement_restrictions: annotate predicates with operator properties cql3: statement_restrictions: annotate predicates with is_not_null and is_multi_column cql3: statement_restrictions: complete preparation early cql3: statement_restrictions: convert expressions to predicates without being directed at a specific column cql3: statement_restrictions: refine possible_lhs_values() function_call processing cql3: statement_restrictions: return nullptr for function solver if not token cql3: statement_restrictions: refine possible_lhs_values() subscript solving cql3: statement_restrictions: return nullptr from possible_lhs_values instead of on_internal_error cql3: statement_restrictions: convert possible_lhs_values into a solver cql3: statement_restrictions: split _where to boolean factors in preparation for predicates conversion cql3: statement_restrictions: refactor IS NOT NULL processing cql3: statement_restrictions: fold add_single_column_nonprimary_key_restriction() into its caller cql3: statement_restrictions: fold add_single_column_clustering_key_restriction() into its caller cql3: statement_restrictions: fold add_single_column_partition_key_restriction() into its caller cql3: statement_restrictions: fold add_token_partition_key_restriction() into its caller cql3: statement_restrictions: fold add_multi_column_clustering_key_restriction() into its caller cql3: statement_restrictions: avoid early return in add_multi_column_clustering_key_restrictions cql3: statement_restrictions: fold add_is_not_restriction() into its 
caller cql3: statement_restrictions: fold add_restriction() into its caller cql3: statement_restrictions: remove possible_partition_token_values() cql3: statement_restrictions: remove possible_column_values cql3: statement_restrictions: pass schema to possible_column_values() cql3: statement_restrictions: remove fallback path in solve() cql3: statement_restrictions: reorder possible_lhs_column parameters cql3: statement_restrictions: prepare solver for multi-column restrictions cql3: statement_restrictions: add solver for token restriction on index cql3: statement_restrictions: pre-analyze column in value_for() cql3: statement_restrictions: don't handle boolean constants in multi_column_range_accumulator_builder cql3: statement_restrictions: split range_from_raw_bounds into prepare phase and query phase cql3: statement_restrictions: adjust signature of range_from_raw_bounds cql3: statement_restrictions: split multi_column_range_accumulator into prepare-time and query-time phases cql3: statement_restrictions: make get_multi_column_clustering_bounds a builder cql3: statement_restrictions: multi-key clustering restrictions one layer deeper cql3: statement_restrictions: push multi-column post-processing into get_multi_column_clustering_bounds() cql3: statement_restrictions: pre-analyze single-column clustering key restrictions cql3: statement_restrictions: wrap value_for_index_partition_key() cql3: statement_restrictions: hide value_for() cql3: statement_restrictions: push down clustering prefix wrapper one level cql3: statement_restrictions: wrap functions that return clustering ranges cql3: statement_restrictions: do not pass view schema back and forth cql3: statement_restrictions: pre-analyze token range restrictions cql3: statement_restrictions: pre-analyze partition key columns cql3: statement_restrictions: do not collect subscripted partition key columns cql3: statement_restrictions: split _partition_range_restrictions into three cases cql3: 
statement_restrictions: move value_list, value_set to header file cql3: statement_restrictions: wrap get_partition_key_ranges cql3: statement_restrictions: prepare statement_restrictions for capturing `this` test: statement_restrictions: add index_selection regression test	2026-04-21 15:44:06 +03:00
Łukasz Paszkowski	d18eb9479f	cql/statement: Create keyspace_metadata with correct initial_tablets count In `ks_prop_defs::as_ks_metadata(...)`, the default initial tablets count is set to 0 only when tablets are enabled and the replication strategy is NetworkTopologyStrategy. This effectively sets _uses_tablets = false in abstract_replication_strategy for the remaining strategies when no `tablets = {...}` options are specified. As a consequence, it is possible to create vnode-based keyspaces even when tablets are enforced with `tablets_mode_for_new_keyspaces`. The patch sets the default initial tablets count to zero regardless of the chosen replication strategy. Each replication strategy then validates the options and raises a configuration exception when tablets are not supported. All tests are altered in the following way: + whenever it was correct, SimpleStrategy was replaced with NetworkTopologyStrategy + otherwise, tablets were explicitly disabled with ` AND tablets = {'enabled': false}` Fixes https://github.com/scylladb/scylladb/issues/25340 Closes scylladb/scylladb#25342	2026-04-20 17:57:38 +03:00
Avi Kivity	d584bd7358	cql3: statement_restrictions: replace has_eq_restriction_on_column with precomputed set has_eq_restriction_on_column() walked expression trees at prepare time to find binary_operators with op==EQ that mention a given column on the LHS. Its only caller is ORDER BY validation in select_statement, which checks that clustering columns without an explicit ordering have an EQ restriction. Replace the 50-line expression-walking free function with a precomputed unordered_set<const column_definition*> (_columns_with_eq) populated during the main predicate loop in analyze_statement_restrictions. For single-column EQ predicates the column is taken from on_column; for multi-column EQ like (ck1, ck2) = (1, 2), all columns in on_clustering_key_prefix are included. The member function becomes a single set::contains() call.	2026-04-19 20:57:09 +03:00
Avi Kivity	b7f86eaabc	cql3: statement_restrictions: replace multi_column_range_accumulator_builder with direct predicate iteration build_get_multi_column_clustering_bounds_fn() used expr::visit() to dispatch each restriction through a 15-handler visitor struct. Only the binary_operator handler did real work; the conjunction handler just recursed, and the remaining 13 handlers were dead-code on_internal_error calls (the filter expression of each predicate is always a binary_operator). Replace the visitor with a loop over predicates that does as<binary_operator>(pred.filter) directly, building the same query-time lambda inline. Promote intersect_all() and process_in_values() from static methods of the deleted struct to free functions in the anonymous namespace -- they are still called from the query-time lambda.	2026-04-19 20:57:09 +03:00
Avi Kivity	ece9af229d	cql3: statement_restrictions: use predicate fields in build_get_clustering_bounds_fn Replace find_binop(..., is_multi_column) with pred.is_multi_column in build_get_clustering_bounds_fn() and add_clustering_restrictions_to_idx_ck_prefix(). Replace is_clustering_order(binop) with pred.order == comparison_order::clustering and iterate predicates directly instead of extracting filter expressions. Remove the now-dead is_multi_column() free function.	2026-04-19 20:57:09 +03:00
Avi Kivity	72da1207d7	cql3: statement_restrictions: remove extract_single_column_restrictions_for_column The previous commit made prepare_indexed_local() use the pre-built predicate vectors instead of calling extract_single_column_restrictions_for_column(). That was the last production caller. Remove the function definition (65 lines of expression-walking visitor) and its declaration/doc-comment from the header. Replace the unit test (expression_extract_column_restrictions) which directly called the removed function with synthetic column_definitions, with per_column_restriction_routing which exercises the same routing logic through the public analyze_statement_restrictions() API. The new test verifies not just factor counts but the exact (column_name, oper_t) pairs in each per-column entry, catching misrouted restrictions that a count-only check would miss.	2026-04-19 20:57:09 +03:00
Avi Kivity	b093477cf7	cql3: statement_restrictions: use predicate vectors in prepare_indexed_local Replace the extract_single_column_restrictions_for_column(_where, ...) call in prepare_indexed_local() with a direct lookup in the pre-built predicate vectors. The old code walked the entire WHERE expression tree to extract binary operators mentioning the indexed column, wrapped them in a conjunction, translated column definitions to the index schema, then called to_predicate_on_column() which walked the expression again to convert back to predicates. The new code selects the appropriate predicate vector map (PK, CK, or non-PK) based on the indexed column's kind, looks up the column's predicates directly, applies replace_column_def to each, and folds them with make_conjunction -- producing the same result without any expression tree walks. This removes the last production caller of extract_single_column_restrictions_for_column (unit tests in statement_restrictions_test.cc still exercise it).	2026-04-19 20:57:09 +03:00
Avi Kivity	a725e39218	cql3: statement_restrictions: use predicate vector size for clustering prefix length Replace the body of num_clustering_prefix_columns_that_need_not_be_filtered() with a single return of _clustering_prefix_restrictions.size(). The old implementation called get_single_column_restrictions_map() to rebuild a per-column map from the clustering expression tree, then iterated it in schema order counting columns until it hit a gap, a needs-filtering predicate, or a slice. But _clustering_prefix_restrictions is already built with exactly that same logic during the constructor (lines 1234-1248): it iterates CK columns in schema order, appending predicates until it encounters a gap in column_id, a predicate that needs_filtering, or a slice -- at which point it stops. So the vector's size is, by construction, the answer to the same question the old code was re-deriving at query time. This makes four helper functions dead code: - get_single_column_restrictions_map(): walked the expression tree to build a map<column_definition*, expression> of per-column restrictions. Was a ~15-line function that called get_sorted_column_defs() and extract_single_column_restrictions_for_column() for each column. - get_the_only_column(): extracted the single column_value from a restriction expression, asserting it was single-column. Called by the old loop body. - is_single_column_restriction(): thin wrapper around get_single_column_restriction_column(). - get_single_column_restriction_column(): ~25-line function that walked an expression tree with for_each_expression<column_value> to determine whether all column_value nodes refer to the same column. Called by the above two. Remove all four functions and their forward declarations (-95 lines).	2026-04-19 20:57:08 +03:00
Avi Kivity	68c2e292ac	cql3: statement_restrictions: replace do_find_idx and is_supported_by with predicate-based versions Convert do_find_idx() from a member function that walks expression trees via index_restrictions()/for_each_expression/extract_single_column_restrictions to a static free function that iterates index_search_group spans using are_predicates_supported_by(). Convert calculate_column_defs_for_filtering_and_erase_restrictions_used_for_index() to use predicate vectors instead of expression-based is_supported_by(). Remove now-dead code: is_supported_by(), is_supported_by_helper(), score() member function, and do_find_idx() member function.	2026-04-19 20:57:08 +03:00
Avi Kivity	c42397e995	cql3: statement_restrictions: remove expression-based has_supporting_index and index_supports_some_column These functions are no longer called now that all index support checks in the constructor use predicate-based alternatives. The expression-based is_supported_by and is_supported_by_helper are still needed by choose_idx() and calculate_column_defs_for_filtering_and_erase_restrictions_used_for_index().	2026-04-19 20:57:08 +03:00
Avi Kivity	1aafe0708a	cql3: statement_restrictions: replace multi-column and PK index support checks with predicate-based versions Replace clustering_columns_restrictions_have_supporting_index(), multi_column_clustering_restrictions_are_supported_by(), get_clustering_slice(), and partition_key_restrictions_have_supporting_index() with predicate-based equivalents that use the already-accumulated mc_ck_preds and sc_pk_pred_vectors locals. The new multi_column_predicates_have_supporting_index() checks each multi-column predicate's columns list directly against indexes, avoiding expression tree walks through find_in_expression and bounds_slice.	2026-04-19 20:57:08 +03:00
Avi Kivity	fa6f239cc7	cql3: statement_restrictions: add predicate-based index support checking Add `op` and `is_subscript` fields to `struct predicate` and populate them in all predicate creation sites in `to_predicates()`. These fields record the binary operator and whether the LHS is a subscript (map element access), which are the two pieces of information needed to query index support. Add `is_predicate_supported_by()` which mirrors `is_supported_by_helper()` but operates on a single predicate's fields instead of walking the expression tree. Add a predicate-vector overload of `index_supports_some_column()` and use it in the constructor to replace expression-based index support checks for single-column partition key, clustering key, and non-primary-key restrictions. The multi-column clustering key case still uses the existing expression-based path.	2026-04-19 20:57:08 +03:00
Avi Kivity	25ba3bd649	cql3: statement_restrictions: use pre-built single-column maps for index support checks Replace index_supports_some_column(expression, ...) with index_supports_some_column(single_column_restrictions_map, ...) to eliminate get_single_column_restrictions_map() tree walks when checking index support. The three call sites now use the maps already built incrementally in the constructor loop: _single_column_nonprimary_key_restrictions, _single_column_clustering_key_restrictions, and _single_column_partition_key_restrictions. Also replace contains_multi_column_restriction() tree walk in clustering_columns_restrictions_have_supporting_index() with _has_multi_column.	2026-04-19 20:57:08 +03:00
Avi Kivity	fab90224b3	cql3: statement_restrictions: build clustering-prefix restrictions incrementally Replace the extract_clustering_prefix_restrictions() tree walk with incremental collection during the main loop. Two new locals -- mc_ck_preds and sc_ck_preds -- accumulate multi-column and single-column clustering key predicates respectively. A short post-loop block computes the longest contiguous prefix from sc_ck_preds (or uses mc_ck_preds directly for multi-column), replacing the removed function. Also remove the now-unused to_predicate_on_clustering_key_prefix(), with_current_binary_operator() helper, and the visitor_with_binary_operator_context concept.	2026-04-19 20:57:08 +03:00
Avi Kivity	3bd308986a	cql3: statement_restrictions: build partition-range restrictions incrementally Replace the extract_partition_range() tree walk with incremental collection during the main loop. Two new locals before the loop -- token_pred and pk_range_preds -- accumulate token and single-column EQ/IN partition key predicates respectively. A short post-loop block materializes _partition_range_restrictions from these locals, replacing the removed function. This removes the last tree walk over partition-key restrictions.	2026-04-19 20:57:08 +03:00
Avi Kivity	db28411548	cql3: statement_restrictions: build clustering-key single-column restrictions map incrementally Instead of accumulating all clustering-key restrictions into a conjunction tree and then decomposing it by column via get_single_column_restrictions_map() post-loop, build the per-column map incrementally as each single-column clustering-key predicate is processed. The post-loop guard (!has_mc_clustering) is no longer needed: multi-column predicates go through the is_multi_column branch and never insert into this map, and mixing multi with single-column is rejected with an exception. This eliminates a post-loop tree walk over _clustering_columns_restrictions.	2026-04-19 20:57:08 +03:00
Avi Kivity	a4608804d8	cql3: statement_restrictions: build partition-key single-column restrictions map incrementally Instead of accumulating all partition-key restrictions into a conjunction tree and then decomposing it by column via get_single_column_restrictions_map() post-loop, build the per-column map incrementally as each single-column partition-key predicate is processed. The post-loop guard (!has_token_restrictions()) is no longer needed: token predicates go through the on_partition_key_token branch and never insert into this map, and mixing token with non-token is rejected with an exception. This eliminates a post-loop tree walk over _partition_key_restrictions.	2026-04-19 20:57:08 +03:00
Avi Kivity	e9b16a11ba	cql3: statement_restrictions: build non-primary-key single-column restrictions map incrementally Instead of accumulating all non-primary-key restrictions into a conjunction tree and then decomposing it by column via get_single_column_restrictions_map() post-loop, build the per-column map incrementally as each non-primary-key predicate is processed. This eliminates a post-loop tree walk over _nonprimary_key_restrictions.	2026-04-19 20:57:08 +03:00
Avi Kivity	701366a8d1	cql3: statement_restrictions: use tracked has_mc_clustering for _has_multi_column Replace the two post-loop find_binop(_clustering_columns_restrictions, is_multi_column) tree walks and the contains_multi_column_restriction() tree walk with the already-tracked local has_mc_clustering. The redundant second assignment inside the _check_indexes block is removed entirely.	2026-04-19 20:57:08 +03:00
Avi Kivity	da438507d0	cql3: statement_restrictions: track has-token state incrementally Replace the two in-loop calls to has_token_restrictions() (which walks the _partition_key_restrictions expression tree looking for token function calls) with a local bool has_token, set to true when a token predicate is processed. The member function is retained since it's used outside the constructor. With this change, the constructor loop's non-error control flow performs zero expression tree scanning. The only remaining tree walks are on error paths (get_sorted_column_defs, get_columns_in_commons for formatting exception messages) and structural (make_conjunction for building accumulated expressions).	2026-04-19 20:57:07 +03:00
Avi Kivity	1344278a19	cql3: statement_restrictions: track partition-key-empty state incrementally Replace the in-loop call to partition_key_restrictions_is_empty() (which walks the _partition_key_restrictions expression tree via is_empty_restriction()) with a local bool pk_is_empty, set to false at the two sites where partition key restrictions are added. The member function is retained since it's used outside the constructor.	2026-04-19 20:57:07 +03:00
Avi Kivity	14812ea1e0	cql3: statement_restrictions: track first multi-column predicate incrementally Replace find_in_expression<binary_operator>(_clustering_columns_restrictions, always_true), which walks the accumulated expression tree to find the first binary_operator, with a tracked pointer first_mc_pred set when the first multi-column predicate is added. This eliminates the tree scan, the null check, and the is_lower_bound/is_upper_bound lambdas, replacing them with direct predicate field accesses: first_mc_pred->order, first_mc_pred->is_lower_bound, first_mc_pred->is_upper_bound, and first_mc_pred->filter for error messages.	2026-04-19 20:57:07 +03:00
Avi Kivity	ef005c10ba	cql3: statement_restrictions: track last clustering column incrementally Replace get_last_column_def(_clustering_columns_restrictions), which walks the entire accumulated expression tree to collect and sort all column definitions, with a local pointer ck_last_column that tracks the column with the highest schema position as single-column clustering restrictions are added.	2026-04-19 20:57:07 +03:00
Avi Kivity	88bd5ea1b7	cql3: statement_restrictions: track clustering-has-slice incrementally Replace has_slice(_clustering_columns_restrictions), which walks the accumulated expression tree looking for slice operators, with a local bool ck_has_slice set when any clustering predicate with is_slice is added. Updated at all three clustering insertion points: multi-column first assignment, multi-column slice conjunction, and single-column conjunction.	2026-04-19 20:57:07 +03:00
Avi Kivity	1071c39f17	cql3: statement_restrictions: track has-multi-column-clustering incrementally Replace find_binop(_clustering_columns_restrictions, is_tuple_constructor), which walks the accumulated expression tree looking for multi-column restrictions, with a local bool has_mc_clustering set when a multi-column predicate is first added. This serves both the multi-column branch (checking existing restrictions are also multi-column) and the single-column branch (checking no multi-column restrictions exist).	2026-04-19 20:57:07 +03:00
Avi Kivity	aa6a0ad326	cql3: statement_restrictions: track clustering-empty state incrementally Replace is_empty_restriction(_clustering_columns_restrictions), which recursively walks the accumulated expression tree, with a local bool ck_is_empty that is set to false when a clustering restriction is first added. Updated at both insertion points: multi-column first assignment and single-column make_conjunction.	2026-04-19 20:57:07 +03:00
Avi Kivity	d4ff613c0a	cql3: statement_restrictions: replace restr bridge variable with pred.filter The constructor loop no longer needs to extract a binary_operator reference from each predicate. All remaining uses (make_conjunction, get_columns_in_commons, assignment to accumulated restriction members, _where.push_back, and error formatting) accept expression directly, which is what pred.filter already is. This eliminates the unnecessary as<binary_operator> cast at the top of the loop.	2026-04-19 20:57:07 +03:00
Avi Kivity	44b18f3399	cql3: statement_restrictions: convert single-column branch to use predicate properties In the single-column partition-key and clustering-key sub-branches, replace direct binary_operator field inspections with pre-computed predicate booleans: !pred.equality && !pred.is_in instead of restr.op != EQ && restr.op != IN, pred.is_in instead of find(restr, IN), and pred.is_slice instead of has_slice(restr). Also fix a leftover restr.order in the multi-column branch error message.	2026-04-19 20:57:07 +03:00
Avi Kivity	b0c5eed384	cql3: statement_restrictions: convert multi-column branch to use predicate properties Replace direct operator comparisons with predicate boolean fields: pred.equality, pred.is_in, pred.is_slice, pred.is_lower_bound, pred.is_upper_bound, and pred.order.	2026-04-19 20:57:07 +03:00
Avi Kivity	afd68187ea	cql3: statement_restrictions: convert constructor loop to iterate over predicates Convert the constructor loop to first build predicates from the prepared where clause, then iterate over the predicates. The IS_NOT branch now uses pred.is_not_null_single_column and pred.on instead of inspecting the expression directly. The branch conditions for multi-column (pred.is_multi_column), token (on_partition_key_token), and single-column (on_column) now use predicate properties instead of expression helpers. Remove extract_column_from_is_not_null_restriction() which is no longer needed.	2026-04-19 20:57:07 +03:00
Avi Kivity	440d9f2d82	cql3: statement_restrictions: annotate predicates with operator properties Add boolean fields to struct predicate that describe the operator: equality, is_in, is_slice, is_upper_bound, is_lower_bound, and comparison_order. Populate them in all to_predicates() return sites. These fields will allow the constructor loop to inspect predicate properties directly instead of re-examining the expression.	2026-04-19 20:57:07 +03:00

4191 Commits