Merge 'cql3: statement_restrictions: analyze during prepare time' from Avi Kivity

The statement_restrictions code is responsible for analyzing the WHERE
clause, deciding on the query plan (which index to use), and extracting
the partition and clustering keys to use for the index.

Currently, it makes its decisions repetitively: statement_restrictions.cc contains
15 calls to expr::visit and 14 calls to find_binop. This series reduces them to
2 visits (one nested in the other) and 6 find_binop calls. Binary operators are
analyzed once, and the result is reused.

The key data structure introduced is the predicate. While an expression
takes inputs from the row evaluated, constants, and bind variables, and
produces a boolean result, predicates ask which values for a column (or
a number of columns) are needed to satisfy (part of) the WHERE clause.
The WHERE clause is then expressed as a conjunction of such predicates.
The analyzer uses the predicates to select the index, then uses the predicates
to compute the partition and clustering keys.
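
To make the expression/predicate distinction concrete, here is a minimal sketch. It is not the ScyllaDB code: `toy_predicate`, `make_eq`, `make_in`, `solve_column`, and the integer-only `value_set` are hypothetical simplifications invented for illustration. Each predicate carries both views of the same restriction: a `filter` that answers true or false for one row, and a `solve_for` that returns the column values satisfying it. The WHERE clause is a vector of such predicates, treated as a conjunction.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <optional>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, simplified stand-ins for illustration; not the ScyllaDB types.
using row = std::map<std::string, int>;          // column name -> value
using bind_values = std::map<std::string, int>;  // stands in for query_options
using value_set = std::set<int>;                 // stands in for restrictions::value_set

struct toy_predicate {
    std::string column;  // the column this predicate can be solved for
    // Expression view: evaluates one row to true or false.
    std::function<bool(const row&, const bind_values&)> filter;
    // Solver view: returns the column values that satisfy the predicate.
    std::function<value_set(const bind_values&)> solve_for;
};

// Builds the predicate "column = :bind_name".
toy_predicate make_eq(const std::string& column, const std::string& bind_name) {
    return toy_predicate{
        column,
        [=](const row& r, const bind_values& b) { return r.at(column) == b.at(bind_name); },
        [=](const bind_values& b) { return value_set{b.at(bind_name)}; },
    };
}

// Builds the predicate "column IN (values...)".
toy_predicate make_in(const std::string& column, const value_set& values) {
    return toy_predicate{
        column,
        [=](const row& r, const bind_values&) { return values.count(r.at(column)) > 0; },
        [=](const bind_values&) { return values; },
    };
}

// Treats `where` as a conjunction and intersects the solutions of every
// predicate on `column`. Returns nullopt if the column is unrestricted.
std::optional<value_set> solve_column(const std::vector<toy_predicate>& where,
                                      const std::string& column,
                                      const bind_values& b) {
    std::optional<value_set> result;
    for (const auto& p : where) {
        if (p.column != column || !p.solve_for) {
            continue;  // not about this column, or can only be used as a filter
        }
        value_set values = p.solve_for(b);
        if (!result) {
            result = std::move(values);
            continue;
        }
        value_set both;
        for (int v : *result) {
            if (values.count(v)) {
                both.insert(v);
            }
        }
        result = std::move(both);
    }
    return result;
}
```

With `WHERE pk = :v AND pk IN (1, 2, 3)` modelled as `{make_eq("pk", "v"), make_in("pk", {1, 2, 3})}`, calling `solve_column` for `pk` with `:v` bound to 2 intersects the two solutions, while a column with no predicates comes back as `std::nullopt`. The real `predicate` struct additionally records what it is solved on (row, column, token, or clustering prefix) and operator properties such as `is_in` and `is_slice`.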

The refactoring is composed of these parts (but patches from different parts
are interspersed):

1. an exhaustive regression test is added as the first commit, to ensure behavior doesn't change
2. move computation from query time to prepare time
3. introduce, gradually enrich, and use predicates to implement the statement_restrictions API
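
Part 2, moving computation from query time to prepare time, can be sketched as follows. This is an illustrative toy under stated assumptions, not the actual implementation: `toy_term`, `toy_restrictions`, `build_get_key_values_fn`, and `analysis_passes` are hypothetical names, loosely modelled on the `build_*_fn()` members and `get_*_fn_t` aliases that appear in the header diff. The constructor (prepare time) scans the WHERE clause once and stores a closure; each execution (query time) only looks up bind values.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for query_options: bind variable name -> value.
using bind_values = std::map<std::string, int>;

// Hypothetical WHERE-clause term meaning "column = :bind_name".
struct toy_term {
    std::string column;
    std::string bind_name;
};

class toy_restrictions {
public:
    using get_values_fn_t = std::function<std::vector<int>(const bind_values&)>;

    // Prepare time: analyze the WHERE clause once and keep the resulting closure.
    toy_restrictions(std::vector<toy_term> where, const std::string& key_column)
        : _where(std::move(where))
        , _get_key_values_fn(build_get_key_values_fn(key_column)) {}

    // Query time: no analysis, only bind-variable lookups.
    std::vector<int> get_key_values(const bind_values& b) const {
        return _get_key_values_fn(b);
    }

    int analysis_passes() const { return _analysis_passes; }

private:
    get_values_fn_t build_get_key_values_fn(const std::string& key_column) {
        ++_analysis_passes;  // counts how often the WHERE clause is scanned
        std::vector<std::string> binds;
        for (const auto& t : _where) {
            if (t.column == key_column) {
                binds.push_back(t.bind_name);
            }
        }
        // The closure owns the analysis result, so it needs nothing else later.
        return [binds = std::move(binds)](const bind_values& b) {
            std::vector<int> out;
            for (const auto& name : binds) {
                out.push_back(b.at(name));
            }
            return out;
        };
    }

    std::vector<toy_term> _where;
    int _analysis_passes = 0;
    get_values_fn_t _get_key_values_fn;  // declared last: built from the members above
};
```

Executing the same prepared object twice with different bind values calls `get_key_values` twice while `analysis_passes()` stays at 1. This toy closure captures its data by value; the header in this series instead notes that the real class captures `this`, which is why it deletes its copy operations.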

Major refactoring, and no bugs fixed, so definitely not backporting.

Closes scylladb/scylladb#29114

* github.com:scylladb/scylladb:
  cql3: statement_restrictions: replace has_eq_restriction_on_column with precomputed set
  cql3: statement_restrictions: replace multi_column_range_accumulator_builder with direct predicate iteration
  cql3: statement_restrictions: use predicate fields in build_get_clustering_bounds_fn
  cql3: statement_restrictions: remove extract_single_column_restrictions_for_column
  cql3: statement_restrictions: use predicate vectors in prepare_indexed_local
  cql3: statement_restrictions: use predicate vector size for clustering prefix length
  cql3: statement_restrictions: replace do_find_idx and is_supported_by with predicate-based versions
  cql3: statement_restrictions: remove expression-based has_supporting_index and index_supports_some_column
  cql3: statement_restrictions: replace multi-column and PK index support checks with predicate-based versions
  cql3: statement_restrictions: add predicate-based index support checking
  cql3: statement_restrictions: use pre-built single-column maps for index support checks
  cql3: statement_restrictions: build clustering-prefix restrictions incrementally
  cql3: statement_restrictions: build partition-range restrictions incrementally
  cql3: statement_restrictions: build clustering-key single-column restrictions map incrementally
  cql3: statement_restrictions: build partition-key single-column restrictions map incrementally
  cql3: statement_restrictions: build non-primary-key single-column restrictions map incrementally
  cql3: statement_restrictions: use tracked has_mc_clustering for _has_multi_column
  cql3: statement_restrictions: track has-token state incrementally
  cql3: statement_restrictions: track partition-key-empty state incrementally
  cql3: statement_restrictions: track first multi-column predicate incrementally
  cql3: statement_restrictions: track last clustering column incrementally
  cql3: statement_restrictions: track clustering-has-slice incrementally
  cql3: statement_restrictions: track has-multi-column-clustering incrementally
  cql3: statement_restrictions: track clustering-empty state incrementally
  cql3: statement_restrictions: replace restr bridge variable with pred.filter
  cql3: statement_restrictions: convert single-column branch to use predicate properties
  cql3: statement_restrictions: convert multi-column branch to use predicate properties
  cql3: statement_restrictions: convert constructor loop to iterate over predicates
  cql3: statement_restrictions: annotate predicates with operator properties
  cql3: statement_restrictions: annotate predicates with is_not_null and is_multi_column
  cql3: statement_restrictions: complete preparation early
  cql3: statement_restrictions: convert expressions to predicates without being directed at a specific column
  cql3: statement_restrictions: refine possible_lhs_values() function_call processing
  cql3: statement_restrictions: return nullptr for function solver if not token
  cql3: statement_restrictions: refine possible_lhs_values() subscript solving
  cql3: statement_restrictions: return nullptr from possible_lhs_values instead of on_internal_error
  cql3: statement_restrictions: convert possible_lhs_values into a solver
  cql3: statement_restrictions: split _where to boolean factors in preparation for predicates conversion
  cql3: statement_restrictions: refactor IS NOT NULL processing
  cql3: statement_restrictions: fold add_single_column_nonprimary_key_restriction() into its caller
  cql3: statement_restrictions: fold add_single_column_clustering_key_restriction() into its caller
  cql3: statement_restrictions: fold add_single_column_partition_key_restriction() into its caller
  cql3: statement_restrictions: fold add_token_partition_key_restriction() into its caller
  cql3: statement_restrictions: fold add_multi_column_clustering_key_restriction() into its caller
  cql3: statement_restrictions: avoid early return in add_multi_column_clustering_key_restrictions
  cql3: statement_restrictions: fold add_is_not_restriction() into its caller
  cql3: statement_restrictions: fold add_restriction() into its caller
  cql3: statement_restrictions: remove possible_partition_token_values()
  cql3: statement_restrictions: remove possible_column_values
  cql3: statement_restrictions: pass schema to possible_column_values()
  cql3: statement_restrictions: remove fallback path in solve()
  cql3: statement_restrictions: reorder possible_lhs_column parameters
  cql3: statement_restrictions: prepare solver for multi-column restrictions
  cql3: statement_restrictions: add solver for token restriction on index
  cql3: statement_restrictions: pre-analyze column in value_for()
  cql3: statement_restrictions: don't handle boolean constants in multi_column_range_accumulator_builder
  cql3: statement_restrictions: split range_from_raw_bounds into prepare phase and query phase
  cql3: statement_restrictions: adjust signature of range_from_raw_bounds
  cql3: statement_restrictions: split multi_column_range_accumulator into prepare-time and query-time phases
  cql3: statement_restrictions: make get_multi_column_clustering_bounds a builder
  cql3: statement_restrictions: multi-key clustering restrictions one layer deeper
  cql3: statement_restrictions: push multi-column post-processing into get_multi_column_clustering_bounds()
  cql3: statement_restrictions: pre-analyze single-column clustering key restrictions
  cql3: statement_restrictions: wrap value_for_index_partition_key()
  cql3: statement_restrictions: hide value_for()
  cql3: statement_restrictions: push down clustering prefix wrapper one level
  cql3: statement_restrictions: wrap functions that return clustering ranges
  cql3: statement_restrictions: do not pass view schema back and forth
  cql3: statement_restrictions: pre-analyze token range restrictions
  cql3: statement_restrictions: pre-analyze partition key columns
  cql3: statement_restrictions: do not collect subscripted partition key columns
  cql3: statement_restrictions: split _partition_range_restrictions into three cases
  cql3: statement_restrictions: move value_list, value_set to header file
  cql3: statement_restrictions: wrap get_partition_key_ranges
  cql3: statement_restrictions: prepare statement_restrictions for capturing `this`
  test: statement_restrictions: add index_selection regression test
Nadav Har'El
2026-04-21 15:44:06 +03:00
12 changed files with 2264 additions and 1573 deletions

File diff suppressed because it is too large

View File

@@ -23,15 +23,113 @@ namespace cql3 {
namespace restrictions {
/// A set of discrete values.
using value_list = std::vector<managed_bytes>; // Sorted and deduped using value comparator.
/// General set of values. Empty set and single-element sets are always value_list. interval is
/// never singular and never has start > end. Universal set is an interval with both bounds null.
using value_set = std::variant<value_list, interval<managed_bytes>>;
// For some boolean expression (say, (X = 3) = TRUE), this represents a function that solves for X
// (here, it would return 3). The expression is obtained by equating some factors of the WHERE
// clause to TRUE.
using solve_for_t = std::function<value_set (const query_options&)>;
struct on_row {
bool operator==(const on_row&) const = default;
};
struct on_column {
const column_definition* column;
bool operator==(const on_column&) const = default;
};
// Placeholder type indicating we're solving for the partition key token.
struct on_partition_key_token {
const ::schema* schema;
bool operator==(const on_partition_key_token&) const = default;
};
struct on_clustering_key_prefix {
std::vector<const column_definition*> columns;
bool operator==(const on_clustering_key_prefix&) const = default;
};
// A predicate on a column or a combination of columns. The WHERE clause analyzer
// will attempt to convert predicates (that return true or false for a particular row)
// to solvers (that return the set of column values that satisfy the predicate) when possible.
struct predicate {
// A function that returns the set of values that satisfy the filter. Can be unset,
// in which case the filter must be interpreted.
solve_for_t solve_for;
// The original filter for this column.
expr::expression filter;
// What column the predicate can be solved for
std::variant<
on_row, // cannot determine, so predicate is on entire row
on_column, // solving for a single column: e.g. c1 = 3
on_partition_key_token, // solving for the token, e.g. token(pk1, pk2) >= :var
on_clustering_key_prefix // solving for a clustering key prefix: e.g. (ck1, ck2) >= (3, 4)
> on;
// Whether the returned value_set will resolve to a single value.
bool is_singleton = false;
// Whether the returned value_set follows CQL comparison semantics
bool comparable = true;
bool is_multi_column = false;
bool is_not_null_single_column = false;
bool equality = false; // operator is EQ
bool is_in = false; // operator is IN
bool is_slice = false; // operator is LT/LTE/GT/GTE
bool is_upper_bound = false; // operator is LT/LTE
bool is_lower_bound = false; // operator is GT/GTE
expr::comparison_order order = expr::comparison_order::cql;
std::optional<expr::oper_t> op; // the binary operator, if any
bool is_subscript = false; // whether the LHS is a subscript (map element access)
};
///In some cases checking if columns have indexes is undesired or even
///impossible, because e.g. the query runs on a pseudo-table, which does not
///have an index-manager, or even a table object.
using check_indexes = bool_class<class check_indexes_tag>;
// A function that returns the partition key ranges for a query. It is the solver of
// WHERE clause fragments such as WHERE token(pk) > 1 or WHERE pk1 IN :list1 AND pk2 IN :list2.
using get_partition_key_ranges_fn_t = std::function<dht::partition_range_vector (const query_options&)>;
// A function that returns the clustering key ranges for a query. It is the solver of
// WHERE clause fragments such as WHERE ck > 1 or WHERE (ck1, ck2) > (1, 2).
using get_clustering_bounds_fn_t = std::function<std::vector<query::clustering_range> (const query_options& options)>;
// A function that returns a singleton value, usable for a key (e.g. bytes_opt)
using get_singleton_value_fn_t = std::function<bytes_opt (const query_options&)>;
struct no_partition_range_restrictions {
};
struct token_range_restrictions {
predicate token_restrictions;
};
struct single_column_partition_range_restrictions {
std::vector<predicate> per_column_restrictions;
};
using partition_range_restrictions = std::variant<
no_partition_range_restrictions,
token_range_restrictions,
single_column_partition_range_restrictions>;
// A map of per-column predicate vectors, ordered by schema position.
using single_column_predicate_vectors = std::map<const column_definition*, std::vector<predicate>, expr::schema_pos_column_definition_comparator>;
/**
* The restrictions corresponding to the relations specified on the where-clause of CQL query.
*/
class statement_restrictions {
struct private_tag {}; // Tag for private constructor
private:
schema_ptr _schema;
@@ -81,7 +179,7 @@ private:
bool _has_queriable_regular_index = false, _has_queriable_pk_index = false, _has_queriable_ck_index = false;
bool _has_multi_column; ///< True iff _clustering_columns_restrictions has a multi-column restriction.
std::optional<expr::expression> _where; ///< The entire WHERE clause.
std::vector<expr::expression> _where; ///< The entire WHERE clause (factorized).
/// Parts of _where defining the clustering slice.
///
@@ -96,7 +194,7 @@ private:
/// 4.4 elements other than the last have only EQ or IN atoms
/// 4.5 the last element has only EQ, IN, or is_slice() atoms
/// 5. if multi-column, then each element is a binary_operator
std::vector<expr::expression> _clustering_prefix_restrictions;
std::vector<predicate> _clustering_prefix_restrictions;
/// Like _clustering_prefix_restrictions, but for the indexing table (if this is an index-reading statement).
/// Recall that the index-table CK is (token, PK, CK) of the base table for a global index and (indexed column,
@@ -105,7 +203,7 @@ private:
/// Elements are conjunctions of single-column binary operators with the same LHS.
/// Element order follows the indexing-table clustering key.
/// In case of a global index the first element's (token restriction) RHS is a dummy value, it is filled later.
std::optional<std::vector<expr::expression>> _idx_tbl_ck_prefix;
std::optional<std::vector<predicate>> _idx_tbl_ck_prefix;
/// Parts of _where defining the partition range.
///
@@ -113,16 +211,25 @@ private:
/// binary_operators on token. If single-column restrictions define the partition range, each element holds
/// restrictions for one partition column. Each partition column has a corresponding element, but the elements
/// are in arbitrary order.
std::vector<expr::expression> _partition_range_restrictions;
partition_range_restrictions _partition_range_restrictions;
bool _partition_range_is_simple; ///< False iff _partition_range_restrictions imply a Cartesian product.
check_indexes _check_indexes = check_indexes::yes;
/// Columns that appear on the LHS of an EQ restriction (not IN).
/// For multi-column EQ like (ck1, ck2) = (1, 2), all columns in the tuple are included.
std::unordered_set<const column_definition*> _columns_with_eq;
std::vector<const column_definition*> _column_defs_for_filtering;
schema_ptr _view_schema;
std::optional<secondary_index::index> _idx_opt;
expr::expression _idx_restrictions = expr::conjunction({});
get_partition_key_ranges_fn_t _get_partition_key_ranges_fn;
get_clustering_bounds_fn_t _get_clustering_bounds_fn;
get_clustering_bounds_fn_t _get_global_index_clustering_ranges_fn;
get_clustering_bounds_fn_t _get_global_index_token_clustering_ranges_fn;
get_clustering_bounds_fn_t _get_local_index_clustering_ranges_fn;
get_singleton_value_fn_t _value_for_index_partition_key_fn;
public:
/**
* Creates a new empty <code>StatementRestrictions</code>.
@@ -130,9 +237,10 @@ public:
* @param cfm the column family meta data
* @return a new empty <code>StatementRestrictions</code>.
*/
statement_restrictions(schema_ptr schema, bool allow_filtering);
statement_restrictions(private_tag, schema_ptr schema, bool allow_filtering);
friend statement_restrictions analyze_statement_restrictions(
public:
friend shared_ptr<const statement_restrictions> analyze_statement_restrictions(
data_dictionary::database db,
schema_ptr schema,
statements::statement_type type,
@@ -142,9 +250,15 @@ public:
bool for_view,
bool allow_filtering,
check_indexes do_check_indexes);
friend shared_ptr<const statement_restrictions> make_trivial_statement_restrictions(
schema_ptr schema,
bool allow_filtering);
private:
statement_restrictions(data_dictionary::database db,
// Important: objects of this class capture `this` extensively and so must remain non-copyable.
statement_restrictions(const statement_restrictions&) = delete;
statement_restrictions& operator=(const statement_restrictions&) = delete;
statement_restrictions(private_tag,
data_dictionary::database db,
schema_ptr schema,
statements::statement_type type,
const expr::expression& where_clause,
@@ -211,10 +325,7 @@ public:
bool has_token_restrictions() const;
// Checks whether the given column has an EQ restriction.
// EQ restriction is `col = ...` or `(col, col2) = ...`
// IN restriction is NOT an EQ restriction, this function will not look for IN restrictions.
// Uses column_definition::operator== for comparison, columns with the same name but different schema will not be equal.
// Checks whether the given column has an EQ restriction (not IN).
bool has_eq_restriction_on_column(const column_definition&) const;
/**
@@ -224,12 +335,6 @@ public:
*/
std::vector<const column_definition*> get_column_defs_for_filtering(data_dictionary::database db) const;
/**
* Gives a score that the index has - index with the highest score will be chosen
* in find_idx()
*/
int score(const secondary_index::index& index) const;
/**
* Determines the index to be used with the restriction.
* @param db - the data_dictionary::database context (for extracting index manager)
@@ -250,18 +355,8 @@ public:
size_t partition_key_restrictions_size() const;
bool parition_key_restrictions_have_supporting_index(const secondary_index::secondary_index_manager& index_manager, expr::allow_local_index allow_local) const;
size_t clustering_columns_restrictions_size() const;
bool clustering_columns_restrictions_have_supporting_index(
const secondary_index::secondary_index_manager& index_manager,
expr::allow_local_index allow_local) const;
bool multi_column_clustering_restrictions_are_supported_by(const secondary_index::index& index) const;
bounds_slice get_clustering_slice() const;
/**
* Checks if the clustering key has some unrestricted components.
* @return <code>true</code> if the clustering key has some unrestricted components, <code>false</code> otherwise.
@@ -279,15 +374,6 @@ public:
schema_ptr get_view_schema() const { return _view_schema; }
private:
std::pair<std::optional<secondary_index::index>, expr::expression> do_find_idx(const secondary_index::secondary_index_manager& sim) const;
void add_restriction(const expr::binary_operator& restr, schema_ptr schema, bool allow_filtering, bool for_view);
void add_is_not_restriction(const expr::binary_operator& restr, schema_ptr schema, bool for_view);
void add_single_column_parition_key_restriction(const expr::binary_operator& restr, schema_ptr schema, bool allow_filtering, bool for_view);
void add_token_partition_key_restriction(const expr::binary_operator& restr);
void add_single_column_clustering_key_restriction(const expr::binary_operator& restr, schema_ptr schema, bool allow_filtering);
void add_multi_column_clustering_key_restriction(const expr::binary_operator& restr);
void add_single_column_nonprimary_key_restriction(const expr::binary_operator& restr);
void process_partition_key_restrictions(bool for_view, bool allow_filtering, statements::statement_type type);
/**
@@ -315,7 +401,17 @@ private:
void add_clustering_restrictions_to_idx_ck_prefix(const schema& idx_tbl_schema);
unsigned int num_clustering_prefix_columns_that_need_not_be_filtered() const;
void calculate_column_defs_for_filtering_and_erase_restrictions_used_for_index(data_dictionary::database db);
void calculate_column_defs_for_filtering_and_erase_restrictions_used_for_index(
data_dictionary::database db,
const single_column_predicate_vectors& sc_pk_pred_vectors,
const single_column_predicate_vectors& sc_ck_pred_vectors,
const single_column_predicate_vectors& sc_nonpk_pred_vectors);
get_partition_key_ranges_fn_t build_partition_key_ranges_fn() const;
get_clustering_bounds_fn_t build_get_clustering_bounds_fn() const;
get_clustering_bounds_fn_t build_get_global_index_clustering_ranges_fn() const;
get_clustering_bounds_fn_t build_get_global_index_token_clustering_ranges_fn() const;
get_clustering_bounds_fn_t build_get_local_index_clustering_ranges_fn() const;
get_singleton_value_fn_t build_value_for_index_partition_key_fn() const;
public:
/**
* Returns the specified range of the partition key.
@@ -389,7 +485,10 @@ public:
private:
/// Prepares internal data for evaluating index-table queries. Must be called before
/// get_local_index_clustering_ranges().
void prepare_indexed_local(const schema& idx_tbl_schema);
void prepare_indexed_local(const schema& idx_tbl_schema,
const single_column_predicate_vectors& sc_pk_pred_vectors,
const single_column_predicate_vectors& sc_ck_pred_vectors,
const single_column_predicate_vectors& sc_nonpk_pred_vectors);
/// Prepares internal data for evaluating index-table queries. Must be called before
/// get_global_index_clustering_ranges() or get_global_index_token_clustering_ranges().
@@ -398,15 +497,18 @@ private:
public:
/// Calculates clustering ranges for querying a global-index table.
std::vector<query::clustering_range> get_global_index_clustering_ranges(
const query_options& options, const schema& idx_tbl_schema) const;
const query_options& options) const;
/// Calculates clustering ranges for querying a global-index table for queries with token restrictions present.
std::vector<query::clustering_range> get_global_index_token_clustering_ranges(
const query_options& options, const schema& idx_tbl_schema) const;
const query_options& options) const;
/// Calculates clustering ranges for querying a local-index table.
std::vector<query::clustering_range> get_local_index_clustering_ranges(
const query_options& options, const schema& idx_tbl_schema) const;
const query_options& options) const;
/// Finds the value of partition key of the index table
bytes_opt value_for_index_partition_key(const query_options&) const;
sstring to_string() const;
@@ -416,7 +518,7 @@ public:
bool is_empty() const;
};
statement_restrictions analyze_statement_restrictions(
shared_ptr<const statement_restrictions> analyze_statement_restrictions(
data_dictionary::database db,
schema_ptr schema,
statements::statement_type type,
@@ -427,23 +529,14 @@ statement_restrictions analyze_statement_restrictions(
bool allow_filtering,
check_indexes do_check_indexes);
// Extracts all binary operators which have the given column on their left hand side.
// Extracts only single-column restrictions.
// Does not include multi-column restrictions.
// Does not include token() restrictions.
// Does not include boolean constant restrictions.
// For example "WHERE c = 1 AND (a, c) = (2, 1) AND token(p) < 2 AND FALSE" will return {"c = 1"}.
std::vector<expr::expression> extract_single_column_restrictions_for_column(const expr::expression&, const column_definition&);
shared_ptr<const statement_restrictions> make_trivial_statement_restrictions(
schema_ptr schema,
bool allow_filtering);
// Checks whether this expression is empty - doesn't restrict anything
bool is_empty_restriction(const expr::expression&);
// Finds the value of the given column in the expression
// In case of multiple possible values calls on_internal_error
bytes_opt value_for(const column_definition&, const expr::expression&, const query_options&);
}
}

View File

@@ -626,7 +626,7 @@ modification_statement::prepare(data_dictionary::database db, prepare_context& c
// Since this cache is only meaningful for LWT queries, just clear the ids
// if it's not a conditional statement so that the AST nodes don't
// participate in the caching mechanism later.
if (!prepared_stmt->has_conditions() && prepared_stmt->_restrictions.has_value()) {
if (!prepared_stmt->has_conditions() && prepared_stmt->_restrictions) {
ctx.clear_pk_function_calls_cache();
}
prepared_stmt->_may_use_token_aware_routing = ctx.get_partition_key_bind_indexes(*schema).size() != 0;

View File

@@ -94,7 +94,7 @@ private:
std::optional<bool> _is_raw_counter_shard_write;
protected:
std::optional<restrictions::statement_restrictions> _restrictions;
shared_ptr<const restrictions::statement_restrictions> _restrictions;
public:
typedef std::optional<std::unordered_map<sstring, bytes_opt>> json_cache_opt;

View File

@@ -19,7 +19,7 @@ public:
uint32_t bound_terms,
lw_shared_ptr<const parameters> parameters,
::shared_ptr<selection::selection> selection,
::shared_ptr<restrictions::statement_restrictions> restrictions,
::shared_ptr<const restrictions::statement_restrictions> restrictions,
::shared_ptr<std::vector<size_t>> group_by_cell_indices,
bool is_reversed,
ordering_comparator_type ordering_comparator,

View File

@@ -109,7 +109,7 @@ public:
std::unique_ptr<prepared_statement> prepare(data_dictionary::database db, cql_stats& stats, const cql_config& cfg, bool for_view);
private:
std::vector<selection::prepared_selector> maybe_jsonize_select_clause(std::vector<selection::prepared_selector> select, data_dictionary::database db, schema_ptr schema);
::shared_ptr<restrictions::statement_restrictions> prepare_restrictions(
::shared_ptr<const restrictions::statement_restrictions> prepare_restrictions(
data_dictionary::database db,
schema_ptr schema,
prepare_context& ctx,

View File

@@ -1027,7 +1027,7 @@ view_indexed_table_select_statement::prepare(data_dictionary::database db,
uint32_t bound_terms,
lw_shared_ptr<const parameters> parameters,
::shared_ptr<selection::selection> selection,
::shared_ptr<restrictions::statement_restrictions> restrictions,
::shared_ptr<const restrictions::statement_restrictions> restrictions,
::shared_ptr<std::vector<size_t>> group_by_cell_indices,
bool is_reversed,
ordering_comparator_type ordering_comparator,
@@ -1139,7 +1139,7 @@ lw_shared_ptr<const service::pager::paging_state> view_indexed_table_select_stat
auto& last_base_pk = last_pos.partition;
auto* last_base_ck = last_pos.position.has_key() ? &last_pos.position.key() : nullptr;
bytes_opt indexed_column_value = restrictions::value_for(*cdef, _used_index_restrictions, options);
bytes_opt indexed_column_value = _restrictions->value_for_index_partition_key(options);
auto index_pk = [&]() {
if (_index.metadata().local()) {
@@ -1350,12 +1350,7 @@ dht::partition_range_vector view_indexed_table_select_statement::get_partition_r
dht::partition_range_vector view_indexed_table_select_statement::get_partition_ranges_for_global_index_posting_list(const query_options& options) const {
dht::partition_range_vector partition_ranges;
const column_definition* cdef = _schema->get_column_definition(to_bytes(_index.target_column()));
if (!cdef) {
throw exceptions::invalid_request_exception("Indexed column not found in schema");
}
bytes_opt value = restrictions::value_for(*cdef, _used_index_restrictions, options);
bytes_opt value = _restrictions->value_for_index_partition_key(options);
if (value) {
auto pk = partition_key::from_single_value(*_view_schema, *value);
auto dk = dht::decorate_key(*_view_schema, pk);
@@ -1374,11 +1369,11 @@ query::partition_slice view_indexed_table_select_statement::get_partition_slice_
// Only EQ restrictions on base partition key can be used in an index view query
if (pk_restrictions_is_single && _restrictions->partition_key_restrictions_is_all_eq()) {
partition_slice_builder.with_ranges(
_restrictions->get_global_index_clustering_ranges(options, *_view_schema));
_restrictions->get_global_index_clustering_ranges(options));
} else if (_restrictions->has_token_restrictions()) {
// Restrictions like token(p1, p2) < 0 have all partition key components restricted, but require special handling.
partition_slice_builder.with_ranges(
_restrictions->get_global_index_token_clustering_ranges(options, *_view_schema));
_restrictions->get_global_index_token_clustering_ranges(options));
}
}
@@ -1389,7 +1384,7 @@ query::partition_slice view_indexed_table_select_statement::get_partition_slice_
partition_slice_builder partition_slice_builder{*_view_schema};
partition_slice_builder.with_ranges(
_restrictions->get_local_index_clustering_ranges(options, *_view_schema));
_restrictions->get_local_index_clustering_ranges(options));
return partition_slice_builder.build();
}
@@ -1607,7 +1602,7 @@ public:
uint32_t bound_terms,
lw_shared_ptr<const parameters> parameters,
::shared_ptr<selection::selection> selection,
::shared_ptr<restrictions::statement_restrictions> restrictions,
::shared_ptr<const restrictions::statement_restrictions> restrictions,
::shared_ptr<std::vector<size_t>> group_by_cell_indices,
bool is_reversed,
ordering_comparator_type ordering_comparator,
@@ -1645,7 +1640,7 @@ private:
uint32_t bound_terms,
lw_shared_ptr<const select_statement::parameters> parameters,
::shared_ptr<selection::selection> selection,
::shared_ptr<restrictions::statement_restrictions> restrictions,
::shared_ptr<const restrictions::statement_restrictions> restrictions,
::shared_ptr<std::vector<size_t>> group_by_cell_indices,
bool is_reversed,
parallelized_select_statement::ordering_comparator_type ordering_comparator,
@@ -2076,7 +2071,7 @@ static select_statement::ordering_comparator_type get_similarity_ordering_compar
::shared_ptr<cql3::statements::select_statement> vector_indexed_table_select_statement::prepare(data_dictionary::database db, schema_ptr schema,
uint32_t bound_terms, lw_shared_ptr<const parameters> parameters, ::shared_ptr<selection::selection> selection,
::shared_ptr<restrictions::statement_restrictions> restrictions, ::shared_ptr<std::vector<size_t>> group_by_cell_indices, bool is_reversed,
::shared_ptr<const restrictions::statement_restrictions> restrictions, ::shared_ptr<std::vector<size_t>> group_by_cell_indices, bool is_reversed,
ordering_comparator_type ordering_comparator, prepared_ann_ordering_type prepared_ann_ordering, std::optional<expr::expression> limit,
std::optional<expr::expression> per_partition_limit, cql_stats& stats, const secondary_index::index& index, std::unique_ptr<attributes> attrs) {
@@ -2589,7 +2584,7 @@ std::unique_ptr<prepared_statement> select_statement::prepare(data_dictionary::d
return make_unique<prepared_statement>(audit_info(), std::move(stmt), ctx, std::move(partition_key_bind_indices), std::move(warnings));
}
::shared_ptr<restrictions::statement_restrictions>
::shared_ptr<const restrictions::statement_restrictions>
select_statement::prepare_restrictions(data_dictionary::database db,
schema_ptr schema,
prepare_context& ctx,
@@ -2599,8 +2594,8 @@ select_statement::prepare_restrictions(data_dictionary::database db,
restrictions::check_indexes do_check_indexes)
{
try {
return ::make_shared<restrictions::statement_restrictions>(restrictions::analyze_statement_restrictions(db, schema, statement_type::SELECT, _where_clause, ctx,
selection->contains_only_static_columns(), for_view, allow_filtering, do_check_indexes));
return restrictions::analyze_statement_restrictions(db, schema, statement_type::SELECT, _where_clause, ctx,
selection->contains_only_static_columns(), for_view, allow_filtering, do_check_indexes);
} catch (const exceptions::unrecognized_entity_exception& e) {
if (contains_alias(e.entity)) {
throw exceptions::invalid_request_exception(format("Aliases aren't allowed in the WHERE clause (name: '{}')", e.entity));

View File

@@ -200,7 +200,7 @@ public:
uint32_t bound_terms,
lw_shared_ptr<const parameters> parameters,
::shared_ptr<selection::selection> selection,
::shared_ptr<restrictions::statement_restrictions> restrictions,
::shared_ptr<const restrictions::statement_restrictions> restrictions,
::shared_ptr<std::vector<size_t>> group_by_cell_indices,
bool is_reversed,
ordering_comparator_type ordering_comparator,
@@ -372,7 +372,7 @@ public:
static ::shared_ptr<cql3::statements::select_statement> prepare(data_dictionary::database db, schema_ptr schema, uint32_t bound_terms,
lw_shared_ptr<const parameters> parameters, ::shared_ptr<selection::selection> selection,
-::shared_ptr<restrictions::statement_restrictions> restrictions, ::shared_ptr<std::vector<size_t>> group_by_cell_indices, bool is_reversed,
+::shared_ptr<const restrictions::statement_restrictions> restrictions, ::shared_ptr<std::vector<size_t>> group_by_cell_indices, bool is_reversed,
ordering_comparator_type ordering_comparator, prepared_ann_ordering_type prepared_ann_ordering, std::optional<expr::expression> limit,
std::optional<expr::expression> per_partition_limit, cql_stats& stats, const secondary_index::index& index, std::unique_ptr<cql3::attributes> attrs);

@@ -66,7 +66,7 @@ public:
: update_statement(std::move(audit_info), statement_type::INSERT, bound_terms, s, std::move(attrs), stats)
, _value(std::move(v))
, _default_unset(default_unset) {
-_restrictions = restrictions::statement_restrictions(s, false);
+_restrictions = cql3::restrictions::make_trivial_statement_restrictions(s, false);
}
private:
virtual void execute_operations_for_key(mutation& m, const clustering_key_prefix& prefix, const update_parameters& params, const json_cache_opt& json_cache) const override;

@@ -493,7 +493,7 @@ std::unique_ptr<service::pager::query_pager> service::pager::query_pagers::pager
// If partition row limit is applied to paging, we still need to fall back
// to filtering the results to avoid extraneous rows on page breaks.
if (!filtering_restrictions && cmd->slice.partition_row_limit() < query::max_rows_if_set) {
-filtering_restrictions = ::make_shared<cql3::restrictions::statement_restrictions>(s, true);
+filtering_restrictions = cql3::restrictions::make_trivial_statement_restrictions(s, true);
}
if (filtering_restrictions) {
return std::make_unique<filtering_query_pager>(proxy, std::move(s), std::move(selection), state,

File diff suppressed because it is too large

@@ -23,7 +23,7 @@ using namespace cql3;
namespace {
/// Helper to create statement_restrictions from a WHERE clause string
-restrictions::statement_restrictions make_restrictions(
+shared_ptr<const restrictions::statement_restrictions> make_restrictions(
std::string_view where_clause, cql_test_env& env, const sstring& table_name = "t", const sstring& keyspace_name = "ks") {
prepare_context ctx;
@@ -63,8 +63,8 @@ SEASTAR_TEST_CASE(to_json_empty_restrictions) {
cquery_nofail(e, "create table ks.t(pk int, ck int, v vector<float, 3>, primary key(pk, ck))");
auto schema = e.local_db().find_schema("ks", "t");
-restrictions::statement_restrictions restr(schema, false);
-auto json = rjson::print(vector_search::prepare_filter(restr, false).to_json(query_options({})));
+shared_ptr<const restrictions::statement_restrictions> restr = restrictions::make_trivial_statement_restrictions(schema, false);
+auto json = rjson::print(vector_search::prepare_filter(*restr, false).to_json(query_options({})));
BOOST_CHECK_EQUAL(json, "{}");
});
@@ -75,7 +75,7 @@ SEASTAR_TEST_CASE(to_json_with_allow_filtering) {
cquery_nofail(e, "create table ks.t(pk int, ck int, v vector<float, 3>, primary key(pk, ck))");
auto restr = make_restrictions("pk=1", e);
-auto json = get_restrictions_json(restr, true);
+auto json = get_restrictions_json(*restr, true);
auto expected = R"json({"restrictions":[{"type":"==","lhs":"pk","rhs":1}],"allow_filtering":true})json";
BOOST_CHECK_EQUAL(json, expected);
@@ -87,7 +87,7 @@ SEASTAR_TEST_CASE(to_json_single_column_eq) {
cquery_nofail(e, "create table ks.t(pk int, ck int, v vector<float, 3>, primary key(pk, ck))");
auto restr = make_restrictions("pk=42", e);
-auto json = get_restrictions_json(restr, false);
+auto json = get_restrictions_json(*restr, false);
auto expected = R"json({"restrictions":[{"type":"==","lhs":"pk","rhs":42}],"allow_filtering":false})json";
BOOST_CHECK_EQUAL(json, expected);
@@ -99,7 +99,7 @@ SEASTAR_TEST_CASE(to_json_single_column_lt) {
cquery_nofail(e, "create table ks.t(pk int, ck int, v vector<float, 3>, primary key(pk, ck))");
auto restr = make_restrictions("pk=1 and ck<100", e);
-auto json = get_restrictions_json(restr, true);
+auto json = get_restrictions_json(*restr, true);
auto expected = R"json({"restrictions":[{"type":"==","lhs":"pk","rhs":1},{"type":"<","lhs":"ck","rhs":100}],"allow_filtering":true})json";
BOOST_CHECK_EQUAL(json, expected);
@@ -111,7 +111,7 @@ SEASTAR_TEST_CASE(to_json_single_column_gt) {
cquery_nofail(e, "create table ks.t(pk int, ck int, v vector<float, 3>, primary key(pk, ck))");
auto restr = make_restrictions("pk=1 and ck>50", e);
-auto json = get_restrictions_json(restr, true);
+auto json = get_restrictions_json(*restr, true);
auto expected = R"json({"restrictions":[{"type":"==","lhs":"pk","rhs":1},{"type":">","lhs":"ck","rhs":50}],"allow_filtering":true})json";
BOOST_CHECK_EQUAL(json, expected);
@@ -123,7 +123,7 @@ SEASTAR_TEST_CASE(to_json_single_column_lte) {
cquery_nofail(e, "create table ks.t(pk int, ck int, v vector<float, 3>, primary key(pk, ck))");
auto restr = make_restrictions("pk=1 and ck<=75", e);
-auto json = get_restrictions_json(restr, true);
+auto json = get_restrictions_json(*restr, true);
auto expected = R"json({"restrictions":[{"type":"==","lhs":"pk","rhs":1},{"type":"<=","lhs":"ck","rhs":75}],"allow_filtering":true})json";
BOOST_CHECK_EQUAL(json, expected);
@@ -135,7 +135,7 @@ SEASTAR_TEST_CASE(to_json_single_column_gte) {
cquery_nofail(e, "create table ks.t(pk int, ck int, v vector<float, 3>, primary key(pk, ck))");
auto restr = make_restrictions("pk=1 and ck>=25", e);
-auto json = get_restrictions_json(restr, true);
+auto json = get_restrictions_json(*restr, true);
auto expected = R"json({"restrictions":[{"type":"==","lhs":"pk","rhs":1},{"type":">=","lhs":"ck","rhs":25}],"allow_filtering":true})json";
BOOST_CHECK_EQUAL(json, expected);
@@ -147,7 +147,7 @@ SEASTAR_TEST_CASE(to_json_single_column_in) {
cquery_nofail(e, "create table ks.t(pk int, ck int, v vector<float, 3>, primary key(pk, ck))");
auto restr = make_restrictions("pk=1 and ck in (1, 2, 3)", e);
-auto json = get_restrictions_json(restr, true);
+auto json = get_restrictions_json(*restr, true);
auto expected = R"json({"restrictions":[{"type":"==","lhs":"pk","rhs":1},{"type":"IN","lhs":"ck","rhs":[1,2,3]}],"allow_filtering":true})json";
BOOST_CHECK_EQUAL(json, expected);
@@ -159,7 +159,7 @@ SEASTAR_TEST_CASE(to_json_string_value) {
cquery_nofail(e, "create table ks.t(pk text, ck int, v vector<float, 3>, primary key(pk, ck))");
auto restr = make_restrictions("pk='hello'", e);
-auto json = get_restrictions_json(restr, false);
+auto json = get_restrictions_json(*restr, false);
auto expected = R"json({"restrictions":[{"type":"==","lhs":"pk","rhs":"hello"}],"allow_filtering":false})json";
BOOST_CHECK_EQUAL(json, expected);
@@ -171,7 +171,7 @@ SEASTAR_TEST_CASE(to_json_multi_column_eq) {
cquery_nofail(e, "create table ks.t(pk int, ck1 int, ck2 int, v vector<float, 3>, primary key(pk, ck1, ck2))");
auto restr = make_restrictions("pk=1 and (ck1, ck2)=(10, 20)", e);
-auto json = get_restrictions_json(restr, true);
+auto json = get_restrictions_json(*restr, true);
auto expected = R"json({"restrictions":[{"type":"==","lhs":"pk","rhs":1},{"type":"()==()","lhs":["ck1","ck2"],"rhs":[10,20]}],"allow_filtering":true})json";
BOOST_CHECK_EQUAL(json, expected);
@@ -183,7 +183,7 @@ SEASTAR_TEST_CASE(to_json_multi_column_lt) {
cquery_nofail(e, "create table ks.t(pk int, ck1 int, ck2 int, v vector<float, 3>, primary key(pk, ck1, ck2))");
auto restr = make_restrictions("pk=1 and (ck1, ck2)<(10, 20)", e);
-auto json = get_restrictions_json(restr, true);
+auto json = get_restrictions_json(*restr, true);
auto expected = R"json({"restrictions":[{"type":"==","lhs":"pk","rhs":1},{"type":"()<()","lhs":["ck1","ck2"],"rhs":[10,20]}],"allow_filtering":true})json";
BOOST_CHECK_EQUAL(json, expected);
@@ -195,7 +195,7 @@ SEASTAR_TEST_CASE(to_json_multi_column_gt) {
cquery_nofail(e, "create table ks.t(pk int, ck1 int, ck2 int, v vector<float, 3>, primary key(pk, ck1, ck2))");
auto restr = make_restrictions("pk=1 and (ck1, ck2)>(10, 20)", e);
-auto json = get_restrictions_json(restr, true);
+auto json = get_restrictions_json(*restr, true);
auto expected = R"json({"restrictions":[{"type":"==","lhs":"pk","rhs":1},{"type":"()>()","lhs":["ck1","ck2"],"rhs":[10,20]}],"allow_filtering":true})json";
BOOST_CHECK_EQUAL(json, expected);
@@ -207,7 +207,7 @@ SEASTAR_TEST_CASE(to_json_multi_column_lte) {
cquery_nofail(e, "create table ks.t(pk int, ck1 int, ck2 int, v vector<float, 3>, primary key(pk, ck1, ck2))");
auto restr = make_restrictions("pk=1 and (ck1, ck2)<=(10, 20)", e);
-auto json = get_restrictions_json(restr, true);
+auto json = get_restrictions_json(*restr, true);
auto expected = R"json({"restrictions":[{"type":"==","lhs":"pk","rhs":1},{"type":"()<=()","lhs":["ck1","ck2"],"rhs":[10,20]}],"allow_filtering":true})json";
BOOST_CHECK_EQUAL(json, expected);
@@ -219,7 +219,7 @@ SEASTAR_TEST_CASE(to_json_multi_column_gte) {
cquery_nofail(e, "create table ks.t(pk int, ck1 int, ck2 int, v vector<float, 3>, primary key(pk, ck1, ck2))");
auto restr = make_restrictions("pk=1 and (ck1, ck2)>=(10, 20)", e);
-auto json = get_restrictions_json(restr, true);
+auto json = get_restrictions_json(*restr, true);
auto expected = R"json({"restrictions":[{"type":"==","lhs":"pk","rhs":1},{"type":"()>=()","lhs":["ck1","ck2"],"rhs":[10,20]}],"allow_filtering":true})json";
BOOST_CHECK_EQUAL(json, expected);
@@ -231,7 +231,7 @@ SEASTAR_TEST_CASE(to_json_multi_column_in) {
cquery_nofail(e, "create table ks.t(pk int, ck1 int, ck2 int, v vector<float, 3>, primary key(pk, ck1, ck2))");
auto restr = make_restrictions("pk=1 and (ck1, ck2) in ((1, 2), (3, 4))", e);
-auto json = get_restrictions_json(restr, true);
+auto json = get_restrictions_json(*restr, true);
auto expected = R"json({"restrictions":[{"type":"==","lhs":"pk","rhs":1},{"type":"()IN()","lhs":["ck1","ck2"],"rhs":[[1,2],[3,4]]}],"allow_filtering":true})json";
BOOST_CHECK_EQUAL(json, expected);
@@ -243,7 +243,7 @@ SEASTAR_TEST_CASE(to_json_multiple_restrictions) {
cquery_nofail(e, "create table ks.t(pk int, ck int, v vector<float, 3>, primary key(pk, ck))");
auto restr = make_restrictions("pk=1 and ck>=10 and ck<100", e);
-auto json = get_restrictions_json(restr, true);
+auto json = get_restrictions_json(*restr, true);
auto expected = R"json({"restrictions":[{"type":"==","lhs":"pk","rhs":1},{"type":">=","lhs":"ck","rhs":10},{"type":"<","lhs":"ck","rhs":100}],"allow_filtering":true})json";
BOOST_CHECK_EQUAL(json, expected);
@@ -255,7 +255,7 @@ SEASTAR_TEST_CASE(to_json_with_boolean_value) {
cquery_nofail(e, "create table ks.t(pk int, ck boolean, v vector<float, 3>, primary key(pk, ck))");
auto restr = make_restrictions("ck=true", e);
-auto json = get_restrictions_json(restr, true);
+auto json = get_restrictions_json(*restr, true);
auto expected = R"json({"restrictions":[{"type":"==","lhs":"ck","rhs":true}],"allow_filtering":true})json";
BOOST_CHECK_EQUAL(json, expected);
@@ -267,7 +267,7 @@ SEASTAR_TEST_CASE(to_json_bind_marker_partition_key) {
cquery_nofail(e, "create table ks.t(pk int, ck int, v vector<float, 3>, primary key(pk, ck))");
auto restr = make_restrictions("pk=?", e);
-auto filter = vector_search::prepare_filter(restr, false);
+auto filter = vector_search::prepare_filter(*restr, false);
std::vector<raw_value> bind_values = {raw_value::make_value(int32_type->decompose(42))};
auto options = make_query_options(std::move(bind_values));
@@ -283,7 +283,7 @@ SEASTAR_TEST_CASE(to_json_bind_marker_clustering_key) {
cquery_nofail(e, "create table ks.t(pk int, ck int, v vector<float, 3>, primary key(pk, ck))");
auto restr = make_restrictions("pk=? and ck>?", e);
-auto filter = vector_search::prepare_filter(restr, true);
+auto filter = vector_search::prepare_filter(*restr, true);
std::vector<raw_value> bind_values = {
raw_value::make_value(int32_type->decompose(1)),
@@ -301,7 +301,7 @@ SEASTAR_TEST_CASE(to_json_bind_marker_different_values) {
cquery_nofail(e, "create table ks.t(pk int, ck int, v vector<float, 3>, primary key(pk, ck))");
auto restr = make_restrictions("pk=?", e);
-auto filter = vector_search::prepare_filter(restr, false);
+auto filter = vector_search::prepare_filter(*restr, false);
std::vector<raw_value> bind_values1 = {raw_value::make_value(int32_type->decompose(100))};
auto options1 = make_query_options(std::move(bind_values1));
@@ -322,7 +322,7 @@ SEASTAR_TEST_CASE(to_json_bind_marker_string_value) {
cquery_nofail(e, "create table ks.t(pk text, ck int, v vector<float, 3>, primary key(pk, ck))");
auto restr = make_restrictions("pk=?", e);
-auto filter = vector_search::prepare_filter(restr, false);
+auto filter = vector_search::prepare_filter(*restr, false);
std::vector<raw_value> bind_values = {raw_value::make_value(utf8_type->decompose("hello_world"))};
auto options = make_query_options(std::move(bind_values));
@@ -338,7 +338,7 @@ SEASTAR_TEST_CASE(to_json_mixed_literals_and_bind_markers) {
cquery_nofail(e, "create table ks.t(pk int, ck int, v vector<float, 3>, primary key(pk, ck))");
auto restr = make_restrictions("pk=1 and ck>?", e);
-auto filter = vector_search::prepare_filter(restr, true);
+auto filter = vector_search::prepare_filter(*restr, true);
std::vector<raw_value> bind_values = {raw_value::make_value(int32_type->decompose(25))};
auto options = make_query_options(std::move(bind_values));
@@ -354,7 +354,7 @@ SEASTAR_TEST_CASE(to_json_bind_marker_in_list) {
cquery_nofail(e, "create table ks.t(pk int, ck int, v vector<float, 3>, primary key(pk, ck))");
auto restr = make_restrictions("pk=1 and ck in ?", e);
-auto filter = vector_search::prepare_filter(restr, true);
+auto filter = vector_search::prepare_filter(*restr, true);
auto list_type = list_type_impl::get_instance(int32_type, true);
auto list_val = make_list_value(list_type, {data_value(10), data_value(20), data_value(30)});
@@ -373,7 +373,7 @@ SEASTAR_TEST_CASE(to_json_bind_marker_multi_column) {
cquery_nofail(e, "create table ks.t(pk int, ck1 int, ck2 int, v vector<float, 3>, primary key(pk, ck1, ck2))");
auto restr = make_restrictions("pk=1 and (ck1, ck2)>?", e);
-auto filter = vector_search::prepare_filter(restr, true);
+auto filter = vector_search::prepare_filter(*restr, true);
auto tuple_type = tuple_type_impl::get_instance({int32_type, int32_type});
auto tuple_val = make_tuple_value(tuple_type, {data_value(10), data_value(20)});
@@ -392,7 +392,7 @@ SEASTAR_TEST_CASE(to_json_no_bind_markers_uses_cache) {
cquery_nofail(e, "create table ks.t(pk int, ck int, v vector<float, 3>, primary key(pk, ck))");
auto restr = make_restrictions("pk=42", e);
-auto filter = vector_search::prepare_filter(restr, false);
+auto filter = vector_search::prepare_filter(*restr, false);
auto options1 = query_options({});
auto json1 = rjson::print(filter.to_json(options1));
@@ -412,7 +412,7 @@ SEASTAR_TEST_CASE(to_json_nonprimary_key_eq) {
cquery_nofail(e, "create table ks.t(pk int, ck int, r int, v vector<float, 3>, primary key(pk, ck))");
auto restr = make_restrictions("pk=1 and r=42", e);
-auto json = get_restrictions_json(restr, true);
+auto json = get_restrictions_json(*restr, true);
auto expected = R"json({"restrictions":[{"type":"==","lhs":"pk","rhs":1},{"type":"==","lhs":"r","rhs":42}],"allow_filtering":true})json";
BOOST_CHECK_EQUAL(json, expected);
@@ -424,7 +424,7 @@ SEASTAR_TEST_CASE(to_json_nonprimary_key_range) {
cquery_nofail(e, "create table ks.t(pk int, ck int, r int, v vector<float, 3>, primary key(pk, ck))");
auto restr = make_restrictions("pk=1 and r>10 and r<100", e);
-auto json = get_restrictions_json(restr, true);
+auto json = get_restrictions_json(*restr, true);
auto expected = R"json({"restrictions":[{"type":"==","lhs":"pk","rhs":1},{"type":">","lhs":"r","rhs":10},{"type":"<","lhs":"r","rhs":100}],"allow_filtering":true})json";
BOOST_CHECK_EQUAL(json, expected);
@@ -436,7 +436,7 @@ SEASTAR_TEST_CASE(to_json_nonprimary_key_bind_marker) {
cquery_nofail(e, "create table ks.t(pk int, ck int, r int, v vector<float, 3>, primary key(pk, ck))");
auto restr = make_restrictions("pk=1 and r=?", e);
-auto filter = vector_search::prepare_filter(restr, true);
+auto filter = vector_search::prepare_filter(*restr, true);
std::vector<raw_value> bind_values = {raw_value::make_value(int32_type->decompose(99))};
auto options = make_query_options(std::move(bind_values));