Commit Graph

49 Commits

Author SHA1 Message Date
Karol Nowacki
357c0a8218 cql3: Move vector search select to dedicated class
The execution of SELECT statements with ANN ordering (vector search) was
previously implemented within `indexed_table_select_statement`. This was
not ideal, as vector search logic is independent of secondary index selects.

This resulted in unnecessary complexity because vector search queries don't
use features like aggregates or paging. More importantly,
`indexed_table_select_statement` assumed a non-null `view_schema` pointer,
which doesn't hold for vector indexes (where `view_ptr` is null).
This caused null pointer dereferences during ANN ordered selects, leading
to crashes (VECTOR-179). Other parts of the class still dereference
`view_schema` without null checks.

Moving the vector search select logic out of
`indexed_table_select_statement` simplifies the code and prevents these
null pointer dereferences.
2025-10-29 08:37:21 +01:00
Nadav Har'El
0a990d2a48 config: split tri_mode_restriction to a separate header
Today, any source file or header file that wants to use the
tri_mode_restriction type needs to include db/config.hh, which is a
large and frequently-changing header file. In this patch we split this
type into a separate header file, db/tri_mode_restriction.hh, and avoid
a few unnecessary inclusions of db/config.hh. However, a few source
files now need to explicitly include db/config.hh, after its
transitive inclusion is gone.

Note that the overwhelmingly common inclusion of db/config.hh continues
to be a problem after this patch - 128 source files include it directly.
So this patch is just the first step in long journey.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>

Closes scylladb/scylladb#25692
2025-08-27 13:47:04 +03:00
Jan Łakomy
5fecad0ec8 cql3/statements: add ANN OF queries support to select statements
Add parsing of `ANN OF` queries to the `select_statement` and
`indexed_table_select_statement` classes.
Add a placeholder for the implementation of external ANN queries.

Rename `should_create_view` to `view_should_exist` as it is used
not only to check if the view should be created but also if
the view has been created.

Co-authored-by: Dawid Pawlik <dawid.pawlik@scylladb.com>
2025-08-01 12:08:50 +02:00
Jan Łakomy
d073a4c1fa cql3/raw: add ANN ordering to the raw statement layer
Extend `orderings_type` to include ANN ordering.

Co-authored-by: Dawid Pawlik <dawid.pawlik@scylladb.com>
2025-07-31 11:11:24 +02:00
Paweł Zakrzewski
854d2917a1 cql3/select_statement: reject PER PARTITION LIMIT with SELECT DISTINCT
Before this patch we silently allowed and ignored PER PARTITION LIMIT.
SELECT DISTINCT requires all the partition key columns, which means that
setting PER PARTITION LIMIT is redundant - only one result will be
returned from every partition anyway.

Cassandra behaves the same way, so this patch also ensures
compatibility.

Fixes scylladb/scylladb#15109

Closes scylladb/scylladb#22950
2025-02-24 14:50:18 +02:00
Paweł Zakrzewski
98f5e49ea8 audit: Add support to CQL statements
Integrates audit functionality into CQL statement processing to enable tracking of database operations. Key changes:

- Add audit_info and statement_category to all CQL statements
- Implement audit categories for different statement types:
  - DDL: Schema altering statements (CREATE/ALTER/DROP)
  - DML: Data manipulation (INSERT/UPDATE/DELETE/TRUNCATE/USE)
  - DCL: Access control (GRANT/REVOKE/CREATE ROLE)
  - QUERY: SELECT statements
  - ADMIN: Service level operations

- Add audit inspection points in query processing:
  - Before statement execution
  - After access checks
  - After statement completion
  - On execution failures

- Add password sanitization for role management statements
  - Mask plaintext passwords in audit logs
  - Handle both direct password parameters and options maps
  - Preserve query structure while hiding sensitive data

- Modify prepared statement lifecycle to carry audit context
  - Pass audit info during statement preparation
  - Track audit info through statement execution
  - Support batch statement auditing

This change enables comprehensive auditing of CQL operations while ensuring sensitive data is properly masked in audit logs.
2025-01-15 11:10:36 +01:00
Avi Kivity
f3eade2f62 treewide: relicense to ScyllaDB-Source-Available-1.0
Drop the AGPL license in favor of a source-available license.
See the blog post [1] for details.

[1] https://www.scylladb.com/2024/12/18/why-were-moving-to-a-source-available-license/
2024-12-18 17:45:13 +02:00
Botond Dénes
6458ff9917 cql3/statements: wire-in mutation_fragments_select_statement
This commit contains all the changes required to wire-in the new select
from mutation_fragment() statement.
2023-07-19 01:28:28 -04:00
Botond Dénes
aa31321da9 cql3: add SELECT MUTATION FRAGMENTS select statement sub-type
SELECT * FROM MUTATION_FRAGMENTS($table) is a new select statement
sub-type. More information will be provided in the patch which introduces
it. This patch adds only the Cql.g changes and what is further strictly
necessary.
2023-07-19 01:28:28 -04:00
Avi Kivity
7c3ceb6473 cql3: select_statement: use prepared selectors
Change one more layer of processing to work on prepared
rather than raw selectors. This moves the call to prepare
the selectors early in select_statement processing. In turn
this changes maybe_jsonize_select_clause() and forward_service's
mock_selection() to work in the prepared realm as well.

This moves us one step closer to using evaluate() to process
the select clause, as the prepared selectors are now available
in select_statement. We can't use them yet since we can't evaluate
aggregations.
2023-07-03 19:45:17 +03:00
Avi Kivity
42a1ced73b cql3: result_set: switch cell data type from bytes_opt to managed_bytes_opt
The expression system uses managed_bytes_opt for values, but result_set
uses bytes_opt. This means that processing values from the result set
in expressions requires a copy.

Out of the two, managed_bytes_opt is the better choice, since it prevents
large contiguous allocations for large blobs. So we switch result_set
to use managed_bytes_opt. Users of the result_set API are adjusted.

The db::function interface is not modified to limit churn; instead we
convert the types on entry and exit. This will be adjusted in a following
patch.
2023-05-07 17:17:36 +03:00
Avi Kivity
9823e75d16 cql3: grammar: make where clause return an expression
In preparation of the relaxation of the grammar to return any expression,
change the whereClause production to return an expression rather than
terms. Note that the expression is still constrained to be a conjunction
of relations, and our filtering code isn't prepared for more.

Before the patch, if the WHERE clause was optional, the grammar would
pass an empty vector of expressions (which is exactly correct). After
the patch, it would pass a default-constructed expression. Now that
happens to be an empty conjunction, which is exactly what's needed, but
it is too accidental, so the patch changes optional WHERE clauses to
explicitly generate an empty conjunction if the WHERE clause wasn't
specified.
2022-07-22 20:14:48 +03:00
Piotr Sarna
ec0a3bbbd4 cql3: add a statement for deleting ghost rows
In order to expose the API for deleting ghost rows from a view,
a CQL statement is created. It is loosely based on select_statement,
as its first step is to select view table rows.
2022-05-19 10:11:50 +02:00
Piotr Sarna
d74e25be67 cql3: convert is_json statement parameter to enum
Right now is_json is used to decide if the statement needs to be treated
in a special way. For two types (regular statement and JSON statement),
a boolean is enough, but this series extends it for two more types,
so the flag is converted to an enum.
2022-05-19 10:11:50 +02:00
cvybhu
d85f680df3 cql3: Remove relation class
Functionality of the relation class has been replaced by
expr::to_restriction.

Relation and all classes deriving from it can now be removed.

Signed-off-by: cvybhu <jan.ciolek@scylladb.com>
2022-05-16 18:17:58 +02:00
cvybhu
51cdbdeacb cql3: Make parser output expression for relations
Parser used to output the where clause as a vector of relations,
but now we can change it to a vector of expressions.

Cql.g needs to be modified to output expressions instead
of relations.

The WHERE clause is kept in a few places in the code that
need to be changed to vector<expression>.

Finally relation->to_restriction is replaced by expr::to_restriction
and the expressions are converted to restrictions where required.

The relation class isn't used anywhere now and can be removed.

Signed-off-by: cvybhu <jan.ciolek@scylladb.com>
2022-05-16 18:17:58 +02:00
Avi Kivity
5937b1fa23 treewide: remove empty comments in top-of-files
After fcb8d040 ("treewide: use Software Package Data Exchange
(SPDX) license identifiers"), many dual-licensed files were
left with empty comments on top. Remove them to avoid visual
noise.

Closes #10562
2022-05-13 07:11:58 +02:00
Avi Kivity
fcb8d040e8 treewide: use Software Package Data Exchange (SPDX) license identifiers
Instead of lengthy blurbs, switch to single-line, machine-readable
standardized (https://spdx.dev) license identifiers. The Linux kernel
switched long ago, so there is strong precedent.

Three cases are handled: AGPL-only, Apache-only, and dual licensed.
For the latter case, I chose (AGPL-3.0-or-later and Apache-2.0),
reasoning that our changes are extensive enough to apply our license.

The changes we applied mechanically with a script, except to
licenses/README.md.

Closes #9937
2022-01-18 12:15:18 +01:00
Avi Kivity
d768e9fac5 cql3, related: switch to data_dictionary
Stop using database (and including database.hh) for schema related
purposes and use data_dictionary instead.

data_dictionary::database::real_database() is called from several
places, for these reasons:

 - calling yet-to-be-converted code
 - callers with a legitimate need to access data (e.g. system_keyspace)
   but with the ::database accessor removed from query_processor.
   We'll need to find another way to supply system_keyspace with
   data access.
 - to gain access to the wasm engine for testing whether used
   defined functions compile. We'll have to find another way to
   do this as well.

The change is a straightforward replacement. One case in
modification_statement had to change a capture, but everything else
was just a search-and-replace.

Some files that lost "database.hh" gained "mutation.hh", which they
previously had access to through "database.hh".
2021-12-15 13:54:23 +02:00
Jan Ciolek
f76a1cd4bf cql3: Reorganize orderings code
Reorganized the code that handles column ordering (ASC or DESC).
I feel that it's now clearer and easier to understand.

Added an enum that describes column ordering.
It has two possible values: ascending or descending.
It used to be a bool that was sometimes called 'reversed',
which could mean multiple things.

Instead of column.type->is_reversed() != <ordering bool>
there is now a function called are_column_select_results_reversed.

Split checking if ordering is reversed and verifying whether it's correct into two functions.
Before all of this was done by is_reversed()

This is a preparation to later allow skipping ORDER BY restrictions on some columns.
Adding this to the existing code caused it to get quite complex,
but this new version is better suited for the task.

The diff is a bit messy because I moved all ordering functions to one place,
it's better to read select_statement.cc lines 1495-1651 directly.

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2021-12-09 12:06:42 +01:00
Jan Ciolek
a24d06c195 cql3: Remove term in select_statement
Replace all uses of term with expression in cql3/statements/select_statement

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2021-10-28 20:55:09 +02:00
Avi Kivity
8b59e3a0b1 Merge ' cql3: Demand ALLOW FILTERING for unlimited, sliced partitions ' from Dejan Mircevski
Return the pre- 6773563d3 behavior of demanding ALLOW FILTERING when partition slice is requested but on potentially unlimited number of partitions.  Put it on a flag defaulting to "off" for now.

Fixes #7608; see comments there for justification.

Tests: unit (debug, dev), dtest (cql_additional_test, paging_test)

Signed-off-by: Dejan Mircevski <dejan@scylladb.com>

Closes #9126

* github.com:scylladb/scylla:
  cql3: Demand ALLOW FILTERING for unlimited, sliced partitions
  cql3: Track warnings in prepared_statement
  test: Use ALLOW FILTERING more strictly
  cql3: Add statement_restrictions::to_string
2021-08-31 18:05:26 +03:00
Dejan Mircevski
2f28f68e84 cql3: Demand ALLOW FILTERING for unlimited, sliced partitions
When a query requests a partition slice but doesn't limit the number
of partitions, require that it also says ALLOW FILTERING.  Although
do_filter() isn't invoked for such queries, the performance can still
be unexpectedly slow, and we want to signal that to the user by
demanding they explicitly say ALLOW FILTERING.

Because we now reject queries that worked fine before, existing
applications can break.  Therefore, the behavior is controlled by a
flag currently defaulting to off.  We will default to "on" in the next
Scylla version.

Fixes #7608; see comments there for justification.

Signed-off-by: Dejan Mircevski <dejan@scylladb.com>
2021-08-31 10:45:41 -04:00
Avi Kivity
b11ec1aeda cql3: select_statement: convert term::raw to expression
Straightforward substitution; using std::optional<> since those
expressions are indeed optional.
2021-08-26 15:41:14 +03:00
Pavel Solodovnikov
49ddd269ea cql3: rename variable_specifications to prepare_context
The class is repurposed to be more generic and also be able
to hold additional metadata related to function calls within
a CQL statement. Rename all methods appropriately.

Visitor functions in AST nodes (`collect_marker_specification`)
are also renamed to a more generic `fill_prepare_context`.

The name `prepare_context` designates that this metadata
structure is a byproduct of `stmt::raw::prepare()` call and
is needed only for "prepare" step of query execution.

Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>
2021-07-24 14:33:33 +03:00
Pavel Solodovnikov
76bea23174 treewide: reduce header interdependencies
Use forward declarations wherever possible.

Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>

Closes #8813
2021-06-07 15:58:35 +03:00
Avi Kivity
a55b434a2b treewide: extent copyright statements to present day 2021-06-06 19:18:49 +03:00
Gleb Natapov
805da054e7 cql3: store cf_name as optional in cf_statement instead of shared_ptr
It been a shard_ptr is a remnant of translation from Java.
Message-Id: <20210216123931.80280-2-gleb@scylladb.com>
2021-02-16 15:58:37 +02:00
Piotr Sarna
157be33b89 cql3: add per-query timeout to select statement
First of all, select statement is extended with an 'attrs' field,
which keeps the per-query attributes. Currently, only TIMEOUT
parameter is legal to use, since TIMESTAMP and TTL bear no meaning
for reads.

Secondly, if TIMEOUT attribute is set, it will be used as the effective
timeout for a particular query.
2020-12-14 07:50:40 +01:00
Pavel Solodovnikov
f6e765b70f cql3: pass column_specification via lw_shared_ptr
`column_specification` class is marked as "final": it's safe
to use non-polymorphic pointer "lw_shared_ptr" instead of a
more generic "shared_ptr".

tests: unit(dev, debug)

Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>
Message-Id: <20200427084016.26068-1-pa.solodovnikov@scylladb.com>
2020-04-27 12:47:42 +03:00
Pavel Solodovnikov
8efb02146f cql3: const cleanups and API de-pointerization
* Pass raw::select_statement::parameters as lw_shared_ptr
 * Some more const cleanups here and there
 * lists,maps,sets::equals now accept const-ref to *_type_impl
   instead of shared_ptr
 * Remove unused `get_column_for_condition` from modification_statement.hh
 * More methods now accept const-refs instead of shared_ptr

Every call site where a shared_ptr was required as an argument
has been inspected to be sure that no dangling references are
possible.

Tests: unit(dev, debug)

Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>
Message-Id: <20200220153204.279940-1-pa.solodovnikov@scylladb.com>
2020-02-20 18:14:49 +02:00
Pavel Solodovnikov
a46f235092 cql3: prefer passing schema as const ref instead of shared_ptr
De-pointerize cql3 code APIs further: change some call sites
to pass `schema` as const-ref instead of `shared_ptr`.

Affected functions known to be expecting always non-null
pointer to schema and don't store or pass the pointer somewhere
else, assuming it's safe to give them just a reference.

Tests: unit(dev, debug)

Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>
Message-Id: <20200218142338.69824-1-pa.solodovnikov@scylladb.com>
2020-02-18 20:13:10 +02:00
Pavel Solodovnikov
abb3a7e218 cql3: minor sweeps through the cql layer code to reduce shared_ptrs count
Convert some more helper functions to accept const reference to
column_specification and column_identifier instead of shared_ptr.

Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>
2020-02-16 17:24:26 +03:00
Konstantin Osipov
d4866c1a28 cql3: remove prepared alias for prepared_statement
cql3 has cql_statement, parsed_statement and prepared_statement
classes, which, largely, stand for the same thing. prepared was
an alias for prepared_statement which only required an extra
tag jump in IDE and carried no meaning.
2020-02-12 16:44:43 +03:00
Pavel Solodovnikov
e1b22b6a4c cql3: get rid of lw_shared_ptr for variable_specifications
`parsed_statement::get_bound_variables` is assumed to always
return a nonnull pointer to `variable_specifications` instance.

In this case using a pointer is superfluous and can be safely
replaced by a plain reference.

Also add a default ctor and a utility method `set_bound_variables`
to the `variable_specifications` class to actually reset the
contents of the class instance.

Tests: unit(dev, debug)

Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>
Message-Id: <20200120195839.164296-1-pa.solodovnikov@scylladb.com>
2020-01-22 12:51:02 +02:00
Pavel Solodovnikov
aba9a11ff0 cql: pass variable_specifications via lw_shared_ptr
Instances of `variable_specifications` are passed around as
shared_ptr's, which are redundant in this case since the class
is marked as `final`. Use `lw_shared_ptr` instead since we know
for sure it's not a polymorphic pointer.

Tests: unit(debug)

Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>
Message-Id: <20191225232853.45395-1-pa.solodovnikov@scylladb.com>
2019-12-29 16:26:26 +02:00
Benny Halevy
fae4ca756c cql3: select_statement: provide default initializer for parameters::_bypass_cache
Fixes #4503

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Message-Id: <20190521143300.22753-1-bhalevy@scylladb.com>
2019-05-21 20:06:40 +03:00
Dejan Mircevski
274a77f45e Process GROUP BY columns into select_statement
Validate raw GROUP BY identifiers and translate them into
a select_statement member.

Signed-off-by: Dejan Mircevski <dejan@scylladb.com>
2019-05-08 10:10:10 -04:00
Dejan Mircevski
e1fb414805 Parse GROUP BY clause, store column identifiers
Extend the grammar file with GROUP BY, collect the column identifiers,
and store them in raw::select_statement.

Signed-off-by: Dejan Mircevski <dejan@scylladb.com>
2019-05-08 10:09:22 -04:00
Piotr Sarna
93786a9148 cql3: add per_partition_limit to CQL statement
Select statements can now accept per_partition_limit variable.
2019-02-18 10:29:34 +01:00
Avi Kivity
ecf3f92ec7 cql: add SELECT ... BYPASS CACHE clause
The BYPASS CACHE clause instructs the database not to read from or populate the
cache for this query. The new keywords (BYPASS and CACHE) are not reserved.
2018-11-26 11:37:49 +02:00
Avi Kivity
775b7e41f4 Update seastar submodule
* seastar d59fcef...b924495 (2):
  > build: Fix protobuf generation rules
  > Merge "Restructure files" from Jesse

Includes fixup patch from Jesse:

"
Update Seastar `#include`s to reflect restructure

All Seastar header files are now prefixed with "seastar" and the
configure script reflects the new locations of files.

Signed-off-by: Jesse Haber-Kucharsky <jhaberku@scylladb.com>
Message-Id: <5d22d964a7735696fb6bb7606ed88f35dde31413.1542731639.git.jhaberku@scylladb.com>
"
2018-11-21 00:01:44 +02:00
Eliran Sinvani
fd422c954e cql3: ensure retrieval of columns for filtering
When a query that needs filtering is executed, the columns
that the coordinator is filtering by have to be retrieved.The
columns should be retrieved even if they are not used for
ordering or named in the actual select clause.
If the columns are missing from the result set, then any
filtering that restricts the missing column will not take
place.

Fixes #3803

Signed-off-by: Eliran Sinvani <eliransin@scylladb.com>
2018-10-21 08:41:46 +03:00
Piotr Sarna
27bf20aa3f cql3: enable ALLOW FILTERING
Enables 'ALLOW FILTERING' queries by transfering control
to result_set_builder::filtering_visitor.
Both regular and primary key columns are allowed,
but some things are left unimplemented:
 - multi-column restrictions
 - CONTAINS queries

Fixes #2025
2018-07-05 10:50:43 +02:00
Piotr Sarna
15545da572 cql3: add support for SELECT JSON clause
This commit adds the implementation of SELECT JSON clause
which returns rows in JSON format. Each returned row has a single
'[json]' column.

References #2058
2018-04-11 17:12:02 +02:00
Vlad Zolotarov
ff55b76562 cql3::query_processor: use weak_ptr for passing the prepared statements around
Use seastar::checked_ptr<weak_ptr<pepared_statement>> instead of shared_ptr for passing prepared statements around.
This allows an easy tracking and handling of statements invalidation.

This implementation will throw an exception every time an invalidated
statement reference is dereferenced.

Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>
2017-04-12 12:24:03 -04:00
Duarte Nunes
a9c17b0a52 select_statement: Propagate for_view argument
This patch propagates the for_view argument, used by
statement_restrictions to ensure IS NOT NULL can be used when creating
a materialized view.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2016-12-20 13:06:11 +00:00
Vlad Zolotarov
7606588267 cql3::query_processor: add cql_stats
- Add cql_stats member.
   - Pass it to cql3::raw::parsed_statement::prepare() virtual method.

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
2016-11-03 11:48:57 -04:00
Avi Kivity
25b3d74f45 cql3: Split select_statement::raw_statement into raw namespace
cql3::select_statement::raw_statement
    -> cql3::raw::select_statement
Message-Id: <1464609556-3756-4-git-send-email-avi@scylladb.com>
2016-05-31 09:09:30 +03:00