scylladb

Author	SHA1	Message	Date
Marcin Maliszkiewicz	e3f2ebd4fb	cql3: remove not needed cmd copy in indexed_table_select_statement The variable is unused. This should give a tiny performance gain, as it saves an allocation. Closes scylladb/scylladb#23473	2025-03-31 09:34:32 +03:00
Botond Dénes	39bcf99f8e	Merge 'Apply hard limit to partition range vectors in secondary index queries' from Nikos Dragazis Secondary index queries fetch partition keys from the index view and store them in an `std::vector`. The vector size is currently limited by the user's page size and the page memory limit (1MiB). These are not enough to prevent large contiguous allocations (which can lead to stalls). This series introduces a hard limit to the vector size to ensure it does not exceed the allocator's preferred max contiguous allocation size (128KiB). With the size of each element being 120 bytes, this allows for 1092 partition keys. The limit was set to 1000. Any partitions above this limit are discarded. Discarding partitions breaks the querier cache on the replicas, causing a performance regression, as can be seen from the following measurements: ``` * Cluster: 3 nodes (local Docker containers), 1 vCPU, 4GB memory, dev mode * Schema: CREATE KEYSPACE ks WITH replication = {'class': 'org.apache.cassandra.locator.NetworkTopologyStrategy', 'datacenter1': '3'} AND durable_writes = true AND tablets = {'enabled': false}; CREATE TABLE ks.t1 (pk1 int, pk2 int, ck int, value int, PRIMARY KEY ((pk1, pk2), ck)); CREATE INDEX t1_pk2_idx ON ks.t1(pk2); * Query: CONSISTENCY LOCAL_QUORUM; SELECT * FROM ks.t1 where pk2 = 1; +------------+-------------------+-------------------+ \| Page Size \| Master \| Vector Limit \| +============+===================+===================+ \| \| Latency (sec) \| Latency (sec) \| +------------+-------------------+-------------------+ \| 100 \| 5.80 ± 0.13 \| 5.64 ± 0.10 \| +------------+-------------------+-------------------+ \| 1000 \| 4.77 ± 0.07 \| 4.62 ± 0.06 \| +------------+-------------------+-------------------+ \| 2000 \| 4.67 ± 0.07 \| 5.13 ± 0.03 \| +------------+-------------------+-------------------+ \| 5000 \| 4.82 ± 0.09 \| 6.25 ± 0.06 \| +------------+-------------------+-------------------+ \| 10000 \| 4.89 ± 0.36 \| 7.52 ± 0.13 \| +------------+-------------------+-------------------+ \| -1 \| 4.90 ± 0.67 \| 4.79 ± 0.33 \| +------------+-------------------+-------------------+ ``` We expect this to be fixed with adaptive paging in a future PR. Until then, users can avoid regressions by adjusting their page size. Additionally, this series changes the `untyped_result_set` to store rows in a `chunked_vector` instead of an `std::vector`, similarly to the `result_set`. Secondary index queries use an `untyped_result_set` to store the raw result from the index view before processing. With 1MiB results, the `std::vector` would cause a large allocation of this magnitude. Finally, a unit test is added to reproduce the bug. Fixes #18536. The PR fixes stalls of up to 100ms, but there is an easy workaround: adjust the page size. No need to backport. Closes scylladb/scylladb#22682 * github.com:scylladb/scylladb: cql3: secondary index: Limit page size for single-row partitions cql3: secondary index: Limit the size of partition range vectors cql3: untyped_result_set: Store rows in chunked_vector test: Reproduce bug with large allocations from secondary index	2025-03-14 15:06:07 +02:00
Paweł Zakrzewski	d483051e44	cql3/select_statement: reject aggregate functions when PER PARTITION LIMIT is present Before this patch we silently allowed and ignored PER PARTITION LIMIT. While using aggregate functions in conjunction with PER PARTITION LIMIT can make sense, we want to disable it until we can offer proper implementation, see #9879 for discussion. We want to match Cassandra, and for queries with aggregate functions it behaves as follows: - it silently ignores PER PARTITION LIMIT if GROUP BY is present, which matches our previous implementation. - rejects PER PARTITION LIMIT when GROUP BY is not present. This patch adds rejection of the second group. Fixes #9879 Closes scylladb/scylladb#23086	2025-03-13 10:29:53 +02:00
Nikos Dragazis	7a6a4f54a5	cql3: secondary index: Limit page size for single-row partitions The size of the partition range vector was constrained in the previous patch. Any rows beyond the vector's capacity are discarded. In the special case of single-row partitions, we know the size of each partition, so we can enforce this limit on the query itself via the page size. Signed-off-by: Nikos Dragazis <nikolaos.dragazis@scylladb.com>	2025-03-10 12:18:49 +02:00
Nikos Dragazis	76b31a3acc	cql3: secondary index: Limit the size of partition range vectors The partition range vector is an std::vector, which means it performs contiguous allocations. Large allocations are known to cause problems (e.g., reactor stalls). For paged queries, limit the vector size to 1000. If more partition keys are available in the query result, discard them. Ideally, we should not be fetching them at all, but this is not possible without knowing the size of each partition. Currently, each vector element is 120 bytes and the standard allocator's max preferred contiguous allocation is 128KiB. Therefore, the chosen value of 1000 satisfies the constraint (128 KiB / 120 = 1092 > 1000). This should be good enough for most cases. Since secondary index queries involve one base table query per partition key, these queries are slow. A higher limit would only make them slower and increase the probability of a timeout. For the same reason, saving a follow-up paged request from the client would not increase the efficiency much. For unpaged queries, do not apply any limit. This means they remain susceptible to stalls, but unpaged queries are considered unoptimized anyway. Finally, update the unit test reproducer since the bug is now fixed. Signed-off-by: Nikos Dragazis <nikolaos.dragazis@scylladb.com>	2025-03-10 12:18:42 +02:00
Paweł Zakrzewski	9e7f79d1ab	cql3/select_statement: require LIMIT and PER PARTITION LIMIT to be strictly positive LIMIT and PER PARTITION LIMIT limit the number of rows returned or taken into consideration by a query. It makes no logical sense to have this value at less than 1. Cassandra also has this requirement. This patch ensures that the limit value is strictly positive and adds an explicit test for it - it was only tested in a test ported from Cassandra, that is disabled due to other issues. Closes scylladb/scylladb#23013	2025-03-03 08:13:27 +02:00
Paweł Zakrzewski	854d2917a1	cql3/select_statement: reject PER PARTITION LIMIT with SELECT DISTINCT Before this patch we silently allowed and ignored PER PARTITION LIMIT. SELECT DISTINCT requires all the partition key columns, which means that setting PER PARTITION LIMIT is redundant - only one result will be returned from every partition anyway. Cassandra behaves the same way, so this patch also ensures compatibility. Fixes scylladb/scylladb#15109 Closes scylladb/scylladb#22950	2025-02-24 14:50:18 +02:00
Kefu Chai	fd52b0a3cc	cql3: fix false-positive "used-after-move" warning in clang-tidy `slice.is_reversed()` was falsely flagged as accessing moved data, since the underlying enum_set remains valid after move. However, to improve code clarity and silence the warning, now reference `command->slice` directly instead, which is guaranteed to be valid as the move target. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#22971	2025-02-23 18:58:35 +02:00
Kefu Chai	727d5637ab	cql3: remove redundant std::move() in select_statement.cc GCC-14 correctly flagged unnecessary use of std::move() where copy elision applies: ``` return std::move(paging_state_copy); ``` This error occurs in indexed_table_select_statement::generate_view_paging_state_from_base_query_results at line 1122. When a local variable is returned by name, it is already treated as an rvalue and is eligible for copy elision (NRVO), so std::move() is redundant in this context and can prevent that elision. Fixes build failure with GCC-14 which treats redundant moves as errors with -Werror=redundant-move. The error message looks like: ``` /usr/lib64/ccache/g++ -DDEVEL -DSCYLLA_BUILD_MODE=dev -DSCYLLA_ENABLE_ERROR_INJECTION -DSCYLLA_ENABLE_PREEMPTION_SOURCE -DSEASTAR_ENABLE_ALLOC_FAILURE_INJECTION -DXXH_PRIVATE_API -DCMAKE_INTDIR=\"Dev\" -I/home/kefu/dev/scylladb -I/home/kefu/dev/scylladb/build -I/home/kefu/dev/scylladb/build/gen -isystem /home/kefu/dev/scylladb/build/rust -isystem /home/kefu/dev/scylladb/seastar/include -isystem /home/kefu/dev/scylladb/build/Dev/seastar/gen/include -isystem /home/kefu/dev/scylladb/abseil -I/usr/include/p11-kit-1 -O2 -std=gnu++23 -fvisibility=hidden -Wall -Werror -Wextra -Wno-error=deprecated-declarations -Wimplicit-fallthrough -Wno-deprecated-copy -Wno-mismatched-tags -Wno-missing-field-initializers -Wno-overloaded-virtual -Wno-unused-parameter -Wno-changes-meaning -Wno-ignored-attributes -Wno-dangling-pointer -Wno-array-bounds -Wno-narrowing -Wno-type-limits -ffile-prefix-map=/home/kefu/dev/scylladb/= -ffile-prefix-map=/home/kefu/dev/scylladb/build=. -ffile-prefix-map=/home/kefu/dev/scylladb/build/=build -march=westmere -Wstack-usage=21504 -std=gnu++23 -Wno-maybe-uninitialized -Werror=unused-result -fstack-clash-protection -DSEASTAR_P2581R1 -DSEASTAR_API_LEVEL=7 -DSEASTAR_BUILD_SHARED_LIBS -DSEASTAR_SSTRING -DSEASTAR_ENABLE_ALLOC_FAILURE_INJECTION -DSEASTAR_LOGGER_COMPILE_TIME_FMT -DSEASTAR_SCHEDULING_GROUPS_COUNT=19 -DSEASTAR_LOGGER_TYPE_STDOUT -DSEASTAR_TYPE_ERASE_MORE -DBOOST_PROGRAM_OPTIONS_NO_LIB -DBOOST_PROGRAM_OPTIONS_DYN_LINK -DBOOST_THREAD_NO_LIB -DBOOST_THREAD_DYN_LINK -DFMT_SHARED -MD -MT cql3/CMakeFiles/cql3.dir/Dev/statements/select_statement.cc.o -MF cql3/CMakeFiles/cql3.dir/Dev/statements/select_statement.cc.o.d -o cql3/CMakeFiles/cql3.dir/Dev/statements/select_statement.cc.o -c /home/kefu/dev/scylladb/cql3/statements/select_statement.cc /home/kefu/dev/scylladb/cql3/statements/select_statement.cc: In member function ‘seastar::lw_shared_ptr<const service::pager::paging_state> cql3::statements::indexed_table_select_statement::generate_view_paging_state_from_base_query_results(seastar::lw_shared_ptr<const service::pager::paging_state>, const seastar::foreign_ptr<seastar::lw_shared_ptr<query::result> >&, service::query_state&, const cql3::query_options&) const’: /home/kefu/dev/scylladb/cql3/statements/select_statement.cc:1122:21: error: redundant move in return statement [-Werror=redundant-move] 1122 \| return std::move(paging_state_copy); \| ~~~~~~~~~^~~~~~~~~~~~~~~~~~~ ``` Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#22903	2025-02-18 21:12:58 +02:00
Piotr Dulikowski	6aa962f5f4	Merge 'Add audit subsystem for database operations' from Paweł Zakrzewski Introduces a comprehensive audit system to track database operations for security and compliance purposes. This change includes: Core Components: - New audit subsystem for logging database operations - Service level integration for proper resource management - CQL statement tracking with operation categories - Login process integration for tenant management Key Features: - Configurable audit logging (syslog/table) - Operation categorization (QUERY/DML/DDL/DCL/AUTH/ADMIN) - Selective auditing by keyspace/table - Password sanitization in audit logs - Service level shares support (1-1000) for workload prioritization - Proper lifecycle management and cleanup I ran the dtests for audit (manually enabled) and they pass. The in-repo tests pass. Notably, there should be no non-whitespace changes between this and scylla-enterprise Fixes scylladb/scylla-enterprise#4999 Closes scylladb/scylladb#22147 * github.com:scylladb/scylladb: audit: Add shares support to service level management audit: Add service level support to CQL login process audit: Add support to CQL statements audit: Integrate audit subsystem into Scylla main process audit: Add documentation for the audit subsystem audit: Add the audit subsystem	2025-01-17 13:14:55 +01:00
Gleb Natapov	83d15b8e32	cql3: report host id instead of ip in error during SELECT FROM MUTATION_FRAGMENTS query We want to drop ip from the topology::node.	2025-01-16 16:37:07 +02:00
Paweł Zakrzewski	98f5e49ea8	audit: Add support to CQL statements Integrates audit functionality into CQL statement processing to enable tracking of database operations. Key changes: - Add audit_info and statement_category to all CQL statements - Implement audit categories for different statement types: - DDL: Schema altering statements (CREATE/ALTER/DROP) - DML: Data manipulation (INSERT/UPDATE/DELETE/TRUNCATE/USE) - DCL: Access control (GRANT/REVOKE/CREATE ROLE) - QUERY: SELECT statements - ADMIN: Service level operations - Add audit inspection points in query processing: - Before statement execution - After access checks - After statement completion - On execution failures - Add password sanitization for role management statements - Mask plaintext passwords in audit logs - Handle both direct password parameters and options maps - Preserve query structure while hiding sensitive data - Modify prepared statement lifecycle to carry audit context - Pass audit info during statement preparation - Track audit info through statement execution - Support batch statement auditing This change enables comprehensive auditing of CQL operations while ensuring sensitive data is properly masked in audit logs.	2025-01-15 11:10:36 +01:00
Michael Litvak	5ef7afb968	cql3: allow SELECT of specific collection key This adds to the grammar the option to SELECT a specific key in a collection column using subscript syntax. For example: SELECT map['key'] FROM table SELECT map['key1']['key2'] FROM table The key can also be parameterized in a prepared query. For this we need to pass the query options to result_set_builder where we process the selectors. Fixes scylladb/scylladb#7751	2024-12-30 17:05:20 +02:00
Avi Kivity	f3eade2f62	treewide: relicense to ScyllaDB-Source-Available-1.0 Drop the AGPL license in favor of a source-available license. See the blog post [1] for details. [1] https://www.scylladb.com/2024/12/18/why-were-moving-to-a-source-available-license/	2024-12-18 17:45:13 +02:00
Kefu Chai	bab12e3a98	treewide: migrate from boost::adaptors::transformed to std::views::transform now that we are allowed to use C++23. we now have the luxury of using `std::views::transform`. in this change, we: - replace `boost::adaptors::transformed` with `std::views::transform` - use `fmt::join()` when appropriate where `boost::algorithm::join()` is not applicable to a range view returned by `std::views::transform`. - use `std::ranges::fold_left()` to accumulate the range returned by `std::views::transform` - use `std::ranges::fold_left()` to get the maximum element in the range returned by `std::views::transform` - use `std::ranges::min()` to get the minimal element in the range returned by `std::views::transform` - use `std::ranges::equal()` to compare the range views returned by `std::views::transform` - remove unused `#include <boost/range/adaptor/transformed.hpp>` - use `std::ranges::subrange()` instead of `boost::make_iterator_range()`, to feed `std::views::transform()` a view range. to reduce the dependency on boost for better maintainability, and leverage standard library features for better long-term support. this change is part of our ongoing effort to modernize our codebase and reduce external dependencies where possible. limitations: there are still a couple of places where we are still using `boost::adaptors::transformed` due to the lack of a C++23 alternative for `boost::join()` and `boost::adaptors::uniqued`. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#21700	2024-12-03 09:41:32 +02:00
Botond Dénes	075ca6cc02	Merge 'cql3: respect PER PARTITION LIMIT for aggregate queries' from Paweł Zakrzewski Currently, PER PARTITION LIMIT is not implemented for aggregates and queries can result in more rows than expected from the same partition. Instrument the result_set_builder class so that it can enforce PER PARTITION LIMIT for aggregate queries, specifically: - add per_partition_limit to the result_set_builder - expose the number of input rows in the selector result_set_builder gets two new functions handling partition start and end: - accept_partition_end for notifying that a partition has been finished. This is also called when a page ends, so we cannot simply flush here, as a naive implementation could do. - accept_new_partition, where we flush_selectors() if it's indeed a new partition (and not a continuation of the previous) and the query has a grouping: we don't want to flush on new partition in a query like SELECT COUNT() FROM foo; Fixes #5363 Closes scylladb/scylladb#21125 github.com:scylladb/scylladb: test: enable PER PARTIION LIMIT + GROUP BY tests cql3: respect PER PARTITION LIMIT for aggregates cql3: selection: count input rows in the selector cql3: selection: pass per partition limit to the result_set_builder cql3: show different messages for LIMIT and PER PARTITION LIMIT in get_limit	2024-11-20 09:54:28 +02:00
Paweł Zakrzewski	aea3c3851e	cql3: selection: pass per partition limit to the result_set_builder Aggregates require the limit to be applied from within the builder class, so it needs to be passed to it.	2024-11-18 17:56:53 +01:00
Paweł Zakrzewski	cb1483037c	cql3: show different messages for LIMIT and PER PARTITION LIMIT in get_limit select_statement::get_limit is used to evaluate the LIMIT value for both LIMIT and PER PARTITION LIMIT. This change fixes the error message for incorrect values passed by the user.	2024-11-18 17:56:53 +01:00
Nadav Har'El	b778ce08a9	cql3: change sstring_view to std::string_view Our "sstring_view" is a historic alias for the standard std::string_view. The cql3/ directory used this old alias in a few random places; let's change them to use the standard type name. Refs #4062. Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2024-11-18 15:57:20 +02:00
Kefu Chai	59eb2ab119	treewide: s/boost::algorithm::any_of/std::ranges::any_of/ now that we are allowed to use C++23. we now have the luxury of using `std::ranges::any_of`. in this change, we replace `boost::algorithm::any_of` with `std::ranges::any_of` to reduce the dependency to boost for better maintainability, and leverage standard library features for better long-term support. this change is part of our ongoing effort to modernize our codebase and reduce external dependencies where possible. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2024-11-05 14:06:09 +08:00
Kefu Chai	f8bb1c64f1	treewide: s/boost::algorithm::all_of/std::ranges::all_of/ now that we are allowed to use C++23. we now have the luxury of using `std::ranges::all_of`. in this change, we replace `boost::algorithm::all_of` with `std::ranges::all_of` to reduce the dependency to boost for better maintainability, and leverage standard library features for better long-term support. this change is part of our ongoing effort to modernize our codebase and reduce external dependencies where possible. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2024-11-05 14:05:24 +08:00
Avi Kivity	820509026f	schema: replace boost ranges with std ranges To reduce dependency load, use std ranges instead of boost ranges. The std::ranges::{lower,upper}_bound don't support heterogeneous lookup, but a more natural solution is to use a projection to search for the name, so we use that and the custom comparator is removed. Many callers are converted as well due to poor interoperability between boost ranges and std ranges.	2024-10-15 16:42:54 +03:00
Paweł Zakrzewski	16dd58fb0d	cql3: respect the user-defined page size in aggregate queries This change allows the user to fully set the page size for the query. There's still an internal hard-limit of 1MB anyway, so there's no need to limit it to our default value (because using a larger page size might be a query optimization sometimes) Fixes #20612 Closes scylladb/scylladb#20692	2024-09-23 16:31:21 +03:00
Avi Kivity	657848dcbb	cql3: statement_restrictions, expr: move restrictions-related expression utilities out of expression.cc Move all of the blatantly restriction-related expression utilities to statement_restrictions.cc. Some are so blatant as to include the word "restriction" in their name. Others are just so specialized that they cannot be used for anything else. The motivation is that further refactoring will be simplified if it can happen within the same module, as there will not be a need to prove it has no effect elsewhere. Most of the declarations are made non-public (in .cc file) to limit proliferation. A few are needed for tests or in select_statement.cc and so are kept public. Other than that, the only changes are namespace qualifications and removal of a now-duplicate definition ("inclusive"). Closes scylladb/scylladb#20732	2024-09-22 11:00:51 +03:00
Piotr Dulikowski	7e7701d436	Merge 'cql3/statements/select_statement: `SELECT ... USING SERVICE LEVEL`' from Michał Jadwiszczak Allow to specify service level used in select statement `SELECT ... USING SERVICE LEVEL sl_name`. In OSS, this only affects statement's timeout. In case both service level and timeout are specified `SELECT ... USING SERVICE LEVEL sl_name AND TIMEOUT 1h`, the timeout has higher priority as statement's timeout. Fixes scylladb/scylladb#18471 Closes scylladb/scylladb#20523 * github.com:scylladb/scylladb: test/cql-pytest: add test for `SELECT ... USING SERVICE LEVEL` cql3/Cql.g: extend grammar to allow `SELECT ... USING SERVICE LEVEL` cql3/statements/select_statement: use service level timeout cql3/attributes: add service level name field qos/service_level_controller: add method to check if service level exists in cache	2024-09-19 18:19:23 +02:00
Avi Kivity	1663fbe717	cql3: statement_restrictions: use functional style Instead of a constructor, use a new function analyze_statement_restrictions() as the entry point. It returns an immutable statement_restrictions object. This opens the door to returning a variant, with each arm of the variant corresponding to a different query plan.	2024-09-17 17:13:27 +03:00
Avi Kivity	d5c8083b76	cql3: statement_restrictions: make it a const object Make validate_secondary_index_selections() const (it trivially is), and call prepare_indexed_local() / prepared_indexed_global() at the end of the constructor. By making statement_restrictions a const object, reasoning about it can be local (looking at the source file) rather than global (looking at all the interactions of the class with its environment). In fact, we might make it a function one day. Since prepare_indexed_global()/prepare_indexed_local() only mutate _idx_tbl_ck_prefix, which isn't mutated by the rest of the code, the transformation is safe. The corresponding code is removed from select_statement. The removal isn't complete since it still uses some computation, but later deduplication is left for another day.	2024-09-17 17:03:27 +03:00
Michał Jadwiszczak	af6dc78025	cql3/statements/select_statement: use service level timeout Use the service level timeout in the select statement when specified. `USING TIMEOUT` has higher priority in the timeout definition.	2024-09-16 13:48:48 +02:00
Kefu Chai	3e84d43f93	treewide: use seastar::format() or fmt::format() explicitly before this change, we rely on `using namespace seastar` to use `seastar::format()` without qualifying the `format()` with its namespace. this works fine until we changed the parameter type of format string `seastar::format()` from `const char*` to `fmt::format_string<...>`. this change practically invited `seastar::format()` to the club of `std::format()` and `fmt::format()`, where all members accept a templated parameter as its `fmt` parameter. and `seastar::format()` is not the best candidate anymore. although argument-dependent lookup (ADL for short) favors the function which is in the same namespace as its parameter, `using namespace` makes `seastar::format()` more competitive, so both `std::format()` and `seastar::format()` are considered as candidates. that is what is happening in scylladb at quite a few call sites of `format()`, hence ADL is not able to tell which function is the winner in the name lookup: ``` /__w/scylladb/scylladb/mutation/mutation_fragment_stream_validator.cc:265:12: error: call to 'format' is ambiguous 265 \| return format("{} ({}.{} {})", _name_view, s.ks_name(), s.cf_name(), s.id()); \| ^~~~~~ /usr/bin/../lib/gcc/x86_64-redhat-linux/14/../../../../include/c++/14/format:4290:5: note: candidate function [with _Args = <const std::basic_string_view<char> &, const seastar::basic_sstring<char, unsigned int, 15> &, const seastar::basic_sstring<char, unsigned int, 15> &, const utils::tagged_uuid<table_id_tag> &>] 4290 \| format(format_string<_Args...> __fmt, _Args&&... __args) \| ^ /__w/scylladb/scylladb/seastar/include/seastar/core/print.hh:143:1: note: candidate function [with A = <const std::basic_string_view<char> &, const seastar::basic_sstring<char, unsigned int, 15> &, const seastar::basic_sstring<char, unsigned int, 15> &, const utils::tagged_uuid<table_id_tag> &>] 143 \| format(fmt::format_string<A...> fmt, A&&... a) { \| ^ ``` in this change, we change all `format()` to either `fmt::format()` or `seastar::format()` with following rules: - if the caller expects an `sstring` or `std::string_view`, change to `seastar::format()` - if the caller expects an `std::string`, change to `fmt::format()`. because, `sstring::operator std::basic_string` would incur a deep copy. we will need another change to enable scylladb to compile with the latest seastar. namely, to pass the format string as a templated parameter down to helper functions which format their parameters. to minimize the scope of this change, let's include that change when bumping up the seastar submodule. as that change will depend on the seastar change. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2024-09-11 23:21:40 +03:00
Avi Kivity	d69bf4f010	cql3: introduce dialect infrastructure A dialect is a different way to interpret the same CQL statement. Examples: - how duplicate bind variable names are handled (later in this series) - whether `column = NULL` in LWT can return true (as is now) or whether it always returns NULL (as in SQL) Currently, dialect is an empty structure and will be filled in later. It is passed to query_processor methods that also accept a CQL string, and from there to the parser. It is part of the prepared statement cache key, so that if the dialect is changed online, previous parses of the statement are ignored and the statement is prepared again. The patch is careful to pick up the dialect at the entry point (e.g. CQL protocol server) so that the dialect doesn't change while a statement is parsed, prepared, and cached.	2024-08-29 21:19:23 +03:00
Botond Dénes	46563d719f	replica/mutation_dump: enfore pinning of effective replication map Make it a required argument, which ensures that the topology version is pinned for the duration of the query. This is needed because mutation dump queries bypass the storage proxy, where this pinning usually takes place. So it has to be enforced here.	2024-08-22 06:24:06 -04:00
Avi Kivity	3de4e8f91b	Merge 'cql: process LIMIT for GROUP BY select queries' from Paweł Zakrzewski This change fixes #17237, fixes #5361 and fixes #5362 by passing the limit value down the call chain in cql3. A test is also added. fixes #17237 fixes #5361 fixes #5362 The regression happened in 5.4 as we changed the way GROUP BY is processed in `432cb02` - to force aggregation when it is used. The LIMIT value was not passed to aggregations and thus we failed to adhere to it. We want to backport this fix to 5.4 and 6.0 to have continuous correct results for the test case from #17237 This patch consists of 4 commits: - fa4225ea0fac2057b7a9976f57dc06bcbd900cd4 - cql3: respect the user-defined page size in aggregate queries - a precondition for this patch to be implementable - 8fbe69e74dca16ed8832d9a90489ca47ba271d0b - cql3/select_statement: simplify the get_limit function - the `do_get_limit()` function did a lot of legwork that should not be associated with it. This change makes it trivial and makes its callers do additional checks (for unset guards, or for an aggregate query) - 162828194a2b88c22fbee335894ff045dcc943c9 - cql3: process LIMIT for GROUP BY queries - pass the limit value down the chain and make use of it. This is the actual fix to #17237 - b3dc6de6d6cda8f5c09b01463bb52f827a6a00b4 - test/cql-pytest: Add test for GROUP BY queries with LIMIT - tests Closes scylladb/scylladb#18842 * github.com:scylladb/scylladb: test/cql-pytest: Add test for GROUP BY queries with LIMIT cql3: process LIMIT for GROUP BY queries cql3/select_statement: simplify the get_limit function cql3: respect the user-defined page size in aggregate queries	2024-08-14 17:54:59 +03:00
Łukasz Paszkowski	15a01c7111	select_statement::do_execute: Add tracing informaction Add information on table and query schema versions to tracing.	2024-08-13 10:07:12 +02:00
Łukasz Paszkowski	158b994676	query::trim_clustering_row_ranges_to: require reversed schema for native reversed ranges Simplify implementation and for clustering key ranges in native reversed format, require a reversed table schema. Trimming native reversed clustering key ranges requires a reversed schema to be passed in. Thus, the reverse flag is no longer required as it would always be set to false.	2024-08-13 10:07:10 +02:00
Łukasz Paszkowski	309ba68692	select_statement: Execute reversed query in native format Use a reversed schema and a native reversed slice when constructing a read_command and executing a reversed select statement. The read_command created this way is passed further down to query_pagers::pager and storage::proxy::query_result, which transform it to the format they accept/know, i.e. legacy.	2024-08-13 10:03:46 +02:00
Paweł Zakrzewski	e7ae7f3662	cql3: process LIMIT for GROUP BY queries Currently LIMIT is not passed to the query executor at all, and it was just an accident that it worked for the case referenced in #17237. This change passes the limit value down the chain.	2024-08-11 09:08:43 +02:00
Paweł Zakrzewski	3838ad64b3	cql3/select_statement: simplify the get_limit function The get_limit() function performed tasks outside of its scope - for example checked if the statement was an aggregate. This change moves the onus of the check to the caller.	2024-08-11 09:08:43 +02:00
Paweł Zakrzewski	08f3219cb8	cql3: respect the user-defined page size in aggregate queries The comment in the code already states that we should use the user-defined page size if it's provided. To avoid OOM conditions we'll use the internally defined limit as the upper bound or if no page size is provided. This change lays ground work for fixing #5362 and is necessary to pass the test introduced in #19392 once it is implemented.	2024-08-11 09:08:43 +02:00
Avi Kivity	aa1270a00c	treewide: change assert() to SCYLLA_ASSERT() assert() is traditionally disabled in release builds, but not in scylladb. This hasn't caused problems so far, but the latest abseil release includes a commit [1] that causes a 1000 insn/op regression when NDEBUG is not defined. Clearly, we must move towards a build system where NDEBUG is defined in release builds. But we can't just define it blindly without vetting all the assert() calls, as some were written with the expectation that they are enabled in release mode. To solve the conundrum, change all assert() calls to a new SCYLLA_ASSERT() macro in utils/assert.hh. This macro is always defined and is not conditional on NDEBUG, so we can later (after vetting Seastar) enable NDEBUG in release mode. [1] `66ef711d68` Closes scylladb/scylladb#20006	2024-08-05 08:23:35 +03:00
Avi Kivity	3fc4e23a36	forward_service: rename to mapreduce_service forward_service is nondescriptive and misnamed, as it does more than forward requests. It's a classic map/reduce algorithm (and in fact one of its parameters is "reducer"), so name it accordingly. The name "forward" leaked into the wire protocol for the messaging service RPC isolation cookie, so it's kept there. It's also maintained in the name of the logger (for "nodetool setlogginglevel") for compatibility with tests. Closes scylladb/scylladb#19444	2024-07-03 19:29:47 +03:00
Michał Jadwiszczak	e9ace7c203	cql3/select_statement: do not parallelize single-partition aggregations Currently, reads whose WHERE clause restricts them to a single partition are unnecessarily parallelized. This commit checks for this condition so that single-partition reads do not use forward_service.	2024-06-18 19:21:32 +02:00
Tomasz Grabiec	c9294b1642	lwt: Avoid deprecated sharder::shard_of() Instead, use shard_for_reads(). The justification is that: 1) In cas_shard(), we need to pick a single request coordinator. shard_for_reads() gives that, which is equivalent to shard_of() if there is no intra-node migration. 2) In paxos handler for prepare(), the shard we execute it on is the shard from which we read, so shard_for_reads() is the one. 3) Updates of paxos state are separate CQL requests, and use their own sharding. 4) Handler for learn is executing updates using calls to storage_proxy::mutate_locally(), which will use the right sharder for writes. However, the code is still not prepared for intra-node migration, and possibly regular migration too in case of abandoned requests, because the locking of paxos state assumes that the shard is static. That would have to be fixed separately, e.g. by locking both shards (shard_for_writes()) during migration, so that the set of locked shards always intersects during migration and local serialization of paxos state updates is achieved. I left FIXMEs for that.	2024-05-16 00:28:47 +02:00
Pavel Emelyanov	1612aa01ca	cql3: Reserve vector with pk columns When constructing a vector with partition key data, the size of that vector is known beforehand. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes scylladb/scylladb#18239	2024-04-16 07:06:07 +03:00
Kefu Chai	15d59db98b	cql3: select_statement: include <ranges> We should include the headers we use, to avoid compilation failures like: ``` cql3/statements/select_statement.cc:229:79: error: no member named 'filter' in namespace 'std::ranges::views' for (const auto& used_function : used_functions | std::ranges::views::filter(not_native)) { ~~~~~~~~~~~~~~~~~~~~^ 1 error generated. ``` if one of the included headers drops its own `#include <optional>`. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#18145	2024-04-02 18:47:54 +03:00
Botond Dénes	c228e4d518	cql3: select_statement: mutation_fragments_select_statement: fix use-after-return Don't capture stack variables by reference... it can (and will) explode in your face.	2024-02-28 06:48:09 -05:00
Kefu Chai	2dbf044b91	cql3: do not include unused headers these unused includes were identified by clangd. see https://clangd.llvm.org/guides/include-cleaner#unused-include-warning for more details on the "Unused include" warning. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#16791	2024-01-16 16:43:17 +02:00
Kamil Braun	d93074e87e	cql3: don't parallelize select aggregates to local tables We've observed errors during shutdown like the following: ``` ERROR 2023-12-26 17:36:17,413 [shard 0:main] raft - [088f01a3-a18b-4821-b027-9f49e55c1926] applier fiber stopped because of the error: std::_Nested_exception<raft::state_machine_error> (State machine error at raft/server.cc:1230): std::runtime_error (forward_service is shutting down) INFO 2023-12-26 17:36:17,413 [shard 0:strm] storage_service - raft_state_monitor_fiber aborted with raft::stopped_error (Raft instance is stopped) ERROR 2023-12-26 17:36:17,413 [shard 0:strm] storage_service - raft topology: failed to fence previous coordinator raft::stopped_error (Raft instance is stopped, reason: "background error, std::_Nested_exception<raft::state_machine_error> (State machine error at raft/server.cc:1230): std::runtime_error (forward_service is shutting down)") ``` some CQL statement execution was trying to use `forward_service` during shutdown. It turns out that the statement is in `system_keyspace::load_topology_state`: ``` auto gen_rows = co_await execute_cql( format("SELECT count(range_end) as cnt FROM {}.{} WHERE key = '{}' AND id = ?", NAME, CDC_GENERATIONS_V3, cdc::CDC_GENERATIONS_V3_KEY), gen_uuid); ``` It's querying a table in the `system` keyspace. Pushing local table queries through `forward_service` doesn't make sense as the data is not distributed. Excluding local tables from this logic also fixes the shutdown error. Fixes scylladb/scylladb#16570 Closes scylladb/scylladb#16662	2024-01-08 14:44:22 -05:00
Sylwia Szunejko	91a5a41313	add a way to negotiate generation of the tablet info for drivers Tablets metadata is quite expensive to generate (each data_value is an allocation), so an old driver (without support for tablets) will generate huge amounts of such notifications. This commit adds a way to negotiate generation of the notification: a new driver will ask for them, and an old driver won't get them. It uses the OPTIONS/SUPPORTED/STARTUP protocol described in native_protocol_v4.spec. Closes scylladb/scylladb#16611	2024-01-02 20:00:50 +02:00
Nadav Har'El	fc71c34597	Merge 'select statement: verify EXECUTE permissions only for non native functions' from Eliran Sinvani Commit `62458b8e4f` introduced the enforcement of EXECUTE permissions of functions in cql select. However, according to the reference in #12869, the permissions should be enforced only on UDFs and UDAs. The code does not distinguish between the two, so the permissions are unintentionally enforced on native functions as well. This commit introduces the distinction and only enforces the permissions on non-native functions. Fixes #16526 Manually verified (before and after change) with the reproducer supplied in #16526 and also with the `min` and `max` native functions. Also added a test that checks for regression on native functions execution and verified that it fails on authorization before the fix and passes after the fix. Closes scylladb/scylladb#16556 * github.com:scylladb/scylladb: test.py: Add test for native functions permissions select statement: verify EXECUTE permissions only for non native functions	2023-12-26 18:14:21 +02:00
Eliran Sinvani	cac79977d6	select statement: verify EXECUTE permissions only for non native functions Commit `62458b8e4f` introduced the enforcement of EXECUTE permissions of functions in cql select. However, according to the reference in #12869, the permissions should be enforced only on UDFs and UDAs. The code does not distinguish between the two, so the permissions are unintentionally enforced on native functions as well. This commit introduces the distinction and only enforces the permissions on non-native functions. Fixes #16526 Manually verified (before and after change) with the reproducer supplied in #16526 and also with the `min` and `max` native functions. Signed-off-by: Eliran Sinvani <eliransin@scylladb.com>	2023-12-26 10:27:04 +02:00

502 Commits