scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-05-23 00:02:37 +00:00

Author	SHA1	Message	Date
Avi Kivity	1d631f7bac	cql3: statement_restrictions: fold add_single_column_partition_key_restriction() into its caller The goal is to simplify flow-control where the order in which variables are updated depends on their location in the source. With functions, this is difficult.	2026-04-19 20:57:05 +03:00
Avi Kivity	24cd98e454	cql3: statement_restrictions: fold add_token_partition_key_restriction() into its caller The goal is to simplify flow-control where the order in which variables are updated depends on their location in the source. With functions, this is difficult.	2026-04-19 20:57:05 +03:00
Avi Kivity	be3239fc58	cql3: statement_restrictions: fold add_multi_column_clustering_key_restriction() into its caller The goal is to simplify flow-control where the order in which variables are updated depends on their location in the source. With functions, this is difficult.	2026-04-19 20:57:05 +03:00
Avi Kivity	8990346c75	cql3: statement_restrictions: avoid early return in add_multi_column_clustering_key_restrictions Prepare for inlining it into its caller, which doesn't work easily if there's an early return.	2026-04-19 20:57:05 +03:00
Avi Kivity	fa130051a6	cql3: statement_restrictions: fold add_is_not_restriction() into its caller The goal is to simplify flow-control where the order in which variables are updated depends on their location in the source. With functions, this is difficult.	2026-04-19 20:57:05 +03:00
Avi Kivity	63f9362c89	cql3: statement_restrictions: fold add_restriction() into its caller The goal is to simplify flow-control where the order in which variables are updated depends on their location in the source. With functions, this is difficult.	2026-04-19 20:57:05 +03:00
Avi Kivity	9cbb1b851e	cql3: statement_restrictions: remove possible_partition_token_values() It's just a call to possible_lhs_values() with a different signature. Now possible_lhs_values() is our only solver.	2026-04-19 20:57:05 +03:00
Avi Kivity	c1fc596203	cql3: statement_restrictions: remove possible_column_values replace with now-identical possible_lhs_values. This paves the way to have only one solver function (after we remove possible_partition_token_values).	2026-04-19 20:57:05 +03:00
Avi Kivity	b26e6f7330	cql3: statement_restrictions: pass schema to possible_column_values() This unifies the signature with possible_lhs_values(), paving the way to deduplicating the two functions. We always have the schema and may as well pass it.	2026-04-19 20:57:05 +03:00
Avi Kivity	c6f6e81fe5	cql3: statement_restrictions: remove fallback path in solve() All query plans that try to solve for the possible values a column (or token, or column-tuple) can take have been converted to set analyzed_column::solve_for. Recognize that by removing the fallback path. This removes the last possible_column_values() call that isn't bound (using std::bind_front), and will allow moving it to prepare time.	2026-04-19 20:57:05 +03:00
Avi Kivity	e0445269e5	cql3: statement_restrictions: reorder possible_lhs_column parameters By moving query_options to the end, we can use std::bind_front to convert it from a build-time to a run-time function that depends only on the query_options.	2026-04-19 20:57:05 +03:00
Avi Kivity	e42ad62561	cql3: statement_restrictions: prepare solver for multi-column restrictions Multi-column restrictions (a, b) > (:v1, :v2) do not obey normal comparison rules. For example, given (a, b) > (5, 1) AND a <= 5, we see that (a, b) = (5, 2) satisfies the constraint, but if we tried to solve for the interval ( (5, 1), (5) ], we'd have to conclude that (5, 1) <= (5). It's possible to extend the CQL type system to support this, but that would be a lot of work, and in fact the current code doesn't depend on it (it solves these intersections in its own code path, multi_column_range_accumulator_builder's prefix3cmp). So, we just mark such solvers as non-comparable, and generate an internal error if we try to compare them in make_conjunction.	2026-04-19 20:57:05 +03:00
Avi Kivity	96e8414963	cql3: statement_restrictions: add solver for token restriction on index possible_column_values() knows how to find the values that the token can take, so add a solve_for implementation for tokens.	2026-04-19 20:57:04 +03:00
Avi Kivity	135809d97b	cql3: statement_restrictions: pre-analyze column in value_for() Since we pre-analyze the column, return a built function, and remove the corresponding lambda from the caller.	2026-04-19 20:57:04 +03:00
Avi Kivity	0a16d90acb	cql3: statement_restrictions: don't handle boolean constants in multi_column_range_accumulator_builder In statement_restriction's constructor, we check that all the boolean factors are relations. This means the code to handle a constant here is dead code. Remove it; while it's good to handle it, it should be handled at the top level, not in multi-column restriction processing.	2026-04-19 20:57:04 +03:00
Avi Kivity	56ae02d8a3	cql3: statement_restrictions: split range_from_raw_bounds into prepare phase and query phase range_from_raw_bounds processes restrictions of the form (a, b) > SCYLLA_CLUSTERING_BOUND(?, ?) indicating that comparisons respect whether columns are reversed or not. Iterate over expressions during the prepare phase only, generating "builder" functions to be executed during the query phase.	2026-04-19 20:57:04 +03:00
Avi Kivity	2c75123bbd	cql3: statement_restrictions: adjust signature of range_from_raw_bounds The get_clustering_bounds() family works in terms of vectors of clustering ranges (to support IN) and in fact the only caller converts it to a vector. Converting it immediately simplifies later patching.	2026-04-19 20:57:04 +03:00
Avi Kivity	e646b763e7	cql3: statement_restrictions: split multi_column_range_accumulator into prepare-time and query-time phases multi_column_range_accumulator analyzes an expression containing multi-column restrictions of the form (a, b) > (?, ?) and simultaneously analyzes them and solves for the set of intervals that satisfy those restrictions. Split this into a prepare-time phase (that generates "builders", functions that operate on the accumulator), and a query phase that executes the builders. Importantly, the expression visitor ends up on the prepare phase, so it can be merged with other parts of the analysis. Helper functions of the visitor are made static, since they need to run during the query phase but the visitor only exists during the prepare phase.	2026-04-19 20:57:04 +03:00
Avi Kivity	ea26186043	cql3: statement_restrictions: make get_multi_column_clustering_bounds a builder Lay the groundwork for analyzing multi column clustering bounds by splitting the function into prepare-time and execute-time parts. To start with, all of the work is done at query time, but later patches will move bits into prepare time.	2026-04-19 20:57:04 +03:00
Avi Kivity	c60e3d5cf7	cql3: statement_restrictions: multi-key clustering restrictions one layer deeper For the multi column binary operator case, perform more of the work at prepare time in preparation for consolidating the analysis.	2026-04-19 20:57:04 +03:00
Avi Kivity	b520e74128	cql3: statement_restrictions: push multi-column post-processing into get_multi_column_clustering_bounds() Doing this splits the multi-column processing code into a preparation phase and an evaluation phase in a single call, making it easier to further split prepare/evaluate.	2026-04-19 20:57:04 +03:00
Avi Kivity	c4ab0ddb85	cql3: statement_restrictions: pre-analyze single-column clustering key restrictions Change _clustering_prefix_restrictions and _idx_tbl_ck_prefix (the latter is the equivalent of the former, for indexed queries), to use predicate instead of expressions. This lets us do more of the work of solving restrictions during prepare time. We only handle single-column restrictions here. Multi-column restrictions use the existing path. We introduce two helpers: - value_set_to_singleton() converts a restriction solution to a singleton when we know that's the only possible answer - replace_column_def() overload for predicate, similar to the existing overload for expressions There is a wart in get_single_column_clustering_bounds(): we arrive at this point with the two vectors possibly pointing at different columns. Previously, possible_lhs_values() did this check while solving. We now check for it here. The predicate::on variant gets another member, for clustering key prefixes. Since everything is still handled by the legacy paths, we mostly error out.	2026-04-19 20:57:04 +03:00
Avi Kivity	201ed53837	cql3: statement_restrictions: wrap value_for_index_partition_key() To allow more work to be carried out during prepare time, wrap the body in an std::function, which will be called at execution time. Currently we actually do the work during execution time; but the way is prepared.	2026-04-19 20:57:04 +03:00
Avi Kivity	325497d460	cql3: statement_restrictions: hide value_for() value_for() is a general function that solves for values that satisfy an expression set to TRUE. This goes against our goal to prepare solvers for all the expressions we use. Fortunately, it's only called with one expression, which comes from statement_restrictions, so we can add an accessor that provides the expression from our own state. Later, we'll be able to do prepare-time work on it.	2026-04-19 20:57:04 +03:00
Avi Kivity	dcdd2f7e72	cql3: statement_restrictions: push down clustering prefix wrapper one level This allows us to tackle each case separately.	2026-04-19 20:57:03 +03:00
Avi Kivity	1039ed9ed2	cql3: statement_restrictions: wrap functions that return clustering ranges During prepare time, build functions for use during execution time. Currently, the wrappers are very shallow, and practically all the work is done at execution time. But the stage is set for more peeling. The index clustering ranges had on_internal_error()s if an index was not used. They're converted to returning a null function. If executed (which is never supposed to happen), it will throw a bad_function_call.	2026-04-19 20:57:03 +03:00
Avi Kivity	620df7103f	cql3: statement_restrictions: do not pass view schema back and forth For indexed queries, statement_restrictions calculates _view_schema, which is passed via get_view_schema() to indexed_select_statement(), which passes it right back to statement_restrictions via one of three functions to calculate clustering ranges. Avoid the back-and-forth and use the stored value. Using a different value would be broken. This change allows unifying the signatures of the four functions that get clustering ranges.	2026-04-19 20:57:03 +03:00
Avi Kivity	6fce090e30	cql3: statement_restrictions: pre-analyze token range restrictions Convert token range restrictions to the predicate format we introduced earlier, where we have a function to solve for the token range rather than running the analysis at runtime. Again the truth is that the function will delegate to possible_partition_token_values() which actually will do the analysis at runtime, but it's one step closer. We add a new variant element for predicate::on, since it doesn't fit the existing element (the token isn't a column).	2026-04-19 20:57:03 +03:00
Avi Kivity	941011bb4a	cql3: statement_restrictions: pre-analyze partition key columns The expression tree for partition keys is analyzed during runtime: in partition_range_from_singles() (for example), we call find_binop and get_subscripted_column() to understand the expression structure. This analysis is problematic because it has to match the analysis during prepare time; and they have to evolve in lock step. Here, we move the analysis to the prepare stage. This is done by augmenting the expression into a new predicate struct. It contains the original expression (as a fallback for paths not yet converted), as well as a solve_for function which contains a function built at prepare time that embeds all the necessary analysis. We introduce the `predicate` type which is an augmentation of boolean expressions. In addition to the expression, we remember what column the expression is on, and a function that computes what values the column can take on that would make the expression true. The field that says what column the predicate is about is typed as a variant since later on we will have predicates on non-columns (the token, or a clustering prefix). Note that currently the function engages in some run-time analysis of its own, since it calls possible_lhs_values that itself does analysis, but this is a step in the right direction.	2026-04-19 20:57:03 +03:00
Avi Kivity	c73f3ac55f	cql3: statement_restrictions: do not collect subscripted partition key columns An indexed SELECT of the form SELECT ... WHERE pk['sub'] = ? is impossible because our indexes do not support frozen maps, and partition key collections must be frozen. Stop collecting such constructs for the purpose of determining the partition range. This reduces having to deal with combinations of restrictions on the column and its entries later on. In case we start supporting indexes on frozen maps, leave an on_internal_error to remind us.	2026-04-19 20:57:03 +03:00
Avi Kivity	531f137ed3	cql3: statement_restrictions: split _partition_range_restrictions into three cases _partition_range_restrictions are a vector of expressions, one per partition key column, except that it can be empty if there is no restriction on the partition that can be translated to a read command, and if the restriction is on a token range, the first element only is used. Separate the three cases into distinct structs. After this, additional work can be done utilizing the specialization.	2026-04-19 20:57:03 +03:00
Avi Kivity	fcf7c4c90d	cql3: statement_restrictions: move value_list, value_set to header file They don't really need to be public, but will be used in intermediate storage.	2026-04-19 20:57:03 +03:00
Avi Kivity	926886fcfb	cql3: statement_restrictions: wrap get_partition_key_ranges statement_restrictions::get_partition_key_ranges() re-interprets the expressions used to specify the partition key. This means that the analysis phase (determining what those expressions are and how they are to be used) and the execution phase (using them) are in separate places. This makes it very hard to refactor while preserving correctness. As a first step in unifying the two phases, we move the selection of the strategy (using token, cartesian product, or single partition) from execution to analysis, by making the if-tree return a function to be executed at execution time, rather than running the if-tree itself at execution time.	2026-04-19 20:57:03 +03:00
Avi Kivity	eec0b20dbc	cql3: statement_restrictions: prepare statement_restrictions for capturing `this` Prevent copying/moving, that can change the address, and instead enforce using shared_ptr. Most of the code is already using shared_ptr, so the changes aren't very large. To forbid non-shared_ptr construction, the constructors are annotated with a private_tag tag class.	2026-04-19 20:57:03 +03:00
Avi Kivity	374be94faa	test: statement_restrictions: add index_selection regression test In preparation for refactoring statement_restrictions, add a simple and an exhaustive regression test, encoding the index selection algorithm into the test. We cannot change the index selection algorithm because then mixed-node clusters will alter the sorting key mid-query (if paging takes place). Because the exhaustive space has such a large stack frame, and because Address Sanitizer bloats the stack frame, increase it for debug builds.	2026-04-19 20:57:01 +03:00
Artsiom Mishuta	dce0c24a02	test/alternator: replace bare pytest.skip() with typed skip helpers	2026-04-19 17:34:41 +02:00
Artsiom Mishuta	b078cd1e72	test: migrate new bare skips introduced by upstream after rebase Migrate 3 bare skip sites that appeared in upstream/master after the initial migration: - test/cluster/test_strong_consistency.py: 2 @pytest.mark.skip → @pytest.mark.skip_bug (SCYLLADB-1056) - test/cqlpy/conftest.py: pytest.skip() → skip_env() in skip_on_scylla_vnodes fixture	2026-04-19 17:34:41 +02:00
Artsiom Mishuta	9c4d3ce097	test/pylib: reject bare pytest.mark.skip and add codebase guards Harden the skip_reason_plugin to reject bare @pytest.mark.skip at collection time with pytest.UsageError instead of warnings.warn(). Add test/pylib_test/test_no_bare_skips.py with three guard tests: - AST scan for bare pytest.skip() runtime calls - Real pytest --collect-only against all Python test directories	2026-04-19 17:34:31 +02:00
Avi Kivity	a15294d601	Revert "Update seastar submodule" This reverts commit `2943d30b0c`. It introduces a regression where --unsafe-bypass-fsync is not honored. Fixes https://scylladb.atlassian.net/browse/SCYLLADB-1496	2026-04-19 15:14:48 +03:00
Avi Kivity	9fb67e3e96	Revert "alternator: optional stripping of http response headers" This reverts commit `73f0deef6d`. It prevents `2943d30b0c`, which causes high flakiness, from being reverted.	2026-04-19 15:14:48 +03:00
Artsiom Mishuta	0b6b380b80	test: update comments referencing pytest.skip() to skip_env() Update 7 comments/docstrings across 5 files that still referenced pytest.skip() to reference the typed skip_env() wrapper for consistency with the migrated code.	2026-04-19 11:14:03 +02:00
Artsiom Mishuta	b10028e556	test: migrate runtime pytest.skip() to typed skip_bug() Migrate 2 runtime pytest.skip() calls referencing known bugs to use the typed skip_bug() wrapper from test.pylib.skip_types: - test/alternator/test_ttl.py: Streams on tablets (#23838) - test/scylla_gdb/test_task_commands.py: coroutine task not found (#22501)	2026-04-19 11:10:42 +02:00
Artsiom Mishuta	8a80e2c3be	test: migrate runtime pytest.skip() to typed skip_env() Migrate runtime pytest.skip() calls across 34 files to use the typed skip_env() wrapper from test.pylib.skip_types. These sites skip at runtime because a required feature, config option, library version, build mode, or runtime topology is not available. Also fixes 'raise pytest.skip(...)' in test_audit.py — skip_env() already raises internally, so the explicit raise was incorrect. Each file gains one new import: from test.pylib.skip_types import skip_env	2026-04-19 11:09:29 +02:00
Artsiom Mishuta	fb0974a329	test: migrate bare @pytest.mark.skip to skip_not_implemented Migrate 2 bare @pytest.mark.skip decorators (no reason string) to @pytest.mark.skip_not_implemented with an explicit reason referencing issue #3882 (COMPACT STORAGE not implemented).	2026-04-19 11:06:30 +02:00
Artsiom Mishuta	a39fb9d29a	test: migrate @pytest.mark.skip to @pytest.mark.skip_slow Migrate 4 @pytest.mark.skip decorator sites to @pytest.mark.skip_slow across 3 test files where the skip reason indicates a slow test.	2026-04-19 11:06:30 +02:00
Artsiom Mishuta	638efedc3c	test: migrate @pytest.mark.skip to @pytest.mark.skip_not_implemented Migrate 10 @pytest.mark.skip decorator sites to @pytest.mark.skip_not_implemented across 5 test files where the skip reason indicates a feature not yet implemented.	2026-04-19 11:06:30 +02:00
Artsiom Mishuta	465636bc53	test: migrate @pytest.mark.skip to @pytest.mark.skip_bug for known bugs Migrate 24 @pytest.mark.skip decorator sites to @pytest.mark.skip_bug across 16 test files where the reason references a known bug or issue.	2026-04-19 11:06:30 +02:00
Szymon Malewski	73f0deef6d	alternator: optional stripping of http response headers In Alternator's HTTP API, response headers can dominate bandwidth for small payloads. The Server, Date, and Content-Type headers were sent on every response but many clients never use them. This patch introduces three Alternator config options: - alternator_http_response_server_header, - alternator_http_response_disable_date_header, - alternator_http_response_disable_content_type_header, which allow customizing or suppressing the respective HTTP response headers. All three options support live update (no restart needed). The Server header is no longer sent by default; the Date and Content-Type defaults preserve the existing behavior. The Server and Date header suppression uses Seastar's set_server_header() and set_generate_date_header() APIs added in https://github.com/scylladb/seastar/pull/3217. This patch also fixes deprecation warnings from older Seastar HTTP APIs. Tests are in test/alternator/test_http_headers.py. Fixes https://scylladb.atlassian.net/browse/SCYLLADB-70 Closes scylladb/scylladb#28288	2026-04-19 09:22:04 +03:00
Nadav Har'El	f83270df12	Merge 'alternator/streams: Block tablet merges for Alternator Streams on tablet tables' from Piotr Szymaniak DynamoDB Streams API can only convey a single parent per stream shard. Tablet merges produce two parents, making them incompatible with Alternator Streams. This series blocks tablet merges when streams are active on a tablet table. For CreateTable, a freshly created table has no pending merges, so streams are enabled immediately with tablet merges blocked. For UpdateTable on an existing table, stream enablement is deferred: the user's intent is stored via `enable_requested`, tablet merges are blocked (new merge decisions are suppressed and any active merge decision is revoked), and the topology coordinator finalizes enablement once no in-flight merges remain. The topology coordinator is woken promptly on error injection release and tablet split completion, reducing finalization latency from ~60s to seconds. `test_parent_children_merge` is marked xfail (merges are now blocked), and downward (merge) steps are removed from `test_parent_filtering` and `test_get_records_with_alternating_tablets_count`. Not addressed here: using a topology request to preempt long-running operations like repair (tracked in SCYLLADB-1304). Refs SCYLLADB-461 Closes scylladb/scylladb#29224 * github.com:scylladb/scylladb: topology: Wake coordinator promptly for stream enablement lifecycle test/cluster: Test deferred stream enablement on tablet tables alternator/streams: Block tablet merges when Alternator Streams are enabled	2026-04-19 09:15:13 +03:00
Nadav Har'El	0d05e3b4a4	alternator: fix ListStreams paging if table is deleted during paging Currently, ListStreams paging works by looking in the list of tables for ExclusiveStartStreamArn and starting there. But it's possible that during the paging process, one of the tables got deleted and ExclusiveStartStreamArn no longer points to an existing table. In the current implementation this caused the paging to stop (as if it had reached the end). The solution is simple: ListStreams will now sort the list of tables by name (it needs to be sorted by something anyway to be consistent across pages), and will look with std::upper_bound for the first table after the ExclusiveStartStreamArn - we don't need to find that table name itself. The patch also includes a test reproducing this bug. As usual, the test passes on DynamoDB, fails on Alternator before this patch, and passes with the patch. Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2026-04-19 09:12:02 +03:00
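
The paging fix described in commit 0d05e3b4a4 can be illustrated with a minimal sketch. This is not Alternator's code: `list_page`, its parameters, and the use of plain table names in place of stream ARNs are all assumptions made for illustration. It only shows why `std::upper_bound` over a sorted list keeps paging alive when the marker table has been deleted.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

// Hypothetical helper (not Alternator's API): returns up to `limit` table
// names that sort strictly after `exclusive_start`.
std::vector<std::string> list_page(std::vector<std::string> tables,
                                   const std::optional<std::string>& exclusive_start,
                                   std::size_t limit) {
    assert(limit > 0);
    // Sort by name so that page boundaries are consistent across calls.
    std::sort(tables.begin(), tables.end());
    auto first = tables.begin();
    if (exclusive_start) {
        // upper_bound returns the first name greater than the marker, so the
        // marker itself does not need to exist in the list any more.
        first = std::upper_bound(tables.begin(), tables.end(), *exclusive_start);
    }
    const auto remaining = static_cast<std::size_t>(tables.end() - first);
    auto last = first + static_cast<std::ptrdiff_t>(std::min(limit, remaining));
    return std::vector<std::string>(first, last);
}
```

With tables `a`, `c`, `d` and a marker of `b` (a table deleted between pages), the sketch resumes at `c`, whereas an exact lookup of `b` would find nothing.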
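
Several statement_restrictions commits above (for example e0445269e5 and 926886fcfb) move analysis to prepare time by returning a function that depends only on `query_options`. The following is a rough sketch of that idea under heavy simplification: `query_options`, `restriction`, `solve`, and `prepare` are hypothetical stand-ins, not the cql3 types. A lambda capture is used here to stay C++17-compatible; in C++20, `std::bind_front(solve, r)` would bind the leading argument the same way, which is why the commit moves `query_options` to the end of the parameter list.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical stand-ins; the real cql3 types are far richer.
struct query_options { std::vector<int> bound_values; };
struct restriction { std::size_t marker_index; int offset; };

// query_options is the last parameter, so every argument before it can be
// fixed at prepare time.
int solve(const restriction& r, const query_options& opts) {
    assert(r.marker_index < opts.bound_values.size());
    return opts.bound_values[r.marker_index] + r.offset;
}

using solver = std::function<int(const query_options&)>;

// Prepare time: capture the analyzed restriction once and return a function
// that needs only the per-execution query_options.
solver prepare(restriction r) {
    return [r](const query_options& opts) { return solve(r, opts); };
}
```

A prepared `solver` can then be invoked once per execution with different bound values, without repeating the analysis that produced the `restriction`.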
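
Commit eec0b20dbc describes forbidding copy/move and annotating constructors with a `private_tag` class so that objects are only created through `shared_ptr` and `this` has a stable address. Below is a minimal sketch of that general idiom; the class `restrictions`, its `make` factory, and `make_checker` are invented for illustration and do not reflect the actual statement_restrictions interface.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <type_traits>

class restrictions {
    struct private_tag {};  // private: outside code cannot name this type
    int _limit;
public:
    // Public so that std::make_shared can call it, but callers outside the
    // class cannot spell private_tag to invoke it directly.
    restrictions(private_tag, int limit) : _limit(limit) {}

    // Copying (and therefore moving) would change the object's address.
    restrictions(const restrictions&) = delete;
    restrictions& operator=(const restrictions&) = delete;

    // Hypothetical factory: the intended way to create an instance.
    static std::shared_ptr<restrictions> make(int limit) {
        assert(limit > 0);
        return std::make_shared<restrictions>(private_tag{}, limit);
    }

    // Capturing `this` is reasonable because the address never changes;
    // the caller must still keep the shared_ptr alive while using the result.
    std::function<bool(int)> make_checker() const {
        return [this](int v) { return v < _limit; };
    }
};
```

The returned function holds a raw `this`, so in this sketch the owner of the `shared_ptr` is responsible for outliving any checker it hands out.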

53948 Commits