scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-05-12 19:02:12 +00:00

Author	SHA1	Message	Date
Piotr Sarna	dd9d6c081e	cql-pytest: relax error conditions for a failed wasm execution Originally, the expected failure for a recursive invocation test case was to expect that fuel gets exhausted, but it's also possible to hit a stack limit first. All errors are equally expected here as long as the execution is halted, so let's relax the condition and accept any wasm-related InvalidRequest errors. Closes #9361	2021-09-20 15:20:52 +03:00
Avi Kivity	8c0f2f9e3d	Revert "Merge 'cql3: Add expr::constant to replace terminal' from Jan Ciołek" This reverts commit `e9343fd382`, reversing changes made to `27138b215b`. It causes a regression in v2 serialization_format support: collection_serialization_with_protocol_v2_test fails with: marshaling error: read_simple_bytes - not enough bytes (requested 1627390306, got 3) Fixes #9360	2021-09-20 15:15:09 +03:00
Avi Kivity	15819e0304	Merge "Database start/stop code sanitation" from Pavel E " Currently database start and stop code is quite disperse and exists in two slightly different forms -- one in main and the other one in cql_test_env. This set unifies both and makes them look almost the perfect way: sharded<database> db; db.start(<dependencies>); auto stop = defer([&db] { db.stop().get(); }); db.invoke_on_all(&database::start).get(); with all (well, most) other mentionings of the "db" variable being arguments for other services' dependencies. tests: unit(dev, release), unit.cross_shard_barrier(debug) dtest.simple_boot_shutdown(dev) refs: #2737 refs: #2795 refs: #5489 " * 'br-database-teardown-unification-2' of https://github.com/xemul/scylla: (26 commits) main: Log when database starts view_update_generator: Register staging sstables in constructor database, messaging: Delete old connection drop notification database, proxy: Relocate connection-drop activity messaging, proxy: Notify connection drops with boost signal database, tests: Rework recommended format setting database, sstables_manager: Sow some noexcepts database: Eliminate unused helpers database: Merge the stop_database() into database::stop() database: Flatten stop_database() database: Equip with cross-shard-barrier database: Move starting bits into start() database: Add .start() method main: Initialize directories before database main, api: Detach set_server_config from database and move up main: Shorten commitlog creation database: Extract commitlog initialization from init_system_keyspace repair: Shutdown without database help main: Shift iosched verification upward database: Remove unused mm arg from init_non_system_keyspaces() ...	2021-09-20 10:26:13 +03:00
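The start/stop shape quoted in the merge message above (start, register a deferred stop, then use) can be illustrated without Seastar. The following is a minimal sketch only: `deferred_action`, `toy_service` and `run_lifecycle` are hypothetical names invented for illustration, not Scylla or Seastar APIs, and the real code works on `sharded<database>` with futures rather than plain objects.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical scope guard, loosely modelled on the defer() call in the
// commit message: it runs the stored callable when it goes out of scope.
template <typename Func>
class deferred_action {
    Func _func;
public:
    explicit deferred_action(Func f) : _func(std::move(f)) {}
    deferred_action(const deferred_action&) = delete;
    deferred_action& operator=(const deferred_action&) = delete;
    ~deferred_action() { _func(); }
};

// Hypothetical toy service that records its lifecycle events in a log.
struct toy_service {
    std::vector<std::string>& log;
    std::string name;
    void start() { log.push_back("start " + name); }
    void stop() { log.push_back("stop " + name); }
};

// Starts two services, each followed immediately by its stop guard,
// and returns the recorded order of events.
inline std::vector<std::string> run_lifecycle() {
    std::vector<std::string> log;
    {
        toy_service db{log, "db"};
        db.start();
        deferred_action stop_db([&db] { db.stop(); });

        toy_service proxy{log, "proxy"};  // stands in for a dependent service
        proxy.start();
        deferred_action stop_proxy([&proxy] { proxy.stop(); });
    }  // guards fire here in reverse order of construction
    return log;
}
```

Because each guard is declared right after the matching start, services are stopped in the reverse of their start order, so a dependent service is always stopped before the one it depends on.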
Kamil Braun	e3f1667744	sstables: remove use_binary_search_in_promoted_index This was a global variable that was potentially modified from a performance benchmark. It would modify the behavior of `index_reader` in certain scenarios. Remove the variable so we can specify the behavior of `index_reader` functions without relying on anything other than what's passed into the constructor and the function parameters.	2021-09-19 13:59:25 +03:00
Pavel Emelyanov	b78e9b51b7	database, tests: Rework recommended format setting Tests don't have sstable format selector and enforce the needed format by hand with the help of special database:: method. It's more natural to provide it via config. Doing this makes database initialization in main and cql_test_env closer to each other. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-09-15 17:49:06 +03:00
Pavel Emelyanov	4b7846da86	database: Merge the stop_database() into database::stop() After stop_database() became shard-local, it's possible to merge it with database::stop() as they are both called one after another on scylla stop. In cql-test-env there are a few more steps in between, but they don't rely on the database being partially stopped. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-09-15 17:49:06 +03:00
Pavel Emelyanov	b1013e09b4	database: Equip with cross-shard-barrier Make sure a node-wide barrier exists on a database when scylla starts. Also provide a barrier for cql_test_env. In all other cases keep a solo-mode barrier so that single-shard db stop doesn't get blocked. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-09-15 17:49:06 +03:00
Pavel Emelyanov	634ea4b543	database: Move starting bits into start() These include large_data_handler::start, compaction_manager::enable and database::init_commitlog. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-09-15 17:48:48 +03:00
Pavel Emelyanov	e2308034ff	database: Add .start() method Called right after the sharded::start(). For now empty, to be populated by next patches. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-09-15 17:44:48 +03:00
Pavel Emelyanov	127e4fe8de	main: Shorten commitlog creation This does three things in one go: - converts db.invoke_on_all([] (database& db) { return db.init_commitlog(); }); into a one-line version db.invoke_on_all(&database::init_commitlog); - removes the shard-0 pre-initialization for tests, because tests don't have the problem this pre-initialization solves - makes init_commitlog() re-entrant so that a regular start need not check for shard-0 explicitly Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-09-15 17:37:07 +03:00
Pavel Emelyanov	f6ab69b7f8	database: Extract commitlog initialization from init_system_keyspace The intention is to keep all database initialization code in one place. The init_system_keyspace() is one of the obstacles -- it initializes db's commitlog as its first step. This patch moves the commitlog initialization out of the mentioned helper. The result looks clumsy, but it's temporary, next patches will brush it up. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-09-15 17:36:42 +03:00
Pavel Emelyanov	bd2b7dca0e	database: Remove unused mm arg from init_non_system_keyspaces() Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-09-15 17:35:37 +03:00
Pavel Emelyanov	7e5abb5096	main, scylla-gdb, cql-test-env: Unify debug::the_database All the debug:: inhabitants have their names look like "the_<classname>" This patch brings the database piece to this standard. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-09-15 17:35:30 +03:00
Pavel Emelyanov	e324230648	utils: Introduce cross-shard barrier (with test) Add a synchronization facility to let shards wait for each other to pass through certain points in the code. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-09-15 17:35:12 +03:00
Avi Kivity	cc8fc73761	Merge 'hints: fix bugs in HTTP API for waiting for hints found by running dtest in debug mode' from Piotr Dulikowski This series of commits fixes a small number of bugs with current implementation of HTTP API which allows to wait until hints are replayed, found by running the `hintedhandoff_sync_point_api_test` dtest in debug mode. Refs: #9320 Closes #9346 * github.com:scylladb/scylla: commitlog: make it possible to provide base segment ID hints: fill up missing shards with zeros in decoded sync points hints: propagate abort signal correctly in wait_for_sync_point hints: fix use-after-free when dismissing replay waiters	2021-09-15 12:55:54 +03:00
Avi Kivity	daf028210b	build: enable -Winconsistent-missing-override warning This warning can catch a virtual function that thinks it overrides another, but doesn't, because the two functions have different signatures. This isn't very likely since most of our virtual functions override pure virtuals, but it's still worth having. Enable the warning and fix numerous violations. Closes #9347	2021-09-15 12:55:54 +03:00
Piotr Dulikowski	486421c58c	hints: fill up missing shards with zeros in decoded sync points Between encoding and decoding of a sync point, the node might have been restarted and resharded with increased shard count. During resharding, existing hints segments might have been moved to new shards. Because of that, we need to make sure that we wait for foreign segments to be replayed on the new shards too. This commit modifies the sync point decoding logic so that it places a zero replay position for new shards. Additionally, a (incorrect) shard count check is removed from `storage_proxy::wait_for_hint_sync_point` because now the shard count in decoded sync point is guaranteed to be not less than the node's current shard count.	2021-09-15 11:04:34 +02:00
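The decoding change described in the commit above (pad the decoded sync point so it covers every current shard) can be sketched in a few lines. This is an illustrative sketch under assumptions: `replay_position` is simplified to an integer and `pad_sync_point` is a hypothetical helper name; the real sync point type and decoder in Scylla are more involved.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical simplification: one replay position per shard, as an integer.
using replay_position = std::uint64_t;

// Pads a decoded sync point with zero replay positions so that it has an
// entry for every shard the node currently runs. Entries that were already
// present are left untouched, and the vector is never shrunk.
inline std::vector<replay_position> pad_sync_point(
        std::vector<replay_position> decoded,
        std::size_t current_shard_count) {
    if (decoded.size() < current_shard_count) {
        decoded.resize(current_shard_count, 0);
    }
    return decoded;
}
```

After padding, the decoded vector is at least as long as the current shard count, which is the guarantee the commit message relies on when it drops the shard count check.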
Avi Kivity	08042c1688	Merge 'reader_permit: make query max result size accessible from the permit' from Kamil Braun This will make it easier, for example, to enforce memory limits in lower levels of the `flat_mutation_reader` stack. By default, the query result size is unlimited. However, for specific queries it is possible to store a different value (e.g. obtained from a `read_command` object) through a setter. An example of this can be seen in the last commit of this PR, where we set the limit to `cmd.max_result_size` if engaged, or to the 'unlimited query' limit (using `database::get_unlimited_query_max_result_size()`) if not. Refs: #9281. The v2 version of the reverse sstable reader PR will be based on this PR: we'll use the query max result size parameter in one of the readers down the stack where `read_command` is not available but `reader_permit` is. Closes #9341 * github.com:scylladb/scylla: table, database: query, mutation_query: remove unnecessary class_config param reader_permit: make query max result size accessible from the permit reader_concurrency_semaphore: remove default parameter values from constructors query_class_config: remove query::max_result_size default constructor	2021-09-14 16:17:18 +03:00
Kamil Braun	fbb83dd5ca	reader_concurrency_semaphore: remove default parameter values from constructors It's easy to forget about supplying the correct value for a parameter when it has a default value specified. It's safer if 'production code' is forced to always supply these parameters manually. The default values were mostly useful in tests, where some parameters didn't matter that much and where the majority of uses of the class are. Without default values adding a new parameter is a pain, forcing one to modify every usage in the tests - and there are a bunch of them. To solve this, we introduce a new constructor which requires passing the `for_tests` tag, marking that the constructor is only supposed to be used in tests (and the constructor has an appropriate comment). This constructor uses default values, but the other constructors - used in 'production code' - do not.	2021-09-14 12:20:28 +02:00
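The tag-constructor idea in the commit above can be sketched as follows. The class `toy_semaphore`, its parameters and its default values are all hypothetical and chosen only for illustration; the actual `reader_concurrency_semaphore` constructors take different arguments.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical class illustrating a test-only constructor selected by a tag.
class toy_semaphore {
    int _count;
    std::size_t _memory;
    std::string _name;
public:
    // Tag type: passing it marks the call site as test code.
    struct for_tests {};

    // "Production" constructor: every parameter must be spelled out.
    toy_semaphore(int count, std::size_t memory, std::string name)
        : _count(count), _memory(memory), _name(std::move(name)) {}

    // Test-only constructor: defaults are allowed here, so adding a new
    // parameter does not force edits to every test that builds the object.
    explicit toy_semaphore(for_tests, int count = 10,
                           std::size_t memory = 1 << 20,
                           std::string name = "test")
        : toy_semaphore(count, memory, std::move(name)) {}

    int count() const { return _count; }
    std::size_t memory() const { return _memory; }
    const std::string& name() const { return _name; }
};
```

A test would write `toy_semaphore sem(toy_semaphore::for_tests{});`, while non-test code has no way to omit an argument by accident.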
Kamil Braun	8386b55e9c	query_class_config: remove query::max_result_size default constructor The default values for the fields of this class didn't make much sense, and the default constructor was used only in a single place so removing it is trivial. It's safer when the user is forced to supply the limits.	2021-09-14 12:20:28 +02:00
Avi Kivity	3f2c680b70	Merge 'Add initial support for WebAssembly in user-defined functions (UDF)' from Piotr Sarna This series adds very basic support for WebAssembly-based user-defined functions. This series comes with a basic set of tests which were used to designate a minimal goal for this initial implementation. Example usage: ```cql CREATE FUNCTION ks.fibonacci (str text) RETURNS NULL ON NULL INPUT RETURNS boolean LANGUAGE xwasm AS ' (module (func $fibonacci (param $n i32) (result i32) (if (i32.lt_s (local.get $n) (i32.const 2)) (return (local.get $n)) ) (i32.add (call $fibonacci (i32.sub (local.get $n) (i32.const 1))) (call $fibonacci (i32.sub (local.get $n) (i32.const 2))) ) ) (export "fibonacci" (func $fibonacci)) ) ' ``` Note that the language is currently called "xwasm" as in "experimental wasm", because its interface is still subject to change in the future. Closes #9108 * github.com:scylladb/scylla: docs: add a WebAssembly entry cql-pytest: add wasm-based tests for user-defined functions main: add wasm engine instantiation treewide: add initial WebAssembly support to UDF wasm: add initial WebAssembly runtime implementation db: add wasm_engine pointer to database lang: add wasm_engine service import wasmtime.hh lua: move to lang/ directory cql3: generalize user-defined functions for more languages	2021-09-14 11:34:20 +03:00
Piotr Sarna	41b94d3cf3	cql-pytest: add wasm-based tests for user-defined functions A first set of wasm-based test cases is added. The tests include verifying that supported types work and that validation of the input wasm is performed.	2021-09-13 19:03:58 +02:00
Avi Kivity	e9343fd382	Merge 'cql3: Add expr::constant to replace terminal' from Jan Ciołek Add new struct to the `expression` variant: ```c++ // A value serialized with the internal (latest) cql_serialization_format struct constant { cql3::raw_value value; data_type type; // Never nullptr, for NULL and UNSET might be empty_type }; ``` and use it where possible instead of `terminal`. This struct will eventually replace all classes deriving from `terminal`, but for now `terminal` can't be removed completely. We can't get rid of terminal yet, because sometimes `terminal` is converted back to `term`, which `constant` can't do. This won't be a problem once we replace term with expression. `bool` is removed from `expression`, now `constant` is used instead. This is a redesign of PR #9203, there is some discussion about the chosen representation there. Closes #9244 * github.com:scylladb/scylla: cql3: term: Remove get_elements and multi_item_terminal from terminals cql3: Replace most uses of terminal with expr::constant cql3: expr: Remove repetition from expr::get_elements cql3: expr: Add expr::get_elements(constant) cql3: term: remove term::bind_and_get cql3: Replace all uses of bind_and_get with evaluate_to_raw_view cql3: expr: Add evaluate_IN_list cql3: tuples: Implement tuples::in_value::get cql3: Move data_type to terminal, make get_value_type non-virtual cql3: user_types: Implement get_value_type in user_types.hh cql3: tuples: Implement get_value_type in tuples.hh cql3: maps: Implement get_value_type in maps.hh cql3: sets: Implement get_value_type in sets.hh cql3: lists: Implement get_value_type in lists.hh cql3: constants: Implement get_value_type in constants.hh cql3: expr: Add expr::evaluate cql3: values: Add unset value to raw_value_view::make_temporary cql3: expr: Add constant to expression	2021-09-13 19:26:09 +03:00
Tomasz Grabiec	890b861d20	Merge 'query::reverse_slice(): toggle reversed bit instead of setting it' from Botond Dénes The above mentioned method is supposed to work both ways: reversed <-> forward, so setting the reversed bit is not correct: it should be toggled, which is what this mini-series does. Closes #9327 * github.com:scylladb/scylla: reverse_slice(): toggle reversed bit instead of setting it partition_slice_builder(): add with_option_toggled() enum_set: add toggle()	2021-09-13 18:48:11 +03:00
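The difference between setting and toggling the reversed bit, as described in the merge above, can be shown with a small bitmask. This is a sketch only: `toy_enum_set`, `option` and the two `reverse_with_*` helpers are hypothetical stand-ins, not the real `enum_set` or `query::reverse_slice()`.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical option enum; only `reversed` matters for the example.
enum class option : unsigned { reversed = 0, distinct = 1 };

// Hypothetical minimal bitmask set with both set() and toggle().
class toy_enum_set {
    std::uint64_t _mask = 0;
    static constexpr std::uint64_t bit(option o) {
        return std::uint64_t(1) << static_cast<unsigned>(o);
    }
public:
    void set(option o) { _mask |= bit(o); }      // OR: idempotent
    void toggle(option o) { _mask ^= bit(o); }   // XOR: flips each time
    bool contains(option o) const { return (_mask & bit(o)) != 0; }
};

// Setting the bit: reversing an already reversed slice leaves it reversed.
inline toy_enum_set reverse_with_set(toy_enum_set s) {
    s.set(option::reversed);
    return s;
}

// Toggling the bit: applying it twice returns to the forward direction.
inline toy_enum_set reverse_with_toggle(toy_enum_set s) {
    s.toggle(option::reversed);
    return s;
}
```

Only the toggling variant works in both directions (forward to reversed and back), which is the behaviour the merge message asks for.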
Avi Kivity	f3712d4767	Merge "Avoid nested seastar::async in tests" from Pavel E " There's a bunch of explicit and implicit async contexts nesting in sstables tests. This set turns them into a single nest async (mostly with an awk script). The indentation in first two patches is deliberately left as it was before patching, i.e. -- slightly broken. As a consolation, after the third patch it suddenly becomes fixed as the unneeded intermediate call with broken indent is removed. tests: unit(dev) " * 'br-sst-tests-no-nested-async' of https://github.com/xemul/scylla: test: Don't nest seastar::async calls (2nd cont) test: Don't nest seastar::async calls (cont) test: Don't nest seastar::async calls	2021-09-13 18:45:46 +03:00
Botond Dénes	96c95119f9	enum_set: add toggle()	2021-09-13 18:05:11 +03:00
Jan Ciolek	68b65771a7	cql3: tuples: Implement get_value_type in tuples.hh To convert a terminal to expr::constant we need to know the value type. Implement getting value type for terminals in tuples.hh. Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>	2021-09-13 17:03:23 +02:00
Jan Ciolek	60a34236ee	cql3: constants: Implement get_value_type in constants.hh To convert a terminal to expr::constant we need to know the value type. Implement getting value type for terminals in constants.hh. Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>	2021-09-13 17:03:23 +02:00
Jan Ciolek	79cb268ada	cql3: expr: Add constant to expression Adds constant to the expression variant: struct constant { raw_value value; data_type type; }; This struct will be used to represent constant values with known bytes and type. This corresponds to the terminal from current design. bool is removed from expression, now constant is used instead. Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>	2021-09-13 17:03:21 +02:00
Avi Kivity	61c9df4bd2	Merge "Split sstable_conforms_to_mutation_source" from Pavel E " The test contains a single case that runs 6 different cases inside and is one of the longest tests out there. Splitting it improves parallel-cases suite run time. tests: unit(dev, debug, release) " * 'br-split-sst-conforms-to-ms' of https://github.com/xemul/scylla: tests: Fix indentation after previous patch tests: Split sstable_conforms_to_mutation_source	2021-09-13 11:27:44 +03:00
Avi Kivity	1fd701e709	test: cql-pytest: skip tests depending on timeuuid monotonicity timeuuid is not monotonic when now() is called on different connections, so when running tests that depend on that property, we get failures if using the Scylla driver (which became standard in `729d0fe`). Skip the tests for now, until we figure out what to do. We probably can't make now() globally monotonic, and there isn't much to gain by making it monotonic only per connection, since clients are allowed to switch connections (and even nodes) at will. Ref #9300 Closes #9323 [avi: committing my own patch to unblock master]	2021-09-12 19:30:40 +03:00
Nadav Har'El	1d4474d543	test/alternator/run: don't run Scylla if "--aws" option The test/alternator/run script runs Scylla and then runs pytest against it. But when passing the "--aws" option, the intention is that these tests be run against AWS DynamoDB, not a local Scylla, so there is no point in starting Scylla at all - so this is what we do in this patch. This doesn't really add a new feature - "test/alternator/run --aws" will now be nothing more than "cd test/alternator; pytest --aws". But it adds the convenience that you can run the same tests on Scylla and AWS with exactly the same "run" command, just adding the "--aws" option, and don't need to sometimes use "run" and sometimes "pytest". Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20210912133239.75463-1-nyh@scylladb.com>	2021-09-12 16:50:38 +03:00
Avi Kivity	c5f52f9d97	schema_tables: don't flush in tests Flushing schema tables is important for crash recovery (without a flush, we might have sstables using a new schema before the commitlog entry noting the schema change has been replayed), but not important for tests that do not test crash recovery. Avoiding those flushes reduces system, user, and real time on tests running on a consumer-level SSD. before: real 8m51.347s user 7m5.743s sys 5m11.185s after: real 7m4.249s user 5m14.085s sys 2m11.197s Note real time is higher than user+sys time divided by the number of hardware threads, indicating that there is still idle time due to the disk flushing, so more work is needed. Closes #9319	2021-09-12 11:32:13 +03:00
Raphael S. Carvalho	acba3bd3c4	sstables: give a more descriptive name to compaction_options The name compaction_options is confusing as it overlaps in meaning with compaction_descriptor. It is hard to tell what the exact difference between them is without digging into the implementation. compaction_options is intended to only carry options specific to a given compaction type, like a mode for scrub, so let's rename it to compaction_type_options to make it clearer for the readers. [avi: adjust for scrub changes] Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <20210908003934.152054-1-raphaelsc@scylladb.com>	2021-09-12 11:21:33 +03:00
Tomasz Grabiec	83113d8661	Merge "raft: new schema for storing raft snapshots" from Pavel Solodovnikov Previously, the layout for storing raft snapshot descriptors contained a `config` field, which had `blob` data type. That means `raft::configuration` for the snapshot was serialized as a whole in binary form. It's convenient to implement and is the most compact form of representing the data, but: 1. Hard to debug due to the need to de-serialize the data. 2. Plants a time bomb wrt. changing data layout and also the documentation in the future. Remove the `config` field from `system.raft_snapshots` and extract it to a separate `system.raft_config` table to store the data in exploded form. Also, modify the schema of `system.raft_snapshots` table in the following way: add a `server_id` field as a part of composite partition key ((group_id, server_id)) to be able to start multiple raft servers belonging to one raft group on the same scylla node. Rename `id` field in `raft_snapshots` to `snapshot_id` so it's self-documenting. Remove `snapshot_id` from clustering key since a given server can have only one snapshot installed at a time. Note that the `raft::server_address` structure contains an opaque `info` member, which is `bytes`, but in the `raft_config` table we use `ip_addr inet` field, instead. We always know that the corresponding member field is going to contain an IP address (either v4 or v6) of a given raft server.
So, now the snapshots schema looks like this: CREATE TABLE raft_snapshots ( group_id timeuuid, server_id uuid, snapshot_id uuid, idx int, term int, -- no `config` field here, moved to `raft_config` table PRIMARY KEY ((group_id, server_id)) ) CREATE TABLE raft_config ( group_id timeuuid, my_server_id uuid, server_id uuid, disposition text, -- can be either 'CURRENT` or `PREVIOUS' can_vote bool, ip_addr inet, PRIMARY KEY ((group_id, my_server_id), server_id, disposition) ); This way it's much easier to extend the schema with new fields, very easy to debug and inspect via CQL, and it's much more descriptive in terms of self-documentation. Tests: unit(dev) * manmanson/raft_snapshots_new_schema_v2: test: adjust `schema_change_test` to include new `system.raft_config` table raft: new schema for storing raft snapshots raft: pass server id to `raft_sys_table_storage` instance	2021-09-10 20:41:59 +02:00
Avi Kivity	7a798b44a2	cql3: expr: replace column_value_tuple by a composition of tuple_constructor and column_value column_value_tuple overlaps both column_value and tuple_constructor (in different respects) and can be replaced by a combination: a tuple_constructor of column_value. The replacement is more expressive (we can have a tuple of column_value and other expression types), though the code (especially grammar) does not allow it yet. So remove column_value_tuple and replace it everywhere with tuple_constructor. Visitors get the merged behavior of the existing tuple_constructor and column_value_tuple, which is usually trivial since tuple_constructor and column_value_tuple came from different hierarchies (term::raw and relation), so usually one of the types just calls on_internal_error(). The change results in awkward casts in two areas: WHERE clause filtering (equal() and related), and clustering key range evaluations (limits() and related). When equal() is replaced by recursive evaluate(), the casts will go away (to be replaced by the evaluate() visitor). Clustering key range extraction will remain limited to tuples of column_value, so the prepare phase will have to vet the expressions to ensure the casts don't fail (and use the filtering path if they will). Tests: unit (dev) Closes #9274	2021-09-10 10:43:29 +02:00
Avi Kivity	219fdcd8da	Merge 'tools: introduce scylla-sstable' from Botond Dénes A tool which can be used to examine the content of sstable(s) and execute various operations on them. The currently supported operations are: * dump - dumps the content of the sstable(s), similar to sstabledump; * dump-index - dumps the content of the sstable index(es), similar to scylla-sstable-index; * writetime-histogram - generates a histogram of all the timestamps in the sstable(s); * custom - a hackable operation for the expert user (until scripting support is implemented); * validate - validate the content of the sstable(s) with the mutation fragment stream validator, same as scrub in validate mode; The sstables to-be-examined are passed as positional command line arguments. Sstables will be processed by the selected operation one-by-one (can be changed with `--merge`). Any number of sstables can be passed but mind the open file limits. Pass the full path to the data component of the sstables (-Data.db). For now it is required that the sstable is found at a valid data path: /path/to/datadir/{keyspace_name}/{table_name}-{table_id}/ The schema to read the sstables is read from a `schema.cql` file. This should contain the keyspace and table definitions, as well as any UDTs used. Filtering the sstable(s) to process only certain partition(s) is supported via the `--partition` and `--partitions-file` command line flags. Partition keys are expected to be in the hexdump format used by scylla (hex representation of the raw buffer). Operations write their output to stdout, or file(s). The tool logs to stderr, with a logger called `scylla-sstable-crawler`. 
Examples: # dump the content of the sstable $ scylla-sstable-crawler --dump /path/to/md-123456-big-Data.db # dump the content of the two sstable(s) as a unified stream $ scylla-sstable-crawler --dump --merge /path/to/md-123456-big-Data.db /path/to/md-123457-big-Data.db # generate a joint histogram for the specified partition $ scylla-sstable-crawler --writetime-histogram --partition={{myhexpartitionkey}} /path/to/md-123456-big-Data.db # validate the specified sstables $ scylla-sstable-crawler --validate /path/to/md-123456-big-Data.db /path/to/md-123457-big-Data.db Future plans: JSON output for dump. * A simple way of generating `schema.cql` for any schema, other than copying it from snapshots, or copying from `cqlsh`. None of these generate a complete output. * Relax sstable path checks, so sstables can be loaded from any path. * Add scripting support (Lua), allowing custom operations to be written in a scripting language. Refs: #9241 Closes #9271 * github.com:scylladb/scylla: tools: remove scylla-sstable-index tools: introduce scylla-sstable tools: extract finding selected operation (handler) into function tools: add schema_loader cql3: query_processor: add parse_statements() cql3: statements/create_type: expose create_type() cql3: statements/create_keyspace: add get_keyspace_metadata()	2021-09-09 19:24:06 +03:00
Avi Kivity	c1028de22a	Merge 'Introduce native reversed format' from Botond Dénes We define the native reverse format as a reversed mutation fragment stream that is identical to one that would be emitted by a table with the same schema but with reversed clustering order. The main difference to the current format is how range tombstones are handled: instead of looking at their start or end bound depending on the order, we always use them as-usual and the reversing reader swaps their bounds to facilitate this. This allows us to treat reversed streams completely transparently: just pass along them a reversed schema and all the reader, compacting and result building code is happily ignorant about the fact that it is a reversed stream. This series is the first step towards implementing efficient reverse reads. It allows us to remove all the special casing we have in various places for reverse reads and thus treating reverse streams transparently in all the middle layers. The only layers that have to know about the actual reversing are mutation sources proper. The plan is that when reading in reverse we create a reversed schema in the top layer then pass this down as the schema for the read. There are two layers that will need to act on this reversed schema: * The layer sitting on top of the first layer which still can't handle reversed streams, this layer will create a reversed reader to handle the transition. * The mutation source proper: which will obtain the underlying schema and will emit the data in reverse order. Once all the mutation sources are able to handle reverse reads, we can get rid of the reverse reader entirely. 
Refs: #1413 Tests: unit(dev) TODO: * v2 * more testing Also on: https://github.com/denesb/scylla.git reverse-reads/v3 Changelog v3: * Drop the entire schema transformation mechanism; * Drop reversing from `schema_builder()`; * Don't keep any information about whether the schema is reversed or not in the schema itself, instead make reversing deterministic w.r.t. schema version, such that: `s.version() == s.make_reversed().make_reversed().version()`; * Re-reverse range tombstones in `streaming_mutation_freezer`, so `reconcilable_results` sent to the coordinator during read repair still use the old reverse format; v2: * Add `data_type reversed(data_type)`; * Add `bound_kind reverse_kind(bound_kind)`; * Make new API safer to use: - `schema::underlying_type()`: return this when unengaged; - `schema::make_transformed()`: noop when applying the same transformation again; * Generalize reversed into transformation. Add support to transferring to remote nodes and shards by way of making `schema_tables` aware of the transformation; * Use reverse schema everywhere in reverse reader; Closes #9184 * github.com:scylladb/scylla: range_tombstone_accumulator: drop _reversed flag test/boost/mutation_test: add test for mutation::consume() monotonicity test/boost/flat_mutation_reader_test: more reversed reader tests flat_mutation_reader: make_reversing_reader(): implement fast_forward_to(partition_range) flat_mutation_reader: make_reversing_reader(): take ownership of the reader test/lib/mutation_source_test: add consistent log to all methods mutation: introduce reverse() mutation_rebuilder: make it standalone mutation: make copy constructor compatible with mutation_opt treewide: switch to native reversed format for reverse reads mutation: consume(): add native reverse order mutation: consume(): don't include dummy rows query: add slice reversing functions partition_slice_builder: add range mutating methods partition_slice_builder: add constructor with slice query: specific_ranges: add 
non-const ranges accessor range_tombstone: add reverse() clustering_bounds_comparator: add reverse_kind() schema: introduce make_reversed() schema: add a transforming copy constructor utils: UUID_gen: introduce negate() types: add reversed(data_type) docs: design-notes: add reverse-reads.md	2021-09-09 15:50:22 +03:00
Botond Dénes	f02632aeb0	range_tombstone_accumulator: drop _reversed flag	2021-09-09 15:42:15 +03:00
Botond Dénes	f07805c3ef	test/boost/mutation_test: add test for mutation::consume() monotonicity In both forward and reverse modes.	2021-09-09 15:42:15 +03:00
Botond Dénes	3cc882f6a8	test/boost/flat_mutation_reader_test: more reversed reader tests Check that the reverse reader emits a stream identical to that emitted by a reader reading in native order from a table with reversed clustering order.	2021-09-09 15:42:15 +03:00
Botond Dénes	350440b418	flat_mutation_reader: make_reversing_reader(): take ownership of the reader Makes for much simpler client code.	2021-09-09 15:42:15 +03:00
Botond Dénes	c71a281e6b	test/lib/mutation_source_test: add consistent log to all methods Most test methods log their own name either via testlog.info() or BOOST_TEST_MESSAGE() so failures can be more easily located. Not all do however. This commit fixes this and also converts all those using BOOST_TEST_MESSAGE() for this to testlog.info(), for consistency.	2021-09-09 15:42:15 +03:00
Botond Dénes	74a22a706b	mutation_rebuilder: make it standalone Not requiring a wrapper object to become usable.	2021-09-09 15:42:15 +03:00
Botond Dénes	502a45ad58	treewide: switch to native reversed format for reverse reads We define the native reverse format as a reversed mutation fragment stream that is identical to one that would be emitted by a table with the same schema but with reversed clustering order. The main difference to the current format is how range tombstones are handled: instead of looking at their start or end bound depending on the order, we always use them as-usual and the reversing reader swaps their bounds to facilitate this. This allows us to treat reversed streams completely transparently: just pass along them a reversed schema and all the reader, compacting and result building code is happily ignorant about the fact that it is a reversed stream.	2021-09-09 15:42:15 +03:00
Botond Dénes	f200c8104a	schema: introduce make_reversed() `make_reversed()` creates a schema identical to the schema instance it is called on, with clustering order reversed. To distinguish the reverse schema from the original one, the node-id part of its version UUID is bit-flipped. This ensures that reversing a schema twice will result in the identical schema to the original one (although a different C++ object). This reversed schema will be used in reversed reads, so intermediate layers can be ignorant of the fact that the read happens in reverse.	2021-09-09 11:49:05 +03:00
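The property stated in the commit above (reversing twice yields the original version) follows from bit-flipping being its own inverse. Below is a minimal sketch under stated assumptions: `toy_uuid`, `toy_schema` and this `negate` are hypothetical, and the choice of the low 48 bits of the least significant half as the node-id field is an assumption borrowed from the RFC 4122 version-1 layout, not something taken from the Scylla source.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical 128-bit UUID split into two 64-bit halves.
struct toy_uuid {
    std::uint64_t msb;
    std::uint64_t lsb;
    bool operator==(const toy_uuid& o) const {
        return msb == o.msb && lsb == o.lsb;
    }
};

// Assumption: the node id occupies the low 48 bits of the lsb half.
constexpr std::uint64_t node_mask = 0x0000FFFFFFFFFFFFULL;

// Flips every bit of the (assumed) node-id field. XOR with a fixed mask
// is an involution: applying it twice restores the input.
inline toy_uuid negate(toy_uuid u) {
    u.lsb ^= node_mask;
    return u;
}

// Hypothetical schema reduced to the two things the example needs.
struct toy_schema {
    toy_uuid version;
    bool clustering_reversed;

    // Returns a copy with clustering order reversed and a version that is
    // derived deterministically from the original one.
    toy_schema make_reversed() const {
        return toy_schema{negate(version), !clustering_reversed};
    }
};
```

With this construction `s.make_reversed().make_reversed().version == s.version` holds for any starting version, and the reversed version always differs from the original because the mask is non-zero.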
Botond Dénes	65913f4cfa	utils: UUID_gen: introduce negate()	2021-09-09 11:49:05 +03:00
Dejan Mircevski	6afdc6004c	cql3/modification_statement: Replace empty-range check with null check The empty-range check causes more bugs than it fixes. Replace it with an explicit check for =NULL (see #7852). Fixes #9311. Fixes #9290. Tests: unit (dev), cql-pytest on Cassandra 4.0 Signed-off-by: Dejan Mircevski <dejan@scylladb.com> Closes #9314	2021-09-09 10:56:13 +03:00
Dejan Mircevski	58a9a24ff0	cql3: Allow indexed query to select static columns We previously forbade selecting a static column when an index is used. But Cassandra allows it, so we should, too -- see #8869. After removing the static-column check, the existing code gets the correct result without any further changes (though it may read multiple rows from the same partition). Fixes #8869. Tests: unit (dev) Signed-off-by: Dejan Mircevski <dejan@scylladb.com> Closes #9307	2021-09-08 08:22:59 +02:00
Tomasz Grabiec	9a77a03ea1	Merge "Remove most uses of gms::get_gossiper(), gms::get_local_gossiper()" from Avi In the quest to have explicit dependencies and the ability to run multiple nodes in one process, remove some uses of get_gossiper() and get_local_gossiper() and replace them with parameters passed from main() or its equivalents. Some uses still remain, mostly in snitch, but this series removes a majority. * https://github.com/avikivity/scylla.git gossiper-deglobal-1/v1 alternator: remove uses of get_local_gossiper() storage_service: remove stray get_gossiper(), get_local_gossiper() calls migration_manager: remove use of get_gossiper() from passive_announce() storage_proxy: start_hints_manager(): don't require caller to provide gossiper migration_manager: remove uses of get_local_gossiper() storage_proxy: remove uses of get_local_gossiper() gossiper: remove get_local_gossiper() from some inline helpers gossiper: remove get_gossiper() from stop_gossiping() gossiper: remove uses of get_local_gossiper for its rpc server api: remove use of get_local_gossiper() gossiper: remove calls to global get_gossiper from within the gossiper itself	2021-09-07 20:02:30 +02:00


2235 Commits