scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-06-06 15:03:06 +00:00

Author	SHA1	Message	Date
Pavel Emelyanov	bbcf671276	config: Remove unused replacing options The --replace-token and --replace-node were added some time ago, but have never been used since then, just parsed and immediately aborted. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Message-Id: <20210930102222.16294-1-xemul@scylladb.com>	2021-09-30 14:56:04 +03:00
Piotr Jastrzebski	79de151158	cache_tracker: remove unused parameter from on_remove Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com> Message-Id: <f66ad391d86963b43b2a01e957887ea597e591e8.1632992165.git.piotr@scylladb.com>	2021-09-30 13:03:13 +03:00
Pavel Emelyanov	9f5fd8b5c0	system_keyspace: Keep local_host_id on local_cache Some places in the code want to have future-less access to the host id, now they do it all by themselves. Local cache seems to be a better place (for the record -- some time ago the "better place" argument justified cached host id relocation from the storage_service onto the database). While at it -- add the future-less getter for the host_id to be used further. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-09-30 10:54:38 +03:00
Pavel Emelyanov	beb345c00a	code: Rename get_local_host_id() into load_...() There will appear the future-less method which better deserves the get_ prefix, so give the existing method the load_ one. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-09-30 10:33:57 +03:00
Pavel Emelyanov	e49dc4ed0d	system_keyspace: Coroutinize get_/set_local_host_id Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-09-30 10:33:57 +03:00
Avi Kivity	b3c95a1fc6	commitlog: reduce inclusions of commitlog.hh due to db::commitlog::force_sync (#9379) There are now 231 translation units that indirectly include commitlog.hh due to the need to have access to db::commitlog::force_sync. Move that type to a new file commitlog_types.hh and make it available without access to the commitlog class. This reduces the number of translation units that depend on commitlog.hh to 84, improving compile time.	2021-09-29 16:13:44 +03:00
Botond Dénes	41facb3270	treewide: move reversing to the mutation sources Push down reversing to the mutation-sources proper, instead of doing it on the querier level. This will allow us to test reverse reads on the mutation source level. The `max_size` parameter of `consume_page()` is now unused but is not removed in this patch, it will be removed in a follow-up to reduce churn.	2021-09-29 12:15:45 +03:00
Botond Dénes	dec282e050	db/virtual_table: streaming_virtual_table::as_mutation_source(): use query schema instead of table schema The two might not be the same in case the schema was upgraded (unlikely for virtual tables) or if we are reading in reverse. It is important to use the passed-in query schema consistently during a read.	2021-09-28 17:03:57 +03:00
Avi Kivity	369afe3124	treewide: use coroutine::maybe_yield() instead of co_await make_ready_future() The dedicated API shows the intent, and may be a tiny bit faster. Closes #9382	2021-09-23 12:28:56 +02:00
Avi Kivity	6702711d9c	Merge "Gossiper start-stop sanitation (+ bonus track)" from Pavel E " The main challenge here is to move messaging_service.start_listen() call from out of gossiper into main. Other changes are pretty minor compared to that and include - patch gossiper API towards a standard start-shutdown-stop form - gossiping "sharder info" in initial state - configure cluster name and seeds via gossip_config tests: unit(dev) dtest.bootstrap_test.start_stop_test_node(dev) manual(dev): start+stop, nodetool enable-/disablegossip refs: #2737 refs: #2795 refs: #5489 " * 'br-gossiper-dont-start-messaging-listen-2' of https://github.com/xemul/scylla: code: Expell gossiper.hh from other headers storage_service: Gossip "sharder" in initial states gossiper: Relax set_seeds() gossiper, main: Turn init_gossiper into get_seeds_from_config storage_service: Eliminate the do-bind argument from everywhere gossiper: Drop ms-registered manipulations messaging, main, gossiper: Move listening start into main gossiper: Do handlers reg/unreg from start/stop gossiper: Split (un)init_messaging_handler() gossiper: Relocate stop_gossiping() into .stop() gossiper: Introduce .shutdown() and use where appropriate gossiper: Set cluster_name via gossip_config gossiper, main: Straighten start/stop tests/cql_test_env: Open-code tst_init_ms_fd_gossiper tests/cql_test_env: De-global most of gossiper gossiper: Merge start_gossiping() overloads into one gossiper: Use is_... helpers gossiper: Fix do_shadow_round comment gossiper: Dispose dead code	2021-09-23 12:18:38 +03:00
Pavel Emelyanov	598841a5dd	code: Expell gossiper.hh from other headers This needs to add forward declarations of the gossiper class and re-include some other headers here and there. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-09-22 13:13:06 +03:00
Piotr Sarna	d3edca4b43	Merge 'alternator: add stub implementation of TTL's API operations' ... from Nadav Har'El This small series adds a stub implementation of Alternator's UpdateTimeToLive and DescribeTimeToLive operations. These operations can enable, disable, or inquire about, the chosen expiration-time attribute. Currently, the information about the chosen attribute is only saved, with no actual expiration of any items taking place. Because this is an incomplete implementation of this feature, it is not enabled unless an experimental flag is enabled on all nodes in the cluster. See the individual patches for more information on what this series does. Refs #5060. Closes #9345 * github.com:scylladb/scylla: test/alternator: rename utility function test_table_name() alternator: stub TTL operations alternator: make three utility functions in executor.cc non-static test/alternator: test another corner case of TTL	2021-09-21 09:58:17 +02:00
Avi Kivity	15819e0304	Merge "Database start/stop code sanitation" from Pavel E " Currently database start and stop code is quite disperse and exists in two slightly different forms -- one in main and the other one in cql_test_env. This set unifies both and makes them look almost the perfect way: sharded<database> db; db.start(<dependencies>); auto stop = defer([&db] { db.stop().get(); }); db.invoke_on_all(&database::start).get(); with all (well, most) other mentionings of the "db" variable being arguments for other services' dependencies. tests: unit(dev, release), unit.cross_shard_barrier(debug) dtest.simple_boot_shutdown(dev) refs: #2737 refs: #2795 refs: #5489 " * 'br-database-teardown-unification-2' of https://github.com/xemul/scylla: (26 commits) main: Log when database starts view_update_generator: Register staging sstables in constructor database, messaging: Delete old connection drop notification database, proxy: Relocate connection-drop activity messaging, proxy: Notify connection drops with boost signal database, tests: Rework recommended format setting database, sstables_manager: Sow some noexcepts database: Eliminate unused helpers database: Merge the stop_database() into database::stop() database: Flatten stop_database() database: Equip with cross-shard-barrier database: Move starting bits into start() database: Add .start() method main: Initialize directories before database main, api: Detach set_server_config from database and move up main: Shorten commitlog creation database: Extract commitlog initialization from init_system_keyspace repair: Shutdown without database help main: Shift iosched verification upward database: Remove unused mm arg from init_non_system_keyspaces() ...	2021-09-20 10:26:13 +03:00
Nadav Har'El	4ffd8c1f2b	alternator: stub TTL operations This patch adds stubs for the UpdateTimeToLive and DescribeTimeToLive operations to Alternator. These operations can enable, disable, or inquire about, the chosen expiration-time attribute. Currently, the information about the chosen attribute is only saved, with no actual expiration of any items taking place. Some of the tests for the TTL feature start to pass, so their xfail tag is removed. Because this new feature is incomplete, it is not enabled unless the "alternator-ttl" experimental feature is enabled. Moreover, for these operations to be allowed, the entire cluster needs to support this experimental feature, because all nodes need to participate in the data expiration - if some old nodes don't support Alternator TTL, some of the data they hold won't get expired... So we don't allow enabling TTL until all the nodes in the cluster support this feature. The implementation is in a new source file, alternator/ttl.cc. This source file will continue to grow as we implement the expiration feature. Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2021-09-19 21:05:21 +03:00
Pavel Emelyanov	0de69136d4	view_update_generator: Register staging sstables in constructor First, it's to fix the discarded future during the register. The future is not actually such, as it's always the no-op ready one as at that stage the view_update_generator is neither aborted nor is in throttling state. Second, this change is to keep database start-up code in main shorter and cleaner. Registering staging sstables belongs to the view_update_generator start code. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-09-15 17:49:06 +03:00
Pavel Emelyanov	75e1d7ea74	large_data_handler: Prepare for stopped qctx All the large data handler methods rely on global qctx thing to write down its notes. This creates circular dependency: query processor -> database -> large_data_handler -> qctx -> qp In scylla this is not a technical problem, neither qctx nor the query processor are stopped. It is a problem in cql_test_env that stops everything, including resetting qctx to null. To avoid tests stepping on nullptr qctx add the explicit check. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-09-15 17:35:24 +03:00
Avi Kivity	cc8fc73761	Merge 'hints: fix bugs in HTTP API for waiting for hints found by running dtest in debug mode' from Piotr Dulikowski This series of commits fixes a small number of bugs with current implementation of HTTP API which allows to wait until hints are replayed, found by running the `hintedhandoff_sync_point_api_test` dtest in debug mode. Refs: #9320 Closes #9346 * github.com:scylladb/scylla: commitlog: make it possible to provide base segment ID hints: fill up missing shards with zeros in decoded sync points hints: propagate abort signal correctly in wait_for_sync_point hints: fix use-after-free when dismissing replay waiters	2021-09-15 12:55:54 +03:00
Avi Kivity	daf028210b	build: enable -Winconsistent-missing-override warning This warning can catch a virtual function that thinks it overrides another, but doesn't, because the two functions have different signatures. This isn't very likely since most of our virtual functions override pure virtuals, but it's still worth having. Enable the warning and fix numerous violations. Closes #9347	2021-09-15 12:55:54 +03:00
Piotr Dulikowski	91163fcfa5	commitlog: make it possible to provide base segment ID Adds a configuration option to the commitlog: base_segment_id. When provided, the commitlog uses this ID as a base of its segment IDs instead of calculating it based on the number of milliseconds between the epoch and boot time. This is needed in order for the feature which allows waiting for hints to be replayed to work - it relies on the replay positions monotonically increasing. Each endpoint manager periodically re-creates its commitlog instance - if it is re-created when there are no segments on disk, currently it will choose the number of milliseconds between the epoch and boot time, which might result in segments being generated with the same IDs as some segments previously created and deleted during the same runtime.	2021-09-15 11:04:34 +02:00
Piotr Dulikowski	486421c58c	hints: fill up missing shards with zeros in decoded sync points Between encoding and decoding of a sync point, the node might have been restarted and resharded with increased shard count. During resharding, existing hints segments might have been moved to new shards. Because of that, we need to make sure that we wait for foreign segments to be replayed on the new shards too. This commit modifies the sync point decoding logic so that it places a zero replay position for new shards. Additionally, a (incorrect) shard count check is removed from `storage_proxy::wait_for_hint_sync_point` because now the shard count in decoded sync point is guaranteed to be not less than the node's current shard count.	2021-09-15 11:04:34 +02:00
Piotr Dulikowski	77f2448b2c	hints: propagate abort signal correctly in wait_for_sync_point When `manager::wait_for_sync_point` is called, the abort source from the arguments (`as`) might have already been triggered. In such case, the subscription which was supposed to trigger the `local_as` abort source won't be run, and the code will wait indefinitely for hints to be replayed instead of checking the replay status and returning immediately. This commit fixes the problem by manually triggering `local_as` if `as` has already been triggered.	2021-09-14 14:27:01 +02:00
Piotr Dulikowski	8e29ebc5d5	hints: fix use-after-free when dismissing replay waiters When the promise waited on in the `wait_until_hints_are_replayed_up_to` function is resolved, a continuation runs which prints a log line with information about this event. The continuation captures a pointer to the hints sender and uses it to get information about the endpoint whose hints are waited for. However, at this point the sender might have been deleted - for example, when the node is being stopped and everybody waiting for hints is dismissed. This commit fixes the use-after-free by getting all necessary information while the sender is guaranteed to be alive and captures it in the continuation's capture list.	2021-09-14 13:46:16 +02:00
Avi Kivity	3f2c680b70	Merge 'Add initial support for WebAssembly in user-defined functions (UDF)' from Piotr Sarna This series adds very basic support for WebAssembly-based user-defined functions. This series comes with a basic set of tests which were used to designate a minimal goal for this initial implementation. Example usage: ```cql CREATE FUNCTION ks.fibonacci (str text) RETURNS NULL ON NULL INPUT RETURNS boolean LANGUAGE xwasm AS ' (module (func $fibonacci (param $n i32) (result i32) (if (i32.lt_s (local.get $n) (i32.const 2)) (return (local.get $n)) ) (i32.add (call $fibonacci (i32.sub (local.get $n) (i32.const 1))) (call $fibonacci (i32.sub (local.get $n) (i32.const 2))) ) ) (export "fibonacci" (func $fibonacci)) ) ' ``` Note that the language is currently called "xwasm" as in "experimental wasm", because its interface is still subject to change in the future. Closes #9108 * github.com:scylladb/scylla: docs: add a WebAssembly entry cql-pytest: add wasm-based tests for user-defined functions main: add wasm engine instantiation treewide: add initial WebAssembly support to UDF wasm: add initial WebAssembly runtime implementation db: add wasm_engine pointer to database lang: add wasm_engine service import wasmtime.hh lua: move to lang/ directory cql3: generalize user-defined functions for more languages	2021-09-14 11:34:20 +03:00
Avi Kivity	e9ae9279e8	system_keyspace: reindent after conversion to class Conversion to class left indentation in ruins, but that can be easily fixed. 'git diff -w' reports no changes. Closes #9339	2021-09-14 08:49:24 +03:00
Piotr Sarna	62e8c89a9c	treewide: add initial WebAssembly support to UDF This commit adds a very basic support for user-defined functions coded in wasm. The support is very limited (only a few types work) and was not tested against reactor stalls and performance in general.	2021-09-13 19:03:58 +02:00
Avi Kivity	e70b9d4835	system_keyspace: convert from namespace to class All the namespace scope functions in system_keyspace have no place to store context, so they must store their context in global variables. This prevents conversion of those global variables to constructor-provided dependencies. Take the first step towards providing a place to store the context by converting system_keyspace to a class. All the functions are static, so no context is yet available, but we can de-static-ify them incrementally in the future and store the context in class members. Indentation is a mess, but can be easily fixed later.	2021-09-13 15:14:14 +03:00
Avi Kivity	115d6d8d4c	system_keyspace: prepare forward-declared members In anticipation of making system_keyspace a class instead of a namespace, rename any member that is currently forward-declared, since one can't forward-declare a class member. Each member is taken out of the system_keyspace namespace and gains a system_keyspace prefix. Aliases are added to reduce code churn. The result isn't lovely, but can be adjusted later.	2021-09-13 15:11:26 +03:00
Avi Kivity	c6ce81d6a0	system_keyspace: rearrange legacy subnamespace Merge two fragments together, in anticipation of making 'legacy' a struct instead of a namespace (when system_keyspace is a class, we can't nest a namespace inside it).	2021-09-13 15:10:15 +03:00
Avi Kivity	6d379ae6f9	system_keyspace: remove outdated java code This code has been rewritten and not removed, or is not needed. Remove it to reduce clutter.	2021-09-13 15:08:57 +03:00
Piotr Sarna	4e952df470	lua: move to lang/ directory Support for more languages is coming, so let's group them in a separate directory.	2021-09-13 11:01:33 +02:00
Piotr Sarna	46c6603fe0	cql3: generalize user-defined functions for more languages In order to support more languages than just Lua in the future, Lua-specific configuration is now extracted to a separate structure.	2021-09-13 11:01:33 +02:00
Avi Kivity	c5f52f9d97	schema_tables: don't flush in tests Flushing schema tables is important for crash recovery (without a flush, we might have sstables using a new schema before the commitlog entry noting the schema change has been replayed), but not important for tests that do not test crash recovery. Avoiding those flushes reduces system, user, and real time on tests running on a consumer-level SSD. before: real 8m51.347s user 7m5.743s sys 5m11.185s after: real 7m4.249s user 5m14.085s sys 2m11.197s Note real time is higher than user+sys time divided by the number of hardware threads, indicating that there is still idle time due to the disk flushing, so more work is needed. Closes #9319	2021-09-12 11:32:13 +03:00
Tomasz Grabiec	83113d8661	Merge "raft: new schema for storing raft snapshots" from Pavel Solodovnikov Previously, the layout for storing raft snapshot descriptors contained a `config` field, which had `blob` data type. That means `raft::configuration` for the snapshot was serialized as a whole in binary form. It's convenient to implement and is the most compact form of representing the data, but: 1. Hard to debug due to the need to de-serialize the data. 2. Plants a time bomb wrt. changing data layout and also the documentation in the future. Remove the `config` field from `system.raft_snapshots` and extract it to a separate `system.raft_config` table to store the data in exploded form. Also, modify the schema of `system.raft_snapshots` table in the following way: add a `server_id` field as a part of composite partition key ((group_id, server_id)) to be able to start multiple raft servers belonging to one raft group on the same scylla node. Rename `id` field in `raft_snapshots` to `snapshot_id` so it's self-documenting. Remove `snapshot_id` from clustering key since a given server can have only one snapshot installed at a time. Note that the `raft::server_address` structure contains an opaque `info` member, which is `bytes`, but in the `raft_config` table we use `ip_addr inet` field, instead. We always know that the corresponding member field is going to contain an IP address (either v4 or v6) of a given raft server.
So, now the snapshots schema looks like this: CREATE TABLE raft_snapshots ( group_id timeuuid, server_id uuid, snapshot_id uuid, idx int, term int, -- no `config` field here, moved to `raft_config` table PRIMARY KEY ((group_id, server_id)) ) CREATE TABLE raft_config ( group_id timeuuid, my_server_id uuid, server_id uuid, disposition text, -- can be either 'CURRENT' or 'PREVIOUS' can_vote bool, ip_addr inet, PRIMARY KEY ((group_id, my_server_id), server_id, disposition) ); This way it's much easier to extend the schema with new fields, very easy to debug and inspect via CQL, and it's much more descriptive in terms of self-documentation. Tests: unit(dev) * manmanson/raft_snapshots_new_schema_v2: test: adjust `schema_change_test` to include new `system.raft_config` table raft: new schema for storing raft snapshots raft: pass server id to `raft_sys_table_storage` instance	2021-09-10 20:41:59 +02:00
Avi Kivity	16116ac631	interval: constrain comparator parameters The interval template member functions mostly accept tri-comparators but a few functions accept less-comparators. To reduce the chance of error, and to provide better error messages, constrain comparator parameters to the expected signature. In one case (db/size_estimates_virtual_reader.cc) the caller had to be adjusted. The comparator supported comparisons of the interval value type against other types, but not against itself. To simplify things, we add that signature too, even though it will never be called. Closes #9291	2021-09-10 16:43:16 +02:00
Avi Kivity	c1028de22a	Merge 'Introduce native reversed format' from Botond Dénes We define the native reverse format as a reversed mutation fragment stream that is identical to one that would be emitted by a table with the same schema but with reversed clustering order. The main difference to the current format is how range tombstones are handled: instead of looking at their start or end bound depending on the order, we always use them as-usual and the reversing reader swaps their bounds to facilitate this. This allows us to treat reversed streams completely transparently: just pass along them a reversed schema and all the reader, compacting and result building code is happily ignorant about the fact that it is a reversed stream. This series is the first step towards implementing efficient reverse reads. It allows us to remove all the special casing we have in various places for reverse reads and thus treating reverse streams transparently in all the middle layers. The only layers that have to know about the actual reversing are mutation sources proper. The plan is that when reading in reverse we create a reversed schema in the top layer then pass this down as the schema for the read. There are two layers that will need to act on this reversed schema: * The layer sitting on top of the first layer which still can't handle reversed streams, this layer will create a reversed reader to handle the transition. * The mutation source proper: which will obtain the underlying schema and will emit the data in reverse order. Once all the mutation sources are able to handle reverse reads, we can get rid of the reverse reader entirely. 
Refs: #1413 Tests: unit(dev) TODO: * v2 * more testing Also on: https://github.com/denesb/scylla.git reverse-reads/v3 Changelog v3: * Drop the entire schema transformation mechanism; * Drop reversing from `schema_builder()`; * Don't keep any information about whether the schema is reversed or not in the schema itself, instead make reversing deterministic w.r.t. schema version, such that: `s.version() == s.make_reversed().make_reversed().version()`; * Re-reverse range tombstones in `streaming_mutation_freezer`, so `reconcilable_results` sent to the coordinator during read repair still use the old reverse format; v2: * Add `data_type reversed(data_type)`; * Add `bound_kind reverse_kind(bound_kind)`; * Make new API safer to use: - `schema::underlying_type()`: return this when unengaged; - `schema::make_transformed()`: noop when applying the same transformation again; * Generalize reversed into transformation. Add support to transferring to remote nodes and shards by way of making `schema_tables` aware of the transformation; * Use reverse schema everywhere in reverse reader; Closes #9184 * github.com:scylladb/scylla: range_tombstone_accumulator: drop _reversed flag test/boost/mutation_test: add test for mutation::consume() monotonicity test/boost/flat_mutation_reader_test: more reversed reader tests flat_mutation_reader: make_reversing_reader(): implement fast_forward_to(partition_range) flat_mutation_reader: make_reversing_reader(): take ownership of the reader test/lib/mutation_source_test: add consistent log to all methods mutation: introduce reverse() mutation_rebuilder: make it standalone mutation: make copy constructor compatible with mutation_opt treewide: switch to native reversed format for reverse reads mutation: consume(): add native reverse order mutation: consume(): don't include dummy rows query: add slice reversing functions partition_slice_builder: add range mutating methods partition_slice_builder: add constructor with slice query: specific_ranges: add 
non-const ranges accessor range_tombstone: add reverse() clustering_bounds_comparator: add reverse_kind() schema: introduce make_reversed() schema: add a transforming copy constructor utils: UUID_gen: introduce negate() types: add reversed(data_type) docs: design-notes: add reverse-reads.md	2021-09-09 15:50:22 +03:00
Botond Dénes	f02632aeb0	range_tombstone_accumulator: drop _reversed flag	2021-09-09 15:42:15 +03:00
Piotr Sarna	5d7c765422	db,view: split stopping view builder to drain+stop In order to be able to avoid a deadlock when CQL server cannot be started, the view builder shutdown procedure is now split into two parts: drain and stop. Drain is performed before storage proxy shutdown, but stop() will be called even before drain is scheduled. The deadlock is as follows: - view builder creates a reader permit in order to be able to read from system tables - CQL server fails to start, shutdown procedure begins - view builder stop() is not called (because it was not scheduled yet), so it holds onto its reader permit - database shutdown procedure waits for all permits to be destroyed, and it hangs indefinitely because view builder keeps holding its permit.	2021-09-08 10:52:40 +02:00
Avi Kivity	705f957425	Merge "Generalize TLS creds builder configuration" from Pavel E " There are 4 places out there that do the same steps parsing "client_\|server_encryption_options" and configuring the seastar::tls::creds_builder with the values (messaging, redis, alternator and transport). Also to make redis and transport look slimmer main() cleans the client_encryption_options by ... parsing it too. This set introduces a (coroutinized) helper to configure the creds_builder with map<string, string> and removes the options beautification from main. tests: unit(dev), dtest.internode_ssl_test(dev) " * 'br-generalize-tls-creds-builder-configuration' of https://github.com/xemul/scylla: code: Generalize tls::credentials_builder configuration transport, redis: Do not assume fixed encryption options messaging: Move encryption options parsing to ms main: Open-code internode encryption misconfig warning main, config: Move options parsing helpers	2021-09-01 14:19:19 +03:00
Avi Kivity	8b59e3a0b1	Merge ' cql3: Demand ALLOW FILTERING for unlimited, sliced partitions ' from Dejan Mircevski Return the pre- `6773563d3` behavior of demanding ALLOW FILTERING when partition slice is requested but on potentially unlimited number of partitions. Put it on a flag defaulting to "off" for now. Fixes #7608; see comments there for justification. Tests: unit (debug, dev), dtest (cql_additional_test, paging_test) Signed-off-by: Dejan Mircevski <dejan@scylladb.com> Closes #9126 * github.com:scylladb/scylla: cql3: Demand ALLOW FILTERING for unlimited, sliced partitions cql3: Track warnings in prepared_statement test: Use ALLOW FILTERING more strictly cql3: Add statement_restrictions::to_string	2021-08-31 18:05:26 +03:00
Dejan Mircevski	2f28f68e84	cql3: Demand ALLOW FILTERING for unlimited, sliced partitions When a query requests a partition slice but doesn't limit the number of partitions, require that it also says ALLOW FILTERING. Although do_filter() isn't invoked for such queries, the performance can still be unexpectedly slow, and we want to signal that to the user by demanding they explicitly say ALLOW FILTERING. Because we now reject queries that worked fine before, existing applications can break. Therefore, the behavior is controlled by a flag currently defaulting to off. We will default to "on" in the next Scylla version. Fixes #7608; see comments there for justification. Signed-off-by: Dejan Mircevski <dejan@scylladb.com>	2021-08-31 10:45:41 -04:00
Pavel Solodovnikov	8d3c0ee9b6	raft: new schema for storing raft snapshots Previously, the layout for storing raft snapshot descriptors contained a `config` field, which had `blob` data type. That means `raft::configuration` for the snapshot was serialized as a whole in binary form. It's convenient to implement and is the most compact form of representing the data, but: 1. Hard to debug due to the need to de-serialize the data. 2. Plants a time bomb wrt. changing data layout and also the documentation in the future. Remove the `config` field from `system.raft_snapshots` and extract it to a separate `system.raft_config` table to store the data in exploded form. Also, modify the schema of `system.raft_snapshots` table in the following way: add a `server_id` field as a part of composite partition key ((group_id, server_id)) to be able to start multiple raft servers belonging to one raft group on the same scylla node. Rename `id` field in `raft_snapshots` to `snapshot_id` so it's self-documenting. Remove `snapshot_id` from clustering key since a given server can have only one snapshot installed at a time. Note that the `raft::server_address` structure contains an opaque `info` member, which is `bytes`, but in the `raft_config` table we use `ip_addr inet` field, instead. We always know that the corresponding member field is going to contain an IP address (either v4 or v6) of a given raft server.
So, now the snapshots schema looks like this: CREATE TABLE raft_snapshots ( group_id timeuuid, server_id uuid, snapshot_id uuid, idx int, term int, -- no `config` field here, moved to `raft_config` table PRIMARY KEY ((group_id, server_id)) ) CREATE TABLE raft_config ( group_id timeuuid, my_server_id uuid, server_id uuid, disposition text, -- can be either 'CURRENT' or 'PREVIOUS' can_vote bool, ip_addr inet, PRIMARY KEY ((group_id, my_server_id), server_id, disposition) ); This way it's much easier to extend the schema with new fields, very easy to debug and inspect via CQL, and it's much more descriptive in terms of self-documentation. Tests: unit(dev) Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>	2021-08-27 09:24:46 +03:00
Pavel Solodovnikov	c0854a0f62	raft: create system tables only when `raft` experimental feature is set Also introduce a tiny function to return raft-enabled db config for cql testing. Tests: unit(dev) Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com> Message-Id: <20210826091432.279532-1-pa.solodovnikov@scylladb.com>	2021-08-26 12:21:12 +03:00
Benny Halevy	4476800493	flat_mutation_reader: get rid of timeout parameter Now that the timeout is taken from the reader_permit. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2021-08-24 16:30:51 +03:00
Benny Halevy	fe479aca1d	reader_permit: add timeout member To replace the timeout parameter passed to flat_mutation_reader methods. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2021-08-24 14:29:44 +03:00
Pavel Solodovnikov	22794efc22	db: add experimental option for raft Introduce `raft` experimental option. Adjust the tests accordingly to accommodate the new option. It's not enabled by default when providing `--experimental=true` config option and should be requested explicitly via `--experimental-options=raft` config option. Hide the code related to `raft_group_registry` behind the switch. The service object is still constructed but no initialization is performed (`init()` is not called) if the flag is not set. Later, other raft-related things, such as raft schema changes, will also use this flag. Also, don't introduce a corresponding gossiper feature just yet, because again, it should be done after the raft schema changes API contract is stabilized. This will be done in a separate series, probably related to implementing the feature itself. Tests: unit(dev) Ref #9239. Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com> Message-Id: <20210823121956.167682-1-pa.solodovnikov@scylladb.com>	2021-08-23 17:45:58 +03:00
Benny Halevy	e9aff2426e	everywhere: make deferred actions noexcept Prepare for updating seastar submodule to a change that requires deferred actions to be noexcept (and return void). Test: unit(dev, debug) Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2021-08-22 21:11:52 +03:00
Benny Halevy	ef8ec54970	commitlog: segment, segment_manager: mark methods noexcept Prepare for marking deferred_actions noexcept. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2021-08-22 21:11:40 +03:00
Benny Halevy	4439e5c132	everywhere: cleanup defer.hh includes Get rid of unused includes of seastar/util/{defer,closeable}.hh and add a few that are missing from source files. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2021-08-22 21:11:39 +03:00
Pavel Emelyanov	e02b39ca3d	code: Generalize tls::credentials_builder configuration All the places in code that configure the mentioned creds builder from client_\|server_encryption_options now do it the same way. This patch generalizes it all in the utils:: helper. The alternator code "ignores" require_client_auth and truststore keys, but it's easy to make the generalized helper be compatible. Also make the new helper coroutinized from the beginning. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-08-20 18:05:41 +03:00
Pavel Emelyanov	aa88527375	main, config: Move options parsing helpers The get_or_default and is_true are two aux bits that are used to parse the config options. The former is duplicated in the alternator code as well. Put both in utils namespace for future. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-08-20 17:53:41 +03:00
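
Commit 486421c58c above describes padding a decoded hint sync point with zero replay positions when the node now has more shards than it had when the sync point was encoded. A minimal sketch of that idea follows. It assumes a replay position can be modelled as a single integer, and the names `replay_position` and `pad_sync_point` are hypothetical, not taken from the Scylla source.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumption: a replay position is reduced to one integer for illustration.
using replay_position = std::uint64_t;

// Hypothetical helper: extend the per-shard positions decoded from a sync
// point so that every current shard has an entry. New shards get position
// zero, so waiting on them covers any segments moved there by resharding.
std::vector<replay_position> pad_sync_point(std::vector<replay_position> decoded,
                                            std::size_t current_shard_count) {
    if (decoded.size() < current_shard_count) {
        decoded.resize(current_shard_count, replay_position{0});
    }
    // After padding, the decoded shard count is never below the current one.
    assert(decoded.size() >= current_shard_count);
    return decoded;
}
```

Entries for shards that existed at encoding time are left untouched, and a sync point that already has at least as many entries as the current shard count is returned unchanged.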


2288 Commits