scylladb

Author	SHA1	Message	Date
Avi Kivity	0114244363	Merge 'replica/database: drop_column_family(): properly cleanup stale querier cache entries' from Botond Dénes Said method has to evict all querier cache entries, belonging to the to-be-dropped table. This is already the case, but there was a window where new entries could sneak in, causing a stale reference to the table to be de-referenced later when they are evicted due to TTL. This window is now closed, the entries are evicted after the method has waited for all ongoing operations on said table to stop. Fixes: #10450 Closes #10451 * github.com:scylladb/scylla: replica/database: drop_column_family(): drop querier cache entries after waiting for ops replica/database: finish coroutinizing drop_column_family() replica/database: make remove(const column_family&) private (cherry picked from commit `7f1e368e92`)	2022-05-01 17:11:52 +03:00
Avi Kivity	4b1b0a55c0	replica, atomic_cell: move atomic_cell merge code from replica module to atomic_cell.cc compare_atomic_cell_for_merge() was placed in database.cc, before atomic_cell.cc existed. Move it to its correct place. Closes #9889 (cherry picked from commit `6c53717a39`)	2022-03-24 18:07:11 +02:00
Avi Kivity	1aff7d19c2	treewide: replace seastar::fmt_print() with fmt::print() We shouldn't be using Seastar as a text formatting library; that's not its focus. Use fmt directly instead. fmt::print() doesn't return the output stream which is a minor inconvenience, but that's life. Closes #9556	2021-11-01 10:05:16 +02:00
Benny Halevy	0746b5add6	storage_service: replicate_to_all_cores: update all keyspaces Currently we update the effective_replication_map only on non-system keyspace, leaving the system keyspace, that uses the local replication strategy, with the empty replication_map, as it was first initialized. This may lead to a crash when get_ranges is called later as seen in #9494 where get_ranges was called from the perform_sstable_upgrade path. This change updates the effective_replication_map on all keyspaces rather than just on the non-system ones and adds a unit test that reproduces #9494 without the fix and passes with it. Fixes #9494 Test: unit(dev), database_test(debug) Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Message-Id: <20211020143217.243949-1-bhalevy@scylladb.com>	2021-10-20 17:54:23 +03:00
Benny Halevy	e4dc81ec04	abstract_replication_strategy: add to_qualified_class_name And use it from cql3 check_restricted_replication_strategy and keyspace_metadata ctor that defined their own `replication_class_strategy`. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2021-10-18 12:13:25 +03:00
Tomasz Grabiec	cc56a971e8	database, treewide: Introduce partition_slice::is_reversed() Cleanup, reduces noise. Message-Id: <20211014093001.81479-1-tgrabiec@scylladb.com>	2021-10-14 12:39:16 +03:00
Benny Halevy	8c85197c6c	abstract_replication_strategy: get rid of shared_token_metadata member and ctor param It is not used any more. Methods either use the token_metadata_ptr in the effective_replication_map, or receive an ad-hoc token_metadata. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2021-10-13 16:10:06 +03:00
Benny Halevy	dfdc8d4ddb	abstract_replication_strategy: move get_ranges and get_primary_ranges* to effective_replication_map Provide a sync get_ranges method by effective_replication_map that uses the precalculated map to get all token ranges owned by or replicated on a given endpoint. Reuse do_get_ranges as common infrastructure for all 3 cases: get_ranges, get_primary_ranges, and get_primary_ranges_within_dc. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2021-10-13 16:09:51 +03:00
Benny Halevy	991a6a8664	keyspace: update_effective_replication_map And use it to get_natural_endpoints. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2021-10-13 12:55:34 +03:00
Benny Halevy	970b0a50b5	keyspace: futurize create_replication_strategy And functions that use it, like: keyspace::update_from database::update_keyspace database::create_in_memory_keyspace Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2021-10-13 12:53:41 +03:00
Benny Halevy	5001d261d4	abstract_replication_strategy: define replication_strategy_config_options To be used for searching effective replication strategy instances. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2021-10-13 12:39:36 +03:00
Avi Kivity	fd8beeaea9	treewide: handle switch statements that return A switch statement where every case returns triggers a gcc warning if the surrounding function doesn't return/abort. Fix by adding an abort(). The abort() will never trigger since we have a warning on unhandled switch cases.	2021-10-10 18:16:50 +03:00
Tomasz Grabiec	e89b9799b8	Merge 'sstable mx reader: implement reverse single-partition reads' from Kamil Braun Until now reversed queries were implemented inside `querier::consume_page` (more precisely, inside the free function `consume_page` used by `querier::consume_page`) by wrapping the passed-in reader into `make_reversing_reader` and then consuming fragments from the resulting reversed reader. The first couple of commits change that by pushing the reversing down below the `make_combined_reader` call in `table::query`. This allows working on improving reversing for memtables independently from reversing for sstables. We then extend the `index_reader` with functions that allow reading the promoted index in reverse. We introduce `partition_reversing_data_source`, which wraps an sstable data file and returns data buffers with contents of a single chosen partition as if the rows were stored in reverse order. We use the reversing source and the extended index reader in `mx_sstable_mutation_reader` to implement efficient (at least in theory) reversed single-partition reads. The patchset disables cache for reversed reads. Fast-forwarding is not supported in the mx reader for reversed queries at this point. Details in commit messages. Read the commits in topological order for best review experience. 
Refs: #9134 (not saying "Fixes" because it's only for single-partition queries without forwarding) Closes #9281 * github.com:scylladb/scylla: table: add option to automatically bypass cache for reversed queries test: reverse sstable reader with random schema and random mutations sstables: mx: implement reversed single-partition reads sstables: mx: introduce partition_reversing_data_source sstables: index_reader: add support for iterating over clustering ranges in reverse clustering_key_filter: clustering_key_filter_ranges owning constructor flat_mutation_reader: mention reversed schema in make_reversing_reader docstring clustering_key_filter: document clustering_key_filter_ranges::get_ranges	2021-10-04 15:37:34 +02:00
Kamil Braun	703aed3277	table: add option to automatically bypass cache for reversed queries Currently the new reversing sstable algorithms do not support fast forwarding and the cache does not yet handle reversed results. This forced us to disable the cache for reversed queries if we want to guarantee bounded memory. We introduce an option that does this automatically (without specifying `bypass cache` in the query) and turn it on by default. If the user decides that they prefer to keep the cache at the cost of fetching entire partitions into memory (which may be viable if their partitions are small) during reversed queries, the option can be turned off. It is live-updateable.	2021-10-04 15:24:12 +02:00
Pavel Emelyanov	e9002e1e61	database: Get local host id from system_keyspace It's now cached on database itself, and it can be removed. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-09-30 10:55:20 +03:00
Avi Kivity	936de92876	Merge 'cql3: Add evaluate(expression) and use instead of term::bind()' from Jan Ciołek This PR adds the function: `constant evaluate(const expression&, const query_options&);` which evaluates the given expression to a constant value. It binds all the bound values, calls functions, and reduces the whole expression to just raw bytes and `data_type`, just like `bind()` and `get()` did for `term`. The code is often similar to the original `bind()` implementation in `lists.cc`, `sets.cc`, etc. * For some reason in the original code, when a collection contains `unset_value`, then the whole collection is evaluated to `unset_value`. I'm not sure why this is the case, considering it's impossible to have `unset_value` inside a collection, because we forbid bind markers inside collections. For example here: `cc8fc73761/cql3/lists.cc (L134)` This seems to have been introduced by Pekka Enberg in `50ec81ee67`, but he has left the company. I didn't change the behaviour, maybe there is a reason behind it, although maybe it would be better to just throw `invalid_request_exception`. * There was a strange limitation on map key size, it seems incorrect: `cc8fc73761/cql3/maps.cc (L150)`, but I left it in. * When evaluating a `user_type` value, the old code tolerated `unset_value` in a field, but it was later converted to NULL. This means that `unset_value` doesn't work inside a `user_type`, I didn't change it, will do in another PR. * We can't fully get rid of `bind()` yet, because it's used in `prepare_term` to return a `terminal`. It will be removed in the next PR, where we finally get rid of `term`. 
Closes #9353 * github.com:scylladb/scylla: cql3: types: Optimize abstract_type::contains_collection cql3: expr: Convert evaluate_IN_list to use evaluate(expression) cql3: expr: Use only evaluate(expression) to evaluate term cql3: expr: Implement evaluate(expr::function_call) cql3: expr: Implement evaluate(expr::usertype_constructor) cql3: expr: Implement evaluate(expr::collection_constructor) cql3: expr: Implement evaluate(expr::tuple_constructor) cql3: expr: Implement evaluate(expr::bind_variable) cql3: Add contains_collection/set_or_map to abstract_type cql3: expr: Add evaluate(expression, query_options) cql3: Implement term::to_expression for function_call cql3: Implement term::to_expression for user_type cql3: Implement term::to_expression for collections cql3: Implement term::to_expression for tuples cql3: Implement term::to_expression for marker classes cql3: expr: Add data_type to *_constructor structs cql3: Add term::to_expression method cql3: Reorganize term and expression includes	2021-09-26 12:58:11 +03:00
Pavel Emelyanov	88e5b7c547	database: Shutdown in tests There's a circular dependency: query processor needs database; database owns large_data_handler and compaction_manager; those two need qctx; qctx owns a query_processor. Respectively, the latter hidden dependency is not "tracked" by constructor arguments -- the query processor is started after the database and is deferred to be stopped before it. This works in scylla, because query processor doesn't really stop there, but in cql_test_env it's problematic as it stops everything, including the qctx. Recent database start-stop sanitation revealed this problem -- on database stop either l.d.h. or compaction manager try to start (or continue) messing with the query processor. One problem was faced immediately and plugged with the `75e1d7ea` safety check inside l.d.h., but still cql_test_env tests continue suffering from use after free on stopped query processor. The fix is to partially revert the `4b7846da` by making the tests stop some pieces of the database (including l.d.h. and compaction manager) as it used to before. In scylla this is, probably, not needed, at least now -- the database shutdown code was and still is run right before the stopping one. tests: unit(debug) Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Message-Id: <20210924080248.11764-1-xemul@scylladb.com>	2021-09-26 11:09:01 +03:00
Jan Ciolek	746e9c620f	cql3: Reorganize term and expression includes Make term.hh include expression.hh instead of the other way around. expression can't be forward declared. expression is needed in term.hh to declare term::to_expression(). Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>	2021-09-24 11:05:53 +02:00
Benny Halevy	ad46ff8e5e	database: coroutinize create_keyspace Prepare for futurizing on create_in_memory_keyspace. Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Message-Id: <20210923093200.1559734-10-bhalevy@scylladb.com>	2021-09-23 14:05:44 +03:00
Benny Halevy	91091e9d89	database: update_keyspace: fixup indentation Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Message-Id: <20210923093200.1559734-9-bhalevy@scylladb.com>	2021-09-23 14:05:18 +03:00
Benny Halevy	c71cd2bed3	database: coroutinize update_keyspace Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Message-Id: <20210923093200.1559734-8-bhalevy@scylladb.com>	2021-09-23 14:05:18 +03:00
Pavel Emelyanov	a4118a70ee	database, messaging: Delete old connection drop notification Database no longer needs it. Since the only user of the old-style notification is gone -- remove it as well. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-09-15 17:49:06 +03:00
Pavel Emelyanov	bfd91d7b81	database, proxy: Relocate connection-drop activity On start database is subscribed on messaging-service connection drop notification to drop the hit-rate from column families. However, the updater and reader of those hit-rates is the storage_proxy, so it must be the _proxy_ who drops the hit-rate. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-09-15 17:49:06 +03:00
Pavel Emelyanov	b78e9b51b7	database, tests: Rework recommended format setting Tests don't have sstable format selector and enforce the needed format by hand with the help of special database:: method. It's more natural to provide it via config. Doing this makes database initialization in main and cql_test_env closer to each other. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-09-15 17:49:06 +03:00
Pavel Emelyanov	a42383b127	database, sstables_manager: Sow some noexcepts Setting sstables format into database and into sstables_manager is all plain assignments. Mark them as noexcept, next patch will become apparently exception safe after that. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-09-15 17:49:06 +03:00
Pavel Emelyanov	9a76df96e3	database: Eliminate unused helpers There are some large-data-handler-related helpers left after previous patches, they can be removed altogether. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-09-15 17:49:06 +03:00
Pavel Emelyanov	4b7846da86	database: Merge the stop_database() into database::stop() After stop_database() became shard-local, it's possible to merge it with database::stop() as they are both called one after another on scylla stop. In cql-test-env there are few more steps in between, but they don't rely on the database being partially stopped. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-09-15 17:49:06 +03:00
Pavel Emelyanov	469c734155	database: Flatten stop_database() The method needs to perform four steps cross-shard synchronously: first stop compaction manager, then close user and, after it, system tables, finally shutdown the large data handler. This patch reworks this synchronization with the help of cross-shard barrier added to the database previously. The motivation is to merge .stop_database() with .stop(). Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-09-15 17:49:06 +03:00
Pavel Emelyanov	b1013e09b4	database: Equip with cross-shard-barrier Make sure a node-wide barrier exists on a database when scylla starts. Also provide a barrier for cql_test_env. In all other cases keep a solo-mode barrier so that single-shard db stop doesn't get blocked. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-09-15 17:49:06 +03:00
Pavel Emelyanov	634ea4b543	database: Move starting bits into start() These include large_data_handler::start, compaction_manager::enable and database::init_commitlog. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-09-15 17:48:48 +03:00
Pavel Emelyanov	e2308034ff	database: Add .start() method Called right after the sharded::start(). For now empty, to be populated by next patches. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-09-15 17:44:48 +03:00
Pavel Emelyanov	127e4fe8de	main: Shorten commitlog creation This does three things in one go: (1) converts `db.invoke_on_all([] (database& db) { return db.init_commitlog(); });` into the one-line version `db.invoke_on_all(&database::init_commitlog);` (2) removes the shard-0 pre-initialization for tests, because tests don't have the problem this pre-initialization solves; (3) makes init_commitlog() re-entrant so that regular start need not check for shard-0 explicitly. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-09-15 17:37:07 +03:00
Pavel Emelyanov	bd2b7dca0e	database: Remove unused mm arg from init_non_system_keyspaces() Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-09-15 17:35:37 +03:00
Pavel Emelyanov	bb23986826	wasm: Localize it to database usage The wasm::engine exists as a sharded<> service in main, but it's only passed by local reference into database on start. There's not much profit in keeping it at main scope, things get much simpler if keeping the engine purely on database. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-09-15 17:35:17 +03:00
Kamil Braun	c12e265eb8	table, database: query, mutation_query: remove unnecessary class_config param The semaphore inside was never accessed and `max_memory_for_unlimited_query` was always equal to `cmd.max_result_size` so the parameter was completely redundant. `cmd.max_result_size` is supposed to be always set in the affected functions - which are executed on the replica side - as soon as the replica receives the `read_command` object, in case the parameter was not set by the coordinator. However, we don't have a guarantee at the type level (it's still an `optional`). Many places used `cmd.max_result_size` without even an assertion. We make the code a bit safer, we check for `cmd.max_result_size` and if it's indeed engaged, store it in `reader_permit`. We then access it from `reader_permit` where necessary. If `cmd.max_result_size` is not set, we assume this is an unlimited query and obtain the limit from `get_unlimited_query_max_result_size`.	2021-09-14 13:39:56 +02:00
Kamil Braun	fbb83dd5ca	reader_concurrency_semaphore: remove default parameter values from constructors It's easy to forget about supplying the correct value for a parameter when it has a default value specified. It's safer if 'production code' is forced to always supply these parameters manually. The default values were mostly useful in tests, where some parameters didn't matter that much and where the majority of uses of the class are. Without default values adding a new parameter is a pain, forcing one to modify every usage in the tests - and there are a bunch of them. To solve this, we introduce a new constructor which requires passing the `for_tests` tag, marking that the constructor is only supposed to be used in tests (and the constructor has an appropriate comment). This constructor uses default values, but the other constructors - used in 'production code' - do not.	2021-09-14 12:20:28 +02:00
Botond Dénes	502a45ad58	treewide: switch to native reversed format for reverse reads We define the native reverse format as a reversed mutation fragment stream that is identical to one that would be emitted by a table with the same schema but with reversed clustering order. The main difference to the current format is how range tombstones are handled: instead of looking at their start or end bound depending on the order, we always use them as-usual and the reversing reader swaps their bounds to facilitate this. This allows us to treat reversed streams completely transparently: just pass a reversed schema along with them and all the reader, compacting and result building code is happily ignorant about the fact that it is a reversed stream.	2021-09-09 15:42:15 +03:00
Benny Halevy	b7eaa22ce6	abstract_replication_strategy: create_replication_strategy: drop keyspace name parameter It is not used. Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Message-Id: <20210906133840.3307279-1-bhalevy@scylladb.com>	2021-09-06 16:51:21 +03:00
Benny Halevy	56e063ce93	keyspace: get rid of set_replication_strategy It's unused. Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Message-Id: <20210906133905.3307397-1-bhalevy@scylladb.com>	2021-09-06 16:48:35 +03:00
Avi Kivity	4d7e00d0f8	cql3: selection: make selectable.hh not include expr/expresion.hh We have this dependency now: column_identifier -> selectable -> expression and want to introduce this: expression -> user types -> column_identifier This leads to a loop, since expression is not (yet) forward declarable. Fix by moving any mention of expression from selectable.hh to a new header selection-expr.hh. database.cc lost access to timeout_config, so adjust its includes to regain it.	2021-08-26 15:19:14 +03:00
Benny Halevy	4476800493	flat_mutation_reader: get rid of timeout parameter Now that the timeout is taken from the reader_permit. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2021-08-24 16:30:51 +03:00
Benny Halevy	9b0b13c450	reader_concurrency_semaphore: adjust reactivated reader timeout Update the reader's timeout where needed after unregistering inactive_read. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2021-08-24 16:30:51 +03:00
Benny Halevy	fe479aca1d	reader_permit: add timeout member To replace the timeout parameter passed to flat_mutation_reader methods. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2021-08-24 14:29:44 +03:00
Nadav Har'El	49aea3b301	Merge 'database: coroutinize schema load functions' from Avi Kivity Simple coroutinization of the schema load functions, leaving the code tidier. Test: unit (dev) Closes #9217 * github.com:scylladb/scylla: database: adjust indentation after coroutinization of schema table parsing code database: convert database::parse_schema_tables() to a coroutine database: remove unneeded temporary in do_parse_schema_tables() database: convert do_parse_schema_tables() to a coroutine	2021-08-23 17:45:58 +03:00
Benny Halevy	4439e5c132	everywhere: cleanup defer.hh includes Get rid of unused includes of seastar/util/{defer,closeable}.hh and add a few that are missing from source files. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2021-08-22 21:11:39 +03:00
Avi Kivity	5450af8e1b	database: coroutinize stop() Make the code tidier. The conversion is not mechanical: the finally block is converted to straight line code. stop()/close() must not fail anyway, and we cannot recover from such failures. The when_all_succeed() for stopping the semaphores is also converted to straight-line code - there is no advantage to stopping them in parallel, as we're just waiting for running tasks to complete and clean up. Test: unit (dev) Closes #9218	2021-08-18 10:57:44 +02:00
Avi Kivity	73d6f2798d	database: adjust indentation after coroutinization of schema table parsing code	2021-08-17 21:05:05 +03:00
Avi Kivity	4ca856157d	database: convert database::parse_schema_tables() to a coroutine In one case we have f = f.then(...), but we can just wait for the first future where it's created.	2021-08-17 21:00:15 +03:00
Avi Kivity	4f91953ebf	database: remove unneeded temporary in do_parse_schema_tables() The coroutine can keep the cf_name parameter alive, provided we pass it by value.	2021-08-17 20:45:41 +03:00
Avi Kivity	b2d5820d75	database: convert do_parse_schema_tables() to a coroutine	2021-08-17 20:44:28 +03:00
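
The pattern described in commit fd8beeaea9 above (a switch in which every case returns, followed by an abort()) can be sketched as follows. This is only an illustration under stated assumptions: the enum `compaction_kind` and the function `to_string` are hypothetical names invented for the example and are not taken from the Scylla tree.

```cpp
#include <cstdlib>
#include <string_view>

// Hypothetical enum, for illustration only (not taken from the Scylla tree).
enum class compaction_kind { minor, major, cleanup };

// Every case returns, so control cannot leave the switch as long as all
// enumerators are handled (-Wswitch warns when one is missing).
// gcc can still warn that the end of a non-void function is reachable,
// because an enum object may hold a value that is not a named enumerator.
std::string_view to_string(compaction_kind k) {
    switch (k) {
    case compaction_kind::minor:   return "minor";
    case compaction_kind::major:   return "major";
    case compaction_kind::cleanup: return "cleanup";
    }
    std::abort(); // not expected to run; satisfies the return-path check
}
```

Leaving out a `default:` label is deliberate in this sketch: it keeps the unhandled-case warning active, so adding a new enumerator without a matching case is still flagged at compile time.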


1477 Commits