scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-04-20 00:20:47 +00:00

Author	SHA1	Message	Date
Pavel Emelyanov	a4118a70ee	database, messaging: Delete old connection drop notification Database no longer needs it. Since the only user of the old-style notification is gone -- remove it as well. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-09-15 17:49:06 +03:00
Pavel Emelyanov	bfd91d7b81	database, proxy: Relocate connection-drop activity On start database is subscribed on messaging-service connection drop notification to drop the hit-rate from column families. However, the updater and reader of those hit-rates is the storage_proxy, so it must be the _proxy_ who drops the hit-rate. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-09-15 17:49:06 +03:00
Pavel Emelyanov	b78e9b51b7	database, tests: Rework recommended format setting Tests don't have an sstable format selector and enforce the needed format by hand with the help of a special database:: method. It's more natural to provide it via config. Doing this makes database initialization in main and cql_test_env closer to each other. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-09-15 17:49:06 +03:00
Pavel Emelyanov	a42383b127	database, sstables_manager: Sow some noexcepts Setting sstables format into database and into sstables_manager is all plain assignments. Mark them as noexcept, next patch will become apparently exception safe after that. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-09-15 17:49:06 +03:00
Pavel Emelyanov	9a76df96e3	database: Eliminate unused helpers There are some large-data-handler-related helpers left after previous patches, they can be removed altogether. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-09-15 17:49:06 +03:00
Pavel Emelyanov	4b7846da86	database: Merge the stop_database() into database::stop() After stop_database() became shard-local, it's possible to merge it with database::stop() as they are both called one after another on scylla stop. In cql-test-env there are a few more steps in between, but they don't rely on the database being partially stopped. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-09-15 17:49:06 +03:00
Pavel Emelyanov	469c734155	database: Flatten stop_database() The method needs to perform four steps cross-shard synchronously: first stop the compaction manager, then close user and, after it, system tables, and finally shut down the large data handler. This patch reworks this synchronization with the help of the cross-shard barrier added to the database previously. The motivation is to merge .stop_database() with .stop(). Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-09-15 17:49:06 +03:00
Pavel Emelyanov	b1013e09b4	database: Equip with cross-shard-barrier Make sure a node-wide barrier exists on a database when scylla starts. Also provide a barrier for cql_test_env. In all other cases keep a solo-mode barrier so that single-shard db stop doesn't get blocked. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-09-15 17:49:06 +03:00
Pavel Emelyanov	634ea4b543	database: Move starting bits into start() These include large_data_handler::start, compaction_manager::enable and database::init_commitlog. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-09-15 17:48:48 +03:00
Pavel Emelyanov	e2308034ff	database: Add .start() method Called right after the sharded::start(). For now empty, to be populated by next patches. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-09-15 17:44:48 +03:00
Pavel Emelyanov	127e4fe8de	main: Shorten commitlog creation This does three things in one go: - converts db.invoke_on_all([] (database& db) { return db.init_commitlog(); }); into a one-line version db.invoke_on_all(&database::init_commitlog); - removes the shard-0 pre-initialization for tests, because tests don't have the problem this pre-initialization solves - makes init_commitlog() re-entrant so that regular start does not need to check for shard-0 explicitly Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-09-15 17:37:07 +03:00
Pavel Emelyanov	bd2b7dca0e	database: Remove unused mm arg from init_non_system_keyspaces() Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-09-15 17:35:37 +03:00
Pavel Emelyanov	bb23986826	wasm: Localize it to database usage The wasm::engine exists as a sharded<> service in main, but it's only passed by local reference into database on start. There's not much profit in keeping it at main scope, things get much simpler if the engine is kept purely on the database. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-09-15 17:35:17 +03:00
Kamil Braun	c12e265eb8	table, database: query, mutation_query: remove unnecessary class_config param The semaphore inside was never accessed and `max_memory_for_unlimited_query` was always equal to `cmd.max_result_size` so the parameter was completely redundant. `cmd.max_result_size` is supposed to be always set in the affected functions - which are executed on the replica side - as soon as the replica receives the `read_command` object, in case the parameter was not set by the coordinator. However, we don't have a guarantee at the type level (it's still an `optional`). Many places used `cmd.max_result_size` without even an assertion. We make the code a bit safer, we check for `cmd.max_result_size` and if it's indeed engaged, store it in `reader_permit`. We then access it from `reader_permit` where necessary. If `cmd.max_result_size` is not set, we assume this is an unlimited query and obtain the limit from `get_unlimited_query_max_result_size`.	2021-09-14 13:39:56 +02:00
Kamil Braun	fbb83dd5ca	reader_concurrency_semaphore: remove default parameter values from constructors It's easy to forget about supplying the correct value for a parameter when it has a default value specified. It's safer if 'production code' is forced to always supply these parameters manually. The default values were mostly useful in tests, where some parameters didn't matter that much and where the majority of uses of the class are. Without default values adding a new parameter is a pain, forcing one to modify every usage in the tests - and there are a bunch of them. To solve this, we introduce a new constructor which requires passing the `for_tests` tag, marking that the constructor is only supposed to be used in tests (and the constructor has an appropriate comment). This constructor uses default values, but the other constructors - used in 'production code' - do not.	2021-09-14 12:20:28 +02:00
Botond Dénes	502a45ad58	treewide: switch to native reversed format for reverse reads We define the native reverse format as a reversed mutation fragment stream that is identical to one that would be emitted by a table with the same schema but with reversed clustering order. The main difference to the current format is how range tombstones are handled: instead of looking at their start or end bound depending on the order, we always use them as usual and the reversing reader swaps their bounds to facilitate this. This allows us to treat reversed streams completely transparently: just pass a reversed schema along with them and all the reader, compacting and result building code is happily ignorant about the fact that it is a reversed stream.	2021-09-09 15:42:15 +03:00
Benny Halevy	b7eaa22ce6	abstract_replication_strategy: create_replication_strategy: drop keyspace name parameter It is not used. Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Message-Id: <20210906133840.3307279-1-bhalevy@scylladb.com>	2021-09-06 16:51:21 +03:00
Benny Halevy	56e063ce93	keyspace: get rid of set_replication_strategy It's unused. Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Message-Id: <20210906133905.3307397-1-bhalevy@scylladb.com>	2021-09-06 16:48:35 +03:00
Avi Kivity	4d7e00d0f8	cql3: selection: make selectable.hh not include expr/expression.hh We have this dependency now: column_identifier -> selectable -> expression and want to introduce this: expression -> user types -> column_identifier This leads to a loop, since expression is not (yet) forward declarable. Fix by moving any mention of expression from selectable.hh to a new header selection-expr.hh. database.cc lost access to timeout_config, so adjust its includes to regain it.	2021-08-26 15:19:14 +03:00
Benny Halevy	4476800493	flat_mutation_reader: get rid of timeout parameter Now that the timeout is taken from the reader_permit. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2021-08-24 16:30:51 +03:00
Benny Halevy	9b0b13c450	reader_concurrency_semaphore: adjust reactivated reader timeout Update the reader's timeout where needed after unregistering inactive_read. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2021-08-24 16:30:51 +03:00
Benny Halevy	fe479aca1d	reader_permit: add timeout member To replace the timeout parameter passed to flat_mutation_reader methods. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2021-08-24 14:29:44 +03:00
Nadav Har'El	49aea3b301	Merge 'database: coroutinize schema load functions' from Avi Kivity Simple coroutinization of the schema load functions, leaving the code tidier. Test: unit (dev) Closes #9217 * github.com:scylladb/scylla: database: adjust indentation after coroutinization of schema table parsing code database: convert database::parse_schema_tables() to a coroutine database: remove unneeded temporary in do_parse_schema_tables() database: convert do_parse_schema_tables() to a coroutine	2021-08-23 17:45:58 +03:00
Benny Halevy	4439e5c132	everywhere: cleanup defer.hh includes Get rid of unused includes of seastar/util/{defer,closeable}.hh and add a few that are missing from source files. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2021-08-22 21:11:39 +03:00
Avi Kivity	5450af8e1b	database: coroutinize stop() Make the code tidier. The conversion is not mechanical: the finally block is converted to straight line code. stop()/close() must not fail anyway, and we cannot recover from such failures. The when_all_succeed() for stopping the semaphores is also converted to straight-line code - there is no advantage to stopping them in parallel, as we're just waiting for running tasks to complete and clean up. Test: unit (dev) Closes #9218	2021-08-18 10:57:44 +02:00
Avi Kivity	73d6f2798d	database: adjust indentation after coroutinization of schema table parsing code	2021-08-17 21:05:05 +03:00
Avi Kivity	4ca856157d	database: convert database::parse_schema_tables() to a coroutine In one case we have f = f.then(...), but we can just wait for the first future where it's created.	2021-08-17 21:00:15 +03:00
Avi Kivity	4f91953ebf	database: remove unneeded temporary in do_parse_schema_tables() The coroutine can keep the cf_name parameter alive, provided we pass it by value.	2021-08-17 20:45:41 +03:00
Avi Kivity	b2d5820d75	database: convert do_parse_schema_tables() to a coroutine	2021-08-17 20:44:28 +03:00
Asias He	cc44edb4e2	database: Detemplate run_async I initially tried to use a noncopyable_function to avoid the unnecessary template usage. However, since database::apply_in_memory is a hot function, it is better to use with_gate directly. The run_async function does nothing but call with_gate anyway. Closes #9160	2021-08-12 07:53:10 +03:00
Nadav Har'El	6c27000b98	Merge 'Propagate exceptions without throwing' from Piotr Sarna NOTE: this series depends on a Seastar submodule update, currently queued in next: 0ed35c6af052ab291a69af98b5c13e023470cba3 In order to avoid needless throwing, exceptions are passed directly wherever possible. Two mechanisms which help with that are: 1. `make_exception_future<>` for futures 2. `co_return coroutine::exception(...)` for coroutines which return `future<T>` (the mechanism does not work for `future<>` without parameters, unfortunately) Tests: unit(release) Closes #9079 * github.com:scylladb/scylla: system_keyspace: pass exceptions without throwing sstables: pass exceptions without throwing storage_proxy: pass exceptions without throwing multishard_mutation_query: pass exceptions without throwing client_state: pass exceptions without throwing flat_mutation_reader: pass exceptions without throwing table: pass exceptions without throwing commitlog: pass exceptions without throwing compaction: pass exceptions without throwing database: pass exceptions without throwing	2021-08-01 16:47:47 +03:00
Avi Kivity	a180cd240f	atomic_cell: change compare_atomic_cell_for_merge() to std::strong_ordering The implementation is in database.cc for some reason. Ref #1449.	2021-07-28 13:26:27 +03:00
Piotr Sarna	66c4d58a8c	database: pass exceptions without throwing In order to avoid needless throwing, exceptions are passed directly wherever possible. Two mechanisms which help with that are: 1. make_exception_future<> for futures 2. co_return coroutine::exception(...) for coroutines which return future<T> (the mechanism does not work for future<> without parameters, unfortunately)	2021-07-26 17:02:36 +02:00
Botond Dénes	27fbca84f6	reader_concurrency_semaphore: remove prethrow_action The semaphore accepts a functor in its constructor which is run just before throwing on wait queue overload. This is used exclusively to bump a counter in the database::stats, which counts queue overloads. However, there is now an identical counter in reader_concurrency_semaphore::stats, so the database can just use that directly and we can retire the now unused prethrow action. Signed-off-by: Botond Dénes <bdenes@scylladb.com> Message-Id: <20210716111105.237492-1-bdenes@scylladb.com>	2021-07-19 15:47:37 +03:00
Pavel Emelyanov	1ed582304d	memtable_list: Shorten flush coalescing codeflow The memtable_list::flush() maintains a shared_promise object to coalesce the flushers until the get_flush_permit() resolves. Also it needs to keep the extraneous flushes counter bumped while doing the flush itself. All this can be coded in a shorter form and without the need to carry shared_promise<> around. tests: unit(dev) Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Message-Id: <20210716164237.10993-1-xemul@scylladb.com>	2021-07-17 00:42:20 +02:00
Botond Dénes	ae4df99e6b	database: remove now unused query execution stages	2021-07-14 17:19:02 +03:00
Botond Dénes	1b7eea0f52	reader_concurrency_semaphore: admission: flip the switch This patch flips two "switches": 1) It switches admission to be up-front. 2) It changes the admission algorithm. (1) by now all permits are obtained up-front, so this patch just yanks out the restricted reader from all reader stacks and simultaneously switches all `obtain_permit_nowait()` calls to `obtain_permit()`. By doing this admission is now waited on when creating the permit. (2) we switch to an admission algorithm that adds a new aspect to the existing resource availability: the number of used/blocked reads. Namely it only admits new reads if in addition to the necessary amount of resources being available, all currently used readers are blocked. In other words we only admit new reads if all currently admitted reads require something other than CPU to progress. They are either waiting on I/O, a remote shard, or attention from their consumers (not used currently). We flip these two switches at the same time because up-front admission means cache reads now need to obtain a permit too. For cache reads the optimal concurrency is 1. Anything above that just increases latency (without increasing throughput). So we want to make sure that if a cache reader hits it doesn't get any competition for CPU and it can run to completion. We admit new reads only if the read misses and has to go to disk. Another change made to accommodate this switch is the replacement of the replica side read execution stages with the reader concurrency semaphore as an execution stage. This replacement is needed because with the introduction of up-front admission, reads are not independent of each other anymore. One read executed can influence whether later reads executed will be admitted or not, and execution stages require independent operations to work well.
By moving the execution stage into the semaphore, we have an execution stage which is in control of both admission and running the operations in batches, avoiding the bad interaction between the two.	2021-07-14 17:19:02 +03:00
Botond Dénes	7bfa40a2f1	treewide: use make_tracking_only_permit() For all those reads that don't (won't or can't) pass through admission currently.	2021-07-14 17:19:02 +03:00
Botond Dénes	7f2813e3fa	database: mutation_query(): handle querier lookup/save on the database level Instead of passing down the querier_cache_ctx to table::mutation_query(), handle the querier lookup/save on the level where the cache exists. The real motivation behind this change however is that we need to move the lookup outside the execution stage, because the current execution stage will soon be replaced by the one provided by the semaphore and to use that properly we need to know if we have a saved permit or not.	2021-07-14 16:48:43 +03:00
Botond Dénes	f9d302bf49	database: mutation_query(): convert into coroutine To facilitate further patching (and reading).	2021-07-14 16:48:43 +03:00
Botond Dénes	d2f5393a43	database: query(): handle querier lookup/save on the database level Instead of passing down the querier_cache_ctx to table::query(), handle the querier lookup/save on the level where the cache exists. The real motivation behind this change however is that we need to move the lookup outside the execution stage, because the current execution stage will soon be replaced by the one provided by the semaphore and to use that properly we need to know if we have a saved permit or not.	2021-07-14 16:48:43 +03:00
Botond Dénes	c28a6e8537	database: query(): convert into coroutine To facilitate further patching (and reading).	2021-07-14 16:48:43 +03:00
Botond Dénes	426b46c4ed	mutation_reader: reader_lifecycle_policy: add obtain_reader_permit() This method is both a convenience method to obtain the permit, as well as an abstraction to allow different implementations to get creative. For example, the main implementation, the one in multishard mutation query, returns the permit of the saved reader if one was successfully looked up. This ensures that on a multi-paged read the same permit is used across as many pages as possible. Much more importantly it ensures that the evictable reader and the actual reader it wraps both use the same permit.	2021-07-14 16:48:43 +03:00
Botond Dénes	97a03f9027	database: make_multishard_streaming_reader: use external permit As a preparation for up-front admission, add a permit parameter to `make_multishard_streaming_reader()`, which will be the admitted permit once we switch to up-front admission. For now it has to be a non-admitted permit. A nice side-effect of this patch is that now permits will have a use-case specific description, instead of the generic "multishard-streaming-reader" one	2021-07-14 16:48:43 +03:00
Botond Dénes	999169e535	database: make_streaming_reader(): require permit As a preparation for up-front admission, add a permit parameter to `make_streaming_reader()`, which will be the admitted permit once we switch to up-front admission. For now it has to be a non-admitted permit. A nice side-effect of this patch is that now permits will have a use-case specific description, instead of the generic "streaming" one.	2021-07-14 16:48:43 +03:00
Botond Dénes	3ec149222d	database: add obtain_reader_permit() A convenience method for obtaining an admitted permit for a read on a given table. For now it uses the nowait semaphore obtaining method, as all normal reads still use the old admission method. Migrating reads to this method will make the switch easier, as there will be one central place to replace the nowait method with the proper one.	2021-07-14 16:48:43 +03:00
Avi Kivity	f0e2f31839	Merge "Implement validation compaction" from Botond " Currently, when sstables are suspected to be corrupt, one has a few bad choices on how to verify that they are indeed correct: * Obtain suspect sstable files and manually inspect them. This is problematic because it requires a scylla engineer to have direct access to data, which is not always simple or even possible due to privacy protection rules. * Run sstable scrub in abort mode. This is enough to confirm whether there is any corruption or not, but only in a binary manner. It is not possible to explore the full scope of the corruption, as the scrub will abort on the first corruption. * Run sstable scrub in non-abort mode. Although this allows for exploring the full scope of the corruption and it even gets rid of it, it is a very intrusive and potentially destructive process that some users might not be willing to even risk. This patchset offers an alternative: validation compaction. This is a completely non-intrusive compaction that reads all sstables in turn and validates their contents, logging any discrepancies it can find. It does not mutate their content, it doesn't even rewrite them. It is akin to a dry-run mode for sstable scrub. The reason it was not implemented as such is that the current compaction infrastructure assumes that input sstables are replaced by output sstables as part of the compaction process. Lifting this assumption seemed error-prone and risky, so instead I snatched the unused "Validation" compaction type for this purpose. This compaction type completely bypasses the regular compaction infrastructure but only at the low-level. It still integrates fully into compaction-manager.
Fixes: #7736 Refs: https://github.com/scylladb/scylla-tools-java/issues/263 Tests: unit(dev) " * 'validation-compaction/v5' of https://github.com/denesb/scylla: test/boost/sstable_datafile_test: add test for validation compaction test/boost/sstable_datafile_test: scrub tests: extract corrupt sst writer code into function api: storage_service: expose validation compaction sstables/compaction_manager: add perform_sstable_validation() sstables/compaction_manager: rewrite_sstables(): resolve maintenance group FIXME sstables/compaction_manager: add maintenance scheduling group sstables/compaction_manager: drop _scheduling_group field sstables/compaction_manager: run_custom_job(): replace parameter name with compaction type sstables/compaction_manager: run_custom_job(): keep job function alive sstables/compaction_descriptor: compaction_options: add validation compaction type sstables/compaction: compaction_options::type(): add static assert for size of index_to_type sstables/compaction: implement validation compaction type sstables/compaction: extract compaction info creation into static method sstables/compaction: extract sstable list formatting to a class sstables/compaction: scrub_compaction: extract reporting code into static methods position_in_paritition{_view}: add has_key() mutation_fragment_stream_validator: add schema() accessor	2021-07-13 10:29:40 +03:00
Tomasz Grabiec	e947fac74c	database: Fix cache metrics not being registered Introduced in `6a6403d`. The default constructor with dummy_app_stats is also used by production code. Fixes #9012 Message-Id: <20210712221447.71902-1-tgrabiec@scylladb.com>	2021-07-13 07:50:44 +03:00
Botond Dénes	c8f8e9232c	sstables/compaction_manager: add maintenance scheduling group rewrite_sstables() wants to be run in the maintenance group and soon we will add another compaction type which also wants to be run in the said group. To enable this propagate the maintenance scheduling group (both CPU and IO) to the compaction manager.	2021-07-12 10:25:15 +03:00
Botond Dénes	c4e71fb9b8	reader_concurrency_semaphore: remove default name parameter Naming the concurrency semaphore is currently optional, unnamed semaphores defaulting to "Unnamed semaphore". Although the most important semaphores are named, many still aren't, which makes for a poor debugging experience when one of these times out. To prevent this, remove the name parameter defaults from those constructors that have it and require a unique name to be passed in. Also update all sites creating a semaphore and make sure they use a unique name.	2021-07-08 12:31:36 +03:00
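
The admission rule described in commit 1b7eea0f52 above (admit a new read only when enough resources are free and every read currently in use is blocked) can be illustrated with a small toy model. This is a sketch under stated assumptions, not the code of `reader_concurrency_semaphore`: the struct names, the fields, and `can_admit()` are hypothetical, and tracking resources as a memory amount plus a slot count is an assumption made for illustration.

```cpp
#include <cassert>
#include <cstddef>

// Toy model only; every name here is hypothetical and not taken from ScyllaDB.
struct toy_admission_state {
    std::size_t available_memory;  // bytes still free for new reads (assumed resource)
    std::size_t available_count;   // read slots still free (assumed resource)
    std::size_t used_reads;        // admitted reads that are currently in use
    std::size_t blocked_reads;     // used reads waiting on something other than CPU
};

struct toy_read_request {
    std::size_t memory;  // bytes the new read needs
    std::size_t count;   // slots the new read needs
};

// Returns true if the new read may be admitted under the rule from the
// commit message: resources must be available AND all used reads must be blocked.
inline bool can_admit(const toy_admission_state& s, const toy_read_request& r) {
    // A read can only be blocked if it is in use.
    assert(s.blocked_reads <= s.used_reads);
    const bool enough_resources =
        s.available_memory >= r.memory && s.available_count >= r.count;
    const bool all_used_blocked = s.blocked_reads == s.used_reads;
    return enough_resources && all_used_blocked;
}
```

In this model a read that is served from cache stays in use without being blocked, so `used_reads` exceeds `blocked_reads` and no competing read is admitted until it finishes or goes to disk, which matches the motivation given in the commit message.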

1456 Commits