scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-04-24 18:40:38 +00:00

Author	SHA1	Message	Date
Benny Halevy	cd0061dcb5	database: shutdown keyspaces release the keyspace effective_replication_map during shutdown so that effective_replication_map_factory can be stopped cleanly with no outstanding e_r_m:s. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2021-11-19 10:52:41 +02:00
Benny Halevy	5947de7674	keyspace: get a reference to the erm_factory To be used for creating effective_replication_map. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2021-11-19 10:46:51 +02:00
Pavel Emelyanov	123286d5cd	database: Remove infinite_bound_range_deletion bits Have been unused for quite a while already Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Message-Id: <20211112150837.24125-1-xemul@scylladb.com>	2021-11-12 19:40:17 +01:00
Avi Kivity	f74b258928	Merge "Add the system.config virtual table (updateable)" from Pavel E " Scylla can be configured via a bunch of config files plus a bunch of commandline options. Collecting these all together can be challenging. The proposed table solves a big portion of this by dumping the db::config contents as a table. For convenience (and, maybe, to facilitate Benny's CLI) it's possible to update the 'value' column of the table with CQL request. There exists a PR with a table that exports loglevels in a form of a table. The updating technique used in this set is applicable to that table as well. tests: compilation(dev, release, debug), unit(debug) " * 'br-db-config-virtual-table-3' of https://github.com/xemul/scylla: tests: Unit test for system.config virtual table system_keyspace: Table with config options code: Push db::config down to virtual tables storage_proxy: Propagate virtual table exceptions messages table: Virtual writer hook (mutation applier) table: Rewrap table::apply() table: Mark virtual reader branch with unlikely utils: Add config_src::source_name() method utils: Ability to set_value(sstring) for an option utils: Internal change of config option utils: Mark some config_file methods noexcept	2021-11-11 22:13:26 +02:00
Pavel Emelyanov	947e4c9a10	code: Push db::config down to virtual tables The db::config reference is available on the database, which can be obtained from the virtual_table itself. The problem is that it's a const reference, while system.config will be updateable and will need non-const reference. Adding non-const get_config() on the database looks wrong. The database shouldn't be used as config provider, even the const one. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-11-11 16:39:34 +03:00
Pavel Emelyanov	5aefc48e28	table: Virtual writer hook (mutation applier) Symmetrically to virtual reader one, add the virtual writer callback on a table that will be in charge of applying the provided mutation. If a virtual table doesn't override this apply method the dedicated exception is thrown. Next patch will catch it and propagate back to caller, so it's a new exception type, not existing/std one. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-11-11 16:39:34 +03:00
Pavel Emelyanov	80460f66fc	table: Rewrap table::apply() The main motivation is to have future returning apply (to be used by next patches). As a side effect -- indentation fix and private dirty_memory_region_group() method. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-11-11 16:39:34 +03:00
Botond Dénes	b58403fb63	Merge "Flatten database drain" from Pavel E " Draining the database is now scattered across the do_drain() method of the storage_service. Also it tells shutdown drain from API drain. This set packs this logic into the database::drain() method. tests: unit(dev), start-stop-drain(dev) " * 'br-database-drain' of https://github.com/xemul/scylla: database, storage_service: Pack database::drain() method storage_service: Shuffle drain sequence storage_service, database: Move flush-on-drain code storage_service: Remove bool from do_drain	2021-11-11 08:19:35 +02:00
Avi Kivity	d2e02ea7aa	Merge " Abstract table for compaction layer with table_state" from Raphael " table_state is being introduced for compaction subsystem, to remove table dependency from compaction interface, fix layer violations, and also make unit testing easier as table_state is an abstraction that can be implemented even with no actual table backing it. In this series, compaction strategy interfaces are switching to table_state, and eventually, we'll make compact_sstables() switch to it too. The idea is that no compaction code will directly reference a table object, but only work with the abstraction instead. So compaction subdirectory can stop including database.hh altogether, which is a great step forward. " * 'table_state_v5' of https://github.com/raphaelsc/scylla: sstable_compaction_test: switch to table_state compaction: stop including database.hh for compaction_strategy compaction: switch to table_state in estimated_pending_compactions() compaction: switch to table_state in compaction_strategy::get_major_compaction_job() compaction: switch to table_state in compaction_strategy::get_sstables_for_compaction() DTCS: reduce table dependency for task estimation LCS: reduce table dependency for task estimation table: Implement table_state compaction: make table param of get_fully_expired_sstables() const compaction_manager: make table param of has_table_ongoing_compaction() const Introduce table_state	2021-11-09 19:21:57 +02:00
Pavel Emelyanov	43f6a13a30	database, storage_service: Pack database::drain() method The storage_service::do_drain() now ends up with shutting down compaction manager, flushing CFs and shutting down commitlog. All three belong to the database and deserve being packed into a single database::drain() method. A note -- these steps are cross-shard synchronized, but database already has a barrier for that. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-11-09 19:17:38 +03:00
Pavel Emelyanov	82509c9e74	storage_service, database: Move flush-on-drain code Flushing all CFs on shutdown is now fully managed in storage service and it looks weird. Some better place for it seems to be the database itself. Moving the flushing code also implies moving the drain_progress thing and patching the relevant API call. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-11-09 19:11:49 +03:00
Raphael S. Carvalho	03c819b8f5	table: Implement table_state This is the first implementation of table_state, intended to be used within compaction. It contains everything needed for compaction strategies. Subsequently, compaction strategy interface will replace table by table_state, and later all compaction procedures. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2021-11-09 10:45:40 -03:00
Raphael S. Carvalho	33b39a2bfc	compaction: move run_with_compaction_disabled() from table into compaction_manager That's intended to fix a bad layer violation as table was given the responsibility of disabling compaction for a given table T, but that logic clearly belongs to compaction_manager instead. Additionally, gate will be used instead of counter, as former provides manager with a way to synchronize with functions running under run_with_compaction_disabled. so remove() can wait for their termination. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2021-11-08 15:12:46 -03:00
Botond Dénes	200e2fad4d	db/system_keyspace: propagate distributed<> database and storage_service to register_virtual_tables() As some virtual tables will need the distributed versions of these.	2021-11-05 15:42:41 +02:00
Raphael S. Carvalho	51aa79e267	table: give a more descriptive name to compaction_data in compact_sstables() Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2021-11-04 11:09:24 -03:00
Benny Halevy	0746b5add6	storage_service: replicate_to_all_cores: update all keyspaces Currently we update the effective_replication_map only on non-system keyspace, leaving the system keyspace, that uses the local replication strategy, with the empty replication_map, as it was first initialized. This may lead to a crash when get_ranges is called later as seen in #9494 where get_ranges was called from the perform_sstable_upgrade path. This change updates the effective_replication_map on all keyspaces rather than just on the non-system ones and adds a unit test that reproduces #9494 without the fix and passes with it. Fixes #9494 Test: unit(dev), database_test(debug) Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Message-Id: <20211020143217.243949-1-bhalevy@scylladb.com>	2021-10-20 17:54:23 +03:00
Benny Halevy	8c85197c6c	abstract_replication_strategy: get rid of shared_token_metadata member and ctor param It is not used any more. Methods either use the token_metadata_ptr in the effective_replication_map, or receive an ad-hoc token_metadata. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2021-10-13 16:10:06 +03:00
Benny Halevy	4b838197e2	storage_service: update keyspaces effective_replication_map on token_metadata change Every time the token_metadata changes we need to update the effective_replication_map on all non-system keyspaces. Do that in replicate_to_all_cores after the updated token_metadata has been replicated to all cores. We first prepare and clone the token_metadata, then prepare and clone the new effective_replication_maps. Any failure at this stage is recoverable, handle via rollback and the exception is returned. Note that any failure to _apply_ the pending token_metadata or the effective_replication_map will cause scylla to abort. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2021-10-13 13:05:28 +03:00
Benny Halevy	991a6a8664	keyspace: update_effective_replication_map And use it to get_natural_endpoints. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2021-10-13 12:55:34 +03:00
Benny Halevy	970b0a50b5	keyspace: futurize create_replication_strategy And functions that use it, like: keyspace::update_from database::update_keyspace database::create_in_memory_keyspace Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2021-10-13 12:53:41 +03:00
Benny Halevy	d96a67eb57	abstract_replication_strategy: use shared_ptr in registry Enable creating shared_ptr<BaseClass> in nonstatic_class_registry using BaseClass::ptr_type and use that for abstract_replication_strategy. While at it, also clean up compressor with that respect to define compressor::ptr_type as shared_ptr<compressor> thus simplifying compressor_registry. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2021-10-13 12:39:36 +03:00
Benny Halevy	4511c9acdb	database.hh: convert ifdef block to pragma once Besides being more modern and more efficient for the compiler, this #ifndef block confuses my editor that greys out the whole block. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2021-10-13 12:39:36 +03:00
Benny Halevy	5001d261d4	abstract_replication_strategy: define replication_strategy_config_options To be used for searching effective replication strategy instances. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2021-10-13 12:39:36 +03:00
Pavel Emelyanov	c5128eea67	api, database, storage_service: Unify auto-compaction toggle There are two knobs here -- global and per-table one. Both were added without any synchronisation, but the former one was later fixed to become serialized and not to be available "too early". This patch unifies both toggles to be serialized with each-other and not be enabled too early. The justification for this change is to move the global toggle from out of the storage service, as it really belongs to the database, not the storage service. Respectively, the current synchronization, that depends on storage service internals, should be replaced with something else. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-10-11 11:12:39 +03:00
Tomasz Grabiec	e89b9799b8	Merge 'sstable mx reader: implement reverse single-partition reads' from Kamil Braun Until now reversed queries were implemented inside `querier::consume_page` (more precisely, inside the free function `consume_page` used by `querier::consume_page`) by wrapping the passed-in reader into `make_reversing_reader` and then consuming fragments from the resulting reversed reader. The first couple of commits change that by pushing the reversing down below the `make_combined_reader` call in `table::query`. This allows working on improving reversing for memtables independently from reversing for sstables. We then extend the `index_reader` with functions that allow reading the promoted index in reverse. We introduce `partition_reversing_data_source`, which wraps an sstable data file and returns data buffers with contents of a single chosen partition as if the rows were stored in reverse order. We use the reversing source and the extended index reader in `mx_sstable_mutation_reader` to implement efficient (at least in theory) reversed single-partition reads. The patchset disables cache for reversed reads. Fast-forwarding is not supported in the mx reader for reversed queries at this point. Details in commit messages. Read the commits in topological order for best review experience. 
Refs: #9134 (not saying "Fixes" because it's only for single-partition queries without forwarding) Closes #9281 * github.com:scylladb/scylla: table: add option to automatically bypass cache for reversed queries test: reverse sstable reader with random schema and random mutations sstables: mx: implement reversed single-partition reads sstables: mx: introduce partition_reversing_data_source sstables: index_reader: add support for iterating over clustering ranges in reverse clustering_key_filter: clustering_key_filter_ranges owning constructor flat_mutation_reader: mention reversed schema in make_reversing_reader docstring clustering_key_filter: document clustering_key_filter_ranges::get_ranges	2021-10-04 15:37:34 +02:00
Kamil Braun	703aed3277	table: add option to automatically bypass cache for reversed queries Currently the new reversing sstable algorithms do not support fast forwarding and the cache does not yet handle reversed results. This forced us to disable the cache for reversed queries if we want to guarantee bounded memory. We introduce an option that does this automatically (without specifying `bypass cache` in the query) and turn it on by default. If the user decides that they prefer to keep the cache at the cost of fetching entire partitions into memory (which may be viable if their partitions are small) during reversed queries, the option can be turned off. It is live-updateable.	2021-10-04 15:24:12 +02:00
Avi Kivity	1bac93e075	Merge "simplifications and layer violation fix for compaction manager" from Raphael "This series removes layer violation in compaction, and also simplifies compaction manager and how it interacts with compaction procedure." * 'compaction_manager_layer_violation_fix/v4' of github.com:raphaelsc/scylla: compaction: split compaction info and data for control compaction_manager: use task when stopping a given compaction type compaction: remove start_size and end_size from compaction_info compaction_manager: introduce helpers for task compaction_manager: introduce explicit ctor for task compaction: kill sstables field in compaction_info compaction: kill table pointer in compaction_info compaction: simplify procedure to stop ongoing compactions compaction: move management of compaction_info to compaction_manager compaction: move output run id from compaction_info into task	2021-10-04 13:09:31 +03:00
Raphael S. Carvalho	9067a13eac	compaction: split compaction info and data for control compaction_info must only contain info data to be exported to the outside world, whereas compaction_data will contain data for controlling compaction behavior and stats which change as compaction progresses. This separation makes the interface clearer, also allowing for future improvements like removing direct references to table in compaction. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2021-09-30 13:16:57 -03:00
Raphael S. Carvalho	efed06e2e4	compaction: move management of compaction_info to compaction_manager Today, compaction is calling compaction manager to register / deregister the compaction_info created by it. This is a layer violation because manager sits one layer above compaction, so manager should be responsible for managing compaction info. From now on, compaction_info will be created and managed by compaction_manager. compaction will only have a reference to info, which it can use to update the world about compaction progress. This will allow compaction_manager to be simplified as info can be coupled with its respective task, allowing duplication to be removed and layer violation to be fixed. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2021-09-30 13:15:00 -03:00
Pavel Emelyanov	e9002e1e61	database: Get local host id from system_keyspace It's now cached on database itself, and it can be removed. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-09-30 10:55:20 +03:00
Avi Kivity	b3c95a1fc6	commitlog: reduce inclusions of commitlog.hh due to db::commitlog::force_sync (#9379 ) There are now 231 translation units that indirectly include commitlog.hh due to the need to have access to db::commitlog::force_sync. Move that type to a new file commitlog_types.hh and make it available without access to the commitlog class. This reduces the number of translation units that depend on commitlog.hh to 84, improving compile time.	2021-09-29 16:13:44 +03:00
Raphael S. Carvalho	9718173598	compaction: Update backlog tracker correctly when schema is updated Currently the following can happen: 1) there's ongoing compaction with input sstable A, so sstable set and backlog tracker both contains A. 2) ongoing compaction replaces input sstable A by B, so sstable set contains only B now. 3) schema is updated, so a new backlog tracker is built without A because sstable set now contains only B. 4) ongoing compaction tries to remove A from tracker, but it was excluded in step 3. 5) tracker can now have a negative value if table is decreasing in size, which leads to log(<negative number>) == -NaN This problem happens because backlog tracker updates are decoupled from sstable set updates. Given that the essential content of backlog tracker should be the same as one of sstable set, let's move tracker management to table. Whenever sstable set is updated, backlog tracker will be updated with the same changes, making their management less error prone. Fixes #9157 Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2021-09-27 14:15:29 -03:00
Avi Kivity	d7ac699a55	Revert "Merge "compaction: Update backlog tracker correctly when schema is updated" from Raphael" This reverts commit `b5cf0b4489`, reversing changes made to `e8493e20cb`. It causes segmentation faults when sstable readers are closed. Fixes #9388.	2021-09-26 18:31:49 +03:00
Avi Kivity	bf94c06fc7	Revert "Merge "simplifications and layer violation fix for compaction manager" from Raphael" This reverts commit `7127c92acc`, reversing changes made to `88480ac504`. We need to revert `b5cf0b4489` to fix #9388, and this stands in the way. Ref #9388.	2021-09-26 18:30:36 +03:00
Pavel Emelyanov	88e5b7c547	database: Shutdown in tests There's a circular dependency: query processor needs database; database owns large_data_handler and compaction_manager; those two need qctx; qctx owns a query_processor. Respectively, the latter hidden dependency is not "tracked" by constructor arguments -- the query processor is started after the database and is deferred to be stopped before it. This works in scylla, because query processor doesn't really stop there, but in cql_test_env it's problematic as it stops everything, including the qctx. Recent database start-stop sanitation revealed this problem -- on database stop either l.d.h. or compaction manager try to start (or continue) messing with the query processor. One problem was faced immediately and plugged with the `75e1d7ea` safety check inside l.d.h., but still cql_test_env tests continue suffering from use after free on stopped query processor. The fix is to partially revert the `4b7846da` by making the tests stop some pieces of the database (including l.d.h. and compaction manager) as it used to before. In scylla this is, probably, not needed, at least now -- the database shutdown code was and still is run right before the stopping one. tests: unit(debug) Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Message-Id: <20210924080248.11764-1-xemul@scylladb.com>	2021-09-26 11:09:01 +03:00
Raphael S. Carvalho	5bf51ced14	compaction: split compaction info and data for control compaction_info must only contain info data to be exported to the outside world, whereas compaction_data will contain data for controlling compaction behavior and stats which change as compaction progresses. This separation makes the interface clearer, also allowing for future improvements like removing direct references to table in compaction. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2021-09-23 10:56:18 -03:00
Raphael S. Carvalho	0885376a85	compaction: move management of compaction_info to compaction_manager Today, compaction is calling compaction manager to register / deregister the compaction_info created by it. This is a layer violation because manager sits one layer above compaction, so manager should be responsible for managing compaction info. From now on, compaction_info will be created and managed by compaction_manager. compaction will only have a reference to info, which it can use to update the world about compaction progress. This will allow compaction_manager to be simplified as info can be coupled with its respective task, allowing duplication to be removed and layer violation to be fixed. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2021-09-23 10:00:49 -03:00
Avi Kivity	b5cf0b4489	Merge "compaction: Update backlog tracker correctly when schema is updated" from Raphael " Backlog tracker isn't updated correctly when facing a schema change, and may leak a SSTable if compaction strategy is changed, which causes backlog to be computed incorrectly. Most of these problems happen because sstable set and tracker are updated independently, so it could happen that tracker lose track (pun intended) of changes applied to set. The first patch will fix the leak when strategy is changed, and the third patch will make sure that tracker is updated atomically with sstable set, so these kind of problems will not happen anymore. Fixes #9157 test: mode(debug) " * 'fixes_to_backlog_tracker_v3' of https://github.com/raphaelsc/scylla: compaction: Update backlog tracker correctly when schema is updated compaction: Don't leak backlog of input sstable when compaction strategy is changed compaction: introduce compaction_read_monitor_generator::remove_exhausted_sstables() compaction: simplify removal of monitors	2021-09-22 18:55:25 +03:00
Raphael S. Carvalho	ff38f59f67	compaction: Update backlog tracker correctly when schema is updated Currently the following can happen: 1) there's ongoing compaction with input sstable A, so sstable set and backlog tracker both contains A. 2) ongoing compaction replaces input sstable A by B, so sstable set contains only B now. 3) schema is updated, so a new backlog tracker is built without A because sstable set now contains only B. 4) ongoing compaction tries to remove A from tracker, but it was excluded in step 3. 5) tracker can now have a negative value if table is decreasing in size, which leads to log(<negative number>) == -NaN This problem happens because backlog tracker updates are decoupled from sstable set updates. Given that the essential content of backlog tracker should be the same as one of sstable set, let's move tracker management to table. Whenever sstable set is updated, backlog tracker will be updated with the same changes, making their management less error prone. Fixes #9157 Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2021-09-20 15:54:41 -03:00
Pavel Emelyanov	a4118a70ee	database, messaging: Delete old connection drop notification Database no longer needs it. Since the only user of the old-style notification is gone -- remove it as well. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-09-15 17:49:06 +03:00
Pavel Emelyanov	b78e9b51b7	database, tests: Rework recommended format setting Tests don't have sstable format selector and enforce the needed format by hand with the help of special database:: method. It's more natural to provide it via config. Doing this makes database initialization in main and cql_test_env closer to each other. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-09-15 17:49:06 +03:00
Pavel Emelyanov	a42383b127	database, sstables_manager: Sow some noexcepts Setting sstables format into database and into sstables_manager is all plain assignments. Mark them as noexcept, next patch will become apparently exception safe after that. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-09-15 17:49:06 +03:00
Pavel Emelyanov	9a76df96e3	database: Eliminate unused helpers There are some large-data-handler-related helpers left after previous patches, they can be removed altogether. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-09-15 17:49:06 +03:00
Pavel Emelyanov	4b7846da86	database: Merge the stop_database() into database::stop() After stop_database() became shard-local, it's possible to merge it with database::stop() as they are both called one after another on scylla stop. In cql-test-env there are few more steps in between, but they don't rely on the database being partially stopped. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-09-15 17:49:06 +03:00
Pavel Emelyanov	469c734155	database: Flatten stop_database() The method needs to perform four steps cross-shard synchronously: first stop compaction manager, then close user and, after it, system tables, finally shutdown the large data handler. This patch reworks this synchronization with the help of cross-shard barrier added to the database previously. The motivation is to merge .stop_database() with .stop(). Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-09-15 17:49:06 +03:00
Pavel Emelyanov	b1013e09b4	database: Equip with cross-shard-barrier Make sure a node-wide barrier exists on a database when scylla starts. Also provide a barrier for cql_test_env. In all other cases keep a solo-mode barrier so that single-shard db stop doesn't get blocked. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-09-15 17:49:06 +03:00
Pavel Emelyanov	e2308034ff	database: Add .start() method Called right after the sharded::start(). For now empty, to be populated by next patches. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-09-15 17:44:48 +03:00
Pavel Emelyanov	f6ab69b7f8	database: Extract commitlog initialization from init_system_keyspace The intention is to keep all database initialization code in one place. The init_system_keyspace() is one the obstacles -- it initializes db's commitlog as first step. This patch moves the commitlog initialization out of the mentioned helper. The result looks clumsy, but it's temporary, next patches will brush it up. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-09-15 17:36:42 +03:00
Pavel Emelyanov	bd2b7dca0e	database: Remove unused mm arg from init_non_system_keyspaces() Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-09-15 17:35:37 +03:00
Pavel Emelyanov	dc92f220e4	database: Drop get_available_memory() helper It's only used on start to provide the total_memory() value to the repair configuration code. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-09-15 17:35:32 +03:00

987 Commits