scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-05-02 22:25:48 +00:00

Author	SHA1	Message	Date
Raphael S. Carvalho	b88acffd66	replica: Allow one compaction_backlog_tracker for each compaction_group Today, compaction_backlog_tracker is managed in each compaction_strategy implementation. So every compaction strategy is managing its own tracker and providing a reference to it through get_backlog_tracker(). But this prevents each group from having its own tracker, because there's only a single compaction_strategy instance per table. To remove this limitation, compaction_strategy impl will no longer manage trackers but will instead provide an interface for trackers to be created, such that each compaction group will be allowed to have its own tracker, which will be managed by compaction manager. On compaction strategy change, table will update each group with the new tracker, which is created using the previously introduced compaction_group_sstable_set_updater. Now table's backlog will be the sum of all compaction_group backlogs. The normalization factor is applied on the sum, so we don't have to adjust each individual backlog to any factor. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2022-11-11 09:22:51 -03:00
Raphael S. Carvalho	90991bda69	replica: Refactor table::set_compaction_strategy for multiple groups Refactoring the function for it to accommodate multiple compaction groups. To still provide strong exception guarantees, preparation and execution of changes will be separated. Once multiple groups are supported, each group will be prepared first, and the noexcept execution will be done as a last step. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2022-11-11 09:17:37 -03:00
Raphael S. Carvalho	244efddb22	Fix exception safety when transferring ongoing charges to new backlog tracker When setting a new strategy, the charges of the old tracker are transferred to the new one. The problem is that we're not reverting changes if an exception is triggered before the new strategy is successfully set. To fix this exception safety issue, let's copy the charges instead of moving them. If an exception is triggered, the old tracker is still the one used and remains intact. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2022-11-11 09:17:37 -03:00
Raphael S. Carvalho	d1e2dbc592	replica: move_sstables_from_staging: Use tracker from group owning the SSTable When moving SSTables from staging directory, we'll conditionally add them to backlog tracker. As each group has its own tracker, a given sstable will be added to the tracker of the group that owns it. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2022-11-11 09:17:37 -03:00
Raphael S. Carvalho	9031dc3199	replica: Move table::backlog_tracker_adjust_charges() to compaction_group Procedures that call this function happen to be in compaction_group, so let's move it to group. Simplifies the change where the procedure retrieves tracker from the group itself. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2022-11-11 09:17:36 -03:00
Raphael S. Carvalho	116459b69e	replica: table::discard_sstables: Use compaction_group's backlog tracker Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2022-11-11 09:17:36 -03:00
Raphael S. Carvalho	b2d8545b15	replica: Disable backlog tracker in compaction_group::stop() As we're moving backlog tracker to compaction group, we need to stop the tracker there too. We're moving it a step earlier in table::stop(), before sstables are cleared, but that's okay because it's still done after the group was deregistered from compaction manager, meaning no compactions are running. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2022-11-11 09:17:36 -03:00
Raphael S. Carvalho	91b0d772e2	replica: database_sstable_write_monitor: use compaction_group's backlog tracker Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2022-11-11 09:17:36 -03:00
Raphael S. Carvalho	f37a05b559	replica: Move table::do_add_sstable() to compaction_group All callers of do_add_sstable() live in compaction_group, so it should be moved into compaction_group too. It also makes easier for the function to retrieve the backlog tracker from the group. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2022-11-11 09:17:36 -03:00
Raphael S. Carvalho	1ec0ef18a5	compaction/table_state: Introduce get_backlog_tracker() This interface will be helpful for allowing replica::table, unit tests and sstables::compaction to access the compaction group's tracker which will be managed by the compaction manager, once we complete the decoupling work. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2022-11-11 09:17:36 -03:00
Botond Dénes	725e5b119d	Revert "replica: Pick new generation for SSTables being moved from staging dir" This reverts commit `ba6186a47f`. Said commit violates the widely held assumption that sstable generations can be used as sstable identity. One known problem caused by this is a potential OOO partition emitted when reading from sstables (#11843). We now also have a better fix for #11789 (the bug this commit was meant to fix): `4aa0b16852`. So we can revert without regressions. Fixes: #11843 Closes #11886	2022-11-09 16:35:31 +02:00
Raphael S. Carvalho	a57724e711	Make off-strategy compaction wait for view building completion Prior to off-strategy compaction, streaming / repair would place staging files into main sstable set, and wait for view building completion before they could be selected for regular compaction. The reason for that is that view building relies on table providing a mutation source without data in staging files. Had regular compaction mixed staging data with non-staging one, table would have a hard time providing the required mutation source. After off-strategy compaction, staging files can be compacted in parallel to view building. If off-strategy completes first, it will place the output into the main sstable set. So a parallel view building (on sstables used for off-strategy) may potentially get a mutation source containing staging data from the off-strategy output. That will mislead view builder as it won't be able to detect changes to data in main directory. To fix it, we'll do what we did before. Filter out staging files from compaction, and trigger the operation only after we're done with view building. We're piggybacking on off-strategy timer for still allowing the off-strategy to only run at the end of the node operation, to reduce the amount of compaction rounds on the data introduced by repair / streaming. Fixes #11882. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Closes #11919	2022-11-08 08:53:58 +02:00
Benny Halevy	eb3a94e2bc	table: perform_cleanup_compaction: flush memtable We don't explicitly clean up the memtable, while it might hold tokens disowned by the current node. Flush the memtable before performing cleanup compaction to make sure all tokens in the memtable are cleaned up. Note that non-owned ranges are invalidated in the cache in compaction_group::update_main_sstable_list_on_compaction_completion using desc.ranges_for_cache_invalidation. Fixes #1239 Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-11-06 19:41:40 +02:00
Benny Halevy	fc278be6c4	table: add perform_cleanup_compaction Move the integration with compaction_manager from the api layer to the table class so it can also make sure the memtable is cleaned up in the next patch. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-11-06 19:41:33 +02:00
Botond Dénes	4aa0b16852	Merge 'distributed_loader: detect highest generation before populating column families' from Benny Halevy We should scan all sstables in the table directory and its subdirectories to determine the highest sstable version and generation before using it for creating new sstables (via reshard or reshape). Otherwise, the generations of new sstables created when populating staging (via reshard or reshape) may collide with generations in the base directory, leading to https://github.com/scylladb/scylladb/issues/11789 Refs scylladb/scylladb#11789 Fixes scylladb/scylladb#11793 Closes #11795 * github.com:scylladb/scylladb: distributed_loader: populate_column_family: reindent distributed_loader: coroutinize populate_column_family distributed_loader: table_population_metadata: start: reindent distributed_loader: table_population_metadata: coroutinize start_subdir distributed_loader: table_population_metadata: start_subdir: reindent distributed_loader: pre-load all sstables metadata for table before populating it	2022-10-21 14:07:51 +03:00
Avi Kivity	6b0afb968d	Merge 'reader_concurrency_semaphore: add set_resources()' from Botond Dénes Allows changing the total or initial resources the semaphore has. After calling `set_resources()` the semaphore will look as if it had been created with the specified amount of resources. Use the new method in `replica::database::revert_initial_system_read_concurrency_boost()` so it doesn't lead to strange semaphore diagnostics output. Currently the system semaphore has 90/100 count units when there are no reads against it, which has led to some confusion. I also plan on using the new facility in enterprise. Closes #11772 * github.com:scylladb/scylladb: replica/database: revert initial boost to system semaphore with set_resources() reader_concurrency_semaphore: add set_resources()	2022-10-19 18:04:20 +03:00
Raphael S. Carvalho	ba6186a47f	replica: Pick new generation for SSTables being moved from staging dir When moving an SSTable from staging to base dir, we reused the generation under the assumption that no SSTable in base dir uses that same generation. But that's not always true. When reshaping staging dir, reshape compaction can pick a generation taken by an SSTable in base dir. That's because staging dir is populated first and it doesn't have awareness of generations in base dir yet. When that happens, view building will fail to move an SSTable in staging which shares the same generation as another in base dir. We could have played with the order of population, populating base dir first and then staging dir, but the fragility wouldn't be gone. Not future proof at all. We can easily make this safe by picking a new generation for the SSTable being moved from staging, making sure no clash will ever happen. Fixes #11789. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Closes #11790	2022-10-19 15:33:30 +03:00
Benny Halevy	4d7f0be929	distributed_loader: populate_column_family: reindent	2022-10-19 14:18:38 +03:00
Benny Halevy	030afaa934	distributed_loader: coroutinize populate_column_family Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-10-19 14:18:04 +03:00
Benny Halevy	0f23ee14c9	distributed_loader: table_population_metadata: start: reindent	2022-10-19 14:16:59 +03:00
Benny Halevy	39cec4f304	distributed_loader: table_population_metadata: coroutinize start_subdir Calling it in a seastar thread was done to reduce code churn and facilitate backporting. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-10-19 14:16:59 +03:00
Benny Halevy	5749a54cab	distributed_loader: table_population_metadata: start_subdir: reindent	2022-10-19 14:16:59 +03:00
Benny Halevy	119c0f3983	distributed_loader: pre-load all sstables metadata for table before populating it We should scan all sstables in the table directory and its subdirectories to determine the highest sstable version and generation before using it for creating new sstables (via reshard or reshape). Fixes scylladb/scylladb#11793 Note: table_population_metadata::start_subdir is called in a seastar thread to facilitate backporting to old versions that do not support coroutines yet. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-10-19 14:16:57 +03:00
Botond Dénes	d85208a574	replica/database: revert initial boost to system semaphore with set_resources() Unlike the current method (which uses consume()), this will also adjust the initial resources, adjusting the semaphore as if it was created with the reduced amount of resources in the first place. This fixes the confusing 90/100 count resources seen in diagnostics dump outputs.	2022-10-17 07:39:20 +03:00
Pavel Emelyanov	3e0b61d707	compaction_manager: Relax history paths There's a virtual method on table_state to update the entry in system keyspace. It's overkill to facilitate tests that don't want this. With the new system_keyspace weak referencing it can be made simpler by moving the updating call to the compaction_manager itself. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-10-10 16:20:59 +03:00
Pavel Emelyanov	f9b57df471	database: Plug/unplug system_keyspace There's a circular dependency between system_keyspace and database. The former needs the latter because it needs to execute local requests via query_processor. The latter needs the former via compaction manager and large data handler; database depends on both, and these two need to insert their entries into system keyspace. To cut this loop the compaction manager and large data handler both get a weak reference on the system keyspace. Once system keyspace starts, it activates this reference via the database call. When system keyspace is shut down on stop, it deactivates the reference. Technically the weak reference is implemented by marking the system_k.s. object as async_sharded_service, and the "reference" in question is the shared_from_this() pointer. When compaction manager or large data handler need to update a system keyspace's table, they both hold an extra reference on the system keyspace until the entry is committed, thus making sure that sys._k.s. doesn't stop from under their feet. At the same time, unplugging the reference on shutdown makes sure that no new entry updates will appear and the system_k.s. will eventually be released. It's not a classical C++ reference, because system_keyspace starts after and stops before database. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-10-10 16:20:59 +03:00
Botond Dénes	b247f29881	Merge 'De-static system_keyspace::get_{saved\|local}_tokens()' from Pavel Emelyanov Yet another user of global qctx object. Making the method(s) non-static requires pushing the system_keyspace all the way down to size_estimate_virtual_reader and a small update of the cql_test_env Closes #11738 * github.com:scylladb/scylladb: system_keyspace: Make get_{local\|saved}_tokens non static size_estimates_virtual_reader: Pass sys_ks argument to get_local_ranges() cql_test_env: Keep sharded<system_keyspace> reference size_estimate_virtual_reader: Keep system_keyspace reference system_keyspace: Pass sys_ks argument to install_virtual_readers() system_keyspace: Make make() non-static distributed_loader: Pass sys_ks argument to init_system_keyspace() system_keyspace: Remove dangling forward declaration	2022-10-07 11:28:32 +03:00
Avi Kivity	20bad62562	Merge 'Detect and record large collections' from Benny Halevy This series adds support for detecting collections that have too many items and recording them in `system.large_cells`. A configuration variable was added to db/config: `compaction_collection_items_count_warning_threshold` set by default to 10000. Collections that have more items than this threshold will be warned about and will be recorded as a large cell in the `system.large_cells` table. Documentation has been updated accordingly. A new column was added to system.large_cells: `collection_items`. Similar to the `rows` column in system.large_partitions, `collection_items` holds the number of items in a collection when the large cell is a collection, or 0 if it isn't. Note that the collection may be recorded in system.large_cells either due to its size, like any other cell, and/or due to the number of items in it, if it crosses the said threshold. Note that #11449 called for a new system.large_collections table, but extending system.large_cells follows the logic of system.large_partitions and is a smaller change overall, hence it was preferred. Since the system keyspace schema is hard coded, the schema version of system.large_cells was bumped, and since the change is not backward compatible, we added a cluster feature - `LARGE_COLLECTION_DETECTION` - to enable using it. The large_data_handler large cell detection record function will populate the new column only when the new cluster feature is enabled. In addition, unit tests were added in sstable_3_x_test for testing large cells detection by cell size, and large_collection detection by the number of items. 
Closes #11449 Closes #11674 * github.com:scylladb/scylladb: sstables: mx/writer: optimize large data stats members order sstables: mx/writer: keep large data stats entry as members db: large_data_handler: dynamically update config thresholds utils/updateable_value: add transforming_value_updater db/large_data_handler: cql_table_large_data_handler: record large_collections db/large_data_handler: pass ref to feature_service to cql_table_large_data_handler db/large_data_handler: cql_table_large_data_handler: move ctor out of line docs: large-rows-large-cells-tables: fix typos db/system_keyspace: add collection_elements column to system.large_cells gms/feature_service: add large_collection_detection cluster feature test: sstable_3_x_test: add test_sstable_too_many_collection_elements test: lib: simple_schema: add support for optional collection column test: lib: simple_schema: build schema in ctor body test: lib: simple_schema: cql: define s1 as static only if built this way db/large_data_handler: maybe_record_large_cells: consider collection_elements db/large_data_handler: debug cql_table_large_data_handler::delete_large_data_entries sstables: mx/writer: pass collection_elements to writer::maybe_record_large_cells sstables: mx/writer: add large_data_type::elements_in_collection db/large_data_handler: get the collection_elements_count_threshold db/config: add compaction_collection_elements_count_warning_threshold test: sstable_3_x_test: add test_sstable_write_large_cell test: sstable_3_x_test: pass cell_threshold_bytes to large_data_handler test: sstable_3_x_test: large_data_handler: prepare callback for testing large_cells test: sstable_3_x_test: large_data tests: use BOOST_REQUIRE_[GL]T test: sstable_3_x_test: test_sstable_log_too_many_rows: use tests::random	2022-10-06 18:28:21 +03:00
Avi Kivity	62a4d2d92b	Merge 'Preliminary changes for multiple Compaction Groups' from Raphael "Raph" Carvalho What's contained in this series: - Refactored compaction tests (and utilities) for integration with multiple groups - The idea is to write a new class of tests that will stress multiple groups, whereas the existing ones will still stress a single group. - Fixed a problem when cloning compound sstable set (cannot be triggered today so I didn't open a GH issue) - Many changes in replica::table for allowing integration with multiple groups Next: - Introduce for_each_compaction_group() for iterating over groups wherever needed. - Use for_each_compaction_group() in replica::table operations spanning all groups (API, readers, etc). - Decouple backlog tracker from compaction strategy, to allow for backlog isolation across groups - Introduce static option for defining number of compaction groups and implement function to map a token to its respective group. - Testing infrastructure for multiple compaction groups (helpful when testing the dynamic behavior: i.e. merging / splitting). 
Closes #11592 * github.com:scylladb/scylladb: sstable_resharding_test: Switch to table_for_tests replica: Move compacted_undeleted_sstables into compaction group replica: Use correct compaction_group in try_flush_memtable_to_sstable() replica: Make move_sstables_from_staging() robust and compaction group friendly test: Rename column_family_for_tests to table_for_tests sstable_compaction_test: Use column_family_for_tests::as_table_state() instead test: Don't expose compound set in column_family_for_tests test: Implement column_family_for_tests::table_state::is_auto_compaction_disabled_by_user() sstable_compaction_test: Merge table_state_for_test into column_family_for_tests sstable_compaction_test: use table_state_for_test itself in fully_expired_sstables() sstable_compaction_test: Switch to table_state in compact_sstables() sstable_compaction_test: Reduce boilerplate by switching to column_family_for_tests	2022-10-06 18:23:47 +03:00
Pavel Emelyanov	04552f2d58	system_keyspace: Pass sys_ks argument to install_virtual_readers() The size-estimate-virtual-reader will need it, now it's available as "this" from system_keyspace::make() method Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-10-06 17:57:13 +03:00
Pavel Emelyanov	1938412d7a	system_keyspace: Make make() non-static This helper needs system_keyspace reference and using "this" as this looks natural. Also this de-static-ification makes it possible to put some sense into the invoke_on_all() call from init_system_keyspace() Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-10-06 17:56:11 +03:00
Pavel Emelyanov	9f79525f8e	distributed_loader: Pass sys_ks argument to init_system_keyspace() Its final destination is the virtual tables registration code called from init_system_keyspace() eventually Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-10-06 17:55:03 +03:00
Botond Dénes	753f671eaa	Merge 'dirty_memory_manager: simplify, clarify, and document' from Avi Kivity This series undoes some recent damage to clarity, then goes further by renaming terms around dirty_memory_manager to be clearer. Documentation is added. Closes #11705 * github.com:scylladb/scylladb: dirty_memory_manager: re-term "virtual dirty" to "unspooled dirty" dirty_memory_manager: rename _virtual_region_group api: column_family: fix memtable off-heap memory reporting dirty_memory_manager: unscramble terminology	2022-10-06 13:49:26 +02:00
Tomasz Grabiec	4c8dc41f75	Merge 'Handle storage_io_error's ENOSPC when flushing' from Pavel Emelyanov This is the continuation of the `a980510654` that tries to catch ENOSPCs reported via storage_io_error similarly to how defer_verbose_shutdown() does on stop Closes #11664 * github.com:scylladb/scylladb: table: Handle storage_io_error's ENOSPC when flushing table: Rewrap retry loop	2022-10-06 13:49:26 +02:00
Raphael S. Carvalho	cf3f93304e	replica: Move compacted_undeleted_sstables into compaction group Compacted undeleted sstables are relevant for avoiding data resurrection in the purge path. As token ranges of groups won't overlap, it's better to isolate this data, so to prevent one group from interfering with another. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2022-10-05 21:37:19 -03:00
Raphael S. Carvalho	56ac62bbd6	replica: Use correct compaction_group in try_flush_memtable_to_sstable() We need to pass the compaction_group received as a param, not the one retrieved via as_table_state(). Needed for supporting multiple groups. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2022-10-05 21:37:19 -03:00
Raphael S. Carvalho	707ebf9cf7	replica: Make move_sstables_from_staging() robust and compaction group friendly Off-strategy can happen in parallel to view building. A semaphore is used to ensure they don't step on each other's toes. If off-strategy completes first, then move_sstables_from_staging() won't find the SSTable alive and won't reach code to add the file to the backlog tracker. If view building completes first, the SSTable exists, but it's not reshaped yet (has repair origin) and shouldn't be added to the backlog tracker. Off-strategy completion code will make sure new sstables added to main set are accounted for by the backlog tracker, so move_sstables_from_staging() only needs to add to the tracker files which are certainly not going through a reshape compaction. So let's take these facts into account to make the procedure more robust and compaction group friendly. Very welcome change for when multiple groups are supported. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2022-10-05 21:37:19 -03:00
Benny Halevy	2c4ff71d2b	db: large_data_handler: dynamically update config thresholds make the various large data thresholds live-updateable and construct the observers and updaters in cql_table_large_data_handler to dynamically update the base large_data_handler class threshold members. Fixes #11685 Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-10-05 10:53:40 +03:00
Botond Dénes	4c13328788	Merge 'Return all sstables in table::get_sstable_set()' from Raphael "Raph" Carvalho This fixes a regression introduced by `1e7a444`, where table::get_sstable_set() isn't exposing all sstables, but rather only the ones in the main set. That causes users of the interface, such as get_sstables_by_partition_key() (used by the API to return the list of sstable names which contain a particular key), to miss files in the maintenance set. Fixes https://github.com/scylladb/scylladb/issues/11681. Closes #11682 * github.com:scylladb/scylladb: replica: Return all sstables in table::get_sstable_set() sstables: Fix cloning of compound_sstable_set	2022-10-05 06:55:50 +03:00
Raphael S. Carvalho	827750c142	replica: Return all sstables in table::get_sstable_set() get_sstable_set() as its name implies is not confined to the main or maintenance set, nor to a specific compaction group, so let's make it return the compound set which spans all groups, meaning all sstables tracked by a table will be returned. This is a regression introduced in `1e7a444`. It affects the API to return sstable list containing a partition key, as sstables in maintenance would be missed, fooling users of the API like tools that could trust the output. Each compaction group is returning the main and maintenance set in table_state's main_sstable_set() and maintenance_sstable_set(), respectively. Fixes #11681. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2022-10-04 10:43:27 -03:00
Avi Kivity	37c6b46d26	dirty_memory_manager: re-term "virtual dirty" to "unspooled dirty" The "virtual dirty" term is not very informative. "Virtual" means "not real", but it doesn't say in which way it isn't real. In this case, virtual dirty refers to real dirty memory, minus the portion of memtables that has been written to disk (but not yet sealed - in that case it would not be dirty in the first place). I chose to call "the portion of memtables that has been written to disk" as "spooled memory". At least the unique term will cause people to look it up and may be easier to remember. From that we have "unspooled memory". I plan to further change the accounting to account for spooled memory rather than unspooled, as that is a more natural term, but that is left for later. The documentation, config item, and metrics are adjusted. The config item is practically unused so it isn't worth keeping compatibility here.	2022-10-04 14:03:59 +03:00
Avi Kivity	d02c407769	dirty_memory_manager: rename _virtual_region_group Since we folded _real_region_group into _virtual_region_group, the "virtual" tag makes no sense any more, so remove it.	2022-10-04 14:01:45 +03:00
Avi Kivity	bc2fcf5187	dirty_memory_manager: unscramble terminology Before `95f31f37c1` ("Merge 'dirty_memory_manager: simplify region_group' from Avi Kivity"), we had two region_group objects, one _real_region_group and another _virtual_region_group, each with a set of "soft" and "hard" limits and related functions and members. In `95f31f37c1`, we merged _real_region_group into _virtual_region_group, but unfortunately the _real_region_group members received the "hard" prefix when they got merged. This overloads the meaning of "hard" - is it related to soft/hard limit or is it related to the real/virtual distinction? This patch applied some renaming to restore consistency. Anything that came from _virtual_region_group now has "virtual" in its name. Anything that came from _real_region_group now has "real" in its name. The terms are still pretty bad but at least they are consistent.	2022-10-04 13:56:28 +03:00
Pavel Emelyanov	9cd1f777a5	database.hh: Remove unused headers Use forward declarations when needed Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes #11667	2022-10-04 09:01:38 +03:00
Benny Halevy	3f8bba202f	db/large_data_handler: pass ref to feature_service to cql_table_large_data_handler For recording collection_elements of large_collections when the large_collection_detection feature is enabled. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-10-04 08:42:10 +03:00
Benny Halevy	a107f583fd	db/large_data_handler: get the collection_elements_count_threshold Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-10-04 08:31:11 +03:00
Botond Dénes	95f31f37c1	Merge 'dirty_memory_manager: simplify region_group' from Avi Kivity region_group evolved as a tree, each node of which contains some regions (memtables). Each node has some constraints on memory, and can start flushing and/or stop allocation into its memtables and those below it when those constraints are violated. Today, the tree has exactly two nodes, only one of which can hold memtables. However, all the complexity of the tree remains. This series applies some mechanical code transformations that remove the tree structure and all the excess functionality, leaving a much simpler structure behind. Before: - a tree of region_group objects - each with two parameters: soft limit and hard limit - but only two instances ever instantiated After: - a single region_group object - with three parameters - two from the bottom instance, one from the top instance Closes #11570 * github.com:scylladb/scylladb: dirty_memory_manager: move third memory threshold parameter of region_group constructor to reclaim_config dirty_memory_manager: simplify region_group::update() dirty_memory_manager: fold region_group::notify_hard_pressure_relieved into its callers dirty_memory_manager: clean up region_group::do_update_hard_and_check_relief() dirty_memory_manager: make do_update_hard_and_check_relief() a member of region_group dirty_memory_manager: remove accessors around region_group::_under_hard_pressure dirty_memory_manager: merge memory_hard_limit into region_group dirty_memory_manager: rename members in memory_hard_limit dirty_memory_manager: fold do_update() into region_group::update() dirty_memory_manager: simplify memory_hard_limit's do_update dirty_memory_manager: drop soft limit / soft pressure members in memory_hard_limit dirty_memory_manager: de-template do_update(region_group_or_memory_hard_limit) dirty_memory_manager: adjust soft_limit threshold check dirty_memory_manager: drop memory_hard_limit::_name dirty_memory_manager: simplify memory_hard_limit 
configuration dirty_memory_manager: fold region_group_reclaimer into {memory_hard_limit,region_group} dirty_memory_manager: stop inheriting from region_group_reclaimer dirty_memory_manager: test: unwrap region_group_reclaimer dirty_memory_manager: change region_group_reclaimer configuration to a struct dirty_memory_manager: convert region_group_reclaimer to callbacks dirty_memory_manager: consolidate region_group_reclaimer constructors dirty_memory_manager: rename {memory_hard_limit,region_group}::notify_relief dirty_memory_manager: drop unused parameter to memory_hard_limit constructor dirty_memory_manager: drop memory_hard_limit::shutdown() dirty_memory_manager: split region_group hierarchy into separate classes dirty_memory_manager: extract code block from region_group::update dirty_memory_manager: move more allocation_queue functions out of region_group dirty_memory_manager: move some allocation queue related function definitions outside class scope dirty_memory_manager: move region_group::allocating_function and related classes to new class allocation_queue dirty_memory_manager: remove support for multiple subgroups	2022-10-03 13:22:47 +03:00
Avi Kivity	17b1cb4434	dirty_memory_manager: move third memory threshold parameter of region_group constructor to reclaim_config Place it along the other parameters.	2022-09-30 22:17:37 +03:00
Avi Kivity	6a02bb7c2b	dirty_memory_manager: merge memory_hard_limit into region_group The two classes always have a 1:1 or 0:1 relationship, and so we can just move all the members of memory_hard_limit into region_group, with the functions that track the relationship (memory_hard_limit::{add,del}()) removed. The 0:1 relationship is maintained by initializing the hard limit parameter with std::numeric_limits<size_t>::max(). The _hard_total_memory variable is always checked if it is greater than this parameter in order to do anything, and with this default it can never be.	2022-09-30 21:59:38 +03:00
Pavel Emelyanov	6a5b0d6c70	table: Handle storage_io_error's ENOSPC when flushing Commit `a9805106` (table: seal_active_memtable: handle ENOSPC error) made memtable flushing code withstand ENOSPC and continue flushing again in the hope that the node administrator would provide some free space. However, it looks like the IO code may report back ENOSPC with some exception type this code doesn't expect. This patch tries to fix it. refs: #11245 Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-09-29 19:16:30 +03:00

410 Commits