scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-04-26 11:30:36 +00:00

Author	SHA1	Message	Date
Avi Kivity	2a76065e3d	table, memtable: share log-structured allocator statistics across all memtables in a table The log-structured allocator collects allocation statistics (which it uses to manage memory reserves) in some objects kept in memtable_table_shared_data. Right now, this object is local to memtable_list, which itself is local to a tablet replica. Move it to table scope so different tablets in the shard share the statistics. This helps a newly-migrated tablet adjust more quickly.	2023-12-26 21:24:51 +02:00
Avi Kivity	02111d6754	memtable: consolidate _read_section, _allocating_section in a struct Those two members are passed from memtable_list to memtable. Since we wish to pass them from table, it becomes awkward to pass them as two separate variables as their contents are specific to memtable internals. Wrap them in a name that indicates their role (being table-wide shared data for memtables) and pass them as a unit.	2023-12-26 21:11:48 +02:00
Raphael S. Carvalho	5e55954f27	replica: Make the storage snapshot survive concurrent compactions Consider this: 1) file streaming takes a storage snapshot = list of sstables 2) a concurrent compaction unlinks some of those sstables from the file system 3) file streaming tries to send the unlinked sstables, but files other than data and index cannot be read, as only data and index have open file descriptors To fix it, the snapshot now returns a set of files, one per sstable component, for each sstable. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Closes scylladb/scylladb#16476	2023-12-21 12:50:28 +02:00
Raphael S. Carvalho	546b31846a	replica: Introduce storage group splitting This introduces the ability to split a storage group. The main compaction group is split into left and right groups. set_split() is used to set the storage group to splitting mode, which will create left and right compaction groups. Incoming writes will now be placed into memtable of either left or right groups. split() is used to complete the splitting of a group. It only returns when all preexisting data is split. That means main compaction group will be empty and all the data will be stored in either left or right group. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2023-12-17 12:02:01 -03:00
Raphael S. Carvalho	213b2f1382	replica: Rename compaction_group_manager to storage_group_manager That's to reflect the fact that the manager now works with storage groups instead. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2023-12-17 11:40:09 -03:00
Raphael S. Carvalho	15de1cdcbc	replica: Introduce concept of storage group Storage group is the storage of tablets. This new concept is helpful for tablet splitting, where the storage of tablet will be split in multiple compaction groups, where each can be compacted independently. The reason for not going with arena concept is that it added complexity, and it felt much more elegant to keep compaction group unchanged which at the end of the day abstracts the concept of a set of sstables that can be compacted and operated independently. When splitting, the storage group for a tablet may therefore own multiple compaction groups, left, right, and main, where main keeps the data that needs splitting. When splitting completes, only left and right compaction groups will be populated. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2023-12-17 11:40:09 -03:00
Benny Halevy	cddcf3ad0c	table: add table_holder and hold method A smart pointer that guards the table object while it's being used by async functions. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-12-12 08:43:49 +02:00
Patryk Jędrzejczak	c8ee7d4499	db: make schema commitlog feature mandatory Using consistent cluster management and not using schema commitlog ends with a bad configuration throw during bootstrap. Soon, we will make consistent cluster management mandatory. This forces us to also make schema commitlog mandatory, which we do in this patch. A booting node decides to use schema commitlog if at least one of the two statements below is true: - the node has `force_schema_commitlog=true` config, - the node knows that the cluster supports the `SCHEMA_COMMITLOG` cluster feature. The `SCHEMA_COMMITLOG` cluster feature has been added in version 5.1. This patch is supposed to be a part of version 6.0. We don't support a direct upgrade from 5.1 to 6.0 because it skips two versions - 5.2 and 5.4. So, in a supported upgrade we can assume that the version which we upgrade from has schema commitlog. This means that we don't need to check the `SCHEMA_COMMITLOG` feature during an upgrade. The reasoning above also applies to Scylla Enterprise. Version 2024.2 will be based on 6.0. Probably, we will only support an upgrade to 2024.2 from 2024.1, which is based on 5.4. But even if we support an upgrade from 2023.x, this patch won't break anything because 2023.1 is based on 5.2, which has schema commitlog. Upgrades from 2022.x definitely won't be supported. When we populate a new cluster, we can use the `force_schema_commitlog=true` config to use schema commitlog unconditionally. Then, the cluster feature check is irrelevant. This check could fail because we initiate schema commitlog before we learn about the features. The `force_schema_commitlog=true` config is especially useful when we want to use consistent cluster management. Failing feature checks would lead to crashes during initial bootstraps. Moreover, there is no point in creating a new cluster with `consistent_cluster_management=true` and `force_schema_commitlog=false`. It would just cause some initial bootstraps to fail, and after successful restarts, the result would be the same as if we used `force_schema_commitlog=true` from the start. In conclusion, we can unconditionally use schema commitlog without any checks in 6.0 because we can always safely upgrade a cluster and start a new cluster. Apart from making schema commitlog mandatory, this patch adds two changes that are its consequences: - making the unneeded `force_schema_commitlog` config unused, - deprecating the `SCHEMA_COMMITLOG` feature, which is always assumed to be true. Closes scylladb/scylladb#16254	2023-12-04 21:02:16 +02:00
Yaniv Kaul	c658bdb150	Typos: fix typos in comments Fixes some typos as found by codespell run on the code. In this commit, I was hoping to fix only comments, not user-visible alerts, output, etc. Follow-up commits will take care of them. Refs: https://github.com/scylladb/scylladb/issues/16255 Signed-off-by: Yaniv Kaul <yaniv.kaul@scylladb.com>	2023-12-02 22:37:22 +02:00
Benny Halevy	66ba983fe0	compaction_manager: flush_all_tables before major compaction Major compaction already flushes each table to make sure it considers any mutations that are present in the memtable for the purpose of tombstone purging. See `64ec1c6ec6` However, tombstone purging may be inhibited by data in commitlog segments based on `gc_time_min` in the `tombstone_gc_state` (See `f42eb4d1ce`). Flushing all sstables in the database releases all references to commitlog segments and thereby maximizes the potential for tombstone purging, which is typically the reason for running major compaction. However, flushing all tables too frequently might result in tiny sstables. Since when flushing all keyspaces using `nodetool flush` the `force_keyspace_compaction` api is invoked for each keyspace successively, we need a mechanism to prevent too frequent flushes by major compaction. Hence a `compaction_flush_all_tables_before_major_seconds` interval configuration option is added (defaults to 24 hours). In the case that not all tables are flushed prior to major compaction, we revert to the old behavior of flushing each table in the keyspace before major-compacting it. Fixes scylladb/scylladb#15777 Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-11-28 16:37:42 +02:00
Benny Halevy	be763bea34	database: add flush_all_tables Flushes all tables after forcing force_new_active_segment of the commitlog to make sure all commitlog segments can get recycled. Otherwise, due to "false sharing", rarely-written tables might inhibit recycling of the commitlog segments they reference. After `f42eb4d1ce`, that won't allow compaction to purge some tombstones based on the min_gc_time. To be used in the next patch by major compaction. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-11-28 16:37:42 +02:00
Benny Halevy	1fd85bd37b	api: compaction: add flush_memtables option When flushing is done externally, e.g. by running `nodetool flush` prior to `nodetool compact`, flush_memtables=false can be passed to skip flushing of tables right before they are major-compacted. This is useful to prevent creation of small sstables due to excessive memtable flushing. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-11-28 16:37:42 +02:00
Kefu Chai	15bfa09454	treewide: do not mark return value const if this has no effect this change is a cleanup. to mark a return value without value semantics has no effect. these `const` specifiers are useless. so let's drop them. and, if we compile the tree with `-Wignored-qualifiers`, the compiler would warn like: ``` /home/kefu/dev/scylladb/schema/schema.hh:245:5: error: 'const' type qualifier on return type has no effect [-Werror,-Wignored-qualifiers] 245 | const index_metadata_kind kind() const; | ^~~~~ ``` so this change also silences the above warnings. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-11-17 17:46:19 +08:00
Pavel Emelyanov	68cf26587c	database: Add get_sstables_manager(bool_class is_system) method There's one place that does this selection, soon there will appear another, so it's worth having a convenience helper getter. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-11-08 20:23:16 +03:00
Patryk Jędrzejczak	fbcd667030	replica: keyspace::create_replication_strategy: remove a redundant parameter The options parameter is redundant. We always use `_metadata->strategy_options()` and `keyspace::create_replication_strategy` already assumes that `_metadata` is set by using its other fields. Closes scylladb/scylladb#15776	2023-10-20 10:20:49 +03:00
Avi Kivity	7d5e22b43b	replica: memtable: don't forget memtable memory allocation statistics A memtable object contains two logalloc::allocating_section members that track memory allocation requirements during reads and writes. Because these are local to the memtable, each time we seal a memtable and create a new one, these statistics are forgotten. As a result we may have to re-learn the typical size of reads and writes, incurring a small performance penalty. The solution is to move the allocating_section object to the memtable_list container. The workload is the same across all memtables of the same table, so we don't lose discrimination here. The performance penalty may be increased later if log changes to memory reserve thresholds including a backtrace, so this reduces the odds of incurring such a penalty. Closes scylladb/scylladb#15737	2023-10-18 17:43:33 +02:00
Tomasz Grabiec	0aef0f900b	Merge 'truncation records refactorings' from Petr Gusev This PR contains several refactorings related to truncation records handling in the `system_keyspace`, `commitlog_replayer` and `table` classes: * drop map_reduce from `commitlog_replayer`, it's sufficient to load truncation records from the null shard; * add a check that `table::_truncated_at` is properly initialized before it's accessed; * move its initialization after `init_non_system_keyspaces` Closes scylladb/scylladb#15583 * github.com:scylladb/scylladb: system_keyspace: drop truncation_record system_keyspace: remove get_truncated_at method table: get_truncation_time: check _truncated_at is initialized database: add_column_family: initialize truncation_time for new tables database: add_column_family: rename readonly parameter to is_new system_keyspace: move load_truncation_times into distributed_loader::populate_keyspace commitlog_replayer: refactor commitlog_replayer::impl::init system_keyspace: drop redundant typedef system_keyspace: drop redundant save_truncation_record overload table: rename cache_truncation_record -> set_truncation_time system_keyspace: get_truncated_position -> get_truncated_positions	2023-10-17 10:55:30 +02:00
Petr Gusev	80fa5810a7	table: get_truncation_time: check _truncated_at is initialized	2023-10-05 15:19:59 +04:00
Petr Gusev	32a19fd61b	database: add_column_family: rename readonly parameter to is_new We want to make table::_truncated_at optional, so that in get_truncation_time we can assert that it is initialized. For existing tables this initialisation will happen in load_truncation_times function, and for new tables we want to initialize it in add_column_family like we do with mark_ready_for_writes. Now add_column_family function has parameter 'readonly', which is set by the callers to false if we are creating a fresh new table and not loading it from sstables. In this commit we rename this parameter to is_new and invert the passed values. This will allow us in the next commit to initialize _truncated_at field for new tables.	2023-10-05 15:19:59 +04:00
Raphael S. Carvalho	893ee68251	replica: Clean up storage of tablet on migration When a tablet is migrated into a new home, we need to clean its storage (i.e. the compaction group) in the old home. This includes its presence in row cache, which can be shared by multiple tablets living in the same shard. For exception safety, the following is done first in a "prepare phase" during cache invalidation. 1) take a compaction guard, to stop and disable compaction 2) flush memtable(s). 3) build a list of all sstables, which represents all the storage of the tablet. Then once cache is invalidated successfully, we clear the sstable sets of the group in the "execution phase", to prevent any background op from incorrectly picking them and also to allow for their deletion. All the sstables of a tablet are deleted atomically, in order to guarantee that a failure midway won't cause data resurrection if the tablet happens to be migrated back into the old home. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2023-10-04 12:16:19 -03:00
Petr Gusev	da1e6751e9	table: rename cache_truncation_record -> set_truncation_time This is a refactoring commit without observable changes in behaviour. There is a truncation_record struct, but in this method we only care about time, so rename it (and other related methods) appropriately to avoid confusion.	2023-10-03 17:11:35 +04:00
Avi Kivity	a3d73bfba7	Merge 'Add support for decommission with tablets' from Tomasz Grabiec Load balancer will recognize decommissioning nodes and will move tablet replicas away from such nodes with highest priority. Topology changes have now an extra step called "tablet draining" which calls the load balancer. The step will execute tablet migration track as long as there are nodes which require draining. It will not do regular load balancing. If load balancer is unable to find new tablet replicas, because RF cannot be met or availability is at risk due to insufficient node distribution in racks, it will throw an exception. Currently, topology change will retry in a loop. We should make this error cause topology change to be aborted. There is no infrastructure for aborts yet, so this is not implemented. Closes #15197 * github.com:scylladb/scylladb: tablets, raft topology: Add support for decommission with tablets tablet_allocator: Compute load sketch lazily tablet_allocator: Set node id correctly tablet_allocator: Make migration_plan a class tablets: Implement cleanup step storage_service, tablets: Prevent stale RPCs from running beyond their stage locator: Introduce tablet_metadata_guard locator, replica: Add a way to wait for table's effective_replication_map change storage_service, tablets: Extract do_tablet_operation() from stream_tablet() raft topology: Add break in the final case clause raft topology: Fix SIGSEGV when trace-level logging is enabled raft topology: Set node state in topology raft topology: Always set host id in topology	2023-09-14 17:16:23 +03:00
Tomasz Grabiec	d5539e080d	tablets: Implement cleanup step This change adds a stub for tablet cleanup on the replica side and wires it into the tablet migration process. The handling on replica side is incomplete because it doesn't remove the actual data yet. It only flushes the memtables, so that all data is in sstables and none requires a memtable flush. This patch is necessary to make decommission work. Otherwise, a memtable flush would happen when the decommissioned node is put in the drained state (as in nodetool drain) and it would fail on missing host id mapping (node is no longer in topology), which is examined by the tablet sharder when producing sstable sharding metadata. Leading to abort due to failed memtable flush.	2023-09-14 12:45:10 +02:00
Petr Gusev	ce0ee32d5a	database.cc: make _uses_schema_commitlog optional This field on the null shard is properly initialized in maybe_init_schema_commitlog function, until then we can't make decisions based on its value. This problem can happen e.g. if add_column_family function is called with readonly=false before maybe_init_schema_commitlog. It will call commitlog_for to pass the commitlog to mark_ready_for_writes and commitlog_for reads _uses_schema_commitlog. In this commit we add protection against this case - we trigger internal_error if _uses_schema_commitlog is read before it is initialized. maybe_init_schema_commitlog() was added to cql_test_env to make boost tests work with the new invariant.	2023-09-13 23:17:20 +04:00
Petr Gusev	beb29f094b	system_keyspace: drop load phases We want to switch system.scylla_local table to the schema commitlog, but load phases hamper here - schema commitlog is initialized after phase1, so a table which is using it should be moved to phase2, but system.scylla_local contains features, and we need them before schema commitlog initialization for SCHEMA_COMMITLOG feature. In this commit we are taking a different approach to loading system tables. First, we load them all in one pass in 'readonly' mode. In this mode, the table cannot be written to and has not yet been assigned a commit log. To achieve this we've added _readonly bool field to the table class, it's initialized to true in table's constructor. In addition, we changed the table constructor to always assign nullptr to commitlog, and we trigger an internal error if table.commitlog() property is accessed while the table is in readonly mode. Then, after triggering on_system_tables_loaded notifications on feature_service and sstable_format_selector, we call system_keyspace::mark_writable and eventually table::mark_ready_for_writes which selects the proper commitlog and marks the table as writable. In sstable_compaction_test we drop several mark_ready_for_writes calls since they are redundant, the table has already been made writable in env.make_table_for_tests call. The table::commitlog function either returns the current commitlog or causes an error if the table is readonly. This didn't work for virtual tables, since they never called mark_ready_for_writes. In this commit we add this call to initialize_virtual_tables.	2023-09-13 23:17:20 +04:00
Petr Gusev	47ffc66c7f	database.hh: add_column_family: add readonly parameter Previously, creating a table or view in schema_tables.cc/merge_tables_and_views was a two-step process: first adding a column family (add_column_family function) and then marking it as ready for writes (mark_table_as_writable). There is a yield between these stages, which means someone could see a table or view for which the mark_table_as_writable method had not yet been called, and start writing to it. This problem was demonstrated by materialised view dtests. A view is created on all nodes. On some nodes it will be created earlier than on others and the view rebuild process will start writing data to that view on other nodes, where mark_table_as_writable has not yet been called. In this patch we solve this problem by adding a readonly parameter to the add_column_family method. When loading tables from disk, this flag is set to true and the mark_table_as_writable is called only after all sstables have been loaded. When creating a new table, this flag is set to false, mark_table_as_writable is called from inside add_column_family and the new table becomes visible already as writable.	2023-09-13 23:17:20 +04:00
Tomasz Grabiec	c27d212f4b	api, storage_service: Recalculate table digests on relocal_schema api call Currently, the API call recalculates only per-node schema version. To work around issues like #4485 we want to recalculate per-table digests. One way to do that is to restart the node, but that's slow and has an impact on availability. Use like this: curl -X POST http://127.0.0.1:10000/storage_service/relocal_schema Fixes #15380 Closes #15381	2023-09-13 18:27:57 +03:00
Pavel Emelyanov	c2f2e0fd7a	table: Remove find_partition_slow() helper It's no longer used Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-08-29 15:38:41 +03:00
Raphael S. Carvalho	a22f74df00	table: Introduce storage snapshot for upcoming tablet streaming New file streaming for tablets will require integration with compaction groups. So this patch introduces a way for streaming to take a storage snapshot of a given tablet using its token range. Memtable is flushed first, so all data of a tablet can be streamed through its sstables. The interface is compaction group / tablet agnostic, but user can easily pick data from a single tablet by using the range in tablet metadata for a given tablet. E.g.: auto erm = table.get_effective_replication_map(); auto& tm = erm->get_token_metadata(); auto tablet_map = tm.tablets().get_tablet_map(table.schema()->id()); for (auto tid : tablet_map.tablet_ids()) { auto tr = tablet_map.get_token_range(tid); auto ssts = co_await table.take_storage_snapshot(tr); ... } Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Closes #15128	2023-08-25 13:06:02 +02:00
Tomasz Grabiec	bd8bb5d4b1	Merge 'Wire tablet into compaction group' from Raphael "Raph" Carvalho Compaction group is the data plane for tablets, so this integration allows each tablet to have its own storage (memtable + sstables). A crucial step for dynamic tablets, where each tablet can be worked on independently. There are still some inefficiencies to be worked on, but as it is, it already unlocks further development. ``` INFO 2023-07-27 22:43:38,331 [shard 0] init - loading tablet metadata INFO 2023-07-27 22:43:38,333 [shard 0] init - loading non-system sstables INFO 2023-07-27 22:43:38,354 [shard 0] table - Tablet with id 0 present for ks.cf INFO 2023-07-27 22:43:38,354 [shard 0] table - Tablet with id 2 present for ks.cf INFO 2023-07-27 22:43:38,354 [shard 0] table - Tablet with id 4 present for ks.cf INFO 2023-07-27 22:43:38,354 [shard 0] table - Tablet with id 6 present for ks.cf INFO 2023-07-27 22:43:38,428 [shard 1] table - Tablet with id 1 present for ks.cf INFO 2023-07-27 22:43:38,428 [shard 1] table - Tablet with id 3 present for ks.cf INFO 2023-07-27 22:43:38,428 [shard 1] table - Tablet with id 5 present for ks.cf INFO 2023-07-27 22:43:38,428 [shard 1] table - Tablet with id 7 present for ks.cf ``` Closes #14863 * github.com:scylladb/scylladb: Kill scylla option to configure number of compaction groups replica: Wire tablet into compaction group token_metadata: Add this_host_id to topology config replica: Switch to chunked_vector for storing compaction groups replica: Generate group_id for compaction_group on demand	2023-08-18 15:17:17 +02:00
Raphael S. Carvalho	cc60598368	replica: Wire tablet into compaction group Compaction group is the data plane for tablets, so this integration allows each tablet to have its own storage (memtable + sstables). A crucial step for dynamic tablets, where each tablet can be worked on independently. There are still some inefficiencies to be worked on, but as it is, it already unlocks further development. INFO 2023-07-27 22:43:38,331 [shard 0] init - loading tablet metadata INFO 2023-07-27 22:43:38,333 [shard 0] init - loading non-system sstables INFO 2023-07-27 22:43:38,354 [shard 0] table - Tablet with id 0 present for ks.cf INFO 2023-07-27 22:43:38,354 [shard 0] table - Tablet with id 2 present for ks.cf INFO 2023-07-27 22:43:38,354 [shard 0] table - Tablet with id 4 present for ks.cf INFO 2023-07-27 22:43:38,354 [shard 0] table - Tablet with id 6 present for ks.cf INFO 2023-07-27 22:43:38,428 [shard 1] table - Tablet with id 1 present for ks.cf INFO 2023-07-27 22:43:38,428 [shard 1] table - Tablet with id 3 present for ks.cf INFO 2023-07-27 22:43:38,428 [shard 1] table - Tablet with id 5 present for ks.cf INFO 2023-07-27 22:43:38,428 [shard 1] table - Tablet with id 7 present for ks.cf There's a need for compaction_group_manager, as table will still support "tabletless" mode, and we don't want to sprinkle ifs here and there, to support both modes. It's not really a manager (it's not even supposed to store a state), but I couldn't find a better name. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2023-08-16 18:23:53 -03:00
Raphael S. Carvalho	d3f71ae4ee	replica: Switch to chunked_vector for storing compaction groups We aim for a large number of tablets, therefore let's switch to chunked_vector to avoid large contiguous allocs. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2023-08-15 09:04:05 -03:00
Pavel Emelyanov	fdfec474ae	table: Make sstables with required state By default it's created with normal state, but there are some places that need to put it into staging. Do it with new state enum Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-08-14 15:28:54 +03:00
Pavel Emelyanov	6628dc47c5	table: Open-code sstables making streaming helpers There are two of those that call each other to end up calling plain make_sstable() one. It's simpler to patch both if they just call the latter directly. While at it -- drop the unused default argument. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-08-14 14:56:02 +03:00
Pavel Emelyanov	fa93ac9bfd	database: Add wasm::manager& dependency The dependency is needed by db::schema_tables to get wasm manager for its needs. This patch prepares the ground. Now the wasm::manager is shared between replica::database and cql3::query_processor Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-08-04 19:47:50 +03:00
Botond Dénes	4d538e1363	Merge 'Task manager tasks covering compaction group compaction' from Aleksandra Martyniuk All compaction task executors, except for regular compaction one, become task manager compaction tasks. Creating and starting of major_compaction_task_executor is modified to be consistent with other compaction task executors. Closes #14505 * github.com:scylladb/scylladb: test: extend test_compaction_task.py to cover compaction group tasks compaction: turn custom_task_executor into compaction_task_impl compaction: turn sstables_task_executor into sstables_compaction_task_impl compaction: change sstables compaction tasks type compaction: move table_upgrade_sstables_compaction_task_impl compaction: pass task_info through sstables compaction compaction: turn offstrategy_compaction_task_executor into offstrategy_compaction_task_impl compaction: turn cleanup_compaction_task_executor into cleanup_compaction_task_impl comapction: use optional task info in major compaction compaction: use perform_compaction in compaction_manager::perform_major_compaction	2023-08-04 10:11:00 +03:00
Amnon Heiman	d10a3dd19a	config: add enable_node_table_metrics flag By default, per-table-per-shard metrics reporting is turned off, and the aggregated version of the metrics (per-table-per-node) will be turned on. There could be a situation where a user with an excessive number of tables would suffer from performance issues, both from the network and the metrics collection server. This patch adds a config option, enable_node_table_metrics, which allows users to turn off per-table metrics reporting altogether. For example, when running Scylla with the command line argument '--enable-node-aggregated-table_metrics 0' per-table metrics will not be reported. Signed-off-by: Amnon Heiman <amnon@scylladb.com>	2023-08-02 10:20:18 +03:00
Botond Dénes	4a02865ea1	Merge 'Prevent invalidation of iterators over database::_column_families' from Aleksandra Martyniuk Maps related to column families in database are extracted to a column_families_data class. Access to them is possible only through methods. All methods which may preempt hold rwlock in relevant mode, so that the iterators can't become invalid. Fixes: #13290 Closes #13349 * github.com:scylladb/scylladb: replica: make tables_metadata's attributes private replica: add methods to get a filtered copy of tables map replica: add methods to check if given table exists replica: add methods to get table or table id replica: api: return table_id instead of const table_id& replica: iterate safely over tables related maps replica: pass tables_metadata to phased_barrier_top_10_counts replica: add methods to safely add and remove table replica: wrap column families related maps into tables_metadata replica: futurize database::add_column_family and database::remove	2023-07-31 15:31:59 +03:00
Aleksandra Martyniuk	4e439ac957	compaction: turn offstrategy_compaction_task_executor into offstrategy_compaction_task_impl offstrategy_compaction_task_executor inherits both from compaction_task_executor and offstrategy_compaction_task_impl.	2023-07-28 10:51:55 +02:00
Aleksandra Martyniuk	92f2987217	compaction: turn cleanup_compaction_task_executor into cleanup_compaction_task_impl cleanup_compaction_task_executor inherits both from compaction_task_executor and cleanup_compaction_task_impl. Add a new version of compaction_manager::perform_task_on_all_files which accepts only the tasks that are derived from compaction_task_impl. After all task executors' conversions are done, the new version replaces the original one.	2023-07-28 10:48:58 +02:00
Aleksandra Martyniuk	8317e4dd7f	comapction: use optional task info in major compaction To make it consistent with the upcoming methods, methods triggering major compaction get std::optional<tasks::task_info> as an argument. Thanks to that we can distinguish between a task that has no parent and the task which won't be registered in task manager.	2023-07-28 09:25:21 +02:00
Botond Dénes	b599f15b26	replica: make_[multishard_]streaming_reader(): make compaction_time mandatory Now that all users have opted in unconditionally, there is no point in keeping this optional. Make it mandatory to make sure there are no opt-out by mistake. The global override via enable_compacting_data_for_streaming_and_repair config item still remains, allowing compaction to be force turned-off.	2023-07-27 04:57:52 -04:00
Botond Dénes	2f8d77e97b	replica/table: add optional compacting to make_multishard_streaming_reader() Doing to make_multishard_streaming_reader() what the previous commit did to make_streaming_reader(). In fact, the new compaction_time parameter is simply forwarded to the make_streaming_reader() on the shard readers. Call sites are updated, but none opt in just yet.	2023-07-27 03:22:11 -04:00
Botond Dénes	42b0dd5558	replica/table: add optional compacting to make_streaming_reader() Opt-in is possible by passing an engaged `compaction_time` (gc_clock::time_point) to the method. When this new parameter is disengaged, no compaction happens. Note that there is a global override, via the enable_compacting_data_for_streaming_and_repair config item, which can force-disable this compaction. Compaction done on the output of the streaming reader does not garbage-collect tombstones! All call-sites are adjusted (the new parameter is not defaulted), but none opt in yet. This will be done in separate commit per user.	2023-07-27 03:22:11 -04:00
Botond Dénes	ad2ddffb22	Merge 'Remove qctx from system_keyspace::save_truncation_record()' from Pavel Emelyanov The method is called by db::truncate_table_on_all_shards(), its call-chain, in turn, starts from - proxy::remote::handle_truncate() - schema_tables::merge_schema() - legacy_schema_migrator - tests All of the above are easy to get system_keyspace reference from. This, in turn, allows making the method non-static and use query_processor reference from system_keyspace object instead of global qctx Closes #14778 * github.com:scylladb/scylladb: system_keyspace: Make save_truncation_record() non-static code: Pass sharded<db::system_keyspace>& to database::truncate() db: Add sharded<system_keyspace>& to legacy_schema_migrator	2023-07-26 08:48:49 +03:00
Aleksandra Martyniuk	6e6ba7309e	replica: make tables_metadata's attributes private Make _column_families and _ks_cf_to_uuid private to prevent unsafe access. The maps can be accessed only through methods which use locks if preemption is possible.	2023-07-25 17:13:24 +02:00
Aleksandra Martyniuk	c5cad803b3	replica: add methods to get a filtered copy of tables map	2023-07-25 17:13:24 +02:00
Aleksandra Martyniuk	ff26b2ba3f	replica: add methods to check if given table exists	2023-07-25 17:13:24 +02:00
Aleksandra Martyniuk	6796721c3d	replica: add methods to get table or table id	2023-07-25 17:13:24 +02:00
Aleksandra Martyniuk	e072a2341d	replica: api: return table_id instead of const table_id& Return table_id instead of const table_id& from database::find_uuid as copying table_id does not cause much overhead and simplifies methods signature.	2023-07-25 17:13:24 +02:00

320 Commits