scylladb

Author	SHA1	Message	Date
Petr Gusev	b70bca71bc	system_keyspace: move load_truncation_times into distributed_loader::populate_keyspace load_truncation_times() now works only for schema tables since the rest is not loaded until distributed_loader::init_non_system_keyspaces. An attempt to call cf.set_truncation_time for non-system table just throws an exception, which is caught and logged with debug level. This means that the call cf.get_truncation_time in paxos_state.cc has never worked as expected. To fix that we move load_truncation_times() closer to the point where the tables are loaded. The function distributed_loader::populate_keyspace is called for both system and non-system tables. Once the tables are loaded, we use the 'truncated' table to initialize _truncated_at field for them. The truncation_time check for schema tables is also moved into populate_keyspace since it seems like a more natural place for it.	2023-10-05 15:19:52 +04:00
Benny Halevy	87d438b234	distributed_loader: populate_keyspace: iterate over datadirs in the inner loop It is more efficient to iterate over multiple data directories in the inner loop rather than the outer loop. Following patch will make use of the datadir in table_populator. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-09-23 08:50:24 +03:00
Pavel Emelyanov	bb4ddbb996	distributed_loader: Generalize datadir parallelizm loop Population of keyspaces happens first for system keyspaces, then for non-system ones. Both methods iterate over config datadirs to populate from all configured directories. This patch generalizes this loop into the populate_keyspace() method. (indentation is deliberately left broken) Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-09-15 17:49:53 +03:00
Pavel Emelyanov	0430ebf851	distributed_loader: Provide keyspace ref to populate_keyspace The method in question tries to find keyspace reference on the database by the given keyspace name. However, one of the callers already has the keyspace reference at hand and can just pass it. The other callers can find the keyspace on their own. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-09-15 17:49:03 +03:00
Petr Gusev	beb29f094b	system_keyspace: drop load phases We want to switch system.scylla_local table to the schema commitlog, but load phases hamper here - schema commitlog is initialized after phase1, so a table which is using it should be moved to phase2, but system.scylla_local contains features, and we need them before schema commitlog initialization for SCHEMA_COMMITLOG feature. In this commit we are taking a different approach to loading system tables. First, we load them all in one pass in 'readonly' mode. In this mode, the table cannot be written to and has not yet been assigned a commit log. To achieve this we've added _readonly bool field to the table class, it's initialized to true in table's constructor. In addition, we changed the table constructor to always assign nullptr to commitlog, and we trigger an internal error if table.commitlog() property is accessed while the table is in readonly mode. Then, after triggering on_system_tables_loaded notifications on feature_service and sstable_format_selector, we call system_keyspace::mark_writable and eventually table::mark_ready_for_writes which selects the proper commitlog and marks the table as writable. In sstable_compaction_test we drop several mark_ready_for_writes calls since they are redundant, the table has already been made writable in env.make_table_for_tests call. The table::commitlog function either returns the current commitlog or causes an error if the table is readonly. This didn't work for virtual tables, since they never called mark_ready_for_writes. In this commit we add this call to initialize_virtual_tables.	2023-09-13 23:17:20 +04:00
Petr Gusev	c4787a160b	system_keyspace: remove unused parameter	2023-09-13 23:00:15 +04:00
Kamil Braun	33c19baabc	db: system_keyspace: take simpler service references in `make` Take references to services which are initialized earlier. The references to `gossiper`, `storage_service` and `raft_group0_registry` are no longer needed. This will allow us to move the `make` step right after starting `system_keyspace`.	2023-06-18 13:39:27 +02:00
Pavel Emelyanov	66e43912d6	code: Switch to seastar API level 7 In that level no io_priority_class-es exist. Instead, all the IO happens in the context of current sched-group. File API no longer accepts prio class argument (and makes io_intent arg mandatory to impls). So the change consists of - removing all usage of io_priority_class - patching file_impl's inheritors to updated API - priority manager goes away altogether - IO bandwidth update is performed on respective sched group - tune-up scylla-gdb.py io_queues command The first change is huge and was made semi-automatically by: - grep io_priority_class \| default_priority_class - remove all calls, found methods' args and class' fields Patching file_impl-s is smaller, but also mechanical: - replace io_priority_class& argument with io_intent* one - pass intent to lower file (if applicable) Dropping the priority manager is: - git-rm .cc and .hh - sed out all the #include-s - fix configure.py and cmakefile The scylla-gdb.py update is a bit hairy -- it needs to use task queues list for IO classes names and shares, but to detect whether it should, it checks that the "commitlog" group is present. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes #13963	2023-06-06 13:29:16 +03:00
Pavel Emelyanov	3d7122d2fe	distributed_loader: Move garbage collecting into sstable_directory It's the directory that owns the components lister and can reason about the way to pick up dangling bits, be it local directories or entries from the ownership table. First thing to do is to move the g.c. code into sstable_directory. While at it -- convert sstring dir into fs::path dir and switch logger. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-05-17 15:16:23 +03:00
Raphael S. Carvalho	fe6df3d270	sstable_loader: Discard SSTable bloom filter on load-and-stream Load-and-stream reads the entire content from SSTables, therefore it can afford to discard the bloom filter that might otherwise consume a significant amount of memory. Bloom filters are only needed by compaction and other replica::table operations that might want to check the presence of keys in the SSTable files, like single-partition reads. It's not uncommon to see Data:Filter ratio of less than 100:1, meaning that for ~300G of data, filters will take ~3G. In addition to saving memory footprint, it also reduces operation time as load-and-stream no longer has to read, parse and build the filters from disk into memory. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2023-04-13 11:34:22 -03:00
Benny Halevy	aa4b18f8fb	distributed_loader: reshard: add optional owned_ranges_ptr param For passing owned_ranges_ptr from distributed_loader::process_upload_dir. Refs #11933 Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-04-10 22:57:41 +03:00
Petr Gusev	5a5d664a5a	init_system_keyspace: refactoring towards explicit load phases We aim (#12642) to use the schema commit log for raft tables. Now they are loaded at the first call to init_system_keyspace in main.cc, but the schema commitlog is only initialized shortly before the second call. This is important, since the schema commitlog initialization (database::before_schema_keyspace_init) needs to access schema commitlog feature, which is loaded from system.scylla_local and therefore is only available after the first init_system_keyspace call. So the idea is to defer the loading of the raft tables until the second call to init_system_keyspace, just as it works for schema tables. For this we need a tool to mark which tables should be loaded in the first or second phase. To do this, in this patch we introduce system_table_load_phase enum. It's set in the schema_static_props for schema tables. It replaces the system_keyspace::table_selector in the signature of init_system_keyspace. The call site for populate_keyspace in init_system_keyspace was changed, table_selector.contains_keyspace was replaced with db.local().has_keyspace. This check prevents calling populate_keyspace(system_schema) on phase1, but allows for populate_keyspace(system) on phase2 (to init raft tables). On this second call some tables from system keyspace (e.g. system.local) may have already been populated on phase1. This check protects from double-populating them, since every populated cf is marked as ready_for_writes.	2023-03-24 15:54:46 +04:00
Pavel Emelyanov	e67751ee92	distributed_loader: Let make_sstables_available choose target directory When sstables are loaded from upload/ subdir, the final step is to move them from this directory into base or staging one. The uploading code evaluates the target directory, then pushes it down the stack towards make_sstables_available() method. This patch replaces the path argument with bool to_staging one. The goal is to remove the knowledge of exact sstable location (nowadays -- its files' path) from the distributed loader and keep it in sstable object itself. Next patches will make full use of this change. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-02-21 17:23:59 +03:00
Pavel Emelyanov	0c7efe38e1	distributed_loader: Rename table_population_metadata It used to be just metadata by providing the meta for population, now it does the population by itself, so rename it. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-02-15 20:15:04 +03:00
Pavel Emelyanov	16fca3fa8a	distributed_loader: Move populate_column_family() into population meta This ownership change also requires the auto& = *this alias and extra specification where to call reshard() and reshape() from. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-02-15 19:57:41 +03:00
Pavel Emelyanov	e6e65c87d5	sstable_directory: Add io-prio argument to .reshard() Now it gets one from this-> but the method is becoming a static one in distributed_loader which only has it as an argument. That's not a big deal as the current IO class is going to be derived from current sched group, so this extra arg will go away altogether some day. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-02-07 19:31:41 +03:00
Pavel Emelyanov	420fc8d4df	sstable_directory: Add io-prio argument to .reshape() Now it gets one from this-> but the method is becoming a static one in distributed_loader which only has it as an argument. That's not a big deal as the current IO class is going to be derived from current sched group, so this extra arg will go away altogether some day. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-02-07 19:22:27 +03:00
Kamil Braun	a483915c62	db: system_keyspace: add a virtual table with raft configuration Add a new virtual table `system.raft_state` that shows the currently operating Raft configuration for each present group. The schema is the same as `system.raft_snapshot_config` (the latter shows the config from the last snapshot). In the future we plan to add more columns to this table, showing more information (like the current leader and term), hence the generic name. Adding the table requires some plumbing of `sharded<raft_group_registry>&` through function parameters to make it accessible from `register_virtual_tables`, but it's mostly straightforward. Also added some APIs to `raft_group_registry` to list all groups and find a given group (returning `nullptr` if one isn't found, not throwing an exception).	2023-01-17 12:28:00 +01:00
Pavel Emelyanov	7ca5e143d7	sstable_directory: Convert sort-sstables argument to flags struct The sstable_directory::process_sstable_dir() accepts a boolean to control its behavior when collecting sstables. Turn this boolean into a structure of flags. The intention is to extend this flags set in the future (next patch). This boolean is true all the time, but one place sets it to true in a "verbose" manner, like this: bool sort_sstables_according_to_owner = false; process_sstable_dir(directory, sort_sstables_according_to_owner).get(); the local variable is not used anymore. Using designated initializers solves the verbosity in a nicer manner. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-11-22 18:19:23 +03:00
Benny Halevy	119c0f3983	distributed_loader: pre-load all sstables metadata for table before populating it We should scan all sstables in the table directory and its subdirectories to determine the highest sstable version and generation before using it for creating new sstables (via reshard or reshape). Fixes scylladb/scylladb#11793 Note: table_population_metadata::start_subdir is called in a seastar thread to facilitate backporting to old versions that do not support coroutines yet. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-10-19 14:16:57 +03:00
Pavel Emelyanov	9f79525f8e	distributed_loader: Pass sys_ks argument to init_system_keyspace() Its final destination is the virtual tables registration code, eventually called from init_system_keyspace(). Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-10-06 17:55:03 +03:00
Calle Wilund	d9c391e366	Revert "distributed_loader: Remove unused load-prio manipulations" This reverts commit `7396de72b1`. In `7396de7` (and refactorings before it) the set of prioritized keyspaces (and processing thereof) was removed, due to apparent non-usage (which is true for open-source version). This functionality is however required for certain features of the enterprise version (ear). As such it needs to be restored and reenabled. This reverts the actual commit, patch after ensures we use the prio set.	2022-08-23 10:34:05 +00:00
Benny Halevy	257d74bb34	schema, everywhere: define and use table_id as a strong type Define table_id as a distinct utils::tagged_uuid modeled after raft tagged_id, so it can be differentiated from other uuid-class types, in particular from table_schema_version. Fixes #11207 Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-08-08 08:09:41 +03:00
Tomasz Grabiec	c5ad05c819	db: Allow splitting initiatlization of system tables We will need some system tables to be initialized earlier in the boot so that system.scylla_local can be read before schema tables are initialized.	2022-07-06 22:08:56 +02:00
Pavel Emelyanov' via ScyllaDB development	b0b29edcd7	distributed-loader: Remove ensure_system_table_directories It looks like exactly the same code is called a few steps above via distributed_loader::init_system_keyspace `- distributed_loader::populate_keyspace While at it -- move the supervisor::notify("loading system sstables") call to a more suitable location. tests: https://jenkins.scylladb.com/job/releng/job/Scylla-CI/981/ Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Message-Id: <20220621165313.31284-1-xemul@scylladb.com>	2022-06-22 13:59:00 +03:00
Botond Dénes	c450508954	Merge "Introduce sharded<system_keyspace> instance" from Pavel Emelyanov " Making the system-keyspace into a standard sharded instance will help to fix several dependency knots. First, the global qctx and local-cache both will be moved onto the sys-ks, all their users will be patched to depend on system-keyspace. Now it's not quite so, but we're moving towards this state. Second, snitch instance now sits in the middle of another dependency loop. To untie one the preferred ip and dc/rack info should be moved onto system keyspace altogether (now it's scattered over several places). The sys-ks thus needs to be a sharded service with some state. This set makes system-keyspace sharded instance, equips it with all the dependencies it needs and passes it as dependency into storage service, migration manager and API. This helps eliminating a good portion of global qctx/cache usage and prepares the ground for snitch rework. tests: unit(dev) v1: unit(debug), dtest.simple_boot_shutdown(dev) " * 'br-sharded-system-keyspace-instance-2' of https://github.com/xemul/scylla: (25 commits) system_keyspace: Make load_host_ids non-static system_keyspace: Make load_tokens non-static system_keyspace: Make remove_endpoint and update_tokens non-static system_keyspace: Coroutinize update_tokens system_keyspace: Coroutinize remove_endpoint system_keyspace: Make update_cached_values non-static system_keyspace: Coroutinuze update_peer_info system_keyspace: Make update_schema_version non-static schema_tables: Add sharded<system_keyspace> argument to update_schema_version_and_announce replica: Push sharded<system_keyspace> down to parse_system_tables api: Carry sharded<system_keyspace> reference along storage_service: Keep sharded<system_keyspace> reference migration_manager: Keep sharded<system_keyspace> reference system_keyspace: Remove temporary qp variable system_keyspace: Make get_preferred_ips non-static system_keyspace: Make cache_truncation_record non-static system_keyspace: Make check_health non-static system_keyspace: Make build_bootstrap_info non-static system_keyspace: Make build_dc_rack_info non-static system_keyspace: Make setup_version non-static ...	2022-03-17 08:16:29 +02:00
Benny Halevy	a1d0f089c8	replica: distributed_database: populate_column_family: trigger offstrategy compaction only for the base directory In https://github.com/scylladb/scylla/issues/10218 we see off-strategy compaction happening on a table during the initial phases of `distributed_loader::populate_column_family`. It is caused by triggering offstrategy compaction too early, when sstables are populated from the staging directory in `a144d30162`. We need to trigger offstrategy compaction only for the base table directory, never the staging or quarantine dirs. Fixes #10218 Test: unit(dev) DTest: materialized_views_test.py::TestInterruptBuildProcess Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Message-Id: <20220316152812.3344634-1-bhalevy@scylladb.com>	2022-03-16 18:57:00 +02:00
Pavel Emelyanov	009c449cc3	replica: Push sharded<system_keyspace> down to parse_system_tables The method needs to call merge_schema() that will need system keyspace instance at hand. The parse_s._t. method is boot-time one, pushing the main-local instance through it is fine Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-03-16 14:24:40 +03:00
Avi Kivity	fcb8d040e8	treewide: use Software Package Data Exchange (SPDX) license identifiers Instead of lengthy blurbs, switch to single-line, machine-readable standardized (https://spdx.dev) license identifiers. The Linux kernel switched long ago, so there is strong precedent. Three cases are handled: AGPL-only, Apache-only, and dual licensed. For the latter case, I chose (AGPL-3.0-or-later and Apache-2.0), reasoning that our changes are extensive enough to apply our license. The changes were applied mechanically with a script, except to licenses/README.md. Closes #9937	2022-01-18 12:15:18 +01:00
Raphael S. Carvalho	a144d30162	distributed_loader: postpone reshape of repair-originated sstables SSTables created by repair will potentially not conform to the compaction strategy layout goal. If node shuts down before off-strategy has a chance to reshape those files, node will be forced to reshape them on restart. That causes unexpected downtime. Turns out we can skip reshape of those files on boot, and allow them to be reshaped after node becomes online, as if the node never went down. Those files will go through same procedure as files created by repair-based ops. They will be placed in maintenance set, and be reshaped iteratively until ready for integration into the main set. Fixes #9895. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2022-01-12 13:14:31 -03:00
Avi Kivity	4392c20bd3	replica: move distributed_loader into replica module distributed_loader is replica-side thing, so it belongs in the replica module ("distributed" refers to its ability to load sstables in their correct shards). So move it to the replica module.	2022-01-10 15:25:28 +02:00

31 Commits