scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-04-28 12:17:02 +00:00

Author	SHA1	Message	Date
Gleb Natapov	53120c1d57	system_keyspace: fix assertion for missing transition_state The code assumes that if there is no transition_state there should be no nodes that currently in transition in a state other then left_token_ring state, but rebuild operation also creates such nodes, so add the check for it as well.	2023-08-10 16:37:56 +03:00
Kamil Braun	59c410fb97	Merge 'migration_manager: announce: provide descriptions for all calls' from Patryk Jędrzejczak The `system.group0_history` table provides useful descriptions for each command committed to Raft group 0. One way of applying a command to group 0 is by calling `migration_manager::announce`. This function has the `description` parameter set to empty string by default. Some calls to `announce` use this default value which causes `null` values in `system.group0_history`. We want `system.group0_history` to have an actual description for every command, so we change all default descriptions to reasonable ones. Going further, We remove the default value for the `description` parameter of `migration_manager::announce` to avoid using it in the future. Thanks to this, all commands in `system.group0_history` will have a non-null description. Fixes #13370 Closes #14979 * github.com:scylladb/scylladb: migration_manager: announce: remove the default value of description test: always pass empty description to migration_manager::announce migration_manager: announce: provide descriptions for all calls	2023-08-09 16:58:41 +02:00
Botond Dénes	108e510a23	Merge 'Update sstable_requiring_cleanup on compaction completion' from Benny Halevy Currently `sstable_requiring_cleanup` is updated using `compacting_sstable_registration`, but that mechanism is not used by offstrategy compaction, leading to #14304. This series introduces `compaction_manager::on_compaction_completion` that intercepts the call to the table::on_compaction_completion. This allows us to update `sstable_requiring_cleanup` right before the compacted sstables are deleted, making sure they are no leaked to `sstable_requiring_cleanup`, which would hold a reference to them until cleanup attempts to clean them up. `cleanup_incremental_compaction_test` was adjusted to observe the sstables `on_delete` (by adding a new observer event) to detect the case where cleanup attempts to delete the leaked sstables and fails since they were already deleted from the file system by offstrategy compaction. The test fails with the fix and passes with it. Fixes #14304 Closes #14858 * github.com:scylladb/scylladb: compaction_manager: on_compaction_completion: erase sstables from sstables_requiring_cleanup compaction/leveled_compaction_strategy: ideal_level_for_input: special case max_sstable_size==0 sstable: add on_delete observer compaction_manager: add on_compaction_completion sstable_compaction_test: cleanup_incremental_compaction_test: verify sstables_requiring_cleanup is empty	2023-08-09 11:03:45 +03:00
Botond Dénes	69d6778daf	Merge 'build: cmake: fixes for the release build' from Kefu Chai before this change, we use generator expression to initialize CMAKE_CXX_FLAGS_RELEASE, this has two problems: 1. the generator expression is not expanded when setting a regular variable. 2. the ending ">" is missing in one of the generator expression. 3. the parameters are not separated with ";" so address them, let's just * use `add_compile_options()` directly, as the corresponding `mode.${build_mode}.cmake` is only included when the "${build_mode}" is used. * add comma in between the command line options. * add the missing closing ">" Closes #14996 * github.com:scylladb/scylladb: build: cmake: pass --gc-sections to ld not ar build: cmake: use add_compile_options() in release build	2023-08-09 09:55:02 +03:00
Kefu Chai	47c9b25bac	compaction_manager: correct comment on compaction_task_executor::state when it comes to `regular_compaction_task_executor`, we repeat the compaction until the compaction can not proceed, so after an iteration of compaction completes successfully, the task can still continue with yet another round of the compaction as it sees appropriate. so let's update the comment to reflect this fact. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #14891	2023-08-09 09:49:18 +03:00
Kefu Chai	6dc885a8e2	compaction: mark more member variables const quite a few member variables serves as the configuration for a given compaction, they are immutable in the life cycle of it, so for better readability, let's mark them `const`. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #14981	2023-08-09 09:28:44 +03:00
Botond Dénes	5f65ac73ed	Merge 'Remove qctx' from Pavel Emelyanov The only place that still calls it is static force_blocking_flush method. If can be made non-static already. Also, while at it, coroutinize some system_keyspace methods and fix a FIXME regarding replica::database access in it. Closes #14984 * github.com:scylladb/scylladb: code: Remove query-context.hh code: Remove qctx system_keyspace: Use system_keyspace's container() to flush system_keyspace: Make force_blocking_flush() non-static system_keyspace: Coroutinize update_tokens() system_keyspace: Coroutinize save_truncation_record()	2023-08-09 09:27:53 +03:00
Kefu Chai	782b1992b2	build: cmake: pass --gc-sections to ld not ar ar is not able to tell which sections to be GC'ed, hence it does not care about --gc-sections, but ld does. let's add this option to CMAKE_EXE_LINKER_FLAGS. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-08-09 13:50:44 +08:00
Kefu Chai	f7377725c2	build: cmake: use add_compile_options() in release build before this change, we use generator expression to initialize CMAKE_CXX_FLAGS_RELEASE, this has two problems: 1. the generator expression is not expanded when setting a regular variable. 2. the ending ">" is missing in one of the generator expression. 3. the parameters are not separated with ";" so address them, let's just * use `add_compile_options()` directly, as the corresponding `mode.${build_mode}.cmake` is only included when the "${build_mode}" is used. * add comma in between the command line options. * add the missing closing ">" Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-08-09 12:56:06 +08:00
Pavel Emelyanov	f1515c610e	code: Remove query-context.hh The whole thing is unused now, so the header is no longer needed Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-08-08 11:11:07 +03:00
Pavel Emelyanov	413d81ac16	code: Remove qctx Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-08-08 11:10:56 +03:00
Pavel Emelyanov	d7f5d6dba8	system_keyspace: Use system_keyspace's container() to flush In force_blocking_flush() there's an invoke-on-all invocation of replica::database::flush() and a FIXME to get the replica database from somewhere else rather than via query-processor -> data_dictionary. Since now the force_blocking_flush() is non-static the invoke-on-all can happen via system_keyspace's container and the database can be obtained directly from the sys.ks. local instance Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-08-08 11:09:32 +03:00
Pavel Emelyanov	7a342ed5c0	system_keyspace: Make force_blocking_flush() non-static Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-08-08 11:09:20 +03:00
Pavel Emelyanov	6b8fe5ac43	system_keyspace: Coroutinize update_tokens() Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-08-08 11:09:15 +03:00
Pavel Emelyanov	1700d79b60	system_keyspace: Coroutinize save_truncation_record() Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-08-08 11:09:09 +03:00
Benny Halevy	7a7c8d0d23	compaction_manager: on_compaction_completion: erase sstables from sstables_requiring_cleanup Erase retired sstable from compaction_state::sstables_requiring_cleanup also on_compaction_completion (in addition to compacting_sstable_registration::release_compacting for offstrategy compaction with piggybacked cleanup or any other compaction type that doesn't use compacting_sstable_registration. Add cleanup_during_offstrategy_incremental_compaction_test that is modeled after cleanup_incremental_compaction_test to check that cleanup doesn't attempt to cleanup already-deleted sstables that were left over by offstrategy compaction in sstables_requiring_cleanup. Fixes #14304 Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-08-08 08:16:46 +03:00
Benny Halevy	b1e164a241	compaction/leveled_compaction_strategy: ideal_level_for_input: special case max_sstable_size==0 Prevent div-by-zero byt returning const level 1 if max_sstable_size is zero, as configured by cleanup_incremental_compaction_test, before it's extended to cover also offstrategy compaction. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-08-08 08:16:46 +03:00
Benny Halevy	b08f2ac4c6	sstable: add on_delete observer Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-08-08 08:15:00 +03:00
Benny Halevy	df66895080	compaction_manager: add on_compaction_completion Pass the call to the table on_compaction_completion so we can manage the sstables requiring cleanup state along the way. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-08-08 08:12:05 +03:00
Benny Halevy	ea64ae54f8	sstable_compaction_test: cleanup_incremental_compaction_test: verify sstables_requiring_cleanup is empty Make sure that there are no sstables_requiring_cleanup after cleanup compaction. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-08-08 08:12:01 +03:00
Patryk Jędrzejczak	356e131acd	migration_manager: announce: remove the default value of description We remove the default value for the description parameter of migration_manager::announce to avoid using it in the future. Thanks to this, all commands in system.group0_history will have a non-null description.	2023-08-07 14:38:11 +02:00
Patryk Jędrzejczak	866c9a904d	test: always pass empty description to migration_manager::announce In the next commit, we remove the default value for the description parameter of migration_manager::announce to avoid using it in the future. However, many calls to announce in tests use the default value. We have to change it, but we don't really care about descriptions in the tests, so we pass the empty string everywhere.	2023-08-07 14:38:11 +02:00
Patryk Jędrzejczak	27ddf78171	migration_manager: announce: provide descriptions for all calls The system.group0_history table provides useful descriptions for each command committed to Raft group 0. One way of applying a command to group 0 is by calling migration_manager::announce. This function has the description parameter set to empty string by default. Some calls to announce use this default value which causes null values in system.group0_history. We want system.group0_history to have an actual description for every command, so we change all default descriptions to reasonable ones. We can't provide a reasonable description to announce in query_processor::execute_thrift_schema_command because this function is called in multiple situations. To solve this issue, we add the description parameter to this function and to handler::execute_schema_command that calls it.	2023-08-07 14:38:11 +02:00
Avi Kivity	4f7e83a4d0	cql3: select_statement: reject DISTINCT with GROUP BY on clustering keys While in SQL DISTINCT applies to the result set, in CQL it applies to the table being selected, and doesn't allow GROUP BY with clustering keys. So reject the combination like Cassandra does. While this is not an important issue to fix, it blocks un-xfailing other issues, so I'm clearing it ahead of fixing those issues. An issue is unmarked as xfail, and other xfails lose this issue as a blocker. Fixes #12479 Closes #14970	2023-08-07 15:35:59 +03:00
Patryk Jędrzejczak	1772433ae2	raft_group0: log gaining and losing leadership on the INFO level Knowing that a server gained or lost leadership in group 0 is sometimes useful for the purpose of debugging, so we log information about these events on the INFO level. Gaining and losing leadership are relatively rare events, so this change shouldn't flood the logs. Closes #14877	2023-08-07 12:13:24 +02:00
Kamil Braun	9edc98f8e9	Merge 'raft: make a removed/decommissioning node a non-voter early' from Patryk Jędrzejczak For `removenode`, we make a removed node a non-voter early. There is no downside to it because the node is already dead. Moreover, it improves availability in some situations. For `decommission`, if we decommission a node when the number of nodes is even, we make it a non-voter early to improve availability. All majorities containing this node will remain majorities when we make this node a non-voter and remove it from the set because the required size of a majority decreases. We don't change `decommission` when the number of nodes is odd since this may reduce availability. Fixes #13959 Closes #14911 * github.com:scylladb/scylladb: raft: make a decommissioning node a non-voter early raft: topology_coordinator: implement step_down_as_nonvoter raft: make a removed node a non-voter early	2023-08-07 10:14:33 +02:00
Botond Dénes	fa4aec90e9	Merge 'test: tasks: Fix task_manager/wait_task test ' from Aleksandra Martyniuk Rewrite test that checks whether task_manager/wait_task works properly. The old version didn't work. Delete functions used in old version. Closes #14959 * github.com:scylladb/scylladb: test: rewrite wait_task test test: move ThreadWrapper to rest_util.py	2023-08-07 09:04:29 +03:00
Benny Halevy	6f037549ac	sstables: delete_with_pending_deletion_log: batch sync_directory When deleting multiple sstables with the same prefix the deletion atomicity is ensured by the pending_delete_log file, so if scylla crashes in the middle, deletions will be replyed on restart. Therefore, we don't have to ensure atomicity of each individual `unlink`. We just need to sync the directory once, before removing the pending_delete_log file. Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Closes #14967	2023-08-06 18:52:13 +03:00
Avi Kivity	6c1e44e237	Merge 'Make replica::database and cql3::query_processor share wasm manager' from Pavel Emelyanov This makes it possible to remove remaining users of the global qctx. The thing is that db::schema_tables code needs to get wasm's engine, alien runner and instance cache to build wasm context for the merged function or to drop it from cache in the opposite case. To get the wasm stuff, this code uses global qctx -> query_processor -> wasm chain. However, the functions (un)merging code already has the database reference at hand, and its natural to get wasm stuff from it, not from the q.p. which is not available So this PR packs the wasm engine, runner and cache on sharded<wasm::manager> instance, makes the manager be referenced by both q.p. and database and removes the qctx from schema tables code Closes #14933 * github.com:scylladb/scylladb: schema_tables: Stop using qctx database: Add wasm::manager& dependency main, cql_test_env, wasm: Start wasm::manager earlier wasm: Shuffle context::context() wasm: Add manager::remove() wasm: Add manager::precompile() wasm: Move stop() out of query_processor wasm: Make wasm sharded<manager> query_processor: Wrap wasm stuff in a struct	2023-08-06 17:00:28 +03:00
Avi Kivity	412629a9a1	Merge 'Export tablet load-balancer metrics' from Tomasz Grabiec The metrics are registered on-demand when load-balancer is invoked, so that only leader exports the metrics. When leader changes, the old leader will stop exporting. The metrics are divided into two levels: per-dc and per-node. In prometheus, they will have appropriate labels for dc and host_id values. Closes #14962 * github.com:scylladb/scylladb: tablet_allocator: unregister metrics when leadership is lost tablets: load_balancer: Export metrics service, raft: Move balance_tablets() to tablet_allocator tablet_allocator: Start even if tablets feature is not enabled main, storage_service: Pass tablet allocator to storage_service	2023-08-06 16:58:27 +03:00
Tomasz Grabiec	f26e65d4d4	tablets: Fix crash on table drop Before the patch, tablet metadata update was processed on local schema merge before table changes. When table is dropped, this means that for a while table will exist without a corresponding tablet map. This can cause memtable flush for this table to fail, resulting in intentional abort(). That's because sstable writing attempts to access tablet map to generate sharding metadata. If auto_snapshot is enabled, this is much more likely to happen, because we flush memtables on table drop. To fix the problem, process tablet metadata after dropping tables, but before creating tables. Fixes #14943 Closes #14954	2023-08-06 16:45:43 +03:00
Pavel Emelyanov	3c6686e181	bptree: Replace assert with static_assert The one runs under checked constexpr value anyway Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes #14951	2023-08-06 16:36:12 +03:00
Tomasz Grabiec	f827cfd5b6	tablet_allocator: unregister metrics when leadership is lost So that graphs are not polluted with stale metrics from past leaders.	2023-08-05 21:48:08 +02:00
Tomasz Grabiec	d653cbae53	tablets: load_balancer: Export metrics	2023-08-05 21:48:08 +02:00
Tomasz Grabiec	67c7aadded	service, raft: Move balance_tablets() to tablet_allocator The implementation will access metrics registered from tablet_allocator.	2023-08-05 21:48:08 +02:00
Tomasz Grabiec	cb0d763a22	tablet_allocator: Start even if tablets feature is not enabled topology coordinator will call it. Rather than spreading ifs there, it's simpler to start it and disable functionality in the tablet allocator.	2023-08-05 21:48:08 +02:00
Tomasz Grabiec	5bfc8b0445	main, storage_service: Pass tablet allocator to storage_service Tablet balancing will be done through tablet_allocator later.	2023-08-05 03:10:26 +02:00
Pavel Emelyanov	fd50ba839c	schema_tables: Stop using qctx There are two places in there that need qctx to get query_processor from to, in turn, get wasm::manager from. Fortunately, both places have the database reference at hand and can get the wasm::manager from it Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-08-04 19:47:50 +03:00
Pavel Emelyanov	fa93ac9bfd	database: Add wasm::manager& dependency The dependency is needed by db::schema_tables to get wasm manager for its needs. This patch prepares the ground. Now the wasm::manager is shared between replica::database and cql3::query_processor Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-08-04 19:47:50 +03:00
Pavel Emelyanov	f4e7ffa0fc	main, cql_test_env, wasm: Start wasm::manager earlier It will be needed by replica::database and should be available that early. It doesn't depend on anything and can be moved in the starting order safely Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-08-04 19:47:50 +03:00
Pavel Emelyanov	595c5abbf9	wasm: Shuffle context::context() Add a constructor that builds context out of const manager reference. The existing one needs to get engine and instance cache and does it via query_processor. This change lets removing those exports and finally -- drop the wasm::manager -> cql3::query_processor friendship Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-08-04 19:47:50 +03:00
Pavel Emelyanov	56404ee053	wasm: Add manager::remove() This is one of the users of query_processor's export of wasm::manager's instance cache. Remove it in advance Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-08-04 19:47:50 +03:00
Pavel Emelyanov	93cb73fddb	wasm: Add manager::precompile() This is not to make query_processor export alien runner from the wasm::manager Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-08-04 19:47:50 +03:00
Pavel Emelyanov	d58a2d65b5	wasm: Move stop() out of query_processor When the q.p. stops it also "stops" the wasm manager. Move this call into main. The cql test env doesn't need this change, it stops the whole sharded service which stops instances on its own Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-08-04 19:47:50 +03:00
Pavel Emelyanov	243f2217dd	wasm: Make wasm sharded<manager> The wasm::manager is just cql3::wasm_context renamed. It now sits in lang/wasm* and is started as a sharded service in main (and cql test env). This move also needs some headers shuffling, but it's not severe This change is required to make it possible for the wasm::manager to be shared (by reference) between q.p. and replica::database further Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-08-04 19:47:50 +03:00
Pavel Emelyanov	dde285e7e9	query_processor: Wrap wasm stuff in a struct There are three wasm-only fields on q.p. -- engine, cache and runner. This patch groups them on a single wasm_context structure to make it earier to manipulate them in the next patches The 'friend' declaration it temporary, will go away soon Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-08-04 19:47:50 +03:00
Kamil Braun	421a5ad55c	Merge 'feature_service: don't load whole topology state to check features' from Piotr Dulikowski Currently, feature service uses `system_keyspace::load_topology_state` to load information about features from the `system.topology` table. This function implicitly assumes that it is called after schema commitlog replay and will correspond to the state of the topology state machine after some command is applied. However, feature check happens before the commitlog replay. If some group 0 command consists of multiple mutations that are not applied atomically, the `load_topology_state` function may fail to construct a `service::topology` object based on the table state. Moreover, this function not only checks `system.topology` but also `system.cdc_generations_v3` - in the case of the issue, the entry that was loaded from the this table didn't contain the `num_ranges` parameter. In order to fix this, the feature check code now uses `load_topology_features_state` which only loads enabled and supported features from `system.topology`. Only this information is really necessary for the feature check, and it doesn't have any invariants to check. Fixes: #14944 Closes #14955 * github.com:scylladb/scylladb: feature_service: don't load whole topology state to check features system_keyspace: separate loading topology_features from topology topology_state_machine: extract features-related fields to a struct untyped_result_set: add missing_column_exception	2023-08-04 15:09:12 +02:00
Kamil Braun	fed775e13b	Merge 'group0_state_machine: await transfer_snapshot' from Benny Halevy Hold a (newly added) group0_state_machine gate that is closed and waited on in group0_state_machine::abort() To prevent use-after-free when destroying the group0_state_machine while transfer_snapshot runs. Fixes #14907 Also, use an abort_source in group0_state_machine to abort an ongoing transfer_snapshot operation on group0_state_machine::abort() Closes #14952 * github.com:scylladb/scylladb: raft: group0_state_machine: transfer_snapshot: make abortable raft: group0_state_machine: transfer_snapshot: hold gate	2023-08-04 14:21:57 +02:00
Botond Dénes	68d2397d01	Merge 'repair: delete unused fields' from Aleksandra Martyniuk Delete unused shard_repair_task_impl members and incorrectly used method's argument. Closes #14956 * github.com:scylladb/scylladb: repair: delete task_manager_module::get_progress argument repair: delete unused shard_repair_task_impl fields	2023-08-04 15:08:31 +03:00
Aleksandra Martyniuk	629f893355	test: rewrite wait_task test Rewrite test that checks whether task_manager/wait_task works properly. The old version didn't work. Delete functions used in old version.	2023-08-04 13:34:58 +02:00

1 2 3 4 5 ...

38337 Commits