scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-04-24 18:40:38 +00:00

Author	SHA1	Message	Date
Petr Gusev	1ddc76ffd1	test_fencing: add test_fence_hints The test makes a write through the first node with the third node down, this causes a hint to be stored on the first node for the second. We increment the version and fence_version on the third node, restart it, and expect to see a hint delivery failure because of versions mismatch. Then we update the versions of the first node and expect hint to be successfully delivered.	2023-08-22 15:48:40 +04:00
Petr Gusev	3ccd2abad4	test.py: output the skipped tests pytest option -rs forces it to print all the skipped tests along with the reasons. Without this option we can't tell why certain tests were skipped; maybe some of them should no longer be skipped.	2023-08-22 15:48:40 +04:00
Petr Gusev	c434d26b36	test.py: add skip_mode decorator and fixture Syntactic sugar for marking tests to be skipped in a particular mode. There is skip_in_debug/skip_in_release in suite.yaml, but they can be applied only on the entire file, which is unnatural and inconvenient. Also, they don't allow to specify a reason why the test is skipped. Separate dictionary skipped_funcs is needed since we can't use pytest fixtures in decorators.	2023-08-22 15:48:40 +04:00
Petr Gusev	a639d161e6	test.py: add mode fixture Sometimes a test wants to know what mode it is running in so that e.g. it can skip itself in some of them.	2023-08-22 15:48:40 +04:00
Petr Gusev	439c91851f	hints: add debug log for dropped hints Dropping data is rather important event, let's log it at least at the debug level. It'll help in debugging tests.	2023-08-22 15:48:40 +04:00
Petr Gusev	9fd3df13a2	hints: send_one_hint: extend the scope of file_send_gate holder The problem was that the holder in with_gate call was released too early. This happened before the possible call to on_hint_send_failure in then_wrapped. As a result, the effects of on_hint_send_failure (segment_replay_failed flag) were not visible in send_one_file after ctx_ptr->file_send_gate.close(), so we could decide that the segment was sent in full and delete it even if sending of some hints led to errors. Fixes #15110	2023-08-22 15:48:40 +04:00
Petr Gusev	0b7a90dff6	pylib: add ScyllaMetrics This patch adds facilities to work with Scylla metrics from test.py tests. The new metrics property was added to ManagerClient, its query method sends a request to Scylla metrics endpoint and returns an object to conveniently access the result. ScyllaMetrics is copy-pasted from test_shedding.py. It's difficult to reuse code between 'new' and 'old' styles of tests, we can't just import pylib in 'old' tests because of some problems with python search directories. A past commit of mine that attempted to solve this problem was rejected on review.	2023-08-22 14:31:04 +04:00
Petr Gusev	1b7603af23	hints manager: add send_errors counter There was no indication of problems in the hints manager metrics before. We need this counter for fencing tests in the later commit, but it seems to be useful on its own.	2023-08-22 14:31:04 +04:00
Petr Gusev	fa25e6d63e	token_metadata: add debug logs We log the new version when the new token metadata is set. Also, the log for fence_version is moved in shared_token_metadata from storage_service for uniformity.	2023-08-22 14:31:04 +04:00
Petr Gusev	360453fd87	fencing: add simple data plane test The test starts a three node cluster and manually decrements the version on the last node. It then tries to write some data through the last node and expects to get 'stale topology' exception.	2023-08-22 14:31:01 +04:00
Petr Gusev	5361de76f9	random_tables.py: add counter column type We'll need it for fencing test.	2023-08-11 17:37:09 +04:00
Petr Gusev	f5b41a8075	raft topology: don't increment version when transitioning to node_state::normal This version increment is not accompanied by a global_token_metadata_barrier, which means the new version won't be reflected in fence_version and basically will have no effect in terms of fencing.	2023-08-11 16:22:25 +04:00
Patryk Jędrzejczak	d1d1b6cf6e	raft: remove a replaced node from group 0 earlier The topology coordinator only marks a replaced node as LEFT during the replace operation and actually removes it from the group 0 config in cleanup_group0_config_if_needed. If this function is called before raft has committed a replacing node as a voter, it does not remove the replaced node from the group 0 config. Then, the coordinator can decide that it has no work to do and starts sleeping, leaving us with an outdated config. This behavior reduces group 0 availability and causes problems in tests (see #14885). Also, it makes the coordinator's logic confusing - it claims that it has no work to do when it has some work to do. Therefore, we modify the coordinator so that it removes the replaced node earlier in handle_topology_transition. Fixes #14885 Fixes #14975 Closes #15009	2023-08-11 01:32:24 +02:00
Botond Dénes	403ba9b055	Merge 'gossiper: lock_endpoint: fix review comments' from Benny Halevy This series fixes a couple of review comments on #14845 Closes #14976 * github.com:scylladb/scylladb: gossiper: lock_endpoint: fix comment regarding permit_id mismatch gossiper: lock_endpoint: change assert to on_internal_error	2023-08-10 17:37:32 +03:00
Kamil Braun	8f658fb139	Merge 's3/client: check for available port before starting minio server' from Kefu Chai there is chance that the default port of 9000 has been used on the host running the test, in that case, we should try to use another available port. so, in this change, we try ports in the ranges of [9000, 9000+1000), and use the first one which is not connectable. Fixes #14985 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #14997 * github.com:scylladb/scylladb: test: stop using HostRegistry in MinioServer s3/client: check for available port before starting minio server	2023-08-10 14:01:13 +02:00
Alejo Sanchez	e2122163f5	test/pylib: protect double call to cluster stop test.py schedules calls to cluster .uninstall() and .stop() making double calls to it running at the same time. Mark the cluster as not running early on. While there, do the same for .stop_gracefully() for consistency. Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com> Closes #14987	2023-08-10 13:37:49 +02:00
Kamil Braun	39330b9c11	Merge 'gossiper: convict: lock_endpoint' from Benny Halevy Currently, `mark_dead` is called with null_permit_id from `convict`, and in this case, by contract, it should lock the endpoint, same as `mark_as_shutdown`. This change somehow escaped #14845 so it amends it. Fixes #14838 Closes #15001 * github.com:scylladb/scylladb: gossiper: verify permit_id in all private functions gossiper: convict: lock_endpoint	2023-08-10 13:09:05 +02:00
Benny Halevy	623ed1a249	gossiper: verify permit_id in all private functions Instead of acquiring the permit if the permit_id arg is null, like in mark_as_shutdown, just assert that the permit_id is non-null. The functions are both private and we control all callers. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-08-10 08:17:04 +03:00
Benny Halevy	42c1c5ead8	gossiper: convict: lock_endpoint Currently, `mark_dead` is called with null_permit_id from `convict`, and in this case, by contract, it should lock the endpoint, same as `mark_as_shutdown`. This change somehow escaped #14845 so it amends it. Fixes #14838 Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-08-10 07:50:33 +03:00
Kefu Chai	0c0a59bf62	test: stop using HostRegistry in MinioServer since MinioServer find a free port by itself, there is no need to provide it an IP address for it anymore -- we can always use 127.0.0.1. so, in this change, we just drop the HostRegistry parameter passed to the constructor of MinioServer, and pass the host address in place of it. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-08-09 23:40:22 +08:00
Kamil Braun	59c410fb97	Merge 'migration_manager: announce: provide descriptions for all calls' from Patryk Jędrzejczak The `system.group0_history` table provides useful descriptions for each command committed to Raft group 0. One way of applying a command to group 0 is by calling `migration_manager::announce`. This function has the `description` parameter set to empty string by default. Some calls to `announce` use this default value which causes `null` values in `system.group0_history`. We want `system.group0_history` to have an actual description for every command, so we change all default descriptions to reasonable ones. Going further, We remove the default value for the `description` parameter of `migration_manager::announce` to avoid using it in the future. Thanks to this, all commands in `system.group0_history` will have a non-null description. Fixes #13370 Closes #14979 * github.com:scylladb/scylladb: migration_manager: announce: remove the default value of description test: always pass empty description to migration_manager::announce migration_manager: announce: provide descriptions for all calls	2023-08-09 16:58:41 +02:00
Kefu Chai	29554b0fc6	s3/client: check for available port before starting minio server there is chance that the default port of 9000 has been used on the host running the test, in that case, we should try to use another available port. so, in this change, we try ports in the ranges of [9000, 9000+1000), and use the first one which is not connectable. Fixes #14985 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-08-09 17:33:42 +08:00
Botond Dénes	108e510a23	Merge 'Update sstable_requiring_cleanup on compaction completion' from Benny Halevy Currently `sstable_requiring_cleanup` is updated using `compacting_sstable_registration`, but that mechanism is not used by offstrategy compaction, leading to #14304. This series introduces `compaction_manager::on_compaction_completion` that intercepts the call to the table::on_compaction_completion. This allows us to update `sstable_requiring_cleanup` right before the compacted sstables are deleted, making sure they are not leaked to `sstable_requiring_cleanup`, which would hold a reference to them until cleanup attempts to clean them up. `cleanup_incremental_compaction_test` was adjusted to observe the sstables `on_delete` (by adding a new observer event) to detect the case where cleanup attempts to delete the leaked sstables and fails since they were already deleted from the file system by offstrategy compaction. The test fails without the fix and passes with it. Fixes #14304 Closes #14858 * github.com:scylladb/scylladb: compaction_manager: on_compaction_completion: erase sstables from sstables_requiring_cleanup compaction/leveled_compaction_strategy: ideal_level_for_input: special case max_sstable_size==0 sstable: add on_delete observer compaction_manager: add on_compaction_completion sstable_compaction_test: cleanup_incremental_compaction_test: verify sstables_requiring_cleanup is empty	2023-08-09 11:03:45 +03:00
Botond Dénes	69d6778daf	Merge 'build: cmake: fixes for the release build' from Kefu Chai before this change, we use generator expression to initialize CMAKE_CXX_FLAGS_RELEASE, this has three problems: 1. the generator expression is not expanded when setting a regular variable. 2. the ending ">" is missing in one of the generator expressions. 3. the parameters are not separated with ";" so to address them, let's just * use `add_compile_options()` directly, as the corresponding `mode.${build_mode}.cmake` is only included when the "${build_mode}" is used. * add comma in between the command line options. * add the missing closing ">" Closes #14996 * github.com:scylladb/scylladb: build: cmake: pass --gc-sections to ld not ar build: cmake: use add_compile_options() in release build	2023-08-09 09:55:02 +03:00
Kefu Chai	47c9b25bac	compaction_manager: correct comment on compaction_task_executor::state when it comes to `regular_compaction_task_executor`, we repeat the compaction until the compaction can not proceed, so after an iteration of compaction completes successfully, the task can still continue with yet another round of the compaction as it sees appropriate. so let's update the comment to reflect this fact. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #14891	2023-08-09 09:49:18 +03:00
Kefu Chai	6dc885a8e2	compaction: mark more member variables const quite a few member variables serves as the configuration for a given compaction, they are immutable in the life cycle of it, so for better readability, let's mark them `const`. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #14981	2023-08-09 09:28:44 +03:00
Botond Dénes	5f65ac73ed	Merge 'Remove qctx' from Pavel Emelyanov The only place that still calls it is static force_blocking_flush method. It can be made non-static already. Also, while at it, coroutinize some system_keyspace methods and fix a FIXME regarding replica::database access in it. Closes #14984 * github.com:scylladb/scylladb: code: Remove query-context.hh code: Remove qctx system_keyspace: Use system_keyspace's container() to flush system_keyspace: Make force_blocking_flush() non-static system_keyspace: Coroutinize update_tokens() system_keyspace: Coroutinize save_truncation_record()	2023-08-09 09:27:53 +03:00
Kefu Chai	782b1992b2	build: cmake: pass --gc-sections to ld not ar ar is not able to tell which sections to be GC'ed, hence it does not care about --gc-sections, but ld does. let's add this option to CMAKE_EXE_LINKER_FLAGS. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-08-09 13:50:44 +08:00
Kefu Chai	f7377725c2	build: cmake: use add_compile_options() in release build before this change, we use generator expression to initialize CMAKE_CXX_FLAGS_RELEASE, this has three problems: 1. the generator expression is not expanded when setting a regular variable. 2. the ending ">" is missing in one of the generator expressions. 3. the parameters are not separated with ";" so to address them, let's just * use `add_compile_options()` directly, as the corresponding `mode.${build_mode}.cmake` is only included when the "${build_mode}" is used. * add comma in between the command line options. * add the missing closing ">" Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-08-09 12:56:06 +08:00
Pavel Emelyanov	f1515c610e	code: Remove query-context.hh The whole thing is unused now, so the header is no longer needed Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-08-08 11:11:07 +03:00
Pavel Emelyanov	413d81ac16	code: Remove qctx Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-08-08 11:10:56 +03:00
Pavel Emelyanov	d7f5d6dba8	system_keyspace: Use system_keyspace's container() to flush In force_blocking_flush() there's an invoke-on-all invocation of replica::database::flush() and a FIXME to get the replica database from somewhere else rather than via query-processor -> data_dictionary. Since now the force_blocking_flush() is non-static the invoke-on-all can happen via system_keyspace's container and the database can be obtained directly from the sys.ks. local instance Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-08-08 11:09:32 +03:00
Pavel Emelyanov	7a342ed5c0	system_keyspace: Make force_blocking_flush() non-static Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-08-08 11:09:20 +03:00
Pavel Emelyanov	6b8fe5ac43	system_keyspace: Coroutinize update_tokens() Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-08-08 11:09:15 +03:00
Pavel Emelyanov	1700d79b60	system_keyspace: Coroutinize save_truncation_record() Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-08-08 11:09:09 +03:00
Benny Halevy	7a7c8d0d23	compaction_manager: on_compaction_completion: erase sstables from sstables_requiring_cleanup Erase retired sstable from compaction_state::sstables_requiring_cleanup also on_compaction_completion (in addition to compacting_sstable_registration::release_compacting) for offstrategy compaction with piggybacked cleanup or any other compaction type that doesn't use compacting_sstable_registration. Add cleanup_during_offstrategy_incremental_compaction_test that is modeled after cleanup_incremental_compaction_test to check that cleanup doesn't attempt to cleanup already-deleted sstables that were left over by offstrategy compaction in sstables_requiring_cleanup. Fixes #14304 Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-08-08 08:16:46 +03:00
Benny Halevy	b1e164a241	compaction/leveled_compaction_strategy: ideal_level_for_input: special case max_sstable_size==0 Prevent div-by-zero by returning const level 1 if max_sstable_size is zero, as configured by cleanup_incremental_compaction_test, before it's extended to cover also offstrategy compaction. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-08-08 08:16:46 +03:00
Benny Halevy	b08f2ac4c6	sstable: add on_delete observer Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-08-08 08:15:00 +03:00
Benny Halevy	df66895080	compaction_manager: add on_compaction_completion Pass the call to the table on_compaction_completion so we can manage the sstables requiring cleanup state along the way. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-08-08 08:12:05 +03:00
Benny Halevy	ea64ae54f8	sstable_compaction_test: cleanup_incremental_compaction_test: verify sstables_requiring_cleanup is empty Make sure that there are no sstables_requiring_cleanup after cleanup compaction. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-08-08 08:12:01 +03:00
Patryk Jędrzejczak	356e131acd	migration_manager: announce: remove the default value of description We remove the default value for the description parameter of migration_manager::announce to avoid using it in the future. Thanks to this, all commands in system.group0_history will have a non-null description.	2023-08-07 14:38:11 +02:00
Patryk Jędrzejczak	866c9a904d	test: always pass empty description to migration_manager::announce In the next commit, we remove the default value for the description parameter of migration_manager::announce to avoid using it in the future. However, many calls to announce in tests use the default value. We have to change it, but we don't really care about descriptions in the tests, so we pass the empty string everywhere.	2023-08-07 14:38:11 +02:00
Patryk Jędrzejczak	27ddf78171	migration_manager: announce: provide descriptions for all calls The system.group0_history table provides useful descriptions for each command committed to Raft group 0. One way of applying a command to group 0 is by calling migration_manager::announce. This function has the description parameter set to empty string by default. Some calls to announce use this default value which causes null values in system.group0_history. We want system.group0_history to have an actual description for every command, so we change all default descriptions to reasonable ones. We can't provide a reasonable description to announce in query_processor::execute_thrift_schema_command because this function is called in multiple situations. To solve this issue, we add the description parameter to this function and to handler::execute_schema_command that calls it.	2023-08-07 14:38:11 +02:00
Avi Kivity	4f7e83a4d0	cql3: select_statement: reject DISTINCT with GROUP BY on clustering keys While in SQL DISTINCT applies to the result set, in CQL it applies to the table being selected, and doesn't allow GROUP BY with clustering keys. So reject the combination like Cassandra does. While this is not an important issue to fix, it blocks un-xfailing other issues, so I'm clearing it ahead of fixing those issues. An issue is unmarked as xfail, and other xfails lose this issue as a blocker. Fixes #12479 Closes #14970	2023-08-07 15:35:59 +03:00
Benny Halevy	db7a4109dd	gossiper: lock_endpoint: fix comment regarding permit_id mismatch Fixes a code review comment. See https://github.com/scylladb/scylladb/pull/14845#discussion_r1283572889 Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-08-07 14:39:42 +03:00
Benny Halevy	4ebd2fa09d	gossiper: lock_endpoint: change assert to on_internal_error Fixes a code review comment. See https://github.com/scylladb/scylladb/pull/14845#discussion_r1283060243 Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-08-07 14:36:35 +03:00
Patryk Jędrzejczak	1772433ae2	raft_group0: log gaining and losing leadership on the INFO level Knowing that a server gained or lost leadership in group 0 is sometimes useful for the purpose of debugging, so we log information about these events on the INFO level. Gaining and losing leadership are relatively rare events, so this change shouldn't flood the logs. Closes #14877	2023-08-07 12:13:24 +02:00
Kamil Braun	9edc98f8e9	Merge 'raft: make a removed/decommissioning node a non-voter early' from Patryk Jędrzejczak For `removenode`, we make a removed node a non-voter early. There is no downside to it because the node is already dead. Moreover, it improves availability in some situations. For `decommission`, if we decommission a node when the number of nodes is even, we make it a non-voter early to improve availability. All majorities containing this node will remain majorities when we make this node a non-voter and remove it from the set because the required size of a majority decreases. We don't change `decommission` when the number of nodes is odd since this may reduce availability. Fixes #13959 Closes #14911 * github.com:scylladb/scylladb: raft: make a decommissioning node a non-voter early raft: topology_coordinator: implement step_down_as_nonvoter raft: make a removed node a non-voter early	2023-08-07 10:14:33 +02:00
Botond Dénes	fa4aec90e9	Merge 'test: tasks: Fix task_manager/wait_task test ' from Aleksandra Martyniuk Rewrite test that checks whether task_manager/wait_task works properly. The old version didn't work. Delete functions used in old version. Closes #14959 * github.com:scylladb/scylladb: test: rewrite wait_task test test: move ThreadWrapper to rest_util.py	2023-08-07 09:04:29 +03:00
Benny Halevy	6f037549ac	sstables: delete_with_pending_deletion_log: batch sync_directory When deleting multiple sstables with the same prefix the deletion atomicity is ensured by the pending_delete_log file, so if scylla crashes in the middle, deletions will be replayed on restart. Therefore, we don't have to ensure atomicity of each individual `unlink`. We just need to sync the directory once, before removing the pending_delete_log file. Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Closes #14967	2023-08-06 18:52:13 +03:00
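
Note on 9edc98f8e9 (make a removed/decommissioning node a non-voter early): the majority argument in that message can be checked by brute force. The sketch below is only an illustration of the arithmetic, not code from the repository. It assumes a majority of n voters means n // 2 + 1 of them; the function names and the choice of node 0 as the leaving node are hypothetical.

```python
from itertools import combinations


def majority(n: int) -> int:
    """Smallest number of voters forming a strict majority of n voters."""
    return n // 2 + 1


def majorities_survive(n: int) -> bool:
    """Return True if every majority of n voters that contains node 0
    is still a majority once node 0 becomes a non-voter, i.e. once the
    voter count drops from n to n - 1."""
    voters = range(n)
    for size in range(majority(n), n + 1):
        for group in combinations(voters, size):
            if 0 not in group:
                continue
            # Node 0 no longer counts, so the group loses one vote.
            remaining = len(group) - 1
            if remaining < majority(n - 1):
                return False
    return True


if __name__ == "__main__":
    for n in range(2, 8):
        parity = "even" if n % 2 == 0 else "odd"
        print(n, parity, majorities_survive(n))
```

Under these assumptions the check holds for even cluster sizes and fails for odd ones, which matches the message's decision to leave decommission unchanged when the number of nodes is odd.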


38359 Commits