scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-06-09 16:33:35 +00:00

Author	SHA1	Message	Date
Botond Dénes	d4563e2b28	test/pylib: rest_client: add get_injection() The /v2/error_injection/{injection} endpoint now has a GET method too, expose this. (cherry picked from commit `0c61b1822c`)	2024-06-11 17:32:37 +00:00
Botond Dénes	bb18a8152e	api/error_injection: add getter for error_injection Allow external code to obtain information about an error injection point, including whether it is enabled, and importantly, what its parameters are. Together with the `set_parameter()` added in the previous patch, this allows tests to read out the values of internal parameters, via a set_parameter() injection point. (cherry picked from commit `feea609e37`)	2024-06-11 17:32:37 +00:00
Botond Dénes	1947290c74	utils/error_injection: add set_parameter() Allow injection points to write values into the parameter map, which external code can then examine. This allows exfiltrating the values of internal variables, to be examined by tests, without exposing these variables via an "official" path. (cherry picked from commit `4590026b38`)	2024-06-11 17:32:36 +00:00
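The pattern described in the three error-injection commits above can be modelled in a few lines. This is a toy sketch only: `ToyInjectionRegistry`, `peek_config` and `compaction_enabled` are hypothetical names, and it does not reproduce ScyllaDB's C++ `utils/error_injection` API or the REST endpoint. It only shows the idea of an injection point writing a value into a parameter map that a test later reads back.

```python
# Toy model of the idea only; not ScyllaDB's actual error-injection API.
class ToyInjectionRegistry:
    def __init__(self):
        # injection name -> {parameter name: value as string}
        self._params = {}

    def enable(self, name, **params):
        self._params[name] = {k: str(v) for k, v in params.items()}

    def is_enabled(self, name):
        return name in self._params

    def set_parameter(self, name, key, value):
        """Called from the code under test; does nothing unless enabled."""
        if name in self._params:
            self._params[name][key] = str(value)

    def get(self, name):
        """What a test reads back: enabled flag plus current parameters."""
        return {
            "enabled": self.is_enabled(name),
            "parameters": dict(self._params.get(name, {})),
        }


registry = ToyInjectionRegistry()


def code_under_test(compaction_enabled):
    # Hypothetical injection point exposing an internal value to tests.
    registry.set_parameter("peek_config", "compaction_enabled", compaction_enabled)


registry.enable("peek_config")
code_under_test(True)
snapshot = registry.get("peek_config")
```

When the injection is not enabled, `set_parameter()` in this sketch is a no-op, so production paths would not accumulate values.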
Botond Dénes	d121fc1264	replica/database: fix live-update enable_compacting_data_for_streaming_and_repair This config item is propagated to the table object via table::config. Although the field in table::config, used to propagate the value, was utils::updateable_value<T>, it was assigned a constant and so the live-update chain was broken. This patch fixes this. (cherry picked from commit `dbccb61636`)	2024-06-11 17:32:36 +00:00
Guilherme Nogueira	1ace370ecd	Remove comma that breaks CQL DML on tablets.rst The current sample reads: ```cql CREATE KEYSPACE my_keyspace WITH replication = { 'class': 'NetworkTopologyStrategy', 'replication_factor': 3, } AND tablets = { 'enabled': false }; ``` The additional comma after `'replication_factor': 3` breaks the query execution. (cherry picked from commit `cf157e4423`) Closes scylladb/scylladb#19194	2024-06-10 20:24:22 +03:00
Kefu Chai	3e7de910ab	docs: correct the link pointing to Scylla U before this change it points to https://university.scylladb.com/courses/scylla-operations/lessons/change-data-capture-cdc/ which then redirects the browser to https://university.scylladb.com/courses/scylla-operations/, but it should have pointed to https://university.scylladb.com/courses/data-modeling/lessons/change-data-capture-cdc/ in this change, the hyperlink is corrected. Fixes #19163 Refs `6e97b83b60` Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> (cherry picked from commit `b5dce7e3d0`) Closes scylladb/scylladb#19198	2024-06-10 20:23:08 +03:00
Kefu Chai	9cf0d618d0	build: populate cxxflags to abseil before this change, when building abseil, we don't pass cxxflags to compiler, and abseil libraries are build with the default optimization level. in the case of clang, its default optimization level is `-O0`, it compiles the fastest, but the performance of the emitted code is not optimized for runtime performance. but we expect good performance for the release build. a typical command line for building abseil looks like ``` clang++ -I/home/kefu/dev/scylladb/master/abseil -ffile-prefix-map=/home/kefu/dev/scylladb/master=. -march=westmere -std=gnu++20 -Wall -Wextra -Wcast-qual -Wconversion -Wfloat-overflow-conversion -Wfloat-zero-conversion -Wfor-loop-analysis -Wformat-security -Wgnu-redeclared-enum -Winfinite-recursion -Winvalid-constexpr -Wliteral-conversion -Wmissing-declarations -Woverlength-strings -Wpointer-arith -Wself-assign -Wshadow-all -Wshorten-64-to-32 -Wsign-conversion -Wstring-conversion -Wtautological-overlap-compare -Wtautological-unsigned-zero-compare -Wundef -Wuninitialized -Wunreachable-code -Wunused-comparison -Wunused-local-typedefs -Wunused-result -Wvla -Wwrite-strings -Wno-float-conversion -Wno-implicit-float-conversion -Wno-implicit-int-float-conversion -Wno-unknown-warning-option -DNOMINMAX -MD -MT absl/base/CMakeFiles/scoped_set_env.dir/internal/scoped_set_env.cc.o -MF absl/base/CMakeFiles/scoped_set_env.dir/internal/scoped_set_env.cc.o.d -o absl/base/CMakeFiles/scoped_set_env.dir/internal/scoped_set_env.cc.o -c /home/kefu/dev/scylladb/master/abseil/absl/base/internal/scoped_set_env.cc ``` so, in this change, we populate cxxflags to abseil, so that the per-mode `-O` option can be populated when building abseil. 
after this change, the command line building abseil in release mode looks like ``` clang++ -I/home/kefu/dev/scylladb/master/abseil -ffunction-sections -fdata-sections -O3 -mllvm -inline-threshold=2500 -fno-slp-vectorize -DSCYLLA_BUILD_MODE=release -g -gz -ffile-prefix-map=/home/kefu/dev/scylladb/master=. -march=westmere -std=gnu++20 -Wall -Wextra -Wcast-qual -Wconversion -Wfloat-overflow-conversion -Wfloat-zero-conversion -Wfor-loop-analysis -Wformat-security -Wgnu-redeclared-enum -Winfinite-recursion -Winvalid-constexpr -Wliteral-conversion -Wmissing-declarations -Woverlength-strings -Wpointer-arith -Wself-assign -Wshadow-all -Wshorten-64-to-32 -Wsign-conversion -Wstring-conversion -Wtautological-overlap-compare -Wtautological-unsigned-zero-compare -Wundef -Wuninitialized -Wunreachable-code -Wunused-comparison -Wunused-local-typedefs -Wunused-result -Wvla -Wwrite-strings -Wno-float-conversion -Wno-implicit-float-conversion -Wno-implicit-int-float-conversion -Wno-unknown-warning-option -DNOMINMAX -MD -MT absl/flags/CMakeFiles/flags_commandlineflag_internal.dir/internal/commandlineflag.cc.o -MF absl/flags/CMakeFiles/flags_commandlineflag_internal.dir/internal/commandlineflag.cc.o.d -o absl/flags/CMakeFiles/flags_commandlineflag_internal.dir/internal/commandlineflag.cc.o -c /home/kefu/dev/scylladb/master/abseil/absl/flags/internal/commandlineflag.cc ``` Refs `0b0e661a85` Fixes #19161 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> (cherry picked from commit `535f2b2134`) Closes scylladb/scylladb#19200	2024-06-10 20:22:00 +03:00
Nadav Har'El	4810937ddf	test/alternator: fix flaky test test_item_latency The Alternator test test_metrics.py::test_item_latency confirms that for several operation types (PutItem, GetItem, DeleteItem, UpdateItem) we did not forget to measure their latencies. The test checked that a latency was updated by checking that two metrics increases: scylla_alternator_op_latency_count scylla_alternator_op_latency_sum However, it turns out that the "sum" is only an approximate sum of all latencies, and when the total sum grows large it sometimes does not increase when a short latency is added to the statistics. When this happens, this test fails on the assertion that the "sum" increases after an operation. We saw this happening sometimes in CI runs. The simple fix is to stop checking _sum at all, and only verify that the _count increases - this is really an integer counter that unconditionally increases when a latency is added to the histogram. Don't worry that the strength of this test is reduced - this test was never meant to check the accuracy or correctness of the histograms - we should have different (and better) tests for that, unrelated to Alternator. The purpose of this test is only to verify that for some specific operation like PutItem, Alternator didn't forget to measure its latency and update the histogram. We want to avoid a bug like we had in counters in the past (#9406). Fixes #18847. Signed-off-by: Nadav Har'El <nyh@scylladb.com> (cherry picked from commit `13cf6c543d`) Closes scylladb/scylladb#19193	2024-06-10 20:20:54 +03:00
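One way an accumulated sum can fail to grow is floating-point absorption. The commit above only says the "sum" is approximate, so the mechanism below is an assumption for illustration, not a description of how ScyllaDB computes the metric. The sketch contrasts a large double-precision sum with an integer counter.

```python
# Assumption for illustration: the running sum is a double-precision float.
running_sum = 1e17          # large accumulated latency total
short_latency = 1.0         # a short latency added to the statistics

sum_after = running_sum + short_latency

count_before = 10**9        # integer counter of recorded samples
count_after = count_before + 1

# Near 1e17, adjacent doubles are 16 apart, so adding 1.0 rounds back
# to the same value, while the integer count always increases.
sum_increased = sum_after > running_sum
count_increased = count_after > count_before
```

This is why asserting only on the count is the more robust check for "was a latency recorded at all".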
Tomasz Grabiec	a3e4dc7b6c	test: tablets: Fix flakiness of test_removenode_with_ignored_node due to read timeout The check query may be executed on a node which doesn't yet see that the downed server is down, as it is not shut down gracefully. The query coordinator can choose the down node as a CL=1 replica for read and time out. To fix, wait for all nodes to notice the node is down before executing the checking query. Fixes #17938 (cherry picked from commit `c8f71f4825`) Closes scylladb/scylladb#19199	2024-06-10 20:12:56 +03:00
Botond Dénes	7a6ff12ace	Merge '[Backport 6.0] alternator: keep TTL work in the maintenance scheduling group' from ScyllaDB Alternator has a custom TTL implementation. This is based on a loop, which scans existing rows in the table, then decides whether each row has reached its end-of-life and deletes it if it did. This work is done in the background, and therefore it uses the maintenance (streaming) scheduling group. However, it was observed that part of this work leaks into the statement scheduling group, competing with user workloads, negatively affecting their latencies. This was found to be caused by the reads and writes done on behalf of the alternator TTL, which lose their maintenance scheduling group when they have to go to a remote node. This is because the messaging service was not configured to recognize the streaming scheduling group, when statement verbs like read or writes are invoked. The messaging service currently recognizes two statement "tenants": the user tenant (statement scheduling group) and system (default scheduling group), as we used to have only user-initiated operations and system (internal) ones. With alternator TTL, there is now a need to distinguish between two kinds of system operation: foreground and background ones. The former should use the system tenant while the latter will use the new maintenance tenant (streaming scheduling group). This series adds a streaming tenant to the messaging service configuration and it adds a test which confirms that with this change, alternator TTL is entirely contained in the maintenance scheduling group. Fixes: #18719 - [x] Scans executed on behalf of alternator TTL are running in the statement group, disturbing user-workloads, this PR has to be backported to fix this.
(cherry picked from commit `5d3f7c13f9`) (cherry picked from commit `1fe8f22d89`) Refs #18729 Closes scylladb/scylladb#19196 * github.com:scylladb/scylladb: alternator, scheduler: test reproducing RPC scheduling group bug main: add maintenance tenant to messaging_service's scheduling config	2024-06-10 19:58:38 +03:00
Anna Stuchlik	e38d675cb9	doc: mark tablets as GA in the CREATE KEYSPACE section This commit removes the information that tablets are an experimental feature from the CREATE KEYSPACE section. In addition, it removes the notes and cautions that are redundant when a feature is GA, especially the information and warnings about the future plans. Fixes https://github.com/scylladb/scylladb/issues/18670 Closes scylladb/scylladb#19063 (cherry picked from commit `55ed18db07`)	2024-06-10 18:53:47 +03:00
Gleb Natapov	45ff4d2c41	group0, topology coordinator: run group0 and the topology coordinator in gossiper scheduling group Currently they both run in streaming group and it may become busy during repair/mv building and affect group0 functionality. Move it to the gossiper group where it should have more time to run. Fixes #18863 (cherry picked from commit `a74fbab99a`) Closes scylladb/scylladb#19175	2024-06-10 10:34:29 +02:00
Nadav Har'El	0662e80917	alternator, scheduler: test reproducing RPC scheduling group bug This patch adds a test for issue #18719: Although the Alternator TTL work is supposedly done in the "streaming" scheduling group, it turned out we had a bug where work sent on behalf of that code to other nodes failed to inherit the correct scheduling group, and was done in the normal ("statement") group. Because this problem only happens when more than one node is involved, the test is in the multi-node test framework test/topology_experimental_raft. The test uses the Alternator API. We already had in that framework a test using the Alternator API (a test for alternator+tablets), so in this patch we move the common Alternator utility functions to a common file, test_alternator.py, where I also put the new test. The test is based on metrics: We write expiring data, wait for it to expire, and then check the metrics on how much CPU work was done in the wrong scheduling group ("statement"). Before #18719 was fixed, a lot of work was done there (more than half of the work done in the right group). After the issue was fixed in the previous patch, the work on the wrong scheduling group went down to zero. Signed-off-by: Nadav Har'El <nyh@scylladb.com> (cherry picked from commit `1fe8f22d89`)	2024-06-10 07:42:23 +00:00
Botond Dénes	5b546ad4b1	main: add maintenance tenant to messaging_service's scheduling config Currently only the user tenant (statement scheduling group) and system (default scheduling group) tenants exist, as we used to have only user-initiated operations and system (internal) ones. Now there is a need to distinguish between two kinds of system operation: foreground and background ones. The former should use the system tenant while the latter will use the new maintenance tenant (streaming scheduling group). (cherry picked from commit `5d3f7c13f9`)	2024-06-10 07:42:22 +00:00
Piotr Dulikowski	e04378fdf0	Merge ' [Backport 6.0] db/hints: Use host ID to IP mappings to choose the ep manager to drain when node is leaving' from Dawid Mędrek In [`d0f5873`](`d0f58736c8`), we introduced mappings IP–host ID between hint directories and the hint endpoint managers managing them. As a consequence, it may happen that one hint directory stores hints towards multiple nodes at the same time. If any of those nodes leaves the cluster, we should drain the hint directory. However, before these changes that doesn't happen – we only drain it when the node of the same host ID as the hint endpoint manager leaves the cluster. This PR fixes that draining issue in the pre-host-ID-based hinted handoff. Now no matter which of the nodes corresponding to a hint directory leaves the cluster, the directory will be drained. We also introduce error injections to be able to test that it indeed happens. Fixes scylladb/scylladb#18761 (cherry picked from commit [`745a9c6`](`745a9c6ab8`)) (cherry picked from commit [`e855794`](`e855794327`)) Refs scylladb/scylladb#18764 Closes scylladb/scylladb#19114 * github.com:scylladb/scylladb: db/hints: Introduce an error injection to test draining db/hints: Ensure that draining happens	2024-06-10 09:11:07 +02:00
Tomasz Grabiec	f8243cbf19	Merge '[Backport 6.0] Serialize repair with tablet migration' from ScyllaDB We want to exclude repair with tablet migrations to avoid races between repair reads and writes with replica movement. Repair is not prepared to handle topology transitions in the middle. One reason why it's not safe is that repair may successfully write to a leaving replica post streaming phase and consider all replicas to be repaired, but in fact they are not, the new replica would not be repaired. Other kinds of races could result in repair failures. If repair writes to a leaving replica which was already cleaned up, such writes will fail, causing repair to fail. Excluding works by keeping effective_replication_map_ptr in a version which doesn't have table's tablets in transitions. That prevents later transitions from starting because topology coordinator's barrier will wait for that erm before moving to a stage later than allow_write_both_read_old, so before any requests start using the new topology. Also, if transitions are already running, repair waits for them to finish. A blocked tablet migration (e.g. due to down node) will block repair, whereas before it would fail. Once admin resolves the cause of blocked migration, repair will continue. Fixes #17658. Fixes #18561. 
(cherry picked from commit `6c64cf33df`) (cherry picked from commit `1513d6f0b0`) (cherry picked from commit `476c076a21`) (cherry picked from commit `c45ce41330`) (cherry picked from commit `e97acf4e30`) (cherry picked from commit `98323be296`) (cherry picked from commit `5ca54a6e88`) Refs #18641 Closes scylladb/scylladb#19144 * github.com:scylladb/scylladb: test: pylib: Do not block async reactor while removing directories repair: Exclude tablet migrations with tablet repair repair_service: Propagate topology_state_machine to repair_service main, storage_service: Move topology_state_machine outside storage_service storage_srvice, toplogy: Extract topology_state_machine::await_quiesced() tablet_scheduler: Make disabling of balancing interrupt shuffle mode tablet_scheduler: Log whether balancing is considered as enabled	2024-06-09 00:20:44 +02:00
Tomasz Grabiec	27f01bf4e3	test: pylib: Do not block async reactor while removing directories This fixes a problem where suite cleanup schedules lots of uninstall() tasks for servers started in the suite, which schedules lots of tasks, which synchronously call rmtree(). These take over a minute to finish, which blocks other tasks for tests which are still executing. In particular, this was observed to cause ManagerClient.server_stop_gracefully() to time out. It has a timeout of 60 seconds. The server was stopped quickly, but the RESTful API response was not processed in time and the call timed out when it got the async reactor. (cherry picked from commit `5ca54a6e88`)	2024-06-08 16:31:18 +02:00
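A minimal sketch of the general technique, assuming plain `asyncio` and the standard library: the blocking `shutil.rmtree()` call is pushed to a worker thread so other coroutines keep running. The helper names `remove_tree` and `heartbeat` are made up for this example, and the actual change in test/pylib may be implemented differently.

```python
import asyncio
import shutil
import tempfile
from pathlib import Path


async def remove_tree(path):
    # Run the blocking rmtree() in a worker thread (Python 3.9+),
    # so the event loop stays free for other tasks.
    await asyncio.to_thread(shutil.rmtree, path, ignore_errors=True)


async def heartbeat(ticks):
    # Stand-in for other test tasks that need the event loop.
    for _ in range(3):
        await asyncio.sleep(0)
        ticks.append(1)


async def main():
    root = Path(tempfile.mkdtemp())
    (root / "data").mkdir()
    (root / "data" / "file.txt").write_text("x")
    ticks = []
    await asyncio.gather(remove_tree(root), heartbeat(ticks))
    return root.exists(), len(ticks)


still_exists, tick_count = asyncio.run(main())
```

Calling `shutil.rmtree()` directly inside a coroutine would hold the event loop for the whole deletion, which matches the symptom described in the commit message.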
Tomasz Grabiec	ded9aca6ee	repair: Exclude tablet migrations with tablet repair We want to exclude repair with tablet migrations to avoid races between repair reads and writes with replica movement. Repair is not prepared to handle topology transitions in the middle. One reason why it's not safe is that repair may successfully write to a leaving replica post streaming phase and consider all replicas to be repaired, but in fact they are not, the new replica would not be repaired. Other kinds of races could result in repair failures. If repair writes to a leaving replica which was already cleaned up, such writes will fail, causing repair to fail. Excluding works by keeping effective_replication_map_ptr in a version which doesn't have table's tablets in transitions. That prevents later transitions from starting because topology coordinator's barrier will wait for that erm before moving to a stage later than allow_write_both_read_old, so before any requests start using the new topology. Also, if transitions are already running, repair waits for them to finish. Fixes #17658. Fixes #18561. (cherry picked from commit `98323be296`)	2024-06-08 16:31:18 +02:00
Tomasz Grabiec	ccd441a4de	repair_service: Propagate topology_state_machine to repair_service (cherry picked from commit `e97acf4e30`)	2024-06-08 16:31:15 +02:00
Jenkins Promoter	79e4e411b3	Update ScyllaDB version to: 6.0.1	2024-06-07 09:31:05 +03:00
Kefu Chai	f8ba94a960	doc: document "enable_tablets" option it sets the cluster feature of tablets, and is a prerequisite for using tablets. Refs #18670 Fixes #19157 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> (cherry picked from commit `bac7e1e942`) Closes scylladb/scylladb#19158	2024-06-07 07:03:30 +03:00
Tzach Livyatan	dfe89157c6	Docs: fix start command in Update replace-dead-node.rst Fix #18920 (cherry picked from commit `c30f81c389`) Closes scylladb/scylladb#19142	2024-06-07 07:02:02 +03:00
Kefu Chai	50d8fa6b77	topology_coordinator: handle/wait futures when stopping topology_coordinator before this change, unlike other services in scylla, topology_coordinator is not properly stopped when it is aborted, because the scylla instance is no longer a leader or is being shut down. its `run()` method just stops the grand loop and bails out before topology_coordinator is destroyed. but we are tracking the migration state of tablets using a bunch of futures, which might not be handled yet, and some of them could carry failures. in that case, when the `future` instances with failure state get destroyed, seastar calls `report_failed_future`. and seastar considers this practice a source a bug -- as one just fails to handle an error. that's why we have following error: ``` WARN 2024-05-19 23:00:42,895 [shard 0:strm] seastar - Exceptional future ignored: seastar::rpc::unknown_verb_error (unknown verb), backtrace: /home/bhalevy/.ccm/scylla-repository/local_tarball/libreloc/libseastar.so+0x56c14e /home/bhalevy/.ccm/scylla-repository/local_tarball/libre loc/libseastar.so+0x56c770 /home/bhalevy/.ccm/scylla-repository/local_tarball/libreloc/libseastar.so+0x56ca58 /home/bhalevy/.ccm/scylla-repository/local_tarball/libreloc/libseastar.so+0x38c6ad 0x29cdd07 0x29b376b 0x29a5b65 0x108105a /home/bhalevy/.ccm/scylla-repository/local_tarbal l/libreloc/libseastar.so+0x3ff1df /home/bhalevy/.ccm/scylla-repository/local_tarball/libreloc/libseastar.so+0x400367 /home/bhalevy/.ccm/scylla-repository/local_tarball/libreloc/libseastar.so+0x3ff838 /home/bhalevy/.ccm/scylla-repository/local_tarball/libreloc/libseastar.so+0x36de58 /home/bhalevy/.ccm/scylla-repository/local_tarball/libreloc/libseastar.so+0x36d092 0x1017cba 0x1055080 0x1016ba7 /home/bhalevy/.ccm/scylla-repository/local_tarball/libreloc/libc.so.6+0x27b89 /home/bhalevy/.ccm/scylla-repository/local_tarball/libreloc/libc.so.6+0x27c4a 0x1015524 ``` and the backtrace looks like: ``` seastar::current_backtrace_tasklocal() at ??:? 
seastar::current_tasktrace() at ??:? seastar::current_backtrace() at ??:? seastar::report_failed_future(seastar::future_state_base::any&&) at ??:? service::topology_coordinator::tablet_migration_state::~tablet_migration_state() at topology_coordinator.cc:? service::topology_coordinator::~topology_coordinator() at topology_coordinator.cc:? service::run_topology_coordinator(seastar::sharded<db::system_distributed_keyspace>&, gms::gossiper&, netw::messaging_service&, locator::shared_token_metadata&, db::system_keyspace&, replica::database&, service::raft_group0&, service::topology_state_machine&, seastar::abort_source&, raft::server&, seastar::noncopyable_function<seastar::future<service::raft_topology_cmd_result> (utils::tagged_tagged_integer<raft::internal::non_final, raft::term_tag, unsigned long>, unsigned long, service::raft_topology_cmd const&)>, service::tablet_allocator&, std::chrono::duration<long, std::ratio<1l, 1000l> >, service::endpoint_lifecycle_notifier&) [clone .resume] at topology_coordinator.cc:? seastar::internal::coroutine_traits_base<void>::promise_type::run_and_dispose() at main.cc:? seastar::reactor::run_some_tasks() at ??:? seastar::reactor::do_run() at ??:? seastar::reactor::run() at ??:? seastar::app_template::run_deprecated(int, char**, std::function<void ()>&&) at ??:? ``` and even worse, these futures are indirectly owned by `topology_coordinator`. so there are chances that they could be used even after `topology_coordinator` is destroyed. this is a use-after-free issue. because the `run_topology_coordinator` fiber exits when the scylla instance retires from the leader's role, this use-after-free could be fatal to a running instance due to undefined behavior of use after free. so, in this change, we handle the futures in `_tablets`, and note down the failures carried by them if any. 
Fixes #18745 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> (cherry picked from commit `4a36918989`) Closes scylladb/scylladb#19139	2024-06-07 07:00:25 +03:00
Jenkins Promoter	a77615adf3	Update ScyllaDB version to: 6.0.0 scylla-6.0.0-candidate-20240606102200 scylla-6.0.0-candidate-20240606081124 scylla-6.0.0	2024-06-06 16:03:39 +03:00
Tomasz Grabiec	e518bb68b2	main, storage_service: Move topology_state_machine outside storage_service It will be propagated to repair_service to avoid cyclic dependency: storage_service <-> repair_service (cherry picked from commit `c45ce41330`)	2024-06-06 13:01:19 +00:00
Tomasz Grabiec	af2caeb2de	storage_srvice, toplogy: Extract topology_state_machine::await_quiesced() Will be used later in a place which doesn't have access to storage_service but has access to topology_state_machine. It's not necessary to start group0 operation around polling because the busy() state can be checked atomically and if it's false it means the topology is no longer busy. (cherry picked from commit `476c076a21`)	2024-06-06 13:01:19 +00:00
Tomasz Grabiec	d5ebfea1ff	tablet_scheduler: Make disabling of balancing interrupt shuffle mode Tests will rely on that, they will run in shuffle mode, and disable balancing around section which otherwise would be infinitely blocked by ongoing shuffling (like repair). (cherry picked from commit `1513d6f0b0`)	2024-06-06 13:01:18 +00:00
Tomasz Grabiec	3fec9e1344	tablet_scheduler: Log whether balancing is considered as enabled (cherry picked from commit `6c64cf33df`)	2024-06-06 13:01:18 +00:00
Kamil Braun	5d3dde50f4	Merge '[Backport 6.0] Fail bootstrap if ip mapping is missing during double write stage' from ScyllaDB If a node restarts just before it stores the bootstrapping node's IP, it will not have an ID to IP mapping for the bootstrapping node, which may cause a failure on the write path. Detect this and fail bootstrapping if it happens. (cherry picked from commit `1faef47952`) (cherry picked from commit `27445f5291`) (cherry picked from commit `6853b02c00`) (cherry picked from commit `f91db0c1e4`) Refs #18927 Closes scylladb/scylladb#19118 * github.com:scylladb/scylladb: raft topology: fix indentation after previous commit raft topology: do not add bootstrapping node without IP as pending test: add test of bootstrap where the coordinator crashes just before storing IP mapping schema_tables: remove unused code	2024-06-06 11:35:13 +02:00
Tomasz Grabiec	b7fe4412d0	test: pylib: Fetch all pages by default in run_async Fetching only the first page is not the intuitive behavior expected by users. This causes flakiness in some tests which generate variable amount of keys depending on execution speed and verify later that all keys were written using a single SELECT statement. When the amount of keys becomes larger than page size, the test fails. Fixes #18774 (cherry picked from commit `2c3f7c996f`) Closes scylladb/scylladb#19130	2024-06-06 08:22:45 +03:00
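The difference between reading only the first page and draining all pages can be sketched with a stand-in for a paged query. `fetch_page` is a hypothetical helper invented for this example and is not part of any driver or of test/pylib. The sketch only shows why a test that writes more keys than the page size would see a truncated result.

```python
def fetch_page(rows, page_size, paging_state):
    """Hypothetical paged query: returns (page, next_state or None)."""
    start = paging_state or 0
    end = start + page_size
    next_state = end if end < len(rows) else None
    return rows[start:end], next_state


def fetch_first_page(rows, page_size):
    # Old behaviour described in the commit: stop after one page.
    page, _ = fetch_page(rows, page_size, None)
    return page


def fetch_all_pages(rows, page_size):
    # New default: keep following the paging state until it runs out.
    result = []
    state = None
    while True:
        page, state = fetch_page(rows, page_size, state)
        result.extend(page)
        if state is None:
            return result


keys = list(range(250))
first_only = fetch_first_page(keys, page_size=100)
everything = fetch_all_pages(keys, page_size=100)
```

With 250 keys and a page size of 100, the first variant returns 100 rows and the second returns all 250.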
Benny Halevy	fd7284ec06	gms: endpoint_state: get_dc_rack: do not assign to uninitialized memory Assigning to a member of an uninitialized optional does not initialize the object before assigning to it. This resulted in the AddressSanitizer detecting an attempted double-free when the uninitialized string apparently contained a bogus pointer. The change emplaces the returned optional when needed without resorting to the copy-assignment operator. So it's not susceptible to assigning to uninitialized memory, and it's more efficient as well... Fixes scylladb/scylladb#19041 Signed-off-by: Benny Halevy <bhalevy@scylladb.com> (cherry picked from commit `b2fa954d82`) Closes scylladb/scylladb#19117	2024-06-06 08:21:05 +03:00
Botond Dénes	8d12eeee62	Merge '[Backport 6.0] tasks: introduce task manager's task folding' from Aleksandra Martyniuk Task manager's tasks stay in memory after they are finished. Moreover, even if a child task is unregistered from task manager, it is still alive since its parent keeps a foreign pointer to it. Also, when a task has finished successfully there is no point in keeping all of its descendants in memory. The patch introduces folding of task manager's tasks. Whenever a task which has a parent is finished it is unregistered from task manager and foreign_ptr to it (kept in its parent) is replaced with its status. Children's statuses of the task are dropped unless they or one of their descendants failed. So for each operation we keep a tree of tasks which contains: - a root task and its direct children (status if they are finished, a task otherwise); - running tasks and their direct children (same as above); - a statuses path from root to failed tasks. /task_manager/wait_task/ does not unregister tasks anymore. Refs: https://github.com/scylladb/scylladb/issues/16694. - [ ] Backport reason (please explain below if this patch should be backported or not) Requires backport to 6.0 as task number exploded with tablets. 
(cherry picked from commit `6add9edf8a`) (cherry picked from commit `319e799089`) (cherry picked from commit `e6c50ad2d0`) (cherry picked from commit `a82a2f0624`) (cherry picked from commit `c1b2b8cb2c`) (cherry picked from commit `30f97ea133`) (cherry picked from commit `fc0796f684`) (cherry picked from commit `d7e80a6520`) (cherry picked from commit `beef77a778`) Refs https://github.com/scylladb/scylladb/pull/18735 Closes scylladb/scylladb#19104 * github.com:scylladb/scylladb: docs: describe task folding test: rest_api: add test for task tree structure test: rest_api: modify new_test_module tasks: test: modify test_task methods api: task_manager: do not unregister task in /task_manager/wait_task/ tasks: unregister tasks with parents when they are finished tasks: fold finished tasks info their parents tasks: make task_manager::task::impl::finish_failed noexcept tasks: change _children type	2024-06-06 07:56:12 +03:00
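The folding rule described in the merge message above can be sketched as a small tree transformation. This is an illustrative model under assumptions: `Task`, `Status`, `fold` and `finish_child` are hypothetical names, and the real task manager is C++ and uses foreign pointers across shards, which this sketch ignores.

```python
from dataclasses import dataclass, field


@dataclass
class Status:
    name: str
    failed: bool
    children: list = field(default_factory=list)  # kept only on failure paths


@dataclass
class Task:
    name: str
    failed: bool = False
    children: list = field(default_factory=list)  # Task or Status entries


def has_failure(node):
    # True if this node or any of its descendants failed.
    return node.failed or any(has_failure(c) for c in node.children)


def fold(task):
    """Replace a finished task by its status; child statuses are dropped
    unless they, or one of their descendants, failed."""
    kept = [c for c in task.children if has_failure(c)]
    return Status(task.name, task.failed, kept)


def finish_child(parent, child):
    # The parent's entry for the finished child becomes a folded status.
    folded = fold(child)
    parent.children = [folded if c is child else c for c in parent.children]


shard0 = Task("shard0")
shard1 = Task("shard1", failed=True)
repair = Task("repair", children=[shard0, shard1])
root = Task("root", children=[repair])

finish_child(repair, shard0)
finish_child(repair, shard1)
finish_child(root, repair)

kept_names = [c.name for c in root.children[0].children]
```

After all three tasks finish, the root holds a status for `repair` whose only remaining child is the failed `shard1`, which is the "path from root to failed tasks" the description mentions.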
Gleb Natapov	e11827f37e	raft topology: fix indentation after previous commit (cherry picked from commit `f91db0c1e4`)	2024-06-05 13:55:29 +00:00
Gleb Natapov	0acfc223ab	raft topology: do not add bootstrapping node without IP as pending If there is no mapping from host id to ip while a node is in bootstrap state there is no point adding it to pending endpoint since write handler will not be able to map it back to host id anyway. If the transition state requires double writes though we still want to fail. In case the state is write_both_read_old we fail the barrier that will cause topology operation to rollback and in case of write_both_read_new we assert but this should not happen since the mapping is persisted by this point (or we failed in write_both_read_old state). Fixes: scylladb/scylladb#18676 (cherry picked from commit `6853b02c00`)	2024-06-05 13:55:28 +00:00
Gleb Natapov	c53cd98a41	test: add test of bootstrap where the coordinator crashes just before storing IP mapping On the next boot there is no host ID to IP mapping which causes node to crash again with "No mapping for :: in the passed effective replication map" assertion. (cherry picked from commit `27445f5291`)	2024-06-05 13:55:28 +00:00
Gleb Natapov	fa6a7cf144	schema_tables: remove unused code (cherry picked from commit `1faef47952`)	2024-06-05 13:55:28 +00:00
Patryk Jędrzejczak	65021c4b1c	[Backport 6.0] test: test_topology_ops: run correctly without tablets The values of `tablets_enabled` were nonempty strings, so they always evaluated to `True` in the if statement responsible for enabling writing workers only if tablets are disabled. Hence, the writing workers were always disabled. The original commit, `ea4717da65`, contains one more change, which is not needed (and conflicting) in 6.0 because scylladb/scylladb#18898 has been backported first. Closes scylladb/scylladb#19111	2024-06-05 15:15:00 +02:00
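The pitfall behind this fix is ordinary Python truthiness: any nonempty string is truthy, including `"false"`. The snippet below is a minimal illustration. `parse_flag` is a hypothetical helper and not necessarily how the test was actually fixed.

```python
def parse_flag(value):
    # Hypothetical helper: interpret a textual flag explicitly.
    return value.strip().lower() in ("true", "1", "yes")


tablets_enabled = "false"   # a nonempty string, as in the buggy test

# Buggy check: truthiness of the string, not of its meaning.
buggy_branch_taken = bool(tablets_enabled)

# Explicit parsing gives the intended answer.
fixed_branch_taken = parse_flag(tablets_enabled)
```

Here `buggy_branch_taken` is `True` even though the flag says "false", which is why the writing workers were always disabled.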
Botond Dénes	341c29bd74	Merge '[Backport 6.0] storage_service: Fix race between tablet split and stats retrieval' from Raphael "Raph" Carvalho Retrieval of tablet stats must be serialized with mutation to token metadata, as the former requires tablet id stability. If tablet split is finalized while retrieving stats, the saved erm, used by all shards, can have a lower tablet count than the one in a particular shard, causing an abort as tablet map requires that any id fed into it is lower than its current tablet count. Fixes https://github.com/scylladb/scylladb/issues/18085. (cherry picked from commit `abcc68dbe7`) (cherry picked from commit `551bf9dd58`) (cherry picked from commit `e7246751b6`) Refs https://github.com/scylladb/scylladb/pull/18287 Closes scylladb/scylladb#19095 * github.com:scylladb/scylladb: topology_experimental_raft/test_tablets: restore usage of check_with_down test: Fix flakiness in topology_experimental_raft/test_tablets service: Use tablet read selector to determine which replica to account table stats storage_service: Fix race between tablet split and stats retrieval	2024-06-05 13:06:32 +03:00
Aleksandra Martyniuk	e963631859	docs: describe task folding (cherry picked from commit `beef77a778`)	2024-06-05 10:09:13 +02:00
Jenkins Promoter	c6f0a3267e	Update ScyllaDB version to: 6.0.0-rc3 scylla-6.0.0-rc3 scylla-6.0.0-rc3-candidate-20240605025744	2024-06-05 10:03:47 +03:00
Marcin Maliszkiewicz	f02f2fef40	docs: remove note about performance degradation with default superuser This doesn't apply for auth-v2 as we improved data placement and removed cassandra quirk which was setting different CL for some default superuser involved operations. Fixes #18773 (cherry picked from commit `9adf74ae6c`) Closes scylladb/scylladb#18860	2024-06-05 09:04:45 +03:00
Benny Halevy	f8ae38a68c	data_dictionary: keyspace_metadata: format: print also initial_tablets Currently, there is no indication of tablets in the logged KSMetaData. Print the tablets configuration: either the `initial` number of tablets, if enabled, or {'enabled':false} otherwise. For example: ``` migration_manager - Create new Keyspace: KSMetaData{name=tablets_ks, strategyClass=org.apache.cassandra.locator.NetworkTopologyStrategy, strategyOptions={"datacenter1": "1"}, cfMetaData={}, durable_writes=true, tablets={"initial":0}, userTypes=org.apache.cassandra.config.UTMetaData@0x600004d446a8} migration_manager - Create new Keyspace: KSMetaData{name=vnodes_ks, strategyClass=org.apache.cassandra.locator.NetworkTopologyStrategy, strategyOptions={"datacenter1": "1"}, cfMetaData={}, durable_writes=true, tablets={"enabled":false}, userTypes=org.apache.cassandra.config.UTMetaData@0x600004c33ea8} ``` Signed-off-by: Benny Halevy <bhalevy@scylladb.com> (cherry picked from commit `4fe700a962`) Closes scylladb/scylladb#19009	2024-06-05 08:31:21 +03:00
Botond Dénes	8a064daccf	Update tools/java submodule * tools/java 4ee15fd9...6dfc187a (1): > Update Scylla Java driver to 3.11.5.3. [botond: regenerate frozen toolchain] Closes scylladb/scylladb#18999	2024-06-05 08:00:19 +03:00
Botond Dénes	7f540407c9	Merge '[Backport 6.0] repair: Introduce new primary replica selection algorithm for tablets' from ScyllaDB Tablet allocation does not guarantee fairness of the first replica in the replicas set across dcs. The lack of this fix causes the following dtest to fail: repair_additional_test.py::TestRepairAdditional::test_repair_option_pr_multi_dc Use the tablet_map get_primary_replica or get_primary_replica_within_dc, respectively, to see if this node is the primary replica for each tablet or not. Fixes https://github.com/scylladb/scylladb/issues/17752 No backport is required before 6.0 as tablets (and tablet repair) are introduced in 6.0 (cherry picked from commit `c52f70f92c`) (cherry picked from commit `2de79c39dc`) (cherry picked from commit `84761acc31`) (cherry picked from commit `009767455d`) (cherry picked from commit `18df36d920`) Refs #18784 Closes scylladb/scylladb#19068 * github.com:scylladb/scylladb: repair: repair_tablets: use get_primary_replica repair: repair_tablets: no need to check ranges_specified per tablet locator: tablet_map: add get_primary_replica_within_dc locator: tablet_map: get_primary_replica: do not copy tablet info locator: tablet_map: get_primary_replica: return tablet_replica	2024-06-05 07:47:24 +03:00
Aleksandra Martyniuk	50e1369d1d	test: rest_api: add test for task tree structure Add test which checks whether the tasks are folded into their parent as expected. (cherry picked from commit `d7e80a6520`)	2024-06-04 14:42:10 +00:00
Aleksandra Martyniuk	21e860453c	test: rest_api: modify new_test_module Remove remaining test tasks when a test module is removed, so that a node can shut down even if a test fails. (cherry picked from commit `fc0796f684`)	2024-06-04 14:42:10 +00:00
Dawid Medrek	fc3d2d8fde	db/hints: Introduce an error injection to test draining We want to verify that a hint directory is drained when any of the nodes corresponding to it leaves the cluster. The test scenario should happen before the whole cluster has been migrated to the host-ID-based hinted handoff, so when we still rely on the mappings between hint endpoint managers and the hint directories managed by them. To make such a test possible, in these changes we introduce an error injection rejecting incoming hints. We want to test a scenario when: 1. hints are saved towards a given node -- node N1, 2. N1 changes its IP to a different one, 3. some other node -- node N2 -- changes its IP to the original IP of N1, 4. hints are saved towards N2 and they are stored in the same directory as the hints saved towards N1 before, 5. we start draining N2. Because at some point N2 needs to be stopped, it may happen that some mutations towards a distributed system table generate a hint to N2 BEFORE it has finished changing its IP, effectively creating another hint directory where ALL of the hints towards the node will be stored from there on. That would disturb the test scenario. Hence, this error injection is necessary to ensure that all of the steps in the test proceed as expected. (cherry picked from commit `e855794327`)	2024-06-04 14:42:09 +00:00
Aleksandra Martyniuk	1d34da21a9	tasks: test: modify test_task methods Wait until the task is done in test_task::finish_failed and test_task::finish to ensure that it is folded into its parent. (cherry picked from commit `30f97ea133`)	2024-06-04 14:42:09 +00:00
Aleksandra Martyniuk	377bc345f1	api: task_manager: do not unregister task in /task_manager/wait_task/ If /task_manager/wait_task/ unregisters the task, then there is no way to examine children failures, since their statuses can be checked only through their parent. (cherry picked from commit `c1b2b8cb2c`)	2024-06-04 14:42:09 +00:00
Aleksandra Martyniuk	607be221b8	tasks: unregister tasks with parents when they are finished Unregister children that are finished from task manager. They can be examined through their parents. (cherry picked from commit `a82a2f0624`)	2024-06-04 14:42:09 +00:00
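
The `test_topology_ops` fix listed above (`65021c4b1c`) comes down to Python truthiness: any nonempty string, including `"false"`, evaluates to `True` in an `if` statement. A minimal sketch of that pitfall follows; the parameter value and the `parse_flag` helper are hypothetical illustrations, not the actual test code.

```python
def parse_flag(value: str) -> bool:
    """Interpret a textual flag explicitly instead of relying on truthiness.

    Hypothetical helper, used here only for illustration.
    """
    return value.strip().lower() in ("true", "1", "yes")


tablets_enabled = "false"  # hypothetical parameter value: a nonempty string

# Buggy check: bool("false") is True, so `not tablets_enabled` is always False
# and the writing workers would never be started.
start_workers_buggy = not tablets_enabled

# Explicit parsing gives the intended result: workers start when tablets are off.
start_workers_fixed = not parse_flag(tablets_enabled)
```

With the string `"false"`, the truthiness-based check disables the workers, while the explicit parse enables them.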

42904 Commits