scylladb

Author	SHA1	Message	Date
Avi Kivity	aa1270a00c	treewide: change assert() to SCYLLA_ASSERT() assert() is traditionally disabled in release builds, but not in scylladb. This hasn't caused problems so far, but the latest abseil release includes a commit [1] that causes a 1000 insn/op regression when NDEBUG is not defined. Clearly, we must move towards a build system where NDEBUG is defined in release builds. But we can't just define it blindly without vetting all the assert() calls, as some were written with the expectation that they are enabled in release mode. To solve the conundrum, change all assert() calls to a new SCYLLA_ASSERT() macro in utils/assert.hh. This macro is always defined and is not conditional on NDEBUG, so we can later (after vetting Seastar) enable NDEBUG in release mode. [1] `66ef711d68` Closes scylladb/scylladb#20006	2024-08-05 08:23:35 +03:00
Avi Kivity	fdc1449392	treewide: rename flat_mutation_reader_v2 to mutation_reader flat_mutation_reader_v2 was introduced in a pair of commits in 2021: `e3309322c3` "Clone flat_mutation_reader related classes into v2 variants" `08b5773c12` "Adapt flat_mutation_reader_v2 to the new version of the API" as a replacement for flat_mutation_reader, using range_tombstone_change instead of range_tombstone to represent represent range tombstones. See those commits for more information. The transition was incremental; the last use of the original flat_mutation_reader was removed in 2022 in commit `026f8cc1e7` "db: Use mutation_partition_v2 in mvcc" In turn, flat_mutation_reader was introduced in 2017 in commit `748205ca75` "Introduce flat_mutation_reader" To transition from a mutation_reader that nested rows within a partition in a separate stream, to a flat reader that streamed partitions and rows in the same stream. Here, we reclaim the original name and rename the awkward flat_mutation_reader_v2 to mutation_reader. Note that mutation_fragment_v2 remains since we still use the original for compatibilty, sometimes. Some notes about the transition: - files were also renamed. In one case (flat_mutation_reader_test.cc), the rename target already existed, so we rename to mutation_reader_another_test.cc. - a namespace 'mutation_reader' with two definitions existed (in mutation_reader_fwd.hh). Its contents was folded into the mutation_reader class. As a result, a few #includes had to be adjusted. Closes scylladb/scylladb#19356	2024-06-21 07:12:06 +03:00
Pavel Emelyanov	8906126a2c	stream_manager: Remove system_distributed_keyspace and view_update_generator Now all the code is happy with view_builder and can be shortened Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2024-05-23 13:41:56 +03:00
Pavel Emelyanov	ae2dcdc7c2	streaming: Remove system_distributed_keyspace and view_update_generator Now all the code is happy with view_builder and can be shortened Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2024-05-23 13:41:55 +03:00
Pavel Emelyanov	57517d5987	streaming: Proparage view_builder& down to make_streaming_consumer() Continuation of the previous patch. Repair itself doesn't need it, but streaming consumer does. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2024-05-23 13:41:46 +03:00
Tomasz Grabiec	fdcaaea91a	tablets, streaming: Implement tablet streaming for intra-node migration	2024-05-16 00:28:46 +02:00
Pavel Emelyanov	1f44a374b8	error_injection: Overload inject() instead of inject_with_handler() The inject_with_handler() method accepts a coroutine that can be called wiht injection_handler. With such function as an argument, there's no need in distinctive inject_with_handler() name for a method, it can be overload of all the existing inject()-s Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2024-03-11 19:30:19 +03:00
Asias He	83a28342ea	service: Drop unused table param from session_topology_guard The table param is not used. Dropping it so it can be used in places where the table object is not available. Closes scylladb/scylladb#17628	2024-03-07 09:34:40 +02:00
Aleksandra Martyniuk	2ea5d9b623	test: test drop table on receiver side during streaming	2024-02-15 12:06:47 +01:00
Asias He	e7e1f4b01a	streaming: Fix rpc::source and rpc::optional parameter order The new rpc::optional parameter must come after any existing parameters, including the rpc::source parameters, otherwise it will break compatibility. The regression was introduced in: ``` commit `fd3c089ccc` Author: Tomasz Grabiec <tgrabiec@scylladb.com> Date: Thu Oct 26 00:35:19 2023 +0200 service: range_streamer: Propagate topology_guard to receivers ``` We need to backport this patch ASAP before we release anything that contains commit `fd3c089ccc`. Refs: #16941 Fixes: #17175 Closes scylladb/scylladb#17176	2024-02-06 13:15:28 +01:00
Avi Kivity	7cb1c10fed	treewide: replace seastar::future::get0() with seastar::future::get() get0() dates back from the days where Seastar futures carried tuples, and get0() was a way to get the first (and usually only) element. Now it's a distraction, and Seastar is likely to deprecate and remove it. Replace with seastar::future::get(), which does the same thing.	2024-02-02 22:12:57 +08:00
Kefu Chai	f86a5ae87a	streaming: do not include unused headers these unused includes were identified by clangd. see https://clangd.llvm.org/guides/include-cleaner#unused-include-warning for more details on the "Unused include" warning. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#16947	2024-01-23 19:38:30 +02:00
Avi Kivity	9c0f05efa1	Merge 'Track tablet streaming under global sessions to prevent side-effects of failed streaming' from Tomasz Grabiec Tablet streaming involves asynchronous RPCs to other replicas which transfer writes. We want side-effects from streaming only within the migration stage in which the streaming was started. This is currently not guaranteed on failure. When streaming master fails (e.g. due to RPC failing), it can be that some streaming work is still alive somewhere (e.g. RPC on wire) and will have side-effects at some point later. This PR implements tracking of all operations involved in streaming which may have side-effects, which allows the topology change coordinator to fence them and wait for them to complete if they were already admitted. The tracking and fencing is implemented by using global "sessions", created for streaming of a single tablet. Session is globally identified by UUID. The identifier is assigned by the topology change coordinator, and stored in system.tablets. Sessions are created and closed based on group0 state (tablet metadata) by the barrier command sent to each replica, which we already do on transitions between stages. Also, each barrier waits for sessions which have been closed to be drained. The barrier is blocked only if there is some session with work which was left behind by unsuccessful streaming. In which case it should not be blocked for long, because streaming process checks often if the guard was left behind and stops if it was. This mechanism of tracking is fault-tolerant: session id is stored in group0, so coordinator can make progress on failover. The barriers guarantee that session exists on all replicas, and that it will be closed on all replicas. Closes scylladb/scylladb#15847 * github.com:scylladb/scylladb: test: tablets: Add test for failed streaming being fenced away error_injection: Introduce poll_for_message() error_injection: Make is_enabled() public api: Add API to kill connection to a particular host range_streamer: Do not block topology change barriers around streaming range_streamer, tablets: Do not keep token metadata around streaming tablets: Fail gracefully when migrating tablet has no pending replica storage_service, api: Add API to disable tablet balancing storage_service, api: Add API to migrate a tablet storage_service, raft topology: Run streaming under session topology guard storage_service, tablets: Use session to guard tablet streaming tablets: Add per-tablet session id field to tablet metadata service: range_streamer: Propagate topology_guard to receivers streaming: Always close the rpc::sink storage_service: Introduce concept of a topology_guard storage_service: Introduce session concept tablets: Fix topology_metadata_guard holding on to the old erm docs: Document the topology_guard mechanism	2023-12-07 16:29:02 +02:00
Tomasz Grabiec	7d0f4c10a2	test: tablets: Add test for failed streaming being fenced away	2023-12-06 18:37:01 +01:00
Tomasz Grabiec	9dac0febce	range_streamer: Do not block topology change barriers around streaming Streaming was keeping effective_replication_map_ptr around the whole process, which blocks topology change barriers. This will inhibit progress of tablet load balancer or concurrent migrations, resulting in worse performance. Fix by switching to the most recent erm on sharder calls. multishard_writer calls shard_of() for each new partition. A better way would be to switch immediately when topology version changes, but this is left for later.	2023-12-06 18:36:17 +01:00
Tomasz Grabiec	fd3c089ccc	service: range_streamer: Propagate topology_guard to receivers	2023-12-06 18:36:16 +01:00
Tomasz Grabiec	063095ea50	streaming: Always close the rpc::sink rpc::sink::~sink aborts if not closed. There is a try/catch clause which ensures that close() is called, but there was code after sink is created which is not covered by it. Move sink construction past that code.	2023-12-06 18:35:41 +01:00
Benny Halevy	6f7de427f0	streaming: use locator::topology rather than fb_utilities Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-12-05 16:12:11 +02:00
Kamil Braun	819f542ee6	migration_manager: take `abort_source&` in get_schema_for_read/write No callsite needed the `nullptr` case, so we can convert pointer to reference.	2023-11-23 17:23:47 +01:00
Asias He	2b2302d373	streaming: Ignore dropped table on both sides It is possible the sender and receiver of streaming nodes have different views on if a table is dropped or not. For example: - n1, n2 and n3 in the cluster - n4 started to join the cluster and stream data from n1, n2, n3 - a table was dropped - n4 failed to write data from n2 to sstable because a table was dropped - n4 ended the streaming - n2 checked if the table was present and would ignore the error if the table was dropped - however n2 found the table was still present and was not dropped - n2 marked the streaming as failed This will fail the streaming when a table is dropped. We want streaming to ignore such dropped tables. In this patch, a status code is sent back to the sender to notify the table is dropped so the sender could ignore the dropped table. Fixes #15370 Closes scylladb/scylladb#15912	2023-11-03 13:38:48 +02:00
Pavel Emelyanov	b974d8ca1b	stream_session: Do not print banign exceptions with error level Handler of STREAM_MUTATION_FRAGMENTS verb creates and starts reader. The resulting future is then checked for being exceptional and an error message is printed in logs. However, if reader fails because of socket being closed by peer, the error looks excessive. In that case the exception is just regular handling of the socket/stream closure and can be demoted down to debug level. fixes: #15891 Similar cherry-picking of log level exists in e.g. storage proxy, see for example `56bd9b5d` (service: storage_proxy: do not report abort requests in handle_write ) Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes scylladb/scylladb#15892	2023-10-31 14:21:22 +02:00
Botond Dénes	4a02865ea1	Merge 'Prevent invalidation of iterators over database::_column_families' from Aleksandra Martyniuk Maps related to column families in database are extracted to a column_families_data class. Access to them is possible only through methods. All methods which may preempt hold rwlock in relevant mode, so that the iterators can't become invalid. Fixes: #13290 Closes #13349 * github.com:scylladb/scylladb: replica: make tables_metadata's attributes private replica: add methods to get a filtered copy of tables map replica: add methods to check if given table exists replica: add methods to get table or table id replica: api: return table_id instead of const table_id& replica: iterate safely over tables related maps replica: pass tables_metadata to phased_barrier_top_10_counts replica: add methods to safely add and remove table replica: wrap column families related maps into tables_metadata replica: futurize database::add_column_family and database::remove	2023-07-31 15:31:59 +03:00
Tomasz Grabiec	f88220aeee	stream_transfer_task, multishard_writer: Work with table sharder So that we can use it on tablet-based tables.	2023-07-25 21:08:51 +02:00
Aleksandra Martyniuk	cdbfa0b2f5	replica: iterate safely over tables related maps Loops over _column_families and _ks_cf_to_uuid which may preempt are protected by reader mode of rwlock so that iterators won't get invalid.	2023-07-25 17:13:04 +02:00
Aleksandra Martyniuk	52afd9d42d	replica: wrap column families related maps into tables_metadata As a preparation for ensuring access safety for column families related maps, add tables_metadata, access to members of which would be protected by rwlock.	2023-07-25 16:13:00 +02:00
Pavel Emelyanov	66e43912d6	code: Switch to seastar API level 7 In that level no io_priority_class-es exist. Instead, all the IO happens in the context of current sched-group. File API no longer accepts prio class argument (and makes io_intent arg mandatory to impls). So the change consists of - removing all usage of io_priority_class - patching file_impl's inheritants to updated API - priority manager goes away altogether - IO bandwidth update is performed on respective sched group - tune-up scylla-gdb.py io_queues command The first change is huge and was made semi-autimatically by: - grep io_priority_class \| default_priority_class - remove all calls, found methods' args and class' fields Patching file_impl-s is smaller, but also mechanical: - replace io_priority_class& argument with io_intent* one - pass intent to lower file (if applicatble) Dropping the priority manager is: - git-rm .cc and .hh - sed out all the #include-s - fix configure.py and cmakefile The scylla-gdb.py update is a bit hairry -- it needs to use task queues list for IO classes names and shares, but to detect it should it checks for the "commitlog" group is present. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes #13963	2023-06-06 13:29:16 +03:00
Avi Kivity	26c8470f65	treewide: use #include <seastar/...> for seastar headers We treat Seastar as an external library, so fix the few places that didn't do so to use angle brackets. Closes #14037	2023-06-06 08:36:09 +03:00
Gleb Natapov	7caf1d26fb	migration manager: Make schema pull abortable. Now which schema pull may issues raft read barrier it may stuck if majority is not available. Make the operation abortable and abort it during queries if timeout is reached.	2023-05-11 16:31:23 +03:00
Botond Dénes	156e5d346d	reader_permit: keep trace_state pointer on permit And propagate it down to where it is created. This will be used to add trace points for semaphore related events, but this will come in the next patches.	2023-03-22 04:58:01 -04:00
Benny Halevy	1392c7e1cf	streaming: stream_session: maybe_yield To prevent reactor stalls when freeing many/long token range vectors. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-02-28 17:32:44 +02:00
Benny Halevy	c4836ab9e9	streaming: stream_session: prepare: move token ranges to add_transfer_ranges Reduce copies on the path to calling add_transfer_ranges. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-02-28 17:04:47 +02:00
Pavel Emelyanov	f51762c72a	headers: Refine view_update_generator.hh and around The initial intent was to reduce the fanout of shared_sstable.hh through v.u.g.hh -> cql_test_env.hh chain, but it also resulted in some shots around v.u.g.hh -> database.hh inclusion. By and large: - v.u.g.hh doesn't need database.hh - cql_test_env.hh doesn't need v.u.g.hh (and thus -- the shared_sstable.hh) but needs database.hh instead - few other .cc files need v.u.g.hh directly as they pulled it via cql_test_env.hh before - add forward declarations in few other places Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes #12952	2023-02-22 09:32:30 +02:00
Kefu Chai	0cb842797a	treewide: do not define/capture unused variables these warnings are found by Clang-17 after removing `-Wno-unused-lambda-capture` and '-Wno-unused-variable' from the list of disabled warnings in `configure.py`. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-02-15 22:57:18 +02:00
Avi Kivity	69a385fd9d	Introduce schema/ module Schema related files are moved there. This excludes schema files that also interact with mutations, because the mutation module depends on the schema. Those files will have to go into a separate module. Closes #12858	2023-02-15 11:01:50 +02:00
Avi Kivity	c5e4bf51bd	Introduce mutation/ module Move mutation-related files to a new mutation/ directory. The names are kept in the global namespace to reduce churn; the names are unambiguous in any case. mutation_reader remains in the readers/ module. mutation_partition_v2.cc was missing from CMakeLists.txt; it's added in this patch. This is a step forward towards librarization or modularization of the source base. Closes #12788	2023-02-14 11:19:03 +02:00
Asias He	b9e5e340aa	streaming: Enable offstrategy for all classic streaming based node ops This patch enables offstrategy compaction for all classic streaming based node ops. We can use this method because tables are streamed one after another. As long as there is still streamed data for a given table, we update the automatic trigger timer. When all the streaming has finished, the trigger timer will timeout and fire the offstrategy compaction for the given table. I checked with this patch, rebuild is 3X faster. There was no compaction in the middle of the streaming. The streamed sstables are compacted together after streaming is done. Time Before: INFO 2022-11-25 10:06:08,213 [shard 0] range_streamer - Rebuild succeeded, took 67 seconds, nr_ranges_remaining=0 Time After: INFO 2022-11-25 09:42:50,943 [shard 0] range_streamer - Rebuild succeeded, took 23 seconds, nr_ranges_remaining=0 Compaciton Before: 88 sstables were written -> 88 sstables were added into main set Compaction After: 88 sstables written -> after offstretegy 2 sstables were added into main seet Closes #11848	2022-12-28 11:12:02 +02:00
Benny Halevy	314e45d957	streaming: define plan_id as a strong tagged_uuid type Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-08-22 19:45:30 +03:00
Benny Halevy	2b017ce285	schema, everywhere: define and use table_schema_version as a strong type Define table_schema_version as a distinct tagged_uuid class, So it can be differentiated from other uuid-class types, in particular table_id. Added reversed(table_schema_version) for convenience and uniformity since the same logic is currently open coded in several places. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-08-08 08:09:45 +03:00
Benny Halevy	257d74bb34	schema, everywhere: define and use table_id as a strong type Define table_id as a distinct utils::tagged_uuid modeled after raft tagged_id, so it can be differentiated from other uuid-class types, in particular from table_schema_version. Fixes #11207 Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-08-08 08:09:41 +03:00
Asias He	953af38281	streaming: Allow drop table during streaming Currently, if a table is dropped during streaming, the streaming would fail with no_such_column_family error. Since the table is dropped anyway, it makes more sense to ignore the streaming result of the dropped table, whether it is successful or failed. This allows users to drop tables during node operations, e.g., bootstrap or decommission a node. This is especially useful for the cloud users where it is hard to coordinate between a node operation by admin and user cql change. This patch also fixes a possible user after free issue by not passing the table reference object around. Fixes #10395 Closes #10396	2022-04-24 17:43:20 +03:00
Botond Dénes	b029bd3db7	tree: remove mutation_reader.hh include In most files it was unused. We should move these to the patch which moved out the last interesting reader from mutation_reader.hh (and added the corresponding new header include) but its probably not worth the effort. Some other files still relied on mutation_reader.hh to provide reader concurrency semaphore and some other misc reader related definitions.	2022-03-30 15:42:51 +03:00
Botond Dénes	fcf15fda94	readers: generating_reader: use noncopyable_function<> std::function<> requires the functor it wraps to be copyable, which is an unnecessarily strict requirement. To relax this, we use noncopyable_function<> instead. Since the former seems to lack some disambiguation magic of the latter, we add `_v1` and `_v2` postfixes to manually disambiguate.	2022-03-17 06:53:44 +02:00
Botond Dénes	35bbd54946	readers: merge generating.hh into generating_v2.hh Both variants return a v2 reader and we are going to keep both for a time to come.	2022-03-17 06:52:28 +02:00
Botond Dénes	7844ff9912	readers/generating.hh: return v2 reader from make_generating_reader() For now, the (v1) reader is just upgraded to v2 behind the scenes.	2022-03-17 06:51:20 +02:00
Mikołaj Sielużycki	1d84a254c0	flat_mutation_reader: Split readers by file and remove unnecessary includes. The flat_mutation_reader files were conflated and contained multiple readers, which were not strictly necessary. Splitting optimizes both iterative compilation times, as touching rarely used readers doesn't recompile large chunks of codebase. Total compilation times are also improved, as the size of flat_mutation_reader.hh and flat_mutation_reader_v2.hh have been reduced and those files are included by many file in the codebase. With changes real 29m14.051s user 168m39.071s sys 5m13.443s Without changes real 30m36.203s user 175m43.354s sys 5m26.376s Closes #10194	2022-03-14 13:20:25 +02:00
Botond Dénes	ad1b157452	streaming/consumer: convert to v2 At least on the API level, internally there are still conversions, but these are going to be sorted out in the next patches too.	2022-03-02 09:55:09 +02:00
Avi Kivity	fcb8d040e8	treewide: use Software Package Data Exchange (SPDX) license identifiers Instead of lengthy blurbs, switch to single-line, machine-readable standardized (https://spdx.dev) license identifiers. The Linux kernel switched long ago, so there is strong precedent. Three cases are handled: AGPL-only, Apache-only, and dual licensed. For the latter case, I chose (AGPL-3.0-or-later and Apache-2.0), reasoning that our changes are extensive enough to apply our license. The changes we applied mechanically with a script, except to licenses/README.md. Closes #9937	2022-01-18 12:15:18 +01:00
Avi Kivity	bbad8f4677	replica: move ::database, ::keyspace, and ::table to replica namespace Move replica-oriented classes to the replica namespace. The main classes moved are ::database, ::keyspace, and ::table, but a few ancillary classes are also moved. There are certainly classes that should be moved but aren't (like distributed_loader) but we have to start somewhere. References are adjusted treewide. In many cases, it is obvious that a call site should not access the replica (but the data_dictionary instead), but that is left for separate work. scylla-gdb.py is adjusted to look for both the new and old names.	2022-01-07 12:04:38 +02:00
Raphael S. Carvalho	426450dc04	treewide: remove useless include of database.hh Wrote a script based on cpp-include to find places that needlessly included database.hh, which is expensive to process during build time. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <20220104204359.168895-1-raphaelsc@scylladb.com>	2022-01-05 10:15:19 +02:00
Pavel Emelyanov	a3b4d4d3cf	stream_session: Use manager reference from result-future When the stream_session initializes it's being equipped with the shared-pointer on the stream_result_future very early. In all the places where stream_session needs the manager this pointer is alive and session get get manager from it. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-11-24 12:17:37 +03:00

1 2 3 4 5 ...

328 Commits