scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-04-26 11:30:36 +00:00

Author	SHA1	Message	Date
Dawid Medrek	fbbb9f879a	db/hints: Remove unused aliases from manager.hh	2023-09-15 04:17:08 +02:00
Dawid Medrek	d46437a87b	db/hints: Rename end_point_hints_manager This commit renames `end_point_hints_manager` to `hint_endpoint_manager` to be consistent with other names used in the module (they all start with `hint_`).	2023-09-15 03:46:15 +02:00
Dawid Medrek	6d1eee448b	db/hints: Rename sender to hint_sender We rename the structure to highlight what exactly its purpose is.	2023-09-15 03:46:15 +02:00
Dawid Medrek	4ad0f8907c	db/hints: Move the rebalancing logic to hint_storage This commit continues modularizing manager.hh.	2023-09-15 03:46:15 +02:00
Dawid Medrek	999484466d	db/hints: Move the implementation of sender This commit continues modularizing manager.hh. After moving the declaration of sender to a dedicated header file, these changes move its implementation to a separate source file.	2023-09-15 03:46:15 +02:00
Dawid Medrek	17aabf6b9a	db/hints: Move the declaration of sender to hint_sender.hh This commit is yet another step in modularizing manager.hh. We move the declaration of sender to a dedicated file. Its implementation will follow in a future commit.	2023-09-15 03:46:15 +02:00
Dawid Medrek	1a7262ed6e	db/hints: Move sender::replay_allowed() to the source file The premise of these changes is the fact that we cannot have a cycle of #includes. Because the declaration of `sender` is going to be moved to a separate header file in a future commit, and because that header file is going to be included in the file where `end_point_hints_manager` is declared, we will need to rely on `end_point_hints_manager` being an incomplete type there. A consequence of that is that we cannot access any of `end_point_hints_manager`'s methods. This commit prepares the ground for it by moving the definition of the function to the source file where `end_point_hints_manager` will be a complete type.	2023-09-15 03:46:15 +02:00
Dawid Medrek	ad2a36bd45	db/hints: Put end_point_hints_manager in internal namespace	2023-09-15 03:46:15 +02:00
Dawid Medrek	507054012d	db/hints: Move the implementation of end_point_hints_manager This commit continues moving end_point_hints_manager to its dedicated files. After moving the declaration of the class, these changes move the implementation.	2023-09-15 03:46:15 +02:00
Dawid Medrek	f72c423984	db/hints: Move the declaration of end_point_hints_manager This commit is yet another step in modularizing manager.hh. We move the declaration of the class to a dedicated header file. The implementation will follow in a future commit.	2023-09-15 03:46:15 +02:00
Dawid Medrek	854cc0c939	db/hints: Move definitions of functions using shard hint manager We move definitions of inline methods of end_point_hints_manager and sender accessing shard hint manager to the source file, effectively un-inlining them. We need to do that to prepare for moving said structures out of manager.hh. This commit is yet another step in modularizing manager.hh.	2023-09-15 03:45:57 +02:00
Dawid Medrek	db08a85f5d	db/hints: Introduce hint_storage.hh This commit moves types used by shard hint manager and related to storing hints on disk to another file. It is yet another step in modularizing manager.hh.	2023-09-15 02:28:10 +02:00
Dawid Medrek	4814b3b19a	db/hints: Extract the logger from manager.cc This commit extracts the logger used in manager.cc to prepare the ground for modularization of manager.hh into separate smaller files. We want to preserve the logging behavior (at least for the time being), which means new files should use the same logger. These changes serve that purpose.	2023-09-15 02:24:20 +02:00
Dawid Medrek	efd6d1f57a	db/hints: Extract common types from manager.hh Currently, data structures used in manager.hh use their own aliases for gms::inet_address. It is clear they all should use the same type and having different names for it only reduces readability of the code. This commit introduces a common alias -- endpoint_id -- and gets rid of the other ones. This commit is also the first step in modularizing manager.hh by extracting common types to another file.	2023-09-15 02:23:30 +02:00
Dawid Medrek	c7fe5d7f94	utils/lister: Limit the API of scan_dir() to fs::path Right now, the function allows for passing the path to a file as a seastar::sstring, which is then converted to std::filesystem::path -- implicitly to the caller. However, the function performs I/O, and there is no reason to accept any other type than std::filesystem::path, especially because the conversion is straightforward. Callers can perform it on their own. This commit introduces the more constrained API. Closes #15266	2023-09-05 20:50:42 +03:00
Benny Halevy	2c54d7a35a	view, storage_proxy: carry effective_replication_map along with endpoints When sending mutation to remote endpoint, the selected endpoints must be in sync with the current effective_replication_map. Currently, the endpoints are sent down the storage_proxy stack, and later on an effective_replication_map is retrieved again, and it might not match the target or pending endpoints, similar to the case seen in https://github.com/scylladb/scylladb/issues/15138 The correct way is to carry the same effective replication map used to select said endpoints and pass it down the stack. See also https://github.com/scylladb/scylladb/pull/15141 Fixes scylladb/scylladb#15144 Fixes scylladb/scylladb#14730 Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Closes #15142	2023-08-29 09:08:42 +03:00
Kamil Braun	93be4c0cb0	Merge 'Base node liveliness consistently on gossiper::is_alive' from Benny Halevy Currently the gossiper marks endpoint_state objects as alive/dead. In some cases the endpoint_state::is_alive function is checked but in many other cases gossiper::is_alive(endpoint) is used to determine if the endpoint is alive. This series removes the endpoint_state::is_alive state and moves all the logic to gossiper::is_alive that bases its decision on the endpoint having an endpoint_state and being in the _live_endpoints set. For that, the _live_endpoints is made sure to be replicated to all shards when changed and the endpoint_state changes are serialized under lock_endpoint, and also making sure that the endpoint_state in the _endpoint_states_map is never updated in place, but rather a temporary copy is changed and then safely replicated using gossiper::replicate Refs https://github.com/scylladb/scylladb/issues/14794 Closes #14801 * github.com:scylladb/scylladb: gossiper: mark_alive: remove local_state param endpoint_state: get rid of _is_alive member and methods gossiper: is_alive: use _live_endpoints gossiper: evict_from_membership: erase endpoint from _live_endpoints gossiper: replicate_live_endpoints_on_change: use _live_endpoints_version to detect change gossiper: run: no need to replicate live_endpoints gossiper: fold update_live_endpoints_version into replicate_live_endpoints_on_change gossiper: add mutate_live_and_unreachable_endpoints gossiper: reset_endpoint_state_map: clear also shadow endpoint sets gossiper: reset_endpoint_state_map: clear live/unreachable endpoints on all shards gossiper: functions that change _live_endpoints must be called on shard 0 gossiper: add lock_endpoint_update_semaphore gossiper: make _live_endpoints an unordered_set endpoint_state: use gossiper::is_alive externally	2023-08-23 17:18:05 +02:00
Petr Gusev	439c91851f	hints: add debug log for dropped hints Dropping data is a rather important event, let's log it at least at the debug level. It'll help in debugging tests.	2023-08-22 15:48:40 +04:00
Petr Gusev	9fd3df13a2	hints: send_one_hint: extend the scope of file_send_gate holder The problem was that the holder in with_gate call was released too early. This happened before the possible call to on_hint_send_failure in then_wrapped. As a result, the effects of on_hint_send_failure (segment_replay_failed flag) were not visible in send_one_file after ctx_ptr->file_send_gate.close(), so we could decide that the segment was sent in full and delete it even if sending of some hints led to errors. Fixes #15110	2023-08-22 15:48:40 +04:00
Petr Gusev	1b7603af23	hints manager: add send_errors counter There was no indication of problems in the hints manager metrics before. We need this counter for fencing tests in the later commit, but it seems to be useful on its own.	2023-08-22 14:31:04 +04:00
Benny Halevy	97061cc3b8	endpoint_state: use gossiper::is_alive externally Before we remove endpoint_state::_is_alive to rely solely on gossiper::_live_endpoints. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-08-22 09:06:09 +03:00
Patryk Jędrzejczak	02618831ef	db: hints: add checksum to sync point encoding sync point API provided with an incorrect sync point id might allocate a crazy amount of memory and fail with std::bad_alloc. To fix this, we can check if the encoded sync point has been modified before decoding. We can achieve this by calculating a checksum before encoding, appending it to the encoded sync point, and comparing it with a checksum calculated in db::hints::decode before decoding.	2023-07-17 16:05:07 +02:00
Patryk Jędrzejczak	0a424e1760	db: hints: add the version_size constant The next commit changes the format of encoding sync points to V2. The new format appends the checksum to the encoded sync points and its implementation uses the checksum_size constant - the number of bytes required to store the checksum. To increase consistency and readability, we can additionally add and use the version_size constant. Definitions of sync_point::decode and sync_point::encode are slightly changed so that they don't depend on the version_size value and make implementation of the V2 format easier.	2023-07-17 16:02:18 +02:00
Pavel Emelyanov	66e43912d6	code: Switch to seastar API level 7 In that level no io_priority_class-es exist. Instead, all the IO happens in the context of current sched-group. File API no longer accepts prio class argument (and makes io_intent arg mandatory to impls). So the change consists of - removing all usage of io_priority_class - patching file_impl's inheritants to updated API - priority manager goes away altogether - IO bandwidth update is performed on respective sched group - tune-up scylla-gdb.py io_queues command The first change is huge and was made semi-automatically by: - grep io_priority_class \| default_priority_class - remove all calls, found methods' args and class' fields Patching file_impl-s is smaller, but also mechanical: - replace io_priority_class& argument with io_intent* one - pass intent to lower file (if applicable) Dropping the priority manager is: - git-rm .cc and .hh - sed out all the #include-s - fix configure.py and cmakefile The scylla-gdb.py update is a bit hairy -- it needs to use task queues list for IO classes names and shares, but to detect whether it should, it checks whether the "commitlog" group is present. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes #13963	2023-06-06 13:29:16 +03:00
Kamil Braun	beabb61566	test: reproducer for hints manager shutdown hang	2023-05-29 11:03:39 +02:00
Tomasz Grabiec	9b17ad3771	locator: Introduce per-table replication strategy Will be used by tablet-based replication strategies, for which effective replication map is different per table. Also, this patch adapts existing users of effective replication map to use the per-table effective replication map. For simplicity, every table has an effective replication map, even if the erm is per keyspace. This way the client code can be uniform and doesn't have to check whether replication strategy is per table. Not all users of per-keyspace get_effective_replication_map() are adapted yet to work per-table. Those algorithms will throw an exception when invoked on a keyspace which uses per-table replication strategy.	2023-04-24 10:49:36 +02:00
Benny Halevy	f3d5df5448	locator: add class node And keep per node information (idx, host_id, endpoint, dc_rack, is_pending) in node objects, indexed by topology on several indices like: idx, host_id, endpoint, current/pending, per dc, per dc/rack. The node index is a shorthand identifier for the node. node* and index are valid while the respective topology instance is valid. To be used, the caller must hold on to the topology / token_metadata object (e.g. via a token_metadata_ptr or effective_replication_map) Refs #6403 Signed-off-by: Benny Halevy <bhalevy@scylladb.com> topology: add node idx Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-04-02 20:13:02 +03:00
Kefu Chai	c37f4e5252	treewide: use fmt::join() when appropriate now that fmtlib provides fmt::join(). see https://fmt.dev/latest/api.html#_CPPv4I0EN3fmt4joinE9join_viewIN6detail10iterator_tI5RangeEEN6detail10sentinel_tI5RangeEEERR5Range11string_view there is no need to reinvent the wheel. so in this change, the homebrew join() is replaced with fmt::join(). as fmt::join() returns a join_view(), this could improve the performance under certain circumstances where the fully materialized string is not needed. please note, the goal of this change is to use fmt::join(), and this change does not intend to improve the performance of existing implementation based on "operator<<" unless the new implementation is much more complicated. we will address the unnecessarily materialized strings in a follow-up commit. some noteworthy things related to this change: * unlike the existing `join()`, `fmt::join()` returns a view. so we have to materialize the view if what we expect is a `sstring` * `fmt::format()` does not accept a view, so we cannot pass the return value of `fmt::join()` to `fmt::format()` * fmtlib does not format a typed pointer, i.e., it does not format, for instance, a `const std::string`. but operator<<() always print a typed pointer. so if we want to format a typed pointer, we either need to cast the pointer to `void` or use `fmt::ptr()`. * fmtlib is not able to pick up the overload of `operator<<(std::ostream& os, const column_definition* cd)`, so we have to use a wrapper class of `maybe_column_definition` for printing a pointer to `column_definition`. since the overload is only used by the two overloads of `statement_restrictions::add_single_column_parition_key_restriction()`, the operator<< for `const column_definition*` is dropped. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-03-16 20:34:18 +08:00
Avi Kivity	6aa91c13c5	Merge 'Optimize topology::compare_endpoints' from Benny Halevy The code for compare_endpoints originates at the dawn of time (`bc034aeaec`) and is called on the fast path from storage_proxy via `sort_by_proximity`. This series considerably reduces the function's footprint by: 1. carefully coding the many comparisons in the function so as to reduce the number of conditional branches (apparently the compiler isn't doing a good enough job at optimizing it in this case) 2. avoid sstring copy in topology::get_{datacenter,rack} Closes #12761 * github.com:scylladb/scylladb: topology: optimize compare_endpoints to_string: add print operators for std::{weak,partial}_ordering utils: to_sstring: deinline std::strong_ordering print operator move to_string.hh to utils/ test: network_topology: add test_topology_compare_endpoints	2023-03-07 15:17:19 +02:00
Kefu Chai	0cb842797a	treewide: do not define/capture unused variables these warnings are found by Clang-17 after removing `-Wno-unused-lambda-capture` and '-Wno-unused-variable' from the list of disabled warnings in `configure.py`. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-02-15 22:57:18 +02:00
Benny Halevy	25ebc63b82	move to_string.hh to utils/ Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-02-15 11:09:04 +02:00
Avi Kivity	c5e4bf51bd	Introduce mutation/ module Move mutation-related files to a new mutation/ directory. The names are kept in the global namespace to reduce churn; the names are unambiguous in any case. mutation_reader remains in the readers/ module. mutation_partition_v2.cc was missing from CMakeLists.txt; it's added in this patch. This is a step forward towards librarization or modularization of the source base. Closes #12788	2023-02-14 11:19:03 +02:00
Benny Halevy	68141d0aac	topology: get rid of pending state Now, with `a44ca06906`, is_normal_token_owner that replaced is_member does not rely anymore on the pending status of endpoints in topology. With that we can get rid of this state and just keep all endpoints we know about in the topology. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-12-13 14:17:18 +02:00
Benny Halevy	243dc2efce	hints: host_filter: check topology::has_endpoint if enabled_selectively Don't call get_datacenter(ep) without checking first has_endpoint(ep) since the former may abort on internal error if the endpoint is not listed in topology. Refs #11870 Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Closes #12054	2022-11-24 14:33:06 +03:00
Botond Dénes	437fcdeeda	Merge 'Make use of enum_set in directory lister' from Pavel Emelyanov The lister accepts sort of a filter -- what kind of entries to list, regular, directories or both. It currently uses unordered_set, but enum_set is shorter and better describes the intent. Closes #12017 * github.com:scylladb/scylladb: lister: Make lister::dir_entry_types an enum_set database: Avoid useless local variable	2022-11-18 12:15:26 +02:00
Asias He	4571fcf9e7	token_metadata: Rename is_member to is_normal_token_owner The name is_normal_token_owner is more clear than is_member. The is_normal_token_owner reflects what it really checks.	2022-11-18 09:29:20 +08:00
Pavel Emelyanov	bc62ca46d4	lister: Make lister::dir_entry_types an enum_set This type is currently an unordered_set, but only consists of at most two elements. Making it an enum_set renders it into a size_t variable and better describes the intention. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-11-17 19:01:45 +03:00
Botond Dénes	a9573b84c5	Merge 'commitlog: Revert/modify `fac2bc4` - do footprint add in delete' from Calle Wilund Fixes #11184 Fixes #11237 In prev (broken) fix for https://github.com/scylladb/scylladb/issues/11184 we added the footprint for left-over files (replay candidates) to disk footprint on commitlog init. This effectively prevents us from creating segments iff we have tight limits. Since we nowadays do quite a bit of inserts _before_ commitlog replay (system.local, but...) we can end up in a situation where we deadlock start because we cannot get to the actual replay that will eventually free things. Another, not thought through, consequence is that we add a single footprint to _all_ commitlog shard instances - even though only shard 0 will get to actually replay + delete (i.e. drop footprint). So shards 1-X would all be either locked out or performance degraded. Simplest fix is to add the footprint in delete call instead. This will lock out segment creation until delete call is done, but this is fast. Also ensures that only replay shard is involved. To further emphasize this, don't store segments found on init scan in all shard instances, instead retrieve (based on low time-pos for current gen) when required. This changes very little, but we at least don't store pointless string lists in shards 1 to X, and also we can potentially ask for the list twice. More to the point, goes better hand-in-hand with the semantics of "delete_segments", where any file sent in is considered candidate for recycling, and included in footprint. Closes #11251 * github.com:scylladb/scylladb: commitlog: Make get_segments_to_replay on-demand commitlog: Revert/modify `fac2bc4` - do footprint add in delete	2022-08-15 09:10:32 +03:00
Benny Halevy	d295d8e280	everywhere: define locator::host_id as a strong tagged_uuid type So it can be distinguished from other uuid-based identifiers in the system. Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Closes #11276	2022-08-12 06:01:44 +03:00
Calle Wilund	a729c2438e	commitlog: Make get_segments_to_replay on-demand Refs #11237 Don't store segments found on init scan in all shard instances, instead retrieve (based on low time-pos for current gen) when required. This changes very little, but we at least don't store pointless string lists in shards 1 to X, and also we can potentially ask for the list twice. More to the point, goes better hand-in-hand with the semantics of "delete_segments", where any file sent in is considered candidate for recycling, and included in footprint.	2022-08-11 06:41:23 +00:00
Benny Halevy	2b017ce285	schema, everywhere: define and use table_schema_version as a strong type Define table_schema_version as a distinct tagged_uuid class, So it can be differentiated from other uuid-class types, in particular table_id. Added reversed(table_schema_version) for convenience and uniformity since the same logic is currently open coded in several places. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-08-08 08:09:45 +03:00
Benny Halevy	1fda686f96	idl: make idl headers self-sufficient Add include statements to satisfy dependencies. Delete, now unneeded, include directives from the upper level source files. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-08-08 08:02:27 +03:00
Benny Halevy	cfc7e9aa59	db: hints: sync_point: do not include idl definition file idl definition files are not intended for direct inclusion in .cc files. Data types it represents are supposed to be defined in a regular C++ header, so define them in db/hints/sync_point.hh and include it rather than idl/hinted_handoff.idl.hh. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-08-08 08:02:27 +03:00
Pavel Emelyanov	820be06ac1	hints: Remove snitch dependency After the previous patch, the hints manager class has an unused dependency on snitch. While removing it, it turns out that several unrelated places get needed headers indirectly via the host_filter.hh -> snitch_base.hh inclusion. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-06-22 11:47:26 +03:00
Pavel Emelyanov	9b6312687b	hints: Get rack/datacenter from topology The topology reference is obtained from the proxy anchor pointer sitting on manager. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-06-22 11:47:26 +03:00
Avi Kivity	4b53af0bd5	treewide: replace parallel_for_each with coroutine::parallel_for_each in coroutines coroutine::parallel_for_each avoids an allocation and is therefore preferred. The lifetime of the function object is less ambiguous, and so it is safer. Replace all eligible occurrences (i.e. caller is a coroutine). One case (storage_service::node_ops_cmd_heartbeat_updater()) needed a little extra attention since there was a handle_exception() continuation attached. It is converted to a try/catch. Closes #10699	2022-05-31 09:06:24 +03:00
Avi Kivity	528ab5a502	treewide: change metric calls from make_derive to make_counter make_derive was recently deprecated in favor of make_counter, so make the change throughout the codebase. Closes #10564	2022-05-14 12:53:55 +02:00
Calle Wilund	d478896d46	commitlog: kill non-recycled segment management It has been default for a while now. Makes no sense to not do it. Even hints can use it (even if it makes no difference there)	2022-04-11 16:34:00 +00:00
Benny Halevy	ebbbf1e687	lister: move to utils There's nothing specific to scylla in the lister classes, they could (and maybe should) be part of the seastar library. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-02-28 12:36:03 +02:00
Avi Kivity	fcb8d040e8	treewide: use Software Package Data Exchange (SPDX) license identifiers Instead of lengthy blurbs, switch to single-line, machine-readable standardized (https://spdx.dev) license identifiers. The Linux kernel switched long ago, so there is strong precedent. Three cases are handled: AGPL-only, Apache-only, and dual licensed. For the latter case, I chose (AGPL-3.0-or-later and Apache-2.0), reasoning that our changes are extensive enough to apply our license. The changes were applied mechanically with a script, except to licenses/README.md. Closes #9937	2022-01-18 12:15:18 +01:00

229 Commits