scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-05-30 11:36:54 +00:00

Author	SHA1	Message	Date
Kamil Braun	ff4ecfa182	dht: boot_strapper: check if keyspace still exists in `bootstrap` While we're iterating over the fetched keyspace names, some of these keyspaces may get dropped. Handle that by checking if the keyspace still exists. Also, when retrieving the replication strategy from the keyspace, store the pointer (which is an `lw_shared_ptr`) to the strategy to keep it alive, in case the keyspace that was holding it gets dropped. Closes #10861	2022-06-27 19:13:46 +02:00
Pavel Emelyanov	5e2fa32c8c	range_streamer: Get rack/datacenter from topology It's needed in source filter classes so range-streamer passes the topology reference into its methods. Nice side effect -- snitch header goes away from range-streamer one. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-06-22 11:47:26 +03:00
Asias He	1f8b529e08	range_streamer: Disable restream logic Consider: - n1 and n2 in the cluster - n3 bootstraps to join - n1 does not hear gossip update from n3 due to network issue - n1 removes n3 from gossip and pending node list - stream between n1 and n3 fails - n1 and n3 network issue is fixed - n3 retries the stream with n1 - n3 finishes the stream with n1 - n3 advertises normal to join the cluster The problem is that n1 will not treat n3 as the pending node so writes will not route to n3 once n1 removes n3. Another problem arises when n1 gets the normal gossip status update from n3: the gossip listener will fail because n1 has removed n3, so n1 cannot find the host id for n3. This will cause n1 to abort. To fix, disable the retry logic in range_streamer so that once a stream with existing fails the bootstrap fails. The downside is that we lose the ability to restream after a temporary network issue, but since we have repair-based node operations, we can use them to resume previously failed node operations. Fixes: #9805 Closes #9806	2022-05-24 11:24:25 +03:00
Avi Kivity	5937b1fa23	treewide: remove empty comments in top-of-files After `fcb8d040` ("treewide: use Software Package Data Exchange (SPDX) license identifiers"), many dual-licensed files were left with empty comments on top. Remove them to avoid visual noise. Closes #10562	2022-05-13 07:11:58 +02:00
Mikołaj Sielużycki	1d84a254c0	flat_mutation_reader: Split readers by file and remove unnecessary includes. The flat_mutation_reader files were conflated and contained multiple readers, which were not strictly necessary. Splitting optimizes iterative compilation times, as touching rarely used readers doesn't recompile large chunks of the codebase. Total compilation times are also improved, as the sizes of flat_mutation_reader.hh and flat_mutation_reader_v2.hh have been reduced and those files are included by many files in the codebase. With changes real 29m14.051s user 168m39.071s sys 5m13.443s Without changes real 30m36.203s user 175m43.354s sys 5m26.376s Closes #10194	2022-03-14 13:20:25 +02:00
Nadav Har'El	bc4d0fd5ad	murmur3: fix inconsistent token for empty partition key Traditionally in Scylla and in Cassandra, an empty partition key is mapped to minimum_token() instead of the empty key's usual hash function (0). The reasons for this are unknown (to me), but one possibility is that having one known key that maps to the minimal token is useful for various iterations. In murmur3_partitioner.cc we have two variants of the token calculation function - the first is get_token(bytes_view) and the second is get_token(schema, partition_key_view). The first includes that empty-key special case, but the second was missing this special case! As Kamil first noted in #9352, the second variant is used when looking up partitions in the index file - so if a partition with an empty-string key is saved under one token, it will be looked up under a different token and not found. I reproduced exactly this problem when fixing issues #9364 and #9375 (empty-string keys in materialized views and indexes) - where a partition with an empty key was visible in a full-table scan but couldn't be found by looking up its key because of the wrong index lookup. I also tried an alternative fix - changing both implementations to return minimum_token (and not 0) for the empty key. But this is undesirable - minimum_token is not supposed to be a valid token, so the tokenizer and sharder may not return a valid replica or shard for it, so we shouldn't store data under such a token. We also have code (such as an increasing-key sanity check in the flat mutation reader) which assumes that no real key in the data can be minimum_token, and our plan is to start allowing data with an empty key (at least for materialized views). This patch does not risk a backward-incompatible disk format change for two reasons: 1. In the current Scylla, there was no valid case where an empty partition key may appear. CQL and Thrift forbid such keys, and materialized-views and indexes also (incorrectly - see #9364, #9375) drop such rows. 2. Although Cassandra does allow empty partition keys, they are only allowed in materialized views and indexes - and we don't support reading materialized views generated by Cassandra (the user must re-generate them in Scylla). When #9364 and #9375 are fixed by the next patch, empty partition keys will start appearing in Scylla (in materialized views and in the materialized view backing a secondary index), and this fix will become important. Fixes #9352 Refs #9364 Refs #9375 Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2022-03-08 14:15:03 +02:00
Avi Kivity	6572b297a2	treewide: clean up stray license blurbs After the mechanical change in `fcb8d040e8` ("treewide: use Software Package Data Exchange (SPDX) license identifiers"), a few stray license blurbs or fragments thereof remain. In two cases these were extra blurbs in code generators intended for the generated code, in others they were just missed by the script. Clean them up, adding an SPDX license identifier where needed. Closes #10072	2022-02-13 14:16:16 +02:00
Pavel Emelyanov	469ded71a9	bootstrapper: Get 'is-replacing' via argument too This also removes the only usage of this helper outside of the storage service. The place that needs it is the use_strict_sources_for_ranges() checker and all the callers of it are aware of whether replacing is happening or not. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-02-07 12:41:02 +03:00
Pavel Emelyanov	9770f54789	bootstrapper: Get replace address via argument This removes the only usage of db.get_replace_address outside of storage service. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-02-07 12:39:51 +03:00
Pavel Emelyanov	1525c04db3	dht: Use db::config to generate initial tokens The replica::database is passed into the helper just to get the config from. Better to use config directly without messing with the database. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-01-27 16:41:29 +03:00
Pavel Emelyanov	77532a6a36	database, dht: Move get_initial_tokens() The helper in question has nothing to do with replica/database and is only used by dht to convert config option to a set of tokens. It sounds like the helper deserves living where it's needed. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-01-27 16:41:29 +03:00
Pavel Emelyanov	50170366ea	storage_service: Factor out random/config tokens generation There's a place in normal node start that parses the initial_token option or generates num_tokens random tokens. This code is used almost unchanged since being ported from its java version. Later there appeared the dht::get_bootstrap_token() with the same internal logic. This patch generalizes these two places. Logging messages are unified too (dtest seem not to check those). The change improves a corner case. The normal node startup code doesn't check if the initial_token is empty and num_tokens is 0 generating empty bootstrap_tokens set. It fails later with an obscure 'remove_endpoint should be used instead' message. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-01-27 16:41:29 +03:00
Avi Kivity	fcb8d040e8	treewide: use Software Package Data Exchange (SPDX) license identifiers Instead of lengthy blurbs, switch to single-line, machine-readable standardized (https://spdx.dev) license identifiers. The Linux kernel switched long ago, so there is strong precedent. Three cases are handled: AGPL-only, Apache-only, and dual licensed. For the latter case, I chose (AGPL-3.0-or-later and Apache-2.0), reasoning that our changes are extensive enough to apply our license. The changes were applied mechanically with a script, except to licenses/README.md. Closes #9937	2022-01-18 12:15:18 +01:00
Avi Kivity	bbad8f4677	replica: move ::database, ::keyspace, and ::table to replica namespace Move replica-oriented classes to the replica namespace. The main classes moved are ::database, ::keyspace, and ::table, but a few ancillary classes are also moved. There are certainly classes that should be moved but aren't (like distributed_loader) but we have to start somewhere. References are adjusted treewide. In many cases, it is obvious that a call site should not access the replica (but the data_dictionary instead), but that is left for separate work. scylla-gdb.py is adjusted to look for both the new and old names.	2022-01-07 12:04:38 +02:00
Avi Kivity	ae3a360725	database: Move database, keyspace, table classes to replica/ directory The database, keyspace, and table classes represent the replica-only part of the objects after which they are named. Reading from a table doesn't give you the full data, just the replica's view, and it is not consistent since reconciliation is applied on the coordinator. As a first step in acknowledging this, move the related files to a replica/ subdirectory.	2022-01-06 17:07:30 +02:00
Nadav Har'El	6012f6f2b6	build performance: do not include <seastar/net/ip.hh> In a previous patch, we noticed that the header file <gm/inet_address.hh>, which is included, directly or indirectly, by most source files, includes <seastar/net/ip.hh> which is very slow to compile, and replaced it by the much faster-to-include <seastar/net/ipv[46]_address.hh>. However, we also included <seastar/net/ip.hh> in types.hh - and that too is included by almost every file, so the actual saving from the above patch was minimal. So in this patch we replace this include too. After this patch Scylla does not include <seastar/net/ip.hh> at all. According to ClangBuildAnalyzer, this reduces the average time to include types.hh (multiply this by 312 times!) from 4 seconds to 1.8 seconds, and reduces total build time (dev mode) by about 3%. Some of the source files were now missing some include directives, that were previously included in ip.hh - so we need to add those explicitly. Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2022-01-05 17:29:21 +02:00
Pavel Emelyanov	831f18e392	dht: Pass gossiper to range_streamer::add_ranges A continuation of the previous patch. The range_streamer needs gossiper too, and is called from boot_strapper and storage_service. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-11-25 10:54:16 +03:00
Pavel Emelyanov	6a2f6068cb	dht: Pass gossiper argument to bootstrap The boot_strapper::bootstrap needs gossiper and is called only from the storage_service code that has it. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-11-25 10:53:56 +03:00
Pavel Emelyanov	3087422d4d	stream_plan: Keep stream_manager onboard The plan itself doesn't need it, but it creates some lower level objects that do. Next patches will use this reference. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-11-24 12:17:37 +03:00
Pavel Emelyanov	c593f8624d	dht: Keep stream_manager on board This is the preparation for the future patching. The stream_plan creation will need the manager reference, so keep one on dht object in advance. These are only created from the storage service bootstrap code. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-11-24 12:17:37 +03:00
Pavel Emelyanov	5877b84a1a	range_streamer: Remove stream_plan from The streamer creates stream_plan "on demand" and doesn't use the on-board one Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Message-Id: <20211112180335.27831-1-xemul@scylladb.com>	2021-11-12 19:38:45 +01:00
Benny Halevy	17296cba4b	effective_replication_map: add get_range_addresses Equivalent to abstract_replication_strategy get_range_addresses, yet synchronous, as it uses the precalculated map. Call it from storage_service::get_new_source_ranges and range_streamer::get_all_ranges_with_sources_for. Consequently, get_new_source_ranges and removenode_add_ranges can become synchronous too. Unfortunately we can't entirely get rid of abstract_replication_strategy::get_range_addresses as it's still needed by range_streamer::get_all_ranges_with_strict_sources_for. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2021-10-13 16:10:06 +03:00
Benny Halevy	4d2561ff75	abstract_replication_strategy: precalculate get_replication_factor for effective_replication_map Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2021-10-13 16:10:06 +03:00
Benny Halevy	cbe58345b9	abstract_replication_strategy: futurize get_*address_ranges Remaining callers of get_address_ranges and get_pending_address_ranges are all either from a seastar thread or from a coroutine so we can make the methods always async and drop the can_yield param. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2021-10-13 16:10:06 +03:00
Benny Halevy	91581ba23a	abstract_replication_strategy: futurize get_range_addresses All remaining use sites are called in a seastar thread so we drop the can_yield param and make get_range_addresses always async. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2021-10-13 16:10:06 +03:00
Benny Halevy	d96a67eb57	abstract_replication_strategy: use shared_ptr in registry Enable creating shared_ptr<BaseClass> in nonstatic_class_registry using BaseClass::ptr_type and use that for abstract_replication_strategy. While at it, also clean up compressor with that respect to define compressor::ptr_type as shared_ptr<compressor> thus simplifying compressor_registry. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2021-10-13 12:39:36 +03:00
Benny Halevy	7498ac4869	dht: boot_strapper: bootstrap: fixup indentation Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Message-Id: <20210923144206.1690576-2-bhalevy@scylladb.com>	2021-09-26 11:09:01 +03:00
Benny Halevy	798aee6747	dht: boot_strapper: coroutinize bootstrap Prepare for futurizing get_pending_address_ranges. Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Message-Id: <20210923144206.1690576-1-bhalevy@scylladb.com>	2021-09-26 11:09:01 +03:00
Avi Kivity	daf028210b	build: enable -Winconsistent-missing-override warning This warning can catch a virtual function that thinks it overrides another, but doesn't, because the two functions have different signatures. This isn't very likely since most of our virtual functions override pure virtuals, but it's still worth having. Enable the warning and fix numerous violations. Closes #9347	2021-09-15 12:55:54 +03:00
Avi Kivity	0909e3c17d	treewide: remove redundant "x <=> 0" compares If x is of type std::strong_ordering, then "x <=> 0" is equivalent to x. These no-ops were inserted during #1449 fixes, but are now unnecessary. They have potential for harm, since they can hide an accidental change of the type of x to an arithmetic type, so remove them. Ref #1449.	2021-07-28 13:30:32 +03:00
Avi Kivity	8a80e455fb	sstables: keys: convert trichotomic comparisons to std::strong_ordering Prevent accidental conversions to bool from yielding the wrong results. Unprepared users (that converted to bool, or assigned to int) are adjusted. Ref #1449 Test: unit (dev) Closes #9088	2021-07-26 19:09:19 +03:00
Juliusz Stasiewicz	a8b741efe2	endpoint_details: store `_host` as `gms::inet_address` In an upcoming commit I will add "system.describe_ring" table which uses endpoint's inet address as a part of CK and, therefore, needs to keep them sorted with `inet_addr_type::less`.	2021-07-20 14:00:54 +02:00
Tomasz Grabiec	06e373e272	sstables: index_reader: Keep index objects under LSA In preparation for caching index objects, manage them under LSA. Implementation notes: key_view was changed to be a view on managed_bytes_view instead of bytes, so it now can be fragmented. Old users of key_view now have to linearize it. Actual linearization should be rare since partition keys are typically small. Index parser is now not constructing the index_entry directly, but produces value objects which live in the standard allocator space: class parsed_promoted_index_entry; class parsed_partition_index_entry; This change was needed to support consumers which don't populate the partition index cache and don't use LSA, e.g. sstable::generate_summary(). It's now consumer's responsibility to allocate index_entry out of parsed_partition_index_entry.	2021-07-02 19:02:14 +02:00
Avi Kivity	0048c404d2	Merge 'dht: token: make some cosmetic changes' from Michał Chojnowski This is a set of a few cosmetic changes in dht/token. Mostly some comments and a simplification of `midpoint()`. Closes #8803 * github.com:scylladb/scylla: dht: token: add a comment excusing the `const bytes&` constructor dht: token: simplify midpoint() dht: token: add a comment to normalize() dht: token: use {read,write}_unaligned instead of std::copy_n dht: token-sharding: fix a typo in a comment	2021-06-07 15:41:15 +03:00
Avi Kivity	a55b434a2b	treewide: extend copyright statements to present day	2021-06-06 19:18:49 +03:00
Pavel Solodovnikov	e0749d6264	treewide: some random header cleanups Eliminate not used includes and replace some more includes with forward declarations where appropriate. Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>	2021-06-06 19:18:49 +03:00
Michał Chojnowski	23b7178f0d	dht: token: add a comment excusing the `const bytes&` constructor	2021-06-05 15:22:42 +02:00
Michał Chojnowski	31aad81dc9	dht: token: simplify midpoint() midpoint doesn't have to be so complicated.	2021-06-05 15:22:35 +02:00
Michał Chojnowski	a2352ea332	dht: token: add a comment to normalize() The purpose and name of normalize are not obvious and deserve an explanatory comment.	2021-05-31 11:54:58 +02:00
Michał Chojnowski	3d9b8c9eff	dht: token: use {read,write}_unaligned instead of std::copy_n A cosmetic change.	2021-05-31 11:54:58 +02:00
Michał Chojnowski	3c88a9ccb6	dht: token-sharding: fix a typo in a comment	2021-05-31 11:54:45 +02:00
Nadav Har'El	fb0c4e469a	Merge 'token_metadata: Fix get_all_endpoints to return nodes in the ring' from Asias He The get_all_endpoints() should return the nodes that are part of the ring. A node inside _endpoint_to_host_id_map does not guarantee that the node is part of the ring. To fix, return from _token_to_endpoint_map. Fixes #8534 Closes #8536 * github.com:scylladb/scylla: token_metadata: Get rid of get_all_endpoints_count range_streamer: Handle everywhere_topology range_streamer: Adjust use_strict_sources_for_ranges token_metadata: Fix get_all_endpoints to return nodes in the ring	2021-05-11 18:39:10 +03:00
Asias He	4793894fac	range_streamer: Handle everywhere_topology The everywhere_topology returns the number of nodes in the cluster as RF. This makes only streaming from the node losing the range impossible since no node is losing the range after bootstrap. Shortcut it not to use strict source in case the keyspace is everywhere_topology. Refs #8503	2021-05-06 10:02:11 +08:00
Asias He	1b7414860b	range_streamer: Adjust use_strict_sources_for_ranges Now that get_all_endpoints() returns the number of nodes in the ring, we need to adjust the check for whether to use strict sources. Use strict sources when the number of nodes in the ring is equal to or greater than RF. Refs #8534	2021-05-06 10:02:11 +08:00
Avi Kivity	cea5493cb7	storage_proxy, treewide: introduce names for vectors of inet_address storage_proxy works with vectors of inet_addresses for replica sets and for topology changes (pending endpoints, dead nodes). This patch introduces new names for these (without changing the underlying type - it's still std::vector<gms::inet_address>). This is so that the following patch, that changes those types to utils::small_vector, will be less noisy and highlight the real changes that take place.	2021-05-05 18:36:48 +03:00
Benny Halevy	96ef204676	dht/token: shard_of: reuse shard_of_minimum_token Returning shard 0 for the minimum token better be hardcoded in one place. Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Message-Id: <20210428113339.1092555-1-bhalevy@scylladb.com>	2021-04-28 15:08:36 +03:00
Benny Halevy	662355519d	dht/i_partitioner: split_range_to_single_shard: drop unused lambda capture of start_shard Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Message-Id: <20210428113440.1099877-1-bhalevy@scylladb.com>	2021-04-28 15:07:57 +03:00
Pavel Emelyanov	5ecbc33be5	database.*: Remove unused headers The database.hh is the central recursive-headers knot -- it has ~50 includes. This patch leaves only 34 (it remains the champion though). Similar thing for database.cc. Both changes help the latter compile ~4% faster :) Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Message-Id: <20210414183107.30374-1-xemul@scylladb.com>	2021-04-18 14:03:17 +03:00
Avi Kivity	58b7f225ab	keys: convert trichotomic comparators to return std::strong_ordering A trichotomic comparator returning an int can easily be mistaken for a less comparator as the return types are convertible. Use the new std::strong_ordering instead. A caller in cql3's update_parameters.hh is also converted, following the path of least resistance. Ref #1449. Test: unit (dev) Closes #8323	2021-03-21 09:30:43 +02:00
Avi Kivity	378556418c	dht: ring_position, decorated_key: convert tri_comparators to std::strong_ordering Convert tri_comparators to return std::strong_ordering rather than int, to prevent confusion with less comparators. Downstream users are either also converted, or adjust the return type back to int, whichever happens to be simpler; in all cases the change is trivial.	2021-03-18 12:40:05 +02:00

405 Commits