Commit Graph

401 Commits

Author SHA1 Message Date
Mikołaj Sielużycki
1d84a254c0 flat_mutation_reader: Split readers by file and remove unnecessary includes.
The flat_mutation_reader files were conflated and contained multiple
readers, which were not strictly necessary. Splitting optimizes both
iterative compilation times, as touching rarely used readers doesn't
recompile large chunks of codebase. Total compilation times are also
improved, as the size of flat_mutation_reader.hh and
flat_mutation_reader_v2.hh have been reduced and those files are
included by many file in the codebase.

With changes

real	29m14.051s
user	168m39.071s
sys	5m13.443s

Without changes

real	30m36.203s
user	175m43.354s
sys	5m26.376s

Closes #10194
2022-03-14 13:20:25 +02:00
Nadav Har'El
bc4d0fd5ad murmur3: fix inconsistent token for empty partition key
Traditionally in Scylla and in Cassandra, an empty partition key is mapped
to minimum_token() instead of the empty key's usual hash function (0).
The reasons for this are unknown (to me), but one possibility is that
having one known key that maps to the minimal token is useful for
various iterations.

In murmur3_partitioner.cc we have two variants of the token calculation
function - the first is get_token(bytes_view) and the second is
get_token(schema, partition_key_view). The first includes that empty-
key special case, but the second was missing this special case!

As Kamil first noted in #9352, the second variant is used when looking
up partitions in the index file - so if a partition with an empty-string
key is saved under one token, it will be looked up under a different
token and not found. I reproduced exactly this problem when fixing
issues #9364 and #9375 (empty-string keys in materialized views and
indexes) - where a partition with an empty key was visible in a
full-table scan but couldn't be found by looking up its key because of
the wrong index lookup.

I also tried an alternative fix - changing both implementations to return
minimum_token (and not 0) for the empty key. But this is undesirable -
minimum_token is not supposed to be a valid token, so the tokenizer and
sharder may not return a valid replica or shard for it, so we shouldn't
store data under such token. We also have have code (such as an increasing-
key sanity check in the flat mutation reader) which assumes that
no real key in the data can be minimum_token, and our plan is to start
allowing data with an empty key (at least for materialized views).

This patch does not risk a backward-incompatible disk format changes
for two reasons:

1. In the current Scylla, there was no valid case where an empty partition
   key may appear. CQL and Thrift forbid such keys, and materialized-views
   and indexes also (incorrectly - see #9364, #9375) drop such rows.
2. Although Cassandra *does* allow empty partition keys, they is only
   allowed in materialized views and indexes - and we don't support reading
   materialized views generated by Cassandra (the user must re-generate
   them in Scylla).

When #9364 and #9375 will be fixed by the next patch, empty partition keys
will start appearing in Scylla (in materialized views and in the
materialized view backing a secondary index), and this fix will become
important.

Fixes #9352
Refs #9364
Refs #9375

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2022-03-08 14:15:03 +02:00
Avi Kivity
6572b297a2 treewide: clean up stray license blurbs
After the mechanical change in fcb8d040e8
("treewide: use Software Package Data Exchange (SPDX) license identifiers"),
a few stray license blurbs or fragments thereof remain. In two cases
these were extra blurbs in code generators intended for the generated code,
in others they were just missed by the script.

Clean them up, adding an SPDX license identifier where needed.

Closes #10072
2022-02-13 14:16:16 +02:00
Pavel Emelyanov
469ded71a9 bootstrapper: Get 'is-replacing' via argument too
This also removes the only usage of this helper outside of the storage
service. The place that needs it is the use_strict_sources_for_ranges()
checker and all the callers of it are aware of whether it's replacing
happenning or not.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-02-07 12:41:02 +03:00
Pavel Emelyanov
9770f54789 bootstrapper: Get replace address via argument
This removes the only usage of db.get_replace_address outside of
storage service.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-02-07 12:39:51 +03:00
Pavel Emelyanov
1525c04db3 dht: Use db::config to generate initial tookens
The replica::database is passed into the helper just to get the
config from. Better to use config directly without messing with
the database.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-01-27 16:41:29 +03:00
Pavel Emelyanov
77532a6a36 database, dht: Move get_initial_tokens()
The helper in question has nothing to do with replica/database and
is only used by dht to convert config option to a set of tokens.
It sounds like the helper deserves living where it's needed.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-01-27 16:41:29 +03:00
Pavel Emelyanov
50170366ea storage_service: Factor out random/config tokens generation
There's a place in normal node start that parses the initial_token
option or generates num_tokens random tokens. This code is used almost
unchanged since being ported from its java version. Later there appeared
the dht::get_bootstrap_token() with the same internal logic.

This patch generalizes these two places. Logging messages are unified
too (dtest seem not to check those).

The change improves a corner case. The normal node startup code doesn't
check if the initial_token is empty and num_tokens is 0 generating empty
bootstrap_tokens set. It fails later with an obscure 'remove_endpoint
should be used instead' message.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-01-27 16:41:29 +03:00
Avi Kivity
fcb8d040e8 treewide: use Software Package Data Exchange (SPDX) license identifiers
Instead of lengthy blurbs, switch to single-line, machine-readable
standardized (https://spdx.dev) license identifiers. The Linux kernel
switched long ago, so there is strong precedent.

Three cases are handled: AGPL-only, Apache-only, and dual licensed.
For the latter case, I chose (AGPL-3.0-or-later and Apache-2.0),
reasoning that our changes are extensive enough to apply our license.

The changes we applied mechanically with a script, except to
licenses/README.md.

Closes #9937
2022-01-18 12:15:18 +01:00
Avi Kivity
bbad8f4677 replica: move ::database, ::keyspace, and ::table to replica namespace
Move replica-oriented classes to the replica namespace. The main
classes moved are ::database, ::keyspace, and ::table, but a few
ancillary classes are also moved. There are certainly classes that
should be moved but aren't (like distributed_loader) but we have
to start somewhere.

References are adjusted treewide. In many cases, it is obvious that
a call site should not access the replica (but the data_dictionary
instead), but that is left for separate work.

scylla-gdb.py is adjusted to look for both the new and old names.
2022-01-07 12:04:38 +02:00
Avi Kivity
ae3a360725 database: Move database, keyspace, table classes to replica/ directory
The database, keyspace, and table classes represent the replica-only
part of the objects after which they are named. Reading from a table
doesn't give you the full data, just the replica's view, and it is not
consistent since reconciliation is applied on the coordinator.

As a first step in acknowledging this, move the related files to
a replica/ subdirectory.
2022-01-06 17:07:30 +02:00
Nadav Har'El
6012f6f2b6 build performance: do not include <seastar/net/ip.hh>
In a previous patch, we noticed that the header file <gm/inet_address.hh>,
which is included, directly or indirectly, by most source files,
includes <seastar/net/ip.hh> which is very slow to compile, and
replaced it by the much faster-to-include <seastar/net/ipv[46]_address.hh>.

However, we also included <seastar/net/ip.hh> in types.hh - and that
too is included by almost every file, so the actual saving from the
above patch was minimal. So in this patch we replace this include too.
After this patch Scylla does not include <seastar/net/ip.hh> at all.

According to ClangBuildAnalyzer, this reduces the average time to include
types.hh (multiply this by 312 times!) from 4 seconds to 1.8 seconds,
and reduces total build time (dev mode) by about 3%.

Some of the source files were now missing some include directives, that
were previously included in ip.hh - so we need to add those explicitly.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2022-01-05 17:29:21 +02:00
Pavel Emelyanov
831f18e392 dht: Pass gossiper to range_streamer::add_ranges
A continuation of the previous patch. The range_streamer needs
gossiper too, and is called from boot_strapper and storage_service.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-11-25 10:54:16 +03:00
Pavel Emelyanov
6a2f6068cb dht: Pass gossiper argument to bootstrap
The boot_strapper::bootstrap needs gossiper and is called only from
the storage_service code that has it.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-11-25 10:53:56 +03:00
Pavel Emelyanov
3087422d4d stream_plan: Keep stream_manager onboard
The plan itself doesn't need it, but it creates some lower level
objects that do. Next patches will use this reference.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-11-24 12:17:37 +03:00
Pavel Emelyanov
c593f8624d dht: Keep stream_manager on board
This is the preparation for the future patching. The stream_plan
creation will need the manager reference, so keep one on dht
object in advance. These are only created from the storage service
bootstrap code.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-11-24 12:17:37 +03:00
Pavel Emelyanov
5877b84a1a range_streamer: Remove stream_plan from
The streamer creates stream_plan "on demand" and doesnt use the on-board one

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Message-Id: <20211112180335.27831-1-xemul@scylladb.com>
2021-11-12 19:38:45 +01:00
Benny Halevy
17296cba4b effective_replication_map: add get_range_addresses
Equivalent to abstract_replication_strategy get_range_addresses,
yet synchronous, as it uses the precalculated map.

Call it from storage_service::get_new_source_ranges
and range_streamer::get_all_ranges_with_sources_for.

Consequently, get_new_source_ranges and removenode_add_ranges
can become synchronous too.

Unfortunately we can't entirely get rid of
abstract_replication_strategy::get_range_addresses
as it's still needed by
range_streamer::get_all_ranges_with_strict_sources_for.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2021-10-13 16:10:06 +03:00
Benny Halevy
4d2561ff75 abstract_replication_strategy: precacluate get_replication_factor for effective_replication_map
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2021-10-13 16:10:06 +03:00
Benny Halevy
cbe58345b9 abstract_replication_strategy: futurize get_*address_ranges
Remaining callers of get_address_ranges and get_pending_address_ranges
are all either from a seastar thread or from a coroutine
so we can make the methods always async and drop the
can_yield param.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2021-10-13 16:10:06 +03:00
Benny Halevy
91581ba23a abstract_replication_strategy: futurize get_range_addresses
All remaining use sites are called in a seastar thread
so we drop the can_yield param and make get_range_addresses
always async.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2021-10-13 16:10:06 +03:00
Benny Halevy
d96a67eb57 abstract_replication_strategy: use shared_ptr in registry
Enable creating shared_ptr<BaseClass> in nonstatic_class_registry
using BaseClass::ptr_type and use that for
abstract_replication_strategy.

While at it, also clean up compressor with that respect
to define compressor::ptr_type as shared_ptr<compressor>
thus simplifying compressor_registry.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2021-10-13 12:39:36 +03:00
Benny Halevy
7498ac4869 dht: boot_strapper: bootstrap: fixup indentation
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Message-Id: <20210923144206.1690576-2-bhalevy@scylladb.com>
2021-09-26 11:09:01 +03:00
Benny Halevy
798aee6747 dht: boot_strapper: coroutinize bootstrap
Prepare for futurizing get_pending_address_ranges.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Message-Id: <20210923144206.1690576-1-bhalevy@scylladb.com>
2021-09-26 11:09:01 +03:00
Avi Kivity
daf028210b build: enable -Winconsistent-missing-override warning
This warning can catch a virtual function that thinks it
overrides another, but doesn't, because the two functions
have different signatures. This isn't very likely since most
of our virtual functions override pure virtuals, but it's
still worth having.

Enable the warning and fix numerous violations.

Closes #9347
2021-09-15 12:55:54 +03:00
Avi Kivity
0909e3c17d treewide: remove redundant "x <=> 0" compares
If x is of type std::strong_ordering, then "x <=> 0" is equivalent to
x. These no-ops were inserted during #1449 fixes, but are now unnecessary.
They have potential for harm, since they can hide an accidental of the
type of x to an arithmetic type, so remove them.

Ref #1449.
2021-07-28 13:30:32 +03:00
Avi Kivity
8a80e455fb sstables: keys: convert trichotomic comparisons to std::strong_ordering
Prevent accidental conversions to bool from yielding the wrong results.
Unprepared users (that converted to bool, or assigned to int) are adjusted.

Ref #1449

Test: unit (dev)

Closes #9088
2021-07-26 19:09:19 +03:00
Juliusz Stasiewicz
a8b741efe2 endpoint_details: store _host as gms::inet_address
In an upcoming commit I will add "system.describe_ring" table which uses
endpoint's inet address as a part of CK and, therefore, needs to keep them
sorted with `inet_addr_type::less`.
2021-07-20 14:00:54 +02:00
Tomasz Grabiec
06e373e272 sstables: index_reader: Keep index objects under LSA
In preparation for caching index objects, manage them under LSA.

Implementation notes:

key_view was changed to be a view on managed_bytes_view instead of
bytes, so it now can be fragmented. Old users of key_view now have to
linearize it.  Actual linearization should be rare since partition
keys are typically small.

Index parser is now not constructing the index_entry directly, but
produces value objects which live in the standard allocator space:

  class parsed_promoted_index_entry;
  calss parsed_partition_index_entry;

This change was needed to support consumers which don't populate the
partition index cache and don't use LSA,
e.g. sstable::generate_summary(). It's now consumer's responsibility
to allocate index_entry out of parsed_partition_index_entry.
2021-07-02 19:02:14 +02:00
Avi Kivity
0048c404d2 Merge 'dht: token: make some cosmetic changes' from Michał Chojnowski
This is a set of a few cosmetic changes in dht/token. Mostly some comments and a simplification of `midpoint()`.

Closes #8803

* github.com:scylladb/scylla:
  dht: token: add a comment excusing the `const bytes&` constructor
  dht: token: simplify midpoint()
  dht: token: add a comment to normalize()
  dht: token: use {read,write}_unaligned instead of std::copy_n
  dht: token-sharding: fix a typo in a comment
2021-06-07 15:41:15 +03:00
Avi Kivity
a55b434a2b treewide: extent copyright statements to present day 2021-06-06 19:18:49 +03:00
Pavel Solodovnikov
e0749d6264 treewide: some random header cleanups
Eliminate not used includes and replace some more includes
with forward declarations where appropriate.

Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>
2021-06-06 19:18:49 +03:00
Michał Chojnowski
23b7178f0d dht: token: add a comment excusing the const bytes& constructor 2021-06-05 15:22:42 +02:00
Michał Chojnowski
31aad81dc9 dht: token: simplify midpoint()
midpoint doesn't have to be so complicated.
2021-06-05 15:22:35 +02:00
Michał Chojnowski
a2352ea332 dht: token: add a comment to normalize()
The purpose and name of normalize are not obvious and deserve an explanatory
comment.
2021-05-31 11:54:58 +02:00
Michał Chojnowski
3d9b8c9eff dht: token: use {read,write}_unaligned instead of std::copy_n
A cosmetic change.
2021-05-31 11:54:58 +02:00
Michał Chojnowski
3c88a9ccb6 dht: token-sharding: fix a typo in a comment 2021-05-31 11:54:45 +02:00
Nadav Har'El
fb0c4e469a Merge 'token_metadata: Fix get_all_endpoints to return nodes in the ring' from Asias He
The get_all_endpoints() should return the nodes that are part of the ring.

A node inside _endpoint_to_host_id_map does not guarantee that the node
is part of the ring.

To fix, return from _token_to_endpoint_map.

Fixes #8534

Closes #8536

* github.com:scylladb/scylla:
  token_metadata: Get rid of get_all_endpoints_count
  range_streamer: Handle everywhere_topology
  range_streamer: Adjust use_strict_sources_for_ranges
  token_metadata: Fix get_all_endpoints to return nodes in the ring
2021-05-11 18:39:10 +03:00
Asias He
4793894fac range_streamer: Handle everywhere_topology
The everywhere_topology returns the number of nodes in the cluster as
RF. This makes only streaming from the node losing the range impossible
since no node is losing the range after bootstrap.

Shortcut it not to use strict source in case the keyspace is
everywhere_topology.

Refs #8503
2021-05-06 10:02:11 +08:00
Asias He
1b7414860b range_streamer: Adjust use_strict_sources_for_ranges
Now the get_all_endpoints() returns the number of nodes in the ring. We
need to adjust the checking for using strict source or not.

Use strict when number of nodes in the ring is equal or more than RF

Refs #8534
2021-05-06 10:02:11 +08:00
Avi Kivity
cea5493cb7 storage_proxy, treewide: introduce names for vectors of inet_address
storage_proxy works with vectors of inet_addresses for replica sets
and for topology changes (pending endpoints, dead nodes). This patch
introduces new names for these (without changing the underlying
type - it's still std::vector<gms::inet_address>). This is so that
the following patch, that changes those types to utils::small_vector,
will be less noisy and highlight the real changes that take place.
2021-05-05 18:36:48 +03:00
Benny Halevy
96ef204676 dht/token: shard_of: reuse shard_of_minimum_token
Returning shard 0 for the minimum token better
be hardcoded in one place.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Message-Id: <20210428113339.1092555-1-bhalevy@scylladb.com>
2021-04-28 15:08:36 +03:00
Benny Halevy
662355519d dht/i_partitioner: split_range_to_single_shard: drop unused lambda capture of start_shard
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Message-Id: <20210428113440.1099877-1-bhalevy@scylladb.com>
2021-04-28 15:07:57 +03:00
Pavel Emelyanov
5ecbc33be5 database.*: Remove unused headers
The database.hh is the central recursive-headers knot -- it has ~50
includes. This patch leaves only 34 (it remains the champion though).
Similar thing for database.cc.
Both changes help the latter compile ~4% faster :)

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Message-Id: <20210414183107.30374-1-xemul@scylladb.com>
2021-04-18 14:03:17 +03:00
Avi Kivity
58b7f225ab keys: convert trichotomic comparators to return std::strong_ordering
A trichotomic comparator returning an int an easily be mistaken
for a less comparator as the return types are convertible.

Use the new std::strong_ordering instead.

A caller in cql3's update_parameters.hh is also converted, following
the path of least resistance.

Ref #1449.

Test: unit (dev)

Closes #8323
2021-03-21 09:30:43 +02:00
Avi Kivity
378556418c dht: ring_position, decorated_key: convert tri_comparators to std::strong_ordering
Convert tri_comparators to return std::strong_ordering rather than int,
to prevent confusion with less comparators. Downstream users are either
also converted, or adjust the return type back to int, whichever happens
to be simpler; in all cases the change it trivial.
2021-03-18 12:40:05 +02:00
Avi Kivity
3cd2f00438 dht: convert token tri_compare to std::strong_ordering
Change token's tri_compare functions to return std::strong_ordering,
which is not convertible to bool and therefore not suspect to
being misused where a less-compare is expected.

Two of the users (ring_position and decorated_key) have to undo
the conversion, since they still return int. A follow up will
convert them too.

Ref #1449.
2021-02-28 21:03:59 +02:00
Pavel Emelyanov
ffc9cc9aec range-streamer: Remove global storage service reference
The reference is used by range streamer and (!) storage
service itself to find out if the consistent_rangemovement
option is ON/OFF.

Both places already have the database with config at hands
and can be simplified.

v2: spellchecking

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Message-Id: <20210212095403.22662-1-xemul@scylladb.com>
2021-02-12 15:50:30 +01:00
Benny Halevy
322aa2f8b5 token_metadata: add clear_gently
clear_gently gently clears the token_metadata members.
It uses continuations to allow yielding if needed
to prevent reactor stalls.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2020-12-22 11:22:21 +02:00
Benny Halevy
e089c22ec1 token_metdata: futurize update_normal_tokens
The function complexity if O(#tokens) in the worst case
as for each endpoint token to traverses _token_to_endpoint_map
lineraly to erase the endpoint mapping if it exists.

This change renames the current implementation of
update_normal_tokens to update_normal_tokens_sync
and clones the code as a coroutine that returns a future
and may yield if needed.

Eventually we should futurize the whole token_metadata
and abstract_replication_strategy interface and get rid
of the synchronous functions.  Until then the sync
version is still required from call sites that
are neither returning a future nor run in a seastar thread.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2020-12-22 10:35:15 +02:00