859 Commits

Author SHA1 Message Date
Kamil Braun
30cc07b40d Merge 'Introduce tablets' from Tomasz Grabiec
This PR introduces an experimental feature called "tablets". Tablets are
a way to distribute data in the cluster, which is an alternative to the
current vnode-based replication. Vnode-based replication strategy tries
to evenly distribute the global token space shared by all tables among
nodes and shards. With tablets, the aim is to start from the other
side: divide the resources of each replica shard into tablets, with a goal of
having a fixed target tablet size, and then assign those tablets to
serve fragments of tables (also called tablets). This will allow us to
balance the load in a more flexible manner, by moving individual tablets
around. Also, unlike with vnode ranges, tablet replicas live on a
particular shard on a given node, which will allow us to bind raft
groups to tablets. Those goals are not yet achieved with this PR, but it
lays the groundwork for them.

Things achieved in this PR:

  - You can start a cluster and create a keyspace whose tables will use
    tablet-based replication. This is done by setting `initial_tablets`
    option:

    ```
        CREATE KEYSPACE test WITH replication = {'class': 'NetworkTopologyStrategy',
                        'replication_factor': 3,
                        'initial_tablets': 8};
    ```

    All tables created in such a keyspace will be tablet-based.

    Tablet-based replication is a trait, not a separate replication
    strategy. Tablets don't change the spirit of the replication strategy; they
    just alter the way in which data ownership is managed. In theory, we
    could use them for other strategies as well, such as
    EverywhereReplicationStrategy. Currently, only NetworkTopologyStrategy
    is augmented to support tablets.

  - You can create and drop tablet-based tables (no DDL language changes)

  - DML / DQL work with tablet-based tables

    Replicas for tablet-based tables are chosen from tablet metadata
    instead of token metadata

Things which are not yet implemented:

  - handling of views, indexes, CDC created on tablet-based tables
  - sharding is done using the old method; it ignores the shard allocated in tablet metadata
  - node operations (topology changes, repair, rebuild) do not handle tablet-based tables
  - not integrated with compaction groups
  - tablet allocator piggy-backs on tokens to choose replicas.
    Eventually we want to allocate based on current load, not statically

Closes #13387

* github.com:scylladb/scylladb:
  test: topology: Introduce test_tablets.py
  raft: Introduce 'raft_server_force_snapshot' error injection
  locator: network_topology_strategy: Support tablet replication
  service: Introduce tablet_allocator
  locator: Introduce tablet_aware_replication_strategy
  locator: Extract maybe_remove_node_being_replaced()
  dht: token_metadata: Introduce get_my_id()
  migration_manager: Send tablet metadata as part of schema pull
  storage_service: Load tablet metadata when reloading topology state
  storage_service: Load tablet metadata on boot and from group0 changes
  db, migration_manager: Notify about tablet metadata changes via migration_listener::on_update_tablet_metadata()
  migration_notifier: Introduce before_drop_keyspace()
  migration_manager: Make prepare_keyspace_drop_announcement() return a future<>
  test: perf: Introduce perf-tablets
  test: Introduce tablets_test
  test: lib: Do not override table id in create_table()
  utils, tablets: Introduce external_memory_usage()
  db: tablets: Add printers
  db: tablets: Add persistence layer
  dht: Use last_token_of_compaction_group() in split_token_range_msb()
  locator: Introduce tablet_metadata
  dht: Introduce first_token()
  dht: Introduce next_token()
  storage_proxy: Improve trace-level logging
  locator: token_metadata: Fix confusing comment on ring_range()
  dht, storage_proxy: Abstract token space splitting
  Revert "query_ranges_to_vnodes_generator: fix for exclusive boundaries"
  db: Exclude keyspace with per-table replication in get_non_local_strategy_keyspaces_erms()
  db: Introduce get_non_local_vnode_based_strategy_keyspaces()
  service: storage_proxy: Avoid copying keyspace name in write handler
  locator: Introduce per-table replication strategy
  treewide: Use replication_strategy_ptr as a shorter name for abstract_replication_strategy::ptr_type
  locator: Introduce effective_replication_map
  locator: Rename effective_replication_map to vnode_effective_replication_map
  locator: effective_replication_map: Abstract get_pending_endpoints()
  db: Propagate feature_service to abstract_replication_strategy::validate_options()
  db: config: Introduce experimental "TABLETS" feature
  db: Log replication strategy for debugging purposes
  db: Log full exception on error in do_parse_schema_tables()
  db: keyspace: Remove non-const replication strategy getter
  config: Reformat
2023-04-27 09:40:18 +02:00
Kefu Chai
f5b05cf981 treewide: use defaulted operator!=() and operator==()
in C++20, the compiler rewrites `a != b` as `!(a == b)` if the
corresponding operator==() is defined, and the language now understands
that the comparison is symmetric.

fortunately, our operator!=() is always equivalent to
`! operator==()`, which matches the behavior of the compiler-generated
operator!=(). so, in this change, all `operator!=`
are removed.

in addition to the defaulted operator!=, C++20 also brings us
the defaulted operator==(): the compiler is able to generate
operator==() as a member-wise lexicographical comparison.
under some circumstances, this is exactly what we need. so,
in this change, if operator==() is implemented as
a lexicographical comparison of all member variables of the
class/struct in question, it is replaced with the default-generated
one by removing its body and marking the function as
`default`. moreover, if the class happens to have other comparison
operators which are implemented using lexicographical comparison,
the default-generated `operator<=>` is used in place of
the defaulted `operator==`.

sometimes, we fail to mark operator== with the `const`
specifier. in this change, to meet the requirements of the C++ standard
and to be more correct, the `const` specifier is added.

also, to generate the defaulted operator==, the operand should
be `const class_name&`, but this is not always the case. in the
`version` class, we use `version` as the parameter type; to
meet the requirements of the C++ standard, the parameter type is
changed to `const version&` instead. this does not change
the semantics of the comparison operator, and it is a more idiomatic
way to pass a non-trivial struct as a function parameter.

please note, because in C++20 both operator== and operator<=> are
symmetric, some of the operators in `multiprecision` are removed.
they are the symmetric forms of other variants. if they were
not removed, the compiler would, for instance, find an ambiguous
overloaded operator '=='.

this change is a cleanup to modernize the code base with C++20
features.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes #13687
2023-04-27 10:24:46 +03:00
Kefu Chai
951457a711 treewide: do not use std::rel_ops
std::rel_ops was deprecated in C++20, as C++20 provides a better
solution for defining comparison operators. all the use cases
previously addressed by `using namespace std::rel_ops` have
been addressed either by `operator<=>` or the default-generated
`operator!=`.

so, in this change, to avoid using deprecated facilities, let's
drop all these `using namespace std::rel_ops`.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2023-04-26 14:09:58 +08:00
Kefu Chai
124153d439 build: cmake: sync with configure.py
this change updates the CMake build system with the changes
introduced by 3f1ac846d8 and
d1817e9e1b

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes #13648
2023-04-24 14:55:20 +03:00
Tomasz Grabiec
9781d3ffc5 db: config: Introduce experimental "TABLETS" feature 2023-04-24 10:49:36 +02:00
Benny Halevy
5520d3a8e3 gossiper: version_generator: add {debug_,}validate_gossip_generation
Make sure that the int64_t generation we get over rpc
fits in the int32_t generation_type we keep locally.

Restrict this assertion to non-release builds.
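
A minimal sketch of such a range check is shown below. The function names are hypothetical and do not reproduce the `validate_gossip_generation` helpers added by this commit; the sketch also assumes that "non-release builds" can be approximated by the standard `assert` macro, which is compiled out when `NDEBUG` is defined.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// True if a 64-bit generation received over RPC can be stored in a
// 32-bit local generation without truncation.
constexpr bool fits_in_int32(std::int64_t generation) {
    return generation >= std::numeric_limits<std::int32_t>::min()
        && generation <= std::numeric_limits<std::int32_t>::max();
}

// Hypothetical helper: the check is active only when NDEBUG is not
// defined; the narrowing conversion itself always happens.
inline std::int32_t narrow_generation(std::int64_t generation) {
    assert(fits_in_int32(generation));
    return static_cast<std::int32_t>(generation);
}
```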

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2023-04-23 08:48:01 +03:00
Benny Halevy
5dc7b7811c gms: gossip_digest: use generation_type and version_type
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2023-04-23 08:48:01 +03:00
Benny Halevy
4cdad8bc8b gms: heart_beat_state: use generation_type and version_type
Define default constructor as heart_beat_state(gms::generation_type(0))

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2023-04-23 08:48:01 +03:00
Benny Halevy
b638571cb0 gms: versioned_value: use version_type
Adjust scylla-gdb.get_gms_version_value
to get the versioned_value version as version_type
(utils::tagged_integer).

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2023-04-23 08:48:01 +03:00
Benny Halevy
2d20ee7d61 gms: version_generator: define version_type and generation_type strong types
Derived from utils::tagged_integer, using different tags,
the types are incompatible with each other and require explicit
typecasting to- and from- their value type.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2023-04-23 08:47:17 +03:00
Benny Halevy
d1817e9e1b utils: move generation-number to gms
Although get_generation_number implementation is
completely generic, it is used exclusively to seed
the gossip generation number.

Following patches will define a strong gms::generation_id
type and this function should return it.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2023-04-23 08:37:32 +03:00
Benny Halevy
c5d819ce60 gms: versioned_value: make members private
and provide accessor functions to get them.

1. So they can't be modified by mistake, as the versioned value is
   immutable. A new value must have a higher version.
2. Before making the version a strong gms::version_type.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2023-04-23 08:37:32 +03:00
Benny Halevy
44a8db016a gms: versioned_value: delete unused compare_to function
Not only is it unused, it is also wrong, since
it doesn't compare the value, only its version.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2023-04-23 08:37:32 +03:00
Benny Halevy
59e771be5c gms: gossip_digest: delete unused compare_to function
Not only is it unused, it is also wrong, since
it doesn't compare the digest endpoint member.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2023-04-23 08:37:32 +03:00
Kamil Braun
55f43e532c Merge 'get rid of gms/failure_detector' from Benny Halevy
Move gms::arrival_window to api/failure_detector, which is its only user,
and get rid of the rest, which is unused now that we use direct_failure_detector instead.

TODO: integrate direct_failure_detector with failure_detector api.

Closes #13576

* github.com:scylladb/scylladb:
  gms: get rid of unused failure_detector
  api: failure_detector: remove false dependency on failure_detector::arrival_window
  test: rest_api: add test_failure_detector
2023-04-21 11:47:44 +02:00
Kefu Chai
ecb5380638 treewide: s/boost::lexical_cast<std::string>/fmt::to_string()/
this change replaces all occurrences of `boost::lexical_cast<std::string>`
in the source tree with `fmt::to_string()`, for a couple of reasons:

* `boost::lexical_cast<std::string>` is longer than `fmt::to_string()`,
  so the latter is easier to parse and read.
* `boost::lexical_cast<std::string>` creates a stringstream under the
  hood, so it can use the `operator<<` to stringify the given object.
  but stringstream is known to be less performant than fmtlib.
* we are migrating to fmtlib based formatting, see #13245. so
  using `fmt::to_string()` helps us to remove yet another dependency
  on `operator<<`.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes #13611
2023-04-21 09:43:53 +03:00
Benny Halevy
3f1ac846d8 gms: get rid of unused failure_detector
The legacy failure_detector is now unused and can be removed.

TODO: integrate direct_failure_detector with failure_detector api.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2023-04-21 09:08:27 +03:00
Kefu Chai
99bf8bc0f4 bytes, gms: s/format_to/fmt::format_to/
to disambiguate `fmt::format_to()` from `std::format_to()`. turns out,
we have `using namespace std` somewhere in the source tree, and with
libstdc++ shipped by GCC-13, we have `std::format_to()`, so without
specifying exactly which one to use, the compiler complains like

```
/optimized_clang/stage-1-X86/build/bin/clang++ -MD -MT build/dev/mutation/mutation.o -MF build/dev/mutation/mutation.o.d -I/optimized_clang/scylla-X86/seastar/include -I/optimized_clang/scylla-X86/build/dev/seastar/gen/include -U_FORTIFY_SOURCE -DSEASTAR_SSTRING -Werror=unused-result -fstack-clash-protection -DSEASTAR_API_LEVEL=6 -DSEASTAR_BUILD_SHARED_LIBS -DSEASTAR_ENABLE_ALLOC_FAILURE_INJECTION -DSEASTAR_SCHEDULING_GROUPS_COUNT=16 -DSEASTAR_TYPE_ERASE_MORE -DFMT_SHARED -I/usr/include/p11-kit-1   -ffile-prefix-map=/optimized_clang/scylla-X86=. -march=westmere -DDEVEL -DSEASTAR_ENABLE_ALLOC_FAILURE_INJECTION -DSCYLLA_ENABLE_ERROR_INJECTION -O2 -DSCYLLA_BUILD_MODE=dev -iquote. -iquote build/dev/gen --std=gnu++20  -ffile-prefix-map=/optimized_clang/scylla-X86=. -march=westmere  -DBOOST_TEST_DYN_LINK   -DNOMINMAX -DNOMINMAX -fvisibility=hidden  -Wall -Werror -Wno-mismatched-tags -Wno-tautological-compare -Wno-parentheses-equality -Wno-c++11-narrowing -Wno-missing-braces -Wno-ignored-attributes -Wno-overloaded-virtual -Wno-unused-command-line-argument -Wno-unsupported-friend -Wno-delete-non-abstract-non-virtual-dtor -Wno-braced-scalar-init -Wno-implicit-int-float-conversion -Wno-delete-abstract-non-virtual-dtor -Wno-psabi -Wno-narrowing -Wno-nonnull -Wno-uninitialized -Wno-error=deprecated-declarations -DXXH_PRIVATE_API -DSEASTAR_TESTING_MAIN -DFMT_DEPRECATED_OSTREAM  -c -o build/dev/mutation/mutation.o mutation/mutation.cc
In file included from mutation/mutation.cc:9:
In file included from mutation/mutation.hh:13:
In file included from mutation/mutation_partition.hh:21:
In file included from ./schema/schema_fwd.hh:13:
In file included from ./utils/UUID.hh:22:
./bytes.hh:116:21: error: call to 'format_to' is ambiguous
                    format_to(out, "{}{:02x}", _delimiter, std::byte(v[i]));
                    ^~~~~~~~~
./bytes.hh:134:43: note: in instantiation of function template specialization 'fmt::formatter<fmt_hex>::format<fmt::basic_format_context<fmt::appender, char>>' requested here
        return fmt::formatter<::fmt_hex>::format(::fmt_hex(bytes_view(s)), ctx);
                                          ^
/usr/include/fmt/core.h:813:64: note: in instantiation of function template specialization 'fmt::formatter<seastar::basic_sstring<signed char, unsigned int, 31, false>>::format<fmt::basic_format_context<fmt::appender, char>>' requested here
    -> decltype(typename Context::template formatter_type<T>().format(
                                                               ^
/usr/include/fmt/core.h:824:10: note: while substituting deduced template arguments into function template 'has_const_formatter_impl' [with Context = fmt::basic_format_context<fmt::appender, char>, T = seastar::basic_sstring<signed char, unsigned int, 31, false>]
  return has_const_formatter_impl<Context>(static_cast<T*>(nullptr));
```

to address this FTBFS, let's be more explicit by adding "fmt::" to
specify which `format_to()` to use.
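
the same class of error can be reproduced without fmtlib. in the sketch below, `lib_a` and `lib_b` are hypothetical namespaces standing in for `fmt` and `std`; in the real failure the second candidate was reached through a `using namespace std` directive, which the sketch imitates with two using-directives.

```cpp
#include <cassert>
#include <string>

// Two hypothetical libraries that both export a function named format_to.
namespace lib_a {
inline std::string format_to(const std::string& s) { return "a:" + s; }
}
namespace lib_b {
inline std::string format_to(const std::string& s) { return "b:" + s; }
}

// With both directives in effect, an unqualified call sees both functions.
using namespace lib_a;
using namespace lib_b;

inline std::string render(const std::string& s) {
    // return format_to(s);      // error: call to 'format_to' is ambiguous
    return lib_a::format_to(s);  // the qualified name selects one library
}
```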

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes #13361
2023-03-29 14:47:28 +03:00
Kefu Chai
a3cb5db542 gms/inet_address: implement operator<< using fmt::formatter
less repetition this way.

Refs #13245

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2023-03-27 20:06:45 +08:00
Kefu Chai
8dbaef676d treewide: use fmtlib to format gms::inet_address
the goal of this change is to reduce the dependency on
`operator<<(ostream&, const gms::inet_address&)`.

this is not an exhaustive search-and-replace change: at some
call sites we have other dependencies on not-yet-converted ostream
printers, so we cannot fix them all. this change only updates some
callers of `operator<<(ostream&, const gms::inet_address&)`.

Refs #13245

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2023-03-27 20:06:45 +08:00
Kefu Chai
4ea6e06cac gms/inet_address: specialize fmt::formatter<gms::inet_address>
this is a part of a series migrating from `operator<<(ostream&, ..)`
based formatting to fmtlib based formatting. the goal here is to enable
fmtlib to print `gms::inet_address` with the help of fmt::ostream.
please note, the ':' delimiter is specified when printing the IPv6 address.

Refs #13245

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2023-03-27 20:06:45 +08:00
Tomasz Grabiec
c54a3d9c10 Merge 'Clean enabled features manipulations in system keyspace' from Pavel Emelyanov
There was an attempt to cut the feature-service -> system-keyspace dependency (#13172), which turned out to require more changes. This is a preparatory part squeezed out of that future work.

This set
- leaves only batch-enabling API in feature service
- keeps the need for async context in feature service
- narrows down system keyspace features API to only load and store records
- relaxes features updating logic in sys.ks.
- cosmetic

Closes #13264

* github.com:scylladb/scylladb:
  feature_service: Indentation fix after previous patch
  feature_service: Move async context into enable()
  system_keyspace: Refactor local features load/save helpers
  feature_service: Mark supported_feature_set() const
  feature_service: Remove single feature enabling method
  boot: Enable features in batch
  gossiper: Enable features in batch
2023-03-24 13:12:49 +01:00
Alejo Sanchez
da00052ad8 gms, service: replicate live endpoints on shard 0
Call replicate_live_endpoints on shard 0 to copy from 0 to the rest of
the shards. And get the list of live members from shard 0.

Move lock to the callers.

Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>

Closes #13240
2023-03-21 15:46:12 +01:00
Pavel Emelyanov
970fc80ea6 feature_service: Indentation fix after previous patch
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-03-21 11:59:37 +03:00
Pavel Emelyanov
8600cb2db0 feature_service: Move async context into enable()
Callers don't need to know that enabling features has this requirement
Indentation is deliberately left broken (until next patch)

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-03-21 11:59:34 +03:00
Pavel Emelyanov
ae6e29a919 system_keyspace: Refactor local features load/save helpers
Introduce load_local_enabled_features() and save_local_enabled_features()
that get and put std::set<sstring> with feature names (and perform set to
string and back conversions on their own). They look natural next to
existing sys.ks. methods to get/set local-supported features and peer
features.

Using the new API, the more generic functions to preserve individual
features and load them on startup can become much shorter and cleaner.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-03-21 11:54:02 +03:00
Pavel Emelyanov
6a5ab87441 feature_service: Mark supported_feature_set() const
It is indeed const.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-03-21 11:12:29 +03:00
Pavel Emelyanov
985fbf703a feature_service: Remove single feature enabling method
No longer used

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-03-21 11:12:28 +03:00
Pavel Emelyanov
256dd9d7e3 gossiper: Enable features in batch
Gossiper code walks the list of feature names and enables them
one-by-one. However, in the feature_service code there's a method that
enables features in batch.

Using it now doesn't make any difference, but next patches will make
some use of it. Also, this will allow shortening feature_service's API and
will make it simpler to remove the qctx thing from there.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-03-21 11:12:16 +03:00
Pavel Emelyanov
eecb9244dd sstables: Expell sstable_version_types from_string() helper
Its name is too generic despite its narrow specialization. Also,
there's a version_from_string() method that does the same in a more
convenient way.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-03-21 09:56:18 +03:00
Avi Kivity
c3a2ec9d3c Merge 'use fmt::join() for printing ranges' from Kefu Chai
this series intends to deprecate `::join()`, as it always materializes a range into a concrete string. but what we always want is to print the elements in the given range to a stream, or to a seastar logger, which is backed by fmtlib. also, because fmtlib offers exactly the same set of features implemented by to_string.hh, this change would allow us to use fmtlib to replace to_string.hh for better maintainability, and potentially better performance, as fmtlib is lazily evaluated and claims to be performant under most circumstances.

Closes #13163

* github.com:scylladb/scylladb:
  utils: to_string: move join to namespace utils
  treewide: use fmt::join() when appropriate
  row_cache: pass "const cache_entry" to operator<<
2023-03-19 15:16:02 +02:00
Kefu Chai
c37f4e5252 treewide: use fmt::join() when appropriate
now that fmtlib provides fmt::join() (see
https://fmt.dev/latest/api.html#_CPPv4I0EN3fmt4joinE9join_viewIN6detail10iterator_tI5RangeEEN6detail10sentinel_tI5RangeEEERR5Range11string_view),
there is no need to reinvent the wheel. so in this change, the homebrew
join() is replaced with fmt::join().

as fmt::join() returns a join_view, this could improve the
performance under certain circumstances where the fully materialized
string is not needed.

please note, the goal of this change is to use fmt::join(), and this
change does not intend to improve the performance of existing
implementation based on "operator<<" unless the new implementation is
much more complicated. we will address the unnecessarily materialized
strings in a follow-up commit.

some noteworthy things related to this change:

* unlike the existing `join()`, `fmt::join()` returns a view. so we
  have to materialize the view if what we expect is a `sstring`
* `fmt::format()` does not accept a view, so we cannot pass the
  return value of `fmt::join()` to `fmt::format()`
* fmtlib does not format a typed pointer, i.e., it does not format,
  for instance, a `const std::string*`. but operator<<() always prints
  a typed pointer. so if we want to format a typed pointer, we either
  need to cast the pointer to `void*` or use `fmt::ptr()`.
* fmtlib is not able to pick up the overload of
  `operator<<(std::ostream& os, const column_definition* cd)`, so we
  have to use a wrapper class of `maybe_column_definition` for printing
  a pointer to `column_definition`. since the overload is only used
  by the two overloads of
  `statement_restrictions::add_single_column_parition_key_restriction()`,
  the operator<< for `const column_definition*` is dropped.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2023-03-16 20:34:18 +08:00
Kamil Braun
b919373cce Merge 'api: gossiper: get alive nodes after reaching current shard 0 version' from Alecco
Add an API call to wait for all shards to reach the current shard 0
gossiper version. Throws when timeout is reached.

Closes #12540

* github.com:scylladb/scylladb:
  api: gossiper: fix alive nodes
  gms, service: lock live endpoint copy
  gms, service: live endpoint copy method
2023-03-16 09:46:02 +01:00
Alejo Sanchez
e35762241a api: gossiper: fix alive nodes
Fix API call to wait for all shards to reach the current shard 0
gossiper version. Throws when timeout is reached.

Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>
2023-03-10 17:29:11 +01:00
Alejo Sanchez
6c04476561 gms, service: lock live endpoint copy
To allow concurrent execution, protect copy of live endpoints with a
semaphore.
2023-03-10 17:16:21 +01:00
Benny Halevy
78b0222842 gossiper: get_generation_for_nodes: get nodes as unordered_set
Prepare for following patch.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2023-03-09 11:42:03 +02:00
Alejo Sanchez
f55e91d797 gms, service: live endpoint copy method
Move replication logic for live endpoint across shards to a separate
method

This will be used by API get alive nodes.

As this is now in a method and outside gossiper::run(), assert it's
called from shard 0.

Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>
2023-03-08 10:45:35 +01:00
Avi Kivity
6aa91c13c5 Merge 'Optimize topology::compare_endpoints' from Benny Halevy
The code for compare_endpoints originates at the dawn of time (bc034aeaec)
and is called on the fast path from storage_proxy via `sort_by_proximity`.

This series considerably reduces the function's footprint by:
1. carefully coding the many comparisons in the function so as to reduce the number of conditional branches (apparently the compiler isn't doing a good enough job at optimizing it in this case)
2. avoiding sstring copies in topology::get_{datacenter,rack}

Closes #12761

* github.com:scylladb/scylladb:
  topology: optimize compare_endpoints
  to_string: add print operators for std::{weak,partial}_ordering
  utils: to_sstring: deinline std::strong_ordering print operator
  move to_string.hh to utils/
  test: network_topology: add test_topology_compare_endpoints
2023-03-07 15:17:19 +02:00
Kefu Chai
563fbb2d11 build: cmake: extract more subsystem out into its own CMakeLists.txt
namely, cdc, compaction, dht, gms, lang, locator, mutation_writer, raft, readers, replica,
service, tools, tracing and transport.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2023-03-02 10:15:25 +08:00
Kefu Chai
0cb842797a treewide: do not define/capture unused variables
these warnings are found by Clang-17 after removing
`-Wno-unused-lambda-capture` and '-Wno-unused-variable' from
the list of disabled warnings in `configure.py`.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2023-02-15 22:57:18 +02:00
Benny Halevy
25ebc63b82 move to_string.hh to utils/
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2023-02-15 11:09:04 +02:00
Gleb Natapov
1688163233 raft: replace experimental raft option with dedicated flag
Unlike other experimental features, we want raft to be optional even
after it leaves experimental mode. For that we need to have a separate
option to enable it. The patch adds the binary option "consistent-cluster-management"
for that.
2023-01-03 11:15:11 +02:00
Benny Halevy
c9993f020d storage_service: get rid of handle_state_replacing
Since 2ec1f719de, nodes no longer
publish the HIBERNATE state, so we don't need to support handling it.

Replace is now always done using node operations (using
repair or streaming).

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-12-19 12:19:08 +02:00
Kamil Braun
bf6679906f gms, service: stop gossiping and storing RAFT_SERVER_ID
It is equal to (if present) HOST_ID and no longer used for anything.

The application state was only gossiped if `experimental-features`
contained `raft`, so we can free this slot.

Similarly, `raft_server_id`s were only persisted in `system.peers` if
the `SUPPORTS_RAFT` cluster feature was enabled, which happened only
when `experimental-features` contained `raft`. The `raft_server_id`
field in the schema was also introduced recently in `master` and didn't
get to be in a release yet. Given either of these reasons, we can remove
this field safely.
2022-12-12 15:20:30 +01:00
Kamil Braun
5dbe236339 Revert "gms/gossiper: fetch RAFT_SERVER_ID during shadow round"
This reverts commit 60217d7f50.
We no longer need RAFT_SERVER_ID.
2022-12-12 15:20:20 +01:00
Avi Kivity
e6ffc22053 Merge 'cql3: Server-side DESC statement' from Michał Jadwiszczak
This PR adds a server-side `DESCRIBE` statement, which is required by the latest cqlsh version.

The only change from the user's perspective is that the `DESC ...` statement can be used with cqlsh version >= 6.0. Previously the statement was executed on the client side, but starting with Cassandra 4.0 and cqlsh 6.0, execution of describe was moved to the server side, so the user was unable to run `DESC ...` with Scylla and cqlsh 6.0.

Implemented describe statements:
- `DESC CLUSTER`
- `DESC [FULL] SCHEMA`
- `DESC [ONLY] KEYSPACE`
- `DESC KEYSPACES/TYPES/FUNCTIONS/AGGREGATES/TABLES`
- `DESC TYPE/FUNCTION/AGGREGATE/MATERIALIZED VIEW/INDEX/TABLE`
- `DESC`

[Cassandra's implementation for reference](https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/cql3/statements/DescribeStatement.java)

Changes in this patch:
- cql3::util: added `single_quote()` function
- added `data_dictionary::keyspace_element` interface
- implemented `data_dictionary::keyspace_element` for:
    - keyspace_metadata,
    - UDT, UDF, UDA
    - schema
- cql3::functions: added `get_user_functions()` and `get_user_aggregates()` to get all UDFs/UDAs in specified keyspace
- data_dictionary::user_types_metadata: added `has_type()` function
- extracted `describe_ring()` from storage_service to standalone helper function in `locator/util.hh`
- storage_proxy: added `describe_ring()` (implemented using helper function mentioned above)
- extended CQL grammar to handle describe statement
- increased version in `version.hh` to 4.0.0, so cqlsh will use server-side describe statement
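
As an illustration of the kind of helper a describe statement needs when it prints string literals, here is a hypothetical sketch of a single-quoting function. It is not the `cql3::util` implementation from this PR; it only assumes the CQL convention that a single quote inside a string literal is escaped by doubling it.

```cpp
#include <cassert>
#include <string>

// Hypothetical sketch: wrap a string in single quotes and double any
// embedded single quote, as CQL string literals require.
inline std::string single_quote(const std::string& s) {
    std::string out;
    out.reserve(s.size() + 2);
    out.push_back('\'');
    for (char c : s) {
        if (c == '\'') {
            out.push_back('\'');  // escape by doubling
        }
        out.push_back(c);
    }
    out.push_back('\'');
    return out;
}
```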

Referring: https://github.com/scylladb/scylla/issues/9571, https://github.com/scylladb/scylladb/issues/11475

Closes #11106

* github.com:scylladb/scylladb:
  version: Increasing version
  cql-pytest: Add tests for server-side describe statement
  cql-pytest: creating random elements for describe's tests
  cql3: Extend CQL grammar with server-side describe statement
  cql3:statements: server-side describe statement
  data_dictonary: add `get_all_keyspaces()` and `get_user_keyspaces()`
  storage_proxy: add `describe_ring()` method
  storage_service, locator: extract describe_ring()
  data_dictionary:user_types_metadata: add has_type() function
  cql3:functions: `get_user_functions()` and `get_user_aggregates()`
  implement `keyspace_element` interface
  data_dictionary: add `keyspace_element` interface
  cql3: single_quote() util function
  view: row_lock: lock_ck: reindent
  test/topology: enable replace tests
  service/raft: report an error when Raft ID can't be found in `raft_group0::remove_from_group0`
  service: handle replace correctly with Raft enabled
  gms/gossiper: fetch RAFT_SERVER_ID during shadow round
  service: storage_service: sleep 2*ring_delay instead of BROADCAST_INTERVAL before replace
2022-12-11 18:29:36 +02:00
Michał Jadwiszczak
dd46a92e23 storage_service, locator: extract describe_ring()
`describe_ring()` was implemented as a method of `storage_service`. This
patch extracts it from there to a standalone helper function in
`locator/util.hh`.
2022-12-10 12:51:05 +01:00
Kamil Braun
60217d7f50 gms/gossiper: fetch RAFT_SERVER_ID during shadow round
During the replace operation we need the Raft ID of the replaced node.
The shadow round is used for fetching all necessary information before
the replace operation starts.
2022-12-10 12:27:22 +01:00
Nadav Har'El
4cdaba778d Merge 'Secondary indexes on static columns' from Piotr Dulikowski
This pull request introduces support for global secondary indexes based on static columns.

Local secondary indexes based on static columns are not planned to be supported and are explicitly forbidden. Because there is only one static row per partition and local indexes require the full partition key when querying, such indexes wouldn't be very useful and would only waste resources.

The index table for secondary indexes on static columns, unlike other secondary indexes, does not contain clustering keys from the base table. A static column's value determines a set of full partitions, so the clustering keys would be unnecessary.

The already existing logic for querying using secondary indexes works after introducing minimal modifications. The view update generation path now works on a common representation of static and clustering rows, and the new representation allowed most of the logic to be kept intact.

New cql-pytests are added. All but one of the existing tests for secondary indexes on static columns - ported from Cassandra - now work and have their `xfail` marks lifted; the remaining test requires support for collection indexing, so it will start working only after #2962 is fixed.

Materialized views with static rows as a key are __not__ implemented in this PR.

Fixes: #2963

Closes #11166

* github.com:scylladb/scylladb:
  test_materialized_view: verify that static columns are not allowed
  test_secondary_index: add (currently failing) test for static index paging
  test_secondary_index: add more tests for secondary indexes on static columns
  cassandra_tests: enable existing tests for static columns
  create_index_statement: lift restriction on secondary indexes on static rows
  db/view: fetch and process static rows when building indexes
  gms/feature_service: introduce SECONDARY_INDEXES_ON_STATIC_COLUMNS cluster feature
  create_index_statement: disallow creation of local indexes with static columns
  select_statement: prepare paging for indexes on static columns
  select_statement: do not attempt to fetch clustering columns from secondary index's table
  secondary_index_manager: don't add clustering key columns to index table of static column index
  replica/table: adjust the view read-before-write to return static rows when needed
  db/view: process static rows in view_update_builder::on_results
  db/view: adjust existing view update generation path to use clustering_or_static_row
  column_computation: adjust to use clustering_or_static_row
  db/view: add clustering_or_static_row
  deletable_row: add column_kind parameter to is_live
  view_info: adjust view_column to accept column_kind
  db/view: base_dependent_view_info: split non-pk columns into regular and static
2022-12-08 09:54:05 +02:00
Piotr Dulikowski
25fec0acce gms/feature_service: introduce SECONDARY_INDEXES_ON_STATIC_COLUMNS cluster feature
The new feature will prevent secondary indexes on static columns from
being created unless the whole cluster is ready to support them.
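
The gating idea can be sketched as follows: a feature counts as usable only when every node advertises support for it. The function and type names are hypothetical and do not reflect the actual `gms::feature_service` API; treating an empty node list as "not supported" is also an assumption of the sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <set>
#include <string>
#include <vector>

using feature_set = std::set<std::string>;

// Hypothetical check: true only if all known nodes support the feature.
inline bool cluster_supports(const std::vector<feature_set>& nodes,
                             const std::string& feature) {
    return !nodes.empty()
        && std::all_of(nodes.begin(), nodes.end(),
               [&](const feature_set& supported) {
                   return supported.count(feature) > 0;
               });
}
```

During a rolling upgrade, a node that has not been upgraded yet would keep this check false, so index creation on static columns would be rejected until the last node is done.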
2022-12-06 11:21:16 +01:00