Commit Graph

224 Commits

Author SHA1 Message Date
Pavel Emelyanov
e80adbade3 code: De-globalize gossiper
No code uses global gossiper instance, it can be removed. The main and
cql-test-env code now have their own real local instances.

This change also requires adding the debug:: pointer and fixing the
scylle-gdb.py to find the correct global location.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-05-03 10:57:40 +03:00
Pavel Emelyanov
b0544ba7bd cql test env: Keep gossiper reference on board
The reference is already available at the env initialization, but it's
not kept on the env instance itself. Will be used by the next patch.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-05-03 10:57:40 +03:00
Pavel Emelyanov
4bea0b7491 code: Use gossiper reference where possible
Some places in the code has function-local gossiper reference but
continue to use global instance. Re-use the local reference (it's going
to become sharded<> instance soon).

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-05-03 10:57:40 +03:00
Pavel Emelyanov
38c77d0d85 snitch: Keep gossiper reference
The reference is put on the snitch_ptr because this is the sharded<>
thing and because gossiper reference is the same for different snitch
drivers. Also, getting gossiper from snitch_ptr by driver will look
simpler than getting it from any base class.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-05-03 10:57:40 +03:00
Pavel Emelyanov
2d32c47d0d main, cql_test_env: Start snitch later
Snitch depends on gossiper and system keyspace, so it needs to be
started after those two do.

fixes #10402

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-05-03 10:57:32 +03:00
Piotr Grabowski
6537dc6126 loading_cache_test: test prepared stmts cache
Add a new test that sets up a CQL environment with a very small prepared 
statements cache. The test reproduces a scenario described in #10440,
where a privileged section of prepared statement cache gets large
and that could possibly starve the unprivileged section, making it
impossible to execute BATCH statements. Additionally, at the end of the 
test, prepared statements/"simulated batches" with prepared statements 
are executed a random number of times, stressing the cache.

To create a CQL environment with small prepared cache, cql_test_config
is extended to allow setting custom memory_config value.
2022-04-29 19:22:55 +02:00
Pavel Emelyanov
828a951886 snitch: Remove create_snitch/stop_snitch
After previous patches both, create_snitch() and stop_snitch() no look
like the classica sharded service start/stop sequence. Finally both
helpers can be removed and the rest of the user can just call start/stop
on locally obtained sharded references.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-04-11 14:43:25 +03:00
Pavel Emelyanov
633746b87d snitch: Make config-based construction of all drivers
Currently snitch drivers register themselves in class-registry with all
sorts of construction options possible. All those different constuctors
are in fact "config options".

When later snitch will declare its dependencies (gossiper and system
keyspace), it will require patching all this registrations, which's very
inconvenient.

This patch introduces the snitch_config struct and replaces all the
snitch constructors with the snitch_driver(snitch_config cfg) one.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-04-11 14:38:34 +03:00
Pavel Emelyanov
3da5f6ac30 gossiper: Add system keyspace dependency
The gossiper reads peer features from system keyspace. Also the snitch
code needs system keyspace, and since now it gets all its dependencies
from gossiper (will be fixed some day, but not now), it will do the same
for sys.ks.. Thus it's worth having gossiper->system_keyspace explicit
dependency.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-03-25 15:08:13 +03:00
Pavel Emelyanov
62417577ab cdc_generation_service: Add system keyspace dependency
The service uses system keyspace to, e.g., manage the generation id,
thus it depends on the system_keyspace instance and deserves the
explicit reference.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-03-25 13:39:32 +03:00
Tomasz Grabiec
cd5fec8a23 Merge "raft: re-advertise gossiper features when raft feature support changes" from Pavel
Prior to the change, `USES_RAFT_CLUSTER_MANAGEMENT` feature wasn't
properly advertised upon enabling `SUPPORTS_RAFT_CLUSTER_MANAGEMENT`
raft feature.

This small series consists of 3 parts to fix the handling of supported
features for raft:
1. Move subscription for `SUPPORTS_RAFT_CLUSTER_MANAGEMENT` to the
   `raft_group_registry`.
2. Update `system.local#supported_features` directly in the
   `feature_service::support()` method.
3. Re-advertise gossiper state for `SUPPORTED_FEATURES` gossiper
   value in the support callback within `raft_group_registry`.

* manmanson/track_supported_set_recalculation_v7:
  raft: re-advertise gossiper features when raft feature support changes
  raft: move tracking `SUPPORTS_RAFT_CLUSTER_MANAGEMENT` feature to raft
  gms: feature_service: update `system.local#supported_features` when feature support changes
  test: cql_test_env: enable features in a `seastar::thread`
2022-03-18 12:34:17 +01:00
Pavel Solodovnikov
011942dcce raft: move tracking SUPPORTS_RAFT_CLUSTER_MANAGEMENT feature to raft
Move the listener from feature service to the `raft_group_registry`.

Enable support for the `USES_RAFT_CLUSTER_MANAGEMENT`
feature when the former is enabled.

Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>
2022-03-18 09:54:25 +03:00
Pavel Solodovnikov
724ea7aa38 test: cql_test_env: enable features in a seastar::thread
Each feature can have an associated `when_enabled` callback
registered, which is assumed to run in the thread context,
so wrap the `enable()` call in a seastar thread.

Tests: unit(dev)

Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>
2022-03-18 09:54:15 +03:00
Pavel Emelyanov
009c449cc3 replica: Push sharded<system_keyspace> down to parse_system_tables
The method needs to call merge_schema() that will need system keyspace
instance at hand. The parse_s._t. method is boot-time one, pushing the
main-local instance through it is fine

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-03-16 14:24:40 +03:00
Pavel Emelyanov
f18a80852e storage_service: Keep sharded<system_keyspace> reference
Storage service uses system keyspace on boot heavily

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-03-16 14:24:40 +03:00
Pavel Emelyanov
42e733bdf7 migration_manager: Keep sharded<system_keyspace> reference
The main target here is system_keyspace::update_schema_version() which
is now static, but needs to have system_keyspace at "this". Migration
manager is one of the places that calls that method indirectly.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-03-16 14:24:40 +03:00
Pavel Emelyanov
b761e558b5 system_keyspace: Keep local_cache reference on board
For now it's a reference, but all users of the cache will be
eventually switched into using system_keyspace.

In cql-test-env cache starting happens earlier than it was
before, but that's OK, it just initializes empty instances.
In main cache starts at the same time as before patching.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-03-16 13:57:59 +03:00
Pavel Emelyanov
1bcb6c13a5 system_keyspace: Move minimal_setup into start
Start happens at exactly the same place. One thing to take care
of is that it happens on all shards.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-03-16 13:57:59 +03:00
Pavel Emelyanov
7ef69b8189 system_keyspace: Make sharded object
The db::system_keyspace was made a class some time ago, time to create
a standard sharded<> object out of it. It needs query processor and
database. None of those depensencies is started early enough, so the
object for now starts in two steps -- early instances creation and
late start.

The instances will carry qctx and local_cache on board and all the
services that need those two will depend on system-keyspace. Its start
happens at exactly the same place where system_keyspace::setup happens
thus any service that will use system_keyspace will be on the same
safe side as it is now.

In the further future the system_keyspace will be equpped with its
own query processor backed by local replica database instance, instead
of the whole storage proxy as it is now.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-03-16 13:57:59 +03:00
Benny Halevy
8481852c91 cql_test_env: raft_group_registry::drain_on_shutdown before stopping the
database

We're currently stopping raft_gr before
shutting the database down, but we fail to do that if
anything goes wrong before that, e.g. if
distributed_loader::init_non_system_keyspaces fails.

This change splits drain_on_shutdown out of stop()
to stop the raft groups before the database is stopped
and does the rest in a deferred_stop placed right
after the rafr_gr registry is strated.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-03-14 11:49:44 +02:00
Michael Livshin
3fef604075 sstables_manager: add get_local_host_id() method and support
Since ME sstable format includes originating host id in stats
metadata, local host id needs to be made available for writing and
validation.

Both Scylla server (where local host id comes from the `system.local`
table) and unit tests (where it is fabricated) must be accomodated.
Regardless of how the host id is obtained, it is stored in the db
config instance and accessed through `sstables_manager`.

Signed-off-by: Michael Livshin <michael.livshin@scylladb.com>
2022-02-16 18:21:24 +02:00
Michael Livshin
0b1447c702 add "sstable_format" config
Initialize it to "md" until ME format support is
complete (i.e. storing originating host id in sstable stats metadata
is implemented), so at present there is no observable change by
default.

Also declare "enable_sstables_md_format" unused -- the idea, going
forward, being that only "sstable_format" controls the written sstable
file format and that no more per-format enablement config options
shall be added.

Signed-off-by: Michael Livshin <michael.livshin@scylladb.com>
2022-02-16 18:21:24 +02:00
Piotr Dulikowski
ffd439d908 cql_test_env: optimize handling result_message::exception
The single_node_cql_env uses query_processor::execute_xyz family of
methods to perform operations. Due to previous commits in this series,
they allocate one more task than before - a continuation that converts
result_message::exception into an exceptional future. We can recover
that one task by using variants of those methods which do not perform a
conversion, and turn .finally() invocations into .then()s which perform
conversion manually.
2022-02-08 11:08:42 +01:00
Michał Sala
66a93d3000 cql3: query_processor: add forward_service reference to query_processor 2022-02-01 21:14:41 +01:00
Kamil Braun
a664ac7ba5 treewide: require group0_guard when performing schema changes
`announce` now takes a `group0_guard` by value. `group0_guard` can only
be obtained through `migration_manager::start_group0_operation` and
moved, it cannot be constructed outside `migration_manager`.

The guard will be a method of ensuring linearizability for group 0
operations.
2022-01-24 15:20:35 +01:00
Kamil Braun
283ac7fefe treewide: pass mutation timestamp from call sites into migration_manager::prepare_* functions
The functions which prepare schema change mutations (such as
`prepare_new_column_family_announcement`) would use internally
generated timestamps for these mutations. When schema changes are
managed by group 0 we want to ensure that timestamps of mutations
applied through Raft are monotonic. We will generate these timestamps at
call sites and pass them into the `prepare_` functions. This commit
prepares the APIs.
2022-01-24 15:12:50 +01:00
Avi Kivity
fcb8d040e8 treewide: use Software Package Data Exchange (SPDX) license identifiers
Instead of lengthy blurbs, switch to single-line, machine-readable
standardized (https://spdx.dev) license identifiers. The Linux kernel
switched long ago, so there is strong precedent.

Three cases are handled: AGPL-only, Apache-only, and dual licensed.
For the latter case, I chose (AGPL-3.0-or-later and Apache-2.0),
reasoning that our changes are extensive enough to apply our license.

The changes we applied mechanically with a script, except to
licenses/README.md.

Closes #9937
2022-01-18 12:15:18 +01:00
Gleb Natapov
100b44f5ff test use new schema announcement api in cql_test_env.cc 2022-01-13 23:09:02 +02:00
Avi Kivity
4392c20bd3 replica: move distributed_loader into replica module
distributed_loader is replica-side thing, so it belongs in the
replica module ("distributed" refers to its ability to load
sstables in their correct shards). So move it to the replica
module.
2022-01-10 15:25:28 +02:00
Avi Kivity
bbad8f4677 replica: move ::database, ::keyspace, and ::table to replica namespace
Move replica-oriented classes to the replica namespace. The main
classes moved are ::database, ::keyspace, and ::table, but a few
ancillary classes are also moved. There are certainly classes that
should be moved but aren't (like distributed_loader) but we have
to start somewhere.

References are adjusted treewide. In many cases, it is obvious that
a call site should not access the replica (but the data_dictionary
instead), but that is left for separate work.

scylla-gdb.py is adjusted to look for both the new and old names.
2022-01-07 12:04:38 +02:00
Avi Kivity
ae3a360725 database: Move database, keyspace, table classes to replica/ directory
The database, keyspace, and table classes represent the replica-only
part of the objects after which they are named. Reading from a table
doesn't give you the full data, just the replica's view, and it is not
consistent since reconciliation is applied on the coordinator.

As a first step in acknowledging this, move the related files to
a replica/ subdirectory.
2022-01-06 17:07:30 +02:00
Pavel Emelyanov
7a15f1c402 batch_|modification_statement: Make get_mutations accept query_processor
This completes the batch_ and modification_statement rework.
Also touch the private batch_statement::read_command while at it.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-12-23 10:54:28 +03:00
Avi Kivity
a97731a7e5 migration_manager: replace uses of get_storage_proxy and get_local_storage_proxy with constructor-provided reference
A static helper also gained a storage_proxy parameter.
2021-12-16 21:05:47 +02:00
Avi Kivity
d768e9fac5 cql3, related: switch to data_dictionary
Stop using database (and including database.hh) for schema related
purposes and use data_dictionary instead.

data_dictionary::database::real_database() is called from several
places, for these reasons:

 - calling yet-to-be-converted code
 - callers with a legitimate need to access data (e.g. system_keyspace)
   but with the ::database accessor removed from query_processor.
   We'll need to find another way to supply system_keyspace with
   data access.
 - to gain access to the wasm engine for testing whether used
   defined functions compile. We'll have to find another way to
   do this as well.

The change is a straightforward replacement. One case in
modification_statement had to change a capture, but everything else
was just a search-and-replace.

Some files that lost "database.hh" gained "mutation.hh", which they
previously had access to through "database.hh".
2021-12-15 13:54:23 +02:00
Avi Kivity
399e2895f1 test: cql_test_env: provide access to data_dictionary
Allow tests to have access to the data_dictionary.
2021-12-15 13:54:18 +02:00
Gleb Natapov
e9fafea5c1 migration_manager: pass raft_gr to the migration manager
Migration manager will be use raft group zero to distribute schema
changes.
2021-12-11 12:31:07 +02:00
Avi Kivity
078f69c133 Merge "raft: (service) implement group 0 as a service" from Kostja
"
To ensure consistency of schema and topology changes,
Scylla needs a linearizable storage for this data
available at every member of the database cluster.

The series introduces such storage as a service,
available to all Scylla subsystems. Using this service, any other
internal service such as gossip or migrations (schema) could
persist changes to cluster metadata and expect this to be done in
a consistent, linearizable way.

The series uses the built-in Raft library to implement a
dedicated Raft group, running on shard 0, which includes all
members of the cluster (group 0), adds hooks to topology change
events, such as adding or removing nodes of the cluster, to update
group 0 membership, ensures the group is started when the
server boots.

The state machine for the group, i.e. the actual storage
for cluster-wide information still remains a stub. Extending
it to actually persist changes of schema or token ring
is subject to a subsequent series.

Another Raft related service was implemented earlier: Raft Group
Registry. The purpose of the registry is to allow Scylla have an
arbitrary number of groups, each with its own subset of cluster
members and a relevant state machine, sharing a common transport.
Group 0 is one (the first) group among many.
"

* 'raft-group-0-v12' of github.com:scylladb/scylla-dev:
  raft: (server) improve tracing
  raft: (metrics) fix spelling of waiters_awaken
  raft: make forwarding optional
  raft: (service) manage Raft configuration during topology changes
  raft: (service) break a dependency loop
  raft: (discovery) introduce leader discovery state machine
  system_keyspace: mark scylla_local table as always-sync commitlog
  system_keyspace: persistence for Raft Group 0 id and Raft Server Id
  raft: add a test case for adding entries on follower
  raft: (server) allow adding entries/modify config on a follower
  raft: (test) replace virtual with override in derived class
  raft: (server) fix a typo in exception message
  raft: (server) implement id() helper
  raft: (server) remove apply_dummy_entry()
  raft: (test) fix missing initialization in generator.hh
2021-11-30 16:24:51 +02:00
Konstantin Osipov
c22f945f11 raft: (service) manage Raft configuration during topology changes
Operations of adding or removing a node to Raft configuration
are made idempotent: they do nothing if already done, and
they are safe to resume after a failure.

However, since topology changes are not transactional, if a
bootstrap or removal procedure fails midway, Raft group 0
configuration may go out of sync with topology state as seen by
gossip.

In future we must change gossip to avoid making any persistent
changes to the cluster: all changes to persistent topology state
will be done exclusively through Raft Group 0.

Specifically, instead of persisting the tokens by advertising
them through gossip, the bootstrap will commit a change to a system
table using Raft group 0. nodetool will switch from looking at
gossip-managed tables to consulting with Raft Group 0 configuration
or Raft-managed tables.
Once this transformation is done, naturally, adding a node to Raft
configuration (perhaps as a non-voting member at first) will become the
first persistent change to ring state applied when a node joins;
removing a node from the Raft Group 0 configuration will become the last
action when removing a node.

Until this is done, do our best to avoid a cluster state when
a removed node or a node which addition failed is stuck in Raft
configuration, but the node is no longer present in gossip-managed
system tables. In other words, keep the gossip the primary source of
truth. For this purpose, carefully chose the timing when we
join and leave Raft group 0:

Join the Raft group 0 only after we've advertised our tokens, so the
cluster is aware of this node, it's visible in nodetool status,
but before node state jumps to "normal", i.e. before it accepts
queries. Since the operation is idempotent, invoke it on each
restart.

Remove the node from Group 0 *before* its tokens are removed
from gossip-managed system tables. This guarantees
that if removal from Raft group 0 fails for whatever reason,
the node stays in the ring, so nodetool removenode and
friends are re-tried.

Add tracing.
2021-11-25 12:35:42 +03:00
Pavel Emelyanov
ef1960d034 code: Carry gossiper down to virtual tables creation
One of the tables needs gossiper and uses global one. This patch
prepares the fix by patching the main -> register_virtual_tables
stack with the gossiper reference.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-11-25 10:52:55 +03:00
Pavel Emelyanov
aaa58b7b89 storage_service: Keep streaming_manager reference
The manager is drained() on drain/decommission/isolate. Since now
it's storage_service who orchestrates all of the above, it needs
and explicit reference on the target.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-11-24 12:17:35 +03:00
Benny Halevy
d344765ec6 get rid of the global batchlog_manager
Now that it's unused.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2021-11-23 08:27:30 +02:00
Benny Halevy
9cde52c58f storage_service: keep a reference to the batchlog_manager
Rather than accessing the global batchlog_manager.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2021-11-23 08:27:30 +02:00
Benny Halevy
c6d82891cc test: cql_test_env: expose batchlog_manager
And use in batchlog_manager_test.test_execute_batch
to help deglobalize the batchlog_manager.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2021-11-23 08:27:30 +02:00
Benny Halevy
03039e8f8a main: allow setting the global batchlog_manager
As a prerequisite to globalizing the batchlog_manager,
allow setting a global pointer to it and instantiate
the sharded<db::batchlog_manager> on the main/cql_test_env
stack.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2021-11-23 08:27:30 +02:00
Benny Halevy
1e259665fe test: cql_test_env: stop view_update_generator before database shuts down
We can't have view updates happening after the database shuts down.
In particular, mutateMV depends on the keyspace effective_replaication_map
and it is going to be released when all keyspaces shut down, in the next patch.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2021-11-19 10:52:41 +02:00
Benny Halevy
1d7556d099 main: pass erm_factory to storage_service
To be used for creating effective_replication_map
when token_metadata changes, and update all
keyspaces with it.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2021-11-19 10:46:51 +02:00
Benny Halevy
242043368e main: pass erm_factory to storage_proxy
To be used for creating the effective_replication_map per keyspace.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2021-11-19 10:46:51 +02:00
Benny Halevy
3fed73e7c2 locator: add effective_replication_map_factory
It will be used further to create shared copies
of effective_replication_map based on replication_strategy
type and config options.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2021-11-19 10:46:51 +02:00
Pavel Emelyanov
947e4c9a10 code: Push db::config down to virtual tables
The db::config reference is available on the database, which
can be get from the virtual_table itself. The problem is that
it's a const refernece, while system.config will be updateable
and will need non-const reference.

Adding non-const get_config() on the database looks wrong. The
database shouldn't be used as config provider, even the const
one.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-11-11 16:39:34 +03:00
Piotr Sarna
4bfaa7d9fc Merge 'Service levels: fix undefined behaviours' from Eliran Sinvani
This mini series contains two fixes that are bundled together since the
second one assumes that the first one exists (or it will not fix
anything really...), the two problems were:
1. When certain operations are called on a service level controller
   which doesn't have it's data accessor set, it can lead to a crash
since some operations will still try to dereference the accessor
pointer.
2. The cql environment test initialized the accessor with a
   sharded<system_distributed_data>& however this sharded class as
itself is not initialized (sharded::start wasn't called), so for the
same that were unsafe for null dereference the accessor will now crash
for trying to access uninitialized sharded instance.

Closes #9468

* github.com:scylladb/scylla:
  CQL test environment: Fix bad initialization order
  Service Level Controller: Fix possible dereference of a null pointer
2021-10-18 08:53:53 +02:00