Commit Graph

776 Commits

Author SHA1 Message Date
Pavel Emelyanov
6de8bb663b storage_service: Remove migration notifier dependency
The only reason why storage service keeps a refernce on the migration
notifier is that the latter was needed by cdc before previous patch.
Now cdc gets the notifier directly from main, so storage service is
a bit more off the hook.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-04-29 22:47:13 +03:00
Pavel Emelyanov
3a7ca647af cdc: Provide migration notifier right at once
The only way db_context's migration notifier reference is set up
is via cdc_service->db_context::builder->.build chain of calls.
Since the builder's notifier optional reference is always
disengaged (the .with_migration_notifier is removed by previous
patch) the only possible notifier reference there is from the
storage service which, in turn, is the same as in main.cc.

Said that -- push the notifier reference onto db_context directly.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-04-29 22:40:24 +03:00
Pavel Emelyanov
13d264d6bd migration_manager: Make it main-local
Now everybody is patched to use component-local instance of migration
manager and its global instance can be moved into main() scope.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-04-23 17:13:24 +03:00
Pavel Emelyanov
423d0baa65 streaming: Get migration_manager shared_ptr in messaging
Same as in previous patch, but for streaming code.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-04-23 17:13:24 +03:00
Pavel Emelyanov
a4569a30f3 storage_proxy: Get migration_manager shared_ptr in messaging
The proxy's messaging code uses migration manager to obtain schema.
Since proxy is more low-level service than migration manager, it's
incorrect to make proxy reference the manager directly. Instead,
push the shared_ptr into proxy's messaging code. This kills two
birst with one stone:

1: let proxy use migration manager
2: makes sure that by the time migration manager is stopped
   the proxy's use of this pointer is gone (unregistered from
   rpc)

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-04-23 17:13:24 +03:00
Pavel Emelyanov
b7fe191e3d storage_service: Keep migration manager on board
The storage service needs migration manager to sync schema
on lifecycle notifiers and to stop the guy on drain. So this
patch just pushes the migration manager reference all the
way through the storage service constructor.

Few words about tests. Since now storage service needs the
migration manager in constructor, some tests should take it
from somewhere. The cql_test_env already has (and uses) it,
all the others can just provide a not-started sharded one,
it won't be in use in _those_ tests anyway.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-04-23 17:13:24 +03:00
Pavel Emelyanov
4c30556b8e repair: Keep private sharded migration manager pointer
It's nowadays standard for repair to keep global pointers on
the needed services. Keep the migration manager there too to
avoid explicit call to get_local_migration_manager. Later this
pile of global pointers will be encapsulated on redis service.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-04-23 17:13:24 +03:00
Pavel Emelyanov
2e74dc5fd7 redis: Carry sharded migration manager over init
The only place in redis that needs migration manager is the
::init method that's called on start. It's possible to pass
the migration manager as an argument.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-04-23 17:13:24 +03:00
Piotr Sarna
2591cbb62e main: add a debug symbol for service level controller
It's notoriously hard to find the service level controller symbol
(possible by guessing the offset based on system_distributed_keyspace
address, but it's very cumbersome). To make the debugging process
easier, the symbol is exported via the `debug` namespace.

Closes #8506
2021-04-19 11:29:01 +03:00
Avi Kivity
935378fa53 main: start background reclaim before bootstrap
We start background reclaim after we bootstrap, so bootstrap doesn't
benefit from it, and sees long stalls.

Fix by moving background reclaim initialization early, before
storage_service::join_cluster().

(storage_service::join_cluster() is quite odd in that main waits
for it synchronously, compared to everything else which is just
a background service that is only initialized in main).

Fixes #8473.

Closes #8474
2021-04-15 11:59:41 +02:00
Piotr Sarna
26ee6aa1e9 transport: initialize query state with service level controller
Query state should be aware of the service level controller in order
to properly serve service-level-related CQL queries.
2021-04-12 16:31:27 +02:00
Piotr Sarna
32bcbe59ad main: add initializing service level data accessor
The accessor must be set up in order to be able to use
statement related to service level management.
2021-04-12 16:31:27 +02:00
Eliran Sinvani
f78707d3fb cql: Support accessing service_level_controller from query state
In order to implement service level cql queries, the queries objects
needs access to the service_level_controller object when processing.
This patch adds this access by embedding it into the query state object.
In order to accomplish the above the query processor object needs an
access to service_level_controller in order to instantiate the query state.
Message-Id: <68f5a7796068a49d9cd004f1cbf34bdf93b418bc.1609234193.git.sarna@scylladb.com>
2021-04-12 16:01:04 +02:00
Eliran Sinvani
e173eaa032 instantiate and initialize the service_level_controller
This patch adds the initialization of service_level_controller. It
constructs the distributed service and start the watch loop for
distributed data changes.
Message-Id: <e97661194833d576aa39b3e7886366590f272612.1609175402.git.sarna@scylladb.com>
2021-04-12 16:01:04 +02:00
Pavel Emelyanov
887a1b0d3d tracing: Stop tracing in main's deferred action
Tracing is created in two steps and is destroyed in two too.
The 2nd step doesn't have the corresponding stop part, so here
it is -- defer tracing stop after it was started.

But need to keep in mind, that tracing is also shut down on
drain, so the stopping should handle this.

Fixes #8382
tests: unit(dev), manual(start-stop, aborted-start)

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Message-Id: <20210331092221.1602-1-xemul@scylladb.com>
2021-03-31 12:28:37 +03:00
Piotr Sarna
ef1de114f0 thrift: partially add admission control
This commit adds admission control in the form of passing
service permits to the Thrift server.
The support is partial, because Thrift also supports running CQL
queries, and for that purpose a query_state object is kept
in the Thrift handler. However, the handler is generally created
once per connection, not once per query, and the query_state object
is supposed to keep the state of a single query only.
In order to keep this series simpler, the CQL-on-top-of-Thrift
layer is not touched and is left as TODO.
Moreover, the Thrift layer does not make it easy to pass custom
per-query context (like service_permit), so the implementation
uses a trick: the service permit is created on the server
and then passed as reference to its connections and their respective
Thrift handlers. Then, each time a query is read from the socket,
this service permit is overwritten and then read back from the Thrift
handler. This mechanism heavily relies on the fact that there are
zero preemption points between overwriting the service permit
and reading it back by the handler. Otherwise, races may occur.
This assumption was verified by code inspection + empirical tests,
but if somebody is aware that it may not always hold, please speak up.
2021-03-29 13:05:16 +02:00
Pavel Emelyanov
b60099c2f8 storage_proxy: Drain and unsubscribe in main.cc
Currently shutdown after drain leaves storage proxy
subscribed on storage_service events and without the
storage_proxy::drain_on_shutdown being called. So it
seems safe if the whole thing is relocated closer to
its starting peers.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-03-26 18:58:46 +03:00
Pavel Emelyanov
f1d7804102 tracing: Stop it in main.cc
The tracing::stop() just checks that it was shutdown()-ed and
otherwise a noop, so it's OK to stop tracing later. This brings
drain() and drain_on_shutdown() closer to each other and makes
main.cc look more like it should.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-03-26 18:58:45 +03:00
Pavel Emelyanov
d7cccec97f system_distributed_keyspace: Stop it in main.cc
It's now stopped in drain_on_shutdown, but since its stop()
method is a noop, it doesn't matter where it is. Keeping it
in main.cc next to related start brings drain_on_shutdown()
closer to drain() and the whole thing closer to the Ideal
start-stop sequence.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-03-26 18:58:45 +03:00
Pavel Emelyanov
5456174d69 storage_service: Move (un)subscription to migration events
After the patch the subscription effectively happens at
the same time as before, but is now located in main.cc,
so no real change here.

The unsubscription was in the drain_on_shutdown before
the patch, but after it it happens to be a defer next
to its peer, i.e. later, but it shouldn't be disastrous
for two reasons. First -- client services and migration
manager are already stopped. Second -- before the patch
this subscription was _not_ cancelled if shutdown ran
after nodetool drain and it didn't cause troubles.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-03-26 18:58:45 +03:00
Pavel Emelyanov
f0a79574d4 memory_limiter: Use main-local instance everyehere
The cql_server and alternator both need the limiter, so
patch them to stop using storage service's one and use
the main-local one.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-03-18 11:28:45 +01:00
Pavel Emelyanov
359e9caf54 main: Have local memory limiter and carry where needed
Prepare memory limiters to have non-global instance of
the service. For now the main-local instance is not
used and (!) is not stopped for real, just like the
storage_service's one is.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-03-18 11:28:45 +01:00
Tomasz Grabiec
40121621f6 Merge "Kill some get_local_migration_manager() calls" from Pavel Emelyanov
There are a bunch of such calls in schema altering statements and
there's currently no way to obtain the migration manager for such
statements, so a relatively big rework needed.

The solution in this set is -- all statements' execute() methods are
called with query processor as first argument (now the storage proxy
is there), query processor references and provides migration manager
for statements. Those statements that need proxy can already get it
from the query processor.

Afterwards table_helper and thrift code can also stop using the global
migration manager instance, since they both have query processor in
needed places. While patching them a couple of calls to global storage
proxy also go away.

The new query processor -> migration manager dependency fits into
current start-stop sequence: the migration manager is started early,
the query processor is started after it. On stop the query processor
remains alive, but the migration manager stops. But since no code
currently (should) call get_local_migration_manager() it will _not_
call the query_processor::get_migration_manager() either, so this
dangling reference is ugly, but safe.

Another option could be to make storage proxy reference migration
manager, but this dependency doesn't look correct -- migration manager
is higher-level service than the storage proxy is, it is migration
manager who currently calls storage proxy, but not the vice versa.

* xemul/br-kill-some-migration-managers-2:
  cql3: Get database directly from query processor
  thrift: Use query_processor::get_migration_manager()
  table_helper: Use query_processor::get_migration_manager()
  cql3: Use query_processor::get_migration_manager() (lambda captures cases)
  cql3: Use query_processor::get_migration_manager() (alter_type statement)
  cql3: Use query_processor::get_migration_manager() (trivial cases)
  query_processor: Keep migration manager onboard
  cql3: Pass query processor to announce_migration:s
  cql3: Switch to qp (almost) in schema-altering-stmt
  cql3: Change execute()'s 1st arg to query_processor
2021-03-17 09:59:22 +02:00
Calle Wilund
a0745f9498 messaging_service: Enforce dc/rack membership iff required for non-tls connections
When internode_encryption is "rack" or "dc", we should enforce incoming
connections are from the appropriate address spaces iff answering on
non-tls socket.

This is implemented by having two protocol handlers. One for tls/full notls,
and one for mixed (needs checking) connections. The latter will ask
snitch if remote address is kosher, and refuse the connection otherwise.

Note: requires seastar patches:
"rpc: Make is possible for rpc server instance to refuse connection"
"RPC: (client) retain local address and use on stream creation"

Note that ip-level checks are not exhaustive. If a user is also using
"require_client_auth" with dc/rack tls setting we should warn him that
there is a possibility that someone could spoof himself pass the
authentication.

Closes #8051
2021-03-17 09:59:22 +02:00
Pavel Emelyanov
1de235f4da query_processor: Keep migration manager onboard
The query processor sits upper than the migration manager,
in the services layering, it's started after and (will be)
stopped before the migration manager.

The migration manager is needed in schema altering statements
which are called with query processor argument. They will
later get the migration manager from the query processor.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-03-15 19:00:58 +03:00
Asias He
268fa9d9fe main: Lower shares for main scheduling group     The main scheduling group has the shares of 1000, which is as high as   the statement group.     From time to time, we see unexpected scheduling group leaking to the main group, which causes the drop of the query performance.   This patch reduce the main scheduling shares to 200, which is the same as the maintenance scheduling group. It is a safer default in case code leaks to the main scheduling group.         Refs: #7720
Closes #8243
2021-03-10 19:34:45 +02:00
Raphael S. Carvalho
05b07c7161 sstable_set: preparatory work to change sstable_set::all() api
users of sstable_set::all() rely on the set itself keeping a reference
to the returned list, so user can iterate through the list assuming
that it is alive all the way through.

this will change in the future though, because there will be a
compound set impl which will have to merge the all() of multiple
managed sets, and the result is a temporary value.

so even range-based loops on all() have to keep a ref to the returned
list, to avoid the list from being prematurely destroyed.

so the following code
	for (auto& sst : *sstable_set.all()) { ...}
becomes
	for (auto sstables = sstable_set.all(); auto& sst : *sstables) { ... }

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
2021-03-10 12:02:12 -03:00
Piotr Sarna
0e0282cdf1 Merge ' cdc: move (most of) CDC generation management to a new service' from Kamil Braun
Currently all management of CDC generations happens in storage_service,
which is a big ball of mud that does many unrelated things.

This PR introduces a new service crafted to handle CDC generation
management: listening and reacting to generation changes in the cluster.

We plug the service in, initializing it in main and test code,
passing a reference to storage_service and having storage_service call
the service (using the `after_join` method): the service only starts
doing its job after the node joins the token ring (either on bootstrap
or restart).

Some parts of generation management still remain in storage_service:
the bootstrap procedure, which happens inside storage_service,
must also do some initialization regarding CDC generations,
for example: on restart it must retrieve the latest known generation
timestamp from disk; on bootstrap it must create a new generation
and announce it to other nodes. The order of these operations w.r.t
the rest of the startup procedure is important, hence the startup
procedure is the only right place for them. We may try decoupling
these services even more in follow-up PRs, but that requires a bit
of careful reasoning. What this PR does is a low-hanging fruit.

Still, what remains in storage_service is a small part of the entire
CDC generation management logic; most of it has been moved to the
new service. This includes listening for generation changes and
updating the data structures for performing CDC log writes (cdc::metadata).
Furthermore these handling functions now return futures (and are internally
coroutines), where previously they required a seastar::async context.

This PR is a prerequisite to fixing #7985. The fact that all the CDC generation
management code was in storage_service is technical debt. It will be easier
to modify the management algorithms when they sit in their own module.

Tests: unit (dev) and cdc_tests.py dtest (dev), and local replication test using scylla-cdc-java

Closes #8172

* github.com:scylladb/scylla:
  cdc: move (most of) CDC generation management code to the new service
  cdc: coroutinize make_new_cdc_generation
  cdc: coroutinize update_streams_description
  cdc: introduce cdc::generation_service
  main: move cdc_service initialization just prior to storage_service initialization
2021-02-26 12:42:27 +01:00
Kamil Braun
e2f03e4aba cdc: move (most of) CDC generation management code to the new service
Currently all management of CDC generations happens in storage_service,
which is a big ball of mud that does many unrelated things.

Previous commits have introduced a new service for managing CDC
generations. This code moves most of the relevant code to this new
service.

However, some part still remains in storage_service: the bootstrap
procedure, which happens inside storage_service, must also do some
initialization regarding CDC generations, for example: on restart it
must retrieve the latest known generation timestamp from disk; on
bootstrap it must create a new generation and announce it to other
nodes. The order of these operations w.r.t the rest of the startup
procedure is important, hence the startup procedure is the only right
place for them.

Still, what remains in storage_service is a small part of the entire
CDC generation management logic; most of it has been moved to the
new service. This includes listening for generation changes and
updating the data structures for performing CDC log writes (cdc::metadata).
Furthermore these functions now return futures (and are internally
coroutines), where previously they required a seastar::async context.
2021-02-26 12:06:12 +01:00
Tomasz Grabiec
ecb6c56a2a Merge 'lsa: background reclaim' from Avi Kivity
This series adds background reclaim to lsa, with the goal
that most large allocations can be satisfied from available
free memory, and and reclaim work can be done from a preemptible
context.

If the workload has free cpu, then background reclaim will
utilize that free cpu, reducing latency for the main workload.
Otherwise, background reclaim will compete with the main
workload, but since that work needs to happen anyway,
throughput will not be reduced.

A unit test is added to verify it works.

Fixes #1634.

Closes #8044

* github.com:scylladb/scylla:
  test: logalloc_test: test background reclaim
  logalloc: reduce gap between std min_free and logalloc min_free
  logalloc: background reclaim
  logalloc: preemptible reclaim
2021-02-24 13:23:30 +01:00
Asias He
554ab035dd main: Run init_server and join_cluster inside maintenance scheduling group
Currently, init_server and join_cluster which initiate the bootstrap and
replace operations on the new node run inside the main scheduling group.
We should run them inside the maintenance scheduling group to reduce the
impact on the user workload.

This patch fixes a scheduling group leak for bootstrap and replace operation.

Before:
[shard 0] storage_service - storage_service::bootstrap sg=main
[shard 0] repair - bootstrap_with_repair sg=main

After:
[shard 0] storage_service - storage_service::bootstrap sg=streaming
[shard 0] repair - bootstrap_with_repair sg=streaming

Fixes #8130

Closes #8131
2021-02-22 14:55:02 +02:00
Kamil Braun
d4937daaea cdc: introduce cdc::generation_service
This commit introduces a new service crafted to handle CDC generation
management: listening and reacting to generation changes in the cluster.

The implementation is a stub for now, the service reacts to generation
changes by simply logging the event.

The commit plugs the service in, initializing it in main and test code,
passing a reference to storage_service and having storage_service start
the service (using the `after_join` method): the service only starts
doing its job after the node joins the token ring (either on bootstrap
or restart).
2021-02-22 12:45:43 +01:00
Kamil Braun
8e72c33d7c main: move cdc_service initialization just prior to storage_service
initialization

As a preparation for introducing CDC generation management service.

cdc_service will depend on the generation service.
But the generation service needs some other services to work
properly. In particular, it uses the local database, so it should be
initialized after the local database.

The only service that will need the cdc generation service is
storage_service, so we can place the generation service initialization
code right before storage_service initialization code. So the order will
be cdc_generation_service -> cdc_service -> storage_service.
2021-02-22 12:43:10 +01:00
Kamil Braun
3d7b990300 system_distributed_keyspace: use mutation API to insert CDC streams
The `storage_proxy::mutate` low-level API is much more powerful than
the CQL API. This power is not needed for this commit but for the next.
2021-02-18 11:44:59 +01:00
Benny Halevy
50ca693a02 main: disable stall detector during startup
We see long reactor stalls from `logalloc::prime_segment_pool`
in debug mode yet the stall detector's purpose is to detect
reactor stalls during normal operation where they can increase
the latency of other queries running in parallel.

Since this change doesn't actually fix the stalls but rather
hides them, the following annotations will just refrence
the respective github issues rather than auto-close them.

Refs #7150
Refs #5192
Refs #5960

Restore blocked_reactor_notify_ms right before
starting storage_proxy.  Once storage_proxy is up, this node
affects cluster latency, and so stalls should be reported so
they can be fixed.

Test: secondary_index_test --blocked-reactor-notify-ms 1 (release)
DTest: CASSANDRA_DIR=../scylla/build/release SCYLLA_EXT_OPTS="--blocked-reactor-notify-ms 2" ./scripts/run_test.sh materialized_views_test:TestMaterializedViews.interrupt_build_process_with_resharding_half_to_max_test

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Message-Id: <20210216112052.27672-1-bhalevy@scylladb.com>
2021-02-16 13:28:31 +02:00
Pavel Solodovnikov
63cdf4694d raft: wire up schema Raft RPC handlers
This patch adds registration and de-registration of the
corresponding Raft RPC verbs handlers.

There is a new `raft_services` class that
is responsible for initializing the raft RPC verbs and
managing raft server instances.

The service inherits `seastar::peering_sharded_service<T>`,
because we need to route the request to the appropriate shard
which is handled by the `shard_for_group` function (currently
only handling schema raft group to land on shard 0, otherwise
throws an exception).

Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>
2021-02-16 13:08:59 +03:00
Avi Kivity
ca0c006b37 logalloc: background reclaim
Set up a coroutine in a new scheduling group to ensure there is
a "cushion" of free memory. It reclaims in preemptible mode in
order to reduce reactor stalls (constrast with synchronous reclaim
that cannot preempt until it achieved its goal).

The free memory target is arbitrarily set at 60MB. The reclaimer's
shares are proportional to the distance from the free memory target;
so a workload that allocates memory rapidly will have the background
reclaimer working harder.

I rolled my own condition variable here, mostly as an experiment.
seastar::condition_variable requires several allocations, while
the one here requires none. We should formalize it after we gain
more experience with it.
2021-02-14 19:09:29 +02:00
Piotr Sarna
1b8c946ad7 alternator: add handling max_concurrent_requests_per_shard
The config value is already used to set an upper limit of concurrent
CQL requests, and now it's also abided by alternator.
Excessive requests result in returning RequestLimitExceeded error
to the client.

Tests: manual
Running multiple concurrent requests via the test suite results in:
botocore.errorfactory.RequestLimitExceeded: An error occurred (RequestLimitExceeded) when calling the CreateTable operation: too many in-flight requests: 17
2021-02-04 17:23:41 +01:00
Pekka Enberg
6cc981d089 scylla: Add "--build-mode" command line option
This adds a "--build-mode" command line option to "scylla" executable:

$ ./build/dev/scylla --build-mode
dev

This allows you to discover the build mode of a "scylla" executable
without resorting to "readelf", for example, to verify that you are
looking at the correct executable while debugging packaging issues.

Closes #7865
2021-01-20 16:07:29 +02:00
Nadav Har'El
781f9d9aca alternator: make default timeout configurable
Whereas in CQL the client can pass a timeout parameter to the server, in
the DynamoDB API there is no such feature; The server needs to choose
reasonable timeouts for its own internal operations - e.g., writes to disk,
querying other replicas, etc.

Until now, Alternator had a fixed timeout of 10 seconds for its
requests. This choice was reasonable - it is much higher than we expect
during normal operations, and still lower than the client-side timeouts
that some DynamoDB libraries have (boto3 has a one-minute timeout).
However, there's nothing holy about this number of 10 seconds, some
installations might want to change this default.

So this patch adds a configuration option, "--alternator-timeout-in-ms",
to choose this timeout. As before, it defaults to 10 seconds (10,000ms).

In particular, some test runs are unusually slow - consider for example
testing a debug build (which is already very slow) in an extremely
over-comitted test host. In some cases (see issue #7706) we noticed
the 10 second timeout was not enough. So in this patch we increase the
default timeout chosen in the "test/alternator/run" script to 30 seconds.

Please note that as the code is structured today, this timeout only
applies to some operations, such as GetItem, UpdateItem or Scan, but
does not apply to CreateTable, for example. This is a pre-existing
issue that this patch does not change.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20201207122758.2570332-1-nyh@scylladb.com>
2020-12-09 14:30:43 +01:00
Avi Kivity
70689088fd Merge "Remove reference on database from global qctx" from Pavel E
"
The qctx is global object that references query processor and
database to let the rest of the code query system keyspace.

As the first step of de-globalizing it -- remove the database
reference from it. After the set the qctx remains a simple
wrapper over the query processor (which is already de-globalized)
and the query processor in turn is mostly needed only to parse
the query string into prepared statement only. This, in turn,
makes it possible to remove the qctx later by parsing the
query strings on boot and carrying _them_ around, not the qctx
itself.

tests: unit(dev), dtest(simple_cluster_driver_test:dev), manual start/stop
"

* 'br-remove-database-from-qctx' of https://github.com/xemul/scylla:
  query-context: Remove database from qctx
  schema-tables: Use query processor referece in save_system(_keyspace)?_schema
  system-keyspace: Rewrite force_blocking_flush
  system-keyspace: Use cluster_name string in check_health
  system-keyspace: Use db::config in setup_version
  query-context: Kill global helpers
  test: Use cql_test_env::evecute_cql instead of qctx version
  code: Use qctx::evecute_cql methods, not global ones
  system-keyspace: Do not call minimal_setup for the 2nd time
  system-keyspace: Fix indentation after previous patch
  system-keyspace: Do not do invoke_on_all by hands
  system-keyspace: Remove dead code
2020-11-19 18:31:51 +02:00
Pavel Emelyanov
689fd029a1 query-context: Remove database from qctx
No users of qctx::db are left.  One global database reference less.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2020-11-19 18:39:05 +03:00
Piotr Dulikowski
0fd36e2579 api: allow changing hinted handoff configuration
This commit makes it possible to change hints manager's configuration at
runtime through HTTP API.

To preserve backwards compatibility, we keep the old behavior of not
creating and checking hints directories if they are not enabled at
startup. Instead, hint directories are lazily initialized when hints are
enabled for the first time through HTTP API.
2020-11-17 10:24:43 +01:00
Piotr Dulikowski
cefe5214ff config: plug in hints::host_filter object into configuration
Uses db::hints::host_filter as the type of hinted_handoff_enabled
configuration option.

Previously, hinted_handoff_enabled used to be a string option, and it
was parsed later in a separate function during startup. The function
returned a std::optional<std::unordered_set<sstring>>, whose meaning in
the context of hints is rather enigmatic for an observer not familiar
with hints.

Now, hinted_handoff_enabled has type of db::hints::host_filter, and it
is plugged into the config parsing framework, so there is no need for
later post-processing.
2020-11-17 10:24:42 +01:00
Piotr Dulikowski
40710677d0 hints: introduce db::hints::directory_initializer
Introduces a db::hints::directory_initializer object, which encapsulates
the logic of initializing directories for hints (creating/validating
directories, segment rebalancing). It will be useful for lazy
initialization of hints manager.
2020-11-17 10:15:47 +01:00
Piotr Dulikowski
81a568c57a directories.cc: prepare for use outside main.cc
Currently, the `directories` class is used exclusively during
initialization, in the main() function. This commit refactors this class
so that it is possible to use it to initialize directories much later
after startup.

The intent of this change is to make it possible for hints manager to
create directories for hints lazily. Currently, when Scylla is booted
with hinted handoff disabled, the `hints_directory` config parameter is
ignored and directories for hints are neither created nor verified.
Because we would like to preserve this behavior and introduce
possibility to switch hinted handoff on in runtime, the hints
directories will have to be created lazily the first time hinted handoff
is enabled.
2020-11-17 10:15:47 +01:00
Piotr Dulikowski
5b12375842 main.cc: wait for hints manager to start
In main.cc, we spawn a future which starts the hints manager, but we
don't wait for it to complete. This can have the following consequences:

- The hints manager does some asynchronous operations during startup,
  so it can take some time to start. If it is started after we start
  handling requests, and we admit some requests which would result in
  hints being generated, those hints will be dropped instead because we
  check if hints manager is started before writing them.
- Initialization of hints manager may fail, and Scylla won't be stopped
  because of it (e.g. we don't have permissions to create hints
  directories). The consequence of this is that hints manager won't be
  started, and hints will be dropped instead of being written. This may
  affect both regular hints manager, and the view hints manager.

This commit causes us to wait until hints manager start and see if there
were any errors during initialization.

Fixes #7598

Closes #7599
2020-11-12 14:17:10 +02:00
Benny Halevy
29ed59f8c4 main: start a shared_token_metadata
And use it to get a token_metadata& compatible
with current usage, until the services are converted to
use token_metadata_ptr.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2020-11-11 14:20:23 +02:00
Pavel Emelyanov
d045df773f code: RIP global query processor instance
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2020-10-31 18:51:52 +03:00
Pavel Emelyanov
8989021dc3 system distributed keyspace: Start sharded service erarlier
The constructors just set up the references, real start happens in .start()
so it is safe to do this early. This helps not carrying migration manager
and query processor down the storage service cluster joining code.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2020-10-31 18:51:52 +03:00