Commit Graph

63 Commits

Author SHA1 Message Date
Kefu Chai
9e8805bb49 repair, transport: s/get0()/get()/
`future::get0()` was deprecated in favor of `future::get()`. so
let's use the latter instead. this change silences a `-Wdeprecated`
warning.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#18357
2024-04-23 15:48:54 +03:00
Mikołaj Grzebieluch
a0915115c3 maintenance_socket: change log message to differentiate from regular CQL ports
Scylla-ccm uses function `wait_for_binary_interface` that waits for
scylla logs to print "Starting listening for CQL clients". If this log
is printed far before the regular cql_controller is initialized,
scylla-ccm assumes too early that node is initialized.
It can result in timeouts that throw errors, for example in the function
`watch_rest_for_alive`.

Closes scylladb/scylladb#17496
2024-03-08 10:08:09 +01:00
Kefu Chai
19e02de1aa transport/controller: remove unused struct definition
the removed struct definition is not used, so drop it.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#17537
2024-03-06 10:17:08 +02:00
Avi Kivity
93af3dd69b Merge 'Maintenance socket: set filesystem permissions to 660' from Mikołaj Grzebieluch
Set filesystem permissions for the maintenance socket to 660 (previously it was 755) to allow a scyllaadm's group to connect.
Split the logic of creating sockets into two separate functions, one for each case: when it is a regular cql controller or used by maintenance_socket.

Fixes https://github.com/scylladb/scylladb/issues/16487.

Closes scylladb/scylladb#17113

* github.com:scylladb/scylladb:
  maintenance_socket: add option to set owning group
  transport/controller: get rid of magic number for socket path's maximal length
  transport/controller: set unix_domain_socket_permissions for maintenance_socket
  transport/controller: pass unix_domain_socket_permissions to generic_server::listen
  transport/controller: split configuring sockets into separate functions
2024-02-20 15:09:54 +02:00
Mikołaj Grzebieluch
182cfebe40 maintenance_socket: add option to set owning group
Option `maintenance-socket-group` sets the owning group of the maintenance socket.
If not set, the group will be the same as the user running the scylla node.
2024-02-19 10:21:00 +01:00
Benny Halevy
ac83df4875 transport: controller: do_start_server: do not set_cql_read for maintenance port
RPC is not ready yet at this point, so we should not
set this application state yet.

This is indicated by the following warning from
`gossiper::add_local_application_state`:
```
WARN  2024-01-22 23:40:53,978 [shard 0:stmt] gossip - Fail to apply application_state: std::runtime_error (endpoint_state_map does not contain endpoint = 127.227.191.13, application_states = {{RPC_READY -> Value(1,1)}})
```

That should really be an internal error, but
it can't because of this bug.

Fixes #16932

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2024-02-11 11:49:52 +02:00
Mikołaj Grzebieluch
38191144ac transport/controller: get rid of magic number for socket path's maximal length
Calculate `max_socket_length` from the size of the structure
representing the Unix domain socket address.
2024-02-09 12:32:37 +01:00
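As a rough illustration of the idea in this commit, the limit can be derived from `sockaddr_un` instead of hard-coding 107. This is a minimal sketch, not the code from the commit: the helper `check_socket_path` is hypothetical, and only the name `max_socket_length` is taken from the message above.

```cpp
#include <sys/un.h>

#include <cstddef>
#include <stdexcept>
#include <string>

// Longest path that fits in sockaddr_un::sun_path while leaving one byte for
// the terminating NUL. On Linux sun_path is 108 bytes, which yields 107.
constexpr std::size_t max_socket_length = sizeof(sockaddr_un::sun_path) - 1;

// Hypothetical helper: reject paths that cannot be bound as a Unix domain socket.
inline void check_socket_path(const std::string& path) {
    if (path.size() > max_socket_length) {
        throw std::runtime_error("socket path is too long: " + path);
    }
}
```

Deriving the constant from the structure keeps the check correct on platforms where `sun_path` has a different size.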
Mikołaj Grzebieluch
fffb732704 transport/controller: set unix_domain_socket_permissions for maintenance_socket
Set filesystem permissions for the maintenance socket to 660.

Fixes #16487
2024-02-09 12:32:26 +01:00
Mikołaj Grzebieluch
4cecda7ead transport/controller: pass unix_domain_socket_permissions to generic_server::listen 2024-02-05 14:22:03 +01:00
Mikołaj Grzebieluch
6b178f9a4a transport/controller: split configuring sockets into separate functions
TCP sockets and unix domain sockets don't share common listen options
excluding `socket_address`. For unix domain sockets, available options will be
expanded to cover also filesystem permissions and owner for the socket.
Storing listen options for both types of sockets in one structure would become messy.
For now, both use `listen_cfg`.

In a singular cql controller, only sockets of one type are created, thus it
can be easily split into two cases.
Isolate maintenance socket from `listen_cfg`.
2024-02-05 14:20:17 +01:00
Avi Kivity
7cb1c10fed treewide: replace seastar::future::get0() with seastar::future::get()
get0() dates back to the days when Seastar futures carried tuples, and
get0() was a way to get the first (and usually only) element. Now
it's a distraction, and Seastar is likely to deprecate and remove it.

Replace with seastar::future::get(), which does the same thing.
2024-02-02 22:12:57 +08:00
Pavel Emelyanov
7c5c89ba8d Revert "Merge 'Use utils::directories instead of db::config to get dirs' from Patryk Wróbel"
This reverts commit 370fbd346c, reversing
changes made to 0912d2a2c6.

This makes scylla-manager mis-interpret the data_file_directories
somehow, issue #17078
2024-01-31 15:08:14 +03:00
Patryk Wrobel
0f3b00f9ad cql_transport/controler: use utils::directories to get paths of dirs
This change replaces usage of db::config with
usage of utils::directories to get paths of
directories in cql_transport/controler.

Refs: scylladb#5626

Signed-off-by: Patryk Wrobel <patryk.wrobel@scylladb.com>
2024-01-29 13:20:38 +01:00
Mikołaj Grzebieluch
8b2f0e38d9 service/maintenance_mode: move maintenance_socket_enabled definition to separate file 2024-01-25 15:27:53 +01:00
Mikołaj Grzebieluch
2b9a88d17a cql_controller: maintenance socket: fix indentation 2023-12-18 17:58:13 +01:00
Mikołaj Grzebieluch
ac61d0f695 cql_controller: add option to start maintenance socket
Add an option to listen on the maintenance socket. It is set up on a Unix domain socket
and the metrics are disabled.
This enables having an independent authentication mechanism for this socket.

To start the maintenance socket, a new cql_controller has to be created
with
`db::maintenance_socket_enabled::yes` argument.

Creating maintenance socket will raise an exception if
* the path is longer than 107 chars (due to linux limits),
* a file or a directory already exists in the path.

The indentation is fixed in the next commit.
2023-12-18 17:58:13 +01:00
Pavel Emelyanov
b42391bfbe transport: Shutdown server on disablebinary
... and do the real "sharded::stop" in the background. On node shutdown
it needs to pick up all dangling background stopping.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-09-11 17:37:48 +03:00
Pavel Emelyanov
bc2d44994a transport/controller: Coroutinize do_stop_server()
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-09-11 17:32:07 +03:00
Pavel Emelyanov
7701aa0789 transport/controller: Coroutinize stop_server()
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-09-11 17:32:07 +03:00
Avi Kivity
26c8470f65 treewide: use #include <seastar/...> for seastar headers
We treat Seastar as an external library, so fix the few places
that didn't do so to use angle brackets.

Closes #14037
2023-06-06 08:36:09 +03:00
Kefu Chai
ebf5e138e8 redis,thrift,transport: make timeout_config live-updateable
* timeout_config
  - add `updated_timeout_config`, which represents a set of always-updated
    options backed by `utils::updateable_value<>`. this class is
    used by servers which need to access the latest timeout-related
    options. the existing `timeout_config` is more like a snapshot
    of the `updated_timeout_config`. it is used in cases where
    we don't need the most up-to-date options or where we update the
    options manually on demand.
* redis, thrift, transport: s/timeout_config/updated_timeout_config/
  when appropriate. use the improved version of timeout_config where
  we need to have the access to the most-updated version of the timeout
  options.

Fixes #10172
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2023-03-29 20:17:45 +08:00
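The difference between a live view and a snapshot can be sketched as follows. Everything here is a simplified, single-threaded stand-in invented for illustration: `live_value`, `live_timeouts` and `timeout_snapshot` are hypothetical names, and the real `utils::updateable_value<>` works differently, for example in how updates reach other shards.

```cpp
#include <chrono>
#include <memory>
#include <utility>

// Hypothetical stand-in for an updateable value: copies share one storage
// cell, so every holder observes the most recent set().
template <typename T>
class live_value {
    std::shared_ptr<T> _v;
public:
    explicit live_value(T v) : _v(std::make_shared<T>(std::move(v))) {}
    void set(T v) { *_v = std::move(v); }
    const T& get() const { return *_v; }
};

// Plain copy of the options at one point in time.
struct timeout_snapshot {
    std::chrono::milliseconds read_timeout;
};

// Live options: reading through this always yields the current value.
struct live_timeouts {
    live_value<std::chrono::milliseconds> read_timeout;

    // Take a snapshot that later updates will not affect.
    timeout_snapshot snapshot() const { return {read_timeout.get()}; }
};
```

A server holding a `live_timeouts` copy sees later updates, while code that took a `timeout_snapshot` keeps the values from the moment it was taken.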
Kefu Chai
e0ac2eb770 redis,thrift,transport: pass config via sharded_parameter
* pass config via sharded_parameter
* initialize config using designated initializer

this change paves the road to servers with live-updateable timeout
options.

before this change, the servers initialize a domain specific combo
config, like `redis_server_config`, with the same instance of a
timeout_config, and pass the combo config as a ctor parameter to
construct each sharded service instance. but this design assumes
the value semantic of the config class, say, it should be copyable.
but if we want to use utils::updateable_value<> to get updated
option values, we would have to postpone the instantiation of the
config until the sharded service is about to be initialized.

so, in this change, instead of taking a domain specific config created
before hand, all services constructed with a `timeout_config` will
take a `sharded_parameter()` for creating the config. also, take
this opportunity to initialize the config using designated initializer.
for two reasons:

* less repetition this way. we don't have to repeat the variable
  name of the config being initialized for each member variable.
* prepare for some member variables which do not have a default
  constructor. this applies to the timeout_config's updater which
  will not have a default constructor, as it should be initialized
  by db::config and a reference to the timeout_config to be updated.

we will update the `timeout_config` side in a follow-up commit.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2023-03-29 20:06:00 +08:00
Vlad Zolotarov
f94bbc5b34 transport: add per-scheduling-group CQL opcode-specific metrics
This patch extends a previous patch that added these metrics globally:
 - cql_requests_count
 - cql_request_bytes
 - cql_response_bytes

This patch adds a "scheduling_group_name" label to these metrics and changes corresponding
counters to be accounted on a per-scheduling-group level.

As a bonus this patch also marks all 3 metrics as 'skip_when_empty'.

Ref #13061

Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>
Message-Id: <20230321201412.3004845-1-vladz@scylladb.com>
2023-03-22 13:27:48 +02:00
Pavel Emelyanov
7bc697ec99 protocol_server: Add get_client_data call
The call returns a chunked_vector with client_data's. For now
only the native transport implements it, others return empty
vector.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-02-18 14:25:08 +03:00
Avi Kivity
fcb8d040e8 treewide: use Software Package Data Exchange (SPDX) license identifiers
Instead of lengthy blurbs, switch to single-line, machine-readable
standardized (https://spdx.dev) license identifiers. The Linux kernel
switched long ago, so there is strong precedent.

Three cases are handled: AGPL-only, Apache-only, and dual licensed.
For the latter case, I chose (AGPL-3.0-or-later and Apache-2.0),
reasoning that our changes are extensive enough to apply our license.

The changes were applied mechanically with a script, except to
licenses/README.md.

Closes #9937
2022-01-18 12:15:18 +01:00
Pavel Emelyanov
43951318c8 transport: Keep gossiper on server
The gossiper is needed by the transport::event_notifier. There's
already gossiper reference on the transport controller, but it's
a local reference, because controller doesn't need more. This
patch upgrades controller reference to sharded<> and propagates
it further up to the server.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-11-25 10:54:45 +03:00
Botond Dénes
a51529dd15 protocol_servers: strengthen guarantees of listen_addresses()
In early versions of the series which proposed protocol servers, the
interface had two methods answering pretty much the same question of
whether the server is running or not:
* listen_addresses(): empty list -> server not running
* is_server_running()

To reduce redundancy and to avoid possible inconsistencies between the
two methods, `is_server_running()` was scrapped, but re-added by a
follow-up patch because `listen_addresses()` proved to be unreliable as
a source for whether the server is running or not.
This patch restores the previous state of having only
`listen_addresses()` with two additional changes:
* rephrase the comment on `listen_addresses()` to make it clear that
  implementations must return empty list when the server is not running;
* those implementations that have a reliable source of whether the
  server is running or not, use it to force-return an empty list when
  the server is not running

Tests: dtest(nodetool_additional_test.py)
Signed-off-by: Botond Dénes <bdenes@scylladb.com>
Message-Id: <20211117062539.16932-1-bdenes@scylladb.com>
2021-11-19 11:09:09 +03:00
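The contract described in this commit can be sketched with a toy server. This is a hedged illustration only: `sketch_server` and `is_running` are hypothetical names, and the real `protocol_server` interface is not reproduced here.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical server that remembers its addresses even after stopping,
// which is the situation that made the raw address list unreliable.
class sketch_server {
    bool _running = false;
    std::vector<std::string> _addresses;
public:
    void start(std::vector<std::string> addrs) {
        _addresses = std::move(addrs);
        _running = true;
    }
    void stop() { _running = false; }  // deliberately leaves _addresses populated

    // Contract: must return an empty list whenever the server is not running.
    std::vector<std::string> listen_addresses() const {
        return _running ? _addresses : std::vector<std::string>{};
    }
};

// Generic "is it running?" check that relies only on listen_addresses().
inline bool is_running(const sketch_server& s) {
    return !s.listen_addresses().empty();
}
```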
Benny Halevy
9d4262e264 protocol_server: add per-protocol is_server_running method
Change b0a2a9771f broke
the generic api implementation of
is_native_transport_running that relied on
the addresses list being empty after the server is stopped.

To fix that, this change introduces a pure virtual method:
protocol_server::is_server_running that can be implemented
by each derived class.

Test: unit(dev)
DTest: nodetool_additional_test.py:TestNodetool.binary_test

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Message-Id: <20211114135248.588798-1-bhalevy@scylladb.com>
2021-11-14 16:01:31 +02:00
Avi Kivity
b0a2a9771f Merge "Sanitize hostnames resolving on start" from Pavel E
"
On start scylla resolves several hostnames into addresses. Different
places use different hostname selection logic, e.g. the API address
can be the listen one if the dedicated option is not set. Failure to
resolve a hostname is reported with an exception that (sometimes)
contains the hostname, but it doesn't look very convenient -- better
to know the config option name. Also resolving of different hostnames
has different decoration around, e.g. prometheus carries a main-local
lambda just to nicely wrap the try/catch block.

This set unifies this zoo and makes main() shorter and less hairy:

1. All failures to resolve a hostname are reported with an
   exception containing the relevant config option

2. The || operator for named_value's is introduced to make
   the option selection look as short as

     resolve(cfg->some_address() || cfg->another_address())

3. All sanity checks are explicit and happen early in main

4. No dangling local variables carrying the cfg->...() value

5. Use resolved IP when logging a "... is listening on ..."
   message after a service start

tests: unit(dev)
"

* 'br-ip-resolve-on-start' of https://github.com/xemul/scylla:
  main: Move fb-utilities initialization up the main
  code: Use utils::resolve instead of inet_address::lookup
  main: Remove unused variable
  main: Sanitize resolving of listen address
  main: Sanitize resolving of broadcast address
  main: Sanitize resolving of broadcast RPC address
  main: Sanitize resolving of API address
  main: Sanitize resolving of prometheus address
  utils: Introduce || operator for named_values
  db.config: Verbose address resolver helper
  main: Remove api-port and prometheus-port variables
  alternator: Resolve address with the help of inet_address
  redis, thrift: Remove unused captures
2021-11-09 09:15:40 +02:00
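Points 1 and 2 of this series can be illustrated with a small sketch. It is not the code from the series: `named_option` is a hypothetical, much simplified stand-in for a config `named_value`, and this `resolve` only pretends to resolve by treating an empty value as a failure.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical, simplified config option: a name, a value, and whether the
// user set it explicitly.
struct named_option {
    std::string name;
    std::string value;
    bool is_set = false;
};

// Pick the left option if it was set, otherwise fall back to the right one.
// Note that an overloaded || evaluates both operands (no short-circuit).
inline const named_option& operator||(const named_option& a,
                                      const named_option& b) {
    return a.is_set ? a : b;
}

// Stand-in resolver: on failure, report the config option name rather than
// only the hostname.
inline std::string resolve(const named_option& opt) {
    if (opt.value.empty()) {
        throw std::runtime_error("Couldn't resolve " + opt.name);
    }
    return opt.value;
}
```

With these definitions, `resolve(api_address || listen_address)` reads the way the cover letter describes, and a failure names the option that was actually selected.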
Pavel Emelyanov
2f9c21644b code: Use utils::resolve instead of inet_address::lookup
There are some users of the latter call left. They all suffer
from the same problem -- the lack of verbosity on resolving
errors.

While at it also get rid of useless local variables that are
only there to carry the cfg->...() option over.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-11-08 17:33:27 +03:00
Botond Dénes
134fa98ff4 transport: controller: implement the protocol_server interface 2021-11-05 15:42:41 +02:00
Pavel Emelyanov
e02b39ca3d code: Generalize tls::credentials_builder configuration
All the places in code that configure the mentioned creds builder
from client_|server_encryption_options now do it the same way.
This patch generalizes it all in the utils:: helper.

The alternator code "ignores" require_client_auth and truststore
keys, but it's easy to make the generalized helper be compatible.

Also make the new helper coroutinized from the beginning.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-08-20 18:05:41 +03:00
Pavel Emelyanov
35209e7500 transport, redis: Do not assume fixed encryption options
On start main() brushes up the client_encryption_options option
so that any user of it sees it in some "clean" state and can
avoid using get_or_default() to parse.

This patch removes this assumption (and the cleaning code itself).
Next patch will make use of it and relax the duplicated parsing
complexity back.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-08-20 17:59:33 +03:00
Pavel Emelyanov
b1bb00a95c transport.controller: Brushup cql_server declarations
The controller code sits in the cql_transport namespace and
can omit its mentionings. Also the seastar::distributed<>
is replaced with modern seastar::sharded<> while at it.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-07-22 18:50:57 +03:00
Pavel Emelyanov
65b1bb8302 transport: Use local notifier to (un)subscribe server
Now the controller has the lifecycle notifier reference and
can stop using storage service to manage the subscription.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-07-22 18:48:58 +03:00
Pavel Emelyanov
5f99eeb35e transport: Keep lifecycle notifier sharded reference
It's needed to (un)subscribe server on it (next patch).

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-07-22 18:48:20 +03:00
Pavel Emelyanov
c7b0b25494 transport, generic_server: Remove no longer used functionality
After subscription management was moved onto controller level
a bunch of code can be dropped:

- passing migration notifier beyond controller
- event_notifier's _stopped bit
- event_notifier .stop() method
- event_notifier empty constructor and destructor
- generic_server's on_stop virtual method

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-07-22 18:41:32 +03:00
Pavel Emelyanov
1acef41626 transport: (Un)Subscribe cql_server::event_notifier from controller
There's a migration notifier that's carried through cql_server
_just_ to let event-notifier (un)subscribe on it. Also there's
a call for global storage-service in there which will need to
be replaced with yet another pass-through argument which is not
great.

It's easier to establish this subscription outside of cql_server
like it's currently done for proxy and sl-manager. In case of
cql_server the "outside" is the controller.

This patch just moves the subscription management from cql_server
to controller, next two patches will make more use of this change.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-07-22 18:37:23 +03:00
Pavel Emelyanov
990db016e9 transport: Untie transport and database
Both controller and server only need database to get config from.
Since controller creation only happens in main() code which has the
config itself, we may remove database mentioning from transport.

Previous attempt was not to carry the config down to the server
level, but it stepped on an updateable_value landmine -- the u._v.
isn't copyable cross-shard (despite the docs) and to properly
initialize server's max_concurrent_requests we need the config's
named_value member itself.

The db::config that flies through the stack is const reference, but
its named_values do not get copied along the way -- the updateable
value accepts both references and const references to subscribe on.

tests: start-stop in debug mode

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Message-Id: <20210607135656.18522-1-xemul@scylladb.com>
2021-06-09 20:04:12 +03:00
Avi Kivity
e6c5a63581 Merge "Fix several issues on transport stop" from Pavel E
"
There's a bunch of issues with starting and stopping of cql_server with
the help of cql_controller.

fixes: #8796
tests: manual(start + stop,
              start + exception on cql_set_state()
	     )
       unit not run, they don't mess with transport controller
"

* 'br-transport-stop-fixes' of https://github.com/xemul/scylla:
  transport/controller: Stop server on state change failure too
  transport/controller: Rollback server start on state change failure too
  transport/controller: Do not leave _server uninitialized
  transport/controller: Rework try-catch into defers
2021-06-07 11:41:36 +03:00
Avi Kivity
a55b434a2b treewide: extend copyright statements to present day 2021-06-06 19:18:49 +03:00
Pavel Emelyanov
76947c829e transport/controller: Stop server on state change failure too
If on stop the set_cql_state() throws the local sharded<cql_server>
will be left not stopped and will fail the respective assertion on
its destruction.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-06-04 16:53:21 +03:00
Pavel Emelyanov
f6ef148c76 transport/controller: Rollback server start on state change failure too
If set_cql_state() throws the cserver remains started. If this
happens on start before the controller stop defer action is
scheduled the destruction of controller will fail on assertion
that checks the _server must be stopped.

Effectively this is the fix of #8796

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-06-04 16:50:51 +03:00
Pavel Emelyanov
6995e41e64 transport/controller: Do not leave _server uninitialized
If an exception happens after sharded<cql_server>.start() the
controller's _server pointer is left pointing to stopped sharded
server. This makes it impossible to start the server again (via
API) since the check for if (_server) will always be true.

This is the continuation of the ae4d5a60 fix.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-06-04 16:48:26 +03:00
Pavel Emelyanov
12220b74e8 transport/controller: Rework try-catch into defers
This is to make further patching simpler.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-06-04 16:48:12 +03:00
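The try-catch to defer rework follows a common scope-guard pattern, sketched below under stated assumptions. `deferred_action`, `make_deferred`, `fake_server` and `start_server` are hypothetical names written for this illustration. They are modeled on the general idea of a cancellable deferred action and are not taken from the controller code.

```cpp
#include <stdexcept>
#include <utility>

// Minimal scope guard: runs the stored action on destruction unless cancelled.
template <typename Func>
class deferred_action {
    Func _func;
    bool _armed = true;
public:
    explicit deferred_action(Func f) : _func(std::move(f)) {}
    deferred_action(deferred_action&& o) noexcept
        : _func(std::move(o._func)), _armed(std::exchange(o._armed, false)) {}
    deferred_action(const deferred_action&) = delete;
    deferred_action& operator=(const deferred_action&) = delete;
    ~deferred_action() { if (_armed) { _func(); } }
    void cancel() { _armed = false; }
};

template <typename Func>
deferred_action<Func> make_deferred(Func f) {
    return deferred_action<Func>(std::move(f));
}

// Hypothetical server whose start must be rolled back if a later step throws.
struct fake_server { bool started = false; };

inline void start_server(fake_server& s, bool state_change_fails) {
    s.started = true;
    // Armed rollback: undoes the start if anything below throws.
    auto rollback = make_deferred([&s] { s.started = false; });
    if (state_change_fails) {
        throw std::runtime_error("state change failed");
    }
    rollback.cancel();  // success: keep the server running
}
```

Compared with nested try-catch blocks, each later step that can fail only needs to sit between the guard and the `cancel()` call, which makes follow-up patches smaller.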
Piotr Sarna
26ee6aa1e9 transport: initialize query state with service level controller
Query state should be aware of the service level controller in order
to properly serve service-level-related CQL queries.
2021-04-12 16:31:27 +02:00
Pavel Emelyanov
dcdd207349 storage_service: Drop memory limiter
Nobody uses it now.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-03-18 11:28:45 +01:00
Pavel Emelyanov
f0a79574d4 memory_limiter: Use main-local instance everywhere
The cql_server and alternator both need the limiter, so
patch them to stop using storage service's one and use
the main-local one.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-03-18 11:28:45 +01:00
Pavel Emelyanov
359e9caf54 main: Have local memory limiter and carry where needed
Prepare memory limiters to have non-global instance of
the service. For now the main-local instance is not
used and (!) is not stopped for real, just like the
storage_service's one is.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-03-18 11:28:45 +01:00
Pavel Emelyanov
c2f94fb527 cql_server: Remove semaphore getter fn from config
The cql_server() needs to get the memory limiter semaphore
from local storage service instance. To make this happen
a callback is introduced on the config structure. The same
can be achieved in a simpler manner -- by providing the
local storage service instances directly.

Actually, the storage service will be removed in further
patches from this place, so this patch is mostly to get
rid of the callback from the config.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-03-18 11:28:45 +01:00