If we call server::stop() right after "server" construction, it hangs:
With the server never listening (never accepting connections and never
serving connections), nothing ever calls server::maybe_stop().
Consequently,
co_await _all_connections_stopped.get_future();
at the end of server::stop() deadlocks.
Such a server::stop() call does occur in controller::do_start_server()
[transport/controller.cc], when
- cserver->start() (sharded<cql_server>::start()) constructs a
"server"-derived object,
- start_listening_on_tcp_sockets() throws an exception before reaching
listen_on_all_shards() (for example because it fails to set up client
encryption -- certificate file is inaccessible etc.),
- the "deferred_action"
cserver->stop().get();
is invoked during cleanup.
(The cserver->stop() call exposing the connection tracking problem dates
back to commit ae4d5a60ca ("transport::controller: Shut down distributed
object on startup exception", 2020-11-25), and it's been triggerable
through the above code path since commit 6b178f9a4a
("transport/controller: split configuring sockets into separate
functions", 2024-02-05).)
Tracking live connections and connection acceptances seems like a good fit
for "seastar::gate", so rewrite the tracking with that. "seastar::gate"
can be closed (and the returned future can be waited for) without anyone
ever having entered the gate.
NOTE: this change makes it quite clear that neither server::stop() nor
server::shutdown() must be called multiple times. The permitted sequences
are:
- server::shutdown() + server::stop()
- or just server::stop().
Fixes#10305
Signed-off-by: Laszlo Ersek <laszlo.ersek@scylladb.com>
In the following patch, we will add a method to update service levels
parameters for each cql connections.
To support this, this patch allows to pass async function as a parameter
to `for_each_gently()` method.
Make `generic_server::gentle_iterator` a mutable iterator to allow
`for_each_gently` to make changes to the connections.
Fixes: #16035Closesscylladb/scylladb#16036
The method waits for listening sockets to stop listening and aborts the
connected sockets, but doesn't wait for the established connections to
finish processing.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
The _stopped future resolves when all "sockets" stop -- listening and
connected ones. Furure patching will need to wait for listening sockets
to stop separately from connected ones.
Rename the `_stopped` to reflect what it is now while at it.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Add missing include of "<list>" which caused compile errors on GCC:
In file included from generic_server.cc:9:
generic_server.hh:91:10: error: ‘list’ in namespace ‘std’ does not name a template type
91 | std::list<gentle_iterator> _gentle_iterators;
| ^~~~
generic_server.hh:19:1: note: ‘std::list’ is defined in header ‘<list>’; did you forget to ‘#include <list>’?
18 | #include <seastar/net/tls.hh>
+++ |+#include <list>
19 |
Note that there are some GCC compilation problems still left apart from
this one.
Closes#10328
Add the ability to iterate over the list of connections in a "gentle"
manner, i.e. -- preempting the loop when required.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Instead of lengthy blurbs, switch to single-line, machine-readable
standardized (https://spdx.dev) license identifiers. The Linux kernel
switched long ago, so there is strong precedent.
Three cases are handled: AGPL-only, Apache-only, and dual licensed.
For the latter case, I chose (AGPL-3.0-or-later and Apache-2.0),
reasoning that our changes are extensive enough to apply our license.
The changes we applied mechanically with a script, except to
licenses/README.md.
Closes#9937
After subscription management was moved onto controller level
a bunch of code can be dropped:
- passing migration notifier beyond controller
- event_notifier's _stopped bit
- event_notifier .stop() method
- event_notifier empty constructor and destrictor
- generic_server's on_stop virtual method
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
The cql_server and redis_server share the same ancestor of do_accepts().
Let's pull up the cql_server version of do_accept() (that has more
functionality) to generic_server::server and use it in the redis_server
too.
Pull up the cql_server process() to base class and convert redis_server
to use it.
Please note that this fixes EPIPE and connection reset issue in the
Redis server, which was fixed in the CQL server in commit 1a8630e6a
("transport: silence "broken pipe" and "connection reset by peer"
errors").
The cql_server and redis_server both have the same "_stopped" and
"_connections_list" member variables. Pull them up to the
generic_server::server base class.
The cql_server and redis_server classes have a maybe_idle() method,
which sets the _all_connections_stopped promise if server wants to stop
and can be stopped. Pull up the duplicated code to
generic_server::server class.
Both cql_server::connection and redis_server::connection inherit
boost::intrusive::list_base_hook<>, so let's pull up that to the
generic_server::connection class that both inherit.
This patch moves the duplicated connection::shutdown() method to to a
new generic_server::connection base class that is now inherited by
cql_server and redis_server.