496 Commits

Author SHA1 Message Date
Avi Kivity
987e6533d2 transport: return correct error codes when downgrading v4 {WRITE,READ}_FAILURE to {WRITE,READ}_TIMEOUT
Protocol v4 added WRITE_FAILURE and READ_FAILURE. When running under v3
we downgrade these exceptions to WRITE_TIMEOUT and READ_TIMEOUT (since
the client won't understand the v4 errors), but we still send the new
error codes. This causes the client to become confused.

Fix by updating the error codes.

A better fix is to move the error code from the constructor parameter
list and hard-code it in the constructor, but that is left for a follow-up
after this minimal fix.

Fixes #5610.

Closes #10362
2022-04-12 19:19:52 +03:00
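The downgrade described in 987e6533d2 can be sketched as a small mapping function. This is only an illustration: the numeric values are the error codes listed in the CQL native protocol specification, while `wire_error_code` and its signature are hypothetical and are not taken from the patch.

```cpp
#include <cassert>
#include <cstdint>

// Error codes as listed in the CQL native protocol specification.
enum class error_code : std::int32_t {
    write_timeout = 0x1100,
    read_timeout  = 0x1200,
    read_failure  = 0x1300,
    write_failure = 0x1500,
};

// Hypothetical helper: choose the code to put on the wire for the negotiated
// protocol version. The v4-only failure codes are mapped to the timeout codes
// that a v3 client understands; everything else passes through unchanged.
constexpr error_code wire_error_code(error_code code, int protocol_version) {
    if (protocol_version >= 4) {
        return code;
    }
    switch (code) {
    case error_code::read_failure:  return error_code::read_timeout;
    case error_code::write_failure: return error_code::write_timeout;
    default:                        return code;
    }
}
```

The point of the fix is that the exception type and the code sent to the client must change together; a single mapping like this keeps them from drifting apart.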
Nadav Har'El
fa7a302130 cross-tree: split coordinator_result from exceptions.hh
Recently, coordinator_result was introduced as an alternative for
exceptions. It was placed in the main "exceptions/exceptions.hh" header,
which virtually every single source file in Scylla includes.
But unfortunately, it brings in some heavy header files and templates,
leading to a lot of wasted build time - ClangBuildAnalyzer measured that
we include exceptions.hh in 323 source files, taking almost two seconds
each on average.

In this patch, we split the coordinator_result feature into a separate
header file, "exceptions/coordinator_result.hh", and only the few places
which need it include the header file. Unfortunately, some of these
few places are themselves headers, so the new header file ends up being
included in 100 source files - but 100 is still much less than 323 and
perhaps we can reduce this number later.

After this patch, the total Scylla object-file size is reduced by 6.5%
(the object size is a proxy for build time, which I didn't directly
measure). ClangBuildAnalyzer reports that now each of the 323 includes
of exceptions.hh only takes 80ms, coordinator_result.hh is only included
100 times, and virtually all the cost to include it comes from Boost's
result.hh (400ms per inclusion).

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20220228204323.1427012-1-nyh@scylladb.com>
2022-03-02 10:12:57 +02:00
Pavel Emelyanov
de6c60c1c9 client_data: Sanitize connection_notifier
Now the connection_notifier is all gone, only the client_data bits are left.
To keep it consistent -- rename the files.

Also, while at it, brush up the header dependencies and remove the not
really used constexprs for client states.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-02-18 15:02:26 +03:00
Pavel Emelyanov
d63ba87266 transport: Indentation fix after previous patch
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-02-18 15:02:26 +03:00
Pavel Emelyanov
971c431a23 code: Remove old on-disk version of system.clients table
This includes most of the connection_notifier stuff as well as
the auxiliary code from system_keyspace.cc and a bunch of
updating calls from the client state changing.

Other than less code and fewer disk updates on client connection
paths, this removes one usage of the nasty global qctx thing.

Since the system.clients goes away rename the system.clients_v
here too so the table is always present out there.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-02-18 15:02:26 +03:00
Pavel Emelyanov
7bc697ec99 protocol_server: Add get_client_data call
The call returns a chunked_vector with client_data's. For now
only the native transport implements it, others return an empty
vector.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-02-18 14:25:08 +03:00
Pavel Emelyanov
0046cdc6cb transport: Track client state for real
Right now when the client state changes the respective update is
performed on the system.clients table. While doing it some bits
from this state are lost from the in-memory structures. For the
sake of exporting this information we need to track whether the
connected client is still authenticating or is already ready.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-02-18 14:25:08 +03:00
Avi Kivity
7cc43f8aa8 Merge 'utils: add result_try and result_futurize_try' from Piotr Dulikowski
Adds `utils::result_try` and `utils::result_futurize_try`, functions which make it possible to convert existing try..catch blocks into a version which handles C++ exceptions, failed results with exception containers and, depending on the function variant, exceptional futures using the same exception handling logic.

For example, you can convert the following try..catch block:

    try {
        return a_function_that_may_throw();
    } catch (const my_exception& ex) {
        return 123;
    } catch (...) {
        throw;
    }

...to this:

    return utils::result_try([&] {
        return a_function_that_may_throw_or_return_a_failed_result();
    },  utils::result_catch<my_exception>([&] (const my_exception&) {
        return 123;
    }), utils::result_catch_dots([&] (auto&& handle) {
        return handle.into_result();
    }));

Similarly, `utils::result_futurize_try` can be used to migrate `then_wrapped` or `f.handle_exception()` constructs.

As an example of the usability of the new constructs, two places in the current code which need to simultaneously handle exceptions and failed results are converted to use `result_try` and `result_futurize_try`.

Results of `perf_simple_query --smp 1 --operations-per-shard 1000000 --write`:

```
127041.61 tps ( 67.2 allocs/op,  14.2 tasks/op,   52422 insns/op)
126958.60 tps ( 67.2 allocs/op,  14.2 tasks/op,   52409 insns/op)
127088.37 tps ( 67.2 allocs/op,  14.2 tasks/op,   52411 insns/op)
127560.84 tps ( 67.2 allocs/op,  14.2 tasks/op,   52424 insns/op)
127826.61 tps ( 67.2 allocs/op,  14.2 tasks/op,   52406 insns/op)

126801.02 tps ( 67.2 allocs/op,  14.2 tasks/op,   52420 insns/op)
125371.51 tps ( 67.2 allocs/op,  14.2 tasks/op,   52425 insns/op)
126498.51 tps ( 67.2 allocs/op,  14.2 tasks/op,   52427 insns/op)
126359.41 tps ( 67.2 allocs/op,  14.2 tasks/op,   52423 insns/op)
126298.27 tps ( 67.2 allocs/op,  14.2 tasks/op,   52410 insns/op)
```

The number of tasks and allocations is unchanged. The number of instructions per operation seems similar; it may have increased slightly (by 10-20), but it's hard to tell for sure because of the noisiness of the results.

Tests: unit(dev)

Closes #10045

* github.com:scylladb/scylla:
  transport: use result_try in process_request_one
  storage_proxy: use result_futurize_try in mutate_end
  storage_proxy: temporarily throw exception from result in mutate_end
  utils: add result_try and result_futurize_try
2022-02-13 19:38:13 +02:00
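To illustrate the idea behind `utils::result_try` from the merge above, here is a minimal standalone sketch. It is not the Scylla implementation: `result<T>`, `simple_result_try` and its signature are hypothetical, it supports a single handler only, and it rethrows internally to inspect the stored error, which keeps the sketch short but is not free. It assumes C++17.

```cpp
#include <cassert>
#include <exception>
#include <stdexcept>
#include <variant>

// Hypothetical stand-in for a result type: a value or a captured exception.
template <typename T>
using result = std::variant<T, std::exception_ptr>;

struct my_exception : std::runtime_error {
    using std::runtime_error::runtime_error;
};

// Runs `body`. A thrown exception and a failed result are funnelled into the
// same path; if the error is an `Ex`, `handler` produces the replacement
// value. Any other error is returned to the caller as a failed result.
template <typename Ex, typename T, typename Body, typename Handler>
result<T> simple_result_try(Body&& body, Handler&& handler) {
    result<T> r;
    try {
        r = body();
    } catch (...) {
        r = std::current_exception();
    }
    if (auto* error = std::get_if<std::exception_ptr>(&r)) {
        try {
            std::rethrow_exception(*error);
        } catch (const Ex& ex) {
            return handler(ex);
        } catch (...) {
            // Not an Ex: fall through and return the failed result unchanged.
        }
    }
    return r;
}
```

A call such as `simple_result_try<my_exception, int>(body, handler)` yields the handler's value whether `body` throws `my_exception` or returns it inside a failed result, which is the unification the merge describes.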
Piotr Dulikowski
049564bd2d transport: use result_try in process_request_one
Adapts the exception handling logic in process_request_one so that it
uses utils::result_try to handle both C++ exceptions and failed results
in a unified way.
2022-02-10 17:35:32 +01:00
Piotr Dulikowski
81968f2c3a transport/server: handle exceptions from coordinator_result without throwing
Instead of throwing the exception contained in failed `result<>`, it is
now inspected with a visitor which avoids the need for throwing.
2022-02-08 11:08:42 +01:00
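Inspecting an error with a visitor instead of throwing it, as 81968f2c3a describes, might look roughly like the following. The error structs and `to_error_code` are hypothetical stand-ins for whatever the failed `result<>` actually carries; the numeric codes are those from the CQL native protocol specification. The sketch assumes C++17.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <variant>

// Hypothetical error types; a real exception container would hold more.
struct read_timeout  { std::string table; };
struct write_timeout { std::string table; };
struct unavailable   { int alive; int required; };

using coordinator_error = std::variant<read_timeout, write_timeout, unavailable>;

// Helper that builds one visitor out of several lambdas.
template <typename... Fs> struct overloaded : Fs... { using Fs::operator()...; };
template <typename... Fs> overloaded(Fs...) -> overloaded<Fs...>;

// Maps an error held as a value to a CQL error code, with no throw/catch.
inline std::int32_t to_error_code(const coordinator_error& err) {
    return std::visit(overloaded{
        [](const unavailable&)   { return std::int32_t{0x1000}; },
        [](const write_timeout&) { return std::int32_t{0x1100}; },
        [](const read_timeout&)  { return std::int32_t{0x1200}; },
    }, err);
}
```

Because the set of alternatives is closed, the compiler rejects the visitor if one error type is left unhandled, which a chain of `catch` clauses cannot guarantee.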
Piotr Dulikowski
4cc5d582e3 transport/server: propagate coordinator_result to the error handling code
Now, the failed `result<>` is throwlessly propagated to the continuation
which converts exceptions to CQL response messages, and is thrown there.
2022-02-08 11:08:42 +01:00
Piotr Dulikowski
c750f7895f transport/server: unwrap the exception result_message in process_xyz_internal
At the point where `result_message` is converted to a
`cql_server::response`, now the result message is inspected and returned
as failed `result<>` if it contained an error.

For now, the failed `result<>` is thrown as exception in `process` and
`process_on_shard`, but that will change in the next commit.
2022-02-08 11:08:42 +01:00
Piotr Dulikowski
e4ff22b4ca result_message: add result_message::exception
In order to propagate exceptions as values through the CQL layer with
minimal modifications to the interfaces, a new result_message type is
introduced: result_message::exception. Similarly to
result_message::bounce_to_shard, this is an internal type which is
supposed to be handled before being returned to the client.
2022-02-08 11:08:42 +01:00
Avi Kivity
fcb8d040e8 treewide: use Software Package Data Exchange (SPDX) license identifiers
Instead of lengthy blurbs, switch to single-line, machine-readable
standardized (https://spdx.dev) license identifiers. The Linux kernel
switched long ago, so there is strong precedent.

Three cases are handled: AGPL-only, Apache-only, and dual licensed.
For the latter case, I chose (AGPL-3.0-or-later and Apache-2.0),
reasoning that our changes are extensive enough to apply our license.

The changes were applied mechanically with a script, except to
licenses/README.md.

Closes #9937
2022-01-18 12:15:18 +01:00
Pavel Emelyanov
c04ddc5aa9 transport: Use server gossiper in event notifier
The notifier is automatically a friend of the server and can access its
private fields without additional wrappers/decorations.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-11-25 10:56:05 +03:00
Pavel Emelyanov
2cb18c2404 transport: Keep backreference from event_notifier
The event_notifier is a private server subclass that's created once
per server to handle events from storage_service. The notifier needs
the gossiper that already sits on the server, and the simplest way to
get it is to equip the notifier with a server backreference. Since these
two objects are in a strict 1:1 relation this reference is safe.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-11-25 10:55:41 +03:00
Pavel Emelyanov
43951318c8 transport: Keep gossiper on server
The gossiper is needed by the transport::event_notifier. There's
already gossiper reference on the transport controller, but it's
a local reference, because controller doesn't need more. This
patch upgrades controller reference to sharded<> and propagates
it further up to the server.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-11-25 10:54:45 +03:00
Botond Dénes
a51529dd15 protocol_servers: strengthen guarantees of listen_addresses()
In early versions of the series which proposed protocol servers, the
interface had two methods answering pretty much the same question of
whether the server is running or not:
* listen_addresses(): empty list -> server not running
* is_server_running()

To reduce redundancy and to avoid possible inconsistencies between the
two methods, `is_server_running()` was scrapped, but re-added by a
follow-up patch because `listen_addresses()` proved to be unreliable as
a source for whether the server is running or not.
This patch restores the previous state of having only
`listen_addresses()` with two additional changes:
* rephrase the comment on `listen_addresses()` to make it clear that
  implementations must return empty list when the server is not running;
* those implementations that have a reliable source of whether the
  server is running or not, use it to force-return an empty list when
  the server is not running

Tests: dtest(nodetool_additional_test.py)
Signed-off-by: Botond Dénes <bdenes@scylladb.com>
Message-Id: <20211117062539.16932-1-bdenes@scylladb.com>
2021-11-19 11:09:09 +03:00
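The strengthened contract from a51529dd15 can be shown with a simplified, synchronous sketch. `example_server` and `is_running` are hypothetical names; the real protocol server interface is not reproduced here.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, simplified server used only to demonstrate the contract.
class example_server {
    std::vector<std::string> _configured;
    bool _running = false;
public:
    explicit example_server(std::vector<std::string> addrs)
        : _configured(std::move(addrs)) {}
    void start() { _running = true; }
    void stop() { _running = false; }

    // Contract: an empty list is returned whenever the server is not running,
    // even though the configured addresses are still known.
    std::vector<std::string> listen_addresses() const {
        return _running ? _configured : std::vector<std::string>{};
    }
};

// Generic "is it running?" check built on listen_addresses() alone.
inline bool is_running(const example_server& s) {
    return !s.listen_addresses().empty();
}
```

With the empty-list rule enforced by every implementation, a separate `is_server_running()` method carries no extra information.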
Benny Halevy
9d4262e264 protocol_server: add per-protocol is_server_running method
Change b0a2a9771f broke
the generic api implementation of
is_native_transport_running that relied on
the addresses list being empty after the server is stopped.

To fix that, this change introduces a pure virtual method:
protocol_server::is_server_running that can be implemented
by each derived class.

Test: unit(dev)
DTest: nodetool_additional_test.py:TestNodetool.binary_test

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Message-Id: <20211114135248.588798-1-bhalevy@scylladb.com>
2021-11-14 16:01:31 +02:00
Avi Kivity
b0a2a9771f Merge "Sanitize hostnames resolving on start" from Pavel E
"
On start scylla resolves several hostnames into addresses. Different
places use different hostname selection logic, e.g. the API address
can be the listen one if the dedicated option is not set. Failure to
resolve a hostname is reported with an exception that (sometimes)
contains the hostname, but it doesn't look very convenient -- better
to know the config option name. Also resolving of different hostnames
has different decoration around, e.g. prometheus carries a main-local
lambda just to nicely wrap the try/catch block.

This set unifies this zoo and makes main() shorter and less hairy:

1. All failures to resolve a hostname are reported with an
   exception containing the relevant config option

2. The || operator for named_value's is introduced to make
   the option selection look as short as

     resolve(cfg->some_address() || cfg->another_address())

3. All sanity checks are explicit and happen early in main

4. No dangling local variables carrying the cfg->...() value

5. Use resolved IP when logging a "... is listening on ..."
   message after a service start

tests: unit(dev)
"

* 'br-ip-resolve-on-start' of https://github.com/xemul/scylla:
  main: Move fb-utilities initialization up the main
  code: Use utils::resolve instead of inet_address::lookup
  main: Remove unused variable
  main: Sanitize resolving of listen address
  main: Sanitize resolving of broadcast address
  main: Sanitize resolving of broadcast RPC address
  main: Sanitize resolving of API address
  main: Sanitize resolving of prometheus address
  utils: Introduce || operator for named_values
  db.config: Verbose address resolver helper
  main: Remove api-port and prometheus-port variables
  alternator: Resolve address with the help of inet_address
  redis, thrift: Remove unused captures
2021-11-09 09:15:40 +02:00
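Points 1 and 2 of the series above (errors that name the config option, and `||` for option selection) can be sketched as follows. Everything here is hypothetical: `named_option` is a simplified stand-in for `named_value`, and `resolve` takes an injected lookup function instead of performing a real DNS query. The sketch assumes C++17.

```cpp
#include <cassert>
#include <functional>
#include <optional>
#include <stdexcept>
#include <string>

// Hypothetical, simplified option: a name plus a possibly unset value.
struct named_option {
    std::string name;
    std::optional<std::string> value;
    bool is_set() const { return value.has_value() && !value->empty(); }
};

// Picks the first option if it is set, otherwise the second one.
// Note: an overloaded || evaluates both operands (no short-circuiting).
inline const named_option& operator||(const named_option& a, const named_option& b) {
    return a.is_set() ? a : b;
}

using lookup_fn = std::function<std::optional<std::string>(const std::string&)>;

// Resolves the option's hostname; on failure the error names the option
// rather than only the hostname.
inline std::string resolve(const named_option& opt, const lookup_fn& lookup) {
    if (opt.is_set()) {
        if (auto addr = lookup(*opt.value)) {
            return *addr;
        }
    }
    throw std::runtime_error("Couldn't resolve " + opt.name);
}
```

Since the operator returns a reference to one of its operands, both options must outlive the call, which holds for long-lived configuration objects but not for temporaries.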
Pavel Emelyanov
2f9c21644b code: Use utils::resolve instead of inet_address::lookup
There are some users of the latter call left. They all suffer
from the same problem -- the lack of verbosity on resolving
errors.

While at it also get rid of useless local variables that are
only there to carry the cfg->...() option over.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-11-08 17:33:27 +03:00
Botond Dénes
134fa98ff4 transport: controller: implement the protocol_server interface
2021-11-05 15:42:41 +02:00
Avi Kivity
1aff7d19c2 treewide: replace seastar::fmt_print() with fmt::print()
We shouldn't be using Seastar as a text formatting library; that's
not its focus. Use fmt directly instead. fmt::print() doesn't return
the output stream which is a minor inconvenience, but that's life.

Closes #9556
2021-11-01 10:05:16 +02:00
Botond Dénes
9ec55e054d treewide: distinguish truncated frame errors
We have two identical "Truncated frame" errors, at:
* read_frame_size() in serialization_visitors.hh;
* cql_server::connection::read_and_decompress_frame() in
  transport/server.cc;

When such an exception is thrown, it is impossible to tell where it was
thrown from and it doesn't have any further information contained in it
(beyond the basic information it being thrown implies).
This patch solves both problems: it makes the exception messages unique
per location and it adds information about why it was thrown (the
expected vs. real size of the frame).

Ref: #9482

Closes #9520
2021-10-27 12:27:16 +02:00
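A message that is unique per location and carries both sizes, as 9ec55e054d describes, could be built along these lines. The helper name and the exact wording are illustrative and are not the message text used by the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

// Builds an error that names the throwing location and both frame sizes.
inline std::runtime_error truncated_frame_error(const std::string& where,
                                                std::size_t expected,
                                                std::size_t actual) {
    return std::runtime_error("Truncated frame in " + where +
                              ": expected " + std::to_string(expected) +
                              " bytes, got " + std::to_string(actual));
}
```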
Calle Wilund
940058d25a transport::server: Handle nested exceptions in cql execution/query
Fixes #9491

CQL server, when encountering a "general" exception (i.e. not thrown by
cql error checks), reports a wire error with simply the what() part of
exception. However, if we have nested exceptions, we will most likely
lose info here (hello encryption).

General exception case should unwind exception and give back full,
concatenated message to avoid confusion.

Closes #9492
2021-10-20 17:54:17 +03:00
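Unwinding a nested exception into one concatenated message, as 940058d25a proposes, can be done with the standard `std::rethrow_if_nested`. The function below is a generic sketch with a hypothetical name and separator, not the code from the patch.

```cpp
#include <cassert>
#include <exception>
#include <stdexcept>
#include <string>

// Returns e.what() followed by the messages of all nested exceptions,
// outermost first, separated by ": ".
inline std::string full_message(const std::exception& e) {
    std::string msg = e.what();
    try {
        // Rethrows the nested exception if e carries one; otherwise a no-op.
        std::rethrow_if_nested(e);
    } catch (const std::exception& nested) {
        msg += ": " + full_message(nested);
    } catch (...) {
        msg += ": <unknown exception>";
    }
    return msg;
}
```

Reporting only `what()` of the outer exception would keep just the first part of that chain, which is the information loss the commit message refers to.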
Piotr Sarna
59bd25d1ea transport: respond with overloaded exception during shedding
This commit makes shedding always respond - with overloaded exception,
instead of ignoring the request.

Fixes #9442

Closes #9443
2021-10-07 15:38:40 +03:00
Piotr Sarna
06f724857f transport: remove unused map of stream_id->query states
The map is never touched, so it only occupies precious space
for each connection.

Closes #9383
2021-09-26 13:41:58 +03:00
Avi Kivity
daf028210b build: enable -Winconsistent-missing-override warning
This warning can catch a virtual function that thinks it
overrides another, but doesn't, because the two functions
have different signatures. This isn't very likely since most
of our virtual functions override pure virtuals, but it's
still worth having.

Enable the warning and fix numerous violations.

Closes #9347
2021-09-15 12:55:54 +03:00
Avi Kivity
705f957425 Merge "Generalize TLS creds builder configuration" from Pavel E
"
There are 4 places out there that do the same steps parsing
"client_|server_encryption_options" and configuring the
seastar::tls::creds_builder with the values (messaging, redis,
alternator and transport).

Also to make redis and transport look slimmer main() cleans
the client_encryption_options by ... parsing it too.

This set introduces a (coroutinized) helper to configure the
creds_builder with map<string, string> and removes the options
beautification from main.

tests: unit(dev), dtest.internode_ssl_test(dev)
"

* 'br-generalize-tls-creds-builder-configuration' of https://github.com/xemul/scylla:
  code: Generalize tls::credentials_builder configuration
  transport, redis: Do not assume fixed encryption options
  messaging: Move encryption options parsing to ms
  main: Open-code internode encryption misconfig warning
  main, config: Move options parsing helpers
2021-09-01 14:19:19 +03:00
Avi Kivity
22d2a815c9 transport: server.hh: trim unneeded cql3 includes
query_processor.hh can be replaced with a forward declaration, and
result-message headers, and values.hh is unneeded.

Closes #9238
2021-08-23 18:09:22 +03:00
Pavel Emelyanov
e02b39ca3d code: Generalize tls::credentials_builder configuration
All the places in code that configure the mentioned creds builder
from client_|server_encryption_options now do it the same way.
This patch generalizes it all in the utils:: helper.

The alternator code "ignores" require_client_auth and truststore
keys, but it's easy to make the generalized helper be compatible.

Also make the new helper coroutinized from the beginning.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-08-20 18:05:41 +03:00
Pavel Emelyanov
35209e7500 transport, redis: Do not assume fixed encryption options
On start main() brushes up the client_encryption_options option
so that any user of it sees it in some "clean" state and can
avoid using get_or_default() to parse.

This patch removes this assumption (and the cleaning code itself).
Next patch will make use of it and relax the duplicated parsing
complexity back.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-08-20 17:59:33 +03:00
Avi Kivity
0876248c2b Merge "cql3: cache function calls evaluation for non-deterministic functions" from Pavel S
"
`function_call` AST nodes are created for each function
with side effects in a CQL query, i.e. non-deterministic
functions (`uuid()`, `now()` and some other timeuuid-related ones).

These nodes are evaluated either when a query itself is executed
or query restrictions are computed (e.g. partition/clustering
key ranges for LWT requests).

We need to cache the calls since otherwise when handling a
`bounce_to_shard` request for an LWT query, we can possibly
enter an infinite bouncing loop (in case a function is used
to calculate partition key ranges for a query), since the
results can be different each time.

Furthermore, we don't support bouncing more than one time.
Returning `bounce_to_shard` message more than one time
will result in a crash.

Caching works only for LWT statements and only for the function
calls that affect partition key range computation for the query.

`variable_specifications` class is renamed to `prepare_context`
and generalized to record information about each `function_call`
AST node and modify them, as needed:
* Check whether a given function call is a part of partition key
  statement restriction.
* Assign ids for caching if above is true and the call is a part
  of an LWT statement.

There is no need to include any kind of statement identifier
in the cache key since `query_options` (which holds the cache)
is limited to a single statement, anyway.

Function calls are indexed by the order in which they appear
within a statement while parsing.

Note that `function_call::raw` AST nodes are not created
for selection clauses of a SELECT statement hence they
can accept only one of the following things as parameters:
* Other function calls.
* Literal values.
* Parameter markers.

In other words, only parameters that can be immediately reduced
to a byte buffer are allowed and we don't need to handle
database inputs to non-pure functions separately since they
are not possible in this context. Anyhow, we don't even have
a single non-pure function that accepts arguments, so precautions
are not needed at the moment.

Add a test written in `cql-pytest` framework to verify
that both prepared and unprepared LWT statements handle
`bounce_to_shard` messages correctly in such a scenario.

Fixes: #8604

Tests: unit(dev, debug)

NOTE: the patchset uses `query_options` as a container for
cached values. This doesn't look clean and `service::query_state`
seems to be a better place to store them. But it's not
forwarded to most of the CQL code and would mean that a huge number
of places would have to be amended.
The series presents a trade-off to avoid forwarding `query_state`
everywhere (but maybe it's the thing that needs to be done, nonetheless).
"

* 'lwt_bounce_to_shard_cached_fn_v6' of https://github.com/ManManson/scylla:
  cql-pytest: add a test for non-pure CQL functions
  cql3: cache function calls evaluation for non-deterministic functions
  cql3: rename `variable_specifications` to `prepare_context`
2021-07-30 14:21:11 +03:00
Pavel Solodovnikov
3b6adf3a62 cql3: cache function calls evaluation for non-deterministic functions
And reuse these values when handling `bounce_to_shard` messages.

Otherwise such a function (e.g. `uuid()`) can yield a different
value when a statement is re-executed on the other shard.

It can lead to an infinite number of `bounce_to_shard` messages
sent in case the function value is used to calculate partition
key ranges for the query. This, in turn, will cause a crash,
since we don't support bouncing more than once.

Caching works only for LWT statements and only for the function
calls that affect partition key range computation for the query.

`variable_specifications` class is renamed to `prepare_context`
and generalized to record information about each `function_call`
AST node and modify them, as needed:
* Check whether a given function call is a part of partition key
  statement restriction.
* Assign ids for caching if above is true and the call is a part
  of an LWT statement.

There is no need to include any kind of statement identifier
in the cache key since `query_options` (which holds the cache)
is limited to a single statement, anyway.

Note that `function_call::raw` AST nodes are not created
for selection clauses of a SELECT statement hence they
can accept only one of the following things as parameters:
* Other function calls.
* Literal values.
* Parameter markers.

In other words, only parameters that can be immediately reduced
to a byte buffer are allowed and we don't need to handle
database inputs to non-pure functions separately since they
are not possible in this context. Anyhow, we don't even have
a single non-pure function that accepts arguments, so precautions
are not needed at the moment.

Tests: unit(dev, debug)

Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>
2021-07-30 01:22:39 +03:00
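The caching idea in 3b6adf3a62 can be sketched with a small per-statement cache keyed by the position of the function call. `function_call_cache` and `evaluate` are hypothetical names; the patch itself stores the values in `query_options`, whose interface is not shown here.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_map>

// Hypothetical per-statement cache, keyed by the index of the function call
// within the statement.
class function_call_cache {
    std::unordered_map<std::size_t, std::string> _values;
public:
    // Returns the cached value for `call_id`, computing it on first use only.
    const std::string& evaluate(std::size_t call_id,
                                const std::function<std::string()>& fn) {
        auto it = _values.find(call_id);
        if (it == _values.end()) {
            it = _values.emplace(call_id, fn()).first;
        }
        return it->second;
    }
};
```

If the same cache travels with the statement when it is re-executed on another shard, a call like `uuid()` yields the value computed the first time, so the partition key range, and therefore the target shard, stays the same.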
Pavel Emelyanov
b1bb00a95c transport.controller: Brushup cql_server declarations
The controller code sits in the cql_transport namespace and
can omit the namespace qualifier. Also the seastar::distributed<>
is replaced with modern seastar::sharded<> while at it.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-07-22 18:50:57 +03:00
Pavel Emelyanov
65b1bb8302 transport: Use local notifier to (un)subscribe server
Now the controller has the lifecycle notifier reference and
can stop using storage service to manage the subscription.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-07-22 18:48:58 +03:00
Pavel Emelyanov
5f99eeb35e transport: Keep lifecycle notifier sharded reference
It's needed to (un)subscribe server on it (next patch).

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-07-22 18:48:20 +03:00
Pavel Emelyanov
c7b0b25494 transport, generic_server: Remove no longer used functionality
After subscription management was moved onto controller level
a bunch of code can be dropped:

- passing migration notifier beyond controller
- event_notifier's _stopped bit
- event_notifier .stop() method
- event_notifier empty constructor and destructor
- generic_server's on_stop virtual method

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-07-22 18:41:32 +03:00
Pavel Emelyanov
1acef41626 transport: (Un)Subscribe cql_server::event_notifier from controller
There's a migration notifier that's carried through cql_server
_just_ to let event-notifier (un)subscribe on it. Also there's
a call for global storage-service in there which will need to
be replaced with yet another pass-through argument which is not
great.

It's easier to establish this subscription outside of cql_server
like it's currently done for proxy and sl-manager. In case of
cql_server the "outside" is the controller.

This patch just moves the subscription management from cql_server
to controller, next two patches will make more use of this change.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-07-22 18:37:23 +03:00
Avi Kivity
9059514335 build, treewide: enable -Wpessimizing-move warning
This warning prevents using std::move() where it can hurt
- on an unnamed temporary or a named automatic variable being
returned from a function. In both cases the value could be
constructed directly in its final destination, but std::move()
prevents it.

Fix the handful of cases (all trivial), and enable the warning.

Closes #8992
2021-07-08 17:52:34 +03:00
Pavel Emelyanov
990db016e9 transport: Untie transport and database
Both controller and server only need database to get config from.
Since controller creation only happens in main() code which has the
config itself, we may remove database mentioning from transport.

Previous attempt was not to carry the config down to the server
level, but it stepped on an updateable_value landmine -- the u._v.
isn't copyable cross-shard (despite the docs) and to properly
initialize server's max_concurrent_requests we need the config's
named_value member itself.

The db::config that flies through the stack is const reference, but
its named_values do not get copied along the way -- the updateable
value accepts both references and const references to subscribe on.

tests: start-stop in debug mode

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Message-Id: <20210607135656.18522-1-xemul@scylladb.com>
2021-06-09 20:04:12 +03:00
Pavel Solodovnikov
76bea23174 treewide: reduce header interdependencies
Use forward declarations wherever possible.

Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>

Closes #8813
2021-06-07 15:58:35 +03:00
Piotr Sarna
fa29b79c20 transport: close connections when too large requests arrive
Too large requests are currently handled by the CQL server by
skipping them and sending back an error response.
That's however wasteful and dangerous: bogus request sizes
will force Scylla to potentially skip gigabytes of data
- and skipping is done by simply reading from the socket,
so it may result in gigabytes of bandwidth wasted.
Even if the request size is not bogus, closing the connection
forces users to adjust their request sizes, which should be done
anyway.

Originally, there was a bug in handling too large requests which
only read their headers and then left the connection in a broken,
undefined state, trying to interpret the rest of the large request
as a next CQL header. It was later fixed to skip the request, but
closing the connection is a safer thing to do.

Fixes #8798

Closes #8800
2021-06-07 12:23:55 +03:00
Avi Kivity
e6c5a63581 Merge "Fix several issues on transport stop" from Pavel E
"
There's a bunch of issues with starting and stopping of cql_server with
the help of cql_controller.

fixes: #8796
tests: manual(start + stop,
              start + exception on cql_set_state()
	     )
       unit not run, they don't mess with transport controller
"

* 'br-transport-stop-fixes' of https://github.com/xemul/scylla:
  transport/controller: Stop server on state change failure too
  transport/controller: Rollback server start on state change failure too
  transport/controller: Do not leave _server uninitialized
  transport/controller: Rework try-catch into defers
2021-06-07 11:41:36 +03:00
Avi Kivity
a55b434a2b treewide: extend copyright statements to present day
2021-06-06 19:18:49 +03:00
Avi Kivity
100d6f4094 build: enable -Wunused-function
Also drop a single violation in transport/server.cc. This helps
prevent dead code from piling up.

Three functions in row_cache_test that are not used in debug mode
are moved near their user, and under the same ifdef, to avoid triggering
the error.

Closes #8767
2021-06-06 09:21:23 +03:00
Pavel Emelyanov
76947c829e transport/controller: Stop server on state change failure too
If on stop the set_cql_state() throws the local sharded<cql_server>
will be left not stopped and will fail the respective assertion on
its destruction.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-06-04 16:53:21 +03:00
Pavel Emelyanov
f6ef148c76 transport/controller: Rollback server start on state change failure too
If set_cql_state() throws the cserver remains started. If this
happens on start before the controller stop defer action is
scheduled the destruction of controller will fail on the assertion
that checks the _server must be stopped.

Effectively this is the fix of #8796

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-06-04 16:50:51 +03:00
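The rollback described in f6ef148c76, combined with the defer-based style from the same series, can be sketched with a minimal scope guard. `deferred_action`, `fake_server` and `start_with_rollback` are hypothetical and written from scratch for this example; the controller's actual code and Seastar's own defer utility are not reproduced.

```cpp
#include <cassert>
#include <functional>
#include <stdexcept>
#include <utility>

// Minimal scope guard: runs the action on scope exit unless cancelled.
class deferred_action {
    std::function<void()> _action;
public:
    explicit deferred_action(std::function<void()> action)
        : _action(std::move(action)) {}
    deferred_action(const deferred_action&) = delete;
    deferred_action& operator=(const deferred_action&) = delete;
    ~deferred_action() { if (_action) { _action(); } }
    void cancel() { _action = nullptr; }
};

// Hypothetical stand-in for the sharded server.
struct fake_server {
    bool started = false;
    void start() { started = true; }
    void stop() { started = false; }
};

// Starts the server, then applies a state change that may throw. If it
// throws, the guard stops the server again during stack unwinding.
inline void start_with_rollback(fake_server& server,
                                const std::function<void()>& set_state) {
    server.start();
    deferred_action stop_on_failure([&server] { server.stop(); });
    set_state();               // may throw
    stop_on_failure.cancel();  // success: keep the server running
}
```

Compared with a try-catch block, the guard keeps the cleanup next to the step it undoes, and adding another step that can fail does not require touching the existing cleanup code.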
Pavel Emelyanov
6995e41e64 transport/controller: Do not leave _server uninitialized
If an exception happens after sharded<cql_server>.start() the
controller's _server pointer is left pointing to a stopped sharded
server. This makes it impossible to start the server again (via
API) since the check for if (_server) will always be true.

This is the continuation of the ae4d5a60 fix.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-06-04 16:48:26 +03:00
Pavel Emelyanov
12220b74e8 transport/controller: Rework try-catch into defers
This is to make further patching simpler.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-06-04 16:48:12 +03:00