Calling _read_buf.close() doesn't imply eof(), some data
may have already been read into kernel or client buffers
and will be returned next time read() is called.
When the _server._max_request_size limit was exceeded
and the _read_buf was closed, the process_request method
finished and we started processing the next request in
connection::process. The unread data from _read_buf was
treated as the header of the next request frame, resulting
in "Invalid or unsupported protocol version" error.
The existing test_shed_too_large_request was adjusted.
It was originally written with the assumption that the data
of a large query would simply be dropped from the socket
and the connection could be used to handle the
next requests. This behaviour was changed in scylladb#8800,
now the connection is closed on the Scylla side and
can no longer be used. To check there are no errors
in this case, we use Scylla metrics, getting them
from the Scylla Prometheus API.
If request processing ended with an error, it is worth
sending the error to the client through
make_error/write_response. Previously in this case we
just wrote a message to the log and didn't handle the
client connection in any way. As a result, the only
thing the client got in this case was timeout error.
A new test_batch_with_error is added. It is quite
difficult to reproduce error condition in a test,
so we use error injection instead. Passing injection_key
in the body of the request ensures that the exception
will be thrown only for this test request and
will not affect other requests that
the driver may send in the background.
Closes: scylladb#12104
Ideally, these errors should be transparently delivered
to the client, but in practice, due to various
flaws/bugs in scylla and/or the driver,
they can be lost, which enormously complicates troubleshooting.
const socket_address& get_remote_address() is needed for its
convenient conversion to string, which includes ip and port.
The CQL binary protocol introduced "unset" values in version 4
of the protocol. Unset values can be bound to variables, which
cause certain CQL fragments to be skipped. For example, the
fragment `SET a = :var` will not change the value of `a` if `:var`
is bound to an unset value.
Unsets, however, are very limited in where they can appear. They
can only appear at the top-level of an expression, and any computation
done with them is invalid. For example, `SET list_column = [3, :var]`
is invalid if `:var` is bound to unset.
This causes the code to be littered with checks for unset, and there
are plenty of tests dedicated to catching unsets. However, a simpler
way is possible - prevent the infiltration of unsets at the point of
entry (when evaluating a bind variable expression), and introduce
guards to check for the few cases where unsets are allowed.
This is what this long patch does. It performs the following:
(general)
1. unset is removed from the possible values of cql3::raw_value and
cql3::raw_value_view.
(external->cql3)
2. query_options is fortified with a vector of booleans,
unset_bind_variable_vector, where each boolean corresponds to a bind
variable index and is true when it is unset.
3. To avoid churn, two compatiblity structs are introduced:
cql3::raw_value{,_view}_vector_with_unset, which can be constructed
from a std::vector<raw_value{,_view/}>, which is what most callers
have. They can also be constructed with explicit unset vectors, for
the few cases they are needed.
(cql3->variables)
4. query_options::get_value_at() now throws if the requested bind variable
is unset. This replaces all the throwing checks in expression evaluation
and statement execution, which are removed.
5. A new query_options::is_unset() is added for the users that can tolerate
unset; though it is not used directly.
6. A new cql3::unset_operation_guard class guards against unsets. It accepts
an expression, and can be queried whether an unset is present. Two
conditions are checked: the expression must be a singleton bind
variable, and at runtime it must be bound to an unset value.
7. The modification_statement operations are split into two, via two
new subclasses of cql3::operation. cql3::operation_no_unset_support
ignores unsets completely. cql3::operation_skip_if_unset checks if
an operand is unset (luckily all operations have at most one operand that
tolerates unset) and applies unset_operation_guard to it.
8. The various sites that accept expressions or operations are modified
to check for should_skip_operation(). This are the loops around
operations in update_statement and delete_statement, and the checks
for unset in attributes (LIMIT and PER PARTITION LIMIT)
(tests)
9. Many unset tests are removed. It's now impossible to enter an
unset value into the expression evaluation machinery (there's
just no unset value), so it's impossible to test for it.
10. Other unset tests now have to be invoked via bind variables,
since there's no way to create an unset cql3::expr::constant.
11. Many tests have their exception message match strings relaxed.
Since unsets are now checked very early, we don't know the context
where they happen. It would be possible to reintroduce it (by adding
a format string parameter to cql3::unset_operation_guard), but it
seems not to be worth the effort. Usage of unsets is rare, and it is
explicit (at least with the Python driver, an unset cannot be
introduced by ommission).
I tried as an alternative to wrap cql3::raw_value{,_view} (that doesn't
recognize unsets) with cql3::maybe_unset_value (that does), but that
caused huge amounts of churn, so I abandoned that in favor of the
current approach.
Closes#12517
In 424dbf43f ("transport: drop cql protocol versions 1 and 2"),
we dropped support for protocols 1 and 2, but some code remains
that checks for those versions. It is now dead code, so remove it.
Closes#12497
Now that we don't accept cql protocol version 1 or 2, we can
drop cql_serialization format everywhere, except when in the IDL
(since it's part of the inter-node protocol).
A few functions had duplicate versions, one with and one without
a cql_serialization_format parameter. They are deduplicated.
Care is taken that `partition_slice`, which communicates
the cql_serialization_format across nodes, still presents
a valid cql_serialization_format to other nodes when
transmitting itself and rejects protocol 1 and 2 serialization\
format when receiving. The IDL is unchanged.
One test checking the 16-bit serialization format is removed.
Version 3 was introduced in 2014 (Cassandra 2.1) and was supported
in the very first version of Scylla (2a7da21481 "CQL binary protocol").
Cassandra 3.0 (2015) dropped protocols 1 and 2 as well.
It's safe enough to drop it now, 9 years after introduction of v3
and 7 years after Cassandra stopped supporting it.
Dropping it allows dropping cql_serialization_format, which causes
quite a lot of pain, and is probably broken. This will be dropped in the
following patch.
The cql server uses an execution stage to process and execute queries,
however, processing stage is best utilized when having a recurrent flow
that needs to be called repeatedly since it better utilizes the
instruction cache.
Up until now, every request was sent through the processing stage, but
most requests are not meant to be executed repeatedly with high volume.
This change processes and executes the data queries asynchronously,
through an execution stage, and all of the rest are processed one by
one, only continuing once the request has been done end to end.
Tests:
Unit tests in dev and debug.
Signed-off-by: Eliran Sinvani <eliransin@scylladb.com>
Closes#12202
Otherwise cql_transport::additional_options_for_proto_ext() complains
about inability to format the enum class value
Introduced by efc3953c (transport: add rate_limit_error)
Fmt version 8.1.1-5.fc35, fresher one must have it out of the box
Fixes#10884
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Message-Id: <20220627052703.32024-1-xemul@scylladb.com>
Adds a CQL protocol extension which introduces the rate_limit_error. The
new error code will be used to indicate that the operation failed due to
it exceeding the allowed per-partition rate limit.
The error code is supposed to be returned only if the corresponding CQL
extension is enabled by the client - if it's not enabled, then
Config_error will be returned in its stead.
After fcb8d040 ("treewide: use Software Package Data Exchange
(SPDX) license identifiers"), many dual-licensed files were
left with empty comments on top. Remove them to avoid visual
noise.
Closes#10562
One user observed this assertion fail, but it's an extremely rare event.
The root cause - interlacing of processing STARTUP and OPTIONS messages -
is still there, but now it's harmless enough to leave it as is.
Fixes#10487Closes#10503
Protocol v4 added WRITE_FAILURE and READ_FAILURE. When running under v3
we downgrade these exceptions to WRITE_TIMEOUT and READ_TIMEOUT (since
the client won't understand the v4 errors), but we still send the new
error codes. This causes the client to become confused.
Fix by updating the error codes.
A better fix is to move the error code from the constructor parameter
list and hard-code it in the constructor, but that is left for a follow-up
after this minimal fix.
Fixes#5610.
Closes#10362
Recently, coordinator_result was introduced as an alternative for
exceptions. It was placed in the main "exceptions/exceptions.hh" header,
which virtually every single source file in Scylla includes.
But unfortunately, it brings in some heavy header files and templates,
leading to a lot of wasted build time - ClangBuildAnalyzer measured that
we include exceptions.hh in 323 source files, taking almost two seconds
each on average.
In this patch, we split the coordinator_result feature into a separate
header file, "exceptions/coordinator_result", and only the few places
which need it include the header file. Unfortunately, some of these
few places are themselves header, so the new header file ends up being
included in 100 source files - but 100 is still much less than 323 and
perhaps we can reduce this number 100 later.
After this patch, the total Scylla object-file size is reduced by 6.5%
(the object size is a proxy for build time, which I didn't directly
measure). ClangBuildAnalyzer reports that now each of the 323 includes
of exceptions.hh only takes 80ms, coordinator_result.hh is only included
100 times, and virtually all the cost to include it comes from Boost's
result.hh (400ms per inclusion).
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20220228204323.1427012-1-nyh@scylladb.com>
Now the connection_notifier is all gone, only the client_data bits are left.
To keep it consistent -- rename the files.
Also, while at it, brush up the header dependencies and remove the not
really used constexprs for client states.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
This includes most of the connection_notifier stuff as well as
the auxiliary code from system_keyspace.cc and a bunch of
updating calls from the client state changing.
Other than less code and less disk updates on clients connection
paths, this removes one usage of the nasty global qctx thing.
Since the system.clients goes away rename the system.clients_v
here too so the table is always present out there.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
The call returns a chunked_vector with client_data's. For now
only the native transport implements it, others return empty
vector.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Right now when the client state changes the respective update is
performed on the system.clients table. While doing it some bits
from this state are lost from the in-memory structures. For the
sake of exporting this information we need to track whether the
connected client goes authenticating or is already ready.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Adds `utils::result_try` and `utils::result_futurize_try` - functions which allow to convert existing try..catch blocks into a version which handles C++ exceptions, failed results with exception containers and, depending on the function variant, exceptional futures using the same exception handling logic.
For example, you can convert the following try..catch block:
try {
return a_function_that_may_throw();
} catch (const my_exception& ex) {
return 123;
} catch (...) {
throw;
}
...to this:
return utils::result_try([&] {
return a_function_that_may_throw_or_return_a_failed_result();
}, utils::result_catch<my_exception>([&] (const Ex&) {
return 123;
}), utils::result_catch_dots([&] (auto&& handle) {
return handle.into_result();
});
Similarly, `utils::result_futurize_try` can be used to migrate `then_wrapped` or `f.handle_exception()` constructs.
As an example of the usability of the new constructs, two places in the current code which need to simultaneously handle exceptions and failed results are converted to use `result_try` and `result_futurize_try`.
Results of `perf_simple_query --smp 1 --operations-per-shard 1000000 --write`:
```
127041.61 tps ( 67.2 allocs/op, 14.2 tasks/op, 52422 insns/op)
126958.60 tps ( 67.2 allocs/op, 14.2 tasks/op, 52409 insns/op)
127088.37 tps ( 67.2 allocs/op, 14.2 tasks/op, 52411 insns/op)
127560.84 tps ( 67.2 allocs/op, 14.2 tasks/op, 52424 insns/op)
127826.61 tps ( 67.2 allocs/op, 14.2 tasks/op, 52406 insns/op)
126801.02 tps ( 67.2 allocs/op, 14.2 tasks/op, 52420 insns/op)
125371.51 tps ( 67.2 allocs/op, 14.2 tasks/op, 52425 insns/op)
126498.51 tps ( 67.2 allocs/op, 14.2 tasks/op, 52427 insns/op)
126359.41 tps ( 67.2 allocs/op, 14.2 tasks/op, 52423 insns/op)
126298.27 tps ( 67.2 allocs/op, 14.2 tasks/op, 52410 insns/op)
```
The number of tasks and allocations is unchanged. The number of instructions per operations seems similar, it may have increased slightly (by 10-20) but it's hard to tell for sure because of the noisiness of the results.
Tests: unit(dev)
Closes#10045
* github.com:scylladb/scylla:
transport: use result_try in process_request_one
storage_proxy: use result_futurize_try in mutate_end
storage_proxy: temporarily throw exception from result in mutate_end
utils: add result_try and result_futurize_try
Adapts the exception handling logic in process_request_one so that it
uses utils::result_try to handle both C++ exceptions and failed results
in a unified way.
At the point where `result_message` is converted to a
`cql_server::response`, now the result message is inspected and returned
as failed `result<>` if it contained an error.
For now, the failed `result<>` is thrown as exception in `process` and
`process_on_shard`, but that will change in the next commit.
In order to propagate exceptions as values through the CQL layer with
minimal modifications to the interfaces, a new result_message type is
introduced: result_message::exception. Similarly to
result_message::bounce_to_shard, this is an internal type which is
supposed to be handled before being returned to the client.
Instead of lengthy blurbs, switch to single-line, machine-readable
standardized (https://spdx.dev) license identifiers. The Linux kernel
switched long ago, so there is strong precedent.
Three cases are handled: AGPL-only, Apache-only, and dual licensed.
For the latter case, I chose (AGPL-3.0-or-later and Apache-2.0),
reasoning that our changes are extensive enough to apply our license.
The changes we applied mechanically with a script, except to
licenses/README.md.
Closes#9937
The notifier is automatic friend of server and can access its
private fields without additional wrappers/decorations.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
The event_notifier is private server subclass that's created once
per server to handle events from storage_service. The notifier needs
gossiper that already sits on the server, and to get it the simplest
way is to equip notifier with the server backreference. Since these
two objects are in strict 1:1 relation this reference is safe.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
The gossiper is needed by the transport::event_notifier. There's
already gossiper reference on the transport controller, but it's
a local reference, because controller doesn't need more. This
patch upgrages controller reference to sharded<> and propagates
it further up to the server.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
In early versions of the series which proposed protocol servers, the
interface had two methods answering pretty much the same question of
whether the server is running or not:
* listen_addresses(): empty list -> server not running
* is_server_running()
To reduce redundancy and to avoid possible inconsistencies between the
two methods, `is_server_running()` was scrapped, but re-added by a
follow-up patch because `listen_addresses()` proved to be unreliable as
a source for whether the server is running or not.
This patch restores the previous state of having only
`listen_addresses()` with two additional changes:
* rephrase the comment on `listen_addresses()` to make it clear that
implementations must return empty list when the server is not running;
* those implementations that have a reliable source of whether the
server is running or not, use it to force-return an empty list when
the server is not running
Tests: dtest(nodetool_additional_test.py)
Signed-off-by: Botond Dénes <bdenes@scylladb.com>
Message-Id: <20211117062539.16932-1-bdenes@scylladb.com>
Change b0a2a9771f broke
the generic api implementation of
is_native_transport_running that relied on
the addresses list being empty agter the server is stopped.
To fix that, this change introduces a pure virtual method:
protocol_server::is_server_running that can be implemented
by each derived class.
Test: unit(dev)
DTest: nodetool_additional_test.py:TestNodetool.binary_test
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Message-Id: <20211114135248.588798-1-bhalevy@scylladb.com>
"
On start scylla resolves several hostnames into addresses. Different
places use different hostname selection logic, e.g. the API address
can be the listen one if the dedicated option not set. Failure to
resolve a hostname is reported with an exception that (sometimes)
contains the hostname, but it doesn't look very convenient -- better
to know the config option name. Also resolving of different hostnames
has different decoration around, e.g. prometheus carries a main-local
lambda just to nicely wrap the try/catch block.
This set unifies this zoo and makes main() shorter and less hairy:
1. All failures to resolve a hostname are reported with an
exception containing the relevant config option
2. The || operator for named_value's is introduced to make
the option selection look as short as
resolve(cfg->some_address() || cfg->another_address())
3. All sanity checks are explicit and happen early in main
4. No dangling local variables carrying the cfg->...() value
5. Use resolved IP when logging a "... is listening on ..."
message after a service start
tests: unit(dev)
"
* 'br-ip-resolve-on-start' of https://github.com/xemul/scylla:
main: Move fb-utilities initialization up the main
code: Use utils::resolve instead of inet_address::lookup
main: Remove unused variable
main: Sanitize resolving of listen address
main: Sanitize resolving of broadcast address
main: Sanitize resolving of broadcast RPC address
main: Sanitize resolving of API address
main: Sanitize resolving of prometheus address
utils: Introduce || operator for named_values
db.config: Verbose address resolver helper
main: Remove api-port and prometheus-port variables
alternator: Resolve address with the help of inet_address
redis, thrift: Remove unused captures
There are some users of the latter call left. They all suffer
from the same problem -- the lack of verbosity on resolving
errors.
While at it also get rid of useless local variables that are
only there to carry the cfg->...() option over.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
We shouldn't be using Seastar as a text formatting library; that's
not its focus. Use fmt directly instead. fmt::print() doesn't return
the output stream which is a minor inconvenience, but that's life.
Closes#9556
We have two identical "Truncated frame" errors, at:
* read_frame_size() in serialization_visitors.hh;
* cql_server::connection::read_and_decompress_frame() in
transport/server.cc;
When such an exception is thrown, it is impossible to tell where was it
thrown from and it doesn't have any further information contained in it
(beyond the basic information it being thrown implies).
This patch solves both problems: it makes the exception messages unique
per location and it adds information about why it was thrown (the
expected vs. real size of the frame).
Ref: #9482Closes#9520
Fixes#9491
CQL server, when encountering a "general" exception (i.e. not thrown by
cql error checks), reports a wire error with simply the what() part of
exception. However, if we have nested exceptions, we will most likely
lose info here (hello encryption).
General exception case should unwind exception and give back full,
concatenated message to avoid confusion.
Closes#9492
This warning can catch a virtual function that thinks it
overrides another, but doesn't, because the two functions
have different signatures. This isn't very likely since most
of our virtual functions override pure virtuals, but it's
still worth having.
Enable the warning and fix numerous violations.
Closes#9347
"
There are 4 places out there that do the same steps parsing
"client_|server_encryption_options" and configuring the
seastar::tls::creds_builder with the values (messaging, redis,
alternator and transport).
Also to make redis and transport look slimmer main() cleans
the client_encryption_options by ... parsing it too.
This set introduces a (coroutinized) helper to configure the
creds_builder with map<string, string> and removes the options
beautification from main.
tests: unit(dev), dtest.internode_ssl_test(dev)
"
* 'br-generalize-tls-creds-builder-configuration' of https://github.com/xemul/scylla:
code: Generalize tls::credentials_builder configuration
transport, redis: Do not assume fixed encryption options
messaging: Move encryption options parsing to ms
main: Open-code internode encryption misconfig warning
main, config: Move options parsing helpers
All the places in code that configure the mentioned creds builder
from client_|server_encryption_options now do it the same way.
This patch generalizes it all in the utils:: helper.
The alternator code "ignores" require_client_auth and truststore
keys, but it's easy to make the generalized helper be compatible.
Also make the new helper coroutinized from the beginning.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
On start main() brushes up the client_encryption_options option
so that any user of it sees it in some "clean" state and can
avoid using get_or_default() to parse.
This patch removes this assumption (and the cleaning code itself).
Next patch will make use of it and relax the duplicated parsing
complexity back.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
"
`function_call` AST nodes are created for each function
with side effects in a CQL query, i.e. non-deterministic
functions (`uuid()`, `now()` and some others timeuuid-related).
These nodes are evaluated either when a query itself is executed
or query restrictions are computed (e.g. partition/clustering
key ranges for LWT requests).
We need to cache the calls since otherwise when handling a
`bounce_to_shard` request for an LWT query, we can possibly
enter an infinite bouncing loop (in case a function is used
to calculate partition key ranges for a query), since the
results can be different each time.
Furthermore, we don't support bouncing more than one time.
Returning `bounce_to_shard` message more than one time
will result in a crash.
Caching works only for LWT statements and only for the function
calls that affect partition key range computation for the query.
`variable_specifications` class is renamed to `prepare_context`
and generalized to record information about each `function_call`
AST node and modify them, as needed:
* Check whether a given function call is a part of partition key
statement restriction.
* Assign ids for caching if above is true and the call is a part
of an LWT statement.
There is no need to include any kind of statement identifier
in the cache key since `query_options` (which holds the cache)
is limited to a single statement, anyway.
Function calls are indexed by the order in which they appear
within a statement while parsing. There is no need to
include any kind of statement identifier to the cache key
since `query_options` (which holds the cache) is limited
to a single statement, anyway.
Note that `function_call::raw` AST nodes are not created
for selection clauses of a SELECT statement hence they
can only accept only one of the following things as parameters:
* Other function calls.
* Literal values.
* Parameter markers.
In other words, only parameters that can be immediately reduced
to a byte buffer are allowed and we don't need to handle
database inputs to non-pure functions separately since they
are not possible in this context. Anyhow, we don't even have
a single non-pure function that accepts arguments, so precautions
are not needed at the moment.
Add a test written in `cql-pytest` framework to verify
that both prepared and unprepared lwt statements handle
`bounce_to_shard` messages correctly in such scenario.
Fixes: #8604
Tests: unit(dev, debug)
NOTE: the patchset uses `query_options` as a container for
cached values. This doesn't look clean and `service::query_state`
seems to be a better place to store them. But it's not
forwarded to most of the CQL code and would mean that a huge number
of places would have to be amended.
The series presents a trade-off to avoid forwarding `query_state`
everywhere (but maybe it's the thing that needs to be done, nonetheless).
"
* 'lwt_bounce_to_shard_cached_fn_v6' of https://github.com/ManManson/scylla:
cql-pytest: add a test for non-pure CQL functions
cql3: cache function calls evaluation for non-deterministic functions
cql3: rename `variable_specifications` to `prepare_context`