94 Commits

Author SHA1 Message Date
Petr Gusev
ffaee20b62 exceptions.hh: fix message argument passing
The message argument is usually taken from a temporary variable
constructed with the format() function. It is more efficient to
pass it by value and move it along the constructor chain.
2025-08-13 13:39:52 +02:00
Petr Gusev
ff89c03c7f exceptions: add constructors that accept explicit error messages
To improve debuggability, we need to propagate original error messages
from Paxos verbs to the user. This change adds constructors that take
an error message directly, enabling better error reporting.

Additionally, functions such as write_timeout_to_read,
write_failure_to_read etc are updated to use these message-based
constructors. These functions are used in storage_proxy::cas to
convert between different error types, and without this change,
they could lose the original error message during conversion.
2025-08-12 16:31:05 +02:00
Andrzej Jackowski
1fca994c7b transport: storage_proxy: release ERM when waiting for query timeout
Before this change, if a read executor had just enough targets to
achieve query's CL, and there was a connection drop (e.g. node failure),
the read executor waited for the entire request timeout to give drivers
time to execute a speculative read in a meantime. Such behavior don't
work well when a very long query timeout (e.g. 1800s) is set, because
the unfinished request blocks topology changes.

This change implements a mechanism to thrown a new
read_failure_exception_with_timeout in the aforementioned scenario.
The exception is caught by CQL server which conducts the waiting, after
ERM is released. The new exception inherits from read_failure_exception,
because layers that don't catch the exception (such as mapreduce
service) should handle the exception just a regular read_failure.
However, when CQL server catch the exception, it returns
read_timeout_exception to the client because after additional waiting
such an error message is more appropriate (read_timeout_exception was
also returned before this change was introduced).

This change:
 - Add new read_failure_exception_with_timeout exception
 - Add throw of read_failure_exception_with_timeout in storage_proxy
 - Add abort_source to CQL server, as well as to_stop() method for
   the correct abort handling
 - Add sleep in CQL server when the new exception is caught

Refs #21831
2025-04-23 09:29:47 +02:00
Avi Kivity
f3eade2f62 treewide: relicense to ScyllaDB-Source-Available-1.0
Drop the AGPL license in favor of a source-available license.
See the blog post [1] for details.

[1] https://www.scylladb.com/2024/12/18/why-were-moving-to-a-source-available-license/
2024-12-18 17:45:13 +02:00
Kefu Chai
48c8d24345 treewide: drop support for fmt < v10
since fedora 38 is EOL. and fedora 39 comes with fmt v10.0.0, also,
we've switched to the build image based on fedora 40, which ships
fmt-devel v10.2.1, there is no need to support fmt < 10.

in this change, we drop the support fmt < 10.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#21847
2024-12-09 20:42:38 +02:00
Kefu Chai
00810e6a01 treewide: include seastar/core/format.hh instead of seastar/core/print.hh
The later includes the former and in addition to `seastar::format()`,
`print.hh` also provides helpers like `seastar::fprint()` and
`seastar::print()`, which are deprecated and not used by scylladb.

Previously, we include `seastar/core/print.hh` for using
`seastar::format()`. and in seastar 5b04939e, we extracted
`seastar::format()` into `seastar/core/format.hh`. this allows us
to include a much smaller header.

In this change, we just include `seastar/core/format.hh` in place of
`seastar/core/print.hh`.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#21574
2024-11-14 17:45:07 +02:00
Dawid Mędrek
7a7a1e3558 treewide: Prefer bytes_fwd.hh over bytes.hh
CI started reporting warnings about including `bytes.hh` in
several files. The reason is they actually only use code
introduced in `bytes_fwd.hh` (which is also included by `bytes.hh`).
Clang-include-cleaner suggests that we get rid of that indirection
and only include `bytes_fwd.hh`. That's what happens in this commit.

We include `bytes.hh` in `exceptions/exceptions.cc` because
it relies on the formatting utilities declared and defined
in `bytes.hh`.

Closes scylladb/scylladb#20842
2024-10-02 07:29:30 +02:00
Kefu Chai
cb1670b79b Update seastar submodule
* seastar ec5da7a6...69f88e2f (38):
  > build: s/Sanitizers_COMPILER_OPTIONS/Sanitizers_COMPILE_OPTIONS
  > test: Update httpd test with request/reply body writing sugar
  > http: Add sugar to request and response body writers
  > utils: Add util::write_to_stream() helper
  > seastar-addr2line: adjust llvm termination regex
  > README.md: add Crimson project
  > rpc: conditionally use fmt::runtime() based on SEASTAR_LOGGER_COMPILE_TIME_FMT
  > build: check the combination of Sanitizers
  > tls: clear session ticket before releasing
  > print: remove dead code
  > doc/lambda-coroutine-fiasco: reword for better readability
  > rpc: fix compilation error caused by fmt::runtime()
  > tutorial: explain the use case of rethrow_exception and coroutine::exception
  > reactor: print more informative error when io_submit fails
  > README.md: note GitHub discussions
  > prometheus: `fmt::print` to stringstream directly
  > doc: add document for testing with seastar
  > seastar/testing: only include used headers
  > test: Add abortable http client test cases
  > http/client: Add abortable make_request() API method
  > http/client: Abort established connections
  > http/client: Handle abort source in pool wait
  > http/client: Add abort source to factory::make() method
  > http/client: Pass abort_source here and there
  > http/client: Idnentation fix after previous patch
  > http/client: Merge some continuations explicitly
  > signal: add seastar signal api
  > httpd: remove unused prometheus structs
  > print: use fmtlib's fmt::format_string in format()
  > rpc: do not use seastar::format() in rpc logger
  > treewide: s/format/seastar::format/
  > prometheus: sanitize label value for text protocol
  > tests: unit test prometheus wire format
  > io-tester: Introduce batches to rate-based submission
  > io-tester: Generalize issueing request and collecting its result
  > io-tester: Cancel intent once
  > io-tester: Dont carry rps/parallelism variables over lambdas
  > io-tester: Simplify in-flight management

The breaking changes in the seastar submodule necessitate corresponding
modifications in our code. These changes must be implemented together in
a single commit to maintain consistency. So that each commit is buildable.

following changes are included in addition to seastar submodule update:
* instead of passing a `const char*` for the format string, pass a
  templated `fmt::format_string<...>`, this depends on the
  `seastar::format()` change in seastar.
* explicitly call `fmt::runtime()` if the format string is not a
  consteval expression. this depends on the `seastar::format()` change
  in seastar. as `seastar::format()` does not accept a plain
  `const char*` which is not constexpr anymore.
* pass abort_source to `dns_connection_factory::make()`. this depends on
  the change in seastar, which added a `abort_source*` argument to
  the pure virtual member function of `connection_factory::make()`.
* call call {fmt,seastar}::format() explicitly. this is a follow up of
  3e84d43f, which takes care of all places where we should call
  `fmt::format()` and `seastar::format()` explicitly to disambiguate the
  `format()` call. but more `format()` call made their way into the source
   tree after 3e84d43f. so we need fix them as well.
* include used header in tests

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Update seastar submodule

 Please enter the commit message for your changes. Lines starting

Closes scylladb/scylladb#20649
2024-09-18 13:59:22 +03:00
Kefu Chai
3e84d43f93 treewide: use seastar::format() or fmt::format() explicitly
before this change, we rely on `using namespace seastar` to use
`seastar::format()` without qualifying the `format()` with its
namespace. this works fine until we changed the parameter type
of format string `seastar::format()` from `const char*` to
`fmt::format_string<...>`. this change practically invited
`seastar::format()` to the club of `std::format()` and `fmt::format()`,
where all members accept a templated parameter as its `fmt`
parameter. and `seastar::format()` is not the best candidate anymore.
despite that argument-dependent lookup (ADT for short) favors the
function which is in the same namespace as its parameter, but
`using namespace` makes `seastar::format()` more competitive,
so both `std::format()` and `seastar::format()` are considered
as the condidates.

that is what is happening scylladb in quite a few caller sites of
`format()`, hence ADT is not able to tell which function the winner
in the name lookup:

```
/__w/scylladb/scylladb/mutation/mutation_fragment_stream_validator.cc:265:12: error: call to 'format' is ambiguous
  265 |     return format("{} ({}.{} {})", _name_view, s.ks_name(), s.cf_name(), s.id());
      |            ^~~~~~
/usr/bin/../lib/gcc/x86_64-redhat-linux/14/../../../../include/c++/14/format:4290:5: note: candidate function [with _Args = <const std::basic_string_view<char> &, const seastar::basic_sstring<char, unsigned int, 15> &, const seastar::basic_sstring<char, unsigned int, 15> &, const utils::tagged_uuid<table_id_tag> &>]
 4290 |     format(format_string<_Args...> __fmt, _Args&&... __args)
      |     ^
/__w/scylladb/scylladb/seastar/include/seastar/core/print.hh:143:1: note: candidate function [with A = <const std::basic_string_view<char> &, const seastar::basic_sstring<char, unsigned int, 15> &, const seastar::basic_sstring<char, unsigned int, 15> &, const utils::tagged_uuid<table_id_tag> &>]
  143 | format(fmt::format_string<A...> fmt, A&&... a) {
      | ^
```

in this change, we

change all `format()` to either `fmt::format()` or `seastar::format()`
with following rules:
- if the caller expects an `sstring` or `std::string_view`, change to
  `seastar::format()`
- if the caller expects an `std::string`, change to `fmt::format()`.
  because, `sstring::operator std::basic_string` would incur a deep
  copy.

we will need another change to enable scylladb to compile with the
latest seastar. namely, to pass the format string as a templated
parameter down to helper functions which format their parameters.
to miminize the scope of this change, let's include that change when
bumping up the seastar submodule. as that change will depend on
the seastar change.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2024-09-11 23:21:40 +03:00
Piotr Dulikowski
da5f4faac1 Merge 'mv: reject user requests by coordinator when a replica is overloaded by MVs' from Wojciech Mitros
Currently, when a view update backlog of one replica is full, the write is still sent by the coordinator to all replicas. Because of the backlog, the write fails on the replica, causing inconsistency that needs to be fixed by repair. To avoid these inconsistencies, this patch adds a check on the coordinator for overloaded replicas. As a result, a write may be rejected before being sent to any replicas and later retried by the user, when the replica is no longer overloaded.

This patch does not remove the replica write failures, because we still may reach a full backlog when more view updates are generated after the coordinator check is performed and before the write reaches the replica.

Fixes scylladb/scylladb#17426

Closes scylladb/scylladb#18334

* github.com:scylladb/scylladb:
  mv: test the view update behavior
  mv: add test for admission control
  storage_proxy: return overloaded_exception instead of throwing
  mv: reject user requests by coordinator when a replica is overloaded by MVs
2024-08-27 12:50:34 +02:00
Wojciech Mitros
a55b7688b6 storage_proxy: return overloaded_exception instead of throwing
To avoid an expensive stack unwind, instead of throwing an error,
we can just return it thanks to the boost::result type that the
affected methods use. The result with an exception needs to be
constructed not implicitly, but with boost::outcome_v2::failure,
because the exception, converted into coordinator_exception_container
can be then converted into both into a successful response_id_type
as well as into a failure.
2024-08-02 12:12:24 +02:00
Dawid Medrek
414ea68cac exceptions/exceptions.hh: Wrap #include <concepts> within an #ifdef
`GitHub Actions / Analyze #includes in source files` keeps reporting
that the include shouldn't be present in the file. The reason is
that we use FMT with version >10, so the fragment of the code that
uses the include is not compiled. We move the include to a place
where it's used, which should fix the warnings.

Closes scylladb/scylladb#19776
2024-07-17 22:09:41 +03:00
Pavel Emelyanov
c752bda0a2 Merge '.github: change severity to error in clang-include-cleaner ' from Kefu Chai
in this changeset, we tighten the clang-include-cleaner workflow, and address the warnings in two more subdirectories in the source tree.

* it's a cleanup, no need to backport

Closes scylladb/scylladb#19155

* github.com:scylladb/scylladb:
  .github: add alternator to iwyu's CLEANER_DIR
  alternator: do not include unused headers
  .github: change severity to error in clang-include-cleaner
  exceptions: do not include unused headers
2024-06-12 10:16:17 +03:00
Wojciech Mitros
4aa7ada771 exceptions: make view update timeouts inherit from timed_out_error
Currently, when generating and propagating view updates, if we notice
that we've already exceeded the time limit, we throw an exception
inheriting from `request_timeout_exception`, to later catch and
log it when finishing request handling. However, when catching, we
only check timeouts by matching the `timed_out_error` exception,
so the exception thrown in the view update code is not registered
as a timeout exception, but an unknown one. This can cause tests
which were based on the log output to start failing, as in the past
we were noticing the timeout at the end of the request handling
and using the `timed_out_error` to keep processing it and now, even
though we do notice the timeout even earlier, due to it's type we
log an error to the log, instead of treating it as a regular timeout.
In this patch we make the error thrown on timeout during view updates
inherit from `timed_out_error` instead of the `request_timeout_exception`
(it is also moved from the "exceptions" directory, where we define
exceptions returned to the user).
Aside from helping with the issue described above, we also improve our
metrics, as the `request_timeout_exception` is also not checked for
in the `is_timeout_exception` method, and because we're using it to
check whether we should update write timeout metrics, they will only
start getting updated after this patch.

Closes scylladb/scylladb#19102
2024-06-07 09:54:48 +02:00
Kefu Chai
d33ab21ef8 exceptions: do not include unused headers
these unused includes were identified by clangd. see
https://clangd.llvm.org/guides/include-cleaner#unused-include-warning
for more details on the "Unused include" warning.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2024-06-07 07:28:52 +08:00
Jan Ciolek
e0442d7bfa mv: throttle view update generation for large queries
For every mutation applied to the base table we have to
generate the corresponding materialized view table updates.

In case of simple requests, like INSERT or UPDATE, the number
of view updates generated per base table mutation is limited
to at most a few view table updates per base table update.

The situation is different for DELETE queries, which can delete
the whole partitions or clustering ranges. Range deletions are
fast on the base table, but for the view table the situation
is different. Deleting a single partition in the base table
will generate as many singular view updates as there are rows
in the deleted partition, which could potentially be in the millions.

To prevent OOM view updates are generated in batches of at most 100 rows.
There is a loop which generates the next batch of updates, spawns tasks
to send them to remote nodes, generates another batch and so on.

The problem is that there is no concurrency control - each batch is scheduled
to be sent in the background, but the following batch is generated without
waiting for the previously generated updates to be sent. This can lead to
unbounded concurrency and OOM.

To protect against this view update generation should be limited somehow.

There is an existing mechanism for limiting view updates - throttling.
We keep track of how many pending view updates there are, in the view backlog,
and delay responses to the client based on this backlog's fullness.
For a well behaved client with limited concurrency this will slow down
the amount of incoming requests until it reaches an optimal point.

This works for simple queries (INSERT, UPDATE, ...), but it doesn't do anything
for range DELETEs. A DELETE is a single request that generates millions of view
updates, delaying client response doesn't help.

The throttling mechanism could be extend to cover this case - we could treat the
DELETE request like any other client and force it to wait before sending more updates.

This commit implements this approach - before sending the next batch of updates
the generator is forced to sleep for a bit of time, calculated using the exisiting
throttling equation.
The more full the backlog gets the more the generator will have to sleep for,
and hopefully this will prevent overloading the system with view updates.

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2024-05-13 18:16:23 +02:00
Jan Ciolek
cd62697605 exceptions: add read_write_timeout_exception, a subclass of request_timeout_exception
The `request_timeout_exception` is thrown when a client request can't be completed in time.
Previously this class included some fields specific to read/write timeouts:
```
db::consistency_level consistency;
int32_t received;
int32_t block_for;
```

The problem is that a request can timeout for reasons other than read/write timeout,
for example the request might timeout due to materialized view update generation taking
too long.

In such cases of non read/write timeouts we would like to be able use request_timeout_exception,
but it contains fields that aren't releveant in these cases.

To deal with this let's create read_write_timeout_exception, which inherits
from request_timeout_exception. read_write_timout_exception will contain all
of these fields that are specific to read/write timeouts. request_timeout_exception
will become the base class that doesn't have any fields, the other case-specific
exceptions will derive from it and add the desired fields.

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2024-05-13 18:16:09 +02:00
Kefu Chai
168ade72f8 treewide: replace formatter<std::string_view> with formatter<string_view>
in in {fmt} before v10, it provides the specialization of `fmt::formatter<..>`
for `std::string_view` as well as the specialization of `fmt::formatter<..>`
for `fmt::string_view` which is an implementation builtin in {fmt} for
compatibility of pre-C++17. and this type is used even if the code is
compiled with C++ stadandard greater or equal to C++17. also, before v10,
the `fmt::formatter<std::string_view>::format()` is defined so it accepts
`std::string_view`. after v10, `fmt::formatter<std::string_view>` still
exists, but it is now defined using `format_as()` machinery, so it's
`format()` method does not actually accept `std::string_view`, it
accepts `fmt::string_view`, as the former can be converted to
`fmt::string_view`.

this is why we can inherit from `fmt::formatter<std::string_view>` and
use `formatter<std::string_view>::format(foo, ctx);` to implement the
`format()` method with {fmt} v9, but we cannot do this with {fmt} v10,
and we would have following compilation failure:

```
FAILED: service/CMakeFiles/service.dir/RelWithDebInfo/topology_state_machine.cc.o
/home/kefu/.local/bin/clang++ -DFMT_DEPRECATED_OSTREAM -DFMT_SHARED -DSCYLLA_BUILD_MODE=release -DSEASTAR_API_LEVEL=7 -DSEASTAR_LOGGER_COMPILE_TIME_FMT -DSEASTAR_LOGGER_TYPE_STDOUT -DSEASTAR_SCHEDULING_GROUPS_COUNT=16 -DSEASTAR_SSTRING -DXXH_PRIVATE_API -DCMAKE_INTDIR=\"RelWithDebInfo\" -I/home/kefu/dev/scylladb -I/home/kefu/dev/scylladb/build/gen -I/home/kefu/dev/scylladb/seastar/include -I/home/kefu/dev/scylladb/build/seastar/gen/include -I/home/kefu/dev/scylladb/build/seastar/gen/src -ffunction-sections -fdata-sections -O3 -g -gz -std=gnu++20 -fvisibility=hidden -Wall -Werror -Wextra -Wno-error=deprecated-declarations -Wimplicit-fallthrough -Wno-c++11-narrowing -Wno-deprecated-copy -Wno-mismatched-tags -Wno-missing-field-initializers -Wno-overloaded-virtual -Wno-unsupported-friend -Wno-enum-constexpr-conversion -Wno-unused-parameter -ffile-prefix-map=/home/kefu/dev/scylladb=. -march=westmere -mllvm -inline-threshold=2500 -fno-slp-vectorize -U_FORTIFY_SOURCE -Werror=unused-result -MD -MT service/CMakeFiles/service.dir/RelWithDebInfo/topology_state_machine.cc.o -MF service/CMakeFiles/service.dir/RelWithDebInfo/topology_state_machine.cc.o.d -o service/CMakeFiles/service.dir/RelWithDebInfo/topology_state_machine.cc.o -c /home/kefu/dev/scylladb/service/topology_state_machine.cc
/home/kefu/dev/scylladb/service/topology_state_machine.cc:254:41: error: no matching member function for call to 'format'
  254 |     return formatter<std::string_view>::format(it->second, ctx);
      |            ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~
/usr/include/fmt/core.h:2759:22: note: candidate function template not viable: no known conversion from 'seastar::basic_sstring<char, unsigned int, 15>' to 'const fmt::basic_string_view<char>' for 1st argument
 2759 |   FMT_CONSTEXPR auto format(const T& val, FormatContext& ctx) const
      |                      ^      ~~~~~~~~~~~~
```

because the inherited `format()` method actually comes from
`fmt::formatter<fmt::string_view>`. to reduce the confusion, in this
change, we just inherit from `fmt::format<string_view>`, where
`string_view` is actually `fmt::string_view`. this follows
the document at
https://fmt.dev/latest/api.html#formatting-user-defined-types,
and since there is less indirection under the hood -- we do not
use the specialization created by `FMT_FORMAT_AS` which inherit
from `formatter<fmt::string_view>`, hopefully this can improve
the compilation speed a little bit. also, this change addresses
the build failure with {fmt} v10.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#18299
2024-04-19 07:44:07 +03:00
Kefu Chai
0609cd676f exceptions: add fmt::formatter for classes derived from cassandra_exception
before this change, we rely on the default-generated fmt::formatter
created from operator<<, but fmt v10 dropped the default-generated
formatter.

in this change, `fmt::formatter<T>` is added for classes derived from
`cassandra_exception`, where `T` is the class type derived from
`cassandra_exception`.

this change is implemented to be backward compatible  with {fmt} < 10.

Refs #13245

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2024-03-21 12:48:19 +08:00
Kefu Chai
16e1700246 exceptions: do not include unused headers
these unused includes were identified by clangd. see
https://clangd.llvm.org/guides/include-cleaner#unused-include-warning
for more details on the "Unused include" warning.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#17152
2024-02-06 13:16:03 +02:00
Kefu Chai
3bca11668a db: add formatter for exceptions::exception_code
before this change, we rely on the default-generated fmt::formatter
created from operator<<, but fmt v10 dropped the default-generated
formatter.

in this change, we define formatters for `exceptions::exception_code`,
and drop its operator<<.

Refs #13245

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#17151
2024-02-06 13:15:08 +02:00
Yaniv Kaul
ae2ab6000a Typos: fix typos in code
Fixes some more typos as found by codespell run on the code.
In this commit, there are more user-visible errors.

Refs: https://github.com/scylladb/scylladb/issues/16255
2023-12-05 15:18:11 +02:00
Jan Ciolek
55fb91bf10 exceptions: remove relation field from unrecognized_entity_exception
The exception unrecognized_entity_exception used to have two fields:
* entity - the name that wasn't recognized
* relation_str - part of the WHERE clause that contained this entity

In 4e0a089f3e the places that throw
this exception were modified, the thrower started passing unrecognized
column name to both fields - entity and relation_str. It was easier to
do things this way, accessing the whole WHERE clause can be problematic.

The problem is that this caused error messages to get weird, e.g:
"Undefined name x in where clause ('x')".
x is not the WHERE clause, it's the unrecognized name.

Let's remove the `relation_str` field as it isn't used anymore,
it only causes confusion. After this change the message would be:
"Unrecognized name x"
Which makes much more sense.

Refs #10632

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>

Closes #13944
2023-05-24 19:35:26 +03:00
Piotr Dulikowski
e69b44a60f exception: fix the error code used for rate_limit_exception
Per-partition rate limiting added a new error type which should be
returned when Scylla decides to reject an operation due to per-partition
rate limit being exceeded. The new error code requires drivers to
negotiate support for it, otherwise Scylla will report the error as
`Config_error`. The existing error code override logic works properly,
however due to a mistake Scylla will report the `Config_error` code even
if the driver correctly negotiated support for it.

This commit fixes the problem by specifying the correct error code in
`rate_limit_exception`'s constructor.

Tested manually with a modified version of the Rust driver which
negotiates support for the new error. Additionally, tested what happens
when the driver doesn't negotiate support (Scylla properly falls back to
`Config_error`).

Branches: 5.1
Fixes: #11517

Closes #11518
2022-09-13 11:46:15 +02:00
Pavel Emelyanov
f3841c1b45 exceptions: Define operator<< for exception_code
Otherwise cql_transport::additional_options_for_proto_ext() complains
about inability to format the enum class value

Introduced by efc3953c (transport: add rate_limit_error)
Fmt version 8.1.1-5.fc35, fresher one must have it out of the box

Fixes #10884

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Message-Id: <20220627052703.32024-1-xemul@scylladb.com>
2022-06-27 14:49:58 +03:00
Piotr Dulikowski
621b7f35e2 replica: add rate_limit_exception and a simple serialization framework
Introduces `replica::rate_limit_exception` - an exceptions that is
supposed to be thrown/returned on the replica side when the request is
rejected due to the exceeding the per-partition rate limit.

Additionally, introduces the `exception_variant` type which allows to
transport the new exception over RPC while preserving the type
information. This will be useful in later commits, as the coordinator
will have to know whether a replica has failed due to rate limit being
exceeded or another kind of error.

The `exception_variant` currently can only either hold "other exception"
(std::monostate) or the aforementioned `rate_limit_exception`, but can
be extended in a backwards-compatible way in the future to be able to
hold more exceptions that need to be handled in a different way.
2022-06-22 20:07:58 +02:00
Piotr Dulikowski
efc3953c0a transport: add rate_limit_error
Adds a CQL protocol extension which introduces the rate_limit_error. The
new error code will be used to indicate that the operation failed due to
it exceeding the allowed per-partition rate limit.

The error code is supposed to be returned only if the corresponding CQL
extension is enabled by the client - if it's not enabled, then
Config_error will be returned in its stead.
2022-06-22 20:07:58 +02:00
cvybhu
d85f680df3 cql3: Remove relation class
Functionality of the relation class has been replaced by
expr::to_restriction.

Relation and all classes deriving from it can now be removed.

Signed-off-by: cvybhu <jan.ciolek@scylladb.com>
2022-05-16 18:17:58 +02:00
Avi Kivity
5937b1fa23 treewide: remove empty comments in top-of-files
After fcb8d040 ("treewide: use Software Package Data Exchange
(SPDX) license identifiers"), many dual-licensed files were
left with empty comments on top. Remove them to avoid visual
noise.

Closes #10562
2022-05-13 07:11:58 +02:00
Nadav Har'El
fa7a302130 cross-tree: split coordinator_result from exceptions.hh
Recently, coordinator_result was introduced as an alternative for
exceptions. It was placed in the main "exceptions/exceptions.hh" header,
which virtually every single source file in Scylla includes.
But unfortunately, it brings in some heavy header files and templates,
leading to a lot of wasted build time - ClangBuildAnalyzer measured that
we include exceptions.hh in 323 source files, taking almost two seconds
each on average.

In this patch, we split the coordinator_result feature into a separate
header file, "exceptions/coordinator_result", and only the few places
which need it include the header file. Unfortunately, some of these
few places are themselves header, so the new header file ends up being
included in 100 source files - but 100 is still much less than 323 and
perhaps we can reduce this number 100 later.

After this patch, the total Scylla object-file size is reduced by 6.5%
(the object size is a proxy for build time, which I didn't directly
measure). ClangBuildAnalyzer reports that now each of the 323 includes
of exceptions.hh only takes 80ms, coordinator_result.hh is only included
100 times, and virtually all the cost to include it comes from Boost's
result.hh (400ms per inclusion).

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20220228204323.1427012-1-nyh@scylladb.com>
2022-03-02 10:12:57 +02:00
Nadav Har'El
f84094320d exceptions: de-inline exception constructors
The header file "exceptions/exceptions.hh" and the exception types in it
is used by virtually every source file in Scylla, so excessive includes
and templated code generation in this header could slow down the build
considerably.

Before this patch, all of the exceptions' constructors were inline in
exceptions.hh, so source file using one of these exceptions will need
to recompile the code, which is fairly heavy, using the fmt templates
for various types. According to ClangBuildAnalyzer, 323 source files
needed to materialize prepare_message<db::consistency_level,int&,int&>,
taking 0.3 seconds each.

So this patch moves the exception constructors from the header file
exceptions.hh to the source file exceptions.cc. The header file no longer
uses fmt.

Unfortunately, the actual build-time savings from this patch is tiny -
around 0.1%... It turns out that most of the prepare_message<>
compilation time comes from fmt compilation time, and since virtually
all source files use fmt for other header reasons (intentionally or
through other headers), no compilation time can be saved. Nevertheless,
I hope that as we proceed with more cleanups like this and eliminate
more unnecessary code-generation-in-headers, we'll start seeing build
time drop.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2022-02-28 14:47:41 +02:00
Piotr Dulikowski
4c58683102 exceptions: make it possible to return read_{timeout,failure}_exception as value
Adds read_timeout_exception and read_failure exception to the list of
exceptions supported by the coordinator_exception_container.

Those exceptions are not yet returned-as-value anywhere, but they will
be in the commits that follow.
2022-02-22 16:08:52 +01:00
Piotr Dulikowski
9304791ce5 exceptions: add coordinator_exception_container and coordinator_result
Adds coordinator_exception_container which is a typedef over
exception_container and is meant to hold exceptions returned from the
coordinator code path. Currently, it can only hold mutation write
timeout exceptions, because only that kind of error will be returned by
value as a result of this PR. In the future, more exception types can be
added.

Adds coordinator_result which is a boost::outcome::result that uses
coordinator_exception_container as the error type.
2022-02-08 11:08:42 +01:00
Avi Kivity
fcb8d040e8 treewide: use Software Package Data Exchange (SPDX) license identifiers
Instead of lengthy blurbs, switch to single-line, machine-readable
standardized (https://spdx.dev) license identifiers. The Linux kernel
switched long ago, so there is strong precedent.

Three cases are handled: AGPL-only, Apache-only, and dual licensed.
For the latter case, I chose (AGPL-3.0-or-later and Apache-2.0),
reasoning that our changes are extensive enough to apply our license.

The changes we applied mechanically with a script, except to
licenses/README.md.

Closes #9937
2022-01-18 12:15:18 +01:00
Pavel Emelyanov
1ea301ad07 storage_proxy: Propagate virtual table exceptions messages
The intention is to return some meaningful info to the CQL caller
if a virtual table update fails. Unfortunately the "generic" error
reporting in CQL is not extremely flexible, so the best option
seems to report regular write failre with custom message in it.

For now this only works for virtual table errors.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-11-11 16:39:34 +03:00
Avi Kivity
4433178ccb utils: exceptions: convert sprint() to format()
sprint() is type-unsafe, and we are converging on format(). Convert
exceptions.hh to format().

Closes #9006
2021-07-12 11:17:57 +03:00
Avi Kivity
a55b434a2b treewide: extent copyright statements to present day 2021-06-06 19:18:49 +03:00
Nadav Har'El
702b1b97bf cql: fix error return from execution of fromJson() and other functions
As reproduced in cql-pytest/test_json.py and reported in issue #7911,
failing fromJson() calls should return a FUNCTION_FAILURE error, but
currently produce a generic SERVER_ERROR, which can lead the client
to think the server experienced some unknown internal error and the
query can be retried on another server.

This patch adds a new cassandra_exception subclass that we were missing -
function_execution_exception - properly formats this error message (as
described in the CQL protocol documentation), and uses this exception
in two cases:

1. Parse errors in fromJson()'s parameters are converted into a
   function_execution_exception.

2. Any exceptions during the execute() of a native_scalar_function_for
   function is converted into a function_execution_exception.
   In particular, fromJson() uses a native_scalar_function_for.

   Note, however, that functions which already took care to produce
   a specific Cassandra error, this error is passed through and not
   converted to a function_execution_exception. An example is
   the blobAsText() which can return an invalid_request error, so
   it is left as such and not converted. This also happens in Cassandra.

All relevant tests in cql-pytest/test_json.py now pass, and are
no longer marked xfail. This patch also includes a few more improvements
to test_json.py.

Fixes #7911

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20210118140114.4149997-1-nyh@scylladb.com>
2021-01-21 15:21:13 +01:00
Piotr Wojtczak
3560acd311 cql_metrics: Add metrics for CQL errors
This change adds tracking of all the CQL errors that can be
raised in response to a CQL message from a client, as described
in the CQL v4 protocol and with Scylla's CDC_WRITE_FAILUREs
included.

Fixes #5859

Closes #7604
2020-11-30 12:18:37 +02:00
Piotr Sarna
58ae0c5208 exceptions: make a single-param constructor explicit
... since it's good practice.
2020-09-28 09:16:31 +02:00
Piotr Sarna
b0737542f2 exceptions: add a constructor based on custom message
OverloadedException was historically only used when the number
of in-flight hints got too high. The other constructor will be useful
for using OverloadedException in other scenarios.
2020-09-28 09:16:31 +02:00
Pavel Solodovnikov
1d3f9174c5 cql3: avoid using shared_ptr's in unrecognized_entity_exception
Using shared_ptr's in `unrecognized_entity_exception` can lead
to cross-cpu deletion of a pointer which will trigger an assert
`_cpu == std::this_thread::get_id()' when shared_ptr is disposed.

Copy `column_identifier` to the exception object and avoid using
an instance of `cql3::relation`: just get a string representation
from it since nothing more is used in associated exception
handling code.

Fixes: #6287
Tests: unit(dev, debug), dtest(lwt_destructive_ddl_test.py:LwtDestructiveDDLTest.test_rename_column)

Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>
Message-Id: <20200506155714.150497-1-pa.solodovnikov@scylladb.com>
2020-05-06 19:02:36 +03:00
Benny Halevy
3b31acfa80 exceptions: drop OVERFLOW_ERROR cql binary protocol extension
Client drivers act differently on errors codes they don't recognize.
Adding new errors codes is considered a protocol extension and
should be negotiated with the client.

This change keeps `overflow_error_exception` internally but uses
the INVALID cql error code to return the error message back to the client
similar to keyspace_not_defined_exception.

We (and cassandra) already use `invalid_request_exception` extensively
to return various errors related to invalid values or types in the query.

Fixes #6264

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Reviewed-by: Gleb Natapov <gleb@scylladb.com>
Message-Id: <20200422130011.108003-1-bhalevy@scylladb.com>
2020-04-28 12:16:00 +03:00
Juliusz Stasiewicz
d37b3f34f1 cdc: fix the "NoHostAvailable" client error when CL is not met
This commit resolves the client-observable effect of CDC read
consistencies. I wrapped the preimage's SELECT query in try-catch to
intercept the `unavailable_exception`, which led to misleading
`NoHostAvailable` in Python and Java drivers. Now client gets a new
error code and a message specific to the issue of CL not being met
by the preimage query.

Fixes #5746
2020-04-27 13:56:57 +02:00
Rafael Ávila de Espíndola
caef2ef903 everywhere: Don't assume sstring::begin() and sstring::end() are pointers
If we switch to using std::string we have to handle begin and end
returning iterators.

Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
2020-03-10 13:13:48 -07:00
Calle Wilund
5e46079e89 exceptions: Set correct error code in truncate_exception
Refs #4924

truncate_exception should, like its origin counterpart, set
error code to TRUNCATE_ERROR, not PROTOCOL_ERROR.

tests: unit + partial dtest
Message-Id: <20200212100920.14478-1-calle@scylladb.com>
2020-02-12 11:17:16 +01:00
Benny Halevy
72e2ea47c1 cql3: time_uuid_fcts: validate time UUID
Throw an error in case we hit an invalid time UUID
rather than hitting an assert.

Ref #5552

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2020-01-27 11:09:01 +02:00
Benny Halevy
e97a111f64 cql3: functions: detect and handle int overflow in sum
Detect integer overflow in cql sum functions and throw an error.
Note that Cassandra quietly truncates the sum if it doesn't fit
in the input type but we rather break compatibility in this
case. See https://issues.apache.org/jira/browse/CASSANDRA-4914?focusedCommentId=14158400&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-14158400

Fixes #5536

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2020-01-08 09:48:33 +02:00
Benny Halevy
98260254df exceptions: sort exception_code definitions
Be compatible with Cassandra source.
It's easier to maintain this way.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2020-01-08 09:48:21 +02:00
Benny Halevy
30d0f1df75 exceptions: define additional cassandra CQL exceptions codes
As of e9da85723a

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2020-01-08 09:40:57 +02:00