326 Commits

Author SHA1 Message Date
Nadav Har'El
c87d6407ed cql3: Replace SCYLLA_ASSERT by throwing_assert
In this patch we replace every single use of SCYLLA_ASSERT() in the cql3/
directory by throwing_assert().

The problem with SCYLLA_ASSERT() is that when it fails, it crashes Scylla.
This is almost always a bad idea (see #7871 discussing why), but it's even
riskier in front-end code like cql3/: In front-end code, there is a risk
that due to a bug in our code, a specific user request can cause Scylla
to crash. A malicious user can send this query to all nodes and crash
the entire cluster. When the user is not malicious, it causes a small
problem (a failing request) to become a much worse crash - and worse,
the user has no idea which request is causing this crash and the crash
will repeat if the same request is tried again.

All of this is solved by using the new throwing_assert(), which is the
same as SCYLLA_ASSERT() but throws an exception (using on_internal_error())
instead of crashing. The exception will prevent the code path with the
invalid assumption from continuing, but will result in only the current
user request being aborted, with a clear error message reporting the
internal server error due to an assertion failure.

I reviewed all the changes that I did in this patch to check that (to the
best of my understanding) none of the assertions in cql3/ involve the
sort of serious corruption that might require crashing the Scylla node
entirely.

throwing_assert() also improves logging of assertion failures compared
to the original SCYLLA_ASSERT() - SCYLLA_ASSERT() printed a message to
stderr which in many installations is lost, whereas throwing_assert()
uses Scylla's standard logger, and also includes a backtrace in the
log message.

Fixes #13970 (Exorcise assertions from CQL code paths)
Refs #7871 (Exorcise assertions from Scylla)

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2026-03-11 09:41:20 +02:00
Szymon Malewski
d817e56e87 vector_similarity_fcts.cc: fix strict aliasing violation in extract_float_vector
Previous code performed endian conversion by bulk-copying raw bytes
into a std::vector<float> and then iterating over it via a
reinterpret_cast<uint32_t*> pointer. Accessing float storage through a
uint32_t* violates C++ strict aliasing rules, giving the compiler
freedom to reorder or elide the stores, causing undefined behavior.

Replace the two-pass approach with a single-pass loop using
seastar::consume_be<uint32_t>() and std::bit_cast<float>(), which is
both well-defined and auto-vectorizable.

Follow-up #28754

Closes scylladb/scylladb#28912
2026-03-06 09:15:45 +01:00
Szymon Malewski
212bd6ae1a vector: Vectorize loops in similarity functions
The main loops iterating over vector components were not vectorized due to:
- "cannot prove it is safe to reorder floating-point operations"
- "Cannot vectorize early exit loop with more than one early exit"
The first issue is fixed with adding `#pragma clang fp contract(fast) reassociate(on)`, which allows compiler to optimize floating point operations.
The second issue is solved by refactoring the operations in the affected loop.
Additionally using float operations instead of double increases throughput and numerical accuracy is not the main consideration in vector search scenarios.

Performance measured:
- scylla built using dbuild
- using https://github.com/zilliztech/VectorDBBench (modified to call `SELECT id, similarity_cosine({vector<float, 1536>}, {vector<float, 1536>}) ...` without ANN search):
- client concurrency 20
before: ~2250 QPS
`float` operations: ~2350 QPS
`compute_cosine_similarity` vectorization: ~2500QPS
`extract_float_vector` vectorization: ~3000QPS

Follow-up https://github.com/scylladb/scylladb/pull/28615
Ref https://scylladb.atlassian.net/browse/SCYLLADB-764

Closes scylladb/scylladb#28754
2026-03-04 15:14:53 +01:00
Yaniv Michael Kaul
ead9961783 cql: vector: fix vector dimension type
Switch vector dimension handling to fixed-width `uint32_t` type,
update parsing/validation, and add boundary tests.

The dimension is parsed as `unsigned long` at first which is guaranteed
to be **at least** 32-bit long, which is safe to downcast to `uint32_t`.

Move `MAX_VECTOR_DIMENSION` from `cql3_type::raw_vector` to `cql3_type`
to ensure public visibility for checks outside the class.

Add tests to verify the type boundaries.

Fixes: https://scylladb.atlassian.net/browse/SCYLLADB-223

Signed-off-by: Yaniv Kaul <yaniv.kaul@scylladb.com>
Co-authored-by: Dawid Pawlik <dawid.pawlik@scylladb.com>

Closes scylladb/scylladb#28762
2026-02-26 14:46:53 +02:00
Marcin Maliszkiewicz
c5dc086baf Merge 'vector_search: return NaN for similarity_cosine with all-zero vectors' from Dawid Pawlik
The ANN vector queries with all-zero vectors are allowed even on vector indexes with similarity function set to cosine.
When enabling the rescoring option, those queries would fail as the rescoring calls `similarity_cosine` function underneath, causing an `InvalidRequest` exception as all-zero vectors were not allowed matching Cassandra's behaviour.

To eliminate the discrepancy we want the all-zero vector `similarity_cosine` calls to pass, but return the NaN as the cosine similarity for zero vectors is mathematically incorrect. We decided not to use arbitrary values contrary to USearch, for which the distance (not to be confused with similarity) is defined as cos(0, 0) = 0, cos(0, x) = 1 while supporting the range of values [0, 2].
If we wanted to convert that to similarity, that would mean sim_cos(0, x) = 0.5, which does not support mathematical reasoning why that would be more similar than for example vectors marking obtuse angles.
It's safe to assume that all-zero vectors for cosine similarity shouldn't make any impact, therefore we return NaN and eliminate them from best results.

Adjusted the tests accordingly to check both proper Cassandra and Scylla's behaviour.

Fixes: SCYLLADB-456

Backport to 2026.1 needed, as it fixes the bug for ANN vector queries using rescoring introduced there.

Closes scylladb/scylladb#28609

* github.com:scylladb/scylladb:
  test/vector_search: add reproducer for rescoring with zero vectors
  vector_search: return NaN for similarity_cosine with all-zero vectors
2026-02-23 13:10:44 +01:00
Szymon Malewski
668d6fe019 vector: Improve similarity functions performance
Improves performance of deserialization of vector data for calculating similarity functions.
Instead of deserializing vector data into a std::vector<data_value>, we deserialize directly into a std::vector<float>
and then pass it to similarity functions as a std::span<const float>.
This avoids overhead of data_value allocations and conversions.
Example QPS of `SELECT id, similarity_cosine({vector<float, 1536>}, {vector<float, 1536>}) ...`:
client concurrency 1: before: ~135 QPS, after: ~1005 QPS
client concurrency 20: before: ~280 QPS, after: ~2097 QPS
Measured using https://github.com/zilliztech/VectorDBBench (modified to call above query without ANN search)

Fixes https://scylladb.atlassian.net/browse/SCYLLADB-471

Closes scylladb/scylladb#28615
2026-02-18 00:33:34 +02:00
Dawid Pawlik
af0889d194 vector_search: return NaN for similarity_cosine with all-zero vectors
The ANN vector queries with all-zero vectors are allowed even on vector
indexes with similarity function set to cosine.
When enabling the rescoring option, those queries would fail as the rescoring
calls `similarity_cosine` function underneath, causing an `InvalidRequest` exception
as all-zero vectors were not allowed matching Cassandra's behaviour.

To eliminate the discrepancy we want the all-zero vector `similarity_cosine` calls to pass,
but return the NaN as the cosine similarity for zero vectors is mathematically incorrect.
We decided not to use arbitrary values contrary to USearch, for which the distance
(not to be confused with similarity) is defined as cos(0, 0) = 0, cos(0, x) = 1 while
supporting the range of values [0, 2].
If we wanted to convert that to similarity, that would mean sim_cos(0, x) = 0.5,
which does not support mathematical reasoning why that would be more similar than
for example vectors marking obtuse angles.
It's safe to assume that all-zero vectors for cosine similarity shouldn't make any impact,
therefore we return NaN and eliminate them from best results.

Adjusted the tests accordingly to check both proper Cassandra and Scylla's behaviour.

Fixes: SCYLLADB-456
2026-02-11 12:31:47 +01:00
Dawid Pawlik
5b2b8d596a vector_similarity_fcts: introduce similarity functions
This patch introduces scalar functions `similarity_cosine()`,
`similarity_euclidean()`, and `similarity_dot_product()`
which should return a float - similarity of the given vectors
calculated according to the function's similarity metric.

The argument types of this function are retrieved with
the `retrieve_vector_arg_types`, but shall be assignable to
`vector<float, N>` where `N` is the same for both arguments.

This patch introduces a dimensionality check during the execusion
of those functions.
2026-01-02 12:48:43 +01:00
Dawid Pawlik
b72df3ae27 vector_similarity_fcts: retrieve similarity function argument types
This patch retrieves the argument types for similarity functions.
Newly introduced `retrieve_vector_arg_types` function checks if
the provided arguments are vectors of floats and if
both the vector values match the same type (dimension).
If so, we know the exact type and set it as the function arguments type.
Otherwise, if the exact type is unkown, but we can assign to vector<float, N>
then the dimensionality check will be done during execution of
the similarity function.
This also takes care of null values and bind variables the same way
as implemented in Cassandra to stay compatible.
Meaning that if we can infer the type from one argument, then the latter
may be unknown (null or ?).

Additionally this patch adds `test_assignment_any_vector` function
which tests the weak assignment to vector<float, N> as mentioned
above.
2026-01-02 12:48:43 +01:00
Dawid Pawlik
2bedefbb85 vector_similarity_fcts: add calculating similarity between vectors
This commit introduces `compute_cosine_similarity`, `compute_euclidean_similarity`,
`compute_dot_product_similarity` functions to calculate the vectors similarity
in respective metric.
The similarity is a float value meaning how similar the vectors are in a range of [0, 1].
Values closer to 1 indicate greater similarity.

The `dot_product` similarity requires L2 normalized vectors as arguments.
The similarity is calculated based on the jVector's implementation used by Cassandra.
f967f1c924/jvector-base/src/main/java/io/github/jbellis/jvector/vector/VectorSimilarityFunction.java (L36-L69)
2026-01-02 12:48:08 +01:00
Michał Hudobski
5c957e83cb vector_search: remove dependence on cql3
This patch removes the dependence of vector search module
on the cql3 module by moving the contents of cql3/type_json.hh
to types/json_utils.hh and removing the usage of cql3 primary_key
object in vector_store_client. We also make the needed adjustments
to files that were previously using the afformentioned type_json.hh
file.

This fixes the circular dependency cql3 <-> vector_search.

Closes scylladb/scylladb#26482
2025-10-21 17:41:55 +03:00
Marcin Maliszkiewicz
2e69016c4f db: access types during schema merge via special storage
Once we create types atomically the code which is before commit
may depend on newly added types, so it has to access both old and
new types. New storage called in_progress_types_storage was added.
2025-07-10 10:40:42 +02:00
Marcin Maliszkiewicz
32b2786728 db: store functions and aggregates change batch in schema_applier
To be used in following commit.
2025-07-10 10:40:42 +02:00
Dawid Mędrek
ac9062644f cql3: Represent create_statement using managed_string
When describing a table, we need to do it carefully: if some
columns were dropped, we must specify that explicitly by

```
ALTER TABLE {table} DROP {column} USING TIMESTAMP ...
```

in the result of the DESCRIBE statement. Failing to do so
could lead to data resurrection.

However, if a table has been altered many, many times,
we might end up with a huge create statement. Constructing
it could, in turn, trigger an oversized allocation.
Some tests ran into that very problem in fact.

In this commit, we want to mitigate the problem: instead of
allocating a contiguous chunk of memory for the create
statement, we use `fragmented_ostringstream` and `managed_string`
to possibly keep data scattered in memory. It makes handling
`cql3::description` less convenient in the code, but since
the struct is pretty much immediately serialized after
creating it, it's a very good trade-off.

We provide a reproducer. It consistently passes with this commit,
while having about 50% chance of failure before it (based on my
own experiments). Playing with the parameters of the test
doesn't seem to improve that chance, so let's keep it as-is.

Fixes scylladb/scylladb#24018
2025-07-01 12:58:02 +02:00
Avi Kivity
cd79a8fc25 Revert "Merge 'Atomic in-memory schema changes application' from Marcin Maliszkiewicz"
This reverts commit 0b516da95b, reversing
changes made to 30199552ac. It breaks
cluster.random_failures.test_random_failures.test_random_failures
in debug mode (at least).

Fixes #24513
2025-06-16 22:38:12 +03:00
Marcin Maliszkiewicz
b3730282c3 db: access types during schema merge via special storage
Once we create types atomically the code which is before commit
may depend on newly added types, so it has to access both old and
new types. New storage called in_progress_types_storage was added.
2025-06-06 08:50:33 +02:00
Marcin Maliszkiewicz
52069d954f db: store functions and aggregates change batch in schema_applier
To be used in following commit.
2025-05-27 20:00:58 +02:00
Avi Kivity
81821d26cd cql3: functions: add set_intersection()
Given two sets of equivalent types, return the set
intersection.

This is a generic function which adapts to the actual
input type.

A unit test is added.

Closes scylladb/scylladb#22763
2025-02-16 14:06:29 +02:00
Avi Kivity
f3eade2f62 treewide: relicense to ScyllaDB-Source-Available-1.0
Drop the AGPL license in favor of a source-available license.
See the blog post [1] for details.

[1] https://www.scylladb.com/2024/12/18/why-were-moving-to-a-source-available-license/
2024-12-18 17:45:13 +02:00
Kefu Chai
ce2f80c227 treewide: migrate from boost::make_iterator_range to ranges::subrange
Replace boost::make_iterator_range() with std::ranges::subrange.

This change improves code modernization and reduces external dependencies:

- Replace boost::make_iterator_range() with std::ranges::subrange
- Remove boost/range/iterator_range.hpp include
- Improve iterator type detection in interval.hh using std::ranges::const_iterator_t<Range>

This is part of ongoing efforts to modernize our codebase and minimize
external dependencies.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#21787
2024-12-09 21:31:53 +02:00
Kefu Chai
a5ee0c896b treewide: migrate from boost::adaptors::filtered to std::views::filter
Modernize the codebase by replacing Boost range adaptors with C++23 standard library views,
reducing external dependencies and leveraging modern C++ language features.

Key Changes:
- Replace `boost::adaptors::filtered` with `std::views::filter`
- Remove `#include <boost/range/adaptor/filtered.hpp>`
- Utilize standard library range views

Motivation:
- Reduce project's external dependency footprint
- Leverage standard library's range and view capabilities
- Improve long-term code maintainability
- Align with modern C++ best practices

Implementation Challenges and Considerations:
1. Range Conversion and Move Semantics
   - `std::ranges::to` adaptor requires rvalue references
   - Necessitated updates to variable and parameter constness
   - Example: `cql3/restrictions/statement_restrictions.cc` modified to remove `const`
     from `common` to enable efficient range conversion

2. Range Iteration and Mutation
   - Range views may mutate internal state during iteration
   - Cannot pass ranges by const reference in some scenarios
   - Solution: Pass ranges by rvalue reference to explicitly indicate
     state invalidation

Limitations:
- One instance of `boost::adaptors::filtered` temporarily preserved
  due to lack of a C++23 alternative for `boost::join()`
- A comprehensive replacement will be addressed in a follow-up change

This change is part of our ongoing effort to modernize the codebase,
reducing external dependencies and adopting modern C++ practices.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#21648
2024-11-26 14:26:50 +02:00
Kefu Chai
00810e6a01 treewide: include seastar/core/format.hh instead of seastar/core/print.hh
The later includes the former and in addition to `seastar::format()`,
`print.hh` also provides helpers like `seastar::fprint()` and
`seastar::print()`, which are deprecated and not used by scylladb.

Previously, we include `seastar/core/print.hh` for using
`seastar::format()`. and in seastar 5b04939e, we extracted
`seastar::format()` into `seastar/core/format.hh`. this allows us
to include a much smaller header.

In this change, we just include `seastar/core/format.hh` in place of
`seastar/core/print.hh`.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#21574
2024-11-14 17:45:07 +02:00
Kefu Chai
59eb2ab119 treewide: s/boost::algorithm::any_of/std::ranges::any_of/
now that we are allowed to use C++23. we now have the luxury of using
`std::ranges::any_of`.

in this change, we replace `boost::algorithm::any_of` with
`std::ranges::any_of`

to reduce the dependency to boost for better maintainability, and
leverage standard library features for better long-term support.

this change is part of our ongoing effort to modernize our codebase
and reduce external dependencies where possible.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2024-11-05 14:06:09 +08:00
Kefu Chai
24d14b601b treewide: s/boost::adaptors::map_values/std::views::values/
now that we are allowed to use C++23. we now have the luxury of using
`std::views::values`.

in this change, we:

- replace `boost::adaptors::map_values` with `std::views::values`
- update affected code to work with `std::views::values`
- the places where we use `boost::join()` are not changed, because
  we cannot use `std::views::concat` yet. this helper is only
  available in C++26.

to reduce the dependency to boost for better maintainability, and
leverage standard library features for better long-term support.

this change is part of our ongoing effort to modernize our codebase
and reduce external dependencies where possible.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#21265
2024-10-27 21:32:45 +02:00
Kefu Chai
6ead5a4696 treewide: move log.hh into utils/log.hh
the log.hh under the root of the tree was created keep the backward
compatibility when seastar was extracted into a separate library.
so log.hh should belong to `utils` directory, as it is based solely
on seastar, and can be used all subsystems.

in this change, we move log.hh into utils/log.hh to that it is more
modularized. and this also improves the readability, when one see
`#include "utils/log.hh"`, it is obvious that this source file
needs the logging system, instead of its own log facility -- please
note, we do have two other `log.hh` in the tree.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2024-10-22 06:54:46 +03:00
Avi Kivity
c3be2489ce treewide: drop includes of <boost/range/adaptors.hpp>
This includes way too much, including <boost/regex.hpp>, which is huge.
Drop includes of adaptors.hpp and replace by what is needed.

Closes scylladb/scylladb#21187
2024-10-20 17:17:11 +03:00
Dawid Mędrek
8582ed513b cql3/functions/user_function: Print arguments and return type without frozen
Scylla doesn't allow for the types of arguments or the return type
to be frozen. As a result, before these changes, create statements
produced to restore UDFs as part of `DESCRIBE` statements could not
be executed.

We fix that and add a reproducer test and another one to verify that
the implementation is correct.
2024-10-07 20:53:10 +02:00
Dawid Mędrek
1f1b201fd8 cql3/functions/user_function: Use fmt to format create statement
We replace `std::ostringstream` with views and formatting
using fmt to improve readability of the code.
2024-10-02 19:17:35 +02:00
Avi Kivity
d16ea0afd6 Merge 'cql3: Extend DESC SCHEMA by auth and service levels' from Dawid Mędrek
Auth has been managed via Raft since Scylla 6.0. Restoring data
following the usual procedure (1) is error-prone and so a safer
method must have been designed and implemented. That's what
happens in this PR.

We want to extend `DESC SCHEMA` by auth and service levels
to provide a safe way to backup and restore those two components.
To realize that, we change the meaning of `DESC SCHEMA WITH INTERNALS`
and add a new "tier": `DESC SCHEMA WITH INTERNALS AND PASSWORDS`.

* `DESC SCHEMA` -- no change, i.e. the statement describes the current
  schema items such as keyspaces, tables, views, UDTs, etc.
* `DESC SCHEMA WITH INTERNALS` -- does the same as the previous tier
  and also describes auth and service levels. No information about
  passwords is returned.
* `DESC SCHEMA WITH INTERNALS AND PASSWORDS` -- does the same
  as the previous tier and also includes information about the salted
  hashes corresponding to the passwords of roles.

To restore existing roles, we extend the `CREATE ROLE` statement
by allowing to use the option `WITH SALTED HASH = '[...]'`.

---

Implementation strategy:

* Add missing things/adjust existing ones that will be used later.
* Implement creating a role with salted hash.
* Add tests for creating a role with salted hash.
* Prepare for implementing describe functionality of auth and service levels.
* Implement describe functionality for elements of auth and service levels.
* Extend the grammar.
* Add tests for describe auth and service levels.
* Add/update documentation.

---

(1): https://opensource.docs.scylladb.com/stable/operating-scylla/procedures/backup-restore/restore.html
In case the link stops working, restoring a schema was realised
by managing raw files on disk.

Fixes scylladb/scylladb#18750
Fixes scylladb/scylladb#18751
Fixes scylladb/scylladb#20711

Closes scylladb/scylladb#20168

* github.com:scylladb/scylladb:
  docs: Update user documentation for backup and restore
  docs/dev: Add documentation for DESC SCHEMA
  test: Add tests for describing auth and service levels
  cql3/functions/user_function: Remove newline character before and after UDF body
  cql3: Implement DESCRIBE SCHEMA WITH INTERNALS AND PASSWORDS
  auth: Implement describing auth
  auth/authenticator: Add member functions for querying password hash
  service/qos/service_level_controller: Describe service levels
  data_dictionary: Remove keyspace_element.hh
  treewide: Start using new overloads of describe
  treewide: Fix indentation in describe functions
  treewide: Return create statement optionally in describe functions
  treewide: Add new describe overloads to implementations of data_dictionary::keyspace_element
  treewide: Start using schema::ks_name() instead of schema::keyspace_name()
  cql3: Refactor `description`
  cql3: Move description to dedicated files
  test: Add tests for `CREATE ROLE WITH SALTED HASH`
  cql3/statements: Restrict CREATE ROLE WITH SALTED HASH
  auth: Allow for creating roles with SALTED HASH
  types: Introduce a function `cql3_type_name_without_frozen()`
  cql3/util: Accept std::string_view rather than const sstring&
2024-09-24 21:44:32 +03:00
Dawid Mędrek
10d13f541b cql3/functions/user_function: Remove newline character before and after UDF body
We remove newline characters that are printed before and after
a UDF's body. This way, we want to keep the create statement
as close to what was actually provided as possible. Although
there should be no semantic differences with or without the
newline characters, it's a lot more convenient in testing when
they're not present.

Fixes scylladb/scylladb#20711
2024-09-24 14:18:01 +02:00
Kefu Chai
2014d1c0cb cql3: drop workaround for castas_fctn_simple()
now that e13a584ab7
has been merged, and our toolchain is based on the fedora 40 on 20240710, which should include this
change.

so let's drop the workaround from 51d09e6a

Refs #18508
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#20750
2024-09-22 19:59:10 +03:00
Dawid Mędrek
b357307406 data_dictionary: Remove keyspace_element.hh
The interface is not used anywhere anymore, so we can
remove it safely. It has been replaced by custom
functions for each keyspace element and `cql3::description`.
2024-09-20 14:24:54 +02:00
Dawid Mędrek
df94e92b06 treewide: Fix indentation in describe functions
After modifying new functions for generating `cql3::description`,
we fix indentation in them in this commit.
2024-09-20 14:24:54 +02:00
Dawid Mędrek
86722e4cea treewide: Return create statement optionally in describe functions
We add a new parameter in functions used to generate instances
of `cql3::description` for types related to situations where we
might not need a create statement. An example of such a scenario
could be `DESCRIBE TYPES`.
2024-09-20 14:24:54 +02:00
Dawid Mędrek
0702e93e32 treewide: Add new describe overloads to implementations of data_dictionary::keyspace_element
We're removing `data_dictionary::keyspace_element`.
Before we can do that, we need to substitute the existing
methods used for describing keyspace elements with their
new versions returning `cql3::description`.
That's what happens in this commit.
2024-09-20 14:24:53 +02:00
Kefu Chai
3e84d43f93 treewide: use seastar::format() or fmt::format() explicitly
before this change, we rely on `using namespace seastar` to use
`seastar::format()` without qualifying the `format()` with its
namespace. this works fine until we changed the parameter type
of format string `seastar::format()` from `const char*` to
`fmt::format_string<...>`. this change practically invited
`seastar::format()` to the club of `std::format()` and `fmt::format()`,
where all members accept a templated parameter as its `fmt`
parameter. and `seastar::format()` is not the best candidate anymore.
despite that argument-dependent lookup (ADT for short) favors the
function which is in the same namespace as its parameter, but
`using namespace` makes `seastar::format()` more competitive,
so both `std::format()` and `seastar::format()` are considered
as the condidates.

that is what is happening scylladb in quite a few caller sites of
`format()`, hence ADT is not able to tell which function the winner
in the name lookup:

```
/__w/scylladb/scylladb/mutation/mutation_fragment_stream_validator.cc:265:12: error: call to 'format' is ambiguous
  265 |     return format("{} ({}.{} {})", _name_view, s.ks_name(), s.cf_name(), s.id());
      |            ^~~~~~
/usr/bin/../lib/gcc/x86_64-redhat-linux/14/../../../../include/c++/14/format:4290:5: note: candidate function [with _Args = <const std::basic_string_view<char> &, const seastar::basic_sstring<char, unsigned int, 15> &, const seastar::basic_sstring<char, unsigned int, 15> &, const utils::tagged_uuid<table_id_tag> &>]
 4290 |     format(format_string<_Args...> __fmt, _Args&&... __args)
      |     ^
/__w/scylladb/scylladb/seastar/include/seastar/core/print.hh:143:1: note: candidate function [with A = <const std::basic_string_view<char> &, const seastar::basic_sstring<char, unsigned int, 15> &, const seastar::basic_sstring<char, unsigned int, 15> &, const utils::tagged_uuid<table_id_tag> &>]
  143 | format(fmt::format_string<A...> fmt, A&&... a) {
      | ^
```

in this change, we

change all `format()` to either `fmt::format()` or `seastar::format()`
with following rules:
- if the caller expects an `sstring` or `std::string_view`, change to
  `seastar::format()`
- if the caller expects an `std::string`, change to `fmt::format()`.
  because, `sstring::operator std::basic_string` would incur a deep
  copy.

we will need another change to enable scylladb to compile with the
latest seastar. namely, to pass the format string as a templated
parameter down to helper functions which format their parameters.
to miminize the scope of this change, let's include that change when
bumping up the seastar submodule. as that change will depend on
the seastar change.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2024-09-11 23:21:40 +03:00
Avi Kivity
aa1270a00c treewide: change assert() to SCYLLA_ASSERT()
assert() is traditionally disabled in release builds, but not in
scylladb. This hasn't caused problems so far, but the latest abseil
release includes a commit [1] that causes a 1000 insn/op regression when
NDEBUG is not defined.

Clearly, we must move towards a build system where NDEBUG is defined in
release builds. But we can't just define it blindly without vetting
all the assert() calls, as some were written with the expectation that
they are enabled in release mode.

To solve the conundrum, change all assert() calls to a new SCYLLA_ASSERT()
macro in utils/assert.hh. This macro is always defined and is not conditional
on NDEBUG, so we can later (after vetting Seastar) enable NDEBUG in release
mode.

[1] 66ef711d68

Closes scylladb/scylladb#20006
2024-08-05 08:23:35 +03:00
Marcin Maliszkiewicz
395dec35c1 cql3: functions: replace template with std::function in with_udf_iter()
Templates are slower to compile and more difficult to read,
in this case generalization is not needed and can be replaced
by std::function.
2024-07-15 09:39:20 +02:00
Marcin Maliszkiewicz
85d38e013c cql3: functions: improve functions class constness handling
Declares getters as const methods. Makes instance() function
return const object so that it may only be modified via change_batch
class.
2024-07-15 09:39:20 +02:00
Marcin Maliszkiewicz
b9861c0bb7 Revert "cql3: functions: make modification functions accessible only via batch class"
This reverts commit 3f1c2fecc2.

This access control property will be implemented differently
(by using const) in subsequent commit hence revert.
2024-07-15 09:39:20 +02:00
Marcin Maliszkiewicz
3f1c2fecc2 cql3: functions: make modification functions accessible only via batch class
This is to assure that all the code is using batching
2024-07-04 13:10:26 +02:00
Marcin Maliszkiewicz
4d937c5a17 cql3: functions: introduce class for batching functions modifications
It will hold a temporary shallow copy of declared functions.
Then each modification adds/removes/replaces stored function object.
At the end change is commited by moving temporary copy to the
main functions class instance.
2024-07-04 12:14:36 +02:00
Marcin Maliszkiewicz
16b770ff1a cql3: functions: make functions class non-static
This is done to ease code reuse in the following commit.
It'd also help should we ever want properly mount functions
class to schema object instead of static storage.
2024-07-04 10:24:57 +02:00
Marcin Maliszkiewicz
47033dce7a cql3: functions: remove reduntant class access specifiers 2024-07-04 10:24:57 +02:00
Marcin Maliszkiewicz
e86191b19f cql3: functions: remove unused java snippet
It doesn't seem to serve any purpose now.
2024-07-04 10:24:57 +02:00
Avi Kivity
3fc4e23a36 forward_service: rename to mapreduce_service
forward_service is nondescriptive and misnamed, as it does more than
forward requests. It's a classic map/reduce algorithm (and in fact one
of its parameters is "reducer"), so name it accordingly.

The name "forward" leaked into the wire protocol for the messaging
service RPC isolation cookie, so it's kept there. It's also maintained
in the name of the logger (for "nodetool setlogginglevel") for
compatibility with tests.

Closes scylladb/scylladb#19444
2024-07-03 19:29:47 +03:00
Pavel Emelyanov
16db2f650e functions: Do not crash when schema is missing
Getting token() function first tries to find a schema for underlying
table and continues with nullptr if there's no one. Later, when creating
token_fct, the schema is passed as is and referenced. If it's null crash
happens.

It used to throw before 5983e9e7b2 (cql3: test_assignment: pass optional
schema everywhere) on missing schema, but this commit changed the way
schema is looked up, so nullptr is now possible.

fixes: #18637

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>

Closes scylladb/scylladb#18639
2024-05-15 17:20:40 +03:00
Avi Kivity
51d09e6a2a cql3: castas_fcts: do not rely on boost casting large multiprecision integers to floats behavior
In [1] a bug casting large multiprecision integers to floats is documented (note that it
received two fixes, the most recent and relevant is [2]). Even with the fix, boost now
returns NaN instead of ±∞ as it did before [3].

Since we cannot rely on boost, detect the conditions that trigger the bug and return
the expected result.

The unit test is extended to cover large negative numbers.

Boost version behavior:
 - 1.78 - returns ±∞
 - 1.79 - terminates
 - 1.79 + fix - returns NaN

Fixes https://github.com/scylladb/scylladb/issues/18508

[1] https://github.com/boostorg/multiprecision/issues/553
[2] ea786494db
[3] https://github.com/boostorg/math/issues/1132

Closes scylladb/scylladb#18532
2024-05-13 13:18:28 +03:00
Kefu Chai
168ade72f8 treewide: replace formatter<std::string_view> with formatter<string_view>
in in {fmt} before v10, it provides the specialization of `fmt::formatter<..>`
for `std::string_view` as well as the specialization of `fmt::formatter<..>`
for `fmt::string_view` which is an implementation builtin in {fmt} for
compatibility of pre-C++17. and this type is used even if the code is
compiled with C++ stadandard greater or equal to C++17. also, before v10,
the `fmt::formatter<std::string_view>::format()` is defined so it accepts
`std::string_view`. after v10, `fmt::formatter<std::string_view>` still
exists, but it is now defined using `format_as()` machinery, so it's
`format()` method does not actually accept `std::string_view`, it
accepts `fmt::string_view`, as the former can be converted to
`fmt::string_view`.

this is why we can inherit from `fmt::formatter<std::string_view>` and
use `formatter<std::string_view>::format(foo, ctx);` to implement the
`format()` method with {fmt} v9, but we cannot do this with {fmt} v10,
and we would have following compilation failure:

```
FAILED: service/CMakeFiles/service.dir/RelWithDebInfo/topology_state_machine.cc.o
/home/kefu/.local/bin/clang++ -DFMT_DEPRECATED_OSTREAM -DFMT_SHARED -DSCYLLA_BUILD_MODE=release -DSEASTAR_API_LEVEL=7 -DSEASTAR_LOGGER_COMPILE_TIME_FMT -DSEASTAR_LOGGER_TYPE_STDOUT -DSEASTAR_SCHEDULING_GROUPS_COUNT=16 -DSEASTAR_SSTRING -DXXH_PRIVATE_API -DCMAKE_INTDIR=\"RelWithDebInfo\" -I/home/kefu/dev/scylladb -I/home/kefu/dev/scylladb/build/gen -I/home/kefu/dev/scylladb/seastar/include -I/home/kefu/dev/scylladb/build/seastar/gen/include -I/home/kefu/dev/scylladb/build/seastar/gen/src -ffunction-sections -fdata-sections -O3 -g -gz -std=gnu++20 -fvisibility=hidden -Wall -Werror -Wextra -Wno-error=deprecated-declarations -Wimplicit-fallthrough -Wno-c++11-narrowing -Wno-deprecated-copy -Wno-mismatched-tags -Wno-missing-field-initializers -Wno-overloaded-virtual -Wno-unsupported-friend -Wno-enum-constexpr-conversion -Wno-unused-parameter -ffile-prefix-map=/home/kefu/dev/scylladb=. -march=westmere -mllvm -inline-threshold=2500 -fno-slp-vectorize -U_FORTIFY_SOURCE -Werror=unused-result -MD -MT service/CMakeFiles/service.dir/RelWithDebInfo/topology_state_machine.cc.o -MF service/CMakeFiles/service.dir/RelWithDebInfo/topology_state_machine.cc.o.d -o service/CMakeFiles/service.dir/RelWithDebInfo/topology_state_machine.cc.o -c /home/kefu/dev/scylladb/service/topology_state_machine.cc
/home/kefu/dev/scylladb/service/topology_state_machine.cc:254:41: error: no matching member function for call to 'format'
  254 |     return formatter<std::string_view>::format(it->second, ctx);
      |            ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~
/usr/include/fmt/core.h:2759:22: note: candidate function template not viable: no known conversion from 'seastar::basic_sstring<char, unsigned int, 15>' to 'const fmt::basic_string_view<char>' for 1st argument
 2759 |   FMT_CONSTEXPR auto format(const T& val, FormatContext& ctx) const
      |                      ^      ~~~~~~~~~~~~
```

because the inherited `format()` method actually comes from
`fmt::formatter<fmt::string_view>`. to reduce the confusion, in this
change, we just inherit from `fmt::format<string_view>`, where
`string_view` is actually `fmt::string_view`. this follows
the document at
https://fmt.dev/latest/api.html#formatting-user-defined-types,
and since there is less indirection under the hood -- we do not
use the specialization created by `FMT_FORMAT_AS` which inherit
from `formatter<fmt::string_view>`, hopefully this can improve
the compilation speed a little bit. also, this change addresses
the build failure with {fmt} v10.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#18299
2024-04-19 07:44:07 +03:00
Kefu Chai
4cc5fcde72 cql3: add fmt::formatter for std::vector<data_type>
before this change, we rely on the default-generated fmt::formatter
created from operator<<, but fmt v10 dropped the default-generated
formatter.

in this change, we define formatters for std::vector<data_type>,
and drop its operator<<.

Refs #13245

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2024-03-02 10:52:50 +08:00