Commit Graph

280 Commits

Author SHA1 Message Date
Pavel Emelyanov
16db2f650e functions: Do not crash when schema is missing
Getting token() function first tries to find a schema for underlying
table and continues with nullptr if there's no one. Later, when creating
token_fct, the schema is passed as is and referenced. If it's null crash
happens.

It used to throw before 5983e9e7b2 (cql3: test_assignment: pass optional
schema everywhere) on missing schema, but this commit changed the way
schema is looked up, so nullptr is now possible.

fixes: #18637

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>

Closes scylladb/scylladb#18639
2024-05-15 17:20:40 +03:00
Avi Kivity
51d09e6a2a cql3: castas_fcts: do not rely on boost casting large multiprecision integers to floats behavior
In [1] a bug casting large multiprecision integers to floats is documented (note that it
received two fixes, the most recent and relevant is [2]). Even with the fix, boost now
returns NaN instead of ±∞ as it did before [3].

Since we cannot rely on boost, detect the conditions that trigger the bug and return
the expected result.

The unit test is extended to cover large negative numbers.

Boost version behavior:
 - 1.78 - returns ±∞
 - 1.79 - terminates
 - 1.79 + fix - returns NaN

Fixes https://github.com/scylladb/scylladb/issues/18508

[1] https://github.com/boostorg/multiprecision/issues/553
[2] ea786494db
[3] https://github.com/boostorg/math/issues/1132

Closes scylladb/scylladb#18532
2024-05-13 13:18:28 +03:00
Kefu Chai
168ade72f8 treewide: replace formatter<std::string_view> with formatter<string_view>
in in {fmt} before v10, it provides the specialization of `fmt::formatter<..>`
for `std::string_view` as well as the specialization of `fmt::formatter<..>`
for `fmt::string_view` which is an implementation builtin in {fmt} for
compatibility of pre-C++17. and this type is used even if the code is
compiled with C++ stadandard greater or equal to C++17. also, before v10,
the `fmt::formatter<std::string_view>::format()` is defined so it accepts
`std::string_view`. after v10, `fmt::formatter<std::string_view>` still
exists, but it is now defined using `format_as()` machinery, so it's
`format()` method does not actually accept `std::string_view`, it
accepts `fmt::string_view`, as the former can be converted to
`fmt::string_view`.

this is why we can inherit from `fmt::formatter<std::string_view>` and
use `formatter<std::string_view>::format(foo, ctx);` to implement the
`format()` method with {fmt} v9, but we cannot do this with {fmt} v10,
and we would have following compilation failure:

```
FAILED: service/CMakeFiles/service.dir/RelWithDebInfo/topology_state_machine.cc.o
/home/kefu/.local/bin/clang++ -DFMT_DEPRECATED_OSTREAM -DFMT_SHARED -DSCYLLA_BUILD_MODE=release -DSEASTAR_API_LEVEL=7 -DSEASTAR_LOGGER_COMPILE_TIME_FMT -DSEASTAR_LOGGER_TYPE_STDOUT -DSEASTAR_SCHEDULING_GROUPS_COUNT=16 -DSEASTAR_SSTRING -DXXH_PRIVATE_API -DCMAKE_INTDIR=\"RelWithDebInfo\" -I/home/kefu/dev/scylladb -I/home/kefu/dev/scylladb/build/gen -I/home/kefu/dev/scylladb/seastar/include -I/home/kefu/dev/scylladb/build/seastar/gen/include -I/home/kefu/dev/scylladb/build/seastar/gen/src -ffunction-sections -fdata-sections -O3 -g -gz -std=gnu++20 -fvisibility=hidden -Wall -Werror -Wextra -Wno-error=deprecated-declarations -Wimplicit-fallthrough -Wno-c++11-narrowing -Wno-deprecated-copy -Wno-mismatched-tags -Wno-missing-field-initializers -Wno-overloaded-virtual -Wno-unsupported-friend -Wno-enum-constexpr-conversion -Wno-unused-parameter -ffile-prefix-map=/home/kefu/dev/scylladb=. -march=westmere -mllvm -inline-threshold=2500 -fno-slp-vectorize -U_FORTIFY_SOURCE -Werror=unused-result -MD -MT service/CMakeFiles/service.dir/RelWithDebInfo/topology_state_machine.cc.o -MF service/CMakeFiles/service.dir/RelWithDebInfo/topology_state_machine.cc.o.d -o service/CMakeFiles/service.dir/RelWithDebInfo/topology_state_machine.cc.o -c /home/kefu/dev/scylladb/service/topology_state_machine.cc
/home/kefu/dev/scylladb/service/topology_state_machine.cc:254:41: error: no matching member function for call to 'format'
  254 |     return formatter<std::string_view>::format(it->second, ctx);
      |            ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~
/usr/include/fmt/core.h:2759:22: note: candidate function template not viable: no known conversion from 'seastar::basic_sstring<char, unsigned int, 15>' to 'const fmt::basic_string_view<char>' for 1st argument
 2759 |   FMT_CONSTEXPR auto format(const T& val, FormatContext& ctx) const
      |                      ^      ~~~~~~~~~~~~
```

because the inherited `format()` method actually comes from
`fmt::formatter<fmt::string_view>`. to reduce the confusion, in this
change, we just inherit from `fmt::format<string_view>`, where
`string_view` is actually `fmt::string_view`. this follows
the document at
https://fmt.dev/latest/api.html#formatting-user-defined-types,
and since there is less indirection under the hood -- we do not
use the specialization created by `FMT_FORMAT_AS` which inherit
from `formatter<fmt::string_view>`, hopefully this can improve
the compilation speed a little bit. also, this change addresses
the build failure with {fmt} v10.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#18299
2024-04-19 07:44:07 +03:00
Kefu Chai
4cc5fcde72 cql3: add fmt::formatter for std::vector<data_type>
before this change, we rely on the default-generated fmt::formatter
created from operator<<, but fmt v10 dropped the default-generated
formatter.

in this change, we define formatters for std::vector<data_type>,
and drop its operator<<.

Refs #13245

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2024-03-02 10:52:50 +08:00
Nadav Har'El
fc861742d7 cql: avoid undefined behavior in totimestamp() of extreme dates
This patch fixes a UBSAN-reported integer overflow during one of our
existing tests,

   test_native_functions.py::test_mintimeuuid_extreme_from_totimestamp

when attempting to convert an extreme "date" value, millions of years
in the past, into a "timestamp" value. When UBSAN crashing is enabled,
this test crashes before this patch, and succeeds after this patch.

The "date" CQL type is 32-bit count of *days* from the epoch, which can
span 2^31 days (5 million years) before or after the epoch. Meanwhile,
the "timestamp" type measures the number of milliseconds from the same
epoch, in 64 bits. Luckily (or intentionally), every "date", however
extreme, can be converted into a "timestamp": This is because 2^31 days
is 1.85e17 milliseconds, well below timestamp's limit of 2^63 milliseconds
(9.2e18).

But it turns out that our conversion function, date_to_time_point(),
used some boost::gregorian library code, which carried out these
calculations in **microsecond** resolution. The extra conversion to
microseconds wasn't just wasteful, it also caused an integer overflow
in the extreme case: 2^31 days is 1.85e20 microseconds, which does NOT
fit in a 64-bit integer. UBSAN notices this overflow, and complains
(plus, the conversion is incorrect).

The fix is to do the trivial conversion on our own (a day is, by
convention, exactly 86400 seconds - no fancy library is needed),
without the grace of Boost. The result is simpler, faster, correct
for the Pliocene-age dates, and fixes the UBSAN crash in the test.

Fixes #17516

Signed-off-by: Nadav Har'El <nyh@scylladb.com>

Closes scylladb/scylladb#17527
2024-02-27 17:04:18 +02:00
Avi Kivity
7cb1c10fed treewide: replace seastar::future::get0() with seastar::future::get()
get0() dates back from the days where Seastar futures carried tuples, and
get0() was a way to get the first (and usually only) element. Now
it's a distraction, and Seastar is likely to deprecate and remove it.

Replace with seastar::future::get(), which does the same thing.
2024-02-02 22:12:57 +08:00
Kefu Chai
2dbf044b91 cql3: do not include unused headers
these unused includes were identified by clangd. see
https://clangd.llvm.org/guides/include-cleaner#unused-include-warning
for more details on the "Unused include" warning.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#16791
2024-01-16 16:43:17 +02:00
Yaniv Kaul
c658bdb150 Typos: fix typos in comments
Fixes some typos as found by codespell run on the code.
In this commit, I was hoping to fix only comments, not user-visible alerts, output, etc.
Follow-up commits will take care of them.

Refs: https://github.com/scylladb/scylladb/issues/16255
Signed-off-by: Yaniv Kaul <yaniv.kaul@scylladb.com>
2023-12-02 22:37:22 +02:00
Michael Huang
a684e51e4d cql3: fix bad optional access when executing fromJson function
Fix fromJson(null) to return null, not a error as it did before this patch.
We use "null" as the default value when unwrapping optionals
to avoid bad optional access errors.

Fixes: scylladb#7912

Signed-off-by: Michael Huang <michaelhly@gmail.com>

Closes scylladb/scylladb#15481
2023-09-21 20:18:49 +03:00
Nadav Har'El
b513bba201 cql: support casting of counter to other types
We were missing support in the "CAST(x AS type)" function for the counter
type. This patch adds this support, as well as extensive testing that it
works in Scylla the same as Cassandra.

We also un-xfail an existing test translated from Cassandra's unit
test. But note that this old test did not cover all the edge-cases that
the new test checks - some missing cases in the implementation were
not caught by the old test.

Fixes #14501

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2023-07-30 20:16:25 +03:00
Nadav Har'El
c1762750ed cql: implement missing counterasblob() and blobascounter() functions
Code in functions.cc creates the different TYPEasblob() and blobasTYPE()
functions for all type names TYPE. The functions for the "counter" type
were skipped, supposedly because "counters are not supported yet". But
counters are supported, so let's add the missing functions.

The code fix is trivial, the tests that verify that the result behaves
like Cassandra took more work.

After this patch, unimplemented::cause::COUNTERS is no longer used
anywhere in the code. I wanted to remove it, but noticed that
unimplemented::cause is a graveyard of unused causes, so decided not
to remove this one either. We should clean it up in a separate patch.

Fixes #14742

Also includes tests for tangently-related issues:
Refs #12607
Refs #14319

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2023-07-30 20:16:25 +03:00
Avi Kivity
66c47d40e6 cql3: selection: drop selector_factories, selectables, and selectors
The whole class hierarchy is no longer used by anything and we can just
delete it.
2023-07-03 19:45:17 +03:00
Avi Kivity
b7556e9482 cql3: functions: add "first" aggregate function
first(x) returns the first x it sees in the group. This is useful
for SELECT clauses that return a mix of aggregates and non-aggregates,
for example

    SELECT max(x), x

with inputs of x = { 1, 2, 3 } is expected to return (3, 1).

Currently, this behavior is handled by individual selectors,
which means they need to contain extra state for this, which
cannot be easily translated to expressions. The new first function
allows translating the SELECT clause above to

    SELECT max(x), first(x)

so all selectors are aggregations and can be handled in the same
way.

The first() function is not exposed to users.
2023-07-02 18:15:00 +03:00
Avi Kivity
73b6b6e3d1 cql3: expr: support preparing SQL-style casts
We convert the cast to a function, just like the existing
with_function selectable.
2023-06-13 21:04:49 +03:00
Avi Kivity
871c1c4f99 cql3: error injection functions: mark enabled_injections() as impure
A pure function should return the same value on every invocation,
but enabled_injections() returns true or false depending on global
state.

Mark it impure to reflect that.

Currently, the bug has no effect, but once we prepare selectors,
the prepare_function_call() will constant-fold calls to pure
functions, so we'll capture global state at prepare time rather
than evaluate it each time anew.
2023-06-13 21:04:49 +03:00
Avi Kivity
c0f59f0789 cql3: eliminate dynamic_cast<selector> from functions::get()
Type inference for function calls is a bit complicated:
 - a function argument can be inferred from the signature: a call to
   my_func(:arg) will infer :arg's type from the function signature
 - a function signature can be inferred from its argument types:
   a call to max(my_column) will select the correct max() signature
   (as max is generic) from my_column's type

Currently, functions::get() implements this by invoking
dynamic_cast<selector*> on the argument. If the caller of
functions::get() is the SELECT clause preparation, then the
cast will succeed and we'll be able to find the type. If not,
we fail (and fall back to inferring the argument types from a
non-generic function signature).

Since we're about to move selectors to expressions, the dynamic_cast
will fail, so we must replace it with a less fragile approach.

The fix is to augment assignment_testable (the interface representing
a function argument) with an intentionally-awkwardly-named
assignment_testable_type_opt(), that sees whether we happen to know
the type for the argument in order to implement signature-from-argument
inference.

A note about assignment_testable: this is a bridge interface
that is the least common denominator of anything that calls functions.
Since we're moving towards expressions, there are fewer implementations of
the interface as the code evolves.
2023-06-13 21:04:49 +03:00
Avi Kivity
5983e9e7b2 cql3: test_assignment: pass optional schema everywhere
test_assignment() and related functions check for type compatibility between
a right-hand-side and a left-hand-side.

It started its life with a limited functionality for INSERT and UPDATE,
but now it's about to be used for cast expression in selectors, which
can cast a column_value. A column_value is still an unresolved_identifier
during the prepare phase, and cannot be resolved without a schema.

To prepare for this, pass an optional schema everywhere.

Ultimately, test_assignment likely needs to be folded into prepare_expr(),
but before that prepare_expr() has to be used everywhere.
2023-06-13 21:04:49 +03:00
Avi Kivity
78f4ee385f cql3: functions: fix count(col) for non-scalar types
count(col), unlike count(*), does not count rows for which col is NULL.
However, if col's data type is not a scalar (e.g. a collection, tuple,
or user-defined type) it behaves like count(*), counting NULLs too.

The cause is that get_dynamic_aggregate() converts count() to
the count(*) version. It works for scalars because get_dynamic_aggregate()
intentionally fails to match scalar arguments, and functions::get() then
matches the arguments against the pre-declared count functions.

As we can only pre-declare count(scalar) (there's an infinite number
of non-scalar types), we change the approach to be the same as min/max:
we make count() a generic function. In fact count(col) is much better
as a generic function, as it only examines its input to see if it is
NULL.

A unit test is added. It passes with Cassandra as well.

Fixes #14198.

Closes #14199
2023-06-13 14:40:14 +03:00
Jan Ciolek
15ed83adbc cql3/functions: make column family argument optional in functions::get
The method `functions::get` is used to get the `functions::function` object
of the CQL function called using `expr::function_call`.

Until now `functions::get` required the caller to pass both the keyspace
and the column family.

The keyspace argument is always needed, as every CQL function belongs
to some keyspace, but the column family isn't used in most cases.

The only case where having the column family is really required
is the `token()` function. Each variant of the `token()` function
belongs to some table, as the arguments to the function are the
consecutive partition key columns.

Let's make the column family argument optional. In most cases
the function will work without information about column family.
In case of the `token()` function there's gonna be a check
and it will throw an exception if the argument is nullopt.

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2023-04-29 13:00:01 +02:00
Kefu Chai
ca6ebbd1f0 cql3, db: sstable: specialize fmt::formatter<function_name>
this is a part of a series to migrating from `operator<<(ostream&, ..)`
based formatting to fmtlib based formatting. the goal here is to enable
fmtlib to print `function_name` without the help of `operator<<`.

the corresponding `operator<<()` are dropped dropped in this change,
as all its callers are now using fmtlib for formatting now.

Refs #13245

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes #13608
2023-04-21 10:07:28 +03:00
Kefu Chai
ecb5380638 treewide: s/boost::lexical_cast<std::string>/fmt::to_string()/
this change replaces all occurrences of `boost::lexical_cast<std::string>`
in the source tree with `fmt::to_string()`. for couple reasons:

* `boost::lexical_cast<std::string>` is longer than `fmt::to_string()`,
  so the latter is easier to parse and read.
* `boost::lexical_cast<std::string>` creates a stringstream under the
  hood, so it can use the `operator<<` to stringify the given object.
  but stringstream is known to be less performant than fmtlib.
* we are migrating to fmtlib based formatting, see #13245. so
  using `fmt::to_string()` helps us to remove yet another dependency
  on `operator<<`.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes #13611
2023-04-21 09:43:53 +03:00
Botond Dénes
d828cfcb23 Merge 'db, cql3: functions: switch argument passing to std::span' from Avi Kivity
Database functions currently receive their arguments as an std::vector. This
is inflexible (for example, one cannot use small_vector to reduce allocations).

This series adapts the function signature to accept parameters using std::span.
Some changes in the keys interface are needed to support this. Lastly, one call
site is migrated to small_vector.

This is in support of changing selectors to use expressions.

Closes #13581

* github.com:scylladb/scylladb:
  cql3: abstract_function_selector: use small_vector for argument buffer
  db, cql3: functions: pass function parameters as a span instead of a vector
  keys: change from_optional_exploded to accept a span instead of a vector
2023-04-21 06:49:07 +03:00
Avi Kivity
3e0aacc8b5 db, cql3: functions: pass function parameters as a span instead of a vector
Spans are more flexible and can be constructed from any contiguous
container (such as small_vector), or a subrange of such a container.
This can save allocations, so change the signature to accept a span.

Spans cannot be constructed from std::initializer_list, so one such
call site is changed to use construct a span directly from the single
argument.
2023-04-19 20:38:55 +03:00
Nadav Har'El
81e0f5b581 cql3: allow SUM() aggregation to result in a NaN
When floating-point data contains +Inf and -Inf, the sum is NaN.

Our SUM() aggregation calculated this sum correctly, but then instead
of returning it, complained that the sum overflowed by narrowing.
This was a false positive: The sum() finalizer wanted to test that no
precision was lost when casting the accumulator to the result type,
so checked that the result before and after the cast are the same.
But specifically for NaN, it is never equal to anything - not even
to itself. This check is wrong for floating point, but moreover -
isn't even necessary when the two types (accumulator type and result
type) are identical so in this patch we skip it in this case.

Note that in the current code, a different accumulator and result type
is only used in the case of integer types; When accumulating floating
point sums, the same type is used, so the broken check will be avoided.

The test for this issue starts to pass with this patch, so the xfail
tag is removed.

Fixes #13551

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2023-04-19 09:31:41 +03:00
Nadav Har'El
59ab9aac44 Merge 'functions: reframe aggregate functions in terms of scalar functions' from Avi Kivity
Currently, aggregate functions are implemented in a statefull manner.
The accumulator is stored internally in an aggregate_function::aggregate,
requiring each query to instantiate new instances (see
aggregate_function_selector's constructor, and note how it's called
from selector::new_instance()).

This makes aggregates hard to use in expressions, since expressions
are stateless (with state only provided to evaluate()). To facilitate
migration towards stateless expressions, we define a
stateless_aggregate_function (modeled after user-defined aggregates,
which are already stateless). This new struct defines the aggregate
in terms of three scalar functions: one to aggregate a new input into
an accumulator (provided in the first parameter), one to finalize an
accumulator into a result, and one to reduce two accumulators for
parallelized aggregation.

All existing native aggregate functions are converted to the new model, and
the old interface is removed. This series does not yet convert selectors to
expressions, but it does remove one of the obstacles.

Performance evaluation: I created a table with a million ints on a single-node cluster, and ran the avg() function on them. I measured the number of instructions executed with `perf stat -p $(pgrep scylla) -e instructions` while the query was running. The query executed from cache, memtables were flushed beforehand. The instruction count per row increased from roughly 49k to roughly 52k, indicating 3k extra instructions per row. While 3k instructions to execute a function is huge, it is currently dwarfed by other overhead (and will be even less important in a cluster where it CL>1 will cause non-coordinator code to run multiple times).

Closes #13105

* github.com:scylladb/scylladb:
  cql3/selection, forward_service: use use stateless_aggregate_function directly
  db: functions: fold stateless_aggregate_function_adapter into aggregate_function
  cql3: functions: simplify accumulator_for template
  cql3: functions: base user-defined aggregates on stateless aggregates
  cql3: functions: drop native_aggregate_function
  cql3: functions: reimplement count(column) statelessly
  cql3: functions: reimplement avg() statelessly
  cql3: functions: reimplement sum() statelessly
  cql3: functions: change wide accumulator type to varint
  cql3: functions: unreverse types for min/max
  cql3: functions: rename make_{min,max}_dynamic_function
  cql3: functions: reimplement min/max statelessly
  cql3: functions: reimplement count(*) statelessly
  cql3: functions: simplify creating native functions even more
  cql3: functions: add helpers for automating marshalling for scalar functions
  types: fix big_decimal constructor from literal 0
  cql3: functions: add helper class for internal scalar functions
  db: functions: add stateless aggregate functions
  db, cql3: move scalar_function from cql3/functions to db/functions
2023-03-30 13:58:47 +03:00
Avi Kivity
58eb21aa5d db: functions: fold stateless_aggregate_function_adapter into aggregate_function
Now that all aggregate functions are derived from
stateless_aggregate_function_adapter, we can just fold its functionality
into the base class. This exposes stateless_aggregate_function to
all users of aggregate_function, so they can begin to benefit from
the transformation, though this patch doesn't touch those users.

The aggregate_function base class is partiallly devirtualized since
there is just a single implementation now.
2023-03-28 23:47:11 +03:00
Avi Kivity
68529896aa cql3: functions: simplify accumulator_for template
The accumulator_for template is used to select the accumulator
type for aggregates. After refactoring, all that is needed from
it is to select the native type, so remove all the excess code.
2023-03-28 23:47:11 +03:00
Avi Kivity
4ea3136026 cql3: functions: base user-defined aggregates on stateless aggregates
Since the model for stateless aggregates was taken from user
defined aggregates, the conversion is trivial.
2023-03-28 23:47:11 +03:00
Avi Kivity
f2715b289a cql3: functions: drop native_aggregate_function
Now that all aggregates are implemented staetelessly,
native_aggregate_function no longer has subclasses, so drop it.
2023-03-28 23:47:11 +03:00
Avi Kivity
6bceb25982 cql3: functions: reimplement count(column) statelessly
Note that we don't use the automarshalling helper for the
aggregation function, since it doesn't work for compound
types.
2023-03-28 23:47:11 +03:00
Avi Kivity
4f2cdace9a cql3: functions: reimplement avg() statelessly 2023-03-28 23:47:11 +03:00
Avi Kivity
b0a8fd3287 cql3: functions: reimplement sum() statelessly 2023-03-28 23:47:11 +03:00
Avi Kivity
d21d11466a cql3: functions: change wide accumulator type to varint
Currently, we use __int128, but this has no direct counterpart
in CQL, so we can't express the accumulator type as part of a
CQL scalar function. Switch to varint which is a superset, although
slower.
2023-03-28 23:47:11 +03:00
Avi Kivity
3252dc0172 cql3: functions: unreverse types for min/max
Currently it works without this, but later unreversing will
be removed from another part of the stack, causing min/max
on reversed types to return incorrect results. Anticipate
that an unreverse the types during construction.
2023-03-28 23:47:09 +03:00
Avi Kivity
ed466b7e68 cql3: functions: rename make_{min,max}_dynamic_function
There's no longer a statically-typed variant, so no need
to distinguish the dynamically-typed one.
2023-03-28 23:37:49 +03:00
Avi Kivity
bfd70c192e cql3: functions: reimplement min/max statelessly
min() and max() had two implementations: one static (for each type in
a select list) and one dynamic (for compound types). Since the
dynamic implementation is sufficient, we only reimplement that. This
means we don't use the automarshalling helpers, since we don't do any
arithemetic on values apart from comparison, which is conveniently
provided by abstract_type.
2023-03-26 15:18:22 +03:00
Avi Kivity
e6342d476b cql3: functions: reimplement count(*) statelessly
Note we have to explicitly decay lambdas to functions using unary operator +.
2023-03-26 15:18:22 +03:00
Avi Kivity
9291ec5ed1 cql3: functions: simplify creating native functions even more
Add a helper function to consolidate the internal native function
class and the automatic marshalling introduced in previous patches.

Since decaying a lambda into a function pointer (in order to
infer its signature) there are two overloads: one accepts a lambda
and decays it into a function pointer, the second accepts a function
pointer, infers its argument, and constructs the function object.
2023-03-26 15:15:36 +03:00
Kefu Chai
c37f4e5252 treewide: use fmt::join() when appropriate
now that fmtlib provides fmt::join(). see
https://fmt.dev/latest/api.html#_CPPv4I0EN3fmt4joinE9join_viewIN6detail10iterator_tI5RangeEEN6detail10sentinel_tI5RangeEEERR5Range11string_view
there is not need to revent the wheel. so in this change, the homebrew
join() is replaced with fmt::join().

as fmt::join() returns an join_view(), this could improve the
performance under certain circumstances where the fully materialized
string is not needed.

please note, the goal of this change is to use fmt::join(), and this
change does not intend to improve the performance of existing
implementation based on "operator<<" unless the new implementation is
much more complicated. we will address the unnecessarily materialized
strings in a follow-up commit.

some noteworthy things related to this change:

* unlike the existing `join()`, `fmt::join()` returns a view. so we
  have to materialize the view if what we expect is a `sstring`
* `fmt::format()` does not accept a view, so we cannot pass the
  return value of `fmt::join()` to `fmt::format()`
* fmtlib does not format a typed pointer, i.e., it does not format,
  for instance, a `const std::string*`. but operator<<() always print
  a typed pointer. so if we want to format a typed pointer, we either
  need to cast the pointer to `void*` or use `fmt::ptr()`.
* fmtlib is not able to pick up the overload of
  `operator<<(std::ostream& os, const column_definition* cd)`, so we
  have to use a wrapper class of `maybe_column_definition` for printing
  a pointer to `column_definition`. since the overload is only used
  by the two overloads of
  `statement_restrictions::add_single_column_parition_key_restriction()`,
  the operator<< for `const column_definition*` is dropped.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2023-03-16 20:34:18 +08:00
Avi Kivity
7a5e609d8d cql3: functions: add helpers for automating marshalling for scalar functions
Add a helper that, given a C++ function, deduces its arguument types
and wraps the function in marshalling/unmarshalling code.

The native function expects non-null inputs, so an additional helper is
called to decide what to do if nulls are encountered. One such
helper is return_accumulator_on_null (since that's the default
behavior of aggregates), and the other is return_any_nonnull(),
useful for reductions.
2023-03-15 22:28:41 +02:00
Avi Kivity
6c8d942fa1 cql3: functions: add helper class for internal scalar functions
We'll need many scalar functions to implement aggregates in terms
of scalars, so we add an internal_scalar_function class to reduce
boilerplate. The new class proxies the scalar function into a
native noncopyable_function provided by the constructor.
2023-03-15 22:22:02 +02:00
Avi Kivity
82c4341e0e db, cql3: move scalar_function from cql3/functions to db/functions
Previously, we moved cql3::functions::function to the
db::functions namespace, since functions are a part of the
data dictionary, which is independent of cql3. We do the
same now for scalar_function, since we wish to make use
of it in a new db::functions::stateless_aggregate_function.

A stub remains in cql3/functions to avoid churn.
2023-03-15 20:37:25 +02:00
Avi Kivity
6aa91c13c5 Merge 'Optimize topology::compare_endpoints' from Benny Halevy
The code for compare_endpoints originates at the dawn of time (bc034aeaec)
and is called on the fast path from storage_proxy via `sort_by_proximity`.

This series considerably reduces the function's footprint by:
1. carefully coding the many comparisons in the function so to reduce the number of conditional banches (apparently the compiler isn't doing a good enough job at optimizing it in this case)
2. avoid sstring copy in topology::get_{datacenter,rack}

Closes #12761

* github.com:scylladb/scylladb:
  topology: optimize compare_endpoints
  to_string: add print operators for std::{weak,partial}_ordering
  utils: to_sstring: deinline std::strong_ordering print operator
  move to_string.hh to utils/
  test: network_topology: add test_topology_compare_endpoints
2023-03-07 15:17:19 +02:00
Kefu Chai
79d2eb1607 cql3: functions: validate arguments for 'token()' also
since "token()" computes the token for a given partition key,
if we pass the key of the wrong type, it should reject.

in this change,

* we validate the keys before returning the "token()" function.
* drop the "xfail" decorator from two of the tests. they pass
  now after this fix.
* change the tests which previously passed the wrong number of
  arguments containing null to "token()" and expect it to return
  null, so they verify that "token()" should reject these
  arguments with the expected error message.

Fixes #10448
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes #12991
2023-02-26 19:01:58 +02:00
Kefu Chai
df63e2ba27 types: move types.{cc,hh} into types
they are part of the CQL type system, and are "closer" to types.
let's move them into "types" directory.

the building systems are updated accordingly.

the source files referencing `types.hh` were updated using following
command:

```
find . -name "*.{cc,hh}" -exec sed -i 's/\"types.hh\"/\"types\/types.hh\"/' {} +
```

the source files under sstables include "types.hh", which is
indeed the one located under "sstables", so include "sstables/types.hh"
instea, so it's more explicit.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes #12926
2023-02-19 21:05:45 +02:00
Kefu Chai
0cb842797a treewide: do not define/capture unused variables
these warnings are found by Clang-17 after removing
`-Wno-unused-lambda-capture` and '-Wno-unused-variable' from
the list of disabled warnings in `configure.py`.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2023-02-15 22:57:18 +02:00
Benny Halevy
25ebc63b82 move to_string.hh to utils/
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2023-02-15 11:09:04 +02:00
Wojciech Mitros
02bfac0c66 uda: change the UDF used in a UDA if it's replaced
Currently, if a UDA uses a UDF that's being replaced,
the UDA will still keep using the old UDF until the
node is restarted.
This patch fixes this behavior by checking all UDAs
when replacing a UDF and updating them if necessary.

Fixes #12709
2023-02-07 12:17:52 +01:00
Wojciech Mitros
58987215dc functions: add helper same_signature method
When deciding whether two functions have the same
signature, we have to check if they have the same name
and parameter types. Additionally, if they're represented
by pointers, we need to check if any of them is a nullptr.
This logic is used multiple times, so it's extracted to
a separate function.
To use this function, the `used_by_user_aggregate` method
takes now a function instead of name and types list - we
can do it because we always use it with an existing user
function (that we're trying to drop).
The method will also be useful when we'll be not dropping,
but replacing a user function.
2023-02-07 10:15:12 +01:00
Wojciech Mitros
20069372e7 uda: return aggregate functions as shared pointers
We will want to reuse the functions that we get from an aggregate
without making a deep copy, and it's only possible if we get
pointers from the aggregate instead of actual values.
2023-02-07 10:15:09 +01:00