152 Commits

Author SHA1 Message Date
Michał Jadwiszczak
8157d260f2 types: add a method to get all referenced user types
The method collects all UDTs used to create a type.
This is required to sort UDTs in a topological order.
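
A minimal sketch of why the collected references matter, assuming a toy model in which each UDT name maps to the names of the UDTs it references. `udt_graph` and `sort_udts_topologically()` are hypothetical names used for illustration, not the method added by this commit, and the sketch assumes the reference graph is acyclic.

```cpp
#include <functional>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical model: each UDT name maps to the UDT names it references.
using udt_graph = std::map<std::string, std::vector<std::string>>;

// Returns the names ordered so that every referenced type precedes its users.
// Assumes the graph is acyclic; names that are referenced but missing from the
// map are treated as having no references of their own.
std::vector<std::string> sort_udts_topologically(const udt_graph& graph) {
    std::vector<std::string> ordered;
    std::set<std::string> visited;
    std::function<void(const std::string&)> visit = [&](const std::string& name) {
        if (!visited.insert(name).second) {
            return; // already handled
        }
        auto it = graph.find(name);
        if (it != graph.end()) {
            for (const auto& referenced : it->second) {
                visit(referenced); // emit dependencies first
            }
        }
        ordered.push_back(name);
    };
    for (const auto& entry : graph) {
        visit(entry.first);
    }
    return ordered;
}
```

With such an order, a type can be created only after every type it refers to already exists.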
2024-05-16 13:30:03 +02:00
Kefu Chai
e2d5054c53 types: do not include unused headers
these unused includes were identified by clangd. see
https://clangd.llvm.org/guides/include-cleaner#unused-include-warning
for more details on the "Unused include" warning.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#18326
2024-04-23 12:08:23 +03:00
Kefu Chai
372a4d1b79 treewide: do not define FMT_DEPRECATED_OSTREAM
since we do not rely on FMT_DEPRECATED_OSTREAM to define the
fmt::formatter for us anymore, let's stop defining `FMT_DEPRECATED_OSTREAM`.

in this change,

* utils: drop the range formatters in to_string.hh and to_string.c, as
  we don't use them anymore. and the tests for them in
  test/boost/string_format_test.cc are removed accordingly.
* utils: use fmt to print chunk_vector and small_vector. as
  we are not able to print the elements using operator<< anymore
  after switching to {fmt} formatters.
* test/boost: specialize fmt::detail::is_std_string_like<bytes>.
  due to a bug in {fmt} v9, {fmt} fails to format a range whose
  element type is `basic_sstring<uint8_t>`, as it considers it
  a string-like type, but `basic_sstring<uint8_t>`'s char type
  is signed char, not char. this issue does not exist in {fmt} v10,
  so, in this change, we add a workaround to explicitly specialize
  the type trait to ensure that {fmt} formats this type using its
  `fmt::formatter` specialization instead of trying to format it
  as a string. also, {fmt}'s generic ranges formatter calls the
  pair formatter's `set_brackets()` and `set_separator()` methods
  when printing the range, but the operator<< based formatter does not
  provide these methods, so we have to include this change in the change
  switching to {fmt}, otherwise the change specializing
  `fmt::detail::is_std_string_like<bytes>` won't compile.
* test/boost: in tests, we use `BOOST_REQUIRE_EQUAL()` and its friends
  for comparing values. but without the operator<< based formatters,
  Boost.Test would not be able to print them. after removing
  the homebrew formatters, we need to use the generic
  `boost_test_print_type()` helper to do this job. so we are
  including `test_utils.hh` in tests so that we can print
  the formattable types.
* treewide: add `#include "utils/to_string.hh"` where
  `fmt::formatter<optional<>>` is used.
* configure.py: do not define FMT_DEPRECATED_OSTREAM
* cmake: do not define FMT_DEPRECATED_OSTREAM

Refs #13245

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2024-04-19 22:57:36 +08:00
Kefu Chai
168ade72f8 treewide: replace formatter<std::string_view> with formatter<string_view>
before v10, {fmt} provides the specialization of `fmt::formatter<..>`
for `std::string_view` as well as the specialization of `fmt::formatter<..>`
for `fmt::string_view`, which is an implementation built into {fmt} for
compatibility with pre-C++17 compilers. this type is used even if the code is
compiled with a C++ standard greater than or equal to C++17. also, before v10,
`fmt::formatter<std::string_view>::format()` is defined so that it accepts
`std::string_view`. since v10, `fmt::formatter<std::string_view>` still
exists, but it is now defined using the `format_as()` machinery, so its
`format()` method does not actually accept `std::string_view`; it
accepts `fmt::string_view`, as the former can be converted to
`fmt::string_view`.

this is why we can inherit from `fmt::formatter<std::string_view>` and
use `formatter<std::string_view>::format(foo, ctx);` to implement the
`format()` method with {fmt} v9, but we cannot do this with {fmt} v10,
and we would have the following compilation failure:

```
FAILED: service/CMakeFiles/service.dir/RelWithDebInfo/topology_state_machine.cc.o
/home/kefu/.local/bin/clang++ -DFMT_DEPRECATED_OSTREAM -DFMT_SHARED -DSCYLLA_BUILD_MODE=release -DSEASTAR_API_LEVEL=7 -DSEASTAR_LOGGER_COMPILE_TIME_FMT -DSEASTAR_LOGGER_TYPE_STDOUT -DSEASTAR_SCHEDULING_GROUPS_COUNT=16 -DSEASTAR_SSTRING -DXXH_PRIVATE_API -DCMAKE_INTDIR=\"RelWithDebInfo\" -I/home/kefu/dev/scylladb -I/home/kefu/dev/scylladb/build/gen -I/home/kefu/dev/scylladb/seastar/include -I/home/kefu/dev/scylladb/build/seastar/gen/include -I/home/kefu/dev/scylladb/build/seastar/gen/src -ffunction-sections -fdata-sections -O3 -g -gz -std=gnu++20 -fvisibility=hidden -Wall -Werror -Wextra -Wno-error=deprecated-declarations -Wimplicit-fallthrough -Wno-c++11-narrowing -Wno-deprecated-copy -Wno-mismatched-tags -Wno-missing-field-initializers -Wno-overloaded-virtual -Wno-unsupported-friend -Wno-enum-constexpr-conversion -Wno-unused-parameter -ffile-prefix-map=/home/kefu/dev/scylladb=. -march=westmere -mllvm -inline-threshold=2500 -fno-slp-vectorize -U_FORTIFY_SOURCE -Werror=unused-result -MD -MT service/CMakeFiles/service.dir/RelWithDebInfo/topology_state_machine.cc.o -MF service/CMakeFiles/service.dir/RelWithDebInfo/topology_state_machine.cc.o.d -o service/CMakeFiles/service.dir/RelWithDebInfo/topology_state_machine.cc.o -c /home/kefu/dev/scylladb/service/topology_state_machine.cc
/home/kefu/dev/scylladb/service/topology_state_machine.cc:254:41: error: no matching member function for call to 'format'
  254 |     return formatter<std::string_view>::format(it->second, ctx);
      |            ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~
/usr/include/fmt/core.h:2759:22: note: candidate function template not viable: no known conversion from 'seastar::basic_sstring<char, unsigned int, 15>' to 'const fmt::basic_string_view<char>' for 1st argument
 2759 |   FMT_CONSTEXPR auto format(const T& val, FormatContext& ctx) const
      |                      ^      ~~~~~~~~~~~~
```

because the inherited `format()` method actually comes from
`fmt::formatter<fmt::string_view>`. to reduce the confusion, in this
change, we just inherit from `fmt::formatter<string_view>`, where
`string_view` is actually `fmt::string_view`. this follows
the documentation at
https://fmt.dev/latest/api.html#formatting-user-defined-types,
and since there is less indirection under the hood (we do not
use the specialization created by `FMT_FORMAT_AS`, which inherits
from `formatter<fmt::string_view>`), hopefully this can improve
the compilation speed a little bit. also, this change addresses
the build failure with {fmt} v10.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#18299
2024-04-19 07:44:07 +03:00
Kefu Chai
1b859e484f treewide: use fmt::to_string() to transform a UUID to std::string
without `FMT_DEPRECATED_OSTREAM` macro, `UUID::to_sstring()` is
implemented using its `fmt::formatter`, which is not available
at the end of this header file where `UUID` is defined. at this moment,
we still use `FMT_DEPRECATED_OSTREAM` and {fmt} v9, so we can
still use `UUID::to_sstring()`, but in {fmt} v10, we cannot.

so, in this change, we change all callers of `UUID::to_sstring()`
to `fmt::to_string()`, so that we don't depend on
`FMT_DEPRECATED_OSTREAM` and {fmt} v9 anymore.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2024-03-26 13:38:37 +08:00
Benny Halevy
136df58cbc data_value: delete data_value(T*) constructor
Currently, since the data_value(bool) ctor
is implicit, pointers of any kind are implicitly
convertible to data_value via intermediate conversion
to `bool`.

This is error prone, since it allows unsafe comparison
between e.g. an `sstring` with `some*` by implicit
conversion of both sides to `data_value`.

For example:
```
    sstring name = "dc1";
    struct X {
        sstring s;
    };
    X x(name);
    auto p = &x;
    if (name == p) {}
```
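
One way to reject such conversions at compile time is a deleted constructor template. The sketch below uses a hypothetical `toy_value` class with only a few constructors; it is not the real `data_value`, and the `const char*` overload is an assumption added so that string literals keep working.

```cpp
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical stand-in for data_value, reduced to three constructors.
class toy_value {
    std::string _repr;
public:
    toy_value(bool b) : _repr(b ? "true" : "false") {}
    toy_value(std::string s) : _repr(std::move(s)) {}
    // A non-template overload wins over the deleted template for string literals.
    toy_value(const char* s) : _repr(s) {}
    // Any other pointer would otherwise silently convert to bool; reject it.
    template <typename T>
    toy_value(T*) = delete;

    const std::string& repr() const { return _repr; }
};

static_assert(std::is_constructible_v<toy_value, bool>);
static_assert(!std::is_constructible_v<toy_value, int*>);
static_assert(!std::is_convertible_v<const std::string*, toy_value>);
```

Because the deleted template is an exact match for pointer arguments, overload resolution selects it ahead of the pointer-to-`bool` conversion, and the call fails to compile.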

Refs #17261

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>

Closes scylladb/scylladb#17262
2024-02-11 15:42:55 +02:00
Kurashkin Nikita
7ce9a3e9e5 cql: add limits for integer values when creating date type
Added a simple check that prevents entering int values that lead to
overflow when creating a date type.

Fixes #17066

Closes scylladb/scylladb#17102
2024-02-08 00:08:01 +02:00
Botond Dénes
53a11cba62 Merge 'types/types.cc: move stringstream content instead of copying it' from Patryk Wróbel
C++20 introduced a new overload of std::ostringstream::str() that is selected when the mentioned member function is called on r-value.

The new overload returns a string, that is move-constructed from the underlying string instead of being copy-constructed.

This change applies std::move() on stringstream objects before calling str() member function to avoid copying of the underlying buffer.

It also removes a helper function `inet_addr_type_impl::to_sstring()` - it was used only in two places. It was replaced with `fmt::to_string()`.

Closes scylladb/scylladb#16991

* github.com:scylladb/scylladb:
  use fmt::to_string() for seastar::net::inet_address
  types/types.cc: move stringstream content instead of copying it
2024-02-06 13:11:41 +02:00
Kefu Chai
6f07d9edaa types: use {fmt} to format boolean
{fmt} has formatted booleans as "true" / "false" since v2.0.1, so there is no need to
reinvent the wheel.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2024-02-06 10:40:02 +08:00
Kefu Chai
be29556955 types: use {fmt} to format time
so we can tighten our dependencies a little bit. there are only
three places where we are using the `date` library. the outputs
of these two ways are identical:
see https://wandbox.org/permlink/Lo9NUrQNUEqyiMEa and https://godbolt.org/z/YEha9ah7v to compare their outputs.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2024-02-06 10:39:30 +08:00
Patryk Wrobel
cc186c1798 use fmt::to_string() for seastar::net::inet_address
This change removes inet_addr_type_impl::to_sstring()
and replaces its usages with fmt::to_string().
The removed helper performed an unneeded copy via
std::ostringstream::str().

Signed-off-by: Patryk Wrobel <patryk.wrobel@scylladb.com>
2024-02-05 16:56:40 +01:00
Patryk Wrobel
8c0d30cd88 types/types.cc: move stringstream content instead of copying it
C++20 introduced a new overload of std::ostringstream::str()
that is selected when the mentioned member function is called
on r-value.

The new overload returns a string, that is move-constructed
from the underlying string instead of being copy-constructed.

This change applies std::move() on stringstream objects before
calling str() member function to avoid copying of the underlying
buffer.
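
A minimal sketch of the pattern, using a hypothetical helper `join_with_colon()` that is not part of types.cc. The move only takes effect when compiled as C++20 or later; with older standards the same expression still compiles but copies the buffer.

```cpp
#include <sstream>
#include <string>
#include <utility>

// Hypothetical helper illustrating the pattern described above.
std::string join_with_colon(int a, int b) {
    std::ostringstream os;
    os << a << ':' << b;
    // std::move(os) makes the stream an rvalue, so the C++20 overload
    // str() && is selected and the underlying string is moved out.
    // Plain os.str() would return a copy of the buffer.
    return std::move(os).str();
}
```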

Signed-off-by: Patryk Wrobel <patryk.wrobel@scylladb.com>
2024-02-05 16:35:27 +01:00
Kefu Chai
a1dcddd300 utils: do not include unused headers
these unused includes were identified by clangd. see
https://clangd.llvm.org/guides/include-cleaner#unused-include-warning
for more details on the "Unused include" warning.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#16833
2024-01-18 12:50:06 +02:00
Kefu Chai
f5d1836a45 types: fix indent
f344e130 failed to get the indent right, so fix it.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#16834
2024-01-18 09:14:39 +02:00
Kefu Chai
f344e13066 types: add formatter for data_value
before this change, we rely on the default-generated fmt::formatter
created from operator<<, but fmt v10 dropped the default-generated
formatter.

in this change, we define a formatter for data_value, but
its operator<<() is preserved as we are still using the generic
homebrew formatter for formatting std::vector, which in turn uses
operator<< of the element type.

Refs #13245

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#16767
2024-01-15 13:18:23 +02:00
Lakshmi Narayanan Sreethar
cd9e027047 types: fix ambiguity in align_up call
Compilation fails with recent boost versions (>=1.79.0) due to an
ambiguity with the align_up function call. Fix that by adding type
inference to the function call.

Fixes #16746

Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com>

Closes scylladb/scylladb#16747
2024-01-12 10:50:31 +02:00
Benny Halevy
6123dc6b09 query_processor: execute_internal: support unset values
Add overloads for execute_internal and friends
accepting a vector of optional<data_value>.

The caller can pass nullopt for any unset value.
The vector of optionals is translated internally to
`cql3::raw_value_vector_with_unset` by `make_internal_options`.

This path will be called by system_keyspace::update_peer_info
for updating a subset of the system.peers columns.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2023-12-31 18:21:35 +02:00
Benny Halevy
328ce23c78 types: add data_value_list
data_value_list is a wrapper around std::initializer_list<data_value>.
Use it for passing values to `cql3::query_processor::execute_internal`
and friends.

A following patch will add a std::variant for data_value_or_unset
and extend data_value_list to support unset values.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2023-12-31 18:17:27 +02:00
Kefu Chai
db9e314965 treewide: apply codespell to the comments in source code
for fewer spelling errors in comments.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#16408
2023-12-20 10:25:03 +02:00
Kefu Chai
efd65aebb2 build: cmake: add check-header target
to have feature parity with `configure.py`. we won't need this
once we migrate to C++20 modules. but before that day comes, we
need to stick with C++ headers.

we generate a rule for each .hh file to create a corresponding
.cc and then compile it, in order to verify the self-containedness of
that header. so the number of rules is quite large. to avoid the
unnecessary overhead, the check-header target is enabled only if
the `Scylla_CHECK_HEADERS` option is enabled.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#15913
2023-11-13 10:27:06 +02:00
Kefu Chai
80c656a08b types: use more readable error message when serializing non-ASCII string
before this change, we print

marshaling error: Value not compatible with type org.apache.cassandra.db.marshal.AsciiType: '...'

but the wording is not quite user friendly, it is a mapping of the
underlying implementation, user would have difficulty understanding
"marshaling" and/or "org.apache.cassandra.db.marshal.AsciiType"
when reading this error message.

so, in this change

1. change the error message to:
     Invalid ASCII character in string literal: '...'
   which should be more straightforward, and easier to digest.
2. update the test accordingly

please note, the quoted non-ASCII string is preserved instead of
being printed in hex, as otherwise user would not be able to map it
with his/her input.
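
a rough sketch of such a check, assuming a hypothetical helper `ascii_error()` that returns the message instead of throwing. it is not the code changed by this commit; only the message text is taken from the description above.

```cpp
#include <optional>
#include <string>
#include <string_view>

// Hypothetical helper: returns an error message if `s` contains a byte
// outside the 7-bit ASCII range, or std::nullopt if the string is valid.
std::optional<std::string> ascii_error(std::string_view s) {
    for (unsigned char c : s) {
        if (c > 0x7f) {
            // The offending string is echoed verbatim (not hex-encoded),
            // so that users can map it back to their input.
            return "Invalid ASCII character in string literal: '" + std::string(s) + "'";
        }
    }
    return std::nullopt;
}
```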

Refs #14320
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#15678
2023-10-20 09:25:44 +03:00
Raphael S. Carvalho
2a81b2e49a types: Avoid unneeded copy in simple_date_type_impl::from_sstring()
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>

Closes scylladb/scylladb#15645
2023-10-06 11:05:27 +03:00
Nadav Har'El
d9c2cd3024 cql: implement missing type functions for "counters" type
types.cc had eight of its functions unimplemented for the "counters"
types, throwing an "unimplemented::cause::COUNTERS" when used.
A ninth function (validate) was unimplemented for counters but did not
even throw.
Many code paths did not use any of these functions so didn't care, but
some do - e.g., the silly do-nothing "SELECT CAST(c AS counter)" when
c is already a counter column, which causes this operation to fail.

When the types.cc code encounters a counter value, it is (if I understand
it correctly) already a single uint64_t ("long_type") value, so we fall
back to the long_type implementation of all the functions. To avoid mistakes,
I simply copied the reversed_type implementation for all these functions -
whereas the reversed_type implementation falls back to using the underlying
type, the counter_type implementation always falls back to long_type.

After this patch, "SELECT CAST(c AS counter)" for a counter column works.
We'll introduce a test that verifies this (and other things) in a later
patch in this series.

The following patches will also need more of these functions to be
implemented correctly (e.g., blobascounter() fails to validate the size
of the input blob if the validate function isn't implemented for the
counter type).

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2023-07-30 20:16:25 +03:00
Alexey Novikov
ff721ec3e3 make timestamp string format cassandra compatible
when we convert timestamp into string it must look like: '2017-12-27T11:57:42.500Z'
it concerns any conversion except JSON timestamp format
JSON string has space as time separator and must look like: '2017-12-27 11:57:42.500Z'
both formats always contain milliseconds and timezone specification
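
a minimal sketch of the two layouts, assuming the timestamp is held as milliseconds since the Unix epoch in UTC. `format_timestamp_ms()` and its `separator` parameter are hypothetical and only illustrate the expected output, they are not the functions touched by this change.

```cpp
#include <cstdint>
#include <cstdio>
#include <ctime>
#include <string>

// Hypothetical helper: renders a millisecond timestamp as
// YYYY-MM-DD<separator>hh:mm:ss.mmmZ, always with milliseconds and 'Z'.
// Pass 'T' for the regular string form and ' ' for the JSON form.
std::string format_timestamp_ms(std::int64_t ms_since_epoch, char separator) {
    // Floor division keeps the millisecond part non-negative for
    // timestamps before 1970.
    std::int64_t secs = ms_since_epoch / 1000;
    std::int64_t millis = ms_since_epoch % 1000;
    if (millis < 0) {
        millis += 1000;
        --secs;
    }
    std::time_t t = static_cast<std::time_t>(secs);
    // std::gmtime() is not thread-safe; acceptable for a sketch.
    std::tm utc = *std::gmtime(&t);
    char buf[64];
    std::snprintf(buf, sizeof(buf), "%04d-%02d-%02d%c%02d:%02d:%02d.%03dZ",
                  utc.tm_year + 1900, utc.tm_mon + 1, utc.tm_mday, separator,
                  utc.tm_hour, utc.tm_min, utc.tm_sec, static_cast<int>(millis));
    return buf;
}
```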

Fixes #14518
Fixes #7997

Closes #14726
2023-07-27 12:01:09 +03:00
Kefu Chai
bab16eb30e treewide: remove #includes not use directly
for faster build times and clear inter-module dependencies, we
should not #include headers that are not directly used. instead, we should
only #include the headers directly used by a certain compilation
unit.

in this change, the source files under "/compaction" directories
are checked using clangd, which identifies the cases where we have
an #include which is not directly used. all the #includes identified
by clangd are removed. because some source files rely on the incorrectly
included header file, those ones are updated to #include the header
file they directly use.

if a forward declaration suffices, the declaration is added instead.

see also https://clangd.llvm.org/guides/include-cleaner#unused-include-warning

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2023-07-18 17:36:31 +08:00
Jan Ciolek
464437ef90 types/user: modify idx_of_field to use bytes_view
Let's change the argument type from `bytes`
to `bytes_view`. Sometimes it's possible to get
an instance of `bytes_view`, but getting `bytes`
would require a copy, which is wasteful.

Taking a `bytes_view` avoids such copies.

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2023-06-16 01:11:31 +02:00
Jan Ciolek
ab1ba497b5 types: add read_nth_user_type_field()
Add a function which can be used to read the nth
field of a serialized UDT value.

We could deserialize the whole value and then choose
one of the deserialized fields, but that would be wasteful.
Sometimes we only need the value of one field, not all of them.

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2023-06-16 01:11:30 +02:00
Jan Ciolek
5fce4d9675 types: add read_nth_tuple_element()
Add a function which retrieves the value of nth
field from a serialized tuple value.

I tried to make it as efficient as possible.
Other functions, like evaluate(subscript) tend to
deserialize the whole structure and put all of its
elements in a vector. Then they select a single element
from this vector.
This is wasteful, as we only need a single element's value.

This function goes over the serialized fields
and directly returns the one that is needed.
No allocations are needed.
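
A minimal sketch of the idea, assuming each element is encoded as a 4-byte big-endian length followed by that many bytes, with a negative length meaning NULL. `read_nth_element()` below is a hypothetical stand-in operating on a contiguous `std::string_view`; the real function and its buffer types may differ.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string_view>

// Hypothetical sketch: returns a view of the n-th element's bytes without
// copying or building a vector of all elements. Returns std::nullopt when
// the element is NULL, absent, or the input is truncated.
std::optional<std::string_view> read_nth_element(std::string_view serialized, std::size_t n) {
    for (std::size_t i = 0;; ++i) {
        if (serialized.size() < 4) {
            return std::nullopt; // ran out of elements
        }
        std::uint32_t raw = 0;
        for (int b = 0; b < 4; ++b) {
            raw = (raw << 8) | static_cast<unsigned char>(serialized[b]);
        }
        const auto len = static_cast<std::int32_t>(raw);
        serialized.remove_prefix(4);
        if (len < 0) { // NULL element: no payload follows
            if (i == n) {
                return std::nullopt;
            }
            continue;
        }
        const auto size = static_cast<std::size_t>(len);
        if (serialized.size() < size) {
            return std::nullopt; // truncated input
        }
        if (i == n) {
            return serialized.substr(0, size);
        }
        serialized.remove_prefix(size); // skip this element's payload
    }
}
```

The loop only advances a view over the input, so elements before the requested one are skipped rather than deserialized.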

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2023-06-14 07:22:39 +02:00
Avi Kivity
42a1ced73b cql3: result_set: switch cell data type from bytes_opt to managed_bytes_opt
The expression system uses managed_bytes_opt for values, but result_set
uses bytes_opt. This means that processing values from the result set
in expressions requires a copy.

Out of the two, managed_bytes_opt is the better choice, since it prevents
large contiguous allocations for large blobs. So we switch result_set
to use managed_bytes_opt. Users of the result_set API are adjusted.

The db::function interface is not modified to limit churn; instead we
convert the types on entry and exit. This will be adjusted in a following
patch.
2023-05-07 17:17:36 +03:00
Avi Kivity
d3e9fd49a3 types: abstract_type: add mixed-type versions of compare() and equal()
compare() and equal() can compare two unfragmented values or two
fragmented values, but a mix of a fragmented value and an unfragmented
value runs afoul of C++ conversion rules. Add more overloads to
make it simpler for users.
2023-05-07 17:17:36 +03:00
Benny Halevy
935ff0fcbb types: timestamp_from_string: print current_exception on error
We may catch exceptions that are not `marshal_exception`.
Print std::current_exception() in this case to provide
some context about the marshalling error.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>

Closes #13693
2023-04-27 22:30:55 +03:00
Kefu Chai
f5b05cf981 treewide: use defaulted operator!=() and operator==()
in C++20, the compiler generates operator!=() if the corresponding
operator==() is already defined; the language now understands
that the comparison is symmetric in the new standard.

fortunately, our operator!=() is always equivalent to
`! operator==()`, this matches the behavior of the default
generated operator!=(). so, in this change, all `operator!=`
are removed.

in addition to the defaulted operator!=, C++20 also brings to us
the defaulted operator==() -- the compiler is able to generate the
operator==() as the member-wise lexicographical comparison.
under some circumstances, this is exactly what we need. so,
in this change, if the operator==() is also implemented as
a lexicographical comparison of all member variables of the
class/struct in question, it is implemented using the default
generated one by removing its body and marking the function as
`default`. moreover, if the class happens to have other comparison
operators which are implemented using lexicographical comparison,
the default generated `operator<=>` is used in place of
the defaulted `operator==`.

sometimes, we fail to mark the operator== with the `const`
specifier, in this change, to fulfil the need of C++ standard,
and to be more correct, the `const` specifier is added.

also, to generate the defaulted operator==, the operand should
be `const class_name&`, but it is not always the case, in the
class of `version`, we use `version` as the parameter type, to
fulfill the need of the C++ standard, the parameter type is
changed to `const version&` instead. this does not change
the semantic of the comparison operator. and is a more idiomatic
way to pass non-trivial struct as function parameters.

please note, because in C++20, both operator== and operator<=> are
symmetric, some of the operators in `multiprecision` are removed.
they are the symmetric form of another variant. if they were
not removed, the compiler would, for instance, find an ambiguous
overloaded operator '=='.

this change is a cleanup to modernize the code base with C++20
features.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes #13687
2023-04-27 10:24:46 +03:00
Kefu Chai
a2aa133822 treewide: use std::lexicographical_compare_threeway
the standard library offers
`std::lexicographical_compare_three_way()`, and we never use the
last two additional parameters, which are not provided by
`std::lexicographical_compare_three_way()`. there is no need to have
the homebrew version of the trichotomic compare function.

in this change,

* all occurrences of `lexicographical_tri_compare()` are replaced
  with `std::lexicographical_compare_three_way()`.
* `lexicographical_tri_compare()` is dropped.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes #13615
2023-04-21 14:28:18 +03:00
Kefu Chai
6bb32efac0 utils: big_decimal: replace compare() with <=> operator
now that we are using C++20, it'd be more convenient if we can use
the <=> operator for comparing. the compiler creates the 6 other
operators for us if the <=> operator is defined. so the code is more
compact.

in this change, `big_decimal::compare()` is replaced with `operator<=>`,
and its caller is updated accordingly.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2023-04-15 12:52:30 +08:00
Nadav Har'El
d26bb8c12d Merge 'tree: migrate from std::regex to boost::regex' from Botond Dénes
Except for where usage of `std::regex` is required by 3rd party library interfaces.
As demonstrated countless times, std::regex's practice of using recursion for pattern matching can result in stack overflow, especially on AARCH64. The most recent incident happened after merging https://github.com/scylladb/scylladb/pull/13075, which (indirectly) uses `sstables::make_entry_descriptor()` to test whether a certain path is a valid scylla table path in a trial-and-error manner. This resulted in stacks blowing up in AARCH64.
To prevent this, use the already tried and tested method of switching from `std::regex` to `boost::regex`. Don't wait until each of the `std::regex` sites explode, replace them all preemptively.

Refs: https://github.com/scylladb/scylladb/issues/13404

Closes #13452

* github.com:scylladb/scylladb:
  test: s/std::regex/boost::regex/
  utils: s/std::regex/boost::regex/
  db/commitlog: s/std::regex/boost::regex/
  types: s/std::regex/boost::regex/
  index: s/std::regex/boost::regex/
  duration.cc: s/std::regex/boost::regex/
  cql3: s/std::regex/boost::regex/
  thrift: s/std::regex/boost::regex/
  sstables: use s/std::regex/boost::regex/
2023-04-09 18:47:41 +03:00
Botond Dénes
712889c99f types: s/std::regex/boost::regex/
The former is prone to producing stack overflows, as it uses recursion in
its match implementation.

The migration is entirely mechanical for the most part.
escape() needs some special treatment; it looks like boost::regex wants
a double-escaped backslash.
2023-04-06 09:50:45 -04:00
Botond Dénes
00f06522c2 types/user: add get_name() accessor
For the raw name (bytes).
2023-03-27 01:44:00 -04:00
Kefu Chai
e796525f23 types: remove unused header
<iterator> was introduced back in
1cf02cb9d8, but lexicographical_compare.hh
was extracted out in bdfc0aa748, since we
don't have any users of <iterator> in types.hh anymore, let's remove it.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes #13327
2023-03-26 16:55:16 +03:00
Avi Kivity
bdfc0aa748 utils, types, test: extract lexicographical compare utilities
UUID_test uses lexicographical_compare from the types module. This
is a layering violation, since UUIDs are at a much lower level than
the database type system. In practical terms, this causes link failures
with gcc due to some thread-local-storage variables defined in types.hh
but not provided by any object, since we don't link with types.o in this
test.

Fix by extracting the relevant functions into a new header.
2023-03-21 15:42:53 +02:00
Kefu Chai
c37f4e5252 treewide: use fmt::join() when appropriate
now that fmtlib provides fmt::join(). see
https://fmt.dev/latest/api.html#_CPPv4I0EN3fmt4joinE9join_viewIN6detail10iterator_tI5RangeEEN6detail10sentinel_tI5RangeEEERR5Range11string_view
there is no need to reinvent the wheel. so in this change, the homebrew
join() is replaced with fmt::join().

as fmt::join() returns a join_view, this could improve the
performance under certain circumstances where the fully materialized
string is not needed.

please note, the goal of this change is to use fmt::join(), and this
change does not intend to improve the performance of existing
implementation based on "operator<<" unless the new implementation is
much more complicated. we will address the unnecessarily materialized
strings in a follow-up commit.

some noteworthy things related to this change:

* unlike the existing `join()`, `fmt::join()` returns a view. so we
  have to materialize the view if what we expect is a `sstring`
* `fmt::format()` does not accept a view, so we cannot pass the
  return value of `fmt::join()` to `fmt::format()`
* fmtlib does not format a typed pointer, i.e., it does not format,
  for instance, a `const std::string*`. but operator<<() always print
  a typed pointer. so if we want to format a typed pointer, we either
  need to cast the pointer to `void*` or use `fmt::ptr()`.
* fmtlib is not able to pick up the overload of
  `operator<<(std::ostream& os, const column_definition* cd)`, so we
  have to use a wrapper class of `maybe_column_definition` for printing
  a pointer to `column_definition`. since the overload is only used
  by the two overloads of
  `statement_restrictions::add_single_column_parition_key_restriction()`,
  the operator<< for `const column_definition*` is dropped.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2023-03-16 20:34:18 +08:00
Avi Kivity
6aa91c13c5 Merge 'Optimize topology::compare_endpoints' from Benny Halevy
The code for compare_endpoints originates at the dawn of time (bc034aeaec)
and is called on the fast path from storage_proxy via `sort_by_proximity`.

This series considerably reduces the function's footprint by:
1. carefully coding the many comparisons in the function so as to reduce the number of conditional branches (apparently the compiler isn't doing a good enough job at optimizing it in this case)
2. avoiding sstring copies in topology::get_{datacenter,rack}

Closes #12761

* github.com:scylladb/scylladb:
  topology: optimize compare_endpoints
  to_string: add print operators for std::{weak,partial}_ordering
  utils: to_sstring: deinline std::strong_ordering print operator
  move to_string.hh to utils/
  test: network_topology: add test_topology_compare_endpoints
2023-03-07 15:17:19 +02:00
Avi Kivity
3042deb930 types: reimplement in terms of a variable template
data_type_for() is a function template that converts a C++
type to a database dynamic type (data_type object).

Instead of implementing a function per type, implement a variable
template instance. This is shorter and nicer.

Since the original type variables (e.g. long_type) are defined separately,
use a reference instead of copying to avoid initialization order problems.

To catch misuses of data_type_for the general data_type_for_v variable
template maps to some unused tag type which will cause a build error
when instantiated.
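
A minimal sketch of this layout, assuming simplified stand-ins: `toy_type`, the two type objects, and `no_data_type_for_this_cpp_type` are hypothetical names, and the real code uses `data_type` objects rather than plain structs.

```cpp
#include <cstdint>
#include <string_view>

// Hypothetical stand-in for a database type object.
struct toy_type {
    std::string_view name;
};

// The "original type variables", defined separately.
inline const toy_type long_type{"bigint"};
inline const toy_type boolean_type{"boolean"};

// Unused tag type: the general template maps to it, so using an unsupported
// C++ type fails to compile in data_type_for() below.
struct no_data_type_for_this_cpp_type {};

template <typename T>
inline constexpr no_data_type_for_this_cpp_type data_type_for_v{};

// Each supported C++ type gets a specialization that is a reference to the
// existing variable, so nothing is copied during static initialization.
template <>
inline const toy_type& data_type_for_v<std::int64_t> = long_type;
template <>
inline const toy_type& data_type_for_v<bool> = boolean_type;

template <typename T>
const toy_type& data_type_for() {
    return data_type_for_v<T>; // ill-formed for the tag type
}
```

Here `data_type_for<float>()` would not compile, because the tag type cannot bind to `const toy_type&`.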

The original motivation for this was to allow for partial
specialization of data_type_for() for tuple types, but this isn't
really workable since the native type for tuples is std::vector<data_value>,
not std::tuple, and I only checked this after getting the work done,
so this isn't helping anything; it's just a little nicer.

Closes #13043
2023-03-01 11:25:39 +02:00
Botond Dénes
ef548e654d types: unserialize_value for multiprecision_int,bool: don't read uninitialized memory
Check the first fragment before dereferencing it, the fragment might be
empty, in which case move to the next one.
Found by running range scan tests with random schema and random data.
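
A minimal sketch of the check, assuming a fragmented value modelled as a vector of `std::string_view` fragments. `first_byte()` is a hypothetical helper and not the `unserialize_value` code itself.

```cpp
#include <optional>
#include <string_view>
#include <vector>

// Hypothetical helper: returns the first byte of a fragmented value,
// skipping empty fragments instead of dereferencing them.
std::optional<unsigned char> first_byte(const std::vector<std::string_view>& fragments) {
    for (std::string_view fragment : fragments) {
        if (!fragment.empty()) {
            return static_cast<unsigned char>(fragment.front());
        }
        // Empty fragment: reading front() here would touch memory that
        // does not belong to the value, so move on to the next one.
    }
    return std::nullopt; // the whole value is empty
}
```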

Fixes: #12821
Fixes: #12823
Fixes: #12708

Closes #12824
2023-02-21 17:39:18 +02:00
Kefu Chai
df63e2ba27 types: move types.{cc,hh} into types
they are part of the CQL type system, and are "closer" to types.
let's move them into "types" directory.

the building systems are updated accordingly.

the source files referencing `types.hh` were updated using the following
command:

```
find . -name "*.{cc,hh}" -exec sed -i 's/\"types.hh\"/\"types\/types.hh\"/' {} +
```

the source files under sstables include "types.hh", which is
indeed the one located under "sstables", so include "sstables/types.hh"
instead, so it's more explicit.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes #12926
2023-02-19 21:05:45 +02:00
Avi Kivity
69a385fd9d Introduce schema/ module
Schema related files are moved there. This excludes schema files that
also interact with mutations, because the mutation module depends on
the schema. Those files will have to go into a separate module.

Closes #12858
2023-02-15 11:01:50 +02:00
Avi Kivity
390a0ca47b types: allow lists with NULL
Allow transient lists that contain NULL throughout the
evaluation machinery. This makes it possible to evaluate things
like `IF col IN (1, 2, NULL)` without hacks, once LWT conditions
are converted to expressions.

A few tests are relaxed to accommodate the new behavior:
 - cql_query_test's test_null_and_unset_in_collections is relaxed
   to allow `WHERE col IN ?`, with the variable bound to a list
   containing NULL; now it's explicitly allowed
 - expr_test's evaluate_bind_variable_validates_no_null_in_list was
   checking generic lists for NULLs, and was similarly relaxed (and
   renamed)
 - expr_test's evaluate_bind_variable_validates_null_in_lists_recursively
   was similarly relaxed to allow NULLs.
2023-01-18 10:38:24 +02:00
Avi Kivity
00145f9ada test: relax NULL check test predicate
When we start allowing NULL in lists in some contexts, the exact
location where an error is raised (when it's disallowed) will
change. To prepare for that, relax the exception check to just
ensure the word NULL is there, without caring about the exact
wording.
2023-01-18 10:38:24 +02:00
Avi Kivity
5f8540ecfa cql3, types: validate listlike collections (sets, lists) for storage
Lists allow NULL in some contexts (bind variables for LWT "IN ?"
conditions), but not in most others. Currently, the implementation
just disallows NULLs in list values, and the cases where it is allowed
are hacked around. To reduce the special cases, we'll allow lists
to have NULLs, and just restrict them for storage. This is similar
to how scalar values can be NULL, but not when they are part of a
partition key.

To prepare for the transition, identify the locations where lists
(and sets, which share the same storage) are stored as frozen
values and add a NULL check there. Non-frozen lists already have the
check. Since sets share the same format as lists, apply the same to
them.

No actual checks are done yet, since NULLs are impossible. This
is just a stub.
2023-01-18 10:38:24 +02:00
Avi Kivity
2739ac66ed treewide: drop cql_serialization_format
Now that we don't accept cql protocol version 1 or 2, we can
drop cql_serialization_format everywhere, except in the IDL
(since it's part of the inter-node protocol).

A few functions had duplicate versions, one with and one without
a cql_serialization_format parameter. They are deduplicated.

Care is taken that `partition_slice`, which communicates
the cql_serialization_format across nodes, still presents
a valid cql_serialization_format to other nodes when
transmitting itself and rejects protocol 1 and 2 serialization
formats when receiving. The IDL is unchanged.

One test checking the 16-bit serialization format is removed.
2023-01-03 19:54:13 +02:00
Michał Jadwiszczak
29ad5a08a8 implement keyspace_element interface
This patch implements the `data_dictionary::keyspace_element` interface
in: `keyspace_metadata`, `user_type_impl`, `user_function`,
`user_aggregate` and schema.
2022-12-10 12:34:09 +01:00