Commit Graph

36269 Commits

Author SHA1 Message Date
Kefu Chai
ca6ebbd1f0 cql3, db: sstable: specialize fmt::formatter<function_name>
this is a part of a series to migrating from `operator<<(ostream&, ..)`
based formatting to fmtlib based formatting. the goal here is to enable
fmtlib to print `function_name` without the help of `operator<<`.

the corresponding `operator<<()` are dropped dropped in this change,
as all its callers are now using fmtlib for formatting now.

Refs #13245

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes #13608
2023-04-21 10:07:28 +03:00
Botond Dénes
d74f3598f4 Merge 'dht: specialize fmt::formatter<dht::token>' from Kefu Chai
this is a part of a series to migrating from `operator<<(ostream&, ..)` based formatting to fmtlib based formatting. the goal here is to enable fmtlib to print `dht::token` without the help of `operator<<`.

the corresponding `operator<<()` is preserved in this change, as it has lots of users in this project, we will tackle them case-by-case in follow-up changes.

also, the forward declaration of `operator<<(ostream&, constdht::token&)` in `dht/i_partitioner.hh` is removed. ias it not necessary.

Refs https://github.com/scylladb/scylladb/issues/13245

Closes #13610

* github.com:scylladb/scylladb:
  dht: remove unnecessarily forward declaration
  dht: specialize fmt::formatter<dht::token>
2023-04-21 09:51:25 +03:00
Kefu Chai
c5fa1ac9f7 sstable: specialize fmt::formatter<component_type>
this is a part of a series to migrating from `operator<<(ostream&, ..)`
based formatting to fmtlib based formatting. the goal here is to enable
fmtlib to print `component_type` without the help of `operator<<`.

the corresponding `operator<<()` are dropped dropped in this change,
as all its callers are now using fmtlib for formatting now.

also, please note, to enable fmtlib to format `std::set<component_type>`
in `test/boost/sstable_3_x_test.cc` , we need to include
`<fmt/ranges.h>` in that source file.

Refs #13245

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes #13598
2023-04-21 09:49:24 +03:00
Kefu Chai
9215adee46 streaming: specialize fmt::formatter<stream_reason>
this is a part of a series to migrating from `operator<<(ostream&, ..)`
based formatting to fmtlib based formatting. the goal here is to enable
fmtlib to print `stream_reason` without the help of `operator<<`.

please note, because we still cannot use the generic formatter for
std::unordered_map provided by fmtlib, so in order to drop `operator<<`
for `stream_reason`, and to print `unordered_map<stream_reason>`,
`fmt::join()` is used as a temporary solution. we will audit all
`fmt::join()` calls, after removing the homebrew formatter of
`std::unordered_map`.

the corresponding `operator<<()` are dropped dropped in this change,
as all its callers are now using fmtlib for formatting now.

Refs #13245

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes #13609
2023-04-21 09:44:23 +03:00
Kefu Chai
ecb5380638 treewide: s/boost::lexical_cast<std::string>/fmt::to_string()/
this change replaces all occurrences of `boost::lexical_cast<std::string>`
in the source tree with `fmt::to_string()`. for couple reasons:

* `boost::lexical_cast<std::string>` is longer than `fmt::to_string()`,
  so the latter is easier to parse and read.
* `boost::lexical_cast<std::string>` creates a stringstream under the
  hood, so it can use the `operator<<` to stringify the given object.
  but stringstream is known to be less performant than fmtlib.
* we are migrating to fmtlib based formatting, see #13245. so
  using `fmt::to_string()` helps us to remove yet another dependency
  on `operator<<`.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes #13611
2023-04-21 09:43:53 +03:00
Nadav Har'El
9c3907bb3c test/cql-pytest: reproducers for incorrect AVG of "decimal" type
This patch contains tests reproducing issue #13601 and the corresponding
Cassandra issue CASSANDRA-18470. These issues are about what the AVG
aggregation does for arbitrary-precision "decimal" numbers - the tests
we add here show examples where the current behavior doesn't make sense:

The problem is that "decimal" has arbitrary precision - so, should an
average of 1/3 be returned as 0.3 or 0.33333333333333333? This is not
specified, so Scylla (and Cassandra) decided to pick the result precision
based on the input precision. In particular, the average of 1 and 2
is returned as 2 (zero digits after the decimal point, like in the
inputs) instead of the expected 1.5. Arguably this isn't useful behavior.

The test adds a second test which fails on Cassandra, but does pass
on Scylla: Cassandra returns as the average of 1, 2, 2, 3 the integer 1
whereas the correct average is 2 (and Scylla returns it correctly).
The reason why this bug is even worse on Cassandra is that Scylla's AVG
only loses precision when dividing the sum and count, but Cassandra
tries to maintain only the average, and loses precision at every step.

Refs #13601

Signed-off-by: Nadav Har'El <nyh@scylladb.com>

Closes #13603
2023-04-21 08:32:30 +03:00
Kefu Chai
7b21bfd36e mutation: specialize fmt::formatter<apply_resume>
this is a part of a series to migrating from `operator<<(ostream&, ..)`
based formatting to fmtlib based formatting. the goal here is to enable
fmtlib to print `apply_resume` without the help of `operator<<`.

the corresponding `operator<<()` are dropped dropped in this change,
as all its callers are now using fmtlib for formatting now.

Refs #13245

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes #13584
2023-04-21 08:27:57 +03:00
Benny Halevy
77b70dbdb7 sstables: compressed_file_data_source_impl: get: throw malformed_sstable_exception on premature eof
Currently, the reader might dereference a null pointer
if the input stream reaches eof prematurely,
and read_exactly returns an empty temporary_buffer.

Detect this condition before dereferencing the buffer
and sstables::malformed_sstable_exception.

Fixes #13599

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>

Closes #13600
2023-04-21 07:56:58 +03:00
Botond Dénes
d828cfcb23 Merge 'db, cql3: functions: switch argument passing to std::span' from Avi Kivity
Database functions currently receive their arguments as an std::vector. This
is inflexible (for example, one cannot use small_vector to reduce allocations).

This series adapts the function signature to accept parameters using std::span.
Some changes in the keys interface are needed to support this. Lastly, one call
site is migrated to small_vector.

This is in support of changing selectors to use expressions.

Closes #13581

* github.com:scylladb/scylladb:
  cql3: abstract_function_selector: use small_vector for argument buffer
  db, cql3: functions: pass function parameters as a span instead of a vector
  keys: change from_optional_exploded to accept a span instead of a vector
2023-04-21 06:49:07 +03:00
Kefu Chai
fe9f41bd84 dht: remove unnecessarily forward declaration
it turns out the declaration of `operator<<(ostream&, const
dht::token&)` is unnecessarily. so let's drop it.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2023-04-21 11:41:54 +08:00
Kefu Chai
53dedca8cd dht: specialize fmt::formatter<dht::token>
this is a part of a series to migrating from `operator<<(ostream&, ..)`
based formatting to fmtlib based formatting. the goal here is to enable
fmtlib to print `dht::token` without the help of `operator<<`.

the corresponding `operator<<()` is preserved in this change, as it
has lots of users in this project, we will tackle them case-by-case in
follow-up changes.

Refs #13245

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2023-04-21 11:41:54 +08:00
Avi Kivity
0c64dd12b1 test: raft_server_test: fix string compare for clang 15
Clang 15 rejects string compares where the left-hand-side is a C
string, so help it along by converting it ourselves.

Closes #13582
2023-04-21 06:38:10 +03:00
Botond Dénes
1426c623eb Merge 'Tune up S3 unit tests environment usage (and a bit more)' from Pavel Emelyanov
The tests in question are using MINIO_SERVER_ADDRESS environment variable to export minio server address from pylib to test cases. Also they use hard-coded public bucket name. Both plays badly with AWS S3, the former due to MINIO_... in its name and the latter because public bucket name can be any.

So this PR puts address and public bucket name into S3_..._FOR_TEST environment variables and fixes output stream closure on failure while at it.

Detached from #13493

Closes #13546

* github.com:scylladb/scylladb:
  s3/test: Rename MINIO_SERVER_ADDRESS environment variable
  s3/test: Keep public bucket name in environment
  s3/test: Fix upload stream closure
  test/lib: Add getenv_safe() helper
2023-04-20 18:01:12 +03:00
Pavel Emelyanov
30b6f34a0b s3/client: Explicitly set _upload_id empty when completing
The upload_sink::_upload_id remains empty until upload starts, remains
non-empty while it proceeds, then becomes empty again after it
completes. The upload_started() method cheks that and on .close()
started upload is aborted.

The final switch to empty is done by std::move()ing the upload id into
completion requrest, but it's better to use std::exchange() to emphasize
the fact the the _upload_id becomes empty at that point for a reason.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>

Closes #13570
2023-04-20 17:32:08 +03:00
Avi Kivity
1cd6d59578 Merge 'Remove global proxy usage from view_info::select_statement()' from Pavel Emelyanov
The method needs proxy to get data_dictionary::database from to pass down to select_statement::prepare(). And a legacy bit that can come with data_dictionary::database as well. Fortunately, all the call traces that end up at select_statement() start inside table:: methods that have view_update_generator, or at view_builder::consumer that has reference to view_builder. Both services can share the database reference. However, the call traces in question pass through several code layers, so the PR adds data_dictionary::database to those layers one by one.

Closes #13591

* github.com:scylladb/scylladb:
  view_info: Drop calls to get_local_storage_proxy()
  view_info: Add data_dictionary argument to select_statement()
  view_info: Add data_dictionary argument to partition_slice() method
  view_filter_checking_visitor: Construct with data_dictionary
  view: Carry data_dictionary arg through standalone helpers
  view_updates: Carry data_dictionary argument throug methods
  view_update_builder: Construct with data dictionary
  table: Push view_update_generator arg to affected_views()
  view: Add database getters to v._update_generator and v._builder
2023-04-20 16:40:06 +03:00
Avi Kivity
43a0b40082 Merge 'Remove global proxy usage from API handlers' from Pavel Emelyanov
There are few places in the API handlers that call global proxy for their needs. Most of those places are easy to patch, because proxy is either at http_ctx thing right inside the handler code. Also there's a handler code in view_builder that needs proxy too, but it really needs topology, not proxy, and can get it elsewhere (the handler is coroutinized while at it)

Closes #13593

* github.com:scylladb/scylladb:
  view: Get topology via database tokens
  view: Indentation fix after previous patch
  view: Coroutinuze view_builder::view_build_statuses()
  api: Use ctx.sp in storage service handler
  api,main: Unset storage_proxy API on stop
  api: Use ctx.sp in set_storage_proxy() routes
2023-04-20 16:31:31 +03:00
Botond Dénes
66ee73641e test/cql-pytest/nodetool.py: no_autocompaction_context: use the correct API
This `with` context is supposed to disable, then re-enable
autocompaction for the given keyspaces, but it used the wrong API for
it, it used the column_family/autocompaction API, which operates on
column families, not keyspaces. This oversight led to a silent failure
because the code didn't check the result of the request.
Both are fixed in this patch:
* switch to use `storage_service/auto_compaction/{keyspace}` endpoint
* check the result of the API calls and report errors as exceptions

Fixes: #13553

Closes #13568
2023-04-20 16:21:16 +03:00
Kamil Braun
8d7b5f1710 Merge 'test/pylib: topology fix asyncio fixture and fix logger' from Alecco
Remove unnecessary asyncio marker and re-introduce top level logger instance.

Closes #13561

* github.com:scylladb/scylladb:
  test/pylib: add missing logger
  test/pylib: remove unnecessary asyncio marker
2023-04-20 14:23:05 +02:00
Alejo Sanchez
11561a73cb test/pylib: ManagerClient helpers to wait for...
server to see other servers after start/restart

When starting/restarting a server, provide a way to wait for the server
to see at least n other servers.

Also leave the implementation methods available for manual use and
update previous tests, one to wait for a specific server to be seen, and
one to wait for a specific server to not be seen (down).

Fixes #13147

Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>

Closes #13438
2023-04-20 14:22:31 +02:00
Avi Kivity
342cdb2a63 Update tools/jmx submodule (split Depends line)
* tools/jmx 15fd4ca...fdd0474 (1):
  > dist/debian: split Depends into multiple lines
2023-04-20 15:11:33 +03:00
Pavel Emelyanov
bda2aea5be view: Get topology via database tokens
The view_builder::view_build_statuses() needs topology to walk its
nodes. Now it gets one from global proxy via its token metadata, but
database also has tokens and view_builder has reference to database.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-04-20 13:18:14 +03:00
Pavel Emelyanov
403463d7eb view: Indentation fix after previous patch
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-04-20 13:18:14 +03:00
Pavel Emelyanov
257814f443 view: Coroutinuze view_builder::view_build_statuses()
Easier to patch it this way further.
Indentation is deliberately left broken until next patch.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-04-20 13:17:59 +03:00
Pavel Emelyanov
ece731301c api: Use ctx.sp in storage service handler
Similarly to previous patch, but from another routes group. The storage
service API calls mainly use storage service, but one place needs proxy
to call recalculate_schema_version() with

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-04-20 13:14:52 +03:00
Pavel Emelyanov
21136058bd api,main: Unset storage_proxy API on stop
So that the routes referencing and using ctx.sp don't step on a proxy
that's going to be removed (not now, but some time later) fron under
them on shutdown.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-04-20 13:14:04 +03:00
Pavel Emelyanov
8d490d20dc api: Use ctx.sp in set_storage_proxy() routes
It's already used in many other places, few methods still stick to
global proxy usage.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-04-20 13:12:49 +03:00
Alejo Sanchez
2c1ba377bf test/pylib: add missing logger
The logger instancewas removed in a previous commit but it is used in
the wrapper helper. Add it back.

Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>
2023-04-20 10:36:02 +02:00
Alejo Sanchez
05338a6cd7 test/pylib: remove unnecessary asyncio marker
Remove missing asyncio marker for fixture as this is only needed for
tests.

Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>
2023-04-20 10:36:02 +02:00
Pavel Emelyanov
edcce7d8dd view_info: Drop calls to get_local_storage_proxy()
In both cases the proxy is called to get data_dictionary from. Now its
available as the call argument.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-04-20 11:17:46 +03:00
Pavel Emelyanov
3e4fb7cad6 view_info: Add data_dictionary argument to select_statement()
This method needs data_dictionary to work. Fortunately, all callers of
it already have the dictionary at hand and can just pass it as argument.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-04-20 11:17:46 +03:00
Pavel Emelyanov
4375835cdd view_info: Add data_dictionary argument to partition_slice() method
The caller is calculate_affected_clustering_ranges() with dictionary
arg, the method needs dictionary to call view_info::select_statement()
later.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-04-20 11:17:46 +03:00
Pavel Emelyanov
0aff55cdb2 view_filter_checking_visitor: Construct with data_dictionary
The visitor is wait-free helper for matches_view_filter() that has
dictionary as its argument. Later the visitor will pass the dictionary
to view_info::select_statement().

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-04-20 11:17:46 +03:00
Pavel Emelyanov
837fde84b1 view: Carry data_dictionary arg through standalone helpers
There's a bunch of functions in view.{hh|cc} that don't belong to any
class and perform view-related claculations for view updates. Lots of
them eventually call view_info::select_statement() which will later need
the dictionary.

By now all those methods' callers have data dictionary at hand and can
share it via argument.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-04-20 11:17:46 +03:00
Pavel Emelyanov
1301a99ba3 view_updates: Carry data_dictionary argument throug methods
The goal is to have the dictionary at places that later wrap calls to
view_info::select_statement(). This graph of calls starts at the only
public view_updates::generate_update() method which, in turn, is called
from view_update_builder that already has data dictionary at hand.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-04-20 11:17:46 +03:00
Pavel Emelyanov
9d3d533561 view_update_builder: Construct with data dictionary
The caller is table with view-update-generator at hand (it calls
mutate_MV on). Builder here is used as a temporary object that destroys
once the caller coroutine co_return-s, so keeping the database obtained
from the view-update-generator is safe.

Later the v.u.b. object will propagate its data dictionary down the
callstacks.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-04-20 11:17:38 +03:00
Pavel Emelyanov
4a16ab3bd4 table: Push view_update_generator arg to affected_views()
Caller already has it to call mutate_MV() on. The method in question
will need the generator in one of the next patches.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-04-20 10:42:31 +03:00
Pavel Emelyanov
7ddcd0c918 view: Add database getters to v._update_generator and v._builder
Both services carry database which will be used by auxiliary objects
like view_updates, view_update_builder, consumer, etc in next patches.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-04-20 10:41:16 +03:00
Warren Krewenki
73eaebe338 Remove visible :orphan:
The text `:orphan:` was showing up in the scylla.yaml documentation with no context.

Closes #13524
2023-04-20 08:24:48 +03:00
Avi Kivity
9fb5443f87 cql3: abstract_function_selector: use small_vector for argument buffer
abstract_function_selector uses a preallocated vector to store the
arguments to aggregate functions, to prevent an allocation for every row.
Use small_vector to prevent an allocation per query, if the number of
arguments happens to be small.

This isn't expected to make a significant performance difference.
2023-04-19 20:42:25 +03:00
Avi Kivity
3e0aacc8b5 db, cql3: functions: pass function parameters as a span instead of a vector
Spans are more flexible and can be constructed from any contiguous
container (such as small_vector), or a subrange of such a container.
This can save allocations, so change the signature to accept a span.

Spans cannot be constructed from std::initializer_list, so one such
call site is changed to use construct a span directly from the single
argument.
2023-04-19 20:38:55 +03:00
Avi Kivity
9072763a52 keys: change from_optional_exploded to accept a span instead of a vector
A span is more generic than a vector, and can be constructed
from any contiguous container (like small_vector), or a subset
of a container.

To support this, helpers in compound.hh need to use make_iterator_range,
since a span doesn't fit the container concept (since spans don't own
their contents).

This is needed to make a similar change to function evaluation, as
the token function passes its parameters to from_optional_exploded().
2023-04-19 20:18:50 +03:00
Avi Kivity
6ca1b14488 Update tools/jmx submodule (drop java 8 on debian)
* tools/jmx 3316f7a...15fd4ca (1):
  > dist/debian: drop dependencies on jdk-8
2023-04-19 19:51:03 +03:00
Botond Dénes
0c430c01e9 Merge 'cql: allow SUM() aggregations which result in a NaN' from Nadav Har'El
This short PR fixes a bug in SUM() aggregation where if the data contains +Inf and -Inf the returned sum should be NaN but we returned an error instead. This is a recent regression uncovered by a dtest (see issue #13551), but in the first patch we add additional tests in the cql-pytest framework which reproduce this bug and explore various other areas (wrongly) implicated by the failing dtest.

Fixes #13551

Closes #13564

* github.com:scylladb/scylladb:
  cql3: allow SUM() aggregation to result in a NaN
  test/cql-pytest: add tests for data casts and inf in sums
2023-04-19 13:50:23 +03:00
Pavel Emelyanov
a77ca69360 s3/test: Rename MINIO_SERVER_ADDRESS environment variable
Using it the pylib minio code export minio address for tests. This
creates unneeded WTFs when running the test over AWS S3, so it's better
to rename to variable not to mention MINIO at all.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-04-19 12:51:12 +03:00
Pavel Emelyanov
12c4e7d605 s3/test: Keep public bucket name in environment
Local test.py runs minio with the public 'testbucket' bucket and all
test cases know that. This series adds an ability to run tests over real
S3 so the bucket name should be configurable.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-04-19 12:51:12 +03:00
Pavel Emelyanov
91674da982 s3/test: Fix upload stream closure
If multipart upload fails for some reason the output stream remains not
closed and the respective assertion masquerades the original failure.
Fix that by closing the stream in all cases.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-04-19 12:51:12 +03:00
Pavel Emelyanov
b239e0d368 test/lib: Add getenv_safe() helper
The helper is like ::getenv() but checks if the variable exists and
throws descriptive exception. So instead of

    fatal error: in "...": std::logic_error: basic_string: construction from null is not valid

one could get something like

   fatal error: in "...": std::logic_error: Environment variable ... not set

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-04-19 12:49:26 +03:00
Botond Dénes
ad065aaa62 Update tools/jmx submodule
* tools/jmx e9bfaabd...3316f7a9 (2):
  > select-java: avoid exec multiple paths
  > select-java: extract function out
2023-04-19 11:18:19 +03:00
Nadav Har'El
81e0f5b581 cql3: allow SUM() aggregation to result in a NaN
When floating-point data contains +Inf and -Inf, the sum is NaN.

Our SUM() aggregation calculated this sum correctly, but then instead
of returning it, complained that the sum overflowed by narrowing.
This was a false positive: The sum() finalizer wanted to test that no
precision was lost when casting the accumulator to the result type,
so checked that the result before and after the cast are the same.
But specifically for NaN, it is never equal to anything - not even
to itself. This check is wrong for floating point, but moreover -
isn't even necessary when the two types (accumulator type and result
type) are identical so in this patch we skip it in this case.

Note that in the current code, a different accumulator and result type
is only used in the case of integer types; When accumulating floating
point sums, the same type is used, so the broken check will be avoided.

The test for this issue starts to pass with this patch, so the xfail
tag is removed.

Fixes #13551

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2023-04-19 09:31:41 +03:00
Nadav Har'El
5b792dde68 Merge 'Extend aws_sigv4 code to suite S3 client needs' from Pavel Emelyanov
The AWS signature-generating code was moved from alternator some time ago as is. Now it's clear that in which places it should be extended to work for S3 client as well. The enhancements are

- Support UNSIGNED-PAYLOAD to omit calculating checksums for request body
- Include full URL path into the signature, not just hard-coded "/" string
- Don't check datastamp expiration if not asked for

This is a part of #13493

Closes #13535

* github.com:scylladb/scylladb:
  utils/aws: Brush up the aws_sigv4.hh header
  utils/aws: Export timepoint formatter
  utils/aws: Omit datestamp expiration checks when not needed
  utils/aws: Add canonical-uri argument
  utils/aws: Support unsigned-payload signatures
2023-04-18 16:33:52 +03:00