Commit Graph

4825 Commits

Author SHA1 Message Date
Botond Dénes
c1e8e86637 reader_concurrency_semaphore: reader_permit: clean-up after failed memory requests
When requesting memory via `reader_permit::request_memory()`, the
requested amount is added to `_requested_memory` member of the permit
impl. This is because multiple concurrent requests may be blocked and
waiting at the same time. When the requests are fulfilled, the entire
amount is consumed and individual requests track their requested amount
with `resource_units` to release later.
There is a corner-case related to this: if a reader permit is registered
as inactive while it is waiting for memory, its active requests are
killed with `std::bad_alloc`, but the `_requested_memory` field is not
cleared. If the read survives because the killed requests were part of
a non-vital background read-ahead, a later memory request will also
include the amount from the failed requests. This extra amount will not be
released and hence will cause a resource leak when the permit is
destroyed.
Fix by detecting this corner case and clearing the `_requested_memory`
field. Modify the existing unit test for the scenario of a permit
waiting on memory being registered as inactive, to also cover this
corner case, reproducing the bug.
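
The accounting problem described above can be illustrated with a toy model. This is a minimal sketch in Python, not Scylla's `reader_permit` code: the class `PermitModel`, its methods and the byte amounts are all hypothetical and only mirror the bookkeeping the message describes.

```python
class PermitModel:
    """Toy model of the accounting described above (hypothetical names)."""

    def __init__(self, fix_enabled):
        self.fix_enabled = fix_enabled
        self.requested_memory = 0  # sum of the amounts of blocked requests
        self.consumed = 0          # memory taken from the semaphore
        self.released = 0          # memory handed back by surviving requests

    def request_memory(self, amount):
        # A blocked request adds its amount to the shared counter.
        self.requested_memory += amount

    def fail_pending_requests(self):
        # The permit is registered as inactive while waiting: the waiters
        # are failed (std::bad_alloc in the original). Without the fix the
        # counter keeps the stale amount.
        if self.fix_enabled:
            self.requested_memory = 0

    def fulfill(self, amounts_still_waiting):
        # The whole accumulated amount is consumed at once...
        self.consumed += self.requested_memory
        self.requested_memory = 0
        # ...but only requests that are still alive release their share.
        self.released += sum(amounts_still_waiting)

    def leaked(self):
        return self.consumed - self.released


def scenario(fix_enabled):
    p = PermitModel(fix_enabled)
    p.request_memory(1024)     # background read-ahead request
    p.fail_pending_requests()  # permit becomes inactive while waiting
    p.request_memory(512)      # later request from the surviving read
    p.fulfill([512])
    return p.leaked()
```

In this model `scenario(False)` leaves the 1024 bytes of the failed request unaccounted for, while `scenario(True)` leaks nothing.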

Fixes: #13539

Closes #13679
2023-05-07 14:06:51 +03:00
Kamil Braun
70f2b09397 Merge 'scylla_cluster.py: fix read_last_line' from Gusev Petr
This is a follow-up to #13399, the patch
addresses the issues mentioned there:
* linesep can be split between blocks;
* linesep can be part of UTF-8 sequence;
* avoid excessively long lines, limit to 256 chars;
* the logic of the function made simpler and more maintainable.

Closes #13427

* github.com:scylladb/scylladb:
  pylib_test: add tests for read_last_line
  pytest: add pylib_test directory
  scylla_cluster.py: fix read_last_line
  scylla_cluster.py: move read_last_line to util.py
2023-05-05 13:29:15 +02:00
Kefu Chai
05a172c7e7 build: cmake: link against Boost::unit_test_framework
we introduced the linkage to Boost::unit_test_framework in
fe70333c19, this library is used by
test/lib/test_utils.cc, so update CMake accordingly.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes #13781
2023-05-05 13:55:00 +03:00
Petr Gusev
8a0bcf9d9d pylib_test: add tests for read_last_line 2023-05-05 12:57:43 +04:00
Petr Gusev
7476e91d67 pytest: add pylib_test directory
We want to add tests for read_last_line,
in this commit we add a new directory for them
since there were no tests for pylib code before.
2023-05-05 12:57:43 +04:00
Petr Gusev
330d1d5163 scylla_cluster.py: fix read_last_line
This is a follow-up to #13399, the patch
addresses the issues mentioned there:
* linesep can be split between blocks;
* linesep can be part of UTF-8 sequence;
* avoid excessively long lines, limit to 512 chars;
* the logic of the function made simpler and more
maintainable.
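
A rough sketch of how a block-wise last-line reader can cope with the issues listed above. This is not the code from scylla_cluster.py or util.py: the signature, the parameter names, the block size and the choice to keep the tail of an over-long line are all assumptions made for illustration.

```python
import io
import os


def read_last_line(f, max_line_bytes=512, block_size=4096, linesep=b"\n"):
    """Return the last non-empty line of a seekable binary file as str.

    Hypothetical sketch; parameter names and defaults are assumptions.
    """
    sep_len = len(linesep)
    f.seek(0, os.SEEK_END)
    pos = f.tell()
    tail = b""
    while True:
        # Drop trailing separators so a final newline does not yield "".
        body = tail
        while body.endswith(linesep):
            body = body[:-sep_len]
        # Searching the accumulated bytes (not just the newest block) finds
        # a separator even when it straddles a block boundary.
        idx = body.rfind(linesep)
        if idx != -1:
            line = body[idx + sep_len:]
            break
        # Stop at the start of the file, or once enough bytes were collected
        # to fill the length limit without a partial separator in front.
        if pos == 0 or len(body) >= max_line_bytes + sep_len:
            line = body
            break
        step = min(block_size, pos)
        pos -= step
        f.seek(pos)
        tail = f.read(step) + tail
    # Keep only the end of an over-long line (an assumption of this sketch).
    line = line[-max_line_bytes:]
    # Decode once, at the end, so a multi-byte UTF-8 sequence split across
    # blocks is never decoded in pieces.
    return line.decode("utf-8", errors="replace")
```

For a file on disk this would be called as `read_last_line(open(path, "rb"))`; an in-memory `io.BytesIO` works the same way, which makes the function easy to unit-test with tiny block sizes.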
2023-05-05 12:57:36 +04:00
Petr Gusev
8a5e211c30 scylla_cluster.py: move read_last_line to util.py
We want to add tests for read_last_line, so we
move it to make this simpler.
2023-05-05 12:51:25 +04:00
Botond Dénes
687a8bb2f0 Merge 'Sanitize test::filename(sstable) API' from Pavel Emelyanov
There are two of them currently with slightly different declarations. Better to leave only one.

Closes #13772

* github.com:scylladb/scylladb:
  test: Deduplicate test::filename() static overload
  test: Make test::filename return fs::path
2023-05-05 11:36:08 +03:00
Pavel Emelyanov
ac305076bd test: Split test_twcs_interposer_on_memtable_flush naturally
The test case consists of two internal sub-test-cases. Making them
explicit kills three birds with one stone

- improves parallelism
- removes env's tempdir wiping
- fixes code indentation

refs: #12707

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>

Closes #13768
2023-05-05 10:42:30 +03:00
Avi Kivity
f125a3e315 Merge 'tree: finish the reader_permit state renames' from Botond Dénes
In https://github.com/scylladb/scylladb/pull/13482 we renamed the reader permit states to more descriptive names. That PR, however, covered only the states themselves and their usages, as well as the documentation in `docs/dev`.
This PR is a followup to said PR, completing the name changes: renaming all symbols, names, comments etc, so all is consistent and up-to-date.

Closes #13573

* github.com:scylladb/scylladb:
  reader_concurrency_semaphore: misc updates w.r.t. recent permit state name changes
  reader_concurrency_semaphore: update permit members w.r.t. recent permit state name changes
  reader_concurrency_semaphore: update RAII state guard classes w.r.t. recent permit state name changes
  reader_concurrency_semaphore: update API w.r.t. recent permit state name changes
  reader_concurrency_semaphore: update stats w.r.t. recent permit state name changes
2023-05-04 18:29:04 +03:00
Avi Kivity
204521b9a7 Merge 'mutation/mutation_compactor: validate range tombstone change before it is moved' from Botond Dénes
e2c9cdb576 moved the validation of the range tombstone change to the place where it is actually consumed, so we don't attempt to pass purged or discarded range tombstones to the validator. In doing so, however, the validate pass was moved after the consume call, which moves the range tombstone change, so the validator was passed a moved-from range tombstone. Fix this by moving the validation to before the consume call.

Refs: #12575

Closes #13749

* github.com:scylladb/scylladb:
  test/boost/mutation_test: add sanity test for mutation compaction validator
  mutation/mutation_compactor: add validation level to compaction state query constructor
  mutation/mutation_compactor: validate range tombstone change before it is moved
2023-05-04 18:15:35 +03:00
Avi Kivity
1d351dde06 Merge 'Make S3 client work with real S3' from Pavel Emelyanov
The current S3 client was tested over minio, and it takes a few more touches to work with Amazon S3.

The main challenge here is to support signed requests. The AWS S3 server explicitly bans unsigned multipart-upload requests, which in turn are an essential part of the sstables S3 backend, so we do need signing. Signing a request has many options and requirements; one of them is whether the request _body_ is included in the signature calculation. This is called "(un)signed payload". Requests sent over plain HTTP require payload signing (i.e. the request body must be included in the signature calculation), which can be a bit troublesome, so instead the PR uses unsigned payload (i.e. it doesn't include the request body in the signature calculation, only the necessary headers and query parameters), but that in turn requires HTTPS.
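
For orientation, the difference between signed and unsigned payload in AWS Signature Version 4 can be sketched as follows. This is a generic Python illustration of the publicly documented scheme, not the signing code used by the S3 client in this PR; the helper names `payload_hash`, `derive_signing_key` and `sign` are made up for the example.

```python
import hashlib
import hmac


def payload_hash(body, sign_payload):
    # With a signed payload the SHA-256 of the body goes into the canonical
    # request; with an unsigned payload the literal marker is used instead
    # (it is also the value sent in the x-amz-content-sha256 header).
    if sign_payload:
        return hashlib.sha256(body).hexdigest()
    return "UNSIGNED-PAYLOAD"


def _hmac(key, msg):
    return hmac.new(key, msg.encode("utf-8"), hashlib.sha256).digest()


def derive_signing_key(secret, date, region, service="s3"):
    # SigV4 key derivation chain: secret -> date -> region -> service.
    # `date` is the request date in YYYYMMDD form.
    k_date = _hmac(("AWS4" + secret).encode("utf-8"), date)
    k_region = _hmac(k_date, region)
    k_service = _hmac(k_region, service)
    return _hmac(k_service, "aws4_request")


def sign(signing_key, string_to_sign):
    # The hex digest is what ends up in the Authorization header.
    return hmac.new(signing_key, string_to_sign.encode("utf-8"),
                    hashlib.sha256).hexdigest()
```

The key, secret and region that feed `derive_signing_key` correspond to the three values the series reads from the configuration file; building the canonical request and the string to sign is omitted here.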

So what this set does is make the existing S3 client code sign requests. In order to sign a request the code needs to get the AWS key and secret (and region) from somewhere, and this somewhere is the conf/object_storage.yaml config file. The signature generating code was previously merged (moved from alternator code) and updated to suit S3 client needs.

In order to properly support HTTPS the PR adds a special connection factory to be used with the seastar http client. The factory resolves AWS endpoint names via DNS and configures gnutls systemtrust.

fixes: #13425

Closes #13493

* github.com:scylladb/scylladb:
  doc: Add a document describing how to configure S3 backend
  s3/test: Add ability to run boost test over real s3
  s3/client: Sign requests if configured
  s3/client: Add connection factory with DNS resolve and configurable HTTPS
  s3/client: Keep server port on config
  s3/client: Construct it with config
  s3/client: Construct it with sstring endpoint
  sstables: Make s3_storage with endpoint config
  sstables_manager: Keep object storage configs onboard
  code: Introduce conf/object_storage.yaml configuration file
2023-05-04 18:08:54 +03:00
Pavel Emelyanov
56dfc21ba0 test: Deduplicate test::filename() static overload
There are two of them currently, both returning fs::path for sstable
components. One is static and can be dropped, callers are patched to use
the non-static one making the code tiny bit shorter.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-05-04 17:16:00 +03:00
Pavel Emelyanov
3f30a253be test: Make test::filename return fs::path
The sstable::filename() is private and is not supposed to be used as a
path to open any files. However, tests are different and they sometimes
know it is. For that they use test wrapper that has access to private
members and may make assumptions about meaning of sstable::filename().

Said that, the test::filename() should return fs::path, not sstring.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-05-04 17:14:04 +03:00
Tomasz Grabiec
e385ce8a2b Merge "fix stack use after free during shutdown" from Gleb
storage_service uses raft_group0, but during shutdown the latter is
destroyed before the former is stopped. This series moves raft_group0
destruction to after storage_service is stopped. For the
move to work, some existing dependencies of raft_group0 are dropped
since they are not really needed during the object creation.

Fixes #13522
2023-05-04 15:14:18 +02:00
Pavel Emelyanov
fe70333c19 test: Auto-skip object-storage test cases if run from shell
In case an sstable unit test case is run individually, it would fail
with exception saying that S3_... environment is not set. It's better to
skip the test-case rather than fail. If someone wants to run it from
shell, it will have to prepare S3 server (minio/AWS public bucket) and
provide proper environment for the test-case.

refs: #13569

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>

Closes #13755
2023-05-04 14:15:18 +03:00
Konstantin Osipov
e7c9ca560b test: issue a read barrier before checking ring consistency
Raft replication doesn't guarantee that all replicas see
identical Raft state at all times, it only guarantees the
same order of events on all replicas.

When comparing raft state with gossip state on a node, first
issue a read barrier to ensure the node has the latest raft state.

To issue a read barrier it is sufficient to alter a non-existing
state: in order to validate the DDL the node needs to sync with the
leader and fetch its latest group0 state.

Fixes #13518 (flaky topology test).

Closes #13756
2023-05-04 12:22:07 +02:00
Gleb Natapov
dc6c3b60b4 init: move raft_group0 creation before storage_service
storage_service uses raft_group0, so the latter needs to exist until
the former is stopped.
2023-05-04 13:03:18 +03:00
Gleb Natapov
e9fb885e82 service/raft: raft_group0: drop dependency on cdc::generation_service
raft_group0 does not really depend on cdc::generation_service, it needs
it only transiently, so pass it to appropriate methods of raft_group0
instead of during its creation.
2023-05-04 13:03:07 +03:00
Pavel Emelyanov
e00d3188ed s3/test: Add ability to run boost test over real s3
Support the AWS_S3_EXTRA environment variable that's :-split and the
respective substrings are set as endpoint AWS configuration. This makes
it possible to run boost S3 test over real S3.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-05-03 20:23:38 +03:00
Pavel Emelyanov
3bec5ea2ce s3/client: Keep server port on config
Currently the code temporarily assumes that the endpoint port is 9000.
This is what tests' local minio is started with. This patch keeps the
port number on endpoint config and makes test get the port number from
minio starting code via environment.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-05-03 20:19:43 +03:00
Pavel Emelyanov
85f06ca556 s3/client: Construct it with config
Similar to previous patch -- extend the s3::client constructor to get
the endpoint config value next to the endpoint string. For now the
configs are likely empty, but they are yet unused too.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-05-03 20:19:43 +03:00
Pavel Emelyanov
caf9e357c8 s3/client: Construct it with sstring endpoint
Currently the client is constructed with socket_address which is prepared
by the caller from the endpoint string. That's not flexible enough,
because s3 client needs to know the original endpoint string for two
reasons.

First, it needs to lookup endpoint config for potential AWS creds.
Second, it needs this exact value as Host: header in its http requests.

So this patch just relaxes the client constructor to accept the endpoint
string and hard-code the 9000 port. The latter is temporary, this is how
local tests' minio is started, but next patch will make it configurable.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-05-03 20:19:43 +03:00
Pavel Emelyanov
2f6aa5b52e code: Introduce conf/object_storage.yaml configuration file
In order to access real S3 bucket, the client should use signed requests
over https. Partially this is due to security considerations, partially
this is unavoidable, because multipart-uploading is banned for unsigned
requests on the S3. Also, signed requests over plain http require
signing the payload as well, which is a bit troublesome, so it's better
to stick to secure https and keep payload unsigned.

To prepare signed requests the code needs to know three things:
- aws key
- aws secret
- aws region name

The latter could be derived from the endpoint URL, but it's simpler to
configure it explicitly, all the more so since S3 URLs without a region
name in them are an option we may want to use some day.

To keep the described configuration the proposed place is the
object_storage.yaml file with the format

endpoints:
  - name: a.b.c
    port: 443
    aws_key: 12345
    aws_secret: abcdefghijklmnop
    ...

When loaded, the map gets into db::config and later will be propagated
down to sstables code (see next patch).

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-05-03 20:19:15 +03:00
Botond Dénes
4365f004c1 test/boost/mutation_test: add sanity test for mutation compaction validator
Checking that compacted fragments are forwarded to the validator intact.
2023-05-03 04:19:42 -04:00
Nadav Har'El
b5f28e2b55 Merge 'Add S3 support to sstables::test_env' from Pavel Emelyanov
Currently there are only 2 tests for S3 -- the pure client test and the compound object_store test that launches scylla, creates an s3-backed table and CQL-queries it. At the same time there's a whole lot of small unit tests for sstables functionality, part of which can run over S3 storage too.

This PR adds this support and patches several test cases to use it. More test cases are to come later on demand.

fixes: #13015

Closes #13569

* github.com:scylladb/scylladb:
  test: Make resharding test run over s3 too
  test: Add lambda to fetch bloom filter size
  test: Tune resharding test use of sstable::test_env
  test: Make datafile test case run over s3 too
  test: Propagate storage options to table_for_test
  test: Add support for s3 storage_options in config
  test: Outline sstables::test_env::do_with_async()
  test: Keep storage options on sstable_test_env config
  sstables: Add and call storage::destroy()
  sstables: Coroutinize sstable::destroy()
2023-05-02 21:48:05 +03:00
Botond Dénes
72003dc35c readers: evictable_reader: skip progress guarantee when next pos is partition start
The evictable reader must ensure that each buffer fill makes forward
progress, i.e. the last fragment in the buffer has a position larger
than the last fragment from the last buffer-fill. Otherwise, the reader
could get stuck in an infinite loop between buffer fills, if the reader
is evicted in-between.
The code guaranteeing this forward progress has a bug: when the next
expected position is a partition-start (another partition), the code
would loop forever, effectively reading all there is from the underlying
reader.
To avoid this, add a special case to ignore the progress guarantee loop
altogether when the next expected position is a partition start. In this
case, progress is guaranteed anyway, because there is exactly one
partition-start fragment in each partition.

Fixes: #13491

Closes #13563
2023-05-02 16:19:32 +03:00
Botond Dénes
7baa2d9cb2 Merge 'Cleanup range printing' from Benny Halevy
This mini-series cleans up printing of ranges in utils/to_string.hh

It generalizes the helper function to work on a std::ranges::range,
with some exceptions, and adds a helper for boost::transformed_range.

It also changes the internal interface by moving `join` to the utils namespace
and using std::string rather than seastar::sstring.

Additional unit tests were added to test/boost/json_test

Fixes #13146

Closes #13159

* github.com:scylladb/scylladb:
  utils: to_string: get rid of utils::join
  utils: to_string: get rid of to_string(std::initializer_list)
  utils: to_string: get rid of to_string(const Range&)
  utils: to_string: generalize range helpers
  test: add string_format_test
  utils: chunked_vector: add std::ranges::range ctor
2023-05-02 14:55:18 +03:00
Botond Dénes
d6ed5bbc7e Merge 'alternator: fix validation of numbers' magnitude and precision' from Nadav Har'El
DynamoDB limits the allowed magnitude and precision of numbers - valid
decimal exponents are between -130 and 125 and up to 38 significant
decimal digits are allowed. In contrast, Scylla uses the CQL "decimal"
type which offers unlimited precision. This can cause two problems:

1. Users might get used to this "unofficial" feature and start relying
    on it, not allowing us to switch to a more efficient limited-precision
    implementation later.

2. If huge exponents are allowed, e.g., 1e-1000000, summing such a
    number with 1.0 will result in a huge number, huge allocations and
    stalls. This is highly undesirable.

This series adds more tests in this area covering additional corner cases,
and then fixes the issue by adding the missing verification where it's
needed. After the series, all 12 tests in test/alternator/test_number.py now pass.

Fixes #6794

Closes #13743

* github.com:scylladb/scylladb:
  alternator: unit test for number magnitude and precision function
  alternator: add validation of numbers' magnitude and precision
  test/alternator: more tests for limits on number precision and magnitude
  test/alternator: reproducer for DoS in unlimited-precision addition
2023-05-02 14:33:36 +03:00
Nadav Har'El
ed34f3b5e4 cql-pytest: translate Cassandra's test for LWT with collections
This is a translation of Cassandra's CQL unit test source file
validation/operations/InsertUpdateIfConditionTest.java into our cql-pytest
framework.

This test file checks various LWT conditional updates which involve
collections or UDTs (there is a separate test file for LWT conditional
updates which do not involve collections, which I haven't translated
yet).

The tests reproduce one known bug:

Refs #5855:  lwt: comparing NULL collection with empty value in IF
             condition yields incorrect results

And also uncovered three previously-unknown bugs:

Refs #13586: Add support for CONTAINS and CONTAINS KEY in LWT expressions
Refs #13624: Add support for UDT subfields in LWT expression
Refs #13657: Misformatted printout of column name in LWT error message

Beyond those bona-fide bugs, this test also demonstrates several places
where we intentionally deviated from Cassandra's behavior, forcing me
to comment out several checks. These deviations are known, and intentional,
but some of them are undocumented and it's worth listing here the ones
re-discovered by this test:

1. On a successful conditional write, Cassandra returns just True, Scylla
   also returns the old contents of the row. This difference is officially
   documented in docs/kb/lwt-differences.rst.
2. Scylla allows the test "l = [null]" or "s = {null}" with this weird
   null element (the result is false), whereas Cassandra prints an error.
3. Scylla allows "l[null]" or "m[null]" (resulting in null), Cassandra
   prints an error.
4. Scylla allows a negative list index, "l[-2]", resulting in null.
   Cassandra prints an error in this case.
5. Cassandra allows in "IF v IN (?, ?)" to bind individual values to
   UNSET_VALUE and skips them, Scylla treats this as an error. Refs #13659.
6. Scylla allows "IN null" (the condition just fails), Cassandra prints
   an error in this case.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>

Closes #13663
2023-05-02 11:53:58 +03:00
Pavel Emelyanov
d4a72de406 test: Make resharding test run over s3 too
Now when the test case and used lib/utils code is using storage-agnostic
approach, it can be extended to run over S3 storage as well.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-05-02 11:46:23 +03:00
Pavel Emelyanov
2601c58278 test: Add lambda to fetch bloom filter size
The resharding test compares bloom filter sizes before and after reshard
runs. For that it gets the filter on-disk filename and stat()s it. That
won't work with S3 as it doesn't have accessible on-disk files.

Some time ago there existed the storage::get_stats() method, but now
it's gone. The new s3::client::get_object_stat() is coming, but it will
take time to switch to it. For now, generalize filter size fetching into
a local lambda. Next patch will make a stub in it for S3 case, and once
the get_object_stat() is there we'll be able to smoothly start using it.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-05-02 11:43:26 +03:00
Pavel Emelyanov
76594bf72b test: Tune resharding test use of sstable::test_env
The test case in question spawns async context then makes the test_env
instance on the stack (and stopper for it too). There's helper for the
above steps, better to use them.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-05-02 11:30:03 +03:00
Pavel Emelyanov
439c8770aa test: Make datafile test case run over s3 too
Most of the sstable_datafile test cases are capable of running with S3
storage, so this patch makes the simplest of them do it. Patching the
rest from this file is optional, because mostly the cases test how the
datafile data manipulations work without checking the files
manipulations. So even if making them all run over S3 is possible, it
will just increase the testing time w/o real test of the storage driver.

So this patch makes one test case run over local and S3 storages, more
patches to update more test cases with files manipulations are yet to
come.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-05-02 11:30:03 +03:00
Pavel Emelyanov
f7df238545 test: Propagate storage options to table_for_test
Teach table_for_tests to use any storage options, not just local one. For
now the only user that passes non-local options is sstables::test_env.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-05-02 11:30:03 +03:00
Pavel Emelyanov
fa1de16f30 test: Add support for s3 storage_options in config
When the sstable test case wants to run over S3 storage it needs to
specify that in test config by providing the S3 storage options. So
first thing this patch adds is the helper that makes these options based
on the env left by minio launcher from test.py.

Next, in order to make sstables_manager work with S3 it needs the
plugged system keyspace which, in turn, needs query processor, proxy,
database, etc. All this stuff lives in cql_test_env, so the test case
running with S3 options will run in a sstables::test_env nested inside
cql_test_env. The latter would also need to plug its system keyspace to
the former's sstables manager and turn the experimental feature ON.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-05-02 11:30:03 +03:00
Nadav Har'El
57ffbcbb22 cql3: fix spurious token names in syntax error messages
We have known for a long time (see issue #1703) that the quality of our
CQL "syntax error" messages leaves a lot to be desired, especially when
compared to Cassandra. This patch doesn't yet bring us great error
messages with great context - doing this isn't easy and it appears that
Antlr3's C++ runtime isn't as good as the Java one in this regard -
but this patch at least fixes **garbage** printed in some error messages.

Specifically, when the parser can deduce that a specific token is missing,
it used to print

    line 1:83 missing ')' at '<missing '

After this patch we get rid of the meaningless string '<missing ':

    line 1:83 : Missing ')'

Also, when the parser deduced that a specific token was unneeded, it
used to print:

    line 1:83 extraneous input ')' expecting <invalid>

Now we got rid of this silly "<invalid>" and write just:

    line 1:83 : Unexpected ')'

Refs #1703. I didn't yet mark that issue "fixed" because I think a
complete fix would also require printing the entire misparsed line and the
point of the parse failure. Scylla still prints a generic "Syntax Error"
in most cases now, and although the character number (83 in the above
example) can help, it's much more useful to see the actual failed
statement and where character 83 is.

Unfortunately some tests enshrine buggy error messages and had to be
fixed. Other tests enshrined strange text for a generic unexplained
error message, which used to say "  : syntax error..." (note the two
spaces and ellipsis) and after this patch is " : Syntax error". So
these tests are changed. Another message, "no viable alternative at
input" is deliberately kept unchanged by this patch so as not to break
many more tests which enshrined this message.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>

Closes #13731
2023-05-02 11:23:58 +03:00
Pavel Emelyanov
1e03733e8c test: Outline sstables::test_env::do_with_async()
It's growing larger, better to keep it in .cc file

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-05-02 11:15:45 +03:00
Pavel Emelyanov
f223f5357d test: Keep storage options on sstable_test_env config
So that it could be set to s3 by the test case on demand. Default is
local storage which uses env's tempdir or explicit path argument.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-05-02 11:15:45 +03:00
Nadav Har'El
e74f69bb56 alternator: unit test for number magnitude and precision function
In the previous patch we added a limit in Alternator for the magnitude
and precision of numbers, based on a function get_magnitude_and_precision
whose implementation was, unfortunately, rather elaborate and delicate.

Although we did add in the previous patches some end-to-end tests which
confirmed that the final decision made based on this function, to accept or
reject numbers, was a correct decision in a few cases, such an elaborate
function deserves a separate unit test for checking just that function
in isolation. In fact, this unit test uncovered some bugs in the first
implementation of get_magnitude_and_precision() which the other tests
missed.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2023-05-02 11:04:05 +03:00
Nadav Har'El
3c0603558c alternator: add validation of numbers' magnitude and precision
DynamoDB limits the allowed magnitude and precision of numbers - valid
decimal exponents are between -130 and 125 and up to 38 significant
decimal digits are allowed. In contrast, Scylla uses the CQL "decimal"
type which offers unlimited precision. This can cause two problems:

1. Users might get used to this "unofficial" feature and start relying
   on it, not allowing us to switch to a more efficient limited-precision
   implementation later.

2. If huge exponents are allowed, e.g., 1e-1000000, summing such a
   number with 1.0 will result in a huge number, huge allocations and
   stalls. This is highly undesirable.
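
The limits quoted above can be illustrated with a short check. This is a minimal Python sketch, not Alternator's C++ implementation: the function names are hypothetical, and treating trailing zeros as non-significant is an assumption of the example.

```python
from decimal import Decimal, InvalidOperation

MIN_EXPONENT = -130     # smallest allowed decimal exponent (from the text)
MAX_EXPONENT = 125      # largest allowed decimal exponent (from the text)
MAX_PRECISION = 38      # maximum number of significant digits (from the text)


def magnitude_and_precision(text):
    """Return (magnitude, precision) of a decimal literal.

    magnitude is the exponent in scientific notation (1.5e3 -> 3);
    precision is the digit count with trailing zeros ignored (assumption).
    """
    d = Decimal(text)
    if not d.is_finite():
        raise ValueError("not a finite number")
    digits = d.as_tuple().digits
    significant = len(digits)
    while significant > 1 and digits[significant - 1] == 0:
        significant -= 1
    # Decimal keeps the exponent as a plain integer, so inspecting a value
    # such as 1e-1000000 does not allocate a million digits.
    return d.adjusted(), significant


def is_valid_number(text):
    try:
        magnitude, precision = magnitude_and_precision(text)
    except (InvalidOperation, ValueError):
        return False
    if Decimal(text).is_zero():
        return True
    return (MIN_EXPONENT <= magnitude <= MAX_EXPONENT
            and precision <= MAX_PRECISION)
```

Rejecting the number at parse time is what keeps an input like 1e-1000000 from ever reaching an addition with 1.0.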

After this patch, all tests in test/alternator/test_number.py now
pass. The various failing tests which verify magnitude and precision
limitations in different places (key attributes, non-key attributes,
and arithmetic expressions) now pass - so their "xfail" tags are removed.

Fixes #6794

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2023-05-02 11:04:05 +03:00
Nadav Har'El
0eccc49308 test/alternator: more tests for limits on number precision and magnitude
We already have xfailing tests for issue #6794 - the missing checks on
precision and magnitudes of numbers in Alternator - but this patch adds
checks for additional corner cases. In particular we check the case that
numbers are used in a *key* column, which goes to a different code path
than numbers used in non-key columns, so it's worth testing as well.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2023-05-02 11:04:05 +03:00
Nadav Har'El
56b8b9d670 test/alternator: reproducer for DoS in unlimited-precision addition
As already noted in issue #6794, whereas DynamoDB limits the magnitude
of numbers to between 10^-130 and 10^125, Scylla does not. In this patch
we add yet another test for this problem, but unlike previous tests
which just showed too much magnitude being allowed, which always sounded
like a benign problem - the test in this patch shows that this "feature"
can be used to DoS Scylla - a user can send a short request that
causes arbitrarily-large allocations, stalls and CPU usage.

The test is currently marked "skip" because it can cause Scylla to
take a very long time and/or run out of memory. It passes on DynamoDB
because the excessive magnitude is simply not allowed there.

Refs #6794

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2023-05-02 11:03:51 +03:00
Benny Halevy
e6bcb1c8df utils: to_string: get rid of to_string(std::initializer_list)
It's unused.

Just in case, add a unit test case for using the fmt library to
format it (that includes fmt::to_string(std::initializer_list)).

Note that the existing to_string implementation
used square brackets to enclose the initializer_list
but the new, standardized form uses curly braces.

This doesn't break anything since to_string(initializer_list)
wasn't used.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2023-05-02 10:48:46 +03:00
Benny Halevy
ba883859c7 utils: to_string: get rid of to_string(const Range&)
Use fmt::to_string instead.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2023-05-02 10:48:46 +03:00
Benny Halevy
15c9f0f0df utils: to_string: generalize range helpers
As seen in https://github.com/scylladb/scylladb/issues/13146
the current implementation is not general enough
to provide print helpers for all kind of containers.

Modernize the implementation using templates based
on std::ranges::range and using fmt::join.

Extend unit test for formatting different types of ranges,
boost::transformed ranges, deque.

Fixes #13146

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2023-05-02 10:48:46 +03:00
Benny Halevy
59e89efca6 test: add string_format_test
Test string formatting before cleaning up
utils/to_string.hh in the next patches.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2023-05-02 10:48:46 +03:00
Botond Dénes
022465d673 Merge 'Tone down offstrategy log message' from Benny Halevy
In many cases we trigger offstrategy compaction opportunistically
also when there's nothing to do.  In this case we still print
to the log lots of info-level message and call
`run_offstrategy_compaction` that wastes more cpu cycles
on learning that it has nothing to do.

This change bails out early if the maintenance set is empty
and prints a "Skipping off-strategy compaction" message in debug
level instead.

Fixes #13466

Also, add a group_id class and return it from compaction_group and table_state.
Use that to identify the compaction_group / table_state by "ks_name.cf_name compaction_group=idx/total" in log messages.

Fixes #13467

Closes #13520

* github.com:scylladb/scylladb:
  compaction_manager: print compaction_group id
  compaction_group, table_state: add group_id member
  compaction_manager: offstrategy compaction: skip compaction if no candidates are found
2023-05-02 08:05:18 +03:00
Benny Halevy
707bd17858 everywhere: optimize calls to make_flat_mutation_reader_from_mutations_v2 with single mutation
No point in going through the vector<mutation> entry-point
just to discover in run time that it was called
with a single-element vector, when we know that
in advance.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>

Closes #13733
2023-05-02 07:58:34 +03:00
Nadav Har'El
1cefb662cd Merge 'cql3/expr: remove expr::token' from Jan Ciołek
Let's remove `expr::token` and replace all of its functionality with `expr::function_call`.

`expr::token` is a struct whose job is to represent a partition key token.
The idea is that when the user types in `token(p1, p2) < 1234`, this will be internally represented as an expression which uses `expr::token` to represent the `token(p1, p2)` part.

The situation with `expr::token` is a bit complicated.
On the one hand it's supposed to represent the partition token, but sometimes it's also assumed that it can represent a generic call to the `token()` function. For example, `token(1, 2, 3)` could be a `function_call`, but it could also be `expr::token`.

The query planning code assumes that each occurrence of `expr::token`
represents the partition token without checking the arguments.
Because of this, allowing `token(1, 2, 3)` to be represented as `expr::token` is dangerous - the query planning might think that it is `token(p1, p2, p3)` and plan the query based on this, which would be wrong.

Currently `expr::token` is created only in one specific case.
When the parser detects that the user typed in a restriction which has a call to `token` on the LHS, it generates `expr::token`.
In all other cases it generates an `expr::function_call`.
Even when the `function_call` represents a valid partition token, it stays a `function_call`. During preparation there is no check to see if a `function_call` to `token` could be turned into `expr::token`. This is a bit inconsistent - sometimes `token(p1, p2, p3)` is represented as `expr::token` and the query planner handles that, but sometimes it might be represented as `function_call`, which the query planner doesn't handle.

There is also a problem because there's a lot of code duplication between a `function_call` and `expr::token`.
All of the evaluation and preparation is the same for `expr::token` as it is for a `function_call` to the token function.
Currently it's impossible to evaluate `expr::token` and preparation has some flaws, but implementing it would basically consist of copy-pasting the corresponding code from token `function_call`.

One more aspect is multi-table queries.
With `expr::token` we turn a call to the `token()` function into a struct that is schema-specific.
What happens when a single expression is used to make queries to multiple tables? The schema is different, so something that is represented as `expr::token` for one schema would be represented as `function_call` in the context of a different schema.
Translating expressions to different tables would require careful manipulation to convert `expr::token` to `function_call` and vice versa. This could cause trouble for index queries.

Overall I think it would be best to remove `expr::token`.

Although having a clear marker for the partition token is sometimes nice for query planning, in my opinion the pros are outweighed by the cons.
I'm a big fan of having a single way to represent things; having two separate representations of the same thing without clear boundaries between them causes trouble.

Instead of having both `expr::token` and `function_call` we can just have the `function_call` and check if it represents a partition token when needed.
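
The check described above might look roughly like the sketch below. The types are deliberately simplified and hypothetical: arguments are modelled as plain strings, whereas the real `expr::is_partition_token_for_schema` operates on expression trees and a full schema object. The point is only that a call counts as the partition token when it names `token` and its arguments are exactly the partition key columns, in order.

```cpp
#include <string>
#include <vector>

// Hypothetical, simplified stand-ins for expr::function_call and schema.
struct function_call_sketch {
    std::string name;
    std::vector<std::string> args;  // column names or literal text
};

struct schema_sketch {
    std::vector<std::string> partition_key_columns;
};

// True only if the call is token(<all partition key columns, in order>)
// for this particular schema.
bool is_partition_token_for(const function_call_sketch& call,
                            const schema_sketch& schema) {
    return call.name == "token" && call.args == schema.partition_key_columns;
}
```

Under this sketch, `token(p1, p2)` is recognized for a schema whose partition key is `(p1, p2)`, while `token(1, 2, 3)` or `token(p2, p1)` remains an ordinary function call.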

Refs: #12906
Refs: #12677
Closes: #12905

Closes #13480

* github.com:scylladb/scylladb:
  cql3: remove expr::token
  cql3: keep a schema in visitor for extract_clustering_prefix_restrictions
  cql3: keep a schema inside the visitor for extract_partition_range
  cql3/prepare_expr: make get_lhs_receiver handle any function_call
  cql3/expr: properly print token function_call
  expr_test: use unresolved_identifier when creating token
  cql3/expr: split possible_lhs_values into column and token variants
  cql3/expr: fix error message in possible_lhs_values
  cql3: expr: reimplement is_satisfied_by() in terms of evaluate()
  cql3/expr: add a schema argument to expr::replace_token
  cql3/expr: add a comment for expr::has_partition_token
  cql3/expr: add a schema argument to expr::has_token
  cql3: use statement_restrictions::has_token_restrictions() wherever possible
  cql3/expr: add expr::is_partition_token_for_schema
  cql3/expr: add expr::is_token_function
  cql3/expr: implement preparing function_call without a receiver
  cql3/functions: make column family argument optional in functions::get
  cql3/expr: make it possible to prepare expr::constant
  cql3/expr: implement test_assignment for column_value
  cql3/expr: implement test_assignment for expr::constant
2023-04-30 15:31:35 +03:00