scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-05-30 19:46:48 +00:00

Author	SHA1	Message	Date
Botond Dénes	c1e8e86637	reader_concurrency_semaphore: reader_permit: clean-up after failed memory requests When requesting memory via `reader_permit::request_memory()`, the requested amount is added to `_requested_memory` member of the permit impl. This is because multiple concurrent requests may be blocked and waiting at the same time. When the requests are fulfilled, the entire amount is consumed and individual requests track their requested amount with `resource_units` to release later. There is a corner-case related to this: if a reader permit is registered as inactive while it is waiting for memory, its active requests are killed with `std::bad_alloc`, but the `_requested_memory` field is not cleared. If the read survives because the killed requests were part of a non-vital background read-ahead, a later memory request will also include the amount from the failed requests. This extra amount will not be released and hence will cause a resource leak when the permit is destroyed. Fix by detecting this corner case and clearing the `_requested_memory` field. Modify the existing unit test for the scenario of a permit waiting on memory being registered as inactive, to also cover this corner case, reproducing the bug. Fixes: #13539 Closes #13679	2023-05-07 14:06:51 +03:00
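The accounting problem described in c1e8e86637 can be illustrated with a toy model. This is a minimal sketch, not Scylla's code: `ToyPermit`, its methods, and the `clear_on_failure` switch are hypothetical names that only mirror the bookkeeping the commit message describes (a shared pending counter, plus per-request units released later).

```python
class ToyPermit:
    """Hypothetical stand-in for a reader permit's memory bookkeeping."""

    def __init__(self, clear_on_failure: bool) -> None:
        self.clear_on_failure = clear_on_failure
        self.requested_memory = 0  # sum of all pending (blocked) requests
        self.consumed = 0          # memory currently charged to the permit

    def request_memory(self, amount: int) -> int:
        # Each request adds to the shared pending counter and remembers
        # its own amount so that it can release exactly that much later.
        self.requested_memory += amount
        return amount

    def fail_pending_requests(self) -> None:
        # Models pending requests being killed while the permit is inactive.
        # The fix corresponds to clearing the pending counter here.
        if self.clear_on_failure:
            self.requested_memory = 0

    def fulfil_pending_requests(self) -> None:
        # The whole pending amount is consumed at once.
        self.consumed += self.requested_memory
        self.requested_memory = 0

    def release(self, units: int) -> None:
        self.consumed -= units


def leaked_after_failed_request(clear_on_failure: bool) -> int:
    """Return the memory still charged after a failed and a successful request."""
    permit = ToyPermit(clear_on_failure)
    permit.request_memory(100)        # this request will be killed
    permit.fail_pending_requests()
    units = permit.request_memory(50) # a later request that succeeds
    permit.fulfil_pending_requests()
    permit.release(units)
    return permit.consumed
```

In this model, leaving the pending counter untouched after the failure means the later request also consumes the stale amount, which nothing ever releases.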
Kamil Braun	70f2b09397	Merge 'scylla_cluster.py: fix read_last_line' from Gusev Petr This is a follow-up to #13399, the patch addresses the issues mentioned there: * linesep can be split between blocks; * linesep can be part of UTF-8 sequence; * avoid excessively long lines, limit to 256 chars; * the logic of the function made simpler and more maintainable. Closes #13427 * github.com:scylladb/scylladb: pylib_test: add tests for read_last_line pytest: add pylib_test directory scylla_cluster.py: fix read_last_line scylla_cluster.py: move read_last_line to util.py	2023-05-05 13:29:15 +02:00
Kefu Chai	05a172c7e7	build: cmake: link against Boost::unit_test_framework we introduced the linkage to Boost::unit_test_framework in `fe70333c19`, this library is used by test/lib/test_utils.cc, so update CMake accordingly. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #13781	2023-05-05 13:55:00 +03:00
Petr Gusev	8a0bcf9d9d	pylib_test: add tests for read_last_line	2023-05-05 12:57:43 +04:00
Petr Gusev	7476e91d67	pytest: add pylib_test directory We want to add tests for read_last_line, in this commit we add a new directory for them since there were no tests for pylib code before.	2023-05-05 12:57:43 +04:00
Petr Gusev	330d1d5163	scylla_cluster.py: fix read_last_line This is a follow-up to #13399, the patch addresses the issues mentioned there: * linesep can be split between blocks; * linesep can be part of UTF-8 sequence; * avoid excessively long lines, limit to 256 chars; * the logic of the function made simpler and more maintainable.	2023-05-05 12:57:36 +04:00
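One way to address the issues listed in 330d1d5163 is to read a single bounded tail of the file and decode it before searching for the separator. The sketch below is an illustration under assumptions, not the code from the patch: the signature, the `max_line_bytes` default, the `linesep` parameter, and the `...` truncation marker are all choices made here for the example.

```python
from pathlib import Path


def read_last_line(path: Path, max_line_bytes: int = 256, linesep: str = "\n") -> str:
    """Return the last line of a UTF-8 text file, reading at most max_line_bytes."""
    file_size = path.stat().st_size
    with path.open("rb") as f:
        # One bounded read from the end: there are no block boundaries
        # for a separator to be split across.
        f.seek(max(0, file_size - max_line_bytes))
        tail = f.read()
    # Decode before searching, so the separator is matched on characters.
    # errors="ignore" drops a multi-byte sequence cut in half by the seek.
    text = tail.decode("utf-8", errors="ignore")
    if text.endswith(linesep):
        text = text[: -len(linesep)]
    index = text.rfind(linesep)
    if index != -1:
        return text[index + len(linesep):]
    if file_size > max_line_bytes:
        # The line is longer than the limit; mark it as truncated.
        return "..." + text
    return text
```

Reading the tail in one call trades a fixed upper bound on the returned line for much simpler logic than a block-by-block backward scan.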
Petr Gusev	8a5e211c30	scylla_cluster.py: move read_last_line to util.py We want to add tests for read_last_line, so we move it to make this simpler.	2023-05-05 12:51:25 +04:00
Botond Dénes	687a8bb2f0	Merge 'Sanitize test::filename(sstable) API' from Pavel Emelyanov There are two of them currently with slightly different declaration. Better to leave only one. Closes #13772 * github.com:scylladb/scylladb: test: Deduplicate test::filename() static overload test: Make test::filename return fs::path	2023-05-05 11:36:08 +03:00
Pavel Emelyanov	ac305076bd	test: Split test_twcs_interposer_on_memtable_flush naturally The test case consists of two internal sub-test-cases. Making them explicit kills three birds with one stone - improves parallelism - removes env's tempdir wiping - fixes code indentation refs: #12707 Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes #13768	2023-05-05 10:42:30 +03:00
Avi Kivity	f125a3e315	Merge 'tree: finish the reader_permit state renames' from Botond Dénes In https://github.com/scylladb/scylladb/pull/13482 we renamed the reader permit states to more descriptive names. That PR however covered only the states themselves and their usages, as well as the documentation in `docs/dev`. This PR is a followup to said PR, completing the name changes: renaming all symbols, names, comments etc, so all is consistent and up-to-date. Closes #13573 * github.com:scylladb/scylladb: reader_concurrency_semaphore: misc updates w.r.t. recent permit state name changes reader_concurrency_semaphore: update permit members w.r.t. recent permit state name changes reader_concurrency_semaphore: update RAII state guard classes w.r.t. recent permit state name changes reader_concurrency_semaphore: update API w.r.t. recent permit state name changes reader_concurrency_semaphore: update stats w.r.t. recent permit state name changes	2023-05-04 18:29:04 +03:00
Avi Kivity	204521b9a7	Merge 'mutation/mutation_compactor: validate range tombstone change before it is moved' from Botond Dénes `e2c9cdb576` moved the validation of the range tombstone change to the place where it is actually consumed, so we don't attempt to pass purged or discarded range tombstones to the validator. In doing so however, the validate pass was moved after the consume call, which moves the range tombstone change, the validator having been passed a moved-from range tombstone. Fix this by moving the validation to before the consume call. Refs: #12575 Closes #13749 * github.com:scylladb/scylladb: test/boost/mutation_test: add sanity test for mutation compaction validator mutation/mutation_compactor: add validation level to compaction state query constructor mutation/mutation_compactor: validate range tombstone change before it is moved	2023-05-04 18:15:35 +03:00
Avi Kivity	1d351dde06	Merge 'Make S3 client work with real S3' from Pavel Emelyanov Current S3 client was tested over minio and it takes a few more touches to work with amazon S3. The main challenge here is to support signed requests. The AWS S3 server explicitly bans unsigned multipart-upload requests, which in turn is the essential part of the sstables S3 backend, so we do need signing. Signing a request has many options and requirements, one of them is -- request _body_ may or may not be included into signature calculations. This is called "(un)signed payload". Requests sent over plain HTTP require payload signing (i.e. -- request body should be included into signature calculations), which can be a bit troublesome, so instead the PR uses unsigned payload (i.e. -- doesn't include the request body into signature calculation, only necessary headers and query parameters), but thus also needs HTTPS. So what this set does is makes the existing S3 client code sign requests. In order to sign the request the code needs to get AWS key and secret (and region) from somewhere and this somewhere is the conf/object_storage.yaml config file. The signature generating code was previously merged (moved from alternator code) and updated to suit S3 client needs. In order to properly support HTTPS the PR adds special connection factory to be used with seastar http client. The factory makes DNS resolving of AWS endpoint names and configures gnutls systemtrust. 
fixes: #13425 Closes #13493 * github.com:scylladb/scylladb: doc: Add a document describing how to configure S3 backend s3/test: Add ability to run boost test over real s3 s3/client: Sign requests if configured s3/client: Add connection factory with DNS resolve and configurable HTTPS s3/client: Keep server port on config s3/client: Construct it with config s3/client: Construct it with sstring endpoint sstables: Make s3_storage with endpoint config sstables_manager: Keep object storage configs onboard code: Introduce conf/object_storage.yaml configuration file	2023-05-04 18:08:54 +03:00
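The three inputs named in 1d351dde06 (key, secret, region) are what AWS Signature Version 4 needs. As a rough illustration of the publicly documented scheme, and not of the Scylla signing code, the sketch below derives a SigV4 signing key with the Python standard library and shows the `UNSIGNED-PAYLOAD` marker that replaces the body hash when the payload is left unsigned. The function names are invented for this example, and assembling the canonical request and string-to-sign is left out.

```python
import hashlib
import hmac

# With an unsigned payload, this literal stands in for the SHA-256 of the
# request body (sent in the x-amz-content-sha256 header).
UNSIGNED_PAYLOAD = "UNSIGNED-PAYLOAD"


def _hmac_sha256(key: bytes, message: str) -> bytes:
    return hmac.new(key, message.encode("utf-8"), hashlib.sha256).digest()


def derive_signing_key(secret: str, date: str, region: str, service: str = "s3") -> bytes:
    """Derive the SigV4 signing key; date is in YYYYMMDD form."""
    k_date = _hmac_sha256(("AWS4" + secret).encode("utf-8"), date)
    k_region = _hmac_sha256(k_date, region)
    k_service = _hmac_sha256(k_region, service)
    return _hmac_sha256(k_service, "aws4_request")


def sign(signing_key: bytes, string_to_sign: str) -> str:
    """Return the hex signature that goes into the Authorization header."""
    return hmac.new(signing_key, string_to_sign.encode("utf-8"), hashlib.sha256).hexdigest()
```

Because the region is mixed into the key derivation, it has to be known before signing, which is why it is needed alongside the key and secret.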
Pavel Emelyanov	56dfc21ba0	test: Deduplicate test::filename() static overload There are two of them currently, both returning fs::path for sstable components. One is static and can be dropped, callers are patched to use the non-static one making the code tiny bit shorter. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-05-04 17:16:00 +03:00
Pavel Emelyanov	3f30a253be	test: Make test::filename return fs::path The sstable::filename() is private and is not supposed to be used as a path to open any files. However, tests are different and they sometimes know it is. For that they use test wrapper that has access to private members and may make assumptions about meaning of sstable::filename(). That said, the test::filename() should return fs::path, not sstring. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-05-04 17:14:04 +03:00
Tomasz Grabiec	e385ce8a2b	Merge "fix stack use after free during shutdown" from Gleb storage_service uses raft_group0 but during shutdown the latter is destroyed before the former is stopped. This series moves raft_group0 destruction to be after storage_service is stopped already. For the move to work some existing dependencies of raft_group0 are dropped since they are not really needed during the object creation. Fixes #13522	2023-05-04 15:14:18 +02:00
Pavel Emelyanov	fe70333c19	test: Auto-skip object-storage test cases if run from shell In case an sstable unit test case is run individually, it would fail with exception saying that S3_... environment is not set. It's better to skip the test-case rather than fail. If someone wants to run it from shell, it will have to prepare S3 server (minio/AWS public bucket) and provide proper environment for the test-case. refs: #13569 Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes #13755	2023-05-04 14:15:18 +03:00
Konstantin Osipov	e7c9ca560b	test: issue a read barrier before checking ring consistency Raft replication doesn't guarantee that all replicas see identical Raft state at all times, it only guarantees the same order of events on all replicas. When comparing raft state with gossip state on a node, first issue a read barrier to ensure the node has the latest raft state. To issue a read barrier it is sufficient to alter a non-existing state: in order to validate the DDL the node needs to sync with the leader and fetch its latest group0 state. Fixes #13518 (flaky topology test). Closes #13756	2023-05-04 12:22:07 +02:00
Gleb Natapov	dc6c3b60b4	init: move raft_group0 creation before storage_service storage_service uses raft_group0 so the latter needs to exist until the former is stopped.	2023-05-04 13:03:18 +03:00
Gleb Natapov	e9fb885e82	service/raft: raft_group0: drop dependency on cdc::generation_service raft_group0 does not really depend on cdc::generation_service, it needs it only transiently, so pass it to appropriate methods of raft_group0 instead of during its creation.	2023-05-04 13:03:07 +03:00
Pavel Emelyanov	e00d3188ed	s3/test: Add ability to run boost test over real s3 Support the AWS_S3_EXTRA environment variable that's :-split and the respective substrings are set as endpoint AWS configuration. This makes it possible to run boost S3 test over real S3. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-05-03 20:23:38 +03:00
Pavel Emelyanov	3bec5ea2ce	s3/client: Keep server port on config Currently the code temporarily assumes that the endpoint port is 9000. This is what tests' local minio is started with. This patch keeps the port number on endpoint config and makes test get the port number from minio starting code via environment. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-05-03 20:19:43 +03:00
Pavel Emelyanov	85f06ca556	s3/client: Construct it with config Similar to previous patch -- extend the s3::client constructor to get the endpoint config value next to the endpoint string. For now the configs are likely empty, but they are yet unused too. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-05-03 20:19:43 +03:00
Pavel Emelyanov	caf9e357c8	s3/client: Construct it with sstring endpoint Currently the client is constructed with socket_address which is prepared by the caller from the endpoint string. That's not flexible enough, because s3 client needs to know the original endpoint string for two reasons. First, it needs to lookup endpoint config for potential AWS creds. Second, it needs this exact value as Host: header in its http requests. So this patch just relaxes the client constructor to accept the endpoint string and hard-code the 9000 port. The latter is temporary, this is how local tests' minio is started, but next patch will make it configurable. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-05-03 20:19:43 +03:00
Pavel Emelyanov	2f6aa5b52e	code: Introduce conf/object_storage.yaml configuration file In order to access real S3 bucket, the client should use signed requests over https. Partially this is due to security considerations, partially this is unavoidable, because multipart-uploading is banned for unsigned requests on the S3. Also, signed requests over plain http require signing the payload as well, which is a bit troublesome, so it's better to stick to secure https and keep payload unsigned. To prepare signed requests the code needs to know three things: - aws key - aws secret - aws region name The latter could be derived from the endpoint URL, but it's simpler to configure it explicitly, all the more so there's an option to use S3 URLs without region name in them we could want to use some time. To keep the described configuration the proposed place is the object_storage.yaml file with the format endpoints: - name: a.b.c port: 443 aws_key: 12345 aws_secret: abcdefghijklmnop ... When loaded, the map gets into db::config and later will be propagated down to sstables code (see next patch). Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-05-03 20:19:15 +03:00
Botond Dénes	4365f004c1	test/boost/mutation_test: add sanity test for mutation compaction validator Checking that compacted fragments are forwarded to the validator intact.	2023-05-03 04:19:42 -04:00
Nadav Har'El	b5f28e2b55	Merge 'Add S3 support to sstables::test_env' from Pavel Emelyanov Currently there are only 2 tests for S3 -- the pure client test and compound object_store test that launches scylla, creates s3-backed table and CQL-queries it. At the same time there's a whole lot of small unit tests for sstables functionality, part of which can run over S3 storage too. This PR adds this support and patches several test cases to use it. More test cases are to come later on demand. fixes: #13015 Closes #13569 * github.com:scylladb/scylladb: test: Make resharding test run over s3 too test: Add lambda to fetch bloom filter size test: Tune resharding test use of sstable::test_env test: Make datafile test case run over s3 too test: Propagate storage options to table_for_test test: Add support for s3 storage_options in config test: Outline sstables::test_env::do_with_async() test: Keep storage options on sstable_test_env config sstables: Add and call storage::destroy() sstables: Coroutinize sstable::destroy()	2023-05-02 21:48:05 +03:00
Botond Dénes	72003dc35c	readers: evictable_reader: skip progress guarantee when next pos is partition start The evictable reader must ensure that each buffer fill makes forward progress, i.e. the last fragment in the buffer has a position larger than the last fragment from the last buffer-fill. Otherwise, the reader could get stuck in an infinite loop between buffer fills, if the reader is evicted in-between. The code guaranteeing this forward change has a bug: when the next expected position is a partition-start (another partition), the code would loop forever, effectively reading all there is from the underlying reader. To avoid this, add a special case to ignore the progress guarantee loop altogether when the next expected position is a partition start. In this case, progress is guaranteed anyway, because there is exactly one partition-start fragment in each partition. Fixes: #13491 Closes #13563	2023-05-02 16:19:32 +03:00
Botond Dénes	7baa2d9cb2	Merge 'Cleanup range printing' from Benny Halevy This mini-series cleans up printing of ranges in utils/to_string.hh It generalizes the helper function to work on a std::ranges::range, with some exceptions, and adds a helper for boost::transformed_range. It also changes the internal interface by moving `join` to the utils namespace and use std::string rather than seastar::sstring. Additional unit tests were added to test/boost/json_test Fixes #13146 Closes #13159 * github.com:scylladb/scylladb: utils: to_string: get rid of utils::join utils: to_string: get rid of to_string(std::initializer_list) utils: to_string: get rid of to_string(const Range&) utils: to_string: generalize range helpers test: add string_format_test utils: chunked_vector: add std::ranges::range ctor	2023-05-02 14:55:18 +03:00
Botond Dénes	d6ed5bbc7e	Merge 'alternator: fix validation of numbers' magnitude and precision' from Nadav Har'El DynamoDB limits the allowed magnitude and precision of numbers - valid decimal exponents are between -130 and 125 and up to 38 significant decimal digits are allowed. In contrast, Scylla uses the CQL "decimal" type which offers unlimited precision. This can cause two problems: 1. Users might get used to this "unofficial" feature and start relying on it, not allowing us to switch to a more efficient limited-precision implementation later. 2. If huge exponents are allowed, e.g., 1e-1000000, summing such a number with 1.0 will result in a huge number, huge allocations and stalls. This is highly undesirable. This series adds more tests in this area covering additional corner cases, and then fixes the issue by adding the missing verification where it's needed. After the series, all 12 tests in test/alternator/test_number.py now pass. Fixes #6794 Closes #13743 * github.com:scylladb/scylladb: alternator: unit test for number magnitude and precision function alternator: add validation of numbers' magnitude and precision test/alternator: more tests for limits on number precision and magnitude test/alternator: reproducer for DoS in unlimited-precision addition	2023-05-02 14:33:36 +03:00
Nadav Har'El	ed34f3b5e4	cql-pytest: translate Cassandra's test for LWT with collections This is a translation of Cassandra's CQL unit test source file validation/operations/InsertUpdateIfConditionTest.java into our cql-pytest framework. This test file checks various LWT conditional updates which involve collections or UDTs (there is a separate test file for LWT conditional updates which do not involve collections, which I haven't translated yet). The tests reproduce one known bug: Refs #5855: lwt: comparing NULL collection with empty value in IF condition yields incorrect results And also uncovered three previously-unknown bugs: Refs #13586: Add support for CONTAINS and CONTAINS KEY in LWT expressions Refs #13624: Add support for UDT subfields in LWT expression Refs #13657: Misformatted printout of column name in LWT error message Beyond those bona-fide bugs, this test also demonstrates several places where we intentionally deviated from Cassandra's behavior, forcing me to comment out several checks. These deviations are known, and intentional, but some of them are undocumented and it's worth listing here the ones re-discovered by this test: 1. On a successful conditional write, Cassandra returns just True, Scylla also returns the old contents of the row. This difference is officially documented in docs/kb/lwt-differences.rst. 2. Scylla allows the test "l = [null]" or "s = {null}" with this weird null element (the result is false), whereas Cassandra prints an error. 3. Scylla allows "l[null]" or "m[null]" (resulting in null), Cassandra prints an error. 4. Scylla allows a negative list index, "l[-2]", resulting in null. Cassandra prints an error in this case. 5. Cassandra allows in "IF v IN (?, ?)" to bind individual values to UNSET_VALUE and skips them, Scylla treats this as an error. Refs #13659. 6. Scylla allows "IN null" (the condition just fails), Cassandra prints an error in this case. 
Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #13663	2023-05-02 11:53:58 +03:00
Pavel Emelyanov	d4a72de406	test: Make resharding test run over s3 too Now when the test case and used lib/utils code is using storage-agnostic approach, it can be extended to run over S3 storage as well. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-05-02 11:46:23 +03:00
Pavel Emelyanov	2601c58278	test: Add lambda to fetch bloom filter size The resharding test compares bloom filter sizes before and after reshard runs. For that it gets the filter on-disk filename and stat()s it. That won't work with S3 as it doesn't have accessible on-disk files. Some time ago there existed the storage::get_stats() method, but now it's gone. The new s3::client::get_object_stat() is coming, but it will take time to switch to it. For now, generalize filter size fetching into a local lambda. Next patch will make a stub in it for S3 case, and once the get_object_stat() is there we'll be able to smoothly start using it. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-05-02 11:43:26 +03:00
Pavel Emelyanov	76594bf72b	test: Tune resharding test use of sstable::test_env The test case in question spawns async context then makes the test_env instance on the stack (and stopper for it too). There's helper for the above steps, better to use them. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-05-02 11:30:03 +03:00
Pavel Emelyanov	439c8770aa	test: Make datafile test case run over s3 too Most of the sstable_datafile test cases are capable of running with S3 storage, so this patch makes the simplest of them do it. Patching the rest from this file is optional, because mostly the cases test how the datafile data manipulations work without checking the files manipulations. So even if making them all run over S3 is possible, it will just increase the testing time w/o real test of the storage driver. So this patch makes one test case run over local and S3 storages, more patches to update more test cases with files manipulations are yet to come. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-05-02 11:30:03 +03:00
Pavel Emelyanov	f7df238545	test: Propagate storage options to table_for_test Teach table_for_tests use any storage options, not just local one. For now the only user that passes non-local options is sstables::test_env. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-05-02 11:30:03 +03:00
Pavel Emelyanov	fa1de16f30	test: Add support for s3 storage_options in config When the sstable test case wants to run over S3 storage it needs to specify that in test config by providing the S3 storage options. So first thing this patch adds is the helper that makes these options based on the env left by minio launcher from test.py. Next, in order to make sstables_manager work with S3 it needs the plugged system keyspace which, in turn, needs query processor, proxy, database, etc. All this stuff lives in cql_test_env, so the test case running with S3 options will run in a sstables::test_env nested inside cql_test_env. The latter would also need to plug its system keyspace to the former's sstables manager and turn the experimental feature ON. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-05-02 11:30:03 +03:00
Nadav Har'El	57ffbcbb22	cql3: fix spurious token names in syntax error messages We have known for a long time (see issue #1703) that the quality of our CQL "syntax error" messages leaves a lot to be desired, especially when compared to Cassandra. This patch doesn't yet bring us great error messages with great context - doing this isn't easy and it appears that Antlr3's C++ runtime isn't as good as the Java one in this regard - but this patch at least fixes garbage printed in some error messages. Specifically, when the parser can deduce that a specific token is missing, it used to print line 1:83 missing ')' at '<missing ' After this patch we get rid of the meaningless string '<missing ': line 1:83 : Missing ')' Also, when the parser deduced that a specific token was unneeded, it used to print: line 1:83 extraneous input ')' expecting <invalid> Now we got rid of this silly "<invalid>" and write just: line 1:83 : Unexpected ')' Refs #1703. I didn't yet mark that issue "fixed" because I think a complete fix would also require printing the entire misparsed line and the point of the parse failure. Scylla still prints a generic "Syntax Error" in most cases now, and although the character number (83 in the above example) can help, it's much more useful to see the actual failed statement and where character 83 is. Unfortunately some tests enshrine buggy error messages and had to be fixed. Other tests enshrined strange text for a generic unexplained error message, which used to say " : syntax error..." (note the two spaces and ellipsis) and after this patch is " : Syntax error". So these tests are changed. Another message, "no viable alternative at input" is deliberately kept unchanged by this patch so as not to break many more tests which enshrined this message. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #13731	2023-05-02 11:23:58 +03:00
Pavel Emelyanov	1e03733e8c	test: Outline sstables::test_env::do_with_async() It's growing larger, better to keep it in .cc file Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-05-02 11:15:45 +03:00
Pavel Emelyanov	f223f5357d	test: Keep storage options on sstable_test_env config So that it could be set to s3 by the test case on demand. Default is local storage which uses env's tempdir or explicit path argument. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-05-02 11:15:45 +03:00
Nadav Har'El	e74f69bb56	alternator: unit test for number magnitude and precision function In the previous patch we added a limit in Alternator for the magnitude and precision of numbers, based on a function get_magnitude_and_precision whose implementation was, unfortunately, rather elaborate and delicate. Although we did add in the previous patches some end-to-end tests which confirmed that the final decision made based on this function, to accept or reject numbers, was a correct decision in a few cases, such an elaborate function deserves a separate unit test for checking just that function in isolation. In fact, this unit test uncovered some bugs in the first implementation of get_magnitude_and_precision() which the other tests missed. Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2023-05-02 11:04:05 +03:00
Nadav Har'El	3c0603558c	alternator: add validation of numbers' magnitude and precision DynamoDB limits the allowed magnitude and precision of numbers - valid decimal exponents are between -130 and 125 and up to 38 significant decimal digits are allowed. In contrast, Scylla uses the CQL "decimal" type which offers unlimited precision. This can cause two problems: 1. Users might get used to this "unofficial" feature and start relying on it, not allowing us to switch to a more efficient limited-precision implementation later. 2. If huge exponents are allowed, e.g., 1e-1000000, summing such a number with 1.0 will result in a huge number, huge allocations and stalls. This is highly undesirable. After this patch, all tests in test/alternator/test_number.py now pass. The various failing tests which verify magnitude and precision limitations in different places (key attributes, non-key attributes, and arithmetic expressions) now pass - so their "xfail" tags are removed. Fixes #6794 Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2023-05-02 11:04:05 +03:00
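The limits quoted in 3c0603558c can be checked without doing any arithmetic on the number. The sketch below is an illustration in Python's `decimal` module, not the Alternator implementation: the function names are invented here, and treating "magnitude" as the exponent of the most significant digit and "precision" as the digit count after dropping trailing zeros is an assumption about how the limits are meant to be applied.

```python
from decimal import Decimal, InvalidOperation

MIN_MAGNITUDE = -130  # smallest allowed decimal exponent
MAX_MAGNITUDE = 125   # largest allowed decimal exponent
MAX_PRECISION = 38    # largest allowed number of significant digits


def magnitude_and_precision(value: Decimal) -> tuple:
    """Return (exponent of the leading digit, count of significant digits)."""
    digits = list(value.as_tuple().digits)
    # Trailing zeros of the coefficient carry no precision (assumption).
    while len(digits) > 1 and digits[-1] == 0:
        digits.pop()
    # adjusted() is the exponent of the most significant digit and does
    # not depend on the arithmetic context.
    return value.adjusted(), len(digits)


def is_valid_number(text: str) -> bool:
    """Check a number's textual form against the magnitude and precision limits."""
    try:
        value = Decimal(text)
    except InvalidOperation:
        return False
    if not value.is_finite():
        return False
    if value.is_zero():
        return True
    magnitude, precision = magnitude_and_precision(value)
    return MIN_MAGNITUDE <= magnitude <= MAX_MAGNITUDE and precision <= MAX_PRECISION
```

Inspecting only the coefficient and exponent keeps the check cheap even for inputs with enormous exponents, which is the case the commit message warns about.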
Nadav Har'El	0eccc49308	test/alternator: more tests for limits on number precision and magnitude We already have xfailing tests for issue #6794 - the missing checks on precision and magnitudes of numbers in Alternator - but this patch adds checks for additional corner cases. In particular we check the case that numbers are used in a key column, which goes to a different code path than numbers used in non-key columns, so it's worth testing as well. Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2023-05-02 11:04:05 +03:00
Nadav Har'El	56b8b9d670	test/alternator: reproducer for DoS in unlimited-precision addition As already noted in issue #6794, whereas DynamoDB limits the magnitude of numbers to between 10^-130 and 10^125, Scylla does not. In this patch we add yet another test for this problem, but unlike previous tests which just showed too much magnitude being allowed which always sounded like a benign problem - the test in this patch shows that this "feature" can be used to DoS Scylla - a user can send a short request that causes arbitrarily-large allocations, stalls and CPU usage. The test is currently marked "skip" because it can cause Scylla to take a very long time and/or run out of memory. It passes on DynamoDB because the excessive magnitude is simply not allowed there. Refs #6794 Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2023-05-02 11:03:51 +03:00
Benny Halevy	e6bcb1c8df	utils: to_string: get rid of to_string(std::initializer_list) It's unused. Just in case, add a unit test case for using the fmt library to format it (that includes fmt::to_string(std::initializer_list)). Note that the existing to_string implementation used square brackets to enclose the initializer_list but the new, standardized form uses curly braces. This doesn't break anything since to_string(initializer_list) wasn't used. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-05-02 10:48:46 +03:00
Benny Halevy	ba883859c7	utils: to_string: get rid of to_string(const Range&) Use fmt::to_string instead. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-05-02 10:48:46 +03:00
Benny Halevy	15c9f0f0df	utils: to_string: generalize range helpers As seen in https://github.com/scylladb/scylladb/issues/13146 the current implementation is not general enough to provide print helpers for all kind of containers. Modernize the implementation using templates based on std::ranges::range and using fmt::join. Extend unit test for formatting different types of ranges, boost::transformed ranges, deque. Fixes #13146 Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-05-02 10:48:46 +03:00
Benny Halevy	59e89efca6	test: add string_format_test Test string formatting before cleaning up utils/to_string.hh in the next patches. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-05-02 10:48:46 +03:00
Botond Dénes	022465d673	Merge 'Tone down offstrategy log message' from Benny Halevy In many cases we trigger offstrategy compaction opportunistically also when there's nothing to do. In this case we still print to the log lots of info-level messages and call `run_offstrategy_compaction` that wastes more cpu cycles on learning that it has nothing to do. This change bails out early if the maintenance set is empty and prints a "Skipping off-strategy compaction" message in debug level instead. Fixes #13466 Also, add a group_id class and return it from compaction_group and table_state. Use that to identify the compaction_group / table_state by "ks_name.cf_name compaction_group=idx/total" in log messages. Fixes #13467 Closes #13520 * github.com:scylladb/scylladb: compaction_manager: print compaction_group id compaction_group, table_state: add group_id member compaction_manager: offstrategy compaction: skip compaction if no candidates are found	2023-05-02 08:05:18 +03:00
Benny Halevy	707bd17858	everywhere: optimize calls to make_flat_mutation_reader_from_mutations_v2 with single mutation No point in going through the vector<mutation> entry-point just to discover in run time that it was called with a single-element vector, when we know that in advance. Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Closes #13733	2023-05-02 07:58:34 +03:00
Nadav Har'El	1cefb662cd	Merge 'cql3/expr: remove expr::token' from Jan Ciołek Let's remove `expr::token` and replace all of its functionality with `expr::function_call`. `expr::token` is a struct whose job is to represent a partition key token. The idea is that when the user types in `token(p1, p2) < 1234`, this will be internally represented as an expression which uses `expr::token` to represent the `token(p1, p2)` part. The situation with `expr::token` is a bit complicated. On one hand side it's supposed to represent the partition token, but sometimes it's also assumed that it can represent a generic call to the `token()` function, for example `token(1, 2, 3)` could be a `function_call`, but it could also be `expr::token`. The query planning code assumes that each occurence of expr::token represents the partition token without checking the arguments. Because of this allowing `token(1, 2, 3)` to be represented as `expr::token` is dangerous - the query planning might think that it is `token(p1, p2, p3)` and plan the query based on this, which would be wrong. Currently `expr::token` is created only in one specific case. When the parser detects that the user typed in a restriction which has a call to `token` on the LHS it generates `expr::token`. In all other cases it generates an `expr::function_call`. Even when the `function_call` represents a valid partition token, it stays a `function_call`. During preparation there is no check to see if a `function_call` to `token` could be turned into `expr::token`. This is a bit inconsistent - sometimes `token(p1, p2, p3)` is represented as `expr::token` and the query planner handles that, but sometimes it might be represented as `function_call`, which the query planner doesn't handle. There is also a problem because there's a lot of code duplication between a `function_call` and `expr::token`. All of the evaluation and preparation is the same for `expr::token` as it's for a `function_call` to the token function. 
Currently it's impossible to evaluate `expr::token` and preparation has some flaws, but implementing it would basically consist of copy-pasting the corresponding code from token `function_call`. One more aspect is multi-table queries. With `expr::token` we turn a call to the `token()` function into a struct that is schema-specific. What happens when a single expression is used to make queries to multiple tables? The schema is different, so something that is represented as `expr::token` for one schema would be represented as `function_call` in the context of a different schema. Translating expressions to different tables would require careful manipulation to convert `expr::token` to `function_call` and vice versa. This could cause trouble for index queries. Overall I think it would be best to remove `expr::token`. Although having a clear marker for the partition token is sometimes nice for query planning, in my opinion the pros are outweighed by the cons. I'm a big fan of having a single way to represent things; having two separate representations of the same thing without clear boundaries between them causes trouble. Instead of having both `expr::token` and `function_call` we can just have the `function_call` and check if it represents a partition token when needed. 
Refs: #12906 Refs: #12677 Closes: #12905 Closes #13480 * github.com:scylladb/scylladb: cql3: remove expr::token cql3: keep a schema in visitor for extract_clustering_prefix_restrictions cql3: keep a schema inside the visitor for extract_partition_range cql3/prepare_expr: make get_lhs_receiver handle any function_call cql3/expr: properly print token function_call expr_test: use unresolved_identifier when creating token cql3/expr: split possible_lhs_values into column and token variants cql3/expr: fix error message in possible_lhs_values cql3: expr: reimplement is_satisfied_by() in terms of evaluate() cql3/expr: add a schema argument to expr::replace_token cql3/expr: add a comment for expr::has_partition_token cql3/expr: add a schema argument to expr::has_token cql3: use statement_restrictions::has_token_restrictions() wherever possible cql3/expr: add expr::is_partition_token_for_schema cql3/expr: add expr::is_token_function cql3/expr: implement preparing function_call without a receiver cql3/functions: make column family argument optional in functions::get cql3/expr: make it possible to prepare expr::constant cql3/expr: implement test_assignment for column_value cql3/expr: implement test_assignment for expr::constant	2023-04-30 15:31:35 +03:00
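
The approach described in the merge above (keep only `function_call` and check on demand whether it is the partition token) can be illustrated with a minimal sketch. The names `is_token_function` and `is_partition_token_for_schema` come from the commit list, but their signatures here are assumptions, and `column_value`, `constant`, `function_call` and `schema` are simplified hypothetical stand-ins, not ScyllaDB's real types.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <variant>
#include <vector>

// Hypothetical, simplified stand-ins for illustration only.
struct column_value { std::string name; };
struct constant { long value; };
using argument = std::variant<column_value, constant>;

struct function_call {
    std::string function_name;
    std::vector<argument> args;
};

struct schema {
    // Partition key column names, in declaration order.
    std::vector<std::string> partition_key_columns;
};

// True for any call to token(), whatever its arguments are.
bool is_token_function(const function_call& fc) {
    return fc.function_name == "token";
}

// True only when the call is token(p1, ..., pn), where p1..pn are exactly
// the partition key columns of the given schema, in order.
bool is_partition_token_for_schema(const function_call& fc, const schema& s) {
    if (!is_token_function(fc) ||
        fc.args.size() != s.partition_key_columns.size()) {
        return false;
    }
    for (std::size_t i = 0; i < fc.args.size(); ++i) {
        const auto* col = std::get_if<column_value>(&fc.args[i]);
        if (col == nullptr || col->name != s.partition_key_columns[i]) {
            return false;
        }
    }
    return true;
}
```

Because the check takes the schema as an argument, the same expression can be classified differently for different tables, and a call such as `token(1, 2, 3)` is never mistaken for the partition token.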


4825 Commits