scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-06-08 16:03:20 +00:00

Author	SHA1	Message	Date
Pavel Emelyanov	3bec5ea2ce	s3/client: Keep server port on config Currently the code temporarily assumes that the endpoint port is 9000. This is what tests' local minio is started with. This patch keeps the port number on endpoint config and makes test get the port number from minio starting code via environment. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-05-03 20:19:43 +03:00
Pavel Emelyanov	85f06ca556	s3/client: Construct it with config Similar to the previous patch -- extend the s3::client constructor to get the endpoint config value next to the endpoint string. For now the configs are likely empty, but they are yet unused too. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-05-03 20:19:43 +03:00
Pavel Emelyanov	caf9e357c8	s3/client: Construct it with sstring endpoint Currently the client is constructed with a socket_address that is prepared by the caller from the endpoint string. That's not flexible enough, because the s3 client needs to know the original endpoint string for two reasons. First, it needs to look up the endpoint config for potential AWS creds. Second, it needs this exact value as the Host: header in its http requests. So this patch just relaxes the client constructor to accept the endpoint string and hard-codes the 9000 port. The latter is temporary, this is how local tests' minio is started, but the next patch will make it configurable. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-05-03 20:19:43 +03:00
Pavel Emelyanov	711514096a	sstables: Make s3_storage with endpoint config Continuation of the previous patch. The sstables::s3_storage gets the endpoint config instance upon creation. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-05-03 20:19:43 +03:00
Pavel Emelyanov	bd1e3c688f	sstables_manager: Keep object storage configs onboard The user sstables manager will need to provide endpoint config for sstables' storage drivers. For that it needs to get it from db::config and keep in-sync with its updates. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-05-03 20:19:43 +03:00
Pavel Emelyanov	2f6aa5b52e	code: Introduce conf/object_storage.yaml configuration file In order to access real S3 bucket, the client should use signed requests over https. Partially this is due to security considerations, partially this is unavoidable, because multipart-uploading is banned for unsigned requests on the S3. Also, signed requests over plain http require signing the payload as well, which is a bit troublesome, so it's better to stick to secure https and keep payload unsigned. To prepare signed requests the code needs to know three things: - aws key - aws secret - aws region name The latter could be derived from the endpoint URL, but it's simpler to configure it explicitly, all the more so there's an option to use S3 URLs without region name in them we could want to use some time. To keep the described configuration the proposed place is the object_storage.yaml file with the format endpoints: - name: a.b.c port: 443 aws_key: 12345 aws_secret: abcdefghijklmnop ... When loaded, the map gets into db::config and later will be propagated down to sstables code (see next patch). Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-05-03 20:19:15 +03:00
Nadav Har'El	b5f28e2b55	Merge 'Add S3 support to sstables::test_env' from Pavel Emelyanov Currently there are only 2 tests for S3 -- the pure client test and the compound object_store test that launches scylla, creates an s3-backed table and CQL-queries it. At the same time there's a whole lot of small unit tests for sstables functionality, part of which can run over S3 storage too. This PR adds this support and patches several test cases to use it. More test cases are to come later on demand. fixes: #13015 Closes #13569 * github.com:scylladb/scylladb: test: Make resharding test run over s3 too test: Add lambda to fetch bloom filter size test: Tune resharding test use of sstable::test_env test: Make datafile test case run over s3 too test: Propagate storage options to table_for_test test: Add support for s3 storage_options in config test: Outline sstables::test_env::do_with_async() test: Keep storage options on sstable_test_env config sstables: Add and call storage::destroy() sstables: Coroutinize sstable::destroy()	2023-05-02 21:48:05 +03:00
Botond Dénes	72003dc35c	readers: evictable_reader: skip progress guarantee when next pos is partition start The evictable reader must ensure that each buffer fill makes forward progress, i.e. the last fragment in the buffer has a position larger than the last fragment from the last buffer-fill. Otherwise, the reader could get stuck in an infinite loop between buffer fills, if the reader is evicted in-between. The code guaranteeing this forward change has a bug: when the next expected position is a partition-start (another partition), the code would loop forever, effectively reading all there is from the underlying reader. To avoid this, add a special case to ignore the progress guarantee loop altogether when the next expected position is a partition start. In this case, progress is guaranteed anyway, because there is exactly one partition-start fragment in each partition. Fixes: #13491 Closes #13563	2023-05-02 16:19:32 +03:00
Botond Dénes	7baa2d9cb2	Merge 'Cleanup range printing' from Benny Halevy This mini-series cleans up printing of ranges in utils/to_string.hh It generalizes the helper function to work on a std::ranges::range, with some exceptions, and adds a helper for boost::transformed_range. It also changes the internal interface by moving `join` to the utils namespace and using std::string rather than seastar::sstring. Additional unit tests were added to test/boost/json_test Fixes #13146 Closes #13159 * github.com:scylladb/scylladb: utils: to_string: get rid of utils::join utils: to_string: get rid of to_string(std::initializer_list) utils: to_string: get rid of to_string(const Range&) utils: to_string: generalize range helpers test: add string_format_test utils: chunked_vector: add std::ranges::range ctor	2023-05-02 14:55:18 +03:00
Botond Dénes	d6ed5bbc7e	Merge 'alternator: fix validation of numbers' magnitude and precision' from Nadav Har'El DynamoDB limits the allowed magnitude and precision of numbers - valid decimal exponents are between -130 and 125 and up to 38 significant decimal digits are allowed. In contrast, Scylla uses the CQL "decimal" type which offers unlimited precision. This can cause two problems: 1. Users might get used to this "unofficial" feature and start relying on it, not allowing us to switch to a more efficient limited-precision implementation later. 2. If huge exponents are allowed, e.g., 1e-1000000, summing such a number with 1.0 will result in a huge number, huge allocations and stalls. This is highly undesirable. This series adds more tests in this area covering additional corner cases, and then fixes the issue by adding the missing verification where it's needed. After the series, all 12 tests in test/alternator/test_number.py now pass. Fixes #6794 Closes #13743 * github.com:scylladb/scylladb: alternator: unit test for number magnitude and precision function alternator: add validation of numbers' magnitude and precision test/alternator: more tests for limits on number precision and magnitude test/alternator: reproducer for DoS in unlimited-precision addition	2023-05-02 14:33:36 +03:00
Nadav Har'El	ed34f3b5e4	cql-pytest: translate Cassandra's test for LWT with collections This is a translation of Cassandra's CQL unit test source file validation/operations/InsertUpdateIfConditionTest.java into our cql-pytest framework. This test file checks various LWT conditional updates which involve collections or UDTs (there is a separate test file for LWT conditional updates which do not involve collections, which I haven't translated yet). The tests reproduce one known bug: Refs #5855: lwt: comparing NULL collection with empty value in IF condition yields incorrect results And also uncovered three previously-unknown bugs: Refs #13586: Add support for CONTAINS and CONTAINS KEY in LWT expressions Refs #13624: Add support for UDT subfields in LWT expression Refs #13657: Misformatted printout of column name in LWT error message Beyond those bona-fide bugs, this test also demonstrates several places where we intentionally deviated from Cassandra's behavior, forcing me to comment out several checks. These deviations are known, and intentional, but some of them are undocumented and it's worth listing here the ones re-discovered by this test: 1. On a successful conditional write, Cassandra returns just True, Scylla also returns the old contents of the row. This difference is officially documented in docs/kb/lwt-differences.rst. 2. Scylla allows the test "l = [null]" or "s = {null}" with this weird null element (the result is false), whereas Cassandra prints an error. 3. Scylla allows "l[null]" or "m[null]" (resulting in null), Cassandra prints an error. 4. Scylla allows a negative list index, "l[-2]", resulting in null. Cassandra prints an error in this case. 5. Cassandra allows in "IF v IN (?, ?)" to bind individual values to UNSET_VALUE and skips them, Scylla treats this as an error. Refs #13659. 6. Scylla allows "IN null" (the condition just fails), Cassandra prints an error in this case. 
Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #13663	2023-05-02 11:53:58 +03:00
Pavel Emelyanov	d4a72de406	test: Make resharding test run over s3 too Now when the test case and used lib/utils code is using storage-agnostic approach, it can be extended to run over S3 storage as well. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-05-02 11:46:23 +03:00
Pavel Emelyanov	2601c58278	test: Add lambda to fetch bloom filter size The resharding test compares bloom filter sizes before and after reshard runs. For that it gets the filter on-disk filename and stat()s it. That won't work with S3 as it doesn't have accessible on-disk files. Some time ago there existed the storage::get_stats() method, but now it's gone. The new s3::client::get_object_stat() is coming, but it will take time to switch to it. For now, generalize filter size fetching into a local lambda. Next patch will make a stub in it for S3 case, and once the get_object_stat() is there we'll be able to smoothly start using it. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-05-02 11:43:26 +03:00
Kefu Chai	135b4fd434	db: schema_tables: capture reference to temporary value by value `clustering_key_columns()` returns a range view, and `front()` returns the reference to its first element. so we cannot assume the availability of this reference after the expression is evaluated. to address this issue, let's capture the returned range by value, and keep the first element by reference. this also silences warning from GCC-13: ``` /home/kefu/dev/scylladb/db/schema_tables.cc:3654:30: error: possibly dangling reference to a temporary [-Werror=dangling-reference] 3654 \| const column_definition& first_view_ck = v->clustering_key_columns().front(); \| ^~~~~~~~~~~~~ /home/kefu/dev/scylladb/db/schema_tables.cc:3654:79: note: the temporary was destroyed at the end of the full expression ‘(& v)->view_ptr::operator->()->schema::clustering_key_columns().boost::iterator_range<__gnu_cxx::__normal_iterator<const column_definition, std::vector<column_definition> > >::<anonymous>.boost::iterator_range_detail::iterator_range_base<__gnu_cxx::__normal_iterator<const column_definition, std::vector<column_definition> >, boost::iterators::random_access_traversal_tag>::<anonymous>.boost::iterator_range_detail::iterator_range_base<__gnu_cxx::__normal_iterator<const column_definition, std::vector<column_definition> >, boost::iterators::bidirectional_traversal_tag>::<anonymous>.boost::iterator_range_detail::iterator_range_base<__gnu_cxx::__normal_iterator<const column_definition, std::vector<column_definition> >, boost::iterators::incrementable_traversal_tag>::front()’ 3654 \| const column_definition& first_view_ck = v->clustering_key_columns().front(); \| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~ ``` Fixes #13720 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #13721	2023-05-02 11:42:43 +03:00
Pavel Emelyanov	76594bf72b	test: Tune resharding test use of sstable::test_env The test case in question spawns async context then makes the test_env instance on the stack (and stopper for it too). There's helper for the above steps, better to use them. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-05-02 11:30:03 +03:00
Pavel Emelyanov	439c8770aa	test: Make datafile test case run over s3 too Most of the sstable_datafile test cases are capable of running with S3 storage, so this patch makes the simplest of them do it. Patching the rest from this file is optional, because mostly the cases test how the datafile data manipulations work without checking the files manipulations. So even if making them all run over S3 is possible, it will just increase the testing time w/o real test of the storage driver. So this patch makes one test case run over local and S3 storages, more patches to update more test cases with files manipulations are yet to come. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-05-02 11:30:03 +03:00
Pavel Emelyanov	f7df238545	test: Propagate storage options to table_for_test Teach table_for_tests use any storage options, not just local one. For now the only user that passes non-local options is sstables::test_env. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-05-02 11:30:03 +03:00
Pavel Emelyanov	fa1de16f30	test: Add support for s3 storage_options in config When the sstable test case wants to run over S3 storage it needs to specify that in test config by providing the S3 storage options. So first thing this patch adds is the helper that makes these options based on the env left by minio launcher from test.py. Next, in order to make sstables_manager work with S3 it needs the plugged system keyspace which, in turn, needs query processor, proxy, database, etc. All this stuff lives in cql_test_env, so the test case running with S3 options will run in a sstables::test_env nested inside cql_test_env. The latter would also need to plug its system keyspace to the former's sstables manager and turn the experimental feature ON. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-05-02 11:30:03 +03:00
Nadav Har'El	57ffbcbb22	cql3: fix spurious token names in syntax error messages We have known for a long time (see issue #1703) that the quality of our CQL "syntax error" messages leaves a lot to be desired, especially when compared to Cassandra. This patch doesn't yet bring us great error messages with great context - doing this isn't easy and it appears that Antlr3's C++ runtime isn't as good as the Java one in this regard - but this patch at least fixes garbage printed in some error messages. Specifically, when the parser can deduce that a specific token is missing, it used to print line 1:83 missing ')' at '<missing ' After this patch we get rid of the meaningless string '<missing ': line 1:83 : Missing ')' Also, when the parser deduced that a specific token was unneeded, it used to print: line 1:83 extraneous input ')' expecting <invalid> Now we got rid of this silly "<invalid>" and write just: line 1:83 : Unexpected ')' Refs #1703. I haven't yet marked that issue "fixed" because I think a complete fix would also require printing the entire misparsed line and the point of the parse failure. Scylla still prints a generic "Syntax Error" in most cases now, and although the character number (83 in the above example) can help, it's much more useful to see the actual failed statement and where character 83 is. Unfortunately some tests enshrine buggy error messages and had to be fixed. Other tests enshrined strange text for a generic unexplained error message, which used to say " : syntax error..." (note the two spaces and ellipsis) and after this patch is " : Syntax error". So these tests are changed. Another message, "no viable alternative at input" is deliberately kept unchanged by this patch so as not to break many more tests which enshrined this message. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #13731	2023-05-02 11:23:58 +03:00
Pavel Emelyanov	1e03733e8c	test: Outline sstables::test_env::do_with_async() It's growing larger, better to keep it in .cc file Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-05-02 11:15:45 +03:00
Pavel Emelyanov	f223f5357d	test: Keep storage options on sstable_test_env config So that it could be set to s3 by the test case on demand. Default is local storage which uses env's tempdir or explicit path argument. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-05-02 11:15:45 +03:00
Pavel Emelyanov	81a1416ebf	sstables: Add and call storage::destroy() The s3_storage leaks the client when an sstable gets destroyed. So far this went unnoticed, but a debug-mode unit test run over minio captured it. So here's the fix. When an sstable is destroyed it also kicks the storage to do whatever cleanup is needed. In case of s3 storage the cleanup is in closing the on-boarded client. Until #13458 is fixed each sstable has its own private version of the client and there's no other place where it can be close()d in a co_await-able manner. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-05-02 11:15:44 +03:00
Avi Kivity	c0eb0d57bc	install-dependencies.sh: don't use fgrep fgrep says: fgrep: warning: fgrep is obsolescent; using grep -F follow its advice. Closes #13729	2023-05-02 11:15:40 +03:00
Pavel Emelyanov	3e0c3346a8	sstables: Coroutinize sstable::destroy() To simplify patching by the next patch Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-05-02 11:15:11 +03:00
Nadav Har'El	e74f69bb56	alternator: unit test for number magnitude and precision function In the previous patch we added a limit in Alternator for the magnitude and precision of numbers, based on a function get_magnitude_and_precision whose implementation was, unfortunately, rather elaborate and delicate. Although we did add in the previous patches some end-to-end tests which confirmed that the final decision made based on this function, to accept or reject numbers, was a correct decision in a few cases, such an elaborate function deserves a separate unit test for checking just that function in isolation. In fact, this unit tests uncovered some bugs in the first implementation of get_magnitude_and_precision() which the other tests missed. Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2023-05-02 11:04:05 +03:00
Nadav Har'El	3c0603558c	alternator: add validation of numbers' magnitude and precision DynamoDB limits the allowed magnitude and precision of numbers - valid decimal exponents are between -130 and 125 and up to 38 significant decimal digits are allowed. In contrast, Scylla uses the CQL "decimal" type which offers unlimited precision. This can cause two problems: 1. Users might get used to this "unofficial" feature and start relying on it, not allowing us to switch to a more efficient limited-precision implementation later. 2. If huge exponents are allowed, e.g., 1e-1000000, summing such a number with 1.0 will result in a huge number, huge allocations and stalls. This is highly undesirable. After this patch, all tests in test/alternator/test_number.py now pass. The various failing tests which verify magnitude and precision limitations in different places (key attributes, non-key attributes, and arithmetic expressions) now pass - so their "xfail" tags are removed. Fixes #6794 Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2023-05-02 11:04:05 +03:00
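The limits quoted in the commit above can be illustrated with a short Python sketch. This is not Scylla's `get_magnitude_and_precision` (which is C++ and, per the log, more elaborate); the helper names `magnitude_and_precision` and `is_valid_number` are hypothetical. The sketch also assumes that "magnitude" means the decimal exponent of the leading significant digit and that trailing zeros do not count toward precision.

```python
from decimal import Decimal

# Limits quoted in the commit message above.
MIN_EXPONENT = -130
MAX_EXPONENT = 125
MAX_PRECISION = 38


def magnitude_and_precision(text):
    """Return (magnitude, precision) for a decimal literal given as text.

    Hypothetical helper for illustration only.
    """
    _sign, digits, exponent = Decimal(text).as_tuple()
    if not isinstance(exponent, int):
        # NaN and infinities carry a non-integer exponent marker.
        raise ValueError(f"not a finite number: {text!r}")
    digits = list(digits)
    # Trailing zeros are not significant digits (assumption of this sketch).
    while len(digits) > 1 and digits[-1] == 0:
        digits.pop()
        exponent += 1
    if digits == [0]:
        return 0, 1  # zero: treat as magnitude 0 with a single digit
    # Exponent of the leading digit, e.g. 12300 -> 4, 0.001 -> -3.
    return exponent + len(digits) - 1, len(digits)


def is_valid_number(text):
    """Check the literal against the exponent and precision limits."""
    magnitude, precision = magnitude_and_precision(text)
    return MIN_EXPONENT <= magnitude <= MAX_EXPONENT and precision <= MAX_PRECISION
```

Only the digit tuple and the exponent are inspected, so a literal such as `1e-1000000` is rejected without ever performing arithmetic on it, which is the allocation problem the commit describes.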
Nadav Har'El	0eccc49308	test/alternator: more tests for limits on number precision and magnitude We already have xfailing tests for issue #6794 - the missing checks on precision and magnitudes of numbers in Alternator - but this patch adds checks for additional corner cases. In particular we check the case that numbers are used in a key column, which goes to a different code path than numbers used in non-key columns, so it's worth testing as well. Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2023-05-02 11:04:05 +03:00
Nadav Har'El	56b8b9d670	test/alternator: reproducer for DoS in unlimited-precision addition As already noted in issue #6794, whereas DynamoDB limits the magnitude of numbers to between 10^-130 and 10^125, Scylla does not. In this patch we add yet another test for this problem, but unlike previous tests which just showed too much magnitude being allowed, which always sounded like a benign problem - the test in this patch shows that this "feature" can be used to DoS Scylla - a user can send a short request that causes arbitrarily-large allocations, stalls and CPU usage. The test is currently marked "skip" because it can cause Scylla to take a very long time and/or run out of memory. It passes on DynamoDB because the excessive magnitude is simply not allowed there. Refs #6794 Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2023-05-02 11:03:51 +03:00
Benny Halevy	959a740dac	utils: to_string: get rid of utils::join Use `fmt::format("{}", fmt::join(...))` instead. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-05-02 10:59:58 +03:00
Benny Halevy	e6bcb1c8df	utils: to_string: get rid of to_string(std::initializer_list) It's unused. Just in case, add a unit test case for using the fmt library to format it (that includes fmt::to_string(std::initializer_list)). Note that the existing to_string implementation used square brackets to enclose the initializer_list but the new, standardized form uses curly braces. This doesn't break anything since to_string(initializer_list) wasn't used. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-05-02 10:48:46 +03:00
Benny Halevy	ba883859c7	utils: to_string: get rid of to_string(const Range&) Use fmt::to_string instead. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-05-02 10:48:46 +03:00
Benny Halevy	15c9f0f0df	utils: to_string: generalize range helpers As seen in https://github.com/scylladb/scylladb/issues/13146 the current implementation is not general enough to provide print helpers for all kinds of containers. Modernize the implementation using templates based on std::ranges::range and using fmt::join. Extend unit test for formatting different types of ranges, boost::transformed ranges, deque. Fixes #13146 Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-05-02 10:48:46 +03:00
Benny Halevy	59e89efca6	test: add string_format_test Test string formatting before cleaning up utils/to_string.hh in the next patches. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-05-02 10:48:46 +03:00
Benny Halevy	45153b58bd	utils: chunked_vector: add std::ranges::range ctor To be used in next patch for constructing chunked_vector from an initializer_list. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-05-02 10:48:46 +03:00
Wojciech Mitros	b18c21147f	cql: check if the keyspace is system when altering permissions Currently, when altering permissions on a functions resource, we only check if it's a builtin function and not if it's all functions in the "system" keyspace, which contains all builtin functions. This patch adds a check of whether the function resource keyspace is "system". This check actually covers both "single function" and "all functions in keyspace" cases, so the additional check for single functions is removed. Closes #13596	2023-05-02 10:13:59 +03:00
Botond Dénes	022465d673	Merge 'Tone down offstrategy log message' from Benny Halevy In many cases we trigger offstrategy compaction opportunistically also when there's nothing to do. In this case we still print to the log lots of info-level message and call `run_offstrategy_compaction` that wastes more cpu cycles on learning that it has nothing to do. This change bails out early if the maintenance set is empty and prints a "Skipping off-strategy compaction" message in debug level instead. Fixes #13466 Also, add an group_id class and return it from compaction_group and table_state. Use that to identify the compaction_group / table_state by "ks_name.cf_name compaction_group=idx/total" in log messages. Fixes #13467 Closes #13520 * github.com:scylladb/scylladb: compaction_manager: print compaction_group id compaction_group, table_state: add group_id member compaction_manager: offstrategy compaction: skip compaction if no candidates are found	2023-05-02 08:05:18 +03:00
Avi Kivity	9c37fdaca3	Revert "dht: incremental_owned_ranges_checker: use lower_bound()" This reverts commit `d85af3dca4`. It restores the linear search algorithm, as we expect the search to terminate near the origin. In this case linear search is O(1) while binary search is O(log n). A comment is added so we don't repeat the mistake. Closes #13704	2023-05-02 08:01:44 +03:00
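The trade-off behind the revert above can be sketched in Python. This is an illustration under stated assumptions, not the `incremental_owned_ranges_checker` code: the function names and the flat list of sorted range ends are hypothetical, and the sketch assumes tokens are queried in increasing order so the cursor usually moves zero or one step.

```python
import bisect


def advance_linear(range_ends, start, token):
    """Scan forward from `start` to the first range end >= token.

    When the answer is expected near `start`, this takes O(1) steps.
    """
    pos = start
    while pos < len(range_ends) and range_ends[pos] < token:
        pos += 1
    return pos


def advance_binary(range_ends, start, token):
    """Same result via binary search: O(log n) steps on every call."""
    return bisect.bisect_left(range_ends, token, start)
```

Both functions return the same index. They differ only in cost: the binary search halves the remaining suffix on every call, even when the next match is the element right under the cursor.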
Benny Halevy	707bd17858	everywhere: optimize calls to make_flat_mutation_reader_from_mutations_v2 with single mutation No point in going through the vector<mutation> entry-point just to discover in run time that it was called with a single-element vector, when we know that in advance. Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Closes #13733	2023-05-02 07:58:34 +03:00
Avi Kivity	72c12a1ab2	Merge 'cdc, db_clock: specialize fmt::formatter<{db_clock::time_point, generation_id}>' from Kefu Chai this is a part of a series to migrate from `operator<<(ostream&, ..)` based formatting to fmtlib based formatting. the goal here is to enable fmtlib to print `cdc::generation_id` and `db_clock::time_point` without the help of `operator<<`. the formatter of `cdc::generation_id` uses that of `db_clock::time_point`, so these two commits are posted together in a single pull request. the corresponding `operator<<()` is removed in this change, as all its callers now use fmtlib for formatting. Refs #13245 Closes #13703 * github.com:scylladb/scylladb: db_clock: specialize fmt::formatter<db_clock::time_point> cdc: generation: specialize fmt::formatter<generation_id>	2023-05-01 22:56:33 +03:00
Avi Kivity	7b7d9bcb14	Merge 'Do not access owned_ranges_ptr across shards in update_sstable_cleanup_state' from Benny Halevy This series fixes a few issues caused by `f1bbf705f9`: - table, compaction_manager: prevent cross shard access to owned_ranges_ptr - Fixes #13631 - distributed_loader: distribute_reshard_jobs: pick one of the sstable shard owners - compaction: make_partition_filter: do not assert shard ownership - allow the filtering reader now used during resharding to process tokens owned by other shards Closes #13635 * github.com:scylladb/scylladb: compaction: make_partition_filter: do not assert shard ownership distributed_loader: distribute_reshard_jobs: pick one of the sstable shard owners table, compaction_manager: prevent cross shard access to owned_ranges_ptr	2023-05-01 22:51:00 +03:00
Avi Kivity	c9dab3ac81	Merge 'treewide: fix warnings from GCC-13' from Kefu Chai this series silences the warnings from GCC 13. some of these changes are considered as critical fixes, and posted separately. see also #13243 Closes #13723 * github.com:scylladb/scylladb: cdc: initialize an optional using its value type compaction: disambiguate type name db: schema_tables: drop unused variable reader_concurrency_semaphore: fix signed/unsigned comparision locator: topology: disambiguate type names raft: disambiguate promise name in raft::awaited_conf_changes	2023-05-01 22:48:00 +03:00
Kefu Chai	37f1beade5	s3/client: do not allocate potentially big object on stack when compiling using GCC-13, it warns that: ``` /home/kefu/dev/scylladb/utils/s3/client.cc:224:9: error: stack usage might be 66352 bytes [-Werror=stack-usage=] 224 \| sstring parse_multipart_upload_id(sstring& body) { \| ^~~~~~~~~~~~~~~~~~~~~~~~~ ``` so it turns out that `rapidxml::xml_document<>` could be very large, let's allocate it on heap instead of on the stack to address this issue. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #13722	2023-05-01 22:46:18 +03:00
Kefu Chai	108f20c684	cql3: capture reference to temporary value by value `data_dictionary::database::find_keyspace()` returns a temporary object, and `data_dictionary::keyspace::user_types()` returns a reference pointing to a member of this temporary object. so we cannot use the reference after the expression is evaluated. in this change, we capture the return value of `find_keyspace()` using universal reference, and keep the return value of `user_types()` with a reference, to ensure that we can use it later. this change silences the warning from GCC-13, like: ``` /home/kefu/dev/scylladb/cql3/statements/authorization_statement.cc:68:21: error: possibly dangling reference to a temporary [-Werror=dangling-reference] 68 \| const auto& utm = qp.db().find_keyspace(*keyspace).user_types(); \| ^~~ ``` Fixes #13725 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #13726	2023-05-01 22:41:41 +03:00
Kefu Chai	b76877fd99	transport: capture reference to temp value by value `current_scheduling_group()` returns a temporary value, and `name()` returns a reference, so we cannot capture the return value by reference, and use the reference after this expression is evaluated. this would cause undefined behavior. so let's just capture it by value. this change also silences the following warning from GCC-13: ``` /home/kefu/dev/scylladb/transport/server.cc:204:11: error: possibly dangling reference to a temporary [-Werror=dangling-reference] 204 \| auto& cur_sg_name = current_scheduling_group().name(); \| ^~~~~~~~~~~ /home/kefu/dev/scylladb/transport/server.cc:204:56: note: the temporary was destroyed at the end of the full expression ‘seastar::current_scheduling_group().seastar::scheduling_group::name()’ 204 \| auto& cur_sg_name = current_scheduling_group().name(); \| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~ ``` Fixes #13719 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #13724	2023-05-01 22:40:36 +03:00
Kefu Chai	0a3a254284	cql3: do not capture reference to temporary value `data_dictionary::database::find_column_family()` returns a temporary value, and `data_dictionary::table::get_index_manager()` returns a reference in this temporary value, so we cannot capture this reference and use it after the expression is evaluated. in this change, we keep the return value of `find_column_family()` by value, to extend the lifecycle of the return value of `get_index_manager()`. this should address the warning from GCC-13, like: ``` /home/kefu/dev/scylladb/cql3/restrictions/statement_restrictions.cc:519:15: error: possibly dangling reference to a temporary [-Werror=dangling-reference] 519 \| auto& sim = db.find_column_family(_schema).get_index_manager(); \| ^~~ ``` Fixes #13727 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #13728	2023-05-01 22:39:48 +03:00
Nadav Har'El	1cefb662cd	Merge 'cql3/expr: remove expr::token' from Jan Ciołek Let's remove `expr::token` and replace all of its functionality with `expr::function_call`. `expr::token` is a struct whose job is to represent a partition key token. The idea is that when the user types in `token(p1, p2) < 1234`, this will be internally represented as an expression which uses `expr::token` to represent the `token(p1, p2)` part. The situation with `expr::token` is a bit complicated. On one hand side it's supposed to represent the partition token, but sometimes it's also assumed that it can represent a generic call to the `token()` function, for example `token(1, 2, 3)` could be a `function_call`, but it could also be `expr::token`. The query planning code assumes that each occurence of expr::token represents the partition token without checking the arguments. Because of this allowing `token(1, 2, 3)` to be represented as `expr::token` is dangerous - the query planning might think that it is `token(p1, p2, p3)` and plan the query based on this, which would be wrong. Currently `expr::token` is created only in one specific case. When the parser detects that the user typed in a restriction which has a call to `token` on the LHS it generates `expr::token`. In all other cases it generates an `expr::function_call`. Even when the `function_call` represents a valid partition token, it stays a `function_call`. During preparation there is no check to see if a `function_call` to `token` could be turned into `expr::token`. This is a bit inconsistent - sometimes `token(p1, p2, p3)` is represented as `expr::token` and the query planner handles that, but sometimes it might be represented as `function_call`, which the query planner doesn't handle. There is also a problem because there's a lot of code duplication between a `function_call` and `expr::token`. All of the evaluation and preparation is the same for `expr::token` as it's for a `function_call` to the token function. 
Currently it's impossible to evaluate `expr::token` and preparation has some flaws, but implementing it would basically consist of copy-pasting the corresponding code from token `function_call`. One more aspect is multi-table queries. With `expr::token` we turn a call to the `token()` function into a struct that is schema-specific. What happens when a single expression is used to make queries to multiple tables? The schema is different, so something that is represented as `expr::token` for one schema would be represented as `function_call` in the context of a different schema. Translating expressions to different tables would require careful manipulation to convert `expr::token` to `function_call` and vice versa. This could cause trouble for index queries. Overall I think it would be best to remove `expr::token`. Although having a clear marker for the partition token is sometimes nice for query planning, in my opinion the pros are outweighed by the cons. I'm a big fan of having a single way to represent things; having two separate representations of the same thing without clear boundaries between them causes trouble. Instead of having both `expr::token` and `function_call` we can just have the `function_call` and check if it represents a partition token when needed. 
Refs: #12906 Refs: #12677 Closes: #12905 Closes #13480 * github.com:scylladb/scylladb: cql3: remove expr::token cql3: keep a schema in visitor for extract_clustering_prefix_restrictions cql3: keep a schema inside the visitor for extract_partition_range cql3/prepare_expr: make get_lhs_receiver handle any function_call cql3/expr: properly print token function_call expr_test: use unresolved_identifier when creating token cql3/expr: split possible_lhs_values into column and token variants cql3/expr: fix error message in possible_lhs_values cql3: expr: reimplement is_satisfied_by() in terms of evaluate() cql3/expr: add a schema argument to expr::replace_token cql3/expr: add a comment for expr::has_partition_token cql3/expr: add a schema argument to expr::has_token cql3: use statement_restrictions::has_token_restrictions() wherever possible cql3/expr: add expr::is_partition_token_for_schema cql3/expr: add expr::is_token_function cql3/expr: implement preparing function_call without a receiver cql3/functions: make column family argument optional in functions::get cql3/expr: make it possible to prepare expr::constant cql3/expr: implement test_assignment for column_value cql3/expr: implement test_assignment for expr::constant	2023-04-30 15:31:35 +03:00
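The idea of keeping only `function_call` and checking on demand whether it denotes the partition token can be sketched as follows. This is a simplified illustration under stated assumptions: the `function_call` and `schema` structs are hypothetical stand-ins that hold names as plain strings, and the signature of `is_partition_token_for_schema` here is invented for the sketch; only the function's name appears in the commit list above.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, simplified stand-ins; the real expression types are richer.
struct function_call {
    std::string name;
    std::vector<std::string> args;  // column names or literals, as text
};

struct schema {
    std::vector<std::string> partition_key_columns;
};

// True only when the call is token(...) applied to exactly the partition key
// columns of this schema, in order. Anything else, such as token(1, 2, 3),
// remains an ordinary function call.
bool is_partition_token_for_schema(const function_call& fc, const schema& s) {
    return fc.name == "token" && fc.args == s.partition_key_columns;
}
```

Because the schema is an argument rather than part of the expression, the same `function_call` can be tested against several tables without being converted between representations.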
Tomasz Grabiec	aba5667760	Merge 'raft topology: refactor the coordinator to allow non-node specific topology transitions' from Kamil Braun We change the meaning and name of `replication_state`: previously it was meant to describe the "state of tokens" of a specific node; now it describes the topology as a whole - the current step in the 'topology saga'. It was moved from `ring_slice` into `topology`, renamed into `transition_state`, and the topology coordinator code was modified to switch on it first instead of node state - because there may be no single transitioning node, but the topology itself may be transitioning. This PR was extracted from #13683, it contains only the part which refactors the infrastructure to prepare for non-node specific topology transitions. Closes #13690 * github.com:scylladb/scylladb: raft topology: rename `update_replica_state` -> `update_topology_state` raft topology: remove `transition_state::normal` raft topology: switch on `transition_state` first raft topology: `handle_ring_transition`: rename `res` to `exec_command_res` raft topology: parse replaced node in `exec_global_command` raft topology: extract `cleanup_group0_config_if_needed` from `get_node_to_work_on` storage_service: extract raft topology coordinator fiber to separate class raft topology: rename `replication_state` to `transition_state` raft topology: make `replication_state` a topology-global state	2023-04-30 10:55:24 +02:00
Kefu Chai	e333bcc2da	cdc: initialize an optional using its value type as this syntax is not supported by the standard, it seems clang just silently constructs the value with the initializer list and calls the operator=, but GCC complains: ``` /home/kefu/dev/scylladb/cdc/split.cc:392:54: error: converting to ‘std::optional<partition_deletion>’ from initializer list would use explicit constructor ‘constexpr std::optional<_Tp>::optional(_Up&&) [with _Up = const tombstone&; typename std::enable_if<__and_v<std::__not_<std::is_same<std::optional<_Tp>, typename std::remove_cv<typename std::remove_reference<_Iter>::type>::type> >, std::__not_<std::is_same<std::in_place_t, typename std::remove_cv<typename std::remove_reference<_Iter>::type>::type> >, std::is_constructible<_Tp, _Up>, std::__not_<std::is_convertible<_Iter, _Iterator> > >, bool>::type <anonymous> = false; _Tp = partition_deletion]’ 392 \| _result[t.timestamp].partition_deletions = {t}; \| ^ ``` to silence the error, and to be more standard-compliant, let's use emplace() instead. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-04-29 19:34:12 +08:00
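The `emplace()` approach from the commit above can be shown on a small example. The types below are hypothetical stand-ins named after those in the error message, not the definitions in `cdc/split.cc`; the constructor is marked `explicit` here as an assumption, to reproduce a value type that is constructible but not implicitly convertible from a `tombstone`.

```cpp
#include <cassert>
#include <optional>

// Hypothetical stand-ins for the types named in the error message.
struct tombstone {
    long timestamp = 0;
};

struct partition_deletion {
    tombstone t;
    // Explicit on purpose: a tombstone does not convert implicitly.
    explicit partition_deletion(const tombstone& ts) : t(ts) {}
};

struct batch {
    std::optional<partition_deletion> partition_deletions;
};

// The form the commit message reports GCC rejecting (left commented out):
//   b.partition_deletions = {t};

void record_deletion(batch& b, const tombstone& t) {
    // emplace() direct-initializes the contained value from its arguments,
    // so no implicit conversion from tombstone is required.
    b.partition_deletions.emplace(t);
}
```

`emplace()` also destroys any previously held value before constructing the new one, so it can stand in for the assignment whether or not the optional was already engaged.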
Jan Ciolek	be8ef63bf5	cql3: remove expr::token Let's remove expr::token and replace all of its functionality with expr::function_call. expr::token is a struct whose job is to represent a partition key token. The idea is that when the user types in `token(p1, p2) < 1234`, this will be internally represented as an expression which uses expr::token to represent the `token(p1, p2)` part. The situation with expr::token is a bit complicated. On the one hand it's supposed to represent the partition token, but sometimes it's also assumed that it can represent a generic call to the token() function, for example `token(1, 2, 3)` could be a function_call, but it could also be expr::token. The query planning code assumes that each occurrence of expr::token represents the partition token without checking the arguments. Because of this, allowing `token(1, 2, 3)` to be represented as expr::token is dangerous - the query planning might think that it is `token(p1, p2, p3)` and plan the query based on this, which would be wrong. Currently expr::token is created only in one specific case. When the parser detects that the user typed in a restriction which has a call to `token` on the LHS it generates expr::token. In all other cases it generates an `expr::function_call`. Even when the `function_call` represents a valid partition token, it stays a `function_call`. During preparation there is no check to see if a `function_call` to `token` could be turned into `expr::token`. This is a bit inconsistent - sometimes `token(p1, p2, p3)` is represented as `expr::token` and the query planner handles that, but sometimes it might be represented as `function_call`, which the query planner doesn't handle. There is also a problem because there's a lot of duplication between a `function_call` and `expr::token`. All of the evaluation and preparation is the same for `expr::token` as it is for a `function_call` to the token function. 
Currently it's impossible to evaluate `expr::token` and preparation has some flaws, but implementing it would basically consist of copy-pasting the corresponding code from token `function_call`. One more aspect is multi-table queries. With `expr::token` we turn a call to the `token()` function into a struct that is schema-specific. What happens when a single expression is used to make queries to multiple tables? The schema is different, so something that is represented as `expr::token` for one schema would be represented as `function_call` in the context of a different schema. Translating expressions to different tables would require careful manipulation to convert `expr::token` to `function_call` and vice versa. This could cause trouble for index queries. Overall I think it would be best to remove expr::token. Although having a clear marker for the partition token is sometimes nice for query planning, in my opinion the pros are outweighed by the cons. I'm a big fan of having a single way to represent things; having two separate representations of the same thing without clear boundaries between them causes trouble. Instead of having expr::token and function_call we can just have the function_call and check if it represents a partition token when needed. Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>	2023-04-29 13:11:31 +02:00
Jan Ciolek	6e0ae59c5a	cql3: keep a schema in visitor for extract_clustering_prefix_restrictions The schema will be needed once we remove expr::token and switch to using expr::is_partition_token_for_schema, which requires a schema argument. Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>	2023-04-29 13:11:31 +02:00


36561 Commits