scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-05-23 08:12:08 +00:00

Author	SHA1	Message	Date
Nadav Har'El	808a93d29b	expressions: fix invalid dereference in map subscript evaluation When we have a filter such as "WHERE m[2] = 3" (where m is a map column), if a row had a null value for m, our expression evaluation code incorrectly dereferenced an unset optional and continued processing the result of this dereference, which resulted in undefined behavior - sometimes we were lucky enough to get "marshaling error" but other times Scylla crashed. The fix is trivial - just check before dereferencing the optional value of the map. We return null in that case, which means that we consider the result of null[2] to be null. I think this is a reasonable approach and fits our overall approach of making null dominate expressions (e.g., the value of "null < 2" is also null). The test test_filtering.py::test_filtering_null_map_with_subscript, which used to frequently fail with marshaling errors or crashes, now passes every time so its "xfail" mark is removed. Fixes #10417 Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2022-04-24 14:58:56 +03:00
Nadav Har'El	189b8845fe	test/cql-pytest: improve tests for map subscripts and nulls The test test_null.py::test_map_subscript_null turned out to reproduce multiple bugs related to using map subscripts in filtering expressions. One was issue #10361 (m[null] resulted in a bizarre error) or #10399 (m[null] resulted in a crash), and a different issue was #10401 (m[2] resulted in a bizarre error or a crash if m itself was null). Moreover, the same test uncovered different bugs depending on how it was run - alone or with other tests - because it was using a shared table. In this patch we introduce two separate tests in test_filtering.py which are designed to reproduce these separate bugs instead of mixing them into one test. The new tests also cover a few more corners which the previous test (which focused on nulls) missed - such as UNSET_VALUE. The two new tests (and the old test_map_subscript_null) pass on Cassandra so still assume that the Cassandra behavior - that m[null] should be an error - is the correct behavior. We may want to change the desired behavior (e.g., to decide that m[null] be null, not an error), and change the tests accordingly later - but for now the tests follow Cassandra's behavior exactly, and pass on Cassandra and fail on Scylla (so are marked xfail). The bugs reproduced by these tests involve randomness or reading uninitialized memory, so these tests sometimes pass, sometimes fail, and sometimes even crash (as reported in #10399 and #10401). So to reproduce these bugs run the tests multiple times. For example: test/cql-pytest/run --count 100 --runxfail test_filtering.py::test_filtering_null_map_with_subscript Refs #10361 Refs #10399 Refs #10401 Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2022-04-24 13:26:26 +03:00
Avi Kivity	8624718983	Merge "row_cache: update reader implementations to v2" from Botond " cache_flat_mutation_reader gets a native v2 implementation. The underlying mutation representation is not changed: range deletions are still stored as v1 range_tombstones in mutation_partition. These are converted to range tombstone changes during reading. This allows for separating the change of a native v2 reader implementation and a native v2 in-memory storage format, enabling the two to be done at separate times and incrementally. This means there is still conversion ingoing when reading from cache and when populating, but when reading from underlying, the stream can now be passed through as-is without conversions. Also, any future v2 related changes to the in-memory storage will now be limited to the cache reader implementation itself. In the process, the non-forwarding reader, whose only user is the cache, is also converted to v2. " Performance results reported by Botond: " build/release/test/perf/perf_simple_query -c1 -m2G --flush -- duration=20 BEFORE median 130421.76 tps ( 71.1 allocs/op, 12.1 tasks/op, 47462 insns/op) median absolute deviation: 319.64 maximum: 131028.33 minimum: 127502.55 AFTER median 133297.41 tps ( 64.1 allocs/op, 12.2 tasks/op, 45406 insns/op) median absolute deviation: 2964.24 maximum: 137581.56 minimum: 123739.4 Getting rid of those upgrade/downgrade was good for allocs and ops. Curiously there is a 0.1 rise in number of tasks though. " * 'row-cache-readers-v2/v1' of https://github.com/denesb/scylla: row_cache: update reader implementations to v2 range_tombstone_change_generator: flush(): add end_of_range readers/nonforwardable: convert to v2 read_context: fix indentation read_context: coroutinize move_to_next_partition() row_cache: cache_entry::read(): return v2 reader row_cache: return v2 readers from make_reader*() readers/delegating_v2: s/make_delegating_reader_v2/make_delegating_reader/	2022-04-23 19:10:43 +03:00
Botond Dénes	5e97fb9fc4	row_cache: update reader implementations to v2 cache_flat_mutation_reader gets a native v2 implementation. The underlying mutation representation is not changed: range deletions are still stored as v1 range_tombstones in mutation_partition. These are converted to range tombstone changes during reading. This allows for separating the change of a native v2 reader implementation and a native v2 in-memory storage format, enabling the two to be done at separate times and incrementally.	2022-04-21 14:57:04 +03:00
Botond Dénes	5cc5fd4d23	range_tombstone_change_generator: flush(): add end_of_range Allowing to flush all range tombstone changes, including those that have a position equal to the passed in upper bound, when finishing off a read-range, e.g. a clustering range from a slice.	2022-04-21 14:37:10 +03:00
Botond Dénes	7626beb729	readers/nonforwardable: convert to v2 It has a single user, the row cache, which for now has to upgrade/downgrade around the nonforwardable reader, but this will go away in the next patches when the row cache readers are converted to v2 proper.	2022-04-21 14:34:00 +03:00
Botond Dénes	b061acb668	Merge 'Remove queue reader v1' from Mikołaj Sielużycki The patchset embeds the mutation_fragment upgrading logic from v1 to v2 into the mutation_fragment_queue. This way the mutation fragments coming to the mutation_fragment_queue can be v1, but the underlying query_reader receives mutation_fragment_v2, eliminating the last usage of query_reader (v1). The last commit removes query_reader, query_reader_handle and associated factory functions. tests: unit(dev), dtest(incremental_repair_test, read_repair_test, repair_additional_test, repair_test) Closes #10371 * github.com:scylladb/scylla: readers: Remove queue_reader v1 and associated code. repair: Make mutation_fragment_queue internally upgrade fragments to v2 repair: Make mutation_fragment_queue::impl a seastar::shared_ptr	2022-04-21 12:34:48 +03:00
Mikołaj Sielużycki	f74fd0dd80	readers: Remove queue_reader v1 and associated code.	2022-04-20 17:56:34 +02:00
Mikołaj Sielużycki	339b60e5b0	repair: Make mutation_fragment_queue internally upgrade fragments to v2	2022-04-20 17:55:58 +02:00
Mikołaj Sielużycki	eeb2b458de	repair: Make mutation_fragment_queue::impl a seastar::shared_ptr It makes mutation_fragment_queue copyable and makes the pointer to pending mutation fragments in next commit stable. This allows moving the mutation_fragment_queue without breaking the underlying upgrading_consumer.	2022-04-20 17:51:58 +02:00
Botond Dénes	46481264e9	read_context: fix indentation Broken by the previous patch (patches actually -- it was half-indent on half-indent before that).	2022-04-20 10:59:09 +03:00
Botond Dénes	28f90728a3	read_context: coroutinize move_to_next_partition() Makes the code more readable and the impending v2 transition less noisy.	2022-04-20 10:59:09 +03:00
Botond Dénes	2a0d7e8a1d	row_cache: cache_entry::read(): return v2 reader Push the conversion down one level. Soon we will make cache flat mutation reader a v2 reader, this keeps the related noise separate.	2022-04-20 10:59:09 +03:00
Botond Dénes	0b035c9099	row_cache: return v2 readers from make_reader*() And adjust callers. The factory functions just sprinkle upgrade_to_v2() on returned readers for now. One test in row_cache_test.cc had to be disabled, because the upgrade to v2 wrapper we now have over cache readers doesn't allow it to directly control the reader's buffer size and so the test fails. There is a FIXME left in the test code and the test will be re-enabled once a native v2 reader implementation allows us to get rid of the upgrade wrapper.	2022-04-20 10:59:09 +03:00
Botond Dénes	c3c71b3aa5	readers/delegating_v2: s/make_delegating_reader_v2/make_delegating_reader/ The argument type (v1 or v2 reader) is enough to disambiguate and overloading the v1 method makes a transition to v2 more seamless.	2022-04-20 10:59:09 +03:00
Nadav Har'El	cc40685c28	test/cql-pytest: add test for filtering with IN restriction It turns out that Cassandra does not allow IN restrictions together with filtering, except, curiously, when the restriction is on a clustering key. There is no real reason for this limitation - the error message even says it is not yet supported. Scylla, on the other hand, does support this case. Of course it's not enough that we support it - we need to support it correctly... But we don't have a full regression test that this support is correct - in filtering_test.cc we test it with clustering and regular columns - but not partition key columns. So this patch adds a simple cql-pytest test that this sort of filtering works in Scylla correctly for partition, clustering and regular columns (and also confirms that these cases don't work, yet, on Cassandra). Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20220420075553.1008062-1-nyh@scylladb.com>	2022-04-20 09:56:22 +02:00
Konstantin Osipov	a3b790b413	test.py: add a dependency on python3-aiohttp and tabulate Satisfy the build system requirements. [avi: regenerate frozen toolchain]	2022-04-19 18:22:50 +03:00
Konstantin Osipov	097fbc7c5d	.gitignore: ignore mypy_cache, the python lint cache	2022-04-19 16:48:47 +03:00
Pavel Emelyanov	41392a59bb	storage_service: Remove pointless check in replace-bootstrap The method in question is called in the branch where the replace address is checked to be present, no need in extra explicit check. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-04-19 13:27:52 +03:00
Pavel Emelyanov	49481b1a21	storage_service: Generalize wait for range setup Both the if is_replacing()/else branches call the gossiper waiting method as their first step. This can be done once. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-04-19 13:27:52 +03:00
Pavel Emelyanov	d213e6ffd1	storage_service: Merge common if-else branches in bootstrap There are three modes in there -- bootstrap, b.s. with RBNO and b.s. for replacing. All three are checked two times in a row, but can be done once. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-04-19 13:27:52 +03:00
Pavel Emelyanov	b0df3a32b4	storage_service: Move tables bootstrap-ON upwards This call just places a boolean flag on all. It won't hurt if it lasts while the node is performing pre-bootstrap checks, but it allows making the whole method less branchy. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-04-19 13:27:52 +03:00
Avi Kivity	469bca5369	storage_proxy: coroutinize mutate_locally (vector overload) The do_with() means we have an unconditional allocation, so we can justify the coroutine's allocation (replacing it). Meanwhile, coroutine::parallel_for_each() reduces an allocation if mutate_locally() blocks. Closes #10387	2022-04-19 10:59:16 +03:00
Botond Dénes	3051fc3cbc	Merge 'Fix some errors and issues found by gcc 12' from Avi Kivity gcc 12 checks some things that clang doesn't, resulting in compile errors. This series fixes some of theses issues, but still builds (and tests) with clang. Unfortunately, we still don't have a clean gcc build due to an outstanding bug [1]. [1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98056 Closes #10386 * github.com:scylladb/scylla: build: disable warnings that cause false-positive errors with gcc 12 utils: result_loop: remove invalid and incorrect constraint service: forward_service: avoid using deprecated std::bind1st and std::not1 repair: explicityl ignore tombstone gc update response treewide: abort() after switch in formatters db: view: explicitly ignore unused result compaction: leveled_compaction_strategy: avoid compares between signed and unsigned compaction_manager: compaction_reenabler: disambiguate compaction_state api: avoid function specialization in req_param alternator: ttl: avoid specializing class templates in non-namespace scope alternator: executor: fix signed/unsigned comparison in is_big()	2022-04-19 10:25:38 +03:00
Botond Dénes	4d972b8d31	Merge 'storage_proxy: convert rpc handlers from lambdas to member functions' from Avi Kivity Currently, rpc handlers are all lambdas inside storage_proxy::init_messaging_service(). This means any stack trace refers to storage_proxy::init_messaging_service::lambda#n instead of a meaningful function name, and it makes init_messaging_service() very intimidating. Fix that by moving all such lambdas to regular member functions. The first two patches remove unnecessary captures to make it easy; the final patch converts the lambdas to member functions. Closes #10388 * github.com:scylladb/scylla: storage_proxy: convert rpc handlers from lambdas to member functions storage_proxy: don't capture messaging_service in server callbacks storage_proxy: don't capture migration_manager in server callbacks	2022-04-19 08:20:49 +03:00
Takuya ASADA	acaf0bb88a	scripts: print perftune.py error message when capture_output=True We are currently unable to get any error message from a subprocess when we specify capture_output=True on subprocess.run(). This is because CalledProcessError does not print stdout/stderr when it is raised, and we don't catch the exception; we just let Python produce a traceback. As a result, we only learn the exit status and the failed command, not stdout/stderr. This is especially problematic when working on a perftune.py bug, since the script should have caused a traceback but we are never able to see it. To resolve this, add a wrapper function "out()" for capturing output, and print stdout/stderr with the error message inside the function. Fixes #10390 Closes #10391	2022-04-18 14:06:51 +03:00
Avi Kivity	27093d32d1	Merge 'gms: gossiper: coroutinize `apply_state` functions' from Pavel Solodovnikov Mostly trivial conversions to coroutines in the gossiper to facilitate code readability. Closes #10389 * github.com:scylladb/scylla: gms: gossiper: coroutinize `apply_state_locally` gms: gossiper: coroutinize `apply_state_locally_without_listener_notification` gms: gossiper: coroutinize `do_apply_state_locally` gms: gossiper: coroutinize `apply_new_states`	2022-04-18 13:48:07 +03:00
Avi Kivity	7129ddfa67	build: disable warnings that cause false-positive errors with gcc 12 gcc 12 generates some incorrect warnings (that we treat as errors). Silence them so we can build.	2022-04-18 12:27:18 +03:00
Avi Kivity	160bbb00dd	utils: result_loop: remove invalid and incorrect constraint Checking a concept in a requires-expression requires an additional requires keyword. Moreover, the constraint is incorrect (at least all callers pass a T, not a result<T>), so remove it. Found by gcc 12.	2022-04-18 12:27:18 +03:00
Avi Kivity	e55f5fab53	service: forward_service: avoid using deprecated std::bind1st and std::not1 Switch to newer alternatives std::bind_front, std::not_fn.	2022-04-18 12:27:18 +03:00
Avi Kivity	5da586271f	repair: explicityl ignore tombstone gc update response The response struct is empty and we have nothing to do with it. Cast it to void to avoid a gcc warning.	2022-04-18 12:27:18 +03:00
Avi Kivity	1e1c0226a6	treewide: abort() after switch in formatters It is typical in switch statements to select on an enum type and rely on the compiler to complain if an enum value was missed. But gcc isn't satisfied since the enum could have a value outside the declared list. Call abort() in this impossible situation to pacify it.	2022-04-18 12:27:18 +03:00
Avi Kivity	a1df583dea	db: view: explicitly ignore unused result Otherwise, gcc complains.	2022-04-18 12:27:18 +03:00
Avi Kivity	eb436ac940	compaction: leveled_compaction_strategy: avoid compares between signed and unsigned These can overflow. Here, there is no such risk, but switch to unsigned to avoid the warning.	2022-04-18 12:27:18 +03:00
Avi Kivity	fa7172fcad	compaction_manager: compaction_reenabler: disambiguate compaction_state compaction_state is used both as a type and a function, which gcc does not like. Disambiguate by fully qualifying the type name.	2022-04-18 12:27:18 +03:00
Avi Kivity	de6631656c	api: avoid function specialization in req_param Function specializations are not allowed (you're supposed to use overloads), but clang appears to allow them. Here, we can't use an overload since the type doesn't appear in the parameter list. Use a constraint instead.	2022-04-18 12:27:18 +03:00
Avi Kivity	40beb48176	alternator: ttl: avoid specializing class templates in non-namespace scope The C++ standard disallows class template specialization in non-namespace scopes. Clang apparently allows it as an extension. Fix by not using a template - there are just two specializations and no generic implementation. Use regular classes and std::conditional_t to choose between the two.	2022-04-18 12:27:18 +03:00
Avi Kivity	b5e8e32c01	alternator: executor: fix signed/unsigned comparison in is_big() Signed/unsigned comparisons are subject to C promotion rules. In is_big() in this case the comparison is safe, but gcc warns. Use a cast to silence the warning. The signed/unsigned mix and int/size_t size differences still look bad; it would be good to revisit this code, but that is left for another patch.	2022-04-18 12:23:18 +03:00
Piotr Sarna	fea18943cd	schema_tables: drop leftover change to system_schema.keyspaces Series `59d56a3fd7` introduced an accidental backward incompatible regression by adding a column to system_schema.keyspaces and then not even using it for anything. It's a leftover from the original hackathon implementation and should never have reached master in the first place. Fortunately, the series isn't part of any stable release yet. Fixes #10376 Tests: manual, verifying that the system_schema.keyspaces table no longer contains the extraneous column. Closes #10377	2022-04-18 12:00:43 +03:00
Avi Kivity	36aee57978	storage_proxy: convert rpc handlers from lambdas to member functions Currently, rpc handlers are all lambdas inside storage_proxy::init_messaging_service(). This means any stack trace refers to storage_proxy::init_messaging_service::lambda#n instead of a meaningful function name, and it makes init_messaging_service() very intimidating. Fix that by moving all such lambdas to regular member functions. This is easy now that they don't capture anything except `this`, which we provide during registration via std::bind_front(). A few #includes and forward declarations had to be added to storage_proxy.hh. This is unfortunate, but can only be solved by splitting storage_proxy into a client part and a server part.	2022-04-17 19:03:06 +03:00
Avi Kivity	f7e8109b16	storage_proxy: don't capture messaging_service in server callbacks We'd like to make the server callbacks member functions, rather than lambdas, so we need to eliminate their captures. This patch eliminates 'ms' by referring to the already existing member '_messaging' instead.	2022-04-17 17:55:05 +03:00
Avi Kivity	4cac2eb43e	storage_proxy: don't capture migration_manager in server callbacks We'd like to make the server callbacks member functions, rather than lambdas, so we need to eliminate their captures. This patch eliminates 'mm' by making it a member variable and capturing 'this' instead. In one case 'mm' was used by a handle_write() intermediate lambda so we have to make that non-static and capture it too. uninit_messaging_service() clears the member variable to preserve the same lifetime 'mm' had before, in case that's important.	2022-04-17 17:54:51 +03:00
Avi Kivity	86dfe75268	Update seastar submodule * seastar acf7e3523b...5e86362704 (10): > Merge "Respect taskset-configured cpumask" from Pavel E Ref #9505. > rpc_tester: Run CPU hogs on server side too > std-coroutine: include <coroutine> for LLVM-15 > Revert "Merge "tests: perf: measure coroutines performance" from Benny" > test: perf_tests: remove [[gnu::always_inline]] attribute from coroutine perf tests > Merge "tests: perf: measure coroutines performance" from Benny > Merge "Extend RPC tester" from Pavel E > rpc: Mark connection trivial getters const noexcept > seastar-addr2line: Allow use of llvm-addr2line as the command > file: append_challenged_posix_file: Serialize allocate() to not block concurrent reads or writes	2022-04-17 17:11:31 +03:00
Pavel Solodovnikov	b25c4fee01	gms: gossiper: coroutinize `apply_state_locally` Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>	2022-04-17 11:51:18 +03:00
Pavel Solodovnikov	746f1179eb	gms: gossiper: coroutinize `apply_state_locally_without_listener_notification` Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>	2022-04-17 11:38:33 +03:00
Pavel Solodovnikov	b7322c3f5d	gms: gossiper: coroutinize `do_apply_state_locally` Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>	2022-04-17 11:29:26 +03:00
Pavel Solodovnikov	c48dcf607a	gms: gossiper: coroutinize `apply_new_states` Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>	2022-04-17 11:28:42 +03:00
Kamil Braun	b1b22f2c2b	service: raft: don't support/advertise USES_RAFT feature The code would advertise the USES_RAFT feature when the SUPPORTS_RAFT feature was enabled through a listener registered on the SUPPORTS_RAFT feature. This would cause a deadlock: 1. `gossiper::add_local_application_state(SUPPORTED_FEATURES, ...)` locks the gossiper (it's called for the first time from sstables format selector). 2. The function calls `on_change` listeners. 3. One of the listeners is the one for SUPPORTS_RAFT. 4. The listener calls `gossiper::add_local_application_state(SUPPORTED_FEATURES, ...)`. 5. This tries to lock the gossiper. In turn, depending on timing, this could hang the startup procedure, which calls `add_local_application_state` multiple times at various points, trying to take the lock inside gossiper. This prevents us from testing raft / group 0, new schema change procedures that use group 0, etc. For now, simply remove the code that advertises the USES_RAFT feature. Right now the feature has no other effect on the system than just becoming enabled. In fact, it's possible that we don't need this second feature at all (SUPPORTS_RAFT may be enough), but that's work-in-progress. If needed, it will be easy to bring the enabling code back (in a fixed form that doesn't cause a deadlock). We don't remove the feature definitions yet just in case. Refs: #10355	2022-04-15 16:08:25 +02:00
Avi Kivity	4a5082bfc8	main: fix discarded future during prometheus start sequence Probably not triggerable since it will be a while before we recognize a signal to exit. But a FIXME is a FIXME. Closes #10374	2022-04-15 16:40:31 +03:00
Avi Kivity	d90415434e	main: wait for memory_threshold_guard start We start the memory threshold guard (that enables large memory allocation warnings post-boot) but don't wait for it. I can't imagine it can hurt, but it does carry a FIXME label. Closes #10375	2022-04-15 16:37:47 +03:00

31056 Commits