scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-04-25 11:00:35 +00:00

Author	SHA1	Message	Date
Botond Dénes	f527956cdb	readers: remove v1 empty_reader The only user is row level repair: it is replaced with downgrade_to_v1(make_empty_flat_reader_v2()). The row level reader has lots of downgrade_to_v1() calls, we will deal with these later all at once. Another use is the empty mutation source, this is trivially converted to use the v2 variant.	2022-04-28 14:12:24 +03:00
Botond Dénes	ea37e9c04e	readers: remove v1 delegating_reader The only user is a test, which is hereby converted to use the v2 delegating reader.	2022-04-28 14:12:24 +03:00
Botond Dénes	70d019116f	sstables/kl: make reader impl v2 native The conversion is shallow: the meat of the logic remains v1, fragments are converted to v2 right before being pushed into the buffer. This approach is simple, surgical and is still better then a full upgrade_to_v2().	2022-04-28 14:12:24 +03:00
Botond Dénes	a22b02c801	sstables/kl: return v2 reader from factory methods This just moves the upgrade_to_v2() calls to the other side of said factory methods, preparing the ground for converting the kl reader impl to a native v2 one.	2022-04-28 14:12:24 +03:00
Botond Dénes	4b222e7f37	sstables: move mp_row_consumer_reader_k_l to kl/reader.cc Its only user is in said file, so that is a better place for it.	2022-04-28 14:12:24 +03:00
Botond Dénes	4f77e74bd4	partition_snapshot_reader: convert implementation to native v2 The underlying mutation representation is still v1, so the implementation still has to do conversion. This happens right above the lsa reader component.	2022-04-28 14:12:12 +03:00
Botond Dénes	9c7455825b	mutation_fragment_v2: range_tombstone_change: add minimal_memory_usage()	2022-04-28 14:11:51 +03:00
Avi Kivity	de0ee13f45	schema_tables: forward-declare user_function and user_aggerates These bring in wasm.hh (though they really shouldn't) and make everyone suffer. Forward declare instead and add missing includes where needed. Closes #10444	2022-04-28 07:22:02 +03:00
Botond Dénes	2c08468fcb	Merge 'Make headers self-contained' from Avi Kivity Minor fixlets to make `ninja dev-headers` pass. Closes #10445 * github.com:scylladb/scylla: readers/from_mutations_v2.hh: make self-contained data_dictionary/storage_options.hh: make self-contained	2022-04-28 07:20:10 +03:00
Avi Kivity	a9812166cd	replica, partition_snapshot_reader, keys: replace boost::any with std::any Reduce #include load by standardizing on std::any. In keys.cc, we just drop the unneeded include. One instance of boost::any remains in config_file, due to a tie-in with other boost components. Closes #10441	2022-04-28 07:18:53 +03:00
Avi Kivity	3a81cb7cc3	readers/from_mutations_v2.hh: make self-contained Due to an inline function, we need the definition of flat_mutation_reader_v2.hh, so include it.	2022-04-27 15:55:16 +03:00
Avi Kivity	28406c2c56	data_dictionary/storage_options.hh: make self-contained Add "seastarx.hh" so sstring works (rather than seastar::sstring).	2022-04-27 15:54:32 +03:00
Avi Kivity	333fdcb3f5	Update tools/java submodule (fix NodeProbe: Malformed IPv6 address at index) * tools/java 9bc83b7a32...a4573759a2 (1): > CASSANDRA-17581 fix NodeProbe: Malformed IPv6 address at index Fixes #10442.	2022-04-27 14:51:47 +03:00
Benny Halevy	e88871f4ec	replica: database: move shard_of implementation to mutation layer We don't need the database to determine the shard of the mutation, only its schema. So move the implementation to the respecive definitions of mutation and frozen_mutation. Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Closes #10430	2022-04-27 14:40:24 +03:00
Nadav Har'El	f6ce7891a5	test/alternator: add test for key length limits DynamoDB limits partition-key length to 2048 bytes and sort-key length to 1024 bytes. Alternator currently has no such limits officially, but if a user tries a key length of over 64 KB, the result will be an "internal server error" as Alternator runs into Scylla's low-level key length limit of 64 KB. In this patch we add (mostly xfailing) tests confirming all the above observations. The tests include extensive comments on what they are testing and why. Some of these tests (specifically, the ones checking what happens above 64 KB) should pass once Alternator is fixed. Other tests - requiring that the limits be exactly what they are in DynamoDB - may either not pass or change in the future, depending on what we decide the limits should be in Alternator. Refs #10347 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #10438	2022-04-26 18:09:19 +02:00
Avi Kivity	582802825a	treewide: use system-#include (angle brackets) for seastar Seastar is an external library from Scylla's point of view so we should use the angle bracket #include style. Most of the source follows this, this patch fixes a few stragglers. Also fix cases of #include which reached out to seastar's directory tree directly, via #include "seastar/include/sesatar/..." to just refer to <seastar/...>. Closes #10433	2022-04-26 14:46:42 +03:00
Takuya ASADA	48b6aec16a	scripts: use "out()" function for all capture_output subprocesses On `acaf0bb` we applied out() just for perftune.py because we had issue #10390 with this script. But the issue can happen with other commands too, let's apply it to all commands which uses capture_output. related #10390 Closes #10414	2022-04-26 13:56:52 +03:00
Benny Halevy	01f41630a5	compaction: time_window_compaction_strategy: reset estimated_remaining_tasks when running out of candidates _estimated_remaining_tasks gets updated via get_next_non_expired_sstables -> get_compaction_candidates, but otherwise if we return earlier from get_sstables_for_compaction, it does not get updated and may go out of sync. Refs #10418 (to be closed when the fix reaches branch-4.6) Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Closes #10419	2022-04-26 11:26:48 +03:00
Benny Halevy	055141fc2e	multishard_mutation_query: do_query: stop ctx if lookup_readers fails lookup_readers might fail after populating some readers and those better be closed before returning the exception. Fixes #10351 Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Closes #10425	2022-04-26 11:11:52 +03:00
Botond Dénes	bf1b6ced3c	Merge "Make storage_service::bootstrap less if-y" from Pavel Emelyanov " The method in question performs node bootstrap in several different modes (regular, replacing, rnbo) and several subsequent if-else branches just duplicate each-other. This set merges them making the code easier to read. " * 'br-less-branchy-bootstrap' of https://github.com/xemul/scylla: storage_service: Remove pointless check in replace-bootstrap storage_service: Generalize wait for range setup storage_service: Merge common if-else branches in bootstrap storage_service: Move tables bootstrap-ON upwards	2022-04-26 10:58:30 +03:00
Raphael S. Carvalho	d79fb9a12f	docs: Update compaction controller doc The doc is being updated to reflect the changes in the commit `d8833de3bb` ("Redefine Compaction Backlog to tame compaction aggressiveness"). Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2022-04-26 10:50:45 +03:00
Benny Halevy	db676e9e4a	replica: database: apply: make sure the schema is synced or throw internal error Currently an exception is thrown in the apply stage when the schema is not synced, but it is too late since returning an error doesn't pinpoint which code path was using an unsync'ed schema so move the check earlier, before _apply_stage is called. We need to make sure the schema is synced earlier when the mutation is applied so call on_internal_error to generate a backtrace in testing and still throw an error in production. Typically storage_proxy::mutate_locally implicitly ensures the schema is synced by making a global_schema_ptr for it. Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Message-Id: <20220424110057.3957597-1-bhalevy@scylladb.com>	2022-04-25 12:18:47 +02:00
Benny Halevy	bcd35af7cf	replica: table: generate_and_propagate_view_updates: pass mutation to make_flat_mutation_reader_from_mutations_v2 With `f5ef687acd` we can consume the single mutation directly, so there's n need to pass it as a vector of size 1. Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Message-Id: <20220424103826.3930895-1-bhalevy@scylladb.com>	2022-04-24 22:19:19 +03:00
Avi Kivity	728479a6ea	Merge 'Fix map subscript crashes when map or subscript is null' from Nadav Har'El In the filtering expression "WHERE m[?] = 2", our implementation was buggy when either the map, or the subscript, was NULL (and also when the latter was an UNSET_VALUE). Our code ended up dereferencing null objects, yielding bizarre errors when we were lucky, or crashes when we were less lucky - see examples of both in issues #10361, #10399, #10401. The existing test `test_null.py::test_map_subscript_null` reproduced all these bugs sporadically. In this series we improve the test to reproduce the separate bugs separately, and also reproduce additional problems (like the UNSET_VALUE). We then define both `m[NULL]` and `NULL[2]` to result in NULL instead of the existing undefined (and buggy, and crashing) behavior. This new definition is consistent with our usual SQL-inspired tradition that NULL "wins" in expressions - e.g., `NULL < 2` is also defined as resulting in NULL. However, this decision differs from Cassandra, where `m[NULL]` is considered an error but `NULL[2]` is allowed. We believe that making `m[NULL]` be a NULL instead of an error is more consistent, and moreover - necessary if we ever want to support more complicate expressions like `m[a]`, where the column `a` can be NULL for some rows and non-NULL for others, and it doesn't make sense to return an "invalid query" error in the middle of the scan. Fixes #10361 Fixes #10399 Fixes #10401 Closes #10420 * github.com:scylladb/scylla: expressions: don't dereference invalid map subscript in filter expressions: fix invalid dereference in map subscript evaluation test/cql-pytest: improve tests for map subscripts and nulls	2022-04-24 21:16:10 +03:00
Avi Kivity	a4be927e23	Revert "memtable_list: futurize clear_and_add" This reverts commit `2325c566d9`. It causes a use-after-free of a memtable. Fixes #10421.	2022-04-24 21:09:48 +03:00
Asias He	953af38281	streaming: Allow drop table during streaming Currently, if a table is dropped during streaming, the streaming would fail with no_such_column_family error. Since the table is dropped anyway, it makes more sense to ignore the streaming result of the dropped table, whether it is successful or failed. This allows users to drop tables during node operations, e.g., bootstrap or decommission a node. This is especially useful for the cloud users where it is hard to coordinate between a node operation by admin and user cql change. This patch also fixes a possible user after free issue by not passing the table reference object around. Fixes #10395 Closes #10396	2022-04-24 17:43:20 +03:00
Tzach Livyatan	607ccf0393	Update doc project name to scylla dev Closes #10342	2022-04-24 17:40:54 +03:00
Nadav Har'El	fbb2a41246	expressions: don't dereference invalid map subscript in filter If we have the filter expression "WHERE m[?] = 2", the existing code simply assumed that the subscript is an object of the right type. However, while it should indeed be the right type (we already have code that verifies that), there are two more options: It can also be a NULL, or an UNSET_VALUE. Either of these cases causes the existing code to dereference a non-object as an object, leading to bizarre errors (as in issue #10361) or even crashes (as in issue #10399). Cassandra returns a invalid request error in these cases: "Unsupported unset map key for column m" or "Unsupported null map key for column m". We decided to do things differently: * For NULL, we consider m[NULL] to result in NULL - instead of an error. This behavior is more consistent with other expressions that contain null - for example NULL[2] and NULL<2 both result in NULL as well. Moreover, if in the future we allow more complex expressions, such as m[a] (where a is a column), we can find the subscript to be null for some rows and non-null for other rows - and throwing an "invalid query" in the middle of the filtering doesn't make sense. * For UNSET_VALUE, we do consider this an error like Cassandra, and use the same error message as Cassandra. However, the current implementation checks for this error only when the expression is evaluated - not before. It means that if the scan is empty before the filtering, the error will not be reported and we'll silently return an empty result set. We currently consider this ok, but we can also change this in the future by binding the expression only once (today we do it on every evaluation) and validating it once after this binding. Fixes #10361 Fixes #10399 Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2022-04-24 16:05:34 +03:00
Nadav Har'El	808a93d29b	expressions: fix invalid dereference in map subscript evaluation When we have an filter such as "WHERE m[2] = 3" (where m is a map column), if a row had a null value for m, our expression evaluation code incorrectly dereferences an unset optional, and continued processing the result of this dereference which resulted in undefined behavior - sometimes we were lucky enough to get "marshaling error" but other times Scylla crashed. The fix is trivial - just check before dereferencing the optional value of the map. We return null in that case, which means that we consider the result of null[2] to be null. I think this is a reasonable approach and fits our overall approach of making null dominate expressions (e.g., the value of "null < 2" is also null). The test test_filtering.py::test_filtering_null_map_with_subscript, which used to frequently fail with marshaling errors or crashes, now passes every time so its "xfail" mark is removed. Fixes #10417 Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2022-04-24 14:58:56 +03:00
Nadav Har'El	189b8845fe	test/cql-pytest: improve tests for map subscripts and nulls The test test_null.py::test_map_subscript_null turned out to reproduce multiple bugs related to using map subscripts in filtering expressions. One was issue #10361 (m[null] resulted in a bizarre error) or #10399 (m[null] resulted in a crash), and a different issue was #10401 (m[2] resulted in a bizarre error or a crash if m itself was null). Moreover, the same test uncovered different bugs depending how it was run - alone or with other tests - because it was using a shared table. In this patch we introduce two separate tests in test_filtering.py which are designed to reproduce these separate bugs instead of mixing them into one test. The new tests also cover a few more corners which the previous test (which focused on nulls) missed - such as UNSET_VALUE. The two new tests (and the old test_map_subscript_null) pass on Cassandra so still assume that the Cassandra behavior - that m[null] should be an error - is the correct behavior. We may want to change the desired behavior (e.g., to decide that m[null] be null, not an error), and change the tests accordingly later - but for now the tests follow Cassandra's behavior exactly, and pass on Cassandra and fail on Scylla (so are marked xfail). The bugs reproduced by these tests involve randomness or reading uninitialized memory, so these tests sometimes pass, sometimes fail, and sometimes even crash (as reported in #10399 and #10401). So to reproduce these bugs run the tests multiple times. For example: test/cql-pytest/run --count 100 --runxfail test_filtering.py::test_filtering_null_map_with_subscript Refs #10361 Refs #10399 Refs #10401 Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2022-04-24 13:26:26 +03:00
Avi Kivity	8624718983	Merge "row_cache: update reader implementations to v2" from Botond " cache_flat_mutation_reader gets a native v2 implementation. The underlying mutation representation is not changed: range deletions are still stored as v1 range_tombstones in mutation_partition. These are converted to range tombstone changes during reading. This allows for separating the change of a native v2 reader implementation and a native v2 in-memory storage format, enabling the two to be done at separate times and incrementally. This means there is still conversion ingoing when reading from cache and when populating, but when reading from underlying, the stream can now be passed through as-is without conversions. Also, any future v2 related changes to the in-memory storage will now be limited to the cache reader implementation itself. In the process, the non-forwarding reader, whose only user is the cache, is also converted to v2. " Performance results reported by Botond: " build/release/test/perf/perf_simple_query -c1 -m2G --flush -- duration=20 BEFORE median 130421.76 tps ( 71.1 allocs/op, 12.1 tasks/op, 47462 insns/op) median absolute deviation: 319.64 maximum: 131028.33 minimum: 127502.55 AFTER median 133297.41 tps ( 64.1 allocs/op, 12.2 tasks/op, 45406 insns/op) median absolute deviation: 2964.24 maximum: 137581.56 minimum: 123739.4 Getting rid of those upgrade/downgrade was good for allocs and ops. Curiously there is a 0.1 rise in number of tasks though. " * 'row-cache-readers-v2/v1' of https://github.com/denesb/scylla: row_cache: update reader implementations to v2 range_tombstone_change_generator: flush(): add end_of_range readers/nonforwardable: convert to v2 read_context: fix indentation read_context: coroutinize move_to_next_partition() row_cache: cache_entry::read(): return v2 reader row_cache: return v2 readers from make_reader*() readers/delegating_v2: s/make_delegating_reader_v2/make_delegating_reader/	2022-04-23 19:10:43 +03:00
Botond Dénes	5e97fb9fc4	row_cache: update reader implementations to v2 cache_flat_mutation_reader gets a native v2 implementation. The underlying mutation representation is not changed: range deletions are still stored as v1 range_tombstones in mutation_partition. These are converted to range tombstone changes during reading. This allows for separating the change of a native v2 reader implementation and a native v2 in-memory storage format, enabling the two to be done at separate times and incrementally.	2022-04-21 14:57:04 +03:00
Botond Dénes	5cc5fd4d23	range_tombstone_change_generator: flush(): add end_of_range Allowing to flush all range tombstone changes, including those that have a position equal to the passed in upper bound, when finishing off a read-range, e.g. a clustering range from a slice.	2022-04-21 14:37:10 +03:00
Botond Dénes	7626beb729	readers/nonforwardable: convert to v2 It has a single user, the row cache, which for now has to upgrade/downgrade around the nonforwardable reader, but this will go away in the next patches when the row cache readers are converted to v2 proper.	2022-04-21 14:34:00 +03:00
Botond Dénes	b061acb668	Merge 'Remove queue reader v1' from Mikołaj Sielużycki The patchset embeds the mutation_fragment upgrading logic from v1 to v2 into the mutation_fragment_queue. This way the mutation fragments coming to the mutation_fragment_queue can be v1, but the underlying query_reader receives mutation_fragment_v2, eliminating the last usage of query_reader (v1). The last commit removes query_reader, query_reader_handle and associated factory functions. tests: unit(dev), dtest(incremental_repair_test, read_repair_test, repair_additional_test, repair_test) Closes #10371 * github.com:scylladb/scylla: readers: Remove queue_reader v1 and associated code. repair: Make mutation_fragment_queue internally upgrade fragments to v2 repair: Make mutation_fragment_queue::impl a seastar::shared_ptr	2022-04-21 12:34:48 +03:00
Mikołaj Sielużycki	f74fd0dd80	readers: Remove queue_reader v1 and associated code.	2022-04-20 17:56:34 +02:00
Mikołaj Sielużycki	339b60e5b0	repair: Make mutation_fragment_queue internally upgrade fragments to v2	2022-04-20 17:55:58 +02:00
Mikołaj Sielużycki	eeb2b458de	repair: Make mutation_fragment_queue::impl a seastar::shared_ptr It makes mutation_fragment_queue copyable and makes the pointer to pending mutation fragments in next commit stable. This allows moving the mutation_fragment_queue without breaking the underlying upgrading_consumer.	2022-04-20 17:51:58 +02:00
Botond Dénes	46481264e9	read_context: fix indentation Broken by the previous patch (patches actually -- it was half-indent on half-indent before that).	2022-04-20 10:59:09 +03:00
Botond Dénes	28f90728a3	read_context: coroutinize move_to_next_partition() Makes the code more readable and the impending v2 transition less noisy.	2022-04-20 10:59:09 +03:00
Botond Dénes	2a0d7e8a1d	row_cache: cache_entry::read(): return v2 reader Push the conversion down one level. Soon we will make cache flat mutation reader a v2 reader, this keeps the related noise separate.	2022-04-20 10:59:09 +03:00
Botond Dénes	0b035c9099	row_cache: return v2 readers from make_reader*() And adjust callers. The factory functions just sprinkle upgrade_to_v2() on returned readers for now. One test in row_cache_test.cc had to be disabled, because the upgrade to v2 wrapper we now have over cache readers doesn't allow it to directly control the reader's buffer size and so the test fails. There is a FIXME left in the test code and the test will be re-enabled once a native v2 reader implementation allows us to get rid of the upgrade wrapper.	2022-04-20 10:59:09 +03:00
Botond Dénes	c3c71b3aa5	readers/delegating_v2: s/make_delegating_reader_v2/make_delegating_reader/ The argument type (v1 or v2 reader) is enough to disambiguate and overloading the v1 method makes a transition to v2 more seamless.	2022-04-20 10:59:09 +03:00
Nadav Har'El	cc40685c28	test/cql-pytest: add test for filtering with IN restriction It turns out that Cassandra does not allow IN restrictions together with filtering, except, curiously, when the restriction is on a clustering key. There is no real reason for this limitation - the error message even says it is not yet supported. Scylla, on the other hand, does support this case. Of course it's not enough that we support it - we need to support it correctly... But we don't have a full regression test that this support is correct - in filtering_test.cc we test it with clustering and regular columns - but not partition key columns. So this patch adds a simple cql-pytest test that this sort of filtering works in Scylla correctly for partition, clustering and regular columns (and also confirms that these cases don't work, yet, on Cassandra). Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20220420075553.1008062-1-nyh@scylladb.com>	2022-04-20 09:56:22 +02:00
Konstantin Osipov	a3b790b413	test.py: add a dependency on python3-aiohttp and tabulate Satisfy the build system requirements. [avi: regenerate frozen toolchain]	2022-04-19 18:22:50 +03:00
Konstantin Osipov	097fbc7c5d	.gitignore: ignore mypy_cache, the python lint cache	2022-04-19 16:48:47 +03:00
Pavel Emelyanov	41392a59bb	storage_service: Remove pointless check in replace-bootstrap The method in question is called in the branch where the replace address is checked to be present, no need in extra explicit check. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-04-19 13:27:52 +03:00
Pavel Emelyanov	49481b1a21	storage_service: Generalize wait for range setup Both the if is_replacing()/else branches call gossiper wating method as their first steps. Can be done once. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-04-19 13:27:52 +03:00
Pavel Emelyanov	d213e6ffd1	storage_service: Merge common if-else branches in bootstrap There are three modes in there -- bootstrap, b.s. with RBNO and b.s. for replacing. All three are checked two times in a row, but can be done once. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-04-19 13:27:52 +03:00
Pavel Emelyanov	b0df3a32b4	storage_service: Move tables bootstrap-ON upwards This call just places a boolean flag on all. It won't hurt if it lasts while the node is performing pre-bootstrap checks, but it allows making the whole method less branchy. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-04-19 13:27:52 +03:00

1 2 3 4 5 ...

30983 Commits