scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-04-20 16:40:35 +00:00

Author	SHA1	Message	Date
Tomasz Grabiec	e115fce8f7	Merge "raft: sometimes become a candidate even if outside the configuration" from Kamil There are situations where a node outside the current configuration is the only node that can become a leader. We become candidates in such cases. But there is an easy check for when we don't need to; a comment was added explaining that. * kbr/candidate-outside-config-v3: raft: sometimes become a candidate even if outside the configuration raft: fsm: update _commit_idx when applying snapshot	2021-08-09 12:29:03 +02:00
Piotr Dulikowski	e18b29765a	hints: add hint sync point structure Adds a sync_point structure. A sync point is a (possibly incomplete) mapping from hint queues to a replay position in it. Users will be able to create sync points consisting of the last written positions of some hint queues, so then they can wait until hint replay in all of the queues reach that point. The sync point supports serialization - first it is serialized with the help of IDL to a binary form, and then converted to a hexadecimal string. Deserialization is also possible.	2021-08-09 09:24:36 +02:00
Piotr Dulikowski	5a0942a0f8	utils,alternator: move base64 code from alternator to utils The base64 encoding/decoding functions will be used for serialization of hint sync point descriptions. Base64 format is not specific to Alternator, so it can be moved to utils.	2021-08-09 09:24:36 +02:00
Kamil Braun	7533c84e62	raft: sometimes become a candidate even if outside the configuration There are situations where a node outside the current configuration is the only node that can become a leader. We become candidates in such cases. But there is an easy check for when we don't need to; a comment was added explaining that.	2021-08-06 13:18:32 +02:00
Kamil Braun	93822b0ee7	test: raft: regression test for storing cluster configuration when taking snapshots Before the fix introduced in the previous patch, the cluster would forget its configuration when taking a snapshot, making it unable to reelect a leader. This regression test catches that.	2021-08-06 12:17:22 +02:00
Avi Kivity	52364b5da0	Merge 'cql3: Use expressions to calculate the local-index clustering ranges' from Jan Ciołek Calculating clustering ranges on a local index has been rewritten to use the new `expression` variant. This allows us to finally remove the old `bounds_ranges` function. Closes #9080 * github.com:scylladb/scylla: cql3: Remove unused functions like bounds_ranges cql3: Use expressions to calculate the local-index clustering ranges statement_restrictions_test: tests for extracting column restrictions expression: add a function to extract restrictions for a column	2021-08-05 18:32:11 +03:00
Kamil Braun	f050d3682c	raft: fsm: stronger check for outdated remote snapshots We must not apply remote snapshots with commit indexes smaller than our local commit index; this could result in out-of-order command application to the local state machine replica, leading to serializability violations. Message-Id: <20210805112736.35059-1-kbraun@scylladb.com>	2021-08-05 14:29:50 +02:00
Nadav Har'El	ae51fef57c	cql-pytest: add tests for estimated partition count In issue #9083 a user noted that whereas Cassandra's partition-count estimation is accurate, Scylla's (rewritten in commit `b93cc21`) is very inaccurate. The tests introduce here, which all xfail on Scylla, confirm this suspicion. The most important tests are the "simple" tests, involving a workload which writes N distinct partitions and then asks for the estimated partition count. Cassandra provides accurate estimates, which grow more accurate with more partitions, so it passes these tests, while Scylla provides bad estimates and fails them. Additional tests demonstrate that neither Scylla nor Cassandra can handle anything beyond the "simple" case of distinct partitions. Two tests which xfail on both Cassandra and Scylla demonstrate that if we write the same partitions to multiple sstables - or also delete partitions - the estimated partition counts will be way off. Refs #9083 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20210726211315.1515856-1-nyh@scylladb.com>	2021-08-05 08:50:19 +02:00
Nadav Har'El	d640998ca8	test/cql-pytest: add test for another ALLOW FILTERING case In this patch we add another test case for a case where ALLOW FILTERING should not be required (and Cassandra doesn't require it) but Scylla does. This problem was introduced by pull request #9122. The pull request fixed an incorrect query (see issue #9085) involving both an index and a multi-column restriction on a compound clustering key - and the fix is using filtering. However, in one specific case involving a full prefix, it shouldn't require filtering. This test reproduces this case. The new test passes on Cassandra (and also theoretically, should pass), but fails on Scylla - the check_af_optional() call fails because Scylla made the ALLOW FILTERING mandatory for that case. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20210803092046.1677584-1-nyh@scylladb.com>	2021-08-04 15:24:47 +03:00
Nadav Har'El	dba184039a	test/alternator: another test for Query's ExclusiveStartKey We already have tests for Query's ExclusiveStartKey option, but we only exercised it as a way for paging linearly through all the results. Now we add a test that confirms that ExclusiveStartKey can be used not just for paging through all the result - but also for jumping directly to the middle of a partition after any clustering key (existing or non- existing clustering key). The new test also for the first time verifies that ExclusiveStartKey with a specific format works (previous tests just copied LastEvaluatedKey to ExclusiveStartKey, so any opaque cookie could have worked). The test passes on both DynamoDB and Alternator so it did not find a new bug. But it's useful to have as a regression test, in case in the future we want to improve paging performance (see #6278) - and need to keep in mind that ExclusiveStartKey is not just for paging. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20210729114703.1609058-1-nyh@scylladb.com>	2021-08-04 15:24:47 +03:00
Kamil Braun	4165045356	test: raft: randomized_nemesis_test: handle timeouts in rpc::send_snapshot They were already correctly returned to the caller, but we had a leftover discarded future that would sometimes end up with a broken_promise exception. Ignore the exception explicitly. Message-Id: <20210803122207.78406-1-kbraun@scylladb.com>	2021-08-04 15:24:47 +03:00
Raphael S. Carvalho	a869d61c89	tests: Move compaction-related tests into its own unit With commit `1924e8d2b6`, compaction code was moved into a top level dir as compaction is layered on top of sstables. Let's continue this work by moving all compaction unit tests into its own test file. This also makes things much more organized. sstable_datafile_test, as its name implies, will only contain sstable data tests. Perhaps it should be renamed to only sstable_data_test, as the test also contains tests involving other components, not only the data one. BEFORE $ cat test/boost/sstable_datafile_test.cc \| grep TEST_CASE \| wc -l 105 AFTER $ cat test/boost/sstable_compaction_test.cc \| grep TEST_CASE \| wc -l 57 $ cat test/boost/sstable_datafile_test.cc \| grep TEST_CASE \| wc -l 48 Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <20210802192120.148583-1-raphaelsc@scylladb.com>	2021-08-02 22:26:26 +03:00
Jan Ciolek	a7d1dab066	statement_restrictions_test: tests for extracting column restrictions Add unit tests for the function extract_single_column_restrictions_for_column() Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>	2021-08-02 15:43:42 +02:00
Tomasz Grabiec	3e47f28c65	Merge "raft: use the correct term when storing a snapshot" from Kamil We should not use the current term; we should use the term of the snapshot's index, which may be lower. * https://github.com/kbr-/scylla/tree/snapshot-right-term-fix: test: raft: regression test for using the correct term when taking a snapshot test: raft: randomized_nemesis_test: server configuration parameter raft: use the correct term when storing a snapshot	2021-08-02 15:33:52 +02:00
Dejan Mircevski	debf65e136	cql3: Filter regular-index results on multi-column When a WHERE clause contains a multi-column restriction and an indexed regular column, we must filter the results. It is generally not possible to craft the index-table query so it fetches only the matching rows, because that table's clustering key doesn't match up with the column tuple. Fixes #9085. Tests: unit (dev, debug) Signed-off-by: Dejan Mircevski <dejan@scylladb.com> Closes #9122	2021-08-02 14:15:43 +03:00
Kamil Braun	ac5121a016	test: raft: regression test for using the correct term when taking a snapshot	2021-08-02 11:48:35 +02:00
Kamil Braun	63fdc718d4	test: raft: randomized_nemesis_test: server configuration parameter	2021-08-02 11:47:19 +02:00
Avi Kivity	ebda2fd4db	test: cql_test_env: increase file descriptor limit It was observed that since `fce124bd90` ('Merge "Introduce flat_mutation_reader_v2" from Tomasz') database_test takes much longer. This is expected since it now runs the upgrade/downgrade reader tests on all existing tests. It was also observed that in a similar time frame database_test sometimes times our on test machines, taking much longer than usual, even with the extra work for testing reader upgrade/downgrade. In an attempt to reproduce, I noticed ti failing on EMFILE (too many open file descriptors). I saw that tests usually use ~100 open file descriptors, while the default limit is 1024. I suspect we have runaway concurrency, but I was not able to pinpoint the cause. It could be compaction lagging behind, or cleanup work for deleting tables (the test test_database_with_data_in_sstables_is_a_mutation_source creates and deletes many tables). As a stopgap solution to unblock the tests, this patch raises the file descriptor limit in the way recommended by [1]. While tests shouldn't use so many descriptors, I ran out of ideas about how to plug the hole. Note that main() does something similar, through more elaborate since it needs to communicate to users. See `ec60f44b64` ("main: improve process file limit handling"). [1] http://0pointer.net/blog/file-descriptor-limits.html Closes #9121	2021-08-02 11:57:14 +03:00
Pavel Solodovnikov	b1a3b59a08	test: test_materialized_view: test_mv_select_stmt_bound_values: improve error handling Restrict expected exception message to filter only relevant exception, matching both for scylla and cassandra. For example, the former has this message: Cannot use query parameters in CREATE MATERIALIZED VIEW statements While the latter throws this: Bind variables are not allowed in CREATE MATERIALIZED VIEW statements Also, place cleanup code in try-finally clause. Tests: cql-pytest:test_materialized_view.py(dev) Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com> Message-Id: <20210802083912.229886-1-pa.solodovnikov@scylladb.com>	2021-08-02 11:49:50 +03:00
Nadav Har'El	e8fe1817df	cql-pytest: translate Cassandra's tests for timestamps This is a translation of Cassandra's CQL unit test source file validation/entities/TimestampTest.java into our our cql-pytest framework. This test file checks has a few tests (8) on various features of cell timestamps. All these tests pass on Cassandra and on Scylla - i.e., these tests no new Scylla bug was detected :-) Two of the new tests are very slow (6 seconds each) and check a trivial feature that was already checked elsewhere more efficiently (the fact that TTL expiration works), so I marked them "skip" after verifying they really pass. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20210801142738.1633126-1-nyh@scylladb.com>	2021-08-02 09:25:49 +02:00
Pavel Solodovnikov	d07f681a95	test: test_non_deterministic_functions: add `lwt` to test cases names The tests are related to LWT so add the corresponding prefix to all the tests cases to emphasize that. Tests: cql-pytest(dev) Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com> Message-Id: <20210801131820.164480-1-pa.solodovnikov@scylladb.com>	2021-08-01 16:23:30 +03:00
Avi Kivity	48860b135a	Merge "cql3: fix `current()` functions to be non-deterministic" from Pavel S " Previously, the following functions were incorrectly marked as pure, meaning that the function is executed at "prepare" step: `currenttimestamp()` * `currenttime()` * `currentdate()` * `currenttimeuuid()` For functions that possibly depend on timing and random seed, this is clearly a bug. Cassandra doesn't have a notion of pure functions, so they are lazily evaluated. Make Scylla to match Cassandra behavior for these functions. Add a unit-test for a fix (excluding `currentdate()` function, because there is no way to use synthetic clock with query processor and sleeping for a whole day to demonstrate correct behavior is clearly not an option). Also, extend the cql-pytest for #8604 since there are now more non-deterministic CQL functions, they are all subject to the test now. Fixes: #8816 " * 'timeuuid_function_pure_annotation_v3' of https://github.com/ManManson/scylla: test: test_non_deterministic_functions: test more non-pure functions cql3: change `current*()` CQL functions to be non-pure	2021-08-01 12:35:36 +03:00
Pavel Solodovnikov	a130921120	test: test_non_deterministic_functions: test more non-pure functions Check that all existing non-pure functions (except for `currentdate()`) work correctly with or without prepared statements. Tests: cql-pytest/test_non_deterministic_functions.py(dev) Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>	2021-08-01 12:18:26 +03:00
Pavel Solodovnikov	21d758020a	cql3: change `current()` CQL functions to be non-pure These include the following: `currenttimestamp()` * `currenttime()` * `currentdate()` * `currenttimeuuid()` Previously, they were incorrectly marked as pure, meaning that the function is executed at "prepare" step. For functions that possibly depend on timing and random seed, this is clearly a bug. Cassandra doesn't have a notion of pure functions, so they are lazily evaluated. Make Scylla to match Cassandra behavior for these functions. Add a unit-test for a fix (excluding `currentdate()` function, because there is no way to use synthetic clock with query processor and sleeping for a whole day to demonstrate correct behavior is clearly not an option). Tests: unit(dev, debug) Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>	2021-08-01 12:17:23 +03:00
Pavel Solodovnikov	1ca7825cf6	test: add a test checking that bind markers within MVs SELECT statement don't lead to a crash The request should fail with `InvalidRequest` exception and shouldn't crash the database. Don't check for actual error messages, because they are different between Scylla and Cassandra. The former has this message: Cannot use query parameters in CREATE MATERIALIZED VIEW statements While the latter throws this: Bind variables are not allowed in CREATE MATERIALIZED VIEW statements Tests: cql-pytest/test_materialized_view.py(scylla dev, cassandra trunk) Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>	2021-07-30 17:57:24 +03:00
Piotr Sarna	1c7af8d46f	cql-pytest: adjust a test case for Cassandra 4 One of the test cases stopped working against Cassandra 4, but that's just because it returns a slightly different error type. The test case is adjusted to work on both Scylla and new Cassandra. Message-Id: <222a7f63a3e9739c6fc646173306fcdb3da25890.1627655555.git.sarna@scylladb.com>	2021-07-30 17:36:23 +03:00
Avi Kivity	0876248c2b	Merge "cql3: cache function calls evaluation for non-deterministic functions" from Pavel S " `function_call` AST nodes are created for each function with side effects in a CQL query, i.e. non-deterministic functions (`uuid()`, `now()` and some others timeuuid-related). These nodes are evaluated either when a query itself is executed or query restrictions are computed (e.g. partition/clustering key ranges for LWT requests). We need to cache the calls since otherwise when handling a `bounce_to_shard` request for an LWT query, we can possibly enter an infinite bouncing loop (in case a function is used to calculate partition key ranges for a query), since the results can be different each time. Furthermore, we don't support bouncing more than one time. Returning `bounce_to_shard` message more than one time will result in a crash. Caching works only for LWT statements and only for the function calls that affect partition key range computation for the query. `variable_specifications` class is renamed to `prepare_context` and generalized to record information about each `function_call` AST node and modify them, as needed: * Check whether a given function call is a part of partition key statement restriction. * Assign ids for caching if above is true and the call is a part of an LWT statement. There is no need to include any kind of statement identifier in the cache key since `query_options` (which holds the cache) is limited to a single statement, anyway. Function calls are indexed by the order in which they appear within a statement while parsing. There is no need to include any kind of statement identifier to the cache key since `query_options` (which holds the cache) is limited to a single statement, anyway. Note that `function_call::raw` AST nodes are not created for selection clauses of a SELECT statement hence they can only accept only one of the following things as parameters: * Other function calls. * Literal values. * Parameter markers. In other words, only parameters that can be immediately reduced to a byte buffer are allowed and we don't need to handle database inputs to non-pure functions separately since they are not possible in this context. Anyhow, we don't even have a single non-pure function that accepts arguments, so precautions are not needed at the moment. Add a test written in `cql-pytest` framework to verify that both prepared and unprepared lwt statements handle `bounce_to_shard` messages correctly in such scenario. Fixes: #8604 Tests: unit(dev, debug) NOTE: the patchset uses `query_options` as a container for cached values. This doesn't look clean and `service::query_state` seems to be a better place to store them. But it's not forwarded to most of the CQL code and would mean that a huge number of places would have to be amended. The series presents a trade-off to avoid forwarding `query_state` everywhere (but maybe it's the thing that needs to be done, nonetheless). " * 'lwt_bounce_to_shard_cached_fn_v6' of https://github.com/ManManson/scylla: cql-pytest: add a test for non-pure CQL functions cql3: cache function calls evaluation for non-deterministic functions cql3: rename `variable_specifications` to `prepare_context`	2021-07-30 14:21:11 +03:00
Pavel Solodovnikov	eaf70df203	cql-pytest: add a test for non-pure CQL functions Introduce a test using `cql-pytest` framework to assert that both prepared an unprepared LWT statements (insert with `IF NOT EXISTS`) with a non-deterministic function call work correctly in case its evaluation affects partition key range computation (hence the choice of `cas_shard()` for lwt query). Tests: cql-pytest/test_non_deterministic_functions.py Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>	2021-07-30 01:22:50 +03:00
Tomasz Grabiec	7c28f77412	Merge 'Convert all remaining int tri-compares to std::strong_ordering' from Avi Kivity Convert all known tri-compares that return an int to return std::strong_ordering. Returning an int is dangerous since the caller can treat it as a bool, and indeed this series uncovered a minor bug (#9103). Test: unit (dev) Fixes #1449 Closes #9106 * github.com:scylladb/scylla: treewide: remove redundant "x <=> 0" compares test: mutation_test: convert internal tri-compare to std::strong_ordering utils: int_range: change to std::strong_ordering test: change some internal comparators to std::strong_ordering utils: big_decimal: change to std::strong_ordering utils: fragment_range: change to std::strong_ordering atomic_cell: change compare_atomic_cell_for_merge() to std::strong_ordering types: drop scaffolding erected around lexicographical_tri_compare sstables: keys: change to std::strong_ordering internally bytes: compare_unsigned(): change to std::strong_ordering uuid: change comparators to std::strong_ordering types: convert abstract_type::compare and related to std::strong_ordering types: reduce boilerplate when comparing empty value serialized_tri_compare: change to std::strong_ordering compound_compat: change to std::strong-ordering types: change lexicographical_tri_compare, prefix_equality_tri_compare to std::strong_ordering	2021-07-29 21:43:54 +02:00
Avi Kivity	e44d3cc0ea	Merge "Remove global storage service instance" from Pavel E " There are few places that call global storage service, but all are easily fixable without significant changes. 1. alternator -- needs token metadata, switch to using proxy 2. api -- calls methods from storage service, all handlers are registered in main and can capture storage service from there 3. thrift -- calls methods from storage service, can carry the reference via controller 4. view -- needs tokens, switch to using (global) proxy 5. storage_service -- (surprisingly) can use "this" tests: unit(dev), dtest(simple_boot_shutdown, dev) " * 'br-unglobal-storage-service' of https://github.com/xemul/scylla: storage_service: Make it local storage_service: Remove (de)?init_storage_service() storage_service: Use container() in run_with(out)_api_lock storage_service: Unmark update_topology static storage_service: Capture this when appropriate view: Use proxy to get token metadata from thrift: Use local storage service in handlers thrift: Carry sharded<storage_service>& down to handler api: Capture and use sharded<storage_service>& in handlers api: Carry sharded<storage_service>& down to some handlers alternator: Take token metadata from server's storage_proxy alternator: Keep storage_proxy on server	2021-07-29 11:47:16 +03:00
Avi Kivity	8d2255d82c	Merge "Parallelize multishard_combining_reader_as_mutation_source test" from Pavel E " This is the 3rd slowest test in the set. There are 3 cases out there that are hard-coded to be sequential. However, splitting them into boost test cases helps running this test faster in --parallel-cases mode. Timings for debug mode: Total before the patch: 25 min Sequential after the patch: 25 min Basic case: 5 min Evict-paused-readers case: 5 min Single-mutation-buffer case: 15 min tests: unit.multishard_combining_reader_as_mutation_source(debug) " * 'br-parallel-mcr-test' of https://github.com/xemul/scylla: test: Split test_multishard_combining_reader_as_mutation_source into 3 test: Fix indentation after previous patch test: Move out internals of test_multishard_combining_reader_as_mutation_source	2021-07-29 11:39:02 +03:00
Pavel Emelyanov	f9132b582b	storage_service: Make it local There are 3 places that can now declare local instance: - main - cql_test_env - boost gossiper test The global pointer is saved in debug namespace for debugging. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-07-29 05:12:36 +03:00
Gleb Natapov	4764028cb3	raft: Remove leader_id from append_request The filed is not used anywhere. Message-Id: <YP0khmjK2JSp77AG@scylladb.com>	2021-07-28 20:30:07 +02:00
Avi Kivity	42e1f318d7	Merge "Respect "bypass cache" in sstable index caching" from Tomasz " This series changes the behavior of the system when executing reads annotated with "bypass cache" clause in CQL. Such reads will not use nor populate the sstable partition index cache and sstable index page cache. " * 'bypass-cache-in-sstable-index-reads' of github.com:tgrabiec/scylla: sstables: Do not populate page cache when searching in promoted index for "bypass cache" reads sstables: Do not populate partition index cache for "bypass cache" reads	2021-07-28 18:45:39 +03:00
Avi Kivity	331eb57e17	Revert "compression: define 'class' attribute for compression and deprecate 'sstable_compression'" This reverts commit `5571ef0d6d`. It causes rolling upgrade failures. Fixes #9055. Reopens #8948.	2021-07-28 14:14:22 +03:00
Avi Kivity	0909e3c17d	treewide: remove redundant "x <=> 0" compares If x is of type std::strong_ordering, then "x <=> 0" is equivalent to x. These no-ops were inserted during #1449 fixes, but are now unnecessary. They have potential for harm, since they can hide an accidental of the type of x to an arithmetic type, so remove them. Ref #1449.	2021-07-28 13:30:32 +03:00
Avi Kivity	70f481a1f0	test: mutation_test: convert internal tri-compare to std::strong_ordering Drop the temporary merge_container() overload we had to support tri-compares that returned int.	2021-07-28 13:30:07 +03:00
Avi Kivity	11fa402ecc	test: change some internal comparators to std::strong_ordering Ref #1449.	2021-07-28 13:28:51 +03:00
Avi Kivity	7729ff03ad	uuid: change comparators to std::strong_ordering Ref #1449.	2021-07-28 13:20:32 +03:00
Avi Kivity	e52ebe2da5	types: convert abstract_type::compare and related to std::strong_ordering Change comparators around types to std::strong_ordering. Ref #1449.	2021-07-28 13:19:24 +03:00
Avi Kivity	d86e529239	serialized_tri_compare: change to std::strong_ordering Also convert a users in mutation_test. Ref #1449.	2021-07-28 13:19:00 +03:00
Benny Halevy	67d5addc09	test: mutation_reader_test: clustering_order_merger_test_generator: use explicit type for num_ranges gcc 10.3.1 spews the following error: ``` _test_generator::generate_scenario(std::mt19937&) const’: test/boost/mutation_reader_test.cc:3731:28: error: comparison of integer expressions of different signedness: ‘int’ and ‘long unsigned int’ [-Werror=sign-compare] 3731 \| for (auto i = 0; i < num_ranges; ++i) { \| ~~^~~~~~~~~~~~ ``` Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Message-Id: <20210728073538.2467040-1-bhalevy@scylladb.com>	2021-07-28 11:22:59 +03:00
Avi Kivity	77a2b4b520	test: perf: perf_simple_query: add instructions_per_op to the json-result output It's in text output, but `863b49af03` forgot to add it to the machine readable results. Closes #9017	2021-07-27 20:26:19 +02:00
Pavel Emelyanov	b3c89787be	mutation_partition: Return immutable collection for range tombstones Patch the .row_tombstones() to return the range_tombstone_list wrapped into the immutable_collection<> so that callers are guaranteed not to touch the collection itself, but still can modify the tombstones. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-07-27 20:06:53 +03:00
Pavel Emelyanov	1bf643d4fd	mutation_partition: Pin mutable access to range tombstones Some callers of mutation_partition::row_tomstones() don't want (and shouldn't) modify the list itself, while they may want to modify the tombstones. This patch explicitly locates those that need to modify the collection, because the next patch will return immutable collection for the others. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-07-27 20:06:53 +03:00
Pavel Emelyanov	05b8cdfd24	mutation_partition: Return immutable collection for rows Patch the .clustered_rows() method to return the btree of rows wrapped into the immutable_collection<> so that callers are guaranteed not to touch the collection itself, but still can modify the elements in it. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-07-27 20:06:53 +03:00
Pavel Emelyanov	e652b03b4e	btree tests: Dont use iterator erase Next patches will mark btree::iterator methods that modify the tree itself as private, so stop using them in tests. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-07-27 20:06:53 +03:00
Avi Kivity	f86e65b4e7	Merge "Fix quadratic behavior in memtable/row_cache with lots of range tombstones" from Tomasz " This series fixes two issues which cause very poor efficiency of reads when there is a lot of range tombstones per live row in a partition. The first issue is in the row_cache reader. Before the patch, all range tombstones up to the next row were copied into a vector, and then put into the buffer until it's full. This would get quadratic if there is much more range tombstones than fit in a buffer. The fix is to avoid the accumulation of all tombstones in the vector and invoke the callback instead, which stops the iteration as soon as the buffer is full. Fixes #2581. The second, similar issue was in the memtable reader. Tests: - unit (dev) - perf_row_cache_update (release) " * tag 'no-quadratic-rt-in-reads-v1' of github.com:tgrabiec/scylla: test: perf_row_cache_update: Uncomment test case for lots of range tombstones row_cache: Consume range tombstones incrementally partition_snapshot_reader: Avoid quadratic behavior with lots of range tombstones tests: mvcc: Relax monotonicity check range_tombstone_stream: Introduce peek_next()	2021-07-27 14:39:13 +03:00
Avi Kivity	2cca461652	Merge 'sstables: merge row consumer interfaces with implementations' from Wojciech Mitros This patch follows #9002, further reducing the complexity of the sstable readers. The split between row consumer interfaces and implementations has been first added in 2015, and there is no reason to create new implementations anymore. By merging those classes, we achieve a sizeable reduction in sstable reader length and complexity. Refs #7952 Tests: unit(dev) Closes #9073 * github.com:scylladb/scylla: sstables: merge row_consumer into mp_row_consumer_k_l sstables: move kl row_consumer sstables: merge consumer_m into mp_row_consumer_m sstables: move mp_row_consumer_m	2021-07-27 12:23:29 +03:00
Pavel Emelyanov	ca2dfac7d7	test: Split test_multishard_combining_reader_as_mutation_source into 3 There are 3 independent cases in this test that benefit from running in parallel. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-07-27 09:29:20 +03:00

1 2 3 4 5 ...

2087 Commits