scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-04-25 11:00:35 +00:00

Author	SHA1	Message	Date
Gleb Natapov	512556914a	test: move memtable_test.cc to new schema announcement api	2022-01-13 23:10:13 +02:00
Gleb Natapov	be46109af6	test: move cql_query_test.cc to new schema announcement api	2022-01-13 23:09:02 +02:00
Gleb Natapov	100b44f5ff	test use new schema announcement api in cql_test_env.cc	2022-01-13 23:09:02 +02:00
Gleb Natapov	5dffc8ed3e	test: convert database_test to new schema announcement api	2022-01-13 23:09:02 +02:00
Avi Kivity	134601a15e	Merge "Convert input side of mutation compactor to v2" from Botond " With this series the mutation compactor can now consume a v2 stream. On the output side it still uses v1, so it can now act as an online v2->v1 converter. This allows us to push out v2->v1 conversion to as far as the compactor, usually the next to last component in a read pipeline, just before the final consumer. For reads this is as far as we can go, as the intra-node ABI and hence the result-sets built are v1. For compaction we could go further and eliminate conversion altogether, but this requires some further work on both the compactor and the sstable writer and so it is left to be done later. To summarize, this patchset enables a v2 input for the compactor and it updates compaction and single partition reads to use it. " * 'mutation-compactor-consume-v2/v1' of https://github.com/denesb/scylla: table: add make_reader_v2() querier: convert querier_cache and {data,mutation}_querier to v2 compaction: upgrade compaction::make_interposer_consumer() to v2 mutation_reader: remove unecessary stable_flattened_mutations_consumer compaction/compaction_strategy: convert make_interposer_consumer() to v2 mutation_writer: migrate timestamp_based_splitting_writer to v2 mutation_writer: migrate shard_based_splitting_writer to v2 mutation_writer: add v2 clone of feed_writer and bucket_writer flat_mutation_reader_v2: add reader_consumer_v2 typedef mutation_reader: add v2 clone of queue_reader compact_mutation: make start_new_page() independent of mutation_fragment version compact_mutation: add support for consuming a v2 stream compact_mutation: extract range tombstone consumption into own method range_tombstone_assembler: add get_range_tombstone_change() range_tombstone_assembler: add get_current_tombstone()	2022-01-12 14:37:19 +02:00
Avi Kivity	4118f2d8be	treewide: replace deprecated seastar::later() with seastar::yield() seastar::later() was recently deprecated and replaced with two alternatives: a cheap seastar::yield() and an expensive (but more powerful) seastar::check_for_io_immediately(), which corresponds to the original later(). This patch replaces all later() calls with the weaker yield(). In all cases except one, it's unambiguously correct. In one case (test/perf scheduling_latency_measurer::stop()) it's not so unambiguous, since check_for_io_immediately() will additionally force a poll and so will cause more work to be done (but no additional tasks to be executed). However, I think that any measurement that relies on measuring the work on the last tick is so inaccurate (you need thousands of ticks to get any amount of confidence in the measurement) that in the end it doesn't matter what we pick. Tests: unit (dev) Closes #9904	2022-01-12 12:19:19 +01:00
Nadav Har'El	7a9f69ec38	Merge 'lister cleanup and test' from Benny Halevy Split off of #9835. The series removes extraneous includes of lister.hh from header files and adds a unit test for lister::scan_dir to test throwing an exception from the walker function passed to `scan_dir`. Test: unit(dev) Closes #9885 * github.com:scylladb/scylla: test: add lister_list lister: add more overloads of fs::path operator/ for std::string and string_view resource_manager: remove unnecessary include of lister.hh from header file sstables: sstable_directory: remove unncessary include of lister.hh from header file	2022-01-12 08:20:07 +01:00
Benny Halevy	1e6829e9f1	test: add lister_list Test the lister class. In particular the ability to abort the lister when the walker function throws an exception. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-01-11 17:04:16 +02:00
Benny Halevy	b9c41dc0fd	sstables: sstable_directory: remove unncessary include of lister.hh from header file The source file depends on it, not the header. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-01-11 17:04:16 +02:00
Botond Dénes	97d74de8fc	Merge "flat_mutation_reader: clone evictable_reader & convert some others" from Michael Livshin " The first patch introduces evictable_reader_v2, and the second one further simplifies it. We clone instead of converting because there is at least one downstream (by way of multishard_combining_reader) use that is not itself straightforward to convert at the moment (multishard_mutation_query), and because evictable_reader instances cannot be {up,down}graded (since users also access the underlying buffers). This also means that shard_reader, reader_lifecycle_policy and multishard_combining_reader have to be cloned. " * tag 'clone-evictable-reader-to-v2/v3' of https://github.com/cmm/scylla: convert make_multishard_streaming_reader() to flat_mutation_reader_v2 convert table::make_streaming_reader() to flat_mutation_reader_v2 convert make_flat_multi_range_reader() to flat_mutation_reader_v2 view_update_generator: remove unneeded call to downgrade_to_v1() introduce multishard_combining_reader_v2 introduce shard_reader_v2 introduce the reader_lifecycle_policy_v2 abstract base evictable_reader_v2: further code simplifications introduce evictable_reader_v2 & friends	2022-01-11 17:01:08 +02:00
Nadav Har'El	9d0eaeb90a	test/scylla-gdb: enable test for "scylla fiber" After the rewrite of the test/scylla-gdb, the test for "scylla fiber" was disabled - and this patch brings it back. For the "scylla fiber" operation to do something interesting (and not just print an error message and seem to succeed...) it needs a real task pointer. The old code interrupted Scylla in a breakpoint and used get_local_tasks(), but in the new test framework we attach to Scylla while it's idle, so there are no ready tasks. So in this patch we use the find_vptrs() function to find a continuation from http_server::do_accept_one() - it has an interesting fiber of 5 continuations. After this patch all 33 tests in test/scylla-gdb/test_misc.py pass. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20220110211813.581807-1-nyh@scylladb.com>	2022-01-11 17:01:08 +02:00
Nadav Har'El	7f5ca5bf3f	Merge 'replica: move distributed_loader to replica module' from Avi Kivity distributed_loader is a replica-side thing, so it belongs in the replica module ("distributed" refers to its ability to load sstables in their correct shards). So move it to the replica module. The change exposes a dependency on the construction order of static variables (which isn't defined), so we remove the dependency in the first two patches. Closes #9891 * github.com:scylladb/scylla: replica: move distributed_loader into replica module tracing: make sure keyspace and table names are available to static constructors auth: make sure keyspace and table names are available to static constructors	2022-01-11 17:01:08 +02:00
Michael Livshin	1f27e12dc6	convert make_multishard_streaming_reader() to flat_mutation_reader_v2 All changes are mechanical. Signed-off-by: Michael Livshin <michael.livshin@scylladb.com>	2022-01-11 10:49:26 +02:00
Avi Kivity	4392c20bd3	replica: move distributed_loader into replica module distributed_loader is a replica-side thing, so it belongs in the replica module ("distributed" refers to its ability to load sstables in their correct shards). So move it to the replica module.	2022-01-10 15:25:28 +02:00
Nadav Har'El	63bd0807b4	test/scylla-gdb: skip tests on aarch64 As already noted in commit `eac6fb8`, many of the scylla-gdb tests fail on aarch64 for various reasons. The solution used in that commit was to have test/scylla-gdb/run pretend to succeed - without testing anything - when not running on x86_64. This workaround was accidentally lost when scylla-gdb/run was recently rewritten. This patch brings this workaround back, but in a slightly different form - Instead of the run script not doing anything, the tests do get called, but the "gdb" fixture in test/scylla-gdb/conftest.py causes each individual test to be skipped. The benefit of this approach is that it can easily be improved in the future to only skip (or xfail) specific tests which are known to fail on aarch64, instead of all of them - as half of the tests do pass on aarch64. Fixes #9892. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20220109152630.506088-1-nyh@scylladb.com>	2022-01-09 17:34:23 +02:00
Botond Dénes	aa3c943f4c	mutation_reader: remove unecessary stable_flattened_mutations_consumer Said wrapper was conceived to make unmovable `compact_mutation` because readers wanted movable consumers. But `compact_mutation` is movable for years now, as all its unmovable bits were moved into an `lw_shared_ptr<>` member. So drop this unnecessary wrapper and its unnecessary usages.	2022-01-07 13:52:07 +02:00
Botond Dénes	9826b5d732	mutation_writer: migrate timestamp_based_splitting_writer to v2	2022-01-07 13:51:48 +02:00
Avi Kivity	bbad8f4677	replica: move ::database, ::keyspace, and ::table to replica namespace Move replica-oriented classes to the replica namespace. The main classes moved are ::database, ::keyspace, and ::table, but a few ancillary classes are also moved. There are certainly classes that should be moved but aren't (like distributed_loader) but we have to start somewhere. References are adjusted treewide. In many cases, it is obvious that a call site should not access the replica (but the data_dictionary instead), but that is left for separate work. scylla-gdb.py is adjusted to look for both the new and old names.	2022-01-07 12:04:38 +02:00
Avi Kivity	ae3a360725	database: Move database, keyspace, table classes to replica/ directory The database, keyspace, and table classes represent the replica-only part of the objects after which they are named. Reading from a table doesn't give you the full data, just the replica's view, and it is not consistent since reconciliation is applied on the coordinator. As a first step in acknowledging this, move the related files to a replica/ subdirectory.	2022-01-06 17:07:30 +02:00
Nadav Har'El	6e2d29300c	test/scylla-gdb: a rewrite, using pytest This patch is an almost complete rewrite of the test/scylla-gdb framework for testing Scylla's gdb commands. The goals of this rewrite are described in issue #9864. In short, the goals are: 1. Use pytest to define individual test cases instead of one long Python script. This will make it easier to add more tests, to run only individual tests (e.g., test/scylla-gdb/run somefile.py::sometest), to understand which test failed when it fails - and a lot of other pytest conveniences. 2. Instead of an ad-hoc shell script to run Scylla, gdb, and the test, use the same Python code which is used in other test suites (alternator, cql-pytest, redis, and more). The resulting handling of the temporary resources (processes, directories, IP address) is more robust, and interrupting test/scylla-gdb/run will correctly kill its child processes (both Scylla and gdb). All existing gdb tests (except one - more on this below...) were easily rewritten in the new framework. The biggest change in this patch is who starts what. Before this patch, "run" starts gdb, which in turn starts Scylla, stops it on a breakpoint, and then runs various tests. After this patch, "run" starts Scylla on its own (like it does in test/cql-pytest/run, et al.), and then gdb runs pytest - and in a pytest fixture attaches to the running Scylla process. The biggest benefit of this approach is that "run" is aware of both gdb and Scylla, and can kill both abruptly with SIGKILL to end the test. But there's also a downside to this change: One of the tests (of "scylla fiber") needs access to some task object. Before this patch, Scylla was stopped on a breakpoint, and a task was available at that point. After this patch, we attach gdb to an idle Scylla, and the test cannot find any task to use. So the test_fiber() test fails for now. One way we could perhaps fix it is to add a breakpoint and "continue" Scylla a bit more after attaching to it. However, I couldn't find the right breakpoint - and we may also need to send a request to Scylla to get it to reach that breakpoint. I'm still looking for a better way to have access to some "task" object we can test on. Fixes #9864. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20220102221534.1096659-1-nyh@scylladb.com>	2022-01-06 11:29:55 +02:00
Avi Kivity	d01e1a774b	Merge 'Build performance: do not include the entire <seastar/net/ip.hh>' from Nadav Har'El The header file <seastar/net/ip.hh> is a large collection of unrelated stuff, and according to ClangBuildAnalyzer, takes 2 seconds to compile for every source file that included it - and unfortunately virtually all Scylla source files included it - through either "types.hh" or "gms/inet_address.hh". That's 2300 CPU seconds wasted. In this two-patch series we completely eliminate the inclusion of <seastar/net/ip.hh> from Scylla. We still need the ipv4_address, ipv6_address types (e.g., gms/inet_address.hh uses it to hold a node's IP address) so those were split (in a Seastar patch that is already in) from ip.hh into separate small header files that we can include. This patch reduces the entire build time (of build/dev/scylla) by 4% - reducing almost 10 CPU minutes (!) from the build. Closes #9875 github.com:scylladb/scylla: build performance: do not include <seastar/net/ip.hh> build performance: speed up inclusion of <gm/inet_address.hh>	2022-01-05 17:55:07 +02:00
Nadav Har'El	6012f6f2b6	build performance: do not include <seastar/net/ip.hh> In a previous patch, we noticed that the header file <gm/inet_address.hh>, which is included, directly or indirectly, by most source files, includes <seastar/net/ip.hh> which is very slow to compile, and replaced it by the much faster-to-include <seastar/net/ipv[46]_address.hh>. However, we also included <seastar/net/ip.hh> in types.hh - and that too is included by almost every file, so the actual saving from the above patch was minimal. So in this patch we replace this include too. After this patch Scylla does not include <seastar/net/ip.hh> at all. According to ClangBuildAnalyzer, this reduces the average time to include types.hh (multiply this by 312 times!) from 4 seconds to 1.8 seconds, and reduces total build time (dev mode) by about 3%. Some of the source files were now missing some include directives, that were previously included in ip.hh - so we need to add those explicitly. Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2022-01-05 17:29:21 +02:00
Tomasz Grabiec	382797a627	tests: perf: perf_fast_forward: Fix test_large_partition_slicing_clustering_keys for scylla_bench_large_part_ds1 schema The test case assumed int32 partition key, but scylla_bench_large_part_ds1 has int64 partition key. This resulted in no results being returned by the reader. Fix by introducing a partition key factory on the data source level. Message-Id: <20220105150550.67951-1-tgrabiec@scylladb.com>	2022-01-05 17:18:06 +02:00
Nadav Har'El	e7e9001808	test/alternator: add more tests for GSI "Projection" We already have multiple tests for the unimplemented "Projection" feature of GSI and LSI (see issue #5036). This patch adds seven more test cases, focusing on various types of error conditions (e.g., trying to project the same attribute twice), esoteric corner cases (it's fine to list a key in NonKeyAttributes!), and corner cases that I expect we will have in our implementation (e.g., a projected attribute may either be a real Scylla column or just an element in a map column). All new tests pass on DynamoDB and fail on Alternator (due to #5036), so marked with "xfail". Refs #5036. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20211228193748.688060-1-nyh@scylladb.com>	2022-01-05 10:35:36 +02:00
Avi Kivity	53a83c4b1e	Merge "flat_mutation_reader: convert flat_mutation_reader_from_mutations to v2" from Botond " Like flat_mutation_reader_from_fragments, this reader is also heavily used by tests to compose a specific workload for readers above it. So instead of converting it, we add a v2 variant and leave the v1 variant in place. The v2 variant was written from scratch to have built-in support for reading in reverse. It is built-on `mutation::consume()` to avoid duplicating the logic of consuming the contents of the mutation. To avoid stalls, `mutation::consume()` gets support for pausing and resuming consuming a mutation. Tests: unit(dev) " * 'flat_mutation_reader_from_mutations_v2/v2' of https://github.com/denesb/scylla: flat_mutation_reader: convert make_flat_mutation_reader_from_mutation() v2 flat_mutation_reader: extract mutation slicing into a function mutation: consume(): make it pausable/resumable mutation: consume(): restructure clustering iterator initialization test/boost/mutation_test: add rebuild test for mutation::consume()	2022-01-05 10:23:17 +02:00
Nadav Har'El	5fbeae9016	cql-pytest: add a couple of default-TTL tests This patch adds a new cql-pytest test file - test_ttl.py - with currently just a couple of tests for the "with default_time_to_live" feature. One is a basic test, and the second reproduces issue #9842 - that "using ttl 0" should override the default time to live, but doesn't. The test for #9842, test_default_ttl_0_override, fails on Scylla and passes on Cassandra, and is marked "xfail". Refs #9842. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20211227091502.553577-1-nyh@scylladb.com>	2022-01-05 10:15:19 +02:00
Botond Dénes	62d82b8b0e	flat_mutation_reader: convert make_flat_mutation_reader_from_mutation() v2 Since this reader is also heavily used by tests to compose a specific workload for readers above it, we just add a v2 variant, instead of changing the existing v1 one. The v2 variant was written from scratch to have built-in support for reading in reverse. It is built-on `mutation::consume()` to avoid duplicating the logic of consuming the contents of the mutation. A v2 native unit test is also added.	2022-01-05 09:06:16 +02:00
Asias He	a8ad385ecd	repair: Get rid of the gc_grace_seconds The gc_grace_seconds is a very fragile and broken design inherited from Cassandra. Deleted data can be resurrected if cluster wide repair is not performed within gc_grace_seconds. This design pushes the job of keeping the database consistent to the user. In practice, it is very hard to guarantee repair is performed within gc_grace_seconds all the time. For example, repair workload has the lowest priority in the system which can be slowed down by the higher priority workload, so that there is no guarantee when a repair can finish. A gc_grace_seconds value that used to work might not work after data volume grows in a cluster. Users might want to avoid running repair during a specific period where latency is the top priority for their business. To solve this problem, an automatic mechanism to protect against data resurrection is proposed and implemented. The main idea is to remove the tombstone only after the range that covers the tombstone is repaired. In this patch, a new table option tombstone_gc is added. The option is used to configure tombstone gc mode. For example: 1) GC a tombstone after gc_grace_seconds cqlsh> ALTER TABLE ks.cf WITH tombstone_gc = {'mode':'timeout'} ; This is the default mode. If no tombstone_gc option is specified by the user, the old gc_grace_seconds based gc will be used. 2) Never GC a tombstone cqlsh> ALTER TABLE ks.cf WITH tombstone_gc = {'mode':'disabled'}; 3) GC a tombstone immediately cqlsh> ALTER TABLE ks.cf WITH tombstone_gc = {'mode':'immediate'}; 4) GC a tombstone after repair cqlsh> ALTER TABLE ks.cf WITH tombstone_gc = {'mode':'repair'}; In addition to the 'mode' option, another option 'propagation_delay_in_seconds' is added. It defines the max time a write could possibly delay before it eventually arrives at a node. A new gossip feature TOMBSTONE_GC_OPTIONS is added. The new tombstone_gc option can only be used after the whole cluster supports the new feature. A mixed cluster works with no problem. Tests: compaction_test.py, ninja test Fixes #3560 [avi: resolve conflicts vs data_dictionary]	2022-01-04 19:48:14 +02:00
Botond Dénes	5e547dcc8a	test/boost/mutation_test: add rebuild test for mutation::consume() In the next patches we will refactor mutation::consume(). Before doing that add another test, which rebuilds the consumed mutation, comparing it with the original.	2022-01-04 11:43:46 +02:00
Nadav Har'El	8774fc83d3	test/rest_api: fix "--ssl" option test/rest_api has a "--ssl" option to use encrypted CQL. It's not clear to me why this is useful (it doesn't actually test encryption of the REST API!), but as long as we have such an option, it should work. And it didn't work because of a typo - we set a "check_cql" variable to the right function, but then forgot to use it and used run.check_cql instead (which is just for unencrypted cql). Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20220102123202.1052930-1-nyh@scylladb.com>	2022-01-02 15:53:25 +02:00
Avi Kivity	9e74556413	Merge 'Support reverse reads in the row cache natively' from Tomasz Grabiec This change makes row cache support reverse reads natively so that reversing wrappers are not needed when reading from cache and thus the read can be executed efficiently, with similar cost as the forward-order read. The database is serving reverse reads from cache by default after this. Before, it was bypassing cache by default after `703aed3277`. Refs: #1413 Tests: - unit [dev] - manual query with build/dev/scylla and cache tracing on Closes #9454 * github.com:scylladb/scylla: tests: row_cache: Extend test_concurrent_reads_and_eviction to run reverse queries row_cache: partition_snapshot_row_cursor: Print more details about the current version vector row_cache: Improve trace-level logging config: Use cache for reversed reads by default config: Adjust reversed_reads_auto_bypass_cache description row_cache: Support reverse reads natively mvcc: partition_snapshot: Support slicing range tombstones in reverse test: flat_mutation_reader_assertions: Consume expected range tombstones before end_of_partition row_cache: Log produced range tombstones test: Make produces_range_tombstone() report ck_ranges tests: lib: random_mutation_generator: Extract make_random_range_tombstone() partition_snapshot_row_cursor: Support reverse iteration utils: immutable-collection: Make movable intrusive_btree: Make default-initialized iterator cast to false	2021-12-29 16:53:25 +02:00
Pavel Emelyanov	7a15f1c402	batch_\|modification_statement: Make get_mutations accept query_processor This completes the batch_ and modification_statement rework. Also touch the private batch_statement::read_command while at it. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-12-23 10:54:28 +03:00
Benny Halevy	2f2e3b2e84	test: lib: index_reader_assertions: close reader before it is destroyed Otherwise, it may trip an assertion when the underlying file is closed, as seen in e.g.: https://jenkins.scylladb.com/view/master/job/scylla-master/job/next/4318/artifact/testlog/x86_64_release/sstable_3_x_test.test_read_rows_only_index.4174.log ``` test/boost/sstable_3_x_test.cc(0): Entering test case "test_read_rows_only_index" sstable_3_x_test: ./seastar/src/core/fstream.cc:205: virtual seastar::file_data_source_impl::~file_data_source_impl(): Assertion `_reads_in_progress == 0' failed. Aborting on shard 0. Backtrace: 0x22557e8 0x2286842 0x7f2799e99a1f /lib64/libc.so.6+0x3d2a1 /lib64/libc.so.6+0x268a3 /lib64/libc.so.6+0x26788 /lib64/libc.so.6+0x35a15 0x222c53d 0x222c548 0xb929cc 0xc0b23b 0xa84bbf 0x24d0111 ``` Decoded: ``` __GI___assert_fail at :? ~file_data_source_impl at ./build/release/seastar/./seastar/src/core/fstream.cc:205 ~file_data_source_impl at ./build/release/seastar/./seastar/src/core/fstream.cc:202 std::default_delete<seastar::data_source_impl>::operator()(seastar::data_source_impl) const at /usr/lib/gcc/x86_64-redhat-linux/11/../../../../include/c++/11/bits/unique_ptr.h:85 (inlined by) ~unique_ptr at /usr/lib/gcc/x86_64-redhat-linux/11/../../../../include/c++/11/bits/unique_ptr.h:361 (inlined by) ~data_source at ././seastar/include/seastar/core/iostream.hh:55 (inlined by) ~input_stream at ././seastar/include/seastar/core/iostream.hh:254 (inlined by) ~continuous_data_consumer at ././sstables/consumer.hh:484 (inlined by) ~index_consume_entry_context at ././sstables/index_reader.hh:116 (inlined by) std::default_delete<sstables::index_consume_entry_context<sstables::index_consumer> >::operator()(sstables::index_consume_entry_context<sstables::index_consumer>) const at /usr/lib/gcc/x86_64-redhat-linux/11/../../../../include/c++/11/bits/unique_ptr.h:85 (inlined by) ~unique_ptr at /usr/lib/gcc/x86_64-redhat-linux/11/../../../../include/c++/11/bits/unique_ptr.h:361 (inlined by) ~index_bound at ././sstables/index_reader.hh:395 (inlined by) ~index_reader at ././sstables/index_reader.hh:435 std::default_delete<sstables::index_reader>::operator()(sstables::index_reader*) const at /usr/lib/gcc/x86_64-redhat-linux/11/../../../../include/c++/11/bits/unique_ptr.h:85 (inlined by) ~unique_ptr at /usr/lib/gcc/x86_64-redhat-linux/11/../../../../include/c++/11/bits/unique_ptr.h:361 (inlined by) ~index_reader_assertions at ././test/lib/index_reader_assertions.hh:31 (inlined by) operator() at ./test/boost/sstable_3_x_test.cc:4630 ``` Test: unit(dev), sstable_3_x_test.test_read_rows_only_index(release X 10000) Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Message-Id: <20211222132858.2155227-1-bhalevy@scylladb.com>	2021-12-22 15:33:22 +02:00
Botond Dénes	aba68c8f83	Merge "reader_concurrency_semaphore: convert to flat_mutation_reader_v2" from Michael " The second patch in this series is a mechanical conversion of reader_concurrency_semaphore to flat_mutation_reader_v2, and caller updates. The first patch is needed to pass the test suite, since without it a real reader version conversion would happen on every entry to and exit from reader_concurrency_semaphore, which is stressful (for example: mutation_reader_test.test_multishard_streaming_reader reaches 8191 conversions for a couple of readers, which somehow causes it to catch SIGSEGV in diverse and seemingly-random places). Note that in a real workload it is unreasonable to expect readers being parked in a reader_concurrency_semaphore to be pristine, so short-circuiting their version conversions will be impossible and this workaround will not really help. " * tag 'rcs-v2-v4' of https://github.com/cmm/scylla: reader_concurrency_semaphore: convert to flat_mutation_reader_v2 short-circuit flat mutation reader upgrades and downgrades	2021-12-22 15:08:31 +02:00
Michael Livshin	a1b8ba23d2	reader_concurrency_semaphore: convert to flat_mutation_reader_v2 Signed-off-by: Michael Livshin <michael.livshin@scylladb.com>	2021-12-21 11:26:17 +02:00
Raphael S. Carvalho	64ec1c6ec6	table: Make sure major compaction doesn't miss data in memtable Make sure that major will compact data in all sstables and memtable, as tombstones sitting in memtable could shadow data in sstables. For example, a tombstone in memtable deleting a large partition could be missed in major, so space wouldn't be saved as expected. Additionally, write amplification is reduced as data in memtable won't have to travel through tiers once flushed. Fixes #9514. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <20211217160055.96693-2-raphaelsc@scylladb.com>	2021-12-21 07:21:34 +02:00
Raphael S. Carvalho	e1e8e020fe	tests: Allow memtable to be flushed through column_family_for_tests Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <20211217160055.96693-1-raphaelsc@scylladb.com>	2021-12-21 07:21:26 +02:00
Botond Dénes	55bb70a878	Merge "Make sure TWCS per-window major includes all files" from Raphael " TWCS performs STCS on a window as long as it's the most recent one. From there on, TWCS will compact all files in the past window into a single file. With some moderate write load, it could happen that there's still some compaction activity in that past window, meaning that per-window major may miss some files being currently compacted. As a result, a past window may contain more than 1 file after all compaction activity is done on its behalf, which may increase read amplification. To avoid that, TWCS will now make sure that per-window major is serialized, to make sure no files are missed. Fixes #9553. tests: unit(dev). " * 'fix_twcs_per_window_major_v3' of https://github.com/raphaelsc/scylla: TWCS: Make sure major on past window is done on all its sstables TWCS: remove needless param for STCS options TWCS: kill unused param in newest_bucket() compaction: Implement strategy control and wire it compaction: Add interface to control strategy behavior.	2021-12-20 17:12:50 +02:00
Avi Kivity	e772fcbd57	Merge "Convert combined reader to v2" from Botond " Users are adjusted by sprinkling `upgrade_to_v2()` and `downgrade_to_v1()` where necessary (or removing any of these where possible). No attempt was made to optimize and reduce the amount of v1<->v2 conversions. This is left for follow-up patches to keep this set small. The combined reader is composed of 3 layers: 1. fragment producer - pop fragments from readers, return them in batches (each fragment in a batch having the same type and pos). 2. fragment merger - merge fragment batches into single fragments 3. reader implementation glue-code Converting layers (1) and (3) was mostly mechanical. The logic of merging range tombstone changes is implemented at layer (2), so the two different producer (layer 1) implementations we have share this logic. Tests: unit(dev) " * 'combined-reader-v2/v4' of https://github.com/denesb/scylla: test/boost/mutation_reader_test: add test_combined_reader_range_tombstone_change_merging mutation_reader: convert make_clustering_combined_reader() to v2 mutation_reader: convert position_reader_queue to v2 mutation_reader: convert make_combined_reader() overloads to v2 mutation_reader: combined_reader: convert reader_selector to v2 mutation_reader: convert combined reader to v2 mutation_reader: combined_reader: attach stream_id to mutation_fragments flat_mutation_reader_v2: add v2 version of empty reader test/boost/mutation_reader_test: clustering_combined_reader_mutation_source_test: fix end bound calculation	2021-12-20 14:01:03 +02:00
Botond Dénes	7f331cee01	test/boost/mutation_reader_test: add test_combined_reader_range_tombstone_change_merging Stressing the range tombstone change merging logic.	2021-12-20 09:29:05 +02:00
Botond Dénes	e1bbc4a480	mutation_reader: convert make_clustering_combined_reader() to v2 Just sprinkle the right amount of downgrade_to_v1() and upgrade_to_v2() at call sites; no attempt at optimization was made.	2021-12-20 09:29:05 +02:00
Botond Dénes	2364144b19	mutation_reader: convert position_reader_queue to v2 By removing the converting (v1->v2) constructor of `reader_and_upper_bound` and adjusting its users.	2021-12-20 09:29:05 +02:00
Botond Dénes	aeddcf50a1	mutation_reader: convert make_combined_reader() overloads to v2 Just sprinkle the right amount of downgrade_to_v1() and upgrade_to_v2() at call sites; no attempt at optimization was made.	2021-12-20 09:29:05 +02:00
Botond Dénes	1554b94b78	mutation_reader: combined_reader: convert reader_selector to v2	2021-12-20 09:29:05 +02:00
Nadav Har'El	252ce8afd4	Merge 'Extend stop compaction api' from Benny Halevy Allow stopping compaction by type on a given keyspace and list of tables. Also add api unit test suite that tests the existing `stop_compaction` api and the new `stop_keyspace_compaction` api. Fixes #9700 Closes #9746 * github.com:scylladb/scylla: api: storage_service: validate_keyspace: improve exception error message api: compaction_manager: add stop_keyspace_compaction api: storage_service: expose validate_keyspace and parse_tables api: compaction_manager: stop_compaction: fix type description compaction_manager: stop_compaction: expose optional table* test: api: add basic compaction_manager test	2021-12-20 00:18:46 +02:00
Tomasz Grabiec	1c80d7fec4	tests: row_cache: Extend test_concurrent_reads_and_eviction to run reverse queries	2021-12-19 22:43:52 +01:00
Tomasz Grabiec	63351483f0	row_cache: Support reverse reads natively Some implementation notes below. When iterating in reverse, _last_row is after the current entry (_next_row) in table schema order, not before like in the forward mode. Since there is no dummy row before all entries, reverse iteration must now be prepared for the fact that advancing _next_row may land not pointing at any row. The partition_snapshot_row_cursor maintains continuity() correctly in this case, and positions the cursor before all rows, so most of the code works unchanged. The only exception is in move_to_next_entry(), which now cannot assume that failure to advance to an entry means it can end a read. maybe_drop_last_entry() is not implemented in reverse mode, which may expose reverse-only workload to the problem of accumulating dummy entries. ensure_population_lower_bound() was not updating _last_row after inserting the entry in the latest version. This was not a problem for forward reads because they do not modify the row in the partition snapshot represented by _last_row. They only need the row to be there in the latest version after the call. It's different for reversed reads, which change the continuity of the entry represented by _last_row, hence _last_row needs to have the iterator updated to point to the entry from the latest version, otherwise we'd set the continuity of the previous version entry which would corrupt the continuity.	2021-12-19 22:41:35 +01:00
Tomasz Grabiec	d0c367f44f	mvcc: partition_snapshot: Support slicing range tombstones in reverse	2021-12-19 22:41:35 +01:00
Tomasz Grabiec	87c921dff5	test: flat_mutation_reader_assertions: Consume expected range tombstones before end_of_partition There may be unconsumed but expected fragments in the stream at the time of the call to produces_partition_end(). Call check_rts() sooner to avoid failures.	2021-12-19 22:41:35 +01:00
Tomasz Grabiec	5f45d45c55	test: Make produces_range_tombstone() report ck_ranges	2021-12-19 22:41:35 +01:00

2646 Commits