scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-04-24 18:40:38 +00:00

Author	SHA1	Message	Date
Mikołaj Sielużycki	93d6eb6d51	compacting_reader: Support fast_forward_to position range. Fast forwarding is delegated to the underlying reader and assumes it's supported. The only corner case requiring special handling that has shown up in the tests is producing partition start mutation in the forwarding case if there are no other fragments. compacting state keeps track of uncompacted partition start, but doesn't emit it by default. If end of stream is reached without producing a mutation fragment, partition start is not emitted. This is invalid behaviour in the forwarding case, so I've added a public method to compacting state to force marking partition as non-empty. I don't like this solution, as it feels like breaking an abstraction, but I didn't come across a better idea. Tests: unit(dev, debug, release) Message-Id: <20220128131021.93743-1-mikolaj.sieluzycki@scylladb.com>	2022-01-31 13:37:36 +02:00
Avi Kivity	fcb8d040e8	treewide: use Software Package Data Exchange (SPDX) license identifiers Instead of lengthy blurbs, switch to single-line, machine-readable standardized (https://spdx.dev) license identifiers. The Linux kernel switched long ago, so there is strong precedent. Three cases are handled: AGPL-only, Apache-only, and dual licensed. For the latter case, I chose (AGPL-3.0-or-later and Apache-2.0), reasoning that our changes are extensive enough to apply our license. The changes were applied mechanically with a script, except to licenses/README.md. Closes #9937	2022-01-18 12:15:18 +01:00
Michael Livshin	1f27e12dc6	convert make_multishard_streaming_reader() to flat_mutation_reader_v2 All changes are mechanical. Signed-off-by: Michael Livshin <michael.livshin@scylladb.com>	2022-01-11 10:49:26 +02:00
Avi Kivity	bbad8f4677	replica: move ::database, ::keyspace, and ::table to replica namespace Move replica-oriented classes to the replica namespace. The main classes moved are ::database, ::keyspace, and ::table, but a few ancillary classes are also moved. There are certainly classes that should be moved but aren't (like distributed_loader) but we have to start somewhere. References are adjusted treewide. In many cases, it is obvious that a call site should not access the replica (but the data_dictionary instead), but that is left for separate work. scylla-gdb.py is adjusted to look for both the new and old names.	2022-01-07 12:04:38 +02:00
Avi Kivity	ae3a360725	database: Move database, keyspace, table classes to replica/ directory The database, keyspace, and table classes represent the replica-only part of the objects after which they are named. Reading from a table doesn't give you the full data, just the replica's view, and it is not consistent since reconciliation is applied on the coordinator. As a first step in acknowledging this, move the related files to a replica/ subdirectory.	2022-01-06 17:07:30 +02:00
Asias He	a8ad385ecd	repair: Get rid of the gc_grace_seconds The gc_grace_seconds is a very fragile and broken design inherited from Cassandra. Deleted data can be resurrected if cluster wide repair is not performed within gc_grace_seconds. This design pushes the job of making the database consistency to the user. In practice, it is very hard to guarantee repair is performed within gc_grace_seconds all the time. For example, repair workload has the lowest priority in the system which can be slowed down by the higher priority workload, so that there is no guarantee when a repair can finish. A gc_grace_seconds value that is used to work might not work after data volume grows in a cluster. Users might want to avoid running repair during a specific period where latency is the top priority for their business. To solve this problem, an automatic mechanism to protect data resurrection is proposed and implemented. The main idea is to remove the tombstone only after the range that covers the tombstone is repaired. In this patch, a new table option tombstone_gc is added. The option is used to configure tombstone gc mode. For example: 1) GC a tombstone after gc_grace_seconds cqlsh> ALTER TABLE ks.cf WITH tombstone_gc = {'mode':'timeout'} ; This is the default mode. If no tombstone_gc option is specified by the user. The old gc_grace_seconds based gc will be used. 2) Never GC a tombstone cqlsh> ALTER TABLE ks.cf WITH tombstone_gc = {'mode':'disabled'}; 3) GC a tombstone immediately cqlsh> ALTER TABLE ks.cf WITH tombstone_gc = {'mode':'immediate'}; 4) GC a tombstone after repair cqlsh> ALTER TABLE ks.cf WITH tombstone_gc = {'mode':'repair'}; In addition to the 'mode' option, another option 'propagation_delay_in_seconds' is added. It defines the max time a write could possibly delay before it eventually arrives at a node. A new gossip feature TOMBSTONE_GC_OPTIONS is added. The new tombstone_gc option can only be used after the whole cluster supports the new feature. A mixed cluster works with no problem. Tests: compaction_test.py, ninja test Fixes #3560 [avi: resolve conflicts vs data_dictionary]	2022-01-04 19:48:14 +02:00
Botond Dénes	7f331cee01	test/boost/mutation_reader_test: add test_combined_reader_range_tombstone_change_merging Stressing the range tombstone change merging logic.	2021-12-20 09:29:05 +02:00
Botond Dénes	e1bbc4a480	mutation_reader: convert make_clustering_combined_reader() to v2 Just sprinkle the right amount of downgrade_to_v1() and upgrade_to_v2() calls at call sites; no attempt at optimization was made.	2021-12-20 09:29:05 +02:00
Botond Dénes	2364144b19	mutation_reader: convert position_reader_queue to v2 By removing the converting (v1->v2) constructor of `reader_and_upper_bound` and adjusting its users.	2021-12-20 09:29:05 +02:00
Botond Dénes	aeddcf50a1	mutation_reader: convert make_combined_reader() overloads to v2 Just sprinkle the right amount of downgrade_to_v1() and upgrade_to_v2() calls at call sites; no attempt at optimization was made.	2021-12-20 09:29:05 +02:00
Botond Dénes	1554b94b78	mutation_reader: combined_reader: convert reader_selector to v2	2021-12-20 09:29:05 +02:00
Botond Dénes	f15f4952be	test/boost/mutation_reader_test: clustering_combined_reader_mutation_source_test: fix end bound calculation Currently the test assumes that fragments represent weakly monotonic upper bounds and therefore unconditionally overwrites the upper-bound on receiving each fragment. Range tombstones however violate this as a range tombstone with a smaller position (lower bound) may have a higher upper bound than some or all fragments that follow it in the stream. This causes test failures after converting the combined reader to v2, but not before, no idea why.	2021-12-16 14:57:49 +02:00
Avi Kivity	395b30bca8	mutation_reader: update make_filtering_reader() to flat_mutation_reader_v2 As part of the drive to move over to flat_mutation_reader_v2, update make_filtering_reader(). Since it doesn't examine range tombstones (only the partition_start, to filter the key) the entire patch is just glue code upgrading and downgrading users in the pipeline (or removing a conversion, in one case). Test: unit (dev) Closes #9723	2021-12-07 12:18:07 +02:00
Botond Dénes	64bb48855c	flat_mutation_reader: revamp flat_mutation_reader_from_mutations() Add schema parameter so that: * Caller has better control over schema -- especially relevant for reverse reads where it is not possible to follow the convention of passing the query schema which is reversed compared to that of the mutations. * Now that we don't depend on the mutations for the schema, we can lift the restriction on mutations not being empty: this leads to safer code. When the mutations parameter is empty, an empty reader is created. Add "make_" prefix to follow convention of similar reader factory functions. Tests: unit(dev) Signed-off-by: Botond Dénes <bdenes@scylladb.com> Message-Id: <20211115155614.363663-1-bdenes@scylladb.com>	2021-11-15 17:58:46 +02:00
Kamil Braun	075a894a89	test: mutation_reader_test: reversed version of test_clustering_order_merger_sstable_set	2021-09-29 12:15:48 +03:00
Kamil Braun	7d5273b044	test: mutation_reader_test: clustering_combined_reader_mutation_source_test: prepare for reading in reverse For reversed reads we must adjust the lower/upper bounds used by the `position_reader_queue` and `clustering_combined_reader`. The bounds are calculated using the mutation schema, but we need bounds calculated using the query schema which is reversed.	2021-09-29 12:15:48 +03:00
Botond Dénes	c048d854d9	test: mutation_reader_test: test_manual_paused_evictable_reader_is_mutation_source: use query schema instead of table schema The two might not be the same in case the schema was upgraded or if we are reading in reverse. It is important to use the passed-in query schema consistently during a read.	2021-09-29 12:15:48 +03:00
Kamil Braun	7dc4ee35c9	sstable_set: time_series_sstable_set: reverse mode `time_series_sstable_set` uses `clustering_combined_reader` to implement efficient single-partition reads. It provides a `position_reader_queue` to the reader. This queue returns readers to the sstables from the set in order of the sstables' lower bounds, and with each reader it provides an upper bound for the positions-in-partition returned by the reader. Until now we would assume non-reversed queries only. Reversed queries were implemented by performing forward query in the lower layers and reversing the results at the upper-most layer of the reader stack. Before pushing the reversing down to the sources (in particular, to sstable readers), we need to support the reverse mode in `time_series_sstable_set` and the queue it provides to `clustering_combined_reader`. This requires using different lower and upper bounds in the queue. For non-reversed reads we used `sstable::min_position()` as the lower bound and `sstable::max_position()` as the upper bound. For reversed reads all comparisons performed by `clustering_combined_reader` will be reversed, as it will use a reversed schema. We can then use `sstable::max_position().reversed()` for the lower bound and `sstable::min_position().reversed()` for the upper bound.	2021-09-28 17:03:57 +03:00
Kamil Braun	fbb83dd5ca	reader_concurrency_semaphore: remove default parameter values from constructors It's easy to forget about supplying the correct value for a parameter when it has a default value specified. It's safer if 'production code' is forced to always supply these parameters manually. The default values were mostly useful in tests, where some parameters didn't matter that much and where the majority of uses of the class are. Without default values adding a new parameter is a pain, forcing one to modify every usage in the tests - and there are a bunch of them. To solve this, we introduce a new constructor which requires passing the `for_tests` tag, marking that the constructor is only supposed to be used in tests (and the constructor has an appropriate comment). This constructor uses default values, but the other constructors - used in 'production code' - do not.	2021-09-14 12:20:28 +02:00
Pavel Emelyanov	8c786937d5	test: Don't nest seastar::async calls The SEASTAR_THREAD_TEST_CASE runs the provided lambda in async context. The sstables::test_env::run_with_async does the same. This (script-generated) patch makes all of the found cases be SEASTAR_TEST_CASE and, respectively, return the async future from the run_with_async(). Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-09-06 08:26:09 +03:00
Benny Halevy	4476800493	flat_mutation_reader: get rid of timeout parameter Now that the timeout is taken from the reader_permit. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2021-08-24 16:30:51 +03:00
Benny Halevy	fe479aca1d	reader_permit: add timeout member To replace the timeout parameter passed to flat_mutation_reader methods. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2021-08-24 14:29:44 +03:00
Michael Livshin	f07306d75c	sstables: make sstable::make_reader() return flat_mutation_reader_v2 Rename the old version to `sstables::make_reader_v1()`, to have a nicely searchable eradication target. Signed-off-by: Michael Livshin <michael.livshin@scylladb.com>	2021-08-09 19:20:48 +03:00
Benny Halevy	67d5addc09	test: mutation_reader_test: clustering_order_merger_test_generator: use explicit type for num_ranges gcc 10.3.1 spews the following error: ``` _test_generator::generate_scenario(std::mt19937&) const’: test/boost/mutation_reader_test.cc:3731:28: error: comparison of integer expressions of different signedness: ‘int’ and ‘long unsigned int’ [-Werror=sign-compare] 3731 \| for (auto i = 0; i < num_ranges; ++i) { \| ~~^~~~~~~~~~~~ ``` Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Message-Id: <20210728073538.2467040-1-bhalevy@scylladb.com>	2021-07-28 11:22:59 +03:00
Botond Dénes	388da36bbb	test: mutation_reader_test: remove restricted reader tests Soon we will switch to up-front admission which will break these tests. No point in trying to fix them as once the switch is done we'll retire the restricted reader too. Remove these tests now so they are not in the way of progress.	2021-07-14 17:19:02 +03:00
Botond Dénes	c07db00b70	test: move away from make_permit() Use the most appropriate up-front admission variant.	2021-07-14 17:19:02 +03:00
Botond Dénes	97a03f9027	database: make_multishard_streaming_reader: use external permit As a preparation for up-front admission, add a permit parameter to `make_multishard_streaming_reader()`, which will be the admitted permit once we switch to up-front admission. For now it has to be a non-admitted permit. A nice side-effect of this patch is that now permits will have a use-case specific description, instead of the generic "multishard-streaming-reader" one	2021-07-14 16:48:43 +03:00
Avi Kivity	9059514335	build, treewide: enable -Wpessimizing-move warning This warning prevents using std::move() where it can hurt - on an unnamed temporary or a named automatic variable being returned from a function. In both cases the value could be constructed directly in its final destination, but std::move() prevents it. Fix the handful of cases (all trivial), and enable the warning. Closes #8992	2021-07-08 17:52:34 +03:00
Botond Dénes	2d2b9e7b36	test/boost: migrate off the global test reader semaphore	2021-07-08 16:53:38 +03:00
Botond Dénes	5fff314739	test/lib/simple_schema: migrate off the global test reader semaphore	2021-07-08 15:28:39 +03:00
Botond Dénes	46d21e842d	test/lib/reader_lifecycle_policy: add permit parameter to factory function The factory method doesn't match the signature of `reader_lifecycle_policy::make_reader()`, notably the permit is missing. Add it as it is important that the wrapping evictable reader and underlying reader share the permits.	2021-07-08 12:31:36 +03:00
Botond Dénes	2a45d643b6	test/boost/mutation_reader_test: share permit between readers in a read Permits were designed such that there is one permit per read, being shared by all readers in that read. Make sure readers created by tests adhere to this.	2021-07-08 12:31:36 +03:00
Botond Dénes	75e8d2d04a	test: mutation_reader_test: add more test for reader recreation	2021-06-30 11:21:58 +03:00
Botond Dénes	852bf6befd	evictable_reader: relax partition key check on reader recreation When recreating the underlying reader, the evictable reader validates that the first partition key it emits is what it expects to be. If the read stopped at the end of a partition, it expects the first partition to be a larger one. If the read stopped in the middle of a certain partition it expects the first partition to be the same it stopped in the middle of. This latter assumption doesn't hold in all circumstances however. Namely, the partition it stopped in the middle of might get compacted away in the time the read was paused, in which case the read will resume from a greater partition. This perfectly valid case, however, currently triggers the evictable reader's self validation, leading to the abortion of the read and a scary error being logged. Relax this check to accept any partition that is >= compared to the one the read stopped in the middle of.	2021-06-30 11:21:53 +03:00
Avi Kivity	d27e88e785	Merge "compaction: prevent broken_promise or dangling reader errors" from Benny " This series prevents broken_promise or dangling reader errors when (resharding) compaction is stopped, e.g. during shutdown. At the moment compaction just closes the reader unilaterally and this yanks the reader from under the queue_reader_handle feet, causing dangling queue reader and broken_promise errors as seen in #8755. Instead, fix queue_reader::close to set value on the _full/_not_full promises and detach from the handle, and return _consume_fut from bucket_writer::consume if handle is terminated. Fixes #8755 Test: unit(dev) DTest: materialized_views_test.py:TestMaterializedViews.interrupt_build_process_and_resharding_half_to_max_test(debug) " * tag 'propagate-reader-abort-v3' of github.com:bhalevy/scylla: mutation_writer: bucket_writer: consume: propagate _consume_fut if queue_reader_handle is_terminated queue_reader_handle: add get_exception method queue_reader: close: set value on promises on detach from handle	2021-06-22 18:52:11 +03:00
Benny Halevy	4830b6647c	queue_reader: close: set value on promises on detach from handle To prevent broken_promise exception. Since close() is mandatory, the queue_reader destructor, that just detaches the reader from the handle, is not needed anymore, so remove it. Adjust the test_queue_reader unit test accordingly. Test: test_queue_reader(dev) Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2021-06-16 17:25:14 +03:00
Botond Dénes	d2ddaced4e	test/lib/reader_lifecycle_policy: get rid of lifecycle workarounds The lifecycle of the reader lifecycle policy and all the resources the reads use is now enclosed in that of the multishard reader thanks to its close() method. We can now remove all the workarounds we had in place to keep different resources as long as background reader cleanup finishes.	2021-06-16 11:29:36 +03:00
Avi Kivity	a55b434a2b	treewide: extend copyright statements to present day	2021-06-06 19:18:49 +03:00
Pavel Emelyanov	d2442a1bb3	tests: Ditch storage_service_for_tests The purpose of the class in question is to start sharded storage service to make its global instance alive. I don't know when exactly it happened but no code that instantiates this wrapper really needs the global storage service. Ref: #2795 tests: unit(dev), perf_sstable(dev) Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Message-Id: <20210526170454.15795-1-xemul@scylladb.com>	2021-05-27 14:39:13 +03:00
Pavel Solodovnikov	fff7ef1fc2	treewide: reduce boost headers usage in scylla header files `dev-headers` target is also ensured to build successfully. Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>	2021-05-20 01:33:18 +03:00
Botond Dénes	300ee974f7	test: use with_cql_test_env_thread where needed Currently `with_cql_test_env()` is equivalent to `with_cql_test_env_thread()`, which resulted in many tests using the former while really needing the latter and getting away with it. This equivalence is incidental and will go away soon, so make sure all cql test env using tests that expect to be run in a thread use the appropriate variant. Signed-off-by: Botond Dénes <bdenes@scylladb.com> Message-Id: <20210514141614.128213-1-bdenes@scylladb.com>	2021-05-18 13:44:52 +03:00
Botond Dénes	c872a963b6	test: move reader_concurrency_semaphore related tests into separate file The mutation_reader_test is already one of our largest test files. Move the reader concurrency semaphore related tests to a new file, making them easier to find and making the mutation reader test a little bit smaller too.	2021-05-06 08:59:47 +03:00
Botond Dénes	5f217b6dee	test: mutation_reader_test: convert restricted reader tests to semaphore tests These two tests (restricted_reader_timeout and restricted_reader_max_queue_length) are testing the semaphore in reality, but through the restricted reader, which is distracting as it needlessly brings in an additional layer into the picture. Rewrite them to test the semaphore directly, getting much lighter in the process.	2021-05-06 08:57:12 +03:00
Botond Dénes	45d580f056	test: mutation_reader_test: add test_reader_concurrency_semaphore_forward_progress This unit test checks that the semaphore doesn't get into a deadlock when contended, in the presence of many memory-only reads (that don't wait for admission). This is tested by simulating the 3 kinds of reads we currently have in the system: * memory-only: reads that don't pass admission and only own memory. * admitted: reads that pass admission. * evictable: admitted reads that are furthermore evictable. The test creates and runs a large number of these reads in parallel, read kinds being selected randomly, then creates a watchdog which kills the test if no progress is being made.	2021-04-26 15:57:17 +03:00
Botond Dénes	cadc26de38	test: mutation_reader_test: add test_reader_concurrency_semaphore_readmission_preserves_units This unit test passes a read through admission again-and-again, just like an evictable reader would be during its lifetime. When readmitted the read sometimes has to wait and sometimes not. This is to check that readmitting a previously admitted reader doesn't leak any units.	2021-04-26 15:57:17 +03:00
Botond Dénes	2b66f7222e	reader_concurrency_semaphore: inactive_read_handle: abandon(): close reader `fa43d7680` recently introduced mandatory closing of readers before they are destroyed. One reader destroy path that was left not closing the reader before destruction is `inactive_reader_handle::abandon()`. This path is executed when the handle is destroyed while still referring to a non-evicted inactive read. This patch fixes it up to close the reader and adds a small unit test which checks that this happens.	2021-04-26 15:56:54 +03:00
Benny Halevy	5b22731f9a	flat_mutation_reader: require close Make flat_mutation_reader::impl::close pure virtual so that all implementations are required to implement it. With that, provide a trivial implementation to all implementations that currently use the default, trivial close implementation. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2021-04-25 11:35:07 +03:00
Benny Halevy	aa5289f255	test: everywhere: close flat_mutation_reader when done Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2021-04-25 11:35:07 +03:00
Benny Halevy	43bf0f9356	reader_concurrency_semaphore: add stop method In addition to clear_inactive_reads, that's currently called when the database object is destroyed, introduce a stop() method that will: 1. wait on all background closes of inactive_reads. 2. close all present inactive_reads and waits on their close. 3. signal waiters on the wait_list via broken() with a proper exception indicating that the semaphore was closed. In addition, assert in the semaphore's destructor that it has no remaining inactive reads. Stop must be called from whoever owns the r_c_s. Mainly, from database::stop. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2021-04-25 11:35:07 +03:00
Benny Halevy	e1ec401bb6	mutation_reader: evictable_reader: implement close If there's an active reader then close it, else, try to resume the paused reader, and close it. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2021-04-25 11:35:07 +03:00

141 Commits