scylladb

Author	SHA1	Message	Date
Avi Kivity	fcb8d040e8	treewide: use Software Package Data Exchange (SPDX) license identifiers Instead of lengthy blurbs, switch to single-line, machine-readable standardized (https://spdx.dev) license identifiers. The Linux kernel switched long ago, so there is strong precedent. Three cases are handled: AGPL-only, Apache-only, and dual licensed. For the latter case, I chose (AGPL-3.0-or-later and Apache-2.0), reasoning that our changes are extensive enough to apply our license. The changes were applied mechanically with a script, except for licenses/README.md. Closes #9937	2022-01-18 12:15:18 +01:00
Asias He	a8ad385ecd	repair: Get rid of the gc_grace_seconds The gc_grace_seconds is a very fragile and broken design inherited from Cassandra. Deleted data can be resurrected if cluster wide repair is not performed within gc_grace_seconds. This design pushes the job of keeping the database consistent onto the user. In practice, it is very hard to guarantee repair is performed within gc_grace_seconds all the time. For example, repair workload has the lowest priority in the system and can be slowed down by higher priority workloads, so there is no guarantee when a repair will finish. A gc_grace_seconds value that used to work might not work after data volume grows in a cluster. Users might want to avoid running repair during a specific period where latency is the top priority for their business. To solve this problem, an automatic mechanism to protect against data resurrection is proposed and implemented. The main idea is to remove the tombstone only after the range that covers the tombstone is repaired. In this patch, a new table option tombstone_gc is added. The option is used to configure tombstone gc mode. For example: 1) GC a tombstone after gc_grace_seconds cqlsh> ALTER TABLE ks.cf WITH tombstone_gc = {'mode':'timeout'} ; This is the default mode. If no tombstone_gc option is specified by the user, the old gc_grace_seconds based gc will be used. 2) Never GC a tombstone cqlsh> ALTER TABLE ks.cf WITH tombstone_gc = {'mode':'disabled'}; 3) GC a tombstone immediately cqlsh> ALTER TABLE ks.cf WITH tombstone_gc = {'mode':'immediate'}; 4) GC a tombstone after repair cqlsh> ALTER TABLE ks.cf WITH tombstone_gc = {'mode':'repair'}; In addition to the 'mode' option, another option 'propagation_delay_in_seconds' is added. It defines the maximum time a write could possibly be delayed before it eventually arrives at a node. A new gossip feature TOMBSTONE_GC_OPTIONS is added. The new tombstone_gc option can only be used after the whole cluster supports the new feature. A mixed cluster works with no problem. Tests: compaction_test.py, ninja test Fixes #3560 [avi: resolve conflicts vs data_dictionary]	2022-01-04 19:48:14 +02:00
Tomasz Grabiec	d0c367f44f	mvcc: partition_snapshot: Support slicing range tombstones in reverse	2021-12-19 22:41:35 +01:00
Tomasz Grabiec	757fc1275f	partition_snapshot_row_cursor: Support reverse iteration	2021-12-19 22:41:35 +01:00
Tomasz Grabiec	0d7b3f9463	tests: mvcc: Relax monotonicity check Consecutive range tombstones can have the same position. They will, in one of the test cases, once the range tombstone merger in partition_snapshot_flat_reader no longer uses range_tombstone_list to merge data from multiple versions, which deoverlaps, but rather merges the streams corresponding to each version, which interleaves range tombstones from different versions.	2021-07-26 17:27:03 +02:00
Botond Dénes	2d2b9e7b36	test/boost: migrate off the global test reader semaphore	2021-07-08 16:53:38 +03:00
Avi Kivity	a55b434a2b	treewide: extend copyright statements to present day	2021-06-06 19:18:49 +03:00
Pavel Emelyanov	4558eb3afc	partition_snapshot_row_cursor: Move cells hash creation to reader Right now a call to the .row() method may create a hash on the row's cells. It's counterintuitive to see a const method that transparently changes something it points to. Since the only caller of row() that knows whether the hash creation is required is the cache reader, it's better to move the call to prepare_hash() into it. Besides making .row() less surprising, this also helps to get rid of the whole method in the next patches. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-04-09 12:18:29 +03:00
Pavel Emelyanov	00caf5f219	partition_snapshot_row_cursor: Move read_partition into test The method in question is a test-only helper, so there's no need to keep it as part of the API. Another reason to move it is that the method is O(number of rows) and doesn't preempt while looping, but users of the cursor code try hard not to stall the reactor. So even though this method has meaningful semantics within the class, it would be better to reinvent it if needed in core code. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-04-09 12:16:13 +03:00
Calle Wilund	4b65d67a1a	partition_version: Change range_tombstones() to return chunked_vector Refs #7364 The number of tombstones can be large. As a stopgap measure to just returning a source range (with keepalive), we can at least alleviate the problem by using a chunked vector. Closes #7433	2020-10-26 11:54:42 +02:00
Botond Dénes	6ca0464af5	mutation_fragment: add schema and permit We want to start tracking the memory consumption of mutation fragments. For this we need schema and permit during construction, and on each modification, so the memory consumption can be recalculated and passed to the permit. In this patch we just add the new parameters and go through the insane churn of updating all call sites. They will be used in the next patch.	2020-09-28 11:27:23 +03:00
Konstantin Osipov	ff3f9cb7cf	test: stop using BOOST_TEST_MESSAGE() for logging We use boost test logging primarily to generate nice XML xunit files used in Jenkins. These XML files can be bloated with messages from BOOST_TEST_MESSAGE(), hundreds of megabytes of build archives, on every build. Let's use seastar logger for test logging instead, reserving the use of boost log facilities for boost test markup information.	2020-03-05 11:38:11 +03:00
Avi Kivity	6728b96df7	clustering_interval_set: split to own header file clustering_interval_set is a rarely used class, but one that requires boost/icl, which is quite heavyweight. To speed up compilation, move it to its own header and sprinkle #includes where needed. Tests: unit (dev) Message-Id: <20200214190507.1137532-1-avi@scylladb.com>	2020-02-16 17:40:47 +02:00
Konstantin Osipov	1c8736f998	tests: move all test source files to their new locations 1. Move tests to test (using singular seems to be a convention in the rest of the code base) 2. Move boost tests to test/boost, other (non-boost) unit tests to test/unit, tests which are expected to be run manually to test/manual. Update configure.py and test.py with new paths to tests.	2019-12-16 17:47:42 +03:00

14 Commits