scylladb

Author	SHA1	Message	Date
Łukasz Paszkowski	da95f44adc	readers: Use reversed schema and native reversed slices The reconcilable_result is built as it would be constructed for forward read queries for tables with reversed order. Mutations constructed for reversed queries are consumed forward. Drop overloaded reversed functions that reverse read_command and reconcilable_result directly and keep only those requiring smart pointers. They are not used any more.	2024-08-13 10:03:46 +02:00
Avi Kivity	aa1270a00c	treewide: change assert() to SCYLLA_ASSERT() assert() is traditionally disabled in release builds, but not in scylladb. This hasn't caused problems so far, but the latest abseil release includes a commit [1] that causes a 1000 insn/op regression when NDEBUG is not defined. Clearly, we must move towards a build system where NDEBUG is defined in release builds. But we can't just define it blindly without vetting all the assert() calls, as some were written with the expectation that they are enabled in release mode. To solve the conundrum, change all assert() calls to a new SCYLLA_ASSERT() macro in utils/assert.hh. This macro is always defined and is not conditional on NDEBUG, so we can later (after vetting Seastar) enable NDEBUG in release mode. [1] `66ef711d68` Closes scylladb/scylladb#20006	2024-08-05 08:23:35 +03:00
Avi Kivity	fdc1449392	treewide: rename flat_mutation_reader_v2 to mutation_reader flat_mutation_reader_v2 was introduced in a pair of commits in 2021: `e3309322c3` "Clone flat_mutation_reader related classes into v2 variants" `08b5773c12` "Adapt flat_mutation_reader_v2 to the new version of the API" as a replacement for flat_mutation_reader, using range_tombstone_change instead of range_tombstone to represent range tombstones. See those commits for more information. The transition was incremental; the last use of the original flat_mutation_reader was removed in 2022 in commit `026f8cc1e7` "db: Use mutation_partition_v2 in mvcc" In turn, flat_mutation_reader was introduced in 2017 in commit `748205ca75` "Introduce flat_mutation_reader" To transition from a mutation_reader that nested rows within a partition in a separate stream, to a flat reader that streamed partitions and rows in the same stream. Here, we reclaim the original name and rename the awkward flat_mutation_reader_v2 to mutation_reader. Note that mutation_fragment_v2 remains since we still use the original for compatibility, sometimes. Some notes about the transition: - files were also renamed. In one case (flat_mutation_reader_test.cc), the rename target already existed, so we rename to mutation_reader_another_test.cc. - a namespace 'mutation_reader' with two definitions existed (in mutation_reader_fwd.hh). Its contents were folded into the mutation_reader class. As a result, a few #includes had to be adjusted. Closes scylladb/scylladb#19356	2024-06-21 07:12:06 +03:00
Kefu Chai	432c000dfa	./: not include unused headers these unused includes were identified by clangd. see https://clangd.llvm.org/guides/include-cleaner#unused-include-warning for more details on the "Unused include" warning. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#17888	2024-03-20 09:16:46 +02:00
Pavel Emelyanov	d90db016bf	treewide: Use partition_slice::is_reversed() Continuation of `cc56a971e8`, more noisy places detected Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes scylladb/scylladb#17763	2024-03-13 08:52:46 +02:00
Alexey Novikov	ca4e7f91c6	compact and remove expired rows from cache on read when read from cache compact and expire row tombstones remove expired empty rows from cache do not expire range tombstones in this patch Refs #2252, #6033 Closes #12917	2023-06-26 15:29:01 +02:00
Pavel Emelyanov	66e43912d6	code: Switch to seastar API level 7 In that level no io_priority_class-es exist. Instead, all the IO happens in the context of current sched-group. File API no longer accepts prio class argument (and makes io_intent arg mandatory to impls). So the change consists of - removing all usage of io_priority_class - patching file_impl's inheritants to updated API - priority manager goes away altogether - IO bandwidth update is performed on respective sched group - tune-up scylla-gdb.py io_queues command The first change is huge and was made semi-automatically by: - grep io_priority_class \| default_priority_class - remove all calls, found methods' args and class' fields Patching file_impl-s is smaller, but also mechanical: - replace io_priority_class& argument with io_intent* one - pass intent to lower file (if applicable) Dropping the priority manager is: - git-rm .cc and .hh - sed out all the #include-s - fix configure.py and cmakefile The scylla-gdb.py update is a bit hairy -- it needs to use task queues list for IO classes names and shares, but to detect whether it should, it checks whether the "commitlog" group is present. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes #13963	2023-06-06 13:29:16 +03:00
Avi Kivity	69a385fd9d	Introduce schema/ module Schema related files are moved there. This excludes schema files that also interact with mutations, because the mutation module depends on the schema. Those files will have to go into a separate module. Closes #12858	2023-02-15 11:01:50 +02:00
Avi Kivity	c5e4bf51bd	Introduce mutation/ module Move mutation-related files to a new mutation/ directory. The names are kept in the global namespace to reduce churn; the names are unambiguous in any case. mutation_reader remains in the readers/ module. mutation_partition_v2.cc was missing from CMakeLists.txt; it's added in this patch. This is a step forward towards librarization or modularization of the source base. Closes #12788	2023-02-14 11:19:03 +02:00
Botond Dénes	5e97fb9fc4	row_cache: update reader implementations to v2 cache_flat_mutation_reader gets a native v2 implementation. The underlying mutation representation is not changed: range deletions are still stored as v1 range_tombstones in mutation_partition. These are converted to range tombstone changes during reading. This allows for separating the change of a native v2 reader implementation and a native v2 in-memory storage format, enabling the two to be done at separate times and incrementally.	2022-04-21 14:57:04 +03:00
Botond Dénes	46481264e9	read_context: fix indentation Broken by the previous patch (patches actually -- it was half-indent on half-indent before that).	2022-04-20 10:59:09 +03:00
Botond Dénes	28f90728a3	read_context: coroutinize move_to_next_partition() Makes the code more readable and the impending v2 transition less noisy.	2022-04-20 10:59:09 +03:00
Avi Kivity	fcb8d040e8	treewide: use Software Package Data Exchange (SPDX) license identifiers Instead of lengthy blurbs, switch to single-line, machine-readable standardized (https://spdx.dev) license identifiers. The Linux kernel switched long ago, so there is strong precedent. Three cases are handled: AGPL-only, Apache-only, and dual licensed. For the latter case, I chose (AGPL-3.0-or-later and Apache-2.0), reasoning that our changes are extensive enough to apply our license. The changes we applied mechanically with a script, except to licenses/README.md. Closes #9937	2022-01-18 12:15:18 +01:00
Tomasz Grabiec	63351483f0	row_cache: Support reverse reads natively Some implementation notes below. When iterating in reverse, _last_row is after the current entry (_next_row) in table schema order, not before like in the forward mode. Since there is no dummy row before all entries, reverse iteration must now be prepared for the fact that advancing _next_row may land not pointing at any row. The partition_snapshot_row_cursor maintains continuity() correctly in this case, and positions the cursor before all rows, so most of the code works unchanged. The only exception is in move_to_next_entry(), which now cannot assume that failure to advance to an entry means it can end a read. maybe_drop_last_entry() is not implemented in reverse mode, which may expose reverse-only workload to the problem of accumulating dummy entries. ensure_population_lower_bound() was not updating _last_row after inserting the entry in latest version. This was not a problem for forward reads because they do not modify the row in the partition snapshot represented by _last_row. They only need the row to be there in the latest version after the call. It's different for reversed reads, which change the continuity of the entry represented by _last_row, hence _last_row needs to have the iterator updated to point to the entry from the latest version, otherwise we'd set the continuity of the previous version entry which would corrupt the continuity.	2021-12-19 22:41:35 +01:00
Benny Halevy	4476800493	flat_mutation_reader: get rid of timeout parameter Now that the timeout is taken from the reader_permit. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2021-08-24 16:30:51 +03:00
Benny Halevy	499357fb43	row_cache: autoupdating_underlying_reader: fast_forward_to: fixup indentation Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Message-Id: <20210613104232.634621-2-bhalevy@scylladb.com>	2021-06-20 14:46:35 +03:00
Benny Halevy	3db7db5743	row_cache: autoupdating_underlying_reader: fast_forward_to: capture snapshot by value when updating reader Currently we capture the snapshot mutation_source by reference for calling create_underlying_reader after closing the reader. However, if close_reader yields, the snapshot reference passed may be gone, so capture it by value instead. Fixes #8848 Test: unit(dev) DTest: restore_snapshot_using_old_token_ownership_test(debug) Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Message-Id: <20210613104232.634621-1-bhalevy@scylladb.com>	2021-06-20 14:46:35 +03:00
Avi Kivity	a55b434a2b	treewide: extend copyright statements to present day	2021-06-06 19:18:49 +03:00
Botond Dénes	5eb4517f56	read_context: move_to_next_partition(): make reader creation atomic Otherwise an interleaving cache update can clear the `_prev_snapshot` before the reader is created, leading to the reader being created via a null mutation source. Tests: unit(dev, release, debug:row_cache_test) Fixes #8671. Signed-off-by: Botond Dénes <bdenes@scylladb.com> Message-Id: <20210518092317.227433-1-bdenes@scylladb.com>	2021-05-18 13:41:48 +03:00
Benny Halevy	63522361f2	row_cache: read_context: add close method To close the underlying reader. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2021-04-25 11:35:07 +03:00
Benny Halevy	4b0fcc7d99	row_cache: autoupdating_underlying_reader: add close method To close the underlying reader. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2021-04-25 11:35:07 +03:00
Benny Halevy	3853d7a376	row_cache: autoupdating_underlying_reader: close reader before updating it use the newly introduced reassign method to first close the flat_mutation_reader_opt before assigning it with a new reader. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2021-04-25 11:35:07 +03:00
Piotr Jastrzebski	ceab5f026d	read_context: add _partition_exists This new state stores the information whether current partition represented by _key is present in underlying. Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2021-04-12 20:57:20 +02:00
Piotr Jastrzebski	b3b68dc662	read_context: remove skip_first_fragment arg from create_underlying All callers pass false for its value so no need to keep it around. Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2021-04-12 19:51:06 +02:00
Piotr Jastrzebski	088a02aafd	read_context: skip first fragment in ensure_underlying This was previously done in create_underlying but ensure_underlying is a better place because we will add more related logic to this consumption in the following patches. Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2021-04-12 19:46:04 +02:00
Benny Halevy	5263ab0e9d	row_cache: read_context: use query-request is_single_partition helper Rather than hand-coding the same logic. Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Message-Id: <20210215101254.480228-32-bhalevy@scylladb.com>	2021-02-17 18:29:39 +02:00
Benny Halevy	35256d1b92	treewide: explicitly use flat_mutation_reader_opt Unlike flat_mutation_reader_opt that is defined using optimized_optional<flat_mutation_reader>, std::optional<T> does not evaluate to `false` after being moved, only after it is explicitly reset. Use flat_mutation_reader_opt rather than std::optional<flat_mutation_reader> to make it easier to check if it was closed before it's destroyed or being assigned-over. Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Message-Id: <20210215101254.480228-6-bhalevy@scylladb.com>	2021-02-17 17:57:34 +02:00
Benny Halevy	29002e3b48	flat_mutation_reader: return future from next_partition To allow it to asynchronously close underlying readers on next_partition(). Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2021-01-13 17:35:07 +02:00
Botond Dénes	fe024cecdc	row_cache: pass a valid permit to underlying read All readers are soon going to require a valid permit, so make sure we have a valid permit which we can pass to the underlying reader when creating it. This means `row_cache::make_reader()` now also requires a permit to be passed to it.	2020-05-28 11:34:35 +03:00
Pavel Emelyanov	4fa12f2fb8	header: De-bloat schema.hh The header sits in many other headers, but there's a handy schema_fwd.hh that's tiny and contains needed declarations for other headers. So replace schema.hh with schema_fwd.hh in most of the headers (and remove completely from some). Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Message-Id: <20200303102050.18462-1-xemul@scylladb.com>	2020-03-03 11:34:00 +01:00
Tomasz Grabiec	0675088818	row_cache: Use the correct schema version to populate the partition entry The sstable reader which populates the partition entry in the cache is using the schema of the partition entry snapshot, which will be the schema of the cache at the time the partition was entered. If there was a schema change after the cache reader entered the partition but before it created the sstable reader, the cache populating reader will interpret sstable fragments using the wrong schema version. That is more likely if partitions have many rows, and the front of the partition is populated. With single-row partitions that's unlikely to happen. That is undefined behavior in general, which may include: - read failures due to bad_alloc, if fixed-size cells are interpreted as variable-sized cells, and we misinterpret a value for a huge size - wrong read results - node crash This doesn't result in a permanent corruption, restarting the node should help. Fixes #5127.	2019-10-03 22:03:28 +02:00
Tomasz Grabiec	69775c5721	row_cache: Fix abort in cache populating read concurrent with memtable flush When we're populating a partition range and the population range ends with a partition key (not a token) which is present in sstables and there was a concurrent memtable flush, we would abort on the following assert in cache::autoupdating_underlying_reader: utils::phased_barrier::phase_type creation_phase() const { assert(_reader); return _reader_creation_phase; } That's because autoupdating_underlying_reader::move_to_next_partition() clears the _reader field when it tries to recreate a reader but it finds the new range to be empty: if (!_reader || _reader_creation_phase != phase) { if (_last_key) { auto cmp = dht::ring_position_comparator(_cache._schema); auto&& new_range = _range.split_after(_last_key, cmp); if (!new_range) { _reader = {}; return make_ready_future<mutation_fragment_opt>(); } Fix by not asserting on _reader. creation_phase() will now be meaningful even after we clear the _reader. The meaning of creation_phase() is now "the phase in which the reader was last created or 0", which makes it valid in more cases than before. If the reader was never created we will return 0, which is smaller than any phase returned by cache::phase_of(), since cache starts from phase 1. This shouldn't affect current behavior, since we'd abort() if called for this case, it just makes the value more appropriate for the new semantics. Tests: - unit.row_cache_test (debug) Fixes #4236 Message-Id: <1553107389-16214-1-git-send-email-tgrabiec@scylladb.com>	2019-03-21 12:46:00 -03:00
Duarte Nunes	fa2b0384d2	Replace std::experimental types with C++17 std version. Replace stdx::optional and stdx::string_view with the C++ std counterparts. Some instances of boost::variant were also replaced with std::variant, namely those that called seastar::visit. Scylla now requires GCC 8 to compile. Signed-off-by: Duarte Nunes <duarte@scylladb.com> Message-Id: <20190108111141.5369-1-duarte@scylladb.com>	2019-01-08 13:16:36 +02:00
Paweł Dziepak	df1d438fcd	row_cache: drop support for streamed_mutation::forwarding::yes entirely	2018-12-20 13:27:25 +00:00
Botond Dénes	eb357a385d	flat_mutation_reader: make timeout opt-out rather than opt-in Currently timeout is opt-in, that is, all methods that even have it default it to `db::no_timeout`. This means that ensuring timeout is used where it should be is completely up to the author and the reviewers of the code. As humans are notoriously prone to mistakes this has resulted in a very inconsistent usage of timeout, many clients of `flat_mutation_reader` passing the timeout only to some members and only on certain call sites. This is small wonder considering that some core operations like `operator()()` only recently received a timeout parameter and others like `peek()` didn't even have one until this patch. Both of these methods call `fill_buffer()` which potentially talks to the lower layers and is supposed to propagate the timeout. All this makes the `flat_mutation_reader`'s timeout effectively useless. To make order in this chaos, make the timeout parameter a mandatory one on all `flat_mutation_reader` methods that need it. This ensures that humans now get a reminder from the compiler when they forget to pass the timeout. Clients can still opt-out from passing a timeout by passing `db::no_timeout` (the previous default value) but this will be now explicit and developers should think before typing it. There were surprisingly few core call sites to fix up. Where a timeout was available nearby I propagated it to be able to pass it to the reader, where I couldn't I passed `db::no_timeout`. Authors of the latter kind of code (view, streaming and repair are some of the notable examples) should maybe consider propagating down a timeout if needed. In the test code (the vast majority of the changes) I just used `db::no_timeout` everywhere. Tests: unit(release, debug) Signed-off-by: Botond Dénes <bdenes@scylladb.com> Message-Id: <1edc10802d5eb23de8af28c9f48b8d3be0f1a468.1536744563.git.bdenes@scylladb.com>	2018-09-20 11:31:24 +02:00
Duarte Nunes	712c051de6	cache_flat_mutation_reader: Pre-calculate cell hash When digest is requested, pre-calculate the cell's hash. We consider the case when the cell is already in the cache, and the case when it added by the underlying reader. Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2018-02-01 01:02:50 +00:00
Piotr Jastrzebski	96c97ad1db	Rename streamed_mutation* files to mutation_fragment* Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2018-01-24 20:56:49 +01:00
Glauber Costa	5140aaea00	add a timeout to fast forward to In the last patch, we enabled per-request timeouts, we enable timeouts in fill_buffer. There are many places, though, in which we fast_forward_to before we fill_buffer, so in order to make that effective we need to propagate the timeouts to fast_forward_to as well. In the same way as fill_buffer, we make the argument optional wherever possible in the high level callers, making them mandatory in the implementations. Signed-off-by: Glauber Costa <glauber@scylladb.com>	2018-01-12 07:43:19 -05:00
Piotr Jastrzebski	14d98aaa0b	Rename row_cache::create_underlying_flat_reader to create_underlying_reader Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2017-12-18 16:37:57 +01:00
Piotr Jastrzebski	b976872c1a	Rename all _underlying_flat methods in read_context to _underlying. Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2017-12-18 16:37:57 +01:00
Piotr Jastrzebski	8b796a884f	Rename read_context::enter_flat_partition to enter_partition Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2017-12-18 16:37:57 +01:00
Piotr Jastrzebski	8d37b71843	Rename autoupdating_underlying_flat_reader to autoupdating_underlying_reader Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2017-12-18 16:37:57 +01:00
Piotr Jastrzebski	9789c37e9d	Remove autoupdating_underlying_reader Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2017-12-18 16:37:56 +01:00
Piotr Jastrzebski	893e434207	Stop using autoupdating_underlying_reader in read_context Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2017-12-18 16:37:56 +01:00
Piotr Jastrzebski	714868db2d	Use autoupdating_underlying_flat_reader in read_context and add read_context::enter_flat_partition. This will temporarily coexist with read_context::enter_partition but after everything in cache is migrated to flat reader the new method will replace old one. Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2017-12-18 13:28:09 +01:00
Piotr Jastrzebski	3e980cac3d	Make autoupdating_underlying_flat_reader use flat reader. Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2017-12-18 13:28:09 +01:00
Piotr Jastrzebski	77b6f7c599	read_context: create a copy of autoupdating_underlying_reader called autoupdating_underlying_flat_reader. It will be modified in the next patch to use flat reader to underlying. Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2017-12-18 13:28:09 +01:00
Tomasz Grabiec	af4a9a4a30	row_cache: Make read_context::key() valid before reading from underlying starts So that we can call cache_streamed_mutation::can_populate() before we start reading from underlying. Will be needed in upcoming changes which insert dummy entries when falling back to underlying.	2017-11-02 11:05:19 +01:00
Tomasz Grabiec	22948238b6	row_cache: Fix potential timeout or deadlock due to sstable read concurrency limit database::make_sstable_reader() creates a reader which will need to obtain a semaphore permit when invoked. Therefore, each read may create at most one such reader in order to be guaranteed to make progress. If the reader tries to create another reader, that may deadlock (or for non-system tables, timeout), if enough number of such readers tries to do the same thing at the same time. Avoid the problem by dropping previous reader before creating a new one. Refs #2644. Message-Id: <1501152454-4866-1-git-send-email-tgrabiec@scylladb.com>	2017-07-27 13:58:20 +03:00
Paweł Dziepak	c2ec43f70b	cache_streamed_mutation: use consumer based read_context reader	2017-07-26 14:38:21 +01:00

58 Commits