scylladb

Author	SHA1	Message	Date
Piotr Jastrzebski	c001374636	codebase wide: replace count with contains C++20 introduced `contains` member functions for maps and sets for checking whether an element is present in the collection. Previously, the `count` function was often used for this in various ways. `contains` not only expresses the intent of the code better but also does so in a more unified way. This commit replaces all occurrences of `count` with `contains`. Tests: unit(dev) Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com> Message-Id: <b4ef3b4bc24f49abe04a2aba0ddd946009c9fcb2.1597314640.git.piotr@scylladb.com>	2020-08-15 20:26:02 +03:00
Calle Wilund	5d044ab74e	commitlog: Make commitlog respect disk limit better Refs #6148 Separates disk usage into two cases: Allocated and used. Since we use both reserve and recycled segments, both which are not actually filled with anything at the point of waiting. Also refuses to recycle segments or increase reserve size if our current disk footprint exceeds threshold. And finally uses some initial heuristics to determine when we should suggest flushing, based on disk limit, segment size, and current usage. Right now, when we only have a half segment left before hitting used == max. Some initial tests show an improved adherence to limit though it will still be exceeded, because we do _not_ force waiting for segments to become cleared or similar if we need to add data, thus slow flushing can still make usage create extra segments. We will however attempt to shrink disk usage when load is lighter. Somewhat unclear how much this impacts performance with tight limits, and how much this matters. v2: * Add some comments/explanations v3: * Made disk footprint subtract happen post delete (non-optimistic)	2020-08-11 10:40:56 +00:00
Calle Wilund	9167d1ac76	commitlog: Demote buffer write log messages to trace Because they become very plentiful and annoying when one tries to analyze segment behaviour. More so in batch mode.	2020-08-11 09:18:23 +00:00
Piotr Jastrzebski	52ec0c683e	codebase wide: replace erase + remove_if with erase_if C++20 introduced std::erase_if which simplifies removal of elements from a collection. Previously the code pattern looked like: <collection>.erase( std::remove_if(<collection>.begin(), <collection>.end(), <predicate>), <collection>.end()); In C++20 the same can be expressed with: std::erase_if(<collection>, <predicate>); This commit replaces all occurrences of the old pattern with the new approach. Tests: unit(dev) Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com> Message-Id: <6ffcace5cce79793ca6bd65c61dc86e6297233fd.1597064990.git.piotr@scylladb.com>	2020-08-10 18:17:38 +03:00
Benny Halevy	3ab1d9fe1d	commitlog: use seastar::with_file_close_on_failure `close_on_failure` was committed to seastar so use the library version. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2020-07-16 20:32:32 +03:00
Benny Halevy	54c5583b8d	commitlog: allocate_segment_ex, segment: pass descriptor by value Besides being more robust than passing const descriptor& to continuations, this helps simplify making allocate_segment_ex's continuations nothrow_move_constructible, which is needed for using seastar::with_file_close_on_failure(). Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2020-07-16 20:31:12 +03:00
Benny Halevy	22c384c2e9	commitlog: allocate_segment_ex: filename capture is unused Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2020-07-16 20:23:57 +03:00
Piotr Dulikowski	b955793088	hinted handoff: disable warnings about segments left on disk When a mutation is written to the commitlog, a rp_handle object is returned which keeps a reference to commitlog segment. A segment is "dirty" when its reference count is not zero, otherwise it is "clean". When commitlog object is being destroyed, a warning is being printed for every dirty segment. On the other hand, clean segments are deleted. In case of standard mutation writing path, the rp_handle moves responsibility for releasing the reference to the memtable to which the mutation is written. When the memtable is flushed to disk, all references accumulated in the memtable are released. In this context, it makes sense to warn about dirty segments, because such segments contain mutations that are not written to sstables, and need to be replayed. However, hinted handoff uses a different workflow - it recreates a commitlog object periodically. When a hint is written to commitlog, the rp_handle reference is not released, so that segments with hints are not deleted when destroying the commitlog. When commitlog is created again, we get a list of saved segments with hints that we can try to send at a later time. Although this is intended behavior, now that releasing the hints commitlog is done properly, it causes the mentioned warning to periodically appear in the logs. This patch adds a parameter for the commitlog that allows to disable this warning. It is only used when creating hinted handoff commitlogs.	2020-07-07 19:40:42 +02:00
Rafael Ávila de Espíndola	67c22c8697	commitlog::read_log_file: Don't discard a future This makes the code a bit easier to read as there are no discarded futures and no references to having to keep a subscription alive, which we don't with current seastar. Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com> Message-Id: <20200527013120.179763-1-espindola@scylladb.com>	2020-06-24 17:22:29 +03:00
Avi Kivity	a4c44cab88	treewide: update concepts language from the Concepts TS to C++20 Seastar recently lost support for the experimental Concepts Technical Specification (TS) and gained support for C++20 concepts. Re-enable concepts in Scylla by updating our use of concepts to the C++20 standard. This change: - peels off uses of the GCC6_CONCEPT macro - removes inclusions of <seastar/gcc6-concepts.hh> - replaces function-style concepts (no longer supported) with equation-style concepts - semicolons added and removed as needed - deprecated std::is_pod replaced by recommended replacement - updates return type constraints to use concepts instead of type names (either std::same_as or std::convertible_to, with std::same_as chosen when possible) No attempt is made to improve the concepts; this is a specification update only. Message-Id: <20200531110254.2555854-1-avi@scylladb.com>	2020-06-02 09:12:21 +03:00
Avi Kivity	4d15aba7c0	commitlog: capture "this" explicitly in lambda C++20 deprecates capturing this in default-copy lambdas ([=]), with good reason. Move to explicit captures to avoid any ambiguity and reduce warning spew. Message-Id: <20200517150834.753463-1-avi@scylladb.com>	2020-05-19 08:14:32 +03:00
Calle Wilund	525b283326	commitlog::read_log_file: Preserve subscription across reading Fixes #6265 Return type for read_log_file was previously changed from subscription to future<>, returning the previously returned subscriptions result of done(). But it did not preserve the subscription itself, which in turn will cause us to (in work::stream), call back into a deleted object. Message-Id: <20200422090856.5218-1-calle@scylladb.com>	2020-04-22 12:12:11 +03:00
Avi Kivity	2039b79664	commitlog: filter out files in the commitlog directory which don't have the correct prefix Commitlog replay is given a filename prefix to filter files against, but it ignores it. As a result we will replay anything in that directory, including recycled segments, which is wasteful. Fix by adding a check for the prefix. Tests: unit (dev), manual test that regular commitlog files are not filtered. Message-Id: <20200416174542.133230-1-avi@scylladb.com>	2020-04-17 08:44:32 +03:00
Benny Halevy	35892e4557	db::commitlog: close file if wrapping failed When an I/O error (e.g. EMFILE / ENOSPC) happens we hit an assert in ~append_challenged_posix_file_impl(): Assertion '_closing_state == state::closed' failed. Commit `6160b9017d` added close on failure of the lambda defined in allocate_segment_ex, but it doesn't handle an error after the file is opened/created while it is wrapped with commitlog_file_extensions. Refs #5657 Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Reviewed-by: Calle Wilund <calle@scylladb.com> Message-Id: <20200414115231.298632-1-bhalevy@scylladb.com>	2020-04-14 16:14:28 +03:00
Avi Kivity	88ade3110f	treewide: replace calls to engine().some_api() with some_api() This removes the need to include reactor.hh, a source of compile time bloat. In some places, the call is qualified with seastar:: in order to resolve ambiguities with a local name. Includes are adjusted to make everything compile. We end up having 14 translation units including reactor.hh, primarily for deprecated things like reactor::at_exit(). Ref #1	2020-04-05 12:46:04 +03:00
Rafael Ávila de Espíndola	c5795e8199	everywhere: Replace engine().cpu_id() with this_shard_id() This is a bit simpler and might allow removing a few includes of reactor.hh. Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com> Message-Id: <20200326194656.74041-1-espindola@scylladb.com>	2020-03-27 11:40:03 +03:00
Rafael Ávila de Espíndola	eca0ac5772	everywhere: Update for deprecated apply functions Now apply is only for tuples, for varargs use invoke. This depends on the seastar changes adding invoke. Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com> Message-Id: <20200324163809.93648-1-espindola@scylladb.com>	2020-03-25 08:49:53 +02:00
Calle Wilund	9fee712d62	db::commitlog: Don't write trailing zero block unless needed Fixes #5899 When terminating (closing) a segment, we write a trailing block of zero so reader can have an empty region after last used chunk as end marker. This is due to using recycled, pre-allocated segments with potentially non-zero data extending over the point where we are ending the segment (i.e. we are not fully filling the segment due to a huge mutation or similar). However, if we reach end of segment writing the final block (typically many small mutations), the file will end naturally after the data written, and any trailing zero block would in fact just extend the file further. While this will only happen once per segment recycled (independent on how many times it is recycled), it is still both slightly breaking the disk usage contract and also potentially causing some disk stalls due to metadata changes (though of course very infrequent). We should only write trailing zero if we are below the max_size file size when terminating Adds a small size check to commitlog test to verify size bounds. (Which breaks without the patch) v2: - Fix test to take into account that files might be deleted behind our backs. v3: - Fix test better, by doing verification _before_ segments are queued for delete. Message-Id: <20200226121601.15347-2-calle@scylladb.com> Message-Id: <20200324100235.23982-1-calle@scylladb.com>	2020-03-24 11:31:55 +01:00
Pekka Enberg	6b2cd1bd7d	Revert "db::commitlog: Don't write trailing zero block unless needed" This reverts commit `0b34d88957`. According to Rafael Avila de Espindola: "I have bisected the recent failures [in commitlog_test] on next to this patch."	2020-03-20 22:30:58 +02:00
Calle Wilund	0b34d88957	db::commitlog: Don't write trailing zero block unless needed Fixes #5899 When terminating (closing) a segment, we write a trailing block of zero so reader can have an empty region after last used chunk as end marker. This is due to using recycled, pre-allocated segments with potentially non-zero data extending over the point where we are ending the segment (i.e. we are not fully filling the segment due to a huge mutation or similar). However, if we reach end of segment writing the final block (typically many small mutations), the file will end naturally after the data written, and any trailing zero block would in fact just extend the file further. While this will only happen once per segment recycled (independent on how many times it is recycled), it is still both slightly breaking the disk usage contract and also potentially causing some disk stalls due to metadata changes (though of course very infrequent). We should only write trailing zero if we are below the max_size file size when terminating Adds a small size check to commitlog test to verify size bounds. (Which breaks without the patch) Message-Id: <20200226121601.15347-2-calle@scylladb.com>	2020-03-08 16:51:53 +02:00
Calle Wilund	b48255a4cd	db::commitlog: Only zero disk blocks not already allocated in segment Fixes #5891 Refs #5899 When creating segments with the o_dsync option active, we write max_size zeros to disk, to ensure actual disk blocks are allocated. However, if we recycle a segment, we should, when not actually creating a new file, check the existing size on disk, and only zero any blocks not already allocated (i.e. if recycled file was smaller than max_size, due to segment truncation on close). test: unit Message-Id: <20200226121601.15347-1-calle@scylladb.com>	2020-03-05 13:27:08 +01:00
Piotr Jastrzebski	f105f43008	commitlog: remove FIXME In segment_manager::on_timer() there's a FIXME to stop discarding future returned from sync() but sync() does not return any future so it's safe to remove the FIXME and stop casting to (void). Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com> Message-Id: <6d6d819cb2972e47e5f3fbe7b896499c64b09e53.1583230579.git.piotr@scylladb.com>	2020-03-03 12:21:56 +02:00
Gleb Natapov	df2f67626b	commitlog: fix size of a write used to zero a segment Due to a bug the entire segment is written in one huge write of 32Mb. The idea was to split it to writes of 128K, so fix it. Fixes #5857 Message-Id: <20200220102939.30769-1-gleb@scylladb.com>	2020-02-20 17:22:21 +02:00
Gleb Natapov	6a78cc9e31	commitlog: use commitlog IO scheduling class for segment zeroing There may be other commitlog writes waiting for zeroing to complete, so not using proper scheduling class causes priority inversion. Fixes #5858. Message-Id: <20200220102939.30769-2-gleb@scylladb.com>	2020-02-20 17:15:13 +02:00
Pavel Emelyanov	d1775dd701	utils: Move disk-error-handler into it The disk-error-handler is purely auxiliary thing that helps propagating IO errors to the rest of the code. It well deserves not sitting in the root namespace. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Message-Id: <20200207112443.18475-1-xemul@scylladb.com>	2020-02-09 17:26:52 +02:00
Rafael Ávila de Espíndola	da984f1f33	commitlog: Store a future instead of a subscription in db::commitlog::segment_manager::list_descriptors::helper The only use we had for the subscription was calling done, may as well call it early and store the future<>. Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>	2020-01-30 08:31:28 -08:00
Rafael Ávila de Espíndola	e4b8f52237	commitlog: Simplify the return of read_log_file This function really just wants to signal it is done, so return a future<>. Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com> Message-Id: <20200128172847.31513-1-espindola@scylladb.com>	2020-01-30 12:00:29 +02:00
Gleb Natapov	c654ffe34b	commitlog: fix flushing an entry marked as "sync" in periodic mode After `546556b71b` we can have mixed writes into commitlog, some do flush immediately some do not. If non flushing write races with flushing one and becomes responsible for writing back its buffer into a file flush will be skipped which will cause assert in batch_cycle() to trigger since flush position will not be advanced. Fix that by checking that flush was skipped and in this case flush explicitly our file position. Fixes #5670 Message-Id: <20200128145103.GI26048@scylladb.com>	2020-01-29 12:58:25 +02:00
Gleb Natapov	8dc37277df	commitlog: remove unused variable Message-Id: <20200128132118.GH26048@scylladb.com>	2020-01-29 00:11:17 +02:00
Gleb Natapov	e0bc4aa098	commitlog: add sync method to entry_writer If the method returns true commitlog should sync to file immediately after writing the entry and wait for flush to complete before returning.	2020-01-15 12:15:42 +02:00
Gleb Natapov	720c0aa285	commitlog: update last sync timestamp when cycling a buffer If the in-memory buffer does not have enough space for an incoming mutation, it is written to a file, but the code missed updating the timestamp of the last sync, so we may sync too often. Message-Id: <20200102155049.21291-9-gleb@scylladb.com>	2020-01-05 16:13:59 +02:00
Gleb Natapov	14746e4218	commitlog: drop segment gate The code that enters the gate never defers before leaving, so the gate behaves like a flag. Let's use the existing flag to prohibit adding data to a closed segment. Message-Id: <20200102155049.21291-8-gleb@scylladb.com>	2020-01-05 16:13:59 +02:00
Gleb Natapov	680330ae70	commitlog: introduce segment::close() function. Currently segment closing code is spread over several functions and activated based on the _closed flag. Make segment closing explicit by moving all the code into close() function and call it where _closed flag is set. Message-Id: <20200102155049.21291-6-gleb@scylladb.com>	2020-01-05 16:13:55 +02:00
Gleb Natapov	a1ae08bb63	commitlog: remove unused segment::flush() parameter Message-Id: <20200102155049.21291-5-gleb@scylladb.com>	2020-01-05 16:13:55 +02:00
Gleb Natapov	1e15e1ef44	commitlog: cleanup segment sync() Call cycle() only once. Message-Id: <20200102155049.21291-4-gleb@scylladb.com>	2020-01-05 16:13:54 +02:00
Gleb Natapov	3d3d2c572e	commitlog: move segment shutdown code from sync() Currently sync() does two completely different things based on the shutdown parameter. Separate the code into two different functions. Message-Id: <20200102155049.21291-3-gleb@scylladb.com>	2020-01-05 16:13:54 +02:00
Gleb Natapov	89afb92b28	commitlog: drop superfluous this Message-Id: <20200102155049.21291-2-gleb@scylladb.com>	2020-01-05 16:13:53 +02:00
Gleb Natapov	bae5cb9f37	commitlog: remove unused argument during segment creation Since `99a5a77234` all segments are created equal and "active" argument is never true, so drop it. Message-Id: <20191231150639.GR9084@scylladb.com>	2019-12-31 17:14:03 +02:00
Gleb Natapov	60a851d3a5	commitlog: always flush segments atomically with writing db::commitlog::segment::batch_cycle() assumes that after a write for a certain position completes (as reported by _pending_ops.wait_for_pending()) it will also be flushed, but this is true only if writing and flushing are atomic wrt _pending_ops lock. It usually is unless flush_after is set to false when cycle() is called. In this case only writing is done under the lock. This is exactly what happens when a segment is closed. Flush is skipped because zero header is added after the last entry and then flushed, but this optimization breaks batch_cycle() assumption. Fix it by flushing after the write atomically even if a segment is being closed. Fixes #5496 Message-Id: <20191224115814.GA6398@scylladb.com>	2019-12-24 14:52:23 +02:00
Tomasz Grabiec	aa173898d6	Merge "Named semaphores in concurrency reader, segment_manager and region_group" from Juliusz Selected semaphores' names are now included in exception messages in case of timeout or when admission queue overflows. Resolves #5281	2019-12-05 14:19:56 +01:00
Juliusz Stasiewicz	430b2ad19d	commitlog+region_group: timeout exceptions with names `segment_manager' now uses a decorated version of `timed_out_error' with hardcoded name. On the other hand `region_group' uses named `on_request_expiry' within its `expiring_fifo'.	2019-12-03 19:07:19 +01:00
Botond Dénes	4054ba0c45	serialization: accept any CharOutputIterator Not just bytes::output_iterator. Allow writing into streams other than just `bytes`. In fact we should be very careful with writing into `bytes` as they require potentially large contiguous allocations. The `write()` method is now templatized also on the type of its first argument, which now accepts any CharOutputIterator. Due to our poor usage of namespace this now collides with `write` defined inside `db/commitlog/commitlog.cc`. Luckily, the latter doesn't really have to be templatized on the data type it reads from, and de-templatizing it resolves the clash.	2019-12-02 10:10:31 +02:00
Rafael Ávila de Espíndola	6160b9017d	commitlog: make sure a file is closed If allocate or truncate throws, we have to close the file. Fixes #4877 Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com> Message-Id: <20191114174810.49004-1-espindola@scylladb.com>	2019-11-24 11:35:29 +02:00
Avi Kivity	623071020e	commitlog: change variadic stream in read_log_file to future<struct> Since seastar::streams are based on future/promise, variadic streams suffer the same fate as variadic futures - deprecation and eventual removal. This patch therefore replaces a variadic stream in commitlog::read_log_file() with a non-variadic stream, via a helper struct. Tests: unit (dev)	2019-10-29 19:25:12 +01:00
Rafael Ávila de Espíndola	4d0916a094	commitlog: Handle gate_closed_exception Before this patch, if the _gate is closed, with_gate throws and forward_to is not executed. When the promise<> p is destroyed it marks its _task as a broken promise. What happens next depends on the branch. On master, we warn when the shared_future is destroyed, so this patch changes the warning from a broken_promise to a gate closed. On 3.1, we warn when the promises in shared_future::_peers are destroyed since they no longer have a future attached: The future that was attached was the "auto f" just before the with_gate call, and it is destroyed when with_gate throws. The net result is that this patch fixes the warning in 3.1. I will send a patch to seastar to make the warning on master more consistent with the warning in 3.1. Fixes #4394 Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com> Message-Id: <20190917211915.117252-1-espindola@scylladb.com>	2019-09-17 23:41:21 +02:00
Botond Dénes	136fc856c5	treewide: silence discarded future warnings for questionable discards This patch silences the remaining discarded future warnings, those where it cannot be determined with reasonable confidence that this was indeed the actual intent of the author, or that the discarding of the future could lead to problems. For all those places a FIXME is added, with the intent that these will be soon followed-up with an actual fix. I deliberately haven't fixed any of these, even if the fix seems trivial. It is too easy to overlook a bad fix mixed in with so many mechanical changes.	2019-08-26 19:28:43 +03:00
Botond Dénes	fddd9a88dd	treewide: silence discarded future warnings for legit discards This patch silences those future discard warnings where it is clear that discarding the future was actually the intent of the original author, and they did the necessary precautions (handling errors). The patch also adds some trivial error handling (logging the error) in some places, which were lacking this, but otherwise look ok. No functional changes.	2019-08-26 18:54:44 +03:00
Rafael Ávila de Espíndola	636e2470b1	Always close commitlog files We were using segment::_closed to decide whether _file was already closed. Unfortunately they are not exactly the same thing. As far as I understand it, segments can be closed and reused without actually closing the file. Found with a seastar patch that asserts on destroying an open append_challenged_posix_file_impl. Fixes #4745. Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com> Message-Id: <20190721171332.7995-1-espindola@scylladb.com>	2019-07-22 10:08:57 +03:00
Calle Wilund	f317d7a975	commitlog: Simplify commitlog extension iteration Fixes #4640 Iterating extensions in commitlog.cc should mimic that in sstables.cc, i.e. a simple future-chain. Should also use same order for read and write open, as we should preserve transformation stack order. Message-Id: <20190702150028.18042-1-calle@scylladb.com>	2019-07-02 18:37:44 +03:00
Calle Wilund	1e37e1d40c	commitlog: Add optional use of O_DSYNC mode Refs #3929 Optionally enables O_DSYNC mode for segment files, and when enabled ignores actual flushing and just barriers any ongoing writes. Iff using O_DSYNC mode, we will not only truncate the file to max size, but also do an actual initial write of zeros to it, since XFS (intended target) has observably less good behaviour on non-physical file blocks. Once written (and maybe recycled) we should have rather satisfying throughput on writes. Note that the O_DSYNC behaviour is hidden behind a default disabled option. While user should probably seldom worry about this, we should add some sort of logic in main/init that unless specified by user, evaluates the commitlog disk and sets this to true if it is using XFS and looks ok. This is because using O_DSYNC on things like EXT4 etc has quite horrible performance. All above statements about performance and O_DSYNC behaviour are based on a sampling of benchmark results (modified fsqual) on a statistically non-significant selection of disks. However, at least there the observed behaviour is a rather large difference between ::fallocate:ed disk area vs. actually written using O_DSYNC on XFS, and O_DSYNC on EXT4. Note also that measurements on O_DSYNC vs. no O_DSYNC do not take into account the wall-clock time of doing manual disk flush. This is intentionally ignored, since in the commitlog case, at least using periodic mode, flushes are relatively rare. Message-Id: <20190520120331.10229-1-calle@scylladb.com>	2019-05-20 15:10:48 +03:00


236 Commits