scylladb

Author	SHA1	Message	Date
Piotr Jastrzebski	01ea159fde	codebase wide: use try_emplace when appropriate C++17 introduced try_emplace for maps to replace a pattern: if(element not in a map) { map.emplace(...) } try_emplace is more efficient and results in more concise code. This commit introduces usage of try_emplace where it's appropriate. Tests: unit(dev) Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com> Message-Id: <4970091ed770e233884633bf6d46111369e7d2dd.1597327358.git.piotr@scylladb.com>	2020-08-16 14:41:09 +03:00
Piotr Jastrzebski	c001374636	codebase wide: replace count with contains C++20 introduced `contains` member functions for maps and sets for checking whether an element is present in the collection. Previously the `count` function was often used in various ways. `contains` not only expresses the intent of the code better but also does so in a more uniform way. This commit replaces all occurrences of `count` with `contains`. Tests: unit(dev) Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com> Message-Id: <b4ef3b4bc24f49abe04a2aba0ddd946009c9fcb2.1597314640.git.piotr@scylladb.com>	2020-08-15 20:26:02 +03:00
Tomasz Grabiec	3486eba1ce	commitlog: Fix use-after-free on mutation object during replay The mutation object may be freed prematurely during commitlog replay in the schema upgrading path. We will hit the problem if the memtable is full and apply_in_memory() needs to defer. This will typically manifest as a segfault. Fixes #6953 Introduced in `79935df` Tests: - manual using scylla binary. Reproduced the problem then verified the fix makes it go away Message-Id: <1596044010-27296-1-git-send-email-tgrabiec@scylladb.com>	2020-07-29 20:58:15 +03:00
Botond Dénes	6083ed668b	commitlog_replayer: ignore entries with invalid keys When replaying the commitlog, pass keys to `validation::validate_cql_key()`. Discard entries which fail validation and warn about it in the logs. This prevents invalid keys from getting into the system, possibly failing the commitlog replay and the successful boot of the node, preventing the node from recovering data.	2020-05-12 12:07:21 +03:00
Rafael Ávila de Espíndola	e4b8f52237	commitlog: Simplify the return of read_log_file This function really just wants to signal it is done, so return a future<>. Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com> Message-Id: <20200128172847.31513-1-espindola@scylladb.com>	2020-01-30 12:00:29 +02:00
Calle Wilund	56a5e0a251	commitlog_replayer: Ensure applied frozen_mutation is safe during apply Fixes #5211 In `79935df959` replay apply-call was changed from one with no continuation to one with. But the frozen mutation arg was still just lambda local. Change to use do_with for this case as well. Message-Id: <20191203162606.1664-1-calle@scylladb.com>	2019-12-03 18:28:01 +02:00
Avi Kivity	623071020e	commitlog: change variadic stream in read_log_file to future<struct> Since seastar::streams are based on future/promise, variadic streams suffer the same fate as variadic futures - deprecation and eventual removal. This patch therefore replaces a variadic stream in commitlog::read_log_file() with a non-variadic stream, via a helper struct. Tests: unit (dev)	2019-10-29 19:25:12 +01:00
Tomasz Grabiec	79935df959	commitlog: replay: Respect back-pressure from memtable space to prevent OOM Commit log replay was bypassing memtable space back-pressure, and if replay was faster than memtable flush, it could lead to OOM. The fix is to call database::apply_in_memory() instead of table::apply(). The former blocks when memtable space is full. Fixes #4982. Tests: - unit (release) - manual, replay with memtable flush failing and without failing Message-Id: <1568381952-26256-1-git-send-email-tgrabiec@scylladb.com>	2019-09-15 11:51:56 +03:00
Calle Wilund	9cadbaa96f	commitlog_replayer: Bugfix: finding truncation positions uses local var ref "uuid" was ref:ed in a continuation. Works 99.9% of the time because the continuation is not actually delayed (and assuming we begin the checks with non-truncated (system) cf:s it works). But if we do delay continuation, the resulting cf map will be borked. Fixes #4187. Message-Id: <20190204141831.3387-1-calle@scylladb.com>	2019-02-04 16:51:13 +02:00
Duarte Nunes	b7517183fa	db/commitlog: Use fragmented buffers to read entries Leverage fragmented_temporary_buffer when reading commit log entries, avoiding large allocations. Refs #4020 Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2018-12-31 13:20:37 +00:00
Avi Kivity	f0a709cfc8	commitlog_replayer: don't use query_processor During normal writes, query processing happens before commitlog, so logically replaying the commitlog shouldn't need it. And in fact the dependency on query_processor can be eliminated; all it needs is the local node's database.	2018-12-29 11:00:29 +02:00
Avi Kivity	cc8312a8b9	commitlog: reduce dependencies on db/config.hh Instead of accessing extensions via config, access it via database::extensions(). This reduces recompilations when configuration is extended.	2018-12-21 20:15:43 +00:00
Tomasz Grabiec	538e041f22	Merge "Remove some dependencies on db::config" from Avi db::config is a global class; changes in any module can cause changes in db::config. Therefore, it is a cause of needless recompilation. Remove some of these dependencies by having consumers of db::config declare an intermediate config struct that contains only configuration of interest to them, and have their caller fill it out (in the case of auth, it already followed this scheme and the patchset only moves the translation function). In addition, some outright pointless inclusions of db/config.hh are removed. The result is somewhat shorter compile times, and fewer needless recompiles. * https://github.com/avikivity/scylla unconfig-1/v1: config: remove inclusions of db/config.hh from header files repair: remove unneeded config.hh inclusion batchlog_manager: remove dependency on db::config auth: remove permissions_cache dependency on db::config auth: remove auth::service dependency on db::config auth: remove unneeded db/config.hh includes	2018-12-10 14:53:14 +01:00
Calle Wilund	b35af84599	commitlog_replay: Enforce file name based id matching When reading the header chunk of a commitlog file, check the stored id value against the id derived from the file name, and ignore if mismatched. This is a prerequisite for re-using renamed commitlog files, as we can then fail-fast should one such be left on disk, instead of trying to replay it. We also check said id via the CRC check for each chunk parsed. If we find a chunk with mismatched id, we will get a CRC error for the chunk, and replay will terminate (albeit not gracefully).	2018-12-10 09:09:07 +00:00
Avi Kivity	864f55e745	config: remove inclusions of db/config.hh from header files Instead, distribute those inclusions to .cc files that require them. This reduces rebuilds when config.hh changes, and makes it easier to locate files that need config disaggregation.	2018-12-09 20:11:38 +02:00
Avi Kivity	775b7e41f4	Update seastar submodule * seastar d59fcef...b924495 (2): > build: Fix protobuf generation rules > Merge "Restructure files" from Jesse Includes fixup patch from Jesse: " Update Seastar `#include`s to reflect restructure All Seastar header files are now prefixed with "seastar" and the configure script reflects the new locations of files. Signed-off-by: Jesse Haber-Kucharsky <jhaberku@scylladb.com> Message-Id: <5d22d964a7735696fb6bb7606ed88f35dde31413.1542731639.git.jhaberku@scylladb.com> "	2018-11-21 00:01:44 +02:00
Avi Kivity	d77e044cde	db: convert sprint() to format() sprint() recently became more strict, throwing on sprint("%s", 5). Replace with the more modern format(). Mechanically converted with https://github.com/avikivity/unsprint.	2018-11-01 13:16:17 +00:00
Vlad Zolotarov	a89188de07	commitlog::read_log_file(): set the read I/O priority class explicitly Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>	2018-10-10 15:22:43 -04:00
Calle Wilund	bb1a2c6c2e	db::commitlog: Add commitlog/hints file io extension To allow on-disk data to be augmented.	2018-03-26 11:58:27 +00:00
José Guilherme Vanz	380bc0aa0d	Swap arguments order of mutation constructor Swap arguments in the mutation constructor keeping the same standard from the constructor variants. Refs #3084 Signed-off-by: José Guilherme Vanz <guilherme.sft@gmail.com> Message-Id: <20180120000154.3823-1-guilherme.sft@gmail.com>	2018-01-21 12:58:42 +02:00
Vlad Zolotarov	878d58d23a	db/commitlog/commitlog::descriptor: add a filename_prefix parameter This parameter is used when creating a new segment. Its default value is descriptor::FILENAME_PREFIX. Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>	2017-12-14 15:05:47 -05:00
Calle Wilund	d9b8c79eb9	commitlog_replayer: Ignore sstable replay positions With relaxed position ordering, we cannot use existing sstables as water mark for replay. We must replay everything above truncation marks.	2017-06-07 12:07:01 +00:00
Avi Kivity	ebaeefa02b	Merge seastar upstream (seastar namespace) - introduced "seastarx.hh" header, which does a "using namespace seastar"; - 'net' namespace conflicts with seastar::net, renamed to 'netw'. - 'transport' namespace conflicts with seastar::transport, renamed to cql_transport. - "logger" global variables now conflict with logger global type, renamed to xlogger. - other minor changes	2017-05-21 12:26:15 +03:00
Calle Wilund	b12b65db92	commitlog/replayer: Bugfix: minimum rp broken, and cl reader offset too The previous fix removed the additional insertion of "min rp" per source shard based on whether we had processed existing CF:s or not (i.e. if a CF does not exist as sstable at all, we must tag it as zero-rp, and make whole shard for it start at same zero). This is bad in itself, because it can cause data loss. It does not cause crashing however. But it did uncover another, old lingering bug, namely the commitlog reader initiating its stream wrongly when reading from an actual offset (i.e. not processing the whole file). We opened the file stream from the file offset, then tried to read the file header and magic number from there -> boom, error. Also, rp-to-file mapping was potentially suboptimal due to using bucket iterator instead of actual range. I.e. three fixes: * Reinstate min position guarding for unencountered CF:s * Fix stream creation in CL reader * Fix segment map iterator use. v2: * Fix typo Message-Id: <1490611637-12220-1-git-send-email-calle@scylladb.com>	2017-03-28 10:32:28 +02:00
Calle Wilund	c3a510a08d	commitlog_replayer: Do proper const-lookup of min positions for shards Fixes #2173 Per-shard min positions can be unset if we never collected any sstable/truncation info for it, yet replay segments of that id. Wrap the lookups to handle "missing data -> default", which should have been there in the first place. Message-Id: <1490185101-12482-1-git-send-email-calle@scylladb.com>	2017-03-22 17:57:09 +02:00
Calle Wilund	078589c508	commitlog_replayer: Make replay parallel per shard Fixes #2098 Replay previously did all segments in parallel on shard 0, which caused heavy memory load. To reduce this and spread footprint across shards, instead do X segments per shard, sequential per shard. v2: * Fixed whitespace errors Message-Id: <1489503382-830-1-git-send-email-calle@scylladb.com>	2017-03-15 13:07:17 +02:00
Tomasz Grabiec	059a1a4f22	db: Fix commitlog replay to not drop cell mutations with older schema column_mapping is not safe to access across shards, because data_type is not safe to access. One manifestation of this is that abstract_type::is_value_compatible_with() always fails if the two types belong to different shards. During replay, column_mapping lives on the replaying shard, and is used by converting_mutation_partition_applier against the schema on the target shard. Since types in the mapping will be considered incompatible with types in the schema, all cells will be dropped. Fix by using column_mapping in a safe way, by copying it to the target shard if necessary. Each shard maintains its own cache of column mappings. Fixes #1924. Message-Id: <1481310463-13868-1-git-send-email-tgrabiec@scylladb.com>	2016-12-13 12:19:32 +02:00
Tomasz Grabiec	c1a7e2090e	Revert "database: change find_column_families signature so it returns a lw_shared_ptr" This reverts commit `f3528ede65`.	2016-11-04 10:48:21 +01:00
Glauber Costa	f3528ede65	database: change find_column_families signature so it returns a lw_shared_ptr There are places in which we need to use the column family object many times, with deferring points in between. Because the column family may have been destroyed in the deferring point, we need to go and find it again. If we use lw_shared_ptr, however, we'll be able to at least guarantee that the object will be alive. Some users will still need to check, if they want to guarantee that the column family wasn't removed. But others that only need to make sure we don't access an invalid object will be able to avoid the cost of re-finding it just fine. Signed-off-by: Glauber Costa <glauber@scylladb.com> Message-Id: <722bf49e158da77ff509372c2034e5707706e5bf.1478111467.git.glauber@scylladb.com>	2016-11-03 13:27:31 +01:00
Avi Kivity	2a46410f4a	Change sstable_list from a map to a set sstable_list is now a map<generation, sstable>; change it to a set in preparation for replacing it with sstable_set. The change simplifies a lot of code; the only casualty is the code that computes the highest generation number.	2016-07-03 10:26:57 +03:00
Calle Wilund	2b812a392a	commitlog_replayer: Fix calculation of global min pos per shard If a CF does not have any sstables at all, we should treat it as having a replay position of zero. However, since we also must deal with potential re-sharding, we cannot just set shard->uuid->zero initially, because we don't know what shards existed. Go through all CF:s post map-reduce, and for every shard where a CF does not have an RP-mapping (no sstables found), set the global min pos (for shard) to zero. Fixes #1372 Message-Id: <1465991864-4211-1-git-send-email-calle@scylladb.com>	2016-06-21 10:05:05 +03:00
Pekka Enberg	38a54df863	Fix pre-ScyllaDB copyright statements People keep tripping over the old copyrights and copy-pasting them to new files. Search and replace "Cloudius Systems" with "ScyllaDB". Message-Id: <1460013664-25966-1-git-send-email-penberg@scylladb.com>	2016-04-08 08:12:47 +03:00
Paweł Dziepak	bdc23ae5b5	remove db/serializer.hh includes Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>	2016-03-02 09:07:09 +00:00
Pekka Enberg	86173fb8cc	db/commitlog: Fix debug log format string in commitlog_replayer::recover() I saw the following Boost format string related warning during commitlog replay: INFO [shard 0] commitlog_replayer - Replaying node3/commitlog/CommitLog-1-72057594289748293.log, node3/commitlog/CommitLog-1-90071992799230277.log, node3/commitlog/CommitLog-1-108086391308712261.log, node3/commitlog/CommitLog-1-251820357.log, node3/commitlog/CommitLog-1-54043195780266309.log, node3/commitlog/CommitLog-1-36028797270784325.log, node3/commitlog/CommitLog-1-126100789818194245.log, node3/commitlog/CommitLog-1-18014398761302341.log, node3/commitlog/CommitLog-1-126100789818194246.log, node3/commitlog/CommitLog-1-251820358.log, node3/commitlog/CommitLog-1-18014398761302342.log, node3/commitlog/CommitLog-1-36028797270784326.log, node3/commitlog/CommitLog-1-54043195780266310.log, node3/commitlog/CommitLog-1-72057594289748294.log, node3/commitlog/CommitLog-1-90071992799230278.log, node3/commitlog/CommitLog-1-108086391308712262.log WARN [shard 0] commitlog_replayer - error replaying: boost::exception_detail::clone_impl<boost::exception_detail::error_info_injector<boost::io::too_many_args> > (boost::too_many_args: format-string referred to less arguments than were passed) While inspecting the code, I noticed that one of the error loggers is missing an argument. As I don't know how the original failure triggered, I wasn't able to verify that that was the only one, though. Message-Id: <1453893301-23128-1-git-send-email-penberg@scylladb.com>	2016-01-27 13:40:19 +02:00
Calle Wilund	59bf54d59a	commitlog_replayer: Modify logging to more match origin * Match origin log messages - Demote per-file printouts to "debug" level. * Print an all-files stat summary for whole replay (begin/summary) - At info level, like origin Prompted by dtest that expects origin log output. Message-Id: <1453216558-18359-1-git-send-email-calle@scylladb.com>	2016-01-19 17:19:52 +02:00
Paweł Dziepak	218898b297	commitlog: upgrade mutations during commitlog replay Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>	2016-01-13 10:50:26 +01:00
Paweł Dziepak	661849dbc3	commitlog: learn about schema versions during replay Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>	2016-01-13 10:50:23 +01:00
Paweł Dziepak	18d0a57bf4	commitlog: use commitlog entry writer and reader Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>	2016-01-13 10:20:06 +01:00
Tomasz Grabiec	036974e19b	Make mutation interfaces support multiple versions Schema is tracked in memtable and cache per-entry. Entries are upgraded lazily on access. Incoming mutations are upgraded to table's current schema on given shard. Mutating nodes need to keep schema_ptr alive in case schema version is requested by target node.	2016-01-11 10:34:51 +01:00
Tomasz Grabiec	c0ac7b3a73	commitlog: Wrap subscription in a unique_ptr<> to make it nothrow movable future<> will require nothrow move constructible types.	2015-12-07 09:50:28 +01:00
Tomasz Grabiec	657841922a	Mark move constructors noexcept when possible	2015-12-07 09:50:27 +01:00
Calle Wilund	76b43fbf74	commitlog_replayer: Handle replay data errors as non-fatal Discern fatal and non-fatal exceptions, and handle data corruption by adding to stats, reporting it, but continue processing. Note that "invalid_argument", i.e. attempting to replay origin/old segments, is still considered fatal, as it is probably better to signal this strongly to user/admin	2015-11-23 15:42:45 +01:00
Calle Wilund	43712a583d	commitlog_replayer: Special case exception from "old/origin file" And write some nice informative stuff.	2015-11-10 17:14:22 +01:00
Calle Wilund	a66c22f1ec	commitlog_replayer: Acquire truncation RP:s per replayed shard I.e. get them in bulk and fill in for all shards	2015-10-07 09:00:22 +02:00
Calle Wilund	17bd18b59c	commitlog_replayer: Add logging message for exceptions in multi-file recover	2015-10-07 08:59:54 +02:00
Calle Wilund	3f1fa77979	commitlog_replayer: Fix broken comparison A commitlog entry should be ignored if its position is <= highest recorded position, not <.	2015-10-07 08:59:53 +02:00
Calle Wilund	b3c95ce42d	system_keyspace: Change truncation record method to use context qp Align with rest of file (for better or worse). This allows calls from entity without query_processor handy (i.e. storage_proxy). Added "minimal" setup method for the "global" state, to facilitate tests. Doing a full setup either in cql_test_env or after it is created breaks badly. (Not sure why). So quick workaround. Updated the current two users (batchlog_manager and commitlog_replayer) callsites to conform.	2015-09-30 09:09:41 +02:00
Avi Kivity	d5cf0fb2b1	Add license notices	2015-09-20 10:43:39 +03:00
Calle Wilund	04562b23b4	commitlog_replayer: More correct fix for reordering issue in replay * Removes previous, accidental fix that got committed. * Instead just do not give RP:s to replay mutations. This is same as in Origin, and just as/more correct, since we intend to flush the data to sstables asap anyway	2015-09-16 15:41:17 +03:00
Raphael S. Carvalho	c729ea36e1	commitlog: guard commit log replay against reordering After killing scylla in the middle of a write, the next scylla instance failed to finish commit log replay, showing the following error message: scylla: core/future.hh:448: void promise<T>::set_value(A&& ...) [with A = {}; T = {}]: Assertion `_state' failed. After a long debug session, I figured out that check_valid_rp() was triggering the exception replay_position_reordered_exception, which means replay position reordering. Looking at `8b9a63a3c6`, I noticed that database::apply is guarded against reordering, but commitlog replay code is not. Signed-off-by: Raphael S. Carvalho <raphaelsc@cloudius-systems.com>	2015-09-12 06:17:14 -03:00
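
Commit 01ea159fde in the log above describes replacing the "check, then emplace" pattern with `try_emplace`. The following is a minimal sketch of that pattern change only; it is not taken from the ScyllaDB source, and the helper names `remember_first_old` and `remember_first`, the `std::map<std::string, int>` type, and the idea of storing a position per key are assumptions made for illustration.

```cpp
#include <map>
#include <string>

// Hypothetical helper (not ScyllaDB code): the older pattern the commit
// message describes. It looks the key up once to test for presence and
// a second time inside emplace().
bool remember_first_old(std::map<std::string, int>& m,
                        const std::string& key, int pos) {
    if (m.find(key) == m.end()) {
        m.emplace(key, pos);
        return true;   // key was new
    }
    return false;      // key already present, map unchanged
}

// Hypothetical helper (not ScyllaDB code): the same behaviour with
// std::map::try_emplace (C++17). It performs a single lookup and leaves
// the existing value untouched if the key is already present.
bool remember_first(std::map<std::string, int>& m,
                    const std::string& key, int pos) {
    // try_emplace returns {iterator, inserted}; only the flag is needed here.
    return m.try_emplace(key, pos).second;
}
```

Both helpers keep the first value stored for a key and report whether an insertion happened; the second one expresses that in a single call.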


53 Commits