We start background reclaim after we bootstrap, so bootstrap doesn't
benefit from it, and sees long stalls.
Fix by moving background reclaim initialization early, before
storage_service::join_cluster().
(storage_service::join_cluster() is quite odd in that main waits
for it synchronously, compared to everything else, which is just
a background service that is only initialized in main.)
Fixes #8473.
Closes #8474
(cherry picked from commit 935378fa53)
The current code std::move()-s the range tombstone into the consumer,
thus moving the tombstone's linkage to the containing list as well. As
a result, the original range tombstone itself leaks: it leaves the
tree and cannot be reached on .clear(). Another danger is that the
iterator pointing to the tombstone becomes invalid, yet is then ++-ed
to advance to the next entry.
The immediate fix is to keep the tombstone linked to the list while
moving.
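A minimal sketch of the hazard, with a hand-rolled hook standing in
for the actual tree code:

    struct hook {
        hook* prev = nullptr;
        hook* next = nullptr;
    };
    struct entry : hook {
        int payload;  // stands in for the range tombstone data
    };
    // Wrong: consume(std::move(*it)); ++it;
    //   if the move constructor transfers the hook's linkage, *it
    //   silently leaves the list (so .clear() never reaches it) and
    //   ++it advances from a node that is no longer linked.
    // Right: consume(std::move(it->payload)); ++it;
    //   the node stays linked; only the payload moves.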
fixes: #9207
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Message-Id: <20210825100834.3216-1-xemul@scylladb.com>
(cherry picked from commit b012040a76)
The extra status print is not needed in the log.
Fixes the following error:
ERROR 2021-08-10 10:54:21,088 [shard 0] storage_service -
service/storage_service.cc:3150 @do_receive: failed to log message:
fmt='send_meta_data: got error code={}, from node={}, status={}':
fmt::v7::format_error (argument not found)
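A minimal reproducer of this class of error, assuming fmt 7 as in the
log above (newer fmt versions can reject the mismatch at compile time):

    #include <fmt/format.h>
    #include <iostream>

    int main() {
        try {
            // Three placeholders but only two arguments: fmt throws.
            fmt::format("got error code={}, from node={}, status={}",
                        42, "127.0.0.1");
        } catch (const fmt::format_error& e) {
            std::cout << e.what() << "\n";  // "argument not found"
        }
        // The fix: make the format string match the argument list.
        std::cout << fmt::format("got error code={}, from node={}",
                                 42, "127.0.0.1") << "\n";
    }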
Fixes #9183
Closes #9189
(cherry picked from commit ce8fd051c9)
To build the Ubuntu AMI with a CPU scaling configuration, we need a
forced running mode for scylla_cpuscaling_setup, which runs the setup
without checking for scaling_governor support.
See scylladb/scylla-machine-image#204
Closes #9326
(cherry picked from commit f928dced0c)
On Ubuntu, scaling_governor reverts to powersave after a reboot, even
when cpufrequtils is configured.
This is because of ondemand.service, which unconditionally changes
scaling_governor to ondemand or powersave. cpufrequtils starts before
ondemand.service, so scaling_governor is overwritten by ondemand.service.
To configure scaling_governor correctly, we have to disable this service.
Fixes #9324
Closes #9325
(cherry picked from commit cd7fe9a998)
Pending tasks will grow without bound if they are submitted faster
than they are retired. That can potentially happen if memtables are
frequently flushed too early. This unbounded growth was observed to
cause task queue violations, as the queue gets filled with tons of
tasks being reevaluated. By avoiding duplicates in the pending task
list for a given table T, growth is no longer unbounded and,
consequently, reevaluation is no longer aggressive.
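A minimal sketch of the deduplication idea, with assumed names (the
real pending-task list lives in the compaction manager):

    #include <deque>
    #include <unordered_set>

    struct table;  // opaque

    class pending_reevaluations {
        std::deque<table*> _queue;
        std::unordered_set<table*> _queued;  // at most one entry per table
    public:
        void submit(table* t) {
            if (_queued.insert(t).second) {  // already pending? skip
                _queue.push_back(t);
            }
        }
        table* pop() {
            if (_queue.empty()) { return nullptr; }
            auto* t = _queue.front();
            _queue.pop_front();
            _queued.erase(t);
            return t;
        }
    };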
Refs #9331.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <20210930125718.41243-1-raphaelsc@scylladb.com>
(cherry picked from commit 52302c3238)
The Red Hat packages were missing two things: first, the metapackage
didn't depend on the python3 package at all, and second, the
scylla-server package dependencies didn't specify a version, which can
cause problems during upgrade.
Doing both of the things listed here is a bit of an overkill, as
either one of them separately would solve the problem described in
#XXXX, but both should be applied in order to express the correct
concept.
Fixes #8829
Closes #8832
(cherry picked from commit 9bfb2754eb)
"
This mini-series fixes two loosely related bugs in the evictable
reader, both around reader recreation. A unit test is also added which
reproduces both of them and
checks that the fixes indeed work. More details in the patches
themselves.
This series replaces the two independent patches sent before:
* [PATCH v1] evictable_reader: always reset static row drop flag
* [PATCH v1] evictable_reader: relax partition key check on reader
recreation
As they depend on each other, it is easier to add a test if they are in
a series.
Fixes: #8923
Fixes: #8893
Tests: unit(dev, mutation_reader_test:debug)
"
* 'evictable-reader-recreation-more-bugs/v1' of https://github.com/denesb/scylla:
test: mutation_reader_test: add more test for reader recreation
evictable_reader: relax partition key check on reader recreation
evictable_reader: always reset static row drop flag
(cherry picked from commit 4209dfd753)
The node in this place is not yet attached to its parent, so
in btree::debug::yes (tests only) mode, node::drop()'s parent
checks will access a null parent pointer.
However, in non-testing runtime there's a chance that a linear
node fails to clone one of its keys and gets here. In this case
it will carry both leftmost and rightmost flags and the assertion
in drop will fire.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
(cherry picked from commit 1d857d604a)
Ref #9248.
This PR changes the `can_send` function so that it looks at the `token_metadata` in order to tell if the destination node is in the ring. Previously, gossiper state was used for that purpose and required a relatively complicated condition to check. The new logic just uses `token_metadata::is_member` which reduces complexity of the `can_send` function.
Additionally, `storage_service` is slightly modified so that during a removenode operation the `token_metadata` is first updated and only then endpoint lifecycle subscribers are notified. This was done in order to prevent a race just like the one which happened in #5087 - hints manager is a lifecycle subscriber and starts a draining operation when a node is removed, and in order for draining to work correctly, `can_send` should keep returning true for that node.
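A minimal sketch of the resulting check, against assumed Scylla types
(token_metadata::is_member and the surrounding signature):

    // Membership in token_metadata is the single source of truth; a
    // node being drained stays a member until the drain completes, so
    // hints keep flowing to it.
    bool can_send(const gms::inet_address& ep) const {
        return _token_metadata.is_member(ep);
    }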
Tests:
- unit(dev)
- dtest(hintedhandoff_additional_test.py)
- dtest(topology_test.py)
Closes #8387
* github.com:scylladb/scylla:
hints: clarify docstring comment for can_send
hints: use token_metadata to tell if node is in the ring
hints: slightly reorganize "if" statement in can_send
storage_service: release token_metadata lock before notify_left
storage_service: notify_left after token_metadata is replicated
(cherry picked from commit 307bd354d2)
Ref #5087.
When a failed-to-be-cloned node cleans itself up, it must also clear
all its child nodes. Plain destroy() doesn't do that; it only
frees the provided node.
fixes: #9248
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
(cherry picked from commit d1a1a2dac2)
lw_shared_ptr must not be copied on a foreign shard.
Copying the vector on shard 0 increases the reference count of
lw_shared_ptr<sstable> elements that were created on other shards,
as seen in https://github.com/scylladb/scylla/issues/9278.
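A minimal sketch of the safe pattern, assuming Seastar's foreign_ptr
(generic names; not the actual patch):

    #include <seastar/core/shared_ptr.hh>
    #include <seastar/core/sharded.hh>
    #include <seastar/core/smp.hh>

    // lw_shared_ptr's refcount is not atomic, so copying one on a
    // foreign shard corrupts it. foreign_ptr moves across shards and
    // bounces destruction back to the owning shard.
    template <typename T>
    seastar::future<> use_on_shard0(seastar::lw_shared_ptr<T> ptr) {
        return seastar::smp::submit_to(0,
                [fp = seastar::make_foreign(std::move(ptr))] () mutable {
            // dereference *fp here; never copy the lw_shared_ptr itself
        });
    }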
Fixes #9278
DTest: migration_test.py:TestLoadAndStream_with_3_0_md.load_and_stream_increase_cluster_test(debug)
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Message-Id: <20210902084313.2003328-1-bhalevy@scylladb.com>
(cherry picked from commit 33f579f783)
Whenever a node_head_ptr is assigned to nil_root, the _backref inside it is
overwritten. But since nil_root is shared between shards, this causes severe
cache line bouncing. (It was observed to reduce the total write throughput
of Scylla by 90% on a large NUMA machine).
This backreference is never read anyway, so fix this bug by not writing it.
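A minimal sketch of the fix's shape, with assumed field names:

    struct node;
    extern node* const nil_root;  // one sentinel shared by all shards

    struct node_head_ptr {
        node* _ptr = nullptr;
        void assign(node* n) {
            _ptr = n;
            if (n != nil_root) {
                // write n's _backref only for real nodes; dirtying the
                // shared sentinel's cache line bounces it across shards
            }
        }
    };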
Fixes #9252
Closes #9246
(cherry picked from commit 126baa7850)
Introduce `raft` experimental option.
Adjust the tests accordingly to accommodate the new option.
It's not enabled by default when providing
`--experimental=true` config option and should be
requested explicitly via `--experimental-options=raft`
config option.
Hide the code related to `raft_group_registry` behind
the switch. The service object is still constructed
but no initialization is performed (`init()` is not
called) if the flag is not set.
Later, other raft-related things, such as raft schema
changes, will also use this flag.
Also, don't introduce a corresponding gossiper feature
just yet, because again, it should be done after the
raft schema changes API contract is stabilized.
This will be done in a separate series, probably related to
implementing the feature itself.
Tests: unit(dev)
Ref #9239.
Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>
Message-Id: <20210823121956.167682-1-pa.solodovnikov@scylladb.com>
(cherry picked from commit 22794efc22)
Refs #9053
Flips the default for commitlog disk footprint hard limit enforcement
to off, due to observed latency stalls with stress runs. Instead, adds
an optional flag "commitlog_use_hard_size_limit" which can be turned on
to actually enforce it.
Sort of a tape-and-string fix until we can properly tweak the balance
between cl & sstable flush rate.
Closes #9195
(cherry picked from commit 3633c077be)
On some environments /sys/devices/system/cpu/cpufreq/policy0/scaling_governor
does not exist even though CPU scaling is supported.
/sys/devices/system/cpu/cpu0/cpufreq/scaling_governor is available
in both environments, so we should switch to it.
Fixes #9191
Closes #9193
(cherry picked from commit e5bb88b69a)
The reader is used by load and stream to read sstables from the upload
directory, and these are not guaranteed to belong to the local shard.
Use make_range_sstable_reader instead of
make_local_shard_sstable_reader.
Tests:
backup_restore_tests.py:TestBackupRestore.load_and_stream_using_snapshot_test
backup_restore_tests.py:TestBackupRestore.load_and_stream_to_new_cluster_2_test
backup_restore_tests.py:TestBackupRestore.load_and_stream_to_new_cluster_1_test
migration_test.py:TestLoadAndStream.load_and_stream_asymmetric_cluster_test
migration_test.py:TestLoadAndStream.load_and_stream_decrease_cluster_test
migration_test.py:TestLoadAndStream.load_and_stream_frozen_pk_test
migration_test.py:TestLoadAndStream.load_and_stream_increase_cluster_test
migration_test.py:TestLoadAndStream.load_and_stream_primary_replica_only_test
Fixes #9173
Closes #9185
(cherry picked from commit 040b626235)
The compaction manager can start tons of compactions of fully expired
sstables in parallel, which may consume a significant amount of resources.
This problem is caused by the weight being released too early in
compaction: after all data is compacted but before the table is called
to update its state (replacing sstables and so on).
Fully expired sstables aren't actually compacted, so the following can happen:
- compaction 1 starts for expired sst A with weight W, but there's nothing to
be compacted, so weight W is released, then calls table to update state.
- compaction 2 starts for expired sst B with weight W, but there's nothing to
be compacted, so weight W is released, then calls table to update state.
- compaction 3 starts for expired sst C with weight W, but there's nothing to
be compacted, so weight W is released, then calls table to update state.
- compaction 1 is done updating table state, so it finally completes and
releases all the resources.
- compaction 2 is done updating table state, so it finally completes and
releases all the resources.
- compaction 3 is done updating table state, so it finally completes and
releases all the resources.
This happens because, with expired sstables, compaction releases the
weight faster than it updates the table state, as there's nothing to be
compacted.
With my reproducer, it's very easy to reach 50 parallel compactions on
a single shard, but that number can easily be worse depending on the
amount of sstables with fully expired data across all tables. This high
parallelism can happen even with only a couple of tables, if there are
many time windows with expired data, as they can be compacted in
parallel.
Prior to 55a8b6e3c9, the weight was released earlier in compaction,
before the last sstable was sealed, but right now there's no need to
release the weight early. The weight can be released in a much simpler
way, after the compaction is actually done. So such compactions will be
serialized from now on.
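A minimal sketch of the new lifetime, with hypothetical helper names
(do_compact, update_table_state):

    #include <seastar/core/semaphore.hh>

    // Hold the weight units for the entire operation, including the
    // table state update, so same-weight expired-sstable compactions
    // serialize instead of piling up.
    seastar::future<> run_compaction(seastar::semaphore& weight_sem,
                                     size_t weight) {
        return seastar::with_semaphore(weight_sem, weight, [] {
            return do_compact().then([] (auto result) {       // hypothetical
                return update_table_state(std::move(result)); // hypothetical
            });
        });  // units released only once the whole chain resolves
    }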
Fixes #8710.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <20210527165443.165198-1-raphaelsc@scylladb.com>
[avi: drop now unneeded storage_service_for_tests]
(cherry picked from commit a7cdd846da)
Partition count is of type size_t, but we use std::plus<int>
to reduce the partition counts of various column families.
This patch changes the argument of std::plus to the right type.
Using std::plus<int> for size_t compiles but does not work as expected:
for example, plus<int>(2147483648LL, 1LL) = -2147483647, while the code
presumably wants 2147483649.
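A self-contained demonstration of the truncation:

    #include <cstddef>
    #include <functional>
    #include <iostream>
    #include <numeric>
    #include <vector>

    int main() {
        std::vector<size_t> counts = {2147483648ULL, 1ULL};
        // std::plus<int> narrows each size_t operand to int, so the
        // sum wraps around:
        auto wrong = std::accumulate(counts.begin(), counts.end(),
                                     size_t(0), std::plus<int>());
        // std::plus<size_t> keeps the full width:
        auto right = std::accumulate(counts.begin(), counts.end(),
                                     size_t(0), std::plus<size_t>());
        std::cout << wrong << " vs " << right << "\n";  // wrapped vs 2147483649
    }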
Fixes #9090
Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
Closes #9074
(cherry picked from commit 90a607e844)
The recent commit 0ef0a4c78d added helpful
error messages in case an index cannot be created because the intended
name of its materialized view is already taken - but accidentally broke
the "CREATE INDEX IF NOT EXISTS" feature.
The checking code was correct, but in the wrong place: we need to first
check whether the index already exists and "IF NOT EXISTS" was chosen,
and only do the new error checking if this is not the case.
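A minimal sketch of the corrected order, with hypothetical helpers
(index_exists, view_name_taken):

    // First honor IF NOT EXISTS, then report the view-name conflict.
    if (index_exists(keyspace, index_name)) {           // hypothetical
        if (if_not_exists) {
            return;  // CREATE INDEX IF NOT EXISTS: silently succeed
        }
        throw exceptions::invalid_request_exception("index already exists");
    }
    if (view_name_taken(keyspace, index_name)) {        // hypothetical
        throw exceptions::invalid_request_exception(
                "a view with the index's intended name already exists");
    }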
This patch also includes a cql-pytest test for reproducing this bug.
The bug is also reproduced by the translated Cassandra unit tests
cassandra_tests/validation/entities/secondary_index_test.py::
testCreateAndDropIndex
and this is how I found this bug. After this patch, all these tests
pass.
Fixes #8717.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20210526143635.624398-1-nyh@scylladb.com>
(cherry picked from commit 97e827e3e1)
When an index is created without an explicit name, a default name
is chosen. However, there was no check if a table with conflicting
name already exists. The check is now in place and if any conflicts
are found, a new index name is chosen instead.
When an index is created *with* an explicit name and a conflicting
regular table is found, index creation should simply fail.
This series comes with a test.
Fixes #8620
Tests: unit(release)
Closes #8632
* github.com:scylladb/scylla:
cql-pytest: add regression tests for index creation
cql3: fail to create an index if there is a name conflict
database: check for conflicting table names for indexes
(cherry picked from commit cee4c075d2)
Refs #8418
Broadcast can (apparently) be an address that is not actually on the
machine, but on the other side of a NAT. Thus, binding the local side
of an outgoing connection to it will fail.
Bind instead to listen_address (or broadcast, if listen_to_broadcast);
this will require routing + NAT so that, to the node connected to, the
connection looks like it comes from broadcast, to allow the connection
(if using partial encryption).
Note: this is verified only in a somewhat limited way. I would suggest
verifying various multi rack/dc setups before relying on it.
Closes #8974
(cherry picked from commit b8b5f69111)
The repair parallelism is calculated from the amount of memory
allocated to repair and the memory usage per repair instance. Currently,
it does not consider memory bloat issues (e.g., issue #8640), which
cause repair to use more memory and hit std::bad_alloc.
Be more conservative when calculating the parallelism, to avoid repair
using too much memory.
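A minimal sketch of the arithmetic, with an assumed safety factor:

    #include <algorithm>
    #include <cstddef>

    size_t repair_parallelism(size_t repair_memory_budget,
                              size_t mem_per_repair) {
        constexpr size_t bloat_factor = 2;  // assumed margin for bloat
        size_t padded = mem_per_repair * bloat_factor;
        return std::max<size_t>(1, repair_memory_budget / padded);
    }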
Fixes #8641
Closes #8652
(cherry picked from commit b8749f51cb)
To avoid restarting scylla-server.service unexpectedly, drop BindsTo=
from scylla-fstrim.timer.
Fixes #8921
Closes #8973
(cherry picked from commit def81807aa)
Fixes #8952
In 5ebf5835b0 we added a segment
prune after flushing, to deal with deadlocks in shutdown.
This means that calls that issue sync/flush-like ops "for-all" need
to operate on a defensive copy of the list.
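A minimal sketch of the pattern, with a generic segment type:

    #include <memory>
    #include <vector>

    struct segment {
        void sync() { /* may prune entries from the owner's list */ }
    };

    void sync_all(std::vector<std::shared_ptr<segment>>& live) {
        auto copy = live;  // defensive copy: sync() may mutate `live`
        for (auto& s : copy) {
            s->sync();     // safe even if `live` shrinks under us
        }
    }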
Closes #8980
(cherry picked from commit ce45ffdffb)
systemd unit name of scylla-node-exporter is
scylla-node-exporter.service, not node-exporter.service.
Fixes #8966
Closes #8967
(cherry picked from commit f19ebe5709)
Listing /etc/systemd/system/*.mount as ghost files seems incorrect,
since the user may want to keep using the RAID volume / coredump
directory after uninstalling Scylla, or may want to upgrade to the
enterprise version.
Also, we mixed two types of files as ghost files that should be handled
differently:
1. files automatically generated by the postinst scriptlet
2. files generated by user-invoked scylla_setup
The package should remove only 1, since 2 is generated by user decision.
However, just dropping .mount from the %files section causes another
problem: rpm will remove these files during upgrade instead of on
uninstall (#8924).
To fix both problems, specify the .mount files as "%ghost %config".
This keeps the files on both package upgrade and package removal.
See scylladb/scylla-enterprise#1780
Closes #8810
Closes #8924
Closes #8959
(cherry picked from commit f71f9786c7)
Commit 5adb8e555c marked ::feed_hash() and a visitor lambda of
digester::feed_hash() as noexcept. This was quite reckless, as the
appending_hash<>::operator()s called by ::feed_hash() are not all
marked noexcept. In particular, appending_hash<row>() is not,
and seems able to throw.
The original intent of the mentioned commit was to facilitate the
partition_hasher in the repair/ code. The hasher itself had been removed
by 0af7a22c21, so it no longer needs the feed_hash-s to be
noexcept.
The fix is to inherit noexcept from the called hashers, but for the
digester::feed_hash part the noexcept is just removed until clang
compilation bug #50994 is fixed.
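A minimal sketch of the inherited-noexcept shape, assuming the
appending_hash<T>::operator()(Hasher&, const T&) call form:

    template <typename Hasher, typename T>
    void feed_hash(Hasher& h, const T& value)
        noexcept(noexcept(appending_hash<T>{}(h, value))) {
        appending_hash<T>{}(h, value);  // may throw iff the hasher can
    }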
fixes: #8983
tests: unit(dev)
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Message-Id: <20210706153608.4299-1-xemul@scylladb.com>
(cherry picked from commit 63a2fed585)
With strict mode, it could happen that an sstable alone in level 0 is
selected for offstrategy compaction, which means that we could run
into an infinite reshape process.
This is fixed by respecting the offstrategy threshold. A unit test is
added.
Fixes #8573.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <20210506181324.49636-1-raphaelsc@scylladb.com>
(cherry picked from commit 8480839932)
A wrong comparison operator is used when checking for overlap. It
would miss an overlap when the last key of an sstable is equal to the
first key of the sstable that comes next in the set, which is sorted by
first key.
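A minimal sketch of the off-by-one, with a generic key type:

    // Sstables are sorted by first key; a and its successor b overlap
    // when a.last >= b.first.
    template <typename Key>
    bool overlaps(const Key& a_last, const Key& b_first) {
        return a_last >= b_first;  // was: a_last > b_first, missing equality
    }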
Fixes #8531.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
(cherry picked from commit 39ecddbd34)
Fixes #8270
If we have an allocation pattern where we leave large parts of segments "wasted" (typically because the segment has empty space, but cannot hold the mutation being added), we can have a disk usage that is below the threshold, yet still get a disk footprint that is over the limit, causing new segment allocation to stall.
We need to take a few things into account:
1.) Need to include wasted space in the threshold check (see the sketch after this list). Whether or not the disk is actually used does not matter here.
2.) If we stall a segment alloc, we should just flush immediately. No point in waiting for the timer task.
3.) Need to adjust the thresholds a bit. Depending on sizes, we should probably consider start flushing once we've used up space enough to be in the last available segment, so a new one is hopefully available by the time we hit the limit.
4.) (v2) Must ensure discard/delete routines are executed. Because we can race with background disk syncs, we may need to
issue segment prunes from end_flush() so we wake up actual file deletion/recycling
5.) (v2) Shutdown must ensure discard/delete is run after we've disabled background task etc, otherwise we might fail waking up replenish and get stuck in gate
6.) (v2) Recycling or deleting segments must be consistent, regardless of shutdown. For same reason as above.
7.) (v3) Signal recycle/delete queues/promise on shutdown (with recognized marker) to handle edge case where we only have a single (allocating) segment in the list, and cannot wake up replenisher in any more civilized way.
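A minimal sketch of point 1, with assumed counters:

    #include <cstdint>

    // Slack in already-allocated segments counts toward the flush
    // threshold, not just bytes actually written.
    bool over_flush_threshold(uint64_t bytes_written, uint64_t wasted,
                              uint64_t threshold) {
        return bytes_written + wasted >= threshold;
    }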
Also fix an edge case (for tests) where we have too few segments to have an active one (i.e. need to flush everything).
New attempt at this, should fix intermittent shutdown deadlocks in commitlog_test.
Closes #8764
* github.com:scylladb/scylla:
commitlog_test: Add test case for usage/disk size threshold mismatch
commitlog_test: Improve test assertion
commitlog: Add waitable future for background sync/flush
commitlog: abort queues on shutdown
commitlog: break out "abort" calls into member functions
commitlog: Do explicit discard+delete in shutdown
commitlog: Recycle or not should not depend on shutdown state
commitlog: Issue discard_unused_segments on segment::flush end IFF deletable
commitlog: Flush all segments if we only have one.
commitlog: Always force flush if segment allocation is waiting
commitlog: Include segment wasted (slack) size in footprint check
commitlog: Adjust (lower) usage threshold
(cherry picked from commit 14252c8b71)
Shutdown must never fail, otherwise it may cause hangs
as seen in https://github.com/scylladb/scylla/issues/8577.
This change wraps the file created in `allocate_segment_ex` in `make_checked_file` so that scylla will abort when failing to write to the commitlog files.
In case other errors are seen during shutdown, just log them and continue with shutting down to prevent scylla from hanging.
Fixes #8577
Test: unit(dev)
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Closes #8578
* github.com:scylladb/scylla:
commitlog: segment_manager::shutdown: abort on errors
commitlog: allocate_segment_ex: make_checked_file
(cherry picked from commit 48ff641f67)
This was triggered by the test_total_space_limit_of_commitlog dtest.
When it passes a very large commitlog_segment_size_in_mb (1/6th of the
free memory size, in mb), segment_manager constructor limits max_size
to std::numeric_limits<position_type>::max() which is 0xffffffff.
This causes allocate_segment_ex to loop forever when writing the segment
file since `dma_write` returns 0 when the count is unaligned (seen 4095).
The fix here is to select a slightly smaller max_size that is aligned
down to a multiple of 1MB.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Message-Id: <20210407121059.277912-1-bhalevy@scylladb.com>
(cherry picked from commit 705f9c4f79)
Backport of 6726fe79b6.
The code was susceptible to use-after-move if both local
and remote updates were going to be sent.
The whole routine for sending view updates is now rewritten
to avoid use-after-move.
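A minimal sketch of the hazard that was fixed (hypothetical names):

    // If both branches run, the second use sees a moved-from mutation.
    if (have_local) {
        apply_locally(std::move(mut));   // mut is consumed here...
    }
    if (have_remote) {
        send_to_remote(std::move(mut));  // ...and (wrongly) moved again
    }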
Fixes #8830
Tests: unit(release),
dtest(secondary_indexes_test.py:TestSecondaryIndexes.test_remove_node_during_index_build)
Closes #8834
* backport-6726fe7:
view: fix use-after-move when handling view update failures
db,view: explicitly move the mutation to its helper function
db,view: pass base token by value to mutate_MV
The code was susceptible to use-after-move if both local
and remote updates were going to be sent.
The whole routine for sending view updates is now rewritten
to avoid use-after-move.
Refs #8830
Tests: unit(release),
dtest(secondary_indexes_test.py:TestSecondaryIndexes.test_remove_node_during_index_build)
The `apply_to_remote_endpoints` helper function used to take
its `mut` parameter by reference, but then moved the value from it,
which is confusing and prone to errors. Since the value is moved-from,
let's pass it to the helper function as rvalue ref explicitly.
The base token is passed across continuations, so the current way
of passing it by const reference probably only works because copying
the token is cheap enough for the reference to be optimized out.
Fix by explicitly taking the token by value.