scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-04-27 20:05:10 +00:00

Author	SHA1	Message	Date
Tomasz Grabiec	e99a0c7b89	schema: Fix race in schema version recalculation leading to stale schema version in gossip Migration manager installs several feature change listeners: if (this_shard_id() == 0) { _feature_listeners.push_back(_feat.cluster_supports_view_virtual_columns().when_enabled(update_schema)); _feature_listeners.push_back(_feat.cluster_supports_digest_insensitive_to_expiry().when_enabled(update_schema)); _feature_listeners.push_back(_feat.cluster_supports_cdc().when_enabled(update_schema)); _feature_listeners.push_back(_feat.cluster_supports_per_table_partitioners().when_enabled(update_schema)); } They will call update_schema_version_and_announce() when features are enabled, which does this: return update_schema_version(proxy, features).then([] (utils::UUID uuid) { return announce_schema_version(uuid); }); So it first updates the schema version and then publishes it via gossip in announce_schema_version(). It is possible that the announce_schema_version() part of the first schema change will be deferred and will execute after the other four calls to update_schema_version_and_announce(). It will install the old schema version in gossip instead of the more recent one. The fix is to serialize schema digest calculation and publishing. Fixes #7200 (cherry picked from commit `1a57d641d1`)	2020-10-01 18:18:53 +02:00
Yaron Kaikov	f8c7605657	release: prepare for 4.0.10 scylla-4.0.10	2020-09-28 20:33:24 +03:00
Avi Kivity	7b9e33dcd4	Update seastar submodule * seastar e87ce4941c...065a40b34a (1): > lz4_fragmented_compressor: Fix buffer requirements Fixes #6925.	2020-09-23 12:07:11 +03:00
Yaron Kaikov	d86a31097a	release: prepare for 4.0.9 scylla-4.0.9	2020-09-17 14:24:32 +03:00
Nadav Har'El	bd9d6f8e45	alternator: fix corruption of PutItem operation in case of contention This patch fixes a bug noted in issue #7218 - where PutItem operations sometimes lose part of the item's data - some attributes were lost, and the name of other attributes replaced by empty strings. The problem happened when the write-isolation policy was LWT and there was contention of writes to the same partition (not necessarily the same item). To use CAS (a.k.a. LWT), Alternator builds an alternator::rmw_operation object with an apply() function which takes the old contents of the item (if needed) and a timestamp, and builds a mutation that the CAS should apply. In the case of the PutItem operation, we wrongly assumed that apply() will be called only once - so as an optimization the strings saved in the put_item_operation were moved into the returned mutation. But this optimization is wrong - when there is contention, apply() may be called again when the change proposed by the previous one was not accepted by the Paxos protocol. The fix is to change the one place where put_item_operation moved strings out of the saved operations into the mutations, to be a copy. But to prevent this sort of bug from recurring in future code, this patch enlists the compiler to help us verify that it can't happen: The apply() function is marked "const" - it can use the information in the operation to build the mutation, but it can never modify this information or move things out of it, so it will be fine to call this function twice. The single output field that apply() does write (_return_attributes) is marked "mutable" to allow the const apply() to write to it anyway. Because apply() might be called twice, it is important that if some apply() implementation sometimes sets _return_attributes, then it must always set it (even if to the default, empty, value) on every call to apply().
The const apply() means that the compiler verifies for us that I didn't forget to fix additional wrong std::move()s. Additionally, a test I wrote to easily reproduce issue #7218 (which I will submit as a dtest later) passes after this fix. Fixes #7218. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20200916064906.333420-1-nyh@scylladb.com> (cherry picked from commit `5e8bdf6877`)	2020-09-16 23:05:23 +03:00
Benny Halevy	11ef23e97a	test: cql_query_test: test_cache_bypass: use table stats test is currently flaky since system reads can happen in the background and disturb the global row cache stats. Use the table's row_cache stats instead. Fixes #6773 Test: cql_query_test.test_cache_bypass(dev, debug) Credit-to: Botond Dénes <bdenes@scylladb.com> Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Message-Id: <20200811140521.421813-1-bhalevy@scylladb.com> (cherry picked from commit `6deba1d0b4`)	2020-09-16 18:20:30 +03:00
Asias He	2c0eac09ae	migration_manager: Make sync_schema return error when node is down sync_schema is supposed to make sure that this node knows about all schema changes known by "nodes" that were made prior to this call. Currently, when a node is down, the sync is silently skipped. To fix, add a flag to migration_task::run_may_throw to indicate that it should fail if a node is down. Fixes #4791 (cherry picked from commit `7ba821cbc0`)	2020-09-16 16:01:44 +03:00
Dejan Mircevski	713a7269d0	cql3: Fix NULL reference in get_column_defs_for_filtering There was a typo in get_column_defs_for_filtering(): it checked the wrong pointer before dereferencing. Add a test exposing the NULL dereference and fix the typo. Tests: unit (dev) Fixes #7198. Signed-off-by: Dejan Mircevski <dejan@scylladb.com> (cherry picked from commit `9d02f10c71`)	2020-09-16 15:47:09 +03:00
Avi Kivity	1724301d4d	reconcilable_result_builder: don't aggravate out-of-memory condition during recovery Consider an unpaged query that consumes all of available memory, despite `fea5067dfa` which limits them (perhaps the user raised the limit, or this is a system query). Eventually we will see a bad_alloc which will abort the query and destroy this reconcilable_result_builder. During destruction, we first destroy _memory_accounter, and then _result. Destroying _memory_accounter resumes some continuations which can then allocate memory synchronously when increasing the task queue to accommodate them. We will then crash. Had we not crashed, we would immediately afterwards release _result, freeing all the memory that we would ever need. Fix by making _result the last member, so it is freed first. Fixes #7240. (cherry picked from commit `9421cfded4`)	2020-09-16 15:41:10 +03:00
Avi Kivity	9971f2f5db	Merge "Fix repair stalls in get_sync_boundary and apply_rows_on_master_in_thread" from Asias " This patch set fixes stalls in repair that are caused by std::list merge and clear operations during test_latency_read_with_nemesis test. Fixes #6940 Fixes #6975 Fixes #6976 " * 'fix_repair_list_stall_merge_clear_v2' of github.com:asias/scylla: repair: Fix stall in apply_rows_on_master_in_thread and apply_rows_on_follower repair: Use clear_gently in get_sync_boundary to avoid stall utils: Add clear_gently repair: Use merge_to_gently to merge two lists utils: Add merge_to_gently (cherry picked from commit `4547949420`)	2020-09-10 13:15:01 +03:00
Avi Kivity	ee328c22ca	repair: apply_rows_on_follower(): remove copy of repair_rows list We copy a list, which was reported to generate a 15ms stall. This is easily fixed by moving it instead, which is safe since this is the last use of the variable. Fixes #7115. (cherry picked from commit `6ff12b7f79`)	2020-09-10 11:53:55 +03:00
Avi Kivity	3a9c9a8a12	Update seastar submodule * seastar 861b7edd61...e87ce4941c (1): > core/reactor: complete_timers(): restore previous scheduling group Fixes #7184.	2020-09-07 11:28:55 +03:00
Raphael S. Carvalho	c03445871a	compaction: Prevent non-regular compaction from picking compacting SSTables After `8014c7124`, cleanup can potentially pick a compacting SSTable. Upgrade and scrub can also pick a compacting SSTable. The problem is that table::candidates_for_compaction() was badly named. It misleads the user into thinking that the SSTables returned are perfect candidates for compaction, but the manager still needs to filter out the compacting SSTables from the returned set. So it's being renamed. When the same SSTable is compacted in parallel, the strategy invariant can be broken, for example by overlapping being introduced in LCS, and deletion failures can also occur as more than one compaction process would try to delete the same files. Let's fix scrub, cleanup and upgrade by calling the manager function which gets the correct candidates for compaction. Fixes #6938. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <20200811200135.25421-1-raphaelsc@scylladb.com> (cherry picked from commit `11df96718a`)	2020-09-06 18:41:12 +03:00
Takuya ASADA	565ac1b092	aws: update enhanced networking supported instance list Sync enhanced networking supported instance list to latest one. Reference: https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/enhanced-networking.html Fixes #6991 (cherry picked from commit `7cccb018b8`)	2020-09-06 18:21:46 +03:00
Yaron Kaikov	7d1180b98f	release: prepare for 4.0.8 scylla-4.0.8	2020-08-30 09:42:34 +03:00
Piotr Sarna	f258e6f6ee	Merge 'counters: Fix filtering of counters' from Juliusz Queries with `ALLOW FILTERING` and constraints on counter values used to be rejected as "unimplemented". The reason was a missing tri-comparator, which is added in this patch. Fixes #5635 * jul-stas-5635-filtering-on-counters: cql/tests: Added test for filtering on counter columns counters: add comparator and remove `unimplemented` from restrictions (cherry picked from commit `c32faee657`)	2020-08-27 18:42:30 +03:00
Avi Kivity	2708b0d664	Merge "repair: row_level: prevent deadlocks when repairing homogenous nodes" from Botond " This series backports the series "repair: row_level: prevent deadlocks when repairing homogenous nodes" (merged as `a9c7a1a86`) to branch-4.1. " Fixes #6272 * 'repair-row-level-evictable-local-reader/branch-4.1' of https://github.com/denesb/scylla: repair: row_level: destroy reader on EOS or error repair: row_level: use evictable_reader for local reads mutation_reader: expose evictable_reader mutation_reader: evictable_reader: add auto_pause flag mutation_reader: make evictable_reader a flat_mutation_reader mutation_reader: s/inactive_shard_read/inactive_evictable_reader/ mutation_reader: move inactive_shard_reader code up mutation_reader: fix indentation mutation_reader: shard_reader: extract remote_reader as evictable_reader mutation_reader: reader_lifecycle_policy: make semaphore() available early (cherry picked from commit `59aa1834a7`)	2020-08-27 17:44:27 +03:00
Asias He	e31ffbf2e6	compaction_manager: Avoid stall in perform_cleanup The following stall was seen during a cleanup operation: scylla: Reactor stalled for 16262 ms on shard 4. \| std::_MakeUniq<locator::tokens_iterator_impl>::__single_object std::make_unique<locator::tokens_iterator_impl, locator::tokens_iterator_impl&>(locator::tokens_iterator_impl&) at /usr/include/fmt/format.h:1158 \| (inlined by) locator::token_metadata::tokens_iterator::tokens_iterator(locator::token_metadata::tokens_iterator const&) at ./locator/token_metadata.cc:1602 \| locator::simple_strategy::calculate_natural_endpoints(dht::token const&, locator::token_metadata&) const at simple_strategy.cc:? \| (inlined by) locator::simple_strategy::calculate_natural_endpoints(dht::token const&, locator::token_metadata&) const at ./locator/simple_strategy.cc:56 \| locator::abstract_replication_strategy::get_ranges(gms::inet_address, locator::token_metadata&) const at /usr/include/fmt/format.h:1158 \| locator::abstract_replication_strategy::get_ranges(gms::inet_address) const at /usr/include/fmt/format.h:1158 \| service::storage_service::get_ranges_for_endpoint(seastar::basic_sstring<char, unsigned int, 15u, true> const&, gms::inet_address const&) const at /usr/include/fmt/format.h:1158 \| service::storage_service::get_local_ranges(seastar::basic_sstring<char, unsigned int, 15u, true> const&) const at /usr/include/fmt/format.h:1158 \| (inlined by) operator() at ./sstables/compaction_manager.cc:691 \| (inlined by) _M_invoke at /usr/include/c++/9/bits/std_function.h:286 \| std::function<std::vector<seastar::lw_shared_ptr<sstables::sstable>, std::allocator<seastar::lw_shared_ptr<sstables::sstable> > > (table const&)>::operator()(table const&) const at /usr/include/fmt/format.h:1158 \| (inlined by) compaction_manager::rewrite_sstables(table, sstables::compaction_options, std::function<std::vector<seastar::lw_shared_ptr<sstables::sstable>, std::allocator<seastar::lw_shared_ptr<sstables::sstable> > > (table 
const&)>) at ./sstables/compaction_manager.cc:604 \| compaction_manager::perform_cleanup(table) at /usr/include/fmt/format.h:1158 To fix, we futurize the function to get local ranges and sstables. In addition, this patch removes the dependency on the global storage_service object. Fixes #6662 (cherry picked from commit `07e253542d`)	2020-08-27 13:11:39 +03:00
Raphael S. Carvalho	801994e299	sstables: optimize procedure that checks if a sstable needs cleanup needs_cleanup() returns true if a sstable needs cleanup. Turns out it's very slow because it iterates through all the local ranges for all sstables in the set, making its complexity: O(num_sstables * local_ranges) We can optimize it by taking into account that abstract_replication_strategy documents that get_ranges() will return a list of ranges that is sorted and non-overlapping. Compaction for cleanup already takes advantage of that when checking if a given partition can be actually purged. So needs_cleanup() can be optimized into O(num_sstables * log(local_ranges)). With num_sstables=1000, RF=3, then local_ranges=256(num_tokens)*3, it means the max # of checks performed will go from 768000 to ~9584. Fixes #6730. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <20200629171355.45118-2-raphaelsc@scylladb.com> (cherry picked from commit `cf352e7c14`)	2020-08-27 13:11:37 +03:00
Raphael S. Carvalho	3b932078bf	sstables: export needs_cleanup() May be needed elsewhere, like in a unit test. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <20200629171355.45118-1-raphaelsc@scylladb.com> (cherry picked from commit `a9eebdc778`)	2020-08-27 13:11:24 +03:00
Asias He	608f62a0e9	abstract_replication_strategy: Add get_ranges_in_thread Add a version that runs inside a seastar thread. The benefit is that get_ranges can yield to avoid stalls. Refs #6662 (cherry picked from commit `94995acedb`)	2020-08-27 13:10:32 +03:00
Asias He	d8619d3320	abstract_replication_strategy: Add get_ranges which takes token_metadata It is useful when the caller wants to calculate ranges using a custom token_metadata. It will be used soon in do_rebuild_replace_with_repair for replace operation. Refs: #5482 (cherry picked from commit `b640614aa6`)	2020-08-27 13:10:26 +03:00
Asias He	4f0c99a187	gossip: Fix race between shutdown message handler and apply_state_locally 1. The node1 is shutdown 2. The node1 sends shutdown message to node2 3. The node2 receives gossip shutdown message but the handler yields 4. The node1 is restarted 5. The node1 sends new gossip endpoint_state to node2, node2 applies the state in apply_state_locally and calls gossiper::handle_major_state_change and then calls gossiper::mark_alive 6. The shutdown message handler in step 3 resumes and sets status of node1 to SHUTDOWN 7. The gossiper::mark_alive fiber in step 5 resumes and calls gossiper::real_mark_alive, node2 will skip to mark node1 as alive because the status of node1 is SHUTDOWN. As a result, node1 is alive but it is not marked as UP by node2. To fix, we serialize the two operations. Fixes #7032 (cherry picked from commit `e6ceec1685`)	2020-08-27 11:16:10 +03:00
Nadav Har'El	ada79df082	alternator test: configurable temporary directory The test/alternator/run script creates a temporary directory for the Scylla database in /tmp. The assumption was that this is the fastest disk (usually even a ramdisk) on the test machine, and we didn't need anything else from it. But it turns out that on some systems, /tmp is actually a slow disk, so this patch adds a way to configure the temporary directory - if the TMPDIR environment variable exists, it is used instead of /tmp. As before this patch, a temporary subdirectory is created in $TMPDIR, and this subdirectory is automatically deleted when the test ends. The test.py script already passes an appropriate TMPDIR (testlog/$mode), which after this patch the Alternator test will use instead of /tmp. Fixes #6750 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20200713193023.788634-1-nyh@scylladb.com> (cherry picked from commit `8e3be5e7d6`)	2020-08-26 19:48:45 +03:00
Nadav Har'El	1935f2b480	alternator: fix order conditions on binary attributes We implemented the order operators (LT, GT, LE, GE, BETWEEN) incorrectly for binary attributes: DynamoDB requires that the bytes be treated as unsigned for the purpose of order (so byte 128 is higher than 127), but our implementation uses Scylla's "bytes" type which has signed bytes. The solution is simple - we can continue to use the "bytes" type, but we need to use its compare_unsigned() function, not its "<" operator. This bug affected conditional operations ("Expected" and "ConditionExpression") and also filters ("QueryFilter", "ScanFilter", "FilterExpression"). The bug did not affect Query's key conditions ("KeyConditions", "KeyConditionExpression") because those already used Scylla's key comparison functions - which correctly compare binary blobs as unsigned bytes (in fact, this is why we have the compare_unsigned() function). The patch also adds tests that reproduce the bugs in conditional operations, and show that the bug did not exist in key conditions. Fixes #6573 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20200603084257.394136-1-nyh@scylladb.com> (cherry picked from commit `f6b1f45d69`) Manually removed tests in test_key_conditions.py which didn't exist in this branch.	2020-08-26 19:28:47 +03:00
Avi Kivity	44a76ed231	Merge "Unregister RPC verbs on stop" from Pavel E " There are 5 services, that register their RPC handlers in messaging service, but quite a few of them unregister them on stop. Unregistering is somewhat critical, not just because it makes the code look clean, but also because unregistration does wait for the message processing to complete, thus avoiding use-after-free's in the handlers. In particular, several handlers call service::get_schema_for_write() which, in turn, may end up in service::maybe_sync() calling for the local migration manager instance. All those handlers' processing must be waited for before stopping the migration manager. The set brings the RPC handlers unregistration in sync with the registration part. tests: unit (dev) dtest (dev: simple_boot_shutdown, repair) start-stop by hands (dev) fixes: #6904 " * 'br-rpc-unregister-verbs' of https://github.com/xemul/scylla: main: Add missing calls to unregister RPC handlers messaging: Add missing per-service unregistering methods messaging: Add missing handlers unregistration helpers streaming: Do not use db->invoke_on_all in vain storage_proxy: Detach rpc unregistration from stop main: Shorten call to storage_proxy::init_messaging_service (cherry picked from commit `01b838e291`)	2020-08-26 14:42:40 +03:00
Raphael S. Carvalho	aeb49f4915	cql3/statements: verify that counter column cannot be added into non-counter table A check, to validate that counter column cannot be added into non-counter table, is missing for alter table statement. Validation is performed when building new schema, but it's limited to checking that a schema will not contain both counter and non-counter columns. Due to lack of validation, the added counter column could be incorrectly persisted to the schema, but this results in a crash when setting the new schema to its table. On restart, it can be confirmed that the schema change was indeed persisted when describing the table. This problem is fixed by doing proper validation for the alter table statement, which consists of making sure a new counter column cannot be added to a non-counter table. The test cdc_disallow_cdc_for_counters_test is adjusted because one of its tests was built on the assumption that counter column can be added into a non-counter table. Fixes #7065. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <20200824155709.34743-1-raphaelsc@scylladb.com> (cherry picked from commit `1c29f0a43d`)	2020-08-25 18:46:01 +03:00
Gleb Natapov	8d6b35ad20	lwt: fix possible leak of "prune" counter If get_schema_for_read() fails, the "prune" counter will not be decremented. The patch fixes it by creating the RAII object earlier. Also restore the release of a mutation in release_mutation(), which was dropped by mistake. Fixes #6124 Message-Id: <20200405080233.GA22509@scylladb.com> (cherry picked from commit `e5f7ccc4c8`)	2020-08-23 19:29:06 +03:00
Takuya ASADA	b123700ebe	dist/debian: disable debuginfo compression on .deb Since older binutils on some distributions is not able to handle compressed debuginfo generated on Fedora, we need to disable it. However, the Debian packager forces debuginfo compression since debian/compat = 9, so we have to uncompress the files after they are compressed automatically. Fixes #6982 (cherry picked from commit `75c2362c95`)	2020-08-23 19:03:13 +03:00
Botond Dénes	6786b521f9	scylla-gdb.py: find_db(): don't return current shard's database for shard=0 The `shard` parameter of `find_db()` is optional and is defaulted to `None`. When missing, the current shard's database instance is returned. The problem is that the if condition checking this uses `not shard`, which also evaluates to `True` if `shard == 0`, resulting in returning the current shard's database instance for shard 0. Change the condition to `shard is None` to avoid this. Fixes: #7016 Signed-off-by: Botond Dénes <bdenes@scylladb.com> Message-Id: <20200812091546.1704016-1-bdenes@scylladb.com> (cherry picked from commit `4cfab59eb1`)	2020-08-23 18:56:39 +03:00
Botond Dénes	fda0d1ae8e	table: get_sstables_by_partition_key(): don't make a copy of selected sstables Currently we assign the reference to the vector of selected sstables to `auto sst`. This makes a copy and we pass this local variable to `do_for_each()`, which will result in a use-after-free if the latter defers. Fix by not making a copy and instead just keep the reference. Fixes: #7060 Tests: unit(dev) Signed-off-by: Botond Dénes <bdenes@scylladb.com> Message-Id: <20200818091241.2341332-1-bdenes@scylladb.com> (cherry picked from commit `78f94ba36a`)	2020-08-19 00:02:22 +03:00
Yaron Kaikov	e7cffb978a	release: prepare for 4.0.7 scylla-4.0.7	2020-08-17 00:38:43 +03:00
Benny Halevy	79a1c74921	db::commitlog: close file if wrapping failed When an I/O error (e.g. EMFILE / ENOSPC) happens we hit an assert in ~append_challenged_posix_file_impl(): Assertion `_closing_state == state::closed' failed. Commit `6160b9017d` added close on failure of the lambda defined in allocate_segment_ex, but it doesn't handle an error after the file is opened/created while it is wrapped with commitlog_file_extensions. Refs #5657 Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Reviewed-by: Calle Wilund <calle@scylladb.com> Message-Id: <20200414115231.298632-1-bhalevy@scylladb.com> (cherry picked from commit `35892e4557`)	2020-08-16 19:58:23 +03:00
Calle Wilund	3ee854f9fc	cdc::log: Missing "preimage" check in row deletion pre-image Fixes #6561 Pre-image generation in row deletion case only checked if we had a pre-image result set row. But that can be from post-image. Also check actual existence of the pre-image CK. Message-Id: <20200608132804.23541-1-calle@scylladb.com> (cherry picked from commit `5105e9f5e1`)	2020-08-12 13:55:10 +03:00
Avi Kivity	2b65984d14	Merge "Fix GCC-10 related bugs and fix deletion of temporary garbage-collected sstables" from Raphael " Temporary garbage-collected SSTables, involved in the incremental compaction process which can be enabled for LCS, were incorrectly invalidating the cache when added to the set of SSTables. Also, those same temporary SSTables could be incorrectly removed, causing deletion warnings. The patchset "Don't invalidate row cache when adding GC SSTable" fixes those two issues by using the SSTable replacement mechanism, which is the correct method for replacing SSTables in the set. " * 'backport_fix_issue_6275_for_branch_4_0' of github.com:raphaelsc/scylla: row_cache_alloc_stress_test: Make sure GCC can't delete a new tests: Wait for a few futures sstables/compaction: Don't invalidate row cache when adding GC SSTable to SSTable set sstables/compaction: Change meaning of compaction_completion_desc input and output fields sstables/compaction: Clean up code around garbage_collected_sstable_writer compaction: enhance compaction_descriptor with creator and replace function	2020-08-11 18:16:41 +03:00
Nadav Har'El	52d1099d09	Update Seastar submodule > http: add "Expect: 100-continue" handling Fixes #6844	2020-08-11 13:33:45 +03:00
Asias He	3a03906377	repair: Switch to btree_set for repair_hash. In one of the longevity tests, we observed 1.3s reactor stall which came from repair_meta::get_full_row_hashes_source_op. It traced back to a call to std::unordered_set::insert() which triggered big memory allocation and reclaim. I measured std::unordered_set, absl::flat_hash_set, absl::node_hash_set and absl::btree_set. The absl::btree_set was the only one that seastar oversized allocation checker did not warn in my tests where around 300K repair hashes were inserted into the container. - unordered_set: hash_sets=295634, time=333029199 ns - flat_hash_set: hash_sets=295634, time=312484711 ns - node_hash_set: hash_sets=295634, time=346195835 ns - btree_set: hash_sets=295634, time=341379801 ns The btree_set is a bit slower than unordered_set but it does not have huge memory allocation. I did not measure a real difference in the total time to finish repair of the same dataset with unordered_set and btree_set. To fix, switch to absl btree_set container. Fixes #6190 (cherry picked from commit `67f6da6466`) (cherry picked from commit `a27188886a`)	2020-08-11 12:35:34 +03:00
Rafael Ávila de Espíndola	2395a240b4	build: Link with abseil It is a pity we have to list so many libraries, but abseil doesn't provide a .pc file. Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com> (cherry picked from commit `7d1f6725dd`) Ref #6190.	2020-08-11 12:35:32 +03:00
Rafael Ávila de Espíndola	d182c595a1	Add abseil as a submodule This adds the https://abseil.io library as a submodule. The patch series that follows needs a hash table that supports heterogeneous lookup, and abseil has a really good hash table that supports that (https://abseil.io/blog/20180927-swisstables). The library is still not available in Fedora, but it is fairly easy to use it directly from a submodule. Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com> (cherry picked from commit `383a9c6da9`) Ref #6190	2020-08-11 12:35:31 +03:00
Rafael Ávila de Espíndola	fe9c4611b3	configure: Don't overwrite seastar_cflags The variable seastar_cflags was being used for flags passed to seastar and for flags extracted from the seastar.pc file. This introduces a new variable for the flags extracted from the seastar.pc file. Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com> (cherry picked from commit `2ad09aefb6`) Ref #6190.	2020-08-11 12:35:28 +03:00
Calle Wilund	29df416720	database: Do not assert on replay positions if truncate does not flush Fixes #6995 In `c2c6c71` the assert on replay positions in flushed sstables discarded by truncate was broken, by the fact that we no longer flush all sstables unless auto snapshot is enabled. This means the low_mark assertion does not hold, because we maybe/probably never got around to creating the sstables that would hold said mark. Note that the (old) change to not create sstables and then just delete them is in itself good. But in that case we should not try to verify the rp mark. (cherry picked from commit `9620755c7f`)	2020-08-10 23:28:00 +03:00
Nadav Har'El	1d3c00572c	Update Seastar submodule with some backported fixes Fixes #7008 > futures_test: Don't use * on an optional without a value > net: Use offsetof instead of accessing a null pointer > allocator_test: Avoid undefined conversion > http: Don't use moved value > circular_buffer_fixed_capacity_test: Fix indentation > circular_buffer_fixed_capacity: Always mask indexes > rpc: Fix a use-after-return	2020-08-10 20:39:35 +03:00
Avi Kivity	9d6e2c5a71	Update seastar submodule * seastar 4ee384e15f...2dbd81d5db (1): > memory: fix small aligned free memory corruption Fixes #6831	2020-08-09 18:39:01 +03:00
Pavel Emelyanov	386741e3b7	storage_proxy_stats: Make get_ep_stat() noexcept The .get_ep_stat(ep) call can throw when registering metrics (we have an issue for it, #5697). This is not expected by its callers, in particular abstract_write_response_handler::timeout_cb breaks in the middle and doesn't call the on_timeout() and the _proxy->remove_response_handler(), which results in a response handler that is neither removed nor released. In turn, a response handler that is not released doesn't set the _ready future on which response_wait() waits -> stuck. Although the issue with .get_ep_stat() should be fixed, an exception in it mustn't lead to deadlocks, so the fix is to make the get_ep_stat() noexcept by catching the exception and returning a dummy stat object instead to let caller(s) finish. Fixes #5985 Tests: unit(dev) Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Message-Id: <20200430163639.5242-1-xemul@scylladb.com> (cherry picked from commit `513ce1e6a5`)	2020-08-09 18:18:50 +03:00
Avi Kivity	d0fdc3960a	Merge 'hinted handoff: fix commitlog memory leak' from Piotr D " When commitlog is recreated in hints manager, only shutdown() method is called, but not release(). Because of that, some internal commitlog objects (`segment_manager` and `segment`s) may be left pointing to each other through shared_ptr reference cycles, which may result in memory leak when the parent commitlog object is destroyed. This PR prevents memory leaks that may happen this way by calling release() after shutdown() from the hints manager. Fixes: #6409, Fixes #6776 " * piodul-fix-commitlog-memory-leak-in-hinted-handoff: hinted handoff: disable warnings about segments left on disk hinted handoff: release memory on commitlog termination (cherry picked from commit `4c221855a1`)	2020-08-09 17:26:17 +03:00
Tomasz Grabiec	4035cf4f9f	thrift: Fix crash on unsorted column names in SlicePredicate The column names in SlicePredicate can be passed in arbitrary order. We converted them to clustering ranges in read_command preserving the original order. As a result, the clustering ranges in read command may appear out of order. This violates storage engine's assumptions and leads to undefined behavior. It was seen manifesting as a SIGSEGV or an abort in sstable reader when executing a get_slice() thrift verb: scylla: sstables/consumer.hh:476: seastar::future<> data_consumer::continuous_data_consumer<StateProcessor>::fast_forward_to(size_t, size_t) [with StateProcessor = sstables::data_consume_rows_context_m; size_t = long unsigned int]: Assertion `end >= _stream_position.position' failed. Fixes #6486. Tests: - added a new dtest to thrift_tests.py which reproduces the problem Message-Id: <1596725657-15802-1-git-send-email-tgrabiec@scylladb.com> (cherry picked from commit `bfd129cffe`)	2020-08-08 19:48:46 +03:00
Rafael Ávila de Espíndola	09367742b1	row_cache_alloc_stress_test: Make sure GCC can't delete a new We want to test that a std::bad_alloc is thrown, but GCC 10 has a new optimization (-fallocation-dce) that removes dead allocations. This patch assigns the value returned by new to a global so that GCC cannot delete it. With this all tests in a dev build pass with GCC 10. Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com> Message-Id: <20200424201531.225807-1-espindola@scylladb.com> (cherry picked from commit `0d89bbd57f`)	2020-08-07 16:49:33 -03:00
Rafael Ávila de Espíndola	a18ff57b29	tests: Wait for a few futures GCC 10 now warns on these. This fixes the dev build with gcc 10. backport note: remove unneeded change which is not compatible with the branch in error_injection_test. Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com> Message-Id: <20200424161006.17857-1-espindola@scylladb.com> (cherry picked from commit `543a9ebd9b`)	2020-08-07 16:32:12 -03:00
Raphael S. Carvalho	4734ba21a7	sstables/compaction: Don't invalidate row cache when adding GC SSTable to SSTable set Garbage collected SSTable is incorrectly added to SSTable set with a function that invalidates row cache. This problem is fixed by adding GC SStable to set using mechanism which replaces old sstables with new sstables. Also, adding GC SSTable to set in a separate call is not correct. We should make sure that GC SSTable reaches the SSTable set at the same time its respective old (input) SSTable is removed from the set, and that's done using a single request call to table. Fixes #5956. Fixes #6275. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> (cherry picked from commit `a214ccdf89`)	2020-08-06 19:08:46 -03:00
Raphael S. Carvalho	425af4c543	sstables/compaction: Change meaning of compaction_completion_desc input and output fields input_sstables is renamed to old_sstables and is about old SSTables that should be deleted and removed from the SSTable set. output_sstables is renamed to new_sstables and is about new SSTable that should be added to the SSTable set, replacing the old ones. This will allow us, for example, to add auxiliary SSTables to SSTable set using the same call which replaces output SSTables by input SSTables in compaction. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> (cherry picked from commit `8f4458f1d5`)	2020-08-06 18:51:21 -03:00
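
The complexity argument in commit 801994e299 (needs_cleanup()) can be illustrated with a small sketch. This is a simplified model under stated assumptions, not Scylla's code: tokens are plain integers, ranges are inclusive on both ends, and the `token_range` type and this `needs_cleanup` signature are hypothetical. The only property taken from the commit message is that the owned ranges are sorted and non-overlapping, which allows one binary search per sstable instead of a scan over every range. With 1000 sstables and 768 ranges, that is roughly 1000 * log2(768) ≈ 1000 * 9.58 ≈ 9584 comparisons instead of 1000 * 768 = 768000.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical, simplified token range: inclusive [start, end].
struct token_range {
    std::int64_t start;
    std::int64_t end;
};

// Returns true if an sstable covering tokens [first, last] is NOT fully
// contained in a single owned range. `owned` is assumed to be sorted and
// non-overlapping, so the range ends are increasing and a binary search
// on them is valid.
bool needs_cleanup(std::int64_t first, std::int64_t last,
                   const std::vector<token_range>& owned) {
    // Find the first owned range whose end is >= first: O(log n).
    auto it = std::lower_bound(
        owned.begin(), owned.end(), first,
        [](const token_range& r, std::int64_t t) { return r.end < t; });
    if (it == owned.end()) {
        return true;  // sstable starts after every owned range
    }
    // Only this one range can contain `first`; check it covers the sstable.
    return !(it->start <= first && last <= it->end);
}
```

For owned ranges {[0, 10], [20, 30]}, an sstable spanning [2, 8] needs no cleanup, while one spanning [8, 22] does, because it crosses the unowned gap between the two ranges.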

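The signed versus unsigned ordering problem described in commit 1935f2b480 (alternator binary attributes) can be reproduced outside Scylla. The sketch below is an illustration only: `byte_vec`, `less_signed` and `compare_unsigned_bytes` are hypothetical names, and `std::vector<std::int8_t>` merely stands in for a signed-byte container. It is not Scylla's `bytes` type or its `compare_unsigned()` function.

```cpp
#include <algorithm>
#include <cstdint>
#include <cstring>
#include <vector>

// Stand-in for a container of signed bytes (an assumption for this sketch).
using byte_vec = std::vector<std::int8_t>;

// Signed lexicographic order: what a plain "<" on signed bytes yields.
// Byte 0x80 is -128 here, so it sorts BEFORE 0x7f (127).
inline bool less_signed(const byte_vec& a, const byte_vec& b) {
    return std::lexicographical_compare(a.begin(), a.end(),
                                        b.begin(), b.end());
}

// Unsigned order, returning <0, 0 or >0. std::memcmp interprets each
// byte as unsigned char, so 0x80 (128) sorts AFTER 0x7f (127).
inline int compare_unsigned_bytes(const byte_vec& a, const byte_vec& b) {
    const std::size_t n = std::min(a.size(), b.size());
    const int r = n ? std::memcmp(a.data(), b.data(), n) : 0;
    if (r != 0) {
        return r;
    }
    if (a.size() == b.size()) {
        return 0;
    }
    return a.size() < b.size() ? -1 : 1;  // shorter prefix sorts first
}
```

With `lo = {0x7f}` and `hi = {0x80}`, `less_signed(hi, lo)` is true while `compare_unsigned_bytes(lo, hi)` is negative, which is the discrepancy the commit message describes between the "<" operator and the unsigned comparison DynamoDB requires.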

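The fix in commit 1724301d4d (reconcilable_result_builder) relies on a C++ language rule: non-static data members are destroyed in the reverse order of their declaration. The sketch below demonstrates only that rule. The `tracer` and `builder` types are hypothetical stand-ins; the member names echo the commit message but nothing here is Scylla code.

```cpp
#include <string>
#include <vector>

// Records its name into a shared log when destroyed.
struct tracer {
    std::string name;
    std::vector<std::string>* log;
    ~tracer() { log->push_back(name); }
};

// Hypothetical stand-in for a class with two members. Members are
// destroyed in reverse declaration order, so `result` (declared last)
// is destroyed first and `memory_accounter` second.
struct builder {
    tracer memory_accounter;
    tracer result;
};

// Builds and destroys a `builder`, returning the observed destruction order.
std::vector<std::string> destruction_order() {
    std::vector<std::string> log;
    {
        builder b{{"memory_accounter", &log}, {"result", &log}};
        (void)b;  // destroyed at the end of this scope
    }
    return log;
}
```

Calling `destruction_order()` yields `result` followed by `memory_accounter`, matching the commit's reasoning that making `_result` the last member causes it to be freed first.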
21830 Commits