scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-04-22 17:40:34 +00:00

Author	SHA1	Message	Date
Avi Kivity	9d0aaa941a	database: make run_with_compaction_disabled() a non-template Allows reducing dependencies down the line, and un-templating non-performance-critical functions is a good thing.	2017-09-11 20:09:45 +03:00
Avi Kivity	6b5514a3df	database: change truncate() to flush while compaction is disabled In preparation to make run_with_compaction_disabled() a non-template, we want to remove any non-copyable captures (so the function can be an std::function, which requires copyability). Move the flush within the compaction disabled region. This changes the behavior, but it shouldn't matter.	2017-09-11 20:09:45 +03:00
Avi Kivity	14fd4168dc	Merge seastar upstream * seastar 31b925d...92fdce2 (3): > shared_ptr: allow incomplete classes in lw_shared_ptr<> > Update DPDK to 17.05 > future: pass func as mutable to lambda arg of handle_exception[_type]	2017-09-11 20:09:04 +03:00
Tomasz Grabiec	95b3eaac97	debug: Allow running scylla_row_cache_report.stp script against a running process Message-Id: <1504776359-16424-1-git-send-email-tgrabiec@scylladb.com>	2017-09-11 14:17:30 +03:00
Avi Kivity	fe019ad84d	Merge "Refuse to load non-Scylla counter sstables" from Paweł "These patches make Scylla refuse to load counter sstables that may contain unsupported counter shards. They are recognised by the lack of the Scylla component. Fixes #2766." * tag 'reject-non-scylla-counter-sstables/v1' of https://github.com/pdziepak/scylla: db: reject non-Scylla counter sstables in flush_upload_dir db: disallow loading non-Scylla counter sstables sstable: add has_scylla_component()	2017-09-11 13:28:44 +03:00
Tzach Livyatan	83eab5c8d7	Remove comment about Too high number of concurrent compactions from scylla_compaction_manager_compactions help It should never happen, and it's not clear what "too high" stands for. Signed-off-by: Tzach Livyatan <tzach@scylladb.com> Message-Id: <20170911085645.21222-1-tzach@scylladb.com>	2017-09-11 13:27:35 +03:00
Gleb Natapov	d0d8bdf615	storage_proxy: remove unused parameter from get_restricted_ranges() function Message-Id: <20170911084653.GH24167@scylladb.com>	2017-09-11 11:58:44 +02:00
Gleb Natapov	f66e9377d4	storage_proxy: do not keep reference to a keyspace during write A keyspace can be deleted while write is ongoing, so the object cannot be used after defer point. The keyspace reference is only used to check how many replies a write operation should wait for and this can be precalculated during write handler creation. Fixes #2777 Message-Id: <20170911084436.GG24167@scylladb.com>	2017-09-11 11:57:00 +02:00
Asias He	bb9dbc5ade	storage_service: Do not use c_str() in the logger Use logger.info("{}", msg) instead. Message-Id: <d2f15007a54554b58e29fd05331c06ae030d582f.1504832296.git.asias@scylladb.com>	2017-09-10 18:10:24 +03:00
Botond Dénes	9ebeb9d5ce	Fix --Wreturn-type warnings in tests: use abort() instead of assert(0) Signed-off-by: Botond Dénes <bdenes@scylladb.com> Message-Id: <95927f933411302e84d57d169ee0147def7bc643.1504890922.git.bdenes@scylladb.com>	2017-09-10 17:09:53 +03:00
Gleb Natapov	9137446109	api: uses correct statistics for storage proxy range histograms. Message-Id: <20170910073458.GB1870@scylladb.com>	2017-09-10 16:18:36 +03:00
Pekka Enberg	d2632ddf1d	Merge "gossip: optimize apply_state_locally for large cluster" from Asias "This series tries to improve the bootstrap of a node in a large cluster by improving how gossip applies the gossip node state. In #2404, the joining node failed to bootstrap, because it did not see the seed node when storage_service::bootstrap ran. After this series, we apply the whole gossip state contained in the gossip ack/ack2 message before applying the next one, and we apply the state of the seed node earlier than that of non-seed nodes so we can have the seed node's state faster. We also add some randomness to the order of applying gossip node state so that some nodes' state is not always applied earlier than the others'. This series improves apply_state_locally for large cluster: - Tune the order of applying endpoint_state - Serialize apply_state_locally - Avoid copying of the gossip state map Fixes #2404" * tag 'asias/gossip_issue_2404_v2' of github.com:scylladb/seastar-dev: gossip: Avoid copying with apply_state_locally gossip: Serialize apply_state_locally gossip: Tune the order of applying endpoint_state in apply_state_locally gossip: Introduce is_seed helper gossip: Pass const endpoint_state& in notify_failure_detector gossip: Pass reference in notify_failure_detector	2017-09-08 11:41:43 +03:00
Asias He	57dd3cb2c5	gossip: Do not use c_str() in the logger Use logger.info("{}", msg) instead. Message-Id: <52c24d7dfe082ee926f065a6268d83fcb31ddc28.1504832289.git.asias@scylladb.com>	2017-09-08 10:59:42 +03:00
Asias He	e98ce7887b	gossip: Avoid copying with apply_state_locally Move the std::map<inet_address, endpoint_state> map from the gossip ack/ack2 message directly and move it around in apply_state_locally to avoid copying the map.	2017-09-08 15:19:48 +08:00
Asias He	fd879b4e09	gossip: Serialize apply_state_locally apply_state_locally will be called when a gossip ack/ack2 message is received. It will use the std::map<inet_address, endpoint_state>& map to update the endpoint state. However, we can receive multiple such gossip ack/ack2 messages from multiple peer nodes in parallel. Currently, we process them in parallel. It is better to apply all the states from one node and then move on to apply all the states from another node than to interleave them, because it is more important to have the state of the whole cluster than to have a slightly newer state from another peer (if it is newer), especially when the node boots up and runs its first round of gossip exchange. After this patch, we apply the whole gossip state contained in the gossip ack/ack2 message before applying the next one.	2017-09-08 15:19:47 +08:00
Asias He	9ccba950ba	gossip: Tune the order of applying endpoint_state in apply_state_locally We currently always apply the endpoint_state in the order of the endpoint IP address. This is not good because some endpoints' state is always applied earlier than the others'. In a large cluster, the number of endpoints can be large, and it takes time to apply all of them. To make it fairer, we apply the endpoint_state in random order. Apply the seed node's state earlier because, during bootstrap, we check whether we have seen the seed node in storage_service::bootstrap. In #2404, the bootstrap failed because the joining node had not applied the seed node's state when storage_service::bootstrap ran.	2017-09-08 15:19:47 +08:00
Asias He	c5456ed38f	gossip: Introduce is_seed helper To check if an endpoint is a seed node.	2017-09-08 15:19:47 +08:00
Asias He	32edd95241	gossip: Pass const endpoint_state& in notify_failure_detector	2017-09-08 15:19:47 +08:00
Asias He	46e562cbfa	gossip: Pass reference in notify_failure_detector In large cluster, the map can be large. Pass reference to avoid copying.	2017-09-08 15:19:47 +08:00
Glauber Costa	db846326f8	compaction: remove dead code This code has no more users. Bury it. Signed-off-by: Glauber Costa <glauber@scylladb.com> Message-Id: <20170908005305.29925-1-glauber@scylladb.com>	2017-09-08 08:17:15 +02:00
Tomasz Grabiec	57dc988475	Update seastar submodule * seastar 85ca12d...31b925d (19): > net/byteorder: fix 64 bit ntohq and htonq on big endian machines > core, util: fix compilation on non-x86 processors > core/memory: Fix SIGSEGV in small_pool::add_more_objects() > log: remove debug leftovers > Merge "TLS state machine fixes" from Calle > logger: allow adjusting the timestamp style for stdout logs > thread: make thread_context::s_main portable > core: add seastar::cache_line_size constant > Add detach() to input_stream and output_stream > Install dependencies for Arch Linux. > tls: Guard non-established sockets in sesrefs + more explicit close + states > tls: Make vec_push fully exception safe > basic_sstring: resize uses sstring > Merge "Add and correct unit tests" from Jesse > tcp: enforce 1-byte maximum segment invariant with zero window > tcp: verify 1-byte maximum segment invariant during send with zero window > memory: reduce small_pool vulnerability to fragmentation further > Prometheus: avoid merging all metrics family > net: Fix possible NULL pointer dereference.	2017-09-07 10:34:27 +02:00
Avi Kivity	d9ee2ad9f0	chunked_vector: avoid boost::small_vector with old boost versions Apparently older boost versions have a bug resulting in a double-free in boost::container::small_vector. Use std::vector instead. Fixes #2748. Tested-by: Vlad Zolotarov <vladz@scylladb.com> Message-Id: <20170903170207.21635-1-avi@scylladb.com>	2017-09-07 09:32:51 +03:00
Tomasz Grabiec	121cd8cb6c	tests: Fix cql_query_test.cc::test_duration_restrictions validate_request_failure() assumed that the future returned by execute_cql() is always ready, which doesn't have to be the case, and caused aborts in debug mode build. Message-Id: <1504701342-13300-1-git-send-email-tgrabiec@scylladb.com>	2017-09-06 15:49:03 +03:00
Tomasz Grabiec	3986486cb3	tests: cql_test_env: Avoid exceptions to make debugging easier Message-Id: <1504701375-13491-1-git-send-email-tgrabiec@scylladb.com>	2017-09-06 15:48:59 +03:00
Paweł Dziepak	e401d2d50b	db: reject non-Scylla counter sstables in flush_upload_dir Scylla already refuses to load counter sstables that do not have the Scylla component. However, if this happens because of the 'nodetool refresh' command, the existing protection will trigger after the sstables have been moved to the data directory. This is too late, so an additional check is added when the upload directory is scanned.	2017-09-06 12:04:26 +01:00
Paweł Dziepak	6a5e8bace1	db: disallow loading non-Scylla counter sstables Scylla does not support local and remote counter shards. This means that it is unsafe to directly load sstables that may contain them.	2017-09-06 12:03:58 +01:00
Paweł Dziepak	ebc538f4a3	sstable: add has_scylla_component() has_scylla_component() is going to be used to verify that an sstable has been generated by a recent version of Scylla. This would make it possible to reject sstables that may be unsafe to load (e.g. sstables containing legacy counter shards).	2017-09-06 12:03:45 +01:00
Avi Kivity	a59e375aad	Merge "Support termination of repair jobs" from Asias "This series implements the missing API to terminate all repairs. For example: $ curl -X POST --header "Accept: application/json" "http://127.0.0.1:10000/storage_service/force_terminate_repair" With the new stream_plan::abort() api we can now abort the stream session associated with the repair as well. On top of this, we can support termination of a single repair job instead of all jobs. Fixes #2105" * tag 'asisas/repair_abort_v4' of github.com:scylladb/seastar-dev: repair: Support termination of repair jobs repair: Track repair_info repair: Intorduce repair id to repair_info map api: Add force_terminate_repair API streaming: Add abort to stream_plan streaming: Add abort_all_stream_sessions for stream_coordinator streaming: Introduce streaming::abort() streaming: Make stream_manager and coordinator message debug level streaming: Check if _stream_result is valid streaming: Log peer address in on_error streaming: Introduce received_failed_complete_message	2017-09-06 12:58:05 +03:00
Avi Kivity	31706ba989	Merge "Fix Scylla upgrades when counters are used" from Paweł "Scylla 1.7.4 and older use incorrect ordering of counter shards, this was fixed in `0d87f3dd7d` ("utils::UUID: operator< should behave as comparison of hex strings/bytes"). However, that patch was not backported to 1.7 branch until very recently. This means that versions 1.7.4 and older emit counter shards in an incorrect order and expect them to be so. This is particularly bad when dealing with imported correct sstables in which case some shards may become duplicated. The solution implemented in this patch is to allow any order of counter shards and automatically merge all duplicates. The code is written in a way so that the correct ordering is expected in the fast path in order not to excessively punish unaffected deployments. A new feature flag CORRECT_COUNTER_ORDER is introduced to allow seamless upgrade from 1.7.4 to later Scylla versions. If that feature is not available Scylla still writes sstables and sends on-wire counters using the old ordering so that it can be correctly understood by 1.7.4, once the flag becomes available Scylla switches to the correct order. Fixes #2752." 
* tag 'fix-upgrade-with-counters/v2' of https://github.com/pdziepak/scylla: tests/counter: verify counter_id ordering counter: check that utils::UUID uses int64_t mutation_partition_serializer: use old counter ordering if necessary mutation_partition_view: do not expect counter shards to be sorted sstables: write counter shards in the order expected by the cluster tests/sstables: add storage_service_for_tests to counter write test tests/sstables: add test for reading wrong-order counter cells sstables: do not expect counter shards to be sorted storage_service: introduce CORRECT_COUNTER_ORDER feature tests/counter: test 1.7.4 compatible shard ordering counters: add helper for retrieving shards in 1.7.4 order tests/counter: add tests for 1.7.4 counter shard order counters: add counter id comparator compatible with Scylla 1.7.4 tests/counter: verify order of counter shards tests/counter: add test for sorting and deduplicating shards counters: add function for sorting and deduplicating counter cells counters: add counter_id::operator>	2017-09-05 14:20:55 +03:00
Paweł Dziepak	ed68a75b75	tests/counter: verify counter_id ordering	2017-09-05 10:52:54 +01:00
Paweł Dziepak	cdf7ba76f1	counter: check that utils::UUID uses int64_t	2017-09-05 10:46:03 +01:00
Paweł Dziepak	4aa72c6454	mutation_partition_serializer: use old counter ordering if necessary Until the cluster is fully upgraded from a version that uses the incorrect counter shard ordering it is essential to keep using it lest the old nodes corrupt the data upon receiving mutations with a counter shard ordering they do not expect.	2017-09-05 10:32:48 +01:00
Paweł Dziepak	b540516e5e	mutation_partition_view: do not expect counter shards to be sorted	2017-09-05 10:32:48 +01:00
Paweł Dziepak	84edb5a1f2	sstables: write counter shards in the order expected by the cluster If the feature signaling that we have switched to the correct ordering of counter shards is not enabled it means that the user still can do a rollback to a version that expects wrong ordering. In order to avoid any disasters when that happens write sstables using the 1.7.4 order until we know for sure that it is no longer needed.	2017-09-05 10:32:48 +01:00
Paweł Dziepak	2b614201a7	tests/sstables: add storage_service_for_tests to counter write test Writing counters to an sstable is going to require cluster feature information, which requires accessing some singletons.	2017-09-05 10:32:48 +01:00
Paweł Dziepak	5007c9290a	tests/sstables: add test for reading wrong-order counter cells	2017-09-05 10:32:48 +01:00
Paweł Dziepak	3e1d09e71d	sstables: do not expect counter shards to be sorted	2017-09-05 10:32:48 +01:00
Paweł Dziepak	ecd2bf128b	storage_service: introduce CORRECT_COUNTER_ORDER feature Scylla 1.7.4 used incorrect ordering of counter shards. In order to fix this problem a new feature is introduced that will be used to determine when nodes with that bug fixed can start sending counter shard in the correct order.	2017-09-05 10:32:48 +01:00
Paweł Dziepak	1e03c4acbe	tests/counter: test 1.7.4 compatible shard ordering	2017-09-05 10:32:47 +01:00
Paweł Dziepak	067e429881	counters: add helper for retrieving shards in 1.7.4 order	2017-09-05 10:32:47 +01:00
Paweł Dziepak	fd25a09db2	tests/counter: add tests for 1.7.4 counter shard order	2017-09-05 10:32:47 +01:00
Paweł Dziepak	a93e8ce185	counters: add counter id comparator compatible with Scylla 1.7.4	2017-09-05 10:32:47 +01:00
Paweł Dziepak	b0f67c1680	tests/counter: verify order of counter shards	2017-09-05 10:32:47 +01:00
Paweł Dziepak	27397b5dad	tests/counter: add test for sorting and deduplicating shards	2017-09-05 10:32:47 +01:00
Paweł Dziepak	e0c2379f26	counters: add function for sorting and deduplicating counter cells Due to a bug in an implementation of UUID less compare some Scylla versions sort counter shards in an incorrect order. Moreover, when dealing with imported correct data the inconsistencies in ordering caused some counter shards to become duplicated.	2017-09-05 10:32:39 +01:00
Paweł Dziepak	74af818eaf	counters: add counter_id::operator>	2017-09-04 18:25:47 +01:00
Avi Kivity	4b06a2e95d	Merge "Fix exception safety in cache update related paths" from Tomasz * 'tgrabiec/make-row-cache-update-exception-safe' of github.com:scylladb/seastar-dev: row_cache: Improve safety of cache updates row_cache: Extract invalidate_sync() memtable: Mark mark_flushed() as noexcept database: Add non-throwing try_trigger_compaction() database: Make add_sstable() have strong exception guarantees row_cache: Don't require presence checker to be supplied externally database: Supply presence checker in sstable snapshots mutation_source: Introduce mutation_source::make_partition_presence_checker() mutation_reader: Move definitions up in the header mutation_reader: Use constructor delegation to reduce code duplication row_cache: Make populate() preserve continuity row_cache: Allow marking as fully continuous on construction database: Add missing serialization of sstable set udpate and cache invalidation	2017-09-04 18:37:42 +03:00
Tomasz Grabiec	d22fdf4261	row_cache: Improve safety of cache updates Cache imposes requirements on how updates to the on-disk mutation source are made: 1) each change to the on-disk mutation source must be followed by cache synchronization reflecting that change 2) The two must be serialized with other synchronizations 3) must have strong failure guarantees (atomicity) Because of that, sstable list update and cache synchronization must be done under a lock, and cache synchronization cannot fail to synchronize. Normally cache synchronization achieves the no-failure guarantee by wiping the cache (which is noexcept) in case a failure is detected. There are some setup steps, however, which cannot be skipped, e.g. taking a lock followed by switching cache to use the new snapshot. That truly cannot fail. The lock inside cache synchronizers is redundant, since the user needs to take it anyway around the combined operation. In order to make ensuring strong exception guarantees easier, and to make the cache interface easier to use correctly, this patch moves the control of the combined update into the cache. This is done by having cache::update() et al accept a callback (external_updater) which is supposed to perform modification of the underlying mutation source when invoked. This is in line with the layering. Cache is layered on top of the on-disk mutation source (it wraps it) and reading has to go through cache. After the patch, modification also goes through cache. This way more of cache's requirements can be confined to its implementation. The failure semantics of update() and other synchronizers needed to change due to strong exception guarantees. Now if it fails, it means the update was not performed, neither to the cache nor to the underlying mutation source. The database::_cache_update_sem goes away, serialization is done internally by the cache. The external_updater needs to have strong exception guarantees. This requirement is not new. 
It is however currently violated in some places. This patch marks those callbacks as noexcept and leaves a FIXME. Those should be fixed, but that's not in the scope of this patch. Aborting is still better than corrupting the state. Fixes #2754. Also fixes the following test failure: tests/row_cache_test.cc(949): fatal error: in "test_update_failure": critical check it->second.equal(*s, mopt->partition()) has failed which started to trigger after commit `318423d50b`. Thread stack allocation may fail, in which case we did not do the necessary invalidation.	2017-09-04 10:04:29 +02:00
Tomasz Grabiec	b0f3efa577	row_cache: Extract invalidate_sync()	2017-09-04 10:04:29 +02:00
Tomasz Grabiec	673a22f8e1	memtable: Mark mark_flushed() as noexcept Callers rely on that.	2017-09-04 10:04:29 +02:00
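
The ordering described in commit 9ccba950ba (apply seed nodes' state first, and apply the remaining endpoints in random order rather than by IP address) can be illustrated with a minimal sketch. This is not the code from the commit: the `endpoint` alias, the `application_order` function, and the use of `std::set` for the seed list are hypothetical names chosen for illustration, and the real implementation works on `inet_address` and `endpoint_state`.

```cpp
#include <algorithm>
#include <cassert>
#include <random>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for a gossip endpoint address
// (the real code uses inet_address).
using endpoint = std::string;

// Hypothetical helper: returns the endpoints in the order in which their
// state would be applied. Seeds come first; within the seed group and the
// non-seed group the order is random.
std::vector<endpoint> application_order(std::vector<endpoint> endpoints,
                                        const std::set<endpoint>& seeds,
                                        std::mt19937& rng) {
    // Shuffle first so that no endpoint is favoured because of its address.
    std::shuffle(endpoints.begin(), endpoints.end(), rng);
    // Move seeds to the front; stable_partition preserves the shuffled
    // relative order inside each group.
    std::stable_partition(endpoints.begin(), endpoints.end(),
                          [&seeds](const endpoint& ep) {
                              return seeds.count(ep) != 0;
                          });
    return endpoints;
}
```

Shuffling before partitioning keeps the sketch short: one pass randomizes everything, and the stable partition then only lifts the seeds to the front without disturbing the random order of the rest.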


13069 Commits