scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-05-31 03:56:42 +00:00

Author	SHA1	Message	Date
Avi Kivity	e4bb7ce73c	utils::chunked_vector: add rbegin() and related iterators Needed as an std::vector replacement. (cherry picked from commit `eaa9a5b0d7`) Prerequisite for #4780.	2019-11-19 11:17:54 +02:00
Avi Kivity	ecc54c1a68	utils: chunked_vector: make begin()/end() const correct begin() of a const vector should return a const_iterator, to avoid giving the caller the ability to mutate it. This slipped through since iterator's constructor does a const_cast. Noticed by code inspection. (cherry picked from commit `df6faae980`) Prerequisite for #4780.	2019-11-19 11:17:54 +02:00
Glauber Costa	71cfd108c6	do not crash in user-defined operations if the controller is disabled Scylla currently crashes if we run manual operations like nodetool compact with the controller disabled. While we neither like nor recommend running with the controller disabled, due to some corner cases in the controller algorithm we are not yet at the point in which we can deprecate this and are sometimes forced to disable it. The reason for the crash is that manual operations will invoke _backlog_of_shares, which returns the backlog needed to create a certain number of shares. That function scans the existing control points, but when we run without the controller there are no control points and we crash. Backlog doesn't matter if the controller is disabled, and the return value of this function will be immaterial in this case. So to avoid the crash, we return something right away if the controller is disabled. Fixes #5016 Signed-off-by: Glauber Costa <glauber@scylladb.com> (cherry picked from commit `c9f2d1d105`)	2019-11-19 11:17:54 +02:00
Avi Kivity	d40a7a5e9e	Merge "Add proper aggregation for paged indexing" from Piotr " Fixes #4540 This series adds proper handling of aggregation for paged indexed queries. Before this series returned results were presented to the user in per-page partial manner, while they should have been returned as a single aggregated value. Tests: unit(dev) " * 'add_proper_aggregation_for_paged_indexing_for_3.0' of https://github.com/psarna/scylla: test: add 'eventually' block to index paging test tests: add indexing+paging test case for clustering keys tests: add indexing + paging + aggregation test case cql3: make DEFAULT_COUNT_PAGE_SIZE constant public cql3: add proper aggregation to paged indexing cql3: add a query options constructor with explicit page size cql3: enable explicit copying of query_options cql3: split execute_base_query implementation	2019-11-19 11:17:54 +02:00
Takuya ASADA	a163d245ec	dist/common/scripts/scylla_setup: don't proceed with empty NIC name Currently the NIC selection prompt in scylla_setup proceeds with setup when the user just presses the Enter key. The prompt should ask for the NIC name again until the user enters a correct NIC name. Fixes #4517 Message-Id: <20190617124925.11559-1-syuu@scylladb.com> (cherry picked from commit `7320c966bc`)	2019-11-19 11:17:54 +02:00
Piotr Sarna	045831b706	test: add 'eventually' block to index paging test Without 'eventually', the test is flaky because the index can still be not up to date while checking its conditions. Fixes #4670 (cherry picked from commit `ebbe038d19`)	2019-11-15 09:15:29 +01:00
Piotr Sarna	148245ab6a	tests: add indexing+paging test case for clustering keys Indexing a non-prefix part of the clustering key has a separate code path (see issue #3405), so it deserves a separate test case.	2019-11-14 12:32:08 +01:00
Piotr Sarna	bbe5de1403	tests: add indexing + paging + aggregation test case Indexed queries used to erroneously return partial per-page results for aggregation queries. This test case used to reproduce the problem and now ensures that there would be no regressions. Refs #4540	2019-11-14 12:32:07 +01:00
Piotr Sarna	ca0df416c0	cql3: make DEFAULT_COUNT_PAGE_SIZE constant public The constant will be later used in test scenarios.	2019-11-14 12:25:37 +01:00
Piotr Sarna	37ed60374e	cql3: add proper aggregation to paged indexing Aggregated and paged filtering needs to aggregate the results from all pages in order to avoid returning partial per-page results. It's a little bit more complicated than regular aggregation, because each paging state needs to be translated between the base table and the underlying view. The routine keeps fetching pages from the underlying view, which are then used to fetch base rows, which go straight to the result set builder. Fixes #4540	2019-11-14 12:25:37 +01:00
Piotr Sarna	7c991a276b	cql3: add a query options constructor with explicit page size For internal use, there already exists a query_options constructor that copies data from another query_options with overwritten paging state. This commit adds an option to overwrite page size as well.	2019-11-14 10:49:28 +01:00
Piotr Sarna	72e039be85	cql3: enable explicit copying of query_options	2019-11-14 10:49:28 +01:00
Piotr Sarna	a28ecc4714	cql3: split execute_base_query implementation In order to handle aggregation queries correctly, the function that returns base query results is split into two, so it's possible to access raw query results, before they're converted into end-user CQL message.	2019-11-14 10:49:28 +01:00
Avi Kivity	584c555698	Update seastar submodule * seastar 3920dcb3f8...083dc0875e (2): > core: fix a race in execution stages > execution_stage: prevent unbounded growth Fixes #4749. Fixes #4856.	2019-11-13 13:15:54 +02:00
null	e772f11ee0	release: prepare for 3.0.11 by yaronkaikov	2019-10-30 11:01:40 +02:00
Botond Dénes	d79b6a7481	repair: repair_cf_range(): extract result of local checksum calculation only once The loop collects the results of the checksum calculations and logs any errors. The error logging includes `checksums[0]`, which corresponds to the checksum calculation on the local node. This violates the assumption of the code following the loop, which assumes that the future of `checksums[0]` is intact after the loop terminates. However this is only true when the checksum calculation is successful and is false when it fails, as in this case the loop extracts the error and logs it. When the code after the loop checks again whether said calculation failed, it will get a false negative and will go ahead and attempt to extract the value, triggering an assert failure. Fix by making sure that even in the case of failed checksum calculation, the result of `checksums[0]` is extracted only once. Fixes: #5238 Signed-off-by: Botond Dénes <bdenes@scylladb.com> Message-Id: <20191029151709.90986-1-bdenes@scylladb.com> (cherry picked from commit `e48f301e95`)	2019-10-29 20:43:50 +02:00
Avi Kivity	85168c500c	Merge "Fix handling of schema alters and eviction in cache" from Tomasz " Fixes #5134, Eviction concurrent with preempted partition entry update after memtable flush may allow stale data to be populated into cache. Fixes #5135, Cache reads may miss some writes if schema alter followed by a read happened concurrently with preempted partition entry update. Fixes #5127, Cache populating read concurrent with schema alter may use the wrong schema version to interpret sstable data. Fixes #5128, Reads of multi-row partitions concurrent with memtable flush may fail or cause a node crash after schema alter. " * tag 'fix-cache-issues-with-schema-alter-and-eviction-v2' of github.com:tgrabiec/scylla: tests: row_cache: Introduce test_alter_then_preempted_update_then_memtable_read tests: row_cache_stress_test: Verify all entries are evictable at the end tests: row_cache_stress_test: Exercise single-partition reads tests: row_cache_stress_test: Add periodic schema alters tests: memtable_snapshot_source: Allow changing the schema tests: simple_schema: Prepare for schema altering row_cache: Record upgraded schema in memtable entries during update memtable: Extract memtable_entry::upgrade_schema() row_cache, mvcc: Prevent locked snapshots from being evicted row_cache: Make evict() not use invalidate_unwrapped() mvcc: Introduce partition_snapshot::touch() row_cache, mvcc: Do not upgrade schema of entries which are being updated row_cache: Use the correct schema version to populate the partition entry delegating_reader: Optimize fill_buffer() row_cache, memtable: Use upgrade_schema() flat_mutation_reader: Introduce upgrade_schema() (cherry picked from commit `8ed6f94a16`) (cherry picked from commit `3f4d9f210f`)	2019-10-22 19:47:02 +02:00
Botond Dénes	5b9e2cd6e6	querier_cache: correctly account entries evicted on insertion in the population Currently, the population stat is not increased for entries that are evicted immediately on insert, however the code that does the eviction still decreases the population stat, leading to an imbalance and in some cases the underflow of the population stat. To fix, unconditionally increase the population stat upon inserting an entry, regardless of whether it is immediately evicted or not. Fixes: #5123 Signed-off-by: Botond Dénes <bdenes@scylladb.com> Message-Id: <20191001153215.82997-1-bdenes@scylladb.com> (cherry picked from commit `00b432b61d`)	2019-10-05 12:36:34 +03:00
Avi Kivity	77f33ca106	Merge " hinted handoff: fix races during shutdown and draining" from Vlad " Fix races that may lead to use-after-free events and file system level exceptions during shutdown and drain. The root cause of use-after-free events in question is that space_watchdog blocks on end_point_hints_manager::file_update_mutex() and we need to make sure this mutex is alive as long as it's accessed even if the corresponding end_point_hints_manager instance is destroyed in the context of manager::drain_for(). File system exceptions may occur when space_watchdog attempts to scan a directory while it's being deleted from the drain_for() context. In case of such an exception new hints generation is going to be blocked - including for materialized views, till the next space_watchdog round (in 1s). Issues that are fixed are #4685 and #4836. Tested as follows: 1) Patched the code in order to trigger the race with (a lot) higher probability and ran a slightly modified hinted handoff replace dtest with a debug binary 100 times. A side effect of this testing was the discovery of #4836. 2) Using the same patch as above tested that there are no crashes and nodes survive stop/start sequences (they did not without this series) in the context of all hinted handoff dtests. Ran the whole set of tests with dev binary 10 times. " Fixes #4685 Fixes #4836. * 'hinted_handoff_race_between_drain_for_and_space_watchdog_no_global_lock-v2' of https://github.com/vladzcloudius/scylla: hinted handoff: fix a race on a directory removal between space_watchdog and drain_for() hinted handoff: make taking file_update_mutex safe db::hints::manager::drain_for(): fix alignment db::hints::manager: serialize calls to drain_for() db::hints: cosmetics: indentation and missing method qualifier (cherry picked from commit `3cb081eb84`)	2019-10-05 12:25:51 +03:00
Gleb Natapov	93760f13ee	messaging_service: enable reuseaddr on messaging service rpc Fixes #4943 Message-Id: <20190918152405.GV21540@scylladb.com> (cherry picked from commit `73e3d0a283`)	2019-10-03 15:24:53 +03:00
Avi Kivity	e597ae1176	Update seastar submodule * seastar af3fc691b9...3920dcb3f8 (2): > net: socket::{set,get}_reuseaddr() should not be virtual > Merge "fix some tcp connection bugs and add reuseaddr option to a client socket" from Gleb Prerequisite for #4943.	2019-10-03 15:23:35 +03:00
Tomasz Grabiec	79c7015cce	Merge "hinted handoff: don't reuse_segments and discard corrupted segments" from Vlad This series addresses two issues in the hinted handoff that should complete fixing the infamous #4231. In particular the second patch removes the requirement to manually delete hints files after upgrading to 3.0.4. Tested with manual unit testing. * https://github.com/vladzcloudius/scylla.git hinted_handoff_drop_broken_segments-v3: hinted handoff: disable "reuse_segments" commitlog: introduce a segment_error hinted handoff: discard corrupted segments (cherry picked from commit `ac0d435c3e`)	2019-09-28 19:52:57 +03:00
Asias He	00a14000cd	storage_service: Replicate and advertise tokens early in the boot up process When a node is restarted, there is a race between gossip starting (other nodes will mark this node up again and send requests) and the tokens being replicated to other shards. Here is an example: - n1, n2 - n2 is down, n1 thinks n2 is down - n2 starts again, n2 starts gossip service, n1 thinks n2 is up and sends reads/writes to n2, but n2 hasn't replicated the token_metadata to all the shards. - n2 complains: token_metadata - sorted_tokens is empty in first_token_index! token_metadata - sorted_tokens is empty in first_token_index! token_metadata - sorted_tokens is empty in first_token_index! token_metadata - sorted_tokens is empty in first_token_index! token_metadata - sorted_tokens is empty in first_token_index! token_metadata - sorted_tokens is empty in first_token_index! storage_proxy - Failed to apply mutation from $ip#4: std::runtime_error (sorted_tokens is empty in first_token_index!) The code path looks like below: 0 storage_service::init_server 1 prepare_to_join() 2 add gossip application state of NET_VERSION, SCHEMA and so on. 3 _gossiper.start_gossiping().get() 4 join_token_ring() 5 _token_metadata.update_normal_tokens(tokens, get_broadcast_address()); 6 replicate_to_all_cores().get() 7 storage_service::set_gossip_tokens() which adds the gossip application state of TOKENS and STATUS The race described above is between line 3 and line 6. To fix, we can replicate the token_metadata early after it is filled with the tokens read from system table before gossip starts. So that when other nodes think this restarting node is up, the tokens are already replicated to all the shards. In addition, this patch also fixes the issue that other nodes might see a node miss the TOKENS and STATUS application state in gossip if that node failed in the middle of a restarting process, i.e., it is killed after line 3 and before line 7. As a result we could not replace the node. Tests: update_cluster_layout_tests.py Fixes: #4709 Fixes: #4723 (cherry picked from commit `3b39a59135`)	2019-09-22 12:46:36 +03:00
Avi Kivity	1c40a0fcd2	Update seastar submodule * seastar ea859b5840...af3fc691b9 (1): > iotune: fix exception handling in case test file creation fails Fixes #5001.	2019-09-18 18:37:23 +03:00
Gleb Natapov	e10735852b	messaging_service: configure different streaming domain for each rpc server A streaming domain identifies a server across shards. Each server should have different one. Fixes: #4953 Message-Id: <20190908085327.GR21540@scylladb.com> (cherry picked from commit `9e9f64d90e`)	2019-09-09 20:37:40 +03:00
Avi Kivity	42433a25a8	Update seastar submodule * seastar 445b5126c2...ea859b5840 (1): > perftune: fix missing import for logging Fixes #4958.	2019-09-04 13:50:29 +03:00
Paweł Dziepak	d04d3fa653	mutation_partition: verify row::append_cell() precondition row::append_cell() has a precondition that the new cell column id needs to be larger than that of any other already existing cell. If this precondition is violated the row will end up in an invalid state. This patch adds assertion to make sure we fail early in such cases. (cherry picked from commit `060e3f8ac2`)	2019-08-23 15:06:18 +02:00
Avi Kivity	1bcc5a1b5c	Merge "database: assign proper io priority for streaming view updates" from Piotr " Streamed view updates parasitized on writing io priority, which is reserved for user writes - it's now properly bound to streaming write priority. Verified manually by checking appropriate io metrics: scylla_io_queue_total_bytes{class="streaming_write" ...} vs scylla_io_queue_total_bytes{class="query" ...} Tests: unit(dev) " Fixes #4615. * 'assign_proper_io_priority_to_streaming_view_updates' of https://github.com/psarna/scylla: db,view: wrap view update generation in stream scheduling group database: assign proper io priority for streaming view updates (cherry picked from commit `2c7435418a`)	2019-08-22 16:21:42 +03:00
Botond Dénes	450b9ac9bf	multishard_combining_reader: shard reader: don't stop on non-full prefixes This patch is a backport of the fix for #4733 (merged to master as `0cf4fab`). As the shard reader code has been substantially refactored post the 3.0 branch cut time, that fix cannot be backported at all, instead this is a separate fix developed specially for 3.0. To quickly reiterate, the problem at hand is that when recreating a previously evicted shard reader of a multishard reader, the position of the last fragment seen by that reader is used as the position after which the read resumes. For this we just created a clustering range starting from after the key (open bound). This works well in most cases but when that last key is a non-full prefix this will also ignore any still unread clustering rows that falls into that prefix. This patch doesn't attempt to fix the problem in a systematic way like the fix in master does, making sure reader recreation works properly with prefixes as well, instead, for the sake of minimizing the impact, we simply avoid ending the buffer on a prefix key. This fix is more naive and can cause over-read when the stream contains lots of successive range tombstones with prefix positions. On the other hand, this leads to a much simpler fix, and anyway, as reader eviction is much rarer in 3.0 this should have a lesser impact. A unit test is also added to make sure the problem is fixed. Signed-off-by: Botond Dénes <bdenes@scylladb.com> Message-Id: <20190819120748.28168-1-bdenes@scylladb.com>	2019-08-19 15:09:47 +03:00
Jenkins	b3bfd8c08d	release: prepare for 3.0.10 by hagitsegev scylla-3.0.10	2019-08-14 14:58:50 -04:00
Tomasz Grabiec	53c10b72dc	Merge "Fix the system.size_estimates table" from Kamil Fixes a segfault when querying for an empty keyspace. Also, fixes an infinite loop on smp > 1. Queries to system.size_estimates table which are not single-partition queries caused Scylla to go into an infinite loop inside multishard_combining_reader::fill_buffer. This happened because multishard_combining_reader assumes that shards return rows belonging to separate partitions, which was not the case for size_estimates_mutation_reader. Fixes #4689	2019-08-14 15:31:54 +02:00
Kamil Braun	a690e20966	Fix infinite looping when performing a range query on system.size_estimates. Queries to system.size_estimates table which are not single-partition queries caused Scylla to go into an infinite loop inside multishard_combining_reader::fill_buffer. This happened because multishard_combining_reader assumes that shards return rows belonging to separate partitions, which was not the case for size_estimates_mutation_reader. This commit fixes the issue and closes #4689.	2019-08-14 12:51:33 +02:00
Kamil Braun	7172009a0d	Fix segmentation fault when querying system.size_estimates for an empty keyspace.	2019-08-14 12:51:33 +02:00
Kamil Braun	cb688ef62e	Refactor size_estimates_virtual_reader Move the implementation of size_estimates_mutation_reader to a separate compilation unit to speed up compilation times and increase readability. Refactor tests to use seastar::thread.	2019-08-14 12:51:27 +02:00
Kamil Braun	ff8265dd66	Fix command line argument parsing in main. Command line arguments are parsed twice in Scylla: once in main and once in Seastar's app_template::run. The first parse is there to check if the "--version" flag is present --- in this case the version is printed and the program exits. The second parsing is correct; however, most of the arguments were improperly treated as positional arguments during the first parsing (e.g., "--network host" would treat "host" as a positional argument). This happened because the arguments weren't known to the command line parser. This commit fixes the issue by moving the parsing code until after the arguments are registered. Resolves #4141. Signed-off-by: Kamil Braun <kbraun@scylladb.com> (cherry picked from commit `f155a2d334`)	2019-08-13 20:13:24 +03:00
Avi Kivity	a198db31dc	Merge "Fix disable_sstable_write synchronization with on_compaction_completion" from Benny " disable_sstable_write needs to acquire _sstable_deletion_sem to properly synchronize with background deletions done by on_compaction_completion to ensure no sstables will be created or deleted during reshuffle_sstables after storage_service::load_new_sstables disables sstable writes. Fixes #4622 Test: unit(dev), nodetool_additional_test.py migration_test.py " * 'fix-disable-sstable-write-for-3.0' of https://github.com/bhalevy/scylla: table: document _sstables_lock/_sstable_deletion_sem locking order table: disable_sstable_write: acquire _sstable_deletion_sem table: uninline enable_sstable_write	2019-08-12 16:53:47 +03:00
Avi Kivity	094a2a4263	Merge "Catch unclosed partition sstable write #4794 " from Tomasz " Not emitting partition_end for a partition is incorrect. SStable writer assumes that it is emitted. If it's not, the sstable will not be written correctly. The partition index entry for the last partition will be left partially written, which will result in errors during reads. Also, statistics and sstable key ranges will not include the last partition. It's better to catch this problem at the time of writing, and not generate bad sstables. Another way of handling this would be to implicitly generate a partition_end, but I don't think that we should do this. We cannot trust the mutation stream when invariants are violated, we don't know if this was really the last partition which was supposed to be written. So it's safer to fail the write. Enabled for both mc and la/ka. Passing --abort-on-internal-error on the command line will switch to aborting instead of throwing an exception. The reason we don't abort by default is that it may bring the whole cluster down and cause unavailability, while it may not be necessary to do so. It's safer to fail just the affected operation, e.g. repair. However, failing the operation with an exception leaves little information for debugging the root cause. So the idea is that the user would enable aborts on only one of the nodes in the cluster to get a core dump and not bring the whole cluster down. " * 'catch-unclosed-partition-sstable-write' of https://github.com/tgrabiec/scylla: sstables: writer: Validate that partition is closed when the input mutation stream ends config, exceptions: Add helper for handling internal errors utils: config_file: Introduce named_value::observe() (cherry picked from commit `95c0804731`) (cherry picked from commit `cf4c238b28`)	2019-08-08 16:47:26 +03:00
Asias He	cc0b4d249b	streaming: Send error code from the sender to receiver In case of error on the sender side, the sender does not propagate the error to the receiver. The sender will close the stream. As a result, the receiver will get nullopt from the source in get_next_mutation_fragment and pass mutation_fragment_opt with no value to the generating_reader. In turn, the generating_reader generates end of stream. However, the last element that the generating_reader has generated can be any type of mutation_fragment. This makes the sstable that consumes the generating_reader violates the mutation_fragment stream rule. To fix, we need to propagate the error. However RPC streaming does not support propagate the error in the framework. User has to send an error code explicitly. Fixes: #4789 (cherry picked from commit `bac987e32a`) streaming: Move stream_mutation_fragments_cmd to a new file Avoid including the stream_session.hh in messaging_service.hh. More importantly, fix the build because currently messaging_service.cc and messaging_service.hh does not include stream_mutation_fragments_cmd. I am not sure why it builds on my machine. Spotted this when backporting the change to 3.0 branch. Refs: #4789 (cherry picked from commit `49a73aa2fc`) streaming: Do not call rpc stream flush in send_mutation_fragments The stream close() guarantees the data sent will be flushed. No need to call the stream flush() since the stream is not reused. Follow up fix for commit `bac987e32a` (streaming: Send error code from the sender to receiver). Fixes: #4789 (cherry picked from commit `288371ce75`) Message-Id: <87058e290ae3f59f874b860121786b22f24957c7.1565189319.git.asias@scylladb.com>	2019-08-08 11:41:25 +02:00
Asias He	e10afc7f50	messaging_service: Check if messaging_service is stopped before get_rpc_client get_rpc_client assumes the messaging_service is not stopped. We should check is_stopping() before we call get_rpc_client. We do such check in existing code, e.g., send_message and friends. Do the same check in the newly introduced make_sink_and_source_for_stream_mutation_fragments() and friends for row level repair. Fixes: #4767 (cherry picked from commit `5d3e4d7b73`) Note: only the change for make_sink_and_source_for_stream_mutation_fragments is backported. Message-Id: <06079d4e48ea81ba567a2f45be2ab3a51f042e28.1565189319.git.asias@scylladb.com>	2019-08-08 11:40:49 +02:00
Tomasz Grabiec	407dfe0d68	lsa: Fix spurious abort with --enable-abort-on-lsa-bad-alloc allocate_segment() can fail even though we're not out of memory, when it's invoked inside an allocating section with the cache region locked. That section may later succeed when retried after memory reclamation. We should ignore bad_alloc thrown inside the allocating section body and fail only when the whole section fails. Fixes #2924 Message-Id: <1550597493-22500-1-git-send-email-tgrabiec@scylladb.com> (cherry picked from commit `dafe22dd83`)	2019-08-08 11:39:39 +02:00
Raphael S. Carvalho	9370996a18	table: do not rely on undefined behavior in cleanup_sstables It shouldn't rely on argument evaluation order, which is ub. Fixes #4718. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> (cherry-picked from commit `0e732ed1cf`)	2019-08-07 21:53:12 +03:00
Rafael Ávila de Espíndola	ac105dd2a7	mc writer: Fix exception safety when closing _index_writer This fixes a possible cause of #4614. From the backtrace in that issue, it looks like a file is being closed twice. The first point in the backtrace where that seems likely is in the MC writer. My first idea was to add a writer::close and make it the responsibility of the code using the writer to call it. That way we would move work out of the destructor. That is a bit hard since the writer is destroyed from flat_mutation_reader::impl::~consumer_adapter and that would need to get a close function too. This patch instead just fixes an exception safety issue. If _index_writer->close() throws, _index_writer is still valid and ~writer will try to close it again. If the exception was thrown after _completed.set_value(), that would explain the assert about _completed.set_value() being called twice. With this patch the path outside of the destructor now moves the writer to a local variable before trying to close it. Fixes #4614 Message-Id: <20190710171747.27337-1-espindola@scylladb.com> (cherry picked from commit `281f3a69f8`)	2019-08-07 21:43:44 +03:00
Benny Halevy	1e62fc8aac	table: document _sstables_lock/_sstable_deletion_sem locking order Signed-off-by: Benny Halevy <bhalevy@scylladb.com> (cherry picked from commit `0e4567c881`)	2019-08-07 17:09:47 +03:00
Benny Halevy	c724eee649	table: disable_sstable_write: acquire _sstable_deletion_sem `disable_sstable_write` needs to acquire `_sstable_deletion_sem` to properly synchronize with background deletions done by `on_compaction_completion` to ensure no sstables will be created or deleted during `reshuffle_sstables` after `storage_service::load_new_sstables` disables sstable writes. Fixes #4622 Signed-off-by: Benny Halevy <bhalevy@scylladb.com> (cherry picked from commit `6dad9baa1c`)	2019-08-07 17:06:38 +03:00
Benny Halevy	ebb14d93c9	table: uninline enable_sstable_write Signed-off-by: Benny Halevy <bhalevy@scylladb.com> (cherry picked from commit `bbbd749f70`)	2019-08-07 17:04:08 +03:00
Tomasz Grabiec	d77aaada86	sstables: ka/la: reader: Make sure push_ready_fragments() does not miss to emit partition_end Currently, if there is a fragment in _ready and _out_of_range was set after row end was consumed, push_ready_fragments() would return without emitting partition_end. This is problematic once we make consume_row_start() emit partition_start directly, because we will want to assume that all fragments for the previous partition are emitted by then. If they're not, then we'd emit partition_start before partition_end for the previous partition. The fix is to make sure that push_ready_fragments() emits everything. Fixes #4786 (cherry picked from commit `9b8ac5ecbc`) Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2019-08-01 13:06:56 +03:00
Avi Kivity	acd05e089f	Update seastar submodule * seastar 16641efb15...445b5126c2 (1): > reactor: fix deadlock of stall detector vs dlopen Fixes #4759.	2019-07-31 18:33:28 +03:00
Avi Kivity	f591c9c710	sstable: index_reader: close index_reader::reader more robustly If we had an error while reading, then we would have failed to close the reader, which in turn can cause memory corruption. Make the closing more robust by using then_wrapped (that doesn't skip on exception) and log the error for analysis. Fixes #4761. (cherry picked from commit `b272db368f`)	2019-07-27 18:20:17 +03:00
Jenkins	dea4489078	release: prepare for 3.0.9 by hagitsegev scylla-3.0.9	2019-07-24 12:09:49 +03:00
Raphael S. Carvalho	3172cc6bac	sstables/compaction: Fix segfault when replacing expired sstable in incremental compaction A fully expired sstable is not added to the compacting set, meaning it's not actually compacted, but it's kept in a list of sstables which incremental compaction uses to check if any sstable can be replaced. Incremental compaction was unconditionally removing the expired sstable from the compacting set, which led to a segfault because an end iterator was given. The fix changes sstable_set::erase() to follow the standard behavior of erase functions, which work even if the target element is not present. Fixes #4085. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <20190130163100.5824-1-raphaelsc@scylladb.com> (cherry picked from commit `930f8caff9`)	2019-07-22 15:07:00 +03:00
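
The erase() behavior described in 3172cc6bac can be illustrated with a minimal sketch. This is not ScyllaDB's sstable_set; `toy_sstable_set` and its integer elements are hypothetical stand-ins, used only to show an erase-by-value that treats a missing element as a no-op instead of handing an end iterator to the container.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-in for a set of sstables; integers play the role of
// sstable handles. Not the real sstable_set API.
struct toy_sstable_set {
    std::vector<int> items;

    // Erase by value. Passing end() to std::vector::erase() is undefined
    // behavior, so the "not found" case is checked and returns early.
    // Returns true if an element was actually removed.
    bool erase(int sst) {
        auto it = std::find(items.begin(), items.end(), sst);
        if (it == items.end()) {
            return false;  // absent element: nothing to do
        }
        items.erase(it);
        return true;
    }
};
```

With this shape, a caller that removes a fully expired sstable which was never added to the set gets `false` back and the set is left unchanged. This mirrors how `std::set::erase(key)` reports zero removed elements for a missing key.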


16962 Commits
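
The counter imbalance described in 5b9e2cd6e6 (querier_cache population stat) can be sketched in isolation. The `toy_cache` type below is an assumption made for illustration and does not reflect the real querier_cache. It shows only the accounting rule from the commit message: increment the population on every insert, even when the entry is evicted immediately, so that the decrement in the eviction path is always balanced.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>

// Hypothetical cache with a bounded number of entries and an unsigned
// population statistic, as a stand-in for the real querier_cache.
struct toy_cache {
    std::size_t capacity;
    std::deque<int> entries;
    std::uint64_t population;

    explicit toy_cache(std::size_t cap) : capacity(cap), population(0) {}

    // The eviction path always decrements the statistic.
    void evict_oldest() {
        entries.pop_front();
        --population;
    }

    void insert(int entry) {
        entries.push_back(entry);
        // Increment unconditionally, before any eviction. If this were
        // skipped for entries evicted right away, the decrement in
        // evict_oldest() would make the unsigned counter wrap around.
        ++population;
        while (entries.size() > capacity) {
            evict_oldest();
        }
    }
};
```

With a capacity of zero every inserted entry is evicted at once, and the statistic returns to zero rather than underflowing.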