scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-05-31 12:06:44 +00:00

Author	SHA1	Message	Date
Amnon Heiman	b3cdee7e27	init: do not allow replace-address for seeds If a node is a seed node, it cannot be started with the replace-address-first-boot or replace-address flag. The issue is that, as a seed node, it will generate new tokens instead of replacing the existing one the user expects it to replace when supplying the flags. This patch throws a bad_configuration_error exception in this case. Fixes #3889 Signed-off-by: Amnon Heiman <amnon@scylladb.com> (cherry picked from commit `399d79fc6f`)	2019-12-23 17:24:52 +02:00
Rafael Ávila de Espíndola	4c42f18d82	cql: Fix use of UDT in reversed columns We were missing calls to underlying_type in a few locations and so the insert would think the given literal was invalid and the select would refuse to fetch a UDT field. Fixes #4672 Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com> Message-Id: <20190708200516.59841-1-espindola@scylladb.com> (cherry picked from commit `4e7ffb80c0`)	2019-12-23 15:57:47 +02:00
Benny Halevy	ea8f8ab7a3	sstables: mc: prevent signed integer overflow Fix runtime error: signed integer overflow introduced by `2dc3776407` Delta-encoded values may wrap around if the encoded value is less than the base value. This could happen in two places: In the mc-format serialization header itself, where the base values are implicit Cassandra epoch time, and in the sstables data files, where the base values are taken from the encoding_stats (later written to the serialization_header). In these cases, when the calculation is done using signed integer/long we may see "runtime error: signed integer overflow" messages in debug mode (with -fsanitize=undefined / -fsanitize=signed-integer-overflow). Overflow here is expected and harmless since we guarantee neither that the base values in the serialization header are greater than or equal to Cassandra's epoch nor that the delta-encoded values are always greater than or equal to the respective base values in the serialization header. To prevent these warnings, the subtraction/addition should be done with unsigned (two's complement) arithmetic and the result converted to the signed type. Note that to keep the code simple where possible, we also rely on implicit conversion of signed integers to unsigned when one of the added values is unsigned and the other is signed. Fixes: #4098 Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Message-Id: <20190120142950.15776-1-bhalevy@scylladb.com> (cherry picked from commit `844a2de263`)	2019-12-15 15:52:35 +02:00
Piotr Sarna	db6821ce8f	table: Reduce read amplification in view update generation This commit makes sure that single-partition readers for read-before-write do not have fast-forwarding enabled, as it may lead to huge read amplification. The observed case was: 1. Creating an index. CREATE INDEX index1 ON myks2.standard1 ("C1"); 2. Running cassandra-stress in order to generate view updates. cassandra-stress write no-warmup n=1000000 cl=ONE -schema \ 'replication(factor=2) compaction(strategy=LeveledCompactionStrategy)' \ keyspace=myks2 -pop seq=4000000..8000000 -rate threads=100 -errors skip-read-validation -node 127.0.0.1; Without disabling fast-forwarding, single-partition readers were turned into scanning readers in cache, which resulted in reading 36GB (sic!) on a workload which generates less than 1GB of view updates. After applying the fix, the number dropped down to less than 1GB, as expected. Refs #5409 Fixes #4615 Fixes #5418 (cherry picked from commit `79c3a508f4`)	2019-12-05 22:36:41 +02:00
Rafael Ávila de Espíndola	3c91bad0dc	commitlog: make sure a file is closed If allocate or truncate throws, we have to close the file. Fixes #4877 Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com> Message-Id: <20191114174810.49004-1-espindola@scylladb.com> (cherry picked from commit `6160b9017d`) scylla-3.0.11	2019-11-24 17:50:06 +02:00
Tomasz Grabiec	bbe41a82be	row_cache: Fix abort on bad_alloc during cache update Since `90d6c0b`, cache will abort when trying to detach partition entries while they're updated. This should never happen. It can happen though, when the update fails on bad_alloc, because the cleanup guard invalidates the cache before it releases partition snapshots (held by "update" coroutine). Fix by destroying the coroutine first. Fixes #5327. Tests: - row_cache_test (dev) Message-Id: <1574360259-10132-1-git-send-email-tgrabiec@scylladb.com> (cherry picked from commit `e3d025d014`)	2019-11-24 17:44:30 +02:00
Nadav Har'El	6fb42269e9	merge: row_marker: correct row expiry condition Merged patch set by Piotr Dulikowski: This change corrects condition on which a row was considered expired by its TTL. The logic that decides when a row becomes expired was inconsistent with the logic that decides if a single cell is expired. A single cell becomes expired when expiry_timestamp <= now, while a row became expired when expiry_timestamp < now (notice the strict inequality). For rows inserted with TTL, this caused non-key cells to expire (change their values to null) one second before the row disappeared. Now, row expiry logic uses non-strict inequality. Fixes #4263, Fixes #5290. Tests: unit(dev) python test described in issue #5290 (cherry picked from commit `9b9609c65b`) (cherry picked from commit `95acf71680`)	2019-11-20 21:40:40 +02:00
Asias He	ee2255a189	gossip: Fix max generation drift measure Assume n1 and n2 in a cluster with generation number g1, g2. The cluster runs for more than 1 year (MAX_GENERATION_DIFFERENCE). When n1 reboots with generation g1' which is time based, n2 will see g1' > g2 + MAX_GENERATION_DIFFERENCE and reject n1's gossip update. To fix, check the generation drift with generation value this node would get if this node were restarted. This is a backport of CASSANDRA-10969. Fixes #5164 (cherry picked from commit `0a52ecb6df`)	2019-11-20 11:39:37 +02:00
Kamil Braun	3218e6cd4c	view: fix bug in virtual columns. When creating a virtual column of non-frozen map type, the wrong type was used for the map's keys. Fixes #5165. (cherry picked from commit `ef9d5750c8`)	2019-11-19 11:17:54 +02:00
Rafael Ávila de Espíndola	1d94aac551	sstable: close file_writer if an exception is thrown The previous code was not exception safe and would eventually cause a file to be destroyed without being closed, causing an assert failure. Unfortunately it doesn't seem to be possible to test this without error injection, since using an invalid directory fails before this code is executed. Fixes #4948 Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com> Message-Id: <20190904002314.79591-1-espindola@scylladb.com> (cherry picked from commit `000514e7cc`)	2019-11-19 11:17:54 +02:00
Avi Kivity	2e5110d063	reconcilable_result: use chunked_vector to hold partitions Usually, a reconcilable_result holds very few partitions (1 is common), since the page size is limited by 1MB. But if we have paging disabled or if we are reconciling a range full of tombstones, we may see many more. This can cause large allocations. Change to chunked_vector to prevent those large allocations, as they can be quite expensive. Fixes #4780. (cherry picked from commit `093d2cd7e5`)	2019-11-19 11:17:54 +02:00
Avi Kivity	e4bb7ce73c	utils::chunked_vector: add rbegin() and related iterators Needed as an std::vector replacement. (cherry picked from commit `eaa9a5b0d7`) Prerequisite for #4780.	2019-11-19 11:17:54 +02:00
Avi Kivity	ecc54c1a68	utils: chunked_vector: make begin()/end() const correct begin() of a const vector should return a const_iterator, to avoid giving the caller the ability to mutate it. This slipped through since iterator's constructor does a const_cast. Noticed by code inspection. (cherry picked from commit `df6faae980`) Prerequisite for #4780.	2019-11-19 11:17:54 +02:00
Glauber Costa	71cfd108c6	do not crash in user-defined operations if the controller is disabled Scylla currently crashes if we run manual operations like nodetool compact with the controller disabled. While we neither like nor recommend running with the controller disabled, due to some corner cases in the controller algorithm we are not yet at the point in which we can deprecate this and are sometimes forced to disable it. The reason for the crash is that manual operations will invoke _backlog_of_shares, which returns the backlog needed to create a certain number of shares. That function scans the existing control points, but when we run without the controller there are no control points and we crash. Backlog doesn't matter if the controller is disabled, and the return value of this function will be immaterial in this case. So to avoid the crash, we return something right away if the controller is disabled. Fixes #5016 Signed-off-by: Glauber Costa <glauber@scylladb.com> (cherry picked from commit `c9f2d1d105`)	2019-11-19 11:17:54 +02:00
Avi Kivity	d40a7a5e9e	Merge "Add proper aggregation for paged indexing" from Piotr " Fixes #4540 This series adds proper handling of aggregation for paged indexed queries. Before this series returned results were presented to the user in per-page partial manner, while they should have been returned as a single aggregated value. Tests: unit(dev) " * 'add_proper_aggregation_for_paged_indexing_for_3.0' of https://github.com/psarna/scylla: test: add 'eventually' block to index paging test tests: add indexing+paging test case for clustering keys tests: add indexing + paging + aggregation test case cql3: make DEFAULT_COUNT_PAGE_SIZE constant public cql3: add proper aggregation to paged indexing cql3: add a query options constructor with explicit page size cql3: enable explicit copying of query_options cql3: split execute_base_query implementation	2019-11-19 11:17:54 +02:00
Takuya ASADA	a163d245ec	dist/common/scripts/scylla_setup: don't proceed with empty NIC name Currently the NIC selection prompt in scylla_setup simply proceeds with setup when the user just presses the Enter key at the prompt. The prompt should ask for the NIC name again until the user enters a correct NIC name. Fixes #4517 Message-Id: <20190617124925.11559-1-syuu@scylladb.com> (cherry picked from commit `7320c966bc`)	2019-11-19 11:17:54 +02:00
Piotr Sarna	045831b706	test: add 'eventually' block to index paging test Without 'eventually', the test is flaky because the index can still be not up to date while checking its conditions. Fixes #4670 (cherry picked from commit `ebbe038d19`)	2019-11-15 09:15:29 +01:00
Piotr Sarna	148245ab6a	tests: add indexing+paging test case for clustering keys Indexing a non-prefix part of the clustering key has a separate code path (see issue #3405), so it deserves a separate test case.	2019-11-14 12:32:08 +01:00
Piotr Sarna	bbe5de1403	tests: add indexing + paging + aggregation test case Indexed queries used to erroneously return partial per-page results for aggregation queries. This test case used to reproduce the problem and now ensures that there would be no regressions. Refs #4540	2019-11-14 12:32:07 +01:00
Piotr Sarna	ca0df416c0	cql3: make DEFAULT_COUNT_PAGE_SIZE constant public The constant will be later used in test scenarios.	2019-11-14 12:25:37 +01:00
Piotr Sarna	37ed60374e	cql3: add proper aggregation to paged indexing Aggregated and paged filtering needs to aggregate the results from all pages in order to avoid returning partial per-page results. It's a little bit more complicated than regular aggregation, because each paging state needs to be translated between the base table and the underlying view. The routine keeps fetching pages from the underlying view, which are then used to fetch base rows, which go straight to the result set builder. Fixes #4540	2019-11-14 12:25:37 +01:00
Piotr Sarna	7c991a276b	cql3: add a query options constructor with explicit page size For internal use, there already exists a query_options constructor that copies data from another query_options with overwritten paging state. This commit adds an option to overwrite page size as well.	2019-11-14 10:49:28 +01:00
Piotr Sarna	72e039be85	cql3: enable explicit copying of query_options	2019-11-14 10:49:28 +01:00
Piotr Sarna	a28ecc4714	cql3: split execute_base_query implementation In order to handle aggregation queries correctly, the function that returns base query results is split into two, so it's possible to access raw query results, before they're converted into end-user CQL message.	2019-11-14 10:49:28 +01:00
Avi Kivity	584c555698	Update seastar submodule * seastar 3920dcb3f8...083dc0875e (2): > core: fix a race in execution stages > execution_stage: prevent unbounded growth Fixes #4749. Fixes #4856.	2019-11-13 13:15:54 +02:00
null	e772f11ee0	release: prepare for 3.0.11 by yaronkaikov	2019-10-30 11:01:40 +02:00
Botond Dénes	d79b6a7481	repair: repair_cf_range(): extract result of local checksum calculation only once The loop collects the results of the checksum calculations and logs any errors. The error logging includes `checksums[0]`, which corresponds to the checksum calculation on the local node. This violates the assumption of the code following the loop, which assumes that the future of `checksums[0]` is intact after the loop terminates. However this is only true when the checksum calculation is successful and is false when it fails, as in this case the loop extracts the error and logs it. When the code after the loop checks again whether said calculation failed, it will get a false negative and will go ahead and attempt to extract the value, triggering an assert failure. Fix by making sure that even in the case of failed checksum calculation, the result of `checksums[0]` is extracted only once. Fixes: #5238 Signed-off-by: Botond Dénes <bdenes@scylladb.com> Message-Id: <20191029151709.90986-1-bdenes@scylladb.com> (cherry picked from commit `e48f301e95`)	2019-10-29 20:43:50 +02:00
Avi Kivity	85168c500c	Merge "Fix handling of schema alters and eviction in cache" from Tomasz " Fixes #5134, Eviction concurrent with preempted partition entry update after memtable flush may allow stale data to be populated into cache. Fixes #5135, Cache reads may miss some writes if schema alter followed by a read happened concurrently with preempted partition entry update. Fixes #5127, Cache populating read concurrent with schema alter may use the wrong schema version to interpret sstable data. Fixes #5128, Reads of multi-row partitions concurrent with memtable flush may fail or cause a node crash after schema alter. " * tag 'fix-cache-issues-with-schema-alter-and-eviction-v2' of github.com:tgrabiec/scylla: tests: row_cache: Introduce test_alter_then_preempted_update_then_memtable_read tests: row_cache_stress_test: Verify all entries are evictable at the end tests: row_cache_stress_test: Exercise single-partition reads tests: row_cache_stress_test: Add periodic schema alters tests: memtable_snapshot_source: Allow changing the schema tests: simple_schema: Prepare for schema altering row_cache: Record upgraded schema in memtable entries during update memtable: Extract memtable_entry::upgrade_schema() row_cache, mvcc: Prevent locked snapshots from being evicted row_cache: Make evict() not use invalidate_unwrapped() mvcc: Introduce partition_snapshot::touch() row_cache, mvcc: Do not upgrade schema of entries which are being updated row_cache: Use the correct schema version to populate the partition entry delegating_reader: Optimize fill_buffer() row_cache, memtable: Use upgrade_schema() flat_mutation_reader: Introduce upgrade_schema() (cherry picked from commit `8ed6f94a16`) (cherry picked from commit `3f4d9f210f`)	2019-10-22 19:47:02 +02:00
Botond Dénes	5b9e2cd6e6	querier_cache: correctly account entries evicted on insertion in the population Currently, the population stat is not increased for entries that are evicted immediately on insert, however the code that does the eviction still decreases the population stat, leading to an imbalance and in some cases the underflow of the population stat. To fix, unconditionally increase the population stat upon inserting an entry, regardless of whether it is immediately evicted or not. Fixes: #5123 Signed-off-by: Botond Dénes <bdenes@scylladb.com> Message-Id: <20191001153215.82997-1-bdenes@scylladb.com> (cherry picked from commit `00b432b61d`)	2019-10-05 12:36:34 +03:00
Avi Kivity	77f33ca106	Merge " hinted handoff: fix races during shutdown and draining" from Vlad " Fix races that may lead to use-after-free events and file system level exceptions during shutdown and drain. The root cause of use-after-free events in question is that space_watchdog blocks on end_point_hints_manager::file_update_mutex() and we need to make sure this mutex is alive as long as it's accessed even if the corresponding end_point_hints_manager instance is destroyed in the context of manager::drain_for(). File system exceptions may occur when space_watchdog attempts to scan a directory while it's being deleted from the drain_for() context. In case of such an exception new hints generation is going to be blocked - including for materialized views, till the next space_watchdog round (in 1s). Issues that are fixed are #4685 and #4836. Tested as follows: 1) Patched the code in order to trigger the race with (a lot) higher probability and ran a slightly modified hinted handoff replace dtest with a debug binary 100 times. A side effect of this testing was the discovery of #4836. 2) Using the same patch as above tested that there are no crashes and nodes survive stop/start sequences (they did not without this series) in the context of all hinted handoff dtests. Ran the whole set of tests with dev binary 10 times. " Fixes #4685 Fixes #4836. * 'hinted_handoff_race_between_drain_for_and_space_watchdog_no_global_lock-v2' of https://github.com/vladzcloudius/scylla: hinted handoff: fix a race on a directory removal between space_watchdog and drain_for() hinted handoff: make taking file_update_mutex safe db::hints::manager::drain_for(): fix alignment db::hints::manager: serialize calls to drain_for() db::hints: cosmetics: indentation and missing method qualifier (cherry picked from commit `3cb081eb84`)	2019-10-05 12:25:51 +03:00
Gleb Natapov	93760f13ee	messaging_service: enable reuseaddr on messaging service rpc Fixes #4943 Message-Id: <20190918152405.GV21540@scylladb.com> (cherry picked from commit `73e3d0a283`)	2019-10-03 15:24:53 +03:00
Avi Kivity	e597ae1176	Update seastar submodule * seastar af3fc691b9...3920dcb3f8 (2): > net: socket::{set,get}_reuseaddr() should not be virtual > Merge "fix some tcp connection bugs and add reuseaddr option to a client socket" from Gleb Prerequisite for #4943.	2019-10-03 15:23:35 +03:00
Tomasz Grabiec	79c7015cce	Merge "hinted handoff: don't reuse_segments and discard corrupted segments" from Vlad This series addresses two issues in the hinted handoff that should complete fixing the infamous #4231. In particular the second patch removes the requirement to manually delete hints files after upgrading to 3.0.4. Tested with manual unit testing. * https://github.com/vladzcloudius/scylla.git hinted_handoff_drop_broken_segments-v3: hinted handoff: disable "reuse_segments" commitlog: introduce a segment_error hinted handoff: discard corrupted segments (cherry picked from commit `ac0d435c3e`)	2019-09-28 19:52:57 +03:00
Asias He	00a14000cd	storage_service: Replicate and advertise tokens early in the boot up process When a node is restarted, there is a race between gossip starting (other nodes will mark this node up again and send requests) and the tokens being replicated to other shards. Here is an example: - n1, n2 - n2 is down, n1 thinks n2 is down - n2 starts again, n2 starts gossip service, n1 thinks n2 is up and sends reads/writes to n2, but n2 hasn't replicated the token_metadata to all the shards. - n2 complains: token_metadata - sorted_tokens is empty in first_token_index! token_metadata - sorted_tokens is empty in first_token_index! token_metadata - sorted_tokens is empty in first_token_index! token_metadata - sorted_tokens is empty in first_token_index! token_metadata - sorted_tokens is empty in first_token_index! token_metadata - sorted_tokens is empty in first_token_index! storage_proxy - Failed to apply mutation from $ip#4: std::runtime_error (sorted_tokens is empty in first_token_index!) The code path looks like below: 0 storage_service::init_server 1 prepare_to_join() 2 add gossip application state of NET_VERSION, SCHEMA and so on. 3 _gossiper.start_gossiping().get() 4 join_token_ring() 5 _token_metadata.update_normal_tokens(tokens, get_broadcast_address()); 6 replicate_to_all_cores().get() 7 storage_service::set_gossip_tokens() which adds the gossip application state of TOKENS and STATUS The race described above is between line 3 and line 6. To fix, we can replicate the token_metadata early after it is filled with the tokens read from system table before gossip starts. So that when other nodes think this restarting node is up, the tokens are already replicated to all the shards. In addition, this patch also fixes the issue that other nodes might see a node missing the TOKENS and STATUS application state in gossip if that node failed in the middle of a restarting process, i.e., it is killed after line 3 and before line 7. As a result we could not replace the node. 
Tests: update_cluster_layout_tests.py Fixes: #4709 Fixes: #4723 (cherry picked from commit `3b39a59135`)	2019-09-22 12:46:36 +03:00
Avi Kivity	1c40a0fcd2	Update seastar submodule * seastar ea859b5840...af3fc691b9 (1): > iotune: fix exception handling in case test file creation fails Fixes #5001.	2019-09-18 18:37:23 +03:00
Gleb Natapov	e10735852b	messaging_service: configure different streaming domain for each rpc server A streaming domain identifies a server across shards. Each server should have different one. Fixes: #4953 Message-Id: <20190908085327.GR21540@scylladb.com> (cherry picked from commit `9e9f64d90e`)	2019-09-09 20:37:40 +03:00
Avi Kivity	42433a25a8	Update seastar submodule * seastar 445b5126c2...ea859b5840 (1): > perftune: fix missing import for logging Fixes #4958.	2019-09-04 13:50:29 +03:00
Paweł Dziepak	d04d3fa653	mutation_partition: verify row::append_cell() precondition row::append_cell() has a precondition that the new cell column id needs to be larger than that of any other already existing cell. If this precondition is violated the row will end up in an invalid state. This patch adds assertion to make sure we fail early in such cases. (cherry picked from commit `060e3f8ac2`)	2019-08-23 15:06:18 +02:00
Avi Kivity	1bcc5a1b5c	Merge "database: assign proper io priority for streaming view updates" from Piotr " Streamed view updates parasitized on writing io priority, which is reserved for user writes - it's now properly bound to streaming write priority. Verified manually by checking appropriate io metrics: scylla_io_queue_total_bytes{class="streaming_write" ...} vs scylla_io_queue_total_bytes{class="query" ...} Tests: unit(dev) " Fixes #4615. * 'assign_proper_io_priority_to_streaming_view_updates' of https://github.com/psarna/scylla: db,view: wrap view update generation in stream scheduling group database: assign proper io priority for streaming view updates (cherry picked from commit `2c7435418a`)	2019-08-22 16:21:42 +03:00
Botond Dénes	450b9ac9bf	multishard_combining_reader: shard reader: don't stop on non-full prefixes This patch is a backport of the fix for #4733 (merged to master as `0cf4fab`). As the shard reader code has been substantially refactored post the 3.0 branch cut time, that fix cannot be backported at all; instead this is a separate fix developed specially for 3.0. To quickly reiterate, the problem at hand is that when recreating a previously evicted shard reader of a multishard reader, the position of the last fragment seen by that reader is used as the position after which the read resumes. For this we just created a clustering range starting from after the key (open bound). This works well in most cases but when that last key is a non-full prefix this will also ignore any still unread clustering rows that fall into that prefix. This patch doesn't attempt to fix the problem in a systematic way like the fix in master does, making sure reader recreation works properly with prefixes as well; instead, for the sake of minimizing the impact, we simply avoid ending the buffer on a prefix key. This fix is more naive and can cause over-read when the stream contains lots of successive range tombstones with prefix positions. On the other hand, this leads to a much simpler fix, and anyway, as reader eviction is much rarer in 3.0 this should have a lesser impact. A unit test is also added to make sure the problem is fixed. Signed-off-by: Botond Dénes <bdenes@scylladb.com> Message-Id: <20190819120748.28168-1-bdenes@scylladb.com>	2019-08-19 15:09:47 +03:00
Jenkins	b3bfd8c08d	release: prepare for 3.0.10 by hagitsegev scylla-3.0.10	2019-08-14 14:58:50 -04:00
Tomasz Grabiec	53c10b72dc	Merge "Fix the system.size_estimates table" from Kamil Fixes a segfault when querying for an empty keyspace. Also, fixes an infinite loop on smp > 1. Queries to system.size_estimates table which are not single-partition queries caused Scylla to go into an infinite loop inside multishard_combining_reader::fill_buffer. This happened because multishard_combining_reader assumes that shards return rows belonging to separate partitions, which was not the case for size_estimates_mutation_reader. Fixes #4689	2019-08-14 15:31:54 +02:00
Kamil Braun	a690e20966	Fix infinite looping when performing a range query on system.size_estimates. Queries to system.size_estimates table which are not single-partition queries caused Scylla to go into an infinite loop inside multishard_combining_reader::fill_buffer. This happened because multishard_combining_reader assumes that shards return rows belonging to separate partitions, which was not the case for size_estimates_mutation_reader. This commit fixes the issue and closes #4689.	2019-08-14 12:51:33 +02:00
Kamil Braun	7172009a0d	Fix segmentation fault when querying system.size_estimates for an empty keyspace.	2019-08-14 12:51:33 +02:00
Kamil Braun	cb688ef62e	Refactor size_estimates_virtual_reader Move the implementation of size_estimates_mutation_reader to a separate compilation unit to speed up compilation times and increase readability. Refactor tests to use seastar::thread.	2019-08-14 12:51:27 +02:00
Kamil Braun	ff8265dd66	Fix command line argument parsing in main. Command line arguments are parsed twice in Scylla: once in main and once in Seastar's app_template::run. The first parse is there to check if the "--version" flag is present --- in this case the version is printed and the program exits. The second parsing is correct; however, most of the arguments were improperly treated as positional arguments during the first parsing (e.g., "--network host" would treat "host" as a positional argument). This happened because the arguments weren't known to the command line parser. This commit fixes the issue by moving the parsing code until after the arguments are registered. Resolves #4141. Signed-off-by: Kamil Braun <kbraun@scylladb.com> (cherry picked from commit `f155a2d334`)	2019-08-13 20:13:24 +03:00
Avi Kivity	a198db31dc	Merge "Fix disable_sstable_write synchronization with on_compaction_completion" from Benny " disable_sstable_write needs to acquire _sstable_deletion_sem to properly synchronize with background deletions done by on_compaction_completion to ensure no sstables will be created or deleted during reshuffle_sstables after storage_service::load_new_sstables disables sstable writes. Fixes #4622 Test: unit(dev), nodetool_additional_test.py migration_test.py " * 'fix-disable-sstable-write-for-3.0' of https://github.com/bhalevy/scylla: table: document _sstables_lock/_sstable_deletion_sem locking order table: disable_sstable_write: acquire _sstable_deletion_sem table: uninline enable_sstable_write	2019-08-12 16:53:47 +03:00
Avi Kivity	094a2a4263	Merge "Catch unclosed partition sstable write #4794 " from Tomasz " Not emitting partition_end for a partition is incorrect. SStable writer assumes that it is emitted. If it's not, the sstable will not be written correctly. The partition index entry for the last partition will be left partially written, which will result in errors during reads. Also, statistics and sstable key ranges will not include the last partition. It's better to catch this problem at the time of writing, and not generate bad sstables. Another way of handling this would be to implicitly generate a partition_end, but I don't think that we should do this. We cannot trust the mutation stream when invariants are violated, we don't know if this was really the last partition which was supposed to be written. So it's safer to fail the write. Enabled for both mc and la/ka. Passing --abort-on-internal-error on the command line will switch to aborting instead of throwing an exception. The reason we don't abort by default is that it may bring the whole cluster down and cause unavailability, while it may not be necessary to do so. It's safer to fail just the affected operation, e.g. repair. However, failing the operation with an exception leaves little information for debugging the root cause. So the idea is that the user would enable aborts on only one of the nodes in the cluster to get a core dump and not bring the whole cluster down. " * 'catch-unclosed-partition-sstable-write' of https://github.com/tgrabiec/scylla: sstables: writer: Validate that partition is closed when the input mutation stream ends config, exceptions: Add helper for handling internal errors utils: config_file: Introduce named_value::observe() (cherry picked from commit `95c0804731`) (cherry picked from commit `cf4c238b28`)	2019-08-08 16:47:26 +03:00
Asias He	cc0b4d249b	streaming: Send error code from the sender to receiver In case of error on the sender side, the sender does not propagate the error to the receiver. The sender will close the stream. As a result, the receiver will get nullopt from the source in get_next_mutation_fragment and pass mutation_fragment_opt with no value to the generating_reader. In turn, the generating_reader generates end of stream. However, the last element that the generating_reader has generated can be any type of mutation_fragment. This makes the sstable that consumes the generating_reader violate the mutation_fragment stream rule. To fix, we need to propagate the error. However, RPC streaming does not support propagating errors in the framework. The user has to send an error code explicitly. Fixes: #4789 (cherry picked from commit `bac987e32a`) streaming: Move stream_mutation_fragments_cmd to a new file Avoid including the stream_session.hh in messaging_service.hh. More importantly, fix the build because currently messaging_service.cc and messaging_service.hh do not include stream_mutation_fragments_cmd. I am not sure why it builds on my machine. Spotted this when backporting the change to 3.0 branch. Refs: #4789 (cherry picked from commit `49a73aa2fc`) streaming: Do not call rpc stream flush in send_mutation_fragments The stream close() guarantees the data sent will be flushed. No need to call the stream flush() since the stream is not reused. Follow up fix for commit `bac987e32a` (streaming: Send error code from the sender to receiver). Fixes: #4789 (cherry picked from commit `288371ce75`) Message-Id: <87058e290ae3f59f874b860121786b22f24957c7.1565189319.git.asias@scylladb.com>	2019-08-08 11:41:25 +02:00
Asias He	e10afc7f50	messaging_service: Check if messaging_service is stopped before get_rpc_client get_rpc_client assumes the messaging_service is not stopped. We should check is_stopping() before we call get_rpc_client. We do such check in existing code, e.g., send_message and friends. Do the same check in the newly introduced make_sink_and_source_for_stream_mutation_fragments() and friends for row level repair. Fixes: #4767 (cherry picked from commit `5d3e4d7b73`) Note: only the change for make_sink_and_source_for_stream_mutation_fragments is backported. Message-Id: <06079d4e48ea81ba567a2f45be2ab3a51f042e28.1565189319.git.asias@scylladb.com>	2019-08-08 11:40:49 +02:00
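
The unsigned delta arithmetic described in commit ea8f8ab7a3 above (sstables: mc: prevent signed integer overflow) can be sketched roughly as follows. This is an illustration only: the helper names encode_delta, decode_delta and round_trips are hypothetical and are not taken from the Scylla sources. The final conversion back to a signed type is modular in C++20 and implementation-defined (two's complement in practice) in earlier standards.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers for illustration; not Scylla's actual API.

// Subtract in unsigned arithmetic. Wraparound is well defined for unsigned
// types, so -fsanitize=undefined reports nothing even when value < base.
inline uint64_t encode_delta(int64_t value, int64_t base) {
    return static_cast<uint64_t>(value) - static_cast<uint64_t>(base);
}

// Add in unsigned arithmetic, then convert the result to the signed type.
inline int64_t decode_delta(uint64_t delta, int64_t base) {
    return static_cast<int64_t>(static_cast<uint64_t>(base) + delta);
}

// Both steps work modulo 2^64, so encoding followed by decoding with the
// same base returns the original value even when the delta wrapped around.
inline bool round_trips(int64_t value, int64_t base) {
    return decode_delta(encode_delta(value, base), base) == value;
}

// Small self-check covering a value below the base and the extreme pair.
inline void check_delta_examples() {
    assert(round_trips(5, 10));
    assert(round_trips(INT64_MIN, INT64_MAX));
}
```

The same subtraction written directly on int64_t would be undefined behaviour for the extreme pair, which is what the sanitizer reports in debug builds.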

16973 Commits
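
The generation drift check from commit ee2255a189 above (gossip: Fix max generation drift measure) can be sketched as below. This is a rough sketch under stated assumptions: the function names are hypothetical, generations are assumed to be seconds since the epoch, and the one-year constant is an assumed value, since the commit message only says "more than 1 year".

```cpp
#include <cassert>
#include <cstdint>

// Assumed value: one year in seconds. The commit message only says "1 year".
constexpr int64_t max_generation_difference = 86400LL * 365;

// Check as described for the old behaviour: the remote generation is compared
// with a generation that was fixed when the cluster started (g2 in the commit).
inline bool rejected_by_old_check(int64_t remote_generation, int64_t reference_generation) {
    return remote_generation > reference_generation + max_generation_difference;
}

// Check as described for the fix: compare with the generation this node would
// get if it were restarted now, assumed here to be the current time in seconds.
inline bool rejected_by_new_check(int64_t remote_generation, int64_t now_seconds) {
    return remote_generation > now_seconds + max_generation_difference;
}
```

With a cluster that has been up for two years, a peer rebooting with a generation equal to the current time is rejected by the first function and accepted by the second.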