scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-05-31 03:56:42 +00:00

Author	SHA1	Message	Date
Rafael Ávila de Espíndola	0f2f0d65d7	configure: Reduce the dynamic linker path size gdb has a SO_NAME_MAX_PATH_SIZE of 512, so we use that as the path size. Fixes: #6494 Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com> Message-Id: <20200528202741.398695-2-espindola@scylladb.com> (cherry picked from commit `aa778ec152`)	2020-06-21 12:29:16 +03:00
Tomasz Grabiec	31c2f8a3ae	row_cache: Fix undefined behavior on key linearization This is relevant only when using partition or clustering keys which have a representation in memory which is larger than 12.8 KB (10% of LSA segment size). There are several places in code (cache, background garbage collection) which may need to linearize keys because of performing key comparison, but it's not done safely: 1) the code does not run with the LSA region locked, so pointers may get invalidated on linearization if it needs to reclaim memory. This is fixed by running the code inside an allocating section. 2) LSA region is locked, but the scope of with_linearized_managed_bytes() encloses the allocating section. If allocating section needs to reclaim, linearization context will contain invalidated pointers. The fix is to reorder the scopes so that linearization context lives within an allocating section. Example of 1 can be found in range_populating_reader::handle_end_of_stream() where it performs a lookup: auto prev = std::prev(it); if (prev->key().equal(_cache._schema, _last_key->_key)) { it->set_continuous(true); but handle_end_of_stream() is not invoked under allocating section. Example of 2 can be found in mutation_cleaner_impl::merge_some() where it does: return with_linearized_managed_bytes([&] { ... return _worker_state->alloc_section(region, [&] { Fixes #6637. Refs #6108. Tests: - unit (all) Message-Id: <1592218544-9435-1-git-send-email-tgrabiec@scylladb.com> (cherry picked from commit `e81fc1f095`)	2020-06-21 11:58:59 +03:00
Yaron Kaikov	ec12331f11	release: prepare for 3.3.4 scylla-3.3.4	2020-06-15 21:19:02 +03:00
Avi Kivity	ccc463b5e5	tools: toolchain: regenerate for gnutls 3.6.14 CVE-2020-13777. Fixes #6627. Toolchain source image registry disambiguated due to tighter podman defaults.	2020-06-15 08:05:58 +03:00
Calle Wilund	4a9676f6b7	gms::inet_address: Fix sign extension error in custom address formatting Fixes #5808 Seems some gcc:s will generate the code as sign extending. Mine does not, but this should be more correct anyhow. Added small stringify test to serialization_test for inet_address (cherry picked from commit `a14a28cdf4`)	2020-06-09 20:16:50 +03:00
Takuya ASADA	aaf4989c31	aws: update enhanced networking supported instance list Sync enhanced networking supported instance list to latest one. Reference: https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/enhanced-networking.html Fixes #6540 (cherry picked from commit `969c4258cf`)	2020-06-09 16:03:00 +03:00
Asias He	b29f954f20	gossip: Make is_safe_for_bootstrap more strict Consider 1. Start n1, n2 in the cluster 2. Stop n2 and delete all data for n2 3. Start n2 to replace itself with replace_address_first_boot: n2 4. Kill n2 before n2 finishes the replace operation 5. Remove replace_address_first_boot: n2 from scylla.yaml of n2 6. Delete all data for n2 7. Start n2 At step 7, n2 will be allowed to bootstrap as a new node, because the application state of n2 in the cluster is HIBERNATE which is not rejected in the check of is_safe_for_bootstrap. As a result, n2 will replace n2 with different tokens and a different host_id, as if the old n2 node was removed from the cluster silently. Fixes #5172 (cherry picked from commit `cdcedf5eb9`)	2020-05-25 14:30:53 +03:00
Eliran Sinvani	5546d5df7b	Auth: return correct error code when role is not found Scylla returns the wrong error code (0000 - server internal error) in response to trying to do authentication/authorization operations that involve a non-existing role. This commit changes those cases to return error code 2200 (invalid query) which is the correct one and also the one that Cassandra returns. Tests: Unit tests (Dev) All auth and auth_role dtests (cherry picked from commit ce8cebe34801f0ef0e327a32f37442b513ffc214) Fixes #6363.	2020-05-25 12:58:38 +03:00
Amnon Heiman	541c29677f	storage_service: get_range_to_address_map prevent use after free The implementation of get_range_to_address_map has a default behaviour, when getting an empty keyspace, it uses the first non-system keyspace (first here is basically, just a keyspace). The current implementation has two issues, first, it uses a reference to a string that is held on a stack of another function. In other words, there's a use after free that is not clear why we never hit. The second, it calls get_non_system_keyspaces twice. Though this is not a bug, it's redundant (get_non_system_keyspaces uses a loop, so calling that function does have a cost). This patch solves both issues, by changing the implementation to hold a string instead of a reference to a string. Second, it stores the results from get_non_system_keyspaces and reuses them, which is more efficient and holds the returned values on the local stack. Fixes #6465 Signed-off-by: Amnon Heiman <amnon@scylladb.com> (cherry picked from commit `69a46d4179`)	2020-05-25 12:48:48 +03:00
Hagit Segev	06f18108c0	release: prepare for 3.3.3 scylla-3.3.3	2020-05-24 23:28:07 +03:00
Tomasz Grabiec	90002ca3d2	sstables: index_reader: Fix overflow when calculating promoted index end When index file is larger than 4GB, offset calculation will overflow uint32_t and _promoted_index_end will be too small. As a result, promoted_index_size calculation will underflow and the rest of the page will be interpreted as a promoted index. The partitions which are in the remainder of the index page will not be found by single-partition queries. Data is not lost. Introduced in `6c5f8e0eda`. Fixes #6040 Message-Id: <20200521174822.8350-1-tgrabiec@scylladb.com> (cherry picked from commit `a6c87a7b9e`)	2020-05-24 09:46:11 +03:00
Rafael Ávila de Espíndola	da23902311	repair: Make sure sinks are always closed In a recent next failure I got the following backtrace function=function@entry=0x270360 "seastar::rpc::sink_impl<Serializer, Out>::~sink_impl() [with Serializer = netw::serializer; Out = {repair_row_on_wire_with_cmd}]") at assert.c:101 at ./seastar/include/seastar/core/shared_ptr.hh:463 at repair/row_level.cc:2059 This patch changes a few functions to use finally to make sure the sink is always closed. Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com> Message-Id: <20200515202803.60020-1-espindola@scylladb.com> (cherry picked from commit `311fbe2f0a`) Ref #6414	2020-05-20 09:00:57 +03:00
Asias He	2b0dc21f97	repair: Fix race between write_end_of_stream and apply_rows Consider: n1, n2, n1 is the repair master, n2 is the repair follower. === Case 1 === 1) n1 sends missing rows {r1, r2} to n2 2) n2 runs apply_rows_on_follower to apply rows, e.g., {r1, r2}, r1 is written to sstable, r2 is not written yet, r1 belongs to partition 1, r2 belongs to partition 2. It yields after row r1 is written. data: partition_start, r1 3) n1 sends repair_row_level_stop to n2 because error has happened on n1 4) n2 calls wait_for_writer_done() which in turn calls write_end_of_stream() data: partition_start, r1, partition_end 5) Step 2 resumes to apply the rows. data: partition_start, r1, partition_end, partition_end, partition_start, r2 === Case 2 === 1) n1 sends missing rows {r1, r2} to n2 2) n2 runs apply_rows_on_follower to apply rows, e.g., {r1, r2}, r1 is written to sstable, r2 is not written yet, r1 belongs to partition 1, r2 belongs to partition 2. It yields after partition_start for r2 is written but before _partition_opened is set to true. data: partition_start, r1, partition_end, partition_start 3) n1 sends repair_row_level_stop to n2 because error has happened on n1 4) n2 calls wait_for_writer_done() which in turn calls write_end_of_stream(). Since _partition_opened[node_idx] is false, partition_end is skipped, end_of_stream is written. data: partition_start, r1, partition_end, partition_start, end_of_stream This causes unbalanced partition_start and partition_end in the stream written to sstables. To fix, serialize the write_end_of_stream and apply_rows with a semaphore. Fixes: #6394 Fixes: #6296 Fixes: #6414 (cherry picked from commit `b2c4d9fdbc`)	2020-05-20 08:22:05 +03:00
Piotr Dulikowski	b544691493	hinted handoff: don't keep positions of old hints in rps_set When sending hints from one file, rps_set field in send_one_file_ctx keeps track of commitlog positions of hints that are being currently sent, or have failed to be sent. At the end of the operation, if sending of some hints failed, we will choose position of the earliest hint that failed to be sent, and will retry sending that file later, starting from that position. This position is stored in _last_not_complete_rp. Usually, this set has a bounded size, because we impose a limit of at most 128 hints being sent concurrently. Because we do not attempt to send any more hints after a failure is detected, rps_set should not have more than 128 elements at a time. Due to a bug, commitlog positions of old hints (older than gc_grace_seconds of the destination table) were inserted into rps_set but not removed after checking their age. This could cause rps_set to grow very large when replaying a file with old hints. Moreover, if the file mixed expired and non-expired hints (which could happen if it had hints to two tables with different gc_grace_seconds), and sending of some non-expired hints failed, then positions of expired hints could influence the calculation of _last_not_complete_rp, and more hints than necessary would be resent on the next retry. This simple patch removes commitlog position of a hint from rps_set when it is detected to be too old. Fixes #6422 (cherry picked from commit `85d5c3d5ee`)	2020-05-20 08:06:17 +03:00
Piotr Dulikowski	d420b06844	hinted handoff: remove discarded hint positions from rps_set Related commit: `85d5c3d` When attempting to send a hint, an exception might occur that results in that hint being discarded (e.g. keyspace or table of the hint was removed). When such an exception is thrown, position of the hint will already be stored in rps_set. We are only allowed to retain positions of hints that failed to be sent and needed to be retried later. Dropping a hint is not an error, therefore its position should be removed from rps_set - but current logic does not do that. Because of that bug, hint files with many discardable hints might cause rps_set to grow large when the file is replayed. Furthermore, leaving positions of such hints in rps_set might cause more hints than necessary to be re-sent if some non-discarded hints fail to be sent. This commit fixes the problem by removing positions of discarded hints from rps_set. Fixes #6433 (cherry picked from commit `0c5ac0da98`)	2020-05-20 08:04:10 +03:00
Avi Kivity	b3a2cb2f68	Update seastar submodule * seastar 0ebd89a858...30f03aeba9 (1): > timer: add scheduling_group awareness Fixes #6170.	2020-05-10 18:39:20 +03:00
Hagit Segev	c8c057f5f8	release: prepare for 3.3.2 scylla-3.3.2	2020-05-10 18:16:28 +03:00
Gleb Natapov	038bfc925c	storage_proxy: limit read repair only to replicas that answered during speculative reads Speculative reader has more targets than needed for CL. In case there is a digest mismatch the repair runs between all of them, but that violates provided CL. The patch makes it so that repair runs only between replicas that answered (there will be CL of them). Fixes #6123 Reviewed-by: Glauber Costa <glauber@scylladb.com> Message-Id: <20200402132245.GA21956@scylladb.com> (cherry picked from commit `36a24bbb70`)	2020-05-07 19:48:37 +03:00
Mike Goltsov	13a4e7db83	fix error in fstrim service (scylla_util.py) On a CentOS 7 machine, fstrim.timer is not enabled, only unmasked, due to scylla_fstrim_setup on installation. When trying to run the scylla-fstrim service manually you get an error: Traceback (most recent call last): File "/opt/scylladb/scripts/libexec/scylla_fstrim", line 60, in <module> main() File "/opt/scylladb/scripts/libexec/scylla_fstrim", line 44, in main cfg = parse_scylla_dirs_with_default(conf=args.config) File "/opt/scylladb/scripts/scylla_util.py", line 484, in parse_scylla_dirs_with_default if key not in y or not y[k]: NameError: name 'k' is not defined It is caused by an error in scylla_util.py Fixes #6294. (cherry picked from commit `068bb3a5bf`)	2020-05-07 19:45:50 +03:00
Juliusz Stasiewicz	727d6cf8f3	atomic_cell: special rule for printing counter cells Until now, attempts to print counter update cell would end up calling abort() because `atomic_cell_view::value()` has no specialized visitor for `imr::pod<int64_t>::basic_view<is_mutable>`, i.e. counter update IMR type. Such visitor is not easy to write if we want to intercept counters only (and not all int64_t values). Anyway, linearized byte representation of counter cell would not be helpful without knowing if it consists of counter shards or counter update (delta) - and this must be known upon `deserialize`. This commit introduces simple approach: it determines cell type on high level (from `atomic_cell_view`) and prints counter contents by `counter_cell_view` or `atomic_cell_view::counter_update_value()`. Fixes #5616 (cherry picked from commit `0ea17216fe`)	2020-05-07 19:40:47 +03:00
Tomasz Grabiec	6d6d7b4abe	sstables: Release reserved space for sharding metadata The intention of the code was to clear sharding metadata chunked_vector so that it doesn't bloat memory. The type of c is `chunked_vector*`. Assigning `{}` clears the pointer while the intended behavior was to reset the `chunked_vector` instance. The original instance is left unmodified with all its reserved space. Because of this, the previous fix had no effect because token ranges are stored entirely inline and popping them doesn't release memory. Fixes #4951 Tests: - sstable_mutation_test (dev) - manual using scylla binary on customer data on top of 2019.1.5 Reviewed-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <1584559892-27653-1-git-send-email-tgrabiec@scylladb.com> (cherry picked from commit `5fe626a887`)	2020-05-07 19:06:22 +03:00
Tomasz Grabiec	28f974b810	Merge "Don't return stale data by properly invalidating row cache after cleanup" from Raphael Row cache needs to be invalidated whenever data in sstables changes. Cleanup removes data from sstables which doesn't belong to the node anymore, which means cache must be invalidated on cleanup. Currently, stale data can be returned when a node re-owns ranges whose data is still stored in the node's row cache, because cleanup didn't invalidate the cache. Fixes #4446. tests: - unit tests (dev mode) - dtests: update_cluster_layout_tests.py:TestUpdateClusterLayout.simple_decommission_node_2_test cleanup_test.py (cherry picked from commit `d0b6be0820`)	2020-05-07 16:24:51 +03:00
Piotr Sarna	5fdadcaf3b	network_topology_strategy: validate integers In order to prevent users from creating a network topology strategy instance with invalid inputs, it's not enough to use std::stol() on the input: a string "3abc" still returns the number '3', but will later confuse cqlsh and other drivers, when they ask for topology strategy details. The error message is now more human readable, since for incorrect numeric inputs it used to return a rather cryptic message: ServerError: stol() This commit fixes the issue and comes with a simple test. Fixes #3801 Tests: unit(dev) Message-Id: <7aaae83d003738f047d28727430ca0a5cec6b9c6.1583478000.git.sarna@scylladb.com> (cherry picked from commit `5b7a35e02b`)	2020-05-07 16:24:49 +03:00
Pekka Enberg	a960394f27	scripts/jobs: Keep memory reserve when calculating parallelism The "jobs" script is used to determine the amount of compilation parallelism on a machine. It attempts to ensure each GCC process has at least 4 GB of memory per core. However, in the worst case scenario, we could end up having the GCC processes take up all the system memory, forcing swapping or OOM killer to kick in. For example, on a 4 core machine with 16 GB of memory, this worst case scenario seems easy to trigger in practice. Fix up the problem by keeping a 1 GB memory reserve for other processes and calculating parallelism based on that. Message-Id: <20200423082753.31162-1-penberg@scylladb.com> (cherry picked from commit `7304a795e5`)	2020-05-04 19:01:54 +03:00
Piotr Sarna	3216a1a70a	alternator: fix signature timestamps Generating timestamps for auth signatures used a non-thread-safe ::gmtime function instead of thread-safe ::gmtime_r. Tests: unit(dev) Fixes #6345 (cherry picked from commit `fb7fa7f442`)	2020-05-04 17:08:13 +03:00
Avi Kivity	5a7fd41618	Merge 'Fix hang in multishard_writer' from Asias " This series fixes a hang in multishard_writer when an error happens. It contains - multishard_writer: Abort the queue attached to consumers when producer fails - repair: Fix hang when the writer is dead Fixes #6241 Refs: #6248 " * asias-stream_fix_multishard_writer_hang: repair: Fix hang when the writer is dead mutation_writer_test: Add test_multishard_writer_producer_aborts multishard_writer: Abort the queue attached to consumers when producer fails (cherry picked from commit `8925e00e96`)	2020-05-01 20:13:00 +03:00
Raphael S. Carvalho	dd24ba7a62	api/service: fix segfault when taking a snapshot without keyspace specified If no keyspace is specified when taking snapshot, there will be a segfault because keynames is unconditionally dereferenced. Let's return an error because a keyspace must be specified when column families are specified. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <20200427195634.99940-1-raphaelsc@scylladb.com> (cherry picked from commit `02e046608f`) Fixes #6336.	2020-04-30 12:57:14 +03:00
Avi Kivity	204f6dd393	Update seastar submodule * seastar a0bdc6cd85...0ebd89a858 (1): > http server: fix "Date" header format Fixes #6253.	2020-04-26 19:31:44 +03:00
Nadav Har'El	b1278adc15	alternator: unzero "scylla_alternator_total_operations" metric In commit `388b492040`, which was only supposed to move around code, we accidentally lost the line which does _executor.local()._stats.total_operations++; So after this commit this counter was always zero... This patch returns the line incrementing this counter. Arguably, this counter is not very important - a user can also calculate this number by summing up all the counters in the scylla_alternator_operation array (these are counters for individual types of operations). Nevertheless, as long as we do export a "scylla_alternator_total_operations" metric, we need to correctly calculate it and can't leave it zero :-) Fixes #5836 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20200219162820.14205-1-nyh@scylladb.com> (cherry picked from commit `b8aed18a24`)	2020-04-19 19:07:31 +03:00
Botond Dénes	ee9677ef71	schema: schema(): use std::stable_sort() to sort key columns When multiple key columns (clustering or partition) are passed to the schema constructor, all having the same column id, the expectation is that these columns will retain the order in which they were passed to `schema_builder::with_column()`. Currently however this is not guaranteed as the schema constructor sorts key columns by column id with `std::sort()`, which doesn't guarantee that equally comparing elements retain their order. This can be an issue for indexes, the schemas of which are built independently on each node. If there is any room for variance in the key column order, this can result in different nodes having incompatible schemas for the same index. The fix is to use `std::stable_sort()` which guarantees that the order of equally comparing elements won't change. This is a suspected cause of #5856, although we don't have hard proof. Fixes: #5856 Signed-off-by: Botond Dénes <bdenes@scylladb.com> [avi: upgraded "Refs" to "Fixes", since we saw that std::sort() becomes unstable at 17 elements, and the failing schema had a clustering key with 23 elements] Message-Id: <20200417121848.1456817-1-bdenes@scylladb.com> (cherry picked from commit `a4aa753f0f`)	2020-04-19 18:19:05 +03:00
Nadav Har'El	2060e361cf	materialized views: fix corner case of view updates used by Alternator While CQL does not allow creation of a materialized view with more than one base regular column in the view's key, in Alternator we do allow this - both partition and clustering key may be a base regular column. We had a bug in the logic handling this case: If the new base row is missing a value for one of the view key columns, we shouldn't create a view row. Similarly, if the existing base row was missing a value for one of the view key columns, a view row does not exist and doesn't need to be deleted. This was done incorrectly, and made decisions based on just one of the key columns, and the logic is now fixed (and I think, simplified) in this patch. With this patch, the Alternator test which previously failed because of this problem now passes. The patch also includes new tests in the existing C++ unit test test_view_with_two_regular_base_columns_in_key. This test was already supposed to be testing various cases of two-new-key-columns updates, but missed the cases explained above. These new tests failed badly before this patch - some of them had clean write errors, others caused crashes. With this patch, they pass. Fixes #6008. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20200312162503.8944-1-nyh@scylladb.com> (cherry picked from commit `635e6d887c`)	2020-04-19 15:24:19 +03:00
Hagit Segev	6f939ffe19	release: prepare for 3.3.1 scylla-3.3.1	2020-04-18 00:23:31 +03:00
Kamil Braun	69105bde8a	sstables: freeze types nested in collection types in legacy sstables Some legacy `mc` SSTables (created in Scylla 3.0) may contain incorrect serialization headers, which don't wrap frozen UDTs nested inside collections with the FrozenType<...> tag. When reading such SSTable, Scylla would detect a mismatch between the schema saved in schema tables (which correctly wraps UDTs in the FrozenType<...> tag) and the schema from the serialization header (which doesn't have these tags). SSTables created in Scylla versions 3.1 and above, in particular in Scylla versions that contain this commit, create correct serialization headers (which wrap UDTs in the FrozenType<...> tag). This commit does two things: 1. for all SSTables created after this commit, include a new feature flag, CorrectUDTsInCollections, presence of which implies that frozen UDTs inside collections have the FrozenType<...> tag. 2. when reading a Scylla SSTable without the feature flag, we assume that UDTs nested inside collections are always frozen, even if they don't have the tag. This assumption is safe to be made, because at the time of this commit, Scylla does not allow non-frozen (multi-cell) types inside collections or UDTs, and because of point 1 above. There is one edge case not covered: if we don't know whether the SSTable comes from Scylla or from C*. In that case we won't make the assumption described in 2. Therefore, if we get a mismatch between schema and serialization headers of a table which we couldn't confirm to come from Scylla, we will still reject the table. If any user encounters such an issue (unlikely), we will have to use another solution, e.g. using a separate tool to rewrite the SSTable. Fixes #6130. (cherry picked from commit `3d811e2f95`)	2020-04-17 09:12:28 +03:00
Kamil Braun	e09e9a5929	sstables: move definition of column_translation::state::build to a .cc file Ref #6130	2020-04-17 09:12:28 +03:00
Piotr Sarna	2308bdbccb	alternator: use partition tombstone if there's no clustering key As @tgrabiec helpfully pointed out, creating a row tombstone for a table which does not have a clustering key in its schema creates something that looks like an open-ended range tombstone. That's problematic for KA/LA sstable formats, which are incapable of writing such tombstones, so a workaround is provided in order to allow using KA/LA in alternator. Fixes #6035 Cherry-picked from `0a2d7addc0`	2020-04-16 12:14:10 +02:00
Asias He	a2d39c9a2e	gossip: Add an option to force gossip generation Consider 3 nodes in the cluster, n1, n2, n3 with gossip generation number g1, g2, g3. n1, n2, n3 running scylla version with commit `0a52ecb6df` (gossip: Fix max generation drift measure) One year later, the user wants to upgrade n1, n2, n3 to a new version. When n3 does a rolling restart with a new version, n3 will use a generation number g3'. Because g3' - g2 > MAX_GENERATION_DIFFERENCE and g3' - g1 > MAX_GENERATION_DIFFERENCE, n1 and n2 will reject n3's gossip update and mark n3 as down. Such unnecessary marking of node down can cause availability issues. For example: DC1: n1, n2 DC2: n3, n4 When n3 and n4 restart, n1 and n2 will mark n3 and n4 as down, which causes the whole DC2 to be unavailable. To fix, we can start the node with a gossip generation within MAX_GENERATION_DIFFERENCE difference for the new node. Once all the nodes run the version with commit `0a52ecb6df`, the option is no longer needed. Fixes #5164 (cherry picked from commit `743b529c2b`)	2020-03-27 12:49:23 +01:00
Asias He	5fe2ce3bbe	gossiper: Always use the new generation number User reported an issue that after a node restart, the restarted node is marked as DOWN by other nodes in the cluster while the node is up and running normally. Consider the following: - n1, n2, n3 in the cluster - n3 shutdown itself - n3 send shutdown verb to n1 and n2 - n1 and n2 set n3 in SHUTDOWN status and force the heartbeat version to INT_MAX - n3 restarts - n3 sends gossip shadow rounds to n1 and n2, in storage_service::prepare_to_join, - n3 receives response from n1, in gossiper::handle_ack_msg, since _enabled = false and _in_shadow_round == false, n3 will apply the application state in fiber1, fiber 1 finishes faster than fiber 2, it sets _in_shadow_round = false - n3 receives response from n2, in gossiper::handle_ack_msg, since _enabled = false and _in_shadow_round == false, n3 will apply the application state in fiber2, fiber 2 yields - n3 finishes the shadow round and continues - n3 resets gossip endpoint_state_map with gossiper.reset_endpoint_state_map() - n3 resumes fiber 2, apply application state about n3 into endpoint_state_map, at this point endpoint_state_map contains information including n3 itself from n2. - n3 calls gossiper.start_gossiping(generation_number, app_states, ...) with new generation number generated correctly in storage_service::prepare_to_join, but in maybe_initialize_local_state(generation_nbr), it will not set new generation and heartbeat if the endpoint_state_map contains itself - n3 continues with the old generation and heartbeat learned in fiber 2 - n3 continues the gossip loop, in gossiper::run, hbs.update_heart_beat() the heartbeat is set to the number starting from 0. - n1 and n2 will not get update from n3 because they use the same generation number but n1 and n2 has larger heartbeat version - n1 and n2 will mark n3 as down even if n3 is alive. To fix, always use the new generation number. Fixes: #5800 Backports: 3.0 3.1 3.2 (cherry picked from commit `62774ff882`)	2020-03-27 12:49:20 +01:00
Piotr Sarna	aafa34bbad	cql: fix qualifying indexed columns for filtering When qualifying columns to be fetched for filtering, we also check if the target column is not used as an index - in which case there's no need of fetching it. However, the check was incorrectly assuming that any restriction is eligible for indexing, while it's currently only true for EQ. The fix makes a more specific check and contains many dynamic casts, but these will hopefully be gone once our long planned "restrictions rewrite" is done. This commit comes with a test. Fixes #5708 Tests: unit(dev) (cherry picked from commit `767ff59418`)	2020-03-22 09:00:51 +01:00
Hagit Segev	7ae2cdf46c	release: prepare for 3.3.0 scylla-3.3.0	2020-03-19 21:46:44 +02:00
Hagit Segev	863f88c067	release: prepare for 3.3.rc3 scylla-3.3.rc3	2020-03-15 22:45:30 +02:00
Avi Kivity	90b4e9e595	Update seastar submodule * seastar f54084c08f...a0bdc6cd85 (1): > tls: Fix race and stale memory use in delayed shutdown Fixes #5759 (maybe)	2020-03-12 19:41:50 +02:00
Konstantin Osipov	434ad4548f	locator: correctly select endpoints if RF=0 SimpleStrategy creates a list of endpoints by iterating over the set of all configured endpoints for the given token, until we reach keyspace replication factor. There is a trivial coding bug when we first add at least one endpoint to the list, and then compare list size and replication factor. If RF=0 this never yields true. Fix by moving the RF check before at least one endpoint is added to the list. Cassandra never had this bug since it uses a less fancy while() loop. Fixes #5962 Message-Id: <20200306193729.130266-1-kostja@scylladb.com> (cherry picked from commit `ac6f64a885`)	2020-03-12 12:09:46 +02:00
Avi Kivity	cbbb15af5c	logalloc: increase capacity of _regions vector outside reclaim lock Reclaim consults the _regions vector, so we don't want it moving around while allocating more capacity. For that we take the reclaim lock. However, that can cause a false-positive OOM during startup: 1. all memory is allocated to LSA as part of priming (`2baa16b371`) 2. the _regions vector is resized from 64k to 128k, requiring a segment to be freed (plenty are free) 3. but reclaiming_lock is taken, so we cannot reclaim anything. To fix, resize the _regions vector outside the lock. Fixes #6003. Message-Id: <20200311091217.1112081-1-avi@scylladb.com> (cherry picked from commit `c020b4e5e2`)	2020-03-12 11:25:20 +02:00
Benny Halevy	3231580c05	dist/redhat: scylla.spec.mustache: set _no_recompute_build_ids By default, `/usr/lib/rpm/find-debuginfo.sh` will tamper with the binary's build-id when stripping its debug info as it is passed the `--build-id-seed <version>.<release>` option. To prevent that we need to set the following macros as follows: unset `_unique_build_ids` set `_no_recompute_build_ids` to 1 Fixes #5881 Signed-off-by: Benny Halevy <bhalevy@scylladb.com> (cherry picked from commit `25a763a187`)	2020-03-09 15:21:50 +02:00
Piotr Sarna	62364d9dcd	Merge 'cql3: do_execute_base_query: fix null deref ... ... when clustering key is unavailable' from Benny This series fixes null pointer dereference seen in #5794 `efd7efe` cql3: generate_base_key_from_index_pk; support optional index_ck `7af1f9e` cql3: do_execute_base_query: generate open-ended slice when clustering key is unavailable `7fe1a9e` cql3: do_execute_base_query: fixup indentation Fixes #5794 Branches: 3.3 Test: unit(dev) secondary_indexes_test:TestSecondaryIndexes.test_truncate_base(debug) * bhalevy/fix-5794-generate_base_key_from_index_pk: cql3: do_execute_base_query: fixup indentation cql3: do_execute_base_query: generate open-ended slice when clustering key is unavailable cql3: generate_base_key_from_index_pk; support optional index_ck (cherry picked from commit `4e95b67501`)	2020-03-09 15:20:01 +02:00
Takuya ASADA	3bed8063f6	dist/debian: fix "unable to open node-exporter.service.dpkg-new" error It seems like .service is conflicting on install time because the file installed twice, both debian/.service and debian/scylla-server.install. We don't need to use *.install, so we can just drop the line. Fixes #5640 (cherry picked from commit `29285b28e2`)	2020-03-03 12:40:39 +02:00
Yaron Kaikov	413fcab833	release: prepare for 3.3.rc2 scylla-3.3.rc2	2020-02-27 14:45:18 +02:00
Juliusz Stasiewicz	9f3c3036bf	cdc: set TTLs on CDC log cells Cells in CDC logs used to be created while completely neglecting TTLs (the TTLs from `cdc = {...'ttl':600}`). This patch adds TTLs to all cells; there are no row markers, so we need not set TTL there. Fixes #5688 (cherry picked from commit `67b92c584f`)	2020-02-26 18:12:55 +02:00
Benny Halevy	ff2e108a6d	gossiper: do_stop_gossiping: copy live endpoints vector It can be resized asynchronously by mark_dead. Fixes #5701 Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Message-Id: <20200203091344.229518-1-bhalevy@scylladb.com> (cherry picked from commit `f45fabab73`)	2020-02-26 13:00:11 +02:00
Gleb Natapov	ade788ffe8	commitlog: use commitlog IO scheduling class for segment zeroing There may be other commitlog writes waiting for zeroing to complete, so not using proper scheduling class causes priority inversion. Fixes #5858. Message-Id: <20200220102939.30769-2-gleb@scylladb.com> (cherry picked from commit `6a78cc9e31`)	2020-02-26 12:51:10 +02:00
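
The `network_topology_strategy: validate integers` entry (5fdadcaf3b) in the log above notes that `std::stol()` accepts a string such as "3abc" and returns 3. The following is a minimal sketch of one way to reject such input, using the optional position argument of `std::stol`. The helper name `parse_whole_long` is hypothetical and is not taken from the Scylla source; the actual fix may be implemented differently.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <stdexcept>
#include <string>

// Hypothetical helper (an assumption for illustration, not Scylla code).
// Returns the parsed value only when std::stol consumed the entire string.
// Leading whitespace is still accepted, because std::stol skips it.
std::optional<long> parse_whole_long(const std::string& s) {
    try {
        std::size_t consumed = 0;
        long value = std::stol(s, &consumed);
        if (consumed != s.size()) {
            // Trailing characters such as "abc" in "3abc" were left unparsed.
            return std::nullopt;
        }
        return value;
    } catch (const std::invalid_argument&) {
        // No digits could be parsed at all, e.g. "abc" or "".
        return std::nullopt;
    } catch (const std::out_of_range&) {
        // The number does not fit in a long.
        return std::nullopt;
    }
}
```

With this sketch, `parse_whole_long("3")` yields a value while `parse_whole_long("3abc")` yields an empty optional, which a caller could turn into a readable error message instead of surfacing the bare `stol()` failure the entry mentions.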


20796 Commits
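
The `schema: schema(): use std::stable_sort() to sort key columns` entry (ee9677ef71) in the log above relies on the guarantee that `std::stable_sort()` preserves the relative order of elements that compare equal, which `std::sort()` does not promise. A minimal sketch of that property follows. The `column` struct and `sort_by_id_stable` function are hypothetical stand-ins for illustration and do not mirror Scylla's schema classes.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a key column (an assumption, not Scylla's type).
struct column {
    int id;
    std::string name;
};

// Sorts by id only. Columns with equal ids keep the order in which they
// were passed in, because std::stable_sort preserves the relative order
// of equivalent elements.
std::vector<column> sort_by_id_stable(std::vector<column> cols) {
    std::stable_sort(cols.begin(), cols.end(),
                     [](const column& a, const column& b) { return a.id < b.id; });
    return cols;
}
```

If 23 columns that all share one id are passed in, as in the failing schema the entry describes, the result keeps them in insertion order, so every node building the same schema independently would arrive at the same column order.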