On the read path, the compacting reader is applied only to the sstable
reader. This can cause an expired tombstone from an sstable to be purged
from the request before it has a chance to merge with deleted data in
the memtable, leading to data resurrection.
Fix this by checking the memtables before deciding to purge tombstones
from the request on the read path. A tombstone will not be purged if a
key exists in any of the table's memtables with a minimum live timestamp
that is lower than the maximum purgeable timestamp.
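Below is a minimal Python sketch of the purge rule described above (the actual implementation is C++; the memtable representation and timestamps here are illustrative):
```
def can_purge_tombstone(tombstone_ts, max_purgeable_ts, memtables, key):
    # A tombstone is a purge candidate only if it is older than the
    # maximum purgeable timestamp derived from the sstables.
    if tombstone_ts >= max_purgeable_ts:
        return False
    # The fix: also consult the memtables. If the key exists in any
    # memtable with a minimum live timestamp below the maximum purgeable
    # timestamp, the tombstone may still shadow unflushed data.
    for min_live_ts_by_key in memtables:  # one dict per memtable
        min_live_ts = min_live_ts_by_key.get(key)
        if min_live_ts is not None and min_live_ts < max_purgeable_ts:
            return False
    return True

# Live memtable data at ts=5 blocks purging a tombstone written at ts=7.
assert not can_purge_tombstone(7, 10, [{"k": 5}], "k")
```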
Fixes #20916
`perf-simple-query` stats before and after this fix:
`build/Dev/scylla perf-simple-query --smp=1 --flush`:
```
// Before this Fix
// ---------------
94941.79 tps ( 71.1 allocs/op, 0.0 logallocs/op, 14.1 tasks/op, 59393 insns/op, 24029 cycles/op, 0 errors)
97551.14 tps ( 71.1 allocs/op, 0.0 logallocs/op, 14.1 tasks/op, 59376 insns/op, 23966 cycles/op, 0 errors)
96599.92 tps ( 71.1 allocs/op, 0.0 logallocs/op, 14.1 tasks/op, 59367 insns/op, 23998 cycles/op, 0 errors)
97774.91 tps ( 71.1 allocs/op, 0.0 logallocs/op, 14.1 tasks/op, 59370 insns/op, 23968 cycles/op, 0 errors)
97796.13 tps ( 71.1 allocs/op, 0.0 logallocs/op, 14.1 tasks/op, 59368 insns/op, 23947 cycles/op, 0 errors)
throughput: mean=96932.78 standard-deviation=1215.71 median=97551.14 median-absolute-deviation=842.13 maximum=97796.13 minimum=94941.79
instructions_per_op: mean=59374.78 standard-deviation=10.78 median=59369.59 median-absolute-deviation=6.36 maximum=59393.12 minimum=59367.02
cpu_cycles_per_op: mean=23981.67 standard-deviation=32.29 median=23967.76 median-absolute-deviation=16.33 maximum=24029.38 minimum=23947.19
// After this Fix
// --------------
95313.53 tps ( 71.1 allocs/op, 0.0 logallocs/op, 14.1 tasks/op, 59392 insns/op, 24058 cycles/op, 0 errors)
97311.48 tps ( 71.1 allocs/op, 0.0 logallocs/op, 14.1 tasks/op, 59375 insns/op, 24005 cycles/op, 0 errors)
98043.10 tps ( 71.1 allocs/op, 0.0 logallocs/op, 14.1 tasks/op, 59381 insns/op, 23941 cycles/op, 0 errors)
96750.31 tps ( 71.1 allocs/op, 0.0 logallocs/op, 14.1 tasks/op, 59396 insns/op, 24025 cycles/op, 0 errors)
93381.21 tps ( 71.1 allocs/op, 0.0 logallocs/op, 14.1 tasks/op, 59390 insns/op, 24097 cycles/op, 0 errors)
throughput: mean=96159.93 standard-deviation=1847.88 median=96750.31 median-absolute-deviation=1151.55 maximum=98043.10 minimum=93381.21
instructions_per_op: mean=59386.60 standard-deviation=8.78 median=59389.55 median-absolute-deviation=6.02 maximum=59396.40 minimum=59374.73
cpu_cycles_per_op: mean=24025.13 standard-deviation=58.39 median=24025.17 median-absolute-deviation=32.67 maximum=24096.66 minimum=23941.22
```
This PR fixes a regression introduced in ce96b472d3 and should be backported to older versions.
Closes scylladb/scylladb#20985
* github.com:scylladb/scylladb:
topology-custom: add test to verify tombstone gc in read path
replica/table: check memtable before discarding tombstone during read
compaction_group: track maximum timestamp across all sstables
(cherry picked from commit 519e167611)
Backported from #20985 to 6.0
Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com>
Closes scylladb/scylladb#21249
To fix a race between split and repair, in c1de4859d8 a new sstable
generated during streaming can be split before being attached to the sstable
set. That prevents an unsplit sstable from reaching the set after the
tablet map is resized.
So we can think of this split as an extension of the sstable writer. A failure
during split means the new sstable won't be added. The duration of the split
also adds to the time the erm is held; for example, the repair writer will only
release its erm once the split sstable is added into the set.
This single-sstable split goes through run_custom_job(), which serializes
with other maintenance tasks. That was a terrible decision, since the split may
have to wait for an ongoing maintenance task to finish, which means holding the
erm for longer. Additionally, if the split monitor decides to run split on the
entire compaction group, it can cause the single-sstable split to be aborted,
since the former wants to select all sstables, propagating a failure to the
streaming writer.
That results in the new sstable being leaked and may cause problems on
restart, since the underlying tablet may have moved elsewhere or multiple
splits may have happened. We have some fragility today in cleaning up leaked
sstables on streaming failure, but this single-sstable split made it worse,
since the failure can happen during normal operation, e.g. when there's no
I/O error.
It makes sense to kill the run_custom_job() usage: the single-sstable split
is offline and an extension of sstable writing, so it makes no sense to
serialize it with maintenance tasks. It must also inherit the sched group of
the process writing the new sstable. The inheritance happens today, but is
fragile.
Fixes https://github.com/scylladb/scylladb/issues/20626.
(cherry picked from commit 999f1f1318)
(cherry picked from commit 38ce2c605d)
Refs https://github.com/scylladb/scylladb/pull/20737
Closes scylladb/scylladb#21201
* github.com:scylladb/scylladb:
tablet: Fix single-sstable split when attaching new unsplit sstables
replica: Fix tablet split execute after restart
Having tablet metadata with more than 1 pending replica will prevent this metadata from being (re)loaded due to a sanity check on load. This patch fails the operation that tries to save such wrong metadata with a similar sanity check: changes submitted to raft are validated, and if a topology_change affects system.tablets, the new "replicas" and "new_replicas" values are checked similarly to how they will be on (re)load.
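A minimal Python sketch of the sanity check, assuming a tablet is represented by its current and new replica sets (illustrative names; the real validation is C++ and operates on system.tablets mutations):
```
def validate_tablet_replicas(replicas, new_replicas):
    # Pending replicas are those present in new_replicas but not in
    # replicas; the load balancer can handle at most one per tablet.
    pending = set(new_replicas) - set(replicas)
    if len(pending) > 1:
        raise ValueError(f"{len(pending)} pending replicas, at most 1 allowed")

validate_tablet_replicas({"n1", "n2"}, {"n1", "n3"})  # ok: one pending replica
# validate_tablet_replicas({"n1"}, {"n2", "n3"})      # raises: two pending replicas
```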
Fixes #20043
(cherry picked from commit f09fe4f351)
(cherry picked from commit e5bf376cbc)
(cherry picked from commit 1863ccd900)
Refs #21020
Closes scylladb/scylladb#21112
* github.com:scylladb/scylladb:
tablets: Validate system.tablets update
group0_client: Introduce change validation
group0_client: Add shared_token_metadata dependency
replica/tablets: Add to_tablet_metadata_(row_)?key helpers
replica/tablets: extract tablet_replica_set_from_cell()
During the split prepare phase, there will be more than one compaction group
with overlapping token ranges for a given replica.
Assume tablet 1 has sstable A containing deleted data, and sstable B containing
a tombstone that shadows data in A.
Then split starts:
1) sstable B is split first, and moved from the main (unsplit) group to a
split-ready group
2) now compaction runs in the split-ready group before sstable A is split
Tombstone GC logic today only looks at the underlying group, so the compaction
in step 2 will disregard the deleted data in A, since it belongs to another
group (the unsplit one), and so the tombstone can be purged incorrectly.
To fix it, compaction will now work with all uncompacting sstables that belong
to the same replica, since tombstone GC requires all sstables that possibly
contain shadowed data to be available for a correct decision to be made.
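A minimal Python sketch of the fixed input selection for tombstone GC (illustrative names):
```
def sstables_for_tombstone_gc(replica_compaction_groups, compacting):
    # Tombstone GC must see every sstable of the replica that may hold
    # shadowed data, not only the sstables of the group being compacted.
    candidates = []
    for group in replica_compaction_groups:
        candidates += [sst for sst in group if sst not in compacting]
    return candidates

# While sstable B (split-ready group) is compacted, GC still sees
# sstable A (unsplit group), so B's tombstone cannot be purged.
assert sstables_for_tombstone_gc([["A"], ["B"]], compacting={"B"}) == ["A"]
```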
Fixes https://github.com/scylladb/scylladb/issues/20044.
Branches 6.0, 6.1 and 6.2 are vulnerable, so backport is needed.
(cherry picked from commit bcd358595f)
(cherry picked from commit 93815e0649)
Refs https://github.com/scylladb/scylladb/pull/20939
Closes scylladb/scylladb#21204
* github.com:scylladb/scylladb:
replica: Fix tombstone GC during tablet split preparation
service: Improve error handling for split
It will be needed later to get tablet_metadata from.
The dependency is "OK": shared_token_metadata is a low-level sharded
service, and the client already references db::system_keyspace, which in
turn references replica::database, which, finally, references
token_metadata.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Allow create_pending_deletion_log to delete a bunch of sstables
potentially residing in different prefixes (e.g. in the base directory
and under staging/).
The motivation arises from table::cleanup_tablet, which calls compaction_group::cleanup on all cg:s via cleanup_compaction_groups. Cleanup, in turn, calls delete_sstables_atomically on all sstables in the compaction_group, in all states, including the normal state as well as staging - hence the requirement to support deleting sstables in different sub-directories.
Also, apparently truncate calls delete_atomically for all sstables too, via table::discard_sstables, so if it happened to be executed during view update generation, i.e. when there are sstables in staging, it should hit the assertion failure reported in https://github.com/scylladb/scylladb/issues/18862 as well (although I haven't seen it yet, I see no reason why it wouldn't happen). So the issue was apparently present since the initial implementation of the pending_delete_log. It's just that with tablet migration it is more likely to be hit.
Fixes scylladb/scylladb#18862
Needs backport to 6.0 since tablets require this capability
(cherry picked from commit a7b92d7b6f)
(cherry picked from commit 027e64876a)
(cherry picked from commit 44bd183187)
(cherry picked from commit f47b5e60bc)
Refs #19555
Closes scylladb/scylladb#20645
* github.com:scylladb/scylladb:
sstable_directory: create_pending_deletion_log: place pending_delete log under the base directory
sstables: storage: keep base directory in base class
sstables: storage: define opened_directory in header file
sstable_directory: use only dirlog
During the split prepare phase, there will be more than one compaction group
with overlapping token ranges for a given replica.
Assume tablet 1 has sstable A containing deleted data, and sstable B containing
a tombstone that shadows data in A.
Then split starts:
1) sstable B is split first, and moved from the main (unsplit) group to a
split-ready group
2) now compaction runs in the split-ready group before sstable A is split
Tombstone GC logic today only looks at the underlying group, so the compaction
in step 2 will disregard the deleted data in A, since it belongs to another
group (the unsplit one), and so the tombstone can be purged incorrectly.
To fix it, compaction will now work with all uncompacting sstables that belong
to the same replica, since tombstone GC requires all sstables that possibly
contain shadowed data to be available for a correct decision to be made.
Fixes #20044.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
(cherry picked from commit 93815e0649)
Let's assume there are 2 nodes, n1 and n2, with n1 the coordinator.
1) n1 emits split
2) n1 and n2 complete split work
3) n1 becomes aware all replicas are ready for split
4) n2 restarts, but places split sstable into main group[1]
5) n1 executes split
6 n2 handles split completion, but sees that the main group is not empty
[1]: During split, main group should only contain unsplit sstables.
If all sstables are split, main must be empty.
This is a result of the replica not setting the storage group to split mode on
restart (using the tablet map), so sstables are incorrectly placed in the main
group. The fix is to look at the tablet map and set the group to split mode
before sstables are populated into it.
Refs #20626.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
(cherry picked from commit 999f1f1318)
To be able to atomically delete sstables both in
base table directory and in its sub-directories,
like `staging/`, use a shared pending_delete_dir
under the base directory.
Note that this requires loading and processing
the base directory first.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
(cherry picked from commit f47b5e60bc)
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
# Conflicts:
# sstables/sstable_directory.hh
The testcase is flaky due to a known python driver issue:
https://github.com/scylladb/python-driver/issues/317.
This issue causes the `CREATE KEYSPACE` statement to be sometimes
executed twice in a row, and the 2nd CREATE statement causes the test to
fail.
In order to work around it, it's enough to add `if not exists` when
creating a ks.
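A sketch of the workaround in a pytest-style test, assuming a cassandra-driver `session` (names are illustrative):
```
def test_with_idempotent_ks_creation(session):
    # "IF NOT EXISTS" makes the statement idempotent, so a CREATE that
    # the driver accidentally executes twice does not fail the test.
    session.execute("""
        CREATE KEYSPACE IF NOT EXISTS ks
        WITH replication = {'class': 'NetworkTopologyStrategy',
                            'replication_factor': 1}
    """)
```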
Fixes: #21034
Needs to be backported to all 6.x branches, as the PR introducing this flakiness is backported to every 6.x branch.
(cherry picked from commit 3969ffb39f)
Closes scylladb/scylladb#21134
ALTERing tablets-enabled KEYSPACES (KS) didn't account for materialized
views (MV), and only produced tablets mutations changing tables.
With this patch we're producing tablets mutations for both tables and
MVs, hence when e.g. we change the replication factor (RF) of a KS, both the
tables' RFs and MVs' RFs are updated along with tablets replicas.
The `test_tablet_rf_change` testcase has been extended to also verify
that MVs' tablets replicas are updated when RF changes.
Fixes: #20240
(cherry picked from commit e0c1a51642)
Closes scylladb/scylladb#21024
This patch series fixes a couple of bugs around validating that RF is not changed by too much when performing ALTER on a tablets KS.
RF cannot change by more than 1 in total, because the tablets load balancer cannot handle more work at once.
Fixes: https://github.com/scylladb/scylladb/issues/20039
Should be backported to 6.0 & 6.1 (wherever tablets feature is present), as this bug may break the cluster.
(cherry picked from commit 042825247f)
(cherry picked from commit adf453af3f)
(cherry picked from commit 9c5950533f)
(cherry picked from commit 47acdc1f98)
(cherry picked from commit 93d61d7031)
(cherry picked from commit 6676e47371)
(cherry picked from commit 2aabe7f09c)
(cherry picked from commit ee56bbfe61)
Refs https://github.com/scylladb/scylladb/pull/20208
Closes scylladb/scylladb#21047
* github.com:scylladb/scylladb:
cql: sum of abs RFs diffs cannot exceed 1 in ALTER tablets KS
cql: join new and old KS options in ALTER tablets KS
cql: fix validation of ALTERing RFs in tablets KS
cql: harden `alter_keyspace_statement.cc::validate_rf_difference`
cql: validate RF change for new DCs in ALTER tablets KS
cql: extend test_alter_tablet_keyspace_rf
cql: refactor test_tablets::test_alter_tablet_keyspace
cql: remove unused helper function from test_tablets
Add vnodes and tablets tests for the alter keyspace operation that decreases
the replication factor from 1 to 0 for one of two data centers. The tablets
version fails due to the issue described in scylladb/scylladb#20625.
Test for scylladb/scylladb#20625
(cherry picked from commit 132358dc92)
The tablets load balancer is unable to process more than a single pending
replica, thus ALTER tablets KS cannot accept an ALTER statement which
would result in creating 2+ pending replicas; hence it has to validate
that the sum of absolute differences of the RFs specified in the statement
is not greater than 1.
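A minimal Python sketch of that validation (illustrative; the real check lives in alter_keyspace_statement.cc):
```
def validate_rf_difference(old_rfs, new_rfs):
    # Sum the absolute per-DC RF changes; a DC absent on one side
    # counts as RF=0 there.
    dcs = set(old_rfs) | set(new_rfs)
    total = sum(abs(new_rfs.get(dc, 0) - old_rfs.get(dc, 0)) for dc in dcs)
    if total > 1:
        raise ValueError("RF cannot change by more than 1 in total")

validate_rf_difference({"dc1": 3}, {"dc1": 2})             # ok
# validate_rf_difference({"dc1": 3}, {"dc1": 3, "dc2": 2}) # raises: 0 -> 2 jump
```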
(cherry picked from commit ee56bbfe61)
A bug was discovered while trying to ALTER a tablets KS and
specifying only 1 out of 2 DCs - the unspecified DC's RF was
zeroed. This is because ALTER tablets KS updated the KS only with the
RF-per-DC mapping specified in the ALTER tablets KS statement, so if a
DC was omitted, it was assigned RF=0.
This commit fixes that and additionally passes all the KS options, not
only the replication options, to the topology coordinator, where the KS
update is performed.
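A minimal Python sketch of the corrected merge semantics (illustrative names):
```
def merge_replication_options(current, altered):
    # Before the fix the altered mapping replaced the current one, so an
    # omitted DC effectively got RF=0. After the fix, start from the
    # current per-DC RFs and overlay only the DCs the user specified.
    merged = dict(current)
    merged.update(altered)
    return merged

assert merge_replication_options({"dc1": 3, "dc2": 3}, {"dc1": 2}) \
       == {"dc1": 2, "dc2": 3}
```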
`initial_tablets` is a special case which requires special handling
in the source code, as we cannot simply update the old initial_tablets
settings with the new ones: if only `AND TABLETS = {'enabled': true}`
is specified in the ALTER tablets KS statement, we should not zero
`initial_tablets`, but rather keep the old value - this is tested by the
`test_alter_preserves_tablets_if_initial_tablets_skipped` testcase.
Separately, the above-mentioned testcase started to fail with
these changes; it turned out to be an issue with the test not waiting
until the ALTER completed, and thus reading the old value, so the
test's body has been modified to wait for the ALTER to complete before
performing validation.
(cherry picked from commit 2aabe7f09c)
ALTER tablets KS validated that RF is not changed by more than 1 for DCs
that already had replicas, but not for DCs that didn't have them yet, so
specifying an RF jump from 0 to 2 was possible when listing a new DC in the
ALTER tablets KS statement, which violated internal invariants of the
tablets load balancer.
This PR fixes that bug and adds multi-DC testcases to check that adding
replicas to a new DC and removing replicas from a DC honor the RF change
constraints.
Refs: #20039
(cherry picked from commit 47acdc1f98)
1. Renamed the testcase to emphasize that it only focuses on testing
changing RF - there are other tests that test ALTER tablets KS
in general.
2. Fixed whitespaces according to PEP8
(cherry picked from commit adf453af3f)
`change_default_rf` is not used anywhere; moreover, it uses the
`replication_factor` tag, which is forbidden in the ALTER tablets KS
statement.
(cherry picked from commit 042825247f)
There are two bits that control whether the replication strategy for a
keyspace will use tablets or not -- the configuration option and the CQL
parameter. This patch tunes their parsing to implement the logic shown
below:
```
if (strategy.supports_tablets) {
    if (cql.with_tablets == on) {
        if (cfg.enable_tablets) {
            return create_keyspace_with_tablets();
        } else {
            throw "tablets are not enabled";
        }
    } else if (cql.with_tablets == off) {
        return create_keyspace_without_tablets();
    } else { // cql.with_tablets is not specified
        if (cfg.enable_tablets) {
            return create_keyspace_with_tablets();
        } else {
            return create_keyspace_without_tablets();
        }
    }
} else { // strategy doesn't support tablets
    if (cql.with_tablets == on) {
        throw "invalid cql parameter";
    } else if (cql.with_tablets == off) {
        return create_keyspace_without_tablets();
    } else { // cql.with_tablets is not specified
        return create_keyspace_without_tablets();
    }
}
```
closes: #20088
In order to enable tablets "by default" for NetworkTopologyStrategy,
there's an explicit check near ks_prop_defs::get_initial_tablets(), which is
not very nice. Fixing it needs more care, e.g. providing a feature
service reference to the abstract_replication_strategy constructor. But
since the ks_prop_defs code already hijacks options specifically for that
strategy type (see the prepare_options() helper), it's OK for now.
There's also the #20768 misbehavior that's preserved in this patch, but it
should be fixed eventually as well.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Closes scylladb/scylladb#20929
Currently, a pending replica that applies a write on a table that has
materialized views will build all the view updates like a normal replica,
only to realize at a late point, in db::view::get_view_natural_endpoint(),
that it doesn't have a paired view replica to send the updates to. It will
then either drop the view updates or send them to a pending view
replica, if one exists.
This work is unnecessary since it may be dropped, and even if there is a
pending view replica to send the updates to, the updates that are built
by the pending replica may be wrong since it may have incomplete
information.
This commit fixes the inefficiency by skipping the view update building
step when applying an update on a pending replica.
The metric total_view_updates_on_wrong_node is added to count the cases
in which a view update is determined to be unnecessary.
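A minimal Python sketch of the short-circuit on the write-apply path (illustrative names; the real code is C++):
```
def apply_write(table, mutation, is_pending_replica, metrics):
    if table.has_views and is_pending_replica:
        # A pending replica has no paired view replica and possibly
        # incomplete data, so building view updates is skipped entirely
        # and only counted.
        metrics["total_view_updates_on_wrong_node"] += 1
    elif table.has_views:
        table.generate_and_send_view_updates(mutation)
    table.apply(mutation)
```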
The test reproduces the scenario of writing to a table and applying
the update on a pending replica, and verifies that the pending replica
doesn't try to build view updates.
Fixes scylladb/scylladb#19152
Closes scylladb/scylladb#19488
Fixes scylladb/scylladb#20787
(cherry picked from commit 08b29460fc)
Closes scylladb/scylladb#20934
Fixes #20862
With the change in 60af2f3cb2 the bookkeeping for buffer memory changed
subtly: after a flush we decrement buffer_list_bytes by the buffer's
size, but we would shrink the buffer before doing so, even though the
value was previously incremented by the full allocated size. I.e. we
would slowly grow this value instead of adjusting it properly to the
actual used bytes.
Test included.
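A minimal Python sketch of the bug pattern and the fixed ordering (illustrative; the real code is C++):
```
class buffer_list:
    def __init__(self):
        self.buffer_list_bytes = 0

    def add(self, buf):
        # Accounted with the full allocated size.
        self.buffer_list_bytes += len(buf)

    def flush(self, buf, used):
        # Buggy order: shrinking first makes len(buf) the used size, so
        # less is decremented than was added and buffer_list_bytes
        # creeps upward:
        #     del buf[used:]
        #     self.buffer_list_bytes -= len(buf)
        # Fixed order: decrement by the originally accounted size first.
        self.buffer_list_bytes -= len(buf)
        del buf[used:]
```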
(cherry picked from commit ee5e71172f)
Closes scylladb/scylladb#20913
For each new node added to the raft config, populate its ID-to-IP
mapping in the raft address map from the gossiper. The mapping may have
expired if a node is added to the raft configuration long after it
first appears in the gossiper.
Fixes scylladb/scylladb#20600
Backport to all supported versions since the bug may cause bootstrapping failure.
(cherry picked from commit bddaf498df)
(cherry picked from commit 9e4cd32096)
Closes scylladb/scylladb#20866
* github.com:scylladb/scylladb:
test: extend existing test to check that a joining node can map addresses of all pre-existing nodes during join
group0: make sure that address map has an entry for each new node in the raft configuration
Before 17f4a151ce the node was marked as
being replaced in the join_group0 state, before it actually joined group0,
so by the time it actually joined and started transferring the snapshot/log,
no traffic was sent to it. The commit changed this to mark the node as
being replaced after the snapshot/log is already transferred, so the node
can receive traffic while it still has not caught up with the leader, and
this may cause problems since its state is not complete.
Mark the node as being replaced earlier, but still add the new node to
the topology later, as the commit above intended.
Fixes: https://github.com/scylladb/scylladb/issues/20629
Needs to be backported since this is a regression
(cherry picked from commit 644e7a2012)
(cherry picked from commit c0939d86f9)
(cherry picked from commit 1b4c255ffd)
Closes scylladb/scylladb#20835
* github.com:scylladb/scylladb:
test: amend test_replace_reuse_ip test to check that there is no stale writes after snapshot transfer starts
topology coordinator:: mark node as being replaced earlier
topology coordinator: do metadata barrier before calling finish_accepting_node() during replace
The Alternator command ListTables is supposed to list actual tables
created with CreateTable, and should not list things like materialized views
(created for GSI or LSI) or CDC log tables.
We already properly excluded materialized views from the list - and
had the tests to prove it - but forgot both the exclusion and the testing
for CDC log tables - so creating a table xyz with streams enabled would
cause ListTables to also list "xyz_scylla_cdc_log".
This patch fixes both oversights: it adds the code to exclude CDC logs
from the output of ListTables, and adds a test which reproduces the bug
before this fix and verifies the fix works.
Fixes #19911.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Closes scylladb/scylladb#19914
(cherry picked from commit d293a5787f)
Cleanup of a deallocated tablet throws an exception.
Since failed cleanup is retried, we end up in an infinite loop.
Ignore cleanup of deallocated storage groups.
Fixes: https://github.com/scylladb/scylladb/issues/19752.
Needs to be backported to all branches with tablets (6.0 and later)
(cherry picked from commit 20d6cf55f2)
(cherry picked from commit 2c4b1d6b45)
Refs https://github.com/scylladb/scylladb/pull/20584
Closes scylladb/scylladb#20628
* github.com:scylladb/scylladb:
test: check if cleanup of deallocated sg is ignored
replica: ignore cleanup of deallocated storage group
The `database::get_all_tables_flushed_at` method returns a variable
without setting the computed all_tables_flushed_at value. This causes
its caller, `maybe_flush_all_tables`, to flush all the tables every time,
regardless of when they were last flushed. Fix this by returning
the computed value from `database::get_all_tables_flushed_at`.
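The bug pattern, sketched in Python (assuming, for illustration, that the computed value is the oldest per-table flush time):
```
def get_all_tables_flushed_at(self):
    all_tables_flushed_at = min(t.last_flushed_at for t in self.tables)
    # The bug: a default-initialized variable was returned instead of
    # the computed one, so maybe_flush_all_tables always saw a stale
    # timestamp and flushed every table every time.
    return all_tables_flushed_at
```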
Fixes #20301
Closes scylladb/scylladb#20471
* github.com:scylladb/scylladb:
cql-pytest: add test to verify compaction_flush_all_tables_before_major_seconds config
database::get_all_tables_flushed_at: fix return value
(cherry picked from commit 0e5b444777)
Backported from #20471 to 6.0.
Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com>
Closes scylladb/scylladb#20580
The test performs consecutive schema changes in RECOVERY mode. The
second change relies on the first. However, the driver might route the
changes to different servers, and we don't have group 0 to guarantee
linearizability. We must rely on the first change's coordinator to push
the schema mutations to other servers before returning, but that only
happens when it sees the other servers as alive when doing the schema
change. This wasn't guaranteed in the test. Fix this.
Fixes scylladb/scylladb#20791
Should be backported to all branches containing this test to reduce
flakiness.
(cherry picked from commit f390d4020a)
Closes scylladb/scylladb#20810
Currently, we check that a node being removed is normal
on the node initiating the removenode request. However, we don't have a
similar check on the topology coordinator. The node being removed could be
normal when we initiate the request, but it doesn't have to be normal when
the topology coordinator starts handling the request.
For example, the topology coordinator could have removed this node while
handling another removenode request that was added to the request queue
earlier.
This commit fixes the issue by adding more checks in the enqueuing phase
and returning errors for duplicate node-removal requests.
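A minimal Python sketch of the enqueue-time check (illustrative names):
```
def enqueue_removenode(node_state, queue, host_id):
    # node_state maps host_id -> state; queue holds pending removenode
    # requests. Reject up front if the node is no longer normal, e.g.
    # because an earlier queued removenode already removed it.
    if node_state.get(host_id) != "normal":
        raise RuntimeError(f"removenode: {host_id} is not in normal state")
    if host_id in queue:
        raise RuntimeError(f"removenode: duplicate request for {host_id}")
    queue.append(host_id)
```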
This PR fixes a bug. Hence we need to backport it.
Fixes: scylladb/scylladb#20271
(cherry picked from commit b25b8dccbd)
Closes scylladb/scylladb#20801
The test configures the write timeout to a much smaller value to make the
test run faster, since a sleep is inserted for some writes to hit the
timeout; but this makes aarch64 debug mode flaky, since the timeout fires
when it should not, due to natural slowness.
(cherry picked from commit 71a5b1c6dd)
Closes scylladb/scylladb#20778
This is a manual backport of #20212 to 6.0, superseding #20346 (which had run into conflicts).
Please see the individual commit messages for backport notes.
Fixes #10305
Closes scylladb/scylladb#20349
* github.com:scylladb/scylladb:
generic_server: make server::stop() idempotent
generic_server: coroutinize server::shutdown()
generic_server: make server::shutdown() idempotent
test/generic_server: add test case
configure, cmake: sort the lists of boost unit tests
generic_server: convert connection tracking to seastar::gate
Consider the following:
```
T
0 split prepare starts
1 repair starts
2 split prepare finishes
3 repair adds unsplit sstables
4 repair ends
5 split executes
```
If repair produces an sstable after the split prepare phase, the replica will not split that sstable later, as the prepare phase is considered completed already. That causes split execution to fail, as replicas weren't really prepared. This can also be triggered with load-and-stream, which shares the same write (consumer) path.
The approach to fix this is the same one employed to prevent a race between split and migration. If migration happens during the prepare phase, the source may miss the split request, but the tablet will still be split on the destination (if needed). Similarly, the repair writer becomes responsible for splitting the data if the underlying table is in split mode. This is implemented in replica::table for correctness, so if the node crashes, a new sstable missing its split is still split before being added to the set.
Fixes https://github.com/scylladb/scylladb/issues/19378.
Fixes https://github.com/scylladb/scylladb/issues/19416.
(cherry picked from commit 239344ab55)
(cherry picked from commit 74612ad358)
Refs https://github.com/scylladb/scylladb/pull/19427
Closes scylladb/scylladb#20593
* github.com:scylladb/scylladb:
tablets: Fix race between repair and split
compaction: Allow "offline" sstable to be split
Consider the following:
T
0 split prepare starts
1 repair starts
2 split prepare finishes
3 repair adds unsplit sstables
4 repair ends
5 split executes
If repair produces an sstable after the split prepare phase, the
replica will not split that sstable later, as the prepare phase is
considered completed already. That causes split execution to fail,
as replicas weren't really prepared. This can also be triggered with
load-and-stream, which shares the same write (consumer) path.
The approach to fix this is the same one employed to prevent a race
between split and migration. If migration happens during the prepare
phase, the source may miss the split request, but the tablet
will still be split on the destination (if needed).
Similarly, the repair writer becomes responsible for splitting
the data if the underlying table is in split mode. This is implemented
in replica::table for correctness, so if the node crashes, a new
sstable missing its split is still split before being added to the set.
Fixes #19378.
Fixes #19416.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
(cherry picked from commit 74612ad358)
Bind variables in CQL have two formats: positional (?) where a variable is referred to by its relative position in the statement, and named (:var), where the user is expected to supply a name->value mapping.
In 19a6e69001 we identified the case where a named bind variable appears twice in a query, and collapsed it to a single entry in the statement metadata. Without this, a driver using the named variable syntax cannot disambiguate which variable is referred to.
However, it turns out that users can use the positional call form even with the named variable syntax, by using the positional API of the driver. To support this use case, we add a configuration variable to disable the same-variable detection.
Because the detection has to happen when the entire statement is visible, we have to supply the configuration to the parser. We call it the dialect and pass it from all callers. The alternative would be to add a pre-prepare call similar to fill_prepare_context that rewrites all expressions in a statement to deduplicate variables.
A unit test is added.
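An illustrative example of the ambiguity, using a cassandra-driver-style `session` (assumed here):
```
# With unification (the default), this statement exposes ONE bind marker
# in its metadata, because :x appears twice:
stmt = session.prepare("SELECT * FROM t WHERE a = :x AND b = :x")

session.execute(stmt, {"x": 1})  # named binding: always works
# Positional binding of two values only works when the new config
# option disables unification, leaving two independent markers:
session.execute(stmt, (1, 1))
```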
Fixes https://github.com/scylladb/scylladb/issues/15559
This may be useful to users transitioning from Cassandra, so it merits a backport.
(cherry picked from commit f9322799af)
(cherry picked from commit d69bf4f010)
(cherry picked from commit ea8441dfa3)
Refs https://github.com/scylladb/scylladb/pull/19493
Subsumes #20389
Closes scylladb/scylladb#20551
* github.com:scylladb/scylladb:
cql3: add option to not unify bind variables with the same name
cql3: introduce dialect infrastructure
cql3: prepared_statement_cache: drop cache key default constructor
test: cql-pytest: config_value_context: remove strange ast.literal_eval call
Merge 'config: round-trip boolean configuration variables' from Avi Kivity
Bind variables in CQL have two formats: positional (`?`) where a
variable is referred to by its relative position in the statement,
and named (`:var`), where the user is expected to supply a
name->value mapping.
In 19a6e69001 we identified the case where a named bind variable
appears twice in a query, and collapsed it to a single entry in the
statement metadata. Without this, a driver using the named variable
syntax cannot disambiguate which variable is referred to.
However, it turns out that users can use the positional call form
even with the named variable syntax, by using the positional
API of the driver. To support this use case, we add a configuration
variable to disable the same-variable detection.
Because the detection has to happen when the entire statement is
visible, we have to supply the configuration to the parser. We
call it the `dialect` and pass it from all callers. The alternative
would be to add a pre-prepare call similar to fill_prepare_context that
rewrites all expressions in a statement to deduplicate variables.
A unit test is added.
Fixes #15559
(cherry picked from commit ea8441dfa3)
(cherry picked from commit edb3068ecf)
A dialect is a different way to interpret the same CQL statement.
Examples:
- how duplicate bind variable names are handled (later in this series)
- whether `column = NULL` in LWT can return true (as is now) or
whether it always returns NULL (as in SQL)
Currently, dialect is an empty structure and will be filled in later.
It is passed to query_processor methods that also accept a CQL string,
and from there to the parser. It is part of the prepared statement cache
key, so that if the dialect is changed online, previous parses of the
statement are ignored and the statement is prepared again.
The patch is careful to pick up the dialect at the entry point (e.g.
CQL protocol server) so that the dialect doesn't change while a statement
is parsed, prepared, and cached.
(cherry picked from commit d69bf4f010)
cql-pytest's config_value_context is used to run a code sequence with
different ScyllaDB configuration applied for a while. When it reads
the original value (in order to restore it later), it applies
ast.literal_eval() to it. This is strange, since the config variable isn't
a Python literal.
It was added in 8c464b2ddb ("guardrails: restrict replication
strategy (RS)"). Presumably, as a workaround for #19604 - it sufficiently
massaged the input we read via SELECT to be acceptable later via UPDATE.
Now that #19604 is fixed, we can remove the call to ast.literal_eval,
but have to fix up the parameters to config_value_context to something
that will be accepted without further massaging.
This is a step towards fixing #15559, where we want to run some tests
with a boolean configuration variable changed, and literal_eval is
transforming the string representation of integers to integers and
confusing the driver.
Closes scylladb/scylladb#19696
(cherry picked from commit d5af86bd8a)
When you SELECT a boolean from system.config, it reads as true/false, but this isn't accepted
on UPDATE (instead, we accept 1/0). This is surprising and annoying, so accept true/false in
both directions.
Not a regression, so a backport isn't strictly necessary.
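A minimal Python sketch of the accepting-both-forms conversion (the real change specializes the C++ from-string conversion for bool):
```
def config_bool_from_string(s):
    # Accept what SELECT prints (true/false) as well as the previously
    # accepted numeric forms (1/0).
    v = s.strip().lower()
    if v in ("true", "1"):
        return True
    if v in ("false", "0"):
        return False
    raise ValueError(f"invalid boolean config value: {s!r}")

assert config_bool_from_string("true") and config_bool_from_string("1")
assert not (config_bool_from_string("false") or config_bool_from_string("0"))
```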
Closes scylladb/scylladb#19792
* github.com:scylladb/scylladb:
config: specialize from-string conversion for bool
config: wrap boost::lexical_cast<> when converting from strings
(cherry picked from commit 9eb47b3ef0)
The test cases in this file use an error injection to reduce raft group
0 timeouts (from the default 1 minute), in order to speed up the tests;
the scenarios expect these timeouts to happen, so we want them to happen
as quickly as possible, but we don't want to reduce the timeouts so much that
it will make other operations fail when we don't expect them to (e.g.
when the test wants to add a node to the cluster).
Unfortunately the selected 5 seconds in debug mode was not enough and
made the tests flaky: scylladb/scylladb#20111.
Increase it to 10 seconds. This unfortunately will slow down these tests
as they have to sometimes wait for 10 seconds for the timeout to happen.
But better to have this than a flaky test.
Fixes: scylladb/scylladb#20111
(cherry picked from commit 52fdf5b4c9)
Closes scylladb/scylladb#20478
repair_service::repair_flush_hints_batchlog_handler may access batchlog
manager while it is uninitialized.
Throw if batchlog manager isn't initialized.
Fixes: https://github.com/scylladb/scylladb/issues/20236.
Needs backport to 6.0 and 6.1 as they suffer from the uninitialized bm access.
(cherry picked from commit d8e4393418)
(cherry picked from commit f38bb6483a)
Refs https://github.com/scylladb/scylladb/pull/20251
Closes scylladb/scylladb#20392
* github.com:scylladb/scylladb:
test: add test to ensure repair won't fail with uninitialized bm
repair: throw if batchlog manager isn't initialized
When only inter-dc encryption is enabled, a non-encrypted connection
between two nodes is allowed only if both nodes are in the same dc.
If the node that initiates the connection knows that the dst is in the same
dc, and hence uses a non-encrypted connection, but the dst does not yet know
the topology of the src, such a connection will not be allowed, since the dst
cannot guarantee that the src is in the same dc.
Currently, when the topology coordinator is used, a replacing node will
appear in the coordinator's topology immediately after it is added to
group0. The coordinator will try to send a raft message to the new node
and (assuming only inter-dc encryption is enabled and the replacing node and
the coordinator are in the same dc) it will try to open a regular,
non-encrypted connection to it. But the replacing node will not have the
coordinator in its topology yet (it needs to sync the raft state for that),
so it will reject such a connection.
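A minimal Python sketch of the acceptance rule (illustrative; None models a peer whose dc is not yet known):
```
def allow_plain_connection(encryption_mode, my_dc, peer_dc):
    if encryption_mode != "dc":   # only inter-dc encryption modeled here
        return encryption_mode == "none"
    # With inter-dc encryption, plaintext is allowed only when the peer
    # is known to be in the same dc; an unknown peer must be rejected.
    return peer_dc is not None and peer_dc == my_dc

assert allow_plain_connection("dc", "dc1", "dc1")
assert not allow_plain_connection("dc", "dc1", None)  # topology not yet synced
```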
To solve the problem, the patch does not add a replacing node that was
just added to group0 to the topology. It will be added later, when
tokens are assigned to it. At that point the replacing node will
already have made sure that its topology state is up-to-date (since it will
execute a raft barrier in the join_node_response_params handler) and it knows
the coordinator's topology. This aligns replace behaviour with bootstrap,
since bootstrap also does not add a node without a ring to the topology.
The patch effectively reverts b8ee8911ca.
Fixes: scylladb/scylladb#19025
(cherry picked from commit 17f4a151ce)
Check whether we can stop a generic server without first asking it to
listen.
The test fails currently; the failure mode is a hang, which triggers the 5
minute timeout set in the test:
> unknown location(0): fatal error: in "stop_without_listening":
> seastar::timed_out_error: timedout
> seastar/src/testing/seastar_test.cc(43): last checkpoint
> test/boost/generic_server_test.cc(34): Leaving test case
> "stop_without_listening"; testing time: 300097447us
Backport notes for 6.0:
- Replace
#include "utils/assert.hh"
SCYLLA_ASSERT(false);
with
#include <cassert>
assert(false);
due to 6.0 lacking commit aa1270a00c ("treewide: change assert() to
SCYLLA_ASSERT()", 2024-08-05). The header file "utils/assert.hh"
wouldn't be difficult to backport, but separating it from the treewide
changes in commit aa1270a00c might not be the best idea.
Signed-off-by: Laszlo Ersek <laszlo.ersek@scylladb.com>
(cherry picked from commit dbc0ca6354)
Both lists were obviously meant to be sorted originally, but by today
we've introduced many instances of disorder -- thus, inserting a new test
in the proper place leaves the developer scratching their head. Sort both
lists.
Backport notes for 6.0:
- Conflicts in "configure.py" and "test/boost/CMakeLists.txt",
unsurprisingly. For the backport, I sorted the boost unit test list in
each file manually, from scratch.
Signed-off-by: Laszlo Ersek <laszlo.ersek@scylladb.com>
(cherry picked from commit 931f2f8d73)