Commit Graph

41065 Commits

Author SHA1 Message Date
Petr Gusev
646ca9515e test_topology_ops: check node restart after decommission
There used to be a problem with restarting a node after
decommissioning some other node - the restarting node
tries to apply the raft log, this log contains a record
about the decommissioned node, and we got stuck trying
to resolve its IP.

This was fixed in #16639 - we excluded IPs from
the RAFt log application code and moved it entirely
to host_id-s.

In this commit we add a regression test
for this case. We move the decommission_node
call before server_stop/server_start. We need
to add one more server to retain majority when
the node is decommissioned, otherwise the topology
coordinator won't migrate from the stopped node
before replacing it, and we'll get an error.

closes #14803
2024-02-06 13:29:42 +04:00
Petr Gusev
aeed5c5fe3 test_replace_reuse_ip: check other servers see the IP
The replaced node transitions to LEFT state, and
we used to remove the IPs of such nodes from gossiper.
If we replace with same IP, this caused the IP of the
new node to be removed from gossiper.

This problem was fixed by #16820, this commit
adds a regression test for it.

closes #15967
2024-02-06 13:28:04 +04:00
Benny Halevy
bd3ed168ab api/compaction_manager: stop_keyspace_compaction: prevent stack use-after-free
Since `t.parallel_foreach_table_state` may yield,
we should access `type` by reference when calling
`stop_compaction` since it is captured by the calling
lambda and gets lost when it returns if
`parallel_foreach_table_state` returns an unavailable
future.

Instead change all captures to `[&]` so we can access
the `type` variable held by the coroutine frame.

Fixes #16975

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>

Closes scylladb/scylladb#17143
2024-02-05 09:32:08 +02:00
Avi Kivity
784c2f8ad2 Merge 'treewide: replace calls to future::get0() by calls to future::get()' from Kefu Chai
get0() dates back from the days where Seastar futures carried tuples, and get0() was a way to get the first (and usually only) element. Now it's a distraction, and Seastar is likely to deprecate and remove it.

Replace with seastar::future::get(), which does the same thing.

Closes scylladb/scylladb#17130

* github.com:scylladb/scylladb:
  treewide: replace seastar::future::get0() with seastar::future::get()
  sstable: capture return value of get0() using auto
  utils: result_loop: define result_type with decayed type

[avi: add another one that snuck in while this was cooking]
2024-02-04 15:23:33 +02:00
Michał Chojnowski
ed98102c45 row_cache: update _prev_snapshot_pos even if apply_to_incomplete() is preempted
Commit e81fc1f095 accidentally broke the control
flow of row_cache::do_update().

Before that commit, the body of the loop was wrapped in a lambda.
Thus, to break out of the loop, `return` was used.

The bad commit removed the lambda, but didn't update the `return` accordingly.
Thus, since the commit, the statement doesn't just break out of the loop as
intended, but also skips the code after the loop, which updates `_prev_snapshot_pos`
to reflect the work done by the loop.

As a result, whenever `apply_to_incomplete()` (the `updater`) is preempted,
`do_update()` fails to update `_prev_snapshot_pos`. It remains in a
stale state, until `do_update()` runs again and either finishes or
is preempted outside of `updater`.

If we read a partition processed by `do_update()` but not covered by
`_prev_snapshot_pos`, we will read stale data (from the previous snapshot),
which will be remembered in the cache as the current data.

This results in outdated data being returned by the replica.
(And perhaps in something worse if range tombstones are involved.
I didn't investigate this possibility in depth).

Note: for queries with CL>1, occurences of this bug are likely to be hidden
by reconciliation, because the reconciled query will only see stale data if
the queried partition is affected by the bug on on *all* queried replicas
at the time of the query.

Fixes #16759

Closes scylladb/scylladb#17138
2024-02-04 11:17:41 +02:00
Botond Dénes
017a574b16 tools: lua_sstable_consumer.cc: load os and math libs
The amount of standard Lua libraries loaded for the sstable-script was
limited, due to fears that some libraries (like the io library) could
expose methods, which if used from the script could interfere with
seastar's asynchronous arhitecture. So initially only the table and
string libraries were loaded.
This patch adds two more libraries to be loaded: match and os. The
former is self-explanatory and the latter contains methods to work with
date/time, obtain the values of environment variables as well as launch
external processes. None of these should interfere with seastar, on the
other hand the facilities they provide can come very handy for sstable
scripts.

Closes scylladb/scylladb#17126
2024-02-02 19:00:57 +03:00
Pavel Emelyanov
52e6398ad6 messaging: Add formatter for netw::msg_addr
As a part of ongoing "support fmt v10" effort

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>

Closes scylladb/scylladb#17053
2024-02-02 15:20:40 +01:00
Kefu Chai
cd3c7a50ed scylla_raid_setup: drop unused import
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#17095
2024-02-02 15:20:40 +01:00
Kefu Chai
e62b29bab7 tasks: do not include unused headers
these unused includes were identified by clangd. see
https://clangd.llvm.org/guides/include-cleaner#unused-include-warning
for more details on the "Unused include" warning.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#17125
2024-02-02 15:20:40 +01:00
Pavel Emelyanov
75bc702ae8 utils: Remove unused operator<< for file_lock object
The lock itself is only used by utils/directories code

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>

Closes scylladb/scylladb#17051
2024-02-02 15:20:40 +01:00
Kefu Chai
792fa4441e docs: s/ontop/on top/
this misspelling is identified by codespell. ontop cannot be found
on merriam-webster, but "on top" can, see
https://www.merriam-webster.com/dictionary/on%20top, so let's
replace ontop with "on top".

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#17127
2024-02-02 15:20:40 +01:00
Botond Dénes
c9ab39af88 install-dependencies.sh: remove duplicate python3-pyudev package
It appeared in the list twice.

Closes scylladb/scylladb#17060
2024-02-02 15:20:40 +01:00
Avi Kivity
7cb1c10fed treewide: replace seastar::future::get0() with seastar::future::get()
get0() dates back from the days where Seastar futures carried tuples, and
get0() was a way to get the first (and usually only) element. Now
it's a distraction, and Seastar is likely to deprecate and remove it.

Replace with seastar::future::get(), which does the same thing.
2024-02-02 22:12:57 +08:00
Kefu Chai
deef78c796 sstable: capture return value of get0() using auto
instead of capturing the return value of `get0()` with a reference
type, use a plain type. as `get0()` returns a plain `T` while `get0()`
returns a `T&&`, to avoid the value referenced by `T&&` gets destroyed
after the expression, let's use a plain `auto` instead of `auto&&`.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2024-02-02 22:12:18 +08:00
Kefu Chai
9fcca8f585 utils: result_loop: define result_type with decayed type
this change prepares for replacing `seastar::future::get0()` with
`seastar::future::get()`. the former's return type is a plain `T`,
while the latter is `T&&`. in this case `T` is
`boost::outcome::result<..>`. in order to extract its `error_type`,
we need to get its decayed type. since `std::remove_reference_t<T>`
also returns `T`, let's use it so it works with both `get0()` and `get()`.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2024-02-02 22:12:18 +08:00
Pavel Emelyanov
9450a03cdf data_dictionary: Add formatter for keyspace-metadata
Other than being fmt v10 compatible, it's also shorter and easier to
read, thanks to fmt::join() helper

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>

Closes scylladb/scylladb#17115
2024-02-02 11:26:39 +02:00
Kefu Chai
c7a01b9eb4 transport: do not include unused headers
these unused includes were identified by clangd. see
https://clangd.llvm.org/guides/include-cleaner#unused-include-warning
for more details on the "Unused include" warning.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#17092
2024-02-02 11:20:24 +02:00
Lakshmi Narayanan Sreethar
e86965c272 compaction: run rewrite_sstables_compaction_task_executor tasks in maintenance group
Use maintenance group to run all the compaction tasks that use the
rewrite_sstables_compaction_task_executor.

Fixes #16699

Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com>

Closes scylladb/scylladb#17112
2024-02-02 11:18:49 +02:00
Pavel Emelyanov
b557dcbf5a cql3: Sanitize ALTER KEYSPACE check for non-local storages
This kills three birds with one stone

1. fixes broken indentation
2. re-uses new_options local variable
3. stops using string literal to check storage type

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>

Closes scylladb/scylladb#17111
2024-02-02 11:13:29 +02:00
Botond Dénes
63d44712af Merge 'storage_service: Fix indentation for stream_ranges' from Asias He
This is a follow up of "storage_service: Run stream_ranges cmd in streaming group" to fix indentation and drop a unnecessary co_return.

Refs: #17090

Closes scylladb/scylladb#17114

* github.com:scylladb/scylladb:
  storage_service: Drop unnecessary co_return in raft_topology_cmd_handler
  storage_service: Fix indentation for stream_ranges
2024-02-02 11:12:52 +02:00
Kefu Chai
b45af994c2 locator/utils: remove stale comment
this comment has already served its purpose when rewriting
C* in C++. since we've re-implemented it, there is no need to keep it
around.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#17120
2024-02-02 11:07:35 +02:00
Asias He
23a8b0552c storage_service: Drop unnecessary co_return in raft_topology_cmd_handler
It is introduced in "storage_service: Run stream_ranges cmd in streaming
group".

Refs: #17090
2024-02-02 08:20:06 +08:00
Asias He
732a9b5253 storage_service: Fix indentation for stream_ranges
Fixes the indentation introduced in "storage_service: Run
stream_ranges cmd in streaming group".

Refs: #17090
2024-02-02 08:20:03 +08:00
Pavel Emelyanov
66b859a29f gms: Remove unused operator<< for feature object
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>

Closes scylladb/scylladb#17109
2024-02-01 19:00:46 +02:00
Kefu Chai
aad8035bed replica/database: use structured-bind when appropriate
for better readability.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#17104
2024-02-01 16:31:29 +02:00
Botond Dénes
dc8e13baed Merge 'Move some tablets tests from topology_custom to cql-pytest' from Pavel Emelyanov
The latter suite is now tablets-aware and tablets cases from the former one can happily work with single shared scylla instance

Closes scylladb/scylladb#17101

* github.com:scylladb/scylladb:
  test/topology_custom: Remove test_tablets.py
  test/topology: Move test_tablet_change_initial_tablets
  test/topology: Move test_tablet_explicit_disabling
  test/topology: Move test_tablet_default_initialization
  test/topology: Move test_tablet_change_replication_strategy
  test/topology: Move test_tablet_change_replication_vnode_to_tablets
  cql-pytest: Add skip_without_tablets fixture
2024-02-01 16:28:43 +02:00
Kamil Braun
c911bf1a33 test_raft_snapshot_request: fix flakiness (again)
At the end of the test, we wait until a restarted node receives a
snapshot from the leader, and then verify that the log has been
truncated.

To check the snapshot, the test used the `system.raft_snapshots` table,
while the log is stored in `system.raft`.

Unfortunately, the two tables are not updated atomically when Raft
persists a snapshot (scylladb/scylladb#9603). We first update
`system.raft_snapshots`, then `system.raft` (see
`raft_sys_table_storage::store_snapshot_descriptor`). So after the wait
finishes, there's no guarantee the log has been truncated yet -- there's
a race between the test's last check and Scylla doing that last delete.

But we can check the snapshot using `system.raft` instead of
`system.raft_snapshots`, as `system.raft` has the latest ID. And since
1640f83fdc, storing that ID and truncating
the log in `system.raft` happens atomically.

Closes scylladb/scylladb#17106
2024-02-01 16:06:12 +02:00
Kefu Chai
946d281d39 exceptions: s/#warn/#warning/
`#warning` is a preprocessor macro in C/C++, while `#warn` is not. the
reason we haven't run into the build failure caused by this is likely
that we are only building on amd64/aarch64 with libstdc++ at the time
of writing.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#17074
2024-02-01 14:50:17 +02:00
Botond Dénes
1a0300dba6 Merge 'compaction_manager: flush tables before cleanup' from Kefu Chai
according to the document "nodetool cleanup"

> Triggers removal of data that the node no longer owns

currently, scylla performs cleanup by rewriting the sstables. but
commitlog segments may still contain the mutations to the tables
which are dropped during sstable rewriting. when scylla server
restarts, the dirty mutations are replayed to the memtable. if
any of these dirty mutations changes the tables cleaned up. the
stale data are reapplied. this would lead to data resurrection.

so, in this change we following the same model of major compaction
where we

1. forcing new active segment,
2. flushing tables being cleaned up
3. perform cleanup using compaction

Fixes #4734

Closes scylladb/scylladb#16757

* github.com:scylladb/scylladb:
  storage_service: fall back to local cleanup in cleanup_all
  compaction: format flush_mode without the helper
  compaction_manager: flush all tables before cleanup
  replica: table: pass do_flush to table::perform_cleanup_compaction()
  api, compaction: promote flush_mode
2024-02-01 13:47:45 +02:00
libo-sober
a341b870bc Remove unnecessary calculations in integrity_checked_file_impl::write_dma.
Use calculated `rbuf_end` in `std::mismatch` to reduce unnecessary calculations.

Closes scylladb/scylladb#16979
2024-02-01 13:42:59 +02:00
Botond Dénes
8debb6b98f Merge 'storage_service: Run stream_ranges cmd in streaming group' from Asias He
Otherwise it will inherit the rpc verb's scheduling group which is gossip. As a result, it causes the streaming runs in the wrong scheduling group.

Fixes #17090

Closes scylladb/scylladb#17097

* github.com:scylladb/scylladb:
  streaming: Verify stream consumer runs inside streaming group
  storage_service: Run stream_ranges cmd in streaming group
2024-02-01 13:18:26 +02:00
Patryk Wrobel
25324bbe50 cql_test_env.cc: remove dead code
This change removes empty anonymous namespace
that is a dead code.

Signed-off-by: Patryk Wrobel <patryk.wrobel@scylladb.com>

Closes scylladb/scylladb#17099
2024-02-01 13:17:48 +02:00
Pavel Emelyanov
64cb3a6496 test/topology_custom: Remove test_tablets.py
It's now empty, all test cases had been moved to cql-pytest

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2024-02-01 13:59:51 +03:00
Pavel Emelyanov
3fbe93e45d test/topology: Move test_tablet_change_initial_tablets
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2024-02-01 13:59:51 +03:00
Pavel Emelyanov
480227fcad test/topology: Move test_tablet_explicit_disabling
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2024-02-01 13:59:51 +03:00
Pavel Emelyanov
45b0490100 test/topology: Move test_tablet_default_initialization
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2024-02-01 13:59:51 +03:00
Pavel Emelyanov
3258c56ca3 test/topology: Move test_tablet_change_replication_strategy
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2024-02-01 13:59:51 +03:00
Pavel Emelyanov
6f50cc2783 test/topology: Move test_tablet_change_replication_vnode_to_tablets
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2024-02-01 13:59:51 +03:00
Botond Dénes
b9af2efcb1 Merge 'directories: prevent inode cache fragmentation by orderly verifying data directory contents' from Lakshmi Narayanan Sreethar
During startup, the contents of the data directory are verified to ensure that they have the right owner and permissions. Verifying all the contents, which includes files that will be read and closed immediately, and files that will be held open for longer durations, together, can lead to memory fragementation in the dentry/inode cache.

Mitigate this by updating the verification in a such way that these two set of files will be verified separately ensuring their separation in the dentry/inode cache.

Fixes https://github.com/scylladb/scylladb/issues/14506

Closes scylladb/scylladb#16952

* github.com:scylladb/scylladb:
  directories: prevent inode cache fragmentation by orderly verifying data directory contents
  directories: skip verifying data directory contents during startup
  directories: co-routinize create_and_verify
2024-02-01 12:30:07 +02:00
Kefu Chai
4ec104e086 api: storage_service: correct a typo
s/a any keyspace/a given keyspace/

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#17098
2024-02-01 10:55:58 +02:00
Botond Dénes
2a4b991772 Merge 'Fix mintimeuuid() call that could crash Scylla' from Nadav Har'El
This PR fixes the bug of certain calls to the `mintimeuuid()` CQL function which large negative timestamps could crash Scylla. It turns out we already had protections in place against very positive timestamps, but very negative timestamps could still cause bugs.

The actual fix in this series is just a few lines, but the bigger effort was improving the test coverage in this area. I added tests for the "date" type (the original reproducer for this bug used totimestamp() which takes a date parameter), and also reproducers for this bug directly, without totimestamp() function, and one with that function.

Finally this PR also replaces the assert() which made this molehill-of-a-bug into a mountain, by a throw.

Fixes #17035

Closes scylladb/scylladb#17073

* github.com:scylladb/scylladb:
  utils: replace assert() by on_internal_error()
  utils: add on_internal_error with common logger
  utils: add a timeuuid minimum, like we had maximum
  test/cql-pytest: tests for "date" type
2024-02-01 10:48:48 +02:00
Patryk Wrobel
6e5a85c387 replica/table: add tablet count metric
This change introduces a new metric called tablet_count
that is recalculated during construction of table object
and on each call to table::update_effective_replication_map().

To get the count of tablet per current shard, tablet map
is traversed and for each tablet_id tablet_map::get_shard()
is called. Its return value is compared with this_shard_id().

The new metric is maintained and exposed only for tables
that uses tablets.

Refs: scylladb#16131
Signed-off-by: Patryk Wrobel <patryk.wrobel@scylladb.com>

Closes scylladb/scylladb#17056
2024-02-01 10:46:53 +02:00
Asias He
2888c3086c utils: Add uuid_xor_to_uint32 helper
Convert the uuid to a uint32_t using xor.
It is useful to get a uint32_t number from the uuid.

Refs: #16927

Closes scylladb/scylladb#17049
2024-02-01 10:27:55 +02:00
Botond Dénes
f5917b215f Merge 'replica, tablet_allocator: do not compare unsigned with signed' from Kefu Chai
this series addresses couple `-Wsign-compare` warnings surfaced in the tree.

Closes scylladb/scylladb#17091

* github.com:scylladb/scylladb:
  tablet_allocator: do not compare signed and unsigned
  replica: table: do not compare signed with unsigned
2024-02-01 10:26:04 +02:00
Kefu Chai
7a8e8c2ced db: add formatter for db::write_type
before this change, we rely on the default-generated fmt::formatter
created from operator<<, but fmt v10 dropped the default-generated
formatter.

in this change, we define formatters for `db::write_type`, and drop
its operator<<.

Refs #13245

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#17093
2024-02-01 10:22:45 +02:00
Kefu Chai
005d231f96 db: add formatter for gms::application_state
before this change, we rely on the default-generated fmt::formatter
created from operator<<, but fmt v10 dropped the default-generated
formatter.

in this change, we define formatters for `gms::application_state`,
but its operator<< is preserved, as it is still used by the generic
homebrew formatter for `std::unordered_map<>`.

Refs #13245

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#17096
2024-02-01 10:02:25 +02:00
Pavel Emelyanov
ab7ce3d1fa cql-pytest: Add skip_without_tablets fixture
It's opposite to skip_with_tablets one and thus also depends on
scylla_only one

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2024-02-01 10:58:13 +03:00
Lakshmi Narayanan Sreethar
dbe758d309 directories: prevent inode cache fragmentation by orderly verifying data directory contents
During startup, the contents of the data directory are verified to ensure
that they have the right owner and permissions. Verifying all the
contents, which includes files that will be read and closed immediately,
and files that will be held open for longer durations, together, can
lead to memory fragementation in the dentry/inode cache.

Prevent this by updating the verification in a such way that these two
set of files will be verified separately ensuring their separation in
the dentry/inode cache.

Fixes #14506

Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com>
2024-02-01 12:20:23 +05:30
Lakshmi Narayanan Sreethar
74a4085426 directories: skip verifying data directory contents during startup
This is in preparation for a subsequent patch that will verify the
contents of the data directory in a specific order.

Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com>
2024-02-01 11:54:59 +05:30
Lakshmi Narayanan Sreethar
2e3d2498f4 directories: co-routinize create_and_verify
Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com>
2024-02-01 11:41:10 +05:30