The `prometheus_server` is started only conditionally
but the notification message is sent and logged
unconditionally.
Move it inside the conditional code block.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
(cherry picked from commit 282ff344db)
It is now logged out of place, so move it to right before
calling `start` on every database shard.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
(cherry picked from commit 23433f593c)
This commit adds a link to the Limitations section on the Tablets page
to the CQL page, under the tablets option.
This is actually the place where the user will need the information:
when creating a keyspace.
In addition, I've reorganized the section for better readability
(otherwise, the section about limitations was easy to miss)
and moved the section up on the page.
Note that I've removed the updated content from the `_common` folder
(which I deleted) to the .rst page - we no longer split OSS and Enterprise,
so there's no need to keep using the `scylladb_include_flag` directive
to include OSS- and Ent-specific content.
Fixes https://github.com/scylladb/scylladb/issues/22892
Fixes https://github.com/scylladb/scylladb/issues/22940
Closes scylladb/scylladb#22939
(cherry picked from commit 0999fad279)
Closes scylladb/scylladb#23091
This commit adds the upgrade guides relevant in version 2025.1:
- From 6.2 to 2025.1
- From 2024.x to 2025.1
It also removes the upgrade guides that are not relevant in 2025.1 source available:
- Open Source upgrade guides
- From Open Source to Enterprise upgrade guides
- Links to the Enterprise upgrade guides
Also, as part of this PR, the remaining relevant content has been moved to
the new About Upgrade page.
WHAT NEEDS TO BE REVIEWED
- Review the instructions in the 6.2-to-2025.1 guide
- Review the instructions in the 2024.x-to-2025.1 guide
- Verify that there are no references to Open Source and Enterprise.
The scope of this PR does not have to include metrics - the info can be added
in a follow-up PR.
Fixes https://github.com/scylladb/scylladb/issues/22208
Fixes https://github.com/scylladb/scylladb/issues/22209
Fixes https://github.com/scylladb/scylladb/issues/23072
Fixes https://github.com/scylladb/scylladb/issues/22346
Closes scylladb/scylladb#22352
(cherry picked from commit 850aec58e0)
Closes scylladb/scylladb#23106
In some cases the paused/unpaused node can hang not after 30s timeout.
This makes the test flaky. Change the condition to always check the
coordinator's log if there is a hung node.
Add `stop_after_streaming` to the list of error injections which can
cause a node to hang.
Also add a wait for a new coordinator election in cluster events
which cause such elections.
Closes scylladb/scylladb#22825
(cherry picked from commit 99be9ac8d8)
Closes scylladb/scylladb#23007
In this series we implement the UpdateTable operation to add a GSI to an existing table, or remove a GSI from a table. As the individual commit messages will explain, this required changing how Alternator stores materialized view keys - instead of insisting that these keys must be real columns (that is **not** the case when adding a GSI to an existing table), the materialized view can now take as its key any Alternator attribute serialized inside the ":attrs" map holding all non-key attributes. Fixes #11567.
We also fix the IndexStatus and Backfilling attributes returned by DescribeTable - as DynamoDB API users use this API to discover when a newly added GSI completed its "backfilling" (what we call "view building") stage. Fixes #11471.
This series should not be backported lightly - it's a new feature and required fairly large and intrusive changes that can introduce bugs to use cases that don't even use Alternator or its UpdateTable operations - every user of CQL materialized views or secondary indexes, as well as Alternator GSI or LSI, will use modified code. **It should be backported to 2025.1**, though - this version was actually branched long after this PR was sent, and it provides a feature that was promised for 2025.1.
Closes scylladb/scylladb#21989
* github.com:scylladb/scylladb:
alternator: fix view build on oversized GSI key attribute
mv: clean up do_delete_old_entry
test/alternator: unflake test for IndexStatus
test/alternator: work around unrelated bug causing test flakiness
docs/alternator: adding a GSI is no longer an unimplemented feature
test/alternator: remove xfail from all tests for issue 11567
alternator: overhaul implementation of GSIs and support UpdateTable
mv: support regular_column_transformation key columns in view
alternator: add new materialized-view computed column for item in map
build: in cmake build, schema needs alternator
build: build tests with Alternator
alternator: add function serialized_value_if_type()
mv: introduce regular_column_transformation, a new type of computed column
alternator: add IndexStatus/Backfilling in DescribeTable
alternator: add "LimitExceededException" error type
docs/alternator: document two more unimplemented Alternator features
(cherry picked from commit 529ff3efa5)
Closes scylladb/scylladb#22826
The test
topology_custom/test_alternator::test_localnodes_broadcast_rpc_address
sets up nodes with a silly "broadcast rpc address" and checks that
Alternator's "/localnodes" request returns it correctly.
The problem is that although we don't use CQL in this test, the test
framework does open a CQL connection when the test starts, and closes
it when it ends. It turns out that when we set a silly "broadcast RPC
address", the driver tends to try to connect to it when shutting down;
I'm not even sure why. But the choice of 1.2.3.4 as the silly address
is unfortunate, because this IP address is actually routable - and
the driver hangs until it times out (in practice, in a bit over two
minutes). This trivial patch changes 1.2.3.4 to 127.0.0.0 - an equally
silly address, but one to which connections fail immediately.
Before this patch, the test often took more than 2 minutes to finish
on my laptop; after this patch, it always finishes in 4-5 seconds.
Fixes #22744
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Closes scylladb/scylladb#22746
(cherry picked from commit f89235517d)
Closes scylladb/scylladb#22875
This PR addresses two related issues in our task system:
1. Prepares for asynchronous resource cleanup by converting release_resources() to a coroutine. This refactoring enables future improvements in how we handle resource cleanup.
2. Fixes a cross-shard resource cleanup issue in the SSTable loader where destruction of per-shard progress elements could trigger "shared_ptr accessed on non-owner cpu" errors in multi-shard environments. The fix uses coroutines to ensure resources are released on their owner shards.
Fixes #22759
---
This change addresses a regression introduced by d815d7013c, which is present in the 2025.1 and master branches, so it should be backported to the 2025.1 branch.
- (cherry picked from commit 4c1f1baab4)
- (cherry picked from commit b448fea260)
Parent PR: #22791
Closes scylladb/scylladb#22871
* github.com:scylladb/scylladb:
sstable_loader: fix cross-shard resource cleanup in download_task_impl
tasks: make release_resources() a coroutine
Set true to wait for the repair to complete. Set false to skip waiting
for the repair to complete. When the option is not provided, it defaults
to false.
This is useful for management tools that want the API to be asynchronous.
Fixes #22418
Closes scylladb/scylladb#22436
(cherry picked from commit fb318d0c81)
Closes scylladb/scylladb#22851
We are supposed to be loading the most recent RPC compression dictionary on startup, but we forgot to port the relevant piece of logic during the source-available port. This causes a restarted node not to use the dictionary for RPC compression until the next dictionary update.
Fix that.
Fixes #22738
This is more of a bugfix than an improvement, so it should be backported to 2025.1.
* (cherry picked from commit [dd82b40](dd82b40186))
* (cherry picked from commit [8fb2ea6](8fb2ea61ba))
Additionally cherry picked https://github.com/scylladb/scylladb/pull/22836 to fix the timeout.
Parent PR: #22739
Closes scylladb/scylladb#22837
* github.com:scylladb/scylladb:
test_rpc_compression.py: fix an overly-short timeout
test_rpc_compression.py: test the dictionaries are loaded on startup
raft/group0_state_machine: load current RPC compression dict on startup
The code currently assumes that a session has both sender and receiver
streams, but it is possible to have just one or the other.
Change the test to include this scenario and remove this assumption from
the code.
Fixes: #22770
Closes scylladb/scylladb#22771
(cherry picked from commit 87e8e00de6)
Closes scylladb/scylladb#22874
We need to allow replacing nodetool from scylla-enterprise-tools < 2024.2,
just like we did for scylla-tools < 5.5.
This is required to make packages able to upgrade from 2024.1.
Fixes #22820
Closes scylladb/scylladb#22821
(cherry picked from commit b5e306047f)
Closes scylladb/scylladb#22867
Previously, download_task_impl's destructor would destroy per-shard progress
elements on whatever shard the task was destroyed on. In multi-shard
environments, this caused "shared_ptr accessed on non-owner cpu" errors when
attempting to free memory allocated on a different shard.
Fix by:
- Convert progress_per_shard into a sharded service
- Stop the service on owner shards during cleanup using coroutines
- Add operator+= to stream_progress to leverage seastar's built-in adder
instead of a custom adder struct
Alternative approaches considered:
1. Using foreign_ptr: Rejected as it would require interface changes
that complicate stream delegation. foreign_ptr manages the underlying
pointee with another smart pointer but does not expose the smart
pointer instance in its APIs, making it impossible to use
shared_ptr<stream_progress> in the interface.
2. Using vector<stream_progress>: Rejected for similar interface
compatibility reasons.
This solution maintains the existing interfaces while ensuring proper
cross-shard cleanup.
Fixes scylladb/scylladb#22759
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
(cherry picked from commit b448fea260)
Convert tasks::task_manager::task::impl::release_resources() to a coroutine
to prepare for upcoming changes that will implement asynchronous resource
release.
This is a preparatory refactoring that enables future coroutine-based
implementation of resource cleanup logic.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
(cherry picked from commit 4c1f1baab4)
`set_notify_handler()` is called after a querier was inserted into the querier cache. It has two purposes: set a callback for eviction and set a TTL for the cache entry. The latter was not disabling the pre-existing timeout of the permit (if any), which would lead to premature eviction of the cache entry if the timeout was shorter than the TTL (which is typical).
Disable the timeout before setting the TTL to prevent premature eviction.
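The effect can be shown with a toy model, written in Python rather than Scylla's C++. The `Permit` class, the `eviction_time()` helper and the 5s/10s values are hypothetical and exist only for illustration; the only part taken from the description above is the ordering (disable the permit's timeout, then arm the TTL).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Permit:
    """Hypothetical stand-in for a reader permit with one expiry timer."""
    timeout_at: Optional[float] = None  # absolute time at which the permit expires

    def disable_timeout(self) -> None:
        self.timeout_at = None


def eviction_time(permit: Permit, now: float, ttl: float, disable_first: bool) -> float:
    """Return when the cached entry would be evicted.

    The entry goes away when either the TTL or a still-armed permit
    timeout fires, whichever comes first.
    """
    if disable_first:
        permit.disable_timeout()  # the fix: drop the old timeout before arming the TTL
    ttl_deadline = now + ttl
    if permit.timeout_at is None:
        return ttl_deadline
    return min(ttl_deadline, permit.timeout_at)


# Made-up numbers: a 5s read timeout and a 10s cache TTL (timeout shorter than TTL).
buggy = eviction_time(Permit(timeout_at=5.0), now=0.0, ttl=10.0, disable_first=False)
fixed = eviction_time(Permit(timeout_at=5.0), now=0.0, ttl=10.0, disable_first=True)
```

In this model the leftover timeout evicts the entry at 5s, while disabling it first lets the entry live for the full 10s TTL.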
Fixes: https://github.com/scylladb/scylladb/issues/22629
Backport required to all active releases, they are all affected.
- (cherry picked from commit a3ae0c7cee)
- (cherry picked from commit 9174f27cc8)
Parent PR: #22701
Closes scylladb/scylladb#22752
* github.com:scylladb/scylladb:
reader_concurrency_semaphore: set_notify_handler(): disable timeout
reader_permit: mark check_abort() as const
As of right now, materialized views (and consequently secondary
indexes), LWT and counters are unsupported or experimental with tablets.
Since tablets are enabled by default, training cases using those
features are currently broken.
The right thing to do here is to disable tablets in those cases.
Fixes https://github.com/scylladb/scylladb/issues/22638
Closes scylladb/scylladb#22661
(cherry picked from commit bea434f417)
Closes scylladb/scylladb#22808
This config item controls how many CPU-bound reads are allowed to run in
parallel. The effective concurrency of a single CPU core is 1, so
allowing more than one CPU-bound read to run concurrently will just
result in time-sharing and both reads having higher latency.
However, restricting concurrency to 1 means that a CPU bound read that
takes a lot of time to complete can block other quick reads while it is
running. Increase this default setting to 2 as a compromise between not
over-using time-sharing, while not allowing such slow reads to block the
queue behind them.
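The trade-off can be seen in a small simulation. This is a sketch under simplifying assumptions, not Scylla's scheduler: all reads are queued at time 0, admitted reads take turns in fixed one-unit time slices, and the job costs are made up. The `simulate()` function is a hypothetical helper.

```python
from collections import deque


def simulate(jobs, concurrency):
    """Run jobs on one CPU core with round-robin time-sharing.

    jobs: list of (name, cpu_units), all queued at time 0 in this order.
    At most `concurrency` jobs are admitted at once; admitted jobs take
    turns running one unit each. Returns {name: completion_time}.
    """
    waiting = deque(jobs)
    running = deque()
    finished = {}
    clock = 0
    while waiting or running:
        # Admit queued jobs while there is a free slot.
        while waiting and len(running) < concurrency:
            name, units = waiting.popleft()
            running.append([name, units])
        # Give the job at the head of the run queue one unit of CPU.
        job = running.popleft()
        job[1] -= 1
        clock += 1
        if job[1] == 0:
            finished[job[0]] = clock
        else:
            running.append(job)
    return finished


# One slow read queued ahead of three quick ones (made-up costs).
jobs = [("slow", 100), ("quick1", 1), ("quick2", 1), ("quick3", 1)]
serial = simulate(jobs, concurrency=1)
shared = simulate(jobs, concurrency=2)
```

With a concurrency of 1 the quick reads finish at 101, 102 and 103, all stuck behind the slow read. With a concurrency of 2 they finish at 2, 4 and 6, while the slow read is delayed only slightly, from 100 to 103.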
Fixes: #22450
Closes scylladb/scylladb#22679
(cherry picked from commit 3d12451d1f)
Closes scylladb/scylladb#22722
On short-pages, cut short because of a tombstone prefix.
When page-results are filtered and the filter drops some rows, the
last-position is taken from the page visitor, which does the filtering.
This means that last partition and row position will be that of the last
row the filter saw. This will not match the last position of the
replica, when the replica cut the page due to tombstones.
When fetching the next page, this means that all the tombstone suffix of
the last page, will be re-fetched. Worse still: the last position of the
next page will not match that of the saved reader left on the replica, so
the saved reader will be dropped and a new one created from scratch.
This wasted work will show up as elevated tail latencies.
Fix by always taking the last position from raw query results.
Fixes: #22620
Closes scylladb/scylladb#22622
(cherry picked from commit 7ce932ce01)
Closes scylladb/scylladb#22719
When a replica gets a write request it performs get_schema_for_write,
which waits until the schema is synced. However, database::add_column_family
marks a schema as synced before the table is added. Hence, the write may
see the schema as synced, but hit no_such_column_family as the table
hasn't been added yet.
Mark schema as synced after the table is added to database::_tables_metadata.
Fixes: #22347.
Closes scylladb/scylladb#22348
(cherry picked from commit 328818a50f)
Closes scylladb/scylladb#22604
Fixes #22401
In the fix for scylladb/scylla-enterprise#892, the extraction and check for sstable component encryption mask was copied
to a subroutine for description purposes, but a very important 1 << <value> shift was somehow
left on the floor.
Without this, the check for whether we actually contain a component encrypted can be wholly
broken for some components.
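A minimal illustration of this class of bug, in Python. The `Component` ids and the mask value are hypothetical and do not match the real sstable component enum; the point is only the difference between testing `mask & value` and `mask & (1 << value)`.

```python
from enum import IntEnum


class Component(IntEnum):
    """Hypothetical component ids; the real sstable component enum differs."""
    DATA = 0
    INDEX = 1
    FILTER = 2
    SUMMARY = 3


def is_encrypted_buggy(mask: int, component: Component) -> bool:
    # Bug: uses the component id itself as if it were a bit.
    return bool(mask & component)


def is_encrypted(mask: int, component: Component) -> bool:
    # Correct: shift 1 by the component id to select that component's bit.
    return bool(mask & (1 << component))


# Mask with only the DATA and FILTER bits set: 0b0101.
mask = (1 << Component.DATA) | (1 << Component.FILTER)
```

With this mask, the shifted check reports DATA and FILTER as encrypted, while the unshifted check gets all four components wrong: it reports INDEX and SUMMARY instead.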
Closes scylladb/scylladb#22398
(cherry picked from commit 7db14420b7)
Closes scylladb/scylladb#22599
Currently, when the status of a task is queried and the task is already finished,
it gets unregistered. Getting the status shouldn't be a one-time operation.
Stop removing the task after its status is queried. Adjust tests not to rely
on this behavior. Add task_manager/drain API and nodetool tasks drain
command to remove finished tasks in the module.
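The intended semantics can be sketched with a toy model in Python. The `TaskManager` class and its method names are illustrative only and are not Scylla's API; the model just shows that querying status is read-only and that removal of finished tasks is a separate, explicit step.

```python
class TaskManager:
    """Toy model; class and method names are illustrative, not Scylla's API."""

    def __init__(self):
        self._tasks = {}  # task_id -> state ("running" or "done")

    def register(self, task_id, state="running"):
        self._tasks[task_id] = state

    def finish(self, task_id):
        self._tasks[task_id] = "done"

    def get_status(self, task_id):
        # Read-only: querying a finished task does not unregister it.
        return self._tasks[task_id]

    def drain(self):
        # Explicitly drop every finished task; return how many were removed.
        done = [t for t, s in self._tasks.items() if s == "done"]
        for t in done:
            del self._tasks[t]
        return len(done)


tm = TaskManager()
tm.register("repair-1")
tm.register("backup-1")
tm.finish("repair-1")
first = tm.get_status("repair-1")
second = tm.get_status("repair-1")  # still there on the second query
removed = tm.drain()
```

After `drain()`, only the still-running task remains registered.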
Fixes: https://github.com/scylladb/scylladb/issues/21388.
It's a fix to the task_manager API and should be backported to all branches.
- (cherry picked from commit e37d1bcb98)
- (cherry picked from commit 18cc79176a)
Parent PR: #22310
Closes scylladb/scylladb#22598
* github.com:scylladb/scylladb:
api: task_manager: do not unregister tasks on get_status
api: task_manager: add /task_manager/drain
`tablet_storage_group_manager::all_storage_groups_split()` calls `set_split_mode()` for each of its storage groups to create split ready compaction groups. It does this by iterating through storage groups using `std::ranges::all_of()` which is not guaranteed to iterate through the entire range, and will stop iterating on the first occurrence of the predicate (`set_split_mode()`) returning false. `set_split_mode()` creates the split compaction groups and returns false if the storage group's main compaction group or merging groups are not empty. This means that in cases where the tablet storage group manager has non-empty storage groups, we could have a situation where split compaction groups are not created for all storage groups.
The missing split compaction groups are later created in `tablet_storage_group_manager::split_all_storage_groups()` which also calls `set_split_mode()`, and that is the reason why split completes successfully. The problem is that
`tablet_storage_group_manager::all_storage_groups_split()` runs under a group0 guard, but
`tablet_storage_group_manager::split_all_storage_groups()` does not. This can cause problems with operations that should be mutually exclusive with compaction group creation, e.g. DROP TABLE/DROP KEYSPACE.
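The short-circuit pitfall is not specific to C++. Python's built-in `all()` over a generator stops at the first false result just as `std::ranges::all_of()` does, so the following sketch uses it to reproduce the behavior. The `StorageGroup` class is a hypothetical stand-in, and collecting the results into a list first is just one way to visit every group, not necessarily how the fix is written.

```python
class StorageGroup:
    """Hypothetical stand-in for a storage group."""

    def __init__(self, name, empty):
        self.name = name
        self.empty = empty          # whether main/merging groups are empty
        self.split_ready = False    # whether split compaction groups exist

    def set_split_mode(self):
        # Side effect: create the split compaction groups.
        self.split_ready = True
        # Result: True only if there is nothing left to split.
        return self.empty


def all_split_short_circuit(groups):
    # Like std::ranges::all_of: stops at the first False.
    return all(g.set_split_mode() for g in groups)


def all_split_visit_all(groups):
    # Call set_split_mode() on every group, then combine the results.
    results = [g.set_split_mode() for g in groups]
    return all(results)


def make_groups():
    return [StorageGroup("sg0", empty=False),
            StorageGroup("sg1", empty=True),
            StorageGroup("sg2", empty=False)]


buggy_groups = make_groups()
buggy_result = all_split_short_circuit(buggy_groups)
fixed_groups = make_groups()
fixed_result = all_split_visit_all(fixed_groups)
```

Both variants return the same boolean, but in the short-circuit variant only the first group had its side effect applied.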
Fixes #22431
This is a bugfix and should be backported to versions with tablets: 6.1, 6.2 and 2025.1
- (cherry picked from commit 24e8d2a55c)
- (cherry picked from commit 8bff7786a8)
Parent PR: #22330
Closes scylladb/scylladb#22560
* github.com:scylladb/scylladb:
test: add reproducer and test for fix to split ready CG creation
table: run set_split_mode() on all storage groups during all_storage_groups_split()
Since mid December, tests started failing with ENOMEM while
submitting I/O requests.
Logs of failed tests show IO uring was used as backend, but we
never deliberately switched to IO uring. Investigation pointed
to it happening accidentally in commit 1bac6b75dc,
which turned on IO uring for allowing native tool in production,
and picked linux-aio backend explicitly when initializing Scylla.
But it missed that seastar-based tests would pick the default
backend, which is io_uring once enabled.
There's a reason we never made io_uring the default, which is
that it's not stable enough. It turns out we made the right
choice back then, as it apparently continues to be unstable,
causing flakiness in the tests.
Let's undo that accidental change in tests by explicitly
picking the linux-aio backend for seastar-based tests.
This should hopefully bring back stability.
Refs #21968.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Closes scylladb/scylladb#22695
(cherry picked from commit ce65164315)
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Closes scylladb/scylladb#22800
We are supposed to be loading the most recent RPC compression dictionary
on startup, but we forgot to port the relevant piece of logic during
the source-available port.
(cherry picked from commit dd82b40186)
In a few test cases of test_view_build_status we create a view, wait for
it and then query the view_build_status table and expect it to have all
rows for each node and view.
But it may fail because it could happen that the wait_for_view query and
the following queries are done on different nodes, and some of the nodes
didn't apply all the table updates yet, so they have missing rows.
To fix it, we change the assert to work in the eventual consistency
sense, retrying until the number of rows is as expected.
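An eventually consistent assertion of this kind can be sketched as a polling loop. The `retry_until()` helper and the `FakeNode` class below are hypothetical and are not the test suite's own utilities; the timeout and polling period are arbitrary.

```python
import time


def retry_until(predicate, timeout=5.0, period=0.01):
    """Poll `predicate` until it returns True or `timeout` seconds elapse."""
    deadline = time.monotonic() + timeout
    while True:
        if predicate():
            return
        if time.monotonic() >= deadline:
            raise AssertionError("condition not met within %.2fs" % timeout)
        time.sleep(period)


class FakeNode:
    """Hypothetical node whose view of the table converges after a few reads."""

    def __init__(self, rows_per_read):
        self._rows = iter(rows_per_read)
        self._last = 0

    def count_rows(self):
        # Return the next row count; once exhausted, keep returning the last one.
        self._last = next(self._rows, self._last)
        return self._last


node = FakeNode([2, 4, 6])   # rows visible on successive queries
expected = 6
# A plain `assert node.count_rows() == expected` would fail on the first read.
retry_until(lambda: node.count_rows() == expected)
```

The loop still fails loudly if the rows never show up, so a real bug is not hidden; it only tolerates the window in which some nodes have not applied all updates.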
Fixes scylladb/scylladb#22644
Closes scylladb/scylladb#22654
(cherry picked from commit c098e9a327)
Closes scylladb/scylladb#22780
When upgrading, for example, from `2024.1` to `2025.1`, the package name is
not identical, causing the upgrade command to fail:
```
Command: 'sudo DEBIAN_FRONTEND=noninteractive apt-get dist-upgrade scylla -y -o Dpkg::Options::="--force-confdef" -o Dpkg::Options::="--force-confold"'
Exit code: 100
Stdout:
Selecting previously unselected package scylla.
Preparing to unpack .../6-scylla_2025.1.0~dev-0.20250118.1ef2d9d07692-1_amd64.deb ...
Unpacking scylla (2025.1.0~dev-0.20250118.1ef2d9d07692-1) ...
Errors were encountered while processing:
/tmp/apt-dpkg-install-JbOMav/0-scylla-conf_2025.1.0~dev-0.20250118.1ef2d9d07692-1_amd64.deb
/tmp/apt-dpkg-install-JbOMav/1-scylla-python3_2025.1.0~dev-0.20250118.1ef2d9d07692-1_amd64.deb
/tmp/apt-dpkg-install-JbOMav/2-scylla-server_2025.1.0~dev-0.20250118.1ef2d9d07692-1_amd64.deb
/tmp/apt-dpkg-install-JbOMav/3-scylla-kernel-conf_2025.1.0~dev-0.20250118.1ef2d9d07692-1_amd64.deb
/tmp/apt-dpkg-install-JbOMav/4-scylla-node-exporter_2025.1.0~dev-0.20250118.1ef2d9d07692-1_amd64.deb
/tmp/apt-dpkg-install-JbOMav/5-scylla-cqlsh_2025.1.0~dev-0.20250118.1ef2d9d07692-1_amd64.deb
Stderr:
E: Sub-process /usr/bin/dpkg returned an error code (1)
```
Fix by adding `Obsoletes` (for rpm) and `Replaces` (for deb).
Fixes: https://github.com/scylladb/scylladb/issues/22420
Closes scylladb/scylladb#22457
(cherry picked from commit 93f53f4eb8)
Closes scylladb/scylladb#22753
set_notify_handler() is called after a querier was inserted into the
querier cache. It has two purposes: set a callback for eviction and set
a TTL for the cache entry. The latter was not disabling the
pre-existing timeout of the permit (if any), which would lead to
premature eviction of the cache entry if the timeout was shorter than
the TTL (which is typical).
Disable the timeout before setting the TTL to prevent premature
eviction.
Fixes: scylladb/scylladb#22629
(cherry picked from commit 9174f27cc8)
Currently, the session ID under which the truncate for tablets request is
running is created during the request creation and queuing. This is a problem
because this could overwrite the session ID of any ongoing operation on
system.topology#session.
This change moves the creation of the session ID for truncate from the request
creation to the request handling.
Fixes #22613
Closes scylladb/scylladb#22615
(cherry picked from commit a59618e83d)
Closes scylladb/scylladb#22705
with_permit() creates a permit, with a self-reference, to avoid
attaching a continuation to the permit's run function. This
self-reference is used to keep the permit alive, until the execution
loop processes it. This self reference has to be carefully cleared on
error-paths, otherwise the permit will become a zombie, effectively
leaking memory.
Instead of trying to handle all loose ends, get rid of this
self-reference altogether: ask caller to provide a place to save the
permit, where it will survive until the end of the call. This makes the
call-site a little bit less nice, but it gets rid of a whole class of
possible bugs.
Fixes: #22588
Closes scylladb/scylladb#22624
(cherry picked from commit f2d5819645)
Closes scylladb/scylladb#22704
Currently, when the tablet repair is started, info regarding
the operation is kept in the system.tablets. The new tablet states
are reflected in memory after load_topology_state is called.
Before that, the data in the table and the memory aren't consistent.
To check the supported operations, tablet_virtual_task uses in-memory
tablet_metadata. Hence, it may not see the operation, even though
its info is already kept in system.tablets table.
Run read barrier in tablet_virtual_task::contains to ensure it will
see the latest data. Add a test to check it.
Fixes: #21975.
Closes scylladb/scylladb#21995
(cherry picked from commit 610a761ca2)
Closes scylladb/scylladb#22694
If start_time/end_time is unspecified for a task, task_manager API
returns epoch. Nodetool prints the value in task status.
Fix nodetool tasks commands to print empty string for start_time/end_time
if it isn't specified.
Modify nodetool tasks status docs to show empty end_time.
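The formatting rule amounts to treating the epoch as "not set". A minimal sketch in Python follows; the `format_task_time()` name, the integer-seconds input and the ISO-like output format are assumptions for illustration, not nodetool's actual code or output.

```python
from datetime import datetime, timezone


def format_task_time(seconds_since_epoch: int) -> str:
    """Render a task timestamp; treat the epoch (0) as "not set"."""
    if seconds_since_epoch == 0:
        return ""
    moment = datetime.fromtimestamp(seconds_since_epoch, tz=timezone.utc)
    return moment.strftime("%Y-%m-%dT%H:%M:%SZ")
```

A task that has not finished yet would then show an empty end_time instead of `1970-01-01T00:00:00Z`.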
Fixes: #22373.
Closes scylladb/scylladb#22370
(cherry picked from commit 477ad98b72)
Closes scylladb/scylladb#22601