Commit Graph

46439 Commits

Author SHA1 Message Date
Ferenc Szili
a59618e83d truncate: create session during request handling
Currently, the session ID under which the truncate for tablets request is
running is created during the request creation and queuing. This is a problem
because this could overwrite the session ID of any ongoing operation on
system.topology#session

This change moves the creation of the session ID for truncate from the request
creation to the request handling.

Fixes #22613

Closes scylladb/scylladb#22615
2025-02-04 22:11:24 +01:00
Botond Dénes
f2d5819645 reader_concurrency_semaphore: with_permit(): proper clean-up after queue overload
with_permit() creates a permit, with a self-reference, to avoid
attaching a continuation to the permit's run function. This
self-reference is used to keep the permit alive, until the execution
loop processes it. This self reference has to be carefully cleared on
error-paths, otherwise the permit will become a zombie, effectively
leaking memory.
Instead of trying to handle all loose ends, get rid of this
self-reference altogether: ask caller to provide a place to save the
permit, where it will survive until the end of the call. This makes the
call-site a little bit less nice, but it gets rid of a whole class of
possible bugs.

Fixes: #22588

Closes scylladb/scylladb#22624
2025-02-04 21:27:16 +02:00
Michał Chojnowski
bea434f417 pgo: disable tablets for training with secondary index, lwt and counters
As of right now, materialized views (and consequently secondary
indexes), lwt and counters are unsupported or experimental with tablets.
Since by defaults tablets are enabled, training cases using those
features are currently broken.

The right thing to do here is to disable tablets in those cases.

Fixes https://github.com/scylladb/scylladb/issues/22638

Closes scylladb/scylladb#22661
2025-02-04 15:38:53 +02:00
Aleksandra Martyniuk
683176d3db tasks: add shard, start_time, and end_time to task_stats
task_stats contains short info about a task. To get a list of task_stats
in the module, one needs to request /task_manager/list_module_tasks/{module}.

To make identification and navigation between tasks easier, extend
task_stats to contain shard, start_time, and end_time.

Closes scylladb/scylladb#22351
2025-02-04 12:11:24 +02:00
Botond Dénes
8c8db2052e Merge 'service: add child for tablet repair virtual task' from Aleksandra Martyniuk
tablet_repair_task_impl is run as a part of tablet repair. Make it
a child of tablet repair virtual task.

tablet_repair_task_impl started by /storage_service/repair_async API
(vnode repair) does not have a parent, as it is the top-level task
in that case.

No backport needed; new functionality

Closes scylladb/scylladb#22372

* github.com:scylladb/scylladb:
  test: add test to check tablet repair child
  service: add child for tablet repair virtual task
2025-02-04 12:08:24 +02:00
Aleksandra Martyniuk
610a761ca2 service: use read barrier in tablet_virtual_task::contains
Currently, when the tablet repair is started, info regarding
the operation is kept in the system.tablets. The new tablet states
are reflected in memory after load_topology_state is called.
Before that, the data in the table and the memory aren't consistent.

To check the supported operations, tablet_virtual_task uses in-memory
tablet_metadata. Hence, it may not see the operation, even though
its info is already kept in system.tablets table.

Run read barrier in tablet_virtual_task::contains to ensure it will
see the latest data. Add a test to check it.

Fixes: #21975.

Closes scylladb/scylladb#21995
2025-02-04 12:07:42 +02:00
Avi Kivity
6913f054e7 Update tools/cqlsh submodule
The driver update makes cqlsh work well with Python 3.13.

* tools/cqlsh 52c6130...02ec7c5 (18):
  > chore(deps): update dependency scylla-driver to v3.28.2
  > dist: support smooth upgrade from enterprise to source availalbe
  > github action: fix downloading of artifacts
  > chore(deps): update docker/setup-buildx-action action to v3
  > chore(deps): update docker/login-action action to v3
  > chore(deps): update docker/build-push-action action to v6
  > chore(deps): update docker/setup-qemu-action action to v3
  > chore(deps): update peter-evans/dockerhub-description action to v4
  > upload actions: update the usage for multiple artifacts
  > chore(deps): update actions/download-artifact action to v4.1.8
  > chore(deps): update dependency scylla-driver to v3.28.0
  > chore(deps): update pypa/cibuildwheel action to v2.22.0
  > chore(deps): update actions/checkout action to v4
  > chore(deps): update python docker tag to v3.13
  > chore(deps): update actions/upload-artifact action to v4
  > github actions: update it to work
  > add option to output driver debug
  > Add renovate.json (#107)

Closes scylladb/scylladb#22593
2025-02-04 12:06:54 +02:00
Avi Kivity
f25636884a api: storage_service: break out set_storage_service lambdas into free functions
This was originally an attempt to reduce the compile time of this
translation unit, but apparently it doesn't work. Still, it has
the effect of converting stack traces that say "set_storage_service"
and refer to some lambda to stack traces that refer to the operation
being performed, so it's a net positive.

To faciliate the change, we introduce new functions rest_bind(),
similar to (and in fact wrapping) std::bind_front(), that capture
references like the lambdas did originally. We can't use
std::bind_front directly since the call to
seastar::httpd::path_description::set() cannot be disambiguated
after the function is obscured by the template returned by
std::bind_front. The new function rest_bind() has constraints
to understand which overload is in use.

Closes scylladb/scylladb#22526
2025-02-04 12:06:18 +02:00
Ran Regev
edd56a2c1c moved cache files to db
As requested in #22097, moved the files
and fixed other includes and build system.

Fixes: #22097
Signed-off-by: Ran Regev <ran.regev@scylladb.com>

Closes scylladb/scylladb#22495
2025-02-04 12:21:31 +03:00
Pavel Emelyanov
e47c7d5255 Merge 'config: Improve internode_compression option validation and documentation' from Kefu Chai
This PR enhances the internode_compression configuration option in two ways:

1. Add validation for option values
   Previously, we silently defaulted to 'none' when given invalid values. Now we
   explicitly validate against the three supported values (all, dc, none) and
   reject invalid inputs. This provides better error messages when users
   misconfigure the option.

2. Fix documentation rendering
   The help text for this option previously used C++ escape sequences which
   rendered incorrectly in Sphinx-generated HTML. We now use bullet points with
   '*' prefix to list the available values, matching our documentation style
   for other config options. This ensures consistent rendering in both CLI
   and HTML outputs.

Note: The current documentation format puts type/default/liveness information
in the same bullet list as option values. This affects other config options
as well and will need to be addressed in a separate change.

---

this improves the handling of invalid option values, and improves the doc rendering, neither of which is critical. hence no need to backport.

Closes scylladb/scylladb#22548

* github.com:scylladb/scylladb:
  config: validate internode_compression option values
  config: start available options with '*'
2025-02-04 10:17:23 +03:00
Andrei Chekun
2a99494752 test.py: Remove workaround for python bug in asyncio
Bug https://bugs.python.org/issue26789 is resolved in python 3.10.
The frozen tool chain uses python 3.12. Since this is a supported and
recommended way for work environment, removing workaround and bumping
requirements for a newer python version.

Closes scylladb/scylladb#22627
2025-02-03 22:27:34 +02:00
David Garcia
fe4750ffc3 docs: fetch multiverson config from remote sources
fix: brand

Closes scylladb/scylladb#22616
2025-02-03 15:25:10 +02:00
Yaron Kaikov
4f832c31b9 .github/workflows/make-pr-ready-for-review: add missing permissions
Following the work done in ed4bfad5c3, the action is failing with the
following error:
```
Error: Input required and not supplied: token
```

It is due ot missing permissions in the workflow, adding it

Closes scylladb/scylladb#22630
2025-02-03 13:25:27 +02:00
Aleksandra Martyniuk
43427b8fe0 test: add test to check tablet repair child 2025-02-03 10:31:16 +01:00
Aleksandra Martyniuk
c23ce40f50 service: add child for tablet repair virtual task
tablet_repair_task_impl is run as a part of tablet repair. Make it
a child of tablet repair virtual task.

tablet_repair_task_impl started by /storage_service/repair_async API
(vnode repair) does not have a parent, as it is the top-level task
in that case.
2025-02-03 10:31:14 +01:00
Avi Kivity
d237d0a4ea Update seastar submodule
* seastar 71036ebcc0...5b95d1d798 (3):
  > rpc stream: do not abort stream queue if stream connection was closed without error
  > resource: fallback to sysconf when failed to detect memory size from hwloc
  > Merge 'scheduling_group: improve scheduling group creation exception safety' from Michael Litvak

scylla-gdb.py adjusted for scheduling_group_specific data structure
changes in Seastar. As part of that, a gratuitous dereference of
std::unique_ptr, which fails for std::unique_ptr<void*, ...>, was
removed.
2025-02-03 00:10:38 +02:00
Botond Dénes
e1b1a2068a reader_concurrency_semaphore: foreach_permit(): include _inactive_reads
So inactive reads show up in semaphore diagnostics dumps (currently the
only non-test user of this method).

Fixes: #22574

Closes scylladb/scylladb#22575
2025-01-30 22:46:57 +02:00
Michael Litvak
44c06ddfbb test/test_view_build_status: fix wrong assert in test
The test expects and asserts that after wait_for_view is completed we
read the view_build_status table and get a row for each node and view.
But this is wrong because wait_for_view may have read the table on one
node, and then we query the table on a different node that didn't insert
all the rows yet, so the assert could fail.

To fix it we change the test to retry and check that eventually all
expected rows are found and then eventually removed on the same host.

Fixes scylladb/scylladb#22547

Closes scylladb/scylladb#22585
2025-01-30 21:25:53 +02:00
Michael Litvak
6d34125eb7 view_builder: fix loop in view builder when tokens are moved
The view builder builds a view by going over the entire token ring,
consuming the base table partitions, and generating view updates for
each partition.

A view is considered as built when we complete a full cycle of the
token ring. Suppose we start to build a view at a token F. We will
consume all partitions with tokens starting at F until the maximum
token, then go back to the minimum token and consume all partitions
until F, and then we detect that we pass F and complete building the
view. This happens in the view builder consumer in
`check_for_built_views`.

The problem is that we check if we pass the first token F with the
condition `_step.current_token() >= it->first_token` whenever we consume
a new partition or the current_token goes back to the minimum token.
But suppose that we don't have any partitions with a token greater than
or equal to the first token (this could happen if the partition with
token F was moved to another node for example), then this condition will never be
satisfied, and we don't detect correctly when we pass F. Instead, we
go back to the minimum token, building the same token ranges again,
in a possibly infinite loop.

To fix this we add another step when reaching the end of the reader's
stream. When this happens it means we don't have any more fragments to
consume until the end of the range, so we advance the current_token to
the end of the range, simulating a partition, and check for built views
in that range.

Fixes scylladb/scylladb#21829

Closes scylladb/scylladb#22493
2025-01-30 14:35:18 +02:00
Nikos Dragazis
439862a8d4 test/cqlpy: Reproduce bug with exceeded limit on secondary index
Add two cqlpy tests that reproduce a bug where a secondary index query
returns more rows than the specified limit. This occurs when the indexed
column is a partition key column or the first clustering key column,
the query result spans multiple partitions, and the last partition
causes the limit to be exceeded.

`test/cqlpy/run --release ...` shows that the tests fail for Scylla
versions all the way back to 4.4.0. Older Scylla versions fail with a
syntax error in CQL query which suggests some incompatibility in the
CQL protocol. That said, this bug is not a regression.

The tests pass in Cassandra 5.0.2.

Refs #22158.

Signed-off-by: Nikos Dragazis <nikolaos.dragazis@scylladb.com>

Closes scylladb/scylladb#22513
2025-01-30 13:24:15 +02:00
Kefu Chai
f39cfd8eb0 compaction: switch boost::algorithm::any_of to std::ranges::any_of
std::any_of was included by C++11, and boost::algorithm::any_of() is
provided by Boost for users stuck in the pre-C++11 era. in our case,
we've moved into C++23, where the ranges variant of this algorithm
is available.

in order to reduce the header dependency, let's switch to
`std::ranges::any_of()`.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#22503
2025-01-30 13:22:33 +02:00
Artsiom Mishuta
03606b8e22 test.py:topology_random_failures: enable tests deselected for #21534
removed tests deselectios for issue scylladb/scylladb#21534
as it closed now

fixes: scylladb/scylladb#21711

Closes scylladb/scylladb#22424
2025-01-30 12:12:19 +01:00
Wojciech Mitros
677f9962cf mv: forbid views with tablets by default
Materialized views with tablets are not stable yet, but we want
them available as an experimental feature, mainly for teseting.

The feature was added in scylladb/scylladb#21833,
but currently it has no effect. All tests have been updated to use the
feature, so we should finally make it work.
This patch prevents users from creating materialized views in keyspaces
using tablets when the VIEWS_WITH_TABLETS feature is not enabled - such
requests will now get rejected.

Fixes scylladb/scylladb#21832

Closes scylladb/scylladb#22217
2025-01-30 12:10:47 +01:00
aberry-21
69a0431cce schema: add validation for PERCENTILE values in speculative_retry configuration
This commit addresses issue #21825, where invalid PERCENTILE values for
the `speculative_retry` setting were not properly handled, causing potential
server crashes. The valid range for PERCENTILE is between 0 and 100, as defined
in the documentation for speculative retry options, where values above 100 or
below 0 are invalid and should be rejected.

The added validation ensures that such invalid values are rejected with a clear
error message, improving system stability and user experience.

Fixes #21825

Closes scylladb/scylladb#21879
2025-01-30 11:34:46 +02:00
Yaron Kaikov
ed4bfad5c3 .github: add action to make PR ready for review when conflicts label was removed
Moving a PR out of draft is only allowed to users with write access,
adding a github action to switch PR to `ready for review` once the
`conflicts` label was removed

Closes scylladb/scylladb#22446
2025-01-30 11:33:25 +02:00
Nadav Har'El
698a63e14b test/alternator: test for invalid B value in UpdateItem
This patch adds an Alternator test for the case of UpdateItem attempting
to insert in invalid B (bytes) value into an item. Values of type B
use base64 encoding, and an attempt to insert a value which isn't
valid base64 should be rejected, and this is what this test verifies.

The new tests reproduce issue #17539, which claimed we have a bug in
this area. However, test/alternator/run with the "--release" option
shows that this bug existed in Scylla 5.2, but but fixed long ago, in
5.3 and doesn't exist in master. But we never had a regression test this
issue, so now we do.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>

Closes scylladb/scylladb#22029
2025-01-30 11:33:03 +02:00
Botond Dénes
af46894bb7 Merge 'Rack aware view pairing' from Benny Halevy
Enabled with the tablets_rack_aware_view_pairing cluster feature
rack-aware pairing pairs base to view replicas that are in the
same dc and rack, using their ordinality in the replica map

We distinguish between 2 cases:
- Simple rack-aware pairing: when the replication factor in the dc
  is a multiple of the number of racks and the minimum number of nodes
  per rack in the dc is greater than or equal to rf / nr_racks.

  In this case (that includes the single rack case), all racks would
  have the same number of replicas, so we first filter all replicas
  by dc and rack, retaining their ordinality in the process, and
  finally, we pair between the base replicas and view replicas,
  that are in the same rack, using their original order in the
  tablet-map replica set.

  For example, nr_racks=2, rf=4:
  base_replicas = { N00, N01, N10, N11 }
  view_replicas = { N11, N12, N01, N02 }
  pairing would be: { N00, N01 }, { N01, N02 }, { N10, N11 }, { N11, N12 }
  Note that we don't optimize for self-pairing if it breaks pairing ordinality.

- Complex rack-aware pairing: when the replication factor is not
  a multiple of nr_racks.  In this case, we attempt best-match
  pairing in all racks, using the minimum number of base or view replicas
  in each rack (given their global ordinality), while pairing all the other
  replicas, across racks, sorted by their ordinality.

  For example, nr_racks=4, rf=3:
  base_replicas = { N00, N10, N20 }
  view_replicas = { N11, N21, N31 }
  pairing would be: { N00, N31 }\*, { N10, N11 }, { N20, N21 }
  \* cross-rack pair

  If we'd simply stable-sort both base and view replicas by rack,
  we might end up with much worse pairing across racks:
  { N00, N11 }\*, { N10, N21 }\*, { N20, N31 }\*
  \* cross-rack pair

Fixes scylladb/scylladb#17147

* This is an improvement so no backport is required

Closes scylladb/scylladb#21453

* github.com:scylladb/scylladb:
  network_topology_strategy_test: add tablets rack_aware_view_pairing tests
  view: get_view_natural_endpoint: implement rack-aware pairing for tablets
  view: get_view_natural_endpoint: handle case when there are too few view replicas
  view: get_view_natural_endpoint: track replica locator::nodes
  locator: topology: consult local_dc_rack if node not found by host_id
  locator: node: add dc and rack getters
  feature_service: add tablet_rack_aware_view_pairing feature
  view: get_view_natural_endpoint: refactor predicate function
  view: get_view_natural_endpoint: clarify documentation
  view: mutate_MV: optimize remote_endpoints filtering check
  view: mutate_MV: lookup base and view erms synchronously
  view: mutate_MV: calculate keyspace-dependent flags once
2025-01-30 11:32:19 +02:00
Aleksandra Martyniuk
328818a50f replica: mark registry entry as synch after the table is added
When a replica get a write request it performs get_schema_for_write,
which waits until the schema is synced. However, database::add_column_family
marks a schema as synced before the table is added. Hence, the write may
see the schema as synced, but hit no_such_column_family as the table
hasn't been added yet.

Mark schema as synced after the table is added to database::_tables_metadata.

Fixes: #22347.

Closes scylladb/scylladb#22348
2025-01-30 11:30:07 +02:00
Aleksandra Martyniuk
477ad98b72 nodetool: tasks: print empty string for start_time/end_time if unspecified
If start_time/end_time is unspecified for a task, task_manager API
returns epoch. Nodetool prints the value in task status.

Fix nodetool tasks commands to print empty string for start_time/end_time
if it isn't specified.

Modify nodetool tasks status docs to show empty end_time.

Fixes: #22373.

Closes scylladb/scylladb#22370
2025-01-30 11:29:36 +02:00
Calle Wilund
7db14420b7 encryption: Fix encrypted components mask check in describe
Fixes #22401

In the fix for scylladb/scylla-enterprise#892, the extraction and check for sstable component encryption mask was copied
to a subroutine for description purposes, but a very important 1 << <value> shift was somehow
left on the floor.

Without this, the check for whether we actually contain a component encrypted can be wholly
broken for some components.

Closes scylladb/scylladb#22398
2025-01-30 11:29:13 +02:00
Botond Dénes
8e89f2e88e Merge 'audit: make categories, tables, and keyspaces liveupdatable' from Andrzej Jackowski
This change:
 - Remove code that prevented audit from starting if audit_categories,
   audit_tables, and audit_keyspaces are not configured
 - Set liveness::LiveUpdate for audit_categories, audit_tables,
   and audit_keyspaces
 - Keep const reference to db::config in audit, so current config values
   can be obtained by audit implementation
 - Implement function audit::update_config to parse given string, update
   audit datastructures when needed, and log the changes.
 - Add observers to call audit::update_config when categories,
   tables, or keyspaces configuration changes

New functionality, so no backport needed.
Fixes https://github.com/scylladb/scylla-enterprise/issues/1789

Closes scylladb/scylladb#22449

* github.com:scylladb/scylladb:
  audit: make categories, tables, and keyspaces liveupdatable
  audit: move static parsing functions above audit constructors
  audit: move statement_category to string conversion to static function
  audit: start audit even with empty categories/tables/keyspaces
2025-01-30 11:28:49 +02:00
Botond Dénes
d8b8a6c5fc Merge 'api: task_manager: do not unregister finish task when its status is queried' from Aleksandra Martyniuk
Currently, when the status of a task is queried and the task is already finished,
it gets unregistered. Getting the status shouldn't be a one-time operation.

Stop removing the task after its status is queried. Adjust tests not to rely
on this behavior. Add task_manager/drain API and nodetool tasks drain
command to remove finished tasks in the module.

Fixes: https://github.com/scylladb/scylladb/issues/21388.

It's a fix to task_manager API, should be backported to all branches

Closes scylladb/scylladb#22310

* github.com:scylladb/scylladb:
  api: task_manager: do not unregister tasks on get_status
  api: task_manager: add /task_manager/drain
2025-01-30 11:27:44 +02:00
Botond Dénes
98fdf05b0e Merge 'Fix repair vs storage services initialization order' from Pavel Emelyanov
Repair service is started after storage service, while storage service needs to reference repair one for its needs. Recently it was noticed, that this reverse order may cause troubles and was fixed with the help of an extra gate. That's not nice and makes the start-stop mess even worse. The correct fix is to fix the order both services start/stop in.

Closes scylladb/scylladb#22368

* github.com:scylladb/scylladb:
  Revert "repair: add repair_service gate"
  main: Start repair before storage service
  repair: Check for sharded<view-builder> when constructing row_level_repair
2025-01-30 11:26:24 +02:00
Nadav Har'El
98a8ae0552 test/alternator: functional tests for Alternator multi-item transactions
This patch adds extensive functional tests for the DynamoDB multi-item
transactions feature - the TransactWriteItems and TransactGetItems
requests. We add 43 test functions, spanning more than 1000 lines of code,
covering the different parameters and corner cases of these requests.

Because we don't support the transaction feature in Alternator yet (this
is issue #5064), all of these tests fail on Alternator but all of them
were tested to pass on DynamoDB. So all new tests are marked "xfail".

These tests will be handy for whoever will implement this feature as
an acceptance test, and can also be useful for whoever will just want to
understand this feature better - the tests are short and simple and
heavily commented.

Note that these tests only check the correct functionality of individual
calls of these requests - these tests cannot and do not check the
consistency or isolation guarantees of concurrent invocations of
several requests. Such tests would require a different test framework,
such as the one requested in issue #6350, and are therefore not part of
this patch.

Note that this patch includes ONLY tests, and does not mean that an
implementation of the feature will soon follow. In fact, nobody is
currently working on implementing this feature.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>

Closes scylladb/scylladb#22239
2025-01-30 11:22:05 +02:00
Avi Kivity
000791ad5c README: adjust to reflect license change
Adjust the contact section to reflect the license change.

Closes scylladb/scylladb#22537
2025-01-30 10:28:32 +03:00
Kamil Braun
febd45861e test/lib: cql_test_env: make service shutdown more verbose
Introduce `defer_verbose_shutdown` in `cql_test_env` which logs
a message before and after shutting down a service, distinguishing
between success and failure.

The function is similar to the one in `main` but skips special error
handling logic applicable only to the main Scylla binary. The purpose
of the `cql_test_env` version of this function is only more verbose
logging. If necessary it can be extended in the future with additional
logic.

I estimated the impact on the size of produced log files using
`cdc_test` as an example:
```
$ build/dev/test/boost/combined_tests --run_test=cdc_test -- --smp=2 \
    >logfile 2>&1
$ du -b logfile
```

the result before this commit: 1964064 bytes, after: 2196432 bytes,
so estimated ~12% increase of log file size for boost tests that use
`cql_test_env`, assuming that the number of logs printed by each test is
similar to the logs printed by `cdc_test` (but I believe `cdc_test` is
one of the less verbose tests so this is an overestimate).

The motivation for this change is easier debugging of shutdown issues.
When investigating scylladb/scylladb#21983, where an exception is
thrown somewhere during the shutdown procedure, I found it hard to
pinpoint the service from which the exception originates. This change
will make it easier to debug issues like that by wrapping shutdown of
each service in a pair of messages logged when shutdown starts and when
it finishes (including when it fails). We should get more details on
this issue when it reproduces again in CI after this commit is merged
into `master`. (I failed to reproduce it locally with 1000 runs.)

Ref scylladb/scylladb#21983

Closes scylladb/scylladb#22566
2025-01-30 10:27:45 +03:00
Botond Dénes
5dd6fcfe6f Merge 'encrypted_file_impl: Check for reads on or past actual file length in transform' from Calle Wilund
Fixes #22236

If reading a file and not stopping on block bounds returned by `size()`, we could allow reading from (_file_size+&lt;1-15&gt;) (if crossing block boundary) and try to decrypt this buffer (last one).

Simplest example:
Actual data size: 4095
Physical file size: 4095 + key block size (typically 16)
Read from 4096: -> 15 bytes (padding) -> transform return `_file_size` - `read offset` -> wraparound -> rather larger number than we expected (not to mention the data in question is junk/zero).

Check on last block in `transform` would wrap around size due to us being >= file size (l).
Just do an early bounds check and return zero if we're past the actual data limit.

Closes scylladb/scylladb#22395

* github.com:scylladb/scylladb:
  encrypted_file_test: Test reads beyond decrypted file length
  encrypted_file_impl: Check for reads on or past actual file length in transform
2025-01-29 20:09:32 +02:00
Anna Stuchlik
2a6445343c doc: update the Web Installer docs to remove OSS
Fixes https://github.com/scylladb/scylladb/issues/22292

Closes scylladb/scylladb#22433
2025-01-29 20:00:01 +02:00
Anna Stuchlik
caf598b118 doc: add SStable support in 2025.1
This commit adds the information about SStable version support in 2025.1
by replacing "2022.2" with "2022.2 and above".

In addition, this commit removes information about versions that are
no longer supported.

Fixes https://github.com/scylladb/scylladb/issues/22485

Closes scylladb/scylladb#22486
2025-01-29 19:59:24 +02:00
dependabot[bot]
962bd452f5 build(deps): bump sphinx-scylladb-theme from 1.8.3 to 1.8.5 in /docs
Bumps [sphinx-scylladb-theme](https://github.com/scylladb/sphinx-scylladb-theme) from 1.8.3 to 1.8.5.
- [Release notes](https://github.com/scylladb/sphinx-scylladb-theme/releases)
- [Commits](https://github.com/scylladb/sphinx-scylladb-theme/compare/1.8.3...1.8.5)

---
updated-dependencies:
- dependency-name: sphinx-scylladb-theme
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Closes scylladb/scylladb#22435
2025-01-29 19:57:33 +02:00
Pavel Emelyanov
4aff86ac64 sstables: Mark sstable::sstable_buffer_size const
It really never changes once set in constructor

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>

Closes scylladb/scylladb#22533
2025-01-29 19:51:22 +02:00
Avi Kivity
c71f383cf2 Merge 'sstables: use std::variant instead of boost::variant' from Botond Dénes
Continue replacing boost types with std one where possible.

Improvement, no backport needed

Closes scylladb/scylladb#22290

* github.com:scylladb/scylladb:
  sstables: disk_types: disk_set_of_tagged_union: boost::variant -> std::variant
  scylla-gdb.py: std_variant: fix get()
  sstables: disk_types: remove unused disk_tagged_union
2025-01-29 15:29:15 +02:00
Michał Chojnowski
80072eefe5 test/scylla_gdb: add more checks to coro_task()
test_coro_frame is flaky, as if
`service::topology_coordinator::run() [clone .resume]` wasn't running
on the shard. But it's supposed to.

Perhaps this is a bug in `find_vptrs()`?
This patch asks `scylla find` for a second opinion, and also prints
all `find_vptrs()`, to see if it's the only coroutine missing
from there.

Closes scylladb/scylladb#22534
2025-01-29 11:02:24 +02:00
Kefu Chai
e218a62a7a cdc,index: replace boost::ends_with() with .ends_with()
since C++20, std::string and std::string_view started providing
`ends_with()` member function, the same applies to `seastar::sstring`,
so there is no need to use `boost::ends_with()` anymore.

in this change, we switch from `boost::ends_with()` to the member
functions variant to

- improve the readability
- reduce the header dependency

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#22502
2025-01-29 11:52:55 +03:00
Pavel Emelyanov
2eb06c6384 Update seastar submodule with recent IO scheduler improvement
* seastar 18221366...71036ebc (2):
  > Merge 'fair_queue: make the fair_group token grabbing discipline more fair' from Michał Chojnowski
      apps/io_tester: add some test cases for the IO scheduler
      test: in fair_queue_test, ensure that tokens are only replenished by test_env
      fair_queue: track the total capacity of queued requests
      fair_queue: make the fair_group token grabbing discipline more fair
  > scheduling: auto-detect scheduling group key rename() method

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>

Closes scylladb/scylladb#22504
2025-01-29 11:39:12 +03:00
Kefu Chai
8d0cabb392 config: validate internode_compression option values
Previously, the internode_compression option silently defaulted to 'none'
for any unrecognized value instead of validating input. It only compared
against 'all' and 'dc', making it error-prone.

Add explicit validation for the three supported values:
- all
- dc
- none

This ensures invalid values are rejected both in command line and YAML
configuration, providing better error messages to users.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2025-01-29 14:52:35 +08:00
Kefu Chai
a81416862a config: start available options with '*'
use '*' prefix for config option values instead of escape sequences

The custom Sphinx extension that generates documentation from config.cc
help messages has issues with C++ escape sequences. For example,
"\tall: All traffic" renders incorrectly as "tall: All traffic" in HTML
output.

Instead of using escape sequences, switch to bullet-point style with '*'
prefix which works better in both CLI and HTML rendering. This matches
our existing documentation style for available option values in other
configs.

Note: This change puts type/default/liveness info in the same bullet
list as option values. This limitation affects other similar config
options and will need to be addressed comprehensively in a future
change.

Refs scylladb/scylladb#22423
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2025-01-29 14:52:35 +08:00
Michael Litvak
4f5550d7f2 cdc: fix handling of new generation during raft upgrade
During raft upgrade, a node may gossip about a new CDC generation that
was propagated through raft. The node that receives the generation by
gossip may have not applied the raft update yet, and it will not find
the generation in the system tables. We should consider this error
non-fatal and retry to read until it succeeds or becomes obsolete.

Another issue is when we fail with a "fatal" exception and not retrying
to read, the cdc metadata is left in an inconsistent state that causes
further attempts to insert this CDC generation to fail.

What happens is we complete preparing the new generation by calling `prepare`,
we insert an empty entry for the generation's timestamp, and then we fail. The
next time we try to insert the generation, we skip inserting it because we see
that it already has an entry in the metadata and we determine that
there's nothing to do. But this is wrong, because the entry is empty,
and we should continue to insert the generation.

To fix it, we change `prepare` to return `true` when the entry already
exists but it's empty, indicating we should continue to insert the
generation.

Fixes scylladb/scylladb#21227

Closes scylladb/scylladb#22093
2025-01-28 18:05:32 +01:00
Kamil Braun
add97ccc15 Merge 'Do not update topology on address change' from Gleb Natapov
Since now topology does not contain ip addresses there is no need to
create topology on an ip address change. Only peers table has to be
updated. The series factors out peers table update code from
sync_raft_topology_nodes() and calls it on topology and ip address
updates. As a side effect it fixes #22293 since now topology loading
does not require IP do be present, so the assert that is triggered in
this bug is removed.

Fixes: scylladb/scylladb#22293

Closes scylladb/scylladb#22519

* github.com:scylladb/scylladb:
  topology coordinator: do not update topology on address change
  topology coordinator: split out the peer table update functionality from raft state application
2025-01-28 12:52:29 +01:00
Avi Kivity
7f2d901c89 Merge 'repair: handle no_such_keyspace in repair preparation phase' from Aleksandra Martyniuk
Currently, data sync repair handles most no_such_keyspace exceptions,
but it omits the preparation phase, where the exception could be thrown
during make_global_effective_replication_map.

Skip the keyspace repair if no_such_keyspace is thrown during preparations.

Fixes: #22073.

Requires backport to 6.1 and 6.2 as they contain the bug

Closes scylladb/scylladb#22473

* github.com:scylladb/scylladb:
  test: add test to check if repair handles no_such_keyspace
  repair: handle keyspace dropped
2025-01-28 13:42:38 +02:00