This pull request adds support for creating custom indexes (at a metadata level) as long as a supported custom class is provided (currently only vector search).
The patch contains:
- a change in CREATE INDEX statement that allows for the USING keyword to be present as long as one of the supported classes is used
- support for describing custom indexes in the DESCRIBE statement
- unit tests
Co-authored by: @Balwancia
Closesscylladb/scylladb#23720
* github.com:scylladb/scylladb:
test/cqlpy: add custom index tests
index: support storing metadata for custom indices
Move the run_process method to resource gather instance, since we need to start a monitor to check memory consumption in the cgroup. Pytest has concept of the test, but it is completely different from test.py. Resource gather instance take test instance to save and extract information about the test. Additional method emulating test.py test instance added not to rewrite the resource gather instance. Finally, combining all these changes to have ability to get metrics for test in both runners: test.py and pytest.
Closesscylladb/scylladb#24091
* github.com:scylladb/scylladb:
test.py: add missing parameter for boost tests for pytest runner
test.py: add support for boost_data_test_case in combined tests
test.py: clean log files after a successful run
test.py: attach output of the boost test to the report
test.py: fix metrics DB location
test.py: move run_process to resource_gather.py
test.py: unify using constant for finding repo root directory
test.py: refactor run_process in facade.py
test.py: add the possibility to create a test alike object
Clean different output files from the boost and unit tests.
Move logs for boost test to the testlog directory instead of having additional directory pytest
Fix the issue introduced with scylladb/scylladb#22960. Suite log dir was changed, and the path for metrics DB was relying on it. As a result, DB is now located in the mode directory instead of the root of the testlog.
Move the run_process method to the resource gather instance, since we need to start monitor to check memory consumption in the cgroup. Since resource_gather needs test.py test object, and pytest has no clue about it, adding a simple namespace object to emulate such a test object. It needed only to gather some information regarding the test to be able to add records to the DB.
Since we have two facades that can share the same run process procedure, adding a common method to handle this to avoid code duplication.
Instead of finding dynamically the repo root directory relatively to the temp dir, that's in most cases in the repo, will fail if a non-default temp dir parameter is used. Additionally, to have the single source of truth of finding the repo root directory switching to the constants.
Add injecting environment variables to the process
Switch from print to propper logger
Set buffer size to 1 to avoid losing any data from the boost test if the test collapsed.
Currently, run process logs and return stdout and stderr, but boost tests are using stderr only. So stderr redirected to stdout. This helps with Jenkins as well, since we are reducing the number of files to store.
resource_gather.py needs test.py test object to work. It needs some information about the test to be able to write down this information to the DB with metrics. When running with pytest, there's no such test object, that's why adding make_test_object to mimic the test.py's test object.
Switching the getting the mode for constructing path to chgroup to test
instead of suite. They are the same, but this helps to have emulate less
in make_test_object method.
compile_commands.json is used by LSPs (e.g. `clangd` in VS Code) for
code navigation. `merge-compdb.py`, called by `configure.py`, merges
these files from Scylla, Seastar, and Abseil. The script filters
entries by checking the output attribute against a given prefix. This
is needed because Scylla’s compile_commands.json is generated by Ninja
and includes all build modes, in case the user specified multiple
ones in the call to configure.py. Seastar and Abseil databases,
generated by CMake, used to omit the output attribute, so filtering
did not apply. Starting with `CMake 3.20+`, output attributes are now
included and do not match the expected prefix. For example, they
could be of the form
`absl/synchronization/CMakeFiles/synchronization.dir/internal/futex_waiter.cc.o`.
This causes relevant entries from Seastar and Abseil to be filtered out.
This patch refactors `merge-compdb.py` to allow specifying an
optional prefix per input file, preserving the intent of applying
the output filtering logic only for ninja-generated
Scylla compdb file.
Closesscylladb/scylladb#24211
We have a significant amount of tests in scylla-dtest repository and I believe most of them can be just copied to test.py framework with adding a relatively small shim code. In this PR I done that for 2 tests: [alternator_tests.py](https://github.com/scylladb/scylla-dtest/blob/next/alternator_tests.py) and [error_example_test.py](https://github.com/scylladb/scylla-dtest/blob/next/error_example_test.py)
One of the problems is async nature of test.py framework and synchronous of scylla-dtest. It was resolved by using universalasync third-party library. Other problem is ccmlib and it's resolved by adding a shim code (`test/dtest/ccmlib`)
ccmlib has a lot of dead code and not all it's features used by scylla-dtest, in this PR I added checks that we will not accidentally use some of them or miss something. And when we'll done the migration we can easily remove all unused parameters and these checks.
`error_example_test.py` copied as is (just license preamble added), `alternator_tests.py` has small changes:
1. License preamble
2. Remove unused imports
3. Remove unneeded `skip_if` marker (I think it can be backported to dtest, or we can remove the test from dtest after merging this PR)
```diff
--- ../../../scylla-dtest/alternator_tests.py
+++ alternator_tests.py
@@ -1,17 +1,20 @@
+#
+# Copyright (C) 2025-present ScyllaDB
+#
+# SPDX-License-Identifier: LicenseRef-ScyllaDB-Source-Available-1.0
+#
+
import logging
import operator
import os
import random
-import shutil
import string
-import subprocess
import tempfile
import time
from ast import literal_eval
from concurrent.futures.thread import ThreadPoolExecutor
from copy import deepcopy
from decimal import Decimal
-from pathlib import Path
from pprint import pformat
import boto3.dynamodb.types
@@ -46,7 +49,6 @@
)
from dtest_class import get_ip_from_node, wait_for
from tools.cluster import new_node
-from tools.marks import issue_open, with_feature
from tools.misc import set_trace_probability
from tools.retrying import retrying
@@ -168,7 +170,6 @@
read_and_delete_set_elements_thread.join()
@pytest.mark.next_gating
- @pytest.mark.skip_if(with_feature("tablets") & issue_open("#18002"))
def test_decommission_during_dynamo_load(self):
self.prepare_dynamodb_cluster(num_of_nodes=3)
node1, node2, node3 = self.cluster.nodelist()
```
Because all tests in this repo are considered to be "gating", I removed all not next_gating tests and all dtest's suites markers as a separate commit.
To reduce tests execution time run the tests in dev mode only and made some sleeps smaller.
In result, 23 tests added in total (22 in `test_alternator.py` and 1 in `test_error_example`.) The added tests will increase CI time by ~2х4 =8 minutes.
Closesscylladb/scylladb#21580
* github.com:scylladb/scylladb:
test.py: dtest/alternator_tests.py: make sleep intervals smaller
test.py: dtest/alternator_tests.py: remove not next_gating tests
test.py: migrate alternator_tests.py from dtest
test.py: initial implementation of dtest/ccm shim
test.py: manager: add server_get_returncode() method
test.py: manager: change CLI and env options on a node start
test.py: REST API: add set_trace_probability() method
test.py: REST API: add get_tokens() method
test.py: rework log_browsing for dtest migration
Use universalasync library to make test.py async code compatible
with synchronous code of dtest/ccm
Also, copied unmodified error_example_test.py from dtest as an example.
Run the test in `dev` mode only.
Add parameters to server_start() method to provide ability to
change Scylla' CLI and env options on a node start.
Also, add `expected_server_up_state` parameter as we have for
server_add() method.
Rework `ScyllaLogFile.wait_for()` method to make it easier
to add required methods to ScyllaNode class of ccm-like shim.
Also, added `ScyllaLogFile.grep_for_errors()` method and
reworked `ScyllaLogFile.grep()`
Starting with 2025.1, ScyllaDB versions are no longer called "Enterprise",
but the OS support page still uses that label.
This commit fixes that by replacing "Enterprise" with "ScyllaDB".
This update is required since we've removed "Enterprise" from everywhere else,
including the commands, so having it here is confusing.
Fixes https://github.com/scylladb/scylladb/issues/24179Closesscylladb/scylladb#24181
Fedora 42 merged /usr/sbin into /usr/bin [1]. As part of that change
the rpm macro %_sbindir was redefined from /usr/sbin to /usr/bin. As
a result RPM build on Fedora 42 fails: install.sh places some files
into /usr/sbin, while rpmbuild looks for them in /usr/bin.
We could resolve this either by following the change and moving
the files to /usr/bin as well, or fixing the spec to place the files
in /usr/sbin. The former is more difficult:
- what about Debian/Ubuntu?
- what about older RPM-based distributions (like all RHEL distributions)?
- what about scripts that hard-code /usr/sbin/<scylla utility>?
So we pick the latter, and redefine %_sbindir to /usr/sbin. Since that
directory still exists (as a symlink), installation on systems with
merged /usr/bin and /usr/sbin will work.
We'll have to address the problem later (likely by installing to either
/usr/bin or /usr/sbin depending on context), but for now, this is a simple
solution that works everywhere.
[1] https://fedoraproject.org/wiki/Changes/Unify_bin_and_sbinClosesscylladb/scylladb#24101
Currently, stream_manager is initialized after storage_service and
so it is stopped before the storage_service is. In its stop method
storage_service accesses stream_manager which is uninitialized
at a time.
Move stream_manager initialization over the storage_service initialization.
Fixes: #23207.
Closesscylladb/scylladb#24008
The conversion is unnecessary and likely dates back from before the
split between interval and wrapped_interval. It gets in the way
of making the conversion explicit.
Closesscylladb/scylladb#24164
This patch fixes "test/cqlpy/run --release 2025.1" which fails as
follows on all tests with indexes or views:
Secondary indexes are not supported on base tables with tablets
test/cqlpy/run can run cqlpy (and alternator) tests on various official
releases of Scylla which it knows how to download. When running old
versions of Scylla, we need to change the configuration options to those
that were needed on specific versions.
On new versions of Scylla we need to pass
--experimental-features=views-with-tablets
to be able to test materialized views, but in older versions we need to
remove that parameter because it didn't exist. We incorrectly removed it
for any versions 2025.1 or earlier, but that's incorrect - it just needs
to be removed for versions strictly earlier than 2025.1 - it is needed
for 2025.1 (I tested it is indeed needed even in the earliers RCs).
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Closesscylladb/scylladb#24144
chunked_vector is a replacement for std::vector that avoids large contiguous
allocations.
In this series, we add some missing modifiers and improve quality-of-life for
chunked_vector users (the static_assert patch).
Those modifiers were generally unused since they have O(n) complexity
and therefore not useful for hot paths, but they are used in some
control plane code on vectors which we'd like to replace with chunked_vectors.
A candidate for such a replacement is token_range_vector (see #3335).
This is a prerequisite for fixing some minor stalls; I don't expect we'll backport
fixes to those stalls.
Closesscylladb/scylladb#24162
* github.com:scylladb/scylladb:
utils: chunked_vector: add swap() method
utils: chunked_vector: add range insert() overloads
utils: chunked_vector: relax static_assert
utils: chunked_vector: implement erase() for single elements and ranges
utils: chunked_vector: implement insert() for single-element inserts
Each view update is correlated to a write that generates it (aside from view
building which is throttled separately). These writes are limited by a throttling
mechanism, which effectively works by performing the writes with CL=ALL if
ongoing writes exceed some memory usage limit
When writes generate view updates, they usually also need to perform a read. This read
goes through a read concurrency semaphore where it can get delayed or killed. The
semaphore allows up to 100 concurrent reads and puts all remaining reads in a queue.
If the number of queued reads exceeds a specific limit, the view update will fail on
the replica, causing inconsistencies.
This limit is not necessary. When a read gets queued on the semaphore, the write that's
causing the view update is paused, so the write takes part in the regular write throttling.
If too many writes get stuck on view update reads, they will get throttled, so their
number is limited and the number of queued reads is also limited to the same amount.
In this patch we remove the specified queue length limit for the view update read concurrency
semaphore. Instead of this limit, the queue will be now limited indirectly, by the base write
throttling mechanism. This may allow the queue grow longer than with the previous limit, but
it shouldn't ever cause issues - we only perform up to 100 actual reads at once, and the
remaining ones that get queued use a tiny amount of memory, less than the writes that generated
them and which are getting limited directly.
Fixes https://github.com/scylladb/scylladb/issues/23319Closesscylladb/scylladb#24112
Negative load sizes don't make sense, but we've seen a case in
production, where a negative number was returned by ScyllaDB REST API,
so be prepared to handle these too.
Fixes: scylladb/scylladb#24134Closesscylladb/scylladb#24135
Lots of code from this test can be reused in PR #23861. I'm splitting it now in this change so we can merge it cleanly as a separate patch.
Refs #23564Closesscylladb/scylladb#24105
* github.com:scylladb/scylladb:
Refactor out code from test_restore_with_streaming_scopes
Refactor out code from test_restore_with_streaming_scopes
Refactor out code from test_restore_with_streaming_scopes
Refactor out code from test_restore_with_streaming_scopes
Refactor out code from test_restore_with_streaming_scopes
The non-streaming loading of sstables performs cleanup since recently [1]. For vnodes, unfortunately, cleanup is almost unavoidable, because of the nature of vnodes sharding, even if sstable is already clean. This leads to waste of IO and CPU for nothing. Skipping the cleanup in a smart way is possible, but requires too many changes in the code and in the on-disk data. However, the effort will not help existing SSTables and it's going to be obsoleted by tablets some time soon.
Said that, the easiest way to skip cleanup is the explicit --skip-cleanup option for nodetool and respective skip_cleanup parameter for API handler.
New feature, no backport
fixes#24136
refs #12422 [1]
Closesscylladb/scylladb#24139
* github.com:scylladb/scylladb:
nodetool: Add refresh --skip-cleanup option
api: Introduce skip_cleanup query parameter
distributed_loader: Don't create owned ranges if skip-cleanup is true
code: Push bool skip_cleanup flag around
Inserts an iterator range at some position.
Again we insert the range at the end and use std::rotate() to
move the newly inserted elements into place, forgoing possible
optimizations.
Unit tests are added.
chunked_vector is only implemented for types with a
non-throwing move constructor; this greatly simplifies
the implementation.
We have a static_assert to enforce it (should really
be a constraint, but chunked_vector predates C++ concepts).
This static_assert prevents forward declarations from compiling:
class forward_declared;
using a = utils::chunked_vector<forward_declared>;
`a` won't compile since the static_assert will be instantiated
and will fail since forward_declared is an incomplete type. Using
a constraint has the same problem.
Fix by moving the static_assert to the destructor. The destructor
won't be instantiated by the forward declaration, so it won't
trigger. It will trigger when someone destroys the vector; at this
point the types are no longer forward declared.
Implement using std::rotate() and resize(). The elements to be erased
are rotated to the end, then resized out of existence.
Again we defer optimization for trivially copyable types.
Unit tests are added.
Needed for range_streamer with token_ranges using chunked_vector.
The get_blob method linearizes data by copying it into a single buffer, which can cause 'oversized allocation' warnings.
In this commit we avoid copying by creating input stream on top of the original fragmened managed bytes, returned by untyped_result_set_row::get_view.
fixesscylladb/scylladb#23903
backport: no need, not a critical issue.
Closesscylladb/scylladb#24123
* github.com:scylladb/scylladb:
raft_sys_table_storage: avoid temporary buffer when deserializing log_entry
serializer_impl.hh: add as_input_stream(managed_bytes_view) overload
partition_range_compat's unwrap() needs insert if we are to
use it for chunked_vector (which we do).
Implement using push_back() and std::rotate().
emplace(iterator, args) is also implemented, though the benefit
is diluted (it will be moved after construction).
The implementation isn't optimal - if T is trivially copyable
then using std::memmove() will be much faster that std::rotate(),
but this complex optimization is left for later.
Unit tests are added.
Added function returning custom index class name.
Added printing custom index class name when using DESCRIBE.
Changed validation to reflect current support of indices.
One of pytest parameters in test_long_query_timeout_erm.py was
a CQL query containing spaces and special chars such as '*', '(', ')',
'{', '}'. After upgrading to Fedora 42, the test started to
fail with the error "test.pylib.rest_client.HTTPError: HTTP error 404"
with uri=`http://...[SELECT * FROM {}-True-False].dev.1`.
To prevent from such errors, this commit changes the parameter to
a string without spaces and such special characters.
Fixes: scylladb/scylladb#24124Closesscylladb/scylladb#24130
do_accepts might be called after `_gate` was closed.
In this case it should just return early rather
than throw gate_closed_exception, similar to the it breaks
from the infinite for loop when the _gate is closed.
With this change, do_accepts (and consequently, _listeners_stopped),
should never fail as it catches and ignores all exceptions
in the loop.
Fixes#23775
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Closesscylladb/scylladb#23818