Adds a test case for the `/storage_service/natural_endpoints/v2/{keyspace}` endpoint,
verifying that it correctly resolves natural endpoints for a composite partition key
passed as multiple `key_component` query parameters.
Users with single-column partition keys that contain colon characters
were unable to use certain REST APIs and 'nodetool' commands, because the
API split key by colon regardless of the partition key schema.
Affected commands:
- 'nodetool getendpoints'
- 'nodetool getsstables'
Affected endpoints:
- '/column_family/sstables/by_key'
- '/storage_service/natural_endpoints'
Refs: #16596 - This does not fully fix the issue, as users with compound
keys will face the issue if any column of the partition key contains
a colon character.
Closesscylladb/scylladb#24829
Added a new POST endpoint `/storage_service/drop_quarantined_sstables` to the REST API.
This endpoint allows dropping all quarantined SSTables either globally or
for a specific keyspace and tables.
Optional query parameters `keyspace` and `tables` (comma-separated table names) can be
provided to limit the scope of the operation.
Fixesscylladb/scylladb#19061
The "keyspace" and "cf" pair of options are now parsed similarly to how
recently changed ss::force_keyspace_compaction handler does.
The "scrub_mode" query param is saved directly into sstring variable and
its presense is checked by .empty() call. If the parameter is missing,
the request::get_query_param() would return empty string, so the change
is correct.
The "skip_corrupted" is boolean option, other options are already parsed
by hand, without the help of req_params facilities.
There's a test that validates the work of req_params::process() of scrub
endpoint -- it passes "invalid" options. This test is temporarily
removed according to the PR description.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Adds core integration of the audit subsystem into Scylla's main process flow. Changes include:
- Import audit subsystem header
- Initialize audit system during server startup using configuration and token metadata
- Start audit system after API server initialization with query processor and memory manager
- Add proper shutdown sequence for audit system using RAII pattern
- Add error handling for audit system initialization failures
The audit system is now properly integrated into Scylla's lifecycle, ensuring:
- Correct initialization order relative to other subsystems
- Proper resource cleanup during shutdown
- Graceful error handling for initialization failures
Stop taking snapshots of MVs and allow taking snapshot of individual tables, now one can take a snapshot of any base table, any view or index. Also add tests to cover new cases both boost test (using cc code) and pytest (using the API)
Also, update documentation to reflect the change
fixes: #21339fixes: #20760Closesscylladb/scylladb#21433
Python and Python developers don't like directory names to include a
minus sign, like "cql-pytest". In this patch we rename test/cql-pytest
to test/cqlpy, and also change a few references in other code (e.g., code
that used test/cql-pytest/run.py) and also references to this test suite
in documentation and comments.
Arguably, the word "test" was always redundant in test/cql-pytest, and
I want to leave the "py" in test/cqlpy to emphasize that it's Python-based
tests, contrasting with test/cql which are CQL-request-only approval
tests.
Fixes#20846
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
`database::find_column_family()` throws no_such_column_family
if an unknown ks.cf is fed to it. and we call into this function
without checking for the existence of ks.cf first. since
"/storage_service/tablets/move" is a public interface, we should
translate this error to a better http error.
in this change, we check for the existence of the given ks.cf, and
throw an exception so that it can be caught by seastar::httpd::routers,
and converted to an HTTP error.
Fixes#17198
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closesscylladb/scylladb#17217
This change introduces a logic, that is responsible
for checking if tablets are enabled for any of
keyspaces when get_ownership() is invoked.
Without it, the result would be calculated
based solely on sorted_tokens() which was
invalid.
Refs: scylladb#17342
Signed-off-by: Patryk Wrobel <patryk.wrobel@scylladb.com>
Before this change, when user tried to utilize
'storage_service/ownership/{keyspace}' API with
keyspace parameter that uses tablets, then internal
error was thrown. The code was calling a function,
that is intended for vnodes: get_vnode_effective_replication_map().
This commit introduces graceful handling of such scenario and
extends the API to allow passing 'cf' parameter that denotes
table name.
Now, when keyspace uses tablets and cf parameter is not passed
a descriptive error message is returned via BAD_REQUEST.
Users cannot query ownership for keyspace that uses tablets,
but they can query ownership for a table in a given keyspace that uses tablets.
Also, new tests have been added to test/rest_api/test_storage_service.py and
to test/topology_experimental_raft/test_tablets.py in order to verify the behavior
with and without tablets enabled.
Refs: scylladb#17342
Signed-off-by: Patryk Wrobel <patryk.wrobel@scylladb.com>
This change is intended to introduce tests for vnodes for
the following API paths:
- 'storage_service/ownership'
- 'storage_service/ownership/{keyspace}'
In next patches the logic that is tested will be adjusted
to work correctly when tablets are enabled. This is a safety
net that ensures that the logic is not broken.
Refs: scylladb#17342
Signed-off-by: Patryk Wrobel <patryk.wrobel@scylladb.com>
Fix test_storage_service.py to work with tablets.
- test_describe_ring was failing because in storage_service/describe_ring
table must be specified for keyspaces with tablets.
Do not check the status if tablets are enabled. Add checks for
specified table;
- test_storage_service_keyspace_cleanup_with_no_owned_ranges
was failing because cleanup is disabled on keyspaces with tablets.
Use test_keyspace_vnodes fixture to use keyspace with tablet disabled;
- test_storage_service_get_natural_endpoints required
some minor type-related fixes.
To allow to filter the returned keyspaces based by the replication they
use: tablets or vnodes.
The filter can be disabled by omitting the parameter or passing "all".
The default is "all".
Fixes: #16509Closesscylladb/scylladb#17319
This API endpoint was failing when tablets were enabled
because of usage of get_vnode_effective_replication_map().
Moreover, it was providing an error message that was not
user-friendly.
This change extends the handler to properly service the incoming requests.
Furthermore, it introduces two new test cases that verify the behavior of
storage_service/range_to_endpoint_map API. It also adjusts the test case
of this endpoint for vnodes to succeed when tablets are enabled by default.
The new logic is as follows:
- when tablets are disabled then users may query endpoints
for a keyspace or for a given table in a keyspace
- when tablets are enabled then users have to provide
table name, because effective replication map is per-table
When user does not provide table name when tablets are enabled
for a given keyspace, then BAD_REQUEST is returned with a
meaningful error message.
Fixes: scylladb#17343
Signed-off-by: Patryk Wrobel <patryk.wrobel@scylladb.com>
Closesscylladb/scylladb#17372
per its description, "`/storage_service/describe_ring/`" returns the
token ranges of an arbitrary keyspace. actually, it returns the
first keyspace which is of non-local-vnode-based-strategy. this API
is not used by nodetool, neither is it exercised in dtest.
scylla-manager has a wrapper for this API though, but that wrapper
is not used anywhere.
in this change, this API is dropped.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closesscylladb/scylladb#17197
For flushing all tables in the database.
The advantage of this api is that `commitlog->force_new_active_segment`
happens only once in `database::flush_all_tables` rather than
once per keyspace (when `nodetool flush` translates to
a sequence of `/storage_service/keyspace_flush` calls).
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
test_storage_service_keyspace_cleanup_with_no_owned_ranges
from test_storage_service.py creates snapshots with tags based
on current time. Thus if a test runs on the same node twice
with time distance short enough, there may be a name collision
between the snapshots from two runs. This will cause the second
run to fail on assertions.
Use new_test_snapshot fixture to drop snapshots after the test.
Delete my_snapshot_tags as it's no longer necessary.
Fixes: #15680.
Closesscylladb/scylladb#15683
Fixes#13332
The tests user the discriminator "system" as prefix to assume keyspaces are marked
"internal" inside scylla. This is not true in enterprise universe (replicated key
provider). It maybe/probably should, but that train is sailing right now.
Fix by removing one assert (not correct) and use actual API info in the alternator
test.
Closes#13333
The REST test test_storage_service.py::test_toppartitions_pk_needs_escaping
was flaky. It tests the toppartition request, which unfortunately needs
to choose a sampling duration in advance, and we chose 1 second which we
considered more than enough - and indeed typically even 1ms is enough!
but very rarely (only know of only one occurance, in issue #13223) one
second is not enough.
Instead of increasing this 1 second and making this test even slower,
this patch takes a retry approach: The tests starts with a 0.01 second
duration, and is then retried with increasing durations until it succeeds
or a 5-seconds duration is reached. This retry approach has two benefits:
1. It de-flakes the test (allowing a very slow test to take 5 seconds
instead of 1 seconds which wasn't enough), and 2. At the same time it
makes a successful test much faster (it used to always take a full
second, now it takes 0.07 seconds on a dev build on my laptop).
A *failed* test may, in some cases, take 10 seconds after this patch
(although in some other cases, an error will be caught immediately),
but I consider this acceptable - this test should pass, after all,
and a failure indicates a regression and taking 10 seconds will be
the last of our worries in that case.
Fixes#13223.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Closes#13238
This reverts commit 8e892426e2 and fixes
the code in a different way:
That commit moved the scylla_inject_error function from
test/alternator/util.py to test/cql-pytest/util.py and renamed
test/alternator/util.py. I found the rename confusing and unnecessary.
Moreover, the moved function isn't even usable today by the test suite
that includes it, cql-pytest, because it lacks the "rest_api" fixture :-)
so test/cql-pytest/util.py wasn't the right place for it anyway.
test/rest_api/rest_util.py could have been a good place for this function,
but there is another complication: Although the Alternator and rest_api
tests both had a "rest_api" fixture, it has a different type, which led
to the code in rest_api which used the moved function to have to jump
through hoops to call it instead of just passing "rest_api".
I think the best solution is to revert the above commit, and duplicate
the short scylla_inject_error() function. The duplication isn't an
exact copy - the test/rest_api/rest_util.py version now accepts the
"rest_api" fixture instead of the URL that the Alternator version used.
In the future we can remove some of this duplication by having some
shared "library" code but we should do it carefully and starting with
agreeing on the basic fixtures like "rest_api" and "cql", without that
it's not useful to share small functions that operate on them.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Closes#11275
over the rest api
Performing compaction scrub user did not know whether an operation
was aborted.
If compaction scrub is aborted, return status the user gets over
rest api is set to 1.
storage_service/keyspaces?type=user along with user keyspaces returned
the keyspaces that were internal but non-system.
The list of the keyspaces for the user option
(storage_service/keyspaces?type=user) contains neither system nor
internal but only user keyspaces.
Fixes: #11042Closes#11049
Previously, any attempt to take a materialized view or secondary index
snapshot was considered a mistake and caused the snapshot operation to
abort, with a suggestion to snapshot the base table instead.
But an automatic pre-scrub snapshot of a view cannot be attributed to
user error, so the operation should not be aborted in that case.
(It is an open question whether the more correct thing to do during
pre-scrub snapshot would be to silently ignore views. Or perhaps they
should be ignored in all cases except when the user explicitly asks to
snapshot them, by name)
Closes#10760.
Signed-off-by: Michael Livshin <michael.livshin@scylladb.com>
Detecting a secondary index by checking for a dot
in the table name is wrong as tables generated by Alternator
may contain a dot in their name.
Instead detect bot hmaterialized view and secondary indexes
using the schema()->is_view() method.
Fixes#10526
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Test the snapshot operations via the rest api.
Added test/rest_api/rest_util.py with
new_test_snapshot that creates a new test snapshot
and automagically deletes it when the `with` block
if exited, similar to new_test_keyspace and new_test_table.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
This patch adds a reproducer for the JSON encoding in issue #9061.
The bug was already fixed (it was a Seastar bug, and Seastar was
updated in commit 5d4213e1b8), but
I verified that the test fails before that patch - and passes today.
It is useful to have such a test for regressions, as well as for
testing backports.
Unfortunately, the test isn't pretty. The test uses the toppartitions
API, which instead of having a "start" and "stop" request has a single
synchronous "start for a given duration" request, and we need to run
it with some fixed duration (we took 1 second), and in parallel, one
request.
Refs #9061.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20220323180855.3307931-1-nyh@scylladb.com>
Instead of lengthy blurbs, switch to single-line, machine-readable
standardized (https://spdx.dev) license identifiers. The Linux kernel
switched long ago, so there is strong precedent.
Three cases are handled: AGPL-only, Apache-only, and dual licensed.
For the latter case, I chose (AGPL-3.0-or-later and Apache-2.0),
reasoning that our changes are extensive enough to apply our license.
The changes we applied mechanically with a script, except to
licenses/README.md.
Closes#9937
Splits and validate the cf parameter, containing an optional
comma-separated list of table names.
If any table is not found and a no_such_column_family
exception is thrown, wrap it in a `bad_param_exception`
so it will translate to `reply::status_type::bad_request`
rather than `reply::status_type::internal_server_error`.
With that, hide the split_cf function from api/api.hh
since it was used only from api/storage_service
and new use sites should use validate_tables instead.
Fixes#9754
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
FIXME: negative tests for not-found tables
should result in a requests.codes.bad_request
but currently result in requests.codes.internal_server_error.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>