Commit Graph

7294 Commits

Author SHA1 Message Date
Kamil Braun
5c9efdff50 Merge 'raft: store_snapshot_descriptor to use actually preserved items number when truncating the local log table' from Sergey Zolotukhin
io_fiber/store_snapshot_descriptor now gets the actual number of items
preserved when the log is truncated, fixing extra entries remained after
log snapshot creation. Also removes incorrect check for the number of
truncated items in the
raft_sys_table_storage::store_snapshot_descriptor.

Minor change: Added error_injection test API for changing snapshot thresholds settings.

Fixes scylladb/scylladb#16817
Fixes scylladb/scylladb#20080

Closes scylladb/scylladb#20095

* github.com:scylladb/scylladb:
  raft:  Ensure const correctness in applier_fiber.
  raft: Invoke store_snapshot_descriptor with actually preserved items.
  raft: Use raft_server_set_snapshot_thresholds in tests.
  raft: Fix indentation in server.cc
  raft: Add a test to check log size after truncation.
  raft: Add raft_server_set_snapshot_thresholds injection.
  utils: Ensure const correctness of injection_handler::get().
2024-08-20 18:15:30 +02:00
Tomasz Grabiec
ff52527c54 Merge 'repair: do_rebuild_replace_with_repair: use source_dc only when safe' from Benny Halevy
It is unsafe to restrict the sync nodes for repair to the source data center if it has too low replication factor in network_topology_replication_strategy, or if other nodes in that DC are ignored.

Also, this change restricts the usage of source_dc to `network_topology` and `everywhere_topology`
strategies, as with simple replication strategy
there is no guarantee that there would be any
more replicas in that data center.

Fixes #16826

Reproducer submitted as https://github.com/scylladb/scylla-dtest/pull/3865
It fails without this fix and passes with it.

* Requires backport to live versions.  Issue hit in the filed with 2022.2.14

Closes scylladb/scylladb#16827

* github.com:scylladb/scylladb:
  repair: do_rebuild_replace_with_repair: use source_dc only when safe
  repair: replace_with_repair: pass the replace_node downstream
  repair: replace_with_repair: pass ignore_nodes as a set of host_id:s
  repair: replace_rebuild_with_repair: pass ks_erms from caller
  nodetool: rebuild: add force option
  Add and use utils::optional_param to pass source_dc
2024-08-20 16:13:23 +02:00
Sergey Zolotukhin
c3e52ab942 raft: Invoke store_snapshot_descriptor with actually preserved items.
- raft_sys_table_storage::store_snapshot_descriptor now receives a number of
preserved items in the log, rather than _config.snapshot_trailing value;
- Incorrect check for truncated number of items in store_snapshot_descriptor
was removed.

Fixes scylladb/scylladb#16817
Fixes scylladb/scylladb#20080
2024-08-20 15:22:49 +02:00
Sergey Zolotukhin
922e035629 raft: Use raft_server_set_snapshot_thresholds in tests.
Replace raft_server_snapshot_reduce_threshold with raft_server_set_snapshot_thresholds in tests
as raft_server_set_snapshot_thresholds fully covers the functionality of raft_server_snapshot_reduce_threshold.
2024-08-20 15:08:49 +02:00
Sergey Zolotukhin
b6de8230a9 raft: Add a test to check log size after truncation.
The test checks that snapshot_trailing_size parameter is taken
into consideration when the log system table is truncated.
Test for  scylladb#16817
2024-08-20 14:15:50 +02:00
Botond Dénes
3ee0d7f2d1 Merge 'tools: Enhance scylla sstable shard-of to support tablets' from Kefu Chai
before this change, `scylla sstable shard-of` didn't support tablets,
because:

- with tablets enabled, data distribution uses the scheduler
- this replaces the previous method of mapping based on vnodes and shard numbers
- as a result, we can no longer deduce sstable mapping from token ranges

in this change, we:
- read `system.tablets` table to retrieve tablet information
- print the tablet's replica set (list of <host, shard> pairs)
- this helps users determine where a given sstable is hosted

This approach provides the closest equivalent functionality of
`shard-of` in the tablet era.

Fixes scylladb/scylladb#16488

---

no need to backport, it's an improvement, not a critical fix.

Closes scylladb/scylladb#20002

* github.com:scylladb/scylladb:
  tools: enhance `scylla sstable shard-of` to support tablets
  replica/tablets: extract tablet_replica_set_from_cell()
  tools: extract get_table_directory() out
  tools: extract read_mutation out
  build: split the list of source file across multiple line
  tools/scylla-sstable: print warning when running shard-of with tablets
2024-08-20 13:51:12 +03:00
Avi Kivity
7eb3b15fff Merge 'utils/tagged_integer: remove conversion to underlying integer' from Laszlo Ersek
~~~
utils/tagged_integer: remove conversion to underlying integer

Silently converting a tagged (i.e., "dimension-ful") integer to a naked
("dimensionless") integer defeats the purpose of having tagged integers,
and is a source of practical bugs, such as
<https://github.com/scylladb/scylladb/issues/20080>.

We could make the conversion operator explicit, for enforcing

  static_cast<TAGGED_INTEGER_TYPE::value_type>(TAGGED_INTEGER_VALUE)

in every conversion location -- but that's a mouthful to write. Instead,
remove the conversion operator, and let clients call the (identically
behaving) value() member function.
~~~

No backport needed (refactoring).

The series is supposed to solve #20081.

Two patches in the series touch up code that is known to be (orthogonally) buggy; see
- `service/raft_sys_table_storage: tweak dead code` (#20080)
- `test/raft/replication: untag index_t in test_case::get_first_val()` (#20151)

Fixes for those (independent) issues will have to be rebased on this series, or this series will have to be rebased on those (due to context conflicts).

The series builds at every stage. The debug and release unit test suites pass at the end.

Closes scylladb/scylladb#20159

* github.com:scylladb/scylladb:
  utils/tagged_integer: remove conversion to underlying integer
  test/raft/randomized_nemesis_test: clean up remaining index_t usage
  test/raft/randomized_nemesis_test: clean up index_t usage in store_snapshot()
  test/raft/replication: clean up remaining index_t usage
  test/raft/replication: take an "index_t start_idx" in create_log()
  test/raft/replication: untag index_t in test_case::get_first_val()
  test/raft/etcd_test: tag index_t and term_t for comparisons and subtractions
  test/raft/fsm_test: tag index_t and term_t for comparisons and subtractions
  test/raft/helpers: tighten compare_log_entries() param types
  service/raft_sys_table_storage: tweak dead code
  service/raft_sys_table_storage: simplify (snap.idx - preserve_log_entries)
  service/raft_sys_table_storage: untag index_t and term_t for queries
  raft/server: clean up index_t usage
  raft/tracker: don't drop out of index_t space for subtraction
  raft/fsm: clean up index_t and term_t usage
  raft/log: clean up index_t usage
  db/system_keyspace: promise a tagged integer from increment_and_get_generation()
  gms/gossiper: return "strong_ordering" from compare_endpoint_startup()
  gms/gossiper: get "int32_t" value of "gms::version_type" explicitly
2024-08-19 19:52:54 +03:00
Benny Halevy
0419b1d522 nodetool: rebuild: add force option
To be used to force usage of source_dc, even
when it is unsafe for rebuild.

Update docs and add test/nodetool/test_rebuild.py

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2024-08-19 17:20:12 +03:00
Kefu Chai
25b3c50f71 test/nodetool: print default value of options in help message
would be more helpful, if the output of "--help" command line can
include the default value of options.

so, in this change, we include the default values in it.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#20170
2024-08-19 16:15:24 +03:00
Botond Dénes
6835f7e993 Merge 'Add CQL-based RBAC support to Alternator' from Piotr Smaron
Alternator already supports **authentication** - the ability to to sign each request as a particular user. The users that can be used are the different "roles" that are created by CQL "CREATE ROLE" commands. This series adds support for **authorization**, i.e., the ability to determine that only some of these roles are allowed to read or write particular tables, to create new tables, and so on.

The way we chose to do this in this series is to support CQL's existing role-based access control (RBAC) commands - GRANT and REVOKE - on Alternator tables. For example, an Alternator table "xyz" is visible to CQL as "alternator_xyz.xyz", so a `GRANT SELECT ON alternator_xyz.xyz TO myrole` will allow read commands (e.g., GetItem) on that table, and without this GRANT, a GetItem will fail with `AccessDeniedException`.

This series adds the necessary checks to all relevant Alternator operations, and also adds extensive functional testing for this feature - i.e., that certain DynamoDB API operations are not allowed without the appropriate GRANTs.

The following permissions are needed for the following Alternator API operations:

* **SELECT**:      `GetItem`, `Query`, `Scan`, `BatchGetItem`, `GetRecords`
* **MODIFY**:      `PutItem`, `DeleteItem`, `UpdateItem`, `BatchWriteItem`
* **CREATE**:      `CreateTable`
* **DROP**:        `DeleteTable`
* **ALTER**:       `UpdateTable`, `TagResource`, `UntagResource`, `UpdateTimeToLive`
* _none needed_: `ListTables`, `DescribeTable`, `DescribeEndpoints`, `ListTagsOfResource`, `DescribeTimeToLive`, `DescribeContinuousBackups`, `ListStreams`, `DescribeStream`, `GetShardIterator`

Currently, I decided that for consistency each operation requires one permission only. For example, PutItem only requires MODIFY permission. This is despite the fact that in some cases (namely, `ReturnValues=ALL_OLD`) it can also _read_ the item. We should perhaps discuss this decision - and compare how it was done in CQL - e.g., what happens in LWT writes that may return old values?

Different permissions can be granted for a base table, each of its views, and the CDC table (Alternator streams). This adds power - e.g., we can allow a role to read only a view but not the base table, or read the table but not its history. GRANTing permissions on views or CDC logs require knowing their names, which are somewhat ugly (e.g., the name of GSI "abc" in table "xyz" is `alternator_xyz.xyz:abc`). But usefully, the error message when permissions are denied contains the full name of the table that was lacking permissions and which permissions were lacking, so users can easily add them.

In addition to permissions checking, this series also correctly supports _auto-grant_ (except #19798): When a role has permissions to `CreateTable`, any table it creates will automatically be granted all permissions for this role, so this role will be able to use the new table and eventually delete it. `DeleteTable` does the opposite - it removes permissions from tables being deleted, so that if later a second user re-creates a table with the same name, the first user will not have permissions over the new table.

The already-existing configuration parameter `alternator_enforce_authorization` (off by default), which previously only enabled authentication, now also enables authorization. Users that upgrade to the new version and already had `alternator_enforce_authorization=true` should verify that the users they use to authenticate either have the appropriate permissions or the "superuser" flag. Roles used to authenticate must also have the "login" flag.

Please note that although the new RBAC support implements the access control feature we asked for in #5047, this implementation is _not compatible_ with DynamoDB. In DynamoDB, the access control is configured through IAM operations or through the new `PutResourcePolicy` - operation, not through CQL (obviously!). DynamoDB also offers finer access-control granularity than we support (Scylla's RBAC works on entire tables, DynamoDB allows setting permissions on key prefixes, on individual attributes, and more). Despite this non-compatibility, I believe this feature, as is, will already be useful to Alternator users.

Fixes #5047 (after closing that issue, a new clean issue should be opened about the DynamoDB-compatible APIs that we didn't do - just so we remember this wasn't done yet).

New feature, should not be backported.

Closes scylladb/scylladb#20135

* github.com:scylladb/scylladb:
  tests: disable test_alternator_enforce_authorization_true
  test, alternator: test for alternator_enforce_authorization config
  test/pylib: allow setting driver_connect() options in servers_add()
  test: fix test_localnodes_joining_nodes
  alternator, RBAC: reproducer for missing CDC auto-grant
  alternator: document the new RBAC support
  alternator: add RBAC enforcement to GetRecords
  test/alternator: additional tests for RBAC
  test/alternator: reduce permissions-validity-in-ms
  test/alternator: add test for BatchGetItem from multiple tables
  alternator: test for operations that do not need any permissions
  alternator: add RBAC enforcement to UpdateTimeToLive
  alternator: add RBAC enforcement to TagResource and UntagResource
  alternator: add RBAC enforcement to BatchGetItem
  alternator: add RBAC enforcement to BatchWriteItem
  alternator: add RBAC enforcement to UpdateTable
  alternator: add RBAC enforcement to Query and Scan
  alternator: add RBAC enforcement to CreateTable
  alternator: add RBAC enforcement to DeleteTable
  alternator: add RBAC enforcement to UpdateItem
  alternator: add RBAC enforcement to DeleteItem
  alternator: add RBAC enforcement to PutItem
  alternator: add RBAC enforcement to GetItem
  alternator: stop using an "internal" client_state
2024-08-19 16:09:53 +03:00
Tomasz Grabiec
c1de4859d8 Merge 'tablets: Fix race between repair and split' from Raphael "Raph" Carvalho
Consider the following:

```
T
0   split prepare starts
1                               repair starts
2   split prepare finishes
3                               repair adds unsplit sstables
4                               repair ends
5   split executes
```

If repair produces sstable after split prepare phase, the replica will not split that sstable later, as prepare phase is considered completed already. That causes split execution to fail as replicas weren't really prepared. This also can be triggered with load-and-stream which shares the same write (consumer) path.

The approach to fix this is the same employed to prevent a race between split and migration. If migration happens during prepare phase, it can happen source misses the split request, but the tablet will still be split on the destination (if needed). Similarly, the repair writer becomes responsible for splitting the data if underlying table is in split mode. That's implemented in replica::table for correctness, so if node crashes, the new sstable missing split is still split before added to the set.

Fixes #19378.
Fixes #19416.

**Please replace this line with justification for the backport/\* labels added to this PR**

Closes scylladb/scylladb#19427

* github.com:scylladb/scylladb:
  tablets: Fix race between repair and split
  compaction: Allow "offline" sstable to be split
2024-08-19 14:44:28 +02:00
Piotr Smaron
cdc88cd06c tests: disable test_alternator_enforce_authorization_true
The test is flaky and needs to be fixed in order to not randomly break
our CI, OTOH can be commented out for the time being, so that we can
marge the feature.
2024-08-19 09:57:53 +02:00
Nadav Har'El
989dbef315 test, alternator: test for alternator_enforce_authorization config
This patch adds tests that demonstrates the current way that Alternator's
authentication and authorization are both enabled or disabled by the
option "alternator_enforce_authorization".

If in the future we decide to change this option or eliminate it (e.g.,
remain just with the "authenticator" and "authorizer" options), we can
easily update these tests to fit the new configuration parameters and
check they work as expected.

Because the new tests want to start Scylla instances with different
configuration parameters, they are written in the the "topology"
framework and not in the test/alternator framework. The test/alternator
framework still contains (test/alternator/test_cql_rbac.py) the vast
majority of the functional testing of the RBAC feature where all those
tests just assume that RBAC is enabled and needs to be tested.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2024-08-19 09:57:53 +02:00
Nadav Har'El
41418603e1 test/pylib: allow setting driver_connect() options in servers_add()
The manager.driver_connect() functions allows to pass parameters when
creating the connection (e.g., a special auth_provider), but unfortunately
right now the servers_add() function always calls driver_connect()
without parameters. So in this patch we just add a new optional
parameter to servers_add(), driver_connect_opts, that will be passed to
driver_connect().

In theory instead of the new option to driver_connect() a caller can
pass start=False to servers_add() and later call driver_connect()
manually with the right arguments. The problem is that start=False
avoids more than just calling driver_connect(), so it doesn't solve
the problem.

An example of using the new option is to run Scylla with authentication
enabled, and then connect to it using the correct default account
("cassandra"/"cassandra"):

    config = {
        'authenticator': 'PasswordAuthenticator',
        'authorizer': 'CassandraAuthorizer'
    }
    servers = await manager.servers_add(1, config=config,
        driver_connect_opts={'auth_provider':
            PlainTextAuthProvider(username='cassandra', password='cassandra')})

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2024-08-19 09:57:53 +02:00
Nadav Har'El
de20ac1a6d test: fix test_localnodes_joining_nodes
The existing test
topology_experimental_raft/test_alternator::test_localnodes_joining_nodes

Tried to create a second server but *not* wait for it to complete, but
the trick it used (cancelling the task) doesn't work since commit 2ee063c
makes a list of unwaited tasks and waits for them anyway. The test
*appears* to work because it is the last test in the file, but if we
ever add another test in the same file (like I plan to do in the next
patch), that other test will find a "BROKEN" ScyllaClusterManager and
report that it failed :-(

Other tricks I tried to use (like killing the servers) also didn't work
because of various limitations and complications of the test framework
and all its layers.

So not wanting to fight the fragile testing framework any more at this
point, I just gave up and the test will *wait* for the second server
to come up. This adds 120 seconds (!) to the test, but since this whole
test file already takes more than 500 seconds to complete, let's bite
this bullet. Maybe in the future when the test framework improves, we can
avoid this 120 second wait.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2024-08-19 09:57:53 +02:00
Nadav Har'El
79f9b3007e alternator, RBAC: reproducer for missing CDC auto-grant
This patch adds a reproducing (xfailing) test for issue #19798, which
shows that if a role is able to create an Alternator table, the role is
able to read the new table (this is known as "auto-grant"), but is NOT
able to read the CDC log (i.e., use Alternator Streams' "GetRecords").

Once we do fix this auto-grant bug, it's also important to also implement
auto-revoke - the permissions on a deleted table must be deleted as well
(otherwise the old owner of a deleted table will be able to read a new
table with the same name). This patch also adds a test verifying that
auto-revoke works. This test currently passes (because there is no auto-
grant, so nothing needs to be revoked...) but if we'll implement auto-grant
and forget auto-revoke, the second test will start to fail - so I added
this test as a precaution against a bad fix.

Refs #19798

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2024-08-19 09:57:53 +02:00
Nadav Har'El
f9ff475dfb alternator: add RBAC enforcement to GetRecords
This patch adds a requirement for the "SELECT" permission on a table to
run a GetRecords on it (the DynamoDB Streams API, i.e., CDC).

The grant is checked on the *CDC log table* - not on the base table,
which allows giving a role the ability to read the base but not is
change stream, or vice versa.

The operations ListStreams, DescribeStreams, GetShardIterators do not
require any permissions to run - they do not read any data, and are
(in my opinion) similar in spirit to DescribeTable, so I think it's fine
not to require any permissions for them.

A test is also added.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2024-08-19 09:57:53 +02:00
Nadav Har'El
0789841cf8 test/alternator: additional tests for RBAC
Additional tests for support for CQL Role-Based Access Control (RBAC)
in Alternator:

1. Check that even in an Alternator table whose name isn't valid as CQL
   table names (e.g., uses the dot character) the GRANT/REVOKE commands
   work as expected.

2. Check that superuser roles have full permissions, as expected.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2024-08-19 09:57:53 +02:00
Nadav Har'El
409fea5541 test/alternator: reduce permissions-validity-in-ms
We set in test/cql-pytest/run.py, affecting test/alternator/run, the
configuration permissions_validity_in_ms by default to 100ms. This means
that tests that need to check how GRANT or REVOKE work always need to
sleep for more than 100ms, which can make a test with a lot of these
operations very slow.

So let's just set this configuration value to 5ms. I checked that it
doesn't adversely affect the total running speed of test/alternator/run.

This change only affects running tests through test/alternator/run, which
is expected to be fast. I left the default for test.py as it was, 100ms,
the latency of individual tests is less important there.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2024-08-19 09:57:53 +02:00
Nadav Har'El
1b20a11dec test/alternator: add test for BatchGetItem from multiple tables
While working on the RBAC on BatchGetItem, I noticed that although
BatchGetItem may ask to read items from several tables, we don't have
a test covering this case! This patch fixes that testing oversight.

Note that for the write-side version of this operation, BatchWriteItem,
we do have tests that write to several tables in the same batch.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2024-08-19 09:57:53 +02:00
Nadav Har'El
f827bd51d2 alternator: test for operations that do not need any permissions
Some operations, namely ListTables, DescribeTable, DescribeEndpoints,
ListTagsOfResource, DescribeTimeToLive and DescribeContinuousBackups
do not need any permissions to be GRANTed to a role.

Our rationale for this decision is that in CQL, "describe table" and
friends also do not require any permissions.

This patch includes a test that verifies that they really don't need
permissions.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2024-08-19 09:57:53 +02:00
Nadav Har'El
9417cf8bcf alternator: add RBAC enforcement to UpdateTimeToLive
This patch adds a requirement for the "ALTER" permission on a table to
run a UpdateTimeToLive on it. UpdateTimeToLive is similar in purpose to
UpdateTable, so it makes sense to use the same permission "ALTER" as we
do for UpdateTable.

A tests is also added.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2024-08-19 09:57:53 +02:00
Nadav Har'El
e76316495c alternator: add RBAC enforcement to TagResource and UntagResource
This patch adds a requirement for the "ALTER" permission on a table to
run the TagResource or UntagResource operations on it. CQL does not
have an exact parallel of DynamoDB's tagging feature, but our usual
use of tags as an extension of UpdateTable to change non-standard options
(e.g., write isolation policy or tablets setup), so it makes sense to
require the same permissions we require for UpdateTable - namely "ALTER".

A test for both operations is also added.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2024-08-19 09:57:53 +02:00
Nadav Har'El
fda4a9fad8 alternator: add RBAC enforcement to BatchGetItem
This patch adds a requirement for the "SELECT" permission on a table to
run a BatchGetItem on it. A single batch may ask to write to several
different tables, so we fail the entire batch with AccessDeniedException
if any of the tables mentioned in the batch do not have SELECT permissions
for this role.

A tests is also added.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2024-08-19 09:57:51 +02:00
Nadav Har'El
b02288785f alternator: add RBAC enforcement to BatchWriteItem
This patch adds a requirement for the "MODIFY" permission on a table to
run a BatchWriteItem on it. A single batch may ask to write to several
different tables, so we fail the entire batch with AccessDeniedException
if any of the tables mentioned in the batch do not have MODIFY permissions
for this role.

A tests is also added.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2024-08-19 09:56:28 +02:00
Nadav Har'El
445a5d57cd alternator: add RBAC enforcement to UpdateTable
This patch adds a requirement for the "ALTER" permission on a table to
run a UpdateTable on it.

A tests is also added.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2024-08-19 09:45:22 +02:00
Nadav Har'El
b4484158e7 alternator: add RBAC enforcement to Query and Scan
This patch adds a requirement for the "SELECT" permission on a table to
run a Query or Scan on it.

Both Query and Scan operations call the same do_query() function, so the
permission checks are put there.

Note that Query can read from either the base table or one of its views,
and the permissions on the base and each of the views can be separate
(so we can allow a role to only read one view, for example).

Tests for all of the above are also added.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2024-08-19 09:45:22 +02:00
Nadav Har'El
82f7e55943 alternator: add RBAC enforcement to CreateTable
This patch adds a requirement for the "CREATE" permission on ALL
KEYSPACES to run a CreateTable operation.

The CreateTable operation also performs so-called "auto-grant": When a
role creates a table, it is automatically granted full permissions to
read, write, change or delete that new table.

A test for all these things is also added.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2024-08-19 09:45:22 +02:00
Nadav Har'El
79dfb7b7d5 alternator: add RBAC enforcement to DeleteTable
This patch adds a requirement for the "DROP" permission on a table to
run a DeleteTable on it.

Moreover, when a table and its views are deleted, any special permissions
previously GRANTed on this table are removed. This is necessary because
if a role creates a table it is automatically granted permissions on this
table (this is known as "auto-grant" - see the CreateTable patch for
details). If this role deletes this table and later a second role creates
a table with the same name, we don't want the first role to have
permissions on this new table.

Tests for permission enforcements and revocation on delete are also added.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2024-08-19 09:45:22 +02:00
Nadav Har'El
2ebc0501b8 alternator: add RBAC enforcement to UpdateItem
This patch adds a requirement for the "MODIFY" permission on a table to
run a UpdateItem on it.

Only the MODIFY permission is required, even if the operation may also
read the old value of the item, such as a read-modify-write operation
or even using ReturnValues='ALL_OLD'.

A test is also added.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2024-08-19 09:45:22 +02:00
Nadav Har'El
36d8aea654 alternator: add RBAC enforcement to DeleteItem
This patch adds a requirement for the "MODIFY" permission on a table to
run a DeleteItem on it.

Only the MODIFY permission is required, even if the operation may also
read the old value of the item (using ReturnValues='ALL_OLD').

A test is also added.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2024-08-19 09:45:22 +02:00
Nadav Har'El
34c975854a alternator: add RBAC enforcement to PutItem
This patch adds a requirement for the "MODIFY" permission on a table to
run a PutItem on it.

Only the MODIFY permission is required, even if the operation may also
read the old value of the item (using ReturnValues='ALL_OLD').

A test is also added.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2024-08-19 09:45:22 +02:00
Nadav Har'El
3008b8416c alternator: add RBAC enforcement to GetItem
In this patch, we begin to add role-based access control (RBAC)
enforement to Alternator - in this patch only to GetItem.

After the preparation of client_state correctly in the previous patch,
the permission check itself in the get_item() function is very simple.
The bigger part of this patch is a full functional test in
test/alternator/test_cql_rbac.py. The test is quite self-explanatory
and heavily commented. Basically we check that a new role cannot
read with GetItem a pre-existing table, and we can add that ability
by GRANTing (in CQL) the new role the ability to SELECT the table,
the keyspace, all keyspaces, or add that ability to some other role
that this role inherits.

In the following patches, we will add role-based access control to
the Alternator operations, but the functional tests will be shorter -
we don't need to check the role inheritence, "all keyspaces" feature,
and so on, for every operation separately since they all use the
same underlying checking functions which handles these role inheritence
issues in exactly the same way.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2024-08-19 09:45:22 +02:00
Tomasz Grabiec
ab7656a7be Merge 'replica: fix copy constructor of tablet_sstable_set' from Lakshmi Narayanan Sreethar
Commit 9f93dd9fa3 changed
`tablet_sstable_set::_sstable_sets` to be a `absl::flat_hash_map` and
in addition, `std::set<size_t> _sstable_set_ids` was
added. `_sstable_set_ids` is set up in the
`tablet_sstable_set(schema_ptr s, const storage_group_manager& sgm,
const locator::tablet_map& tmap)` constructor, but it is not copied in
`tablet_sstable_set(const tablet_sstable_set& o)`.

This affects the `tablet_sstable_set::tablet_sstable_set` method as it
depends on the copy constructor. Since sstable set can be cloned when
a new sstable set is added, the issue will cause ids not being copied
into the new sstable set. It's healed only after compaction, since the
sstable set is rebuilt from scratch there.

This PR fixes this issue by removing the existing copy constructor of
`tablet_sstable_set` to enable the implicit default copy constructor.

Fixes #19519

Closes scylladb/scylladb#20115

* github.com:scylladb/scylladb:
  boost/sstable_set_test: add testcase to test tablet_sstable_set copy constructor
  replica: fix copy constructor of tablet_sstable_set
2024-08-19 00:53:29 +02:00
Amnon Heiman
63fdfb89cd Add tests for Alternator batch operation metrics
This patch adds unit tests to verify the correctness of the newly
introduced histogram metrics for get and write batch operation
latencies.

The test uses the existing latency test with the added metrics.

Signed-off-by: Amnon Heiman <amnon@scylladb.com>
2024-08-18 12:19:43 +03:00
Lakshmi Narayanan Sreethar
ec47b50859 boost/sstable_set_test: add testcase to test tablet_sstable_set copy constructor
Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com>
2024-08-17 23:38:05 +05:30
Kefu Chai
c628fa4e9e tools: enhance scylla sstable shard-of to support tablets
before this change, `scylla sstable shard-of` didn't support tablets,
because:

- with tablets enabled, data distribution uses the scheduler
- this replaces the previous method of mapping based on vnodes and shard numbers
- as a result, we can no longer deduce sstable mapping from token ranges

in this change, we:
- read `system.tablets` table to retrieve tablet information
- print the tablet's replica set (list of <host, shard> pairs)
- this helps users determine where a given sstable is hosted

This approach provides the closest equivalent functionality of
`shard-of` in the tablet era.

Fixes scylladb/scylladb#16488
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2024-08-15 15:49:55 +08:00
Kefu Chai
3f8f1d7274 tools/scylla-sstable: print warning when running shard-of with tablets
the subcommand of "shard-of" does not support tablets yet. so let's
print out an error message, instead of printing the mapping assuming
that the sstables are distributed based on token only.

this commit also adds two more command line options to this subcommand,
so that user is required to specify either "--vnodes" or "--tablets"
to instruct the tool how the cluster distributes the tokens across nodes
and their shards. this helps to minimize the suprise of user.

this change prepares for the succeeding changes to implement the tablets
support.

the corresponding test is updated accordingly so that it only exercises
the "shard-of" subcommand without tablets. we will test it with tablets
enabled in a succeeding change.

Refs scylladb/scylladb#16488
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2024-08-15 15:49:55 +08:00
Laszlo Ersek
9aa7d232d6 test/raft/randomized_nemesis_test: clean up remaining index_t usage
With implicit conversion of tagged integers to untagged ones going away,
explicitly tag (or untag, as necessary) the operands of the following
operations, in "test/raft/randomized_nemesis_test.cc":

- addition of tagged and untagged (both should be tagged)

- taking the minimum of an index difference and a container size (both
  should be untagged)

Signed-off-by: Laszlo Ersek <laszlo.ersek@scylladb.com>
2024-08-14 22:54:42 +02:00
Laszlo Ersek
1af3460a81 test/raft/randomized_nemesis_test: clean up index_t usage in store_snapshot()
With implicit conversion of tagged integers to untagged ones going away,
unpack and clean up the relatively complex

  first_to_remain = max(snap.idx + 1 - preserve_log_entries, 0)

calculation in persistence::store_snapshot().

Signed-off-by: Laszlo Ersek <laszlo.ersek@scylladb.com>
2024-08-14 22:54:42 +02:00
Laszlo Ersek
4dc2faa49a test/raft/replication: clean up remaining index_t usage
With implicit conversion of tagged integers to untagged ones going away,
explicitly untag the operands / arguments of the following operations, in
"test/raft/replication.hh":

- assignment to raft_cluster::_seen

- call to hasher_int::hash_range()

Signed-off-by: Laszlo Ersek <laszlo.ersek@scylladb.com>
2024-08-14 22:54:42 +02:00
Laszlo Ersek
3a32f3de81 test/raft/replication: take an "index_t start_idx" in create_log()
raft_cluster::get_states() passes a "start_idx" to create_log(), and
create_log() uses it as an "index_t" object. Match the type of "start_idx"
to its name.

This patch is best viewed with "git show -W".

Signed-off-by: Laszlo Ersek <laszlo.ersek@scylladb.com>
2024-08-14 22:54:42 +02:00
Laszlo Ersek
08e117aeb5 test/raft/replication: untag index_t in test_case::get_first_val()
In test_case::get_first_val(), the asssignment

  first_val = initial_snapshots[initial_leader].snap.idx;

*both* relies on implicit conversion of the tagged integer type "index_t"
to the underlying "uint64_t", *and* is a logic bug, as reported at
<https://github.com/scylladb/scylladb/issues/20151>.

For now, wean the buggy asssignment off the disappearing
tagged-to-untaggged conversion.

Signed-off-by: Laszlo Ersek <laszlo.ersek@scylladb.com>
2024-08-14 22:54:42 +02:00
Laszlo Ersek
6254fca7f5 test/raft/etcd_test: tag index_t and term_t for comparisons and subtractions
Properly annotate index_t and term_t constants for use in
BOOST_CHECK_EQUAL() and BOOST_CHECK().

Signed-off-by: Laszlo Ersek <laszlo.ersek@scylladb.com>
2024-08-14 22:54:42 +02:00
Laszlo Ersek
bd4fc85bf0 test/raft/fsm_test: tag index_t and term_t for comparisons and subtractions
Properly annotate index_t and term_t constants for use in
BOOST_CHECK_EQUAL(), BOOST_CHECK(). Clean up the first args of
read_quorum() calls -- stay in term_t space.

Signed-off-by: Laszlo Ersek <laszlo.ersek@scylladb.com>
2024-08-14 22:54:42 +02:00
Laszlo Ersek
265655473e test/raft/helpers: tighten compare_log_entries() param types
The "from" and "to" parameters of compare_log_entries() are raft log
indices; change them to raft::index_t, and update the callers.

Signed-off-by: Laszlo Ersek <laszlo.ersek@scylladb.com>
2024-08-14 22:54:42 +02:00
Avi Kivity
6d6f93e4b5 Merge 'test/nodetool: enable running nodetool tests under test/nodetool' from Kefu Chai
before this change, we assume user runs nodetool tests right under the root source directory. if user runs them under `test/nodetool`, the suppression rules are not applied. as the path is incorrect in that case.

after this change, the supression rules' path is deduced from the top src directory. so we can now run the nodetool test under `test/nodetool` .

---

no need to backport, this change improves developer's experience.

Closes scylladb/scylladb#20119

* github.com:scylladb/scylladb:
  test/nodetool: deduce subpression path from top srcdir
  test/nodetool: deduce path from top srcdir
2024-08-14 22:10:38 +03:00
Pavel Emelyanov
66d72e010c distributed_loader: Lock table via global table ptr
The lock_table() method needs database, ks and cf to find the table on
all shards. The same can be achieved with the help of global_table_ptr
thing that all the core callers already have at hand.

There's a test that doesn't have global table, but it can get one.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>

Closes scylladb/scylladb#20139
2024-08-14 20:53:21 +03:00
Pavel Emelyanov
7e3e5cfcad sstable_directory: Simplify special-purpose local-only constructor
Typically the sstable_directory is constructed out of a table object.
Some code, namely tests and schema-loader, don't have table at hand and
construct directory out of schema, sharder, path-to-sstables, etc. This
code doesn't work with any storage options other than local ones, so
there's no need (yet) to carry this argument over.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>

Closes scylladb/scylladb#20138
2024-08-14 20:22:50 +03:00
Avi Kivity
28d3b91cce Merge 'test/perf/perf_sstables: use test_modes as the type of its option' from Kefu Chai
before this change, we look up for the mode using the command line
option as the key, but that's incorrect if the command line option does
not match with any of the known names. in that case, `test_mode` just
create another pair of <sstring, test_modes>, and return the second
component of this pair. and the second component is not what we expect.
we should have thrown an exception.

in this change

* the test_mode map is marked const.
* the overloads for parsing / formatting the `test_modes` type are
  added, so that boost::program_options can parse and format it.

after this change, we  print more user friendly error, like

```
/scylla perf-sstable --mode index-foo
error: the argument ('index-foo') for option '--mode' is invalid

Try --help.
```

instead of a bunch of output which is printed as if we passes the correct option as the argument of the `--mode` option.

---

it's an improvement of developer experience, hence no need to backport.

Closes scylladb/scylladb#20140

* github.com:scylladb/scylladb:
  test/perf/perf_sstable: use switch-case when appropriate
  test/perf/perf_sstables: use test_modes as the type of its option
2024-08-14 20:18:22 +03:00