Commit Graph

4865 Commits

Author SHA1 Message Date
Botond Dénes
3d75158fda Merge 'Allow no owned token ranges in cleanup compaction' from Benny Halevy
It is possible that a node will have no owned token ranges
in some keyspaces based on their replication strategy,
if the strategy is configured to have no replicas in
this node's data center.

In this case we should go ahead with cleanup that will
effectively delete all data.

Note that this is current very inefficient as we need
to filter every partition and drop it as unowned.
It can be optimized by either special casing this case
or, better, use skip forward to the next owned range.
This will skip to end-of-stream since there are no
owned ranges.

Fixes #13634

Also, add a respective rest_api unit test

Closes #13849

* github.com:scylladb/scylladb:
  test: rest_api: test_storage_service: add test_storage_service_keyspace_cleanup_with_no_owned_ranges
  compaction_manager: perform_cleanup: handle empty owned ranges
2023-05-11 15:05:06 +03:00
Botond Dénes
24cb351655 Merge 'test: sstable_*test: avoid using helper using generation_type::int_t ' from Kefu Chai
the series drops some of the callers using SSTable generation as integer. as the generation of SSTable is but an identifier, we should not use it as an integer out of generation_type's implementation.

Closes #13845

* github.com:scylladb/scylladb:
  test: drop unused helper functions
  test: sstable_mutation_test: avoid using helper using generation_type::int_t
  test: sstable_move_test: avoid using helper using generation_type::int_t
  test: sstable_*test: avoid using helper using generation_type::int_t
  test: sstable_3_x_test: do not use reuseable_sst() accepting integer
2023-05-11 10:17:02 +03:00
Benny Halevy
0b91bfbcc5 test: rest_api: test_storage_service: add test_storage_service_keyspace_cleanup_with_no_owned_ranges
Test cleanup on a keyspace after altering
it replication factor to 0.
Expect no sstables to remain.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2023-05-11 08:16:31 +03:00
Kefu Chai
29284d64a5 test: drop unused helper functions
all users of these two helpers have switched to their alternatives,
so there is no need to keep them.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2023-05-11 12:32:37 +08:00
Kefu Chai
b036d2b50c test: sstable_mutation_test: avoid using helper using generation_type::int_t
this change is one of the series which drops most of the callers
using SSTable generation as integer. as the generation of SSTable
is but an identifier, we should not use it as an integer out of
generation_type's implementation. so, in this change, instead of
using `generation_type::int_t` in the helper functions, we just
pass `generation_type` in place of integer. also, since
`generate_clustered()` is only used by functions in the same
compilation unit, let's take the opportunity to mark it `static`.
and there is no need to pass generation as a template parameter,
we just pass it as a regular parameter.

we will divert other callers of `reusable_sst(...,
generation_type::int)` in following-up changes in different ways.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2023-05-11 12:32:22 +08:00
Kefu Chai
689e1e99d6 test: sstable_move_test: avoid using helper using generation_type::int_t
this change is one of the series which drops most of the callers
using SSTable generation as integer. as the generation of SSTable
is but an identifier, we should not use it as an integer out of
generation_type's implementation. so, in this change, instead of
using `generation_type::int_t` in helper functions, we just use
`generation_type`. please note, despite that we'd prefer generating
the generations using generator, the SSTables used by the tests
modified by this change are stored in the repo, to ensure that the
tests are always able to find the SSTable files, we keep them
unchanged instead of using generation_generator, or a random
generation for the testing.

we will divert other callers of `reusable_sst(...,
generation_type::int)` in following-up changes in different ways.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2023-05-11 12:32:22 +08:00
Kefu Chai
bfd6caffbb test: sstable_*test: avoid using helper using generation_type::int_t
this change is one of the series which drops most of the callers
using SSTable generation as integer. as the generation of SSTable
is but an identifier, we should not use it as an integer out of
generation_type's implementation. so, in this change, instead of
using the helper accepting int, we switch to the one which accepts
generation_type by offering a default paramter, which is a
generation created using 1. this preserves the existing behavior.

we will divert other callers of `reusable_sst(...,
generation_type::int)` in following-up changes in different ways.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2023-05-11 12:32:22 +08:00
Kefu Chai
ab8efbf1ab test: sstable_3_x_test: do not use reuseable_sst() accepting integer
this change is one of the series which drops most of the callers
using SSTable generation as integer. as the generation of SSTable
is but an identifier, we should not use it as an integer out of
generation_type's implementation. so, in this change, instead of
using the helper accepting int, we switch to the one which accepts
generation_type.

also, as no callers are using the last parameter of `make_test_sstable()`,
let's drop it .

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2023-05-11 12:32:21 +08:00
Nadav Har'El
f1cad230bb Merge 'cql: enable setting permissions on resources with quoted UDT names' from Wojciech Mitros
This series fixes an issue with altering permissions on UDFs with
parameter types that are UDTs with quoted names and adds
a test for it.

The issue was caused by the format of the temporary string
that represented the UDT in `auth::resource`. After parsing the
user input to a raw type, we created a string representing the
UDT using `ut_name::to_string()`. The segment of the resulting
string that represented the name of the UDT was not quoted,
making us unable to parse it again when the UDT was being
`prepare`d. Other than for this purpose, the `ut_name::to_string()`
is used only for logging, so the solution was modifying it to
maybe quote the UDT name.

Ref: https://github.com/scylladb/scylladb/pull/12869

Closes #13257

* github.com:scylladb/scylladb:
  cql-pytest: test permissions for UDTs with quoted names
  cql: maybe quote user type name in ut_name::to_string()
  cql: add a check for currently used stack in parser
  cql-pytest: add an optional name parameter to new_type()
2023-05-10 19:10:29 +03:00
Wojciech Mitros
1f45c7364c cql: check permissions for used functions when creating a UDA
Currently, when creating a UDA, we only check for permissions
for creating functions. However, the creator gains all permissions
to the UDA, including the EXECUTE permission. This enables the
user to also execute the state/reduce/final functions that were
used in the UDA, even if they don't have the EXECUTE permissions
on them.

This patch adds checks for the missing EXECUTE permissions, so
that the UDA can be only created if the user has all required
permissions.

The new permissions that are now required when creating a UDA
are now granted in the existing UDA test.

Fixes #13818

Closes #13819
2023-05-10 18:06:04 +03:00
Wojciech Mitros
a86b9fa0bb auth: fix formatting of function resource with no arguments
Currently, when a function has no arguments, the function_args()
method, which is supposed to return a vector of string_views
representing the arguments of the function, returns a nullopt
instead, as if it was a functions_resource on all functions
or all functions in a keyspace. As a result, the functions_resource
can't be properly formatted.
This is fixed in this patch by returning an empty vector instead,
and the fix is confirmed in a cql-pytest.

Fixes #13842

Closes #13844
2023-05-10 17:07:33 +03:00
Nadav Har'El
e57252092c Merge 'cql3: result_set, selector: change value type to managed_bytes_opt' from Avi Kivity
CQL evolved several expression evaluation mechanisms: WHERE clause,
selectors (the SELECT clause), and the LWT IF clause are just some
examples. Most now use expressions, which use managed_bytes_opt
as the underlying value representation, but selectors still use bytes_opt.

This poses two problems:
1. bytes_opt generates large contiguous allocations when used with large blobs, impacting latency
2. trying to use expressions with bytes_opt will incur a copy, reducing performance

To solve the problem, we harmonize the data types to managed_bytes_opt
(#13216 notwithstanding). This is somewhat difficult since the source of the values
are views into a bytes_ostream. However, luckily bytes_ostream and managed_bytes_view
are mostly compatible so with a little effort this can be done.

The series is neutral wrt performance:

before:
```
222118.61 tps ( 61.1 allocs/op,  12.1 tasks/op,   43092 insns/op,        0 errors)
224250.14 tps ( 61.1 allocs/op,  12.1 tasks/op,   43094 insns/op,        0 errors)
224115.66 tps ( 61.1 allocs/op,  12.1 tasks/op,   43092 insns/op,        0 errors)
223508.70 tps ( 61.1 allocs/op,  12.1 tasks/op,   43107 insns/op,        0 errors)
223498.04 tps ( 61.1 allocs/op,  12.1 tasks/op,   43087 insns/op,        0 errors)
```

after:
```
220708.37 tps ( 61.1 allocs/op,  12.1 tasks/op,   43118 insns/op,        0 errors)
225168.99 tps ( 61.1 allocs/op,  12.1 tasks/op,   43081 insns/op,        0 errors)
222406.00 tps ( 61.1 allocs/op,  12.1 tasks/op,   43088 insns/op,        0 errors)
224608.27 tps ( 61.1 allocs/op,  12.1 tasks/op,   43102 insns/op,        0 errors)
225458.32 tps ( 61.1 allocs/op,  12.1 tasks/op,   43098 insns/op,        0 errors)
```

Though I expect with some more effort we can eliminate some copies.

Closes #13637

* github.com:scylladb/scylladb:
  cql3: untyped_result_set: switch to managed_bytes_view as the cell type
  cql3: result_set: switch cell data type from bytes_opt to managed_bytes_opt
  cql3: untyped_result_set: always own data
  types: abstract_type: add mixed-type versions of compare() and equal()
  utils/managed_bytes, serializer: add conversion between buffer_view<bytes_ostream> and managed_bytes_view
  utils: managed_bytes: add bidirectional conversion between bytes_opt and managed_bytes_opt
  utils: managed_bytes: add managed_bytes_view::with_linearized()
  utils: managed_bytes: mark managed_bytes_view::is_linearized() const
2023-05-10 15:01:45 +03:00
Botond Dénes
bb62038119 Merge 'Scrub compaction task' from Aleksandra Martyniuk
Task manager's tasks covering scrub compaction on top,
shard and table level.

For this levels we have common scrub tasks for each scrub
mode since they share code. Scrub modes will be differentiated
on compaction group level.

Closes #13694

* github.com:scylladb/scylladb:
  test: extend test_compaction_task.py to test scrub compaction
  compaction: add table_scrub_sstables_compaction_task_impl
  compaction: add shard_scrub_sstables_compaction_task_impl
  compaction: add scrub_sstables_compaction_task_impl
  api: get rid of unnecessary std::optional in scrub
  compaction: rename rewrite_sstables_compaction_task_impl
2023-05-10 14:18:20 +03:00
Kamil Braun
7d9ab44e81 Merge 'token_metadata: read remapping for write_both_read_new' from Gusev Petr
When new nodes are added or existing nodes are deleted, the topology
state machine needs to shunt reads from the old nodes to the new ones.
This happens in the `write_both_read_new` state. The problem is that
previously this state was not handled in any way in `token_metadata` and
the read nodes were only changed when the topology state machine reached
the final 'owned' state.

To handle `write_both_read_new` an additional `interval_map` inside
`token_metadata` is maintained similar to `pending_endpoints`.  It maps
the ranges affected by the ongoing topology change operation to replicas
which should be used for reading. When topology state sm reaches the
point when it needs to switch reads to a new topology, it passes
`request_read_new=true` in a call to `update_pending_ranges`. This
forces `update_pending_ranges` to compute the ranges based on new
topology and store them to the `interval_map`. On the data plane, when a
read on coordinator needs to decide which endpoints to use, it first
consults this `interval_map` in `token_metadata`, and only if it doesn't
contain a range for current token it uses normal endpoints from
`effective_replication_map`.

Closes #13376

* github.com:scylladb/scylladb:
  storage_proxy, storage_service: use new read endpoints
  storage_proxy: rename get_live_sorted_endpoints->get_endpoints_for_reading
  token_metadata: add unit test for endpoints_for_reading
  token_metadata: add endpoints for reading
  sequenced_set: add extract_set method
  token_metadata_impl: extract maybe_migration_endpoints helper function
  token_metadata_impl: introduce migration_info
  token_metadata_impl: refactor update_pending_ranges
  token_metadata: add unit tests
  token_metadata: fix indentation
  token_metadata_impl: return unique_ptr from clone functions
2023-05-10 10:03:30 +02:00
Avi Kivity
996f717dfc Merge 'cql3/prepare_expr: force token() receiver name to be partition key token' from Jan Ciołek
Let's say that we have a prepared statement with a token restriction:
```cql
SELECT * FROM some_table WHERE token(p1, p2) = ?
```

After calling `prepare` the drivers receives some information about the prepared statment, including names of values bound to each bind marker.

In case of a partition token restriction (`token(p1, p2) = ?`) there's an expectation that the name assigned to this bind marker will be `"partition key token"`.

In a recent change the code handling `token()` expressions has been unified with the code that handles generic function calls, and as a result the name has changed to `token(p1, p2)`.

It turns out that the Java driver relies on the name being `"partition key token"`, so a change to `token(p1, p2)` broke some things.

This patch sets the name back to `"partition key token"`. To achieve this we detect any restrictions that match the pattern `token(p1, p2, p3) = X` and set the receiver name for X to `"partition key token"`.

Fixes: #13769

Closes #13815

* github.com:scylladb/scylladb:
  cql-pytest: test that bind marker is partition key token
  cql3/prepare_expr: force token() receiver name to be partition key token
2023-05-09 20:44:46 +03:00
Petr Gusev
15fe4d8d69 token_metadata: add unit test for endpoints_for_reading 2023-05-09 18:42:03 +04:00
Botond Dénes
287ccce1cc Merge 'sstables: extract storage out ' from Kefu Chai
this change extracts the storage class and its derived classes
out into their own source files. for couple reasons:

- for better readability. the sstables.hh is over 1005 lines.
  and sstables.cc 3602 lines. it's a little bit difficult to figure
  out how the different parts in these sources interact with each
  other. for instance, with this change, it's clear some of helper
  functions are only used by file_system_storage.
- probably less inter-source dependency. by extracting the sources
  files out, they can be compiled individually, so changing one .cc
  file does not impact others. this could speed up the compilation
  time.

Closes #13785

* github.com:scylladb/scylladb:
  sstables: storage: coroutinize idempotent_link_file()
  sstables: extract storage out
2023-05-09 14:03:40 +03:00
Jan Ciolek
9ad1c5d9f2 cql-pytest: test that bind marker is partition key token
When preparing a query each bind marker gets a name.
For a query like:
```cql
SELECT * FROM some_table WHERE token(p1, p2) = ?
```
The bind marker's name should be `"partition key token"`.
Java driver relies on this name, having something else,
like `"token(p1, p2)"` be the name breaks the Java driver.

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2023-05-09 12:33:06 +02:00
Petr Gusev
3120cabf56 token_metadata: add unit tests
We are going to refactor update_pending_ranges,
so in this commit we add some simple unit tests
to ensure we don't break it.
2023-05-09 13:56:06 +04:00
Aleksandra Martyniuk
f199ec5ec3 test: extend test_compaction_task.py to test scrub compaction 2023-05-09 11:15:26 +02:00
Kefu Chai
2eefcb37eb sstables: extract storage out
this change extracts the storage class and its derived classes
out into storage.cc and storage.hh. for couple reasons:

- for better readability. the sstables.hh is over 1005 lines.
  and sstables.cc 3602 lines. it's a little bit difficult to figure
  out how the different parts in these sources interact with each
  other. for instance, with this change, it's clear some of helper
  functions are only used by file_system_storage.
- probably less inter-source dependency. by extracting the sources
  files out, they can be compiled individually, so changing one .cc
  file does not impact others. this could speed up the compilation
  time.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2023-05-09 16:47:00 +08:00
Botond Dénes
20f620feb9 Merge 'replica, sstable: replace generation_type::value() with generation_type::as_int()' from Kefu Chai
this series prepares for the UUID based generation by replacing the general `value()` function with the function with more specific name: `as_int()`.

Closes #13796

* github.com:scylladb/scylladb:
  test: drop a reusable_sst() variant which accepts int as generation
  treewide: replace generation_type::value() with generation_type::as_int()
2023-05-09 07:30:54 +03:00
Nadav Har'El
5f37d43ee6 Merge 'compaction: validate: validate the index too' from Botond Dénes
In addition to the data file itself. Currently validation avoids the
index altogether, using the crawling reader which only relies on the
data file and ignores the index+summary. This is because a corrupt
sstable usually has a corrupt index too and using both at the same time
might hide the corruption. This patch adds targeted validation of the
index, independent of and in addition to the already existing data
validation: it validates the order of index entries as well as whether
the entry points to a complete partition in the data file.
This will usually result in duplicate errors for out-of-order
partitions: one for the data file and one for the index file.

Fixes: #9611

Closes #11405

* github.com:scylladb/scylladb:
  test/cql-pytest: add test_sstable_validation.py
  test/cql-pytest: extract scylla_path,temp_workdir fixtures to conftest.py
  tools/scylla-sstables: write validation result to stdout
  sstables/sstable: validate(): delegate to mx validator for mx sstables
  sstables/mx/reader: add mx specific validator
  mutation/mutation_fragment_stream_validator: add validator() accessor to validating filter
  sstables/mx/reader: template data_consume_rows_context_m on the consumer
  sstables/mx/reader: move row_processing_result to namespace scope
  sstables/mx/reader: use data_consumer::proceed directly
  sstables/mx/reader.cc: extend namespace to end-of-file (cosmetic)
  compaction/compaction: remove now unused scrub_validate_mode_validate_reader()
  compaction/compaction: move away from scrub_validate_mode_validate_reader()
  tools/scylla-sstable: move away from scrub_validate_mode_validate_reader()
  test/boost/sstable_compaction_test: move away from scrub_validate_mode_validate_reader()
  sstables/sstable: add validate() method
  compaction/compaction: scrub_sstables_validate_mode(): validate sstables one-by-one
  compaction: scrub: use error messages from validator
  mutation_fragment_stream_validator: produce error messages in low-level validator
2023-05-08 17:14:26 +03:00
Botond Dénes
b790f14456 reader_concurrency_semaphore: execution_loop(): trigger admission check when _ready_list is empty
The execution loop consumes permits from the _ready_list and executes
them. The _ready_list usually contains a single permit. When the
_ready_list is not empty, new permits are queued until it becomes empty.
The execution loops relies on admission checks triggered by the read
releasing resouces, to bring in any queued read into the _ready_list,
while it is executing the current read. But in some cases the current
read might not free any resorces and thus fail to trigger an admission
check and the currently queued permits will sit in the queue until
another source triggers an admission check.
I don't yet know how this situation can occur, if at all, but it is
reproducible with a simple unit test, so it is best to cover this
corner-case in the off-chance it happens in the wild.
Add an explicit admission check to the execution loop, after the
_ready_list is exhausted, to make sure any waiters that can be admitted
with an empty _ready_list are admitted immediately and execution
continues.

Fixes: #13540

Closes #13541
2023-05-08 17:11:41 +03:00
Kamil Braun
153cb00e9d test: test_random_tables: wait for token ring convergence before data queries
The test performs an `INSERT` followed by a `SELECT`, checking if the
previously inserted data is returned.

This may fail because we're using `ring_delay = 0` in tests and the two
queries may arrive at different nodes, whose `token_metadata` didn't
converge yet (it's eventually consistent based on gossiping).
I illustrated this here:
https://github.com/scylladb/scylladb/issues/12937#issuecomment-1536147455

Ensure that the nodes' token rings are synchronized (by waiting until
the token ring members on each node is the same as group 0
configuration).

Fixes #12937

Closes #13791
2023-05-08 13:22:52 +02:00
Kamil Braun
3f3dcf451b test: pylib: random_tables: perform read barrier in verify_schema
`RandomTables.verify_schema` is often called in topology tests after
performing a schema change. It compares the schema tables fetched from
some node to the expected latest schema stored by the `RandomTables`
object.

However there's no guarantee that the latest schema change has already
propagated to the node which we query. We could have performed the
schema change on a different node and the change may not have been
applied yet on all nodes.

To fix that, pick a specific node and perform a read barrier on it, then
use that node to fetch the schema tables.

Fixes #13788

Closes #13789
2023-05-08 13:21:10 +02:00
Avi Kivity
198738f2b1 Merge 'build: compile wasm udfs automatically' from Wojciech Mitros
Currently, when we deal with a Wasm program, we store
it in its final WebAssembly Text form. This causes a lot
of code bloat and is hard to read. Instead, we would like
to store only the source codes, and build Wasm when
necessary. This series adds build commands that
compile C/Rust sources to Wasm and uses them for Wasm
programs that we're already using.

After these changes, adding a new program that should be
compiled to Rust, requires only adding the source code
of it and updating the `wasms` and `wasm_deps` lists in
`configure.py`.

All Wasm programs are build by default when building all
artifacts, artifacts in a given mode, or when building
tests. Additionally, a {mode}-wasm target is added, so that
it's possible to build just the wasm files.
The generated files are saved in $builddir/{mode}/wasm,
and are accessed in cql-pytests similarly to the way we're
accessing the scylla binary - using glob.

Closes #13209

* github.com:scylladb/scylladb:
  wasm: replace wasm programs with their source programs
  build: prepare rules for compiling wasm files
  build: set the type of build_artifacts
  test: extend capabilities of Wasm reading helper funciton
2023-05-08 13:51:53 +03:00
Wojciech Mitros
6d89d718d9 wasm: replace wasm programs with their source programs
After recent changes, we are able to store only the
C/Rust source codes for Wasm programs, and only build
them when neccessary. This patch utilizes this
opportunity by removing most of the currently stored
raw Wasm programs, replacing them with C/Rust sources
and adding them to the new build system.
2023-05-08 10:47:34 +02:00
Wojciech Mitros
0a34a54c73 test: extend capabilities of Wasm reading helper funciton
Currently, we require that the Wasm file is named the same
as the funciton. In the future we may want multiple functions
with the same name, which we can't currently do due to this
limitation.
This patch allows specifying the function name, so that multiple
files can have a function with the same name.
Additionally, the helper method now escapes "'" characters, so
that they can appear in future Wasm files.
2023-05-08 10:47:34 +02:00
Botond Dénes
ab5fd0f750 Merge 's3: Provide timestamps in the s3 file implementation' from Raphael "Raph" Carvalho
SSTable relies on st.st_mtime for providing creation time of data
file, which in turn is used by features like tombstone compaction.

Therefore, let's implement it.

Fixes https://github.com/scylladb/scylladb/issues/13649.

Closes #13713

* github.com:scylladb/scylladb:
  s3: Provide timestamps in the s3 file implementation
  s3: Introduce get_object_stats()
  s3: introduce get_object_header()
2023-05-08 11:43:41 +03:00
Raphael S. Carvalho
57661f0392 s3: Introduce get_object_stats()
get_object_stats() will be used for retrieving content size and
also last modified.

The latter is required for filling st_mtim, etc, in the
s3::client::readable_file::stat() method.

Refs #13649.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
2023-05-07 19:51:10 -03:00
Avi Kivity
42a1ced73b cql3: result_set: switch cell data type from bytes_opt to managed_bytes_opt
The expression system uses managed_bytes_opt for values, but result_set
uses bytes_opt. This means that processing values from the result set
in expressions requires a copy.

Out of the two, managed_bytes_opt is the better choice, since it prevents
large contiguous allocations for large blobs. So we switch result_set
to use managed_bytes_opt. Users of the result_set API are adjusted.

The db::function interface is not modified to limit churn; instead we
convert the types on entry and exit. This will be adjusted in a following
patch.
2023-05-07 17:17:36 +03:00
Botond Dénes
c1e8e86637 reader_concurrency_semaphore: reader_permit: clean-up after failed memory requests
When requesting memory via `reader_permit::request_memory()`, the
requested amount is added to `_requested_memory` member of the permit
impl. This is because multiple concurrent requests may be blocked and
waiting at the same time. When the requests are fulfilled, the entire
amount is consumed and individual requests track their requested amount
with `resource_units` to release later.
There is a corner-case related to this: if a reader permit is registered
as inactive while it is waiting for memory, its active requests are
killed with `std::bad_alloc`, but the `_requested_memory` fields is not
cleared. If the read survives because the killed requests were part of
a non-vital background read-ahead, a later memory request will also
include amount from the failed requests. This extra amount wil not be
released and hence will cause a resource leak when the permit is
destroyed.
Fix by detecting this corner case and clearing the `_requested_memory`
field. Modify the existing unit test for the scenario of a permit
waiting on memory being registered as inactive, to also cover this
corner case, reproducing the bug.

Fixes: #13539

Closes #13679
2023-05-07 14:06:51 +03:00
Kefu Chai
bd3e8d0460 test: drop a reusable_sst() variant which accepts int as generation
this is one of the changes to reduce the usage of integer based generation
test. in future, we will need to expand the test to exercise the UUID
based generation, or at least to be neutral to the underlying generation's
identifier type. so, to remove the helpers which only accept `generation_type::int_t`
would helps us to make this happen.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2023-05-06 18:24:48 +08:00
Kefu Chai
9b35faf485 treewide: replace generation_type::value() with generation_type::as_int()
* replace generation_type::value() with generation_type::as_int()
* drop generation_value()

because we will switch over to UUID based generation identifier, the member
function or the free function generation_value() cannot fulfill the needs
anymore. so, in this change, they are consolidated and are replaced by
"as_int()", whose name is more specific, and will also work and won't be
misleading even after switching to UUID based generation identifier. as
`value()` would be confusing by then: it could be an integer or a UUID.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2023-05-06 18:24:45 +08:00
Kamil Braun
70f2b09397 Merge 'scylla_cluster.py: fix read_last_line' from Gusev Petr
This is a follow-up to #13399, the patch
addresses the issues mentioned there:
* linesep can be split between blocks;
* linesep can be part of UTF-8 sequence;
* avoid excessively long lines, limit to 256 chars;
* the logic of the function made simpler and more maintainable.

Closes #13427

* github.com:scylladb/scylladb:
  pylib_test: add tests for read_last_line
  pytest: add pylib_test directory
  scylla_cluster.py: fix read_last_line
  scylla_cluster.py: move read_last_line to util.py
2023-05-05 13:29:15 +02:00
Kefu Chai
05a172c7e7 build: cmake: link against Boost::unit_test_framework
we introduced the linkage to Boost::unit_test_framework in
fe70333c19, this library is used by
test/lib/test_utils.cc, so update CMake accordingly.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes #13781
2023-05-05 13:55:00 +03:00
Petr Gusev
8a0bcf9d9d pylib_test: add tests for read_last_line 2023-05-05 12:57:43 +04:00
Petr Gusev
7476e91d67 pytest: add pylib_test directory
We want to add tests for read_last_line,
in this commit we add a new directory for them
since there were no tests for pylib code before.
2023-05-05 12:57:43 +04:00
Petr Gusev
330d1d5163 scylla_cluster.py: fix read_last_line
This is a follow-up to #13399, the patch
addresses the issues mentioned there:
* linesep can be split between blocks;
* linesep can be part of UTF-8 sequence;
* avoid excessively long lines, limit to 512 chars;
* the logic of the function made simpler and more
maintainable.
2023-05-05 12:57:36 +04:00
Petr Gusev
8a5e211c30 scylla_cluster.py: move read_last_line to util.py
We want to add tests for read_last_line, so we
move it to make this simper.
2023-05-05 12:51:25 +04:00
Botond Dénes
687a8bb2f0 Merge 'Sanitize test::filename(sstable) API' from Pavel Emelyanov
There are two of them currently with slightly different declaration. Better to leave only one.

Closes #13772

* github.com:scylladb/scylladb:
  test: Deduplicate test::filename() static overload
  test: Make test::filename return fs::path
2023-05-05 11:36:08 +03:00
Pavel Emelyanov
ac305076bd test: Split test_twcs_interposer_on_memtable_flush naturally
The test case consists of two internal sub-test-cases. Making them
explicit kills three birds with one stone

- improves parallelizm
- removes env's tempdir wiping
- fixes code indentation

refs: #12707

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>

Closes #13768
2023-05-05 10:42:30 +03:00
Avi Kivity
f125a3e315 Merge 'tree: finish the reader_permit state renames' from Botond Dénes
In https://github.com/scylladb/scylladb/pull/13482 we renamed the reader permit states to more descriptive names. That PR however only covered only the states themselves and their usages, as well as the documentation in `docs/dev`.
This PR is a followup to said PR, completing the name changes: renaming all symbols, names, comments etc, so all is consistent and up-to-date.

Closes #13573

* github.com:scylladb/scylladb:
  reader_concurrency_semaphore: misc updates w.r.t. recent permit state name changes
  reader_concurrency_semaphore: update permit members w.r.t. recent permit state name changes
  reader_concurrency_semaphore: update RAII state guard classes w.r.t. recent permit state name changes
  reader_concurrency_semaphore: update API w.r.t. recent permit state name changes
  reader_concurrency_semaphore: update stats w.r.t. recent permit state name changes
2023-05-04 18:29:04 +03:00
Avi Kivity
204521b9a7 Merge 'mutation/mutation_compactor: validate range tombstone change before it is moved' from Botond Dénes
e2c9cdb576 moved the validation of the range tombstone change to the place where it is actually consumed, so we don't attempt to pass purged or discarded range tombstones to the validator. In doing so however, the validate pass was moved after the consume call, which moves the range tombstone change, the validator having been passed a moved-from range tombstone. Fix this by moving he validation to before the consume call.

Refs: #12575

Closes #13749

* github.com:scylladb/scylladb:
  test/boost/mutation_test: add sanity test for mutation compaction validator
  mutation/mutation_compactor: add validation level to compaction state query constructor
  mutation/mutation_compactor: validate range tombstone change before it is moved
2023-05-04 18:15:35 +03:00
Avi Kivity
1d351dde06 Merge 'Make S3 client work with real S3' from Pavel Emelyanov
Current S3 client was tested over minio and it takes few more touches to work with amazon S3.

The main challenge here is to support singed requests. The AWS S3 server explicitly bans unsigned multipart-upload requests, which in turn is the essential part of the sstables S3 backend, so we do need signing. Signing a request has many options and requirements, one of them is -- request _body_ can be or can be not included into signature calculations. This is called "(un)signed payload". Requests sent over plain HTTP require payload signing (i.e. -- request body should be included into signature calculations), which can a bit troublesome, so instead the PR uses unsigned payload (i.e. -- doesn't include the request body into signature calculation, only necessary headers and query parameters), but thus also needs HTTPS.

So what this set does is makes the existing S3 client code sign requests. In order to sign the request the code needs to get AWS key and secret (and region) from somewhere and this somewhere is the conf/object_storage.yaml config file. The signature generating code was previously merged (moved from alternator code) and updated to suit S3 client needs.

In order to properly support HTTPS the PR adds special connection factory to be used with seastar http client. The factory makes DNS resolving of AWS endpoint names and configures gnutls systemtrust.

fixes: #13425

Closes #13493

* github.com:scylladb/scylladb:
  doc: Add a document describing how to configure S3 backend
  s3/test: Add ability to run boost test over real s3
  s3/client: Sign requests if configured
  s3/client: Add connection factory with DNS resolve and configurable HTTPS
  s3/client: Keep server port on config
  s3/client: Construct it with config
  s3/client: Construct it with sstring endpoint
  sstables: Make s3_storage with endpoint config
  sstables_manager: Keep object storage configs onboard
  code: Introduce conf/object_storage.yaml configuration file
2023-05-04 18:08:54 +03:00
Pavel Emelyanov
56dfc21ba0 test: Deduplicate test::filename() static overload
There are two of them currently, both returning fs::path for sstable
components. One is static and can be dropped, callers are patched to use
the non-static one making the code tiny bit shorter.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-05-04 17:16:00 +03:00
Pavel Emelyanov
3f30a253be test: Make test::filename return fs::path
The sstable::filename() is private and is not supposed to be used as a
path to open any files. However, tests are different and they sometimes
know it is. For that they use test wrapper that has access to private
members and may make assumptions about meaning of sstable::filename().

Said that, the test::filename() should return fs::path, not sstring.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-05-04 17:14:04 +03:00
Tomasz Grabiec
e385ce8a2b Merge "fix stack use after free during shutdown" from Gleb
storage_service uses raft_group0 but the during shutdown the later is
destroyed before the former is stopped. This series move raft_group0
destruction to be after storage_service is stopped already. For the
move to work some existing dependencies of raft_group0 are dropped
since they do not really needed during the object creation.

Fixes #13522
2023-05-04 15:14:18 +02:00
Pavel Emelyanov
fe70333c19 test: Auto-skip object-storage test cases if run from shell
In case an sstable unit test case is run individually, it would fail
with exception saying that S3_... environment is not set. It's better to
skip the test-case rather than fail. If someone wants to run it from
shell, it will have to prepare S3 server (minio/AWS public bucket) and
provide proper environment for the test-case.

refs: #13569

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>

Closes #13755
2023-05-04 14:15:18 +03:00