Commit Graph

118 Commits

Author SHA1 Message Date
Kefu Chai
86b988a70b test/lib: do not use variable which could be moved away
C++ standard does not define the order in which the parameters
passed to a function are evaluated. so in theory, in
```c++
reusable_sst(sst->get_schema(), std::move(sst));
```
`std::move(sst)` could be evaluated before `sst->get_schema`.
but please note, `std::move(sst)` does not move `sst`
away, it merely cast `sst` to a rvalue reference, it is
`reusable_sst()` which *could* move `sst` away by
consuming it. so following call is much more dangerous
than the above one:
```c++
reusable_sst(sst->get_schema(), modify_sst(std::move(sst)))
```
nevertheless, this usage is still confusing. so instead
of passing a copy of `sst` to `reusable_sst`.

this change is inspired by clang-tidy, it warns like:

```
Warning: /home/runner/work/scylladb/scylladb/test/lib/test_services.cc:397:25: warning: 'sst' used after it was moved [bugprone-use-after-move]
  397 |     return reusable_sst(sst->get_schema(), std::move(sst));
      |                         ^
/home/runner/work/scylladb/scylladb/test/lib/test_services.cc:397:44: note: move occurred here
  397 |     return reusable_sst(sst->get_schema(), std::move(sst));
      |                                            ^
/home/runner/work/scylladb/scylladb/test/lib/test_services.cc:397:25: note: the use and move are unsequenced, i.e. there is no guarantee about the order in which they are evaluated
  397 |     return reusable_sst(sst->get_schema(), std::move(sst));
      |
```

per the analysis above, this is a false alarm.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#18775
2024-05-21 10:02:10 +03:00
Lakshmi Narayanan Sreethar
79f6746298 sstables_manager: add member to store maintenance scheduling group
Store that maintenance scheduling group inside the sstables_manager. The
next patch will use this to run the components reloader fiber.

Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com>
2024-05-19 15:23:45 +05:30
Aleksandra Martyniuk
532653f118 replica: replace table::as_table_state
Replace table::as_table_state with table::try_get_table_state_with_static_sharding
which throws if a table does not use static sharding.
2024-05-10 14:56:38 +02:00
Kefu Chai
a439ebcfce treewide: include fmt/ranges.h and/or fmt/std.h
before this change, we rely on the default-generated fmt::formatter
created from operator<<, but fmt v10 dropped the default-generated
formatter.

in this change, we include `fmt/ranges.h` and/or `fmt/std.h`
for formatting the container types, like vector, map
optional and variant using {fmt} instead of the homebrew
formatter based on operator<<.
with this change, the changes adding fmt::formatter and
the changes using ostream formatter explicitly, we are
allowed to drop `FMT_DEPRECATED_OSTREAM` macro.

Refs scylladb#13245

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2024-04-19 22:56:16 +08:00
Lakshmi Narayanan Sreethar
169629dd40 test/lib: allow overriding available memory via test_env_config
Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com>
2024-04-02 01:37:47 +05:30
Avi Kivity
b530dc1e3b test: sstables::test_env: complete pimplification
sstables::test_env uses the pimpl idiom, but incompletely. This
prevents reaping some of the benefits.

Complete the pimplification:
 - the `impl` nested struct is moved out-of-line
 - all non-template member functions are moved out-of-line
 - a destructor is declared and defined out-of-line
 - the move constructor is also defined (necessary after the destructor is
   defined)

After this, we can forward-declare more components.
2024-03-21 22:29:01 +02:00
Avi Kivity
d745929b44 test/lib: test_env: move test_env::reusable_sst() to test_services.cc
test_env implementation is scattered around two .cc, concentrate it
in test_services.cc, which happens to be the file that doesn't cause
link errors.

Move toc_filename with it, as it is its only caller and it is static.
2024-03-21 22:21:02 +02:00
Avi Kivity
628017c810 test: sstables::test_env: mock sstables_registry
sstables::test_env is intended for sstable unit tests, but to satisfy its
dependency of an sstables_registry we instantiate an entire database.

Remove the dependency by having a mock implementation of sstables_registry
and using that instead.

Closes scylladb/scylladb#17895
2024-03-21 10:19:46 +01:00
Avi Kivity
e48eb76f61 sstables_manager: decouple from system_keyspace
sstables_manager now depends on system_keyspace for access to the
system.sstables table, needed by object storage. This violates
modularity, since sstables_manager is a relatively low-level leaf
module while system_keyspace integrates large parts of the system
(including, indirectly, sstables_manager).

One area where this is grating is sstables::test_env, which has
to include the much higher level cql_test_env to accommodate it.

Fix this by having sstables_manager expose its dependency on
system_keyspace as an interface, sstables_registry, and have
system_keyspace implement the glue logic in
system_keyspace_sstables_manager.

Closes scylladb/scylladb#17868
2024-03-18 20:38:07 +03:00
Raphael S. Carvalho
2c9b13d2d1 compaction: Check for key presence in memtable when calculating max purgeable timestamp
It was observed that some use cases might append old data constantly to
memtable, blocking GC of expired tombstones.

That's because timestamp of memtable is unconditionally used for
calculating max purgeable, even when the memtable doesn't contain the
key of the tombstone we're trying to GC.

The idea is to treat memtable as we treat L0 sstables, i.e. it will
only prevent GC if it contains data that is possibly shadowed by the
expired tombstone (after checking for key presence and timestamp).

Memtable will usually have a small subset of keys in largest tier,
so after this change, a large fraction of keys containing expired
tombstones can be GCed when memtable contains old data.

Fixes #17599.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>

Closes scylladb/scylladb#17835
2024-03-18 13:37:44 +02:00
Lakshmi Narayanan Sreethar
76f0d5e35b reader_permit: store schema_ptr instead of raw schema pointer
Store schema_ptr in reader permit instead of storing a const pointer to
schema to ensure that the schema doesn't get changed elsewhere when the
permit is holding on to it. Also update the constructors and all the
relevant callers to pass down schema_ptr instead of a raw pointer.

Fixes #16180

Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com>

Closes scylladb/scylladb#16658
2024-01-11 08:37:56 +02:00
Botond Dénes
e1b30f50be reader_concurrency_semaphore: add register_metrics constructor parameter
To be used in the next patch to control whether the semaphore registers
and exports metrics or not. We want to move metric registration to the
semaphore but we don't want all semaphores to export metrics. The
decision on whether a semaphore should or shouldn't export metrics
should be made on a case-by-case basis so this new parameter has no
default value (except for the for_tests constructor).
2023-12-13 06:25:45 -05:00
Avi Kivity
814f3eb6b5 sstables: name sstables_manager
Soon, the reader_concurrency_semaphore will require a unique
and meaningful name in order to label its metrics. To prepare
for that, name sstable_manager instances. This will be used
to generate a name for sstable_manager's reader_concurrency_semaphore.
2023-12-13 04:40:33 -05:00
Pavel Emelyanov
210b01a5ce config: Make object storage config updateable_value_source
Now its plain updateable_value, but without the ..._source object the
updateable_value is just a no-op value holder. In order for the
observers to operate there must be the value source, updating it would
update the attached updateable values _and_ notify the observers.

In order for the config to be the u.v._source, config entries should be
comparable to each other, thus the <=> operator for it

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-11-21 16:47:50 +03:00
Pavel Emelyanov
9fd270566a test/sstables: Introduce test_env_compaction_manager::perform_compaction()
Take it from compaction_manager_test::run() which is simplified overwite
of the compaction_manager::perform_compaction().

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-11-13 11:44:51 +03:00
Pavel Emelyanov
aec3fc493a test/utils: Move compaction_manager_test::propagate_replacement()
The purpose of this method is to turn public the private
compaction_manager method of the same name. The caller of this method is
having sstable_test_env at hand with its test_env_compaction_manager, so
the de-private-isation call can be moved.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-11-13 11:44:51 +03:00
Pavel Emelyanov
19b524d0f3 sstable_test_env: Tune up maybe_start_compaction_manager() method
Make it public and add `bool enable` flag so that test cases could start
the compaction manager (to call make_table_for_tests() later) but keep
it disabled for their testing purposes.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-11-01 14:18:17 +03:00
Pavel Emelyanov
4db80ed61f table_for_tests: Use test_env's compaction manager
Now when the sstables::test_env provides the compaction manager
instance, the table_for_tests can start using it and can remove c.m. and
the sidecar task_manager.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-10-31 09:42:19 +03:00
Pavel Emelyanov
2c78b46c78 sstables::test_env: Carry compaction manager on board
Most of the test cases that use sstables::test_env do not mess with
table objects, they only need sstables. However, compaction test cases
do need table objects and, respectively, a compaction manager instance.
Today those test cases create compaction manager instance for each table
they create, but that's a bit heaviweight and doesn't work the way core
code works. This patch prepares the sstables::test_env to provide
compaction manager on demand by starting it as soon as it's asked to
create table object.

For now this compaction manager is unused, but it will be in next patch.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-10-31 09:39:54 +03:00
Pavel Emelyanov
b96d39e63a table_for_tests: Stop table on stop
Next patches will stop using compaction manager from table_for_tests in
favor of external one (spoiler: the one from sstables::test_env), thus
the compaction manager would outsurvive the table_for_tests object and
the table object wrapped by it. So in order for the table_for_tests to
stop correctly, it also needs to stop the wrapped table too.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-10-31 09:38:03 +03:00
Pavel Emelyanov
ac45aae0c4 table_for_tests: Ditch on-board concurrency semaphore
It's not used any longer and can be removed. This make table_for_tests
stopping code a bit shorter as well.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-10-31 09:36:59 +03:00
Pavel Emelyanov
21998296a7 table_for_tests: Require config argument to make table
This is the continuation of the previous patch. Make the caller of
table_for_tests constructor provide the table::config. This makes the
table_for_tests constructor shorter and more self-contained.

Also, the caller now needs to provide the reference to reader
concurrency semaphore, and that's good news, because the only caller for
today is the sstables::test_env that does have it. This makes the
semaphore sitting on table_for_tests itself unused and it will be
removed eventually.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-10-31 09:34:59 +03:00
Pavel Emelyanov
5ab1af3804 table_for_tests: Create table config locally
The table_for_tests keeps a copy of table::config on board. That's not
"idiomatic" as table config is a temporary object that should only be
needed while creating table object. Fortunately, the copy of config on
table_for_tests is no longer needed and it can be made temporary.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-10-31 09:33:29 +03:00
Pavel Emelyanov
76e57cc805 table_for_tests: Get concurrency semaphore from table
Making compaction permit needs a semaphore. Current code gets it from
the table_for_tests, but the very same semaphore reference sits on the
table. So get it from table, as the core code does. This will allow
removing the dedicated semaphore from table_for_tests in the future.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-10-31 09:32:32 +03:00
Pavel Emelyanov
769d9f17eb table_for_tests: Reuse cache tracker from sstables manager
When making table object it needs the cache tracker reference. The
table_for_tests keeps one on board, but the very same object already
sits on the sstables manager which has public getter.

This makes the table_for_tests's cache tracker object not needed.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-10-31 09:29:49 +03:00
Pavel Emelyanov
89e253c77e table_for_tests: Remove unused constructor
No code constructs it with just sstables manager argument.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-10-31 09:29:29 +03:00
Pavel Emelyanov
8d704f2532 sstable_test_env: Coroutinize and move to .cc test_env::stop()
It's going to get larger, so better to move.
Also when coroutinized it's goind to be easier to extend.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-10-31 09:26:58 +03:00
Kefu Chai
af8bc8ba63 sstable: switch to uuid identifier for naming S3 sstable objects
before this change, we create a new UUID for a new sstable managed
by the s3_storage, and we use the string representation of UUID
defined by RFC4122 like "0aa490de-7a85-46e2-8f90-38b8f496d53b" for
naming the objects stored on s3_storage. but this representation is
not what we are using for storing sstables on local filesystem when
the option of "uuid_sstable_identifiers_enabled" is enabled. instead,
we are using a base36-based representation which is shorter.

to be consistent with the naming of the sstables created for local
filesystem, and more importantly, to simplify the interaction between
the local copy of sstables and those stored on object storage, we should
use the same string representation of the sstable identifier.

so, in this change:

1. instead of creating a new UUID, just reuse the generation of the
   sstable for the object's key.
2. do not store the uuid in the sstable_registry system table. As
   we already have the generation of the sstable for the same purpose.
3. switch the sstable identifier representation from the one defined
   by the RFC4122 (implemented by fmt::formatter<utils::UUID>) to the
   base36-based one (implemented by
   fmt::formatter<sstables::generation_type>)
4. enable the `uuid_sstable_identifers` cluster feature if it is
   enabled in the `test_env_config`, so that it the sstable manager
   can enable the uuid-based uuid when creating a new uuid for
   sstable.
5. throw if the generation of sstable is not UUID-based when
   accessing / manipulating an sstable with S3 storage backend. as
   the S3 storage backend now relies on this option. as, otherwise
   we'd have sstables with key like s3://bucket/number/basename, which
   is just unable to serve as a unique id for sstable if the bucket is
   shared across multiple tables.

Fixes #14175
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2023-10-23 10:08:22 +08:00
Kefu Chai
50c8619ed9 test: enable test to set uuid_sstable_identifiers
some of the tests are still relying on the integer-based sstable
identifier, so let's add a method to test_env, so that the tests
relying on this can opt-out. we will change the default setting
of sstables::test_env to use uuid-base sstable identifier in the
next commit. this change does not change the existing behavior.
it just adds a new knob to test_env_config. and let the tests
relying on this to customize the test_env_config to disable
use_uuid.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2023-10-07 18:56:47 +08:00
Kefu Chai
ac3406e537 utils/s3/creds: rename aws_config member variables
- s/key/access_key_id/
- s/secret/secret_access_key/
- s/token/session_token/

so they are more aligned with the AWS document.
for instance, in
https://docs.aws.amazon.com/AmazonS3/latest/userguide/RESTAuthentication.html#ConstructingTheAuthenticationHeader
AWSAccessKeyId is used in the "Authorization" header.

this would help with the readability and maintainability.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2023-09-23 14:28:07 +08:00
Kefu Chai
c364efb998 utils/s3: auth using AWS_SESSION_TOKEN
when accessing AWS resources, uses are allowed to long-term security
credentials, they can also the temporary credentials. but if the latter
are used, we have to pass a session token along with the keys.
see also https://docs.aws.amazon.com/IAM/latest/UserGuide/id_credentials_temp_use-resources.html
so, if we want to programatically get authenticated, we need to
set the "x-amz-security-token" header,
see
https://docs.aws.amazon.com/AmazonS3/latest/userguide/RESTAuthentication.html#UsingTemporarySecurityCredentials

so, in this change, we

1. add another member named `token` in `s3::endpoint_config::aws_config`
   for storing "AWS_SESSION_TOKEN".
2. populate the setting from "object_storage.yaml" and
  "$AWS_SESSION_TOKEN" environment variable.
3. set "x-amz-security-token" header if
   `s3::endpoint_config::aws_config::token` is not empty.

this should allow us to test s3 client and s3 object store backend
with S3 bucket, with the temporary credentials.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#15486
2023-09-21 13:26:11 +03:00
Petr Gusev
beb29f094b system_keyspace: drop load phases
We want to switch system.scylla_local table to the
schema commitlog, but load phases hamper here - schema
commitlog is initialized after phase1,
so a table which is using it should be moved to phase2,
but system.scylla_local contains features, and we need
them before  schema commitlog initialization for
SCHEMA_COMMITLOG feature.

In this commit we are taking a different approach to
loading system tables. First, we load them all in
one pass in 'readonly' mode. In this mode, the table
cannot be written to and has not yet been assigned
a commit log. To achieve this we've added _readonly bool field
to the table class, it's initialized to true in table's
constructor. In addition, we changed the table constructor
to always assign nullptr to commitlog, and we trigger
an internal error if table.commitlog() property is accessed
while the table is in readonly mode. Then, after
triggering on_system_tables_loaded notifications on
feature_service and sstable_format_selector, we call
system_keyspace::mark_writable and eventually
table::mark_ready_for_writes which selects the
proper commitlog and marks the table as writable.

In sstable_compaction_test we drop several
mark_ready_for_writes calls since they are redundant,
the table has already been made writable in
env.make_table_for_tests call.

The table::commitlog function either returns the current
commitlog or causes an error if the table is readonly. This
didn't work for virtual tables, since they never called
mark_ready_for_writes. In this commit we add this
call to initialize_virtual_tables.
2023-09-13 23:17:20 +04:00
Petr Gusev
b90011294d config.cc: drop db::config::host_id
In this refactoring commit we remove the db::config::host_id
field, as it's hacky and duplicates token_metadata::get_my_id.

Some tests want specific host_id, we add it to cql_test_config
and use in cql_test_env.

We can't pass host_id to sstables_manager by value since it's
initialized in database constructor and host_id is not loaded yet.
We also prefer not to make a dependency on shared_token_metadata
since in this case we would have to create artificial
shared_token_metadata in many tools and tests where sstables_manager
is used. So we pass a function that returns host_id to
sstables_manager constructor.
2023-09-13 23:00:15 +04:00
Pavel Emelyanov
1d00cc5baa test/s3: Run tests over non-anonymous bucket
Currently minio applies anonymous public policy for the test bucket and
all tests just use unsigned S3 requests. This patch generates a policy
for the temporary minio user and removes the anon public one. All tests
are updated respectively to use the provided key:secret pair.

The use-https bit is off by default as minio still starts with plain
http. That's OK for now, all tests are local and have no secret data
anyway

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-09-07 11:16:13 +03:00
Pavel Emelyanov
e8e8539c7c code: Rename S3_PUBLIC_BUCKET_FOR_TEST
The bucket is going to stop being public, rename the env variable in
advance to make the essential patch smaller

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-09-07 10:25:53 +03:00
Raphael S. Carvalho
b578d6643f Kill scylla option to configure number of compaction groups
The option was introduced to bootstrap the project. It's still
useful for testing, but that translates into maintaining an
additional option and code that will not be really used
outside of testing. A possible option is to later map the
option in boost tests to initial_tablets, which may yield
the same effect for testing.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
2023-08-16 18:23:53 -03:00
Raphael S. Carvalho
2590eec352 replica: Generate group_id for compaction_group on demand
There are a few good reasons for this change.
1) compaction_group doesn't have to be aware of # of groups
2) thinking forward to dynamic tablets, # of groups cannot be
statically embedded in group id, otherwise it gets stale.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
2023-08-15 09:04:05 -03:00
Benny Halevy
bb59687116 table: signal compaction_manager when staging sstables become eligible for cleanup
perform_cleanup may be waiting for those sstables
to become eligible for cleanup so signal it
when table::move_sstables_from_staging detects an
sstable that requires cleanup.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2023-05-17 11:33:22 +03:00
Avi Kivity
31e820e5a1 Merge 'Allow tombstone GC in compaction to be disabled on user request' from Raphael "Raph" Carvalho
Adding new APIs /column_family/tombstone_gc and /storage_service/tombstone_gc, that will allow for disabling tombstone garbage collection (GC) in compaction.

Mimicks existing APIs /column_family/autocompaction and /storage_service/autocompaction.

column_family variant must specify a single table only, following existing convention.

whereas the storage_service one can specify an entire keyspace, or a subset of a tables in a keyspace.

column_family API usage
-----

```
    The table name must be in keyspace:name format

    Get status:
    curl -s -X GET "http://127.0.0.1:10000/column_family/tombstone_gc/ks:cf"

    Enable GC
    curl -s -X POST "http://127.0.0.1:10000/column_family/tombstone_gc/ks:cf"

    Disable GC
    curl -s -X DELETE "http://127.0.0.1:10000/column_family/tombstone_gc/ks:cf"
```

storage_service API usage
-----

```
    Tables can be specified using a comma-separated list.

    Enable GC on keyspace
    curl -s -X POST "http://127.0.0.1:10000/storage_service/tombstone_gc/ks"

    Disable GC on keyspace
    curl -s -X DELETE "http://127.0.0.1:10000/storage_service/tombstone_gc/ks"

    Enable GC on a subset of tables
    curl -s -X POST
    "http://127.0.0.1:10000/storage_service/tombstone_gc/ks?cf=table1,table2"
```

Closes #13793

* github.com:scylladb/scylladb:
  test: Test new API for disabling tombstone GC
  test: rest_api: extract common testing code into generic functions
  Add API to disable tombstone GC in compaction
  api: storage_service: restore indentation
  api: storage_service: extract code to set attribute for a set of tables
  tests: Test new option for disabling tombstone GC in compaction
  compaction_strategy: bypass tombstone compaction if tombstone GC is disabled
  table: Allow tombstone GC in compaction to be disabled on user request
2023-05-14 14:16:16 +03:00
Raphael S. Carvalho
6c32148751 tests: Test new option for disabling tombstone GC in compaction
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
2023-05-12 10:14:28 -03:00
Raphael S. Carvalho
3b28c26c77 table: Allow tombstone GC in compaction to be disabled on user request
If tombstone GC was disabled, compaction will ensure that fully expired
sstables won't be bypassed and that no expired tombstones will be
purged. Changing the value takes immediate effect even on ongoing
compactions.

Not wired into an API yet.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
2023-05-12 10:14:28 -03:00
Pavel Emelyanov
a59096aa70 sstables, database: Move object storage config maintenance onto storage_manager
Right now the map<endpoint, config> sits on the sstables manager and its
update is governed by database (because it's peering and can kick other
shards to update it as well).

Having the sharded<storage_manager> at hand lets freeing database from
the need to update configs and keeps sstables_manager a bit smaller.
Also this will allow keeping s3 clients shared between sstables via this
map by next patch.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-05-11 19:39:00 +03:00
Pavel Emelyanov
2153751d45 sstables: Introduce sharded<storage_manager>
The manager in question keeps track of whatever sstables_manager needs
to work with the storage (spoiler: only S3 one). It's main-local sharded
peering service, so that container() call can be used by next patches.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-05-11 19:36:01 +03:00
Pavel Emelyanov
3bec5ea2ce s3/client: Keep server port on config
Currently the code temporarily assumes that the endpoint port is 9000.
This is what tests' local minio is started with. This patch keeps the
port number on endpoint config and makes test get the port number from
minio starting code via environment.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-05-03 20:19:43 +03:00
Pavel Emelyanov
2f6aa5b52e code: Introduce conf/object_storage.yaml configuration file
In order to access real S3 bucket, the client should use signed requests
over https. Partially this is due to security considerations, partially
this is unavoidable, because multipart-uploading is banned for unsigned
requests on the S3. Also, signed requests over plain http require
signing the payload as well, which is a bit troublesome, so it's better
to stick to secure https and keep payload unsigned.

To prepare signed requests the code needs to know three things:
- aws key
- aws secret
- aws region name

The latter could be derived from the endpoint URL, but it's simpler to
configure it explicitly, all the more so there's an option to use S3
URLs without region name in them we could want to use some time.

To keep the described configuration the proposed place is the
object_storage.yaml file with the format

endpoints:
  - name: a.b.c
    port: 443
    aws_key: 12345
    aws_secret: abcdefghijklmnop
    ...

When loaded, the map gets into db::config and later will be propagated
down to sstables code (see next patch).

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-05-03 20:19:15 +03:00
Pavel Emelyanov
f7df238545 test: Propagate storage options to table_for_test
Teach table_for_tests use any storage options, not just local one. For
now the only user that passes non-local options is sstables::test_env.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-05-02 11:30:03 +03:00
Pavel Emelyanov
fa1de16f30 test: Add support for s3 storage_options in config
When the sstable test case wants to run over S3 storage it needs to
specify that in test config by providing the S3 storage options. So
first thing this patch adds is the helper that makes these options based
on the env left by minio launcher from test.py.

Next, in order to make sstables_manager work with S3 it needs the
plugged system keyspace which, in turn, needs query processor, proxy,
database, etc. All this stuff lives in cql_test_env, so the test case
running with S3 options will run in a sstables::test_env nested inside
cql_test_env. The latter would also need to plug its system keyspace to
the former's sstables manager and turn the experimental feature ON.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-05-02 11:30:03 +03:00
Pavel Emelyanov
1e03733e8c test: Outline sstables::test_env::do_with_async()
It's growing larger, better to keep it in .cc file

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-05-02 11:15:45 +03:00
Pavel Emelyanov
f223f5357d test: Keep storage options on sstable_test_env config
So that it could be set to s3 by the test case on demand. Default is
local storage which uses env's tempdir or explicit path argument.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-05-02 11:15:45 +03:00
Botond Dénes
022465d673 Merge 'Tone down offstrategy log message' from Benny Halevy
In many cases we trigger offstrategy compaction opportunistically
also when there's nothing to do.  In this case we still print
to the log lots of info-level message and call
`run_offstrategy_compaction` that wastes more cpu cycles
on learning that it has nothing to do.

This change bails out early if the maintenance set is empty
and prints a "Skipping off-strategy compaction" message in debug
level instead.

Fixes #13466

Also, add an group_id class and return it from compaction_group and table_state.
Use that to identify the compaction_group / table_state by "ks_name.cf_name compaction_group=idx/total" in log messages.

Fixes #13467

Closes #13520

* github.com:scylladb/scylladb:
  compaction_manager: print compaction_group id
  compaction_group, table_state: add group_id member
  compaction_manager: offstrategy compaction: skip compaction if no candidates are found
2023-05-02 08:05:18 +03:00