Make it public and add `bool enable` flag so that test cases could start
the compaction manager (to call make_table_for_tests() later) but keep
it disabled for their testing purposes.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Now when the sstables::test_env provides the compaction manager
instance, the table_for_tests can start using it and can remove c.m. and
the sidecar task_manager.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Most of the test cases that use sstables::test_env do not mess with
table objects, they only need sstables. However, compaction test cases
do need table objects and, respectively, a compaction manager instance.
Today those test cases create compaction manager instance for each table
they create, but that's a bit heaviweight and doesn't work the way core
code works. This patch prepares the sstables::test_env to provide
compaction manager on demand by starting it as soon as it's asked to
create table object.
For now this compaction manager is unused, but it will be in next patch.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Next patches will stop using compaction manager from table_for_tests in
favor of external one (spoiler: the one from sstables::test_env), thus
the compaction manager would outsurvive the table_for_tests object and
the table object wrapped by it. So in order for the table_for_tests to
stop correctly, it also needs to stop the wrapped table too.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
It's not used any longer and can be removed. This make table_for_tests
stopping code a bit shorter as well.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
This is the continuation of the previous patch. Make the caller of
table_for_tests constructor provide the table::config. This makes the
table_for_tests constructor shorter and more self-contained.
Also, the caller now needs to provide the reference to reader
concurrency semaphore, and that's good news, because the only caller for
today is the sstables::test_env that does have it. This makes the
semaphore sitting on table_for_tests itself unused and it will be
removed eventually.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
The table_for_tests keeps a copy of table::config on board. That's not
"idiomatic" as table config is a temporary object that should only be
needed while creating table object. Fortunately, the copy of config on
table_for_tests is no longer needed and it can be made temporary.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Making compaction permit needs a semaphore. Current code gets it from
the table_for_tests, but the very same semaphore reference sits on the
table. So get it from table, as the core code does. This will allow
removing the dedicated semaphore from table_for_tests in the future.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
When making table object it needs the cache tracker reference. The
table_for_tests keeps one on board, but the very same object already
sits on the sstables manager which has public getter.
This makes the table_for_tests's cache tracker object not needed.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
It's going to get larger, so better to move.
Also when coroutinized it's goind to be easier to extend.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
before this change, we create a new UUID for a new sstable managed
by the s3_storage, and we use the string representation of UUID
defined by RFC4122 like "0aa490de-7a85-46e2-8f90-38b8f496d53b" for
naming the objects stored on s3_storage. but this representation is
not what we are using for storing sstables on local filesystem when
the option of "uuid_sstable_identifiers_enabled" is enabled. instead,
we are using a base36-based representation which is shorter.
to be consistent with the naming of the sstables created for local
filesystem, and more importantly, to simplify the interaction between
the local copy of sstables and those stored on object storage, we should
use the same string representation of the sstable identifier.
so, in this change:
1. instead of creating a new UUID, just reuse the generation of the
sstable for the object's key.
2. do not store the uuid in the sstable_registry system table. As
we already have the generation of the sstable for the same purpose.
3. switch the sstable identifier representation from the one defined
by the RFC4122 (implemented by fmt::formatter<utils::UUID>) to the
base36-based one (implemented by
fmt::formatter<sstables::generation_type>)
4. enable the `uuid_sstable_identifers` cluster feature if it is
enabled in the `test_env_config`, so that it the sstable manager
can enable the uuid-based uuid when creating a new uuid for
sstable.
5. throw if the generation of sstable is not UUID-based when
accessing / manipulating an sstable with S3 storage backend. as
the S3 storage backend now relies on this option. as, otherwise
we'd have sstables with key like s3://bucket/number/basename, which
is just unable to serve as a unique id for sstable if the bucket is
shared across multiple tables.
Fixes#14175
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
some of the tests are still relying on the integer-based sstable
identifier, so let's add a method to test_env, so that the tests
relying on this can opt-out. we will change the default setting
of sstables::test_env to use uuid-base sstable identifier in the
next commit. this change does not change the existing behavior.
it just adds a new knob to test_env_config. and let the tests
relying on this to customize the test_env_config to disable
use_uuid.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
when accessing AWS resources, uses are allowed to long-term security
credentials, they can also the temporary credentials. but if the latter
are used, we have to pass a session token along with the keys.
see also https://docs.aws.amazon.com/IAM/latest/UserGuide/id_credentials_temp_use-resources.html
so, if we want to programatically get authenticated, we need to
set the "x-amz-security-token" header,
see
https://docs.aws.amazon.com/AmazonS3/latest/userguide/RESTAuthentication.html#UsingTemporarySecurityCredentials
so, in this change, we
1. add another member named `token` in `s3::endpoint_config::aws_config`
for storing "AWS_SESSION_TOKEN".
2. populate the setting from "object_storage.yaml" and
"$AWS_SESSION_TOKEN" environment variable.
3. set "x-amz-security-token" header if
`s3::endpoint_config::aws_config::token` is not empty.
this should allow us to test s3 client and s3 object store backend
with S3 bucket, with the temporary credentials.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closesscylladb/scylladb#15486
We want to switch system.scylla_local table to the
schema commitlog, but load phases hamper here - schema
commitlog is initialized after phase1,
so a table which is using it should be moved to phase2,
but system.scylla_local contains features, and we need
them before schema commitlog initialization for
SCHEMA_COMMITLOG feature.
In this commit we are taking a different approach to
loading system tables. First, we load them all in
one pass in 'readonly' mode. In this mode, the table
cannot be written to and has not yet been assigned
a commit log. To achieve this we've added _readonly bool field
to the table class, it's initialized to true in table's
constructor. In addition, we changed the table constructor
to always assign nullptr to commitlog, and we trigger
an internal error if table.commitlog() property is accessed
while the table is in readonly mode. Then, after
triggering on_system_tables_loaded notifications on
feature_service and sstable_format_selector, we call
system_keyspace::mark_writable and eventually
table::mark_ready_for_writes which selects the
proper commitlog and marks the table as writable.
In sstable_compaction_test we drop several
mark_ready_for_writes calls since they are redundant,
the table has already been made writable in
env.make_table_for_tests call.
The table::commitlog function either returns the current
commitlog or causes an error if the table is readonly. This
didn't work for virtual tables, since they never called
mark_ready_for_writes. In this commit we add this
call to initialize_virtual_tables.
In this refactoring commit we remove the db::config::host_id
field, as it's hacky and duplicates token_metadata::get_my_id.
Some tests want specific host_id, we add it to cql_test_config
and use in cql_test_env.
We can't pass host_id to sstables_manager by value since it's
initialized in database constructor and host_id is not loaded yet.
We also prefer not to make a dependency on shared_token_metadata
since in this case we would have to create artificial
shared_token_metadata in many tools and tests where sstables_manager
is used. So we pass a function that returns host_id to
sstables_manager constructor.
Currently minio applies anonymous public policy for the test bucket and
all tests just use unsigned S3 requests. This patch generates a policy
for the temporary minio user and removes the anon public one. All tests
are updated respectively to use the provided key:secret pair.
The use-https bit is off by default as minio still starts with plain
http. That's OK for now, all tests are local and have no secret data
anyway
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
The bucket is going to stop being public, rename the env variable in
advance to make the essential patch smaller
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
The option was introduced to bootstrap the project. It's still
useful for testing, but that translates into maintaining an
additional option and code that will not be really used
outside of testing. A possible option is to later map the
option in boost tests to initial_tablets, which may yield
the same effect for testing.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
There are a few good reasons for this change.
1) compaction_group doesn't have to be aware of # of groups
2) thinking forward to dynamic tablets, # of groups cannot be
statically embedded in group id, otherwise it gets stale.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
perform_cleanup may be waiting for those sstables
to become eligible for cleanup so signal it
when table::move_sstables_from_staging detects an
sstable that requires cleanup.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Adding new APIs /column_family/tombstone_gc and /storage_service/tombstone_gc, that will allow for disabling tombstone garbage collection (GC) in compaction.
Mimicks existing APIs /column_family/autocompaction and /storage_service/autocompaction.
column_family variant must specify a single table only, following existing convention.
whereas the storage_service one can specify an entire keyspace, or a subset of a tables in a keyspace.
column_family API usage
-----
```
The table name must be in keyspace:name format
Get status:
curl -s -X GET "http://127.0.0.1:10000/column_family/tombstone_gc/ks:cf"
Enable GC
curl -s -X POST "http://127.0.0.1:10000/column_family/tombstone_gc/ks:cf"
Disable GC
curl -s -X DELETE "http://127.0.0.1:10000/column_family/tombstone_gc/ks:cf"
```
storage_service API usage
-----
```
Tables can be specified using a comma-separated list.
Enable GC on keyspace
curl -s -X POST "http://127.0.0.1:10000/storage_service/tombstone_gc/ks"
Disable GC on keyspace
curl -s -X DELETE "http://127.0.0.1:10000/storage_service/tombstone_gc/ks"
Enable GC on a subset of tables
curl -s -X POST
"http://127.0.0.1:10000/storage_service/tombstone_gc/ks?cf=table1,table2"
```
Closes#13793
* github.com:scylladb/scylladb:
test: Test new API for disabling tombstone GC
test: rest_api: extract common testing code into generic functions
Add API to disable tombstone GC in compaction
api: storage_service: restore indentation
api: storage_service: extract code to set attribute for a set of tables
tests: Test new option for disabling tombstone GC in compaction
compaction_strategy: bypass tombstone compaction if tombstone GC is disabled
table: Allow tombstone GC in compaction to be disabled on user request
If tombstone GC was disabled, compaction will ensure that fully expired
sstables won't be bypassed and that no expired tombstones will be
purged. Changing the value takes immediate effect even on ongoing
compactions.
Not wired into an API yet.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Right now the map<endpoint, config> sits on the sstables manager and its
update is governed by database (because it's peering and can kick other
shards to update it as well).
Having the sharded<storage_manager> at hand lets freeing database from
the need to update configs and keeps sstables_manager a bit smaller.
Also this will allow keeping s3 clients shared between sstables via this
map by next patch.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
The manager in question keeps track of whatever sstables_manager needs
to work with the storage (spoiler: only S3 one). It's main-local sharded
peering service, so that container() call can be used by next patches.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Currently the code temporarily assumes that the endpoint port is 9000.
This is what tests' local minio is started with. This patch keeps the
port number on endpoint config and makes test get the port number from
minio starting code via environment.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
In order to access real S3 bucket, the client should use signed requests
over https. Partially this is due to security considerations, partially
this is unavoidable, because multipart-uploading is banned for unsigned
requests on the S3. Also, signed requests over plain http require
signing the payload as well, which is a bit troublesome, so it's better
to stick to secure https and keep payload unsigned.
To prepare signed requests the code needs to know three things:
- aws key
- aws secret
- aws region name
The latter could be derived from the endpoint URL, but it's simpler to
configure it explicitly, all the more so there's an option to use S3
URLs without region name in them we could want to use some time.
To keep the described configuration the proposed place is the
object_storage.yaml file with the format
endpoints:
- name: a.b.c
port: 443
aws_key: 12345
aws_secret: abcdefghijklmnop
...
When loaded, the map gets into db::config and later will be propagated
down to sstables code (see next patch).
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Teach table_for_tests use any storage options, not just local one. For
now the only user that passes non-local options is sstables::test_env.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
When the sstable test case wants to run over S3 storage it needs to
specify that in test config by providing the S3 storage options. So
first thing this patch adds is the helper that makes these options based
on the env left by minio launcher from test.py.
Next, in order to make sstables_manager work with S3 it needs the
plugged system keyspace which, in turn, needs query processor, proxy,
database, etc. All this stuff lives in cql_test_env, so the test case
running with S3 options will run in a sstables::test_env nested inside
cql_test_env. The latter would also need to plug its system keyspace to
the former's sstables manager and turn the experimental feature ON.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
So that it could be set to s3 by the test case on demand. Default is
local storage which uses env's tempdir or explicit path argument.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
In many cases we trigger offstrategy compaction opportunistically
also when there's nothing to do. In this case we still print
to the log lots of info-level message and call
`run_offstrategy_compaction` that wastes more cpu cycles
on learning that it has nothing to do.
This change bails out early if the maintenance set is empty
and prints a "Skipping off-strategy compaction" message in debug
level instead.
Fixes#13466
Also, add an group_id class and return it from compaction_group and table_state.
Use that to identify the compaction_group / table_state by "ks_name.cf_name compaction_group=idx/total" in log messages.
Fixes#13467Closes#13520
* github.com:scylladb/scylladb:
compaction_manager: print compaction_group id
compaction_group, table_state: add group_id member
compaction_manager: offstrategy compaction: skip compaction if no candidates are found
Will be used by tablet-based replication strategies, for which
effective replication map is different per table.
Also, this patch adapts existing users of effective replication map to
use the per-table effective replication map.
For simplicity, every table has an effective replication map, even if
the erm is per keyspace. This way the client code can be uniform and
doesn't have to check whether replication strategy is per table.
Not all users of per-keyspace get_effective_replication_map() are
adapted yet to work per-table. Those algorithms will throw an
exception when invoked on a keyspace which uses per-table replication
strategy.
That will allow compaction_strategy to access the compaction group state
through compaction::table_state, which is the interface at which replica
talks to the compaction layer.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
And propagate it down to where it is created. This will be used to add
trace points for semaphore related events, but this will come in the
next patches.
This series cleans up unit test in preparation for PR #12994.
Helpers are added (or reused) to not rely on specific sstable generation numbers where possible (other than loading reference sstables that are committed to the repo with given generation numbers), and to generate the sstables for tests easily, taking advantage of generation management in `sstable_test_env`, `table_for_tests`, or `replica::table` itself.
Closes#13242
* github.com:scylladb/scylladb:
test: add verify_mutation helpers.
test: add make_sstable_containing memtable
test: table_for_tests: add make_sstable function
test: sstable_test_env: add make_sst_factory methods
test: sstable_compaction_test: do not rely on specific generations
tests: use make_sstable defaults as much as possible
test: sstable_test_env: add make_table_for_tests
test: sstable_datafile_test: do not rely on sepecific sstable generations
test: sstable_test_env: add reusable_sst(shared_sstable)
sstable: expose get_storage function
test: mutation_reader_test: create_sstable: do not rely on specific generations
test: mutation_reader_test: do_test_clustering_order_merger_sstable_set: rely on test_envsstable generation
test: mutation_reader_test: combined_mutation_reader_test: define a local sst_factory function
test: mutation_reader_test: do not use tmpdir
test: use big format by default
test: sstable_compaction_test: use highest sstable version by default
test: test_env: make_db_config: set cfg host_id
test: sstable_datafile_test: fixup indentation
test: sstable_datafile_test: various tests: do_with_async
test: sstable_3_x_test: validate_read, sstable_assertions: get shared_sstable
test: sstable_3_x_test: compare_sstables: get shared_sstable
test: sstable_3_x_test: write_sstables: return shared_sstable
test: sstable_3_x_test: write, compare, validate_sstables: use env.tempdir
test: sstable_3_x_test: compacted_sstable_reader: do not reopen compacted_sst
test: lib: test_services: delete now unused stop_and_keep_alive
test: sstable_compaction_test: use deferred_stop to stop table_for_tests
test: sstable_compaction_test: compound_sstable_set_incremental_selector_test: do_with_async
test: sstable_compaction_test: sstable_needs_cleanup_test: do_with_async
test: sstable_compaction_test: leveled_05: fixup indentation
test: sstable_compaction_test: leveled_05: do_with_async
test: sstable_compaction_test: compact_02: do_with_async
test: sstable_compaction_test: compact_sstables: simplify variable allocation
test: sstable_compaction_test: compact_sstables: reindent
test: sstable_compaction_test: compact_sstables: use thread
test: sstable_compaction_test: sstable_rewrite: simplify variable allocation
test: sstable_compaction_test: sstable_rewrite: fixup indentation
test: sstable_compaction_test: sstable_rewrite: do_with_async
test: sstable_compaction_test: compact: fixup indentation
test: sstable_compaction_test: compact: complete conversion to async thread
test: sstable_compaction_test: compaction_manager_basic_test: rename generations to idx
Wrap table_for_tests ctor to pass the env sstables_manager
as well as the temporary directory path, as this is the
most common use case, and in preparation for adding
a make_sstable method in table_for_tests.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Tables need to know which storage their sstables need to be located at,
so class table needs to have itw reference of the storage options. The
thing can be inherited from the keyspace metadata.
Tests sometimes create table without keyspace at hand. For those use
default-initialized storage options (which is local storage).
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
We currently have two method families to generate partition keys:
* make_keys() in test/lib/simple_schema.hh
* token_generation_for_shard() in test/lib/sstable_utils.hh
Both work only for schemas with a single partition key column of `text` type and both generate keys of fixed size.
This is very restrictive and simplistic. Tests, which wanted anything more complicated than that had to rely on open-coded key generation.
Also, many tests started to rely on the simplistic nature of these keys, in particular two tests started failing because the new key generation method generated keys of varying size:
* sstable_compaction_test.sstable_run_based_compaction_test
* sstable_mutation_test.test_key_count_estimation
These two tests seems to depend on generated keys all being of the same size. This makes some sense in the case of the key count estimation test, but makes no sense at all to me in the case of the sstable run test.
Closes#12657
* github.com:scylladb/scylladb:
test/lib/sstable_utils: remove now unused token_generation_for_shard() and friends
test/lib/simple_schema: remove now unused make_keys() and friends
test: migrate to tests::generate_partition_key[s]()
test/lib/test_services: add table_for_tests::make_default_schema()
test/lib: add key_utils.hh
test/lib/random_schema.hh: value_generator: add min_size_in_bytes
Now any boost test can run with multiple compaction groups by default,
without any change in the boost test cases whatsoever.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Scylla unit tests are limited to command line options defined by
Seastar testing framework.
For extending the set of options, Scylla unit tests can now
include test/lib/scylla_test_case.hh instead of seastar/testing/test_case.hh,
which will "hijack" the entry point and will process the command line
options, then feed the remaining options into seastar testing entry
point.
This is how it looks like when asking for help:
Scylla tests additional options:
--help Produces help message
--x-log2-compaction_groups arg (=0) Controls static number of compaction
groups per table per shard. For X groups,
set the option to log (base 2) of X.
Example: Value of 3 implies 8 groups.
Running 1 test case...
App options:
-h [ --help ] show help message
--help-seastar show help message about seastar options
--help-loggers print a list of logger names and exit
--random-seed arg Random number generator seed
--fail-on-abandoned-failed-futures arg (=1)
Fail the test if there are any
abandoned failed futures
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Creating the default schema, used in the default constructor of
table_for_tests. Allows for getting the default schema without creating
an instance first.
This adds the test/lib's tmpdir instance _and_ configures the
data_file_directories with this path. This makes sure sstables manager
and the rest of the test use the same directory for sstables. For now
it doesn't change anything, but helps next patching.
(A neat side effect of this change is that sstable_test_env is now
configured the same way as cql_test_env does)
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
The goal is to make it possible to make config with custom-initialized
options in test_env::impl's constructor initializer list (next patch).
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>