this is the first step to the uuid-based generation identifier. the goal is to encapsulate the generation related logic in generator, so its consumers do not have to understand the difference between the int64_t based generation and UUID v1 based generation.
this commit should not change the behavior of existing scylla. it just allows us to derive from `generation_generator` so we can have another generator which generates UUID based generation identifier.
Closes#13073
* github.com:scylladb/scylladb:
replica, test: create generation id using generator
sstables: add generation_generator
test: sstables: use generate_n for generating ids for testing
The only reason why it's there (right next to compaction_fwd.hh) is
because the database::table_truncate_state subclass needs the definition
of compaction_manager::compaction_reenabler subclass.
However, the former sub is not used outside of database.cc and can be
defined in .cc. Keeping it outside of the header allows dropping the
compaction_manager.hh from database.hh thus greatly reducing its fanout
over the code (from ~180 indirect inclusions down to ~20).
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Closes#13622
reuse generation_generator for generating generation identifiers for
less repeatings. also, add allow update generator to update its
lastest known generation id.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
so we don't need to keep a `prev_gen` around, this also prepares for
the coming change to use generation generator.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Preparing for #10459, this series defines sstables::generation_type::int_t
as `int64_t` at the moment and use that instead of naked `int64_t` variables
so it can be changed in the future to hold e.g. a `std::variant<int64_t, sstables::generation_id>`.
sstables::new_generation was defined to generation new, unique generations.
Currently it is based on incrementing a counter, but it can be extended in the future
to manufacture UUIDs.
The unit tests are cleaned up in this series to minimize their dependency on numeric generations.
Basically, they should be used for loading sstables with hard coded generation numbers stored under `test/resource/sstables`.
For all the rest, the tests should use existing and mechanisms introduced in this series such as generation_factory, sst_factory and smart make_sstable methods in sstable_test_env and table_for_tests to generate new sstables with a unique generation, and use the abstract sst->generation() method to get their generation if needed, without resorting the the actual value it may hold.
Closes#12994
* github.com:scylladb/scylladb:
everywhere: use sstables::generation_type
test: sstable_test_env: use make_new_generation
sstable_directory::components_lister::process: fixup indentation
sstables: make highest_generation_seen return optional generation
replica: table: add make_new_generation function
replica: table: move sstable generation related functions out of line
test: sstables: use generation_type::int_t
sstables: generation_type: define int_t
Use generation_type rather than generation_type::int_t
where possible and removed the deprecated
functions accepting the int_t.i
Ref #10459
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Convert all users to use sstables::generation_type::int_t.
Further patches will continue to convert most to
using sstables::generation_type instead so we can
abstract the value type.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
And propagate it down to where it is created. This will be used to add
trace points for semaphore related events, but this will come in the
next patches.
Tests should just generate the highest sstable version
available. There is no need to ontinue testing old versions,
in particular partially supported ones like "la".
Use also the default values for sstable::format_types, buffer_size,
etc. if there's no particular need to override them.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
The replayer code needs system keyspace to fetch truncation records
from, thus it needs this explicit dependency. By the time it runs system
keyspace is fully initialized already
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
This method requires callers to remember that the sstable is the collection of files on a filesystem and to know what exact directory they are all in. That's not going to work for object storage, instead, sstable should be moved between more abstract states.
This PR replaces move_to_new_dir() call with the change_state() one that accepts target sub-directory string and moves files around. Currently supported state changes:
* staging -> normal
* upload -> normal | staging
* any -> quarantine
All are pretty straightforward and move files between table basedir subdirectories with the exception that upload -> quarantine should move into upload/quarantine subdirectory. Another thing to keep in mind, that normal state doesn't have its subdir but maps directory to table's base directory.
Closes#12648
* github.com:scylladb/scylladb:
sstable: Remove explicit quarantization call
test: Move move_to_new_dir() method from sstable class
sstable, dist.-loader: Introduce and use pick_up_from_upload() method
sstables, code: Introduce and use change_state() call
distributed_loader: Let make_sstables_available choose target directory
these warnings are found by Clang-17 after removing
`-Wno-unused-lambda-capture` and '-Wno-unused-variable' from
the list of disabled warnings in `configure.py`.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
The call moves the sstable to the specified state.
The change state is translated into the storage driver state change
which is for todays filesystem storage means moving between directories.
The "normal" state maps to the base dir of the table, there's no
dedicated subdir for this state and this brings some trouble into the
play.
The thing is that in order to check if an sstable is in "normal" state
already its impossible to compare filename of its path to any
pre-defined values, as tables' basdirs are dynamic. To overcome this,
the change-state call checks that the sstable is in one of "known"
sub-states, and assumes that it's in normal state otherwise.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Schema related files are moved there. This excludes schema files that
also interact with mutations, because the mutation module depends on
the schema. Those files will have to go into a separate module.
Closes#12858
Move mutation-related files to a new mutation/ directory. The names
are kept in the global namespace to reduce churn; the names are
unambiguous in any case.
mutation_reader remains in the readers/ module.
mutation_partition_v2.cc was missing from CMakeLists.txt; it's added in this
patch.
This is a step forward towards librarization or modularization of the
source base.
Closes#12788
We currently have two method families to generate partition keys:
* make_keys() in test/lib/simple_schema.hh
* token_generation_for_shard() in test/lib/sstable_utils.hh
Both work only for schemas with a single partition key column of `text` type and both generate keys of fixed size.
This is very restrictive and simplistic. Tests, which wanted anything more complicated than that had to rely on open-coded key generation.
Also, many tests started to rely on the simplistic nature of these keys, in particular two tests started failing because the new key generation method generated keys of varying size:
* sstable_compaction_test.sstable_run_based_compaction_test
* sstable_mutation_test.test_key_count_estimation
These two tests seems to depend on generated keys all being of the same size. This makes some sense in the case of the key count estimation test, but makes no sense at all to me in the case of the sstable run test.
Closes#12657
* github.com:scylladb/scylladb:
test/lib/sstable_utils: remove now unused token_generation_for_shard() and friends
test/lib/simple_schema: remove now unused make_keys() and friends
test: migrate to tests::generate_partition_key[s]()
test/lib/test_services: add table_for_tests::make_default_schema()
test/lib: add key_utils.hh
test/lib/random_schema.hh: value_generator: add min_size_in_bytes
We have enabled the command line options without changing a
single line of code, we only had to replace old include
with scylla_test_case.hh.
Next step is to add x-log-compaction-groups options, which will
determine the number of compaction groups to be used by all
instantiations of replica::table.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Use the newly introduced key generation facilities, instead of the the
old inflexible alternatives and hand-rolled code.
Most of the migrations are mechanic, but there are two tests that
were tricky to migrate:
* sstable_compaction_test.sstable_run_based_compaction_test
* sstable_mutation_test.test_key_count_estimation
These two tests seems to depend on generated keys all being of the same
size. This makes some sense in the case of the key count estimation
test, but makes no sense at all to me in the case of the sstable run
test.
database_test in timing out because it's having to run the tests calling
do_with_cql_env_and_compaction_groups 3x, one for each compaction group
setting. reduce it to 2 settings instead of 3 if running in debug mode.
Refs #12396.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Closes#12421
Extends database_test to run the tests with more than one
compaction group, in addition to a single one (default).
Piggyback on existing tests. Avoids duplication.
Caught a bug when snapshotting, in implementation of
table::can_flush(), showing its usefulness.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
There's a bunch of helpers around XFS-specific temp-dir sitting in
publie sstable part. Drop it altogether, no code needs it for real.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
One case gets full sstable datafile path to get the basename from it.
There's already the basename helper on the class sstable.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
The latter is pretty popular test/lib header that disseminates the
former one over whole lot of unit tests. The former, in turn, naturally
includes sstables.hh thus making tons of unrelated tests depend on
sstables class unused by them.
However, simple removal doesn't work, becase of local_shard_only bool
class definition in sstable_utils.hh used in simple_schema.hh. This
thing, in turn, is used in keys making helpers that don't belong to
sstable utils, so these are moved into simple_schema as well.
When done, this affects the mutation_source_test.hh, which needs the
local_shard_only bool class (and helps spreading the sstables.hh
throughout more unrelated tests) and a bunch of .cc test sources that
used sstable_utils.hh to indirectly include various headers of their
demand.
After patching, sstables.hh touches 2x times less tests. As a side
effect the sstables_manager.hh also becomes 2x times less dependent
on by tests.
Continuation of 9bdea110a6
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Closes#12240
Two places in tests move sstable to quarantine subdir by hand. There's
the class sstable method that does the same, so use it.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
This type is currently an unordered_set, but only consists of at most
two elements. Making it an enum_set renders it into a size_t variable
and better describes the intention.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
The logic to reject explicit snapshot of views/indexes
was improved in aa127a2dbb.
However, we never implemented auto-snapshot of
view/indexes when taking a snapshot of the base table.
This is implemented in this patch.
The implementation is built on top of
ba42852b0e
so it would be hard to backport to 5.1 or earlier
releases.
Fixes#11612
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Define table_id as a distinct utils::tagged_uuid modeled after raft
tagged_id, so it can be differentiated from other uuid-class types,
in particular from table_schema_version.
Fixes#11207
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Pass an optional truncated_at time_point to
truncate_table_on_all_shards instead of the over-complicated
timestamp_func that returns the same time_point on all shards
anyhow, and was only used for coordination across shards.
Since now we synchronize the internal execution phase in
truncate_table_on_all_shards, there is no longer need
for this timestamp_func.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
timestamp_func
Since in the drop_table case we want to discard ALL
sstables in the table, not only those with `max_data_age()`
up until drop started.
Fixes#11232
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Following up on 1c26d49fba,
apply mutations on the correct db shard in all test cases
before we define and use database::truncate_table_on_all_shards.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Currently, all the mutations this test generates are applied on shard 0.
In rare cases, this may lead to the following crash, when the flushed
sstable doesn't contain any key that belongs to the current shard,
as seen in https://jenkins.scylladb.com/job/releng/job/Scylla-CI/1390/artifact/testlog/x86_64/dev/database_test.test_truncate_without_snapshot_during_writes.114.log
```
WARN 2022-07-17 17:41:36,630 [shard 0] sstable - create_sharding_metadata: range=[{-468459073612751032, pk{00046b657930}}, {-468459073612751032, pk{00046b657930}}] has no intersection with shard=0 first_key={key: pk{00046b657930}, token:-468459073612751032} last_key={key: pk{00046b657930}, token:-468459073612751032} ranges_single_shard=[] ranges_all_shards={{1, {[{-468459073612751032, pk{00046b657930}}, {-468459073612751032, pk{00046b657930}}]}}}
ERROR 2022-07-17 17:41:36,630 [shard 0] table - failed to write sstable /jenkins/workspace/releng/Scylla-CI/scylla/testlog/x86_64/dev/scylla-e2b694c7-db4f-4f9d-9940-9c6c21850888/ks/cf-8f74aba005de11ed92fa8661a0ed7890/me-2-big-Data.db: std::runtime_error (Failed to generate sharding metadata for /jenkins/workspace/releng/Scylla-CI/scylla/testlog/x86_64/dev/scylla-e2b694c7-db4f-4f9d-9940-9c6c21850888/ks/cf-8f74aba005de11ed92fa8661a0ed7890/me-2-big-Data.db)
ERROR 2022-07-17 17:41:36,631 [shard 0] table - Memtable flush failed due to: std::runtime_error (Failed to generate sharding metadata for /jenkins/workspace/releng/Scylla-CI/scylla/testlog/x86_64/dev/scylla-e2b694c7-db4f-4f9d-9940-9c6c21850888/ks/cf-8f74aba005de11ed92fa8661a0ed7890/me-2-big-Data.db). Aborting, at 0x329e28e 0x329e780 0x329ea88 0xf5bc69 0xf956b1 0x3196dc4 0x3198037 0x319742a 0x32be2e4 0x32bd8e1 0x32ba01c 0x317f97d /lib64/libpthread.so.0+0x92a4 /lib64/libc.so.6+0x100322
```
Instead, generate random keys and apply them on their
owning shard, and truncate all database shards.
Fixes#11076
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
This work gets us a step closer to compaction groups.
Everything in compaction layer but compaction_manager was converted to table_state.
After this work, we can start implementing compaction groups, as each group will be represented by its own table_state. User-triggered operations that span the entire table, not only a group, can be done by calling the manager operation on behalf of each group and then merging the results, if any.
Closes#11028
* github.com:scylladb/scylla:
compaction: remove forward declaration of replica::table
compaction_manager: make add() and remove() switch to table_state
compaction_manager: make run_custom_job() switch to table_state
compaction_manager: major: switch to table_state
compaction_manager: scrub: switch to table_state
compaction_manager: upgrade: switch to table_state
compaction: table_state: add get_sstables_manager()
compaction_manager: cleanup: switch to table_state
compaction_manager: offstrategy: switch to table_state()
compaction_manager: rewrite_sstables(): switch to table_state
compaction_manager: make run_with_compaction_disabled() switch to table_state
compaction_manager: compaction_reenabler: switch to table_state
compaction_manager: make submit(T) switch to table_state
compaction_manager: task: switch to table_state
compaction: table_state: Add is_auto_compaction_disabled_by_user()
compaction: table_state: Add on_compaction_completion()
compaction: table_state: Add make_sstable()
compaction_manager: make can_proceed switch to table_state
compaction_manager: make stop compaction procedures switch to table_state
compaction_manager: make get_compactions() switch to table_state
compaction_manager: change task::update_history() to use table_state instead
compaction_manager: make can_register_compaction() switch to table_state
compaction_manager: make get_candidates() switch to table_state
compaction_manager: make propagate_replacement() switch to table_state
compaction: Move table::in_strategy_sstables() and switch to table_state
compaction: table_state: Add maintenance sstable set
compaction_manager: make has_table_ongoing_compaction() switch to table_state
compaction_manager: make compaction_disabled() switch to table_state
compaction_manager: switch to table_state for mapping of compaction_state
compaction_manager: move task ctor into source
and use an async thread around `directory_lister`
rather than `lister::scan_dir` to simplify the implementation.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Based on the `clear_snapshot` test.
Test with multiple snapshots and different
combinations of parameters to database::clear_snapshot.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Refactor test_drop_table_with_auto_snapshot out of
drop_table_with_snapshots, adding a auto_snapshot param,
controlling how to configure the cql_test_env db:.config::auto_snapshot,
so we can test both cases - auto_snapshot enabled and disabled.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Instead of just `tmpdir_for_data`, so we can easily set auto_snapshot
for `drop_table_with_snapshots` in the next patch.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
If the table to remove has no snapshots then
completely remove its directory on storage
as the left-over directory slows down operations on the keyspace
and makes searching for live tables harder.
Fixes#10896
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Runs drop_column_family on all database shards.
Will be extended later to consider removing the table directory.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
in_strategy_sstables() doesn't have to be implemented in table, as it's
simply about main set with maintenance and staging files filtered out.
Also, let's make it switch to table_state as part of ongoing work.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>