All sharded<> services are created by cql_test_env on the stack. The
cql_test_env() is then used to keep references on some of them and to
export them to test cases via its methods. Proxy is missing on that
exportable list, but will be needed, so add one.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
The helper is like ::getenv() but checks if the variable exists and
throws descriptive exception. So instead of
fatal error: in "...": std::logic_error: basic_string: construction from null is not valid
one could get something like
fatal error: in "...": std::logic_error: Environment variable ... not set
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
This patch adds storage options lw-ptr to sstables_manager::make_sstable
and makes the storage instance creation depend on the options. For local
it just creates the filesystem storage instance, for S3 -- throws, but
next patch will fix that.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Currently sstable carries a filesystem_storage instance on board. Next
patches will make it possible to use some other storage with different
data accessing methods. This patch makes sstable carry abstract storage
interface and make the existing filesystem_storage implement it.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
As a first step towards using host_id to identify nodes instead of ip addresses
this series introduces a node abstraction, kept in topology,
indexed by both host_id and endpoint.
The revised interface also allows callers to handle cases where nodes
are not found in the topology more gracefully by introducing `find_node()` functions
that look up nodes by host_id or inet_address and also get a `must_exist` parameter
that, if false (the default parameter value) would return nullptr if the node is not found.
If true, `find_node` throws an internal error, since this indicates a violation of an internal
assumption that the node must exist in the topology.
Callers that may handle missing nodes, should use the more permissive flavor
and handle the !find_node() case gracefully.
Closes#11987
* github.com:scylladb/scylladb:
topology: add node state
topology: remove dead code
locator: add class node
topology: rename update_endpoint to add_or_update_endpoint
topology: define get_{rack,datacenter} inline
shared_token_metadata: mutate_token_metadata: replicate to all shards
locator: endpoint_dc_rack: refactor default_location
locator: endpoint_dc_rack: define default operator==
test: storage_proxy_test: provide valid endpoint_dc_rack
This test currently uses `test/lib/test_table.hh` to generate data for its test cases. This data generation facility is used by no other tests. Worse, it is redundant as we already have a random data generator with fixed schema, in `test/lib/mutation_source_test.hh`. So in this series, we migrate the test cases in said test file to random schema and its random data generation facilities. These are used by several other test cases and using random schema allows us to cover a wider (quasi-infinite) number of possibilities.
After migrating all tests away from it, `test/lib/test_table.hh` is removed.
This series also reduces the runtime of `fuzzy_test` drastically. It should now run in a few minutes or even in seconds (depending on the machine).
Fixes: #12944Closes#12574
* github.com:scylladb/scylladb:
test/lib: rm test_table.hh
test/boos/multishard_mutation_query_test: migrate other tests to random schema
test/boost/multishard_mutation_query_test: use ks keyspace
test/boost/multishard_mutation_query_test: improve test pager
test/boost/multishard_mutation_query_test: refactor fuzzy_test
test/boost: add multishard_mutation_query_test more memory
types/user: add get_name() accessor
test/lib/random_schema: add create_with_cql()
test/lib/random_schema: fix udt handling
test/lib/random_schema: type_generator(): also generate frozen types
test/lib/random_schema: type_generator(): make static column generation conditional
test/lib/random_schema: type_generator(): don't generate duration_type for keys
test/lib/random_schema: generate_random_mutations(): add overload with seed
test/lib/random_schema: generate_random_mutations(): respect range tombstone count param
test/lib/random_schema: generate_random_mutations(): add yields
test/lib/random_schema: generate_random_mutations(): fix indentation
test/lib/random_schema: generate_random_mutations(): coroutinize method
test/lib/random_schema: generate_random_mutations(): expand comment
Task manager compaction tasks that cover compaction group
compaction need access to compaction_manager::tasks.
To avoid circular dependency and be able to rely on forward
declaration, task needs to be moved out of compaction manager.
To avoid naming confusion compaction_manager::task is renamed.
Closes#13226
* github.com:scylladb/scylladb:
compaction: use compaction namespace in compaction_manager.cc
compaction: rename compaction::task
compaction: move compaction_manager::task out of compaction manager
compaction: move sstable_task definition to source file
And keep per node information (idx, host_id, endpoint, dc_rack, is_pending)
in node objects, indexed by topology on several indices like:
idx, host_id, endpoint, current/pending, per dc, per dc/rack.
The node index is a shorthand identifier for the node.
node* and index are valid while the respective topology instance is valid.
To be used, the caller must hold on to the topology / token_metadata object
(e.g. via a token_metadata_ptr or effective_replication_map)
Refs #6403
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
topology: add node idx
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
The commitlog api originally implied that the commitlog_directory would contain files from a single commitlog instance. This is checked in segment_manager::list_descriptors, if it encounters a file with an unknown prefix, an exception occurs in `commitlog::descriptor::descriptor`, which is logged with the `WARN` level.
A new schema commitlog was added recently, which shares the filesystem directory with the main commitlog. This causes warnings to be emitted on each boot. This patch solves the warnings problem by moving the schema commitlog to a separate directory. In addition, the user can employ the new `schema_commitlog_directory` parameter to move the schema commitlog to another disk drive.
This is expected to be released in 5.3.
As #13134 (raft tables->schema commitlog) is also scheduled for 5.3, and it already requires a clean rolling restart (no cl segments to replay), we don't need to specifically handle upgrade here.
Fixes: #11867Closes#13263
* github.com:scylladb/scylladb:
commitlog: use separate directory for schema commitlog
schema commitlog: fix commitlog_total_space_in_mb initialization
The commitlog api originally implied that
the commitlog_directory would contain files
from a single commitlog instance. This is
checked in segment_manager::list_descriptors,
if it encounters a file with an unknown
prefix, an exception occurs in
commitlog::descriptor::descriptor, which is
logged with the WARN level.
A new schema commitlog was added recently,
which shares the filesystem directory with
the main commitlog. This causes warnings
to be emitted on each boot. This patch
solves the warnings problem by moving
the schema commitlog to a separate directory.
In addition, the user can employ the new
schema_commitlog_directory parameter to move
the schema commitlog to another disk drive.
By default, the schema commitlog directory is
nested in the commitlog_directory. This can help
avoid problems during an upgrade if the
commitlog_directory in the custom scylla.yaml
is located on a separate disk partition.
This is expected to be released in 5.3.
As #13134 (raft tables->schema commitlog)
is also scheduled for 5.3, and it already
requires a clean rolling restart (no cl
segments to replay), we don't need to
specifically handle upgrade here.
Fixes: #11867
Preparing for #10459, this series defines sstables::generation_type::int_t
as `int64_t` at the moment and use that instead of naked `int64_t` variables
so it can be changed in the future to hold e.g. a `std::variant<int64_t, sstables::generation_id>`.
sstables::new_generation was defined to generation new, unique generations.
Currently it is based on incrementing a counter, but it can be extended in the future
to manufacture UUIDs.
The unit tests are cleaned up in this series to minimize their dependency on numeric generations.
Basically, they should be used for loading sstables with hard coded generation numbers stored under `test/resource/sstables`.
For all the rest, the tests should use existing and mechanisms introduced in this series such as generation_factory, sst_factory and smart make_sstable methods in sstable_test_env and table_for_tests to generate new sstables with a unique generation, and use the abstract sst->generation() method to get their generation if needed, without resorting the the actual value it may hold.
Closes#12994
* github.com:scylladb/scylladb:
everywhere: use sstables::generation_type
test: sstable_test_env: use make_new_generation
sstable_directory::components_lister::process: fixup indentation
sstables: make highest_generation_seen return optional generation
replica: table: add make_new_generation function
replica: table: move sstable generation related functions out of line
test: sstables: use generation_type::int_t
sstables: generation_type: define int_t
The wasm engine is moved from replica::database to the query_processor.
The wasm instance cache and compilation thread runner were already there,
but now they're also initialized in the query_processor constructor.
By moving the initialization to the constructor, we can now
be certain that all wasm-related objects (wasm instance cache,
compilation thread runner, and wasm engine, which was already
passed in the constructor) are initialized when we try to use
them because we have to use the query processor to access them
anyway.
The change is also motivated by the fact that we're planning
to take Wasm UDFs out of experimental, after which they should
stop getting special treatment.
Closes#13311
* github.com:scylladb/scylladb:
wasm: move wasm initialization to query_processor constructor
wasm: return wasm instance cache as a reference instead of a pointer
wasm: move wasm engine to query_processor
... and drop usage of global storage proxy from several places of mutate_MV().
This is the last dependency loop around storage proxy left as long as the last user of the global storage proxy. The trouble is that while proxy naturally depends on database, the database SUDDENLY requires proxy to push view updates from the guts of database::do_apply().
Similar loop existed in a form of database -> { large_data_handler, compaction manager } -> system keyspace -> database and it was cut in 917fdb9e53 (Cut database-system_keyspace circular dependency) by introducing a soft dependency link from l. d. handler / compaction manager to system keyspace. The similar solution is proposed here.
The database instance gets a soft dependency (shared_ptr) to view_update_generator instance. On start the link is nullptr and pushing view updates is not possible until view_updates_generator starts and plugs itself to the database. The plugging happens naturally, because v.u.generator needs proxy as explicit dependency and, thus, can reach database via proxy. This (seems to) works because tables that need view updates don't start being mutated until late enough, as late as v.u.generator starts.
As a nice side effect this allows removing a bunch of global storage proxy usages from mutate_MV() which opens a pretty short way towards de-globalizing proxy (after it only qctx, tracing and schema registry will be left).
Closes#13367
* github.com:scylladb/scylladb:
view: Drop global storage_proxy usage from mutate_MV()
view: Make mutate_MV() method of view_update_generator
table: Carry v.u.generator down to populate_views()
table: Carry v.u.generator down to do_push_view_replica_updates()
view: Keep v.u.generator shared pointer on view_builder::consumer
view: Capture v.u.generator on view_updating_consumer lambda
view: Plug view update generator to database
view: Add view_builder -> view_update_generator dependency
view: Add view_update_generator -> sharded<storage_proxy> dependency
This is important for multiple compaction groups, as they cannot share state that must span a single SSTable set.
The solution is about:
1) Decoupling compaction strategy from its state; making compaction_strategy a pure stateless entity
2) Each compaction group storing its own compaction strategy state
3) Compaction group feeds its state into compaction strategy whenever needed
Closes#13351
* github.com:scylladb/scylladb:
compaction: TWCS: wire up compaction_strategy_state
compaction: LCS: wire up compaction_strategy_state
compaction: Expose compaction_strategy_state through table_state
replica: Add compaction_strategy_state to compaction group
compaction: Introduce compaction_strategy_state
compaction: add table_state param to compaction_strategy::notify_completion()
compaction: LCS: extract state into a separate struct
compaction: TWCS: prepare for stateless strategy
compaction: TWCS: extract state into a separate struct
compaction: add const-qualifier to a few compaction_strategy methods
To avoid confusion with task manager tasks compaction::task is renamed
to compaction::compaction_task_exector. All inheriting classes are
modified similarly.
compaction_manager::task needs to be accessed from task manager compaction
tasks. Thus, compaction_manager::task and all inheriting classes are moved
from compaction manager to compaction namespace.
By moving the initialization to the constructor, we can now
be certain that all wasm-related objects (wasm instance cache,
compilation thread runner, and wasm engine, which was already
passed in the constructor) are initialized when we try to use
them because we have to use the query processor to access them
anyway.
The change is also motivated by the fact that we're planning
to take Wasm UDFs out of experimental, after which they should
stop getting special treatment.
The builder will need generator for view_builder::consumer in one of the
next patches.
The builder is a standalone service that starts one of the latest and no
other services need builder as their dependency.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
The generator will be responsible for spreading view updates with the
help of mutate_MV helper. The latter needs storage proxy to operate, so
the generator gets this dependency in advance.
There's no need to change start/stop order at the moment, generator
already starts after and stops before proxy. Also, services that have
generator as dependency are not required by proxy (even indirectly) so
no circular dependency is produced at this point.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
The wasm engine is used for compiling and executing Wasm UDFs, so
the query_processor is a more appropriate location for it than
replica::database, especially because the wasm instance cache
and the wasm alien thread runner are already there.
This patch also reduces the number of wasm engines to 1, shared by
all shards, as recommended by the wasmtime developers.
That will allow compaction_strategy to access the compaction group state
through compaction::table_state, which is the interface at which replica
talks to the compaction layer.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
We need this so that we can have multi-partition mutations which are applied atomically. If they live on different shards, we can't guarantee atomic write to the commitlog.
Fixes: #12642Closes#13134
* github.com:scylladb/scylladb:
test_raft_upgrade: add a test for schema commit log feature
scylla_cluster.py: add start flag to server_add
ServerInfo: drop host_id
scylla_cluster.py: add config to server_add
scylla_cluster.py: add expected_error to server_start
scylla_cluster.py: ScyllaServer.start, refactor error reporting
scylla_cluster.py: fix ScyllaServer.start, reset cmd if start failed
raft: check if schema commitlog is initialized Refuse to boot if neither the schema commitlog feature nor force_schema_commit_log is set. For the upgrade procedure the user should wait until the schema commitlog feature is enabled before enabling consistent_cluster_management.
raft: move raft initialization after init_system_keyspace
database: rename before_schema_keyspace_init->maybe_init_schema_commitlog
raft: use schema commitlog for raft tables
init_system_keyspace: refactoring towards explicit load phases
* generate lowercase names (upper-case seems to cause problems);
* preserve dependency order between UDTs when dumping them from schema;
* use built-in describe() to dump to CQL string;
* drop single arg dump_udts() overlad, which was not recursive, unlike
the vector variant;
For regular and static columns, to introduce some further randomness.
So far frozen types were generated only for primary key members and
embedded types.
When creating the reader, the lifecycle policy might return one that was saved on the last page and survived in the cache. This reader might have skipped some fast-forwarding ranges while sitting in the cache. To avoid using a reader reading a stale range (from the read's POV), check its read range and fast forward it if necessary.
Fixes: https://github.com/scylladb/scylladb/issues/12916Closes#12932
* github.com:scylladb/scylladb:
readers/multishard: shard_reader: fast-forward created reader to current range
readers/multishard: reader_lifecycle_policy: add get_read_range()
test/boost/multishard_mutation_query_test: paging: handle range becoming wrapping
There was an attempt to cut feature-service -> system-keyspace dependency (#13172) which turned out to require more changes. Here's a preparation squeezing from this future work.
This set
- leaves only batch-enabling API in feature service
- keeps the need for async context in feature service
- narrows down system keyspace features API to only load and store records
- relaxes features updating logic in sys.ks.
- cosmetic
Closes#13264
* github.com:scylladb/scylladb:
feature_service: Indentation fix after previous patch
feature_service: Move async context into enable()
system_keyspace: Refactor local features load/save helpers
feature_service: Mark supported_feature_set() const
feature_service: Remove single feature enabling method
boot: Enable features in batch
gossiper: Enable features in batch
We aim (#12642) to use the schema commit log
for raft tables. Now they are loaded at
the first call to init_system_keyspace in
main.cc, but the schema commitlog is only
initialized shortly before the second
call. This is important, since the schema
commitlog initialization
(database::before_schema_keyspace_init)
needs to access schema commitlog feature,
which is loaded from system.scylla_local
and therefore is only available after the
first init_system_keyspace call.
So the idea is to defer the loading of the raft tables
until the second call to init_system_keyspace,
just as it works for schema tables.
For this we need a tool to mark which tables
should be loaded in the first or second phase.
To do this, in this patch we introduce system_table_load_phase
enum. It's set in the schema_static_props for schema tables.
It replaces the system_keyspace::table_selector in the
signature of init_system_keyspace.
The call site for populate_keyspace in init_system_keyspace
was changed, table_selector.contains_keyspace was replaced with
db.local().has_keyspace. This check prevents calling
populate_keyspace(system_schema) on phase1, but allows for
populate_keyspace(system) on phase2 (to init raft tables).
On this second call some tables from system keyspace
(e.g. system.local) may have already been populated on phase1.
This check protects from double-populating them, since every
populated cf is marked as ready_for_writes.
Use generation_type rather than generation_type::int_t
where possible and removed the deprecated
functions accepting the int_t.i
Ref #10459
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Also, add a bunch of make_sstable variants that get a
generation_type param for this.
With that, the entry points for generation_type::int_t
are deprecated and their users will be converted
in following patches.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Convert all users to use sstables::generation_type::int_t.
Further patches will continue to convert most to
using sstables::generation_type instead so we can
abstract the value type.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
And propagate it down to where it is created. This will be used to add
trace points for semaphore related events, but this will come in the
next patches.
gcc dislikes a member name that matches a type name, as it changes
the type name retroactively. Fix by fully-qualifying the type name,
so it is not changed by the newly-introduced member.
Callers don't need to know that enabling features has this requirement
Indentation is deliberately left broken (until next patch)
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
It's name is too generic despite it's narrow specialization. Also,
there's a version_from_string() method that does the same in a more
convenient way.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
To apply topology_change commands group0_state_machine needs to have an
access to the storage service to support topology changes over raft.
Message-Id: <20230316112801.1004602-10-gleb@scylladb.com>
This series cleans up unit test in preparation for PR #12994.
Helpers are added (or reused) to not rely on specific sstable generation numbers where possible (other than loading reference sstables that are committed to the repo with given generation numbers), and to generate the sstables for tests easily, taking advantage of generation management in `sstable_test_env`, `table_for_tests`, or `replica::table` itself.
Closes#13242
* github.com:scylladb/scylladb:
test: add verify_mutation helpers.
test: add make_sstable_containing memtable
test: table_for_tests: add make_sstable function
test: sstable_test_env: add make_sst_factory methods
test: sstable_compaction_test: do not rely on specific generations
tests: use make_sstable defaults as much as possible
test: sstable_test_env: add make_table_for_tests
test: sstable_datafile_test: do not rely on sepecific sstable generations
test: sstable_test_env: add reusable_sst(shared_sstable)
sstable: expose get_storage function
test: mutation_reader_test: create_sstable: do not rely on specific generations
test: mutation_reader_test: do_test_clustering_order_merger_sstable_set: rely on test_envsstable generation
test: mutation_reader_test: combined_mutation_reader_test: define a local sst_factory function
test: mutation_reader_test: do not use tmpdir
test: use big format by default
test: sstable_compaction_test: use highest sstable version by default
test: test_env: make_db_config: set cfg host_id
test: sstable_datafile_test: fixup indentation
test: sstable_datafile_test: various tests: do_with_async
test: sstable_3_x_test: validate_read, sstable_assertions: get shared_sstable
test: sstable_3_x_test: compare_sstables: get shared_sstable
test: sstable_3_x_test: write_sstables: return shared_sstable
test: sstable_3_x_test: write, compare, validate_sstables: use env.tempdir
test: sstable_3_x_test: compacted_sstable_reader: do not reopen compacted_sst
test: lib: test_services: delete now unused stop_and_keep_alive
test: sstable_compaction_test: use deferred_stop to stop table_for_tests
test: sstable_compaction_test: compound_sstable_set_incremental_selector_test: do_with_async
test: sstable_compaction_test: sstable_needs_cleanup_test: do_with_async
test: sstable_compaction_test: leveled_05: fixup indentation
test: sstable_compaction_test: leveled_05: do_with_async
test: sstable_compaction_test: compact_02: do_with_async
test: sstable_compaction_test: compact_sstables: simplify variable allocation
test: sstable_compaction_test: compact_sstables: reindent
test: sstable_compaction_test: compact_sstables: use thread
test: sstable_compaction_test: sstable_rewrite: simplify variable allocation
test: sstable_compaction_test: sstable_rewrite: fixup indentation
test: sstable_compaction_test: sstable_rewrite: do_with_async
test: sstable_compaction_test: compact: fixup indentation
test: sstable_compaction_test: compact: complete conversion to async thread
test: sstable_compaction_test: compaction_manager_basic_test: rename generations to idx
Some unit tests want to change the sstable::_dir on the fly. However,
the sstable::_dir is going away, so it needs a yet another virtual call
on storage driver.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Closes#13213