To be used for creating effective_replication_map
when token_metadata changes, and for updating all
keyspaces with it.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
It will be used further to create shared copies
of effective_replication_map based on replication_strategy
type and config options.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
The db::config reference is available on the database, which
can be obtained from the virtual_table itself. The problem is that
it's a const reference, while system.config will be updateable
and will need a non-const reference.
Adding a non-const get_config() to the database looks wrong. The
database shouldn't be used as a config provider, even a const
one.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
"
table_state is being introduced for compaction subsystem, to remove table dependency
from compaction interface, fix layer violations, and also make unit testing
easier as table_state is an abstraction that can be implemented even with no
actual table backing it.
In this series, compaction strategy interfaces are switching to table_state,
and eventually, we'll make compact_sstables() switch to it too. The idea is
that no compaction code will directly reference a table object, but only work
with the abstraction instead. So compaction subdirectory can stop
including database.hh altogether, which is a great step forward.
"
* 'table_state_v5' of https://github.com/raphaelsc/scylla:
sstable_compaction_test: switch to table_state
compaction: stop including database.hh for compaction_strategy
compaction: switch to table_state in estimated_pending_compactions()
compaction: switch to table_state in compaction_strategy::get_major_compaction_job()
compaction: switch to table_state in compaction_strategy::get_sstables_for_compaction()
DTCS: reduce table dependency for task estimation
LCS: reduce table dependency for task estimation
table: Implement table_state
compaction: make table param of get_fully_expired_sstables() const
compaction_manager: make table param of has_table_ongoing_compaction() const
Introduce table_state
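As a rough illustration of the idea behind table_state, the sketch below shows strategy-side code that depends only on an abstract interface, together with a test double that has no table behind it. All names here (table_state_sketch, fake_table_state_sketch, estimated_pending_sketch) and the size/threshold logic are hypothetical and do not reflect the actual interface introduced by the series.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical, simplified stand-in for the table_state abstraction.
// The real interface exposes different members.
class table_state_sketch {
public:
    virtual ~table_state_sketch() = default;
    // Sizes (in bytes) of the sstables the strategy may consider.
    virtual const std::vector<uint64_t>& sstable_sizes() const = 0;
    // Minimum number of sstables worth compacting together.
    virtual std::size_t min_compaction_threshold() const = 0;
};

// Test double: no table and no database, just canned data.
class fake_table_state_sketch : public table_state_sketch {
    std::vector<uint64_t> _sizes;
    std::size_t _threshold;
public:
    fake_table_state_sketch(std::vector<uint64_t> sizes, std::size_t threshold)
        : _sizes(std::move(sizes)), _threshold(threshold) {}
    const std::vector<uint64_t>& sstable_sizes() const override { return _sizes; }
    std::size_t min_compaction_threshold() const override { return _threshold; }
};

// Strategy-side code sees only the abstraction, never a table object.
inline std::size_t estimated_pending_sketch(const table_state_sketch& ts) {
    const std::size_t threshold = ts.min_compaction_threshold();
    assert(threshold > 0);
    return ts.sstable_sizes().size() / threshold;
}
```

Because the strategy only needs the interface, a unit test can feed it any sstable layout without constructing a database.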
Last method in compaction_strategy using table. From now on,
compaction strategy no longer works directly with table.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
From now on, get_sstables_for_compaction() will use table_state.
With table_state, we avoid layer violations like strategy using
manager and also make testing easier.
Compaction unit tests were temporarily disabled to avoid a giant
commit which is hard to parse.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
flat_reader_assertions::produces_range_tombstone() does not actually
check range tombstones beyond the fact that they are in fact range
tombstones (unless non-empty ck_ranges is passed).
Fixing the immediate problem reveals that:
* The assertion logic is not flexible enough to deal with
creatively-split or creatively-overlapping range tombstones.
* Some existing tests involving range tombstones are in fact wrong:
some assertions may (at least with some readers) refer to wrong
tombstones entirely, while others assert wrong things about right
tombstones.
* Range tombstones in pre-made sstables (such as those read by
sstable_3_x_test) have deletion time drift, and that now has to be
somehow dealt with.
This patch (which is not split into smaller ones because that would
either generate unreasonable amount of work towards ensuring
bisectability or entail "temporarily" disabling problematic tests,
which is cheating) contains the following changes:
* flat_reader_assertions check range tombstones more carefully, by
accumulating both expected and actually-read range tombstones into
lists and comparing those lists when a partition ends (or when the
assertion object is destroyed).
* flat_reader_assertions::may_produce_tombstones() can take
constraining ck_ranges.
* Both flat_reader_assertions and flat_reader_assertions_v2 can be
instructed to ignore tombstone deletion times, to help with tests that
read pre-made sstables.
* Affected tests are changed to reflect reality. Most changes to
tests make sense; the only one I am not completely sure about is in
test_uncompressed_filtering_and_forwarding_range_tombstones_read.
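The accumulate-then-compare approach described above can be sketched as follows. This is a minimal illustration under simplifying assumptions: clustering positions are plain integers, only adjacent pieces with identical timestamps are merged, and all names (rt_sketch, rt_assertions_sketch) are hypothetical rather than the actual flat_reader_assertions code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <tuple>
#include <utility>
#include <vector>

// Hypothetical range tombstone over integer keys [start, end).
struct rt_sketch {
    int start;
    int end;
    int64_t timestamp;
    int64_t deletion_time;
};

class rt_assertions_sketch {
    std::vector<rt_sketch> _expected;
    std::vector<rt_sketch> _actual;
    bool _ignore_deletion_time = false;

    static auto key(const rt_sketch& rt) {
        return std::tie(rt.start, rt.end, rt.timestamp, rt.deletion_time);
    }

    // Sort, optionally blank deletion times, and merge adjacent pieces
    // that carry the same timestamps (a split tombstone).
    static std::vector<rt_sketch> normalize(std::vector<rt_sketch> v, bool ignore_dt) {
        if (ignore_dt) {
            for (auto& rt : v) { rt.deletion_time = 0; }
        }
        std::sort(v.begin(), v.end(),
                  [](const rt_sketch& a, const rt_sketch& b) { return key(a) < key(b); });
        std::vector<rt_sketch> out;
        for (const auto& rt : v) {
            if (!out.empty() && out.back().end == rt.start
                    && out.back().timestamp == rt.timestamp
                    && out.back().deletion_time == rt.deletion_time) {
                out.back().end = rt.end;
            } else {
                out.push_back(rt);
            }
        }
        return out;
    }
public:
    void ignore_deletion_time() { _ignore_deletion_time = true; }
    void expect(rt_sketch rt) { _expected.push_back(rt); }
    void on_read(rt_sketch rt) { _actual.push_back(rt); }

    // Called when a partition ends: compares both lists and resets them.
    bool check_partition_end() {
        auto e = normalize(std::move(_expected), _ignore_deletion_time);
        auto a = normalize(std::move(_actual), _ignore_deletion_time);
        _expected.clear();
        _actual.clear();
        return e.size() == a.size()
            && std::equal(e.begin(), e.end(), a.begin(),
                          [](const rt_sketch& x, const rt_sketch& y) { return key(x) == key(y); });
    }
};
```

Deferring the comparison to the partition end is what lets a reader emit one tombstone as several pieces without failing the assertion.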
Fixes #9470
Signed-off-by: Michael Livshin <michael.livshin@scylladb.com>
there's no need for wrapping compaction_data in shared_ptr, also
let's kill unused params in create_compaction_data to simplify
its creation.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Currently row marker shadowing the shadowable tombstone is only checked
in `apply(row_marker)`. This means that shadowing will only be checked
if the shadowable tombstone and row marker are set in the correct order.
This at the very least can cause flakiness in tests when a mutation
produced just the right way has a shadowable tombstone that can be
eliminated when the mutation is reconstructed in a different way,
leading to artificial differences when comparing those mutations.
This patch fixes this by checking shadowing in
`apply(shadowable_tombstone)` too, making the shadowing check symmetric.
There is still one vulnerability left: `row_marker& row_marker()`, which
allows overwriting the marker without triggering the corresponding
checks. We cannot remove this overload as it is used by compaction so we
just add a comment to it warning that `maybe_shadow()` has to be manually
invoked if it is used to mutate the marker (compaction takes care of
that). A caller which didn't do the manual check is
mutation_source_test: this patch updates it to use `apply(row_marker)`
instead.
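To illustrate why the check has to be symmetric, here is a minimal sketch in which both apply paths run the same shadowing check, so the result does not depend on application order. The type row_sketch, its members, and the simplified rule "a marker with a higher timestamp drops the shadowable tombstone" are assumptions made for illustration, not the actual deletable_row code.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical, simplified row state: only timestamps are tracked.
struct row_sketch {
    std::optional<int64_t> marker_ts;      // live row marker, if any
    std::optional<int64_t> shadowable_ts;  // shadowable tombstone, if any

    void apply_marker(int64_t ts) {
        if (!marker_ts || ts > *marker_ts) {
            marker_ts = ts;
        }
        maybe_shadow();
    }

    void apply_shadowable_tombstone(int64_t ts) {
        if (!shadowable_ts || ts > *shadowable_ts) {
            shadowable_ts = ts;
        }
        maybe_shadow();  // the check this patch adds on the tombstone path
    }

private:
    // A newer marker shadows (removes) the shadowable tombstone.
    void maybe_shadow() {
        if (marker_ts && shadowable_ts && *marker_ts > *shadowable_ts) {
            shadowable_ts.reset();
        }
    }
};

inline bool same_row_sketch(const row_sketch& a, const row_sketch& b) {
    return a.marker_ts == b.marker_ts && a.shadowable_ts == b.shadowable_ts;
}
```

Without the call in apply_shadowable_tombstone(), applying the marker first and the tombstone second would leave a tombstone that the opposite order eliminates.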
Fixes: #9483
Tests: unit(dev)
Closes #9519
This mini series contains two fixes that are bundled together since the
second one depends on the first (without it, the second would not fix
anything). The two problems were:
1. When certain operations are called on a service level controller
which doesn't have its data accessor set, it can lead to a crash
since some operations will still try to dereference the accessor
pointer.
2. The cql environment test initialized the accessor with a
sharded<system_distributed_data>&; however, this sharded instance
itself was not initialized (sharded::start wasn't called). So the
same operations that were unsafe with a null accessor would now crash
trying to access an uninitialized sharded instance.
Closes #9468
* github.com:scylladb/scylla:
CQL test environment: Fix bad initialization order
Service Level Controller: Fix possible dereference of a null pointer
Serialize the metadata changes with
keyspace create, update, or drop.
This will become necessary in the following patch
when we update the effective_replication_map
on all keyspaces and we want instances on all shards
to end up with the same replication map.
Note that storage_service::keyspace_changed is called
from the scheme_merge path so it already holds
the merge_lock.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
The service level controller was initialized with a data
accessor that uses the system distributed keyspace before
the latter has been initialized. If this accessor is used
(for example by calling
service_level_controller::get_distributed_service_levels()),
it will fail and crash.
Leaving the data accessor uninitialized does not cause the same
problem, since we can handle such calls when the accessor is not
initialized.
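The "handle the call when the accessor is not initialized" behaviour can be sketched like this. The class and method names with the _sketch suffix, the map return type, and the choice to return an empty result are assumptions for illustration only, not the real service_level_controller code.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical accessor interface.
struct data_accessor_sketch {
    virtual ~data_accessor_sketch() = default;
    virtual std::map<std::string, int> get_service_levels() const = 0;
};

// Hypothetical accessor returning a fixed set of levels.
struct fixed_accessor_sketch : data_accessor_sketch {
    std::map<std::string, int> get_service_levels() const override {
        return {{"default", 1000}};
    }
};

class service_level_controller_sketch {
    // May legitimately stay null until the backing keyspace is ready.
    std::shared_ptr<data_accessor_sketch> _accessor;
public:
    void set_data_accessor(std::shared_ptr<data_accessor_sketch> a) {
        _accessor = std::move(a);
    }

    std::map<std::string, int> get_distributed_service_levels() const {
        if (!_accessor) {
            // No accessor yet: report "nothing known" instead of dereferencing null.
            return {};
        }
        return _accessor->get_service_levels();
    }
};
```

A null accessor is a state the controller can detect, whereas an accessor wrapping an unstarted sharded service looks valid and fails only when used.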
Signed-off-by: Eliran Sinvani <eliransin@scylladb.com>
There are 4 flavours of mutation source tests that are all run
sequentially -- plain, reversed and upgrade/downgrade ones that
check v1<->v2 conversions.
This patch splits them all into individual calls so that some
tests may want to have dedicated cases for each. "By default" they
are all run as they were.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
"
The storage_service is involved in the cdc_generation_service guts
more than needed.
- the bool _for_testing bit is cdc-only
- there's API-only cdc_generation_service getter
- cdc_g._s. startup code partially sits in s._s. one
This patch cleans most of the above leaving only the startup
_cdc_gen_id on board.
tests: unit(dev)
refs: #2795
"
* 'br-storage-service-vs-cdc-2' of https://github.com/xemul/scylla:
api: Use local sharded<cdc::generation_service> reference
main: Push cdc::generation_service via API
storage_service: Ditch for_testing boolean
cdc: Replace db::config with generation_service::config
cdc: Drop db::config from description_generator
cdc: Remove all arguments from maybe_rewrite_streams_descriptions
cdc: Move maybe_rewrite_streams_descriptions into after_join
cdc: Squash two methods into one
cdc: Turn make_new_cdc_generation a service method
cdc: Remove ring-delay arg from make_new_cdc_generation
cdc: Keep database reference on generation_service
"This series removes layer violation in compaction, and also
simplifies compaction manager and how it interacts with compaction
procedure."
* 'compaction_manager_layer_violation_fix/v4' of github.com:raphaelsc/scylla:
compaction: split compaction info and data for control
compaction_manager: use task when stopping a given compaction type
compaction: remove start_size and end_size from compaction_info
compaction_manager: introduce helpers for task
compaction_manager: introduce explicit ctor for task
compaction: kill sstables field in compaction_info
compaction: kill table pointer in compaction_info
compaction: simplify procedure to stop ongoing compactions
compaction: move management of compaction_info to compaction_manager
compaction: move output run id from compaction_info into task
by setting _alloc_count initially to 0.
The initial value of _alloc_count hasn't been explicitly specified. As the
allocator has usually been an automatic variable, _alloc_count initially had
some unspecified contents. This probably means that cases where the first few
allocations passed and a later one failed might never have been
tested. The good thing is that most of the users have been transferred to
the Seastar failure injector, which (by accident) has been correct.
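A minimal sketch of a failure-injecting allocator with the counter explicitly initialized is shown below. The class name failing_allocator_sketch and its fail_after parameter are hypothetical and not taken from the test code in question.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical allocator that lets the first `fail_after` allocations
// succeed and throws std::bad_alloc on every later one.
class failing_allocator_sketch {
    std::size_t _alloc_count = 0;  // explicit initial value, as in the fix
    std::size_t _fail_after;
public:
    explicit failing_allocator_sketch(std::size_t fail_after)
        : _fail_after(fail_after) {}

    void* allocate(std::size_t n) {
        if (_alloc_count >= _fail_after) {
            throw std::bad_alloc();
        }
        ++_alloc_count;
        return ::operator new(n);
    }

    void deallocate(void* p) noexcept { ::operator delete(p); }

    std::size_t alloc_count() const { return _alloc_count; }
};
```

With an indeterminate counter on the stack, the comparison against fail_after could trip on the very first allocation, so the "a few succeed, then one fails" path would never run.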
Closes #9420
compaction_info must only contain info data to be exported to the
outside world, whereas compaction_data will contain data for
controlling compaction behavior and stats which change as
compaction progresses.
This separation makes the interface clearer, also allowing for
future improvements like removing direct references to table
in compaction.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Today, compactions are tracked by both _compactions and _tasks,
where _compactions refers to actual ongoing compaction tasks,
whereas _tasks refers to manager tasks, which are responsible for
spawning new compactions, retrying them on failure, etc.
As each task can only have one ongoing compaction at a time,
let's move compaction into task, such that manager won't have to
look at both when deciding to do something like stopping a task.
So stopping a task becomes simpler, and duplication is naturally
gone.
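As a sketch of what a single task list buys, the code below stops compactions of a given type by walking tasks only. The names (task_type_sketch, manager_task_sketch, task_list_manager_sketch) and the stop_requested flag are illustrative assumptions, not the actual compaction_manager members.

```cpp
#include <cassert>
#include <cstddef>
#include <list>

enum class task_type_sketch { regular, cleanup };

// Hypothetical control block of one running compaction.
struct compaction_control_sketch {
    bool stop_requested = false;
};

// A manager task owns at most one ongoing compaction.
struct manager_task_sketch {
    task_type_sketch type = task_type_sketch::regular;
    bool compaction_running = false;
    compaction_control_sketch compaction;
};

class task_list_manager_sketch {
    std::list<manager_task_sketch> _tasks;  // the only list to consult
public:
    manager_task_sketch& add_task(task_type_sketch type, bool running) {
        _tasks.emplace_back();
        auto& t = _tasks.back();
        t.type = type;
        t.compaction_running = running;
        return t;
    }

    // Request a stop for every running compaction of the given type.
    std::size_t stop_compaction(task_type_sketch type) {
        std::size_t stopped = 0;
        for (auto& t : _tasks) {
            if (t.type == type && t.compaction_running) {
                t.compaction.stop_requested = true;
                ++stopped;
            }
        }
        return stopped;
    }
};
```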
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Today, compaction is calling compaction manager to register / deregister
the compaction_info created by it.
This is a layer violation because manager sits one layer above
compaction, so manager should be responsible for managing compaction
info.
From now on, compaction_info will be created and managed by
compaction_manager. compaction will only have a reference to info,
which it can use to update the world about compaction progress.
This will allow compaction_manager to be simplified as info can be
coupled with its respective task, allowing duplication to be removed
and layer violation to be fixed.
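The ownership direction described above can be sketched as follows: the manager creates and releases the info object, and the compaction code only writes progress through a reference. All names (progress_info_sketch, compact_sketch, owning_manager_sketch) are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <iterator>
#include <list>

// Hypothetical progress record exported to the outside world.
struct progress_info_sketch {
    uint64_t total_keys_written = 0;
};

// Compaction layer: reports progress through the reference it was given.
// It never registers or deregisters anything with the manager.
inline void compact_sketch(progress_info_sketch& info, uint64_t keys) {
    for (uint64_t i = 0; i < keys; ++i) {
        ++info.total_keys_written;
    }
}

// Manager layer: creates, owns and releases the info objects.
class owning_manager_sketch {
    std::list<progress_info_sketch> _ongoing;
public:
    uint64_t perform(uint64_t keys) {
        _ongoing.emplace_back();
        auto it = std::prev(_ongoing.end());
        compact_sketch(*it, keys);
        const uint64_t written = it->total_keys_written;
        _ongoing.erase(it);
        return written;
    }

    std::size_t ongoing() const { return _ongoing.size(); }
};
```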
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
this run id is used to track partial runs that are being written to.
let's move it from info into task, as this is not an external info,
but rather one that belongs to compaction_manager.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Nowadays it purely controls whether or not to inject delays into
timestamps generation by cdc. The same effect can be achieved by
configuring the cdc::generation_service directly.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
This is to push the service towards the general idea that each
component should have its own config and db::config should stay
in main.
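A minimal sketch of the "own config per component" idea follows. The option names, the db_config_sketch struct and the translation function are hypothetical; the actual generation_service::config fields may differ.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical stand-in for the global configuration object.
struct db_config_sketch {
    int ring_delay_ms = 30000;
    bool cdc_dont_rewrite_streams = false;
    // ... many unrelated options would live here as well
};

// Hypothetical per-service config: only what the service needs.
struct generation_service_config_sketch {
    std::chrono::milliseconds ring_delay{0};
    bool dont_rewrite_streams = false;
};

class generation_service_sketch {
    generation_service_config_sketch _cfg;
public:
    explicit generation_service_sketch(generation_service_config_sketch cfg)
        : _cfg(cfg) {}
    std::chrono::milliseconds ring_delay() const { return _cfg.ring_delay; }
    bool rewrites_streams() const { return !_cfg.dont_rewrite_streams; }
};

// The translation happens in one place (main), so the service never
// sees the global config type.
inline generation_service_config_sketch
make_generation_service_config_sketch(const db_config_sketch& cfg) {
    generation_service_config_sketch out;
    out.ring_delay = std::chrono::milliseconds(cfg.ring_delay_ms);
    out.dont_rewrite_streams = cfg.cdc_dont_rewrite_streams;
    return out;
}
```

Tests can then build the small config directly, for example with a zero ring delay, without constructing a full db::config.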
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
The cql_config_updater is a sharded<> service that exists in main and
whose goal is to make sure some db::config's values are propagated into
cql_config. There's a more handy updateable_value<> glue for that.
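The kind of glue meant here can be sketched as a value handle that always reads the current setting from its source, so no separate updater service has to copy values around. This is an illustration only: value_source_sketch, updateable_value_sketch and the max_restrictions option are hypothetical and are not the real updateable_value<> API.

```cpp
#include <cassert>
#include <utility>

// Hypothetical live config option that can change at runtime.
template <typename T>
class value_source_sketch {
    T _value;
public:
    explicit value_source_sketch(T v) : _value(std::move(v)) {}
    const T& get() const { return _value; }
    void set(T v) { _value = std::move(v); }
};

// Hypothetical handle: reads through to the source on every access.
// The source must outlive the handle.
template <typename T>
class updateable_value_sketch {
    const value_source_sketch<T>* _src;
public:
    explicit updateable_value_sketch(const value_source_sketch<T>& src)
        : _src(&src) {}
    const T& operator()() const { return _src->get(); }
};

// The consumer config holds handles instead of copied values.
struct cql_config_sketch {
    updateable_value_sketch<unsigned> max_restrictions;
};
```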
tests: unit(dev)
refs: #2795
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Message-Id: <20210927090402.25980-1-xemul@scylladb.com>
To ensure all mutation sources uniformly support the current API of
reverse reading: reversed schema and half-reversed slice. This test will
also ensure that once we switch to native-reverse slice, all
mutation-sources will keep on working.
There's a circular dependency:
    query processor needs database
    database owns large_data_handler and compaction_manager
    those two need qctx
    qctx owns a query_processor
Respectively, the latter hidden dependency is not "tracked" by
constructor arguments -- the query processor is started after
the database and is deferred to be stopped before it. This works
in scylla, because query processor doesn't really stop there,
but in cql_test_env it's problematic as it stops everything,
including the qctx.
Recent database start-stop sanitation revealed this problem --
on database stop either l.d.h. or compaction manager try to
start (or continue) messing with the query processor. One problem
was faced immediately and plugged with the 75e1d7ea safety check
inside l.d.h., but still cql_test_env tests continue suffering
from use after free on stopped query processor.
The fix is to partially revert 4b7846da by making the tests
stop some pieces of the database (including l.d.h. and compaction
manager) as they used to before. In scylla this is, probably, not
needed, at least now -- the database shutdown code was and still
is run right before the stopping one.
tests: unit(debug)
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Message-Id: <20210924080248.11764-1-xemul@scylladb.com>
"This series removes layer violation in compaction, and also
simplifies compaction manager and how it interacts with compaction
procedure."
* 'compaction_manager_layer_violation_fix/v3' of github.com:raphaelsc/scylla:
compaction: split compaction info and data for control
compaction_manager: use task when stopping a given compaction type
compaction: remove start_size and end_size from compaction_info
compaction_manager: introduce helpers for task
compaction_manager: introduce explicit ctor for task
compaction: kill sstables field in compaction_info
compaction: kill table pointer in compaction_info
compaction: simplify procedure to stop ongoing compactions
compaction: move management of compaction_info to compaction_manager
compaction: move output run id from compaction_info into task
"
The main challenge here is to move the messaging_service.start_listen()
call out of gossiper into main. Other changes are pretty minor
compared to that and include
- patch gossiper API towards a standard start-shutdown-stop form
- gossiping "sharder info" in initial state
- configure cluster name and seeds via gossip_config
tests: unit(dev)
dtest.bootstrap_test.start_stop_test_node(dev)
manual(dev): start+stop, nodetool enable-/disablegossip
refs: #2737
refs: #2795
refs: #5489
"
* 'br-gossiper-dont-start-messaging-listen-2' of https://github.com/xemul/scylla:
code: Expell gossiper.hh from other headers
storage_service: Gossip "sharder" in initial states
gossiper: Relax set_seeds()
gossiper, main: Turn init_gossiper into get_seeds_from_config
storage_service: Eliminate the do-bind argument from everywhere
gossiper: Drop ms-registered manipulations
messaging, main, gossiper: Move listening start into main
gossiper: Do handlers reg/unreg from start/stop
gossiper: Split (un)init_messaging_handler()
gossiper: Relocate stop_gossiping() into .stop()
gossiper: Introduce .shutdown() and use where appropriate
gossiper: Set cluster_name via gossip_config
gossiper, main: Straighten start/stop
tests/cql_test_env: Open-code tst_init_ms_fd_gossiper
tests/cql_test_env: De-global most of gossiper
gossiper: Merge start_gossiping() overloads into one
gossiper: Use is_... helpers
gossiper: Fix do_shadow_round comment
gossiper: Dispose dead code
"
There's a whole lot of places that create an sstable for tests
like this
    auto sst = env.make_sstable(...);
    sst->write_components(...);
    sst->load();
Some of them are already generalized with the make_sstable_easy
helper, but there are several instances of them.
Found while hunting down the places that use default IO sched
class behind the scenes.
tests: unit(dev)
"
* 'br-sst-tests-make-sstable-easy' of https://github.com/xemul/scylla:
test: Generalize make_sstable() and make_sstable_easy()
test: Use now existing helpers elsewhere
test: Generalize all make_sstable_easy()-s
test: Set test change estimation to 1
test: Generalize make_sstable_easy in mutation tests
test: Generalize make_sstable_easy in set tests
test: Reuse make_sstable_easy in datafile tests
test: Relax make_sstable_easy in compaction tests
It's much shorter and simpler to pass the seeds, obtained from the
config, into gossiper via gossip_config rather than with the help
of a special call.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
The same as in the previous patch -- the gossiper doesn't need to know
whether it should call messaging.start_listen() or not, and neither
should the storage_service.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
The start/stop sequence we're moving towards assumes a shutdown (or
drain) method that will be called early on stop to notify the service
that the system is going down so it could prepare.
For gossiper it already means calling stop_gossiping() on the shard-0
instance. So by and large this patch renames a few stop_gossiping()
calls into .shutdown() ones.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
It's taken purely from the db::config and thus can be set up early.
Right now the empty name is converted into "Test Cluster" one, but
remains empty in the config and is later used by the system_keyspace
code. This logic remains intact.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Turn the gossiper start/stop sequence into the canonical form
    gossiper.start(std::ref(dependencies)...).get();
    auto stop_gossiper = defer([&gossiper] {
        gossiper.invoke_on_all(&gossiper::stop).get();
    });
    gossiper.invoke_on_all(&gossiper::start).get();
The deferred call should be gossiper.stop(); but for now keep
the instances memory alive.
This trick is safe at this point, because .start() and .stop()
methods are both empty (still).
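The start / shutdown / stop shape can be sketched without Seastar as below. The scope guard defer_sketch, the lifecycle_service_sketch class and the log of calls are hypothetical stand-ins used only to show the ordering; they are not the sharded<> API.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical scope guard: runs the action when it goes out of scope.
template <typename F>
class deferred_action_sketch {
    F _f;
    bool _armed = true;
public:
    explicit deferred_action_sketch(F f) : _f(std::move(f)) {}
    deferred_action_sketch(deferred_action_sketch&& o)
        : _f(std::move(o._f)), _armed(o._armed) { o._armed = false; }
    deferred_action_sketch(const deferred_action_sketch&) = delete;
    deferred_action_sketch& operator=(const deferred_action_sketch&) = delete;
    ~deferred_action_sketch() { if (_armed) { _f(); } }
};

template <typename F>
deferred_action_sketch<F> defer_sketch(F f) {
    return deferred_action_sketch<F>(std::move(f));
}

// Hypothetical service that records its lifecycle calls.
class lifecycle_service_sketch {
    std::vector<std::string>& _log;
public:
    explicit lifecycle_service_sketch(std::vector<std::string>& log) : _log(log) {}
    void start() { _log.push_back("start"); }
    void shutdown() { _log.push_back("shutdown"); }  // early notice: going down
    void stop() { _log.push_back("stop"); }          // final teardown
};

inline std::vector<std::string> run_lifecycle_sketch() {
    std::vector<std::string> log;
    {
        lifecycle_service_sketch svc(log);
        auto stop_svc = defer_sketch([&svc] { svc.stop(); });
        svc.start();
        log.push_back("serve");
        svc.shutdown();
    }  // stop_svc fires here, before svc itself is destroyed
    return log;
}
```

Registering the stop action right after construction keeps teardown in reverse order of startup even if a later step throws.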
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
The helper is called once. Keeping this code in the caller packs the
code, helps it look more like main() and facilitates further patching.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Gossiper is still global and cql_test_env heavily exploits this fact.
Clean that by getting the gossiper once and using the local reference
everywhere else.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
The former constructs a memtable from the vector of mutations and
then does exactly the same steps as the latter one -- creates an
sstable corresponding to the memtable.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
There are already four of them. Those working with the mutation reader
can be folded into one with some default args.
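Folding overloads into one helper with default arguments could look roughly like this. The types and parameters (mutation_sketch, sstable_sketch, version, estimated_partitions) are hypothetical and only illustrate the shape of the change, not the real make_sstable_easy() signature.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for a mutation and the resulting sstable.
struct mutation_sketch {
    int key;
};

struct sstable_sketch {
    std::size_t partitions;
    int version;
    std::size_t estimated_partitions;
};

// One helper instead of several overloads: callers that do not care
// about version or estimation rely on the defaults.
inline sstable_sketch make_sstable_easy_sketch(
        const std::vector<mutation_sketch>& mutations,
        int version = 3,
        std::size_t estimated_partitions = 1) {
    return sstable_sketch{mutations.size(), version, estimated_partitions};
}
```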
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>