With that, it is always expected that _compaction_state[cf]
exists when compaction jobs are submnitted.
Otherwise, throw std::out_of_range exception.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
So that the compaction_state won't be found from this point on,
while stopping the ongoing compaction.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Now that compaction tasks enter the compaction_state gate there is
no point in stopping ongoing compaction in parallel to closing the gate.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
And hold its gate to make sure the compaction_state outlives
the task and can be used to wait on all tasks and functions
using it.
With that, doing access _compaction_state[cf] to acquire
shared/exclusive locks but rather get to it via
task->compaction_state so it can be detached from
_compaction_state while task is running, if needed.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
It must remove itself from the compaction_manager,
that will stop_ongoing_compactions.
Without that we're hitting
```
sstable_compaction_test: ./seastar/include/seastar/core/gate.hh:56: seastar::gate::~gate(): Assertion `!_count && "gate destroyed with outstanding requests"' failed.
```
when destroying the compaction_manager.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
We'd like to use compaction_state::gate both for functions
running with compaction disabled and for and tasks referring
to the compaction_state so that stop_ongoing_compactions
could wait on all functions referring to the state structure.
This is also cleaner with respect to not relying on
gate::use_count() when re-submitting regular compaction
when compaction is re-enabled.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
get_compactions().size() may return 0 while there are
non-zero tasks to stop.
Some tasks may not be marked as `compaction_running` since
they are either:
- postponed (due to compaction manger throttling of regular compaction)
- sleeping before retry.
In both cases we still want to stop them so the log message
should reflect both the number of ongoing compactions
and the actual number of tasks we're stopping.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
The patch also removes the usage of map_reduce() because it is no longer needed
after 6191fd7701 that drops futures from the view mutation building path.
The patch preserves yielding point that map_reduce() provides though by
calling to coroutine::maybe_yield() explicitly.
Message-Id: <YZoV3GzJsxR9AZfl@scylladb.com>
"
After this series, compaction will finally stop including database.hh.
tests: unit(debug).
"
* 'stop_including_database_hh_for_compaction' of github.com:raphaelsc/scylla:
compaction: stop including database.hh
compaction: switch to table_state in get_fully_expired_sstables()
compaction: switch to table_state
compaction: table_state: Add missing methods required by compaction
Make compaction procedure switch to table_state. Only function in
compaction.cc still directly using table is
get_fully_expired_sstables(T,...), but subsequently we'll make it
switch to table_state and then we can finally stop including database.hh
in the compaction code.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
These are the only methods left for compaction to switch to
table_state, so compaction can finally stop including database.hh
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
"
Add a sharded locator::effective_replication_map_factory that holds
shared effective_replication_maps.
To search for e_r_m in the factory, we use a compound `factory_key`:
<replication_strategy type, replication_strategy options, token_metadata ring version>.
Start the sharded factory in main (plus cql_test_env and tools/schema_loader)
and pass a reference to it to storage_proxy and storage_server.
For each keyspace, use the registry to create the effective_replication_map.
When registered, effective_replication_map objects erase themselves
from the factory when destroyed. effective_replication_map then schedules
a background task to clear_gently its contents, protected by the e_r_m_f::stop()
function.
Note that for non-shard 0 instances, if the map
is not found in the registry, we construct it
by cloning the precalculated replication_map
from shard 0 to save the cpu cycles of re-calculating
it time and again on every shard.
Test: unit(dev), schema_loader_test(debug)
DTest: bootstrap_test.py:TestBootstrap.decommissioned_wiped_node_can_join_test update_cluster_layout_tests.py:TestUpdateClusterLayout.simple_add_new_node_while_schema_changes_with_repair_test (dev)
"
* tag 'effective_replication_map_factory-v7' of https://github.com/bhalevy/scylla:
effective_replication_map: clear_gently when destroyed
database: shutdown keyspaces
test: cql_test_env: stop view_update_generator before database shuts down
effective_replication_map_factory: try cloning replication map from shard 0
tools: schema_loader: start a sharded erm_factory
storage_service: use erm_factory to create effective_replication_map
keyspace: use erm_factory to create effective_replication_map
effective_replication_map: erase from factory when destroyed
effective_replication_map_factory: add create_effective_replication_map
effective_replication_map: enable_lw_shared_from_this
effective_replication_map: define factory_key
keyspace: get a reference to the erm_factory
main: pass erm_factory to storage_service
main: pass erm_factory to storage_proxy
locator: add effective_replication_map_factory
Turns out most of regular writer can be reused by GC writer, so let's
merge the latter into the former. We gain a lot of simplification,
lots of duplication is removed, and additionally, GC writer can now
be enabled with interposer as it can be created on demand by
each interposer consumer (will be done in a later patch).
Refs #6472.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <20211119120841.164317-1-raphaelsc@scylladb.com>
Prevent reactor stalls by gently clearing the replication_map
and token_metadata_ptr when the effective_replication_map is
destroyed.
This is done in the background, protected by the
effective_replication_map_factory::stop() method.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
release the keyspace effective_replication_map during
shutdown so that effective_replication_map_factory
can be stopped cleanly with no outstanding e_r_m:s.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
We can't have view updates happening after the database shuts down.
In particular, mutateMV depends on the keyspace effective_replaication_map
and it is going to be released when all keyspaces shut down, in the next patch.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Calculating a new effective_replication_map on each shard
is expensive. To try to save that, use the factory key to
look up an e_r_m on shard 0 and if found, use to to clone
its replication map and use that to make the shard-local
e_r_m copy.
In the future, we may want to improve that in 2 ways:
- instead of always going to shard 0, use hash(key) % smp::count
to create the first copy.
- make full copies only on NUMA nodes and keep a shared pointer
on all other shards.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
This is required for an upcoming change to create effective_replication_map
on all shards in storage_service::replication_to_all_cores.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Instead of calculating the effective_replication_map
in replicate_to_all_cores, use effective_replication_map_factory::
create_effective_replication_map.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
The effective_replication_map_factory keeps nakes pointers
to outstanding effective_replication_map:s.
These are kept valid using a shared effective_replication_map_ptr.
When the last shared ptr reference is dropped the effective_replication_map
object is destroyed, therefore the raw pointer to it in the factory
must be erased.
This now happens in ~effective_replication_map when the object
is marked as registered.
Registration happens when effective_replication_map_factory inserts
the newly created effective_replication_map to its _replication_maps
map, and the factory calles effective_replication_map::set_factory..
Note that effective_replication_map may be created temporarily
and not be inserted to the factory's map, therefore erase
is called only when required.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Make a factory key using the replication_strategy type
and config options, plus the token_metadata ring version
and use it to search an already-registred effective_replication_map.
If not found, calculate a new create_effective_replication_map
and register it using the above key.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
So a effective_replication_map_ptr can be generated
using a raw pointer by effective_replication_map_factory.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
To be used to locate the effective_replication_map
in the to-be-introduced effective_replication_map_factory.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
To be used for creating effective_replication_map
when token_metadata changes, and update all
keyspaces with it.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
It will be used further to create shared copies
of effective_replication_map based on replication_strategy
type and config options.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Similar to other timeout handling paths, there is no need to print an
ERROR for timeout as the error is not returned anyhow.
Eventually the error will be reported at the query level
when the query times out or fails in any other way.
Also, similar to `storage_proxy::mutate_end`, traces were added
also for the error cases.
FWIW, these extraneous timeout error causes dtest failures.
E.g. alternator_tests:AlternatorTest.test_slow_query_logging
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Message-Id: <20211118153603.2975509-1-bhalevy@scylladb.com>
We're using a coarse resolution when rounding clock time for sstables to
be evenly distributed across time buckets. Let's use a better resolution,
to make sure sstables won't fall into the edges.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <20211118172126.34545-1-raphaelsc@scylladb.com>
The test checks every 100 * smp::count milliseconds that a shard
had been able to make at least once step. Shards, in turn, take up
to 100 ms sleeping breaks between steps. It seems like on heavily
loaded nodes the checking period is too small and the test
stuck-detector shoots false-positives.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Message-Id: <20211118154932.25859-1-xemul@scylladb.com>
I intentionally store lambdas in variables and pass them to
with_scheduling_group using std::ref. Coroutines don't put variables
captured by lambdas on stack frame. If the lambda containing them is not
stored, the captured variables will be lost, resulting in stack/heap use
after free errors. An alternative is to capture variables, then create
local variables inside lambda bodies that contain a copy/moved version
of the captured ones. For example, if the post_flush lambda wasn't
stored in a dedicated variable, then it wouldn't be put on the coroutine
frame. At the first co_await inside of it, the lambda object along with
variables captured by it (old and &newtabs created inside square
brackets) would go away. The underlying objects (e.g. newtabs created in
the outer scope) would still be valid, but the reference to it would be
gone, causing most of the tests to fail.
Message-Id: <20211118131441.215628-2-mikolaj.sieluzycki@scylladb.com>