Commit Graph

281 Commits

Author SHA1 Message Date
Benny Halevy
0aaaefbb5c database: drop_column_family: define table& cf
To reduce the churn in the following patch
that will pass the table& as a parameter.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-08-07 12:53:05 +03:00
Benny Halevy
bb1e5ffb8c database: drop_column_family: reuse uuid for evict_all_for_table
cf->schema()->id() is the same one returned
by find_uuid(ks_name, cf_name);

As a follow up, we should define a concrete
table_id type and rename schema::id() to schema::table_id()
to return it.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-08-07 12:53:05 +03:00
Benny Halevy
e800e1e720 database: drop_column_family: move log message up a layer
Print once on "coordinator" shard.

And promote to info level as it's important to log
when we're dropping a table (and if we're going to take a snapshot).

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-08-07 12:53:05 +03:00
Benny Halevy
ca78a63873 database: truncate: get rid of the unused ks param
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-08-07 12:53:05 +03:00
Benny Halevy
46e2a7c83b database: add truncate_table_on_all_shards
As a first step to decouple truncate from flush
and snapshot.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-08-07 12:53:05 +03:00
Benny Halevy
5e8c05f1a8 database: drop_table_on_all_shards: do not accept a truncated_at timestamp_func

Since in the drop_table case we want to discard ALL
sstables in the table, not only those with `max_data_age()`
up until drop started.

Fixes #11232

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-08-07 12:52:51 +03:00
Benny Halevy
574909c78f database: truncate: get optional snapshot_name from caller
Before we change drop_table_on_all_shards to always
pass db_clock::time_point::max() in the next patch,
let it pass a unique snapshot name, otherwise
the snapshot name will always be based on the constant, max
time_point.

Refs #11232

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-08-07 12:03:19 +03:00
Benny Halevy
474b2fdf37 database: truncate: fix assert about replay_position low_mark
This assert was tweaked several times:
Introduced in 83323e155e,
then fixed in b2b1a1f7e1 to account
for no rp from discard_sstables, then in
9620755c7f to account for
cases we do not flush the table, then again in
71c5dc82df to make that more accurate.

But, the assert wasn't correct in the first place
in the sense that we first get `low_mark` which
represents the highest replay_position at the time truncate
was called, but then we call discard_sstables with a time_point
of `truncated_at` that we get from the caller via the timestamp_func,
and that one could be in the past, before truncate was called -
hence discard_sstables with that timestamp may very well
return a replay_position from older sstables, prior to flush
that can be smaller than the low_mark.

Fix this assert to account for that case.

The real fix to this issue is to have a truncate_tombstone
that will carry an authoritative api::timestamp (#11230)

Fixes #11231

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-08-07 09:18:06 +03:00
Pavel Emelyanov
527b345079 Merge 'storage_proxy: introduce a remote "subservice"' from Kamil Braun
Introduce a `remote` class that handles all remote communication in `storage_proxy`: sending and receiving RPCs, checking the state of other nodes by accessing the gossiper, and fetching schema.

The `remote` object lives inside `storage_proxy` and right now it's initialized and destroyed together with `storage_proxy`.

The long game here is to split the initialization of `storage_proxy` into two steps:
- the first step, which constructs `storage_proxy`, initializes it "locally" and does not require references to `messaging_service` and `gossiper`.
- the second step will take those references and add the `remote` part to `storage_proxy`.

This will allow us to remove some cycles from the service (de)initialization order and in general clean it up a bit. We'll be able to start `storage_proxy` right after the `database` (without messaging/gossiper). Similar refactors are planned for `query_processor`.

Closes #11088

* github.com:scylladb/scylladb:
  service: storage_proxy: pass `migration_manager*` to `init_messaging_service`
  service: storage_proxy: `remote`: make `_gossiper` a const reference
  gms: gossiper: mark some member functions const
  db: consistency_level: `filter_for_query`: take `const gossiper&`
  replica: table: `get_hit_rate`: take `const gossiper&`
  gms: gossiper: move `endpoint_filter` to `storage_proxy` module
  service: storage_proxy: pass `shared_ptr<gossiper>` to `start_hints_manager`
  service: storage_proxy: establish private section in `remote`
  service: storage_proxy: remove `migration_manager` pointer
  service: storage_proxy: remove calls to `storage_proxy::remote()` from `remote`
  service: storage_proxy: remove `_gossiper` field
  alternator: ttl: pass `gossiper&` to `expiration_service`
  service: storage_proxy: move `truncate_blocking` implementation to `remote`
  service: storage_proxy: introduce `is_alive` helper
  service: storage_proxy: remove `_messaging` reference
  service: storage_proxy: move `connection_dropped` to `remote`
  service: storage_proxy: make `encode_replica_exception_for_rpc` a static function
  service: storage_proxy: move `handle_write` to `remote`
  service: storage_proxy: move `handle_paxos_prune` to `remote`
  service: storage_proxy: move `handle_paxos_accept` to `remote`
  service: storage_proxy: move `handle_paxos_prepare` to `remote`
  service: storage_proxy: move `handle_truncate` to `remote`
  service: storage_proxy: move `handle_read_digest` to `remote`
  service: storage_proxy: move `handle_read_mutation_data` to `remote`
  service: storage_proxy: move `handle_read_data` to `remote`
  service: storage_proxy: move `handle_mutation_failed` to `remote`
  service: storage_proxy: move `handle_mutation_done` to `remote`
  service: storage_proxy: move `handle_paxos_learn` to `remote`
  service: storage_proxy: move `receive_mutation_handler` to `remote`
  service: storage_proxy: move `handle_counter_mutation` to `remote`
  service: storage_proxy: remove `get_local_shared_storage_proxy`
  service: storage_proxy: (de)register RPC handlers in `remote`
  service: storage_proxy: introduce `remote`
2022-08-04 17:50:20 +03:00
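The two-step initialization described in the merge above can be pictured with a small standalone sketch. All types here are hypothetical stand-ins, not Scylla's real `storage_proxy`, `messaging_service` or `gossiper`; the method names `start_remote` and `stop_remote` are assumptions made for illustration only.

```cpp
#include <cassert>
#include <optional>
#include <stdexcept>
#include <string>

// Hypothetical stand-ins; these are not Scylla's real classes.
struct messaging_service {};
struct gossiper {
    bool is_alive(const std::string&) const { return true; }
};

class storage_proxy {
    // Everything that talks to other nodes is grouped here.
    struct remote {
        messaging_service& ms;
        const gossiper& g;
    };
    std::optional<remote> _remote;
public:
    // Step 1: local-only construction, no messaging or gossiper needed.
    storage_proxy() = default;

    // Step 2: attach the remote part once its dependencies exist.
    void start_remote(messaging_service& ms, const gossiper& g) {
        assert(!_remote && "remote part already started");
        _remote.emplace(remote{ms, g});
    }

    // Detach before the dependencies are destroyed.
    void stop_remote() noexcept { _remote.reset(); }

    bool has_remote() const noexcept { return _remote.has_value(); }

    // Remote-only operations fail loudly when the remote part is absent.
    bool is_alive(const std::string& node) const {
        if (!_remote) {
            throw std::logic_error("remote part not initialized");
        }
        return _remote->g.is_alive(node);
    }
};
```

With this shape, the local part can be constructed early in the startup sequence and the remote part attached later, which is the ordering freedom the merge message is after.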
Kamil Braun
7b4146dd2a replica: table: get_hit_rate: take const gossiper&
It doesn't use any non-const members.
2022-08-04 12:16:09 +02:00
Benny Halevy
e4e92d44ae main: start compaction_manager as a sharded service
And pass a reference to it to the database rather
than having the database construct its own compaction_manager.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-08-02 07:50:15 +03:00
Benny Halevy
f26e655646 compaction_manager: add maybe_wait_for_sstable_count_reduction
Called from try_flush_memtable_to_sstable,
maybe_wait_for_sstable_count_reduction will wait for
compaction to catch up with memtable flush if
the bucket to compact is inflated, having too many
sstables.  In that case we don't want to add fuel
to the fire by creating yet another sstable.

Fixes #4116

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-07-28 14:43:30 +03:00
Avi Kivity
2c0932cc41 Merge 'Reduce the amount of per-table metrics' from Amnon Heiman
This series is the first step in the effort to reduce the number of metrics reported by Scylla.
The series focuses on the per-table metrics.

The combination of histograms, per-table, and per-shard reporting makes the number of metrics in a cluster explode.
The series uses multiple tools to reduce the number of metrics:
1. Multiple metrics should only be reported for user tables, but the condition that checked this was not updated when more non-user keyspaces were added.
2. Instead of a histogram per table, per shard, it will report a summary per table, per shard, and a single histogram per node.
3. Histograms, summaries, and counters will be reported only if they are used (for example, the cas-related metrics will not be reported for tables that are not using cas).

Closes #11058

* github.com:scylladb/scylla:
  Add summary_test
  database: Reduce the number of per-table metrics
  replica/table.cc: Do not register per-table metrics for system
  histogram_metrics_helper.hh: Add to_metrics_summary function
  Unified histogram, estimated_histogram, rates, and summaries
  Split the timed_rate_moving_average into data and timer
  utils/histogram.hh: should_sample should use a bitmask
  estimated_histogram: add missing getter method
2022-07-27 22:01:08 +03:00
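One entry in the list above says that `should_sample` should use a bitmask. The commit text does not show the change, so the following is only a sketch of the general technique, assuming the sampling interval is a power of two. The class and member names are hypothetical and not taken from utils/histogram.hh.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical sampler; illustrates bitmask-based sampling in general.
class sampler {
    std::uint64_t _mask;
    std::uint64_t _count = 0;
public:
    // sample_every must be a power of two, so that (n & mask) == 0
    // is equivalent to (n % sample_every) == 0 without a division.
    explicit sampler(std::uint64_t sample_every) : _mask(sample_every - 1) {
        assert(sample_every != 0 && (sample_every & (sample_every - 1)) == 0);
    }

    // Returns true for one call out of every sample_every calls,
    // starting with the first call.
    bool should_sample() noexcept {
        return (_count++ & _mask) == 0;
    }
};
```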
Amnon Heiman
99a060126d database: Reduce the number of per-table metrics
This patch reduces the number of metrics that are reported per table when
the per-table flag is on.

When possible, it moves from time_estimated_histogram and
timed_rate_moving_average_and_histogram to use the unified timer.

Instead of a histogram per shard, it will now report a summary per shard
and a histogram per node.

Counters, histograms, and summaries will not be reported if they were
never used.

The API was updated accordingly so it would not break.

Signed-off-by: Amnon Heiman <amnon@scylladb.com>
2022-07-27 16:58:52 +03:00
Amnon Heiman
c31a58f2e9 replica/table.cc: Do not register per-table metrics for system
There is a set of per-table metrics that should only be registered for
user tables.
As time passes, more keyspaces are added that are not user
keyspaces, and there is now a function that covers all those cases.

This patch replaces the implementation to use is_internal_keyspace.

Signed-off-by: Amnon Heiman <amnon@scylladb.com>
2022-07-27 16:58:52 +03:00
Benny Halevy
b5abbb971f test: memtable_test: failed_flush_prevents_writes: extend error injection
Inject errors into all seal_active_memtable distinct error
handling sites.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-07-27 14:06:59 +03:00
Benny Halevy
a5911619c0 table: seal_active_memtable: abort if retried for too long
If we haven't been able to flush the memtable
in ~30 minutes (based on the number of retries)
just abort assuming that the OOM
condition is permanent rather than transient.

Refs #4344

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-07-27 14:06:59 +03:00
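The commit above bounds the retry time by a retry count rather than by a wall-clock deadline. How a time budget maps to a number of retries can be sketched as follows, assuming a doubling backoff with a cap. The 100 ms initial delay and 10 s cap are hypothetical values chosen for illustration; the commit message does not state the real parameters.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>

using millis = std::chrono::milliseconds;

// Hypothetical backoff parameters, chosen only for illustration.
const millis initial_delay{100};
const millis max_delay = std::chrono::seconds{10};

// Delay before retry number `attempt` (0-based): doubles each time,
// capped at max_delay.
millis delay_for(unsigned attempt) {
    millis d = initial_delay;
    for (unsigned i = 0; i < attempt && d < max_delay; ++i) {
        d = std::min(d * 2, max_delay);
    }
    return d;
}

// Smallest number of retries whose accumulated delay reaches `budget`.
unsigned retries_for(millis budget) {
    assert(budget > millis::zero());
    millis total = millis::zero();
    unsigned n = 0;
    while (total < budget) {
        total += delay_for(n);
        ++n;
    }
    return n;
}
```

A caller would compute the limit once, for example `retries_for(std::chrono::minutes{30})`, and abort when the retry counter exceeds it.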
Benny Halevy
bc18f750c6 table: seal_active_memtable: abort on unexpected error
Currently when we can't write the flushed sstable
due to corruption in the memtable we get into
an infinite retry loop (see #10498).

Until we can go into maintenance mode, the next best thing
would be to abort, though there is still a risk that
commitlog replay will reproduce the corruption in the
memtable and we'd end up with an infinite crash loop.
(hence #10498 is not Fixed with this patch)

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-07-27 14:06:57 +03:00
Benny Halevy
f0a597a252 table: try_flush_memtable_to_sstable: propagate errors to seal_active_memtable
And let seal_active_memtable decide about how to handle them
as now all flush error handling logic is implemented there.

In particular, unlike today, sstable write errors will
cause an internal error rather than loop forever.

Also, check for shutdown earlier to ignore errors
like semaphore_broken that might happen when
the table is stopped.

Refs #10498

(The issue will be considered fixed when going
into maintenance mode on write errors rather than
throwing internal error and potentially retrying forever)

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-07-27 14:04:55 +03:00
Benny Halevy
d55a2ac762 dirty_memory_manager: flush_when_needed: move error handling to flush_one/seal_active_memtable
Currently flush is retried both by dirty_memory_manager::flush_when_needed
and table::seal_active_memtable, which may be called by other paths
like table::flush.

Unify the retry logic into seal_active_memtable so that
we have similar error handling semantics on all paths.

Refs #4174
Refs #10498

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-07-27 13:43:17 +03:00
Benny Halevy
67479e4243 table: reindent seal_active_memtable
2022-07-27 13:43:17 +03:00
Benny Halevy
00941452d5 table: coroutinize seal_active_memtable
As a first step to making it robust using
state machine driven retries.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-07-27 13:43:17 +03:00
Benny Halevy
d3acd80cf5 memtable_list: mark functions noexcept
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-07-27 13:43:17 +03:00
Benny Halevy
863e9d9e6a dirty_memory_manager: flush_when_needed: target error handling at flush_one
Now that everything prior to flush_one is noexcept
make table::seal_active_memtable and the paths that call it
noexcept, making sure that any errors are returned only
as exceptional futures, and handle them in flush_when_needed().

The original handle_exception had a broader scope than now needed,
so this change is mostly technical, to show that we can narrow down
the error handling to the continuation of flush_one - and verify that
the unit test is not broken.
A later patch moves this error handling logic away to seal_active_memtable.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-07-27 13:43:17 +03:00
Benny Halevy
73e50bc97d database: delete unused seal_delayed_fn_type
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-07-27 13:43:17 +03:00
Benny Halevy
fcb3347c7a memtable: mark functions noexcept
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-07-27 13:43:17 +03:00
Benny Halevy
2d1ba0d7d8 memtable: memtable_encoding_stats_collector: mark functions noexcept
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-07-27 13:43:17 +03:00
Avi Kivity
2cb5f79e9d logalloc, dirty_memory_manager: move size-tracking binomial heap out of logalloc
The region_group mechanism used an intrusive heap handle embedded in
logalloc::region to allow region_group:s to track the largest region. But
with region_group moved out of logalloc, the handle is out of place.

Move it out, introducing a new intermediate class size_tracked_region
to hold the heap handle. We might eventually merge the new class into
memtable (which derives from it), but that requires a large rearrangement
of unit tests, so defer that.
2022-07-26 11:12:10 +03:00
Avi Kivity
ee720fa23b logalloc: relax lifetime rules around region_listener
Currently, a region_listener is added during construction and removed
during destruction. This was done to mimic the old region(region_group&)
constructor, as region_listener replaces region_group.

However, this makes moving the binomial heap handle outside logalloc
difficult. The natural place for the handle is in a derived class
of logalloc::region (e.g. memtable), but members of this derived class
will be destroyed earlier than the logalloc::region here. We could play
tricks with an earlier base class but it's better to just decouple
region lifecycle from listener lifecycle.

Do that by adding listen()/unlisten() methods. Some small awkwardness
remains in that merge() implicitly unlistens (see comment in
region::unlisten).

Unit tests are adjusted.
2022-07-26 11:12:10 +03:00
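The lifetime problem described in the commit above, and the listen()/unlisten() approach to it, can be illustrated with a simplified standalone sketch. The classes below are hypothetical stand-ins for `logalloc::region`, `region_listener` and a derived class such as memtable; merge() and the heap handle are left out.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical, simplified stand-in for region_listener.
struct region_listener {
    virtual ~region_listener() = default;
    virtual void size_changed(std::size_t new_size) = 0;
};

// Hypothetical, simplified stand-in for logalloc::region.
class region {
    region_listener* _listener = nullptr;
    std::size_t _size = 0;
public:
    // Listening is no longer tied to construction and destruction.
    void listen(region_listener* l) noexcept {
        assert(!_listener);
        _listener = l;
    }
    void unlisten() noexcept { _listener = nullptr; }

    void grow(std::size_t bytes) {
        _size += bytes;
        if (_listener) {
            _listener->size_changed(_size);
        }
    }
};

struct size_recorder final : region_listener {
    std::size_t last = 0;
    void size_changed(std::size_t n) override { last = n; }
};

// The derived class owns the listener-side state. Its members are
// destroyed before the base region, so it detaches explicitly first.
struct memtable_like : region {
    size_recorder recorder;
    memtable_like() { listen(&recorder); }
    ~memtable_like() { unlisten(); }
};
```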
Avi Kivity
fbe8ea7727 logalloc, dirty_memory_manager: move region_group and associated code
region_group is an abstraction that allows accounting for groups of
regions, but the cost/benefit ratio of maintaining the abstraction
is poor. Each time we need to change the decision algorithm of memtable
flushing (admittedly rarely), we need to distill that into an abstraction
for region_groups and then use it. An example is virtual region groups;
we wanted to account for the partially flushed memtables and had to
invent region groups to stand in their place.

Rather than continuing to invest in the abstraction, break it now
and move it to the memtable dirty memory manager which is responsible
for making those decisions. The relevant code is moved to
dirty_memory_manager.hh and dirty_memory_manager.cc (new file), and
a new unit test file is added as well.

A downside of the change is that unit testing will be more difficult.
2022-07-26 11:12:10 +03:00
Avi Kivity
c91ee9d04e logalloc: decouple region_group from region
As a first step in moving region_group away from logalloc, decouple
communications between region and region_group. We introduce region_listener,
that listens for the events that region passed directly to region_group.
A region_group now installs a region_listener in a region, instead of
having region know about the region_group directly.

This decoupling is still leaky:
 - merge() chooses to forget the merged-from region's region_listener.
   This happens to be suitable for the only user of merge().
 - We're still embedding the binomial heap handle, used by region_group
   to keep track of region sizes, in regions. A complete decoupling would
   transfer that responsibility to region_group.
2022-07-26 11:12:03 +03:00
Avi Kivity
cb1251199a memtable: stop using logalloc::region::group() to test for flushed memtables
Currently, the memtable reader uses logalloc::region::group() to test
for whether a memtable has been flushed. If a memtable doesn't belong
to a region group (from dirty_memory_manager), it is flushed.

This is quite tortuous - logalloc::region::merge() makes the merged-from
region identical to the merged-to region. The merged-to region, the cache,
doesn't have a group, so the check works.

Since we're making region groups part of dirty_memory_manager, the cache
will no longer have this indirect way of communication with memtable. But
instead we can use a direct callback it already has -
on_detach_from_region_group(). Use that to set a flag, and examine it in
the read path.
2022-07-26 11:07:25 +03:00
Botond Dénes
6e20cb3255 Merge 'database_test: test_truncate_without_snapshot_during_writes: apply mutation on the correct shard' from Benny Halevy
Currently, all the mutations this test generates are applied on shard 0.
In rare cases, this may lead to the following crash, when the flushed
sstable doesn't contain any key that belongs to the current shard,
as seen in https://jenkins.scylladb.com/job/releng/job/Scylla-CI/1390/artifact/testlog/x86_64/dev/database_test.test_truncate_without_snapshot_during_writes.114.log
```
WARN  2022-07-17 17:41:36,630 [shard 0] sstable - create_sharding_metadata: range=[{-468459073612751032, pk{00046b657930}}, {-468459073612751032, pk{00046b657930}}] has no intersection with shard=0 first_key={key: pk{00046b657930}, token:-468459073612751032} last_key={key: pk{00046b657930}, token:-468459073612751032} ranges_single_shard=[] ranges_all_shards={{1, {[{-468459073612751032, pk{00046b657930}}, {-468459073612751032, pk{00046b657930}}]}}}
ERROR 2022-07-17 17:41:36,630 [shard 0] table - failed to write sstable /jenkins/workspace/releng/Scylla-CI/scylla/testlog/x86_64/dev/scylla-e2b694c7-db4f-4f9d-9940-9c6c21850888/ks/cf-8f74aba005de11ed92fa8661a0ed7890/me-2-big-Data.db: std::runtime_error (Failed to generate sharding metadata for /jenkins/workspace/releng/Scylla-CI/scylla/testlog/x86_64/dev/scylla-e2b694c7-db4f-4f9d-9940-9c6c21850888/ks/cf-8f74aba005de11ed92fa8661a0ed7890/me-2-big-Data.db)
ERROR 2022-07-17 17:41:36,631 [shard 0] table - Memtable flush failed due to: std::runtime_error (Failed to generate sharding metadata for /jenkins/workspace/releng/Scylla-CI/scylla/testlog/x86_64/dev/scylla-e2b694c7-db4f-4f9d-9940-9c6c21850888/ks/cf-8f74aba005de11ed92fa8661a0ed7890/me-2-big-Data.db). Aborting, at 0x329e28e 0x329e780 0x329ea88 0xf5bc69 0xf956b1 0x3196dc4 0x3198037 0x319742a 0x32be2e4 0x32bd8e1 0x32ba01c 0x317f97d /lib64/libpthread.so.0+0x92a4 /lib64/libc.so.6+0x100322
```

Instead, generate random keys and apply them on their
owning shard, and truncate all database shards.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>

Closes #11066

* github.com:scylladb/scylla:
  database_test: test_truncate_without_snapshot_during_writes: apply mutation on the correct shard
  table: try_flush_memtable_to_sstable: consume: close reader on error
2022-07-20 09:06:07 +03:00
Avi Kivity
1f21c1ecc8 Merge "Add IO throttling to streaming class" from Pavel E
"
Same thing was done for compaction class some time ago, now
it's time for streaming to keep repair-generated IO in bounds.
This set mostly resembles the one for compaction IO class with
the exception that boot-time reshard/reshape currently runs in
streaming class, but that's not great if the class is throttled,
so the set also moves boot-time IO into default IO class.
"

* 'br-streaming-class-throttling-2' of https://github.com/xemul/scylla:
  distributed_loader: Populate keyspaces in default class
  streaming: Maintain class bandwidth
  streaming: Pass db::config& to manager constructor
  config: Add stream_io_throughput_mb_per_sec option
  sstables: Keep priority class on sstable_directory
2022-07-19 17:10:25 +03:00
Benny Halevy
f60ff44fdf table: try_flush_memtable_to_sstable: consume: close reader on error
If an exception is thrown in `consume` before
write_memtable_to_sstable is called or if the latter fails,
we must close the reader passed to it.

Fixes #11075

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-07-19 16:35:59 +03:00
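The rule stated in the commit above (close the reader on every failure path) can be sketched in plain C++. The `reader` type and `write_to_sstable` function are hypothetical stand-ins, and the sketch assumes that the writer closes the reader itself when it succeeds; the real code is asynchronous and differs in detail.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical reader that must be closed explicitly before it is destroyed.
struct reader {
    bool closed = false;
    void close() noexcept { closed = true; }
};

// Hypothetical stand-in for write_memtable_to_sstable.
// Assumption: on success it closes the reader; on failure it does not.
void write_to_sstable(reader& r, bool fail) {
    if (fail) {
        throw std::runtime_error("write failed");
    }
    r.close();
}

// Whether the failure happens before the write or inside it,
// the reader is closed before the error propagates.
void consume(reader& r, bool fail_early, bool fail_write) {
    try {
        if (fail_early) {
            throw std::runtime_error("failed before write");
        }
        write_to_sstable(r, fail_write);
    } catch (...) {
        r.close();
        throw;
    }
}
```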
Igor Ribeiro Barbosa Duarte
3b19bcf1a1 memtable_flush: Make memtable_flush_static_shares liveupdateable
This patch makes memtable_flush_static_shares liveupdateable
to avoid having to restart the cluster after updating
this config.

Signed-off-by: Igor Ribeiro Barbosa Duarte <igor.duarte@scylladb.com>
2022-07-19 10:10:46 -03:00
Igor Ribeiro Barbosa Duarte
8dd0f4672d compaction: Make compaction_static_shares liveupdateable
This patch makes compaction_static_shares liveupdateable
to avoid having to restart the cluster after updating
this config.

Signed-off-by: Igor Ribeiro Barbosa Duarte <igor.duarte@scylladb.com>
2022-07-19 10:10:46 -03:00
Igor Ribeiro Barbosa Duarte
c2ee6492e6 backlog_controller: Unify backlog_controller constructors
This patch adds the _static_shares variable to the backlog_controller so that
instead of having to use a separate constructor when controller is disabled,
we can use a single constructor and periodically check on the adjust method
if we should use the static shares or the controller. This will be useful on
the next patches to make compaction_static_shares and memtable_flush_static_shares
live updateable.

Signed-off-by: Igor Ribeiro Barbosa Duarte <igor.duarte@scylladb.com>
2022-07-19 10:06:12 -03:00
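The single-constructor idea in the commit above can be sketched as follows. This is a simplified, hypothetical controller, not Scylla's `backlog_controller`: the linear mapping from backlog to shares and the convention that a static value of 0 means "use the controller" are assumptions made for the example.

```cpp
#include <cassert>

// Hypothetical, simplified controller for illustration only.
class backlog_controller {
    float _static_shares;  // > 0 overrides the controller (assumed convention)
    float _shares = 0;

    // Illustrative linear mapping from a backlog in [0, 1] to shares.
    static float shares_for(float backlog) noexcept {
        return 50 + 950 * backlog;
    }
public:
    // One constructor covers both the enabled and the disabled controller.
    explicit backlog_controller(float static_shares = 0) noexcept
        : _static_shares(static_shares) {}

    // May be called at runtime, which is what makes the option live-updateable.
    void set_static_shares(float s) noexcept {
        assert(s >= 0);
        _static_shares = s;
    }

    // Called periodically; re-checks the static value on every run.
    void adjust(float backlog) noexcept {
        _shares = _static_shares > 0 ? _static_shares : shares_for(backlog);
    }

    float shares() const noexcept { return _shares; }
};
```

Because `adjust` reads the static value each time it runs, a configuration change takes effect on the next adjustment without a restart.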
Pavel Emelyanov
55d4fa49f7 distributed_loader: Populate keyspaces in default class
The streaming class throughput can be limited with the respective option.
Doing boot-time reshard/reshape doesn't need to obey it, as the node is
not yet up but instead should get there as soon as possible.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-07-19 12:21:13 +03:00
Pavel Emelyanov
a56e2c83f3 sstables: Keep priority class on sstable_directory
Current code accepts priority class as an argument to various functions
that need it and all its callers use streaming class. Next patches will
need to sometimes use default class, but it will require heavy patching
of the distributed loader. Things get simpler if the priority class is
kept on sstable_directory on start.

This change also simplifies the ongoing effort on unification of sched
and IO classes.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-07-19 12:14:41 +03:00
Pavel Emelyanov
62d95f09de view: De-futurize make_view_update_builder()
It doesn't sleep, just returns a ready future with the builder

tests: https://jenkins.scylladb.com/job/releng/job/Scylla-CI/1384
       it's red because e-mail notification is broken (scylla-pkg#2988)

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Message-Id: <20220718132529.30751-1-xemul@scylladb.com>
2022-07-18 17:15:48 +03:00
Botond Dénes
9afd2dc428 Merge 'Make compaction manager switch to table abstraction ' from Raphael "Raph" Carvalho
This work gets us a step closer to compaction groups.

Everything in compaction layer but compaction_manager was converted to table_state.

After this work, we can start implementing compaction groups, as each group will be represented by its own table_state. User-triggered operations that span the entire table, not only a group, can be done by calling the manager operation on behalf of each group and then merging the results, if any.

Closes #11028

* github.com:scylladb/scylla:
  compaction: remove forward declaration of replica::table
  compaction_manager: make add() and remove() switch to table_state
  compaction_manager: make run_custom_job() switch to table_state
  compaction_manager: major: switch to table_state
  compaction_manager: scrub: switch to table_state
  compaction_manager: upgrade: switch to table_state
  compaction: table_state: add get_sstables_manager()
  compaction_manager: cleanup: switch to table_state
  compaction_manager: offstrategy: switch to table_state()
  compaction_manager: rewrite_sstables(): switch to table_state
  compaction_manager: make run_with_compaction_disabled() switch to table_state
  compaction_manager: compaction_reenabler: switch to table_state
  compaction_manager: make submit(T) switch to table_state
  compaction_manager: task: switch to table_state
  compaction: table_state: Add is_auto_compaction_disabled_by_user()
  compaction: table_state: Add on_compaction_completion()
  compaction: table_state: Add make_sstable()
  compaction_manager: make can_proceed switch to table_state
  compaction_manager: make stop compaction procedures switch to table_state
  compaction_manager: make get_compactions() switch to table_state
  compaction_manager: change task::update_history() to use table_state instead
  compaction_manager: make can_register_compaction() switch to table_state
  compaction_manager: make get_candidates() switch to table_state
  compaction_manager: make propagate_replacement() switch to table_state
  compaction: Move table::in_strategy_sstables() and switch to table_state
  compaction: table_state: Add maintenance sstable set
  compaction_manager: make has_table_ongoing_compaction() switch to table_state
  compaction_manager: make compaction_disabled() switch to table_state
  compaction_manager: switch to table_state for mapping of compaction_state
  compaction_manager: move task ctor into source
2022-07-18 15:18:29 +03:00
Benny Halevy
bbbbea65fb database: clear_snapshot: remove dropped table directory when it has no remaining snapshots
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-07-17 14:33:34 +03:00
Benny Halevy
c70a675d77 database: clear_snapshot: make it a coroutine and use thread
and use an async thread around `directory_lister`
rather than `lister::scan_dir` to simplify the implementation.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-07-17 14:33:34 +03:00
Benny Halevy
d7564b9081 database: make drop_column_family private
Now that all users are converted to use the public
entry point - drop_table_on_all.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-07-17 14:33:34 +03:00
Benny Halevy
ae3b1b5a64 database_test: drop_table_with_snapshots: test auto_snapshot
Refactor test_drop_table_with_auto_snapshot out of
drop_table_with_snapshots, adding an auto_snapshot param,
controlling how to configure the cql_test_env db::config::auto_snapshot,
so we can test both cases - auto_snapshot enabled and disabled.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-07-17 14:33:34 +03:00
Benny Halevy
2e37dcf62a database: drop_table_on_all_shards: remove table directory having no snapshots
If the table to remove has no snapshots then
completely remove its directory on storage
as the left-over directory slows down operations on the keyspace
and makes searching for live tables harder.

Fixes #10896

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-07-17 14:33:34 +03:00
Benny Halevy
e005629afb database: add drop_table_on_all_shards
Runs drop_column_family on all database shards.
Will be extended later to consider removing the table directory.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-07-17 14:33:34 +03:00
Raphael S. Carvalho
a94d974835 compaction_manager: make add() and remove() switch to table_state
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
2022-07-16 21:35:06 -03:00
Raphael S. Carvalho
9a1efc69d0 compaction_manager: major: switch to table_state
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
2022-07-16 21:35:06 -03:00