Commit Graph

81 Commits

Author SHA1 Message Date
Lakshmi Narayanan Sreethar
3c1fd843c8 [Backport 6.0]: sstables: do not reload components of unlinked sstables
The SSTable is removed from the reclaimed memory tracking logic only
when its object is deleted. However, there is a risk that the Bloom
filter reloader may attempt to reload the SSTable after it has been
unlinked but before the SSTable object is destroyed. Prevent this by
removing the SSTable from the reclaimed list maintained by the manager
as soon as it is unlinked.

The original logic that updated the memory tracking in
`sstables_manager::deactivate()` is left in place as (a) the variables
have to be updated only when the SSTable object is actually deleted, as
the memory used by the filter is not freed as long as the SSTable is
alive, and (b) the `_reclaimed.erase(*sst)` is still useful during
shutdown, for example, when the SSTable is not unlinked but just
destroyed.

Fixes https://github.com/scylladb/scylladb/issues/19722

Closes scylladb/scylladb#19717

* github.com:scylladb/scylladb:
  boost/bloom_filter_test: add testcase to verify unlinked sstables are not reloaded
  sstables: do not reload components of unlinked sstables
  sstables/sstables_manager: introduce on_unlink method

(cherry picked from commit 591876b44e)

Backported from #19717 to 6.0

Closes scylladb/scylladb#19830
2024-07-23 23:16:53 +03:00
Lakshmi Narayanan Sreethar
6f58768c46 sstables_manager: use maintenance scheduling group to run components reload fiber
Fixes #18675

Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com>
2024-05-19 15:23:45 +05:30
Lakshmi Narayanan Sreethar
79f6746298 sstables_manager: add member to store maintenance scheduling group
Store that maintenance scheduling group inside the sstables_manager. The
next patch will use this to run the components reloader fiber.

Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com>
2024-05-19 15:23:45 +05:30
Lakshmi Narayanan Sreethar
0b061194a7 sstables_manager: reload previously reclaimed components when memory is available
When an SSTable is dropped, the associated bloom filter gets discarded
from memory, bringing down the total memory consumption of bloom
filters. Any bloom filter that was previously reclaimed from memory due
to the total usage crossing the threshold, can now be reloaded back into
memory if the total usage can still stay below the threshold. Added
support to reload such reclaimed filters back into memory when memory
becomes available.

Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com>
2024-05-09 17:49:22 +05:30
Lakshmi Narayanan Sreethar
f758d7b114 sstables_manager: start a fiber to reload components
Start a fiber that gets notified whenever an sstable gets deleted. The
fiber doesn't do anything yet but the following patch will add support
to reload reclaimed components if there is sufficient memory.

Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com>
2024-05-09 17:49:22 +05:30
Lakshmi Narayanan Sreethar
2340ab63c6 sstables_manager: add new intrusive set to track the reclaimed sstables
The new set holds the sstables from where the memory has been reclaimed
and is sorted in ascending order of the total memory reclaimed.

Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com>
2024-05-09 17:48:58 +05:30
Lakshmi Narayanan Sreethar
02d272fdb3 sstable: track memory reclaimed from components per sstable
Added a member variable _total_memory_reclaimed to the sstable class
that tracks the total memory reclaimed from a sstable.

Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com>
2024-05-09 17:48:58 +05:30
Lakshmi Narayanan Sreethar
a36965c474 sstables_manager: support reclaiming memory from components
Reclaim memory from the SSTable that has the most reclaimable memory if
the total reclaimable memory has crossed the threshold. Only the bloom
filter memory is considered reclaimable for now.

Fixes #17747

Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com>
2024-04-02 01:37:47 +05:30
Lakshmi Narayanan Sreethar
2ca4b0a7a2 sstables_manager: store available memory size
The available memory size is required to calculate the reclaim memory
threshold, so store that within the sstables manager.

Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com>
2024-04-02 01:37:47 +05:30
Lakshmi Narayanan Sreethar
f05bb4ba36 sstables_manager: add variable to track component memory usage
sstables_manager::_total_reclaimable_memory variable tracks the total
memory that is reclaimable from all the SSTables managed by it.

Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com>
2024-04-02 01:37:47 +05:30
Avi Kivity
e48eb76f61 sstables_manager: decouple from system_keyspace
sstables_manager now depends on system_keyspace for access to the
system.sstables table, needed by object storage. This violates
modularity, since sstables_manager is a relatively low-level leaf
module while system_keyspace integrates large parts of the system
(including, indirectly, sstables_manager).

One area where this is grating is sstables::test_env, which has
to include the much higher level cql_test_env to accommodate it.

Fix this by having sstables_manager expose its dependency on
system_keyspace as an interface, sstables_registry, and have
system_keyspace implement the glue logic in
system_keyspace_sstables_manager.

Closes scylladb/scylladb#17868
2024-03-18 20:38:07 +03:00
Kefu Chai
d1dd71fbd7 mutation: do not include unused headers
these unused includes were identified by clangd. see
https://clangd.llvm.org/guides/include-cleaner#unused-include-warning
for more details on the "Unused include" warning.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#16889
2024-01-21 16:58:26 +02:00
Botond Dénes
e1b30f50be reader_concurrency_semaphore: add register_metrics constructor parameter
To be used in the next patch to control whether the semaphore registers
and exports metrics or not. We want to move metric registration to the
semaphore but we don't want all semaphores to export metrics. The
decision on whether a semaphore should or shouldn't export metrics
should be made on a case-by-case basis so this new parameter has no
default value (except for the for_tests constructor).
2023-12-13 06:25:45 -05:00
Avi Kivity
814f3eb6b5 sstables: name sstables_manager
Soon, the reader_concurrency_semaphore will require a unique
and meaningful name in order to label its metrics. To prepare
for that, name sstable_manager instances. This will be used
to generate a name for sstable_manager's reader_concurrency_semaphore.
2023-12-13 04:40:33 -05:00
Pavel Emelyanov
604279f064 sstables/storage: Reimplement atomic deletion in sstables_manager
Right now the atomic deletion is called on manager, but it gets the
actual deletion function from storage and off-loads the deletion to it.
This patch makes the manager fully responsible for the delition by
implemeting the sequence of

    auto ctx = storage.prepare()
    for sst in sstables:
        sst.unlink()
    storage.complate(ctx)

Storage implementations provide the prepare/complete methods. The
filesystem storage does it via deletion log and the s3 storage is still
not atomic :(

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-12-05 16:46:01 +03:00
Pavel Emelyanov
2bf1e2a294 sstables: Throw early if endpoint for keyspace is not configured
When a keyspace is created it initiaizes the storage for it and
initialization of S3 storage is the good place to check if the endpoint
for the storage is configured at all.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-11-20 15:25:58 +03:00
Pavel Emelyanov
f2a99ad30a replica: Move storage options validation to sstables manager
Currently the cql statement .validate() callback is responsible for
checking if the non-local storage options are allowed with the
respective feature. Next patch will need to extend this check to also
validate the details of the provided storage options, but doing it at
cql level doesn't seem correct -- it's "too far" from query processor
down to sstables manager.

Good news is that there's a lower-level validation of the new keyspace,
namely the database::validate_new_keyspace() call. Move the storage
options validation into sstables manager, while at it, reimplement it
as a visitor to facilitate further extentions and plug the new
validation to the aforementioned database::validate_new_keyspace().

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-11-20 15:24:59 +03:00
Pavel Emelyanov
2c31cd7817 sstables: Add has_endpoint_client() helper to manager
It's the get_endpoint_client() peer that only checks the client
presense. To be used by next patches.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-11-20 14:31:08 +03:00
Pavel Emelyanov
11b704e8b8 replica/{ks|cf}: Move storage init/destroy to sstables manager
It's the manager that knows about storages and it should init/destroy
it. Also the "upload" and "staging" paths are about to be hidden in
sstables/ code, this code move also facilitates that.

The indentation in storage.cc is deliberately broken to make next patch
look nicer (spoiler: it won't have to shift those lines right).

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-11-08 20:23:16 +03:00
Pavel Emelyanov
182a5348d4 code: Configure s3 clients' memory usage
This sets the real limits on the memory semaphore.

- scylla sets it to 1% of total memory, 10Mb min, 100Mb max
- tests set it to 16Mb
- perf test sets it to all available memory

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-09-20 17:50:29 +03:00
Pavel Emelyanov
b299757884 s3::client: Construct client with shared semaphore
The semaphore will be used to cap memory consumption by client. This
patch makes sure the reference to a semaphore exists as an argument to
client's constructor, not more than that.

In scylla binary, the semaphore sits on storage_manager. In tests the
semaphore is some local object. For now the semaphore is unused and is
initialized locked as this patch just pushes the needed argument all the
way around, next patches will make use of it.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-09-20 17:50:07 +03:00
Pavel Emelyanov
f40b4e3e84 sstables::storage_manager: Introduce config
Just an empty config that's fed to storage_manager when constructed as a
preparation for further heavier patching

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-09-20 17:42:59 +03:00
Petr Gusev
b90011294d config.cc: drop db::config::host_id
In this refactoring commit we remove the db::config::host_id
field, as it's hacky and duplicates token_metadata::get_my_id.

Some tests want specific host_id, we add it to cql_test_config
and use in cql_test_env.

We can't pass host_id to sstables_manager by value since it's
initialized in database constructor and host_id is not loaded yet.
We also prefer not to make a dependency on shared_token_metadata
since in this case we would have to create artificial
shared_token_metadata in many tools and tests where sstables_manager
is used. So we pass a function that returns host_id to
sstables_manager constructor.
2023-09-13 23:00:15 +04:00
Kefu Chai
86e8be2dcd replica:database: log if endpoint not found
if the endpoint specified when creating a KEYSPACE is not found,
when flushing a memtable, we would throw an `std::out_of_range`
exception when looking up the client in `storage_manager::_s3_endpoints`
by the name of endpoint. and scylla would crash because of it. so
far, we don't have a good way to error out early. since the
storage option for keyspace is still experimental, we can live
with this, but would be better if we can spot this error in logging
messages when testing this feature.

also, in this change, `std::invalid_argument` is thrown instead of
`std::out_of_range`. it's more appropriate in this circumstance.

Refs #15074
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes #15075
2023-08-28 10:51:19 +03:00
Pavel Emelyanov
ef25352412 sstable: Construct it with state
This just moves make_path() call from outside of sstable::sstable()
inside it. Later it will be moved even further. Also, now sstable can
know its state and keep it (next patch)

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-08-14 15:28:54 +03:00
Pavel Emelyanov
1f247f0b05 sstables_manager: Remove state-less make_sstable()
Now all callers specify the state they want their sstables in explicitly
and the old API can be removed

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-08-14 15:28:54 +03:00
Pavel Emelyanov
249a6a4d27 sstable: Introduce state enum
There are several states between which an sstable can migrate. Nowadays
the state is encoded into sstable directory, which is not nice. Also S3
backed sstables don't support states only keeping sstables in "normal".

This patch adds enum state in order to replace the path-encoded one
eventually. The new sstables_manager::make_sstable() method is added
that accepts table directory (without quarantine/ or staging/ component)
and the desired initial state (optional). Next patches will make use of
this maker and the existing one will be removed.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-08-14 14:45:52 +03:00
Kefu Chai
949bb719cd sstables: use try_emplace() when appropriate
so we don't have to search in the unordered_map twice. and it's
more readable, as we don't need to compare an iterator with the
sentry.

also, take the opportunity to simplify the code by using the
temporary `s3_cfg` when possible instead of `it->second.cfg`
which is less readable.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2023-07-04 15:40:10 +08:00
Kefu Chai
dcfbc85485 replica,sstable: do not assign a value to a shared_ptr
instead using the operator=(T&&) to assign an instance of `T` to a
shared_ptr, assign a new instance of shared_ptr to it.

unlike std::shared_ptr, seastar::shared_ptr allows us to move a value
into the existing value pointed by shared_ptr with operator=(). the
corresponding change in seastar is
319ae0b530.
but this is a little bit confusing, as the behavior of a shared_ptr
should look like a pointer instead the value pointed by it. and this
could be error-prune, because user could use something like
```c++
p = std::string();
```
by accident, and expect that the value pointed by `p` is cleared.
and all copies of this shared_ptr are updated accordingly. what
he/she really wants is:
```c++
*p = std::string();
```
and the code compiles, while the outcome of the statement is that
the pointee of `p` is destructed, and `p` now points to a new
instance of string with a new address. the copies of this
instance of shared_ptr still hold the old value.

this behavior is not expected. so before deprecating and removing
this operator. let's stop using it.

in this change, we update two caller sites of the
`lw_shared_ptr::operator=(T&&)`. instead of creating a new instance
pointee of the pointer in-place, a new instance of lw_shared_ptr is
created, and is assigned to the existing shared_ptr.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2023-07-04 15:39:52 +08:00
Kefu Chai
f014ccf369 Revert "Revert "Merge 'treewide: add uuid_sstable_identifier_enabled support' from Kefu Chai""
This reverts commit 562087beff.

The regressions introduced by the reverted change have been fixed.
So let's revert this revert to resurrect the
uuid_sstable_identifier_enabled support.

Fixes #10459
2023-06-21 13:02:40 +03:00
Botond Dénes
562087beff Revert "Merge 'treewide: add uuid_sstable_identifier_enabled support' from Kefu Chai"
This reverts commit d1dc579062, reversing
changes made to 3a73048bc9.

Said commit caused regressions in dtests. We need to investigate and fix
those, but in the meanwhile let's revert this to reduce the disruption
to our workflows.

Refs: #14283
2023-06-19 08:49:27 +03:00
Kefu Chai
939fa087cc sstables, replica: pass uuid_sstable_identifiers to generation generator
before this change, we assume that generation is always integer based.
in order to enable the UUID-based generation identifier if the related
option is set, we should populate this option down to generation generator.

because we don't have access to the cluster features in some places where
a new generation is created, a new accessor exposing feature_service from
sstable manager is added.

Fixes #10459
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2023-06-15 17:54:59 +08:00
Kefu Chai
03be1f438c sstables: move get_components_lister() into sstable_directory
sstables_manager::get_component_lister() is used by sstable_directory.
and almost all the "ingredients" used to create a component lister
are located in sstable_directory. among the other things, the two
implementations of `components_lister` are located right in
`sstable_directory`. there is no need to outsource this to
sstables_manager just for accessing the system_keyspace, which is
already exposed as a public function of `sstables_manager`. so let's
move this helper into sstable_directory as a member function.

with this change, we can even go further by moving the
`components_lister` implementations into the same .cc file.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes #13853
2023-05-18 08:43:35 +03:00
Botond Dénes
20ff122a84 Merge 'Delete S3 sstables without the help of deletion log' from Pavel Emelyanov
There are two layers of stables deletion -- delete-atomically and wipe. The former is in fact the "API" method, it's called by table code when the specific sstable(s) are no longer needed. It's called "atomically" because it's expected to fail in the middle in a safe manner so that subsequent boot would pick the dangling parts and proceed. The latter is a low-level removal function that can fail in the middle, but it's not of _its_ care.

Currently the atomic deletion is implemented with the help of sstable_directory::delete_atomically() method that commits sstables files names into deletion log, then calls wipe (indirectly), then drops the deletion log. On boot all found deletion logs are replayed. The described functionality is used regardless of the sstable storage type, even for S3, though deletion log is an overkill for S3, it's better be implemented with the help of ownership table. In fact, S3 storage already implements atomic deletion in its wipe method thus being overly careful.

So this PR
- makes atomic deletion be storage-specific
- makes S3 wipe non-atomic

fixes: #13016
note: Replaying sstables deletion from ownership table on boot is not here, see #13024

Closes #13562

* github.com:scylladb/scylladb:
  sstables: Implement atomic deleter for s3 storage
  sstables: Get atomic deleter from underlying storage
  sstables: Move delete_atomically to manager and rename
2023-05-15 08:57:47 +03:00
Pavel Emelyanov
6a8139a4fe sstables: Get atomic deleter from underlying storage
While the driver isn't known without the sstable itself, we have a
vector of them can can get it from the front element. This is not very
generic, but fortunately all sstables here belong to the same table and,
respectively, to the same storage and even prefix. The latter is also
assert-checked by the sstable_directory atomic deleter code.

For now S3 storage returns the same directory-based deleter, but next
patch will change that.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-05-12 17:52:13 +03:00
Pavel Emelyanov
5985f00da9 sstables: Move delete_atomically to manager and rename
This is to let manager decide which storage driver to call for atomic
sstables deletion in the next patch. While at it -- rename the
sstable_directory's method into something more descriptive (to make
compiler catch all callers of it).

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-05-12 17:52:12 +03:00
Pavel Emelyanov
613acba5d0 s3: Pick client from manager via handle
Add the global-factory onto the client that is

- cross-shard copyable
- generates a client from local storage_manager by given endpoint

With that the s3 file handle is fixed and also picks up shared s3
clients from the storage manager instead of creating its own one.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-05-11 19:39:01 +03:00
Pavel Emelyanov
63ff6744d8 s3: Live-update clients' configs
Now when the client is accessible directli via the storage_manager, when
the latter is requested to update its endpoint config, it can kick the
client to do the same.

The latter, in turn, can only update the AWS creds info for now. The
endpoint port and https usage are immutable for now.

Also, updating the endpoint address is not possible, but for another
reason -- the endpoint itself is the part of keyspace configuration and
updating one in the object_storage.yaml will have no effect on it.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-05-11 19:39:01 +03:00
Pavel Emelyanov
e6760482b2 sstables: Keep clients shared across sstables
Nowadays each sstable gets its own instance of an s3::client. This patch
keeps clients on storage_manager's endpoints map and when creating a
storage for an sstable -- grab the shared pointer from the map, thus
making one client serve all sstables over there (except for those that
duplicated their files with the help of foreign-info, but that's to be
handled by next patches).

Moving the ownership of a client to the storage_manager level also means
that the client has to be closed on manager's stop, not on sstable
destroy.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-05-11 19:39:01 +03:00
Pavel Emelyanov
743f26040f storage_manager: Rewrap config map
Now the map is endpoint -> config_ptr. Wrap the config_ptr into an
s3_endpoint struct. Next patch will keep the client on this new wrapper
struct thus making them shared between sstables.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-05-11 19:39:01 +03:00
Pavel Emelyanov
a59096aa70 sstables, database: Move object storage config maintenance onto storage_manager
Right now the map<endpoint, config> sits on the sstables manager and its
update is governed by database (because it's peering and can kick other
shards to update it as well).

Having the sharded<storage_manager> at hand lets freeing database from
the need to update configs and keeps sstables_manager a bit smaller.
Also this will allow keeping s3 clients shared between sstables via this
map by next patch.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-05-11 19:39:00 +03:00
Pavel Emelyanov
2153751d45 sstables: Introduce sharded<storage_manager>
The manager in question keeps track of whatever sstables_manager needs
to work with the storage (spoiler: only S3 one). It's main-local sharded
peering service, so that container() call can be used by next patches.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-05-11 19:36:01 +03:00
Pavel Emelyanov
bd1e3c688f sstables_manager: Keep object storage configs onboard
The user sstables manager will need to provide endpoint config for
sstables' storage drivers. For that it needs to get it from db::config
and keep in-sync with its updates.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-05-03 20:19:43 +03:00
Pavel Emelyanov
f04c6cdf9a sstable_directory: Add ownership table components lister
When sstables are stored on object storage, they are "registered" in the
system.sstables_registry ownership table. The sstable_directory is
supposed to list sstables from this table, so here's the respective
components lister.

The lister is created by sstables_manager, by the time it's requested
from the the system keyspace is already plugged. The lister only handles
"sealed" sstables. Dangling ones are still ignored, this is to be fixed
later.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-04-10 16:44:29 +03:00
Pavel Emelyanov
8bd9f7accf sstable_directory: Make components_lister and API
Now the lister is filesystem-specific. There will soon come another one
for S3, so the sstable_directory should be prepared for that by making
the lister an abstract class.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-04-10 16:44:29 +03:00
Pavel Emelyanov
5f7f0117e1 sstable_directory: Create components lister based on storage options
The directory's lister is storage-specific and should be created
differently for different storage options.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-04-10 16:44:29 +03:00
Pavel Emelyanov
e34b86dd61 system_keyspace: Plug to user sstables manager too
The sharded<sys_ks> instances are plugged to large data handler and
compaction manager to maintain the circular dependency between these
components via the interposing database instance. Do the same for user
sstables manager, because S3 driver will need to update the local
ownership table.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-04-10 16:43:01 +03:00
Pavel Emelyanov
4bb885b759 sstable: Make storage instance based on storage options
This patch adds storage options lw-ptr to sstables_manager::make_sstable
and makes the storage instance creation depend on the options. For local
it just creates the filesystem storage instance, for S3 -- throws, but
next patch will fix that.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-04-10 16:43:01 +03:00
Pavel Emelyanov
df5384cb1e sstable_directory: Keep components_lister aboard
The lister is supposed to be alive throughout .process_sstable_dir() and
can die after .commit_directory_changes().

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-02-21 16:32:06 +03:00
Botond Dénes
8658cfc066 reader_concurrency_semaphore: add {serialize,kill}_limit_multiplier parameters
Propagate the recently added
reader_concurrency_semaphore_{serialize,kill}_limit_multiplier config items
to the semaphore. Not used yet.
2023-01-16 02:05:27 -05:00