Commit Graph

661 Commits

Author SHA1 Message Date
Botond Dénes
4f165eb3e9 sstables: introduce load_metadata()
Loads just the metadata components. No validation.
Split off from load(), to allow scylla-sstable to partially load an
sstable.

(cherry picked from commit 435c01d1e6)
2024-07-12 10:36:59 +00:00
Raphael S. Carvalho
715ae689c0 Implement fast streaming for intra-node migration
With intra-node migration, all the movement is local, so we can make
streaming faster by just cloning the sstable set of leaving replica
and loading it into the pending one.

This cloning is underlying storage specific, but s3 doesn't support
snapshot() yet (th sstables::storage procedure which clone is built
upon). It's only supported by file system, with help of hard links.
A new generation is picked for new cloned sstable, and it will
live in the same directory as the original.

A challenge I bumped into was to understand why table refused to
load the sstable at pending replica, as it considered them foreign.
Later I realized that sharder (for reads) at this stage of migration
will point only to leaving replica. It didn't fail with mutation
based streaming, because the sstable writer considers the shard --
that the sstable was written into -- as its owner, regardless of what
sharder says. That was fixed by mimicking this behavior during
loading at pending.

test:
./test.py --mode=dev intranode --repeat=100 passes.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
2024-05-16 00:28:47 +02:00
Lakshmi Narayanan Sreethar
54bb03cff8 sstables: support reloading reclaimed components
Added support to reload components from which memory was previously
reclaimed as the total memory of reclaimable components crossed a
threshold. The implementation is kept simple as only the bloom filters
are considered reclaimable for now.

Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com>
2024-05-09 17:48:58 +05:30
Lakshmi Narayanan Sreethar
140d8871e1 sstable: add link and comparator class to support new instrusive set
Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com>
2024-05-09 17:48:58 +05:30
Lakshmi Narayanan Sreethar
3ef2f79d14 sstable: renamed intrusive list link type
Renamed the intrusive list link type to differentiate it from the set
link type that will be added in an upcoming patch.

Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com>
2024-05-09 17:48:58 +05:30
Lakshmi Narayanan Sreethar
02d272fdb3 sstable: track memory reclaimed from components per sstable
Added a member variable _total_memory_reclaimed to the sstable class
that tracks the total memory reclaimed from a sstable.

Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com>
2024-05-09 17:48:58 +05:30
Kefu Chai
168ade72f8 treewide: replace formatter<std::string_view> with formatter<string_view>
in in {fmt} before v10, it provides the specialization of `fmt::formatter<..>`
for `std::string_view` as well as the specialization of `fmt::formatter<..>`
for `fmt::string_view` which is an implementation builtin in {fmt} for
compatibility of pre-C++17. and this type is used even if the code is
compiled with C++ stadandard greater or equal to C++17. also, before v10,
the `fmt::formatter<std::string_view>::format()` is defined so it accepts
`std::string_view`. after v10, `fmt::formatter<std::string_view>` still
exists, but it is now defined using `format_as()` machinery, so it's
`format()` method does not actually accept `std::string_view`, it
accepts `fmt::string_view`, as the former can be converted to
`fmt::string_view`.

this is why we can inherit from `fmt::formatter<std::string_view>` and
use `formatter<std::string_view>::format(foo, ctx);` to implement the
`format()` method with {fmt} v9, but we cannot do this with {fmt} v10,
and we would have following compilation failure:

```
FAILED: service/CMakeFiles/service.dir/RelWithDebInfo/topology_state_machine.cc.o
/home/kefu/.local/bin/clang++ -DFMT_DEPRECATED_OSTREAM -DFMT_SHARED -DSCYLLA_BUILD_MODE=release -DSEASTAR_API_LEVEL=7 -DSEASTAR_LOGGER_COMPILE_TIME_FMT -DSEASTAR_LOGGER_TYPE_STDOUT -DSEASTAR_SCHEDULING_GROUPS_COUNT=16 -DSEASTAR_SSTRING -DXXH_PRIVATE_API -DCMAKE_INTDIR=\"RelWithDebInfo\" -I/home/kefu/dev/scylladb -I/home/kefu/dev/scylladb/build/gen -I/home/kefu/dev/scylladb/seastar/include -I/home/kefu/dev/scylladb/build/seastar/gen/include -I/home/kefu/dev/scylladb/build/seastar/gen/src -ffunction-sections -fdata-sections -O3 -g -gz -std=gnu++20 -fvisibility=hidden -Wall -Werror -Wextra -Wno-error=deprecated-declarations -Wimplicit-fallthrough -Wno-c++11-narrowing -Wno-deprecated-copy -Wno-mismatched-tags -Wno-missing-field-initializers -Wno-overloaded-virtual -Wno-unsupported-friend -Wno-enum-constexpr-conversion -Wno-unused-parameter -ffile-prefix-map=/home/kefu/dev/scylladb=. -march=westmere -mllvm -inline-threshold=2500 -fno-slp-vectorize -U_FORTIFY_SOURCE -Werror=unused-result -MD -MT service/CMakeFiles/service.dir/RelWithDebInfo/topology_state_machine.cc.o -MF service/CMakeFiles/service.dir/RelWithDebInfo/topology_state_machine.cc.o.d -o service/CMakeFiles/service.dir/RelWithDebInfo/topology_state_machine.cc.o -c /home/kefu/dev/scylladb/service/topology_state_machine.cc
/home/kefu/dev/scylladb/service/topology_state_machine.cc:254:41: error: no matching member function for call to 'format'
  254 |     return formatter<std::string_view>::format(it->second, ctx);
      |            ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~
/usr/include/fmt/core.h:2759:22: note: candidate function template not viable: no known conversion from 'seastar::basic_sstring<char, unsigned int, 15>' to 'const fmt::basic_string_view<char>' for 1st argument
 2759 |   FMT_CONSTEXPR auto format(const T& val, FormatContext& ctx) const
      |                      ^      ~~~~~~~~~~~~
```

because the inherited `format()` method actually comes from
`fmt::formatter<fmt::string_view>`. to reduce the confusion, in this
change, we just inherit from `fmt::format<string_view>`, where
`string_view` is actually `fmt::string_view`. this follows
the document at
https://fmt.dev/latest/api.html#formatting-user-defined-types,
and since there is less indirection under the hood -- we do not
use the specialization created by `FMT_FORMAT_AS` which inherit
from `formatter<fmt::string_view>`, hopefully this can improve
the compilation speed a little bit. also, this change addresses
the build failure with {fmt} v10.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#18299
2024-04-19 07:44:07 +03:00
Lakshmi Narayanan Sreethar
4f0aee62d1 sstables: support reclaiming memory from components
Added support to track total memory from components that are reclaimable
and to reclaim memory from them if and when required. Right now only the
bloom filters are considered as reclaimable components but this can be
extended to any component in the future.

Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com>
2024-04-02 01:37:47 +05:30
Avi Kivity
e48eb76f61 sstables_manager: decouple from system_keyspace
sstables_manager now depends on system_keyspace for access to the
system.sstables table, needed by object storage. This violates
modularity, since sstables_manager is a relatively low-level leaf
module while system_keyspace integrates large parts of the system
(including, indirectly, sstables_manager).

One area where this is grating is sstables::test_env, which has
to include the much higher level cql_test_env to accommodate it.

Fix this by having sstables_manager expose its dependency on
system_keyspace as an interface, sstables_registry, and have
system_keyspace implement the glue logic in
system_keyspace_sstables_manager.

Closes scylladb/scylladb#17868
2024-03-18 20:38:07 +03:00
Kefu Chai
5754b9eb08 sstables: add fmt::foramtter for sstable_state
before this change, we rely on the default-generated fmt::formatter
created from operator<<, but fmt v10 dropped the default-generated
formatter.

in this change, we define formatters for `sstables::sstable_state`,
drop its operator<<.

Refs #13245

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2024-02-23 13:55:49 +08:00
Kefu Chai
07da9fd197 sstable: change sstable_touch_directory_io_check() to accept fs::path
this change is a follow-up of 637dd730. the goal is to use
std::filesystem::path for manipulating paths, and to avoid the
converting between sstring and fs::path back and forth.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#17214
2024-02-08 10:01:47 +03:00
Kefu Chai
2c859bc310 sstables: let state_to_dir(sstable_state) return string_view
state_to_dir(sstable_state) translate the enum to the corresponding
directory component. and it returns a `seastar::sstring`. not all
the callers of this function expect a full-blown sstring instance,
on the contrary, quite a few of them just want a string-alike object
which represents the directory component, so they can use it, for
instance to compose a path, or just format the given `state` enum.

so to avoid the overhead of creating/destroying the `seastar::sstring`
instance, let's switch to `std::string_view`. with this change, we
will be able to implement the fmt::formatter for `sstable_state`
without the help of the formatter of sstring.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#17213
2024-02-08 10:00:08 +03:00
Avi Kivity
8ee75ae8f4 sstables: writer: don't require effective_replication_map for sharding metadata
Currently, we pass an effective_replication_map_ptr to sstable_writer,
so that we can get a stable dht::sharder for writing the sharding metadata.
This is needed because with tablets, the sharder can change dynamically.

However, this is both bad and unnecessary:
 - bad: holding on to an effective_replication_map_ptr is a barrier
   for topology operations, preventing tablet migrations (etc) while
   an sstable is being written
 - unnecessary: tablets don't require sharding metadata at all, since
   two tablets cannot overlap (unlike two sstables from different shards in
   the same node). So the first/last key is sufficient to determine the
   shard/tablet ownership.

Given that, just pass the sharder for vnode sstables, and don't generate
sharding metadata for tablet sstables.
2024-01-23 22:23:08 +02:00
Benny Halevy
d6071945c8 compaction, table: ignore foreign sstables replay_position
The sstables replay_position in stats_metadata is
valid only on the originating node and shard.

Therefore, validate the originating host and shard
before using it in compaction or table truncate.

Fixes #10080

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>

Closes scylladb/scylladb#16550
2024-01-16 18:45:59 +02:00
Raphael S. Carvalho
5e55954f27 replica: Make the storage snapshot survive concurrent compactions
Consider this:
1) file streaming takes storage snapshot = list of sstables
2) concurrent compaction unlink some of those sstables from file system
3) file streaming tries to send unlinked sstables, but files other
than data and index cannot be read as only data and index have file
descriptors opened

To fix it, the snapshot now returns a set of files, one per sstable
component, for each sstable.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>

Closes scylladb/scylladb#16476
2023-12-21 12:50:28 +02:00
Raphael S. Carvalho
d1e6dfadea sstables: Harden estimate_droppable_tombstone_ratio() interface
The interface is fragile because the user may incorrectly use the
wrong "gc before". Given that sstable knows how to properly calculate
"gc before", let's do it in estimate__d__t__r(), leaving no room
for mistakes.

sstable_run's variant was also changed to conform to new interface,
allowing ICS to properly estimate droppable ratio, using GC before
that is calculated using each sstable's range. That's important for
upcoming tablets, as we want to query only the range that belongs
to a particular tablet in the repair history table.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>

Closes scylladb/scylladb#15931
2023-12-20 19:04:41 +02:00
Avi Kivity
8fa2e3ad2a Merge 'Remove sstables::remove_by_toc_name()' from Pavel Emelyanov
The helper in question complicates the logic of sstable_directory::process() by making garbage collection differently for sstables deleted "atomically" and deleted "one-by-one". Also, the code that deletes sstables one-by-one and uses remove_by_toc_name() renders excessive TOC file reading, because there's sstable object at hand and it had all_components() ready for use.

Surprisingly, there was no test for the deletion-log functionality. This PR adds one. The test passes before the g.c. and regular unlink fix, and (of course) continues passing after it.

Closes scylladb/scylladb#16240

* github.com:scylladb/scylladb:
  sstables: Drop remove_by_name()
  sstables/fs_storage: Wipe by recognized+unrecognized components
  sstable_directory: Enlight deletion log replay
  sstables: Split remove_by_toc_name()
  test: Add test case to validate deletion log work
  sstable_directory: Close dir on exception
  sstable_directory: Fix indentation after previous patch
  sstable_directory: Coroutinize delete_with_pending_deletion_log()
  test: Sstable on_delete() is not necessarily in a thread
  sstable_directory: Split delete_with_pending_deletion_log()
2023-12-03 17:29:34 +02:00
Yaniv Kaul
c658bdb150 Typos: fix typos in comments
Fixes some typos as found by codespell run on the code.
In this commit, I was hoping to fix only comments, not user-visible alerts, output, etc.
Follow-up commits will take care of them.

Refs: https://github.com/scylladb/scylladb/issues/16255
Signed-off-by: Yaniv Kaul <yaniv.kaul@scylladb.com>
2023-12-02 22:37:22 +02:00
Pavel Emelyanov
17fd558df8 sstables: Drop remove_by_name()
It was used by deletion log replay and by storage wipe, now it's unused

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-12-01 18:20:20 +03:00
Pavel Emelyanov
5ff5946520 sstables: Split remove_by_toc_name()
The helper consists of three phases:

- move TOC -> TOC.tmp
- remove components listed in TOC
- remove TOC.tmp

The first step is needed separately by the next patch

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-12-01 18:20:20 +03:00
Pavel Emelyanov
0da37d5fa6 sstable: Generalize toc file read and parse
There are several places where TOC file is parsed into a vector of
components -- sstable::read_toc(), remove_by_toc_name() and
remove_by_registry_entry(). All three deserve some generalization.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-11-29 12:09:52 +03:00
Kefu Chai
15bfa09454 treewide: do not mark return value const if this has no effect
this change is a cleanup.

to mark a return value without value semantics has no effect. these
`const` specifier useless. so let's drop them.

and, if we compile the tree with `-Wignore-qualifiers`, the compiler
would warn like:

```
/home/kefu/dev/scylladb/schema/schema.hh:245:5: error: 'const' type qualifier on return type has no effect [-Werror,-Wignored-qualifiers]
  245 |     const index_metadata_kind kind() const;
      |     ^~~~~
```
so this change also silences the above warnings.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2023-11-17 17:46:19 +08:00
Kefu Chai
efd65aebb2 build: cmake: add check-header target
to have feature parity with `configure.py`. we won't need this
once we migrate to C++20 modules. but before that day comes, we
need to stick with C++ headers.

we generate a rule for each .hh files to create a corresponding
.cc and then compile it, in order to verify the self-containness of
that header. so the number of rule is quite large, to avoid the
unnecessary overhead. the check-header target is enabled only if
`Scylla_CHECK_HEADERS` option is enabled.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#15913
2023-11-13 10:27:06 +02:00
Pavel Emelyanov
63758d19ce sstables: Add state string to state enum class convert
There's the backward converter already out there. Next code will need to
convert string representation of the state back to the internal type.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-10-24 19:12:37 +03:00
Raphael S. Carvalho
fded314e46 sstables: Fix update of tombstone GC settings to have immediate effect
After "repair: Get rid of the gc_grace_seconds", the sstable's schema (mode,
gc period if applicable, etc) is used to estimate the amount of droppable
data (or determine full expiration = max_deletion_time < gc_before).
It could happen that the user switched from timeout to repair mode, but
sstables will still use the old mode, despite the user asked for a new one.
Another example is when you play with value of grace period, to prevent
data resurrection if repair won't be able to run in a timely manner.
The problem persists until all sstables using old GC settings are recompacted
or node is restarted.
To fix this, we have to feed latest schema into sstable procedures used
for expiration purposes.

Fixes #15643.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>

Closes scylladb/scylladb#15746
2023-10-19 16:27:59 +03:00
Aleksandra Martyniuk
7b3e0ab1f2 compaction: sstables: monitor validation scrub with compaction_read_generator
Validation scrub bypasses the usual compaction machinery, though it
still needs to be tracked with compaction_progress_monitor so that
we could reach its progress from compaction task executor.

Track sstable scrub in validate mode with read monitors.
2023-10-12 17:03:46 +02:00
Pavel Emelyanov
d112098c08 sstables: Make descriptor from sstable without parsing
When loading unshared remote sstable, sstable_directory needs to make a
descriptor out of a real sstable. For that it parses the sstable's Data
component path which is pretty weird. It's simpler to make descriptor
out of the ssatble itself.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-10-05 12:21:01 +03:00
Pavel Emelyanov
4370e6c8d0 distributed_loader: Print sstable state explicitly
When populating from a particular directory, populator code converts
state to subdir name, then prints the path. The conversion is pretty
much artificial, it's better to provide printer for state and print
state explicitly.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-09-14 16:41:26 +03:00
Pavel Emelyanov
5c39d61b62 sstable: Maintain state
This means -- keep state on sstable, change it with change_state() call
and (!) fix the is_<state>() helpers not to check storage->prefix()

nit: mark requires_view_building() noexcept while at it

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-08-14 15:40:44 +03:00
Pavel Emelyanov
b06917f235 sstable: Make .change_state() accept state, not directory string
Pretty cosmetic change, but it will allow S3 to finally support moving
sstables between states (after this patch it still doesn't)

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-08-14 15:40:44 +03:00
Pavel Emelyanov
ef25352412 sstable: Construct it with state
This just moves make_path() call from outside of sstable::sstable()
inside it. Later it will be moved even further. Also, now sstable can
know its state and keep it (next patch)

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-08-14 15:28:54 +03:00
Pavel Emelyanov
249a6a4d27 sstable: Introduce state enum
There are several states between which an sstable can migrate. Nowadays
the state is encoded into sstable directory, which is not nice. Also S3
backed sstables don't support states only keeping sstables in "normal".

This patch adds enum state in order to replace the path-encoded one
eventually. The new sstables_manager::make_sstable() method is added
that accepts table directory (without quarantine/ or staging/ component)
and the desired initial state (optional). Next patches will make use of
this maker and the existing one will be removed.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-08-14 14:45:52 +03:00
Pavel Emelyanov
f15246f5ef sstable: Remove filename(dir, ...) method
It's only used by fs storage driver that can do dir/file concatenation
on its own. Moreover, this method is not welcome to be used even
internally

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-08-14 14:00:50 +03:00
Benny Halevy
b08f2ac4c6 sstable: add on_delete observer
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2023-08-08 08:15:00 +03:00
Benny Halevy
6f037549ac sstables: delete_with_pending_deletion_log: batch sync_directory
When deleting multiple sstables with the same prefix
the deletion atomicity is ensured by the pending_delete_log file,
so if scylla crashes in the middle, deletions will be replyed on
restart.

Therefore, we don't have to ensure atomicity of each individual
`unlink`.  We just need to sync the directory once, before
removing the pending_delete_log file.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>

Closes #14967
2023-08-06 18:52:13 +03:00
Tomasz Grabiec
ad983ac23d sstables: Compute sstable shards using sharder from erm when loading
schema::get_sharder() does not use the correct sharder for
tablet-based tables.  Code which is supposed to work with all kinds of
tables should obtain the sharder from erm::get_sharder().
2023-06-21 00:58:24 +02:00
Tomasz Grabiec
17d6163548 sstables: Generate sharding metadata using sharder from erm when writing
We need to keep sharding metadata consistent with tablet mapping to
shards in order for node restart to detect that those sstables belong
to a single shard and that resharding is not necessary. Resharding of
sstables based on tablet metadata is not implemented yet and will
abort after this series.

Keeping sharding metadata accurate for tablets is only necessary until
compaction group integration is finished. After that, we can use the
sstable token range to determine the owning tablet and thus the owning
shard. Before that, we can't, because a single sstable may contain
keys from different tablets, and the whole key range may overlap with
keys which belong to other shards.
2023-06-21 00:58:24 +02:00
Pavel Emelyanov
d1de796f6b sstable: Move XFS renamer hack into fs storage
The method sits on sstable, but is called only from fs storage and it's
the only place that really needs it

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>

Closes #14230
2023-06-14 12:35:04 +03:00
Pavel Emelyanov
66e43912d6 code: Switch to seastar API level 7
In that level no io_priority_class-es exist. Instead, all the IO happens
in the context of current sched-group. File API no longer accepts prio
class argument (and makes io_intent arg mandatory to impls).

So the change consists of
- removing all usage of io_priority_class
- patching file_impl's inheritants to updated API
- priority manager goes away altogether
- IO bandwidth update is performed on respective sched group
- tune-up scylla-gdb.py io_queues command

The first change is huge and was made semi-autimatically by:
- grep io_priority_class | default_priority_class
- remove all calls, found methods' args and class' fields

Patching file_impl-s is smaller, but also mechanical:
- replace io_priority_class& argument with io_intent* one
- pass intent to lower file (if applicatble)

Dropping the priority manager is:
- git-rm .cc and .hh
- sed out all the #include-s
- fix configure.py and cmakefile

The scylla-gdb.py update is a bit hairry -- it needs to use task queues
list for IO classes names and shares, but to detect it should it checks
for the "commitlog" group is present.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>

Closes #13963
2023-06-06 13:29:16 +03:00
Pavel Emelyanov
6c453df9d7 sstables: Remove default prio class from rewrite_statistics()
The method is called with explicitly default pririty class and puts one
into the fstream options. This whole chain can be avoided

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-05-23 13:54:31 +03:00
Pavel Emelyanov
438132ad4b sstables: Remove prio class from validate_checksums subs
The sstable.read_checksum() and .read_digest() accept prio class
argument from validate_checsums(), but it's always the "default" one.
Remove the arg and remove stream options initializations as they'll pick
up default prio class on their default constructing.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-05-23 13:54:31 +03:00
Pavel Emelyanov
7396d9d291 sstables: Remove always default io-prio from validate_checksums()
All calls to sstables::validate_checksums() happen with explicitly
default priority class. Just hard-code it as such in the method

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-05-23 13:54:31 +03:00
Pavel Emelyanov
ed50fda1fe sstable: Toss tempdir extension usage
The tempdir for filesystem-based sstables is {generation}.sstable one.
There are two places that need to know the ".sstable" extention -- the
tempdir creating code and the tempdir garbage-collecting code.

This patch simplifies the sstable class by patching the aforementioned
functions to use newly introduced tempdir_extension string directly,
without the help of static one-line helpers.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-05-17 15:19:38 +03:00
Pavel Emelyanov
e8c0ae28b5 sstable: Drop pending_delete_dir_basename()
The helper is used to return const char* value of the pending delete
dir. Callers can use it directly.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-05-17 15:17:33 +03:00
Pavel Emelyanov
7792479865 sstable: Drop is_pending_delete_dir() helper
It's only used by the sstable_directory::replay_pending_delete_log()
method. The latter is only called by the sstable_directory itself with
the path being pending-delete dir for sure. So the method can be made
private and the is_pending_delete_dir() can be removed.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-05-17 15:17:32 +03:00
Botond Dénes
287ccce1cc Merge 'sstables: extract storage out ' from Kefu Chai
this change extracts the storage class and its derived classes
out into their own source files. for couple reasons:

- for better readability. the sstables.hh is over 1005 lines.
  and sstables.cc 3602 lines. it's a little bit difficult to figure
  out how the different parts in these sources interact with each
  other. for instance, with this change, it's clear some of helper
  functions are only used by file_system_storage.
- probably less inter-source dependency. by extracting the sources
  files out, they can be compiled individually, so changing one .cc
  file does not impact others. this could speed up the compilation
  time.

Closes #13785

* github.com:scylladb/scylladb:
  sstables: storage: coroutinize idempotent_link_file()
  sstables: extract storage out
2023-05-09 14:03:40 +03:00
Kefu Chai
2eefcb37eb sstables: extract storage out
this change extracts the storage class and its derived classes
out into storage.cc and storage.hh. for couple reasons:

- for better readability. the sstables.hh is over 1005 lines.
  and sstables.cc 3602 lines. it's a little bit difficult to figure
  out how the different parts in these sources interact with each
  other. for instance, with this change, it's clear some of helper
  functions are only used by file_system_storage.
- probably less inter-source dependency. by extracting the sources
  files out, they can be compiled individually, so changing one .cc
  file does not impact others. this could speed up the compilation
  time.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2023-05-09 16:47:00 +08:00
Nadav Har'El
5f37d43ee6 Merge 'compaction: validate: validate the index too' from Botond Dénes
In addition to the data file itself. Currently validation avoids the
index altogether, using the crawling reader which only relies on the
data file and ignores the index+summary. This is because a corrupt
sstable usually has a corrupt index too and using both at the same time
might hide the corruption. This patch adds targeted validation of the
index, independent of and in addition to the already existing data
validation: it validates the order of index entries as well as whether
the entry points to a complete partition in the data file.
This will usually result in duplicate errors for out-of-order
partitions: one for the data file and one for the index file.

Fixes: #9611

Closes #11405

* github.com:scylladb/scylladb:
  test/cql-pytest: add test_sstable_validation.py
  test/cql-pytest: extract scylla_path,temp_workdir fixtures to conftest.py
  tools/scylla-sstables: write validation result to stdout
  sstables/sstable: validate(): delegate to mx validator for mx sstables
  sstables/mx/reader: add mx specific validator
  mutation/mutation_fragment_stream_validator: add validator() accessor to validating filter
  sstables/mx/reader: template data_consume_rows_context_m on the consumer
  sstables/mx/reader: move row_processing_result to namespace scope
  sstables/mx/reader: use data_consumer::proceed directly
  sstables/mx/reader.cc: extend namespace to end-of-file (cosmetic)
  compaction/compaction: remove now unused scrub_validate_mode_validate_reader()
  compaction/compaction: move away from scrub_validate_mode_validate_reader()
  tools/scylla-sstable: move away from scrub_validate_mode_validate_reader()
  test/boost/sstable_compaction_test: move away from scrub_validate_mode_validate_reader()
  sstables/sstable: add validate() method
  compaction/compaction: scrub_sstables_validate_mode(): validate sstables one-by-one
  compaction: scrub: use error messages from validator
  mutation_fragment_stream_validator: produce error messages in low-level validator
2023-05-08 17:14:26 +03:00
Botond Dénes
47959454eb sstables/sstable: add validate() method
To replace the validate code currently in compaction/compaction.cc (not
in this commit). We want to push down this logic to the sstable layer,
so that:
* Non compaction code that wishes to validate sstables (tests, tools)
  doesn't have to go through compaction.
* We can abstract how sstables are validated, in particular we want to
  add a new more low-level validation method that only the more recent
  sstable versions (mx) will support.
2023-05-02 09:42:41 -04:00
Pavel Emelyanov
81a1416ebf sstables: Add and call storage::destroy()
The s3_storage leaks client when sstable gets destoryed. So far this
came unnoticed, but debug-mode unit test ran over minio captured it. So
here's the fix.

When sstable is destroyed it also kicks the storage to do whatever
cleanup is needed. In case of s3 storage the cleanup is in closing the
on-boarded client. Until #13458 is fixed each sstable has its own
private version of the client and there's no other place where it can be
close()d in co_await-able mannter.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-05-02 11:15:44 +03:00