Commit Graph

3332 Commits

Author SHA1 Message Date
Kefu Chai
0b13de52de sstable/mx: add fmt::formatter for cached_promoted_index::promoted_index_block
before this change, we rely on the default-generated fmt::formatter
created from operator<<, but fmt v10 dropped the default-generated
formatter.

in this change, we define formatters for
`cached_promoted_index::promoted_index_block`, and drop its
operator<<.

Refs #13245

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#17415
2024-02-20 09:00:32 +02:00
Kefu Chai
7baee379de sstable/storage: pass fs::path to storage::create_links()
this change is a follow-up of 637dd730. the goal is to use
std::filesystem::path for manipulating paths, and to avoid
converting back and forth between sstring and fs::path.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#17257
2024-02-12 13:26:11 +02:00
Avi Kivity
14bf09f447 Merge 'utils: managed_bytes: optimize memory usage for small buffers' from Michał Chojnowski
managed_bytes is implemented as a chain of blob_storage objects.
Each blob_storage contains 24 bytes of metadata. But in the most
common case -- when there is only a single element in the chain --
16 bytes of this metadata is trivial/unused.

This is a regrettable waste because managed_bytes is used for every
database cell in the memtables and cache. It means that every value
of size >= 7 bytes (smaller ones fit in the inline storage of
managed_bytes) receives 16 bytes of useless overhead.

To correct that, this series adds to managed_bytes an alternative storage
layout -- used for buffers small enough to fit in one fragment -- which only
stores the necessary minimum of metadata. (That is: a pointer to the parent,
to facilitate moving the storage during memory defragmentation).

This saves 16 bytes on every cell greater than 15 bytes, which includes e.g.
every live cell with a value bigger than 6 bytes, and that likely applies to most cells.

Before:
```
$ build/release/scylla perf-simple-query --duration 10
median 218692.88 tps ( 61.1 allocs/op,  13.1 tasks/op,   41762 insns/op,        0 errors)
$ build/release/scylla perf-simple-query --duration 10 --write
median 173511.46 tps ( 58.3 allocs/op,  13.2 tasks/op,   53258 insns/op,        0 errors)
$ build/release/test/perf/mutation_footprint_test -c1 --row-count=20 --partition-count=100 --data-size=8 --column-count=16
 - in cache:     2580222
 - in memtable:  2549852
```

After:
```
$ build/release/scylla perf-simple-query --duration 10
median 218780.89 tps ( 61.1 allocs/op,  13.1 tasks/op,   41763 insns/op,        0 errors)
$ build/release/scylla perf-simple-query --duration 10 --write
median 173105.78 tps ( 58.3 allocs/op,  13.2 tasks/op,   52913 insns/op,        0 errors)
$ build/release/test/perf/mutation_footprint_test -c1 --row-count=20 --partition-count=100 --data-size=8 --column-count=16
 - in cache:     2068238
 - in memtable:  2037696
```

Closes scylladb/scylladb#14263

* github.com:scylladb/scylladb:
  utils: managed_bytes: optimize memory usage for small buffers
  utils: managed_bytes: rewrite managed_bytes methods in terms of managed_bytes_view
2024-02-11 16:43:40 +02:00
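As a rough cross-check of the footprint numbers quoted in the merge above, the following sketch divides the before/after difference by the number of cells. It assumes (the commit message does not say so) that `mutation_footprint_test` creates one cell per partition, row and column, i.e. 100 * 20 * 16 cells; `cell_count` and `saved_per_cell` are illustrative names, not part of the Scylla tree.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: one cell per (partition, row, column), taken from the flags
// --partition-count=100 --row-count=20 --column-count=16.
constexpr std::int64_t cell_count = 100 * 20 * 16;

// Average number of bytes saved per cell between two footprint readings.
constexpr double saved_per_cell(std::int64_t before, std::int64_t after) {
    return static_cast<double>(before - after) / static_cast<double>(cell_count);
}

// Footprints quoted in the commit message ("in cache" and "in memtable").
constexpr double cache_saving    = saved_per_cell(2580222, 2068238);
constexpr double memtable_saving = saved_per_cell(2549852, 2037696);

// Under the assumption above, both land within a tenth of a byte of the
// 16 bytes per cell that the series claims to save.
static_assert(cache_saving > 15.9 && cache_saving < 16.1);
static_assert(memtable_saving > 15.9 && memtable_saving < 16.1);
```

If the cell-count assumption holds, the measured savings are consistent with the 16 bytes of metadata the new layout drops.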
Kefu Chai
33224cc10b sstables/storage: avoid unnecessary type cast
the type of `_dir` was changed to fs::path back in 637dd730, there
is no need to cast `_dir` to fs::path anymore.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#17256
2024-02-11 16:37:05 +02:00
Michał Chojnowski
5a3e4a1cc0 utils: managed_bytes: optimize memory usage for small buffers
managed_bytes is implemented as a chain of blob_storage objects.
Each blob_storage contains 24 bytes of metadata. But in the most
common case -- when there is only a single element in the chain --
16 bytes of this metadata is trivial/unused.

This is a regrettable waste because managed_bytes is used for every
database cell in the memtables and cache. It means that every value
of size >= 7 bytes (smaller ones fit in the inline storage of
managed_bytes) receives 16 bytes of useless overhead.

To correct that, this patch adds to managed_bytes an alternative storage
layout -- used for buffers small enough to fit in one contiguous
fragment -- which only stores the necessary minimum of metadata.
(That is: a pointer to the parent, to facilitate moving the storage during
memory defragmentation).
2024-02-09 20:56:20 +01:00
Kefu Chai
e02958ad35 sstable: let make_entry_descriptor() accept a single fs::path
both of its callers are passing parent_path() and filename() to
it. so let the callee do this. simpler this way.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#17225
2024-02-08 16:44:16 +03:00
Kefu Chai
07da9fd197 sstable: change sstable_touch_directory_io_check() to accept fs::path
this change is a follow-up of 637dd730. the goal is to use
std::filesystem::path for manipulating paths, and to avoid
converting back and forth between sstring and fs::path.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#17214
2024-02-08 10:01:47 +03:00
Kefu Chai
2c859bc310 sstables: let state_to_dir(sstable_state) return string_view
state_to_dir(sstable_state) translates the enum to the corresponding
directory component. and it returns a `seastar::sstring`. not all
the callers of this function expect a full-blown sstring instance,
on the contrary, quite a few of them just want a string-like object
which represents the directory component, so they can use it, for
instance to compose a path, or just format the given `state` enum.

so to avoid the overhead of creating/destroying the `seastar::sstring`
instance, let's switch to `std::string_view`. with this change, we
will be able to implement the fmt::formatter for `sstable_state`
without the help of the formatter of sstring.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#17213
2024-02-08 10:00:08 +03:00
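The idea in the commit above, returning a view onto a static string instead of constructing an owning string on every call, can be sketched as follows. The enumerators, the directory names and the `describe()` helper are assumptions made for illustration; they are not copied from the Scylla sources.

```cpp
#include <cassert>
#include <string>
#include <string_view>

// Hypothetical enumerators and directory names, for illustration only.
enum class sstable_state { normal, staging, quarantine, upload };

// Returns a view onto a string literal: no allocation happens, and the
// literal has static storage duration, so the view never dangles.
constexpr std::string_view state_to_dir(sstable_state state) {
    switch (state) {
    case sstable_state::normal:     return "";
    case sstable_state::staging:    return "staging";
    case sstable_state::quarantine: return "quarantine";
    case sstable_state::upload:     return "upload";
    }
    return "";  // not reached for valid enumerators
}

// Callers that do need an owning string pay for it explicitly.
inline std::string describe(sstable_state state) {
    std::string out = "state dir: '";
    out += state_to_dir(state);  // std::string::operator+= accepts a string_view
    out += "'";
    return out;
}

// The lookup is usable at compile time as well.
static_assert(state_to_dir(sstable_state::staging) == "staging");
```

Because the function is `constexpr` and allocation-free, a formatter for the enum can forward the view directly without going through an intermediate string type.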
Kefu Chai
f3845a7f3d sstable: replace "welp" with more descriptive words
although "welp" is more emotionally expressive, it is considered
a misspelling of "whelp" by codespell. that's why this comment stands
out. but from a non-native speaker's point of view, probably we can
use more descriptive words to explain what "welp" is for in plain
English.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#17183
2024-02-06 16:31:18 +02:00
Avi Kivity
7cb1c10fed treewide: replace seastar::future::get0() with seastar::future::get()
get0() dates back to the days when Seastar futures carried tuples, and
get0() was a way to get the first (and usually only) element. Now
it's a distraction, and Seastar is likely to deprecate and remove it.

Replace with seastar::future::get(), which does the same thing.
2024-02-02 22:12:57 +08:00
Kefu Chai
deef78c796 sstable: capture return value of get0() using auto
instead of capturing the return value of `get0()` with a reference
type, use a plain type. as `get0()` returns a plain `T` while `get()`
returns a `T&&`, to avoid the value referenced by `T&&` being destroyed
after the expression, let's use a plain `auto` instead of `auto&&`.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2024-02-02 22:12:18 +08:00
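The lifetime concern in the commit above can be illustrated without Seastar. The sketch below uses a toy class, `toy_future`, which is a stand-in invented for this note and not `seastar::future`; it only assumes, as the commit message intends, that one accessor returns a plain `T` and the other a `T&&` that refers into the future object.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>

// Toy stand-in for illustration; this is NOT seastar::future.
template <typename T>
class toy_future {
    T _value;
public:
    explicit toy_future(T v) : _value(std::move(v)) {}
    T get0() { return std::move(_value); }   // returns a fresh object
    T&& get() { return std::move(_value); }  // refers into *this
};

inline toy_future<std::string> make_name() {
    return toy_future<std::string>("me-42-big");
}

inline std::string safe_capture() {
    // Plain `auto` move-constructs a std::string while the temporary future
    // is still alive. With `auto&&` the reference would point into the
    // temporary, which is destroyed at the end of this full expression.
    auto name = make_name().get();
    return name;
}

// The two accessors really differ only in their return type.
static_assert(std::is_same_v<
    decltype(std::declval<toy_future<std::string>&>().get0()), std::string>);
static_assert(std::is_same_v<
    decltype(std::declval<toy_future<std::string>&>().get()), std::string&&>);
```

With an accessor that returns by value, `auto&&` is harmless because the temporary's lifetime is extended; with one that returns `T&&`, only a plain `auto` keeps the value alive.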
libo-sober
a341b870bc Remove unnecessary calculations in integrity_checked_file_impl::write_dma.
Use calculated `rbuf_end` in `std::mismatch` to reduce unnecessary calculations.

Closes scylladb/scylladb#16979
2024-02-01 13:42:59 +02:00
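A minimal sketch of the pattern the commit above describes: compute the end pointer once and hand it to `std::mismatch`. The function name and signature are hypothetical and do not reproduce `integrity_checked_file_impl::write_dma`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical helper: returns the offset of the first byte at which the
// read-back buffer differs from the written buffer, or `len` if they match.
inline std::size_t first_mismatch(const char* rbuf, const char* wbuf, std::size_t len) {
    const char* rbuf_end = rbuf + len;  // computed once and reused below
    const auto diff = std::mismatch(rbuf, rbuf_end, wbuf);
    return static_cast<std::size_t>(diff.first - rbuf);
}
```

A return value equal to `len` means the two buffers are identical over the compared range.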
Kefu Chai
b931d93668 treewide: fix misspellings in code comments
these misspellings are identified by codespell.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#17004
2024-01-31 09:16:10 +02:00
Kefu Chai
637dd73079 sstable/storage: use fs::path to represent _dir and _temp_dir
they are directories, and we are concatenating strings to build the paths
to the sstable components. so it would be more elegant to use fs::path
for manipulating paths.

this change was inspired by the discussion on passing the relative
path to sstable to `scylla sstables`, where we use the
`path::parent_path()` as the dir of sstable, and then concatenate
it with the filename component. but if the `parent_path()` method
returns an empty string, we end up with a path like
"/me-42-big-TOC.txt", which is not reachable. what we should be
reading is "me-42-big-TOC.txt". so, we would be better off either
using `fs::path` or enforcing the absolute path.

since we are already using "/" as the separator, and concatenating strings,
this is an opportunity to switch over to `fs::path` to address
the problem and to avoid the string concatenation.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#16982
2024-01-26 09:54:41 +02:00
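The failure mode described in commit 637dd73079 above is easy to reproduce with the standard library alone. In this sketch the two helper functions are hypothetical; they only contrast string concatenation with `fs::path::operator/` when `parent_path()` is empty.

```cpp
#include <cassert>
#include <filesystem>
#include <string>

namespace fs = std::filesystem;

// Hypothetical helper: rebuilds the path by string concatenation with "/".
inline std::string component_path_by_concat(const fs::path& sstable_toc) {
    const std::string dir = sstable_toc.parent_path().string();  // "" for a bare filename
    return dir + "/" + sstable_toc.filename().string();
}

// Hypothetical helper: rebuilds the path with fs::path::operator/, which
// does not insert a separator when the left-hand side is empty.
inline fs::path component_path_by_fs(const fs::path& sstable_toc) {
    return sstable_toc.parent_path() / sstable_toc.filename();
}
```

For the relative name `me-42-big-TOC.txt`, the first helper yields `/me-42-big-TOC.txt`, a path in the filesystem root, while the second yields `me-42-big-TOC.txt` unchanged.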
Avi Kivity
8ee75ae8f4 sstables: writer: don't require effective_replication_map for sharding metadata
Currently, we pass an effective_replication_map_ptr to sstable_writer,
so that we can get a stable dht::sharder for writing the sharding metadata.
This is needed because with tablets, the sharder can change dynamically.

However, this is both bad and unnecessary:
 - bad: holding on to an effective_replication_map_ptr is a barrier
   for topology operations, preventing tablet migrations (etc) while
   an sstable is being written
 - unnecessary: tablets don't require sharding metadata at all, since
   two tablets cannot overlap (unlike two sstables from different shards in
   the same node). So the first/last key is sufficient to determine the
   shard/tablet ownership.

Given that, just pass the sharder for vnode sstables, and don't generate
sharding metadata for tablet sstables.
2024-01-23 22:23:08 +02:00
Kefu Chai
d1dd71fbd7 mutation: do not include unused headers
these unused includes were identified by clangd. see
https://clangd.llvm.org/guides/include-cleaner#unused-include-warning
for more details on the "Unused include" warning.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#16889
2024-01-21 16:58:26 +02:00
Kefu Chai
09a688d325 sstables: do not use lambda when not necessary
before this change, we always reference the return value of
`make_reader()`, and the return value's type `flat_mutation_reader_v2`
is movable, so we can just pass it by moving away from it.

in this change, instead of using a lambda, let's just have the
return value of it. simpler this way.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#16835
2024-01-18 15:54:49 +02:00
Benny Halevy
d6071945c8 compaction, table: ignore foreign sstables replay_position
The sstables replay_position in stats_metadata is
valid only on the originating node and shard.

Therefore, validate the originating host and shard
before using it in compaction or table truncate.

Fixes #10080

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>

Closes scylladb/scylladb#16550
2024-01-16 18:45:59 +02:00
Kefu Chai
54d49c04e0 db, sstable: bump up default sstable format to "md"
before this change, we default to the "mc" sstable format, and
switch to "md" if the cluster agrees on using it, and to
"me" if the cluster agrees on using this. the cluster feature
is used to get the consensus across the members in the cluster,
if any of the existing nodes in the cluster has its `sstable_format`
configured to, for instance, "mc", then the cluster is stuck with
"mc".

but we disabled "mc" sstable format back in 3d345609, the first LTS
release including that change was scylla v5.2.0. which means, the
cluster of the last major version Scylla should be using "md" or
"me". per our document on upgrade, see docs/upgrade/index.rst,

> You should perform the upgrades consecutively - to each
> successive X.Y version, without skipping any major or minor version.
>
> Before you upgrade to the next version, the whole cluster (each
> node) must be upgraded to the previous version.

we can assume that, a 6.x node will only join a cluster
with 5.x or 6.x nodes. (joining a 7.x cluster should work, but
this is not relevant to this change). in both cases, since
5.x and up scylla can only be configured with "md" `sstable_format`,
there is no need to switch from "mc" to "md" anymore. so we can
ditch the code supporting it.

Refs #16551
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2024-01-11 22:43:05 +08:00
Lakshmi Narayanan Sreethar
76f0d5e35b reader_permit: store schema_ptr instead of raw schema pointer
Store schema_ptr in reader permit instead of storing a const pointer to
schema to ensure that the schema doesn't get changed elsewhere when the
permit is holding on to it. Also update the constructors and all the
relevant callers to pass down schema_ptr instead of a raw pointer.

Fixes #16180

Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com>

Closes scylladb/scylladb#16658
2024-01-11 08:37:56 +02:00
Kefu Chai
a6152cb87b sstables: do not include unused headers
these unused includes were identified by clangd. see
https://clangd.llvm.org/guides/include-cleaner#unused-include-warning
for more details on the "Unused include" warning.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#16666
2024-01-09 11:45:44 +02:00
Raphael S. Carvalho
5e55954f27 replica: Make the storage snapshot survive concurrent compactions
Consider this:
1) file streaming takes storage snapshot = list of sstables
2) concurrent compaction unlinks some of those sstables from file system
3) file streaming tries to send unlinked sstables, but files other
than data and index cannot be read as only data and index have file
descriptors opened

To fix it, the snapshot now returns a set of files, one per sstable
component, for each sstable.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>

Closes scylladb/scylladb#16476
2023-12-21 12:50:28 +02:00
Raphael S. Carvalho
d1e6dfadea sstables: Harden estimate_droppable_tombstone_ratio() interface
The interface is fragile because the user may incorrectly use the
wrong "gc before". Given that sstable knows how to properly calculate
"gc before", let's do it in estimate__d__t__r(), leaving no room
for mistakes.

sstable_run's variant was also changed to conform to new interface,
allowing ICS to properly estimate droppable ratio, using GC before
that is calculated using each sstable's range. That's important for
upcoming tablets, as we want to query only the range that belongs
to a particular tablet in the repair history table.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>

Closes scylladb/scylladb#15931
2023-12-20 19:04:41 +02:00
Botond Dénes
e1b30f50be reader_concurrency_semaphore: add register_metrics constructor parameter
To be used in the next patch to control whether the semaphore registers
and exports metrics or not. We want to move metric registration to the
semaphore but we don't want all semaphores to export metrics. The
decision on whether a semaphore should or shouldn't export metrics
should be made on a case-by-case basis so this new parameter has no
default value (except for the for_tests constructor).
2023-12-13 06:25:45 -05:00
Avi Kivity
814f3eb6b5 sstables: name sstables_manager
Soon, the reader_concurrency_semaphore will require a unique
and meaningful name in order to label its metrics. To prepare
for that, name sstable_manager instances. This will be used
to generate a name for sstable_manager's reader_concurrency_semaphore.
2023-12-13 04:40:33 -05:00
Kefu Chai
af0ba3d648 sstables: writer: do not include unused header
the helpers in bit_cast.hh are not used, so drop this #include.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2023-12-12 21:09:51 +08:00
Kefu Chai
893f319004 sstables: add formatter for index_consume_entry_context_state
before this change, we rely on the default-generated fmt::formatter
created from operator<<, but fmt v10 dropped the default-generated
formatter.

in this change, in order to enable the code in the header to
access the formatter without being moved down after the full specialization's
definition, we

* move the enum definition out of the class and before the
  class,
* rename the enum's name from state to index_consume_entry_context_state
* define a formatter for index_consume_entry_context_state
* remove its operator<<().

as fmt v10 is able to use `format_as()` as a fallback, the formatter
full specialization is guarded with `#if FMT_VERSION < 10'00'00`. we
will remove it after we start building with fmt v10.

Refs #13245

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#16204
2023-12-08 12:45:38 +02:00
Pavel Emelyanov
b9abd504be sstables/storage: Drop atomic deleter
Now the deleter function is not in use and can be dropped

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-12-05 16:47:52 +03:00
Pavel Emelyanov
604279f064 sstables/storage: Reimplement atomic deletion in sstables_manager
Right now the atomic deletion is called on manager, but it gets the
actual deletion function from storage and off-loads the deletion to it.
This patch makes the manager fully responsible for the deletion by
implementing the sequence of

    auto ctx = storage.prepare()
    for sst in sstables:
        sst.unlink()
    storage.complete(ctx)

Storage implementations provide the prepare/complete methods. The
filesystem storage does it via deletion log and the s3 storage is still
not atomic :(

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-12-05 16:46:01 +03:00
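The prepare/unlink/complete sequence from the commit above can be sketched in C++ roughly as follows. Everything here is a toy model: `toy_sstable`, `toy_storage` and the in-memory "deletion log" are assumptions made for illustration, and passing the sstable list to `prepare()` is a guess, since the pseudocode in the commit shows no arguments.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy sstable: only remembers whether it was unlinked.
struct toy_sstable {
    std::string name;
    bool unlinked = false;
    void unlink() { unlinked = true; }
};

// Toy storage: models the deletion log as an in-memory list of names.
class toy_storage {
    std::vector<std::string> _pending_log;
public:
    struct ctx { std::size_t entries; };  // opaque to the caller

    // Record what is about to be deleted, so a crash could be replayed.
    ctx prepare(const std::vector<toy_sstable>& ssts) {
        for (const auto& s : ssts) {
            _pending_log.push_back(s.name);
        }
        return ctx{ssts.size()};
    }

    // All sstables are gone: the log is no longer needed.
    void complete(ctx c) {
        assert(c.entries == _pending_log.size());
        _pending_log.clear();
    }

    bool has_pending_log() const { return !_pending_log.empty(); }
};

// The manager-side sequence: prepare, unlink each sstable, complete.
inline void delete_atomically(toy_storage& storage, std::vector<toy_sstable>& ssts) {
    auto ctx = storage.prepare(ssts);
    for (auto& sst : ssts) {
        sst.unlink();
    }
    storage.complete(ctx);
}
```

A storage backend that cannot offer atomicity could implement `prepare()` and `complete()` as no-ops, which is one way to read the remark about the s3 storage.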
Pavel Emelyanov
4ecf4c4a6a sstables/storage: Add prepare/complete scaffold for atomic deletion
The atomic deletion is going to look like

    auto ctx = storage.prepare()
    for sst in sstables:
        sst.unlink()
    storage.complete(ctx)

and this patch prepares the class storage for that by extending it with
prepare and complete methods. The opaque ctx object is also here

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-12-05 16:44:13 +03:00
Avi Kivity
8fa2e3ad2a Merge 'Remove sstables::remove_by_toc_name()' from Pavel Emelyanov
The helper in question complicates the logic of sstable_directory::process() by making garbage collection work differently for sstables deleted "atomically" and deleted "one-by-one". Also, the code that deletes sstables one-by-one and uses remove_by_toc_name() reads the TOC file needlessly, because there's an sstable object at hand and it has all_components() ready for use.

Surprisingly, there was no test for the deletion-log functionality. This PR adds one. The test passes before the g.c. and regular unlink fix, and (of course) continues passing after it.

Closes scylladb/scylladb#16240

* github.com:scylladb/scylladb:
  sstables: Drop remove_by_name()
  sstables/fs_storage: Wipe by recognized+unrecognized components
  sstable_directory: Enlight deletion log replay
  sstables: Split remove_by_toc_name()
  test: Add test case to validate deletion log work
  sstable_directory: Close dir on exception
  sstable_directory: Fix indentation after previous patch
  sstable_directory: Coroutinize delete_with_pending_deletion_log()
  test: Sstable on_delete() is not necessarily in a thread
  sstable_directory: Split delete_with_pending_deletion_log()
2023-12-03 17:29:34 +02:00
Yaniv Kaul
c658bdb150 Typos: fix typos in comments
Fixes some typos as found by codespell run on the code.
In this commit, I was hoping to fix only comments, not user-visible alerts, output, etc.
Follow-up commits will take care of them.

Refs: https://github.com/scylladb/scylladb/issues/16255
Signed-off-by: Yaniv Kaul <yaniv.kaul@scylladb.com>
2023-12-02 22:37:22 +02:00
Pavel Emelyanov
17fd558df8 sstables: Drop remove_by_name()
It was used by deletion log replay and by storage wipe, now it's unused

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-12-01 18:20:20 +03:00
Pavel Emelyanov
4405a625f6 sstables/fs_storage: Wipe by recognized+unrecognized components
Currently wiping fs-backed sstable happens via reading and parsing its
TOC file back. Then the three-step process goes:

- move TOC -> TOC.tmp
- remove components (obtained from TOC.tmp)
- remove TOC.tmp

However, wiping sstable happens in one of two cases -- the sstable was
loaded from the TOC file _or_ sstable had evaluated the needed
components and generated TOC file. With that, the 2nd step can be made
without reading the TOC file, just by looking at all components sitting
on the sstable

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-12-01 18:20:20 +03:00
Pavel Emelyanov
de931702ec sstable_directory: Enlight deletion log replay
Garbage collection of sstables is scattered between two stages -- g.c.
per-se and the regular processing.

The former stage collects deletion logs and for each log found goes
ahead and deletes the full sstable with the standard sequence:

- move TOC -> TOC.tmp
- remove components
- remove TOC.tmp

The latter stage picks up partially unlinked sstables that didn't go via
atomic deletion with the log. This comes as

- collect all components
  - keep TOC's and TOC.tmp's in separate lists
  - attach other components to TOC/TOC.tmp by generation value
- for all TOC.tmp's get all attached components and remove them
- continue loading TOC's with attached components

That said, replaying the deletion log can be as light as just the first step
out of the above sequence -- just move TOC to TOC.tmp. After that the
regular processing would pick the remaining components and clean them

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-12-01 18:20:20 +03:00
Pavel Emelyanov
5ff5946520 sstables: Split remove_by_toc_name()
The helper consists of three phases:

- move TOC -> TOC.tmp
- remove components listed in TOC
- remove TOC.tmp

The first step is needed separately by the next patch

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-12-01 18:20:20 +03:00
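The three phases listed in the commit above, with the first one split out so it can be called on its own, might look like this with `std::filesystem`. The function names and the `.tmp` suffix handling are assumptions for illustration; the real helpers are asynchronous and are not reproduced here.

```cpp
#include <cassert>
#include <filesystem>
#include <fstream>
#include <vector>

namespace fs = std::filesystem;

// Phase 1, usable on its own: move TOC -> TOC.tmp and return the new path.
// Once this rename happens, the sstable counts as being deleted.
inline fs::path rename_toc_to_tmp(const fs::path& toc) {
    fs::path tmp = toc;
    tmp += ".tmp";  // appends to the filename, no separator is added
    fs::rename(toc, tmp);
    return tmp;
}

// Phases 2 and 3: remove the components, then the temporary TOC itself.
inline void remove_components_and_tmp_toc(const fs::path& tmp_toc,
                                          const std::vector<fs::path>& components) {
    for (const auto& component : components) {
        fs::remove(component);  // returns false if the file is already gone
    }
    fs::remove(tmp_toc);
}
```

Keeping the temporary TOC until the very end means an interrupted deletion leaves a marker behind, so a later pass can finish removing the remaining components.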
Pavel Emelyanov
fcf080b63b sstable_directory: Close dir on exception
When committing the deletion log creation its containing directory is
sync-ed via opened file. This place is not exception safe and directory
can be left unclosed

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-12-01 15:00:38 +03:00
Pavel Emelyanov
bb167dcca5 sstable_directory: Fix indentation after previous patch
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-12-01 15:00:38 +03:00
Pavel Emelyanov
28b1289d4b sstable_directory: Coroutinize delete_with_pending_deletion_log()
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-12-01 15:00:38 +03:00
Pavel Emelyanov
ed043e5762 sstable_directory: Split delete_with_pending_deletion_log()
The helper consists of three parts -- prepare the deletion log, unlink
sstables and drop the deletion log. For testing the first part is needed
as a separate step, so here's this split.

It renders two nested async contexts, but it will change soon.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-12-01 15:00:37 +03:00
Kefu Chai
7a1fbb38f9 sstable: order uuid-based generation as timeuuid
under most circumstances, we don't care about the ordering of the sstable
identifiers, as they are just identifiers. so, as long as they can be
compared, we are good. but we have tests which expect that the sstables
can be ordered by the time they are created. for instance,
sstable_run_based_compaction_test has this expectation.

before this change, we compare two UUID-based generations by their
(MSB, LSB) lexicographically. but UUID v1 puts the lower bits of
the timestamp at the higher bits of MSB, so the ordering of the
"time" in timeuuid is not preserved when comparing the UUID-based
generations. this breaks the test of sstable_run_based_compaction_test,
which feeds the sstables to be compacted in a set, and the set is
ordered with the generation of the sstables.

after this change, we consider the UUID-based generation as
a timeuuid when comparing them.

Fixes #16215
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#16238
2023-11-30 14:50:44 +02:00
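Why lexicographic (MSB, LSB) comparison does not follow creation time can be shown with the version 1 UUID field layout, where the most significant 64 bits hold `time_low`, `time_mid`, the version nibble and `time_hi`, in that order. The `uuid` struct and the helper names below are illustrative, and breaking ties on the raw LSB is a simplification, not necessarily what Scylla does.

```cpp
#include <cassert>
#include <cstdint>

struct uuid {
    std::uint64_t msb;
    std::uint64_t lsb;
};

// Packs a 60-bit timestamp into the version 1 layout of the MSB:
// time_low (32 bits) | time_mid (16 bits) | version (4 bits) | time_hi (12 bits).
constexpr uuid make_v1(std::uint64_t timestamp, std::uint64_t lsb) {
    const std::uint64_t time_low = timestamp & 0xFFFFFFFFu;
    const std::uint64_t time_mid = (timestamp >> 32) & 0xFFFFu;
    const std::uint64_t time_hi  = (timestamp >> 48) & 0x0FFFu;
    const std::uint64_t version  = 1;
    return uuid{(time_low << 32) | (time_mid << 16) | (version << 12) | time_hi, lsb};
}

// Reassembles the timestamp from the scattered fields.
constexpr std::uint64_t v1_timestamp(const uuid& u) {
    const std::uint64_t time_low = u.msb >> 32;
    const std::uint64_t time_mid = (u.msb >> 16) & 0xFFFFu;
    const std::uint64_t time_hi  = u.msb & 0x0FFFu;
    return (time_hi << 48) | (time_mid << 32) | time_low;
}

// Ordering before the change: raw (MSB, LSB), lexicographically.
constexpr bool msb_lsb_less(const uuid& a, const uuid& b) {
    return a.msb != b.msb ? a.msb < b.msb : a.lsb < b.lsb;
}

// Ordering after the change: by embedded timestamp first (tie-break simplified).
constexpr bool timeuuid_less(const uuid& a, const uuid& b) {
    const std::uint64_t ta = v1_timestamp(a);
    const std::uint64_t tb = v1_timestamp(b);
    return ta != tb ? ta < tb : a.lsb < b.lsb;
}

// Two timestamps one tick apart, straddling a carry out of time_low.
constexpr uuid older = make_v1(0x1FFFFFFFFull, 0);
constexpr uuid newer = make_v1(0x200000000ull, 0);

static_assert(v1_timestamp(older) == 0x1FFFFFFFFull);
static_assert(msb_lsb_less(newer, older));   // raw order puts the newer one first
static_assert(timeuuid_less(older, newer));  // timestamp order is chronological
```

The two static assertions at the end capture the bug: the raw comparison sorts the newer identifier before the older one, while the timestamp-aware comparison does not.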
Pavel Emelyanov
0da37d5fa6 sstable: Generalize toc file read and parse
There are several places where TOC file is parsed into a vector of
components -- sstable::read_toc(), remove_by_toc_name() and
remove_by_registry_entry(). All three deserve some generalization.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-11-29 12:09:52 +03:00
Botond Dénes
f46cdce9d3 Merge 'Make memtable flush tolerate misconfigured S3 storage' from Pavel Emelyanov
Nowadays if memtable gets flushed into misconfigured S3 storage, the flush fails and aborts the whole scylla process. That's not very elegant. First, because upon restart garbage collecting non-sealed sstables would fail again. Second, because re-configuring an endpoint can be done at runtime: scylla re-reads this config upon HUP signal.

Flushing memtable restarts when seeing ENOSPC/EDQUOT errors from on-disk sstables. This PR extends this to handle misconfigured S3 endpoints as well.

fixes: #13745

Closes scylladb/scylladb#15635

* github.com:scylladb/scylladb:
  test: Add object_store test to validate config reloading works
  test: Add config update facility to test cluster
  test: Make S3_Server export config file as pathlib.Path
  config: Make object storage config updateable_value_source
  memtable: Extend list of checking codes
  sstables/storage/s3: Fix missing TOC status check
  s3/client: Map http exceptions into storage_io_error
  exceptions: Extend storage_io_error construction options
2023-11-28 09:33:37 +02:00
Pavel Emelyanov
1efddc228d sstable: Do not nest io-check wrappers into each other
When sealing an sstable on local storage, the storage driver performs
several flushes on a file that is directory open via checked-file.
Flush calls are wrapped with sstable_write_io_check, but that's
excessive, the checked file will wrap flushes with io-checks on its own

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>

Closes scylladb/scylladb#16173
2023-11-27 15:53:02 +02:00
Kefu Chai
ca31dab9d2 sstable: drop repaired_at related code
before we support incremental repair, there is no point in having the
code path setting / getting it. and even worse, it incurs confusion.

so, in this change, we

* just set the field to 0,
* drop the corresponding field in metadata_collector, as we never
  update it.
* add a comment to explain why this variable is initialized to 0

Fixes #16098
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#16169
2023-11-24 15:12:25 +02:00
Pavel Emelyanov
a34dae8c37 sstables/storage/s3: Fix missing TOC status check
When TOC file is missing while garbage collecting the S3 server would
resolve with storage_io_error(ENOENT) nowadays

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-11-21 16:47:50 +03:00
Pavel Emelyanov
2bf1e2a294 sstables: Throw early if endpoint for keyspace is not configured
When a keyspace is created it initializes the storage for it and
initialization of S3 storage is a good place to check if the endpoint
for the storage is configured at all.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-11-20 15:25:58 +03:00
Pavel Emelyanov
f2a99ad30a replica: Move storage options validation to sstables manager
Currently the cql statement .validate() callback is responsible for
checking if the non-local storage options are allowed with the
respective feature. Next patch will need to extend this check to also
validate the details of the provided storage options, but doing it at
cql level doesn't seem correct -- it's "too far" from query processor
down to sstables manager.

Good news is that there's a lower-level validation of the new keyspace,
namely the database::validate_new_keyspace() call. Move the storage
options validation into sstables manager, while at it, reimplement it
as a visitor to facilitate further extensions and plug the new
validation to the aforementioned database::validate_new_keyspace().

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-11-20 15:24:59 +03:00
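A visitor-style validation of storage options, in the spirit of the commit above, could be sketched with `std::variant` and `std::visit`. The option structs, the `object_storage_enabled` flag standing in for the cluster feature check, and the error messages are all hypothetical. The endpoint check belongs to the kind of extension the message anticipates rather than to this commit itself.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <variant>

// Hypothetical option types, for illustration only.
struct local_options {};
struct s3_options {
    std::string endpoint;
    std::string bucket;
};
using storage_options = std::variant<local_options, s3_options>;

// One operator() per storage kind; adding a kind means adding an overload.
struct storage_options_validator {
    bool object_storage_enabled;  // stand-in for the cluster feature check

    void operator()(const local_options&) const {
        // Local storage is always allowed.
    }

    void operator()(const s3_options& opts) const {
        if (!object_storage_enabled) {
            throw std::invalid_argument("non-local storage is not enabled");
        }
        if (opts.endpoint.empty()) {
            throw std::invalid_argument("S3 endpoint is not configured");
        }
    }
};

// Hypothetical entry point, e.g. called while validating a new keyspace.
inline void validate_storage_options(const storage_options& opts, bool object_storage_enabled) {
    std::visit(storage_options_validator{object_storage_enabled}, opts);
}
```

Because `std::visit` requires an overload for every alternative, adding a new storage kind to the variant without extending the validator fails at compile time.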
Pavel Emelyanov
2c31cd7817 sstables: Add has_endpoint_client() helper to manager
It's the get_endpoint_client() peer that only checks the client
presence. To be used by next patches.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-11-20 14:31:08 +03:00
Kefu Chai
15bfa09454 treewide: do not mark return value const if this has no effect
this change is a cleanup.

marking a value that is returned by value as `const` has no effect here,
so these `const` specifiers are useless. let's drop them.

and, if we compile the tree with `-Wignored-qualifiers`, the compiler
would warn like:

```
/home/kefu/dev/scylladb/schema/schema.hh:245:5: error: 'const' type qualifier on return type has no effect [-Werror,-Wignored-qualifiers]
  245 |     const index_metadata_kind kind() const;
      |     ^~~~~
```
so this change also silences the above warnings.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2023-11-17 17:46:19 +08:00
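That a `const` qualifier on a scalar or enum return type changes nothing can be checked at compile time. In the sketch below the class and the enumerators are hypothetical; only the `const index_metadata_kind kind() const` shape comes from the warning quoted in the commit above. The diagnostic is suppressed locally so that the sketch builds cleanly even with warnings treated as errors.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Hypothetical enum and class, shaped after the warning in the commit message.
enum class index_metadata_kind { keys, custom, composites };

class index_metadata {
    index_metadata_kind _kind = index_metadata_kind::keys;
public:
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wignored-qualifiers"
    // Before: the top-level const on the return type is ignored.
    const index_metadata_kind kind_with_const() const { return _kind; }
#pragma GCC diagnostic pop

    // After: same behaviour, no misleading qualifier.
    index_metadata_kind kind() const { return _kind; }
};

// Both calls produce a prvalue of the unqualified enum type.
static_assert(std::is_same_v<
    decltype(std::declval<const index_metadata&>().kind_with_const()),
    index_metadata_kind>);
static_assert(std::is_same_v<
    decltype(std::declval<const index_metadata&>().kind()),
    index_metadata_kind>);
```

The language adjusts a non-class prvalue of type `const T` to `T`, which is why the compiler reports the qualifier as having no effect.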