Commit Graph

25 Commits

Author SHA1 Message Date
Michael Livshin
3fef604075 sstables_manager: add get_local_host_id() method and support
Since ME sstable format includes originating host id in stats
metadata, local host id needs to be made available for writing and
validation.

Both Scylla server (where local host id comes from the `system.local`
table) and unit tests (where it is fabricated) must be accomodated.
Regardless of how the host id is obtained, it is stored in the db
config instance and accessed through `sstables_manager`.

Signed-off-by: Michael Livshin <michael.livshin@scylladb.com>
2022-02-16 18:21:24 +02:00
Avi Kivity
fcb8d040e8 treewide: use Software Package Data Exchange (SPDX) license identifiers
Instead of lengthy blurbs, switch to single-line, machine-readable
standardized (https://spdx.dev) license identifiers. The Linux kernel
switched long ago, so there is strong precedent.

Three cases are handled: AGPL-only, Apache-only, and dual licensed.
For the latter case, I chose (AGPL-3.0-or-later and Apache-2.0),
reasoning that our changes are extensive enough to apply our license.

The changes we applied mechanically with a script, except to
licenses/README.md.

Closes #9937
2022-01-18 12:15:18 +01:00
Tomasz Grabiec
f14576f4be sstables: Hide partition_index_cache implementation away from sstables.hh
Reduces scope of the header to index_reader.hh which reduces
recompilation time.
2021-07-02 19:02:14 +02:00
Tomasz Grabiec
af4cc233c3 sstables: Destroy partition index cache gently
There could be a lot of them so we should clear it gently to avoid
reactor stalls.
2021-07-02 19:02:14 +02:00
Tomasz Grabiec
a5c72ed899 sstables, database: Keep cache_tracker reference inside sstables_manager
So that sstable code can pick it up for caching (lru and region).
2021-07-02 10:25:58 +02:00
Avi Kivity
a55b434a2b treewide: extent copyright statements to present day 2021-06-06 19:18:49 +03:00
Piotr Sarna
6de2691bbd sstables,test: remove variables depending on old features
In order to maintain backward compatibility wrt. cluster features,
two boolean variables were kept in sstable writers:
 - correctly_serialize_non_compound_range_tombstones
 - correctly_serialize_static_compact_in_mc

Since these features are assumed to always be present now,
the above variables are no longer needed and can be purged.
2021-03-30 09:37:41 +02:00
Piotr Sarna
28c9af6fa5 sstables: stop relying on CORRECT_STATIC_COMPACT_IN_MC feature
The feature bit is going away because it's over 2 years old,
so the code which depended on it becomes unconditional.
2021-03-30 09:37:04 +02:00
Botond Dénes
f0b284dab8 sstables: enable token monotonicity validation by default
Partition key order validation in data written to sstables can be very
disruptive. All our components in the storage layers assume that
partitions are in order, which means that reading out-of-order
partitions triggers undefined behaviour. Computer scientists often joke
that undefined behaviour can erase your hard drive and in this case the
damage done by undefined behaviour caused by out-of-order partitions is
very close to that. The corruption is known to mutate causing crashes,
corrupting more data and even loose data. For this reason it is
imperative that out-of-order partitions cannot get into sstables. This
patch enables token monotonicity validation unconditionally in
the sstable writer. As partition key monotonicity checks involve a key
copy per partition, which might have an impact on the performance, we do
the next best thing instead and enable only token monotonicity
validation.
2021-03-01 07:49:23 +02:00
Botond Dénes
694f8a4ec6 mutation_fragment_stream_validating_filter: make validation levels more fine-grained
Currently key order validation for the mutation fragment stream
validating filter is all or nothing. Either no keys (partition or
clustering) are validated or all of them. As we suspect that clustering
key order validation would add a significant overhead, this discourages
turning key validation on, which means we miss out on partition key
monotonicity validation which has a much more moderate cost.
This patch makes this configurable in a more fine-grained fashion,
providing separate levels for partition and clustering key monotonicity
validation.

As the choice for the default validation level is not as clear-cut as
before, the default value for the validation level is removed in the
validating filter's constructor.
2021-03-01 07:49:23 +02:00
Benny Halevy
22f6023ac3 sstables: sstable_writer_config: add origin member
Add a string describing where the sstables originated
from (e.g. memtable, repair, streaming, compaction, etc.)

If configure_writer is called with a nullptr, the origin
will be equal to an empty string.

Introduce test_env_sstables_manager that provides an overload
of configure_writer with no parmeters that calls the base-class'
configure_writer with "test" origin.  This was to reduce the
code churn in this patch and to keep the tests simple.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2021-02-01 16:45:52 +02:00
Avi Kivity
5db96170a5 sstables: make sstables_manager take charge of closing sstables
Currently, closing sstables happens from the sstable destructor.
This is problematic since a destructor cannot wait for I/O, so
we launch the file close process in the background. We therefore
lose track of when the closing actually takes place.

This patch makes sstables_manager take charge of the close process.
Every sstable is linked into one of two intrusive lists in its
manager: _active or _undergoing_close. When the reference count
of the sstable drops to zero, we move it from _active to
_undergoing_close and begin closing the files. sstables_manager
remembers all closes and when sstables_manager::close() is called,
it waits for all of them to complete. Therefore,
sstables_manager::close() allows us to know that all files it
manages are closed (and deleted if necessary).

The sstables_manager also gains a destructor, which disables
move construction.
2020-09-23 20:55:17 +03:00
Avi Kivity
a90a511d36 sstables_manager: introduce a stub close()
sstables_manager is going to take charge of its sstables lifetimes,
so it will need a close() to wait until sstables are deleted.

This patch adds sstables_manager::close() so that the surrounding
infrastructure can be wired to call it. Once that's done, we can
make it do the waiting.
2020-09-23 20:55:04 +03:00
Piotr Sarna
16b4b86697 sstables: drop checks for non-compound range tombstones support
Correct non-compound range tombstones are supported for over 2 years
and upgrades are only allowed from versions which already have the
support, so the checks are hereby dropped.
2020-09-14 12:09:51 +02:00
Pavel Emelyanov
89a1b09214 sstables_manager: Keep format on
Make the database be the format_selector target, so
when the format is selected its set on database which
in turn just forwards the selection into sstables
managers. All users of the format are already patched
to read it from those managers.

The initial value for the format is the highest, which
is needed by tests. When scylla starts the format is
updated by format_selector, first after reading from
system tables, then by selectiing it from features.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2020-05-25 14:17:28 +03:00
Rafael Ávila de Espíndola
0d7281ca06 sstable: Move sstables_manager constructor out of line
There is no reason to have it in a header.

Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
Message-Id: <20200320005225.178381-1-espindola@scylladb.com>
2020-03-21 19:47:29 +02:00
Pavel Emelyanov
7363d56946 sstables: Move get_highest_supported_format
The global get_highest_supported_format helper and its declaration
are scattered all over the code, so clean this up and prepare the
ground for moving _sstables_format from the storage_service onto
the sstables_manager (not this set).

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2020-02-25 14:31:45 +03:00
Pavel Emelyanov
5dea657991 sstable_writer_config: Extend with more db::config stuff
The enable_sstable_key_validation and summary_bytes_cost are used
in sstables writing code, keeping them on sstable_writer_config
removes more calls to global get_config().

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2020-02-25 14:31:34 +03:00
Pavel Emelyanov
85d9326d70 sstables_manager: Don't use global helper to generate writer config
The main goal of this patch is to stop using get_config() glbal
when creating the sstable_writer_config instance.

Other than being global the existing get_config() is also confusing
as it effectively generates 3 (three) sorts of configs -- one for
scylla, when db config and features are ready, the other one for
tests, when no storage service is at hands, and the third one for
tests as well, when the storage service is created by test env
(likely intentionally, but maybe by coincidence the resulting config
is the same as for no-storage-service case).

With this patch it's now 100% clear which one is used when. Also
this makes half the work of removing get_config() helper.

The db::config and feature_service used to initialize the managers
are referenced by database that creates and keeps managers on,
so the references are safe.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2020-02-25 14:31:04 +03:00
Pavel Emelyanov
3a603729d4 sstable_writer_config: Sanitize out some features fields initialization
Similar to previous patch -- initialize config fields from features
in configurator, not in default initializers.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2020-02-25 14:31:04 +03:00
Pavel Emelyanov
34302a3e1c sstable_writer_config: Factor out some field initialization
The promoted_index_block_size is taken from db config in two places.
Factor this out and, at the same time, stop keeping it as std::optional.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2020-02-25 14:31:04 +03:00
Pavel Emelyanov
5adce3390c sstables: Generate writer config via manager only
The sstable_writer_config creation looks simple (just declare
the struct instance) but behind the scenes references storage
and feature services, messes with database config, etc.

This patch teaches the sstables_manager generate the writer
config and makes the rest of the code use it. For future
safety by-hands creation of the sstable_writer_config is
prohibited.

The manager is referenced through table-s and sstable-s, but
two existing sstables_managers live on database object, and
table-s and sstable-s both live shorter than the database,
this reference is save.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2020-02-25 14:31:04 +03:00
Pavel Emelyanov
f289da1e3b sstables: Keep reference on manager
This is needed for further patching. The sstables_manager outlives
all sstables objects, so it's safe.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2020-02-25 14:31:03 +03:00
Benny Halevy
223e1af521 sstables: provide large_data_handler to constructor
And use it for writing the sstable and/or when deleting it.

Refs #4198

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2019-03-26 16:24:19 +02:00
Benny Halevy
eebc3701a5 sstables: introduce sstables_manager
The goal of the sstables manager is to track and manage sstables life-cycle.
There is a sstable manager instance per database and it is passed to each column-family
(and test environment) on construction.
All sstables created, loaded, and deleted pass through the sstables manager.

The manager will make sure consumers of sstables are in sync so that sstables
will not be deleted while in use.

Refs #4149

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2019-03-26 16:05:08 +02:00