The namespace usage in this directory is very inconsistent, with files
and classes scattered in:
* global namespace
* namespace compaction
* namespace sstables
With cases, where all three used in the same file. This code used to
live in sstables/ and some of it still retains namespace sstables as a
heritage of that time. The mismatch between the dir (future module) and
the namespace used is confusing, so finish the migration and move all
code in compaction/ to namespace compaction too.
This patch, although large, is mechanic and only the following kind of
changes are made:
* replace namespace sstable {} with namespace compaction {}
* add namespace compaction {}
* drop/add sstables::
* drop/add compaction::
* move around forward-declarations so they are in the correct namespace
context
This refactoring revealed some awkward leftover coupling between
sstables and compaction, in sstables/sstable_set.cc, where the
make_sstable_set() methods of compaction strategies are implemented.
Remove support for generating numerical sstable generation for new sstables.
Loading such sstables is still supported but new sstables are always created with a uuid generation.
This is possible since:
* All live versions (since 5.4 / f014ccf369) now support uuid sstable generations.
* The `uuid_sstable_identifiers_enabled` config option (that is unused from version 2025.2 / 6da758d74c) controls only the use of uuid generations when creating new sstables. SSTables with uuid generations should still be properly loaded by older versions, even if `uuid_sstable_identifiers_enabled` is set to `false`.
Fixes#24248
* Enhancement, no backport needed
Closesscylladb/scylladb#24512
* github.com:scylladb/scylladb:
streaming: stream_blob: use the table sstable_generation_generator
replica: distributed_loader: process_upload_dir: use the table sstable_generation_generator
sstables: sstable_generation_generator: stop tracking highest generation
replica: table: get rid of update_sstables_known_generation
sstables: sstable_directory: stop tracking highest_generation
replica: distributed_loader: stop tracking highest_generation
sstables: sstable_generation: get rid of uuid_identifiers bool class
sstables_manager: drop uuid_sstable_identifiers
feature_service: move UUID_SSTABLE_IDENTIFIERS to supported_feature_set
test: cql_query_test: add test_sstable_load_mixed_generation_type
test: sstable_datafile_test: move copy_directory helper to test/lib/test_utils
test: database_test: move table_dir helper to test/lib/test_utils
It is not needed anymore.
With that database::_sstable_generation_generator can
be a regular member rather than optional and initialized
later.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Since table_state is a view to a compaction group, it makes sense
to rename it as so.
With upcoming incremental repair, each replica::compaction_group
will be actually two compaction groups, so there will be two
views for each replica::compaction_group.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
After it was switched to use schema builder, the indenation of untouched
lines deserves one extra space.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Everything, but perf test is straightforward switch.
The perf-test generated regular columns dynamically via vector, with
builder the vector goes away.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
assert() is traditionally disabled in release builds, but not in
scylladb. This hasn't caused problems so far, but the latest abseil
release includes a commit [1] that causes a 1000 insn/op regression when
NDEBUG is not defined.
Clearly, we must move towards a build system where NDEBUG is defined in
release builds. But we can't just define it blindly without vetting
all the assert() calls, as some were written with the expectation that
they are enabled in release mode.
To solve the conundrum, change all assert() calls to a new SCYLLA_ASSERT()
macro in utils/assert.hh. This macro is always defined and is not conditional
on NDEBUG, so we can later (after vetting Seastar) enable NDEBUG in release
mode.
[1] 66ef711d68Closesscylladb/scylladb#20006
It does nothing by calls the sstable method of the same name. Callers
can do it on their own, the method is public.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Erase retired sstable from compaction_state::sstables_requiring_cleanup
also on_compaction_completion (in addition to
compacting_sstable_registration::release_compacting
for offstrategy compaction with piggybacked cleanup
or any other compaction type that doesn't use
compacting_sstable_registration.
Add cleanup_during_offstrategy_incremental_compaction_test
that is modeled after cleanup_incremental_compaction_test to check
that cleanup doesn't attempt to cleanup already-deleted
sstables that were left over by offstrategy compaction
in sstables_requiring_cleanup.
Fixes#14304
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Use generation_type rather than generation_type::int_t
where possible and removed the deprecated
functions accepting the int_t.i
Ref #10459
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
It is possible to find no generation in an empty
table directory, and in he future, with uuid generations
it'd be possible to find no numeric generations in the
directory.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Convert all users to use sstables::generation_type::int_t.
Further patches will continue to convert most to
using sstables::generation_type instead so we can
abstract the value type.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Schema related files are moved there. This excludes schema files that
also interact with mutations, because the mutation module depends on
the schema. Those files will have to go into a separate module.
Closes#12858
Tests using replica::table::add_sstable_and_update_cache() cannot
rely on the sstable being added to a single compaction group, if
the test was forced to run with multiple groups.
Additionally let's remove try_flush_memtable_to_sstable() which
is retricted to a single group, allowing the entire test to now
pass with multiple groups.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Inferring shard from generation is long gone. We still use it in
some scripts, but that's no longer needed in Scylla, when loading
the SSTables, and it also conflicts with ongoing work of UUID-based
generations.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Closes#12476
That's important for multiple compaction groups. Once replica::table
supports multiple groups, there will be no table::as_table_state(),
so for testing table with a single group, we'll be relying on
column_family_for_tests::as_table_state().
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Now memtables live in compaction_group. Also introduced function
that selects group based on token, but today table always return
the single group managed by it. Once multiple groups are supported,
then the function should interpret token content to select the
group.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Preparatory change for main sstable set to be moved into compaction
group. After that, tests can no longer direct access the main
set.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
And let seal_active_memtable decide about how to handle them
as now all flush error handling logic is implemented there.
In particular, unlike today, sstable write errors will
cause internal error rather than loop forever.
Also, check for shutdown earlier to ignore errors
like semaphore_broken that might happen when
the table is stopped.
Refs #10498
(The issue will be considered fixed when going
into maintenance mode on write errors rather than
throwing internal error and potentially retrying forever)
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
`generation_type` is (supposed to be) conceptually different from
`int64_t` (even if physically they are the same), but at present
Scylla code still largely treats them interchangeably.
In addition to using `generation_type` in more places, we
provide (no-op) `generation_value()` and `generation_from_value()`
operations to make the smoke-and-mirrors more believable.
The churn is considerable, but all mechanical. To avoid even
more (way, way more) churn, unit test code is left untreated for
now, except where it uses the affected core APIs directly.
Signed-off-by: Michael Livshin <michael.livshin@scylladb.com>
Memtables are a replica-side entity, and so are moved to the
replica module and namespace.
Memtables are also used outside the replica, in two places:
- in some virtual tables; this is also in some way inside the replica,
(virtual readers are installed at the replica level, not the
cooordinator), so I don't consider it a layering violation
- in many sstable unit tests, as a convenient way to create sstables
with known input. This is a layering violation.
We could make memtables their own module, but I think this is wrong.
Memtables are deeply tied into replica memory management, and trying
to make them a low-level primitive (at a lower level than sstables) will
be difficult. Not least because memtables use sstables. Instead, we
should have a memtable-like thing that doesn't support merging and
doesn't have all other funky memtable stuff, and instead replace
the uses of memtables in sstable tests with some kind of
make_flat_mutation_reader_from_unsorted_mutations() that does
the sorting that is the reason for the use of memtables in tests (and
live with the layering violation meanwhile).
Test: unit (dev)
Closes#10120
Instead of lengthy blurbs, switch to single-line, machine-readable
standardized (https://spdx.dev) license identifiers. The Linux kernel
switched long ago, so there is strong precedent.
Three cases are handled: AGPL-only, Apache-only, and dual licensed.
For the latter case, I chose (AGPL-3.0-or-later and Apache-2.0),
reasoning that our changes are extensive enough to apply our license.
The changes we applied mechanically with a script, except to
licenses/README.md.
Closes#9937
Move replica-oriented classes to the replica namespace. The main
classes moved are ::database, ::keyspace, and ::table, but a few
ancillary classes are also moved. There are certainly classes that
should be moved but aren't (like distributed_loader) but we have
to start somewhere.
References are adjusted treewide. In many cases, it is obvious that
a call site should not access the replica (but the data_dictionary
instead), but that is left for separate work.
scylla-gdb.py is adjusted to look for both the new and old names.
The database, keyspace, and table classes represent the replica-only
part of the objects after which they are named. Reading from a table
doesn't give you the full data, just the replica's view, and it is not
consistent since reconciliation is applied on the coordinator.
As a first step in acknowledging this, move the related files to
a replica/ subdirectory.
Today, table relies on row_cache::invalidate() serialization for
concurrent sstable list updates to produce correct results.
That's very error prone because table is relying on an implementation
detail of invalidate() to get things right.
Instead, let's make table itself take care of serialization on
concurrent updates.
To achieve that, sstable_list_builder is introduced. Only one
builder can be alive for a given table, so serialization is guaranteed
as long as the builder is kept alive throughout the update procedure.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <20210721001716.210281-1-raphaelsc@scylladb.com>
From now own, _sstables becomes the compound set, and _main_sstables refer
only to the main sstables of the table. In the near future, maintenance
set will be introduced and will also be managed by the compound set.
So add_sstable() and on_compaction_completion() are changed to
explicitly insert and remove sstables from the main set.
By storing compound set in _sstables, functions which used _sstables for
creating reader, computing statistics, etc, will not have to be changed
when we introduce the maintenance set, so code change is a lot minimized
by this approach.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Column_family_test allows performing private methods on column_family's
sstable_set. It may be useful not only in the boost tests, so it's moved
from test/boost/sstable_test.hh to test/lib/sstable_test_env.hh.
sstable_test.hh includes sstable_test_env.hh, so no includes need to be
changed.
Signed-off-by: Wojciech Mitros <wojciech.mitros@scylladb.com>
On every compaction completion, sstable set is rebuilt from scratch.
With LCS and ~160G of data per shard, it means we'll have to create
a new sstable set with ~1000 entries whenever compaction completes,
which will likely result in reactor stalling for a significant
amount of time.
This is fixed by futurizing build_new_sstable_list(), so it will
yield whenever needed.
Fixes#7758.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
procedure is changed to return the new set, so caller will be responsible
for replacing the old set with the new one. this will allow our future
work where building new set and enabling it will be decoupled.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
This replaces a lot of make_lw_shared(schema(...)) with
make_shared_schema(...).
This makes it easier to drop a dependency on the differences between
seastar::make_shared and std::make_shared.
Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
sstable_test.hh started as collection of utilities shared between the
various `_sstable_test.cc` files. Predictably other tests started using
it as well, among them some that are non boost unit tests. This poses a
problem as if we add the missing boost/test/unit_test.hpp include to
sstable_test.hh these tests will suddenly have missing symbols from
boost::test. To avoid linking boost::test into all these users, extract
utilities more widely used into sstable_utils.hh
This descriptor contain all information needed for table to be properly
updated on compaction completion. A new member will be added to it soon,
which will store ranges to be invalidated in row cache on behalf of
cleanup compaction.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
and replace all dht::global_partitioner().decorate_key
with dht::decorate_key
It is an improvement because dht::decorate_key takes schema
and uses it to obtain partitioner instead of using global
partitioner as it was before.
Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
query_processor is a central class, so reducing its includes
can reduce dependencies treewite. This patch removes includes
for parsed_statement, cf_statement, and untyped_result_set and
fixes up the rest of the tree to include what it lacks as a result
of these removals.
The only use we had for the subscription was calling done, may as well
call it early and store the future<>.
Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
gcc 10 requires a semicolon after every compound requirement,
as per the standard. Add missing semicolons where necessary.
Message-Id: <20200129205805.20928-1-avi@scylladb.com>
Based on heap profiling, buffers used for storing half-parsed fields are
a major contributor to the overall memory consumption of reads. This
memory was completely "under the radar" before. Track it by using
tracked `temporary_buffer` instances everywhere in
`continuous_data_consumer`. As `continuous_data_consumer` is the basis
for parsing all index and data files, adding the tracing here
automatically covers all data, index and promoted index parsing.
I'm almost convinced that there is a better place to store the `permit`
then the three places now, but so far I was unable to completely
decipher the our data/index file parsing class hierarchy.