this is the 12nd changeset of a series which tries to give an overhaul to the CMake building system. this series has two goals:
- to enable developer to use CMake for building scylla. so they can use tools (CLion for instance) with CMake integration for better developer experience
- to enable us to tweak the dependencies in a simpler way. a well-defined cross module / subsystem dependency is a prerequisite for building this project with the C++20 modules.
this changeset includes following changes:
- build: cmake: remove Seastar from the option name
- build: cmake: add missing sources in test-lib and utils
- build: cmake: do not include main.cc in scylla-main
- build: cmake: define SEASTAR_TESTING_MAIN for SEASTAR tests
- build: cmake: add more tests
Closes#13180
* github.com:scylladb/scylladb:
build: cmake: add more tests
build: cmake: define SEASTAR_TESTING_MAIN for SEASTAR tests
build: cmake: do not include main.cc in scylla-main
build: cmake: add missing sources in test-lib and utils
build: cmake: remove Seastar from the option name
Simplified, more direct version of "dependency injection".
I.e. caller/initiator (main/cql_test_env) provides a set of
services it will eventually start. Configurable can remember
these. And use, at least after "start" notification.
Closes#13037
- build: cmake: remove test which does not exist yet
- build: cmake: document add_scylla_test()
- build: cmake: extract index, repair and data_dictionary out
- build: cmake: extract scylla-main out
- build: cmake: find Snappy before using it
- build: cmake: add missing linkages
- build: cmake: add missing sources to test-lib
- build: cmake: link sstables against libdeflate
- build: cmake: link Boost::regex against ICU::uc
Closes#13110
* github.com:scylladb/scylladb:
build: cmake: link Boost::regex against ICU::uc
build: cmake: link sstables against libdeflate
build: cmake: add missing sources to test-lib
build: cmake: add missing linkages
build: cmake: find Snappy before using it
build: cmake: extract scylla-main out
build: cmake: extract index, repair and data_dictionary out
build: cmake: document add_scylla_test()
build: cmake: remove test which does not exist yet
Allows a configurable to subscribe to life cycle notifications for scylla app.
I.e. do stuff on start/stop.
Also allow configurables in cql_test_env
v2:
* Fix camel casing
* Make callbacks future<> (should have been. mismerge?)
Closes#13035
When creating a deletion log for a bunch of sstables the code checks
that all sstables share the same "storage" by lexicographically
comparing their prefixes. That's not correct, as filesystem paths may
refer to the same directory even if not being equal.
So far that's been mostly OK, because paths manipulations were done in
simple forms without producing unequal paths. Patch 8a061bd8 (sstables,
code: Introduce and use change_state() call) triggerred a corner case.
fs::path foo("/foo");
sstring sub("");
foo = foo / sub;
produces a correct path of "/foo/", but the trailing slash breaks the
aforementioned assumption about prefixes comparison. As a result, when
an sstable moves between, say, staging and normal locations it may gain
a trailing slash breaking the deletion log creation code.
The fix is to restrict the deletion log creation not to rely on path
strings comparison completely and trim the trailing slash if it happens.
A test is included.
fixes: #13085
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Closes#13090
The method in question is only called with env's tempdir, so there's no
point in explicitly passing it.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
There's a wonderful comment describing what the reusable_sst is for near
one of its wrappers. It's better to drop the wrapper and move the
comment to where it belongs.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Lots of test cases make sstables with monotonically incrementing
generation values. In Scylla code this counter is maintained in class
table, but sstable tests not always have it. To mimic this behavior, the
test_env can keep track of the generation, so that callers just don't
mess with it (next patch).
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
There's a make_sstable_containing() helper that creates sstable and
populates it with mutations (and makes some post validation). The helper
accepts a factory function that should make sstable for it.
This patch shuffles this helper a bit by introducing an overload that
populates (and validates) the already existing sstable.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
There's the lonely test case that uses the mentioned template to carry
its own instance of tempdir over its lifetime. Patch the case to re-use
the already existing env's tempdir and drop the template.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Lots of (most of) test cases out there generate sstables inside env's
temporary directory. This patch adds some sugar to env that will allow
test cases omit explicit env.tempdir() call.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
This is a RAII-sh helper that cleans temp directory on destruction. To
be used in cases when a test needs to do several checks over clean
temporary directory (future patches).
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
The former helper is going to get rid of the fs::path& dir argument,
but the latter cannot yet live without it. The simplest solution is to
open-code the helper until better times.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
these dependencies were found when trying to compile
`user_function_test`. whenever a library libfoo references another one,
say, libbar, the corresponding linkage from libfoo to libbar is added.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
The code for compare_endpoints originates at the dawn of time (bc034aeaec)
and is called on the fast path from storage_proxy via `sort_by_proximity`.
This series considerably reduces the function's footprint by:
1. carefully coding the many comparisons in the function so to reduce the number of conditional banches (apparently the compiler isn't doing a good enough job at optimizing it in this case)
2. avoid sstring copy in topology::get_{datacenter,rack}
Closes#12761
* github.com:scylladb/scylladb:
topology: optimize compare_endpoints
to_string: add print operators for std::{weak,partial}_ordering
utils: to_sstring: deinline std::strong_ordering print operator
move to_string.hh to utils/
test: network_topology: add test_topology_compare_endpoints
There are two places that do it -- commitlog and batchlog replayers. Both can have local system-keyspace reference and use system-keyspace local query-processor for it. The peering save_truncation_record() is not that simple and is not patched by this PR
Closes#13087
* github.com:scylladb/scylladb:
system_keyspace: Unstatic get_truncation_record()
system_keyspace: Unstatic get_truncated_at()
batchlog_manager: Add system_keyspace dependency
main: Swap batchlog manager and system keyspace starts
system_keyspace: Unstatic get_truncated_position()
system_keyspace: Remove unused method
commitlog: Create commitlog_replayer with system keyspace
test: Make cql_test_env::get_system_keyspace() return sharded
commiltlog: Line-up field definitions
The manager will need system ks to get truncation record from, so add it
explicitly. Start-stop sequence no allows that
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
the type of return value of `get_table_views()` is a reference, so we
cannot return a reference to a temporary value.
in this change, a member variable is added to hold the _table_schema,
so it can outlive the function call.
this should silence following warning from Clang:
```
test/lib/expr_test_utils.cc:543:16: error: returning reference to local temporary object [-Werror,-Wreturn-stack-address]
return {view_ptr(_table_schema)};
^~~~~~~~~~~~~~~~~~~~~~~~~
```
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
- build: cmake: extract more subsystem out into its own CMakeLists.txt
- build: cmake: remove swagger_gen_files
- build: cmake: remove stale TODO comments
- build: cmake: expose scylla_gen_build_dir
- build: cmake: link against cryptopp
- build: cmake: add missing source to utils
- build: cmake: move lib sources into test-lib
- build: cmake: add test/perf
Closes#13059
* github.com:scylladb/scylladb:
build: cmake: add expr_test test
build: cmake: allow test to specify the sources
build: cmake: add test/perf
build: cmake: move lib sources into test-lib
build: cmake: add missing source to utils
build: cmake: link against cryptopp
build: cmake: expose scylla_gen_build_dir
build: cmake: remove stale TODO comments
build: cmake: remove swagger_gen_files
build: cmake: extract more subsystem out into its own CMakeLists.txt
This method requires callers to remember that the sstable is the collection of files on a filesystem and to know what exact directory they are all in. That's not going to work for object storage, instead, sstable should be moved between more abstract states.
This PR replaces move_to_new_dir() call with the change_state() one that accepts target sub-directory string and moves files around. Currently supported state changes:
* staging -> normal
* upload -> normal | staging
* any -> quarantine
All are pretty straightforward and move files between table basedir subdirectories with the exception that upload -> quarantine should move into upload/quarantine subdirectory. Another thing to keep in mind, that normal state doesn't have its subdir but maps directory to table's base directory.
Closes#12648
* github.com:scylladb/scylladb:
sstable: Remove explicit quarantization call
test: Move move_to_new_dir() method from sstable class
sstable, dist.-loader: Introduce and use pick_up_from_upload() method
sstables, code: Introduce and use change_state() call
distributed_loader: Let make_sstables_available choose target directory
these warnings are found by Clang-17 after removing
`-Wno-unused-lambda-capture` and '-Wno-unused-variable' from
the list of disabled warnings in `configure.py`.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
It's known that reading large cells in reverse cause large allocations.
Source: https://github.com/scylladb/scylladb/issues/11642
The loading is preliminary work for splitting large partitions into
fragments composing a run and then be able to later read such a run
in an efficiency way using the position metadata.
The splitting is not turned on yet, anywhere. Therefore, we can
temporarily disable the loading, as a way to avoid regressions in
stable versions. Large allocations can cause stalls due to foreground
memory eviction kicking in.
The default values for position metadata say that first and last
position include all clustering rows, but they aren't used anywhere
other than by sstable_run to determine if a run is disjoint at
clustering level, but given that no splitting is done yet, it
does not really matter.
Unit tests relying on position metadata were adjusted to enable
the loading, such that they can still pass.
Fixes#11642.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Closes#12979
The initial intent was to reduce the fanout of shared_sstable.hh through
v.u.g.hh -> cql_test_env.hh chain, but it also resulted in some shots
around v.u.g.hh -> database.hh inclusion.
By and large:
- v.u.g.hh doesn't need database.hh
- cql_test_env.hh doesn't need v.u.g.hh (and thus -- the
shared_sstable.hh) but needs database.hh instead
- few other .cc files need v.u.g.hh directly as they pulled it via
cql_test_env.hh before
- add forward declarations in few other places
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Closes#12952
There's a bunch of test cases that check how moving sstables files
around the filesystem works. These need the generic move_to_new_dir()
method from sstable, so move it there.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
this change is based on Botond Dénes's change which gave an overhaul
to the original CMake building system. this change is not enough
to build tests with CMake, as we still need to sort out the
dependencies.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
The former is a convenience wrapper over the latter. There's no real benefit in using it, but having two test_env-s is worse than just one.
Closes#12794
* github.com:scylladb/scylladb:
sstable_utils: Move the test_setup to perf/
sstable_utils: Remove unused wrappers over test_env
sstable_test: Open-code do_with_cloned_tmp_directory
sstable_test: Asynchronize statistics_rewrite case
tests: Replace test_setup::do_with_tmp_directory with test_env::do_with(_async)?
Schema related files are moved there. This excludes schema files that
also interact with mutations, because the mutation module depends on
the schema. Those files will have to go into a separate module.
Closes#12858
Move mutation-related files to a new mutation/ directory. The names
are kept in the global namespace to reduce churn; the names are
unambiguous in any case.
mutation_reader remains in the readers/ module.
mutation_partition_v2.cc was missing from CMakeLists.txt; it's added in this
patch.
This is a step forward towards librarization or modularization of the
source base.
Closes#12788
LWT `IF` (column_condition) duplicates the expression prepare and evaluation code. Annoyingly,
LWT IF semantics are a little different than the rest of CQL: a NULL equals NULL, whereas usually
NULL = NULL evaluates to NULL.
This series converts `IF` prepare and evaluate to use the standard expression code. We employ
expression rewriting to adjust for the slightly different semantics.
In a few places, we adjust LWT semantics to harmonize them with the rest of CQL. These are pointed
out in their own separate patches so the changes don't get lost in the flood.
Closes#12356
* github.com:scylladb/scylladb:
cql3: lwt: move IF clause expression construction to grammar
cql3: column_condition: evaluate column_condition as a single expression
cql3: lwt: allow negative list indexes in IF clause
cql3: lwt: do not short-circuit col[NULL] in IF clause
cql3: column_condition: convert _column to an expression
cql3: expr: generalize evaluation of subscript expressions
cql3: expr: introduce adjust_for_collection_as_maps()
cql3: update_parameters: use evaluation_inputs compatible row prefetch
cql3: expr: protect extract_column_value() from partial clustering keys
cql3: expr: extract extract_column_value() from evaluation machinery
cql3: selection: introduce selection_from_partition_slice
cql3: expr: move check for ordering on duration types from restrictions to prepare
cql3: expr: remove restrictions oper_is_slice() in favor of expr::is_slice()
cql3: column_condition: optimize LIKE with constant pattern after preparing
cql3: expr: add optimizer for LIKE with constant pattern
test: lib: add helper to evaluate an expression with bind variables but no table
cql3: column_condition: make the left-hand-side part of column_condition::raw
cql3: lwt: relax constraints on map subscripts and LIKE patterns
cql3: expr: fix search_and_replace() for subscripts
cql3: expr: fix function evaluation with NULL inputs
cql3: expr: add LWT IF clause variants of binary operators
cql3: expr: change evaluate_binop_sides to return more NULL information
It's only used by the single test and apparently exists since the times
seastar was missing the future::discard_result() sugar
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Closes#12803
Sometimes we want to defeat the expression optimizer's ability to
fold constant expressions. A bind variable is a convenient way to
do this, without the complexity of faking a schema and row inputs.
Add a helper to evaluate an expression with bind variable parameters,
doing all the paperwork for us.
A companion make_bind_variable() is added to likewise simplify
creating bind variables for tests.
The sstable perf test uses test_setup ability to create temporary
directory and clean it and that's the only place that uses it. Move the
remainders of test_setup into perf/ so that no unit tests attempt to
re-use it (there's test_env for that).
Remove unused _walker and _create_directory while at it.
Mark protected stuff private while at it as well.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
We currently have two method families to generate partition keys:
* make_keys() in test/lib/simple_schema.hh
* token_generation_for_shard() in test/lib/sstable_utils.hh
Both work only for schemas with a single partition key column of `text` type and both generate keys of fixed size.
This is very restrictive and simplistic. Tests, which wanted anything more complicated than that had to rely on open-coded key generation.
Also, many tests started to rely on the simplistic nature of these keys, in particular two tests started failing because the new key generation method generated keys of varying size:
* sstable_compaction_test.sstable_run_based_compaction_test
* sstable_mutation_test.test_key_count_estimation
These two tests seems to depend on generated keys all being of the same size. This makes some sense in the case of the key count estimation test, but makes no sense at all to me in the case of the sstable run test.
Closes#12657
* github.com:scylladb/scylladb:
test/lib/sstable_utils: remove now unused token_generation_for_shard() and friends
test/lib/simple_schema: remove now unused make_keys() and friends
test: migrate to tests::generate_partition_key[s]()
test/lib/test_services: add table_for_tests::make_default_schema()
test/lib: add key_utils.hh
test/lib/random_schema.hh: value_generator: add min_size_in_bytes
The system_keyspace defines several auxiliary methods to help view_builder update system.scylla_views_builds_in_progress and system.built_views tables. All use global qctx thing.
It only takes adding view_builder -> system_keyspace dependency in order to de-static all those helpers and let them use query-processor from it, not the qctx.
Closes#12728
* github.com:scylladb/scylladb:
system_keysace: De-static calls that update view-building tables
storage_service: Coroutinize mark_existing_views_as_built()
api: Unset column_famliy endpoints
api: Carry sharded<db::system_keyspace> reference over
view_builder: Add system_keyspace dependency
Introduces task manager's compaction module. That's an initial
part of integration of compaction with task manager.
When fully integrated, task manager will allow user to track compaction
operations, check status and progress of each individual one. It will help
with creating an asynchronous version of rest api that forces any compaction.
Currently, users can see with /task_manager/list_modules api call that
compaction is one of the modules accessible through task manager.
They won't get any additional information though, since compaction
tasks are not created yet.
A shared_ptr to compaction module is kept in compaction manager.
Closes#12635
* github.com:scylladb/scylladb:
compaction: test: pass task_manager to compaction_manager in test environment
compaction: create and register task manager's module for compaction
tasks: add task_manager constructor without arguments
This series switches memtable and cache to use a new representation for mutation data,
called `mutation_partition_v2`. In this representation, range tombstone information is stored
in the same tree as rows, attached to row entries. Each entry has a new tombstone field,
which represents range tombstone part which applies to the interval between this entry and
the previous one. See docs/dev/mvcc.md for more details about the format.
The transient mutation object still uses the old model in order to avoid work needed to adapt
old code to the new model. It may also be a good idea to live with two models, since the
transient mutation has different requirements and thus different trade-offs can be made.
Transient mutation doesn't need to support eviction and strong exception guarantees,
so its algorithms and in-memory representation can be simpler.
This allows us to incrementally evict range tombstone information. Before this series,
range tombstones were accumulated and evicted only when the whole partition entry was evicted. This
could lead to inefficient use of cache memory.
Another advantage of the new representation is that reads don't have to lookup
range tombstone information in a different tree while reading. This leads to simpler
and more efficient readers.
There are several disadvantages too. Firstly, rows_entry is now larger by 16 bytes.
Secondly, update algorithms are more complex because they need to deoverlap range tombstone
information. Also, to handle preemption and provide strong exception guarantees, update
algorithms may need to allocate sentinel entries, which adds complexity and reduces performance.
The memtable reader was changed to use the same cursor implementation
which cache uses, for improved code reuse and reducing risk of bugs
due to discrepancy of algorithms which deal with MVCC.
Remaining work:
- performance optimizations to apply_monotonically() to avoid regressions
- performance testing
- preemption support in apply_to_incomplete (cache update from memtable)
Fixes#2578Fixes#3288Fixes#10587Closes#12048
* github.com:scylladb/scylladb:
test: mvcc: Extend some scenarios with exhaustive consistency checks on eviction
test: mvcc: Extract mvcc_container::allocate_in_region()
row_cache, lru: Introduce evict_shallow()
test: mvcc: Avoid copies of mutation under failure injection
test: mvcc: Add missing logalloc::reclaim_lock to test_apply_is_atomic
mutation_partition_v2: Avoid full scan when applying mutation to non-evictable
Pass is_evictable to apply()
tests: mutation_partition_v2: Introduce test_external_memory_usage_v2 mirroring the test for v1
tests: mutation: Fix test_external_memory_usage() to not measure mutation object footprint
tests: mutation_partition_v2: Add test for exception safety of mutation merging
tests: Add tests for the mutation_partition_v2 model
mutation_partition_v2: Implement compact()
cache_tracker: Extract insert(mutation_partition_v2&)
mvcc, mutation_partition: Document guarantees in case merging succeeds
mutation_partition_v2: Accept arbitrary preemption source in apply_monotonically()
mutation_partition_v2: Simplify get_continuity()
row_cache: Distinguish dummy insertion site in trace log
db: Use mutation_partition_v2 in mvcc
range_tombstone_change_merger: Introduce peek()
readers: Extract range_tombstone_change_merger
mvcc: partition_snapshot_row_cursor: Handle non-evictable snapshots
mvcc: partition_snapshot_row_cursor: Support digest calculation
mutation_partition_v2: Store range tombstones together with rows
db: Introduce mutation_partition_v2
doc: Introduce docs/dev/mvcc.md
db: cache_tracker: Introduce insert() variant which positions before existing entry in the LRU
db: Print range_tombstone bounds as position_in_partition
test: memtable_test: Relax test_segment_migration_during_flush
test: cache_flat_mutation_reader: Avoid timestamp clash
test: cache_flat_mutation_reader_test: Use monotonic timestamps when inserting rows
test: mvcc: Fix sporadic failures due to compact_for_compaction()
test: lib: random_mutation_generator: Produce partition tombstone less often
test: lib: random_utils: Introduce with_probability()
test: lib: Improve error message in has_same_continuity()
test: mvcc: mvcc_container: Avoid UB in tracker() getter when there is no tracker
test: mvcc: Insert entries in the tracker
test: mvcc_test: Do not set dummy::no on non-clustering rows
mutation_partition: Print full position in error report in append_clustered_row()
db: mutation_cleaner: Extract make_region_space_guard()
position_in_partition: Optimize equality check
mvcc: Fix version merging state resetting
mutation_partition: apply_resume: Mark operator bool() as explicit
The view builder updates system.scylla_views_builds_in_progress and
.built_views tables and thus needs the system keyspace instance.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Each instance of compaction manager should have compaction module pointer
initialized. All contructors get task_manager reference with which
the module is created.
Now any boost test can run with multiple compaction groups by default,
without any change in the boost test cases whatsoever.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Scylla unit tests are limited to command line options defined by
Seastar testing framework.
For extending the set of options, Scylla unit tests can now
include test/lib/scylla_test_case.hh instead of seastar/testing/test_case.hh,
which will "hijack" the entry point and will process the command line
options, then feed the remaining options into seastar testing entry
point.
This is how it looks like when asking for help:
Scylla tests additional options:
--help Produces help message
--x-log2-compaction_groups arg (=0) Controls static number of compaction
groups per table per shard. For X groups,
set the option to log (base 2) of X.
Example: Value of 3 implies 8 groups.
Running 1 test case...
App options:
-h [ --help ] show help message
--help-seastar show help message about seastar options
--help-loggers print a list of logger names and exit
--random-seed arg Random number generator seed
--fail-on-abandoned-failed-futures arg (=1)
Fail the test if there are any
abandoned failed futures
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>