Pass the test-generated shared_sstable to validate_read
and then to sstable_assertions so it can be used
for make_sstable version and generation params.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Just use the one we created during compaction
for verification so we won't have to rely on a particular
generation/version.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Rather than calling cf.stop_and_keep_alive() before the test exits,
since it must also be stopped on failure.
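The point about stopping on failure can be illustrated with a minimal Python sketch. It is an analogy only, not the C++ test code: `FakeTable`, `stopped_on_exit` and `run_test_body` are hypothetical names, and the cleanup is registered up front so that it also runs when the test body raises.

```python
from contextlib import contextmanager


class FakeTable:
    """Hypothetical stand-in for the table (`cf`) used by the test."""

    def __init__(self):
        self.stopped = False

    def stop(self):
        self.stopped = True


@contextmanager
def stopped_on_exit(table):
    # The finally clause runs whether the body returns normally or raises.
    try:
        yield table
    finally:
        table.stop()


def run_test_body(table, fail):
    with stopped_on_exit(table):
        if fail:
            raise RuntimeError("simulated test failure")
```

With an explicit stop call at the end of the test body, the failing case would skip the stop; here both paths stop the table.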
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
We already use test_env::do_with_async in this function
but we didn't take full advantage of it to simplify the
implementation.
Do that before further changes are made.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
generations to idx
The function used `calculate_generation_for_new_table` for
the sstables generation. The so-called `generations` are just used
to generate key indices.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
The `storage_options` object describes where sstables should be located. Currently the object resides on keyspace_metadata and is thus not available at the place where it is needed the most -- the `table::make_sstable()` call. This set converts keyspace_metadata::storage_opts to an lw-shared-ptr and shares the pointer with class table.
refs: #12523 (detached small change from large PR)
Closes #13212
* github.com:scylladb/scylladb:
table: Keep storage options lw-shared-ptr
keyspace_metadata: Make storage options lw-shared-ptr
this change should address the FTBFS with Clang-17.
it turns out we are comparing a mutation with an
optimized_optional<mutation>, and Clang-17 does not want to convert the
LHS, which is a mutation, to optimized_optional<mutation> for performing
the comparison using operator==(const optimized_optional<mutation>&),
despite that optimized_optional(const T& obj) is not marked explicit.
this is understandable.
so, in this change, instead of relying on the implicit conversion, we
just
* check if the optional actually holds a value
* and compare the value by dereferencing the optional.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closes #13196
this series intends to deprecate `::join()`, as it always materializes a range into a concrete string. but what we always want is to print the elements in the given range to a stream, or to a seastar logger, which is backed by fmtlib. also, because fmtlib offers exactly the same set of features implemented by to_string.hh, this change would allow us to use fmtlib to replace to_string.hh for better maintainability, and potentially better performance, as fmtlib is lazily evaluated and claims to be performant under most circumstances.
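The difference between materializing a joined string and printing the elements straight to a stream can be sketched in Python. This is only an analogy for the idea, not fmtlib's or Scylla's API: `write_joined` is a hypothetical helper.

```python
import io


def write_joined(stream, items, sep=", "):
    """Write items to stream one by one, without building a joined string first."""
    first = True
    for item in items:
        if not first:
            stream.write(sep)
        stream.write(str(item))
        first = False
```

A caller that only needs the text in a log or stream passes the stream in; a caller that really needs a string can still collect the output in an `io.StringIO` and read it back.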
Closes #13163
* github.com:scylladb/scylladb:
utils: to_string: move join to namespace utils
treewide: use fmt::join() when appropriate
row_cache: pass "const cache_entry" to operator<<
* add a new test KIND "UNIT", which provides its own main()
* add all tests which were not included yet
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Tables need to know which storage their sstables need to be located at,
so class table needs to have its own reference to the storage options. The
thing can be inherited from the keyspace metadata.
Tests sometimes create table without keyspace at hand. For those use
default-initialized storage options (which is local storage).
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
now that fmtlib provides fmt::join(). see
https://fmt.dev/latest/api.html#_CPPv4I0EN3fmt4joinE9join_viewIN6detail10iterator_tI5RangeEEN6detail10sentinel_tI5RangeEEERR5Range11string_view
there is no need to reinvent the wheel. so in this change, the homebrew
join() is replaced with fmt::join().
as fmt::join() returns a join_view, this could improve the
performance under certain circumstances where the fully materialized
string is not needed.
please note, the goal of this change is to use fmt::join(), and this
change does not intend to improve the performance of existing
implementation based on "operator<<" unless the new implementation is
much more complicated. we will address the unnecessarily materialized
strings in a follow-up commit.
some noteworthy things related to this change:
* unlike the existing `join()`, `fmt::join()` returns a view. so we
have to materialize the view if what we expect is a `sstring`
* `fmt::format()` does not accept a view, so we cannot pass the
return value of `fmt::join()` to `fmt::format()`
* fmtlib does not format a typed pointer, i.e., it does not format,
for instance, a `const std::string*`. but operator<<() always prints
a typed pointer. so if we want to format a typed pointer, we either
need to cast the pointer to `void*` or use `fmt::ptr()`.
* fmtlib is not able to pick up the overload of
`operator<<(std::ostream& os, const column_definition* cd)`, so we
have to use a wrapper class of `maybe_column_definition` for printing
a pointer to `column_definition`. since the overload is only used
by the two overloads of
`statement_restrictions::add_single_column_parition_key_restriction()`,
the operator<< for `const column_definition*` is dropped.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
this is the 13th changeset of a series which tries to give an overhaul to the CMake building system. this series has two goals:
- to enable developers to use CMake for building scylla, so they can use tools (CLion for instance) with CMake integration for better developer experience
- to enable us to tweak the dependencies in a simpler way. a well-defined cross module / subsystem dependency is a prerequisite for building this project with the C++20 modules.
this changeset includes following changes:
- build: cmake: increase per link job mem to 4GiB
- build: cmake: add missing sources to test-lib
- build: cmake: add more tests
- build: cmake: remote quotes in "include()" commands
- build: cmake: drop unnecessary linkages
Closes #13199
* github.com:scylladb/scylladb:
build: cmake: drop unnecessary linkages
build: cmake: remote quotes in "include()" commands
build: cmake: add more tests
build: cmake: add missing sources to test-lib
build: cmake: increase per link job mem to 4GiB
The translated Cassandra unit tests in cassandra_tests/validation/operations/
reproduced three bugs in GROUP BY's interaction with LIMIT and PER PARTITION
LIMIT - issues #5361, #5362 and #5363. Unfortunately, those test functions
are very long, and each test fails on all of these issues and a few more,
making it difficult to use these tests to verify when those bugs have
been fixed. In other words, ideally a patch for issue 5361 should un-xfail
some reproducing test for this issue - but all the existing tests will
continue to fail after fixing 5361, because of other remaining bugs.
So in this patch, I created a new test file test_group_by.py with my own
tests for the GROUP BY feature. I tried to explore the different
capabilities of the GROUP BY feature, its different success and error
paths, and how GROUP BY interacts with LIMIT and PER PARTITION LIMIT.
As usual, I created many small test functions and not one huge test
function, and as a result we now have 5 xfailing tests, each of which
reproduces one bug and will start to pass when that bug is fixed.
All tests added here pass on Cassandra.
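As a rough illustration of what such small tests check, here is a pure-Python model of GROUP BY combined with LIMIT and PER PARTITION LIMIT. It rests on an assumption not stated in this message: that, as in Cassandra, both limits count groups rather than rows. `group_count` and the row layout are hypothetical and are not taken from test_group_by.py.

```python
from itertools import groupby, islice


def group_count(rows, partition_key, group_key, limit=None, per_partition_limit=None):
    """Model of SELECT ..., COUNT(*) ... GROUP BY over rows already in clustering order."""
    out = []
    for _, part_rows in groupby(rows, key=partition_key):
        # One (group key, row count) pair per group inside this partition.
        groups = ((k, sum(1 for _ in g)) for k, g in groupby(part_rows, key=group_key))
        # Assumption: PER PARTITION LIMIT caps the groups taken from each partition.
        out.extend(islice(groups, per_partition_limit))
    # Assumption: LIMIT caps the total number of groups returned.
    return out[:limit]
```

Each limit can then be exercised in its own short test, so one failing behaviour maps to one failing test.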
Refs #5361
Refs #5362
Refs #5363
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Closes #13136
A read that requested memory and has to wait for it can be registered as inactive. This can happen for example if the memory request originated from a background I/O operation (a read-ahead maybe).
Handling this case is currently very difficult. What we want to do is evict such a read on-the-spot: the fact that there is a read waiting on memory means memory is in demand and so inactive reads should be evicted. To evict this reader, we'd first have to remove it from the memory wait list, which is almost impossible currently, because `expiring_fifo<>`, the type used for the wait list, doesn't allow for that.
So in this PR we set out to make this possible first, by transforming all current queues to be intrusive lists of permits. Permits are already linked into an intrusive list, to allow for enumerating all existing permits. We use these existing hooks to link the permits into the appropriate queue, and back to `_permit_list` when they are not in any special queue. To make this possible we first have to make all lists store naked permits, moving all auxiliary data fields currently stored in wrappers like `entry` into the permit itself.
With this, all queues and lists in the semaphore are intrusive lists, storing permits directly, which has the following implications:
* queues no longer take extra memory, as all of them are intrusive
* permits are completely self-sufficient w.r.t. queuing: code can queue or dequeue permits just with a reference to a permit at hand, no other wrapper, iterator, pointer, etc. is necessary.
* queues don't keep permits alive anymore; destroying a permit will automatically unlink it from the respective queue. This could lead to use-after-free where code relied on the queue keeping the permit alive, but it is not a problem in practice: only one code path (`reader_concurrency_semaphore::with_permit()`) had to be adjusted.
After all these extensive preparations, we can now handle the case of evicting a reader which is queued on memory.
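The intrusive-list idea can be sketched in Python under simplifying assumptions: the link fields live in the permit itself, so a permit can be moved between lists or unlinked with nothing but a reference to it. `Permit` and `PermitList` are hypothetical names for illustration and do not mirror the C++ classes.

```python
class Permit:
    def __init__(self, name):
        self.name = name
        # Intrusive hooks: the list links are stored in the element itself.
        self._prev = None
        self._next = None

    def unlink(self):
        """Remove this permit from whichever list it is in, if any."""
        if self._prev is not None:
            self._prev._next = self._next
            self._next._prev = self._prev
            self._prev = self._next = None


class PermitList:
    def __init__(self):
        # Circular list with a sentinel node, so there are no edge cases.
        self._head = Permit("<sentinel>")
        self._head._prev = self._head._next = self._head

    def push_back(self, permit):
        permit.unlink()  # a permit is in at most one list at a time
        last = self._head._prev
        last._next = permit
        permit._prev = last
        permit._next = self._head
        self._head._prev = permit

    def names(self):
        out = []
        node = self._head._next
        while node is not self._head:
            out.append(node.name)
            node = node._next
        return out
```

Moving a permit from a general list to a wait list is a single `push_back()` on the target list, and dequeuing it needs no iterator or wrapper, only `permit.unlink()`. The lists themselves allocate nothing per element.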
Fixes: #12700
Closes #12777
* github.com:scylladb/scylladb:
reader_concurrency_semaphore: handle reader blocked on memory becoming inactive
reader_concurrency_semaphore: move _permit_list next to the other lists
reader_permit: evict inactive read on timeout
reader_concurrency_semaphore: move inactive_read to .cc
reader_concurrency_semaphore: store permits in _inactive_reads
reader_concurrency_semaphore: inactive_read: de-inline more methods
reader_concurrency_semaphore: make _ready_list intrusive
reader_permit: add wait_for_execution state
reader_concurrency_semaphore: make wait lists intrusive
reader_concurrency_semaphore: move most wait_queue methods out-of-line
reader_concurrency_semaphore: store permits directly in queues
reader_permit: introduce (private) operator * and ->
reader_concurrency_semaphore: remove redundant waiters() member
reader_concurrency_semaphore: add waiters counter
reader_permit: use check_abort() for timeout
reader_concurrency_semaphore: maybe_dump_permit_diagnostics(): remove permit list param
reader_concurrency_semaphroe: make foreach_permit() const
reader_permit: add get_schema() and get_op_name() accessors
reader_concurrency_semaphore: mark maybe_dump_permit_diagnostics as noexcept
this is the 12th changeset of a series which tries to give an overhaul to the CMake building system. this series has two goals:
- to enable developers to use CMake for building scylla, so they can use tools (CLion for instance) with CMake integration for better developer experience
- to enable us to tweak the dependencies in a simpler way. a well-defined cross module / subsystem dependency is a prerequisite for building this project with the C++20 modules.
this changeset includes following changes:
- build: cmake: remove Seastar from the option name
- build: cmake: add missing sources in test-lib and utils
- build: cmake: do not include main.cc in scylla-main
- build: cmake: define SEASTAR_TESTING_MAIN for SEASTAR tests
- build: cmake: add more tests
Closes #13180
* github.com:scylladb/scylladb:
build: cmake: add more tests
build: cmake: define SEASTAR_TESTING_MAIN for SEASTAR tests
build: cmake: do not include main.cc in scylla-main
build: cmake: add missing sources in test-lib and utils
build: cmake: remove Seastar from the option name
This is a translation of Cassandra's CQL unit test source file
validation/operations/SelectGroupByTest.java into our cql-pytest
framework.
This test file contains only 8 separate test functions, but each of them
is very long, checking hundreds of different combinations of GROUP BY with
other things like LIMIT, ORDER BY, etc., so 6 out of the 7 tests fail on
Scylla on one of the bugs listed below - most of the tests actually fail
in multiple places due to multiple bugs. All tests pass on Cassandra.
The tests reproduce six already-known Scylla issues and one new issue:
Already known issues:
Refs #2060: Allow mixing token and partition key restrictions
Refs #5361: LIMIT doesn't work when using GROUP BY
Refs #5362: LIMIT is not doing it right when using GROUP BY
Refs #5363: PER PARTITION LIMIT doesn't work right when using GROUP BY
Refs #12477: Combination of COUNT with GROUP BY is different from Cassandra
in case of no matches
Refs #12479: SELECT DISTINCT should refuse GROUP BY with clustering column
A new issue discovered by these tests:
Refs #13109: Incorrect sort order when combining IN, GROUP BY and ORDER BY
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Closes #13126
In debug mode the timings are:
view_schema_test: 90 sec
cql_query_test: 170 sec
memtable_test: 2090 sec
cql_functions_test: 2591 sec
For other tests it is less obvious whether they belong in this list, but
the former two clearly deserve to be replaced with the latter two.
Timings for dev/release modes are not that horrible, but there too the
first pair is notably shorter than the latter.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Closes #13142
Today, the SSTable generation provides a hint on which shard owns a
particular SSTable. That hint determines which shard will load the
SSTable into memory.
With upcoming UUID generation, we will no longer have this hint
embedded into the SSTable generation, meaning that SSTables will be
loaded at random shards. This is not good because shards will have
to reference memory from other shards to access the SSTable
metadata that was allocated elsewhere.
This patch changes sstable_directory to:
1) Use the generation value only to determine which shard will calculate
the owner shards for each SSTable. This essentially works like a
round-robin distribution.
2) The shard assigned to compute the owners for an SSTable will do
so by reading the minimum from disk; usually only the Scylla file is
needed.
3) Once that shard has finished computing the owners, it will forward
the SSTable to the shard(s) that own it.
4) Shards will later locally load the SSTables that were forwarded to
them.
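The steps above can be sketched as a small Python model. It makes two assumptions that the message does not spell out: the round-robin-like spread is modelled as `generation % shard_count`, and the owner shards are passed in directly instead of being read from disk. `plan_loading` is a hypothetical name.

```python
from collections import defaultdict


def plan_loading(sstables, shard_count):
    """sstables: iterable of (generation, owner_shards) pairs.

    owner_shards stands in for what the computing shard would read from disk.
    Returns (who_computes, who_loads), both mapping shard -> list of generations.
    """
    who_computes = defaultdict(list)
    who_loads = defaultdict(list)
    for generation, owner_shards in sstables:
        # Assumption: a plain modulo gives the round-robin-like distribution.
        who_computes[generation % shard_count].append(generation)
        # The computing shard forwards the SSTable to every shard that owns it.
        for owner in owner_shards:
            who_loads[owner].append(generation)
    return dict(who_computes), dict(who_loads)
```

In this model the shard that computes the owners and the shard that eventually loads an SSTable are decoupled, which is the property needed once generations no longer encode the owner.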
Closes #13114
* github.com:scylladb/scylladb:
sstables: sstable_directory: Load SSTable at the shard that actually own it
sstables: sstable_directory: Give sstable_info_vector a more descriptive name
sstables: Allow owner shards to be computed for a partially loaded SSTable
sstables: Move SSTable loading to sstable_directory::sort_sstable()
sstables: Move sstable_directory::sort_sstable() to private interface
sstables: Restore indentation in sstable_directory::sort_sstable()
sstables: Coroutinize sstable_directory::sort_sstable()
sstables: sstable_directory: Extract sstable loading from process_descriptor()
sstables: sstable_directory: Separate private fields from methods
sstables: Coroutinize sstable_directory::process_descriptor
* test/boost: add more tests: all tests listed in test/boost/CMakeLists.txt
should build now.
* rust: add inc library, which is used for testing.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
This is a simplified, more direct version of "dependency injection",
i.e. the caller/initiator (main/cql_test_env) provides a set of
services it will eventually start. A configurable can remember
these and use them, at least after the "start" notification.
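A minimal Python sketch of this pattern, with hypothetical names (`ServiceSet`, `Configurable`, `initialize`, `on_start`) that are not the actual C++ interface: the initiator hands over the set of services early, and the configurable only looks a service up once it has been notified of start.

```python
class ServiceSet:
    """Hypothetical registry of services the initiator will eventually start."""

    def __init__(self):
        self._services = {}

    def add(self, name, service):
        self._services[name] = service

    def find(self, name):
        return self._services[name]


class Configurable:
    def __init__(self):
        self._services = None
        self.database = None

    def initialize(self, services):
        # Only remember the set; the services may not be started yet.
        self._services = services

    def on_start(self):
        # After the "start" notification the services can be used.
        self.database = self._services.find("database")
```

The configurable never constructs its dependencies; it only keeps the reference handed to it and defers any use until start.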
Closes #13037
- build: cmake: remove test which does not exist yet
- build: cmake: document add_scylla_test()
- build: cmake: extract index, repair and data_dictionary out
- build: cmake: extract scylla-main out
- build: cmake: find Snappy before using it
- build: cmake: add missing linkages
- build: cmake: add missing sources to test-lib
- build: cmake: link sstables against libdeflate
- build: cmake: link Boost::regex against ICU::uc
Closes #13110
* github.com:scylladb/scylladb:
build: cmake: link Boost::regex against ICU::uc
build: cmake: link sstables against libdeflate
build: cmake: add missing sources to test-lib
build: cmake: add missing linkages
build: cmake: find Snappy before using it
build: cmake: extract scylla-main out
build: cmake: extract index, repair and data_dictionary out
build: cmake: document add_scylla_test()
build: cmake: remove test which does not exist yet
Today, owner shards can only be computed for a fully loaded SSTable.
For upcoming changes in the SSTable loader, we want to load the minimum
from disk to be able to compute the set of shards owning the SSTable.
If sharding metadata is available, it means we only need to read
TOC and Scylla components.
Otherwise, Summary must be read to provide first and last keys for
compute_shards_for_this_sstable() to operate on them instead.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Kill said read's memory requests with std::bad_alloc and dequeue it from
the memory wait list, then evict it on the spot.
Now that `_inactive_reads` just stores permits, we can do this easily.
This patch adds an Alternator test reproducing issue #6389 - that
concurrent TagResource and/or UntagResource operations were broken and
some of the concurrent modifications were lost.
The test has two threads: one repeatedly adds and removes a tag A, the
other adds and removes a tag B. After we add tag A, we expect tag A
to be there - but due to issue #6389 this modification was sometimes
lost when it raced with an operation on B.
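The rough shape of such a two-thread check can be sketched in Python against an in-memory stand-in. `TagStore`, `hammer` and `run_concurrent_tag_check` are hypothetical and do not use the Alternator API; the lock makes each read-modify-write of the tag map atomic, which is the property whose absence loses updates.

```python
import threading


class TagStore:
    """Hypothetical in-memory stand-in for a table's tag map."""

    def __init__(self):
        self._tags = {}
        self._lock = threading.Lock()

    def tag(self, key, value):
        with self._lock:  # read-modify-write done atomically
            tags = dict(self._tags)
            tags[key] = value
            self._tags = tags

    def untag(self, key):
        with self._lock:
            tags = dict(self._tags)
            tags.pop(key, None)
            self._tags = tags

    def list_tags(self):
        with self._lock:
            return dict(self._tags)


def hammer(store, key, iterations, errors):
    for _ in range(iterations):
        store.tag(key, "x")
        if key not in store.list_tags():
            errors.append(f"tag {key} lost after being added")
        store.untag(key)
        if key in store.list_tags():
            errors.append(f"tag {key} still present after being removed")


def run_concurrent_tag_check(iterations=1000):
    store = TagStore()
    errors = []
    threads = [
        threading.Thread(target=hammer, args=(store, key, iterations, errors))
        for key in ("A", "B")
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return errors
```

Each thread touches only its own tag, so any reported error means a modification was lost to the other thread's concurrent update.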
This test consistently failed before issue #6389 was fixed, and passes
now after the issue was fixed by the previous patches. The bug reproduces
by chance, so it requires a fairly long loop (a few seconds) to be sure
it reproduces - so it is marked a "veryslow" test and will not run in CI,
but can be used to manually reproduce this issue with:
test/alternator/run --runveryslow test_tag.py::test_concurrent_tag
Refs #6389.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>