This is the list of on-disk files, which is a property of the
filesystem scan state. When committing directory changes (read: removing
those files), the list can be moved out of the state.
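A minimal sketch of moving the list out of the state, under stated assumptions: `scan_state`, `files_for_removal` and this `commit_directory_changes()` signature are hypothetical stand-ins for illustration, not the actual Scylla definitions.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <string>
#include <utility>

// Hypothetical stand-in for the filesystem scan state.
struct scan_state {
    std::list<std::string> files_for_removal;
};

// Takes the list out of the state and walks it; returns how many entries were handled.
inline std::size_t commit_directory_changes(scan_state& state) {
    // std::exchange moves the list out and leaves a well-defined empty list behind.
    auto files = std::exchange(state.files_for_removal, {});
    std::size_t handled = 0;
    for (const auto& f : files) {
        (void)f; // a real implementation would remove the file here
        ++handled;
    }
    return handled;
}
```

After the call the state no longer owns any entries, so it cannot be committed twice by accident.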
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
The lister is supposed to be alive throughout .process_sstable_dir() and
can die after .commit_directory_changes().
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
The scan_state keeps the state of listing a directory with sstables. It
now lives on the .process_sstable_dir() stack, but it can just as well live
on the lister itself.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
- build: cmake: link cql3 against wasmtime_bindings
- build: cmake: output rust binding headers in expected dir
- build: cmake: link auth against cql3
Closes #12927
* github.com:scylladb/scylladb:
build: cmake: link auth against cql3
build: cmake: output rust binding headers in expected dir
build: cmake: link cql3 against wasmtime_bindings
In test_v2_apply_monotonically_is_monotonic_on_alloc_failures we
generate mutations with non-full continuity, so we should pass
is_evictable::yes to apply_monotonically(). Otherwise, it will assume
fully-continuous versions and not try to maintain continuity by
inserting sentinels.
This manifested in sporadic failures on continuity check.
Fixes #12882
Closes #12921
* github.com:scylladb/scylladb:
test: mutation_test: Fix sporadic failure due to continuity mismatch
test: mutation_test: Fix copy-paste mistake in trace-level logging
this change was previously reverted by
cbc005c6f5. it turns out this change
was not the offending change. so let's resurrect it.
`job` was introduced back in 782ebcece4,
so we could consume the option specified in the DEB_BUILD_OPTIONS
environment variable. but now we always repackage
the artifacts prebuilt in the relocatable package, so we don't build
them anymore when packaging debian packages. see
9388f3d626. and `job` is not
passed to `ninja` anymore.
so, in this change, `job` is removed from debian/rules as well, as
it is not used.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closes #12924
This is a translation of Cassandra's CQL unit test source file
validation/operations/CompactStorageTest.java into our cql-pytest
framework.
This very large test file includes 86 tests for various types of
operations and corner cases of WITH COMPACT STORAGE tables.
All 86 tests pass on Cassandra (except one using a deprecated feature
that needs to be specially enabled). 30 of the tests fail on Scylla
reproducing 7 already-known Scylla issues and 7 previously-unknown issues:
Already known issues:
Refs #3882: Support "ALTER TABLE DROP COMPACT STORAGE"
Refs #4244: Add support for mixing token, multi- and single-column
restrictions
Refs #5361: LIMIT doesn't work when using GROUP BY
Refs #5362: LIMIT is not doing it right when using GROUP BY
Refs #5363: PER PARTITION LIMIT doesn't work right when using GROUP BY
Refs #7735: CQL parser missing support for Cassandra 3.10's new "+=" syntax
Refs #8627: Cleanly reject updates with indexed values where value > 64k
New issues:
Refs #12471: Range deletions on COMPACT STORAGE is not supported
Refs #12474: DELETE prints misleading error message suggesting
ALLOW FILTERING would work
Refs #12477: Combination of COUNT with GROUP BY is different from
Cassandra in case of no matches
Refs #12479: SELECT DISTINCT should refuse GROUP BY with clustering column
Refs #12526: Support filtering on COMPACT tables
Refs #12749: Unsupported empty clustering key in COMPACT table
Refs #12815: Hidden column "value" in compact table isn't completely hidden
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Closes #12816
load-and-stream implements no policy when deciding which SSTables will go in
each streaming round (batch of 16 SSTables), meaning the choice is random.
It can take advantage of the fact that the LSM-tree layout, with ICS and LCS,
is a set of SSTable runs, where each run is composed of SSTables that are
disjoint in their key range.
By sorting SSTables to be streamed by their first key, the effect is that
SSTable runs will be incrementally streamed (in token order).
SSTable runs in the same replica group (or in the same node) will have their
content deduplicated, reducing significantly the amount of data we need to
put on the wire. The improvement is proportional to the space amplification
in the table, which again, depends on the compaction strategy used.
Another important benefit is that the destination nodes will receive SSTables
in token order, allowing off-strategy compaction to be more efficient.
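The ordering described above can be sketched as follows. This is an illustration only: `sstable_info`, `first_token` and `plan_streaming_rounds()` are hypothetical names, and a plain integer stands in for the first key's token; the actual load-and-stream code is not shown here.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for an SSTable: only the token of its first key matters here.
struct sstable_info {
    int64_t first_token;
    int id;
};

// Sort by first key, then cut the sorted sequence into fixed-size streaming rounds.
inline std::vector<std::vector<sstable_info>>
plan_streaming_rounds(std::vector<sstable_info> sstables, std::size_t batch_size = 16) {
    std::vector<std::vector<sstable_info>> rounds;
    if (batch_size == 0) {
        return rounds; // nothing sensible to plan
    }
    std::sort(sstables.begin(), sstables.end(),
              [](const sstable_info& a, const sstable_info& b) {
                  return a.first_token < b.first_token;
              });
    for (std::size_t i = 0; i < sstables.size(); i += batch_size) {
        const std::size_t end = std::min(i + batch_size, sstables.size());
        rounds.emplace_back(sstables.begin() + static_cast<std::ptrdiff_t>(i),
                            sstables.begin() + static_cast<std::ptrdiff_t>(end));
    }
    return rounds;
}
```

With this ordering, SSTables covering neighbouring token ranges tend to land in the same round, which is what gives overlapping runs a chance to be deduplicated before going on the wire.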
This is how I tested it:
1) Generated a 5GB dataset to an ICS table.
2) Started a fresh 2-node cluster. RF=2.
3) Ran load-and-stream against one of the replicas.
BEFORE:
$ time curl -X POST "http://127.0.0.1:10000/storage_service/sstables/keyspace1?cf=standard1&load_and_stream=true"
real 4m40.613s
user 0m0.005s
sys 0m0.007s
AFTER:
$ time curl -X POST "http://127.0.0.1:10000/storage_service/sstables/keyspace1?cf=standard1&load_and_stream=true"
real 2m39.271s
user 0m0.005s
sys 0m0.004s
That's ~1.76x faster.
That's explained by deduplication:
BEFORE:
INFO 2023-02-17 22:59:01,100 [shard 0] stream_session - [Stream #79d3ce7a-ea47-4b6e-9214-930610a18ccd] Write to sstable for ks=keyspace1, cf=standard1, estimated_partitions=3445376, received_partitions=2755835
INFO 2023-02-17 22:59:41,491 [shard 0] stream_session - [Stream #bc6bad99-4438-4e1e-92db-b2cb394039c8] Write to sstable for ks=keyspace1, cf=standard1, estimated_partitions=3308288, received_partitions=2836491
INFO 2023-02-17 23:00:20,585 [shard 0] stream_session - [Stream #e95c4f49-0a2f-47ea-b41f-d900dd87ead5] Write to sstable for ks=keyspace1, cf=standard1, estimated_partitions=3129088, received_partitions=2734029
INFO 2023-02-17 23:00:49,297 [shard 0] stream_session - [Stream #255cba95-a099-4fec-a72c-f87d5cac2b1d] Write to sstable for ks=keyspace1, cf=standard1, estimated_partitions=2544128, received_partitions=1959370
INFO 2023-02-17 23:01:33,110 [shard 0] stream_session - [Stream #96b5737e-30c7-4af8-a8b8-96fecbcbcbd0] Write to sstable for ks=keyspace1, cf=standard1, estimated_partitions=3624576, received_partitions=3085681
INFO 2023-02-17 23:02:20,909 [shard 0] stream_session - [Stream #3185a48b-fb9e-4190-88f4-5c7a386bc9bd] Write to sstable for ks=keyspace1, cf=standard1, estimated_partitions=3505024, received_partitions=3079345
INFO 2023-02-17 23:03:02,039 [shard 0] stream_session - [Stream #0d2964dc-d5e3-4775-825c-97f736d14713] Write to sstable for ks=keyspace1, cf=standard1, estimated_partitions=2808192, received_partitions=2655811
AFTER:
INFO 2023-02-17 23:12:49,155 [shard 0] stream_session - [Stream #bf00963c-3334-4035-b1a9-4b3ceb7a188a] Write to sstable for ks=keyspace1, cf=standard1, estimated_partitions=2965376, received_partitions=1006535
INFO 2023-02-17 23:13:13,365 [shard 0] stream_session - [Stream #1cd2e3ac-a68b-4cb5-8a06-707e91cf59db] Write to sstable for ks=keyspace1, cf=standard1, estimated_partitions=3543936, received_partitions=1406157
INFO 2023-02-17 23:13:37,474 [shard 0] stream_session - [Stream #5a278230-6b4b-461f-8396-c15df7092d03] Write to sstable for ks=keyspace1, cf=standard1, estimated_partitions=3639936, received_partitions=1371298
INFO 2023-02-17 23:14:02,132 [shard 0] stream_session - [Stream #19f40dc3-e02a-4321-a917-a6590d99dd03] Write to sstable for ks=keyspace1, cf=standard1, estimated_partitions=3638912, received_partitions=1435386
INFO 2023-02-17 23:14:26,673 [shard 0] stream_session - [Stream #d47507eb-2067-4e8f-a4f7-c82d5fbd4228] Write to sstable for ks=keyspace1, cf=standard1, estimated_partitions=3561600, received_partitions=1423024
INFO 2023-02-17 23:14:49,307 [shard 0] stream_session - [Stream #d42ee911-253a-4de6-ac89-6a3c05b88d66] Write to sstable for ks=keyspace1, cf=standard1, estimated_partitions=2382592, received_partitions=1452656
INFO 2023-02-17 23:15:10,067 [shard 0] stream_session - [Stream #1f78c1bf-8e20-41bd-95de-16de3fc5f86c] Write to sstable for ks=keyspace1, cf=standard1, estimated_partitions=2632320, received_partitions=1252298
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <20230219191924.37070-1-raphaelsc@scylladb.com>
Changing default manager from 56090 to 5090
@amnonh please review
@annastuchlik please change if other locations in Docs require this change
Closes #12682
they are part of the CQL type system, and are "closer" to types.
let's move them into "types" directory.
the building systems are updated accordingly.
the source files referencing `types.hh` were updated using the following
command:
```
find . \( -name "*.cc" -o -name "*.hh" \) -exec sed -i 's/\"types.hh\"/\"types\/types.hh\"/' {} +
```
the source files under sstables include "types.hh", which is
indeed the one located under "sstables", so include "sstables/types.hh"
instead, so it's more explicit.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closes #12926
as auth headers reference cql3
```
In file included from /home/kefu/dev/scylladb/auth/authenticator.cc:16:
In file included from /home/kefu/dev/scylladb/cql3/query_processor.hh:24:
/home/kefu/dev/scylladb/lang/wasm_instance_cache.hh:20:10: fatal error: 'rust/cxx.h' file not found
^~~~~~~~~~~~
```
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
we include rust binding headers like `rust/wasmtime_bindings.hh`.
so they should be located in directory named "rust".
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
as it references headers provided by wasmtime_bindings:
```
In file included from /home/kefu/dev/scylladb/cql3/functions/user_function.cc:9:
In file included from /home/kefu/dev/scylladb/cql3/functions/user_function.hh:16:
/home/kefu/dev/scylladb/lang/wasm.hh:14:10: fatal error: 'rust/wasmtime_bindings.hh' file not found
^~~~~~~~~~~~~~~~~~~~~~~~~~~
```
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Currently the code will assert because the cl pointer will be null, and it
will be null because there are no mutations to initialize it from.
Message-Id: <20230212144837.2276080-3-gleb@scylladb.com>
In test_v2_apply_monotonically_is_monotonic_on_alloc_failures we
generate mutations with non-full continuity, so we should pass
is_evictable::yes to apply_monotonically(). Otherwise, it will assume
fully-continuous versions and not try to maintain continuity by
inserting sentinels.
This manifested in sporadic failures on continuity check.
Fixes #12882
this change is based on Botond Dénes's change which gave an overhaul
to the original CMake building system. this change is not enough
to build tests with CMake, as we still need to sort out the
dependencies.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
it turns out Boost::regex references ICU::i18n, but it fails to
bring the linkage to its public interface. so let's do this on behalf
of it.
```
: && /home/kefu/.local/bin/clang++ -Wall -Werror -Wno-c++11-narrowing -Wno-mismatched-tags -Wno-missing-braces -Wno-overloaded-virtual -Wno-parentheses-equality -Wno-unsupported-friend -march=westmere -O0 -g -gz CMakeFiles/scylla.dir/absl-flat_hash_map.cc.o CMakeFiles/$
ld.lld: error: undefined symbol: icu_67::Collator::createInstance(icu_67::Locale const&, UErrorCode&)
>>> referenced by icu.hpp:56 (/usr/include/boost/regex/icu.hpp:56)
>>> CMakeFiles/scylla.dir/utils/like_matcher.cc.o:(boost::re_detail_107500::icu_regex_traits_implementation::icu_regex_traits_implementation(icu_67::Locale const&))
>>> referenced by icu.hpp:61 (/usr/include/boost/regex/icu.hpp:61)
>>> CMakeFiles/scylla.dir/utils/like_matcher.cc.o:(boost::re_detail_107500::icu_regex_traits_implementation::icu_regex_traits_implementation(icu_67::Locale const&))
```
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
antlr3 generates code like `((foo == bar))`. but Clang does not
like it. let's disable this warning. and explore other options later.
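For illustration, a sketch of the code shape that triggers Clang's `-Wparentheses-equality`. Note that it silences the warning with a local pragma, which is a different technique from the build-wide flag this change uses, and `same_token()` is a hypothetical function rather than real antlr3 output.

```cpp
#include <cassert>

#if defined(__clang__)
#pragma clang diagnostic push
#pragma clang diagnostic ignored "-Wparentheses-equality"
#endif

// Shaped like generated parser code: the comparison is wrapped in redundant parentheses.
inline bool same_token(int foo, int bar) {
    if ((foo == bar)) {
        return true;
    }
    return false;
}

#if defined(__clang__)
#pragma clang diagnostic pop
#endif
```

The pragma form keeps the warning active for hand-written code, at the cost of having to wrap every generated translation unit.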
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
When we added zstd (f14e6e73bb), we used the static library
as we used some experimental APIs. However, now the dynamic
library works, so apparently the experimental API is now standard.
Switch to the dynamic library. It doesn't improve anything, but it
aligns with how we do things.
Closes #12902
`int_range::make_singular()` accepts a single `int` as its parameter,
so there is no need to brace the parameter with `{}`. this helps to silence
the warning from Clang, like:
```
/home/kefu/dev/scylladb/test/perf/perf_fast_forward.cc:1396:63: error: braces around scalar initializer [-Werror,-Wbraced-scalar-init]
check_no_disk_reads(test(int_range::make_singular({100}))),
^~~~~
```
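A small sketch of the before/after call shape. The `int_range` below is a hypothetical, simplified stand-in that models only `make_singular()`; it is not the real Scylla type.

```cpp
#include <cassert>

// Hypothetical, simplified stand-in for int_range; only make_singular() is modelled.
struct int_range {
    int start;
    int end;
    static int_range make_singular(int value) { return int_range{value, value}; }
};

inline int_range singular_hundred() {
    // make_singular({100}) would also compile, but Clang reports
    // -Wbraced-scalar-init for it; passing the scalar directly is equivalent.
    return int_range::make_singular(100);
}
```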
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closes #12903
Fixes https://github.com/scylladb/scylla-docs/issues/4140
This PR adds a new Knowledge Base article about improved garbage collection in ICS. It's based on the document created by @raphaelsc https://docs.google.com/document/d/1fA7uBcN9tgxeHwCbWftPJz071dlhucoOYO1-KJeOG8I/edit?usp=sharing.
@raphaelsc Could you review it? I've made some improvements to the language and text organization, but I didn't add or remove any content, so it should be a quick review.
@tzach requested a diagram, but we can add it later. It would be great to have this content published asap.
Closes #12792
* github.com:scylladb/scylladb:
doc: add the new KB to the list of topics
doc: add a new KB article about tombstone garbage collection in ICS
Task ttl can be set with task manager test api, which is disabled
in release mode.
Move get_and_update_ttl from task manager test api to task manager
api, so that it can be used in release mode.
Closes #12894
The developer documentation from `building.md` suggested to run unit tests with `./tools/toolchain/dbuild test` command, however this command only invokes `test` bash tool, which immediately returns with status `1`:
```
[piotrs@new-host scylladb]$ ./tools/toolchain/dbuild test
[piotrs@new-host scylladb]$ echo $?
1
```
This was probably an unintended mistake; what the author really meant was invoking `dbuild ninja test`.
Closes #12890
as `my_result_collector` has virtual functions but its dtor is not
marked virtual, Clang complains. let's mark the dtor of its base class virtual
to be on the safe side.
```
/home/kefu/.local/bin/../lib/gcc/x86_64-pc-linux-gnu/13.0.1/../../../../include/c++/13.0.1/bits/unique_ptr.h:100:2: error: delete called on non-final 'my_result_collector' that has virtual functions but non-virtual destructor [-Werror,-Wdelete-non-abstract-non-virtual-dtor]
delete __ptr;
^
/home/kefu/.local/bin/../lib/gcc/x86_64-pc-linux-gnu/13.0.1/../../../../include/c++/13.0.1/bits/unique_ptr.h:405:4: note: in instantiation of member function 'std::default_delete<my_result_collector>::operator()' requested here
get_deleter()(std::move(__ptr));
^
/home/kefu/dev/scylladb/db/virtual_table.cc:134:25: note: in instantiation of member function 'std::unique_ptr<my_result_collector>::~unique_ptr' requested here
auto consumer = std::make_unique<my_result_collector>(s, permit, &pr, std::move(reader_and_handle.second));
^
```
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closes #12879
The former is a convenience wrapper over the latter. There's no real benefit in using it, but having two test_env-s is worse than just one.
Closes #12794
* github.com:scylladb/scylladb:
sstable_utils: Move the test_setup to perf/
sstable_utils: Remove unused wrappers over test_env
sstable_test: Open-code do_with_cloned_tmp_directory
sstable_test: Asynchronize statistics_rewrite case
tests: Replace test_setup::do_with_tmp_directory with test_env::do_with(_async)?
This patch adds a reproducer for the bug described in issue #7964 -
The restriction `where k=1 and c=2` (when k,c are the key columns)
returns (at most) a single row so doesn't need ALLOW FILTERING,
but if we add a third restriction, say `v=2`, this still processes
at most a single row so doesn't need ALLOW FILTERING - and both
Scylla and Cassandra get it wrong - so it's marked with both xfail
and cassandra_bug.
The patch also adds another test that for longer partition slices,
e.g., `where k=1 and c>2`, although the slice itself doesn't need
filtering, if we add `v=2` here we suddenly do need ALLOW FILTERING,
because the slice itself may be a large number of rows, and adding
`v=2` may restrict it to just a few results. This test passes
on both Scylla and Cassandra.
Issue #7964 mentioned these scenarios and even had some example code,
but we never added it to the test suite, so we finally do it now.
Refs #7964
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Closes #12850
There are two methods to mess with compaction history -- update and get. The former was patched to use the local system-keyspace instance by 907fd2d3 (system_keyspace: De-static compaction history update); now it's time for the latter (spoiler: it's only used by the API handler).
Closes #12889
* github.com:scylladb/scylladb:
system_keyspace: Make get_compaction_history non static and drop qctx
api, compaction_manager: Get compaction history via manager
system_keyspace: Move compaction_history_entry to namespace scope
Tests of each module that is integrated with task manager use
calls to task manager api. Boilerplate to call, check status, and
get result may be reduced using functions.
task_manager_utils.py contains wrappers for task manager api
calls and helpers that may be reused by different tests.
Closes #12844
* github.com:scylladb/scylladb:
test: use functions from task_manager_utils.py in test_task_manager.py
test: add task_manager_utils.py
as an abstract base class `output_writer` is inherited by both
`json_output_writer` and `text_output_writer`. and `output_manager`
manages the lifecycles of used writers using
`std::unique_ptr<output_writer>`.
before this change, the dtor of `output_writer` is not marked as
virtual, so when its dtor is invoked, what gets called is the base
class's dtor. but the dtor of `json_output_writer` is non-trivial
in the sense that this class is composed of a bunch of member
variables. if we don't invoke its dtor when destroying this object,
a leak is expected.
so, in this change, the dtor of `output_writer` is marked as virtual,
this makes all of its derived classes' dtor virtual. and the right
dtor is always called.
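A minimal sketch of why the virtual dtor matters. The classes below are simplified, hypothetical versions of the ones named above, and the counter exists only to make the derived dtor call observable.

```cpp
#include <cassert>
#include <memory>
#include <string>

static int json_writers_destroyed = 0;

// Simplified, hypothetical versions of the classes named in this change.
struct output_writer {
    // virtual: deleting through a base pointer runs the derived class's dtor
    virtual ~output_writer() = default;
    virtual std::string name() const = 0;
};

struct json_output_writer : output_writer {
    std::string buffer = "{}"; // a member whose own dtor must run
    ~json_output_writer() override { ++json_writers_destroyed; }
    std::string name() const override { return "json"; }
};

// Destroys a json_output_writer through std::unique_ptr<output_writer> and
// returns how many derived dtors ran as a result.
inline int destroy_via_base_pointer() {
    const int before = json_writers_destroyed;
    {
        std::unique_ptr<output_writer> w = std::make_unique<json_output_writer>();
        (void)w->name();
    }
    return json_writers_destroyed - before;
}
```

Without `virtual` on the base dtor, deleting through the base pointer is undefined behavior, and in practice the derived members are typically never destroyed.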
test/perf is only designed for testing, and not used in production,
also, this feature was recently integrated into scylla executable in
228ccdc1c7.
so there is no need to backport this change.
this change should also silence the warning from Clang 17:
```
/home/kefu/.local/bin/../lib/gcc/x86_64-pc-linux-gnu/13.0.1/../../../../include/c++/13.0.1/bits/unique_ptr.h:100:2: error: delete called on 'output_writer' that is abstract but has non-virtual destructor [-Werror,-Wdelete-abstract-non-virtual-dtor]
delete __ptr;
^
/home/kefu/.local/bin/../lib/gcc/x86_64-pc-linux-gnu/13.0.1/../../../../include/c++/13.0.1/bits/unique_ptr.h:405:4: note: in instantiation of member function 'std::default_delete<output_writer>::operator()' requested here
get_deleter()(std::move(__ptr));
^
/home/kefu/.local/bin/../lib/gcc/x86_64-pc-linux-gnu/13.0.1/../../../../include/c++/13.0.1/bits/stl_construct.h:88:15: note: in instantiation of member function 'std::unique_ptr<output_writer>::~unique_ptr' requested here
__location->~_Tp();
^
```
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closes #12888