They were running in recovery to reuse existing system tables without
group0 id, but since we want to remove recovery mode we need to
re-generate the tables.
Introduce a test that demonstrates mutation loss caused by premature
loop termination in tablet_sstable_streamer::stream. The code broke
out of the SSTable iteration when encountering a non-overlapping range,
which skipped subsequent SSTables that should have been partially
contained. This test showcases the problem only.
Example:
Tablet range: [4, 5]
SSTable ranges:
[0, 5]
[0, 3] <--- is considered exhausted, and causes skip to next tablet
[2, 5] <--- is missed for range [4, 5]
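The example above can be reproduced with a small standalone model. This is only an illustrative Python sketch, not the code of tablet_sstable_streamer::stream: the function names are hypothetical, ranges are modelled as closed integer intervals, and the SSTables are assumed to be visited in the order listed.

```python
def overlaps(sst, tablet):
    """Return True if the closed ranges (first, last) intersect."""
    return sst[0] <= tablet[1] and tablet[0] <= sst[1]

def select_with_early_break(sstables, tablet):
    # Models the problem: the first SSTable that ends before the tablet
    # range starts is treated as "exhausted" and terminates the loop.
    picked = []
    for sst in sstables:
        if sst[1] < tablet[0]:
            break
        if overlaps(sst, tablet):
            picked.append(sst)
    return picked

def select_without_break(sstables, tablet):
    # Keeps scanning, so later SSTables that still overlap are not skipped.
    return [sst for sst in sstables if overlaps(sst, tablet)]

sstables = [(0, 5), (0, 3), (2, 5)]
tablet = (4, 5)

print(select_with_early_break(sstables, tablet))  # [(0, 5)]
print(select_without_break(sstables, tablet))     # [(0, 5), (2, 5)]
```

In this model the early break loses (2, 5), which is the mutation loss the test is meant to demonstrate.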
There are some tests which want sstables of all format versions
in `test/resource`. This patch adds `ms` files for those tests.
I didn't think much about this change, I just mechanically
generated the `ms` from the existing `me` sstables in the same directories
(using `scylla sstable upgrade`) for the tests which were complaining
about the lack of `ms` files.
This PR builds on the byte comparable support introduced in #23541 to add byte comparable support for all the collection types.
This implementation adheres to the byte-comparable format specification in https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/utils/bytecomparable/ByteComparable.md
Refs https://github.com/scylladb/scylladb/issues/19407
New feature - backport not required.
Closes scylladb/scylladb#25603
* github.com:scylladb/scylladb:
types/comparable_bytes: add compatibility testcases for collection types
types/comparable_bytes: update compatibility testcase to support collection types
types/comparable_bytes: support empty type
types/comparable_bytes: support reversed types
types/comparable_bytes: support vector cql3 type
types/comparable_bytes: support tuple and UDT cql3 type
types/comparable_bytes: support map cql3 type
types/comparable_bytes: support set and list cql3 types
types/comparable_bytes: introduce encode/decode_component
types/comparable_bytes: introduce to_comparable_bytes/from_comparable_bytes
This patch adds compatibility testcases for the following cql3 types:
set, list, map, tuple, vector and reversed types.
Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com>
Test that we can load sstables with mixed, numerical and uuid
generation types, and verify the expected data.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
wasm32-wasi has been removed in Rust 1.84 (Jan 5th, 2025). if one
compiles the tree with Rust 1.84 or later, the following build failure is
expected:
```
[2/305] Building WASM /home/kefu/dev/scylladb/build/wasm/return_input.wasm
FAILED: wasm/return_input.wasm /home/kefu/dev/scylladb/build/wasm/return_input.wasm
cd /home/kefu/dev/scylladb/test/resource/wasm/rust && /usr/bin/cargo build --target=wasm32-wasi --example=return_input --locked --manifest-path=Cargo.toml --target-dir=/home/kefu/dev/scylladb/build/test/resource/wasm/rust && wasm-opt /home/kefu/dev/scylladb/build/test/resource/wasm/rust/wasm32-wasi//debug/examples/return_input.wasm -Oz -o /home/kefu/dev/scylladb/build/wasm/return_input.wasm && wasm-strip /home/kefu/dev/scylladb/build/wasm/return_input.wasm
error: failed to run `rustc` to learn about target-specific information
Caused by:
process didn't exit successfully: `rustc - --crate-name ___ --print=file-names --target wasm32-wasi --crate-type bin --crate-type rlib --crate-type dylib --crate-type cdylib --crate-type staticlib --crate-type proc-macro --print=sysroot --print=split-debuginfo --print=crate-name --print=cfg` (exit status: 1)
--- stderr
error: Error loading target specification: Could not find specification for target "wasm32-wasi". Run `rustc --print target-list` for a list of built-in targets
```
in order to work around this issue, let's check the list of supported
targets, and use wasm32-wasip1 if wasm32-wasi is not listed as a
supported target.
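The selection logic can be sketched as follows. This is a minimal Python illustration, not the build-system change itself: the helper names are hypothetical, and the only external command assumed is `rustc --print target-list`, which the error message above also refers to.

```python
import subprocess

LEGACY_TARGET = "wasm32-wasi"
FALLBACK_TARGET = "wasm32-wasip1"

def pick_wasm_target(target_list_output):
    """Choose the target from the text printed by `rustc --print target-list`."""
    # Compare whole lines so that "wasm32-wasip1" does not match "wasm32-wasi".
    targets = {line.strip() for line in target_list_output.splitlines()}
    return LEGACY_TARGET if LEGACY_TARGET in targets else FALLBACK_TARGET

def query_rustc_targets(rustc="rustc"):
    """Ask the installed toolchain for its built-in targets (not called below)."""
    result = subprocess.run([rustc, "--print", "target-list"],
                            capture_output=True, text=True, check=True)
    return result.stdout

# Sample outputs standing in for an older and a newer toolchain.
old_toolchain = "wasm32-unknown-unknown\nwasm32-wasi\nwasm32-wasip1\n"
new_toolchain = "wasm32-unknown-unknown\nwasm32-wasip1\n"

print(pick_wasm_target(old_toolchain))  # wasm32-wasi
print(pick_wasm_target(new_toolchain))  # wasm32-wasip1
```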
Refs #20878
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closes scylladb/scylladb#22320
Fixes https://github.com/scylladb/scylla-enterprise/issues/5016#issuecomment-2558464631
EAR - encryption at rest. Allows on-disk file encryption of sstables and commitlog data.
Introduces OpenSSL based file level encrypted storage, managed via a set of providers
ranging from local files to cloud KMS providers.
For a more comprehensive explanation, see the included docs (or if possible, original
source tree).
Manual bulk merge of EAR feature from enterprise repo to main scylla repo.
Breaks some features apart, but main EAR is still a humongous commit, because to separate this
I would have to mess with code incrementally, adding time and risk.
This PR includes the local file gen tool, tests and also p11 validation.
Note: CI will not execute the full tests unless master CI is set to provide the same environment
as the enterprise one. Not sure about the status of this ATM.
Note: Includes code to compile against cryptsoft kmipc SDK, but not the SDK. If you happen to
check out this tree in the scylla folder and configure, it will be linked against and KMIP functionality
will be enabled, otherwise not.
Closes scylladb/scylladb#22233
* github.com:scylladb/scylladb:
docs: Add EAR docs
main/build: Add p11-kit and initialize
tools: Add local-file-key-generator tool
tests: Add EAR tests
tmpdir: shorten test tempdir path
EAR: port the ear feature from enterprise
cql_test_env: Add optional query timeout
schema/migration_manager: Add schema validate
sstables: add get_shared_components accessor
config/config_file: Add exports and definitions of config_type_for<>
This PR extends authentication with 2 mechanisms:
- a new role_manager subclass, which allows managing users via
LDAP server,
- a new authenticator, which delegates plaintext authentication
to a running saslauthd daemon.
The features have been ported from the enterprise repository
with their test.py tests and the documentation as part of
changing license to source available.
Fixes: scylladb/scylla-enterprise#5000
Fixes: scylladb/scylla-enterprise#5001
Closes scylladb/scylladb#22030
All current unit tests for scrub in validate mode generate random
SSTables on the fly.
Add some more tests with frozen Cassandra SSTables from the source tree
to verify compatibility with Cassandra. Use some of the existing 3.x
Cassandra SSTables to test the valid case, and use the same schema to
generate some corrupted SSTables for the invalid case. Overall, the new
tests cover the following scenarios:
* valid compressed/uncompressed
* compressed/uncompressed with invalid checksums
* compressed/uncompressed with invalid digest
For the compressed SSTable with invalid checksums, a small chunk length
was used (4KiB) to have more chunks with less disk space. For
uncompressed SSTables the chunk length is not configurable.
Finally, since the SSTables live in the source tree, the quarantine
mechanism was disabled.
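To illustrate the difference between the checksum and digest scenarios, here is a rough Python model of the validation. It is a sketch under assumptions, not Scylla's scrub code: CRC32 from zlib stands in for whatever checksum the SSTable format uses, and the helper names are hypothetical.

```python
import zlib

CHUNK_LEN = 4096  # 4 KiB, as used for the compressed test SSTable

def chunk_checksums(data, chunk_len=CHUNK_LEN):
    """One checksum per fixed-size chunk of the data file."""
    return [zlib.crc32(data[i:i + chunk_len])
            for i in range(0, len(data), chunk_len)]

def validate(data, stored_checksums, stored_digest, chunk_len=CHUNK_LEN):
    """Return (indexes of chunks whose checksum mismatches, digest matches?)."""
    actual = chunk_checksums(data, chunk_len)
    bad_chunks = [i for i, (a, s) in enumerate(zip(actual, stored_checksums))
                  if a != s]
    return bad_chunks, zlib.crc32(data) == stored_digest

data = bytes(range(256)) * 64          # 16 KiB of data, i.e. 4 chunks
checksums = chunk_checksums(data)
digest = zlib.crc32(data)

# Valid case.
print(validate(data, checksums, digest))       # ([], True)

# Invalid checksum: one stored per-chunk checksum is wrong.
tampered = list(checksums)
tampered[1] ^= 1
print(validate(data, tampered, digest))        # ([1], True)

# Invalid digest: the stored whole-file digest is wrong.
print(validate(data, checksums, digest ^ 1))   # ([], False)
```

A smaller chunk length yields more per-chunk checksums for the same amount of data, which is why 4KiB chunks keep the test files small.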
Signed-off-by: Nikos Dragazis <nikolaos.dragazis@scylladb.com>
so that "wasm" target is built. "wasm" generates the text format
of wasm code. and these wasm applications are used by the test_wasm
tests.
the rules generated by `configure.py` add these .wat files as a
dependency of `{mode}-build`, which is in turn a dependency of `{mode}`.
in this change, let's mirror this behavior by making `wasm` ALL,
so it is built by the default target.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closes scylladb/scylladb#19391
The currently used versions of the "time" and "rustix" dependencies
had minor security vulnerabilities.
In this patch:
- the "rustix" crate is updated
- the "chrono" crate that we depend on was not compatible
with the version of the "time" crate that had fixes, so
we updated the "chrono" crate, which actually removed the
dependency on "time" completely.
Both updates were performed using "cargo update" on the
relevant package and the corresponding version.
Fixes #15772
Closes scylladb/scylladb#16378
this target mirrors the target named `{mode}-test` in the
`build.ninja` build script created by `configure.py`.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closes scylladb/scylladb#15448
instead of checking the availability of a required program, let's
use the `REQUIRED` argument introduced by CMake 3.18, simpler this
way.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closes scylladb/scylladb#15447
as an alternative to passing the link-args using the environmental variable,
we can also use build script to pass the "-C link-args=<FLAG>" to the compiler.
see https://doc.rust-lang.org/nightly/cargo/reference/build-scripts.html#cargorustc-link-argflag
to ensure that cargo is called again by ninja, after build.rs is
updated, build.rs is added as a dependency of {wasm} files along with
Cargo.lock.
this change is verified using the following command:
```
RUSTFLAGS='--print link-args' cargo build \
--target=wasm32-wasi \
--example=return_input \
--locked \
--manifest-path=Cargo.toml \
--target-dir=build/cmake/test/resource/wasm/rust
```
the output includes "-zstack-size=131072" in the argument passed to lld:
```
Compiling examples v0.0.0 (/home/kefu/dev/scylladb/test/resource/wasm/rust)
LC_ALL="C"
PATH="/usr/lib/rustlib/x86_64-unknown-linux-gnu/bin:/usr/lib/rustlib/x86_64-unknown-linux-gnu/bin/self-contained:/home/kefu/.local/bin:/home/kefu/bin:/usr/local/bin:/usr/local/sbin:/usr/bin:/usr/sbin"
VSLANG="1033"
"lld"
"-flavor" "wasm" "--rsp-quoting=posix" "--export"
"_scylla_abi" "--export" "_scylla_free" "--export" "_scylla_malloc"
"--export" "return_input" "-z" "stack-size=1048576" "--stack-first"
"--allow-undefined" "--fatal-warnings" "--no-demangle"
...
"-L" "/usr/lib/rustlib/wasm32-wasi/lib"
"-L" "/usr/lib/rustlib/wasm32-wasi/lib/self-contained"
"-o"
"/home/kefu/dev/scylladb/build/cmake/test/resource/wasm/rust/wasm32-wasi/debug/examples/return_input-ef03083560989040.wasm"
"--gc-sections"
"--no-entry"
"-O0"
"-zstack-size=131072"
```
with this change, it'd be easier to build .wat files in CMake, so
we don't need to repeat the settings in both configure.py and
CMakeLists.txt
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closes #14123
we compile .wat files from .rs and .c source files since
6d89d718d9.
these .wat are used by test/cql-pytest/test_wasm.py . let's update
the CMake building system accordingly so these .wat files can also
be generated using the "wasm" target. since the ctest system is
not used, this change should allow us to perform this test manually.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closes #14126
This is a translation of Cassandra's CQL unit test source file
validation/entities/UFTypesTest.java into our cql-pytest framework.
There are 7 tests, which reproduce one known bug:
Refs #13746: UDF can only be used in SELECT, and abort when used in WHERE, or in INSERT/UPDATE/DELETE commands
And uncovered two previously unknown bugs:
Refs #13855: UDF with a non-frozen collection parameter cannot be called on a frozen value
Refs #13860: A non-frozen collection returned by a UDF cannot be used as a frozen one
Additionally, we encountered an issue that can be treated as either a bug or a hole in documentation:
Refs #13866: Argument and return types in UDFs can be frozen
Closes #13867
After recent changes, we are able to store only the
C/Rust source code for Wasm programs, and only build
them when necessary. This patch utilizes this
opportunity by removing most of the currently stored
raw Wasm programs, replacing them with C/Rust sources
and adding them to the new build system.
Add a test for a wasm aggregate function
which uses the new metrics to check if the cache has
been hit at least once.
Also check that the cache can get reused on different
queries, by testing that the number of queries is
higher than the number of cache misses.
Signed-off-by: Wojciech Mitros <wojciech.mitros@scylladb.com>
After compiling to WASM, UDFs become much larger than the
source code. When they're included in test_wasm.py, it
becomes difficult to navigate in the file. Moving them
to another place does not make understanding the test
scripts harder, because the source code is still included.
This problem will become even more severe when testing
UDFs using WASI.
Signed-off-by: Wojciech Mitros <wojciech.mitros@scylladb.com>
Closes #10934
To call a UDF that is using WASI, we need to properly
configure the wasmtime instance that it will be called
on. The configuration was missing from udf_cache::load(),
so we add it here.
The free function does not return any value, so we should use
a calling method that does not expect any returns.
This patch adds such a method and uses it.
A test that did not pass without this fix and does pass after
is added.
Signed-off-by: Wojciech Mitros <wojciech.mitros@scylladb.com>
Closes #10935
Keyspace storage options series adds a new schema table:
system_schema.scylla_keyspaces. The regenerated cases ensure
that this new table is taken into account when the schema feature
is available.
As flat mutation reader {up,down}grades get added to the write path,
comparing range-tombstone-containing (at least) sstables byte-by-byte
to a reference is starting to seem like a fool's errand.
* When a flat mutation reader is {up,down}graded, information may get
lost while splitting range tombstones. Making those splits revertable
should in theory be possible but would surely make {up,down}graders
slower and more complex, and may also possibly entail adding
information to in-memory representation of range tombstones and
range tombstone changes. Such investment for the sake of 7 unit tests
does not seem wise, given that the plan is to get rid of reader
{up,down}grade logic once the move to flat mutation reader v2 is
completed.
* All affected tests also validate their written sstables
semantically.
* At least some of the offending reference sstables are not
"canonical" wrt range tombstones to begin with -- they contain range
tombstones that overlap with clustering rows. The fact that Scylla
does not "canonicalize" those in some way seems purely incidental.
Signed-off-by: Michael Livshin <michael.livshin@scylladb.com>
Prerequisite for the "ME sstable format support" series (which has been
posted to the mailing list) -- to be merged or rejected together with
that.
Signed-off-by: Michael Livshin <michael.livshin@scylladb.com>
Closes #9939
With previous design of the schema change test, a regeneration
was necessary each time a new distributed system table was added.
It was not the original purpose of the test to keep track of new
distributed tables which simply propagate on their own,
so the test case is now modified: internal distributed tables
are not part of the schema digest anymore, which means that
changes inside them will not cause mismatches.
This change involves a one-shot regeneration of all digests,
which due to historical reasons included internal distributed
tables in the digest, but no further regenerations should ever
be necessary when a new internal distributed table is added.
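The idea can be modelled with a toy digest. The Python sketch below is only an illustration under assumptions: it is not how Scylla computes schema digests, SHA-256 is used purely as a stand-in hash, and the table names and the helper are hypothetical.

```python
import hashlib

# Assumption for the sketch: a single internal distributed keyspace.
INTERNAL_DISTRIBUTED_KEYSPACES = {"system_distributed"}

def schema_digest(tables):
    """Hash (keyspace, table, definition) entries, skipping internal ones."""
    h = hashlib.sha256()
    for keyspace, table, definition in sorted(tables):
        if keyspace in INTERNAL_DISTRIBUTED_KEYSPACES:
            continue  # internal distributed tables do not affect the digest
        h.update(f"{keyspace}.{table}:{definition}\n".encode())
    return h.hexdigest()

base = [("ks", "t", "pk int PRIMARY KEY")]
with_internal = base + [("system_distributed", "new_internal_table",
                         "id uuid PRIMARY KEY")]
with_user = base + [("ks", "t2", "pk int PRIMARY KEY")]

# A new internal distributed table leaves the digest unchanged...
print(schema_digest(base) == schema_digest(with_internal))  # True
# ...while a new user table still changes it.
print(schema_digest(base) == schema_digest(with_user))      # False
```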
Commit a6ad70d3da changed the format of
stream IDs: the lower 8 bytes were previously generated randomly, now
some of them have semantics. In particular, the least significant byte
contains a version (stream IDs might evolve with further releases).
This is a backward-incompatible change: the code won't properly handle
stream IDs with all lower 8 bytes generated randomly. To protect us from
subtle bugs, the code has an assertion that checks the stream ID's
version.
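A rough Python model of such a version check follows. Everything about the layout here is an assumption made for illustration: the stream ID is taken to be 16 bytes, the lower 8 bytes are read as a big-endian integer, and EXPECTED_VERSION is a made-up value; the real code uses an assertion rather than an exception.

```python
EXPECTED_VERSION = 1  # hypothetical version number, for illustration only

def stream_id_version(stream_id):
    """Return the least significant byte of the lower 8 bytes."""
    if len(stream_id) != 16:
        raise ValueError("expected a 16-byte stream ID")
    lower = int.from_bytes(stream_id[8:], "big")  # assumed byte order
    return lower & 0xFF

def check_stream_id(stream_id):
    """Reject stream IDs whose version byte is not the expected one."""
    version = stream_id_version(stream_id)
    if version != EXPECTED_VERSION:
        raise ValueError(f"unsupported stream ID version {version}")

new_style = bytes(15) + bytes([EXPECTED_VERSION])
old_style = bytes(15) + bytes([0x7F])  # stands in for a randomly generated byte

check_stream_id(new_style)  # passes silently
try:
    check_stream_id(old_style)
except ValueError as err:
    print(err)  # unsupported stream ID version 127
```

An old stream ID whose lower bytes were generated randomly would almost always fail a check like this, which is the upgrade problem described next.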
This means that if an experimental user used CDC before the change and
then upgraded, they might hit the assertion when a node attempts to
retrieve a CDC generation with old stream IDs from the CDC description
tables and then decode it.
In effect, the user won't even be able to start a node.
As with the case described in
d89b7a0548, the simplest fix is to rename
the tables. This fix must get merged in before CDC goes out of
experimental.
Now, if the user upgrades their cluster from a pre-rename version, the
node will simply complain that it can't obtain the CDC generation
instead of preventing the cluster from working. The user will be able to
use CDC after running checkAndRepairCDCStreams.
Since a new table is added to the system_distributed keyspace, the
cluster's schema has changed, so sstables and digests need to be
regenerated for schema_digest_test.
Test also the md format in all_sstable_versions.
Add pre-computed md-sstable files generated using Cassandra version 3.11.7
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Commit 968177da04 has changed the schema
of cdc_topology_description and cdc_description tables in the
system_distributed keyspace.
Unfortunately this was a backwards-incompatible change: these tables
would always be created, irrespective of whether or not "experimental"
was enabled. They just wouldn't be populated with experimental=off.
If the user now tries to upgrade Scylla from a version before this change
to a version after this change, it will work as long as CDC is protected
by the experimental flag and the flag is off.
However, if we drop the flag, or if the user turns experimental on,
weird things will happen, such as nodes refusing to start because they
try to populate cdc_topology_description while assuming a different schema
for this table.
The simplest fix for this problem is to rename the tables. This fix must
get merged in before CDC goes out of experimental.
If the user upgrades his cluster from a pre-rename version, he will simply
have two garbage tables that he is free to delete after upgrading.
sstables and digests need to be regenerated for schema_digest_test since
this commit effectively adds new tables to the system_distributed keyspace.
This doesn't result in schema disagreement because the table is
announced to all nodes through the migration manager.
Some legacy `mc` SSTables (created in Scylla 3.0) may contain incorrect
serialization headers, which don't wrap frozen UDTs nested inside collections
with the FrozenType<...> tag. When reading such an SSTable,
Scylla would detect a mismatch between the schema saved in schema
tables (which correctly wraps UDTs in the FrozenType<...> tag) and the schema
from the serialization header (which doesn't have these tags).
SSTables created in Scylla versions 3.1 and above, in particular in
Scylla versions that contain this commit, create correct serialization
headers (which wrap UDTs in the FrozenType<...> tag).
This commit does two things:
1. for all SSTables created after this commit, include a new feature
flag, CorrectUDTsInCollections, presence of which implies that frozen
UDTs inside collections have the FrozenType<...> tag.
2. when reading a Scylla SSTable without the feature flag, we assume that UDTs
nested inside collections are always frozen, even if they don't have
the tag. This assumption is safe to be made, because at the time of
this commit, Scylla does not allow non-frozen (multi-cell) types inside
collections or UDTs, and because of point 1 above.
There is one edge case not covered: if we don't know whether the SSTable
comes from Scylla or from C*. In that case we won't make the assumption
described in 2. Therefore, if we get a mismatch between schema and
serialization headers of a table which we couldn't confirm to come from
Scylla, we will still reject the table. If any user encounters such an
issue (unlikely), we will have to use another solution, e.g. using a
separate tool to rewrite the SSTable.
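The resulting decision can be summarized with a small Python sketch. It only models the rules described in this message; the enum and function names are hypothetical and do not correspond to identifiers in the Scylla source.

```python
from enum import Enum

class HeaderHandling(Enum):
    TRUST_HEADER = "header already wraps UDTs in FrozenType<...>"
    ASSUME_FROZEN = "treat UDTs nested in collections as frozen"
    NO_ASSUMPTION = "compare header with schema as-is; reject on mismatch"

def header_handling(is_scylla_sstable, has_correct_udts_flag):
    """Decide how to interpret UDTs nested in collections in the header."""
    if has_correct_udts_flag:
        # Point 1: the CorrectUDTsInCollections flag guarantees the tags.
        return HeaderHandling.TRUST_HEADER
    if is_scylla_sstable:
        # Point 2: a Scylla SSTable without the flag may lack the tags.
        return HeaderHandling.ASSUME_FROZEN
    # Edge case: origin unknown (Scylla or C*), so nothing is assumed.
    return HeaderHandling.NO_ASSUMPTION

print(header_handling(True, True).name)    # TRUST_HEADER
print(header_handling(True, False).name)   # ASSUME_FROZEN
print(header_handling(False, False).name)  # NO_ASSUMPTION
```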
Fixes #6130.
The cdc_topology_description table will be used internally
by nodes to send new CDC stream generations to other nodes.
The cdc_description table is a user-facing table,
used to inform users about new sets of CDC streams.
Regenerate sstables and digests for schema_change_test.
We don't need to protect this change by a schema feature:
when a node creates these tables, it announces them
to all other nodes. If schema agreement happens before
this migration, all nodes will use a digest calculated
without these tables. If it happens after, then all nodes
will eventually know about these tables and use a digest
calculated with these tables.
The original "test_schema_digest_does_not_change" test case ensures
that schema digests will match for older nodes that do not support
all the features yet (including computed columns).
The additional case uses sstables generated after CDC was enabled
and a table with CDC enabled is created,
in order to make sure that the digest computed
including CDC column does not change spuriously as well.
Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
The plan is to move the unstructured content of tests/ directory
into the following directories of test/:
test/lib - shared header and source files for unit tests
test/boost - boost unit tests
test/unit - non-boost unit tests
test/manual - tests intended to be run manually
test/resource - binary test resources and configuration files
In order to not break git bisect and preserve the file history,
first move most of the header files and resources.
Update paths to these files in .cc files, which are not moved.