this change has no impact on `build.ninja` generated by `configure.py`.
as we are using a `set` for tracking the tests to be built. but it's
still an improvement, as we should not add duplicated entries in a set
when initializing it.
there are two occurrences of `test/boost/double_decker_test`, the one
which is in the club of the local cluster of collections tests - bptree,
btree, radix_tree and double_decker are preserved.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closes#14478
Since most group0 commands are just mutations it is easy to combine them
before passing them to a subsystem they destined to since it is more
efficient. The logic that handles those mutations in a subsystem will
run once for each batch of commands instead of for each individual
command. This is especially useful when a node catches up to a leader and
gets a lot of commands together.
The patch here does exactly that. It combines commands into a single
command if possible, but it preserves an order between commands, so each
time it encounters a command to a different subsystem it flushes already
combined batch and starts a new one. This extra safety assumes that
there are dependencies between subsystems managed by group0, so the order
matters. It may be not the case now, but we prefer to be on a safe side.
Broadcast table commands are not mutations, so they are never combined.
* 'raft-merge-cmds' of https://github.com/gleb-cloudius/scylla:
test: add test for group0 raft command merging
service: raft: respect max mutation size limit when persisting raft entries
group0_state_machine: merge commands before applying them whenever possible
Add a test that submits 3 large commands each one a little bit larger
than 1/3 of maximum mutation size. Check that in the end 2 command were
executed (first 2 were merged and third was executed separately).
Fixes#10099
Adds the com.scylladb.auth.CertificateAuthenticator type. If set as authenticator, will extract roles from TLS authentication certificate (not wire cert - those are server side) subject, based on configurable regex.
Example:
scylla.yaml:
```
authenticator: com.scylladb.auth.CertificateAuthenticator
auth_superuser_name: <name>
auth_certificate_role_query: CN=([^,\s]+)
client_encryption_options:
enabled: True
certificate: <server cert>
keyfile: <server key>
truststore: <shared trust>
require_client_auth: True
```
In a client, then use a certificate signed with the <shared trust> store as auth cert, with the common name <name>. I.e. for qlsh set "usercert" and "userkey" to these certificate files.
No user/password needs to be sent, but role will be picked up from auth certificate. If none is present, the transport will reject the connection. If the certificate subject does not contain a recongnized role name (from config or set in tables) the authenticator mechanism will reject it.
Otherwise, connection becomes the role described.
To facilitate this, this also contains the addition of allowing setting super user name + salted passwd via command line/conf + some tweaks to SASL part of connection setup.
Closes#12214
* github.com:scylladb/scylladb:
docs: Add documentation of certificate auth + auth_superuser_name
auth: Add TLS certificate authenticator
transport: Try to do early, transport based auth if possible
auth: Allow for early (certificate/transport) authentication
auth: Allow specifying initial superuser name + passwd (salted) in config
roles-metadata: Coroutinuze some helpers
View building from staging creates a reader from scratch (memtable
\+ sstables - staging) for every partition, in order to calculate
the diff between new staging data and data in base sstable set,
and then pushes the result into the view replicas.
perf shows that the reader creation is very expensive:
```
+ 12.15% 10.75% reactor-3 scylla [.] lexicographical_tri_compare<compound_type<(allow_prefixes)0>::iterator, compound_type<(allow_prefixes)0>::iterator, legacy_compound_view<compound_type<(allow_prefixes)0> >::tri_comparator::operator()(managed_bytes_basic_view<(mutable_view)0>, managed_bytes
+ 10.01% 9.99% reactor-3 scylla [.] boost::icl::is_empty<boost::icl::continuous_interval<compatible_ring_position_or_view, std::less> >
+ 8.95% 8.94% reactor-3 scylla [.] legacy_compound_view<compound_type<(allow_prefixes)0> >::tri_comparator::operator()
+ 7.29% 7.28% reactor-3 scylla [.] dht::ring_position_tri_compare
+ 6.28% 6.27% reactor-3 scylla [.] dht::tri_compare
+ 4.11% 3.52% reactor-3 scylla [.] boost::icl::interval_base_map<boost::icl::interval_map<compatible_ring_position_or_view, std::unordered_set<seastar::lw_shared_ptr<sstables::sstable>, std::hash<seastar::lw_shared_ptr<sstables::sstable> >, std::equal_to<seastar::lw_shared_ptr<sstables::sst+ 4.09% 4.07% reactor-3 scylla [.] sstables::index_consume_entry_context<sstables::index_consumer>::process_state
+ 3.46% 0.93% reactor-3 scylla [.] sstables::sstable_run::will_introduce_overlapping
+ 2.53% 2.53% reactor-3 libstdc++.so.6 [.] std::_Rb_tree_increment
+ 2.45% 2.45% reactor-3 scylla [.] boost::icl::non_empty::exclusive_less<boost::icl::continuous_interval<compatible_ring_position_or_view, std::less> >
+ 2.14% 2.13% reactor-3 scylla [.] boost::icl::exclusive_less<boost::icl::continuous_interval<compatible_ring_position_or_view, std::less> >
+ 2.07% 2.07% reactor-3 scylla [.] logalloc::region_impl::free
+ 2.06% 1.91% reactor-3 scylla [.] sstables::index_consumer::consume_entry(sstables::parsed_partition_index_entry&&)::{lambda()https://github.com/scylladb/scylladb/issues/1}::operator()() const::{lambda()https://github.com/scylladb/scylladb/issues/1}::operator()
+ 2.04% 2.04% reactor-3 scylla [.] boost::icl::interval_base_map<boost::icl::interval_map<compatible_ring_position_or_view, std::unordered_set<seastar::lw_shared_ptr<sstables::sstable>, std::hash<seastar::lw_shared_ptr<sstables::sstable> >, std::equal_to<seastar::lw_shared_ptr<sstables::sst+ 1.87% 0.00% reactor-3 [kernel.kallsyms] [k] entry_SYSCALL_64_after_hwframe
+ 1.86% 0.00% reactor-3 [kernel.kallsyms] [k] do_syscall_64
+ 1.39% 1.38% reactor-3 libc.so.6 [.] __memcmp_avx2_movbe
+ 1.37% 0.92% reactor-3 scylla [.] boost::icl::segmental::join_left<boost::icl::interval_map<compatible_ring_position_or_view, std::unordered_set<seastar::lw_shared_ptr<sstables::sstable>, std::hash<seastar::lw_shared_ptr<sstables::sstable> >, std::equal_to<seastar::lw_shared_ptr<sstables::
+ 1.34% 1.33% reactor-3 scylla [.] logalloc::region_impl::alloc_small
+ 1.33% 1.33% reactor-3 scylla [.] seastar::memory::small_pool::add_more_objects
+ 1.30% 0.35% reactor-3 scylla [.] seastar::reactor::do_run
+ 1.29% 1.29% reactor-3 scylla [.] seastar::memory::allocate
+ 1.19% 0.05% reactor-3 libc.so.6 [.] syscall
+ 1.16% 1.04% reactor-3 scylla [.] boost::icl::interval_base_map<boost::icl::interval_map<compatible_ring_position_or_view, std::unordered_set<seastar::lw_shared_ptr<sstables::sstable>, std::hash<seastar::lw_shared_ptr<sstables::sstable> >, std::equal_to<seastar::lw_shared_ptr<sstables::sst
+ 1.07% 0.79% reactor-3 scylla [.] sstables::partitioned_sstable_set::insert
```
That shows some significant amount of work for inserting sstables
into the interval map and maintaining the sstable run (which sorts
fragments by first key and checks for overlapping).
The interval map is known for having issues with L0 sstables, as
it will have to be replicated almost to every single interval
stored by the map, causing terrible space and time complexity.
With enough L0 sstables, it can fall into quadratic behavior.
This overhead is fixed by not building a new fresh sstable set
when recreating the reader, but rather supplying a predicate
to sstable set that will filter out staging sstables when
creating either a single-key or range scan reader.
This could have another benefit over today's approach which
may incorrectly consider a staging sstable as non-staging, if
the staging sst wasn't included in the current batch for view
building.
With this improvement, view building was measured to be 3x faster.
from
`INFO 2023-06-16 12:36:40,014 [shard 0] view_update_generator - Processed keyspace1.standard1: 5 sstables in 963957ms = 50kB/s`
to
`INFO 2023-06-16 14:47:12,129 [shard 0] view_update_generator - Processed keyspace1.standard1: 5 sstables in 319899ms = 150kB/s`
Refs https://github.com/scylladb/scylladb/issues/14089.
Fixes scylladb/scylladb#14244.
Closes#14364
* github.com:scylladb/scylladb:
table: Optimize creation of reader excluding staging for view building
view_update_generator: Dump throughput and duration for view update from staging
utils: Extract pretty printers into a header
Fixes#10099
Adds the com.scylladb.auth.CertificateAuthenticator type. If set as authenticator,
will extract roles from TLS authentication certificate (not wire cert - those are
server side) subject, based on configurable regex.
Example:
scylla.yaml:
authenticator: com.scylladb.auth.CertificateAuthenticator
auth_superuser_name: <name>
auth_certificate_role_queries:
- source: SUBJECT
query: CN=([^,\s]+)
client_encryption_options:
enabled: True
certificate: <server cert>
keyfile: <server key>
truststore: <shared trust>
require_client_auth: True
In a client, then use a certificate signed with the <shared trust>
store as auth cert, with the common name <name>. I.e. for cqlsh
set "usercert" and "userkey" to these certificate files.
No user/password needs to be sent, but role will be picked up
from auth certificate. If none is present, the transport will
reject the connection. If the certificate subject does not
contain a recongnized role name (from config or set in tables)
the authenticator mechanism will reject it.
Otherwise, connection becomes the role described.
This reverts commit 562087beff.
The regressions introduced by the reverted change have been fixed.
So let's revert this revert to resurrect the
uuid_sstable_identifier_enabled support.
Fixes#10459
This reverts commit d1dc579062, reversing
changes made to 3a73048bc9.
Said commit caused regressions in dtests. We need to investigate and fix
those, but in the meanwhile let's revert this to reduce the disruption
to our workflows.
Refs: #14283
This reverts commit 7dadd38161.
The latest revert cited debuggability trumping performance, but the
performance loss is su huge here that debug builds are unusable and
next promotions time out.
In the interest of progress, pick the lesser of two evils.
this change generalize the value of generation_type so it also
supports UUID based identifier.
* sstables/generation_type.h:
- add formatter and parse for UUID. please note, Cassandra uses
a different format for formatting the SSTable identifier. and
this formatter suits our needs as it uses underscore "_" as the
delimiter, as the file name of components uses dash "-" as the
delimiter. instead of reinventing the formatting or just use
another delimiter in the stringified UUID, we choose to use the
Cassandra's formatting.
- add accessors for accessing the type and value of generation_type
- add constructor for constructing generation_type with UUID and
string.
- use hash for placing sstables with uuid identifiers into shards
for more uniformed distrbution of tables in shards.
* replica/table.cc:
- only update the generator if the given generation contains an
integer
* test/boost:
- add a simple test to verify the generation_type is able to
parse and format
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
This reverts commit 9526258b89.
Because the issue (#14213) supposed to be fix only exists in the
enterprise branch. And that issue has been fixed in a different
way in a different place.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closes#14234
This reverts commit 7e68ed6a5d. With -Og
simple 'gdb build/debug/scylla -ex start' shows the function parameters
as "optimized out" while with -O0 they display fine. This applies to
all variables, not just main's parameters.
Bisect revealed that this behavior started with the reverted commit; it's
not due to a later toolchain update.
Fixes#14196.
Closes#14197
when compiling the generated source files, sometimes, we can run
into the FTBFS like:
02:18:54 FAILED: build/release/gen/cql3/CqlParser.o
02:18:54 clang++ ... -o build/release/gen/cql3/CqlParser.o build/release/gen/cql3/CqlParser.cpp
...
02:18:54 In file included from build/release/gen/cql3/CqlParser.cpp:44:
02:18:54 In file included from build/release/gen/cql3/CqlParser.hpp:75:
02:18:54 In file included from ./cql3/statements/create_function_statement.hh:12:
02:18:54 In file included from ./cql3/functions/user_function.hh:16:
02:18:54 ./lang/wasm.hh:15:10: fatal error: 'rust/wasmtime_bindings.hh' file not found
02:18:54 #include "rust/wasmtime_bindings.hh"
02:18:54 ^~~~~~~~~~~~~~~~~~~~~~~~~~~
CqlParser.cc is a source file generated from cql3/Cql.g, this source in
turn includes another source file generated from
wasmtime_bindings/src/lib.rs. but we failed to setup this dependency in
the build.ninja rules -- we only teach ninja that "to compile the
grammer source files, please prepare the`serializers` source files
first". but this is not enough.
so, in this change, we just replace `serializers` with `gen_headers`,
as the latter is a superset of the former. and should fulfill the needs
of CqlParser.cc.
Fixes#14213
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closes#14214
In that level no io_priority_class-es exist. Instead, all the IO happens
in the context of current sched-group. File API no longer accepts prio
class argument (and makes io_intent arg mandatory to impls).
So the change consists of
- removing all usage of io_priority_class
- patching file_impl's inheritants to updated API
- priority manager goes away altogether
- IO bandwidth update is performed on respective sched group
- tune-up scylla-gdb.py io_queues command
The first change is huge and was made semi-autimatically by:
- grep io_priority_class | default_priority_class
- remove all calls, found methods' args and class' fields
Patching file_impl-s is smaller, but also mechanical:
- replace io_priority_class& argument with io_intent* one
- pass intent to lower file (if applicatble)
Dropping the priority manager is:
- git-rm .cc and .hh
- sed out all the #include-s
- fix configure.py and cmakefile
The scylla-gdb.py update is a bit hairry -- it needs to use task queues
list for IO classes names and shares, but to detect it should it checks
for the "commitlog" group is present.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Closes#13963
as an alternative to passing the link-args using the environmental variable,
we can also use build script to pass the "-C link-args=<FLAG>" to the compiler.
see https://doc.rust-lang.org/nightly/cargo/reference/build-scripts.html#cargorustc-link-argflag
to ensure that cargo is called again by ninja, after build.rs is
updated, build.rs is added as a dependency of {wasm} files along with
Cargo.lock.
this change is verified using following command
```
RUSTFLAGS='--print link-args' cargo build \
--target=wasm32-wasi \
--example=return_input \
--locked \
--manifest-path=Cargo.toml \
--target-dir=build/cmake/test/resource/wasm/rust
```
the output includes "-zstack-size=131072" in the argument passed to lld:
```
Compiling examples v0.0.0 (/home/kefu/dev/scylladb/test/resource/wasm/rust)
LC_ALL="C"
PATH="/usr/lib/rustlib/x86_64-unknown-linux-gnu/bin:/usr/lib/rustlib/x86_64-unknown-linux-gnu/bin/self-contained:/home/kefu/.local/bin:/home/kefu/bin:/usr/local/bin:/usr/local/sbin:/usr/bin:/usr/sbin"
VSLANG="1033"
"lld"
"-flavor" "wasm" "--rsp-quoting=posix" "--export"
"_scylla_abi" "--export" "_scylla_free" "--export" "_scylla_malloc"
"--export" "return_input" "-z" "stack-size=1048576" "--stack-first"
"--allow-undefined" "--fatal-warnings" "--no-demangle"
...
"-L" "/usr/lib/rustlib/wasm32-wasi/lib"
"-L" "/usr/lib/rustlib/wasm32-wasi/lib/self-contained"
"-o"
"/home/kefu/dev/scylladb/build/cmake/test/resource/wasm/rust/wasm32-wasi/debug/examples/return_input-ef03083560989040.wasm"
"--gc-sections"
"--no-entry"
"-O0"
"-zstack-size=131072"
```
with this change, it'd be easier to build .wat files in CMake, so
we don't need to repeat the settings in both configure.py and
CMakeLists.txt
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closes#14123
if we just want to build a single test and scylla executables, we
might want to use `configure.py` like:
./configure.py --mode debug --compiler clang++ --with scylla --with test/boost/database_test
which generates `build.ninja` for us, with following rules:
build $builddir/debug/test/boost/database_test_g: link.debug ... | $builddir/debug/seastar/libseastar.so
$builddir/debug/seastar/libseastar_testing.so
libs = $seastar_libs_debug $libs -lthrift -lboost_system $seastar_testing_libs_debug
libs = $seastar_libs_debug
but the last line prevents database_test_g for linking against
the third-party libraries like libabsl, which could have been
pulled in by $libs. but the second assignment expression just
makes the value of `libs` identical to that of `seastar_libs_debug`.
but that library does not include the libraries which are only
used by scylla. so we could run into link failure with the
`build.ninja` generated with this command line. like:
```
FAILED: build/debug/test/boost/database_test_g
...
ld.lld: error: undefined symbol: seastar::testing::entry_point(int, char**)
>>> referenced by scylla_test_case.hh:22 (./test/lib/scylla_test_case.hh:22)
>>> build/debug/test/boost/database_test.o:(main)
...
ld.lld: error: undefined symbol: boost::unit_test::unit_test_log_t::set_checkpoint(boost::unit_test::basic_cstring<char const>, unsigned long, boost::unit_tes
t::basic_cstring<char const>)
>>> referenced by database_test.cc:298 (test/boost/database_test.cc:298)
>>> build/debug/test/boost/database_test.o:(require_exist(seastar::basic_sstring<char, unsigned int, 15u, true> const&, bool))
...
```
with this change, the extra assignment expression is dropped. this
should not cause any regression. as f'$seastar_libs_{mode}' as
been included as a part of `local_libs` before the grand if-the-else
block in the for loop before this `f.write()` statement.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closes#14041
Here's a simple test that can be used to check S3 object read latencies.
To run one must export the same variables as for any other S3 unit test:
- S3_SERVER_ADDRESS_FOR_TEST
- S3_SERVER_PORT_FOR_TEST
- S3_PUBLIC_BUCKET_FOR_TEST
and the AWS creds are a must via AWS_S3_EXTRA='$key:$secret:$region' env
variable.
Accepted options are
--duration SEC -- test duration in seconds
--parallel NR -- number of fibers to run in parallel
--object-size BYTES -- object size to use (1MB by default)
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Closes#13895
It turns out that numeric_limits defines an implicit implementation
for std::numeric_limits<utils::tagged_integer<Tag, ValueType>>
which apprently returns a default-constructed tagged_integer
for min() and max(), and this broke
`gms::heart_beat_state::force_highest_possible_version_unsafe()`
since [gms: heart_beat_state: use generation_type and version_type](4cdad8bc8b)
(merged in [Merge 'gms: define and use generation and version types'...](7f04d8231d))
Implementing min/max correctly
Fixes#13801Closes#13880
* github.com:scylladb/scylladb:
storage_service: handle_state_normal: on_internal_error on "owns no tokens"
utils: tagged_integer: implement std::numeric_limits::{min,max}
test: add tagged_integer_test
Add basic test for tagged+integer arithmetic operations.
Remove const qualifier from `tagged_integer::operator[+-]=`
as these are add/sub-assign operators that need to modify
the value in place.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
This is a translation of Cassandra's CQL unit test source file
validation/entities/UFTypesTest.java into our cql-pytest framework.
There are 7 tests, which reproduce one known bug:
Refs #13746: UDF can only be used in SELECT, and abort when used in WHERE, or in INSERT/UPDATE/DELETE commands
And uncovered two previously unknown bugs:
Refs #13855: UDF with a non-frozen collection parameter cannot be called on a frozen value
Refs #13860: A non-frozen collection returned by a UDF cannot be used as a frozen one
Additionally, we encountered an issue that can be treated as either a bug or a hole in documentation:
Refs #13866: Argument and return types in UDFs can be frozen
Closes#13867
The validator classes have their definition in a header located in mutation/, while their implementation is located in a .cc in readers/mutation_reader.cc.
This PR fixes this inconsistency by moving the implementation into mutation/mutation_fragment_stream_validator.cc. The only change is that the validator code gets a new logger instance (but the logger variable itself is left unchanged for now).
Closes#13831
* github.com:scylladb/scylladb:
mutation/mutation_fragment_stream_validator.cc: rename logger
readers,mutation: move mutation_fragment_stream_validator to mutation/
When new nodes are added or existing nodes are deleted, the topology
state machine needs to shunt reads from the old nodes to the new ones.
This happens in the `write_both_read_new` state. The problem is that
previously this state was not handled in any way in `token_metadata` and
the read nodes were only changed when the topology state machine reached
the final 'owned' state.
To handle `write_both_read_new` an additional `interval_map` inside
`token_metadata` is maintained similar to `pending_endpoints`. It maps
the ranges affected by the ongoing topology change operation to replicas
which should be used for reading. When topology state sm reaches the
point when it needs to switch reads to a new topology, it passes
`request_read_new=true` in a call to `update_pending_ranges`. This
forces `update_pending_ranges` to compute the ranges based on new
topology and store them to the `interval_map`. On the data plane, when a
read on coordinator needs to decide which endpoints to use, it first
consults this `interval_map` in `token_metadata`, and only if it doesn't
contain a range for current token it uses normal endpoints from
`effective_replication_map`.
Closes#13376
* github.com:scylladb/scylladb:
storage_proxy, storage_service: use new read endpoints
storage_proxy: rename get_live_sorted_endpoints->get_endpoints_for_reading
token_metadata: add unit test for endpoints_for_reading
token_metadata: add endpoints for reading
sequenced_set: add extract_set method
token_metadata_impl: extract maybe_migration_endpoints helper function
token_metadata_impl: introduce migration_info
token_metadata_impl: refactor update_pending_ranges
token_metadata: add unit tests
token_metadata: fix indentation
token_metadata_impl: return unique_ptr from clone functions
The validator classes have their definition in a header located in mutation/,
while their implementation is located in a .cc in readers/mutation_reader.cc.
This patch fixes this inconsistency by moving the implementation into
mutation/mutation_fragment_stream_validator.cc. The only change is that
the validator code gets a new logger instance (but the logger variable itself
is left unchanged for now).
this change extracts the storage class and its derived classes
out into their own source files. for couple reasons:
- for better readability. the sstables.hh is over 1005 lines.
and sstables.cc 3602 lines. it's a little bit difficult to figure
out how the different parts in these sources interact with each
other. for instance, with this change, it's clear some of helper
functions are only used by file_system_storage.
- probably less inter-source dependency. by extracting the sources
files out, they can be compiled individually, so changing one .cc
file does not impact others. this could speed up the compilation
time.
Closes#13785
* github.com:scylladb/scylladb:
sstables: storage: coroutinize idempotent_link_file()
sstables: extract storage out
this change extracts the storage class and its derived classes
out into storage.cc and storage.hh. for couple reasons:
- for better readability. the sstables.hh is over 1005 lines.
and sstables.cc 3602 lines. it's a little bit difficult to figure
out how the different parts in these sources interact with each
other. for instance, with this change, it's clear some of helper
functions are only used by file_system_storage.
- probably less inter-source dependency. by extracting the sources
files out, they can be compiled individually, so changing one .cc
file does not impact others. this could speed up the compilation
time.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
After recent changes, we are able to store only the
C/Rust source codes for Wasm programs, and only build
them when neccessary. This patch utilizes this
opportunity by removing most of the currently stored
raw Wasm programs, replacing them with C/Rust sources
and adding them to the new build system.
Currently, when we deal with a Wasm program, we store
it in its final WebAssembly Text form. This causes a lot
of code bloat and is hard to read. Instead, we would like
to store only the (C/Rust) source codes, and build Wasm
when neccessary. This patch adds build commands that
compile C/Rust sources to Wasm.
After these changes, adding a new program that should be
compiled to Rust, requires only adding the source code
of it and updating the wasms and wasm_deps lists in
configure.py.
All Wasm programs are build by default when building all
artifacts, all artifacts in a given mode, or when building
tests. Additionally, a ninja wasm target is added, so that
it's possible to build just the wasm files.
The generated files are saved in $builddir/wasm.
Currently, build_artifacts are of type set[str] | list, which prevents
us from performing set operations on it. In a future patch, we will
want to take a set difference and set intersections with it, so we
initialize the type of build_artifacts to a set in all cases.
In case an sstable unit test case is run individually, it would fail
with exception saying that S3_... environment is not set. It's better to
skip the test-case rather than fail. If someone wants to run it from
shell, it will have to prepare S3 server (minio/AWS public bucket) and
provide proper environment for the test-case.
refs: #13569
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Closes#13755
it turns out the only places where we have compiler warnings of
-W-parentheses-equality is the source code generated by ANTLR. strictly
speaking, this is valid C++ code, just not quite readable from the
hygienic point of view. so let's enable this warning in the source tree,
but only disable it when compiling the sources generated by ANTLR.
please note, this warning option is supported by both GCC and Clang,
so no need to test if it is supported.
for a sample of the warnings, see:
```
/home/kefu/dev/scylladb/build/cmake/cql3/CqlLexer.cpp:21752:38: error: equality comparison with extraneous parentheses [-Werror,-Wparentheses-equality]
if ( (LA4_0 == '$'))
~~~~~~^~~~~~
/home/kefu/dev/scylladb/build/cmake/cql3/CqlLexer.cpp:21752:38: note: remove extraneous parentheses around the comparison to silence this warning
if ( (LA4_0 == '$'))
~ ^ ~
```
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
This mini-series cleans up printing of ranges in utils/to_string.hh
It generalizes the helper function to work on a std::ranges::range,
with some exceptions, and adds a helper for boost::transformed_range.
It also changes the internal interface by moving `join` the the utils namespace
and use std::string rather than seastar::sstring.
Additional unit tests were added to test/boost/json_test
Fixes#13146Closes#13159
* github.com:scylladb/scylladb:
utils: to_string: get rid of utils::join
utils: to_string: get rid of to_string(std::initializer_list)
utils: to_string: get rid of to_string(const Range&)
utils: to_string: generalize range helpers
test: add string_format_test
utils: chunked_vector: add std::ranges::range ctor
In the previous patch we added a limit in Alternator for the magnitude
and precision of numbers, based on a function get_magnitude_and_precision
whose implementation was, unfortunately, rather elaborate and delicate.
Although we did add in the previous patches some end-to-end tests which
confirmed that the final decision made based on this function, to accept or
reject numbers, was a correct decision in a few cases, such an elaborate
function deserves a separate unit test for checking just that function
in isolation. In fact, this unit tests uncovered some bugs in the first
implementation of get_magnitude_and_precision() which the other tests
missed.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
This PR introduces an experimental feature called "tablets". Tablets are
a way to distribute data in the cluster, which is an alternative to the
current vnode-based replication. Vnode-based replication strategy tries
to evenly distribute the global token space shared by all tables among
nodes and shards. With tablets, the aim is to start from a different
side. Divide resources of replica-shard into tablets, with a goal of
having a fixed target tablet size, and then assign those tablets to
serve fragments of tables (also called tablets). This will allow us to
balance the load in a more flexible manner, by moving individual tablets
around. Also, unlike with vnode ranges, tablet replicas live on a
particular shard on a given node, which will allow us to bind raft
groups to tablets. Those goals are not yet achieved with this PR, but it
lays the ground for this.
Things achieved in this PR:
- You can start a cluster and create a keyspace whose tables will use
tablet-based replication. This is done by setting `initial_tablets`
option:
```
CREATE KEYSPACE test WITH replication = {'class': 'NetworkTopologyStrategy',
'replication_factor': 3,
'initial_tablets': 8};
```
All tables created in such a keyspace will be tablet-based.
Tablet-based replication is a trait, not a separate replication
strategy. Tablets don't change the spirit of replication strategy, it
just alters the way in which data ownership is managed. In theory, we
could use it for other strategies as well like
EverywhereReplicationStrategy. Currently, only NetworkTopologyStrategy
is augmented to support tablets.
- You can create and drop tablet-based tables (no DDL language changes)
- DML / DQL work with tablet-based tables
Replicas for tablet-based tables are chosen from tablet metadata
instead of token metadata
Things which are not yet implemented:
- handling of views, indexes, CDC created on tablet-based tables
- sharding is done using the old method, it ignores the shard allocated in tablet metadata
- node operations (topology changes, repair, rebuild) are not handling tablet-based tables
- not integrated with compaction groups
- tablet allocator piggy-backs on tokens to choose replicas.
Eventually we want to allocate based on current load, not statically
Closes#13387
* github.com:scylladb/scylladb:
test: topology: Introduce test_tablets.py
raft: Introduce 'raft_server_force_snapshot' error injection
locator: network_topology_strategy: Support tablet replication
service: Introduce tablet_allocator
locator: Introduce tablet_aware_replication_strategy
locator: Extract maybe_remove_node_being_replaced()
dht: token_metadata: Introduce get_my_id()
migration_manager: Send tablet metadata as part of schema pull
storage_service: Load tablet metadata when reloading topology state
storage_service: Load tablet metadata on boot and from group0 changes
db, migration_manager: Notify about tablet metadata changes via migration_listener::on_update_tablet_metadata()
migration_notifier: Introduce before_drop_keyspace()
migration_manager: Make prepare_keyspace_drop_announcement() return a future<>
test: perf: Introduce perf-tablets
test: Introduce tablets_test
test: lib: Do not override table id in create_table()
utils, tablets: Introduce external_memory_usage()
db: tablets: Add printers
db: tablets: Add persistence layer
dht: Use last_token_of_compaction_group() in split_token_range_msb()
locator: Introduce tablet_metadata
dht: Introduce first_token()
dht: Introduce next_token()
storage_proxy: Improve trace-level logging
locator: token_metadata: Fix confusing comment on ring_range()
dht, storage_proxy: Abstract token space splitting
Revert "query_ranges_to_vnodes_generator: fix for exclusive boundaries"
db: Exclude keyspace with per-table replication in get_non_local_strategy_keyspaces_erms()
db: Introduce get_non_local_vnode_based_strategy_keyspaces()
service: storage_proxy: Avoid copying keyspace name in write handler
locator: Introduce per-table replication strategy
treewide: Use replication_strategy_ptr as a shorter name for abstract_replication_strategy::ptr_type
locator: Introduce effective_replication_map
locator: Rename effective_replication_map to vnode_effective_replication_map
locator: effective_replication_map: Abstract get_pending_endpoints()
db: Propagate feature_service to abstract_replication_strategy::validate_options()
db: config: Introduce experimental "TABLETS" feature
db: Log replication strategy for debugging purposes
db: Log full exception on error in do_parse_schema_tables()
db: keyspace: Remove non-const replication strategy getter
config: Reformat
Currently, responsible for injecting mutations of system.tablets to
schema changes.
Note that not all migrations are handled currently. Dependant view or
cdc table drops are not handled.
Although get_generation_number implementation is
completely generic, it is used exclusively to seed
the gossip generation number.
Following patches will define a strong gms::generation_id
type and this function should return it.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
A generic template for defining strongly typed
integer types.
Use it here to replace raft::internal::tagged_uint64.
Will be used for defining gms generation and version
as strong and distinguishable types in following patches.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Move gms::arrival_window to api/failure_detector which is its only user.
and get rid of the rest, which is not used, now that we use direct_failure_detector instead.
TODO: integare direct_failure_detector with failure_detector api.
Closes#13576
* github.com:scylladb/scylladb:
gms: get rid of unused failure_detector
api: failure_detector: remove false dependency on failure_detector::arrival_window
test: rest_api: add test_failure_detector
this is a part of a series to migrating from `operator<<(ostream&, ..)`
based formatting to fmtlib based formatting. the goal here is to enable
fmtlib to print `stream_reason` without the help of `operator<<`.
please note, because we still cannot use the generic formatter for
std::unordered_map provided by fmtlib, so in order to drop `operator<<`
for `stream_reason`, and to print `unordered_map<stream_reason>`,
`fmt::join()` is used as a temporary solution. we will audit all
`fmt::join()` calls, after removing the homebrew formatter of
`std::unordered_map`.
the corresponding `operator<<()` are dropped dropped in this change,
as all its callers are now using fmtlib for formatting now.
Refs #13245
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closes#13609
The legacy failure_detector is now unused and can be removed.
TODO: integare direct_failure_detector with failure_detector api.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>