Commit Graph

1719 Commits

Author SHA1 Message Date
Kefu Chai
fa8eaab62b build: remove duplicated test
this change has no impact on `build.ninja` generated by `configure.py`.
as we are using a `set` for tracking the tests to be built. but it's
still an improvement, as we should not add duplicated entries in a set
when initializing it.

there are two occurrences of `test/boost/double_decker_test`, the one
which is in the club of the local cluster of collections tests - bptree,
btree, radix_tree and double_decker are preserved.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes #14478
2023-07-05 15:43:04 +03:00
Avi Kivity
66c47d40e6 cql3: selection: drop selector_factories, selectables, and selectors
The whole class hierarchy is no longer used by anything and we can just
delete it.
2023-07-03 19:45:17 +03:00
Kamil Braun
b912eeade5 Merge 'merge raft commands to group0 before applying them whenever possible' from Gleb
Since most group0 commands are just mutations it is easy to combine them
before passing them to a subsystem they destined to since it is more
efficient. The logic that handles those mutations in a subsystem will
run once for each batch of commands instead of for each individual
command. This is especially useful when a node catches up to a leader and
gets a lot of commands together.

The patch here does exactly that. It combines commands into a single
command if possible, but it preserves an order between commands, so each
time it encounters a command to a different subsystem it flushes already
combined batch and starts a new one. This extra safety assumes that
there are dependencies between subsystems managed by group0, so the order
matters. It may be not the case now, but we prefer to be on a safe side.

Broadcast table commands are not mutations, so they are never combined.

* 'raft-merge-cmds' of https://github.com/gleb-cloudius/scylla:
  test: add test for group0 raft command merging
  service: raft: respect max mutation size limit when persisting raft entries
  group0_state_machine: merge commands before applying them whenever possible
2023-06-28 17:21:07 +02:00
Gleb Natapov
945f476363 test: add test for group0 raft command merging
Add a test that submits 3 large commands each one a little bit larger
than 1/3 of maximum mutation size. Check that in the end 2 command were
executed (first 2 were merged and third was executed separately).
2023-06-27 14:59:55 +03:00
Avi Kivity
f86dd857ca Merge 'Certificate based authorization' from Calle Wilund
Fixes #10099

Adds the com.scylladb.auth.CertificateAuthenticator type. If set as authenticator, will extract roles from TLS authentication certificate (not wire cert - those are server side) subject, based on configurable regex.

Example:

scylla.yaml:

```
    authenticator: com.scylladb.auth.CertificateAuthenticator
    auth_superuser_name: <name>
    auth_certificate_role_query: CN=([^,\s]+)

    client_encryption_options:
      enabled: True
      certificate: <server cert>
      keyfile: <server key>
      truststore: <shared trust>
      require_client_auth: True
```
In a client, then use a certificate signed with the <shared trust> store as auth cert, with the common name <name>. I.e. for  qlsh set "usercert" and "userkey" to these certificate files.

No user/password needs to be sent, but role will be picked up from auth certificate. If none is present, the transport will reject the connection. If the certificate subject does not contain a recongnized role name (from config or set in tables) the authenticator mechanism will reject it.

Otherwise, connection becomes the role described.

To facilitate this, this also contains the addition of allowing setting super user name + salted passwd via command line/conf + some tweaks to SASL part of connection setup.

Closes #12214

* github.com:scylladb/scylladb:
  docs: Add documentation of certificate auth + auth_superuser_name
  auth: Add TLS certificate authenticator
  transport: Try to do early, transport based auth if possible
  auth: Allow for early (certificate/transport) authentication
  auth: Allow specifying initial superuser name + passwd (salted) in config
  roles-metadata: Coroutinuze some helpers
2023-06-27 12:52:14 +03:00
Botond Dénes
f5e3b8df6d Merge 'Optimize creation of reader excluding staging for view building' from Raphael "Raph" Carvalho
View building from staging creates a reader from scratch (memtable
\+ sstables - staging) for every partition, in order to calculate
the diff between new staging data and data in base sstable set,
and then pushes the result into the view replicas.

perf shows that the reader creation is very expensive:
```
+   12.15%    10.75%  reactor-3        scylla             [.] lexicographical_tri_compare<compound_type<(allow_prefixes)0>::iterator, compound_type<(allow_prefixes)0>::iterator, legacy_compound_view<compound_type<(allow_prefixes)0> >::tri_comparator::operator()(managed_bytes_basic_view<(mutable_view)0>, managed_bytes
+   10.01%     9.99%  reactor-3        scylla             [.] boost::icl::is_empty<boost::icl::continuous_interval<compatible_ring_position_or_view, std::less> >
+    8.95%     8.94%  reactor-3        scylla             [.] legacy_compound_view<compound_type<(allow_prefixes)0> >::tri_comparator::operator()
+    7.29%     7.28%  reactor-3        scylla             [.] dht::ring_position_tri_compare
+    6.28%     6.27%  reactor-3        scylla             [.] dht::tri_compare
+    4.11%     3.52%  reactor-3        scylla             [.] boost::icl::interval_base_map<boost::icl::interval_map<compatible_ring_position_or_view, std::unordered_set<seastar::lw_shared_ptr<sstables::sstable>, std::hash<seastar::lw_shared_ptr<sstables::sstable> >, std::equal_to<seastar::lw_shared_ptr<sstables::sst+    4.09%     4.07%  reactor-3        scylla             [.] sstables::index_consume_entry_context<sstables::index_consumer>::process_state
+    3.46%     0.93%  reactor-3        scylla             [.] sstables::sstable_run::will_introduce_overlapping
+    2.53%     2.53%  reactor-3        libstdc++.so.6     [.] std::_Rb_tree_increment
+    2.45%     2.45%  reactor-3        scylla             [.] boost::icl::non_empty::exclusive_less<boost::icl::continuous_interval<compatible_ring_position_or_view, std::less> >
+    2.14%     2.13%  reactor-3        scylla             [.] boost::icl::exclusive_less<boost::icl::continuous_interval<compatible_ring_position_or_view, std::less> >
+    2.07%     2.07%  reactor-3        scylla             [.] logalloc::region_impl::free
+    2.06%     1.91%  reactor-3        scylla             [.] sstables::index_consumer::consume_entry(sstables::parsed_partition_index_entry&&)::{lambda()https://github.com/scylladb/scylladb/issues/1}::operator()() const::{lambda()https://github.com/scylladb/scylladb/issues/1}::operator()
+    2.04%     2.04%  reactor-3        scylla             [.] boost::icl::interval_base_map<boost::icl::interval_map<compatible_ring_position_or_view, std::unordered_set<seastar::lw_shared_ptr<sstables::sstable>, std::hash<seastar::lw_shared_ptr<sstables::sstable> >, std::equal_to<seastar::lw_shared_ptr<sstables::sst+    1.87%     0.00%  reactor-3        [kernel.kallsyms]  [k] entry_SYSCALL_64_after_hwframe
+    1.86%     0.00%  reactor-3        [kernel.kallsyms]  [k] do_syscall_64
+    1.39%     1.38%  reactor-3        libc.so.6          [.] __memcmp_avx2_movbe
+    1.37%     0.92%  reactor-3        scylla             [.] boost::icl::segmental::join_left<boost::icl::interval_map<compatible_ring_position_or_view, std::unordered_set<seastar::lw_shared_ptr<sstables::sstable>, std::hash<seastar::lw_shared_ptr<sstables::sstable> >, std::equal_to<seastar::lw_shared_ptr<sstables::
+    1.34%     1.33%  reactor-3        scylla             [.] logalloc::region_impl::alloc_small
+    1.33%     1.33%  reactor-3        scylla             [.] seastar::memory::small_pool::add_more_objects
+    1.30%     0.35%  reactor-3        scylla             [.] seastar::reactor::do_run
+    1.29%     1.29%  reactor-3        scylla             [.] seastar::memory::allocate
+    1.19%     0.05%  reactor-3        libc.so.6          [.] syscall
+    1.16%     1.04%  reactor-3        scylla             [.] boost::icl::interval_base_map<boost::icl::interval_map<compatible_ring_position_or_view, std::unordered_set<seastar::lw_shared_ptr<sstables::sstable>, std::hash<seastar::lw_shared_ptr<sstables::sstable> >, std::equal_to<seastar::lw_shared_ptr<sstables::sst
+    1.07%     0.79%  reactor-3        scylla             [.] sstables::partitioned_sstable_set::insert

```
That shows some significant amount of work for inserting sstables
into the interval map and maintaining the sstable run (which sorts
fragments by first key and checks for overlapping).

The interval map is known for having issues with L0 sstables, as
it will have to be replicated almost to every single interval
stored by the map, causing terrible space and time complexity.
With enough L0 sstables, it can fall into quadratic behavior.

This overhead is fixed by not building a new fresh sstable set
when recreating the reader, but rather supplying a predicate
to sstable set that will filter out staging sstables when
creating either a single-key or range scan reader.

This could have another benefit over today's approach which
may incorrectly consider a staging sstable as non-staging, if
the staging sst wasn't included in the current batch for view
building.

With this improvement, view building was measured to be 3x faster.

from
`INFO  2023-06-16 12:36:40,014 [shard 0] view_update_generator - Processed keyspace1.standard1: 5 sstables in 963957ms = 50kB/s`

to
`INFO  2023-06-16 14:47:12,129 [shard 0] view_update_generator - Processed keyspace1.standard1: 5 sstables in 319899ms = 150kB/s`

Refs https://github.com/scylladb/scylladb/issues/14089.
Fixes scylladb/scylladb#14244.

Closes #14364

* github.com:scylladb/scylladb:
  table: Optimize creation of reader excluding staging for view building
  view_update_generator: Dump throughput and duration for view update from staging
  utils: Extract pretty printers into a header
2023-06-27 07:25:30 +03:00
Raphael S. Carvalho
83c70ac04f utils: Extract pretty printers into a header
Can be easily reused elsewhere.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
2023-06-26 21:58:20 -03:00
Kefu Chai
fb05fddd7d build: build with -O0 if Clang >= 16 is used
to workaround https://github.com/llvm/llvm-project/issues/62842,
per the test this issue only surfaces when compiling the tree
with
ae7bf2b80b
which is included in Clang version 16, and the issue disappears
when the tree is compiled with -O0.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes #14391
2023-06-26 18:55:10 +03:00
Calle Wilund
a3db540142 auth: Add TLS certificate authenticator
Fixes #10099

Adds the com.scylladb.auth.CertificateAuthenticator type. If set as authenticator,
will extract roles from TLS authentication certificate (not wire cert - those are
server side) subject, based on configurable regex.

Example:

scylla.yaml:

authenticator: com.scylladb.auth.CertificateAuthenticator
auth_superuser_name: <name>
auth_certificate_role_queries:
	- source: SUBJECT
	  query: CN=([^,\s]+)

client_encryption_options:
  enabled: True
  certificate: <server cert>
  keyfile: <server key>
  truststore: <shared trust>
  require_client_auth: True

In a client, then use a certificate signed with the <shared trust>
store as auth cert, with the common name <name>. I.e. for cqlsh
set "usercert" and "userkey" to these certificate files.

No user/password needs to be sent, but role will be picked up
from auth certificate. If none is present, the transport will
reject the connection. If the certificate subject does not
contain a recongnized role name (from config or set in tables)
the authenticator mechanism will reject it.

Otherwise, connection becomes the role described.
2023-06-26 15:00:21 +00:00
Kefu Chai
f014ccf369 Revert "Revert "Merge 'treewide: add uuid_sstable_identifier_enabled support' from Kefu Chai""
This reverts commit 562087beff.

The regressions introduced by the reverted change have been fixed.
So let's revert this revert to resurrect the
uuid_sstable_identifier_enabled support.

Fixes #10459
2023-06-21 13:02:40 +03:00
Botond Dénes
562087beff Revert "Merge 'treewide: add uuid_sstable_identifier_enabled support' from Kefu Chai"
This reverts commit d1dc579062, reversing
changes made to 3a73048bc9.

Said commit caused regressions in dtests. We need to investigate and fix
those, but in the meanwhile let's revert this to reduce the disruption
to our workflows.

Refs: #14283
2023-06-19 08:49:27 +03:00
Avi Kivity
b7627085cb Revert "Revert "configure: Switch debug build from -O0 to -Og""
This reverts commit 7dadd38161.

The latest revert cited debuggability trumping performance, but the
performance loss is su huge here that debug builds are unusable and
next promotions time out.

In the interest of progress, pick the lesser of two evils.
2023-06-17 15:20:26 +03:00
Kefu Chai
15543464ce sstables, replica: support UUID in generation_type
this change generalize the value of generation_type so it also
supports UUID based identifier.

* sstables/generation_type.h:
  - add formatter and parse for UUID. please note, Cassandra uses
    a different format for formatting the SSTable identifier. and
    this formatter suits our needs as it uses underscore "_" as the
    delimiter, as the file name of components uses dash "-" as the
    delimiter. instead of reinventing the formatting or just use
    another delimiter in the stringified UUID, we choose to use the
    Cassandra's formatting.
  - add accessors for accessing the type and value of generation_type
  - add constructor for constructing generation_type with UUID and
    string.
  - use hash for placing sstables with uuid identifiers into shards
    for more uniformed distrbution of tables in shards.
* replica/table.cc:
  - only update the generator if the given generation contains an
    integer
* test/boost:
  - add a simple test to verify the generation_type is able to
    parse and format

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2023-06-15 17:54:59 +08:00
Kefu Chai
c508c656c5 Revert "build: make gen_headers a dependency of gen/*.o"
This reverts commit 9526258b89.
Because the issue (#14213) supposed to be fix only exists in the
enterprise branch. And that issue has been fixed in a different
way in a different place.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes #14234
2023-06-14 14:13:46 +03:00
Avi Kivity
7dadd38161 Revert "configure: Switch debug build from -O0 to -Og"
This reverts commit 7e68ed6a5d. With -Og
simple 'gdb build/debug/scylla -ex start' shows the function parameters
as "optimized out" while with -O0 they display fine. This applies to
all variables, not just main's parameters.

Bisect revealed that this behavior started with the reverted commit; it's
not due to a later toolchain update.

Fixes #14196.

Closes #14197
2023-06-13 18:08:21 +03:00
Kefu Chai
9526258b89 build: make gen_headers a dependency of gen/*.o
when compiling the generated source files, sometimes, we can run
into the FTBFS like:

02:18:54  FAILED: build/release/gen/cql3/CqlParser.o
02:18:54  clang++ ... -o build/release/gen/cql3/CqlParser.o build/release/gen/cql3/CqlParser.cpp
...
02:18:54  In file included from build/release/gen/cql3/CqlParser.cpp:44:
02:18:54  In file included from build/release/gen/cql3/CqlParser.hpp:75:
02:18:54  In file included from ./cql3/statements/create_function_statement.hh:12:
02:18:54  In file included from ./cql3/functions/user_function.hh:16:
02:18:54  ./lang/wasm.hh:15:10: fatal error: 'rust/wasmtime_bindings.hh' file not found
02:18:54  #include "rust/wasmtime_bindings.hh"
02:18:54           ^~~~~~~~~~~~~~~~~~~~~~~~~~~

CqlParser.cc is a source file generated from cql3/Cql.g, this source in
turn includes another source file generated from
wasmtime_bindings/src/lib.rs. but we failed to setup this dependency in
the build.ninja rules -- we only teach ninja that "to compile the
grammer source files, please prepare the`serializers` source files
first". but this is not enough.

so, in this change, we just replace `serializers` with `gen_headers`,
as the latter is a superset of the former. and should fulfill the needs
of CqlParser.cc.

Fixes #14213
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes #14214
2023-06-12 15:36:03 +03:00
Avi Kivity
79bfe04d2a cql3: remove abstract_marker vestiges
Removed by e458340821 ("cql3: Remove term")

Closes #14192
2023-06-12 10:41:04 +03:00
Pavel Emelyanov
66e43912d6 code: Switch to seastar API level 7
In that level no io_priority_class-es exist. Instead, all the IO happens
in the context of current sched-group. File API no longer accepts prio
class argument (and makes io_intent arg mandatory to impls).

So the change consists of
- removing all usage of io_priority_class
- patching file_impl's inheritants to updated API
- priority manager goes away altogether
- IO bandwidth update is performed on respective sched group
- tune-up scylla-gdb.py io_queues command

The first change is huge and was made semi-autimatically by:
- grep io_priority_class | default_priority_class
- remove all calls, found methods' args and class' fields

Patching file_impl-s is smaller, but also mechanical:
- replace io_priority_class& argument with io_intent* one
- pass intent to lower file (if applicatble)

Dropping the priority manager is:
- git-rm .cc and .hh
- sed out all the #include-s
- fix configure.py and cmakefile

The scylla-gdb.py update is a bit hairry -- it needs to use task queues
list for IO classes names and shares, but to detect it should it checks
for the "commitlog" group is present.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>

Closes #13963
2023-06-06 13:29:16 +03:00
Kefu Chai
9ba610c811 build: specify link-args using build script
as an alternative to passing the link-args using the environmental variable,
we can also use build script to pass the "-C link-args=<FLAG>" to the compiler.
see https://doc.rust-lang.org/nightly/cargo/reference/build-scripts.html#cargorustc-link-argflag
to ensure that cargo is called again by ninja, after build.rs is
updated, build.rs is added as a dependency of {wasm} files along with
Cargo.lock.

this change is verified using following command
```
RUSTFLAGS='--print link-args' cargo build \
  --target=wasm32-wasi \
  --example=return_input \
  --locked \
  --manifest-path=Cargo.toml \
  --target-dir=build/cmake/test/resource/wasm/rust
```

the output includes "-zstack-size=131072" in the argument passed to lld:
```
   Compiling examples v0.0.0 (/home/kefu/dev/scylladb/test/resource/wasm/rust)
LC_ALL="C"
PATH="/usr/lib/rustlib/x86_64-unknown-linux-gnu/bin:/usr/lib/rustlib/x86_64-unknown-linux-gnu/bin/self-contained:/home/kefu/.local/bin:/home/kefu/bin:/usr/local/bin:/usr/local/sbin:/usr/bin:/usr/sbin"
VSLANG="1033"
"lld"
"-flavor" "wasm" "--rsp-quoting=posix" "--export"
"_scylla_abi" "--export" "_scylla_free" "--export" "_scylla_malloc"
"--export" "return_input" "-z" "stack-size=1048576" "--stack-first"
"--allow-undefined" "--fatal-warnings" "--no-demangle"
...
"-L" "/usr/lib/rustlib/wasm32-wasi/lib"
"-L" "/usr/lib/rustlib/wasm32-wasi/lib/self-contained"
"-o"
"/home/kefu/dev/scylladb/build/cmake/test/resource/wasm/rust/wasm32-wasi/debug/examples/return_input-ef03083560989040.wasm"
"--gc-sections"
"--no-entry"
"-O0"
"-zstack-size=131072"
```

with this change, it'd be easier to build .wat files in CMake, so
we don't need to repeat the settings in both configure.py and
CMakeLists.txt

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes #14123
2023-06-06 10:54:39 +03:00
Kefu Chai
55ee0e2724 build: preserve $libs when linking a single testing executable
if we just want to build a single test and scylla executables, we
might want to use `configure.py` like:

./configure.py --mode debug --compiler clang++ --with scylla --with test/boost/database_test

which generates `build.ninja` for us, with following rules:

build $builddir/debug/test/boost/database_test_g: link.debug ... | $builddir/debug/seastar/libseastar.so
$builddir/debug/seastar/libseastar_testing.so
   libs = $seastar_libs_debug $libs -lthrift -lboost_system $seastar_testing_libs_debug
   libs = $seastar_libs_debug

but the last line prevents database_test_g for linking against
the third-party libraries like libabsl, which could have been
pulled in by $libs. but the second assignment expression just
makes the value of `libs` identical to that of `seastar_libs_debug`.
but that library does not include the libraries which are only
used by scylla. so we could run into link failure with the
`build.ninja` generated with this command line. like:
```
FAILED: build/debug/test/boost/database_test_g
...
ld.lld: error: undefined symbol: seastar::testing::entry_point(int, char**)
>>> referenced by scylla_test_case.hh:22 (./test/lib/scylla_test_case.hh:22)
>>>               build/debug/test/boost/database_test.o:(main)
...
ld.lld: error: undefined symbol: boost::unit_test::unit_test_log_t::set_checkpoint(boost::unit_test::basic_cstring<char const>, unsigned long, boost::unit_tes
t::basic_cstring<char const>)
>>> referenced by database_test.cc:298 (test/boost/database_test.cc:298)
>>>               build/debug/test/boost/database_test.o:(require_exist(seastar::basic_sstring<char, unsigned int, 15u, true> const&, bool))
...
```

with this change, the extra assignment expression is dropped. this
should not cause any regression. as f'$seastar_libs_{mode}' as
been included as a part of `local_libs` before the grand if-the-else
block in the for loop before this `f.write()` statement.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes #14041
2023-05-29 23:03:24 +03:00
Pavel Emelyanov
132260973a tests: Add perf test for S3 client (reading latencies)
Here's a simple test that can be used to check S3 object read latencies.
To run one must export the same variables as for any other S3 unit test:

- S3_SERVER_ADDRESS_FOR_TEST
- S3_SERVER_PORT_FOR_TEST
- S3_PUBLIC_BUCKET_FOR_TEST

and the AWS creds are a must via AWS_S3_EXTRA='$key:$secret:$region' env
variable.

Accepted options are

   --duration SEC -- test duration in seconds
   --parallel NR -- number of fibers to run in parallel
   --object-size BYTES -- object size to use (1MB by default)

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>

Closes #13895
2023-05-24 09:29:48 +03:00
Kamil Braun
5a8e2153a0 Merge 'Fix heart_beat_state::force_highest_possible_version_unsafe' from Benny Halevy
It turns out that numeric_limits defines an implicit implementation
for std::numeric_limits<utils::tagged_integer<Tag, ValueType>>
which apprently returns a default-constructed tagged_integer
for min() and max(), and this broke
`gms::heart_beat_state::force_highest_possible_version_unsafe()`
since [gms: heart_beat_state: use generation_type and version_type](4cdad8bc8b)
(merged in [Merge 'gms: define and use generation and version types'...](7f04d8231d))

Implementing min/max correctly
Fixes #13801

Closes #13880

* github.com:scylladb/scylladb:
  storage_service: handle_state_normal: on_internal_error on "owns no tokens"
  utils: tagged_integer: implement std::numeric_limits::{min,max}
  test: add tagged_integer_test
2023-05-16 13:59:41 +02:00
Benny Halevy
1b5d5205c8 test: add tagged_integer_test
Add basic test for tagged+integer arithmetic operations.

Remove const qualifier from `tagged_integer::operator[+-]=`
as these are add/sub-assign operators that need to modify
the value in place.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2023-05-14 23:26:58 +03:00
Wojciech Mitros
1e18731a69 cql-pytest: translate Cassandra's UFTypesTest
This is a translation of Cassandra's CQL unit test source file
validation/entities/UFTypesTest.java into our cql-pytest framework.

There are 7 tests, which reproduce one known bug:
Refs #13746: UDF can only be used in SELECT, and abort when used in WHERE, or in INSERT/UPDATE/DELETE commands

And uncovered two previously unknown bugs:

Refs #13855: UDF with a non-frozen collection parameter cannot be called on a frozen value
Refs #13860: A non-frozen collection returned by a UDF cannot be used as a frozen one

Additionally, we encountered an issue that can be treated as either a bug or a hole in documentation:

Refs #13866: Argument and return types in UDFs can be frozen

Closes #13867
2023-05-14 15:22:03 +03:00
Avi Kivity
e252dbcfb8 Merge ' readers,mutation: move mutation_fragment_stream_validator to mutation/' from Botond Dénes
The validator classes have their definition in a header located in mutation/, while their implementation is located in a .cc in readers/mutation_reader.cc.
This PR fixes this inconsistency by moving the implementation into mutation/mutation_fragment_stream_validator.cc. The only change is that the validator code gets a new logger instance (but the logger variable itself is left unchanged for now).

Closes #13831

* github.com:scylladb/scylladb:
  mutation/mutation_fragment_stream_validator.cc: rename logger
  readers,mutation: move mutation_fragment_stream_validator to mutation/
2023-05-10 12:54:53 +03:00
Kamil Braun
7d9ab44e81 Merge 'token_metadata: read remapping for write_both_read_new' from Gusev Petr
When new nodes are added or existing nodes are deleted, the topology
state machine needs to shunt reads from the old nodes to the new ones.
This happens in the `write_both_read_new` state. The problem is that
previously this state was not handled in any way in `token_metadata` and
the read nodes were only changed when the topology state machine reached
the final 'owned' state.

To handle `write_both_read_new` an additional `interval_map` inside
`token_metadata` is maintained similar to `pending_endpoints`.  It maps
the ranges affected by the ongoing topology change operation to replicas
which should be used for reading. When topology state sm reaches the
point when it needs to switch reads to a new topology, it passes
`request_read_new=true` in a call to `update_pending_ranges`. This
forces `update_pending_ranges` to compute the ranges based on new
topology and store them to the `interval_map`. On the data plane, when a
read on coordinator needs to decide which endpoints to use, it first
consults this `interval_map` in `token_metadata`, and only if it doesn't
contain a range for current token it uses normal endpoints from
`effective_replication_map`.

Closes #13376

* github.com:scylladb/scylladb:
  storage_proxy, storage_service: use new read endpoints
  storage_proxy: rename get_live_sorted_endpoints->get_endpoints_for_reading
  token_metadata: add unit test for endpoints_for_reading
  token_metadata: add endpoints for reading
  sequenced_set: add extract_set method
  token_metadata_impl: extract maybe_migration_endpoints helper function
  token_metadata_impl: introduce migration_info
  token_metadata_impl: refactor update_pending_ranges
  token_metadata: add unit tests
  token_metadata: fix indentation
  token_metadata_impl: return unique_ptr from clone functions
2023-05-10 10:03:30 +02:00
Botond Dénes
8681f3e997 readers,mutation: move mutation_fragment_stream_validator to mutation/
The validator classes have their definition in a header located in mutation/,
while their implementation is located in a .cc in readers/mutation_reader.cc.
This patch fixes this inconsistency by moving the implementation into
mutation/mutation_fragment_stream_validator.cc. The only change is that
the validator code gets a new logger instance (but the logger variable itself
is left unchanged for now).
2023-05-09 07:55:13 -04:00
Botond Dénes
287ccce1cc Merge 'sstables: extract storage out ' from Kefu Chai
this change extracts the storage class and its derived classes
out into their own source files. for couple reasons:

- for better readability. the sstables.hh is over 1005 lines.
  and sstables.cc 3602 lines. it's a little bit difficult to figure
  out how the different parts in these sources interact with each
  other. for instance, with this change, it's clear some of helper
  functions are only used by file_system_storage.
- probably less inter-source dependency. by extracting the sources
  files out, they can be compiled individually, so changing one .cc
  file does not impact others. this could speed up the compilation
  time.

Closes #13785

* github.com:scylladb/scylladb:
  sstables: storage: coroutinize idempotent_link_file()
  sstables: extract storage out
2023-05-09 14:03:40 +03:00
Petr Gusev
3120cabf56 token_metadata: add unit tests
We are going to refactor update_pending_ranges,
so in this commit we add some simple unit tests
to ensure we don't break it.
2023-05-09 13:56:06 +04:00
Kefu Chai
2eefcb37eb sstables: extract storage out
this change extracts the storage class and its derived classes
out into storage.cc and storage.hh. for couple reasons:

- for better readability. the sstables.hh is over 1005 lines.
  and sstables.cc 3602 lines. it's a little bit difficult to figure
  out how the different parts in these sources interact with each
  other. for instance, with this change, it's clear some of helper
  functions are only used by file_system_storage.
- probably less inter-source dependency. by extracting the sources
  files out, they can be compiled individually, so changing one .cc
  file does not impact others. this could speed up the compilation
  time.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2023-05-09 16:47:00 +08:00
Wojciech Mitros
6d89d718d9 wasm: replace wasm programs with their source programs
After recent changes, we are able to store only the
C/Rust source codes for Wasm programs, and only build
them when neccessary. This patch utilizes this
opportunity by removing most of the currently stored
raw Wasm programs, replacing them with C/Rust sources
and adding them to the new build system.
2023-05-08 10:47:34 +02:00
Wojciech Mitros
c065ae0ded build: prepare rules for compiling wasm files
Currently, when we deal with a Wasm program, we store
it in its final WebAssembly Text form. This causes a lot
of code bloat and is hard to read. Instead, we would like
to store only the (C/Rust) source codes, and build Wasm
when neccessary. This patch adds build commands that
compile C/Rust sources to Wasm.
After these changes, adding a new program that should be
compiled to Rust, requires only adding the source code
of it and updating the wasms and wasm_deps lists in
configure.py.
All Wasm programs are build by default when building all
artifacts, all artifacts in a given mode, or when building
tests. Additionally, a ninja wasm target is added, so that
it's possible to build just the wasm files.
The generated files are saved in $builddir/wasm.
2023-05-08 10:47:34 +02:00
Wojciech Mitros
c53d68ee3e build: set the type of build_artifacts
Currently, build_artifacts are of type set[str] | list, which prevents
us from performing set operations on it. In a future patch, we will
want to take a set difference and set intersections with it, so we
initialize the type of build_artifacts to a set in all cases.
2023-05-08 10:47:34 +02:00
Pavel Emelyanov
fe70333c19 test: Auto-skip object-storage test cases if run from shell
In case an sstable unit test case is run individually, it would fail
with exception saying that S3_... environment is not set. It's better to
skip the test-case rather than fail. If someone wants to run it from
shell, it will have to prepare S3 server (minio/AWS public bucket) and
provide proper environment for the test-case.

refs: #13569

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>

Closes #13755
2023-05-04 14:15:18 +03:00
Kefu Chai
c76486c508 build: only apply -Wno-parentheses-equality to ANTLR generated sources
it turns out the only places where we have compiler warnings of
-W-parentheses-equality is the source code generated by ANTLR. strictly
speaking, this is valid C++ code, just not quite readable from the
hygienic point of view. so let's enable this warning in the source tree,
but only disable it when compiling the sources generated by ANTLR.

please note, this warning option is supported by both GCC and Clang,
so no need to test if it is supported.

for a sample of the warnings, see:
```
/home/kefu/dev/scylladb/build/cmake/cql3/CqlLexer.cpp:21752:38: error: equality comparison with extraneous parentheses [-Werror,-Wparentheses-equality]
                            if ( (LA4_0 == '$'))
                                  ~~~~~~^~~~~~
/home/kefu/dev/scylladb/build/cmake/cql3/CqlLexer.cpp:21752:38: note: remove extraneous parentheses around the comparison to silence this warning
                            if ( (LA4_0 == '$'))
                                 ~      ^     ~
```

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2023-05-04 11:16:27 +08:00
Botond Dénes
7baa2d9cb2 Merge 'Cleanup range printing' from Benny Halevy
This mini-series cleans up printing of ranges in utils/to_string.hh

It generalizes the helper function to work on a std::ranges::range,
with some exceptions, and adds a helper for boost::transformed_range.

It also changes the internal interface by moving `join` the the utils namespace
and use std::string rather than seastar::sstring.

Additional unit tests were added to test/boost/json_test

Fixes #13146

Closes #13159

* github.com:scylladb/scylladb:
  utils: to_string: get rid of utils::join
  utils: to_string: get rid of to_string(std::initializer_list)
  utils: to_string: get rid of to_string(const Range&)
  utils: to_string: generalize range helpers
  test: add string_format_test
  utils: chunked_vector: add std::ranges::range ctor
2023-05-02 14:55:18 +03:00
Nadav Har'El
e74f69bb56 alternator: unit test for number magnitude and precision function
In the previous patch we added a limit in Alternator for the magnitude
and precision of numbers, based on a function get_magnitude_and_precision
whose implementation was, unfortunately, rather elaborate and delicate.

Although we did add in the previous patches some end-to-end tests which
confirmed that the final decision made based on this function, to accept or
reject numbers, was a correct decision in a few cases, such an elaborate
function deserves a separate unit test for checking just that function
in isolation. In fact, this unit tests uncovered some bugs in the first
implementation of get_magnitude_and_precision() which the other tests
missed.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2023-05-02 11:04:05 +03:00
Benny Halevy
59e89efca6 test: add string_format_test
Test string formatting before cleaning up
utils/to_string.hh in the next patches.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2023-05-02 10:48:46 +03:00
Kefu Chai
662f8fa66e build: reenable -Wmissing-braces
since we've addressed all the -Wmissing-braces warnings, we can
now enable this warning.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2023-04-28 16:59:29 +08:00
Kamil Braun
30cc07b40d Merge 'Introduce tablets' from Tomasz Grabiec
This PR introduces an experimental feature called "tablets". Tablets are
a way to distribute data in the cluster, which is an alternative to the
current vnode-based replication. Vnode-based replication strategy tries
to evenly distribute the global token space shared by all tables among
nodes and shards. With tablets, the aim is to start from a different
side. Divide resources of replica-shard into tablets, with a goal of
having a fixed target tablet size, and then assign those tablets to
serve fragments of tables (also called tablets). This will allow us to
balance the load in a more flexible manner, by moving individual tablets
around. Also, unlike with vnode ranges, tablet replicas live on a
particular shard on a given node, which will allow us to bind raft
groups to tablets. Those goals are not yet achieved with this PR, but it
lays the ground for this.

Things achieved in this PR:

  - You can start a cluster and create a keyspace whose tables will use
    tablet-based replication. This is done by setting `initial_tablets`
    option:

    ```
        CREATE KEYSPACE test WITH replication = {'class': 'NetworkTopologyStrategy',
                        'replication_factor': 3,
                        'initial_tablets': 8};
    ```

    All tables created in such a keyspace will be tablet-based.

    Tablet-based replication is a trait, not a separate replication
    strategy. Tablets don't change the spirit of replication strategy, it
    just alters the way in which data ownership is managed. In theory, we
    could use it for other strategies as well like
    EverywhereReplicationStrategy. Currently, only NetworkTopologyStrategy
    is augmented to support tablets.

  - You can create and drop tablet-based tables (no DDL language changes)

  - DML / DQL work with tablet-based tables

    Replicas for tablet-based tables are chosen from tablet metadata
    instead of token metadata

Things which are not yet implemented:

  - handling of views, indexes, CDC created on tablet-based tables
  - sharding is done using the old method, it ignores the shard allocated in tablet metadata
  - node operations (topology changes, repair, rebuild) are not handling tablet-based tables
  - not integrated with compaction groups
  - tablet allocator piggy-backs on tokens to choose replicas.
    Eventually we want to allocate based on current load, not statically

Closes #13387

* github.com:scylladb/scylladb:
  test: topology: Introduce test_tablets.py
  raft: Introduce 'raft_server_force_snapshot' error injection
  locator: network_topology_strategy: Support tablet replication
  service: Introduce tablet_allocator
  locator: Introduce tablet_aware_replication_strategy
  locator: Extract maybe_remove_node_being_replaced()
  dht: token_metadata: Introduce get_my_id()
  migration_manager: Send tablet metadata as part of schema pull
  storage_service: Load tablet metadata when reloading topology state
  storage_service: Load tablet metadata on boot and from group0 changes
  db, migration_manager: Notify about tablet metadata changes via migration_listener::on_update_tablet_metadata()
  migration_notifier: Introduce before_drop_keyspace()
  migration_manager: Make prepare_keyspace_drop_announcement() return a future<>
  test: perf: Introduce perf-tablets
  test: Introduce tablets_test
  test: lib: Do not override table id in create_table()
  utils, tablets: Introduce external_memory_usage()
  db: tablets: Add printers
  db: tablets: Add persistence layer
  dht: Use last_token_of_compaction_group() in split_token_range_msb()
  locator: Introduce tablet_metadata
  dht: Introduce first_token()
  dht: Introduce next_token()
  storage_proxy: Improve trace-level logging
  locator: token_metadata: Fix confusing comment on ring_range()
  dht, storage_proxy: Abstract token space splitting
  Revert "query_ranges_to_vnodes_generator: fix for exclusive boundaries"
  db: Exclude keyspace with per-table replication in get_non_local_strategy_keyspaces_erms()
  db: Introduce get_non_local_vnode_based_strategy_keyspaces()
  service: storage_proxy: Avoid copying keyspace name in write handler
  locator: Introduce per-table replication strategy
  treewide: Use replication_strategy_ptr as a shorter name for abstract_replication_strategy::ptr_type
  locator: Introduce effective_replication_map
  locator: Rename effective_replication_map to vnode_effective_replication_map
  locator: effective_replication_map: Abstract get_pending_endpoints()
  db: Propagate feature_service to abstract_replication_strategy::validate_options()
  db: config: Introduce experimental "TABLETS" feature
  db: Log replication strategy for debugging purposes
  db: Log full exception on error in do_parse_schema_tables()
  db: keyspace: Remove non-const replication strategy getter
  config: Reformat
2023-04-27 09:40:18 +02:00
Tomasz Grabiec
5e89f2f5ba service: Introduce tablet_allocator
Currently, responsible for injecting mutations of system.tablets to
schema changes.

Note that not all migrations are handled currently. Dependant view or
cdc table drops are not handled.
2023-04-24 10:49:37 +02:00
Tomasz Grabiec
4b4238b069 test: perf: Introduce perf-tablets
Example output:

  $ build/release/scylla perf-tablets --tables 10 --tablets-per-table $((8*1024)) --rf 3

  testlog - Total tablet count: 81920
  testlog - Size of tablet_metadata in memory: 7683 KiB
  testlog - Copied in 2.163421 [ms]
  testlog - Cleared in 0.767507 [ms]
  testlog - Saved in 774.813232 [ms]
  testlog - Read in 246.666885 [ms]
  testlog - Read mutations in 211.677292 [ms]
  testlog - Size of canonical mutations: 20.633621 [MiB]
  testlog - Disk space used by system.tablets: 0.902344 [MiB]
2023-04-24 10:49:37 +02:00
Tomasz Grabiec
70a35f70a6 test: Introduce tablets_test 2023-04-24 10:49:37 +02:00
Tomasz Grabiec
9d786c1ebc db: tablets: Add persistence layer 2023-04-24 10:49:37 +02:00
Tomasz Grabiec
fceb5f8cf6 locator: Introduce tablet_metadata
token_metadata now stores tablet metadata with information about
tablets in the system.
2023-04-24 10:49:37 +02:00
Benny Halevy
d1817e9e1b utils: move generation-number to gms
Although get_generation_number implementation is
completely generic, it is used exclusively to seed
the gossip generation number.

Following patches will define a strong gms::generation_id
type and this function should return it.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2023-04-23 08:37:32 +03:00
Benny Halevy
f5f566bdd8 utils: add tagged_integer
A generic template for defining strongly typed
integer types.

Use it here to replace raft::internal::tagged_uint64.
Will be used for defining gms generation and version
as strong and distinguishable types in following patches.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2023-04-23 08:37:32 +03:00
Kamil Braun
55f43e532c Merge 'get rid of gms/failure_detector' from Benny Halevy
Move gms::arrival_window to api/failure_detector which is its only user.
and get rid of the rest, which is not used, now that we use direct_failure_detector instead.

TODO: integare direct_failure_detector with failure_detector api.

Closes #13576

* github.com:scylladb/scylladb:
  gms: get rid of unused failure_detector
  api: failure_detector: remove false dependency on failure_detector::arrival_window
  test: rest_api: add test_failure_detector
2023-04-21 11:47:44 +02:00
Kefu Chai
9215adee46 streaming: specialize fmt::formatter<stream_reason>
this is a part of a series to migrating from `operator<<(ostream&, ..)`
based formatting to fmtlib based formatting. the goal here is to enable
fmtlib to print `stream_reason` without the help of `operator<<`.

please note, because we still cannot use the generic formatter for
std::unordered_map provided by fmtlib, so in order to drop `operator<<`
for `stream_reason`, and to print `unordered_map<stream_reason>`,
`fmt::join()` is used as a temporary solution. we will audit all
`fmt::join()` calls, after removing the homebrew formatter of
`std::unordered_map`.

the corresponding `operator<<()` are dropped dropped in this change,
as all its callers are now using fmtlib for formatting now.

Refs #13245

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes #13609
2023-04-21 09:44:23 +03:00
Benny Halevy
3f1ac846d8 gms: get rid of unused failure_detector
The legacy failure_detector is now unused and can be removed.

TODO: integare direct_failure_detector with failure_detector api.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2023-04-21 09:08:27 +03:00