scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-05-02 14:15:46 +00:00

Author	SHA1	Message	Date
Kefu Chai	fa8eaab62b	build: remove duplicated test this change has no impact on `build.ninja` generated by `configure.py`. as we are using a `set` for tracking the tests to be built. but it's still an improvement, as we should not add duplicated entries in a set when initializing it. there are two occurrences of `test/boost/double_decker_test`, the one which is in the club of the local cluster of collections tests - bptree, btree, radix_tree and double_decker - is preserved. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #14478	2023-07-05 15:43:04 +03:00
Avi Kivity	66c47d40e6	cql3: selection: drop selector_factories, selectables, and selectors The whole class hierarchy is no longer used by anything and we can just delete it.	2023-07-03 19:45:17 +03:00
Kamil Braun	b912eeade5	Merge 'merge raft commands to group0 before applying them whenever possible' from Gleb Since most group0 commands are just mutations it is easy to combine them before passing them to the subsystem they are destined for, since it is more efficient. The logic that handles those mutations in a subsystem will run once for each batch of commands instead of for each individual command. This is especially useful when a node catches up to a leader and gets a lot of commands together. The patch here does exactly that. It combines commands into a single command if possible, but it preserves the order between commands, so each time it encounters a command to a different subsystem it flushes the already combined batch and starts a new one. This extra safety assumes that there are dependencies between subsystems managed by group0, so the order matters. This may not be the case now, but we prefer to be on the safe side. Broadcast table commands are not mutations, so they are never combined. * 'raft-merge-cmds' of https://github.com/gleb-cloudius/scylla: test: add test for group0 raft command merging service: raft: respect max mutation size limit when persisting raft entries group0_state_machine: merge commands before applying them whenever possible	2023-06-28 17:21:07 +02:00
Gleb Natapov	945f476363	test: add test for group0 raft command merging Add a test that submits 3 large commands, each one a little bit larger than 1/3 of the maximum mutation size. Check that in the end 2 commands were executed (the first 2 were merged and the third was executed separately).	2023-06-27 14:59:55 +03:00
Avi Kivity	f86dd857ca	Merge 'Certificate based authorization' from Calle Wilund Fixes #10099 Adds the com.scylladb.auth.CertificateAuthenticator type. If set as authenticator, will extract roles from TLS authentication certificate (not wire cert - those are server side) subject, based on configurable regex. Example: scylla.yaml: ``` authenticator: com.scylladb.auth.CertificateAuthenticator auth_superuser_name: <name> auth_certificate_role_query: CN=([^,\s]+) client_encryption_options: enabled: True certificate: <server cert> keyfile: <server key> truststore: <shared trust> require_client_auth: True ``` In a client, then use a certificate signed with the <shared trust> store as auth cert, with the common name <name>. I.e. for cqlsh set "usercert" and "userkey" to these certificate files. No user/password needs to be sent, but role will be picked up from auth certificate. If none is present, the transport will reject the connection. If the certificate subject does not contain a recognized role name (from config or set in tables) the authenticator mechanism will reject it. Otherwise, connection becomes the role described. To facilitate this, this also contains the addition of allowing setting super user name + salted passwd via command line/conf + some tweaks to SASL part of connection setup. Closes #12214 * github.com:scylladb/scylladb: docs: Add documentation of certificate auth + auth_superuser_name auth: Add TLS certificate authenticator transport: Try to do early, transport based auth if possible auth: Allow for early (certificate/transport) authentication auth: Allow specifying initial superuser name + passwd (salted) in config roles-metadata: Coroutinuze some helpers	2023-06-27 12:52:14 +03:00
Botond Dénes	f5e3b8df6d	Merge 'Optimize creation of reader excluding staging for view building' from Raphael "Raph" Carvalho View building from staging creates a reader from scratch (memtable + sstables - staging) for every partition, in order to calculate the diff between new staging data and data in base sstable set, and then pushes the result into the view replicas. perf shows that the reader creation is very expensive: ``` + 12.15% 10.75% reactor-3 scylla [.] lexicographical_tri_compare<compound_type<(allow_prefixes)0>::iterator, compound_type<(allow_prefixes)0>::iterator, legacy_compound_view<compound_type<(allow_prefixes)0> >::tri_comparator::operator()(managed_bytes_basic_view<(mutable_view)0>, managed_bytes + 10.01% 9.99% reactor-3 scylla [.] boost::icl::is_empty<boost::icl::continuous_interval<compatible_ring_position_or_view, std::less> > + 8.95% 8.94% reactor-3 scylla [.] legacy_compound_view<compound_type<(allow_prefixes)0> >::tri_comparator::operator() + 7.29% 7.28% reactor-3 scylla [.] dht::ring_position_tri_compare + 6.28% 6.27% reactor-3 scylla [.] dht::tri_compare + 4.11% 3.52% reactor-3 scylla [.] boost::icl::interval_base_map<boost::icl::interval_map<compatible_ring_position_or_view, std::unordered_set<seastar::lw_shared_ptr<sstables::sstable>, std::hash<seastar::lw_shared_ptr<sstables::sstable> >, std::equal_to<seastar::lw_shared_ptr<sstables::sst+ 4.09% 4.07% reactor-3 scylla [.] sstables::index_consume_entry_context<sstables::index_consumer>::process_state + 3.46% 0.93% reactor-3 scylla [.] sstables::sstable_run::will_introduce_overlapping + 2.53% 2.53% reactor-3 libstdc++.so.6 [.] std::_Rb_tree_increment + 2.45% 2.45% reactor-3 scylla [.] boost::icl::non_empty::exclusive_less<boost::icl::continuous_interval<compatible_ring_position_or_view, std::less> > + 2.14% 2.13% reactor-3 scylla [.] boost::icl::exclusive_less<boost::icl::continuous_interval<compatible_ring_position_or_view, std::less> > + 2.07% 2.07% reactor-3 scylla [.] 
logalloc::region_impl::free + 2.06% 1.91% reactor-3 scylla [.] sstables::index_consumer::consume_entry(sstables::parsed_partition_index_entry&&)::{lambda()#1}::operator()() const::{lambda()#1}::operator() + 2.04% 2.04% reactor-3 scylla [.] boost::icl::interval_base_map<boost::icl::interval_map<compatible_ring_position_or_view, std::unordered_set<seastar::lw_shared_ptr<sstables::sstable>, std::hash<seastar::lw_shared_ptr<sstables::sstable> >, std::equal_to<seastar::lw_shared_ptr<sstables::sst+ 1.87% 0.00% reactor-3 [kernel.kallsyms] [k] entry_SYSCALL_64_after_hwframe + 1.86% 0.00% reactor-3 [kernel.kallsyms] [k] do_syscall_64 + 1.39% 1.38% reactor-3 libc.so.6 [.] __memcmp_avx2_movbe + 1.37% 0.92% reactor-3 scylla [.] boost::icl::segmental::join_left<boost::icl::interval_map<compatible_ring_position_or_view, std::unordered_set<seastar::lw_shared_ptr<sstables::sstable>, std::hash<seastar::lw_shared_ptr<sstables::sstable> >, std::equal_to<seastar::lw_shared_ptr<sstables:: + 1.34% 1.33% reactor-3 scylla [.] logalloc::region_impl::alloc_small + 1.33% 1.33% reactor-3 scylla [.] seastar::memory::small_pool::add_more_objects + 1.30% 0.35% reactor-3 scylla [.] seastar::reactor::do_run + 1.29% 1.29% reactor-3 scylla [.] seastar::memory::allocate + 1.19% 0.05% reactor-3 libc.so.6 [.] syscall + 1.16% 1.04% reactor-3 scylla [.] boost::icl::interval_base_map<boost::icl::interval_map<compatible_ring_position_or_view, std::unordered_set<seastar::lw_shared_ptr<sstables::sstable>, std::hash<seastar::lw_shared_ptr<sstables::sstable> >, std::equal_to<seastar::lw_shared_ptr<sstables::sst + 1.07% 0.79% reactor-3 scylla [.] sstables::partitioned_sstable_set::insert ``` That shows some significant amount of work for inserting sstables into the interval map and maintaining the sstable run (which sorts fragments by first key and checks for overlapping). 
The interval map is known for having issues with L0 sstables, as it will have to be replicated almost to every single interval stored by the map, causing terrible space and time complexity. With enough L0 sstables, it can fall into quadratic behavior. This overhead is fixed by not building a new fresh sstable set when recreating the reader, but rather supplying a predicate to sstable set that will filter out staging sstables when creating either a single-key or range scan reader. This could have another benefit over today's approach which may incorrectly consider a staging sstable as non-staging, if the staging sst wasn't included in the current batch for view building. With this improvement, view building was measured to be 3x faster. from `INFO 2023-06-16 12:36:40,014 [shard 0] view_update_generator - Processed keyspace1.standard1: 5 sstables in 963957ms = 50kB/s` to `INFO 2023-06-16 14:47:12,129 [shard 0] view_update_generator - Processed keyspace1.standard1: 5 sstables in 319899ms = 150kB/s` Refs https://github.com/scylladb/scylladb/issues/14089. Fixes scylladb/scylladb#14244. Closes #14364 * github.com:scylladb/scylladb: table: Optimize creation of reader excluding staging for view building view_update_generator: Dump throughput and duration for view update from staging utils: Extract pretty printers into a header	2023-06-27 07:25:30 +03:00
Raphael S. Carvalho	83c70ac04f	utils: Extract pretty printers into a header Can be easily reused elsewhere. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2023-06-26 21:58:20 -03:00
Kefu Chai	fb05fddd7d	build: build with -O0 if Clang >= 16 is used to workaround https://github.com/llvm/llvm-project/issues/62842, per the test this issue only surfaces when compiling the tree with `ae7bf2b80b` which is included in Clang version 16, and the issue disappears when the tree is compiled with -O0. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #14391	2023-06-26 18:55:10 +03:00
Calle Wilund	a3db540142	auth: Add TLS certificate authenticator Fixes #10099 Adds the com.scylladb.auth.CertificateAuthenticator type. If set as authenticator, will extract roles from TLS authentication certificate (not wire cert - those are server side) subject, based on configurable regex. Example: scylla.yaml: authenticator: com.scylladb.auth.CertificateAuthenticator auth_superuser_name: <name> auth_certificate_role_queries: - source: SUBJECT query: CN=([^,\s]+) client_encryption_options: enabled: True certificate: <server cert> keyfile: <server key> truststore: <shared trust> require_client_auth: True In a client, then use a certificate signed with the <shared trust> store as auth cert, with the common name <name>. I.e. for cqlsh set "usercert" and "userkey" to these certificate files. No user/password needs to be sent, but role will be picked up from auth certificate. If none is present, the transport will reject the connection. If the certificate subject does not contain a recognized role name (from config or set in tables) the authenticator mechanism will reject it. Otherwise, connection becomes the role described.	2023-06-26 15:00:21 +00:00
Kefu Chai	f014ccf369	Revert "Revert "Merge 'treewide: add uuid_sstable_identifier_enabled support' from Kefu Chai"" This reverts commit `562087beff`. The regressions introduced by the reverted change have been fixed. So let's revert this revert to resurrect the uuid_sstable_identifier_enabled support. Fixes #10459	2023-06-21 13:02:40 +03:00
Botond Dénes	562087beff	Revert "Merge 'treewide: add uuid_sstable_identifier_enabled support' from Kefu Chai" This reverts commit `d1dc579062`, reversing changes made to `3a73048bc9`. Said commit caused regressions in dtests. We need to investigate and fix those, but in the meanwhile let's revert this to reduce the disruption to our workflows. Refs: #14283	2023-06-19 08:49:27 +03:00
Avi Kivity	b7627085cb	Revert "Revert "configure: Switch debug build from -O0 to -Og"" This reverts commit `7dadd38161`. The latest revert cited debuggability trumping performance, but the performance loss is so huge here that debug builds are unusable and next promotions time out. In the interest of progress, pick the lesser of two evils.	2023-06-17 15:20:26 +03:00
Kefu Chai	15543464ce	sstables, replica: support UUID in generation_type this change generalizes the value of generation_type so it also supports UUID based identifier. * sstables/generation_type.h: - add formatter and parser for UUID. please note, Cassandra uses a different format for formatting the SSTable identifier. and this formatter suits our needs as it uses underscore "_" as the delimiter, as the file name of components uses dash "-" as the delimiter. instead of reinventing the formatting or just use another delimiter in the stringified UUID, we choose to use the Cassandra's formatting. - add accessors for accessing the type and value of generation_type - add constructor for constructing generation_type with UUID and string. - use hash for placing sstables with uuid identifiers into shards for more uniform distribution of tables in shards. * replica/table.cc: - only update the generator if the given generation contains an integer * test/boost: - add a simple test to verify the generation_type is able to parse and format Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-06-15 17:54:59 +08:00
Kefu Chai	c508c656c5	Revert "build: make gen_headers a dependency of gen/*.o" This reverts commit `9526258b89`. Because the issue (#14213) it was supposed to fix only exists in the enterprise branch. And that issue has been fixed in a different way in a different place. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #14234	2023-06-14 14:13:46 +03:00
Avi Kivity	7dadd38161	Revert "configure: Switch debug build from -O0 to -Og" This reverts commit `7e68ed6a5d`. With -Og simple 'gdb build/debug/scylla -ex start' shows the function parameters as "optimized out" while with -O0 they display fine. This applies to all variables, not just main's parameters. Bisect revealed that this behavior started with the reverted commit; it's not due to a later toolchain update. Fixes #14196. Closes #14197	2023-06-13 18:08:21 +03:00
Kefu Chai	9526258b89	build: make gen_headers a dependency of gen/*.o when compiling the generated source files, sometimes, we can run into the FTBFS like: 02:18:54 FAILED: build/release/gen/cql3/CqlParser.o 02:18:54 clang++ ... -o build/release/gen/cql3/CqlParser.o build/release/gen/cql3/CqlParser.cpp ... 02:18:54 In file included from build/release/gen/cql3/CqlParser.cpp:44: 02:18:54 In file included from build/release/gen/cql3/CqlParser.hpp:75: 02:18:54 In file included from ./cql3/statements/create_function_statement.hh:12: 02:18:54 In file included from ./cql3/functions/user_function.hh:16: 02:18:54 ./lang/wasm.hh:15:10: fatal error: 'rust/wasmtime_bindings.hh' file not found 02:18:54 #include "rust/wasmtime_bindings.hh" 02:18:54 ^~~~~~~~~~~~~~~~~~~~~~~~~~~ CqlParser.cc is a source file generated from cql3/Cql.g, this source in turn includes another source file generated from wasmtime_bindings/src/lib.rs. but we failed to set up this dependency in the build.ninja rules -- we only teach ninja that "to compile the grammar source files, please prepare the `serializers` source files first". but this is not enough. so, in this change, we just replace `serializers` with `gen_headers`, as the latter is a superset of the former. and should fulfill the needs of CqlParser.cc. Fixes #14213 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #14214	2023-06-12 15:36:03 +03:00
Avi Kivity	79bfe04d2a	cql3: remove abstract_marker vestiges Removed by `e458340821` ("cql3: Remove term") Closes #14192	2023-06-12 10:41:04 +03:00
Pavel Emelyanov	66e43912d6	code: Switch to seastar API level 7 In that level no io_priority_class-es exist. Instead, all the IO happens in the context of current sched-group. File API no longer accepts prio class argument (and makes io_intent arg mandatory to impls). So the change consists of - removing all usage of io_priority_class - patching file_impl's inheritors to updated API - priority manager goes away altogether - IO bandwidth update is performed on respective sched group - tune-up scylla-gdb.py io_queues command The first change is huge and was made semi-automatically by: - grep io_priority_class \| default_priority_class - remove all calls, found methods' args and class' fields Patching file_impl-s is smaller, but also mechanical: - replace io_priority_class& argument with io_intent* one - pass intent to lower file (if applicable) Dropping the priority manager is: - git-rm .cc and .hh - sed out all the #include-s - fix configure.py and cmakefile The scylla-gdb.py update is a bit hairy -- it needs to use task queues list for IO classes names and shares, but to detect whether it should, it checks whether the "commitlog" group is present. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes #13963	2023-06-06 13:29:16 +03:00
Kefu Chai	9ba610c811	build: specify link-args using build script as an alternative to passing the link-args using the environmental variable, we can also use build script to pass the "-C link-args=<FLAG>" to the compiler. see https://doc.rust-lang.org/nightly/cargo/reference/build-scripts.html#cargorustc-link-argflag to ensure that cargo is called again by ninja, after build.rs is updated, build.rs is added as a dependency of {wasm} files along with Cargo.lock. this change is verified using following command ``` RUSTFLAGS='--print link-args' cargo build \ --target=wasm32-wasi \ --example=return_input \ --locked \ --manifest-path=Cargo.toml \ --target-dir=build/cmake/test/resource/wasm/rust ``` the output includes "-zstack-size=131072" in the argument passed to lld: ``` Compiling examples v0.0.0 (/home/kefu/dev/scylladb/test/resource/wasm/rust) LC_ALL="C" PATH="/usr/lib/rustlib/x86_64-unknown-linux-gnu/bin:/usr/lib/rustlib/x86_64-unknown-linux-gnu/bin/self-contained:/home/kefu/.local/bin:/home/kefu/bin:/usr/local/bin:/usr/local/sbin:/usr/bin:/usr/sbin" VSLANG="1033" "lld" "-flavor" "wasm" "--rsp-quoting=posix" "--export" "_scylla_abi" "--export" "_scylla_free" "--export" "_scylla_malloc" "--export" "return_input" "-z" "stack-size=1048576" "--stack-first" "--allow-undefined" "--fatal-warnings" "--no-demangle" ... "-L" "/usr/lib/rustlib/wasm32-wasi/lib" "-L" "/usr/lib/rustlib/wasm32-wasi/lib/self-contained" "-o" "/home/kefu/dev/scylladb/build/cmake/test/resource/wasm/rust/wasm32-wasi/debug/examples/return_input-ef03083560989040.wasm" "--gc-sections" "--no-entry" "-O0" "-zstack-size=131072" ``` with this change, it'd be easier to build .wat files in CMake, so we don't need to repeat the settings in both configure.py and CMakeLists.txt Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #14123	2023-06-06 10:54:39 +03:00
Kefu Chai	55ee0e2724	build: preserve $libs when linking a single testing executable if we just want to build a single test and scylla executables, we might want to use `configure.py` like: ./configure.py --mode debug --compiler clang++ --with scylla --with test/boost/database_test which generates `build.ninja` for us, with following rules: build $builddir/debug/test/boost/database_test_g: link.debug ... | $builddir/debug/seastar/libseastar.so $builddir/debug/seastar/libseastar_testing.so libs = $seastar_libs_debug $libs -lthrift -lboost_system $seastar_testing_libs_debug libs = $seastar_libs_debug but the last line prevents database_test_g from linking against the third-party libraries like libabsl, which could have been pulled in by $libs. but the second assignment expression just makes the value of `libs` identical to that of `seastar_libs_debug`. but that library does not include the libraries which are only used by scylla. so we could run into link failure with the `build.ninja` generated with this command line. like: ``` FAILED: build/debug/test/boost/database_test_g ... ld.lld: error: undefined symbol: seastar::testing::entry_point(int, char**) >>> referenced by scylla_test_case.hh:22 (./test/lib/scylla_test_case.hh:22) >>> build/debug/test/boost/database_test.o:(main) ... ld.lld: error: undefined symbol: boost::unit_test::unit_test_log_t::set_checkpoint(boost::unit_test::basic_cstring<char const>, unsigned long, boost::unit_test::basic_cstring<char const>) >>> referenced by database_test.cc:298 (test/boost/database_test.cc:298) >>> build/debug/test/boost/database_test.o:(require_exist(seastar::basic_sstring<char, unsigned int, 15u, true> const&, bool)) ... ``` with this change, the extra assignment expression is dropped. this should not cause any regression. as f'$seastar_libs_{mode}' has been included as a part of `local_libs` before the grand if-then-else block in the for loop before this `f.write()` statement. 
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #14041	2023-05-29 23:03:24 +03:00
Pavel Emelyanov	132260973a	tests: Add perf test for S3 client (reading latencies) Here's a simple test that can be used to check S3 object read latencies. To run one must export the same variables as for any other S3 unit test: - S3_SERVER_ADDRESS_FOR_TEST - S3_SERVER_PORT_FOR_TEST - S3_PUBLIC_BUCKET_FOR_TEST and the AWS creds are a must via AWS_S3_EXTRA='$key:$secret:$region' env variable. Accepted options are --duration SEC -- test duration in seconds --parallel NR -- number of fibers to run in parallel --object-size BYTES -- object size to use (1MB by default) Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes #13895	2023-05-24 09:29:48 +03:00
Kamil Braun	5a8e2153a0	Merge 'Fix heart_beat_state::force_highest_possible_version_unsafe' from Benny Halevy It turns out that numeric_limits defines an implicit implementation for std::numeric_limits<utils::tagged_integer<Tag, ValueType>> which apparently returns a default-constructed tagged_integer for min() and max(), and this broke `gms::heart_beat_state::force_highest_possible_version_unsafe()` since [gms: heart_beat_state: use generation_type and version_type](`4cdad8bc8b`) (merged in [Merge 'gms: define and use generation and version types'...](`7f04d8231d`)) Implementing min/max correctly Fixes #13801 Closes #13880 * github.com:scylladb/scylladb: storage_service: handle_state_normal: on_internal_error on "owns no tokens" utils: tagged_integer: implement std::numeric_limits::{min,max} test: add tagged_integer_test	2023-05-16 13:59:41 +02:00
Benny Halevy	1b5d5205c8	test: add tagged_integer_test Add basic test for tagged+integer arithmetic operations. Remove const qualifier from `tagged_integer::operator[+-]=` as these are add/sub-assign operators that need to modify the value in place. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-05-14 23:26:58 +03:00
Wojciech Mitros	1e18731a69	cql-pytest: translate Cassandra's UFTypesTest This is a translation of Cassandra's CQL unit test source file validation/entities/UFTypesTest.java into our cql-pytest framework. There are 7 tests, which reproduce one known bug: Refs #13746: UDF can only be used in SELECT, and abort when used in WHERE, or in INSERT/UPDATE/DELETE commands And uncovered two previously unknown bugs: Refs #13855: UDF with a non-frozen collection parameter cannot be called on a frozen value Refs #13860: A non-frozen collection returned by a UDF cannot be used as a frozen one Additionally, we encountered an issue that can be treated as either a bug or a hole in documentation: Refs #13866: Argument and return types in UDFs can be frozen Closes #13867	2023-05-14 15:22:03 +03:00
Avi Kivity	e252dbcfb8	Merge ' readers,mutation: move mutation_fragment_stream_validator to mutation/' from Botond Dénes The validator classes have their definition in a header located in mutation/, while their implementation is located in a .cc in readers/mutation_reader.cc. This PR fixes this inconsistency by moving the implementation into mutation/mutation_fragment_stream_validator.cc. The only change is that the validator code gets a new logger instance (but the logger variable itself is left unchanged for now). Closes #13831 * github.com:scylladb/scylladb: mutation/mutation_fragment_stream_validator.cc: rename logger readers,mutation: move mutation_fragment_stream_validator to mutation/	2023-05-10 12:54:53 +03:00
Kamil Braun	7d9ab44e81	Merge 'token_metadata: read remapping for write_both_read_new' from Gusev Petr When new nodes are added or existing nodes are deleted, the topology state machine needs to shunt reads from the old nodes to the new ones. This happens in the `write_both_read_new` state. The problem is that previously this state was not handled in any way in `token_metadata` and the read nodes were only changed when the topology state machine reached the final 'owned' state. To handle `write_both_read_new` an additional `interval_map` inside `token_metadata` is maintained similar to `pending_endpoints`. It maps the ranges affected by the ongoing topology change operation to replicas which should be used for reading. When topology state sm reaches the point when it needs to switch reads to a new topology, it passes `request_read_new=true` in a call to `update_pending_ranges`. This forces `update_pending_ranges` to compute the ranges based on new topology and store them to the `interval_map`. On the data plane, when a read on coordinator needs to decide which endpoints to use, it first consults this `interval_map` in `token_metadata`, and only if it doesn't contain a range for current token it uses normal endpoints from `effective_replication_map`. Closes #13376 * github.com:scylladb/scylladb: storage_proxy, storage_service: use new read endpoints storage_proxy: rename get_live_sorted_endpoints->get_endpoints_for_reading token_metadata: add unit test for endpoints_for_reading token_metadata: add endpoints for reading sequenced_set: add extract_set method token_metadata_impl: extract maybe_migration_endpoints helper function token_metadata_impl: introduce migration_info token_metadata_impl: refactor update_pending_ranges token_metadata: add unit tests token_metadata: fix indentation token_metadata_impl: return unique_ptr from clone functions	2023-05-10 10:03:30 +02:00
Botond Dénes	8681f3e997	readers,mutation: move mutation_fragment_stream_validator to mutation/ The validator classes have their definition in a header located in mutation/, while their implementation is located in a .cc in readers/mutation_reader.cc. This patch fixes this inconsistency by moving the implementation into mutation/mutation_fragment_stream_validator.cc. The only change is that the validator code gets a new logger instance (but the logger variable itself is left unchanged for now).	2023-05-09 07:55:13 -04:00
Botond Dénes	287ccce1cc	Merge 'sstables: extract storage out ' from Kefu Chai this change extracts the storage class and its derived classes out into their own source files. for a couple of reasons: - for better readability. the sstables.hh is over 1005 lines. and sstables.cc 3602 lines. it's a little bit difficult to figure out how the different parts in these sources interact with each other. for instance, with this change, it's clear some of helper functions are only used by file_system_storage. - probably less inter-source dependency. by extracting the source files out, they can be compiled individually, so changing one .cc file does not impact others. this could speed up the compilation time. Closes #13785 * github.com:scylladb/scylladb: sstables: storage: coroutinize idempotent_link_file() sstables: extract storage out	2023-05-09 14:03:40 +03:00
Petr Gusev	3120cabf56	token_metadata: add unit tests We are going to refactor update_pending_ranges, so in this commit we add some simple unit tests to ensure we don't break it.	2023-05-09 13:56:06 +04:00
Kefu Chai	2eefcb37eb	sstables: extract storage out this change extracts the storage class and its derived classes out into storage.cc and storage.hh. for a couple of reasons: - for better readability. the sstables.hh is over 1005 lines. and sstables.cc 3602 lines. it's a little bit difficult to figure out how the different parts in these sources interact with each other. for instance, with this change, it's clear some of helper functions are only used by file_system_storage. - probably less inter-source dependency. by extracting the source files out, they can be compiled individually, so changing one .cc file does not impact others. this could speed up the compilation time. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-05-09 16:47:00 +08:00
Wojciech Mitros	6d89d718d9	wasm: replace wasm programs with their source programs After recent changes, we are able to store only the C/Rust source codes for Wasm programs, and only build them when necessary. This patch utilizes this opportunity by removing most of the currently stored raw Wasm programs, replacing them with C/Rust sources and adding them to the new build system.	2023-05-08 10:47:34 +02:00
Wojciech Mitros	c065ae0ded	build: prepare rules for compiling wasm files Currently, when we deal with a Wasm program, we store it in its final WebAssembly Text form. This causes a lot of code bloat and is hard to read. Instead, we would like to store only the (C/Rust) source codes, and build Wasm when necessary. This patch adds build commands that compile C/Rust sources to Wasm. After these changes, adding a new program that should be compiled to Wasm requires only adding its source code and updating the wasms and wasm_deps lists in configure.py. All Wasm programs are built by default when building all artifacts, all artifacts in a given mode, or when building tests. Additionally, a ninja wasm target is added, so that it's possible to build just the wasm files. The generated files are saved in $builddir/wasm.	2023-05-08 10:47:34 +02:00
Wojciech Mitros	c53d68ee3e	build: set the type of build_artifacts Currently, build_artifacts are of type set[str] | list, which prevents us from performing set operations on it. In a future patch, we will want to take a set difference and set intersections with it, so we initialize the type of build_artifacts to a set in all cases.	2023-05-08 10:47:34 +02:00
Pavel Emelyanov	fe70333c19	test: Auto-skip object-storage test cases if run from shell In case an sstable unit test case is run individually, it would fail with exception saying that S3_... environment is not set. It's better to skip the test-case rather than fail. If someone wants to run it from shell, it will have to prepare S3 server (minio/AWS public bucket) and provide proper environment for the test-case. refs: #13569 Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes #13755	2023-05-04 14:15:18 +03:00
Kefu Chai	c76486c508	build: only apply -Wno-parentheses-equality to ANTLR generated sources it turns out the only places where we have compiler warnings of -Wparentheses-equality are in the source code generated by ANTLR. strictly speaking, this is valid C++ code, just not quite readable from the hygienic point of view. so let's enable this warning in the source tree, but only disable it when compiling the sources generated by ANTLR. please note, this warning option is supported by both GCC and Clang, so no need to test if it is supported. for a sample of the warnings, see: ``` /home/kefu/dev/scylladb/build/cmake/cql3/CqlLexer.cpp:21752:38: error: equality comparison with extraneous parentheses [-Werror,-Wparentheses-equality] if ( (LA4_0 == '$')) ~~~~~~^~~~~~ /home/kefu/dev/scylladb/build/cmake/cql3/CqlLexer.cpp:21752:38: note: remove extraneous parentheses around the comparison to silence this warning if ( (LA4_0 == '$')) ~ ^ ~ ``` Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-05-04 11:16:27 +08:00
Botond Dénes	7baa2d9cb2	Merge 'Cleanup range printing' from Benny Halevy This mini-series cleans up printing of ranges in utils/to_string.hh It generalizes the helper function to work on a std::ranges::range, with some exceptions, and adds a helper for boost::transformed_range. It also changes the internal interface by moving `join` to the utils namespace and using std::string rather than seastar::sstring. Additional unit tests were added to test/boost/json_test Fixes #13146 Closes #13159 * github.com:scylladb/scylladb: utils: to_string: get rid of utils::join utils: to_string: get rid of to_string(std::initializer_list) utils: to_string: get rid of to_string(const Range&) utils: to_string: generalize range helpers test: add string_format_test utils: chunked_vector: add std::ranges::range ctor	2023-05-02 14:55:18 +03:00
Nadav Har'El	e74f69bb56	alternator: unit test for number magnitude and precision function In the previous patch we added a limit in Alternator for the magnitude and precision of numbers, based on a function get_magnitude_and_precision whose implementation was, unfortunately, rather elaborate and delicate. Although we did add in the previous patches some end-to-end tests which confirmed that the final decision made based on this function, to accept or reject numbers, was a correct decision in a few cases, such an elaborate function deserves a separate unit test for checking just that function in isolation. In fact, this unit test uncovered some bugs in the first implementation of get_magnitude_and_precision() which the other tests missed. Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2023-05-02 11:04:05 +03:00
Benny Halevy	59e89efca6	test: add string_format_test Test string formatting before cleaning up utils/to_string.hh in the next patches. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-05-02 10:48:46 +03:00
Kefu Chai	662f8fa66e	build: reenable -Wmissing-braces since we've addressed all the -Wmissing-braces warnings, we can now enable this warning. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-04-28 16:59:29 +08:00
Kamil Braun	30cc07b40d	Merge 'Introduce tablets' from Tomasz Grabiec This PR introduces an experimental feature called "tablets". Tablets are a way to distribute data in the cluster, which is an alternative to the current vnode-based replication. Vnode-based replication strategy tries to evenly distribute the global token space shared by all tables among nodes and shards. With tablets, the aim is to start from a different side. Divide resources of replica-shard into tablets, with a goal of having a fixed target tablet size, and then assign those tablets to serve fragments of tables (also called tablets). This will allow us to balance the load in a more flexible manner, by moving individual tablets around. Also, unlike with vnode ranges, tablet replicas live on a particular shard on a given node, which will allow us to bind raft groups to tablets. Those goals are not yet achieved with this PR, but it lays the ground for this. Things achieved in this PR: - You can start a cluster and create a keyspace whose tables will use tablet-based replication. This is done by setting `initial_tablets` option: ``` CREATE KEYSPACE test WITH replication = {'class': 'NetworkTopologyStrategy', 'replication_factor': 3, 'initial_tablets': 8}; ``` All tables created in such a keyspace will be tablet-based. Tablet-based replication is a trait, not a separate replication strategy. Tablets don't change the spirit of replication strategy, it just alters the way in which data ownership is managed. In theory, we could use it for other strategies as well like EverywhereReplicationStrategy. Currently, only NetworkTopologyStrategy is augmented to support tablets. 
- You can create and drop tablet-based tables (no DDL language changes) - DML / DQL work with tablet-based tables Replicas for tablet-based tables are chosen from tablet metadata instead of token metadata Things which are not yet implemented: - handling of views, indexes, CDC created on tablet-based tables - sharding is done using the old method, it ignores the shard allocated in tablet metadata - node operations (topology changes, repair, rebuild) are not handling tablet-based tables - not integrated with compaction groups - tablet allocator piggy-backs on tokens to choose replicas. Eventually we want to allocate based on current load, not statically Closes #13387 * github.com:scylladb/scylladb: test: topology: Introduce test_tablets.py raft: Introduce 'raft_server_force_snapshot' error injection locator: network_topology_strategy: Support tablet replication service: Introduce tablet_allocator locator: Introduce tablet_aware_replication_strategy locator: Extract maybe_remove_node_being_replaced() dht: token_metadata: Introduce get_my_id() migration_manager: Send tablet metadata as part of schema pull storage_service: Load tablet metadata when reloading topology state storage_service: Load tablet metadata on boot and from group0 changes db, migration_manager: Notify about tablet metadata changes via migration_listener::on_update_tablet_metadata() migration_notifier: Introduce before_drop_keyspace() migration_manager: Make prepare_keyspace_drop_announcement() return a future<> test: perf: Introduce perf-tablets test: Introduce tablets_test test: lib: Do not override table id in create_table() utils, tablets: Introduce external_memory_usage() db: tablets: Add printers db: tablets: Add persistence layer dht: Use last_token_of_compaction_group() in split_token_range_msb() locator: Introduce tablet_metadata dht: Introduce first_token() dht: Introduce next_token() storage_proxy: Improve trace-level logging locator: token_metadata: Fix confusing comment on ring_range() 
dht, storage_proxy: Abstract token space splitting Revert "query_ranges_to_vnodes_generator: fix for exclusive boundaries" db: Exclude keyspace with per-table replication in get_non_local_strategy_keyspaces_erms() db: Introduce get_non_local_vnode_based_strategy_keyspaces() service: storage_proxy: Avoid copying keyspace name in write handler locator: Introduce per-table replication strategy treewide: Use replication_strategy_ptr as a shorter name for abstract_replication_strategy::ptr_type locator: Introduce effective_replication_map locator: Rename effective_replication_map to vnode_effective_replication_map locator: effective_replication_map: Abstract get_pending_endpoints() db: Propagate feature_service to abstract_replication_strategy::validate_options() db: config: Introduce experimental "TABLETS" feature db: Log replication strategy for debugging purposes db: Log full exception on error in do_parse_schema_tables() db: keyspace: Remove non-const replication strategy getter config: Reformat	2023-04-27 09:40:18 +02:00
Tomasz Grabiec	5e89f2f5ba	service: Introduce tablet_allocator Currently, responsible for injecting mutations of system.tablets to schema changes. Note that not all migrations are handled currently. Dependent view or cdc table drops are not handled.	2023-04-24 10:49:37 +02:00
Tomasz Grabiec	4b4238b069	test: perf: Introduce perf-tablets Example output: $ build/release/scylla perf-tablets --tables 10 --tablets-per-table $((8*1024)) --rf 3 testlog - Total tablet count: 81920 testlog - Size of tablet_metadata in memory: 7683 KiB testlog - Copied in 2.163421 [ms] testlog - Cleared in 0.767507 [ms] testlog - Saved in 774.813232 [ms] testlog - Read in 246.666885 [ms] testlog - Read mutations in 211.677292 [ms] testlog - Size of canonical mutations: 20.633621 [MiB] testlog - Disk space used by system.tablets: 0.902344 [MiB]	2023-04-24 10:49:37 +02:00
Tomasz Grabiec	70a35f70a6	test: Introduce tablets_test	2023-04-24 10:49:37 +02:00
Tomasz Grabiec	9d786c1ebc	db: tablets: Add persistence layer	2023-04-24 10:49:37 +02:00
Tomasz Grabiec	fceb5f8cf6	locator: Introduce tablet_metadata token_metadata now stores tablet metadata with information about tablets in the system.	2023-04-24 10:49:37 +02:00
Benny Halevy	d1817e9e1b	utils: move generation-number to gms Although get_generation_number implementation is completely generic, it is used exclusively to seed the gossip generation number. Following patches will define a strong gms::generation_id type and this function should return it. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-04-23 08:37:32 +03:00
Benny Halevy	f5f566bdd8	utils: add tagged_integer A generic template for defining strongly typed integer types. Use it here to replace raft::internal::tagged_uint64. Will be used for defining gms generation and version as strong and distinguishable types in following patches. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-04-23 08:37:32 +03:00
Kamil Braun	55f43e532c	Merge 'get rid of gms/failure_detector' from Benny Halevy Move gms::arrival_window to api/failure_detector, which is its only user, and get rid of the rest, which is not used now that we use direct_failure_detector instead. TODO: integrate direct_failure_detector with failure_detector api. Closes #13576 * github.com:scylladb/scylladb: gms: get rid of unused failure_detector api: failure_detector: remove false dependency on failure_detector::arrival_window test: rest_api: add test_failure_detector	2023-04-21 11:47:44 +02:00
Kefu Chai	9215adee46	streaming: specialize fmt::formatter<stream_reason> this is a part of a series migrating from `operator<<(ostream&, ..)` based formatting to fmtlib based formatting. the goal here is to enable fmtlib to print `stream_reason` without the help of `operator<<`. please note, because we still cannot use the generic formatter for std::unordered_map provided by fmtlib, in order to drop `operator<<` for `stream_reason` and to print `unordered_map<stream_reason>`, `fmt::join()` is used as a temporary solution. we will audit all `fmt::join()` calls after removing the homebrew formatter of `std::unordered_map`. the corresponding `operator<<()` is dropped in this change, as all its callers are now using fmtlib for formatting. Refs #13245 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #13609	2023-04-21 09:44:23 +03:00
Benny Halevy	3f1ac846d8	gms: get rid of unused failure_detector The legacy failure_detector is now unused and can be removed. TODO: integrate direct_failure_detector with failure_detector api. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-04-21 09:08:27 +03:00

1719 Commits