If something goes wrong while a node is being added to the cluster, the node will not be registered as part of the cluster. This leads to situations where, during log gathering, logs for such a node are missing.
Incorrect passing of the artifacts_dir_url parameter from test.py to pytest leads to a situation where None is passed as the string "None" and pytest generates an incorrect URL.
In CI, tests are always executed with the option --repeat=3, which generates 3 test results with the same name. The JUnit plugin in CI cannot correctly distinguish between these results. When we have two passes and one failure, the link to a test result is sometimes redirected to the wrong one because the test name is the same. To fix this, a ReportPlugin is added that modifies the test case name during JUnit report generation, appending the mode and run ID to the test name.
Fixes: https://github.com/scylladb/scylladb/issues/17851
Fixes: https://github.com/scylladb/scylladb/issues/15973
Closes scylladb/scylladb#19235
* github.com:scylladb/scylladb:
[test.py] Add uniqueness to the test name
[test.py] Refactor alternator, nodetool, rest_api
utils/chunked_vector::reserve_partial: fix usage in callers
The method reserve_partial(), when used as documented, quits before the
intended capacity can be reserved fully. This can lead to overallocation
of memory in the last chunk when data is inserted to the chunked vector.
The method itself doesn't have any bug but the way it is being used by
the callers needs to be updated to get the desired behaviour.
Instead of calling it repeatedly with the value returned from the
previous call until it returns zero, it should be repeatedly called with
the intended size until the vector's capacity reaches that size.
This PR updates the method comment and fixes all the callers to use it
the right way.
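For illustration, a minimal sketch of the corrected calling pattern, assuming the vector exposes capacity() as utils::chunked_vector does:

    utils::chunked_vector<int> v;
    const size_t intended_size = 1'000'000;
    // Old, broken pattern: loop on the value returned by reserve_partial()
    // until it reaches zero; this quits before the full capacity is reserved.
    // Corrected pattern: keep passing the intended size until the vector's
    // capacity catches up with it.
    while (v.capacity() < intended_size) {
        v.reserve_partial(intended_size);
    }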
Fixes #19254
Closes scylladb/scylladb#19279
* github.com:scylladb/scylladb:
utils/large_bitset: remove unused includes identified by clangd
utils/large_bitset: use thread::maybe_yield()
test/boost/chunked_managed_vector_test: fix testcase tests_reserve_partial
utils/lsa/chunked_managed_vector: fix reserve_partial()
utils/chunked_vector: return void from reserve_partial and make_room
test/boost/chunked_vector_test: fix testcase tests_reserve_partial
utils/chunked_vector::reserve_partial: fix usage in callers
This PR fixes two problems with the `expected_error`
parameter in `server_add` and `servers_add`.
1. It didn't work in `server_add` if the cluster was empty
because of an incorrect attempt to connect the driver.
2. It didn't work in `servers_add` at all because the
`seeds` parameter was handled incorrectly.
This PR only improves the testing framework, so there is
no need to backport it.
Closes scylladb/scylladb#19255
* github.com:scylladb/scylladb:
test: manager_client, scylla_cluster: fix type annotations in add_servers
test: manager_client: don't connect driver after failed server_{add, start}
test: scylla_cluster: pass seeds to add_servers
In CI, tests are always executed with the option --repeat=3, which generates 3 test results with the same name. The JUnit plugin in CI cannot correctly distinguish between these results. When we have two passes and one failure, the link to a test result is sometimes redirected to the wrong one because the test name is the same.
To fix this, a ReportPlugin is added that modifies the test case name during JUnit report generation, appending the mode and run ID to the test name.
Fixes: https://github.com/scylladb/scylladb/issues/17851
Fixes: https://github.com/scylladb/scylladb/issues/15973
For various reasons, a view building write may fail. When that
happens, the view building should not finish until these writes
are successfully retried and they should not interfere with any
writes that are performed to the base table while the view is
building.
The test introduced in this patch confirms that this is the case.
Refs scylladb/scylladb#19261
Closes scylladb/scylladb#19263
Update the maximum size tested by the testcase. The test always created
only one chunk, as the maximum size tested by it (1 << 12 = 4 KB) was less
than the default max chunk size (128 KB). So, use twice the
max_chunk_capacity as the test size distribution upper limit to verify
that reserve_partial can reserve multiple chunks.
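A minimal sketch of the updated bound, assuming the static max_chunk_capacity() accessor on utils::chunked_vector:

    #include <random>
    #include "utils/chunked_vector.hh"

    // With the upper bound at twice max_chunk_capacity, the larger sampled
    // sizes force reserve_partial to allocate more than one chunk.
    using vector_type = utils::chunked_vector<uint64_t>;
    std::mt19937 rng(std::random_device{}());
    std::uniform_int_distribution<size_t> size_dist(
        1, 2 * vector_type::max_chunk_capacity());
    const size_t test_size = size_dist(rng);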
Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com>
Fix the method comment and return types of chunked_managed_vector's
reserve_partial() similar to chunked_vector's reserve_partial() as it
has the same issues mentioned in #19254. Also update the usage in the
chunked_managed_vector_test.
Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com>
Fix the usage of reserve_partial in the testcase. Also update the
maximum chunk size used by the testcase. The test always created only
one chunk, as the maximum size tested by it (1 << 12 = 4 KB) was less
than the default max chunk size (128 KB). So, use a smaller chunk size,
512 bytes, to verify that reserve_partial can reserve multiple chunks.
Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com>
Before work on tablets was completed, it was noticed that — due to some missing pieces of implementation — Scylla doesn't properly close sstables for migrated-away tablets. Because of this, disk space wasn't being reclaimed properly.
Since the missing pieces of implementation were added, the problem should be gone now. This patch adds a test which was used to reproduce the problem earlier. It's expected to pass now, validating that the issue was fixed.
Should be backported to branch-6.0, because the tested problem was also affecting that branch.
Fixes #16946
Closes scylladb/scylladb#18906
* github.com:scylladb/scylladb:
test_tablets: add test_tablet_storage_freeing
test: pylib: add get_sstables_disk_usage()
As it turns out, each sstable carries its own schema in its serialization header (Statistics component). This schema is incomplete -- the names of the key columns are not stored, just their types. Static and regular columns, however, do have both names and types stored. This bare-bones schema is enough to parse and display the content of the sstable. Another thing missing is the schema options (the stuff after the `WITH` keyword, except the clustering order). The only options stored are the compression options (in the CompressionInfo component); these are actually needed to read the Data component.
This series adds a new method to `tools/schema_loader.cc` to extract the schema stored in the sstable itself. This new schema load method is used as the last fallback for obtaining the schema, in case scylla-sstable is trying to autodetect the schema of the sstable. Although right now this bare-bones schema is enough for everything scylla-sstable does, it is more future-proof to stick to the "full" schema if possible, so this new method is the last resort for now.
Fixes: https://github.com/scylladb/scylladb/issues/17869
Fixes: https://github.com/scylladb/scylladb/issues/18809
New functionality, no backport needed.
Closes scylladb/scylladb#19169
* github.com:scylladb/scylladb:
tools/scylla-sstable: log loaded schema with trace level
tools/scylla-sstable: load schema from the sstable as fallback
tools/schema_loader: introduce load_schema_from_sstable()
test/lib/random_schema: remove assert on min number of regular columns
sstables: introduce load_metadata()
This patch includes extensive testing for what happens to an ongoing
paged query when a secondary index is suddenly added or dropped.
Issue #18992 was opened suggesting that this would be broken, and
the tests included here show that it is indeed broken.
The four tests included in this patch are heavily commented to explain
what they are testing and why, but here is a short summary of what is
being tested by each of them:
1. A paged query filtering on v=17 continues correctly even if an
index is created on v.
2. A paged query filtering on v1 and v2 where v2 is indexed,
continues correctly even if an index is created on v1 (remember
that Scylla prefers to use the first index mentioned in the query).
3. A paged query using an index on v continues correctly even if that
index is deleted.
4. However, if the query doesn't say "ALLOW FILTERING", it cannot
be continued after the index is deleted.
All these tests pass on Cassandra, but all of them except the fourth
fail on Scylla, reproducing issue #18992. Somewhat to my surprise, the
failure of the query in all the failed tests is silent (i.e., trying to
fetch the next page just fetches nothing and says the iteration is done).
I was expecting more dramatic failures ("marshaling error" messages,
crashes, etc.) but didn't get them.
Refs #18992
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Closes scylladb/scylladb#19000
When auto-detecting the schema of the sstable, if all other methods
failed, load the schema from the sstable's serialization header. This
schema is incomplete. It is just enough to parse and display the content
of the sstable. Although parsing and displaying the content of the
sstable is all scylla-sstable does, it is more future-compatible to use
the full schema when possible. So the always-available but minimal
schema that each sstable has on itself is used just as a fallback.
The test which tested the case when all schema load attempts fail
doesn't work now, because loading the serialization header always
succeeds. So convert this test into two positive tests, testing the
serialization header schema fallback instead.
Allows loading the schema from an sstable's serialization header. This
schema is incomplete, but it is enough to parse and display the content
of the sstable.
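A rough sketch of where the fallback sits in the autodetection chain; load_schema_from_sstable() is named in the patch titles, while the surrounding helper and signatures are hypothetical:

    // Hypothetical wrapper illustrating the fallback order; only
    // tools::load_schema_from_sstable() is introduced by this series.
    schema_ptr autodetect_schema(const sstring& sstable_path) {
        // Try the "full" schema sources first (hypothetical helper).
        if (auto s = try_load_full_schema(sstable_path)) {
            return s;
        }
        // Last resort: the bare-bones, always-available schema from the
        // sstable's own serialization header (Statistics component).
        return tools::load_schema_from_sstable(sstable_path);
    }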
Change the format of sync points to use host ID instead of IPs, to be consistent with the use of host IDs in hinted handoff module.
Introduce sync point v3 format which is the same as v2 except it stores host IDs instead of IPs.
The decoding supports both formats, with host IDs and with IPs, so a sync point now contains a variant of the two types; in the case of the new format, the IP-to-host-ID translation is avoided.
Fixes #18653
Closes scylladb/scylladb#19134
* github.com:scylladb/scylladb:
db/hints: migrate sync point to host ID
db/hints: rename sync point structures with _v1 suffix to _v1_v2
If adding or starting a server fails expectedly, there is no reason
to update or connect the driver. Moreover, before this patch, we
couldn't use `server_add` and `servers_add` with `expected_error`
if the cluster was empty. After expected bootstrap failures, we
tried to connect the driver, which rightfully failed on
`assert len(hosts) > 0` in `cluster_con`.
This parameter was missing by mistake. As a result,
`expected_error` was passed from `add_servers` to `add_server` in
place of `seeds`, which caused strange crashes.
Before this change, when running memtable_test, we expected the
memtables of ks.cf to be the only memtables being flushed. We
inject 4 failures in the code path of flush, and wait until all 4
of them are triggered. But in the background, `dirty_memory_manager`
flushes all tables when necessary, so the total number of triggered
failures is not necessarily the number triggered when flushing
ks.cf; some of them could be triggered when flushing system tables.
That's why we had sporadic failures from this test: we might check
`t.min_memtable_timestamp()` too soon.
After this change, we increase the `unspooled_dirty_soft_limit`
setting in order to disable `dirty_memory_manager`, so that the
only flush is the one performed by the test.
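A minimal sketch of the idea; the knob name is taken from the message above, while the cql_test_config plumbing and the exact value are assumptions:

    cql_test_config cfg;
    // Raise the soft limit so that dirty_memory_manager never initiates
    // background flushes on its own; only the test triggers flushes.
    cfg.db_config->unspooled_dirty_soft_limit(1.0);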
Fixes #19034
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Before this change, we verified the behavior of the design under test
using `BOOST_ASSERT()`, which is a wrapper around `assert()`, so if a
test fails, the test just aborts. This is not very helpful for
postmortem debugging.
After this change, we use the `BOOST_REQUIRE` macro for the
verification, so that Boost.Test prints out the condition if it does
not hold when we test it.
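For example, a minimal self-contained Boost.Test sketch of the difference:

    #define BOOST_TEST_MODULE require_example
    #include <boost/test/included/unit_test.hpp>

    BOOST_AUTO_TEST_CASE(require_vs_assert) {
        int expected = 4;
        int actual = 5;
        // BOOST_ASSERT(actual == expected); // would just abort(), no details
        // BOOST_REQUIRE fails the test and prints the offending condition,
        // e.g. "check actual == expected has failed".
        BOOST_REQUIRE(actual == expected);
    }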
Refs #19034
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
This config item is propagated to the table object via table::config. Although the field in `table::config` used to propagate the value was a `utils::updateable_value<T>`, it was assigned a constant, and so the live-update chain was broken.
This series fixes this and adds a test which fails before the patch and passes after. The test needed new test infrastructure around the failure injection API, namely the ability to exfiltrate the value of an internal variable. This infrastructure is also added in this series.
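A minimal sketch of the failure mode, assuming the interface of utils/updateable_value.hh (an updateable_value_source with set(), and an updateable_value constructible from either a source or a plain value):

    utils::updateable_value_source<bool> src(true);

    utils::updateable_value<bool> live(src);     // stays linked to the source
    utils::updateable_value<bool> broken(src()); // built from a plain bool:
                                                 // the live-update chain is cut
    src.set(false);
    assert(live() == false);  // follows the config update
    assert(broken() == true); // frozen at the value it was assigned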
Fixes: https://github.com/scylladb/scylladb/issues/18674
- [x] This patch has to be backported because it fixes broken functionality
Closes scylladb/scylladb#18705
* github.com:scylladb/scylladb:
test/topology_custom: add test for enable_compacting_data_for_streaming_and_repair live-update
test/pylib: rest_client: add get_injection()
api/error_injection: add getter for error_injection
utils/error_injection: add set_parameter()
replica/database: fix live-update enable_compacting_data_for_streaming_and_repair
Change the format of sync points to use host ID instead of IPs, to be
consistent with the use of host IDs in hinted handoff module.
Introduce sync point v3 format which is the same as v2 except it stores
host IDs instead of IPs.
The encoding of sync points now always uses the new v3 format with host
IDs.
The decoding supports both formats, with host IDs and with IPs, so a
sync point now contains a variant of the two types; in the case of the
new format, the translation from IP to host ID is avoided.
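A minimal sketch of the decoded representation; the alias name is hypothetical, while std::variant and the two ID types are real:

    #include <variant>
    #include <vector>
    // gms::inet_address and locator::host_id are ScyllaDB types
    // (gms/inet_address.hh, locator/host_id.hh).

    // v1/v2 sync points carry IPs, v3 carries host IDs. Keeping both
    // alternatives in the decoded form avoids translating v3 back to IPs.
    using sync_point_targets = std::variant<
        std::vector<gms::inet_address>,  // v1/v2 formats
        std::vector<locator::host_id>>;  // v3 format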
This test spends most of its time waiting for a node to die. This change speeds the test up about 3x.
Was
real 9m21,950s
user 1m11,439s
sys 1m26,022s
Now
real 3m37,780s
user 0m58,439s
sys 1m13,698s
refs: #17764
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Closes scylladb/scylladb#19222
Consider the following:
1) table A has N tablets and views
2) migration starts for a tablet of A from node 1 to 2.
3) migration is at write_both_read_old stage
4) coordinator will push writes to both nodes (pending and leaving)
5) A has a view, so writes to it also result in reads (table::push_view_replica_updates())
6) the tablet's update_effective_replication_map() does not refresh the tablet sstable set (for the new tablet migrating in)
7) so the read in step 5 cannot find the sstable set for the tablet migrating in
Causes the following error:
"tablets - SSTable set wasn't found for tablet 21 of table mview.users"
which means a write is lost on the pending replica.
The fix refreshes the table's sstable set (tablet_sstable_set) and the cache's snapshot.
It's not a problem to refresh the cache snapshot as long as the logical
state of the data hasn't changed, which is true when allocating new
tablet replicas. That's also done in the context of compactions for example.
Fixes #19052.
Fixes #19033.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Closes scylladb/scylladb#19099
Fixes scylladb/scylla-pkg#3845
Don't overwrite (or rather change) AWS credentials variables if already set in
enclosing environment. Ensures EAR tests for AWS KMS can run properly in CI.
v2:
* Allow environment variables in reading the object storage config - allows CI to
use real credentials in the environment without risking putting them into less
secure files
* Don't write credentials info from miniserver into config, instead use said
environment vars to propagate creds.
v3:
* Fix python launch scripts to not clear environment, thus retaining above aws envs.
Closes scylladb/scylladb#19086
This is a translation of Cassandra's CQL unit test source file
DistinctQueryPagingTest.java into our cql-pytest framework.
The 5 tests did not reproduce any previously-unknown bug, but did provide
additional reproducers for one already-known issue:
Refs #10354: SELECT DISTINCT should allow filter on static columns,
not just partition keys
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Closes scylladb/scylladb#18971
Adds a utility for measuring the disk usage of the given table on the
given node.
Will be used in a follow-up patch for testing that sstables are freed
properly.
The check query may be executed on a node which doesn't yet see that
the downed server is down, as it is not shut down gracefully. The
query coordinator can choose the down node as a CL=1 replica for read
and time out.
To fix, wait for all nodes to notice the node is down before executing
the checking query.
Fixes #17938
Closes scylladb/scylladb#19137
After wasm UDFs appeared, code in main, create_function_statement and schema_tables became involved in the details of wasm engine management. Also, even prior to this, there was duplication in how the function context is created by statement code and by schema_tables code.
This PR generalizes function context creation and encapsulates the management in a sharded<lang::manager> service. It also removes the wasm::startup_context thing and makes wasm start/stop "classical" (see #2737)
Closes scylladb/scylladb#19166
* github.com:scylladb/scylladb:
code: Enlighten wasm headers usage
lang: Unfriend wasm context from manager
lang, cql3, schema_tables: Don't mess with db::config
lang: Don't use db::config to create lua context
lang: Don't use db::config to create wasm context
lang: Drop manager::precompile() method
cql3, schema_tables: Generalize function creation
wasm: Replace startup_context with wasm_config
lang: Add manager::start() method
lang: Move manager to lang namespace
lang: Move wasm::manager to its .cc/.hh files
Alternator has a custom TTL implementation. It is based on a loop which scans existing rows in the table, then decides whether each row has reached its end-of-life and deletes it if it did. This work is done in the background, and therefore it uses the maintenance (streaming) scheduling group. However, it was observed that part of this work leaks into the statement scheduling group, competing with user workloads and negatively affecting their latencies. This was found to be caused by the reads and writes done on behalf of the Alternator TTL, which lose their maintenance scheduling group when they have to go to a remote node. This is because the messaging service was not configured to recognize the streaming scheduling group when statement verbs like reads or writes are invoked. The messaging service currently recognizes two statement "tenants": the user tenant (statement scheduling group) and the system tenant (default scheduling group), as we used to have only user-initiated operations and system (internal) ones. With Alternator TTL, there is now a need to distinguish between two kinds of system operations: foreground and background ones. The former should use the system tenant while the latter will use the new maintenance tenant (streaming scheduling group).
This series adds a streaming tenant to the messaging service configuration, and it adds a test which confirms that with this change, Alternator TTL is entirely contained in the maintenance scheduling group.
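A rough sketch of the resulting tenant list; the field and tenant names here are assumptions based on the description above, not the exact code:

    // Assumed shape of messaging_service's scheduling config after this
    // series: one statement tenant per kind of traffic.
    netw::messaging_service::scheduling_config cfg;
    cfg.statement_tenants = {
        {user_sg,        "$user"},        // user-initiated statements
        {system_sg,      "$system"},      // foreground system operations
        {maintenance_sg, "$maintenance"}, // background ops (streaming group)
    };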
Fixes: #18719
- [x] Scans executed on behalf of Alternator TTL were running in the statement group, disturbing user workloads; this PR has to be backported to fix this.
Closes scylladb/scylladb#18729
* github.com:scylladb/scylladb:
alternator, scheduler: test reproducing RPC scheduling group bug
main: add maintenance tenant to messaging_service's scheduling config
The Alternator test test_metrics.py::test_item_latency confirms that
for several operation types (PutItem, GetItem, DeleteItem, UpdateItem)
we did not forget to measure their latencies.
The test checked that a latency was updated by checking that two metrics
increase:
scylla_alternator_op_latency_count
scylla_alternator_op_latency_sum
However, it turns out that the "sum" is only an approximate sum of all
latencies, and when the total sum grows large it sometimes does *not*
increase when a short latency is added to the statistics. When this
happens, this test fails on the assertion that the "sum" increases after
an operation. We saw this happening sometimes in CI runs.
The simple fix is to stop checking _sum at all, and only verify that
the _count increases - this is really an integer counter that
unconditionally increases when a latency is added to the histogram.
Don't worry that the strength of this test is reduced - this test was
never meant to check the accuracy or correctness of the histograms -
we should have different (and better) tests for that, unrelated to
Alternator. The purpose of *this* test is only to verify that for some
specific operation like PutItem, Alternator didn't forget to measure its
latency and update the histogram. We want to avoid a bug like we had
in counters in the past (#9406).
Fixes #18847.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Closes scylladb/scylladb#19080
In e7d4e080, we reenabled the background writes in this test, but
when running with tablets enabled, background writes were still
disabled because of #17025, which was fixed last week. So we can now
enable background writes with tablets.
In this change,
* background writes are enabled with tablets,
* the number of nodes is increased by 1, so that we have enough nodes
to fulfill the needs of tablets, which enforce that the number
of replicas always satisfies the RF, and
* rf is passed to `start_writes()` explicitly, so we have fewer
magic numbers in the test and the data dependencies are
more obvious.
Fixes #17589
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closes scylladb/scylladb#18707
Currently they both run in the streaming group, which may become busy
during repair/MV building and affect group0 functionality. Move them to
the gossiper group, where they should have more time to run.
Fixes scylladb/scylladb#18863
Closes scylladb/scylladb#19138
Similarly to the previous patch, the lua context needs db::config for
creation. It's better to get the configurables via lang::manager::config.
One thing to note -- the lua config carries updateable_values on board,
but the respective db::config options are _not_ LiveUpdate-able, so the
lua config could just use simple data types. This patch keeps the
updateable values intact for brevity.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
The manager needs to get two "fuel" configurables from db::config in
order to create context. Instead of carrying db config from callers,
keep the options on existing lang::manager::config and use them.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
The lang::manager starts with the help of a context because it needs to
have std::shared_ptr<>s pointing to the cross-shard shared wasm engine
and runner thread. For that, a context is created in advance, which then
helps share the engine and runner across manager instances.
This patch removes the "context" and replaces it with a classical
manager::config. With it, it's lang::manager that is now responsible for
initializing itself.
In order to have cross-shard engine and thread pointers, the start()
method uses the invoke_on_others() facility to share the pointer.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Just like any other sharded<> service, the lang::manager now starts and
stops in a classical sequence of
await sharded<manager>::start()
defer([] { await sharded<manager>::stop() })
await sharded<manager>::invoke_on_all(&manager::start)
For now the method is no-op, next patches will start using it.
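As a concrete sketch, the standard seastar lifecycle this converges on (constructor arguments elided; deferred_stop is seastar's stop-on-scope-exit helper):

    seastar::sharded<lang::manager> mgr;
    co_await mgr.start(/* lang::manager::config */);
    // Stops the sharded service when the enclosing scope exits.
    auto stop_mgr = seastar::deferred_stop(mgr);
    co_await mgr.invoke_on_all(&lang::manager::start);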
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
And, while at it, rename the local variable to refer to it as "manager",
not "wasm". Query processor and database also have getters named
"wasm()", these are not renamed yet to keep patch smaller (and those
getters are going to be reworked further anyway).
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
It's going to become a facade in front of both -- wasm and lua, so keep
it in files with language-independent names.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>