scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-04-27 03:45:11 +00:00

Author	SHA1	Message	Date
Tomasz Grabiec	9802bb6564	Merge 'Remove explicit flush() from sstable component writer' from Pavel Emelyanov Writing into sstable component output stream should be done with care. In particular -- flushing can happen only once right before closing the stream. Flushing the stream in between several writes is not going to work, because file stream would step on unaligned IO and S3 upload stream would send completion message to the server and would lose any subsequent write. Most of the file_writer users already obey that and flush the writer once right before closing it. The do_write_simple() is extra careful about exceptions handling, but it's an overkill (see first patch). It's better to make file_writer API explicitly lack the ability to flush itself by flushing the stream when closing the writer. Closes #13338 * github.com:scylladb/scylladb: sstables: Move writer flush into close (and remove it) sstables: Relax exception handling in do_write_simple	2023-04-05 12:09:31 +02:00
Tomasz Grabiec	bbabf07f69	Merge 'test/boost/multishard_mutation_query: use random schema' from Botond Dénes This test currently uses `test/lib/test_table.hh` to generate data for its test cases. This data generation facility is used by no other tests. Worse, it is redundant as we already have a random data generator with fixed schema, in `test/lib/mutation_source_test.hh`. So in this series, we migrate the test cases in said test file to random schema and its random data generation facilities. These are used by several other test cases and using random schema allows us to cover a wider (quasi-infinite) number of possibilities. After migrating all tests away from it, `test/lib/test_table.hh` is removed. This series also reduces the runtime of `fuzzy_test` drastically. It should now run in a few minutes or even in seconds (depending on the machine). Fixes: #12944 Closes #12574 * github.com:scylladb/scylladb: test/lib: rm test_table.hh test/boos/multishard_mutation_query_test: migrate other tests to random schema test/boost/multishard_mutation_query_test: use ks keyspace test/boost/multishard_mutation_query_test: improve test pager test/boost/multishard_mutation_query_test: refactor fuzzy_test test/boost: add multishard_mutation_query_test more memory types/user: add get_name() accessor test/lib/random_schema: add create_with_cql() test/lib/random_schema: fix udt handling test/lib/random_schema: type_generator(): also generate frozen types test/lib/random_schema: type_generator(): make static column generation conditional test/lib/random_schema: type_generator(): don't generate duration_type for keys test/lib/random_schema: generate_random_mutations(): add overload with seed test/lib/random_schema: generate_random_mutations(): respect range tombstone count param test/lib/random_schema: generate_random_mutations(): add yields test/lib/random_schema: generate_random_mutations(): fix indentation test/lib/random_schema: generate_random_mutations(): coroutinize method 
test/lib/random_schema: generate_random_mutations(): expand comment	2023-04-05 10:32:58 +02:00
Michał Chojnowski	df0905357e	mutation_partition_v2: add sentinel to the tracker after adding it to the tree Every tracker insertion has to have a corresponding removal or eviction, (otherwise the number of rows in the tracker will be misaccounted). If we add the row to the tracker before adding it to the tree, and the tree insertion fails (with bad_alloc), this contract will be violated. Fix that. Note: the problem is currently irrelevant because an exception during sentinel insertion will abort the program anyway. Closes #13336	2023-04-05 09:52:44 +02:00
Raphael S. Carvalho	457c772c9c	replica: Make compaction_group responsible for deleting off-strategy compaction input Compaction group is responsible for deleting SSTables of "in-strategy" compactions, i.e. regular, major, cleanup, etc. Both in-strategy and off-strategy compaction have their completion handled using the same compaction group interface, which is compaction_group::table_state::on_compaction_completion(..., sstables::offstrategy offstrategy) So it's important to bring symmetry there, by moving the responsibility of deleting off-strategy input, from manager to group. Another important advantage is that off-strategy deletion is now throttled and gated, allowing for better control, e.g. table waiting for deletion on shutdown. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Closes #13432	2023-04-05 08:37:48 +03:00
Botond Dénes	f7421aab2c	Merge 'cmake: sync with `configure.py` (16/n)' from Kefu Chai this is the 15th changeset of a series which tries to give an overhaul to the CMake building system. this series has two goals: - to enable developer to use CMake for building scylla. so they can use tools (CLion for instance) with CMake integration for better developer experience - to enable us to tweak the dependencies in a simpler way. a well-defined cross module / subsystem dependency is a prerequisite for building this project with the C++20 modules. also, i just found that the scylla executable built with cmake building system segfault in master HEAD. like ``` AddressSanitizer:DEADLYSIGNAL ================================================================= ==3974496==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000000 (pc 0x000000000000 bp 0x7ffd48549f70 sp 0x7ffd48549728 T0) ==3974496==Hint: pc points to the zero page. ==3974496==The signal is caused by a READ memory access. ==3974496==Hint: address points to the zero page. #0 0x0 (<unknown module>) #1 0x14e785a5 in wasmtime_runtime::traphandlers::unix::trap_handler::h1f510afc2968497f /home/kefu/.cargo/registry/src/mirrors.sjtug.sjtu.edu.cn-7a04d2510079875b/wasmtime-runtime-5.0.1/src/traphandlers/unix.rs:159:9 #2 0x7f3462e5eb9f (/lib64/libc.so.6+0x3db9f) (BuildId: 6107835fa7d4725691b2b7f6aaee7abe09f493b2) AddressSanitizer can not provide additional info. SUMMARY: AddressSanitizer: SEGV (<unknown module>) ==3974496==ABORTING Aborting on shard 0. Backtrace: 0xd16c38a 0x13c5aab0 0x13b9821e 0x13c2fdc7 /lib64/libc.so.6+0x3db9f /lib64/libc.so.6+0x8eb93 /lib64/libc.so.6+0x3daed /lib64/libc.so.6+0x2687e 0xd1e5f8a 0xd1e3d34 0xd1ca059 0xd1c5e29 0xd1c5605 0x14e785a5 /lib64/libc.so.6+0x3db9f ``` decoded: ``` __interceptor_backtrace at ??:? 
void seastar::backtrace<seastar::backtrace_buffer::append_backtrace()::{lambda(seastar::frame)#1}>(seastar::backtrace_buffer::append_backtrace()::{lambda(seastar::frame)#1}&&) at /home/kefu/dev/scylladb/seastar/include/seastar/util/backtrace.hh:60 seastar::backtrace_buffer::append_backtrace() at /home/kefu/dev/scylladb/seastar/src/core/reactor.cc:778 (inlined by) seastar::print_with_backtrace(seastar::backtrace_buffer&, bool) at /home/kefu/dev/scylladb/seastar/src/core/reactor.cc:808 seastar::print_with_backtrace(char const, bool) at /home/kefu/dev/scylladb/seastar/src/core/reactor.cc:820 (inlined by) seastar::sigabrt_action() at /home/kefu/dev/scylladb/seastar/src/core/reactor.cc:3882 (inlined by) operator() at /home/kefu/dev/scylladb/seastar/src/core/reactor.cc:3858 (inlined by) __invoke at /home/kefu/dev/scylladb/seastar/src/core/reactor.cc:3854 /lib64/libc.so.6: ELF 64-bit LSB shared object, x86-64, version 1 (GNU/Linux), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=6107835fa7d4725691b2b7f6aaee7abe09f493b2, for GNU/Linux 3.2.0, not stripped __GI___sigaction at :? __pthread_kill_implementation at ??:? __GI_raise at :? __GI_abort at :? __sanitizer::Abort() at ??:? __sanitizer::Die() at ??:? __asan::ScopedInErrorReport::~ScopedInErrorReport() at ??:? __asan::ReportDeadlySignal(__sanitizer::SignalContext const&) at ??:? __asan::AsanOnDeadlySignal(int, void, void) at ??:? wasmtime_runtime::traphandlers::unix::trap_handler at /home/kefu/.cargo/registry/src/mirrors.sjtug.sjtu.edu.cn-7a04d2510079875b/wasmtime-runtime-5.0.1/src/traphandlers/unix.rs:159 __GI___sigaction at :? ``` this led me to this change. but unfortunately, this changeset does not address the segfault. will continue the investigation in my free cycles. 
Closes #13434 github.com:scylladb/scylladb: build: cmake: include cxx.h with relative path build: cmake: set stack frame limits build: cmake: pass -fvisibility=hidden to compiler build: cmake: use -O0 on aarch64, otherwise -Og	2023-04-05 06:57:23 +03:00
Yaron Kaikov	c80ab78741	doc: update supported os for 2022.1 ubuntu22.04 is already supported on both `5.0` and `2022.1` updating the table Closes #13340	2023-04-05 06:43:58 +03:00
Pavel Emelyanov	f5de0582c8	alternator,util: Move aws4-hmac-sha256 signature generator to util S3 client cannot perform anonymous multipart uploads into any real S3 buckets regardless of their configuration. Since multipart upload is an essential part of the sstables backend, we need to implement the authorisation support for the client early. (side note): with minio anonymous multipart upload works, with aws s3 anonymous PUT and DELETE can be configured, it's exactly the combination of aws + multipart upload that does need authorization. Fortunately, the signature generation and signature checking code is symmetrical and we have the checking option already in alternator :) So what this patch does is just move the alternator::get_signature() helper into utils/. A sad side effect of that is all tests now need to link with gnutls :( that is used to compute the hash value itself. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes #13428	2023-04-04 18:24:48 +03:00
Nadav Har'El	aeabfcb93f	Merge 'Revert scylla sstable schema improvements' from Botond Dénes This PR reverts the scylla sstable schema loading improvements as they fail in CI every other run. I am already working on fixes for these but I am not sure I understand all the failures so it is best to revert and re-post the series later. Fixes: #13404 Fixes: #13410 Closes #13419 * github.com:scylladb/scylladb: Revert "Merge 'tool/scylla-sstable: more flexibility in obtaining the schema' from Botond Dénes" Revert "tools/schema_loader: don't require results from optional schema tables"	2023-04-04 18:22:14 +03:00
Anna Stuchlik	447ce58da5	doc: update Raft doc for versions 5.2 and 2023.1 Fixes https://github.com/scylladb/scylladb/issues/13345 Fixes https://github.com/scylladb/scylladb/issues/13421 This commit updates the Raft documentation page to be up to date in versions 5.2 and 2023.1. - Irrelevant information about previous releases is removed. - Some information is clarified. - Mentions of version 5.2 are either removed (if possible) or version 2023.1 is added. Closes #13426	2023-04-04 15:15:56 +02:00
Kefu Chai	dceb364c5c	build: cmake: include cxx.h with relative path before this change, the wasm binding source files includes the cxxbridge header file of `cxx.h` with its full path. to better mirror the behavior of configure.py, let's just include this header file with relative path. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-04-04 15:33:20 +08:00
Kefu Chai	ecd5bf98d9	build: cmake: set stack frame limits * transpose include(mode.common) and include (mode.${build_mode}), so the former can reference the value defined by the latter. * set stack_usage_threshold for supported build modes. please note, this compiler option (-Wstack-usage=<bytes>) is only supported by GCC so far. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-04-04 15:33:20 +08:00
Kefu Chai	6cc8800c85	build: cmake: pass -fvisibility=hidden to compiler this mirrors the behavior of `configure.py`. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-04-04 15:33:20 +08:00
Kefu Chai	066e9567ee	build: cmake: use -O0 on aarch64, otherwise -Og this addresses an oversight in `b234c839e4`, which is supposed to mirror the behavior of `configure.py`. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-04-04 15:33:20 +08:00
Anna Stuchlik	595325c11b	doc: add upgrade guide from 5.2 to 2023.1 Related: https://github.com/scylladb/scylla-enterprise/issues/2770 This commit adds the upgrade guide from ScyllaDB Open Source 5.2 to ScyllaDB Enterprise 2023.1. This commit does not cover metric updates (the metrics file has no content, which needs to be added in another PR). As this is an upgrade guide, this commit must be merged to master and backported to branch-5.2 and branch-2023.1 in scylla-enterprise.git. Closes #13294	2023-04-04 08:24:00 +03:00
Botond Dénes	8167f11a23	Merge 'Move compaction manager tasks out of compaction manager' from Aleksandra Martyniuk Task manager compaction tasks that cover compaction group compaction need access to compaction_manager::tasks. To avoid circular dependency and be able to rely on forward declaration, task needs to be moved out of compaction manager. To avoid naming confusion compaction_manager::task is renamed. Closes #13226 * github.com:scylladb/scylladb: compaction: use compaction namespace in compaction_manager.cc compaction: rename compaction::task compaction: move compaction_manager::task out of compaction manager compaction: move sstable_task definition to source file	2023-04-03 15:40:42 +03:00
Botond Dénes	54c0a387a2	Revert "Merge 'tool/scylla-sstable: more flexibility in obtaining the schema' from Botond Dénes" This reverts commit `32fff17e19`, reversing changes made to `164afe14ad`. This series proved to be problematic, the new test introduced by it failing quite often. Revert it until the problems are tracked down and fixed.	2023-04-03 13:54:00 +03:00
Botond Dénes	04b1219694	Revert "tools/schema_loader: don't require results from optional schema tables" This reverts commit `c15f53f971`. Said commit is based on a commit which we want to revert because its unit test is flaky.	2023-04-03 13:53:06 +03:00
Petr Gusev	09636b20f3	scylla_cluster.py: optimize node logs reading There are two occasions in scylla_cluster where we read the node logs, and in both of them we read the entire file in memory. This is not efficient and may cause an OOM. In the first case we need the last line of the log file, so we seek at the end and move backwards looking for a new line symbol. In the second case we look through the log file to find the expected_error. The readlines() method returns a Python list object, which means it reads the entire file in memory. It's sufficient to just remove it since iterating over the file instance already yields lines lazily one by one. This is a follow-up for #13134. Closes #13399	2023-04-03 12:28:08 +02:00
Marcin Maliszkiewicz	99f8d7dcbe	db: view: use deferred_close for closing staging_sstable_reader When consume_in_thread throws the reader should still be closed. Related https://github.com/scylladb/scylla-enterprise/issues/2661 Closes #13398 Refs: scylladb/scylla-enterprise#2661 Fixes: #13413	2023-04-03 09:02:55 +03:00
Botond Dénes	ca062d1fba	Merge ' mutation: replace operator<<(..) with fmt formatter' from Kefu Chai this is a part of a series migrating from `operator<<(ostream&, ..)` based formatting to fmtlib based formatting. the goal here is to enable fmtlib to print `position_in_partition` and `partition_region` without using ostream<<. also, this change removes `operator<<(ostream, const position_in_partition_view&)` , `operator<<(ostream, const partition_region&)` along with their callers. Refs #13245 Closes #13391 * github.com:scylladb/scylladb: mutation: drop operator<< for position_in_partition and friends partition_snapshot_row_cursor: do not use operator<< when printing position mutation: specialize fmt::formatter<position_in_partition> mutation: specialize fmt::formatter<partition_region>	2023-04-03 08:34:55 +03:00
Kefu Chai	6c37829224	wasm: add noexcept specifier for alien::run_on() as alien::run_on() requires the function to be noexcept, let's make this explicit. also, this paves the road to the type constraint added to `alien::run_on()`. the type constraint will enforce this requirement on the function passed to `alien::run_on()`. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #13375	2023-04-03 08:19:00 +03:00
Botond Dénes	36e53d571c	Merge 'Treewide use-after-move bug fixes' from Raphael "Raph" Carvalho That's courtesy of `153813d3b8`, which annotates Seastar smart pointer classes with Clang's consumed attributes, to help Clang to statically spot use-after-move bugs. Closes #13386 * github.com:scylladb/scylladb: replica: Fix use-after-move in table::make_streaming_reader index/built_indexes_virtual_reader.hh: Fix use-after-move db/view/build_progress_virtual_reader: Fix use-after-move sstables: Fix use-after-move when making reader in reverse mode	2023-04-03 06:57:54 +03:00
Raphael S. Carvalho	d2d151ae5b	Fix use-after-move when initializing row cache with dummy entry Courtesy of clang-tidy: row_cache.cc:1191:28: warning: 'entry' used after it was moved [bugprone-use-after-move] _partitions.insert(entry.position().token().raw(), std::move(entry), dht::ring_position_comparator{_schema}); ^ row_cache.cc:1191:60: note: move occurred here _partitions.insert(entry.position().token().raw(), std::move(entry), dht::ring_position_comparator{_schema}); ^ row_cache.cc:1191:28: note: the use and move are unsequenced, i.e. there is no guarantee about the order in which they are evaluated _partitions.insert(entry.position().token().raw(), std::move(entry), dht::ring_position_comparator{*_schema}); The use-after-move is UB, as whether it happens depends on evaluation order. We haven't hit it yet as clang is left-to-right. Fixes #13400. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Closes #13401	2023-03-31 19:46:53 +03:00
Botond Dénes	c15f53f971	tools/schema_loader: don't require results from optional schema tables When loading a schema from disk, only the `tables` and `columns` tables are required to have an entry to the loaded schema. All the others are optional. Yet the schema loader expects all the tables to have a corresponding entry, which leads to errors when trying to load a schema which doesn't. Relax the loader to only require existing entries in the two mandatory tables and not the others. Closes #13393	2023-03-31 16:35:42 +02:00
Kefu Chai	c24a9600af	docs: dev: correct a typo s/By expending/By expanding/ Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #13392	2023-03-31 17:19:08 +03:00
Raphael S. Carvalho	04932a66d3	replica: Fix use-after-move in table::make_streaming_reader Variant used by streaming/stream_transfer_task.cc: , reader(cf.make_streaming_reader(cf.schema(), std::move(permit_), prs)) as full slice is retrieved after schema is moved (clang evaluates left-to-right), the stream transfer task can be potentially working on a stale slice for a particular set of partitions. static report: In file included from replica/dirty_memory_manager.cc:6: replica/database.hh:706:83: error: invalid invocation of method 'operator->' on object 'schema' while it is in the 'consumed' state [-Werror,-Wconsumed] return make_streaming_reader(std::move(schema), std::move(permit), range, schema->full_slice()); Fixes #13397. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2023-03-31 08:44:46 -03:00
Raphael S. Carvalho	f8df3c72d4	index/built_indexes_virtual_reader.hh: Fix use-after-move static report: ./index/built_indexes_virtual_reader.hh:228:40: warning: invalid invocation of method 'operator->' on object 's' while it is in the 'consumed' state [-Wconsumed] _db.find_column_family(s->ks_name(), system_keyspace::v3::BUILT_VIEWS), Fixes #13396. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2023-03-31 08:41:44 -03:00
Raphael S. Carvalho	1ecba373d6	db/view/build_progress_virtual_reader: Fix use-after-move use-after-free in ctor, which potentially leads to a failure when locating table from moved schema object. static report In file included from db/system_keyspace.cc:51: ./db/view/build_progress_virtual_reader.hh:202:40: warning: invalid invocation of method 'operator->' on object 's' while it is in the 'consumed' state [-Wconsumed] _db.find_column_family(s->ks_name(), system_keyspace::v3::SCYLLA_VIEWS_BUILDS_IN_PROGRESS), Fixes #13395. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2023-03-31 08:40:30 -03:00
Raphael S. Carvalho	213eaab246	sstables: Fix use-after-move when making reader in reverse mode static report: sstables/mx/reader.cc:1705:58: error: invalid invocation of method 'operator' on object 'schema' while it is in the 'consumed' state [-Werror,-Wconsumed] legacy_reverse_slice_to_native_reverse_slice(schema, slice.get()), pc, std::move(trace_state), fwd, fwd_mr, monitor); Fixes #13394. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2023-03-31 08:39:11 -03:00
Kefu Chai	6e956c5358	mutation: drop operator<< for position_in_partition and friends now that all their callers are removed, let's just drop these operators. Refs #13245 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-03-31 19:03:14 +08:00
Kefu Chai	76dde9fd50	partition_snapshot_row_cursor: do not use operator<< when printing position in order to prepare for dropping the `operator<<()` for `position_in_partition_view`, let's use fmtlib to print `position()`. Refs #13245 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-03-31 19:03:14 +08:00
Kefu Chai	4ec4859179	mutation: specialize fmt::formatter<position_in_partition> this is a part of a series migrating from `operator<<(ostream&, ..)` based formatting to fmtlib based formatting. the goal here is to enable fmtlib to print - position_in_partition - position_in_partition_view - position_in_partition_view::printer without the help of fmt::ostream. their `operator<<(ostream,..)` are reimplemented using fmtlib accordingly to ease the review. Refs #13245 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-03-31 19:03:14 +08:00
Kefu Chai	500eeeb12c	mutation: specialize fmt::formatter<partition_region> this is a part of a series migrating from `operator<<(ostream&, ..)` based formatting to fmtlib based formatting. the goal here is to enable fmtlib to print `partition_region` with the help of fmt::ostream. to help with the review process, the corresponding `to_string()` is dropped, and its callers now switch over to `fmt::to_string()` in this change as well. using `fmt::to_string()` helps with consolidating all places to use fmtlib for printing/formatting. Refs #13245 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-03-31 19:03:14 +08:00
Tomasz Grabiec	99cb948eac	direct_failure_detector: Avoid throwing exceptions in the success path sleep_abortable() is aborted on success, which causes sleep_aborted exception to be thrown. This causes scylla to throw every 100ms for each pinged node. Throwing may reduce performance if it happens often. Also, it spams the logs if --logger-log-level exception=trace is enabled. Avoid this by swallowing the exception on cancellation. Fixes #13278. Closes #13279	2023-03-31 12:40:43 +02:00
Alejo Sanchez	81b40c10de	test/pylib: RandomTables.add_column with value column When adding extra columns in a test, make them value columns. Name them with the "v_" prefix and use the value column number counter. Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com> Closes #13271	2023-03-31 11:19:49 +02:00
Alejo Sanchez	e3b462507d	test/pylib: topology: support clusters of initial size 0 To allow tests with custom clusters, allow configuration of initial cluster size of 0. Add a proof-of-concept test to be removed later. Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com> Closes #13342	2023-03-31 11:17:58 +02:00
Kefu Chai	e107b31d23	test: sstable: remove unused class in sstable test generation_for_sharded_test is not used by any of these sstable tests, so let's drop it. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #13388	2023-03-31 08:02:22 +03:00
Botond Dénes	f777916055	Merge 'Offstrategy keyspace compaction task' from Aleksandra Martyniuk Task manager task implementations of classes that cover offstrategy keyspace compaction which can be started through /storage_service/keyspace_compaction/ api. Top level task covers the whole compaction and creates child tasks on each shard. Closes #12713 * github.com:scylladb/scylladb: test: extend test_compaction_task.py to test offstrategy compaction compaction: create task manager's task for offstrategy keyspace compaction on one shard compaction: create task manager's task for offstrategy keyspace compaction compaction: create offstrategy_compaction_task_impl	2023-03-31 07:09:17 +03:00
Pavel Emelyanov	7d6ab5c84d	code: Remove some headers from query_processor.hh The forward_service.hh and raft_group0_client.hh can be replaced with forward declarations. Few other files need their previously indirectly included headers back. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes #13384	2023-03-31 07:08:41 +03:00
Tomasz Grabiec	4d6443e030	Merge 'Schema commitlog separate dir' from Gusev Petr The commitlog api originally implied that the commitlog_directory would contain files from a single commitlog instance. This is checked in segment_manager::list_descriptors, if it encounters a file with an unknown prefix, an exception occurs in `commitlog::descriptor::descriptor`, which is logged with the `WARN` level. A new schema commitlog was added recently, which shares the filesystem directory with the main commitlog. This causes warnings to be emitted on each boot. This patch solves the warnings problem by moving the schema commitlog to a separate directory. In addition, the user can employ the new `schema_commitlog_directory` parameter to move the schema commitlog to another disk drive. This is expected to be released in 5.3. As #13134 (raft tables->schema commitlog) is also scheduled for 5.3, and it already requires a clean rolling restart (no cl segments to replay), we don't need to specifically handle upgrade here. Fixes: #11867 Closes #13263 * github.com:scylladb/scylladb: commitlog: use separate directory for schema commitlog schema commitlog: fix commitlog_total_space_in_mb initialization	2023-03-30 23:48:58 +02:00
Petr Gusev	0152c000bb	commitlog: use separate directory for schema commitlog The commitlog api originally implied that the commitlog_directory would contain files from a single commitlog instance. This is checked in segment_manager::list_descriptors, if it encounters a file with an unknown prefix, an exception occurs in commitlog::descriptor::descriptor, which is logged with the WARN level. A new schema commitlog was added recently, which shares the filesystem directory with the main commitlog. This causes warnings to be emitted on each boot. This patch solves the warnings problem by moving the schema commitlog to a separate directory. In addition, the user can employ the new schema_commitlog_directory parameter to move the schema commitlog to another disk drive. By default, the schema commitlog directory is nested in the commitlog_directory. This can help avoid problems during an upgrade if the commitlog_directory in the custom scylla.yaml is located on a separate disk partition. This is expected to be released in 5.3. As #13134 (raft tables->schema commitlog) is also scheduled for 5.3, and it already requires a clean rolling restart (no cl segments to replay), we don't need to specifically handle upgrade here. Fixes: #11867	2023-03-30 21:55:50 +04:00
Petr Gusev	f31bd26971	schema commitlog: fix commitlog_total_space_in_mb initialization It seems there was a typo here, which caused commitlog_total_space_in_mb to always be zero and the schema commitlog to be effectively unlimited in size.	2023-03-30 21:55:50 +04:00
Botond Dénes	207dcbb8fa	Merge 'sstables: prepare for uuid-based generation_type' from Benny Halevy Preparing for #10459, this series defines sstables::generation_type::int_t as `int64_t` at the moment and uses that instead of naked `int64_t` variables so it can be changed in the future to hold e.g. a `std::variant<int64_t, sstables::generation_id>`. sstables::new_generation was defined to generate new, unique generations. Currently it is based on incrementing a counter, but it can be extended in the future to manufacture UUIDs. The unit tests are cleaned up in this series to minimize their dependency on numeric generations. Basically, they should be used for loading sstables with hard coded generation numbers stored under `test/resource/sstables`. For all the rest, the tests should use existing mechanisms and ones introduced in this series such as generation_factory, sst_factory and smart make_sstable methods in sstable_test_env and table_for_tests to generate new sstables with a unique generation, and use the abstract sst->generation() method to get their generation if needed, without resorting to the actual value it may hold. Closes #12994 * github.com:scylladb/scylladb: everywhere: use sstables::generation_type test: sstable_test_env: use make_new_generation sstable_directory::components_lister::process: fixup indentation sstables: make highest_generation_seen return optional generation replica: table: add make_new_generation function replica: table: move sstable generation related functions out of line test: sstables: use generation_type::int_t sstables: generation_type: define int_t	2023-03-30 17:05:07 +03:00
Pavel Emelyanov	92318fdeae	Merge 'Initialize Wasm together with query_processor' from Wojciech Mitros The wasm engine is moved from replica::database to the query_processor. The wasm instance cache and compilation thread runner were already there, but now they're also initialized in the query_processor constructor. By moving the initialization to the constructor, we can now be certain that all wasm-related objects (wasm instance cache, compilation thread runner, and wasm engine, which was already passed in the constructor) are initialized when we try to use them because we have to use the query processor to access them anyway. The change is also motivated by the fact that we're planning to take Wasm UDFs out of experimental, after which they should stop getting special treatment. Closes #13311 * github.com:scylladb/scylladb: wasm: move wasm initialization to query_processor constructor wasm: return wasm instance cache as a reference instead of a pointer wasm: move wasm engine to query_processor	2023-03-30 14:30:23 +03:00
Nadav Har'El	59ab9aac44	Merge 'functions: reframe aggregate functions in terms of scalar functions' from Avi Kivity Currently, aggregate functions are implemented in a stateful manner. The accumulator is stored internally in an aggregate_function::aggregate, requiring each query to instantiate new instances (see aggregate_function_selector's constructor, and note how it's called from selector::new_instance()). This makes aggregates hard to use in expressions, since expressions are stateless (with state only provided to evaluate()). To facilitate migration towards stateless expressions, we define a stateless_aggregate_function (modeled after user-defined aggregates, which are already stateless). This new struct defines the aggregate in terms of three scalar functions: one to aggregate a new input into an accumulator (provided in the first parameter), one to finalize an accumulator into a result, and one to reduce two accumulators for parallelized aggregation. All existing native aggregate functions are converted to the new model, and the old interface is removed. This series does not yet convert selectors to expressions, but it does remove one of the obstacles. Performance evaluation: I created a table with a million ints on a single-node cluster, and ran the avg() function on them. I measured the number of instructions executed with `perf stat -p $(pgrep scylla) -e instructions` while the query was running. The query executed from cache, memtables were flushed beforehand. The instruction count per row increased from roughly 49k to roughly 52k, indicating 3k extra instructions per row. While 3k instructions to execute a function is huge, it is currently dwarfed by other overhead (and will be even less important in a cluster where CL>1 will cause non-coordinator code to run multiple times). 
Closes #13105 * github.com:scylladb/scylladb: cql3/selection, forward_service: use use stateless_aggregate_function directly db: functions: fold stateless_aggregate_function_adapter into aggregate_function cql3: functions: simplify accumulator_for template cql3: functions: base user-defined aggregates on stateless aggregates cql3: functions: drop native_aggregate_function cql3: functions: reimplement count(column) statelessly cql3: functions: reimplement avg() statelessly cql3: functions: reimplement sum() statelessly cql3: functions: change wide accumulator type to varint cql3: functions: unreverse types for min/max cql3: functions: rename make_{min,max}_dynamic_function cql3: functions: reimplement min/max statelessly cql3: functions: reimplement count(*) statelessly cql3: functions: simplify creating native functions even more cql3: functions: add helpers for automating marshalling for scalar functions types: fix big_decimal constructor from literal 0 cql3: functions: add helper class for internal scalar functions db: functions: add stateless aggregate functions db, cql3: move scalar_function from cql3/functions to db/functions	2023-03-30 13:58:47 +03:00
Aleksandra Martyniuk	306d44568f	test: extend test_compaction_task.py to test offstrategy compaction	2023-03-30 10:52:27 +02:00
Aleksandra Martyniuk	8afa54d4f6	compaction: create task manager's task for offstrategy keyspace compaction on one shard Implementation of task_manager's task that covers local offstrategy keyspace compaction.	2023-03-30 10:49:09 +02:00
Aleksandra Martyniuk	73860b7c9d	compaction: create task manager's task for offstrategy keyspace compaction Implementation of task_manager's task covering offstrategy keyspace compaction that can be started through storage_service api.	2023-03-30 10:44:56 +02:00
Aleksandra Martyniuk	e8ef8a51d5	compaction: create offstrategy_compaction_task_impl offstrategy_compaction_task_impl serves as a base class of all concrete offstrategy compaction task classes.	2023-03-30 10:28:17 +02:00
Nadav Har'El	32fff17e19	Merge 'tool/scylla-sstable: more flexibility in obtaining the schema' from Botond Dénes `scylla-sstable` currently has two ways to obtain the schema: * via a `schema.cql` file. * load schema definition from memory (only works for system tables). This meant that for most cases it was necessary to export the schema into a `CQL` format and write it to a file. This is very flexible. The sstable can be inspected anywhere, it doesn't have to be on the same host where it originates from. Yet in many cases the sstable is inspected on the same host where it originates from. In these cases, the schema is readily available in the schema tables on disk and it is plain annoying to have to export it into a file, just to quickly inspect an sstable file. This series solves this annoyance by providing a mechanism to load schemas from the on-disk schema tables. Furthermore, an auto-detect mechanism is provided to detect the location of these schema tables based on the path of the sstable, but if that fails, the tool checks the usual locations of the scylla data dir, the scylla configuration file and even looks for environment variables that tell the location of these. The old methods are still supported. In fact, if a `schema.cql` is present in the working directory of the tool, it is preferred over any other method, allowing for an easy force-override. If the auto-detection magic fails, an error is printed to the console, advising the user to turn on debug level logging to see what went wrong. A comprehensive test is added which checks all the different schema loading mechanisms. The documentation is also updated to reflect the changes. This change breaks the backward-compatibility of the command-line API of the tool, as `--system-schema` is now just a flag, the keyspace and table names are supplied separately via the new `--keyspace` and `--table` options. 
I don't think this will break anybody's workflow as this tool is still lightly used, exactly because of the annoying way the schema has to be provided. Hopefully after this series, this will change. Example: ``` $ ./build/dev/scylla sstable dump-data /var/lib/scylla/data/ks/tbl2-d55ba230b9a811ed9ae8495671e9e4f8/quarantine/me-1-big-Data.db {"sstables":{"/var/lib/scylla/data/ks/tbl2-d55ba230b9a811ed9ae8495671e9e4f8/quarantine//me-1-big-Data.db":[{"key":{"token":"-3485513579396041028","raw":"000400000000","value":"0"},"clustering_elements":[{"type":"clustering-row","key":{"raw":"","value":""},"marker":{"timestamp":1677837047297728},"columns":{"v":{"is_live":true,"type":"regular","timestamp":1677837047297728,"value":"0"}}}]}]}} ``` As seen above, subdirectories like `quarantine`, `staging` etc are also supported. Fixes: https://github.com/scylladb/scylladb/issues/10126 Closes #13075 * github.com:scylladb/scylladb: docs/operating-scylla/admin-tools: scylla-sstable.rst: update schema section test/cql-pytest: test_tools.py: add test for schema loading test/cql-pytest: nodetool.py: add flush_keyspace() tools/scylla-sstable: reform schema loading mechanism tools/schema_loader: add load_schema_from_schema_tables() db/schema_tables: expose types schema	2023-03-30 09:35:59 +03:00

36019 Commits